critical: Bulletproof Position Manager state persistence (Bug #87)

PROBLEM: Container restart caused Position Manager to lose tracking of runner
system state, resulting in on-chain TP1 order closing entire position (100%)
instead of partial close (60%).

ROOT CAUSE: updateTradeState() had race condition in configSnapshot merge logic
- nested Prisma query inside update caused non-atomic read-modify-write
- positionManagerState was NULL in database despite saveTradeState() calls
- Missing critical state fields: tp2Hit, trailingStopActive, peakPrice

THE FIX (3-Layer Protection):
1. Atomic state persistence with verification
   - Separate read → merge → write → verify steps
   - Bulletproof verification after save (catches silent failures)
   - Persistent logger for save failures (investigation trail)

2. Complete state tracking
   - Added tp2Hit (runner system activation)
   - Added trailingStopActive (trailing stop recovery)
   - Added peakPrice (trailing stop calculations)
   - All MAE/MFE fields preserved

3. Bulletproof recovery on restart
   - initialize() restores ALL state from configSnapshot
   - Runner system can continue after TP1 partial close
   - Trailing stop resumes with correct peak price
   - No on-chain order conflicts

FILES CHANGED:
- lib/database/trades.ts (lines 66-90, 282-362)
  * UpdateTradeStateParams: Added tp2Hit, trailingStopActive, peakPrice
  * updateTradeState(): 4-step atomic save with verification
  * Persistent logging for save failures

- lib/trading/position-manager.ts (lines 2233-2258)
  * saveTradeState(): Now saves ALL critical runner system state
  * Includes tp2Hit, trailingStopActive, peakPrice
  * Complete MAE/MFE tracking

EXPECTED BEHAVIOR AFTER FIX:
- Container restart: PM restores full state from database
- TP1 partial close: 60% closed, 40% runner continues
- TP2 activation: Runner exits with trailing stop
- No on-chain order conflicts (PM controls partial closes)

USER IMPACT:
- No more missed runner profits due to restarts
- Complete position tracking through container lifecycle
- Bulletproof verification catches save failures early

INCIDENT REFERENCE:
- Trade ID: cmja0z6r00006t907qh24jfyk
- Date: Dec 17, 2025
- Loss: ~$18.56 potential runner profit missed
- User quote: "we have missed out here despite being a winner"

See Bug #87 in Common Pitfalls for full incident details
This commit is contained in:
mindesbunister
2025-12-17 15:06:05 +01:00
parent 8fdcf06d4b
commit 341341d8b1
2 changed files with 80 additions and 25 deletions

View File

@@ -2236,13 +2236,20 @@ export class PositionManager {
positionId: trade.positionId,
currentSize: trade.currentSize,
tp1Hit: trade.tp1Hit,
tp2Hit: trade.tp2Hit, // CRITICAL for runner system recovery
trailingStopActive: trade.trailingStopActive, // CRITICAL for trailing stop recovery
slMovedToBreakeven: trade.slMovedToBreakeven,
slMovedToProfit: trade.slMovedToProfit,
stopLossPrice: trade.stopLossPrice,
peakPrice: trade.peakPrice, // CRITICAL for trailing stop calculations
realizedPnL: trade.realizedPnL,
unrealizedPnL: trade.unrealizedPnL,
peakPnL: trade.peakPnL,
lastPrice: trade.lastPrice,
maxFavorableExcursion: trade.maxFavorableExcursion,
maxAdverseExcursion: trade.maxAdverseExcursion,
maxFavorablePrice: trade.maxFavorablePrice,
maxAdversePrice: trade.maxAdversePrice,
})
} catch (error) {
const tradeId = (trade as any).id ?? 'unknown'