critical: Fix P&L compounding during close verification (20x inflation bug)

Problem:
- Close transaction confirmed but Drift state takes 5-10s to propagate
- Position Manager returned needsVerification=true to keep monitoring
- BUT: Monitoring loop detected position as 'externally closed' EVERY 2 seconds
- Each detection called handleExternalClosure() and added P&L to database
- Result: .66 actual profit → 73.36 in database (20x compounding)
- Logs showed: $112.96 → $117.62 → $122.28 → ... → $173.36 (14+ updates)

Root Cause:
- Common Pitfall #47 fix introduced needsVerification flag to wait for propagation
- But NO flag to prevent external closure detection during wait period
- Monitoring loop thought position was 'closed externally' on every cycle
- Rate limiting (429 errors) made it worse by extending wait time

Fix (closingInProgress flag):
1. Added closingInProgress boolean to ActiveTrade interface
2. Set flag=true when needsVerification returned (close confirmed, waiting)
3. Skip external closure detection entirely while flag=true
4. Timeout after 60s if stuck (abnormal case - allows cleanup)

Impact:
- Every close with verification delay (most closes) had 10-20x P&L inflation
- This is variant of Common Pitfall #27 but during verification, not external closure
- Rate limited closes were hit hardest (longer wait = more compounding cycles)

Files:
- lib/trading/position-manager.ts: Added closingInProgress flag + skip logic

Incident: Nov 16, 11:50 CET - SHORT 41.64→40.08 showed 73.36 vs .66 real
Documented: Common Pitfall #48
This commit is contained in:
mindesbunister
2025-11-16 15:07:27 +01:00
parent 54815a0daa
commit 018f973609
2 changed files with 100 additions and 10 deletions

View File

@@ -2064,6 +2064,65 @@ trade.realizedPnL += actualRealizedPnL // NOT: result.realizedPnL from SDK
```
- **Lesson:** In DEX trading, always verify state changes actually propagated before updating local state. ALWAYS verify container restart timestamp matches or exceeds commit timestamps before declaring fixes deployed.
48. **P&L compounding during close verification (CRITICAL - Fixed Nov 16, 2025):**
- **Symptom:** Database P&L shows $173.36 when actual P&L was $8.66 (20× too high)
- **Root Cause:** Variant of Common Pitfall #27 - duplicate external closure detection during close verification wait
- **Real incident (Nov 16, 11:50 CET):**
* SHORT position: Entry $141.64 → Exit $140.08 (expected P&L: $8.66)
* Close transaction confirmed, Drift verification pending (5-10s propagation delay)
* Position Manager returned with `needsVerification: true` flag
* **Every 2 seconds:** Monitoring loop checked Drift, saw position "missing", called `handleExternalClosure()`
* **Each call added P&L:** $112.96 → $117.62 → $122.28 → ... → $173.36 (14+ compounding updates)
* Rate limiting made it worse (429 errors delayed final cleanup)
- **Why it happened:**
* Fix #47 introduced `needsVerification` flag to keep monitoring during propagation delay
* BUT: No flag to prevent external closure detection during this wait period
* Monitoring loop thought position was "closed externally" every cycle
* Each detection calculated P&L and updated database, compounding the value
- **Impact:** Every close with verification delay (most closes) vulnerable to 10-20× P&L inflation
- **Fix (closingInProgress flag):**
```typescript
// In ActiveTrade interface (line ~15):
// Close verification tracking (Nov 16, 2025)
closingInProgress?: boolean // True when close tx confirmed but Drift not yet propagated
closeConfirmedAt?: number // Timestamp when close was confirmed (for timeout)
// In executeExit() when needsVerification returned (line ~1210):
if ((result as any).needsVerification) {
// CRITICAL: Mark as "closing in progress" to prevent duplicate external closure detection
trade.closingInProgress = true
trade.closeConfirmedAt = Date.now()
console.log(`🔒 Marked as closing in progress - external closure detection disabled`)
return
}
// In monitoring loop BEFORE external closure check (line ~640):
if (trade.closingInProgress) {
const timeInClosing = Date.now() - (trade.closeConfirmedAt || Date.now())
if (timeInClosing > 60000) {
// Stuck >60s (abnormal) - allow cleanup
trade.closingInProgress = false
} else {
// Normal: Skip external closure detection entirely during propagation wait
console.log(`🔒 Close in progress (${(timeInClosing / 1000).toFixed(0)}s) - skipping external closure check`)
}
}
// External closure check only runs if NOT closingInProgress
if ((position === null || position.size === 0) && !trade.closingInProgress) {
// ... handle external closure
}
```
- **Behavior now:**
* Close confirmed → Set `closingInProgress = true`
* Monitoring continues but SKIPS external closure detection
* After 5-10s: Drift propagates, ghost detection cleans up correctly (one time only)
* If stuck >60s: Timeout allows cleanup (abnormal case)
- **Prevents:** Duplicate P&L updates during the 5-10s verification window
- **Related to:** Common Pitfall #27 (external closure duplicates), but different trigger
- **Files changed:** `lib/trading/position-manager.ts` (interface + logic)
- **Lesson:** When introducing wait periods in financial systems, always add flags to prevent duplicate state updates during the wait
46. **100% position sizing causes InsufficientCollateral (Fixed Nov 16, 2025):**
- **Symptom:** Bot configured for 100% position size gets InsufficientCollateral errors, but Drift UI can open same size position
- **Root Cause:** Drift's margin calculation includes fees, slippage buffers, and rounding - exact 100% leaves no room