docs: Add Common Pitfall #59 - Layer 2 duplicate Telegram notifications

Bug: Trade #8 (SHORT SOL-PERP, Nov 22, 04:05) sent 13 duplicate notifications
- P&L compounded $11.50 → $155.05 (8.2× actual $18.79)
- Root cause: Layer 2 ghost detection ignored closingInProgress flag
- Rate limit storm: 6,581 failures → Layer 2 triggered 13 times
- Each call sent notification before async DB update completed

Real incident details:
- TP1: +$32.63 (60% closed)
- Runner: -$13.84 (40% closed at breakeven)
- Net: +$18.79 (Drift confirmed)
- Database showed $155.05 (manually corrected)

Fix: Added closingInProgress check before Layer 2 ghost detection
Related: Pitfalls #40 (death spiral), #48 (closingInProgress), #49 (compounding)
This commit is contained in:
mindesbunister
2025-11-22 14:13:26 +01:00
parent b19f156822
commit 81926c24b9

View File

@@ -3477,6 +3477,62 @@ trade.realizedPnL += actualRealizedPnL // NOT: result.realizedPnL from SDK
- **Deployment status:** Code committed and pushed, awaiting container rebuild - **Deployment status:** Code committed and pushed, awaiting container rebuild
- **Lesson:** In financial systems, "it worked this time" is not enough. Implement defense-in-depth protection for ALL critical operations, even when no failure has occurred yet. Better to have protection you never need than need protection you don't have. - **Lesson:** In financial systems, "it worked this time" is not enough. Implement defense-in-depth protection for ALL critical operations, even when no failure has occurred yet. Better to have protection you never need than need protection you don't have.
59. **Layer 2 ghost detection causing duplicate Telegram notifications (CRITICAL - Fixed Nov 22, 2025):**
- **Symptom:** Trade #8 sent 13 duplicate "POSITION CLOSED" notifications with compounding P&L ($11.50 → $155.05)
- **Root Cause:** Layer 2 ghost detection (failureCount > 20) didn't check `closingInProgress` flag before calling `handleExternalClosure()`
- **Real incident (Nov 22, 04:05 CET):**
* SHORT SOL-PERP: TP1 hit (60% closed at +$32.63), runner closed at breakeven (-$13.84)
* Actual net P&L: **+$18.79** (confirmed in Drift UI)
* Rate limit storm: 6,581 failed close attempts during runner exit
* Layer 2 ghost detection triggered every 2 seconds (priceCheckCount > 20)
* Called `handleExternalClosure()` 13 times before trade removal completed
* Each call sent Telegram notification with compounding P&L
* Database final value: $155.05 (8.2× actual, manually corrected to $18.79)
- **Why Common Pitfall #48 didn't prevent this:**
* `closingInProgress` flag exists for close verification (Pitfall #48)
* But Layer 2 ghost detection (death spiral detector) never checked it
* Flag only applied when Position Manager initiated the close
* Layer 2 runs when close FAILS repeatedly, so flag wasn't set
- **Bug sequence:**
1. Runner tries to close via `executeExit()` → 429 rate limit error
2. Trade stays in monitoring, `priceCheckCount` increments every 2s
3. After 20+ failures: Layer 2 checks Drift API, finds position gone
4. Calls `handleExternalClosure()` immediately without checking `closingInProgress`
5. Async database update still in progress from previous call
6. Next monitoring cycle (2s later): priceCheckCount still > 20, Drift still shows gone
7. Layer 2 triggers AGAIN → calls `handleExternalClosure()` AGAIN
8. Repeats 13 times during rate limit storm (each call sends notification, compounds P&L)
- **Fix (lib/trading/position-manager.ts lines 1477-1490):**
```typescript
// BEFORE (BROKEN):
if (trade.priceCheckCount > 20) {
// Check Drift API for ghost position
if (!position || Math.abs(position.size) < 0.01) {
await this.handleExternalClosure(trade, 'Layer 2: Ghost detected via Drift API')
return
}
}
// AFTER (FIXED):
if (trade.priceCheckCount > 20 && !trade.closingInProgress) {
// Check Drift API for ghost position
if (!position || Math.abs(position.size) < 0.01) {
// CRITICAL: Mark as closing to prevent duplicate processing
trade.closingInProgress = true
trade.closeConfirmedAt = Date.now()
await this.handleExternalClosure(trade, 'Layer 2: Ghost detected via Drift API')
return
}
}
```
- **Impact:** Prevents duplicate notifications and P&L compounding during rate limit storms
- **Verification:** Container restarted Nov 22 05:30 CET with fix
- **Database correction:** Manually corrected P&L from $155.05 to $18.79 (actual Drift value)
- **Related:** Common Pitfall #40 (ghost death spiral), #48 (closingInProgress flag), #49 (P&L compounding)
- **Git commit:** b19f156 "critical: Fix Layer 2 ghost detection causing duplicate Telegram notifications"
- **Lesson:** ALL code paths that detect external closures must check `closingInProgress` flag, not just the primary close verification path. Rate limit storms can cause monitoring loop to detect same closure dozens of times if flag isn't checked everywhere.
## File Conventions ## File Conventions
- **API routes:** `app/api/[feature]/[action]/route.ts` (Next.js 15 App Router) - **API routes:** `app/api/[feature]/[action]/route.ts` (Next.js 15 App Router)