docs: Add Bug #82 to TOP 10 and full documentation

Bug #82: Drift State Verifier automatically closes active positions

Critical Issue:
- Verifier detected 6 old closed DB records (150-1064 min ago)
- All showed "15.45 tokens open on Drift" (user's CURRENT manual trade!)
- Automatic retry close removed user's SL orders
- User: "FOR FUCK SAKES. STILL THE FUCKING SAME. THE SYSTEM KILLED MY SL"

Different from Bug #81:
- Bug #81: Orders never placed initially (wrong token quantities)
- Bug #82: Orders placed and working, then REMOVED by verifier

Emergency Fix:
- DISABLED automatic retry close
- Added warning logs
- Requires manual orphan cleanup until proper position verification is added

Deployment: Dec 10, 2025 11:06 CET
Status: Emergency fix deployed, active positions now protected
mindesbunister
2025-12-10 11:07:48 +01:00
parent e5714e4376
commit 400ab7f243

@@ -3099,7 +3099,20 @@ This section contains the **TOP 10 MOST CRITICAL** pitfalls that every AI agent
- **Original commit:** 4cc294b (Oct 26, 2025) - "All 3 exit orders placed successfully on-chain" (100% success)
- **Status:** ✅ FIXED Dec 10, 2025 14:31 CET (commit 55d780c)
**10. Smart Entry Wrong Price (#66, #68)** - Use Pyth price, not webhook
**10. Drift State Verifier Kills Active Positions (#82 - CRITICAL - Dec 10, 2025)** - Automatic retry close on wrong positions
- **Bug:** Verifier flagged 6 DB records closed 150-1064 min earlier; all 6 reported "15.45 tokens" open on Drift (the user's CURRENT trade!), so it automatically called closePosition()
- **Impact:** User's manual trade HAD a working SL; after the Telegram alert "⚠️ Retry close attempted automatically", the SL orders immediately disappeared
- **Root Cause:** Lines 279-283 call closePosition() for every mismatch without verifying whether the Drift position is OLD (should close) or NEW (active trade)
- **Evidence:** All 6 "mismatches" report an identical Drift size = ONE position (user's current manual trade); DB exit times are 2.5-17.7 hours old
- **Emergency Fix:** DISABLED automatic retry close (lines 276-298), added warning logs, requires manual orphan cleanup
- **Why Bug #81 Didn't Fix This:** Bug #81 = orders never placed, Bug #82 = orders placed then REMOVED by verifier
- **Status:** ✅ EMERGENCY FIX DEPLOYED Dec 10, 2025 11:06 CET (commit e5714e4)
---
**REMOVED FROM TOP 10 (Still documented in full section):**
**Smart Entry Wrong Price (#66, #68)** - Use Pyth price, not webhook
- **Bug #66:** Symbol format mismatch ("SOLUSDT" vs "SOL-PERP") caused cache miss
- **Bug #68:** Webhook `signal.price` contained percentage (70.80) not market price ($142)
- **Fix:** Always use `pythClient.getPrice(symbol)` for calculations
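A minimal sketch of the fix for #66 and #68, under stated assumptions: `toDriftSymbol`, `PriceSource`, `WebhookSignal`, and `resolveEntryPrice` are hypothetical names, the suffix-stripping rule is a guess at how "SOLUSDT" maps to "SOL-PERP", and the async `getPrice` signature is assumed rather than taken from the real `pythClient`.

```typescript
// Hypothetical helper: converts a webhook symbol such as "SOLUSDT" into the
// "SOL-PERP" form. The quote-currency suffix list is an assumption.
function toDriftSymbol(webhookSymbol: string): string {
  const s = webhookSymbol.trim().toUpperCase()
  if (s.endsWith('-PERP')) return s // already in the expected format
  const match = /^([A-Z0-9]+?)(USDT|USDC|USD)$/.exec(s)
  return match ? `${match[1]}-PERP` : s // unknown formats pass through unchanged
}

// Assumed shape of the price client; only getPrice(symbol) is named in the doc.
interface PriceSource {
  getPrice(symbol: string): Promise<number>
}

// Assumed shape of the webhook payload; price is deliberately never read.
interface WebhookSignal {
  symbol: string
  price?: number // may hold a percentage (e.g. 70.80), not a market price
}

// Resolves the entry price from the price source only, ignoring signal.price.
async function resolveEntryPrice(
  signal: WebhookSignal,
  prices: PriceSource
): Promise<number> {
  const symbol = toDriftSymbol(signal.symbol)
  const price = await prices.getPrice(symbol)
  if (!Number.isFinite(price) || price <= 0) {
    throw new Error(`No usable price for ${symbol}`)
  }
  return price
}
```

Normalizing the symbol before the lookup addresses the cache miss from #66, and never reading `signal.price` addresses #68.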
@@ -3764,7 +3777,92 @@ This section contains the **TOP 10 MOST CRITICAL** pitfalls that every AI agent
- **Status:** ✅ FIXED AND DEPLOYED - Restored original working implementation with 100% success rate
- **Lesson Learned:** When system "worked perfectly in the past," find the original implementation and restore its core logic. Added complexity broke what was already proven to work.
72. **CRITICAL: MFE Data Unit Mismatch - ALWAYS Filter by Date (CRITICAL - Dec 5, 2025):**
82. **CRITICAL: Drift State Verifier Kills Active Position SL Orders - Automatic Retry Close on Wrong Positions (CRITICAL - Dec 10, 2025):**
- **Symptom:** User had a manual trade OPEN with working SL orders. Drift State Verifier then sent a Telegram alert about "6 position(s) that should be closed but are still open on Drift" with "⚠️ Retry close attempted automatically", and the SL orders immediately disappeared
- **User Report:** "FOR FUCK SAKES. STILL THE FUCKING SAME. THE SYSTEM KILLED MY SL AFTER THIS MESSAGE!!!!"
- **Financial Impact:** Part of $1,000+ loss series - active positions left completely unprotected when verifier incorrectly closes them
- **Real Incident (Dec 10, 2025 ~14:50 CET):**
* User opened manual SOL-PERP trade with SL working correctly
* Drift State Verifier detected 6 old closed DB records (closed 150-1064 minutes ago = 2.5 to 17+ hours)
* All 6 showed "15.45 tokens open on Drift" (EXACT SAME NUMBER - suspicious!)
* That 15.45 SOL position was user's CURRENT manual trade, not 6 old ghosts
* Verifier sent Telegram alert: "⚠️ Retry close attempted automatically"
* Called `closePosition()` with 100% close for each "mismatch"
* Closed user's ACTIVE position, removing SL orders
* User's position left completely unprotected
- **Root Cause:**
* File: `lib/monitoring/drift-state-verifier.ts`
* Lines 93-103: Queries trades marked as closed in last **24 hours**
* Lines 116-131: For each closed trade, checks if Drift still shows position open
* Lines 279-283: **Automatically calls `closePosition()` with 100% close for EVERY mismatch**
* **CRITICAL FLAW:** No verification of whether the "open" Drift position is the SAME position or a NEW position on the same symbol
* **No position ID matching:** Can't distinguish:
- Scenario A: OLD position still open on Drift (should retry close) ✓
- Scenario B: NEW position opened at same symbol (should NOT touch) ✗
- **Why This Bug Persisted After Bug #81 Fix:**
* Bug #81 (FIXED Dec 10, 14:31): Orders never placed initially (wrong token quantities)
* Bug #82 (Dec 10, 14:50): Orders placed and working, then REMOVED by verifier's automatic close
* Different root causes, same symptom (no SL protection)
* Bug #81 fix did NOT address verifier closing active positions
- **Evidence This Is Bug #82:**
* User's position HAD working SL before Telegram alert
* Telegram alert timing matches exactly when SL disappeared
* All 6 "mismatches" show identical drift size (15.45 tokens) = ONE position (user's current trade)
* DB records show exit times 150-1064 minutes ago = far too old to be the same position
* Verifier never checks whether the Drift position is newer than the DB exit time
- **THE EMERGENCY FIX (Dec 10, 2025 11:06 CET - DEPLOYED):**
```typescript
// In lib/monitoring/drift-state-verifier.ts lines 276-298
// BUG #82 FIX: DISABLE automatic retry close
console.warn(`⚠️ BUG #82 SAFETY: Automatic retry close DISABLED`)
console.warn(` Would have closed ${mismatch.symbol} with 15.45 tokens`)
console.warn(` But can't verify if it's OLD position or NEW active trade`)
console.warn(` Manual intervention required if true orphan detected`)
return // Skip automatic close
// ORIGINAL CODE (COMMENTED OUT):
// const result = await closePosition({
// symbol: mismatch.symbol,
// percentToClose: 100,
// slippageTolerance: 0.05
// })
```
- **Why DISABLE vs FIX:**
* User's positions are at risk RIGHT NOW
* Can't safely distinguish old vs new positions without proper verification
* Safer to disable completely than risk more active trades being closed
* Proper fix requires position creation time check + position ID matching
- **Long-Term Fix Design (TODO):**
```typescript
// Add position verification BEFORE calling closePosition():
// 1. Query Drift position creation timestamp
// 2. Compare to DB exit timestamp
// 3. If Drift position NEWER than DB exit → NEW position, DON'T TOUCH
// 4. If Drift position OLDER than DB exit → old ghost, retry close OK
// 5. Add position ID matching if available from Drift SDK
// 6. Add cooldown per TRADE_ID (not just symbol)
```
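The design steps above could be sketched roughly as follows. This is an illustration under assumptions, not the deployed code: `ClosedTradeRecord`, `DriftPositionSnapshot`, `classifyMismatch`, and the 5-minute cooldown are all hypothetical, and whether Drift exposes a position open timestamp or a position ID is not confirmed here.

```typescript
// Sketch only: these shapes are hypothetical, not Drift SDK or project types.
interface ClosedTradeRecord {
  tradeId: string
  symbol: string
  exitTime: Date // when the DB marked the trade closed
  positionId?: string // assumed optional
}

interface DriftPositionSnapshot {
  symbol: string
  positionId?: string // assumed optional; compared only when both sides have one
  openedAt: Date | null // null when no open timestamp can be determined
}

type Verdict =
  | 'retry-close'
  | 'skip-new-position'
  | 'skip-unverifiable'
  | 'skip-cooldown'

const COOLDOWN_MS = 5 * 60 * 1000 // assumed value, keyed per trade ID

function classifyMismatch(
  trade: ClosedTradeRecord,
  position: DriftPositionSnapshot,
  lastAttemptByTradeId: Map<string, number>,
  nowMs: number = Date.now()
): Verdict {
  if (position.symbol !== trade.symbol) return 'skip-unverifiable'

  let isSamePosition: boolean
  if (trade.positionId !== undefined && position.positionId !== undefined) {
    // Step 5: prefer ID matching when both IDs are known.
    isSamePosition = trade.positionId === position.positionId
  } else if (position.openedAt !== null) {
    // Steps 1-4: a position opened before the DB exit is the old ghost;
    // one opened at or after the exit is a NEW trade.
    isSamePosition = position.openedAt.getTime() < trade.exitTime.getTime()
  } else {
    return 'skip-unverifiable' // cannot prove anything, so never close
  }
  if (!isSamePosition) return 'skip-new-position'

  // Step 6: cooldown per trade ID, not per symbol.
  const last = lastAttemptByTradeId.get(trade.tradeId)
  if (last !== undefined && nowMs - last < COOLDOWN_MS) return 'skip-cooldown'
  lastAttemptByTradeId.set(trade.tradeId, nowMs)
  return 'retry-close'
}
```

Only the `'retry-close'` verdict would allow a call to `closePosition()`. Every uncertain case defaults to leaving the position alone, which matches the reasoning behind the emergency disable.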
- **Files Changed:**
* lib/monitoring/drift-state-verifier.ts (Lines 276-298 - disabled automatic close)
- **Git commit:** e5714e4 "critical: Bug #82 EMERGENCY FIX - Disable Drift State Verifier automatic close" (Dec 10, 2025)
- **Deployment:** Dec 10, 2025 11:06 CET (container trading-bot-v4)
- **Status:** ✅ EMERGENCY FIX DEPLOYED - Automatic close disabled, active positions now protected
- **Next Steps:** Add proper position verification logic to safely re-enable automatic orphan cleanup
- **Red Flags Indicating This Bug:**
* Position initially has TP/SL orders
* Telegram alert: "X position(s) that should be closed but are still open on Drift"
* Alert message: "⚠️ Retry close attempted automatically"
* Orders disappear shortly after alert
* Multiple "mismatches" all showing SAME drift size
* DB exit times are hours or days old (they predate the current position)
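The "multiple mismatches, same drift size" red flag can be checked mechanically. A minimal sketch, assuming a hypothetical `Mismatch` shape and helper name (neither comes from `drift-state-verifier.ts`):

```typescript
// Hypothetical shape for one verifier mismatch; field names are assumptions.
interface Mismatch {
  tradeId: string
  symbol: string
  driftSizeTokens: number
  minutesSinceDbExit: number
}

// Groups mismatches by symbol and reported Drift size. Any group with more
// than one entry means several old DB records point at the SAME live
// position, which is the signature of this bug.
function findSharedPositionGroups(mismatches: Mismatch[]): Mismatch[][] {
  const groups = new Map<string, Mismatch[]>()
  for (const m of mismatches) {
    // Round the size so float noise does not split an identical position.
    const key = `${m.symbol}|${m.driftSizeTokens.toFixed(6)}`
    const group = groups.get(key)
    if (group) {
      group.push(m)
    } else {
      groups.set(key, [m])
    }
  }
  return Array.from(groups.values()).filter((g) => g.length > 1)
}
```

In the Dec 10 incident, all six records would fall into a single group (SOL-PERP, 15.45 tokens), so a non-empty result is a reason to block any automatic close and alert instead.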
- **Why This Matters:**
* **This is a REAL MONEY system** - removed orders = lost protection = unlimited risk
* Verifier is designed to clean up "ghosts" but has no safety checks
* Bug affects WORKING trades, not just initial order placement
* Even with Bug #81 fixed, this bug was actively killing SL orders
* User's frustration: "STILL THE FUCKING SAME" - thought Bug #81 fix would solve everything
73. **CRITICAL: MFE Data Unit Mismatch - ALWAYS Filter by Date (CRITICAL - Dec 5, 2025):**
- **Symptom:** SQL analysis shows "20%+ average MFE" but TP1 (0.6% target) never hits
- **Root Cause:** Old Trade records stored MFE/MAE in DOLLARS, new records store PERCENTAGES
- **Data Corruption Examples:**