diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index faec5b1..99a6357 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -1448,6 +1448,54 @@ trade.realizedPnL += actualRealizedPnL // NOT: result.realizedPnL from SDK - **Verification:** After restart, check logs for "Found 0 open trades" (not "Found 1 open trades to restore") - **Lesson:** status field is for classification, exitReason is for lifecycle management - both must be set on closure +36. **closePosition() missing retry logic causes rate limit storm (CRITICAL - Fixed Nov 15, 2025):** + - **Symptom:** Position Manager tries to close trade, gets 429 error, retries EVERY 2 SECONDS → 100+ failed attempts → rate limit exhaustion + - **Root Cause:** `placeExitOrders()` has `retryWithBackoff()` wrapper (Nov 14 fix), but `closePosition()` did NOT + - **Real incident:** Trade cmi0il8l30000r607l8aec701 (Nov 15, 16:49 CET) + 1. Position Manager tried to close (SL or TP trigger) + 2. closePosition() called raw `placePerpOrder()` → 429 error + 3. executeExit() caught 429, returned early (line 935-940) + 4. Position Manager kept monitoring, retried close EVERY 2 seconds + 5. Logs show 100+ "❌ Failed to close position: 429" + "⚠️ Rate limited while closing SOL-PERP" + 6. Meanwhile: On-chain TP2 limit order filled (unaffected by SDK rate limits) + 7. External closure detected, DB updated 8 TIMES: $0.14 → $0.20 → $0.26 → ... → $0.51 + 8. Container eventually restarted (likely from rate limit exhaustion) + - **Why duplicate updates:** Common Pitfall #27 fix (remove from Map before DB update) works UNLESS rate limits cause tons of retries before external closure detection + - **Impact:** User saw $0.51 profit in DB, $0.03 on Drift UI (8× compounding vs 1 actual fill) + - **Fix:** Wrapped closePosition() with retryWithBackoff() in lib/drift/orders.ts: + ```typescript + // Line ~567 (BEFORE): + const txSig = await driftClient.placePerpOrder(orderParams) + + // Line ~567 (AFTER): + const txSig = await retryWithBackoff(async () => { + return await driftClient.placePerpOrder(orderParams) + }, 3, 8000) // 8s base delay, 3 max retries (8s → 16s → 32s) + ``` + - **Behavior now:** 3 SDK retries over 56s (8+16+32) + Position Manager natural retry on next monitoring cycle = robust without spam + - **RPC load reduction:** 30-50× fewer requests during close operations (3 retries vs 100+ attempts) + - **Verification:** Container restarted 18:05 CET Nov 15, code deployed + - **Lesson:** EVERY SDK order operation (open, close, cancel, place) MUST have retry wrapper - Position Manager monitoring creates infinite retry loop without it + - **Root Cause:** Phantom auto-closure sets `status='phantom'` but leaves `exitReason=NULL` + - **Bug:** Startup validator checks `exitReason !== null` (line 122 of init-position-manager.ts), ignores status field + - **Consequence:** Phantom trade with exitReason=NULL treated as "open" and restored to Position Manager + - **Real incident:** Nov 14 phantom trade (cmhy6xul20067nx077agh260n) caused 232% size mismatch, hundreds of false "🔴 RUNNER STOP LOSS" alerts + - **Fix:** When auto-closing phantom trades, MUST set exitReason: + ```typescript + // In app/api/trading/execute/route.ts (phantom detection): + await updateTradeExit({ + tradeId: trade.id, + exitPrice: currentPrice, + exitReason: 'manual', // CRITICAL: Must set exitReason for cleanup + realizedPnL: actualPnL, + status: 'phantom' + }) + ``` + - **Manual cleanup:** If phantom already exists: `UPDATE "Trade" SET "exitReason" = 'manual' WHERE status = 'phantom' AND "exitReason" IS NULL` + - **Impact:** Without exitReason, phantom trades create ghost positions that trigger false alerts and pollute monitoring + - **Verification:** After restart, check logs for "Found 0 open trades" (not "Found 1 open trades to restore") + - **Lesson:** status field is for classification, exitReason is for lifecycle management - both must be set on closure + ## File Conventions - **API routes:** `app/api/[feature]/[action]/route.ts` (Next.js 15 App Router)