diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 32faaca..4200315 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -2888,72 +2888,82 @@ ORDER BY MIN(adx) DESC; ## Common Pitfalls -**⚠️ CRITICAL REFERENCE: See `docs/COMMON_PITFALLS.md` for complete list (72 documented issues)** +**⚠️ CRITICAL REFERENCE: See `docs/COMMON_PITFALLS.md` for complete list (73 documented issues)** This section contains the **TOP 10 MOST CRITICAL** pitfalls that every AI agent must know. For full details, category breakdowns, code examples, and historical context, see the complete documentation. ### 🔴 TOP 10 CRITICAL PITFALLS -**1. Drift SDK Memory Leak (#1)** - JavaScript heap OOM after 10+ hours +**1. Position Manager Monitoring Stops Randomly (#73 - CRITICAL - Dec 7, 2025)** +- **Symptom:** PM last update at 23:21 Dec 6, stopped for 90+ minutes, user forced to manually close +- **Root Cause:** Drift state propagation delay (5+ min) → 60s timeout expires → false "external closure" detection → `activeTrades.delete()` → monitoring stops +- **Financial Impact:** Real losses during unmonitored period +- **THE FIX (3 Safety Layers - DEPLOYED Dec 7, 2025):** + * **Layer 1:** Extended timeout from 60s → 5 minutes (allows Drift state to propagate) + * **Layer 2:** Double-check with 10s delay before processing external closure (catches false positives) + * **Layer 3:** Verify Drift has no positions before calling stopMonitoring() (fail-safe) +- **Code Locations:** + * Layer 1: `lib/trading/position-manager.ts` line ~792 (timeout extension) + * Layer 2: `lib/trading/position-manager.ts` line ~603 (double-check logic) + * Layer 3: `lib/trading/position-manager.ts` line ~1069 (Drift verification) +- **Expected Impact:** Zero unprotected positions, false positive detection eliminated +- **Status:** ✅ DEPLOYED Dec 7, 2025 02:47 UTC (commit ed9e4d5) +- **See:** `docs/PM_MONITORING_STOP_ROOT_CAUSE_DEC7_2025.md` for complete analysis + +**2. Drift SDK Memory Leak (#1)** - JavaScript heap OOM after 10+ hours - **Solution:** Smart error-based health monitoring (`lib/monitoring/drift-health-monitor.ts`) - **Detection:** `interceptWebSocketErrors()` patches console.error - **Action:** Restarts if 50+ errors in 30-second window - **Status:** Fixed Nov 15, 2025, Enhanced Nov 24, 2025 -**2. Wrong RPC Provider (#2)** - Alchemy breaks Drift SDK subscriptions +**3. Wrong RPC Provider (#2)** - Alchemy breaks Drift SDK subscriptions - **FINAL CONCLUSION:** Use Helius RPC, NEVER use Alchemy - **Root Cause:** Alchemy rate limits break Drift's burst subscription pattern - **Evidence:** 17-71 subscription errors with Alchemy vs 0 with Helius - **Status:** Investigation Complete Nov 14, 2025 -**3. P&L Compounding Race Condition (#48, #49, #59, #60, #61, #67)** +**4. P&L Compounding Race Condition (#48, #49, #59, #60, #61, #67)** - **Pattern:** Multiple monitoring loops detect same closure → each adds P&L - **Result:** $6 real → $92 recorded (15x inflation) - **Fix:** Use `Map.delete()` atomic return as deduplication lock (Dec 2, 2025) - **Code:** `if (!this.activeTrades.delete(tradeId)) return` - first caller wins -**4. Database-First Pattern (#29)** - Save DB before Position Manager +**5. Database-First Pattern (#29)** - Save DB before Position Manager - **Rule:** `createTrade()` MUST succeed before `positionManager.addTrade()` - **Why:** If DB fails, API returns 500 with "CLOSE POSITION MANUALLY" - **Impact:** Without this, positions become untracked on container restart - **Status:** Fixed Nov 13, 2025 -**5. Container Deployment Verification (#31)** +**6. Container Deployment Verification (#31)** - **Rule:** NEVER say "fixed" without checking container timestamp - **Verification:** `docker logs trading-bot-v4 | grep "Server starting"` vs `git log -1 --format='%ai'` - **If container older than commit:** CODE NOT DEPLOYED, FIX NOT ACTIVE - **Status:** Critical lesson from Nov 13, 2025 incident -**6. Position.size Tokens vs USD (#24)** - SDK returns tokens, not USD +**7. Position.size Tokens vs USD (#24)** - SDK returns tokens, not USD - **Bug:** Comparing 12.28 tokens to $1,950 → "99.4% reduction" → false TP1 - **Fix:** `positionSizeUSD = Math.abs(position.size) * currentPrice` - **Impact:** Without fix, TP1 never triggers correctly - **Status:** Fixed Nov 12, 2025 -**7. Ghost Detection Atomic Lock (#67)** - Map.delete() as deduplication +**8. Ghost Detection Atomic Lock (#67)** - Map.delete() as deduplication - **Pattern:** Async handlers called by multiple code paths simultaneously - **Solution:** `if (!this.activeTrades.delete(tradeId)) { return }` - atomic lock - **Why:** JavaScript Map.delete() returns true only for first caller - **Status:** Fixed Dec 2, 2025 -**8. Smart Entry Wrong Price (#66, #68)** - Use Pyth price, not webhook +**9. Smart Entry Wrong Price (#66, #68)** - Use Pyth price, not webhook - **Bug #66:** Symbol format mismatch ("SOLUSDT" vs "SOL-PERP") caused cache miss - **Bug #68:** Webhook `signal.price` contained percentage (70.80) not market price ($142) - **Fix:** Always use `pythClient.getPrice(symbol)` for calculations - **Status:** Fixed Dec 1-3, 2025 -**9. MFE/MAE Wrong Units (#54)** - Store percentages, not dollars +**10. MFE/MAE Wrong Units (#54)** - Store percentages, not dollars - **Bug:** Storing $64.08 when should store 0.48% (133× inflation) - **Fix:** `trade.maxFavorableExcursion = profitPercent` not `currentPnLDollars` - **Impact:** Analytics completely wrong for all trades - **Status:** Fixed Nov 23, 2025 -**10. Execute Endpoint Quality Bypass (#62)** - Quality check after timeframe validation -- **Bug:** Quality calculated but never validated → trades at quality 30 executed -- **Fix:** Add explicit quality check AFTER timeframe confirmation -- **Code:** `if (qualityResult.score < minQualityScore) return 400` -- **Status:** Fixed Nov 27, 2025 - --- ### Quick Links by Category