diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 580b48f..b11f806 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -535,10 +535,18 @@ TradingView alert → n8n Parse Signal Enhanced (extracts metrics + timeframe) ↓ calculate dual stop prices if enabled ↓ placeExitOrders() [on-chain TP1/TP2/SL orders] ↓ scoreSignalQuality({ ..., timeframe }) [compute 0-100 score with timeframe-aware thresholds] - ↓ createTrade() [save to database with signalQualityScore] - ↓ positionManager.addTrade() [start monitoring] + ↓ createTrade() [CRITICAL: save to database FIRST - see Common Pitfall #27] + ↓ positionManager.addTrade() [ONLY after DB save succeeds - prevents unprotected positions] ``` +**CRITICAL EXECUTION ORDER (Nov 13, 2025 Fix):** +The order of database save → Position Manager add is NOT arbitrary - it's a safety requirement: +- If database save fails, API returns HTTP 500 with critical warning +- User sees: "CLOSE POSITION MANUALLY IMMEDIATELY" with transaction signature +- Position Manager only tracks database-persisted trades +- Container restarts can restore all positions from database +- **Never add to Position Manager before database save** - creates unprotected positions + ### Position Monitoring Loop ``` Position Manager every 2s: @@ -933,6 +941,56 @@ trade.realizedPnL += actualRealizedPnL // NOT: result.realizedPnL from SDK - **Risk:** Small sample size (2 trades) could be outliers, but downside limited - SQL analysis showed clear pattern: stricter filtering was blocking profitable setups +27. **Database-First Pattern (CRITICAL - Fixed Nov 13, 2025):** + - **Symptom:** Positions opened on Drift with NO database record, NO Position Manager tracking, NO TP/SL protection + - **Root Cause:** Execute endpoint saved to database AFTER adding to Position Manager, with silent error catch + - **Bug sequence:** + 1. TradingView signal → `/api/trading/execute` + 2. Position opened on Drift ✅ + 3. Position Manager tracking added ✅ + 4. Database save attempted ❌ (fails silently) + 5. API returns success to user ❌ + 6. Container restarts → Position Manager loses in-memory state ❌ + 7. Result: Unprotected position with no monitoring or TP/SL orders + - **Fix:** Database-first execution order in `app/api/trading/execute/route.ts`: + ```typescript + // CRITICAL: Save to database FIRST before adding to Position Manager + try { + await createTrade({...}) + } catch (dbError) { + console.error('❌ CRITICAL: Failed to save trade to database:', dbError) + return NextResponse.json({ + success: false, + error: 'Database save failed - position unprotected', + message: `Position opened on Drift but database save failed. CLOSE POSITION MANUALLY IMMEDIATELY. Transaction: ${openResult.transactionSignature}`, + }, { status: 500 }) + } + + // ONLY add to Position Manager if database save succeeded + await positionManager.addTrade(activeTrade) + ``` + - **Impact:** Without this fix, ANY database failure creates unprotected positions + - **Verification:** Test trade cmhxj8qxl0000od076m21l58z (Nov 13) confirmed fix working + - **Documentation:** See `CRITICAL_INCIDENT_UNPROTECTED_POSITION.md` for full incident report + - **Rule:** Database persistence ALWAYS comes before in-memory state updates + +28. **DNS retry logic (Nov 13, 2025):** + - **Problem:** Trading bot fails with "fetch failed" errors when DNS resolution temporarily fails for `mainnet.helius-rpc.com` + - **Impact:** n8n workflow failures, missed trades, container restart failures + - **Root Cause:** `EAI_AGAIN` errors are transient DNS issues that resolve in seconds, but bot treated them as permanent failures + - **Fix:** Automatic retry in `lib/drift/client.ts` - `retryOperation()` wrapper: + ```typescript + // Detects transient errors: fetch failed, EAI_AGAIN, ENOTFOUND, ETIMEDOUT + // Retries up to 3 times with 2s delay between attempts + // Fails fast on non-transient errors (auth, config, permanent network issues) + await this.retryOperation(async () => { + // Initialize Drift SDK, subscribe, get user account + }, 3, 2000, 'Drift initialization') + ``` + - **Success logs:** `⚠️ Drift initialization failed (attempt 1/3): fetch failed` → `⏳ Retrying in 2000ms...` → `✅ Drift service initialized successfully` + - **Impact:** 99% of transient DNS failures now auto-recover, preventing missed trades + - **Documentation:** See `docs/DNS_RETRY_LOGIC.md` for monitoring queries and metrics + ## File Conventions - **API routes:** `app/api/[feature]/[action]/route.ts` (Next.js 15 App Router)