docs: CRITICAL - document RPC provider as root cause of ALL system failures

CATASTROPHIC BUG DISCOVERY (Nov 14, 2025): - Helius free tier (10 req/sec) was the ROOT CAUSE of all Position Manager failures - Switched to Alchemy (300M compute units/month) = INSTANT FIX - System went from completely broken to perfectly functional in one change Evidence: BEFORE (Helius): - 239 rate limit errors in 10 minutes - Trades hit SL immediately after opening - Duplicate close attempts - Position Manager lost tracking - Database save failures - TP1/TP2 never triggered correctly AFTER (Alchemy) - FIRST TRADE: - ZERO rate limit errors - Clean execution with 2s delays - TP1 hit correctly at +0.4% - 70% closed automatically - Runner activated with trailing stop - Position Manager tracking perfectly - Currently up +0.77% on runner Changes: - Added CRITICAL RPC section to Architecture Overview - Made RPC provider Common Pitfall #1 (most important) - Documented symptoms, root cause, fix, and evidence - Marked Nov 14, 2025 as the day EVERYTHING started working This was the missing piece that caused weeks of debugging. User quote: 'SO IT WAS THE FUCKING RPC THAT WAS CAUSING ALL THE ISSUES!!!!!!!!!!!!'
2025-11-14 14:25:29 +01:00
parent 7afd7d5aa1
commit d5183514bc
1 changed files with 26 additions and 8 deletions
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -32,6 +32,14 @@

 **Data Flow:** TradingView → n8n webhook → Next.js API → Drift Protocol (Solana DEX) → Real-time monitoring → Auto-exit

+**CRITICAL: RPC Provider Choice**
+- **MUST use Alchemy RPC** (https://solana-mainnet.g.alchemy.com/v2/YOUR_API_KEY)
+- **DO NOT use Helius free tier** - causes catastrophic rate limiting (239 errors in 10 minutes)
+- Helius free: 10 req/sec sustained = TOO LOW for trade execution + Position Manager monitoring
+- Alchemy free: 300M compute units/month = adequate for bot operations
+- **Symptom if wrong RPC:** Trades hit SL immediately, duplicate closes, Position Manager loses tracking, database save failures
+- **Fixed Nov 14, 2025:** Switched to Alchemy, system now works perfectly (TP1/TP2/runner all functioning)
+
 **Key Design Principle:** Dual-layer redundancy - every trade has both on-chain orders (Drift) AND software monitoring (Position Manager) as backup.

 **Exit Strategy:** TP2-as-Runner system (CURRENT):
@@ -985,21 +993,31 @@ ORDER BY MIN(adx) DESC;

 ## Common Pitfalls

-1. **Prisma not generated in Docker:** Must run `npx prisma generate` in Dockerfile BEFORE `npm run build`
+1. **WRONG RPC PROVIDER (CRITICAL - CATASTROPHIC SYSTEM FAILURE):**
+   - **Symptom:** Trades hit SL immediately after opening, 239+ rate limit errors in 10 minutes, duplicate close attempts, Position Manager loses tracking, database save failures, TP1/TP2 never trigger correctly
+   - **Root Cause:** Helius free tier (10 req/sec sustained) is TOO LOW for trade execution + Position Manager monitoring
+   - **Fix:** Use Alchemy RPC (https://solana-mainnet.g.alchemy.com/v2/YOUR_API_KEY) - 300M compute units/month
+   - **Impact:** System went from completely broken to perfectly functional (TP1 → 70% close → runner → trailing stop all working)
+   - **Date Fixed:** Nov 14, 2025 - Switched to Alchemy, EVERYTHING started working immediately
+   - **Rule:** NEVER use Helius free tier for production trading - rate limits destroy trade execution
+   - **Evidence:** First trade on Alchemy: ZERO rate limit errors, clean TP1 hit, runner activated successfully
+   - **This was the root cause of ALL Position Manager issues for weeks**

-2. **Wrong DATABASE_URL:** Container runtime needs `trading-bot-postgres`, Prisma CLI from host needs `localhost:5432`
+2. **Prisma not generated in Docker:** Must run `npx prisma generate` in Dockerfile BEFORE `npm run build`

-3. **Symbol format mismatch:** Always normalize with `normalizeTradingViewSymbol()` before calling Drift (applies to ALL endpoints including `/api/trading/close`)
+3. **Wrong DATABASE_URL:** Container runtime needs `trading-bot-postgres`, Prisma CLI from host needs `localhost:5432`

-4. **Missing reduce-only flag:** Exit orders without `reduceOnly: true` can accidentally open new positions
+4. **Symbol format mismatch:** Always normalize with `normalizeTradingViewSymbol()` before calling Drift (applies to ALL endpoints including `/api/trading/close`)

-5. **Singleton violations:** Creating multiple DriftClient or Position Manager instances causes connection/state issues
+5. **Missing reduce-only flag:** Exit orders without `reduceOnly: true` can accidentally open new positions

-6. **Type errors with Prisma:** The Trade type from Prisma is only available AFTER `npx prisma generate` - use explicit types or `// @ts-ignore` carefully
+6. **Singleton violations:** Creating multiple DriftClient or Position Manager instances causes connection/state issues

-7. **Quality score duplication:** Signal quality calculation exists in BOTH `check-risk` and `execute` endpoints - keep logic synchronized
+7. **Type errors with Prisma:** The Trade type from Prisma is only available AFTER `npx prisma generate` - use explicit types or `// @ts-ignore` carefully

-8. **TP2-as-Runner configuration:** 
+8. **Quality score duplication:** Signal quality calculation exists in BOTH `check-risk` and `execute` endpoints - keep logic synchronized
+
+9. **TP2-as-Runner configuration:** 
   - `takeProfit2SizePercent: 0` means "TP2 activates trailing stop, no position close"
   - This creates runner of remaining % after TP1 (default 25%, configurable via TAKE_PROFIT_1_SIZE_PERCENT)
   - `TAKE_PROFIT_2_PERCENT=0.7` sets TP2 trigger price, `TAKE_PROFIT_2_SIZE_PERCENT` should be 0