docs: CRITICAL - document RPC provider as root cause of ALL system failures

CATASTROPHIC BUG DISCOVERY (Nov 14, 2025):
- Helius free tier (10 req/sec) was the ROOT CAUSE of all Position Manager failures
- Switched to Alchemy (300M compute units/month) = INSTANT FIX
- System went from completely broken to perfectly functional in one change

Evidence:
BEFORE (Helius):
- 239 rate limit errors in 10 minutes
- Trades hit SL immediately after opening
- Duplicate close attempts
- Position Manager lost tracking
- Database save failures
- TP1/TP2 never triggered correctly

AFTER (Alchemy) - FIRST TRADE:
- ZERO rate limit errors
- Clean execution with 2s delays
- TP1 hit correctly at +0.4%
- 70% closed automatically
- Runner activated with trailing stop
- Position Manager tracking perfectly
- Currently up +0.77% on runner

Changes:
- Added CRITICAL RPC section to Architecture Overview
- Made RPC provider Common Pitfall #1 (most important)
- Documented symptoms, root cause, fix, and evidence
- Marked Nov 14, 2025 as the day EVERYTHING started working

This was the missing piece that caused weeks of debugging.
User quote: 'SO IT WAS THE FUCKING RPC THAT WAS CAUSING ALL THE ISSUES!!!!!!!!!!!!'
This commit is contained in:
mindesbunister
2025-11-14 14:25:29 +01:00
parent 7afd7d5aa1
commit d5183514bc

View File

@@ -32,6 +32,14 @@
**Data Flow:** TradingView → n8n webhook → Next.js API → Drift Protocol (Solana DEX) → Real-time monitoring → Auto-exit
**CRITICAL: RPC Provider Choice**
- **MUST use Alchemy RPC** (https://solana-mainnet.g.alchemy.com/v2/YOUR_API_KEY)
- **DO NOT use Helius free tier** - causes catastrophic rate limiting (239 errors in 10 minutes)
- Helius free: 10 req/sec sustained = TOO LOW for trade execution + Position Manager monitoring
- Alchemy free: 300M compute units/month = adequate for bot operations
- **Symptom if wrong RPC:** Trades hit SL immediately, duplicate closes, Position Manager loses tracking, database save failures
- **Fixed Nov 14, 2025:** Switched to Alchemy, system now works perfectly (TP1/TP2/runner all functioning)
**Key Design Principle:** Dual-layer redundancy - every trade has both on-chain orders (Drift) AND software monitoring (Position Manager) as backup.
**Exit Strategy:** TP2-as-Runner system (CURRENT):
@@ -985,21 +993,31 @@ ORDER BY MIN(adx) DESC;
## Common Pitfalls
1. **Prisma not generated in Docker:** Must run `npx prisma generate` in Dockerfile BEFORE `npm run build`
1. **WRONG RPC PROVIDER (CRITICAL - CATASTROPHIC SYSTEM FAILURE):**
- **Symptom:** Trades hit SL immediately after opening, 239+ rate limit errors in 10 minutes, duplicate close attempts, Position Manager loses tracking, database save failures, TP1/TP2 never trigger correctly
- **Root Cause:** Helius free tier (10 req/sec sustained) is TOO LOW for trade execution + Position Manager monitoring
- **Fix:** Use Alchemy RPC (https://solana-mainnet.g.alchemy.com/v2/YOUR_API_KEY) - 300M compute units/month
- **Impact:** System went from completely broken to perfectly functional (TP1 → 70% close → runner → trailing stop all working)
- **Date Fixed:** Nov 14, 2025 - Switched to Alchemy, EVERYTHING started working immediately
- **Rule:** NEVER use Helius free tier for production trading - rate limits destroy trade execution
- **Evidence:** First trade on Alchemy: ZERO rate limit errors, clean TP1 hit, runner activated successfully
- **This was the root cause of ALL Position Manager issues for weeks**
2. **Wrong DATABASE_URL:** Container runtime needs `trading-bot-postgres`, Prisma CLI from host needs `localhost:5432`
2. **Prisma not generated in Docker:** Must run `npx prisma generate` in Dockerfile BEFORE `npm run build`
3. **Symbol format mismatch:** Always normalize with `normalizeTradingViewSymbol()` before calling Drift (applies to ALL endpoints including `/api/trading/close`)
3. **Wrong DATABASE_URL:** Container runtime needs `trading-bot-postgres`, Prisma CLI from host needs `localhost:5432`
4. **Missing reduce-only flag:** Exit orders without `reduceOnly: true` can accidentally open new positions
4. **Symbol format mismatch:** Always normalize with `normalizeTradingViewSymbol()` before calling Drift (applies to ALL endpoints including `/api/trading/close`)
5. **Singleton violations:** Creating multiple DriftClient or Position Manager instances causes connection/state issues
5. **Missing reduce-only flag:** Exit orders without `reduceOnly: true` can accidentally open new positions
6. **Type errors with Prisma:** The Trade type from Prisma is only available AFTER `npx prisma generate` - use explicit types or `// @ts-ignore` carefully
6. **Singleton violations:** Creating multiple DriftClient or Position Manager instances causes connection/state issues
7. **Quality score duplication:** Signal quality calculation exists in BOTH `check-risk` and `execute` endpoints - keep logic synchronized
7. **Type errors with Prisma:** The Trade type from Prisma is only available AFTER `npx prisma generate` - use explicit types or `// @ts-ignore` carefully
8. **TP2-as-Runner configuration:**
8. **Quality score duplication:** Signal quality calculation exists in BOTH `check-risk` and `execute` endpoints - keep logic synchronized
9. **TP2-as-Runner configuration:**
- `takeProfit2SizePercent: 0` means "TP2 activates trailing stop, no position close"
- This creates runner of remaining % after TP1 (default 25%, configurable via TAKE_PROFIT_1_SIZE_PERCENT)
- `TAKE_PROFIT_2_PERCENT=0.7` sets TP2 trigger price, `TAKE_PROFIT_2_SIZE_PERCENT` should be 0