CRITICAL BUG: Missing retry wrapper caused rate limit storm
Real Incident (Nov 15, 16:49 CET):
- Trade cmi0il8l30000r607l8aec701 triggered close attempt
- closePosition() had NO retryWithBackoff() wrapper
- Failed with 429 → Position Manager retried EVERY 2 SECONDS
- 100+ close attempts exhausted Helius rate limit
- On-chain TP2 filled during storm
- External closure detected 8 times: $0.14 → $0.51 (compounding bug)
Why This Was Missed:
- placeExitOrders() got retry wrapper on Nov 14
- openPosition() still has no wrapper (less critical - runs once)
- closePosition() overlooked - MOST CRITICAL because runs in monitoring loop
- Position Manager executeExit() catches 429 and returns early
- But monitoring continues, retries close every 2s = infinite loop
The Fix:
- Wrapped closePosition() placePerpOrder() with retryWithBackoff()
- 8s base delay, 3 max retries (same as placeExitOrders)
- Reduces RPC load by 30-50x during close operations
- Container deployed 18:05 CET Nov 15
Impact: Prevents rate limit exhaustion + duplicate external closure updates
Files: .github/copilot-instructions.md (added Common Pitfall #36)
AI Agent Instructions for Trading Bot v4
Mission & Financial Goals
Primary Objective: Build wealth systematically from $106 → $100,000+ through algorithmic trading
Current Phase: Phase 1 - Survival & Proof (Nov 2025 - Jan 2026)
- Current Capital: $97.55 USDC (zero debt, 100% health)
- Starting Capital: $106 (Nov 2025)
- Target: $2,500 by end of Phase 1 (Month 2.5)
- Strategy: Aggressive compounding, 0 withdrawals
- Position Sizing: 100% of free collateral (~$97 at 15x leverage = ~$1,463 notional)
- Risk Tolerance: EXTREME - This is recovery/proof-of-concept mode
- Win Target: 20-30% monthly returns to reach $2,500
- Trades Executed: 161 (as of Nov 12, 2025)
Why This Matters for AI Agents:
- Every dollar counts at this stage - optimize for profitability, not just safety
- User needs this system to work for long-term financial goals ($300-500/month withdrawals starting Month 3)
- No changes that reduce win rate unless they improve profit factor
- System must prove itself before scaling (see TRADING_GOALS.md for full 8-phase roadmap)
Key Constraints:
- Can't afford extended drawdowns (limited capital)
- Must maintain 60%+ win rate to compound effectively
- Quality over quantity - only trade 60+ signal quality scores (lowered from 65 on Nov 12, 2025)
- After 3 consecutive losses, STOP and review system
Architecture Overview
Type: Autonomous cryptocurrency trading bot with Next.js 15 frontend + Solana/Drift Protocol backend
Data Flow: TradingView → n8n webhook → Next.js API → Drift Protocol (Solana DEX) → Real-time monitoring → Auto-exit
CRITICAL: RPC Provider Choice
- MUST use Alchemy RPC (https://solana-mainnet.g.alchemy.com/v2/YOUR_API_KEY)
- DO NOT use Helius free tier - causes catastrophic rate limiting (239 errors in 10 minutes)
- Helius free: 10 req/sec sustained = TOO LOW for trade execution + Position Manager monitoring
- Alchemy free: 300M compute units/month = adequate for bot operations
- Symptom if wrong RPC: Trades hit SL immediately, duplicate closes, Position Manager loses tracking, database save failures
- Fixed Nov 14, 2025: Switched to Alchemy, system now works perfectly (TP1/TP2/runner all functioning)
Key Design Principle: Dual-layer redundancy - every trade has both on-chain orders (Drift) AND software monitoring (Position Manager) as backup.
Exit Strategy: TP2-as-Runner system (CURRENT):
- TP1 at +0.4%: Close configurable % (default 75%, adjustable via TAKE_PROFIT_1_SIZE_PERCENT)
- TP2 at +0.7%: Activates trailing stop on full remaining % (no position close)
- Runner: Remaining % after TP1 with ATR-based trailing stop (default 25%, configurable)
- Note: All UI displays dynamically calculate runner% as 100 - TAKE_PROFIT_1_SIZE_PERCENT
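As a rough illustration of the exit levels listed above, the sketch below derives the TP1/TP2 prices and the runner share from an entry price. The function name planExits and its types are hypothetical and not taken from the codebase; only the +0.4% / +0.7% levels, the 75% default, and the 100 - TAKE_PROFIT_1_SIZE_PERCENT rule come from the text.

```typescript
// Hypothetical types for illustration; the real bot may model this differently.
type Direction = 'long' | 'short'

interface ExitPlan {
  tp1Price: number
  tp2Price: number
  tp1ClosePercent: number
  runnerPercent: number
}

// tp1Percent / tp2Percent are percent moves from entry (0.4 means +0.4%).
function planExits(
  entryPrice: number,
  direction: Direction,
  tp1SizePercent: number = 75, // mirrors the TAKE_PROFIT_1_SIZE_PERCENT default
  tp1Percent: number = 0.4,
  tp2Percent: number = 0.7
): ExitPlan {
  // For shorts, a favorable move is a price decrease.
  const sign = direction === 'long' ? 1 : -1
  return {
    tp1Price: entryPrice * (1 + (sign * tp1Percent) / 100),
    tp2Price: entryPrice * (1 + (sign * tp2Percent) / 100),
    tp1ClosePercent: tp1SizePercent,
    // Runner share is whatever TP1 leaves open; TP2 closes nothing.
    runnerPercent: 100 - tp1SizePercent,
  }
}
```

With the defaults, a long entered at 100 would target roughly 100.4 for TP1 and 100.7 for TP2, leaving a 25% runner.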
Per-Symbol Configuration: SOL and ETH have independent enable/disable toggles and position sizing:
- SOLANA_ENABLED, SOLANA_POSITION_SIZE, SOLANA_LEVERAGE (defaults: true, 100%, 15x)
- ETHEREUM_ENABLED, ETHEREUM_POSITION_SIZE, ETHEREUM_LEVERAGE (defaults: true, 100%, 1x)
- BTC and other symbols fall back to global settings (MAX_POSITION_SIZE_USD, LEVERAGE)
- Priority: Per-symbol ENV → Market config → Global ENV → Defaults
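The priority chain above can be read as "first defined value wins". A minimal sketch of that lookup follows; the LeverageSources shape and the resolveLeverage name are assumptions made for illustration and do not describe how config/trading.ts is actually written.

```typescript
// Hypothetical container for the four layers named in the priority chain.
interface LeverageSources {
  perSymbolEnv?: number // e.g. parsed from SOLANA_LEVERAGE
  marketConfig?: number
  globalEnv?: number // e.g. parsed from LEVERAGE
  defaultValue: number
}

// Returns the first candidate that is not undefined, or undefined if none is set.
function firstDefined<T>(...candidates: Array<T | undefined>): T | undefined {
  return candidates.find((value) => value !== undefined)
}

// Per-symbol ENV → Market config → Global ENV → Defaults
function resolveLeverage(sources: LeverageSources): number {
  const resolved = firstDefined(sources.perSymbolEnv, sources.marketConfig, sources.globalEnv)
  return resolved !== undefined ? resolved : sources.defaultValue
}
```

Checking against undefined rather than truthiness keeps an explicit 0 from silently falling through to a lower-priority layer.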
Signal Quality System: Filters trades based on 5 metrics (ATR, ADX, RSI, volumeRatio, pricePosition) scored 0-100. Only trades scoring 60+ are executed (lowered from 65 after data analysis showed 60-64 tier outperformed higher scores). Scores stored in database for future optimization.
Timeframe-Aware Scoring: Signal quality thresholds adjust based on timeframe (5min vs daily):
- 5min: ADX 12+ trending (vs 18+ for daily), ATR 0.2-0.7% healthy (vs 0.4%+ for daily)
- Anti-chop filter: -20 points for extreme sideways (ADX <10, ATR <0.25%, Vol <0.9x)
- Pass timeframe param to scoreSignalQuality() from TradingView alerts (e.g., timeframe: "5")
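To make the thresholds above concrete, here is a small sketch of the anti-chop penalty and the timeframe-dependent ADX threshold. It is not the code in lib/trading/signal-quality.ts: the function names are hypothetical, and it assumes all three sideways conditions must hold together for the -20 penalty to apply.

```typescript
interface ChopInputs {
  adx: number
  atrPercent: number // ATR as a percent of price
  volumeRatio: number // current volume relative to average
}

// Thresholds from the text: ADX < 10, ATR < 0.25%, volume < 0.9x.
// Assumption: the penalty requires all three conditions at once.
function antiChopPenalty(metrics: ChopInputs): number {
  const extremeSideways =
    metrics.adx < 10 && metrics.atrPercent < 0.25 && metrics.volumeRatio < 0.9
  return extremeSideways ? -20 : 0
}

// Minimum ADX treated as trending: 12 on the 5min chart, 18 otherwise.
function trendingAdxThreshold(timeframe?: string): number {
  return timeframe === '5' ? 12 : 18
}
```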
MAE/MFE Tracking: Every trade tracks Maximum Favorable Excursion (best profit %) and Maximum Adverse Excursion (worst loss %) updated every 2s. Used for data-driven optimization of TP/SL levels.
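A minimal sketch of how such a running MAE/MFE update could look on each 2s tick. The field names match the database fields mentioned later in this document, but the functions themselves are illustrative assumptions rather than the Position Manager's code.

```typescript
interface Excursions {
  maxFavorableExcursion: number // best P&L % seen so far
  maxAdverseExcursion: number // worst P&L % seen so far
}

// Unleveraged P&L in percent of entry price.
function pnlPercent(direction: 'long' | 'short', entryPrice: number, currentPrice: number): number {
  const move = direction === 'long' ? currentPrice - entryPrice : entryPrice - currentPrice
  return (move / entryPrice) * 100
}

// Called once per price update; only ever widens the recorded extremes.
function updateExcursions(prev: Excursions, currentPnlPercent: number): Excursions {
  return {
    maxFavorableExcursion: Math.max(prev.maxFavorableExcursion, currentPnlPercent),
    maxAdverseExcursion: Math.min(prev.maxAdverseExcursion, currentPnlPercent),
  }
}
```

Starting both values at 0 means a trade that never goes negative reports an MAE of 0 rather than a positive number.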
Manual Trading via Telegram: Send plain-text messages like long sol, short eth, long btc to open positions instantly (bypasses n8n, calls /api/trading/execute directly with preset healthy metrics). CRITICAL: Manual trades are marked with signalSource='manual' and excluded from TradingView indicator analysis (prevents data contamination).
Re-Entry Analytics System: Manual trades are validated before execution using fresh TradingView data:
- Market data cached from TradingView signals (5min expiry)
- /api/analytics/reentry-check scores re-entry based on fresh metrics + recent performance
- Telegram bot blocks low-quality re-entries unless --force flag used
- Uses real TradingView ADX/ATR/RSI when available, falls back to historical data
- Penalty for recent losing trades, bonus for winning streaks
VERIFICATION MANDATE: Financial Code Requires Proof
CRITICAL: THIS IS A REAL MONEY TRADING SYSTEM - NOT A TOY PROJECT
Core Principle: In trading systems, "working" means "verified with real data", NOT "code looks correct".
NEVER declare something working without:
- Observing actual logs showing expected behavior
- Verifying database state matches expectations
- Comparing calculated values to source data
- Testing with real trades when applicable
- CONFIRMING CODE IS DEPLOYED - Check container start time vs commit time
CODE COMMITTED ≠ CODE DEPLOYED
- Git commit at 15:56 means NOTHING if container started at 15:06
- ALWAYS verify: docker logs trading-bot-v4 | grep "Server starting" | head -1
- Compare container start time to commit timestamp
- If container older than commit: CODE NOT DEPLOYED, FIX NOT ACTIVE
- Never say "fixed" or "protected" until deployment verified
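The comparison described above reduces to a single timestamp check. The helper below is a hypothetical sketch (isFixDeployed is not an existing function in the repo) and assumes both timestamps have already been parsed into Date objects in the same timezone.

```typescript
// A fix is only active if the container started AFTER the commit was made.
function isFixDeployed(containerStartedAt: Date, committedAt: Date): boolean {
  return containerStartedAt.getTime() > committedAt.getTime()
}

// Example from the text: commit at 15:56, container started at 15:06.
const committedAt = new Date('2025-11-15T15:56:00Z')
const staleContainer = new Date('2025-11-15T15:06:00Z')
const freshContainer = new Date('2025-11-15T16:02:00Z')
const staleResult = isFixDeployed(staleContainer, committedAt) // false: code not deployed
const freshResult = isFixDeployed(freshContainer, committedAt) // true: fix is active
```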
Critical Path Verification Requirements
Position Manager Changes:
- Execute test trade with DRY_RUN=false (small size)
- Watch docker logs for full TP1 → TP2 → exit cycle
- SQL query: verify tp1Hit, slMovedToBreakeven, currentSize match Position Manager logs
- Compare Position Manager tracked size to actual Drift position size
- Check exit reason matches actual trigger (TP1/TP2/SL/trailing)
Exit Logic Changes (TP/SL/Trailing):
- Log EXPECTED values (TP1 price, SL price after breakeven, trailing stop distance)
- Log ACTUAL values from Drift position and Position Manager state
- Verify: Does TP1 hit when price crosses TP1? Does SL move to breakeven?
- Test: Open position, let it hit TP1, verify 75% closed + SL moved
- Document: What SHOULD happen vs what ACTUALLY happened
API Endpoint Changes:
- curl test with real payload from TradingView/n8n
- Check response JSON matches expectations
- Verify database record created with correct fields
- Check Telegram notification shows correct values (leverage, size, etc.)
- SQL query: confirm all fields populated correctly
Calculation Changes (P&L, Position Sizing, Percentages):
- Add console.log for EVERY step of calculation
- Verify units match (tokens vs USD, percent vs decimal, etc.)
- SQL query with manual calculation: does code result match hand calculation?
- Test edge cases: 0%, 100%, negative values, very small/large numbers
SDK/External Data Integration:
- Log raw SDK response to verify assumptions about data format
- NEVER trust documentation - verify with console.log
- Example: position.size doc said "USD" but logs showed "tokens"
- Document actual behavior in Common Pitfalls section
Red Flags Requiring Extra Verification
High-Risk Changes:
- Unit conversions (tokens ↔ USD, percent ↔ decimal)
- State transitions (TP1 hit → move SL to breakeven)
- Configuration precedence (per-symbol vs global vs defaults)
- Display values from complex calculations (leverage, size, P&L)
- Timing-dependent logic (grace periods, cooldowns, race conditions)
Verification Steps for Each:
- Before declaring working: Show proof (logs, SQL results, test output)
- After deployment: Monitor first real trade closely, verify behavior
- Edge cases: Test boundary conditions (0, 100%, max leverage, min size)
- Regression: Check that fix didn't break other functionality
SQL Verification Queries
After Position Manager changes:
-- Verify TP1 detection worked correctly
-- (camelCase columns are double-quoted: PostgreSQL folds unquoted identifiers to lowercase)
SELECT
  symbol, "entryPrice", "currentSize", "realizedPnL",
  "tp1Hit", "slMovedToBreakeven", "exitReason",
  TO_CHAR("createdAt", 'MM-DD HH24:MI') AS time
FROM "Trade"
WHERE "exitReason" IS NULL -- Open positions
  OR "createdAt" > NOW() - INTERVAL '1 hour' -- Recent closes
ORDER BY "createdAt" DESC
LIMIT 5;
-- Compare Position Manager state to expectations
SELECT "configSnapshot"->'positionManagerState' AS pm_state
FROM "Trade"
WHERE symbol = 'SOL-PERP' AND "exitReason" IS NULL;
After calculation changes:
-- Verify P&L calculations
-- (camelCase columns are double-quoted: PostgreSQL folds unquoted identifiers to lowercase)
SELECT
  symbol, direction, "entryPrice", "exitPrice",
  "positionSize", "realizedPnL",
  -- Manual calculation:
  CASE
    WHEN direction = 'long' THEN
      "positionSize" * (("exitPrice" - "entryPrice") / "entryPrice")
    ELSE
      "positionSize" * (("entryPrice" - "exitPrice") / "entryPrice")
  END AS expected_pnl,
  -- Difference:
  "realizedPnL" - CASE
    WHEN direction = 'long' THEN
      "positionSize" * (("exitPrice" - "entryPrice") / "entryPrice")
    ELSE
      "positionSize" * (("entryPrice" - "exitPrice") / "entryPrice")
  END AS pnl_difference
FROM "Trade"
WHERE "exitReason" IS NOT NULL
  AND "createdAt" > NOW() - INTERVAL '24 hours'
ORDER BY "createdAt" DESC
LIMIT 10;
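The CASE expression in the query above can be mirrored in application code for hand-checking a single trade. This is a sketch of the same arithmetic only; expectedPnl is a hypothetical helper, and it assumes positionSize is a USD notional, as the query does.

```typescript
// Same formula as the SQL CASE: size * (favorable price move / entry price).
function expectedPnl(
  direction: 'long' | 'short',
  positionSizeUSD: number,
  entryPrice: number,
  exitPrice: number
): number {
  const favorableMove = direction === 'long' ? exitPrice - entryPrice : entryPrice - exitPrice
  return positionSizeUSD * (favorableMove / entryPrice)
}

// Difference between what was stored and what the formula predicts.
function pnlDifference(realizedPnL: number, expected: number): number {
  return realizedPnL - expected
}
```

For example, a $1,000 long from 100 to 101 should show about $10 of profit; a stored realizedPnL far from that points at a unit or sizing bug.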
Example: How Position.size Bug Should Have Been Caught
What went wrong:
- Read code: "Looks like it's comparing sizes correctly"
- Declared: "Position Manager is working!"
- Didn't verify with actual trade
What should have been done:
// In Position Manager monitoring loop - ADD THIS LOGGING:
console.log('🔍 VERIFICATION:', {
positionSizeRaw: position.size, // What SDK returns
positionSizeUSD: position.size * currentPrice, // Converted to USD
trackedSizeUSD: trade.currentSize, // What we're tracking
ratio: (position.size * currentPrice) / trade.currentSize,
tp1ShouldTrigger: (position.size * currentPrice) < trade.currentSize * 0.95
})
Then observe logs on actual trade:
🔍 VERIFICATION: {
positionSizeRaw: 12.28, // ← AH! This is SOL tokens, not USD!
positionSizeUSD: 1950.84, // ← Correct USD value
trackedSizeUSD: 1950.00,
ratio: 1.0004, // ← Should be near 1.0 when position full
tp1ShouldTrigger: false // ← Correct
}
Lesson: One console.log would have exposed the bug immediately.
Deployment Checklist
MANDATORY PRE-DEPLOYMENT VERIFICATION:
- Check container start time: docker logs trading-bot-v4 | grep "Server starting" | head -1
- Compare to commit timestamp: Container MUST be newer than code changes
- If container older: STOP - Code not deployed, fix not active
- Never declare "fixed" or "working" until container restarted with new code
Before marking feature complete:
- Code review completed
- Unit tests pass (if applicable)
- Integration test with real API calls
- Logs show expected behavior
- Database state verified with SQL
- Edge cases tested
- Container restarted and verified running new code
- Documentation updated (including Common Pitfalls if applicable)
- User notified of what to verify during first real trade
When to Escalate to User
Don't say "it's working" if:
- You haven't observed actual logs showing the expected behavior
- SQL query shows unexpected values
- Test trade behaved differently than expected
- You're unsure about unit conversions or SDK behavior
- Change affects money (position sizing, P&L, exits)
- Container hasn't been restarted since code commit
Instead say:
- "Code is updated. Need to verify with test trade - watch for [specific log message]"
- "Fixed, but requires verification: check database shows [expected value]"
- "Deployed. First real trade should show [behavior]. If not, there's still a bug."
- "Code committed but NOT deployed - container running old version, fix not active yet"
Docker Build Best Practices
CRITICAL: Prevent build interruptions with background execution + live monitoring
Docker builds take 40-70 seconds and are easily interrupted by terminal issues. Use this pattern:
# Start build in background with live log tail
cd /home/icke/traderv4 && docker compose build trading-bot > /tmp/docker-build-live.log 2>&1 & BUILD_PID=$!; echo "Build started, PID: $BUILD_PID"; tail -f /tmp/docker-build-live.log
Why this works:
- Build runs in background (&) - immune to terminal disconnects/Ctrl+C
- Output redirected to log file - can review later if needed
- tail -f shows real-time progress - see compilation, linting, errors
- Can Ctrl+C the tail -f without killing build - build continues
- Verification after: tail -50 /tmp/docker-build-live.log to check success
Success indicators:
- ✓ Compiled successfully in 27s
- ✓ Generating static pages (30/30)
- #22 naming to docker.io/library/traderv4-trading-bot done
- DONE X.Xs on final step
Failure indicators:
- Failed to compile.
- Type error:
- ERROR: process "/bin/sh -c npm run build" did not complete successfully: exit code: 1
After successful build:
# Deploy new container
docker compose up -d --force-recreate trading-bot
# Verify it started
docker logs --tail=30 trading-bot-v4
# Confirm deployed version
docker logs trading-bot-v4 | grep "Server starting" | head -1
DO NOT use: docker compose build trading-bot in foreground - one network hiccup kills 60s of work
Docker Cleanup After Builds
CRITICAL: Prevent disk full issues from build cache accumulation
Docker builds create intermediate layers (1.3+ GB per build) that accumulate over time. Build cache can reach 40-50 GB after frequent rebuilds.
After successful deployment, clean up:
# Remove dangling images (old builds)
docker image prune -f
# Remove build cache (biggest space hog - 40+ GB typical)
docker builder prune -f
# Optional: Remove dangling volumes (if no important data)
docker volume prune -f
# Check space saved
docker system df
When to run:
- After each successful deployment (recommended)
- Weekly if building frequently
- When disk space warnings appear
- Before major updates/migrations
Space typically freed:
- Dangling images: 2-5 GB
- Build cache: 40-50 GB
- Dangling volumes: 0.5-1 GB
- Total: 40-55 GB per cleanup
What's safe to delete:
- <none> tagged images (old builds)
- Build cache (recreated on next build)
- Dangling volumes (orphaned from removed containers)
What NOT to delete:
- Named volumes (contain data: trading-bot-postgres, etc.)
- Active containers
- Tagged images currently in use
Critical Components
1. Phantom Trade Auto-Closure System
Purpose: Automatically close positions when size mismatch detected (position opened but wrong size)
When triggered:
- Position opened on Drift successfully
- Expected size: $50 (50% @ 1x leverage)
- Actual size: $1.37 (~2.7% fill - likely oracle price stale or exchange rejection)
- Size ratio < 50% threshold → phantom detected
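The detection rule above is a plain ratio test. A minimal sketch, assuming both sizes are USD notionals; isPhantomFill is a hypothetical name and the real check in app/api/trading/execute/route.ts may differ in detail.

```typescript
// Flags a fill as phantom when the actual size is below minRatio of what was requested.
function isPhantomFill(
  expectedSizeUSD: number,
  actualSizeUSD: number,
  minRatio: number = 0.5 // the 50% threshold from the text
): boolean {
  if (expectedSizeUSD <= 0) return false // nothing meaningful to compare against
  return actualSizeUSD / expectedSizeUSD < minRatio
}
```

With the numbers from the scenario above, 1.37 / 50 is about 0.027, well under the 0.5 threshold, so the position would be flagged.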
Automated response (all happens in <1 second):
- Immediate closure: Market order closes 100% of phantom position
- Database logging: Creates trade record with status='phantom', saves P&L
- n8n notification: Returns HTTP 200 with full details (not 500 - allows workflow to continue)
- Telegram alert: Message includes entry/exit prices, P&L, reason, transaction IDs
Why auto-close instead of manual intervention:
- User may be asleep, away from devices, unavailable for hours
- Unmonitored position = unlimited risk exposure
- Position Manager won't track phantom (by design)
- No TP/SL protection, no trailing stop, no monitoring
- Better to exit with small loss/gain than leave position exposed
- Re-entry always possible if setup was actually good
Example notification:
⚠️ PHANTOM TRADE AUTO-CLOSED
Symbol: SOL-PERP
Direction: LONG
Expected Size: $48.75
Actual Size: $1.37 (2.8%)
Entry: $168.50
Exit: $168.45
P&L: -$0.02
Reason: Size mismatch detected - likely oracle price issue or exchange rejection
Action: Position auto-closed for safety (unmonitored positions = risk)
TX: 5Yx2Fm8vQHKLdPaw...
Database tracking:
- status='phantom' field identifies these trades
- isPhantom=true, phantomReason='ORACLE_PRICE_MISMATCH'
- expectedSizeUSD, actualSizeUSD fields for analysis
- Exit reason: 'manual' (phantom auto-close category)
- Enables post-trade analysis of phantom frequency and patterns
Code location: app/api/trading/execute/route.ts lines 322-445
2. Signal Quality Scoring (lib/trading/signal-quality.ts)
Purpose: Unified quality validation system that scores trading signals 0-100 based on 5 market metrics
Timeframe-aware thresholds:
scoreSignalQuality({
atr, adx, rsi, volumeRatio, pricePosition,
timeframe?: string // "5" for 5min, undefined for higher timeframes
})
5min chart adjustments:
- ADX healthy range: 12-22 (vs 18-30 for daily)
- ATR healthy range: 0.2-0.7% (vs 0.4%+ for daily)
- Anti-chop filter: -20 points for extreme sideways (ADX <10, ATR <0.25%, Vol <0.9x)
Price position penalties (all timeframes):
- Long at 90-95%+ range: -15 to -30 points (chasing highs)
- Short at <5-10% range: -15 to -30 points (chasing lows)
- Prevents flip-flop losses from entering range extremes
Key behaviors:
- Returns score 0-100 and detailed breakdown object
- Minimum score 60 required to execute trade
- Called by both /api/trading/check-risk and /api/trading/execute
- Scores saved to database for post-trade analysis
2. Position Manager (lib/trading/position-manager.ts)
Purpose: Software-based monitoring loop that checks prices every 2 seconds and closes positions via market orders
Singleton pattern: Always use getInitializedPositionManager() - never instantiate directly
const positionManager = await getInitializedPositionManager()
await positionManager.addTrade(activeTrade)
Key behaviors:
- Tracks ActiveTrade objects in a Map
- TP2-as-Runner system: TP1 (configurable %, default 75%) → TP2 trigger (no close, activate trailing) → Runner (remaining %) with ATR-based trailing stop
- Dynamic SL adjustments: Moves to breakeven after TP1, locks profit at +1.2%
- On-chain order synchronization: After TP1 hits, calls cancelAllOrders() then placeExitOrders() with updated SL price at breakeven (uses retryWithBackoff() for rate limit handling)
- ATR-based trailing stop: Calculates trail distance as (atrAtEntry / currentPrice × 100) × trailingStopAtrMultiplier, clamped between min/max %
- Trailing stop: Activates when TP2 price hit, tracks peakPrice and trails dynamically
- Closes positions via closePosition() market orders when targets hit
- Acts as backup if on-chain orders don't fill
- State persistence: Saves to database, restores on restart via configSnapshot.positionManagerState
- Startup validation: On container restart, cross-checks last 24h "closed" trades against Drift to detect orphaned positions (see lib/startup/init-position-manager.ts)
- Grace period for new trades: Skips "external closure" detection for positions <30 seconds old (Drift positions take 5-10s to propagate)
- Exit reason detection: Uses trade state flags (tp1Hit, tp2Hit) and realized P&L to determine exit reason, NOT current price (avoids misclassification when price moves after order fills)
- Real P&L calculation: Calculates actual profit based on entry vs exit price, not SDK's potentially incorrect values
- Rate limit-aware exit: On 429 errors during close, keeps trade in monitoring (doesn't mark closed), retries naturally on next price update
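The ATR-based trail distance in the list above is a one-line formula plus a clamp. The sketch below follows that formula as written; the function names and the separate min/max parameters are assumptions for illustration, and the numeric values used in the usage note are made up.

```typescript
// (atrAtEntry / currentPrice × 100) × trailingStopAtrMultiplier, clamped to [minPercent, maxPercent].
function trailingDistancePercent(
  atrAtEntry: number,
  currentPrice: number,
  trailingStopAtrMultiplier: number,
  minPercent: number,
  maxPercent: number
): number {
  const atrPercent = (atrAtEntry / currentPrice) * 100
  const rawDistance = atrPercent * trailingStopAtrMultiplier
  return Math.min(maxPercent, Math.max(minPercent, rawDistance))
}

// The stop trails the best price seen so far (peakPrice), on the losing side of it.
function trailingStopPrice(
  direction: 'long' | 'short',
  peakPrice: number,
  distancePercent: number
): number {
  const factor = direction === 'long' ? 1 - distancePercent / 100 : 1 + distancePercent / 100
  return peakPrice * factor
}
```

For instance, with a hypothetical ATR of 0.8 at a price of 160 and a multiplier of 1.5, the raw distance is about 0.75%; a very quiet or very volatile entry would instead be pinned to the configured minimum or maximum.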
3. Telegram Bot (telegram_command_bot.py)
Purpose: Python-based Telegram bot for manual trading commands and position status monitoring
Manual trade commands via plain text:
# User sends plain text message (not slash commands)
"long sol" → Validates via analytics, then opens SOL-PERP long
"short eth" → Validates via analytics, then opens ETH-PERP short
"long btc --force" → Skips analytics validation, opens BTC-PERP long immediately
Key behaviors:
- MessageHandler processes all text messages (not just commands)
- Maps user-friendly symbols (sol, eth, btc) to Drift format (SOL-PERP, etc.)
- Analytics validation: Calls /api/analytics/reentry-check before execution
  - Blocks trades with score <55 unless --force flag used
  - Uses fresh TradingView data (<5min old) when available
  - Falls back to historical metrics with penalty
  - Considers recent trade performance (last 3 trades)
- Calls /api/trading/execute directly with preset healthy metrics (ATR=0.45, ADX=32, RSI=58/42)
- Bypasses n8n workflow and TradingView requirements
- 60-second timeout for API calls
- Responds with trade confirmation or analytics rejection message
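The plain-text command handling described above amounts to a small parser. The actual bot is written in Python; the TypeScript sketch below only illustrates the parsing rules (direction, symbol mapping, optional --force) and every name in it is hypothetical.

```typescript
// User-friendly coin names mapped to Drift market symbols.
const DRIFT_SYMBOLS: Record<string, string> = {
  sol: 'SOL-PERP',
  eth: 'ETH-PERP',
  btc: 'BTC-PERP',
}

interface ManualCommand {
  direction: 'long' | 'short'
  symbol: string
  force: boolean // true when --force was given (skips analytics validation)
}

// Returns null for anything that is not "<long|short> <coin> [--force]".
function parseManualCommand(text: string): ManualCommand | null {
  const tokens = text.trim().toLowerCase().split(/\s+/)
  const force = tokens.indexOf('--force') !== -1
  const words = tokens.filter((token) => token !== '--force')
  if (words.length !== 2) return null
  const direction = words[0] === 'long' ? 'long' : words[0] === 'short' ? 'short' : null
  const coin = words[1]
  // hasOwnProperty avoids matching inherited keys such as "constructor".
  if (direction === null || !Object.prototype.hasOwnProperty.call(DRIFT_SYMBOLS, coin)) {
    return null
  }
  return { direction, symbol: DRIFT_SYMBOLS[coin], force }
}
```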
Status command:
/status → Returns JSON of open positions from Drift
Implementation details:
- Uses python-telegram-bot library
- Deployed via docker-compose.telegram-bot.yml
- Requires TELEGRAM_BOT_TOKEN and TELEGRAM_CHANNEL_ID in .env
- API calls to http://trading-bot:3000/api/trading/execute
Drift client integration:
- Singleton pattern: Use initializeDriftService() and getDriftService() - maintains single connection
const driftService = await initializeDriftService()
const health = await driftService.getAccountHealth()
- Wallet handling: Supports both JSON array [91,24,...] and base58 string formats from Phantom wallet
4. Rate Limit Monitoring (lib/drift/orders.ts + app/api/analytics/rate-limits)
Purpose: Track and analyze Solana RPC rate limiting (429 errors) to prevent silent failures
Helius RPC Limits (Free Tier):
- Burst: 100 requests/second
- Sustained: 10 requests/second
- Monthly: 100k requests
- See docs/HELIUS_RATE_LIMITS.md for upgrade recommendations
Retry mechanism with exponential backoff (Nov 14, 2025 - Updated):
await retryWithBackoff(async () => {
  return await driftClient.cancelOrders(...)
}, 3, 5000) // maxRetries = 3, baseDelay = 5000 ms (increased from 2s to 5s)
Progression: 5s → 10s → 20s (vs old 2s → 4s → 8s)
Rationale: Gives Helius time to recover, reduces cascade pressure by 2.5x
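The progression quoted above is a doubling schedule, which the following sketch reproduces. backoffDelays is a hypothetical helper, not part of lib/drift/orders.ts. Whether the third wait is ever reached depends on whether maxRetries counts attempts or retries, which this sketch does not settle.

```typescript
// Delay before retry number `attempt` (0-based) is baseDelayMs * 2^attempt.
function backoffDelays(baseDelayMs: number, waits: number): number[] {
  const delays: number[] = []
  for (let attempt = 0; attempt < waits; attempt++) {
    delays.push(baseDelayMs * Math.pow(2, attempt))
  }
  return delays
}

const newSchedule = backoffDelays(5000, 3) // 5s, 10s, 20s
const oldSchedule = backoffDelays(2000, 3) // 2s, 4s, 8s
// Every step of the new schedule is 5000 / 2000 = 2.5 times the old one.
const stretchFactor = newSchedule[0] / oldSchedule[0]
```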
Database logging: Three event types in SystemEvent table:
- rate_limit_hit: Each 429 error (logged with attempt #, delay, error snippet)
- rate_limit_recovered: Successful retry (logged with total time, retry count)
- rate_limit_exhausted: Failed after max retries (CRITICAL - order operation failed)
Analytics endpoint:
curl http://localhost:3001/api/analytics/rate-limits
Returns: Total hits/recoveries/failures, hourly patterns, recovery times, success rate
Key behaviors:
- Only RPC calls wrapped: cancelAllOrders(), placeExitOrders(), closePosition()
- Position Manager monitoring: Event-driven via Pyth WebSocket (not polling)
- Rate limit-aware exit: Position Manager keeps monitoring on 429 errors (retries naturally)
- Logs to both console and database for post-trade analysis
Monitoring queries: See docs/RATE_LIMIT_MONITORING.md for SQL queries
Startup Position Validation (Nov 14, 2025 - Added): On container startup, cross-checks last 24h of "closed" trades against actual Drift positions:
- If DB says closed but Drift shows open → reopens in DB to restore Position Manager tracking
- Prevents orphaned positions from failed close transactions
- Logs: 🔴 CRITICAL: ${symbol} marked as CLOSED in DB but still OPEN on Drift!
- Implementation: lib/startup/init-position-manager.ts - validateOpenTrades()
5. Order Placement (lib/drift/orders.ts)
Critical functions:
- openPosition() - Opens market position with transaction confirmation
- closePosition() - Closes position with transaction confirmation
- placeExitOrders() - Places TP/SL orders on-chain
- cancelAllOrders() - Cancels all reduce-only orders for a market
CRITICAL: Transaction Confirmation Pattern
Both openPosition() and closePosition() MUST confirm transactions on-chain:
const txSig = await driftClient.placePerpOrder(orderParams)
console.log('⏳ Confirming transaction on-chain...')
const connection = driftService.getConnection()
const confirmation = await connection.confirmTransaction(txSig, 'confirmed')
if (confirmation.value.err) {
  throw new Error(`Transaction failed: ${JSON.stringify(confirmation.value.err)}`)
}
console.log('✅ Transaction confirmed on-chain')
Without this, the SDK returns signatures for transactions that never execute, causing phantom trades/closes.
CRITICAL: Drift SDK position.size is BASE ASSET TOKENS, not USD
The Drift SDK returns position.size as token quantity (SOL/ETH/BTC), NOT USD notional:
// CORRECT: Convert tokens to USD by multiplying by current price
const positionSizeUSD = Math.abs(position.size) * currentPrice
// WRONG: Using position.size directly as USD (off by 150x+ for SOL!)
const positionSizeUSD = Math.abs(position.size)
This affects Position Manager's TP1/TP2 detection - if position.size is not converted to USD before comparing to tracked USD values, the system will never detect partial closes correctly. See Common Pitfall #22 for the full bug details and fix applied Nov 12, 2025.
Solana RPC Rate Limiting with Exponential Backoff
Solana RPC endpoints return 429 errors under load. Always use retry logic for order operations:
export async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxRetries: number = 3,
  initialDelay: number = 5000 // Increased from 2000ms to 5000ms (Nov 14, 2025)
): Promise<T> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await operation()
    } catch (error: any) {
      if (error?.message?.includes('429') && attempt < maxRetries - 1) {
        const delay = initialDelay * Math.pow(2, attempt)
        console.log(`⏳ Rate limited, retrying in ${delay/1000}s... (attempt ${attempt + 1}/${maxRetries})`)
        await new Promise(resolve => setTimeout(resolve, delay))
        continue
      }
      throw error
    }
  }
  throw new Error('Max retries exceeded')
}
// Usage in cancelAllOrders
await retryWithBackoff(() => driftClient.cancelOrders(...))
Note: Increased from 2s to 5s base delay to give Helius RPC more recovery time. See docs/HELIUS_RATE_LIMITS.md for detailed analysis.
Without this, order cancellations fail silently during TP1→breakeven order updates, leaving ghost orders that cause incorrect fills.
Dual Stop System (USE_DUAL_STOPS=true):
// Soft stop: TRIGGER_LIMIT at -1.5% (avoids wicks)
// Hard stop: TRIGGER_MARKET at -2.5% (guarantees exit)
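To show what those two percentages mean in price terms, here is a small sketch that converts them into trigger prices for either direction. dualStopPrices is a hypothetical helper; only the -1.5% and -2.5% levels come from the comments above.

```typescript
interface DualStops {
  softStopPrice: number // TRIGGER_LIMIT level
  hardStopPrice: number // TRIGGER_MARKET level
}

// Percentages are signed P&L moves from entry (-1.5 means a 1.5% loss).
function dualStopPrices(
  entryPrice: number,
  direction: 'long' | 'short',
  softPercent: number = -1.5,
  hardPercent: number = -2.5
): DualStops {
  // A loss is a price drop for longs and a price rise for shorts.
  const sign = direction === 'long' ? 1 : -1
  return {
    softStopPrice: entryPrice * (1 + (sign * softPercent) / 100),
    hardStopPrice: entryPrice * (1 + (sign * hardPercent) / 100),
  }
}
```

For a long at 100 this gives stops near 98.5 and 97.5; for a short, near 101.5 and 102.5, so the hard stop always sits further from entry than the soft stop.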
Order types:
- Entry: MARKET (immediate execution)
- TP1/TP2: LIMIT reduce-only orders
- Soft SL: TRIGGER_LIMIT reduce-only
- Hard SL: TRIGGER_MARKET reduce-only
6. Database (lib/database/trades.ts + prisma/schema.prisma)
Purpose: PostgreSQL via Prisma ORM for trade history and analytics
Models: Trade, PriceUpdate, SystemEvent, DailyStats, BlockedSignal
Singleton pattern: Use getPrismaClient() - never instantiate PrismaClient directly
Key functions:
- createTrade() - Save trade after execution (includes dual stop TX signatures + signalQualityScore)
- updateTradeExit() - Record exit with P&L
- addPriceUpdate() - Track price movements (called by Position Manager)
- getTradeStats() - Win rate, profit factor, avg win/loss
- getLastTrade() - Fetch most recent trade for analytics dashboard
- createBlockedSignal() - Save blocked signals for data-driven optimization analysis
- getRecentBlockedSignals() - Query recent blocked signals
- getBlockedSignalsForAnalysis() - Fetch signals needing price analysis (future automation)
Important fields:
- signalSource (String?) - Identifies trade origin: 'tradingview', 'manual', or NULL (old trades)
  - CRITICAL: Manual Telegram trades are marked signalSource='manual' and excluded from TradingView indicator analysis
  - Use filter: WHERE ("signalSource" IS NULL OR "signalSource" != 'manual') for indicator optimization queries
  - See docs/MANUAL_TRADE_FILTERING.md for complete SQL filtering guide
- signalQualityScore (Int?) - 0-100 score for data-driven optimization
- signalQualityVersion (String?) - Tracks which scoring logic was used ('v1', 'v2', 'v3', 'v4')
  - v1: Original logic (price position < 5% threshold)
  - v2: Added volume compensation for low ADX (2025-11-07)
  - v3: Stricter breakdown requirements: positions < 15% require (ADX > 18 AND volume > 1.2x) OR (RSI < 35 for shorts / RSI > 60 for longs)
  - v4: CURRENT - Blocked signals tracking enabled for data-driven threshold optimization (2025-11-11)
  - All new trades tagged with current version for comparative analysis
- maxFavorableExcursion / maxAdverseExcursion - Track best/worst P&L during trade lifetime
- maxFavorablePrice / maxAdversePrice - Track prices at MFE/MAE points
- configSnapshot (Json) - Stores Position Manager state for crash recovery
- atr, adx, rsi, volumeRatio, pricePosition - Context metrics from TradingView
BlockedSignal model fields (NEW):
- Signal metrics: atr, adx, rsi, volumeRatio, pricePosition, timeframe
- Quality scoring: signalQualityScore, signalQualityVersion, scoreBreakdown (JSON), minScoreRequired
- Block tracking: blockReason (QUALITY_SCORE_TOO_LOW, COOLDOWN_PERIOD, HOURLY_TRADE_LIMIT, etc.), blockDetails
- Future analysis: priceAfter1/5/15/30Min, wouldHitTP1/TP2/SL, analysisComplete
- Automatically saved by check-risk endpoint when signals are blocked
- Enables data-driven optimization: collect 10-20 blocked signals → analyze patterns → adjust thresholds
Per-symbol functions:
- getLastTradeTimeForSymbol(symbol) - Get last trade time for specific coin (enables per-symbol cooldown)
- Each coin (SOL/ETH/BTC) has independent cooldown timer to avoid missed opportunities
Configuration System
Three-layer merge:
- DEFAULT_TRADING_CONFIG (config/trading.ts)
- Environment variables (.env) via getConfigFromEnv()
- Runtime overrides via getMergedConfig(overrides)
Always use: getMergedConfig() to get final config - never read env vars directly in business logic
Per-symbol position sizing: Use getPositionSizeForSymbol(symbol, config) which returns { size, leverage, enabled }
const { size, leverage, enabled } = getPositionSizeForSymbol('SOL-PERP', config)
if (!enabled) {
return NextResponse.json({ success: false, error: 'Symbol trading disabled' }, { status: 400 })
}
Symbol normalization: TradingView sends "SOLUSDT" → must convert to "SOL-PERP" for Drift
const driftSymbol = normalizeTradingViewSymbol(body.symbol)
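One way such a normalization could work is sketched below. This is not the implementation of normalizeTradingViewSymbol(): the function name is deliberately different, and the handling of the USDC and USD suffixes is an assumption; only the SOLUSDT → SOL-PERP mapping is taken from the text.

```typescript
// Quote suffixes to strip, longest first so "USDT" is tried before "USD".
const QUOTE_SUFFIXES = ['USDT', 'USDC', 'USD']

function normalizeSymbolSketch(tradingViewSymbol: string): string {
  const upper = tradingViewSymbol.trim().toUpperCase()
  // Already in Drift format: leave untouched.
  if (upper.slice(-5) === '-PERP') return upper
  for (const quote of QUOTE_SUFFIXES) {
    const hasSuffix = upper.length > quote.length && upper.slice(-quote.length) === quote
    if (hasSuffix) return upper.slice(0, -quote.length) + '-PERP'
  }
  // No recognized quote currency: treat the whole string as the base asset.
  return upper + '-PERP'
}
```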
API Endpoints Architecture
Authentication: All /api/trading/* endpoints (except /test) require Authorization: Bearer API_SECRET_KEY
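The header check implied above could be as small as the following sketch. isAuthorized is a hypothetical helper rather than the project's middleware, and it uses a plain string comparison; a production check might prefer a constant-time comparison.

```typescript
// Accepts only "Bearer <key>" where <key> exactly matches the configured secret.
function isAuthorized(
  authorizationHeader: string | null | undefined,
  apiSecretKey: string
): boolean {
  // Reject missing headers and refuse to authorize against an empty secret.
  if (!authorizationHeader || apiSecretKey.length === 0) return false
  const prefix = 'Bearer '
  if (authorizationHeader.slice(0, prefix.length) !== prefix) return false
  return authorizationHeader.slice(prefix.length) === apiSecretKey
}
```

Rejecting an empty secret guards against a missing API_SECRET_KEY env var turning "Bearer " into a valid credential.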
Pattern: Each endpoint follows same flow:
- Auth check
- Get config via getMergedConfig()
- Initialize Drift service
- Check account health
- Execute operation
- Save to database
- Add to Position Manager if applicable
Key endpoints:
- /api/trading/execute - Main entry point from n8n (production, requires auth), auto-caches market data
- /api/trading/check-risk - Pre-execution validation (duplicate check, quality score, per-symbol cooldown, rate limits, symbol enabled check, saves blocked signals automatically)
- /api/trading/test - Test trades from settings UI (no auth required, respects symbol enable/disable)
- /api/trading/close - Manual position closing (requires symbol normalization)
- /api/trading/sync-positions - Force Position Manager sync with Drift (POST, requires auth) - restores tracking for orphaned positions
- /api/trading/cancel-orders - Manual order cleanup (for stuck/ghost orders after rate limit failures)
- /api/trading/positions - Query open positions from Drift
- /api/trading/market-data - Webhook for TradingView market data updates (GET for debug, POST for data)
- /api/settings - Get/update config (writes to .env file, includes per-symbol settings)
- /api/analytics/last-trade - Fetch most recent trade details for dashboard (includes quality score)
- /api/analytics/reentry-check - Validate manual re-entry with fresh TradingView data + recent performance
- /api/analytics/version-comparison - Compare performance across signal quality logic versions (v1/v2/v3/v4)
- /api/restart - Create restart flag for watch-restart.sh script
Critical Workflows
Execute Trade (Production)
TradingView alert → n8n Parse Signal Enhanced (extracts metrics + timeframe)
↓ /api/trading/check-risk [validates quality score ≥60, checks duplicates, per-symbol cooldown]
↓ /api/trading/execute
↓ normalize symbol (SOLUSDT → SOL-PERP)
↓ getMergedConfig()
↓ getPositionSizeForSymbol() [check if symbol enabled + get sizing]
↓ openPosition() [MARKET order]
↓ calculate dual stop prices if enabled
↓ placeExitOrders() [on-chain TP1/TP2/SL orders]
↓ scoreSignalQuality({ ..., timeframe }) [compute 0-100 score with timeframe-aware thresholds]
↓ createTrade() [CRITICAL: save to database FIRST - see Common Pitfall #27]
↓ positionManager.addTrade() [ONLY after DB save succeeds - prevents unprotected positions]
CRITICAL EXECUTION ORDER (Nov 13, 2025 Fix): The order of database save → Position Manager add is NOT arbitrary - it's a safety requirement:
- If database save fails, API returns HTTP 500 with critical warning
- User sees: "CLOSE POSITION MANUALLY IMMEDIATELY" with transaction signature
- Position Manager only tracks database-persisted trades
- Container restarts can restore all positions from database
- Never add to Position Manager before database save - creates unprotected positions
Position Monitoring Loop
Position Manager every 2s:
↓ Verify on-chain position still exists (detect external closures)
↓ getPythPriceMonitor().getLatestPrice()
↓ Calculate current P&L and update MAE/MFE metrics
↓ Check emergency stop (-2%) → closePosition(100%)
↓ Check SL hit → closePosition(100%)
↓ Check TP1 hit → closePosition(75%), cancelAllOrders(), placeExitOrders() with SL at breakeven
↓ Check profit lock trigger (+1.2%) → move SL to +configured%
↓ Check TP2 hit → closePosition(80% of remaining), activate runner
↓ Check trailing stop (if runner active) → adjust SL dynamically based on peakPrice
↓ addPriceUpdate() [save to database every N checks]
↓ saveTradeState() [persist Position Manager state + MAE/MFE for crash recovery]
Settings Update
Web UI → /api/settings POST
↓ Validate new settings
↓ Write to .env file using string replacement
↓ Return success
↓ User clicks "Restart Bot" → /api/restart
↓ Creates /tmp/trading-bot-restart.flag
↓ watch-restart.sh detects flag
↓ Executes: docker restart trading-bot-v4
Docker Context
Multi-stage build: deps → builder → runner (Node 20 Alpine)
Critical Dockerfile steps:
- Install deps with `npm install --production`
- Copy source and `npx prisma generate` (MUST happen before build)
- `npm run build` (Next.js standalone output)
- Runner stage copies standalone + static + node_modules + Prisma client
Container networking:
- External: `trading-bot-v4` on port 3001
- Internal: Next.js on port 3000
- Database: `trading-bot-postgres` on 172.28.0.0/16 network
DATABASE_URL caveat: Use trading-bot-postgres (container name) in .env for runtime, but localhost:5432 for Prisma CLI migrations from host
Project-Specific Patterns
1. Singleton Services
Never create multiple instances - always use getter functions:
const driftService = await initializeDriftService() // NOT: new DriftService()
const positionManager = getPositionManager() // NOT: new PositionManager()
const prisma = getPrismaClient() // NOT: new PrismaClient()
2. Price Calculations
Direction matters for long vs short:
function calculatePrice(entry: number, percent: number, direction: 'long' | 'short') {
if (direction === 'long') {
return entry * (1 + percent / 100) // Long: +1% = higher price
} else {
return entry * (1 - percent / 100) // Short: +1% = lower price
}
}
3. Error Handling
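Non-critical database writes should not fail trades - always wrap them in try/catch. Exception: the `createTrade()` call in `/api/trading/execute` must NOT be swallowed. Since the Nov 13, 2025 fix it returns HTTP 500 on failure and the trade is never added to Position Manager (see CRITICAL EXECUTION ORDER above). The generic pattern: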
try {
await createTrade(params)
console.log('💾 Trade saved to database')
} catch (dbError) {
console.error('❌ Failed to save trade:', dbError)
// Don't fail the trade if database save fails
}
4. Reduce-Only Orders
All exit orders MUST be reduce-only (can only close, not open positions):
const orderParams = {
reduceOnly: true, // CRITICAL for TP/SL orders
// ... other params
}
5. Nextcloud Deck Roadmap Sync
Purpose: Visual kanban board for tracking optimization roadmap progress
Key Components:
- `scripts/discover-deck-ids.sh` - Find Nextcloud Deck board/stack IDs
- `scripts/sync-roadmap-to-deck.py` - Sync roadmap files to Deck cards
- `docs/NEXTCLOUD_DECK_SYNC.md` - Complete documentation
Workflow:
# One-time setup (already done)
bash scripts/discover-deck-ids.sh # Creates /tmp/deck-config.json
# Always dry-run first to preview changes
python3 scripts/sync-roadmap-to-deck.py --init --dry-run
# Then sync roadmap to Deck (creates/updates cards)
python3 scripts/sync-roadmap-to-deck.py --init
Stack Mapping:
- 📥 Backlog: Future phases, ideas, ML work (status: FUTURE)
- 📋 Planning: Next phases, ready to implement (status: PENDING, NEXT)
- 🚀 In Progress: Currently active work (status: CURRENT, IN PROGRESS, DEPLOYED)
- ✅ Complete: Finished phases (status: COMPLETE)
Card Structure:
- 3 high-level initiative cards (from `OPTIMIZATION_MASTER_ROADMAP.md`)
- 18 detailed phase cards (from individual roadmap files)
- Total: 21 cards tracking all optimization work
When to Sync:
- After completing a phase (update markdown status → re-sync)
- When starting new phase (move card in Deck UI)
- Weekly during active development to keep visual state current
Important Notes:
- API doesn't support duplicate detection - always use `--dry-run` first
- Manual card deletion required (API returns 405 on DELETE)
- Code blocks auto-removed from descriptions (prevent API errors)
- Card titles cleaned (no markdown, emojis removed for readability)
Testing Commands
# Local development
npm run dev
# Build production
npm run build && npm start
# Docker build and restart
docker compose build trading-bot
docker compose up -d --force-recreate trading-bot
docker logs -f trading-bot-v4
# Database operations
npx prisma generate # Generate client
DATABASE_URL="postgresql://...@localhost:5432/..." npx prisma migrate dev
docker exec trading-bot-postgres psql -U postgres -d trading_bot_v4 -c "\dt"
# Test trade from UI
# Go to http://localhost:3001/settings
# Click "Test LONG" or "Test SHORT"
SQL Analysis Queries
Essential queries for monitoring signal quality and blocked signals. Run via:
docker exec trading-bot-postgres psql -U postgres -d trading_bot_v4 -c "YOUR_QUERY"
Phase 1: Monitor Data Collection Progress
-- Check blocked signals count (target: 10-20 for Phase 2)
SELECT COUNT(*) as total_blocked FROM "BlockedSignal";
-- Score distribution of blocked signals
SELECT
CASE
WHEN signalQualityScore >= 60 THEN '60-64 (Close Call)'
WHEN signalQualityScore >= 55 THEN '55-59 (Marginal)'
WHEN signalQualityScore >= 50 THEN '50-54 (Weak)'
ELSE '0-49 (Very Weak)'
END as tier,
COUNT(*) as count,
ROUND(AVG(signalQualityScore)::numeric, 1) as avg_score
FROM "BlockedSignal"
WHERE blockReason = 'QUALITY_SCORE_TOO_LOW'
GROUP BY tier
ORDER BY MIN(signalQualityScore) DESC;
-- Recent blocked signals with full details
SELECT
symbol,
direction,
signalQualityScore as score,
ROUND(adx::numeric, 1) as adx,
ROUND(atr::numeric, 2) as atr,
ROUND(pricePosition::numeric, 1) as pos,
ROUND(volumeRatio::numeric, 2) as vol,
blockReason,
TO_CHAR(createdAt, 'MM-DD HH24:MI') as time
FROM "BlockedSignal"
ORDER BY createdAt DESC
LIMIT 10;
Phase 2: Compare Blocked vs Executed Trades
-- Compare executed trades in 60-69 score range
SELECT
signalQualityScore as score,
COUNT(*) as trades,
ROUND(AVG(realizedPnL)::numeric, 2) as avg_pnl,
ROUND(SUM(realizedPnL)::numeric, 2) as total_pnl,
ROUND(100.0 * SUM(CASE WHEN realizedPnL > 0 THEN 1 ELSE 0 END) / COUNT(*)::numeric, 1) as win_rate
FROM "Trade"
WHERE exitReason IS NOT NULL
AND signalQualityScore BETWEEN 60 AND 69
GROUP BY signalQualityScore
ORDER BY signalQualityScore;
-- Block reason breakdown
SELECT
blockReason,
COUNT(*) as count,
ROUND(AVG(signalQualityScore)::numeric, 1) as avg_score
FROM "BlockedSignal"
GROUP BY blockReason
ORDER BY count DESC;
Analyze Specific Patterns
-- Blocked signals at range extremes (price position)
SELECT
direction,
signalQualityScore as score,
ROUND(pricePosition::numeric, 1) as pos,
ROUND(adx::numeric, 1) as adx,
ROUND(volumeRatio::numeric, 2) as vol,
symbol,
TO_CHAR(createdAt, 'MM-DD HH24:MI') as time
FROM "BlockedSignal"
WHERE blockReason = 'QUALITY_SCORE_TOO_LOW'
AND (pricePosition < 10 OR pricePosition > 90)
ORDER BY signalQualityScore DESC;
-- ADX distribution in blocked signals
SELECT
CASE
WHEN adx >= 25 THEN 'Strong (25+)'
WHEN adx >= 20 THEN 'Moderate (20-25)'
WHEN adx >= 15 THEN 'Weak (15-20)'
ELSE 'Very Weak (<15)'
END as adx_tier,
COUNT(*) as count,
ROUND(AVG(signalQualityScore)::numeric, 1) as avg_score
FROM "BlockedSignal"
WHERE blockReason = 'QUALITY_SCORE_TOO_LOW'
AND adx IS NOT NULL
GROUP BY adx_tier
ORDER BY MIN(adx) DESC;
Usage Pattern:
- Run "Monitor Data Collection" queries weekly during Phase 1
- Once 10+ blocked signals collected, run "Compare Blocked vs Executed" queries
- Use "Analyze Specific Patterns" to identify optimization opportunities
- Full query reference: `BLOCKED_SIGNALS_TRACKING.md`
Common Pitfalls
- DRIFT SDK MEMORY LEAK (CRITICAL - Fixed Nov 15, 2025):
  - Symptom: JavaScript heap out of memory after 10+ hours runtime, Telegram bot timeouts (60s)
  - Root Cause: Drift SDK accumulates WebSocket subscriptions over time without cleanup
  - Manifestation: Thousands of `accountUnsubscribe error: readyState was 2 (CLOSING)` in logs
  - Heap Growth: Normal ~200MB → 4GB+ after 10 hours → OOM crash
  - Solution: Automatic reconnection every 4 hours (`lib/drift/client.ts`)
  - Implementation:
    - `scheduleReconnection()` - Sets 4-hour timer after initialization
    - `reconnect()` - Unsubscribes, resets state, reinitializes Drift client
    - Timer cleared in `disconnect()` to prevent orphaned timers
  - Manual Control: `/api/drift/reconnect` endpoint (POST with auth, GET for status)
  - Impact: System now self-healing, can run indefinitely without manual restarts
  - Monitoring: Watch for scheduled reconnection logs: `🔄 Scheduled reconnection...`
- WRONG RPC PROVIDER (CRITICAL - CATASTROPHIC SYSTEM FAILURE):
  - FINAL CONCLUSION Nov 14, 2025 (INVESTIGATION COMPLETE): Helius is the ONLY reliable RPC provider for Drift SDK
  - Root Cause CONFIRMED: Alchemy's rate limiting breaks Drift SDK's burst subscription pattern during initialization
  - Definitive Proof (Nov 14, 21:14 CET):
    - Created diagnostic endpoint `/api/testing/drift-init`
    - Alchemy: 17-71 subscription errors EVERY init (49 avg over 5 runs), 1644ms avg init time
    - Helius: 0 subscription errors EVERY init, 800ms avg init time
    - See `docs/ALCHEMY_RPC_INVESTIGATION_RESULTS.md` for full test data
  - Why Alchemy Fails:
    - Drift SDK subscribes to 30-50+ accounts simultaneously during init (burst pattern)
    - Alchemy's CUPS enforcement rate limits these burst requests
    - Drift SDK does NOT retry failed subscriptions
    - SDK reports "initialized successfully" but with incomplete subscription set
    - Subsequent operations fail/timeout due to missing account data
    - Error message: "Received JSON-RPC error calling `accountSubscribe`"
  - Why "Breakthrough" at 14:25 Wasn't Real:
    - First Alchemy test had 17-71 subscription errors (random variation)
    - Sometimes gets lucky with "just enough" subscriptions for one operation
    - SDK in degraded state from the start, just not obvious until second operation
    - This explains why first trade "worked" but subsequent trades failed
  - Why Helius Works:
    - Higher burst tolerance for Solana dApp subscription patterns
    - Zero subscription errors during init
    - Faster initialization (800ms vs 1600ms)
    - Stable for continuous operations
  - Technical Reality vs Documentation:
    - Alchemy DOES support WebSocket subscriptions (research confirmed)
    - Alchemy DOES support accountSubscribe method (not -32601 error)
    - BUT: Rate limit enforcement model incompatible with Drift's burst pattern
    - Documentation doesn't mention burst subscription limits
  - Production Status:
    - Using: Helius RPC (https://mainnet.helius-rpc.com/?api-key=...)
    - Retry logic: 5s exponential backoff for rate limits
    - System: Stable, TP1/TP2/SL working, Position Manager tracking correctly
  - Investigation Closed: This is DEFINITIVE. Use Helius. Do not use Alchemy.
  - Test Yourself: `curl 'http://localhost:3001/api/testing/drift-init?rpc=alchemy'`
- Prisma not generated in Docker: Must run `npx prisma generate` in Dockerfile BEFORE `npm run build`
- Wrong DATABASE_URL: Container runtime needs `trading-bot-postgres`, Prisma CLI from host needs `localhost:5432`
- Symbol format mismatch: Always normalize with `normalizeTradingViewSymbol()` before calling Drift (applies to ALL endpoints including `/api/trading/close`)
- Missing reduce-only flag: Exit orders without `reduceOnly: true` can accidentally open new positions
- Singleton violations: Creating multiple DriftClient or Position Manager instances causes connection/state issues
- Type errors with Prisma: The Trade type from Prisma is only available AFTER `npx prisma generate` - use explicit types or `// @ts-ignore` carefully
- Quality score duplication: Signal quality calculation exists in BOTH `check-risk` and `execute` endpoints - keep logic synchronized
- TP2-as-Runner configuration:
  - `takeProfit2SizePercent: 0` means "TP2 activates trailing stop, no position close"
  - This creates runner of remaining % after TP1 (default 25%, configurable via TAKE_PROFIT_1_SIZE_PERCENT)
  - `TAKE_PROFIT_2_PERCENT=0.7` sets TP2 trigger price, `TAKE_PROFIT_2_SIZE_PERCENT` should be 0
  - Settings UI correctly shows "TP2 activates trailing stop" with dynamic runner % calculation
- P&L calculation CRITICAL: Use actual entry vs exit price calculation, not SDK values:
const profitPercent = this.calculateProfitPercent(trade.entryPrice, exitPrice, trade.direction)
const actualRealizedPnL = (closedSizeUSD * profitPercent) / 100
trade.realizedPnL += actualRealizedPnL // NOT: result.realizedPnL from SDK
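The entry-vs-exit calculation can be sketched as below. The body of `profitPercentSketch` is an assumption about what `calculateProfitPercent()` does (a percent move signed in the trade's favor); only the call shape and the `closedSizeUSD * profitPercent / 100` step come from the snippet above.

```typescript
type Direction = "long" | "short";

// Assumed behavior: percent move in the trade's favor (positive = profit).
function profitPercentSketch(
  entryPrice: number,
  exitPrice: number,
  direction: Direction
): number {
  const rawPercent = ((exitPrice - entryPrice) / entryPrice) * 100;
  // A falling price is a gain for shorts, so flip the sign.
  return direction === "long" ? rawPercent : -rawPercent;
}

// Realized P&L in USD for the portion of the position that was closed.
function realizedPnlUSD(
  closedSizeUSD: number,
  entryPrice: number,
  exitPrice: number,
  direction: Direction
): number {
  return (closedSizeUSD * profitPercentSketch(entryPrice, exitPrice, direction)) / 100;
}
```

Because both inputs are prices the bot observed itself, the result does not depend on what the SDK reports for the fill.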
- Transaction confirmation CRITICAL: Both `openPosition()` AND `closePosition()` MUST call `connection.confirmTransaction()` after `placePerpOrder()`. Without this, the SDK returns transaction signatures that aren't confirmed on-chain, causing "phantom trades" or "phantom closes". Always check `confirmation.value.err` before proceeding.
- Execution order matters: When creating trades via API endpoints, the order MUST be:
  - Open position + place exit orders
  - Save to database (`createTrade()`)
  - Add to Position Manager (`positionManager.addTrade()`)

  If the trade is added to Position Manager before the database save, race conditions occur where monitoring checks run before the trade exists in the DB.
- New trade grace period: Position Manager skips "external closure" detection for trades <30 seconds old because Drift positions take 5-10 seconds to propagate after opening. Without this grace period, new positions are immediately detected as "closed externally" and cancelled.
- Drift minimum position sizes: Actual minimums differ from documentation:
  - SOL-PERP: 0.1 SOL (~$5-15 depending on price)
  - ETH-PERP: 0.01 ETH (~$38-40 at $4000/ETH)
  - BTC-PERP: 0.0001 BTC (~$10-12 at $100k/BTC)

  Always calculate: `minOrderSize × currentPrice` must exceed Drift's $4 minimum. Add buffer for price movement.
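A minimal sketch of that check, assuming the minimum sizes listed above and an illustrative 10% buffer. The function name, the lookup table and the buffer value are hypothetical, not part of the codebase.

```typescript
// Minimum order sizes in base-asset units, copied from the list above.
const MIN_ORDER_SIZE: Record<string, number> = {
  "SOL-PERP": 0.1,
  "ETH-PERP": 0.01,
  "BTC-PERP": 0.0001,
};
const DRIFT_MIN_NOTIONAL_USD = 4; // Drift's $4 minimum, per the note above

// bufferPercent is an assumed safety margin for price movement.
function meetsDriftMinimum(
  symbol: string,
  sizeUSD: number,
  currentPrice: number,
  bufferPercent = 10
): boolean {
  const minSize = MIN_ORDER_SIZE[symbol];
  if (minSize === undefined || currentPrice <= 0) return false;
  // The order must clear both the per-market minimum and the global floor.
  const minNotional = Math.max(minSize * currentPrice, DRIFT_MIN_NOTIONAL_USD);
  return sizeUSD >= minNotional * (1 + bufferPercent / 100);
}
```

For example, with SOL at $140 the per-market minimum is about $14, so a $14 order fails the buffered check while a $20 order passes.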
- Exit reason detection bug: Position Manager was using current price to determine exit reason, but on-chain orders filled at a DIFFERENT price in the past. Now uses `trade.tp1Hit` / `trade.tp2Hit` flags and realized P&L to correctly identify whether TP1, TP2, or SL triggered. Prevents profitable trades being mislabeled as "SL" exits.
- Per-symbol cooldown: Cooldown period is per-symbol, NOT global. ETH trade at 10:00 does NOT block SOL trade at 10:01. Each coin (SOL/ETH/BTC) has independent cooldown timer to avoid missing opportunities on different assets.
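The per-symbol behavior can be illustrated with an in-memory sketch. The class below is hypothetical: the real check lives in `/api/trading/check-risk` and may read the last trade time from the database instead.

```typescript
// Hypothetical in-memory cooldown tracker, keyed by symbol.
class SymbolCooldown {
  private lastTradeAt = new Map<string, number>();
  private cooldownMs: number;

  constructor(cooldownMs: number) {
    this.cooldownMs = cooldownMs;
  }

  recordTrade(symbol: string, nowMs: number): void {
    this.lastTradeAt.set(symbol, nowMs);
  }

  // Only the symbol's own last trade matters; other symbols are ignored.
  isCoolingDown(symbol: string, nowMs: number): boolean {
    const last = this.lastTradeAt.get(symbol);
    return last !== undefined && nowMs - last < this.cooldownMs;
  }
}
```

Keying the map by symbol is what makes an ETH trade at 10:00 irrelevant to a SOL signal at 10:01.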
- Timeframe-aware scoring crucial: Signal quality thresholds MUST adjust for 5min vs higher timeframes:
  - 5min charts naturally have lower ADX (12-22 healthy) and ATR (0.2-0.7% healthy) than daily charts
  - Without timeframe awareness, valid 5min breakouts get blocked as "low quality"
  - Anti-chop filter applies -20 points for extreme sideways regardless of timeframe
  - Always pass `timeframe` parameter from TradingView alerts to `scoreSignalQuality()`
- Price position chasing causes flip-flops: Opening longs at 90%+ range or shorts at <10% range reliably loses money:
  - Database analysis showed overnight flip-flop losses all had price position 9-94% (chasing extremes)
  - These trades had valid ADX (16-18) but entered at worst possible time
  - Quality scoring now penalizes -15 to -30 points for range extremes
  - Prevents rapid reversals when price is already overextended
- TradingView ADX minimum for 5min: Set ADX filter to 15 (not 20+) in TradingView alerts for 5min charts:
  - Higher timeframes can use ADX 20+ for strong trends
  - 5min charts need lower threshold to catch valid breakouts
  - Bot's quality scoring provides second-layer filtering with context-aware metrics
  - Two-stage filtering (TradingView + bot) prevents both overtrading and missing valid signals
- Prisma Decimal type handling: Raw SQL queries return Prisma `Decimal` objects, not plain numbers:
  - Use `any` type for numeric fields in `$queryRaw` results: `total_pnl: any`
  - Convert with `Number()` before returning to frontend: `totalPnL: Number(stat.total_pnl) || 0`
  - Frontend calls `.toFixed()`, which fails on unconverted values because a Decimal is serialized to a string in the JSON response
  - Applies to all aggregations: SUM(), AVG(), ROUND() - all return Decimal types
  - Example: `/api/analytics/version-comparison` converts all numeric fields
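A small sketch of that conversion step. The row shape and the helper names are hypothetical; a plain object with a `toString()` method stands in for a Decimal so the example runs without Prisma installed.

```typescript
// Stand-in for a row returned by $queryRaw (field names are illustrative).
interface RawVersionStat {
  version: string;
  total_pnl: unknown;
  avg_pnl: unknown;
}

interface VersionStat {
  version: string;
  totalPnL: number;
  avgPnL: number;
}

// Number() goes through the object's toString(), so Decimal-like values convert.
function toPlainNumber(value: unknown): number {
  const n = Number(value);
  return Number.isFinite(n) ? n : 0; // same effect as `Number(x) || 0`
}

function serializeStat(row: RawVersionStat): VersionStat {
  return {
    version: row.version,
    totalPnL: toPlainNumber(row.total_pnl),
    avgPnL: toPlainNumber(row.avg_pnl),
  };
}
```

Converting at the API boundary keeps the frontend free of any Decimal awareness: every numeric field it receives is a plain JavaScript number.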
- ATR-based trailing stop implementation (Nov 11, 2025): Runner system was using FIXED 0.3% trailing, causing immediate stops:
  - Problem: At $168 SOL, 0.3% = $0.50 wiggle room. Trades with +7-9% MFE exited for losses.
  - Fix: `trailingDistancePercent = (atrAtEntry / currentPrice * 100) × trailingStopAtrMultiplier`
  - Config: `TRAILING_STOP_ATR_MULTIPLIER=1.5`, `MIN=0.25%`, `MAX=0.9%`, `ACTIVATION=0.5%`
  - Typical improvement: 0.45% ATR × 1.5 = 0.675% trail ($1.13 vs $0.50 = 2.26x more room)
  - Fallback: If `atrAtEntry` unavailable, uses clamped legacy `trailingStopPercent`
  - Log verification: Look for "📊 ATR-based trailing: 0.0045 (0.52%) × 1.5x = 0.78%" messages
  - ActiveTrade interface: Must include `atrAtEntry?: number` field for calculation
  - See `ATR_TRAILING_STOP_FIX.md` for full details and database analysis
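The formula, clamp and fallback described above can be sketched as one function. The interface and function names are hypothetical; the 1.5 multiplier and the 0.25%/0.9% bounds are the config values quoted in this pitfall, and the 0.3% legacy value is the old fixed trail.

```typescript
interface TrailingConfig {
  atrMultiplier: number; // TRAILING_STOP_ATR_MULTIPLIER
  minPercent: number;    // lower clamp, in percent
  maxPercent: number;    // upper clamp, in percent
  legacyPercent: number; // fixed trail used when no ATR is available
}

function trailingDistancePercent(
  atrAtEntry: number | undefined,
  currentPrice: number,
  cfg: TrailingConfig
): number {
  const clamp = (v: number): number =>
    Math.min(cfg.maxPercent, Math.max(cfg.minPercent, v));
  // Fallback: no usable ATR, so use the clamped legacy percentage.
  if (atrAtEntry === undefined || atrAtEntry <= 0 || currentPrice <= 0) {
    return clamp(cfg.legacyPercent);
  }
  const atrPercent = (atrAtEntry / currentPrice) * 100;
  return clamp(atrPercent * cfg.atrMultiplier);
}
```

With an ATR of $0.756 at a $168 price, the ATR is 0.45% of price and the trail comes out at about 0.675%, matching the "typical improvement" figure above.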
- CreateTradeParams interface sync: When adding new database fields to Trade model, MUST update `CreateTradeParams` interface in `lib/database/trades.ts`:
  - Interface defines what parameters `createTrade()` accepts
  - Must add new field to interface (e.g., `indicatorVersion?: string`)
  - Must add field to Prisma create data object in `createTrade()` function
  - TypeScript build will fail if endpoint passes field not in interface
  - Example: indicatorVersion tracking required 3-file update (execute route.ts, CreateTradeParams interface, createTrade function)
- Position.size tokens vs USD bug (CRITICAL - Fixed Nov 12, 2025):
  - Symptom: Position Manager detects false TP1 hits, moves SL to breakeven prematurely
  - Root Cause: `lib/drift/client.ts` returns `position.size` as BASE ASSET TOKENS (12.28 SOL), not USD ($1,950)
  - Bug: Comparing tokens (12.28) directly to USD ($1,950) → 12.28 < 1,950 × 0.95 = "99.4% reduction" → FALSE TP1!
  - Fix: Always convert to USD before comparisons:

        // In Position Manager (lines 322, 519, 558, 591)
        const positionSizeUSD = Math.abs(position.size) * currentPrice
        // Now compare USD to USD
        if (positionSizeUSD < trade.currentSize * 0.95) {
          // Actual 5%+ reduction detected
        }

  - Impact: Without this fix, TP1 never triggers correctly, SL moves at wrong times, runner system fails
  - Where it matters: Position Manager, any code querying Drift positions
  - Database evidence: Trade showed `tp1Hit: true` when 100% still open, `slMovedToBreakeven: true` prematurely
- Leverage display showing global config instead of symbol-specific (Fixed Nov 12, 2025):
  - Symptom: Telegram notifications showing "⚡ Leverage: 10x" when actual position uses 15x or 20x
  - Root Cause: API response returning `config.leverage` (global default) instead of symbol-specific value
  - Fix: Use actual leverage from `getPositionSizeForSymbol()`:

        // app/api/trading/execute/route.ts (lines 345, 448, 522, 557)
        const { size, leverage, enabled } = getPositionSizeForSymbol(driftSymbol, config)
        // Return symbol-specific leverage
        leverage: leverage, // NOT: config.leverage

  - Impact: Misleading notifications, user confusion about actual position risk
  - Hierarchy: Per-symbol ENV (SOLANA_LEVERAGE) → Market config → Global ENV (LEVERAGE) → Defaults
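The lookup hierarchy can be sketched as a chain of fallbacks. The function below is a hypothetical illustration, not the body of `getPositionSizeForSymbol()`. Only the `SOLANA_LEVERAGE` and `LEVERAGE` variable names come from this document, and the environment is passed in as a plain object so the sketch is self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Parse a positive number from an env value, or undefined if unset/invalid.
function parseLeverage(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? n : undefined;
}

// Per-symbol ENV → market config → global ENV → default.
function resolveLeverage(
  perSymbolEnvKey: string,
  env: Env,
  marketConfigLeverage: number | undefined,
  defaultLeverage: number
): number {
  return (
    parseLeverage(env[perSymbolEnvKey]) ??
    marketConfigLeverage ??
    parseLeverage(env["LEVERAGE"]) ??
    defaultLeverage
  );
}
```

Whatever value this chain resolves to is the one that should appear in notifications, which is the point of the fix above.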
- Indicator version tracking (Nov 12, 2025+):
  - Database field `indicatorVersion` tracks which TradingView strategy generated the signal
  - v5: Buy/Sell Signal strategy (pre-Nov 12)
  - v6: HalfTrend + BarColor strategy (Nov 12+)
  - Used for performance comparison between strategies
  - Must update `CreateTradeParams` interface when adding new database fields (see pitfall #21)
  - Analytics endpoint `/api/analytics/version-comparison` compares v5 vs v6 performance
- External closure duplicate updates bug (CRITICAL - Fixed Nov 12, 2025):
  - Symptom: Trades showing 7-8x larger losses than actual ($58 loss when Drift shows $7 loss)
  - Root Cause: Position Manager monitoring loop re-processes external closures multiple times before trade removed from activeTrades Map
  - Bug sequence:
    - Trade closed externally (on-chain SL order fills at -$7.98)
    - Position Manager detects closure: `position === null`
    - Calculates P&L and calls `updateTradeExit()` → -$7.50 in DB
    - BUT: Trade still in `activeTrades` Map (removal happens after DB update)
    - Next monitoring loop (2s later) detects closure AGAIN
    - Accumulates P&L: `previouslyRealized (-$7.50) + runnerRealized (-$7.50) = -$15.00`
    - Updates database AGAIN → -$15.00 in DB
    - Repeats 8 times → final -$58.43 (roughly 8× the actual loss)
  - Fix: Remove trade from `activeTrades` Map BEFORE database update:

        // BEFORE (BROKEN):
        await updateTradeExit({ ... })
        await this.removeTrade(trade.id) // Too late! Loop already ran again

        // AFTER (FIXED):
        this.activeTrades.delete(trade.id) // Remove FIRST
        await updateTradeExit({ ... }) // Then update DB
        if (this.activeTrades.size === 0) {
          this.stopMonitoring()
        }

  - Impact: Without this fix, every external closure is recorded 5-8 times with compounding P&L
  - Timing detail: `removeTrade()` is async, so the 2s monitoring loop fires again while the database update is still awaiting
  - Evidence: Logs showed 8 consecutive "External closure recorded" messages with increasing P&L
  - Location: `lib/trading/position-manager.ts` line 493 (external closure detection block)
- Signal quality threshold adjustment (Nov 12, 2025):
  - Lowered from 65 → 60 based on data analysis of 161 trades
  - Reason: Score 60-64 tier outperformed higher scores:
    - 60-64: 2 trades, +$45.78 total, 100% WR, +$22.89 avg
    - 65-69: 13 trades, +$28.28 total, 53.8% WR, +$2.18 avg
    - 70-79: 67 trades, +$8.28 total, 44.8% WR (worst performance!)
  - Paradox: Higher quality scores don't correlate with better performance in current data
  - Expected impact: 2-3 additional trades/week, +$46-69 weekly profit potential
  - Data collection: Enables blocked signals at 55-59 range for Phase 2 optimization
  - Risk: Small sample size (2 trades) could be outliers, but downside limited
  - SQL analysis showed clear pattern: stricter filtering was blocking profitable setups
- Database-First Pattern (CRITICAL - Fixed Nov 13, 2025):
  - Symptom: Positions opened on Drift with NO database record, NO Position Manager tracking, NO TP/SL protection
  - Root Cause: Execute endpoint saved to database AFTER adding to Position Manager, with silent error catch
  - Bug sequence:
    - TradingView signal → `/api/trading/execute`
    - Position opened on Drift ✅
    - Position Manager tracking added ✅
    - Database save attempted ❌ (fails silently)
    - API returns success to user ❌
    - Container restarts → Position Manager loses in-memory state ❌
    - Result: Unprotected position with no monitoring or TP/SL orders
  - Fix: Database-first execution order in `app/api/trading/execute/route.ts`:

        // CRITICAL: Save to database FIRST before adding to Position Manager
        try {
          await createTrade({...})
        } catch (dbError) {
          console.error('❌ CRITICAL: Failed to save trade to database:', dbError)
          return NextResponse.json({
            success: false,
            error: 'Database save failed - position unprotected',
            message: `Position opened on Drift but database save failed. CLOSE POSITION MANUALLY IMMEDIATELY. Transaction: ${openResult.transactionSignature}`,
          }, { status: 500 })
        }
        // ONLY add to Position Manager if database save succeeded
        await positionManager.addTrade(activeTrade)

  - Impact: Without this fix, ANY database failure creates unprotected positions
  - Verification: Test trade cmhxj8qxl0000od076m21l58z (Nov 13) confirmed fix working
  - Documentation: See `CRITICAL_INCIDENT_UNPROTECTED_POSITION.md` for full incident report
  - Rule: Database persistence ALWAYS comes before in-memory state updates
- DNS retry logic (Nov 13, 2025):
  - Problem: Trading bot fails with "fetch failed" errors when DNS resolution temporarily fails for `mainnet.helius-rpc.com`
  - Impact: n8n workflow failures, missed trades, container restart failures
  - Root Cause: `EAI_AGAIN` errors are transient DNS issues that resolve in seconds, but bot treated them as permanent failures
  - Fix: Automatic retry in `lib/drift/client.ts` - `retryOperation()` wrapper:

        // Detects transient errors: fetch failed, EAI_AGAIN, ENOTFOUND, ETIMEDOUT
        // Retries up to 3 times with 2s delay between attempts (DNS-specific, separate from rate limit retries)
        // Fails fast on non-transient errors (auth, config, permanent network issues)
        await this.retryOperation(async () => {
          // Initialize Drift SDK, subscribe, get user account
        }, 3, 2000, 'Drift initialization')

  - Success logs: `⚠️ Drift initialization failed (attempt 1/3): fetch failed` → `⏳ Retrying in 2000ms...` → `✅ Drift service initialized successfully`
  - Impact: 99% of transient DNS failures now auto-recover, preventing missed trades
  - Note: DNS retries use 2s delays (fast recovery), rate limit retries use 5s delays (RPC cooldown)
  - Documentation: See `docs/DNS_RETRY_LOGIC.md` for monitoring queries and metrics
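A wrapper with this behavior could be sketched as follows. This is not the code in `lib/drift/client.ts`: the names `isTransientError` and `retryOperationSketch` are hypothetical, the error markers are the four listed above, and `maxAttempts` is assumed to count total attempts (matching the "attempt 1/3" log format).

```typescript
const TRANSIENT_MARKERS = ["fetch failed", "EAI_AGAIN", "ENOTFOUND", "ETIMEDOUT"];

// Classify an error as transient by looking for a known marker in its message.
function isTransientError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return TRANSIENT_MARKERS.some((marker) => message.includes(marker));
}

async function retryOperationSketch<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  delayMs: number,
  label: string
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Fail fast on permanent errors or once the attempts are used up.
      if (!isTransientError(error) || attempt >= maxAttempts) throw error;
      console.warn(`${label} failed (attempt ${attempt}/${maxAttempts}), retrying in ${delayMs}ms`);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Keeping the classifier separate from the loop makes it easy to unit test which errors are retried without waiting on real delays.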
- Declaring fixes "working" before deployment (CRITICAL - Nov 13, 2025):
  - Symptom: AI says "position is protected" or "fix is deployed" when container still running old code
  - Root Cause: Conflating "code committed to git" with "code running in production"
  - Real Incident: Database-first fix committed 15:56, declared "working" at 19:42, but container started 15:06 (old code)
  - Result: Unprotected position opened, database save failed silently, Position Manager never tracked it
  - Financial Impact: User discovered $250+ unprotected position 3.5 hours after opening
  - Verification Required:

        # ALWAYS check before declaring fix deployed:
        docker logs trading-bot-v4 | grep "Server starting" | head -1
        # Compare container start time to git commit timestamp
        # If container older: FIX NOT DEPLOYED

  - Rule: NEVER say "fixed", "working", "protected", or "deployed" without verifying container restart timestamp
  - Impact: This is a REAL MONEY system - premature declarations cause financial losses
  - Documentation: Added mandatory deployment verification to VERIFICATION MANDATE section
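The timestamp comparison itself is simple enough to automate. The sketch below assumes both timestamps are available as ISO 8601 strings, for example from `docker inspect -f '{{.State.StartedAt}}' trading-bot-v4` and `git log -1 --format=%cI`. The function name is hypothetical and no such helper is known to exist in the codebase.

```typescript
// Returns true only if the container started after the commit was made.
// A newer start time is necessary but not sufficient: the image must also
// have been rebuilt from that commit.
function isFixDeployed(containerStartedAt: string, commitTimestamp: string): boolean {
  const started = Date.parse(containerStartedAt);
  const committed = Date.parse(commitTimestamp);
  if (Number.isNaN(started) || Number.isNaN(committed)) {
    throw new Error("Unparseable timestamp");
  }
  return started > committed;
}
```

Applied to the incident above (container started 15:06, fix committed 15:56), the check returns `false`, which is the signal that the fix was not yet running.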
- Phantom trade notification workflow breaks (Nov 14, 2025):
  - Symptom: Phantom trade detected, position opened on Drift, but n8n workflow stops with HTTP 500 error. User NOT notified.
  - Root Cause: Execute endpoint returned HTTP 500 when phantom detected, causing n8n chain to halt before Telegram notification
  - Problem: Unmonitored phantom position on exchange while user is asleep/away = unlimited risk exposure
  - Fix: Auto-close phantom trades immediately + return HTTP 200 with warning (allows n8n to continue)

        // When phantom detected in app/api/trading/execute/route.ts:
        // 1. Immediately close position via closePosition()
        // 2. Save to database (create trade + update with exit info)
        // 3. Return HTTP 200 with full notification message in response
        // 4. n8n workflow continues to Telegram notification step

  - Response format change: `{ success: true, warning: 'Phantom trade detected and auto-closed', isPhantom: true, message: '[Full notification text]', phantomDetails: {...} }`
  - Why auto-close: User can't always respond (sleeping, no phone, traveling). Better to exit with small loss/gain than leave unmonitored position exposed.
  - Impact: Protects user from unlimited risk during unavailable hours. Phantom trades are rare edge cases (oracle issues, exchange rejections).
  - Database tracking: `status='phantom'`, `exitReason='manual'`, enables analysis of phantom frequency and patterns
- Wrong entry price after orphaned position restoration (CRITICAL - Fixed Nov 15, 2025):
  - Symptom: Position Manager tracking SHORT at $141.51 entry, but Drift UI shows $141.31 actual entry
  - Root Cause: Startup validation restored orphaned position but used OLD database entry price instead of querying Drift for real value
  - Bug sequence:
    - Position opened at $141.317 (per Drift order history)
    - TP1 closed 70% at $140.942
    - Database incorrectly saved entry as $141.508 (maybe averaged or from previous position)
    - Container restart → startup validation found position on Drift
    - Reopened trade in DB but used stale `trade.entryPrice` from database
    - Position Manager tracked with wrong entry ($141.51 vs actual $141.31)
    - Stop loss calculated from wrong base: $141.08 instead of $140.89
  - Impact: 0.14% difference ($0.20/SOL) in SL placement - could mean difference between small profit and small loss
  - Fix: Query Drift SDK for actual entry price during orphaned position restoration

        // In lib/startup/init-position-manager.ts (line 121-144):
        // When reopening closed trade found on Drift:
        const currentPrice = await driftService.getOraclePrice(marketConfig.driftMarketIndex)
        // Math.abs mirrors the Position Manager conversion: position.size is in
        // base-asset tokens and is assumed here to be signed for shorts
        const positionSizeUSD = Math.abs(position.size) * currentPrice
        await prisma.trade.update({
          where: { id: trade.id },
          data: {
            status: 'open',
            exitReason: null,
            entryPrice: position.entryPrice, // CRITICAL: Use Drift's actual entry price
            positionSizeUSD: positionSizeUSD, // Update to current size (runner after TP1)
          }
        })

  - Drift SDK returns real entry: `position.entryPrice` from `getPosition()` calculates from on-chain data (quoteAssetAmount / baseAssetAmount)
  - Future-proofed: All orphaned position restorations now use authoritative Drift entry price, not stale DB value
  - Manual fix required once: Had to manually UPDATE database for existing position, then restart container
  - Lesson: Always prefer on-chain data over cached database values for critical trading parameters
-
Runner stop loss gap - NO protection between TP1 and TP2 (CRITICAL - Fixed Nov 15, 2025):
- Symptom: Runner position remained open despite price moving far above stop loss level
- Root Cause: Position Manager only checked stop loss BEFORE TP1 hit (line 693) OR AFTER TP2 hit (line 835), creating a gap
- Bug sequence:
- SHORT opened at $141.317, TP1 hit at $140.942 (70% closed)
- Runner (30% remaining, $12.70) had stop loss at $140.89 (profit lock)
- Price rose to $141.98 (way above $140.89 SL) → NO STOP LOSS CHECK
- Position exposed to unlimited loss for hours during TP1→TP2 window
- User manually checked: "runner close did not work. still open and the price is above 141,98"
- Impact: Hours of unprotected runner exposure = potential unlimited loss on 25-30% remaining position
- Code analysis:
// Line 693: Stop loss checked ONLY before TP1
if (!trade.tp1Hit && this.shouldStopLoss(currentPrice, trade)) {
  console.log(`🔴 STOP LOSS: ${trade.symbol}`)
  await this.executeExit(trade, 100, 'SL', currentPrice)
}

// Lines 706-831: TP1 and TP2 processing - NO STOP LOSS CHECK

// Line 835: Stop loss checked ONLY after TP2
if (trade.tp2Hit && this.config.useTrailingStop && this.shouldStopLoss(currentPrice, trade)) {
  console.log(`🔴 TRAILING STOP: ${trade.symbol}`)
  await this.executeExit(trade, 100, 'SL', currentPrice)
}
// BUG: Runner between TP1-TP2 has ZERO stop loss protection!
- Fix: Added explicit runner stop loss check at line ~795:
// CRITICAL: Check stop loss for runner (after TP1, before TP2)
if (trade.tp1Hit && !trade.tp2Hit && this.shouldStopLoss(currentPrice, trade)) {
  console.log(`🔴 RUNNER STOP LOSS: ${trade.symbol} at ${profitPercent.toFixed(2)}% (profit lock triggered)`)
  await this.executeExit(trade, 100, 'SL', currentPrice)
  return
}
- Verification: After fix deployed, runner closed at $141.133 with +$0.59 profit (+4.6% on $12.70 runner)
- Database evidence: Trade shows exitReason='SL', proving runner stop loss triggered correctly
- Why undetected: Runner system relatively new (Nov 11), most trades hit TP2 quickly without price reversals
- Lesson: Every conditional branch in risk management MUST have explicit stop loss checks - never assume "it'll get caught somewhere"
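The gap described above can be reproduced with the two conditions reduced to booleans. This is a minimal sketch, not the Position Manager code: `TradeState`, `coveredBeforeFix`, `coveredAfterFix`, and `stopPriceBreached` are hypothetical names, and the stop comparison is an assumed simplification of `shouldStopLoss()`.

```typescript
// Hypothetical, simplified trade state; the real trade object has many more fields.
interface TradeState {
  tp1Hit: boolean
  tp2Hit: boolean
}

// Pre-fix coverage: stop loss is evaluated only before TP1, or after TP2 with trailing stop on.
function coveredBeforeFix(trade: TradeState, useTrailingStop: boolean): boolean {
  return !trade.tp1Hit || (trade.tp2Hit && useTrailingStop)
}

// Post-fix coverage: adds the runner branch (after TP1, before TP2).
function coveredAfterFix(trade: TradeState, useTrailingStop: boolean): boolean {
  return coveredBeforeFix(trade, useTrailingStop) || (trade.tp1Hit && !trade.tp2Hit)
}

// Assumed stop rule: a SHORT stops out when price rises to the stop, a LONG when it falls to it.
function stopPriceBreached(direction: 'long' | 'short', price: number, stopLoss: number): boolean {
  return direction === 'short' ? price >= stopLoss : price <= stopLoss
}

const runner: TradeState = { tp1Hit: true, tp2Hit: false }
console.log(coveredBeforeFix(runner, true))             // false: nothing checks the runner
console.log(coveredAfterFix(runner, true))              // true
console.log(stopPriceBreached('short', 141.98, 140.89)) // true: the incident price was past the stop
```

Enumerating the states this way makes it cheap to assert that every combination of `tp1Hit` and `tp2Hit` reaches some stop loss check.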
- Flip-flop price context using wrong data (CRITICAL - Fixed Nov 14, 2025):
- Symptom: Flip-flop detection showing "100% price move" when actual movement was 0.2%, allowing trades that should be blocked
- Root Cause: currentPrice parameter not available in check-risk endpoint (trade hasn't opened yet), so calculation used undefined/zero
- Real incident: Nov 14, 06:05 CET - SHORT allowed with 0.2% flip-flop, lost $1.56 in 5 minutes
- Bug sequence:
- LONG opened at $143.86 (06:00)
- SHORT signal 4min later at $143.58 (0.2% move)
- Flip-flop check: (undefined - 143.86) / 143.86 * 100 = garbage → showed "100%"
- System thought it was reversal → allowed trade
- Should have been blocked as tight-range chop
- Fix: Two-part fix in commits 77a9437 and 795026a:
// In app/api/trading/check-risk/route.ts:
// Get current price from Pyth BEFORE quality scoring
const priceMonitor = getPythPriceMonitor()
const latestPrice = priceMonitor.getCachedPrice(body.symbol)
const currentPrice = latestPrice?.price || body.currentPrice

// In lib/trading/signal-quality.ts:
// Validate price data exists before calculation
if (!params.currentPrice || params.currentPrice === 0) {
  // No current price available - apply penalty (conservative)
  console.warn(`⚠️ Flip-flop check: No currentPrice available, applying penalty`)
  frequencyPenalties.flipFlop = -25
  score -= 25
} else {
  const priceChangePercent = Math.abs(
    (params.currentPrice - recentSignals.oppositeDirectionPrice) / recentSignals.oppositeDirectionPrice * 100
  )
  console.log(`🔍 Flip-flop price check: $${recentSignals.oppositeDirectionPrice.toFixed(2)} → $${params.currentPrice.toFixed(2)} = ${priceChangePercent.toFixed(2)}%`)
  // Apply penalty only if < 2% move
}
- Impact: Without this fix, flip-flop detection is useless - blocks reversals, allows chop
- Lesson: Always validate input data for financial calculations, especially when data might not exist yet
- Monitoring: Watch logs for "🔍 Flip-flop price check: $X → $Y = Z%" to verify correct calculations
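The arithmetic behind the bogus "100%" reading can be checked in isolation. The sketch below is illustrative only: `unguardedPercent` and `safePriceChangePercent` are hypothetical helpers, not functions from `signal-quality.ts`, and the prices are the ones from the incident above.

```typescript
// Same formula as the flip-flop check, with no input validation.
function unguardedPercent(current: number, previous: number): number {
  return Math.abs((current - previous) / previous * 100)
}

// Guarded variant: returns null when either price is missing, non-finite, or non-positive,
// so the caller has to handle "no data" explicitly.
function safePriceChangePercent(current: number | undefined, previous: number | undefined): number | null {
  if (current === undefined || previous === undefined) return null
  if (!Number.isFinite(current) || !Number.isFinite(previous)) return null
  if (current <= 0 || previous <= 0) return null
  return Math.abs((current - previous) / previous * 100)
}

// A zero price looks like a full reversal.
console.log(unguardedPercent(0, 143.86)) // 100

// An undefined price coerces to NaN, and every comparison against NaN is false.
console.log(Number.isNaN(unguardedPercent(Number(undefined), 143.86))) // true

// With a real price the move is small, which is what the check should have seen.
console.log(safePriceChangePercent(143.58, 143.86)?.toFixed(2)) // "0.19"
```

Returning `null` rather than a number keeps a missing price from being mistaken for either a large or a small move.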
- Phantom trades need exitReason for cleanup (CRITICAL - Fixed Nov 15, 2025):
- Symptom: Position Manager keeps restoring phantom trade on every restart, triggers false runner stop loss alerts
- Root Cause: Phantom auto-closure sets status='phantom' but leaves exitReason=NULL
- Bug: Startup validator checks exitReason !== null (line 122 of init-position-manager.ts), ignores status field
- Consequence: Phantom trade with exitReason=NULL treated as "open" and restored to Position Manager
- Real incident: Nov 14 phantom trade (cmhy6xul20067nx077agh260n) caused 232% size mismatch, hundreds of false "🔴 RUNNER STOP LOSS" alerts
- Fix: When auto-closing phantom trades, MUST set exitReason:
// In app/api/trading/execute/route.ts (phantom detection):
await updateTradeExit({
  tradeId: trade.id,
  exitPrice: currentPrice,
  exitReason: 'manual', // CRITICAL: Must set exitReason for cleanup
  realizedPnL: actualPnL,
  status: 'phantom'
})
- Manual cleanup: If phantom already exists:
UPDATE "Trade" SET "exitReason" = 'manual' WHERE status = 'phantom' AND "exitReason" IS NULL
- Impact: Without exitReason, phantom trades create ghost positions that trigger false alerts and pollute monitoring
- Verification: After restart, check logs for "Found 0 open trades" (not "Found 1 open trades to restore")
- Lesson: status field is for classification, exitReason is for lifecycle management - both must be set on closure
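The fix above sets exitReason on closure. As a defensive complement, and not something the source says was implemented, a restore predicate could also look at status. In this sketch `TradeRow` and `shouldRestore` are hypothetical, and the 'open' status value used in the example is an assumption; only 'phantom' is taken from the text.

```typescript
// Hypothetical minimal row shape; the real Trade table has many more columns.
interface TradeRow {
  id: string
  status: string
  exitReason: string | null
}

// Restore a trade only if it has no exit reason AND is not classified as phantom.
function shouldRestore(trade: TradeRow): boolean {
  return trade.exitReason === null && trade.status !== 'phantom'
}

const rows: TradeRow[] = [
  { id: 'a', status: 'open', exitReason: null },        // assumed status value
  { id: 'b', status: 'phantom', exitReason: null },     // the problematic case
  { id: 'c', status: 'phantom', exitReason: 'manual' }, // phantom closed correctly
]
console.log(rows.filter(shouldRestore).map((row) => row.id)) // [ 'a' ]
```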
- closePosition() missing retry logic causes rate limit storm (CRITICAL - Fixed Nov 15, 2025):
- Symptom: Position Manager tries to close trade, gets 429 error, retries EVERY 2 SECONDS → 100+ failed attempts → rate limit exhaustion
- Root Cause: placeExitOrders() has retryWithBackoff() wrapper (Nov 14 fix), but closePosition() did NOT
- Real incident: Trade cmi0il8l30000r607l8aec701 (Nov 15, 16:49 CET)
- Position Manager tried to close (SL or TP trigger)
- closePosition() called raw placePerpOrder() → 429 error
- executeExit() caught 429, returned early (line 935-940)
- Position Manager kept monitoring, retried close EVERY 2 seconds
- Logs show 100+ "❌ Failed to close position: 429" + "⚠️ Rate limited while closing SOL-PERP"
- Meanwhile: On-chain TP2 limit order filled (unaffected by SDK rate limits)
- External closure detected, DB updated 8 TIMES: $0.14 → $0.20 → $0.26 → ... → $0.51
- Container eventually restarted (likely from rate limit exhaustion)
- Why duplicate updates: Common Pitfall #27 fix (remove from Map before DB update) works UNLESS rate limits cause tons of retries before external closure detection
- Impact: User saw $0.51 profit in DB, $0.03 on Drift UI (8× compounding vs 1 actual fill)
- Fix: Wrapped closePosition() with retryWithBackoff() in lib/drift/orders.ts:
// Line ~567 (BEFORE):
const txSig = await driftClient.placePerpOrder(orderParams)

// Line ~567 (AFTER):
const txSig = await retryWithBackoff(async () => {
  return await driftClient.placePerpOrder(orderParams)
}, 3, 8000) // 8s base delay, 3 max retries (8s → 16s → 32s)
- Behavior now: 3 SDK retries over 56s (8+16+32) + Position Manager natural retry on next monitoring cycle = robust without spam
- RPC load reduction: 30-50× fewer requests during close operations (3 retries vs 100+ attempts)
- Verification: Container restarted 18:05 CET Nov 15, code deployed
- Lesson: EVERY SDK order operation (open, close, cancel, place) MUST have retry wrapper - Position Manager monitoring creates infinite retry loop without it
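For orientation, a wrapper with the call shape used above (function, max retries, base delay) might look roughly like this. It is a sketch under assumptions, not the project's `retryWithBackoff`: the name `retryWithBackoffSketch`, the helpers `backoffDelays` and `isRateLimitError`, the injectable `sleep` parameter, and the choice to retry only on errors whose message contains "429" are all hypothetical.

```typescript
type Sleep = (ms: number) => Promise<void>

const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Exponential schedule: base, 2*base, 4*base, ...
function backoffDelays(baseDelayMs: number, maxRetries: number): number[] {
  const delays: number[] = []
  for (let i = 0; i < maxRetries; i++) delays.push(baseDelayMs * Math.pow(2, i))
  return delays
}

// Assumption: rate limit errors can be recognised by "429" in the message.
function isRateLimitError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err)
  return message.includes('429')
}

async function retryWithBackoffSketch<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 8000,
  sleep: Sleep = realSleep
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      // Rethrow non-rate-limit errors, and give up once the retry budget is spent.
      if (!isRateLimitError(err) || attempt >= maxRetries) throw err
      await sleep(baseDelayMs * Math.pow(2, attempt))
    }
  }
}

console.log(backoffDelays(8000, 3)) // [ 8000, 16000, 32000 ], 56s in total
```

Passing `sleep` as a parameter lets a test substitute a no-op and exercise the retry path without waiting 56 seconds.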
File Conventions
- API routes: app/api/[feature]/[action]/route.ts (Next.js 15 App Router)
- Services: lib/[service]/[module].ts (drift, pyth, trading, database)
- Config: Single source in config/trading.ts with env merging
- Types: Define interfaces in same file as implementation (not separate types directory)
- Console logs: Use emojis for visual scanning: 🎯 🚀 ✅ ❌ 💰 📊 🛡️
Re-Entry Analytics System (Phase 1)
Purpose: Validate manual Telegram trades using fresh TradingView data + recent performance analysis
Components:
- Market Data Cache (lib/trading/market-data-cache.ts)
  - Singleton service storing TradingView metrics
  - 5-minute expiry on cached data
  - Tracks: ATR, ADX, RSI, volume ratio, price position, timeframe
- Market Data Webhook (app/api/trading/market-data/route.ts)
  - Receives TradingView alerts every 1-5 minutes
  - POST: Updates cache with fresh metrics
  - GET: View cached data (debugging)
- Re-Entry Check Endpoint (app/api/analytics/reentry-check/route.ts)
  - Validates manual trade requests
  - Uses fresh TradingView data if available (<5min old)
  - Falls back to historical metrics from last trade
  - Scores signal quality + applies performance modifiers:
    - -20 points if last 3 trades lost money (avgPnL < -5%)
    - +10 points if last 3 trades won (avgPnL > +5%, WR >= 66%)
    - -5 points for stale data, -10 points for no data
  - Minimum score: 55 (vs 60 for new signals)
- Auto-Caching (app/api/trading/execute/route.ts)
  - Every trade signal from TradingView auto-caches metrics
  - Ensures fresh data available for manual re-entries
- Telegram Integration (telegram_command_bot.py)
  - Calls /api/analytics/reentry-check before executing manual trades
  - Shows data freshness ("✅ FRESH 23s old" vs "⚠️ Historical")
  - Blocks low-quality re-entries unless --force flag used
  - Fail-open: Proceeds if analytics check fails
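The 5-minute expiry in the Market Data Cache could be sketched as follows. This is an assumption-laden illustration rather than the contents of `market-data-cache.ts`: the class name, the `MarketMetrics` shape, and the decision to return `null` for expired entries are hypothetical, and the singleton wiring is omitted.

```typescript
// Fields mirror the "Tracks" list above; the exact types are assumed.
interface MarketMetrics {
  atr: number
  adx: number
  rsi: number
  volumeRatio: number
  pricePosition: number
  timeframe: string
}

const MAX_AGE_MS = 5 * 60 * 1000

class MarketDataCacheSketch {
  private entries = new Map<string, { metrics: MarketMetrics; storedAt: number }>()

  // `now` is injectable so that expiry can be tested without real waiting.
  set(symbol: string, metrics: MarketMetrics, now: number = Date.now()): void {
    this.entries.set(symbol, { metrics, storedAt: now })
  }

  // Returns the metrics and their age, or null when missing or 5 minutes old or more.
  get(symbol: string, now: number = Date.now()): { metrics: MarketMetrics; ageSeconds: number } | null {
    const entry = this.entries.get(symbol)
    if (!entry) return null
    const ageMs = now - entry.storedAt
    if (ageMs >= MAX_AGE_MS) return null
    return { metrics: entry.metrics, ageSeconds: Math.round(ageMs / 1000) }
  }
}

const demoCache = new MarketDataCacheSketch()
demoCache.set('SOL-PERP', { atr: 0.4, adx: 25, rsi: 55, volumeRatio: 1.2, pricePosition: 60, timeframe: '5' }, 0)
console.log(demoCache.get('SOL-PERP', 23000)?.ageSeconds) // 23
console.log(demoCache.get('SOL-PERP', MAX_AGE_MS))        // null
```

The `ageSeconds` value is what a freshness label such as "FRESH 23s old" would be built from.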
User Flow:
User: "long sol"
↓ Check cache for SOL-PERP
↓ Fresh data? → Use real TradingView metrics
↓ Stale/missing? → Use historical + penalty
↓ Score quality + recent performance
↓ Score >= 55? → Execute
↓ Score < 55? → Block (unless --force)
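The scoring step of this flow, using the modifiers and the 55-point minimum listed under the Re-Entry Check Endpoint, can be sketched as below. The function and type names are hypothetical, and how the base quality score is computed is outside this sketch.

```typescript
type DataFreshness = 'fresh' | 'stale' | 'none'

// Aggregates over the last 3 trades; field names are assumed.
interface RecentPerformance {
  avgPnLPercent: number
  winRatePercent: number
}

const MIN_REENTRY_SCORE = 55

function applyReentryModifiers(baseScore: number, freshness: DataFreshness, recent: RecentPerformance): number {
  let score = baseScore
  if (recent.avgPnLPercent < -5) score -= 20
  else if (recent.avgPnLPercent > 5 && recent.winRatePercent >= 66) score += 10
  if (freshness === 'stale') score -= 5
  if (freshness === 'none') score -= 10
  return score
}

// --force overrides a failing score, matching the last line of the flow.
function reentryDecision(score: number, force: boolean): 'execute' | 'block' {
  return score >= MIN_REENTRY_SCORE || force ? 'execute' : 'block'
}

const demoScore = applyReentryModifiers(70, 'fresh', { avgPnLPercent: -6, winRatePercent: 0 })
console.log(demoScore)                         // 50
console.log(reentryDecision(demoScore, false)) // block
console.log(reentryDecision(demoScore, true))  // execute
```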
TradingView Setup: Create alerts that fire every 1-5 minutes with this webhook message:
{
"action": "market_data",
"symbol": "{{ticker}}",
"timeframe": "{{interval}}",
"atr": {{ta.atr(14)}},
"adx": {{ta.dmi(14, 14)}},
"rsi": {{ta.rsi(14)}},
"volumeRatio": {{volume / ta.sma(volume, 20)}},
"pricePosition": {{(close - ta.lowest(low, 100)) / (ta.highest(high, 100) - ta.lowest(low, 100)) * 100}},
"currentPrice": {{close}}
}
Webhook URL: https://your-domain.com/api/trading/market-data
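On the receiving side, the payload fields above could be validated before they reach the cache. The sketch below is a hypothetical helper, not the route handler in `app/api/trading/market-data/route.ts`: `MarketDataPayload` and `parseMarketDataPayload` are assumed names, and the sample values in the usage line are made up for illustration.

```typescript
interface MarketDataPayload {
  action: 'market_data'
  symbol: string
  timeframe: string
  atr: number
  adx: number
  rsi: number
  volumeRatio: number
  pricePosition: number
  currentPrice: number
}

const NUMERIC_FIELDS = ['atr', 'adx', 'rsi', 'volumeRatio', 'pricePosition', 'currentPrice'] as const

// Parses a raw webhook body; returns null instead of throwing on malformed input.
function parseMarketDataPayload(raw: string): MarketDataPayload | null {
  let data: unknown
  try {
    data = JSON.parse(raw)
  } catch {
    return null
  }
  if (typeof data !== 'object' || data === null) return null
  const record = data as Record<string, unknown>
  if (record['action'] !== 'market_data') return null
  if (typeof record['symbol'] !== 'string' || typeof record['timeframe'] !== 'string') return null
  for (const field of NUMERIC_FIELDS) {
    const value = record[field]
    // Rejects missing fields, strings, and NaN/Infinity alike.
    if (typeof value !== 'number' || !Number.isFinite(value)) return null
  }
  return record as unknown as MarketDataPayload
}

const demoBody = '{"action":"market_data","symbol":"SOLUSDT","timeframe":"5","atr":0.45,"adx":28.1,"rsi":61.2,"volumeRatio":1.3,"pricePosition":72.5,"currentPrice":141.2}'
console.log(parseMarketDataPayload(demoBody)?.symbol) // SOLUSDT
```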
Per-Symbol Trading Controls
Purpose: Independent enable/disable toggles and position sizing for SOL and ETH to support different trading strategies (e.g., ETH for data collection at minimal size, SOL for profit generation).
Configuration Priority:
- Per-symbol ENV vars (highest priority)
  - SOLANA_ENABLED, SOLANA_POSITION_SIZE, SOLANA_LEVERAGE
  - ETHEREUM_ENABLED, ETHEREUM_POSITION_SIZE, ETHEREUM_LEVERAGE
- Market-specific config (from MARKET_CONFIGS in config/trading.ts)
- Global ENV vars (fallback for BTC and other symbols)
  - MAX_POSITION_SIZE_USD, LEVERAGE
- Default config (lowest priority)
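The priority order above amounts to "first defined value wins". A minimal sketch, assuming each layer has already been parsed into an optional number; `SettingSources`, `firstDefined`, and `resolveSetting` are hypothetical names and this is not the body of `getPositionSizeForSymbol()`.

```typescript
// One value per configuration layer; undefined means "not set at this layer".
interface SettingSources {
  perSymbolEnv?: number
  marketConfig?: number
  globalEnv?: number
  defaultValue: number
}

// Returns the first candidate that is not undefined, in priority order.
function firstDefined<T>(...candidates: Array<T | undefined>): T | undefined {
  for (const candidate of candidates) {
    if (candidate !== undefined) return candidate
  }
  return undefined
}

function resolveSetting(sources: SettingSources): number {
  return firstDefined(sources.perSymbolEnv, sources.marketConfig, sources.globalEnv) ?? sources.defaultValue
}

console.log(resolveSetting({ perSymbolEnv: 50, marketConfig: 100, globalEnv: 200, defaultValue: 10 })) // 50
console.log(resolveSetting({ globalEnv: 200, defaultValue: 10 }))                                       // 200
console.log(resolveSetting({ perSymbolEnv: 0, globalEnv: 200, defaultValue: 10 }))                      // 0
```

Checking against `undefined` instead of using `||` matters here: an explicit 0 at a higher-priority layer should not fall through to a lower one.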
Settings UI: app/settings/page.tsx has dedicated sections:
- 💎 Solana section: Toggle + position size + leverage + risk calculator
- ⚡ Ethereum section: Toggle + position size + leverage + risk calculator
- 💰 Global fallback: For BTC-PERP and future symbols
Example usage:
// In execute/test endpoints
const { size, leverage, enabled } = getPositionSizeForSymbol(driftSymbol, config)
if (!enabled) {
return NextResponse.json({
success: false,
error: 'Symbol trading disabled'
}, { status: 400 })
}
Test buttons: Settings UI has symbol-specific test buttons:
- 💎 Test SOL LONG/SHORT (disabled when SOLANA_ENABLED=false)
- ⚡ Test ETH LONG/SHORT (disabled when ETHEREUM_ENABLED=false)
When Making Changes
- Adding new config: Update DEFAULT_TRADING_CONFIG + getConfigFromEnv() + .env file
- Adding database fields: Update prisma/schema.prisma → npx prisma migrate dev → npx prisma generate → rebuild Docker
- Changing order logic: Test with DRY_RUN=true first, use small position sizes ($10)
- API endpoint changes: Update both endpoint + corresponding n8n workflow JSON (Check Risk and Execute Trade nodes)
- Docker changes: Rebuild with docker compose build trading-bot then restart container
- Modifying quality score logic: Update BOTH /api/trading/check-risk and /api/trading/execute endpoints, ensure timeframe-aware thresholds are synchronized
- Exit strategy changes: Modify Position Manager logic + update on-chain order placement in placeExitOrders()
- TradingView alert changes: Ensure alerts pass timeframe field (e.g., "timeframe": "5") to enable proper signal quality scoring
- Position Manager changes: ALWAYS execute test trade after deployment
  - Use /api/trading/test endpoint or Telegram long sol --force
  - Monitor docker logs -f trading-bot-v4 for full cycle
  - Verify TP1 hit → 75% close → SL moved to breakeven
  - SQL: Check tp1Hit, slMovedToBreakeven, currentSize in Trade table
  - Compare: Position Manager logs vs actual Drift position size
- Calculation changes: Add verbose logging and verify with SQL
- Log every intermediate step, especially unit conversions
- Never assume SDK data format - log raw values to verify
- SQL query with manual calculation to compare results
- Test boundary cases: 0%, 100%, min/max values
- DEPLOYMENT VERIFICATION (MANDATORY): Before declaring ANY fix working:
- Check container start time vs commit timestamp
- If container older than commit: CODE NOT DEPLOYED
- Restart container and verify new code is running
- Never say "fixed" or "protected" without deployment confirmation
- This is a REAL MONEY system - unverified fixes cause losses
- GIT COMMIT AND PUSH (MANDATORY): After completing ANY feature, fix, or significant change:
- ALWAYS commit changes with descriptive message
- ALWAYS push to remote repository
- User should NOT have to ask for this - it's part of completion
- Commit message format:
git add -A
git commit -m "type: brief description

- Bullet point details
- Files changed
- Why the change was needed
"
git push
- Types: feat: (feature), fix: (bug fix), docs: (documentation), refactor: (code restructure)
- This is NOT optional - code exists only when committed and pushed
- NEXTCLOUD DECK SYNC (MANDATORY): After completing phases or making significant roadmap progress:
- Update roadmap markdown files with new status (🔄 IN PROGRESS, ✅ COMPLETE, 🔜 NEXT)
- Run sync to update Deck cards: python3 scripts/sync-roadmap-to-deck.py --init
- Move cards between stacks in Nextcloud Deck UI to reflect progress visually
- Backlog (📥) → Planning (📋) → In Progress (🚀) → Complete (✅)
- Keep Deck in sync with actual work - it's the visual roadmap tracker
- Documentation: docs/NEXTCLOUD_DECK_SYNC.md
- UPDATE COPILOT-INSTRUCTIONS.MD (MANDATORY): After implementing ANY significant feature or system change:
- Document new database fields and their purpose
- Add filtering requirements (e.g., manual vs TradingView trades)
- Update "Important fields" sections with new schema changes
- Add new API endpoints to the architecture overview
- Document data integrity requirements (what must be excluded from analysis)
- Add SQL query patterns for common operations
- Update "When Making Changes" section with new patterns learned
- Create reference docs in docs/ for complex features (e.g., MANUAL_TRADE_FILTERING.md)
- WHY: Future AI agents need complete context to maintain data integrity and avoid breaking analysis
- EXAMPLES: signalSource field for filtering, MAE/MFE tracking, phantom trade detection
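The DEPLOYMENT VERIFICATION step in the list above reduces to a timestamp comparison. The sketch below assumes the two timestamps have already been obtained, for example with `docker inspect --format '{{.State.StartedAt}}' trading-bot-v4` and `git log -1 --format=%cI`; `toEpochMs` and `containerIsNewerThanCommit` are hypothetical helpers, and the timestamps in the usage lines are illustrative.

```typescript
// Docker reports StartedAt with nanosecond precision; trim to milliseconds so that
// Date.parse receives a standard ISO 8601 timestamp.
function toEpochMs(isoTimestamp: string): number {
  const trimmed = isoTimestamp.replace(/(\.\d{3})\d+/, '$1')
  const ms = Date.parse(trimmed)
  if (Number.isNaN(ms)) throw new Error(`Unparseable timestamp: ${isoTimestamp}`)
  return ms
}

// True only when the container started after the commit was created.
// This is a necessary condition, not proof: the image must also have been rebuilt.
function containerIsNewerThanCommit(containerStartedAt: string, commitTimestamp: string): boolean {
  return toEpochMs(containerStartedAt) > toEpochMs(commitTimestamp)
}

// Container started 17:05 UTC, commit made 17:40 CET (16:40 UTC): plausibly deployed.
console.log(containerIsNewerThanCommit('2025-11-15T17:05:00.123456789Z', '2025-11-15T17:40:00+01:00')) // true

// Container started 15:00 UTC, before the same commit: CODE NOT DEPLOYED.
console.log(containerIsNewerThanCommit('2025-11-15T15:00:00Z', '2025-11-15T17:40:00+01:00')) // false
```

Comparing parsed instants rather than raw strings avoids mistakes when the container reports UTC and git reports local time.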
Development Roadmap
Current Status (Nov 14, 2025):
- 168 trades executed with quality scores and MAE/MFE tracking
- Capital: $97.55 USDC at 100% health (zero debt, all USDC collateral)
- Leverage: 15x SOL (reduced from 20x for safer liquidation cushion)
- Three active optimization initiatives in data collection phase:
- Signal Quality: 0/20 blocked signals collected → need 10-20 for analysis
- Position Scaling: 161 v5 trades, collecting v6 data → need 50+ v6 trades
- ATR-based TP: 1/50 trades with ATR data → need 50 for validation
- Expected combined impact: 35-40% P&L improvement when all three optimizations complete
- Master roadmap: See OPTIMIZATION_MASTER_ROADMAP.md for consolidated view
See SIGNAL_QUALITY_OPTIMIZATION_ROADMAP.md for systematic signal quality improvements:
- Phase 1 (🔄 IN PROGRESS): Collect 10-20 blocked signals with quality scores (1-2 weeks)
- Phase 2 (🔜 NEXT): Analyze patterns and make data-driven threshold decisions
- Phase 3 (🎯 FUTURE): Implement dual-threshold system or other optimizations based on data
- Phase 4 (🤖 FUTURE): Automated price analysis for blocked signals
- Phase 5 (🧠 DISTANT): ML-based scoring weight optimization
See POSITION_SCALING_ROADMAP.md for planned position management optimizations:
- Phase 1 (✅ COMPLETE): Collect data with quality scores (20-50 trades needed)
- Phase 2: ATR-based dynamic targets (adapt to volatility)
- Phase 3: Signal quality-based scaling (high quality = larger runners)
- Phase 4: Direction-based optimization (shorts vs longs have different performance)
- Phase 5 (✅ COMPLETE): TP2-as-runner system implemented - configurable runner (default 25%, adjustable via TAKE_PROFIT_1_SIZE_PERCENT) with ATR-based trailing stop
- Phase 6: ML-based exit prediction (future)
Recent Implementation: TP2-as-runner system provides 5x larger runner (default 25% vs old 5%) for better profit capture on extended moves. When TP2 price is hit, trailing stop activates on full remaining position instead of closing partial amount. Runner size is configurable (100% - TP1 close %).
Blocked Signals Tracking (Nov 11, 2025): System now automatically saves all blocked signals to database for data-driven optimization. See BLOCKED_SIGNALS_TRACKING.md for SQL queries and analysis workflows.
Data-driven approach: Each phase requires validation through SQL analysis before implementation. No premature optimization.
Signal Quality Version Tracking: Database tracks signalQualityVersion field to compare algorithm performance:
- Analytics dashboard shows version comparison: trades, win rate, P&L, extreme position stats
- v4 (current) includes blocked signals tracking for data-driven optimization
- Focus on extreme positions (< 15% range) - v3 aimed to reduce losses from weak ADX entries
- SQL queries in docs/analysis/SIGNAL_QUALITY_VERSION_ANALYSIS.sql for deep-dive analysis
- Need 20+ trades per version before meaningful comparison
Financial Roadmap Integration: All technical improvements must align with current phase objectives (see top of document):
- Phase 1 (CURRENT): Prove system works, compound aggressively, 60%+ win rate mandatory
- Phase 2-3: Transition to sustainable growth while funding withdrawals
- Phase 4+: Scale capital while reducing risk progressively
- See TRADING_GOALS.md for complete 8-phase plan ($106 → $1M+)
Blocked Signals Analysis: See BLOCKED_SIGNALS_TRACKING.md for:
- SQL queries to analyze blocked signal patterns
- Score distribution and metric analysis
- Comparison with executed trades at similar quality levels
- Future automation of price tracking (would TP1/TP2/SL have hit?)
Integration Points
- n8n: Expects exact response format from /api/trading/execute (see n8n-complete-workflow.json)
- Drift Protocol: Uses SDK v2.75.0 - check docs at docs.drift.trade for API changes
- Pyth Network: WebSocket + HTTP fallback for price feeds (handles reconnection)
- PostgreSQL: Version 16-alpine, must be running before bot starts
Key Mental Model: Think of this as two parallel systems (on-chain orders + software monitoring) working together. The Position Manager is the "backup brain" that constantly watches and acts if on-chain orders fail. Both write to the same database for complete trade history.