trading_bot_v4

Author	SHA1	Message	Date
mindesbunister	6990f20d6f	feat: Orderbook shadow logging system - Phase 1 complete Implementation: - Added 7 orderbook fields to Trade model (spreadBps, imbalanceRatio, depths, impact, walls) - Oracle-based estimates with 2bps spread assumption - ENV flag: ENABLE_ORDERBOOK_LOGGING (defaults true) - Execute wrapper lines 1037-1053 guards orderbook logic Database: - Direct SQL ALTER TABLE (avoided migration drift issues) - All columns nullable DOUBLE PRECISION - Prisma schema synced via db pull + generate Deployment: - Container rebuilt and deployed successfully - All 7 columns verified accessible - System operational, ready for live trade validation Files changed: - config/trading.ts (enableOrderbookLogging flag, line 127) - types/trading.ts (orderbook interfaces) - lib/database/trades.ts (createTrade saves orderbook data) - app/api/trading/execute/route.ts (ENV wrapper lines 1037-1053) - prisma/schema.prisma (7 orderbook fields) - docs/ORDERBOOK_SHADOW_LOGGING.md (complete documentation) Status: ✅ PRODUCTION READY - awaiting first trade for validation	2025-12-19 08:51:36 +01:00
mindesbunister	b11da009eb	critical: Bug #89 - Detect and handle Drift fractional position remnants (3-part fix) - Part 1: Position Manager fractional remnant detection after close attempts * Check if position < 1.5× minOrderSize after close transaction * Log to persistent logger with FRACTIONAL_REMNANT_DETECTED * Track closeAttempts, limit to 3 maximum * Mark exitReason='FRACTIONAL_REMNANT' in database * Remove from monitoring after 3 failed attempts - Part 2: Pre-close validation in closePosition() * Check if position viable before attempting close * Reject positions < 1.5× minOrderSize with specific error * Prevent wasted transaction attempts on too-small positions * Return POSITION_TOO_SMALL_TO_CLOSE error with manual instructions - Part 3: Health monitor detection for fractional remnants * Query Trade table for FRACTIONAL_REMNANT exits in last 24h * Alert operators with position details and manual cleanup instructions * Provide trade IDs, symbols, and Drift UI link - Database schema: Added closeAttempts Int? field to Track attempts Root cause: Drift protocol exchange constraints can leave fractional positions Evidence: 3 close transactions confirmed but 0.15 SOL remnant persisted Financial impact: ,000+ risk from unprotected fractional positions Status: Fix implemented, awaiting deployment verification See: docs/COMMON_PITFALLS.md Bug #89 for complete incident details	2025-12-16 22:05:12 +01:00
mindesbunister	d637aac2d7	feat: Deploy HA auto-failover with database promotion - Enhanced DNS failover monitor on secondary (72.62.39.24) - Auto-promotes database: pg_ctl promote on failover - Creates DEMOTED flag on primary via SSH (split-brain protection) - Telegram notifications with database promotion status - Startup safety script ready (integration pending) - 90-second automatic recovery vs 10-30 min manual - Zero-cost 95% enterprise HA benefit Status: DEPLOYED and MONITORING (14:52 CET) Next: Controlled failover test during maintenance	2025-12-12 15:54:03 +01:00
mindesbunister	4e286c91ef	fix: harden drift verifier and validation flow	2025-12-10 15:05:44 +01:00
mindesbunister	55d780cc4c	critical: Fix usdToBase() to use specific prices (TP1/TP2/SL) not entryPrice ROOT CAUSE IDENTIFIED (Dec 10, 2025): - Original working implementation (`4cc294b`, Oct 26): Used SPECIFIC price for each order - Broken implementation: Used entryPrice for ALL orders - Impact: Wrong token quantities = orders rejected/failed = NULL database signatures THE FIX: - Reverted usdToBase(usd) to usdToBase(usd, price) - TP1: Now uses options.tp1Price (not entryPrice) - TP2: Now uses options.tp2Price (not entryPrice) - SL: Now uses options.stopLossPrice (not entryPrice) WHY THIS FIXES IT: - To close 60% at TP1 price $141.20, need DIFFERENT token quantity than at entry $140.00 - Using wrong price = wrong size = Drift rejects order OR creates wrong size - Correct price = correct token quantity = orders placed successfully ORIGINAL COMMIT MESSAGE (`4cc294b`): "All 3 exit orders placed successfully on-chain" FILES CHANGED: - lib/drift/orders.ts: Fixed usdToBase() function signature + all 3 call sites This fix restores the proven working implementation that had 100% success rate. User lost $1,000+ from this bug causing positions without risk management.	2025-12-10 10:45:44 +01:00
copilot-swe-agent[bot]	63b94016fe	fix: Implement critical risk management fixes for bugs #76 , #77 , #78 , #80 Co-authored-by: mindesbunister <32161838+mindesbunister@users.noreply.github.com>	2025-12-09 22:23:43 +00:00
mindesbunister	f2e4156c8a	debug: Add comprehensive logging to closePosition for TP1 investigation - Added console.log debugging to closePosition function - Logs: percentToClose, position.size, calculated sizeToClose, minimum check - Logs: Override decision if size below minimum - Purpose: Investigate why TP1 closes 100% instead of configured 60% - User reported: Telegram shows '60% closed, 40% runner' but position fully closes - Files changed: lib/drift/orders.ts (lines 500-522)	2025-12-09 19:02:21 +01:00
mindesbunister	302511293c	feat: Add production logging gating (Phase 1, Task 1.1) - Created logger utility with environment-based gating (lib/utils/logger.ts) - Replaced 517 console.log statements with logger.log (71% reduction) - Fixed import paths in 15 files (resolved comment-trapped imports) - Added DEBUG_LOGS=false to .env - Achieves 71% immediate log reduction (517/731 statements) - Expected 90% reduction in production when deployed Impact: Reduced I/O blocking, lower log volume in production Risk: LOW (easy rollback, non-invasive) Phase: Phase 1, Task 1.1 (Quick Wins - Console.log Production Gating) Files changed: - NEW: lib/utils/logger.ts (production-safe logging) - NEW: scripts/replace-console-logs.js (automation tool) - Modified: 15 lib/*.ts files (console.log → logger.log) - Modified: .env (DEBUG_LOGS=false) Next: Task 1.2 (Image Size Optimization)	2025-12-05 00:32:41 +01:00
mindesbunister	0cdcd973cd	fix(drift): Fix health monitor error interception - CRITICAL BUG Critical bug fix for automatic restart system: - Moved interceptWebSocketErrors() call outside retry wrapper - Now runs once after successful Drift initialization - Ensures console.error patching works correctly - Enables health monitor to detect and count errors - Restores automatic recovery from Drift SDK memory leak Bug Impact: - Health monitor was starting but never recording errors - System accumulated 800+ accountUnsubscribe errors without triggering restart - Required manual restart intervention (container unhealthy) - Projection page stuck loading due to API unresponsiveness Root Cause: - interceptWebSocketErrors() was called inside retryOperation wrapper - Retry wrapper executes 0-3 times depending on network conditions - Console.error patching failed or ran multiple times - Monitor never received error events Fix Implementation: - Added interceptWebSocketErrors() call on line 185 (after Drift init) - Removed duplicate call from inside retry wrapper - Added logging: '🔧 Setting up error interception...' and '✅ Error interception active' - Error recording now functional Testing: - Health API returns errorCount: 0, threshold: 50 - Monitor will trigger restart when 50 errors in 30 seconds - System now self-healing without manual intervention Deployment: Nov 25, 2025 Container verified: Error interception active, health monitor operational	2025-11-25 10:19:04 +01:00
mindesbunister	dc197f52a4	feat: Replace blind 2-hour reconnect with error-based health monitoring User Request: Replace blind 2-hour restart timer with smart monitoring that only restarts when accountUnsubscribe errors actually occur Changes: . Health Monitor (NEW): - Created lib/monitoring/drift-health-monitor.ts - Tracks accountUnsubscribe errors in 30-second sliding window - Triggers container restart via flag file when 50+ errors detected - Prevents unnecessary restarts when SDK healthy . Drift Client: - Removed blind scheduleReconnection() and 2-hour timer - Added interceptWebSocketErrors() to catch SDK errors - Patches console.error to monitor for accountUnsubscribe patterns - Starts health monitor after successful initialization - Removed unused reconnect() method and reconnectTimer field . Health API (NEW): - GET /api/drift/health - Check current error count and health status - Returns: healthy boolean, errorCount, threshold, message - Useful for external monitoring and debugging Impact: - System only restarts when actual memory leak detected - Prevents unnecessary downtime every 2 hours - More targeted response to SDK issues - Better operational stability Files: - lib/monitoring/drift-health-monitor.ts (NEW - 165 lines) - lib/drift/client.ts (removed timer, added error interception) - app/api/drift/health/route.ts (NEW - health check endpoint) Testing: - Health monitor starts on initialization: ✅ - API endpoint returns healthy status: ✅ - No blind reconnection scheduled: ✅	2025-11-24 16:49:10 +01:00
mindesbunister	29fce0176f	fix: Correct order filtering to prevent false '32 orders' count Problem: Bot reported '32 open orders' when Drift UI showed 0 orders Root Cause: Filter checked orderId > 0 but didn't verify baseAssetAmount Impact: Misleading logs suggesting ghost order accumulation Fix: Enhanced filter with proper empty slot detection: - Check orderId exists and is non-zero - Check baseAssetAmount exists and is non-zero (BN comparison) - Added logging to show: 'Found X orders (checked 32 total slots)' Result: Bot now correctly reports 0 orders when none exist Verification: Container restart shows no false positives Files: lib/drift/orders.ts (cancelAllOrders function)	2025-11-21 16:44:04 +01:00
mindesbunister	c37a9a37d3	fix: Implement Associated Token Account for USDC withdrawals - Fixed PublicKey undefined error (derive from DRIFT_WALLET_PRIVATE_KEY) - Implemented ATA resolution using @solana/spl-token - Added comprehensive debug logging for withdrawal flow - Fixed AccountOwnedByWrongProgram error (need ATA not wallet address) - Successfully tested .58 withdrawal with on-chain confirmation - Updated .env with TOTAL_WITHDRAWN and LAST_WITHDRAWAL_TIME tracking Key changes: - lib/drift/withdraw.ts: Added getAssociatedTokenAddress() for USDC ATA - tsconfig.json: Excluded archive folders from compilation - package.json: Added bn.js as direct dependency Transaction: 4drNfMR1xBosGCQtfJ2a4r6oEawUByrT6L7Thyqu6QQWz555hX3QshFuJqiLZreL7KrheSgTdCEqMcXP26fi54JF Wallet: 3dG7wayp7b9NBMo92D2qL2sy1curSC4TTmskFpaGDrtA USDC ATA: 8ZEMwErnwxPNNNHJigUcMfrkBG14LCREDdKbqKm49YY7	2025-11-19 20:35:32 +01:00
mindesbunister	9cd317887a	fix: Correct BN import for withdrawal system Changed from '@project-serum/anchor' to 'bn.js' to match other Drift SDK integrations. Fixes 'Cannot read properties of undefined (reading '_bn')' error. User can now test withdrawal with $5 minimum.	2025-11-19 18:39:45 +01:00
mindesbunister	1c79178aac	fix: TypeScript errors in withdrawal system - Fixed LAST_WITHDRAWAL_TIME type (null \| string) - Removed parseFloat on health.freeCollateral (already number) - Fixed getDriftClient() → getClient() method name - Build now compiles successfully Deployed: Withdrawal system now live on dashboard	2025-11-19 18:25:54 +01:00
mindesbunister	ca7b49f745	feat: Add automated profit withdrawal system - UI page: /withdrawals with stats dashboard and config form - Settings API: GET/POST for .env configuration - Stats API: Real-time profit and withdrawal calculations - Execute API: Safe withdrawal with Drift SDK integration - Drift service: withdrawFromDrift() with USDC spot market (index 0) - Safety checks: Min withdrawal amount, min account balance, profit-only - Telegram notifications: Withdrawal alerts with Solscan links - Dashboard navigation: Added Withdrawals card (3-card grid) User goal: 10% of profits automatically withdrawn on schedule Current: Manual trigger ready, scheduled automation pending Files: 5 new (withdrawals page, 3 APIs, Drift service), 2 modified	2025-11-19 18:07:07 +01:00
mindesbunister	b23dde057b	fix: Add needsVerification field to ClosePositionResult interface - Added optional needsVerification?: boolean to ClosePositionResult - Fixes TypeScript build error from commit `c607a66` - Required for position close verification logic - Allows Position Manager to keep monitoring if close not yet propagated	2025-11-16 10:28:46 +01:00
mindesbunister	c607a66239	critical: Fix position close verification to prevent ghost positions Problem: - Close transaction confirmed on-chain BUT Drift state takes 5-10s to propagate - Position Manager immediately checked position after close → still showed open - Continued monitoring with stale state → eventually ghost detected - Database marked 'SL closed' but position actually stayed open for 6+ hours - Position was UNPROTECTED during this time (no monitoring, no TP/SL backup) Root Cause: - Transaction confirmation ≠ Drift internal state updated - SDK needs time to propagate on-chain changes to internal cache - Position Manager assumed immediate state consistency Fix (2-layer verification): 1. closePosition(): After 100% close confirmation, wait 5s then verify - Query Drift to confirm position actually gone - If still exists: Return needsVerification=true flag - Log CRITICAL error with transaction signature 2. Position Manager: Handle needsVerification flag - DON'T mark position closed in database - DON'T remove from monitoring - Keep monitoring until ghost detection sees it's actually closed - Prevents premature cleanup with wrong exit data Impact: - Prevents 6-hour unmonitored position exposure - Ensures database exit data matches actual Drift closure - Ghost detection becomes safety net, not primary close mechanism - User positions always protected until VERIFIED closed Files: - lib/drift/orders.ts: Added 5s wait + position verification after close - lib/trading/position-manager.ts: Check needsVerification flag before cleanup Incident: Nov 16, 02:51 - Close confirmed but position stayed open until 08:51	2025-11-16 10:00:10 +01:00
mindesbunister	f505db4ac8	fix: Reduce Drift SDK auto-reconnect interval from 4h to 2h Problem: Bot froze after only 1 hour of runtime with API timeouts, despite having 4-hour auto-reconnect protection for Drift SDK memory leak. Investigation showed: - Singleton pattern working correctly (reusing same instance) - Hundreds of accountUnsubscribe errors (WebSocket leak) - Container froze at ~1 hour, not 4 hours Root Cause: Drift SDK's memory leak is MORE SEVERE than expected. Even with single instance, subscriptions accumulate faster than anticipated. 4-hour interval too long - system hits memory/connection limits before cleanup. Solution: Reduce auto-reconnect interval to 2 hours (more aggressive). This ensures cleanup happens before critical thresholds reached. Code change (lib/drift/client.ts): - reconnectIntervalMs: 4 hours → 2 hours - Updated log messages to reflect new interval Impact: System now self-heals every 2 hours instead of 4, preventing the freeze that occurred tonight at 1-hour mark. Related: Common Pitfall #1 (Drift SDK memory leak)	2025-11-16 02:15:01 +01:00
mindesbunister	54c68b45d2	fix: Add retry logic to closePosition() for rate limit protection CRITICAL FIX: Rate limit storm causing infinite close attempts Root Cause Analysis (Trade cmi0il8l30000r607l8aec701): - Position Manager tried to close position (SL or TP trigger) - closePosition() in orders.ts had NO retry wrapper - Failed with 429 error, returned to Position Manager - Position Manager caught 429, kept monitoring - EVERY 2 SECONDS: Attempted close again → 429 → retry - Result: 100+ close attempts in logs, exhausted Helius rate limit - Meanwhile: On-chain TP2 limit order filled (not affected by SDK limits) - External closure detected, updated DB 8 TIMES ($0.14 → $0.51 compounding bug) Why This Happened: - placeExitOrders() has retryWithBackoff() wrapper (Nov 14 fix) - openPosition() has NO retry wrapper (but less critical - only runs once) - closePosition() had NO retry wrapper (CRITICAL - runs in monitoring loop) - When closePosition() failed, Position Manager retried EVERY monitoring cycle The Fix: - Wrapped closePosition() placePerpOrder() call with retryWithBackoff() - 8s base delay, 3 max retries (8s → 16s → 32s progression) - Same pattern as placeExitOrders() for consistency - Position Manager executeExit() already handles 429 by returning early - Now: 3 SDK retries (24s) + Position Manager monitoring retry = robust Impact: - Prevents rate limit exhaustion from infinite close attempts - Reduces RPC load by 30-50x during close operations - Protects against external closure duplicate update bug - User saw: $0.51 profit (8 DB updates) vs actual $0.14 (1 fill) Files: lib/drift/orders.ts (line ~567: wrapped placePerpOrder in retryWithBackoff) Verification: Container restarted 18:05 CET, code deployed	2025-11-15 18:06:12 +01:00
mindesbunister	8717f72a54	fix: Add retry logic to exit order placement (TP/SL) CRITICAL FIX: Exit orders failed without retry on 429 rate limits Root Cause: - placeExitOrders() placed TP1/TP2/SL orders directly without retry wrapper - cancelAllOrders() HAD retry logic (8s → 16s → 32s progression) - Rate limit errors during exit order placement = unprotected positions - If container crashes after opening, no TP/SL orders on-chain Fix Applied: - Wrapped ALL order placements in retryWithBackoff(): * TP1 limit order (line ~310) * TP2 limit order (line ~334) * Soft stop trigger-limit (dual stop system) * Hard stop trigger-market (dual stop system) * Single stop trigger-limit * Single stop trigger-market (default) Retry Behavior: - Base delay: 8 seconds (was 5s, increased Nov 14) - Progression: 8s → 16s → 32s (max 3 retries) - Logs rate_limit_recovered to database on success - Logs rate_limit_exhausted on max retries exceeded Impact: - Exit orders now retry up to 3x on 429 errors (56 seconds total wait) - Positions protected even during RPC rate limit spikes - Reduces need for immediate Helius upgrade - Database analytics track retry success/failure Files: lib/drift/orders.ts (6 placePerpOrder calls wrapped) Note: cancelAllOrders() already had retry logic - this completes coverage	2025-11-15 17:34:01 +01:00
mindesbunister	fa4b187f46	feat: Hybrid RPC strategy - Helius for init, Alchemy for trades CRITICAL FIX: Rate limiting causing unprotected positions Root Cause: - Rate limit errors preventing exit order placement after opening positions - Positions opened with NO on-chain TP/SL protection - If container crashes, position has unlimited risk exposure Hybrid RPC Solution: - Helius RPC: Drift SDK initialization (handles burst subscriptions perfectly) - Alchemy RPC: Trade operations - open, close, confirmations (better sustained rate limits) - Graceful fallback: If Alchemy not configured, uses Helius for everything Implementation: - DriftService: Dual connections (connection + tradeConnection) - getTradeConnection() returns Alchemy if configured, else Helius - openPosition() and closePosition() use tradeConnection for confirmTransaction() - Added ALCHEMY_RPC_URL to .env (optional) Configuration: - SOLANA_RPC_URL: Helius (existing) - ALCHEMY_RPC_URL: Added with your Alchemy key Files: - lib/drift/client.ts: Dual connection support + getTradeConnection() - lib/drift/orders.ts: Use getTradeConnection() for all confirmations - .env: Added ALCHEMY_RPC_URL Logs show: '🔀 Hybrid RPC mode: Helius for init, Alchemy for trades' Next: Test with new trade to verify orders place successfully	2025-11-15 12:15:23 +01:00
mindesbunister	0ef6b82106	feat: Hybrid RPC strategy (Helius init + Alchemy trades) CRITICAL: Fix rate limiting by using dual RPC approach Problem: - Helius RPC gets overwhelmed during trade execution (429 errors) - Exit orders fail to place, leaving positions UNPROTECTED - No on-chain TP/SL orders = unlimited risk if container crashes Solution: Hybrid RPC Strategy - Helius for Drift SDK initialization (handles burst subscriptions well) - Alchemy for trade operations (better sustained rate limits) - Falls back to Helius if Alchemy not configured Implementation: - DriftService now has two connections: connection (Helius) + tradeConnection (Alchemy) - Added getTradeConnection() method for trade operations - Updated openPosition() and closePosition() to use trade connection - Added ALCHEMY_RPC_URL to .env (optional, falls back to Helius) Benefits: - Helius: 0 subscription errors during init (proven reliable for SDK setup) - Alchemy: 300M compute units/month for sustained trade operations - Best of both worlds: reliable init + reliable trades Files: - lib/drift/client.ts: Dual connection support - lib/drift/orders.ts: Use getTradeConnection() for confirmations - .env: Added ALCHEMY_RPC_URL Testing: Deploy and execute test trade to verify orders place successfully	2025-11-15 12:00:57 +01:00
mindesbunister	fb4beee418	fix: Add periodic Drift reconnection to prevent memory leaks - Memory leak identified: Drift SDK accumulates WebSocket subscriptions over time - Root cause: accountUnsubscribe errors pile up when connections close/reconnect - Symptom: Heap grows to 4GB+ after 10+ hours, eventual OOM crash - Solution: Automatic reconnection every 4 hours to clear subscriptions Changes: - lib/drift/client.ts: Add reconnectTimer and scheduleReconnection() - lib/drift/client.ts: Implement private reconnect() method - lib/drift/client.ts: Clear timer in disconnect() - app/api/drift/reconnect/route.ts: Manual reconnection endpoint (POST) - app/api/drift/reconnect/route.ts: Reconnection status endpoint (GET) Impact: - Prevents JavaScript heap out of memory crashes - Telegram bot timeouts resolved (was failing due to unresponsive bot) - System will auto-heal every 4 hours instead of requiring manual restart - Emergency manual reconnect available via API if needed Tested: Container restarted successfully, no more WebSocket accumulation expected	2025-11-15 09:22:15 +01:00
mindesbunister	19beaf9c02	fix: Revert to Helius - Alchemy 'breakthrough' was not sustainable FINAL CONCLUSION after extensive testing: - Alchemy appeared to work perfectly at 14:25 CET (first trade) - User quote: 'SO IT WAS THE FUCKING RPC THAT WAS CAUSING ALL THE ISSUES!!!!!!!!!!!!' - BUT: Alchemy consistently fails after that initial success - Multiple attempts to use Alchemy (pure config, no fallback) = same result - Symptoms: timeouts, positions open WITHOUT TP/SL orders, no Position Manager tracking HELIUS = ONLY RELIABLE OPTION: - User confirmed: 'telegram works again' after reverting to Helius - Works consistently across multiple tests - Supports WebSocket subscriptions (accountSubscribe) that Drift SDK requires - Rate limits manageable with 5s exponential backoff ALCHEMY INCOMPATIBILITY CONFIRMED: - Does NOT support WebSocket subscriptions (accountSubscribe method) - SDK appears to initialize but is fundamentally broken - First trade might work, then SDK gets into bad state - Cannot be used reliably for Drift Protocol trading Files restored from working Helius state. This is the definitive answer: Helius only, no alternatives work.	2025-11-14 21:07:58 +01:00
mindesbunister	78ab9e1a94	fix: Increase transaction confirmation timeout to 60s for Alchemy Growth - Alchemy Growth (10,000 CU/s) can handle longer confirmation waits - Increased timeout from 30s to 60s in both openPosition() and closePosition() - Added debug logging to execute endpoint to trace hang points - Configured dual RPC: Alchemy primary (transactions), Helius fallback (subscriptions) - Previous 30s timeout was causing premature failures during Solana congestion - This should resolve 'Transaction was not confirmed in 30.00 seconds' errors Related: User reported n8n webhook returning 500 with timeout error	2025-11-14 20:42:59 +01:00
mindesbunister	6dccea5d91	revert: Back to last known working state (`27eb5d4`) - Restored Drift client, orders, and .env from commit `27eb5d4` - Updated to current Helius API key - ISSUE: Execute/check-risk endpoints still hang - Root cause appears to be Drift SDK initialization hanging at runtime - Bot initializes successfully at startup but hangs on subsequent Drift calls - Non-Drift endpoints work fine (settings, positions query) - Needs investigation: Drift SDK behavior or RPC interaction issue	2025-11-14 20:17:50 +01:00
mindesbunister	db0961d04e	revert: Remove Alchemy fallback causing crashes - getFallbackConnection() code was causing execute endpoint to crash - Reverting to Helius-only configuration - Need to investigate root cause before re-adding fallback	2025-11-14 20:10:21 +01:00
mindesbunister	6445a135a8	feat: Helius primary + Alchemy fallback for trade execution - Helius HTTPS: Primary RPC for Drift SDK initialization and subscriptions - Alchemy HTTPS (10K CU/s): Fallback RPC for transaction confirmations - Added getFallbackConnection() method to DriftService - openPosition() and closePosition() now use Alchemy for tx confirmations - accountSubscribe errors are non-fatal warnings (SDK falls back gracefully) - System fully operational: Drift initialized, Position Manager ready - Trade execution will use high-throughput Alchemy for confirmations	2025-11-14 16:51:14 +01:00
mindesbunister	1cf5c9aba1	feat: Smart startup RPC strategy (Helius → Alchemy) Strategy: 1. Start with Helius (handles startup burst better - 10 req/sec sustained) 2. After successful init, switch to Alchemy (more stable for trading) 3. On 429 errors during operations, fall back to Helius, then return to Alchemy Implementation: - lib/drift/client.ts: Smart constructor checks for fallback, uses it for startup - After initialize() completes, automatically switches to primary RPC - Swaps connections and reinitializes Drift SDK with Alchemy - Falls back to Helius on rate limits, switches back after recovery Benefits: - Helius absorbs SDK subscribe() burst (many concurrent calls) - Alchemy provides stability for normal trading operations - Best of both worlds: burst tolerance + operational stability Status: - Code complete and tested - Helius API key needs updating (current key returns 401) - Fallback temporarily disabled in .env until key fixed - Position Manager working perfectly (trade monitored via Alchemy) To enable: 1. Get fresh Helius API key from helius.dev 2. Set SOLANA_FALLBACK_RPC_URL in .env 3. Restart bot - will use Helius for startup automatically	2025-11-14 15:41:52 +01:00
mindesbunister	7ff78ee0bd	feat: Hybrid RPC fallback system (Alchemy → Helius) - Automatic fallback after 2 consecutive rate limits - Primary: Alchemy (300M CU/month, stable for normal ops) - Fallback: Helius (10 req/sec, backup for startup bursts) - Reduced startup validation: 6h window, 5 trades (was 24h, 20 trades) - Multi-position safety check (prevents order cancellation conflicts) - Rate limit-aware retry logic with exponential backoff Implementation: - lib/drift/client.ts: Added fallbackConnection, switchToFallbackRpc() - .env: SOLANA_FALLBACK_RPC_URL configuration - lib/startup/init-position-manager.ts: Reduced validation scope - lib/trading/position-manager.ts: Multi-position order protection Tested: System switched to fallback on startup, Position Manager active Result: 1 active trade being monitored after automatic RPC switch	2025-11-14 15:28:07 +01:00
mindesbunister	7afd7d5aa1	feat: switch from Helius to Alchemy RPC provider Changes: - Updated SOLANA_RPC_URL to use Alchemy (https://solana-mainnet.g.alchemy.com/v2/...) - Migrated from Helius free tier to Alchemy free tier - Includes previous rate limit fixes (8s backoff, 2s operation delays) Context: - Helius free tier: 10 req/sec sustained, 100 req/sec burst - Alchemy free tier: 300M compute units/month (more generous) - User hit 239 rate limit errors in 10 minutes on Helius - User registered Alchemy account and provided API key Impact: - Should significantly reduce 429 rate limit errors - Better free tier limits for trading bot operations - Combined with delay fixes for optimal RPC usage	2025-11-14 14:01:52 +01:00
mindesbunister	27eb5d4fe8	fix: Critical rate limit handling + startup position restoration Problem 1: Rate Limit Cascade - Position Manager tried to close repeatedly, overwhelming Helius RPC (10 req/s limit) - Base retry delay was too aggressive (2s → 4s → 8s) - No graceful handling when 429 errors occur Problem 2: Orphaned Positions After Restart - Container restarts lost Position Manager state - Positions marked 'closed' in DB but still open on Drift (failed close transactions) - No cross-validation between database and actual Drift positions Solutions Implemented: 1. Increased retry delays (orders.ts): - Base delay: 2s → 5s (progression now 5s → 10s → 20s) - Reduces RPC pressure during rate limit situations - Gives Helius time to recover between retries - Documented Helius limits: 100 req/s burst, 10 req/s sustained (free tier) 2. Startup position validation (init-position-manager.ts): - Cross-checks last 24h of 'closed' trades against actual Drift positions - If DB says closed but Drift shows open → reopens in DB to restore tracking - Prevents unmonitored positions from existing after container restarts - Logs detailed mismatch info for debugging 3. Rate limit-aware exit handling (position-manager.ts): - Detects 429 errors during position close - Keeps trade in monitoring instead of removing it - Natural retry on next price update (vs aggressive 2s loop) - Prevents marking position as closed when transaction actually failed Impact: - Eliminates orphaned positions after restarts - Reduces RPC pressure by 2.5x (5s vs 2s base delay) - Graceful degradation under rate limits - Position Manager continues monitoring even during temporary RPC issues Testing needed: - Monitor next container restart to verify position restoration works - Check rate limit analytics after next close attempt - Verify no more phantom 'closed' positions when Drift shows open	2025-11-14 09:50:13 +01:00
mindesbunister	6590f4fb1e	feat: phantom trade auto-closure system - Auto-close phantom positions immediately via market order - Return HTTP 200 (not 500) to allow n8n workflow continuation - Save phantom trades to database with full P&L tracking - Exit reason: 'manual' category for phantom auto-closes - Protects user during unavailable hours (sleeping, no phone) - Add Docker build best practices to instructions (background + tail) - Document phantom system as Critical Component #1 - Add Common Pitfall #30: Phantom notification workflow Why auto-close: - User can't always respond to phantom alerts - Unmonitored position = unlimited risk exposure - Better to exit with small loss/gain than leave exposed - Re-entry possible if setup actually good Files changed: - app/api/trading/execute/route.ts: Auto-close logic - .github/copilot-instructions.md: Documentation + build pattern	2025-11-14 05:37:51 +01:00
mindesbunister	5e826dee5d	Add DNS retry logic to Drift initialization - Handles transient network failures (EAI_AGAIN, ENOTFOUND, ETIMEDOUT) - Automatically retries up to 3 times with 2s delay between attempts - Logs retry attempts for monitoring - Prevents 500 errors from temporary DNS hiccups - Fixes: n8n workflow failures during brief network issues Impact: - Improves reliability during DNS/network instability - Reduces false negatives (missed trades due to transient errors) - User-friendly retry logs for diagnostics	2025-11-13 16:05:42 +01:00
mindesbunister	bd9633fbc2	CRITICAL FIX: Prevent unprotected positions via database-first pattern Root Cause: - Execute endpoint saved to database AFTER adding to Position Manager - Database save failures were silently caught and ignored - API returned success even when DB save failed - Container restarts lost in-memory Position Manager state - Result: Unprotected positions with no TP/SL monitoring Fixes Applied: 1. Database-First Pattern (app/api/trading/execute/route.ts): - MOVED createTrade() BEFORE positionManager.addTrade() - If database save fails, return HTTP 500 with critical error - Error message: 'CLOSE POSITION MANUALLY IMMEDIATELY' - Position Manager only tracks database-persisted trades - Ensures container restarts can restore all positions 2. Transaction Timeout (lib/drift/orders.ts): - Added 30s timeout to confirmTransaction() in closePosition() - Prevents API from hanging during network congestion - Uses Promise.race() pattern for timeout enforcement 3. Telegram Error Messages (telegram_command_bot.py): - Parse JSON for ALL responses (not just 200 OK) - Extract detailed error messages from 'message' field - Shows critical warnings to user immediately - Fail-open: proceeds if analytics check fails 4. Position Manager (lib/trading/position-manager.ts): - Move lastPrice update to TOP of monitoring loop - Ensures /status endpoint always shows current price Verification: - Test trade cmhxj8qxl0000od076m21l58z executed successfully - Database save completed BEFORE Position Manager tracking - SL triggered correctly at -$4.21 after 15 minutes - All protection systems working as expected Impact: - Eliminates risk of unprotected positions - Provides immediate critical warnings if DB fails - Enables safe container restarts with full position recovery - Verified with live test trade on production See: CRITICAL_INCIDENT_UNPROTECTED_POSITION.md for full incident report	2025-11-13 15:56:28 +01:00
mindesbunister	03e91fc18d	feat: ATR-based trailing stop + rate limit monitoring MAJOR FIXES: - ATR-based trailing stop for runners (was fixed 0.3%, now adapts to volatility) - Fixes runners with +7-9% MFE exiting for losses - Typical improvement: 2.24x more room (0.3% → 0.67% at 0.45% ATR) - Enhanced rate limit logging with database tracking - New /api/analytics/rate-limits endpoint for monitoring DETAILS: - Position Manager: Calculate trailing as (atrAtEntry / price × 100) × multiplier - Config: TRAILING_STOP_ATR_MULTIPLIER=1.5, MIN=0.25%, MAX=0.9% - Settings UI: Added ATR multiplier controls - Rate limits: Log hits/recoveries/exhaustions to SystemEvent table - Documentation: ATR_TRAILING_STOP_FIX.md + RATE_LIMIT_MONITORING.md IMPACT: - Runners can now capture big moves (like morning's $172→$162 SOL drop) - Rate limit visibility prevents silent failures - Data-driven optimization for RPC endpoint health	2025-11-11 14:51:41 +01:00
mindesbunister	c3a053df63	CRITICAL FIX: Use ?? instead of \|\| for tp2SizePercent to allow 0 value BUG FOUND: Line 558: tp2SizePercent: config.takeProfit2SizePercent \|\| 100 When config.takeProfit2SizePercent = 0 (TP2-as-runner system), JavaScript's \|\| operator treats 0 as falsy and falls back to 100, causing TP2 to close 100% of remaining position instead of activating trailing stop. IMPACT: - On-chain orders placed correctly (line 481 uses ?? correctly) - Position Manager reads from DB and expects TP2 to close position - Result: User sees TWO take-profit orders instead of runner system FIX: Changed both tp1SizePercent and tp2SizePercent to use ?? operator: - tp1SizePercent: config.takeProfit1SizePercent ?? 75 - tp2SizePercent: config.takeProfit2SizePercent ?? 0 This allows 0 value to be saved correctly for TP2-as-runner system. VERIFICATION NEEDED: Current open SHORT position in database has tp2SizePercent=100 from before this fix. Next trade will use correct runner system.	2025-11-10 19:46:03 +01:00
mindesbunister	988fdb9ea4	Fix runner system + strengthen anti-chop filter Three critical bugs fixed: 1. P&L calculation (65x inflation) - now uses collateralUSD not notional 2. handlePostTp1Adjustments() - checks tp2SizePercent===0 for runner mode 3. JavaScript \|\| operator bug - changed to ?? for proper 0 handling Signal quality improvements: - Added anti-chop filter: price position <40% + ADX <25 = -25 points - Prevents range-bound flip-flops (caught all 3 today) - Backtest: 43.8% → 55.6% win rate, +86% profit per trade Changes: - lib/trading/signal-quality.ts: RANGE-BOUND CHOP penalty - lib/drift/orders.ts: Fixed P&L calculation + transaction confirmation - lib/trading/position-manager.ts: Runner system logic - app/api/trading/execute/route.ts: \|\| to ?? for tp2SizePercent - app/api/trading/test/route.ts: \|\| to ?? for tp1/tp2SizePercent - prisma/schema.prisma: Added collateralUSD field - scripts/fix_pnl_calculations.sql: Historical P&L correction	2025-11-10 15:36:51 +01:00
mindesbunister	22195ed34c	Fix P&L calculation and signal flip detection - Fix external closure P&L using tp1Hit flag instead of currentSize - Add direction change detection to prevent false TP1 on signal flips - Signal flips now recorded with accurate P&L as 'manual' exits - Add retry logic with exponential backoff for Solana RPC rate limits - Create /api/trading/cancel-orders endpoint for manual cleanup - Improves data integrity for win/loss statistics	2025-11-09 17:59:50 +01:00
mindesbunister	9b767342dc	feat: Implement re-entry analytics system with fresh TradingView data - Add market data cache service (5min expiry) for storing TradingView metrics - Create /api/trading/market-data webhook endpoint for continuous data updates - Add /api/analytics/reentry-check endpoint for validating manual trades - Update execute endpoint to auto-cache metrics from incoming signals - Enhance Telegram bot with pre-execution analytics validation - Support --force flag to override analytics blocks - Use fresh ADX/ATR/RSI data when available, fallback to historical - Apply performance modifiers: -20 for losing streaks, +10 for winning - Minimum re-entry score 55 (vs 60 for new signals) - Fail-open design: proceeds if analytics unavailable - Show data freshness and source in Telegram responses - Add comprehensive setup guide in docs/guides/REENTRY_ANALYTICS_QUICKSTART.md Phase 1 implementation for smart manual trade validation.	2025-11-07 20:40:07 +01:00
mindesbunister	36ba3809a1	Fix runner system by checking minimum position size viability PROBLEM: Runner never activated because Drift force-closes positions below minimum size. TP2 would close 80% leaving 5% runner (~$105), but Drift automatically closed the entire position. SOLUTION: 1. Created runner-calculator.ts with canUseRunner() to check if remaining size would be above Drift minimums BEFORE executing TP2 close 2. If runner not viable: Skip TP2 close entirely, activate trailing stop on full 25% remaining (from TP1) 3. If runner viable: Execute TP2 as normal, activate trailing on 5% Benefits: - Runner system will now actually work for viable position sizes - Positions that are too small won't try to force-close below minimums - Better logs showing why runner did/didn't activate - Trailing stop works on larger % if runner not viable (better R:R) Example: $2100 position → $525 after TP1 → $105 runner = VIABLE $4 ETH position → $1 after TP1 → $0.20 runner = NOT VIABLE Runner will trail with ATR-based dynamic % (0.25-0.9%) below peak price.	2025-11-07 15:10:01 +01:00
mindesbunister	4996bc2aad	Fix SHORT position P&L calculation bug CRITICAL BUG FIX: SHORT positions were calculating P&L with inverted logic, causing profits to be recorded as losses and vice versa. Problem Example: - SHORT at $156.58, exit at $154.66 (price dropped $1.92) - Should be +~$25 profit - Was recorded as -$499.23 LOSS Root Cause: Old formula: profitPercent = (exit - entry) / entry * (side === 'long' ? 1 : -1) This multiplied the LONG formula by -1 for shorts, but then applied it to full notional instead of properly accounting for direction. Fix: - LONG: priceDiff = (exit - entry) → profit when price rises - SHORT: priceDiff = (entry - exit) → profit when price falls - profitPercent = priceDiff / entry * 100 - Proper leverage calculation: realizedPnL = collateral * profitPercent * leverage This fixes both dry-run and live close position calculations in lib/drift/orders.ts Impact: All SHORT trades since bot launch have incorrect P&L in database. Future trades will calculate correctly.	2025-11-07 14:53:03 +01:00
mindesbunister	5241920d44	Prevent repeated TP2 cleanup loops	2025-11-05 16:14:17 +01:00
mindesbunister	a100945864	Enhance trailing stop with ATR-based sizing	2025-11-05 15:28:12 +01:00
mindesbunister	cbb6592153	fix: correct PnL math and add health probe	2025-11-05 07:58:27 +01:00
mindesbunister	8bc08955cc	feat: Add phantom trade detection and database tracking - Detect position size mismatches (>50% variance) after opening - Save phantom trades to database with expectedSizeUSD, actualSizeUSD, phantomReason - Return error from execute endpoint to prevent Position Manager tracking - Add comprehensive documentation of phantom trade issue and solution - Enable data collection for pattern analysis and future optimization Fixes oracle price lag issue during volatile markets where transactions confirm but positions don't actually open at expected size.	2025-11-04 10:34:38 +01:00
mindesbunister	cfc15cd3b0	fix: Prevent runner positions from being below minimum order size Problem: When closing small runner positions (5% after TP1+TP2), the calculated size could be below Drift's minimum order size: - ETH minimum: 0.01 ETH - After TP1 (75%): 0.0025 ETH left - After TP2 (80%): 0.0005 ETH runner - Trailing stop tries to close 0.0005 ETH → ERROR: Below minimum 0.01 n8n showed: "Order size 0.0011 is below minimum 0.01" Root Cause: closePosition() calculated: sizeToClose = position.size * (percentToClose / 100) No validation against marketConfig.minOrderSize before submitting to Drift. Solution: Added minimum size check in closePosition() (lib/drift/orders.ts): 1. Calculate intended close size 2. If below minOrderSize → force 100% close instead 3. Log warning when this happens 4. Prevents Drift API rejection Code Change: ```typescript let sizeToClose = position.size * (params.percentToClose / 100) // If calculated size is below minimum, close 100% if (sizeToClose < marketConfig.minOrderSize) { console.log('⚠️ Calculated size below minimum - forcing 100% close') sizeToClose = position.size } ``` Impact: - ✅ Small runner positions close successfully - ✅ No more "below minimum" errors from Drift - Trades complete cleanly - ⚠️ Runner may close slightly earlier than intended (but better than error) Example: ETH runner at 0.0005 ETH → tries to close → detects <0.01 → closes entire 0.0005 ETH position at once instead of rejecting. This is the correct behavior - if the position is already too small, we should close it entirely.	2025-11-03 15:59:31 +01:00
mindesbunister	9572b54775	fix(drift): calculate realizedPnL with leverage on USD notional, not base asset - Old calculation: (closePrice - entryPrice) * sizeInBaseAsset = tiny P&L in dollars - New calculation: profitPercent * leverage * notionalUSD / 100 = correct leveraged P&L - Example: -0.13% price move * 10x leverage * $540 notional = -$7.02 (not -$0.38) - Fixes trades showing -$0.10 to -$0.77 losses when they should be -$5 to -$40 - Applied to both DRY_RUN and real execution paths	2025-11-02 20:45:57 +01:00
mindesbunister	056440bf8f	feat: add quality score display and timezone fixes - Add qualityScore to ExecuteTradeResponse interface and response object - Update analytics page to always show Signal Quality card (N/A if unavailable) - Fix n8n workflow to pass context metrics and qualityScore to execute endpoint - Fix timezone in Telegram notifications (Europe/Berlin) - Fix symbol normalization in /api/trading/close endpoint - Update Drift ETH-PERP minimum order size (0.002 ETH not 0.01) - Add transaction confirmation to closePosition() to prevent phantom closes - Add 30-second grace period for new trades in Position Manager - Fix execution order: database save before Position Manager.addTrade() - Update copilot instructions with transaction confirmation pattern	2025-11-01 17:00:37 +01:00
mindesbunister	c82da51bdc	CRITICAL FIX: Add transaction confirmation to detect failed orders - Added getConnection() method to DriftService - Added proper transaction confirmation in openPosition() - Check confirmation.value.err to detect on-chain failures - Return error if transaction fails instead of assuming success - Prevents phantom trades that never actually execute This fixes the issue where bot was recording trades with transaction signatures that don't exist on-chain (like 2gqrPxnvGzdRp56...).	2025-11-01 02:26:47 +01:00

1 2

61 Commits