Root Cause:
- Execute endpoint saved to database AFTER adding to Position Manager
- Database save failures were silently caught and ignored
- API returned success even when DB save failed
- Container restarts lost in-memory Position Manager state
- Result: Unprotected positions with no TP/SL monitoring
Fixes Applied:
1. Database-First Pattern (app/api/trading/execute/route.ts):
- MOVED createTrade() BEFORE positionManager.addTrade()
- If database save fails, return HTTP 500 with critical error
- Error message: 'CLOSE POSITION MANUALLY IMMEDIATELY'
- Position Manager only tracks database-persisted trades
- Ensures container restarts can restore all positions
2. Transaction Timeout (lib/drift/orders.ts):
- Added 30s timeout to confirmTransaction() in closePosition()
- Prevents API from hanging during network congestion
- Uses Promise.race() pattern for timeout enforcement
3. Telegram Error Messages (telegram_command_bot.py):
- Parse JSON for ALL responses (not just 200 OK)
- Extract detailed error messages from 'message' field
- Shows critical warnings to user immediately
- Fail-open: proceeds if analytics check fails
4. Position Manager (lib/trading/position-manager.ts):
- Move lastPrice update to TOP of monitoring loop
- Ensures /status endpoint always shows current price
Verification:
- Test trade cmhxj8qxl0000od076m21l58z executed successfully
- Database save completed BEFORE Position Manager tracking
- SL triggered correctly at -$4.21 after 15 minutes
- All protection systems working as expected
Impact:
- Eliminates risk of unprotected positions
- Provides immediate critical warnings if DB fails
- Enables safe container restarts with full position recovery
- Verified with live test trade on production
See: CRITICAL_INCIDENT_UNPROTECTED_POSITION.md for full incident report