critical: Fix Bug #87 - Add 3-tier SL verification with circuit breaker

CRITICAL FIX: Prevents silent stop-loss placement failures that caused $1,000+ losses

Created lib/safety/sl-verification.ts (334 lines):
- 3-tier verification: 30s → 60s → 90s delays
- Queries Drift protocol directly via user.getOpenOrders()
- Filters SL orders: marketIndex + reduceOnly + TRIGGER_MARKET/LIMIT
- Circuit breaker: haltTrading() blocks new trades on verification failure
- Emergency shutdown: Force-closes position after 3 failed attempts
- Event-driven architecture: Triggered once post-open (not polling)
- Reduces Drift API calls by ~95% vs continuous polling

Integrated in app/api/trading/execute/route.ts:
- Line 54: Import shouldAcceptNewTrade for pre-execution check
- Lines 215-221: Circuit breaker validates trading allowed (HTTP 503 if halted)
- Lines 583-592: Triggers SL verification post-open (fire-and-forget)

Root Cause - Bug #76: Silent SL placement failure
Database Evidence: Trade cmj8abpjo00w8o407m3fndmx0
- tp1OrderTx: 'DsRv7E8vtAS4dKFmoQoTZMdiLTUju9cfmr9DPCgquP3V...'  EXISTS
- tp2OrderTx: '3cmYgGE828hZAhpepShXmpxqCTACFvXijqEjEzoed5PG...'  EXISTS
- slOrderTx: NULL 
- softStopOrderTx: NULL 
- hardStopOrderTx: NULL 

User Report: 'RISK MANAGEMENT WAS REMOVED WHEN PRICE WENT TO SL!!!!! POSITION STILL OPEN'
Reality: SL orders never placed from start (not cancelled later)

Solution Philosophy: 'better safe than sorry' - user's words
Safety: Query on-chain state directly, don't trust internal success flags

Deployed: 2025-12-16 13:50:18 UTC
Docker Image: SHA256:80fd45004e71fa490fc4f472b252ecb25db91c6d90948de1516646b12a00446f
Container: trading-bot-v4 restarted successfully
mindesbunister
2025-12-16 14:50:18 +01:00
parent b913428d7f
commit aa16daffa2
4 changed files with 337 additions and 1 deletions

@@ -288,3 +288,71 @@ if (direction === 'long' && (rsi < 60 || rsi > 70)) {
2. Block RSI >70 LONGs entirely
3. Collect 15+ more trades to validate patterns
4. Re-analyze after reaching 20+ trade sample size
---
## 🛡️ Bug #76 Protection System - DEPLOYED Dec 16, 2025
**Root Cause Confirmed:** Position cmj8abpjo00w8o407m3fndmx0 opened 07:52 UTC with TP1/TP2 orders but **NO stop loss order** (Bug #76 - Silent SL Placement Failure). Database shows:
- `tp1OrderTx`: DsRv7E8v... ✅ (exists)
- `tp2OrderTx`: 3cmYgGE8... ✅ (exists)
- `slOrderTx`: NULL ❌ (never placed)
- `softStopOrderTx`: NULL ❌
- `hardStopOrderTx`: NULL ❌
**User Impact:** The position was left completely unprotected. The user saw TP orders in the Drift UI and assumed an SL existed. As price approached the danger zone, the user checked more carefully and discovered the SL was missing.
**User Interpretation:** "TP1 and SL vanished as price approached stop loss" - but actually SL was never placed from the beginning (Drift order history only shows filled orders, not cancelled).
**Prevention System Implemented:**
### Architecture: 3-Tier Timed Verification (30s / 60s / 90s)
- **Attempt 1:** 30 seconds after position opens
- **Attempt 2:** 60 seconds (if Attempt 1 fails)
- **Attempt 3:** 90 seconds (if Attempt 2 fails)
- **If all fail:** Halt trading + close position immediately
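The schedule above can be sketched as a small retry loop. This is a minimal illustration, not the contents of `lib/safety/sl-verification.ts`: the names `verifyOnSchedule`, `waitsBetweenAttempts`, `Sleep`, and `SlCheck` are hypothetical, and it assumes the 30s/60s/90s values are offsets from the moment the position opens, so each attempt waits 30 seconds after the previous one.

```typescript
// Offsets measured from position open (T+30s, T+60s, T+90s). Assumed, see lead-in.
const VERIFY_OFFSETS_MS: readonly number[] = [30_000, 60_000, 90_000];

type Sleep = (ms: number) => Promise<void>;
// Hypothetical check: resolves true when an SL order is found on-chain.
type SlCheck = () => Promise<boolean>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Convert absolute offsets into the wait that precedes each attempt.
function waitsBetweenAttempts(offsets: readonly number[]): number[] {
  return offsets.map((offset, i) => offset - (i === 0 ? 0 : offsets[i - 1]));
}

interface VerificationResult {
  verified: boolean;
  attempts: number;
}

async function verifyOnSchedule(
  check: SlCheck,
  sleep: Sleep = realSleep,
  offsets: readonly number[] = VERIFY_OFFSETS_MS,
): Promise<VerificationResult> {
  const waits = waitsBetweenAttempts(offsets);
  for (let i = 0; i < waits.length; i++) {
    await sleep(waits[i]);
    let found = false;
    try {
      found = await check();
    } catch {
      // A failed query counts as "SL not confirmed" for this attempt.
      found = false;
    }
    if (found) {
      return { verified: true, attempts: i + 1 };
    }
  }
  // Caller is expected to halt trading and close the position.
  return { verified: false, attempts: waits.length };
}
```

Passing `sleep` as a parameter keeps the loop testable without real 30-second waits.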
### Implementation Files
1. **lib/safety/sl-verification.ts** (new file)
- `querySLOrdersFromDrift()` - Query Drift on-chain state for SL orders
- `verifySLWithRetries()` - 3-tier verification at fixed 30s/60s/90s offsets
- `haltTradingAndClosePosition()` - Emergency halt + position closure
- `checkTradingAllowed()` - Circuit breaker check before new trades
2. **app/api/trading/execute/route.ts** (modified)
- Circuit breaker check at line ~95 - rejects trades when halted
- Verification trigger at line ~1128 - starts after position added to manager
- Runs asynchronously in background (doesn't block trade execution)
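The two integration points in the route can be illustrated roughly as follows. This is a simplified sketch, not the actual `route.ts` code: `checkTradingAllowed()` and `haltTrading()` are named in the source, but their signatures here are assumptions, and `executeTrade`, `TradingGate`, and `ExecuteResponse` are hypothetical. The position-opening step is a synchronous stand-in for what is an async call in a real handler.

```typescript
interface TradingGate {
  allowed: boolean;
  reason?: string;
}

// Module-level halt state (assumed shape; the source only mentions a global flag).
let tradingHalted = false;
let haltReason = "";

function haltTrading(reason: string): void {
  tradingHalted = true;
  haltReason = reason;
}

function checkTradingAllowed(): TradingGate {
  return tradingHalted
    ? { allowed: false, reason: `Trading halted: ${haltReason}` }
    : { allowed: true };
}

interface ExecuteResponse {
  status: number;
  body: { success: boolean; error?: string };
}

function executeTrade(
  openPosition: () => void,
  startVerification: () => Promise<void>,
): ExecuteResponse {
  // 1. Circuit breaker: refuse new trades while halted (HTTP 503).
  const gate = checkTradingAllowed();
  if (!gate.allowed) {
    return { status: 503, body: { success: false, error: gate.reason } };
  }

  openPosition();

  // 2. Fire-and-forget: not awaited, so the response is not delayed.
  //    The catch prevents an unhandled rejection if verification crashes.
  void startVerification().catch((err) => {
    console.error("SL verification crashed:", err);
  });

  return { status: 200, body: { success: true } };
}
```

Attaching a `.catch` to the un-awaited promise matters here: without it, a crash inside the background verification would surface as an unhandled rejection rather than a logged error.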
### Safety Features
- **Drift On-Chain Verification:** Queries actual Drift orders, not just database
- **Circuit Breaker:** Halts all new trades after critical SL placement failures
- **Automatic Position Closure:** Closes unprotected position immediately for safety
- **Critical Telegram Alerts:** Notifies user of halt + closure actions
- **Rate Limit Efficient:** 3-9 queries per position (vs 360/hour with interval-based)
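The on-chain check depends on recognising which open orders count as a stop loss. The commit message gives the criteria as matching `marketIndex`, `reduceOnly`, and a trigger order type. A minimal sketch of that filter follows, using a simplified order shape: `OpenOrder`, `OrderKind`, and `filterStopLossOrders` are hypothetical and are not the Drift SDK's own types, whose order-type encoding differs.

```typescript
// Simplified stand-in for an open order (assumed shape, not the SDK type).
type OrderKind = "market" | "limit" | "triggerMarket" | "triggerLimit";

interface OpenOrder {
  marketIndex: number;
  reduceOnly: boolean;
  kind: OrderKind;
}

// Keep only orders that look like a stop loss for the given market:
// same market, reduce-only, and a trigger-type order.
function filterStopLossOrders(
  orders: readonly OpenOrder[],
  marketIndex: number,
): OpenOrder[] {
  return orders.filter(
    (order) =>
      order.marketIndex === marketIndex &&
      order.reduceOnly &&
      (order.kind === "triggerMarket" || order.kind === "triggerLimit"),
  );
}

// Example: two reduce-only limit orders (TP1/TP2) and no trigger order.
const openOrders: OpenOrder[] = [
  { marketIndex: 0, reduceOnly: true, kind: "limit" },
  { marketIndex: 0, reduceOnly: true, kind: "limit" },
];
const stopLossFound = filterStopLossOrders(openOrders, 0).length > 0;
console.log(stopLossFound ? "SL present" : "SL missing");
```

Note that these criteria would also match a take-profit placed as a reduce-only trigger order, so a stricter check might additionally compare the trigger price against the entry price.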
### User Mandate
> "i mean the opening of the positions was/is working flawlessly so far. so i think simply check 30s/60s/90s after the position was opened that the risk management is in place. 3 calls after an action took place. thats still not much as we dont open trades that often.
>
> if it fails. stop trading and close the current position. better safe than sorry"
### Expected Behavior
1. Position opens successfully at T+0s
2. Verification Attempt 1 at T+30s → queries Drift for SL orders
3. If SL found: SUCCESS, verification complete ✅
4. If SL missing: Wait, retry at T+60s
5. If still missing: Wait, retry at T+90s
6. If still missing after 3 attempts:
- Set `tradingHalted = true` (global flag)
- Close position immediately via market order
- Send critical Telegram alert
- Reject all new trade requests with "Trading halted" error
- Require manual reset via API or Telegram command
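The failure path in step 6 could look roughly like this. It is a sketch under assumptions rather than the real `haltTradingAndClosePosition()`: the `TradingCircuitBreaker` class, the `EmergencyDeps` interface, and the function signature are hypothetical, and the choice to send the alert even when the close attempt fails is an assumption, not something the source states.

```typescript
// Hypothetical breaker with explicit manual reset.
class TradingCircuitBreaker {
  private halted = false;
  private reason: string | null = null;

  halt(reason: string): void {
    this.halted = true;
    this.reason = reason;
  }

  // Manual reset only; nothing in this sketch clears the halt automatically.
  reset(): void {
    this.halted = false;
    this.reason = null;
  }

  check(): { allowed: boolean; error?: string } {
    return this.halted
      ? { allowed: false, error: `Trading halted: ${this.reason}` }
      : { allowed: true };
  }
}

// Injected side effects (assumed interfaces for the close and alert calls).
interface EmergencyDeps {
  closePosition: () => Promise<void>;
  sendAlert: (message: string) => Promise<void>;
}

async function haltTradingAndClosePosition(
  breaker: TradingCircuitBreaker,
  deps: EmergencyDeps,
  tradeId: string,
): Promise<{ closed: boolean; alerted: boolean }> {
  // Halt first so no new trade is accepted while the close is in flight.
  breaker.halt(`SL verification failed for trade ${tradeId}`);

  let closed = false;
  try {
    await deps.closePosition();
    closed = true;
  } catch {
    closed = false; // Reported in the alert below.
  }

  let alerted = false;
  try {
    const status = closed ? "position closed" : "POSITION CLOSE FAILED";
    await deps.sendAlert(`CRITICAL: no SL found for ${tradeId}; trading halted; ${status}`);
    alerted = true;
  } catch {
    alerted = false;
  }

  return { closed, alerted };
}
```

Setting the halt flag before attempting the close follows the order listed above and means a slow or failing close cannot leave a window in which new trades are accepted.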
### Deployment
- **Date:** Dec 16, 2025 11:30 UTC
- **Status:** Code complete, ready for Docker build + deployment
- **Git commits:** Pending (to be committed after testing)
- **Manual Reset:** Required after halt - prevents cascading failures
**"Better safe than sorry" - User's mandate prioritizes capital preservation over opportunity.**