docs: Add Common Pitfall #79 - Smart Validation Queue in-memory loss

Details Smart Validation Queue bug where marginal quality signals (50-89)
were blocked and saved to database, but validation queue never monitored
them after container restarts.

Root causes:
1. Queue used Map (in-memory only), lost on container restart
2. logger.log() silenced in production, making debug impossible

Financial impact: Missed +$18.56 manual entry opportunity (quality 85 signal
that moved +1.21% in 1 minute = 4× confirmation threshold).

Fix deployed Dec 9, 2025: Database restoration on startup + console.log()
for production visibility.

Related commits:
- 2a1badf: Smart Validation Queue database restoration fix
- 1ecef77: Health monitor TypeScript fix (getAllPositions)

User quote: 'the smart validation system should have entered the trade
as it shot up shouldnt it?'

This was part of the $1,000+ losses investigation - multiple critical bugs
discovered and fixed in same session.
This commit is contained in:
mindesbunister
2025-12-09 17:44:34 +01:00
parent 2a1badf3ab
commit 919e54d448

View File

@@ -3420,6 +3420,110 @@ This section contains the **TOP 10 MOST CRITICAL** pitfalls that every AI agent
- **Git commit:** [Health monitoring deployed Dec 8, 2025 - detects missing orders]
- **Status:** ⚠️ Health monitor deployed (detects issue), root cause fix pending
79. **CRITICAL: Smart Validation Queue Never Monitors - In-Memory Queue Lost on Container Restart (CRITICAL - Dec 9, 2025):**
- **Symptom:** Quality 50-89 signals blocked and saved to database, but validation queue never monitors them for price confirmation
- **User Report:** "the smart validation system should have entered the trade as it shot up shouldnt it?"
- **Financial Impact:** Missed +$18.56 manual entry (SOL-PERP LONG quality 85, price moved +1.21% in 1 minute = 4× the +0.3% confirmation threshold)
- **Real Incident (Dec 9, 2025 15:40):**
* Signal: cmiyqy6uf03tcn30722n02lnk
* Quality: 85/90 (blocked correctly per thresholds)
* Entry Price: $134.94
* Price after 1min: $136.57 (+1.21%)
* Confirmation threshold: +0.3%
* System should have: Queued → Monitored → Entered at confirmation
* What happened: Signal saved to database, queue NEVER monitored it
* User had to manually enter → +$18.56 profit
- **Root Cause #1 - In-Memory Queue Lost on Restart:**
* File: `lib/trading/smart-validation-queue.ts`
* Queue uses `Map<string, QueuedSignal>` in-memory storage
* BlockedSignal records saved to PostgreSQL ✅
* But queue Map is empty after container restart ❌
* startSmartValidation() just created empty singleton, never loaded from database
- **Root Cause #2 - Production Logger Silencing:**
* logger.log() calls silenced when NODE_ENV=production
* File: `lib/utils/logger.ts` - logger.log() only works in dev mode
* Startup messages never appeared in container logs
* Silent failure - no errors, no indication queue was empty
- **THE FIX (Dec 9, 2025 - DEPLOYED):**
```typescript
// In lib/trading/smart-validation-queue.ts startSmartValidation()
export async function startSmartValidation(): Promise<void> {
const queue = getSmartValidationQueue()
// Query BlockedSignal table for signals within 30-minute entry window
const thirtyMinutesAgo = new Date(Date.now() - 30 * 60 * 1000)
const recentBlocked = await prisma.blockedSignal.findMany({
where: {
blockReason: 'QUALITY_SCORE_TOO_LOW',
signalQualityScore: { gte: 50, lt: 90 }, // Marginal quality range
createdAt: { gte: thirtyMinutesAgo },
},
})
console.log(`🔄 Restoring ${recentBlocked.length} pending signals from database`)
// Re-queue each signal with original parameters
for (const signal of recentBlocked) {
await queue.addSignal({ /* signal params */ })
}
console.log(`✅ Smart validation restored ${recentBlocked.length} signals, monitoring started`)
}
```
- **Why Both Fixes Were Needed:**
1. Database restoration: Load pending signals from PostgreSQL on startup
2. console.log(): Replace logger.log() calls with console.log() for production visibility
3. Without #1: Queue always empty after restart
4. Without #2: Couldn't debug why queue was empty (no logs)
- **Expected Behavior After Fix:**
* Container restart: Queries database for signals within 30-minute window
* Signals found: Re-queued with original entry price, quality score, metrics
* Monitoring starts: 30-second price checks begin immediately
* Logs show: "🔄 Restoring N pending signals from database"
* Confirmation: "👁️ Smart validation monitoring started (checks every 30s)"
- **Verification Commands:**
```bash
# Check startup logs
docker logs trading-bot-v4 2>&1 | grep -E "(Smart|validation|Restor)"
# Expected output:
# 🧠 Starting smart entry validation system...
# 🔄 Restoring N pending signals from database
# ✅ Smart validation restored N signals, monitoring started
# 👁️ Smart validation monitoring started (checks every 30s)
# If N=0, check database for recent signals
docker exec trading-bot-postgres psql -U postgres -d trading_bot_v4 -c \
"SELECT COUNT(*) FROM \"BlockedSignal\" WHERE \"blockReason\" = 'QUALITY_SCORE_TOO_LOW'
AND \"signalQualityScore\" BETWEEN 50 AND 89
AND \"createdAt\" > NOW() - INTERVAL '30 minutes';"
```
- **Why This Matters:**
* **This is a REAL MONEY system** - validation queue is designed to catch marginal signals that confirm
* Quality 50-89 signals = borderline setups that need price confirmation
* Without validation: Miss profitable confirmed moves (like +$18.56 opportunity)
* System appeared to work (signals blocked correctly, saved to database)
* But critical validation step never executed (queue empty, monitoring off)
- **Prevention Rules:**
1. NEVER use in-memory-only data structures for critical financial logic
2. ALWAYS restore state from database on startup for trading systems
3. ALWAYS use console.log() for critical startup messages (not logger.log())
4. ALWAYS verify monitoring actually started (check logs for confirmation)
5. Add database restoration for ANY system that queues/monitors signals
6. Test container restart scenarios to catch in-memory state loss
- **Red Flags Indicating This Bug:**
* BlockedSignal records exist in database but no queue monitoring logs
* Price moves meet confirmation threshold but no execution
* User manually enters trades that validation queue should have handled
* No "👁️ Smart validation check: N pending signals" logs every 30 seconds
* Telegram shows "⏰ SIGNAL QUEUED FOR VALIDATION" but nothing after
- **Files Changed:**
* lib/trading/smart-validation-queue.ts (Lines 456-500, 137-175, 117-127)
- **Git commit:** 2a1badf "critical: Fix Smart Validation Queue - restore signals from database on startup"
- **Deploy Status:** ✅ DEPLOYED Dec 9, 2025 17:07 CET
- **Status:** ✅ Fixed - Queue now restores pending signals on startup, production logging enabled
72. **CRITICAL: MFE Data Unit Mismatch - ALWAYS Filter by Date (CRITICAL - Dec 5, 2025):**
- **Symptom:** SQL analysis shows "20%+ average MFE" but TP1 (0.6% target) never hits
- **Root Cause:** Old Trade records stored MFE/MAE in DOLLARS, new records store PERCENTAGES