docs: Add Common Pitfall #73 - Service initialization bug ($1k impact)
Added to copilot-instructions.md Common Pitfalls section:

PITFALL #73: Service Initialization Never Ran (Dec 5, 2025)
- Duration: 16 days (Nov 19 - Dec 5)
- Financial impact: $700-1,400 ($1k user estimate)
- Root cause: Services initialized after a validation step with an early return
- Affected: Stop hunt revenge, smart validation, blocked signal tracker, data cleanup
- Fix: Move services BEFORE validation (commits 51b63f4, f6c9a7b, 35c2d7f)
- Prevention: Test suite, CI/CD, startup health checks, console.log for critical logs
- Full docs: docs/CRITICAL_SERVICE_INITIALIZATION_BUG_DEC5_2025.md
81
.github/copilot-instructions.md
vendored
@@ -2966,7 +2966,7 @@ This section contains the **TOP 10 MOST CRITICAL** pitfalls that every AI agent
**Smart Entry:** #63, #66, #68, #70
**Deployment:** #31, #47

📚 **Full Documentation:** `docs/COMMON_PITFALLS.md` (72 pitfalls with code examples, git commits, deployment dates)
📚 **Full Documentation:** `docs/COMMON_PITFALLS.md` (73 pitfalls with code examples, git commits, deployment dates)

72. **CRITICAL: MFE Data Unit Mismatch - ALWAYS Filter by Date (CRITICAL - Dec 5, 2025):**
- **Symptom:** SQL analysis shows "20%+ average MFE" but TP1 (0.6% target) never hits
@@ -3098,6 +3098,85 @@ This section contains the **TOP 10 MOST CRITICAL** pitfalls that every AI agent
- **Git commit:** PR #3 on branch `copilot/remove-env-from-git-tracking`
- **Status:** ✅ Fixed - .env removed from tracking, .gitignore updated

73. **CRITICAL: Service Initialization Never Ran - $1,000 Lost (CRITICAL - Dec 5, 2025):**
- **Symptom:** 4 critical services coded correctly but never started for 16 days
- **Financial Impact:** $700-1,400 in missed opportunities (user estimate: $1,000)
- **Duration:** Nov 19 - Dec 5, 2025 (16 days)
- **Root Cause:** Services initialized AFTER validation function with early return
- **Code Flow (BROKEN):**
```typescript
// lib/startup/init-position-manager.ts
await validateOpenTrades() // Line 43
// validateOpenTrades() returns early if no trades (line 111)

// SERVICE INITIALIZATION (Lines 59-72) - NEVER REACHED
startDataCleanup()
startBlockedSignalTracking()
await startStopHuntTracking()
await startSmartValidation()
```
- **Affected Services:**
1. **Stop Hunt Revenge Tracker** (Nov 20) - Never attempted revenge on quality 85+ stop-outs
2. **Smart Entry Validation** (Nov 30) - Manual Telegram trades used stale data instead of fresh TradingView metrics
3. **Blocked Signal Price Tracker** (Nov 19) - No data collected for threshold optimization
4. **Data Cleanup Service** (Dec 2) - Database bloat, no 28-day retention enforcement
- **Why It Went Undetected:**
* **Silent failure:** No errors thrown, services simply never initialized
* **Logger silencing:** Production logger (`logger.log`) silenced by `NODE_ENV=production`
* **Split logging:** Some logs appeared (from service functions), others didn't (from init function)
* **Common trigger:** Bug only occurred when `openTrades.length === 0` (frequent in production)
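
The logger-silencing point above can be illustrated with a minimal sketch. The project's real logger is not shown in this commit, so `createLogger`, `Sink`, and the `critical` method are hypothetical names chosen for illustration; the environment value is passed in as a parameter rather than read from `process.env` to keep the example self-contained. The fix recorded in this entry was to switch to `console.log()`; a dedicated always-on method is only one assumed alternative.

```typescript
// Hypothetical logger sketch; not the project's actual logger utility.
type Sink = (message: string) => void;

interface Logger {
  log: Sink;      // routine output, dropped when nodeEnv is "production"
  critical: Sink; // always forwarded to the sink
}

function createLogger(nodeEnv: string, sink: Sink): Logger {
  const silenced = nodeEnv === "production";
  return {
    log: (message) => {
      if (!silenced) sink(message);
    },
    critical: (message) => sink(message),
  };
}

// Capture output in an array instead of the console so the effect is visible.
const captured: string[] = [];
const prodLogger = createLogger("production", (m) => captured.push(m));

prodLogger.log("startup message via log");           // silently dropped
prodLogger.critical("startup message via critical"); // recorded
```

With a logger shaped like this, a startup message sent through `log` leaves no trace in production, which matches the "no errors, no logs" symptom described above.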
- **Financial Breakdown:**
* Stop hunt revenge: $300-600 lost (missed reversal opportunities)
* Smart validation: $200-400 lost (stale data caused bad entries)
* Blocked signals: $200-400 lost (suboptimal quality thresholds)
* Total: $700-1,400 over 16 days
- **Fix (Dec 5, 2025):**
```typescript
// CORRECT ORDER:
// 1. Start services FIRST (lines 34-50)
startDataCleanup()
startBlockedSignalTracking()
await startStopHuntTracking()
await startSmartValidation()

// 2. THEN validate (line 56) - can return early safely
await validateAllOpenTrades()
await validateOpenTrades() // Early return OK now

// 3. Finally init Position Manager
const manager = await getInitializedPositionManager()
```
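
The effect of the reordering can be reproduced in isolation. The sketch below is not the project's code: the full body of the init function is not shown in this commit, so it assumes the early-return guard sits in the same function as the service startup calls, which is one way the later calls become unreachable. The starters are modelled as synchronous stand-ins, and `initBroken`, `initFixed`, `SERVICES`, and `Trade` are hypothetical names.

```typescript
// Hypothetical stand-ins: the real starters are async functions in the project.
type Trade = { id: string };

const SERVICES = [
  "dataCleanup",
  "blockedSignalTracking",
  "stopHuntTracking",
  "smartValidation",
];

// Assumed shape of the bug: the guard returns before the services start.
function initBroken(openTrades: Trade[]): string[] {
  const started: string[] = [];
  if (openTrades.length === 0) return started; // early return skips everything below
  started.push(...SERVICES);
  return started;
}

// Fixed order: services start unconditionally, then the guard may return.
function initFixed(openTrades: Trade[]): string[] {
  const started: string[] = [];
  started.push(...SERVICES);
  if (openTrades.length === 0) return started; // safe: services already started
  // per-trade validation would run here
  return started;
}

// Run both variants with zero open trades, the condition that triggered the bug.
function demo(): { broken: string[]; fixed: string[] } {
  return { broken: initBroken([]), fixed: initFixed([]) };
}
```

Calling `demo()` shows the difference: with no open trades the broken variant starts nothing, while the fixed variant starts all four services.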
- **Logging Fix:** Changed `logger.log()` to `console.log()` for production visibility
- **Verification:**
```bash
$ docker logs trading-bot-v4 | grep -E "🧹|🔬|🎯|🧠|📊"
🧹 Starting data cleanup service...
🔬 Starting blocked signal price tracker...
🎯 Starting stop hunt revenge tracker...
📊 No active stop hunts - tracker will start when needed
🧠 Starting smart entry validation system...
```
- **Prevention Measures:**
1. **Test suite (PR #2):** 113 tests covering Position Manager - add service initialization tests
2. **CI/CD pipeline (PR #5):** Automated quality gates - add service startup validation
3. **Startup health check:** Verify all expected services initialized, throw error if missing
4. **Production logging standard:** Critical operations use `console.log()`, not `logger.log()`
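
A startup health check of the kind listed in measure 3 might look like the following sketch. It is an assumption about one possible design, not the project's implementation: `ServiceRegistry`, `EXPECTED_SERVICES`, and the service names are hypothetical, chosen to mirror the four affected services.

```typescript
// Hypothetical registry; names mirror the four services affected by this bug.
const EXPECTED_SERVICES = [
  "dataCleanup",
  "blockedSignalTracking",
  "stopHuntTracking",
  "smartValidation",
] as const;

type ServiceName = (typeof EXPECTED_SERVICES)[number];

class ServiceRegistry {
  private readonly started = new Set<ServiceName>();

  // Each service starter would call this once it is actually running.
  markStarted(name: ServiceName): void {
    this.started.add(name);
  }

  // Expected services that never reported in.
  missing(): ServiceName[] {
    return EXPECTED_SERVICES.filter((name) => !this.started.has(name));
  }

  // Fail loudly instead of silently continuing with services missing.
  assertAllStarted(): void {
    const missing = this.missing();
    if (missing.length > 0) {
      throw new Error(`Services not initialized: ${missing.join(", ")}`);
    }
  }
}
```

Calling `assertAllStarted()` at the end of startup turns a silent omission into a thrown error that names the services that did not start.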
- **Lessons Learned:**
* Service initialization order matters - never place critical services after functions with early returns
* Silent failures are dangerous - add explicit verification that services started
* Production logging must be visible - logger utilities that silence logs = debugging nightmare
* Test real-world conditions - bug only occurred with `NODE_ENV=production` + `openTrades.length === 0`
- **Timeline:**
* Nov 19: Blocked Signal Tracker deployed (never ran)
* Nov 20: Stop Hunt Revenge deployed (never ran)
* Nov 30: Smart Validation deployed (never ran)
* Dec 2: Data Cleanup deployed (never ran)
* Dec 5: Bug discovered and fixed
* Result: **16 days of development with 0 production execution**
- **Git commits:** 51b63f4 (service order fix), f6c9a7b (console.log fix), 35c2d7f (stop hunt logs fix)
- **Full documentation:** `docs/CRITICAL_SERVICE_INITIALIZATION_BUG_DEC5_2025.md`
- **Status:** ✅ Fixed - All services now start on every container restart, verified in production logs

## File Conventions

- **API routes:** `app/api/[feature]/[action]/route.ts` (Next.js 15 App Router)