diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index ce2391e..4800d28 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -3229,6 +3229,111 @@ This section contains the **TOP 10 MOST CRITICAL** pitfalls that every AI agent
 - **Why Bug #81 Didn't Fix This:** Bug #81 = orders never placed, Bug #82 = orders placed then REMOVED by verifier
 - **Status:** ✅ EMERGENCY FIX DEPLOYED Dec 10, 2025 11:06 CET (commit e5714e4)
 
+**12. Docker Cache Prevents Telegram Notification Code Deployment (#86 - CRITICAL - Dec 17, 2025)** - `--force-recreate` ≠ `--no-cache`
+- **Symptom:** Container newer than commits, but Telegram shows OLD notification format (0.15% instead of 0.3%)
+- **User Report:** "telegram is not fixed" after multiple rebuilds showing old code
+- **Financial Impact:** 2 hours of debugging, delayed feature deployment, confusion about system state
+- **Real Incident Timeline (Dec 17, 2025):**
+  * 13:38:09 - Committed dd9e5bd: Changed Telegram 0.15% → 0.3%
+  * 14:00:11 - Committed 6ac2647: Made Telegram thresholds adaptive
+  * 14:03:31 - Container rebuilt and restarted with `--force-recreate`
+  * 14:40:00 - User receives NEW signal showing **0.15% old format** (pre-dd9e5bd code!)
+  * Investigation: Container start time (14:03:31) > commit time (14:00:11) ✅ BUT showing old code ❌
+- **Root Cause:**
+  * Multi-stage Dockerfile caches `COPY . .` and `RUN npm run build` layers
+  * `docker compose up -d --force-recreate` recreates only the CONTAINER; it does not rebuild the IMAGE layers
+  * Docker reused cached build layer from BEFORE dd9e5bd commit
+  * Notification string changes (telegram.ts) didn't trigger cache invalidation
+  * Container appeared "new" but contained "old" compiled code
+- **Git History Analysis:**
+  ```bash
+  git show dd9e5bd^  # Showed exact 0.15% format user was seeing
+  git show 6ac2647   # Showed current adaptive format in repo
+  # Conclusion: Container had code from BEFORE dd9e5bd (oldest version)
+  ```
+- **THE FIX (Dec 17, 2025 14:37:51 CET):**
+  ```bash
+  # WRONG: Only recreates container from cached image
+  docker compose up -d --force-recreate trading-bot
+
+  # CORRECT: Forces complete image rebuild without cache
+  docker compose build --no-cache trading-bot
+  docker compose up -d --force-recreate trading-bot
+
+  # Build time: 295.4s (vs ~30s with cache)
+  # Result: ALL code freshly compiled, ALL commits deployed ✅
+  ```
+- **Why `--force-recreate` is Misleading:**
+  * Flag name suggests "rebuild everything from scratch"
+  * Reality: Only destroys and recreates container from existing image
+  * Image layers remain cached from previous builds
+  * Common misconception in Docker workflows
+- **When to Use `--no-cache`:**
+  * Telegram/notification message changes not appearing
+  * UI text/string constants showing old values
+  * Hardcoded values not updating after code changes
+  * Normal rebuild deploys everything EXCEPT your specific change
+  * Debugging "why does new container have old code?"
+  * Any time Docker layer cache might be stale
+- **When `--force-recreate` is Sufficient:**
+  * Configuration file changes (config.ts logic changes)
+  * Environment variable updates (.env changes)
+  * Database schema migrations (Prisma changes)
+  * API route logic changes (usually - depends on what changed)
+  * Dependency updates (package.json changes trigger rebuild)
+- **Verification After Rebuild:**
+  ```bash
+  # 1. Check container start time
+  docker logs trading-bot-v4 | grep "Server starting" | head -1
+  # Output: Server starting at 2025-12-17T14:37:51.123Z
+
+  # 2. Check latest commit time
+  git log -1 --format='%ai'
+  # Output: 2025-12-17 14:00:11 +0100
+
+  # 3. Verify container NEWER than commit (necessary but not sufficient: this check also passed during the incident)
+  # Container 14:37:51 > Commit 14:00:11 ✅
+
+  # 4. Test actual behavior (wait for next signal or test manually)
+  # Expected: New format with 0.3%, "confirms"/"against" text
+  ```
+- **Code Evolution (3 Versions):**
+  * **Version 1** (pre-dd9e5bd): Hardcoded 0.15%, old structure
+  * **Version 2** (dd9e5bd to 6ac2647^): Hardcoded 0.3%, old structure
+  * **Version 3** (6ac2647+): Adaptive display, new 4-line format
+  * Container showed Version 1 due to cache, now shows Version 3 ✅
+- **Files Affected:**
+  * lib/notifications/telegram.ts (lines 118-130)
+  * lib/trading/smart-validation-queue.ts (lines 107-109, 120-128)
+  * All notification text changes susceptible to this bug
+- **Prevention Rules:**
+  1. ALWAYS use `--no-cache` when notification/UI text changes
+  2. NEVER trust `--force-recreate` alone for code deployment
+  3. ALWAYS verify actual behavior after "deployment" (not just container timestamp)
+  4. Check specific changed files in running container if possible
+  5. A build roughly 10× longer (30s → 300s) is normal and expected with `--no-cache`
+  6. Add to deployment checklist: "String changes require --no-cache"
+- **Red Flags Indicating Docker Cache Bug:**
+  * Container start time > commit time, but showing old behavior
+  * Code changes in repository don't appear in container
+  * Telegram/UI showing old text despite rebuilds
+  * User says "it's not fixed" after rebuild
+  * git show reveals old code format matches what's running
+  * No TypeScript compilation errors, but behavior unchanged
+- **Why This Matters:**
+  * **This is a REAL MONEY system** - delayed deployments = missed features
+  * User confusion: "I rebuilt, why isn't it working?"
+  * Wasted debugging time (2 hours investigating non-existent code issues)
+  * Misleading system state (appears deployed, actually isn't)
+  * Future notification changes will hit same issue without --no-cache
+- **Git Commits:**
+  * dd9e5bd - "fix: Correct Smart Validation Queue confirmation threshold (0.15% → 0.3%)"
+  * 6ac2647 - "feat: Make Smart Validation Queue thresholds adaptive in Telegram notifications"
+  * 0310b14 - "fix: Enable BlockedSignalTracker for SMART_VALIDATION_QUEUED signals"
+- **Deployment:** Dec 17, 2025 14:37:51 CET (--no-cache rebuild)
+- **Status:** ✅ FIXED - All notification code deployed, next signal will show correct format
+- **Lesson Learned:** Docker cache optimization (fast builds) can backfire for notification/UI changes. `--force-recreate` is misleadingly named: it only recreates the container, not the image layers. Always use `--no-cache` for string/notification changes. The build time cost (295s vs 30s) is worth correct code deployment in a real money system.
+
 ---
 
 **REMOVED FROM TOP 10 (Still documented in full section):**