MANDATORY section for all AI agents to check first:
- Current TradingView indicator: v11 All Filters (filters NOW working)
- Signal quality scoring: v9 with direction-specific thresholds
- Position management: TP2-as-runner with ATR-based dynamic targets
- Adaptive leverage: 10x/5x with LONG 95+, SHORT 90+ thresholds
- Verification commands to check what's actually deployed
This prevents confusion when multiple implementations exist.
User mandate: 'i want this to be a mandatory thing to document'
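A minimal TypeScript sketch of the adaptive leverage rule listed above. It assumes a quality score at or above the direction-specific threshold selects 10x and anything below selects 5x. The function name and that mapping are assumptions for illustration; the deployed logic may differ.

```typescript
type Direction = "long" | "short";

// Thresholds come from the note above (LONG 95+, SHORT 90+).
// Mapping "at or above threshold" to 10x and "below" to 5x is an assumption.
const HIGH_LEVERAGE_THRESHOLD: Record<Direction, number> = {
  long: 95,
  short: 90,
};

// Hypothetical helper: pick leverage from direction and quality score.
function selectLeverage(direction: Direction, qualityScore: number): number {
  return qualityScore >= HIGH_LEVERAGE_THRESHOLD[direction] ? 10 : 5;
}
```

Keeping the thresholds in one lookup table makes it easier to verify what is actually deployed against what is documented.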
- Added Bug #87 (Position Manager state lost on restart) to TOP 10 Critical Pitfalls
- Comprehensive documentation with incident details, root cause, fix implementation
- Created state-persistence.test.ts to validate all 18 critical state fields
- Test suite validates tp2Hit, trailingStopActive, peakPrice (critical for runner recovery)
- Testing notes: TypeScript ✅, npm test ⏱ timeout (120s), Docker deployment ✅
- Real-world validation pending: Next trade with container restart
Bug #87 Impact:
- Financial: ~$18.56 runner profit lost
- Root Cause: Race condition in nested Prisma query
- Fix: 4-step bulletproof atomic persistence with verification
- Status: ✅ DEPLOYED Dec 17, 2025 15:14 UTC (commit 341341d)
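The four persistence steps are not spelled out in this entry, so the sketch below illustrates only the general write-then-verify idea for the three runner-critical fields named above. `StateStore`, `InMemoryStore`, and `persistAndVerify` are hypothetical names, and the store is synchronous purely to keep the example short; a real database client would be asynchronous.

```typescript
// Hypothetical subset of the persisted state; the source mentions 18 fields in total.
interface RunnerState {
  tp2Hit: boolean;
  trailingStopActive: boolean;
  peakPrice: number;
}

// Hypothetical storage interface standing in for the real database layer.
interface StateStore {
  write(tradeId: string, state: RunnerState): void;
  read(tradeId: string): RunnerState | undefined;
}

// In-memory stand-in used only to make the sketch runnable.
class InMemoryStore implements StateStore {
  private rows = new Map<string, RunnerState>();

  write(tradeId: string, state: RunnerState): void {
    this.rows.set(tradeId, { ...state });
  }

  read(tradeId: string): RunnerState | undefined {
    const row = this.rows.get(tradeId);
    return row ? { ...row } : undefined;
  }
}

// Write the state, read it back, and fail loudly if any critical field differs.
function persistAndVerify(store: StateStore, tradeId: string, state: RunnerState): void {
  store.write(tradeId, state);
  const saved = store.read(tradeId);
  const ok =
    saved !== undefined &&
    saved.tp2Hit === state.tp2Hit &&
    saved.trailingStopActive === state.trailingStopActive &&
    saved.peakPrice === state.peakPrice;
  if (!ok) {
    throw new Error(`State verification failed for trade ${tradeId}`);
  }
}
```

Throwing on a failed read-back turns a silent state loss into an immediate, visible error.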
- Documents critical incident where --force-recreate didn't deploy code
- Telegram showed 0.15% instead of 0.3% despite commits and rebuild
- Root cause: Docker cached build layers, only container recreated
- Solution: docker compose build --no-cache trading-bot required
- Adds when to use --no-cache vs --force-recreate guidelines
- Includes verification steps and prevention rules
- 2 hours debugging time, now documented for future reference
Bug #84 Documentation:
- v11 'All Filters' indicator calculated filters but never applied them
- Root cause: finalLongSignal = buyReady (missing AND conditions)
- Impact: ,000 losses - all 7 v11 trades were unfiltered
- Fix: Added all filter checks to signal logic (commit acf103c)
- User observation 'nothing changes' was key diagnostic clue
- Explains why filter optimization work had zero effect
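The indicator itself is a TradingView script, so the TypeScript below is only an illustration of the logic error described above: the final signal must be the base condition AND every filter result. The filter names in the usage note are hypothetical.

```typescript
// Combine the base trigger with every filter result.
// The buggy version was equivalent to returning `buyReady` alone.
function finalLongSignal(buyReady: boolean, filters: Record<string, boolean>): boolean {
  const allFiltersPass = Object.keys(filters).every((name) => filters[name]);
  return buyReady && allFiltersPass;
}
```

For example, `finalLongSignal(true, { adxOk: true, volumeOk: false })` is `false`, whereas the unfiltered version would have fired.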
- Added comprehensive Bug #77 documentation as entry #2
- Root cause: handlePriceUpdate() early return when Drift not initialized
- Impact: $1,000+ losses from silently failed monitoring (29+ minutes unprotected)
- Fix: Removed early return, monitoring now runs regardless of Drift state
- Verification: Test position shows price checks every 2 seconds
- Prevention: Never add early returns that silently skip critical operations
- Renumbered entries 2-10 to 3-11 to accommodate new critical pitfall
- User quote on this bug: 'the whole time all the development we did was not working'
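A minimal sketch of the Bug #77 prevention rule, assuming a simplified handler signature. Only the name `handlePriceUpdate` comes from the source; `MonitorDeps` and its fields are hypothetical. When the dependency is not ready, the handler warns and still runs the exit checks instead of returning early.

```typescript
// Hypothetical dependencies injected into the handler for illustration.
interface MonitorDeps {
  driftReady: boolean;
  checkExits: (price: number) => void;
  warn: (message: string) => void;
}

function handlePriceUpdate(price: number, deps: MonitorDeps): void {
  if (!deps.driftReady) {
    // Surface the degraded state loudly, but do NOT return here:
    // an early return is what silently disabled monitoring.
    deps.warn("Drift not initialized; continuing price checks");
  }
  deps.checkExits(price);
}
```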
- Created comprehensive docs/BUG_83_AUTO_SYNC_ORDER_SIGNATURES_FIX.md
- Updated .github/copilot-instructions.md with full Bug #83 entry
- Documented two-part fix: order discovery + fallback logic
- Included testing procedures, prevention rules, future improvements
- User requested: 'ok go fix it and dont forget documentation' - COMPLETED
Documentation covers:
- Root cause analysis (NULL order signatures in auto-synced positions)
- Real incident details (Dec 12, 2025 position cmj3f5w3s0010pf0779cgqywi)
- Two-part solution (proactive discovery + reactive fallback)
- Expected impact and verification methods
- Why this is different from Bugs #77 and #78
Status: Fix deployed Dec 12, 2025 23:00 CET
Container: trading-bot-v4 with NULL signature fallback active
- Enhanced DNS failover monitor on secondary (72.62.39.24)
- Auto-promotes database: pg_ctl promote on failover
- Creates DEMOTED flag on primary via SSH (split-brain protection)
- Telegram notifications with database promotion status
- Startup safety script ready (integration pending)
- 90-second automatic recovery vs 10-30 min manual
- Provides roughly 95% of the benefit of enterprise HA at zero cost
Status: DEPLOYED and MONITORING (14:52 CET)
Next: Controlled failover test during maintenance
Bug #82: Drift State Verifier automatically closes active positions
Critical Issue:
- Verifier detected 6 old closed DB records (150-1064 min ago)
- All showed "15.45 tokens open on Drift" (user's CURRENT manual trade!)
- Automatic retry close removed user's SL orders
- User: "FOR FUCK SAKES. STILL THE FUCKING SAME. THE SYSTEM KILLED MY SL"
Different from Bug #81:
- Bug #81: Orders never placed initially (wrong token quantities)
- Bug #82: Orders placed and working, then REMOVED by verifier
Emergency Fix:
- DISABLED automatic retry close
- Added warning logs
- Requires manual orphan cleanup until proper position verification added
Deployment: Dec 10, 2025 11:06 CET
Status: Emergency fix deployed, active positions now protected
Bug #81 (usdToBase wrong price) deserves TOP 10 status because:
- ROOT CAUSE of $1,000+ user losses
- Broke working implementation (4cc294b: 100% success rate)
- Positions repeatedly created without stop loss protection
- Database showed NULL signatures despite orders supposedly placed
- User had to manually close multiple positions
This was THE bug that made user say: "we had this working perfectly in the past"
Fix: Reverted usdToBase() to use SPECIFIC price for each order (TP1/TP2/SL)
Status: ✅ DEPLOYED Dec 10, 2025 14:31 CET (commit 55d780c)
- CRITICAL: Database can be wrong, Drift is source of truth
- Incident Dec 9: Database -9.33, Drift -2.21 (missing .88)
- Root cause: Retry loop chaos caused multi-chunk close, only first recorded
- User mandate: 'drift tells the truth not you' - always verify with API
- Pattern: Query Drift → Compare → Report discrepancies → Correct database
- This is NON-NEGOTIABLE for real money trading system
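The Query → Compare → Report → Correct pattern above can be sketched as two pure functions. The record shapes, the matching by `tradeId`, and the 0.01 USD tolerance are assumptions for illustration; fetching the rows from Drift and the database is left out.

```typescript
interface PnlRecord {
  tradeId: string;
  realizedPnl: number; // USD
}

interface Discrepancy {
  tradeId: string;
  database: number;
  drift: number;
  difference: number; // drift minus database
}

// Compare database rows against exchange rows and report every mismatch.
function findDiscrepancies(
  dbRows: PnlRecord[],
  driftRows: PnlRecord[],
  toleranceUsd: number = 0.01
): Discrepancy[] {
  const driftById = new Map<string, number>();
  for (const row of driftRows) driftById.set(row.tradeId, row.realizedPnl);

  const result: Discrepancy[] = [];
  for (const row of dbRows) {
    const drift = driftById.get(row.tradeId);
    if (drift === undefined) continue; // no exchange record: needs separate handling
    const difference = drift - row.realizedPnl;
    if (Math.abs(difference) > toleranceUsd) {
      result.push({ tradeId: row.tradeId, database: row.realizedPnl, drift, difference });
    }
  }
  return result;
}

// Treat the exchange value as authoritative when correcting the database copy.
function applyCorrections(dbRows: PnlRecord[], discrepancies: Discrepancy[]): PnlRecord[] {
  const fixes = new Map<string, number>();
  for (const d of discrepancies) fixes.set(d.tradeId, d.drift);
  return dbRows.map((row) => {
    const fix = fixes.get(row.tradeId);
    return fix === undefined ? row : { ...row, realizedPnl: fix };
  });
}
```

Separating detection from correction lets the discrepancies be reported to the user before anything in the database is changed.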
CRITICAL incident (Dec 9, 2025):
- Agent closed position based on stale bot data
- User explicitly said NOT to close
- Bot logs showed 'closed' but Drift still had open position
- Catastrophic if user wants to keep position open
NEW IRON-CLAD RULE:
- NEVER trust bot logs, API responses, or database alone
- ALWAYS query Drift API first: curl sync-positions
- Verify actual position.size, entry, P&L from Drift
- Only AFTER Drift verification: proceed with any operation
This is NON-NEGOTIABLE for financial system integrity.
Details the Smart Validation Queue bug: marginal-quality signals (50-89)
were blocked and saved to the database, but the validation queue never
monitored them after container restarts.
Root causes:
1. Queue used a Map (in-memory only), so its contents were lost on container restart
2. logger.log() was silenced in production, making debugging impossible
Financial impact: Missed +$18.56 manual entry opportunity (quality 85 signal
that moved +1.21% in 1 minute = 4× confirmation threshold).
Fix deployed Dec 9, 2025: Database restoration on startup + console.log()
for production visibility.
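A minimal sketch of the restore-on-startup idea, assuming blocked signals are persisted with a status column. The `PendingSignal` shape, the status values, and the `ValidationQueue` class are hypothetical; loading the rows from the database is omitted.

```typescript
// Hypothetical persisted row for a blocked signal awaiting validation.
interface PendingSignal {
  id: string;
  symbol: string;
  qualityScore: number;
  status: "PENDING" | "EXECUTED" | "EXPIRED";
}

class ValidationQueue {
  // The Map is only a cache; the database rows are the durable record.
  private pending = new Map<string, PendingSignal>();

  // Call once at startup with rows loaded from the database.
  restore(rows: PendingSignal[]): number {
    for (const row of rows) {
      if (row.status === "PENDING") this.pending.set(row.id, row);
    }
    return this.pending.size;
  }

  has(id: string): boolean {
    return this.pending.has(id);
  }
}
```

Returning the restored count gives the startup log a concrete number to print, which helps confirm the queue actually resumed monitoring.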
Related commits:
- 2a1badf: Smart Validation Queue database restoration fix
- 1ecef77: Health monitor TypeScript fix (getAllPositions)
User quote: 'the smart validation system should have entered the trade
as it shot up shouldnt it?'
This was part of the $1,000+ losses investigation - multiple critical bugs
discovered and fixed in same session.
CRITICAL DOCUMENTATION (Dec 8, 2025):
Three bugs discovered that caused $1,000+ losses:
**Bug #76: Silent SL Placement Failure**
- placeExitOrders() returns SUCCESS with only 2/3 orders
- TP1+TP2 placed but SL missing (NULL in database)
- No error logs, no indication of failure
- Position completely unprotected from downside
- Real incident: cmix773hk019gn307fjjhbikx (SOL $138.45, $2,003 size)
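One way to guard against a partial success like this is to check every expected order signature before reporting success. In the sketch below only `tp1OrderTx` is a field name that appears in this log; `tp2OrderTx`, `slOrderTx`, and both helper functions are assumptions for illustration.

```typescript
// Hypothetical result shape returned after placing exit orders.
interface ExitOrderResult {
  tp1OrderTx?: string | null;
  tp2OrderTx?: string | null;
  slOrderTx?: string | null;
}

// List which of the three exit orders have no transaction signature.
function missingExitOrders(result: ExitOrderResult): string[] {
  const missing: string[] = [];
  if (!result.tp1OrderTx) missing.push("TP1");
  if (!result.tp2OrderTx) missing.push("TP2");
  if (!result.slOrderTx) missing.push("SL");
  return missing;
}

// Refuse to report success unless all three orders were placed.
function assertAllExitOrdersPlaced(result: ExitOrderResult): void {
  const missing = missingExitOrders(result);
  if (missing.length > 0) {
    throw new Error(`Exit orders missing: ${missing.join(", ")}`);
  }
}
```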
**Bug #77: Position Manager Never Monitors**
- Logs: "✅ Trade added to position manager for monitoring"
- Reality: isMonitoring=false, no price checks whatsoever
- configSnapshot.positionManagerState = NULL
- No Pyth monitor startup, no price updates
- $1,000+ losses because positions had ZERO protection
**Bug #78: Orphan Cleanup Removes Active Orders**
- Old orphaned position triggers cleanup
- cancelAllOrders() affects ALL positions on symbol
- User's NEW position loses TP/SL protection
- Orders initially placed, then removed by system
- Position left open with NO protection
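Because a symbol-wide cancel also removes orders belonging to a live position, one possible guard is to refuse the cancel whenever an active position exists on that symbol. The types and function name below are hypothetical, and this is a sketch of the idea rather than the deployed fix.

```typescript
// Hypothetical minimal view of an open position.
interface OpenPosition {
  id: string;
  symbol: string;
}

// A symbol-wide cancel is only safe when no active position uses that symbol.
function canCancelAllOrders(symbol: string, activePositions: OpenPosition[]): boolean {
  return !activePositions.some((position) => position.symbol === symbol);
}
```

Orphan cleanup would call this check first and fall back to manual review when it returns `false`.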
SOLUTION: Position Manager Health Monitoring System
- File: lib/health/position-manager-health.ts (177 lines)
- Runs every 30 seconds automatically
- Detects all three bugs within 30 seconds
- CRITICAL alerts logged immediately
- Started via lib/startup/init-position-manager.ts
TEST SUITE: monitoring-verification.test.ts
- 8 test cases validating PM actually monitors
- Validates Pyth monitor starts
- Validates isMonitoring flag
- Validates price updates trigger checks
User quote: "we have lost 1000$...... i hope with the new test system this is an issue of the past"
This documentation ensures these bugs NEVER happen again.
CRITICAL LESSON LEARNED (Dec 8, 2025):
- Database has 2024 dates, current date is 2025
- Query "WHERE exitTime >= '2024-12-07'" matched Oct-Dec (247 rows)
- Should have queried "WHERE exitTime >= '2025-12-07'" (6 rows)
- Result: Reported -$1,616 loss instead of actual -$137.55 (12× inflation)
- User was RIGHT with $120.89 figure, AI agent wrong due to year mismatch
PREVENTION:
- Always use NOW() or CURRENT_DATE for relative queries
- Never hardcode year without verification
- Check row counts before declaring results
- Include YYYY-MM-DD in SELECT to catch mismatches
- Trust user's numbers when they dispute - verify query year first
This is a REAL MONEY system - wrong numbers = wrong decisions.
Drift tells the truth. User was right. Verify queries.
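When a query is built in application code, the cutoff can be derived from the current time rather than typed by hand, which removes the chance of a wrong year. The helper name below is hypothetical, and the sketch works in UTC.

```typescript
// Derive a YYYY-MM-DD cutoff relative to "now" instead of hardcoding a year.
function cutoffDate(now: Date, daysBack: number): string {
  const millisPerDay = 24 * 60 * 60 * 1000;
  const cutoff = new Date(now.getTime() - daysBack * millisPerDay);
  return cutoff.toISOString().slice(0, 10); // UTC date part only
}
```

For example, `cutoffDate(new Date("2025-12-08T12:00:00Z"), 1)` yields `"2025-12-07"`. Checking the row count against expectations is still worthwhile before reporting results.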
- Moved from #10 to #1 (most critical)
- This bug cost user 08 in real losses Dec 8, 2025
- Root cause: Container restart without verifying fix deployment
- Prevention: ALWAYS verify container timestamp > commit timestamp
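The timestamp rule above reduces to a single comparison. The function name is hypothetical and the sketch assumes both timestamps have already been obtained. Given the Docker cache incident documented earlier in this log, the image build time may be the safer value to compare than the container start time.

```typescript
// A fix can only be running if the container (or image) is newer than the commit.
function isFixDeployed(containerStartedAt: Date, commitAt: Date): boolean {
  return containerStartedAt.getTime() > commitAt.getTime();
}
```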
- Discovered that n8n normalizes symbols BEFORE sending to bot
- Bot normalization code is never used (symbols already in *-PERP format)
- Adding new symbols requires updating n8n workflow, not bot code
- FARTCOIN fix applied to workflows/trading/parse_signal_enhanced.json
- User must import updated workflow to n8n for FARTCOIN to work
Added comprehensive documentation for Dec 7, 2025 timeout change:
- Extended from 10 → 30 minutes based on blocked signal analysis
- Data: 3/10 signals hit TP1, most moves after 15-30 min
- Example: Quality 70 + ADX 29.7 hit TP1 at 0.41% after 30+ min
- Trade-off: -0.4% drawdown limit protects against extended losses
- Deployment: commit c9c987a, verified operational
Updated Architecture Overview > Smart Validation Queue section with
full rationale, configuration details, and production status.
Moved Position Manager monitoring stop bug to #1 spot in Top 10 Critical Pitfalls.
This is now the most critical known issue, having caused real financial losses
during 90-minute monitoring gap on Dec 6-7, 2025.
Changes:
- Position Manager monitoring stop: Now #1 (was not listed)
- Drift SDK memory leak: Now #2 (was #1)
- Execute endpoint quality bypass: Removed from top 10 (less critical)
Documentation includes:
- Complete root cause explanation
- All 3 safety layer fixes deployed
- Code locations for each layer
- Expected impact and verification status
- Reference to full analysis: docs/PM_MONITORING_STOP_ROOT_CAUSE_DEC7_2025.md
User can now see this is the highest priority reliability issue and has been
comprehensively addressed with multiple fail-safes.
Added to copilot-instructions.md Common Pitfalls section:
PITFALL #73: Service Initialization Never Ran (Dec 5, 2025)
- Duration: 16 days (Nov 19 - Dec 5)
- Financial impact: 00-1,400 (k user estimate)
- Root cause: Service initialization was placed after a validation step that returned early, so the services never started
- Affected: Stop hunt revenge, smart validation, blocked signal tracker, data cleanup
- Fix: Move services BEFORE validation (commits 51b63f4, f6c9a7b, 35c2d7f)
- Prevention: Test suite, CI/CD, startup health checks, console.log for critical logs
- Full docs: docs/CRITICAL_SERVICE_INITIALIZATION_BUG_DEC5_2025.md
- Added STEP 1: Run tests BEFORE deployment (113 tests in ~30s)
- Added STEP 2: Validate with test trade AFTER deployment
- Why mandatory: Catch bugs (tokens vs USD, false TP1, wrong SL) before real money loss
- Tests prevent Common Pitfalls #24, #43, #45, #52, #54, #67 recurrence
- DO NOT deploy if tests fail - fix issue or update tests first
Added comprehensive feature discovery section to copilot-instructions.md:
- Quick Reference Table: 9 common scenarios with existing features
- Quick Search Commands: bash commands for feature discovery
- Feature Discovery by Category: 6 categories with 30+ features
- Decision Flowchart: 5-step verification process
- Historical Examples: why each feature was built
This helps users/AI agents discover existing features before rebuilding.
Co-authored-by: mindesbunister <32161838+mindesbunister@users.noreply.github.com>
- Only 31 records from multi-timeframe alerts (not 11,429)
- 11,398 records are 1-minute data (kept as DATA_COLLECTION_ONLY)
- Total marked as OLD_V9_VERSION: 31 (15min/1H/4H/Daily only)
- Discovery: All TradingView alerts (5min/15min/1H/4H/Daily) attached to OLD v9 version
- Impact: 11,429 records from wrong indicator settings (confirmBars=0 vs current)
- Solution: Marked as DATA_COLLECTION_OLD_V9_VERSION to prevent analysis contamination
- Exception: 1-minute data (11,398) kept as DATA_COLLECTION_ONLY (unaffected)
- Fresh data from corrected alerts will use DATA_COLLECTION_ONLY going forward
- Old data preserved for historical reference, clearly marked
CRITICAL LESSON LEARNED (Dec 5, 2025):
Document the data analysis disaster caused by MFE/MAE stored in mixed units.
What Happened:
- Analyzed blocked vs executed signals to improve win rate
- SQL showed executed signals: 20.15% avg MFE (appeared excellent)
- Implemented "optimizations" based on this data:
  * Tighter ATR multipliers (2.0→1.5, 4.0→3.0)
  * Higher TP1 close (60%→75%)
  * Increased leverage (1×→5×)
- User questioned: Why doesn't TP1 hit if it's 20% MFE?
- Investigation: Only 2/11 trades reached TP1 price target
- Root cause: Old records stored MFE in DOLLARS, new in PERCENTAGES
- TRUE MFE: 0.76% (long), 1.20% (short) - NOT 20%!
- 26× inflation due to unit mismatch
Why This Matters:
- This is a REAL MONEY system - wrong analysis = wrong trades = losses
- MFE/MAE used for critical decisions (exit timing, quality validation)
- Agent made "data-driven" optimizations on 26× inflated data
- All changes had to be reverted (commits a67a338, f65aae5)
MANDATORY SQL Pattern:
- ALWAYS filter by createdAt >= '2025-11-23' for MFE/MAE queries
- OR recalculate from prices (maxFavorablePrice - entryPrice)
- NEVER trust raw AVG(maxFavorableExcursion) without date filter
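Recalculating from prices can be sketched as below. The source gives only the difference (maxFavorablePrice - entryPrice); dividing by the entry price to get a percentage and mirroring the sign for shorts are assumptions, as are the function and parameter names.

```typescript
type Direction = "long" | "short";

// Maximum favorable excursion as a percentage of the entry price.
// bestPrice is the most favorable price seen: the highest for longs, the lowest for shorts.
function mfePercent(direction: Direction, entryPrice: number, bestPrice: number): number {
  if (entryPrice <= 0) {
    throw new Error("entryPrice must be positive");
  }
  const favorableMove = direction === "long" ? bestPrice - entryPrice : entryPrice - bestPrice;
  return (favorableMove / entryPrice) * 100;
}
```

Computing the value from prices avoids relying on the stored column, whose units changed between old and new records.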
Prevention:
- Verify stored vs calculated values before ANY MFE/MAE analysis
- Check sample of recent vs old records to detect unit changes
- Document data format changes in Common Pitfalls immediately
Related:
- Common Pitfall #54: Original MFE/MAE units bug (Nov 23, 2025)
- Revert commit: a15f17f
- Incorrect optimization: a67a338, f65aae5
CRITICAL DATA BUG DISCOVERED (Dec 5, 2025):
Previous commits a67a338 and f65aae5 implemented optimizations based on
INCORRECT analysis of maxFavorableExcursion (MFE) data.
Problem: Old Trade records stored MFE in DOLLARS, not PERCENTAGES
- Appeared to show 20%+ average favorable movement
- Actually only 0.76% (long) and 1.20% (short) average movement
- 26× inflation of perceived performance due to unit mismatch
Incorrect Changes Reverted:
- ATR_MULTIPLIER_TP1: 1.5 → back to 2.0
- ATR_MULTIPLIER_TP2: 3.0 → back to 4.0
- ATR_MULTIPLIER_SL: 2.5 → back to 3.0
- TAKE_PROFIT_1_SIZE_PERCENT: 75 → back to 60
- LEVERAGE: 5 → back to 1
- Safety bounds restored to original values
- TRAILING_STOP_ATR_MULTIPLIER: back to 2.5
REAL FINDINGS (after data correction):
- TP1 orders ARE being placed (tp1OrderTx populated)
- TP1 prices NOT being reached (only 2/11 trades in sample)
- Recent trades (6 total): avg MFE 0.74%, only 2/6 reached TP1
- Problem is ENTRY QUALITY, not exit timing
- Quality 90+ signals barely move favorably before reversing
See Common Pitfall #54 - MFE data stored in mixed units
Need to filter by createdAt >= '2025-11-23' for accurate analysis
Updated copilot-instructions.md with:
- New ATR multipliers (1.5×/3.0×/2.5× vs 2.0×/4.0×/3.0×)
- Rationale: 0% TP hit rate despite 17-24% avg MFE
- Problem: Targets hit then reversed before monitoring loop detected
- Solution: Tighter targets catch moves before reversal
- 75% close at TP1 (vs 60%) to bank profit immediately
- 25% runner (vs 40%) for extended trends with tighter trail
- Leverage 5× during testing phase
This is MANDATORY documentation update per #1 priority rule.
- Added DUAL REMOTE SETUP section with origin (Gitea) and github (GitHub) configuration
- Documented post-commit hook location and purpose (.git/hooks/post-commit)
- Explained auto-sync to github vs manual push to origin workflow
- Added verification commands for sync status checking
- Included automation setup details and testing confirmation
- Updated commit workflow to reflect hook behavior
- Added recent example (de77cfe test commit) to demonstrate automation
- Created docs/COMMON_PITFALLS.md with all 72 pitfalls
- Organized by severity and category for better navigation
- Added quick reference table and cross-reference index
- Reduced copilot-instructions.md from 6,575 to 3,608 lines (45% reduction)
- Kept Top 10 critical pitfalls in main instructions
- Preserved all git commits, dates, code examples
- Updated docs/README.md with references to new doc
Benefits:
- Faster AI agent context loading
- Easier maintenance and updates
- Better searchability by category
- Clear pattern recognition for similar issues
- Maintains comprehensive knowledge base
User mandate: Manual Telegram trades bypass quality scoring entirely.
Documentation updates:
- Added 'Manual Trade Quality Bypass' section
- Explains user requirement for instant execution
- Documents implementation details (timeframe='manual' detection)
- Clarifies that analytics check is now advisory only
- Notes --force flag no longer needed for manual trades
Context: This is part of the mandatory documentation workflow -
every code change requires corresponding documentation update.
Related commit: 0982578 (quality bypass implementation)
Date: Dec 4, 2025
- New IRON-CLAD RULE: Always search docs before making suggestions or asking questions
- Purpose: Prevent wasting user time with already-answered questions
- Examples: TradingView rate limits, roadmap features, known bugs, configuration
- Workflow: Read request → Search docs → Check if answered → THEN respond
- Applies to: Features, bugs, config, architecture, deployment, troubleshooting
- Red flags: User says 'we already documented this' or 'check docs first'
- Why: User spent months documenting comprehensively, 'NOTHING gets lost' principle
- Impact: Respect user's documentation effort, save time = save money in financial system
Files modified:
- .github/copilot-instructions.md (lines ~103-150, added Rule #5 with examples and workflow)
- Added explicit onboarding workflow at top of file
- 4-step sequence: copilot-instructions → docs/README → README → explore
- Lists all 8 documentation subdirectories with descriptions
- Emphasizes 'NOTHING gets lost' principle
- Ensures new agents have clear entry point without manual explanation