docs: add VERIFICATION MANDATE section to prevent future bugs

Added comprehensive "VERIFICATION MANDATE" section requiring proof before declaring features working. This addresses pattern of bugs slipping through despite documentation. NEW SECTION INCLUDES: - Core principle: "working" = verified with real data, not code review - Critical path verification checklists for: * Position Manager changes (test trade + logs + SQL verification) * Exit logic changes (expected vs actual behavior) * API endpoint changes (curl + database + notifications) * Calculation changes (verbose logging + SQL validation) * SDK integration (never trust docs, verify with console.log) - Red flags requiring extra verification (unit conversions, state transitions, config precedence, display values, timing logic) - SQL verification queries for Position Manager and P&L calculations - Real example: How position.size bug should have been caught with one console.log statement showing tokens vs USD mismatch - Deployment checklist: code review → tests → logs → database → edge cases → documentation → user notification - When to escalate: Don't say "it's working" without proof UPDATED "When Making Changes" section: #9: Position Manager changes require test trade + log monitoring + SQL #10: Calculation changes require verbose logging + SQL verification This creates "prove it works" culture vs "looks like it works". Root cause of recent bugs: confirmation bias without verification. - position.size tokens vs USD: looked right, wasn't tested - leverage display: looked right, notification showed wrong value - Both would've been caught with one test trade + log observation Impact: At $97.55 capital with 15x leverage, each bug costs 5-20% of account. Verification mandate makes this unacceptable going forward.
2025-11-12 13:02:58 +01:00
parent 4404103f6a
commit 381aa168b5
1 changed files with 179 additions and 0 deletions
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -64,6 +64,174 @@
 - Uses real TradingView ADX/ATR/RSI when available, falls back to historical data
 - Penalty for recent losing trades, bonus for winning streaks

+## VERIFICATION MANDATE: Financial Code Requires Proof
+
+**Core Principle:** In trading systems, "working" means "verified with real data", NOT "code looks correct".
+
+**NEVER declare something working without:**
+1. Observing actual logs showing expected behavior
+2. Verifying database state matches expectations
+3. Comparing calculated values to source data
+4. Testing with real trades when applicable
+
+### Critical Path Verification Requirements
+
+**Position Manager Changes:**
+- [ ] Execute test trade with DRY_RUN=false (small size)
+- [ ] Watch docker logs for full TP1 → TP2 → exit cycle
+- [ ] SQL query: verify `tp1Hit`, `slMovedToBreakeven`, `currentSize` match Position Manager logs
+- [ ] Compare Position Manager tracked size to actual Drift position size
+- [ ] Check exit reason matches actual trigger (TP1/TP2/SL/trailing)
+
+**Exit Logic Changes (TP/SL/Trailing):**
+- [ ] Log EXPECTED values (TP1 price, SL price after breakeven, trailing stop distance)
+- [ ] Log ACTUAL values from Drift position and Position Manager state
+- [ ] Verify: Does TP1 hit when price crosses TP1? Does SL move to breakeven?
+- [ ] Test: Open position, let it hit TP1, verify 75% closed + SL moved
+- [ ] Document: What SHOULD happen vs what ACTUALLY happened
+
+**API Endpoint Changes:**
+- [ ] curl test with real payload from TradingView/n8n
+- [ ] Check response JSON matches expectations
+- [ ] Verify database record created with correct fields
+- [ ] Check Telegram notification shows correct values (leverage, size, etc.)
+- [ ] SQL query: confirm all fields populated correctly
+
+**Calculation Changes (P&L, Position Sizing, Percentages):**
+- [ ] Add console.log for EVERY step of calculation
+- [ ] Verify units match (tokens vs USD, percent vs decimal, etc.)
+- [ ] SQL query with manual calculation: does code result match hand calculation?
+- [ ] Test edge cases: 0%, 100%, negative values, very small/large numbers
+
+**SDK/External Data Integration:**
+- [ ] Log raw SDK response to verify assumptions about data format
+- [ ] NEVER trust documentation - verify with console.log
+- [ ] Example: position.size doc said "USD" but logs showed "tokens"
+- [ ] Document actual behavior in Common Pitfalls section
+
+### Red Flags Requiring Extra Verification
+
+**High-Risk Changes:**
+- Unit conversions (tokens ↔ USD, percent ↔ decimal)
+- State transitions (TP1 hit → move SL to breakeven)
+- Configuration precedence (per-symbol vs global vs defaults)
+- Display values from complex calculations (leverage, size, P&L)
+- Timing-dependent logic (grace periods, cooldowns, race conditions)
+
+**Verification Steps for Each:**
+1. **Before declaring working**: Show proof (logs, SQL results, test output)
+2. **After deployment**: Monitor first real trade closely, verify behavior
+3. **Edge cases**: Test boundary conditions (0, 100%, max leverage, min size)
+4. **Regression**: Check that fix didn't break other functionality
+
+### SQL Verification Queries
+
+**After Position Manager changes:**
+```sql
+-- Verify TP1 detection worked correctly
+SELECT 
+  symbol, entryPrice, currentSize, realizedPnL,
+  tp1Hit, slMovedToBreakeven, exitReason,
+  TO_CHAR(createdAt, 'MM-DD HH24:MI') as time
+FROM "Trade"
+WHERE exitReason IS NULL  -- Open positions
+  OR createdAt > NOW() - INTERVAL '1 hour'  -- Recent closes
+ORDER BY createdAt DESC
+LIMIT 5;
+
+-- Compare Position Manager state to expectations
+SELECT configSnapshot->'positionManagerState' as pm_state
+FROM "Trade" 
+WHERE symbol = 'SOL-PERP' AND exitReason IS NULL;
+```
+
+**After calculation changes:**
+```sql
+-- Verify P&L calculations
+SELECT 
+  symbol, direction, entryPrice, exitPrice,
+  positionSize, realizedPnL,
+  -- Manual calculation:
+  CASE 
+    WHEN direction = 'long' THEN 
+      positionSize * ((exitPrice - entryPrice) / entryPrice)
+    ELSE 
+      positionSize * ((entryPrice - exitPrice) / entryPrice)
+  END as expected_pnl,
+  -- Difference:
+  realizedPnL - CASE 
+    WHEN direction = 'long' THEN 
+      positionSize * ((exitPrice - entryPrice) / entryPrice)
+    ELSE 
+      positionSize * ((entryPrice - exitPrice) / entryPrice)
+  END as pnl_difference
+FROM "Trade"
+WHERE exitReason IS NOT NULL
+  AND createdAt > NOW() - INTERVAL '24 hours'
+ORDER BY createdAt DESC
+LIMIT 10;
+```
+
+### Example: How Position.size Bug Should Have Been Caught
+
+**What went wrong:**
+- Read code: "Looks like it's comparing sizes correctly"
+- Declared: "Position Manager is working!"
+- Didn't verify with actual trade
+
+**What should have been done:**
+```typescript
+// In Position Manager monitoring loop - ADD THIS LOGGING:
+console.log('🔍 VERIFICATION:', {
+  positionSizeRaw: position.size,  // What SDK returns
+  positionSizeUSD: position.size * currentPrice,  // Converted to USD
+  trackedSizeUSD: trade.currentSize,  // What we're tracking
+  ratio: (position.size * currentPrice) / trade.currentSize,
+  tp1ShouldTrigger: (position.size * currentPrice) < trade.currentSize * 0.95
+})
+```
+
+Then observe logs on actual trade:
+```
+🔍 VERIFICATION: {
+  positionSizeRaw: 12.28,  // ← AH! This is SOL tokens, not USD!
+  positionSizeUSD: 1950.84,  // ← Correct USD value
+  trackedSizeUSD: 1950.00,
+  ratio: 1.0004,  // ← Should be near 1.0 when position full
+  tp1ShouldTrigger: false  // ← Correct
+}
+```
+
+**Lesson:** One console.log would have exposed the bug immediately.
+
+### Deployment Checklist
+
+Before marking feature complete:
+- [ ] Code review completed
+- [ ] Unit tests pass (if applicable)
+- [ ] Integration test with real API calls
+- [ ] Logs show expected behavior
+- [ ] Database state verified with SQL
+- [ ] Edge cases tested
+- [ ] Documentation updated (including Common Pitfalls if applicable)
+- [ ] User notified of what to verify during first real trade
+
+### When to Escalate to User
+
+**Don't say "it's working" if:**
+- You haven't observed actual logs showing the expected behavior
+- SQL query shows unexpected values
+- Test trade behaved differently than expected
+- You're unsure about unit conversions or SDK behavior
+- Change affects money (position sizing, P&L, exits)
+
+**Instead say:**
+- "Code is updated. Need to verify with test trade - watch for [specific log message]"
+- "Fixed, but requires verification: check database shows [expected value]"
+- "Deployed. First real trade should show [behavior]. If not, there's still a bug."
+
+---
+
 ## Critical Components

 ### 1. Signal Quality Scoring (`lib/trading/signal-quality.ts`)
@@ -839,6 +1007,17 @@ if (!enabled) {
 6. **Modifying quality score logic:** Update BOTH `/api/trading/check-risk` and `/api/trading/execute` endpoints, ensure timeframe-aware thresholds are synchronized
 7. **Exit strategy changes:** Modify Position Manager logic + update on-chain order placement in `placeExitOrders()`
 8. **TradingView alert changes:** Ensure alerts pass `timeframe` field (e.g., `"timeframe": "5"`) to enable proper signal quality scoring
+9. **Position Manager changes:** ALWAYS execute test trade after deployment
+   - Use `/api/trading/test` endpoint or Telegram `long sol --force`
+   - Monitor `docker logs -f trading-bot-v4` for full cycle
+   - Verify TP1 hit → 75% close → SL moved to breakeven
+   - SQL: Check `tp1Hit`, `slMovedToBreakeven`, `currentSize` in Trade table
+   - Compare: Position Manager logs vs actual Drift position size
+10. **Calculation changes:** Add verbose logging and verify with SQL
+    - Log every intermediate step, especially unit conversions
+    - Never assume SDK data format - log raw values to verify
+    - SQL query with manual calculation to compare results
+    - Test boundary cases: 0%, 100%, min/max values

 ## Development Roadmap