docs: Add 1-minute simplified price feed to reduce TradingView alert queue pressure
- Create moneyline_1min_price_feed.pinescript (70% smaller payload)
- Remove ATR/ADX/RSI/VOL/POS from 1-minute alerts (not used for decisions)
- Keep only price + symbol + timeframe for market data cache
- Document rationale in docs/1MIN_SIMPLIFIED_FEED.md
- Fix: 5-minute trading signals being dropped due to 1-minute flood (60/hour)
- Impact: Preserve priority for actual trading signals
242
docs/bugs/README.md
Normal file
@@ -0,0 +1,242 @@
# Bug Reports & Critical Fixes

**Historical record of critical incidents, bugs, and their resolutions.**

This directory contains `CRITICAL_*.md` and `FIXES_*.md` bug reports. Every file documents a real incident that cost money or time, or that could have caused financial loss had it not been caught.

---

## 🚨 Critical Bug Reports

### **Position Management**
- `CRITICAL_FIX_POSITION_SIZE_BUG.md` - Smart Entry using webhook percentage as signal price
  - **Impact:** $89 position sizes instead of $2,300, 97% pullback calculations
  - **Root Cause:** TradingView webhook sent a percentage (70.80), not a price ($142.50)
  - **Fix:** Use Pyth oracle price instead of webhook `signal.price`
  - **Date:** Dec 3, 2025

- `CRITICAL_INCIDENT_UNPROTECTED_POSITION.md` - Database-first pattern violation
  - **Impact:** Position opened with NO database record, NO Position Manager tracking
  - **Root Cause:** Position Manager added before database save
  - **Fix:** Database save FIRST, then Position Manager add
  - **Date:** Nov 13, 2025

- `CRITICAL_TP1_FALSE_DETECTION_BUG.md` - TP1 detection fails when on-chain orders fill fast
  - **Impact:** Winners marked as "SL" exits, analytics incorrect
  - **Root Cause:** External closure detected after both TP1 + runner closed
  - **Fix:** Percentage-based exit reason inference
  - **Date:** Nov 19, 2025

### **Trade Execution**
- `CRITICAL_MISSING_TRADES_NOV19.md` - Trades executed but not in database
  - **Impact:** 3 trades missing from database, P&L values inflated 5-14×
  - **Root Cause:** P&L compounding in external closure detection
  - **Fix:** Don't mutate `trade.realizedPnL` during calculation
  - **Date:** Nov 19, 2025

- `CRITICAL_ISSUES_FOUND.md` - Multiple critical issues discovered
  - Container restart killing positions
  - Phantom detection killing runners
  - Field name mismatches in startup validation
  - **Dates:** Various Nov 2025

---

## 🔧 Fix Documentation

### **Runner System**
- `FIXES_RUNNER_AND_CANCELLATION.md` - Runner stop loss gap fixes
  - **Problem:** No SL protection between TP1 and TP2
  - **Solution:** Explicit runner SL check in monitoring loop
  - **Impact:** Prevented unlimited risk exposure on 25-40% runner
  - **Date:** Nov 15, 2025

### **Applied Fixes**
- `FIXES_APPLIED.md` - Collection of multiple bug fixes
  - TP1 detection
  - P&L calculation
  - External closure handling
  - Order cancellation
  - **Period:** Nov 2025

---

## 📋 Common Bug Patterns

### **1. P&L Compounding (Multiple Incidents)**
**Pattern:** Monitoring loop detects the same closure multiple times and accumulates P&L on each pass
**Symptoms:** Database shows 5-20× actual P&L, duplicate Telegram notifications
**Root Cause:** Async operations interleave with the monitoring loop, so one closure is processed more than once (race condition)
**Solution:** Delete from Map IMMEDIATELY (single synchronous operation) before any async work

**Example Fixes:**
- Common Pitfall #49 (Nov 17, 2025)
- Common Pitfall #61 (Nov 22, 2025)
- Common Pitfall #67 (Dec 2, 2025)
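
The delete-first rule can be sketched as follows. This is a minimal illustration rather than the project's actual code: `ActiveTrade`, `activeTrades`, `claimClosure`, `closurePnL`, and `handleExternalClosure` are hypothetical names, and the P&L formula assumes a long position.

```typescript
// Hypothetical shape of a monitored trade; field names are illustrative.
interface ActiveTrade {
  id: string;
  entryPrice: number;
  exitPrice: number;
  sizeTokens: number;
  realizedPnL: number;
}

const activeTrades = new Map<string, ActiveTrade>();

// Claim a closure exactly once. get() and delete() run synchronously with no
// await between them, so no other handler can interleave on the event loop.
function claimClosure(tradeId: string): ActiveTrade | undefined {
  const trade = activeTrades.get(tradeId);
  if (trade === undefined) return undefined;
  // delete() returns true only for the caller that actually removed the key.
  return activeTrades.delete(tradeId) ? trade : undefined;
}

// Pure calculation (long position assumed): returns the P&L without
// mutating trade.realizedPnL, so repeated calls cannot compound it.
function closurePnL(trade: ActiveTrade): number {
  return (trade.exitPrice - trade.entryPrice) * trade.sizeTokens;
}

async function handleExternalClosure(
  tradeId: string,
  persist: (trade: ActiveTrade, pnl: number) => Promise<void>,
): Promise<boolean> {
  const trade = claimClosure(tradeId);
  if (trade === undefined) return false; // already claimed by an earlier pass
  await persist(trade, closurePnL(trade)); // async work starts only after the claim
  return true;
}
```

Because the claim happens before the first `await`, a second monitoring pass that arrives while `persist` is still running finds nothing in the Map and returns early.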

### **2. Database-First Violations**
**Pattern:** In-memory state updated before database write
**Symptoms:** Container restart loses tracking, positions unprotected
**Root Cause:** Code assumes the database write always succeeds
**Solution:** Always database write FIRST, then update in-memory state

**Example Fixes:**
- Common Pitfall #29 (Nov 13, 2025)
- `CRITICAL_INCIDENT_UNPROTECTED_POSITION.md`
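
A minimal sketch of the database-first ordering, under stated assumptions: `TradeRecord`, `TradeStore`, `trackedPositions`, and `trackPosition` are hypothetical stand-ins, not the project's real Position Manager or database layer.

```typescript
// Hypothetical record and store types, for illustration only.
interface TradeRecord {
  id: string;
  symbol: string;
  sizeUsd: number;
}

interface TradeStore {
  save(trade: TradeRecord): Promise<void>;
}

// Stand-in for the Position Manager's in-memory tracking.
const trackedPositions = new Map<string, TradeRecord>();

async function trackPosition(store: TradeStore, trade: TradeRecord): Promise<void> {
  // 1. Persist first. If save() rejects, the error propagates to the caller
  //    and the in-memory state below is never touched.
  await store.save(trade);
  // 2. Only a persisted trade is handed to in-memory tracking.
  trackedPositions.set(trade.id, trade);
}
```

With this ordering, a failed write surfaces as an error instead of leaving a position that exists only in memory and disappears on the next container restart.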

### **3. External Closure Detection**
**Pattern:** Position closed on-chain, Position Manager detects late
**Symptoms:** Duplicate updates, wrong exit reasons, ghost positions
**Root Cause:** Drift state propagation delay (5-10 seconds)
**Solution:** Verification wait + `closingInProgress` flag + ghost detection

**Example Fixes:**
- Common Pitfall #47 (Nov 16, 2025) - Close verification gap
- Common Pitfall #56 (Nov 20, 2025) - Ghost orders
- Common Pitfall #57 (Nov 20, 2025) - P&L accuracy
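
The verification wait and in-progress flag might look roughly like the sketch below. Everything here is an assumption made for illustration: `PositionSizeQuery` stands in for whatever call returns the remaining on-chain size, the retry count is arbitrary, and the default delay is taken from the lower end of the 5-10 second range above. Ghost detection is not covered.

```typescript
// Hypothetical query: resolves to the remaining on-chain position size
// (0 when the position is fully closed).
type PositionSizeQuery = (symbol: string) => Promise<number>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Symbols whose closure is currently being verified (the in-progress flag).
const closingInProgress = new Set<string>();

// Returns true only when the exchange confirms the position is gone.
// A false result means "do not process the closure yet; keep monitoring".
async function verifyExternalClosure(
  symbol: string,
  querySize: PositionSizeQuery,
  attempts = 3, // assumed retry count
  delayMs = 5000, // lower end of the 5-10 second propagation delay
): Promise<boolean> {
  if (closingInProgress.has(symbol)) return false; // another check is running
  closingInProgress.add(symbol);
  try {
    for (let i = 0; i < attempts; i++) {
      await sleep(delayMs); // give on-chain state time to propagate
      if ((await querySize(symbol)) === 0) return true; // confirmed closed
    }
    return false; // not confirmed within the allowed attempts
  } finally {
    closingInProgress.delete(symbol); // always release the flag
  }
}
```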

### **4. Unit Conversion Errors**
**Pattern:** Tokens vs USD, percentage vs decimal, on-chain vs display units
**Symptoms:** Position sizes off by 100×+, TP/SL at wrong prices
**Root Cause:** SDK returns different units than expected
**Solution:** Always log raw values, verify units explicitly

**Example Fixes:**
- Common Pitfall #22 (Nov 12, 2025) - `position.size` is TOKENS not USD
- Common Pitfall #68 (Dec 3, 2025) - Signal price is percentage not USD
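
As an illustration of explicit unit checks, the sketch below converts a token size to USD and rejects a signal value that cannot plausibly be a price. It is a guard written for this README, not the recorded fix for Pitfall #68 (which uses the oracle price unconditionally). `tokensToUsd`, `resolveEntryPrice`, and the 5% `maxDeviation` default are hypothetical.

```typescript
// Convert a size reported in tokens to USD notional at a given price.
function tokensToUsd(sizeTokens: number, priceUsd: number): number {
  return sizeTokens * priceUsd;
}

interface ResolvedPrice {
  price: number;
  source: "signal" | "oracle";
}

// Use the signal value only if it is within maxDeviation of the oracle price;
// otherwise treat it as a unit error and fall back to the oracle.
function resolveEntryPrice(
  signalPrice: number,
  oraclePrice: number,
  maxDeviation = 0.05, // assumed tolerance (5%)
): ResolvedPrice {
  const deviation = Math.abs(signalPrice - oraclePrice) / oraclePrice;
  return deviation > maxDeviation
    ? { price: oraclePrice, source: "oracle" }
    : { price: signalPrice, source: "signal" };
}
```

With the values from the Dec 3 incident, `resolveEntryPrice(70.80, 142.50)` deviates by roughly 50% from the oracle price, so the guard falls back to the oracle instead of sizing the position from a percentage.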

---

## 🔍 Debugging Checklist

**When investigating bugs:**

1. **Check logs first:**
   ```bash
   docker logs -f trading-bot-v4 | grep -E "CRITICAL|ERROR|Failed"
   ```

2. **Verify deployment:**
   ```bash
   # Container start time MUST be later than the commit timestamp
   docker inspect trading-bot-v4 --format='{{.State.StartedAt}}'
   git log -1 --format='%ai %s'
   ```

3. **Query database state:**
   ```sql
   -- Check for inconsistencies: trades older than one hour with no exit reason.
   -- camelCase column names must be double-quoted in PostgreSQL.
   SELECT * FROM "Trade"
   WHERE "exitReason" IS NULL
     AND "createdAt" < NOW() - INTERVAL '1 hour'
   ORDER BY "createdAt" DESC;
   ```

4. **Compare to Drift:**
   ```bash
   # Get actual position from Drift
   curl -X GET http://localhost:3001/api/trading/positions \
     -H "Authorization: Bearer $API_SECRET_KEY"
   ```

5. **Search Common Pitfalls:**
   ```bash
   grep -n "symptom_keyword" .github/copilot-instructions.md | head -20
   ```
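
The timestamp comparison in step 2 can be automated with a small helper. This is a sketch under assumptions: `parseTimestamp` and `isFixDeployed` are hypothetical names, and the commit timestamp is expected in strict ISO 8601 form, which `git log -1 --format=%aI` produces (the `%ai` form used above is not strict ISO 8601). Docker reports `StartedAt` in UTC with nanosecond precision, so the helper truncates the fractional seconds to milliseconds before parsing.

```typescript
// Parse an ISO 8601 / RFC 3339 timestamp into epoch milliseconds.
// Fractional seconds are truncated to 3 digits so that Docker's
// nanosecond-precision values parse reliably.
function parseTimestamp(iso: string): number {
  const ms = Date.parse(iso.replace(/(\.\d{3})\d+/, "$1"));
  if (Number.isNaN(ms)) throw new Error(`Unparseable timestamp: ${iso}`);
  return ms;
}

// True when the container was started after the commit was authored,
// i.e. the running container can include the fix.
function isFixDeployed(containerStartedAt: string, commitTimestamp: string): boolean {
  return parseTimestamp(containerStartedAt) > parseTimestamp(commitTimestamp);
}
```

Both inputs carry their own UTC offset, so the comparison stays correct even when the commit was authored in a different time zone than the host. A container started after the commit is necessary but not sufficient: the image must also have been rebuilt from that commit.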

---

## 📝 Creating Bug Reports

**Required Sections:**
````markdown
# [Bug Title] (CRITICAL - Fixed [Date])

**Symptom:** [What user observed]

**Root Cause:** [Technical explanation]

**Real Incident ([Date]):**
* [Specific trade or event]
* [Expected behavior]
* [Actual behavior]
* [Financial impact]

**Impact:** [Scope of problem]

**Fix ([Date]):**
```typescript
// Code showing the fix
```

**Files changed:** [List of files]

**Git commit:** [Commit hash]

**Deployed:** [Deployment timestamp]

**Verification Required:**
* [How to test fix works]
* [Expected logs/behavior]

**Lessons Learned:**
1. [Key insight #1]
2. [Key insight #2]
````

**Naming Convention:**
- `CRITICAL_[TOPIC]_BUG.md` - Single critical bug
- `CRITICAL_[TOPIC]_[DATE].md` - Dated incident report
- `FIXES_[SYSTEM].md` - Multiple related fixes

---

## ⚠️ Prevention Guidelines

**From Past Incidents:**

1. **Always verify deployment before declaring "fixed"**
   - Container restart timestamp > commit timestamp
   - Test with actual trade if possible
   - Check logs for expected behavior change

2. **Never trust SDK data formats**
   - Log raw values first
   - Verify units explicitly (tokens vs USD, % vs decimal)
   - Check the SDK docs, but do not rely on them; they are often wrong

3. **Database writes before in-memory updates**
   - Save to database FIRST
   - Only then update Position Manager, caches, etc.
   - If DB fails, don't proceed

4. **Async operations need serialization**
   - Delete from Map IMMEDIATELY (atomic lock)
   - Don't check `Map.has()` then delete later (race condition)
   - Use `Map.delete()` return value as lock

5. **External closures need verification**
   - Wait 5-10 seconds for Drift state propagation
   - Query Drift to confirm position actually closed
   - Keep monitoring if verification fails

---

## 📞 Escalation

**When to declare "CRITICAL":**
- Financial loss occurred or was narrowly avoided
- System produced incorrect data (P&L, positions, exits)
- Position left unprotected (no monitoring, no TP/SL)
- Bug could recur and cause future losses

**Response Required:**
1. Stop trading immediately if data integrity is affected
2. Document incident in this directory
3. Add to Common Pitfalls in `copilot-instructions.md`
4. Fix + deploy + verify within 24 hours
5. Update relevant architecture docs

---

See `../README.md` for overall documentation structure.