
mindesbunister c6b34c45c4 docs: Document closePosition retry logic bug (Common Pitfall #36)

CRITICAL BUG: Missing retry wrapper caused rate limit storm

Real Incident (Nov 15, 16:49 CET):
- Trade cmi0il8l30000r607l8aec701 triggered close attempt
- closePosition() had NO retryWithBackoff() wrapper
- Failed with 429 → Position Manager retried EVERY 2 SECONDS
- 100+ close attempts exhausted Helius rate limit
- On-chain TP2 filled during storm
- External closure detected 8 times: $0.14 → $0.51 (compounding bug)

Why This Was Missed:
- placeExitOrders() got retry wrapper on Nov 14
- openPosition() still has no wrapper (less critical - runs once)
- closePosition() overlooked - MOST CRITICAL because runs in monitoring loop
- Position Manager executeExit() catches 429 and returns early
- But monitoring continues, retries close every 2s = infinite loop

The Fix:
- Wrapped closePosition() placePerpOrder() with retryWithBackoff()
- 8s base delay, 3 max retries (same as placeExitOrders)
- Reduces RPC load by 30-50x during close operations
- Container deployed 18:05 CET Nov 15

Impact: Prevents rate limit exhaustion + duplicate external closure updates

Files: .github/copilot-instructions.md (added Common Pitfall #36)

2025-11-15 18:07:26 +01:00


AI Agent Instructions for Trading Bot v4

Mission & Financial Goals

Primary Objective: Build wealth systematically from $106 → $100,000+ through algorithmic trading

Current Phase: Phase 1 - Survival & Proof (Nov 2025 - Jan 2026)

Current Capital: $97.55 USDC (zero debt, 100% health)
Starting Capital: $106 (Nov 2025)
Target: $2,500 by end of Phase 1 (Month 2.5)
Strategy: Aggressive compounding, 0 withdrawals
Position Sizing: 100% of free collateral (~$97 at 15x leverage = ~$1,463 notional)
Risk Tolerance: EXTREME - This is recovery/proof-of-concept mode
Win Target: 20-30% monthly returns to reach $2,500
Trades Executed: 161 (as of Nov 12, 2025)

Why This Matters for AI Agents:

Every dollar counts at this stage - optimize for profitability, not just safety
User needs this system to work for long-term financial goals ($300-500/month withdrawals starting Month 3)
No changes that reduce win rate unless they improve profit factor
System must prove itself before scaling (see TRADING_GOALS.md for full 8-phase roadmap)

Key Constraints:

Can't afford extended drawdowns (limited capital)
Must maintain 60%+ win rate to compound effectively
Quality over quantity - only trade 60+ signal quality scores (lowered from 65 on Nov 12, 2025)
After 3 consecutive losses, STOP and review system

Architecture Overview

Type: Autonomous cryptocurrency trading bot with Next.js 15 frontend + Solana/Drift Protocol backend

Data Flow: TradingView → n8n webhook → Next.js API → Drift Protocol (Solana DEX) → Real-time monitoring → Auto-exit

CRITICAL: RPC Provider Choice

MUST use Alchemy RPC (https://solana-mainnet.g.alchemy.com/v2/YOUR_API_KEY)
DO NOT use Helius free tier - causes catastrophic rate limiting (239 errors in 10 minutes)
Helius free: 10 req/sec sustained = TOO LOW for trade execution + Position Manager monitoring
Alchemy free: 300M compute units/month = adequate for bot operations
Symptom if wrong RPC: Trades hit SL immediately, duplicate closes, Position Manager loses tracking, database save failures
Fixed Nov 14, 2025: Switched to Alchemy, system now works perfectly (TP1/TP2/runner all functioning)

Key Design Principle: Dual-layer redundancy - every trade has both on-chain orders (Drift) AND software monitoring (Position Manager) as backup.

Exit Strategy: TP2-as-Runner system (CURRENT):

TP1 at +0.4%: Close configurable % (default 75%, adjustable via TAKE_PROFIT_1_SIZE_PERCENT)
TP2 at +0.7%: Activates trailing stop on full remaining % (no position close)
Runner: Remaining % after TP1 with ATR-based trailing stop (default 25%, configurable)
Note: All UI displays dynamically calculate runner% as 100 - TAKE_PROFIT_1_SIZE_PERCENT
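
The exit levels above can be sketched as a small calculation. This is a minimal illustration, not the bot's actual code: `buildExitPlan` and its parameter names are hypothetical, and it assumes the TP percentages are price moves relative to entry.

```typescript
type Direction = 'long' | 'short'

interface ExitPlan {
  tp1Price: number
  tp2Price: number
  tp1ClosePercent: number
  runnerPercent: number
}

// Hypothetical helper: derives exit levels from the percentages quoted above.
function buildExitPlan(
  entryPrice: number,
  direction: Direction,
  tp1SizePercent: number = 75, // mirrors the TAKE_PROFIT_1_SIZE_PERCENT default
  tp1Pct: number = 0.4,
  tp2Pct: number = 0.7
): ExitPlan {
  // Longs profit when price rises, shorts when it falls.
  const sign = direction === 'long' ? 1 : -1
  return {
    tp1Price: entryPrice * (1 + (sign * tp1Pct) / 100),
    tp2Price: entryPrice * (1 + (sign * tp2Pct) / 100),
    tp1ClosePercent: tp1SizePercent,
    // Runner is whatever TP1 leaves open: 100 - TAKE_PROFIT_1_SIZE_PERCENT.
    runnerPercent: 100 - tp1SizePercent,
  }
}
```

With the defaults, a long entered at $100 would have TP1 near $100.40, the TP2 trailing trigger near $100.70, and a 25% runner.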

Per-Symbol Configuration: SOL and ETH have independent enable/disable toggles and position sizing:

SOLANA_ENABLED, SOLANA_POSITION_SIZE, SOLANA_LEVERAGE (defaults: true, 100%, 15x)
ETHEREUM_ENABLED, ETHEREUM_POSITION_SIZE, ETHEREUM_LEVERAGE (defaults: true, 100%, 1x)
BTC and other symbols fall back to global settings (MAX_POSITION_SIZE_USD, LEVERAGE)
Priority: Per-symbol ENV → Market config → Global ENV → Defaults
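
The priority chain can be read as "first defined value wins". A minimal sketch, assuming each layer leaves a setting `undefined` when it has no opinion; `firstDefined`, `resolveLeverage`, and the `LeverageSources` shape are hypothetical names used only for illustration:

```typescript
// Returns the first candidate that is not undefined, in priority order.
function firstDefined<T>(...candidates: Array<T | undefined>): T | undefined {
  return candidates.find((c) => c !== undefined)
}

// Hypothetical shape: one optional value per configuration layer.
interface LeverageSources {
  perSymbolEnv?: number // e.g. SOLANA_LEVERAGE
  marketConfig?: number
  globalEnv?: number // e.g. LEVERAGE
  defaultValue: number
}

// Per-symbol ENV → Market config → Global ENV → Defaults
function resolveLeverage(s: LeverageSources): number {
  return firstDefined(s.perSymbolEnv, s.marketConfig, s.globalEnv) ?? s.defaultValue
}
```

Checking for `undefined` rather than truthiness keeps a legitimate value such as `0` from being skipped.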

Signal Quality System: Filters trades based on 5 metrics (ATR, ADX, RSI, volumeRatio, pricePosition) scored 0-100. Only trades scoring 60+ are executed (lowered from 65 after data analysis showed 60-64 tier outperformed higher scores). Scores stored in database for future optimization.

Timeframe-Aware Scoring: Signal quality thresholds adjust based on timeframe (5min vs daily):

5min: ADX 12+ trending (vs 18+ for daily), ATR 0.2-0.7% healthy (vs 0.4%+ for daily)
Anti-chop filter: -20 points for extreme sideways (ADX <10, ATR <0.25%, Vol <0.9x)
Pass timeframe param to scoreSignalQuality() from TradingView alerts (e.g., timeframe: "5")

MAE/MFE Tracking: Every trade tracks Maximum Favorable Excursion (best profit %) and Maximum Adverse Excursion (worst loss %) updated every 2s. Used for data-driven optimization of TP/SL levels.
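
A minimal sketch of the per-tick update, assuming P&L is expressed in percent and both values start at 0 when the trade opens. `updateExcursions` is a hypothetical helper; only the two field names come from the Trade model described later in this document.

```typescript
interface Excursions {
  maxFavorableExcursion: number // best profit % seen so far
  maxAdverseExcursion: number // worst loss % seen so far (zero or negative)
}

// Called on every price update with the current unrealized P&L in percent.
function updateExcursions(prev: Excursions, currentPnlPercent: number): Excursions {
  return {
    maxFavorableExcursion: Math.max(prev.maxFavorableExcursion, currentPnlPercent),
    maxAdverseExcursion: Math.min(prev.maxAdverseExcursion, currentPnlPercent),
  }
}
```

Feeding the P&L series 0.2, -0.5, 0.6, 0.1 through this function leaves MFE at 0.6 and MAE at -0.5.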

Manual Trading via Telegram: Send plain-text messages like long sol, short eth, long btc to open positions instantly (bypasses n8n, calls /api/trading/execute directly with preset healthy metrics). CRITICAL: Manual trades are marked with signalSource='manual' and excluded from TradingView indicator analysis (prevents data contamination).

Re-Entry Analytics System: Manual trades are validated before execution using fresh TradingView data:

Market data cached from TradingView signals (5min expiry)
/api/analytics/reentry-check scores re-entry based on fresh metrics + recent performance
Telegram bot blocks low-quality re-entries unless --force flag used
Uses real TradingView ADX/ATR/RSI when available, falls back to historical data
Penalty for recent losing trades, bonus for winning streaks

VERIFICATION MANDATE: Financial Code Requires Proof

CRITICAL: THIS IS A REAL MONEY TRADING SYSTEM - NOT A TOY PROJECT

Core Principle: In trading systems, "working" means "verified with real data", NOT "code looks correct".

NEVER declare something working without:

Observing actual logs showing expected behavior
Verifying database state matches expectations
Comparing calculated values to source data
Testing with real trades when applicable
CONFIRMING CODE IS DEPLOYED - Check container start time vs commit time

CODE COMMITTED ≠ CODE DEPLOYED

Git commit at 15:56 means NOTHING if container started at 15:06
ALWAYS verify: docker logs trading-bot-v4 | grep "Server starting" | head -1
Compare container start time to commit timestamp
If container older than commit: CODE NOT DEPLOYED, FIX NOT ACTIVE
Never say "fixed" or "protected" until deployment verified
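
The timestamp comparison can be expressed as a one-line check. This is a sketch with a hypothetical helper name; it assumes both timestamps have already been parsed (for example from the "Server starting" log line and from `git log`). A newer container is necessary but not sufficient: the image must also have been rebuilt from the new commit.

```typescript
// Hypothetical helper: true only if the container started after the commit was made.
function isCommitDeployed(containerStartedAt: Date, commitAt: Date): boolean {
  return containerStartedAt.getTime() > commitAt.getTime()
}

// The example from the text: commit at 15:56, container started at 15:06.
const deployed = isCommitDeployed(
  new Date('2025-11-15T15:06:00+01:00'),
  new Date('2025-11-15T15:56:00+01:00')
)
console.log(deployed ? 'Container is newer than commit' : 'CODE NOT DEPLOYED, FIX NOT ACTIVE')
```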

Critical Path Verification Requirements

Position Manager Changes:

Execute test trade with DRY_RUN=false (small size)
Watch docker logs for full TP1 → TP2 → exit cycle
SQL query: verify tp1Hit, slMovedToBreakeven, currentSize match Position Manager logs
Compare Position Manager tracked size to actual Drift position size
Check exit reason matches actual trigger (TP1/TP2/SL/trailing)

Exit Logic Changes (TP/SL/Trailing):

Log EXPECTED values (TP1 price, SL price after breakeven, trailing stop distance)
Log ACTUAL values from Drift position and Position Manager state
Verify: Does TP1 hit when price crosses TP1? Does SL move to breakeven?
Test: Open position, let it hit TP1, verify 75% closed + SL moved
Document: What SHOULD happen vs what ACTUALLY happened

API Endpoint Changes:

curl test with real payload from TradingView/n8n
Check response JSON matches expectations
Verify database record created with correct fields
Check Telegram notification shows correct values (leverage, size, etc.)
SQL query: confirm all fields populated correctly

Calculation Changes (P&L, Position Sizing, Percentages):

Add console.log for EVERY step of calculation
Verify units match (tokens vs USD, percent vs decimal, etc.)
SQL query with manual calculation: does code result match hand calculation?
Test edge cases: 0%, 100%, negative values, very small/large numbers

SDK/External Data Integration:

Log raw SDK response to verify assumptions about data format
NEVER trust documentation - verify with console.log
Example: position.size doc said "USD" but logs showed "tokens"
Document actual behavior in Common Pitfalls section

Red Flags Requiring Extra Verification

High-Risk Changes:

Unit conversions (tokens ↔ USD, percent ↔ decimal)
State transitions (TP1 hit → move SL to breakeven)
Configuration precedence (per-symbol vs global vs defaults)
Display values from complex calculations (leverage, size, P&L)
Timing-dependent logic (grace periods, cooldowns, race conditions)

Verification Steps for Each:

Before declaring working: Show proof (logs, SQL results, test output)
After deployment: Monitor first real trade closely, verify behavior
Edge cases: Test boundary conditions (0, 100%, max leverage, min size)
Regression: Check that fix didn't break other functionality

SQL Verification Queries

After Position Manager changes:

-- Verify TP1 detection worked correctly
SELECT 
  symbol, entryPrice, currentSize, realizedPnL,
  tp1Hit, slMovedToBreakeven, exitReason,
  TO_CHAR(createdAt, 'MM-DD HH24:MI') as time
FROM "Trade"
WHERE exitReason IS NULL  -- Open positions
  OR createdAt > NOW() - INTERVAL '1 hour'  -- Recent closes
ORDER BY createdAt DESC
LIMIT 5;

-- Compare Position Manager state to expectations
SELECT configSnapshot->'positionManagerState' as pm_state
FROM "Trade" 
WHERE symbol = 'SOL-PERP' AND exitReason IS NULL;

After calculation changes:

-- Verify P&L calculations
SELECT 
  symbol, direction, entryPrice, exitPrice,
  positionSize, realizedPnL,
  -- Manual calculation:
  CASE 
    WHEN direction = 'long' THEN 
      positionSize * ((exitPrice - entryPrice) / entryPrice)
    ELSE 
      positionSize * ((entryPrice - exitPrice) / entryPrice)
  END as expected_pnl,
  -- Difference:
  realizedPnL - CASE 
    WHEN direction = 'long' THEN 
      positionSize * ((exitPrice - entryPrice) / entryPrice)
    ELSE 
      positionSize * ((entryPrice - exitPrice) / entryPrice)
  END as pnl_difference
FROM "Trade"
WHERE exitReason IS NOT NULL
  AND createdAt > NOW() - INTERVAL '24 hours'
ORDER BY createdAt DESC
LIMIT 10;
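
The manual calculation in the query above can be mirrored in code for spot checks. A minimal sketch, assuming `positionSize` is the USD notional as in the SQL; `expectedPnl` and `pnlDifference` are hypothetical helper names:

```typescript
type Direction = 'long' | 'short'

// Same formula as the CASE expression: size × fractional price move,
// with the sign flipped for shorts.
function expectedPnl(
  direction: Direction,
  positionSize: number,
  entryPrice: number,
  exitPrice: number
): number {
  const move =
    direction === 'long'
      ? (exitPrice - entryPrice) / entryPrice
      : (entryPrice - exitPrice) / entryPrice
  return positionSize * move
}

// Difference between the stored P&L and the hand calculation (should be near 0).
function pnlDifference(
  realizedPnL: number,
  direction: Direction,
  positionSize: number,
  entryPrice: number,
  exitPrice: number
): number {
  return realizedPnL - expectedPnl(direction, positionSize, entryPrice, exitPrice)
}
```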

Example: How Position.size Bug Should Have Been Caught

What went wrong:

Read code: "Looks like it's comparing sizes correctly"
Declared: "Position Manager is working!"
Didn't verify with actual trade

What should have been done:

// In Position Manager monitoring loop - ADD THIS LOGGING:
console.log('🔍 VERIFICATION:', {
  positionSizeRaw: position.size,  // What SDK returns
  positionSizeUSD: position.size * currentPrice,  // Converted to USD
  trackedSizeUSD: trade.currentSize,  // What we're tracking
  ratio: (position.size * currentPrice) / trade.currentSize,
  tp1ShouldTrigger: (position.size * currentPrice) < trade.currentSize * 0.95
})

Then observe logs on actual trade:

🔍 VERIFICATION: {
  positionSizeRaw: 12.28,  // ← AH! This is SOL tokens, not USD!
  positionSizeUSD: 1950.84,  // ← Correct USD value
  trackedSizeUSD: 1950.00,
  ratio: 1.0004,  // ← Should be near 1.0 when position full
  tp1ShouldTrigger: false  // ← Correct
}

Lesson: One console.log would have exposed the bug immediately.

Deployment Checklist

MANDATORY PRE-DEPLOYMENT VERIFICATION:

Check container start time: docker logs trading-bot-v4 | grep "Server starting" | head -1
Compare to commit timestamp: Container MUST be newer than code changes
If container older: STOP - Code not deployed, fix not active
Never declare "fixed" or "working" until container restarted with new code

Before marking feature complete:

Code review completed
Unit tests pass (if applicable)
Integration test with real API calls
Logs show expected behavior
Database state verified with SQL
Edge cases tested
Container restarted and verified running new code
Documentation updated (including Common Pitfalls if applicable)
User notified of what to verify during first real trade

When to Escalate to User

Don't say "it's working" if:

You haven't observed actual logs showing the expected behavior
SQL query shows unexpected values
Test trade behaved differently than expected
You're unsure about unit conversions or SDK behavior
Change affects money (position sizing, P&L, exits)
Container hasn't been restarted since code commit

Instead say:

"Code is updated. Need to verify with test trade - watch for [specific log message]"
"Fixed, but requires verification: check database shows [expected value]"
"Deployed. First real trade should show [behavior]. If not, there's still a bug."
"Code committed but NOT deployed - container running old version, fix not active yet"

Docker Build Best Practices

CRITICAL: Prevent build interruptions with background execution + live monitoring

Docker builds take 40-70 seconds and are easily interrupted by terminal issues. Use this pattern:

# Start build in background with live log tail
cd /home/icke/traderv4 && docker compose build trading-bot > /tmp/docker-build-live.log 2>&1 & BUILD_PID=$!; echo "Build started, PID: $BUILD_PID"; tail -f /tmp/docker-build-live.log

Why this works:

Build runs in background (&) - Ctrl+C in the foreground does not kill it; prefix the build command with nohup if the terminal itself may disconnect
Output redirected to log file - can review later if needed
tail -f shows real-time progress - see compilation, linting, errors
Can Ctrl+C the tail -f without killing build - build continues
Verification after: tail -50 /tmp/docker-build-live.log to check success

Success indicators:

✓ Compiled successfully in 27s
✓ Generating static pages (30/30)
#22 naming to docker.io/library/traderv4-trading-bot done
DONE X.Xs on final step

Failure indicators:

Failed to compile.
Type error:
ERROR: process "/bin/sh -c npm run build" did not complete successfully: exit code: 1

After successful build:

# Deploy new container
docker compose up -d --force-recreate trading-bot

# Verify it started
docker logs --tail=30 trading-bot-v4

# Confirm deployed version
docker logs trading-bot-v4 | grep "Server starting" | head -1

DO NOT use: docker compose build trading-bot in foreground - one network hiccup kills 60s of work

Docker Cleanup After Builds

CRITICAL: Prevent disk full issues from build cache accumulation

Docker builds create intermediate layers (1.3+ GB per build) that accumulate over time. Build cache can reach 40-50 GB after frequent rebuilds.

After successful deployment, clean up:

# Remove dangling images (old builds)
docker image prune -f

# Remove build cache (biggest space hog - 40+ GB typical)
docker builder prune -f

# Optional: Remove dangling volumes (if no important data)
docker volume prune -f

# Check space saved
docker system df

When to run:

After each successful deployment (recommended)
Weekly if building frequently
When disk space warnings appear
Before major updates/migrations

Space typically freed:

Dangling images: 2-5 GB
Build cache: 40-50 GB
Dangling volumes: 0.5-1 GB
Total: 40-55 GB per cleanup

What's safe to delete:

<none> tagged images (old builds)
Build cache (recreated on next build)
Dangling volumes (orphaned from removed containers)

What NOT to delete:

Named volumes (contain data: trading-bot-postgres, etc.)
Active containers
Tagged images currently in use

Critical Components

1. Phantom Trade Auto-Closure System

Purpose: Automatically close positions when size mismatch detected (position opened but wrong size)

When triggered:

Position opened on Drift successfully
Expected size: $50 (50% @ 1x leverage)
Actual size: $1.37 (~2.7% fill - likely stale oracle price or exchange rejection)
Size ratio < 50% threshold → phantom detected
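
The detection rule above reduces to a ratio check. A minimal sketch, assuming both sizes are USD notionals; `detectPhantom` is a hypothetical name and the real check in `app/api/trading/execute/route.ts` may differ:

```typescript
interface PhantomCheck {
  ratio: number // actual size as a fraction of expected size
  isPhantom: boolean
}

// Flags a fill as phantom when the actual size is below minRatio of the expected size.
function detectPhantom(
  expectedSizeUSD: number,
  actualSizeUSD: number,
  minRatio: number = 0.5 // the 50% threshold quoted above
): PhantomCheck {
  // Guard against division by zero for a nonsensical expected size.
  const ratio = expectedSizeUSD > 0 ? actualSizeUSD / expectedSizeUSD : 0
  return { ratio, isPhantom: ratio < minRatio }
}
```

For the notification example below ($48.75 expected, $1.37 actual) the ratio is about 0.028, well under the threshold.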

Automated response (all happens in <1 second):

Immediate closure: Market order closes 100% of phantom position
Database logging: Creates trade record with status='phantom', saves P&L
n8n notification: Returns HTTP 200 with full details (not 500 - allows workflow to continue)
Telegram alert: Message includes entry/exit prices, P&L, reason, transaction IDs

Why auto-close instead of manual intervention:

User may be asleep, away from devices, unavailable for hours
Unmonitored position = unlimited risk exposure
Position Manager won't track phantom (by design)
No TP/SL protection, no trailing stop, no monitoring
Better to exit with small loss/gain than leave position exposed
Re-entry always possible if setup was actually good

Example notification:

⚠️ PHANTOM TRADE AUTO-CLOSED

Symbol: SOL-PERP
Direction: LONG
Expected Size: $48.75
Actual Size: $1.37 (2.8%)

Entry: $168.50
Exit: $168.45
P&L: -$0.02

Reason: Size mismatch detected - likely oracle price issue or exchange rejection
Action: Position auto-closed for safety (unmonitored positions = risk)

TX: 5Yx2Fm8vQHKLdPaw...

Database tracking:

status='phantom' field identifies these trades
isPhantom=true, phantomReason='ORACLE_PRICE_MISMATCH'
expectedSizeUSD, actualSizeUSD fields for analysis
Exit reason: 'manual' (phantom auto-close category)
Enables post-trade analysis of phantom frequency and patterns

Code location: app/api/trading/execute/route.ts lines 322-445

2. Signal Quality Scoring (`lib/trading/signal-quality.ts`)

Purpose: Unified quality validation system that scores trading signals 0-100 based on 5 market metrics

Timeframe-aware thresholds:

scoreSignalQuality({ 
  atr, adx, rsi, volumeRatio, pricePosition, 
  timeframe?: string // "5" for 5min, undefined for higher timeframes
})

5min chart adjustments:

ADX healthy range: 12-22 (vs 18-30 for daily)
ATR healthy range: 0.2-0.7% (vs 0.4%+ for daily)
Anti-chop filter: -20 points for extreme sideways (ADX <10, ATR <0.25%, Vol <0.9x)
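
A sketch of the anti-chop rule only, not the full scorer. It makes two assumptions that the text does not state: the three conditions must all hold at once, and the penalty applies only when `timeframe` is `"5"`. `antiChopPenalty` is a hypothetical name.

```typescript
interface ChopInputs {
  adx: number
  atrPercent: number // ATR as a percentage of price
  volumeRatio: number // current volume relative to average
}

// Returns the score adjustment (0 or -20) for extreme sideways conditions.
function antiChopPenalty(m: ChopInputs, timeframe?: string): number {
  if (timeframe !== '5') return 0 // assumption: 5min charts only
  // Assumption: all three thresholds must be breached together.
  const extremeSideways = m.adx < 10 && m.atrPercent < 0.25 && m.volumeRatio < 0.9
  return extremeSideways ? -20 : 0
}
```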

Price position penalties (all timeframes):

Long at 90-95%+ range: -15 to -30 points (chasing highs)
Short at <5-10% range: -15 to -30 points (chasing lows)
Prevents flip-flop losses from entering range extremes

Key behaviors:

Returns score 0-100 and detailed breakdown object
Minimum score 60 required to execute trade
Called by both /api/trading/check-risk and /api/trading/execute
Scores saved to database for post-trade analysis

2. Position Manager (`lib/trading/position-manager.ts`)

Purpose: Software-based monitoring loop that checks prices every 2 seconds and closes positions via market orders

Singleton pattern: Always use getInitializedPositionManager() - never instantiate directly

const positionManager = await getInitializedPositionManager()
await positionManager.addTrade(activeTrade)

Key behaviors:

Tracks ActiveTrade objects in a Map
TP2-as-Runner system: TP1 (configurable %, default 75%) → TP2 trigger (no close, activate trailing) → Runner (remaining %) with ATR-based trailing stop
Dynamic SL adjustments: Moves to breakeven after TP1, locks profit at +1.2%
On-chain order synchronization: After TP1 hits, calls cancelAllOrders() then placeExitOrders() with updated SL price at breakeven (uses retryWithBackoff() for rate limit handling)
ATR-based trailing stop: Calculates trail distance as (atrAtEntry / currentPrice × 100) × trailingStopAtrMultiplier, clamped between min/max %
Trailing stop: Activates when TP2 price hit, tracks peakPrice and trails dynamically
Closes positions via closePosition() market orders when targets hit
Acts as backup if on-chain orders don't fill
State persistence: Saves to database, restores on restart via configSnapshot.positionManagerState
Startup validation: On container restart, cross-checks last 24h "closed" trades against Drift to detect orphaned positions (see lib/startup/init-position-manager.ts)
Grace period for new trades: Skips "external closure" detection for positions <30 seconds old (Drift positions take 5-10s to propagate)
Exit reason detection: Uses trade state flags (tp1Hit, tp2Hit) and realized P&L to determine exit reason, NOT current price (avoids misclassification when price moves after order fills)
Real P&L calculation: Calculates actual profit based on entry vs exit price, not SDK's potentially incorrect values
Rate limit-aware exit: On 429 errors during close, keeps trade in monitoring (doesn't mark closed), retries naturally on next price update
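
The ATR-based trail distance formula from the list above, written out as a sketch. The function names and the min/max values used in the example are hypothetical; only the formula itself comes from the text.

```typescript
type Direction = 'long' | 'short'

// (atrAtEntry / currentPrice × 100) × multiplier, clamped to [minPercent, maxPercent].
function trailingDistancePercent(
  atrAtEntry: number,
  currentPrice: number,
  trailingStopAtrMultiplier: number,
  minPercent: number,
  maxPercent: number
): number {
  const atrPercent = (atrAtEntry / currentPrice) * 100
  const raw = atrPercent * trailingStopAtrMultiplier
  return Math.min(maxPercent, Math.max(minPercent, raw))
}

// Stop price trails the peak: below it for longs, above it for shorts.
function trailingStopPrice(direction: Direction, peakPrice: number, distancePercent: number): number {
  const factor = direction === 'long' ? 1 - distancePercent / 100 : 1 + distancePercent / 100
  return peakPrice * factor
}
```

For example, an ATR of 0.8 at a price of 160 is 0.5% of price; with a 1.5 multiplier and illustrative bounds of 0.25% and 0.9% the trail distance is 0.75%.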

3. Telegram Bot (`telegram_command_bot.py`)

Purpose: Python-based Telegram bot for manual trading commands and position status monitoring

Manual trade commands via plain text:

# User sends plain text message (not slash commands)
"long sol"          → Validates via analytics, then opens SOL-PERP long
"short eth"         → Validates via analytics, then opens ETH-PERP short
"long btc --force"  → Skips analytics validation, opens BTC-PERP long immediately

Key behaviors:

MessageHandler processes all text messages (not just commands)
Maps user-friendly symbols (sol, eth, btc) to Drift format (SOL-PERP, etc.)
Analytics validation: Calls /api/analytics/reentry-check before execution
- Blocks trades with score <55 unless --force flag used
- Uses fresh TradingView data (<5min old) when available
- Falls back to historical metrics with penalty
- Considers recent trade performance (last 3 trades)
Calls /api/trading/execute directly with preset healthy metrics (ATR=0.45, ADX=32, RSI=58/42)
Bypasses n8n workflow and TradingView requirements
60-second timeout for API calls
Responds with trade confirmation or analytics rejection message

Status command:

/status → Returns JSON of open positions from Drift

Implementation details:

Uses python-telegram-bot library
Deployed via docker-compose.telegram-bot.yml
Requires TELEGRAM_BOT_TOKEN and TELEGRAM_CHANNEL_ID in .env
API calls to http://trading-bot:3000/api/trading/execute

Drift client integration:

Singleton pattern: Use initializeDriftService() and getDriftService() - maintains single connection

const driftService = await initializeDriftService()
const health = await driftService.getAccountHealth()

Wallet handling: Supports both JSON array [91,24,...] and base58 string formats from Phantom wallet

4. Rate Limit Monitoring (`lib/drift/orders.ts` + `app/api/analytics/rate-limits`)

Purpose: Track and analyze Solana RPC rate limiting (429 errors) to prevent silent failures

Helius RPC Limits (Free Tier):

Burst: 100 requests/second
Sustained: 10 requests/second
Monthly: 100k requests
See docs/HELIUS_RATE_LIMITS.md for upgrade recommendations

Retry mechanism with exponential backoff (Nov 14, 2025 - Updated):

await retryWithBackoff(async () => {
  return await driftClient.cancelOrders(...)
}, 3, 5000) // maxRetries = 3, baseDelay = 5000 ms (increased from 2s to 5s)

Progression: 5s → 10s → 20s (vs old 2s → 4s → 8s)
Rationale: Gives Helius time to recover, reduces cascade pressure by 2.5x

Database logging: Three event types in SystemEvent table:

rate_limit_hit: Each 429 error (logged with attempt #, delay, error snippet)
rate_limit_recovered: Successful retry (logged with total time, retry count)
rate_limit_exhausted: Failed after max retries (CRITICAL - order operation failed)

Analytics endpoint:

curl http://localhost:3001/api/analytics/rate-limits

Returns: Total hits/recoveries/failures, hourly patterns, recovery times, success rate

Key behaviors:

Only these RPC calls are wrapped: cancelAllOrders(), placeExitOrders(), closePosition()
Position Manager monitoring: Event-driven via Pyth WebSocket (not polling)
Rate limit-aware exit: Position Manager keeps monitoring on 429 errors (retries naturally)
Logs to both console and database for post-trade analysis

Monitoring queries: See docs/RATE_LIMIT_MONITORING.md for SQL queries

Startup Position Validation (Nov 14, 2025 - Added): On container startup, cross-checks last 24h of "closed" trades against actual Drift positions:

If DB says closed but Drift shows open → reopens in DB to restore Position Manager tracking
Prevents orphaned positions from failed close transactions
Logs: 🔴 CRITICAL: ${symbol} marked as CLOSED in DB but still OPEN on Drift!
Implementation: lib/startup/init-position-manager.ts - validateOpenTrades()
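
The cross-check can be sketched as a set lookup. This is an illustration under assumptions, not the code in `validateOpenTrades()`: the row shape and `findOrphanedTrades` are hypothetical, and positions are matched by symbol only.

```typescript
// Hypothetical minimal view of a trade the database considers closed.
interface ClosedTradeRow {
  id: string
  symbol: string
}

// Returns trades marked closed in the DB whose symbol still has an open position on Drift.
function findOrphanedTrades(
  closedInDb: ClosedTradeRow[],
  openSymbolsOnDrift: Set<string>
): ClosedTradeRow[] {
  return closedInDb.filter((t) => openSymbolsOnDrift.has(t.symbol))
}

const orphans = findOrphanedTrades(
  [
    { id: 'a', symbol: 'SOL-PERP' },
    { id: 'b', symbol: 'ETH-PERP' },
  ],
  new Set(['SOL-PERP'])
)
for (const t of orphans) {
  console.log(`🔴 CRITICAL: ${t.symbol} marked as CLOSED in DB but still OPEN on Drift!`)
}
```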

5. Order Placement (`lib/drift/orders.ts`)

Critical functions:

openPosition() - Opens market position with transaction confirmation
closePosition() - Closes position with transaction confirmation
placeExitOrders() - Places TP/SL orders on-chain
cancelAllOrders() - Cancels all reduce-only orders for a market

CRITICAL: Transaction Confirmation Pattern

Both openPosition() and closePosition() MUST confirm transactions on-chain:

const txSig = await driftClient.placePerpOrder(orderParams)
console.log('⏳ Confirming transaction on-chain...')
const connection = driftService.getConnection()
const confirmation = await connection.confirmTransaction(txSig, 'confirmed')

if (confirmation.value.err) {
  throw new Error(`Transaction failed: ${JSON.stringify(confirmation.value.err)}`)
}
console.log('✅ Transaction confirmed on-chain')

Without this, the SDK returns signatures for transactions that never execute, causing phantom trades/closes.

CRITICAL: Drift SDK position.size is BASE ASSET TOKENS, not USD

The Drift SDK returns position.size as token quantity (SOL/ETH/BTC), NOT USD notional:

// CORRECT: Convert tokens to USD by multiplying by current price
const positionSizeUSD = Math.abs(position.size) * currentPrice

// WRONG: Using position.size directly as USD (off by 150x+ for SOL!)
const positionSizeUSD = Math.abs(position.size)

This affects Position Manager's TP1/TP2 detection - if position.size is not converted to USD before comparing to tracked USD values, the system will never detect partial closes correctly. See Common Pitfall #22 for the full bug details and fix applied Nov 12, 2025.

Solana RPC Rate Limiting with Exponential Backoff

Solana RPC endpoints return 429 errors under load. Always use retry logic for order operations:

export async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxRetries: number = 3,
  initialDelay: number = 5000  // Increased from 2000ms to 5000ms (Nov 14, 2025)
): Promise<T> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await operation()
    } catch (error: any) {
      if (error?.message?.includes('429') && attempt < maxRetries - 1) {
        const delay = initialDelay * Math.pow(2, attempt)
        console.log(`⏳ Rate limited, retrying in ${delay/1000}s... (attempt ${attempt + 1}/${maxRetries})`)
        await new Promise(resolve => setTimeout(resolve, delay))
        continue
      }
      throw error
    }
  }
  throw new Error('Max retries exceeded')
}

// Usage in cancelAllOrders
await retryWithBackoff(() => driftClient.cancelOrders(...))

Note: Increased from 2s to 5s base delay to give Helius RPC more recovery time. See docs/HELIUS_RATE_LIMITS.md for detailed analysis. Without this, order cancellations fail silently during TP1→breakeven order updates, leaving ghost orders that cause incorrect fills.
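
To see which waits a given configuration actually produces, the delay schedule of `retryWithBackoff()` as listed above can be computed separately. `backoffDelays` is a hypothetical helper that reproduces the loop's delay arithmetic. Note that the function as written sleeps only between attempts, so `maxRetries = 3` means three attempts separated by two waits; a third wait would require a fourth attempt.

```typescript
// Delays (in ms) that retryWithBackoff() as listed above would sleep
// if every attempt failed with a 429.
function backoffDelays(maxRetries: number = 3, initialDelay: number = 5000): number[] {
  const delays: number[] = []
  // The last attempt rethrows instead of sleeping, hence maxRetries - 1.
  for (let attempt = 0; attempt < maxRetries - 1; attempt++) {
    delays.push(initialDelay * Math.pow(2, attempt))
  }
  return delays
}

console.log(backoffDelays(3, 5000)) // [ 5000, 10000 ]
console.log(backoffDelays(4, 5000)) // [ 5000, 10000, 20000 ]
```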

Dual Stop System (USE_DUAL_STOPS=true):

// Soft stop: TRIGGER_LIMIT at -1.5% (avoids wicks)
// Hard stop: TRIGGER_MARKET at -2.5% (guarantees exit)
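
A sketch of how the two stop prices could be derived from entry. It assumes the -1.5% and -2.5% figures are adverse price moves relative to entry (not leveraged P&L); `dualStopPrices` is a hypothetical name.

```typescript
type Direction = 'long' | 'short'

interface DualStops {
  softStopPrice: number // TRIGGER_LIMIT level
  hardStopPrice: number // TRIGGER_MARKET level
}

// Stops sit below entry for longs and above entry for shorts.
function dualStopPrices(
  entryPrice: number,
  direction: Direction,
  softPct: number = 1.5,
  hardPct: number = 2.5
): DualStops {
  const sign = direction === 'long' ? -1 : 1
  return {
    softStopPrice: entryPrice * (1 + (sign * softPct) / 100),
    hardStopPrice: entryPrice * (1 + (sign * hardPct) / 100),
  }
}
```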

Order types:

Entry: MARKET (immediate execution)
TP1/TP2: LIMIT reduce-only orders
Soft SL: TRIGGER_LIMIT reduce-only
Hard SL: TRIGGER_MARKET reduce-only

6. Database (`lib/database/trades.ts` + `prisma/schema.prisma`)

Purpose: PostgreSQL via Prisma ORM for trade history and analytics

Models: Trade, PriceUpdate, SystemEvent, DailyStats, BlockedSignal

Singleton pattern: Use getPrismaClient() - never instantiate PrismaClient directly

Key functions:

createTrade() - Save trade after execution (includes dual stop TX signatures + signalQualityScore)
updateTradeExit() - Record exit with P&L
addPriceUpdate() - Track price movements (called by Position Manager)
getTradeStats() - Win rate, profit factor, avg win/loss
getLastTrade() - Fetch most recent trade for analytics dashboard
createBlockedSignal() - Save blocked signals for data-driven optimization analysis
getRecentBlockedSignals() - Query recent blocked signals
getBlockedSignalsForAnalysis() - Fetch signals needing price analysis (future automation)

Important fields:

signalSource (String?) - Identifies trade origin: 'tradingview', 'manual', or NULL (old trades)
- CRITICAL: Manual Telegram trades are marked signalSource='manual' and excluded from TradingView indicator analysis
- Use filter: WHERE ("signalSource" IS NULL OR "signalSource" != 'manual') for indicator optimization queries
- See docs/MANUAL_TRADE_FILTERING.md for complete SQL filtering guide
signalQualityScore (Int?) - 0-100 score for data-driven optimization
signalQualityVersion (String?) - Tracks which scoring logic was used ('v1', 'v2', 'v3', 'v4')
- v1: Original logic (price position < 5% threshold)
- v2: Added volume compensation for low ADX (2025-11-07)
- v3: Stricter breakdown requirements: positions < 15% require (ADX > 18 AND volume > 1.2x) OR (RSI < 35 for shorts / RSI > 60 for longs)
- v4: CURRENT - Blocked signals tracking enabled for data-driven threshold optimization (2025-11-11)
- All new trades tagged with current version for comparative analysis
maxFavorableExcursion / maxAdverseExcursion - Track best/worst P&L during trade lifetime
maxFavorablePrice / maxAdversePrice - Track prices at MFE/MAE points
configSnapshot (Json) - Stores Position Manager state for crash recovery
atr, adx, rsi, volumeRatio, pricePosition - Context metrics from TradingView

BlockedSignal model fields (NEW):

Signal metrics: atr, adx, rsi, volumeRatio, pricePosition, timeframe
Quality scoring: signalQualityScore, signalQualityVersion, scoreBreakdown (JSON), minScoreRequired
Block tracking: blockReason (QUALITY_SCORE_TOO_LOW, COOLDOWN_PERIOD, HOURLY_TRADE_LIMIT, etc.), blockDetails
Future analysis: priceAfter1/5/15/30Min, wouldHitTP1/TP2/SL, analysisComplete
Automatically saved by check-risk endpoint when signals are blocked
Enables data-driven optimization: collect 10-20 blocked signals → analyze patterns → adjust thresholds

Per-symbol functions:

getLastTradeTimeForSymbol(symbol) - Get last trade time for specific coin (enables per-symbol cooldown)
Each coin (SOL/ETH/BTC) has independent cooldown timer to avoid missed opportunities

Configuration System

Three-layer merge:

DEFAULT_TRADING_CONFIG (config/trading.ts)
Environment variables (.env) via getConfigFromEnv()
Runtime overrides via getMergedConfig(overrides)

Always use: getMergedConfig() to get final config - never read env vars directly in business logic
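
The three-layer merge can be pictured as successive object spreads, later layers winning. This is a sketch with hypothetical names and a reduced config shape, not the contents of `config/trading.ts`:

```typescript
// Hypothetical, reduced config shape for illustration only.
interface SketchConfig {
  leverage: number
  maxPositionSizeUSD: number
  useDualStops: boolean
}

const SKETCH_DEFAULTS: SketchConfig = {
  leverage: 1,
  maxPositionSizeUSD: 50,
  useDualStops: false,
}

// Defaults ← environment ← runtime overrides.
// Each layer should omit keys it does not set: an explicit `undefined`
// would overwrite the lower layer when spread.
function mergeLayers(
  defaults: SketchConfig,
  fromEnv: Partial<SketchConfig>,
  overrides: Partial<SketchConfig> = {}
): SketchConfig {
  return { ...defaults, ...fromEnv, ...overrides }
}

const merged = mergeLayers(SKETCH_DEFAULTS, { leverage: 15 }, { useDualStops: true })
console.log(merged)
```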

Per-symbol position sizing: Use getPositionSizeForSymbol(symbol, config) which returns { size, leverage, enabled }

const { size, leverage, enabled } = getPositionSizeForSymbol('SOL-PERP', config)
if (!enabled) {
  return NextResponse.json({ success: false, error: 'Symbol trading disabled' }, { status: 400 })
}

Symbol normalization: TradingView sends "SOLUSDT" → must convert to "SOL-PERP" for Drift

const driftSymbol = normalizeTradingViewSymbol(body.symbol)
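
For orientation, one way such a normalization might work is shown below. This is a sketch under assumptions, not the body of `normalizeTradingViewSymbol()`: it assumes TradingView tickers end in a USDT, USDC, or USD quote suffix and that every Drift market is named `<BASE>-PERP`.

```typescript
// Hypothetical stand-in for the real normalizer.
function normalizeSymbolSketch(tvSymbol: string): string {
  const upper = tvSymbol.trim().toUpperCase()
  // Already in Drift format: leave unchanged.
  if (upper.endsWith('-PERP')) return upper
  // Strip a trailing quote currency, then append the perp suffix.
  const base = upper.replace(/(USDT|USDC|USD)$/, '')
  return `${base}-PERP`
}
```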

API Endpoints Architecture

Authentication: All /api/trading/* endpoints (except /test) require Authorization: Bearer API_SECRET_KEY

Pattern: Each endpoint follows same flow:

Auth check
Get config via getMergedConfig()
Initialize Drift service
Check account health
Execute operation
Save to database
Add to Position Manager if applicable

Key endpoints:

/api/trading/execute - Main entry point from n8n (production, requires auth), auto-caches market data
/api/trading/check-risk - Pre-execution validation (duplicate check, quality score, per-symbol cooldown, rate limits, symbol enabled check, saves blocked signals automatically)
/api/trading/test - Test trades from settings UI (no auth required, respects symbol enable/disable)
/api/trading/close - Manual position closing (requires symbol normalization)
/api/trading/sync-positions - Force Position Manager sync with Drift (POST, requires auth) - restores tracking for orphaned positions
/api/trading/cancel-orders - Manual order cleanup (for stuck/ghost orders after rate limit failures)
/api/trading/positions - Query open positions from Drift
/api/trading/market-data - Webhook for TradingView market data updates (GET for debug, POST for data)
/api/settings - Get/update config (writes to .env file, includes per-symbol settings)
/api/analytics/last-trade - Fetch most recent trade details for dashboard (includes quality score)
/api/analytics/reentry-check - Validate manual re-entry with fresh TradingView data + recent performance
/api/analytics/version-comparison - Compare performance across signal quality logic versions (v1/v2/v3/v4)
/api/restart - Create restart flag for watch-restart.sh script

Critical Workflows

Execute Trade (Production)

TradingView alert → n8n Parse Signal Enhanced (extracts metrics + timeframe)
  ↓ /api/trading/check-risk [validates quality score ≥60, checks duplicates, per-symbol cooldown]
  ↓ /api/trading/execute
  ↓ normalize symbol (SOLUSDT → SOL-PERP)
  ↓ getMergedConfig()
  ↓ getPositionSizeForSymbol() [check if symbol enabled + get sizing]
  ↓ openPosition() [MARKET order]
  ↓ calculate dual stop prices if enabled
  ↓ placeExitOrders() [on-chain TP1/TP2/SL orders]
  ↓ scoreSignalQuality({ ..., timeframe }) [compute 0-100 score with timeframe-aware thresholds]
  ↓ createTrade() [CRITICAL: save to database FIRST - see Common Pitfall #27]
  ↓ positionManager.addTrade() [ONLY after DB save succeeds - prevents unprotected positions]

CRITICAL EXECUTION ORDER (Nov 13, 2025 Fix): The order of database save → Position Manager add is NOT arbitrary - it's a safety requirement:

If database save fails, API returns HTTP 500 with critical warning
User sees: "CLOSE POSITION MANUALLY IMMEDIATELY" with transaction signature
Position Manager only tracks database-persisted trades
Container restarts can restore all positions from database
Never add to Position Manager before database save - creates unprotected positions

Position Monitoring Loop

Position Manager every 2s:
  ↓ Verify on-chain position still exists (detect external closures)
  ↓ getPythPriceMonitor().getLatestPrice()
  ↓ Calculate current P&L and update MAE/MFE metrics
  ↓ Check emergency stop (-2%) → closePosition(100%)
  ↓ Check SL hit → closePosition(100%)
  ↓ Check TP1 hit → closePosition(75%), cancelAllOrders(), placeExitOrders() with SL at breakeven
  ↓ Check profit lock trigger (+1.2%) → move SL to +configured%
  ↓ Check TP2 hit → closePosition(80% of remaining), activate runner
  ↓ Check trailing stop (if runner active) → adjust SL dynamically based on peakPrice
  ↓ addPriceUpdate() [save to database every N checks]
  ↓ saveTradeState() [persist Position Manager state + MAE/MFE for crash recovery]

Settings Update

Web UI → /api/settings POST
  ↓ Validate new settings
  ↓ Write to .env file using string replacement
  ↓ Return success
  ↓ User clicks "Restart Bot" → /api/restart
  ↓ Creates /tmp/trading-bot-restart.flag
  ↓ watch-restart.sh detects flag
  ↓ Executes: docker restart trading-bot-v4

Docker Context

Multi-stage build: deps → builder → runner (Node 20 Alpine)

Critical Dockerfile steps:

Install deps with npm install --production
Copy source and npx prisma generate (MUST happen before build)
npm run build (Next.js standalone output)
Runner stage copies standalone + static + node_modules + Prisma client

Container networking:

External: trading-bot-v4 on port 3001
Internal: Next.js on port 3000
Database: trading-bot-postgres on 172.28.0.0/16 network

DATABASE_URL caveat: Use trading-bot-postgres (container name) in .env for runtime, but localhost:5432 for Prisma CLI migrations from host

Project-Specific Patterns

1. Singleton Services

Never create multiple instances - always use getter functions:

const driftService = await initializeDriftService() // NOT: new DriftService()
const positionManager = getPositionManager()        // NOT: new PositionManager()
const prisma = getPrismaClient()                     // NOT: new PrismaClient()

2. Price Calculations

Direction matters for long vs short:

function calculatePrice(entry: number, percent: number, direction: 'long' | 'short') {
  if (direction === 'long') {
    return entry * (1 + percent / 100)  // Long: +1% = higher price
  } else {
    return entry * (1 - percent / 100)  // Short: +1% = lower price
  }
}

3. Error Handling

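Never swallow database errors silently - always wrap writes in try/catch and act on the failure. For createTrade() in the execute endpoint, a failed save must return HTTP 500 and skip positionManager.addTrade() (see Database-First Pattern under Common Pitfalls):

try {
  await createTrade(params)
  console.log('💾 Trade saved to database')
} catch (dbError) {
  console.error('❌ CRITICAL: Failed to save trade to database:', dbError)
  // Position is already open on Drift: surface the failure instead of continuing
  return NextResponse.json({ success: false, error: 'Database save failed - position unprotected' }, { status: 500 })
}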

4. Reduce-Only Orders

All exit orders MUST be reduce-only (can only close, not open positions):

const orderParams = {
  reduceOnly: true,  // CRITICAL for TP/SL orders
  // ... other params
}

5. Nextcloud Deck Roadmap Sync

Purpose: Visual kanban board for tracking optimization roadmap progress

Key Components:

scripts/discover-deck-ids.sh - Find Nextcloud Deck board/stack IDs
scripts/sync-roadmap-to-deck.py - Sync roadmap files to Deck cards
docs/NEXTCLOUD_DECK_SYNC.md - Complete documentation

Workflow:

# One-time setup (already done)
bash scripts/discover-deck-ids.sh  # Creates /tmp/deck-config.json

# Sync roadmap to Deck (creates/updates cards)
python3 scripts/sync-roadmap-to-deck.py --init

# Always dry-run first to preview changes
python3 scripts/sync-roadmap-to-deck.py --init --dry-run

Stack Mapping:

📥 Backlog: Future phases, ideas, ML work (status: FUTURE)
📋 Planning: Next phases, ready to implement (status: PENDING, NEXT)
🚀 In Progress: Currently active work (status: CURRENT, IN PROGRESS, DEPLOYED)
✅ Complete: Finished phases (status: COMPLETE)

Card Structure:

3 high-level initiative cards (from OPTIMIZATION_MASTER_ROADMAP.md)
18 detailed phase cards (from individual roadmap files)
Total: 21 cards tracking all optimization work

When to Sync:

After completing a phase (update markdown status → re-sync)
When starting new phase (move card in Deck UI)
Weekly during active development to keep visual state current

Important Notes:

API doesn't support duplicate detection - always use --dry-run first
Manual card deletion required (API returns 405 on DELETE)
Code blocks auto-removed from descriptions (prevent API errors)
Card titles cleaned (no markdown, emojis removed for readability)

Testing Commands

# Local development
npm run dev

# Build production
npm run build && npm start

# Docker build and restart
docker compose build trading-bot
docker compose up -d --force-recreate trading-bot
docker logs -f trading-bot-v4

# Database operations
npx prisma generate                                    # Generate client
DATABASE_URL="postgresql://...@localhost:5432/..." npx prisma migrate dev
docker exec trading-bot-postgres psql -U postgres -d trading_bot_v4 -c "\dt"

# Test trade from UI
# Go to http://localhost:3001/settings
# Click "Test LONG" or "Test SHORT"

SQL Analysis Queries

Essential queries for monitoring signal quality and blocked signals. Run via:

docker exec trading-bot-postgres psql -U postgres -d trading_bot_v4 -c "YOUR_QUERY"

Phase 1: Monitor Data Collection Progress

-- camelCase columns are double-quoted: PostgreSQL folds unquoted identifiers to lowercase
-- (assumes default Prisma column names without @map)

-- Check blocked signals count (target: 10-20 for Phase 2)
SELECT COUNT(*) as total_blocked FROM "BlockedSignal";

-- Score distribution of blocked signals
SELECT
  CASE
    WHEN "signalQualityScore" >= 60 THEN '60-64 (Close Call)'
    WHEN "signalQualityScore" >= 55 THEN '55-59 (Marginal)'
    WHEN "signalQualityScore" >= 50 THEN '50-54 (Weak)'
    ELSE '0-49 (Very Weak)'
  END as tier,
  COUNT(*) as count,
  ROUND(AVG("signalQualityScore")::numeric, 1) as avg_score
FROM "BlockedSignal"
WHERE "blockReason" = 'QUALITY_SCORE_TOO_LOW'
GROUP BY tier
ORDER BY MIN("signalQualityScore") DESC;

-- Recent blocked signals with full details
SELECT
  symbol,
  direction,
  "signalQualityScore" as score,
  ROUND(adx::numeric, 1) as adx,
  ROUND(atr::numeric, 2) as atr,
  ROUND("pricePosition"::numeric, 1) as pos,
  ROUND("volumeRatio"::numeric, 2) as vol,
  "blockReason",
  TO_CHAR("createdAt", 'MM-DD HH24:MI') as time
FROM "BlockedSignal"
ORDER BY "createdAt" DESC
LIMIT 10;

Phase 2: Compare Blocked vs Executed Trades

-- Compare executed trades in 60-69 score range
SELECT
  "signalQualityScore" as score,
  COUNT(*) as trades,
  ROUND(AVG("realizedPnL")::numeric, 2) as avg_pnl,
  ROUND(SUM("realizedPnL")::numeric, 2) as total_pnl,
  ROUND(100.0 * SUM(CASE WHEN "realizedPnL" > 0 THEN 1 ELSE 0 END) / COUNT(*)::numeric, 1) as win_rate
FROM "Trade"
WHERE "exitReason" IS NOT NULL
  AND "signalQualityScore" BETWEEN 60 AND 69
GROUP BY "signalQualityScore"
ORDER BY "signalQualityScore";

-- Block reason breakdown
SELECT
  "blockReason",
  COUNT(*) as count,
  ROUND(AVG("signalQualityScore")::numeric, 1) as avg_score
FROM "BlockedSignal"
GROUP BY "blockReason"
ORDER BY count DESC;

Analyze Specific Patterns

-- Blocked signals at range extremes (price position)
SELECT
  direction,
  "signalQualityScore" as score,
  ROUND("pricePosition"::numeric, 1) as pos,
  ROUND(adx::numeric, 1) as adx,
  ROUND("volumeRatio"::numeric, 2) as vol,
  symbol,
  TO_CHAR("createdAt", 'MM-DD HH24:MI') as time
FROM "BlockedSignal"
WHERE "blockReason" = 'QUALITY_SCORE_TOO_LOW'
  AND ("pricePosition" < 10 OR "pricePosition" > 90)
ORDER BY "signalQualityScore" DESC;

-- ADX distribution in blocked signals
SELECT
  CASE
    WHEN adx >= 25 THEN 'Strong (25+)'
    WHEN adx >= 20 THEN 'Moderate (20-25)'
    WHEN adx >= 15 THEN 'Weak (15-20)'
    ELSE 'Very Weak (<15)'
  END as adx_tier,
  COUNT(*) as count,
  ROUND(AVG("signalQualityScore")::numeric, 1) as avg_score
FROM "BlockedSignal"
WHERE "blockReason" = 'QUALITY_SCORE_TOO_LOW'
  AND adx IS NOT NULL
GROUP BY adx_tier
ORDER BY MIN(adx) DESC;

Usage Pattern:

Run "Monitor Data Collection" queries weekly during Phase 1
Once 10+ blocked signals collected, run "Compare Blocked vs Executed" queries
Use "Analyze Specific Patterns" to identify optimization opportunities
Full query reference: BLOCKED_SIGNALS_TRACKING.md

Common Pitfalls

DRIFT SDK MEMORY LEAK (CRITICAL - Fixed Nov 15, 2025):
- Symptom: JavaScript heap out of memory after 10+ hours runtime, Telegram bot timeouts (60s)
- Root Cause: Drift SDK accumulates WebSocket subscriptions over time without cleanup
- Manifestation: Thousands of accountUnsubscribe error: readyState was 2 (CLOSING) in logs
- Heap Growth: Normal ~200MB → 4GB+ after 10 hours → OOM crash
- Solution: Automatic reconnection every 4 hours (lib/drift/client.ts)
- Implementation:
  - scheduleReconnection() - Sets 4-hour timer after initialization
  - reconnect() - Unsubscribes, resets state, reinitializes Drift client
  - Timer cleared in disconnect() to prevent orphaned timers
- Manual Control: /api/drift/reconnect endpoint (POST with auth, GET for status)
- Impact: System now self-healing, can run indefinitely without manual restarts
- Monitoring: Watch for scheduled reconnection logs: 🔄 Scheduled reconnection...
WRONG RPC PROVIDER (CRITICAL - CATASTROPHIC SYSTEM FAILURE):
- FINAL CONCLUSION Nov 14, 2025 (INVESTIGATION COMPLETE): Helius is the ONLY reliable RPC provider for Drift SDK
- Root Cause CONFIRMED: Alchemy's rate limiting breaks Drift SDK's burst subscription pattern during initialization
- Definitive Proof (Nov 14, 21:14 CET):
  - Created diagnostic endpoint /api/testing/drift-init
  - Alchemy: 17-71 subscription errors EVERY init (49 avg over 5 runs), 1644ms avg init time
  - Helius: 0 subscription errors EVERY init, 800ms avg init time
  - See docs/ALCHEMY_RPC_INVESTIGATION_RESULTS.md for full test data
- Why Alchemy Fails:
  - Drift SDK subscribes to 30-50+ accounts simultaneously during init (burst pattern)
  - Alchemy's CUPS enforcement rate limits these burst requests
  - Drift SDK does NOT retry failed subscriptions
  - SDK reports "initialized successfully" but with incomplete subscription set
  - Subsequent operations fail/timeout due to missing account data
  - Error message: "Received JSON-RPC error calling accountSubscribe"
- Why "Breakthrough" at 14:25 Wasn't Real:
  - First Alchemy test had 17-71 subscription errors (random variation)
  - Sometimes gets lucky with "just enough" subscriptions for one operation
  - SDK in degraded state from the start, just not obvious until second operation
  - This explains why first trade "worked" but subsequent trades failed
- Why Helius Works:
  - Higher burst tolerance for Solana dApp subscription patterns
  - Zero subscription errors during init
  - Faster initialization (800ms vs 1600ms)
  - Stable for continuous operations
- Technical Reality vs Documentation:
  - Alchemy DOES support WebSocket subscriptions (research confirmed)
  - Alchemy DOES support accountSubscribe method (not -32601 error)
  - BUT: Rate limit enforcement model incompatible with Drift's burst pattern
  - Documentation doesn't mention burst subscription limits
- Production Status:
  - Using: Helius RPC (https://mainnet.helius-rpc.com/?api-key=...)
  - Retry logic: 5s exponential backoff for rate limits
  - System: Stable, TP1/TP2/SL working, Position Manager tracking correctly
- Investigation Closed: This is DEFINITIVE. Use Helius. Do not use Alchemy.
- Test Yourself: curl 'http://localhost:3001/api/testing/drift-init?rpc=alchemy'
Prisma not generated in Docker: Must run npx prisma generate in Dockerfile BEFORE npm run build
Wrong DATABASE_URL: Container runtime needs trading-bot-postgres, Prisma CLI from host needs localhost:5432
Symbol format mismatch: Always normalize with normalizeTradingViewSymbol() before calling Drift (applies to ALL endpoints including /api/trading/close)
Missing reduce-only flag: Exit orders without reduceOnly: true can accidentally open new positions
Singleton violations: Creating multiple DriftClient or Position Manager instances causes connection/state issues
Type errors with Prisma: The Trade type from Prisma is only available AFTER npx prisma generate - use explicit types or // @ts-ignore carefully
Quality score duplication: Signal quality calculation exists in BOTH check-risk and execute endpoints - keep logic synchronized
TP2-as-Runner configuration:

takeProfit2SizePercent: 0 means "TP2 activates trailing stop, no position close"
This creates runner of remaining % after TP1 (default 25%, configurable via TAKE_PROFIT_1_SIZE_PERCENT)
TAKE_PROFIT_2_PERCENT=0.7 sets TP2 trigger price, TAKE_PROFIT_2_SIZE_PERCENT should be 0
Settings UI correctly shows "TP2 activates trailing stop" with dynamic runner % calculation

P&L calculation CRITICAL: Use actual entry vs exit price calculation, not SDK values:

const profitPercent = this.calculateProfitPercent(trade.entryPrice, exitPrice, trade.direction)
const actualRealizedPnL = (closedSizeUSD * profitPercent) / 100
trade.realizedPnL += actualRealizedPnL  // NOT: result.realizedPnL from SDK
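The body of calculateProfitPercent() is not shown here; a direction-aware version consistent with the snippet above might look like this sketch. The standalone function names and the Direction type are assumptions.

```typescript
type Direction = 'long' | 'short'

// Percent profit from entry to exit; positive means the trade made money.
// Longs profit when price rises, shorts profit when price falls.
function profitPercent(entryPrice: number, exitPrice: number, direction: Direction): number {
  const movePercent = ((exitPrice - entryPrice) / entryPrice) * 100
  return direction === 'long' ? movePercent : -movePercent
}

// Realized P&L in USD for the portion that was closed (notional USD size).
function realizedPnlUsd(closedSizeUSD: number, entryPrice: number, exitPrice: number, direction: Direction): number {
  return (closedSizeUSD * profitPercent(entryPrice, exitPrice, direction)) / 100
}
```

For example, a short entered at 141.317 and closed at 140.942 yields a positive percentage because price moved down.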

Transaction confirmation CRITICAL: Both openPosition() AND closePosition() MUST call connection.confirmTransaction() after placePerpOrder(). Without this, the SDK returns transaction signatures that aren't confirmed on-chain, causing "phantom trades" or "phantom closes". Always check confirmation.value.err before proceeding.
Execution order matters: When creating trades via API endpoints, the order MUST be:
1. Open position + place exit orders
2. Save to database (createTrade())
3. Add to Position Manager (positionManager.addTrade())
If the trade is added to Position Manager before the database save, the monitoring loop can check the trade before it exists in the DB (race condition), and a failed save leaves a position that is lost on container restart.
New trade grace period: Position Manager skips "external closure" detection for trades <30 seconds old because Drift positions take 5-10 seconds to propagate after opening. Without this grace period, new positions are immediately detected as "closed externally" and cancelled.
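A minimal sketch of the grace-period guard, assuming timestamps in milliseconds; the constant and function names are illustrative, not the Position Manager's actual identifiers.

```typescript
// Trades younger than this are exempt from external-closure detection,
// because a freshly opened Drift position can take several seconds to appear.
const NEW_TRADE_GRACE_MS = 30 * 1000

function shouldCheckExternalClosure(openedAtMs: number, nowMs: number): boolean {
  return nowMs - openedAtMs >= NEW_TRADE_GRACE_MS
}
```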
Drift minimum position sizes: Actual minimums differ from documentation:
- SOL-PERP: 0.1 SOL (~$5-15 depending on price)
- ETH-PERP: 0.01 ETH (~$38-40 at $4000/ETH)
- BTC-PERP: 0.0001 BTC (~$10-12 at $100k/BTC)
Always calculate: minOrderSize × currentPrice must exceed Drift's $4 minimum. Add buffer for price movement.
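The rule above can be expressed as a small pre-trade check. The per-market minimums and the $4 floor are copied from the list above; the 10% default buffer and the function names are assumptions.

```typescript
const DRIFT_MIN_NOTIONAL_USD = 4

// Base-asset minimums as listed above.
const MIN_ORDER_SIZE: { [symbol: string]: number } = {
  'SOL-PERP': 0.1,
  'ETH-PERP': 0.01,
  'BTC-PERP': 0.0001,
}

// Smallest USD size that satisfies both the per-market base-asset minimum
// and the $4 notional minimum, padded by bufferPercent for price movement.
function minimumOrderUSD(symbol: string, currentPrice: number, bufferPercent = 10): number {
  const minBase = MIN_ORDER_SIZE[symbol]
  if (minBase === undefined) throw new Error(`Unknown market: ${symbol}`)
  const floorUSD = Math.max(minBase * currentPrice, DRIFT_MIN_NOTIONAL_USD)
  return floorUSD * (1 + bufferPercent / 100)
}

function isOrderLargeEnough(symbol: string, sizeUSD: number, currentPrice: number): boolean {
  return sizeUSD >= minimumOrderUSD(symbol, currentPrice)
}
```

With the default buffer, a $10 BTC-PERP order at $100k/BTC is rejected even though it nominally equals the 0.0001 BTC minimum, which is the margin the buffer is meant to provide.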
Exit reason detection bug: Position Manager was using current price to determine exit reason, but on-chain orders filled at a DIFFERENT price in the past. Now uses trade.tp1Hit / trade.tp2Hit flags and realized P&L to correctly identify whether TP1, TP2, or SL triggered. Prevents profitable trades being mislabeled as "SL" exits.
Per-symbol cooldown: Cooldown period is per-symbol, NOT global. ETH trade at 10:00 does NOT block SOL trade at 10:01. Each coin (SOL/ETH/BTC) has independent cooldown timer to avoid missing opportunities on different assets.
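One way to keep cooldowns independent per symbol is to key the last-trade timestamp by symbol, as in this sketch. The class name and millisecond timestamps are assumptions; the real check lives in /api/trading/check-risk.

```typescript
class SymbolCooldown {
  private readonly cooldownMs: number
  private lastTradeAt: { [symbol: string]: number } = {}

  constructor(cooldownMs: number) {
    this.cooldownMs = cooldownMs
  }

  // Remember when a trade on this symbol was opened.
  record(symbol: string, atMs: number): void {
    this.lastTradeAt[symbol] = atMs
  }

  // A symbol is blocked only by its own most recent trade.
  isBlocked(symbol: string, nowMs: number): boolean {
    const last = this.lastTradeAt[symbol]
    return last !== undefined && nowMs - last < this.cooldownMs
  }
}
```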
Timeframe-aware scoring crucial: Signal quality thresholds MUST adjust for 5min vs higher timeframes:
- 5min charts naturally have lower ADX (12-22 healthy) and ATR (0.2-0.7% healthy) than daily charts
- Without timeframe awareness, valid 5min breakouts get blocked as "low quality"
- Anti-chop filter applies -20 points for extreme sideways regardless of timeframe
- Always pass timeframe parameter from TradingView alerts to scoreSignalQuality()
Price position chasing causes flip-flops: Opening longs at 90%+ range or shorts at <10% range reliably loses money:
- Database analysis showed overnight flip-flop losses all had price position 9-94% (chasing extremes)
- These trades had valid ADX (16-18) but entered at worst possible time
- Quality scoring now penalizes -15 to -30 points for range extremes
- Prevents rapid reversals when price is already overextended
TradingView ADX minimum for 5min: Set ADX filter to 15 (not 20+) in TradingView alerts for 5min charts:
- Higher timeframes can use ADX 20+ for strong trends
- 5min charts need lower threshold to catch valid breakouts
- Bot's quality scoring provides second-layer filtering with context-aware metrics
- Two-stage filtering (TradingView + bot) prevents both overtrading and missing valid signals
Prisma Decimal type handling: Raw SQL queries return Prisma Decimal objects, not plain numbers:
- Use any type for numeric fields in $queryRaw results: total_pnl: any
- Convert with Number() before returning to frontend: totalPnL: Number(stat.total_pnl) || 0
- Frontend uses .toFixed() which doesn't exist on Decimal objects
- Applies to all aggregations: SUM(), AVG(), ROUND() - all return Decimal types
- Example: /api/analytics/version-comparison converts all numeric fields
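A small conversion helper along these lines keeps Decimal objects out of API responses. It is a sketch: toNumber is a hypothetical name, and it relies only on the value having a sensible string form, which is an assumption about whatever numeric wrapper the raw query returns.

```typescript
// Convert a raw-query numeric (Decimal-like object, bigint, string, number, null)
// into a plain number, falling back when the value is missing or not numeric.
function toNumber(value: unknown, fallback = 0): number {
  if (value === null || value === undefined) return fallback
  const n = Number(String(value))
  return isFinite(n) ? n : fallback
}
```

Applied at the API boundary, e.g. totalPnL: toNumber(stat.total_pnl), this lets the frontend call .toFixed() safely.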
ATR-based trailing stop implementation (Nov 11, 2025): Runner system was using FIXED 0.3% trailing, causing immediate stops:
- Problem: At $168 SOL, 0.3% = $0.50 wiggle room. Trades with +7-9% MFE exited for losses.
- Fix: trailingDistancePercent = (atrAtEntry / currentPrice * 100) × trailingStopAtrMultiplier
- Config: TRAILING_STOP_ATR_MULTIPLIER=1.5, MIN=0.25%, MAX=0.9%, ACTIVATION=0.5%
- Typical improvement: 0.45% ATR × 1.5 = 0.675% trail ($1.13 vs $0.50 = 2.26x more room)
- Fallback: If atrAtEntry unavailable, uses clamped legacy trailingStopPercent
- Log verification: Look for "📊 ATR-based trailing: 0.0045 (0.52%) × 1.5x = 0.78%" messages
- ActiveTrade interface: Must include atrAtEntry?: number field for calculation
- See ATR_TRAILING_STOP_FIX.md for full details and database analysis
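The formula and clamp from the Fix and Config lines above can be sketched as below. The config object shape and function name are assumptions; the multiplier and MIN/MAX values are the ones quoted above.

```typescript
interface TrailingConfig {
  atrMultiplier: number // TRAILING_STOP_ATR_MULTIPLIER
  minPercent: number    // lower clamp
  maxPercent: number    // upper clamp
}

// trailingDistancePercent = (atrAtEntry / currentPrice * 100) × multiplier, clamped to [min, max].
function trailingDistancePercent(atrAtEntry: number, currentPrice: number, cfg: TrailingConfig): number {
  const atrPercent = (atrAtEntry / currentPrice) * 100
  const raw = atrPercent * cfg.atrMultiplier
  return Math.min(cfg.maxPercent, Math.max(cfg.minPercent, raw))
}

const exampleCfg: TrailingConfig = { atrMultiplier: 1.5, minPercent: 0.25, maxPercent: 0.9 }
// ATR of $0.756 at $168 SOL is 0.45% of price, giving a trail of about 0.675%.
const exampleTrail = trailingDistancePercent(0.756, 168, exampleCfg)
```

The clamp matters at both ends: a very quiet market would otherwise produce a trail tighter than the old fixed 0.3%, and a volatile one could give back most of the runner's profit.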
CreateTradeParams interface sync: When adding new database fields to Trade model, MUST update CreateTradeParams interface in lib/database/trades.ts:
- Interface defines what parameters createTrade() accepts
- Must add new field to interface (e.g., indicatorVersion?: string)
- Must add field to Prisma create data object in createTrade() function
- TypeScript build will fail if endpoint passes field not in interface
- Example: indicatorVersion tracking required 3-file update (execute route.ts, CreateTradeParams interface, createTrade function)
Position.size tokens vs USD bug (CRITICAL - Fixed Nov 12, 2025):
- Symptom: Position Manager detects false TP1 hits, moves SL to breakeven prematurely
- Root Cause: lib/drift/client.ts returns position.size as BASE ASSET TOKENS (12.28 SOL), not USD ($1,950)
- Bug: Comparing tokens (12.28) directly to USD ($1,950) → 12.28 < 1,950 × 0.95 = "99.4% reduction" → FALSE TP1!
- Fix: Always convert to USD before comparisons:
```
// In Position Manager (lines 322, 519, 558, 591)
const positionSizeUSD = Math.abs(position.size) * currentPrice

// Now compare USD to USD
if (positionSizeUSD < trade.currentSize * 0.95) {
  // Actual 5%+ reduction detected
}
```
- Impact: Without this fix, TP1 never triggers correctly, SL moves at wrong times, runner system fails
- Where it matters: Position Manager, any code querying Drift positions
- Database evidence: Trade showed tp1Hit: true when 100% still open, slMovedToBreakeven: true prematurely
Leverage display showing global config instead of symbol-specific (Fixed Nov 12, 2025):
- Symptom: Telegram notifications showing "⚡ Leverage: 10x" when actual position uses 15x or 20x
- Root Cause: API response returning config.leverage (global default) instead of symbol-specific value
- Fix: Use actual leverage from getPositionSizeForSymbol():
```
// app/api/trading/execute/route.ts (lines 345, 448, 522, 557)
const { size, leverage, enabled } = getPositionSizeForSymbol(driftSymbol, config)

// Return symbol-specific leverage
leverage: leverage,  // NOT: config.leverage
```
- Impact: Misleading notifications, user confusion about actual position risk
- Hierarchy: Per-symbol ENV (SOLANA_LEVERAGE) → Market config → Global ENV (LEVERAGE) → Defaults
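The hierarchy can be read as "first valid value wins". A sketch, assuming each layer has already been parsed to a number or undefined; the function name and the fallback of 10 are placeholders.

```typescript
// Resolve leverage in priority order:
// per-symbol ENV → market config → global ENV → default.
function resolveLeverage(
  perSymbolEnv: number | undefined,
  marketConfig: number | undefined,
  globalEnv: number | undefined,
  fallback = 10
): number {
  const candidates = [perSymbolEnv, marketConfig, globalEnv]
  for (const candidate of candidates) {
    // Skip unset, non-finite, or non-positive values.
    if (candidate !== undefined && isFinite(candidate) && candidate > 0) return candidate
  }
  return fallback
}
```

Whatever this resolves to is the value that should appear in API responses and notifications, not the global default.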
Indicator version tracking (Nov 12, 2025+):
- Database field indicatorVersion tracks which TradingView strategy generated the signal
- v5: Buy/Sell Signal strategy (pre-Nov 12)
- v6: HalfTrend + BarColor strategy (Nov 12+)
- Used for performance comparison between strategies
- Must update CreateTradeParams interface when adding new database fields (see pitfall #21, CreateTradeParams interface sync)
- Analytics endpoint /api/analytics/version-comparison compares v5 vs v6 performance
External closure duplicate updates bug (CRITICAL - Fixed Nov 12, 2025):
- Symptom: Trades showing 7-8x larger losses than actual ($58 loss when Drift shows $7 loss)
- Root Cause: Position Manager monitoring loop re-processes external closures multiple times before trade removed from activeTrades Map
- Bug sequence:
  1. Trade closed externally (on-chain SL order fills at -$7.98)
  2. Position Manager detects closure: position === null
  3. Calculates P&L and calls updateTradeExit() → -$7.50 in DB
  4. BUT: Trade still in activeTrades Map (removal happens after DB update)
  5. Next monitoring loop (2s later) detects closure AGAIN
  6. Accumulates P&L: previouslyRealized (-$7.50) + runnerRealized (-$7.50) = -$15.00
  7. Updates database AGAIN → -$15.00 in DB
  8. Repeats 8 times → final -$58.43 (8× the actual loss)
- Fix: Remove trade from activeTrades Map BEFORE database update:
```
// BEFORE (BROKEN):
await updateTradeExit({ ... })
await this.removeTrade(trade.id)  // Too late! Loop already ran again

// AFTER (FIXED):
this.activeTrades.delete(trade.id)  // Remove FIRST
await updateTradeExit({ ... })      // Then update DB
if (this.activeTrades.size === 0) {
  this.stopMonitoring()
}
```
- Impact: Without this fix, every external closure is recorded 5-8 times with compounding P&L
- Root cause: Async timing issue - removeTrade() is async but monitoring loop continues synchronously
- Evidence: Logs showed 8 consecutive "External closure recorded" messages with increasing P&L
- Line: lib/trading/position-manager.ts line 493 (external closure detection block)
Signal quality threshold adjustment (Nov 12, 2025):
- Lowered from 65 → 60 based on data analysis of 161 trades
- Reason: Score 60-64 tier outperformed higher scores:
  - 60-64: 2 trades, +$45.78 total, 100% WR, +$22.89 avg
  - 65-69: 13 trades, +$28.28 total, 53.8% WR, +$2.18 avg
  - 70-79: 67 trades, +$8.28 total, 44.8% WR (worst performance!)
- Paradox: Higher quality scores don't correlate with better performance in current data
- Expected impact: 2-3 additional trades/week, +$46-69 weekly profit potential
- Data collection: Enables blocked signals at 55-59 range for Phase 2 optimization
- Risk: Small sample size (2 trades) could be outliers, but downside limited
- SQL analysis showed clear pattern: stricter filtering was blocking profitable setups
Database-First Pattern (CRITICAL - Fixed Nov 13, 2025):
- Symptom: Positions opened on Drift with NO database record, NO Position Manager tracking, NO TP/SL protection
- Root Cause: Execute endpoint saved to database AFTER adding to Position Manager, with silent error catch
- Bug sequence:
  1. TradingView signal → /api/trading/execute
  2. Position opened on Drift ✅
  3. Position Manager tracking added ✅
  4. Database save attempted ❌ (fails silently)
  5. API returns success to user ❌
  6. Container restarts → Position Manager loses in-memory state ❌
  7. Result: Unprotected position with no monitoring or TP/SL orders
- Fix: Database-first execution order in app/api/trading/execute/route.ts:
```
// CRITICAL: Save to database FIRST before adding to Position Manager
try {
  await createTrade({...})
} catch (dbError) {
  console.error('❌ CRITICAL: Failed to save trade to database:', dbError)
  return NextResponse.json({
    success: false,
    error: 'Database save failed - position unprotected',
    message: `Position opened on Drift but database save failed. CLOSE POSITION MANUALLY IMMEDIATELY. Transaction: ${openResult.transactionSignature}`,
  }, { status: 500 })
}

// ONLY add to Position Manager if database save succeeded
await positionManager.addTrade(activeTrade)
```
- Impact: Without this fix, ANY database failure creates unprotected positions
- Verification: Test trade cmhxj8qxl0000od076m21l58z (Nov 13) confirmed fix working
- Documentation: See CRITICAL_INCIDENT_UNPROTECTED_POSITION.md for full incident report
- Rule: Database persistence ALWAYS comes before in-memory state updates
DNS retry logic (Nov 13, 2025):
- Problem: Trading bot fails with "fetch failed" errors when DNS resolution temporarily fails for mainnet.helius-rpc.com
- Impact: n8n workflow failures, missed trades, container restart failures
- Root Cause: EAI_AGAIN errors are transient DNS issues that resolve in seconds, but bot treated them as permanent failures
- Fix: Automatic retry in lib/drift/client.ts - retryOperation() wrapper:
```
// Detects transient errors: fetch failed, EAI_AGAIN, ENOTFOUND, ETIMEDOUT
// Retries up to 3 times with 2s delay between attempts (DNS-specific, separate from rate limit retries)
// Fails fast on non-transient errors (auth, config, permanent network issues)
await this.retryOperation(async () => {
  // Initialize Drift SDK, subscribe, get user account
}, 3, 2000, 'Drift initialization')
```
- Success logs: ⚠️ Drift initialization failed (attempt 1/3): fetch failed → ⏳ Retrying in 2000ms... → ✅ Drift service initialized successfully
- Impact: 99% of transient DNS failures now auto-recover, preventing missed trades
- Note: DNS retries use 2s delays (fast recovery), rate limit retries use 5s delays (RPC cooldown)
- Documentation: See docs/DNS_RETRY_LOGIC.md for monitoring queries and metrics
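A wrapper with the behavior described above (retry transient errors, fail fast on everything else) might be sketched as follows. It is not the actual retryOperation() from lib/drift/client.ts: the name retryTransient and the message-substring matching are assumptions, and real errors may carry the code on error.cause instead.

```typescript
// Substrings treated as transient, per the list above.
const TRANSIENT_MARKERS = ['fetch failed', 'EAI_AGAIN', 'ENOTFOUND', 'ETIMEDOUT']

function isTransient(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error)
  return TRANSIENT_MARKERS.some((marker) => message.indexOf(marker) !== -1)
}

// Run operation up to maxRetries times, waiting delayMs between transient failures.
// Non-transient errors and the final failed attempt are rethrown immediately.
async function retryTransient<T>(
  operation: () => Promise<T>,
  maxRetries = 3,
  delayMs = 2000,
  label = 'operation'
): Promise<T> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation()
    } catch (error) {
      if (!isTransient(error) || attempt === maxRetries) throw error
      console.warn(`⚠️ ${label} failed (attempt ${attempt}/${maxRetries}), retrying in ${delayMs}ms`)
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs))
    }
  }
  throw new Error(`${label}: maxRetries must be at least 1`)
}
```

The fixed delay fits DNS recovery; rate-limit retries, which need the RPC to cool down, are handled separately with longer delays as noted above.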
Declaring fixes "working" before deployment (CRITICAL - Nov 13, 2025):
- Symptom: AI says "position is protected" or "fix is deployed" when container still running old code
- Root Cause: Conflating "code committed to git" with "code running in production"
- Real Incident: Database-first fix committed 15:56, declared "working" at 19:42, but container started 15:06 (old code)
- Result: Unprotected position opened, database save failed silently, Position Manager never tracked it
- Financial Impact: User discovered $250+ unprotected position 3.5 hours after opening
- Verification Required:
```
# ALWAYS check before declaring fix deployed:
docker logs trading-bot-v4 | grep "Server starting" | head -1
# Compare container start time to git commit timestamp
# If container older: FIX NOT DEPLOYED
```
- Rule: NEVER say "fixed", "working", "protected", or "deployed" without verifying container restart timestamp
- Impact: This is a REAL MONEY system - premature declarations cause financial losses
- Documentation: Added mandatory deployment verification to VERIFICATION MANDATE section
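The timestamp comparison itself is simple enough to script. In this sketch the function name is hypothetical; the two inputs could come from, for example, docker inspect -f '{{.State.StartedAt}}' trading-bot-v4 and git log -1 --format=%cI, which is an assumption about tooling rather than the project's documented procedure.

```typescript
// A fix is only live if the container started AFTER the commit that contains it.
function isFixDeployed(containerStartedAt: Date, commitAt: Date): boolean {
  return containerStartedAt.getTime() > commitAt.getTime()
}

// The incident above: fix committed 15:56, container started 15:06 → old code still running.
const incidentCommit = new Date('2025-11-13T15:56:00+01:00')
const incidentContainerStart = new Date('2025-11-13T15:06:00+01:00')
const incidentDeployed = isFixDeployed(incidentContainerStart, incidentCommit)
```

This check only proves the container is newer than the commit; it assumes the image was rebuilt from that commit before the restart.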
Phantom trade notification workflow breaks (Nov 14, 2025):
- Symptom: Phantom trade detected, position opened on Drift, but n8n workflow stops with HTTP 500 error. User NOT notified.
- Root Cause: Execute endpoint returned HTTP 500 when phantom detected, causing n8n chain to halt before Telegram notification
- Problem: Unmonitored phantom position on exchange while user is asleep/away = unlimited risk exposure
- Fix: Auto-close phantom trades immediately + return HTTP 200 with warning (allows n8n to continue)
```
// When phantom detected in app/api/trading/execute/route.ts:
// 1. Immediately close position via closePosition()
// 2. Save to database (create trade + update with exit info)
// 3. Return HTTP 200 with full notification message in response
// 4. n8n workflow continues to Telegram notification step
```
- Response format change: { success: true, warning: 'Phantom trade detected and auto-closed', isPhantom: true, message: '[Full notification text]', phantomDetails: {...} }
- Why auto-close: User can't always respond (sleeping, no phone, traveling). Better to exit with small loss/gain than leave unmonitored position exposed.
- Impact: Protects user from unlimited risk during unavailable hours. Phantom trades are rare edge cases (oracle issues, exchange rejections).
- Database tracking: status='phantom', exitReason='manual', enables analysis of phantom frequency and patterns
Wrong entry price after orphaned position restoration (CRITICAL - Fixed Nov 15, 2025):
- Symptom: Position Manager tracking SHORT at $141.51 entry, but Drift UI shows $141.31 actual entry
- Root Cause: Startup validation restored orphaned position but used OLD database entry price instead of querying Drift for real value
- Bug sequence:
  1. Position opened at $141.317 (per Drift order history)
  2. TP1 closed 70% at $140.942
  3. Database incorrectly saved entry as $141.508 (possibly averaged, or carried over from a previous position)
  4. Container restart → startup validation found position on Drift
  5. Reopened trade in DB but used stale trade.entryPrice from database
  6. Position Manager tracked with wrong entry ($141.51 vs actual $141.31)
  7. Stop loss calculated from wrong base: $141.08 instead of $140.89
- Impact: 0.14% difference ($0.20/SOL) in SL placement - could mean difference between small profit and small loss
- Fix: Query Drift SDK for actual entry price during orphaned position restoration
```
// In lib/startup/init-position-manager.ts (line 121-144):
// When reopening closed trade found on Drift:
const currentPrice = await driftService.getOraclePrice(marketConfig.driftMarketIndex)
const positionSizeUSD = Math.abs(position.size) * currentPrice // abs(): size is in tokens and may be negative for shorts

await prisma.trade.update({
  where: { id: trade.id },
  data: {
    status: 'open',
    exitReason: null,
    entryPrice: position.entryPrice, // CRITICAL: Use Drift's actual entry price
    positionSizeUSD: positionSizeUSD, // Update to current size (runner after TP1)
  }
})
```
- Drift SDK returns real entry: position.entryPrice from getPosition() calculates from on-chain data (quoteAssetAmount / baseAssetAmount)
- Future-proofed: All orphaned position restorations now use authoritative Drift entry price, not stale DB value
- Manual fix required once: Had to manually UPDATE database for existing position, then restart container
- Lesson: Always prefer on-chain data over cached database values for critical trading parameters
Runner stop loss gap - NO protection between TP1 and TP2 (CRITICAL - Fixed Nov 15, 2025):
- Symptom: Runner position remained open despite price moving far above stop loss level
- Root Cause: Position Manager only checked stop loss BEFORE TP1 hit (line 693) OR AFTER TP2 hit (line 835), creating a gap
- Bug sequence:
  1. SHORT opened at $141.317, TP1 hit at $140.942 (70% closed)
  2. Runner (30% remaining, $12.70) had stop loss at $140.89 (profit lock)
  3. Price rose to $141.98 (way above $140.89 SL) → NO STOP LOSS CHECK
  4. Position exposed to unlimited loss for hours during TP1→TP2 window
  5. User manually checked: "runner close did not work. still open and the price is above 141,98"
- Impact: Hours of unprotected runner exposure = potential unlimited loss on 25-30% remaining position
- Code analysis:
```
// Line 693: Stop loss checked ONLY before TP1
if (!trade.tp1Hit && this.shouldStopLoss(currentPrice, trade)) {
  console.log(`🔴 STOP LOSS: ${trade.symbol}`)
  await this.executeExit(trade, 100, 'SL', currentPrice)
}

// Lines 706-831: TP1 and TP2 processing - NO STOP LOSS CHECK

// Line 835: Stop loss checked ONLY after TP2
if (trade.tp2Hit && this.config.useTrailingStop && this.shouldStopLoss(currentPrice, trade)) {
  console.log(`🔴 TRAILING STOP: ${trade.symbol}`)
  await this.executeExit(trade, 100, 'SL', currentPrice)
}

// BUG: Runner between TP1-TP2 has ZERO stop loss protection!
```
- Fix: Added explicit runner stop loss check at line ~795:
```
// CRITICAL: Check stop loss for runner (after TP1, before TP2)
if (trade.tp1Hit && !trade.tp2Hit && this.shouldStopLoss(currentPrice, trade)) {
  console.log(`🔴 RUNNER STOP LOSS: ${trade.symbol} at ${profitPercent.toFixed(2)}% (profit lock triggered)`)
  await this.executeExit(trade, 100, 'SL', currentPrice)
  return
}
```
- Verification: After fix deployed, runner closed at $141.133 with +$0.59 profit (+4.6% on $12.70 runner)
- Database evidence: Trade shows exitReason='SL', proving runner stop loss triggered correctly
- Why undetected: Runner system relatively new (Nov 11), most trades hit TP2 quickly without price reversals
- Lesson: Every conditional branch in risk management MUST have explicit stop loss checks - never assume "it'll get caught somewhere"
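
One way to make the gap above structurally impossible is to classify every trade into exactly one phase and run the stop loss check regardless of phase. The sketch below is an illustration under assumptions, not the Position Manager's code: `TradeState`, `phaseOf`, `stopLossBreached` and `checkStopLoss` are hypothetical names, and the `useTrailingStop` condition from the original post-TP2 check is left out for brevity.

```typescript
type Direction = 'long' | 'short'
type Phase = 'pre-tp1' | 'runner' | 'post-tp2'

interface TradeState {
  direction: Direction
  stopLossPrice: number
  tp1Hit: boolean
  tp2Hit: boolean
}

// Every trade maps to exactly one phase, so no state falls through unchecked.
function phaseOf(trade: TradeState): Phase {
  if (trade.tp2Hit) return 'post-tp2'
  if (trade.tp1Hit) return 'runner'
  return 'pre-tp1'
}

// A long stops out when price falls to the stop; a short when price rises to it.
function stopLossBreached(price: number, trade: TradeState): boolean {
  return trade.direction === 'long'
    ? price <= trade.stopLossPrice
    : price >= trade.stopLossPrice
}

// Returns the phase in which the stop fired, or null if the stop is not breached.
function checkStopLoss(price: number, trade: TradeState): Phase | null {
  return stopLossBreached(price, trade) ? phaseOf(trade) : null
}

// The incident above: SHORT runner with SL at $140.89, price at $141.98
const runner: TradeState = { direction: 'short', stopLossPrice: 140.89, tp1Hit: true, tp2Hit: false }
console.log(checkStopLoss(141.98, runner))
```

Because the breach test runs before the phase is consulted, adding a new phase later cannot silently remove protection.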

Flip-flop price context using wrong data (CRITICAL - Fixed Nov 14, 2025):

- Symptom: Flip-flop detection showing "100% price move" when actual movement was 0.2%, allowing trades that should be blocked
- Root Cause: currentPrice parameter not available in check-risk endpoint (trade hasn't opened yet), so calculation used undefined/zero
- Real incident: Nov 14, 06:05 CET - SHORT allowed with 0.2% flip-flop, lost $1.56 in 5 minutes
- Bug sequence:
  1. LONG opened at $143.86 (06:00)
  2. SHORT signal 4min later at $143.58 (0.2% move)
  3. Flip-flop check: (undefined - 143.86) / 143.86 * 100 = NaN, and with a zero price (0 - 143.86) / 143.86 * 100 = -100 → showed "100%"
  4. System thought it was reversal → allowed trade
  5. Should have been blocked as tight-range chop
- Fix: Two-part fix in commits 77a9437 and 795026a:
```
// In app/api/trading/check-risk/route.ts:
// Get current price from Pyth BEFORE quality scoring
const priceMonitor = getPythPriceMonitor()
const latestPrice = priceMonitor.getCachedPrice(body.symbol)
const currentPrice = latestPrice?.price || body.currentPrice

// In lib/trading/signal-quality.ts:
// Validate price data exists before calculation
if (!params.currentPrice || params.currentPrice === 0) {
  // No current price available - apply penalty (conservative)
  console.warn(`⚠️ Flip-flop check: No currentPrice available, applying penalty`)
  frequencyPenalties.flipFlop = -25
  score -= 25
} else {
  const priceChangePercent = Math.abs(
    (params.currentPrice - recentSignals.oppositeDirectionPrice) / 
    recentSignals.oppositeDirectionPrice * 100
  )
  console.log(`🔍 Flip-flop price check: $${recentSignals.oppositeDirectionPrice.toFixed(2)} → $${params.currentPrice.toFixed(2)} = ${priceChangePercent.toFixed(2)}%`)
  // Apply penalty only if < 2% move
}
```
- Impact: Without this fix, flip-flop detection is useless - blocks reversals, allows chop
- Lesson: Always validate input data for financial calculations, especially when data might not exist yet
- Monitoring: Watch logs for "🔍 Flip-flop price check: $X → $Y = Z%" to verify correct calculations
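
The arithmetic behind this bug is easy to reproduce in isolation. The sketch below is a standalone illustration, not the code in signal-quality.ts: `flipFlopMovePercent` is a hypothetical helper that returns null when either price is unusable, so the caller has to decide explicitly (for example by applying the conservative penalty) instead of trusting a meaningless percentage.

```typescript
// Hypothetical helper: absolute percent move between two prices,
// or null when either input cannot produce a meaningful result.
function flipFlopMovePercent(currentPrice: number | undefined, previousPrice: number): number | null {
  if (currentPrice === undefined || !isFinite(currentPrice) || currentPrice <= 0) return null
  if (!isFinite(previousPrice) || previousPrice <= 0) return null
  return Math.abs((currentPrice - previousPrice) / previousPrice * 100)
}

// Unguarded arithmetic with a zero price reports a full 100% move:
console.log(Math.abs((0 - 143.86) / 143.86 * 100))

// Guarded version with the prices from the incident (roughly a 0.19% move):
console.log(flipFlopMovePercent(143.58, 143.86))

// Missing price: the caller receives null and can apply the penalty.
console.log(flipFlopMovePercent(undefined, 143.86))
```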

Phantom trades need exitReason for cleanup (CRITICAL - Fixed Nov 15, 2025):
- Symptom: Position Manager keeps restoring phantom trade on every restart, triggers false runner stop loss alerts
- Root Cause: Phantom auto-closure sets status='phantom' but leaves exitReason=NULL
- Bug: Startup validator checks exitReason !== null (line 122 of init-position-manager.ts), ignores status field
- Consequence: Phantom trade with exitReason=NULL treated as "open" and restored to Position Manager
- Real incident: Nov 14 phantom trade (cmhy6xul20067nx077agh260n) caused 232% size mismatch, hundreds of false "🔴 RUNNER STOP LOSS" alerts
- Fix: When auto-closing phantom trades, MUST set exitReason:
```
// In app/api/trading/execute/route.ts (phantom detection):
await updateTradeExit({
  tradeId: trade.id,
  exitPrice: currentPrice,
  exitReason: 'manual', // CRITICAL: Must set exitReason for cleanup
  realizedPnL: actualPnL,
  status: 'phantom'
})
```
- Manual cleanup: If phantom already exists: UPDATE "Trade" SET "exitReason" = 'manual' WHERE status = 'phantom' AND "exitReason" IS NULL
- Impact: Without exitReason, phantom trades create ghost positions that trigger false alerts and pollute monitoring
- Verification: After restart, check logs for "Found 0 open trades" (not "Found 1 open trades to restore")
- Lesson: status field is for classification, exitReason is for lifecycle management - both must be set on closure
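
As a complement to setting exitReason on closure, the restore filter itself could check both fields. This is a defensive variant sketched under assumptions, not the fix described above: `TradeRow` and `shouldRestore` are hypothetical names, and only the 'open' and 'phantom' status values are taken from this document.

```typescript
// Minimal shape of the fields the startup check cares about (illustrative only).
interface TradeRow {
  id: string
  status: string
  exitReason: string | null
}

// A trade is restored only if BOTH lifecycle fields say it is still open.
function shouldRestore(trade: TradeRow): boolean {
  return trade.status === 'open' && trade.exitReason === null
}

const rows: TradeRow[] = [
  { id: 'a', status: 'open', exitReason: null },       // genuinely open
  { id: 'b', status: 'phantom', exitReason: null },    // the bug case: must not be restored
  { id: 'c', status: 'phantom', exitReason: 'manual' } // cleaned-up phantom
]

const restorable = rows.filter(shouldRestore).map((r) => r.id)
console.log(`Found ${restorable.length} open trades`)
```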
closePosition() missing retry logic causes rate limit storm (CRITICAL - Fixed Nov 15, 2025):
- Symptom: Position Manager tries to close trade, gets 429 error, retries EVERY 2 SECONDS → 100+ failed attempts → rate limit exhaustion
- Root Cause: placeExitOrders() has retryWithBackoff() wrapper (Nov 14 fix), but closePosition() did NOT
- Real incident: Trade cmi0il8l30000r607l8aec701 (Nov 15, 16:49 CET)
  1. Position Manager tried to close (SL or TP trigger)
  2. closePosition() called raw placePerpOrder() → 429 error
  3. executeExit() caught 429, returned early (line 935-940)
  4. Position Manager kept monitoring, retried close EVERY 2 seconds
  5. Logs show 100+ "❌ Failed to close position: 429" + "⚠️ Rate limited while closing SOL-PERP"
  6. Meanwhile: On-chain TP2 limit order filled (unaffected by SDK rate limits)
  7. External closure detected, DB updated 8 TIMES: $0.14 → $0.20 → $0.26 → ... → $0.51
  8. Container eventually restarted (likely from rate limit exhaustion)
- Why duplicate updates: Common Pitfall #27 fix (remove from Map before DB update) works unless rate limiting causes many close retries before the external closure is detected
- Impact: User saw $0.51 profit in DB, $0.03 on Drift UI (8× compounding vs 1 actual fill)
- Fix: Wrapped closePosition() with retryWithBackoff() in lib/drift/orders.ts:
```
// Line ~567 (BEFORE):
const txSig = await driftClient.placePerpOrder(orderParams)

// Line ~567 (AFTER):
const txSig = await retryWithBackoff(async () => {
  return await driftClient.placePerpOrder(orderParams)
}, 3, 8000) // 8s base delay, 3 max retries (8s → 16s → 32s)
```
- Behavior now: 3 SDK retries over 56s (8+16+32) + Position Manager natural retry on next monitoring cycle = robust without spam
- RPC load reduction: 30-50× fewer requests during close operations (3 retries vs 100+ attempts)
- Verification: Container restarted 18:05 CET Nov 15, code deployed
- Lesson: EVERY SDK order operation (open, close, cancel, place) MUST have retry wrapper - Position Manager monitoring creates infinite retry loop without it
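
The backoff schedule quoted above (8s → 16s → 32s, 56s in total) follows from doubling the base delay on each retry. The sketch below is a hypothetical stand-in for the project's retryWithBackoff(fn, maxRetries, baseDelay), written only to show the pattern: the names `backoffDelays` and `retryWithBackoffSketch`, the injectable `wait` parameter, and the choice to retry on every error are all assumptions.

```typescript
// Delay before each retry: the base delay, doubled every time.
function backoffDelays(maxRetries: number, baseDelayMs: number): number[] {
  const delays: number[] = []
  for (let i = 0; i < maxRetries; i++) {
    delays.push(baseDelayMs * Math.pow(2, i))
  }
  return delays
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms))
}

// ASSUMPTION: every error is retryable; real code would likely retry only on 429.
async function retryWithBackoffSketch<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  baseDelayMs: number,
  wait: (ms: number) => Promise<void> = sleep
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      if (attempt >= maxRetries) throw err // retries exhausted, surface the error
      await wait(baseDelayMs * Math.pow(2, attempt))
    }
  }
}

const schedule = backoffDelays(3, 8000)
console.log(schedule.join(' → '), 'total', schedule.reduce((a, b) => a + b, 0))
```

With 3 retries the wrapper makes at most 4 calls (the first attempt plus 3 retries) before giving up, which bounds the load a single close attempt can generate.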

File Conventions

API routes: app/api/[feature]/[action]/route.ts (Next.js 15 App Router)
Services: lib/[service]/[module].ts (drift, pyth, trading, database)
Config: Single source in config/trading.ts with env merging
Types: Define interfaces in same file as implementation (not separate types directory)
Console logs: Use emojis for visual scanning: 🎯 🚀 ✅ ❌ 💰 📊 🛡️

Re-Entry Analytics System (Phase 1)

Purpose: Validate manual Telegram trades using fresh TradingView data + recent performance analysis

Components:

Market Data Cache (lib/trading/market-data-cache.ts)
- Singleton service storing TradingView metrics
- 5-minute expiry on cached data
- Tracks: ATR, ADX, RSI, volume ratio, price position, timeframe
Market Data Webhook (app/api/trading/market-data/route.ts)
- Receives TradingView alerts every 1-5 minutes
- POST: Updates cache with fresh metrics
- GET: View cached data (debugging)
Re-Entry Check Endpoint (app/api/analytics/reentry-check/route.ts)
- Validates manual trade requests
- Uses fresh TradingView data if available (<5min old)
- Falls back to historical metrics from last trade
- Scores signal quality + applies performance modifiers:
  - -20 points if last 3 trades lost money (avgPnL < -5%)
  - +10 points if last 3 trades won (avgPnL > +5%, WR >= 66%)
  - -5 points for stale data, -10 points for no data
- Minimum score: 55 (vs 60 for new signals)
Auto-Caching (app/api/trading/execute/route.ts)
- Every trade signal from TradingView auto-caches metrics
- Ensures fresh data available for manual re-entries
Telegram Integration (telegram_command_bot.py)
- Calls /api/analytics/reentry-check before executing manual trades
- Shows data freshness ("✅ FRESH 23s old" vs "⚠️ Historical")
- Blocks low-quality re-entries unless --force flag used
- Fail-open: Proceeds if analytics check fails
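
The 5-minute expiry in the Market Data Cache can be sketched as a timestamped lookup. This is an illustration only, not the contents of lib/trading/market-data-cache.ts: the class name `MarketDataCacheSketch`, the `MarketMetrics` shape, and the optional `now` parameter (added so the expiry can be exercised without waiting) are assumptions.

```typescript
interface MarketMetrics {
  atr: number
  adx: number
  rsi: number
  volumeRatio: number
  pricePosition: number
  timeframe: string
}

const MAX_AGE_MS = 5 * 60 * 1000 // 5-minute expiry, as described above

class MarketDataCacheSketch {
  private entries: Record<string, { metrics: MarketMetrics; storedAt: number }> = {}

  set(symbol: string, metrics: MarketMetrics, now: number = Date.now()): void {
    this.entries[symbol] = { metrics, storedAt: now }
  }

  // Returns metrics plus their age, or null when missing or older than 5 minutes.
  get(symbol: string, now: number = Date.now()): { metrics: MarketMetrics; ageSeconds: number } | null {
    const entry = this.entries[symbol]
    if (!entry) return null
    const ageMs = now - entry.storedAt
    if (ageMs > MAX_AGE_MS) return null
    return { metrics: entry.metrics, ageSeconds: Math.floor(ageMs / 1000) }
  }
}

const cache = new MarketDataCacheSketch()
cache.set('SOL-PERP', { atr: 0.4, adx: 25, rsi: 55, volumeRatio: 1.2, pricePosition: 40, timeframe: '5' }, 0)
console.log(cache.get('SOL-PERP', 23000)?.ageSeconds) // age in seconds for a "FRESH" label
```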

User Flow:

```
User: "long sol"
  ↓ Check cache for SOL-PERP
  ↓ Fresh data? → Use real TradingView metrics
  ↓ Stale/missing? → Use historical + penalty
  ↓ Score quality + recent performance
  ↓ Score >= 55? → Execute
  ↓ Score < 55? → Block (unless --force)
```
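
The scoring step in this flow combines a base quality score with the performance and freshness modifiers listed under the Re-Entry Check Endpoint. The sketch below shows one reading of those rules, not the endpoint's code: `reentryScore`, `reentryAllowed` and the input shapes are hypothetical, and treating the win and loss modifiers as mutually exclusive is an assumption.

```typescript
type DataFreshness = 'fresh' | 'stale' | 'none'

interface RecentPerformance {
  avgPnLPercent: number   // average P&L of the last 3 trades, in percent
  winRatePercent: number  // win rate of the last 3 trades, in percent
}

const MIN_REENTRY_SCORE = 55 // vs 60 for new signals

function reentryScore(baseScore: number, perf: RecentPerformance, freshness: DataFreshness): number {
  let score = baseScore
  if (perf.avgPnLPercent < -5) {
    score -= 20 // last 3 trades lost money
  } else if (perf.avgPnLPercent > 5 && perf.winRatePercent >= 66) {
    score += 10 // last 3 trades won
  }
  if (freshness === 'stale') score -= 5
  if (freshness === 'none') score -= 10
  return score
}

// --force bypasses the threshold, matching the Telegram flow above.
function reentryAllowed(score: number, force: boolean): boolean {
  return force || score >= MIN_REENTRY_SCORE
}

const score = reentryScore(70, { avgPnLPercent: -6, winRatePercent: 0 }, 'stale')
console.log(score, reentryAllowed(score, false))
```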

TradingView Setup: Create alerts that fire every 1-5 minutes with this webhook message:

```
{
  "action": "market_data",
  "symbol": "{{ticker}}",
  "timeframe": "{{interval}}",
  "atr": {{ta.atr(14)}},
  "adx": {{ta.dmi(14, 14)}},
  "rsi": {{ta.rsi(14)}},
  "volumeRatio": {{volume / ta.sma(volume, 20)}},
  "pricePosition": {{(close - ta.lowest(low, 100)) / (ta.highest(high, 100) - ta.lowest(low, 100)) * 100}},
  "currentPrice": {{close}}
}
```

Webhook URL: https://your-domain.com/api/trading/market-data

Per-Symbol Trading Controls

Purpose: Independent enable/disable toggles and position sizing for SOL and ETH to support different trading strategies (e.g., ETH for data collection at minimal size, SOL for profit generation).

Configuration Priority:

Per-symbol ENV vars (highest priority)
- SOLANA_ENABLED, SOLANA_POSITION_SIZE, SOLANA_LEVERAGE
- ETHEREUM_ENABLED, ETHEREUM_POSITION_SIZE, ETHEREUM_LEVERAGE
Market-specific config (from MARKET_CONFIGS in config/trading.ts)
Global ENV vars (fallback for BTC and other symbols)
- MAX_POSITION_SIZE_USD, LEVERAGE
Default config (lowest priority)
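
The four-level priority above amounts to taking the first defined value from an ordered list of sources. The sketch below illustrates that rule and is not the project's getPositionSizeForSymbol: `resolveSettings`, `toNumber`, `pick` and the `SymbolSettings` shape are hypothetical, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
interface SymbolSettings { enabled: boolean; size: number; leverage: number }
type Env = Record<string, string | undefined>

function toNumber(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === '') return undefined
  const n = Number(raw)
  return isFinite(n) ? n : undefined
}

// First defined candidate wins.
function pick<T>(...candidates: Array<T | undefined>): T | undefined {
  for (const c of candidates) {
    if (c !== undefined) return c
  }
  return undefined
}

// prefix is e.g. 'SOLANA' or 'ETHEREUM'; null for symbols without per-symbol vars.
function resolveSettings(
  prefix: string | null,
  env: Env,
  marketConfig: Partial<SymbolSettings>,
  defaults: SymbolSettings
): SymbolSettings {
  const rawEnabled = prefix ? env[prefix + '_ENABLED'] : undefined
  return {
    enabled: rawEnabled !== undefined ? rawEnabled === 'true' : (marketConfig.enabled ?? defaults.enabled),
    size: pick(
      prefix ? toNumber(env[prefix + '_POSITION_SIZE']) : undefined, // 1. per-symbol ENV
      marketConfig.size,                                             // 2. market-specific config
      toNumber(env.MAX_POSITION_SIZE_USD)                            // 3. global ENV
    ) ?? defaults.size,                                              // 4. default
    leverage: pick(
      prefix ? toNumber(env[prefix + '_LEVERAGE']) : undefined,
      marketConfig.leverage,
      toNumber(env.LEVERAGE)
    ) ?? defaults.leverage
  }
}

const env: Env = { SOLANA_POSITION_SIZE: '100', MAX_POSITION_SIZE_USD: '50', LEVERAGE: '10' }
const defaults: SymbolSettings = { enabled: true, size: 10, leverage: 1 }
console.log(resolveSettings('SOLANA', env, { leverage: 15 }, defaults))
console.log(resolveSettings(null, env, {}, defaults)) // e.g. BTC falls through to global ENV
```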

Settings UI: app/settings/page.tsx has dedicated sections:

💎 Solana section: Toggle + position size + leverage + risk calculator
⚡ Ethereum section: Toggle + position size + leverage + risk calculator
💰 Global fallback: For BTC-PERP and future symbols

Example usage:

```
// In execute/test endpoints
const { size, leverage, enabled } = getPositionSizeForSymbol(driftSymbol, config)
if (!enabled) {
  return NextResponse.json({
    success: false,
    error: 'Symbol trading disabled'
  }, { status: 400 })
}
```

Test buttons: Settings UI has symbol-specific test buttons:

💎 Test SOL LONG/SHORT (disabled when SOLANA_ENABLED=false)
⚡ Test ETH LONG/SHORT (disabled when ETHEREUM_ENABLED=false)

When Making Changes

Adding new config: Update DEFAULT_TRADING_CONFIG + getConfigFromEnv() + .env file
Adding database fields: Update prisma/schema.prisma → npx prisma migrate dev → npx prisma generate → rebuild Docker
Changing order logic: Test with DRY_RUN=true first, use small position sizes ($10)
API endpoint changes: Update both endpoint + corresponding n8n workflow JSON (Check Risk and Execute Trade nodes)
Docker changes: Rebuild with docker compose build trading-bot then restart container
Modifying quality score logic: Update BOTH /api/trading/check-risk and /api/trading/execute endpoints, ensure timeframe-aware thresholds are synchronized
Exit strategy changes: Modify Position Manager logic + update on-chain order placement in placeExitOrders()
TradingView alert changes: Ensure alerts pass timeframe field (e.g., "timeframe": "5") to enable proper signal quality scoring
Position Manager changes: ALWAYS execute test trade after deployment
- Use /api/trading/test endpoint or Telegram long sol --force
- Monitor docker logs -f trading-bot-v4 for full cycle
- Verify TP1 hit → 75% close → SL moved to breakeven
- SQL: Check tp1Hit, slMovedToBreakeven, currentSize in Trade table
- Compare: Position Manager logs vs actual Drift position size
Calculation changes: Add verbose logging and verify with SQL
- Log every intermediate step, especially unit conversions
- Never assume SDK data format - log raw values to verify
- SQL query with manual calculation to compare results
- Test boundary cases: 0%, 100%, min/max values
DEPLOYMENT VERIFICATION (MANDATORY): Before declaring ANY fix working:
- Check container start time vs commit timestamp
- If container older than commit: CODE NOT DEPLOYED
- Restart container and verify new code is running
- Never say "fixed" or "protected" without deployment confirmation
- This is a REAL MONEY system - unverified fixes cause losses
GIT COMMIT AND PUSH (MANDATORY): After completing ANY feature, fix, or significant change:
- ALWAYS commit changes with descriptive message
- ALWAYS push to remote repository
- User should NOT have to ask for this - it's part of completion
- Commit message format:
```
git add -A
git commit -m "type: brief description

- Bullet point details
- Files changed
- Why the change was needed
"
git push
```
- Types: feat: (feature), fix: (bug fix), docs: (documentation), refactor: (code restructure)
- This is NOT optional - code exists only when committed and pushed
NEXTCLOUD DECK SYNC (MANDATORY): After completing phases or making significant roadmap progress:
- Update roadmap markdown files with new status (🔄 IN PROGRESS, ✅ COMPLETE, 🔜 NEXT)
- Run sync to update Deck cards: python3 scripts/sync-roadmap-to-deck.py --init
- Move cards between stacks in Nextcloud Deck UI to reflect progress visually
- Backlog (📥) → Planning (📋) → In Progress (🚀) → Complete (✅)
- Keep Deck in sync with actual work - it's the visual roadmap tracker
- Documentation: docs/NEXTCLOUD_DECK_SYNC.md
UPDATE COPILOT-INSTRUCTIONS.MD (MANDATORY): After implementing ANY significant feature or system change:
- Document new database fields and their purpose
- Add filtering requirements (e.g., manual vs TradingView trades)
- Update "Important fields" sections with new schema changes
- Add new API endpoints to the architecture overview
- Document data integrity requirements (what must be excluded from analysis)
- Add SQL query patterns for common operations
- Update "When Making Changes" section with new patterns learned
- Create reference docs in docs/ for complex features (e.g., MANUAL_TRADE_FILTERING.md)
- WHY: Future AI agents need complete context to maintain data integrity and avoid breaking analysis
- EXAMPLES: signalSource field for filtering, MAE/MFE tracking, phantom trade detection
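
The deployment verification rule above (container start time vs commit timestamp) reduces to a single timestamp comparison. The sketch below is a hypothetical helper, not part of the repository: `isCommitDeployed` assumes both inputs are ISO 8601 strings, which could be gathered with commands such as `docker inspect -f '{{.State.StartedAt}}' <container>` and `git log -1 --format=%cI`, and the timestamps in the example are made up.

```typescript
// True only when the container started at or after the commit was made.
function isCommitDeployed(containerStartedAt: string, commitTimestamp: string): boolean {
  const started = Date.parse(containerStartedAt)
  const committed = Date.parse(commitTimestamp)
  if (isNaN(started) || isNaN(committed)) {
    throw new Error('Unparseable timestamp')
  }
  return started >= committed
}

// Illustrative timestamps only:
console.log(isCommitDeployed('2025-11-15T17:05:00Z', '2025-11-15T16:30:00Z')) // container newer than commit
console.log(isCommitDeployed('2025-11-15T15:00:00Z', '2025-11-15T16:30:00Z')) // container older: CODE NOT DEPLOYED
```

A container that started after the commit still only proves a restart happened, so confirming the image was rebuilt from that commit remains a separate check.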

Development Roadmap

Current Status (Nov 14, 2025):

168 trades executed with quality scores and MAE/MFE tracking
Capital: $97.55 USDC at 100% health (zero debt, all USDC collateral)
Leverage: 15x SOL (reduced from 20x for safer liquidation cushion)
Three active optimization initiatives in data collection phase:
1. Signal Quality: 0/20 blocked signals collected → need 10-20 for analysis
2. Position Scaling: 161 v5 trades, collecting v6 data → need 50+ v6 trades
3. ATR-based TP: 1/50 trades with ATR data → need 50 for validation
Expected combined impact: 35-40% P&L improvement when all three optimizations complete
Master roadmap: See OPTIMIZATION_MASTER_ROADMAP.md for consolidated view

See SIGNAL_QUALITY_OPTIMIZATION_ROADMAP.md for systematic signal quality improvements:

Phase 1 (🔄 IN PROGRESS): Collect 10-20 blocked signals with quality scores (1-2 weeks)
Phase 2 (🔜 NEXT): Analyze patterns and make data-driven threshold decisions
Phase 3 (🎯 FUTURE): Implement dual-threshold system or other optimizations based on data
Phase 4 (🤖 FUTURE): Automated price analysis for blocked signals
Phase 5 (🧠 DISTANT): ML-based scoring weight optimization

See POSITION_SCALING_ROADMAP.md for planned position management optimizations:

Phase 1 (✅ COMPLETE): Collect data with quality scores (20-50 trades needed)
Phase 2: ATR-based dynamic targets (adapt to volatility)
Phase 3: Signal quality-based scaling (high quality = larger runners)
Phase 4: Direction-based optimization (shorts vs longs have different performance)
Phase 5 (✅ COMPLETE): TP2-as-runner system implemented - configurable runner (default 25%, adjustable via TAKE_PROFIT_1_SIZE_PERCENT) with ATR-based trailing stop
Phase 6: ML-based exit prediction (future)

Recent Implementation: TP2-as-runner system provides 5x larger runner (default 25% vs old 5%) for better profit capture on extended moves. When TP2 price is hit, trailing stop activates on full remaining position instead of closing partial amount. Runner size is configurable (100% - TP1 close %).
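
The runner arithmetic above (runner size = 100% - TP1 close %) can be checked with a few lines. This is a minimal sketch with hypothetical helper names; the $1,463 notional is the approximate figure quoted at the top of this document, and the 75% TP1 close is the value implied by the default 25% runner.

```typescript
// Runner size is whatever TP1 leaves open.
function runnerPercent(tp1ClosePercent: number): number {
  if (tp1ClosePercent < 0 || tp1ClosePercent > 100) {
    throw new RangeError('TP1 close percent must be between 0 and 100')
  }
  return 100 - tp1ClosePercent
}

function runnerNotionalUSD(positionUSD: number, tp1ClosePercent: number): number {
  return positionUSD * runnerPercent(tp1ClosePercent) / 100
}

console.log(runnerPercent(75))            // default runner share in percent
console.log(runnerNotionalUSD(1463, 75))  // runner notional in USD
console.log(runnerPercent(75) / 5)        // size relative to the old 5% runner
```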

Blocked Signals Tracking (Nov 11, 2025): System now automatically saves all blocked signals to database for data-driven optimization. See BLOCKED_SIGNALS_TRACKING.md for SQL queries and analysis workflows.

Data-driven approach: Each phase requires validation through SQL analysis before implementation. No premature optimization.

Signal Quality Version Tracking: Database tracks signalQualityVersion field to compare algorithm performance:

Analytics dashboard shows version comparison: trades, win rate, P&L, extreme position stats
v4 (current) includes blocked signals tracking for data-driven optimization
Focus on extreme positions (< 15% range) - v3 aimed to reduce losses from weak ADX entries
SQL queries in docs/analysis/SIGNAL_QUALITY_VERSION_ANALYSIS.sql for deep-dive analysis
Need 20+ trades per version before meaningful comparison

Financial Roadmap Integration: All technical improvements must align with current phase objectives (see top of document):

Phase 1 (CURRENT): Prove system works, compound aggressively, 60%+ win rate mandatory
Phase 2-3: Transition to sustainable growth while funding withdrawals
Phase 4+: Scale capital while reducing risk progressively
See TRADING_GOALS.md for complete 8-phase plan ($106 → $1M+)

Blocked Signals Analysis: See BLOCKED_SIGNALS_TRACKING.md for:

SQL queries to analyze blocked signal patterns
Score distribution and metric analysis
Comparison with executed trades at similar quality levels
Future automation of price tracking (would TP1/TP2/SL have hit?)

Integration Points

n8n: Expects exact response format from /api/trading/execute (see n8n-complete-workflow.json)
Drift Protocol: Uses SDK v2.75.0 - check docs at docs.drift.trade for API changes
Pyth Network: WebSocket + HTTP fallback for price feeds (handles reconnection)
PostgreSQL: Version 16-alpine, must be running before bot starts

Key Mental Model: Think of this as two parallel systems (on-chain orders + software monitoring) working together. The Position Manager is the "backup brain" that constantly watches and acts if on-chain orders fail. Both write to the same database for complete trade history.
