Commit Graph

130 Commits

Author SHA1 Message Date
mindesbunister
c37a9a37d3 fix: Implement Associated Token Account for USDC withdrawals
- Fixed PublicKey undefined error (derive from DRIFT_WALLET_PRIVATE_KEY)
- Implemented ATA resolution using @solana/spl-token
- Added comprehensive debug logging for withdrawal flow
- Fixed AccountOwnedByWrongProgram error (need ATA not wallet address)
- Successfully tested .58 withdrawal with on-chain confirmation
- Updated .env with TOTAL_WITHDRAWN and LAST_WITHDRAWAL_TIME tracking

Key changes:
- lib/drift/withdraw.ts: Added getAssociatedTokenAddress() for USDC ATA
- tsconfig.json: Excluded archive folders from compilation
- package.json: Added bn.js as direct dependency

Transaction: 4drNfMR1xBosGCQtfJ2a4r6oEawUByrT6L7Thyqu6QQWz555hX3QshFuJqiLZreL7KrheSgTdCEqMcXP26fi54JF
Wallet: 3dG7wayp7b9NBMo92D2qL2sy1curSC4TTmskFpaGDrtA
USDC ATA: 8ZEMwErnwxPNNNHJigUcMfrkBG14LCREDdKbqKm49YY7
2025-11-19 20:35:32 +01:00
mindesbunister
9cd317887a fix: Correct BN import for withdrawal system
Changed from '@project-serum/anchor' to 'bn.js' to match
other Drift SDK integrations. Fixes 'Cannot read properties
of undefined (reading '_bn')' error.

User can now test withdrawal with $5 minimum.
2025-11-19 18:39:45 +01:00
mindesbunister
1c79178aac fix: TypeScript errors in withdrawal system
- Fixed LAST_WITHDRAWAL_TIME type (null | string)
- Removed parseFloat on health.freeCollateral (already number)
- Fixed getDriftClient() → getClient() method name
- Build now compiles successfully

Deployed: Withdrawal system now live on dashboard
2025-11-19 18:25:54 +01:00
mindesbunister
ca7b49f745 feat: Add automated profit withdrawal system
- UI page: /withdrawals with stats dashboard and config form
- Settings API: GET/POST for .env configuration
- Stats API: Real-time profit and withdrawal calculations
- Execute API: Safe withdrawal with Drift SDK integration
- Drift service: withdrawFromDrift() with USDC spot market (index 0)
- Safety checks: Min withdrawal amount, min account balance, profit-only
- Telegram notifications: Withdrawal alerts with Solscan links
- Dashboard navigation: Added Withdrawals card (3-card grid)

User goal: 10% of profits automatically withdrawn on schedule
Current: Manual trigger ready, scheduled automation pending
Files: 5 new (withdrawals page, 3 APIs, Drift service), 2 modified
2025-11-19 18:07:07 +01:00
mindesbunister
60fc571aa6 feat: Automated multi-timeframe price tracking system
Implemented comprehensive price tracking for multi-timeframe signal analysis.

**Components Added:**
- lib/analysis/blocked-signal-tracker.ts - Background job tracking prices
- app/api/analytics/signal-tracking/route.ts - Status/metrics endpoint

**Features:**
- Automatic price tracking at 1min, 5min, 15min, 30min intervals
- TP1/TP2/SL hit detection using ATR-based targets
- Max favorable/adverse excursion tracking (MFE/MAE)
- Analysis completion after 30 minutes
- Background job runs every 5 minutes
- Entry price captured from signal time

**Database Changes:**
- Added entryPrice field to BlockedSignal (for price tracking baseline)
- Added maxFavorablePrice, maxAdversePrice fields
- Added maxFavorableExcursion, maxAdverseExcursion fields

**Integration:**
- Auto-starts on container startup
- Tracks all DATA_COLLECTION_ONLY signals
- Uses same TP/SL calculation as live trades (ATR-based)
- Calculates profit % based on direction (long vs short)

**API Endpoints:**
- GET /api/analytics/signal-tracking - View tracking status and metrics
- POST /api/analytics/signal-tracking - Manually trigger update (auth required)

**Purpose:**
Enables data-driven multi-timeframe comparison. After 50+ signals per
timeframe, can analyze which timeframe (5min vs 15min vs 1H vs 4H vs Daily)
has best win rate, profit potential, and signal quality.

**What It Tracks:**
- Price at 1min, 5min, 15min, 30min after signal
- Would TP1/TP2/SL have been hit?
- Maximum profit/loss during 30min window
- Complete analysis of signal profitability

**How It Works:**
1. Signal comes in (15min, 1H, 4H, Daily) → saved to BlockedSignal
2. Background job runs every 5min
3. Queries current price from Pyth
4. Calculates profit % from entry
5. Checks if TP/SL thresholds crossed
6. Updates MFE/MAE if new highs/lows
7. After 30min, marks analysisComplete=true

**Future Analysis:**
After 50+ signals per timeframe:
- Compare TP1 hit rates across timeframes
- Identify which timeframe has highest win rate
- Determine optimal signal frequency vs quality trade-off
- Switch production to best-performing timeframe

User requested: "i want all the bells and whistles. lets make the
powerhouse more powerfull. i cant see any reason why we shouldnt"
2025-11-19 17:18:47 +01:00
mindesbunister
eccecf7aaa critical: Fix container restart killing positions + phantom detection
Two critical bugs caused by container restart:

1. **Startup order restore failure:**
   - Wrong field names: takeProfit1OrderTx → tp1OrderTx
   - Caused: Prisma error, orders not restored, position unprotected
   - Impact: Container restart left position with NO TP/SL backup

2. **Phantom detection killing runners:**
   - Bug: Flagged runners after TP1 as phantom trades
   - Logic: (currentSize / positionSize) < 0.5
   - Example: $3,317 runner / $8,325 original = 40% = PHANTOM!
   - Result: Set P&L to $0.00 on profitable runner exit

Fixes:
- Use correct DB field names (tp1OrderTx, tp2OrderTx, slOrderTx)
- Phantom detection only checks BEFORE TP1 hit
- Runner P&L calculated on currentSize, not originalPositionSize
- If TP1 hit, we're closing the RUNNER (currentSize)
- If TP1 not hit, we're closing FULL position (originalPositionSize)

Real Impact (Nov 19, 2025):
- SHORT $138.355 → Runner trailing at $136.72 (peak)
- Container restart → Orders failed to restore
 Closed with $0.00 P&L
- Actual profit from Drift: ~$54.41 (TP1 + runner combined)

Prevention:
- Next restart will restore orders correctly
- Runners will calculate P&L properly
- No more premature closures from schema errors
2025-11-19 15:03:15 +01:00
mindesbunister
b2cb6a3ecd critical: Fix ADX-based runner SL in on-chain fill detection path
The ADX-based runner SL logic was only applied in the direct price
check path (lines 1065-1086) but NOT in the on-chain fill detection
path (lines 590-650).

When TP1 fills via on-chain order (most common), the system was using
hard-coded breakeven SL instead of ADX-based positioning.

Bug Impact:
- ADX 20.0 trade got breakeven SL ($138.355) instead of -0.3% ($138.77)
- Runner has $0.42 less room to breathe than intended
- Weak trends protected correctly but moderate/strong trends not

Fix:
- Applied same ADX-based logic to on-chain fill detection
- Added detailed logging for each ADX tier
- Runner SL now correct regardless of TP1 trigger path

Current trade (cmi5zpx5s0000lo07ncba1kzh):
- Already hit TP1 via old code (breakeven SL active)
- New ADX-based SL will apply to NEXT trade
- Current position: SHORT $138.3550, runner at breakeven

Code paths with ADX logic:
1. Direct price check (lines 1050-1100) 
2. On-chain fill detection (lines 607-642)  FIXED
2025-11-19 14:56:24 +01:00
mindesbunister
66b292246b feat: ADX-based adaptive runner SL positioning after TP1
Implements intelligent runner protection based on trend strength:

ADX-based SL positioning (Nov 19, 2025):
- ADX < 20: SL at 0% (breakeven) - Weak trend, preserve capital
- ADX 20-25: SL at -0.3% - Moderate trend, some retracement room
- ADX > 25: SL at -0.55% - Strong trend, full retracement tolerance

Rationale:
- User observation: Entry at candle close = always at top
- Screenshots showed -1% to -1.5% pullbacks even on valid trends
- Fixed -0.55% SL would cause unnecessary losses on weak trends
- Adaptive approach: Protect capital when trend weak, give room when strong

Benefits:
1. Capital preservation: Weak trends (ADX <20) get breakeven SL
2. Optimized risk/reward: Only risk runner drawdown on high-probability setups
3. Data-driven thresholds: Based on historical ADX distribution (18-32 range)
4. Complements ADX trailing stop multiplier (also trend-strength adaptive)

Example scenarios:
- ADX 18 (weak): TP1 +$38.70, Runner SL at breakeven → Total: +$38.70
- ADX 29 (strong): TP1 +$38.70, Runner SL at -0.55% → Survives pullback, captures big move

Logging:
- Shows ADX value and selected SL percentage
- Format: "🔒 ADX-based runner SL: 29.3 → -0.55% (60% closed, 40% remaining): $139.23"

Data collection phase:
- After 50-100 trades, will analyze optimal ADX thresholds
- May adjust breakpoints (20/25) based on actual performance
- Tracks runner stop-out rate vs ADX for optimization

Files changed:
- lib/trading/position-manager.ts: ADX-based SL calculation in TP1 handler
- Replaces fixed PROFIT_LOCK_AFTER_TP1_PERCENT with dynamic logic
- Uses trade.adxAtEntry (already tracked in database)
2025-11-19 13:10:10 +01:00
mindesbunister
d09838d1dc feat: Add ADX-based trend strength multiplier for trailing stops
Implements graduated trailing stop widening based on ADX at entry:
- ADX > 30: 1.5x wider trail (very strong trends)
- ADX 25-30: 1.25x wider trail (strong trends)
- ADX < 25: Base trail (weak/moderate trends)

Also adds profit acceleration:
- Profit > 2%: Additional 1.3x multiplier
- Combines with ADX for maximum trail width

Purpose: Capture more of massive trend moves (like 38% MFE trades)
- Current system exits at ~1% with tight 0.67% trail
- ADX 29.3 + 2% profit: Trail widens to ~1.1-1.3%
- Expected improvement: 50%+ better profit capture on big moves

Example impact (Nov 19 trade):
- Entry: $140.17, MFE: 38.12%, Captured: 0.99%
- With ADX multiplier: Would capture ~1.5-2% (50%+ improvement)

Changes:
- lib/trading/position-manager.ts: Added adxAtEntry to ActiveTrade interface
- Trail calculation now checks trade.adxAtEntry and applies multipliers
- Backward compatible: Trades without ADX use base multiplier
- Logs show: "Strong trend (ADX 29.3): Trail multiplier 1.5x → 1.88x"

Data-driven decision based on:
- 4 recent v8 trades with ADX: 29.3, 20.8, 21.4, 18.3
- Nov 19 trade: ADX 29.3, MFE 38.12%, only captured 0.99%
- System needed wider trail for strong trends
2025-11-19 11:57:21 +01:00
mindesbunister
de57c9634c fix: Correct TP1 detection for on-chain order fills
Problem: When TP1 order fills on-chain and runner closes quickly,
Position Manager detects entire position gone but doesn't know TP1 filled.
Result: Marks trade as 'SL' instead of 'TP1', closes 100% instead of partial.

Root cause: Position Manager monitoring loop only knows about trade state
flags (tp1Hit), not actual Drift order fill history. When both TP1 and
runner close before next monitoring cycle, tp1Hit=false but position gone.

Fix: Use profit percentage to infer exit reason instead of trade flags
- Profit >1.2%: TP2 range
- Profit 0.3-1.2%: TP1 range
- Profit <0.3%: SL/breakeven range

Always calculate P&L on full originalPositionSize for external closures.
Exit reason logic determines what actually triggered based on P&L amount.

Example from Nov 19 08:40 CET trade:
- Entry $140.17, Exit $140.85 = 0.48% profit
- Old: Marked as 'SL' (tp1Hit=false, didn't know TP1 filled)
- New: Will mark as 'TP1' (profit in TP1 range)

Lines changed: lib/trading/position-manager.ts:760-835
2025-11-19 09:03:12 +01:00
mindesbunister
7833686b7b critical: Fix P&L compounding in external closure detection
Root cause: trade.realizedPnL was reading from in-memory ActiveTrade object
which could have stale/mutated values from previous detection cycles.

Bug sequence:
1. External closure detected, calculates P&L including previouslyRealized
2. Updates database with totalRealizedPnL
3. Same closure detected again (due to race condition or rate limits)
4. Reads previouslyRealized from same in-memory object (now has accumulated value)
5. Adds MORE P&L to it, compounds 2x, 5x, 10x

Real impact: 9 expected P&L became 81 (10x inflation)

Fix: Remove previouslyRealized from calculation entirely for external closures.
External closures calculate ONLY the current position P&L, not cumulative.
Database will have correct cumulative value if TP1 was processed separately.

Lines changed: lib/trading/position-manager.ts:785-803
- Removed: const previouslyRealized = trade.realizedPnL
- Removed: previouslyRealized + runnerRealized
- Now: totalRealizedPnL = runnerRealized (ONLY this closure's P&L)

Tested: Build completed successfully, container deployed and monitoring positions
2025-11-19 08:41:10 +01:00
mindesbunister
89f30ab704 fix: Remove accountPnL reference in log statement
- TypeScript error: Cannot find name 'accountPnL'
- Removed account percentage from monitoring logs
- Now shows: MFE/MAE in dollars (not percentages)
- Part of Nov 19 MAE/MFE dollar fix
2025-11-19 07:47:56 +01:00
mindesbunister
267456f699 critical: Fix MAE/MFE storing percentages instead of dollars + duplicate Telegram notifications
CRITICAL BUGS FIXED (Nov 19, 2025):

1. MAE/MFE Bug:
   - Was storing: account percentage (profit % × leverage)
   - Example: 1.31% move × 15x = 19.73% stored as MFE
   - Should store: actual dollar P&L (81 not 19.73%)
   - Impact: Telegram shows 'Max Gain: +19.73%' instead of '+.XX'
   - Fix: Changed from accountPnL (leverage-adjusted %) to currentPnLDollars
   - Lines 964-987: Removed accountPnL calculation, use currentPnLDollars

2. Duplicate Notification Bug:
   - handleExternalClosure() was checking if trade removed AFTER removal
   - Result: 16 duplicate Telegram notifications with compounding P&L
   - Example: 6 → 2 → 11 → ... → 81 (16 notifications for 1 close)
   - Fix: Check if trade already removed BEFORE processing
   - Lines 382-391: Move duplicate check to START of function
   - Early return prevents notification send if already processed

3. Database Compounding (NOT A BUG):
   - Nov 17 fix (Common Pitfall #49) still working correctly
   - Only 1 database record with 81 P&L
   - Issue was notification duplication, not DB duplication

IMPACT:
- MAE/MFE data now usable for TP/SL optimization
- Telegram notifications accurate (1 per close, correct P&L)
- Database analytics will show real dollar movements
- Next trade will have correct Max Gain/Drawdown display

FILES:
- lib/trading/position-manager.ts: MAE/MFE calculation + duplicate check
2025-11-19 07:45:58 +01:00
mindesbunister
6156c0f958 critical: Fix P&L compounding bug in external closure detection
- CRITICAL BUG: trade.realizedPnL was being mutated during each external closure detection
- This caused exponential compounding: $6 → $12 → $24 → $48 → $96
- Each time monitoring loop detected closure, it added previouslyRealized + runnerRealized
- But previouslyRealized was the ALREADY ACCUMULATED value from previous iteration
- Result: P&L compounded 15-20x on actual value

ROOT CAUSE (line 797):
  const totalRealizedPnL = previouslyRealized + runnerRealized
  trade.realizedPnL = totalRealizedPnL  // ← BUG: Mutates in-memory trade object

Next detection cycle:
  const previouslyRealized = trade.realizedPnL  // ← Gets ACCUMULATED value
  const totalRealizedPnL = previouslyRealized + runnerRealized  // ← Adds AGAIN

FIX:
- Don't mutate trade.realizedPnL during external closure detection
- Calculate totalRealizedPnL locally, use for database update only
- trade.realizedPnL stays immutable after initial DB save
- Log message clarified: 'P&L calculation' not 'P&L snapshot'

IMPACT:
- Every external closure (TP/SL on-chain orders) affected
- With rate limiting, closure detected 15-20 times before removal
- Real example: $6 actual profit showed as $92.46 in database
- This is WORSE than duplicate notification bug - corrupts financial data

FILES CHANGED:
- lib/trading/position-manager.ts: Removed trade.realizedPnL mutation (line 799)
- Database manually corrected: $92.46 → $6.00 for affected trade

RELATED BUGS:
- Common Pitfall #48: closingInProgress flag prevents some duplicates
- But doesn't help if monitoring loop runs DURING external closure detection
- Need both fixes: closingInProgress + no mutation of trade.realizedPnL
2025-11-17 15:28:08 +01:00
mindesbunister
3aeb00f998 critical: Fix P&L calculation and TP1 false detection bugs
- Add originalPositionSize tracking to prevent stale size usage
- Add price validation to TP1 detection (prevents manual closes misidentified as TP1)
- Fix external closure P&L to use originalPositionSize not currentSize
- Add handleManualClosure method for proper exit reason detection
- Add isPriceAtTarget helper for TP/SL price validation (0.2% tolerance)
- Update all ActiveTrade creation points (execute, test, sync-positions, test-db)

Bug fixes:
- Manual close at 42.34 was detected as TP1 (target 40.71) - FIXED
- P&L showed -$1.71 instead of actual -$2.92 - FIXED
- Exit reason showed SL instead of manual - FIXED

Root cause: Position Manager detected size reduction without validating
price was actually at TP1 level. Used stale currentSize for P&L calculation.

Files modified:
- lib/trading/position-manager.ts (core fixes)
- app/api/trading/execute/route.ts
- app/api/trading/test/route.ts
- app/api/trading/sync-positions/route.ts
- app/api/trading/test-db/route.ts
2025-11-17 15:10:15 +01:00
mindesbunister
be2410c639 critical: Auto-restore missing on-chain orders on startup
PROBLEM (Nov 16, 22:03):
- Ghost position closed with -$6.77 loss
- Validator cleanup removed orders
- Position existed on Drift with NO on-chain TP/SL
- Only Position Manager software protection active
- If bot crashes, position completely unprotected

FIX:
- Added restoreOrdersIfMissing() to startup validator
- Checks every verified position for orders
- Automatically places TP/SL if missing
- Updates database with order transaction IDs

BEHAVIOR:
- Runs on every container startup
- Validates all open positions
- Logs: ' {symbol} has X on-chain orders'
- Or: '⚠️ {symbol} has NO orders - restoring...'
- Provides dual-layer protection always

Impact: Eliminates unprotected position risk after
validator cleanups, container restarts, or order issues.
2025-11-16 22:18:56 +01:00
mindesbunister
cdd3a5dcb0 critical: Fix startup validator reopening duplicate trades
- Group trades by symbol before validation
- Keep only most recent trade per symbol
- Close older duplicates with DUPLICATE_CLEANUP reason
- Prevents reopening old closed trades when checking recent trades

Bug: Startup validator was reopening ALL closed trades for a symbol
if Drift showed one position, causing 3 trades to be tracked when
only 1 actual position existed on Drift.

Impact: Position Manager was tracking ghost positions, causing
confusion and potential missed risk management.
2025-11-16 21:49:26 +01:00
mindesbunister
b813a38ae9 fix: Handle multiple DB trades for single Drift position in validator
PROBLEM (User identified):
- Analytics showed 3 open trades when Drift UI showed only 1 position
- Database had 3 separate trade records all marked as 'open'
- Root cause: Drift has 1 POSITION + 3 ORDERS (TP/SL exit orders)
- Validator was incorrectly treating each as separate position

SOLUTION:
- Group database trades by symbol before validation
- If multiple DB trades exist for one Drift position, keep only MOST RECENT
- Close older duplicate trades with exitReason='DUPLICATE_CLEANUP'
- Properly handles: 1 Drift position → 1 DB trade (correct state)

RESULT:
- Database: 1 open trade (matches Drift reality)
- Analytics: Shows accurate position count
- Runs automatically every 10 minutes + manual trigger available
2025-11-16 21:36:21 +01:00
mindesbunister
66c3c42547 feat: Add automated database sync validator for ghost position detection
PROBLEM:
- Analytics page showed 3 open trades when only 1 actually open on Drift
- Ghost positions in database (realizedPnL set but exitReason = NULL)
- Happens when on-chain orders fill but database update fails
- Manual cleanup required = unreliable dashboard

SOLUTION: Automated Database Sync Validator
1. Runs every 10 minutes (independent of Position Manager)
2. Validates ALL 'open' database trades against actual Drift positions
3. Auto-fixes ghost positions (marks as closed with exitReason)
4. Provides manual validation endpoint: GET /api/admin/validate-db

FEATURES:
- Detects ghost positions (DB open, Drift closed)
- Detects orphan positions (DB closed, Drift open)
- Provides detailed validation reports
- Runs on server startup + periodic intervals
- Zero manual intervention required

FILES:
- lib/database/sync-validator.ts: Core validation logic
- app/api/admin/validate-db/route.ts: Manual validation endpoint
- instrumentation.ts: Auto-start on server initialization

RESULT: Reliable dashboard data - always matches Drift reality
2025-11-16 21:30:29 +01:00
mindesbunister
018f973609 critical: Fix P&L compounding during close verification (20x inflation bug)
Problem:
- Close transaction confirmed but Drift state takes 5-10s to propagate
- Position Manager returned needsVerification=true to keep monitoring
- BUT: Monitoring loop detected position as 'externally closed' EVERY 2 seconds
- Each detection called handleExternalClosure() and added P&L to database
- Result: .66 actual profit → 73.36 in database (20x compounding)
- Logs showed: $112.96 → $117.62 → $122.28 → ... → $173.36 (14+ updates)

Root Cause:
- Common Pitfall #47 fix introduced needsVerification flag to wait for propagation
- But NO flag to prevent external closure detection during wait period
- Monitoring loop thought position was 'closed externally' on every cycle
- Rate limiting (429 errors) made it worse by extending wait time

Fix (closingInProgress flag):
1. Added closingInProgress boolean to ActiveTrade interface
2. Set flag=true when needsVerification returned (close confirmed, waiting)
3. Skip external closure detection entirely while flag=true
4. Timeout after 60s if stuck (abnormal case - allows cleanup)

Impact:
- Every close with verification delay (most closes) had 10-20x P&L inflation
- This is variant of Common Pitfall #27 but during verification, not external closure
- Rate limited closes were hit hardest (longer wait = more compounding cycles)

Files:
- lib/trading/position-manager.ts: Added closingInProgress flag + skip logic

Incident: Nov 16, 11:50 CET - SHORT 41.64→40.08 showed 73.36 vs .66 real
Documented: Common Pitfall #48
2025-11-16 15:07:27 +01:00
mindesbunister
b23dde057b fix: Add needsVerification field to ClosePositionResult interface
- Added optional needsVerification?: boolean to ClosePositionResult
- Fixes TypeScript build error from commit c607a66
- Required for position close verification logic
- Allows Position Manager to keep monitoring if close not yet propagated
2025-11-16 10:28:46 +01:00
mindesbunister
c607a66239 critical: Fix position close verification to prevent ghost positions
Problem:
- Close transaction confirmed on-chain BUT Drift state takes 5-10s to propagate
- Position Manager immediately checked position after close → still showed open
- Continued monitoring with stale state → eventually ghost detected
- Database marked 'SL closed' but position actually stayed open for 6+ hours
- Position was UNPROTECTED during this time (no monitoring, no TP/SL backup)

Root Cause:
- Transaction confirmation ≠ Drift internal state updated
- SDK needs time to propagate on-chain changes to internal cache
- Position Manager assumed immediate state consistency

Fix (2-layer verification):
1. closePosition(): After 100% close confirmation, wait 5s then verify
   - Query Drift to confirm position actually gone
   - If still exists: Return needsVerification=true flag
   - Log CRITICAL error with transaction signature

2. Position Manager: Handle needsVerification flag
   - DON'T mark position closed in database
   - DON'T remove from monitoring
   - Keep monitoring until ghost detection sees it's actually closed
   - Prevents premature cleanup with wrong exit data

Impact:
- Prevents 6-hour unmonitored position exposure
- Ensures database exit data matches actual Drift closure
- Ghost detection becomes safety net, not primary close mechanism
- User positions always protected until VERIFIED closed

Files:
- lib/drift/orders.ts: Added 5s wait + position verification after close
- lib/trading/position-manager.ts: Check needsVerification flag before cleanup

Incident: Nov 16, 02:51 - Close confirmed but position stayed open until 08:51
2025-11-16 10:00:10 +01:00
mindesbunister
673a49302a critical: Fix breakeven SL using wrong entry price after TP1
- CRITICAL BUG: Drift SDK's position.entryPrice RECALCULATES after partial closes
- After TP1, Drift returns COST BASIS of remaining position, NOT original entry
- Example: SHORT @ 38.52 → TP1 @ 70% → Drift shows entry 40.01 (runner's basis)
- Result: Breakeven SL set .50 ABOVE actual entry = guaranteed loss if triggered

Fix:
- Always use database trade.entryPrice for breakeven calculations
- Drift's position.entryPrice = current state (runner cost basis)
- Database entryPrice = original entry (authoritative for breakeven)
- Added logging to show both values for verification

Impact:
- Every TP1 → breakeven transition was using WRONG price
- Locking in losses instead of true breakeven protection
- Financial loss bug affecting every trade with TP1

Files:
- lib/trading/position-manager.ts: Line 513 - use trade.entryPrice not position.entryPrice
- .github/copilot-instructions.md: Added Common Pitfall #43, deprecated old #44

Incident: Nov 16, 02:47 CET - SHORT entry 38.52, breakeven SL set at 40.01
Position closed by ghost detection before SL could trigger (lucky)
2025-11-16 03:00:22 +01:00
mindesbunister
f505db4ac8 fix: Reduce Drift SDK auto-reconnect interval from 4h to 2h
Problem: Bot froze after only 1 hour of runtime with API timeouts,
despite having 4-hour auto-reconnect protection for Drift SDK memory leak.

Investigation showed:
- Singleton pattern working correctly (reusing same instance)
- Hundreds of accountUnsubscribe errors (WebSocket leak)
- Container froze at ~1 hour, not 4 hours

Root Cause: Drift SDK's memory leak is MORE SEVERE than expected.
Even with single instance, subscriptions accumulate faster than anticipated.
4-hour interval too long - system hits memory/connection limits before cleanup.

Solution: Reduce auto-reconnect interval to 2 hours (more aggressive).
This ensures cleanup happens before critical thresholds reached.

Code change (lib/drift/client.ts):
- reconnectIntervalMs: 4 hours → 2 hours
- Updated log messages to reflect new interval

Impact: System now self-heals every 2 hours instead of 4,
preventing the freeze that occurred tonight at 1-hour mark.

Related: Common Pitfall #1 (Drift SDK memory leak)
2025-11-16 02:15:01 +01:00
mindesbunister
528a0f4f43 fix: Use Drift's actual entry price for breakeven SL
Problem: After TP1, SL moved to 'breakeven' but used database entry price,
which can differ from Drift's actual fill price by /bin/bash.10-0.15.

Example:
- DB stored: $139.18291 entry
- Drift actual: $139.07 entry
- SL set to: $139.18 (DB value)

Solution: Query position.entryPrice from Drift SDK when setting breakeven SL.
Drift SDK calculates entry from on-chain data (quoteAssetAmount / baseAssetAmount)
which is more accurate than database stored value.

Code change (lib/trading/position-manager.ts line ~511):
- Before: trade.stopLossPrice = trade.entryPrice
- After: trade.stopLossPrice = position.entryPrice || trade.entryPrice

Impact: TRUE breakeven protection - no slippage losses from price discrepancies.

Related: Common Pitfall #33 (orphaned position restoration with wrong entry price)
2025-11-16 01:44:57 +01:00
mindesbunister
b1ca454a6f feat: Add Telegram notifications for position closures
Implemented direct Telegram notifications when Position Manager closes positions:
- New helper: lib/notifications/telegram.ts with sendPositionClosedNotification()
- Integrated into Position Manager's executeExit() for all closure types
- Also sends notifications for ghost position cleanups

Notification includes:
- Symbol, direction, entry/exit prices
- P&L amount and percentage
- Position size and hold time
- Exit reason (TP1, TP2, SL, manual, ghost cleanup, etc.)
- MAE/MFE stats (max gain/drawdown during trade)

User request: Receive P&L notifications on position closures via Telegram bot
Previously: Only opening notifications via n8n workflow
Now: All closures (TP/SL/manual/ghost) send notifications directly
2025-11-16 00:51:56 +01:00
mindesbunister
9db5f8566d refactor: Remove time-based ghost detection, rely purely on Drift API
User feedback: Time-based cleanup (6 hours) too aggressive for legitimate long-running positions.
Drift API is the authoritative source of truth.

Changes:
- Removed cleanupStalePositions() method entirely
- Removed age-based Layer 1 from validatePositions()
- Updated Layer 2: Now verifies with Drift API before removing position
- All ghost detection now uses Drift blockchain as source of truth

Ghost detection methods:
- Layer 2: Queries Drift after 20 failed close attempts
- Layer 3: Queries Drift every 40 seconds during monitoring
- Periodic validation: Queries Drift every 5 minutes

Result: No premature closures, more reliable ghost detection.
2025-11-16 00:22:19 +01:00
mindesbunister
4779a9f732 fix: 3-layer ghost position prevention system (CRITICAL autonomous reliability fix)
PROBLEM: Ghost positions caused death spirals
- Position Manager tracked 2 positions that were actually closed
- Caused massive rate limit storms (100+ RPC calls)
- Telegram /status showed wrong data
- Periodic validation SKIPPED during rate limiting (fatal flaw)
- Created death spiral: ghosts → rate limits → validation skipped → more rate limits

USER REQUIREMENT: "bot has to work all the time especially when i am not on my laptop"
- System MUST be fully autonomous
- Must self-heal from ghost accumulation
- Cannot rely on manual container restarts

SOLUTION: 3-layer protection system (Nov 15, 2025)

**LAYER 1: Database-based age check**
- Runs every 5 minutes during validation
- Removes positions >6 hours old (likely ghosts)
- Doesn't require RPC calls - ALWAYS works even during rate limiting
- Prevents long-term ghost accumulation

**LAYER 2: Death spiral detector**
- Monitors close attempt failures during rate limiting
- After 20+ failed close attempts (40+ seconds), forces removal
- Breaks rate limit death spirals immediately
- Prevents infinite retry loops

**LAYER 3: Monitoring loop integration**
- Every 20 price checks (~40 seconds), verifies position exists on Drift
- Catches ghosts quickly during normal monitoring
- No 5-minute wait - immediate detection
- Silently skips check during RPC errors (no log spam)

**Key fixes:**
- validatePositions(): Now runs database cleanup FIRST before Drift checks
- Changed 'skipping validation' to 'using database-only validation'
- Added cleanupStalePositions() function (>6h age threshold)
- Added death spiral detection in executeExit() rate limit handler
- Added ghost check in checkTradeConditions() every 20 price updates
- All layers work together - if one fails, others protect

**Impact:**
- System now self-healing - no manual intervention needed
- Ghost positions cleaned within 40-360 seconds (depending on layer)
- Works even during severe rate limiting (Layer 1 always runs)
- Telegram /status always shows correct data
- User can be away from laptop - bot handles itself

**Testing:**
- Container restart cleared ghosts (as expected - DB shows all closed)
- New fixes will prevent future accumulation autonomously

Files changed:
- lib/trading/position-manager.ts (3 layers added)
2025-11-15 23:51:19 +01:00
mindesbunister
59bc267206 fix: Add runner stop loss protection (CRITICAL)
- CRITICAL BUG: Position Manager only checked SL before TP1
- After TP1 hit, runner had NO stop loss protection
- Added separate SL check for runner (after TP1, before TP2)
- Runner now protected by profit-lock SL on Position Manager

Bug discovered: Runner position with no on-chain orders (below min size)
AND no software protection (SL check skipped after TP1).

Impact: 2.79 runner exposed to unlimited loss for 10+ minutes.
Fix: Added line 881-886 runner SL check in monitoring loop.
2025-11-15 22:10:41 +01:00
mindesbunister
5b2ec408a8 fix: Update on-chain SL to breakeven after TP1 hit (CRITICAL)
CRITICAL BUG: After TP1 filled, Position Manager updated internal
stopLossPrice but NEVER updated the actual on-chain orders on Drift.
Runner had NO real stop loss protection at breakeven.

Fix:
- After TP1 detection, call cancelAllOrders() to remove old orders
- Then call placeExitOrders() with updated SL at breakeven
- Place TP2 as new TP1 for runner (activates trailing at that level)
- Logs: 'Cancelling old exit orders', 'Placing new exit orders'

Impact: Runner now properly protected at breakeven on-chain, not just
in Position Manager tracking.

Found: User screenshot showed SL still at original levels (46.57)
after TP1 hit, when it should have been at entry (42.89).
2025-11-15 19:37:05 +01:00
mindesbunister
d236e08cc0 feat: Add periodic Drift position validation to prevent ghost positions
- Added 5-minute validation interval to Position Manager
- Validates tracked positions against actual Drift state
- Auto-cleanup ghost positions (DB shows open but Drift shows closed)
- Prevents rate limit storms from accumulated ghost positions
- Logs detailed ghost detection: DB state vs Drift state
- Self-healing system requires no manual intervention

Implementation:
- scheduleValidation(): Sets 5-minute timer after monitoring starts
- validatePositions(): Queries each tracked position on Drift
- handleExternalClosure(): Reusable method for ghost cleanup
- Clears interval when monitoring stops

Benefits:
- Prevents ghost position accumulation
- Eliminates need for manual container restarts
- Minimal RPC overhead (1 check per 5 min per position)
- Addresses root cause (state management) not symptom (rate limits)

Fixes:
- Ghost positions from failed DB updates during external closures
- Container restart state sync issues
- Rate limit exhaustion from managing non-existent positions
2025-11-15 19:20:51 +01:00
mindesbunister
54c68b45d2 fix: Add retry logic to closePosition() for rate limit protection
CRITICAL FIX: Rate limit storm causing infinite close attempts

Root Cause Analysis (Trade cmi0il8l30000r607l8aec701):
- Position Manager tried to close position (SL or TP trigger)
- closePosition() in orders.ts had NO retry wrapper
- Failed with 429 error, returned to Position Manager
- Position Manager caught 429, kept monitoring
- EVERY 2 SECONDS: Attempted close again → 429 → retry
- Result: 100+ close attempts in logs, exhausted Helius rate limit
- Meanwhile: On-chain TP2 limit order filled (not affected by SDK limits)
- External closure detected, updated DB 8 TIMES ($0.14 → $0.51 compounding bug)

Why This Happened:
- placeExitOrders() has retryWithBackoff() wrapper (Nov 14 fix)
- openPosition() has NO retry wrapper (but less critical - only runs once)
- closePosition() had NO retry wrapper (CRITICAL - runs in monitoring loop)
- When closePosition() failed, Position Manager retried EVERY monitoring cycle

The Fix:
- Wrapped closePosition() placePerpOrder() call with retryWithBackoff()
- 8s base delay, 3 max retries (8s → 16s → 32s progression)
- Same pattern as placeExitOrders() for consistency
- Position Manager executeExit() already handles 429 by returning early
- Now: 3 SDK retries (24s) + Position Manager monitoring retry = robust

Impact:
- Prevents rate limit exhaustion from infinite close attempts
- Reduces RPC load by 30-50x during close operations
- Protects against external closure duplicate update bug
- User saw: $0.51 profit (8 DB updates) vs actual $0.14 (1 fill)

Files: lib/drift/orders.ts (line ~567: wrapped placePerpOrder in retryWithBackoff)

Verification: Container restarted 18:05 CET, code deployed
2025-11-15 18:06:12 +01:00
mindesbunister
8717f72a54 fix: Add retry logic to exit order placement (TP/SL)
CRITICAL FIX: Exit orders failed without retry on 429 rate limits

Root Cause:
- placeExitOrders() placed TP1/TP2/SL orders directly without retry wrapper
- cancelAllOrders() HAD retry logic (8s → 16s → 32s progression)
- Rate limit errors during exit order placement = unprotected positions
- If container crashes after opening, no TP/SL orders on-chain

Fix Applied:
- Wrapped ALL order placements in retryWithBackoff():
  * TP1 limit order (line ~310)
  * TP2 limit order (line ~334)
  * Soft stop trigger-limit (dual stop system)
  * Hard stop trigger-market (dual stop system)
  * Single stop trigger-limit
  * Single stop trigger-market (default)

Retry Behavior:
- Base delay: 8 seconds (was 5s, increased Nov 14)
- Progression: 8s → 16s → 32s (max 3 retries)
- Logs rate_limit_recovered to database on success
- Logs rate_limit_exhausted on max retries exceeded

Impact:
- Exit orders now retry up to 3x on 429 errors (56 seconds total wait)
- Positions protected even during RPC rate limit spikes
- Reduces need for immediate Helius upgrade
- Database analytics track retry success/failure

Files: lib/drift/orders.ts (6 placePerpOrder calls wrapped)

Note: cancelAllOrders() already had retry logic - this completes coverage
2025-11-15 17:34:01 +01:00
mindesbunister
fa4b187f46 feat: Hybrid RPC strategy - Helius for init, Alchemy for trades
CRITICAL FIX: Rate limiting causing unprotected positions

Root Cause:
- Rate limit errors preventing exit order placement after opening positions
- Positions opened with NO on-chain TP/SL protection
- If container crashes, position has unlimited risk exposure

Hybrid RPC Solution:
- Helius RPC: Drift SDK initialization (handles burst subscriptions perfectly)
- Alchemy RPC: Trade operations - open, close, confirmations (better sustained rate limits)
- Graceful fallback: If Alchemy not configured, uses Helius for everything

Implementation:
- DriftService: Dual connections (connection + tradeConnection)
- getTradeConnection() returns Alchemy if configured, else Helius
- openPosition() and closePosition() use tradeConnection for confirmTransaction()
- Added ALCHEMY_RPC_URL to .env (optional)

Configuration:
- SOLANA_RPC_URL: Helius (existing)
- ALCHEMY_RPC_URL: Added with your Alchemy key

Files:
- lib/drift/client.ts: Dual connection support + getTradeConnection()
- lib/drift/orders.ts: Use getTradeConnection() for all confirmations
- .env: Added ALCHEMY_RPC_URL

Logs show: '🔀 Hybrid RPC mode: Helius for init, Alchemy for trades'

Next: Test with new trade to verify orders place successfully
2025-11-15 12:15:23 +01:00
mindesbunister
0ef6b82106 feat: Hybrid RPC strategy (Helius init + Alchemy trades)
CRITICAL: Fix rate limiting by using dual RPC approach

Problem:
- Helius RPC gets overwhelmed during trade execution (429 errors)
- Exit orders fail to place, leaving positions UNPROTECTED
- No on-chain TP/SL orders = unlimited risk if container crashes

Solution: Hybrid RPC Strategy
- Helius for Drift SDK initialization (handles burst subscriptions well)
- Alchemy for trade operations (better sustained rate limits)
- Falls back to Helius if Alchemy not configured

Implementation:
- DriftService now has two connections: connection (Helius) + tradeConnection (Alchemy)
- Added getTradeConnection() method for trade operations
- Updated openPosition() and closePosition() to use trade connection
- Added ALCHEMY_RPC_URL to .env (optional, falls back to Helius)

Benefits:
- Helius: 0 subscription errors during init (proven reliable for SDK setup)
- Alchemy: 300M compute units/month for sustained trade operations
- Best of both worlds: reliable init + reliable trades

Files:
- lib/drift/client.ts: Dual connection support
- lib/drift/orders.ts: Use getTradeConnection() for confirmations
- .env: Added ALCHEMY_RPC_URL

Testing: Deploy and execute test trade to verify orders place successfully
2025-11-15 12:00:57 +01:00
mindesbunister
ec5483041a fix(CRITICAL): Add missing stop loss check for runner between TP1 and TP2
CRITICAL BUG: Runner had NO stop loss protection between TP1 and TP2!

Impact: Runner position completely unprotected for entire TP1→TP2 window
Risk: Unlimited loss exposure on 25-30% remaining position

Example: SHORT at $141.31, TP1 closed 70% at $140.94, runner has SL at $140.89
- Price rises to $141.98 (way above SL) → NO STOP LOSS CHECK → Losses accumulate
- Should have closed at $140.89 with 0.3% profit locked

Fix: Added explicit stop loss check for runner state (TP1 hit but TP2 not hit)
Log: "🔴 RUNNER STOP LOSS" to distinguish from pre-TP1 stops

Files: lib/trading/position-manager.ts
2025-11-15 11:28:54 +01:00
mindesbunister
8163858b0d fix: Correct entry price when restoring orphaned positions from Drift
- Startup validation now updates entryPrice to match Drift's actual value
- Prevents tracking with wrong entry price after container restarts
- Also updates positionSizeUSD to reflect current position (runner after TP1)

Bug: When reopening closed trades found on Drift, used stale DB entry price
Result: Stop loss calculated from wrong entry (41.51 vs actual 41.31)
Impact: 0.14% difference in SL placement (~$0.20 per SOL)

Fix: Query Drift for real entry price and update DB during restoration
Files: lib/startup/init-position-manager.ts
2025-11-15 11:16:05 +01:00
mindesbunister
324e5ba002 refactor: Rename breakEvenTriggerPercent to profitLockAfterTP1Percent for clarity
- Renamed config variable to accurately reflect behavior (locks profit, not breakeven)
- Updated log messages to say 'lock +X% profit' instead of misleading 'breakeven'
- Maintains backwards compatibility (accepts old BREAKEVEN_TRIGGER_PERCENT env var)
- Updated .env with new variable name and explanatory comment

Why: Config was named 'breakeven' but actually locks profit at entry ± X%
For SHORT at $141.51 with 0.3% lock: SL moves to $141.08 (not breakeven $141.51)
This protects remaining runner position after TP1 by allowing small profit giveback

Files changed:
- config/trading.ts: Interface + default + env parsing
- lib/trading/position-manager.ts: Usage + log message
- .env: Variable rename with migration comment
2025-11-15 11:06:44 +01:00
mindesbunister
fb4beee418 fix: Add periodic Drift reconnection to prevent memory leaks
- Memory leak identified: Drift SDK accumulates WebSocket subscriptions over time
- Root cause: accountUnsubscribe errors pile up when connections close/reconnect
- Symptom: Heap grows to 4GB+ after 10+ hours, eventual OOM crash
- Solution: Automatic reconnection every 4 hours to clear subscriptions

Changes:
- lib/drift/client.ts: Add reconnectTimer and scheduleReconnection()
- lib/drift/client.ts: Implement private reconnect() method
- lib/drift/client.ts: Clear timer in disconnect()
- app/api/drift/reconnect/route.ts: Manual reconnection endpoint (POST)
- app/api/drift/reconnect/route.ts: Reconnection status endpoint (GET)

Impact:
- Prevents JavaScript heap out of memory crashes
- Telegram bot timeouts resolved (was failing due to unresponsive bot)
- System will auto-heal every 4 hours instead of requiring manual restart
- Emergency manual reconnect available via API if needed

Tested: Container restarted successfully, no more WebSocket accumulation expected
2025-11-15 09:22:15 +01:00
mindesbunister
25776413d0 feat: Add signalSource field to identify manual vs TradingView trades
- Set signalSource='manual' for Telegram trades, 'tradingview' for TradingView
- Updated analytics queries to exclude manual trades from indicator analysis
- getTradingStats() filters manual trades (TradingView performance only)
- Version comparison endpoint filters manual trades
- Created comprehensive filtering guide: docs/MANUAL_TRADE_FILTERING.md
- Ensures clean data for indicator optimization without contamination
2025-11-14 22:55:14 +01:00
mindesbunister
19beaf9c02 fix: Revert to Helius - Alchemy 'breakthrough' was not sustainable
FINAL CONCLUSION after extensive testing:
- Alchemy appeared to work perfectly at 14:25 CET (first trade)
- User quote: 'SO IT WAS THE FUCKING RPC THAT WAS CAUSING ALL THE ISSUES!!!!!!!!!!!!'
- BUT: Alchemy consistently fails after that initial success
- Multiple attempts to use Alchemy (pure config, no fallback) = same result
- Symptoms: timeouts, positions open WITHOUT TP/SL orders, no Position Manager tracking

HELIUS = ONLY RELIABLE OPTION:
- User confirmed: 'telegram works again' after reverting to Helius
- Works consistently across multiple tests
- Supports WebSocket subscriptions (accountSubscribe) that Drift SDK requires
- Rate limits manageable with 5s exponential backoff

ALCHEMY INCOMPATIBILITY CONFIRMED:
- Does NOT support WebSocket subscriptions (accountSubscribe method)
- SDK appears to initialize but is fundamentally broken
- First trade might work, then SDK gets into bad state
- Cannot be used reliably for Drift Protocol trading

Files restored from working Helius state.
This is the definitive answer: Helius only, no alternatives work.
2025-11-14 21:07:58 +01:00
mindesbunister
78ab9e1a94 fix: Increase transaction confirmation timeout to 60s for Alchemy Growth
- Alchemy Growth (10,000 CU/s) can handle longer confirmation waits
- Increased timeout from 30s to 60s in both openPosition() and closePosition()
- Added debug logging to execute endpoint to trace hang points
- Configured dual RPC: Alchemy primary (transactions), Helius fallback (subscriptions)
- Previous 30s timeout was causing premature failures during Solana congestion
- This should resolve 'Transaction was not confirmed in 30.00 seconds' errors

Related: User reported n8n webhook returning 500 with timeout error
2025-11-14 20:42:59 +01:00
mindesbunister
6dccea5d91 revert: Back to last known working state (27eb5d4)
- Restored Drift client, orders, and .env from commit 27eb5d4
- Updated to current Helius API key
- ISSUE: Execute/check-risk endpoints still hang
- Root cause appears to be Drift SDK initialization hanging at runtime
- Bot initializes successfully at startup but hangs on subsequent Drift calls
- Non-Drift endpoints work fine (settings, positions query)
- Needs investigation: Drift SDK behavior or RPC interaction issue
2025-11-14 20:17:50 +01:00
mindesbunister
db0961d04e revert: Remove Alchemy fallback causing crashes
- getFallbackConnection() code was causing execute endpoint to crash
- Reverting to Helius-only configuration
- Need to investigate root cause before re-adding fallback
2025-11-14 20:10:21 +01:00
mindesbunister
6445a135a8 feat: Helius primary + Alchemy fallback for trade execution
- Helius HTTPS: Primary RPC for Drift SDK initialization and subscriptions
- Alchemy HTTPS (10K CU/s): Fallback RPC for transaction confirmations
- Added getFallbackConnection() method to DriftService
- openPosition() and closePosition() now use Alchemy for tx confirmations
- accountSubscribe errors are non-fatal warnings (SDK falls back gracefully)
- System fully operational: Drift initialized, Position Manager ready
- Trade execution will use high-throughput Alchemy for confirmations
2025-11-14 16:51:14 +01:00
mindesbunister
1cf5c9aba1 feat: Smart startup RPC strategy (Helius → Alchemy)
Strategy:
1. Start with Helius (handles startup burst better - 10 req/sec sustained)
2. After successful init, switch to Alchemy (more stable for trading)
3. On 429 errors during operations, fall back to Helius, then return to Alchemy

Implementation:
- lib/drift/client.ts: Smart constructor checks for fallback, uses it for startup
- After initialize() completes, automatically switches to primary RPC
- Swaps connections and reinitializes Drift SDK with Alchemy
- Falls back to Helius on rate limits, switches back after recovery

Benefits:
- Helius absorbs SDK subscribe() burst (many concurrent calls)
- Alchemy provides stability for normal trading operations
- Best of both worlds: burst tolerance + operational stability

Status:
- Code complete and tested
- Helius API key needs updating (current key returns 401)
- Fallback temporarily disabled in .env until key fixed
- Position Manager working perfectly (trade monitored via Alchemy)

To enable:
1. Get fresh Helius API key from helius.dev
2. Set SOLANA_FALLBACK_RPC_URL in .env
3. Restart bot - will use Helius for startup automatically
2025-11-14 15:41:52 +01:00
mindesbunister
7ff78ee0bd feat: Hybrid RPC fallback system (Alchemy → Helius)
- Automatic fallback after 2 consecutive rate limits
- Primary: Alchemy (300M CU/month, stable for normal ops)
- Fallback: Helius (10 req/sec, backup for startup bursts)
- Reduced startup validation: 6h window, 5 trades (was 24h, 20 trades)
- Multi-position safety check (prevents order cancellation conflicts)
- Rate limit-aware retry logic with exponential backoff

Implementation:
- lib/drift/client.ts: Added fallbackConnection, switchToFallbackRpc()
- .env: SOLANA_FALLBACK_RPC_URL configuration
- lib/startup/init-position-manager.ts: Reduced validation scope
- lib/trading/position-manager.ts: Multi-position order protection

Tested: System switched to fallback on startup, Position Manager active
Result: 1 active trade being monitored after automatic RPC switch
2025-11-14 15:28:07 +01:00
mindesbunister
7afd7d5aa1 feat: switch from Helius to Alchemy RPC provider
Changes:
- Updated SOLANA_RPC_URL to use Alchemy (https://solana-mainnet.g.alchemy.com/v2/...)
- Migrated from Helius free tier to Alchemy free tier
- Includes previous rate limit fixes (8s backoff, 2s operation delays)

Context:
- Helius free tier: 10 req/sec sustained, 100 req/sec burst
- Alchemy free tier: 300M compute units/month (more generous)
- User hit 239 rate limit errors in 10 minutes on Helius
- User registered Alchemy account and provided API key

Impact:
- Should significantly reduce 429 rate limit errors
- Better free tier limits for trading bot operations
- Combined with delay fixes for optimal RPC usage
2025-11-14 14:01:52 +01:00
mindesbunister
a0dc80e96b docs: add Docker cleanup instructions to prevent disk full issues
- Document build cache accumulation problem (40-50 GB typical)
- Add cleanup commands: image prune, builder prune, volume prune
- Recommend running after each deployment or weekly
- Typical space freed: 40-55 GB per cleanup
- Clarify what's safe vs not safe to delete
- Part of maintaining healthy development environment
2025-11-14 10:46:15 +01:00
mindesbunister
27eb5d4fe8 fix: Critical rate limit handling + startup position restoration
**Problem 1: Rate Limit Cascade**
- Position Manager tried to close repeatedly, overwhelming Helius RPC (10 req/s limit)
- Base retry delay was too aggressive (2s → 4s → 8s)
- No graceful handling when 429 errors occur

**Problem 2: Orphaned Positions After Restart**
- Container restarts lost Position Manager state
- Positions marked 'closed' in DB but still open on Drift (failed close transactions)
- No cross-validation between database and actual Drift positions

**Solutions Implemented:**

1. **Increased retry delays (orders.ts)**:
   - Base delay: 2s → 5s (progression now 5s → 10s → 20s)
   - Reduces RPC pressure during rate limit situations
   - Gives Helius time to recover between retries
   - Documented Helius limits: 100 req/s burst, 10 req/s sustained (free tier)

2. **Startup position validation (init-position-manager.ts)**:
   - Cross-checks last 24h of 'closed' trades against actual Drift positions
   - If DB says closed but Drift shows open → reopens in DB to restore tracking
   - Prevents unmonitored positions from existing after container restarts
   - Logs detailed mismatch info for debugging

3. **Rate limit-aware exit handling (position-manager.ts)**:
   - Detects 429 errors during position close
   - Keeps trade in monitoring instead of removing it
   - Natural retry on next price update (vs aggressive 2s loop)
   - Prevents marking position as closed when transaction actually failed

**Impact:**
- Eliminates orphaned positions after restarts
- Reduces RPC pressure by 2.5x (5s vs 2s base delay)
- Graceful degradation under rate limits
- Position Manager continues monitoring even during temporary RPC issues

**Testing needed:**
- Monitor next container restart to verify position restoration works
- Check rate limit analytics after next close attempt
- Verify no more phantom 'closed' positions when Drift shows open
2025-11-14 09:50:13 +01:00