fix: Database-first cluster status detection + Stop button clarification
CRITICAL FIX (Nov 30, 2025):
- Dashboard showed 'idle' despite 22+ worker processes running
- Root cause: SSH-based worker detection timing out
- Solution: Check database for running chunks FIRST
Changes:
1. app/api/cluster/status/route.ts:
- Query exploration database before SSH detection
- If running chunks exist, mark workers 'active' even if SSH fails
- Override worker status: 'offline' → 'active' when chunks running
- Log: '✅ Cluster status: ACTIVE (database shows running chunks)'
- Database is the source of truth; SSH is used only for supplementary metrics
2. app/cluster/page.tsx:
- Stop button ALREADY EXISTS (conditionally shown)
- Shows Start when status='idle', Stop when status='active'
- No code changes needed - fixed by status detection
Result:
- Dashboard now shows 'ACTIVE' with 2 workers (correct)
- Workers show 'active' status (was 'offline')
- Stop button automatically visible when cluster active
- System resilient to SSH timeouts/network issues
Verified:
- Container restarted: Nov 30 21:18 UTC
- API tested: Returns status='active', activeWorkers=2
- Logs confirm: Database-first logic working
- Workers confirmed running: 22+ processes on worker1, additional worker processes on worker2
New file: EPYC_SETUP_COMPREHENSIVE.md (149 lines)
# Running Comprehensive Sweep on EPYC Server

## Transfer Package to EPYC

```bash
# From your local machine
scp comprehensive_sweep_package.tar.gz root@72.62.39.24:/root/
```

## Setup on EPYC

```bash
# SSH to EPYC
ssh root@72.62.39.24

# Extract package
cd /root
tar -xzf comprehensive_sweep_package.tar.gz
cd comprehensive_sweep

# Setup Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install pandas numpy

# Create logs directory
mkdir -p backtester/logs

# Make scripts executable
chmod +x run_comprehensive_sweep.sh
chmod +x backtester/scripts/comprehensive_sweep.py
```

## Run the Sweep

```bash
# Start the sweep in background
./run_comprehensive_sweep.sh

# Or manually with more control:
cd /root/comprehensive_sweep
source .venv/bin/activate
nohup python3 backtester/scripts/comprehensive_sweep.py > sweep.log 2>&1 &

# Get the PID
echo $! > sweep.pid
```

## Monitor Progress

```bash
# Watch live progress (updates every 100 configs)
tail -f backtester/logs/sweep_comprehensive_*.log

# Or if using manual method:
tail -f sweep.log

# See current best result
grep 'Best so far' backtester/logs/sweep_comprehensive_*.log | tail -5

# Check if still running
ps aux | grep comprehensive_sweep

# Check CPU usage
htop
```

## Stop if Needed

```bash
# Using PID file:
kill $(cat sweep.pid)

# Or by name:
pkill -f comprehensive_sweep
```

## EPYC Performance Estimate

- **Your EPYC:** 16 cores/32 threads
- **Local Server:** 6 cores
- **Speedup:** ~5-6× faster on EPYC

**Total combinations:** 14,929,920

**Estimated times:**
- Local (6 cores): ~30-40 hours
- EPYC (16 cores): ~6-8 hours 🚀
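
These estimates can be sanity-checked with a little arithmetic. The sketch below assumes throughput scales roughly with usable thread count (SMT threads included), which is an optimistic simplification; real scaling is usually somewhat worse:

```python
# Back-of-the-envelope check of the estimates above.
# Assumption: throughput scales roughly with thread count (SMT included);
# real-world scaling is usually a bit worse than linear.
total_combos = 14_929_920

local_threads = 6        # local server: 6 cores
local_hours = 35         # midpoint of the ~30-40 hour local estimate
per_thread_rate = total_combos / (local_hours * 3600 * local_threads)  # configs/sec per thread

epyc_threads = 32        # EPYC: 16 cores / 32 threads
epyc_hours = total_combos / (per_thread_rate * epyc_threads * 3600)

print(f"Per-thread throughput: ~{per_thread_rate:.0f} configs/sec")
print(f"Projected EPYC runtime: ~{epyc_hours:.1f} hours (estimate above: 6-8 hours)")
```
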
## Retrieve Results

```bash
# After completion, download results
scp root@72.62.39.24:/root/comprehensive_sweep/sweep_comprehensive.csv .

# Check top results on the server first (header + top 20 rows):
head -21 /root/comprehensive_sweep/sweep_comprehensive.csv
```

## Results Format

CSV columns:
- rank
- trades
- win_rate
- total_pnl
- pnl_per_1k (most important - profitability per $1000)
- flip_threshold
- ma_gap
- adx_min
- long_pos_max
- short_pos_min
- cooldown
- position_size
- tp1_mult
- tp2_mult
- sl_mult
- tp1_close_pct
- trailing_mult
- vol_min
- max_bars
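
To rank and filter the downloaded CSV by these columns, here is a minimal sketch using the pandas installed during setup; it assumes the file has a header row with the column names listed above:

```python
# Sketch: inspect sweep results with pandas.
# Assumes sweep_comprehensive.csv has a header row matching the columns above.
import pandas as pd

df = pd.read_csv("sweep_comprehensive.csv")

# Top 10 configurations by profitability per $1000.
top = df.sort_values("pnl_per_1k", ascending=False).head(10)
print(top[["rank", "trades", "win_rate", "total_pnl", "pnl_per_1k"]])

# Optionally require a minimum trade count (threshold is illustrative) before ranking.
robust = df[df["trades"] >= 30].sort_values("pnl_per_1k", ascending=False)
print(robust.head(10)[["rank", "trades", "win_rate", "pnl_per_1k"]])
```
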
## Quick Test

Before running the full sweep, test that everything works:

```bash
cd /root/comprehensive_sweep
source .venv/bin/activate

# Quick smoke test with a single configuration
python3 -c "
from pathlib import Path
from backtester.data_loader import load_csv
from backtester.simulator import simulate_money_line, TradeConfig
from backtester.indicators.money_line import MoneyLineInputs

data_slice = load_csv(Path('backtester/data/solusdt_5m_aug_nov.csv'), 'SOL-PERP', '5m')
print(f'Loaded {len(data_slice.data)} candles')

inputs = MoneyLineInputs(flip_threshold_percent=0.6)
config = TradeConfig(position_size=210.0)
results = simulate_money_line(data_slice.data, 'SOL-PERP', inputs, config)
print(f'Test: {len(results.trades)} trades, {results.win_rate*100:.1f}% WR, \${results.total_pnl:.2f} P&L')
print('✅ Everything working!')
"
```

If the test passes, run the full sweep!