trading_bot_v4/EPYC_SETUP_COMPREHENSIVE.md
mindesbunister cc56b72df2 fix: Database-first cluster status detection + Stop button clarification
CRITICAL FIX (Nov 30, 2025):
- Dashboard showed 'idle' despite 22+ worker processes running
- Root cause: SSH-based worker detection timing out
- Solution: Check database for running chunks FIRST

Changes:
1. app/api/cluster/status/route.ts:
   - Query exploration database before SSH detection
   - If running chunks exist, mark workers 'active' even if SSH fails
   - Override worker status: 'offline' → 'active' when chunks running
   - Log: ' Cluster status: ACTIVE (database shows running chunks)'
   - Database is source of truth, SSH only for supplementary metrics

2. app/cluster/page.tsx:
   - Stop button ALREADY EXISTS (conditionally shown)
   - Shows Start when status='idle', Stop when status='active'
   - No code changes needed - fixed by status detection

Result:
- Dashboard now shows 'ACTIVE' with 2 workers (correct)
- Workers show 'active' status (was 'offline')
- Stop button automatically visible when cluster active
- System resilient to SSH timeouts/network issues

Verified:
- Container restarted: Nov 30 21:18 UTC
- API tested: Returns status='active', activeWorkers=2
- Logs confirm: Database-first logic working
- Workers confirmed running: 22+ processes on worker1, workers on worker2
2025-11-30 22:23:01 +01:00

# Running Comprehensive Sweep on EPYC Server
## Transfer Package to EPYC
```bash
# From your local machine
scp comprehensive_sweep_package.tar.gz root@72.62.39.24:/root/
```
## Setup on EPYC
```bash
# SSH to EPYC
ssh root@72.62.39.24
# Extract package
cd /root
tar -xzf comprehensive_sweep_package.tar.gz
cd comprehensive_sweep
# Setup Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install pandas numpy
# Create logs directory
mkdir -p backtester/logs
# Make scripts executable
chmod +x run_comprehensive_sweep.sh
chmod +x backtester/scripts/comprehensive_sweep.py
```
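Before starting a multi-hour run, it can help to confirm that the virtual environment really has the packages installed above. The following is a minimal sketch, not part of the package: `missing_packages` is a hypothetical helper name, and it only checks that the modules can be found, not that their versions are compatible.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pandas and numpy are the two packages installed in the setup step.
    missing = missing_packages(["pandas", "numpy"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("pandas and numpy are importable")
```

Run it with the venv activated so that it checks the same interpreter the sweep will use.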
## Run the Sweep
```bash
# Option A: start the sweep in the background with the wrapper script
./run_comprehensive_sweep.sh
# Option B (instead of A): start it manually for more control
cd /root/comprehensive_sweep
source .venv/bin/activate
nohup python3 backtester/scripts/comprehensive_sweep.py > sweep.log 2>&1 &
# Save the PID of the background job so it can be stopped later
echo $! > sweep.pid
```
## Monitor Progress
```bash
# Watch live progress (updates every 100 configs)
tail -f backtester/logs/sweep_comprehensive_*.log
# Or if using manual method:
tail -f sweep.log
# See current best result
grep 'Best so far' backtester/logs/sweep_comprehensive_*.log | tail -5
# Check if still running (unlike ps | grep, pgrep does not match itself)
pgrep -af comprehensive_sweep
# Check CPU usage
htop
```
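The "still running" check can also be scripted against the PID file, which is handy for cron jobs or wrapper scripts. This is a sketch under assumptions: it applies only when the sweep was started manually (so `sweep.pid` exists), `sweep_is_running` is a hypothetical helper name, and it relies on the POSIX convention that signal 0 tests for a process without affecting it.

```python
import os
from pathlib import Path


def sweep_is_running(pid_file="sweep.pid"):
    """Return True if the PID stored in `pid_file` refers to a live process."""
    try:
        pid = int(Path(pid_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        # No PID file, or its content is not a number: treat as not running.
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


if __name__ == "__main__":
    print("running" if sweep_is_running() else "not running")
```

A stale `sweep.pid` can in principle point at an unrelated process that reused the PID, so treat a positive result as a hint and confirm with `pgrep` or `ps` if it matters.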
## Stop if Needed
```bash
# Using the PID file (created by the manual start method):
kill "$(cat sweep.pid)"
# Or by name:
pkill -f comprehensive_sweep
```
## EPYC Performance Estimate
- **Your EPYC:** 16 cores/32 threads
- **Local Server:** 6 cores
- **Speedup:** ~5-6× faster on EPYC

**Total combinations:** 14,929,920

**Estimated times:**
- Local (6 cores): ~30-40 hours
- EPYC (16 cores/32 threads): ~6-8 hours 🚀
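
The hour estimates translate into a throughput that can be compared against the live log. The sketch below derives the implied configs-per-second from the figures in this section and extrapolates a remaining-time estimate from progress so far. The function names are hypothetical, and the linear extrapolation assumes every configuration takes roughly the same time, which may not hold in practice.

```python
TOTAL_COMBINATIONS = 14_929_920  # total grid size stated in this section


def implied_rate(hours, total=TOTAL_COMBINATIONS):
    """Configs per second needed to finish `total` configs in `hours`."""
    return total / (hours * 3600)


def eta_hours(done, elapsed_seconds, total=TOTAL_COMBINATIONS):
    """Remaining hours, extrapolating linearly from progress so far."""
    if done <= 0 or elapsed_seconds <= 0:
        raise ValueError("need positive progress and elapsed time")
    rate = done / elapsed_seconds  # observed configs per second
    return (total - done) / rate / 3600


if __name__ == "__main__":
    estimates = {"Local (6 cores)": (30, 40), "EPYC (16 cores)": (6, 8)}
    for label, (fast, slow) in estimates.items():
        print(f"{label}: {implied_rate(slow):.0f}-{implied_rate(fast):.0f} configs/s")
```

If the rate observed in the log is well below the implied range for the machine, the run will take longer than the estimate above.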
## Retrieve Results
```bash
# On the server: check the top results first (first 21 lines of the CSV)
head -21 /root/comprehensive_sweep/sweep_comprehensive.csv
# From your local machine: download the results after completion
scp root@72.62.39.24:/root/comprehensive_sweep/sweep_comprehensive.csv .
```
## Results Format
CSV columns:
- rank
- trades
- win_rate
- total_pnl
- pnl_per_1k (most important - profitability per $1000)
- flip_threshold
- ma_gap
- adx_min
- long_pos_max
- short_pos_min
- cooldown
- position_size
- tp1_mult
- tp2_mult
- sl_mult
- tp1_close_pct
- trailing_mult
- vol_min
- max_bars
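
Once the CSV is downloaded, the rows can be re-ranked by `pnl_per_1k` without extra dependencies. This is a minimal sketch that assumes the file starts with a header row using the column names listed above. `top_by_pnl_per_1k` is a hypothetical helper, not part of the sweep package.

```python
import csv
from pathlib import Path


def top_by_pnl_per_1k(csv_path, n=20):
    """Return the `n` rows with the highest pnl_per_1k, as dicts of strings."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    # Values are read as strings, so convert before comparing.
    rows.sort(key=lambda row: float(row["pnl_per_1k"]), reverse=True)
    return rows[:n]


if __name__ == "__main__":
    path = Path("sweep_comprehensive.csv")
    if path.exists():
        for row in top_by_pnl_per_1k(path, n=5):
            print(row["pnl_per_1k"], row["trades"], row["win_rate"])
    else:
        print(f"{path} not found; download it first")
```

If the file is already sorted by `rank`, the output should match the top of the file, which doubles as a sanity check on the download.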
## Quick Test
Before running the full sweep, test that everything works:
```bash
cd /root/comprehensive_sweep
source .venv/bin/activate
# Quick smoke test: load the data and simulate a single configuration.
# The quoted heredoc passes the script to Python without shell escaping.
python3 - <<'PY'
from pathlib import Path
from backtester.data_loader import load_csv
from backtester.simulator import simulate_money_line, TradeConfig
from backtester.indicators.money_line import MoneyLineInputs
data_slice = load_csv(Path('backtester/data/solusdt_5m_aug_nov.csv'), 'SOL-PERP', '5m')
print(f'Loaded {len(data_slice.data)} candles')
inputs = MoneyLineInputs(flip_threshold_percent=0.6)
config = TradeConfig(position_size=210.0)
results = simulate_money_line(data_slice.data, 'SOL-PERP', inputs, config)
print(f'Test: {len(results.trades)} trades, {results.win_rate*100:.1f}% WR, ${results.total_pnl:.2f} P&L')
print('✅ Everything working!')
PY
```
If the test passes, run the full sweep!