trading_bot_v4/EPYC_SETUP_COMPREHENSIVE.md
mindesbunister cc56b72df2 fix: Database-first cluster status detection + Stop button clarification
2025-11-30 22:23:01 +01:00


Running Comprehensive Sweep on EPYC Server

Transfer Package to EPYC

# From your local machine
scp comprehensive_sweep_package.tar.gz root@72.62.39.24:/root/

Setup on EPYC

# SSH to EPYC
ssh root@72.62.39.24

# Extract package
cd /root
tar -xzf comprehensive_sweep_package.tar.gz
cd comprehensive_sweep

# Setup Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install pandas numpy

# Create logs directory
mkdir -p backtester/logs

# Make scripts executable
chmod +x run_comprehensive_sweep.sh
chmod +x backtester/scripts/comprehensive_sweep.py

Run the Sweep

# Start the sweep in background
./run_comprehensive_sweep.sh

# Or manually with more control:
cd /root/comprehensive_sweep
source .venv/bin/activate
nohup python3 backtester/scripts/comprehensive_sweep.py > sweep.log 2>&1 &

# Save the PID of the background job so it can be stopped later
echo $! > sweep.pid

Monitor Progress

# Watch live progress (updates every 100 configs)
tail -f backtester/logs/sweep_comprehensive_*.log

# Or if using manual method:
tail -f sweep.log

# See current best result
grep 'Best so far' backtester/logs/sweep_comprehensive_*.log | tail -5

# Check if still running (the [c] keeps grep from matching its own process)
ps aux | grep '[c]omprehensive_sweep'

# Check CPU usage
htop
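
For scripted monitoring, the PID saved in sweep.pid can also be probed from Python. This is a minimal sketch, assuming the sweep was started with the manual method so that sweep.pid exists in the current directory; the helper name is_running is illustrative and not part of the sweep package.

```python
import os
from pathlib import Path


def is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


pid_file = Path("sweep.pid")  # written by `echo $! > sweep.pid`
if pid_file.exists():
    pid = int(pid_file.read_text().strip())
    print(f"sweep PID {pid} running: {is_running(pid)}")
else:
    print("sweep.pid not found (was the sweep started manually?)")
```

Note that PIDs can be reused after a process exits, so a True result long after the sweep should have finished is worth confirming with ps.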

Stop if Needed

# Using PID file:
kill $(cat sweep.pid)

# Or by name:
pkill -f comprehensive_sweep

EPYC Performance Estimate

  • Your EPYC: 16 cores/32 threads
  • Local Server: 6 cores
  • Speedup: ~5-6× on EPYC (roughly 32 threads vs. 6 cores, assuming the sweep keeps every thread busy)

Total combinations: 14,929,920

Estimated times:

  • Local (6 cores): ~30-40 hours
  • EPYC (16 cores): ~6-8 hours 🚀
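
The estimates above imply an average throughput that can be compared against the progress log. The sketch below only does the arithmetic on the figures already given (14,929,920 combinations, 30-40 hours locally, 6-8 hours on EPYC); the function names are illustrative, and eta_hours assumes throughput stays constant, which may not hold in practice.

```python
TOTAL_COMBINATIONS = 14_929_920  # total from the estimate above


def configs_per_second(total: int, hours: float) -> float:
    """Average throughput needed to finish `total` configs in `hours`."""
    return total / (hours * 3600)


def eta_hours(done: int, elapsed_hours: float, total: int = TOTAL_COMBINATIONS) -> float:
    """Remaining hours, assuming the average rate so far stays constant."""
    rate = done / elapsed_hours  # configs per hour observed so far
    return (total - done) / rate


estimates = {"Local (6 cores)": (30, 40), "EPYC (16 cores)": (6, 8)}
for label, (fast_hours, slow_hours) in estimates.items():
    fast_rate = configs_per_second(TOTAL_COMBINATIONS, fast_hours)
    slow_rate = configs_per_second(TOTAL_COMBINATIONS, slow_hours)
    print(f"{label}: ~{slow_rate:.0f}-{fast_rate:.0f} configs/s")
```

This works out to roughly 104-138 configs/s locally and 518-691 configs/s on EPYC. If the log shows a rate well below that range, the 6-8 hour estimate will not hold.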

Retrieve Results

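# After completion, check the top results on the server first
# (21 lines: presumably the header row plus the top 20 configurations):
head -21 /root/comprehensive_sweep/sweep_comprehensive.csv

# Then download the results (run this from your local machine):
scp root@72.62.39.24:/root/comprehensive_sweep/sweep_comprehensive.csv .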

Results Format

CSV columns:

  • rank
  • trades
  • win_rate
  • total_pnl
  • pnl_per_1k (most important - profitability per $1000)
  • flip_threshold
  • ma_gap
  • adx_min
  • long_pos_max
  • short_pos_min
  • cooldown
  • position_size
  • tp1_mult
  • tp2_mult
  • sl_mult
  • tp1_close_pct
  • trailing_mult
  • vol_min
  • max_bars
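
Once the file is downloaded, the columns above can be used to re-rank results locally. The following is a minimal sketch using only the standard library. It assumes the CSV starts with a header row using exactly the column names listed above; the helper top_configs and the min_trades threshold are illustrative and not part of the sweep package.

```python
import csv
from pathlib import Path
from typing import Iterable


def top_configs(lines: Iterable[str], n: int = 5, min_trades: int = 0) -> list[dict[str, str]]:
    """Return the n rows with the highest pnl_per_1k, skipping thin samples."""
    rows = [
        row for row in csv.DictReader(lines)
        if int(float(row["trades"])) >= min_trades
    ]
    rows.sort(key=lambda row: float(row["pnl_per_1k"]), reverse=True)
    return rows[:n]


results = Path("sweep_comprehensive.csv")  # file name from the retrieval step
if results.exists():
    with results.open(newline="") as handle:
        for row in top_configs(handle, n=5, min_trades=50):
            print(row["rank"], row["trades"], row["win_rate"], row["pnl_per_1k"])
```

Filtering on a minimum trade count is a design choice, not something the sweep requires: a configuration with very few trades can show a high pnl_per_1k by chance.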

Quick Test

Before running the full sweep, test that everything works:

cd /root/comprehensive_sweep
source .venv/bin/activate

# Quick test with a single parameter configuration
python3 -c "
from pathlib import Path
from backtester.data_loader import load_csv
from backtester.simulator import simulate_money_line, TradeConfig
from backtester.indicators.money_line import MoneyLineInputs

data_slice = load_csv(Path('backtester/data/solusdt_5m_aug_nov.csv'), 'SOL-PERP', '5m')
print(f'Loaded {len(data_slice.data)} candles')

inputs = MoneyLineInputs(flip_threshold_percent=0.6)
config = TradeConfig(position_size=210.0)
results = simulate_money_line(data_slice.data, 'SOL-PERP', inputs, config)
print(f'Test: {len(results.trades)} trades, {results.win_rate*100:.1f}% WR, \${results.total_pnl:.2f} P&L')
print('✅ Everything working')
"

If the test passes, run the full sweep.