CRITICAL FIX (Nov 30, 2025):
- Dashboard showed 'idle' despite 22+ worker processes running
- Root cause: SSH-based worker detection timing out
- Solution: Check database for running chunks FIRST
Changes:
1. app/api/cluster/status/route.ts:
- Query exploration database before SSH detection
- If running chunks exist, mark workers 'active' even if SSH fails
- Override worker status: 'offline' → 'active' when chunks running
- Log: '✅ Cluster status: ACTIVE (database shows running chunks)'
- Database is source of truth, SSH only for supplementary metrics
2. app/cluster/page.tsx:
- Stop button ALREADY EXISTS (conditionally shown)
- Shows Start when status='idle', Stop when status='active'
- No code changes needed - fixed by status detection
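The database-first override described in change 1 can be sketched as follows. This is an illustrative Python sketch only (the real implementation is TypeScript in app/api/cluster/status/route.ts); the function and parameter names are hypothetical:

```python
def resolve_cluster_status(running_chunks, ssh_worker_status):
    """Database-first status resolution: if the exploration database
    reports running chunks, the cluster is ACTIVE regardless of SSH.

    `running_chunks` is a list of chunk rows from the database;
    `ssh_worker_status` maps worker name -> status string from the
    (possibly timed-out) SSH probes. Both names are illustrative.
    """
    if running_chunks:
        # Override any 'offline' verdicts caused by SSH timeouts.
        workers = {name: 'active' if status == 'offline' else status
                   for name, status in ssh_worker_status.items()}
        return 'active', workers
    # No running chunks: fall back to whatever SSH reported.
    return 'idle', ssh_worker_status
```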
Result:
- Dashboard now shows 'ACTIVE' with 2 workers (correct)
- Workers show 'active' status (was 'offline')
- Stop button automatically visible when cluster active
- System resilient to SSH timeouts/network issues
Verified:
- Container restarted: Nov 30 21:18 UTC
- API tested: Returns status='active', activeWorkers=2
- Logs confirm: Database-first logic working
- Workers confirmed running: 22+ processes on worker1, additional worker processes on worker2
Running Comprehensive Sweep on EPYC Server
Transfer Package to EPYC
# From your local machine
scp comprehensive_sweep_package.tar.gz root@72.62.39.24:/root/
Setup on EPYC
# SSH to EPYC
ssh root@72.62.39.24
# Extract package
cd /root
tar -xzf comprehensive_sweep_package.tar.gz
cd comprehensive_sweep
# Setup Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install pandas numpy
# Create logs directory
mkdir -p backtester/logs
# Make scripts executable
chmod +x run_comprehensive_sweep.sh
chmod +x backtester/scripts/comprehensive_sweep.py
Run the Sweep
# Start the sweep in background
./run_comprehensive_sweep.sh
# Or manually with more control:
cd /root/comprehensive_sweep
source .venv/bin/activate
nohup python3 backtester/scripts/comprehensive_sweep.py > sweep.log 2>&1 &
# Get the PID
echo $! > sweep.pid
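The manual nohup/PID steps above can also be wrapped in a small Python helper. This is a sketch, not part of the package; `launch_sweep` is an illustrative name, and the default paths are the ones used in this guide:

```python
import subprocess
from pathlib import Path

def launch_sweep(script='backtester/scripts/comprehensive_sweep.py',
                 log_path='sweep.log', pid_path='sweep.pid'):
    """Start the sweep detached from the terminal, redirect all output
    to a log file, and record the PID so the process can later be
    stopped with `kill $(cat sweep.pid)`."""
    log = open(log_path, 'ab')
    proc = subprocess.Popen(
        ['python3', script],
        stdout=log, stderr=subprocess.STDOUT,
        start_new_session=True,  # survive the SSH session, like nohup
    )
    Path(pid_path).write_text(str(proc.pid))
    return proc.pid
```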
Monitor Progress
# Watch live progress (updates every 100 configs)
tail -f backtester/logs/sweep_comprehensive_*.log
# Or if using manual method:
tail -f sweep.log
# See current best result
grep 'Best so far' backtester/logs/sweep_comprehensive_*.log | tail -5
# Check if still running
ps aux | grep comprehensive_sweep
# Check CPU usage
htop
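The `grep ... | tail -5` step above can also be done programmatically when post-processing a log. A minimal sketch, assuming progress lines contain the literal text 'Best so far' as in the grep pattern:

```python
def latest_best_lines(log_text, n=5):
    """Return the last `n` lines containing 'Best so far', mirroring
    `grep 'Best so far' <log> | tail -5`."""
    matches = [line for line in log_text.splitlines() if 'Best so far' in line]
    return matches[-n:]
```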
Stop if Needed
# Using PID file:
kill $(cat sweep.pid)
# Or by name:
pkill -f comprehensive_sweep
EPYC Performance Estimate
- Your EPYC: 16 cores/32 threads
- Local Server: 6 cores
- Speedup: ~5-6× faster on EPYC
Total combinations: 14,929,920
Estimated times:
- Local (6 cores): ~30-40 hours
- EPYC (16 cores): ~6-8 hours 🚀
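The estimates above follow from simple throughput scaling. Here is the arithmetic, taking the midpoints of the local-runtime and speedup ranges given above (the ~5-6x factor presumably accounts for the extra SMT threads, since a pure 16-vs-6 core ratio would only give ~2.7x):

```python
total = 14_929_920   # parameter combinations in the sweep
local_hours = 35     # midpoint of the 30-40 h local (6-core) estimate
speedup = 5.5        # midpoint of the ~5-6x EPYC speedup

epyc_hours = local_hours / speedup
local_rate = total / (local_hours * 3600)  # configs per second locally
print(f"EPYC: ~{epyc_hours:.1f} h at ~{local_rate * speedup:.0f} configs/s")
```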
Retrieve Results
# After completion, download results
scp root@72.62.39.24:/root/comprehensive_sweep/sweep_comprehensive.csv .
# Check top results on the server first (header row + top 20 rows):
head -21 /root/comprehensive_sweep/sweep_comprehensive.csv
Results Format
CSV columns:
- rank
- trades
- win_rate
- total_pnl
- pnl_per_1k (the key metric: profitability per $1,000 of capital)
- flip_threshold
- ma_gap
- adx_min
- long_pos_max
- short_pos_min
- cooldown
- position_size
- tp1_mult
- tp2_mult
- sl_mult
- tp1_close_pct
- trailing_mult
- vol_min
- max_bars
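Once downloaded, the CSV can be inspected with pandas (already installed in the venv above). A sketch using the column names listed here; the explicit sort is a safeguard in case the file is not already ordered by rank:

```python
import pandas as pd

def top_configs(csv_path, n=20, metric='pnl_per_1k'):
    """Load sweep results and return the n best rows by the given
    metric (pnl_per_1k = profitability per $1,000, the key column)."""
    df = pd.read_csv(csv_path)
    return df.sort_values(metric, ascending=False).head(n)
```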
Quick Test
Before running the full sweep, test that everything works:
cd /root/comprehensive_sweep
source .venv/bin/activate
# Quick test with just 10 combinations
python3 -c "
from pathlib import Path
from backtester.data_loader import load_csv
from backtester.simulator import simulate_money_line, TradeConfig
from backtester.indicators.money_line import MoneyLineInputs
data_slice = load_csv(Path('backtester/data/solusdt_5m_aug_nov.csv'), 'SOL-PERP', '5m')
print(f'Loaded {len(data_slice.data)} candles')
inputs = MoneyLineInputs(flip_threshold_percent=0.6)
config = TradeConfig(position_size=210.0)
results = simulate_money_line(data_slice.data, 'SOL-PERP', inputs, config)
print(f'Test: {len(results.trades)} trades, {results.win_rate*100:.1f}% WR, \${results.total_pnl:.2f} P&L')
print('✅ Everything working!')
"
If test passes, run the full sweep!