mindesbunister cc56b72df2 fix: Database-first cluster status detection + Stop button clarification

CRITICAL FIX (Nov 30, 2025):
- Dashboard showed 'idle' despite 22+ worker processes running
- Root cause: SSH-based worker detection timing out
- Solution: Check database for running chunks FIRST

Changes:
1. app/api/cluster/status/route.ts:
   - Query exploration database before SSH detection
   - If running chunks exist, mark workers 'active' even if SSH fails
   - Override worker status: 'offline' → 'active' when chunks running
   - Log: '✅ Cluster status: ACTIVE (database shows running chunks)'
   - Database is source of truth, SSH only for supplementary metrics

2. app/cluster/page.tsx:
   - Stop button ALREADY EXISTS (conditionally shown)
   - Shows Start when status='idle', Stop when status='active'
   - No code changes needed - fixed by status detection

Result:
- Dashboard now shows 'ACTIVE' with 2 workers (correct)
- Workers show 'active' status (was 'offline')
- Stop button automatically visible when cluster active
- System resilient to SSH timeouts/network issues

Verified:
- Container restarted: Nov 30 21:18 UTC
- API tested: Returns status='active', activeWorkers=2
- Logs confirm: Database-first logic working
- Workers confirmed running: 22+ processes on worker1, workers on worker2

2025-11-30 22:23:01 +01:00

5.1 KiB

Raw Blame History

Dual v9 Parameter Sweep Package

Purpose: Run two INDEPENDENT parameter sweeps to compare which performs better

What This Tests

TWO SEPARATE SWEEPS (not combined):

Raw v9 Sweep: v9 Money Line indicator WITHOUT any filter
- Baseline performance across all parameters
- File: scripts/run_backtest_sweep.py
- Output: sweep_v9_vanilla_epyc.csv
RSI Filtered Sweep: v9 Money Line indicator WITH RSI divergence filter
- Same parameters, but only trades with RSI divergence
- File: scripts/run_backtest_sweep_rsi.py
- Output: sweep_v9_rsi_divergence.csv

Both sweeps test the same 65,536 parameter combinations independently; the best results from each are then compared.

Package Contents

data/solusdt_5m.csv - OHLCV data (Aug 1 - Nov 28, 2024, 34,273 candles)
backtester/ - Core backtesting modules
scripts/run_backtest_sweep.py - Vanilla v9 sweep
scripts/run_backtest_sweep_rsi.py - RSI divergence filtered sweep
setup_dual_sweep.sh - Setup script
run_sweep_vanilla_epyc.sh - Launch vanilla sweep
run_sweep_rsi_epyc.sh - Launch RSI sweep

Quick Start (EPYC Servers)

EPYC Server 1 - Raw v9 Sweep (No Filter)

# Extract package
tar -xzf backtest_v9_dual_sweep.tar.gz
cd backtest

# Setup environment
./setup_dual_sweep.sh

# Run raw v9 sweep (65,536 combinations, ~12-13h with 24 workers)
./run_sweep_vanilla_epyc.sh

# Monitor progress
tail -f v9_vanilla_sweep.log

EPYC Server 2 - RSI Filtered v9 Sweep

# Extract package
tar -xzf backtest_v9_dual_sweep.tar.gz
cd backtest

# Setup environment
./setup_dual_sweep.sh

# Run RSI sweep (65,536 combinations, ~12-13h with 24 workers)
./run_sweep_rsi_epyc.sh

# Monitor progress
tail -f v9_rsi_sweep.log

Parameter Grid

Both sweeps test the same 8 parameters (4 values each = 65,536 combinations):

flip_threshold: 0.4, 0.5, 0.6, 0.7
ma_gap: 0.20, 0.30, 0.40, 0.50
momentum_adx: 18, 21, 24, 27
momentum_long_pos: 60, 65, 70, 75
momentum_short_pos: 20, 25, 30, 35
cooldown_bars: 1, 2, 3, 4
momentum_spacing: 2, 3, 4, 5
momentum_cooldown: 1, 2, 3, 4
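
The grid size can be checked by enumerating it. This is a minimal sketch, assuming the sweep scripts build combinations as a plain Cartesian product. The `PARAM_GRID` name, the dict layout, and `iter_combinations` are illustrative and not taken from `run_backtest_sweep.py`:

```python
from itertools import product

# Values copied from the parameter grid above; the dict name is hypothetical.
PARAM_GRID = {
    "flip_threshold": [0.4, 0.5, 0.6, 0.7],
    "ma_gap": [0.20, 0.30, 0.40, 0.50],
    "momentum_adx": [18, 21, 24, 27],
    "momentum_long_pos": [60, 65, 70, 75],
    "momentum_short_pos": [20, 25, 30, 35],
    "cooldown_bars": [1, 2, 3, 4],
    "momentum_spacing": [2, 3, 4, 5],
    "momentum_cooldown": [1, 2, 3, 4],
}


def iter_combinations(grid):
    """Yield one dict per parameter combination (Cartesian product)."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


total = sum(1 for _ in iter_combinations(PARAM_GRID))
print(total)  # 4 ** 8 = 65536
```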

Expected Outputs

Vanilla Sweep

File: sweep_v9_vanilla_epyc.csv
Columns: All 8 parameters + trades, total_pnl, win_rate, avg_pnl, max_drawdown, profit_factor
Sorted by: total_pnl (descending)
Baseline Performance (default params): $405.88, 569 trades, 60.98% WR

RSI Divergence Sweep

File: sweep_v9_rsi_divergence.csv
Columns: All 8 parameters + total_pnl, num_trades, win_rate, profit_factor, max_drawdown, avg_win, avg_loss
Sorted by: total_pnl (descending)
Filter: Only trades with RSI divergence (bullish/bearish patterns, 20-bar lookback)
Top: 100 results only (to keep file size manageable)
Baseline Performance: $423.46, 224 trades, 63.39% WR (about 61% fewer trades than the vanilla baseline, with a higher win rate and total P&L)

Key Differences

Vanilla v9

All signals execute (no post-filter)
Tests which parameters maximize profit across all market conditions
Higher trade frequency

RSI Divergence v9

Post-simulation filter: only keeps trades with RSI divergence detected
Tests which parameters work best when combined with divergence confirmation
Lower trade frequency but potentially higher win rate
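
The post-simulation filter can be pictured as a pass over the finished trade list. This is a minimal sketch, assuming each trade records its entry bar and direction, and using one common definition of divergence (price sets a new extreme over the lookback window while RSI does not). The `Trade` type, `has_divergence`, `filter_trades`, and the divergence rule itself are illustrative assumptions, not the logic in `run_backtest_sweep_rsi.py`:

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Trade:  # hypothetical trade record
    entry_bar: int   # index of the entry candle
    direction: str   # "long" or "short"
    pnl: float


def has_divergence(trade: Trade, close: Sequence[float], rsi: Sequence[float],
                   lookback: int = 20) -> bool:
    """Assumed rule: price makes a new extreme vs. the lookback window, RSI does not."""
    i = trade.entry_bar
    if i < lookback:
        return False  # not enough history to compare against
    window_close = close[i - lookback:i]
    window_rsi = rsi[i - lookback:i]
    if trade.direction == "long":
        # Bullish: lower low in price, higher low in RSI.
        return close[i] < min(window_close) and rsi[i] > min(window_rsi)
    # Bearish: higher high in price, lower high in RSI.
    return close[i] > max(window_close) and rsi[i] < max(window_rsi)


def filter_trades(trades: List[Trade], close: Sequence[float],
                  rsi: Sequence[float], lookback: int = 20) -> List[Trade]:
    """Keep only trades whose entry bar shows divergence; the simulation is untouched."""
    return [t for t in trades if has_divergence(t, close, rsi, lookback)]
```

Because the filter runs after the simulation, it can only remove trades; it never changes entries, exits, or the P&L of the trades it keeps.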

Performance Estimates

Hardware: AMD EPYC 7282 (16-core) or similar
Workers: 24 parallel processes
Speed: ~1.6s per combination
Total Time: ~29 hours for 65,536 combinations
Output Size: ~5-10 MB for the vanilla CSV (full results); the RSI CSV holds only the top 100 rows and is far smaller
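
The total-time figure follows from the per-combination speed if ~1.6s is read as aggregate throughput across all 24 workers. That reading is an assumption; the figures above do not say whether the speed is per worker or overall:

```python
COMBINATIONS = 4 ** 8          # 8 parameters, 4 values each
SECONDS_PER_COMBINATION = 1.6  # assumed to be aggregate throughput, not per worker

total_hours = COMBINATIONS * SECONDS_PER_COMBINATION / 3600
print(f"{COMBINATIONS} combinations -> {total_hours:.1f} hours")
# prints "65536 combinations -> 29.1 hours"
```

By the same arithmetic, the ~12-13h estimate in the Quick Start comments would correspond to roughly 0.66-0.71s per combination.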

Comparison Strategy

After both sweeps complete:

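Find best vanilla result: head -2 sweep_v9_vanilla_epyc.csv (line 1 is the CSV header, line 2 is the top result)
Find best RSI result: head -2 sweep_v9_rsi_divergence.csv (line 1 is the CSV header, line 2 is the top result)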
Compare total P&L, trade count, win rate
Decision: Implement whichever yields highest total profit
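
The comparison can also be scripted. Since both files are sorted by total_pnl in descending order, the first data row after the header is the best result. This is a minimal sketch using only the standard library; the function names are hypothetical, and it accepts either `trades` or `num_trades` as the trade-count column because the two example CSVs name it differently:

```python
import csv


def best_row(path):
    """Return the first data row of a CSV sorted by total_pnl descending."""
    with open(path, newline="") as f:
        row = next(csv.DictReader(f), None)
    if row is None:
        raise ValueError(f"{path} has no data rows")
    return row


def summarize(row):
    """Normalize the metrics; the trade-count column is 'trades' or 'num_trades'."""
    trades = row.get("trades") or row.get("num_trades")
    return {
        "total_pnl": float(row["total_pnl"]),
        "trades": int(trades),
        "win_rate": float(row["win_rate"]),
    }


def pick_winner(vanilla_path, rsi_path):
    """Apply the decision rule above: highest total profit wins."""
    results = {
        "vanilla": summarize(best_row(vanilla_path)),
        "rsi": summarize(best_row(rsi_path)),
    }
    winner = max(results, key=lambda name: results[name]["total_pnl"])
    return winner, results
```

A call such as `pick_winner("sweep_v9_vanilla_epyc.csv", "sweep_v9_rsi_divergence.csv")` returns the winning sweep's label together with both summaries, so trade count and win rate can be reviewed alongside total P&L.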

Monitoring Commands

# Check sweep status
ps aux | grep run_backtest_sweep

# Watch progress (vanilla)
tail -f v9_vanilla_sweep.log

# Watch progress (RSI)
tail -f v9_rsi_sweep.log

# Check completion
ls -lh sweep_v9_*.csv

# Kill sweep if needed
pkill -f run_backtest_sweep

Troubleshooting

Import Errors

Ensure .venv is activated: source .venv/bin/activate
Check pandas/numpy installed: pip list | grep -E 'pandas|numpy'

Memory Issues

Reduce workers: Edit run script, change --workers 24 to --workers 16
Monitor: htop or free -h

Slow Progress

Check CPU usage: htop (should see 24 python processes at 100%)
Check I/O: iostat -x 1 (should not be a bottleneck, since the CSV is loaded into memory)

Expected Results Format

Vanilla CSV Example

flip_threshold,ma_gap,momentum_adx,momentum_long_pos,momentum_short_pos,cooldown_bars,momentum_spacing,momentum_cooldown,trades,total_pnl,win_rate,avg_pnl,max_drawdown,profit_factor
0.6,0.35,23,70,25,2,3,2,569,405.88,60.98,0.71,1360.58,1.022

RSI Divergence CSV Example

flip_threshold,ma_gap,momentum_adx,momentum_long_pos,momentum_short_pos,cooldown_bars,momentum_spacing,momentum_cooldown,total_pnl,num_trades,win_rate,profit_factor,max_drawdown,avg_win,avg_loss
0.6,0.35,23,70,25,2,3,2,423.46,224,63.39,1.087,1124.33,5.23,-3.42
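
The two example rows can be cross-checked against the baseline figures quoted under Expected Outputs. This short sketch uses only the values shown above; the variable names are illustrative:

```python
import csv
import io

VANILLA_CSV = (
    "flip_threshold,ma_gap,momentum_adx,momentum_long_pos,momentum_short_pos,"
    "cooldown_bars,momentum_spacing,momentum_cooldown,"
    "trades,total_pnl,win_rate,avg_pnl,max_drawdown,profit_factor\n"
    "0.6,0.35,23,70,25,2,3,2,569,405.88,60.98,0.71,1360.58,1.022\n"
)
RSI_CSV = (
    "flip_threshold,ma_gap,momentum_adx,momentum_long_pos,momentum_short_pos,"
    "cooldown_bars,momentum_spacing,momentum_cooldown,"
    "total_pnl,num_trades,win_rate,profit_factor,max_drawdown,avg_win,avg_loss\n"
    "0.6,0.35,23,70,25,2,3,2,423.46,224,63.39,1.087,1124.33,5.23,-3.42\n"
)

vanilla = next(csv.DictReader(io.StringIO(VANILLA_CSV)))
rsi = next(csv.DictReader(io.StringIO(RSI_CSV)))

# avg_pnl should equal total_pnl divided by the number of trades.
avg_pnl = float(vanilla["total_pnl"]) / int(vanilla["trades"])
# Share of vanilla trades that survive the RSI divergence filter.
kept = int(rsi["num_trades"]) / int(vanilla["trades"])

print(f"avg_pnl per trade: {avg_pnl:.2f}")        # prints "avg_pnl per trade: 0.71"
print(f"RSI filter keeps {kept:.1%} of trades")   # prints "RSI filter keeps 39.4% of trades"
```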

Package Info

Size: 1.1 MB compressed
MD5: d540906b1a9a3eaa0404bbd800349c59
Created: November 29, 2025
