CRITICAL FIX (Nov 30, 2025):
- Dashboard showed 'idle' despite 22+ worker processes running
- Root cause: SSH-based worker detection timing out
- Solution: Check database for running chunks FIRST
Changes:
1. app/api/cluster/status/route.ts:
   - Query exploration database before SSH detection
   - If running chunks exist, mark workers 'active' even if SSH fails
   - Override worker status: 'offline' → 'active' when chunks running
   - Log: '✅ Cluster status: ACTIVE (database shows running chunks)'
   - Database is source of truth, SSH only for supplementary metrics
2. app/cluster/page.tsx:
   - Stop button ALREADY EXISTS (conditionally shown)
   - Shows Start when status='idle', Stop when status='active'
   - No code changes needed - fixed by status detection
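The database-first rule described in the changes above can be sketched as a small pure function. This is a minimal illustration, not the code in route.ts: the `Worker` and `ClusterStatus` types, the `resolveClusterStatus` and `buttonLabel` functions, the worker 'idle' state, and the `runningChunks` parameter are all hypothetical names chosen for the example. The real route presumably reads the running-chunk count from the exploration database and the per-worker status from its SSH detection step.

```typescript
// Hypothetical types; the real route.ts may model workers differently.
type WorkerState = "active" | "idle" | "offline";

interface Worker {
  name: string;
  status: WorkerState; // as reported by SSH detection ("offline" on timeout)
}

interface ClusterStatus {
  status: "active" | "idle";
  activeWorkers: number;
  workers: Worker[];
}

// Database-first resolution: a non-zero count of running chunks marks the
// cluster active, and workers that SSH reported as offline are overridden.
function resolveClusterStatus(
  runningChunks: number,
  sshWorkers: Worker[]
): ClusterStatus {
  const dbSaysActive = runningChunks > 0;
  const workers: Worker[] = sshWorkers.map((w): Worker =>
    dbSaysActive && w.status === "offline" ? { ...w, status: "active" } : w
  );
  const activeWorkers = workers.filter((w) => w.status === "active").length;
  return {
    // Assumption: with no running chunks, fall back to what SSH reported.
    status: dbSaysActive || activeWorkers > 0 ? "active" : "idle",
    activeWorkers,
    workers,
  };
}

// Mirrors the existing conditional button: Start when idle, Stop when active.
function buttonLabel(status: "active" | "idle"): "Start" | "Stop" {
  return status === "idle" ? "Start" : "Stop";
}
```

With two workers that SSH reports as 'offline' and a running-chunk count greater than zero, this sketch yields status 'active' with activeWorkers equal to 2, and the button label becomes Stop without any change to the page component.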
Result:
- Dashboard now shows 'ACTIVE' with 2 workers (correct)
- Workers show 'active' status (was 'offline')
- Stop button automatically visible when cluster active
- System resilient to SSH timeouts/network issues
Verified:
- Container restarted: Nov 30 21:18 UTC
- API tested: Returns status='active', activeWorkers=2
- Logs confirm: Database-first logic working
- Workers confirmed running: 22+ processes on worker1, workers on worker2
39 lines
877 B
Bash
Executable File
#!/bin/bash
# Run RSI divergence v9 parameter sweep on EPYC server
# Usage: ./run_sweep_rsi_epyc.sh

set -e

# Activate virtual environment
source .venv/bin/activate

# Set PYTHONPATH to current directory
export PYTHONPATH=$(pwd)

echo "=== Starting RSI DIVERGENCE v9 Parameter Sweep ==="
echo "Workers: 24 (EPYC optimized)"
echo "Started: $(date)"
echo "Output: sweep_v9_rsi_divergence.csv"
echo "Log: v9_rsi_sweep.log"
echo

# Run in background with nohup
nohup python scripts/run_backtest_sweep_rsi.py \
    --workers 24 \
    --top 100 \
    > v9_rsi_sweep.log 2>&1 &

SWEEP_PID=$!
echo "Sweep launched: PID $SWEEP_PID"
echo "Monitor: tail -f v9_rsi_sweep.log"
echo
sleep 3

# Show initial output
echo "=== First 30 lines of log ==="
head -30 v9_rsi_sweep.log

echo
echo "Sweep running in background (PID $SWEEP_PID)"
echo "Results will be saved to: sweep_v9_rsi_divergence.csv"