🐛 Recent Fix: Three Critical Bugs in cluster/v11_test_worker.py

1. Missing use_quality_filters parameter when creating MoneyLineV11Inputs
   - The parameter defaults to True but wasn't being passed explicitly
   - Fix: Added use_quality_filters=True to the inputs creation (see the sketch below)
2. Missing fixed RSI parameters (rsi_long_max, rsi_short_min)
   - The worker only passed rsi_long_min and rsi_short_max (sweep params)
   - Missing rsi_long_max=70 and rsi_short_min=30 (fixed params)
   - Fix: Added both fixed parameters to the inputs creation
3. Import path mismatch - the worker imported the OLD version
   - The worker added cluster/ to sys.path but imported from the parent directory
   - The old v11_moneyline_all_filters.py (21:40) was missing use_quality_filters
   - The fixed v11_moneyline_all_filters.py was in the backtester/ subdirectory
   - Fix: Deployed the corrected file to /home/comprehensive_sweep/

Result: 0 signals → 1,096-1,186 signals per config ✓

Verified: Local test (314 signals), EPYC dataset test (1,186 signals), and the worker log now shows signal variety across 27 concurrent configs. The progressive sweep is now running successfully on the EPYC cluster.
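For reference, a minimal sketch of the corrected inputs creation. The real MoneyLineV11Inputs lives in v11_moneyline_all_filters.py; the stand-in dataclass below only assumes the fields named in the notes above, and the sweep values are illustrative.

```python
from dataclasses import dataclass

# Stand-in for the real MoneyLineV11Inputs (defined in
# v11_moneyline_all_filters.py); fields are assumptions based on the fix notes.
@dataclass
class MoneyLineV11Inputs:
    rsi_long_min: float
    rsi_short_max: float
    rsi_long_max: float = 70.0
    rsi_short_min: float = 30.0
    use_quality_filters: bool = True

# The corrected worker passes the fixed RSI bounds and the quality-filter
# flag explicitly instead of relying on defaults (or on a stale import).
inputs = MoneyLineV11Inputs(
    rsi_long_min=40.0,         # sweep parameter (illustrative value)
    rsi_short_max=60.0,        # sweep parameter (illustrative value)
    rsi_long_max=70.0,         # fixed parameter (was missing)
    rsi_short_min=30.0,        # fixed parameter (was missing)
    use_quality_filters=True,  # was defaulting silently
)
print(inputs)
```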
Distributed Continuous Optimization Cluster
24/7 automated strategy discovery across 2 EPYC servers (64 cores total). Explores the full indicator/parameter space to find the best-performing trading approach.
🏗️ Architecture
Three-Component Distributed System:
- Coordinator (distributed_coordinator.py) - Master orchestrator running on srvdocker02
  - Defines the parameter grid (14 dimensions, ~500k combinations)
  - Splits work into chunks (e.g., 10,000 combos per chunk)
  - Deploys the worker script to the EPYC servers via SSH/SCP
  - Assigns chunks to idle workers dynamically
  - Collects CSV results and imports them into the SQLite database
  - Tracks progress (completed/running/pending chunks)
- Worker (distributed_worker.py) - Runs on the EPYC servers
  - Integrates with the existing /home/comprehensive_sweep/backtester/ infrastructure
  - Uses the proven simulator.py vectorized engine and the MoneyLineInputs class
  - Loads a chunk spec (start_idx, end_idx within the total parameter grid)
  - Generates parameter combinations via itertools.product() (see the sketch after this list)
  - Runs a multiprocessing sweep with mp.cpu_count() workers
  - Saves results to CSV (same format as comprehensive_sweep.py)
- Monitor (exploration_status.py) - Real-time status dashboard
  - SSH worker health checks (active distributed_worker.py processes)
  - Chunk progress tracking (total/completed/running/pending)
  - Top 10 strategies leaderboard (P&L, trades, WR, PF, DD)
  - Best configuration details (full parameters)
  - Watch mode for continuous monitoring (30s refresh)
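To make the chunk mechanics concrete, here is a minimal sketch of how a worker can materialize only its slice of the grid without holding the whole grid in memory. The grid shown is a truncated stand-in for the real 14 dimensions, and the function name is illustrative, not the worker's actual code.

```python
import itertools

# Truncated stand-in for the real 14-dimension grid (see Parameter Space below).
PARAM_GRID = {
    "flip_threshold": [0.4, 0.5, 0.6, 0.7],
    "ma_gap": [0.20, 0.30, 0.40, 0.50],
    "cooldown": [1, 2, 3, 4],
}

def chunk_combos(start_idx: int, end_idx: int):
    """Yield parameter dicts for grid indices [start_idx, end_idx)."""
    keys = list(PARAM_GRID)
    combos = itertools.product(*(PARAM_GRID[k] for k in keys))
    # islice consumes lazily, so a worker never enumerates the full grid at once.
    for combo in itertools.islice(combos, start_idx, end_idx):
        yield dict(zip(keys, combo))

if __name__ == "__main__":
    for params in chunk_combos(4, 7):  # a 3-combo "chunk"
        print(params)
```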
Infrastructure:
- Worker 1: pve-nu-monitor01 (10.10.254.106) - EPYC 7282, 32 threads, 62GB RAM
- Worker 2: pve-srvmon01 (10.20.254.100, reached via a 2-hop SSH through Worker 1) - EPYC 7302, 32 threads, 31GB RAM
- Combined: 64 cores, ~108,000 backtests/day capacity (proven: 65,536 backtests in 29h)
- Existing backtester: /home/comprehensive_sweep/backtester/ with simulator.py, indicators/, data/
- Data: solusdt_5m.csv - Binance 5-minute OHLCV (Nov 2024 - Nov 2025)
- Database: exploration.db - SQLite with strategies/chunks/phases tables
🚀 Quick Start
1. Test with Small Chunk (RECOMMENDED FIRST)
Verify system works before large-scale deployment:
cd /home/icke/traderv4/cluster
# Modify distributed_coordinator.py temporarily (lines 120-135)
# Reduce parameter ranges to 2-3 values per dimension
# Total: ~500-1000 combinations for testing
# Run test
python3 distributed_coordinator.py --chunk-size 100
# Monitor in separate terminal
python3 exploration_status.py --watch
Expected: 5-10 chunks complete in 30-60 minutes, all results in exploration.db
Verify:
- SSH commands execute successfully
- Worker script deploys to /home/comprehensive_sweep/backtester/scripts/
- CSV results appear in cluster/distributed_results/
- Database populated with strategies (check with sqlite3 exploration.db "SELECT COUNT(*) FROM strategies")
- Monitoring dashboard shows accurate worker/chunk status
2. Run Full v9 Parameter Sweep
After test succeeds, explore full parameter space:
cd /home/icke/traderv4/cluster
# Restore full parameter ranges in distributed_coordinator.py
# Total: full 14-dimension grid (see the Parameter Space section below)
# Start exploration (runs in background)
nohup python3 distributed_coordinator.py --chunk-size 10000 > sweep.log 2>&1 &
# Monitor progress
python3 exploration_status.py --watch
# OR
watch -n 60 'python3 exploration_status.py'
# Check logs
tail -f sweep.log
Expected Results:
- Duration: ~3.5 hours with 64 cores
- Find 5-10 configurations with P&L > $250/1k (baseline: $192/1k)
- Quality filters: 700+ trades, 50-70% WR, PF ≥ 1.2
3. Query Top Strategies
# Top 20 performers
sqlite3 cluster/exploration.db <<EOF
SELECT
params_json,
printf('$%.2f', pnl_per_1k) as pnl,
trades,
printf('%.1f%%', win_rate * 100) as wr,
printf('%.2f', profit_factor) as pf,
printf('%.1f%%', max_drawdown * 100) as dd,
DATE(tested_at) as tested
FROM strategies
WHERE trades >= 700
AND win_rate >= 0.50
AND win_rate <= 0.70
AND profit_factor >= 1.2
ORDER BY pnl_per_1k DESC
LIMIT 20;
EOF
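The same leaderboard can also be pulled into pandas for further analysis. A minimal sketch, assuming the strategies schema documented in the Database Schema section below:

```python
import sqlite3
import pandas as pd

# Load the filtered leaderboard into a DataFrame (same filters as the SQL above).
conn = sqlite3.connect("cluster/exploration.db")
top = pd.read_sql_query(
    """
    SELECT params_json, pnl_per_1k, trades, win_rate, profit_factor, max_drawdown
    FROM strategies
    WHERE trades >= 700
      AND win_rate BETWEEN 0.50 AND 0.70
      AND profit_factor >= 1.2
    ORDER BY pnl_per_1k DESC
    LIMIT 20
    """,
    conn,
)
conn.close()
print(top.to_string(index=False))
```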
📊 Parameter Space (14 Dimensions)
v9 Money Line Configuration:
ParameterGrid(
flip_thresholds=[0.4, 0.5, 0.6, 0.7], # EMA flip confirmation (4 values)
ma_gaps=[0.20, 0.30, 0.40, 0.50], # MA50-MA200 convergence bonus (4 values)
adx_mins=[18, 21, 24, 27], # ADX requirement for momentum filter (4 values)
long_pos_maxs=[60, 65, 70, 75], # Price position for LONG momentum (4 values)
short_pos_mins=[20, 25, 30, 35], # Price position for SHORT momentum (4 values)
cooldowns=[1, 2, 3, 4], # Bars between signals (4 values)
position_sizes=[1.0], # Full position (1 value fixed)
tp1_multipliers=[1.5, 2.0, 2.5], # TP1 as ATR multiple (3 values)
tp2_multipliers=[3.0, 4.0, 5.0], # TP2 as ATR multiple (3 values)
sl_multipliers=[2.0, 3.0, 4.0], # SL as ATR multiple (3 values)
tp1_close_percents=[0.5, 0.6, 0.7, 0.75], # TP1 close % (4 values)
trailing_multipliers=[1.0, 1.5, 2.0], # Trailing stop multiplier (3 values)
vol_mins=[0.8, 1.0, 1.2], # Minimum volume ratio (3 values)
max_bars_list=[100, 150, 200] # Max bars in position (3 values)
)
# Total: 4×4×4×4×4×4×1×3×3×3×4×3×3×3 = 11,943,936 combinations
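A quick sanity check of the raw grid size as listed above, multiplying the per-dimension value counts:

```python
import math

# Per-dimension value counts, in the order the grid is declared above.
dimension_sizes = [4, 4, 4, 4, 4, 4, 1, 3, 3, 3, 4, 3, 3, 3]
print(f"{math.prod(dimension_sizes):,} combinations")  # 11,943,936
```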
🎯 Quality Filters
Applied to all strategy results:
- Minimum trades: 700+ (statistical significance)
- Win rate range: 50-70% (realistic, avoids overfitting)
- Profit factor: ≥ 1.2 (solid edge)
- Max drawdown: Tracked but no hard limit (informational)
Why these filters:
- Trade count validates statistical robustness
- WR range prevents curve-fitting (>70% = overfit, <50% = coin flip)
- PF threshold ensures strategy has actual edge
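Expressed as code, the filter is a simple predicate. A minimal sketch using the metric names from the strategies table; the function name is illustrative:

```python
def passes_quality_filters(trades: int, win_rate: float, profit_factor: float) -> bool:
    """Return True if a backtest result meets the quality bar above."""
    return (
        trades >= 700                  # statistical significance
        and 0.50 <= win_rate <= 0.70   # realistic band, avoids curve-fitting
        and profit_factor >= 1.2       # demonstrable edge
    )

# A baseline-like config passes; a suspiciously high win rate is rejected.
assert passes_quality_filters(812, 0.61, 1.4)
assert not passes_quality_filters(812, 0.74, 1.8)
```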
📈 Expected Results
Current Baseline (v9 default parameters):
- P&L: $192 per $1k capital
- Trades: ~700
- Win Rate: ~61%
- Profit Factor: ~1.4
Optimization Goals:
- Target: >$250/1k P&L (30% improvement)
- Stretch: >$300/1k P&L (56% improvement)
- Expected: Find 5-10 configurations meeting quality filters with P&L > $250/1k
Why achievable:
- 500k combinations vs 27 tested in narrow sweep
- Full parameter space exploration vs limited grid
- Proven infrastructure (65,536 backtests completed successfully)
🔄 Continuous Exploration Roadmap
Phase 1: v9 Money Line Parameter Optimization (~500k combos, 3.5h)
- Status: READY TO RUN
- Goal: Find optimal flip_threshold, ma_gap, momentum filters
- Expected: >$250/1k P&L
Phase 2: RSI Divergence Integration (~100k combos, 45min)
- Add RSI divergence detection
- Combine with v9 momentum filter
- Parameters: RSI lookback, divergence strength threshold
- Goal: Catch trend reversals early
Phase 3: Volume Profile Analysis (~200k combos, 1.5h)
- Volume profile zones (POC, VAH, VAL)
- Order flow imbalance detection
- Parameters: Profile window, entry threshold, confirmation bars
- Goal: Better entry timing
Phase 4: Multi-Timeframe Confirmation (~150k combos, 1h)
- 5min + 15min + 1H alignment
- Higher timeframe trend filter
- Parameters: Timeframes to use, alignment strictness
- Goal: Reduce false signals
Phase 5: Hybrid Indicators (~50k combos, 30min)
- Combine best performers from Phase 1-4
- Test cross-strategy synergy
- Goal: Break $300/1k barrier
Phase 6: ML-Based Optimization (~100k+ combos, 1h+)
- Feature engineering from top strategies
- Gradient boosting / random forest
- Genetic algorithm parameter tuning
- Goal: Discover non-obvious patterns
📁 File Structure
cluster/
├── distributed_coordinator.py # Master orchestrator (650 lines)
├── distributed_worker.py # Worker script (350 lines)
├── exploration_status.py # Monitoring dashboard (200 lines)
├── exploration.db # SQLite results database
├── distributed_results/ # CSV results from workers
│ ├── worker1_chunk_0.csv
│ ├── worker1_chunk_1.csv
│ └── worker2_chunk_0.csv
└── README.md # This file
/home/comprehensive_sweep/backtester/ (on EPYC servers)
├── simulator.py # Core vectorized engine
├── indicators/
│ ├── money_line.py # MoneyLineInputs class
│ └── ...
├── data/
│ └── solusdt_5m.csv # Binance 5-minute OHLCV
├── scripts/
│ ├── comprehensive_sweep.py # Original multiprocessing sweep
│ └── distributed_worker.py # Deployed by coordinator
└── .venv/ # Python 3.11.2, pandas, numpy
💾 Database Schema
strategies table
CREATE TABLE strategies (
id INTEGER PRIMARY KEY AUTOINCREMENT,
phase_id INTEGER, -- Which exploration phase (1=v9, 2=RSI, etc.)
params_json TEXT NOT NULL, -- JSON parameter configuration
pnl_per_1k REAL, -- Performance metric ($ PnL per $1k)
trades INTEGER, -- Total trades in backtest
win_rate REAL, -- Decimal win rate (0.61 = 61%)
profit_factor REAL, -- Gross profit / gross loss
max_drawdown REAL, -- Largest peak-to-trough decline (decimal)
tested_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (phase_id) REFERENCES phases(id)
);
CREATE INDEX idx_strategies_pnl ON strategies(pnl_per_1k DESC);
CREATE INDEX idx_strategies_trades ON strategies(trades);
chunks table
CREATE TABLE chunks (
id INTEGER PRIMARY KEY AUTOINCREMENT,
phase_id INTEGER,
worker_id TEXT, -- 'worker1' or 'worker2'
start_idx INTEGER, -- Start index in parameter grid
end_idx INTEGER, -- End index (exclusive)
total_combos INTEGER, -- Total in this chunk
status TEXT DEFAULT 'pending', -- pending/running/completed/failed
assigned_at TIMESTAMP,
completed_at TIMESTAMP,
result_file TEXT, -- Path to CSV result file
FOREIGN KEY (phase_id) REFERENCES phases(id)
);
CREATE INDEX idx_chunks_status ON chunks(status);
phases table
CREATE TABLE phases (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL, -- 'v9_optimization', 'rsi_divergence', etc.
description TEXT,
total_combinations INTEGER, -- Total parameter combinations
started_at TIMESTAMP,
completed_at TIMESTAMP
);
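For reference, recording one result against this schema looks roughly like the following. A sketch only: the coordinator's actual importer works from the workers' CSV files, and the values here are illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect("cluster/exploration.db")
conn.execute(
    """
    INSERT INTO strategies
        (phase_id, params_json, pnl_per_1k, trades, win_rate, profit_factor, max_drawdown)
    VALUES (?, ?, ?, ?, ?, ?, ?)
    """,
    # params_json stores the full configuration as a JSON string.
    (1, json.dumps({"flip_threshold": 0.5, "ma_gap": 0.30}), 211.5, 743, 0.58, 1.31, 0.12),
)
conn.commit()
conn.close()
```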
🔧 Troubleshooting
SSH Connection Issues
Symptom: "Connection refused" or timeout errors
Solutions:
# Test Worker 1 connectivity
ssh root@10.10.254.106 'echo "Worker 1 OK"'
# Test Worker 2 (2-hop) connectivity
ssh root@10.10.254.106 'ssh root@10.20.254.100 "echo Worker 2 OK"'
# Check SSH keys
ssh-add -l
# Verify authorized_keys on workers
ssh root@10.10.254.106 'cat ~/.ssh/authorized_keys'
Path/Import Errors on Workers
Symptom: "ModuleNotFoundError" or "FileNotFoundError"
Solutions:
# Verify backtester exists on Worker 1
ssh root@10.10.254.106 'ls -lah /home/comprehensive_sweep/backtester/'
# Check Python environment
ssh root@10.10.254.106 'cd /home/comprehensive_sweep/backtester && source .venv/bin/activate && python --version'
# Verify data file
ssh root@10.10.254.106 'ls -lh /home/comprehensive_sweep/backtester/data/solusdt_5m.csv'
# Check distributed_worker.py deployment
ssh root@10.10.254.106 'ls -lh /home/comprehensive_sweep/backtester/scripts/distributed_worker.py'
Worker Processes Stuck/Hung
Symptom: exploration_status.py shows "running" but no progress
Solutions:
# Check worker processes
ssh root@10.10.254.106 'ps aux | grep distributed_worker'
# Check worker CPU usage (should be near 100% on 32 cores)
ssh root@10.10.254.106 'top -bn1 | head -20'
# Kill hung worker (coordinator will reassign chunk)
ssh root@10.10.254.106 'pkill -f distributed_worker.py'
# Check worker logs
ssh root@10.10.254.106 'tail -50 /home/comprehensive_sweep/backtester/scripts/worker_*.log'
Database Locked/Corrupt
Symptom: "database is locked" errors
Solutions:
# Check for stale locks
cd /home/icke/traderv4/cluster
fuser exploration.db
# Backup and rebuild
cp exploration.db exploration.db.backup
sqlite3 exploration.db "VACUUM;"
# Verify integrity
sqlite3 exploration.db "PRAGMA integrity_check;"
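If locking recurs, switching SQLite to WAL (write-ahead logging) mode usually reduces writer/reader contention. This is a standard SQLite pragma offered here as a suggestion, not something the coordinator configures today:

```python
import sqlite3

# WAL mode persists in the database file, so this only needs to run once.
conn = sqlite3.connect("cluster/exploration.db")
print(conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0])  # prints 'wal'
conn.close()
```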
Results Not Importing
Symptom: CSVs in distributed_results/ but database empty
Solutions:
# Check CSV format
head -20 cluster/distributed_results/worker1_chunk_0.csv
# Manual import test
python3 -c "
import sqlite3
import pandas as pd
df = pd.read_csv('cluster/distributed_results/worker1_chunk_0.csv')
print(f'Loaded {len(df)} results')
print(df.columns.tolist())
print(df.head())
"
# Check coordinator logs for import errors
grep -i "error\|exception" sweep.log | tail -20
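If the CSV looks sane but the importer failed, a manual backfill along these lines can recover the chunk. It assumes the CSV columns match the strategies table (an assumption worth checking against the schema above before running):

```python
import sqlite3
import pandas as pd

# Backfill one chunk's results; assumes CSV columns match the strategies table.
df = pd.read_csv("cluster/distributed_results/worker1_chunk_0.csv")
conn = sqlite3.connect("cluster/exploration.db")
df.to_sql("strategies", conn, if_exists="append", index=False)
conn.commit()
conn.close()
print(f"Imported {len(df)} rows")
```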
⚡ Performance Tuning
Chunk Size Trade-offs
Small chunks (1,000-5,000):
- ✅ Better load balancing
- ✅ Faster feedback loop
- ❌ More SSH/SCP overhead
- ❌ More database writes
Large chunks (10,000-20,000):
- ✅ Less overhead
- ✅ Fewer database transactions
- ❌ Less granular progress tracking
- ❌ Wasted work if chunk fails
Recommended: 10,000 combos per chunk (good balance)
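The split itself is trivial; a sketch of how (start_idx, end_idx) pairs for the chunks table can be derived (function name illustrative, not the coordinator's exact code):

```python
def make_chunks(total_combos: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return (start_idx, end_idx) pairs covering [0, total_combos)."""
    return [
        (start, min(start + chunk_size, total_combos))
        for start in range(0, total_combos, chunk_size)
    ]

# The final chunk is simply smaller when the grid doesn't divide evenly.
print(make_chunks(25_000, 10_000))  # [(0, 10000), (10000, 20000), (20000, 25000)]
```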
Worker Concurrency
Current: Uses mp.cpu_count() (32 workers per EPYC)
To reduce CPU load:
# In distributed_worker.py line ~280
# Change from:
workers = mp.cpu_count()
# To:
workers = int(mp.cpu_count() * 0.7) # 70% utilization (22 workers)
Database Optimization
For large result sets (>100k strategies):
# Add indexes if queries slow
sqlite3 cluster/exploration.db <<EOF
CREATE INDEX IF NOT EXISTS idx_strategies_phase ON strategies(phase_id);
CREATE INDEX IF NOT EXISTS idx_strategies_wr ON strategies(win_rate);
CREATE INDEX IF NOT EXISTS idx_strategies_pf ON strategies(profit_factor);
ANALYZE;
EOF
✅ Best Practices
- Always test with small chunk first (100-1000 combos) before full sweep
- Monitor regularly with exploration_status.py --watch during runs
- Backup the database before major changes: cp exploration.db exploration.db.backup
- Review top strategies after each phase completion
- Archive old results if disk space runs low (CSV files can be deleted after import)
- Validate quality filters - adjust if too strict/lenient based on results
- Check worker logs if progress stalls: ssh root@10.10.254.106 'tail -f /home/comprehensive_sweep/backtester/scripts/worker_*.log'
🔗 Integration with Production Bot
After finding top strategy:
- Extract parameters from database:
sqlite3 cluster/exploration.db <<EOF
SELECT params_json FROM strategies
WHERE id = (SELECT id FROM strategies ORDER BY pnl_per_1k DESC LIMIT 1);
EOF
- Update the TradingView indicator (workflows/trading/moneyline_v9_ma_gap.pinescript):
  - Set flip_threshold, ma_gap, momentum_adx, etc. to the optimal values
  - Test in replay mode with historical data
- Update the bot configuration (.env file):
  - Adjust MIN_SIGNAL_QUALITY_SCORE if needed
  - Update position sizing if the strategy has a different risk profile
- Forward test (50-100 trades) before increasing capital:
  - Use SOLANA_POSITION_SIZE=10 (10% of capital)
  - Monitor win rate, P&L, drawdown
  - If metrics match the backtest within ±10%, increase to full size
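The parameter extraction in the first step can also be done from Python, printing the best configuration key by key for manual transfer into the indicator and .env. A sketch against the schema documented above:

```python
import json
import sqlite3

# Fetch the single best configuration by P&L per $1k.
conn = sqlite3.connect("cluster/exploration.db")
row = conn.execute(
    "SELECT params_json FROM strategies ORDER BY pnl_per_1k DESC LIMIT 1"
).fetchone()
conn.close()

for key, value in sorted(json.loads(row[0]).items()):
    print(f"{key} = {value}")
```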
📚 Support & Documentation
- Main project docs: /home/icke/traderv4/.github/copilot-instructions.md (5,181 lines)
- Trading goals: TRADING_GOALS.md (8-phase $106→$100k+ roadmap)
- v9 indicator: INDICATOR_V9_MA_GAP_ROADMAP.md
- Optimization roadmaps: SIGNAL_QUALITY_OPTIMIZATION_ROADMAP.md, POSITION_SCALING_ROADMAP.md
- Adaptive leverage: ADAPTIVE_LEVERAGE_SYSTEM.md
🚀 Future Enhancements
Potential additions:
- Genetic Algorithm Optimization - Breed top performers, test offspring
- Bayesian Optimization - Guide search toward promising parameter regions
- Web Dashboard - Real-time browser-based monitoring (Flask/FastAPI)
- Telegram Alerts - Notify when exceptional strategies found (P&L > threshold)
- Walk-Forward Analysis - Test strategies on rolling time windows
- Multi-Asset Support - Extend to ETH, BTC, other Drift markets
- Auto-Deployment - Push top strategies to production after validation
Questions? Check the main project documentation or ask in the development chat.
Ready to start? Run the test sweep first: python3 cluster/distributed_coordinator.py --chunk-size 100