feat: V9 advanced parameter sweep with MA gap filter (810K configs)

Parameter space expansion:
- Original 15 params: 101K configurations
- NEW: MA gap filter (3 dimensions) = ~8× effective expansion
- Total: ~810,000 configurations across 4 time profiles
- Chunk size: 1,000 configs/chunk = ~810 chunks

MA Gap Filter parameters:
- use_ma_gap: True/False (2 values)
- ma_gap_min_long: -5.0%, 0%, +5.0% (3 values)
- ma_gap_min_short: -5.0%, 0%, +5.0% (3 values)

Implementation:
- money_line_v9.py: Full v9 indicator with MA gap logic
- v9_advanced_worker.py: Chunk processor (1,000 configs)
- v9_advanced_coordinator.py: Work distributor (2 EPYC workers)
- run_v9_advanced_sweep.sh: Startup script (generates + launches)

Infrastructure:
- Uses existing EPYC cluster (64 cores total)
- Worker1: bd-epyc-02 (32 threads)
- Worker2: bd-host01 (32 threads via SSH hop)
- Expected runtime: 70-80 hours
- Database: SQLite (chunk tracking + results)

Goal: Find optimal MA gap thresholds for filtering false breakouts
during MA whipsaw zones while preserving trend entries.
Author: mindesbunister
Date: 2025-12-01 18:11:47 +01:00
Parent: 2993bc8895
Commit: 7e1fe1cc30
9 changed files with 2541 additions and 0 deletions

# V9 Advanced Parameter Sweep - 810K Configurations
**Status:** Ready to launch (Dec 1, 2025)
**Total Configs:** ~810,000 (18-parameter grid with MA gap filter)
**Expected Runtime:** 70-80 hours on 2 EPYC servers
**Enhancement:** Added MA gap filter exploration (8× expansion from 101K)
## Architecture
### Parameter Space (18 dimensions)
Builds on existing v9 grid but adds **MA gap filter** parameters:
**Original 15 parameters (101K configs):**
- Time profiles: minutes, hours, daily, weekly (4 profiles)
- ATR periods: profile-specific (3-4 values each)
- ATR multipliers: profile-specific (3-4 values each)
- RSI boundaries: long_min/max, short_min/max (3×4 values)
- Volume max: 3.0, 3.5, 4.0
- Entry buffer: 0.15, 0.20, 0.25
- ADX length: 14, 16, 18
**NEW: MA Gap Filter (3 new dimensions):**
- `use_ma_gap`: True/False (2 values)
- `ma_gap_min_long`: -5.0%, 0%, +5.0% (3 values)
- `ma_gap_min_short`: -5.0%, 0%, +5.0% (3 values)
**Total:** ~810,000 configurations (≈8× the 101K baseline after pruning redundant combinations; the naive 2 × 3 × 3 product would be 18×, i.e. ~1.8M)
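One plausible way a generator avoids the naive 18× blow-up is to emit a single config when `use_ma_gap=False`, since the thresholds are then irrelevant. The sketch below is hypothetical (not the actual `run_v9_advanced_sweep.sh` logic) and yields a 10× expansion per baseline config; the real generator evidently prunes further to land near ~8×:

```python
from itertools import product

GAP_VALUES = [-5.0, 0.0, 5.0]  # candidate thresholds from the grid above

def expand_with_ma_gap(baseline_configs):
    """Expand baseline configs with the MA gap dimensions, skipping
    redundant threshold combos when use_ma_gap is False (hypothetical sketch)."""
    expanded = []
    for cfg in baseline_configs:
        # use_ma_gap=False: thresholds are ignored, so one config suffices
        expanded.append({**cfg, "use_ma_gap": False,
                        "ma_gap_min_long": None, "ma_gap_min_short": None})
        # use_ma_gap=True: full 3 x 3 threshold grid
        for lo, sh in product(GAP_VALUES, GAP_VALUES):
            expanded.append({**cfg, "use_ma_gap": True,
                            "ma_gap_min_long": lo, "ma_gap_min_short": sh})
    return expanded

# Two toy baseline configs expand to 2 x (1 + 9) = 20, not 2 x 18 = 36
toy = expand_with_ma_gap([{"atr_period": 14}, {"atr_period": 21}])
```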
### What is MA Gap Filter?
**Purpose:** Filter entries based on MA50-MA200 convergence/divergence
**Long Entry Logic:**
```python
if use_ma_gap and ma_gap_min_long is not None:
    ma_gap_percent = (ma50 - ma200) / ma200 * 100
    if ma_gap_percent < ma_gap_min_long:
        allow_entry = False  # block: MA50 too far below MA200 for a long
```
**Short Entry Logic:**
```python
if use_ma_gap and ma_gap_min_short is not None:
    ma_gap_percent = (ma50 - ma200) / ma200 * 100
    if ma_gap_percent > ma_gap_min_short:
        allow_entry = False  # block: MA50 too far above MA200 for a short
```
**Hypothesis:**
- **LONG at MA crossover:** Require ma_gap ≥ 0% (bullish or neutral)
- **SHORT at MA crossover:** Require ma_gap ≤ 0% (bearish or neutral)
- **Avoid whipsaws:** Block entries when MAs are too diverged
**Parameter Exploration:**
- `-5.0%`: Allows 5% adverse gap (catching reversals)
- `0%`: Requires neutral or favorable gap
- `+5.0%`: Requires 5% favorable gap (strong trend only)
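Combined into a single helper, the long and short rules read as follows (hypothetical function name and signature; `money_line_v9.py` may structure this differently):

```python
def passes_ma_gap(side, ma50, ma200, use_ma_gap=True,
                  ma_gap_min_long=0.0, ma_gap_min_short=0.0):
    """Return True if the MA50-MA200 gap permits an entry on `side`."""
    if not use_ma_gap:
        return True
    gap = (ma50 - ma200) / ma200 * 100  # gap as a percentage of MA200
    if side == "long":
        # longs blocked when the gap is below the long threshold
        return ma_gap_min_long is None or gap >= ma_gap_min_long
    if side == "short":
        # shorts blocked when the gap is above the short threshold
        return ma_gap_min_short is None or gap <= ma_gap_min_short
    raise ValueError(f"unknown side: {side}")
```

With MA50=105 and MA200=100 (gap = +5%), a long passes a 0% threshold while a short does not, matching the hypothesis above.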
**Expected Findings:**
1. **Optimal gap thresholds** for each profile (minutes vs daily may differ)
2. **Direction-specific gaps** (LONGs may need different threshold than SHORTs)
3. **Performance comparison** (use_ma_gap=True vs False baseline)
### Why This Matters
**User Context (Nov 27 analysis):**
- v9 has strong baseline edge ($405.88 on 1-year data)
- But parameter insensitivity suggests edge is in **logic**, not tuning
- MA gap filter adds **new logic dimension** (not just parameter tuning)
- Could filter false breakouts that occur during MA whipsaw zones
**Real-world validation needed:**
- Converging MAs can mark good entries (trend formation)
- Widely diverged MAs can also mark good entries (strong trend continuation)
- The optimal gap threshold is therefore a data-driven discovery goal, not something to pick a priori
## Implementation
### Files
```
cluster/
├── money_line_v9.py # v9 indicator (copied from backtester/indicators/)
├── v9_advanced_worker.py # Worker script (processes 1 chunk)
├── v9_advanced_coordinator.py # Coordinator (assigns chunks to workers)
├── run_v9_advanced_sweep.sh # Startup script (generates configs + launches)
├── chunks/ # Generated parameter configurations
│ ├── v9_advanced_chunk_0000.json (1,000 configs)
│ ├── v9_advanced_chunk_0001.json (1,000 configs)
│ └── ... (~810 chunk files)
├── exploration.db # SQLite database (chunk tracking)
└── distributed_results/ # CSV outputs from workers
├── v9_advanced_chunk_0000_results.csv
└── ...
```
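The chunk-splitting step itself is straightforward; a minimal sketch of producing the numbered JSON files shown above (assuming each chunk file holds a JSON array of parameter dicts — the actual file format is not shown in this document):

```python
import json
import os

def write_chunks(configs, out_dir, size=1000, prefix="v9_advanced_chunk"):
    """Split a config list into numbered JSON chunk files of `size` configs each."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i in range(0, len(configs), size):
        # zero-padded index matches names like v9_advanced_chunk_0000.json
        path = os.path.join(out_dir, f"{prefix}_{i // size:04d}.json")
        with open(path, "w") as f:
            json.dump(configs[i:i + size], f)
        paths.append(path)
    return paths
```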
### Database Schema
**v9_advanced_chunks table:**
```sql
CREATE TABLE v9_advanced_chunks (
id TEXT PRIMARY KEY, -- v9_advanced_chunk_0000
start_combo INTEGER, -- 0 (not used, legacy)
end_combo INTEGER, -- 1000 (not used, legacy)
total_combos INTEGER, -- 1000 configs per chunk
status TEXT, -- 'pending', 'running', 'completed', 'failed'
assigned_worker TEXT, -- 'worker1', 'worker2', NULL
started_at INTEGER, -- Unix timestamp
completed_at INTEGER, -- Unix timestamp
created_at INTEGER DEFAULT (strftime('%s', 'now'))
)
```
**v9_advanced_strategies table:**
```sql
CREATE TABLE v9_advanced_strategies (
id INTEGER PRIMARY KEY AUTOINCREMENT,
chunk_id TEXT NOT NULL, -- FK to v9_advanced_chunks
params TEXT NOT NULL, -- JSON of 18 parameters
pnl REAL NOT NULL, -- Total P&L
win_rate REAL NOT NULL, -- % winners
profit_factor REAL NOT NULL, -- (Not yet implemented)
max_drawdown REAL NOT NULL, -- Max DD %
total_trades INTEGER NOT NULL, -- Number of trades
created_at INTEGER DEFAULT (strftime('%s', 'now'))
)
```
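With two workers pulling from the same table, chunk claiming should be atomic. A minimal sketch against the schema above, using a guarded UPDATE inside a transaction (the actual `v9_advanced_coordinator.py` may assign chunks differently):

```python
import sqlite3
import time

def claim_chunk(db_path, worker):
    """Atomically claim one pending chunk for `worker`; return its id or None."""
    con = sqlite3.connect(db_path)
    try:
        with con:  # transaction: commits on success, rolls back on exception
            row = con.execute(
                "SELECT id FROM v9_advanced_chunks WHERE status='pending' "
                "ORDER BY id LIMIT 1").fetchone()
            if row is None:
                return None  # nothing left to claim
            cur = con.execute(
                "UPDATE v9_advanced_chunks SET status='running', "
                "assigned_worker=?, started_at=? WHERE id=? AND status='pending'",
                (worker, int(time.time()), row[0]))
            # the status guard means a concurrent claimer leaves rowcount == 0
            return row[0] if cur.rowcount == 1 else None
    finally:
        con.close()
```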
## Usage
### 1. Launch Sweep
```bash
cd /home/icke/traderv4/cluster
./run_v9_advanced_sweep.sh
```
**What it does:**
1. Generates 810K parameter configurations
2. Splits into ~810 chunks (1,000 configs each)
3. Creates SQLite database with chunk tracking
4. Launches coordinator in background
5. Coordinator assigns chunks to 2 EPYC workers
6. Workers process chunks in parallel
### 2. Monitor Progress
**Web Dashboard:**
```
http://localhost:3001/cluster
```
**Command Line:**
```bash
# Watch coordinator logs
tail -f coordinator_v9_advanced.log
# Check database status
sqlite3 exploration.db "
SELECT
status,
COUNT(*) as count,
ROUND(COUNT(*) * 100.0 / 810, 1) as percent
FROM v9_advanced_chunks
GROUP BY status
"
```
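The same status breakdown is easy to pull from Python, e.g. for feeding the dashboard (a sketch; the dashboard's actual data source is not shown in this document):

```python
import sqlite3

def chunk_status(db_path, total_chunks=810):
    """Return {status: (count, percent)}, mirroring the SQL query above."""
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT status, COUNT(*) FROM v9_advanced_chunks GROUP BY status"
    ).fetchall()
    con.close()
    return {s: (n, round(100.0 * n / total_chunks, 1)) for s, n in rows}
```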
### 3. Analyze Results
After completion, aggregate all results:
```bash
# Combine all CSV files
cd distributed_results
cat v9_advanced_chunk_*_results.csv | head -1 > all_v9_advanced_results.csv
tail -n +2 -q v9_advanced_chunk_*_results.csv >> all_v9_advanced_results.csv
# Top 100 performers (exclude the header row before sorting)
tail -n +2 all_v9_advanced_results.csv | sort -t, -k2 -rn | head -100 > top_100_v9_advanced.csv
```
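The same aggregation in stdlib Python, for environments where the shell pipeline is inconvenient (assumes every chunk CSV shares one header row):

```python
import csv
import glob

def aggregate_results(pattern):
    """Merge per-chunk result CSVs: keep one header, collect all data rows."""
    header, rows = None, []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader)  # skip each file's header
            if header is None:
                header = file_header    # keep the first one
            rows.extend(reader)
    return header, rows
```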
**Analysis queries:**
```python
import pandas as pd
df = pd.read_csv('all_v9_advanced_results.csv')
# Compare MA gap filter vs baseline
baseline = df[df['use_ma_gap'] == False]
filtered = df[df['use_ma_gap'] == True]
print(f"Baseline avg: ${baseline['profit'].mean():.2f}")
print(f"Filtered avg: ${filtered['profit'].mean():.2f}")
# Find optimal gap thresholds
for profile in ['minutes', 'hours', 'daily', 'weekly']:
profile_df = df[df['profile'] == profile]
best = profile_df.nlargest(10, 'profit')
print(f"\n{profile.upper()} - Top 10 gap thresholds:")
print(best[['ma_gap_min_long', 'ma_gap_min_short', 'profit', 'win_rate']])
```
## Expected Outcomes
### If MA Gap Filter Helps:
**Expected pattern:**
- Filtered configs outperform baseline
- Optimal gap thresholds cluster around certain values
- Direction-specific gaps emerge (LONGs need +gap, SHORTs need -gap)
**Action:**
- Update production v9 with optimal gap thresholds
- Deploy to live trading after forward testing
### If MA Gap Filter Hurts:
**Expected pattern:**
- Baseline (use_ma_gap=False) outperforms all filtered configs
- No clear threshold patterns emerge
- Performance degrades with stricter gaps
**Action:**
- Keep production v9 as-is (no MA gap filter)
- Document findings: MA divergence not predictive for v9 signals
### If Results Inconclusive:
**Expected pattern:**
- Filtered and baseline perform similarly
- High variance in gap threshold performance
**Action:**
- Keep baseline for simplicity (Occam's Razor)
- Consider gap as optional "turbo mode" for specific profiles
## Worker Infrastructure
Uses **existing EPYC cluster setup** (64 cores total):
**Worker 1 (bd-epyc-02):**
- Direct SSH: `root@10.10.254.106`
- Workspace: `/home/comprehensive_sweep`
- Python: `.venv` with pandas/numpy
- Cores: 32 threads
**Worker 2 (bd-host01):**
- SSH hop: via worker1
- Direct: `root@10.20.254.100`
- Workspace: `/home/backtest_dual/backtest`
- Python: `.venv` with pandas/numpy
- Cores: 32 threads
**Prerequisites (already met):**
- Python 3.11+ with pandas, numpy
- Virtual environments active
- SOLUSDT 5m OHLCV data (Nov 2024 - Nov 2025)
- SSH keys configured
## Timeline
**Estimate:** 70-80 hours total (810K configs ÷ 64 cores)
**Breakdown:**
- Config generation: ~5 minutes
- Chunk assignment: ~1 minute
- Parallel execution: ~70 hours (~810 chunks ÷ 2 workers ≈ 405 chunks each; at ~20 s/config with 32 threads per worker, ≈10 min/chunk)
- Result aggregation: ~10 minutes
**Monitoring intervals:**
- Check status: Every 30-60 minutes
- Full results available: After ~3 days
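The headline number can be sanity-checked in a few lines (the ~20 s/config figure is inferred from the 70-hour estimate, not measured):

```python
def estimate_runtime_hours(total_configs=810_000, configs_per_chunk=1_000,
                           workers=2, threads_per_worker=32, sec_per_config=20.0):
    """Rough wall-clock estimate for the sweep, ignoring startup overhead."""
    chunks = total_configs / configs_per_chunk            # ~810 chunks
    chunks_per_worker = chunks / workers                  # ~405 each
    # each chunk's 1,000 configs run across the worker's threads
    sec_per_chunk = configs_per_chunk * sec_per_config / threads_per_worker
    return chunks_per_worker * sec_per_chunk / 3600

# ~70 hours with the default assumptions
```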
## Lessons from Previous Sweeps
### v9 Baseline (65K configs, Nov 28-29):
- **Finding:** Parameter insensitivity observed
- **Implication:** Edge is in core logic, not specific parameter values
- **Action:** Explore new logic dimensions (MA gap) instead of tighter parameter grids
### v10 Removal (Nov 28):
- **Finding:** 72 configs produced identical results
- **Implication:** New logic must add real edge, not just complexity
- **Action:** MA gap filter is **observable market state** (not derived metric)
### Distributed Worker Bug (Dec 1):
- **Finding:** Dict passed instead of lambda function
- **Implication:** Type safety critical for 810K config sweep
- **Action:** Simplified v9_advanced_worker.py with explicit types
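A cheap guard against that class of bug is validating config types before dispatch. An illustrative sketch (hypothetical helper, checking only the MA gap keys):

```python
def validate_config(cfg):
    """Fail fast on malformed configs before they reach a 70-hour sweep."""
    if not isinstance(cfg, dict):
        raise TypeError(f"config must be a dict, got {type(cfg).__name__}")
    if not isinstance(cfg.get("use_ma_gap"), bool):
        raise TypeError("use_ma_gap must be a bool")
    for key in ("ma_gap_min_long", "ma_gap_min_short"):
        val = cfg.get(key)
        if val is not None and not isinstance(val, (int, float)):
            raise TypeError(f"{key} must be a number or None")
    return cfg
```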
## File Locations
**Master (srvdocker02):**
```
/home/icke/traderv4/cluster/
```
**Workers (EPYC servers):**
```
/home/comprehensive_sweep/ (worker1)
/home/backtest_dual/backtest/ (worker2)
```
**Results (master):**
```
/home/icke/traderv4/cluster/distributed_results/
```
## Git Commits
```bash
# Before launch
git add cluster/
git commit -m "feat: V9 advanced parameter sweep with MA gap filter (810K configs)
- Added MA gap filter exploration (3 dimensions, ~8× effective expansion)
- Created v9_advanced_worker.py for chunk processing
- Created v9_advanced_coordinator.py for work distribution
- Uses existing EPYC cluster infrastructure (64 cores)
- Expected runtime: 70-80 hours for 810K configurations
"
git push
```
## Post-Sweep Actions
1. **Aggregate results** into single CSV
2. **Compare MA gap filter** vs baseline performance
3. **Identify optimal thresholds** per profile and direction
4. **Update production v9** if MA gap filter shows consistent edge
5. **Forward test** for 50-100 trades before live deployment
6. **Document findings** in INDICATOR_V9_MA_GAP_ROADMAP.md
## Support
**Questions during sweep:**
- Check `coordinator_v9_advanced.log` for coordinator status
- Check worker logs via SSH: `ssh worker1 tail -f /home/comprehensive_sweep/worker.log`
- Database queries: `sqlite3 exploration.db "SELECT ..."`
**If sweep stalls:**
1. Check coordinator process: `ps aux | grep v9_advanced_coordinator`
2. Check worker processes: SSH to workers, `ps aux | grep python`
3. Reset failed chunks: `UPDATE v9_advanced_chunks SET status='pending' WHERE status='failed'`
4. Restart coordinator: `./run_v9_advanced_sweep.sh`