feat: V9 advanced parameter sweep with MA gap filter (810K configs)

Parameter space expansion:
- Original 15 params: 101K configurations
- NEW: MA gap filter (3 dimensions) = ~8× effective expansion
- Total: ~810,000 configurations across 4 time profiles
- Chunk size: 1,000 configs/chunk = ~810 chunks

MA Gap Filter parameters:
- use_ma_gap: True/False (2 values)
- ma_gap_min_long: -5.0%, 0%, +5.0% (3 values)
- ma_gap_min_short: -5.0%, 0%, +5.0% (3 values)

Implementation:
- money_line_v9.py: Full v9 indicator with MA gap logic
- v9_advanced_worker.py: Chunk processor (1,000 configs)
- v9_advanced_coordinator.py: Work distributor (2 EPYC workers)
- run_v9_advanced_sweep.sh: Startup script (generates + launches)

Infrastructure:
- Uses existing EPYC cluster (64 cores total)
- Worker1: bd-epyc-02 (32 threads)
- Worker2: bd-host01 (32 threads via SSH hop)
- Expected runtime: 70-80 hours
- Database: SQLite (chunk tracking + results)

Goal: Find optimal MA gap thresholds for filtering false breakouts
during MA whipsaw zones while preserving trend entries.
Author: mindesbunister
Date: 2025-12-01 18:11:47 +01:00
Parent: 2993bc8895
Commit: 7e1fe1cc30
9 changed files with 2541 additions and 0 deletions

# V9 Advanced Parameter Sweep - 810K Configurations
**Status:** Ready to launch (Dec 1, 2025)
**Total Configs:** ~810,000 (18-parameter grid with MA gap filter)
**Expected Runtime:** 70-80 hours on 2 EPYC servers
**Enhancement:** Added MA gap filter exploration (8× expansion from 101K)
## Architecture
### Parameter Space (18 dimensions)
Builds on existing v9 grid but adds **MA gap filter** parameters:
**Original 15 parameters (101K configs):**
- Time profiles: minutes, hours, daily, weekly (4 profiles)
- ATR periods: profile-specific (3-4 values each)
- ATR multipliers: profile-specific (3-4 values each)
- RSI boundaries: long_min/max, short_min/max (3×4 values)
- Volume max: 3.0, 3.5, 4.0
- Entry buffer: 0.15, 0.20, 0.25
- ADX length: 14, 16, 18
**NEW: MA Gap Filter (3 new dimensions):**
- `use_ma_gap`: True/False (2 values)
- `ma_gap_min_long`: -5.0%, 0%, +5.0% (3 values)
- `ma_gap_min_short`: -5.0%, 0%, +5.0% (3 values)
**Total:** ~810,000 configurations (≈8× the 101K baseline after pruning redundant combinations; the naive 2 × 3 × 3 product would be 18×, i.e. ~1.8M)
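One plausible way a generator avoids the naive 18× blow-up is to emit a single config when `use_ma_gap=False`, since the thresholds are then irrelevant. The sketch below is hypothetical (not the actual `run_v9_advanced_sweep.sh` logic) and yields a 10× expansion per baseline config; the real generator evidently prunes further to land near ~8×:

```python
from itertools import product

GAP_VALUES = [-5.0, 0.0, 5.0]  # candidate thresholds from the grid above

def expand_with_ma_gap(baseline_configs):
    """Expand baseline configs with the MA gap dimensions, skipping
    redundant threshold combos when use_ma_gap is False (hypothetical sketch)."""
    expanded = []
    for cfg in baseline_configs:
        # use_ma_gap=False: thresholds are ignored, so one config suffices
        expanded.append({**cfg, "use_ma_gap": False,
                        "ma_gap_min_long": None, "ma_gap_min_short": None})
        # use_ma_gap=True: full 3 x 3 threshold grid
        for lo, sh in product(GAP_VALUES, GAP_VALUES):
            expanded.append({**cfg, "use_ma_gap": True,
                            "ma_gap_min_long": lo, "ma_gap_min_short": sh})
    return expanded

# Two toy baseline configs expand to 2 x (1 + 9) = 20, not 2 x 18 = 36
toy = expand_with_ma_gap([{"atr_period": 14}, {"atr_period": 21}])
```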
### What is MA Gap Filter?
**Purpose:** Filter entries based on MA50-MA200 convergence/divergence
**Long Entry Logic:**
```python
if use_ma_gap and ma_gap_min_long is not None:
    ma_gap_percent = (ma50 - ma200) / ma200 * 100
    if ma_gap_percent < ma_gap_min_long:
        allow_entry = False  # block: MA50 too far below MA200 for a long
```
**Short Entry Logic:**
```python
if use_ma_gap and ma_gap_min_short is not None:
    ma_gap_percent = (ma50 - ma200) / ma200 * 100
    if ma_gap_percent > ma_gap_min_short:
        allow_entry = False  # block: MA50 too far above MA200 for a short
```
**Hypothesis:**
- **LONG at MA crossover:** Require ma_gap ≥ 0% (bullish or neutral)
- **SHORT at MA crossover:** Require ma_gap ≤ 0% (bearish or neutral)
- **Avoid whipsaws:** Block entries when MAs are too diverged
**Parameter Exploration:**
- `-5.0%`: Allows 5% adverse gap (catching reversals)
- `0%`: Requires neutral or favorable gap
- `+5.0%`: Requires 5% favorable gap (strong trend only)
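Combined into a single helper, the long and short rules read as follows (hypothetical function name and signature; `money_line_v9.py` may structure this differently):

```python
def passes_ma_gap(side, ma50, ma200, use_ma_gap=True,
                  ma_gap_min_long=0.0, ma_gap_min_short=0.0):
    """Return True if the MA50-MA200 gap permits an entry on `side`."""
    if not use_ma_gap:
        return True
    gap = (ma50 - ma200) / ma200 * 100  # gap as a percentage of MA200
    if side == "long":
        # longs blocked when the gap is below the long threshold
        return ma_gap_min_long is None or gap >= ma_gap_min_long
    if side == "short":
        # shorts blocked when the gap is above the short threshold
        return ma_gap_min_short is None or gap <= ma_gap_min_short
    raise ValueError(f"unknown side: {side}")
```

With MA50=105 and MA200=100 (gap = +5%), a long passes a 0% threshold while a short does not, matching the hypothesis above.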
**Expected Findings:**
1. **Optimal gap thresholds** for each profile (minutes vs daily may differ)
2. **Direction-specific gaps** (LONGs may need different threshold than SHORTs)
3. **Performance comparison** (use_ma_gap=True vs False baseline)
### Why This Matters
**User Context (Nov 27 analysis):**
- v9 has strong baseline edge ($405.88 on 1-year data)
- But parameter insensitivity suggests edge is in **logic**, not tuning
- MA gap filter adds **new logic dimension** (not just parameter tuning)
- Could filter false breakouts that occur during MA whipsaw zones
**Real-world validation needed:**
- Converging MAs can mark good entries (trend formation)
- Widely diverged MAs can also mark good entries (strong trend continuation)
- The optimal gap threshold is therefore a data-driven discovery goal, not something to pick a priori
## Implementation
### Files
```
cluster/
├── money_line_v9.py # v9 indicator (copied from backtester/indicators/)
├── v9_advanced_worker.py # Worker script (processes 1 chunk)
├── v9_advanced_coordinator.py # Coordinator (assigns chunks to workers)
├── run_v9_advanced_sweep.sh # Startup script (generates configs + launches)
├── chunks/ # Generated parameter configurations
│ ├── v9_advanced_chunk_0000.json (1,000 configs)
│ ├── v9_advanced_chunk_0001.json (1,000 configs)
│ └── ... (~810 chunk files)
├── exploration.db # SQLite database (chunk tracking)
└── distributed_results/ # CSV outputs from workers
├── v9_advanced_chunk_0000_results.csv
└── ...
```
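The chunk-splitting step itself is straightforward; a minimal sketch of producing the numbered JSON files shown above (assuming each chunk file holds a JSON array of parameter dicts — the actual file format is not shown in this document):

```python
import json
import os

def write_chunks(configs, out_dir, size=1000, prefix="v9_advanced_chunk"):
    """Split a config list into numbered JSON chunk files of `size` configs each."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i in range(0, len(configs), size):
        # zero-padded index matches names like v9_advanced_chunk_0000.json
        path = os.path.join(out_dir, f"{prefix}_{i // size:04d}.json")
        with open(path, "w") as f:
            json.dump(configs[i:i + size], f)
        paths.append(path)
    return paths
```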
### Database Schema
**v9_advanced_chunks table:**
```sql
CREATE TABLE v9_advanced_chunks (
id TEXT PRIMARY KEY, -- v9_advanced_chunk_0000
start_combo INTEGER, -- 0 (not used, legacy)
end_combo INTEGER, -- 1000 (not used, legacy)
total_combos INTEGER, -- 1000 configs per chunk
status TEXT, -- 'pending', 'running', 'completed', 'failed'
assigned_worker TEXT, -- 'worker1', 'worker2', NULL
started_at INTEGER, -- Unix timestamp
completed_at INTEGER, -- Unix timestamp
created_at INTEGER DEFAULT (strftime('%s', 'now'))
)
```
**v9_advanced_strategies table:**
```sql
CREATE TABLE v9_advanced_strategies (
id INTEGER PRIMARY KEY AUTOINCREMENT,
chunk_id TEXT NOT NULL, -- FK to v9_advanced_chunks
params TEXT NOT NULL, -- JSON of 18 parameters
pnl REAL NOT NULL, -- Total P&L
win_rate REAL NOT NULL, -- % winners
profit_factor REAL NOT NULL, -- (Not yet implemented)
max_drawdown REAL NOT NULL, -- Max DD %
total_trades INTEGER NOT NULL, -- Number of trades
created_at INTEGER DEFAULT (strftime('%s', 'now'))
)
```
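With two workers pulling from the same table, chunk claiming should be atomic. A minimal sketch against the schema above, using a guarded UPDATE inside a transaction (the actual `v9_advanced_coordinator.py` may assign chunks differently):

```python
import sqlite3
import time

def claim_chunk(db_path, worker):
    """Atomically claim one pending chunk for `worker`; return its id or None."""
    con = sqlite3.connect(db_path)
    try:
        with con:  # transaction: commits on success, rolls back on exception
            row = con.execute(
                "SELECT id FROM v9_advanced_chunks WHERE status='pending' "
                "ORDER BY id LIMIT 1").fetchone()
            if row is None:
                return None  # nothing left to claim
            cur = con.execute(
                "UPDATE v9_advanced_chunks SET status='running', "
                "assigned_worker=?, started_at=? WHERE id=? AND status='pending'",
                (worker, int(time.time()), row[0]))
            # the status guard means a concurrent claimer leaves rowcount == 0
            return row[0] if cur.rowcount == 1 else None
    finally:
        con.close()
```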
## Usage
### 1. Launch Sweep
```bash
cd /home/icke/traderv4/cluster
./run_v9_advanced_sweep.sh
```
**What it does:**
1. Generates 810K parameter configurations
2. Splits into ~810 chunks (1,000 configs each)
3. Creates SQLite database with chunk tracking
4. Launches coordinator in background
5. Coordinator assigns chunks to 2 EPYC workers
6. Workers process chunks in parallel
### 2. Monitor Progress
**Web Dashboard:**
```
http://localhost:3001/cluster
```
**Command Line:**
```bash
# Watch coordinator logs
tail -f coordinator_v9_advanced.log
# Check database status
sqlite3 exploration.db "
SELECT
status,
COUNT(*) as count,
ROUND(COUNT(*) * 100.0 / 810, 1) as percent
FROM v9_advanced_chunks
GROUP BY status
"
```
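The same status breakdown is easy to pull from Python, e.g. for feeding the dashboard (a sketch; the dashboard's actual data source is not shown in this document):

```python
import sqlite3

def chunk_status(db_path, total_chunks=810):
    """Return {status: (count, percent)}, mirroring the SQL query above."""
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT status, COUNT(*) FROM v9_advanced_chunks GROUP BY status"
    ).fetchall()
    con.close()
    return {s: (n, round(100.0 * n / total_chunks, 1)) for s, n in rows}
```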
### 3. Analyze Results
After completion, aggregate all results:
```bash
# Combine all CSV files
cd distributed_results
cat v9_advanced_chunk_*_results.csv | head -1 > all_v9_advanced_results.csv
tail -n +2 -q v9_advanced_chunk_*_results.csv >> all_v9_advanced_results.csv
# Top 100 performers (exclude the header row before sorting)
tail -n +2 all_v9_advanced_results.csv | sort -t, -k2 -rn | head -100 > top_100_v9_advanced.csv
```
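The same aggregation in stdlib Python, for environments where the shell pipeline is inconvenient (assumes every chunk CSV shares one header row):

```python
import csv
import glob

def aggregate_results(pattern):
    """Merge per-chunk result CSVs: keep one header, collect all data rows."""
    header, rows = None, []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader)  # skip each file's header
            if header is None:
                header = file_header    # keep the first one
            rows.extend(reader)
    return header, rows
```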
**Analysis queries:**
```python
import pandas as pd
df = pd.read_csv('all_v9_advanced_results.csv')
# Compare MA gap filter vs baseline
baseline = df[df['use_ma_gap'] == False]
filtered = df[df['use_ma_gap'] == True]
print(f"Baseline avg: ${baseline['profit'].mean():.2f}")
print(f"Filtered avg: ${filtered['profit'].mean():.2f}")
# Find optimal gap thresholds
for profile in ['minutes', 'hours', 'daily', 'weekly']:
profile_df = df[df['profile'] == profile]
best = profile_df.nlargest(10, 'profit')
print(f"\n{profile.upper()} - Top 10 gap thresholds:")
print(best[['ma_gap_min_long', 'ma_gap_min_short', 'profit', 'win_rate']])
```
## Expected Outcomes
### If MA Gap Filter Helps:
**Expected pattern:**
- Filtered configs outperform baseline
- Optimal gap thresholds cluster around certain values
- Direction-specific gaps emerge (LONGs need +gap, SHORTs need -gap)
**Action:**
- Update production v9 with optimal gap thresholds
- Deploy to live trading after forward testing
### If MA Gap Filter Hurts:
**Expected pattern:**
- Baseline (use_ma_gap=False) outperforms all filtered configs
- No clear threshold patterns emerge
- Performance degrades with stricter gaps
**Action:**
- Keep production v9 as-is (no MA gap filter)
- Document findings: MA divergence not predictive for v9 signals
### If Results Inconclusive:
**Expected pattern:**
- Filtered and baseline perform similarly
- High variance in gap threshold performance
**Action:**
- Keep baseline for simplicity (Occam's Razor)
- Consider gap as optional "turbo mode" for specific profiles
## Worker Infrastructure
Uses **existing EPYC cluster setup** (64 cores total):
**Worker 1 (bd-epyc-02):**
- Direct SSH: `root@10.10.254.106`
- Workspace: `/home/comprehensive_sweep`
- Python: `.venv` with pandas/numpy
- Cores: 32 threads
**Worker 2 (bd-host01):**
- SSH hop: via worker1
- Direct: `root@10.20.254.100`
- Workspace: `/home/backtest_dual/backtest`
- Python: `.venv` with pandas/numpy
- Cores: 32 threads
**Prerequisites (already met):**
- Python 3.11+ with pandas, numpy
- Virtual environments active
- SOLUSDT 5m OHLCV data (Nov 2024 - Nov 2025)
- SSH keys configured
## Timeline
**Estimate:** 70-80 hours total (810K configs ÷ 64 cores)
**Breakdown:**
- Config generation: ~5 minutes
- Chunk assignment: ~1 minute
- Parallel execution: ~70 hours (~810 chunks ÷ 2 workers ≈ 405 chunks each; at ~20 s/config with 32 threads per worker, ≈10 min/chunk)
- Result aggregation: ~10 minutes
**Monitoring intervals:**
- Check status: Every 30-60 minutes
- Full results available: After ~3 days
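The headline number can be sanity-checked in a few lines (the ~20 s/config figure is inferred from the 70-hour estimate, not measured):

```python
def estimate_runtime_hours(total_configs=810_000, configs_per_chunk=1_000,
                           workers=2, threads_per_worker=32, sec_per_config=20.0):
    """Rough wall-clock estimate for the sweep, ignoring startup overhead."""
    chunks = total_configs / configs_per_chunk            # ~810 chunks
    chunks_per_worker = chunks / workers                  # ~405 each
    # each chunk's 1,000 configs run across the worker's threads
    sec_per_chunk = configs_per_chunk * sec_per_config / threads_per_worker
    return chunks_per_worker * sec_per_chunk / 3600

# ~70 hours with the default assumptions
```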
## Lessons from Previous Sweeps
### v9 Baseline (65K configs, Nov 28-29):
- **Finding:** Parameter insensitivity observed
- **Implication:** Edge is in core logic, not specific parameter values
- **Action:** Explore new logic dimensions (MA gap) instead of tighter parameter grids
### v10 Removal (Nov 28):
- **Finding:** 72 configs produced identical results
- **Implication:** New logic must add real edge, not just complexity
- **Action:** MA gap filter is **observable market state** (not derived metric)
### Distributed Worker Bug (Dec 1):
- **Finding:** Dict passed instead of lambda function
- **Implication:** Type safety critical for 810K config sweep
- **Action:** Simplified v9_advanced_worker.py with explicit types
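A cheap guard against that class of bug is validating config types before dispatch. An illustrative sketch (hypothetical helper, checking only the MA gap keys):

```python
def validate_config(cfg):
    """Fail fast on malformed configs before they reach a 70-hour sweep."""
    if not isinstance(cfg, dict):
        raise TypeError(f"config must be a dict, got {type(cfg).__name__}")
    if not isinstance(cfg.get("use_ma_gap"), bool):
        raise TypeError("use_ma_gap must be a bool")
    for key in ("ma_gap_min_long", "ma_gap_min_short"):
        val = cfg.get(key)
        if val is not None and not isinstance(val, (int, float)):
            raise TypeError(f"{key} must be a number or None")
    return cfg
```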
## File Locations
**Master (srvdocker02):**
```
/home/icke/traderv4/cluster/
```
**Workers (EPYC servers):**
```
/home/comprehensive_sweep/ (worker1)
/home/backtest_dual/backtest/ (worker2)
```
**Results (master):**
```
/home/icke/traderv4/cluster/distributed_results/
```
## Git Commits
```bash
# Before launch
git add cluster/
git commit -m "feat: V9 advanced parameter sweep with MA gap filter (810K configs)
- Added MA gap filter exploration (3 dimensions, ~8× effective expansion)
- Created v9_advanced_worker.py for chunk processing
- Created v9_advanced_coordinator.py for work distribution
- Uses existing EPYC cluster infrastructure (64 cores)
- Expected runtime: 70-80 hours for 810K configurations
"
git push
```
## Post-Sweep Actions
1. **Aggregate results** into single CSV
2. **Compare MA gap filter** vs baseline performance
3. **Identify optimal thresholds** per profile and direction
4. **Update production v9** if MA gap filter shows consistent edge
5. **Forward test** for 50-100 trades before live deployment
6. **Document findings** in INDICATOR_V9_MA_GAP_ROADMAP.md
## Support
**Questions during sweep:**
- Check `coordinator_v9_advanced.log` for coordinator status
- Check worker logs via SSH: `ssh worker1 tail -f /home/comprehensive_sweep/worker.log`
- Database queries: `sqlite3 exploration.db "SELECT ..."`
**If sweep stalls:**
1. Check coordinator process: `ps aux | grep v9_advanced_coordinator`
2. Check worker processes: SSH to workers, `ps aux | grep python`
3. Reset failed chunks: `UPDATE v9_advanced_chunks SET status='pending' WHERE status='failed'`
4. Restart coordinator: `./run_v9_advanced_sweep.sh`