# V9 Advanced Parameter Sweep - 810K Configurations
**Status:** Ready to launch (Dec 1, 2025)
**Total Configs:** ~810,000 (18-parameter grid with MA gap filter)
**Expected Runtime:** 70-80 hours on 2 EPYC servers
**Enhancement:** Added MA gap filter exploration (8× expansion from 101K)
## Architecture
### Parameter Space (18 dimensions)
Builds on existing v9 grid but adds **MA gap filter** parameters:
**Original 15 parameters (101K configs):**
- Time profiles: minutes, hours, daily, weekly (4 profiles)
- ATR periods: profile-specific (3-4 values each)
- ATR multipliers: profile-specific (3-4 values each)
- RSI boundaries: long_min/max, short_min/max (3×4 values)
- Volume max: 3.0, 3.5, 4.0
- Entry buffer: 0.15, 0.20, 0.25
- ADX length: 14, 16, 18
**NEW: MA Gap Filter (3 dimensions, 2 × 3 × 3 grid):**
- `use_ma_gap`: True/False (2 values)
- `ma_gap_min_long`: -5.0%, 0%, +5.0% (3 values)
- `ma_gap_min_short`: -5.0%, 0%, +5.0% (3 values)
**Total:** **~810,000 configurations** (the 101K baseline expanded roughly 8× by the MA gap grid; sketch below)
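For illustration, a minimal sketch of how the gap grid expands a single baseline configuration. The actual generator lives in `run_v9_advanced_sweep.sh`; the helper below is assumed for illustration, and the real deduplication rules may differ:
```python
from itertools import product

# Minimal sketch (assumed helper, not the real generator): expand one baseline
# v9 config with the MA gap grid. Gap thresholds only matter when the filter is
# enabled, so the disabled case collapses to a single variant.
GAP_THRESHOLDS = [-5.0, 0.0, 5.0]   # percent, applied to long and short independently

def expand_with_ma_gap(base_config: dict) -> list[dict]:
    variants = [{**base_config, "use_ma_gap": False,
                 "ma_gap_min_long": None, "ma_gap_min_short": None}]
    for gap_long, gap_short in product(GAP_THRESHOLDS, GAP_THRESHOLDS):
        variants.append({**base_config, "use_ma_gap": True,
                         "ma_gap_min_long": gap_long,
                         "ma_gap_min_short": gap_short})
    return variants  # 1 disabled + 9 enabled variants per baseline config
```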
### What is MA Gap Filter?
**Purpose:** Filter entries based on MA50-MA200 convergence/divergence
**Long Entry Logic:**
```python
if use_ma_gap and ma_gap_min_long is not None:
    ma_gap_percent = (ma50 - ma200) / ma200 * 100
    if ma_gap_percent < ma_gap_min_long:
        long_entry_allowed = False  # gap too bearish for a long: block entry
```
**Short Entry Logic:**
```python
if use_ma_gap and ma_gap_min_short is not None:
    ma_gap_percent = (ma50 - ma200) / ma200 * 100
    if ma_gap_percent > ma_gap_min_short:
        short_entry_allowed = False  # gap too bullish for a short: block entry
```
**Hypothesis:**
- **LONG at MA crossover:** Require ma_gap ≥ 0% (bullish or neutral)
- **SHORT at MA crossover:** Require ma_gap ≤ 0% (bearish or neutral)
- **Avoid whipsaws:** Block entries when MAs are too diverged
**Parameter Exploration** (worked example after the list):
- `-5.0%`: Allows 5% adverse gap (catching reversals)
- `0%`: Requires neutral or favorable gap
- `+5.0%`: Requires 5% favorable gap (strong trend only)
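A quick worked example with made-up prices shows how the three thresholds behave in practice:
```python
# Worked example with made-up values: MA50 = 105.0, MA200 = 100.0
ma50, ma200 = 105.0, 100.0
ma_gap_percent = (ma50 - ma200) / ma200 * 100   # = +5.0%

# Long entries (blocked when gap < ma_gap_min_long):
#   ma_gap_min_long = -5.0  -> allowed
#   ma_gap_min_long =  0.0  -> allowed
#   ma_gap_min_long = +5.0  -> allowed (right at the boundary)
# Short entries (blocked when gap > ma_gap_min_short):
#   ma_gap_min_short = -5.0 -> blocked
#   ma_gap_min_short =  0.0 -> blocked
#   ma_gap_min_short = +5.0 -> allowed (gap not above the threshold)
```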
**Expected Findings:**
1. **Optimal gap thresholds** for each profile (minutes vs daily may differ)
2. **Direction-specific gaps** (LONGs may need different threshold than SHORTs)
3. **Performance comparison** (use_ma_gap=True vs False baseline)
### Why This Matters
**User Context (Nov 27 analysis):**
- v9 has strong baseline edge ($405.88 on 1-year data)
- But parameter insensitivity suggests edge is in **logic**, not tuning
- MA gap filter adds **new logic dimension** (not just parameter tuning)
- Could filter false breakouts that occur during MA whipsaw zones
**Real-world validation needed:**
- Some MAs converging = good entries (trend formation)
- Some MAs diverged = good entries (strong trend continuation)
- Optimal gap threshold is data-driven discovery goal
## Implementation
### Files
```
cluster/
├── money_line_v9.py              # v9 indicator (copied from backtester/indicators/)
├── v9_advanced_worker.py         # Worker script (processes 1 chunk)
├── v9_advanced_coordinator.py    # Coordinator (assigns chunks to workers)
├── run_v9_advanced_sweep.sh      # Startup script (generates configs + launches)
├── chunks/                       # Generated parameter configurations
│   ├── v9_advanced_chunk_0000.json   (1,000 configs)
│   ├── v9_advanced_chunk_0001.json   (1,000 configs)
│   └── ...                           (~810 chunk files)
├── exploration.db                # SQLite database (chunk tracking)
└── distributed_results/          # CSV outputs from workers
    ├── v9_advanced_chunk_0000_results.csv
    └── ...
```
### Database Schema
**v9_advanced_chunks table:**
```sql
CREATE TABLE v9_advanced_chunks (
    id TEXT PRIMARY KEY,              -- v9_advanced_chunk_0000
    start_combo INTEGER,              -- 0 (not used, legacy)
    end_combo INTEGER,                -- 1000 (not used, legacy)
    total_combos INTEGER,             -- 1000 configs per chunk
    status TEXT,                      -- 'pending', 'running', 'completed', 'failed'
    assigned_worker TEXT,             -- 'worker1', 'worker2', NULL
    started_at INTEGER,               -- Unix timestamp
    completed_at INTEGER,             -- Unix timestamp
    created_at INTEGER DEFAULT (strftime('%s', 'now'))
)
```
**v9_advanced_strategies table:**
```sql
CREATE TABLE v9_advanced_strategies (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    chunk_id TEXT NOT NULL,           -- FK to v9_advanced_chunks
    params TEXT NOT NULL,             -- JSON of 18 parameters
    pnl REAL NOT NULL,                -- Total P&L
    win_rate REAL NOT NULL,           -- % winners
    profit_factor REAL NOT NULL,      -- (Not yet implemented)
    max_drawdown REAL NOT NULL,       -- Max DD %
    total_trades INTEGER NOT NULL,    -- Number of trades
    created_at INTEGER DEFAULT (strftime('%s', 'now'))
)
```
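For ad-hoc inspection while the sweep runs, the results table can be queried directly. A minimal sketch (column names follow the schema above; the output formatting is illustrative):
```python
import json
import sqlite3

# Sketch: pull the current top performers from the results table.
conn = sqlite3.connect("exploration.db")
rows = conn.execute(
    """
    SELECT chunk_id, pnl, win_rate, max_drawdown, total_trades, params
    FROM v9_advanced_strategies
    ORDER BY pnl DESC
    LIMIT 20
    """
).fetchall()
for chunk_id, pnl, win_rate, max_dd, trades, params in rows:
    p = json.loads(params)  # params column stores the 18-parameter JSON
    print(f"{chunk_id}: pnl={pnl:.2f} wr={win_rate:.1f}% dd={max_dd:.1f}% "
          f"trades={trades} use_ma_gap={p.get('use_ma_gap')}")
conn.close()
```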
## Usage
### 1. Launch Sweep
```bash
cd /home/icke/traderv4/cluster
./run_v9_advanced_sweep.sh
```
**What it does:**
1. Generates 810K parameter configurations
2. Splits into ~810 chunks (1,000 configs each)
3. Creates SQLite database with chunk tracking
4. Launches coordinator in background
5. Coordinator assigns chunks to 2 EPYC workers
6. Workers process chunks in parallel (chunk lifecycle sketched below)
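The chunk lifecycle behind steps 5-6 boils down to a couple of guarded status transitions. A rough sketch against the schema above, not the actual `v9_advanced_coordinator.py` code:
```python
import sqlite3
import time

def claim_next_chunk(conn: sqlite3.Connection, worker: str):
    """Claim one pending chunk for the given worker.

    Sketch only: real code would need an immediate transaction (or similar)
    to stay atomic with two workers polling the same database.
    """
    with conn:
        row = conn.execute(
            "SELECT id FROM v9_advanced_chunks WHERE status='pending' LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE v9_advanced_chunks "
            "SET status='running', assigned_worker=?, started_at=? WHERE id=?",
            (worker, int(time.time()), row[0]),
        )
        return row[0]

def mark_completed(conn: sqlite3.Connection, chunk_id: str) -> None:
    with conn:
        conn.execute(
            "UPDATE v9_advanced_chunks SET status='completed', completed_at=? WHERE id=?",
            (int(time.time()), chunk_id),
        )
```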
### 2. Monitor Progress
**Web Dashboard:**
```
http://localhost:3001/cluster
```
**Command Line:**
```bash
# Watch coordinator logs
tail -f coordinator_v9_advanced.log
# Check database status
sqlite3 exploration.db "
  SELECT
    status,
    COUNT(*) AS count,
    ROUND(COUNT(*) * 100.0 / 810, 1) AS percent
  FROM v9_advanced_chunks
  GROUP BY status
"
```
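Completed-chunk timestamps also allow a rough ETA estimate. A sketch assuming chunks finish at a roughly steady rate:
```python
import sqlite3

conn = sqlite3.connect("exploration.db")
done, remaining = conn.execute(
    "SELECT "
    "  SUM(CASE WHEN status='completed' THEN 1 ELSE 0 END), "
    "  SUM(CASE WHEN status IN ('pending','running') THEN 1 ELSE 0 END) "
    "FROM v9_advanced_chunks"
).fetchone()
first_start, last_finish = conn.execute(
    "SELECT MIN(started_at), MAX(completed_at) "
    "FROM v9_advanced_chunks WHERE status='completed'"
).fetchone()
conn.close()

if done and first_start and last_finish and last_finish > first_start:
    rate = done / (last_finish - first_start)        # chunks per second, averaged
    eta_hours = (remaining or 0) / rate / 3600
    print(f"{done} chunks done, ~{eta_hours:.1f} h remaining at current rate")
```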
### 3. Analyze Results
After completion, aggregate all results:
```bash
# Combine all CSV files
cd distributed_results
cat v9_advanced_chunk_*_results.csv | head -1 > all_v9_advanced_results.csv
tail -n +2 -q v9_advanced_chunk_*_results.csv >> all_v9_advanced_results.csv
# Top 100 performers (skip the header row before sorting; column 2 assumed to be P&L)
head -1 all_v9_advanced_results.csv > top_100_v9_advanced.csv
tail -n +2 all_v9_advanced_results.csv | sort -t, -k2 -rn | head -100 >> top_100_v9_advanced.csv
```
**Analysis queries:**
```python
import pandas as pd
df = pd.read_csv('all_v9_advanced_results.csv')
# Compare MA gap filter vs baseline
baseline = df[df['use_ma_gap'] == False]
filtered = df[df['use_ma_gap'] == True]
print(f"Baseline avg: ${baseline['profit'].mean():.2f}")
print(f"Filtered avg: ${filtered['profit'].mean():.2f}")
# Find optimal gap thresholds
for profile in ['minutes', 'hours', 'daily', 'weekly']:
    profile_df = df[df['profile'] == profile]
    best = profile_df.nlargest(10, 'profit')
    print(f"\n{profile.upper()} - Top 10 gap thresholds:")
    print(best[['ma_gap_min_long', 'ma_gap_min_short', 'profit', 'win_rate']])
```
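To probe the direction-specific hypothesis directly, a pivot of mean profit over the two gap thresholds is a quick follow-up (same assumed column names as in the snippet above):
```python
# Mean profit for each (long, short) gap-threshold pair, filtered configs only
gap_pivot = (
    df[df['use_ma_gap'] == True]
    .pivot_table(index='ma_gap_min_long',
                 columns='ma_gap_min_short',
                 values='profit',
                 aggfunc='mean')
)
print(gap_pivot.round(2))
```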
## Expected Outcomes
### If MA Gap Filter Helps:
**Expected pattern:**
- Filtered configs outperform baseline
- Optimal gap thresholds cluster around certain values
- Direction-specific gaps emerge (LONGs need +gap, SHORTs need -gap)
**Action:**
- Update production v9 with optimal gap thresholds
- Deploy to live trading after forward testing
### If MA Gap Filter Hurts:
**Expected pattern:**
- Baseline (use_ma_gap=False) outperforms all filtered configs
- No clear threshold patterns emerge
- Performance degrades with stricter gaps
**Action:**
- Keep production v9 as-is (no MA gap filter)
- Document findings: MA divergence not predictive for v9 signals
### If Results Inconclusive:
**Expected pattern:**
- Filtered and baseline perform similarly
- High variance in gap threshold performance
**Action:**
- Keep baseline for simplicity (Occam's Razor)
- Consider gap as optional "turbo mode" for specific profiles
## Worker Infrastructure
Uses **existing EPYC cluster setup** (64 cores total):
**Worker 1 (bd-epyc-02):**
- Direct SSH: `root@10.10.254.106`
- Workspace: `/home/comprehensive_sweep`
- Python: `.venv` with pandas/numpy
- Cores: 32 threads
**Worker 2 (bd-host01):**
- SSH hop: via worker1
- Direct: `root@10.20.254.100`
- Workspace: `/home/backtest_dual/backtest`
- Python: `.venv` with pandas/numpy
- Cores: 32 threads
**Prerequisites (already met):**
- Python 3.11+ with pandas, numpy
- Virtual environments active
- SOLUSDT 5m OHLCV data (Nov 2024 - Nov 2025)
- SSH keys configured
## Timeline
**Estimate:** 70-80 hours total (810K configs ÷ 64 cores)
**Breakdown:**
- Config generation: ~5 minutes
- Chunk assignment: ~1 minute
- Parallel execution: ~70 hours (810 chunks of 1,000 configs each, ~3.1 s/config, split across 2 workers)
- Result aggregation: ~10 minutes
**Monitoring intervals:**
- Check status: Every 30-60 minutes
- Full results available: After ~3 days
## Lessons from Previous Sweeps
### v9 Baseline (65K configs, Nov 28-29):
- **Finding:** Parameter insensitivity observed
- **Implication:** Edge is in core logic, not specific parameter values
- **Action:** Explore new logic dimensions (MA gap) instead of tighter parameter grids
### v10 Removal (Nov 28):
- **Finding:** 72 configs produced identical results
- **Implication:** New logic must add real edge, not just complexity
- **Action:** MA gap filter is **observable market state** (not derived metric)
### Distributed Worker Bug (Dec 1):
- **Finding:** Dict passed instead of lambda function
- **Implication:** Type safety critical for 810K config sweep
- **Action:** Simplified v9_advanced_worker.py with explicit types
## File Locations
**Master (srvdocker02):**
```
/home/icke/traderv4/cluster/
```
**Workers (EPYC servers):**
```
/home/comprehensive_sweep/ (worker1)
/home/backtest_dual/backtest/ (worker2)
```
**Results (master):**
```
/home/icke/traderv4/cluster/distributed_results/
```
## Git Commits
```bash
# Before launch
git add cluster/
git commit -m "feat: V9 advanced parameter sweep with MA gap filter (810K configs)
- Added MA gap filter exploration (3 dimensions = 18× expansion)
- Created v9_advanced_worker.py for chunk processing
- Created v9_advanced_coordinator.py for work distribution
- Uses existing EPYC cluster infrastructure (64 cores)
- Expected runtime: 70-80 hours for 810K configurations
"
git push
```
## Post-Sweep Actions
1. **Aggregate results** into single CSV
2. **Compare MA gap filter** vs baseline performance
3. **Identify optimal thresholds** per profile and direction
4. **Update production v9** if MA gap filter shows consistent edge
5. **Forward test** for 50-100 trades before live deployment
6. **Document findings** in INDICATOR_V9_MA_GAP_ROADMAP.md
## Support
**Questions during sweep:**
- Check `coordinator_v9_advanced.log` for coordinator status
- Check worker logs via SSH: `ssh worker1 tail -f /home/comprehensive_sweep/worker.log`
- Database queries: `sqlite3 exploration.db "SELECT ..."`
**If sweep stalls:**
1. Check coordinator process: `ps aux | grep v9_advanced_coordinator`
2. Check worker processes: SSH to workers, `ps aux | grep python`
3. Reset failed chunks: `sqlite3 exploration.db "UPDATE v9_advanced_chunks SET status='pending' WHERE status='failed'"`
4. Restart coordinator: `./run_v9_advanced_sweep.sh`