docs: Add comprehensive v11 test sweep documentation and deployment script

Co-authored-by: mindesbunister <32161838+mindesbunister@users.noreply.github.com>
2025-12-06 19:18:37 +00:00
parent 4599afafaa
commit 73887ac4f3
2 changed files with 352 additions and 0 deletions
--- a/cluster/V11_TEST_SWEEP_README.md
+++ b/cluster/V11_TEST_SWEEP_README.md
@@ -0,0 +1,286 @@
+# V11 Test Parameter Sweep
+
+Fast validation sweep for v11 "Money Line All Filters" indicator before full 65,536-combination optimization.
+
+## Overview
+
+**Purpose:** Verify that the v11 indicator produces valid backtest data with varied P&L before committing to the 30-35 hour full sweep.
+
+**Test Size:** 256 combinations (8 parameters × 2 values each = 2^8)  
+**Expected Runtime:** 6-25 minutes  
+**Workers:** 2 × 27 cores (85% CPU limit)  
+**Output:** CSV files + SQLite database
+
+## Critical v11 Fix
+
+**v9 Bug:** Filters were calculated but NOT applied to signals (broken logic)  
+**v11 Fix:** ALL filters must pass for signal generation (lines 271-272 from pinescript)
+
+```python
+# v11: Filters actually applied
+if adx_ok and volume_ok and rsi_ok and pos_ok and entry_buffer_ok:
+    signals.append(...)  # Signal only fires when ALL conditions met
+```
+
+## Test Parameter Grid
+
+8 parameters × 2 values each = 256 combinations:
+
+| Parameter | Values | Purpose |
+|-----------|--------|---------|
+| `flip_threshold` | 0.5, 0.6 | % price must move to flip trend |
+| `adx_min` | 18, 21 | Minimum ADX for trend strength |
+| `long_pos_max` | 75, 80 | Max price position for longs (%) |
+| `short_pos_min` | 20, 25 | Min price position for shorts (%) |
+| `vol_min` | 0.8, 1.0 | Minimum volume ratio |
+| `entry_buffer_atr` | 0.15, 0.20 | ATR buffer beyond Money Line |
+| `rsi_long_min` | 35, 40 | RSI minimum for longs |
+| `rsi_short_max` | 65, 70 | RSI maximum for shorts |
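
As a rough illustration of how the table above expands to 256 combinations and two 128-combination chunks, here is a minimal Python sketch. The names `PARAM_GRID`, `build_combinations`, and `split_into_chunks` are hypothetical, and the enumeration order is an assumption; the actual coordinator may order and split combinations differently.

```python
import itertools

# Test grid copied from the table above: 8 parameters, 2 values each.
PARAM_GRID = {
    "flip_threshold": [0.5, 0.6],
    "adx_min": [18, 21],
    "long_pos_max": [75, 80],
    "short_pos_min": [20, 25],
    "vol_min": [0.8, 1.0],
    "entry_buffer_atr": [0.15, 0.20],
    "rsi_long_min": [35, 40],
    "rsi_short_max": [65, 70],
}


def build_combinations(grid):
    """Return every parameter combination as a list of dicts (hypothetical helper)."""
    names = list(grid)
    return [dict(zip(names, values)) for values in itertools.product(*grid.values())]


def split_into_chunks(combos, chunk_size):
    """Split the combination list into consecutive chunks of chunk_size items."""
    return [combos[i:i + chunk_size] for i in range(0, len(combos), chunk_size)]


combos = build_combinations(PARAM_GRID)
chunks = split_into_chunks(combos, 128)  # 128 per chunk, matching the two CSV files
print(len(combos), [len(chunk) for chunk in chunks])
```

Adding a third value to any one parameter would grow the grid to 384 combinations, which is why the test sweep keeps every parameter at two values.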
+
+## Worker Configuration
+
+### Worker 1 (pve-nu-monitor01)
+- **Host:** root@10.10.254.106
+- **Cores:** 27 (85% of 32 threads)
+- **Availability:** 24/7 no restrictions
+- **Workspace:** /home/comprehensive_sweep
+
+### Worker 2 (pve-srvmon01)
+- **Host:** root@10.20.254.100 (via Worker 1 SSH hop)
+- **Cores:** 27 (85% of 32 threads)
+- **Availability:** 
+  - **Weekdays (Mon-Fri):** 6 PM - 8 AM only (nights)
+  - **Weekends (Sat-Sun):** 24/7 at 85%
+  - **Office hours (Mon-Fri 8 AM - 6 PM):** DISABLED
+- **Workspace:** /home/backtest_dual/backtest
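
The Worker 2 schedule above can be expressed as a small predicate. This is a sketch only: `worker2_available` is a hypothetical helper, not taken from `v11_test_coordinator.py`. It assumes the coordinator's local clock, with 8:00 treated as the start of the blocked window and 18:00 as the first allowed minute.

```python
from datetime import datetime


def worker2_available(now: datetime) -> bool:
    """Return True when Worker 2 may run under the schedule described above."""
    # Saturday (5) and Sunday (6): available around the clock.
    if now.weekday() >= 5:
        return True
    # Weekdays: blocked from 08:00 (inclusive) to 18:00 (exclusive).
    return not (8 <= now.hour < 18)


# Friday 17:59 is still office hours; Friday 18:00 is the suggested start time.
print(worker2_available(datetime(2025, 12, 5, 17, 59)))
print(worker2_available(datetime(2025, 12, 5, 18, 0)))
```

A coordinator could evaluate a check like this before assigning each chunk, so that a sweep crossing 8 AM on a weekday stops handing new work to Worker 2.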
+
+### Expected Performance
+
+**Test sweep (256 combos):**
+- Worker 1 only (weekday daytime): ~25 minutes
+- Both workers (nights/weekends): ~12-15 minutes
+
+**Full sweep (65,536 combos) - after test passes:**
+- Optimal start: Friday 6 PM
+- Completion: ~30-35 hours of runtime (early Sunday morning if both workers run uninterrupted)
+
+## Usage
+
+### Step 1: Deploy to Cluster
+
+```bash
+# On local machine
+cd /home/icke/traderv4
+rsync -avz --exclude 'node_modules' --exclude '.next' cluster/ root@10.10.254.106:/home/comprehensive_sweep/
+rsync -avz backtester/ root@10.10.254.106:/home/comprehensive_sweep/backtester/
+```
+
+### Step 2: Launch Test Sweep
+
+```bash
+# SSH to Worker 1
+ssh root@10.10.254.106
+
+# Navigate to workspace
+cd /home/comprehensive_sweep
+
+# Launch coordinator
+bash run_v11_test_sweep.sh
+```
+
+### Step 3: Monitor Progress
+
+```bash
+# Watch coordinator logs
+tail -f coordinator_v11_test.log
+
+# Check database status
+sqlite3 exploration.db "SELECT id, status, assigned_worker FROM v11_test_chunks"
+
+# Check completion
+sqlite3 exploration.db "SELECT COUNT(*) FROM v11_test_strategies"
+# Expected: 256
+
+# View top results
+sqlite3 exploration.db "SELECT params, pnl, total_trades FROM v11_test_strategies ORDER BY pnl DESC LIMIT 10"
+```
+
+## Output Files
+
+### CSV Results
+
+Location: `cluster/v11_test_results/`
+
+Files:
+- `v11_test_chunk_0000_results.csv` (128 combinations)
+- `v11_test_chunk_0001_results.csv` (128 combinations)
+
+Format:
+```csv
+flip_threshold,adx_min,long_pos_max,short_pos_min,vol_min,entry_buffer_atr,rsi_long_min,rsi_short_max,pnl,win_rate,profit_factor,max_drawdown,total_trades
+0.5,18,75,20,0.8,0.15,35,65,245.32,58.3,1.245,125.40,48
+...
+```
+
+### Database Tables
+
+**v11_test_chunks:**
+```sql
+CREATE TABLE v11_test_chunks (
+    id TEXT PRIMARY KEY,
+    start_combo INTEGER,
+    end_combo INTEGER,
+    total_combos INTEGER,
+    status TEXT,
+    assigned_worker TEXT,
+    started_at INTEGER,
+    completed_at INTEGER
+);
+```
+
+**v11_test_strategies:**
+```sql
+CREATE TABLE v11_test_strategies (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    chunk_id TEXT,
+    params TEXT,
+    pnl REAL,
+    win_rate REAL,
+    profit_factor REAL,
+    max_drawdown REAL,
+    total_trades INTEGER,
+    FOREIGN KEY (chunk_id) REFERENCES v11_test_chunks(id)
+);
+```
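
To illustrate how the two tables fit together, the following sketch creates them in an in-memory SQLite database, stores two results, and runs the same kind of top-results query used in the monitoring step. Storing `params` as JSON, the chunk id format, the `start_combo`/`end_combo` values, and the `record_result` helper are all assumptions. The first metrics row is the sample CSV row shown earlier; the second is an illustrative all-zero row.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE v11_test_chunks (
    id TEXT PRIMARY KEY,
    start_combo INTEGER,
    end_combo INTEGER,
    total_combos INTEGER,
    status TEXT,
    assigned_worker TEXT,
    started_at INTEGER,
    completed_at INTEGER
);
CREATE TABLE v11_test_strategies (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    chunk_id TEXT,
    params TEXT,
    pnl REAL,
    win_rate REAL,
    profit_factor REAL,
    max_drawdown REAL,
    total_trades INTEGER,
    FOREIGN KEY (chunk_id) REFERENCES v11_test_chunks(id)
);
"""


def record_result(conn, chunk_id, params, metrics):
    """Insert one backtest result; params is serialized as JSON text (assumption)."""
    conn.execute(
        "INSERT INTO v11_test_strategies "
        "(chunk_id, params, pnl, win_rate, profit_factor, max_drawdown, total_trades) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (chunk_id, json.dumps(params), metrics["pnl"], metrics["win_rate"],
         metrics["profit_factor"], metrics["max_drawdown"], metrics["total_trades"]),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical chunk row; the id format and combo range are illustrative only.
conn.execute(
    "INSERT INTO v11_test_chunks (id, start_combo, end_combo, total_combos, status) "
    "VALUES (?, ?, ?, ?, ?)",
    ("v11_test_chunk_0000", 0, 128, 128, "pending"),
)

record_result(conn, "v11_test_chunk_0000", {"flip_threshold": 0.5, "adx_min": 18},
              {"pnl": 245.32, "win_rate": 58.3, "profit_factor": 1.245,
               "max_drawdown": 125.40, "total_trades": 48})
record_result(conn, "v11_test_chunk_0000", {"flip_threshold": 0.6, "adx_min": 21},
              {"pnl": 0.0, "win_rate": 0.0, "profit_factor": 0.0,
               "max_drawdown": 0.0, "total_trades": 0})

count = conn.execute("SELECT COUNT(*) FROM v11_test_strategies").fetchone()[0]
top = conn.execute(
    "SELECT params, pnl, total_trades FROM v11_test_strategies ORDER BY pnl DESC LIMIT 1"
).fetchone()
print(count, top)
```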
+
+## Verification Steps
+
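+After the sweep completes (~6-25 minutes), run these checks from the repository root. On Worker 1, run them from `/home/comprehensive_sweep` and drop the `cluster/` prefix: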
+
+```bash
+# 1. Check output files exist
+ls -lh cluster/v11_test_results/
+# Expected: 2 CSV files
+
+# 2. Verify database has all strategies
+sqlite3 cluster/exploration.db "SELECT COUNT(*) FROM v11_test_strategies"
+# Expected: 256
+
+# 3. Check for varied PnL (NOT all zeros like v9 bug)
+head -10 cluster/v11_test_results/v11_test_chunk_0000_results.csv
+# Should show different PnL values
+
+# 4. View top 5 results
+sqlite3 cluster/exploration.db "SELECT params, pnl, total_trades FROM v11_test_strategies ORDER BY pnl DESC LIMIT 5"
+# Should show PnL > $0 and trades > 0
+
+# 5. Check coordinator logs
+tail -100 cluster/coordinator_v11_test.log
+# Should show "V11 TEST SWEEP COMPLETE!"
+```
+
+## Success Criteria
+
+✅ Completes in <30 minutes  
+✅ CSV files have 256 data rows total (128 per chunk, excluding header lines)  
+✅ PnL values are varied (not all zeros)  
+✅ Database has 256 strategies  
+✅ Top result shows PnL > $0 and trades > 0  
+✅ Worker 2 respected office hours (if applicable)
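
The data-related criteria above can be checked mechanically. The sketch below is one possible way to do it. `check_sweep_results` is a hypothetical helper that takes the CSV contents as strings; it covers only row count, PnL variety, and the top result, not runtime or Worker 2 scheduling.

```python
import csv
import io


def check_sweep_results(csv_texts, expected_rows=256):
    """Apply the data-quality criteria to the text of each result CSV."""
    rows = []
    for text in csv_texts:
        # DictReader consumes the header line, so only data rows are counted.
        rows.extend(csv.DictReader(io.StringIO(text)))

    pnls = [float(row["pnl"]) for row in rows]
    trades = [int(row["total_trades"]) for row in rows]
    # Index of the row with the highest PnL, or None when there are no rows.
    best = max(range(len(rows)), key=lambda i: pnls[i]) if rows else None

    return {
        "row_count_ok": len(rows) == expected_rows,
        # More than one distinct PnL value also rules out the all-zero case.
        "pnl_varied": len(set(pnls)) > 1,
        "top_result_ok": best is not None and pnls[best] > 0 and trades[best] > 0,
    }


# Tiny illustrative input; real input would be the contents of both chunk CSVs.
sample = "pnl,total_trades\n245.32,48\n0.0,0\n"
print(check_sweep_results([sample], expected_rows=2))
```

In practice the two chunk files would be read with `open(path).read()` and passed in together, leaving `expected_rows` at its default of 256.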
+
+## Telegram Notifications
+
+Bot sends 3 notifications:
+
+1. **Start:** When coordinator launches
+   - Shows available workers
+   - Worker 2 status (active or office hours)
+
+2. **Completion:** When all chunks finish
+   - Duration in minutes
+   - Total strategies tested
+   - Links to results
+
+3. **Failure:** If coordinator crashes
+   - Premature stop notification
+
+## Troubleshooting
+
+### Worker 2 Not Starting (Weekday Daytime)
+**Expected:** Worker 2 is disabled Mon-Fri 8 AM - 6 PM for office hours  
+**Action:** Wait until 6 PM or start on weekend for full 2-worker speed
+
+### No Signals Generated (All Zero PnL)
+**Symptom:** All PnL values are 0.0  
+**Cause:** Filters are too strict and block every signal, or the filter logic is still broken  
+**Action:** This is exactly the failure the test sweep is meant to catch. If v11 produces all zeros like v9 did, do not run the full sweep
+
+### SSH Timeout on Worker 2
+**Symptom:** Worker 2 fails to deploy  
+**Cause:** SSH hop connection issue  
+**Action:** 
+```bash
+# Test connection manually
+ssh -o ProxyJump=root@10.10.254.106 root@10.20.254.100 'hostname'
+```
+
+### Database Locked
+**Symptom:** SQLite error "database is locked"  
+**Cause:** Coordinator still running  
+**Action:**
+```bash
+# Find the coordinator PID (pgrep -f matches the full command line and skips itself)
+pgrep -f v11_test_coordinator
+# Stop it gracefully (kill sends SIGTERM by default)
+kill <PID>
+```
+
+## Next Steps After Test Passes
+
+1. **User verifies data quality:**
+   - PnL values varied (not all zeros)
+   - Top results show positive P&L
+   - Trade counts > 0
+
+2. **If test PASSES:**
+   - Create full 65,536-combo sweep coordinator
+   - 4 values per parameter across 8 parameters (4^8 = 65,536, comprehensive grid)
+   - Start Friday 6 PM for optimal weekend utilization
+   - Expected completion after ~30-35 hours (early Sunday morning if both workers run uninterrupted)
+
+3. **If test FAILS (all zeros):**
+   - v11 filters may still be broken
+   - Debug indicator logic
+   - Compare with pinescript lines 271-272
+   - Don't run full sweep until fixed
+
+## Architecture
+
+```
+run_v11_test_sweep.sh
+    ├── Initializes database (2 chunks)
+    └── Launches v11_test_coordinator.py
+            ├── Worker 1 (always available)
+            │   └── v11_test_worker.py (27 cores)
+            │       └── backtester/v11_moneyline_all_filters.py
+            └── Worker 2 (office hours aware)
+                └── v11_test_worker.py (27 cores)
+                    └── backtester/v11_moneyline_all_filters.py
+```
+
+## Files
+
+| File | Purpose | Lines |
+|------|---------|-------|
+| `run_v11_test_sweep.sh` | Launch script | 52 |
+| `v11_test_coordinator.py` | Orchestrates sweep | 384 |
+| `v11_test_worker.py` | Processes chunks | 296 |
+| `backtester/v11_moneyline_all_filters.py` | Indicator logic | 335 |
+
+## References
+
+- **Pinescript:** `workflows/trading/moneyline_v11_all_filters.pinescript`
+- **v9 Bug:** Filters calculated but not applied (lines 271-272 broken)
+- **v9 Coordinator:** `cluster/v9_advanced_coordinator.py` (reference pattern)
+- **Math Utils:** `backtester/math_utils.py` (ATR, ADX, RSI)
+- **Simulator:** `backtester/simulator.py` (backtest engine)
--- a/cluster/deploy_v11_test.sh
+++ b/cluster/deploy_v11_test.sh
@@ -0,0 +1,66 @@
+#!/bin/bash
+# V11 Test Sweep - Quick Deployment Script
+# Syncs files to EPYC cluster and verifies setup
+
+set -e
+
+echo "================================================================"
+echo "V11 TEST SWEEP - DEPLOYMENT TO EPYC CLUSTER"
+echo "================================================================"
+echo ""
+
+# Configuration (local paths are relative to the repository root; run this script from there)
+WORKER1_HOST="root@10.10.254.106"
+WORKER1_WORKSPACE="/home/comprehensive_sweep"
+LOCAL_CLUSTER="cluster"
+LOCAL_BACKTESTER="backtester"
+
+echo "📦 Step 1: Sync cluster scripts to Worker 1..."
+rsync -avz --progress \
+  --exclude '.venv' \
+  --exclude '__pycache__' \
+  --exclude '*.pyc' \
+  --exclude 'exploration.db' \
+  --exclude '*.log' \
+  --exclude '*_results' \
+  ${LOCAL_CLUSTER}/v11_test_coordinator.py \
+  ${LOCAL_CLUSTER}/v11_test_worker.py \
+  ${LOCAL_CLUSTER}/run_v11_test_sweep.sh \
+  ${LOCAL_CLUSTER}/V11_TEST_SWEEP_README.md \
+  ${WORKER1_HOST}:${WORKER1_WORKSPACE}/
+
+echo ""
+echo "📦 Step 2: Sync v11 indicator to Worker 1..."
+rsync -avz --progress \
+  --exclude '__pycache__' \
+  --exclude '*.pyc' \
+  ${LOCAL_BACKTESTER}/v11_moneyline_all_filters.py \
+  ${WORKER1_HOST}:${WORKER1_WORKSPACE}/backtester/
+
+echo ""
+echo "📦 Step 3: Verify math_utils exists on Worker 1..."
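+# Exit non-zero on a missing file so that `set -e` aborts before "DEPLOYMENT COMPLETE" is printed
+ssh ${WORKER1_HOST} "test -f ${WORKER1_WORKSPACE}/backtester/math_utils.py && echo '✓ math_utils.py found' || { echo '✗ math_utils.py missing!'; exit 1; }"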
+
+echo ""
+echo "📦 Step 4: Verify data file exists on Worker 1..."
+# Exit non-zero on a missing data file so that `set -e` aborts the deployment
+ssh ${WORKER1_HOST} "test -f ${WORKER1_WORKSPACE}/data/solusdt_5m.csv && echo '✓ data/solusdt_5m.csv found' || { echo '✗ data/solusdt_5m.csv missing!'; exit 1; }"
+
+echo ""
+echo "📦 Step 5: Make scripts executable on Worker 1..."
+ssh ${WORKER1_HOST} "chmod +x ${WORKER1_WORKSPACE}/run_v11_test_sweep.sh ${WORKER1_WORKSPACE}/v11_test_coordinator.py ${WORKER1_WORKSPACE}/v11_test_worker.py"
+
+echo ""
+echo "================================================================"
+echo "✅ DEPLOYMENT COMPLETE"
+echo "================================================================"
+echo ""
+echo "To start test sweep, run:"
+echo "  ssh ${WORKER1_HOST}"
+echo "  cd ${WORKER1_WORKSPACE}"
+echo "  bash run_v11_test_sweep.sh"
+echo ""
+echo "To monitor progress:"
+echo "  tail -f ${WORKER1_WORKSPACE}/coordinator_v11_test.log"
+echo ""
+echo "Expected runtime: 6-25 minutes"
+echo "================================================================"