trading_bot_v4/cluster/WORKER2_TIME_RESTRICTION.md
mindesbunister 0babd1ea1a docs: Add worker2 time restriction documentation
- Complete guide for noise constraint management
- Time-based scheduling logic explained
- Performance impact analysis (27% reduction)
- Monitoring commands and troubleshooting
- Fixed stuck chunk 14 documentation
2025-12-04 14:12:09 +01:00

Worker2 Time Restriction - Noise Constraint Management

Date: December 4, 2025
Issue: Node 2 (bd-host01) generates excessive noise during office hours
Solution: Time-restricted scheduling (19:00 - 06:00 only)


Problem

Worker2 (bd-host01 / 10.20.254.100) is an EPYC 16-core server that generates significant noise when running parameter sweeps at full load. This is disruptive during office hours (06:00 - 19:00).


Solution Implemented

Time-Based Worker Scheduling

Configuration in v9_advanced_coordinator.py:

WORKERS = {
    'worker1': {
        'host': 'root@10.10.254.106',
        'workspace': '/home/comprehensive_sweep',
        # No time restriction - runs 24/7
    },
    'worker2': {
        'host': 'root@10.20.254.100', 
        'workspace': '/home/backtest_dual/backtest',
        'ssh_hop': 'root@10.10.254.106',
        'time_restricted': True,      # Enable time-based control
        'allowed_start_hour': 19,     # 7 PM
        'allowed_end_hour': 6,        # 6 AM
    }
}

Logic Implementation

from datetime import datetime

def is_worker_allowed_to_run(worker_name: str) -> bool:
    """Check if worker is allowed to run based on time restrictions"""
    worker = WORKERS[worker_name]
    
    # If no time restriction, always allowed
    if not worker.get('time_restricted', False):
        return True
    
    # Check current hour (local time)
    current_hour = datetime.now().hour
    start_hour = worker['allowed_start_hour']
    end_hour = worker['allowed_end_hour']
    
    # Handle time range that crosses midnight (e.g., 19:00 - 06:00)
    if start_hour > end_hour:
        allowed = current_hour >= start_hour or current_hour < end_hour
    else:
        allowed = start_hour <= current_hour < end_hour
    
    return allowed
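Because the function above reads the clock directly, its behavior at the 06:00 and 19:00 boundaries is awkward to check. The following is a minimal sketch, assuming a hypothetical helper `hour_in_window` (not part of the coordinator) that applies the same comparison but takes the hour as an argument:

```python
def hour_in_window(hour: int, start_hour: int, end_hour: int) -> bool:
    """Return True if `hour` lies in [start_hour, end_hour), wrapping past midnight.

    Hypothetical helper for illustration only; it mirrors the comparison in
    is_worker_allowed_to_run but does not read the system clock.
    """
    if start_hour > end_hour:
        # Window crosses midnight, e.g. 19 -> 6
        return hour >= start_hour or hour < end_hour
    return start_hour <= hour < end_hour


# Boundary checks for the 19:00 - 06:00 window
print(hour_in_window(19, 19, 6))  # True: window opens at 19:00
print(hour_in_window(5, 19, 6))   # True: 05:xx is still inside
print(hour_in_window(6, 19, 6))   # False: window closes at 06:00
print(hour_in_window(14, 19, 6))  # False: mid-afternoon
```

With this comparison, setting the start hour equal to the end hour yields an empty window rather than a 24-hour one, so unrestricted operation is expressed by leaving `time_restricted` unset.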

Coordinator Integration

The coordinator now checks time restrictions before assigning work:

# Assign work to idle workers
for worker_name in WORKERS.keys():
    # Check if worker is allowed to run (time restrictions)
    if not is_worker_allowed_to_run(worker_name):
        if iteration % 10 == 0:  # Log every 10 iterations to avoid spam
            print(f"⏰ {worker_name} not allowed (office hours, noise restriction)")
        continue
    
    # ... continue with worker assignment ...
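The gating step can be sketched in isolation as a filter over the worker table. This is an illustration under assumptions, not the coordinator's code: `eligible_workers` is a hypothetical name, the `WORKERS` dict is trimmed to the time-related keys, and the hour is passed in so the result is reproducible.

```python
from datetime import datetime
from typing import Dict, List, Optional

# Trimmed copy of the configuration: only the time-related keys are kept.
WORKERS: Dict[str, dict] = {
    "worker1": {},  # no restriction
    "worker2": {
        "time_restricted": True,
        "allowed_start_hour": 19,
        "allowed_end_hour": 6,
    },
}


def eligible_workers(workers: Dict[str, dict], hour: Optional[int] = None) -> List[str]:
    """Return the names of workers that may receive new work at `hour`."""
    if hour is None:
        hour = datetime.now().hour  # local time, as in the coordinator
    eligible = []
    for name, cfg in workers.items():
        if not cfg.get("time_restricted", False):
            eligible.append(name)
            continue
        start, end = cfg["allowed_start_hour"], cfg["allowed_end_hour"]
        if start > end:  # window crosses midnight
            inside = hour >= start or hour < end
        else:
            inside = start <= hour < end
        if inside:
            eligible.append(name)
    return eligible


print(eligible_workers(WORKERS, hour=14))  # ['worker1']
print(eligible_workers(WORKERS, hour=22))  # ['worker1', 'worker2']
```

As far as the snippet above shows, the check only gates new assignments; it does not by itself stop a chunk that is already running when the window closes.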

Operating Hours

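| Worker  | Hours         | Status          | Reason                    |
|---------|---------------|-----------------|---------------------------|
| Worker1 | 24/7          | Always active   | No noise constraint       |
| Worker2 | 19:00 - 06:00 | Time-restricted | Noise during office hours |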

Worker2 Schedule:

  • ACTIVE: 7:00 PM - 6:00 AM (11 hours/day)
  • IDLE: 6:00 AM - 7:00 PM (13 hours/day)
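The 11/13 hour split follows directly from the configured start and end hours. A small sketch, using a hypothetical helper `active_hours_per_day`, computes the window length with modular arithmetic so that a window crossing midnight is handled without a special case:

```python
def active_hours_per_day(start_hour: int, end_hour: int) -> int:
    """Length in hours of the window [start_hour, end_hour), wrapping past midnight."""
    return (end_hour - start_hour) % 24


active = active_hours_per_day(19, 6)
print(active, 24 - active)  # 11 13
```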

Impact on Sweep Performance

Before Time Restriction

  • Worker1: 32 cores, 24/7 = 768 core-hours/day
  • Worker2: 32 cores, 24/7 = 768 core-hours/day
  • Total: 1,536 core-hours/day

After Time Restriction

  • Worker1: 32 cores, 24/7 = 768 core-hours/day
  • Worker2: 32 cores, 11h/day = 352 core-hours/day
  • Total: 1,120 core-hours/day

Performance Impact: ~27% reduction in daily throughput (1,536 to 1,120 core-hours/day). Worker2 itself contributes ~54% less: 352 core-hours/day, or 45.8% of its former 768.
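The figures above can be reproduced with a few lines of arithmetic. The sketch below only restates the numbers from the two lists (32 cores per worker, 11 active hours for worker2); the variable names are illustrative.

```python
CORES_PER_WORKER = 32      # figure used in the lists above
WORKER2_ACTIVE_HOURS = 11  # 19:00 - 06:00

worker1 = CORES_PER_WORKER * 24                          # 768 core-hours/day
worker2_before = CORES_PER_WORKER * 24                   # 768 core-hours/day
worker2_after = CORES_PER_WORKER * WORKER2_ACTIVE_HOURS  # 352 core-hours/day

total_before = worker1 + worker2_before  # 1536
total_after = worker1 + worker2_after    # 1120

cluster_drop = 1 - total_after / total_before      # share of cluster capacity lost
worker2_drop = 1 - worker2_after / worker2_before  # share of worker2 capacity lost

print(f"cluster: {cluster_drop:.1%}")  # cluster: 27.1%
print(f"worker2: {worker2_drop:.1%}")  # worker2: 54.2%
```

Worker2 retains 352 / 768 = 45.8% of its former capacity, which corresponds to a 54.2% drop for that machine and a 27.1% drop for the cluster as a whole.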

Sweep Progress Impact

  • Chunks completed: 63 / 1,693 (3.7%)
  • Chunks pending: 1,629
  • Estimated completion time:
    • Old: ~40 days (both workers 24/7)
    • New: ~54 days (worker2 time-restricted)
    • Delta: +14 days

Acceptable trade-off: Quiet office hours > slightly longer sweep time
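As a rough cross-check of the completion estimate, the old figure can be scaled by the ratio of daily core-hours. This sketch assumes that sweep throughput is proportional to available core-hours, which the document does not state; `scaled_days` is a hypothetical helper.

```python
def scaled_days(old_days: float, old_core_hours: float, new_core_hours: float) -> float:
    """Scale a completion estimate, assuming throughput is proportional to core-hours."""
    return old_days * old_core_hours / new_core_hours


estimate = scaled_days(40, 1536, 1120)
print(round(estimate, 1))  # 54.9
```

Under that assumption the result lands close to the ~54 days quoted above, i.e. roughly two extra weeks.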


Verification

Test Current Time Restriction (Dec 4, 14:11)

cd /home/icke/traderv4/cluster
python3 -c "
from datetime import datetime
current_hour = datetime.now().hour
allowed = current_hour >= 19 or current_hour < 6
print(f'Current hour: {current_hour}')
print(f'Worker2 allowed: {allowed}')
"

Output:

Current hour: 14
Worker2 allowed: False  ✅ Correct (office hours)

Monitor Coordinator Logs

cd /home/icke/traderv4/cluster
tail -f v9_advanced_coordinator.log | grep "⏰"

Expected output during office hours:

⏰ worker2 not allowed (office hours, noise restriction)

Fixed Issues

Stuck Chunk Problem (Dec 2 - Dec 4)

Issue: Chunk 14 assigned to worker2 on Dec 2 at 15:14, never completed

  • Database showed: status='running'
  • Reality: No processes running on worker2
  • Impact: Blocked new work assignment to worker2 for 46+ hours

Resolution:

UPDATE v9_advanced_chunks 
SET status='pending', assigned_worker=NULL 
WHERE id='v9_advanced_chunk_0014';

Chunk 14 now available for reassignment during worker2's active hours (19:00-06:00).
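The same reset can be scripted with Python's standard `sqlite3` module. This is a sketch under assumptions: `reset_chunk` is a hypothetical helper, the demo table contains only the three columns named in this document (the real schema may have more), and the extra `status='running'` guard is an addition that is not in the original statement.

```python
import sqlite3


def reset_chunk(conn: sqlite3.Connection, chunk_id: str) -> int:
    """Return a 'running' chunk to the pending pool; returns the number of rows changed."""
    cur = conn.execute(
        "UPDATE v9_advanced_chunks "
        "SET status='pending', assigned_worker=NULL "
        "WHERE id=? AND status='running'",  # guard added for illustration
        (chunk_id,),
    )
    conn.commit()
    return cur.rowcount


# Demo against an in-memory table (assumed minimal schema)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE v9_advanced_chunks (id TEXT PRIMARY KEY, status TEXT, assigned_worker TEXT)"
)
conn.execute(
    "INSERT INTO v9_advanced_chunks VALUES ('v9_advanced_chunk_0014', 'running', 'worker2')"
)
print(reset_chunk(conn, "v9_advanced_chunk_0014"))  # 1
print(reset_chunk(conn, "v9_advanced_chunk_0014"))  # 0 (already pending)
```

Before resetting a chunk on the real database, it is worth confirming that no worker process is still running, since the reset makes the chunk eligible for reassignment.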


Manual Overrides

Temporarily Disable Time Restriction

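If needed for urgent sweeps, edit the worker2 entry in v9_advanced_coordinator.py:

# In WORKERS['worker2'], comment out only the time_restricted flag:
'worker2': {
    # ... host, workspace, ssh_hop unchanged ...
    # 'time_restricted': True,  # TEMPORARILY DISABLED
    'allowed_start_hour': 19,   # ignored while time_restricted is unset
    'allowed_end_hour': 6,
}

Then restart the coordinator. Re-enable the flag once the urgent sweep is done.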

Adjust Operating Hours

To change the allowed hours (e.g., narrow the window to 8 PM - 5 AM, 9 hours/day):

'worker2': {
    'time_restricted': True,
    'allowed_start_hour': 20,  # 8 PM
    'allowed_end_hour': 5,     # 5 AM
}

Monitoring Commands

Check Worker2 Status

# Check if worker2 has active processes
ssh root@10.10.254.106 "ssh root@10.20.254.100 'ps aux | grep v9_advanced_worker | grep -v grep | wc -l'"

# Check worker2 assignments in database
cd /home/icke/traderv4/cluster
sqlite3 exploration.db "SELECT COUNT(*) FROM v9_advanced_chunks WHERE assigned_worker='worker2' AND status='running';"

Check Time Restriction Status

cd /home/icke/traderv4/cluster
sqlite3 exploration.db "
SELECT 
    assigned_worker, 
    COUNT(*) as chunks,
    SUM(CASE WHEN status='completed' THEN 1 ELSE 0 END) as completed,
    SUM(CASE WHEN status='running' THEN 1 ELSE 0 END) as running
FROM v9_advanced_chunks 
WHERE assigned_worker IS NOT NULL
GROUP BY assigned_worker;
"
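For use from a script rather than the shell, the same per-worker summary can be run through Python's `sqlite3` module. The sketch below is illustrative: `worker_summary` is a hypothetical helper, and the demo rows and minimal table schema are made up so that the example is self-contained.

```python
import sqlite3

SUMMARY_SQL = """
SELECT assigned_worker,
       COUNT(*) AS chunks,
       SUM(CASE WHEN status='completed' THEN 1 ELSE 0 END) AS completed,
       SUM(CASE WHEN status='running' THEN 1 ELSE 0 END) AS running
FROM v9_advanced_chunks
WHERE assigned_worker IS NOT NULL
GROUP BY assigned_worker
ORDER BY assigned_worker
"""


def worker_summary(conn: sqlite3.Connection) -> dict:
    """Map each assigned worker to its chunk counts."""
    return {
        worker: {"chunks": chunks, "completed": completed, "running": running}
        for worker, chunks, completed, running in conn.execute(SUMMARY_SQL)
    }


# Demo with made-up rows in an in-memory database (assumed minimal schema)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE v9_advanced_chunks (id TEXT PRIMARY KEY, status TEXT, assigned_worker TEXT)"
)
conn.executemany(
    "INSERT INTO v9_advanced_chunks VALUES (?, ?, ?)",
    [
        ("c1", "completed", "worker1"),
        ("c2", "running", "worker1"),
        ("c3", "completed", "worker2"),
        ("c4", "pending", None),
    ],
)
for worker, counts in worker_summary(conn).items():
    print(worker, counts)
```

Against the real database, pass `sqlite3.connect("exploration.db")` from the cluster directory instead of the in-memory connection.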

Expected Behavior

During Office Hours (06:00 - 19:00)

  • Worker1: Processing chunks
  • Worker2: ⏸️ Idle (time restriction active)
  • Coordinator logs: "⏰ worker2 not allowed (office hours, noise restriction)"

During Off Hours (19:00 - 06:00)

  • Worker1: Processing chunks
  • Worker2: Processing chunks (if available)
  • Both workers: Full 32-core utilization

Files Modified

  • cluster/v9_advanced_coordinator.py - Added time restriction logic
  • cluster/exploration.db - Reset stuck chunk 14
  • cluster/WORKER2_TIME_RESTRICTION.md - This documentation

Future Improvements

  1. Dynamic hour adjustment via environment variables
  2. Holiday/weekend override (allow 24/7 on non-work days)
  3. Load-based throttling (reduce cores instead of full stop)
  4. SMS alerts when worker2 transitions active/idle
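The first item could look roughly like the sketch below. Nothing here exists in the coordinator today: the function `hours_from_env` and the variable names `WORKER2_START_HOUR` and `WORKER2_END_HOUR` are hypothetical, and the defaults simply repeat the current 19/6 configuration.

```python
import os
from typing import Mapping, Optional, Tuple


def hours_from_env(
    default_start: int = 19,
    default_end: int = 6,
    environ: Optional[Mapping[str, str]] = None,
) -> Tuple[int, int]:
    """Read worker2's allowed window from environment variables (hypothetical names)."""
    env = os.environ if environ is None else environ
    start = int(env.get("WORKER2_START_HOUR", default_start))
    end = int(env.get("WORKER2_END_HOUR", default_end))
    for label, value in (("start", start), ("end", end)):
        if not 0 <= value <= 23:
            raise ValueError(f"{label} hour must be in 0..23, got {value}")
    return start, end


# Passing a dict instead of os.environ keeps the demo reproducible
print(hours_from_env(environ={}))  # (19, 6)
print(hours_from_env(environ={"WORKER2_START_HOUR": "20", "WORKER2_END_HOUR": "5"}))  # (20, 5)
```

Validating the range at startup means a mistyped value fails loudly instead of silently producing an unexpected schedule.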

Contact

For adjustments to worker2 operating hours or noise constraint issues, update the configuration in v9_advanced_coordinator.py and restart the coordinator.

Current Status (Dec 4, 2025):

  • Time restriction implemented
  • Stuck chunk 14 resolved
  • Worker1 processing continuously
  • ⏸️ Worker2 waiting for 19:00 (off-hours start)