Worker2 Time Restriction - Noise Constraint Management
Date: December 4, 2025
Issue: Node 2 (bd-host01) generates excessive noise during office hours
Solution: Time-restricted scheduling (19:00 - 06:00 only)
Problem
Worker2 (bd-host01 / 10.20.254.100) is an EPYC 16-core server that generates significant noise when running parameter sweeps at full load. This is disruptive during office hours (06:00 - 19:00).
Solution Implemented
Time-Based Worker Scheduling
Configuration in v9_advanced_coordinator.py:
WORKERS = {
    'worker1': {
        'host': 'root@10.10.254.106',
        'workspace': '/home/comprehensive_sweep',
        # No time restriction - runs 24/7
    },
    'worker2': {
        'host': 'root@10.20.254.100',
        'workspace': '/home/backtest_dual/backtest',
        'ssh_hop': 'root@10.10.254.106',
        'time_restricted': True,       # Enable time-based control
        'allowed_start_hour': 19,      # 7 PM
        'allowed_end_hour': 6,         # 6 AM
    }
}
Logic Implementation
from datetime import datetime

def is_worker_allowed_to_run(worker_name: str) -> bool:
    """Check if worker is allowed to run based on time restrictions"""
    worker = WORKERS[worker_name]
    # If no time restriction, always allowed
    if not worker.get('time_restricted', False):
        return True
    # Check current hour (local time of the machine running the coordinator)
    current_hour = datetime.now().hour
    start_hour = worker['allowed_start_hour']
    end_hour = worker['allowed_end_hour']
    # Handle time range that crosses midnight (e.g., 19:00 - 06:00)
    if start_hour > end_hour:
        allowed = current_hour >= start_hour or current_hour < end_hour
    else:
        allowed = start_hour <= current_hour < end_hour
    return allowed
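Because the function above reads the clock directly, its boundary behavior (exactly 06:00, exactly 19:00) is awkward to check. One way to make it verifiable, shown here as a minimal sketch, is a pure helper that takes the hour as an argument. The name `hour_in_window` is hypothetical and does not exist in the coordinator.

```python
def hour_in_window(hour: int, start_hour: int, end_hour: int) -> bool:
    """Return True if `hour` lies in [start_hour, end_hour), wrapping past midnight."""
    if start_hour > end_hour:
        # Window crosses midnight, e.g. 19 -> 6
        return hour >= start_hour or hour < end_hour
    # Window within a single day, e.g. 9 -> 17
    return start_hour <= hour < end_hour


if __name__ == "__main__":
    # Boundary checks for worker2's 19:00 - 06:00 window
    for hour in (0, 5, 6, 14, 18, 19, 23):
        print(f"{hour:02d}:00 -> {hour_in_window(hour, 19, 6)}")
```

The start hour is inclusive and the end hour exclusive, so worker2 may start at 19:00 and is blocked again from 06:00. With this logic, equal start and end hours give an empty window rather than a 24-hour one.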
Coordinator Integration
The coordinator now checks time restrictions before assigning work:
# Assign work to idle workers
for worker_name in WORKERS.keys():
    # Check if worker is allowed to run (time restrictions)
    if not is_worker_allowed_to_run(worker_name):
        if iteration % 10 == 0:  # Log every 10 iterations to avoid spam
            print(f"⏰ {worker_name} not allowed (office hours, noise restriction)")
        continue
    # ... continue with worker assignment ...
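The skip-and-throttle pattern in this loop can be isolated for testing. The following is a sketch only: `select_runnable_workers` and its `is_allowed` callback are hypothetical names, and the real coordinator loop does more than this (it also tracks which workers are idle).

```python
from typing import Callable, Dict, List


def select_runnable_workers(
    workers: Dict[str, dict],
    is_allowed: Callable[[str], bool],
    iteration: int,
    log_every: int = 10,
) -> List[str]:
    """Return workers that may receive work, logging skipped ones every `log_every` iterations."""
    runnable = []
    for worker_name in workers:
        if not is_allowed(worker_name):
            # Throttle the message so a tight polling loop does not flood the log
            if iteration % log_every == 0:
                print(f"⏰ {worker_name} not allowed (office hours, noise restriction)")
            continue
        runnable.append(worker_name)
    return runnable
```

Passing the permission check in as a callback lets a test substitute a fixed answer instead of depending on the wall clock.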
Operating Hours
| Worker | Hours | Status | Reason |
|---|---|---|---|
| Worker1 | 24/7 | Always active | No noise constraint |
| Worker2 | 19:00 - 06:00 | Time-restricted | Noise during office hours |
Worker2 Schedule:
- ACTIVE: 7:00 PM - 6:00 AM (11 hours/day)
- IDLE: 6:00 AM - 7:00 PM (13 hours/day)
Impact on Sweep Performance
Before Time Restriction
- Worker1: 32 cores, 24/7 = 768 core-hours/day
- Worker2: 32 cores, 24/7 = 768 core-hours/day
- Total: 1,536 core-hours/day
After Time Restriction
- Worker1: 32 cores, 24/7 = 768 core-hours/day
- Worker2: 32 cores, 11h/day = 352 core-hours/day
- Total: 1,120 core-hours/day
Performance Impact: ~27% reduction in daily throughput (worker2 drops to 45.8% of its previous capacity, a 54.2% reduction for that node)
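The figures above follow directly from cores × hours. A short worked check, using only the core counts and hours stated in this section (the helper name is illustrative):

```python
def daily_core_hours(cores: int, hours_per_day: int) -> int:
    """Core-hours a worker can deliver per day."""
    return cores * hours_per_day


before = daily_core_hours(32, 24) + daily_core_hours(32, 24)  # 768 + 768 = 1536
after = daily_core_hours(32, 24) + daily_core_hours(32, 11)   # 768 + 352 = 1120

total_reduction = (before - after) / before    # cluster-wide loss
worker2_remaining = 11 / 24                    # share of worker2 capacity kept

print(f"Total: {before} -> {after} core-hours/day")
print(f"Cluster reduction: {total_reduction:.1%}")        # 27.1%
print(f"Worker2 keeps: {worker2_remaining:.1%}")          # 45.8%
print(f"Worker2 reduction: {1 - worker2_remaining:.1%}")  # 54.2%
```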
Sweep Progress Impact
- Chunks completed: 63 / 1,693 (3.7%)
- Chunks pending: 1,629
- Estimated completion time:
- Old: ~40 days (both workers 24/7)
- New: ~54 days (worker2 time-restricted)
- Delta: +14 days
Acceptable trade-off: quiet office hours outweigh a roughly two-week longer sweep.
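As a rough cross-check of the completion estimate, the ~40-day baseline can be scaled by the capacity ratio. This sketch assumes throughput is proportional to available core-hours, which ignores chunk granularity and scheduling overhead; the function name is illustrative.

```python
def scaled_eta_days(baseline_days: float, baseline_capacity: float, new_capacity: float) -> float:
    """Scale a completion estimate inversely with available daily capacity."""
    return baseline_days * baseline_capacity / new_capacity


eta = scaled_eta_days(40, 1536, 1120)
print(f"Estimated completion: {eta:.1f} days")  # 54.9
```

Under that assumption the result lands just under 55 days, in line with the ~54-day estimate above.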
Verification
Test Current Time Restriction (Dec 4, 14:11)
cd /home/icke/traderv4/cluster
python3 -c "
from datetime import datetime
current_hour = datetime.now().hour
allowed = current_hour >= 19 or current_hour < 6
print(f'Current hour: {current_hour}')
print(f'Worker2 allowed: {allowed}')
"
Output:
Current hour: 14
Worker2 allowed: False ✅ Correct (office hours)
Monitor Coordinator Logs
cd /home/icke/traderv4/cluster
tail -f v9_advanced_coordinator.log | grep "⏰"
Expected output during office hours:
⏰ worker2 not allowed (office hours, noise restriction)
Fixed Issues
Stuck Chunk Problem (Dec 2 - Dec 4)
Issue: Chunk 14 assigned to worker2 on Dec 2 at 15:14, never completed
- Database showed: status='running'
- Reality: No processes running on worker2
- Impact: Blocked new work assignment to worker2 for 46+ hours
Resolution:
UPDATE v9_advanced_chunks
SET status='pending', assigned_worker=NULL
WHERE id='v9_advanced_chunk_0014';
Chunk 14 now available for reassignment during worker2's active hours (19:00-06:00).
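If the same reset is needed again, it could be wrapped in a small Python helper. The sketch below uses only the table and column names that appear in the statement above; the `reset_chunk` function is hypothetical, and the extra `status='running'` guard is an added safety measure so that a chunk which finished in the meantime is not reset by accident.

```python
import sqlite3


def reset_chunk(conn: sqlite3.Connection, chunk_id: str) -> int:
    """Return a chunk to 'pending' only if it is still marked 'running'.

    Returns the number of rows changed (0 if the chunk was not running).
    """
    cur = conn.execute(
        "UPDATE v9_advanced_chunks "
        "SET status='pending', assigned_worker=NULL "
        "WHERE id=? AND status='running'",
        (chunk_id,),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    # Demo on an in-memory table holding only the three columns used here
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE v9_advanced_chunks "
        "(id TEXT PRIMARY KEY, status TEXT, assigned_worker TEXT)"
    )
    conn.execute(
        "INSERT INTO v9_advanced_chunks VALUES "
        "('v9_advanced_chunk_0014', 'running', 'worker2')"
    )
    print(reset_chunk(conn, "v9_advanced_chunk_0014"))
```

Against the real database the connection would be `sqlite3.connect("exploration.db")`, run from the cluster directory.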
Manual Overrides
Temporarily Disable Time Restriction
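If needed for urgent sweeps, comment out the restriction flag in the coordinator's WORKERS configuration:
'worker2': {
    # ... host, workspace, ssh_hop unchanged ...
    # 'time_restricted': True, # TEMPORARILY DISABLED
    'allowed_start_hour': 19,
    'allowed_end_hour': 6,
}
With the flag absent, is_worker_allowed_to_run() falls back to time_restricted=False and worker2 runs 24/7. Restart the coordinator to apply the change, and restore the flag once the urgent sweep is done.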
Adjust Operating Hours
To change the allowed hours (e.g., narrow the window to 8 PM - 5 AM):
'worker2': {
    'time_restricted': True,
    'allowed_start_hour': 20,  # 8 PM
    'allowed_end_hour': 5,     # 5 AM
}
Monitoring Commands
Check Worker2 Status
# Check if worker2 has active processes
ssh root@10.10.254.106 "ssh root@10.20.254.100 'ps aux | grep v9_advanced_worker | grep -v grep | wc -l'"
# Check worker2 assignments in database
cd /home/icke/traderv4/cluster
sqlite3 exploration.db "SELECT COUNT(*) FROM v9_advanced_chunks WHERE assigned_worker='worker2' AND status='running';"
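Comparing the two results above is how the stuck chunk 14 would have shown up: the database reports a running chunk while the process count is zero. A minimal sketch of that comparison follows; `running_chunks` and `looks_stuck` are hypothetical helpers, and the process count is assumed to be taken from the `ps ... | wc -l` command above.

```python
import sqlite3


def running_chunks(conn: sqlite3.Connection, worker: str) -> int:
    """Count chunks the database believes are running on `worker`."""
    row = conn.execute(
        "SELECT COUNT(*) FROM v9_advanced_chunks "
        "WHERE assigned_worker=? AND status='running'",
        (worker,),
    ).fetchone()
    return row[0]


def looks_stuck(db_running: int, process_count: int) -> bool:
    """True when the database claims running work but no worker process exists."""
    return db_running > 0 and process_count == 0
```

For example, `looks_stuck(running_chunks(conn, "worker2"), 0)` returning True during worker2's active hours suggests a chunk needs to be reset to pending.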
Check Time Restriction Status
cd /home/icke/traderv4/cluster
sqlite3 exploration.db "
SELECT
    assigned_worker,
    COUNT(*) AS chunks,
    SUM(CASE WHEN status='completed' THEN 1 ELSE 0 END) AS completed,
    SUM(CASE WHEN status='running' THEN 1 ELSE 0 END) AS running
FROM v9_advanced_chunks
WHERE assigned_worker IS NOT NULL
GROUP BY assigned_worker;
"
Expected Behavior
During Office Hours (06:00 - 19:00)
- Worker1: ✅ Processing chunks
- Worker2: ⏸️ Idle (time restriction active)
- Coordinator logs: "⏰ worker2 not allowed (office hours, noise restriction)"
During Off Hours (19:00 - 06:00)
- Worker1: ✅ Processing chunks
- Worker2: ✅ Processing chunks (if available)
- Both workers: Full 32-core utilization
Files Modified
- cluster/v9_advanced_coordinator.py - Added time restriction logic
- cluster/exploration.db - Reset stuck chunk 14
- cluster/WORKER2_TIME_RESTRICTION.md - This documentation
Future Improvements
- Dynamic hour adjustment via environment variables
- Holiday/weekend override (allow 24/7 on non-work days)
- Load-based throttling (reduce cores instead of full stop)
- SMS alerts when worker2 transitions active/idle
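The first two items could look roughly like the sketch below. Nothing here is implemented in the coordinator: the environment variable names `WORKER2_START_HOUR` and `WORKER2_END_HOUR`, both function names, and the Saturday/Sunday definition of a non-work day are assumptions for illustration, and public holidays are not handled.

```python
import os
from datetime import datetime
from typing import Mapping, Tuple


def load_window(
    env: Mapping[str, str] = os.environ,
    default_start: int = 19,
    default_end: int = 6,
) -> Tuple[int, int]:
    """Read allowed hours from (hypothetical) environment variables, with defaults."""
    start = int(env.get("WORKER2_START_HOUR", default_start))
    end = int(env.get("WORKER2_END_HOUR", default_end))
    for value in (start, end):
        if not 0 <= value <= 23:
            raise ValueError(f"hour out of range 0-23: {value}")
    return start, end


def is_allowed(now: datetime, start: int, end: int, weekends_unrestricted: bool = True) -> bool:
    """Apply the time window, optionally lifting it on Saturday and Sunday."""
    if weekends_unrestricted and now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    if start > end:
        # Window crosses midnight, e.g. 19 -> 6
        return now.hour >= start or now.hour < end
    return start <= now.hour < end


if __name__ == "__main__":
    start, end = load_window()
    print(is_allowed(datetime.now(), start, end))
```

Passing `now` in as an argument keeps the check testable; validating the hours at load time avoids a misconfigured value silently disabling worker2.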
Contact
For adjustments to worker2 operating hours or noise constraint issues, update the configuration in v9_advanced_coordinator.py and restart the coordinator.
Current Status (Dec 4, 2025):
- ✅ Time restriction implemented
- ✅ Stuck chunk 14 resolved
- ✅ Worker1 processing continuously
- ⏸️ Worker2 waiting for 19:00 (off-hours start)