mindesbunister
c343daeb44
docs: Document EPYC cluster SSH timeout fix in Common Pitfalls
- Added Common Pitfall #64: SSH timeout for nested hop scenarios
- Documented 30s→60s timeout increase rationale
- Explained SSH options: StrictHostKeyChecking, ConnectTimeout, ServerAliveInterval
- Included verification data: 23-24 processes per worker at 99% CPU
- Provided formula for calculating minimum timeouts for multi-hop SSH
- Cross-referenced commit ef371a1 (the actual code fix)
- Added future prevention guidance (timeout formulas, SSH multiplexing)
This documentation update accompanies the cluster fix deployed earlier.