fix: Critical rate limit handling + startup position restoration

**Problem 1: Rate Limit Cascade**
- Position Manager retried position closes in a tight loop, overwhelming the Helius RPC (10 req/s sustained limit on the free tier)
- Base retry delay was too aggressive (2s → 4s → 8s)
- No graceful handling of HTTP 429 (rate limit) responses

**Problem 2: Orphaned Positions After Restart**
- Container restarts lost Position Manager state
- Positions marked 'closed' in DB but still open on Drift (failed close transactions)
- No cross-validation between database and actual Drift positions

**Solutions Implemented:**

1. **Increased retry delays (orders.ts)**:
   - Base delay: 2s → 5s (progression now 5s → 10s → 20s)
   - Reduces RPC pressure during rate limit situations
   - Gives Helius time to recover between retries
   - Documented Helius limits: 100 req/s burst, 10 req/s sustained (free tier)

2. **Startup position validation (init-position-manager.ts)**:
   - Cross-checks last 24h of 'closed' trades against actual Drift positions
   - If DB says closed but Drift shows open → reopens in DB to restore tracking
   - Prevents unmonitored positions from existing after container restarts
   - Logs detailed mismatch info for debugging

3. **Rate limit-aware exit handling (position-manager.ts)**:
   - Detects 429 errors during position close
   - Keeps trade in monitoring instead of removing it
   - Retries naturally on the next price update instead of in an aggressive 2s loop
   - Prevents marking position as closed when transaction actually failed
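
The exit handling in item 3 could look roughly like the sketch below. It is an illustration only: `TrackedTrade`, `isRateLimitError`, `tryClose`, and the `closeOnExchange` callback are hypothetical names, not the identifiers used in position-manager.ts, and the 429 detection is a heuristic because RPC clients surface rate limits in different ways.

```typescript
// Hypothetical shape for illustration; not the project's real type.
interface TrackedTrade {
  id: string
  symbol: string
}

type CloseOutcome = 'closed' | 'retry_later'

// Heuristic 429 detection: checks a numeric status/code field first,
// then falls back to matching the error message.
function isRateLimitError(err: unknown): boolean {
  if (typeof err === 'object' && err !== null) {
    const e = err as { status?: unknown; code?: unknown; message?: unknown }
    if (e.status === 429 || e.code === 429) return true
    if (typeof e.message === 'string') {
      return /\b429\b|too many requests|rate limit/i.test(e.message)
    }
  }
  return typeof err === 'string' && /\b429\b|too many requests/i.test(err)
}

// Attempts one close. The trade leaves the monitoring map only after the
// close call resolves; on a rate limit it stays tracked for a later retry.
async function tryClose(
  trade: TrackedTrade,
  active: Map<string, TrackedTrade>,
  closeOnExchange: (t: TrackedTrade) => Promise<void>
): Promise<CloseOutcome> {
  try {
    await closeOnExchange(trade)
    active.delete(trade.id)
    return 'closed'
  } catch (err) {
    if (isRateLimitError(err)) {
      // Keep monitoring; the next price update triggers another attempt.
      return 'retry_later'
    }
    throw err // other failures are left to the caller
  }
}
```

The point of the design is that the trade is never removed or marked closed on a failed transaction, so a rate-limited close cannot produce a phantom 'closed' record.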

**Impact:**
- Should eliminate orphaned positions after restarts (pending the verification below)
- Spaces retries 2.5x further apart (5s vs 2s base delay), reducing RPC pressure during rate limiting
- Graceful degradation under rate limits
- Position Manager continues monitoring even during temporary RPC issues
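
To make the 2.5x figure concrete, here is a small sketch of the delay schedule. It assumes the delay doubles on each attempt, which matches the 5s → 10s → 20s progression stated above; `backoffDelays` is a hypothetical helper, not a function from orders.ts.

```typescript
// Returns the wait (in ms) before each retry, doubling every attempt.
function backoffDelays(baseDelayMs: number, maxRetries: number): number[] {
  const delays: number[] = []
  let delay = baseDelayMs
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(delay)
    delay *= 2
  }
  return delays
}

const total = (xs: number[]): number => xs.reduce((a, b) => a + b, 0)

const oldSchedule = backoffDelays(2000, 3) // [2000, 4000, 8000]
const newSchedule = backoffDelays(5000, 3) // [5000, 10000, 20000]

console.log(total(oldSchedule), total(newSchedule)) // prints: 14000 35000
```

With three retries, the total wait grows from 14s to 35s, so the same number of attempts is spread over 2.5x more time.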

**Testing needed:**
- Monitor next container restart to verify position restoration works
- Check rate limit analytics after next close attempt
- Verify no more phantom 'closed' positions when Drift shows open
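
For the phantom-position check, the startup cross-validation can be sketched as a pure function. This is a sketch under assumptions: `ClosedTrade`, `OpenPosition`, and `findOrphanedTrades` are hypothetical names, and matching by symbol is a simplification (a real implementation might keep only the most recent trade per symbol).

```typescript
// Hypothetical record shapes; the real DB rows and Drift position types differ.
interface ClosedTrade {
  id: string
  symbol: string
  closedAt: Date
}

interface OpenPosition {
  symbol: string
  size: number
}

// Returns trades the DB marks as closed within the lookback window
// while the exchange still reports a non-zero position for the symbol.
function findOrphanedTrades(
  closedTrades: ClosedTrade[],
  openPositions: OpenPosition[],
  now: Date = new Date(),
  lookbackMs: number = 24 * 60 * 60 * 1000
): ClosedTrade[] {
  const openSymbols = new Set(
    openPositions.filter(p => p.size !== 0).map(p => p.symbol)
  )
  const cutoff = now.getTime() - lookbackMs
  return closedTrades.filter(
    t => t.closedAt.getTime() >= cutoff && openSymbols.has(t.symbol)
  )
}
```

Each returned trade is a candidate for being reopened in the DB so that monitoring resumes; an empty result means the DB and the exchange agree for the window checked.
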
mindesbunister
2025-11-14 09:50:13 +01:00
parent ebe5e1ab5f
commit 27eb5d4fe8
3 changed files with 95 additions and 50 deletions

@@ -644,11 +644,17 @@ export async function closePosition(
  */
 /**
  * Retry a function with exponential backoff for rate limit errors
+ *
+ * Helius RPC limits (free tier):
+ * - 100 requests/second burst
+ * - 10 requests/second sustained
+ *
+ * Strategy: Longer delays to avoid overwhelming RPC during rate limit situations
  */
 async function retryWithBackoff<T>(
   fn: () => Promise<T>,
   maxRetries: number = 3,
-  baseDelay: number = 2000
+  baseDelay: number = 5000 // Increased from 2s to 5s: 5s → 10s → 20s progression
 ): Promise<T> {
   const startTime = Date.now()