feat: Replace blind 2-hour reconnect with error-based health monitoring
User Request: Replace blind 2-hour restart timer with smart monitoring that only restarts when accountUnsubscribe errors actually occur

Changes:

1. Health Monitor (NEW):
   - Created lib/monitoring/drift-health-monitor.ts
   - Tracks accountUnsubscribe errors in 30-second sliding window
   - Triggers container restart via flag file when 50+ errors detected
   - Prevents unnecessary restarts when SDK healthy

2. Drift Client:
   - Removed blind scheduleReconnection() and 2-hour timer
   - Added interceptWebSocketErrors() to catch SDK errors
   - Patches console.error to monitor for accountUnsubscribe patterns
   - Starts health monitor after successful initialization
   - Removed unused reconnect() method and reconnectTimer field

3. Health API (NEW):
   - GET /api/drift/health - Check current error count and health status
   - Returns: healthy boolean, errorCount, threshold, message
   - Useful for external monitoring and debugging

Impact:
- System only restarts when actual memory leak detected
- Prevents unnecessary downtime every 2 hours
- More targeted response to SDK issues
- Better operational stability

Files:
- lib/monitoring/drift-health-monitor.ts (NEW - 165 lines)
- lib/drift/client.ts (removed timer, added error interception)
- app/api/drift/health/route.ts (NEW - health check endpoint)

Testing:
- Health monitor starts on initialization: ✅
- API endpoint returns healthy status: ✅
- No blind reconnection scheduled: ✅
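The sliding-window check described under Health Monitor can be sketched roughly as follows. This is not the content of lib/monitoring/drift-health-monitor.ts, which is not shown in this diff. The class name `SlidingWindowErrorMonitor`, the `onUnhealthy` callback, and the injectable `now` parameter are assumptions made for illustration; only the 30-second window, the threshold of 50, and the `healthy` / `errorCount` / `threshold` fields come from the commit message.

```typescript
// Hypothetical sketch; names and structure are assumptions, not the real monitor.
interface HealthStatus {
  healthy: boolean
  errorCount: number
  threshold: number
}

class SlidingWindowErrorMonitor {
  private timestamps: number[] = []
  private restartRequested = false
  private readonly windowMs: number
  private readonly threshold: number
  private readonly onUnhealthy: () => void

  constructor(
    windowMs: number = 30_000, // 30-second window, per the commit message
    threshold: number = 50, // 50+ errors in the window counts as unhealthy
    onUnhealthy: () => void = () => {}, // e.g. write the restart flag file here
  ) {
    this.windowMs = windowMs
    this.threshold = threshold
    this.onUnhealthy = onUnhealthy
  }

  /** Record one error. `now` is injectable so the logic can be tested without timers. */
  recordError(now: number = Date.now()): void {
    this.timestamps.push(now)
    this.prune(now)
    // Fire the callback only once, the first time the threshold is reached.
    if (!this.restartRequested && this.timestamps.length >= this.threshold) {
      this.restartRequested = true
      this.onUnhealthy()
    }
  }

  /** Report the number of errors still inside the window. */
  getHealthStatus(now: number = Date.now()): HealthStatus {
    this.prune(now)
    const errorCount = this.timestamps.length
    return { healthy: errorCount < this.threshold, errorCount, threshold: this.threshold }
  }

  /** Drop timestamps that have aged out of the window. */
  private prune(now: number): void {
    const cutoff = now - this.windowMs
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift()
    }
  }
}
```

Keeping raw timestamps rather than a single counter lets old errors age out on their own, so a brief burst that stays under the threshold never accumulates into a restart.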
app/api/drift/health/route.ts
@@ -0,0 +1,29 @@
/**
 * Drift Health Check API
 *
 * GET /api/drift/health - Get current health status
 */

import { NextRequest, NextResponse } from 'next/server'
import { getDriftHealthMonitor } from '@/lib/monitoring/drift-health-monitor'

export async function GET(req: NextRequest) {
  try {
    const monitor = getDriftHealthMonitor()
    const status = monitor.getHealthStatus()

    return NextResponse.json({
      success: true,
      ...status,
      message: status.healthy
        ? 'Drift SDK connections healthy'
        : `Warning: ${status.errorCount} accountUnsubscribe errors detected`
    })

  } catch (error: any) {
    return NextResponse.json({
      success: false,
      error: error.message
    }, { status: 500 })
  }
}
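The route above only reports what the monitor has already counted. According to the commit message, errors reach the monitor because lib/drift/client.ts patches console.error and watches for accountUnsubscribe patterns. That file is not part of this diff, so the following is only a minimal sketch of the general technique. The helper name `interceptConsoleErrors`, its signature, and the restore function are assumptions, not the actual interceptWebSocketErrors() implementation.

```typescript
// Hypothetical helper: wraps console.error and reports messages matching a pattern.
// Pass a RegExp without the `g` flag, since `test()` on a global regex is stateful.
function interceptConsoleErrors(
  pattern: RegExp,
  onMatch: (message: string) => void,
): () => void {
  const original = console.error

  console.error = (...args: unknown[]): void => {
    // Flatten the arguments into one string so Error objects are matched too.
    const message = args
      .map((a) => (a instanceof Error ? a.message : String(a)))
      .join(' ')

    if (pattern.test(message)) {
      onMatch(message)
    }

    // Always forward to the original so normal logging is unaffected.
    original.apply(console, args)
  }

  // Return a restore function so the patch can be undone (on disconnect or in tests).
  return () => {
    console.error = original
  }
}
```

In this sketch the `onMatch` callback is where a monitor's error-recording method would be called, for example `interceptConsoleErrors(/accountUnsubscribe/, () => monitor.recordError())`, where `recordError` is likewise a hypothetical name.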