MAJOR FIXES:
- ATR-based trailing stop for runners (was fixed 0.3%, now adapts to volatility)
- Fixes runners with +7-9% MFE exiting for losses
- Typical improvement: 2.24x more room (0.3% → 0.67% at 0.45% ATR)
- Enhanced rate limit logging with database tracking
- New /api/analytics/rate-limits endpoint for monitoring

DETAILS:
- Position Manager: Calculate trailing as (atrAtEntry / price × 100) × multiplier
- Config: TRAILING_STOP_ATR_MULTIPLIER=1.5, MIN=0.25%, MAX=0.9%
- Settings UI: Added ATR multiplier controls
- Rate limits: Log hits/recoveries/exhaustions to SystemEvent table
- Documentation: ATR_TRAILING_STOP_FIX.md + RATE_LIMIT_MONITORING.md

IMPACT:
- Runners can now capture big moves (like morning's $172→$162 SOL drop)
- Rate limit visibility prevents silent failures
- Data-driven optimization for RPC endpoint health
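The trailing-stop formula in the commit message can be worked through with a short sketch. The function and interface names below are hypothetical, and treating MIN and MAX as a clamp on the result is an assumption; the commit message lists the values but does not show the Position Manager code.

```typescript
// Hypothetical helper; names and signature are illustrative, not the repository's code.
interface TrailingStopConfig {
  atrMultiplier: number; // TRAILING_STOP_ATR_MULTIPLIER (1.5 in the commit message)
  minPercent: number;    // assumed lower clamp, 0.25 (%)
  maxPercent: number;    // assumed upper clamp, 0.9 (%)
}

function trailingStopPercent(
  atrAtEntry: number,
  price: number,
  cfg: TrailingStopConfig,
): number {
  if (price <= 0) {
    throw new RangeError("price must be positive");
  }
  // (atrAtEntry / price × 100) × multiplier, as stated in the commit message.
  const atrPercent = (atrAtEntry / price) * 100;
  const raw = atrPercent * cfg.atrMultiplier;
  // Assumption: MIN and MAX bound the final trailing distance.
  return Math.min(cfg.maxPercent, Math.max(cfg.minPercent, raw));
}

const cfg: TrailingStopConfig = { atrMultiplier: 1.5, minPercent: 0.25, maxPercent: 0.9 };

// An ATR equal to 0.45% of price gives 0.45 × 1.5 = 0.675%, inside the 0.25-0.9 band.
console.log(trailingStopPercent(0.45, 100, cfg).toFixed(3));
```

With an ATR of 0.45% the result is about 0.675%, which the commit message quotes as 0.67%, a little over twice the old fixed 0.3% distance.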
Rate Limit Monitoring - SQL Queries
Quick Access
# View rate limit analytics via API
curl http://localhost:3001/api/analytics/rate-limits | python3 -m json.tool
# Direct database queries
docker exec trading-bot-postgres psql -U postgres -d trading_bot_v4
Common Queries
1. Recent Rate Limit Events (Last 24 Hours)
SELECT
"eventType",
message,
details,
TO_CHAR("createdAt", 'MM-DD HH24:MI:SS') as time
FROM "SystemEvent"
WHERE "eventType" IN ('rate_limit_hit', 'rate_limit_recovered', 'rate_limit_exhausted')
AND "createdAt" > NOW() - INTERVAL '24 hours'
ORDER BY "createdAt" DESC
LIMIT 20;
2. Rate Limit Statistics (Last 7 Days)
SELECT
"eventType",
COUNT(*) as occurrences,
MIN("createdAt") as first_seen,
MAX("createdAt") as last_seen
FROM "SystemEvent"
WHERE "eventType" IN ('rate_limit_hit', 'rate_limit_recovered', 'rate_limit_exhausted')
AND "createdAt" > NOW() - INTERVAL '7 days'
GROUP BY "eventType"
ORDER BY occurrences DESC;
3. Rate Limit Pattern by Hour (Find Peak Times)
SELECT
EXTRACT(HOUR FROM "createdAt") as hour,
COUNT(*) as rate_limit_hits,
COUNT(DISTINCT DATE("createdAt")) as days_affected
FROM "SystemEvent"
WHERE "eventType" = 'rate_limit_hit'
AND "createdAt" > NOW() - INTERVAL '7 days'
GROUP BY EXTRACT(HOUR FROM "createdAt")
ORDER BY rate_limit_hits DESC;
4. Recovery Time Analysis
SELECT
(details->>'retriesNeeded')::int as retries,
(details->>'totalTimeMs')::int as recovery_ms,
TO_CHAR("createdAt", 'MM-DD HH24:MI:SS') as recovered_at
FROM "SystemEvent"
WHERE "eventType" = 'rate_limit_recovered'
AND "createdAt" > NOW() - INTERVAL '7 days'
ORDER BY recovery_ms DESC;
5. Failed Recoveries (Exhausted Retries)
SELECT
details->>'errorMessage' as error,
(details->>'totalTimeMs')::int as failed_after_ms,
TO_CHAR("createdAt", 'MM-DD HH24:MI:SS') as failed_at
FROM "SystemEvent"
WHERE "eventType" = 'rate_limit_exhausted'
AND "createdAt" > NOW() - INTERVAL '7 days'
ORDER BY "createdAt" DESC;
6. Rate Limit Health Score (Last 24h)
SELECT
COUNT(CASE WHEN "eventType" = 'rate_limit_hit' THEN 1 END) as total_hits,
COUNT(CASE WHEN "eventType" = 'rate_limit_recovered' THEN 1 END) as recovered,
COUNT(CASE WHEN "eventType" = 'rate_limit_exhausted' THEN 1 END) as failed,
CASE
WHEN COUNT(CASE WHEN "eventType" = 'rate_limit_exhausted' THEN 1 END) > 0 THEN '🔴 CRITICAL'
WHEN COUNT(CASE WHEN "eventType" = 'rate_limit_hit' THEN 1 END) > 10 THEN '⚠️ WARNING'
ELSE '✅ HEALTHY'
END as health_status,
ROUND(100.0 * COUNT(CASE WHEN "eventType" = 'rate_limit_recovered' THEN 1 END) /
NULLIF(COUNT(CASE WHEN "eventType" = 'rate_limit_hit' THEN 1 END), 0), 1) as recovery_rate
FROM "SystemEvent"
WHERE "eventType" IN ('rate_limit_hit', 'rate_limit_recovered', 'rate_limit_exhausted')
AND "createdAt" > NOW() - INTERVAL '24 hours';
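If the counts are already available in application code (for example from the analytics endpoint), the same thresholds can be applied without a round trip to the database. The following is a minimal sketch; the type and function names are hypothetical and the shape of the input is an assumption, not the endpoint's documented response. It checks exhausted events before anything else so that a failed operation always surfaces as critical.

```typescript
// Hypothetical types; the field names are illustrative.
type HealthStatus = "HEALTHY" | "WARNING" | "CRITICAL";

interface RateLimitCounts {
  hits: number;      // rate_limit_hit events in the window
  recovered: number; // rate_limit_recovered events in the window
  exhausted: number; // rate_limit_exhausted events in the window
}

interface HealthReport {
  status: HealthStatus;
  recoveryRate: number | null; // percent, one decimal; null when there were no hits
}

function scoreRateLimitHealth(c: RateLimitCounts): HealthReport {
  let status: HealthStatus = "HEALTHY";
  if (c.exhausted > 0) {
    status = "CRITICAL"; // any exhausted retry means an operation failed outright
  } else if (c.hits > 10) {
    status = "WARNING";  // same threshold as the SQL query
  }
  // Mirrors ROUND(100.0 * recovered / NULLIF(hits, 0), 1).
  const recoveryRate =
    c.hits === 0 ? null : Math.round((1000 * c.recovered) / c.hits) / 10;
  return { status, recoveryRate };
}

console.log(scoreRateLimitHealth({ hits: 12, recovered: 12, exhausted: 0 }));
```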
What to Watch For
🔴 Critical Alerts
- rate_limit_exhausted events: Order placement/cancellation failed completely
- Recovery rate below 80%: System struggling to handle rate limits
- Multiple exhausted events in short time: RPC endpoint may be degraded
⚠️ Warnings
- More than 10 rate_limit_hit events per hour: High trading frequency
- Recovery times > 10 seconds: Backoff delays stacking up
- Rate limits during specific hours: Identify peak Solana network times
✅ Healthy Patterns
- 100% recovery rate: All rate limits handled successfully
- Recovery times 2-4 seconds: Retries working efficiently
- Zero rate_limit_exhausted events: No failed operations
Optimization Actions
If seeing frequent rate limits:
- Increase baseDelay in retryWithBackoff() (currently 2000ms)
- Add delay between cancelAllOrders() and placeExitOrders() (currently immediate)
- Consider using a faster RPC endpoint (Helius Pro, Triton, etc.)
- Batch order operations if possible
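Before changing the retry parameters, it helps to see how they affect total wait time. The sketch below assumes the delay before retry n is baseDelay × multiplier^(n-1); that formula is an assumption (the real retryWithBackoff() is not shown here), although it is consistent with the "retrying in 2s" and "Failed after 3 retries and 14523ms" log lines in this document. The function name is hypothetical.

```typescript
// Hypothetical helper: lists the wait before each retry under plain exponential backoff.
function backoffDelaysMs(
  baseDelayMs: number,
  multiplier: number,
  maxRetries: number,
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Assumed schedule: base, base × m, base × m², ...
    delays.push(baseDelayMs * Math.pow(multiplier, attempt));
  }
  return delays;
}

function totalMs(delays: number[]): number {
  return delays.reduce((sum, d) => sum + d, 0);
}

const currentSchedule = backoffDelaysMs(2000, 2, 3); // [2000, 4000, 8000]
const longerBase = backoffDelaysMs(3000, 2, 3);      // [3000, 6000, 12000]
const moreRetries = backoffDelaysMs(2000, 2, 5);     // [2000, 4000, 8000, 16000, 32000]

console.log(totalMs(currentSchedule)); // 14000
console.log(totalMs(longerBase));      // 21000
console.log(totalMs(moreRetries));     // 62000
```

Under this assumption, raising maxRetries from 3 to 5 more than quadruples the worst-case wait (14 s to 62 s), which matters when an exit order is pending.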
If seeing exhausted retries:
- Increase maxRetries from 3 to 5
- Increase exponential backoff multiplier (currently 2x)
- Check RPC endpoint health/status page
- Consider implementing circuit breaker pattern
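For the circuit breaker suggestion, a minimal sketch could look like the following. This is not code from the bot: the class name, the threshold and cooldown parameters, and the injectable clock are all assumptions chosen to keep the example small and testable.

```typescript
// Hypothetical circuit breaker: stops calling the RPC endpoint after repeated
// failures and allows a trial call once a cooldown has elapsed.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number,
    cooldownMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True when the circuit is closed, or when the cooldown has passed (half-open trial).
  canAttempt(): boolean {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  // A success closes the circuit and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Opens (or re-opens) the circuit once the threshold is reached.
  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }
}

// Example: open after 2 consecutive exhausted retries, then wait 30 s.
const breaker = new CircuitBreaker(2, 30000);
console.log(breaker.canAttempt());
```

A caller would check canAttempt() before placing or cancelling orders, call recordFailure() when retries are exhausted, and call recordSuccess() after any successful request.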
Live Monitoring Commands
# Watch rate limits in real-time
docker logs -f trading-bot-v4 | grep -i "rate limit"
# Count rate limit events today
docker exec trading-bot-postgres psql -U postgres -d trading_bot_v4 -c "
SELECT COUNT(*) FROM \"SystemEvent\"
WHERE \"eventType\" = 'rate_limit_hit'
AND DATE(\"createdAt\") = CURRENT_DATE;"
# Check latest rate limit event
docker exec trading-bot-postgres psql -U postgres -d trading_bot_v4 -c "
SELECT * FROM \"SystemEvent\"
WHERE \"eventType\" IN ('rate_limit_hit', 'rate_limit_recovered', 'rate_limit_exhausted')
ORDER BY \"createdAt\" DESC LIMIT 1;"
Integration with Alerts
When implementing automated alerts, trigger on:
- Any rate_limit_exhausted event (critical)
- More than 5 rate_limit_hit events in 5 minutes (warning)
- Recovery rate below 90% over 1 hour (warning)
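The three triggers above can be sketched as a pure function over recent events. The types and function name are hypothetical, timestamps are assumed to be epoch milliseconds, and evaluating "any exhausted event" over the last hour is an assumption, since the list does not give a window for it. Recovery rate is computed as recovered events divided by hit events, the same definition the health-score query uses.

```typescript
type EventType = "rate_limit_hit" | "rate_limit_recovered" | "rate_limit_exhausted";

interface RateLimitEvent {
  eventType: EventType;
  createdAt: number; // epoch milliseconds (assumed representation)
}

interface Alert {
  level: "critical" | "warning";
  reason: string;
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;
const ONE_HOUR_MS = 60 * 60 * 1000;

function evaluateAlerts(events: RateLimitEvent[], nowMs: number): Alert[] {
  // Count events of one type inside the lookback window ending at nowMs.
  const count = (type: EventType, windowMs: number): number =>
    events.filter(
      (e) => e.eventType === type && e.createdAt > nowMs - windowMs && e.createdAt <= nowMs,
    ).length;

  const alerts: Alert[] = [];

  // Assumption: exhausted events are looked up over the last hour.
  if (count("rate_limit_exhausted", ONE_HOUR_MS) > 0) {
    alerts.push({ level: "critical", reason: "rate_limit_exhausted event seen" });
  }
  if (count("rate_limit_hit", FIVE_MINUTES_MS) > 5) {
    alerts.push({ level: "warning", reason: "more than 5 rate_limit_hit events in 5 minutes" });
  }
  const hits = count("rate_limit_hit", ONE_HOUR_MS);
  const recovered = count("rate_limit_recovered", ONE_HOUR_MS);
  if (hits > 0 && (100 * recovered) / hits < 90) {
    alerts.push({ level: "warning", reason: "recovery rate below 90% over 1 hour" });
  }
  return alerts;
}
```

In practice the events array would come from a query against the SystemEvent table restricted to the last hour.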
Log format examples:
✅ Retry successful after 2341ms (1 retries)
⏳ Rate limited (429), retrying in 2s... (attempt 1/3)
❌ RATE LIMIT EXHAUSTED: Failed after 3 retries and 14523ms
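If alerts are driven from container logs rather than the database, the three example lines can be parsed with regular expressions. This is a sketch that matches only the formats shown above; the function and type names are hypothetical, and real log lines may carry prefixes or variations these patterns do not cover.

```typescript
type ParsedLog =
  | { kind: "recovered"; totalTimeMs: number; retries: number }
  | { kind: "retrying"; delaySeconds: number; attempt: number; maxAttempts: number }
  | { kind: "exhausted"; retries: number; totalTimeMs: number };

// Returns a structured record for a recognized rate limit log line, or null otherwise.
function parseRateLimitLog(line: string): ParsedLog | null {
  let m = /Retry successful after (\d+)ms \((\d+) retries\)/.exec(line);
  if (m) {
    return { kind: "recovered", totalTimeMs: Number(m[1]), retries: Number(m[2]) };
  }
  m = /Rate limited \(429\), retrying in (\d+)s\.\.\. \(attempt (\d+)\/(\d+)\)/.exec(line);
  if (m) {
    return {
      kind: "retrying",
      delaySeconds: Number(m[1]),
      attempt: Number(m[2]),
      maxAttempts: Number(m[3]),
    };
  }
  m = /RATE LIMIT EXHAUSTED: Failed after (\d+) retries and (\d+)ms/.exec(line);
  if (m) {
    return { kind: "exhausted", retries: Number(m[1]), totalTimeMs: Number(m[2]) };
  }
  return null;
}

console.log(parseRateLimitLog("❌ RATE LIMIT EXHAUSTED: Failed after 3 retries and 14523ms"));
```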