Comprehensive Trading Bot Improvement Plan

Generated: December 4, 2025
Analysis Period: Nov-Dec 2025
Methodology: Data-driven analysis via 8 systematic measurements


Executive Summary

Purpose: Comprehensive system-wide optimization analysis covering performance, size, code quality, infrastructure, and development velocity.

Methodology: 8 terminal commands + documentation review to establish quantitative baselines before recommendations. All findings are measurable with before/after metrics.

Key Discovery: System is healthy and well-architected, with optimization opportunities rather than critical problems. Current state: 10.88% CPU, 8.77% memory (179.7MiB), stable operation, 170+ successful trades.

Top 3 Priorities:

  1. Console.log Epidemic (731 statements) - Production logging overhead + storage costs
  2. Position Manager Refactor (1,945 lines) - Maintainability bottleneck
  3. Database Query Optimization (32 trade queries) - Performance concentration point

📊 Baseline Metrics (Current State)

Infrastructure

  • Docker Image: 1.32GB trading bot, 275MB postgres (5× size difference)
  • Memory Usage: 179.7MiB bot (8.77% of 2GB limit), 39.53MiB postgres (3.86%)
  • CPU Usage: 10.88% bot, 3.37% postgres
  • Disk Usage: 1.3GB total
    • 620MB node_modules (47.7%)
    • 221MB .next build (17.0%)
    • 79KB logs (minimal)
  • Database: 20MB (170+ trades, highly efficient)
  • Network I/O: 20.5GB received, 646MB sent (high read volume from RPC calls)

Build System

  • Build Time: 54.74s real time, 1m23s CPU time
  • Bundle Output: 102KB shared JS chunks
  • API Endpoints: 43 compiled successfully
  • Type Checking: Included in build process

Code Quality

  • Total Files Analyzed: 18 lib/ files with console statements
  • Console Statements: 731 unguarded (CRITICAL finding)
  • Timer/Interval Calls: 20 locations (monitoring overhead)
  • Database Queries: 62 total Prisma operations
  • Singleton Patterns: 5 getInstance/getInitialized implementations
  • Type-Only Imports: 49 imports missing the type-only modifier
  • JSON Operations: 14 stringify/parse locations
  • Total Exports: 93 across all lib/ files

Database Query Distribution

Table                  Query Count  Percentage  Category
prisma.trade           32           51.6%       PRIMARY DATA SINK
prisma.stopHunt        15           24.2%       Analysis feature
prisma.marketData      8            12.9%       Price tracking
prisma.blockedSignal   5            8.1%        Signal analysis
prisma.systemEvent     1            1.6%        Event logging
prisma.priceUpdate     1            1.6%        Historical data

File Complexity (Top 10)

  1. position-manager.ts - 1,945 lines (REFACTOR CANDIDATE)
  2. orders.ts - 922 lines
  3. trades.ts - 751 lines (32 Prisma calls concentrated)
  4. smart-entry-timer.ts - 717 lines
  5. blocked-signal-tracker.ts - 629 lines
  6. stop-hunt-tracker.ts - 616 lines (15 Prisma calls)
  7. client.ts - 496 lines
  8. init-position-manager.ts - 460 lines
  9. smart-validation-queue.ts - 458 lines
  10. signal-quality.ts - 339 lines

🎯 Category 1: Performance Optimizations

1.1 Console.log Production Overhead (CRITICAL)

Finding: 731 unguarded console statements across 18 files

Impact:

  • Performance: Each log statement = synchronous I/O blocking event loop
  • Storage: Persistent logs grow indefinitely (Docker volumes)
  • Security: Sensitive data may leak (API keys, private keys, account balances)
  • Observability: Signal-to-noise ratio degraded (important logs buried)
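
As a rough illustration of the event-loop cost, a minimal stand-alone sketch (not from the codebase; timings vary with where stdout is piped) can time a burst of logging:

// bench-console.ts - hypothetical micro-benchmark, run with ts-node or tsx
const iterations = 10_000

const start = process.hrtime.bigint()
for (let i = 0; i < iterations; i++) {
  console.log('debug: position check tick', i)  // synchronous write to stdout
}
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6

// Report on stderr so the measurement is not buried in the benchmarked output
console.error(`${iterations} console.log calls took ~${elapsedMs.toFixed(1)}ms`)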

Affected Files:

lib/trading/position-manager.ts          - Heavy logging
lib/drift/orders.ts                      - Order execution logs
lib/database/trades.ts                   - Database operation logs
lib/trading/stop-hunt-tracker.ts         - Analysis logs
lib/analysis/blocked-signal-tracker.ts   - Tracking logs
lib/trading/smart-validation-queue.ts    - Queue logs
lib/startup/init-position-manager.ts     - Initialization logs
lib/trading/smart-entry-timer.ts         - Timer logs
lib/drift/client.ts                      - SDK logs
lib/trading/signal-quality.ts            - Scoring logs
... (8 more files)

Solutions (3 Options):

Option A: Environment-Gated Logging (RECOMMENDED)

// lib/utils/logger.ts
export const logger = {
  debug: (...args: any[]) => {
    if (process.env.NODE_ENV === 'development' || process.env.DEBUG_LOGGING === 'true') {
      console.log(...args)
    }
  },
  info: (...args: any[]) => console.log(...args),
  warn: (...args: any[]) => console.warn(...args),
  error: (...args: any[]) => console.error(...args)
}

// Usage in files
import { logger } from '@/lib/utils/logger'
logger.debug('🔍 Position Manager state:', trade)  // Only in dev
logger.info('✅ Trade executed successfully')       // Always
logger.error('❌ Failed to close position:', error) // Always

Effort: 3-4 hours (create logger, find/replace across 18 files)
Impact: 90% log volume reduction, faster event loop, smaller log files
Risk: LOW - preserves info/warn/error logs, only gates debug logs
Priority: HIGH - Quick win with large impact

Option B: Structured JSON Logging

// lib/utils/logger.ts
import { createLogger, format, transports } from 'winston'

export const logger = createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: format.combine(
    format.timestamp(),
    format.errors({ stack: true }),
    format.json()
  ),
  transports: [
    new transports.File({ filename: 'logs/error.log', level: 'error' }),
    new transports.File({ filename: 'logs/combined.log' }),
    new transports.Console({
      format: format.combine(
        format.colorize(),
        format.simple()
      )
    })
  ]
})

Effort: 1 day (winston setup + migration + log rotation)
Impact: Queryable logs, automatic rotation, performance improvement
Risk: MEDIUM - Dependency addition, more complex than Option A
Priority: MEDIUM - Better long-term solution, more effort

Option C: Complete Removal

# Nuclear option - remove all console.logs
find lib/ -type f -name "*.ts" -exec sed -i '/console\.log/d' {} \;

Effort: 5 minutes
Impact: Maximum performance gain, no log overhead
Risk: HIGH - Lose all debugging capability, not recommended
Priority: LOW - Only for extreme cases

Recommendation: Implement Option A first (quick win), then migrate to Option B (winston) in Phase 2 or Phase 3.


1.2 Database Query Optimization

Finding: 32 trade queries (51.6% of all database operations) concentrated in trades.ts

Current Pattern:

// Potentially N+1 query pattern
const trades = await prisma.trade.findMany({ where: { exitReason: null } })
for (const trade of trades) {
  const stopHunt = await prisma.stopHunt.findFirst({ 
    where: { originalTradeId: trade.id } 
  })
}

Optimization Opportunities:

1.2.1 Batch Operations with Prisma include

// BEFORE (N+1 queries)
const trades = await prisma.trade.findMany()
const tradeIds = trades.map(t => t.id)
const stopHunts = await Promise.all(
  tradeIds.map(id => prisma.stopHunt.findFirst({ where: { originalTradeId: id } }))
)

// AFTER (Single query with join)
const trades = await prisma.trade.findMany({
  include: {
    stopHunt: true,  // Prisma joins automatically
    priceUpdates: true
  }
})

Effort: 2-3 hours (identify patterns, refactor queries)
Impact: 50-70% reduction in database round-trips
Risk: LOW - Prisma handles joins safely
Priority: HIGH - Significant performance gain
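
Note that include requires the relation to be declared in schema.prisma. If StopHunt is not modeled as a relation on Trade, a two-query batch with an in filter (sketch below, using the model and field names from the examples above) still removes the per-trade loop:

// Sketch: batch lookup without a schema relation - one round-trip per table
const openTrades = await prisma.trade.findMany({ where: { exitReason: null } })

const stopHunts = await prisma.stopHunt.findMany({
  where: { originalTradeId: { in: openTrades.map(t => t.id) } }
})

// Re-associate in memory instead of querying once per trade
const huntsByTrade = new Map(stopHunts.map(h => [h.originalTradeId, h]))
const tradesWithHunts = openTrades.map(t => ({ ...t, stopHunt: huntsByTrade.get(t.id) ?? null }))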

1.2.2 Database Indexing Audit

-- Analyze query patterns
EXPLAIN ANALYZE SELECT * FROM "Trade" WHERE "exitReason" IS NULL ORDER BY "createdAt" DESC;

-- Add missing indexes
CREATE INDEX CONCURRENTLY idx_trade_exit_reason ON "Trade"("exitReason") WHERE "exitReason" IS NULL;
CREATE INDEX CONCURRENTLY idx_trade_symbol_status ON "Trade"("symbol", "status");
CREATE INDEX CONCURRENTLY idx_stophunt_original_trade ON "StopHunt"("originalTradeId");

Effort: 4-5 hours (EXPLAIN ANALYZE all queries, add strategic indexes)
Impact: 2-5× query speed improvement on high-volume tables
Risk: LOW - Concurrent index creation doesn't block
Priority: MEDIUM - Scales with data growth
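
For before/after comparisons, the same plans can be captured from inside the app via Prisma's raw query API (sketch; the client import path is assumed from the pooling example in 1.2.3):

// Sketch: log the Postgres plan for the open-trades query so index impact is measurable
import { prisma } from '@/config/database'  // assumed export, see 1.2.3

export async function explainOpenTradesQuery(): Promise<void> {
  const plan = await prisma.$queryRawUnsafe<Array<Record<string, string>>>(
    'EXPLAIN ANALYZE SELECT * FROM "Trade" WHERE "exitReason" IS NULL ORDER BY "createdAt" DESC'
  )
  for (const row of plan) console.log(Object.values(row)[0])
}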

1.2.3 Connection Pooling Optimization

// config/database.ts
export const prisma = new PrismaClient({
  datasources: {
    db: {
      url: process.env.DATABASE_URL
    }
  },
  // CURRENT: No explicit pool config (uses defaults)
  // OPTIMIZED: Explicit pool sizing
  log: process.env.NODE_ENV === 'development' ? ['query', 'error', 'warn'] : ['error'],
})

// Add to .env
DATABASE_CONNECTION_LIMIT=10      # Default: 10 (appropriate for single bot)
DATABASE_POOL_TIMEOUT=30          # Seconds before connection timeout
DATABASE_STATEMENT_TIMEOUT=60000  # Milliseconds for slow query alerts

Effort: 1 hour (config adjustment + monitoring)
Impact: Prevents connection exhaustion under load
Risk: LOW - Tuning existing infrastructure
Priority: LOW - System stable, revisit if scaling
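
Prisma reads pool sizing from connection-string parameters, so the proposed variables can be folded into the datasource URL at startup (sketch; the env var names follow the .env example above, and DATABASE_STATEMENT_TIMEOUT would still be enforced separately):

// Sketch: translate the proposed env vars into Prisma connection-string parameters
import { PrismaClient } from '@prisma/client'

function buildDatabaseUrl(): string {
  const url = new URL(process.env.DATABASE_URL as string)
  url.searchParams.set('connection_limit', process.env.DATABASE_CONNECTION_LIMIT ?? '10')
  url.searchParams.set('pool_timeout', process.env.DATABASE_POOL_TIMEOUT ?? '30')
  return url.toString()
}

export const prisma = new PrismaClient({
  datasources: { db: { url: buildDatabaseUrl() } },
  log: process.env.NODE_ENV === 'development' ? ['query', 'error', 'warn'] : ['error'],
})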


1.3 TypeScript Compilation Optimization

Finding: 49 imports lack the type keyword, so type-only symbols are needlessly carried into the runtime bundle

Current:

import { TradingConfig, MarketConfig } from '@/config/trading'  // ❌ Both in runtime

Optimized:

import type { TradingConfig, MarketConfig } from '@/config/trading'  // ✅ Type-only

Benefits:

  • Faster TypeScript compilation (skip emitting type imports)
  • Smaller runtime bundle (types erased completely)
  • Better tree-shaking (unused types don't block dead code elimination)

Implementation:

# Automated fix with ts-morph or ESLint rule
npm install --save-dev @typescript-eslint/eslint-plugin

# .eslintrc.json
{
  "rules": {
    "@typescript-eslint/consistent-type-imports": ["error", {
      "prefer": "type-imports",
      "disallowTypeAnnotations": false
    }]
  }
}

# Run fix
npx eslint lib/ --fix

Effort: 30 minutes (ESLint rule + automated fix)
Impact: 5-10% TypeScript compilation speedup, cleaner bundle
Risk: NONE - Pure syntax change, no runtime behavior
Priority: HIGH - Quick win, low effort


1.4 Timer/Interval Consolidation

Finding: 20 separate setInterval/setTimeout calls across monitoring systems

Current Architecture:

// position-manager.ts
setInterval(monitorPrices, 2000)  // Every 2 seconds

// blocked-signal-tracker.ts
setInterval(trackSignals, 5 * 60 * 1000)  // Every 5 minutes

// stop-hunt-tracker.ts
setInterval(checkRevenge, 30 * 1000)  // Every 30 seconds

// smart-validation-queue.ts
setInterval(validateQueue, 30 * 1000)  // Every 30 seconds

// drift-health-monitor.ts
setInterval(checkHealth, 5 * 60 * 1000)  // Every 5 minutes

Optimization: Event-Driven Architecture

// lib/utils/event-emitter.ts
import { EventEmitter } from 'events'

export const systemEvents = new EventEmitter()

// Emit events instead of polling
systemEvents.emit('price:update', { symbol: 'SOL-PERP', price: 142.50 })
systemEvents.emit('trade:opened', { tradeId: '...' })
systemEvents.emit('trade:closed', { tradeId: '...' })

// Subscribers react to events
systemEvents.on('price:update', (data) => {
  positionManager.checkConditions(data)
  validationQueue.checkSignals(data)
})

// Keep minimal polling for external state
setInterval(async () => {
  // Query Drift once, emit events to all subscribers
  const price = await driftService.getPrice()
  systemEvents.emit('price:update', { price })
}, 2000)

Benefits:

  • Single price query instead of 4-5 separate queries
  • Lower RPC call volume
  • Faster response time (event-driven vs polling)
  • Easier to add new monitoring features
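
If this route is taken, a thin typed wrapper around the emitter (sketch; event names mirror the examples above) keeps payload shapes consistent as subscribers are added:

// lib/utils/event-emitter.ts - sketch of a typed event map (payload shapes assumed)
import { EventEmitter } from 'events'

interface SystemEvents {
  'price:update': { symbol: string; price: number }
  'trade:opened': { tradeId: string }
  'trade:closed': { tradeId: string }
}

class TypedEmitter {
  private emitter = new EventEmitter()

  emit<K extends keyof SystemEvents>(event: K, payload: SystemEvents[K]): void {
    this.emitter.emit(event, payload)
  }

  on<K extends keyof SystemEvents>(event: K, handler: (payload: SystemEvents[K]) => void): void {
    this.emitter.on(event, handler)
  }
}

export const systemEvents = new TypedEmitter()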

Effort: 1-2 days (refactor monitoring architecture)
Impact: 50-70% reduction in RPC calls, lower CPU usage
Risk: MEDIUM - Architectural change, needs thorough testing
Priority: MEDIUM - High impact but requires design work


📦 Category 2: Size Optimizations

2.1 Docker Image Investigation (CRITICAL)

Finding: 1.32GB trading bot vs 275MB postgres (5× size difference)

Analysis Blocked: docker history trading-bot-v4:latest failed (image likely named traderv4_trading-bot-v4 or traderv4-trading-bot)

Investigation Steps:

# 1. Find correct image name
docker images | grep trading-bot
docker images | grep traderv4

# 2. Analyze layer sizes
docker history <CORRECT_IMAGE_NAME> --human --no-trunc

# 3. Dive into image
docker run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  wagoodman/dive:latest <CORRECT_IMAGE_NAME>

Common Culprits (Hypothesis):

  • Node modules cached in layers (620MB × multiple layers)
  • .next build artifacts in intermediate stages
  • Dev dependencies included in production image
  • Prisma client generated multiple times
  • Large Solana/Drift SDK dependencies

Target Size: 600-800MB (50% reduction)

Dockerfile Optimization Pattern:

# Multi-stage build (already implemented)
FROM node:20-alpine AS deps
# Install ONLY production dependencies
COPY package.json package-lock.json ./
RUN npm ci --only=production

FROM node:20-alpine AS builder
# Install ALL dependencies for build
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npx prisma generate
RUN npm run build

FROM node:20-alpine AS runner
# Copy ONLY production artifacts
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/.next ./.next
COPY --from=builder /app/prisma ./prisma
# ❌ DON'T COPY: source files, dev dependencies, build cache

Effort: 2-3 hours (analyze + optimize Dockerfile)
Impact: 50% image size reduction, faster deployments
Risk: LOW - Multi-stage already present, just optimization
Priority: HIGH - Significant infrastructure win


2.2 Node Modules Audit

Finding: 620MB node_modules (47.7% of total disk usage)

Analysis:

# Analyze dependency tree
npx depcheck  # Find unused dependencies
npx npm-check-updates  # Check outdated packages
npx du-cli node_modules  # Size breakdown by package

# Check for duplicate dependencies
npm dedupe
npm prune

# Analyze bundle impact
npx webpack-bundle-analyzer .next/analyze.json

Common Optimizations:

  1. Remove dev dependencies from production:

    // package.json - Move to devDependencies
    {
      "devDependencies": {
        "@types/*": "...",
        "eslint": "...",
        "typescript": "..."
      }
    }
    
  2. Replace heavy dependencies:

    • moment (288KB) → date-fns (78KB) or native Intl.DateTimeFormat (see the sketch below)
    • Full lodash → Individual imports (lodash.debounce)
    • Check if @drift-labs/sdk has lighter alternatives
  3. Audit Solana dependencies:

    npm ls @solana/web3.js  # Check if duplicated
    npm ls bs58              # Check usage patterns
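
For the moment replacement mentioned in item 2, simple timestamps can be produced with the built-in Intl API; a sketch (locale and options chosen for illustration only):

// Sketch: dependency-free timestamp formatting via Intl.DateTimeFormat
const formatTimestamp = (date: Date): string =>
  new Intl.DateTimeFormat('en-CA', {
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', second: '2-digit',
    hour12: false, timeZone: 'UTC',
  }).format(date)

console.log(formatTimestamp(new Date()))  // e.g. "2025-12-04, 19:56:17"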
    

Effort: 3-4 hours (audit + replace + test)
Impact: 20-30% node_modules size reduction (600MB → 420-480MB)
Risk: MEDIUM - Dependency changes need regression testing
Priority: MEDIUM - Good housekeeping, not urgent


2.3 Build Artifact Optimization

Finding: .next build 221MB (17% of total disk)

Analysis:

# Analyze bundle composition
npx @next/bundle-analyzer

# Check for unnecessary includes
ls -lh .next/standalone/
ls -lh .next/static/chunks/

Optimizations:

// next.config.js
module.exports = {
  // Enable SWC minification (already likely enabled in Next.js 15)
  swcMinify: true,
  
  // Optimize image loading
  images: {
    formats: ['image/webp', 'image/avif'],
    minimumCacheTTL: 60 * 60 * 24 * 7, // 7 days
  },
  
  // Remove source maps in production
  productionBrowserSourceMaps: false,
  
  // Optimize standalone output
  output: 'standalone',
  
  // Webpack optimizations
  webpack: (config, { dev, isServer }) => {
    if (!dev && !isServer) {
      // Bundle analyzer in CI only
      if (process.env.ANALYZE === 'true') {
        const { BundleAnalyzerPlugin } = require('webpack-bundle-analyzer')
        config.plugins.push(new BundleAnalyzerPlugin())
      }
      
      // Split chunks aggressively
      config.optimization.splitChunks = {
        chunks: 'all',
        cacheGroups: {
          default: false,
          vendors: false,
          // Separate Drift/Solana bundles
          driftVendor: {
            name: 'drift-vendor',
            test: /[\\/]node_modules[\\/](@drift-labs|@solana)[\\/]/,
            priority: 10,
          },
          // Separate React/Next bundles
          framework: {
            name: 'framework',
            test: /[\\/]node_modules[\\/](react|react-dom|next)[\\/]/,
            priority: 20,
          }
        }
      }
    }
    return config
  }
}

Effort: 2 hours (config tuning + build testing)
Impact: 10-15% build artifact reduction
Risk: LOW - Standard Next.js optimization patterns
Priority: LOW - Build already efficient (54.74s)


🧹 Category 3: Code Quality & Maintainability

3.1 Position Manager Refactor (HIGHEST COMPLEXITY)

Finding: position-manager.ts at 1,945 lines (LARGEST file in codebase)

Current Structure:

// lib/trading/position-manager.ts (1,945 lines)
class PositionManager {
  // Price monitoring (lines 1-400)
  private monitoringInterval: NodeJS.Timeout | null
  private async monitorPositions(): Promise<void> { /* 200+ lines */ }
  
  // Trade lifecycle (lines 401-800)
  async addTrade(trade: ActiveTrade): Promise<void> { /* ... */ }
  async executeExit(trade: ActiveTrade, ...): Promise<void> { /* 300+ lines */ }
  
  // TP/SL logic (lines 801-1200)
  private shouldTakeProfit1(): boolean { /* ... */ }
  private shouldTakeProfit2(): boolean { /* ... */ }
  private shouldStopLoss(): boolean { /* ... */ }
  
  // External closure handling (lines 1201-1600)
  private async handleExternalClosure(): Promise<void> { /* 200+ lines */ }
  
  // Ghost detection (lines 1601-1945)
  private async validatePositions(): Promise<void> { /* ... */ }
}

Proposed Refactor (Modular Architecture):

// lib/trading/position-manager/index.ts (200 lines)
export class PositionManager {
  private monitor: PriceMonitor
  private lifecycle: TradeLifecycle
  private exitStrategy: ExitStrategy
  private validator: PositionValidator
  
  constructor() {
    this.monitor = new PriceMonitor(this)
    this.lifecycle = new TradeLifecycle(this)
    this.exitStrategy = new ExitStrategy(this)
    this.validator = new PositionValidator(this)
  }
}

// lib/trading/position-manager/price-monitor.ts (300 lines)
export class PriceMonitor {
  async startMonitoring(): Promise<void> { /* ... */ }
  async checkTradeConditions(trade: ActiveTrade, price: number): Promise<void> { /* ... */ }
}

// lib/trading/position-manager/trade-lifecycle.ts (400 lines)
export class TradeLifecycle {
  async addTrade(trade: ActiveTrade): Promise<void> { /* ... */ }
  async removeTrade(tradeId: string): Promise<void> { /* ... */ }
  async handleTradeUpdate(trade: ActiveTrade): Promise<void> { /* ... */ }
}

// lib/trading/position-manager/exit-strategy.ts (500 lines)
export class ExitStrategy {
  async executeExit(trade: ActiveTrade, percent: number, reason: string): Promise<void> { /* ... */ }
  shouldTakeProfit1(price: number, trade: ActiveTrade): boolean { /* ... */ }
  shouldTakeProfit2(price: number, trade: ActiveTrade): boolean { /* ... */ }
  shouldStopLoss(price: number, trade: ActiveTrade): boolean { /* ... */ }
}

// lib/trading/position-manager/position-validator.ts (300 lines)
export class PositionValidator {
  async validatePositions(): Promise<void> { /* ... */ }
  async handleExternalClosure(trade: ActiveTrade, reason: string): Promise<void> { /* ... */ }
  async detectGhostPositions(): Promise<void> { /* ... */ }
}

// lib/trading/position-manager/types.ts (100 lines)
export interface ActiveTrade { /* ... */ }
export interface PriceUpdate { /* ... */ }
export interface ExitResult { /* ... */ }

Benefits:

  • Testability: Each module independently testable
  • Readability: 300-500 line files instead of 1,945 line monolith
  • Maintainability: Clear separation of concerns
  • Extensibility: Easy to add new exit strategies or validation logic
  • Collaboration: Multiple developers can work on different modules

Migration Strategy (Zero Downtime):

  1. Phase 1: Create new modular structure alongside existing (1 day)
  2. Phase 2: Move PriceMonitor logic, test thoroughly (2 days)
  3. Phase 3: Move TradeLifecycle logic, test thoroughly (2 days)
  4. Phase 4: Move ExitStrategy logic, test thoroughly (3 days)
  5. Phase 5: Move PositionValidator logic, test thoroughly (2 days)
  6. Phase 6: Remove old monolithic file, update imports (1 day)

Effort: 11 days (staged migration with testing)
Impact: Dramatically improved maintainability, easier to add features
Risk: HIGH - Core trading logic, requires extensive testing
Priority: MEDIUM - Important but not urgent, system currently stable

Testing Requirements:

  • Unit tests for each new module (90%+ coverage)
  • Integration tests for full lifecycle
  • Shadow testing: Run both old and new side-by-side for 50-100 trades
  • Rollback plan if any issues detected

3.2 Export Tree-Shaking Audit

Finding: 93 exports across lib/ files - potential unused exports

Analysis:

# Find unused exports
npx ts-prune | grep -v "(used in module)"

# Analyze import patterns
grep -r "export" lib/ | wc -l  # Total exports
grep -r "import.*from '@/lib" app/ | wc -l  # Total imports

# Check for circular dependencies
npx madge --circular --extensions ts,tsx lib/

Common Patterns:

// lib/utils/helpers.ts
export const formatPrice = (price: number) => { /* ... */ }  // ✅ Used 15 times
export const formatDate = (date: Date) => { /* ... */ }      // ✅ Used 8 times
export const calculateFibonacci = (n: number) => { /* ... */ } // ❌ Never used

// Action: Remove unused exports
// npx ts-prune will identify these automatically

Implementation:

# 1. Identify unused exports
npx ts-prune > unused-exports.txt

# 2. Review manually (some false positives)
cat unused-exports.txt

# 3. Remove confirmed unused exports
# Manual deletion or automated with jscodeshift

# 4. Verify bundle size reduction
npm run build
# Check .next/static/chunks/ size before/after

Effort: 2-3 hours (analysis + removal + verification)
Impact: 5-10% bundle size reduction, cleaner codebase
Risk: LOW - Unused code doesn't affect runtime
Priority: LOW - Nice to have, not performance critical


3.3 Circular Dependency Resolution

Finding: 5 singleton patterns (potential circular dependency risk)

Current Patterns:

// lib/drift/client.ts
export function getDriftService() {
  if (!driftServiceInstance) {
    driftServiceInstance = new DriftService()
  }
  return driftServiceInstance
}

// lib/database/trades.ts imports from lib/drift/client.ts
import { getDriftService } from '@/lib/drift/client'

// lib/drift/client.ts imports from lib/database/trades.ts (potential circular)
import { saveTrade } from '@/lib/database/trades'

Detection:

# Visualize dependency graph
npx madge --circular --extensions ts,tsx lib/ --image deps.svg

# Text output
npx madge --circular --extensions ts,tsx lib/

Resolution Strategies:

Option A: Dependency Injection

// lib/drift/client.ts
export class DriftService {
  constructor(private database?: DatabaseService) {}
  
  async closePosition(params: CloseParams) {
    const result = await this.executeClose(params)
    // Don't save to database here
    return result
  }
}

// lib/trading/position-manager.ts
const driftService = await initializeDriftService()
const result = await driftService.closePosition(params)
await createTrade(result)  // Database save happens at higher level

Option B: Event-Driven Decoupling

// lib/utils/events.ts
export const tradeEvents = new EventEmitter()

// lib/drift/client.ts
async closePosition(params: CloseParams) {
  const result = await this.executeClose(params)
  tradeEvents.emit('position:closed', result)
  return result
}

// lib/database/trades.ts
tradeEvents.on('position:closed', async (result) => {
  await createTrade(result)
})

Effort: 1-2 days (refactor dependency chains)
Impact: Cleaner architecture, easier to test, fewer runtime errors
Risk: MEDIUM - Architectural change, needs careful testing
Priority: LOW - System stable, revisit during major refactors


🏗️ Category 4: Infrastructure Efficiency

4.1 Monitoring Overhead Reduction

Finding: 20 timer/interval calls across monitoring systems (covered in 1.4)

Additional Optimization: Adaptive Polling

// lib/trading/position-manager.ts
class PositionManager {
  private baseInterval = 2000  // 2 seconds baseline
  private adaptiveInterval = 2000
  
  private adjustPollingRate() {
    const activeTradeCount = this.activeTrades.size
    
    if (activeTradeCount === 0) {
      // No trades: Check every 30 seconds
      this.adaptiveInterval = 30000
    } else if (activeTradeCount <= 2) {
      // Few trades: Normal 2-second polling
      this.adaptiveInterval = 2000
    } else {
      // Many trades: More aggressive 1-second polling
      this.adaptiveInterval = 1000
    }
    
    // Restart interval with new rate
    this.restartMonitoring()
  }
}
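
The sketch above assumes a restartMonitoring helper that re-arms the interval; a minimal version (member names taken from the existing class outline in section 3.1, otherwise hypothetical) could be:

// Sketch: re-arm the polling loop whenever adaptiveInterval changes
// (monitoringInterval and monitorPositions are the existing members listed in section 3.1)
private restartMonitoring(): void {
  if (this.monitoringInterval) clearInterval(this.monitoringInterval)
  this.monitoringInterval = setInterval(() => void this.monitorPositions(), this.adaptiveInterval)
}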

Effort: 2 hours (implement adaptive polling)
Impact: 50-80% CPU reduction when idle, faster response when active
Risk: LOW - Graceful degradation, monitoring continues
Priority: LOW - System CPU already low (10.88%)


4.2 RPC Call Pattern Optimization

Finding: 20.5GB network received (high read volume from Solana RPC)

Analysis Needed:

# Monitor RPC call frequency
docker logs -f trading-bot-v4 | grep -i "rpc\|solana\|drift" | pv -l -i 10 > /dev/null

# Check for rate limiting
docker logs -f trading-bot-v4 | grep "429\|rate limit"

# Analyze call patterns
# - How many calls per second during monitoring?
# - Are we polling when we should use WebSockets?
# - Are we caching oracle prices adequately?

Optimization Opportunities:

  1. Oracle Price Caching:

    // lib/pyth/price-monitor.ts
    private priceCache = new Map<string, { price: number, timestamp: number }>()
    private CACHE_TTL = 2000  // 2 seconds
    
    async getPrice(symbol: string): Promise<number> {
      const cached = this.priceCache.get(symbol)
      if (cached && Date.now() - cached.timestamp < this.CACHE_TTL) {
        return cached.price  // Return cached, avoid RPC call
      }
    
      const fresh = await this.fetchPrice(symbol)
      this.priceCache.set(symbol, { price: fresh, timestamp: Date.now() })
      return fresh
    }
    
  2. Batch RPC Requests:

    // Instead of 5 separate calls
    const price1 = await getOraclePrice('SOL-PERP')
    const price2 = await getOraclePrice('ETH-PERP')
    const price3 = await getOraclePrice('BTC-PERP')
    
    // Single batched call
    const prices = await batchGetOraclePrices(['SOL-PERP', 'ETH-PERP', 'BTC-PERP'])
    
  3. WebSocket vs Polling:

    // Current: Polling every 2 seconds
    setInterval(() => getPrice(), 2000)
    
    // Better: WebSocket subscription (if supported by Pyth)
    pythClient.subscribeToPriceUpdates('SOL-PERP', (price) => {
      systemEvents.emit('price:update', { price })
    })
    

Effort: 1-2 days (implement caching + batching + WebSocket investigation)
Impact: 30-50% RPC call reduction, lower network I/O
Risk: LOW - Graceful degradation if cache stale
Priority: MEDIUM - RPC costs scale with usage


🚀 Category 5: Development Velocity

5.1 Build Time Optimization

Finding: 54.74s build time (baseline established)

Analysis:

# Profile build steps
time npm run build 2>&1 | tee build-profile.log

# Check which step takes longest:
# - Prisma generation
# - TypeScript compilation
# - Next.js build
# - Bundle optimization

Optimizations:

5.1.1 Incremental TypeScript Builds

// tsconfig.json
{
  "compilerOptions": {
    "incremental": true,
    "tsBuildInfoFile": ".tsbuildinfo"
  }
}

5.1.2 Parallel Processing

// next.config.js
module.exports = {
  experimental: {
    workerThreads: true,
    cpus: 4  // Use 4 CPU cores for build
  }
}

5.1.3 Build Cache (Turborepo/Nx)

# Install Turborepo for advanced caching
npm install turbo --save-dev

# turbo.json
{
  "$schema": "https://turbo.build/schema.json",
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": [".next/**", "!.next/cache/**"]
    }
  }
}

Target: 54.74s → 25-30s (50% reduction with caching)

Effort: 4-5 hours (implement incremental builds + caching)
Impact: 50% faster builds during development
Risk: LOW - Standard Next.js optimization patterns
Priority: LOW - Build already reasonable for size


5.2 Hot Reload Performance

Finding: Not yet measured (needs investigation)

Measurement:

# Time to see changes in browser after file save
echo "export const test = 1" >> lib/utils/test.ts
# ...then note how long the dev server / browser takes to pick up the change

# Check HMR bundle size
ls -lh .next/static/webpack/

# Monitor memory during development
watch -n 1 'ps aux | grep next-dev'

Common Issues:

  • Large files cause full page reload instead of HMR
  • Too many watched files slow down file system monitoring
  • Memory leaks in dev server over time

Optimizations:

// next.config.js
module.exports = {
  // Reduce watch overhead
  webpack: (config) => {
    config.watchOptions = {
      poll: 1000,  // Check for changes every 1s instead of inotify
      aggregateTimeout: 300,  // Wait 300ms before rebuilding
      ignored: [
        '**/node_modules/**',
        '**/.next/**',
        '**/logs/**',
        '**/prisma/.migrations/**'
      ]
    }
    return config
  }
}

Effort: 2 hours (measure + optimize)
Impact: Faster development iteration
Risk: NONE - Dev environment only
Priority: LOW - Only affects development workflow


📋 Implementation Roadmap

Phase 1: Quick Wins (1-2 weeks)

Goal: Maximum impact with minimal effort and risk

Task                               Effort  Impact  Priority  Owner
1.1 Console.log Gating (Option A)  4h      HIGH    CRITICAL  Backend
1.3 Type-Only Imports              30m     MEDIUM  HIGH      Backend
2.1 Docker Image Investigation     3h      HIGH    HIGH      DevOps
3.2 Export Tree-Shaking            3h      LOW     MEDIUM    Backend

Expected Results:

  • 90% log volume reduction
  • 5-10% compilation speedup
  • 50% Docker image size reduction
  • Cleaner codebase

Risk: LOW - All changes are optimizations without functional changes


Phase 2: Medium Initiatives (2-4 weeks)

Goal: Performance improvements requiring deeper changes

Task                               Effort  Impact  Priority  Owner
1.2.1 Database Query Batching      3h      HIGH    HIGH      Backend
1.2.2 Database Indexing            5h      MEDIUM  MEDIUM    Database
1.4 Timer/Interval Consolidation   2d      MEDIUM  MEDIUM    Backend
2.2 Node Modules Audit             4h      MEDIUM  MEDIUM    DevOps
4.2 RPC Call Optimization          2d      MEDIUM  MEDIUM    Backend

Expected Results:

  • 50% database query reduction
  • 30% RPC call reduction
  • 20% node_modules size reduction
  • Lower CPU and network usage

Risk: MEDIUM - Requires testing, affects runtime behavior


Phase 3: Long-Term Projects (1-3 months)

Goal: Architectural improvements for scalability

Task                                Effort  Impact  Priority  Owner
1.1 Winston Structured Logging      1d      MEDIUM  MEDIUM    Backend
3.1 Position Manager Refactor       11d     HIGH    MEDIUM    Backend
3.3 Circular Dependency Resolution  2d      LOW     LOW       Backend
5.1 Build Time Optimization         5h      LOW     LOW       DevOps

Expected Results:

  • Queryable structured logs
  • Modular, maintainable codebase
  • Faster builds during development
  • Foundation for future features

Risk: HIGH - Major architectural changes, requires extensive testing


📊 Success Metrics

Before (Baseline - Dec 4, 2025)

Metric                     Value       Category
Console.log Statements     731         Code Quality
Build Time                 54.74s      Performance
Docker Image Size          1.32GB      Infrastructure
Node Modules Size          620MB       Infrastructure
Database Queries (Trade)   32          Performance
Position Manager Lines     1,945       Maintainability
Type-Only Imports          49 missing  Code Quality
CPU Usage                  10.88%      Performance
Memory Usage               179.7MiB    Performance

After Phase 1 (Target - Dec 18, 2025)

Metric                     Target             Improvement
Console.log Statements     ~73 (90% gated)    90% reduction
Build Time                 52-53s             3-4% faster
Docker Image Size          600-700MB          45-53% reduction
Node Modules Size          620MB (unchanged)  -
Database Queries (Trade)   32 (unchanged)     -
Position Manager Lines     1,945 (unchanged)  -
Type-Only Imports          0 missing          100% compliant
CPU Usage                  10-11%             Similar
Memory Usage               160-170MiB         5-10% reduction

After Phase 2 (Target - Jan 15, 2026)

Metric                     Target             Improvement
Console.log Statements     ~73 (gated)        90% reduction
Build Time                 50-52s             5-9% faster
Docker Image Size          600-700MB          45-53% reduction
Node Modules Size          480-500MB          20-23% reduction
Database Queries (Trade)   15-20              38-53% reduction
Position Manager Lines     1,945 (unchanged)  -
Type-Only Imports          0 missing          100% compliant
CPU Usage                  8-9%               18-27% reduction
Memory Usage               150-160MiB         11-17% reduction

After Phase 3 (Target - Mar 1, 2026)

Metric                     Target             Improvement
Console.log Statements     0 (Winston only)   100% removed
Build Time                 25-30s             45-54% faster
Docker Image Size          600-700MB          45-53% reduction
Node Modules Size          480-500MB          20-23% reduction
Database Queries (Trade)   15-20              38-53% reduction
Position Manager Lines     ~800 (refactored)  59% reduction
Type-Only Imports          0 missing          100% compliant
CPU Usage                  7-8%               27-36% reduction
Memory Usage               140-150MiB         17-22% reduction

⚠️ Risk Mitigation

Trading System Constraints

Context: Real-money trading system ($540 capital, targeting $2,500)

Critical Requirements:

  1. Win Rate Preservation: Cannot drop below 60% during optimizations
  2. Dual-Layer Redundancy: On-chain orders + Position Manager monitoring must remain
  3. ATR-Based TP/SL: Dynamic targets must remain functional
  4. Database Integrity: 170+ historical trades must be preserved
  5. Zero Downtime: System must stay operational during migrations

Mitigation Strategies:

1. Shadow Testing:

// Run new code alongside old code, compare results
const oldResult = await legacyPositionManager.shouldExit(trade)
const newResult = await refactoredPositionManager.shouldExit(trade)

if (oldResult !== newResult) {
  console.error('DIVERGENCE DETECTED:', { old: oldResult, new: newResult })
  // Use old result, log for investigation
  return oldResult
}

2. Feature Flags:

// .env
USE_REFACTORED_POSITION_MANAGER=false
USE_STRUCTURED_LOGGING=false
USE_QUERY_BATCHING=false

// Runtime toggle without deployment
if (process.env.USE_REFACTORED_POSITION_MANAGER === 'true') {
  return new RefactoredPositionManager()
} else {
  return new LegacyPositionManager()
}

3. Rollback Plan:

# Before major changes
git tag v1.0.0-pre-refactor
docker tag trading-bot-v4:latest trading-bot-v4:v1.0.0-pre-refactor

# If issues detected
git checkout v1.0.0-pre-refactor
docker compose up -d --force-recreate trading-bot

# Verify rollback successful
curl http://localhost:3001/api/health

4. Comprehensive Testing:

  • Unit Tests: 90%+ coverage for new modules
  • Integration Tests: Full trade lifecycle (open → TP1 → TP2 → close)
  • Load Tests: 50-100 trades with new code before declaring stable
  • Regression Tests: Ensure old functionality preserved

5. Gradual Rollout:

// Example: Phased database query migration
const MIGRATION_PERCENTAGE = parseInt(process.env.QUERY_MIGRATION_PERCENT || '0')

async function getTrades() {
  const shouldUseBatched = Math.random() * 100 < MIGRATION_PERCENTAGE
  
  if (shouldUseBatched) {
    return await getTradesBatched()  // New optimized version
  } else {
    return await getTradesLegacy()   // Old proven version
  }
}

// Start: QUERY_MIGRATION_PERCENT=10 (10% of queries)
// Week 1: Increase to 50%
// Week 2: Increase to 100%

📚 Documentation Updates Required

After Phase 1:

  • Update .github/copilot-instructions.md with:
    • New logger utility usage patterns
    • Type-only import conventions
    • Docker optimization results
    • Updated baseline metrics

After Phase 2:

  • Create docs/QUERY_OPTIMIZATION_GUIDE.md:
    • Batching patterns
    • Index strategy
    • Performance benchmarks
  • Update docs/OPTIMIZATION_MASTER_ROADMAP.md:
    • Phase 1-2 completion status
    • Measured improvements
    • Lessons learned

After Phase 3:

  • Create docs/POSITION_MANAGER_ARCHITECTURE.md:
    • Modular design rationale
    • Module responsibilities
    • Testing strategies
    • Migration history
  • Create docs/STRUCTURED_LOGGING_GUIDE.md:
    • Winston configuration
    • Log levels and when to use
    • Query patterns for log analysis
    • Retention policies

🎯 Next Actions

Immediate (This Week - Dec 4-11, 2025)

  1. COMPLETE: Comprehensive analysis documented
  2. 🔄 Review: Share this plan with user for prioritization feedback
  3. 📋 Plan: Break Phase 1 tasks into Nextcloud Deck cards
  4. 🚀 Execute: Begin with console.log gating (highest impact, lowest risk)

Short Term (2-3 Weeks)

  1. Complete Phase 1 quick wins
  2. Measure and document improvements
  3. Begin Phase 2 database optimizations
  4. Monitor system stability throughout

Medium Term (1-2 Months)

  1. Complete Phase 2 medium initiatives
  2. Validate performance improvements
  3. Plan Phase 3 architectural refactors
  4. Consider if Phase 3 needed based on Phase 1-2 results

📈 Integration with Existing Roadmaps

Note: This improvement plan complements existing optimization roadmaps, not replaces them.

OPTIMIZATION_MASTER_ROADMAP.md Alignment

Existing Focus: Trading strategy optimizations (signal quality, position scaling, ATR-based TP)
This Plan Focus: Infrastructure, code quality, performance optimizations
Integration: Run in parallel - trading optimizations continue while infrastructure improves

Synergies:

  • Console.log Gating: Reduces noise during signal quality analysis
  • Database Indexing: Faster backtesting queries for position scaling analysis
  • Position Manager Refactor: Easier to implement new exit strategies
  • Structured Logging: Better data for trading performance analysis

No Conflicts

All proposed optimizations are infrastructure-level and do not affect trading logic, quality thresholds, or position sizing strategies currently under data collection.


💡 Key Insights

  1. System is Healthy: 10.88% CPU, 8.77% memory, stable operation - this plan captures optimization opportunities rather than fixing failures

  2. Console.log is the Biggest Win: 731 statements = immediate performance + storage improvement with minimal risk

  3. Size Over Speed: Docker image (1.32GB) and node_modules (620MB) are larger optimization targets than build time (54.74s already reasonable)

  4. Maintainability Matters: position-manager.ts at 1,945 lines is biggest long-term concern for adding new features

  5. Database is Efficient: 20MB for 170+ trades shows good schema design, but query patterns can improve

  6. Documentation is Strong: OPTIMIZATION_MASTER_ROADMAP.md shows mature optimization tracking already in place

  7. Risk-Aware: All recommendations include rollback strategies and testing requirements for real-money trading system


🏁 Conclusion

This comprehensive analysis identified 20+ optimization opportunities across 5 categories, prioritized into a 3-phase implementation roadmap spanning 3 months.

Phase 1 Quick Wins target 90% log reduction, 50% Docker size reduction, and 100% type import compliance with minimal risk.

Phase 2 Medium Initiatives target database query optimization, RPC call reduction, and dependency cleanup.

Phase 3 Long-Term Projects focus on architectural improvements for future scalability.

All recommendations are data-driven with quantified baselines, measurable success metrics, and risk mitigation strategies appropriate for a real-money trading system.

Recommendation: Begin with Phase 1, measure results, then reassess priorities before Phase 2.