HA Database Sync Strategy

Overview

This document explains the database synchronization strategy for the High Availability (HA) setup between Primary (Hetzner) and Secondary (Hostinger) servers.

Current Setup (Nov 25, 2025)

  • Primary: Hetzner Proxmox LXC - /home/icke/traderv4
  • Secondary: Hostinger VPS (72.62.39.24) - /home/icke/traderv4
  • Database: PostgreSQL 16-alpine
  • Database Name: trading_bot_v4

Option 1: PostgreSQL Streaming Replication (RECOMMENDED)

What is it?

PostgreSQL's native master-slave replication using Write-Ahead Logs (WAL). The secondary automatically receives and applies all changes from the primary in near real-time.
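For illustration, a minimal sketch for observing the WAL positions on both nodes (same container name and secondary IP as used throughout this document); the gap between the primary's write position and the secondary's replay position is the replication lag:

# On PRIMARY: current WAL write position
docker exec trading-bot-postgres psql -U postgres -t -c "SELECT pg_current_wal_lsn();"

# On SECONDARY: last WAL position received and replayed
ssh root@72.62.39.24 "docker exec trading-bot-postgres psql -U postgres -t -c 'SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();'"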

Benefits

  • Real-time: Changes replicate within 1-2 seconds
  • Native: Built into PostgreSQL, no external tools
  • Automatic: Once set up, it runs continuously
  • Failover ready: Secondary can be promoted to master instantly

Setup Steps

1. Configure Primary (Master)

# SSH to primary
ssh root@hetzner-ip

# Create replication user
docker exec trading-bot-postgres psql -U postgres -c "
CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'your-strong-password-here';
SELECT * FROM pg_user WHERE usename = 'replicator';
"

# Configure postgresql.conf for replication
docker exec trading-bot-postgres bash -c "cat >> /var/lib/postgresql/data/postgresql.conf << CONF
# Replication settings
wal_level = replica
max_wal_senders = 3
wal_keep_size = 64   # MB
hot_standby = on
CONF"

# Allow replication connections from secondary
docker exec trading-bot-postgres bash -c "echo 'host replication replicator 72.62.39.24/32 md5' >> /var/lib/postgresql/data/pg_hba.conf"

# Restart PostgreSQL to apply changes
docker restart trading-bot-postgres

# Verify replication user
docker exec trading-bot-postgres psql -U postgres -c "SELECT usename, userepl FROM pg_user WHERE usename = 'replicator';"

2. Configure Secondary (Replica)

# SSH to secondary
ssh root@72.62.39.24

cd /home/icke/traderv4

# Stop PostgreSQL if running
docker compose down postgres

# Backup existing data (if any)
sudo mv postgres-data postgres-data.backup-$(date +%Y%m%d-%H%M%S) || true

# Create base backup from primary
docker run --rm \
  -v $(pwd)/postgres-data:/var/lib/postgresql/data \
  -e PGPASSWORD='your-strong-password-here' \
  postgres:16-alpine \
  pg_basebackup -h hetzner-ip -p 5432 -U replicator -D /var/lib/postgresql/data -P -R

# Start PostgreSQL in replica mode
docker compose up -d postgres

# Wait for startup
sleep 10

# Verify replication status
docker exec trading-bot-postgres psql -U postgres -c "SELECT status, receive_start_lsn FROM pg_stat_wal_receiver;"

3. Verify Replication

# On PRIMARY - Check replication status
docker exec trading-bot-postgres psql -U postgres -c "
SELECT 
  client_addr,
  state,
  sent_lsn,
  write_lsn,
  flush_lsn,
  replay_lsn,
  sync_state
FROM pg_stat_replication;
"

# Should show:
# - client_addr: 72.62.39.24
# - state: streaming
# - sync_state: async

# Test replication - Insert test data on PRIMARY
docker exec trading-bot-postgres psql -U postgres trading_bot_v4 -c "
CREATE TABLE IF NOT EXISTS replication_test (id SERIAL PRIMARY KEY, test_time TIMESTAMP);
INSERT INTO replication_test (test_time) VALUES (NOW());
SELECT * FROM replication_test;
"

# Check on SECONDARY (should see same data within 1-2 seconds)
ssh root@72.62.39.24 "docker exec trading-bot-postgres psql -U postgres trading_bot_v4 -c 'SELECT * FROM replication_test;'"

# Clean up test
docker exec trading-bot-postgres psql -U postgres trading_bot_v4 -c "DROP TABLE replication_test;"

Monitoring Replication

# Check replication lag (run on PRIMARY)
docker exec trading-bot-postgres psql -U postgres -c "
SELECT
  client_addr,
  state,
  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes,
  replay_lag
FROM pg_stat_replication;
"

# Healthy values:
# - lag_bytes: < 100KB
# - replay_lag: < 5 seconds
# (replay_lag, available since PostgreSQL 10, reports the time-based lag directly on the primary)
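If you want a standalone check outside the health monitor (see Monitoring below), a minimal cron-friendly sketch could look like this; the 100000-byte threshold and log path are assumptions:

# Log a warning when replay lag on the standby exceeds ~100KB
LAG=$(docker exec trading-bot-postgres psql -U postgres -t -A -c \
  "SELECT COALESCE(MAX(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)), 0) FROM pg_stat_replication;")
if [ "${LAG%.*}" -gt 100000 ]; then
  echo "[$(date)] WARNING: replication lag ${LAG} bytes" >> /var/log/replication-lag.log
fi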

Option 2: Periodic Backup (FALLBACK)

What is it?

Scheduled pg_dump backups copied to secondary every 6 hours via cron job.

Benefits

  • Simple: Easy to set up and understand
  • No config changes: Primary runs normally
  • Cross-platform: Works even with different PostgreSQL versions

Drawbacks

  • Data loss window: Up to 6 hours of trades could be lost
  • Not instant: Secondary is always 0-6 hours behind

Setup

# On PRIMARY - Create sync script
cat > /root/sync-database-to-secondary.sh << 'SCRIPT'
#!/bin/bash
LOG="/var/log/secondary-db-sync.log"
echo "[$(date)] Starting database sync..." >> $LOG

# Dump database
docker exec trading-bot-postgres pg_dump -U postgres trading_bot_v4 > /tmp/trading_bot_backup.sql

if [ $? -eq 0 ]; then
  echo "[$(date)] Database dump successful ($(wc -l < /tmp/trading_bot_backup.sql) lines)" >> $LOG
  
  # Copy to secondary
  scp /tmp/trading_bot_backup.sql root@72.62.39.24:/tmp/trading_bot_backup.sql
  
  if [ $? -eq 0 ]; then
    echo "[$(date)] Backup copied to secondary" >> $LOG
    
    # Restore on secondary
    ssh root@72.62.39.24 "docker exec -i trading-bot-postgres psql -U postgres -c 'DROP DATABASE IF EXISTS trading_bot_v4; CREATE DATABASE trading_bot_v4;' && docker exec -i trading-bot-postgres psql -U postgres trading_bot_v4 < /tmp/trading_bot_backup.sql"
    
    if [ $? -eq 0 ]; then
      echo "[$(date)] Database restored on secondary successfully" >> $LOG
    else
      echo "[$(date)] ERROR: Database restore failed" >> $LOG
    fi
  else
    echo "[$(date)] ERROR: Failed to copy backup to secondary" >> $LOG
  fi
else
  echo "[$(date)] ERROR: Database dump failed" >> $LOG
fi

echo "[$(date)] Sync complete" >> $LOG
SCRIPT

chmod +x /root/sync-database-to-secondary.sh

# Test it
/root/sync-database-to-secondary.sh

# Check log
tail /var/log/secondary-db-sync.log

# Setup cron (every 6 hours)
crontab -l > /tmp/crontab.backup
echo "0 */6 * * * /root/sync-database-to-secondary.sh" >> /tmp/crontab.backup
crontab /tmp/crontab.backup

# Verify cron
crontab -l | grep sync-database
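Note: the DROP DATABASE in the script above will fail if anything on the secondary (for example a standby copy of the trading bot) is still connected to trading_bot_v4. A sketched workaround is to force-drop it first (WITH (FORCE) requires PostgreSQL 13+, which the 16-alpine image satisfies):

ssh root@72.62.39.24 "docker exec trading-bot-postgres psql -U postgres -c 'DROP DATABASE IF EXISTS trading_bot_v4 WITH (FORCE);'"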

Code Sync (Both Options)

Code needs to be synced separately (n8n workflows, trading bot code, etc.)

# Create code sync script
cat > /root/sync-code-to-secondary.sh << 'SCRIPT'
#!/bin/bash
LOG="/var/log/secondary-code-sync.log"
echo "[$(date)] Starting code sync..." >> $LOG

rsync -avz --delete \
  --exclude 'node_modules' \
  --exclude '.next' \
  --exclude '.git' \
  --exclude 'logs/*' \
  --exclude 'postgres-data' \
  /home/icke/traderv4/ root@72.62.39.24:/home/icke/traderv4/ >> $LOG 2>&1

# Sync .env file
rsync -avz /home/icke/traderv4/.env root@72.62.39.24:/home/icke/traderv4/.env >> $LOG 2>&1

echo "[$(date)] Code sync complete" >> $LOG
SCRIPT

chmod +x /root/sync-code-to-secondary.sh

# Run daily at 3 AM
crontab -l > /tmp/crontab.backup
echo "0 3 * * * /root/sync-code-to-secondary.sh" >> /tmp/crontab.backup
crontab /tmp/crontab.backup

n8n Workflow Sync

n8n workflows are stored either in a file-based SQLite database or in PostgreSQL, depending on the setup.

If using SQLite (file-based)

# Find n8n database location
ssh root@72.62.39.24 "docker exec n8n ls -la /home/node/.n8n/"

# Sync n8n data directory
rsync -avz /path/to/n8n/data/ root@72.62.39.24:/path/to/n8n/data/

If using PostgreSQL for n8n

If n8n uses PostgreSQL, its data is already covered by the database replication above.
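For reference, a sketch of the environment variables n8n expects for a PostgreSQL backend (variable names follow n8n's documentation; the host, database name, and credentials below are assumptions to adapt to your compose file):

DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=trading-bot-postgres
DB_POSTGRESDB_PORT=5432
DB_POSTGRESDB_DATABASE=n8n
DB_POSTGRESDB_USER=postgres
DB_POSTGRESDB_PASSWORD=your-postgres-password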

Failover Procedure

Scenario: Primary goes down, need to activate Secondary

1. Promote Secondary to Master (if using streaming replication)

# SSH to secondary
ssh root@72.62.39.24

# Promote to master (pg_ctl must run as the postgres user, not root)
docker exec -u postgres trading-bot-postgres pg_ctl promote -D /var/lib/postgresql/data

# Verify it's now accepting writes
docker exec trading-bot-postgres psql -U postgres trading_bot_v4 -c "SELECT pg_is_in_recovery();"
# Should return 'f' (false = not in recovery = is master)

# Start trading bot (if not already running)
cd /home/icke/traderv4
docker compose up -d trading-bot

2. Update Cloudflare DNS

# Via health monitor script (automatic)
# OR manually via Cloudflare API/dashboard
# Point flow.egonetix.de to 72.62.39.24
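# A hedged sketch of the manual API call; ZONE_ID, RECORD_ID and CF_API_TOKEN are placeholders
# you would look up in the Cloudflare dashboard:
curl -X PUT "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$RECORD_ID" \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"type":"A","name":"flow.egonetix.de","content":"72.62.39.24","ttl":60,"proxied":false}'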

3. When Primary recovers

# Reconfigure primary as new replica
# Then switch DNS back
# Or keep secondary as new primary (depends on data drift)

Comparison

Feature            Streaming Replication    Periodic Backup
Lag                1-2 seconds              0-6 hours
Setup Complexity   Medium                   Simple
Data Loss Risk     Minimal (seconds)        High (hours)
Failover Time      Instant                  Minutes
Resource Usage     Low                      Low
Best For           Production HA            Testing/Dev

Recommendation

Use PostgreSQL Streaming Replication for production HA setup:

  • Real-time sync (1-2 second lag)
  • Zero manual intervention needed
  • Instant failover capability
  • Native PostgreSQL feature (well-tested)

Fallback to Periodic Backup only if:

  • Streaming replication setup fails
  • Network between servers is unreliable
  • You're just testing the HA concept

Monitoring

Add to health monitor script:

# Check replication lag (intended for the primary's health monitor script, which is
# assumed to already import subprocess and define send_telegram)
def check_replication_lag():
    result = subprocess.run([
        "docker", "exec", "trading-bot-postgres",
        "psql", "-U", "postgres", "-t", "-A", "-c",
        "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) FROM pg_stat_replication;"
    ], capture_output=True, text=True)

    # Empty output (no standby connected) is treated as zero lag
    lag_bytes = int(result.stdout.strip() or 0)

    if lag_bytes > 100000:  # 100KB
        send_telegram(f"⚠️ HIGH REPLICATION LAG: {lag_bytes/1024:.1f}KB")

Troubleshooting

Secondary not connecting

# Check firewall on primary
# Allow port 5432 from 72.62.39.24
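# A sketch assuming ufw is the firewall in use (adjust for iptables/nftables):
ufw allow from 72.62.39.24 to any port 5432 proto tcp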

# Check pg_hba.conf on primary
docker exec trading-bot-postgres cat /var/lib/postgresql/data/pg_hba.conf | grep replication

# Check logs on secondary
docker logs trading-bot-postgres

Replication lag increasing

# Check network between servers
ping 72.62.39.24

# Check disk space on secondary
ssh root@72.62.39.24 "df -h"

# Check WAL sender processes
docker exec trading-bot-postgres psql -U postgres -c "SELECT * FROM pg_stat_replication;"

Replication stopped

# Restart from base backup
# Follow "Configure Secondary" steps again
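A condensed sketch of that re-sync, run on the secondary in /home/icke/traderv4 (same commands as the "Configure Secondary" section; the backup directory name here is illustrative):

docker compose down postgres
sudo mv postgres-data postgres-data.broken-$(date +%Y%m%d-%H%M%S)
docker run --rm \
  -v $(pwd)/postgres-data:/var/lib/postgresql/data \
  -e PGPASSWORD='your-strong-password-here' \
  postgres:16-alpine \
  pg_basebackup -h hetzner-ip -p 5432 -U replicator -D /var/lib/postgresql/data -P -R
docker compose up -d postgres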