Backup & Recovery Guide
Quick reference guide for backup and recovery operations.
Backup Operations
Manual Backup
Create a manual backup of the database:
cd backend
npm run db:backup
With custom options:
# Set environment variables for custom backup
export BACKUP_DIR="/path/to/backups"
export S3_BUCKET="my-backup-bucket"
export S3_BUCKET_SECONDARY="my-backup-bucket-secondary"
export ENABLE_ENCRYPTION="true"
export GPG_RECIPIENT="backups@example.com"
cd scripts
./backup.sh
Automated Backups
Backups are automatically scheduled:
- Full backups: Daily at 2 AM UTC
- Incremental backups: Every 6 hours (via WAL archiving)
- Health checks: Every 6 hours
Configuration is handled by the scheduled tasks system in src/utils/scheduledTasks.ts.
Backup Types
- Full Backup (BACKUP_TYPE=FULL)
  - Complete database dump
  - Compressed with gzip
  - Optionally encrypted with GPG
  - Stored in S3 and locally
- Incremental Backup (BACKUP_TYPE=INCREMENTAL)
  - WAL (Write-Ahead Log) segments
  - Enables point-in-time recovery
  - Automatically archived every hour
- WAL Archive (BACKUP_TYPE=WAL_ARCHIVE)
  - Continuous archiving of transaction logs
  - Required for point-in-time recovery
Recovery Operations
Restore from Latest Backup
cd backend
npm run db:restore
This will prompt you to select from available backups.
Restore from Specific Backup
cd scripts
# Restore from local file
./restore.sh /var/backups/spywatcher/spywatcher_full_20240125_120000.dump.gz
# Restore from S3
./restore.sh s3://spywatcher-backups/postgres/full/spywatcher_full_20240125_120000.dump.gz
# Restore encrypted backup (will prompt for decryption)
./restore.sh /var/backups/spywatcher/spywatcher_full_20240125_120000.dump.gz.gpg
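The three invocations above differ only in the scheme and suffix of the backup reference. As a minimal sketch of how a script could tell them apart, assuming the decision is made purely from the path (the function name `classify_backup` is hypothetical and is not taken from restore.sh):

```shell
# Hypothetical helper: classify a backup reference by location and encoding.
# Assumption: "s3://" marks a remote object and ".gpg" marks an encrypted file.
classify_backup() {
    src=$1
    case "$src" in
        s3://*) location="s3" ;;
        *)      location="local" ;;
    esac
    case "$src" in
        *.gpg) encoding="encrypted" ;;
        *)     encoding="plain" ;;
    esac
    printf '%s %s\n' "$location" "$encoding"
}

# Classify the three example paths from this section.
classify_backup "/var/backups/spywatcher/spywatcher_full_20240125_120000.dump.gz"
classify_backup "s3://spywatcher-backups/postgres/full/spywatcher_full_20240125_120000.dump.gz"
classify_backup "/var/backups/spywatcher/spywatcher_full_20240125_120000.dump.gz.gpg"
```

Keeping location and encoding as separate checks means an encrypted object in S3 needs no special case.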
Point-in-Time Recovery
Restore to a specific point in time (requires WAL archiving):
cd scripts
./restore.sh <backup_file> "2024-01-25 14:30:00"
Example:
./restore.sh s3://spywatcher-backups/postgres/full/backup.dump.gz "2024-01-25 14:30:00"
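A mistyped target time is easy to miss until the restore is under way. The sketch below checks that a value has the `YYYY-MM-DD HH:MM:SS` shape used in the examples above before it is passed on. The function name `is_valid_target_time` is hypothetical, and the check covers the shape only: it does not reject impossible dates such as month 13.

```shell
# Hypothetical pre-flight check for the point-in-time argument.
# Returns 0 when the value looks like "YYYY-MM-DD HH:MM:SS", 1 otherwise.
is_valid_target_time() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]" "[0-2][0-9]:[0-5][0-9]:[0-5][0-9])
            return 0 ;;
        *)
            return 1 ;;
    esac
}

target="2024-01-25 14:30:00"
if is_valid_target_time "$target"; then
    echo "target time accepted: $target"
else
    echo "expected format: YYYY-MM-DD HH:MM:SS" >&2
fi
```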
Monitoring
Check Backup Health
cd backend
npm run backup:health-check
Returns:
- Last successful backup time
- Any issues detected
- Overall health status
View Backup Statistics
cd backend
npm run backup:stats
Returns:
- Total backups
- Success rate
- Average size and duration
- Failed backups count
List Recent Backups
cd backend
npm run backup:recent
Lists the 10 most recent backups with their status.
Backup Logs
All backup operations are logged to the database in the BackupLog table:
SELECT * FROM "BackupLog"
ORDER BY "startedAt" DESC
LIMIT 10;
Configuration
Environment Variables
Add these to your .env file:
# Backup directory (default: /var/backups/spywatcher)
BACKUP_DIR=/var/backups/spywatcher
# Retention in days (default: 30)
RETENTION_DAYS=30
# Number of monthly backups to keep (default: 12)
RETENTION_MONTHLY=12
# Enable encryption (default: false)
ENABLE_ENCRYPTION=true
GPG_RECIPIENT=backups@example.com
# S3 storage
S3_BUCKET=spywatcher-backups
S3_BUCKET_SECONDARY=spywatcher-backups-us-west
S3_STORAGE_CLASS=STANDARD_IA
# Notifications
SLACK_WEBHOOK=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
DISCORD_WEBHOOK=https://discord.com/api/webhooks/YOUR/WEBHOOK/URL
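For reference, the documented defaults can be applied and sanity-checked in a few lines of shell. This is a sketch, not the logic of backup.sh: the function name `validate_backup_config` is hypothetical, and the rule that `ENABLE_ENCRYPTION=true` requires `GPG_RECIPIENT` is an assumption based on how the two variables are paired above.

```shell
# Apply the defaults listed in this guide when a variable is unset or empty.
: "${BACKUP_DIR:=/var/backups/spywatcher}"
: "${RETENTION_DAYS:=30}"
: "${RETENTION_MONTHLY:=12}"
: "${ENABLE_ENCRYPTION:=false}"

# Hypothetical validation step; returns non-zero on the first problem found.
validate_backup_config() {
    # Assumption: encryption cannot work without a recipient key.
    if [ "$ENABLE_ENCRYPTION" = "true" ] && [ -z "${GPG_RECIPIENT:-}" ]; then
        echo "ENABLE_ENCRYPTION=true requires GPG_RECIPIENT" >&2
        return 1
    fi
    # Retention must be a non-negative whole number of days.
    case "$RETENTION_DAYS" in
        ''|*[!0-9]*)
            echo "RETENTION_DAYS must be a whole number" >&2
            return 1 ;;
    esac
    return 0
}
```

Calling `validate_backup_config` before a backup run surfaces a missing recipient up front, before the dump has been taken.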
WAL Archiving Setup
Enable WAL archiving for point-in-time recovery:
cd scripts
sudo ./setup-wal-archiving.sh
This will:
- Configure PostgreSQL for WAL archiving
- Set up archive command
- Enable point-in-time recovery
Note: PostgreSQL must be restarted after setup for the changes to take effect.
Verify WAL Archiving
sudo -u postgres psql -c "SELECT * FROM pg_stat_archiver;"
Check for:
- archived_count increasing over time
- failed_count should be 0
- last_archived_time should be recent
Troubleshooting
Backup Fails
Check logs:
tail -f /var/log/postgresql/postgresql-15-main.log
Common issues:
- Disk space full
  df -h /var/backups/spywatcher
- Database connection issues
  psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "SELECT 1;"
- S3 permissions
  aws s3 ls s3://$S3_BUCKET/
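The disk-space check can also be scripted so that it runs before a backup starts. A minimal sketch, assuming a POSIX `df`; the function name `has_free_space` and the 1 GiB threshold are illustrative choices, not values taken from backup.sh:

```shell
# Hypothetical helper: succeeds when the filesystem holding directory $1
# has at least $2 KiB available. "df -Pk" gives portable, single-line
# output in 1024-byte blocks, with available space in column 4.
has_free_space() {
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "${avail_kb:-0}" -ge "$2" ]
}

# Warn if less than 1 GiB (1048576 KiB) is free in the backup directory.
dir=${BACKUP_DIR:-/var/backups/spywatcher}
if [ -d "$dir" ] && ! has_free_space "$dir" 1048576; then
    echo "WARNING: less than 1 GiB free in $dir" >&2
fi
```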
Restore Fails
Common issues:
- File not found
  - Check backup file path
  - Verify S3 bucket and key
  - Ensure AWS credentials are configured
- Decryption fails
  - Verify GPG key is available
  - Check GPG recipient matches
- Database locked
  - Stop the application first
  - Kill existing connections:
    SELECT pg_terminate_backend(pg_stat_activity.pid)
    FROM pg_stat_activity
    WHERE pg_stat_activity.datname = 'spywatcher'
      AND pid <> pg_backend_pid();
No Recent Backups
Check scheduled tasks:
# Check if scheduled tasks are running
ps aux | grep node | grep scheduledTasks
# Check application logs
tail -f logs/app.log
Manual trigger:
cd backend
npm run db:backup
Backup Size Abnormal
Check database size:
SELECT pg_size_pretty(pg_database_size('spywatcher'));
Check for data growth:
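-- format('%I.%I', ...) quotes identifiers so that mixed-case tables
-- such as "BackupLog" resolve correctly.
SELECT
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass) DESC;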
WAL Archiving Not Working
Check archive status:
SELECT * FROM pg_stat_archiver;
Check PostgreSQL config:
grep -E "wal_level|archive_mode|archive_command" /etc/postgresql/15/main/postgresql.conf
Check archive directory permissions:
ls -la /var/lib/postgresql/wal_archive/
# or for S3
aws s3 ls s3://$S3_BUCKET/wal/
Best Practices
- Test Restores Regularly
  - Monthly restore to test database
  - Quarterly disaster recovery drills
  - Document restore times
- Monitor Backup Health
  - Review backup health check daily
  - Set up alerts for failures
  - Monitor backup size trends
- Keep Multiple Copies
  - Local backups (7 days)
  - Primary S3 bucket (30 days)
  - Secondary S3 bucket in different region
- Secure Your Backups
  - Enable encryption for sensitive data
  - Use strong GPG keys
  - Rotate keys regularly
  - Restrict S3 bucket access
- Document Everything
  - Keep this guide updated
  - Document any custom procedures
  - Maintain contact lists
  - Record drill results
Emergency Contacts
For disaster recovery situations:
- Primary On-Call: Check PagerDuty schedule
- Database Admin: db-admin@spywatcher.com
- DevOps Lead: devops@spywatcher.com
- Security Team: security@spywatcher.com
Related Documentation
- DISASTER_RECOVERY.md - Comprehensive disaster recovery runbook
- CONNECTION_POOLING.md - Database connection management
- POSTGRESQL.md - PostgreSQL setup and management
- MONITORING.md - Monitoring and alerting setup
Last Updated: 2024-11-02
Maintained By: DevOps Team
Next Review: 2025-02-02