subculture-collective/discord-spywatcher

Files

Copilot 2aa4be44f7 [WIP] Create contributing guidelines for open source contributors (#170 )

* Initial plan

* docs: add comprehensive contributing guidelines and templates

Co-authored-by: onnwee <211922112+onnwee@users.noreply.github.com>

* docs: update README and SECURITY with better formatting and links

Co-authored-by: onnwee <211922112+onnwee@users.noreply.github.com>

* docs: finalize contributing guidelines and formatting

Co-authored-by: onnwee <211922112+onnwee@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: onnwee <211922112+onnwee@users.noreply.github.com>

2025-11-04 15:38:59 -06:00

13 KiB

Raw Permalink Blame History

PostgreSQL Setup and Management Guide

This guide covers the PostgreSQL configuration, features, and management for Discord SpyWatcher.

Overview
PostgreSQL-Specific Features
Connection Configuration
Database Management
Performance Optimization
Backup and Recovery
Monitoring
Troubleshooting

Overview

Discord SpyWatcher uses PostgreSQL 15+ as its production database, providing:

Advanced data types (JSONB, arrays, UUIDs, timestamps with timezone)
Full-text search capabilities
Better concurrency and performance
Production-ready reliability
Advanced indexing strategies

PostgreSQL-Specific Features

Advanced Data Types

JSONB Fields

All event models include a metadata field using JSONB for flexible, queryable JSON storage:

// Store flexible metadata
await db.presenceEvent.create({
    data: {
        userId: '123',
        username: 'user',
        clients: ['desktop', 'mobile'],
        metadata: {
            status: 'online',
            activities: ['gaming'],
            customField: 'value',
        },
    },
});

// Query JSONB data
const events = await db.presenceEvent.findMany({
    where: {
        metadata: {
            path: ['status'],
            equals: 'online',
        },
    },
});

Array Fields

Comma-separated strings have been converted to native PostgreSQL arrays:

// PresenceEvent.clients - array of client types
clients: ['desktop', 'web', 'mobile'];

// RoleChangeEvent.addedRoles - array of role IDs
addedRoles: ['123456789', '987654321'];

UUID Primary Keys

Event models use UUIDs for better distribution and security:

// Auto-generated UUID
id: '550e8400-e29b-41d4-a716-446655440000';

Timezone-Aware Timestamps

All timestamp fields use TIMESTAMPTZ for proper timezone handling:

createdAt: DateTime @default(now()) @db.Timestamptz

Full-Text Search

The MessageEvent model supports full-text search on the content field using PostgreSQL's native text search capabilities.

Setup

Run the setup script to add the full-text search index:

DB_PASSWORD=yourpassword ./scripts/setup-fulltext-search.sh

This creates:

A generated content_search tsvector column
A GIN index for efficient text searches

Usage

// Raw SQL query for full-text search
const results = await db.$queryRaw`
  SELECT *, ts_rank(content_search, to_tsquery('english', ${query})) AS rank
  FROM "MessageEvent"
  WHERE content_search @@ to_tsquery('english', ${query})
  ORDER BY rank DESC
  LIMIT 20;
`;

// Search for phrases
const results = await db.$queryRaw`
  SELECT * FROM "MessageEvent"
  WHERE content_search @@ plainto_tsquery('english', ${'hello world'})
`;

// Complex search with operators
const results = await db.$queryRaw`
  SELECT * FROM "MessageEvent"
  WHERE content_search @@ to_tsquery('english', ${'cat & dog | bird'})
`;

Search Operators

& - AND (both terms must be present)
| - OR (either term can be present)
! - NOT (term must not be present)
<-> - Phrase search (terms must be adjacent)

Example: 'cat & dog' finds messages with both "cat" and "dog"

Performance Features

Optimized Indexes

The schema includes strategic indexes for common queries:

// Event models have indexes on:
- userId (for user-specific queries)
- guildId (for guild-specific queries)
- channelId (for channel-specific queries)
- createdAt (for time-based queries)

// User model has indexes on:
- discordId (for Discord integration)
- email (for user lookup)

// Security models have indexes on:
- ipAddress (for IP-based queries)
- action (for action-based queries)
- timestamp/createdAt (for time-based queries)

Composite Indexes

Some models use composite indexes for complex queries:

@@index([userId, createdAt]) // User activity over time
@@index([guildId, createdAt]) // Guild activity over time

Connection Configuration

Connection String Format

DATABASE_URL="postgresql://username:password@host:port/database?schema=public&connection_limit=10&pool_timeout=20&connect_timeout=10"

Connection Pooling Parameters

Parameter	Recommended Value	Description
`connection_limit`	10-50	Maximum number of connections in the pool
`pool_timeout`	20	Seconds to wait for an available connection
`connect_timeout`	10	Seconds to wait for initial connection
`sslmode`	require (prod)	SSL/TLS encryption mode

Example Configurations

Development

DATABASE_URL="postgresql://spywatcher:password@localhost:5432/spywatcher?connection_limit=10&pool_timeout=20"

Production

DATABASE_URL="postgresql://spywatcher:securepassword@db.example.com:5432/spywatcher?sslmode=require&connection_limit=50&pool_timeout=20&connect_timeout=10"

Docker

DATABASE_URL="postgresql://spywatcher:${DB_PASSWORD}@postgres:5432/spywatcher?connection_limit=20&pool_timeout=20"

Database Management

Migrations

Create a Migration

cd backend
npx prisma migrate dev --name add_new_feature

Apply Migrations (Production)

npx prisma migrate deploy

Reset Database (Development Only)

npx prisma migrate reset

Data Migration from SQLite

Use the provided migration script:

# Dry run first
cd backend
DRY_RUN=true \
  SQLITE_DATABASE_URL="file:./prisma/dev.db" \
  DATABASE_URL="postgresql://spywatcher:password@localhost:5432/spywatcher" \
  npm run db:migrate:dry

# Actual migration
SQLITE_DATABASE_URL="file:./prisma/dev.db" \
  DATABASE_URL="postgresql://spywatcher:password@localhost:5432/spywatcher" \
  npm run db:migrate

Seeding Data

cd backend
npm run prisma:seed

Performance Optimization

PostgreSQL Configuration

The Docker Compose configuration includes optimized PostgreSQL settings:

command:
  - "postgres"
  - "-c" "max_connections=100"
  - "-c" "shared_buffers=256MB"
  - "-c" "effective_cache_size=1GB"
  - "-c" "maintenance_work_mem=64MB"
  - "-c" "checkpoint_completion_target=0.9"
  - "-c" "wal_buffers=16MB"
  - "-c" "default_statistics_target=100"
  - "-c" "random_page_cost=1.1"
  - "-c" "effective_io_concurrency=200"
  - "-c" "work_mem=4MB"
  - "-c" "min_wal_size=1GB"
  - "-c" "max_wal_size=4GB"

Query Optimization

Use Indexes Effectively

// Good - uses index
const events = await db.presenceEvent.findMany({
    where: { userId: '123' },
    orderBy: { createdAt: 'desc' },
});

// Bad - no index on username alone
const events = await db.presenceEvent.findMany({
    where: { username: 'john' },
});

Batch Operations

// Use createMany for bulk inserts
await db.presenceEvent.createMany({
    data: events,
    skipDuplicates: true,
});

Pagination

// Efficient pagination with cursor
const events = await db.presenceEvent.findMany({
    take: 20,
    skip: 1,
    cursor: { id: lastId },
    orderBy: { createdAt: 'desc' },
});

Monitoring Queries

Enable query logging in development:

const db = new PrismaClient({
    log: ['query', 'error', 'warn'],
});

Backup and Recovery

Automated Backups

Use the provided backup script:

# Manual backup
DB_PASSWORD=yourpassword npm run db:backup

# Scheduled backup (cron)
0 2 * * * DB_PASSWORD=yourpassword /path/to/scripts/backup.sh

Backup Features

Compressed backups (gzip)
30-day retention by default
Optional S3 upload
Automatic cleanup

Restore from Backup

DB_PASSWORD=yourpassword npm run db:restore /path/to/backup.sql.gz

Point-in-Time Recovery

For production, enable WAL archiving:

-- Enable continuous archiving
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET archive_mode = ON;
ALTER SYSTEM SET archive_command = 'cp %p /path/to/archive/%f';

Monitoring

Database Metrics

Connection Count

SELECT count(*), state
FROM pg_stat_activity
WHERE datname = 'spywatcher'
GROUP BY state;

Database Size

SELECT pg_size_pretty(pg_database_size('spywatcher'));

Table Sizes

SELECT
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

Index Usage

SELECT
  schemaname || '.' || tablename AS table,
  indexname,
  idx_scan AS scans,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan DESC;

Slow Queries

SELECT
  pid,
  now() - query_start AS duration,
  state,
  query
FROM pg_stat_activity
WHERE state != 'idle'
  AND datname = 'spywatcher'
  AND now() - query_start > interval '1 second'
ORDER BY duration DESC;

Maintenance Script

Run regular maintenance:

DB_PASSWORD=yourpassword npm run db:maintenance

This performs:

VACUUM ANALYZE (cleanup and optimization)
Statistics updates
Bloat detection
Unused index detection
Connection monitoring
Long-running query detection

Performance Monitoring Tools

pg_stat_statements

Enable query statistics:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- View top queries by total time
SELECT
  query,
  calls,
  total_exec_time,
  mean_exec_time,
  max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Troubleshooting

Connection Issues

Problem: Cannot connect to database

Error: P1001: Can't reach database server at `postgres:5432`

Solutions:

Check if PostgreSQL is running: docker-compose ps
Verify DATABASE_URL is correct
Check network connectivity
Ensure PostgreSQL container is healthy

Problem: Too many connections

Error: FATAL: remaining connection slots are reserved

Solutions:

Increase max_connections in PostgreSQL configuration
Reduce connection_limit in DATABASE_URL
Check for connection leaks in application code
Use connection pooling (PgBouncer)

Performance Issues

Problem: Slow queries

Solutions:

Enable query logging to identify slow queries
Add indexes for commonly queried fields
Use EXPLAIN ANALYZE to understand query plans
Consider materialized views for complex queries
Optimize WHERE clauses to use indexes

Problem: High memory usage

Solutions:

Reduce shared_buffers if too high
Adjust work_mem for complex queries
Run VACUUM to reclaim space
Check for table bloat

Migration Issues

Problem: Migration fails with schema mismatch

Solutions:

# Check migration status
npx prisma migrate status

# Mark specific migration as applied
npx prisma migrate resolve --applied "migration_name"

# Reset and reapply (development only)
npx prisma migrate reset

Problem: Type errors after migration

Solutions:

# Regenerate Prisma Client
npx prisma generate

# Rebuild application
npm run build

Data Issues

Problem: UUID vs Integer ID conflicts

Solution: Use the migration script to handle ID conversion:

npm run db:migrate

Problem: Array field errors

Solution: Ensure comma-separated strings are converted to arrays:

// Old: clients: "desktop,mobile"
// New: clients: ["desktop", "mobile"]

Best Practices

Connection Management
- Always use connection pooling
- Close connections when done
- Set appropriate pool limits
Indexing
- Index foreign keys
- Index frequently queried fields
- Monitor index usage
- Remove unused indexes
Query Optimization
- Use prepared statements
- Avoid N+1 queries
- Use batch operations
- Implement pagination
Security
- Use SSL/TLS in production
- Rotate credentials regularly
- Limit user permissions
- Enable audit logging
Backup and Recovery
- Automate backups
- Test restore procedures
- Store backups securely
- Document recovery process
Monitoring
- Track query performance
- Monitor connection usage
- Watch for slow queries
- Set up alerts

Additional Resources

Support

For issues or questions:

Check the main README
Review MIGRATION.md
Review scripts/README.md
Open an issue on GitHub

13 KiB Raw Permalink Blame History

PostgreSQL Setup and Management Guide

Table of Contents

Overview

PostgreSQL-Specific Features

Advanced Data Types

JSONB Fields

Array Fields

UUID Primary Keys

Timezone-Aware Timestamps

Full-Text Search

Setup

Usage

Search Operators

Performance Features

Optimized Indexes

Composite Indexes

Connection Configuration

Connection String Format

Connection Pooling Parameters

Example Configurations

Development

Production

Docker

Database Management

Migrations

Create a Migration

Apply Migrations (Production)

Reset Database (Development Only)

Data Migration from SQLite

Seeding Data

Performance Optimization

PostgreSQL Configuration

Query Optimization

Use Indexes Effectively

Batch Operations

Pagination

Monitoring Queries

Backup and Recovery

Automated Backups

Backup Features

Restore from Backup

Point-in-Time Recovery

Monitoring

Database Metrics

Connection Count

Database Size

Table Sizes

Index Usage

Slow Queries

Maintenance Script

Performance Monitoring Tools

pg_stat_statements

Troubleshooting

Connection Issues

Problem: Cannot connect to database

Problem: Too many connections

Performance Issues

Problem: Slow queries

Problem: High memory usage

Migration Issues

Problem: Migration fails with schema mismatch

Problem: Type errors after migration

Data Issues

Problem: UUID vs Integer ID conflicts

Problem: Array field errors

Best Practices

Additional Resources

Support

13 KiB

Raw Permalink Blame History