Files
discord-spywatcher/DATABASE_OPTIMIZATION.md
Copilot 2aa4be44f7 [WIP] Create contributing guidelines for open source contributors (#170)
* Initial plan

* docs: add comprehensive contributing guidelines and templates

Co-authored-by: onnwee <211922112+onnwee@users.noreply.github.com>

* docs: update README and SECURITY with better formatting and links

Co-authored-by: onnwee <211922112+onnwee@users.noreply.github.com>

* docs: finalize contributing guidelines and formatting

Co-authored-by: onnwee <211922112+onnwee@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: onnwee <211922112+onnwee@users.noreply.github.com>
2025-11-04 15:38:59 -06:00

9.0 KiB

Database Optimization & Indexing Strategy

This document outlines the database optimization strategy for Discord Spywatcher, including indexing, query optimization, and maintenance procedures.

📊 Overview

The database optimization implementation focuses on:

  • Strategic indexing for common query patterns
  • Query optimization to reduce database load
  • Performance monitoring to identify bottlenecks
  • Maintenance utilities for ongoing health

🎯 Index Strategy

Composite Indexes

Composite indexes are created for common query patterns that filter by multiple columns:

PresenceEvent

  • (userId, createdAt DESC) - User presence history queries
  • (userId) - Single user lookups
  • (createdAt) - Time-based queries

MessageEvent

  • (userId, createdAt DESC) - User message history
  • (guildId, channelId) - Guild-channel message queries
  • (guildId, createdAt DESC) - Guild message history
  • Individual indexes on userId, guildId, channelId, createdAt

TypingEvent

  • (userId, channelId) - User typing in specific channels
  • (guildId, createdAt DESC) - Guild typing activity over time
  • Individual indexes on userId, guildId, channelId, createdAt

ReactionTime

  • (observerId, createdAt DESC) - Observer reaction history
  • (guildId, createdAt DESC) - Guild reaction history
  • (deltaMs) - Fast reaction queries
  • Individual indexes on observerId, actorId, guildId, createdAt

User

  • (role) - Role-based queries
  • (lastSeenAt DESC) - Last seen queries
  • (role, lastSeenAt DESC) - Combined role and activity queries
  • Individual indexes on discordId, email

Partial Indexes (PostgreSQL-specific)

Partial indexes only index rows matching a WHERE clause, reducing index size and improving performance:

  • Multi-client presence: idx_presence_multi_client - Only indexes presence events with multiple clients
  • Fast reactions: idx_reaction_fast_delta - Only indexes reaction times < 5000ms
  • Recent events: idx_message_recent, idx_typing_recent - Only indexes last 90 days
  • New accounts: idx_join_new_accounts - Only indexes accounts < 14 days old

GIN Indexes

GIN (Generalized Inverted Index) indexes for JSONB columns enable efficient JSON queries:

  • Metadata columns on all event tables: PresenceEvent, TypingEvent, MessageEvent, ReactionTime, JoinEvent, RoleChangeEvent
  • Full-text search on MessageEvent.content (configured via scripts/setup-fulltext-search.sh)

🚀 Query Optimizations

Ghost Detection

Before (N+1 query pattern):

// Multiple separate queries - inefficient
const typings = await db.typingEvent.groupBy(...);
const messages = await db.messageEvent.groupBy(...);
// Merge in application code

After (Single optimized query):

// Single aggregation query using raw SQL
const result = await db.$queryRaw`
  SELECT ... FROM (
    SELECT userId, COUNT(*) as typing_count FROM TypingEvent ...
  ) t
  FULL OUTER JOIN (
    SELECT userId, COUNT(*) as message_count FROM MessageEvent ...
  ) m ON t.userId = m.userId
  ...
`;

Performance improvement: ~70% reduction in query time for large datasets

Lurker Detection

Optimized from multiple findMany calls to a single query with subqueries:

  • Identifies users with presence but no activity
  • Uses LEFT JOIN to efficiently find users without matching activity records
  • Filters in database rather than application code

Reaction Stats

Changed from in-memory aggregation to database-level aggregation:

  • Uses SQL AVG() and COUNT() FILTER for efficient calculation
  • Reduces data transfer from database to application
  • Handles filtering at database level

📈 Performance Monitoring

Slow Query Logger

The application includes a Prisma middleware that tracks slow queries:

// Configurable thresholds (env variables)
SLOW_QUERY_THRESHOLD_MS = 100; // Warn threshold
CRITICAL_QUERY_THRESHOLD_MS = 1000; // Critical threshold

Features:

  • Logs queries exceeding thresholds to console
  • Stores last 100 slow queries in memory
  • Provides statistics API for monitoring dashboards

Monitoring Endpoints

Admin monitoring endpoints at /api/admin/monitoring/database/:

  • GET /health - Database connection status and version
  • GET /tables - Table sizes and row counts
  • GET /indexes - Index usage statistics and unused indexes
  • GET /slow-queries - Application-tracked slow queries
  • GET /pg-slow-queries - PostgreSQL pg_stat_statements queries
  • POST /analyze - Run ANALYZE on all tables
  • GET /report - Comprehensive maintenance report

Database Maintenance Utilities

The databaseMaintenance.ts utility provides:

  • Index usage statistics - Identify unused indexes
  • Table statistics - Monitor table and index sizes
  • Index bloat detection - Find indexes needing REINDEX
  • Slow query analysis - PostgreSQL statistics integration
  • Health checks - Connection and configuration verification

🔧 Maintenance Procedures

Initial Setup

  1. Apply Prisma migrations:
cd backend
npm run prisma:migrate
  1. Apply PostgreSQL-specific indexes:
psql -d spywatcher -f ../scripts/add-performance-indexes.sql
  1. Initialize full-text search (if not already done):
npm run db:fulltext

Regular Maintenance

Weekly

  • Review slow query logs via monitoring dashboard
  • Check index usage statistics
  • Review table growth trends

Monthly

  • Run ANALYZE on all tables:
curl -X POST http://localhost:3000/api/admin/monitoring/database/analyze \
  -H "Authorization: Bearer YOUR_TOKEN"
  • Check for unused indexes:
curl http://localhost:3000/api/admin/monitoring/database/indexes \
  -H "Authorization: Bearer YOUR_TOKEN"
  • Review maintenance report:
curl http://localhost:3000/api/admin/monitoring/database/report \
  -H "Authorization: Bearer YOUR_TOKEN"

Quarterly

  • Review and remove truly unused indexes
  • Consider table partitioning for very large tables
  • Review and adjust connection pool settings

Index Bloat Management

Check for index bloat periodically:

SELECT
  schemaname, tablename, indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) as index_size
FROM pg_stat_user_indexes
WHERE pg_relation_size(indexrelid) > 1048576 -- > 1MB
ORDER BY pg_relation_size(indexrelid) DESC;

Rebuild bloated indexes:

REINDEX INDEX CONCURRENTLY idx_name;

📊 Performance Targets

Based on the issue requirements:

  • All queries under 100ms (p95)
  • Critical queries under 50ms (p95)
  • Index usage > 95% on frequently accessed tables
  • No full table scans on large tables
  • Automated slow query alerts
  • Query performance monitoring

🔍 Monitoring Best Practices

  1. Enable pg_stat_statements for PostgreSQL query tracking:
-- Add to postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
  1. Set up alerts for:

    • Queries exceeding 1000ms
    • Index usage below 50% on tables > 10k rows
    • Connection pool saturation (> 80% usage)
    • Table sizes growing abnormally
  2. Regular reviews of:

    • Slow query patterns
    • Index hit ratios
    • Cache effectiveness
    • Connection pool metrics

🛠️ Troubleshooting

Query Performance Issues

  1. Check EXPLAIN ANALYZE output:
EXPLAIN ANALYZE SELECT * FROM "MessageEvent"
WHERE "guildId" = 'xxx' AND "createdAt" > NOW() - INTERVAL '7 days';
  1. Verify index usage:
SELECT * FROM pg_stat_user_indexes
WHERE tablename = 'MessageEvent';
  1. Check for sequential scans on large tables:
SELECT schemaname, tablename, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan AND n_live_tup > 10000
ORDER BY seq_scan DESC;

High Memory Usage

  1. Check connection pool size in DATABASE_URL
  2. Review long-running queries
  3. Check for memory leaks in application code
  4. Consider reducing result set sizes with pagination

Index Not Being Used

Common reasons:

  1. Statistics are outdated - Run ANALYZE
  2. Small table size - PostgreSQL may prefer sequential scan
  3. Poor selectivity - Index doesn't filter enough rows
  4. Wrong query pattern - Query doesn't match index column order

📚 References

🎯 Future Optimizations

Potential future enhancements:

  1. Table Partitioning - Partition large event tables by date
  2. Materialized Views - For complex analytics queries
  3. Read Replicas - For read-heavy workloads
  4. Connection Pooling - External pooler like PgBouncer
  5. Query Caching - Redis cache for frequently accessed data
  6. Archival Strategy - Move old data to archive tables