Redis Caching Layer Documentation
Overview
The Discord SpyWatcher application implements a comprehensive Redis caching layer to reduce database load, improve response times, and enable real-time features. This document describes the caching architecture, strategies, and usage.
Architecture
Core Components
1. CacheService (backend/src/services/cache.ts)
The CacheService provides a high-level API for caching with the following features:
- Cache-Aside Pattern: Load data from cache first, fetch from database on miss
- Tag-Based Invalidation: Group related cache entries and invalidate by tag
- TTL Management: Automatic expiration based on data type
- Remember Pattern: Convenient cache-or-fetch helper
- Cache Statistics: Monitor hit rates, memory usage, and performance
Key Methods:
```typescript
// Get value from cache
const value = await cache.get<T>(key);

// Set value with TTL and tags
await cache.set(key, value, {
    ttl: 300, // 5 minutes
    tags: ['guild:123', 'analytics:ghosts'],
});

// Remember pattern - cache or execute callback
const result = await cache.remember(
    key,
    300,
    async () => {
        return await expensiveOperation();
    },
    { tags: ['my-tag'] }
);

// Invalidate by tag
await cache.invalidateByTag('guild:123');

// Get cache statistics
const stats = await cache.getStats();
```
2. PubSubService (backend/src/services/pubsub.ts)
The PubSubService enables real-time updates using Redis Pub/Sub:
```typescript
// Publish update
await pubsub.publish('channel-name', { data: 'value' });

// Subscribe to channel
await pubsub.subscribe('channel-name', (message) => {
    console.log('Received:', message);
});

// Specialized helpers
await pubsub.publishAnalyticsUpdate(guildId, 'ghosts', data);
await pubsub.publishNotification(userId, notification);
```
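To illustrate the publish/subscribe semantics without a running Redis instance, here is a minimal in-memory sketch built on Node's `EventEmitter`. The real `PubSubService` runs over dedicated Redis connections and is asynchronous; the JSON round-trip mirrors the fact that Redis Pub/Sub transports strings.

```typescript
import { EventEmitter } from "node:events";

// In-memory stand-in for the Redis-backed PubSubService (illustration only).
class InMemoryPubSub {
    private emitter = new EventEmitter();

    publish(channel: string, message: unknown): void {
        // Redis Pub/Sub carries strings, so messages are JSON-encoded.
        this.emitter.emit(channel, JSON.stringify(message));
    }

    subscribe(channel: string, handler: (message: unknown) => void): void {
        this.emitter.on(channel, (raw: string) => handler(JSON.parse(raw)));
    }
}

const pubsub = new InMemoryPubSub();
const received: Array<{ ghostCount: number }> = [];
pubsub.subscribe("analytics:ghosts:123", (msg) =>
    received.push(msg as { ghostCount: number })
);
pubsub.publish("analytics:ghosts:123", { ghostCount: 4 });
```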
3. CacheInvalidationService (backend/src/services/cacheInvalidation.ts)
Handles automatic cache invalidation on data changes:
```typescript
// Called automatically when events occur
await cacheInvalidation.onMessageCreated(guildId);
await cacheInvalidation.onTypingEvent(guildId);
await cacheInvalidation.onPresenceUpdate(userId);
await cacheInvalidation.onRoleChanged(guildId);
```
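One way these hooks can be wired up, sketched with the tag scheme documented below: each event hook maps to a set of tags to flush. This is an illustration, not the repository's actual implementation; the injected `invalidate` function stands in for the cache service's tag invalidation call.

```typescript
// Hypothetical sketch: map event hooks onto tag invalidations.
type InvalidateFn = (tags: string[]) => Promise<void>;

function makeCacheInvalidation(invalidate: InvalidateFn) {
    return {
        // A new message affects every cached analytics view for the guild.
        onMessageCreated: (guildId: string) => invalidate([`guild:${guildId}`]),
        // Role changes flush the guild's entries plus the cross-guild role view.
        onRoleChanged: (guildId: string) =>
            invalidate([`guild:${guildId}`, "analytics:roles"]),
    };
}

// Example: record which tags would be flushed.
const flushed: string[] = [];
const service = makeCacheInvalidation(async (tags) => {
    flushed.push(...tags);
});
void service.onRoleChanged("123");
```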
Caching Strategies
Analytics Caching
All analytics endpoints use the cache-aside pattern with appropriate TTLs:
| Endpoint | Function | TTL | Cache Key Pattern | Tags |
| --- | --- | --- | --- | --- |
| Ghost Scores | `getGhostScores` | 5 min | `analytics:ghosts:{guildId}:{since}` | `guild:{guildId}`, `analytics:ghosts` |
| Lurker Flags | `getLurkerFlags` | 5 min | `analytics:lurkers:{guildId}:{since}` | `guild:{guildId}`, `analytics:lurkers` |
| Heatmap | `getChannelHeatmap` | 15 min | `analytics:heatmap:{guildId}:{since}` | `guild:{guildId}`, `analytics:heatmap` |
| Role Drift | `getRoleDriftFlags` | 10 min | `analytics:roles:{guildId}:{since}` | `guild:{guildId}`, `analytics:roles` |
| Client Drift | `getClientDriftFlags` | 2 min | `analytics:clients:{guildId}:{since}` | `guild:{guildId}`, `analytics:clients` |
| Behavior Shifts | `getBehaviorShiftFlags` | 5 min | `analytics:shifts:{guildId}:{since}` | `guild:{guildId}`, `analytics:shifts` |
Rationale:
- Ghost/Lurker/Shifts: 5 minutes - Moderate volatility, balance freshness with DB load
- Heatmap: 15 minutes - Slower-changing aggregate data
- Roles: 10 minutes - Role changes are infrequent
- Clients: 2 minutes - Rapid client switching requires fresher data
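These TTLs can be centralized in a single lookup so every endpoint agrees on them. A sketch of one way to organize it (the constant and function names here are illustrative, not the repository's actual identifiers):

```typescript
// Centralized TTLs, in seconds, matching the table above.
const ANALYTICS_TTL_SECONDS = {
    ghosts: 300,   // 5 min: moderate volatility
    lurkers: 300,  // 5 min: moderate volatility
    shifts: 300,   // 5 min: moderate volatility
    heatmap: 900,  // 15 min: slow-changing aggregates
    roles: 600,    // 10 min: role changes are infrequent
    clients: 120,  // 2 min: rapid client switching
} as const;

type AnalyticsType = keyof typeof ANALYTICS_TTL_SECONDS;

function ttlFor(type: AnalyticsType): number {
    return ANALYTICS_TTL_SECONDS[type];
}
```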
Cache Invalidation Strategies
Event-Based Invalidation
Cache entries are automatically invalidated when relevant events occur:
```typescript
// Message created -> Invalidate analytics
client.on('messageCreate', async (message) => {
    await db.messageEvent.create({ ... });
    await cacheInvalidation.onMessageCreated(message.guild.id);
});

// Role change -> Invalidate role analytics
client.on('guildMemberUpdate', async (oldMember, newMember) => {
    await db.roleChangeEvent.create({ ... });
    await cacheInvalidation.onRoleChanged(newMember.guild.id);
});
```
Tag-Based Invalidation
Tags allow invalidating related cache entries efficiently:
```typescript
// Invalidate all analytics for a guild
await cache.invalidateByTag('guild:123456789');

// Invalidate specific analytics type
await cache.invalidateByTag('analytics:ghosts');

// Invalidate multiple tags
await cache.invalidateByTags(['guild:123', 'analytics:ghosts']);
```
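The data structure behind tag-based invalidation can be sketched as a key store plus a tag index. With Redis, each tag is typically a SET of member keys (`SADD` on write, `SMEMBERS` + `DEL` on invalidation); this in-memory version uses a `Map` as a stand-in and is an illustration, not the actual `CacheService` internals.

```typescript
// In-memory sketch of tag-based invalidation (a Map stands in for Redis).
class TaggedCache {
    private entries = new Map<string, unknown>();
    private tagIndex = new Map<string, Set<string>>();

    set(key: string, value: unknown, tags: string[] = []): void {
        this.entries.set(key, value);
        for (const tag of tags) {
            if (!this.tagIndex.has(tag)) this.tagIndex.set(tag, new Set());
            this.tagIndex.get(tag)!.add(key);
        }
    }

    get(key: string): unknown {
        return this.entries.get(key);
    }

    // Delete every key recorded under the tag; returns the count removed.
    invalidateByTag(tag: string): number {
        const keys = this.tagIndex.get(tag) ?? new Set<string>();
        for (const key of keys) this.entries.delete(key);
        this.tagIndex.delete(tag);
        return keys.size;
    }
}

const tagged = new TaggedCache();
tagged.set("analytics:ghosts:123:all", [], ["guild:123", "analytics:ghosts"]);
tagged.set("analytics:roles:123:all", [], ["guild:123", "analytics:roles"]);
const removed = tagged.invalidateByTag("guild:123");
```

A Redis-backed version would also `SREM` the dropped keys from their other tag sets; this sketch leaves those stale index entries behind for brevity.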
Manual Invalidation
Admin endpoints for manual cache management:
- `DELETE /api/admin/monitoring/cache/clear` - Clear all cache entries
- `DELETE /api/admin/monitoring/cache/invalidate/:tag` - Invalidate by tag
Configuration
Environment Variables
```bash
# Redis connection
REDIS_URL=redis://localhost:6379

# Enable Redis for rate limiting and caching
ENABLE_REDIS_RATE_LIMITING=true
```
Redis Setup
Development (Single Instance)
```bash
# Start Redis
docker run -d -p 6379:6379 redis:7-alpine
```
Production (Cluster)
For high availability in production, use Redis Cluster:
```typescript
// backend/src/utils/redis.ts
const redisCluster = new Redis.Cluster(
    [
        { host: 'redis-node-1', port: 6379 },
        { host: 'redis-node-2', port: 6379 },
        { host: 'redis-node-3', port: 6379 },
    ],
    {
        redisOptions: {
            password: process.env.REDIS_PASSWORD,
            tls: process.env.NODE_ENV === 'production' ? {} : undefined,
        },
    }
);
```
Memory Management
Redis eviction policy should be configured:
```conf
# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
```
Monitoring
Cache Statistics Endpoint
```
GET /api/admin/monitoring/cache/stats
```
Returns:
```json
{
    "stats": {
        "hits": 1000,
        "misses": 200,
        "hitRate": 83.33,
        "memoryUsed": "2.5M",
        "evictedKeys": 5,
        "keyCount": 150
    },
    "status": "healthy",
    "timestamp": "2025-10-29T19:00:00.000Z"
}
```
Key Metrics
- Hit Rate: Should be > 80% for analytics endpoints
- Memory Usage: Should stay < 2GB (configurable)
- Evicted Keys: Should be minimal, indicating proper TTLs
- Key Count: Monitor growth over time
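For reference, the hit rate in the stats payload is derived from raw hit/miss counters, rounded to two decimals as in the example response (a sketch assuming the field semantics shown in the JSON above):

```typescript
// Hit rate as a percentage, rounded to two decimal places.
function hitRate(hits: number, misses: number): number {
    const total = hits + misses;
    if (total === 0) return 0; // avoid division by zero on a cold cache
    return Math.round((hits / total) * 10000) / 100;
}
```

With the example figures above, `hitRate(1000, 200)` yields `83.33`.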
Logging
Cache operations are logged for debugging:
```typescript
console.log(`Invalidated ${count} cache entries for tag: ${tag}`);
console.log(`Cache get error for key ${key}:`, error);
```
Best Practices
1. Use Appropriate TTLs
Choose TTLs based on:
- Data volatility
- Query cost
- Freshness requirements
2. Tag Everything
Always include tags for efficient invalidation:
```typescript
await cache.set(key, value, {
    ttl: 300,
    tags: [`guild:${guildId}`, `analytics:${type}`],
});
```
3. Handle Cache Misses
The cache service gracefully handles Redis unavailability:
```typescript
// If Redis is down, falls back to direct DB queries
const result = await cache.remember(key, ttl, async () => {
    return await db.query(...); // Still works if cache fails
});
```
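A sketch of how that fallback behavior can be implemented: if the cache backend throws on read or write, the callback still runs and its result is returned, so a Redis outage degrades to direct database queries instead of failing requests. The `CacheBackend` interface here is a hypothetical minimal shape for illustration, not the real service's API.

```typescript
// Minimal backend shape assumed for this sketch.
interface CacheBackend {
    get(key: string): Promise<string | null>;
    set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

async function rememberSafe<T>(
    backend: CacheBackend,
    key: string,
    ttl: number,
    fetch: () => Promise<T>
): Promise<T> {
    try {
        const cached = await backend.get(key);
        if (cached !== null) return JSON.parse(cached) as T;
    } catch {
        // Cache read failed (e.g. Redis down): fall through to fetch.
    }
    const fresh = await fetch();
    try {
        await backend.set(key, JSON.stringify(fresh), ttl);
    } catch {
        // Cache write failed: the caller still gets fresh data.
    }
    return fresh;
}

// Example: a backend that always throws, simulating an outage.
const down: CacheBackend = {
    get: async () => { throw new Error("redis down"); },
    set: async () => { throw new Error("redis down"); },
};
const result = await rememberSafe(down, "analytics:ghosts:123:all", 300, async () => 42);
```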
4. Monitor Performance
Regularly check cache statistics:
```bash
curl http://localhost:3001/api/admin/monitoring/cache/stats
```
5. Use Remember Pattern
Prefer the remember pattern over manual get/set:
```typescript
// Good ✅
const data = await cache.remember(key, ttl, () => fetchData());

// Verbose ❌
let data = await cache.get(key);
if (!data) {
    data = await fetchData();
    await cache.set(key, data, { ttl });
}
```
Real-Time Updates
WebSocket Integration (Future)
The PubSub service is ready for WebSocket integration:
```typescript
// Subscribe to analytics updates
pubsub.subscribe(`analytics:ghosts:${guildId}`, (data) => {
    // Broadcast to WebSocket clients
    io.to(guildId).emit('analytics:update', data);
});

// Publish updates when cache is invalidated
await pubsub.publishAnalyticsUpdate(guildId, 'ghosts', freshData);
```
Troubleshooting
Cache Not Working
1. Check Redis Connection

    ```bash
    redis-cli ping # Should return PONG
    ```

2. Verify Environment Variables

    ```bash
    echo $REDIS_URL
    echo $ENABLE_REDIS_RATE_LIMITING
    ```

3. Check Logs

    - Look for "Redis connected successfully"
    - Look for "Redis connection error"
Low Hit Rate
- Check TTLs - May be too short
- Check Invalidation - May be too aggressive
- Monitor Key Patterns - Ensure consistent key generation
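Inconsistent key construction is a common cause of a low hit rate: if one call site includes the `since` window and another omits it, the cache silently splits into entries that never hit each other. Routing all key construction through one builder keeps keys canonical. A sketch (the function name is illustrative):

```typescript
// Canonical cache key builder for analytics endpoints.
function analyticsKey(type: string, guildId: string, since?: Date): string {
    // Normalize the optional window so "no filter" always serializes
    // the same way.
    const sincePart = since ? since.toISOString() : "all";
    return `analytics:${type}:${guildId}:${sincePart}`;
}
```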
High Memory Usage
- Review TTLs - May be too long
- Check for Key Leaks - Unused keys not expiring
- Adjust Eviction Policy - Use `allkeys-lru`
Testing
Unit Tests
```bash
npm test -- __tests__/unit/services/cache.test.ts
```
Integration Tests
```bash
# Start Redis
docker run -d -p 6379:6379 redis:7-alpine

# Run tests
npm test -- __tests__/integration/
```
Manual Testing
```bash
# Populate the cache by hitting a cached endpoint
curl -X POST "http://localhost:3001/api/analytics/ghosts?guildId=123"

# Check cache stats
curl http://localhost:3001/api/admin/monitoring/cache/stats

# Invalidate cache
curl -X DELETE http://localhost:3001/api/admin/monitoring/cache/invalidate/guild:123

# Check stats again
curl http://localhost:3001/api/admin/monitoring/cache/stats
```
Performance Metrics
Expected Improvements
With proper caching implementation:
- Response Time: 50-90% faster for cached endpoints
- Database Load: 60-80% reduction in query load
- API Throughput: 2-5x increase in requests per second
- Cache Hit Rate: 80-95% for stable workloads
Benchmarking
```typescript
// Before caching
const start = Date.now();
const data = await db.query(...);
console.log(`Query time: ${Date.now() - start}ms`);
```

```typescript
// After caching
const start = Date.now();
const data = await cache.remember(key, ttl, () => db.query(...));
console.log(`Cache time: ${Date.now() - start}ms`);
```
Cache Warming
The cache service includes a warming utility to pre-populate cache on startup:
```typescript
import { cache } from './services/cache';

// Warm cache with initial data
await cache.warm([
    {
        key: 'analytics:ghosts:123:all',
        value: await fetchGhostData('123'),
        options: { ttl: 300, tags: ['guild:123'] },
    },
    // ... more entries
]);
```
Future Enhancements
- Distributed Caching - Multi-region cache synchronization
- Cache Compression - Reduce memory usage for large values
- Smart Invalidation - ML-based invalidation strategies
- Cache Analytics - Detailed usage patterns and optimization suggestions