* Initial plan * feat: Implement CLI tool with init, upload, and verify commands Co-authored-by: PatrickFanella <61631520+PatrickFanella@users.noreply.github.com> * fix: Resolve linting issues and add CLI-specific eslint config Co-authored-by: PatrickFanella <61631520+PatrickFanella@users.noreply.github.com> * refactor: Address code review feedback - improve validation, memory efficiency, and type safety Co-authored-by: PatrickFanella <61631520+PatrickFanella@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: PatrickFanella <61631520+PatrickFanella@users.noreply.github.com>
7.0 KiB
Database Indexing Strategy
This document outlines the indexing strategy for the Internet ID database schema to ensure optimal query performance at scale.
Overview
The database schema uses PostgreSQL with Prisma ORM. Indexes are strategically placed to optimize common query patterns while avoiding index bloat.
Index Categories
1. Primary Keys and Unique Constraints
All models have primary keys (@id) that are automatically indexed:
User.id(cuid)Content.id(cuid)PlatformBinding.id(cuid)Verification.id(cuid)Account.id(cuid)Session.id(cuid)
Unique constraints (automatically indexed):
User.address,User.emailContent.contentHashPlatformBinding.[platform, platformId](composite unique)Account.[provider, providerAccountId](composite unique)Session.sessionTokenVerificationToken.[identifier, token](composite unique)VerificationToken.token
2. Foreign Key Indexes
Foreign keys improve JOIN performance and enforce referential integrity:
Content model:
@@index([creatorId])- Optimizes queries filtering by user (Content.creator relation)@@index([creatorAddress])- Optimizes queries by blockchain address
PlatformBinding model:
@@index([contentId])- Optimizes queries for bindings by content
Verification model:
@@index([contentId])- Optimizes queries for verifications by content
Account model:
@@index([userId])- Optimizes queries for accounts by user@@index([userId, provider])- Composite index for efficient user + provider lookups
Session model:
@@index([userId])- Optimizes queries for sessions by user
3. Filter/Sort Indexes
These indexes optimize WHERE clauses and ORDER BY operations:
Time-based sorting (createdAt):
User.@@index([createdAt])Content.@@index([createdAt])PlatformBinding.@@index([createdAt])Verification.@@index([createdAt])
Status filtering:
Verification.@@index([status])- Filters by verification status (OK, WARN, FAIL)
Session expiration:
Session.@@index([expires])- Optimizes cleanup queries for expired sessions
4. Lookup Field Indexes
Content lookup:
Verification.@@index([contentHash])- Fast lookup of verifications by content hash
Platform filtering:
PlatformBinding.@@index([platform])- Query all bindings for a specific platform (e.g., all YouTube bindings)
Username lookup:
Account.@@index([username])- Fast lookup by platform username
5. Composite Indexes
Composite indexes optimize queries with multiple filters:
Verification queries:
@@index([contentHash, createdAt])- Optimizes "get verifications for content X, ordered by time"@@index([status, createdAt])- Optimizes "get failed verifications, ordered by time"
Account queries:
@@index([userId, provider])- Optimizes "get user's account for provider X"
Query Optimization Guidelines
Critical Query Patterns
-
List content by recency:
prisma.content.findMany({ orderBy: { createdAt: "desc" }, include: { bindings: true }, });- Uses:
Content.createdAtindex - Related binding queries use:
PlatformBinding.contentIdindex
- Uses:
-
Get content by hash:
prisma.content.findUnique({ where: { contentHash: hash }, });- Uses:
Content.contentHashunique constraint (automatically indexed)
- Uses:
-
List verifications for content:
prisma.verification.findMany({ where: { contentHash: hash }, orderBy: { createdAt: "desc" }, });- Uses:
Verification.[contentHash, createdAt]composite index
- Uses:
-
Filter verifications by status:
prisma.verification.findMany({ where: { status: "FAIL" }, orderBy: { createdAt: "desc" }, });- Uses:
Verification.[status, createdAt]composite index
- Uses:
-
Get user accounts by provider:
prisma.account.findFirst({ where: { userId, provider }, });- Uses:
Account.[userId, provider]composite index
- Uses:
-
Lookup platform binding:
prisma.platformBinding.upsert({ where: { platform_platformId: { platform, platformId } }, });- Uses:
PlatformBinding.[platform, platformId]unique constraint (automatically indexed)
- Uses:
Performance Recommendations
-
Always use indexes for:
- WHERE clauses on large tables
- ORDER BY operations
- JOIN operations (foreign keys)
- GROUP BY operations
-
Avoid:
- Full table scans on tables with > 1000 rows
- Non-indexed WHERE clauses on high-cardinality columns
- Leading wildcards in LIKE queries (e.g.,
%search)
-
Monitor and optimize:
- Use
EXPLAIN ANALYZEto verify index usage - Monitor slow query logs in production
- Add partial indexes for specific use cases if needed
- Use
Example: Using EXPLAIN ANALYZE
To verify index usage in PostgreSQL:
EXPLAIN ANALYZE SELECT * FROM "Verification"
WHERE "contentHash" = '0x123...'
ORDER BY "createdAt" DESC;
Expected output should show:
- Index Scan using
Verification_contentHash_createdAt_idx - No "Seq Scan" on large tables
Index Maintenance
Creating New Indexes
When adding new indexes:
- Analyze query patterns in production
- Test with
EXPLAIN ANALYZEfirst - Create migration:
npx prisma migrate dev --name add_index_name - Monitor performance impact
Partial Indexes (Future Consideration)
For common filter combinations, consider partial indexes:
// Example: Index only failed verifications
@@index([createdAt], where: status = 'FAIL')
PostgreSQL supports partial indexes, but Prisma's support is limited. Use raw SQL migrations if needed.
Index Size Monitoring
Monitor index sizes to prevent bloat:
SELECT schemaname, tablename, indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
Migration Strategy
For Existing Databases
- Add indexes during low-traffic periods
- Use
CREATE INDEX CONCURRENTLYin PostgreSQL (prevents table locking) - Monitor query performance before/after
Schema Changes
Always run migrations with:
npm run db:migrate
This ensures:
- Both clients (API and Web) are updated
- Migration history is tracked
- Indexes are created properly
Related Issues
- Issue #12: Database schema optimization with proper indexes
- Issue #13: Monitor slow query log in production
- Issue #10: Optimization roadmap
- Issue #5: Schema consolidation
Performance Targets
With proper indexing, the system should handle:
- 100k+ content registrations
- 1M+ verifications
- Sub-100ms query response times for indexed queries
- Efficient pagination with offset/cursor-based approaches
Future Optimizations
- Partial indexes for common status filters
- Covering indexes to avoid table lookups
- Index-only scans for frequently accessed columns
- Materialized views for complex aggregations
- Query result caching at the application layer