Internet-ID Architecture
Overview
Internet-ID is a decentralized content provenance system that lets creators anchor their original content on-chain, proving authorship and creation time. The system consists of five main layers (web UI, API server, smart contract, database, and IPFS storage) that work together to provide end-to-end content verification.
System Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ Web UI (Next.js) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Upload Flow │ │ Verify Flow │ │ Account/Auth │ │
│ │ (One-shot) │ │ (Public) │ │ (NextAuth) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ └──────────────────┴──────────────────┘ │
│ │ │
│ API Calls (REST) │
└────────────────────────────┼────────────────────────────────────────┘
│
┌────────────────────────────┼────────────────────────────────────────┐
│ Express API Server │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Upload │ │ Register │ │ Verify │ │
│ │ /upload │ │ /register │ │ /verify │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ │
│ │ IPFS │ │ Web3/ │ │ Prisma │ │
│ │ Service │ │ Ethers │ │ ORM │ │
│ └─────────┘ └────┬────┘ └────┬────┘ │
│ │ │ │
│ Rate Limiting ◄──► Redis Cache ◄──────────┐ │ │
└────────────────────────────┼────────────────┼─┼─────────────────────┘
│ │ │
│ │ │
┌────────────────────────────┼────────────────┘ │
│ Blockchain Layer │ │
│ ┌─────────────────────────▼────────────┐ │
│ │ ContentRegistry.sol (Smart Contract)│ │
│ │ • register(hash, manifestURI) │ │
│ │ • bindPlatform(hash, platform, id) │ │
│ │ • resolveByPlatform(platform, id) │ │
│ └───────────────────────────────────────┘ │
│ │
│ Deployed on multiple EVM chains: │
│ • Base, Base Sepolia (Recommended) │
│ • Ethereum, Sepolia │
│ • Polygon, Polygon Amoy │
│ • Arbitrum, Optimism (+ testnets) │
└────────────────────────────────────────────────┘
│
┌────────────────────────────────────────────────┼─────────────────────┐
│ Database Layer (Prisma) │ │
│ ┌────────────────────────────────────────────▼───────────────────┐ │
│ │ PostgreSQL / SQLite │ │
│ │ ┌──────────┐ ┌──────────┐ ┌────────────┐ ┌─────────────┐ │ │
│ │ │ Users │ │ Contents │ │ Platform │ │ Verifications│ │ │
│ │ │ │ │ │ │ Bindings │ │ │ │ │
│ │ └──────────┘ └──────────┘ └────────────┘ └─────────────┘ │ │
│ │ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Accounts │ │ Sessions │ (NextAuth models) │ │
│ │ └──────────┘ └──────────┘ │ │
│ └────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ External Storage (IPFS) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Web3.Storage │ │ Pinata │ │ Infura │ │
│ │ (primary) │ │ (fallback) │ │ (fallback) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ Stores: Content files, Manifest JSON files │
└─────────────────────────────────────────────────────────────────────┘
Component Interactions
1. Smart Contract Layer (ContentRegistry.sol)
Purpose: Immutable on-chain registry for content provenance
Key Functions:
- register(contentHash, manifestURI) - Anchor content hash and manifest location on-chain
- bindPlatform(contentHash, platform, platformId) - Link platform-specific IDs (e.g., YouTube video IDs) to original content
- resolveByPlatform(platform, platformId) - Look up content by platform binding
Storage:
- Mapping: contentHash → Entry(creator, manifestURI, timestamp)
- Mapping: platformKey → contentHash (for YouTube, TikTok, etc.)
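To make the two mappings concrete, here is a minimal in-memory sketch of the registry's data model in TypeScript. It is not the Solidity contract. The class name, the error messages, the duplicate-registration and duplicate-binding checks, and the plain-string platform key are all assumptions made for illustration (the platform binding flow later in this document derives the key with keccak256).

```typescript
// Illustrative in-memory model of the two ContentRegistry mappings.
// Not the contract: names, error messages, and duplicate checks are assumptions.
interface Entry {
  creator: string;
  manifestURI: string;
  timestamp: number; // seconds since epoch
}

class RegistryModel {
  private entries = new Map<string, Entry>(); // contentHash → Entry
  private platformToHash = new Map<string, string>(); // platformKey → contentHash

  // Hypothetical key format: a plain string stands in for the hashed key.
  private static platformKey(platform: string, platformId: string): string {
    return `${platform}:${platformId}`;
  }

  register(sender: string, contentHash: string, manifestURI: string, now: number): void {
    if (this.entries.has(contentHash)) throw new Error("already registered");
    this.entries.set(contentHash, { creator: sender, manifestURI, timestamp: now });
  }

  // Mirrors the creator-only rule: only the registering address may bind.
  bindPlatform(sender: string, contentHash: string, platform: string, platformId: string): void {
    const entry = this.entries.get(contentHash);
    if (!entry) throw new Error("not registered");
    if (entry.creator !== sender) throw new Error("not creator");
    const key = RegistryModel.platformKey(platform, platformId);
    if (this.platformToHash.has(key)) throw new Error("platform ID already bound");
    this.platformToHash.set(key, contentHash);
  }

  resolveByPlatform(
    platform: string,
    platformId: string,
  ): (Entry & { contentHash: string }) | undefined {
    const contentHash = this.platformToHash.get(RegistryModel.platformKey(platform, platformId));
    if (contentHash === undefined) return undefined;
    const entry = this.entries.get(contentHash);
    return entry ? { contentHash, ...entry } : undefined;
  }
}
```

Resolving a platform ID is two lookups: platform key to content hash, then content hash to entry. This is why a re-encoded upload can still be traced back to the original registration.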
Multi-Chain: Deployed on multiple EVM networks for flexibility and cost optimization.
2. API Server Layer (Express + TypeScript)
Purpose: Business logic, IPFS uploads, blockchain interactions, caching
Key Endpoints:
Protected Endpoints (require x-api-key header when API_KEY is set):
- POST /api/upload - Upload files to IPFS
- POST /api/manifest - Generate and optionally upload manifest JSON
- POST /api/register - Register content hash on-chain via ContentRegistry
- POST /api/bind - Bind single platform ID
- POST /api/bind-many - Bind multiple platform IDs in batch
Public Endpoints:
- GET /api/health - Server health check
- GET /api/contents - List registered content
- POST /api/verify - Verify file against manifest and on-chain entry
- POST /api/proof - Generate portable proof bundle
- GET /api/resolve?platform=youtube&platformId=xxx - Resolve platform binding
- GET /api/badge/[hash] - Generate SVG badge for content
- GET /api/qr?url=... - Generate QR code for share links
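The rule for protected endpoints (require the x-api-key header only when API_KEY is set) could be checked roughly as follows. This is a sketch, not the project's middleware: the function name isAuthorized is hypothetical, and the constant-time comparison is an added precaution rather than something the source specifies.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Returns true when a request may access a protected endpoint.
// `configuredKey` stands in for process.env.API_KEY (assumption).
function isAuthorized(
  headers: Record<string, string | undefined>,
  configuredKey: string | undefined,
): boolean {
  // Protection is off when no API key is configured.
  if (!configuredKey) return true;
  const presented = headers["x-api-key"];
  if (!presented) return false;
  // Hash both values so the buffers have equal length for timingSafeEqual.
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(configuredKey).digest();
  return timingSafeEqual(a, b);
}
```

In an Express app this check would typically sit in a middleware that responds with 401 when it returns false.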
Key Services:
- IPFS Service: Multi-provider upload with automatic fallback (Web3.Storage → Pinata → Infura)
- Registry Service: Blockchain interactions using ethers.js
- Cache Service: Redis-based caching for performance (optional)
- Rate Limiting: Tiered rate limits (strict/moderate/relaxed) using Redis or in-memory store
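As a rough illustration of tiered rate limiting with the in-memory store, the sketch below uses a fixed-window counter. The per-tier limits, the 60-second window, and the class name are hypothetical. This document does not state the real values or the algorithm used.

```typescript
type Tier = "strict" | "moderate" | "relaxed";

// Hypothetical limits (requests per window); the real values are not given here.
const TIER_LIMITS: Record<Tier, number> = { strict: 10, moderate: 100, relaxed: 1000 };
const WINDOW_MS = 60_000;

class InMemoryRateLimiter {
  private windows = new Map<string, { start: number; count: number }>();

  // Returns true if the request identified by `key` (IP or API key) is allowed.
  allow(key: string, tier: Tier, now: number = Date.now()): boolean {
    const id = `${tier}:${key}`;
    const current = this.windows.get(id);
    // Start a fresh window on first use or once the previous window has elapsed.
    if (!current || now - current.start >= WINDOW_MS) {
      this.windows.set(id, { start: now, count: 1 });
      return true;
    }
    if (current.count >= TIER_LIMITS[tier]) return false;
    current.count += 1;
    return true;
  }
}
```

An in-memory store like this only counts requests seen by one process. Sharing the counters through Redis is what makes the limits hold across multiple API instances.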
Dependencies:
- Express.js for HTTP server
- Ethers.js v6 for blockchain interactions
- Prisma ORM for database access
- Multer for file uploads
- Redis for caching and rate limiting (optional)
3. Database Layer (Prisma + PostgreSQL/SQLite)
Purpose: Store content metadata, platform bindings, verifications, and user data
Key Models:
Content Management:
- Content - Registered content with hash, manifest URI, creator info
- PlatformBinding - Links content to platform-specific IDs
- Verification - History of verification attempts
Authentication (NextAuth):
- User - User accounts (email, wallet address)
- Account - OAuth provider accounts (GitHub, Google, Twitter, etc.)
- Session - Active user sessions
Schema Location: Single source of truth at prisma/schema.prisma
Generators: Two Prisma clients generated from one schema:
- Root client for API/scripts: ./node_modules/@prisma/client
- Web client for Next.js: ../web/node_modules/.prisma/client
Performance: 17 indexes optimize queries for content lookup, creator filtering, and platform resolution.
4. Web UI Layer (Next.js App Router)
Purpose: User-facing interface for content registration, verification, and account management
Key Pages/Features:
Upload & Registration:
- /upload - Upload files to IPFS
- /manifest - Create manifest JSON
- /register - Register content on-chain
- /oneshot - One-click flow: upload → manifest → register (with optional content upload)
Verification:
- /verify - Public verification page (shareable)
- /proof - Generate portable proof bundles
Platform Bindings:
- /bind - Bind single platform ID
- /bind-many - Batch bind multiple platform IDs
Account & Auth:
- /account - User profile and linked OAuth accounts
- NextAuth integration for GitHub, Google, Twitter, TikTok, etc.
Browse & Share:
- /contents - Browse registered content
- Share block with badge, QR code, embed HTML
Technologies:
- Next.js 15 (App Router)
- NextAuth v4 for authentication
- React 18 (Server Components)
- Prisma Client (web generator)
5. IPFS Storage Layer
Purpose: Decentralized storage for content files and manifest JSON
Providers (with automatic fallback):
- Web3.Storage - Primary, free tier available
- Pinata - Fallback, JWT authentication
- Infura - Fallback, project credentials required
- Local Kubo Node - Optional, for self-hosting
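Automatic fallback across providers can be sketched as trying each provider in order and returning the first success. The IpfsProvider interface and the function name below are hypothetical. The real service wraps the provider SDKs, which are not shown here.

```typescript
// A provider is anything that can pin bytes and return a CID (hypothetical interface).
interface IpfsProvider {
  name: string;
  upload(data: Uint8Array): Promise<string>;
}

// Tries providers in the given order, e.g. Web3.Storage, then Pinata, then Infura.
async function uploadWithFallback(
  providers: IpfsProvider[],
  data: Uint8Array,
): Promise<{ cid: string; provider: string }> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      const cid = await provider.upload(data);
      return { cid, provider: provider.name };
    } catch (err) {
      // Record the failure and fall through to the next provider.
      errors.push(`${provider.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all IPFS providers failed (${errors.join("; ")})`);
}
```

Collecting every provider's error message before throwing keeps the final failure debuggable when all of them are down.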
What Gets Stored:
- Content files (videos, images, documents)
- Manifest JSON files (metadata + signature)
Why IPFS:
- Content-addressed (the CID is derived from a hash of the content)
- Decentralized and censorship-resistant
- Verifiable integrity
Data Flow Examples
Registration Flow (Complete)
1. Creator hashes content locally (SHA-256)
└─> contentHash: 0xabc123...
2. Creator signs manifest JSON with private key
└─> Manifest: { content_hash, content_uri, signature, creator }
3. Upload manifest to IPFS (via API)
API (/api/upload) → IPFS Service → Web3.Storage
└─> manifestCID: QmXyz789...
└─> manifestURI: ipfs://QmXyz789...
4. Register on-chain (via API)
API (/api/register) → Registry Service → ContentRegistry.register(hash, manifestURI)
└─> Transaction broadcast to blockchain
└─> Event emitted: ContentRegistered(hash, creator, manifestURI, timestamp)
5. Store metadata in database
Prisma → Content table
└─> Record: { contentHash, manifestUri, creatorAddress, txHash, ... }
6. Cache cleared (if Redis enabled)
Cache Service invalidates related keys
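Steps 1 to 3 of the registration flow can be sketched with Node's built-in crypto module. The manifest field names follow the flow above. The helper names are hypothetical, signing is omitted, and treating content_uri as an ipfs:// URI of the uploaded content is an assumption.

```typescript
import { createHash } from "node:crypto";

// Manifest body before signing; field names follow the flow above.
interface UnsignedManifest {
  content_hash: string;
  content_uri: string;
  creator: string;
}

// Step 1: SHA-256 of the raw bytes as a 0x-prefixed hex string.
function computeContentHash(content: Uint8Array): string {
  return "0x" + createHash("sha256").update(content).digest("hex");
}

// Step 3: turn a CID returned by IPFS into the URI stored on-chain.
function toIpfsUri(cid: string): string {
  return `ipfs://${cid}`;
}

// Step 2 (without the signature): assemble the manifest fields.
function buildUnsignedManifest(
  content: Uint8Array,
  contentCid: string,
  creator: string,
): UnsignedManifest {
  return {
    content_hash: computeContentHash(content),
    content_uri: toIpfsUri(contentCid),
    creator,
  };
}
```

Because the hash is computed locally from the original bytes, the creator can produce contentHash without uploading the content itself.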
Verification Flow (Public)
1. Verifier provides file or platform URL
└─> File: video.mp4
└─> OR Platform URL: https://youtube.com/watch?v=abc123
2. Compute file hash (SHA-256)
└─> contentHash: 0xabc123...
3. Fetch manifest from IPFS
manifestURI → IPFS Gateway → manifest.json
└─> Extract: content_hash, signature, creator
4. Verify signature
ecrecover(messageHash, signature) → recovered address
└─> Compare with manifest.creator
5. Check on-chain registry
ContentRegistry.entries[hash] → Entry
└─> Verify: creator, manifestURI match
6. Return verification result
API (/api/verify) → Response: { status: "valid", creator, timestamp, ... }
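The decision logic in steps 2 to 6 can be sketched as a pure function. It assumes the file hash and the ECDSA-recovered signer are computed elsewhere and passed in. The type names, the "invalid" status, and the reason strings are hypothetical. Only the "valid" status appears in the flow above.

```typescript
interface Manifest {
  content_hash: string;
  creator: string;
  signature: string;
}

interface OnChainEntry {
  creator: string;
  manifestURI: string;
  timestamp: number;
}

type VerifyResult =
  | { status: "valid"; creator: string; timestamp: number }
  | { status: "invalid"; reason: string };

// Hex strings (hashes, addresses) are compared case-insensitively.
const sameHex = (a: string, b: string): boolean => a.toLowerCase() === b.toLowerCase();

function verifyContent(input: {
  fileHash: string; // SHA-256 of the file under test
  manifest: Manifest; // fetched from IPFS
  manifestURI: string; // URI the manifest was fetched from
  recoveredSigner: string; // result of ECDSA recovery, done elsewhere
  entry: OnChainEntry | undefined; // registry lookup result
}): VerifyResult {
  const { fileHash, manifest, manifestURI, recoveredSigner, entry } = input;
  if (!sameHex(fileHash, manifest.content_hash)) {
    return { status: "invalid", reason: "file hash does not match manifest" };
  }
  if (!sameHex(recoveredSigner, manifest.creator)) {
    return { status: "invalid", reason: "signature was not produced by manifest creator" };
  }
  if (!entry || entry.timestamp === 0) {
    return { status: "invalid", reason: "content is not registered on-chain" };
  }
  if (!sameHex(entry.creator, manifest.creator) || entry.manifestURI !== manifestURI) {
    return { status: "invalid", reason: "on-chain entry does not match manifest" };
  }
  return { status: "valid", creator: entry.creator, timestamp: entry.timestamp };
}
```

The checks run from cheapest to most authoritative, so a tampered file is rejected before any on-chain data is consulted.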
Platform Binding Flow (YouTube Example)
1. Creator uploads master file to YouTube
└─> YouTube re-encodes → different hash
└─> Video ID: dQw4w9WgXcQ
2. Bind YouTube ID to original content hash
API (/api/bind) → Registry Service
└─> ContentRegistry.bindPlatform(hash, "youtube", "dQw4w9WgXcQ")
└─> platformKey = keccak256("youtube:dQw4w9WgXcQ")
3. Store binding in database
Prisma → PlatformBinding table
└─> Record: { platform: "youtube", platformId: "dQw4w9WgXcQ", contentId: ... }
4. Verification via YouTube URL
API (/api/resolve?platform=youtube&platformId=dQw4w9WgXcQ)
└─> Lookup binding → contentHash
└─> Fetch manifest and verify as usual
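Step 4 starts from a platform URL, so something has to turn that URL into the platform and platformId query parameters. The sketch below handles only two common YouTube URL shapes. The function names are hypothetical, and the project's own parser may cover more platforms and formats.

```typescript
interface PlatformRef {
  platform: string;
  platformId: string;
}

// Extracts the video ID from youtube.com/watch?v=... or youtu.be/... URLs.
function parseYouTubeUrl(raw: string): PlatformRef | undefined {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return undefined; // not a parseable URL
  }
  const host = url.hostname.replace(/^www\./, "");
  let id: string | null = null;
  if (host === "youtube.com" && url.pathname === "/watch") {
    id = url.searchParams.get("v");
  } else if (host === "youtu.be") {
    id = url.pathname.slice(1) || null;
  }
  return id ? { platform: "youtube", platformId: id } : undefined;
}

// Builds the request path for the resolve endpoint listed under Public Endpoints.
function resolvePath(ref: PlatformRef): string {
  const query = new URLSearchParams({ platform: ref.platform, platformId: ref.platformId });
  return `/api/resolve?${query.toString()}`;
}
```

For example, parsing https://youtube.com/watch?v=dQw4w9WgXcQ and passing the result to resolvePath produces the same request path shown in step 4.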
Security Model
Authentication & Authorization
- API Key Protection: Optional API_KEY environment variable protects sensitive endpoints
- OAuth Integration: NextAuth supports multiple providers (GitHub, Google, Twitter, TikTok, etc.)
- Creator-Only Operations: Smart contract enforces onlyCreator modifier for updates and bindings
- Signature Verification: Manifest signatures validated using ECDSA recovery
Input Validation
- Zod Schemas: Comprehensive validation for all API inputs
- XSS Prevention: Sanitization of user-provided strings
- SQL Injection: Prisma ORM uses parameterized queries
- Path Traversal: File upload paths restricted to temp directories
- Rate Limiting: Tiered limits prevent abuse (strict/moderate/relaxed)
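To show the kind of check an input schema performs, here is a hand-written validator for a registration request in plain TypeScript. It is a stand-in, not the project's Zod schema. The body field names (contentHash, manifestURI), the 32-byte hex rule, and the ipfs:// requirement are assumptions.

```typescript
type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

interface RegisterInput {
  contentHash: string;
  manifestURI: string;
}

// Validates an untrusted request body before it reaches business logic.
function parseRegisterInput(body: unknown): ParseResult<RegisterInput> {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be an object" };
  }
  const { contentHash, manifestURI } = body as Record<string, unknown>;
  // A SHA-256 digest is 32 bytes, i.e. 64 hex characters after the 0x prefix.
  if (typeof contentHash !== "string" || !/^0x[0-9a-fA-F]{64}$/.test(contentHash)) {
    return { ok: false, error: "contentHash must be 0x followed by 64 hex characters" };
  }
  if (
    typeof manifestURI !== "string" ||
    !manifestURI.startsWith("ipfs://") ||
    manifestURI.length <= "ipfs://".length
  ) {
    return { ok: false, error: "manifestURI must be a non-empty ipfs:// URI" };
  }
  return { ok: true, value: { contentHash, manifestURI } };
}
```

Returning a tagged result instead of throwing lets the route handler map validation failures to a 400 response without a try/catch.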
Smart Contract Security
- No External Calls: No reentrancy risk
- Integer Overflow: Solidity 0.8+ built-in protection
- Access Control: onlyCreator modifier for sensitive operations
- Timestamp-Based Existence: Simple, gas-efficient checks
See: Security Policy | Smart Contract Audit
Caching & Performance
Redis Cache (Optional)
When REDIS_URL is configured, the API uses Redis for:
- Response Caching:
  - Content metadata: 10 minutes
  - Manifest data: 15 minutes
  - Platform bindings: 3 minutes
  - Verification status: 5 minutes
  - IPFS gateway URLs: 30 minutes
- Rate Limiting:
  - Distributed rate limiting across multiple API instances
  - Per-IP and per-API-key tracking
Cache Strategy: Cache-aside pattern with automatic invalidation on writes
Monitoring: /api/cache/metrics endpoint for hit rates and performance stats
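The cache-aside pattern with TTLs and invalidation on writes can be sketched with an in-memory map standing in for Redis. The TTL values come from the list above. The class, function, and key names are hypothetical.

```typescript
// TTLs from the response caching list, in seconds.
const TTL_SECONDS = {
  content: 600,
  manifest: 900,
  binding: 180,
  verification: 300,
  gateway: 1800,
} as const;

// In-memory stand-in for Redis with per-key expiry.
class TtlCache {
  private store = new Map<string, { value: unknown; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get<T>(key: string): T | undefined {
    const hit = this.store.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.store.delete(key); // expired: drop and report a miss
      return undefined;
    }
    return hit.value as T;
  }

  set(key: string, value: unknown, ttlSeconds: number): void {
    this.store.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  // Called after writes (e.g. a new registration) so stale data is not served.
  invalidate(key: string): void {
    this.store.delete(key);
  }
}

// Cache-aside: read the cache first, load from the source on a miss, then populate.
async function getOrLoad<T>(
  cache: TtlCache,
  key: string,
  ttlSeconds: number,
  load: () => Promise<T>,
): Promise<T> {
  const cached = cache.get<T>(key);
  if (cached !== undefined) return cached;
  const fresh = await load();
  cache.set(key, fresh, ttlSeconds);
  return fresh;
}
```

The short TTL on platform bindings reflects that they can change after registration, while a manifest is fixed once it has been uploaded.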
Database Indexes
17 indexes optimize common queries:
- Content lookup by hash (unique)
- Creator filtering (non-unique)
- Platform binding resolution (unique composite)
- Temporal queries (createdAt indexes)
See: Caching Architecture | Database Indexing Strategy
Multi-Chain Support
Why Multi-Chain?
- Cost Optimization: L2 and sidechain networks (Base, Polygon, Arbitrum, Optimism) offer roughly 10-100x lower gas costs than Ethereum mainnet
- Network Effects: Reach different ecosystems and user bases
- Redundancy: Deploy on multiple chains for resilience
Deployment Model
- Independent Contracts: Each chain has its own ContentRegistry deployment
- Saved Addresses: Deployment info saved in deployed/<network>.json
- Automatic Resolution: Registry service selects contract based on chainId
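Selecting a contract by chainId can be sketched as a lookup from chain ID to network name, followed by a read of the matching deployment file. The chain IDs below are the public ones for each network. The network file names, the address field in the JSON, and the function names are assumptions about how deployed/<network>.json is organized.

```typescript
// Public chain IDs; the network names used for file lookup are hypothetical.
const NETWORK_BY_CHAIN_ID: Record<number, string> = {
  1: "ethereum",
  11155111: "sepolia",
  137: "polygon",
  80002: "polygonAmoy",
  8453: "base",
  84532: "baseSepolia",
  42161: "arbitrum",
  421614: "arbitrumSepolia",
  10: "optimism",
  11155420: "optimismSepolia",
};

// Assumed shape of a deployment file.
interface Deployment {
  address: string;
}

function deploymentPath(chainId: number): string {
  const network = NETWORK_BY_CHAIN_ID[chainId];
  if (!network) throw new Error(`unsupported chainId ${chainId}`);
  return `deployed/${network}.json`;
}

// `readJson` is injected so the lookup can be exercised without touching the file system.
function resolveRegistryAddress(
  chainId: number,
  readJson: (path: string) => Deployment,
): string {
  const { address } = readJson(deploymentPath(chainId));
  if (!/^0x[0-9a-fA-F]{40}$/.test(address)) {
    throw new Error(`invalid contract address for chainId ${chainId}`);
  }
  return address;
}
```

Because each chain has an independent deployment, the same content hash can exist on several chains with different entries, so the chainId has to travel with every registry call.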
Supported Networks
Mainnets: Ethereum, Polygon, Base, Arbitrum, Optimism
Testnets: Sepolia, Polygon Amoy, Base Sepolia, Arbitrum Sepolia, Optimism Sepolia
See: Multi-Chain Deployment Guide
Technology Stack Summary
| Layer | Technology | Purpose |
|---|---|---|
| Smart Contracts | Solidity 0.8.20 | Immutable content registry |
| Development | Hardhat + TypeScript | Contract compilation, testing, deployment |
| Blockchain | Ethers.js v6 | Web3 interactions |
| API | Express.js | REST API server |
| Database | Prisma ORM | Type-safe database access |
| Storage | PostgreSQL / SQLite | Relational data storage |
| Cache | Redis | Performance optimization |
| IPFS | ipfs-http-client | Decentralized file storage |
| Web | Next.js 15 (App Router) | User interface |
| Auth | NextAuth v4 | OAuth integration |
| Validation | Zod | Input validation schemas |
| Linting | ESLint + Prettier | Code quality |
| Testing | Mocha + Chai | Unit and integration tests |
| CI/CD | GitHub Actions | Automated testing |
Environment Configuration
See Contributor Onboarding Guide for detailed setup instructions.
Further Reading
- Contributor Onboarding Guide - Setup, development workflow, testing
- Smart Contract Audit - Security analysis
- Input Validation - Zod schemas and security
- Caching Architecture - Redis caching details
- Rate Limiting - Rate limit configuration
- Multi-Chain Deployment - Deployment guide
- Database Indexing - Query optimization
- Platform Verification - Platform binding details