ADR 0002: Database Schema per Module
Status: Accepted
Date: 2024-12-30
Deciders: Architecture Team
Context
The Spryx Backend is a modular monolith with multiple bounded contexts (modules):
- RAG - Retrieval-Augmented Generation
- Agent - AI Agent conversations
- Billing - Subscription and usage tracking
We need to decide how to organize database tables across these modules, considering:
1. Current monolithic deployment
2. Future potential extraction to microservices
3. Clear ownership and boundaries
4. Development team autonomy
Options Considered
- Single schema (public) - All tables in one schema
- Schema per module - Each module has its own PostgreSQL schema
- Separate databases - Each module has its own database
Decision
We will use a separate PostgreSQL schema for each module.
Rules:
- Schema name = module name (e.g., rag, agent, billing)
- No foreign keys between schemas
- No direct writes to another module's schema
- Cross-module access only via contracts/ports
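The "no foreign keys between schemas" rule could be checked automatically, for example in CI. The following is a minimal sketch, not the project's actual tooling: the `ForeignKey` tuple, the `cross_schema_violations` helper, and the idea of feeding it rows from the catalog query are assumptions made for illustration. The query itself only uses the standard PostgreSQL catalogs `pg_constraint`, `pg_class`, and `pg_namespace`.

```python
from typing import NamedTuple

# PostgreSQL catalog query listing every foreign key together with the
# schema of its source table and the schema of the table it references.
# Execute it with whatever database driver the project already uses.
ALL_FOREIGN_KEYS_SQL = """
SELECT con.conname, src_ns.nspname, dst_ns.nspname
FROM pg_constraint AS con
JOIN pg_class AS src ON src.oid = con.conrelid
JOIN pg_namespace AS src_ns ON src_ns.oid = src.relnamespace
JOIN pg_class AS dst ON dst.oid = con.confrelid
JOIN pg_namespace AS dst_ns ON dst_ns.oid = dst.relnamespace
WHERE con.contype = 'f'
"""


class ForeignKey(NamedTuple):
    """One row of the query above (hypothetical helper type)."""
    name: str
    source_schema: str
    target_schema: str


def cross_schema_violations(fks: list[ForeignKey]) -> list[ForeignKey]:
    """Return the foreign keys that cross a module (schema) boundary."""
    return [fk for fk in fks if fk.source_schema != fk.target_schema]
```

A check like this would fail the build whenever the returned list is non-empty, so the rule is enforced rather than relying on review alone.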
Consequences
Positive
- Clear ownership: Each module owns its schema completely
- Independent evolution: Schema changes inside a module don't require coordination with other modules
- Microservice ready: A module's schema can be extracted to a separate database without untangling cross-module foreign keys
- Testing isolation: Modules can be tested independently
- Performance isolation: No cross-module locks or FK cascades
Negative
- No referential integrity: Cross-module references can become stale
- Eventual consistency: Need to handle cross-module data sync
- Query complexity: Can't JOIN across modules directly
- Data duplication: May need to denormalize some data
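Because cross-module JOINs are unavailable, a combined view has to be assembled in application code from data each module exposes. The sketch below illustrates the idea under stated assumptions: `UsageRow`, `KnowledgeBaseSummary`, and `usage_report` are hypothetical names, and the `"(deleted)"` label is just one possible way to handle a stale reference softly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRow:
    """Hypothetical row owned by the billing module."""
    kb_id: str
    tokens: int


@dataclass(frozen=True)
class KnowledgeBaseSummary:
    """Hypothetical read-only DTO exposed by the rag module's contract."""
    id: str
    name: str


def usage_report(
    usage: list[UsageRow], kbs: list[KnowledgeBaseSummary]
) -> list[tuple[str, int]]:
    """Join billing usage with knowledge base names in memory.

    A usage row whose knowledge base no longer exists is kept and
    labelled "(deleted)" instead of raising, since no FK guarantees
    that the reference is still valid.
    """
    names = {kb.id: kb.name for kb in kbs}
    return [(names.get(row.kb_id, "(deleted)"), row.tokens) for row in usage]
```

For example, `usage_report([UsageRow("kb-1", 120), UsageRow("kb-9", 30)], [KnowledgeBaseSummary("kb-1", "Docs")])` returns `[("Docs", 120), ("(deleted)", 30)]`.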
Mitigations
- Stale references: Application-level validation + soft handling
- Consistency: Event-driven sync for critical data
- Queries: Query services for read-only cross-module views
- Duplication: Accept strategic denormalization
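Event-driven sync combined with strategic denormalization might look like the following sketch. It assumes an in-process, synchronous event bus; the `EventBus` and `AgentKbCache` classes and the event names `rag.knowledge_base_renamed` and `rag.knowledge_base_deleted` are invented for illustration and are not taken from the codebase.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Minimal synchronous publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Deliver the event to every handler registered for this type.
        for handler in self._handlers[event_type]:
            handler(payload)


class AgentKbCache:
    """Agent-side denormalized copy of knowledge base names.

    The Agent module never reads the rag schema; it keeps its own copy
    up to date by reacting to events published by the RAG module.
    """

    def __init__(self, bus: EventBus) -> None:
        self.names: dict[str, str] = {}
        bus.subscribe("rag.knowledge_base_renamed", self._on_renamed)
        bus.subscribe("rag.knowledge_base_deleted", self._on_deleted)

    def _on_renamed(self, payload: dict) -> None:
        self.names[payload["id"]] = payload["name"]

    def _on_deleted(self, payload: dict) -> None:
        # Removing an id that was never cached is not an error.
        self.names.pop(payload["id"], None)
```

In a real deployment the bus would more likely be asynchronous or backed by a message broker, which is where the eventual consistency noted above comes from.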
Implementation
Schema Creation
-- Each module creates its own schema
CREATE SCHEMA IF NOT EXISTS rag;
CREATE SCHEMA IF NOT EXISTS agent;
CREATE SCHEMA IF NOT EXISTS billing;
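If the schema statements are generated from the list of modules rather than written by hand, the "schema name = module name" rule holds by construction. This is a small sketch under that assumption; `create_schema_statements` is a hypothetical helper, and the identifier pattern is a deliberately strict choice for this example rather than PostgreSQL's full identifier grammar.

```python
import re

# Lowercase letters, digits and underscores only, not starting with a digit.
# Stricter than PostgreSQL requires, which keeps the names safe to
# interpolate into DDL without quoting.
_SCHEMA_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def create_schema_statements(modules: list[str]) -> list[str]:
    """Build one idempotent CREATE SCHEMA statement per module name."""
    statements = []
    for module in modules:
        if not _SCHEMA_NAME.fullmatch(module):
            raise ValueError(f"invalid schema name: {module!r}")
        statements.append(f"CREATE SCHEMA IF NOT EXISTS {module};")
    return statements
```

Calling `create_schema_statements(["rag", "agent", "billing"])` yields the three statements shown above, in the same order.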
Cross-Module Access Pattern
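from dataclasses import dataclass
from typing import Protocol

# Placeholder definitions so the example runs; the real types live in the project.
@dataclass
class KnowledgeBase:
    id: str

class NotFoundError(Exception):
    """Raised when a referenced resource does not exist."""

# Module A needs data from Module B.
# Use a contract/port, never direct DB access.
class RagContract(Protocol):
    async def get_knowledge_base(self, kb_id: str) -> KnowledgeBase | None:
        ...

# The Agent module depends only on the contract, not on the rag schema.
class AgentService:
    def __init__(self, rag: RagContract):
        self.rag = rag

    async def create_conversation(self, kb_id: str):
        kb = await self.rag.get_knowledge_base(kb_id)
        if kb is None:
            raise NotFoundError("Knowledge base not found")
        # ...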