ADR 0002: Database Schema per Module

Status: Accepted
Date: 2024-12-30
Deciders: Architecture Team

Context

The Spryx Backend is a modular monolith with multiple bounded contexts (modules):

  • RAG: Retrieval-Augmented Generation
  • Agent: AI Agent conversations
  • Billing: Subscription and usage tracking

We need to decide how to organize database tables across these modules, considering:

  1. Current monolithic deployment
  2. Future potential extraction to microservices
  3. Clear ownership and boundaries
  4. Development team autonomy

Options Considered

  1. Single schema (public) - All tables in one schema
  2. Schema per module - Each module has its own PostgreSQL schema
  3. Separate databases - Each module has its own database

Decision

We will use a separate PostgreSQL schema for each module.

Rules:

  • Schema name = module name (e.g., rag, agent, billing)
  • No foreign keys between schemas
  • No direct writes to another module's schema
  • Cross-module access only via contracts/ports
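
The "no foreign keys between schemas" rule can be checked mechanically, for example in CI. The following is a minimal sketch, not part of the decision itself: the query reads the standard PostgreSQL catalogs (pg_constraint, pg_class, pg_namespace), while the names CROSS_SCHEMA_FK_SQL, ForeignKey, and find_cross_schema_fks are hypothetical helpers introduced here for illustration.

```python
from typing import Iterable, NamedTuple

# Lists every foreign key whose source and target tables live in
# different schemas. Run it with any PostgreSQL driver and feed the
# rows into find_cross_schema_fks below.
CROSS_SCHEMA_FK_SQL = """
SELECT con.conname, src_ns.nspname, tgt_ns.nspname
FROM pg_constraint con
JOIN pg_class src ON src.oid = con.conrelid
JOIN pg_namespace src_ns ON src_ns.oid = src.relnamespace
JOIN pg_class tgt ON tgt.oid = con.confrelid
JOIN pg_namespace tgt_ns ON tgt_ns.oid = tgt.relnamespace
WHERE con.contype = 'f'
  AND src_ns.nspname <> tgt_ns.nspname;
"""


class ForeignKey(NamedTuple):
    """One foreign key constraint (hypothetical row shape)."""
    name: str
    source_schema: str
    target_schema: str


def find_cross_schema_fks(
    fks: Iterable[ForeignKey], module_schemas: set[str]
) -> list[ForeignKey]:
    """Return foreign keys that cross a module schema boundary."""
    return [
        fk
        for fk in fks
        if fk.source_schema != fk.target_schema
        and (fk.source_schema in module_schemas
             or fk.target_schema in module_schemas)
    ]
```

A check like this would fail the build as soon as the returned list is non-empty, which keeps the rule from eroding silently.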

Consequences

Positive

  • Clear ownership: Each module owns its schema completely
  • Independent evolution: Schema changes don't require coordination
  • Microservice ready: Easy to extract schema to separate database
  • Testing isolation: Modules can be tested independently
  • Performance isolation: No cross-module foreign-key locks or cascades

Negative

  • No referential integrity: Cross-module references can become stale
  • Eventual consistency: Need to handle cross-module data sync
  • Query complexity: Can't JOIN across modules directly
  • Data duplication: May need to denormalize some data

Mitigations

  1. Stale references: Validate references at the application level and degrade gracefully when the referenced record no longer exists
  2. Consistency: Event-driven sync for critical data
  3. Queries: Query services for read-only cross-module views
  4. Duplication: Accept strategic denormalization
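
Mitigations 1, 2, and 4 can be illustrated together. The sketch below assumes a simple in-process, synchronous event bus; a real deployment might instead use an outbox table or a message broker. All names (EventBus, KnowledgeBaseRenamed, KnowledgeBaseDeleted, AgentKbProjection) are hypothetical and do not come from the Spryx codebase.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class KnowledgeBaseRenamed:
    kb_id: str
    new_name: str


@dataclass(frozen=True)
class KnowledgeBaseDeleted:
    kb_id: str


class EventBus:
    """Minimal synchronous publish/subscribe bus."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable[[Any], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: Any) -> None:
        # Dispatch to every handler registered for this exact event type
        for handler in self._handlers[type(event)]:
            handler(event)


class AgentKbProjection:
    """Agent-side denormalized copy of knowledge base names."""

    def __init__(self, bus: EventBus) -> None:
        self.names: dict[str, str] = {}
        bus.subscribe(KnowledgeBaseRenamed, self.on_renamed)
        bus.subscribe(KnowledgeBaseDeleted, self.on_deleted)

    def on_renamed(self, event: KnowledgeBaseRenamed) -> None:
        self.names[event.kb_id] = event.new_name

    def on_deleted(self, event: KnowledgeBaseDeleted) -> None:
        self.names.pop(event.kb_id, None)

    def display_name(self, kb_id: str) -> str:
        # Soft handling: a stale reference yields a placeholder, not an error
        return self.names.get(kb_id, "(deleted knowledge base)")
```

The Agent module never reads the rag schema here: it keeps its own copy of the one field it needs and updates it from events, accepting that the copy may briefly lag behind the source.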

Implementation

Schema Creation

-- Each module creates its own schema
CREATE SCHEMA IF NOT EXISTS rag;
CREATE SCHEMA IF NOT EXISTS agent;
CREATE SCHEMA IF NOT EXISTS billing;
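
If the statements above are generated from the module list rather than written by hand, the "schema name = module name" rule holds by construction. A minimal sketch, assuming a hypothetical MODULES tuple and helper name; schema names are validated because SQL identifiers cannot be passed as bound parameters.

```python
import re

# Hypothetical registry of module names; each one maps to a schema
MODULES = ("rag", "agent", "billing")

# Conservative pattern for unquoted lowercase PostgreSQL identifiers
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def create_schema_statements(modules: tuple[str, ...]) -> list[str]:
    """Build one idempotent CREATE SCHEMA statement per module."""
    statements = []
    for name in modules:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid schema name: {name!r}")
        statements.append(f"CREATE SCHEMA IF NOT EXISTS {name};")
    return statements
```

Each module's migration setup could then execute only its own statement, so no module creates or alters another module's schema.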

Cross-Module Access Pattern

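# Module A needs data from Module B.
# Go through the contract/port; never access the other module's tables directly.
from dataclasses import dataclass
from typing import Protocol


@dataclass
class KnowledgeBase:
    id: str  # placeholder; the real model is owned by the RAG module


class NotFoundError(Exception):
    """Raised when a referenced resource does not exist."""


class RagContract(Protocol):
    """Port that the RAG module exposes to other modules."""

    async def get_knowledge_base(self, kb_id: str) -> KnowledgeBase | None:
        ...


# The Agent module depends on the contract, not on the RAG implementation
class AgentService:
    def __init__(self, rag: RagContract):
        self.rag = rag

    async def create_conversation(self, kb_id: str):
        kb = await self.rag.get_knowledge_base(kb_id)
        if kb is None:
            raise NotFoundError("Knowledge base not found")
        # ...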