# Phase 1: Data Foundation - Research

**Researched:** 2026-02-24
**Domain:** PostgreSQL schema design, Supabase PostgREST model layer, TypeScript static class pattern
**Confidence:** HIGH

## User Constraints (from CONTEXT.md)

### Locked Decisions

#### Migration approach

- Use the existing `DatabaseMigrator` class in `backend/src/models/migrate.ts`
- New `.sql` files go in `src/models/migrations/`, run with `npm run db:migrate`
- The migrator tracks applied migrations in a `migrations` table — handles idempotency
- Forward-only migrations (no rollback/down scripts). If something needs fixing, write a new migration.
- Migrations execute via `supabase.rpc('exec_sql', { sql })` — works with cloud Supabase from any environment, including Firebase

#### Schema details

- Status fields use TEXT with CHECK constraints (e.g., `CHECK (status IN ('healthy','degraded','down'))`) — easy to extend, no enum type management
- Table names are descriptive, matching existing style: `service_health_checks`, `alert_events` (like `processing_jobs`, `document_chunks`)
- Include JSONB `probe_details` / `details` columns for flexible per-service metadata (response codes, error specifics) without future schema changes
- All tables get indexes on `created_at` (required for 30-day retention queries and dashboard time-range filters)
- Enable Row Level Security on new tables — admin-only access, matching existing security patterns

#### Model layer pattern

- One model file per table: `HealthCheckModel.ts`, `AlertEventModel.ts`
- Static methods on model classes (e.g., `AlertEventModel.create()`, `AlertEventModel.findActive()`) — matches `DocumentModel.ts` pattern
- Use `getSupabaseServiceClient()` (PostgREST) for all monitoring reads/writes — monitoring is not on the critical processing path, so there is no need for the direct PostgreSQL pool
- Input validation in the model layer before writing (defense in depth alongside DB CHECK constraints)

### Claude's Discretion

- Exact column types for non-status fields (INTEGER vs BIGINT for `latency_ms`, etc.)
- Whether to create a shared base model or keep models independent
- Index strategy beyond `created_at` (e.g., composite indexes on `service_name` + `created_at`)
- Winston logging patterns within model methods

### Deferred Ideas (OUT OF SCOPE)

None — discussion stayed within phase scope

---

## Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| INFR-01 | Database migrations create `service_health_checks` and `alert_events` tables with indexes on `created_at` | Migration file naming convention (`012_`), `CREATE TABLE IF NOT EXISTS` + `CREATE INDEX IF NOT EXISTS` patterns from migrations 005/010; TEXT + CHECK for status; JSONB for `probe_details`; `TIMESTAMP WITH TIME ZONE` for `created_at` |
| INFR-04 | Analytics writes use existing Supabase connection, no new database infrastructure | `getSupabaseServiceClient()` already exported from `config/supabase.ts`; PostgREST `.from().insert().select().single()` pattern confirmed in `DocumentModel.ts`; monitoring path is not critical, so no direct pg pool needed |

---

## Summary

Phase 1 is a pure database + model layer task. No services, routes, or frontend changes. The existing codebase has a well-established pattern: SQL migration files in `backend/src/models/migrations/` (sequentially numbered), a `DatabaseMigrator` class that tracks and runs them via `supabase.rpc('exec_sql')`, and TypeScript model classes with static methods using `getSupabaseServiceClient()`. All of this exists and works — the task is to follow it precisely.

The most important finding is that `getSupabaseServiceClient()` creates a **new client on every call** (no singleton caching, unlike `getSupabaseClient()`). This is intentional for the service-key client, but it means model methods must call it per-operation, not store it at module level.
Existing models agree on this point — `ProcessingJobModel.ts` and `DocumentModel.ts` both call `getSupabaseServiceClient()` inline where needed. Inline-per-method is the consistent choice.

The codebase has no RLS SQL in any existing migration — existing tables pre-date or omit RLS. The CONTEXT.md requires RLS on the new tables, so this is new territory within this project. The pattern is standard Supabase RLS (`ALTER TABLE ... ENABLE ROW LEVEL SECURITY` + `CREATE POLICY`) and well-documented, but it is new to these migrations and worth verifying against the actual Supabase RLS policy syntax for service-role key bypass.

**Primary recommendation:** Create migration `012_create_monitoring_tables.sql` following the pattern of `005_create_processing_jobs_table.sql`, then create `HealthCheckModel.ts` and `AlertEventModel.ts` following the `DocumentModel.ts` static-class pattern, using `getSupabaseServiceClient()` per method.

---

## Standard Stack

### Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `@supabase/supabase-js` | Already installed | PostgREST client for model layer reads/writes | Locked: project uses Supabase exclusively; `getSupabaseServiceClient()` already in `config/supabase.ts` |
| PostgreSQL (via Supabase) | Cloud-managed | Table storage, indexes, CHECK constraints, RLS | Already the only database; no new infrastructure |
| TypeScript | Already installed | Model type definitions | Project-wide strict TypeScript |
| Winston logger | Already installed | Logging within model methods | `backend/src/utils/logger.ts` — NEVER `console.log` per `.cursorrules` |

### Supporting

| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| `pg` (Pool) | Already installed | Direct PostgreSQL for critical-path writes | NOT needed here — monitoring is not critical path; use PostgREST only |

### Alternatives Considered

| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| `getSupabaseServiceClient()` | `getPostgresPool()` | Direct pg bypasses PostgREST cache (only relevant for critical-path inserts); monitoring writes can tolerate PostgREST; service client is simpler and sufficient |
| TEXT + CHECK constraint | PostgreSQL ENUM | ENUMs require `CREATE TYPE` and are harder to extend; TEXT + CHECK is the confirmed pattern in `processing_jobs`, `agent_executions`, `users` tables |
| Separate model files | Shared BaseModel class | A shared base would add indirection with minimal benefit for two small models; keep independent, consistent with existing models |

**Installation:** No new packages needed — all dependencies already installed.

---

## Architecture Patterns

### Recommended Project Structure

New files slot into existing structure:

```
backend/src/
├── models/
│   ├── migrations/
│   │   └── 012_create_monitoring_tables.sql   # NEW
│   ├── HealthCheckModel.ts                    # NEW
│   ├── AlertEventModel.ts                     # NEW
│   └── index.ts                               # UPDATE: add exports
```

**Migration numbering:** Current highest is `011_create_vector_database_tables.sql`. Next must be `012_`.

### Pattern 1: SQL Migration File

**What:** `CREATE TABLE IF NOT EXISTS` with CHECK constraints, followed by `CREATE INDEX IF NOT EXISTS` for every planned query pattern.

**When to use:** All schema changes — always forward-only.
```sql
-- Source: backend/src/models/migrations/005_create_processing_jobs_table.sql (verified)
-- Migration: Create monitoring tables
-- Created: 2026-02-24

CREATE TABLE IF NOT EXISTS service_health_checks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  service_name VARCHAR(100) NOT NULL,
  status TEXT NOT NULL CHECK (status IN ('healthy', 'degraded', 'down')),
  latency_ms INTEGER,
  checked_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  probe_details JSONB,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_service_health_checks_created_at ON service_health_checks(created_at);
CREATE INDEX IF NOT EXISTS idx_service_health_checks_service_name ON service_health_checks(service_name);
CREATE INDEX IF NOT EXISTS idx_service_health_checks_service_created ON service_health_checks(service_name, created_at);

CREATE TABLE IF NOT EXISTS alert_events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  service_name VARCHAR(100) NOT NULL,
  alert_type TEXT NOT NULL CHECK (alert_type IN ('service_down', 'service_degraded', 'recovery')),
  status TEXT NOT NULL CHECK (status IN ('active', 'resolved')),
  details JSONB,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  resolved_at TIMESTAMP WITH TIME ZONE
);

CREATE INDEX IF NOT EXISTS idx_alert_events_created_at ON alert_events(created_at);
CREATE INDEX IF NOT EXISTS idx_alert_events_status ON alert_events(status);
CREATE INDEX IF NOT EXISTS idx_alert_events_service_name ON alert_events(service_name);

-- RLS
ALTER TABLE service_health_checks ENABLE ROW LEVEL SECURITY;
ALTER TABLE alert_events ENABLE ROW LEVEL SECURITY;

-- Service role bypasses RLS automatically in Supabase;
-- anon/authenticated roles get no access by default when RLS is enabled with no policies.
-- Add explicit deny-all or admin-only policies if needed.
```

### Pattern 2: TypeScript Model Class (Static Methods)

**What:** Exported class with static async methods.
Each method calls `getSupabaseServiceClient()` inline (not cached at module level for the service client). Uses `logger` from `utils/logger`. Validates input before writing.

**When to use:** All model methods — matches `DocumentModel.ts` exactly.

```typescript
// Source: backend/src/models/DocumentModel.ts (verified pattern)
import { getSupabaseServiceClient } from '../config/supabase';
import { logger } from '../utils/logger';

export interface ServiceHealthCheck {
  id: string;
  service_name: string;
  status: 'healthy' | 'degraded' | 'down';
  latency_ms?: number;
  checked_at: string;
  probe_details?: Record<string, unknown>;
  created_at: string;
}

export interface CreateHealthCheckData {
  service_name: string;
  status: 'healthy' | 'degraded' | 'down';
  latency_ms?: number;
  probe_details?: Record<string, unknown>;
}

export class HealthCheckModel {
  static async create(data: CreateHealthCheckData): Promise<ServiceHealthCheck> {
    // Input validation
    if (!data.service_name) throw new Error('service_name is required');
    if (!['healthy', 'degraded', 'down'].includes(data.status)) {
      throw new Error(`Invalid status: ${data.status}`);
    }

    try {
      const supabase = getSupabaseServiceClient();
      const { data: record, error } = await supabase
        .from('service_health_checks')
        .insert({
          service_name: data.service_name,
          status: data.status,
          latency_ms: data.latency_ms,
          probe_details: data.probe_details,
        })
        .select()
        .single();

      if (error) {
        logger.error('Error creating health check', { error: error.message, data });
        throw new Error(`Failed to create health check: ${error.message}`);
      }
      if (!record) throw new Error('Failed to create health check: No data returned');

      logger.info('Health check recorded', { service: data.service_name, status: data.status });
      return record;
    } catch (error) {
      logger.error('Error in HealthCheckModel.create', {
        error: error instanceof Error ? error.message : String(error),
        data,
      });
      throw error;
    }
  }
}
```

### Pattern 3: Running the Migration

**What:** `npm run db:migrate` calls `ts-node src/scripts/setup-database.ts`, which invokes `DatabaseMigrator.migrate()`. The migrator reads all `.sql` files from `migrations/` sorted alphabetically, checks the `migrations` table for each, and executes new ones via `supabase.rpc('exec_sql', { sql })`.

**Important:** The migrator skips already-executed migrations by ID (filename without `.sql`). This is the idempotency mechanism — re-running `npm run db:migrate` is safe.

### Anti-Patterns to Avoid

- **Using `console.log` in model files:** Always use `logger` from `../utils/logger`. The project enforces this in `.cursorrules`.
- **Using `getPostgresPool()` for monitoring writes:** Only needed for critical-path operations that hit PostgREST cache issues (`ProcessingJobModel` is the one exception). Monitoring writes are fire-and-forget; PostgREST is fine.
- **Storing `getSupabaseServiceClient()` at module level:** The service client function creates a new client each call (no caching). Call it inside each method. (The anon client `getSupabaseClient()` does cache, but monitoring models use the service client.)
- **Using `any` type in TypeScript interfaces:** Strict TypeScript — use `Record<string, unknown>` for JSONB columns, or specific typed interfaces.
- **Skipping `CREATE TABLE IF NOT EXISTS` / `CREATE INDEX IF NOT EXISTS`:** All migration DDL in this codebase uses `IF NOT EXISTS`. Never omit it.
- **Writing a rollback/down script:** Forward-only migrations only. If the schema needs fixing, write `013_fix_...sql`.
- **Numbering the migration `11_` or `11`:** Must be zero-padded to three digits: `012_`.
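The skip-if-executed flow from Pattern 3 can be sketched as a pure function. This is a hypothetical illustration, not code from `migrate.ts`; the function name and parameters are invented here:

```typescript
// Hypothetical sketch of the migrator's selection logic, assuming the behavior
// described in Pattern 3: read .sql files, sort alphabetically, skip any whose
// ID (filename without .sql) already appears in the `migrations` table.
function selectPendingMigrations(
  files: string[],       // e.g. directory listing of src/models/migrations/
  applied: Set<string>   // migration IDs already recorded in the `migrations` table
): string[] {
  return files
    .filter((f) => f.endsWith('.sql'))
    .sort() // alphabetical order equals numeric order only because names are zero-padded
    .filter((f) => !applied.has(f.replace(/\.sql$/, '')));
}
```

Note that the `.sort()` step is why zero-padding matters: an unpadded `12_...` would sort before `005_...` and run out of order.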
---

## Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Migration tracking / idempotency | Custom migration table logic | Existing `DatabaseMigrator` in `migrate.ts` | Already handles the migrations table, skip-if-executed logic, error logging |
| Supabase client instantiation | New client setup | `getSupabaseServiceClient()` from `config/supabase.ts` | Handles auth, timeout, headers; INFR-04 requires no new DB connections |
| Input validation before write | Runtime type guards | Manual validation in model (project pattern) | `DocumentModel` and `ProcessingJobModel` both validate before writing; adds defense in depth |
| Logging | Direct `console.log` or custom logger | `logger` from `utils/logger` | Winston-backed, structured JSON, correlation ID support |

**Key insight:** The migration infrastructure is already production-ready. Adding two SQL files and two TypeScript model classes is additive work, not infrastructure work.

---

## Common Pitfalls

### Pitfall 1: Migration Numbering Gap or Conflict

**What goes wrong:** A new migration reuses an existing number (e.g., a second `011_` file), or inconsistent numbering makes files sort out of alphabetical order.

**Why it happens:** Not checking the current highest number before creating a new file.

**How to avoid:** Verify the current highest (`011_create_vector_database_tables.sql`) — the new file must be `012_create_monitoring_tables.sql`.

**Warning signs:** Migration runs but skips one of the new tables; alphabetical sort puts the new file before existing ones.

### Pitfall 2: RLS Blocks Service-Role Reads

**What goes wrong:** After enabling RLS, `getSupabaseServiceClient()` (which uses the service role key) cannot read or write rows.

**Why it happens:** Misunderstanding of how Supabase RLS interacts with the service role.

**Fact (HIGH confidence, Supabase docs):** The service role key **bypasses RLS by default**. Enabling RLS only restricts the anon key and authenticated-user JWTs. So `getSupabaseServiceClient()` will work fine with RLS enabled and no policies defined.

**How to avoid:** No special policies are needed for service-role access. If explicit policies are desired for documentation clarity, `CREATE POLICY "service_role_all" ON table_name TO service_role USING (true)` works, but it is not required.

**Warning signs:** Model methods return empty results or permission errors after the migration runs.

### Pitfall 3: JSONB Column Typing

**What goes wrong:** TypeScript `probe_details` typed as `any`, then strict lint rules fail.

**Why it happens:** JSONB has no enforced schema — the path of least resistance is `any`.

**How to avoid:** Type as `Record<string, unknown> | null`, or define a specific interface for common probe shapes. Accept that the TypeScript type is a superset of what the DB stores.

**Warning signs:** `eslint` errors on the `no-explicit-any` rule (project has strict TypeScript).

### Pitfall 4: `latency_ms` Integer Overflow

**What goes wrong:** PostgreSQL `INTEGER` maxes out at ~2.1 billion. For latency in milliseconds this is effectively impossible to overflow (2.1 billion ms ≈ 24 days). But for metrics that could store large values, `BIGINT` is safer.

**Why it happens:** Defaulting to `INTEGER` without considering the value range.

**How to avoid:** `INTEGER` is correct for `latency_ms` (milliseconds always fit). No overflow risk here.

**Warning signs:** N/A for latency; only relevant if storing epoch timestamps or byte counts in integer columns.

### Pitfall 5: Missing `checked_at` vs `created_at` Distinction

**What goes wrong:** Using only `created_at` for health checks loses the distinction between "when the probe ran" and "when the row was inserted". These are usually the same, but can differ if inserts are batched or retried.

**Why it happens:** Copying the `created_at = DEFAULT CURRENT_TIMESTAMP` pattern without thinking about the probe time.
**How to avoid:** Include an explicit `checked_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP` column on `service_health_checks`. Let `created_at` be the insert time. When recording a health check, set `checked_at` explicitly to the moment the probe was made. The `created_at` index still covers retention queries; `checked_at` is the semantically accurate probe time.

**Warning signs:** Dashboard shows "time checked" as several seconds after the actual API call.

---

## Code Examples

Verified patterns from the codebase:

### Migration: Full SQL File Pattern

```sql
-- Source: backend/src/models/migrations/005_create_processing_jobs_table.sql (verified)
-- Confirmed patterns: CREATE TABLE IF NOT EXISTS, UUID PK, TEXT CHECK constraint,
-- TIMESTAMP WITH TIME ZONE, CREATE INDEX IF NOT EXISTS on created_at

CREATE TABLE IF NOT EXISTS processing_jobs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  status VARCHAR(50) NOT NULL DEFAULT 'pending'
    CHECK (status IN ('pending', 'processing', 'completed', 'failed')),
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_processing_jobs_created_at ON processing_jobs(created_at);
```

### DatabaseMigrator: How It Executes SQL

```typescript
// Source: backend/src/models/migrate.ts (verified)
// Migration executes via:
const { error } = await supabase.rpc('exec_sql', { sql: migration.sql });
// Idempotency: checks `migrations` table by migration ID (filename without .sql)
// Run via: npm run db:migrate → ts-node src/scripts/setup-database.ts
```

### Supabase Service Client: Per-Method Call Pattern

```typescript
// Source: backend/src/config/supabase.ts (verified)
// getSupabaseServiceClient() creates a new client each call — no singleton
export const getSupabaseServiceClient = (): SupabaseClient => {
  // Creates new createClient(...) each invocation
};

// Correct usage in model methods (TableRow is a placeholder row type):
static async create(data: CreateData): Promise<TableRow> {
  const supabase = getSupabaseServiceClient(); // Called inside method, not at module level
  const { data: record, error } = await supabase.from('table').insert(data).select().single();
}
```

### Model: Error Handling Pattern

```typescript
// Source: backend/src/models/ProcessingJobModel.ts (verified)
// Error check pattern used throughout:
if (error) {
  if (error.code === 'PGRST116') {
    return null; // Not found — not an error
  }
  logger.error('Error doing X', { error, id });
  throw new Error(`Failed to do X: ${error.message}`);
}
```

### Model Index Export

```typescript
// Source: backend/src/models/index.ts (verified)
// New models must be added here:
export { HealthCheckModel } from './HealthCheckModel';
export { AlertEventModel } from './AlertEventModel';
```

---

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| In-memory `uploadMonitoringService` (UploadMonitoringService class with EventEmitter) | Persistent Supabase tables | Phase 1 introduces this | Data survives cold starts; enables 30-day retention; enables dashboard queries |
| `any` type in model interfaces | `Record<string, unknown>` or typed interface | Project baseline | Strict TypeScript requirement |

**Deprecated/outdated in this project:**

- `uploadMonitoringService.ts` in-memory storage: Still used by existing routes but being superseded by persistent tables. Phase 1 does NOT modify `uploadMonitoringService.ts` — that is Phase 2+ work. This phase only creates the tables and model classes.

---

## Open Questions

1. **RLS Policy Detail: Should we create explicit service-role policies or rely on implicit bypass?**
   - What we know: The Supabase service role key bypasses RLS by default. No policy is needed for service-role access to work.
   - What's unclear: The CONTEXT.md says "admin-only access, matching existing security patterns" — but no existing migration uses RLS, so there is no project pattern to match exactly.
   - Recommendation: Enable RLS (`ALTER TABLE ... ENABLE ROW LEVEL SECURITY`) without creating any policies initially. The service-role key bypass is sufficient for all model-layer reads/writes. Add explicit policies in Phase 3 when admin API routes are added and authenticated user access may be needed.

2. **`performance_metrics` table: Use or ignore?**
   - What we know: `010_add_performance_metrics_and_events.sql` created a `performance_metrics` table, but CONTEXT.md notes nothing writes to it. The new `service_health_checks` table is a different concept (external API health vs. internal processing metrics).
   - What's unclear: Whether Phase 1 should verify the `performance_metrics` schema to avoid future confusion.
   - Recommendation: No action needed in Phase 1. The CONTEXT.md note "verify its schema before building on it" is a Phase 2+ concern when writing to it. Phase 1 creates new tables only.

3. **`checked_at` column: Explicit or use `created_at`?**
   - What we know: `created_at` has the index required by INFR-01. Adding `checked_at` as a separate column is semantically better (Pitfall 5 above).
   - What's unclear: Whether the planner wants both columns or a single `created_at`.
   - Recommendation: Include both — `checked_at` (explicitly set when the probe runs) and `created_at` (DB default). Index only `created_at`, as required by INFR-01. This is Claude's discretion and adds minimal complexity.
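The `checked_at` recommendation in Question 3 can be illustrated with a small sketch. `buildHealthCheckRow` is a hypothetical helper, not code from the repo; it shows the caller capturing the probe time for `checked_at` while leaving `created_at` to the DB default:

```typescript
// Hypothetical sketch: build the insert payload for service_health_checks.
// checked_at is set from the probe time captured by the caller; created_at
// is intentionally omitted so the DB default (insert time) applies.
interface HealthCheckRow {
  service_name: string;
  status: 'healthy' | 'degraded' | 'down';
  latency_ms?: number;
  probe_details?: Record<string, unknown>;
  checked_at: string; // ISO timestamp of when the probe ran, not when the row was inserted
}

function buildHealthCheckRow(
  service_name: string,
  status: HealthCheckRow['status'],
  probedAt: Date, // captured just before the probe's HTTP call
  latency_ms?: number,
  probe_details?: Record<string, unknown>
): HealthCheckRow {
  return {
    service_name,
    status,
    latency_ms,
    probe_details,
    checked_at: probedAt.toISOString(),
  };
}
```

If a batch of checks is buffered and flushed later, `checked_at` preserves the true probe times while `created_at` records the flush.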
---

## Sources

### Primary (HIGH confidence)

- `backend/src/models/migrate.ts` — Verified: migration execution mechanism, idempotency via `migrations` table, `supabase.rpc('exec_sql')` call
- `backend/src/models/migrations/005_create_processing_jobs_table.sql` — Verified: `CREATE TABLE IF NOT EXISTS`, TEXT CHECK, UUID PK, `CREATE INDEX IF NOT EXISTS`, `TIMESTAMP WITH TIME ZONE`
- `backend/src/models/migrations/010_add_performance_metrics_and_events.sql` — Verified: JSONB column pattern, index naming convention
- `backend/src/config/supabase.ts` — Verified: `getSupabaseServiceClient()` creates a new client per call (no caching); `getPostgresPool()` exists but is for critical-path use only
- `backend/src/models/DocumentModel.ts` — Verified: static class pattern, `getSupabaseServiceClient()` inside methods, `logger.error()` with structured object, retry pattern
- `backend/src/models/ProcessingJobModel.ts` — Verified: `PGRST116` not-found handling, static methods, logger usage
- `backend/src/models/index.ts` — Verified: export pattern for new models
- `backend/package.json` — Verified: `npm run db:migrate` runs `ts-node src/scripts/setup-database.ts`; `npm test` runs `vitest run`
- `backend/vitest.config.ts` — Verified: Vitest framework, `src/__tests__/**/*.{test,spec}.{ts,js}` glob, 30s timeout
- `.planning/config.json` — Verified: `workflow.nyquist_validation` not present → Validation Architecture section omitted

### Secondary (MEDIUM confidence)

- Supabase RLS service-role bypass behavior: The service role key bypasses RLS; this is standard Supabase behavior documented at supabase.com/docs. Confidence: HIGH from training data, not directly verified via web fetch in this session.

### Tertiary (LOW confidence)

- None — all critical claims verified against the codebase directly.
---

## Metadata

**Confidence breakdown:**

- Standard stack: HIGH — all libraries already in codebase, verified in `package.json` and import statements
- Architecture: HIGH — migration file structure, model class pattern, and export mechanism all verified from actual source files
- Pitfalls: HIGH for migration numbering (files counted directly); HIGH for RLS service-role bypass (standard Supabase behavior); MEDIUM for the `checked_at` recommendation (judgement call, not a verified bug)

**Research date:** 2026-02-24
**Valid until:** 2026-03-25 (30 days — Supabase and TypeScript patterns are stable)