diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index 7d4d286..fa835fa 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -28,7 +28,11 @@ Decimal phases appear between their surrounding integers in numeric order.
 2. All new tables use the existing Supabase client from `config/supabase.ts` — no new database connections added
 3. `AlertModel.ts` exists and its CRUD methods can be called in isolation without errors
 4. Migration SQL can be run against the live Supabase instance and produces the expected schema
-**Plans**: TBD
+**Plans:** 2 plans
+
+Plans:
+- [ ] 01-01-PLAN.md — Migration SQL + HealthCheckModel + AlertEventModel
+- [ ] 01-02-PLAN.md — Unit tests for both monitoring models

 ### Phase 2: Backend Services
 **Goal**: All monitoring logic runs correctly — health probes make real API calls, alerts fire with deduplication, analytics events write non-blocking to Supabase, and data is cleaned up on schedule
@@ -72,7 +76,7 @@ Phases execute in numeric order: 1 → 2 → 3 → 4

 | Phase | Plans Complete | Status | Completed |
 |-------|----------------|--------|-----------|
-| 1. Data Foundation | 0/TBD | Not started | - |
+| 1. Data Foundation | 0/2 | Not started | - |
 | 2. Backend Services | 0/TBD | Not started | - |
 | 3. API Layer | 0/TBD | Not started | - |
 | 4.
Frontend | 0/TBD | Not started | - |
diff --git a/.planning/phases/01-data-foundation/01-01-PLAN.md b/.planning/phases/01-data-foundation/01-01-PLAN.md
new file mode 100644
index 0000000..dfef2bc
--- /dev/null
+++ b/.planning/phases/01-data-foundation/01-01-PLAN.md
@@ -0,0 +1,226 @@
+---
+phase: 01-data-foundation
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified:
+  - backend/src/models/migrations/012_create_monitoring_tables.sql
+  - backend/src/models/HealthCheckModel.ts
+  - backend/src/models/AlertEventModel.ts
+  - backend/src/models/index.ts
+autonomous: true
+requirements:
+  - INFR-01
+  - INFR-04
+
+must_haves:
+  truths:
+    - "Migration SQL creates service_health_checks and alert_events tables with all required columns and CHECK constraints"
+    - "Both tables have indexes on created_at (INFR-01 requirement)"
+    - "RLS is enabled on both new tables"
+    - "HealthCheckModel and AlertEventModel use getSupabaseServiceClient() for all database operations (INFR-04 — no new DB infrastructure)"
+    - "Model static methods validate input before writing"
+  artifacts:
+    - path: "backend/src/models/migrations/012_create_monitoring_tables.sql"
+      provides: "DDL for service_health_checks and alert_events tables"
+      contains: "CREATE TABLE IF NOT EXISTS service_health_checks"
+    - path: "backend/src/models/HealthCheckModel.ts"
+      provides: "CRUD operations for service_health_checks table"
+      exports: ["HealthCheckModel", "ServiceHealthCheck", "CreateHealthCheckData"]
+    - path: "backend/src/models/AlertEventModel.ts"
+      provides: "CRUD operations for alert_events table"
+      exports: ["AlertEventModel", "AlertEvent", "CreateAlertEventData"]
+    - path: "backend/src/models/index.ts"
+      provides: "Barrel exports for new models"
+      contains: "HealthCheckModel"
+  key_links:
+    - from: "backend/src/models/HealthCheckModel.ts"
+      to: "backend/src/config/supabase.ts"
+      via: "getSupabaseServiceClient() import"
+      pattern: "import.*getSupabaseServiceClient.*from.*config/supabase"
+    - from:
"backend/src/models/AlertEventModel.ts"
+      to: "backend/src/config/supabase.ts"
+      via: "getSupabaseServiceClient() import"
+      pattern: "import.*getSupabaseServiceClient.*from.*config/supabase"
+    - from: "backend/src/models/HealthCheckModel.ts"
+      to: "backend/src/utils/logger.ts"
+      via: "Winston logger import"
+      pattern: "import.*logger.*from.*utils/logger"
+---
+
+
+Create the database migration and TypeScript model layer for the monitoring system.
+
+Purpose: Establish the data foundation that all subsequent phases (health probes, alerts, analytics) depend on. Tables must exist and model CRUD must work before any service can write monitoring data.
+
+Output: One SQL migration file, two TypeScript model classes, updated barrel exports.
+
+
+
+@/home/jonathan/.claude/get-shit-done/workflows/execute-plan.md
+@/home/jonathan/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/01-data-foundation/01-RESEARCH.md
+@.planning/phases/01-data-foundation/01-CONTEXT.md
+
+# Existing patterns to follow
+@backend/src/models/DocumentModel.ts
+@backend/src/models/ProcessingJobModel.ts
+@backend/src/models/index.ts
+@backend/src/models/migrations/005_create_processing_jobs_table.sql
+@backend/src/config/supabase.ts
+@backend/src/utils/logger.ts
+
+
+
+
+
+  Task 1: Create monitoring tables migration
+  backend/src/models/migrations/012_create_monitoring_tables.sql
+
+Create migration file `012_create_monitoring_tables.sql` following the pattern from `005_create_processing_jobs_table.sql`.
+
+**service_health_checks table:**
+- `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`
+- `service_name VARCHAR(100) NOT NULL`
+- `status TEXT NOT NULL CHECK (status IN ('healthy', 'degraded', 'down'))`
+- `latency_ms INTEGER` (nullable — INTEGER is correct, max ~2.1B ms which is impossible for latency)
+- `checked_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP` (when the probe actually ran — distinct from created_at per Research Pitfall 5)
+- `error_message TEXT` (nullable — for storing probe failure details)
+- `probe_details JSONB` (nullable — flexible metadata per service: response codes, error specifics)
+- `created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP`
+
+**Indexes for service_health_checks:**
+- `idx_service_health_checks_created_at ON service_health_checks(created_at)` — required by INFR-01, used for 30-day retention queries
+- `idx_service_health_checks_service_created ON service_health_checks(service_name, created_at)` — composite for dashboard "latest check per service" queries
+
+**alert_events table:**
+- `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`
+- `service_name VARCHAR(100) NOT NULL`
+- `alert_type TEXT NOT NULL CHECK (alert_type IN ('service_down', 'service_degraded', 'recovery'))`
+- `status TEXT NOT NULL CHECK (status IN ('active', 'acknowledged', 'resolved'))`
+- `message TEXT` (nullable — human-readable alert description)
+- `details JSONB` (nullable — structured metadata about the alert)
+- `created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP`
+- `acknowledged_at TIMESTAMP WITH TIME ZONE` (nullable)
+- `resolved_at TIMESTAMP WITH TIME ZONE` (nullable)
+
+**Indexes for alert_events:**
+- `idx_alert_events_created_at ON alert_events(created_at)` — required by INFR-01
+- `idx_alert_events_status ON alert_events(status)` — for "active alerts" queries
+- `idx_alert_events_service_status ON alert_events(service_name, status)` — for "active alerts per service"
+
+**RLS:**
+- `ALTER TABLE service_health_checks
ENABLE ROW LEVEL SECURITY;`
+- `ALTER TABLE alert_events ENABLE ROW LEVEL SECURITY;`
+- No explicit policies needed — service role key bypasses RLS automatically in Supabase (Research Pitfall 2). Policies for authenticated users will be added in Phase 3.
+
+**Important patterns (per CONTEXT.md):**
+- ALL DDL uses `IF NOT EXISTS` — `CREATE TABLE IF NOT EXISTS`, `CREATE INDEX IF NOT EXISTS`
+- Forward-only migration — no rollback/down scripts
+- File must be numbered `012_` (current highest is `011_create_vector_database_tables.sql`)
+- Include header comment with migration purpose and date
+
+**Do NOT:**
+- Use PostgreSQL ENUM types — use TEXT + CHECK per user decision
+- Create rollback/down scripts — forward-only per user decision
+- Add any DML (INSERT/UPDATE/DELETE) — migration is DDL only
+
+
+  cd /home/jonathan/Coding/cim_summary && ls -la backend/src/models/migrations/012_create_monitoring_tables.sql && grep -c "CREATE TABLE IF NOT EXISTS" backend/src/models/migrations/012_create_monitoring_tables.sql | grep -q "2" && echo "PASS: 2 tables found" || echo "FAIL: expected 2 CREATE TABLE statements"
+  Verify SQL syntax is valid and matches existing migration patterns
+
+  Migration file exists with both tables, CHECK constraints on status fields, JSONB columns for flexible metadata, indexes on created_at for both tables, composite indexes for common query patterns, and RLS enabled on both tables.
+
+
+
+  Task 2: Create HealthCheckModel and AlertEventModel with barrel exports
+
+  backend/src/models/HealthCheckModel.ts
+  backend/src/models/AlertEventModel.ts
+  backend/src/models/index.ts
+
+
+**HealthCheckModel.ts** — Follow DocumentModel.ts static class pattern exactly:
+
+Interfaces:
+- `ServiceHealthCheck` — full row type matching all columns from migration (id, service_name, status, latency_ms, checked_at, error_message, probe_details, created_at). Use `'healthy' | 'degraded' | 'down'` union for status.
Use `Record<string, unknown>` for probe_details (not `any` — strict TypeScript per CONVENTIONS.md).
+- `CreateHealthCheckData` — input type for create method (service_name required, status required, latency_ms optional, error_message optional, probe_details optional).
+
+Static methods:
+- `create(data: CreateHealthCheckData): Promise<ServiceHealthCheck>` — Validate service_name is non-empty, validate status is one of the three allowed values. Call `getSupabaseServiceClient()` inside the method (not cached at module level — per Research finding). Use `.from('service_health_checks').insert({...}).select().single()`. Log with Winston logger on success and error. Throw on Supabase error with descriptive message.
+- `findLatestByService(serviceName: string): Promise<ServiceHealthCheck | null>` — Get most recent health check for a given service. Order by `checked_at` desc, limit 1. Return null if not found (handle PGRST116 like ProcessingJobModel).
+- `findAll(options?: { limit?: number; serviceName?: string }): Promise<ServiceHealthCheck[]>` — List health checks with optional filtering. Default limit 100. Order by created_at desc.
+- `deleteOlderThan(days: number): Promise<number>` — For 30-day retention cleanup (used by Phase 2 scheduler). Delete rows where `created_at < NOW() - interval`. Return count of deleted rows.
+
+**AlertEventModel.ts** — Same pattern:
+
+Interfaces:
+- `AlertEvent` — full row type (id, service_name, alert_type, status, message, details, created_at, acknowledged_at, resolved_at). Use union types for alert_type and status. Use `Record<string, unknown>` for details.
+- `CreateAlertEventData` — input type (service_name, alert_type, status default 'active', message optional, details optional).
+
+Static methods:
+- `create(data: CreateAlertEventData): Promise<AlertEvent>` — Validate service_name non-empty, validate alert_type and status values. Insert with default status 'active' if not provided. Same Supabase pattern as HealthCheckModel.
+- `findActive(serviceName?: string): Promise<AlertEvent[]>` — Get active (unresolved, unacknowledged) alerts. Filter `status = 'active'`.
Optional service_name filter. Order by created_at desc.
+- `acknowledge(id: string): Promise<AlertEvent>` — Set status to 'acknowledged' and acknowledged_at to current timestamp. Return updated row.
+- `resolve(id: string): Promise<AlertEvent>` — Set status to 'resolved' and resolved_at to current timestamp. Return updated row.
+- `findRecentByService(serviceName: string, alertType: string, withinMinutes: number): Promise<AlertEvent | null>` — For deduplication in Phase 2. Find most recent alert of given type for service within time window.
+- `deleteOlderThan(days: number): Promise<number>` — Same retention pattern as HealthCheckModel.
+
+**Common patterns for BOTH models:**
+- Import `getSupabaseServiceClient` from `'../config/supabase'`
+- Import `logger` from `'../utils/logger'`
+- Call `getSupabaseServiceClient()` per-method (not at module level)
+- Error handling: check `if (error)` after every Supabase call, log with `logger.error()`, throw with descriptive message
+- Handle PGRST116 (not found) by returning null instead of throwing (ProcessingJobModel pattern)
+- Type guard on catch: `error instanceof Error ?
error.message : String(error)`
+- All methods are `static async`
+
+**index.ts update:**
+- Add export lines for both new models: `export { HealthCheckModel } from './HealthCheckModel';` and `export { AlertEventModel } from './AlertEventModel';`
+- Also export the interfaces: `export type { ServiceHealthCheck, CreateHealthCheckData } from './HealthCheckModel';` and `export type { AlertEvent, CreateAlertEventData } from './AlertEventModel';`
+- Keep all existing exports intact
+
+**Do NOT:**
+- Use `any` type anywhere — use `Record<string, unknown>` for JSONB fields
+- Use `console.log` — use Winston logger only
+- Cache `getSupabaseServiceClient()` at module level
+- Create a shared base model class (per Research recommendation — keep models independent)
+
+
+  cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | tail -20
+  Verify both models export from index.ts and follow DocumentModel.ts patterns
+
+  HealthCheckModel.ts and AlertEventModel.ts exist with typed interfaces, static CRUD methods, input validation, getSupabaseServiceClient() per-method, Winston logging. Both models exported from index.ts. TypeScript compiles without errors.
+
+
+
+
+
+1. `ls backend/src/models/migrations/012_create_monitoring_tables.sql` — migration file exists
+2. `grep "CREATE TABLE IF NOT EXISTS service_health_checks" backend/src/models/migrations/012_create_monitoring_tables.sql` — table DDL present
+3. `grep "CREATE TABLE IF NOT EXISTS alert_events" backend/src/models/migrations/012_create_monitoring_tables.sql` — table DDL present
+4. `grep "idx_.*_created_at" backend/src/models/migrations/012_create_monitoring_tables.sql` — INFR-01 indexes present
+5. `grep "ENABLE ROW LEVEL SECURITY" backend/src/models/migrations/012_create_monitoring_tables.sql` — RLS enabled
+6. `grep "getSupabaseServiceClient" backend/src/models/HealthCheckModel.ts` — INFR-04 uses existing Supabase connection
+7.
`grep "getSupabaseServiceClient" backend/src/models/AlertEventModel.ts` — INFR-04 uses existing Supabase connection
+8. `cd backend && npx tsc --noEmit` — TypeScript compiles cleanly
+
+
+
+- Migration file 012 creates both tables with CHECK constraints, JSONB columns, all indexes, and RLS
+- Both model classes compile, export typed interfaces, use getSupabaseServiceClient() per-method
+- Both models are re-exported from index.ts
+- No new database connections or infrastructure introduced (INFR-04)
+- TypeScript strict compilation passes
+
+
+
+After completion, create `.planning/phases/01-data-foundation/01-01-SUMMARY.md`
+
diff --git a/.planning/phases/01-data-foundation/01-02-PLAN.md b/.planning/phases/01-data-foundation/01-02-PLAN.md
new file mode 100644
index 0000000..9674775
--- /dev/null
+++ b/.planning/phases/01-data-foundation/01-02-PLAN.md
@@ -0,0 +1,194 @@
+---
+phase: 01-data-foundation
+plan: 02
+type: execute
+wave: 2
+depends_on:
+  - 01-01
+files_modified:
+  - backend/src/__tests__/models/HealthCheckModel.test.ts
+  - backend/src/__tests__/models/AlertEventModel.test.ts
+autonomous: true
+requirements:
+  - INFR-01
+  - INFR-04
+
+must_haves:
+  truths:
+    - "HealthCheckModel CRUD methods work correctly with mocked Supabase client"
+    - "AlertEventModel CRUD methods work correctly with mocked Supabase client"
+    - "Input validation rejects invalid status values and empty service names"
+    - "Models use getSupabaseServiceClient (not getSupabaseClient or getPostgresPool)"
+  artifacts:
+    - path: "backend/src/__tests__/models/HealthCheckModel.test.ts"
+      provides: "Unit tests for HealthCheckModel"
+      contains: "HealthCheckModel"
+    - path: "backend/src/__tests__/models/AlertEventModel.test.ts"
+      provides: "Unit tests for AlertEventModel"
+      contains: "AlertEventModel"
+  key_links:
+    - from: "backend/src/__tests__/models/HealthCheckModel.test.ts"
+      to: "backend/src/models/HealthCheckModel.ts"
+      via: "import HealthCheckModel"
+      pattern: "import.*HealthCheckModel"
+    -
from: "backend/src/__tests__/models/AlertEventModel.test.ts"
+      to: "backend/src/models/AlertEventModel.ts"
+      via: "import AlertEventModel"
+      pattern: "import.*AlertEventModel"
+---
+
+
+Create unit tests for both monitoring model classes to verify CRUD operations, input validation, and correct Supabase client usage.
+
+Purpose: Ensure model layer works correctly before Phase 2 services depend on it. Verify INFR-04 compliance (models use existing Supabase connection) and that input validation catches bad data before it hits the database.
+
+Output: Two test files covering all model static methods with mocked Supabase client.
+
+
+
+@/home/jonathan/.claude/get-shit-done/workflows/execute-plan.md
+@/home/jonathan/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/01-data-foundation/01-RESEARCH.md
+@.planning/phases/01-data-foundation/01-01-SUMMARY.md
+
+# Test patterns
+@backend/src/__tests__/mocks/logger.mock.ts
+@backend/src/__tests__/utils/test-helpers.ts
+@.planning/codebase/TESTING.md
+
+# Models to test
+@backend/src/models/HealthCheckModel.ts
+@backend/src/models/AlertEventModel.ts
+
+
+
+
+
+  Task 1: Create HealthCheckModel unit tests
+  backend/src/__tests__/models/HealthCheckModel.test.ts
+
+Create unit tests for HealthCheckModel using Vitest. This is the first model test in the project, so establish the Supabase mocking pattern.
+
+**Supabase mock setup:**
+- Mock `../config/supabase` module using `vi.mock()`
+- Create a mock Supabase client with chainable methods: `.from()` returns object with `.insert()`, `.select()`, `.single()`, `.order()`, `.limit()`, `.eq()`, `.lt()`, `.delete()`
+- Each chainable method returns the mock object (fluent pattern) except terminal methods (`.single()`, `.select()` at end) which return `{ data, error }`
+- Mock `getSupabaseServiceClient` to return the mock client
+- Also mock `'../utils/logger'` using the existing `logger.mock.ts` pattern
+
+**Test suites:**
+
+`describe('HealthCheckModel')`:
+
+`describe('create')`:
+- `test('creates a health check with valid data')` — call with { service_name: 'document_ai', status: 'healthy', latency_ms: 150 }, verify Supabase insert called with correct data, verify returned record matches
+- `test('creates a health check with minimal data')` — call with only required fields (service_name, status), verify optional fields not included
+- `test('creates a health check with probe_details')` — include JSONB probe_details, verify passed through
+- `test('throws on empty service_name')` — expect Error thrown before Supabase is called
+- `test('throws on invalid status')` — pass status 'unknown', expect Error thrown before Supabase is called
+- `test('throws on Supabase error')` — mock Supabase returning { data: null, error: { message: 'connection failed' } }, verify error thrown with descriptive message
+- `test('logs error on Supabase failure')` — verify logger.error called with error details
+
+`describe('findLatestByService')`:
+- `test('returns latest health check for service')` — mock Supabase returning a record, verify correct table and filters used
+- `test('returns null when no records found')` — mock Supabase returning null/empty, verify null returned (not thrown)
+
+`describe('findAll')`:
+- `test('returns health checks with default limit')` — verify limit 100 applied
+- `test('filters by serviceName when
provided')` — verify .eq() called with service_name
+- `test('respects custom limit')` — pass limit: 50, verify .limit(50)
+
+`describe('deleteOlderThan')`:
+- `test('deletes records older than specified days')` — verify .lt() called with correct date calculation
+- `test('returns count of deleted records')` — mock returning count
+
+**Pattern notes:**
+- Use `describe`/`test` (not `it`) to match project convention
+- Use `beforeEach` to reset mocks between tests: `vi.clearAllMocks()`
+- Verify `getSupabaseServiceClient` is called per method invocation (INFR-04 pattern)
+- Import from vitest: `{ describe, test, expect, vi, beforeEach }`
+
+
+  cd /home/jonathan/Coding/cim_summary/backend && npx vitest run src/__tests__/models/HealthCheckModel.test.ts --reporter=verbose 2>&1 | tail -30
+
+  All HealthCheckModel tests pass. Tests cover create (valid, minimal, with probe_details), input validation (empty name, invalid status), Supabase error handling, findLatestByService (found, not found), findAll (default, filtered, custom limit), deleteOlderThan.
+
+
+
+  Task 2: Create AlertEventModel unit tests
+  backend/src/__tests__/models/AlertEventModel.test.ts
+
+Create unit tests for AlertEventModel following the same Supabase mocking pattern established in HealthCheckModel tests.
+
+**Reuse the same mock setup pattern** from Task 1 (mock getSupabaseServiceClient and logger).
+
+**Test suites:**
+
+`describe('AlertEventModel')`:
+
+`describe('create')`:
+- `test('creates an alert event with valid data')` — call with { service_name: 'claude_ai', alert_type: 'service_down', message: 'API returned 503' }, verify insert called, verify returned record
+- `test('defaults status to active')` — create without explicit status, verify 'active' sent to Supabase
+- `test('creates with explicit status')` — pass status: 'acknowledged', verify it is used
+- `test('creates with details JSONB')` — include details object, verify passed through
+- `test('throws on empty service_name')` — expect Error before Supabase call
+- `test('throws on invalid alert_type')` — pass alert_type: 'warning', expect Error
+- `test('throws on invalid status')` — pass status: 'pending', expect Error
+- `test('throws on Supabase error')` — mock error response, verify descriptive throw
+
+`describe('findActive')`:
+- `test('returns active alerts')` — mock returning array of active alerts, verify .eq('status', 'active')
+- `test('filters by serviceName when provided')` — verify additional .eq() for service_name
+- `test('returns empty array when no active alerts')` — mock returning empty array
+
+`describe('acknowledge')`:
+- `test('sets status to acknowledged with timestamp')` — verify .update() called with { status: 'acknowledged', acknowledged_at: expect.any(String) }
+- `test('throws when alert not found')` — mock Supabase returning null/error, verify error thrown
+
+`describe('resolve')`:
+- `test('sets status to resolved with timestamp')` — verify .update() with { status: 'resolved', resolved_at: expect.any(String) }
+- `test('throws when alert not found')` — verify error handling
+
+`describe('findRecentByService')`:
+- `test('finds recent alert within time window')` — mock returning a match, verify filters for service_name, alert_type, and created_at > threshold
+- `test('returns null when no recent alerts')` — mock returning empty, verify null
+
+`describe('deleteOlderThan')`:
+- `test('deletes records older than specified days')` — same pattern as HealthCheckModel
+- `test('returns count of deleted records')` — verify count
+
+**Pattern notes:**
+- Same mock setup as HealthCheckModel test
+- Same beforeEach/clearAllMocks pattern
+- Verify getSupabaseServiceClient called per method
+
+
+  cd /home/jonathan/Coding/cim_summary/backend && npx vitest run src/__tests__/models/AlertEventModel.test.ts --reporter=verbose 2>&1 | tail -30
+
+  All AlertEventModel tests pass. Tests cover create (valid, default status, explicit status, with details), input validation (empty name, invalid alert_type, invalid status), Supabase error handling, findActive (all, filtered, empty), acknowledge, resolve, findRecentByService (found, not found), deleteOlderThan.
+
+
+
+
+
+1. `cd backend && npx vitest run src/__tests__/models/ --reporter=verbose` — all model tests pass
+2. `cd backend && npx vitest run --reporter=verbose` — full test suite still passes (no regressions)
+3. Tests mock `getSupabaseServiceClient` (not `getSupabaseClient` or `getPostgresPool`) confirming INFR-04 compliance
+
+
+
+- All HealthCheckModel tests pass covering create, findLatestByService, findAll, deleteOlderThan, plus validation errors
+- All AlertEventModel tests pass covering create, findActive, acknowledge, resolve, findRecentByService, deleteOlderThan, plus validation errors
+- Existing test suite continues to pass (no regressions)
+- Supabase mocking pattern established for future model tests
+
+
+
+After completion, create `.planning/phases/01-data-foundation/01-02-SUMMARY.md`
+
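Reviewer note (not part of the diff): plan 01-02 asks for a chainable Supabase mock and for validation that throws before the client is touched. The block below is a minimal sketch of those two ideas, not the project's code. It hand-rolls the fake client so it runs without Vitest or `@supabase/supabase-js`, and the names `makeFakeClient`, `validateHealthCheckInput`, and `createHealthCheck` are hypothetical. Only the table name, the allowed status values, and the `from().insert().select().single()` chain come from the plan.

```typescript
// Hypothetical sketch: a hand-rolled stand-in, not the real Supabase client.
type QueryResult = {
  data: Record<string, unknown> | null;
  error: { message: string } | null;
};

// Only the part of the fluent surface that create() needs.
type FakeQuery = {
  insert(row: Record<string, unknown>): FakeQuery;
  select(): FakeQuery;
  single(): Promise<QueryResult>;
};

type FakeClient = { calls: string[]; from(table: string): FakeQuery };

// Build a chainable fake that records each call and resolves to a canned result.
function makeFakeClient(result: QueryResult): FakeClient {
  const calls: string[] = [];
  const query: FakeQuery = {
    insert(_row) { calls.push('insert'); return query; },
    select() { calls.push('select'); return query; },
    single() { calls.push('single'); return Promise.resolve(result); },
  };
  return {
    calls,
    from(table) { calls.push('from:' + table); return query; },
  };
}

// Status values taken from the CHECK constraint in the plan.
const ALLOWED_STATUSES = ['healthy', 'degraded', 'down'];

// Throws on bad input so that no client call is made for invalid data.
function validateHealthCheckInput(input: { service_name: string; status: string }): void {
  if (input.service_name.trim() === '') {
    throw new Error('service_name must be a non-empty string');
  }
  if (ALLOWED_STATUSES.indexOf(input.status) === -1) {
    throw new Error('Invalid status: ' + input.status);
  }
}

// Validate first, then run the insert chain and surface any client error.
async function createHealthCheck(
  client: FakeClient,
  input: { service_name: string; status: string; latency_ms?: number },
): Promise<Record<string, unknown>> {
  validateHealthCheckInput(input);
  const { data, error } = await client
    .from('service_health_checks')
    .insert({ ...input })
    .select()
    .single();
  if (error || data === null) {
    const reason = error ? error.message : 'no row returned';
    throw new Error('Failed to create health check: ' + reason);
  }
  return data;
}
```

In the real test files the same shape would presumably be built from `vi.fn()` spies and wired in with `vi.mock()`, as the plan describes. The `calls` array here only stands in for spy assertions.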