docs(01): create phase plan

admin
2026-02-24 11:24:17 -05:00
parent c480d4b990
commit 6429e98f58
3 changed files with 426 additions and 2 deletions

@@ -28,7 +28,11 @@ Decimal phases appear between their surrounding integers in numeric order.
2. All new tables use the existing Supabase client from `config/supabase.ts` — no new database connections added
3. `AlertModel.ts` exists and its CRUD methods can be called in isolation without errors
4. Migration SQL can be run against the live Supabase instance and produces the expected schema
**Plans**: TBD
**Plans:** 2 plans
Plans:
- [ ] 01-01-PLAN.md — Migration SQL + HealthCheckModel + AlertEventModel
- [ ] 01-02-PLAN.md — Unit tests for both monitoring models
### Phase 2: Backend Services
**Goal**: All monitoring logic runs correctly — health probes make real API calls, alerts fire with deduplication, analytics events write non-blocking to Supabase, and data is cleaned up on schedule
@@ -72,7 +76,7 @@ Phases execute in numeric order: 1 → 2 → 3 → 4
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Data Foundation | 0/TBD | Not started | - |
| 1. Data Foundation | 0/2 | Not started | - |
| 2. Backend Services | 0/TBD | Not started | - |
| 3. API Layer | 0/TBD | Not started | - |
| 4. Frontend | 0/TBD | Not started | - |

@@ -0,0 +1,226 @@
---
phase: 01-data-foundation
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- backend/src/models/migrations/012_create_monitoring_tables.sql
- backend/src/models/HealthCheckModel.ts
- backend/src/models/AlertEventModel.ts
- backend/src/models/index.ts
autonomous: true
requirements:
- INFR-01
- INFR-04
must_haves:
truths:
- "Migration SQL creates service_health_checks and alert_events tables with all required columns and CHECK constraints"
- "Both tables have indexes on created_at (INFR-01 requirement)"
- "RLS is enabled on both new tables"
- "HealthCheckModel and AlertEventModel use getSupabaseServiceClient() for all database operations (INFR-04 — no new DB infrastructure)"
- "Model static methods validate input before writing"
artifacts:
- path: "backend/src/models/migrations/012_create_monitoring_tables.sql"
provides: "DDL for service_health_checks and alert_events tables"
contains: "CREATE TABLE IF NOT EXISTS service_health_checks"
- path: "backend/src/models/HealthCheckModel.ts"
provides: "CRUD operations for service_health_checks table"
exports: ["HealthCheckModel", "ServiceHealthCheck", "CreateHealthCheckData"]
- path: "backend/src/models/AlertEventModel.ts"
provides: "CRUD operations for alert_events table"
exports: ["AlertEventModel", "AlertEvent", "CreateAlertEventData"]
- path: "backend/src/models/index.ts"
provides: "Barrel exports for new models"
contains: "HealthCheckModel"
key_links:
- from: "backend/src/models/HealthCheckModel.ts"
to: "backend/src/config/supabase.ts"
via: "getSupabaseServiceClient() import"
pattern: "import.*getSupabaseServiceClient.*from.*config/supabase"
- from: "backend/src/models/AlertEventModel.ts"
to: "backend/src/config/supabase.ts"
via: "getSupabaseServiceClient() import"
pattern: "import.*getSupabaseServiceClient.*from.*config/supabase"
- from: "backend/src/models/HealthCheckModel.ts"
to: "backend/src/utils/logger.ts"
via: "Winston logger import"
pattern: "import.*logger.*from.*utils/logger"
---
<objective>
Create the database migration and TypeScript model layer for the monitoring system.
Purpose: Establish the data foundation that all subsequent phases (health probes, alerts, analytics) depend on. Tables must exist and model CRUD must work before any service can write monitoring data.
Output: One SQL migration file, two TypeScript model classes, updated barrel exports.
</objective>
<execution_context>
@/home/jonathan/.claude/get-shit-done/workflows/execute-plan.md
@/home/jonathan/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/01-data-foundation/01-RESEARCH.md
@.planning/phases/01-data-foundation/01-CONTEXT.md
# Existing patterns to follow
@backend/src/models/DocumentModel.ts
@backend/src/models/ProcessingJobModel.ts
@backend/src/models/index.ts
@backend/src/models/migrations/005_create_processing_jobs_table.sql
@backend/src/config/supabase.ts
@backend/src/utils/logger.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: Create monitoring tables migration</name>
<files>backend/src/models/migrations/012_create_monitoring_tables.sql</files>
<action>
Create migration file `012_create_monitoring_tables.sql` following the pattern from `005_create_processing_jobs_table.sql`.
**service_health_checks table:**
- `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`
- `service_name VARCHAR(100) NOT NULL`
- `status TEXT NOT NULL CHECK (status IN ('healthy', 'degraded', 'down'))`
- `latency_ms INTEGER` (nullable; INTEGER is sufficient because its maximum of ~2.1 billion ms, roughly 24.8 days, is far beyond any realistic latency)
- `checked_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP` (when the probe actually ran — distinct from created_at per Research Pitfall 5)
- `error_message TEXT` (nullable — for storing probe failure details)
- `probe_details JSONB` (nullable — flexible metadata per service: response codes, error specifics)
- `created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP`
**Indexes for service_health_checks:**
- `idx_service_health_checks_created_at ON service_health_checks(created_at)` — required by INFR-01, used for 30-day retention queries
- `idx_service_health_checks_service_created ON service_health_checks(service_name, created_at)` — composite for dashboard "latest check per service" queries
**alert_events table:**
- `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`
- `service_name VARCHAR(100) NOT NULL`
- `alert_type TEXT NOT NULL CHECK (alert_type IN ('service_down', 'service_degraded', 'recovery'))`
- `status TEXT NOT NULL CHECK (status IN ('active', 'acknowledged', 'resolved'))`
- `message TEXT` (nullable — human-readable alert description)
- `details JSONB` (nullable — structured metadata about the alert)
- `created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP`
- `acknowledged_at TIMESTAMP WITH TIME ZONE` (nullable)
- `resolved_at TIMESTAMP WITH TIME ZONE` (nullable)
**Indexes for alert_events:**
- `idx_alert_events_created_at ON alert_events(created_at)` — required by INFR-01
- `idx_alert_events_status ON alert_events(status)` — for "active alerts" queries
- `idx_alert_events_service_status ON alert_events(service_name, status)` — for "active alerts per service"
**RLS:**
- `ALTER TABLE service_health_checks ENABLE ROW LEVEL SECURITY;`
- `ALTER TABLE alert_events ENABLE ROW LEVEL SECURITY;`
- No explicit policies needed — service role key bypasses RLS automatically in Supabase (Research Pitfall 2). Policies for authenticated users will be added in Phase 3.
**Important patterns (per CONTEXT.md):**
- ALL DDL uses `IF NOT EXISTS`: `CREATE TABLE IF NOT EXISTS`, `CREATE INDEX IF NOT EXISTS`
- Forward-only migration — no rollback/down scripts
- File must be numbered `012_` (current highest is `011_create_vector_database_tables.sql`)
- Include header comment with migration purpose and date
**Do NOT:**
- Use PostgreSQL ENUM types — use TEXT + CHECK per user decision
- Create rollback/down scripts — forward-only per user decision
- Add any DML (INSERT/UPDATE/DELETE) — migration is DDL only
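As a rough illustration of the DDL described above, the sketch below holds one possible version of the migration text in a TypeScript string and runs a few checks that mirror the verify step. The plan itself calls for a plain `.sql` file; the `migrationSql` constant and the `countMatches` helper are hypothetical names used only for this example, and the column definitions are copied from the lists above.

```typescript
// Hypothetical sketch: the real artifact is 012_create_monitoring_tables.sql.
// The SQL is kept in a string here only so the idempotency guards can be checked.
const migrationSql = `
-- Migration 012: monitoring tables (service_health_checks, alert_events)
-- Purpose: data foundation for health probes and alerts (add the migration date here)
CREATE TABLE IF NOT EXISTS service_health_checks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  service_name VARCHAR(100) NOT NULL,
  status TEXT NOT NULL CHECK (status IN ('healthy', 'degraded', 'down')),
  latency_ms INTEGER,
  checked_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  error_message TEXT,
  probe_details JSONB,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_service_health_checks_created_at
  ON service_health_checks(created_at);
CREATE INDEX IF NOT EXISTS idx_service_health_checks_service_created
  ON service_health_checks(service_name, created_at);

CREATE TABLE IF NOT EXISTS alert_events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  service_name VARCHAR(100) NOT NULL,
  alert_type TEXT NOT NULL CHECK (alert_type IN ('service_down', 'service_degraded', 'recovery')),
  status TEXT NOT NULL CHECK (status IN ('active', 'acknowledged', 'resolved')),
  message TEXT,
  details JSONB,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  acknowledged_at TIMESTAMP WITH TIME ZONE,
  resolved_at TIMESTAMP WITH TIME ZONE
);
CREATE INDEX IF NOT EXISTS idx_alert_events_created_at
  ON alert_events(created_at);
CREATE INDEX IF NOT EXISTS idx_alert_events_status
  ON alert_events(status);
CREATE INDEX IF NOT EXISTS idx_alert_events_service_status
  ON alert_events(service_name, status);

ALTER TABLE service_health_checks ENABLE ROW LEVEL SECURITY;
ALTER TABLE alert_events ENABLE ROW LEVEL SECURITY;
`;

// Counts non-overlapping matches of a global regex in the given text.
function countMatches(text: string, pattern: RegExp): number {
  return (text.match(pattern) || []).length;
}

// Two guarded tables, five guarded indexes, and no unguarded CREATE statements.
const tableCount = countMatches(migrationSql, /CREATE TABLE IF NOT EXISTS/g);
const indexCount = countMatches(migrationSql, /CREATE INDEX IF NOT EXISTS/g);
const unguardedCount = countMatches(migrationSql, /CREATE (TABLE|INDEX) (?!IF NOT EXISTS)/g);
const rlsCount = countMatches(migrationSql, /ENABLE ROW LEVEL SECURITY/g);
```

Checking for an exact count of guarded statements, rather than just their presence, is what makes re-running the migration safe to reason about.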
</action>
<verify>
<automated>cd /home/jonathan/Coding/cim_summary && ls -la backend/src/models/migrations/012_create_monitoring_tables.sql && grep -c "CREATE TABLE IF NOT EXISTS" backend/src/models/migrations/012_create_monitoring_tables.sql | grep -qx "2" && echo "PASS: 2 tables found" || echo "FAIL: expected 2 CREATE TABLE statements"</automated>
<manual>Verify SQL syntax is valid and matches existing migration patterns</manual>
</verify>
<done>Migration file exists with both tables, CHECK constraints on status fields, JSONB columns for flexible metadata, indexes on created_at for both tables, composite indexes for common query patterns, and RLS enabled on both tables.</done>
</task>
<task type="auto">
<name>Task 2: Create HealthCheckModel and AlertEventModel with barrel exports</name>
<files>
backend/src/models/HealthCheckModel.ts
backend/src/models/AlertEventModel.ts
backend/src/models/index.ts
</files>
<action>
**HealthCheckModel.ts** — Follow DocumentModel.ts static class pattern exactly:
Interfaces:
- `ServiceHealthCheck` — full row type matching all columns from migration (id, service_name, status, latency_ms, checked_at, error_message, probe_details, created_at). Use `'healthy' | 'degraded' | 'down'` union for status. Use `Record<string, unknown>` for probe_details (not `any` — strict TypeScript per CONVENTIONS.md).
- `CreateHealthCheckData` — input type for create method (service_name required, status required, latency_ms optional, error_message optional, probe_details optional).
Static methods:
- `create(data: CreateHealthCheckData): Promise<ServiceHealthCheck>` — Validate service_name is non-empty, validate status is one of the three allowed values. Call `getSupabaseServiceClient()` inside the method (not cached at module level — per Research finding). Use `.from('service_health_checks').insert({...}).select().single()`. Log with Winston logger on success and error. Throw on Supabase error with descriptive message.
- `findLatestByService(serviceName: string): Promise<ServiceHealthCheck | null>` — Get most recent health check for a given service. Order by `checked_at` desc, limit 1. Return null if not found (handle PGRST116 like ProcessingJobModel).
- `findAll(options?: { limit?: number; serviceName?: string }): Promise<ServiceHealthCheck[]>` — List health checks with optional filtering. Default limit 100. Order by created_at desc.
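- `deleteOlderThan(days: number): Promise<number>`: For 30-day retention cleanup (used by Phase 2 scheduler). Compute the cutoff timestamp in application code (current time minus `days`), since the Supabase query builder filters take literal values rather than SQL expressions such as `NOW() - interval`, then delete rows where `created_at` is earlier than that cutoff via `.lt('created_at', cutoff)`. Return count of deleted rows.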
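The input validation that `create` is expected to perform before touching Supabase could look roughly like the sketch below. It is a minimal illustration under stated assumptions: `HealthStatus`, `validateCreateHealthCheck`, and `toInsertPayload` are hypothetical names that do not appear in the plan, and the Supabase call itself is left out so the snippet stays self-contained.

```typescript
type HealthStatus = 'healthy' | 'degraded' | 'down';

// Input type for create(): two required fields, three optional ones.
interface CreateHealthCheckData {
  service_name: string;
  status: HealthStatus;
  latency_ms?: number;
  error_message?: string;
  probe_details?: Record<string, unknown>;
}

// Mirrors the CHECK constraint in the migration.
const HEALTH_STATUSES: readonly string[] = ['healthy', 'degraded', 'down'];

// Hypothetical helper: throws before any database call if the input is invalid.
function validateCreateHealthCheck(data: CreateHealthCheckData): void {
  if (!data.service_name || data.service_name.trim() === '') {
    throw new Error('HealthCheckModel.create: service_name must be non-empty');
  }
  if (HEALTH_STATUSES.indexOf(data.status) === -1) {
    throw new Error('HealthCheckModel.create: invalid status "' + String(data.status) + '"');
  }
}

// Hypothetical helper: builds the insert payload, leaving out optional fields
// that were not provided so the column defaults (or NULL) apply.
function toInsertPayload(data: CreateHealthCheckData): Record<string, unknown> {
  validateCreateHealthCheck(data);
  const payload: Record<string, unknown> = {
    service_name: data.service_name,
    status: data.status,
  };
  if (data.latency_ms !== undefined) payload.latency_ms = data.latency_ms;
  if (data.error_message !== undefined) payload.error_message = data.error_message;
  if (data.probe_details !== undefined) payload.probe_details = data.probe_details;
  return payload;
}
```

The runtime check on `status` matters even with the union type, because callers can pass unvalidated strings through a cast or from parsed JSON.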
**AlertEventModel.ts** — Same pattern:
Interfaces:
- `AlertEvent` — full row type (id, service_name, alert_type, status, message, details, created_at, acknowledged_at, resolved_at). Use union types for alert_type and status. Use `Record<string, unknown>` for details.
- `CreateAlertEventData` — input type (service_name, alert_type, status default 'active', message optional, details optional).
Static methods:
- `create(data: CreateAlertEventData): Promise<AlertEvent>` — Validate service_name non-empty, validate alert_type and status values. Insert with default status 'active' if not provided. Same Supabase pattern as HealthCheckModel.
- `findActive(serviceName?: string): Promise<AlertEvent[]>` — Get active (unresolved, unacknowledged) alerts. Filter `status = 'active'`. Optional service_name filter. Order by created_at desc.
- `acknowledge(id: string): Promise<AlertEvent>` — Set status to 'acknowledged' and acknowledged_at to current timestamp. Return updated row.
- `resolve(id: string): Promise<AlertEvent>` — Set status to 'resolved' and resolved_at to current timestamp. Return updated row.
- `findRecentByService(serviceName: string, alertType: string, withinMinutes: number): Promise<AlertEvent | null>` — For deduplication in Phase 2. Find most recent alert of given type for service within time window.
- `deleteOlderThan(days: number): Promise<number>` — Same retention pattern as HealthCheckModel.
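Several of the methods above depend on timestamps computed in application code: the retention cutoff, the deduplication window, and the `acknowledged_at` / `resolved_at` values. The sketch below shows one way to derive them. All function names are hypothetical, and the optional `now` parameter is an assumption added so the helpers can be tested with a fixed clock.

```typescript
const MS_PER_MINUTE = 60 * 1000;
const MS_PER_DAY = 24 * 60 * MS_PER_MINUTE;

// Cutoff for deleteOlderThan(days): rows with created_at before this are deleted.
function retentionCutoff(days: number, now: Date = new Date()): string {
  if (!isFinite(days) || days <= 0) {
    throw new Error('retentionCutoff: days must be a positive number');
  }
  return new Date(now.getTime() - days * MS_PER_DAY).toISOString();
}

// Lower bound for findRecentByService(..., withinMinutes): alerts created
// after this instant count as "recent" for deduplication.
function dedupThreshold(withinMinutes: number, now: Date = new Date()): string {
  if (!isFinite(withinMinutes) || withinMinutes <= 0) {
    throw new Error('dedupThreshold: withinMinutes must be a positive number');
  }
  return new Date(now.getTime() - withinMinutes * MS_PER_MINUTE).toISOString();
}

// Update payloads for acknowledge(id) and resolve(id).
function acknowledgePayload(now: Date = new Date()): { status: 'acknowledged'; acknowledged_at: string } {
  return { status: 'acknowledged', acknowledged_at: now.toISOString() };
}

function resolvePayload(now: Date = new Date()): { status: 'resolved'; resolved_at: string } {
  return { status: 'resolved', resolved_at: now.toISOString() };
}
```

Working in UTC milliseconds and emitting ISO 8601 strings avoids local time zone and daylight saving effects, which fits the `TIMESTAMP WITH TIME ZONE` columns in the migration.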
**Common patterns for BOTH models:**
- Import `getSupabaseServiceClient` from `'../config/supabase'`
- Import `logger` from `'../utils/logger'`
- Call `getSupabaseServiceClient()` per-method (not at module level)
- Error handling: check `if (error)` after every Supabase call, log with `logger.error()`, throw with descriptive message
- Handle PGRST116 (not found) in single-row lookup methods (`findLatestByService`, `findRecentByService`) by returning null instead of throwing (ProcessingJobModel pattern)
- Type guard on catch: `error instanceof Error ? error.message : String(error)`
- All methods are `static async`
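The error-handling rules in this list can be sketched as two small helpers. This is an illustration only: `QueryResult`, `unwrapSingle`, and `toErrorMessage` are hypothetical names, and the result shape is a simplified stand-in for what the Supabase client returns, not its actual type.

```typescript
// Simplified stand-in for the { data, error } shape returned by a query.
interface QueryError {
  code?: string;
  message: string;
}

interface QueryResult<T> {
  data: T | null;
  error: QueryError | null;
}

// Error code the plan treats as "not found" for single-row lookups.
const NOT_FOUND_CODE = 'PGRST116';

// Hypothetical helper for single-row lookups: null for not found,
// a descriptive throw for any other error, otherwise the row.
function unwrapSingle<T>(result: QueryResult<T>, context: string): T | null {
  if (result.error) {
    if (result.error.code === NOT_FOUND_CODE) {
      return null;
    }
    throw new Error(context + ': ' + result.error.message);
  }
  return result.data;
}

// Type guard used in catch blocks, as described in the list above.
function toErrorMessage(error: unknown): string {
  return error instanceof Error ? error.message : String(error);
}
```

In the actual models, a `logger.error()` call would precede the throw; it is omitted here to keep the snippet free of project imports.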
**index.ts update:**
- Add export lines for both new models: `export { HealthCheckModel } from './HealthCheckModel';` and `export { AlertEventModel } from './AlertEventModel';`
- Also export the interfaces: `export type { ServiceHealthCheck, CreateHealthCheckData } from './HealthCheckModel';` and `export type { AlertEvent, CreateAlertEventData } from './AlertEventModel';`
- Keep all existing exports intact
**Do NOT:**
- Use `any` type anywhere — use `Record<string, unknown>` for JSONB fields
- Use `console.log` — use Winston logger only
- Cache `getSupabaseServiceClient()` at module level
- Create a shared base model class (per Research recommendation — keep models independent)
</action>
<verify>
<automated>cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | tail -20</automated>
<manual>Verify both models export from index.ts and follow DocumentModel.ts patterns</manual>
</verify>
<done>HealthCheckModel.ts and AlertEventModel.ts exist with typed interfaces, static CRUD methods, input validation, getSupabaseServiceClient() per-method, Winston logging. Both models exported from index.ts. TypeScript compiles without errors.</done>
</task>
</tasks>
<verification>
1. `ls backend/src/models/migrations/012_create_monitoring_tables.sql` — migration file exists
2. `grep "CREATE TABLE IF NOT EXISTS service_health_checks" backend/src/models/migrations/012_create_monitoring_tables.sql` — table DDL present
3. `grep "CREATE TABLE IF NOT EXISTS alert_events" backend/src/models/migrations/012_create_monitoring_tables.sql` — table DDL present
4. `grep "idx_.*_created_at" backend/src/models/migrations/012_create_monitoring_tables.sql` — INFR-01 indexes present
5. `grep "ENABLE ROW LEVEL SECURITY" backend/src/models/migrations/012_create_monitoring_tables.sql` — RLS enabled
6. `grep "getSupabaseServiceClient" backend/src/models/HealthCheckModel.ts` — INFR-04 uses existing Supabase connection
7. `grep "getSupabaseServiceClient" backend/src/models/AlertEventModel.ts` — INFR-04 uses existing Supabase connection
8. `cd backend && npx tsc --noEmit` — TypeScript compiles cleanly
</verification>
<success_criteria>
- Migration file 012 creates both tables with CHECK constraints, JSONB columns, all indexes, and RLS
- Both model classes compile, export typed interfaces, use getSupabaseServiceClient() per-method
- Both models are re-exported from index.ts
- No new database connections or infrastructure introduced (INFR-04)
- TypeScript strict compilation passes
</success_criteria>
<output>
After completion, create `.planning/phases/01-data-foundation/01-01-SUMMARY.md`
</output>

@@ -0,0 +1,194 @@
---
phase: 01-data-foundation
plan: 02
type: execute
wave: 2
depends_on:
- 01-01
files_modified:
- backend/src/__tests__/models/HealthCheckModel.test.ts
- backend/src/__tests__/models/AlertEventModel.test.ts
autonomous: true
requirements:
- INFR-01
- INFR-04
must_haves:
truths:
- "HealthCheckModel CRUD methods work correctly with mocked Supabase client"
- "AlertEventModel CRUD methods work correctly with mocked Supabase client"
- "Input validation rejects invalid status values and empty service names"
- "Models use getSupabaseServiceClient (not getSupabaseClient or getPostgresPool)"
artifacts:
- path: "backend/src/__tests__/models/HealthCheckModel.test.ts"
provides: "Unit tests for HealthCheckModel"
contains: "HealthCheckModel"
- path: "backend/src/__tests__/models/AlertEventModel.test.ts"
provides: "Unit tests for AlertEventModel"
contains: "AlertEventModel"
key_links:
- from: "backend/src/__tests__/models/HealthCheckModel.test.ts"
to: "backend/src/models/HealthCheckModel.ts"
via: "import HealthCheckModel"
pattern: "import.*HealthCheckModel"
- from: "backend/src/__tests__/models/AlertEventModel.test.ts"
to: "backend/src/models/AlertEventModel.ts"
via: "import AlertEventModel"
pattern: "import.*AlertEventModel"
---
<objective>
Create unit tests for both monitoring model classes to verify CRUD operations, input validation, and correct Supabase client usage.
Purpose: Ensure model layer works correctly before Phase 2 services depend on it. Verify INFR-04 compliance (models use existing Supabase connection) and that input validation catches bad data before it hits the database.
Output: Two test files covering all model static methods with mocked Supabase client.
</objective>
<execution_context>
@/home/jonathan/.claude/get-shit-done/workflows/execute-plan.md
@/home/jonathan/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/01-data-foundation/01-RESEARCH.md
@.planning/phases/01-data-foundation/01-01-SUMMARY.md
# Test patterns
@backend/src/__tests__/mocks/logger.mock.ts
@backend/src/__tests__/utils/test-helpers.ts
@.planning/codebase/TESTING.md
# Models to test
@backend/src/models/HealthCheckModel.ts
@backend/src/models/AlertEventModel.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: Create HealthCheckModel unit tests</name>
<files>backend/src/__tests__/models/HealthCheckModel.test.ts</files>
<action>
Create unit tests for HealthCheckModel using Vitest. This is the first model test in the project, so establish the Supabase mocking pattern.
**Supabase mock setup:**
- Mock the `../../config/supabase` module using `vi.mock()` (the path is relative to the test file in `src/__tests__/models/`)
- Create a mock Supabase client with chainable methods: `.from()` returns object with `.insert()`, `.select()`, `.single()`, `.order()`, `.limit()`, `.eq()`, `.lt()`, `.delete()`
- Each chainable method returns the mock object (fluent pattern) except terminal methods (`.single()`, `.select()` at end) which return `{ data, error }`
- Mock `getSupabaseServiceClient` to return the mock client
- Also mock `'../../utils/logger'` using the existing `logger.mock.ts` pattern
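The chainable mock described above might be structured along the lines of the sketch below. It is deliberately framework-free so it runs on its own: calls are recorded in a plain array, whereas the real tests would presumably wrap each method in `vi.fn()` to use Vitest's call assertions. `createSupabaseMock`, `QueryBuilderMock`, and `RecordedCall` are hypothetical names. Making the builder awaitable through `then` is an assumption of this sketch, added so that list queries that never call `.single()` still resolve to `{ data, error }`.

```typescript
interface QueryResult<T> {
  data: T | null;
  error: { code?: string; message: string } | null;
}

interface RecordedCall {
  method: string;
  args: unknown[];
}

type ChainMethod = 'insert' | 'select' | 'update' | 'delete' | 'eq' | 'lt' | 'gt' | 'order' | 'limit';
const CHAIN_METHODS: ChainMethod[] = ['insert', 'select', 'update', 'delete', 'eq', 'lt', 'gt', 'order', 'limit'];

type ChainFn<T> = (...args: unknown[]) => QueryBuilderMock<T>;

interface QueryBuilderMock<T> {
  insert: ChainFn<T>;
  select: ChainFn<T>;
  update: ChainFn<T>;
  delete: ChainFn<T>;
  eq: ChainFn<T>;
  lt: ChainFn<T>;
  gt: ChainFn<T>;
  order: ChainFn<T>;
  limit: ChainFn<T>;
  // Terminal method: resolves to the configured result.
  single: () => Promise<QueryResult<T>>;
  // Makes the builder awaitable for queries that end without .single().
  then: (onFulfilled: (value: QueryResult<T>) => unknown) => Promise<unknown>;
}

// Hypothetical factory: every chain method records its call and returns the
// same builder, so any call order used by the models works.
function createSupabaseMock<T>(result: QueryResult<T>) {
  const calls: RecordedCall[] = [];
  const builder = {} as QueryBuilderMock<T>;

  CHAIN_METHODS.forEach((method) => {
    builder[method] = (...args: unknown[]) => {
      calls.push({ method, args });
      return builder;
    };
  });

  builder.single = () => {
    calls.push({ method: 'single', args: [] });
    return Promise.resolve(result);
  };
  builder.then = (onFulfilled) => Promise.resolve(result).then(onFulfilled);

  const client = {
    from: (table: string) => {
      calls.push({ method: 'from', args: [table] });
      return builder;
    },
  };

  return { client, calls };
}
```

A test would then have the mocked `getSupabaseServiceClient` return `client`, call the model method, and assert on `calls` (or on the `vi.fn()` spies in the Vitest version).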
**Test suites:**
`describe('HealthCheckModel')`:
`describe('create')`:
- `test('creates a health check with valid data')` — call with { service_name: 'document_ai', status: 'healthy', latency_ms: 150 }, verify Supabase insert called with correct data, verify returned record matches
- `test('creates a health check with minimal data')` — call with only required fields (service_name, status), verify optional fields not included
- `test('creates a health check with probe_details')` — include JSONB probe_details, verify passed through
- `test('throws on empty service_name')` — expect Error thrown before Supabase is called
- `test('throws on invalid status')` — pass status 'unknown', expect Error thrown before Supabase is called
- `test('throws on Supabase error')` — mock Supabase returning { data: null, error: { message: 'connection failed' } }, verify error thrown with descriptive message
- `test('logs error on Supabase failure')` — verify logger.error called with error details
`describe('findLatestByService')`:
- `test('returns latest health check for service')` — mock Supabase returning a record, verify correct table and filters used
- `test('returns null when no records found')`: mock Supabase returning `{ data: null, error: { code: 'PGRST116' } }` (the not-found case the model handles), verify null returned (not thrown)
`describe('findAll')`:
- `test('returns health checks with default limit')` — verify limit 100 applied
- `test('filters by serviceName when provided')` — verify .eq() called with service_name
- `test('respects custom limit')` — pass limit: 50, verify .limit(50)
`describe('deleteOlderThan')`:
- `test('deletes records older than specified days')` — verify .lt() called with correct date calculation
- `test('returns count of deleted records')` — mock returning count
**Pattern notes:**
- Use `describe`/`test` (not `it`) to match project convention
- Use `beforeEach` to reset mocks between tests: `vi.clearAllMocks()`
- Verify `getSupabaseServiceClient` is called per method invocation (INFR-04 pattern)
- Import from vitest: `{ describe, test, expect, vi, beforeEach }`
</action>
<verify>
<automated>cd /home/jonathan/Coding/cim_summary/backend && npx vitest run src/__tests__/models/HealthCheckModel.test.ts --reporter=verbose 2>&1 | tail -30</automated>
</verify>
<done>All HealthCheckModel tests pass. Tests cover create (valid, minimal, with probe_details), input validation (empty name, invalid status), Supabase error handling, findLatestByService (found, not found), findAll (default, filtered, custom limit), deleteOlderThan.</done>
</task>
<task type="auto">
<name>Task 2: Create AlertEventModel unit tests</name>
<files>backend/src/__tests__/models/AlertEventModel.test.ts</files>
<action>
Create unit tests for AlertEventModel following the same Supabase mocking pattern established in HealthCheckModel tests.
**Reuse the same mock setup pattern** from Task 1 (mock getSupabaseServiceClient and logger).
**Test suites:**
`describe('AlertEventModel')`:
`describe('create')`:
- `test('creates an alert event with valid data')` — call with { service_name: 'claude_ai', alert_type: 'service_down', message: 'API returned 503' }, verify insert called, verify returned record
- `test('defaults status to active')` — create without explicit status, verify 'active' sent to Supabase
- `test('creates with explicit status')` — pass status: 'acknowledged', verify it is used
- `test('creates with details JSONB')` — include details object, verify passed through
- `test('throws on empty service_name')` — expect Error before Supabase call
- `test('throws on invalid alert_type')` — pass alert_type: 'warning', expect Error
- `test('throws on invalid status')` — pass status: 'pending', expect Error
- `test('throws on Supabase error')` — mock error response, verify descriptive throw
`describe('findActive')`:
- `test('returns active alerts')` — mock returning array of active alerts, verify .eq('status', 'active')
- `test('filters by serviceName when provided')` — verify additional .eq() for service_name
- `test('returns empty array when no active alerts')` — mock returning empty array
`describe('acknowledge')`:
- `test('sets status to acknowledged with timestamp')` — verify .update() called with { status: 'acknowledged', acknowledged_at: expect.any(String) }
- `test('throws when alert not found')` — mock Supabase returning null/error, verify error thrown
`describe('resolve')`:
- `test('sets status to resolved with timestamp')` — verify .update() with { status: 'resolved', resolved_at: expect.any(String) }
- `test('throws when alert not found')` — verify error handling
`describe('findRecentByService')`:
- `test('finds recent alert within time window')` — mock returning a match, verify filters for service_name, alert_type, and created_at > threshold
- `test('returns null when no recent alerts')`: mock returning no matching row (the PGRST116 not-found case), verify null
`describe('deleteOlderThan')`:
- `test('deletes records older than specified days')` — same pattern as HealthCheckModel
- `test('returns count of deleted records')` — verify count
**Pattern notes:**
- Same mock setup as HealthCheckModel test
- Same beforeEach/clearAllMocks pattern
- Verify getSupabaseServiceClient called per method
</action>
<verify>
<automated>cd /home/jonathan/Coding/cim_summary/backend && npx vitest run src/__tests__/models/AlertEventModel.test.ts --reporter=verbose 2>&1 | tail -30</automated>
</verify>
<done>All AlertEventModel tests pass. Tests cover create (valid, default status, explicit status, with details), input validation (empty name, invalid alert_type, invalid status), Supabase error handling, findActive (all, filtered, empty), acknowledge, resolve, findRecentByService (found, not found), deleteOlderThan.</done>
</task>
</tasks>
<verification>
1. `cd backend && npx vitest run src/__tests__/models/ --reporter=verbose` — all model tests pass
2. `cd backend && npx vitest run --reporter=verbose` — full test suite still passes (no regressions)
3. Tests mock `getSupabaseServiceClient` (not `getSupabaseClient` or `getPostgresPool`) confirming INFR-04 compliance
</verification>
<success_criteria>
- All HealthCheckModel tests pass covering create, findLatestByService, findAll, deleteOlderThan, plus validation errors
- All AlertEventModel tests pass covering create, findActive, acknowledge, resolve, findRecentByService, deleteOlderThan, plus validation errors
- Existing test suite continues to pass (no regressions)
- Supabase mocking pattern established for future model tests
</success_criteria>
<output>
After completion, create `.planning/phases/01-data-foundation/01-02-SUMMARY.md`
</output>