chore: complete v1.0 Analytics & Monitoring milestone

Archive milestone artifacts (roadmap, requirements, audit, phase directories) to .planning/milestones/. Evolve PROJECT.md with validated requirements and decision outcomes. Create MILESTONES.md and RETROSPECTIVE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -0,0 +1,226 @@
---
phase: 01-data-foundation
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - backend/src/models/migrations/012_create_monitoring_tables.sql
  - backend/src/models/HealthCheckModel.ts
  - backend/src/models/AlertEventModel.ts
  - backend/src/models/index.ts
autonomous: true
requirements:
  - INFR-01
  - INFR-04

must_haves:
  truths:
    - "Migration SQL creates service_health_checks and alert_events tables with all required columns and CHECK constraints"
    - "Both tables have indexes on created_at (INFR-01 requirement)"
    - "RLS is enabled on both new tables"
    - "HealthCheckModel and AlertEventModel use getSupabaseServiceClient() for all database operations (INFR-04: no new DB infrastructure)"
    - "Model static methods validate input before writing"
  artifacts:
    - path: "backend/src/models/migrations/012_create_monitoring_tables.sql"
      provides: "DDL for service_health_checks and alert_events tables"
      contains: "CREATE TABLE IF NOT EXISTS service_health_checks"
    - path: "backend/src/models/HealthCheckModel.ts"
      provides: "CRUD operations for service_health_checks table"
      exports: ["HealthCheckModel", "ServiceHealthCheck", "CreateHealthCheckData"]
    - path: "backend/src/models/AlertEventModel.ts"
      provides: "CRUD operations for alert_events table"
      exports: ["AlertEventModel", "AlertEvent", "CreateAlertEventData"]
    - path: "backend/src/models/index.ts"
      provides: "Barrel exports for new models"
      contains: "HealthCheckModel"
  key_links:
    - from: "backend/src/models/HealthCheckModel.ts"
      to: "backend/src/config/supabase.ts"
      via: "getSupabaseServiceClient() import"
      pattern: "import.*getSupabaseServiceClient.*from.*config/supabase"
    - from: "backend/src/models/AlertEventModel.ts"
      to: "backend/src/config/supabase.ts"
      via: "getSupabaseServiceClient() import"
      pattern: "import.*getSupabaseServiceClient.*from.*config/supabase"
    - from: "backend/src/models/HealthCheckModel.ts"
      to: "backend/src/utils/logger.ts"
      via: "Winston logger import"
      pattern: "import.*logger.*from.*utils/logger"
---

<objective>
Create the database migration and TypeScript model layer for the monitoring system.

Purpose: Establish the data foundation that all subsequent phases (health probes, alerts, analytics) depend on. Tables must exist and model CRUD must work before any service can write monitoring data.

Output: One SQL migration file, two TypeScript model classes, updated barrel exports.
</objective>

<execution_context>
@/home/jonathan/.claude/get-shit-done/workflows/execute-plan.md
@/home/jonathan/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/01-data-foundation/01-RESEARCH.md
@.planning/phases/01-data-foundation/01-CONTEXT.md

# Existing patterns to follow
@backend/src/models/DocumentModel.ts
@backend/src/models/ProcessingJobModel.ts
@backend/src/models/index.ts
@backend/src/models/migrations/005_create_processing_jobs_table.sql
@backend/src/config/supabase.ts
@backend/src/utils/logger.ts
</context>

<tasks>

<task type="auto">
<name>Task 1: Create monitoring tables migration</name>
<files>backend/src/models/migrations/012_create_monitoring_tables.sql</files>
<action>
Create migration file `012_create_monitoring_tables.sql` following the pattern from `005_create_processing_jobs_table.sql`.

**service_health_checks table:**
- `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`
- `service_name VARCHAR(100) NOT NULL`
- `status TEXT NOT NULL CHECK (status IN ('healthy', 'degraded', 'down'))`
- `latency_ms INTEGER` (nullable; INTEGER is sufficient, since its maximum of ~2.1 billion ms, roughly 24.8 days, is far beyond any realistic latency)
- `checked_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP` (when the probe actually ran; distinct from created_at per Research Pitfall 5)
- `error_message TEXT` (nullable; stores probe failure details)
- `probe_details JSONB` (nullable; flexible per-service metadata such as response codes and error specifics)
- `created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP`

**Indexes for service_health_checks:**
- `idx_service_health_checks_created_at ON service_health_checks(created_at)`: required by INFR-01, used for 30-day retention queries
- `idx_service_health_checks_service_created ON service_health_checks(service_name, created_at)`: composite index for dashboard "latest check per service" queries

**alert_events table:**
- `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`
- `service_name VARCHAR(100) NOT NULL`
- `alert_type TEXT NOT NULL CHECK (alert_type IN ('service_down', 'service_degraded', 'recovery'))`
- `status TEXT NOT NULL CHECK (status IN ('active', 'acknowledged', 'resolved'))`
- `message TEXT` (nullable; human-readable alert description)
- `details JSONB` (nullable; structured metadata about the alert)
- `created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP`
- `acknowledged_at TIMESTAMP WITH TIME ZONE` (nullable)
- `resolved_at TIMESTAMP WITH TIME ZONE` (nullable)

**Indexes for alert_events:**
- `idx_alert_events_created_at ON alert_events(created_at)`: required by INFR-01
- `idx_alert_events_status ON alert_events(status)`: for "active alerts" queries
- `idx_alert_events_service_status ON alert_events(service_name, status)`: for "active alerts per service" queries

**RLS:**
- `ALTER TABLE service_health_checks ENABLE ROW LEVEL SECURITY;`
- `ALTER TABLE alert_events ENABLE ROW LEVEL SECURITY;`
- No explicit policies are needed: the service role key bypasses RLS automatically in Supabase (Research Pitfall 2). Policies for authenticated users will be added in Phase 3.

**Important patterns (per CONTEXT.md):**
- ALL DDL uses `IF NOT EXISTS`: `CREATE TABLE IF NOT EXISTS`, `CREATE INDEX IF NOT EXISTS`
- Forward-only migration: no rollback/down scripts
- File must be numbered `012_` (current highest is `011_create_vector_database_tables.sql`)
- Include header comment with migration purpose and date

**Do NOT:**
- Use PostgreSQL ENUM types; use TEXT + CHECK per user decision
- Create rollback/down scripts; forward-only per user decision
- Add any DML (INSERT/UPDATE/DELETE); migration is DDL only
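
For orientation, the column, index, and RLS notes above could translate into DDL along the following lines. This is only a sketch, not the final migration: the SQL is held in a TypeScript string (the constant name `MIGRATION_012_SQL` and the helper `countOccurrences` are hypothetical) so that the same counts the verify step greps for can be checked in code. The header comment wording is an assumption, and the date the plan asks for is left out because it is not known here.

```typescript
// Sketch only: DDL assembled from the column and index lists in this plan.
// MIGRATION_012_SQL and countOccurrences are illustrative names, not project code.
const MIGRATION_012_SQL = `
-- Migration 012: monitoring tables (service_health_checks, alert_events)
-- Header wording is an assumption; add the migration date when writing the real file.

CREATE TABLE IF NOT EXISTS service_health_checks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  service_name VARCHAR(100) NOT NULL,
  status TEXT NOT NULL CHECK (status IN ('healthy', 'degraded', 'down')),
  latency_ms INTEGER,
  checked_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  error_message TEXT,
  probe_details JSONB,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_service_health_checks_created_at
  ON service_health_checks(created_at);
CREATE INDEX IF NOT EXISTS idx_service_health_checks_service_created
  ON service_health_checks(service_name, created_at);

CREATE TABLE IF NOT EXISTS alert_events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  service_name VARCHAR(100) NOT NULL,
  alert_type TEXT NOT NULL CHECK (alert_type IN ('service_down', 'service_degraded', 'recovery')),
  status TEXT NOT NULL CHECK (status IN ('active', 'acknowledged', 'resolved')),
  message TEXT,
  details JSONB,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  acknowledged_at TIMESTAMP WITH TIME ZONE,
  resolved_at TIMESTAMP WITH TIME ZONE
);

CREATE INDEX IF NOT EXISTS idx_alert_events_created_at
  ON alert_events(created_at);
CREATE INDEX IF NOT EXISTS idx_alert_events_status
  ON alert_events(status);
CREATE INDEX IF NOT EXISTS idx_alert_events_service_status
  ON alert_events(service_name, status);

ALTER TABLE service_health_checks ENABLE ROW LEVEL SECURITY;
ALTER TABLE alert_events ENABLE ROW LEVEL SECURITY;
`;

// Counts non-overlapping occurrences of `needle` in `haystack`.
function countOccurrences(haystack: string, needle: string): number {
  return haystack.split(needle).length - 1;
}

const tableCount = countOccurrences(MIGRATION_012_SQL, "CREATE TABLE IF NOT EXISTS");
const indexCount = countOccurrences(MIGRATION_012_SQL, "CREATE INDEX IF NOT EXISTS");
const rlsCount = countOccurrences(MIGRATION_012_SQL, "ENABLE ROW LEVEL SECURITY");

console.log({ tableCount, indexCount, rlsCount });
```

Note that `grep -c` counts matching lines while this helper counts occurrences; the two agree here because each statement starts on its own line.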
</action>
<verify>
<automated>cd /home/jonathan/Coding/cim_summary && ls -la backend/src/models/migrations/012_create_monitoring_tables.sql && grep -c "CREATE TABLE IF NOT EXISTS" backend/src/models/migrations/012_create_monitoring_tables.sql | grep -qx "2" && echo "PASS: 2 tables found" || echo "FAIL: expected 2 CREATE TABLE statements"</automated>
<manual>Verify SQL syntax is valid and matches existing migration patterns</manual>
</verify>
<done>Migration file exists with both tables, CHECK constraints on status fields, JSONB columns for flexible metadata, indexes on created_at for both tables, composite indexes for common query patterns, and RLS enabled on both tables.</done>
</task>

<task type="auto">
<name>Task 2: Create HealthCheckModel and AlertEventModel with barrel exports</name>
<files>
backend/src/models/HealthCheckModel.ts
backend/src/models/AlertEventModel.ts
backend/src/models/index.ts
</files>
<action>
**HealthCheckModel.ts** (follow the DocumentModel.ts static class pattern exactly):

Interfaces:
- `ServiceHealthCheck`: full row type matching all columns from migration (id, service_name, status, latency_ms, checked_at, error_message, probe_details, created_at). Use `'healthy' | 'degraded' | 'down'` union for status. Use `Record<string, unknown>` for probe_details (not `any`; strict TypeScript per CONVENTIONS.md).
- `CreateHealthCheckData`: input type for create method (service_name required, status required, latency_ms optional, error_message optional, probe_details optional).
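
As a rough illustration of the two interfaces above and the validation that `create` is expected to perform, the types might look like the sketch below. Representing timestamps as ISO strings and nullable columns as `| null` is an assumption about how rows come back from the client, and `validateCreateHealthCheckData` is a hypothetical helper name, not something the plan prescribes.

```typescript
type HealthStatus = "healthy" | "degraded" | "down";

// Full row shape; string timestamps and `| null` columns are assumptions.
interface ServiceHealthCheck {
  id: string;
  service_name: string;
  status: HealthStatus;
  latency_ms: number | null;
  checked_at: string;
  error_message: string | null;
  probe_details: Record<string, unknown> | null;
  created_at: string;
}

// Input shape for create(): only service_name and status are required.
interface CreateHealthCheckData {
  service_name: string;
  status: HealthStatus;
  latency_ms?: number;
  error_message?: string;
  probe_details?: Record<string, unknown>;
}

const HEALTH_STATUSES: readonly string[] = ["healthy", "degraded", "down"];

// Hypothetical helper: returns a list of problems, empty when the input is valid.
function validateCreateHealthCheckData(data: CreateHealthCheckData): string[] {
  const problems: string[] = [];
  if (typeof data.service_name !== "string" || data.service_name.trim().length === 0) {
    problems.push("service_name must be a non-empty string");
  }
  // Runtime check as well as the compile-time union, since callers may pass untyped data.
  if (HEALTH_STATUSES.indexOf(data.status) === -1) {
    problems.push("status must be one of: " + HEALTH_STATUSES.join(", "));
  }
  return problems;
}

// Example row, with made-up values, showing how the full type reads.
const exampleRow: ServiceHealthCheck = {
  id: "00000000-0000-0000-0000-000000000000",
  service_name: "example_service",
  status: "healthy",
  latency_ms: 120,
  checked_at: "2025-01-01T12:00:00.000Z",
  error_message: null,
  probe_details: { httpStatus: 200 },
  created_at: "2025-01-01T12:00:00.000Z",
};

console.log(validateCreateHealthCheckData({ service_name: exampleRow.service_name, status: exampleRow.status }));
```

Returning a list of problems rather than throwing keeps the helper easy to test; `create` can still throw a single descriptive error when the list is non-empty.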

Static methods:
- `create(data: CreateHealthCheckData): Promise<ServiceHealthCheck>`: Validate service_name is non-empty, validate status is one of the three allowed values. Call `getSupabaseServiceClient()` inside the method (not cached at module level, per Research finding). Use `.from('service_health_checks').insert({...}).select().single()`. Log with Winston logger on success and error. Throw on Supabase error with descriptive message.
- `findLatestByService(serviceName: string): Promise<ServiceHealthCheck | null>`: Get most recent health check for a given service. Order by `checked_at` desc, limit 1. Return null if not found (handle PGRST116 like ProcessingJobModel).
- `findAll(options?: { limit?: number; serviceName?: string }): Promise<ServiceHealthCheck[]>`: List health checks with optional filtering. Default limit 100. Order by created_at desc.
- `deleteOlderThan(days: number): Promise<number>`: For 30-day retention cleanup (used by Phase 2 scheduler). Delete rows whose `created_at` is more than `days` days old (`created_at < NOW() - interval`). Return the count of deleted rows.
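
One way to express the `created_at < NOW() - interval` condition from application code is to compute the cutoff timestamp up front and pass it to the query as a plain value. The sketch below shows only that calculation; `retentionCutoffIso` is a hypothetical helper, and the `now` parameter exists purely so the function can be tested deterministically.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: ISO timestamp that is `days` days before `now`.
// Rows with created_at earlier than this value fall outside the retention window.
function retentionCutoffIso(days: number, now: Date = new Date()): string {
  if (!Number.isFinite(days) || days <= 0) {
    throw new Error("days must be a positive number, got " + String(days));
  }
  return new Date(now.getTime() - days * MS_PER_DAY).toISOString();
}

// Fixed reference time for illustration; real callers would omit the second argument.
const cutoff = retentionCutoffIso(30, new Date("2025-03-31T00:00:00.000Z"));
console.log(cutoff);
```

Under this assumption, `deleteOlderThan` would filter on `created_at` being less than the returned string and report how many rows were removed. Rejecting non-positive `days` guards against a scheduler bug wiping the whole table.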

**AlertEventModel.ts** (same pattern):

Interfaces:
- `AlertEvent`: full row type (id, service_name, alert_type, status, message, details, created_at, acknowledged_at, resolved_at). Use union types for alert_type and status. Use `Record<string, unknown>` for details.
- `CreateAlertEventData`: input type (service_name, alert_type, status default 'active', message optional, details optional).

Static methods:
- `create(data: CreateAlertEventData): Promise<AlertEvent>`: Validate service_name non-empty, validate alert_type and status values. Insert with default status 'active' if not provided. Same Supabase pattern as HealthCheckModel.
- `findActive(serviceName?: string): Promise<AlertEvent[]>`: Get active (unresolved, unacknowledged) alerts. Filter `status = 'active'`. Optional service_name filter. Order by created_at desc.
- `acknowledge(id: string): Promise<AlertEvent>`: Set status to 'acknowledged' and acknowledged_at to current timestamp. Return updated row.
- `resolve(id: string): Promise<AlertEvent>`: Set status to 'resolved' and resolved_at to current timestamp. Return updated row.
- `findRecentByService(serviceName: string, alertType: string, withinMinutes: number): Promise<AlertEvent | null>`: For deduplication in Phase 2. Find most recent alert of given type for service within time window.
- `deleteOlderThan(days: number): Promise<number>`: Same retention pattern as HealthCheckModel.
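
The status transitions and the deduplication window described above reduce to a few small pure computations, sketched here under stated assumptions. The names `AlertStatusPatch`, `buildAcknowledgePatch`, `buildResolvePatch`, and `dedupWindowStartIso` are hypothetical; the plan only fixes the column names and status values.

```typescript
type AlertStatus = "active" | "acknowledged" | "resolved";

// Columns touched by a status transition; other columns are left alone.
interface AlertStatusPatch {
  status: AlertStatus;
  acknowledged_at?: string;
  resolved_at?: string;
}

// Update payload for acknowledge(id): status plus acknowledged_at timestamp.
function buildAcknowledgePatch(now: Date = new Date()): AlertStatusPatch {
  return { status: "acknowledged", acknowledged_at: now.toISOString() };
}

// Update payload for resolve(id): status plus resolved_at timestamp.
function buildResolvePatch(now: Date = new Date()): AlertStatusPatch {
  return { status: "resolved", resolved_at: now.toISOString() };
}

// Start of the deduplication window used by findRecentByService:
// alerts created at or after this instant count as "recent".
function dedupWindowStartIso(withinMinutes: number, now: Date = new Date()): string {
  if (!Number.isFinite(withinMinutes) || withinMinutes <= 0) {
    throw new Error("withinMinutes must be a positive number");
  }
  return new Date(now.getTime() - withinMinutes * 60 * 1000).toISOString();
}

// Fixed reference time for illustration only.
const referenceTime = new Date("2025-01-01T12:00:00.000Z");
console.log(buildAcknowledgePatch(referenceTime), dedupWindowStartIso(15, referenceTime));
```

Keeping the timestamp logic out of the query code makes it straightforward to unit test the models without a database.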

**Common patterns for BOTH models:**
- Import `getSupabaseServiceClient` from `'../config/supabase'`
- Import `logger` from `'../utils/logger'`
- Call `getSupabaseServiceClient()` per-method (not at module level)
- Error handling: check `if (error)` after every Supabase call, log with `logger.error()`, throw with descriptive message
- Handle PGRST116 (not found) by returning null instead of throwing (ProcessingJobModel pattern)
- Type guard on catch: `error instanceof Error ? error.message : String(error)`
- All methods are `static async`
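
The error-handling bullets above can be pictured with a small, dependency-free sketch. It does not call Supabase; `SupabaseLikeError` and `unwrapSingle` are hypothetical stand-ins for the `{ data, error }` result a query returns, used here only to show the PGRST116 and type-guard behavior.

```typescript
// Minimal stand-in for the error object on a query result (an assumption, not the real type).
interface SupabaseLikeError {
  code?: string;
  message: string;
}

// PGRST116 is treated as "not found" rather than as a failure.
function isNotFoundError(error: SupabaseLikeError | null): boolean {
  return error !== null && error.code === "PGRST116";
}

// Type guard for catch blocks, matching the pattern named in this plan.
function toErrorMessage(error: unknown): string {
  return error instanceof Error ? error.message : String(error);
}

// Hypothetical helper: null for not-found, throw for anything else, data otherwise.
function unwrapSingle<T>(
  result: { data: T | null; error: SupabaseLikeError | null },
  context: string
): T | null {
  if (result.error) {
    if (isNotFoundError(result.error)) {
      return null;
    }
    throw new Error(context + ": " + result.error.message);
  }
  return result.data;
}

// Illustration with made-up results.
const notFound = unwrapSingle<{ id: string }>(
  { data: null, error: { code: "PGRST116", message: "no rows" } },
  "findLatestByService"
);
console.log(notFound, toErrorMessage(new Error("boom")), toErrorMessage(42));
```

In the real models the throw site would also log through the Winston logger before rethrowing, as the bullets require.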

**index.ts update:**
- Add export lines for both new models: `export { HealthCheckModel } from './HealthCheckModel';` and `export { AlertEventModel } from './AlertEventModel';`
- Also export the interfaces: `export type { ServiceHealthCheck, CreateHealthCheckData } from './HealthCheckModel';` and `export type { AlertEvent, CreateAlertEventData } from './AlertEventModel';`
- Keep all existing exports intact

**Do NOT:**
- Use `any` type anywhere; use `Record<string, unknown>` for JSONB fields
- Use `console.log`; use Winston logger only
- Cache `getSupabaseServiceClient()` at module level
- Create a shared base model class (per Research recommendation: keep models independent)
</action>
<verify>
<automated>cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | tail -20</automated>
<manual>Verify both models export from index.ts and follow DocumentModel.ts patterns</manual>
</verify>
<done>HealthCheckModel.ts and AlertEventModel.ts exist with typed interfaces, static CRUD methods, input validation, getSupabaseServiceClient() per-method, Winston logging. Both models exported from index.ts. TypeScript compiles without errors.</done>
</task>

</tasks>

<verification>
1. `ls backend/src/models/migrations/012_create_monitoring_tables.sql`: migration file exists
2. `grep "CREATE TABLE IF NOT EXISTS service_health_checks" backend/src/models/migrations/012_create_monitoring_tables.sql`: table DDL present
3. `grep "CREATE TABLE IF NOT EXISTS alert_events" backend/src/models/migrations/012_create_monitoring_tables.sql`: table DDL present
4. `grep "idx_.*_created_at" backend/src/models/migrations/012_create_monitoring_tables.sql`: INFR-01 indexes present
5. `grep "ENABLE ROW LEVEL SECURITY" backend/src/models/migrations/012_create_monitoring_tables.sql`: RLS enabled
6. `grep "getSupabaseServiceClient" backend/src/models/HealthCheckModel.ts`: INFR-04 uses existing Supabase connection
7. `grep "getSupabaseServiceClient" backend/src/models/AlertEventModel.ts`: INFR-04 uses existing Supabase connection
8. `cd backend && npx tsc --noEmit`: TypeScript compiles cleanly
</verification>

<success_criteria>
- Migration file 012 creates both tables with CHECK constraints, JSONB columns, all indexes, and RLS
- Both model classes compile, export typed interfaces, use getSupabaseServiceClient() per-method
- Both models are re-exported from index.ts
- No new database connections or infrastructure introduced (INFR-04)
- TypeScript strict compilation passes
</success_criteria>

<output>
After completion, create `.planning/phases/01-data-foundation/01-01-SUMMARY.md`
</output>