phase: 02-backend-services
plan: 02
subsystem: infra
tags: [health-probes, document-ai, anthropic, firebase-auth, postgres, vitest, nodemailer]
# Dependency graph
requires:
- phase: 01-data-foundation
provides: HealthCheckModel.create() for persistence
- phase: 02-backend-services
plan: 01
provides: Schema and model layer for service_health_checks table
provides:
- healthProbeService with 4 real API probers (document_ai, llm_api, supabase, firebase_auth)
- ProbeResult interface exported for use by health endpoint
- runAllProbes orchestrator with fault-tolerant probe isolation
- nodemailer installed (needed by Plan 03 alert notifications)
affects: [02-backend-services, 02-03-PLAN]
# Tech tracking
tech-stack:
added: [nodemailer@8.0.1, @types/nodemailer]
patterns:
- Promise.allSettled for fault-tolerant concurrent probe orchestration
- firebase-admin verifyIdToken probe distinguishes expected vs unexpected errors
- Direct PostgreSQL pool (getPostgresPool) for Supabase probe, not PostgREST
- LLM probe uses cheapest model (claude-haiku-4-5) with max_tokens 5
key-files:
created:
- backend/src/services/healthProbeService.ts
- backend/src/__tests__/unit/healthProbeService.test.ts
modified:
- backend/package.json (nodemailer + @types/nodemailer added)
key-decisions:
- "LLM probe uses claude-haiku-4-5 with max_tokens 5 (cheapest available, prevents expensive accidental probes)"
- "Supabase probe uses getPostgresPool().query('SELECT 1') not PostgREST client (bypasses caching/middleware)"
- "Firebase Auth probe uses verifyIdToken('invalid-token') — always throws, distinguished by error message content"
- "Promise.allSettled chosen over Promise.all to guarantee all probes run even if one throws outside try/catch"
- "HealthCheckModel.create failure per probe is swallowed with logger.error — probe results still returned to caller"
patterns-established:
- "Probe pattern: record start time, try real API call, compute latency, return ProbeResult with status/latency_ms/error_message"
- "Firebase SDK probe: verifyIdToken always throws; 'argument'/'INVALID'/'Decoding' in message = SDK alive = healthy"
- "429 rate limit errors = degraded (not down) — service is alive but throttling"
- "vi.mock with inline vi.fn() in factory — no outer variable references (Vitest hoisting TDZ safe)"
requirements-completed: [HLTH-02, HLTH-04]
# Metrics
duration: 18min
completed: 2026-02-24
Phase 02 Plan 02: Health Probe Service Summary
Four real authenticated API probers (Document AI, LLM claude-haiku-4-5, Supabase pg pool, Firebase Auth) with fault-tolerant orchestrator and Supabase persistence via HealthCheckModel
Performance
- Duration: 18 min
- Started: 2026-02-24T14:05:00Z
- Completed: 2026-02-24T14:23:55Z
- Tasks: 2
- Files modified: 4
Accomplishments
- Created healthProbeService.ts with 4 individual probers, each making real authenticated API calls
- Implemented runAllProbes orchestrator using Promise.allSettled for fault isolation (one probe failure never blocks others)
- Each probe result persisted to Supabase via HealthCheckModel.create() after completion
- 9 unit tests covering all probers, fault tolerance, 429 degraded handling, Supabase pool verification, and Firebase error discrimination
- Installed nodemailer (needed by Plan 03 alert notifications) to avoid package.json conflicts in parallel execution
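The probe pattern and the fault-isolating orchestrator described above can be sketched roughly as follows. This is a minimal illustration, not the contents of healthProbeService.ts: the `service_name` field, the `makeProber` helper, and the injected `persist` and `logError` callbacks are assumptions introduced here (the real service persists through HealthCheckModel.create() and logs through logger.error).

```typescript
// Hypothetical shapes for illustration; the real ProbeResult may differ.
type ProbeStatus = "healthy" | "degraded" | "down";

interface ProbeResult {
  service_name: string; // assumed field name
  status: ProbeStatus;
  latency_ms: number;
  error_message?: string;
}

type Prober = () => Promise<ProbeResult>;

// Probe pattern: record start time, try the real call, compute latency.
function makeProber(serviceName: string, call: () => Promise<unknown>): Prober {
  return async (): Promise<ProbeResult> => {
    const start = Date.now();
    try {
      await call();
      return { service_name: serviceName, status: "healthy", latency_ms: Date.now() - start };
    } catch (err) {
      return {
        service_name: serviceName,
        status: "down",
        latency_ms: Date.now() - start,
        error_message: err instanceof Error ? err.message : String(err),
      };
    }
  };
}

// Orchestrator: allSettled lets every probe finish, even if one rejects
// outside its own try/catch. Persistence errors are logged and swallowed.
async function runAllProbes(
  probers: Array<{ name: string; probe: Prober }>,
  persist: (result: ProbeResult) => Promise<void>,
  logError: (msg: string, err: unknown) => void = () => {},
): Promise<ProbeResult[]> {
  const settled = await Promise.allSettled(probers.map((p) => p.probe()));
  const results = settled.map((outcome, i): ProbeResult =>
    outcome.status === "fulfilled"
      ? outcome.value
      : {
          service_name: probers[i].name,
          status: "down",
          latency_ms: 0,
          error_message:
            outcome.reason instanceof Error ? outcome.reason.message : String(outcome.reason),
        },
  );
  for (const result of results) {
    try {
      await persist(result);
    } catch (err) {
      logError(`failed to persist ${result.service_name}`, err);
    }
  }
  return results;
}
```

With Promise.all, the first rejection would discard the results of the other probes; with Promise.allSettled, a rejected probe is simply mapped to a "down" result alongside the rest.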
Task Commits
Each task was committed atomically:
- Task 1: Install nodemailer and create healthProbeService - 4129826 (feat)
- Task 2: Create healthProbeService unit tests - a8ba884 (test)
Plan metadata: (docs commit — created below)
Files Created/Modified
- backend/src/services/healthProbeService.ts - Health probe orchestrator with ProbeResult interface and 4 individual probers
- backend/src/__tests__/unit/healthProbeService.test.ts - 9 unit tests covering all probers and orchestrator
- backend/package.json - nodemailer + @types/nodemailer added
Decisions Made
- LLM probe uses claude-haiku-4-5 with max_tokens: 5; the cheapest Anthropic model prevents accidentally expensive probe calls
- Supabase probe uses getPostgresPool().query('SELECT 1'), which bypasses PostgREST middleware/caching and tests actual DB connectivity
- Firebase Auth probe strategy: verifyIdToken('invalid-token-probe-check') always throws; an error message containing 'argument', 'INVALID', or 'Decoding' means the SDK is functioning, so the probe reports 'healthy'
- Promise.allSettled over Promise.all: guarantees all 4 probes run even if one rejects outside its own try/catch
- Per-probe persistence failure is swallowed (logger.error only) so probe results are still returned to caller
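The error-discrimination decisions above might look something like the sketch below. The function names and the error shape (a numeric `status` field with a message fallback) are assumptions for illustration. The summary states only that the three Firebase substrings indicate a live SDK and that 429 responses count as degraded; it does not say how the service detects a 429.

```typescript
type ProbeStatus = "healthy" | "degraded" | "down";

// Substrings the summary lists as evidence that the Firebase SDK
// actually processed the (deliberately invalid) token.
const FIREBASE_EXPECTED_MARKERS = ["argument", "INVALID", "Decoding"];

// verifyIdToken always throws for the probe token, so the message
// decides the status: an expected rejection means the SDK is alive.
function classifyFirebaseProbeError(message: string): ProbeStatus {
  return FIREBASE_EXPECTED_MARKERS.some((marker) => message.includes(marker))
    ? "healthy"
    : "down";
}

// Assumed error shape: many HTTP clients expose a numeric `status`.
// The message check is a loose fallback and could match unrelated text.
function classifyApiError(err: { status?: number; message?: string }): ProbeStatus {
  if (err.status === 429 || (err.message ?? "").includes("429")) {
    return "degraded"; // rate limited: the service is alive but throttling
  }
  return "down";
}
```

Matching on message substrings is brittle if the SDK changes its wording, so a real implementation may prefer structured error codes where they are available.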
Deviations from Plan
None - plan executed exactly as written.
Issues Encountered
None — all probes compiled and tested cleanly on first implementation.
User Setup Required
None - no external service configuration required beyond what's already in .env.
Next Phase Readiness
- healthProbeService.runAllProbes() is ready to be called by the health scheduler (Plan 03)
- nodemailer is installed and ready for the Plan 03 alert notification service
- ProbeResult interface is exported and ready for use in health status API endpoints
Phase: 02-backend-services
Completed: 2026-02-24