# Requirements Archive: v1.0 Analytics & Monitoring

**Archived:** 2026-02-25
**Status:** SHIPPED

For current requirements, see `.planning/REQUIREMENTS.md`.

---

# Requirements: CIM Summary - Analytics & Monitoring

**Defined:** 2026-02-24
**Core Value:** When something breaks (an API key expires, a service goes down, a credential needs reauthorization), the admin knows immediately and knows exactly what to fix.

## v1 Requirements

Requirements for initial release. Each maps to roadmap phases.

### Service Health

- [x] **HLTH-01**: Admin can view live health status (healthy/degraded/down) for Document AI, Claude/OpenAI, Supabase, and Firebase Auth
- [x] **HLTH-02**: Each health probe makes a real authenticated API call, not just config checks
- [x] **HLTH-03**: Health probes run on a scheduled interval, separate from document processing
- [x] **HLTH-04**: Health probe results persist to Supabase (survive cold starts)

### Alerting

- [x] **ALRT-01**: Admin receives email alert when a service goes down or degrades
- [x] **ALRT-02**: Alert deduplication prevents repeat emails for the same ongoing issue (cooldown period)
- [x] **ALRT-03**: Admin sees in-app alert banner for active critical issues
- [x] **ALRT-04**: Alert recipient stored as configuration, not hardcoded

### Processing Analytics

- [x] **ANLY-01**: Document processing events persist to Supabase at write time (not in-memory only)
- [x] **ANLY-02**: Admin can view processing summary: upload counts, success/failure rates, avg processing time
- [x] **ANLY-03**: Analytics instrumentation is non-blocking (fire-and-forget, never delays processing pipeline)

### Infrastructure

- [x] **INFR-01**: Database migrations create `service_health_checks` and `alert_events` tables with indexes on `created_at`
- [x] **INFR-02**: Admin API routes protected by Firebase Auth with admin email check
- [x] **INFR-03**: 30-day rolling data retention cleanup runs on schedule
- [x] **INFR-04**: Analytics writes use existing Supabase connection, no new database infrastructure

## v2 Requirements

Deferred to future release. Tracked but not in current roadmap.

### Service Health

- **HLTH-05**: Admin can view 7-day service health history with uptime percentages
- **HLTH-06**: Real-time auth failure detection classifies auth errors (401/403) vs transient errors (429/503) and alerts immediately on credential issues

### Alerting

- **ALRT-05**: Admin can acknowledge or snooze alerts from the UI
- **ALRT-06**: Admin receives recovery email when a downed service returns healthy

### Processing Analytics

- **ANLY-04**: Admin can view processing time trend charts over time
- **ANLY-05**: Admin can view LLM token usage and estimated cost per document and per month

### Infrastructure

- **INFR-05**: Dashboard shows staleness warning when monitoring data stops arriving

## Out of Scope

| Feature | Reason |
|---------|--------|
| External monitoring tools (Grafana, Datadog) | Operational overhead unjustified for single-admin app |
| Multi-user analytics views | One admin user, RBAC complexity for zero benefit |
| WebSocket/SSE real-time updates | Polling at 60s intervals sufficient; WebSockets complex in Cloud Functions |
| Mobile push notifications | Email + in-app covers notification needs |
| Historical analytics beyond 30 days | Storage costs; can extend later |
| ML-based anomaly detection | Threshold-based alerting sufficient for this scale |
| Log aggregation / log search UI | Firebase Cloud Logging handles this |

## Traceability

Which phases cover which requirements. Updated during roadmap creation.

| Requirement | Phase | Status |
|-------------|-------|--------|
| INFR-01 | Phase 1 | Complete |
| INFR-04 | Phase 1 | Complete |
| HLTH-02 | Phase 2 | Complete |
| HLTH-03 | Phase 2 | Complete |
| HLTH-04 | Phase 2 | Complete |
| ALRT-01 | Phase 2 | Complete |
| ALRT-02 | Phase 2 | Complete |
| ALRT-04 | Phase 2 | Complete |
| ANLY-01 | Phase 2 | Complete |
| ANLY-03 | Phase 2 | Complete |
| INFR-03 | Phase 2 | Complete |
| INFR-02 | Phase 3 | Complete |
| HLTH-01 | Phase 3 | Complete |
| ANLY-02 | Phase 3 | Complete |
| ALRT-03 | Phase 4 | Complete |

**Coverage:**

- v1 requirements: 15 total
- Mapped to phases: 15
- Unmapped: 0

---

*Requirements defined: 2026-02-24*
*Last updated: 2026-02-24 (traceability mapped after roadmap creation)*
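
As an illustration of ALRT-01 and ALRT-02, the cooldown-based deduplication could look roughly like the sketch below. This is not the shipped implementation: the `AlertEvent` shape, the `shouldSendAlert` helper, and the one-hour `COOLDOWN_MS` value are all hypothetical, since this document specifies only that a cooldown period exists and that alerts fire when a service goes down or degrades.

```typescript
// Hypothetical types; the real alert_events schema is not defined in this document.
type ServiceStatus = "healthy" | "degraded" | "down";

interface AlertEvent {
  service: string;
  status: ServiceStatus;
  sentAt: number; // epoch milliseconds
}

// Assumed value; the requirement only states that a cooldown period exists.
const COOLDOWN_MS = 60 * 60 * 1000;

// Decide whether an email should go out for the given probe result.
function shouldSendAlert(
  service: string,
  status: ServiceStatus,
  now: number,
  recentAlerts: AlertEvent[],
  cooldownMs: number = COOLDOWN_MS,
): boolean {
  // ALRT-01: only "down" or "degraded" results trigger an alert.
  if (status === "healthy") return false;
  // ALRT-02: suppress the email if the same service/status pair was
  // already alerted on within the cooldown window.
  return !recentAlerts.some(
    (a) =>
      a.service === service &&
      a.status === status &&
      now - a.sentAt < cooldownMs,
  );
}
```

In this sketch a change of status (for example, degraded to down) counts as a new issue and alerts immediately, while a repeat of the same status stays silent until the cooldown elapses. Whether the shipped system keys deduplication on status as well as service is an assumption here.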