CIM Summary — Analytics & Monitoring
What This Is
An analytics dashboard and service health monitoring system for the existing CIM Summary application. Provides document processing metrics, user activity tracking, real-time service health detection, scheduled health probes, and email + in-app alerting when APIs or credentials need attention.
Core Value
When something breaks — an API key expires, a service goes down, a credential needs reauthorization — the admin knows immediately and knows exactly what to fix.
Requirements
Validated
- ✓ Document upload and processing pipeline — existing
- ✓ Multi-provider LLM integration (Anthropic, OpenAI, OpenRouter) — existing
- ✓ Google Document AI text extraction — existing
- ✓ Supabase PostgreSQL with pgvector for storage and search — existing
- ✓ Firebase Authentication — existing
- ✓ Google Cloud Storage for file management — existing
- ✓ Background job queue with retry logic — existing
- ✓ Structured logging with Winston and correlation IDs — existing
- ✓ Basic health endpoints (/health, /health/config, /monitoring/dashboard) — existing
- ✓ PDF generation and export — existing
Active
- In-app admin analytics dashboard (processing metrics + user activity)
- Service health monitoring for Google Document AI, Claude/OpenAI, Supabase, Firebase Auth
- Real-time auth failure detection with actionable alerts
- Scheduled periodic health probes for all 4 services
- Email alerting for critical service issues
- In-app alert notifications for admin
- 30-day rolling data retention for analytics
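The 30-day rolling retention could be enforced by a small scheduled purge. The sketch below is one possible shape, not a settled design: the table name `analytics_events` and column `created_at` are hypothetical placeholders, and the `$1` placeholder assumes a PostgreSQL client that accepts parameterized queries.

```typescript
// Sketch only: analytics_events and created_at are hypothetical names,
// not taken from the existing CIM Summary schema.
const RETENTION_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the instant before which analytics rows fall outside the window.
function retentionCutoff(now: Date, days: number = RETENTION_DAYS): Date {
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// True when a row recorded at `recordedAt` is older than the retention window.
function isExpired(recordedAt: Date, now: Date): boolean {
  return recordedAt.getTime() < retentionCutoff(now).getTime();
}

// Builds a parameterized DELETE so the cutoff is never string-concatenated
// into the SQL text.
function buildPurgeQuery(now: Date): { text: string; values: string[] } {
  return {
    text: "DELETE FROM analytics_events WHERE created_at < $1",
    values: [retentionCutoff(now).toISOString()],
  };
}
```

Passing `now` in as an argument, rather than calling `new Date()` inside the functions, keeps the cutoff logic deterministic and easy to test.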
Out of Scope
- External monitoring tools (Grafana, Datadog) — keeping it in-app for simplicity
- Non-admin user analytics views — admin-only for now
- Mobile push notifications — email + in-app sufficient
- Historical analytics beyond 30 days — lean storage, can extend later
- Real-time WebSocket updates — polling is sufficient for admin dashboard
Context
The CIM Summary application already has basic health endpoints and structured logging with correlation IDs. The existing /monitoring/dashboard endpoint provides some system metrics. The performance_metrics table in Supabase already exists for storing system performance data. Winston logging captures errors with context, but there's no alerting mechanism — errors are logged but nobody gets notified.
The admin user is jpressnell@bluepointcapital.com. This is a single-admin system for now.
Four external services need monitoring:
- Google Document AI — uses service account credentials, can expire or lose permissions
- Claude/OpenAI — API keys can be revoked, rate limited, or run out of credits
- Supabase — connection pool issues, service key rotation, pgvector availability
- Firebase Auth — project config changes, token verification failures
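To make alerts actionable, a failed call to any of these services could be classified by its HTTP status before an alert is raised. The following is a minimal sketch under stated assumptions: every type and function name is hypothetical, the remediation text is illustrative, and the status mapping (401/403 for credentials, 429 for throttling, 5xx for provider outages) is a common convention rather than something each provider is guaranteed to follow.

```typescript
// Sketch only: all names here are hypothetical, not from the CIM Summary codebase.
type MonitoredService = "document_ai" | "llm" | "supabase" | "firebase_auth";
type FailureKind = "credentials" | "rate_limit" | "outage" | "unknown";

interface Diagnosis {
  service: MonitoredService;
  kind: FailureKind;
  action: string; // what the admin should check first
}

// Assumed remediation hints for credential failures, one per service.
const CREDENTIAL_ACTIONS: Record<MonitoredService, string> = {
  document_ai: "Check the Google service account key and its Document AI permissions.",
  llm: "Check that the LLM provider API key is valid and the account has credits.",
  supabase: "Check whether the Supabase service key was rotated.",
  firebase_auth: "Check the Firebase project configuration.",
};

// Maps a failed HTTP status to a failure kind and a suggested next step.
function diagnose(service: MonitoredService, status: number): Diagnosis {
  if (status === 401 || status === 403) {
    return { service, kind: "credentials", action: CREDENTIAL_ACTIONS[service] };
  }
  if (status === 429) {
    return { service, kind: "rate_limit", action: "Reduce request volume or raise the provider quota." };
  }
  if (status >= 500 && status <= 599) {
    return { service, kind: "outage", action: "Check the provider status page and retry later." };
  }
  return { service, kind: "unknown", action: "Inspect the logged error using its correlation ID." };
}
```

For example, `diagnose("supabase", 401)` yields a `credentials` diagnosis pointing at service key rotation, while an unrecognized status falls back to the correlation ID already captured by the existing logging.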
Constraints
- Tech stack: Must integrate with existing Express.js backend and React frontend
- Auth: Admin-only access, use existing Firebase Auth with role check for jpressnell@bluepointcapital.com
- Storage: Use existing Supabase PostgreSQL — no new database infrastructure
- Email: Need an email sending service (SendGrid, Resend, or similar) for alerts
- Deployment: Must work within Firebase Cloud Functions 14-minute timeout
- Data retention: 30-day rolling window to keep storage costs low
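The admin-only constraint could be met with a small Express-style middleware placed in front of the analytics routes. This is a sketch, not the project's implementation: it assumes an earlier middleware has already verified the Firebase ID token and attached the decoded email as `req.user.email` (a hypothetical shape), and it declares minimal structural types instead of importing Express so the example stays self-contained.

```typescript
// Sketch only: req.user.email is an assumed shape set by an earlier
// token-verification middleware; the interfaces mirror just the parts
// of Express's request/response that this check needs.
interface AdminRequest {
  user?: { email?: string };
}
interface AdminResponse {
  status(code: number): AdminResponse;
  json(body: unknown): void;
}

const ADMIN_EMAIL = "jpressnell@bluepointcapital.com";

// Case-insensitive comparison against the single configured admin.
function isAdmin(email: string | undefined): boolean {
  return email !== undefined && email.trim().toLowerCase() === ADMIN_EMAIL;
}

// Rejects unauthenticated requests with 401 and non-admin users with 403;
// calls next() only for the admin.
function requireAdmin(req: AdminRequest, res: AdminResponse, next: () => void): void {
  if (!req.user) {
    res.status(401).json({ error: "Not authenticated" });
    return;
  }
  if (!isAdmin(req.user.email)) {
    res.status(403).json({ error: "Admin access required" });
    return;
  }
  next();
}
```

Keeping the comparison in `isAdmin` means a later move to multiple admins or a role claim would only touch that one function.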
Key Decisions
| Decision | Rationale | Outcome |
|---|---|---|
| In-app dashboard over external tools | Simpler setup, no additional infrastructure, admin can see everything in one place | — Pending |
| Email + in-app dual alerting | Redundancy for critical issues — in-app for when you're already looking, email for when you're not | — Pending |
| 30-day retention | Balances useful trend data with storage efficiency | — Pending |
| Single admin (jpressnell@bluepointcapital.com) | Simple RBAC for now, can extend later | — Pending |
| Real-time detection + scheduled probes | Catches failures as they happen AND proactively tests services before users hit them | — Pending |
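One way to read the dual-alerting decision is as a routing rule: every alert appears in-app, and critical alerts also go out by email. The sketch below illustrates that rule under assumptions: the severity levels, type names, and sender signatures are hypothetical, and the senders are synchronous here for brevity where real email and database writes would be asynchronous.

```typescript
// Sketch only: severity levels and sender signatures are assumptions.
type Severity = "critical" | "warning" | "info";
type Channel = "in_app" | "email";

interface ServiceAlert {
  service: string;
  severity: Severity;
  message: string;
}

type Sender = (alert: ServiceAlert) => void;

// Critical alerts use both channels; everything else stays in-app.
function channelsFor(severity: Severity): Channel[] {
  return severity === "critical" ? ["in_app", "email"] : ["in_app"];
}

// Sends the alert on each selected channel and returns the channels that
// succeeded. A failure on one channel does not stop delivery on the other,
// which is the point of having two.
function dispatchAlert(alert: ServiceAlert, senders: Record<Channel, Sender>): Channel[] {
  const delivered: Channel[] = [];
  for (const channel of channelsFor(alert.severity)) {
    try {
      senders[channel](alert);
      delivered.push(channel);
    } catch {
      // Swallowed in this sketch; a real version would log the failure.
    }
  }
  return delivered;
}
```

Injecting the senders as a record keeps the routing rule independent of whichever email provider is eventually chosen.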
Last updated: 2026-02-24 after initialization