CIM Summary — Analytics & Monitoring

What This Is

An analytics dashboard and service health monitoring system for the existing CIM Summary application. It provides document processing metrics, user activity tracking, real-time detection of service failures, scheduled health probes, and email and in-app alerts when APIs or credentials need attention.

Core Value

When something breaks — an API key expires, a service goes down, a credential needs reauthorization — the admin knows immediately and knows exactly what to fix.

Requirements

Validated

✓ Document upload and processing pipeline — existing
✓ Multi-provider LLM integration (Anthropic, OpenAI, OpenRouter) — existing
✓ Google Document AI text extraction — existing
✓ Supabase PostgreSQL with pgvector for storage and search — existing
✓ Firebase Authentication — existing
✓ Google Cloud Storage for file management — existing
✓ Background job queue with retry logic — existing
✓ Structured logging with Winston and correlation IDs — existing
✓ Basic health endpoints (/health, /health/config, /monitoring/dashboard) — existing
✓ PDF generation and export — existing

Active

In-app admin analytics dashboard (processing metrics + user activity)
Service health monitoring for Google Document AI, Claude/OpenAI, Supabase, Firebase Auth
Real-time auth failure detection with actionable alerts
Scheduled health probes for all four services
Email alerting for critical service issues
In-app alert notifications for admin
30-day rolling data retention for analytics

Out of Scope

External monitoring tools (Grafana, Datadog) — keeping it in-app for simplicity
Non-admin user analytics views — admin-only for now
Mobile push notifications — email + in-app sufficient
Historical analytics beyond 30 days — lean storage, can extend later
Real-time WebSocket updates — polling is sufficient for admin dashboard

Context

The CIM Summary application already has basic health endpoints and structured logging with correlation IDs. The existing /monitoring/dashboard endpoint provides some system metrics. The performance_metrics table in Supabase already exists for storing system performance data. Winston logging captures errors with context, but there's no alerting mechanism — errors are logged but nobody gets notified.

The admin user is jpressnell@bluepointcapital.com. This is a single-admin system for now.
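
A single-admin check of this kind could be sketched as below. This is only an illustration: `TokenClaims`, `ADMIN_EMAILS`, `isAdmin`, and `adminStatus` are hypothetical names, and requiring a verified email is an assumption, not a stated requirement. The `email` and `email_verified` fields mirror those on the decoded ID token that the Firebase Admin SDK returns.

```typescript
// Minimal subset of the claims found on a decoded Firebase ID token.
interface TokenClaims {
  email?: string;
  email_verified?: boolean;
}

// Assumption: the allowlist lives in code; it could equally come from config.
const ADMIN_EMAILS: ReadonlySet<string> = new Set([
  "jpressnell@bluepointcapital.com",
]);

// True only for a verified email that is on the allowlist (case-insensitive).
function isAdmin(
  claims: TokenClaims | undefined,
  admins: ReadonlySet<string> = ADMIN_EMAILS
): boolean {
  if (!claims || !claims.email || claims.email_verified !== true) {
    return false;
  }
  return admins.has(claims.email.trim().toLowerCase());
}

// HTTP status an admin-only route would answer with:
// 401 when there is no token at all, 403 when the caller is not the admin.
function adminStatus(claims: TokenClaims | undefined): 200 | 401 | 403 {
  if (!claims) return 401;
  return isAdmin(claims) ? 200 : 403;
}
```

Keeping the allowlist as a set rather than a single string means extending beyond one admin later would not change the call sites.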

Four external services need monitoring:

Google Document AI — uses service account credentials, can expire or lose permissions
Claude/OpenAI — API keys can be revoked, rate limited, or run out of credits
Supabase — connection pool issues, service key rotation, pgvector availability
Firebase Auth — project config changes, token verification failures
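
One way the probes for these four services could be run is sketched below: each probe is an async function that resolves when the service responds and rejects otherwise, and a shared runner adds a timeout and a latency measurement. All names here (`ServiceName`, `ProbeResult`, `runProbe`, `runAllProbes`) and the 5-second and 2-second thresholds are assumptions for illustration. The real probes would call each provider's own client.

```typescript
type ServiceName = "document_ai" | "llm" | "supabase" | "firebase_auth";

interface ProbeResult {
  service: ServiceName;
  status: "healthy" | "degraded" | "down";
  latencyMs: number;
  error?: string;
}

// A probe resolves if the service answered and rejects (or throws) if not.
type Probe = () => Promise<void>;

async function runProbe(
  service: ServiceName,
  probe: Probe,
  timeoutMs: number = 5000, // assumed hard limit per probe
  slowMs: number = 2000 // assumed threshold for "degraded"
): Promise<ProbeResult> {
  const started = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  try {
    // Whichever settles first wins: the probe itself or the timeout.
    await Promise.race([probe(), timeout]);
    const latencyMs = Date.now() - started;
    return {
      service,
      status: latencyMs > slowMs ? "degraded" : "healthy",
      latencyMs,
    };
  } catch (err) {
    return {
      service,
      status: "down",
      latencyMs: Date.now() - started,
      error: err instanceof Error ? err.message : String(err),
    };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Run every probe concurrently; one failing probe never rejects the batch.
async function runAllProbes(
  probes: Array<{ service: ServiceName; probe: Probe }>,
  timeoutMs: number = 5000
): Promise<ProbeResult[]> {
  return Promise.all(
    probes.map(({ service, probe }) => runProbe(service, probe, timeoutMs))
  );
}
```

Because `runProbe` converts every failure into a `ProbeResult`, a scheduled job can store all four results in one pass even when several services are down at once.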

Constraints

Tech stack: Must integrate with existing Express.js backend and React frontend
Auth: Admin-only access, use existing Firebase Auth with role check for jpressnell@bluepointcapital.com
Storage: Use existing Supabase PostgreSQL — no new database infrastructure
Email: Need an email sending service (SendGrid, Resend, or similar) for alerts
Deployment: Must work within Firebase Cloud Functions 14-minute timeout
Data retention: 30-day rolling window to keep storage costs low
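
The 30-day rolling window could be enforced by a periodic purge along the lines of the sketch below. The `performance_metrics` table comes from this document. The `created_at` column, the second table name, and the helpers `retentionCutoff` and `buildPurgeQuery` are assumptions made for illustration.

```typescript
const RETENTION_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Tables the purge may touch. Only performance_metrics is named in this
// document; service_health_checks is a hypothetical placeholder.
const PURGEABLE_TABLES: readonly string[] = [
  "performance_metrics",
  "service_health_checks",
];

// Timestamp before which rows fall outside the rolling window.
function retentionCutoff(now: Date, days: number = RETENTION_DAYS): Date {
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// Builds a parameterized DELETE. The table name is checked against an
// allowlist because identifiers cannot be passed as query parameters.
// Assumes each table has a created_at timestamp column.
function buildPurgeQuery(
  table: string,
  now: Date,
  days: number = RETENTION_DAYS
): { text: string; values: string[] } {
  if (PURGEABLE_TABLES.indexOf(table) === -1) {
    throw new Error(`table not allowed for purge: ${table}`);
  }
  return {
    text: `DELETE FROM ${table} WHERE created_at < $1`,
    values: [retentionCutoff(now, days).toISOString()],
  };
}

// Example: with "now" at 2026-02-24T00:00:00Z the cutoff is 2026-01-25T00:00:00Z.
const example = buildPurgeQuery(
  "performance_metrics",
  new Date("2026-02-24T00:00:00Z")
);
console.log(example.text, example.values);
```

Computing the cutoff in application code and passing it as a parameter keeps the query itself static, which makes it easy to run from a scheduled function.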

Key Decisions

Decision	Rationale	Outcome
In-app dashboard over external tools	Simpler setup, no additional infrastructure, admin can see everything in one place	— Pending
Email + in-app dual alerting	Redundancy for critical issues — in-app for when you're already looking, email for when you're not	— Pending
30-day retention	Balances useful trend data with storage efficiency	— Pending
Single admin (jpressnell@bluepointcapital.com)	Simple RBAC for now, can extend later	— Pending
Real-time detection + scheduled probes	Catches failures as they happen AND proactively tests services before users hit them	— Pending
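
To make the real-time detection and dual-alerting decisions concrete, the sketch below maps a failed call's HTTP status to an alert and throttles repeats. Everything here is an assumption for illustration: the names `classifyFailure` and `AlertThrottle`, the severity mapping, the rule that only critical alerts go to email, and the suggested actions. The status codes follow standard HTTP semantics (401/403 for auth, 429 for rate limiting, 5xx for server errors).

```typescript
type FailureKind = "auth" | "rate_limit" | "unavailable" | "unknown";
type Channel = "email" | "in_app";

interface Alert {
  service: string;
  kind: FailureKind;
  severity: "critical" | "warning";
  channels: Channel[];
  action: string; // what the admin should do about it
}

// Hypothetical remediation hints shown with each alert.
const ACTIONS: Record<FailureKind, string> = {
  auth: "Rotate or reauthorize the credential for this service.",
  rate_limit: "Check quota and credits; consider reducing request volume.",
  unavailable: "Check the provider status page and network connectivity.",
  unknown: "Inspect the logs using the correlation ID.",
};

// An undefined status means the call never got an HTTP response.
function classifyFailure(service: string, httpStatus?: number): Alert {
  let kind: FailureKind = "unknown";
  if (httpStatus === undefined || httpStatus >= 500) kind = "unavailable";
  else if (httpStatus === 401 || httpStatus === 403) kind = "auth";
  else if (httpStatus === 429) kind = "rate_limit";

  const critical = kind === "auth" || kind === "unavailable";
  return {
    service,
    kind,
    severity: critical ? "critical" : "warning",
    // Assumed policy: email only for critical issues, in-app for everything.
    channels: critical ? ["email", "in_app"] : ["in_app"],
    action: ACTIONS[kind],
  };
}

// Suppresses repeats of the same service/kind pair within a cooldown window.
class AlertThrottle {
  private lastSent = new Map<string, number>();
  private cooldownMs: number;

  constructor(cooldownMs: number) {
    this.cooldownMs = cooldownMs;
  }

  shouldSend(alert: Alert, nowMs: number): boolean {
    const key = `${alert.service}:${alert.kind}`;
    const last = this.lastSent.get(key);
    if (last !== undefined && nowMs - last < this.cooldownMs) return false;
    this.lastSent.set(key, nowMs);
    return true;
  }
}
```

The throttle matters mostly for the real-time path: a revoked API key can fail many requests per minute, and one email per cooldown window is enough to tell the admin what to fix.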

Last updated: 2026-02-24 after initialization
