CIM Summary — Analytics & Monitoring

What This Is

An analytics dashboard and service health monitoring system for the existing CIM Summary application. Provides document processing metrics, user activity tracking, real-time service health detection, scheduled health probes, and email + in-app alerting when APIs or credentials need attention.

Core Value

When something breaks — an API key expires, a service goes down, a credential needs reauthorization — the admin knows immediately and knows exactly what to fix.

Requirements

Validated

  • ✓ Document upload and processing pipeline — existing
  • ✓ Multi-provider LLM integration (Anthropic, OpenAI, OpenRouter) — existing
  • ✓ Google Document AI text extraction — existing
  • ✓ Supabase PostgreSQL with pgvector for storage and search — existing
  • ✓ Firebase Authentication — existing
  • ✓ Google Cloud Storage for file management — existing
  • ✓ Background job queue with retry logic — existing
  • ✓ Structured logging with Winston and correlation IDs — existing
  • ✓ Basic health endpoints (/health, /health/config, /monitoring/dashboard) — existing
  • ✓ PDF generation and export — existing

Active

  • In-app admin analytics dashboard (processing metrics + user activity)
  • Service health monitoring for Google Document AI, Claude/OpenAI, Supabase, Firebase Auth
  • Real-time auth failure detection with actionable alerts
  • Scheduled periodic health probes for all four services
  • Email alerting for critical service issues
  • In-app alert notifications for admin
  • 30-day rolling data retention for analytics
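
The 30-day rolling retention requirement can be illustrated with a small sketch. The row shape (`AnalyticsRow`) and the function names are hypothetical and chosen only for illustration; the real analytics tables may look different.

```typescript
// Hypothetical row shape; the real analytics tables may differ.
interface AnalyticsRow {
  id: string;
  createdAt: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the oldest timestamp that still falls inside the rolling window.
function retentionCutoff(now: Date, retentionDays: number = 30): Date {
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// Splits rows into those to keep and those old enough to delete.
function partitionByRetention(
  rows: AnalyticsRow[],
  now: Date,
  retentionDays: number = 30
): { keep: AnalyticsRow[]; expired: AnalyticsRow[] } {
  const cutoff = retentionCutoff(now, retentionDays).getTime();
  const keep: AnalyticsRow[] = [];
  const expired: AnalyticsRow[] = [];
  for (const row of rows) {
    if (row.createdAt.getTime() >= cutoff) {
      keep.push(row);
    } else {
      expired.push(row);
    }
  }
  return { keep, expired };
}
```

In a real deployment the pruning would more likely be a single SQL delete against Supabase using the same cutoff, run from a scheduled job; the in-memory partition above only demonstrates the window arithmetic.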

Out of Scope

  • External monitoring tools (Grafana, Datadog) — keeping it in-app for simplicity
  • Non-admin user analytics views — admin-only for now
  • Mobile push notifications — email + in-app sufficient
  • Historical analytics beyond 30 days — lean storage, can extend later
  • Real-time WebSocket updates — polling is sufficient for admin dashboard

Context

The CIM Summary application already has basic health endpoints and structured logging with correlation IDs. The existing /monitoring/dashboard endpoint provides some system metrics. The performance_metrics table in Supabase already exists for storing system performance data. Winston logging captures errors with context, but there's no alerting mechanism — errors are logged but nobody gets notified.

The admin user is jpressnell@bluepointcapital.com. This is a single-admin system for now.

Four external services need monitoring:

  1. Google Document AI: uses service account credentials, which can expire or lose permissions
  2. Claude/OpenAI: API keys can be revoked or rate limited, and accounts can run out of credits
  3. Supabase: connection pool issues, service key rotation, pgvector availability
  4. Firebase Auth: project config changes, token verification failures
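
One way to turn the failure modes listed above into actionable alerts is to classify each probe result into a status plus a suggested fix. The sketch below assumes probes report an HTTP status code (or `null` when the service could not be reached); real SDKs may surface errors differently, and every type and function name here is hypothetical.

```typescript
type ServiceName = "document_ai" | "llm" | "supabase" | "firebase_auth";
type HealthStatus = "healthy" | "auth_failed" | "rate_limited" | "degraded" | "down";

interface ProbeResult {
  service: ServiceName;
  httpStatus: number | null; // null when the probe timed out or could not connect
}

interface HealthReport {
  service: ServiceName;
  status: HealthStatus;
  action: string; // what the admin should look at first
}

// Suggested fixes for credential problems, taken from the failure modes per service.
const AUTH_FIX: Record<ServiceName, string> = {
  document_ai: "Check the service account credentials and their permissions.",
  llm: "Check whether the API key was revoked, then rotate it.",
  supabase: "Check whether the service key was rotated.",
  firebase_auth: "Check the Firebase project config and token verification.",
};

// Assumption: status codes map to failure classes as shown; adjust per provider.
function classifyProbe(result: ProbeResult): HealthReport {
  const { service, httpStatus } = result;
  if (httpStatus === null) {
    return { service, status: "down", action: "Service unreachable; check its status page and network access." };
  }
  if (httpStatus === 401 || httpStatus === 403) {
    return { service, status: "auth_failed", action: AUTH_FIX[service] };
  }
  if (httpStatus === 429) {
    return { service, status: "rate_limited", action: "Reduce request volume or review quota and credits." };
  }
  if (httpStatus >= 500) {
    return { service, status: "down", action: "Provider-side error; retry later and watch for recovery." };
  }
  if (httpStatus >= 200 && httpStatus < 300) {
    return { service, status: "healthy", action: "No action needed." };
  }
  return { service, status: "degraded", action: "Unexpected response; inspect the logs for this probe." };
}
```

The same classifier could serve both the scheduled probes and the real-time detection path, so an expired key produces the same message regardless of how it was discovered.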

Constraints

  • Tech stack: Must integrate with existing Express.js backend and React frontend
  • Auth: Admin-only access, use existing Firebase Auth with role check for jpressnell@bluepointcapital.com
  • Storage: Use existing Supabase PostgreSQL — no new database infrastructure
  • Email: Need an email sending service (SendGrid, Resend, or similar) for alerts
  • Deployment: Must work within Firebase Cloud Functions 14-minute timeout
  • Data retention: 30-day rolling window to keep storage costs low
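
The admin-only constraint could be enforced with a small role check applied to the claims of an already verified Firebase ID token. This is a minimal sketch: `TokenClaims` mirrors only the `email` and `email_verified` fields of a decoded token, and requiring a verified email is an assumption, not something this document specifies.

```typescript
// Minimal subset of the claims found on a verified Firebase ID token.
interface TokenClaims {
  email?: string;
  email_verified?: boolean;
}

// Single-admin allowlist; extend this list if more admins are added later.
const ADMIN_EMAILS: readonly string[] = ["jpressnell@bluepointcapital.com"];

// Returns true only for a verified email on the allowlist.
// Assumption: an unverified email is treated as non-admin.
function isAdmin(claims: TokenClaims | null | undefined): boolean {
  if (!claims || !claims.email || claims.email_verified !== true) {
    return false;
  }
  const email = claims.email.trim().toLowerCase();
  return ADMIN_EMAILS.some((allowed) => allowed === email);
}
```

In the Express backend this check would typically sit in a middleware that runs after token verification and responds with 403 when `isAdmin` returns false.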

Key Decisions

| Decision | Rationale | Outcome |
|---|---|---|
| In-app dashboard over external tools | Simpler setup, no additional infrastructure, admin can see everything in one place | Pending |
| Email + in-app dual alerting | Redundancy for critical issues: in-app for when you're already looking, email for when you're not | Pending |
| 30-day retention | Balances useful trend data with storage efficiency | Pending |
| Single admin (jpressnell@bluepointcapital.com) | Simple RBAC for now, can extend later | Pending |
| Real-time detection + scheduled probes | Catches failures as they happen AND proactively tests services before users hit them | Pending |
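
The dual-alerting decision could be sketched as a routing function that always raises an in-app notification and adds email only for critical issues. The one-hour email cooldown is purely an assumption added to avoid repeated emails for the same outage; the names `AlertEvent` and `planChannels` are hypothetical.

```typescript
type Severity = "critical" | "warning";
type Channel = "email" | "in_app";

interface AlertEvent {
  service: string;
  severity: Severity;
  occurredAt: number; // epoch milliseconds
}

// Assumed cooldown between emails for the same service; not specified in this document.
const EMAIL_COOLDOWN_MS = 60 * 60 * 1000;

// Decides which channels should carry an alert.
// lastEmailAt is the time of the previous email for this service, or null if none was sent.
function planChannels(
  event: AlertEvent,
  lastEmailAt: number | null,
  cooldownMs: number = EMAIL_COOLDOWN_MS
): Channel[] {
  const channels: Channel[] = ["in_app"];
  if (event.severity !== "critical") {
    return channels;
  }
  const coolingDown = lastEmailAt !== null && event.occurredAt - lastEmailAt < cooldownMs;
  if (!coolingDown) {
    channels.push("email");
  }
  return channels;
}
```

Keeping the decision in a pure function makes it straightforward to unit test separately from whichever email provider is eventually chosen.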

Last updated: 2026-02-24 after initialization