docs: initialize project
.planning/PROJECT.md
@@ -0,0 +1,76 @@
# CIM Summary — Analytics & Monitoring

## What This Is

An analytics dashboard and service health monitoring system for the existing CIM Summary application. Provides document processing metrics, user activity tracking, real-time service health detection, scheduled health probes, and email + in-app alerting when APIs or credentials need attention.

## Core Value

When something breaks — an API key expires, a service goes down, a credential needs reauthorization — the admin knows immediately and knows exactly what to fix.

## Requirements

### Validated

- ✓ Document upload and processing pipeline — existing
- ✓ Multi-provider LLM integration (Anthropic, OpenAI, OpenRouter) — existing
- ✓ Google Document AI text extraction — existing
- ✓ Supabase PostgreSQL with pgvector for storage and search — existing
- ✓ Firebase Authentication — existing
- ✓ Google Cloud Storage for file management — existing
- ✓ Background job queue with retry logic — existing
- ✓ Structured logging with Winston and correlation IDs — existing
- ✓ Basic health endpoints (`/health`, `/health/config`, `/monitoring/dashboard`) — existing
- ✓ PDF generation and export — existing

### Active

- [ ] In-app admin analytics dashboard (processing metrics + user activity)
- [ ] Service health monitoring for Google Document AI, Claude/OpenAI, Supabase, Firebase Auth
- [ ] Real-time auth failure detection with actionable alerts
- [ ] Scheduled health probes for all four services
- [ ] Email alerting for critical service issues
- [ ] In-app alert notifications for admin
- [ ] 30-day rolling data retention for analytics
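
The scheduled probe requirement above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `ServiceName`, `Probe`, `withTimeout`, `runProbes`, and `failedServices` are hypothetical names, and each real probe (for example a lightweight call to the service in question) is assumed to resolve on success and reject on failure. A scheduler would call `runProbes` on an interval and pass any failures to the alerting path.

```typescript
// Hypothetical names throughout; the source document does not define these types.
type ServiceName = "document-ai" | "llm" | "supabase" | "firebase-auth";

interface ProbeResult {
  service: ServiceName;
  ok: boolean;
  latencyMs: number;
  error?: string;
}

// A probe resolves on success and rejects (or throws) on failure.
type Probe = () => Promise<void>;

// Reject if the probe has not settled within timeoutMs.
function withTimeout(probe: Probe, timeoutMs: number): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`probe timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    // Promise.resolve().then(...) also captures synchronous throws from the probe.
    Promise.resolve()
      .then(probe)
      .then(
        () => { clearTimeout(timer); resolve(); },
        (err: unknown) => { clearTimeout(timer); reject(err); },
      );
  });
}

// Run every probe concurrently; a failing probe never rejects the whole batch.
async function runProbes(
  probes: Record<ServiceName, Probe>,
  timeoutMs: number = 5000,
): Promise<ProbeResult[]> {
  const services = Object.keys(probes) as ServiceName[];
  return Promise.all(
    services.map(async (service): Promise<ProbeResult> => {
      const started = Date.now();
      try {
        await withTimeout(probes[service], timeoutMs);
        return { service, ok: true, latencyMs: Date.now() - started };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        return { service, ok: false, latencyMs: Date.now() - started, error };
      }
    }),
  );
}

// Services whose latest probe failed; these would feed the alerting path.
function failedServices(results: ProbeResult[]): ServiceName[] {
  return results.filter((r) => !r.ok).map((r) => r.service);
}
```

Running the probes concurrently with a per-probe timeout keeps one hung service from delaying the results for the other three.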

### Out of Scope

- External monitoring tools (Grafana, Datadog) — keeping it in-app for simplicity
- Non-admin user analytics views — admin-only for now
- Mobile push notifications — email + in-app sufficient
- Historical analytics beyond 30 days — lean storage, can extend later
- Real-time WebSocket updates — polling is sufficient for admin dashboard
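
The 30-day window that bounds what is kept could be computed along these lines. This is a sketch under stated assumptions: `MetricRow`, `retentionCutoff`, and `partitionByRetention` are hypothetical names, and in practice the cleanup would more likely be a scheduled delete in Supabase against a timestamp column than an in-memory filter.

```typescript
const RETENTION_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical row shape; the real analytics tables are not described in the source.
interface MetricRow {
  id: number;
  createdAt: Date;
}

// Oldest timestamp still inside the rolling window, relative to `now`.
function retentionCutoff(now: Date, retentionDays: number = RETENTION_DAYS): Date {
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// Split rows into those to keep and those a cleanup job would delete.
function partitionByRetention(
  rows: MetricRow[],
  now: Date,
): { keep: MetricRow[]; expired: MetricRow[] } {
  const cutoff = retentionCutoff(now).getTime();
  return {
    keep: rows.filter((r) => r.createdAt.getTime() >= cutoff),
    expired: rows.filter((r) => r.createdAt.getTime() < cutoff),
  };
}
```

Passing `now` in explicitly, rather than reading the clock inside the function, makes the cutoff easy to test and keeps a long-running cleanup job consistent across batches.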

## Context

The CIM Summary application already has basic health endpoints and structured logging with correlation IDs. The existing `/monitoring/dashboard` endpoint provides some system metrics. The `performance_metrics` table in Supabase already exists for storing system performance data. Winston logging captures errors with context, but there's no alerting mechanism — errors are logged but nobody gets notified.

The admin user is jpressnell@bluepointcapital.com. This is a single-admin system for now.

Four external services need monitoring:
1. **Google Document AI** — uses service account credentials, can expire or lose permissions
2. **Claude/OpenAI** — API keys can be revoked, rate limited, or run out of credits
3. **Supabase** — connection pool issues, service key rotation, pgvector availability
4. **Firebase Auth** — project config changes, token verification failures
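
To turn the failure modes listed above into alerts that say what to fix, one option is to classify each failure before alerting. The sketch below is illustrative only: `FailureKind`, `classifyStatus`, `ACTIONS`, and `buildAlert` are hypothetical names, and it assumes the failing call surfaces an HTTP status code. That will not hold for every failure (a Supabase connection pool error, for instance, may surface as a client-side exception), so a real classifier would need more inputs.

```typescript
// Hypothetical alert taxonomy; the source document does not define one.
type FailureKind = "auth" | "rate_limit" | "unavailable" | "unknown";

interface ActionableAlert {
  service: string;
  kind: FailureKind;
  action: string; // what the admin should do next
}

// Map an HTTP status from a failed upstream call to a failure kind.
// 401/403 and 429 follow standard HTTP semantics; treating every 5xx
// as "unavailable" is a simplification made for this sketch.
function classifyStatus(status: number): FailureKind {
  if (status === 401 || status === 403) return "auth";
  if (status === 429) return "rate_limit";
  if (status >= 500 && status <= 599) return "unavailable";
  return "unknown";
}

// Suggested next step per failure kind (wording is illustrative).
const ACTIONS: Record<FailureKind, string> = {
  auth: "Check whether the credential or API key was revoked, expired, or lost permissions.",
  rate_limit: "Requests are being throttled; review usage against the provider limits.",
  unavailable: "The service is failing upstream; retry later and check the provider status.",
  unknown: "Inspect the logs for this request using its correlation ID.",
};

function buildAlert(service: string, status: number): ActionableAlert {
  const kind = classifyStatus(status);
  return { service, kind, action: ACTIONS[kind] };
}
```

For example, `buildAlert("Claude/OpenAI", 401)` yields an alert of kind `auth`, which is the case where the admin has to rotate or reauthorize a key rather than wait.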

## Constraints

- **Tech stack**: Must integrate with existing Express.js backend and React frontend
- **Auth**: Admin-only access, use existing Firebase Auth with role check for jpressnell@bluepointcapital.com
- **Storage**: Use existing Supabase PostgreSQL — no new database infrastructure
- **Email**: Need an email sending service (SendGrid, Resend, or similar) for alerts
- **Deployment**: Must work within Firebase Cloud Functions 14-minute timeout
- **Data retention**: 30-day rolling window to keep storage costs low
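
The admin-only Auth constraint above could be expressed as a small check on the claims of an already verified Firebase ID token. This is a sketch, not the project's middleware: `TokenClaims`, `isAdmin`, and `adminGate` are hypothetical names, token verification itself is assumed to happen earlier in the request pipeline, and requiring a verified email is an extra assumption that the source does not state.

```typescript
// Single-admin allowlist, taken from the constraint above.
const ADMIN_EMAILS: string[] = ["jpressnell@bluepointcapital.com"];

// Minimal subset of the claims carried by a verified Firebase ID token.
interface TokenClaims {
  email?: string;
  email_verified?: boolean;
}

function isAdmin(claims: TokenClaims): boolean {
  // Assumption: only verified email addresses count toward the role check.
  if (!claims.email || claims.email_verified !== true) return false;
  return ADMIN_EMAILS.indexOf(claims.email.toLowerCase()) !== -1;
}

// Framework-agnostic guard: the HTTP status an admin route would respond with.
function adminGate(claims: TokenClaims | null): 200 | 401 | 403 {
  if (claims === null) return 401; // no verified token on the request
  return isAdmin(claims) ? 200 : 403;
}
```

Keeping the allowlist as an array leaves room to extend beyond a single admin later without changing the call sites.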

## Key Decisions

| Decision | Rationale | Outcome |
|----------|-----------|---------|
| In-app dashboard over external tools | Simpler setup, no additional infrastructure, admin can see everything in one place | — Pending |
| Email + in-app dual alerting | Redundancy for critical issues — in-app for when you're already looking, email for when you're not | — Pending |
| 30-day retention | Balances useful trend data with storage efficiency | — Pending |
| Single admin (jpressnell@bluepointcapital.com) | Simple RBAC for now, can extend later | — Pending |
| Real-time detection + scheduled probes | Catches failures as they happen AND proactively tests services before users hit them | — Pending |

---

*Last updated: 2026-02-24 after initialization*