Operational To-Dos & Optimization Backlog
To-Do List (as of 2026-02-23)
- Wire Firebase Functions secrets: Attach `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `OPENROUTER_API_KEY`, `SUPABASE_SERVICE_KEY`, `SUPABASE_ANON_KEY`, `DATABASE_URL`, `EMAIL_PASS`, and `FIREBASE_SERVICE_ACCOUNT` to every deployed function so the runtime no longer depends on local `.env` values.
- Set `GCLOUD_PROJECT_ID` explicitly: Export `GCLOUD_PROJECT_ID=cim-summarizer` (or the active project) for local scripts and production functions so Document AI processor paths stop defaulting to `projects/undefined`.
- Acceptance-test expansion: Add additional CIM/output fixture pairs (beyond Handi Foods) so the automated acceptance suite enforces coverage across diverse deal structures.
- Backend log hygiene: Keep tailing `logs/error.log` after each deploy to confirm the service account + Anthropic credential fixes remain in place; document notable findings in deployment notes.
- Infrastructure deployment checklist: Update `DEPLOYMENT_GUIDE.md` with the exact Firebase/GCP commands used to fetch secrets and run Sonnet validation so future deploys stay reproducible.
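The secrets-wiring item could be backed by a fail-fast guard at function startup. A minimal sketch (the `missing_secrets` helper and `REQUIRED_SECRETS` list are illustrative, not existing project code):

```python
import os

# Secrets the to-do item says must be attached to every deployed function.
REQUIRED_SECRETS = [
    "ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENROUTER_API_KEY",
    "SUPABASE_SERVICE_KEY", "SUPABASE_ANON_KEY", "DATABASE_URL",
    "EMAIL_PASS", "FIREBASE_SERVICE_ACCOUNT",
]

def missing_secrets(env=None):
    """Return the required secrets that are absent or empty in the runtime env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]
```

Calling this at cold start (and raising when the list is non-empty) surfaces a misconfigured deploy immediately instead of mid-request.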
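The `projects/undefined` failure mode from the `GCLOUD_PROJECT_ID` item can be made impossible by building processor paths through one helper that refuses to run with an unset project. A hedged sketch (function name and default location are assumptions):

```python
import os

def docai_processor_path(processor_id: str, location: str = "us") -> str:
    """Build a Document AI processor resource path, failing fast when the
    project id is unset instead of silently yielding 'projects/undefined'."""
    project = os.environ.get("GCLOUD_PROJECT_ID")
    if not project:
        raise RuntimeError(
            "GCLOUD_PROJECT_ID is not set; export it (e.g. cim-summarizer) "
            "before calling Document AI."
        )
    return f"projects/{project}/locations/{location}/processors/{processor_id}"
```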
Optimization Backlog (ordered by Accuracy → Speed → Cost benefit vs. implementation risk)
- Deterministic financial parser enhancements (status: partially addressed). Continue improving token alignment (multi-row tables, negative numbers) to reduce dependence on LLM retries. Risk: low, limited to parser module.
- Retrieval gating per Agentic pass. Swap the “top-N chunk blast” with similarity search keyed to each prompt (deal overview, market, thesis). Benefit: higher accuracy + lower token count. Risk: medium; needs robust Supabase RPC fallbacks.
- Embedding cache keyed by document checksum. Skip re-embedding when a document/version is unchanged to cut processing time/cost on retries. Risk: medium; requires schema changes to store content hashes.
- Field-level validation & dependency checks prior to gap filling. Enforce numeric relationships (e.g., EBITDA margin = EBITDA / Revenue) and re-query only the failing sections. Benefit: accuracy; risk: medium (adds validator & targeted prompts).
- Stream Document AI chunks directly into chunker. Avoid writing intermediate PDFs to disk/GCS when splitting >30 page CIMs. Benefit: speed/cost; risk: medium-high because it touches PDF splitting + Document AI integration.
- Parallelize independent multi-pass queries (e.g., run Pass 2 and Pass 3 concurrently when quota allows). Benefit: lower latency; risk: medium-high due to Anthropic rate limits & merge ordering.
- Expose per-pass metrics via `/health/agentic-rag`. Surface timing/token/cost data so regressions are visible. Benefit: operational visibility; risk: low.
- Structured comparison harness for CIM outputs. Reuse the acceptance-test fixtures to generate diff reports for human reviewers (baseline vs. new model). Benefit: accuracy guardrail; risk: low once additional fixtures exist.
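The negative-number handling called out in the deterministic-parser item can be illustrated with a token parser that understands thousands separators and accounting-style parentheses. A sketch only; the real parser module's names and scope will differ:

```python
import re

_NUM = re.compile(r"^\(?-?[\d,]+(?:\.\d+)?\)?%?$")

def parse_financial_token(tok: str):
    """Parse one table token into a float, handling thousands separators and
    accounting-style negatives like '(1,234.5)'. Returns None for non-numbers."""
    tok = tok.strip().replace("$", "")
    if not tok or not _NUM.match(tok):
        return None
    negative = tok.startswith("(") and tok.endswith(")")
    cleaned = tok.strip("()%").replace(",", "")
    try:
        value = float(cleaned)
    except ValueError:
        return None
    return -value if negative else value
```

Handling these cases deterministically is what reduces the LLM retries the backlog item mentions.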
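Retrieval gating per pass amounts to ranking stored chunk embeddings against each pass prompt's embedding instead of reusing one global top-N. The in-process sketch below shows the selection logic; in production this would live behind a Supabase RPC with the fallbacks the item calls for (all names here are hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def chunks_for_pass(pass_embedding, chunk_embeddings, k=5):
    """Rank chunk embeddings against one pass prompt (deal overview, market,
    thesis, ...) and keep only the top-k for that pass."""
    scored = sorted(
        chunk_embeddings.items(),
        key=lambda kv: cosine(pass_embedding, kv[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in scored[:k]]
```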
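The checksum-keyed embedding cache could look like the following, with `embed_fn` standing in for the real (paid) embedding call; persisting the hash column is the schema change the item flags as the risk:

```python
import hashlib

class EmbeddingCache:
    """Skip re-embedding when the document content is byte-identical to a
    previously processed version, keyed by SHA-256 of the content."""
    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}  # sha256 hex digest -> embedding
        self.misses = 0

    def get(self, content: bytes):
        key = hashlib.sha256(content).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(content)
        return self._store[key]
```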
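The field-level validation item can be sketched as a pure check that returns only the sections needing a re-query, using the EBITDA-margin relationship the item gives as its example (field names and tolerance are assumptions):

```python
def failing_sections(fields: dict, tolerance: float = 0.01) -> list[str]:
    """Check numeric relationships between extracted fields and return the
    section names whose values are inconsistent and should be re-queried."""
    failures = []
    revenue = fields.get("revenue")
    ebitda = fields.get("ebitda")
    margin = fields.get("ebitda_margin")
    if revenue and ebitda is not None and margin is not None:
        implied = ebitda / revenue  # EBITDA margin = EBITDA / Revenue
        if abs(implied - margin) > tolerance:
            failures.append("financials")
    return failures
```

Re-prompting only the returned sections is what keeps the added cost targeted rather than a full re-run.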
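Parallelizing independent passes while respecting Anthropic rate limits is essentially `asyncio.gather` behind a semaphore; `gather` also preserves input order, which addresses the merge-ordering risk the item notes. A sketch under those assumptions:

```python
import asyncio

async def run_passes(pass_fns, max_concurrency: int = 2):
    """Run independent summarization passes concurrently, capped by a
    semaphore so provider rate limits are respected. Results come back in
    the original pass order regardless of completion order."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(fn):
        async with sem:
            return await fn()

    return await asyncio.gather(*(guarded(fn) for fn in pass_fns))
```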
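The per-pass metrics endpoint could serialize a structure like the one below; the class and field names are illustrative, showing only the timing/token/cost shape the item asks `/health/agentic-rag` to surface:

```python
from dataclasses import dataclass, field, asdict

@dataclass
class PassMetrics:
    name: str
    seconds: float = 0.0
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: float = 0.0

@dataclass
class AgenticRagHealth:
    """Accumulates per-pass stats; snapshot() is what the health endpoint
    would serialize as JSON so regressions are visible across deploys."""
    passes: list = field(default_factory=list)

    def record(self, name, seconds, input_tokens, output_tokens, cost_usd):
        self.passes.append(PassMetrics(name, seconds, input_tokens, output_tokens, cost_usd))

    def snapshot(self) -> dict:
        return {
            "passes": [asdict(p) for p in self.passes],
            "total_cost_usd": round(sum(p.cost_usd for p in self.passes), 6),
            "total_seconds": round(sum(p.seconds for p in self.passes), 3),
        }
```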
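The structured comparison harness can start as little more than a unified diff between a fixture's baseline output and the new model's output; an empty diff means no drift for a reviewer to look at. A minimal sketch (function name is hypothetical):

```python
import difflib

def summary_diff(baseline: str, candidate: str, name: str = "summary") -> str:
    """Produce a reviewer-friendly unified diff between the acceptance-test
    baseline output and a new model's output. Empty string means no drift."""
    lines = difflib.unified_diff(
        baseline.splitlines(keepends=True),
        candidate.splitlines(keepends=True),
        fromfile=f"{name}.baseline",
        tofile=f"{name}.candidate",
    )
    return "".join(lines)
```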