cim_summary/TODO_AND_OPTIMIZATIONS.md

Operational To-Dos & Optimization Backlog

To-Do List (as of 2026-02-23)

  • Wire Firebase Functions secrets: Attach ANTHROPIC_API_KEY, OPENAI_API_KEY, OPENROUTER_API_KEY, SUPABASE_SERVICE_KEY, SUPABASE_ANON_KEY, DATABASE_URL, EMAIL_PASS, and FIREBASE_SERVICE_ACCOUNT to every deployed function so the runtime no longer depends on local .env values.
  • Set GCLOUD_PROJECT_ID explicitly: Export GCLOUD_PROJECT_ID=cim-summarizer (or the active project) for local scripts and production functions so Document AI processor paths stop defaulting to projects/undefined.
  • Acceptance-test expansion: Add additional CIM/output fixture pairs (beyond Handi Foods) so the automated acceptance suite enforces coverage across diverse deal structures.
  • Backend log hygiene: Keep tailing logs/error.log after each deploy to confirm the service-account and Anthropic credential fixes remain in place; document notable findings in the deployment notes.
  • Infrastructure deployment checklist: Update DEPLOYMENT_GUIDE.md with the exact Firebase/GCP commands used to fetch secrets and run Sonnet validation so future deploys stay reproducible.
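
The secrets-wiring item above can be sketched with a small startup guard. This is a minimal sketch, not the repo's actual code: `missingSecrets` is a hypothetical helper, and the trailing comment shows roughly where the 2nd-gen Firebase `secrets` option would attach (secrets themselves are created with `firebase functions:secrets:set <NAME>`).

```typescript
// The eight secrets the to-do item lists, declared once so the function
// config and the startup check cannot drift apart.
const REQUIRED_SECRETS = [
  "ANTHROPIC_API_KEY",
  "OPENAI_API_KEY",
  "OPENROUTER_API_KEY",
  "SUPABASE_SERVICE_KEY",
  "SUPABASE_ANON_KEY",
  "DATABASE_URL",
  "EMAIL_PASS",
  "FIREBASE_SERVICE_ACCOUNT",
] as const;

// Pure helper (hypothetical): report which required secrets are absent or
// empty in the runtime environment, so deploys fail loudly instead of
// silently falling back to local .env values.
function missingSecrets(env: Record<string, string | undefined>): string[] {
  return REQUIRED_SECRETS.filter((name) => !env[name]);
}

// In a v2 function the attachment would look roughly like:
//   import { onRequest } from "firebase-functions/v2/https";
//   export const summarize = onRequest(
//     { secrets: [...REQUIRED_SECRETS] },
//     (req, res) => { /* handler; check missingSecrets(process.env) first */ },
//   );
```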
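
The GCLOUD_PROJECT_ID item can also be enforced in code. A sketch (hypothetical helper, not the repo's code) of a Document AI processor-path builder that refuses to emit the `projects/undefined` paths the to-do describes:

```typescript
// Build a Document AI processor resource path, failing fast when the
// project ID is unset instead of interpolating "undefined".
function docAiProcessorPath(
  projectId: string | undefined,
  location: string,
  processorId: string,
): string {
  if (!projectId) {
    throw new Error(
      "GCLOUD_PROJECT_ID is not set; refusing to build a Document AI path",
    );
  }
  return `projects/${projectId}/locations/${location}/processors/${processorId}`;
}
```

Callers would pass `process.env.GCLOUD_PROJECT_ID` directly, so a missing export surfaces at the first Document AI call rather than as a 404 on a malformed path.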

Optimization Backlog (ordered by Accuracy → Speed → Cost benefit vs. implementation risk)

  1. Deterministic financial parser enhancements (status: partially addressed). Continue improving token alignment (multi-row tables, negative numbers) to reduce dependence on LLM retries. Risk: low, limited to parser module.
  2. Retrieval gating per Agentic pass. Replace the “top-N chunk blast” with a similarity search keyed to each prompt (deal overview, market, thesis). Benefit: higher accuracy + lower token count. Risk: medium; needs robust Supabase RPC fallbacks.
  3. Embedding cache keyed by document checksum. Skip re-embedding when a document/version is unchanged to cut processing time/cost on retries. Risk: medium; requires schema changes to store content hashes.
  4. Field-level validation & dependency checks prior to gap filling. Enforce numeric relationships (e.g., EBITDA margin = EBITDA / Revenue) and re-query only the failing sections. Benefit: accuracy; risk: medium (adds validator & targeted prompts).
  5. Stream Document AI chunks directly into the chunker. Avoid writing intermediate PDFs to disk/GCS when splitting >30-page CIMs. Benefit: speed/cost; risk: medium-high because it touches PDF splitting + Document AI integration.
  6. Parallelize independent multi-pass queries (e.g., run Pass 2 and Pass 3 concurrently when quota allows). Benefit: lower latency; risk: medium-high due to Anthropic rate limits & merge ordering.
  7. Expose per-pass metrics via /health/agentic-rag. Surface timing/token/cost data so regressions are visible. Benefit: operational visibility; risk: low.
  8. Structured comparison harness for CIM outputs. Reuse the acceptance-test fixtures to generate diff reports for human reviewers (baseline vs. new model). Benefit: accuracy guardrail; risk: low once additional fixtures exist.
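
Item 1's token-alignment work rests on deterministic number normalization. A sketch of the kind of normalization involved (hypothetical helper, not the repo's parser): accounting-style negatives, thousands separators, currency prefixes, and dash placeholders.

```typescript
// Normalize one financial table token to a number, or null for placeholders.
function parseFinancialToken(raw: string): number | null {
  let s = raw.trim().replace(/^\$\s*/, "");
  if (s === "" || s === "-" || s === "–" || s.toLowerCase() === "n/a") {
    return null; // placeholder cell, not a zero
  }
  const negative = /^\(.*\)$/.test(s); // (1,234) means -1234 in financial tables
  if (negative) s = s.slice(1, -1);
  s = s.replace(/,/g, "");
  const value = Number(s);
  if (Number.isNaN(value)) return null;
  return negative ? -value : value;
}
```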
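
Item 2's retrieval gating can be sketched as ranking chunks against an embedding of each pass's query instead of sending one shared top-N set to every prompt. Cosine similarity over pre-computed embeddings; all names here are illustrative, and the real version would sit behind a Supabase RPC with a fallback.

```typescript
interface Chunk {
  id: string;
  embedding: number[];
}

// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Top-k chunks for one pass's query embedding (e.g. "deal overview").
function topKForPass(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```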
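
Item 3's checksum-keyed cache can be sketched as: hash the extracted text, and only call the embedding provider when the hash is new. The in-memory `Map` stands in for the proposed Supabase column; the embed callback is a stand-in for the real embedding client.

```typescript
import { createHash } from "node:crypto";

// Stand-in for the proposed content-hash column in Supabase.
const embeddingCache = new Map<string, number[][]>();

function checksum(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Return cached embeddings when the document text is byte-identical,
// otherwise embed and store under the content hash.
async function embedWithCache(
  text: string,
  embed: (t: string) => Promise<number[][]>,
): Promise<{ embeddings: number[][]; cacheHit: boolean }> {
  const key = checksum(text);
  const cached = embeddingCache.get(key);
  if (cached) return { embeddings: cached, cacheHit: true };
  const embeddings = await embed(text);
  embeddingCache.set(key, embeddings);
  return { embeddings, cacheHit: false };
}
```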
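
Item 4's dependency check can be sketched as a validator that tests the EBITDA-margin identity within a tolerance and names the failing field, so only that section is re-queried. The field names and tolerance are assumptions for illustration.

```typescript
interface Financials {
  revenue: number;
  ebitda: number;
  ebitdaMarginPct: number;
}

// Return the list of fields that fail the numeric relationship
// EBITDA margin = EBITDA / Revenue (within tolerancePct points).
function validateMargin(f: Financials, tolerancePct = 0.5): string[] {
  const failures: string[] = [];
  if (f.revenue <= 0) {
    failures.push("revenue"); // cannot check the ratio without revenue
    return failures;
  }
  const implied = (f.ebitda / f.revenue) * 100;
  if (Math.abs(implied - f.ebitdaMarginPct) > tolerancePct) {
    failures.push("ebitdaMarginPct");
  }
  return failures;
}
```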
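
Item 6's merge-ordering concern can be sketched with `Promise.all`, which resolves in input order regardless of completion order, so concurrent passes still merge deterministically. Pass names and the runner shape are illustrative; real code would also gate concurrency on Anthropic quota.

```typescript
// Run independent passes concurrently and merge results in declaration order.
async function runIndependentPasses(
  passes: Array<{ name: string; run: () => Promise<string> }>,
): Promise<Record<string, string>> {
  // Promise.all preserves the order of its input array, so the merge is
  // deterministic even when passes finish out of order.
  const results = await Promise.all(passes.map((p) => p.run()));
  return Object.fromEntries(passes.map((p, i) => [p.name, results[i]]));
}
```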
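
Item 7's per-pass metrics could roll up into a payload like the one below for /health/agentic-rag. The record shape is an assumption, not the service's actual schema.

```typescript
interface PassMetrics {
  pass: string;       // e.g. "deal_overview"
  durationMs: number;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Aggregate per-pass metrics into the totals a health endpoint would expose.
function summarizeMetrics(runs: PassMetrics[]) {
  return {
    passes: runs,
    totalMs: runs.reduce((s, r) => s + r.durationMs, 0),
    totalTokens: runs.reduce((s, r) => s + r.inputTokens + r.outputTokens, 0),
    totalCostUsd: runs.reduce((s, r) => s + r.costUsd, 0),
  };
}
```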
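
Item 8's comparison harness can be sketched as a field-level diff between a baseline fixture and a new model's output, so reviewers see changed fields rather than whole documents. A flat-record output shape is assumed here for illustration.

```typescript
// Report every field whose value differs between baseline and candidate,
// including fields present on only one side.
function diffOutputs(
  baseline: Record<string, unknown>,
  candidate: Record<string, unknown>,
): Array<{ field: string; baseline: unknown; candidate: unknown }> {
  const fields = new Set([...Object.keys(baseline), ...Object.keys(candidate)]);
  const diffs: Array<{ field: string; baseline: unknown; candidate: unknown }> = [];
  for (const field of fields) {
    if (JSON.stringify(baseline[field]) !== JSON.stringify(candidate[field])) {
      diffs.push({ field, baseline: baseline[field], candidate: candidate[field] });
    }
  }
  return diffs;
}
```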