- Phase 2 plan 4 complete — two scheduled Cloud Function exports added
- SUMMARY.md created with decisions, deviations, and phase readiness notes
- STATE.md updated: phase 2 complete, plan counter at 4/4
- ROADMAP.md updated: phase 2 all 4 plans complete
- Requirements HLTH-03 and INFR-03 marked complete
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Created 02-01-SUMMARY.md with full execution documentation
- Updated ROADMAP.md with phase 2 plan progress (2 of 4 plans with summaries)
- Marked requirements ANLY-01 and ANLY-03 complete in REQUIREMENTS.md
- Added 02-01 key decisions to STATE.md
- 9 tests covering all 4 probers and the orchestrator
- Verifies runAllProbes returns 4 ProbeResults with correct service names
- Verifies results persisted via HealthCheckModel.create 4 times
- Verifies one probe failure does not abort other probes
- Verifies a 429 from the LLM probe yields status 'degraded', not 'down'
- Verifies Supabase probe uses getPostgresPool (not PostgREST)
- Verifies Firebase Auth distinguishes expected vs unexpected errors
- Verifies latency_ms is a non-negative number
- Verifies HealthCheckModel.create failure is isolated
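The behaviors these tests pin down can be sketched for a single prober. This is an illustrative sketch, not the actual module API: `ProbeResult`, `probeLLM`, and the injected `callApi` stand in for the real types and the real authenticated call.

```typescript
type ProbeStatus = 'up' | 'degraded' | 'down';

interface ProbeResult {
  service_name: string;
  status: ProbeStatus;
  latency_ms: number;
  error_message: string | null;
}

// callApi stands in for the real authenticated API call.
async function probeLLM(
  callApi: () => Promise<{ httpStatus: number }>
): Promise<ProbeResult> {
  const start = Date.now();
  try {
    const res = await callApi();
    // A 429 means the service is reachable but rate-limited: degraded, not down.
    const status: ProbeStatus = res.httpStatus === 429 ? 'degraded' : 'up';
    return {
      service_name: 'llm_api',
      status,
      latency_ms: Date.now() - start,
      error_message: null,
    };
  } catch (err) {
    // Any thrown error marks the service down, with latency still recorded.
    return {
      service_name: 'llm_api',
      status: 'down',
      latency_ms: Date.now() - start,
      error_message: err instanceof Error ? err.message : String(err),
    };
  }
}
```

Measuring latency with a single `Date.now()` baseline also guarantees the non-negative `latency_ms` the tests assert.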
- Install nodemailer + @types/nodemailer (needed by Plan 03)
- Create healthProbeService.ts with 4 probers: document_ai, llm_api, supabase, firebase_auth
- Each probe makes a real authenticated API call
- Each probe returns structured ProbeResult with status, latency_ms, error_message
- LLM probe uses cheapest model (claude-haiku-4-5) with max_tokens 5
- Supabase probe uses getPostgresPool().query('SELECT 1') not PostgREST
- Firebase Auth probe distinguishes expected vs unexpected errors
- runAllProbes orchestrator uses Promise.allSettled for fault isolation
- Results persisted via HealthCheckModel.create() after each probe
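The orchestration described above can be sketched as follows. The signatures are assumptions (the real `runAllProbes` persists via `HealthCheckModel.create()`; here a `persist` callback stands in), but the two isolation properties are the ones the plan names: `Promise.allSettled` keeps one failing probe from aborting the rest, and a persistence failure is caught per result.

```typescript
type Probe = () => Promise<{ service_name: string; status: string }>;

async function runAllProbes(
  probes: Probe[],
  persist: (r: { service_name: string; status: string }) => Promise<void>
): Promise<{ service_name: string; status: string }[]> {
  // allSettled isolates faults: one rejected probe never aborts the others.
  const settled = await Promise.allSettled(probes.map((p) => p()));
  const results = settled.map((s, i) =>
    s.status === 'fulfilled'
      ? s.value
      : { service_name: `probe_${i}`, status: 'down' }
  );
  for (const r of results) {
    // Persistence failures are isolated too: log and continue.
    try {
      await persist(r);
    } catch (err) {
      console.error('persist failed for', r.service_name, err);
    }
  }
  return results;
}
```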
- Add service_health_checks table with status CHECK constraint, JSONB probe_details, checked_at column
- Add alert_events table with alert_type and status CHECK constraints, lifecycle timestamps
- Add created_at indexes on both tables (INFR-01 requirement)
- Add composite indexes for common query patterns
- Enable RLS on both tables (service role bypasses RLS per Supabase pattern)
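On the application side, the `service_health_checks` row and its status CHECK constraint can be mirrored in TypeScript so invalid statuses fail before they hit Postgres. The column names follow the migration described above; the exact allowed status values are an assumption.

```typescript
const HEALTH_STATUSES = ['up', 'degraded', 'down'] as const;
type HealthStatus = (typeof HEALTH_STATUSES)[number];

interface ServiceHealthCheckRow {
  id: string;
  status: HealthStatus;                   // enforced by a CHECK constraint in SQL
  probe_details: Record<string, unknown>; // JSONB column
  checked_at: string;                     // timestamptz as ISO string
  created_at: string;                     // indexed per INFR-01
}

// Mirrors the CHECK constraint in application code so bad rows fail fast.
function isHealthStatus(v: string): v is HealthStatus {
  return (HEALTH_STATUSES as readonly string[]).includes(v);
}
```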
Node.js 20 is being decommissioned on 2026-10-30. This upgrades the runtime
to Node.js 22 (LTS), bumps firebase-functions from v6 to v7, removes the
deprecated functions.config() fallback, and aligns the TS target to ES2022.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These are completed implementation plans, one-time analysis artifacts,
and generic guides that no longer reflect the current codebase.
All useful content is either implemented in code or captured in
TODO_AND_OPTIMIZATIONS.md.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The confirmUpload and inline processing paths were hardcoded to
'document_ai_agentic_rag', ignoring the config setting. Now reads
from config.processingStrategy so the single-pass processor is
actually used when configured.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New processing strategy `single_pass_quality_check` replaces the multi-pass
agentic RAG pipeline (15-25 min) with a streamlined 2-call approach:
1. Full-document LLM extraction (Sonnet) — single call with complete CIM text
2. Delta quality-check (Haiku) — reviews extraction, returns only corrections
Key changes:
- New singlePassProcessor.ts with extraction + quality check flow
- llmService: qualityCheckCIMDocument() with delta-only corrections array
- llmService: improved prompt requiring professional inferences for qualitative
fields instead of defaulting to "Not specified in CIM"
- Removed deterministic financial parser from single-pass flow (LLM outperforms
it — parser matched footnotes and narrative text as financials)
- Default strategy changed to single_pass_quality_check
- Completeness scoring with diagnostic logging of empty fields
Tested on 2 real CIMs: 100% completeness, correct financials, ~150s each.
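The merge step of the two-call flow can be sketched as follows: the quality-check pass returns only corrections, which are applied over the initial extraction. The correction shape (dot-separated field path plus value) is an assumption, not the actual `qualityCheckCIMDocument()` contract.

```typescript
interface Correction {
  field: string; // dot-separated path, e.g. "financials.ltm_revenue"
  value: unknown;
}

function applyCorrections(
  extraction: Record<string, unknown>,
  corrections: Correction[]
): Record<string, unknown> {
  const out = structuredClone(extraction);
  for (const { field, value } of corrections) {
    const path = field.split('.');
    let node: Record<string, unknown> = out;
    // Walk (or create) intermediate objects along the path.
    for (const key of path.slice(0, -1)) {
      if (typeof node[key] !== 'object' || node[key] === null) node[key] = {};
      node = node[key] as Record<string, unknown>;
    }
    node[path[path.length - 1]] = value;
  }
  return out;
}
```

Returning only a delta keeps the second (Haiku) call cheap: the reviewer never re-emits fields it agrees with.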
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix invalid model name claude-3-7-sonnet-latest → use config.llm.model
- Increase LLM timeout from 3 min to 6 min for complex CIM analysis
- Improve RAG fallback to use evenly-spaced chunks when keyword matching
finds too few results (prevents sending tiny fragments to LLM)
- Add model name normalization for Claude 4.x family
- Add googleServiceAccount utility for unified credential resolution
- Add Cloud Run log fetching script
- Update default models to Claude 4.6/4.5 family
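The evenly-spaced fallback above can be sketched like this: when keyword matching yields too few chunks, sample the whole document at uniform intervals so the LLM sees representative context instead of tiny fragments. The function name and threshold handling are illustrative, not the actual implementation.

```typescript
function selectChunks<T>(matched: T[], allChunks: T[], minCount: number): T[] {
  // Enough keyword matches: use them as-is.
  if (matched.length >= minCount) return matched;
  // Document smaller than the floor: just take everything.
  if (allChunks.length <= minCount) return allChunks;
  // Otherwise pick minCount chunks at evenly spaced indices across the document.
  const step = allChunks.length / minCount;
  const picked: T[] = [];
  for (let i = 0; i < minCount; i++) {
    picked.push(allChunks[Math.floor(i * step)]);
  }
  return picked;
}
```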
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add pre-deploy-check.sh script to validate .env doesn't contain secrets
- Add clean-env-secrets.sh script to remove secrets from .env before deployment
- Update deploy:firebase script to run validation automatically
- Add sync-secrets npm script for local development
- Add deploy:firebase:force for deployments that skip validation
This prevents 'Secret environment variable overlaps non secret environment variable' errors
by ensuring secrets defined via defineSecret() are not also in .env file.
## Completed Todos
- ✅ Test financial extraction with Stax Holding Company CIM - All values correct (FY-3: $64M, FY-2: $71M, FY-1: $71M, LTM: $76M)
- ✅ Implement deterministic parser fallback - Integrated into simpleDocumentProcessor
- ✅ Implement few-shot examples - Added comprehensive examples for PRIMARY table identification
- ✅ Fix primary table identification - Financial extraction now correctly identifies PRIMARY table (millions) vs subsidiary tables (thousands)
## Pending Todos
1. Review older commits (1-2 months ago) to see how financial extraction was working then
- Check commits: 185c780 (Claude 3.7), 5b3b1bf (Document AI fixes), 0ec3d14 (multi-pass extraction)
- Compare prompt simplicity - older versions may have had simpler, more effective prompts
- Check if deterministic parser was being used more effectively
2. Review best practices for structured financial data extraction from PDFs/CIMs
- Research: LLM prompt engineering for tabular data (few-shot examples, chain-of-thought)
- Period identification strategies
- Validation techniques
- Hybrid approaches (deterministic + LLM)
- Error handling patterns
- Check academic papers and industry case studies
3. Determine how to reduce processing time without sacrificing accuracy
- Options:
  1) Use Claude Haiku 4.5 for initial extraction, Sonnet 4.5 for validation
  2) Parallel extraction of different sections
  3) Caching common patterns
  4) Streaming responses
  5) Incremental processing with early validation
  6) Reduce prompt verbosity while maintaining clarity
4. Add unit tests for financial extraction validation logic
- Test: invalid value rejection, cross-period validation, numeric extraction
- Period identification from various formats (years, FY-X, mixed)
- Include edge cases: missing periods, projections mixed with historical, inconsistent formatting
5. Monitor production financial extraction accuracy
- Track: extraction success rate, validation rejection rate, common error patterns
- User feedback on extracted financial data
- Set up alerts for validation failures and extraction inconsistencies
6. Optimize prompt size for financial extraction
- Current prompts may be too verbose
- Test shorter, more focused prompts that maintain accuracy
- Consider: removing redundant instructions, using more concise examples, focusing on critical rules only
7. Add financial data visualization
- Consider adding a financial data preview/validation step in the UI
- Allow users to verify/correct extracted values if needed
- Provides human-in-the-loop validation for critical financial data
8. Document extraction strategies
- Document the different financial table formats found in CIMs
- Create a reference guide for common patterns (years format, FY-X format, mixed format, etc.)
- This will help with prompt engineering and parser improvements
9. Compare RAG-based extraction vs simple full-document extraction for financial accuracy
- Determine which approach produces more accurate financial data and why
- May need a hybrid approach
10. Add confidence scores to financial extraction results
- Flag low-confidence extractions for manual review
- Helps identify when extraction may be incorrect and needs human validation
- Upgrade to Claude Sonnet 4.5 for better accuracy
- Simplify and clarify financial extraction prompts
- Add flexible period identification (years, FY-X, LTM formats)
- Add cross-validation to catch wrong column extraction
- Reject implausibly small values (minimum thresholds for revenue and EBITDA)
- Add monitoring scripts for document processing
- Improve validation to catch inconsistent values across periods
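The flexible period identification mentioned above can be sketched with a small normalizer handling calendar years, FY-X offsets, and LTM labels. The normalized label scheme (`'LTM'`, `'FY-1'`, bare year) is an assumption about the internal convention.

```typescript
function identifyPeriod(label: string): string | null {
  const s = label.trim().toUpperCase();
  // "LTM Jun-24", "Trailing Twelve Months"
  if (/^LTM\b/.test(s) || /TRAILING\s+TWELVE/.test(s)) return 'LTM';
  // "FY-1", "FY 2"
  const fy = s.match(/^FY[-\s]?(\d)$/);
  if (fy) return `FY-${fy[1]}`;
  // "2023", "2024E" (drop estimate/actual suffix)
  const year = s.match(/\b(19|20)\d{2}[EA]?\b/);
  if (year) return year[0].replace(/[EA]$/, '');
  return null;
}
```

Returning `null` for unrecognized headers lets the caller flag a column for review instead of guessing, which supports the cross-validation goal above.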
Replaces single-pass RAG extraction with a 6-pass targeted extraction strategy:
**Pass 1: Metadata & Structure**
- Deal overview fields (company name, industry, geography, employees)
- Targeted RAG query for basic company information
- 20 chunks focused on executive summary and overview sections
**Pass 2: Financial Data**
- All financial metrics (FY-3, FY-2, FY-1, LTM)
- Revenue, EBITDA, margins, cash flow
- 30 chunks with emphasis on financial tables and appendices
- Extracts quality of earnings, capex, working capital
**Pass 3: Market Analysis**
- TAM/SAM market sizing, growth rates
- Competitive landscape and positioning
- Industry trends and barriers to entry
- 25 chunks focused on market and industry sections
**Pass 4: Business & Operations**
- Products/services and value proposition
- Customer and supplier information
- Management team and org structure
- 25 chunks covering business model and operations
**Pass 5: Investment Thesis**
- Strategic analysis and recommendations
- Value creation levers and risks
- Alignment with fund strategy
- 30 chunks for synthesis and high-level analysis
**Pass 6: Validation & Gap-Filling**
- Identifies fields still marked "Not specified in CIM"
- Groups missing fields into logical batches
- Makes targeted RAG queries for each batch
- Dynamic API usage based on gaps found
**Key Improvements:**
- Each pass uses targeted RAG queries optimized for that data type
- Smart merge strategy preserves first non-empty value for each field
- Gap-filling pass catches data missed in initial passes
- Total ~5-10 LLM API calls vs. 1 (controlled cost increase)
- Expected to achieve 95-98% data coverage vs. ~40-50% currently
**Technical Details:**
- Updated processLargeDocument to use generateLLMAnalysisMultiPass
- Added processingStrategy: 'document_ai_multi_pass_rag'
- Each pass includes keyword fallback if RAG search fails
- Deep merge utility prevents "Not specified" from overwriting good data
- Comprehensive logging for debugging each pass
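The smart-merge rule above can be sketched as a deep merge that keeps the first non-empty value for each field and never lets the "Not specified in CIM" placeholder overwrite real data. `deepMerge` and `isEmpty` are illustrative names, not the actual utility's API.

```typescript
const PLACEHOLDER = 'Not specified in CIM';

function isEmpty(v: unknown): boolean {
  return v === undefined || v === null || v === '' || v === PLACEHOLDER;
}

function deepMerge(
  base: Record<string, unknown>,
  incoming: Record<string, unknown>
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(incoming)) {
    const existing = out[key];
    const bothObjects =
      typeof existing === 'object' && existing !== null && !Array.isArray(existing) &&
      typeof value === 'object' && value !== null && !Array.isArray(value);
    if (bothObjects) {
      // Recurse into nested sections so later passes can fill nested gaps.
      out[key] = deepMerge(
        existing as Record<string, unknown>,
        value as Record<string, unknown>
      );
    } else if (isEmpty(existing) && !isEmpty(value)) {
      out[key] = value; // fill gaps only; earlier non-empty values win
    }
  }
  return out;
}
```

Run over the six pass outputs in order, this is what lets the gap-filling pass add data without clobbering anything the earlier passes already extracted.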
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>