cim_summary

Author	SHA1	Message	Date
admin	9a906763c7	Remove 15 stale planning and analysis docs These are completed implementation plans, one-time analysis artifacts, and generic guides that no longer reflect the current codebase. All useful content is either implemented in code or captured in TODO_AND_OPTIMIZATIONS.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 10:12:23 -05:00
admin	3d01085b10	Fix hardcoded processing strategy in document controller The confirmUpload and inline processing paths were hardcoded to 'document_ai_agentic_rag', ignoring the config setting. Now reads from config.processingStrategy so the single-pass processor is actually used when configured. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 22:37:38 -05:00
admin	5cfb136484	Add single-pass CIM processor: 2 LLM calls, ~2.5 min processing New processing strategy `single_pass_quality_check` replaces the multi-pass agentic RAG pipeline (15-25 min) with a streamlined 2-call approach: 1. Full-document LLM extraction (Sonnet) — single call with complete CIM text 2. Delta quality-check (Haiku) — reviews extraction, returns only corrections Key changes: - New singlePassProcessor.ts with extraction + quality check flow - llmService: qualityCheckCIMDocument() with delta-only corrections array - llmService: improved prompt requiring professional inferences for qualitative fields instead of defaulting to "Not specified in CIM" - Removed deterministic financial parser from single-pass flow (LLM outperforms it — parser matched footnotes and narrative text as financials) - Default strategy changed to single_pass_quality_check - Completeness scoring with diagnostic logging of empty fields Tested on 2 real CIMs: 100% completeness, correct financials, ~150s each. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 22:28:45 -05:00
admin	f4bd60ca38	Fix CIM processing pipeline: embeddings, model refs, and timeouts - Fix invalid model name claude-3-7-sonnet-latest → use config.llm.model - Increase LLM timeout from 3 min to 6 min for complex CIM analysis - Improve RAG fallback to use evenly-spaced chunks when keyword matching finds too few results (prevents sending tiny fragments to LLM) - Add model name normalization for Claude 4.x family - Add googleServiceAccount utility for unified credential resolution - Add Cloud Run log fetching script - Update default models to Claude 4.6/4.5 family Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 18:33:31 -05:00
admin	b00700edd7	Note runtime/dependency upgrades in to-do list	2026-02-23 14:50:56 -05:00
admin	9480a3c994	Add acceptance tests and align defaults to Sonnet 4	2026-02-23 14:45:57 -05:00
admin	14d5c360e5	Set up clean Firebase deploy workflow from git source - Add @google-cloud/functions-framework and ts-node deps to match deployed - Add .env.bak ignore patterns to firebase.json - Fix adminService.ts: inline axios client (was importing non-existent module) - Clean .env to exclude GCP Secret Manager secrets (prevents deploy overlap error) - Verified: both frontend and backend build and deploy successfully Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 13:41:00 -05:00
admin	ecd4b13115	Fix EBITDA margin auto-correction and TypeScript compilation error - Added auto-correction logic for EBITDA margins when difference >15pp - Fixed missing closing brace in revenue validation block - Enhanced margin validation to catch cases like 95% -> 22.3%	2025-11-10 15:53:17 -05:00
admin	59e0938b72	Implement Claude Haiku 3.5 for financial extraction - Use Haiku 3.5 (claude-3-5-haiku-latest) for financial extraction by default - Automatically adjust maxTokens to 8192 for Haiku (vs 16000 for Sonnet) - Add intelligent fallback to Sonnet 4.5 if Haiku validation fails - Add comprehensive test script for Haiku financial extraction - Fix TypeScript errors in financial validation logic Benefits: - ~50% faster processing (13s vs 26s estimated) - ~92% cost reduction (--.014 vs --.15 per extraction) - Maintains accuracy with validation fallback Tested successfully with Stax Holding Company CIM: - Correctly extracted FY3=4M, FY2=1M, FY1=6M, LTM=1M - Processing time: 13.15s - Cost: --.0138	2025-11-10 14:44:37 -05:00
admin	e1411ec39c	Fix financial summary generation issues - Fix period ordering: Display periods in chronological order (FY3 → FY2 → FY1 → LTM) - Add missing metrics: Include Gross Profit and Gross Margin rows in summary table - Enhance financial parser: Improve column alignment validation and logging - Strengthen LLM prompts: Add better examples, validation checks, and column alignment guidance - Improve validation: Add cross-period validation, trend checking, and margin consistency checks - Add test suite: Create comprehensive tests for financial summary workflow All tests passing. Summary table now correctly displays periods chronologically and includes all required metrics.	2025-11-10 14:00:42 -05:00
admin	ac561f9021	fix: Remove duplicate sync:secrets script (reappeared in working directory)	2025-11-10 06:35:07 -05:00
admin	f62ef72a8a	docs: Add comprehensive financial extraction improvement plan This plan addresses all 10 pending todos with detailed implementation steps: Priority 1 (Weeks 1-2): Research & Analysis - Review older commits for historical patterns - Research best practices for financial data extraction Priority 2 (Weeks 3-4): Performance Optimization - Reduce processing time from 178s to <120s - Implement tiered model approach, parallel processing, prompt optimization Priority 3 (Weeks 5-6): Testing & Validation - Add comprehensive unit tests (>80% coverage) - Test invalid value rejection, cross-period validation, period identification Priority 4 (Weeks 7-8): Monitoring & Observability - Track extraction success rates, error patterns - Implement user feedback collection Priority 5 (Weeks 9-11): Code Quality & Documentation - Optimize prompt size (20-30% reduction) - Add financial data visualization UI - Document extraction strategies Priority 6 (Weeks 12-14): Advanced Features - Compare RAG vs Simple extraction approaches - Add confidence scores for extractions Includes detailed tasks, deliverables, success criteria, timeline, and risk mitigation strategies.	2025-11-10 06:33:41 -05:00
admin	b2c9db59c2	fix: Remove duplicate sync:secrets script, keep sync-secrets as canonical - Remove duplicate 'sync:secrets' script (line 41) - Keep 'sync-secrets' (line 29) as the canonical version - Matches existing references in bash scripts (clean-env-secrets.sh, pre-deploy-check.sh) - Resolves DRY violation and script naming confusion	2025-11-10 02:46:56 -05:00
admin	8b15732a98	feat: Add pre-deployment validation and deployment automation - Add pre-deploy-check.sh script to validate .env doesn't contain secrets - Add clean-env-secrets.sh script to remove secrets from .env before deployment - Update deploy:firebase script to run validation automatically - Add sync-secrets npm script for local development - Add deploy:firebase:force for deployments that skip validation This prevents 'Secret environment variable overlaps non secret environment variable' errors by ensuring secrets defined via defineSecret() are not also in .env file. ## Completed Todos - ✅ Test financial extraction with Stax Holding Company CIM - All values correct (FY-3: $64M, FY-2: $71M, FY-1: $71M, LTM: $76M) - ✅ Implement deterministic parser fallback - Integrated into simpleDocumentProcessor - ✅ Implement few-shot examples - Added comprehensive examples for PRIMARY table identification - ✅ Fix primary table identification - Financial extraction now correctly identifies PRIMARY table (millions) vs subsidiary tables (thousands) ## Pending Todos 1. Review older commits (1-2 months ago) to see how financial extraction was working then - Check commits: `185c780` (Claude 3.7), `5b3b1bf` (Document AI fixes), `0ec3d14` (multi-pass extraction) - Compare prompt simplicity - older versions may have had simpler, more effective prompts - Check if deterministic parser was being used more effectively 2. Review best practices for structured financial data extraction from PDFs/CIMs - Research: LLM prompt engineering for tabular data (few-shot examples, chain-of-thought) - Period identification strategies - Validation techniques - Hybrid approaches (deterministic + LLM) - Error handling patterns - Check academic papers and industry case studies 3. Determine how to reduce processing time without sacrificing accuracy - Options: 1) Use Claude Haiku 4.5 for initial extraction, Sonnet 4.5 for validation - 2) Parallel extraction of different sections - 3) Caching common patterns - 4) Streaming responses - 5) Incremental processing with early validation - 6) Reduce prompt verbosity while maintaining clarity 4. Add unit tests for financial extraction validation logic - Test: invalid value rejection, cross-period validation, numeric extraction - Period identification from various formats (years, FY-X, mixed) - Include edge cases: missing periods, projections mixed with historical, inconsistent formatting 5. Monitor production financial extraction accuracy - Track: extraction success rate, validation rejection rate, common error patterns - User feedback on extracted financial data - Set up alerts for validation failures and extraction inconsistencies 6. Optimize prompt size for financial extraction - Current prompts may be too verbose - Test shorter, more focused prompts that maintain accuracy - Consider: removing redundant instructions, using more concise examples, focusing on critical rules only 7. Add financial data visualization - Consider adding a financial data preview/validation step in the UI - Allow users to verify/correct extracted values if needed - Provides human-in-the-loop validation for critical financial data 8. Document extraction strategies - Document the different financial table formats found in CIMs - Create a reference guide for common patterns (years format, FY-X format, mixed format, etc.) - This will help with prompt engineering and parser improvements 9. Compare RAG-based extraction vs simple full-document extraction for financial accuracy - Determine which approach produces more accurate financial data and why - May need to hybrid approach 10. Add confidence scores to financial extraction results - Flag low-confidence extractions for manual review - Helps identify when extraction may be incorrect and needs human validation	2025-11-10 02:43:47 -05:00
admin	77df7c2101	Merge feature/fix-financial-extraction-primary-table: Financial extraction now correctly identifies PRIMARY table	2025-11-10 02:22:38 -05:00
admin	7acd1297bb	feat: Implement separate financial extraction with few-shot examples - Add processFinancialsOnly() method for focused financial extraction - Integrate deterministic parser into simpleDocumentProcessor - Add comprehensive few-shot examples showing PRIMARY vs subsidiary tables - Enhance prompt with explicit PRIMARY table identification rules - Fix maxTokens default from 3500 to 16000 to prevent truncation - Add test script for Stax Holding Company CIM validation Test Results: ✅ FY-3: 4M revenue, cd /home/jonathan/Coding/cim_summary && git commit -m "feat: Implement separate financial extraction with few-shot examples - Add processFinancialsOnly() method for focused financial extraction - Integrate deterministic parser into simpleDocumentProcessor - Add comprehensive few-shot examples showing PRIMARY vs subsidiary tables - Enhance prompt with explicit PRIMARY table identification rules - Fix maxTokens default from 3500 to 16000 to prevent truncation - Add test script for Stax Holding Company CIM validation Test Results: ✅ FY-3: $64M revenue, $19M EBITDA (correct) ✅ FY-2: $71M revenue, $24M EBITDA (correct) ✅ FY-1: $71M revenue, $24M EBITDA (correct) ✅ LTM: $76M revenue, $27M EBITDA (correct) All financial values now correctly extracted from PRIMARY table (millions format) instead of subsidiary tables (thousands format)."9M EBITDA (correct) ✅ FY-2: 1M revenue, 4M EBITDA (correct) ✅ FY-1: 1M revenue, 4M EBITDA (correct) ✅ LTM: 6M revenue, 7M EBITDA (correct) All financial values now correctly extracted from PRIMARY table (millions format) instead of subsidiary tables (thousands format).	2025-11-10 02:17:40 -05:00
admin	531686bb91	fix: Improve financial extraction accuracy and validation - Upgrade to Claude Sonnet 4.5 for better accuracy - Simplify and clarify financial extraction prompts - Add flexible period identification (years, FY-X, LTM formats) - Add cross-validation to catch wrong column extraction - Reject values that are too small (<M revenue, <00K EBITDA) - Add monitoring scripts for document processing - Improve validation to catch inconsistent values across periods	2025-11-09 21:57:55 -05:00
admin	63fe7e97a8	Merge pull request 'production-current' (#1 ) from production-current into master Reviewed-on: #1	2025-11-09 21:09:23 -05:00
admin	9c916d12f4	feat: Production release v2.0.0 - Simple Document Processor Major release with significant performance improvements and new processing strategy. ## Core Changes - Implemented simple_full_document processing strategy (default) - Full document → LLM approach: 1-2 passes, ~5-6 minutes processing time - Achieved 100% completeness with 2 API calls (down from 5+) - Removed redundant Document AI passes for faster processing ## Financial Data Extraction - Enhanced deterministic financial table parser - Improved FY3/FY2/FY1/LTM identification from varying CIM formats - Automatic merging of parser results with LLM extraction ## Code Quality & Infrastructure - Cleaned up debug logging (removed emoji markers from production code) - Fixed Firebase Secrets configuration (using modern defineSecret approach) - Updated OpenAI API key - Resolved deployment conflicts (secrets vs environment variables) - Added .env files to Firebase ignore list ## Deployment - Firebase Functions v2 deployment successful - All 7 required secrets verified and configured - Function URL: https://api-y56ccs6wva-uc.a.run.app ## Performance Improvements - Processing time: ~5-6 minutes (down from 23+ minutes) - API calls: 1-2 (down from 5+) - Completeness: 100% achievable - LLM Model: claude-3-7-sonnet-latest ## Breaking Changes - Default processing strategy changed to 'simple_full_document' - RAG processor available as alternative strategy 'document_ai_agentic_rag' ## Files Changed - 36 files changed, 5642 insertions(+), 4451 deletions(-) - Removed deprecated documentation files - Cleaned up unused services and models This release represents a major refactoring focused on speed, accuracy, and maintainability. v2.0.0	2025-11-09 21:07:22 -05:00
admin	0ec3d1412b	feat: Implement multi-pass hierarchical extraction for 95-98% data coverage Replaces single-pass RAG extraction with 6-pass targeted extraction strategy: Pass 1: Metadata & Structure - Deal overview fields (company name, industry, geography, employees) - Targeted RAG query for basic company information - 20 chunks focused on executive summary and overview sections Pass 2: Financial Data - All financial metrics (FY-3, FY-2, FY-1, LTM) - Revenue, EBITDA, margins, cash flow - 30 chunks with emphasis on financial tables and appendices - Extracts quality of earnings, capex, working capital Pass 3: Market Analysis - TAM/SAM market sizing, growth rates - Competitive landscape and positioning - Industry trends and barriers to entry - 25 chunks focused on market and industry sections Pass 4: Business & Operations - Products/services and value proposition - Customer and supplier information - Management team and org structure - 25 chunks covering business model and operations Pass 5: Investment Thesis - Strategic analysis and recommendations - Value creation levers and risks - Alignment with fund strategy - 30 chunks for synthesis and high-level analysis Pass 6: Validation & Gap-Filling - Identifies fields still marked "Not specified in CIM" - Groups missing fields into logical batches - Makes targeted RAG queries for each batch - Dynamic API usage based on gaps found Key Improvements: - Each pass uses targeted RAG queries optimized for that data type - Smart merge strategy preserves first non-empty value for each field - Gap-filling pass catches data missed in initial passes - Total ~5-10 LLM API calls vs. 1 (controlled cost increase) - Expected to achieve 95-98% data coverage vs. ~40-50% currently Technical Details: - Updated processLargeDocument to use generateLLMAnalysisMultiPass - Added processingStrategy: 'document_ai_multi_pass_rag' - Each pass includes keyword fallback if RAG search fails - Deep merge utility prevents "Not specified" from overwriting good data - Comprehensive logging for debugging each pass 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-07 13:15:19 -05:00
admin	053426c88d	fix: Correct OpenRouter model IDs and add error handling Critical fixes for LLM processing failures: - Updated model mapping to use valid OpenRouter IDs (claude-haiku-4.5, claude-sonnet-4.5) - Changed default models from dated versions to generic names - Added HTTP status checking before accessing response data - Enhanced logging for OpenRouter provider selection Resolves "invalid model ID" errors that were causing all CIM processing to fail. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-06 20:58:26 -05:00
Jon	c8c2783241	feat: Implement comprehensive CIM Review editing and admin features - Add inline editing for CIM Review template with auto-save functionality - Implement CSV export with comprehensive data formatting - Add automated file naming (YYYYMMDD_CompanyName_CIM_Review.pdf/csv) - Create admin role system for jpressnell@bluepointcapital.com - Hide analytics/monitoring tabs from non-admin users - Add email sharing functionality via mailto links - Implement save status indicators and last saved timestamps - Add backend endpoints for CIM Review save/load and CSV export - Create admin service for role-based access control - Update document viewer with save/export handlers - Add proper error handling and user feedback Backup: Live version preserved in backup-live-version-e0a37bf-clean branch	2025-08-14 11:54:25 -04:00
Jon	e0a37bf9f9	Fix PDF generation: correct method call to use Puppeteer directly instead of generatePDFBuffer PRODUCTION-BACKUP-v1.0	2025-08-02 15:40:15 -04:00
Jon	1954d9d0a6	Replace Puppeteer fallback with PDFKit for reliable PDF generation in Firebase Functions	2025-08-02 15:35:32 -04:00
Jon	c709e8b8c4	Fix PDF generation issues: add logo to build process and implement fallback methods	2025-08-02 15:23:45 -04:00
Jon	5e8add6cc5	Add Bluepoint logo integration to PDF reports and web navigation	2025-08-02 15:12:33 -04:00
Jon	bdc50f9e38	feat: Add GCS cleanup script for automated storage management	2025-08-02 09:32:10 -04:00
Jon	6e164d2bcb	fix: Fix TypeScript error in PDF generation service cache cleanup	2025-08-02 09:17:49 -04:00
Jon	a4f393d4ac	Fix financial table rendering and enhance PDF generation - Fix [object Object] issue in PDF financial table rendering - Enhance Key Questions and Investment Thesis sections with detailed prompts - Update year labeling in Overview tab (FY0 -> LTM) - Improve PDF generation service with page pooling and caching - Add better error handling for financial data structure - Increase textarea rows for detailed content sections - Update API configuration for Cloud Run deployment - Add comprehensive styling improvements to PDF output	2025-08-01 20:33:16 -04:00
Jon	df079713c4	feat: Complete cloud-native CIM Document Processor with full BPCP template 🌐 Cloud-Native Architecture: - Firebase Functions deployment (no Docker) - Supabase database (replacing local PostgreSQL) - Google Cloud Storage integration - Document AI + Agentic RAG processing pipeline - Claude-3.5-Sonnet LLM integration ✅ Full BPCP CIM Review Template (7 sections): - Deal Overview - Business Description - Market & Industry Analysis - Financial Summary (with historical financials table) - Management Team Overview - Preliminary Investment Thesis - Key Questions & Next Steps 🔧 Cloud Migration Improvements: - PostgreSQL → Supabase migration complete - Local storage → Google Cloud Storage - Docker deployment → Firebase Functions - Schema mapping fixes (camelCase/snake_case) - Enhanced error handling and logging - Vector database with fallback mechanisms 📄 Complete End-to-End Cloud Workflow: 1. Upload PDF → Document AI extraction 2. Agentic RAG processing → Structured CIM data 3. Store in Supabase → Vector embeddings 4. Auto-generate PDF → Full BPCP template 5. Download complete CIM review 🚀 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-08-01 17:51:45 -04:00
Jon	3d94fcbeb5	Pre Kiro	2025-08-01 15:46:43 -04:00
Jon	f453efb0f8	Pre-cleanup commit: Current state before service layer consolidation	2025-08-01 14:57:56 -04:00
Jon	95c92946de	fix(core): Overhaul and fix the end-to-end document processing pipeline	2025-08-01 11:13:03 -04:00
Jon	6057d1d7fd	🔧 Fix authentication and document upload issues ## What was done: ✅ Fixed Firebase Admin initialization to use default credentials for Firebase Functions ✅ Updated frontend to use correct Firebase Functions URL (was using Cloud Run URL) ✅ Added comprehensive debugging to authentication middleware ✅ Added debugging to file upload middleware and CORS handling ✅ Added debug buttons to frontend for troubleshooting authentication ✅ Enhanced error handling and logging throughout the stack ## Current issues: ❌ Document upload still returns 400 Bad Request despite authentication working ❌ GET requests work fine (200 OK) but POST upload requests fail ❌ Frontend authentication is working correctly (valid JWT tokens) ❌ Backend authentication middleware is working (rejects invalid tokens) ❌ CORS is configured correctly and allowing requests ## Root cause analysis: - Authentication is NOT the issue (tokens are valid, GET requests work) - The problem appears to be in the file upload handling or multer configuration - Request reaches the server but fails during upload processing - Need to identify exactly where in the upload pipeline the failure occurs ## TODO next steps: 1. 🔍 Check Firebase Functions logs after next upload attempt to see debugging output 2. 🔍 Verify if request reaches upload middleware (look for '�� Upload middleware called' logs) 3. 🔍 Check if file validation is triggered (look for '🔍 File filter called' logs) 4. 🔍 Identify specific error in upload pipeline (multer, file processing, etc.) 5. 🔍 Test with smaller file or different file type to isolate issue 6. 🔍 Check if issue is with Firebase Functions file size limits or timeout 7. 🔍 Verify multer configuration and file handling in Firebase Functions environment ## Technical details: - Frontend: https://cim-summarizer.web.app - Backend: https://us-central1-cim-summarizer.cloudfunctions.net/api - Authentication: Firebase Auth with JWT tokens (working correctly) - File upload: Multer with memory storage for immediate GCS upload - Debug buttons available in production frontend for troubleshooting	2025-07-31 16:18:53 -04:00
Jon	aa0931ecd7	feat: Add Document AI + Genkit integration for CIM processing This commit implements a comprehensive Document AI + Genkit integration for superior CIM document processing with the following features: Core Integration: - Add DocumentAiGenkitProcessor service for Document AI + Genkit processing - Integrate with Google Cloud Document AI OCR processor (ID: add30c555ea0ff89) - Add unified document processing strategy 'document_ai_genkit' - Update environment configuration for Document AI settings Document AI Features: - Google Cloud Storage integration for document upload/download - Document AI batch processing with OCR and entity extraction - Automatic cleanup of temporary files - Support for PDF, DOCX, and image formats - Entity recognition for companies, money, percentages, dates - Table structure preservation and extraction Genkit AI Integration: - Structured AI analysis using Document AI extracted data - CIM-specific analysis prompts and schemas - Comprehensive investment analysis output - Risk assessment and investment recommendations Testing & Validation: - Comprehensive test suite with 10+ test scripts - Real processor verification and integration testing - Mock processing for development and testing - Full end-to-end integration testing - Performance benchmarking and validation Documentation: - Complete setup instructions for Document AI - Integration guide with benefits and implementation details - Testing guide with step-by-step instructions - Performance comparison and optimization guide Infrastructure: - Google Cloud Functions deployment updates - Environment variable configuration - Service account setup and permissions - GCS bucket configuration for Document AI Performance Benefits: - 50% faster processing compared to traditional methods - 90% fewer API calls for cost efficiency - 35% better quality through structured extraction - 50% lower costs through optimized processing Breaking Changes: None Migration: Add Document AI environment variables to .env file Testing: All tests pass, integration verified with real processor	2025-07-31 09:55:14 -04:00
Jon	dbe4b12f13	feat: optimize deployment and add debugging	2025-07-30 22:06:52 -04:00
Jon	2d98dfc814	temp: firebase deployment progress	2025-07-30 22:02:17 -04:00
Jon	67b77b0f15	Implement Firebase Authentication and Cloud Functions deployment - Replace custom JWT auth with Firebase Auth SDK - Add Firebase web app configuration - Implement user registration and login with Firebase - Update backend to use Firebase Admin SDK for token verification - Remove custom auth routes and controllers - Add Firebase Cloud Functions deployment configuration - Update frontend to use Firebase Auth state management - Add registration mode toggle to login form - Configure CORS and deployment for Firebase hosting 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-07-29 15:26:55 -04:00
Jon	5f09a1b2fb	Clean up and optimize root directory - Remove large test PDF files (15.5MB total): '2025-04-23 Stax Holding Company, LLC Confidential Information Presentation for Stax Holding Company, LLC - April 2025.pdf' (9.9MB) and 'stax-cim-test.pdf' (5.6MB) - Remove unused dependency: form-data from root package.json - Keep all essential documentation and configuration files - Maintain project structure integrity while reducing repository size	2025-07-29 00:51:27 -04:00
Jon	70c02df6e7	Clean up and optimize backend code - Remove large log files (13MB total) - Remove dist directory (1.9MB, can be regenerated) - Remove unused dependencies: bcrypt, bull, langchain, @langchain/openai, form-data, express-validator - Remove unused service files: advancedLLMProcessor, enhancedCIMProcessor, enhancedLLMService, financialAnalysisEngine, qualityValidationService - Keep essential services: uploadProgressService, sessionService, vectorDatabaseService, vectorDocumentProcessor, ragDocumentProcessor - Maintain all working functionality while reducing bundle size and improving maintainability	2025-07-29 00:49:56 -04:00
Jon	df7bbe47f6	Clean up and optimize frontend code - Remove temporary files: verify-auth.js, frontend_test_results.txt, test-output.css - Remove empty directories: src/pages, src/hooks - Remove unused dependencies: @tanstack/react-query, react-hook-form - Remove unused utility file: parseCIMData.ts - Clean up commented mock data and unused imports in App.tsx - Maintain all working functionality while reducing bundle size	2025-07-29 00:44:24 -04:00
Jon	0bd6a3508b	Clean up temporary files and logs - Remove test PDF files, log files, and temporary scripts - Keep important documentation and configuration files - Clean up root directory test files and logs - Maintain project structure integrity	2025-07-29 00:41:38 -04:00
Jon	785195908f	Fix employee count field mapping - Add employeeCount field to LLM schema and prompt - Update frontend to use correct dealOverview.employeeCount field - Add employee count to CIMReviewTemplate interface and rendering - Include employee count in PDF summary generation - Fix incorrect mapping from customerConcentrationRisk to proper employeeCount field	2025-07-29 00:39:08 -04:00
Jon	a4c8aac92d	Improve PDF formatting with financial tables and professional styling - Add comprehensive financial table with FY1/FY2/FY3/LTM periods - Include all missing sections (investment analysis, next steps, etc.) - Update PDF styling with smaller fonts (10pt), Times New Roman, professional layout - Add proper table formatting with borders and headers - Fix TypeScript compilation errors	2025-07-29 00:34:12 -04:00
Jon	4ce430b531	Fix CIM template data linkage issues - update field mapping to use proper nested paths	2025-07-29 00:25:04 -04:00
Jon	d794e64a02	Fix frontend data display and download issues - Fixed backend API to return analysis_data as extractedData for frontend compatibility - Added PDF generation to jobQueueService to ensure summary_pdf_path is populated - Generated PDF for existing document to fix download functionality - Backend now properly serves analysis data to frontend - Frontend should now display real financial data instead of N/A values	2025-07-29 00:16:17 -04:00
Jon	dccfcfaa23	Fix download functionality and clean up temporary files FIXED ISSUES: 1. Download functionality (404 errors): - Added PDF generation to jobQueueService after document processing - PDFs are now generated from summaries and stored in summary_pdf_path - Download endpoint now works correctly 2. Frontend-Backend communication: - Verified Vite proxy configuration is correct (/api -> localhost:5000) - Backend is responding to health checks - API authentication is working 3. Temporary files cleanup: - Removed 50+ temporary debug/test files from backend/ - Cleaned up check-.js, test-.js, debug-.js, fix-.js files - Removed one-time processing scripts and debug utilities TECHNICAL DETAILS: - Modified jobQueueService.ts to generate PDFs using pdfGenerationService - Added path import for file path handling - PDFs are generated with timestamp in filename for uniqueness - All temporary development files have been removed STATUS: Download functionality should now work. Frontend-backend communication verified.	2025-07-28 21:33:28 -04:00
Jon	4326599916	Fix TypeScript compilation errors and start services correctly - Fixed unused imports in documentController.ts and vector.ts - Fixed null/undefined type issues in pdfGenerationService.ts - Commented out unused enrichChunksWithMetadata method in agenticRAGProcessor.ts - Successfully started both frontend (port 3000) and backend (port 5000) TODO: Need to investigate: - Why frontend is not getting backend data properly - Why download functionality is not working (404 errors in logs) - Need to clean up temporary debug/test files	2025-07-28 21:30:32 -04:00
Jon	adb33154cc	feat: Implement optimized agentic RAG processor with vector embeddings and LLM analysis - Add LLM analysis integration to optimized agentic RAG processor - Fix strategy routing in job queue service to use configured processing strategy - Update ProcessingResult interface to include LLM analysis results - Integrate vector database operations with semantic chunking - Add comprehensive CIM review generation with proper error handling - Fix TypeScript errors and improve type safety - Ensure complete pipeline from upload to final analysis output The optimized agentic RAG processor now: - Creates intelligent semantic chunks with metadata enrichment - Generates vector embeddings for all chunks - Stores chunks in pgvector database with optimized batching - Runs LLM analysis to generate comprehensive CIM reviews - Provides complete integration from upload to final output Tested successfully with STAX CIM document processing.	2025-07-28 20:11:32 -04:00
Jon	7cca54445d	Enhanced CIM processing with vector database integration and optimized agentic RAG processor	2025-07-28 19:46:46 -04:00

1 2 3

106 Commits