# RAG Processing System for CIM Analysis

## Overview

This document describes the new RAG (Retrieval-Augmented Generation) processing system, which provides an alternative to the current chunking approach for CIM document analysis.

## Why RAG?

### Current Chunking Issues

- **9 sequential chunks** per document (inefficient)
- **Context fragmentation** (each chunk analyzed in isolation)
- **Redundant processing** (same company analyzed 9 times)
- **Inconsistent results** (contradictions between chunks)
- **High costs** (more API calls = higher total cost)

### RAG Benefits

- **6-8 focused queries** instead of 9+ chunks
- **Full document context** maintained throughout
- **Intelligent retrieval** of relevant sections
- **Lower costs** with better quality
- **Faster processing** with parallel capability

## Architecture

### Components

1. **RAG Document Processor** (`ragDocumentProcessor.ts`)
   - Intelligent document segmentation
   - Section-specific analysis
   - Context-aware retrieval
   - Performance tracking

2. **Unified Document Processor** (`unifiedDocumentProcessor.ts`)
   - Strategy switching
   - Performance comparison
   - Quality assessment
   - Statistics tracking

3. **API Endpoints** (enhanced `documents.ts`)
   - `/api/documents/:id/process-rag` - Process with RAG
   - `/api/documents/:id/compare-strategies` - Compare both approaches
   - `/api/documents/:id/switch-strategy` - Switch processing strategy
   - `/api/documents/processing-stats` - Get performance statistics

## Configuration

### Environment Variables

```bash
# Processing Strategy (default: 'chunking')
PROCESSING_STRATEGY=rag

# Enable RAG Processing
ENABLE_RAG_PROCESSING=true

# Enable Processing Comparison
ENABLE_PROCESSING_COMPARISON=true

# LLM Configuration for RAG
LLM_CHUNK_SIZE=15000            # Increased from 4000
LLM_MAX_TOKENS=4000             # Increased from 3500
LLM_MAX_INPUT_TOKENS=200000     # Increased from 180000
LLM_PROMPT_BUFFER=1000          # Increased from 500
LLM_TIMEOUT_MS=180000           # Increased from 120000
LLM_MAX_COST_PER_DOCUMENT=3.00  # Increased from 2.00
```

## Usage

### 1. Process Document with RAG

```javascript
// Using the unified processor
const result = await unifiedDocumentProcessor.processDocument(
  documentId,
  userId,
  documentText,
  { strategy: 'rag' }
);

console.log('RAG Processing Results:', {
  success: result.success,
  processingTime: result.processingTime,
  apiCalls: result.apiCalls,
  summary: result.summary
});
```

### 2. Compare Both Strategies

```javascript
const comparison = await unifiedDocumentProcessor.compareProcessingStrategies(
  documentId,
  userId,
  documentText
);

console.log('Comparison Results:', {
  winner: comparison.winner,
  timeDifference: comparison.performanceMetrics.timeDifference,
  apiCallDifference: comparison.performanceMetrics.apiCallDifference,
  qualityScore: comparison.performanceMetrics.qualityScore
});
```

### 3. API Endpoints

#### Process with RAG

```bash
POST /api/documents/{id}/process-rag
```

#### Compare Strategies

```bash
POST /api/documents/{id}/compare-strategies
```

#### Switch Strategy

Set `strategy` to either `"rag"` or `"chunking"`.

```bash
POST /api/documents/{id}/switch-strategy
Content-Type: application/json

{
  "strategy": "rag"
}
```

#### Get Processing Stats

```bash
GET /api/documents/processing-stats
```

## Processing Flow

### RAG Approach

1. **Document Segmentation** - Identify logical sections (executive summary, business description, financials, etc.)
2. **Key Metrics Extraction** - Extract financial and business metrics from each section
3. **Query-Based Analysis** - Process 6 focused queries for BPCP template sections
4. **Context Synthesis** - Combine results with full document context
5. **Final Summary** - Generate comprehensive markdown summary

### Comparison with Chunking

| Aspect | Chunking | RAG |
|--------|----------|-----|
| **Processing** | 9 sequential chunks | 6 focused queries |
| **Context** | Fragmented per chunk | Full document context |
| **Quality** | Inconsistent across chunks | Consistent, focused analysis |
| **Cost** | High (9+ API calls) | Lower (6-8 API calls) |
| **Speed** | Slow (sequential) | Faster (parallel possible) |
| **Accuracy** | Context loss issues | Precise, relevant retrieval |

## Testing

### Run RAG Test

```bash
cd backend
npm run build
node test-rag-processing.js
```

### Expected Output

```
🚀 Testing RAG Processing Approach
==================================

📋 Testing RAG Processing...
✅ RAG Processing Results:
- Success: true
- Processing Time: 45000ms
- API Calls: 8
- Error: None

📊 Analysis Summary:
- Company: ABC Manufacturing
- Industry: Aerospace & Defense
- Revenue: $62M
- EBITDA: $12.1M

🔄 Testing Unified Processor Comparison...
✅ Comparison Results:
- Winner: rag
- Time Difference: -15000ms
- API Call Difference: -1
- Quality Score: 0.75
```

## Performance Metrics

### Quality Assessment

- **Summary Length** - Longer summaries tend to be more comprehensive
- **Markdown Structure** - Headers, lists, and formatting indicate better structure
- **Content Completeness** - Coverage of all BPCP template sections
- **Consistency** - No contradictions between sections

### Cost Analysis

- **API Calls** - RAG typically uses 6-8 calls vs 9+ for chunking
- **Token Usage** - More efficient token usage with focused queries
- **Processing Time** - Faster due to parallel processing capability

## Migration Strategy

### Phase 1: Parallel Testing

- Keep current chunking system
- Add RAG system alongside
- Use comparison endpoints to evaluate performance
- Collect statistics on both approaches

### Phase 2: Gradual Migration

- Switch to RAG for new documents
- Use comparison to validate results
- Monitor performance and quality metrics

### Phase 3: Full Migration

- Make RAG the default strategy
- Keep chunking as fallback option
- Optimize based on collected data

## Troubleshooting

### Common Issues

1. **RAG Processing Fails**
   - Check LLM API configuration
   - Verify document text extraction
   - Review error logs for specific issues

2. **Poor Quality Results**
   - Adjust section relevance thresholds
   - Review query prompts
   - Check document structure

3. **High Processing Time**
   - Monitor API response times
   - Check network connectivity
   - Consider parallel processing optimization

### Debug Mode

```bash
# Enable debug logging
LOG_LEVEL=debug
ENABLE_PROCESSING_COMPARISON=true
```

## Future Enhancements

1. **Vector Embeddings** - Add semantic search capabilities
2. **Caching** - Cache section analysis for repeated queries
3. **Parallel Processing** - Process queries in parallel for speed
4. **Custom Queries** - Allow user-defined analysis queries
5. **Quality Feedback** - Learn from user feedback to improve prompts

## Support

For issues or questions about the RAG processing system:

1. Check the logs for detailed error information
2. Run the test script to validate functionality
3. Compare with chunking approach to identify issues
4. Review configuration settings
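
When comparing against the chunking approach while diagnosing an issue, it can help to see how the reported difference metrics relate to the two raw results. The sketch below is a hypothetical illustration, not the actual logic in `unifiedDocumentProcessor.ts`: the function name `summarizeComparison` and the winner rule are assumptions. The sign convention (RAG minus chunking) is inferred from the Expected Output, and the chunking figures are illustrative values chosen to reproduce those differences.

```javascript
// Hypothetical helper: NOT part of unifiedDocumentProcessor.ts.
// Result objects are assumed to carry the `success`, `processingTime` (ms)
// and `apiCalls` fields shown in the usage examples.
function summarizeComparison(chunking, rag) {
  // Negative values mean RAG used less time / fewer calls than chunking.
  const timeDifference = rag.processingTime - chunking.processingTime;
  const apiCallDifference = rag.apiCalls - chunking.apiCalls;

  // Assumed rule: a failed run loses outright; otherwise the faster run
  // wins, with API call count as the tie-breaker.
  let winner;
  if (rag.success !== chunking.success) {
    winner = rag.success ? 'rag' : 'chunking';
  } else if (timeDifference !== 0) {
    winner = timeDifference < 0 ? 'rag' : 'chunking';
  } else if (apiCallDifference !== 0) {
    winner = apiCallDifference < 0 ? 'rag' : 'chunking';
  } else {
    winner = 'tie';
  }

  return { winner, timeDifference, apiCallDifference };
}

// Illustrative inputs only (the chunking values are assumed).
const chunkingResult = { success: true, processingTime: 60000, apiCalls: 9 };
const ragResult = { success: true, processingTime: 45000, apiCalls: 8 };

console.log(summarizeComparison(chunkingResult, ragResult));
// { winner: 'rag', timeDifference: -15000, apiCallDifference: -1 }
```

If the real comparison reports a positive time or API call difference, RAG was slower or more expensive for that document, which is a useful starting point when reviewing the logs and configuration.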