RAG Processing System for CIM Analysis
Overview
This document describes the new RAG (Retrieval-Augmented Generation) processing system that provides an alternative to the current chunking approach for CIM document analysis.
Why RAG?
Current Chunking Issues
- 9 sequential chunks per document (inefficient)
- Context fragmentation (each chunk analyzed in isolation)
- Redundant processing (same company analyzed 9 times)
- Inconsistent results (contradictions between chunks)
- High costs (more API calls = higher total cost)
RAG Benefits
- 6-8 focused queries instead of 9+ chunks
- Full document context maintained throughout
- Intelligent retrieval of relevant sections
- Lower costs with better quality
- Faster processing with parallel capability
Architecture
Components
- RAG Document Processor (ragDocumentProcessor.ts)
  - Intelligent document segmentation
  - Section-specific analysis
  - Context-aware retrieval
  - Performance tracking
- Unified Document Processor (unifiedDocumentProcessor.ts)
  - Strategy switching
  - Performance comparison
  - Quality assessment
  - Statistics tracking
- API Endpoints (enhanced documents.ts)
  - /api/documents/:id/process-rag - Process with RAG
  - /api/documents/:id/compare-strategies - Compare both approaches
  - /api/documents/:id/switch-strategy - Switch processing strategy
  - /api/documents/processing-stats - Get performance statistics
Configuration
Environment Variables
# Processing Strategy (default: 'chunking')
PROCESSING_STRATEGY=rag
# Enable RAG Processing
ENABLE_RAG_PROCESSING=true
# Enable Processing Comparison
ENABLE_PROCESSING_COMPARISON=true
# LLM Configuration for RAG
LLM_CHUNK_SIZE=15000 # Increased from 4000
LLM_MAX_TOKENS=4000 # Increased from 3500
LLM_MAX_INPUT_TOKENS=200000 # Increased from 180000
LLM_PROMPT_BUFFER=1000 # Increased from 500
LLM_TIMEOUT_MS=180000 # Increased from 120000
LLM_MAX_COST_PER_DOCUMENT=3.00 # Increased from 2.00
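The variables above could be read and validated at startup roughly as follows. This is a minimal sketch, not the project's actual configuration code: `ProcessingConfig`, `loadProcessingConfig`, and the helper names are hypothetical, and the numeric fallbacks simply reuse the values listed above on the assumption that they are the intended defaults.

```typescript
// Hypothetical config shape: field names are illustrative, not the project's actual types.
type ProcessingStrategy = 'chunking' | 'rag';
type Env = Record<string, string | undefined>;

interface ProcessingConfig {
  strategy: ProcessingStrategy;
  enableRag: boolean;
  enableComparison: boolean;
  llmChunkSize: number;
  llmMaxTokens: number;
  llmMaxInputTokens: number;
  llmPromptBuffer: number;
  llmTimeoutMs: number;
  llmMaxCostPerDocument: number;
}

function isStrategy(value: string): value is ProcessingStrategy {
  return value === 'chunking' || value === 'rag';
}

// Returns the fallback when the variable is unset; rejects non-positive or non-numeric values.
function readPositiveNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === '') return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`${key} must be a positive number, got "${raw}"`);
  }
  return value;
}

function loadProcessingConfig(env: Env): ProcessingConfig {
  // 'chunking' is the documented default strategy.
  const strategy = env.PROCESSING_STRATEGY ?? 'chunking';
  if (!isStrategy(strategy)) {
    throw new Error(`PROCESSING_STRATEGY must be "chunking" or "rag", got "${strategy}"`);
  }
  return {
    strategy,
    enableRag: env.ENABLE_RAG_PROCESSING === 'true',
    enableComparison: env.ENABLE_PROCESSING_COMPARISON === 'true',
    // Fallbacks below are assumed defaults taken from the values listed in this document.
    llmChunkSize: readPositiveNumber(env, 'LLM_CHUNK_SIZE', 15000),
    llmMaxTokens: readPositiveNumber(env, 'LLM_MAX_TOKENS', 4000),
    llmMaxInputTokens: readPositiveNumber(env, 'LLM_MAX_INPUT_TOKENS', 200000),
    llmPromptBuffer: readPositiveNumber(env, 'LLM_PROMPT_BUFFER', 1000),
    llmTimeoutMs: readPositiveNumber(env, 'LLM_TIMEOUT_MS', 180000),
    llmMaxCostPerDocument: readPositiveNumber(env, 'LLM_MAX_COST_PER_DOCUMENT', 3.0),
  };
}
```

In a Node.js process this would typically be called once as `loadProcessingConfig(process.env)`, so that a mistyped strategy or timeout fails at startup rather than in the middle of a document run.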
Usage
1. Process Document with RAG
// Using the unified processor
const result = await unifiedDocumentProcessor.processDocument(
  documentId,
  userId,
  documentText,
  { strategy: 'rag' }
);
console.log('RAG Processing Results:', {
  success: result.success,
  processingTime: result.processingTime,
  apiCalls: result.apiCalls,
  summary: result.summary
});
2. Compare Both Strategies
const comparison = await unifiedDocumentProcessor.compareProcessingStrategies(
  documentId,
  userId,
  documentText
);
console.log('Comparison Results:', {
  winner: comparison.winner,
  timeDifference: comparison.performanceMetrics.timeDifference,
  apiCallDifference: comparison.performanceMetrics.apiCallDifference,
  qualityScore: comparison.performanceMetrics.qualityScore
});
3. API Endpoints
Process with RAG
POST /api/documents/{id}/process-rag
Compare Strategies
POST /api/documents/{id}/compare-strategies
Switch Strategy (the strategy field accepts "rag" or "chunking")
POST /api/documents/{id}/switch-strategy
Content-Type: application/json
{
  "strategy": "rag"
}
Get Processing Stats
GET /api/documents/processing-stats
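For client code, the four endpoints above can be wrapped in small request builders. The sketch below only constructs request descriptions from the documented methods and paths. The `ApiRequest` type and the `api` object are hypothetical helpers, and nothing here is known about authentication or response shapes.

```typescript
type Strategy = 'chunking' | 'rag';

// Hypothetical description of an HTTP request; not a type from the project.
interface ApiRequest {
  method: 'GET' | 'POST';
  path: string;
  headers: Record<string, string>;
  body?: string;
}

// Builds the documented path for a document-scoped action.
function documentPath(id: string, action: string): string {
  return `/api/documents/${encodeURIComponent(id)}/${action}`;
}

const api = {
  processRag: (id: string): ApiRequest => ({
    method: 'POST',
    path: documentPath(id, 'process-rag'),
    headers: {},
  }),
  compareStrategies: (id: string): ApiRequest => ({
    method: 'POST',
    path: documentPath(id, 'compare-strategies'),
    headers: {},
  }),
  switchStrategy: (id: string, strategy: Strategy): ApiRequest => ({
    method: 'POST',
    path: documentPath(id, 'switch-strategy'),
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ strategy }),
  }),
  processingStats: (): ApiRequest => ({
    method: 'GET',
    path: '/api/documents/processing-stats',
    headers: {},
  }),
};

// Usage with fetch (the base URL is an assumption; adjust to your deployment):
//   const req = api.switchStrategy('123', 'rag');
//   await fetch('http://localhost:3000' + req.path, {
//     method: req.method, headers: req.headers, body: req.body,
//   });
```

Keeping the builders free of network calls makes them easy to unit test and lets the caller decide how to attach credentials.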
Processing Flow
RAG Approach
- Document Segmentation - Identify logical sections (executive summary, business description, financials, etc.)
- Key Metrics Extraction - Extract financial and business metrics from each section
- Query-Based Analysis - Process 6 focused queries for BPCP template sections
- Context Synthesis - Combine results with full document context
- Final Summary - Generate comprehensive markdown summary
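The steps above can be sketched in a few functions. This is an illustrative simplification, not the code in ragDocumentProcessor.ts: it segments on exact heading lines, ranks sections by keyword overlap rather than any semantic measure, omits the metrics extraction step, and takes the LLM call as an injected function. All type and function names are hypothetical.

```typescript
interface Section { title: string; text: string; }
interface Query { id: string; keywords: string[]; }
type LlmCall = (prompt: string) => Promise<string>;

// Step 1: split the document wherever a line exactly matches a known heading.
function segmentDocument(text: string, headings: string[]): Section[] {
  const sections: Section[] = [];
  let current: Section = { title: 'Preamble', text: '' };
  for (const line of text.split('\n')) {
    const trimmed = line.trim().toLowerCase();
    const match = headings.find((h) => h.toLowerCase() === trimmed);
    if (match !== undefined) {
      if (current.text.trim() !== '') sections.push(current);
      current = { title: match, text: '' };
    } else {
      current.text += line + '\n';
    }
  }
  if (current.text.trim() !== '') sections.push(current);
  return sections;
}

// Counts how many query keywords occur in the section (case-insensitive).
function scoreSection(section: Section, query: Query): number {
  const haystack = `${section.title} ${section.text}`.toLowerCase();
  return query.keywords.filter((k) => haystack.includes(k.toLowerCase())).length;
}

// Step 3 (retrieval part): keep the topK sections with at least one keyword hit.
function retrieveSections(sections: Section[], query: Query, topK: number): Section[] {
  return sections
    .map((section) => ({ section, score: scoreSection(section, query) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.section);
}

// Steps 3-5: one LLM call per query, then join the answers into a markdown summary.
async function runRagFlow(
  text: string,
  headings: string[],
  queries: Query[],
  callLlm: LlmCall,
): Promise<string> {
  const sections = segmentDocument(text, headings);
  const parts: string[] = [];
  for (const query of queries) {
    const context = retrieveSections(sections, query, 2)
      .map((s) => `${s.title}\n${s.text}`)
      .join('\n');
    const answer = await callLlm(`Query: ${query.id}\nContext:\n${context}`);
    parts.push(`## ${query.id}\n${answer}`);
  }
  return parts.join('\n\n');
}
```

Because each query sees only the sections retrieved for it, prompt size stays bounded while the segmentation itself is computed once over the whole document.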
Comparison with Chunking
| Aspect | Chunking | RAG |
|---|---|---|
| Processing | 9 sequential chunks | 6 focused queries |
| Context | Fragmented per chunk | Full document context |
| Quality | Inconsistent across chunks | Consistent, focused analysis |
| Cost | High (9+ API calls) | Lower (6-8 API calls) |
| Speed | Slow (sequential) | Faster (parallel possible) |
| Accuracy | Context loss issues | Precise, relevant retrieval |
Testing
Run RAG Test
cd backend
npm run build
node test-rag-processing.js
Expected Output
🚀 Testing RAG Processing Approach
==================================
📋 Testing RAG Processing...
✅ RAG Processing Results:
- Success: true
- Processing Time: 45000ms
- API Calls: 8
- Error: None
📊 Analysis Summary:
- Company: ABC Manufacturing
- Industry: Aerospace & Defense
- Revenue: $62M
- EBITDA: $12.1M
🔄 Testing Unified Processor Comparison...
✅ Comparison Results:
- Winner: rag
- Time Difference: -15000ms
- API Call Difference: -1
- Quality Score: 0.75
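The negative differences in the sample output, together with `Winner: rag`, are consistent with a "RAG minus chunking" sign convention, so a negative value means RAG used less time or fewer calls. The sketch below illustrates that reading. The convention is inferred from the sample rather than confirmed by the source, `diffMetrics` is a hypothetical helper, and the chunking figures (60000 ms, 9 calls) are back-calculated from the differences shown.

```typescript
interface RunMetrics {
  processingTimeMs: number;
  apiCalls: number;
}

// Assumed convention: difference = RAG value minus chunking value.
function diffMetrics(rag: RunMetrics, chunking: RunMetrics) {
  return {
    timeDifference: rag.processingTimeMs - chunking.processingTimeMs,
    apiCallDifference: rag.apiCalls - chunking.apiCalls,
  };
}

const sampleDiff = diffMetrics(
  { processingTimeMs: 45000, apiCalls: 8 }, // RAG figures from the sample output
  { processingTimeMs: 60000, apiCalls: 9 }, // chunking figures inferred, not from the source
);
console.log(sampleDiff); // { timeDifference: -15000, apiCallDifference: -1 }
```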
Performance Metrics
Quality Assessment
- Summary Length - Longer summaries tend to be more comprehensive
- Markdown Structure - Headers, lists, and formatting indicate better structure
- Content Completeness - Coverage of all BPCP template sections
- Consistency - No contradictions between sections
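One way to turn the first three criteria into a single number is an equally weighted heuristic, sketched below. The weights, the `targetLength` default, and the function name `assessQuality` are assumptions for illustration. The source does not say how `qualityScore` is actually computed, and consistency between sections is left out because it cannot be checked with simple string tests.

```typescript
// Heuristic quality score in [0, 1]; equal weights are an assumption, not the project's formula.
function assessQuality(
  summary: string,
  requiredSections: string[],
  targetLength: number = 2000,
): number {
  // Summary length: saturates at 1 once the target length is reached.
  const lengthScore = targetLength > 0 ? Math.min(summary.length / targetLength, 1) : 1;

  // Markdown structure: half credit for any header line, half for any list item.
  const lines = summary.split('\n');
  const hasHeaders = lines.some((line) => /^#{1,6}\s/.test(line));
  const hasLists = lines.some((line) => /^\s*([-*]|\d+\.)\s/.test(line));
  const structureScore = (hasHeaders ? 0.5 : 0) + (hasLists ? 0.5 : 0);

  // Content completeness: fraction of required section names mentioned in the summary.
  const lower = summary.toLowerCase();
  const covered = requiredSections.filter((name) => lower.includes(name.toLowerCase())).length;
  const completenessScore =
    requiredSections.length === 0 ? 1 : covered / requiredSections.length;

  return (lengthScore + structureScore + completenessScore) / 3;
}
```

The caller supplies `requiredSections` (for example, the BPCP template section names), so the sketch does not hard-code a template it has no knowledge of.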
Cost Analysis
- API Calls - RAG typically uses 6-8 calls vs 9+ for chunking
- Token Usage - More efficient token usage with focused queries
- Processing Time - Faster due to parallel processing capability
Migration Strategy
Phase 1: Parallel Testing
- Keep current chunking system
- Add RAG system alongside
- Use comparison endpoints to evaluate performance
- Collect statistics on both approaches
Phase 2: Gradual Migration
- Switch to RAG for new documents
- Use comparison to validate results
- Monitor performance and quality metrics
Phase 3: Full Migration
- Make RAG the default strategy
- Keep chunking as fallback option
- Optimize based on collected data
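The three phases map naturally onto a small routing function plus a fallback wrapper. The sketch below is one possible shape under stated assumptions: `planForPhase` and `processWithFallback` are hypothetical names, and the handling of pre-existing documents in Phase 2 (left on chunking) is an assumption, since the plan above only covers new documents.

```typescript
type Strategy = 'chunking' | 'rag';
type MigrationPhase = 1 | 2 | 3;

interface ProcessingPlan {
  primary: Strategy;
  fallback?: Strategy;
  runComparison: boolean;
}

function planForPhase(phase: MigrationPhase, isNewDocument: boolean): ProcessingPlan {
  if (phase === 1) {
    // Phase 1: chunking stays primary; RAG runs alongside through comparison.
    return { primary: 'chunking', runComparison: true };
  }
  if (phase === 2) {
    // Phase 2: RAG for new documents, validated by comparison.
    // Assumption: documents that already exist stay on chunking.
    return isNewDocument
      ? { primary: 'rag', runComparison: true }
      : { primary: 'chunking', runComparison: false };
  }
  // Phase 3: RAG by default, chunking kept as the fallback.
  return { primary: 'rag', fallback: 'chunking', runComparison: false };
}

interface ProcessingResult {
  success: boolean;
  summary: string;
}
type Processor = (documentText: string) => Promise<ProcessingResult>;

// Runs the primary processor and falls back if it throws or reports failure.
async function processWithFallback(
  documentText: string,
  primary: Processor,
  fallback?: Processor,
): Promise<ProcessingResult> {
  try {
    const result = await primary(documentText);
    if (result.success || fallback === undefined) return result;
  } catch (error) {
    if (fallback === undefined) throw error;
  }
  return fallback(documentText);
}
```

Keeping the phase decision in one pure function means moving between phases is a configuration change rather than a code change in every caller.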
Troubleshooting
Common Issues
- RAG Processing Fails
  - Check LLM API configuration
  - Verify document text extraction
  - Review error logs for specific issues
- Poor Quality Results
  - Adjust section relevance thresholds
  - Review query prompts
  - Check document structure
- High Processing Time
  - Monitor API response times
  - Check network connectivity
  - Consider parallel processing optimization
Debug Mode
# Enable debug logging
LOG_LEVEL=debug
ENABLE_PROCESSING_COMPARISON=true
Future Enhancements
- Vector Embeddings - Add semantic search capabilities
- Caching - Cache section analysis for repeated queries
- Parallel Processing - Process queries in parallel for speed
- Custom Queries - Allow user-defined analysis queries
- Quality Feedback - Learn from user feedback to improve prompts
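The caching and parallel processing items can be combined in a few lines. The sketch below is a possible starting point, not a planned design: `withCache` and `runQueriesInParallel` are hypothetical names, the cache is an in-memory `Map` keyed by the query string, and no concurrency limit is applied, so an API rate limit may require batching in practice.

```typescript
type Analyze = (query: string) => Promise<string>;

// Wraps an analysis function so repeated queries reuse the same in-flight or finished promise.
function withCache(analyze: Analyze): Analyze {
  const cache = new Map<string, Promise<string>>();
  return (query: string) => {
    const hit = cache.get(query);
    if (hit !== undefined) return hit;
    const pending = analyze(query).catch((error) => {
      // Do not keep failed results; a later call may succeed.
      cache.delete(query);
      throw error;
    });
    cache.set(query, pending);
    return pending;
  };
}

// Starts all queries at once and resolves with the answers in input order.
async function runQueriesInParallel(queries: string[], analyze: Analyze): Promise<string[]> {
  return Promise.all(queries.map((query) => analyze(query)));
}
```

`Promise.all` rejects as soon as any query fails. If partial results are acceptable, `Promise.allSettled` is the usual alternative.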
Support
For issues or questions about the RAG processing system:
- Check the logs for detailed error information
- Run the test script to validate functionality
- Compare with chunking approach to identify issues
- Review configuration settings