# RAG Processing System for CIM Analysis

## Overview

This document describes the new RAG (Retrieval-Augmented Generation) processing system that provides an alternative to the current chunking approach for CIM document analysis.

## Why RAG?

### Current Chunking Issues
- **9 sequential chunks** per document (inefficient)
- **Context fragmentation** (each chunk analyzed in isolation)
- **Redundant processing** (same company analyzed 9 times)
- **Inconsistent results** (contradictions between chunks)
- **High costs** (more API calls = higher total cost)

### RAG Benefits
- **6 focused queries** (6-8 API calls in total) instead of 9+ chunks
- **Full document context** maintained throughout
- **Intelligent retrieval** of relevant sections
- **Lower costs** with better quality
- **Faster processing**, with independent queries that can also be run in parallel

## Architecture

### Components

1. **RAG Document Processor** (`ragDocumentProcessor.ts`)
   - Intelligent document segmentation
   - Section-specific analysis
   - Context-aware retrieval
   - Performance tracking

2. **Unified Document Processor** (`unifiedDocumentProcessor.ts`)
   - Strategy switching
   - Performance comparison
   - Quality assessment
   - Statistics tracking

3. **API Endpoints** (enhanced `documents.ts`)
   - `/api/documents/:id/process-rag` - Process with RAG
   - `/api/documents/:id/compare-strategies` - Compare both approaches
   - `/api/documents/:id/switch-strategy` - Switch processing strategy
   - `/api/documents/processing-stats` - Get performance statistics

## Configuration

### Environment Variables

```bash
# Processing Strategy (default: 'chunking')
PROCESSING_STRATEGY=rag

# Enable RAG Processing
ENABLE_RAG_PROCESSING=true

# Enable Processing Comparison
ENABLE_PROCESSING_COMPARISON=true

# LLM Configuration for RAG
LLM_CHUNK_SIZE=15000          # Increased from 4000
LLM_MAX_TOKENS=4000          # Increased from 3500
LLM_MAX_INPUT_TOKENS=200000  # Increased from 180000
LLM_PROMPT_BUFFER=1000       # Increased from 500
LLM_TIMEOUT_MS=180000        # Increased from 120000
LLM_MAX_COST_PER_DOCUMENT=3.00  # Increased from 2.00
```
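
As a rough illustration of how these variables could be consumed, the sketch below parses them from an env-like object. The function name `loadProcessingConfig` and the shape of the returned object are hypothetical. The numeric fallbacks are the previous values quoted in the comments above, and treating the two `ENABLE_*` flags as off when unset is an assumption.

```javascript
// Hypothetical helper, not part of the codebase: reads the variables listed
// above from an env-like object (defaults to process.env).
function loadProcessingConfig(env = process.env) {
  // parseInt/parseFloat ignore trailing text, so "15000  # comment" still parses.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value ?? '', 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  const toFloat = (value, fallback) => {
    const parsed = Number.parseFloat(value ?? '');
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  // Assumption: a flag is on only when it is literally "true".
  const toBool = (value) => String(value).trim().toLowerCase() === 'true';

  const strategy = env.PROCESSING_STRATEGY ?? 'chunking';
  if (!['chunking', 'rag'].includes(strategy)) {
    throw new Error(`Unknown PROCESSING_STRATEGY: ${strategy}`);
  }

  return {
    strategy,
    enableRag: toBool(env.ENABLE_RAG_PROCESSING),
    enableComparison: toBool(env.ENABLE_PROCESSING_COMPARISON),
    llm: {
      // Fallbacks are the earlier values mentioned in the comments above.
      chunkSize: toInt(env.LLM_CHUNK_SIZE, 4000),
      maxTokens: toInt(env.LLM_MAX_TOKENS, 3500),
      maxInputTokens: toInt(env.LLM_MAX_INPUT_TOKENS, 180000),
      promptBuffer: toInt(env.LLM_PROMPT_BUFFER, 500),
      timeoutMs: toInt(env.LLM_TIMEOUT_MS, 120000),
      maxCostPerDocument: toFloat(env.LLM_MAX_COST_PER_DOCUMENT, 2.0),
    },
  };
}
```

Rejecting an unknown strategy at load time surfaces a typo in `PROCESSING_STRATEGY` at startup instead of during document processing.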

## Usage

### 1. Process Document with RAG

```javascript
// Using the unified processor
const result = await unifiedDocumentProcessor.processDocument(
  documentId,
  userId,
  documentText,
  { strategy: 'rag' }
);

console.log('RAG Processing Results:', {
  success: result.success,
  processingTime: result.processingTime,
  apiCalls: result.apiCalls,
  summary: result.summary
});
```

### 2. Compare Both Strategies

```javascript
const comparison = await unifiedDocumentProcessor.compareProcessingStrategies(
  documentId,
  userId,
  documentText
);

console.log('Comparison Results:', {
  winner: comparison.winner,
  timeDifference: comparison.performanceMetrics.timeDifference,
  apiCallDifference: comparison.performanceMetrics.apiCallDifference,
  qualityScore: comparison.performanceMetrics.qualityScore
});
```

### 3. API Endpoints

#### Process with RAG
```bash
POST /api/documents/{id}/process-rag
```

#### Compare Strategies
```bash
POST /api/documents/{id}/compare-strategies
```

#### Switch Strategy
```bash
POST /api/documents/{id}/switch-strategy
Content-Type: application/json

{
  "strategy": "rag"
}
```

Set `strategy` to `"rag"` or `"chunking"`.

#### Get Processing Stats
```bash
GET /api/documents/processing-stats
```
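
For calling these endpoints from a script, a small request builder along the following lines may help. It is a sketch only: `buildDocumentRequest` and `baseUrl` are illustrative names, authentication is left out, and the response format is not covered because it is not specified here.

```javascript
// Hypothetical helper: builds { url, init } pairs for the four endpoints above.
function buildDocumentRequest(baseUrl, action, documentId, strategy) {
  const root = `${baseUrl.replace(/\/+$/, '')}/api/documents`;
  const forDocument = `${root}/${encodeURIComponent(String(documentId))}/${action}`;

  switch (action) {
    case 'process-rag':
    case 'compare-strategies':
      return { url: forDocument, init: { method: 'POST' } };
    case 'switch-strategy':
      if (strategy !== 'rag' && strategy !== 'chunking') {
        throw new Error('strategy must be "rag" or "chunking"');
      }
      return {
        url: forDocument,
        init: {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ strategy }),
        },
      };
    case 'processing-stats':
      // The stats endpoint is not scoped to a single document.
      return { url: `${root}/processing-stats`, init: { method: 'GET' } };
    default:
      throw new Error(`Unknown action: ${action}`);
  }
}
```

The result can be passed straight to `fetch`, for example `const { url, init } = buildDocumentRequest('http://localhost:3000', 'process-rag', 'doc-1'); const response = await fetch(url, init);`, where the host and port are placeholders.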

## Processing Flow

### RAG Approach
1. **Document Segmentation** - Identify logical sections (executive summary, business description, financials, etc.)
2. **Key Metrics Extraction** - Extract financial and business metrics from each section
3. **Query-Based Analysis** - Process 6 focused queries for BPCP template sections
4. **Context Synthesis** - Combine results with full document context
5. **Final Summary** - Generate comprehensive markdown summary
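
The five steps above can be sketched as follows. This is a simplified illustration, not the code in `ragDocumentProcessor.ts`: the heading patterns, the metric regex, the query shape (`id`, `question`, `sections`), and the injected `callLlm(prompt)` function are all assumptions, and steps 4 and 5 are folded into a single synthesis call.

```javascript
// Placeholder heading patterns; the real section list is not documented here.
const SECTION_PATTERNS = {
  executiveSummary: /executive summary/i,
  businessDescription: /business (description|overview)/i,
  financials: /financial/i,
};

// Step 1: start a new section at each short line that matches a known heading.
// (A short body line containing one of these phrases would also split; a real
// segmenter needs a stricter heading test.)
function segmentDocument(text) {
  const sections = [];
  let current = { name: 'preamble', lines: [] };
  for (const line of text.split('\n')) {
    const hit = Object.entries(SECTION_PATTERNS)
      .find(([, pattern]) => line.length < 80 && pattern.test(line));
    if (hit) {
      sections.push(current);
      current = { name: hit[0], lines: [] };
    } else {
      current.lines.push(line);
    }
  }
  sections.push(current);
  return sections
    .map((s) => ({ name: s.name, text: s.lines.join('\n').trim() }))
    .filter((s) => s.text.length > 0);
}

// Step 2: pull dollar figures such as "$62M" out of a section.
function extractMetrics(section) {
  return section.text.match(/\$\d+(?:\.\d+)?[KMB]?/g) || [];
}

// Steps 3-5: one LLM call per query, then one synthesis call for the summary.
async function processWithRag(text, queries, callLlm) {
  const sections = segmentDocument(text);
  const metrics = sections.flatMap(extractMetrics);
  const answers = [];
  for (const query of queries) {
    const relevant = sections.filter((s) => query.sections.includes(s.name));
    // Fall back to the whole document when no section matches the query.
    const context = relevant.map((s) => s.text).join('\n\n') || text;
    const answer = await callLlm(`${query.question}\n\nContext:\n${context}`);
    answers.push(`## ${query.id}\n${answer}`);
  }
  const summary = await callLlm(
    `Combine into one markdown summary.\nMetrics: ${metrics.join(', ')}\n\n${answers.join('\n\n')}`
  );
  return { summary, metrics, apiCalls: queries.length + 1 };
}
```

With six queries this sketch makes seven LLM calls (six analyses plus one synthesis), which falls inside the 6-8 call range quoted for RAG.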

### Comparison with Chunking

| Aspect | Chunking | RAG |
|--------|----------|-----|
| **Processing** | 9 sequential chunks | 6 focused queries |
| **Context** | Fragmented per chunk | Full document context |
| **Quality** | Inconsistent across chunks | Consistent, focused analysis |
| **Cost** | High (9+ API calls) | Lower (6-8 API calls) |
| **Speed** | Slow (sequential) | Faster (parallel possible) |
| **Accuracy** | Context loss issues | Precise, relevant retrieval |

## Testing

### Run RAG Test
```bash
cd backend
npm run build
node test-rag-processing.js
```

### Expected Output
```
🚀 Testing RAG Processing Approach
==================================

📋 Testing RAG Processing...
✅ RAG Processing Results:
- Success: true
- Processing Time: 45000ms
- API Calls: 8
- Error: None

📊 Analysis Summary:
- Company: ABC Manufacturing
- Industry: Aerospace & Defense
- Revenue: $62M
- EBITDA: $12.1M

🔄 Testing Unified Processor Comparison...
✅ Comparison Results:
- Winner: rag
- Time Difference: -15000ms
- API Call Difference: -1
- Quality Score: 0.75
```

## Performance Metrics

### Quality Assessment
- **Summary Length** - Used as a rough proxy for comprehensiveness; longer summaries tend to cover more of the document
- **Markdown Structure** - Headers, lists, and formatting indicate better structure
- **Content Completeness** - Coverage of all BPCP template sections
- **Consistency** - No contradictions between sections
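
A score built from the first three criteria might look like the sketch below. The equal weights, the 2000-character length target, the ten-element structure target, and the name `assessSummaryQuality` are illustrative assumptions rather than the values used by the unified processor. Consistency is left out because detecting contradictions needs more than string heuristics.

```javascript
// Hypothetical scoring helper: returns a value between 0 and 1.
function assessSummaryQuality(summary, requiredSections) {
  // Summary length: saturates at an assumed 2000-character target.
  const lengthScore = Math.min(summary.length / 2000, 1);

  // Markdown structure: count headers and list items, saturating at 10.
  const headers = (summary.match(/^#{1,6}[ \t]+\S/gm) || []).length;
  const listItems = (summary.match(/^[ \t]*(?:[-*]|\d+\.)[ \t]+\S/gm) || []).length;
  const structureScore = Math.min((headers + listItems) / 10, 1);

  // Content completeness: share of required section names found in the text.
  const lower = summary.toLowerCase();
  const missing = requiredSections.filter((name) => !lower.includes(name.toLowerCase()));
  const completenessScore = requiredSections.length === 0
    ? 1
    : (requiredSections.length - missing.length) / requiredSections.length;

  // Assumption: the three criteria are weighted equally.
  const score = (lengthScore + structureScore + completenessScore) / 3;
  return { score, lengthScore, structureScore, completenessScore, missing };
}
```

Returning the per-criterion scores and the `missing` list alongside the total makes it easier to see why one strategy outscored the other.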

### Cost Analysis
- **API Calls** - RAG typically uses 6-8 calls vs 9+ for chunking
- **Token Usage** - More efficient token usage with focused queries
- **Processing Time** - Fewer calls shorten a sequential run, and the independent queries can also be parallelized (see Future Enhancements)

## Migration Strategy

### Phase 1: Parallel Testing
- Keep current chunking system
- Add RAG system alongside
- Use comparison endpoints to evaluate performance
- Collect statistics on both approaches

### Phase 2: Gradual Migration
- Switch to RAG for new documents
- Use comparison to validate results
- Monitor performance and quality metrics

### Phase 3: Full Migration
- Make RAG the default strategy
- Keep chunking as fallback option
- Optimize based on collected data
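
The Phase 3 arrangement (RAG by default, chunking as fallback) could be wrapped roughly as follows. This is a sketch: `processWithFallback` and the `strategyUsed` field are hypothetical, and `processor` is assumed to expose `processDocument(documentId, userId, documentText, options)` with a `success` flag on the result, as shown in the Usage section.

```javascript
// Hypothetical wrapper: try RAG first, fall back to chunking on failure.
async function processWithFallback(processor, documentId, userId, documentText) {
  try {
    const result = await processor.processDocument(
      documentId, userId, documentText, { strategy: 'rag' }
    );
    if (result.success) {
      return { ...result, strategyUsed: 'rag' };
    }
  } catch (error) {
    // A thrown error is treated the same as an unsuccessful result.
    console.warn(`RAG processing threw for ${documentId}: ${error.message}`);
  }

  const fallback = await processor.processDocument(
    documentId, userId, documentText, { strategy: 'chunking' }
  );
  return { ...fallback, strategyUsed: 'chunking' };
}
```

Recording which strategy produced the result keeps the statistics meaningful once fallbacks start occurring.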

## Troubleshooting

### Common Issues

1. **RAG Processing Fails**
   - Check LLM API configuration
   - Verify document text extraction
   - Review error logs for specific issues

2. **Poor Quality Results**
   - Adjust section relevance thresholds
   - Review query prompts
   - Check document structure

3. **High Processing Time**
   - Monitor API response times
   - Check network connectivity
   - Consider parallel processing optimization

### Debug Mode
```bash
# Enable debug logging
LOG_LEVEL=debug
ENABLE_PROCESSING_COMPARISON=true
```

## Future Enhancements

1. **Vector Embeddings** - Add semantic search capabilities
2. **Caching** - Cache section analysis for repeated queries
3. **Parallel Processing** - Process queries in parallel for speed
4. **Custom Queries** - Allow user-defined analysis queries
5. **Quality Feedback** - Learn from user feedback to improve prompts
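
For item 3, one possible shape is a small concurrency-limited runner, so that the focused queries run in parallel without exceeding API rate limits. The helper name `runWithConcurrency` and the idea of wrapping each query as a zero-argument async task are assumptions made for this sketch.

```javascript
// Runs async tasks with at most `limit` in flight; results keep input order.
async function runWithConcurrency(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next;
      next += 1;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(Math.max(limit, 1), tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A call such as `runWithConcurrency(queries.map((q) => () => analyze(q)), 3)`, where `analyze` is a placeholder for the per-query LLM call, would keep three queries in flight at a time. If any task rejects, the returned promise rejects as well, so callers that want partial results need to catch errors inside each task.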

## Support

For issues or questions about the RAG processing system:
1. Check the logs for detailed error information
2. Run the test script to validate functionality
3. Compare with chunking approach to identify issues
4. Review configuration settings