# LLM Service Documentation

## 📄 File Information

**File Path**: `backend/src/services/llmService.ts`
**File Type**: `TypeScript`
**Last Updated**: `2024-12-20`
**Version**: `1.0.0`
**Status**: `Active`

---

## 🎯 Purpose & Overview

**Primary Purpose**: Centralized service for all LLM (Large Language Model) interactions, providing intelligent model selection, prompt engineering, and structured output generation for CIM document analysis.

**Business Context**: Handles the AI-powered analysis of Confidential Information Memorandums by orchestrating interactions with Claude AI and OpenAI, ensuring optimal model selection, cost management, and quality output generation.

**Key Responsibilities**:
- Intelligent model selection based on task complexity
- Prompt engineering and system prompt management
- Multi-provider LLM integration (Claude AI, OpenAI)
- Structured output generation and validation
- Cost tracking and optimization
- Error handling and retry logic
- CIM-specific analysis and synthesis

---

## 🏗️ Architecture & Dependencies

### Dependencies

**Internal Dependencies**:
- `config/env.ts` - Environment configuration and API keys
- `logger.ts` - Structured logging utility
- `llmSchemas.ts` - CIM review data structure definitions and validation

**External Dependencies**:
- `@anthropic-ai/sdk` - Claude AI API client
- `openai` - OpenAI API client
- `zod` - TypeScript-first schema validation

### Integration Points
- **Input Sources**: Document text from processing services
- **Output Destinations**: Structured CIM analysis data, summaries, section analysis
- **Event Triggers**: Document analysis requests from processing pipeline
- **Event Listeners**: Analysis completion events, error events

---

## 🔧 Implementation Details

### Core Functions/Methods

#### `processCIMDocument`

```typescript
/**
 * @purpose Main entry point for CIM document processing with intelligent model selection
 * @context Called when document analysis is needed with structured output requirements
 * @inputs text: string, template: string, analysis?: Record
 * @outputs CIMAnalysisResult with structured data, cost tracking, and validation
 * @dependencies Claude AI/OpenAI APIs, schema validation, cost estimation
 * @errors API failures, validation errors, parsing errors
 * @complexity O(1) - Single LLM call with comprehensive prompt engineering
 */
```

**Example Usage**:

```typescript
const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate,
  { refinementMode: false, overviewMode: true }
);
```

#### `callLLM`

```typescript
/**
 * @purpose Generic LLM call method with provider abstraction
 * @context Called for all LLM interactions regardless of provider
 * @inputs request: LLMRequest with prompt and configuration
 * @outputs LLMResponse with content and usage metrics
 * @dependencies Provider-specific API clients
 * @errors API failures, rate limiting, authentication errors
 * @complexity O(1) - Direct API call with error handling
 */
```

#### `callAnthropic`

```typescript
/**
 * @purpose Claude AI specific API interactions
 * @context Called when using Claude AI as the LLM provider
 * @inputs request: LLMRequest with Claude-specific parameters
 * @outputs LLMResponse with Claude AI response and token usage
 * @dependencies @anthropic-ai/sdk
 * @errors Claude API failures, rate limiting, model errors
 * @complexity O(1) - Direct Claude API call
 */
```

#### `callOpenAI`

```typescript
/**
 * @purpose OpenAI specific API interactions
 * @context Called when using OpenAI as the LLM provider
 * @inputs request: LLMRequest with OpenAI-specific parameters
 * @outputs LLMResponse with OpenAI response and token usage
 * @dependencies openai
 * @errors OpenAI API failures, rate limiting, model errors
 * @complexity O(1) - Direct OpenAI API call
 */
```

### Data Structures

#### `LLMRequest`

```typescript
interface LLMRequest {
  prompt: string;        // Main prompt text
  systemPrompt?: string; // System prompt for context
  maxTokens?: number;    // Maximum tokens for response
  temperature?: number;  // Sampling temperature (0-2 for OpenAI, 0-1 for Claude)
  model?: string;        // Specific model to use
}
```
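The provider abstraction described for `callLLM` can be sketched as a simple dispatch over these shapes. This is a hedged illustration, not the service's actual implementation: the provider-specific calls are stubbed here (the real methods wrap the Anthropic and OpenAI SDKs), and a trimmed copy of the request/response interfaces is inlined to keep the sketch self-contained.

```typescript
// Trimmed copies of the service's request/response shapes.
interface LLMRequest {
  prompt: string;
  systemPrompt?: string;
  maxTokens?: number;
  temperature?: number;
  model?: string;
}

interface LLMResponse {
  success: boolean;
  content: string;
  error?: string;
}

type Provider = "anthropic" | "openai";

// Route a request to the provider-specific implementation and normalize
// failures into an unsuccessful LLMResponse instead of throwing.
async function callLLM(provider: Provider, request: LLMRequest): Promise<LLMResponse> {
  try {
    const content =
      provider === "anthropic"
        ? await stubAnthropic(request)
        : await stubOpenAI(request);
    return { success: true, content };
  } catch (err) {
    return {
      success: false,
      content: "",
      error: err instanceof Error ? err.message : String(err),
    };
  }
}

// Stubs standing in for the SDK-backed callAnthropic/callOpenAI methods.
async function stubAnthropic(req: LLMRequest): Promise<string> {
  return `anthropic:${req.prompt}`;
}
async function stubOpenAI(req: LLMRequest): Promise<string> {
  return `openai:${req.prompt}`;
}
```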
#### `LLMResponse`

```typescript
interface LLMResponse {
  success: boolean;           // Request success status
  content: string;            // LLM response content
  usage?: {                   // Token usage metrics
    promptTokens: number;     // Input tokens used
    completionTokens: number; // Output tokens used
    totalTokens: number;      // Total tokens used
  };
  error?: string;             // Error message if failed
}
```

#### `CIMAnalysisResult`

```typescript
interface CIMAnalysisResult {
  success: boolean;                // Analysis success status
  jsonOutput?: CIMReview;          // Structured analysis data
  error?: string;                  // Error message if failed
  model: string;                   // Model used for analysis
  cost: number;                    // Estimated cost in USD
  inputTokens: number;             // Input tokens consumed
  outputTokens: number;            // Output tokens consumed
  validationIssues?: z.ZodIssue[]; // Schema validation issues
}
```

### Configuration

```typescript
// Key configuration options
const LLM_CONFIG = {
  provider: 'anthropic',                  // LLM provider selection ('anthropic' | 'openai')
  defaultModel: 'claude-3-opus-20240229', // Default model for provider
  maxTokens: 3500,                        // Default max tokens
  temperature: 0.1,                       // Default temperature
  promptBuffer: 500,                      // Buffer for prompt engineering
  retryAttempts: 3,                       // Number of retry attempts
  costThreshold: 5.0,                     // Cost threshold per request (USD)
};
```

---

## 📊 Data Flow

### Input Processing
1. **Task Analysis**: Determine task complexity and requirements
2. **Model Selection**: Select optimal model based on complexity and tokens
3. **Prompt Engineering**: Build appropriate prompt based on analysis type
4. **System Prompt**: Generate context-appropriate system prompt
5. **Parameter Optimization**: Optimize temperature, tokens, and other parameters

### Processing Pipeline
1. **Provider Selection**: Route to the appropriate provider (Claude/OpenAI)
2. **API Call**: Execute LLM API call with retry logic
3. **Response Processing**: Extract and validate response content
4. **JSON Parsing**: Parse structured output from response
5. **Schema Validation**: Validate output against the CIM review schema
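Step 4 of the pipeline above usually has to tolerate responses that wrap JSON in markdown fences or surrounding prose. A tolerant extractor along those lines might look like this; `extractJson` is a hypothetical helper, not a function from the actual service:

```typescript
// Hypothetical helper for the "JSON Parsing" step: pull a JSON object out of
// an LLM response that may wrap it in a markdown fence or extra prose.
function extractJson(raw: string): unknown {
  // Prefer a fenced ```json block if one is present.
  const fenced = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  const candidate = fenced ? fenced[1] : raw;
  // Otherwise fall back to the outermost brace pair.
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in LLM response");
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

The parsed value is returned as `unknown` on purpose: the next pipeline step (schema validation) is what narrows it to a typed `CIMReview`.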
### Output Generation
1. **Content Extraction**: Extract structured data from LLM response
2. **Cost Calculation**: Calculate and track API usage costs
3. **Validation**: Validate output against expected schema
4. **Error Handling**: Handle parsing and validation errors
5. **Result Formatting**: Format final analysis result

### Data Transformations
- `Document Text` → `Task Analysis` → `Model Selection` → `Prompt Engineering` → `LLM Response`
- `Analysis Requirements` → `Prompt Strategy` → `System Context` → `Structured Output`
- `Raw Response` → `JSON Parsing` → `Schema Validation` → `Validated Data`

---

## 🚨 Error Handling

### Error Types

```typescript
/**
 * @errorType API_ERROR
 * @description LLM API call failed due to network or service issues
 * @recoverable true
 * @retryStrategy exponential_backoff
 * @userMessage "LLM service temporarily unavailable"
 */

/**
 * @errorType RATE_LIMIT_ERROR
 * @description API rate limit exceeded
 * @recoverable true
 * @retryStrategy exponential_backoff
 * @userMessage "Rate limit exceeded, retrying shortly"
 */

/**
 * @errorType VALIDATION_ERROR
 * @description LLM response failed schema validation
 * @recoverable true
 * @retryStrategy retry_with_different_prompt
 * @userMessage "Response validation failed, retrying with improved prompt"
 */

/**
 * @errorType PARSING_ERROR
 * @description Failed to parse JSON from LLM response
 * @recoverable true
 * @retryStrategy retry_with_json_formatting
 * @userMessage "Response parsing failed, retrying with JSON formatting"
 */
```

### Error Recovery
- **API Errors**: Implement exponential backoff and retry logic
- **Rate Limit Errors**: Respect rate limits and implement backoff
- **Validation Errors**: Retry with improved prompts and formatting
- **Parsing Errors**: Retry with explicit JSON formatting instructions
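The exponential-backoff strategy used for API and rate-limit errors can be sketched as a generic wrapper. This is an illustrative sketch, not the service's actual retry code; `withRetry` and its parameters are assumptions, and the delays are parameterized so the example stays testable:

```typescript
// Sketch of exponential-backoff retry: re-run fn up to `attempts` times,
// doubling the delay between attempts (1x, 2x, 4x, ... the base delay).
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

A production version would typically also inspect the error (retrying a 429 or network timeout, but not an authentication failure) before deciding to back off.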
### Fallback Strategies
- **Primary Strategy**: Claude AI with comprehensive prompts
- **Fallback Strategy**: OpenAI with similar prompts
- **Degradation Strategy**: Simplified analysis with basic prompts

---

## 🧪 Testing

### Test Coverage
- **Unit Tests**: 95% - Core LLM interaction logic and prompt engineering
- **Integration Tests**: 90% - End-to-end LLM processing workflows
- **Performance Tests**: API response time and cost optimization

### Test Data

```typescript
/**
 * @testData sample_cim_text.txt
 * @description Standard CIM document text for testing
 * @size 10KB
 * @sections Financial, Market, Management
 * @expectedOutput Valid CIMReview with all sections populated
 */

/**
 * @testData complex_cim_text.txt
 * @description Complex CIM document for model selection testing
 * @size 50KB
 * @sections Comprehensive business analysis
 * @expectedOutput Complex analysis with appropriate model selection
 */

/**
 * @testData malformed_response.json
 * @description Malformed LLM response for error handling testing
 * @size 2KB
 * @format Invalid JSON structure
 * @expectedOutput Proper error handling and retry logic
 */
```

### Mock Strategy
- **External APIs**: Mock Claude AI and OpenAI responses
- **Schema Validation**: Mock validation scenarios and error cases
- **Cost Tracking**: Mock token usage and cost calculations

---

## 📈 Performance Characteristics

### Performance Metrics
- **Average Response Time**: 10-30 seconds per LLM call
- **Token Usage**: 1,000-4,000 tokens per analysis
- **Cost per Analysis**: $0.01-$0.10 per document
- **Success Rate**: 95%+ with retry logic
- **Validation Success**: 90%+ with prompt engineering

### Optimization Strategies
- **Model Selection**: Intelligent model selection based on task complexity
- **Prompt Engineering**: Optimized prompts for better response quality
- **Cost Management**: Token usage optimization and cost tracking
- **Caching**: Cache similar requests to reduce API calls
- **Batch Processing**: Process multiple sections in single requests
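The caching strategy above can be sketched as a memo table keyed by a hash of the model and prompt, so repeated analyses of identical inputs skip the API entirely. This is an assumed design, not the service's actual cache: an in-memory `Map` is used here, whereas a deployment might prefer Redis with a TTL.

```typescript
import { createHash } from "node:crypto";

// In-memory cache of LLM responses keyed by (model, prompt).
const responseCache = new Map<string, string>();

function cacheKey(model: string, prompt: string): string {
  // NUL separator prevents ("a", "bc") colliding with ("ab", "c").
  return createHash("sha256").update(`${model}\u0000${prompt}`).digest("hex");
}

// Wrap an API call so identical requests are only sent once.
async function cachedCall(
  model: string,
  prompt: string,
  call: (prompt: string) => Promise<string>,
): Promise<string> {
  const key = cacheKey(model, prompt);
  const hit = responseCache.get(key);
  if (hit !== undefined) return hit;
  const content = await call(prompt);
  responseCache.set(key, content);
  return content;
}
```

Note that caching only pays off for deterministic settings (the service's default temperature of 0.1 is close to that); high-temperature requests are intentionally non-repeatable.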
### Scalability Limits
- **API Rate Limits**: Respect provider-specific rate limits
- **Cost Limits**: Maximum cost per request and daily limits
- **Token Limits**: Maximum input/output token limits per model
- **Concurrent Requests**: Limit concurrent API calls

---

## 🔍 Debugging & Monitoring

### Logging

```typescript
/**
 * @logging Structured logging with detailed LLM interaction metrics
 * @levels debug, info, warn, error
 * @correlation Request ID and model tracking
 * @context Prompt engineering, model selection, cost tracking, validation
 */
```

### Debug Tools
- **Prompt Analysis**: Detailed prompt engineering and system prompt analysis
- **Model Selection**: Model selection logic and reasoning
- **Cost Tracking**: Detailed cost analysis and optimization
- **Response Validation**: Schema validation and error analysis

### Common Issues
1. **API Failures**: Monitor API health and implement proper retry logic
2. **Rate Limiting**: Implement proper rate limiting and backoff strategies
3. **Validation Errors**: Improve prompt engineering for better response quality
4. **Cost Optimization**: Monitor and optimize token usage and model selection
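Monitoring cost, as in issue 4 above, reduces to a per-token price table. The sketch below shows the arithmetic; the prices are illustrative placeholders (USD per 1K tokens), not current provider rates, and real code should load them from configuration rather than hard-coding them:

```typescript
// Illustrative per-1K-token prices; NOT authoritative or current rates.
const PRICE_PER_1K: Record<string, { input: number; output: number }> = {
  "claude-3-opus-20240229": { input: 0.015, output: 0.075 },
  "gpt-4": { input: 0.03, output: 0.06 },
};

// Estimated request cost in USD, as reported in CIMAnalysisResult.cost.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICE_PER_1K[model];
  if (!price) throw new Error(`Unknown model: ${model}`);
  return (inputTokens / 1000) * price.input + (outputTokens / 1000) * price.output;
}
```

An estimate like this is also what a `costThreshold` check (see the configuration section) would compare against before dispatching a request.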
---

## 🔐 Security Considerations

### Input Validation
- **Text Content**: Sanitization of input text to prevent prompt injection
- **API Keys**: Secure storage and rotation of API keys
- **Request Validation**: Validation of all input parameters

### Authentication & Authorization
- **API Access**: Secure access to LLM provider APIs
- **Key Management**: Secure API key management and rotation
- **Request Logging**: Secure logging of requests and responses

### Data Protection
- **Text Processing**: Secure handling of sensitive document content
- **Response Storage**: Secure storage of LLM responses and analysis
- **Cost Tracking**: Secure tracking and reporting of API usage costs

---

## 📚 Related Documentation

### Internal References
- `optimizedAgenticRAGProcessor.ts` - Uses this service for LLM analysis
- `llmSchemas.ts` - CIM review data structure definitions
- `config/env.ts` - Environment configuration and API keys
- `logger.ts` - Structured logging utility

### External References
- [Claude AI API Documentation](https://docs.anthropic.com/)
- [OpenAI API Documentation](https://platform.openai.com/docs)
- [Zod Schema Validation](https://zod.dev/)

---

## 🔄 Change History

### Recent Changes
- `2024-12-20` - Implemented intelligent model selection and cost tracking - `[Author]`
- `2024-12-15` - Added comprehensive prompt engineering and validation - `[Author]`
- `2024-12-10` - Implemented multi-provider support (Claude AI, OpenAI) - `[Author]`

### Planned Changes
- Advanced prompt engineering improvements - `2025-01-15`
- Multi-language support for international documents - `2025-01-30`
- Enhanced cost optimization and caching - `2025-02-15`

---

## 📋 Usage Examples

### Basic Usage

```typescript
import { LLMService } from './llmService';

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate
);

if (result.success) {
  console.log('Analysis completed:', result.jsonOutput);
  console.log('Cost:', result.cost, 'USD');
  console.log('Tokens used:', result.inputTokens + result.outputTokens);
} else {
  console.error('Analysis failed:', result.error);
}
```
### Advanced Usage

```typescript
import { LLMService } from './llmService';

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate,
  {
    refinementMode: true,
    overviewMode: false,
    sectionType: 'financial'
  }
);

// Monitor detailed metrics
console.log('Model used:', result.model);
console.log('Input tokens:', result.inputTokens);
console.log('Output tokens:', result.outputTokens);
console.log('Total cost:', result.cost, 'USD');
```

### Error Handling

```typescript
try {
  const result = await llmService.processCIMDocument(
    documentText,
    cimTemplate
  );

  if (!result.success) {
    logger.error('LLM analysis failed', {
      error: result.error,
      model: result.model,
      cost: result.cost
    });
  }

  if (result.validationIssues) {
    logger.warn('Validation issues found', {
      issues: result.validationIssues
    });
  }
} catch (error) {
  // `error` is typed `unknown` in a catch clause, so narrow before reading .message
  logger.error('Unexpected error during LLM analysis', {
    error: error instanceof Error ? error.message : String(error)
  });
}
```

---

## 🎯 LLM Agent Notes

### Key Understanding Points
- This service is the central hub for all LLM interactions in the system
- Implements intelligent model selection based on task complexity
- Provides comprehensive prompt engineering for different analysis types
- Handles multi-provider support (Claude AI, OpenAI) with fallback logic
- Includes cost tracking and optimization for API usage

### Common Modifications
- **Adding new providers**: Implement new provider methods and update selection logic
- **Modifying prompt engineering**: Update prompt-building methods for different analysis types
- **Adjusting model selection**: Modify the `selectModel` method for different complexity criteria
- **Enhancing validation**: Extend schema validation and error handling
- **Optimizing costs**: Adjust cost thresholds and token optimization strategies
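A `selectModel` heuristic along the lines described above might route long or refinement-heavy tasks to a stronger (pricier) model and everything else to a cheaper one. This is a hedged sketch: the token threshold, the `TaskProfile` shape, and the choice of fallback model are illustrative assumptions, not the service's actual selection criteria.

```typescript
// Assumed task descriptor; the real service derives this from the document
// and processing options.
interface TaskProfile {
  inputTokens: number;
  refinementMode: boolean;
}

// Pick a model by task complexity: strong model for long/refinement work,
// cheaper model for routine extraction. Threshold is an assumption.
function selectModel(task: TaskProfile): string {
  const COMPLEX_TOKEN_THRESHOLD = 8000;
  if (task.refinementMode || task.inputTokens > COMPLEX_TOKEN_THRESHOLD) {
    return "claude-3-opus-20240229";
  }
  return "claude-3-haiku-20240307";
}
```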
### Integration Patterns
- **Strategy Pattern**: Different providers and models for different tasks
- **Factory Pattern**: Creating different types of prompts and system contexts
- **Observer Pattern**: Cost tracking and performance monitoring
- **Chain of Responsibility**: Retry logic and fallback strategies

---

This documentation provides comprehensive information about the LLM service, enabling LLM agents to understand its purpose, implementation, and usage patterns for effective code evaluation and modification.