# LLM Service Documentation

## 📄 File Information

**File Path**: `backend/src/services/llmService.ts`
**File Type**: `TypeScript`
**Last Updated**: `2024-12-20`
**Version**: `1.0.0`
**Status**: `Active`

---

## 🎯 Purpose & Overview

**Primary Purpose**: Centralized service for all LLM (Large Language Model) interactions, providing intelligent model selection, prompt engineering, and structured output generation for CIM document analysis.

**Business Context**: Handles the AI-powered analysis of Confidential Information Memorandums by orchestrating interactions with Claude AI and OpenAI, ensuring optimal model selection, cost management, and quality output generation.

**Key Responsibilities**:
- Intelligent model selection based on task complexity
- Prompt engineering and system prompt management
- Multi-provider LLM integration (Claude AI, OpenAI)
- Structured output generation and validation
- Cost tracking and optimization
- Error handling and retry logic
- CIM-specific analysis and synthesis

---

## 🏗️ Architecture & Dependencies

### Dependencies

**Internal Dependencies**:
- `config/env.ts` - Environment configuration and API keys
- `logger.ts` - Structured logging utility
- `llmSchemas.ts` - CIM review data structure definitions and validation

**External Dependencies**:
- `@anthropic-ai/sdk` - Claude AI API client
- `openai` - OpenAI API client
- `zod` - TypeScript-first schema validation

### Integration Points
- **Input Sources**: Document text from processing services
- **Output Destinations**: Structured CIM analysis data, summaries, section analysis
- **Event Triggers**: Document analysis requests from processing pipeline
- **Event Listeners**: Analysis completion events, error events

---

## 🔧 Implementation Details

### Core Functions/Methods

#### `processCIMDocument`

```typescript
/**
 * @purpose Main entry point for CIM document processing with intelligent model selection
 * @context Called when document analysis is needed with structured output requirements
 * @inputs text: string, template: string, analysis?: Record
 * @outputs CIMAnalysisResult with structured data, cost tracking, and validation
 * @dependencies Claude AI/OpenAI APIs, schema validation, cost estimation
 * @errors API failures, validation errors, parsing errors
 * @complexity O(1) - Single LLM call with comprehensive prompt engineering
 */
```

**Example Usage**:

```typescript
const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate,
  { refinementMode: false, overviewMode: true }
);
```

#### `callLLM`

```typescript
/**
 * @purpose Generic LLM call method with provider abstraction
 * @context Called for all LLM interactions regardless of provider
 * @inputs request: LLMRequest with prompt and configuration
 * @outputs LLMResponse with content and usage metrics
 * @dependencies Provider-specific API clients
 * @errors API failures, rate limiting, authentication errors
 * @complexity O(1) - Direct API call with error handling
 */
```

#### `callAnthropic`

```typescript
/**
 * @purpose Claude AI specific API interactions
 * @context Called when using Claude AI as the LLM provider
 * @inputs request: LLMRequest with Claude-specific parameters
 * @outputs LLMResponse with Claude AI response and token usage
 * @dependencies @anthropic-ai/sdk
 * @errors Claude API failures, rate limiting, model errors
 * @complexity O(1) - Direct Claude API call
 */
```

#### `callOpenAI`

```typescript
/**
 * @purpose OpenAI specific API interactions
 * @context Called when using OpenAI as the LLM provider
 * @inputs request: LLMRequest with OpenAI-specific parameters
 * @outputs LLMResponse with OpenAI response and token usage
 * @dependencies openai
 * @errors OpenAI API failures, rate limiting, model errors
 * @complexity O(1) - Direct OpenAI API call
 */
```

### Data Structures

#### `LLMRequest`

```typescript
interface LLMRequest {
  prompt: string;        // Main prompt text
  systemPrompt?: string; // System prompt for context
  maxTokens?: number;    // Maximum tokens for response
  temperature?: number;  // Sampling temperature (0-2 for OpenAI, 0-1 for Claude)
  model?: string;        // Specific model to use
}
```
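The provider abstraction described for `callLLM` can be sketched as a simple dispatch over these shapes. This is a hedged illustration, not the service's actual implementation: the provider-specific calls are stubbed here (the real methods wrap the Anthropic and OpenAI SDKs), and a trimmed copy of the request/response interfaces is inlined to keep the sketch self-contained.

```typescript
// Trimmed copies of the service's request/response shapes.
interface LLMRequest {
  prompt: string;
  systemPrompt?: string;
  maxTokens?: number;
  temperature?: number;
  model?: string;
}

interface LLMResponse {
  success: boolean;
  content: string;
  error?: string;
}

type Provider = "anthropic" | "openai";

// Route a request to the provider-specific implementation and normalize
// failures into an unsuccessful LLMResponse instead of throwing.
async function callLLM(provider: Provider, request: LLMRequest): Promise<LLMResponse> {
  try {
    const content =
      provider === "anthropic"
        ? await stubAnthropic(request)
        : await stubOpenAI(request);
    return { success: true, content };
  } catch (err) {
    return {
      success: false,
      content: "",
      error: err instanceof Error ? err.message : String(err),
    };
  }
}

// Stubs standing in for the SDK-backed callAnthropic/callOpenAI methods.
async function stubAnthropic(req: LLMRequest): Promise<string> {
  return `anthropic:${req.prompt}`;
}
async function stubOpenAI(req: LLMRequest): Promise<string> {
  return `openai:${req.prompt}`;
}
```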
#### `LLMResponse`

```typescript
interface LLMResponse {
  success: boolean;           // Request success status
  content: string;            // LLM response content
  usage?: {                   // Token usage metrics
    promptTokens: number;     // Input tokens used
    completionTokens: number; // Output tokens used
    totalTokens: number;      // Total tokens used
  };
  error?: string;             // Error message if failed
}
```

#### `CIMAnalysisResult`

```typescript
interface CIMAnalysisResult {
  success: boolean;                // Analysis success status
  jsonOutput?: CIMReview;          // Structured analysis data
  error?: string;                  // Error message if failed
  model: string;                   // Model used for analysis
  cost: number;                    // Estimated cost in USD
  inputTokens: number;             // Input tokens consumed
  outputTokens: number;            // Output tokens consumed
  validationIssues?: z.ZodIssue[]; // Schema validation issues
}
```

### Configuration

```typescript
// Key configuration options
const LLM_CONFIG = {
  provider: 'anthropic',                  // LLM provider selection ('anthropic' | 'openai')
  defaultModel: 'claude-3-opus-20240229', // Default model for provider
  maxTokens: 3500,                        // Default max tokens
  temperature: 0.1,                       // Default temperature
  promptBuffer: 500,                      // Buffer for prompt engineering
  retryAttempts: 3,                       // Number of retry attempts
  costThreshold: 5.0,                     // Cost threshold per request (USD)
};
```

---

## 📊 Data Flow

### Input Processing
1. **Task Analysis**: Determine task complexity and requirements
2. **Model Selection**: Select optimal model based on complexity and tokens
3. **Prompt Engineering**: Build appropriate prompt based on analysis type
4. **System Prompt**: Generate context-appropriate system prompt
5. **Parameter Optimization**: Optimize temperature, tokens, and other parameters

### Processing Pipeline
1. **Provider Selection**: Route to the appropriate provider (Claude/OpenAI)
2. **API Call**: Execute LLM API call with retry logic
3. **Response Processing**: Extract and validate response content
4. **JSON Parsing**: Parse structured output from response
5. **Schema Validation**: Validate output against the CIM review schema
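Step 4 of the pipeline above usually has to tolerate responses that wrap JSON in markdown fences or surrounding prose. A tolerant extractor along those lines might look like this; `extractJson` is a hypothetical helper, not a function from the actual service:

```typescript
// Hypothetical helper for the "JSON Parsing" step: pull a JSON object out of
// an LLM response that may wrap it in a markdown fence or extra prose.
function extractJson(raw: string): unknown {
  // Prefer a fenced ```json block if one is present.
  const fenced = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  const candidate = fenced ? fenced[1] : raw;
  // Otherwise fall back to the outermost brace pair.
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in LLM response");
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

The parsed value is returned as `unknown` on purpose: the next pipeline step (schema validation) is what narrows it to a typed `CIMReview`.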
### Output Generation
1. **Content Extraction**: Extract structured data from LLM response
2. **Cost Calculation**: Calculate and track API usage costs
3. **Validation**: Validate output against expected schema
4. **Error Handling**: Handle parsing and validation errors
5. **Result Formatting**: Format final analysis result

### Data Transformations
- `Document Text` → `Task Analysis` → `Model Selection` → `Prompt Engineering` → `LLM Response`
- `Analysis Requirements` → `Prompt Strategy` → `System Context` → `Structured Output`
- `Raw Response` → `JSON Parsing` → `Schema Validation` → `Validated Data`

---

## 🚨 Error Handling

### Error Types

```typescript
/**
 * @errorType API_ERROR
 * @description LLM API call failed due to network or service issues
 * @recoverable true
 * @retryStrategy exponential_backoff
 * @userMessage "LLM service temporarily unavailable"
 */

/**
 * @errorType RATE_LIMIT_ERROR
 * @description API rate limit exceeded
 * @recoverable true
 * @retryStrategy exponential_backoff
 * @userMessage "Rate limit exceeded, retrying shortly"
 */

/**
 * @errorType VALIDATION_ERROR
 * @description LLM response failed schema validation
 * @recoverable true
 * @retryStrategy retry_with_different_prompt
 * @userMessage "Response validation failed, retrying with improved prompt"
 */

/**
 * @errorType PARSING_ERROR
 * @description Failed to parse JSON from LLM response
 * @recoverable true
 * @retryStrategy retry_with_json_formatting
 * @userMessage "Response parsing failed, retrying with JSON formatting"
 */
```

### Error Recovery
- **API Errors**: Implement exponential backoff and retry logic
- **Rate Limit Errors**: Respect rate limits and implement backoff
- **Validation Errors**: Retry with improved prompts and formatting
- **Parsing Errors**: Retry with explicit JSON formatting instructions
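The exponential-backoff strategy used for API and rate-limit errors can be sketched as a generic wrapper. This is an illustrative sketch, not the service's actual retry code; `withRetry` and its parameters are assumptions, and the delays are parameterized so the example stays testable:

```typescript
// Sketch of exponential-backoff retry: re-run fn up to `attempts` times,
// doubling the delay between attempts (1x, 2x, 4x, ... the base delay).
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

A production version would typically also inspect the error (retrying a 429 or network timeout, but not an authentication failure) before deciding to back off.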
### Fallback Strategies
- **Primary Strategy**: Claude AI with comprehensive prompts
- **Fallback Strategy**: OpenAI with similar prompts
- **Degradation Strategy**: Simplified analysis with basic prompts

---

## 🧪 Testing

### Test Coverage
- **Unit Tests**: 95% - Core LLM interaction logic and prompt engineering
- **Integration Tests**: 90% - End-to-end LLM processing workflows
- **Performance Tests**: API response time and cost optimization

### Test Data

```typescript
/**
 * @testData sample_cim_text.txt
 * @description Standard CIM document text for testing
 * @size 10KB
 * @sections Financial, Market, Management
 * @expectedOutput Valid CIMReview with all sections populated
 */

/**
 * @testData complex_cim_text.txt
 * @description Complex CIM document for model selection testing
 * @size 50KB
 * @sections Comprehensive business analysis
 * @expectedOutput Complex analysis with appropriate model selection
 */

/**
 * @testData malformed_response.json
 * @description Malformed LLM response for error handling testing
 * @size 2KB
 * @format Invalid JSON structure
 * @expectedOutput Proper error handling and retry logic
 */
```

### Mock Strategy
- **External APIs**: Mock Claude AI and OpenAI responses
- **Schema Validation**: Mock validation scenarios and error cases
- **Cost Tracking**: Mock token usage and cost calculations

---

## 📈 Performance Characteristics

### Performance Metrics
- **Average Response Time**: 10-30 seconds per LLM call
- **Token Usage**: 1,000-4,000 tokens per analysis
- **Cost per Analysis**: $0.01-$0.10 per document
- **Success Rate**: 95%+ with retry logic
- **Validation Success**: 90%+ with prompt engineering

### Optimization Strategies
- **Model Selection**: Intelligent model selection based on task complexity
- **Prompt Engineering**: Optimized prompts for better response quality
- **Cost Management**: Token usage optimization and cost tracking
- **Caching**: Cache similar requests to reduce API calls
- **Batch Processing**: Process multiple sections in single requests
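The caching strategy above can be sketched as a memo table keyed by a hash of the model and prompt, so repeated analyses of identical inputs skip the API entirely. This is an assumed design, not the service's actual cache: an in-memory `Map` is used here, whereas a deployment might prefer Redis with a TTL.

```typescript
import { createHash } from "node:crypto";

// In-memory cache of LLM responses keyed by (model, prompt).
const responseCache = new Map<string, string>();

function cacheKey(model: string, prompt: string): string {
  // NUL separator prevents ("a", "bc") colliding with ("ab", "c").
  return createHash("sha256").update(`${model}\u0000${prompt}`).digest("hex");
}

// Wrap an API call so identical requests are only sent once.
async function cachedCall(
  model: string,
  prompt: string,
  call: (prompt: string) => Promise<string>,
): Promise<string> {
  const key = cacheKey(model, prompt);
  const hit = responseCache.get(key);
  if (hit !== undefined) return hit;
  const content = await call(prompt);
  responseCache.set(key, content);
  return content;
}
```

Note that caching only pays off for deterministic settings (the service's default temperature of 0.1 is close to that); high-temperature requests are intentionally non-repeatable.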
### Scalability Limits
- **API Rate Limits**: Respect provider-specific rate limits
- **Cost Limits**: Maximum cost per request and daily limits
- **Token Limits**: Maximum input/output token limits per model
- **Concurrent Requests**: Limit concurrent API calls

---

## 🔍 Debugging & Monitoring

### Logging

```typescript
/**
 * @logging Structured logging with detailed LLM interaction metrics
 * @levels debug, info, warn, error
 * @correlation Request ID and model tracking
 * @context Prompt engineering, model selection, cost tracking, validation
 */
```

### Debug Tools
- **Prompt Analysis**: Detailed prompt engineering and system prompt analysis
- **Model Selection**: Model selection logic and reasoning
- **Cost Tracking**: Detailed cost analysis and optimization
- **Response Validation**: Schema validation and error analysis

### Common Issues
1. **API Failures**: Monitor API health and implement proper retry logic
2. **Rate Limiting**: Implement proper rate limiting and backoff strategies
3. **Validation Errors**: Improve prompt engineering for better response quality
4. **Cost Optimization**: Monitor and optimize token usage and model selection
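Monitoring cost, as in issue 4 above, reduces to a per-token price table. The sketch below shows the arithmetic; the prices are illustrative placeholders (USD per 1K tokens), not current provider rates, and real code should load them from configuration rather than hard-coding them:

```typescript
// Illustrative per-1K-token prices; NOT authoritative or current rates.
const PRICE_PER_1K: Record<string, { input: number; output: number }> = {
  "claude-3-opus-20240229": { input: 0.015, output: 0.075 },
  "gpt-4": { input: 0.03, output: 0.06 },
};

// Estimated request cost in USD, as reported in CIMAnalysisResult.cost.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICE_PER_1K[model];
  if (!price) throw new Error(`Unknown model: ${model}`);
  return (inputTokens / 1000) * price.input + (outputTokens / 1000) * price.output;
}
```

An estimate like this is also what a `costThreshold` check (see the configuration section) would compare against before dispatching a request.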
---

## 🔐 Security Considerations

### Input Validation
- **Text Content**: Sanitization of input text to prevent prompt injection
- **API Keys**: Secure storage and rotation of API keys
- **Request Validation**: Validation of all input parameters

### Authentication & Authorization
- **API Access**: Secure access to LLM provider APIs
- **Key Management**: Secure API key management and rotation
- **Request Logging**: Secure logging of requests and responses

### Data Protection
- **Text Processing**: Secure handling of sensitive document content
- **Response Storage**: Secure storage of LLM responses and analysis
- **Cost Tracking**: Secure tracking and reporting of API usage costs

---

## 📚 Related Documentation

### Internal References
- `optimizedAgenticRAGProcessor.ts` - Uses this service for LLM analysis
- `llmSchemas.ts` - CIM review data structure definitions
- `config/env.ts` - Environment configuration and API keys
- `logger.ts` - Structured logging utility

### External References
- [Claude AI API Documentation](https://docs.anthropic.com/)
- [OpenAI API Documentation](https://platform.openai.com/docs)
- [Zod Schema Validation](https://zod.dev/)

---

## 🔄 Change History

### Recent Changes
- `2024-12-20` - Implemented intelligent model selection and cost tracking - `[Author]`
- `2024-12-15` - Added comprehensive prompt engineering and validation - `[Author]`
- `2024-12-10` - Implemented multi-provider support (Claude AI, OpenAI) - `[Author]`

### Planned Changes
- Advanced prompt engineering improvements - `2025-01-15`
- Multi-language support for international documents - `2025-01-30`
- Enhanced cost optimization and caching - `2025-02-15`

---

## 📋 Usage Examples

### Basic Usage

```typescript
import { LLMService } from './llmService';

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate
);

if (result.success) {
  console.log('Analysis completed:', result.jsonOutput);
  console.log('Cost:', result.cost, 'USD');
  console.log('Tokens used:', result.inputTokens + result.outputTokens);
} else {
  console.error('Analysis failed:', result.error);
}
```
### Advanced Usage

```typescript
import { LLMService } from './llmService';

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate,
  {
    refinementMode: true,
    overviewMode: false,
    sectionType: 'financial'
  }
);

// Monitor detailed metrics
console.log('Model used:', result.model);
console.log('Input tokens:', result.inputTokens);
console.log('Output tokens:', result.outputTokens);
console.log('Total cost:', result.cost, 'USD');
```

### Error Handling

```typescript
try {
  const result = await llmService.processCIMDocument(
    documentText,
    cimTemplate
  );

  if (!result.success) {
    logger.error('LLM analysis failed', {
      error: result.error,
      model: result.model,
      cost: result.cost
    });
  }

  if (result.validationIssues) {
    logger.warn('Validation issues found', {
      issues: result.validationIssues
    });
  }
} catch (error) {
  // `error` is typed `unknown` in a catch clause, so narrow before reading .message
  logger.error('Unexpected error during LLM analysis', {
    error: error instanceof Error ? error.message : String(error)
  });
}
```

---

## 🎯 LLM Agent Notes

### Key Understanding Points
- This service is the central hub for all LLM interactions in the system
- Implements intelligent model selection based on task complexity
- Provides comprehensive prompt engineering for different analysis types
- Handles multi-provider support (Claude AI, OpenAI) with fallback logic
- Includes cost tracking and optimization for API usage

### Common Modifications
- **Adding new providers**: Implement new provider methods and update selection logic
- **Modifying prompt engineering**: Update prompt-building methods for different analysis types
- **Adjusting model selection**: Modify the `selectModel` method for different complexity criteria
- **Enhancing validation**: Extend schema validation and error handling
- **Optimizing costs**: Adjust cost thresholds and token optimization strategies
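A `selectModel` heuristic along the lines described above might route long or refinement-heavy tasks to a stronger (pricier) model and everything else to a cheaper one. This is a hedged sketch: the token threshold, the `TaskProfile` shape, and the choice of fallback model are illustrative assumptions, not the service's actual selection criteria.

```typescript
// Assumed task descriptor; the real service derives this from the document
// and processing options.
interface TaskProfile {
  inputTokens: number;
  refinementMode: boolean;
}

// Pick a model by task complexity: strong model for long/refinement work,
// cheaper model for routine extraction. Threshold is an assumption.
function selectModel(task: TaskProfile): string {
  const COMPLEX_TOKEN_THRESHOLD = 8000;
  if (task.refinementMode || task.inputTokens > COMPLEX_TOKEN_THRESHOLD) {
    return "claude-3-opus-20240229";
  }
  return "claude-3-haiku-20240307";
}
```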
### Integration Patterns
- **Strategy Pattern**: Different providers and models for different tasks
- **Factory Pattern**: Creating different types of prompts and system contexts
- **Observer Pattern**: Cost tracking and performance monitoring
- **Chain of Responsibility**: Retry logic and fallback strategies

---

This documentation provides comprehensive information about the LLM service, enabling LLM agents to understand its purpose, implementation, and usage patterns for effective code evaluation and modification.