LLM Service Documentation

📄 File Information

File Path: backend/src/services/llmService.ts
File Type: TypeScript
Last Updated: 2024-12-20
Version: 1.0.0
Status: Active


🎯 Purpose & Overview

Primary Purpose: Centralized service for all LLM (Large Language Model) interactions, providing intelligent model selection, prompt engineering, and structured output generation for CIM document analysis.

Business Context: Handles the AI-powered analysis of Confidential Information Memorandums by orchestrating interactions with Claude AI and OpenAI, ensuring optimal model selection, cost management, and quality output generation.

Key Responsibilities:

  • Intelligent model selection based on task complexity
  • Prompt engineering and system prompt management
  • Multi-provider LLM integration (Claude AI, OpenAI)
  • Structured output generation and validation
  • Cost tracking and optimization
  • Error handling and retry logic
  • CIM-specific analysis and synthesis

🏗️ Architecture & Dependencies

Dependencies

Internal Dependencies:

  • config/env.ts - Environment configuration and API keys
  • logger.ts - Structured logging utility
  • llmSchemas.ts - CIM review data structure definitions and validation

External Dependencies:

  • @anthropic-ai/sdk - Claude AI API client
  • openai - OpenAI API client
  • zod - TypeScript-first schema validation

Integration Points

  • Input Sources: Document text from processing services
  • Output Destinations: Structured CIM analysis data, summaries, section analysis
  • Event Triggers: Document analysis requests from processing pipeline
  • Event Listeners: Analysis completion events, error events

🔧 Implementation Details

Core Functions/Methods

processCIMDocument

/**
 * @purpose Main entry point for CIM document processing with intelligent model selection
 * @context Called when document analysis is needed with structured output requirements
 * @inputs text: string, template: string, analysis?: Record<string, any>
 * @outputs CIMAnalysisResult with structured data, cost tracking, and validation
 * @dependencies Claude AI/OpenAI APIs, schema validation, cost estimation
 * @errors API failures, validation errors, parsing errors
 * @complexity O(1) - Single LLM call with comprehensive prompt engineering
 */

Example Usage:

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate,
  { refinementMode: false, overviewMode: true }
);

callLLM

/**
 * @purpose Generic LLM call method with provider abstraction
 * @context Called for all LLM interactions regardless of provider
 * @inputs request: LLMRequest with prompt and configuration
 * @outputs LLMResponse with content and usage metrics
 * @dependencies Provider-specific API clients
 * @errors API failures, rate limiting, authentication errors
 * @complexity O(1) - Direct API call with error handling
 */
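The provider abstraction described above can be sketched as a small dispatch class. This is an illustrative sketch, not the actual implementation: the `ProviderRouter` name and the handler-map design are assumptions, and the `LLMRequest`/`LLMResponse` shapes are restated inline so the example stands alone.

```typescript
// Illustrative sketch of provider dispatch; names are hypothetical.
type Provider = 'anthropic' | 'openai';

interface LLMRequest {
  prompt: string;
  systemPrompt?: string;
  maxTokens?: number;
  temperature?: number;
  model?: string;
}

interface LLMResponse {
  success: boolean;
  content: string;
  error?: string;
}

class ProviderRouter {
  constructor(
    private provider: Provider,
    private handlers: Record<Provider, (req: LLMRequest) => Promise<LLMResponse>>,
  ) {}

  // Route every call through the configured provider's handler,
  // normalizing thrown errors into a failed LLMResponse.
  async callLLM(request: LLMRequest): Promise<LLMResponse> {
    try {
      return await this.handlers[this.provider](request);
    } catch (err) {
      return {
        success: false,
        content: '',
        error: err instanceof Error ? err.message : String(err),
      };
    }
  }
}
```

A design like this keeps `callAnthropic`/`callOpenAI` as interchangeable handlers, so adding a provider means registering one more entry rather than touching call sites.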

callAnthropic

/**
 * @purpose Claude AI specific API interactions
 * @context Called when using Claude AI as the LLM provider
 * @inputs request: LLMRequest with Claude-specific parameters
 * @outputs LLMResponse with Claude AI response and token usage
 * @dependencies @anthropic-ai/sdk
 * @errors Claude API failures, rate limiting, model errors
 * @complexity O(1) - Direct Claude API call
 */

callOpenAI

/**
 * @purpose OpenAI specific API interactions
 * @context Called when using OpenAI as the LLM provider
 * @inputs request: LLMRequest with OpenAI-specific parameters
 * @outputs LLMResponse with OpenAI response and token usage
 * @dependencies openai
 * @errors OpenAI API failures, rate limiting, model errors
 * @complexity O(1) - Direct OpenAI API call
 */

Data Structures

LLMRequest

interface LLMRequest {
  prompt: string;                 // Main prompt text
  systemPrompt?: string;          // System prompt for context
  maxTokens?: number;             // Maximum tokens for response
  temperature?: number;           // Sampling temperature (0-1 for Claude, 0-2 for OpenAI)
  model?: string;                 // Specific model to use
}

LLMResponse

interface LLMResponse {
  success: boolean;               // Request success status
  content: string;                // LLM response content
  usage?: {                       // Token usage metrics
    promptTokens: number;         // Input tokens used
    completionTokens: number;     // Output tokens used
    totalTokens: number;          // Total tokens used
  };
  error?: string;                 // Error message if failed
}

CIMAnalysisResult

interface CIMAnalysisResult {
  success: boolean;               // Analysis success status
  jsonOutput?: CIMReview;         // Structured analysis data
  error?: string;                 // Error message if failed
  model: string;                  // Model used for analysis
  cost: number;                   // Estimated cost in USD
  inputTokens: number;            // Input tokens consumed
  outputTokens: number;           // Output tokens consumed
  validationIssues?: z.ZodIssue[]; // Schema validation issues
}

Configuration

// Key configuration options
const LLM_CONFIG = {
  provider: 'anthropic',                   // LLM provider: 'anthropic' | 'openai'
  defaultModel: 'claude-3-opus-20240229',  // Default model for provider
  maxTokens: 3500,                         // Default max tokens
  temperature: 0.1,                        // Default temperature
  promptBuffer: 500,                       // Buffer for prompt engineering
  retryAttempts: 3,                        // Number of retry attempts
  costThreshold: 5.0,                      // Cost threshold per request (USD)
};

📊 Data Flow

Input Processing

  1. Task Analysis: Determine task complexity and requirements
  2. Model Selection: Select optimal model based on complexity and tokens
  3. Prompt Engineering: Build appropriate prompt based on analysis type
  4. System Prompt: Generate context-appropriate system prompt
  5. Parameter Optimization: Optimize temperature, tokens, and other parameters
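Step 2 above can be sketched as a small heuristic. The thresholds and model identifiers below are illustrative assumptions, not the service's actual selection logic (the real `selectModel` criteria live in the source file).

```typescript
// Hypothetical model-selection heuristic: route complex or very long
// inputs to a stronger (more expensive) model, simple tasks to a cheap one.
type Complexity = 'low' | 'medium' | 'high';

function selectModel(taskComplexity: Complexity, inputTokens: number): string {
  if (taskComplexity === 'high' || inputTokens > 50_000) {
    return 'claude-3-opus-20240229';    // strongest model for complex analysis
  }
  if (taskComplexity === 'medium') {
    return 'claude-3-sonnet-20240229';  // balanced cost/quality
  }
  return 'claude-3-haiku-20240307';     // cheapest option for simple tasks
}
```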

Processing Pipeline

  1. Provider Selection: Route to appropriate provider (Claude/OpenAI)
  2. API Call: Execute LLM API call with retry logic
  3. Response Processing: Extract and validate response content
  4. JSON Parsing: Parse structured output from response
  5. Schema Validation: Validate output against CIM review schema
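Steps 4-5 hinge on pulling a JSON object out of a response that may wrap it in markdown fences or surrounding prose. A minimal sketch of that extraction (schema validation with zod is omitted here for brevity, and the function name is illustrative):

```typescript
// Extract and parse a JSON object from raw LLM output that may include
// ```json fences or extra prose around the object.
type ParseResult =
  | { ok: true; data: unknown }
  | { ok: false; error: string };

function extractJson(raw: string): ParseResult {
  // Strip markdown code fences the model sometimes adds.
  const cleaned = raw.replace(/```(?:json)?/g, '').trim();
  // Fall back to the first {...} span if prose surrounds the object.
  const start = cleaned.indexOf('{');
  const end = cleaned.lastIndexOf('}');
  if (start === -1 || end === -1) {
    return { ok: false, error: 'No JSON object found in response' };
  }
  try {
    return { ok: true, data: JSON.parse(cleaned.slice(start, end + 1)) };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
}
```

The parsed value would then be passed through the CIM review schema (e.g. zod's `safeParse`) to produce the `validationIssues` surfaced in `CIMAnalysisResult`.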

Output Generation

  1. Content Extraction: Extract structured data from LLM response
  2. Cost Calculation: Calculate and track API usage costs
  3. Validation: Validate output against expected schema
  4. Error Handling: Handle parsing and validation errors
  5. Result Formatting: Format final analysis result
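The cost-calculation step (2) reduces to a per-token price lookup. The formula below is the standard shape; the per-million-token prices in the test are placeholders, not current provider pricing.

```typescript
// Estimate request cost from token counts and per-million-token prices.
// Prices must be supplied by the caller (provider pricing changes over time).
function estimateCost(
  inputTokens: number,
  outputTokens: number,
  pricePerMInputUsd: number,
  pricePerMOutputUsd: number,
): number {
  return (
    (inputTokens / 1_000_000) * pricePerMInputUsd +
    (outputTokens / 1_000_000) * pricePerMOutputUsd
  );
}
```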

Data Transformations

  • Document Text → Task Analysis → Model Selection → Prompt Engineering → LLM Response
  • Analysis Requirements → Prompt Strategy → System Context → Structured Output
  • Raw Response → JSON Parsing → Schema Validation → Validated Data

🚨 Error Handling

Error Types

/**
 * @errorType API_ERROR
 * @description LLM API call failed due to network or service issues
 * @recoverable true
 * @retryStrategy exponential_backoff
 * @userMessage "LLM service temporarily unavailable"
 */

/**
 * @errorType RATE_LIMIT_ERROR
 * @description API rate limit exceeded
 * @recoverable true
 * @retryStrategy exponential_backoff
 * @userMessage "Rate limit exceeded, retrying shortly"
 */

/**
 * @errorType VALIDATION_ERROR
 * @description LLM response failed schema validation
 * @recoverable true
 * @retryStrategy retry_with_different_prompt
 * @userMessage "Response validation failed, retrying with improved prompt"
 */

/**
 * @errorType PARSING_ERROR
 * @description Failed to parse JSON from LLM response
 * @recoverable true
 * @retryStrategy retry_with_json_formatting
 * @userMessage "Response parsing failed, retrying with JSON formatting"
 */

Error Recovery

  • API Errors: Implement exponential backoff and retry logic
  • Rate Limit Errors: Respect rate limits and implement backoff
  • Validation Errors: Retry with improved prompts and formatting
  • Parsing Errors: Retry with explicit JSON formatting instructions
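The exponential-backoff recovery for API and rate-limit errors can be sketched as a generic wrapper. This is a minimal sketch assuming the retry parameters shown; the service's actual retry logic and defaults may differ.

```typescript
// Retry an async operation with exponential backoff: delays of
// baseDelayMs, 2*baseDelayMs, 4*baseDelayMs, ... between attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```

Validation and parsing errors would instead re-enter the pipeline with an adjusted prompt rather than a plain retry, per the strategies listed above.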

Fallback Strategies

  • Primary Strategy: Claude AI with comprehensive prompts
  • Fallback Strategy: OpenAI with similar prompts
  • Degradation Strategy: Simplified analysis with basic prompts

🧪 Testing

Test Coverage

  • Unit Tests: 95% - Core LLM interaction logic and prompt engineering
  • Integration Tests: 90% - End-to-end LLM processing workflows
  • Performance Tests: API response time and cost optimization

Test Data

/**
 * @testData sample_cim_text.txt
 * @description Standard CIM document text for testing
 * @size 10KB
 * @sections Financial, Market, Management
 * @expectedOutput Valid CIMReview with all sections populated
 */

/**
 * @testData complex_cim_text.txt
 * @description Complex CIM document for model selection testing
 * @size 50KB
 * @sections Comprehensive business analysis
 * @expectedOutput Complex analysis with appropriate model selection
 */

/**
 * @testData malformed_response.json
 * @description Malformed LLM response for error handling testing
 * @size 2KB
 * @format Invalid JSON structure
 * @expectedOutput Proper error handling and retry logic
 */

Mock Strategy

  • External APIs: Mock Claude AI and OpenAI responses
  • Schema Validation: Mock validation scenarios and error cases
  • Cost Tracking: Mock token usage and cost calculations
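The external-API mock strategy can be illustrated with a hand-rolled fake (no test framework assumed); the factory name and shape below are hypothetical.

```typescript
// A fake provider client that records prompts it receives and returns
// a canned response, standing in for the real Claude/OpenAI clients.
function makeFakeProvider(cannedContent: string) {
  const calls: string[] = [];
  return {
    calls,
    call: async (prompt: string) => {
      calls.push(prompt);
      return { success: true, content: cannedContent };
    },
  };
}
```

Injecting a fake like this lets tests assert on prompt construction and response handling without spending tokens or depending on network availability.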

📈 Performance Characteristics

Performance Metrics

  • Average Response Time: 10-30 seconds per LLM call
  • Token Usage: 1000-4000 tokens per analysis
  • Cost per Analysis: $0.01-$0.10 per document
  • Success Rate: 95%+ with retry logic
  • Validation Success: 90%+ with prompt engineering

Optimization Strategies

  • Model Selection: Intelligent model selection based on task complexity
  • Prompt Engineering: Optimized prompts for better response quality
  • Cost Management: Token usage optimization and cost tracking
  • Caching: Cache similar requests to reduce API calls
  • Batch Processing: Process multiple sections in single requests
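The caching strategy above amounts to keying responses on the request's identifying inputs. A minimal sketch, assuming a model+prompt hash key (the real service's caching, if present, may key on more fields such as temperature):

```typescript
import { createHash } from 'node:crypto';

// In-memory response cache keyed on a hash of (model, prompt).
class ResponseCache {
  private store = new Map<string, string>();

  private key(model: string, prompt: string): string {
    // NUL separator prevents ambiguous concatenations.
    return createHash('sha256').update(model + '\u0000' + prompt).digest('hex');
  }

  get(model: string, prompt: string): string | undefined {
    return this.store.get(this.key(model, prompt));
  }

  set(model: string, prompt: string, content: string): void {
    this.store.set(this.key(model, prompt), content);
  }
}
```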

Scalability Limits

  • API Rate Limits: Respect provider-specific rate limits
  • Cost Limits: Maximum cost per request and daily limits
  • Token Limits: Maximum input/output token limits per model
  • Concurrent Requests: Limit concurrent API calls

🔍 Debugging & Monitoring

Logging

/**
 * @logging Structured logging with detailed LLM interaction metrics
 * @levels debug, info, warn, error
 * @correlation Request ID and model tracking
 * @context Prompt engineering, model selection, cost tracking, validation
 */

Debug Tools

  • Prompt Analysis: Detailed prompt engineering and system prompt analysis
  • Model Selection: Model selection logic and reasoning
  • Cost Tracking: Detailed cost analysis and optimization
  • Response Validation: Schema validation and error analysis

Common Issues

  1. API Failures: Monitor API health and implement proper retry logic
  2. Rate Limiting: Implement proper rate limiting and backoff strategies
  3. Validation Errors: Improve prompt engineering for better response quality
  4. Cost Optimization: Monitor and optimize token usage and model selection

🔐 Security Considerations

Input Validation

  • Text Content: Sanitization of input text for prompt injection prevention
  • API Keys: Secure storage and rotation of API keys
  • Request Validation: Validation of all input parameters
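A naive sketch of the text-sanitization idea: cap length and strip control characters before embedding document text in a prompt. This is illustrative only; robust prompt-injection defense requires more than character filtering.

```typescript
// Strip non-printable control characters (keeping \t, \n, \r) and cap
// input length before interpolating user text into a prompt.
function sanitizeForPrompt(text: string, maxChars = 200_000): string {
  return text
    .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, '')
    .slice(0, maxChars);
}
```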

Authentication & Authorization

  • API Access: Secure access to LLM provider APIs
  • Key Management: Secure API key management and rotation
  • Request Logging: Secure logging of requests and responses

Data Protection

  • Text Processing: Secure handling of sensitive document content
  • Response Storage: Secure storage of LLM responses and analysis
  • Cost Tracking: Secure tracking and reporting of API usage costs

Internal References

  • optimizedAgenticRAGProcessor.ts - Uses this service for LLM analysis
  • llmSchemas.ts - CIM review data structure definitions
  • config/env.ts - Environment configuration and API keys
  • logger.ts - Structured logging utility

🔄 Change History

Recent Changes

  • 2024-12-20 - Implemented intelligent model selection and cost tracking - [Author]
  • 2024-12-15 - Added comprehensive prompt engineering and validation - [Author]
  • 2024-12-10 - Implemented multi-provider support (Claude AI, OpenAI) - [Author]

Planned Changes

  • Advanced prompt engineering improvements - 2025-01-15
  • Multi-language support for international documents - 2025-01-30
  • Enhanced cost optimization and caching - 2025-02-15

📋 Usage Examples

Basic Usage

import { LLMService } from './llmService';

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate
);

if (result.success) {
  console.log('Analysis completed:', result.jsonOutput);
  console.log('Cost:', result.cost, 'USD');
  console.log('Tokens used:', result.inputTokens + result.outputTokens);
} else {
  console.error('Analysis failed:', result.error);
}

Advanced Usage

import { LLMService } from './llmService';

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate,
  {
    refinementMode: true,
    overviewMode: false,
    sectionType: 'financial'
  }
);

// Monitor detailed metrics
console.log('Model used:', result.model);
console.log('Input tokens:', result.inputTokens);
console.log('Output tokens:', result.outputTokens);
console.log('Total cost:', result.cost, 'USD');

Error Handling

try {
  const result = await llmService.processCIMDocument(
    documentText,
    cimTemplate
  );
  
  if (!result.success) {
    logger.error('LLM analysis failed', { 
      error: result.error,
      model: result.model,
      cost: result.cost 
    });
  }
  
  if (result.validationIssues) {
    logger.warn('Validation issues found', { 
      issues: result.validationIssues 
    });
  }
} catch (error) {
  logger.error('Unexpected error during LLM analysis', { 
    error: error instanceof Error ? error.message : String(error) 
  });
}

🎯 LLM Agent Notes

Key Understanding Points

  • This service is the central hub for all LLM interactions in the system
  • Implements intelligent model selection based on task complexity
  • Provides comprehensive prompt engineering for different analysis types
  • Handles multi-provider support (Claude AI, OpenAI) with fallback logic
  • Includes cost tracking and optimization for API usage

Common Modifications

  • Adding new providers - Implement new provider methods and update selection logic
  • Modifying prompt engineering - Update prompt building methods for different analysis types
  • Adjusting model selection - Modify selectModel method for different complexity criteria
  • Enhancing validation - Extend schema validation and error handling
  • Optimizing costs - Adjust cost thresholds and token optimization strategies

Integration Patterns

  • Strategy Pattern - Different providers and models for different tasks
  • Factory Pattern - Creating different types of prompts and system contexts
  • Observer Pattern - Cost tracking and performance monitoring
  • Chain of Responsibility - Retry logic and fallback strategies

This documentation provides comprehensive information about the LLM service, enabling LLM agents to understand its purpose, implementation, and usage patterns for effective code evaluation and modification.