LLM Service Documentation
📄 File Information
File Path: backend/src/services/llmService.ts
File Type: TypeScript
Last Updated: 2024-12-20
Version: 1.0.0
Status: Active
🎯 Purpose & Overview
Primary Purpose: Centralized service for all LLM (Large Language Model) interactions, providing intelligent model selection, prompt engineering, and structured output generation for CIM document analysis.
Business Context: Handles the AI-powered analysis of Confidential Information Memorandums by orchestrating interactions with Claude AI and OpenAI, ensuring optimal model selection, cost management, and quality output generation.
Key Responsibilities:
- Intelligent model selection based on task complexity
- Prompt engineering and system prompt management
- Multi-provider LLM integration (Claude AI, OpenAI)
- Structured output generation and validation
- Cost tracking and optimization
- Error handling and retry logic
- CIM-specific analysis and synthesis
🏗️ Architecture & Dependencies
Dependencies
Internal Dependencies:
- config/env.ts - Environment configuration and API keys
- logger.ts - Structured logging utility
- llmSchemas.ts - CIM review data structure definitions and validation
External Dependencies:
- @anthropic-ai/sdk - Claude AI API client
- openai - OpenAI API client
- zod - TypeScript-first schema validation
Integration Points
- Input Sources: Document text from processing services
- Output Destinations: Structured CIM analysis data, summaries, section analysis
- Event Triggers: Document analysis requests from processing pipeline
- Event Listeners: Analysis completion events, error events
🔧 Implementation Details
Core Functions/Methods
processCIMDocument
/**
* @purpose Main entry point for CIM document processing with intelligent model selection
* @context Called when document analysis is needed with structured output requirements
* @inputs text: string, template: string, analysis?: Record<string, any>
* @outputs CIMAnalysisResult with structured data, cost tracking, and validation
* @dependencies Claude AI/OpenAI APIs, schema validation, cost estimation
* @errors API failures, validation errors, parsing errors
* @complexity O(1) - Single LLM call with comprehensive prompt engineering
*/
Example Usage:
const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate,
  { refinementMode: false, overviewMode: true }
);
callLLM
/**
* @purpose Generic LLM call method with provider abstraction
* @context Called for all LLM interactions regardless of provider
* @inputs request: LLMRequest with prompt and configuration
* @outputs LLMResponse with content and usage metrics
* @dependencies Provider-specific API clients
* @errors API failures, rate limiting, authentication errors
* @complexity O(1) - Direct API call with error handling
*/
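The provider abstraction behind callLLM can be sketched as below. Only callLLM, callAnthropic, and callOpenAI are names from this service; resolveProvider, the handlers map, and the model-prefix routing rule are illustrative assumptions.

```typescript
// Interfaces as documented in the Data Structures section below.
interface LLMRequest {
  prompt: string;
  systemPrompt?: string;
  maxTokens?: number;
  temperature?: number;
  model?: string;
}

interface LLMResponse {
  success: boolean;
  content: string;
  usage?: { promptTokens: number; completionTokens: number; totalTokens: number };
  error?: string;
}

type Provider = "anthropic" | "openai";

// Hypothetical routing rule: pick the provider from the model prefix when a
// model is requested, otherwise fall back to the configured default.
function resolveProvider(model?: string, fallback: Provider = "anthropic"): Provider {
  if (model?.startsWith("claude")) return "anthropic";
  if (model?.startsWith("gpt")) return "openai";
  return fallback;
}

// Generic entry point: dispatch to a provider-specific handler (in the real
// service, callAnthropic or callOpenAI) and normalize failures into an
// error-carrying LLMResponse rather than throwing.
async function callLLM(
  request: LLMRequest,
  handlers: Record<Provider, (r: LLMRequest) => Promise<LLMResponse>>
): Promise<LLMResponse> {
  const provider = resolveProvider(request.model);
  try {
    return await handlers[provider](request);
  } catch (err) {
    return { success: false, content: "", error: (err as Error).message };
  }
}
```

Keeping the dispatch behind one function is what lets the fallback strategy swap providers without touching callers.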
callAnthropic
/**
* @purpose Claude AI specific API interactions
* @context Called when using Claude AI as the LLM provider
* @inputs request: LLMRequest with Claude-specific parameters
* @outputs LLMResponse with Claude AI response and token usage
* @dependencies @anthropic-ai/sdk
* @errors Claude API failures, rate limiting, model errors
* @complexity O(1) - Direct Claude API call
*/
callOpenAI
/**
* @purpose OpenAI specific API interactions
* @context Called when using OpenAI as the LLM provider
* @inputs request: LLMRequest with OpenAI-specific parameters
* @outputs LLMResponse with OpenAI response and token usage
* @dependencies openai
* @errors OpenAI API failures, rate limiting, model errors
* @complexity O(1) - Direct OpenAI API call
*/
Data Structures
LLMRequest
interface LLMRequest {
  prompt: string;        // Main prompt text
  systemPrompt?: string; // System prompt for context
  maxTokens?: number;    // Maximum tokens for response
  temperature?: number;  // Response creativity (0-1 for Claude, 0-2 for OpenAI)
  model?: string;        // Specific model to use
}
LLMResponse
interface LLMResponse {
  success: boolean;              // Request success status
  content: string;               // LLM response content
  usage?: {                      // Token usage metrics
    promptTokens: number;        // Input tokens used
    completionTokens: number;    // Output tokens used
    totalTokens: number;         // Total tokens used
  };
  error?: string;                // Error message if failed
}
CIMAnalysisResult
interface CIMAnalysisResult {
  success: boolean;                // Analysis success status
  jsonOutput?: CIMReview;          // Structured analysis data
  error?: string;                  // Error message if failed
  model: string;                   // Model used for analysis
  cost: number;                    // Estimated cost in USD
  inputTokens: number;             // Input tokens consumed
  outputTokens: number;            // Output tokens consumed
  validationIssues?: z.ZodIssue[]; // Schema validation issues
}
Configuration
// Key configuration options
const LLM_CONFIG = {
  provider: 'anthropic' as 'anthropic' | 'openai', // LLM provider selection
  defaultModel: 'claude-3-opus-20240229',          // Default model for provider
  maxTokens: 3500,                                 // Default max tokens
  temperature: 0.1,                                // Default temperature
  promptBuffer: 500,                               // Buffer for prompt engineering
  retryAttempts: 3,                                // Number of retry attempts
  costThreshold: 5.0,                              // Cost threshold per request (USD)
};
📊 Data Flow
Input Processing
- Task Analysis: Determine task complexity and requirements
- Model Selection: Select optimal model based on complexity and tokens
- Prompt Engineering: Build appropriate prompt based on analysis type
- System Prompt: Generate context-appropriate system prompt
- Parameter Optimization: Optimize temperature, tokens, and other parameters
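The task-analysis and model-selection steps above might look like the following sketch. The token thresholds and model choices here are illustrative assumptions, not the service's actual criteria; the real logic lives in the selectModel method.

```typescript
type Complexity = "simple" | "standard" | "complex";

// Hypothetical complexity assessment: refinement passes and very large inputs
// get the strongest model; thresholds are illustrative.
function assessComplexity(inputTokens: number, refinementMode: boolean): Complexity {
  if (refinementMode || inputTokens > 50_000) return "complex";
  if (inputTokens > 10_000) return "standard";
  return "simple";
}

// Map complexity onto a model tier; the specific model IDs are assumptions.
function selectModel(complexity: Complexity): string {
  switch (complexity) {
    case "complex":
      return "claude-3-opus-20240229"; // highest quality, highest cost
    case "standard":
      return "claude-3-sonnet-20240229"; // balanced quality and cost
    case "simple":
      return "claude-3-haiku-20240307"; // fast and cheap
  }
}
```

Routing simple tasks to cheaper models is where most of the cost savings described later come from.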
Processing Pipeline
- Provider Selection: Route to appropriate provider (Claude/OpenAI)
- API Call: Execute LLM API call with retry logic
- Response Processing: Extract and validate response content
- JSON Parsing: Parse structured output from response
- Schema Validation: Validate output against CIM review schema
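The JSON-parsing step above has to cope with replies that wrap the JSON in a markdown fence or surrounding prose, a common failure mode. This is a hypothetical sketch of that extraction, not the service's actual parser; the regex writes the backtick as \u0060 only to keep the example readable here.

```typescript
// Match a fenced code block (optionally tagged "json") and capture its body.
const FENCED_JSON = /\u0060{3}(?:json)?\s*([\s\S]*?)\u0060{3}/;

function extractJson(raw: string): unknown | null {
  // Prefer an explicit fenced block if the model emitted one; otherwise take
  // the outermost brace-delimited span.
  const fenced = raw.match(FENCED_JSON);
  const candidate = fenced
    ? fenced[1]
    : raw.slice(raw.indexOf("{"), raw.lastIndexOf("}") + 1);
  try {
    return JSON.parse(candidate);
  } catch {
    return null; // caller retries with explicit JSON-formatting instructions
  }
}
```

A null result here is what triggers the PARSING_ERROR retry path described under Error Handling.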
Output Generation
- Content Extraction: Extract structured data from LLM response
- Cost Calculation: Calculate and track API usage costs
- Validation: Validate output against expected schema
- Error Handling: Handle parsing and validation errors
- Result Formatting: Format final analysis result
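The cost-calculation step could take roughly this shape. The per-1K-token rates below are assumptions for illustration; production code should read rates from a maintained pricing table rather than hard-coding them.

```typescript
// Illustrative per-1K-token prices in USD (input / output).
const PRICE_PER_1K: Record<string, { input: number; output: number }> = {
  "claude-3-opus-20240229": { input: 0.015, output: 0.075 },
  "claude-3-haiku-20240307": { input: 0.00025, output: 0.00125 },
};

function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const rate = PRICE_PER_1K[model];
  if (!rate) return 0; // unknown model: report zero rather than guess
  return (inputTokens / 1000) * rate.input + (outputTokens / 1000) * rate.output;
}
```

The result feeds the cost field of CIMAnalysisResult and the costThreshold check in the configuration.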
Data Transformations
Document Text → Task Analysis → Model Selection → Prompt Engineering → LLM Response
Analysis Requirements → Prompt Strategy → System Context → Structured Output
Raw Response → JSON Parsing → Schema Validation → Validated Data
🚨 Error Handling
Error Types
/**
* @errorType API_ERROR
* @description LLM API call failed due to network or service issues
* @recoverable true
* @retryStrategy exponential_backoff
* @userMessage "LLM service temporarily unavailable"
*/
/**
* @errorType RATE_LIMIT_ERROR
* @description API rate limit exceeded
* @recoverable true
* @retryStrategy exponential_backoff
* @userMessage "Rate limit exceeded, retrying shortly"
*/
/**
* @errorType VALIDATION_ERROR
* @description LLM response failed schema validation
* @recoverable true
* @retryStrategy retry_with_different_prompt
* @userMessage "Response validation failed, retrying with improved prompt"
*/
/**
* @errorType PARSING_ERROR
* @description Failed to parse JSON from LLM response
* @recoverable true
* @retryStrategy retry_with_json_formatting
* @userMessage "Response parsing failed, retrying with JSON formatting"
*/
Error Recovery
- API Errors: Implement exponential backoff and retry logic
- Rate Limit Errors: Respect rate limits and implement backoff
- Validation Errors: Retry with improved prompts and formatting
- Parsing Errors: Retry with explicit JSON formatting instructions
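The exponential-backoff recovery for API and rate-limit errors can be sketched as below. backoffDelayMs and withRetry are hypothetical names, and the base delay and cap are assumptions; the attempt count mirrors retryAttempts in the configuration above.

```typescript
// Delay doubles per attempt (attempt is 0-based): 1s, 2s, 4s, ... capped.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry a failing async operation, sleeping between attempts, and rethrow
// the last error once attempts are exhausted.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(i)));
    }
  }
  throw lastError;
}
```

In practice many providers return a Retry-After hint on rate-limit responses, which should take precedence over the computed delay.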
Fallback Strategies
- Primary Strategy: Claude AI with comprehensive prompts
- Fallback Strategy: OpenAI with similar prompts
- Degradation Strategy: Simplified analysis with basic prompts
🧪 Testing
Test Coverage
- Unit Tests: 95% - Core LLM interaction logic and prompt engineering
- Integration Tests: 90% - End-to-end LLM processing workflows
- Performance Tests: API response time and cost optimization
Test Data
/**
* @testData sample_cim_text.txt
* @description Standard CIM document text for testing
* @size 10KB
* @sections Financial, Market, Management
* @expectedOutput Valid CIMReview with all sections populated
*/
/**
* @testData complex_cim_text.txt
* @description Complex CIM document for model selection testing
* @size 50KB
* @sections Comprehensive business analysis
* @expectedOutput Complex analysis with appropriate model selection
*/
/**
* @testData malformed_response.json
* @description Malformed LLM response for error handling testing
* @size 2KB
* @format Invalid JSON structure
* @expectedOutput Proper error handling and retry logic
*/
Mock Strategy
- External APIs: Mock Claude AI and OpenAI responses
- Schema Validation: Mock validation scenarios and error cases
- Cost Tracking: Mock token usage and cost calculations
📈 Performance Characteristics
Performance Metrics
- Average Response Time: 10-30 seconds per LLM call
- Token Usage: 1000-4000 tokens per analysis
- Cost per Analysis: $0.01-$0.10 per document
- Success Rate: 95%+ with retry logic
- Validation Success: 90%+ with prompt engineering
Optimization Strategies
- Model Selection: Intelligent model selection based on task complexity
- Prompt Engineering: Optimized prompts for better response quality
- Cost Management: Token usage optimization and cost tracking
- Caching: Cache similar requests to reduce API calls
- Batch Processing: Process multiple sections in single requests
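The caching strategy above could take roughly this shape: a response cache keyed on a hash of model and prompt, so identical analyses skip a second API call. ResponseCache and its one-hour TTL are illustrative assumptions, not the service's actual implementation.

```typescript
import { createHash } from "node:crypto";

class ResponseCache {
  private store = new Map<string, { value: string; expires: number }>();
  constructor(private readonly ttlMs = 60 * 60 * 1000) {}

  // Hash model and prompt together; the NUL separator prevents key collisions
  // between e.g. ("ab", "c") and ("a", "bc").
  private key(model: string, prompt: string): string {
    return createHash("sha256").update(model).update("\u0000").update(prompt).digest("hex");
  }

  get(model: string, prompt: string, now = Date.now()): string | undefined {
    const entry = this.store.get(this.key(model, prompt));
    if (!entry || entry.expires < now) return undefined;
    return entry.value;
  }

  set(model: string, prompt: string, value: string, now = Date.now()): void {
    this.store.set(this.key(model, prompt), { value, expires: now + this.ttlMs });
  }
}
```

Injecting the clock (the now parameter) keeps expiry logic testable without real waiting.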
Scalability Limits
- API Rate Limits: Respect provider-specific rate limits
- Cost Limits: Maximum cost per request and daily limits
- Token Limits: Maximum input/output token limits per model
- Concurrent Requests: Limit concurrent API calls
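Rate and concurrency limits like those above are often enforced client-side with a token bucket. This sketch is a hypothetical illustration, not the service's actual limiter; capacity and refill rate are assumptions.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number,     // burst size
    private readonly refillPerSec: number, // sustained request rate
    now = Date.now()
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if a request may proceed now; refills lazily based on
  // elapsed time so no background timer is needed.
  tryAcquire(now = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A caller that gets false would queue the request or fall back to the backoff path under Error Handling.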
🔍 Debugging & Monitoring
Logging
/**
* @logging Structured logging with detailed LLM interaction metrics
* @levels debug, info, warn, error
* @correlation Request ID and model tracking
* @context Prompt engineering, model selection, cost tracking, validation
*/
Debug Tools
- Prompt Analysis: Detailed prompt engineering and system prompt analysis
- Model Selection: Model selection logic and reasoning
- Cost Tracking: Detailed cost analysis and optimization
- Response Validation: Schema validation and error analysis
Common Issues
- API Failures: Monitor API health and implement proper retry logic
- Rate Limiting: Implement proper rate limiting and backoff strategies
- Validation Errors: Improve prompt engineering for better response quality
- Cost Optimization: Monitor and optimize token usage and model selection
🔐 Security Considerations
Input Validation
- Text Content: Sanitization of input text for prompt injection prevention
- API Keys: Secure storage and rotation of API keys
- Request Validation: Validation of all input parameters
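As an illustration of the input-sanitization point above, a minimal pre-filter might look like this. A real prompt-injection defense needs far more than this sketch; the 200,000-character cap is an assumption.

```typescript
function sanitizeForPrompt(text: string, maxChars = 200_000): string {
  return text
    // Strip control characters (keeping tab, newline, carriage return).
    .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, "")
    // Enforce a hard length cap so one document cannot exhaust the context.
    .slice(0, maxChars);
}
```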
Authentication & Authorization
- API Access: Secure access to LLM provider APIs
- Key Management: Secure API key management and rotation
- Request Logging: Secure logging of requests and responses
Data Protection
- Text Processing: Secure handling of sensitive document content
- Response Storage: Secure storage of LLM responses and analysis
- Cost Tracking: Secure tracking and reporting of API usage costs
📚 Related Documentation
Internal References
- optimizedAgenticRAGProcessor.ts - Uses this service for LLM analysis
- llmSchemas.ts - CIM review data structure definitions
- config/env.ts - Environment configuration and API keys
- logger.ts - Structured logging utility
🔄 Change History
Recent Changes
- 2024-12-20 - Implemented intelligent model selection and cost tracking - [Author]
- 2024-12-15 - Added comprehensive prompt engineering and validation - [Author]
- 2024-12-10 - Implemented multi-provider support (Claude AI, OpenAI) - [Author]
Planned Changes
- Advanced prompt engineering improvements - 2025-01-15
- Multi-language support for international documents - 2025-01-30
- Enhanced cost optimization and caching - 2025-02-15
📋 Usage Examples
Basic Usage
import { LLMService } from './llmService';

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate
);

if (result.success) {
  console.log('Analysis completed:', result.jsonOutput);
  console.log('Cost:', result.cost, 'USD');
  console.log('Tokens used:', result.inputTokens + result.outputTokens);
} else {
  console.error('Analysis failed:', result.error);
}
Advanced Usage
import { LLMService } from './llmService';

const llmService = new LLMService();
const result = await llmService.processCIMDocument(
  documentText,
  cimTemplate,
  {
    refinementMode: true,
    overviewMode: false,
    sectionType: 'financial'
  }
);

// Monitor detailed metrics
console.log('Model used:', result.model);
console.log('Input tokens:', result.inputTokens);
console.log('Output tokens:', result.outputTokens);
console.log('Total cost:', result.cost, 'USD');
Error Handling
try {
  const result = await llmService.processCIMDocument(
    documentText,
    cimTemplate
  );

  if (!result.success) {
    logger.error('LLM analysis failed', {
      error: result.error,
      model: result.model,
      cost: result.cost
    });
  }

  if (result.validationIssues) {
    logger.warn('Validation issues found', {
      issues: result.validationIssues
    });
  }
} catch (error) {
  logger.error('Unexpected error during LLM analysis', {
    error: error instanceof Error ? error.message : String(error)
  });
}
🎯 LLM Agent Notes
Key Understanding Points
- This service is the central hub for all LLM interactions in the system
- Implements intelligent model selection based on task complexity
- Provides comprehensive prompt engineering for different analysis types
- Handles multi-provider support (Claude AI, OpenAI) with fallback logic
- Includes cost tracking and optimization for API usage
Common Modifications
- Adding new providers - Implement new provider methods and update selection logic
- Modifying prompt engineering - Update prompt building methods for different analysis types
- Adjusting model selection - Modify selectModel method for different complexity criteria
- Enhancing validation - Extend schema validation and error handling
- Optimizing costs - Adjust cost thresholds and token optimization strategies
Integration Patterns
- Strategy Pattern - Different providers and models for different tasks
- Factory Pattern - Creating different types of prompts and system contexts
- Observer Pattern - Cost tracking and performance monitoring
- Chain of Responsibility - Retry logic and fallback strategies
This documentation provides comprehensive information about the LLM service, enabling LLM agents to understand its purpose, implementation, and usage patterns for effective code evaluation and modification.