fix: Correct OpenRouter model IDs and add error handling
Critical fixes for LLM processing failures: - Updated model mapping to use valid OpenRouter IDs (claude-haiku-4.5, claude-sonnet-4.5) - Changed default models from dated versions to generic names - Added HTTP status checking before accessing response data - Enhanced logging for OpenRouter provider selection Resolves "invalid model ID" errors that were causing all CIM processing to fail. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
418
backend/TROUBLESHOOTING_PLAN.md
Normal file
418
backend/TROUBLESHOOTING_PLAN.md
Normal file
@@ -0,0 +1,418 @@
|
||||
# CIM Summary LLM Processing - Rapid Diagnostic & Fix Plan
|
||||
|
||||
## 🚨 If Processing Fails - Execute This Plan
|
||||
|
||||
### Phase 1: Immediate Diagnosis (2-5 minutes)
|
||||
|
||||
#### Step 1.1: Check Recent Failures in Database
|
||||
```bash
|
||||
npx ts-node -e "
|
||||
import { supabase } from './src/config/supabase';
|
||||
|
||||
(async () => {
|
||||
const { data } = await supabase
|
||||
.from('documents')
|
||||
.select('id, filename, status, error_message, created_at, updated_at')
|
||||
.eq('status', 'failed')
|
||||
.order('updated_at', { ascending: false })
|
||||
.limit(5);
|
||||
|
||||
console.log('Recent Failures:');
|
||||
data?.forEach(d => {
|
||||
console.log(\`- \${d.filename}: \${d.error_message?.substring(0, 200)}\`);
|
||||
});
|
||||
process.exit(0);
|
||||
})();
|
||||
"
|
||||
```
|
||||
|
||||
**What to look for:**
|
||||
- Repeating error patterns
|
||||
- Specific error messages (timeout, API error, invalid model, etc.)
|
||||
- Time pattern (all failures at same time = system issue)
|
||||
|
||||
---
|
||||
|
||||
#### Step 1.2: Check Real-Time Error Logs
|
||||
```bash
|
||||
# Check last 100 errors
|
||||
tail -100 logs/error.log | grep -E "(error|ERROR|failed|FAILED|timeout|TIMEOUT)" | tail -20
|
||||
|
||||
# Or check specific patterns
|
||||
grep -E "OpenRouter|Anthropic|LLM|model ID" logs/error.log | tail -20
|
||||
```
|
||||
|
||||
**What to look for:**
|
||||
- `"invalid model ID"` → Model name issue
|
||||
- `"timeout"` → Timeout configuration issue
|
||||
- `"rate limit"` → API quota exceeded
|
||||
- `"401"` or `"403"` → Authentication issue
|
||||
- `"Cannot read properties"` → Code bug
|
||||
|
||||
---
|
||||
|
||||
#### Step 1.3: Test LLM Directly (Fastest Check)
|
||||
```bash
|
||||
# This takes 30-60 seconds
|
||||
npx ts-node src/scripts/test-openrouter-simple.ts 2>&1 | grep -E "(SUCCESS|FAILED|error.*model|OpenRouter API)"
|
||||
```
|
||||
|
||||
**Expected output if working:**
|
||||
```
|
||||
✅ OpenRouter API call successful
|
||||
✅ Test Result: SUCCESS
|
||||
```
|
||||
|
||||
**If it fails, note the EXACT error message.**
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Root Cause Identification (3-10 minutes)
|
||||
|
||||
Based on the error from Phase 1, jump to the appropriate section:
|
||||
|
||||
#### **Error Type A: Invalid Model ID**
|
||||
|
||||
**Symptoms:**
|
||||
```
|
||||
"anthropic/claude-haiku-4 is not a valid model ID"
|
||||
"anthropic/claude-sonnet-4 is not a valid model ID"
|
||||
```
|
||||
|
||||
**Root Cause:** Model name mismatch with OpenRouter API
|
||||
|
||||
**Fix Location:** `backend/src/services/llmService.ts` lines 526-552
|
||||
|
||||
**Verification:**
|
||||
```bash
|
||||
# Check what OpenRouter actually supports
|
||||
curl -s "https://openrouter.ai/api/v1/models" \
|
||||
-H "Authorization: Bearer $OPENROUTER_API_KEY" | \
|
||||
python3 -m json.tool | \
|
||||
grep -A 2 "\"id\": \"anthropic" | \
|
||||
head -30
|
||||
```
|
||||
|
||||
**Quick Fix:**
|
||||
Update the model mapping in `llmService.ts`:
|
||||
```typescript
|
||||
// Current valid OpenRouter model IDs (as of Nov 2024):
|
||||
if (model.includes('sonnet') && model.includes('4')) {
|
||||
openRouterModel = 'anthropic/claude-sonnet-4.5';
|
||||
} else if (model.includes('haiku') && model.includes('4')) {
|
||||
openRouterModel = 'anthropic/claude-haiku-4.5';
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### **Error Type B: Timeout Errors**
|
||||
|
||||
**Symptoms:**
|
||||
```
|
||||
"LLM call timeout after X minutes"
|
||||
"Processing timeout: Document stuck"
|
||||
```
|
||||
|
||||
**Root Cause:** Operation taking longer than configured timeout
|
||||
|
||||
**Diagnosis:**
|
||||
```bash
|
||||
# Check current timeout settings
|
||||
grep -E "timeout|TIMEOUT" backend/src/config/env.ts | grep -v "//"
|
||||
grep "timeoutMs" backend/src/services/llmService.ts | head -5
|
||||
```
|
||||
|
||||
**Check Locations:**
|
||||
1. `env.ts:319` - `LLM_TIMEOUT_MS` (default 180000 = 3 min)
|
||||
2. `llmService.ts:343` - Wrapper timeout
|
||||
3. `llmService.ts:516` - OpenRouter abort timeout
|
||||
|
||||
**Quick Fix:**
|
||||
Add to `.env`:
|
||||
```bash
|
||||
LLM_TIMEOUT_MS=360000 # Increase to 6 minutes
|
||||
```
|
||||
|
||||
Or edit `env.ts:319`:
|
||||
```typescript
|
||||
timeoutMs: parseInt(envVars['LLM_TIMEOUT_MS'] || '360000'), // 6 min
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### **Error Type C: Authentication/API Key Issues**
|
||||
|
||||
**Symptoms:**
|
||||
```
|
||||
"401 Unauthorized"
|
||||
"403 Forbidden"
|
||||
"API key is missing"
|
||||
"ANTHROPIC_API_KEY is not set"
|
||||
```
|
||||
|
||||
**Root Cause:** Missing or invalid API keys
|
||||
|
||||
**Diagnosis:**
|
||||
```bash
|
||||
# Check which keys are set
|
||||
echo "ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:0:20}..."
|
||||
echo "OPENROUTER_API_KEY: ${OPENROUTER_API_KEY:0:20}..."
|
||||
echo "OPENAI_API_KEY: ${OPENAI_API_KEY:0:20}..."
|
||||
|
||||
# Check .env file
|
||||
grep -E "ANTHROPIC|OPENROUTER|OPENAI" backend/.env | grep -v "^#"
|
||||
```
|
||||
|
||||
**Quick Fix:**
|
||||
Ensure these are set in `backend/.env`:
|
||||
```bash
|
||||
ANTHROPIC_API_KEY=sk-ant-api03-...
|
||||
OPENROUTER_API_KEY=sk-or-v1-...
|
||||
OPENROUTER_USE_BYOK=true
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### **Error Type D: Rate Limit Exceeded**
|
||||
|
||||
**Symptoms:**
|
||||
```
|
||||
"429 Too Many Requests"
|
||||
"rate limit exceeded"
|
||||
"Retry after X seconds"
|
||||
```
|
||||
|
||||
**Root Cause:** Too many API calls in short time
|
||||
|
||||
**Diagnosis:**
|
||||
```bash
|
||||
# Check recent API call frequency
|
||||
grep "LLM API call" logs/testing.log | tail -20 | \
|
||||
awk '{print $1, $2}' | uniq -c
|
||||
```
|
||||
|
||||
**Quick Fix:**
|
||||
1. Wait for rate limit to reset (check error for retry time)
|
||||
2. Add rate limiting in code:
|
||||
```typescript
|
||||
// In llmService.ts, add delay between retries
|
||||
await new Promise(resolve => setTimeout(resolve, 2000)); // 2 sec delay
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### **Error Type E: Code Bugs (TypeError, Cannot read property)**
|
||||
|
||||
**Symptoms:**
|
||||
```
|
||||
"Cannot read properties of undefined (reading '0')"
|
||||
"TypeError: response.data is undefined"
|
||||
"Unexpected token in JSON"
|
||||
```
|
||||
|
||||
**Root Cause:** Missing null checks or incorrect data access
|
||||
|
||||
**Diagnosis:**
|
||||
```bash
|
||||
# Find the exact line causing the error
|
||||
grep -A 5 "Cannot read properties" logs/error.log | tail -10
|
||||
```
|
||||
|
||||
**Quick Fix Pattern:**
|
||||
Replace unsafe access:
|
||||
```typescript
|
||||
// Bad:
|
||||
const content = response.data.choices[0].message.content;
|
||||
|
||||
// Good:
|
||||
const content = response.data?.choices?.[0]?.message?.content || '';
|
||||
```
|
||||
|
||||
**File to check:** `llmService.ts:696-720`
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Systematic Testing (5-10 minutes)
|
||||
|
||||
After applying a fix, test in this order:
|
||||
|
||||
#### Test 1: Direct LLM Call
|
||||
```bash
|
||||
npx ts-node src/scripts/test-openrouter-simple.ts
|
||||
```
|
||||
**Expected:** Success in 30-90 seconds
|
||||
|
||||
#### Test 2: Simple RAG Processing
|
||||
```bash
|
||||
npx ts-node -e "
|
||||
import { llmService } from './src/services/llmService';
|
||||
|
||||
(async () => {
|
||||
const text = 'CIM for Target Corp. Revenue: \$100M. EBITDA: \$20M.';
|
||||
const result = await llmService.processCIMDocument(text, 'BPCP Template');
|
||||
console.log('Success:', result.success);
|
||||
console.log('Has JSON:', !!result.jsonOutput);
|
||||
process.exit(result.success ? 0 : 1);
|
||||
})();
|
||||
"
|
||||
```
|
||||
**Expected:** Success with JSON output
|
||||
|
||||
#### Test 3: Full Document Upload
|
||||
Use the frontend to upload a real CIM and monitor:
|
||||
```bash
|
||||
# In one terminal, watch logs
|
||||
tail -f logs/testing.log | grep -E "(error|success|completed)"
|
||||
|
||||
# Check processing status
|
||||
npx ts-node src/scripts/check-current-processing.ts
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Emergency Fallback Options
|
||||
|
||||
If all else fails, use these fallback strategies:
|
||||
|
||||
#### Option 1: Switch to Direct Anthropic (Bypass OpenRouter)
|
||||
```bash
|
||||
# In .env
|
||||
LLM_PROVIDER=anthropic # Instead of openrouter
|
||||
```
|
||||
|
||||
**Pro:** Eliminates OpenRouter as variable
|
||||
**Con:** Different rate limits
|
||||
|
||||
#### Option 2: Use Older Claude Model
|
||||
```bash
|
||||
# In .env or env.ts
|
||||
LLM_MODEL=claude-3.5-sonnet
|
||||
LLM_FAST_MODEL=claude-3.5-haiku
|
||||
```
|
||||
|
||||
**Pro:** More stable, widely supported
|
||||
**Con:** Slightly older model
|
||||
|
||||
#### Option 3: Reduce Input Size
|
||||
```typescript
|
||||
// In optimizedAgenticRAGProcessor.ts:651
|
||||
const targetTokenCount = 8000; // Down from 50000
|
||||
```
|
||||
|
||||
**Pro:** Faster processing, less likely to timeout
|
||||
**Con:** Less context for analysis
|
||||
|
||||
---
|
||||
|
||||
### Phase 5: Preventive Monitoring
|
||||
|
||||
Set up these checks to catch issues early:
|
||||
|
||||
#### Daily Health Check Script
|
||||
Create `backend/scripts/daily-health-check.sh`:
|
||||
```bash
|
||||
#!/bin/bash
|
||||
echo "=== Daily CIM Processor Health Check ==="
|
||||
echo ""
|
||||
|
||||
# Check for stuck documents
|
||||
npx ts-node src/scripts/check-database-failures.ts
|
||||
|
||||
# Test LLM connectivity
|
||||
npx ts-node src/scripts/test-openrouter-simple.ts
|
||||
|
||||
# Check recent success rate
|
||||
echo "Recent processing stats (last 24 hours):"
|
||||
npx ts-node -e "
|
||||
import { supabase } from './src/config/supabase';
|
||||
(async () => {
|
||||
const yesterday = new Date(Date.now() - 86400000).toISOString();
|
||||
const { data } = await supabase
|
||||
.from('documents')
|
||||
.select('status')
|
||||
.gte('created_at', yesterday);
|
||||
|
||||
const stats = data?.reduce((acc, d) => {
|
||||
acc[d.status] = (acc[d.status] || 0) + 1;
|
||||
return acc;
|
||||
}, {});
|
||||
|
||||
console.log(stats);
|
||||
process.exit(0);
|
||||
})();
|
||||
"
|
||||
```
|
||||
|
||||
Run daily:
|
||||
```bash
|
||||
chmod +x backend/scripts/daily-health-check.sh
|
||||
./backend/scripts/daily-health-check.sh
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📋 Quick Reference Checklist
|
||||
|
||||
When processing fails, check in this order:
|
||||
|
||||
- [ ] **Error logs** (`tail -100 logs/error.log`)
|
||||
- [ ] **Recent failures** (database query in Step 1.1)
|
||||
- [ ] **Direct LLM test** (`test-openrouter-simple.ts`)
|
||||
- [ ] **Model ID validity** (curl OpenRouter API)
|
||||
- [ ] **API keys set** (check `.env`)
|
||||
- [ ] **Timeout values** (check `env.ts`)
|
||||
- [ ] **OpenRouter vs Anthropic** (which provider?)
|
||||
- [ ] **Rate limits** (check error for 429)
|
||||
- [ ] **Code bugs** (look for TypeErrors in logs)
|
||||
- [ ] **Build succeeded** (`npm run build`)
|
||||
|
||||
---
|
||||
|
||||
## 🔧 Common Fix Commands
|
||||
|
||||
```bash
|
||||
# Rebuild after code changes
|
||||
npm run build
|
||||
|
||||
# Clear error logs and start fresh
|
||||
> logs/error.log
|
||||
|
||||
# Test with verbose logging
|
||||
LOG_LEVEL=debug npx ts-node src/scripts/test-openrouter-simple.ts
|
||||
|
||||
# Check what's actually in .env
|
||||
cat .env | grep -v "^#" | grep -E "LLM|ANTHROPIC|OPENROUTER"
|
||||
|
||||
# Verify OpenRouter models
|
||||
curl -s "https://openrouter.ai/api/v1/models" -H "Authorization: Bearer $OPENROUTER_API_KEY" | python3 -m json.tool | grep "claude.*haiku\|claude.*sonnet"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📞 Escalation Path
|
||||
|
||||
If issue persists after 30 minutes:
|
||||
|
||||
1. **Check OpenRouter Status:** https://status.openrouter.ai/
|
||||
2. **Check Anthropic Status:** https://status.anthropic.com/
|
||||
3. **Review OpenRouter Docs:** https://openrouter.ai/docs
|
||||
4. **Test with curl:** Send raw API request to isolate issue
|
||||
5. **Compare git history:** `git diff HEAD~10 -- backend/src/services/llmService.ts`
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Success Criteria
|
||||
|
||||
Processing is "working" when:
|
||||
|
||||
- ✅ Direct LLM test completes in < 2 minutes
|
||||
- ✅ Returns valid JSON matching schema
|
||||
- ✅ No errors in last 10 log entries
|
||||
- ✅ Database shows recent "completed" documents
|
||||
- ✅ Frontend can upload and process test CIM
|
||||
|
||||
---
|
||||
|
||||
**Last Updated:** 2025-11-07
|
||||
**Next Review:** After any production deployment
|
||||
@@ -1,9 +1,48 @@
|
||||
import dotenv from 'dotenv';
|
||||
import Joi from 'joi';
|
||||
import * as functions from 'firebase-functions';
|
||||
|
||||
// Load environment variables
|
||||
// Load environment variables from .env file (for local development)
|
||||
dotenv.config();
|
||||
|
||||
// Use process.env directly - Firebase Functions v2 supports environment variables
|
||||
// For production, set environment variables using:
|
||||
// - firebase functions:secrets:set for sensitive data (recommended)
|
||||
// - Environment variables in firebase.json or function configuration
|
||||
// - .env files for local development
|
||||
// CRITICAL: Also check functions.config() as fallback for Firebase Functions v1 compatibility
|
||||
let env = { ...process.env };
|
||||
|
||||
// CRITICAL FIX: Firebase Functions v1 uses functions.config(), v2 uses process.env
|
||||
// Try to read from functions.config() if process.env doesn't have the value
|
||||
try {
|
||||
const functionsConfig = functions.config();
|
||||
if (functionsConfig && Object.keys(functionsConfig).length > 0) {
|
||||
console.log('[CONFIG DEBUG] functions.config() is available, merging with process.env');
|
||||
// Merge functions.config() values into env (process.env takes precedence)
|
||||
Object.keys(functionsConfig).forEach(key => {
|
||||
if (typeof functionsConfig[key] === 'object' && functionsConfig[key] !== null) {
|
||||
// Handle nested config like functions.config().llm.provider
|
||||
Object.keys(functionsConfig[key]).forEach(subKey => {
|
||||
const envKey = `${key.toUpperCase()}_${subKey.toUpperCase()}`;
|
||||
if (!env[envKey]) {
|
||||
env[envKey] = String(functionsConfig[key][subKey]);
|
||||
}
|
||||
});
|
||||
} else {
|
||||
// Handle flat config
|
||||
const envKey = key.toUpperCase();
|
||||
if (!env[envKey]) {
|
||||
env[envKey] = String(functionsConfig[key]);
|
||||
}
|
||||
}
|
||||
});
|
||||
}
|
||||
} catch (error) {
|
||||
// functions.config() might not be available in v2, that's okay
|
||||
console.log('[CONFIG DEBUG] functions.config() not available (this is normal for v2)');
|
||||
}
|
||||
|
||||
// Environment validation schema
|
||||
const envSchema = Joi.object({
|
||||
NODE_ENV: Joi.string().valid('development', 'production', 'test').default('development'),
|
||||
@@ -74,17 +113,22 @@ const envSchema = Joi.object({
|
||||
ALLOWED_FILE_TYPES: Joi.string().default('application/pdf'),
|
||||
|
||||
// LLM
|
||||
LLM_PROVIDER: Joi.string().valid('openai', 'anthropic').default('openai'),
|
||||
LLM_PROVIDER: Joi.string().valid('openai', 'anthropic', 'openrouter').default('openai'),
|
||||
OPENAI_API_KEY: Joi.string().when('LLM_PROVIDER', {
|
||||
is: 'openai',
|
||||
then: Joi.string().required(),
|
||||
otherwise: Joi.string().allow('').optional()
|
||||
}),
|
||||
ANTHROPIC_API_KEY: Joi.string().when('LLM_PROVIDER', {
|
||||
is: 'anthropic',
|
||||
is: ['anthropic', 'openrouter'],
|
||||
then: Joi.string().required(),
|
||||
otherwise: Joi.string().allow('').optional()
|
||||
}),
|
||||
OPENROUTER_API_KEY: Joi.string().when('LLM_PROVIDER', {
|
||||
is: 'openrouter',
|
||||
then: Joi.string().optional(), // Optional if using BYOK
|
||||
otherwise: Joi.string().allow('').optional()
|
||||
}),
|
||||
LLM_MODEL: Joi.string().default('gpt-4'),
|
||||
LLM_MAX_TOKENS: Joi.number().default(3500),
|
||||
LLM_TEMPERATURE: Joi.number().min(0).max(2).default(0.1),
|
||||
@@ -130,7 +174,7 @@ const envSchema = Joi.object({
|
||||
}).unknown();
|
||||
|
||||
// Validate environment variables
|
||||
const { error, value: envVars } = envSchema.validate(process.env);
|
||||
const { error, value: envVars } = envSchema.validate(env);
|
||||
|
||||
// Enhanced error handling for serverless environments
|
||||
if (error) {
|
||||
@@ -186,6 +230,7 @@ export const validateRuntimeConfig = (): { isValid: boolean; errors: string[] }
|
||||
};
|
||||
|
||||
// Export validated configuration
|
||||
console.log('envVars:', envVars);
|
||||
export const config = {
|
||||
env: envVars.NODE_ENV,
|
||||
nodeEnv: envVars.NODE_ENV,
|
||||
@@ -225,13 +270,22 @@ export const config = {
|
||||
|
||||
upload: {
|
||||
maxFileSize: envVars.MAX_FILE_SIZE,
|
||||
allowedFileTypes: envVars.ALLOWED_FILE_TYPES.split(','),
|
||||
allowedFileTypes: (envVars.ALLOWED_FILE_TYPES || 'application/pdf').split(','),
|
||||
// Cloud-only: No local upload directory needed
|
||||
uploadDir: '/tmp/uploads', // Temporary directory for file processing
|
||||
},
|
||||
|
||||
llm: {
|
||||
provider: envVars['LLM_PROVIDER'] || 'anthropic', // Default to Claude for cost efficiency
|
||||
// CRITICAL: Read LLM_PROVIDER with explicit logging
|
||||
provider: (() => {
|
||||
const provider = envVars['LLM_PROVIDER'] || process.env['LLM_PROVIDER'] || 'anthropic';
|
||||
console.log('[CONFIG DEBUG] LLM Provider resolution:', {
|
||||
fromEnvVars: envVars['LLM_PROVIDER'],
|
||||
fromProcessEnv: process.env['LLM_PROVIDER'],
|
||||
finalProvider: provider
|
||||
});
|
||||
return provider;
|
||||
})(),
|
||||
|
||||
// Anthropic Configuration (Primary)
|
||||
anthropicApiKey: envVars['ANTHROPIC_API_KEY'],
|
||||
@@ -239,18 +293,24 @@ export const config = {
|
||||
// OpenAI Configuration (Fallback)
|
||||
openaiApiKey: envVars['OPENAI_API_KEY'],
|
||||
|
||||
// OpenRouter Configuration (Rate limit workaround)
|
||||
openrouterApiKey: envVars['OPENROUTER_API_KEY'],
|
||||
openrouterUseBYOK: envVars['OPENROUTER_USE_BYOK'] === 'true', // Use BYOK (Bring Your Own Key)
|
||||
|
||||
// Model Selection - Hybrid approach optimized for different tasks
|
||||
model: envVars['LLM_MODEL'] || 'claude-3-7-sonnet-20250219', // Primary model for analysis
|
||||
fastModel: envVars['LLM_FAST_MODEL'] || 'claude-3-5-haiku-20241022', // Fast model for cost optimization
|
||||
fallbackModel: envVars['LLM_FALLBACK_MODEL'] || 'gpt-4.5-preview-2025-02-27', // Fallback for creativity
|
||||
// UPDATED: Using latest Claude 4.5 models compatible with OpenRouter
|
||||
// For OpenRouter: Use generic version names (claude-sonnet-4, claude-haiku-4) instead of dated versions
|
||||
model: envVars['LLM_MODEL'] || 'claude-sonnet-4', // Primary model for analysis (Claude 4.5)
|
||||
fastModel: envVars['LLM_FAST_MODEL'] || 'claude-haiku-4', // Fast model for cost optimization (Claude 4.5)
|
||||
fallbackModel: envVars['LLM_FALLBACK_MODEL'] || 'gpt-4o', // Fallback for creativity
|
||||
|
||||
// Task-specific model selection
|
||||
financialModel: envVars['LLM_FINANCIAL_MODEL'] || 'claude-3-7-sonnet-20250219', // Best for financial analysis
|
||||
creativeModel: envVars['LLM_CREATIVE_MODEL'] || 'gpt-4.5-preview-2025-02-27', // Best for creative content
|
||||
reasoningModel: envVars['LLM_REASONING_MODEL'] || 'claude-3-7-sonnet-20250219', // Best for complex reasoning
|
||||
financialModel: envVars['LLM_FINANCIAL_MODEL'] || 'claude-sonnet-4', // Best for financial analysis (Claude 4.5)
|
||||
creativeModel: envVars['LLM_CREATIVE_MODEL'] || 'gpt-4o', // Best for creative content
|
||||
reasoningModel: envVars['LLM_REASONING_MODEL'] || 'claude-sonnet-4', // Best for complex reasoning (Claude 4.5)
|
||||
|
||||
// Token Limits - Optimized for CIM documents with hierarchical processing
|
||||
maxTokens: parseInt(envVars['LLM_MAX_TOKENS'] || '4000'), // Output tokens (increased for better analysis)
|
||||
maxTokens: parseInt(envVars['LLM_MAX_TOKENS'] || '16000'), // Output tokens (Claude Sonnet 4.5 supports up to 16,384)
|
||||
maxInputTokens: parseInt(envVars['LLM_MAX_INPUT_TOKENS'] || '200000'), // Input tokens (increased for larger context)
|
||||
chunkSize: parseInt(envVars['LLM_CHUNK_SIZE'] || '15000'), // Chunk size for section analysis (increased from 4000)
|
||||
promptBuffer: parseInt(envVars['LLM_PROMPT_BUFFER'] || '1000'), // Buffer for prompt tokens (increased)
|
||||
|
||||
@@ -41,18 +41,63 @@ class LLMService {
|
||||
private temperature: number;
|
||||
|
||||
constructor() {
|
||||
this.provider = config.llm.provider;
|
||||
this.apiKey = this.provider === 'openai'
|
||||
? config.llm.openaiApiKey!
|
||||
: config.llm.anthropicApiKey!;
|
||||
// CRITICAL DEBUG: Log what we're reading from config
|
||||
logger.info('LLM Service constructor - Reading provider from config', {
|
||||
providerFromConfig: config.llm.provider,
|
||||
hasOpenrouterApiKey: !!config.llm.openrouterApiKey,
|
||||
hasAnthropicApiKey: !!config.llm.anthropicApiKey,
|
||||
openrouterUseBYOK: config.llm.openrouterUseBYOK,
|
||||
processEnvLLMProvider: process.env['LLM_PROVIDER'],
|
||||
processEnvOpenrouterKey: process.env['OPENROUTER_API_KEY'] ? 'SET' : 'NOT SET'
|
||||
});
|
||||
|
||||
// Set the correct default model based on provider
|
||||
if (this.provider === 'anthropic') {
|
||||
this.defaultModel = 'claude-3-opus-20240229';
|
||||
} else {
|
||||
this.defaultModel = config.llm.model;
|
||||
// Read provider from config (supports openrouter, anthropic, openai)
|
||||
this.provider = config.llm.provider;
|
||||
|
||||
// CRITICAL: If provider is not set correctly, log and use fallback
|
||||
if (!this.provider || (this.provider !== 'openrouter' && this.provider !== 'anthropic' && this.provider !== 'openai')) {
|
||||
logger.error('LLM provider is invalid or not set', {
|
||||
provider: this.provider,
|
||||
configProvider: config.llm.provider,
|
||||
processEnvProvider: process.env['LLM_PROVIDER'],
|
||||
defaultingTo: 'anthropic'
|
||||
});
|
||||
this.provider = 'anthropic'; // Fallback
|
||||
}
|
||||
|
||||
// Log provider selection for debugging
|
||||
logger.info('LLM Service provider selected', {
|
||||
provider: this.provider,
|
||||
willUseOpenRouter: this.provider === 'openrouter'
|
||||
});
|
||||
|
||||
if (this.provider === 'openrouter') {
|
||||
logger.info('LLM Service initialized with OpenRouter provider');
|
||||
} else if (this.provider === 'anthropic') {
|
||||
logger.info('LLM Service initialized with Anthropic provider');
|
||||
} else {
|
||||
logger.info('LLM Service initialized with OpenAI provider');
|
||||
}
|
||||
|
||||
// Set API key based on provider
|
||||
if (this.provider === 'openai') {
|
||||
this.apiKey = config.llm.openaiApiKey!;
|
||||
} else if (this.provider === 'openrouter') {
|
||||
// OpenRouter: Use OpenRouter key if provided, otherwise use Anthropic key for BYOK
|
||||
this.apiKey = config.llm.openrouterApiKey || config.llm.anthropicApiKey!;
|
||||
logger.info('OpenRouter API key configured', {
|
||||
usingOpenrouterKey: !!config.llm.openrouterApiKey,
|
||||
usingAnthropicKeyForBYOK: !config.llm.openrouterApiKey && !!config.llm.anthropicApiKey,
|
||||
useBYOK: config.llm.openrouterUseBYOK
|
||||
});
|
||||
} else {
|
||||
this.apiKey = config.llm.anthropicApiKey!;
|
||||
}
|
||||
|
||||
// Use configured model instead of hardcoded value
|
||||
// This ensures we use the latest models (e.g., claude-sonnet-4-5-20250929)
|
||||
this.defaultModel = config.llm.model;
|
||||
|
||||
this.maxTokens = config.llm.maxTokens;
|
||||
this.temperature = config.llm.temperature;
|
||||
}
|
||||
@@ -61,13 +106,63 @@ class LLMService {
|
||||
* Process CIM document with intelligent model selection and self-correction
|
||||
*/
|
||||
async processCIMDocument(text: string, template: string, analysis?: Record<string, any>): Promise<CIMAnalysisResult> {
|
||||
logger.info('Starting CIM document processing with LLM');
|
||||
logger.info('Starting CIM document processing with LLM', {
|
||||
textLength: text.length,
|
||||
templateLength: template.length,
|
||||
textPreview: text.substring(0, 200)
|
||||
});
|
||||
|
||||
const taskComplexity = this.determineTaskComplexity(text, analysis || {});
|
||||
const estimatedTokens = this.estimateTokenCount(text + template);
|
||||
// Check and truncate text if it exceeds maxInputTokens
|
||||
const maxInputTokens = config.llm.maxInputTokens || 200000;
|
||||
const systemPromptTokens = this.estimateTokenCount(this.getCIMSystemPrompt());
|
||||
const templateTokens = this.estimateTokenCount(template);
|
||||
const promptBuffer = config.llm.promptBuffer || 1000;
|
||||
|
||||
// Calculate available tokens for document text
|
||||
// Reserve tokens for: system prompt + template + prompt buffer + output tokens
|
||||
const reservedTokens = systemPromptTokens + templateTokens + promptBuffer + (config.llm.maxTokens || 16000);
|
||||
const availableTokens = maxInputTokens - reservedTokens;
|
||||
|
||||
const textTokens = this.estimateTokenCount(text);
|
||||
let processedText = text;
|
||||
let wasTruncated = false;
|
||||
|
||||
if (textTokens > availableTokens) {
|
||||
logger.warn('Document text exceeds token limit, truncating', {
|
||||
textTokens,
|
||||
availableTokens,
|
||||
maxInputTokens,
|
||||
reservedTokens,
|
||||
truncationRatio: (availableTokens / textTokens * 100).toFixed(1) + '%'
|
||||
});
|
||||
|
||||
processedText = this.truncateText(text, availableTokens);
|
||||
wasTruncated = true;
|
||||
|
||||
logger.info('Text truncated successfully', {
|
||||
originalLength: text.length,
|
||||
truncatedLength: processedText.length,
|
||||
originalTokens: textTokens,
|
||||
truncatedTokens: this.estimateTokenCount(processedText)
|
||||
});
|
||||
}
|
||||
|
||||
const taskComplexity = this.determineTaskComplexity(processedText, analysis || {});
|
||||
const estimatedTokens = this.estimateTokenCount(processedText + template);
|
||||
const selectedModel = this.selectModel(taskComplexity, estimatedTokens);
|
||||
|
||||
logger.info('Model selection completed', { taskComplexity, estimatedTokens, selectedModel });
|
||||
logger.info('Model selection completed', {
|
||||
taskComplexity,
|
||||
estimatedTokens,
|
||||
selectedModel,
|
||||
textTokens: this.estimateTokenCount(processedText),
|
||||
systemPromptTokens,
|
||||
templateTokens,
|
||||
reservedTokens,
|
||||
totalEstimatedTokens: estimatedTokens + systemPromptTokens,
|
||||
wasTruncated,
|
||||
maxInputTokens
|
||||
});
|
||||
|
||||
const isRefinement = analysis?.['refinementMode'] === true;
|
||||
const isOverview = analysis?.['overviewMode'] === true;
|
||||
@@ -78,28 +173,57 @@ class LLMService {
|
||||
|
||||
for (let attempt = 1; attempt <= 3; attempt++) {
|
||||
try {
|
||||
// If previous attempt failed with rate limit, wait before retrying
|
||||
if (lastError && lastError.message.includes('rate limit')) {
|
||||
const retryDelay = Math.min(60000 * attempt, 300000); // Exponential backoff: 60s, 120s, 180s (max 5 min)
|
||||
logger.warn(`Rate limit detected, waiting ${retryDelay}ms before retry attempt ${attempt}`, {
|
||||
retryDelay,
|
||||
attempt
|
||||
});
|
||||
await new Promise(resolve => setTimeout(resolve, retryDelay));
|
||||
}
|
||||
|
||||
logger.info(`LLM processing attempt ${attempt}/3`);
|
||||
|
||||
let prompt: string;
|
||||
let systemPrompt: string;
|
||||
|
||||
if (isOverview) {
|
||||
prompt = this.buildOverviewPrompt(text);
|
||||
prompt = this.buildOverviewPrompt(processedText);
|
||||
systemPrompt = this.getOverviewSystemPrompt();
|
||||
} else if (isSynthesis) {
|
||||
prompt = this.buildSynthesisPrompt(text);
|
||||
prompt = this.buildSynthesisPrompt(processedText);
|
||||
systemPrompt = this.getSynthesisSystemPrompt();
|
||||
} else if (sectionType) {
|
||||
prompt = this.buildSectionPrompt(text, sectionType, analysis || {});
|
||||
prompt = this.buildSectionPrompt(processedText, sectionType, analysis || {});
|
||||
systemPrompt = this.getSectionSystemPrompt(sectionType);
|
||||
} else if (isRefinement) {
|
||||
prompt = this.buildRefinementPrompt(text);
|
||||
prompt = this.buildRefinementPrompt(processedText);
|
||||
systemPrompt = this.getRefinementSystemPrompt();
|
||||
} else {
|
||||
prompt = this.buildCIMPrompt(text, template, lastError ? lastError.message : undefined);
|
||||
// Use processedText (may be truncated) instead of original text
|
||||
prompt = this.buildCIMPrompt(processedText, template, lastError ? lastError.message : undefined);
|
||||
systemPrompt = this.getCIMSystemPrompt();
|
||||
}
|
||||
|
||||
// Log prompt details before sending
|
||||
const promptTokens = this.estimateTokenCount(prompt);
|
||||
const systemPromptTokens = this.estimateTokenCount(systemPrompt);
|
||||
const totalInputTokens = promptTokens + systemPromptTokens;
|
||||
|
||||
logger.info('Sending LLM request', {
|
||||
attempt,
|
||||
model: selectedModel,
|
||||
promptTokens,
|
||||
systemPromptTokens,
|
||||
totalInputTokens,
|
||||
maxTokens: config.llm.maxTokens,
|
||||
maxInputTokens: config.llm.maxInputTokens,
|
||||
promptLength: prompt.length,
|
||||
systemPromptLength: systemPrompt.length,
|
||||
withinLimits: totalInputTokens <= (config.llm.maxInputTokens || 200000)
|
||||
});
|
||||
|
||||
const response = await this.callLLM({
|
||||
prompt,
|
||||
systemPrompt,
|
||||
@@ -109,10 +233,55 @@ class LLMService {
|
||||
});
|
||||
|
||||
if (!response.success) {
|
||||
throw new Error('LLM processing failed');
|
||||
logger.error('LLM API call failed', {
|
||||
attempt,
|
||||
model: selectedModel,
|
||||
error: response.error,
|
||||
promptTokens,
|
||||
systemPromptTokens,
|
||||
totalInputTokens
|
||||
});
|
||||
throw new Error(response.error || 'LLM processing failed');
|
||||
}
|
||||
|
||||
// Log successful response details
|
||||
logger.info('LLM API call successful', {
|
||||
attempt,
|
||||
model: selectedModel,
|
||||
responseLength: response.content.length,
|
||||
usage: response.usage,
|
||||
promptTokens: response.usage?.promptTokens || promptTokens,
|
||||
completionTokens: response.usage?.completionTokens || 0,
|
||||
totalTokens: response.usage?.totalTokens || totalInputTokens
|
||||
});
|
||||
|
||||
// Check if response was truncated by examining usage stats
|
||||
if (response.usage && response.usage.completionTokens >= config.llm.maxTokens * 0.95) {
|
||||
logger.warn('LLM response may be truncated - hit token limit', {
|
||||
completionTokens: response.usage.completionTokens,
|
||||
maxTokens: config.llm.maxTokens,
|
||||
percentage: (response.usage.completionTokens / config.llm.maxTokens * 100).toFixed(1) + '%'
|
||||
});
|
||||
}
|
||||
|
||||
// DEBUG: Log what we're extracting
|
||||
logger.info(`Extracting JSON from LLM response (attempt ${attempt})`, {
|
||||
contentLength: response.content.length,
|
||||
contentPreview: response.content.substring(0, 500),
|
||||
hasJsonBlock: response.content.includes('```json'),
|
||||
jsonBlockCount: (response.content.match(/```json/g) || []).length
|
||||
});
|
||||
|
||||
const jsonOutput = this.extractJsonFromResponse(response.content);
|
||||
|
||||
logger.info(`JSON extracted from LLM response (attempt ${attempt})`, {
|
||||
jsonOutputType: typeof jsonOutput,
|
||||
jsonOutputKeys: jsonOutput ? Object.keys(jsonOutput) : [],
|
||||
jsonOutputPreview: jsonOutput ? JSON.stringify(jsonOutput).substring(0, 500) : 'null',
|
||||
hasDealOverview: jsonOutput ? 'dealOverview' in jsonOutput : false,
|
||||
hasBusinessDescription: jsonOutput ? 'businessDescription' in jsonOutput : false,
|
||||
});
|
||||
|
||||
const validation = cimReviewSchema.safeParse(jsonOutput);
|
||||
|
||||
if (validation.success) {
|
||||
@@ -127,11 +296,23 @@ class LLMService {
|
||||
};
|
||||
} else {
|
||||
lastError = new Error(`JSON validation failed: ${validation.error.errors.map(e => e.message).join(', ')}`);
|
||||
logger.warn(`LLM output validation failed on attempt ${attempt}`, { issues: validation.error.errors });
|
||||
logger.warn(`LLM output validation failed on attempt ${attempt}`, {
|
||||
issues: validation.error.errors,
|
||||
jsonOutputPreview: JSON.stringify(jsonOutput).substring(0, 500),
|
||||
validationErrors: validation.error.errors.map(e => `${e.path.join('.')}: ${e.message}`)
|
||||
});
|
||||
|
||||
// Log the full LLM response for debugging on last attempt
|
||||
if (attempt === 3) {
|
||||
logger.error('LLM validation failed after 3 attempts - Full LLM response:', {
|
||||
fullResponse: response.content,
|
||||
extractedJson: jsonOutput,
|
||||
validationErrors: validation.error.errors
|
||||
});
|
||||
|
||||
return {
|
||||
success: false,
|
||||
error: 'Failed to generate valid JSON after 3 attempts.',
|
||||
error: `Failed to generate valid JSON after 3 attempts. Validation errors: ${validation.error.errors.map(e => `${e.path.join('.')}: ${e.message}`).join('; ')}`,
|
||||
model: selectedModel,
|
||||
cost: this.estimateCost(estimatedTokens, selectedModel),
|
||||
inputTokens: estimatedTokens,
|
||||
@@ -157,17 +338,36 @@ class LLMService {
|
||||
*/
|
||||
private async callLLM(request: LLMRequest): Promise<LLMResponse> {
|
||||
try {
|
||||
// Use configured timeout from config.llm.timeoutMs (default 6 minutes for complex analysis)
|
||||
// Increased from 3 minutes to handle complex CIM analysis even with RAG reduction
|
||||
const timeoutMs = config.llm.timeoutMs || 360000;
|
||||
const timeoutMinutes = Math.round(timeoutMs / 60000);
|
||||
|
||||
// Add a timeout wrapper to prevent hanging
|
||||
const timeoutPromise = new Promise<LLMResponse>((_, reject) => {
|
||||
setTimeout(() => reject(new Error('LLM call timeout after 3 minutes')), 180000);
|
||||
setTimeout(() => reject(new Error(`LLM call timeout after ${timeoutMinutes} minutes`)), timeoutMs);
|
||||
});
|
||||
|
||||
const llmPromise = (async () => {
|
||||
// CRITICAL DEBUG: Log which provider method we're calling
|
||||
logger.info('Calling LLM provider method', {
|
||||
provider: this.provider,
|
||||
model: request.model || this.defaultModel,
|
||||
willCallOpenRouter: this.provider === 'openrouter',
|
||||
willCallAnthropic: this.provider === 'anthropic',
|
||||
willCallOpenAI: this.provider === 'openai'
|
||||
});
|
||||
|
||||
if (this.provider === 'openai') {
|
||||
return await this.callOpenAI(request);
|
||||
} else if (this.provider === 'openrouter') {
|
||||
logger.info('Routing to callOpenRouter method');
|
||||
return await this.callOpenRouter(request);
|
||||
} else if (this.provider === 'anthropic') {
|
||||
logger.info('Routing to callAnthropic method');
|
||||
return await this.callAnthropic(request);
|
||||
} else {
|
||||
logger.error('Unsupported LLM provider', { provider: this.provider });
|
||||
throw new Error(`Unsupported LLM provider: ${this.provider}`);
|
||||
}
|
||||
})();
|
||||
@@ -189,9 +389,14 @@ class LLMService {
|
||||
private async callOpenAI(request: LLMRequest): Promise<LLMResponse> {
|
||||
const { default: OpenAI } = await import('openai');
|
||||
|
||||
// Use configured timeout to match wrapper timeout
|
||||
// Add 10 seconds buffer to ensure wrapper timeout fires first if needed
|
||||
const timeoutMs = config.llm.timeoutMs || 180000;
|
||||
const sdkTimeout = timeoutMs + 10000; // 10 second buffer
|
||||
|
||||
const openai = new OpenAI({
|
||||
apiKey: this.apiKey,
|
||||
timeout: 120000, // 2 minute timeout
|
||||
timeout: sdkTimeout,
|
||||
});
|
||||
|
||||
const messages: any[] = [];
|
||||
@@ -236,9 +441,14 @@ class LLMService {
|
||||
try {
|
||||
const { default: Anthropic } = await import('@anthropic-ai/sdk');
|
||||
|
||||
// Use configured timeout to match wrapper timeout
|
||||
// Add 10 seconds buffer to ensure wrapper timeout fires first if needed
|
||||
const timeoutMs = config.llm.timeoutMs || 180000;
|
||||
const sdkTimeout = timeoutMs + 10000; // 10 second buffer
|
||||
|
||||
const anthropic = new Anthropic({
|
||||
apiKey: this.apiKey,
|
||||
timeout: 120000, // 2 minute timeout
|
||||
timeout: sdkTimeout,
|
||||
});
|
||||
|
||||
const message = await anthropic.messages.create({
|
||||
@@ -266,9 +476,392 @@ class LLMService {
|
||||
content,
|
||||
usage,
|
||||
};
|
||||
} catch (error) {
|
||||
} catch (error: any) {
|
||||
// Check for rate limit errors (429)
|
||||
const isRateLimit = error?.status === 429 ||
|
||||
error?.error?.type === 'rate_limit_error' ||
|
||||
error?.message?.includes('rate limit') ||
|
||||
error?.message?.includes('429');
|
||||
|
||||
if (isRateLimit) {
|
||||
const retryAfter = error?.headers?.['retry-after'] ||
|
||||
error?.error?.retry_after ||
|
||||
'60'; // Default to 60 seconds
|
||||
|
||||
logger.error('Anthropic API rate limit error (429)', {
|
||||
error: error?.error || error?.message,
|
||||
retryAfter,
|
||||
requestId: error?.request_id,
|
||||
status: error?.status
|
||||
});
|
||||
|
||||
throw new Error(`Anthropic API rate limit exceeded. Retry after ${retryAfter} seconds. Request ID: ${error?.request_id || 'unknown'}`);
|
||||
}
|
||||
|
||||
logger.error('Anthropic API error', error);
|
||||
throw new Error('Anthropic API error');
|
||||
throw new Error(`Anthropic API error: ${error?.message || error?.error?.message || 'Unknown error'}`);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Call OpenRouter API (with BYOK support for better rate limits)
|
||||
*/
|
||||
private async callOpenRouter(request: LLMRequest): Promise<LLMResponse> {
|
||||
const startTime = Date.now();
|
||||
let requestSentTime: number | null = null;
|
||||
let responseReceivedTime: number | null = null;
|
||||
|
||||
// CRITICAL: Increase timeout to 6 minutes (360s) for complex analysis
|
||||
// Even with RAG reduction, complex CIM analysis can take time
|
||||
const timeoutMs = config.llm.timeoutMs || 360000; // Default to 6 minutes instead of 3
|
||||
const abortTimeoutMs = timeoutMs - 10000; // Abort 10 seconds before wrapper timeout
|
||||
|
||||
try {
|
||||
// OpenRouter uses OpenAI-compatible API format
|
||||
const axios = await import('axios');
|
||||
|
||||
const model = request.model || this.defaultModel;
|
||||
const useBYOK = config.llm.openrouterUseBYOK;
|
||||
|
||||
// OpenRouter model format: Use exact model IDs from OpenRouter API
|
||||
// Map Anthropic model names to OpenRouter format
|
||||
let openRouterModel = model;
|
||||
if (model.includes('claude')) {
|
||||
// Convert Anthropic model names to OpenRouter format
|
||||
// Handle both versioned (claude-sonnet-4-5-20250929) and generic (claude-sonnet-4) formats
|
||||
if (model.includes('sonnet') && model.includes('4')) {
|
||||
openRouterModel = 'anthropic/claude-sonnet-4.5'; // Claude 4.5 Sonnet
|
||||
} else if (model.includes('haiku') && model.includes('4')) {
|
||||
openRouterModel = 'anthropic/claude-haiku-4.5'; // Claude 4.5 Haiku
|
||||
} else if (model.includes('opus') && model.includes('4')) {
|
||||
openRouterModel = 'anthropic/claude-opus-4';
|
||||
} else if (model.includes('sonnet') && model.includes('3.7')) {
|
||||
openRouterModel = 'anthropic/claude-3.7-sonnet';
|
||||
} else if (model.includes('sonnet') && model.includes('3.5')) {
|
||||
openRouterModel = 'anthropic/claude-3.5-sonnet';
|
||||
} else if (model.includes('haiku') && model.includes('3.5')) {
|
||||
openRouterModel = 'anthropic/claude-3.5-haiku';
|
||||
} else if (model.includes('haiku') && model.includes('3')) {
|
||||
openRouterModel = 'anthropic/claude-3-haiku';
|
||||
} else if (model.includes('opus') && model.includes('3')) {
|
||||
openRouterModel = 'anthropic/claude-3-opus';
|
||||
} else {
|
||||
// Fallback: try to construct from model name
|
||||
openRouterModel = `anthropic/${model}`;
|
||||
}
|
||||
}
|
||||
|
||||
const headers: Record<string, string> = {
|
||||
'Authorization': `Bearer ${this.apiKey}`,
|
||||
'Content-Type': 'application/json',
|
||||
'HTTP-Referer': 'https://cim-summarizer-testing.firebaseapp.com', // Optional: for analytics
|
||||
'X-Title': 'CIM Summarizer', // Optional: for analytics
|
||||
};
|
||||
|
||||
// If using BYOK, add provider credentials
|
||||
// CRITICAL: For Anthropic models via OpenRouter, X-Anthropic-Api-Key must be set for BYOK
|
||||
if (useBYOK && openRouterModel.includes('anthropic/')) {
|
||||
if (!config.llm.anthropicApiKey) {
|
||||
throw new Error('BYOK enabled but ANTHROPIC_API_KEY is not set');
|
||||
}
|
||||
headers['X-Anthropic-Api-Key'] = config.llm.anthropicApiKey;
|
||||
logger.info('Using BYOK with Anthropic API key', {
|
||||
hasKey: !!config.llm.anthropicApiKey,
|
||||
keyLength: config.llm.anthropicApiKey?.length || 0
|
||||
});
|
||||
}
|
||||
|
||||
// CRITICAL: Log before making the OpenRouter API call
|
||||
logger.info('Making OpenRouter API call', {
|
||||
url: 'https://openrouter.ai/api/v1/chat/completions',
|
||||
model: openRouterModel,
|
||||
originalModel: model,
|
||||
useBYOK,
|
||||
hasAnthropicKey: !!config.llm.anthropicApiKey,
|
||||
timeout: timeoutMs,
|
||||
abortTimeout: abortTimeoutMs,
|
||||
promptLength: request.prompt.length,
|
||||
systemPromptLength: request.systemPrompt?.length || 0,
|
||||
maxTokens: request.maxTokens || this.maxTokens,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
|
||||
// CRITICAL FIX: Use AbortController for proper timeout handling
|
||||
// Axios timeout doesn't always work correctly in Firebase Functions
|
||||
// IMPORTANT: Abort 10 seconds before wrapper timeout to ensure proper cleanup
|
||||
const abortController = new AbortController();
|
||||
const timeoutId = setTimeout(() => {
|
||||
const elapsed = Date.now() - startTime;
|
||||
logger.error('OpenRouter request timeout - aborting', {
|
||||
elapsedMs: elapsed,
|
||||
timeoutMs,
|
||||
abortTimeoutMs,
|
||||
requestSentTime: requestSentTime ? Date.now() - requestSentTime : null,
|
||||
responseReceivedTime: responseReceivedTime ? Date.now() - responseReceivedTime : null,
|
||||
signalAborted: abortController.signal.aborted
|
||||
});
|
||||
abortController.abort();
|
||||
}, abortTimeoutMs);
|
||||
|
||||
// CRITICAL: Don't use interceptors - they may interfere with the request
|
||||
// Instead, log before and after the axios call directly
|
||||
try {
|
||||
requestSentTime = Date.now();
|
||||
|
||||
// CRITICAL: Construct request body and validate format
|
||||
const requestBody = {
|
||||
model: openRouterModel,
|
||||
messages: [
|
||||
...(request.systemPrompt ? [{
|
||||
role: 'system',
|
||||
content: request.systemPrompt
|
||||
}] : []),
|
||||
{
|
||||
role: 'user',
|
||||
content: request.prompt
|
||||
}
|
||||
],
|
||||
max_tokens: request.maxTokens || this.maxTokens,
|
||||
temperature: request.temperature || this.temperature,
|
||||
};
|
||||
|
||||
// Validate request body structure
|
||||
if (!requestBody.model || !requestBody.messages || requestBody.messages.length === 0) {
|
||||
throw new Error('Invalid OpenRouter request body: missing model or messages');
|
||||
}
|
||||
|
||||
const requestBodySize = JSON.stringify(requestBody).length;
|
||||
const requestBodyPreview = JSON.stringify({
|
||||
model: requestBody.model,
|
||||
messageCount: requestBody.messages.length,
|
||||
firstMessageRole: requestBody.messages[0]?.role,
|
||||
firstMessageLength: requestBody.messages[0]?.content?.length || 0,
|
||||
max_tokens: requestBody.max_tokens,
|
||||
temperature: requestBody.temperature
|
||||
});
|
||||
|
||||
// CRITICAL: Log the EXACT request being sent (full details)
|
||||
logger.info('=== OPENROUTER REQUEST DETAILS ===', {
|
||||
url: 'https://openrouter.ai/api/v1/chat/completions',
|
||||
method: 'POST',
|
||||
headers: {
|
||||
...headers,
|
||||
// Don't log full API keys, just indicate presence
|
||||
'Authorization': headers['Authorization'] ? `Bearer ${headers['Authorization'].substring(7, 20)}...` : 'MISSING',
|
||||
'X-Anthropic-Api-Key': headers['X-Anthropic-Api-Key'] ? `${headers['X-Anthropic-Api-Key'].substring(0, 20)}...` : 'NOT SET'
|
||||
},
|
||||
requestBody: {
|
||||
model: requestBody.model,
|
||||
messageCount: requestBody.messages.length,
|
||||
messages: requestBody.messages.map((msg: any, idx: number) => ({
|
||||
index: idx,
|
||||
role: msg.role,
|
||||
contentLength: msg.content?.length || 0,
|
||||
contentPreview: msg.content?.substring(0, 200) + (msg.content?.length > 200 ? '...' : ''),
|
||||
fullContent: msg.content // Log full content for debugging
|
||||
})),
|
||||
max_tokens: requestBody.max_tokens,
|
||||
temperature: requestBody.temperature
|
||||
},
|
||||
requestBodySize,
|
||||
timeSinceStart: Date.now() - startTime,
|
||||
signalAborted: abortController.signal.aborted,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
|
||||
// CRITICAL: Log immediately after axios.post is called (before await)
|
||||
logger.info('Axios POST call initiated, awaiting response...', {
|
||||
timeSinceStart: Date.now() - startTime,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
|
||||
// Use axios.default.post with proper timeout and AbortController
|
||||
// CRITICAL: Use the validated requestBody we constructed above
|
||||
const response = await axios.default.post(
|
||||
'https://openrouter.ai/api/v1/chat/completions',
|
||||
requestBody, // Use the validated request body
|
||||
{
|
||||
headers,
|
||||
// CRITICAL: Set timeout here (not on instance) to work with AbortController
|
||||
// Axios timeout should be slightly longer than AbortController to let abort fire first
|
||||
timeout: abortTimeoutMs + 1000, // 1 second buffer after abort timeout
|
||||
signal: abortController.signal, // CRITICAL: Use AbortController signal
|
||||
// Add validateStatus to ensure we get proper error responses
|
||||
validateStatus: (status) => status < 500, // Don't throw on 4xx errors
|
||||
}
|
||||
);
|
||||
|
||||
clearTimeout(timeoutId);
|
||||
|
||||
const totalTime = Date.now() - startTime;
|
||||
responseReceivedTime = Date.now();
|
||||
|
||||
// CRITICAL: Check for API errors before accessing response data
|
||||
if (response.status >= 400) {
|
||||
// Handle error response
|
||||
logger.error('OpenRouter API error', {
|
||||
status: response.status,
|
||||
error: response.data?.error || response.data,
|
||||
user_id: headers['X-User-Id'] || 'unknown'
|
||||
});
|
||||
throw new Error(response.data?.error?.message || `OpenRouter API error: HTTP ${response.status}`);
|
||||
}
|
||||
|
||||
// CRITICAL: Log the EXACT response received (full details)
|
||||
const content = response.data?.choices?.[0]?.message?.content || '';
|
||||
const usage = response.data.usage ? {
|
||||
promptTokens: response.data.usage.prompt_tokens || 0,
|
||||
completionTokens: response.data.usage.completion_tokens || 0,
|
||||
totalTokens: response.data.usage.total_tokens || 0,
|
||||
} : undefined;
|
||||
|
||||
logger.info('=== OPENROUTER RESPONSE RECEIVED ===', {
|
||||
status: response.status,
|
||||
statusText: response.statusText,
|
||||
headers: response.headers ? Object.keys(response.headers) : [],
|
||||
responseData: {
|
||||
id: response.data.id,
|
||||
model: response.data.model,
|
||||
object: response.data.object,
|
||||
created: response.data.created,
|
||||
choices: response.data.choices ? response.data.choices.map((choice: any, idx: number) => ({
|
||||
index: idx,
|
||||
finishReason: choice.finish_reason,
|
||||
messageRole: choice.message?.role,
|
||||
messageContentLength: choice.message?.content?.length || 0,
|
||||
messageContentPreview: choice.message?.content?.substring(0, 500) + (choice.message?.content?.length > 500 ? '...' : ''),
|
||||
messageContentFull: choice.message?.content // Log full content for debugging
|
||||
})) : [],
|
||||
usage: usage,
|
||||
fullResponseData: response.data // Log full response for debugging
|
||||
},
|
||||
timeSinceStart: totalTime,
|
||||
timeSinceRequest: responseReceivedTime - (requestSentTime || startTime),
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
|
||||
logger.info('OpenRouter API call successful (summary)', {
|
||||
model: openRouterModel,
|
||||
usage,
|
||||
responseLength: content.length,
|
||||
totalTimeMs: totalTime,
|
||||
requestTimeMs: requestSentTime ? responseReceivedTime - requestSentTime : null,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
|
||||
return {
|
||||
success: true,
|
||||
content,
|
||||
usage,
|
||||
};
|
||||
} catch (axiosError: any) {
|
||||
clearTimeout(timeoutId);
|
||||
|
||||
const totalTime = Date.now() - startTime;
|
||||
|
||||
// CRITICAL: Log the EXACT error details
|
||||
logger.error('=== OPENROUTER REQUEST ERROR ===', {
|
||||
errorName: axiosError.name,
|
||||
errorMessage: axiosError.message,
|
||||
errorCode: axiosError.code,
|
||||
errorStack: axiosError.stack,
|
||||
response: axiosError.response ? {
|
||||
status: axiosError.response.status,
|
||||
statusText: axiosError.response.statusText,
|
||||
headers: axiosError.response.headers ? Object.keys(axiosError.response.headers) : [],
|
||||
data: axiosError.response.data, // Full error response data
|
||||
dataString: JSON.stringify(axiosError.response.data)
|
||||
} : null,
|
||||
request: axiosError.request ? {
|
||||
method: axiosError.request.method,
|
||||
path: axiosError.request.path,
|
||||
headers: axiosError.request.headers ? Object.keys(axiosError.request.headers) : []
|
||||
} : null,
|
||||
config: axiosError.config ? {
|
||||
url: axiosError.config.url,
|
||||
method: axiosError.config.method,
|
||||
timeout: axiosError.config.timeout,
|
||||
headers: axiosError.config.headers ? Object.keys(axiosError.config.headers) : []
|
||||
} : null,
|
||||
totalTimeMs: totalTime,
|
||||
requestSentTime: requestSentTime ? Date.now() - requestSentTime : null,
|
||||
timeoutMs,
|
||||
abortTimeoutMs,
|
||||
signalAborted: abortController.signal.aborted,
|
||||
wasRequestSent: requestSentTime !== null,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
|
||||
// Check if it was aborted
|
||||
if (axiosError.name === 'AbortError' || axiosError.code === 'ECONNABORTED' || abortController.signal.aborted) {
|
||||
logger.error('OpenRouter request was aborted (timeout)', {
|
||||
totalTimeMs: totalTime,
|
||||
requestSentTime: requestSentTime ? Date.now() - requestSentTime : null,
|
||||
timeoutMs,
|
||||
abortTimeoutMs,
|
||||
error: axiosError.message,
|
||||
code: axiosError.code,
|
||||
name: axiosError.name,
|
||||
signalAborted: abortController.signal.aborted,
|
||||
wasRequestSent: requestSentTime !== null
|
||||
});
|
||||
throw new Error(`OpenRouter API request timed out after ${Math.round(totalTime / 1000)}s (abort timeout: ${Math.round(abortTimeoutMs / 1000)}s)`);
|
||||
}
|
||||
|
||||
// Check if it's an axios timeout (different from abort)
|
||||
if (axiosError.code === 'ECONNABORTED' && axiosError.message?.includes('timeout')) {
|
||||
logger.error('OpenRouter request timed out (axios timeout)', {
|
||||
totalTimeMs: totalTime,
|
||||
requestSentTime: requestSentTime ? Date.now() - requestSentTime : null,
|
||||
timeoutMs,
|
||||
abortTimeoutMs,
|
||||
error: axiosError.message,
|
||||
code: axiosError.code
|
||||
});
|
||||
throw new Error(`OpenRouter API request timed out after ${Math.round(totalTime / 1000)}s (axios timeout)`);
|
||||
}
|
||||
|
||||
// Re-throw to be handled by outer catch
|
||||
throw axiosError;
|
||||
}
|
||||
} catch (error: any) {
|
||||
const totalTime = Date.now() - startTime;
|
||||
|
||||
// Check for rate limit errors (429)
|
||||
const isRateLimit = error?.response?.status === 429 ||
|
||||
error?.response?.data?.error?.message?.includes('rate limit') ||
|
||||
error?.message?.includes('rate limit') ||
|
||||
error?.message?.includes('429');
|
||||
|
||||
if (isRateLimit) {
|
||||
const retryAfter = error?.response?.headers?.['retry-after'] ||
|
||||
error?.response?.data?.error?.retry_after ||
|
||||
'60';
|
||||
|
||||
logger.error('OpenRouter API rate limit error (429)', {
|
||||
error: error?.response?.data?.error || error?.message,
|
||||
retryAfter,
|
||||
status: error?.response?.status,
|
||||
totalTimeMs: totalTime
|
||||
});
|
||||
|
||||
throw new Error(`OpenRouter API rate limit exceeded. Retry after ${retryAfter} seconds.`);
|
||||
}
|
||||
|
||||
// Enhanced error logging
|
||||
logger.error('OpenRouter API error', {
|
||||
error: error?.response?.data || error?.message,
|
||||
status: error?.response?.status,
|
||||
code: error?.code,
|
||||
name: error?.name,
|
||||
totalTimeMs: totalTime,
|
||||
requestSentTime: requestSentTime ? Date.now() - requestSentTime : null,
|
||||
responseReceivedTime: responseReceivedTime ? Date.now() - responseReceivedTime : null,
|
||||
isTimeout: error?.message?.includes('timeout') || error?.message?.includes('timed out'),
|
||||
isAborted: error?.name === 'AbortError' || error?.code === 'ECONNABORTED'
|
||||
});
|
||||
|
||||
throw new Error(`OpenRouter API error: ${error?.response?.data?.error?.message || error?.message || 'Unknown error'}`);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -473,8 +1066,100 @@ SPECIAL REQUIREMENTS FOR PRELIMINARY INVESTMENT THESIS:
|
||||
private extractJsonFromResponse(content: string): any {
|
||||
try {
|
||||
// First, try to find JSON within ```json ... ```
|
||||
const jsonMatch = content.match(/```json\n([\s\S]*?)\n```/);
|
||||
// Use indexOf/lastIndexOf to get the complete block (not just first match)
|
||||
const jsonBlockStart = content.indexOf('```json');
|
||||
logger.info('JSON extraction - checking for ```json block', {
|
||||
jsonBlockStart,
|
||||
hasJsonBlock: jsonBlockStart !== -1,
|
||||
contentLength: content.length,
|
||||
contentEnds: content.substring(content.length - 50),
|
||||
});
|
||||
|
||||
if (jsonBlockStart !== -1) {
|
||||
const jsonContentStart = content.indexOf('\n', jsonBlockStart) + 1;
|
||||
// Find closing ``` by looking for the pattern that ends a code block
|
||||
// Search for \n``` or ``` at the end, starting from jsonContentStart
|
||||
let closingBackticks = -1;
|
||||
|
||||
// Try to find \n``` first (most common)
|
||||
const newlineBackticks = content.indexOf('\n```', jsonContentStart);
|
||||
if (newlineBackticks !== -1) {
|
||||
closingBackticks = newlineBackticks + 1; // Position after the newline
|
||||
} else {
|
||||
// Fallback: look for ``` at the very end
|
||||
if (content.endsWith('```')) {
|
||||
closingBackticks = content.length - 3;
|
||||
} else {
|
||||
// LLM didn't close the code fence - use entire remaining content
|
||||
closingBackticks = content.length;
|
||||
logger.warn('LLM response has no closing backticks, using entire content');
|
||||
}
|
||||
}
|
||||
|
||||
logger.info('JSON extraction - found block boundaries', {
|
||||
jsonContentStart,
|
||||
closingBackticks,
|
||||
newlineBackticks,
|
||||
contentEndsWithBackticks: content.endsWith('```'),
|
||||
isValid: closingBackticks > jsonContentStart,
|
||||
});
|
||||
|
||||
if (jsonContentStart > 0 && closingBackticks > jsonContentStart) {
|
||||
const jsonStr = content.substring(jsonContentStart, closingBackticks).trim();
|
||||
|
||||
logger.info('JSON extraction - extracted string', {
|
||||
jsonStrLength: jsonStr.length,
|
||||
startsWithBrace: jsonStr.startsWith('{'),
|
||||
jsonStrPreview: jsonStr.substring(0, 300),
|
||||
});
|
||||
|
||||
if (jsonStr && jsonStr.startsWith('{')) {
|
||||
try {
|
||||
// Use brace matching to get the complete root object
|
||||
let braceCount = 0;
|
||||
let rootEndIndex = -1;
|
||||
for (let i = 0; i < jsonStr.length; i++) {
|
||||
if (jsonStr[i] === '{') braceCount++;
|
||||
else if (jsonStr[i] === '}') {
|
||||
braceCount--;
|
||||
if (braceCount === 0) {
|
||||
rootEndIndex = i;
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
if (rootEndIndex !== -1) {
|
||||
const completeJsonStr = jsonStr.substring(0, rootEndIndex + 1);
|
||||
logger.info('Brace matching succeeded', {
|
||||
originalLength: jsonStr.length,
|
||||
extractedLength: completeJsonStr.length,
|
||||
extractedPreview: completeJsonStr.substring(0, 200),
|
||||
});
|
||||
return JSON.parse(completeJsonStr);
|
||||
} else {
|
||||
logger.warn('Brace matching failed to find closing brace', {
|
||||
jsonStrLength: jsonStr.length,
|
||||
jsonStrPreview: jsonStr.substring(0, 500),
|
||||
});
|
||||
}
|
||||
} catch (e) {
|
||||
logger.error('Brace matching threw error, falling back to regex', {
|
||||
error: e instanceof Error ? e.message : String(e),
|
||||
stack: e instanceof Error ? e.stack : undefined,
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Fallback to regex match - use GREEDY match to get full JSON
|
||||
logger.warn('Using fallback regex extraction');
|
||||
const jsonMatch = content.match(/```json\n([\s\S]+)\n```/); // Changed to greedy (+)
|
||||
if (jsonMatch && jsonMatch[1]) {
|
||||
logger.info('Regex extraction found JSON', {
|
||||
matchLength: jsonMatch[1].length,
|
||||
matchPreview: jsonMatch[1].substring(0, 200),
|
||||
});
|
||||
return JSON.parse(jsonMatch[1]);
|
||||
}
|
||||
|
||||
@@ -507,14 +1192,26 @@ SPECIAL REQUIREMENTS FOR PRELIMINARY INVESTMENT THESIS:
|
||||
}
|
||||
|
||||
if (endIndex === -1) {
|
||||
// If we can't find a complete JSON object, try to extract what we have
|
||||
// and attempt to complete it
|
||||
// If we can't find a complete JSON object, the response was likely truncated
|
||||
const partialJson = content.substring(startIndex);
|
||||
const openBraces = (partialJson.match(/{/g) || []).length;
|
||||
const closeBraces = (partialJson.match(/}/g) || []).length;
|
||||
const isTruncated = openBraces > closeBraces;
|
||||
|
||||
logger.warn('Attempting to recover from truncated JSON response', {
|
||||
contentLength: content.length,
|
||||
partialJsonLength: partialJson.length
|
||||
partialJsonLength: partialJson.length,
|
||||
openBraces,
|
||||
closeBraces,
|
||||
isTruncated,
|
||||
endsAbruptly: !content.trim().endsWith('}') && !content.trim().endsWith('```')
|
||||
});
|
||||
|
||||
// If clearly truncated (more open than close braces), throw a specific error
|
||||
if (isTruncated && openBraces - closeBraces > 2) {
|
||||
throw new Error(`Response was truncated due to token limit. Expected ${openBraces - closeBraces} more closing braces. Increase maxTokens limit.`);
|
||||
}
|
||||
|
||||
// Try to find the last complete object or array
|
||||
const lastCompleteMatch = partialJson.match(/(\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\})/);
|
||||
if (lastCompleteMatch && lastCompleteMatch[1]) {
|
||||
@@ -527,7 +1224,7 @@ SPECIAL REQUIREMENTS FOR PRELIMINARY INVESTMENT THESIS:
|
||||
return JSON.parse(lastPairMatch[1]);
|
||||
}
|
||||
|
||||
throw new Error('Unable to extract valid JSON from truncated response');
|
||||
throw new Error(`Unable to extract valid JSON from truncated response. Response appears incomplete (${openBraces} open braces, ${closeBraces} close braces). Increase maxTokens limit.`);
|
||||
}
|
||||
|
||||
const jsonString = content.substring(startIndex, endIndex + 1);
|
||||
@@ -550,6 +1247,47 @@ SPECIAL REQUIREMENTS FOR PRELIMINARY INVESTMENT THESIS:
|
||||
return Math.ceil(text.length / 4);
|
||||
}
|
||||
|
||||
/**
|
||||
* Truncate text to fit within token limit while preserving sentence boundaries
|
||||
*/
|
||||
private truncateText(text: string, maxTokens: number): string {
|
||||
// Convert token limit to character limit (approximate)
|
||||
const maxChars = maxTokens * 4;
|
||||
|
||||
if (text.length <= maxChars) {
|
||||
return text;
|
||||
}
|
||||
|
||||
// Try to truncate at sentence boundaries for better context preservation
|
||||
const truncated = text.substring(0, maxChars);
|
||||
|
||||
// Find the last sentence boundary (period, exclamation, question mark followed by space)
|
||||
const sentenceEndRegex = /[.!?]\s+/g;
|
||||
let lastMatch: RegExpExecArray | null = null;
|
||||
let match: RegExpExecArray | null;
|
||||
|
||||
while ((match = sentenceEndRegex.exec(truncated)) !== null) {
|
||||
if (match.index < maxChars * 0.95) { // Only use if within 95% of limit
|
||||
lastMatch = match;
|
||||
}
|
||||
}
|
||||
|
||||
if (lastMatch) {
|
||||
// Truncate at sentence boundary
|
||||
return text.substring(0, lastMatch.index + lastMatch[0].length).trim();
|
||||
}
|
||||
|
||||
// Fallback: truncate at word boundary
|
||||
const wordBoundaryRegex = /\s+/;
|
||||
const lastSpaceIndex = truncated.lastIndexOf(' ');
|
||||
if (lastSpaceIndex > maxChars * 0.9) {
|
||||
return text.substring(0, lastSpaceIndex).trim();
|
||||
}
|
||||
|
||||
// Final fallback: hard truncate
|
||||
return truncated.trim();
|
||||
}
|
||||
|
||||
/**
|
||||
* Select the best model for the task based on complexity and cost optimization
|
||||
*/
|
||||
|
||||
Reference in New Issue
Block a user