# CIM Summary: Codebase Architecture
**Last Updated**: December 2024
**Purpose**: Comprehensive technical reference for senior developers optimizing and debugging the codebase

---
## Table of Contents
1. [System Overview](#1-system-overview)
2. [Application Entry Points](#2-application-entry-points)
3. [Request Flow & API Architecture](#3-request-flow--api-architecture)
4. [Document Processing Pipeline (Critical Path)](#4-document-processing-pipeline-critical-path)
5. [Core Services Deep Dive](#5-core-services-deep-dive)
6. [Data Models & Database Schema](#6-data-models--database-schema)
7. [Component Handoffs & Integration Points](#7-component-handoffs--integration-points)
8. [Error Handling & Resilience](#8-error-handling--resilience)
9. [Performance Optimization Points](#9-performance-optimization-points)
10. [Background Processing Architecture](#10-background-processing-architecture)
11. [Frontend Architecture](#11-frontend-architecture)
12. [Configuration & Environment](#12-configuration--environment)
13. [Debugging Guide](#13-debugging-guide)
14. [Optimization Opportunities](#14-optimization-opportunities)

---
## 1. System Overview
### High-Level Architecture
```
┌─────────────────────────────────────────────────────────┐
│                    Frontend (React)                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │DocumentUpload│  │ DocumentList │  │  Analytics   │   │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘   │
│         │                 │                 │           │
│         └─────────────────┴─────────────────┘           │
│                           │                             │
│                  ┌────────▼────────┐                    │
│                  │ documentService │                    │
│                  │ (Axios Client)  │                    │
│                  └────────┬────────┘                    │
└───────────────────────────┼─────────────────────────────┘
                            │ HTTPS + JWT
┌───────────────────────────▼─────────────────────────────┐
│               Backend (Express + Node.js)               │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Middleware: CORS → Auth → Validation → Error Handler│ │
│ └─────────────────────────────────────────────────────┘ │
│                           │                             │
│        ┌──────────────────┼──────────────────┐          │
│        │                  │                  │          │
│  ┌─────▼─────┐   ┌────────▼────────┐   ┌─────▼─────┐    │
│  │  Routes   │   │   Controllers   │   │ Services  │    │
│  └─────┬─────┘   └────────┬────────┘   └─────┬─────┘    │
│        │                  │                  │          │
│        └──────────────────┴──────────────────┘          │
└───────────────────────────┬─────────────────────────────┘
                            │
       ┌────────────────────┼────────────────────┐
       │                    │                    │
 ┌─────▼─────┐       ┌──────▼──────┐     ┌───────▼───────┐
 │ Supabase  │       │Google Cloud │     │   LLM APIs    │
 │(Postgres) │       │   Storage   │     │(Claude/OpenAI)│
 └───────────┘       └─────────────┘     └───────────────┘
```
### Technology Stack
**Frontend:**
- React 18 + TypeScript
- Vite (build tool)
- Axios (HTTP client)
- Firebase Auth (authentication)
- React Router (routing)

**Backend:**
- Node.js + Express + TypeScript
- Firebase Functions v2 (deployment)
- Supabase (PostgreSQL + Vector DB)
- Google Cloud Storage (file storage)
- Google Document AI (PDF text extraction)
- Puppeteer (PDF generation)

**AI/ML Services:**
- Anthropic Claude (primary LLM)
- OpenAI (fallback LLM)
- OpenRouter (LLM routing)
- OpenAI Embeddings (vector embeddings)
### Core Purpose
Automated processing and analysis of Confidential Information Memorandums (CIMs) using:
1. **Text Extraction**: Google Document AI extracts text from PDFs
2. **Semantic Chunking**: Split text into 4000-char chunks with overlap
3. **Vector Embeddings**: Generate embeddings for semantic search
4. **LLM Analysis**: Claude AI analyzes chunks and generates structured CIMReview data
5. **PDF Generation**: Create summary PDF with analysis results
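
Step 2 above can be sketched as a sliding window. This is an illustrative helper only; the real `createIntelligentChunks()` (Section 5.2) additionally snaps chunk boundaries to paragraphs and sections:

```typescript
// Illustrative sliding-window chunker: fixed-size chunks that each re-read
// the last `overlap` characters of the previous chunk, so no boundary
// content is lost. Shows the window math only, not semantic detection.
function chunkText(text: string, size = 4000, overlap = 200): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance less than `size` to create overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}
```

A 9,000-character document thus yields three chunks (offsets 0, 3800, and 7600), with each boundary region covered twice.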

---
## 2. Application Entry Points
### Backend Entry Point
**File**: `backend/src/index.ts`
```1:22:backend/src/index.ts
// Initialize Firebase Admin SDK first
import './config/firebase';
import express from 'express';
import cors from 'cors';
import helmet from 'helmet';
import morgan from 'morgan';
import rateLimit from 'express-rate-limit';
import { config } from './config/env';
import { logger } from './utils/logger';
import documentRoutes from './routes/documents';
import vectorRoutes from './routes/vector';
import monitoringRoutes from './routes/monitoring';
import auditRoutes from './routes/documentAudit';
import { jobQueueService } from './services/jobQueueService';
import { errorHandler, correlationIdMiddleware } from './middleware/errorHandler';
import { notFoundHandler } from './middleware/notFoundHandler';
// Start the job queue service for background processing
jobQueueService.start();
```
**Key Initialization Steps:**
1. Firebase Admin SDK initialization (`./config/firebase`)
2. Express app setup with middleware chain
3. Route registration (`/documents`, `/vector`, `/monitoring`, `/api/audit`)
4. Job queue service startup (legacy in-memory queue)
5. Firebase Functions export for Cloud deployment

**Scheduled Function**: `processDocumentJobs` (`210:267:backend/src/index.ts`)
- Runs every minute via Firebase Cloud Scheduler
- Processes pending/retrying jobs from database
- Detects and resets stuck jobs
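
The stuck-job check this function relies on can be shown as pure logic. Field names follow the `processing_jobs` table (Section 6) and the 15-minute cutoff matches `JOB_TIMEOUT_MINUTES`; this is a sketch, not the actual `ProcessingJobModel.resetStuckJobs()` query:

```typescript
// A job counts as "stuck" if it has sat in 'processing' longer than the
// timeout (e.g. its worker crashed); such jobs are reset for retry.
interface Job {
  id: string;
  status: 'pending' | 'processing' | 'completed' | 'failed' | 'retrying';
  started_at: Date | null;
}

function findStuckJobs(jobs: Job[], now: Date, timeoutMinutes = 15): Job[] {
  const cutoffMs = timeoutMinutes * 60 * 1000;
  return jobs.filter(
    (job) =>
      job.status === 'processing' &&
      job.started_at !== null &&
      now.getTime() - job.started_at.getTime() > cutoffMs
  );
}
```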
### Frontend Entry Point
**File**: `frontend/src/main.tsx`
```1:10:frontend/src/main.tsx
import React from 'react';
import ReactDOM from 'react-dom/client';
import App from './App';
import './index.css';
ReactDOM.createRoot(document.getElementById('root')!).render(
  <React.StrictMode>
    <App />
  </React.StrictMode>
);
```
**Main App Component**: `frontend/src/App.tsx`
- Sets up React Router
- Provides AuthContext
- Renders protected routes and dashboard

---
## 3. Request Flow & API Architecture
### Request Lifecycle
```
        Client Request
               │
               ▼
┌─────────────────────────────────────┐
│ 1. CORS Middleware                  │
│ - Validates origin                  │
│ - Sets CORS headers                 │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 2. Correlation ID Middleware        │
│ - Generates/reads X-Correlation-ID  │
│ - Adds to request object            │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 3. Firebase Auth Middleware         │
│ - Verifies JWT token                │
│ - Attaches user to req.user         │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 4. Rate Limiting                    │
│ - 1000 requests per 15 minutes      │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 5. Body Parsing                     │
│ - JSON (10MB limit)                 │
│ - URL-encoded (10MB limit)          │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 6. Route Handler                    │
│ - Matches route pattern             │
│ - Calls controller method           │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 7. Controller                       │
│ - Validates input                   │
│ - Calls service methods             │
│ - Returns response                  │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 8. Service Layer                    │
│ - Business logic                    │
│ - Database operations               │
│ - External API calls                │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 9. Error Handler (if error)         │
│ - Categorizes error                 │
│ - Logs with correlation ID          │
│ - Returns structured response       │
└─────────────────────────────────────┘
```
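
Step 2 of the chain can be sketched as a small Express-style middleware. The real implementation lives in `backend/src/middleware/errorHandler.ts` and may differ in detail; the request/response shapes here are minimal stand-ins:

```typescript
import { randomUUID } from 'node:crypto';

// Reuse an incoming X-Correlation-ID header or mint a new UUID, then expose
// the ID on the request (for downstream logging) and echo it to the client.
type Req = { headers: Record<string, string | undefined>; correlationId?: string };
type Res = { setHeader(name: string, value: string): void };

function correlationIdMiddleware(req: Req, res: Res, next: () => void): void {
  const id = req.headers['x-correlation-id'] ?? randomUUID();
  req.correlationId = id;                 // available to later handlers/loggers
  res.setHeader('X-Correlation-ID', id);  // echoed back to the client
  next();
}
```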
### Authentication Flow
**Middleware**: `backend/src/middleware/firebaseAuth.ts`
```27:81:backend/src/middleware/firebaseAuth.ts
export const verifyFirebaseToken = async (
req: FirebaseAuthenticatedRequest,
res: Response,
next: NextFunction
): Promise<void> => {
try {
console.log('🔐 Authentication middleware called for:', req.method, req.url);
console.log('🔐 Request headers:', Object.keys(req.headers));
// Debug Firebase Admin initialization
console.log('🔐 Firebase apps available:', admin.apps.length);
console.log('🔐 Firebase app names:', admin.apps.filter(app => app !== null).map(app => app!.name));
const authHeader = req.headers.authorization;
console.log('🔐 Auth header present:', !!authHeader);
console.log('🔐 Auth header starts with Bearer:', authHeader?.startsWith('Bearer '));
if (!authHeader || !authHeader.startsWith('Bearer ')) {
console.log('❌ No valid authorization header');
res.status(401).json({ error: 'No valid authorization header' });
return;
}
const idToken = authHeader.split('Bearer ')[1];
console.log('🔐 Token extracted, length:', idToken?.length);
if (!idToken) {
console.log('❌ No token provided');
res.status(401).json({ error: 'No token provided' });
return;
}
console.log('🔐 Attempting to verify Firebase ID token...');
console.log('🔐 Token preview:', idToken.substring(0, 20) + '...');
// Verify the Firebase ID token
const decodedToken = await admin.auth().verifyIdToken(idToken, true);
console.log('✅ Token verified successfully for user:', decodedToken.email);
console.log('✅ Token UID:', decodedToken.uid);
console.log('✅ Token issuer:', decodedToken.iss);
// Check if token is expired
const now = Math.floor(Date.now() / 1000);
if (decodedToken.exp && decodedToken.exp < now) {
logger.warn('Token expired for user:', decodedToken.uid);
res.status(401).json({ error: 'Token expired' });
return;
}
req.user = decodedToken;
// Log successful authentication
logger.info('Authenticated request for user:', decodedToken.email);
next();
```
**Frontend Auth**: `frontend/src/services/authService.ts`
- Manages Firebase Auth state
- Provides token via `getToken()`
- Axios interceptor adds token to requests
### Route Structure
**Main Routes** (`backend/src/routes/documents.ts`):
- `POST /documents/upload-url` - Get signed upload URL
- `POST /documents/:id/confirm-upload` - Confirm upload and start processing
- `GET /documents` - List user's documents
- `GET /documents/:id` - Get document details
- `GET /documents/:id/download` - Download processed PDF
- `GET /documents/analytics` - Get processing analytics
- `POST /documents/:id/process-optimized-agentic-rag` - Trigger AI processing

**Middleware Applied**:
```22:29:backend/src/routes/documents.ts
// Apply authentication and correlation ID to all routes
router.use(verifyFirebaseToken);
router.use(addCorrelationId);
// Add logging middleware for document routes
router.use((req, res, next) => {
console.log(`📄 Document route accessed: ${req.method} ${req.path}`);
next();
});
```
---
## 4. Document Processing Pipeline (Critical Path)
### Complete Flow Diagram
```
┌─────────────────────────────────────────────────────────────────────┐
│                    DOCUMENT PROCESSING PIPELINE                     │
└─────────────────────────────────────────────────────────────────────┘
1. UPLOAD PHASE
┌─────────────────────────────────────────────────────────────┐
│ User selects PDF                                            │
│                              ↓                              │
│ DocumentUpload component                                    │
│                              ↓                              │
│ documentService.uploadDocument()                            │
│                              ↓                              │
│ POST /documents/upload-url                                  │
│                              ↓                              │
│ documentController.getUploadUrl()                           │
│                              ↓                              │
│ DocumentModel.create() → documents table                    │
│                              ↓                              │
│ fileStorageService.generateSignedUploadUrl()                │
│                              ↓                              │
│ Direct upload to GCS via signed URL                         │
│                              ↓                              │
│ POST /documents/:id/confirm-upload                          │
└─────────────────────────────────────────────────────────────┘
2. JOB CREATION PHASE
┌─────────────────────────────────────────────────────────────┐
│ documentController.confirmUpload()                          │
│                              ↓                              │
│ ProcessingJobModel.create() → processing_jobs table         │
│                              ↓                              │
│ Status: 'pending'                                           │
│                              ↓                              │
│ Returns 202 Accepted (async processing)                     │
└─────────────────────────────────────────────────────────────┘
3. JOB PROCESSING PHASE (Background)
┌─────────────────────────────────────────────────────────────┐
│ Scheduled Function: processDocumentJobs (every 1 minute)    │
│                             OR                              │
│ Immediate processing via jobProcessorService.processJob()   │
│                              ↓                              │
│ JobProcessorService.processJob()                            │
│                              ↓                              │
│ Download file from GCS                                      │
│                              ↓                              │
│ unifiedDocumentProcessor.processDocument()                  │
└─────────────────────────────────────────────────────────────┘
4. TEXT EXTRACTION PHASE
┌─────────────────────────────────────────────────────────────┐
│ documentAiProcessor.processDocument()                       │
│                              ↓                              │
│ Google Document AI API                                      │
│                              ↓                              │
│ Extracted text returned                                     │
└─────────────────────────────────────────────────────────────┘
5. CHUNKING & EMBEDDING PHASE
┌─────────────────────────────────────────────────────────────┐
│ optimizedAgenticRAGProcessor.processLargeDocument()         │
│                              ↓                              │
│ createIntelligentChunks()                                   │
│ - Semantic boundary detection                               │
│ - 4000-char chunks with 200-char overlap                    │
│                              ↓                              │
│ processChunksInBatches()                                    │
│ - Batch size: 10                                            │
│ - Max concurrent: 5                                         │
│                              ↓                              │
│ storeChunksOptimized()                                      │
│                              ↓                              │
│ vectorDatabaseService.storeEmbedding()                      │
│ - OpenAI embeddings API                                     │
│ - Store in document_chunks table                            │
└─────────────────────────────────────────────────────────────┘
6. LLM ANALYSIS PHASE
┌─────────────────────────────────────────────────────────────┐
│ generateLLMAnalysisHybrid()                                 │
│                              ↓                              │
│ llmService.processCIMDocument()                             │
│                              ↓                              │
│ Vector search for relevant chunks                           │
│                              ↓                              │
│ Claude/OpenAI API call with structured prompt               │
│                              ↓                              │
│ Parse and validate CIMReview JSON                           │
│                              ↓                              │
│ Return structured analysisData                              │
└─────────────────────────────────────────────────────────────┘
7. PDF GENERATION PHASE
┌─────────────────────────────────────────────────────────────┐
│ pdfGenerationService.generatePDF()                          │
│                              ↓                              │
│ Puppeteer browser instance                                  │
│                              ↓                              │
│ Render HTML template with analysisData                      │
│                              ↓                              │
│ Generate PDF buffer                                         │
│                              ↓                              │
│ Upload PDF to GCS                                           │
│                              ↓                              │
│ Update document record with PDF path                        │
└─────────────────────────────────────────────────────────────┘
8. STATUS UPDATE PHASE
┌─────────────────────────────────────────────────────────────┐
│ DocumentModel.updateById()                                  │
│ - status: 'completed'                                       │
│ - pdf_path: GCS path                                        │
│                              ↓                              │
│ ProcessingJobModel.markAsCompleted()                        │
│                              ↓                              │
│ Frontend polls /documents/:id for status updates            │
└─────────────────────────────────────────────────────────────┘
```
### Key Handoff Points
**1. Upload to Job Creation**
```138:202:backend/src/controllers/documentController.ts
async confirmUpload(req: Request, res: Response): Promise<void> {
// ... validation ...
// Update status to processing
await DocumentModel.updateById(documentId, {
status: 'processing_llm'
});
// Acknowledge the request immediately
res.status(202).json({
message: 'Upload confirmed, processing has started.',
document: document,
status: 'processing'
});
// CRITICAL FIX: Use database-backed job queue
const { ProcessingJobModel } = await import('../models/ProcessingJobModel');
await ProcessingJobModel.create({
document_id: documentId,
user_id: userId,
options: {
fileName: document.original_file_name,
mimeType: 'application/pdf'
}
});
}
```
**2. Job Processing to Document Processing**
```109:200:backend/src/services/jobProcessorService.ts
private async processJob(jobId: string): Promise<{ success: boolean; error?: string }> {
// Get job details
job = await ProcessingJobModel.findById(jobId);
// Mark job as processing
await ProcessingJobModel.markAsProcessing(jobId);
// Download file from GCS
const fileBuffer = await fileStorageService.downloadFile(document.file_path);
// Process document
const result = await unifiedDocumentProcessor.processDocument(
job.document_id,
job.user_id,
fileBuffer.toString('utf-8'), // This will be re-read as buffer
{
fileBuffer,
fileName: job.options?.fileName || 'document.pdf',
mimeType: job.options?.mimeType || 'application/pdf'
}
);
}
```
**3. Document Processing to Text Extraction**
```50:80:backend/src/services/documentAiProcessor.ts
async processDocument(
documentId: string,
userId: string,
fileBuffer: Buffer,
fileName: string,
mimeType: string
): Promise {
// Step 1: Extract text using Document AI or fallback
const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);
// Step 2: Process extracted text through Agentic RAG
const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);
}
```
**4. Text to Chunking**
```40:109:backend/src/services/optimizedAgenticRAGProcessor.ts
async processLargeDocument(
documentId: string,
text: string,
options: {
enableSemanticChunking?: boolean;
enableMetadataEnrichment?: boolean;
similarityThreshold?: number;
} = {}
): Promise {
// Step 1: Create intelligent chunks with semantic boundaries
const chunks = await this.createIntelligentChunks(text, documentId, options.enableSemanticChunking);
// Step 2: Process chunks in batches to manage memory
const processedChunks = await this.processChunksInBatches(chunks, documentId, options);
// Step 3: Store chunks with optimized batching
const embeddingApiCalls = await this.storeChunksOptimized(processedChunks, documentId);
// Step 4: Generate LLM analysis using HYBRID approach
const llmResult = await this.generateLLMAnalysisHybrid(documentId, text, processedChunks);
}
```
---
## 5. Core Services Deep Dive
### 5.1 UnifiedDocumentProcessor
**File**: `backend/src/services/unifiedDocumentProcessor.ts`
**Purpose**: Main orchestrator for document processing strategies
**Key Method**:
```123:143:backend/src/services/unifiedDocumentProcessor.ts
async processDocument(
documentId: string,
userId: string,
text: string,
options: any = {}
): Promise {
const strategy = options.strategy || 'document_ai_agentic_rag';
logger.info('Processing document with unified processor', {
documentId,
strategy,
textLength: text.length
});
// Only support document_ai_agentic_rag strategy
if (strategy === 'document_ai_agentic_rag') {
return await this.processWithDocumentAiAgenticRag(documentId, userId, text, options);
} else {
throw new Error(`Unsupported processing strategy: ${strategy}. Only 'document_ai_agentic_rag' is supported.`);
}
}
```
**Dependencies**:
- `documentAiProcessor` - Text extraction
- `optimizedAgenticRAGProcessor` - AI processing
- `llmService` - LLM interactions
- `pdfGenerationService` - PDF generation

**Error Handling**: Wraps errors with detailed context, validates analysisData presence
### 5.2 OptimizedAgenticRAGProcessor
**File**: `backend/src/services/optimizedAgenticRAGProcessor.ts` (1885 lines)
**Purpose**: Core AI processing engine for chunking, embeddings, and LLM analysis
**Key Configuration**:
```32:35:backend/src/services/optimizedAgenticRAGProcessor.ts
private readonly maxChunkSize = 4000; // Optimal chunk size for embeddings
private readonly overlapSize = 200; // Overlap between chunks
private readonly maxConcurrentEmbeddings = 5; // Limit concurrent API calls
private readonly batchSize = 10; // Process chunks in batches
```
**Key Methods**:
- `processLargeDocument()` - Main entry point
- `createIntelligentChunks()` - Semantic chunking with boundary detection
- `processChunksInBatches()` - Batch processing for memory efficiency
- `storeChunksOptimized()` - Embedding generation and storage
- `generateLLMAnalysisHybrid()` - LLM analysis with vector search

**Performance Optimizations**:
- Semantic boundary detection (paragraphs, sections)
- Batch processing to limit memory usage
- Concurrent embedding generation (max 5)
- Vector search with document_id filtering
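
The batching pattern (batch size 10, at most 5 concurrent embedding calls) can be sketched generically. This illustrates the technique only; it is not the actual `processChunksInBatches()` code:

```typescript
// Process items in fixed-size batches, capping concurrency inside each
// batch — mirroring batchSize = 10 and maxConcurrentEmbeddings = 5 above.
async function processInBatches<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  batchSize = 10,
  maxConcurrent = 5
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    for (let j = 0; j < batch.length; j += maxConcurrent) {
      // At most maxConcurrent workers run at once within the batch.
      const slice = batch.slice(j, j + maxConcurrent);
      results.push(...(await Promise.all(slice.map(worker))));
    }
  }
  return results;
}
```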
### 5.3 JobProcessorService
**File**: `backend/src/services/jobProcessorService.ts`
**Purpose**: Database-backed job processor (replaces legacy in-memory queue)
**Key Method**:
```15:97:backend/src/services/jobProcessorService.ts
async processJobs(): Promise<{
processed: number;
succeeded: number;
failed: number;
skipped: number;
}> {
// Prevent concurrent processing runs
if (this.isProcessing) {
logger.info('Job processor already running, skipping this run');
return { processed: 0, succeeded: 0, failed: 0, skipped: 0 };
}
this.isProcessing = true;
const stats = { processed: 0, succeeded: 0, failed: 0, skipped: 0 };
try {
// Reset stuck jobs first
const resetCount = await ProcessingJobModel.resetStuckJobs(this.JOB_TIMEOUT_MINUTES);
// Get pending jobs
const pendingJobs = await ProcessingJobModel.getPendingJobs(this.MAX_CONCURRENT_JOBS);
// Get retrying jobs
const retryingJobs = await ProcessingJobModel.getRetryableJobs(
Math.max(0, this.MAX_CONCURRENT_JOBS - pendingJobs.length)
);
const allJobs = [...pendingJobs, ...retryingJobs];
// Process jobs in parallel (up to MAX_CONCURRENT_JOBS)
const results = await Promise.allSettled(
allJobs.map((job) => this.processJob(job.id))
);
```
**Configuration**:
- `MAX_CONCURRENT_JOBS = 3`
- `JOB_TIMEOUT_MINUTES = 15`

**Features**:
- Stuck job detection and recovery
- Retry logic with exponential backoff
- Parallel processing with concurrency limit
- Database-backed state management
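
The retry delay can be illustrated with the standard exponential-backoff formula; the base (1s) and cap (60s) below are assumed values for the sketch, not constants from the codebase:

```typescript
// Delay doubles with each attempt and is capped, so repeated failures
// back off quickly without growing unbounded. Base/cap are illustrative.
function retryDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1)); // attempt is 1-based
}
```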
### 5.4 VectorDatabaseService
**File**: `backend/src/services/vectorDatabaseService.ts`
**Purpose**: Vector embeddings and similarity search
**Key Method - Vector Search**:
```88:150:backend/src/services/vectorDatabaseService.ts
async searchSimilar(
embedding: number[],
limit: number = 10,
threshold: number = 0.7,
documentId?: string
): Promise {
try {
if (this.provider === 'supabase') {
// Use optimized Supabase vector search function with document_id filtering
// This prevents timeouts by only searching within a specific document
const rpcParams: any = {
query_embedding: embedding,
match_threshold: threshold,
match_count: limit
};
// Add document_id filter if provided (critical for performance)
if (documentId) {
rpcParams.filter_document_id = documentId;
}
// Set a timeout for the RPC call (10 seconds)
const searchPromise = this.supabaseClient
.rpc('match_document_chunks', rpcParams);
const timeoutPromise = new Promise<{ data: null; error: { message: string } }>((_, reject) => {
setTimeout(() => reject(new Error('Vector search timeout after 10s')), 10000);
});
let result: any;
try {
result = await Promise.race([searchPromise, timeoutPromise]);
} catch (timeoutError: any) {
if (timeoutError.message?.includes('timeout')) {
logger.error('Vector search timed out', { documentId, timeout: '10s' });
throw new Error('Vector search timeout after 10s');
}
throw timeoutError;
}
```
**Critical Optimization**: Always pass `documentId` to filter search scope and prevent timeouts
**SQL Function**: `backend/sql/fix_vector_search_timeout.sql`
```10:39:backend/sql/fix_vector_search_timeout.sql
CREATE OR REPLACE FUNCTION match_document_chunks (
query_embedding vector(1536),
match_threshold float,
match_count int,
filter_document_id text DEFAULT NULL
)
RETURNS TABLE (
id UUID,
document_id TEXT,
content text,
metadata JSONB,
chunk_index INT,
similarity float
)
LANGUAGE sql STABLE
AS $$
SELECT
document_chunks.id,
document_chunks.document_id,
document_chunks.content,
document_chunks.metadata,
document_chunks.chunk_index,
1 - (document_chunks.embedding <=> query_embedding) AS similarity
FROM document_chunks
WHERE document_chunks.embedding IS NOT NULL
AND (filter_document_id IS NULL OR document_chunks.document_id = filter_document_id)
AND 1 - (document_chunks.embedding <=> query_embedding) > match_threshold
ORDER BY document_chunks.embedding <=> query_embedding
LIMIT match_count;
$$;
```
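
The `<=>` operator in the function above is pgvector's cosine-distance operator, so `similarity = 1 - distance`. The same quantity, computed client-side for two embedding vectors:

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) divided by
// the product of their magnitudes. Matches 1 - (embedding <=> query) above.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```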
### 5.5 LLMService
**File**: `backend/src/services/llmService.ts`
**Purpose**: LLM interactions (Claude/OpenAI/OpenRouter)
**Provider Selection**:
```43:103:backend/src/services/llmService.ts
constructor() {
// Read provider from config (supports openrouter, anthropic, openai)
this.provider = config.llm.provider;
// CRITICAL: If provider is not set correctly, log and use fallback
if (!this.provider || (this.provider !== 'openrouter' && this.provider !== 'anthropic' && this.provider !== 'openai')) {
logger.error('LLM provider is invalid or not set', {
provider: this.provider,
configProvider: config.llm.provider,
processEnvProvider: process.env['LLM_PROVIDER'],
defaultingTo: 'anthropic'
});
this.provider = 'anthropic'; // Fallback
}
// Set API key based on provider
if (this.provider === 'openai') {
this.apiKey = config.llm.openaiApiKey!;
} else if (this.provider === 'openrouter') {
// OpenRouter: Use OpenRouter key if provided, otherwise use Anthropic key for BYOK
this.apiKey = config.llm.openrouterApiKey || config.llm.anthropicApiKey!;
} else {
this.apiKey = config.llm.anthropicApiKey!;
}
// Use configured model instead of hardcoded value
this.defaultModel = config.llm.model;
this.maxTokens = config.llm.maxTokens;
this.temperature = config.llm.temperature;
}
```
**Key Method**:
```108:148:backend/src/services/llmService.ts
async processCIMDocument(text: string, template: string, analysis?: Record): Promise {
// Check and truncate text if it exceeds maxInputTokens
const maxInputTokens = config.llm.maxInputTokens || 200000;
const systemPromptTokens = this.estimateTokenCount(this.getCIMSystemPrompt());
const templateTokens = this.estimateTokenCount(template);
const promptBuffer = config.llm.promptBuffer || 1000;
// Calculate available tokens for document text
const reservedTokens = systemPromptTokens + templateTokens + promptBuffer + (config.llm.maxTokens || 16000);
const availableTokens = maxInputTokens - reservedTokens;
const textTokens = this.estimateTokenCount(text);
let processedText = text;
let wasTruncated = false;
if (textTokens > availableTokens) {
logger.warn('Document text exceeds token limit, truncating', {
textTokens,
availableTokens,
maxInputTokens,
reservedTokens,
truncationRatio: (availableTokens / textTokens * 100).toFixed(1) + '%'
});
processedText = this.truncateText(text, availableTokens);
wasTruncated = true;
}
```
**Features**:
- Automatic token counting and truncation
- Model selection based on task complexity
- JSON schema validation with Zod
- Retry logic with exponential backoff
- Cost tracking
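
Token estimation is typically a character-based heuristic. The ~4 characters/token ratio below is an assumption about `estimateTokenCount()`, not its actual implementation; the truncation mirrors the flow shown above:

```typescript
// Heuristic: roughly 4 characters per token for English text (a common
// rule of thumb). The real estimateTokenCount() may use another method.
function estimateTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// Cut the text down so its estimated token count fits the budget.
function truncateToTokens(text: string, maxTokens: number): string {
  return estimateTokenCount(text) <= maxTokens
    ? text
    : text.slice(0, maxTokens * 4);
}
```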
### 5.6 DocumentAiProcessor
**File**: `backend/src/services/documentAiProcessor.ts`
**Purpose**: Google Document AI integration for text extraction
**Key Method**:
```50:146:backend/src/services/documentAiProcessor.ts
async processDocument(
documentId: string,
userId: string,
fileBuffer: Buffer,
fileName: string,
mimeType: string
): Promise {
const startTime = Date.now();
try {
logger.info('Starting Document AI + Agentic RAG processing', {
documentId,
userId,
fileName,
fileSize: fileBuffer.length,
mimeType
});
// Step 1: Extract text using Document AI or fallback
const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);
if (!extractedText) {
throw new Error('Failed to extract text from document');
}
logger.info('Text extraction completed', {
textLength: extractedText.length
});
// Step 2: Process extracted text through Agentic RAG
const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);
const processingTime = Date.now() - startTime;
return {
success: true,
content: agenticRagResult.summary || extractedText,
metadata: {
processingStrategy: 'document_ai_agentic_rag',
processingTime,
extractedTextLength: extractedText.length,
agenticRagResult,
fileSize: fileBuffer.length,
fileName,
mimeType
}
};
```
```

**Fallback Strategy**: Uses `pdf-parse` if Document AI fails
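
The primary-then-fallback pattern can be sketched as a generic helper; this illustrates the control flow only and is not the service's actual code:

```typescript
// Try the primary extractor (Document AI); on any failure, report the
// error and run the fallback (pdf-parse) instead.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  onError: (err: unknown) => void = () => {}
): Promise<T> {
  try {
    return await primary();
  } catch (err) {
    onError(err); // e.g. log "Document AI failed, falling back to pdf-parse"
    return fallback();
  }
}
```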
### 5.7 PDFGenerationService
**File**: `backend/src/services/pdfGenerationService.ts`
**Purpose**: PDF generation using Puppeteer
**Key Features**:
- Page pooling for performance
- Caching for repeated requests
- Browser instance reuse
- Fallback to PDFKit if Puppeteer fails

**Configuration**:
```65:85:backend/src/services/pdfGenerationService.ts
class PDFGenerationService {
private browser: any = null;
private pagePool: PagePool[] = [];
private readonly maxPoolSize = 5;
private readonly pageTimeout = 30000; // 30 seconds
private readonly cache = new Map();
private readonly cacheTimeout = 300000; // 5 minutes
private readonly defaultOptions: PDFGenerationOptions = {
format: 'A4',
margin: {
top: '1in',
right: '1in',
bottom: '1in',
left: '1in',
},
displayHeaderFooter: true,
printBackground: true,
quality: 'high',
timeout: 30000,
};
```
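
The 5-minute result cache can be sketched as a small TTL map with an injectable clock (so expiry is testable); this illustrates the pattern, not the service's actual cache:

```typescript
// TTL cache mirroring cacheTimeout = 300000 (5 minutes): entries older
// than ttlMs are evicted lazily on read. `now` is injectable for testing.
class TTLCache<V> {
  private entries = new Map<string, { value: V; storedAt: number }>();

  constructor(
    private ttlMs = 300_000,
    private now: () => number = Date.now
  ) {}

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, storedAt: this.now() });
  }
}
```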
### 5.8 FileStorageService
**File**: `backend/src/services/fileStorageService.ts`
**Purpose**: Google Cloud Storage operations
**Key Methods**:
- `generateSignedUploadUrl()` - Generate signed URL for direct upload
- `downloadFile()` - Download file from GCS
- `saveBuffer()` - Save buffer to GCS
- `deleteFile()` - Delete file from GCS

**Credential Handling**:
```40:145:backend/src/services/fileStorageService.ts
constructor() {
this.bucketName = config.googleCloud.gcsBucketName;
// Check if we're in Firebase Functions/Cloud Run environment
const isCloudEnvironment = process.env.FUNCTION_TARGET ||
process.env.FUNCTION_NAME ||
process.env.K_SERVICE ||
process.env.GOOGLE_CLOUD_PROJECT ||
!!process.env.GCLOUD_PROJECT ||
process.env.X_GOOGLE_GCLOUD_PROJECT;
// Initialize Google Cloud Storage
const storageConfig: any = {
projectId: config.googleCloud.projectId,
};
// Only use keyFilename in local development
// In Firebase Functions/Cloud Run, use Application Default Credentials
if (isCloudEnvironment) {
// In cloud, ALWAYS clear GOOGLE_APPLICATION_CREDENTIALS to force use of ADC
// Firebase Functions automatically provides credentials via metadata service
// These credentials have signing capabilities for generating signed URLs
const originalCreds = process.env.GOOGLE_APPLICATION_CREDENTIALS;
if (originalCreds) {
delete process.env.GOOGLE_APPLICATION_CREDENTIALS;
logger.info('Using Application Default Credentials for GCS (cloud environment)', {
clearedEnvVar: 'GOOGLE_APPLICATION_CREDENTIALS',
originalValue: originalCreds,
projectId: config.googleCloud.projectId
});
}
```
---
## 6. Data Models & Database Schema
### Core Models
**DocumentModel** (`backend/src/models/DocumentModel.ts`):
- `create()` - Create document record
- `findById()` - Get document by ID
- `updateById()` - Update document status/metadata
- `findByUserId()` - List user's documents

**ProcessingJobModel** (`backend/src/models/ProcessingJobModel.ts`):
- `create()` - Create processing job (uses direct PostgreSQL to bypass PostgREST cache)
- `findById()` - Get job by ID
- `getPendingJobs()` - Get pending jobs (limit by concurrency)
- `getRetryableJobs()` - Get jobs ready for retry
- `markAsProcessing()` - Update job status
- `markAsCompleted()` - Mark job complete
- `markAsFailed()` - Mark job failed with error
- `resetStuckJobs()` - Reset jobs stuck in processing
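
The lifecycle these methods imply can be written as a transition table. The allowed transitions below are inferred from the method list, not taken from the code:

```typescript
// Sketch of the job state machine implied by ProcessingJobModel's methods:
// pending → processing → completed | failed | retrying, and retrying →
// processing when the job is picked up again. Inferred, not from the code.
type JobStatus = 'pending' | 'processing' | 'completed' | 'failed' | 'retrying';

const allowedTransitions: Record<JobStatus, JobStatus[]> = {
  pending: ['processing'],
  processing: ['completed', 'failed', 'retrying'],
  retrying: ['processing'],
  completed: [],
  failed: [],
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return allowedTransitions[from].includes(to);
}
```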

**VectorDatabaseModel** (`backend/src/models/VectorDatabaseModel.ts`):
- Chunk storage and retrieval
- Embedding management
### Database Tables
**documents**:
- `id` (UUID, primary key)
- `user_id` (UUID, foreign key)
- `original_file_name` (text)
- `file_path` (text, GCS path)
- `file_size` (bigint)
- `status` (text: 'uploading', 'uploaded', 'processing_llm', 'completed', 'failed')
- `pdf_path` (text, GCS path for generated PDF)
- `created_at`, `updated_at` (timestamps)

**processing_jobs**:
- `id` (UUID, primary key)
- `document_id` (UUID, foreign key)
- `user_id` (UUID, foreign key)
- `status` (text: 'pending', 'processing', 'completed', 'failed', 'retrying')
- `attempts` (int)
- `max_attempts` (int, default 3)
- `options` (JSONB, processing options)
- `error` (text, error message if failed)
- `result` (JSONB, processing result)
- `created_at`, `started_at`, `completed_at`, `updated_at` (timestamps)

**document_chunks**:
- `id` (UUID, primary key)
- `document_id` (text, foreign key)
- `content` (text)
- `embedding` (vector(1536))
- `metadata` (JSONB)
- `chunk_index` (int)
- `created_at`, `updated_at` (timestamps)

**agentic_rag_sessions**:
- `id` (UUID, primary key)
- `document_id` (UUID, foreign key)
- `user_id` (UUID, foreign key)
- `status` (text)
- `metadata` (JSONB)
- `created_at`, `updated_at` (timestamps)
### Vector Search Optimization
**Critical SQL Function**: `match_document_chunks` with `document_id` filtering
The full definition appears in Section 5.4 above (`backend/sql/fix_vector_search_timeout.sql`).
**Always pass `filter_document_id`** to prevent timeouts when searching across all documents.

---
## 7. Component Handoffs & Integration Points
### Frontend ↔ Backend
**Axios Interceptor** (`frontend/src/services/documentService.ts`):
```8:54:frontend/src/services/documentService.ts
export const apiClient = axios.create({
baseURL: API_BASE_URL,
timeout: 300000, // 5 minutes
});
// Add auth token to requests
apiClient.interceptors.request.use(async (config) => {
const token = await authService.getToken();
if (token) {
config.headers.Authorization = `Bearer ${token}`;
}
return config;
});
// Handle auth errors with retry logic
apiClient.interceptors.response.use(
(response) => response,
async (error) => {
const originalRequest = error.config;
if (error.response?.status === 401 && !originalRequest._retry) {
originalRequest._retry = true;
try {
// Attempt to refresh the token
const newToken = await authService.getToken();
if (newToken) {
// Retry the original request with the new token
originalRequest.headers.Authorization = `Bearer ${newToken}`;
return apiClient(originalRequest);
}
} catch (refreshError) {
console.error('Token refresh failed:', refreshError);
}
// If token refresh fails, logout the user
authService.logout();
window.location.href = '/login';
}
return Promise.reject(error);
}
);
```
### Backend ↔ Database
**Two Connection Methods**:
1. **Supabase Client** (default for most operations):
```typescript
import { getSupabaseServiceClient } from '../config/supabase';
const supabase = getSupabaseServiceClient();
```
2. **Direct PostgreSQL** (for critical operations, bypasses PostgREST cache):
```47:81:backend/src/models/ProcessingJobModel.ts
static async create(data: CreateProcessingJobData): Promise<ProcessingJob> {
try {
// Use direct PostgreSQL connection to bypass PostgREST cache
// This is critical because PostgREST cache issues can block entire processing pipeline
const pool = getPostgresPool();
const result = await pool.query(
`INSERT INTO processing_jobs (
document_id, user_id, status, attempts, max_attempts, options, created_at
) VALUES ($1, $2, $3, $4, $5, $6, $7)
RETURNING *`,
[
data.document_id,
data.user_id,
'pending',
0,
data.max_attempts || 3,
JSON.stringify(data.options || {}),
new Date().toISOString()
]
);
if (result.rows.length === 0) {
throw new Error('Failed to create processing job: No data returned');
}
const job = result.rows[0];
logger.info('Processing job created via direct PostgreSQL', {
jobId: job.id,
documentId: data.document_id,
userId: data.user_id,
});
return job;
```
### Backend ↔ GCS
**Signed URL Generation**:
```typescript
const uploadUrl = await fileStorageService.generateSignedUploadUrl(filePath, contentType);
```
**Direct Upload** (frontend):
```403:410:frontend/src/services/documentService.ts
const fetchPromise = fetch(uploadUrl, {
method: 'PUT',
headers: {
'Content-Type': contentType, // Must match exactly what was used in signed URL generation
},
body: file,
signal: signal,
});
```
**File Download** (for processing):
```typescript
const fileBuffer = await fileStorageService.downloadFile(document.file_path);
```
### Backend ↔ Document AI
**Text Extraction**:
```148:249:backend/src/services/documentAiProcessor.ts
private async extractTextFromDocument(fileBuffer: Buffer, fileName: string, mimeType: string): Promise<string> {
try {
// Check document size first
// ... size validation ...
// Upload to GCS for Document AI processing
const gcsFileName = `temp/${Date.now()}_${fileName}`;
await this.storageClient.bucket(this.gcsBucketName).file(gcsFileName).save(fileBuffer);
// Process with Document AI
const request = {
name: this.processorName,
gcsDocument: {
gcsUri: `gs://${this.gcsBucketName}/${gcsFileName}`,
mimeType: mimeType
}
};
const [result] = await this.documentAiClient.processDocument(request);
// Extract text from result
const text = result.document?.text || '';
// Clean up temp file
await this.storageClient.bucket(this.gcsBucketName).file(gcsFileName).delete();
return text;
} catch (error) {
// Fallback to pdf-parse
logger.warn('Document AI failed, using pdf-parse fallback', { error });
const data = await pdf(fileBuffer);
return data.text;
}
}
```
### Backend ↔ LLM APIs
**Provider Selection** (Claude/OpenAI/OpenRouter):
- Configured via `LLM_PROVIDER` environment variable
- Automatic API key selection based on provider
- Model selection based on task complexity
**Request Flow**:
```typescript
// 1. Token counting and truncation
const processedText = this.truncateText(text, availableTokens);
// 2. Model selection
const model = this.selectModel(taskComplexity);
// 3. API call with retry logic
const response = await this.callLLMAPI({
prompt: processedText,
systemPrompt: systemPrompt,
model: model,
maxTokens: this.maxTokens,
temperature: this.temperature
});
// 4. JSON parsing and validation
const parsed = JSON.parse(response.content);
const validated = cimReviewSchema.parse(parsed);
```
### Services ↔ Services
**Event-Driven Patterns**:
- `jobQueueService` emits events: `job:added`, `job:started`, `job:completed`, `job:failed`
- `uploadMonitoringService` tracks upload events
**Direct Method Calls**:
- Most service interactions are direct method calls
- Services are exported as singletons for easy access
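The emitter pattern can be sketched in a few lines (event names from the list above; the class and payload shape here are illustrative, not the actual `jobQueueService`):

```typescript
import { EventEmitter } from 'events';

// Minimal sketch of the event-driven pattern: a queue service emits
// lifecycle events that other services subscribe to.
type JobEvent = 'job:added' | 'job:started' | 'job:completed' | 'job:failed';

class JobQueueEvents extends EventEmitter {
  emitJob(event: JobEvent, jobId: string): void {
    // Attach a timestamp so subscribers can log/measure without re-querying.
    this.emit(event, { jobId, at: new Date().toISOString() });
  }
}

const jobEvents = new JobQueueEvents();
jobEvents.on('job:completed', ({ jobId }) => {
  console.log(`job ${jobId} completed`); // prints "job abc-123 completed"
});
jobEvents.emitJob('job:completed', 'abc-123');
```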
---
## 8. Error Handling & Resilience
### Error Propagation Path
```
Service Method
│
▼ (throws error)
Controller
│
▼ (catches, logs, re-throws)
Express Error Handler
│
▼ (categorizes, logs, responds)
Client (structured error response)
```
### Error Categories
**File**: `backend/src/middleware/errorHandler.ts`
```17:26:backend/src/middleware/errorHandler.ts
export enum ErrorCategory {
VALIDATION = 'validation',
AUTHENTICATION = 'authentication',
AUTHORIZATION = 'authorization',
NOT_FOUND = 'not_found',
EXTERNAL_SERVICE = 'external_service',
PROCESSING = 'processing',
SYSTEM = 'system',
DATABASE = 'database'
}
```
**Error Response Structure**:
```29:39:backend/src/middleware/errorHandler.ts
export interface ErrorResponse {
success: false;
error: {
code: string;
message: string;
details?: any;
correlationId: string;
timestamp: string;
retryable: boolean;
};
}
```
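A sketch of a handler assembling this shape (the interface is reproduced from above; the mapping logic is illustrative, not the actual `errorHandler.ts`):

```typescript
interface ErrorResponse {
  success: false;
  error: {
    code: string;
    message: string;
    details?: any;
    correlationId: string;
    timestamp: string;
    retryable: boolean;
  };
}

// Illustrative mapping from a thrown error to the structured response.
// The real errorHandler.ts also categorizes errors (validation, database, ...).
function toErrorResponse(
  err: Error,
  correlationId: string,
  retryable = false
): ErrorResponse {
  return {
    success: false,
    error: {
      code: err.name || 'SYSTEM_ERROR',
      message: err.message,
      correlationId,
      timestamp: new Date().toISOString(),
      retryable,
    },
  };
}
```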
### Retry Mechanisms
**1. Job Retries**:
- Max attempts: 3 (configurable per job)
- Exponential backoff between retries
- Failed jobs are marked with the `retrying` status between attempts
**2. API Retries**:
- LLM API calls: 3 retries with exponential backoff
- Document AI: Fallback to pdf-parse
- Vector search: 10-second timeout, fallback to direct query
**3. Database Retries**:
```10:46:backend/src/models/DocumentModel.ts
private static async retryOperation<T>(
operation: () => Promise<T>,
operationName: string,
maxRetries: number = 3,
baseDelay: number = 1000
): Promise<T> {
let lastError: any;
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
return await operation();
} catch (error: any) {
lastError = error;
const isNetworkError = error?.message?.includes('fetch failed') ||
error?.message?.includes('ENOTFOUND') ||
error?.message?.includes('ECONNREFUSED') ||
error?.message?.includes('ETIMEDOUT') ||
error?.name === 'TypeError';
if (!isNetworkError || attempt === maxRetries) {
throw error;
}
const delay = baseDelay * Math.pow(2, attempt - 1);
logger.warn(`${operationName} failed (attempt ${attempt}/${maxRetries}), retrying in ${delay}ms`, {
error: error?.message || String(error),
code: error?.code,
attempt,
maxRetries
});
await new Promise(resolve => setTimeout(resolve, delay));
}
}
throw lastError;
}
```
### Timeout Handling
**Vector Search Timeout**:
```109:126:backend/src/services/vectorDatabaseService.ts
// Set a timeout for the RPC call (10 seconds)
const searchPromise = this.supabaseClient
.rpc('match_document_chunks', rpcParams);
const timeoutPromise = new Promise<{ data: null; error: { message: string } }>((_, reject) => {
setTimeout(() => reject(new Error('Vector search timeout after 10s')), 10000);
});
let result: any;
try {
result = await Promise.race([searchPromise, timeoutPromise]);
} catch (timeoutError: any) {
if (timeoutError.message?.includes('timeout')) {
logger.error('Vector search timed out', { documentId, timeout: '10s' });
throw new Error('Vector search timeout after 10s');
}
throw timeoutError;
}
```
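The `Promise.race` pattern above generalizes to a small reusable helper; a sketch (the codebase inlines the pattern as shown rather than using a helper like this):

```typescript
// Wrap any promise with a timeout, mirroring the Promise.race pattern
// used for vector search above. The timer is cleared afterwards so it
// does not keep the process alive.
function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  label = 'operation'
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timeout after ${ms}ms`)),
      ms
    );
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```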
**LLM API Timeout**: Handled by axios timeout configuration
**Job Timeout**: 15 minutes; jobs stuck longer than this are reset
### Stuck Job Detection and Recovery
```34:37:backend/src/services/jobProcessorService.ts
// Reset stuck jobs first
const resetCount = await ProcessingJobModel.resetStuckJobs(this.JOB_TIMEOUT_MINUTES);
if (resetCount > 0) {
logger.info('Reset stuck jobs', { count: resetCount });
}
```
**Scheduled Function Monitoring**:
```228:246:backend/src/index.ts
// Check for jobs stuck in processing status
const stuckProcessingJobs = await ProcessingJobModel.getStuckJobs(15); // Jobs stuck > 15 minutes
if (stuckProcessingJobs.length > 0) {
logger.warn('Found stuck processing jobs', {
count: stuckProcessingJobs.length,
jobIds: stuckProcessingJobs.map(j => j.id),
timestamp: new Date().toISOString(),
});
}
// Check for jobs stuck in pending status (alert if > 2 minutes)
const stuckPendingJobs = await ProcessingJobModel.getStuckPendingJobs(2); // Jobs pending > 2 minutes
if (stuckPendingJobs.length > 0) {
logger.warn('Found stuck pending jobs (may indicate processing issues)', {
count: stuckPendingJobs.length,
jobIds: stuckPendingJobs.map(j => j.id),
oldestJobAge: stuckPendingJobs[0] ? Math.round((Date.now() - new Date(stuckPendingJobs[0].created_at).getTime()) / 1000 / 60) : 0,
timestamp: new Date().toISOString(),
});
}
```
### Graceful Degradation
**Document AI Failure**: Falls back to `pdf-parse` library
**Vector Search Failure**: Falls back to direct database query without similarity calculation
**LLM API Failure**: Returns error with retryable flag, job can be retried
**PDF Generation Failure**: Falls back to PDFKit if Puppeteer fails
---
## 9. Performance Optimization Points
### Vector Search Optimization
**Critical**: Always pass `document_id` filter to prevent timeouts
```104:107:backend/src/services/vectorDatabaseService.ts
// Add document_id filter if provided (critical for performance)
if (documentId) {
rpcParams.filter_document_id = documentId;
}
```
**SQL Function Optimization**: `match_document_chunks` filters by `document_id` before computing vector similarity
### Chunking Strategy
**Optimal Configuration**:
```32:35:backend/src/services/optimizedAgenticRAGProcessor.ts
private readonly maxChunkSize = 4000; // Optimal chunk size for embeddings
private readonly overlapSize = 200; // Overlap between chunks
private readonly maxConcurrentEmbeddings = 5; // Limit concurrent API calls
private readonly batchSize = 10; // Process chunks in batches
```
**Semantic Chunking**: Detects paragraph and section boundaries for better chunk quality
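The windowing itself can be sketched as follows (fixed-size windows only; the real processor additionally snaps chunk boundaries to paragraphs and sections):

```typescript
// Simplified chunking: fixed-size windows with overlap, using the
// maxChunkSize/overlapSize defaults from the configuration above.
function chunkText(
  text: string,
  maxChunkSize = 4000,
  overlapSize = 200
): string[] {
  const chunks: string[] = [];
  const step = maxChunkSize - overlapSize; // each window starts this far in
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxChunkSize));
    if (start + maxChunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

Consecutive chunks share `overlapSize` characters, so a sentence split by a window boundary still appears whole in one of the two chunks.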
### Batch Processing
**Embedding Generation**:
- Processes chunks in batches of 10
- Max 5 concurrent embedding API calls
- Prevents memory overflow and API rate limiting
**Chunk Storage**:
- Batched database inserts
- Reduces database round trips
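The batch-plus-concurrency shape can be sketched like this (batch size and concurrency limit mirror the constants above; the worker function is an assumption standing in for the embedding API call):

```typescript
// Split items into batches of `size` (batchSize = 10 above).
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run an async worker over items with at most `limit` in flight
// (maxConcurrentEmbeddings = 5 above). Workers pull from a shared
// cursor, which is safe in single-threaded JS.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, run)
  );
  return results;
}
```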
### Memory Management
**Chunk Processing**:
- Processes chunks in batches to limit memory usage
- Cleans up processed chunks from memory after storage
**PDF Generation**:
- Page pooling (max 5 pages)
- Page timeout (30 seconds)
- Cache with 5-minute TTL
### Database Optimization
**Direct PostgreSQL for Critical Operations**:
- Job creation uses direct PostgreSQL to bypass PostgREST cache issues
- Ensures reliable job creation even when PostgREST schema cache is stale
**Connection Pooling**:
- Supabase client uses connection pooling
- Direct PostgreSQL uses pg pool
### API Call Optimization
**LLM Token Management**:
- Automatic token counting
- Text truncation if exceeds limits
- Model selection based on complexity (smaller models for simpler tasks)
**Embedding Caching**:
```31:32:backend/src/services/vectorDatabaseService.ts
private semanticCache: Map<string, any> = new Map();
private readonly CACHE_TTL = 3600000; // 1 hour cache TTL
```
---
## 10. Background Processing Architecture
### Legacy vs Current System
**Legacy: In-Memory Queue** (`jobQueueService`)
- EventEmitter-based
- In-memory job storage
- Still initialized but being phased out
- Location: `backend/src/services/jobQueueService.ts`
**Current: Database-Backed Queue** (`jobProcessorService`)
- Database-backed job storage
- Scheduled processing via Firebase Cloud Scheduler
- Location: `backend/src/services/jobProcessorService.ts`
### Job Processing Flow
```
Job Creation
│
▼
ProcessingJobModel.create()
│
▼
Status: 'pending' in database
│
▼
Scheduled Function (every 1 minute)
OR
Immediate processing via API
│
▼
JobProcessorService.processJobs()
│
▼
Get pending/retrying jobs (max 3 concurrent)
│
▼
Process jobs in parallel
│
▼
For each job:
- Mark as 'processing'
- Download file from GCS
- Call unifiedDocumentProcessor
- Update document status
- Mark job as 'completed' or 'failed'
```
### Scheduled Function
**File**: `backend/src/index.ts`
```210:267:backend/src/index.ts
export const processDocumentJobs = onSchedule({
schedule: 'every 1 minutes', // Minimum interval for Firebase Cloud Scheduler
timeoutSeconds: 900, // 15 minutes (max for Gen2 scheduled functions)
memory: '1GiB',
retryCount: 2, // Retry up to 2 times on failure
}, async (event) => {
logger.info('Processing document jobs scheduled function triggered', {
timestamp: new Date().toISOString(),
scheduleTime: event.scheduleTime,
});
try {
const { jobProcessorService } = await import('./services/jobProcessorService');
// Check for stuck jobs before processing (monitoring)
const { ProcessingJobModel } = await import('./models/ProcessingJobModel');
// Check for jobs stuck in processing status
const stuckProcessingJobs = await ProcessingJobModel.getStuckJobs(15); // Jobs stuck > 15 minutes
if (stuckProcessingJobs.length > 0) {
logger.warn('Found stuck processing jobs', {
count: stuckProcessingJobs.length,
jobIds: stuckProcessingJobs.map(j => j.id),
timestamp: new Date().toISOString(),
});
}
// Check for jobs stuck in pending status (alert if > 2 minutes)
const stuckPendingJobs = await ProcessingJobModel.getStuckPendingJobs(2); // Jobs pending > 2 minutes
if (stuckPendingJobs.length > 0) {
logger.warn('Found stuck pending jobs (may indicate processing issues)', {
count: stuckPendingJobs.length,
jobIds: stuckPendingJobs.map(j => j.id),
oldestJobAge: stuckPendingJobs[0] ? Math.round((Date.now() - new Date(stuckPendingJobs[0].created_at).getTime()) / 1000 / 60) : 0,
timestamp: new Date().toISOString(),
});
}
const result = await jobProcessorService.processJobs();
logger.info('Document jobs processing completed', {
...result,
timestamp: new Date().toISOString(),
});
} catch (error) {
const errorMessage = error instanceof Error ? error.message : String(error);
const errorStack = error instanceof Error ? error.stack : undefined;
logger.error('Error processing document jobs', {
error: errorMessage,
stack: errorStack,
timestamp: new Date().toISOString(),
});
// Re-throw to trigger retry mechanism (up to retryCount times)
throw error;
}
});
```
### Job States
```
pending → processing → completed
│ │
│ ▼
│ failed
│ │
└──────────────────────┘
│
▼
retrying
│
▼
(back to pending)
```
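The diagram implies a small transition table, sketched here (useful for validating status updates; the actual model does not necessarily enforce it):

```typescript
type JobStatus = 'pending' | 'processing' | 'completed' | 'failed' | 'retrying';

// Legal transitions implied by the state diagram above.
const transitions: Record<JobStatus, JobStatus[]> = {
  pending: ['processing'],
  processing: ['completed', 'failed'],
  failed: ['retrying'],
  retrying: ['pending'],   // retry loops back to pending
  completed: [],           // terminal state
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return transitions[from].includes(to);
}
```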
### Concurrency Control
**Max Concurrent Jobs**: 3
```9:10:backend/src/services/jobProcessorService.ts
private readonly MAX_CONCURRENT_JOBS = 3;
private readonly JOB_TIMEOUT_MINUTES = 15;
```
**Processing Logic**:
```40:63:backend/src/services/jobProcessorService.ts
// Get pending jobs
const pendingJobs = await ProcessingJobModel.getPendingJobs(this.MAX_CONCURRENT_JOBS);
// Get retrying jobs (enabled - schema is updated)
const retryingJobs = await ProcessingJobModel.getRetryableJobs(
Math.max(0, this.MAX_CONCURRENT_JOBS - pendingJobs.length)
);
const allJobs = [...pendingJobs, ...retryingJobs];
if (allJobs.length === 0) {
logger.debug('No jobs to process');
return stats;
}
logger.info('Processing jobs', {
totalJobs: allJobs.length,
pendingJobs: pendingJobs.length,
retryingJobs: retryingJobs.length,
});
// Process jobs in parallel (up to MAX_CONCURRENT_JOBS)
const results = await Promise.allSettled(
allJobs.map((job) => this.processJob(job.id))
);
```
---
## 11. Frontend Architecture
### Component Structure
**Main Components**:
- `DocumentUpload` - File upload with drag-and-drop
- `DocumentList` - List of user's documents with status
- `DocumentViewer` - View processed document and PDF
- `Analytics` - Processing statistics dashboard
- `UploadMonitoringDashboard` - Real-time upload monitoring
### State Management
**AuthContext** (`frontend/src/contexts/AuthContext.tsx`):
```11:46:frontend/src/contexts/AuthContext.tsx
export const AuthProvider: React.FC<{ children: React.ReactNode }> = ({ children }) => {
const [user, setUser] = useState<User | null>(null);
const [token, setToken] = useState<string | null>(null);
const [isLoading, setIsLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
const [isInitialized, setIsInitialized] = useState(false);
useEffect(() => {
setIsLoading(true);
// Listen for Firebase auth state changes
const unsubscribe = authService.onAuthStateChanged(async (firebaseUser) => {
try {
if (firebaseUser) {
const user = authService.getCurrentUser();
const token = await authService.getToken();
setUser(user);
setToken(token);
} else {
setUser(null);
setToken(null);
}
} catch (error) {
console.error('Auth state change error:', error);
setError('Authentication error occurred');
setUser(null);
setToken(null);
} finally {
setIsLoading(false);
setIsInitialized(true);
}
});
// Cleanup subscription on unmount
return () => unsubscribe();
}, []);
```
### API Communication
**Document Service** (`frontend/src/services/documentService.ts`):
- Axios client with auth interceptor
- Automatic token refresh on 401 errors
- Progress tracking for uploads
- Error handling with user-friendly messages
**Upload Flow**:
```224:361:frontend/src/services/documentService.ts
async uploadDocument(
file: File,
onProgress?: (progress: number) => void,
signal?: AbortSignal
): Promise<Document> {
try {
// Check authentication before upload
const token = await authService.getToken();
if (!token) {
throw new Error('Authentication required. Please log in to upload documents.');
}
// Step 1: Get signed upload URL
onProgress?.(5); // 5% - Getting upload URL
const uploadUrlResponse = await apiClient.post('/documents/upload-url', {
fileName: file.name,
fileSize: file.size,
contentType: contentTypeForSigning
}, { signal });
const { documentId, uploadUrl } = uploadUrlResponse.data;
// Step 2: Upload directly to Firebase Storage
onProgress?.(10); // 10% - Starting direct upload
await this.uploadToFirebaseStorage(
file,
uploadUrl,
contentTypeForSigning,
(uploadProgress) => {
// Map upload progress (10-90%)
const mappedProgress = 10 + (uploadProgress * 0.8);
onProgress?.(mappedProgress);
},
signal
);
// Step 3: Confirm upload
onProgress?.(90); // 90% - Confirming upload
const confirmResponse = await apiClient.post(
`/documents/${documentId}/confirm-upload`,
{},
{ signal }
);
onProgress?.(100); // 100% - Complete
return confirmResponse.data.document;
} catch (error) {
// ... error handling ...
}
}
```
### Real-Time Updates
**Polling for Processing Status**:
- Frontend polls `/documents/:id` endpoint
- Updates UI when status changes from 'processing' to 'completed'
- Shows error messages if status is 'failed'
**Upload Progress**:
- Real-time progress tracking via `onProgress` callback
- Visual progress bar in `DocumentUpload` component
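The polling loop can be sketched as follows (the fetcher, interval, and attempt cap are assumptions; the real component drives this through `documentService` and React state):

```typescript
// Poll a status fetcher until the document leaves 'processing'.
// getStatus, intervalMs, and maxAttempts are illustrative parameters.
async function pollUntilDone(
  getStatus: () => Promise<string>,
  intervalMs = 2000,
  maxAttempts = 150
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === 'completed' || status === 'failed') return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Polling timed out while document was still processing');
}
```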
---
## 12. Configuration & Environment
### Environment Variables
**File**: `backend/src/config/env.ts`
**Key Configuration Categories**:
1. **LLM Provider**:
- `LLM_PROVIDER` - 'anthropic', 'openai', or 'openrouter'
- `ANTHROPIC_API_KEY` - Claude API key
- `OPENAI_API_KEY` - OpenAI API key
- `OPENROUTER_API_KEY` - OpenRouter API key
- `LLM_MODEL` - Model name (e.g., 'claude-sonnet-4-5-20250929')
- `LLM_MAX_TOKENS` - Max output tokens
- `LLM_MAX_INPUT_TOKENS` - Max input tokens (default 200000)
2. **Database**:
- `SUPABASE_URL` - Supabase project URL
- `SUPABASE_SERVICE_KEY` - Service role key
- `SUPABASE_ANON_KEY` - Anonymous key
3. **Google Cloud**:
- `GCLOUD_PROJECT_ID` - GCP project ID
- `GCS_BUCKET_NAME` - Storage bucket name
- `DOCUMENT_AI_PROCESSOR_ID` - Document AI processor ID
- `DOCUMENT_AI_LOCATION` - Processor location (default 'us')
4. **Feature Flags**:
- `AGENTIC_RAG_ENABLED` - Enable/disable agentic RAG processing
### Configuration Loading
**Priority Order**:
1. `process.env` (Firebase Functions v2)
2. `functions.config()` (Firebase Functions v1 fallback)
3. `.env` file (local development)
**Validation**: Joi schema validates all required environment variables
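The priority chain reduces to a simple lookup, sketched here with the three sources passed explicitly (`env.ts` layers Joi validation on top of this):

```typescript
// Resolve one config value following the priority order above:
// process.env first, then functions.config() values, then .env values.
// Passing the sources as plain records keeps the sketch testable.
function resolveConfigValue(
  key: string,
  processEnv: Record<string, string | undefined>,
  functionsConfig: Record<string, string | undefined>,
  dotEnv: Record<string, string | undefined>
): string | undefined {
  return processEnv[key] ?? functionsConfig[key] ?? dotEnv[key];
}
```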
---
## 13. Debugging Guide
### Key Log Points
**Correlation IDs**: Every request has a correlation ID for tracing
**Structured Logging**: Winston logger with structured data
**Key Log Locations**:
1. **Request Entry**: `backend/src/index.ts` - All incoming requests
2. **Authentication**: `backend/src/middleware/firebaseAuth.ts` - Auth success/failure
3. **Job Processing**: `backend/src/services/jobProcessorService.ts` - Job lifecycle
4. **Document Processing**: `backend/src/services/unifiedDocumentProcessor.ts` - Processing steps
5. **LLM Calls**: `backend/src/services/llmService.ts` - API calls and responses
6. **Vector Search**: `backend/src/services/vectorDatabaseService.ts` - Search operations
7. **Error Handling**: `backend/src/middleware/errorHandler.ts` - All errors with categorization
### Common Failure Points
**1. Vector Search Timeouts**
- **Symptom**: "Vector search timeout after 10s"
- **Cause**: Searching across all documents without `document_id` filter
- **Fix**: Always pass `documentId` to `vectorDatabaseService.searchSimilar()`
**2. LLM API Failures**
- **Symptom**: "LLM API call failed" or "Invalid JSON response"
- **Cause**: API rate limits, network issues, or invalid response format
- **Fix**: Check API keys, retry logic, and response validation
**3. GCS Upload Failures**
- **Symptom**: "Failed to upload to GCS" or "Signed URL expired"
- **Cause**: Credential issues, bucket permissions, or URL expiration
- **Fix**: Check GCS credentials and bucket configuration
**4. Job Stuck in Processing**
- **Symptom**: Job status remains 'processing' for > 15 minutes
- **Cause**: Process crashed, timeout, or error not caught
- **Fix**: Check logs, reset stuck jobs, investigate error
**5. Document AI Failures**
- **Symptom**: "Failed to extract text from document"
- **Cause**: Document AI API error or invalid file format
- **Fix**: Check Document AI processor configuration, fallback to pdf-parse
### Diagnostic Tools
**Health Check Endpoints**:
- `GET /health` - Basic health check
- `GET /health/config` - Configuration health
- `GET /health/agentic-rag` - Agentic RAG health status
**Monitoring Endpoints**:
- `GET /monitoring/upload-metrics` - Upload statistics
- `GET /monitoring/upload-health` - Upload health
- `GET /monitoring/real-time-stats` - Real-time statistics
**Database Debugging**:
```sql
-- Check pending jobs
SELECT * FROM processing_jobs WHERE status = 'pending' ORDER BY created_at DESC;
-- Check stuck jobs
SELECT * FROM processing_jobs
WHERE status = 'processing'
AND started_at < NOW() - INTERVAL '15 minutes';
-- Check document status
SELECT id, original_file_name, status, created_at
FROM documents
WHERE user_id = '<user_id>'
ORDER BY created_at DESC;
```
**Job Inspection**:
```typescript
// Get job details
const job = await ProcessingJobModel.findById(jobId);
// Check job error
console.log('Job error:', job.error);
// Check job result
console.log('Job result:', job.result);
```
### Debugging Workflow
1. **Identify the Issue**: Check error logs with correlation ID
2. **Trace the Request**: Follow correlation ID through logs
3. **Check Job Status**: Query `processing_jobs` table for job state
4. **Check Document Status**: Query `documents` table for document state
5. **Review Service Logs**: Check specific service logs for detailed errors
6. **Test Components**: Test individual services in isolation
7. **Check External Services**: Verify GCS, Document AI, LLM APIs are accessible
---
## 14. Optimization Opportunities
### Identified Bottlenecks
**1. Vector Search Performance**
- **Current**: 10-second timeout, can be slow for large document sets
- **Optimization**: Ensure `document_id` filter is always used
- **Future**: Consider indexing optimizations, batch search
**2. LLM API Calls**
- **Current**: Sequential processing, no caching of similar requests
- **Optimization**: Implement response caching for similar documents
- **Future**: Batch API calls, use smaller models for simpler tasks
**3. PDF Generation**
- **Current**: Puppeteer can be memory-intensive
- **Optimization**: Page pooling already implemented
- **Future**: Consider serverless PDF generation service
**4. Database Queries**
- **Current**: Some queries don't use indexes effectively
- **Optimization**: Add indexes on frequently queried columns
- **Future**: Query optimization, connection pooling tuning
### Memory Usage Patterns
**Chunk Processing**:
- Processes chunks in batches to limit memory
- Cleans up processed chunks after storage
- **Optimization**: Consider streaming for very large documents
**PDF Generation**:
- Page pooling limits memory usage
- Browser instance reuse reduces overhead
- **Optimization**: Consider headless browser optimization
### API Call Optimization
**Embedding Generation**:
- Current: Max 5 concurrent calls
- **Optimization**: Tune based on API rate limits
- **Future**: Batch embedding API if available
**LLM Calls**:
- Current: Single call per document
- **Optimization**: Use smaller models for simpler tasks
- **Future**: Implement response caching
### Database Query Optimization
**Frequently Queried Tables**:
- `documents` - Add index on `user_id`, `status`
- `processing_jobs` - Add index on `status`, `created_at`
- `document_chunks` - Add index on `document_id`, `chunk_index`
**Vector Search**:
- Current: Uses `match_document_chunks` function
- **Optimization**: Ensure `document_id` filter is always used
- **Future**: Consider HNSW index for faster similarity search
---
## Appendix: Key File Locations
### Backend Services
- `backend/src/services/unifiedDocumentProcessor.ts` - Main orchestrator
- `backend/src/services/optimizedAgenticRAGProcessor.ts` - AI processing engine
- `backend/src/services/jobProcessorService.ts` - Job processor
- `backend/src/services/vectorDatabaseService.ts` - Vector operations
- `backend/src/services/llmService.ts` - LLM interactions
- `backend/src/services/documentAiProcessor.ts` - Document AI integration
- `backend/src/services/pdfGenerationService.ts` - PDF generation
- `backend/src/services/fileStorageService.ts` - GCS operations
### Backend Models
- `backend/src/models/DocumentModel.ts` - Document data model
- `backend/src/models/ProcessingJobModel.ts` - Job data model
- `backend/src/models/VectorDatabaseModel.ts` - Vector data model
### Backend Routes
- `backend/src/routes/documents.ts` - Document endpoints
- `backend/src/routes/vector.ts` - Vector endpoints
- `backend/src/routes/monitoring.ts` - Monitoring endpoints
### Backend Controllers
- `backend/src/controllers/documentController.ts` - Document controller
### Frontend Services
- `frontend/src/services/documentService.ts` - Document API client
- `frontend/src/services/authService.ts` - Authentication service
### Frontend Components
- `frontend/src/components/DocumentUpload.tsx` - Upload component
- `frontend/src/components/DocumentList.tsx` - Document list
- `frontend/src/components/DocumentViewer.tsx` - Document viewer
### Configuration
- `backend/src/config/env.ts` - Environment configuration
- `backend/src/config/supabase.ts` - Supabase configuration
- `backend/src/config/firebase.ts` - Firebase configuration
### SQL
- `backend/sql/fix_vector_search_timeout.sql` - Vector search optimization
---
**End of Architecture Summary**