# CIM Summary Codebase Architecture Summary

**Last Updated**: December 2024
**Purpose**: Comprehensive technical reference for senior developers optimizing and debugging the codebase

---

## Table of Contents

1. [System Overview](#1-system-overview)
2. [Application Entry Points](#2-application-entry-points)
3. [Request Flow & API Architecture](#3-request-flow--api-architecture)
4. [Document Processing Pipeline (Critical Path)](#4-document-processing-pipeline-critical-path)
5. [Core Services Deep Dive](#5-core-services-deep-dive)
6. [Data Models & Database Schema](#6-data-models--database-schema)
7. [Component Handoffs & Integration Points](#7-component-handoffs--integration-points)
8. [Error Handling & Resilience](#8-error-handling--resilience)
9. [Performance Optimization Points](#9-performance-optimization-points)
10. [Background Processing Architecture](#10-background-processing-architecture)
11. [Frontend Architecture](#11-frontend-architecture)
12. [Configuration & Environment](#12-configuration--environment)
13. [Debugging Guide](#13-debugging-guide)
14. [Optimization Opportunities](#14-optimization-opportunities)

---

## 1. System Overview

### High-Level Architecture

```
┌─────────────────────────────────────────────────────────┐
│                    Frontend (React)                     │
│  ┌────────────────┐  ┌──────────────┐  ┌───────────┐    │
│  │ DocumentUpload │  │ DocumentList │  │ Analytics │    │
│  └────────┬───────┘  └───────┬──────┘  └─────┬─────┘    │
│           └──────────────────┼───────────────┘          │
│                    ┌─────────▼─────────┐                │
│                    │  documentService  │                │
│                    │  (Axios Client)   │                │
│                    └─────────┬─────────┘                │
└──────────────────────────────┼──────────────────────────┘
                               │ HTTPS + JWT
┌──────────────────────────────▼──────────────────────────┐
│               Backend (Express + Node.js)               │
│  Middleware Chain: CORS → Auth → Validation → Error     │
│  Handler                                                │
│  ┌────────┐     ┌─────────────┐     ┌──────────┐        │
│  │ Routes │ ──▶ │ Controllers │ ──▶ │ Services │        │
│  └────────┘     └─────────────┘     └──────────┘        │
└──────────────────────────────┬──────────────────────────┘
                               │
        ┌──────────────────────┼──────────────────────┐
        │                      │                      │
  ┌─────▼──────┐       ┌───────▼──────┐       ┌───────▼───────┐
  │  Supabase  │       │ Google Cloud │       │   LLM APIs    │
  │ (Postgres) │       │   Storage    │       │(Claude/OpenAI)│
  └────────────┘       └──────────────┘       └───────────────┘
```

### Technology Stack

**Frontend:**
- React 18 + TypeScript
- Vite (build tool)
- Axios (HTTP client)
- Firebase Auth (authentication)
- React Router (routing)

**Backend:**
- Node.js + Express + TypeScript
- Firebase Functions v2 (deployment)
- Supabase (PostgreSQL + Vector DB)
- Google Cloud Storage (file storage)
- Google Document AI (PDF text extraction)
- Puppeteer (PDF generation)

**AI/ML Services:**
- Anthropic Claude (primary LLM)
- OpenAI (fallback LLM)
- OpenRouter (LLM routing)
- OpenAI Embeddings (vector embeddings)

### Core Purpose

Automated processing and analysis of Confidential Information Memorandums (CIMs) using:

1. **Text Extraction**: Google Document AI extracts text from PDFs
2. **Semantic Chunking**: Split text into 4000-char chunks with overlap
3. **Vector Embeddings**: Generate embeddings for semantic search
4. **LLM Analysis**: Claude AI analyzes chunks and generates structured CIMReview data
5. **PDF Generation**: Create summary PDF with analysis results

---

## 2. Application Entry Points

### Backend Entry Point

**File**: `backend/src/index.ts`

```1:22:backend/src/index.ts
// Initialize Firebase Admin SDK first
import './config/firebase';

import express from 'express';
import cors from 'cors';
import helmet from 'helmet';
import morgan from 'morgan';
import rateLimit from 'express-rate-limit';
import { config } from './config/env';
import { logger } from './utils/logger';
import documentRoutes from './routes/documents';
import vectorRoutes from './routes/vector';
import monitoringRoutes from './routes/monitoring';
import auditRoutes from './routes/documentAudit';
import { jobQueueService } from './services/jobQueueService';
import { errorHandler, correlationIdMiddleware } from './middleware/errorHandler';
import { notFoundHandler } from './middleware/notFoundHandler';

// Start the job queue service for background processing
jobQueueService.start();
```

**Key Initialization Steps:**
1. Firebase Admin SDK initialization (`./config/firebase`)
2. Express app setup with middleware chain
3. Route registration (`/documents`, `/vector`, `/monitoring`, `/api/audit`)
4. Job queue service startup (legacy in-memory queue)
5. Firebase Functions export for Cloud deployment

**Scheduled Function**: `processDocumentJobs` (```210:267:backend/src/index.ts```)
- Runs every minute via Firebase Cloud Scheduler
- Processes pending/retrying jobs from database
- Detects and resets stuck jobs

### Frontend Entry Point

**File**: `frontend/src/main.tsx`

```1:10:frontend/src/main.tsx
import React from 'react';
import ReactDOM from 'react-dom/client';
import App from './App';
import './index.css';

ReactDOM.createRoot(document.getElementById('root')!).render(
  <React.StrictMode>
    <App />
  </React.StrictMode>
);
```

**Main App Component**: `frontend/src/App.tsx`
- Sets up React Router
- Provides AuthContext
- Renders protected routes and dashboard

---

## 3. Request Flow & API Architecture

### Request Lifecycle

```
Client Request
      │
      ▼
1. CORS Middleware           - validates origin, sets CORS headers
      │
      ▼
2. Correlation ID Middleware - generates/reads X-Correlation-ID,
      │                        adds it to the request object
      ▼
3. Firebase Auth Middleware  - verifies JWT token, attaches user to req.user
      │
      ▼
4. Rate Limiting             - 1000 requests per 15 minutes
      │
      ▼
5. Body Parsing              - JSON (10MB limit), URL-encoded (10MB limit)
      │
      ▼
6. Route Handler             - matches route pattern, calls controller method
      │
      ▼
7. Controller                - validates input, calls service methods,
      │                        returns response
      ▼
8. Service Layer             - business logic, database operations,
      │                        external API calls
      ▼
9. Error Handler (if error)  - categorizes error, logs with correlation ID,
                               returns structured response
```

### Authentication Flow

**Middleware**: `backend/src/middleware/firebaseAuth.ts`

```27:81:backend/src/middleware/firebaseAuth.ts
export const verifyFirebaseToken = async (
  req: FirebaseAuthenticatedRequest,
  res: Response,
  next: NextFunction
): Promise<void> => {
  try {
    console.log('🔐 Authentication middleware called for:', req.method, req.url);
    console.log('🔐 Request headers:', Object.keys(req.headers));

    // Debug Firebase Admin initialization
    console.log('🔐 Firebase apps available:', admin.apps.length);
    console.log('🔐 Firebase app names:', admin.apps.filter(app => app !== null).map(app => app!.name));

    const authHeader = req.headers.authorization;
    console.log('🔐 Auth header present:', !!authHeader);
    console.log('🔐 Auth header starts with Bearer:', authHeader?.startsWith('Bearer '));

    if (!authHeader || !authHeader.startsWith('Bearer ')) {
      console.log('❌ No valid authorization header');
      res.status(401).json({ error: 'No valid authorization header' });
      return;
    }

    const idToken = authHeader.split('Bearer ')[1];
    console.log('🔐 Token extracted, length:', idToken?.length);

    if (!idToken) {
      console.log('❌ No token provided');
      res.status(401).json({ error: 'No token provided' });
      return;
    }

    console.log('🔐 Attempting to verify Firebase ID token...');
    console.log('🔐 Token preview:', idToken.substring(0, 20) + '...');

    // Verify the Firebase ID token
    const decodedToken = await admin.auth().verifyIdToken(idToken, true);
    console.log('✅ Token verified successfully for user:', decodedToken.email);
    console.log('✅ Token UID:', decodedToken.uid);
    console.log('✅ Token issuer:', decodedToken.iss);

    // Check if token is expired
    const now = Math.floor(Date.now() / 1000);
    if (decodedToken.exp && decodedToken.exp < now) {
      logger.warn('Token expired for user:', decodedToken.uid);
      res.status(401).json({ error: 'Token expired' });
      return;
    }

    req.user = decodedToken;

    // Log successful authentication
    logger.info('Authenticated request for user:', decodedToken.email);
    next();
```

**Frontend Auth**: `frontend/src/services/authService.ts`
- Manages Firebase Auth state
- Provides token via `getToken()`
- Axios interceptor adds token to requests

### Route Structure

**Main Routes** (`backend/src/routes/documents.ts`):
- `POST /documents/upload-url` - Get signed upload URL
- `POST /documents/:id/confirm-upload` - Confirm upload and start processing
- `GET /documents` - List user's documents
- `GET /documents/:id` - Get document details
- `GET /documents/:id/download` - Download processed PDF
- `GET /documents/analytics` - Get processing analytics
- `POST /documents/:id/process-optimized-agentic-rag` - Trigger AI processing

**Middleware Applied**:

```22:29:backend/src/routes/documents.ts
// Apply authentication and correlation ID to all routes
router.use(verifyFirebaseToken);
router.use(addCorrelationId);

// Add logging middleware for document routes
router.use((req, res, next) => {
  console.log(`📄 Document route accessed: ${req.method} ${req.path}`);
  next();
});
```

---

## 4. Document Processing Pipeline (Critical Path)

### Complete Flow Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│                  DOCUMENT PROCESSING PIPELINE                   │
└─────────────────────────────────────────────────────────────────┘

1. UPLOAD PHASE
   User selects PDF
     → DocumentUpload component
     → documentService.uploadDocument()
     → POST /documents/upload-url
     → documentController.getUploadUrl()
     → DocumentModel.create() → documents table
     → fileStorageService.generateSignedUploadUrl()
     → direct upload to GCS via signed URL
     → POST /documents/:id/confirm-upload

2. JOB CREATION PHASE
   documentController.confirmUpload()
     → ProcessingJobModel.create() → processing_jobs table
     → status: 'pending'
     → returns 202 Accepted (async processing)

3. JOB PROCESSING PHASE (background)
   Scheduled function processDocumentJobs (every 1 minute)
   OR immediate processing via jobProcessorService.processJob()
     → JobProcessorService.processJob()
     → download file from GCS
     → unifiedDocumentProcessor.processDocument()

4. TEXT EXTRACTION PHASE
   documentAiProcessor.processDocument()
     → Google Document AI API
     → extracted text returned

5. CHUNKING & EMBEDDING PHASE
   optimizedAgenticRAGProcessor.processLargeDocument()
     → createIntelligentChunks()
         - semantic boundary detection
         - 4000-char chunks with 200-char overlap
     → processChunksInBatches()
         - batch size: 10, max concurrent: 5
     → storeChunksOptimized()
     → vectorDatabaseService.storeEmbedding()
         - OpenAI embeddings API
         - stored in document_chunks table

6. LLM ANALYSIS PHASE
   generateLLMAnalysisHybrid()
     → llmService.processCIMDocument()
     → vector search for relevant chunks
     → Claude/OpenAI API call with structured prompt
     → parse and validate CIMReview JSON
     → return structured analysisData

7. PDF GENERATION PHASE
   pdfGenerationService.generatePDF()
     → Puppeteer browser instance
     → render HTML template with analysisData
     → generate PDF buffer
     → upload PDF to GCS
     → update document record with PDF path

8. STATUS UPDATE PHASE
   DocumentModel.updateById()
     - status: 'completed'
     - pdf_path: GCS path
     → ProcessingJobModel.markAsCompleted()
     → frontend polls /documents/:id for status updates
```

### Key Handoff Points

**1. Upload to Job Creation**

```138:202:backend/src/controllers/documentController.ts
async confirmUpload(req: Request, res: Response): Promise<void> {
  // ... validation ...
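  // Editorial sketch (not part of the source excerpt): the elided validation
  // plausibly confirms the document exists, belongs to the caller, and is
  // still awaiting confirmation before processing begins. The names below
  // are assumptions, not quoted code:
  //
  //   const document = await DocumentModel.findById(documentId);
  //   if (!document || document.user_id !== userId) {
  //     res.status(404).json({ error: 'Document not found' });
  //     return;
  //   }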
  // Update status to processing
  await DocumentModel.updateById(documentId, { status: 'processing_llm' });

  // Acknowledge the request immediately
  res.status(202).json({
    message: 'Upload confirmed, processing has started.',
    document: document,
    status: 'processing'
  });

  // CRITICAL FIX: Use database-backed job queue
  const { ProcessingJobModel } = await import('../models/ProcessingJobModel');
  await ProcessingJobModel.create({
    document_id: documentId,
    user_id: userId,
    options: {
      fileName: document.original_file_name,
      mimeType: 'application/pdf'
    }
  });
}
```

**2. Job Processing to Document Processing**

```109:200:backend/src/services/jobProcessorService.ts
private async processJob(jobId: string): Promise<{ success: boolean; error?: string }> {
  // Get job details
  job = await ProcessingJobModel.findById(jobId);

  // Mark job as processing
  await ProcessingJobModel.markAsProcessing(jobId);

  // Download file from GCS
  const fileBuffer = await fileStorageService.downloadFile(document.file_path);

  // Process document
  const result = await unifiedDocumentProcessor.processDocument(
    job.document_id,
    job.user_id,
    fileBuffer.toString('utf-8'), // This will be re-read as buffer
    {
      fileBuffer,
      fileName: job.options?.fileName || 'document.pdf',
      mimeType: job.options?.mimeType || 'application/pdf'
    }
  );
}
```

**3. Document Processing to Text Extraction**

```50:80:backend/src/services/documentAiProcessor.ts
async processDocument(
  documentId: string,
  userId: string,
  fileBuffer: Buffer,
  fileName: string,
  mimeType: string
): Promise {
  // Step 1: Extract text using Document AI or fallback
  const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);

  // Step 2: Process extracted text through Agentic RAG
  const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);
}
```

**4. Text to Chunking**

```40:109:backend/src/services/optimizedAgenticRAGProcessor.ts
async processLargeDocument(
  documentId: string,
  text: string,
  options: {
    enableSemanticChunking?: boolean;
    enableMetadataEnrichment?: boolean;
    similarityThreshold?: number;
  } = {}
): Promise {
  // Step 1: Create intelligent chunks with semantic boundaries
  const chunks = await this.createIntelligentChunks(text, documentId, options.enableSemanticChunking);

  // Step 2: Process chunks in batches to manage memory
  const processedChunks = await this.processChunksInBatches(chunks, documentId, options);

  // Step 3: Store chunks with optimized batching
  const embeddingApiCalls = await this.storeChunksOptimized(processedChunks, documentId);

  // Step 4: Generate LLM analysis using HYBRID approach
  const llmResult = await this.generateLLMAnalysisHybrid(documentId, text, processedChunks);
}
```

---

## 5. Core Services Deep Dive

### 5.1 UnifiedDocumentProcessor

**File**: `backend/src/services/unifiedDocumentProcessor.ts`

**Purpose**: Main orchestrator for document processing strategies

**Key Method**:

```123:143:backend/src/services/unifiedDocumentProcessor.ts
async processDocument(
  documentId: string,
  userId: string,
  text: string,
  options: any = {}
): Promise {
  const strategy = options.strategy || 'document_ai_agentic_rag';

  logger.info('Processing document with unified processor', {
    documentId,
    strategy,
    textLength: text.length
  });

  // Only support document_ai_agentic_rag strategy
  if (strategy === 'document_ai_agentic_rag') {
    return await this.processWithDocumentAiAgenticRag(documentId, userId, text, options);
  } else {
    throw new Error(`Unsupported processing strategy: ${strategy}. Only 'document_ai_agentic_rag' is supported.`);
  }
}
```

**Dependencies**:
- `documentAiProcessor` - Text extraction
- `optimizedAgenticRAGProcessor` - AI processing
- `llmService` - LLM interactions
- `pdfGenerationService` - PDF generation

**Error Handling**: Wraps errors with detailed context, validates analysisData presence

### 5.2 OptimizedAgenticRAGProcessor

**File**: `backend/src/services/optimizedAgenticRAGProcessor.ts` (1885 lines)

**Purpose**: Core AI processing engine for chunking, embeddings, and LLM analysis

**Key Configuration**:

```32:35:backend/src/services/optimizedAgenticRAGProcessor.ts
private readonly maxChunkSize = 4000; // Optimal chunk size for embeddings
private readonly overlapSize = 200; // Overlap between chunks
private readonly maxConcurrentEmbeddings = 5; // Limit concurrent API calls
private readonly batchSize = 10; // Process chunks in batches
```

**Key Methods**:
- `processLargeDocument()` - Main entry point
- `createIntelligentChunks()` - Semantic chunking with boundary detection
- `processChunksInBatches()` - Batch processing for memory efficiency
- `storeChunksOptimized()` - Embedding generation and storage
- `generateLLMAnalysisHybrid()` - LLM analysis with vector search

**Performance Optimizations**:
- Semantic boundary detection (paragraphs, sections)
- Batch processing to limit memory usage
- Concurrent embedding generation (max 5)
- Vector search with document_id filtering

### 5.3 JobProcessorService

**File**: `backend/src/services/jobProcessorService.ts`

**Purpose**: Database-backed job processor (replaces legacy in-memory queue)

**Key Method**:

```15:97:backend/src/services/jobProcessorService.ts
async processJobs(): Promise<{
  processed: number;
  succeeded: number;
  failed: number;
  skipped: number;
}> {
  // Prevent concurrent processing runs
  if (this.isProcessing) {
    logger.info('Job processor already running, skipping this run');
    return { processed: 0, succeeded: 0, failed: 0, skipped: 0 };
  }

  this.isProcessing = true;
  const stats = { processed: 0, succeeded: 0, failed: 0, skipped: 0 };

  try {
    // Reset stuck jobs first
    const resetCount = await ProcessingJobModel.resetStuckJobs(this.JOB_TIMEOUT_MINUTES);

    // Get pending jobs
    const pendingJobs = await ProcessingJobModel.getPendingJobs(this.MAX_CONCURRENT_JOBS);

    // Get retrying jobs
    const retryingJobs = await ProcessingJobModel.getRetryableJobs(
      Math.max(0, this.MAX_CONCURRENT_JOBS - pendingJobs.length)
    );

    const allJobs = [...pendingJobs, ...retryingJobs];

    // Process jobs in parallel (up to MAX_CONCURRENT_JOBS)
    const results = await Promise.allSettled(
      allJobs.map((job) => this.processJob(job.id))
    );
```

**Configuration**:
- `MAX_CONCURRENT_JOBS = 3`
- `JOB_TIMEOUT_MINUTES = 15`

**Features**:
- Stuck job detection and recovery
- Retry logic with exponential backoff
- Parallel processing with concurrency limit
- Database-backed state management

### 5.4 VectorDatabaseService

**File**: `backend/src/services/vectorDatabaseService.ts`

**Purpose**: Vector embeddings and similarity search

**Key Method - Vector Search**:

```88:150:backend/src/services/vectorDatabaseService.ts
async searchSimilar(
  embedding: number[],
  limit: number = 10,
  threshold: number = 0.7,
  documentId?: string
): Promise {
  try {
    if (this.provider === 'supabase') {
      // Use optimized Supabase vector search function with document_id filtering
      // This prevents timeouts by only searching within a specific document
      const rpcParams: any = {
        query_embedding: embedding,
        match_threshold: threshold,
        match_count: limit
      };

      // Add document_id filter if provided (critical for performance)
      if (documentId) {
        rpcParams.filter_document_id = documentId;
      }

      // Set a timeout for the RPC call (10 seconds)
      const searchPromise = this.supabaseClient
        .rpc('match_document_chunks', rpcParams);

      const timeoutPromise = new Promise<{ data: null; error: { message: string } }>((_, reject) => {
        setTimeout(() => reject(new Error('Vector search timeout after 10s')), 10000);
      });

      let result: any;
      try {
        result = await Promise.race([searchPromise, timeoutPromise]);
      } catch (timeoutError: any) {
        if (timeoutError.message?.includes('timeout')) {
          logger.error('Vector search timed out', { documentId, timeout: '10s' });
          throw new Error('Vector search timeout after 10s');
        }
        throw timeoutError;
      }
```

**Critical Optimization**: Always pass `documentId` to filter search scope and prevent timeouts

**SQL Function**: `backend/sql/fix_vector_search_timeout.sql`

```10:39:backend/sql/fix_vector_search_timeout.sql
CREATE OR REPLACE FUNCTION match_document_chunks (
  query_embedding vector(1536),
  match_threshold float,
  match_count int,
  filter_document_id text DEFAULT NULL
)
RETURNS TABLE (
  id UUID,
  document_id TEXT,
  content text,
  metadata JSONB,
  chunk_index INT,
  similarity float
)
LANGUAGE sql STABLE
AS $$
  SELECT
    document_chunks.id,
    document_chunks.document_id,
    document_chunks.content,
    document_chunks.metadata,
    document_chunks.chunk_index,
    1 - (document_chunks.embedding <=> query_embedding) AS similarity
  FROM document_chunks
  WHERE document_chunks.embedding IS NOT NULL
    AND (filter_document_id IS NULL OR document_chunks.document_id = filter_document_id)
    AND 1 - (document_chunks.embedding <=> query_embedding) > match_threshold
  ORDER BY document_chunks.embedding <=> query_embedding
  LIMIT match_count;
$$;
```

### 5.5 LLMService

**File**: `backend/src/services/llmService.ts`

**Purpose**: LLM interactions (Claude/OpenAI/OpenRouter)

**Provider Selection**:

```43:103:backend/src/services/llmService.ts
constructor() {
  // Read provider from config (supports openrouter, anthropic, openai)
  this.provider = config.llm.provider;

  // CRITICAL: If provider is not set correctly, log and use fallback
  if (!this.provider || (this.provider !== 'openrouter' && this.provider !== 'anthropic' && this.provider !== 'openai')) {
    logger.error('LLM provider is invalid or not set', {
      provider: this.provider,
      configProvider: config.llm.provider,
      processEnvProvider: process.env['LLM_PROVIDER'],
      defaultingTo: 'anthropic'
    });
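    // Editorial annotation (not part of the source file), summarizing the
    // provider → API key resolution performed in the branches below:
    //   openai     → config.llm.openaiApiKey
    //   openrouter → config.llm.openrouterApiKey, else anthropicApiKey (BYOK)
    //   anthropic  → config.llm.anthropicApiKey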
    this.provider = 'anthropic'; // Fallback
  }

  // Set API key based on provider
  if (this.provider === 'openai') {
    this.apiKey = config.llm.openaiApiKey!;
  } else if (this.provider === 'openrouter') {
    // OpenRouter: Use OpenRouter key if provided, otherwise use Anthropic key for BYOK
    this.apiKey = config.llm.openrouterApiKey || config.llm.anthropicApiKey!;
  } else {
    this.apiKey = config.llm.anthropicApiKey!;
  }

  // Use configured model instead of hardcoded value
  this.defaultModel = config.llm.model;
  this.maxTokens = config.llm.maxTokens;
  this.temperature = config.llm.temperature;
}
```

**Key Method**:

```108:148:backend/src/services/llmService.ts
async processCIMDocument(text: string, template: string, analysis?: Record<string, any>): Promise {
  // Check and truncate text if it exceeds maxInputTokens
  const maxInputTokens = config.llm.maxInputTokens || 200000;
  const systemPromptTokens = this.estimateTokenCount(this.getCIMSystemPrompt());
  const templateTokens = this.estimateTokenCount(template);
  const promptBuffer = config.llm.promptBuffer || 1000;

  // Calculate available tokens for document text
  const reservedTokens = systemPromptTokens + templateTokens + promptBuffer + (config.llm.maxTokens || 16000);
  const availableTokens = maxInputTokens - reservedTokens;
  const textTokens = this.estimateTokenCount(text);

  let processedText = text;
  let wasTruncated = false;

  if (textTokens > availableTokens) {
    logger.warn('Document text exceeds token limit, truncating', {
      textTokens,
      availableTokens,
      maxInputTokens,
      reservedTokens,
      truncationRatio: (availableTokens / textTokens * 100).toFixed(1) + '%'
    });
    processedText = this.truncateText(text, availableTokens);
    wasTruncated = true;
  }
```

**Features**:
- Automatic token counting and truncation
- Model selection based on task complexity
- JSON schema validation with Zod
- Retry logic with exponential backoff
- Cost tracking

### 5.6 DocumentAiProcessor

**File**: `backend/src/services/documentAiProcessor.ts`

**Purpose**: Google Document AI integration for text extraction

**Key Method**:

```50:146:backend/src/services/documentAiProcessor.ts
async processDocument(
  documentId: string,
  userId: string,
  fileBuffer: Buffer,
  fileName: string,
  mimeType: string
): Promise {
  const startTime = Date.now();

  try {
    logger.info('Starting Document AI + Agentic RAG processing', {
      documentId,
      userId,
      fileName,
      fileSize: fileBuffer.length,
      mimeType
    });

    // Step 1: Extract text using Document AI or fallback
    const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);

    if (!extractedText) {
      throw new Error('Failed to extract text from document');
    }

    logger.info('Text extraction completed', { textLength: extractedText.length });

    // Step 2: Process extracted text through Agentic RAG
    const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);

    const processingTime = Date.now() - startTime;

    return {
      success: true,
      content: agenticRagResult.summary || extractedText,
      metadata: {
        processingStrategy: 'document_ai_agentic_rag',
        processingTime,
        extractedTextLength: extractedText.length,
        agenticRagResult,
        fileSize: fileBuffer.length,
        fileName,
        mimeType
      }
    };
```

**Fallback Strategy**: Uses `pdf-parse` if Document AI fails

### 5.7 PDFGenerationService

**File**: `backend/src/services/pdfGenerationService.ts`

**Purpose**: PDF generation using Puppeteer

**Key Features**:
- Page pooling for performance
- Caching for repeated requests
- Browser instance reuse
- Fallback to PDFKit if Puppeteer fails

**Configuration**:

```65:85:backend/src/services/pdfGenerationService.ts
class PDFGenerationService {
  private browser: any = null;
  private pagePool: PagePool[] = [];
  private readonly maxPoolSize = 5;
  private readonly pageTimeout = 30000; // 30 seconds
  private readonly cache = new Map();
  private readonly cacheTimeout = 300000; // 5 minutes

  private readonly defaultOptions: PDFGenerationOptions = {
    format: 'A4',
    margin: {
      top: '1in',
      right: '1in',
      bottom: '1in',
      left: '1in',
    },
    displayHeaderFooter: true,
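    // Editorial note (not part of the source file): with Puppeteer,
    // displayHeaderFooter only produces visible output when the page margins
    // leave room for it; custom headerTemplate/footerTemplate strings may be
    // supplied alongside this flag.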
printBackground: true, quality: 'high', timeout: 30000, }; ``` ### 5.8 FileStorageService **File**: `backend/src/services/fileStorageService.ts` **Purpose**: Google Cloud Storage operations **Key Methods**: - `generateSignedUploadUrl()` - Generate signed URL for direct upload - `downloadFile()` - Download file from GCS - `saveBuffer()` - Save buffer to GCS - `deleteFile()` - Delete file from GCS **Credential Handling**: ```40:145:backend/src/services/fileStorageService.ts constructor() { this.bucketName = config.googleCloud.gcsBucketName; // Check if we're in Firebase Functions/Cloud Run environment const isCloudEnvironment = process.env.FUNCTION_TARGET || process.env.FUNCTION_NAME || process.env.K_SERVICE || process.env.GOOGLE_CLOUD_PROJECT || !!process.env.GCLOUD_PROJECT || process.env.X_GOOGLE_GCLOUD_PROJECT; // Initialize Google Cloud Storage const storageConfig: any = { projectId: config.googleCloud.projectId, }; // Only use keyFilename in local development // In Firebase Functions/Cloud Run, use Application Default Credentials if (isCloudEnvironment) { // In cloud, ALWAYS clear GOOGLE_APPLICATION_CREDENTIALS to force use of ADC // Firebase Functions automatically provides credentials via metadata service // These credentials have signing capabilities for generating signed URLs const originalCreds = process.env.GOOGLE_APPLICATION_CREDENTIALS; if (originalCreds) { delete process.env.GOOGLE_APPLICATION_CREDENTIALS; logger.info('Using Application Default Credentials for GCS (cloud environment)', { clearedEnvVar: 'GOOGLE_APPLICATION_CREDENTIALS', originalValue: originalCreds, projectId: config.googleCloud.projectId }); } ``` --- ## 6. 
Data Models & Database Schema ### Core Models **DocumentModel** (`backend/src/models/DocumentModel.ts`): - `create()` - Create document record - `findById()` - Get document by ID - `updateById()` - Update document status/metadata - `findByUserId()` - List user's documents **ProcessingJobModel** (`backend/src/models/ProcessingJobModel.ts`): - `create()` - Create processing job (uses direct PostgreSQL to bypass PostgREST cache) - `findById()` - Get job by ID - `getPendingJobs()` - Get pending jobs (limit by concurrency) - `getRetryableJobs()` - Get jobs ready for retry - `markAsProcessing()` - Update job status - `markAsCompleted()` - Mark job complete - `markAsFailed()` - Mark job failed with error - `resetStuckJobs()` - Reset jobs stuck in processing **VectorDatabaseModel** (`backend/src/models/VectorDatabaseModel.ts`): - Chunk storage and retrieval - Embedding management ### Database Tables **documents**: - `id` (UUID, primary key) - `user_id` (UUID, foreign key) - `original_file_name` (text) - `file_path` (text, GCS path) - `file_size` (bigint) - `status` (text: 'uploading', 'uploaded', 'processing_llm', 'completed', 'failed') - `pdf_path` (text, GCS path for generated PDF) - `created_at`, `updated_at` (timestamps) **processing_jobs**: - `id` (UUID, primary key) - `document_id` (UUID, foreign key) - `user_id` (UUID, foreign key) - `status` (text: 'pending', 'processing', 'completed', 'failed', 'retrying') - `attempts` (int) - `max_attempts` (int, default 3) - `options` (JSONB, processing options) - `error` (text, error message if failed) - `result` (JSONB, processing result) - `created_at`, `started_at`, `completed_at`, `updated_at` (timestamps) **document_chunks**: - `id` (UUID, primary key) - `document_id` (text, foreign key) - `content` (text) - `embedding` (vector(1536)) - `metadata` (JSONB) - `chunk_index` (int) - `created_at`, `updated_at` (timestamps) **agentic_rag_sessions**: - `id` (UUID, primary key) - `document_id` (UUID, foreign key) - `user_id` 
(UUID, foreign key) - `status` (text) - `metadata` (JSONB) - `created_at`, `updated_at` (timestamps) ### Vector Search Optimization **Critical SQL Function**: `match_document_chunks` with `document_id` filtering ```10:39:backend/sql/fix_vector_search_timeout.sql CREATE OR REPLACE FUNCTION match_document_chunks ( query_embedding vector(1536), match_threshold float, match_count int, filter_document_id text DEFAULT NULL ) RETURNS TABLE ( id UUID, document_id TEXT, content text, metadata JSONB, chunk_index INT, similarity float ) LANGUAGE sql STABLE AS $$ SELECT document_chunks.id, document_chunks.document_id, document_chunks.content, document_chunks.metadata, document_chunks.chunk_index, 1 - (document_chunks.embedding <=> query_embedding) AS similarity FROM document_chunks WHERE document_chunks.embedding IS NOT NULL AND (filter_document_id IS NULL OR document_chunks.document_id = filter_document_id) AND 1 - (document_chunks.embedding <=> query_embedding) > match_threshold ORDER BY document_chunks.embedding <=> query_embedding LIMIT match_count; $$; ``` **Always pass `filter_document_id`** to prevent timeouts when searching across all documents. --- ## 7. 
Component Handoffs & Integration Points ### Frontend ↔ Backend **Axios Interceptor** (`frontend/src/services/documentService.ts`): ```8:54:frontend/src/services/documentService.ts export const apiClient = axios.create({ baseURL: API_BASE_URL, timeout: 300000, // 5 minutes }); // Add auth token to requests apiClient.interceptors.request.use(async (config) => { const token = await authService.getToken(); if (token) { config.headers.Authorization = `Bearer ${token}`; } return config; }); // Handle auth errors with retry logic apiClient.interceptors.response.use( (response) => response, async (error) => { const originalRequest = error.config; if (error.response?.status === 401 && !originalRequest._retry) { originalRequest._retry = true; try { // Attempt to refresh the token const newToken = await authService.getToken(); if (newToken) { // Retry the original request with the new token originalRequest.headers.Authorization = `Bearer ${newToken}`; return apiClient(originalRequest); } } catch (refreshError) { console.error('Token refresh failed:', refreshError); } // If token refresh fails, logout the user authService.logout(); window.location.href = '/login'; } return Promise.reject(error); } ); ``` ### Backend ↔ Database **Two Connection Methods**: 1. **Supabase Client** (default for most operations): ```typescript import { getSupabaseServiceClient } from '../config/supabase'; const supabase = getSupabaseServiceClient(); ``` 2. 
**Direct PostgreSQL** (for critical operations, bypasses PostgREST cache): ```47:81:backend/src/models/ProcessingJobModel.ts static async create(data: CreateProcessingJobData): Promise<ProcessingJob> { try { // Use direct PostgreSQL connection to bypass PostgREST cache // This is critical because PostgREST cache issues can block entire processing pipeline const pool = getPostgresPool(); const result = await pool.query( `INSERT INTO processing_jobs ( document_id, user_id, status, attempts, max_attempts, options, created_at ) VALUES ($1, $2, $3, $4, $5, $6, $7) RETURNING *`, [ data.document_id, data.user_id, 'pending', 0, data.max_attempts || 3, JSON.stringify(data.options || {}), new Date().toISOString() ] ); if (result.rows.length === 0) { throw new Error('Failed to create processing job: No data returned'); } const job = result.rows[0]; logger.info('Processing job created via direct PostgreSQL', { jobId: job.id, documentId: data.document_id, userId: data.user_id, }); return job; ``` ### Backend ↔ GCS **Signed URL Generation**: ```typescript const uploadUrl = await fileStorageService.generateSignedUploadUrl(filePath, contentType); ``` **Direct Upload** (frontend): ```403:410:frontend/src/services/documentService.ts const fetchPromise = fetch(uploadUrl, { method: 'PUT', headers: { 'Content-Type': contentType, // Must match exactly what was used in signed URL generation }, body: file, signal: signal, }); ``` **File Download** (for processing): ```typescript const fileBuffer = await fileStorageService.downloadFile(document.file_path); ``` ### Backend ↔ Document AI **Text Extraction**: ```148:249:backend/src/services/documentAiProcessor.ts private async extractTextFromDocument(fileBuffer: Buffer, fileName: string, mimeType: string): Promise<string> { try { // Check document size first // ... size validation ...
// Upload to GCS for Document AI processing const gcsFileName = `temp/${Date.now()}_${fileName}`; await this.storageClient.bucket(this.gcsBucketName).file(gcsFileName).save(fileBuffer); // Process with Document AI const request = { name: this.processorName, rawDocument: { gcsSource: { uri: `gs://${this.gcsBucketName}/${gcsFileName}` }, mimeType: mimeType } }; const [result] = await this.documentAiClient.processDocument(request); // Extract text from result const text = result.document?.text || ''; // Clean up temp file await this.storageClient.bucket(this.gcsBucketName).file(gcsFileName).delete(); return text; } catch (error) { // Fallback to pdf-parse logger.warn('Document AI failed, using pdf-parse fallback', { error }); const data = await pdf(fileBuffer); return data.text; } } ``` ### Backend ↔ LLM APIs **Provider Selection** (Claude/OpenAI/OpenRouter): - Configured via `LLM_PROVIDER` environment variable - Automatic API key selection based on provider - Model selection based on task complexity **Request Flow**: ```typescript // 1. Token counting and truncation const processedText = this.truncateText(text, availableTokens); // 2. Model selection const model = this.selectModel(taskComplexity); // 3. API call with retry logic const response = await this.callLLMAPI({ prompt: processedText, systemPrompt: systemPrompt, model: model, maxTokens: this.maxTokens, temperature: this.temperature }); // 4. JSON parsing and validation const parsed = JSON.parse(response.content); const validated = cimReviewSchema.parse(parsed); ``` ### Services ↔ Services **Event-Driven Patterns**: - `jobQueueService` emits events: `job:added`, `job:started`, `job:completed`, `job:failed` - `uploadMonitoringService` tracks upload events **Direct Method Calls**: - Most service interactions are direct method calls - Services are exported as singletons for easy access --- ## 8. 
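The event-driven pattern can be sketched with Node's `EventEmitter`. A minimal illustration using the event names listed above for `jobQueueService`; the class and payload shapes here are hypothetical:

```typescript
import { EventEmitter } from 'events';

// Minimal sketch of the jobQueueService event pattern described above.
// Only the event names come from the document; everything else is illustrative.
class SketchJobQueue extends EventEmitter {
  add(jobId: string): void {
    this.emit('job:added', { jobId });
  }
  start(jobId: string): void {
    this.emit('job:started', { jobId });
  }
  complete(jobId: string): void {
    this.emit('job:completed', { jobId });
  }
  fail(jobId: string, error: Error): void {
    this.emit('job:failed', { jobId, error });
  }
}
```

Consumers subscribe with `queue.on('job:completed', handler)`; because services are exported as singletons, one emitter instance is shared across the process.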
Error Handling & Resilience ### Error Propagation Path ``` Service Method │ ▼ (throws error) Controller │ ▼ (catches, logs, re-throws) Express Error Handler │ ▼ (categorizes, logs, responds) Client (structured error response) ``` ### Error Categories **File**: `backend/src/middleware/errorHandler.ts` ```17:26:backend/src/middleware/errorHandler.ts export enum ErrorCategory { VALIDATION = 'validation', AUTHENTICATION = 'authentication', AUTHORIZATION = 'authorization', NOT_FOUND = 'not_found', EXTERNAL_SERVICE = 'external_service', PROCESSING = 'processing', SYSTEM = 'system', DATABASE = 'database' } ``` **Error Response Structure**: ```29:39:backend/src/middleware/errorHandler.ts export interface ErrorResponse { success: false; error: { code: string; message: string; details?: any; correlationId: string; timestamp: string; retryable: boolean; }; } ``` ### Retry Mechanisms **1. Job Retries**: - Max attempts: 3 (configurable per job) - Exponential backoff between retries - Jobs marked as `retrying` status **2. API Retries**: - LLM API calls: 3 retries with exponential backoff - Document AI: Fallback to pdf-parse - Vector search: 10-second timeout, fallback to direct query **3. 
Database Retries**: ```10:46:backend/src/models/DocumentModel.ts private static async retryOperation<T>( operation: () => Promise<T>, operationName: string, maxRetries: number = 3, baseDelay: number = 1000 ): Promise<T> { let lastError: any; for (let attempt = 1; attempt <= maxRetries; attempt++) { try { return await operation(); } catch (error: any) { lastError = error; const isNetworkError = error?.message?.includes('fetch failed') || error?.message?.includes('ENOTFOUND') || error?.message?.includes('ECONNREFUSED') || error?.message?.includes('ETIMEDOUT') || error?.name === 'TypeError'; if (!isNetworkError || attempt === maxRetries) { throw error; } const delay = baseDelay * Math.pow(2, attempt - 1); logger.warn(`${operationName} failed (attempt ${attempt}/${maxRetries}), retrying in ${delay}ms`, { error: error?.message || String(error), code: error?.code, attempt, maxRetries }); await new Promise(resolve => setTimeout(resolve, delay)); } } throw lastError; } ``` ### Timeout Handling **Vector Search Timeout**: ```109:126:backend/src/services/vectorDatabaseService.ts // Set a timeout for the RPC call (10 seconds) const searchPromise = this.supabaseClient .rpc('match_document_chunks', rpcParams); const timeoutPromise = new Promise<{ data: null; error: { message: string } }>((_, reject) => { setTimeout(() => reject(new Error('Vector search timeout after 10s')), 10000); }); let result: any; try { result = await Promise.race([searchPromise, timeoutPromise]); } catch (timeoutError: any) { if (timeoutError.message?.includes('timeout')) { logger.error('Vector search timed out', { documentId, timeout: '10s' }); throw new Error('Vector search timeout after 10s'); } throw timeoutError; } ``` **LLM API Timeout**: Handled by axios timeout configuration **Job Timeout**: 15 minutes; jobs stuck longer are reset ### Stuck Job Detection and Recovery ```34:37:backend/src/services/jobProcessorService.ts // Reset stuck jobs first const resetCount = await
ProcessingJobModel.resetStuckJobs(this.JOB_TIMEOUT_MINUTES); if (resetCount > 0) { logger.info('Reset stuck jobs', { count: resetCount }); } ``` **Scheduled Function Monitoring**: ```228:246:backend/src/index.ts // Check for jobs stuck in processing status const stuckProcessingJobs = await ProcessingJobModel.getStuckJobs(15); // Jobs stuck > 15 minutes if (stuckProcessingJobs.length > 0) { logger.warn('Found stuck processing jobs', { count: stuckProcessingJobs.length, jobIds: stuckProcessingJobs.map(j => j.id), timestamp: new Date().toISOString(), }); } // Check for jobs stuck in pending status (alert if > 2 minutes) const stuckPendingJobs = await ProcessingJobModel.getStuckPendingJobs(2); // Jobs pending > 2 minutes if (stuckPendingJobs.length > 0) { logger.warn('Found stuck pending jobs (may indicate processing issues)', { count: stuckPendingJobs.length, jobIds: stuckPendingJobs.map(j => j.id), oldestJobAge: stuckPendingJobs[0] ? Math.round((Date.now() - new Date(stuckPendingJobs[0].created_at).getTime()) / 1000 / 60) : 0, timestamp: new Date().toISOString(), }); } ``` ### Graceful Degradation **Document AI Failure**: Falls back to `pdf-parse` library **Vector Search Failure**: Falls back to direct database query without similarity calculation **LLM API Failure**: Returns error with retryable flag, job can be retried **PDF Generation Failure**: Falls back to PDFKit if Puppeteer fails --- ## 9. 
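The graceful-degradation pairs above all follow one pattern: try the primary dependency, catch the failure, run the secondary. A minimal sketch of that pattern, with hypothetical function names (the real services wire this up per dependency):

```typescript
// Hedged sketch of the primary/fallback pattern described above, e.g.
// Document AI -> pdf-parse, Puppeteer -> PDFKit. Names are illustrative.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch {
    // In the real services, a warning is logged here before the
    // secondary extractor/generator runs.
    return await fallback();
  }
}
```

Usage would look like `withFallback(() => documentAiExtract(buf), () => pdfParseExtract(buf))`; if the fallback also throws, the error propagates and the job's retryable flag takes over.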
Performance Optimization Points ### Vector Search Optimization **Critical**: Always pass `document_id` filter to prevent timeouts ```104:107:backend/src/services/vectorDatabaseService.ts // Add document_id filter if provided (critical for performance) if (documentId) { rpcParams.filter_document_id = documentId; } ``` **SQL Function Optimization**: `match_document_chunks` filters by `document_id` first before vector similarity calculation ### Chunking Strategy **Optimal Configuration**: ```32:35:backend/src/services/optimizedAgenticRAGProcessor.ts private readonly maxChunkSize = 4000; // Optimal chunk size for embeddings private readonly overlapSize = 200; // Overlap between chunks private readonly maxConcurrentEmbeddings = 5; // Limit concurrent API calls private readonly batchSize = 10; // Process chunks in batches ``` **Semantic Chunking**: Detects paragraph and section boundaries for better chunk quality ### Batch Processing **Embedding Generation**: - Processes chunks in batches of 10 - Max 5 concurrent embedding API calls - Prevents memory overflow and API rate limiting **Chunk Storage**: - Batched database inserts - Reduces database round trips ### Memory Management **Chunk Processing**: - Processes chunks in batches to limit memory usage - Cleans up processed chunks from memory after storage **PDF Generation**: - Page pooling (max 5 pages) - Page timeout (30 seconds) - Cache with 5-minute TTL ### Database Optimization **Direct PostgreSQL for Critical Operations**: - Job creation uses direct PostgreSQL to bypass PostgREST cache issues - Ensures reliable job creation even when PostgREST schema cache is stale **Connection Pooling**: - Supabase client uses connection pooling - Direct PostgreSQL uses pg pool ### API Call Optimization **LLM Token Management**: - Automatic token counting - Text truncation if exceeds limits - Model selection based on complexity (smaller models for simpler tasks) **Embedding Caching**: 
```31:32:backend/src/services/vectorDatabaseService.ts private semanticCache: Map = new Map(); private readonly CACHE_TTL = 3600000; // 1 hour cache TTL ``` --- ## 10. Background Processing Architecture ### Legacy vs Current System **Legacy: In-Memory Queue** (`jobQueueService`) - EventEmitter-based - In-memory job storage - Still initialized but being phased out - Location: `backend/src/services/jobQueueService.ts` **Current: Database-Backed Queue** (`jobProcessorService`) - Database-backed job storage - Scheduled processing via Firebase Cloud Scheduler - Location: `backend/src/services/jobProcessorService.ts` ### Job Processing Flow ``` Job Creation │ ▼ ProcessingJobModel.create() │ ▼ Status: 'pending' in database │ ▼ Scheduled Function (every 1 minute) OR Immediate processing via API │ ▼ JobProcessorService.processJobs() │ ▼ Get pending/retrying jobs (max 3 concurrent) │ ▼ Process jobs in parallel │ ▼ For each job: - Mark as 'processing' - Download file from GCS - Call unifiedDocumentProcessor - Update document status - Mark job as 'completed' or 'failed' ``` ### Scheduled Function **File**: `backend/src/index.ts` ```210:267:backend/src/index.ts export const processDocumentJobs = onSchedule({ schedule: 'every 1 minutes', // Minimum interval for Firebase Cloud Scheduler timeoutSeconds: 900, // 15 minutes (max for Gen2 scheduled functions) memory: '1GiB', retryCount: 2, // Retry up to 2 times on failure }, async (event) => { logger.info('Processing document jobs scheduled function triggered', { timestamp: new Date().toISOString(), scheduleTime: event.scheduleTime, }); try { const { jobProcessorService } = await import('./services/jobProcessorService'); // Check for stuck jobs before processing (monitoring) const { ProcessingJobModel } = await import('./models/ProcessingJobModel'); // Check for jobs stuck in processing status const stuckProcessingJobs = await ProcessingJobModel.getStuckJobs(15); // Jobs stuck > 15 minutes if (stuckProcessingJobs.length > 0) { 
logger.warn('Found stuck processing jobs', { count: stuckProcessingJobs.length, jobIds: stuckProcessingJobs.map(j => j.id), timestamp: new Date().toISOString(), }); } // Check for jobs stuck in pending status (alert if > 2 minutes) const stuckPendingJobs = await ProcessingJobModel.getStuckPendingJobs(2); // Jobs pending > 2 minutes if (stuckPendingJobs.length > 0) { logger.warn('Found stuck pending jobs (may indicate processing issues)', { count: stuckPendingJobs.length, jobIds: stuckPendingJobs.map(j => j.id), oldestJobAge: stuckPendingJobs[0] ? Math.round((Date.now() - new Date(stuckPendingJobs[0].created_at).getTime()) / 1000 / 60) : 0, timestamp: new Date().toISOString(), }); } const result = await jobProcessorService.processJobs(); logger.info('Document jobs processing completed', { ...result, timestamp: new Date().toISOString(), }); } catch (error) { const errorMessage = error instanceof Error ? error.message : String(error); const errorStack = error instanceof Error ? error.stack : undefined; logger.error('Error processing document jobs', { error: errorMessage, stack: errorStack, timestamp: new Date().toISOString(), }); // Re-throw to trigger retry mechanism (up to retryCount times) throw error; } }); ``` ### Job States ``` pending → processing → completed │ │ │ ▼ │ failed │ │ └──────────────────────┘ │ ▼ retrying │ ▼ (back to pending) ``` ### Concurrency Control **Max Concurrent Jobs**: 3 ```9:10:backend/src/services/jobProcessorService.ts private readonly MAX_CONCURRENT_JOBS = 3; private readonly JOB_TIMEOUT_MINUTES = 15; ``` **Processing Logic**: ```40:63:backend/src/services/jobProcessorService.ts // Get pending jobs const pendingJobs = await ProcessingJobModel.getPendingJobs(this.MAX_CONCURRENT_JOBS); // Get retrying jobs (enabled - schema is updated) const retryingJobs = await ProcessingJobModel.getRetryableJobs( Math.max(0, this.MAX_CONCURRENT_JOBS - pendingJobs.length) ); const allJobs = [...pendingJobs, ...retryingJobs]; if (allJobs.length === 0) { 
logger.debug('No jobs to process'); return stats; } logger.info('Processing jobs', { totalJobs: allJobs.length, pendingJobs: pendingJobs.length, retryingJobs: retryingJobs.length, }); // Process jobs in parallel (up to MAX_CONCURRENT_JOBS) const results = await Promise.allSettled( allJobs.map((job) => this.processJob(job.id)) ); ``` --- ## 11. Frontend Architecture ### Component Structure **Main Components**: - `DocumentUpload` - File upload with drag-and-drop - `DocumentList` - List of user's documents with status - `DocumentViewer` - View processed document and PDF - `Analytics` - Processing statistics dashboard - `UploadMonitoringDashboard` - Real-time upload monitoring ### State Management **AuthContext** (`frontend/src/contexts/AuthContext.tsx`): ```11:46:frontend/src/contexts/AuthContext.tsx export const AuthProvider: React.FC = ({ children }) => { const [user, setUser] = useState(null); const [token, setToken] = useState(null); const [isLoading, setIsLoading] = useState(true); const [error, setError] = useState(null); const [isInitialized, setIsInitialized] = useState(false); useEffect(() => { setIsLoading(true); // Listen for Firebase auth state changes const unsubscribe = authService.onAuthStateChanged(async (firebaseUser) => { try { if (firebaseUser) { const user = authService.getCurrentUser(); const token = await authService.getToken(); setUser(user); setToken(token); } else { setUser(null); setToken(null); } } catch (error) { console.error('Auth state change error:', error); setError('Authentication error occurred'); setUser(null); setToken(null); } finally { setIsLoading(false); setIsInitialized(true); } }); // Cleanup subscription on unmount return () => unsubscribe(); }, []); ``` ### API Communication **Document Service** (`frontend/src/services/documentService.ts`): - Axios client with auth interceptor - Automatic token refresh on 401 errors - Progress tracking for uploads - Error handling with user-friendly messages **Upload Flow**: 
```224:361:frontend/src/services/documentService.ts async uploadDocument( file: File, onProgress?: (progress: number) => void, signal?: AbortSignal ): Promise { try { // Check authentication before upload const token = await authService.getToken(); if (!token) { throw new Error('Authentication required. Please log in to upload documents.'); } // Step 1: Get signed upload URL onProgress?.(5); // 5% - Getting upload URL const uploadUrlResponse = await apiClient.post('/documents/upload-url', { fileName: file.name, fileSize: file.size, contentType: contentTypeForSigning }, { signal }); const { documentId, uploadUrl } = uploadUrlResponse.data; // Step 2: Upload directly to Firebase Storage onProgress?.(10); // 10% - Starting direct upload await this.uploadToFirebaseStorage( file, uploadUrl, contentTypeForSigning, (uploadProgress) => { // Map upload progress (10-90%) const mappedProgress = 10 + (uploadProgress * 0.8); onProgress?.(mappedProgress); }, signal ); // Step 3: Confirm upload onProgress?.(90); // 90% - Confirming upload const confirmResponse = await apiClient.post( `/documents/${documentId}/confirm-upload`, {}, { signal } ); onProgress?.(100); // 100% - Complete return confirmResponse.data.document; } catch (error) { // ... error handling ... } } ``` ### Real-Time Updates **Polling for Processing Status**: - Frontend polls `/documents/:id` endpoint - Updates UI when status changes from 'processing' to 'completed' - Shows error messages if status is 'failed' **Upload Progress**: - Real-time progress tracking via `onProgress` callback - Visual progress bar in `DocumentUpload` component --- ## 12. Configuration & Environment ### Environment Variables **File**: `backend/src/config/env.ts` **Key Configuration Categories**: 1. 
**LLM Provider**: - `LLM_PROVIDER` - 'anthropic', 'openai', or 'openrouter' - `ANTHROPIC_API_KEY` - Claude API key - `OPENAI_API_KEY` - OpenAI API key - `OPENROUTER_API_KEY` - OpenRouter API key - `LLM_MODEL` - Model name (e.g., 'claude-sonnet-4-5-20250929') - `LLM_MAX_TOKENS` - Max output tokens - `LLM_MAX_INPUT_TOKENS` - Max input tokens (default 200000) 2. **Database**: - `SUPABASE_URL` - Supabase project URL - `SUPABASE_SERVICE_KEY` - Service role key - `SUPABASE_ANON_KEY` - Anonymous key 3. **Google Cloud**: - `GCLOUD_PROJECT_ID` - GCP project ID - `GCS_BUCKET_NAME` - Storage bucket name - `DOCUMENT_AI_PROCESSOR_ID` - Document AI processor ID - `DOCUMENT_AI_LOCATION` - Processor location (default 'us') 4. **Feature Flags**: - `AGENTIC_RAG_ENABLED` - Enable/disable agentic RAG processing ### Configuration Loading **Priority Order**: 1. `process.env` (Firebase Functions v2) 2. `functions.config()` (Firebase Functions v1 fallback) 3. `.env` file (local development) **Validation**: Joi schema validates all required environment variables --- ## 13. Debugging Guide ### Key Log Points **Correlation IDs**: Every request has a correlation ID for tracing **Structured Logging**: Winston logger with structured data **Key Log Locations**: 1. **Request Entry**: `backend/src/index.ts` - All incoming requests 2. **Authentication**: `backend/src/middleware/firebaseAuth.ts` - Auth success/failure 3. **Job Processing**: `backend/src/services/jobProcessorService.ts` - Job lifecycle 4. **Document Processing**: `backend/src/services/unifiedDocumentProcessor.ts` - Processing steps 5. **LLM Calls**: `backend/src/services/llmService.ts` - API calls and responses 6. **Vector Search**: `backend/src/services/vectorDatabaseService.ts` - Search operations 7. **Error Handling**: `backend/src/middleware/errorHandler.ts` - All errors with categorization ### Common Failure Points **1. 
Vector Search Timeouts** - **Symptom**: "Vector search timeout after 10s" - **Cause**: Searching across all documents without `document_id` filter - **Fix**: Always pass `documentId` to `vectorDatabaseService.searchSimilar()` **2. LLM API Failures** - **Symptom**: "LLM API call failed" or "Invalid JSON response" - **Cause**: API rate limits, network issues, or invalid response format - **Fix**: Check API keys, retry logic, and response validation **3. GCS Upload Failures** - **Symptom**: "Failed to upload to GCS" or "Signed URL expired" - **Cause**: Credential issues, bucket permissions, or URL expiration - **Fix**: Check GCS credentials and bucket configuration **4. Job Stuck in Processing** - **Symptom**: Job status remains 'processing' for > 15 minutes - **Cause**: Process crashed, timeout, or error not caught - **Fix**: Check logs, reset stuck jobs, investigate the error **5. Document AI Failures** - **Symptom**: "Failed to extract text from document" - **Cause**: Document AI API error or invalid file format - **Fix**: Check Document AI processor configuration; fall back to pdf-parse ### Diagnostic Tools **Health Check Endpoints**: - `GET /health` - Basic health check - `GET /health/config` - Configuration health - `GET /health/agentic-rag` - Agentic RAG health status **Monitoring Endpoints**: - `GET /monitoring/upload-metrics` - Upload statistics - `GET /monitoring/upload-health` - Upload health - `GET /monitoring/real-time-stats` - Real-time statistics **Database Debugging**: ```sql -- Check pending jobs SELECT * FROM processing_jobs WHERE status = 'pending' ORDER BY created_at DESC; -- Check stuck jobs SELECT * FROM processing_jobs WHERE status = 'processing' AND started_at < NOW() - INTERVAL '15 minutes'; -- Check document status SELECT id, original_file_name, status, created_at FROM documents WHERE user_id = '<user_id>' ORDER BY created_at DESC; ``` **Job Inspection**: ```typescript // Get job details const job = await ProcessingJobModel.findById(jobId); // Check job
error console.log('Job error:', job.error); // Check job result console.log('Job result:', job.result); ``` ### Debugging Workflow 1. **Identify the Issue**: Check error logs with correlation ID 2. **Trace the Request**: Follow correlation ID through logs 3. **Check Job Status**: Query `processing_jobs` table for job state 4. **Check Document Status**: Query `documents` table for document state 5. **Review Service Logs**: Check specific service logs for detailed errors 6. **Test Components**: Test individual services in isolation 7. **Check External Services**: Verify GCS, Document AI, LLM APIs are accessible --- ## 14. Optimization Opportunities ### Identified Bottlenecks **1. Vector Search Performance** - **Current**: 10-second timeout, can be slow for large document sets - **Optimization**: Ensure `document_id` filter is always used - **Future**: Consider indexing optimizations, batch search **2. LLM API Calls** - **Current**: Sequential processing, no caching of similar requests - **Optimization**: Implement response caching for similar documents - **Future**: Batch API calls, use smaller models for simpler tasks **3. PDF Generation** - **Current**: Puppeteer can be memory-intensive - **Optimization**: Page pooling already implemented - **Future**: Consider serverless PDF generation service **4. 
Database Queries** - **Current**: Some queries don't use indexes effectively - **Optimization**: Add indexes on frequently queried columns - **Future**: Query optimization, connection pooling tuning ### Memory Usage Patterns **Chunk Processing**: - Processes chunks in batches to limit memory - Cleans up processed chunks after storage - **Optimization**: Consider streaming for very large documents **PDF Generation**: - Page pooling limits memory usage - Browser instance reuse reduces overhead - **Optimization**: Consider headless browser optimization ### API Call Optimization **Embedding Generation**: - Current: Max 5 concurrent calls - **Optimization**: Tune based on API rate limits - **Future**: Batch embedding API if available **LLM Calls**: - Current: Single call per document - **Optimization**: Use smaller models for simpler tasks - **Future**: Implement response caching ### Database Query Optimization **Frequently Queried Tables**: - `documents` - Add index on `user_id`, `status` - `processing_jobs` - Add index on `status`, `created_at` - `document_chunks` - Add index on `document_id`, `chunk_index` **Vector Search**: - Current: Uses `match_document_chunks` function - **Optimization**: Ensure `document_id` filter is always used - **Future**: Consider HNSW index for faster similarity search --- ## Appendix: Key File Locations ### Backend Services - `backend/src/services/unifiedDocumentProcessor.ts` - Main orchestrator - `backend/src/services/optimizedAgenticRAGProcessor.ts` - AI processing engine - `backend/src/services/jobProcessorService.ts` - Job processor - `backend/src/services/vectorDatabaseService.ts` - Vector operations - `backend/src/services/llmService.ts` - LLM interactions - `backend/src/services/documentAiProcessor.ts` - Document AI integration - `backend/src/services/pdfGenerationService.ts` - PDF generation - `backend/src/services/fileStorageService.ts` - GCS operations ### Backend Models - `backend/src/models/DocumentModel.ts` - Document data 
model - `backend/src/models/ProcessingJobModel.ts` - Job data model - `backend/src/models/VectorDatabaseModel.ts` - Vector data model ### Backend Routes - `backend/src/routes/documents.ts` - Document endpoints - `backend/src/routes/vector.ts` - Vector endpoints - `backend/src/routes/monitoring.ts` - Monitoring endpoints ### Backend Controllers - `backend/src/controllers/documentController.ts` - Document controller ### Frontend Services - `frontend/src/services/documentService.ts` - Document API client - `frontend/src/services/authService.ts` - Authentication service ### Frontend Components - `frontend/src/components/DocumentUpload.tsx` - Upload component - `frontend/src/components/DocumentList.tsx` - Document list - `frontend/src/components/DocumentViewer.tsx` - Document viewer ### Configuration - `backend/src/config/env.ts` - Environment configuration - `backend/src/config/supabase.ts` - Supabase configuration - `backend/src/config/firebase.ts` - Firebase configuration ### SQL - `backend/sql/fix_vector_search_timeout.sql` - Vector search optimization --- **End of Architecture Summary**