cim_summary/CODEBASE_ARCHITECTURE_SUMMARY.md
Commit 9c916d12f4 (admin): feat: Production release v2.0.0 - Simple Document Processor

Major release with significant performance improvements and a new processing strategy.

## Core Changes
- Implemented simple_full_document processing strategy (default)
- Full document → LLM approach: 1-2 passes, ~5-6 minutes processing time
- Achieved 100% completeness with 2 API calls (down from 5+)
- Removed redundant Document AI passes for faster processing

## Financial Data Extraction
- Enhanced deterministic financial table parser
- Improved FY3/FY2/FY1/LTM identification from varying CIM formats
- Automatic merging of parser results with LLM extraction

## Code Quality & Infrastructure
- Cleaned up debug logging (removed emoji markers from production code)
- Fixed Firebase Secrets configuration (using modern defineSecret approach)
- Updated OpenAI API key
- Resolved deployment conflicts (secrets vs environment variables)
- Added .env files to Firebase ignore list

## Deployment
- Firebase Functions v2 deployment successful
- All 7 required secrets verified and configured
- Function URL: https://api-y56ccs6wva-uc.a.run.app

## Performance Improvements
- Processing time: ~5-6 minutes (down from 23+ minutes)
- API calls: 1-2 (down from 5+)
- Completeness: 100% achievable
- LLM Model: claude-3-7-sonnet-latest

## Breaking Changes
- Default processing strategy changed to 'simple_full_document'
- RAG processor available as alternative strategy 'document_ai_agentic_rag'

## Files Changed
- 36 files changed, 5642 insertions(+), 4451 deletions(-)
- Removed deprecated documentation files
- Cleaned up unused services and models

This release represents a major refactoring focused on speed, accuracy, and maintainability.
Committed: 2025-11-09 21:07:22 -05:00

CIM Summary Codebase Architecture Summary

Last Updated: December 2024
Purpose: Comprehensive technical reference for senior developers optimizing and debugging the codebase


Table of Contents

  1. System Overview
  2. Application Entry Points
  3. Request Flow & API Architecture
  4. Document Processing Pipeline (Critical Path)
  5. Core Services Deep Dive
  6. Data Models & Database Schema
  7. Component Handoffs & Integration Points
  8. Error Handling & Resilience
  9. Performance Optimization Points
  10. Background Processing Architecture
  11. Frontend Architecture
  12. Configuration & Environment
  13. Debugging Guide
  14. Optimization Opportunities

1. System Overview

High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         Frontend (React)                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │
│  │ DocumentUpload│  │ DocumentList │  │  Analytics   │         │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘         │
│         │                  │                  │                  │
│         └──────────────────┴──────────────────┘                  │
│                            │                                     │
│                    ┌────────▼────────┐                            │
│                    │  documentService │                            │
│                    │  (Axios Client)  │                            │
│                    └────────┬────────┘                            │
└────────────────────────────┼────────────────────────────────────┘
                             │ HTTPS + JWT
┌────────────────────────────▼────────────────────────────────────┐
│                    Backend (Express + Node.js)                  │
│  ┌──────────────────────────────────────────────────────────┐ │
│  │ Middleware Chain: CORS → Auth → Validation → Error Handler │ │
│  └──────────────────────────────────────────────────────────┘ │
│                            │                                     │
│         ┌──────────────────┼──────────────────┐                │
│         │                  │                  │                  │
│  ┌──────▼──────┐  ┌────────▼────────┐  ┌─────▼──────┐         │
│  │  Routes     │  │   Controllers    │  │  Services   │         │
│  └──────┬──────┘  └────────┬────────┘  └─────┬──────┘         │
│         │                  │                  │                  │
│         └──────────────────┴──────────────────┘                │
└────────────────────────────┬────────────────────────────────────┘
                              │
         ┌────────────────────┼────────────────────┐
         │                    │                      │
    ┌────▼────┐        ┌──────▼──────┐      ┌───────▼───────┐
    │Supabase │        │Google Cloud│      │  LLM APIs     │
    │(Postgres)│        │  Storage   │      │(Claude/OpenAI)│
    └─────────┘        └────────────┘      └───────────────┘

Technology Stack

Frontend:

  • React 18 + TypeScript
  • Vite (build tool)
  • Axios (HTTP client)
  • Firebase Auth (authentication)
  • React Router (routing)

Backend:

  • Node.js + Express + TypeScript
  • Firebase Functions v2 (deployment)
  • Supabase (PostgreSQL + Vector DB)
  • Google Cloud Storage (file storage)
  • Google Document AI (PDF text extraction)
  • Puppeteer (PDF generation)

AI/ML Services:

  • Anthropic Claude (primary LLM)
  • OpenAI (fallback LLM)
  • OpenRouter (LLM routing)
  • OpenAI Embeddings (vector embeddings)

Core Purpose

Automated processing and analysis of Confidential Information Memorandums (CIMs) using:

  1. Text Extraction: Google Document AI extracts text from PDFs
  2. Semantic Chunking: Split text into 4000-char chunks with overlap
  3. Vector Embeddings: Generate embeddings for semantic search
  4. LLM Analysis: Claude AI analyzes chunks and generates structured CIMReview data
  5. PDF Generation: Create summary PDF with analysis results

2. Application Entry Points

Backend Entry Point

File: backend/src/index.ts

// Initialize Firebase Admin SDK first
import './config/firebase';

import express from 'express';
import cors from 'cors';
import helmet from 'helmet';
import morgan from 'morgan';
import rateLimit from 'express-rate-limit';
import { config } from './config/env';
import { logger } from './utils/logger';
import documentRoutes from './routes/documents';
import vectorRoutes from './routes/vector';
import monitoringRoutes from './routes/monitoring';
import auditRoutes from './routes/documentAudit';
import { jobQueueService } from './services/jobQueueService';

import { errorHandler, correlationIdMiddleware } from './middleware/errorHandler';
import { notFoundHandler } from './middleware/notFoundHandler';

// Start the job queue service for background processing
jobQueueService.start();

Key Initialization Steps:

  1. Firebase Admin SDK initialization (./config/firebase)
  2. Express app setup with middleware chain
  3. Route registration (/documents, /vector, /monitoring, /api/audit)
  4. Job queue service startup (legacy in-memory queue)
  5. Firebase Functions export for Cloud deployment

Scheduled Function: processDocumentJobs (backend/src/index.ts, lines 210-267)

  • Runs every minute via Firebase Cloud Scheduler
  • Processes pending/retrying jobs from database
  • Detects and resets stuck jobs
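The stuck-job check reduces to a time-based predicate. The sketch below is illustrative only: `isJobStuck` and the `JobLike` shape are hypothetical, standing in for logic that actually lives in `ProcessingJobModel.resetStuckJobs` (not reproduced in this document); the 15-minute timeout matches the JOB_TIMEOUT_MINUTES setting described in section 5.3.

```typescript
// Sketch: decide whether a 'processing' job has exceeded its timeout
// and should be reset for retry. Hypothetical helper; the real logic
// lives in ProcessingJobModel.resetStuckJobs.
interface JobLike {
  status: 'pending' | 'processing' | 'retrying' | 'completed' | 'failed';
  started_at: Date | null;
}

function isJobStuck(job: JobLike, timeoutMinutes: number, now: Date = new Date()): boolean {
  if (job.status !== 'processing' || !job.started_at) return false;
  const elapsedMs = now.getTime() - job.started_at.getTime();
  return elapsedMs > timeoutMinutes * 60_000;
}
```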

Frontend Entry Point

File: frontend/src/main.tsx

import React from 'react';
import ReactDOM from 'react-dom/client';
import App from './App';
import './index.css';

ReactDOM.createRoot(document.getElementById('root')!).render(
  <React.StrictMode>
    <App />
  </React.StrictMode>
);

Main App Component: frontend/src/App.tsx

  • Sets up React Router
  • Provides AuthContext
  • Renders protected routes and dashboard

3. Request Flow & API Architecture

Request Lifecycle

Client Request
    │
    ▼
┌─────────────────────────────────────┐
│ 1. CORS Middleware                   │
│    - Validates origin                │
│    - Sets CORS headers               │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 2. Correlation ID Middleware         │
│    - Generates/reads X-Correlation-ID│
│    - Adds to request object          │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 3. Firebase Auth Middleware          │
│    - Verifies JWT token              │
│    - Attaches user to req.user       │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 4. Rate Limiting                    │
│    - 1000 requests per 15 minutes   │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 5. Body Parsing                     │
│    - JSON (10MB limit)               │
│    - URL-encoded (10MB limit)        │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 6. Route Handler                    │
│    - Matches route pattern           │
│    - Calls controller method         │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 7. Controller                       │
│    - Validates input                 │
│    - Calls service methods           │
│    - Returns response                │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 8. Service Layer                    │
│    - Business logic                  │
│    - Database operations             │
│    - External API calls              │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 9. Error Handler (if error)         │
│    - Categorizes error               │
│    - Logs with correlation ID        │
│    - Returns structured response     │
└─────────────────────────────────────┘

Authentication Flow

Middleware: backend/src/middleware/firebaseAuth.ts

export const verifyFirebaseToken = async (
  req: FirebaseAuthenticatedRequest,
  res: Response,
  next: NextFunction
): Promise<void> => {
  try {
    console.log('🔐 Authentication middleware called for:', req.method, req.url);
    console.log('🔐 Request headers:', Object.keys(req.headers));
    
    // Debug Firebase Admin initialization
    console.log('🔐 Firebase apps available:', admin.apps.length);
    console.log('🔐 Firebase app names:', admin.apps.filter(app => app !== null).map(app => app!.name));
    
    const authHeader = req.headers.authorization;
    console.log('🔐 Auth header present:', !!authHeader);
    console.log('🔐 Auth header starts with Bearer:', authHeader?.startsWith('Bearer '));
    
    if (!authHeader || !authHeader.startsWith('Bearer ')) {
      console.log('❌ No valid authorization header');
      res.status(401).json({ error: 'No valid authorization header' });
      return;
    }

    const idToken = authHeader.split('Bearer ')[1];
    console.log('🔐 Token extracted, length:', idToken?.length);
    
    if (!idToken) {
      console.log('❌ No token provided');
      res.status(401).json({ error: 'No token provided' });
      return;
    }

    console.log('🔐 Attempting to verify Firebase ID token...');
    console.log('🔐 Token preview:', idToken.substring(0, 20) + '...');
    
    // Verify the Firebase ID token
    const decodedToken = await admin.auth().verifyIdToken(idToken, true);
    console.log('✅ Token verified successfully for user:', decodedToken.email);
    console.log('✅ Token UID:', decodedToken.uid);
    console.log('✅ Token issuer:', decodedToken.iss);
    
    // Check if token is expired
    const now = Math.floor(Date.now() / 1000);
    if (decodedToken.exp && decodedToken.exp < now) {
      logger.warn('Token expired for user:', decodedToken.uid);
      res.status(401).json({ error: 'Token expired' });
      return;
    }
    
    req.user = decodedToken;
    
    // Log successful authentication
    logger.info('Authenticated request for user:', decodedToken.email);
    
    next();

Frontend Auth: frontend/src/services/authService.ts

  • Manages Firebase Auth state
  • Provides token via getToken()
  • Axios interceptor adds token to requests
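The interceptor behaviour reduces to attaching a Bearer header before each request. This is a hedged sketch: the real interceptor in authService.ts calls Firebase's getToken(), which is not shown here, so `getToken` below is a stand-in parameter.

```typescript
// Sketch: how an Axios request interceptor might attach the Firebase
// ID token. `getToken` stands in for authService.getToken().
interface RequestConfigLike { headers: Record<string, string> }

async function attachAuthHeader(
  config: RequestConfigLike,
  getToken: () => Promise<string | null>
): Promise<RequestConfigLike> {
  const token = await getToken();
  if (token) {
    config.headers['Authorization'] = `Bearer ${token}`;
  }
  return config;
}
```

With a real Axios instance, a function like this would be registered via `axios.interceptors.request.use(...)`.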

Route Structure

Main Routes (backend/src/routes/documents.ts):

  • POST /documents/upload-url - Get signed upload URL
  • POST /documents/:id/confirm-upload - Confirm upload and start processing
  • GET /documents - List user's documents
  • GET /documents/:id - Get document details
  • GET /documents/:id/download - Download processed PDF
  • GET /documents/analytics - Get processing analytics
  • POST /documents/:id/process-optimized-agentic-rag - Trigger AI processing

Middleware Applied:

// Apply authentication and correlation ID to all routes
router.use(verifyFirebaseToken);
router.use(addCorrelationId);

// Add logging middleware for document routes
router.use((req, res, next) => {
  console.log(`📄 Document route accessed: ${req.method} ${req.path}`);
  next();
});

4. Document Processing Pipeline (Critical Path)

Complete Flow Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                    DOCUMENT PROCESSING PIPELINE                    │
└─────────────────────────────────────────────────────────────────────┘

1. UPLOAD PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ User selects PDF                                             │
   │   ↓                                                          │
   │ DocumentUpload component                                     │
   │   ↓                                                          │
   │ documentService.uploadDocument()                            │
   │   ↓                                                          │
   │ POST /documents/upload-url                                  │
   │   ↓                                                          │
   │ documentController.getUploadUrl()                           │
   │   ↓                                                          │
   │ DocumentModel.create() → documents table                     │
   │   ↓                                                          │
   │ fileStorageService.generateSignedUploadUrl()                 │
   │   ↓                                                          │
   │ Direct upload to GCS via signed URL                         │
   │   ↓                                                          │
   │ POST /documents/:id/confirm-upload                          │
   └─────────────────────────────────────────────────────────────┘

2. JOB CREATION PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ documentController.confirmUpload()                           │
   │   ↓                                                          │
   │ ProcessingJobModel.create() → processing_jobs table          │
   │   ↓                                                          │
   │ Status: 'pending'                                            │
   │   ↓                                                          │
   │ Returns 202 Accepted (async processing)                      │
   └─────────────────────────────────────────────────────────────┘

3. JOB PROCESSING PHASE (Background)
   ┌─────────────────────────────────────────────────────────────┐
   │ Scheduled Function: processDocumentJobs (every 1 minute)     │
   │   OR                                                          │
   │ Immediate processing via jobProcessorService.processJob()    │
   │   ↓                                                          │
   │ JobProcessorService.processJob()                            │
   │   ↓                                                          │
   │ Download file from GCS                                       │
   │   ↓                                                          │
   │ unifiedDocumentProcessor.processDocument()                   │
   └─────────────────────────────────────────────────────────────┘

4. TEXT EXTRACTION PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ documentAiProcessor.processDocument()                        │
   │   ↓                                                          │
   │ Google Document AI API                                      │
   │   ↓                                                          │
   │ Extracted text returned                                      │
   └─────────────────────────────────────────────────────────────┘

5. CHUNKING & EMBEDDING PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ optimizedAgenticRAGProcessor.processLargeDocument()          │
   │   ↓                                                          │
   │ createIntelligentChunks()                                    │
   │   - Semantic boundary detection                               │
   │   - 4000-char chunks with 200-char overlap                  │
   │   ↓                                                          │
   │ processChunksInBatches()                                     │
   │   - Batch size: 10                                           │
   │   - Max concurrent: 5                                        │
   │   ↓                                                          │
   │ storeChunksOptimized()                                       │
   │   ↓                                                          │
   │ vectorDatabaseService.storeEmbedding()                       │
   │   - OpenAI embeddings API                                    │
   │   - Store in document_chunks table                           │
   └─────────────────────────────────────────────────────────────┘

6. LLM ANALYSIS PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ generateLLMAnalysisHybrid()                                  │
   │   ↓                                                          │
   │ llmService.processCIMDocument()                             │
   │   ↓                                                          │
   │ Vector search for relevant chunks                            │
   │   ↓                                                          │
   │ Claude/OpenAI API call with structured prompt                │
   │   ↓                                                          │
   │ Parse and validate CIMReview JSON                             │
   │   ↓                                                          │
   │ Return structured analysisData                                │
   └─────────────────────────────────────────────────────────────┘

7. PDF GENERATION PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ pdfGenerationService.generatePDF()                           │
   │   ↓                                                          │
   │ Puppeteer browser instance                                   │
   │   ↓                                                          │
   │ Render HTML template with analysisData                       │
   │   ↓                                                          │
   │ Generate PDF buffer                                          │
   │   ↓                                                          │
   │ Upload PDF to GCS                                            │
   │   ↓                                                          │
   │ Update document record with PDF path                         │
   └─────────────────────────────────────────────────────────────┘

8. STATUS UPDATE PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ DocumentModel.updateById()                                    │
   │   - status: 'completed'                                       │
   │   - pdf_path: GCS path                                       │
   │   ↓                                                          │
   │ ProcessingJobModel.markAsCompleted()                              │
   │   ↓                                                          │
   │ Frontend polls /documents/:id for status updates            │
   └─────────────────────────────────────────────────────────────┘

Key Handoff Points

1. Upload to Job Creation

async confirmUpload(req: Request, res: Response): Promise<void> {
  // ... validation ...
  
  // Update status to processing
  await DocumentModel.updateById(documentId, { 
    status: 'processing_llm'
  });

  // Acknowledge the request immediately
  res.status(202).json({
    message: 'Upload confirmed, processing has started.',
    document: document,
    status: 'processing'
  });

  // CRITICAL FIX: Use database-backed job queue
  const { ProcessingJobModel } = await import('../models/ProcessingJobModel');
  await ProcessingJobModel.create({
    document_id: documentId,
    user_id: userId,
    options: {
      fileName: document.original_file_name,
      mimeType: 'application/pdf'
    }
  });
}

2. Job Processing to Document Processing

private async processJob(jobId: string): Promise<{ success: boolean; error?: string }> {
  // Get job details
  job = await ProcessingJobModel.findById(jobId);
  
  // Mark job as processing
  await ProcessingJobModel.markAsProcessing(jobId);

  // Download file from GCS
  const fileBuffer = await fileStorageService.downloadFile(document.file_path);

  // Process document
  const result = await unifiedDocumentProcessor.processDocument(
    job.document_id,
    job.user_id,
    fileBuffer.toString('utf-8'), // This will be re-read as buffer
    {
      fileBuffer,
      fileName: job.options?.fileName || 'document.pdf',
      mimeType: job.options?.mimeType || 'application/pdf'
    }
  );
}

3. Document Processing to Text Extraction

async processDocument(
  documentId: string, 
  userId: string, 
  fileBuffer: Buffer, 
  fileName: string, 
  mimeType: string
): Promise<ProcessingResult> {
  // Step 1: Extract text using Document AI or fallback
  const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);
  
  // Step 2: Process extracted text through Agentic RAG
  const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);
}

4. Text to Chunking

async processLargeDocument(
  documentId: string,
  text: string,
  options: {
    enableSemanticChunking?: boolean;
    enableMetadataEnrichment?: boolean;
    similarityThreshold?: number;
  } = {}
): Promise<ProcessingResult> {
  // Step 1: Create intelligent chunks with semantic boundaries
  const chunks = await this.createIntelligentChunks(text, documentId, options.enableSemanticChunking);

  // Step 2: Process chunks in batches to manage memory
  const processedChunks = await this.processChunksInBatches(chunks, documentId, options);

  // Step 3: Store chunks with optimized batching
  const embeddingApiCalls = await this.storeChunksOptimized(processedChunks, documentId);

  // Step 4: Generate LLM analysis using HYBRID approach
  const llmResult = await this.generateLLMAnalysisHybrid(documentId, text, processedChunks);
}

5. Core Services Deep Dive

5.1 UnifiedDocumentProcessor

File: backend/src/services/unifiedDocumentProcessor.ts
Purpose: Main orchestrator for document processing strategies

Key Method:

async processDocument(
  documentId: string, 
  userId: string, 
  text: string,
  options: any = {}
): Promise<ProcessingResult> {
  const strategy = options.strategy || 'document_ai_agentic_rag';
  
  logger.info('Processing document with unified processor', {
    documentId,
    strategy,
    textLength: text.length
  });

  // Only support document_ai_agentic_rag strategy
  if (strategy === 'document_ai_agentic_rag') {
    return await this.processWithDocumentAiAgenticRag(documentId, userId, text, options);
  } else {
    throw new Error(`Unsupported processing strategy: ${strategy}. Only 'document_ai_agentic_rag' is supported.`);
  }
}

Dependencies:

  • documentAiProcessor - Text extraction
  • optimizedAgenticRAGProcessor - AI processing
  • llmService - LLM interactions
  • pdfGenerationService - PDF generation

Error Handling: Wraps errors with detailed context, validates analysisData presence

5.2 OptimizedAgenticRAGProcessor

File: backend/src/services/optimizedAgenticRAGProcessor.ts (1885 lines)
Purpose: Core AI processing engine for chunking, embeddings, and LLM analysis

Key Configuration:

private readonly maxChunkSize = 4000; // Optimal chunk size for embeddings
private readonly overlapSize = 200; // Overlap between chunks
private readonly maxConcurrentEmbeddings = 5; // Limit concurrent API calls
private readonly batchSize = 10; // Process chunks in batches
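Given those constants, the basic fixed-size chunking arithmetic can be sketched as follows. Note this shows only the size/overlap mechanics; the real createIntelligentChunks() additionally snaps boundaries to paragraphs and sections, which is not reproduced here.

```typescript
// Sketch of fixed-size chunking with overlap: each chunk is at most
// maxChunkSize chars and repeats the last overlapSize chars of the
// previous chunk, so the window advances by (maxChunkSize - overlapSize).
function chunkText(text: string, maxChunkSize = 4000, overlapSize = 200): string[] {
  const step = maxChunkSize - overlapSize; // 3800 chars of new text per chunk
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxChunkSize));
    if (start + maxChunkSize >= text.length) break; // final chunk reached
  }
  return chunks;
}
```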

Key Methods:

  • processLargeDocument() - Main entry point
  • createIntelligentChunks() - Semantic chunking with boundary detection
  • processChunksInBatches() - Batch processing for memory efficiency
  • storeChunksOptimized() - Embedding generation and storage
  • generateLLMAnalysisHybrid() - LLM analysis with vector search

Performance Optimizations:

  • Semantic boundary detection (paragraphs, sections)
  • Batch processing to limit memory usage
  • Concurrent embedding generation (max 5)
  • Vector search with document_id filtering
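The concurrency cap (max 5 embedding calls in flight) can be expressed as a generic worker-pool helper; the real processChunksInBatches() is not reproduced in this document, so treat this as an illustrative sketch of the pattern.

```typescript
// Sketch: map over items with a fixed concurrency cap, preserving
// input order in the result array. A small pool of workers pulls the
// next index until all items are consumed.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // safe: the read-increment is synchronous on the event loop
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```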

5.3 JobProcessorService

File: backend/src/services/jobProcessorService.ts
Purpose: Database-backed job processor (replaces legacy in-memory queue)

Key Method:

async processJobs(): Promise<{
  processed: number;
  succeeded: number;
  failed: number;
  skipped: number;
}> {
  // Prevent concurrent processing runs
  if (this.isProcessing) {
    logger.info('Job processor already running, skipping this run');
    return { processed: 0, succeeded: 0, failed: 0, skipped: 0 };
  }

  this.isProcessing = true;
  const stats = { processed: 0, succeeded: 0, failed: 0, skipped: 0 };

  try {
    // Reset stuck jobs first
    const resetCount = await ProcessingJobModel.resetStuckJobs(this.JOB_TIMEOUT_MINUTES);
    
    // Get pending jobs
    const pendingJobs = await ProcessingJobModel.getPendingJobs(this.MAX_CONCURRENT_JOBS);
    
    // Get retrying jobs
    const retryingJobs = await ProcessingJobModel.getRetryableJobs(
      Math.max(0, this.MAX_CONCURRENT_JOBS - pendingJobs.length)
    );

    const allJobs = [...pendingJobs, ...retryingJobs];

    // Process jobs in parallel (up to MAX_CONCURRENT_JOBS)
    const results = await Promise.allSettled(
      allJobs.map((job) => this.processJob(job.id))
    );

Configuration:

  • MAX_CONCURRENT_JOBS = 3
  • JOB_TIMEOUT_MINUTES = 15

Features:

  • Stuck job detection and recovery
  • Retry logic with exponential backoff
  • Parallel processing with concurrency limit
  • Database-backed state management
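The exact retry schedule is not shown in the source; a typical exponential-backoff delay calculation looks like the sketch below, where the base delay and cap are illustrative assumptions rather than the service's actual values.

```typescript
// Sketch: exponential backoff with a cap. attempt 0 -> 1s, 1 -> 2s,
// 2 -> 4s, ... capped at maxDelayMs. Base and cap are assumptions.
function backoffDelayMs(attempt: number, baseMs = 1000, maxDelayMs = 60_000): number {
  return Math.min(baseMs * 2 ** attempt, maxDelayMs);
}
```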

5.4 VectorDatabaseService

File: backend/src/services/vectorDatabaseService.ts
Purpose: Vector embeddings and similarity search

Key Method - Vector Search:

async searchSimilar(
  embedding: number[], 
  limit: number = 10, 
  threshold: number = 0.7,
  documentId?: string
): Promise<VectorSearchResult[]> {
  try {
    if (this.provider === 'supabase') {
      // Use optimized Supabase vector search function with document_id filtering
      // This prevents timeouts by only searching within a specific document
      const rpcParams: any = {
        query_embedding: embedding,
        match_threshold: threshold,
        match_count: limit
      };
      
      // Add document_id filter if provided (critical for performance)
      if (documentId) {
        rpcParams.filter_document_id = documentId;
      }

      // Set a timeout for the RPC call (10 seconds)
      const searchPromise = this.supabaseClient
        .rpc('match_document_chunks', rpcParams);

      const timeoutPromise = new Promise<{ data: null; error: { message: string } }>((_, reject) => {
        setTimeout(() => reject(new Error('Vector search timeout after 10s')), 10000);
      });

      let result: any;
      try {
        result = await Promise.race([searchPromise, timeoutPromise]);
      } catch (timeoutError: any) {
        if (timeoutError.message?.includes('timeout')) {
          logger.error('Vector search timed out', { documentId, timeout: '10s' });
          throw new Error('Vector search timeout after 10s');
        }
        throw timeoutError;
      }

Critical Optimization: Always pass documentId to filter search scope and prevent timeouts

SQL Function: backend/sql/fix_vector_search_timeout.sql

CREATE OR REPLACE FUNCTION match_document_chunks (
  query_embedding vector(1536),
  match_threshold float,
  match_count int,
  filter_document_id text DEFAULT NULL
)
RETURNS TABLE (
  id UUID,
  document_id TEXT,
  content text,
  metadata JSONB,
  chunk_index INT,
  similarity float
)
LANGUAGE sql STABLE
AS $$
  SELECT
    document_chunks.id,
    document_chunks.document_id,
    document_chunks.content,
    document_chunks.metadata,
    document_chunks.chunk_index,
    1 - (document_chunks.embedding <=> query_embedding) AS similarity
  FROM document_chunks
  WHERE document_chunks.embedding IS NOT NULL
    AND (filter_document_id IS NULL OR document_chunks.document_id = filter_document_id)
    AND 1 - (document_chunks.embedding <=> query_embedding) > match_threshold
  ORDER BY document_chunks.embedding <=> query_embedding
  LIMIT match_count;
$$;
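The `<=>` operator above is pgvector's cosine-distance operator, so `1 - (embedding <=> query_embedding)` is cosine similarity. For reference, the same computation in TypeScript:

```typescript
// Cosine similarity, matching `1 - (a <=> b)` for pgvector's
// cosine-distance operator. Returns a value in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```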

5.5 LLMService

File: backend/src/services/llmService.ts
Purpose: LLM interactions (Claude/OpenAI/OpenRouter)

Provider Selection:

constructor() {
  // Read provider from config (supports openrouter, anthropic, openai)
  this.provider = config.llm.provider;
  
  // CRITICAL: If provider is not set correctly, log and use fallback
  if (!this.provider || (this.provider !== 'openrouter' && this.provider !== 'anthropic' && this.provider !== 'openai')) {
    logger.error('LLM provider is invalid or not set', {
      provider: this.provider,
      configProvider: config.llm.provider,
      processEnvProvider: process.env['LLM_PROVIDER'],
      defaultingTo: 'anthropic'
    });
    this.provider = 'anthropic'; // Fallback
  }
  
  // Set API key based on provider
  if (this.provider === 'openai') {
    this.apiKey = config.llm.openaiApiKey!;
  } else if (this.provider === 'openrouter') {
    // OpenRouter: Use OpenRouter key if provided, otherwise use Anthropic key for BYOK
    this.apiKey = config.llm.openrouterApiKey || config.llm.anthropicApiKey!;
  } else {
    this.apiKey = config.llm.anthropicApiKey!;
  }
  
  // Use configured model instead of hardcoded value
  this.defaultModel = config.llm.model;
  this.maxTokens = config.llm.maxTokens;
  this.temperature = config.llm.temperature;
}
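The provider-selection branch of the constructor reduces to a pure resolution step that can be isolated and tested. This sketch mirrors the fallback behaviour shown above (unrecognised or missing providers default to 'anthropic').

```typescript
type LLMProvider = 'openrouter' | 'anthropic' | 'openai';

// Sketch: validate the configured provider, falling back to
// 'anthropic' when it is missing or unrecognised, mirroring the
// constructor logic in llmService.ts.
function resolveProvider(configured: string | undefined): LLMProvider {
  if (configured === 'openrouter' || configured === 'anthropic' || configured === 'openai') {
    return configured;
  }
  return 'anthropic';
}
```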

Key Method:

async processCIMDocument(text: string, template: string, analysis?: Record<string, any>): Promise<CIMAnalysisResult> {
  // Check and truncate text if it exceeds maxInputTokens
  const maxInputTokens = config.llm.maxInputTokens || 200000;
  const systemPromptTokens = this.estimateTokenCount(this.getCIMSystemPrompt());
  const templateTokens = this.estimateTokenCount(template);
  const promptBuffer = config.llm.promptBuffer || 1000;
  
  // Calculate available tokens for document text
  const reservedTokens = systemPromptTokens + templateTokens + promptBuffer + (config.llm.maxTokens || 16000);
  const availableTokens = maxInputTokens - reservedTokens;
  
  const textTokens = this.estimateTokenCount(text);
  let processedText = text;
  let wasTruncated = false;
  
  if (textTokens > availableTokens) {
    logger.warn('Document text exceeds token limit, truncating', {
      textTokens,
      availableTokens,
      maxInputTokens,
      reservedTokens,
      truncationRatio: (availableTokens / textTokens * 100).toFixed(1) + '%'
    });
    
    processedText = this.truncateText(text, availableTokens);
    wasTruncated = true;
  }
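The token estimator itself is not shown in the source. A common heuristic is roughly 4 characters per token, which makes the budget arithmetic above easy to sketch; the real estimateTokenCount() may use a different method.

```typescript
// Sketch of the token-budget arithmetic from processCIMDocument,
// using the common ~4 chars/token heuristic (an assumption; the
// real estimateTokenCount may differ).
const CHARS_PER_TOKEN = 4;

function estimateTokenCount(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// availableTokens = maxInputTokens - (system prompt + template
// + prompt buffer + reserved output tokens)
function availableDocumentTokens(
  maxInputTokens: number,
  systemPromptTokens: number,
  templateTokens: number,
  promptBuffer: number,
  maxOutputTokens: number
): number {
  const reserved = systemPromptTokens + templateTokens + promptBuffer + maxOutputTokens;
  return maxInputTokens - reserved;
}

function truncateToTokens(text: string, tokenBudget: number): string {
  return text.slice(0, tokenBudget * CHARS_PER_TOKEN);
}
```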

Features:

  • Automatic token counting and truncation
  • Model selection based on task complexity
  • JSON schema validation with Zod
  • Retry logic with exponential backoff
  • Cost tracking

5.6 DocumentAiProcessor

File: backend/src/services/documentAiProcessor.ts
Purpose: Google Document AI integration for text extraction

Key Method:

async processDocument(
  documentId: string, 
  userId: string, 
  fileBuffer: Buffer, 
  fileName: string, 
  mimeType: string
): Promise<ProcessingResult> {
  const startTime = Date.now();
  
  try {
    logger.info('Starting Document AI + Agentic RAG processing', { 
      documentId, 
      userId, 
      fileName,
      fileSize: fileBuffer.length,
      mimeType
    });

    // Step 1: Extract text using Document AI or fallback
    const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);
    
    if (!extractedText) {
      throw new Error('Failed to extract text from document');
    }

    logger.info('Text extraction completed', {
      textLength: extractedText.length
    });

    // Step 2: Process extracted text through Agentic RAG
    const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);
    
    const processingTime = Date.now() - startTime;
    
    return {
      success: true,
      content: agenticRagResult.summary || extractedText,
      metadata: {
        processingStrategy: 'document_ai_agentic_rag',
        processingTime,
        extractedTextLength: extractedText.length,
        agenticRagResult,
        fileSize: fileBuffer.length,
        fileName,
        mimeType
      }
    };

Fallback Strategy: Uses pdf-parse if Document AI fails

5.7 PDFGenerationService

File: backend/src/services/pdfGenerationService.ts
Purpose: PDF generation using Puppeteer

Key Features:

  • Page pooling for performance
  • Caching for repeated requests
  • Browser instance reuse
  • Fallback to PDFKit if Puppeteer fails

Configuration:

class PDFGenerationService {
  private browser: any = null;
  private pagePool: PagePool[] = [];
  private readonly maxPoolSize = 5;
  private readonly pageTimeout = 30000; // 30 seconds
  private readonly cache = new Map<string, { buffer: Buffer; timestamp: number }>();
  private readonly cacheTimeout = 300000; // 5 minutes

  private readonly defaultOptions: PDFGenerationOptions = {
    format: 'A4',
    margin: {
      top: '1in',
      right: '1in',
      bottom: '1in',
      left: '1in',
    },
    displayHeaderFooter: true,
    printBackground: true,
    quality: 'high',
    timeout: 30000,
  };
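The cache map above pairs each buffer with a timestamp and evicts on read. A sketch of that lookup/eviction logic (class name hypothetical; the cache-key derivation is omitted, and `now` is injected only to make the sketch testable):

```typescript
// TTL cache mirroring the 5-minute cacheTimeout above
class PdfCache {
  private readonly cache = new Map<string, { buffer: Buffer; timestamp: number }>();

  constructor(private readonly ttlMs = 300000) {}

  get(key: string, now = Date.now()): Buffer | null {
    const entry = this.cache.get(key);
    if (!entry) return null;
    if (now - entry.timestamp > this.ttlMs) {
      this.cache.delete(key); // expired: evict so the map does not grow unbounded
      return null;
    }
    return entry.buffer;
  }

  set(key: string, buffer: Buffer, now = Date.now()): void {
    this.cache.set(key, { buffer, timestamp: now });
  }
}
```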

5.8 FileStorageService

File: backend/src/services/fileStorageService.ts
Purpose: Google Cloud Storage operations

Key Methods:

  • generateSignedUploadUrl() - Generate signed URL for direct upload
  • downloadFile() - Download file from GCS
  • saveBuffer() - Save buffer to GCS
  • deleteFile() - Delete file from GCS

Credential Handling:

constructor() {
  this.bucketName = config.googleCloud.gcsBucketName;
  
  // Check if we're in Firebase Functions/Cloud Run environment
  const isCloudEnvironment = process.env.FUNCTION_TARGET || 
                             process.env.FUNCTION_NAME || 
                             process.env.K_SERVICE ||
                             process.env.GOOGLE_CLOUD_PROJECT ||
                             !!process.env.GCLOUD_PROJECT ||
                             process.env.X_GOOGLE_GCLOUD_PROJECT;
  
  // Initialize Google Cloud Storage
  const storageConfig: any = {
    projectId: config.googleCloud.projectId,
  };
  
  // Only use keyFilename in local development
  // In Firebase Functions/Cloud Run, use Application Default Credentials
  if (isCloudEnvironment) {
    // In cloud, ALWAYS clear GOOGLE_APPLICATION_CREDENTIALS to force use of ADC
    // Firebase Functions automatically provides credentials via metadata service
    // These credentials have signing capabilities for generating signed URLs
    const originalCreds = process.env.GOOGLE_APPLICATION_CREDENTIALS;
    if (originalCreds) {
      delete process.env.GOOGLE_APPLICATION_CREDENTIALS;
      logger.info('Using Application Default Credentials for GCS (cloud environment)', {
        clearedEnvVar: 'GOOGLE_APPLICATION_CREDENTIALS',
        originalValue: originalCreds,
        projectId: config.googleCloud.projectId
      });
    }
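generateSignedUploadUrl then uses the v4 signing API on the resulting Storage client. A sketch with the bucket dependency injected so it stands alone (the expiry value is an assumption, not the service's exact setting):

```typescript
// Bucket dependency reduced to the two members used, so the sketch is testable
interface FileLike {
  getSignedUrl(opts: Record<string, unknown>): Promise<[string]>;
}
interface BucketLike {
  file(path: string): FileLike;
}

async function generateSignedUploadUrl(
  bucket: BucketLike,
  filePath: string,
  contentType: string
): Promise<string> {
  const [url] = await bucket.file(filePath).getSignedUrl({
    version: 'v4',                          // V4 signing supports Content-Type binding
    action: 'write',                        // upload via HTTP PUT
    expires: Date.now() + 15 * 60 * 1000,   // 15-minute expiry (assumed value)
    contentType,                            // must match the PUT request's Content-Type header
  });
  return url;
}
```

The contentType binding is why the frontend's PUT (section 7) must send exactly the Content-Type that was signed.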

6. Data Models & Database Schema

Core Models

DocumentModel (backend/src/models/DocumentModel.ts):

  • create() - Create document record
  • findById() - Get document by ID
  • updateById() - Update document status/metadata
  • findByUserId() - List user's documents

ProcessingJobModel (backend/src/models/ProcessingJobModel.ts):

  • create() - Create processing job (uses direct PostgreSQL to bypass PostgREST cache)
  • findById() - Get job by ID
  • getPendingJobs() - Get pending jobs (limited by the concurrency cap)
  • getRetryableJobs() - Get jobs ready for retry
  • markAsProcessing() - Update job status
  • markAsCompleted() - Mark job complete
  • markAsFailed() - Mark job failed with error
  • resetStuckJobs() - Reset jobs stuck in processing

VectorDatabaseModel (backend/src/models/VectorDatabaseModel.ts):

  • Chunk storage and retrieval
  • Embedding management

Database Tables

documents:

  • id (UUID, primary key)
  • user_id (UUID, foreign key)
  • original_file_name (text)
  • file_path (text, GCS path)
  • file_size (bigint)
  • status (text: 'uploading', 'uploaded', 'processing_llm', 'completed', 'failed')
  • pdf_path (text, GCS path for generated PDF)
  • created_at, updated_at (timestamps)

processing_jobs:

  • id (UUID, primary key)
  • document_id (UUID, foreign key)
  • user_id (UUID, foreign key)
  • status (text: 'pending', 'processing', 'completed', 'failed', 'retrying')
  • attempts (int)
  • max_attempts (int, default 3)
  • options (JSONB, processing options)
  • error (text, error message if failed)
  • result (JSONB, processing result)
  • created_at, started_at, completed_at, updated_at (timestamps)
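The attempts/max_attempts pair is what drives retry eligibility. A minimal sketch of the row shape and the guard (hypothetical helper; in the real code this check lives in the ProcessingJobModel queries):

```typescript
interface ProcessingJob {
  id: string;
  status: 'pending' | 'processing' | 'completed' | 'failed' | 'retrying';
  attempts: number;
  max_attempts: number; // default 3
}

// A failed job is retryable only while it has attempts left
function canRetry(job: ProcessingJob): boolean {
  return job.status === 'failed' && job.attempts < job.max_attempts;
}
```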

document_chunks:

  • id (UUID, primary key)
  • document_id (text, foreign key)
  • content (text)
  • embedding (vector(1536))
  • metadata (JSONB)
  • chunk_index (int)
  • created_at, updated_at (timestamps)

agentic_rag_sessions:

  • id (UUID, primary key)
  • document_id (UUID, foreign key)
  • user_id (UUID, foreign key)
  • status (text)
  • metadata (JSONB)
  • created_at, updated_at (timestamps)

Vector Search Optimization

Critical SQL Function: match_document_chunks with document_id filtering

CREATE OR REPLACE FUNCTION match_document_chunks (
  query_embedding vector(1536),
  match_threshold float,
  match_count int,
  filter_document_id text DEFAULT NULL
)
RETURNS TABLE (
  id UUID,
  document_id TEXT,
  content text,
  metadata JSONB,
  chunk_index INT,
  similarity float
)
LANGUAGE sql STABLE
AS $$
  SELECT
    document_chunks.id,
    document_chunks.document_id,
    document_chunks.content,
    document_chunks.metadata,
    document_chunks.chunk_index,
    1 - (document_chunks.embedding <=> query_embedding) AS similarity
  FROM document_chunks
  WHERE document_chunks.embedding IS NOT NULL
    AND (filter_document_id IS NULL OR document_chunks.document_id = filter_document_id)
    AND 1 - (document_chunks.embedding <=> query_embedding) > match_threshold
  ORDER BY document_chunks.embedding <=> query_embedding
  LIMIT match_count;
$$;

Always pass filter_document_id to prevent timeouts when searching across all documents.
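From the Node side, the function is invoked via Supabase RPC. A sketch of the call with the document filter applied (client shape reduced to the one method used; the threshold and count values are illustrative):

```typescript
// Client shape reduced to the single method this sketch needs
type RpcResult = { data: unknown; error: { message: string } | null };
interface RpcClient {
  rpc(fn: string, params: Record<string, unknown>): Promise<RpcResult>;
}

async function searchChunks(
  client: RpcClient,
  queryEmbedding: number[],
  documentId?: string,
  matchCount = 10
): Promise<unknown> {
  const rpcParams: Record<string, unknown> = {
    query_embedding: queryEmbedding,
    match_threshold: 0.5, // illustrative threshold
    match_count: matchCount,
  };
  // Critical for performance: restrict the scan to one document
  if (documentId) {
    rpcParams.filter_document_id = documentId;
  }
  const { data, error } = await client.rpc('match_document_chunks', rpcParams);
  if (error) {
    throw new Error(`Vector search failed: ${error.message}`);
  }
  return data;
}
```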


7. Component Handoffs & Integration Points

Frontend ↔ Backend

Axios Interceptor (frontend/src/services/documentService.ts):

export const apiClient = axios.create({
  baseURL: API_BASE_URL,
  timeout: 300000, // 5 minutes
});

// Add auth token to requests
apiClient.interceptors.request.use(async (config) => {
  const token = await authService.getToken();
  if (token) {
    config.headers.Authorization = `Bearer ${token}`;
  }
  return config;
});

// Handle auth errors with retry logic
apiClient.interceptors.response.use(
  (response) => response,
  async (error) => {
    const originalRequest = error.config;
    
    if (error.response?.status === 401 && !originalRequest._retry) {
      originalRequest._retry = true;
      
      try {
        // Attempt to refresh the token
        const newToken = await authService.getToken();
        if (newToken) {
          // Retry the original request with the new token
          originalRequest.headers.Authorization = `Bearer ${newToken}`;
          return apiClient(originalRequest);
        }
      } catch (refreshError) {
        console.error('Token refresh failed:', refreshError);
      }
      
      // If token refresh fails, logout the user
      authService.logout();
      window.location.href = '/login';
    }
    
    return Promise.reject(error);
  }
);

Backend ↔ Database

Two Connection Methods:

  1. Supabase Client (default for most operations):
import { getSupabaseServiceClient } from '../config/supabase';
const supabase = getSupabaseServiceClient();
  2. Direct PostgreSQL (for critical operations, bypasses PostgREST cache):
static async create(data: CreateProcessingJobData): Promise<ProcessingJob> {
  try {
    // Use direct PostgreSQL connection to bypass PostgREST cache
    // This is critical because PostgREST cache issues can block entire processing pipeline
    const pool = getPostgresPool();
    
    const result = await pool.query(
      `INSERT INTO processing_jobs (
        document_id, user_id, status, attempts, max_attempts, options, created_at
      ) VALUES ($1, $2, $3, $4, $5, $6, $7)
      RETURNING *`,
      [
        data.document_id,
        data.user_id,
        'pending',
        0,
        data.max_attempts || 3,
        JSON.stringify(data.options || {}),
        new Date().toISOString()
      ]
    );

    if (result.rows.length === 0) {
      throw new Error('Failed to create processing job: No data returned');
    }

    const job = result.rows[0];

    logger.info('Processing job created via direct PostgreSQL', {
      jobId: job.id,
      documentId: data.document_id,
      userId: data.user_id,
    });

    return job;

Backend ↔ GCS

Signed URL Generation:

const uploadUrl = await fileStorageService.generateSignedUploadUrl(filePath, contentType);

Direct Upload (frontend):

const fetchPromise = fetch(uploadUrl, {
  method: 'PUT',
  headers: {
    'Content-Type': contentType, // Must match exactly what was used in signed URL generation
  },
  body: file,
  signal: signal,
});

File Download (for processing):

const fileBuffer = await fileStorageService.downloadFile(document.file_path);

Backend ↔ Document AI

Text Extraction:

private async extractTextFromDocument(fileBuffer: Buffer, fileName: string, mimeType: string): Promise<string> {
  try {
    // Check document size first
    // ... size validation ...
    
    // Upload to GCS for Document AI processing
    const gcsFileName = `temp/${Date.now()}_${fileName}`;
    await this.storageClient.bucket(this.gcsBucketName).file(gcsFileName).save(fileBuffer);
    
    // Process with Document AI, referencing the file just uploaded to GCS
    // (ProcessRequest takes either rawDocument with inline content or
    // gcsDocument with a gs:// URI; the GCS form is used here)
    const request = {
      name: this.processorName,
      gcsDocument: {
        gcsUri: `gs://${this.gcsBucketName}/${gcsFileName}`,
        mimeType: mimeType
      }
    };
    
    const [result] = await this.documentAiClient.processDocument(request);
    
    // Extract text from result
    const text = result.document?.text || '';
    
    // Clean up temp file
    await this.storageClient.bucket(this.gcsBucketName).file(gcsFileName).delete();
    
    return text;
  } catch (error) {
    // Fallback to pdf-parse
    logger.warn('Document AI failed, using pdf-parse fallback', { error });
    const data = await pdf(fileBuffer);
    return data.text;
  }
}

Backend ↔ LLM APIs

Provider Selection (Claude/OpenAI/OpenRouter):

  • Configured via LLM_PROVIDER environment variable
  • Automatic API key selection based on provider
  • Model selection based on task complexity

Request Flow:

// 1. Token counting and truncation
const processedText = this.truncateText(text, availableTokens);

// 2. Model selection
const model = this.selectModel(taskComplexity);

// 3. API call with retry logic
const response = await this.callLLMAPI({
  prompt: processedText,
  systemPrompt: systemPrompt,
  model: model,
  maxTokens: this.maxTokens,
  temperature: this.temperature
});

// 4. JSON parsing and validation
const parsed = JSON.parse(response.content);
const validated = cimReviewSchema.parse(parsed);

Services ↔ Services

Event-Driven Patterns:

  • jobQueueService emits events: job:added, job:started, job:completed, job:failed
  • uploadMonitoringService tracks upload events
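The event pattern here is Node's standard EventEmitter. A minimal sketch of how jobQueueService-style events are emitted and consumed (payload shape is an assumption):

```typescript
import { EventEmitter } from 'events';

// Minimal sketch of the job event pattern
class JobQueue extends EventEmitter {
  addJob(jobId: string): void {
    this.emit('job:added', { jobId });
  }
  completeJob(jobId: string): void {
    this.emit('job:completed', { jobId });
  }
}
```

Because emit is synchronous, listeners run before addJob returns, which is worth remembering when a listener itself throws.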

Direct Method Calls:

  • Most service interactions are direct method calls
  • Services are exported as singletons for easy access

8. Error Handling & Resilience

Error Propagation Path

Service Method
    │
    ▼ (throws error)
Controller
    │
    ▼ (catches, logs, re-throws)
Express Error Handler
    │
    ▼ (categorizes, logs, responds)
Client (structured error response)

Error Categories

File: backend/src/middleware/errorHandler.ts

export enum ErrorCategory {
  VALIDATION = 'validation',
  AUTHENTICATION = 'authentication',
  AUTHORIZATION = 'authorization',
  NOT_FOUND = 'not_found',
  EXTERNAL_SERVICE = 'external_service',
  PROCESSING = 'processing',
  SYSTEM = 'system',
  DATABASE = 'database'
}

Error Response Structure:

export interface ErrorResponse {
  success: false;
  error: {
    code: string;
    message: string;
    details?: any;
    correlationId: string;
    timestamp: string;
    retryable: boolean;
  };
}
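A response conforming to this interface can be assembled with a small helper. A sketch (hypothetical function name; the real handler also maps the error to an ErrorCategory and an HTTP status code):

```typescript
import { randomUUID } from 'crypto';

// Hypothetical helper illustrating the ErrorResponse shape
function buildErrorResponse(code: string, message: string, retryable: boolean) {
  return {
    success: false as const,
    error: {
      code,
      message,
      correlationId: randomUUID(),          // for tracing the request across logs
      timestamp: new Date().toISOString(),
      retryable,                            // tells the client whether retrying can succeed
    },
  };
}
```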

Retry Mechanisms

1. Job Retries:

  • Max attempts: 3 (configurable per job)
  • Exponential backoff between retries
  • Jobs marked as retrying status

2. API Retries:

  • LLM API calls: 3 retries with exponential backoff
  • Document AI: Fallback to pdf-parse
  • Vector search: 10-second timeout, fallback to direct query

3. Database Retries:

private static async retryOperation<T>(
  operation: () => Promise<T>,
  operationName: string,
  maxRetries: number = 3,
  baseDelay: number = 1000
): Promise<T> {
  let lastError: any;
  
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error: any) {
      lastError = error;
      const isNetworkError = error?.message?.includes('fetch failed') || 
                           error?.message?.includes('ENOTFOUND') ||
                           error?.message?.includes('ECONNREFUSED') ||
                           error?.message?.includes('ETIMEDOUT') ||
                           error?.name === 'TypeError';
      
      if (!isNetworkError || attempt === maxRetries) {
        throw error;
      }
      
      const delay = baseDelay * Math.pow(2, attempt - 1);
      logger.warn(`${operationName} failed (attempt ${attempt}/${maxRetries}), retrying in ${delay}ms`, {
        error: error?.message || String(error),
        code: error?.code,
        attempt,
        maxRetries
      });
      
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  
  throw lastError;
}

Timeout Handling

Vector Search Timeout:

// Set a timeout for the RPC call (10 seconds)
const searchPromise = this.supabaseClient
  .rpc('match_document_chunks', rpcParams);

const timeoutPromise = new Promise<{ data: null; error: { message: string } }>((_, reject) => {
  setTimeout(() => reject(new Error('Vector search timeout after 10s')), 10000);
});

let result: any;
try {
  result = await Promise.race([searchPromise, timeoutPromise]);
} catch (timeoutError: any) {
  if (timeoutError.message?.includes('timeout')) {
    logger.error('Vector search timed out', { documentId, timeout: '10s' });
    throw new Error('Vector search timeout after 10s');
  }
  throw timeoutError;
}

LLM API Timeout: Handled by axios timeout configuration

Job Timeout: 15 minutes; jobs stuck in processing longer than this are reset

Stuck Job Detection and Recovery

// Reset stuck jobs first
const resetCount = await ProcessingJobModel.resetStuckJobs(this.JOB_TIMEOUT_MINUTES);
if (resetCount > 0) {
  logger.info('Reset stuck jobs', { count: resetCount });
}

Scheduled Function Monitoring:

// Check for jobs stuck in processing status
const stuckProcessingJobs = await ProcessingJobModel.getStuckJobs(15); // Jobs stuck > 15 minutes
if (stuckProcessingJobs.length > 0) {
  logger.warn('Found stuck processing jobs', {
    count: stuckProcessingJobs.length,
    jobIds: stuckProcessingJobs.map(j => j.id),
    timestamp: new Date().toISOString(),
  });
}

// Check for jobs stuck in pending status (alert if > 2 minutes)
const stuckPendingJobs = await ProcessingJobModel.getStuckPendingJobs(2); // Jobs pending > 2 minutes
if (stuckPendingJobs.length > 0) {
  logger.warn('Found stuck pending jobs (may indicate processing issues)', {
    count: stuckPendingJobs.length,
    jobIds: stuckPendingJobs.map(j => j.id),
    oldestJobAge: stuckPendingJobs[0] ? Math.round((Date.now() - new Date(stuckPendingJobs[0].created_at).getTime()) / 1000 / 60) : 0,
    timestamp: new Date().toISOString(),
  });
}

Graceful Degradation

Document AI Failure: Falls back to pdf-parse library

Vector Search Failure: Falls back to direct database query without similarity calculation

LLM API Failure: Returns error with retryable flag, job can be retried

PDF Generation Failure: Falls back to PDFKit if Puppeteer fails


9. Performance Optimization Points

Vector Search Optimization

Critical: Always pass document_id filter to prevent timeouts

// Add document_id filter if provided (critical for performance)
if (documentId) {
  rpcParams.filter_document_id = documentId;
}

SQL Function Optimization: match_document_chunks filters by document_id first before vector similarity calculation

Chunking Strategy

Optimal Configuration:

private readonly maxChunkSize = 4000; // Optimal chunk size for embeddings
private readonly overlapSize = 200; // Overlap between chunks
private readonly maxConcurrentEmbeddings = 5; // Limit concurrent API calls
private readonly batchSize = 10; // Process chunks in batches

Semantic Chunking: Detects paragraph and section boundaries for better chunk quality
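The boundary-aware splitting can be sketched as follows, using the maxChunkSize/overlapSize values above (paragraph-only splitting; the real implementation also handles section boundaries, and paragraphs longer than maxChunkSize would need a further hard split, omitted here):

```typescript
// Paragraph-boundary chunking with overlap between consecutive chunks
function chunkText(text: string, maxChunkSize = 4000, overlapSize = 200): string[] {
  const paragraphs = text.split(/\n\n+/);
  const chunks: string[] = [];
  let current = '';
  for (const paragraph of paragraphs) {
    if (current && current.length + paragraph.length + 2 > maxChunkSize) {
      chunks.push(current);
      // Seed the next chunk with the tail of the previous one (the overlap)
      current = current.slice(-overlapSize);
    }
    current = current ? current + '\n\n' + paragraph : paragraph;
  }
  if (current) {
    chunks.push(current);
  }
  return chunks;
}
```

The overlap means a sentence near a chunk boundary appears in both neighbors, which improves retrieval recall at a small storage cost.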

Batch Processing

Embedding Generation:

  • Processes chunks in batches of 10
  • Max 5 concurrent embedding API calls
  • Prevents memory overflow and API rate limiting
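The batch-of-10 pattern can be sketched generically (helper names hypothetical; batches run sequentially while items within a batch run in parallel, and the additional cap of 5 concurrent embedding calls is omitted for brevity):

```typescript
// Split work into fixed-size batches (the service uses batchSize = 10)
function intoBatches<T>(items: T[], batchSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Batches run sequentially; items within a batch run in parallel
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of intoBatches(items, batchSize)) {
    results.push(...(await Promise.all(batch.map(fn))));
  }
  return results;
}
```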

Chunk Storage:

  • Batched database inserts
  • Reduces database round trips

Memory Management

Chunk Processing:

  • Processes chunks in batches to limit memory usage
  • Cleans up processed chunks from memory after storage

PDF Generation:

  • Page pooling (max 5 pages)
  • Page timeout (30 seconds)
  • Cache with 5-minute TTL

Database Optimization

Direct PostgreSQL for Critical Operations:

  • Job creation uses direct PostgreSQL to bypass PostgREST cache issues
  • Ensures reliable job creation even when PostgREST schema cache is stale

Connection Pooling:

  • Supabase client uses connection pooling
  • Direct PostgreSQL uses pg pool

API Call Optimization

LLM Token Management:

  • Automatic token counting
  • Text truncation if exceeds limits
  • Model selection based on complexity (smaller models for simpler tasks)

Embedding Caching:

private semanticCache: Map<string, { embedding: number[]; timestamp: number }> = new Map();
private readonly CACHE_TTL = 3600000; // 1 hour cache TTL

10. Background Processing Architecture

Legacy vs Current System

Legacy: In-Memory Queue (jobQueueService)

  • EventEmitter-based
  • In-memory job storage
  • Still initialized but being phased out
  • Location: backend/src/services/jobQueueService.ts

Current: Database-Backed Queue (jobProcessorService)

  • Database-backed job storage
  • Scheduled processing via Firebase Cloud Scheduler
  • Location: backend/src/services/jobProcessorService.ts

Job Processing Flow

Job Creation
    │
    ▼
ProcessingJobModel.create()
    │
    ▼
Status: 'pending' in database
    │
    ▼
Scheduled Function (every 1 minute)
    OR
Immediate processing via API
    │
    ▼
JobProcessorService.processJobs()
    │
    ▼
Get pending/retrying jobs (max 3 concurrent)
    │
    ▼
Process jobs in parallel
    │
    ▼
For each job:
  - Mark as 'processing'
  - Download file from GCS
  - Call unifiedDocumentProcessor
  - Update document status
  - Mark job as 'completed' or 'failed'

Scheduled Function

File: backend/src/index.ts

export const processDocumentJobs = onSchedule({
  schedule: 'every 1 minutes', // Minimum interval for Firebase Cloud Scheduler
  timeoutSeconds: 900, // 15 minutes (max for Gen2 scheduled functions)
  memory: '1GiB',
  retryCount: 2, // Retry up to 2 times on failure
}, async (event) => {
  logger.info('Processing document jobs scheduled function triggered', {
    timestamp: new Date().toISOString(),
    scheduleTime: event.scheduleTime,
  });

  try {
    const { jobProcessorService } = await import('./services/jobProcessorService');
    
    // Check for stuck jobs before processing (monitoring)
    const { ProcessingJobModel } = await import('./models/ProcessingJobModel');
    
    // Check for jobs stuck in processing status
    const stuckProcessingJobs = await ProcessingJobModel.getStuckJobs(15); // Jobs stuck > 15 minutes
    if (stuckProcessingJobs.length > 0) {
      logger.warn('Found stuck processing jobs', {
        count: stuckProcessingJobs.length,
        jobIds: stuckProcessingJobs.map(j => j.id),
        timestamp: new Date().toISOString(),
      });
    }
    
    // Check for jobs stuck in pending status (alert if > 2 minutes)
    const stuckPendingJobs = await ProcessingJobModel.getStuckPendingJobs(2); // Jobs pending > 2 minutes
    if (stuckPendingJobs.length > 0) {
      logger.warn('Found stuck pending jobs (may indicate processing issues)', {
        count: stuckPendingJobs.length,
        jobIds: stuckPendingJobs.map(j => j.id),
        oldestJobAge: stuckPendingJobs[0] ? Math.round((Date.now() - new Date(stuckPendingJobs[0].created_at).getTime()) / 1000 / 60) : 0,
        timestamp: new Date().toISOString(),
      });
    }
    
    const result = await jobProcessorService.processJobs();

    logger.info('Document jobs processing completed', {
      ...result,
      timestamp: new Date().toISOString(),
    });
  } catch (error) {
    const errorMessage = error instanceof Error ? error.message : String(error);
    const errorStack = error instanceof Error ? error.stack : undefined;
    
    logger.error('Error processing document jobs', {
      error: errorMessage,
      stack: errorStack,
      timestamp: new Date().toISOString(),
    });

    // Re-throw to trigger retry mechanism (up to retryCount times)
    throw error;
  }
});

Job States

pending → processing → completed
              │
              ▼
           failed
              │
              ▼
          retrying
              │
              ▼
      (back to pending)
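These states can be encoded as an allowed-transitions map with a guard (a sketch; in the real code the model's markAs* methods are what enforce the transitions):

```typescript
// Allowed transitions for a processing job
const TRANSITIONS: Record<string, readonly string[]> = {
  pending: ['processing'],
  processing: ['completed', 'failed'],
  failed: ['retrying'],
  retrying: ['pending'],
  completed: [], // terminal
};

function canTransition(from: string, to: string): boolean {
  return (TRANSITIONS[from] ?? []).includes(to);
}
```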

Concurrency Control

Max Concurrent Jobs: 3

private readonly MAX_CONCURRENT_JOBS = 3;
private readonly JOB_TIMEOUT_MINUTES = 15;

Processing Logic:

// Get pending jobs
const pendingJobs = await ProcessingJobModel.getPendingJobs(this.MAX_CONCURRENT_JOBS);

// Get retrying jobs (enabled - schema is updated)
const retryingJobs = await ProcessingJobModel.getRetryableJobs(
  Math.max(0, this.MAX_CONCURRENT_JOBS - pendingJobs.length)
);

const allJobs = [...pendingJobs, ...retryingJobs];

if (allJobs.length === 0) {
  logger.debug('No jobs to process');
  return stats;
}

logger.info('Processing jobs', {
  totalJobs: allJobs.length,
  pendingJobs: pendingJobs.length,
  retryingJobs: retryingJobs.length,
});

// Process jobs in parallel (up to MAX_CONCURRENT_JOBS)
const results = await Promise.allSettled(
  allJobs.map((job) => this.processJob(job.id))
);

11. Frontend Architecture

Component Structure

Main Components:

  • DocumentUpload - File upload with drag-and-drop
  • DocumentList - List of user's documents with status
  • DocumentViewer - View processed document and PDF
  • Analytics - Processing statistics dashboard
  • UploadMonitoringDashboard - Real-time upload monitoring

State Management

AuthContext (frontend/src/contexts/AuthContext.tsx):

export const AuthProvider: React.FC<AuthProviderProps> = ({ children }) => {
  const [user, setUser] = useState<User | null>(null);
  const [token, setToken] = useState<string | null>(null);
  const [isLoading, setIsLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);
  const [isInitialized, setIsInitialized] = useState(false);

  useEffect(() => {
    setIsLoading(true);
    
    // Listen for Firebase auth state changes
    const unsubscribe = authService.onAuthStateChanged(async (firebaseUser) => {
      try {
        if (firebaseUser) {
          const user = authService.getCurrentUser();
          const token = await authService.getToken();
          setUser(user);
          setToken(token);
        } else {
          setUser(null);
          setToken(null);
        }
      } catch (error) {
        console.error('Auth state change error:', error);
        setError('Authentication error occurred');
        setUser(null);
        setToken(null);
      } finally {
        setIsLoading(false);
        setIsInitialized(true);
      }
    });

    // Cleanup subscription on unmount
    return () => unsubscribe();
  }, []);

API Communication

Document Service (frontend/src/services/documentService.ts):

  • Axios client with auth interceptor
  • Automatic token refresh on 401 errors
  • Progress tracking for uploads
  • Error handling with user-friendly messages

Upload Flow:

async uploadDocument(
  file: File, 
  onProgress?: (progress: number) => void, 
  signal?: AbortSignal
): Promise<Document> {
  try {
    // Check authentication before upload
    const token = await authService.getToken();
    if (!token) {
      throw new Error('Authentication required. Please log in to upload documents.');
    }

    // Step 1: Get signed upload URL (contentTypeForSigning is derived from
    // file.type earlier in the method, omitted from this excerpt)
    onProgress?.(5); // 5% - Getting upload URL
    
    const uploadUrlResponse = await apiClient.post('/documents/upload-url', {
      fileName: file.name,
      fileSize: file.size,
      contentType: contentTypeForSigning
    }, { signal });

    const { documentId, uploadUrl } = uploadUrlResponse.data;

    // Step 2: Upload directly to Firebase Storage
    onProgress?.(10); // 10% - Starting direct upload
    
    await this.uploadToFirebaseStorage(
      file,
      uploadUrl,
      contentTypeForSigning,
      (uploadProgress) => {
        // Map upload progress (10-90%)
        const mappedProgress = 10 + (uploadProgress * 0.8);
        onProgress?.(mappedProgress);
      },
      signal
    );

    // Step 3: Confirm upload
    onProgress?.(90); // 90% - Confirming upload
    
    const confirmResponse = await apiClient.post(
      `/documents/${documentId}/confirm-upload`,
      {},
      { signal }
    );

    onProgress?.(100); // 100% - Complete

    return confirmResponse.data.document;
  } catch (error) {
    // ... error handling ...
  }
}

Real-Time Updates

Polling for Processing Status:

  • Frontend polls /documents/:id endpoint
  • Updates UI when status changes from 'processing' to 'completed'
  • Shows error messages if status is 'failed'
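A sketch of that polling loop, with the status fetcher injected (the interval and attempt cap are illustrative, not the frontend's exact values):

```typescript
// 'completed' and 'failed' are the terminal document statuses
function isTerminalStatus(status: string): boolean {
  return status === 'completed' || status === 'failed';
}

// Illustrative poll loop; the fetcher wraps GET /documents/:id
async function pollUntilDone(
  getStatus: () => Promise<string>,
  intervalMs = 3000,
  maxAttempts = 100
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (isTerminalStatus(status)) {
      return status;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Polling timed out');
}
```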

Upload Progress:

  • Real-time progress tracking via onProgress callback
  • Visual progress bar in DocumentUpload component

12. Configuration & Environment

Environment Variables

File: backend/src/config/env.ts

Key Configuration Categories:

  1. LLM Provider:

    • LLM_PROVIDER - 'anthropic', 'openai', or 'openrouter'
    • ANTHROPIC_API_KEY - Claude API key
    • OPENAI_API_KEY - OpenAI API key
    • OPENROUTER_API_KEY - OpenRouter API key
    • LLM_MODEL - Model name (e.g., 'claude-sonnet-4-5-20250929')
    • LLM_MAX_TOKENS - Max output tokens
    • LLM_MAX_INPUT_TOKENS - Max input tokens (default 200000)
  2. Database:

    • SUPABASE_URL - Supabase project URL
    • SUPABASE_SERVICE_KEY - Service role key
    • SUPABASE_ANON_KEY - Anonymous key
  3. Google Cloud:

    • GCLOUD_PROJECT_ID - GCP project ID
    • GCS_BUCKET_NAME - Storage bucket name
    • DOCUMENT_AI_PROCESSOR_ID - Document AI processor ID
    • DOCUMENT_AI_LOCATION - Processor location (default 'us')
  4. Feature Flags:

    • AGENTIC_RAG_ENABLED - Enable/disable agentic RAG processing

Configuration Loading

Priority Order:

  1. process.env (Firebase Functions v2)
  2. functions.config() (Firebase Functions v1 fallback)
  3. .env file (local development)

Validation: Joi schema validates all required environment variables
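A reduced stand-in for that validation (the real code uses a Joi schema; the variable subset here is chosen purely for illustration):

```typescript
// Illustrative subset of the required variables
const REQUIRED_VARS = ['SUPABASE_URL', 'SUPABASE_SERVICE_KEY', 'GCS_BUCKET_NAME'];

// Returns the names of required variables that are missing or empty
function validateEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]);
}
```

Failing fast on a non-empty result at startup surfaces misconfiguration before any request is served.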


13. Debugging Guide

Key Log Points

Correlation IDs: Every request has a correlation ID for tracing

Structured Logging: Winston logger with structured data

Key Log Locations:

  1. Request Entry: backend/src/index.ts - All incoming requests
  2. Authentication: backend/src/middleware/firebaseAuth.ts - Auth success/failure
  3. Job Processing: backend/src/services/jobProcessorService.ts - Job lifecycle
  4. Document Processing: backend/src/services/unifiedDocumentProcessor.ts - Processing steps
  5. LLM Calls: backend/src/services/llmService.ts - API calls and responses
  6. Vector Search: backend/src/services/vectorDatabaseService.ts - Search operations
  7. Error Handling: backend/src/middleware/errorHandler.ts - All errors with categorization

Common Failure Points

1. Vector Search Timeouts

  • Symptom: "Vector search timeout after 10s"
  • Cause: Searching across all documents without document_id filter
  • Fix: Always pass documentId to vectorDatabaseService.searchSimilar()

2. LLM API Failures

  • Symptom: "LLM API call failed" or "Invalid JSON response"
  • Cause: API rate limits, network issues, or invalid response format
  • Fix: Check API keys, retry logic, and response validation

3. GCS Upload Failures

  • Symptom: "Failed to upload to GCS" or "Signed URL expired"
  • Cause: Credential issues, bucket permissions, or URL expiration
  • Fix: Check GCS credentials and bucket configuration

4. Job Stuck in Processing

  • Symptom: Job status remains 'processing' for > 15 minutes
  • Cause: Process crashed, timeout, or error not caught
  • Fix: Check logs, reset stuck jobs, investigate error

5. Document AI Failures

  • Symptom: "Failed to extract text from document"
  • Cause: Document AI API error or invalid file format
  • Fix: Check Document AI processor configuration, fallback to pdf-parse

Diagnostic Tools

Health Check Endpoints:

  • GET /health - Basic health check
  • GET /health/config - Configuration health
  • GET /health/agentic-rag - Agentic RAG health status

Monitoring Endpoints:

  • GET /monitoring/upload-metrics - Upload statistics
  • GET /monitoring/upload-health - Upload health
  • GET /monitoring/real-time-stats - Real-time statistics

Database Debugging:

-- Check pending jobs
SELECT * FROM processing_jobs WHERE status = 'pending' ORDER BY created_at DESC;

-- Check stuck jobs
SELECT * FROM processing_jobs 
WHERE status = 'processing' 
AND started_at < NOW() - INTERVAL '15 minutes';

-- Check document status
SELECT id, original_file_name, status, created_at 
FROM documents 
WHERE user_id = '<user_id>' 
ORDER BY created_at DESC;

Job Inspection:

// Get job details
const job = await ProcessingJobModel.findById(jobId);

// Check job error
console.log('Job error:', job.error);

// Check job result
console.log('Job result:', job.result);

Debugging Workflow

  1. Identify the Issue: Check error logs with correlation ID
  2. Trace the Request: Follow correlation ID through logs
  3. Check Job Status: Query processing_jobs table for job state
  4. Check Document Status: Query documents table for document state
  5. Review Service Logs: Check specific service logs for detailed errors
  6. Test Components: Test individual services in isolation
  7. Check External Services: Verify GCS, Document AI, LLM APIs are accessible

14. Optimization Opportunities

Identified Bottlenecks

1. Vector Search Performance

  • Current: 10-second timeout, can be slow for large document sets
  • Optimization: Ensure document_id filter is always used
  • Future: Consider indexing optimizations, batch search

2. LLM API Calls

  • Current: Sequential processing, no caching of similar requests
  • Optimization: Implement response caching for similar documents
  • Future: Batch API calls, use smaller models for simpler tasks
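
A minimal sketch of such a cache, keyed by a SHA-256 hash of the prompt (in practice the key would hash normalized document content rather than the raw prompt):

```typescript
import { createHash } from "node:crypto";

// In-memory response cache; a production version would add TTL and size limits.
const responseCache = new Map<string, string>();

function cacheKey(prompt: string): string {
  return createHash("sha256").update(prompt).digest("hex");
}

// Return a cached response when one exists; otherwise call the LLM and cache.
async function cachedCompletion(
  prompt: string,
  callLlm: (prompt: string) => Promise<string>
): Promise<string> {
  const key = cacheKey(prompt);
  const hit = responseCache.get(key);
  if (hit !== undefined) return hit;
  const result = await callLlm(prompt);
  responseCache.set(key, result);
  return result;
}
```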

3. PDF Generation

  • Current: Puppeteer can be memory-intensive
  • Optimization: Page pooling already implemented
  • Future: Consider serverless PDF generation service
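
The pooling idea, sketched as a generic fixed-size pool; the real service pools Puppeteer pages, but here the resource factory is injected so the sketch is self-contained:

```typescript
// Generic fixed-size resource pool: reuses idle resources and caps creation.
class ResourcePool<T> {
  private idle: T[] = [];
  private created = 0;

  constructor(
    private readonly factory: () => Promise<T>,
    private readonly maxSize: number
  ) {}

  async acquire(): Promise<T> {
    const existing = this.idle.pop();
    if (existing !== undefined) return existing;
    if (this.created < this.maxSize) {
      this.created += 1;
      return this.factory();
    }
    // A real pool would queue the caller until a resource is released.
    throw new Error("pool exhausted");
  }

  release(resource: T): void {
    this.idle.push(resource);
  }
}
```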

4. Database Queries

  • Current: Some queries don't use indexes effectively
  • Optimization: Add indexes on frequently queried columns
  • Future: Query optimization, connection pooling tuning

Memory Usage Patterns

Chunk Processing:

  • Processes chunks in batches to limit memory
  • Cleans up processed chunks after storage
  • Optimization: Consider streaming for very large documents
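
The batching pattern above, sketched generically (batch size and handler are parameters; they stand in for the processor's actual values):

```typescript
// Process chunks in fixed-size batches so only one batch is resident at a time.
async function processInBatches<T>(
  chunks: T[],
  batchSize: number,
  handleBatch: (batch: T[]) => Promise<void>
): Promise<void> {
  for (let i = 0; i < chunks.length; i += batchSize) {
    await handleBatch(chunks.slice(i, i + batchSize));
    // Each slice goes out of scope here, letting the GC reclaim processed batches.
  }
}
```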

PDF Generation:

  • Page pooling limits memory usage
  • Browser instance reuse reduces overhead
  • Optimization: Consider headless browser optimization

API Call Optimization

Embedding Generation:

  • Current: Max 5 concurrent calls
  • Optimization: Tune based on API rate limits
  • Future: Batch embedding API if available
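
One way to cap in-flight calls, similar to what libraries like p-limit provide (the limit of 5 matches the current setting; this is a sketch, not the service's actual implementation):

```typescript
// Map over items with at most `limit` calls in flight, preserving result order.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unclaimed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```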

LLM Calls:

  • Current: Single call per document
  • Optimization: Use smaller models for simpler tasks
  • Future: Implement response caching

Database Query Optimization

Frequently Queried Tables:

  • documents - Add index on user_id, status
  • processing_jobs - Add index on status, created_at
  • document_chunks - Add index on document_id, chunk_index

Vector Search:

  • Current: Uses match_document_chunks function
  • Optimization: Ensure document_id filter is always used
  • Future: Consider an HNSW index for faster approximate nearest-neighbor search

Appendix: Key File Locations

Backend Services

  • backend/src/services/unifiedDocumentProcessor.ts - Main orchestrator
  • backend/src/services/optimizedAgenticRAGProcessor.ts - AI processing engine
  • backend/src/services/jobProcessorService.ts - Job processor
  • backend/src/services/vectorDatabaseService.ts - Vector operations
  • backend/src/services/llmService.ts - LLM interactions
  • backend/src/services/documentAiProcessor.ts - Document AI integration
  • backend/src/services/pdfGenerationService.ts - PDF generation
  • backend/src/services/fileStorageService.ts - GCS operations

Backend Models

  • backend/src/models/DocumentModel.ts - Document data model
  • backend/src/models/ProcessingJobModel.ts - Job data model
  • backend/src/models/VectorDatabaseModel.ts - Vector data model

Backend Routes

  • backend/src/routes/documents.ts - Document endpoints
  • backend/src/routes/vector.ts - Vector endpoints
  • backend/src/routes/monitoring.ts - Monitoring endpoints

Backend Controllers

  • backend/src/controllers/documentController.ts - Document controller

Frontend Services

  • frontend/src/services/documentService.ts - Document API client
  • frontend/src/services/authService.ts - Authentication service

Frontend Components

  • frontend/src/components/DocumentUpload.tsx - Upload component
  • frontend/src/components/DocumentList.tsx - Document list
  • frontend/src/components/DocumentViewer.tsx - Document viewer

Configuration

  • backend/src/config/env.ts - Environment configuration
  • backend/src/config/supabase.ts - Supabase configuration
  • backend/src/config/firebase.ts - Firebase configuration

SQL

  • backend/sql/fix_vector_search_timeout.sql - Vector search optimization

End of Architecture Summary