Major release with significant performance improvements and new processing strategy.

## Core Changes
- Implemented simple_full_document processing strategy (default)
- Full document → LLM approach: 1-2 passes, ~5-6 minutes processing time
- Achieved 100% completeness with 2 API calls (down from 5+)
- Removed redundant Document AI passes for faster processing

## Financial Data Extraction
- Enhanced deterministic financial table parser
- Improved FY3/FY2/FY1/LTM identification from varying CIM formats
- Automatic merging of parser results with LLM extraction

## Code Quality & Infrastructure
- Cleaned up debug logging (removed emoji markers from production code)
- Fixed Firebase Secrets configuration (using modern defineSecret approach)
- Updated OpenAI API key
- Resolved deployment conflicts (secrets vs environment variables)
- Added .env files to Firebase ignore list

## Deployment
- Firebase Functions v2 deployment successful
- All 7 required secrets verified and configured
- Function URL: https://api-y56ccs6wva-uc.a.run.app

## Performance Improvements
- Processing time: ~5-6 minutes (down from 23+ minutes)
- API calls: 1-2 (down from 5+)
- Completeness: 100% achievable
- LLM Model: claude-3-7-sonnet-latest

## Breaking Changes
- Default processing strategy changed to 'simple_full_document'
- RAG processor available as alternative strategy 'document_ai_agentic_rag'

## Files Changed
- 36 files changed, 5642 insertions(+), 4451 deletions(-)
- Removed deprecated documentation files
- Cleaned up unused services and models

This release represents a major refactoring focused on speed, accuracy, and maintainability.
# CIM Summary Codebase Architecture Summary

**Last Updated**: December 2024

**Purpose**: Comprehensive technical reference for senior developers optimizing and debugging the codebase

---

## Table of Contents

1. [System Overview](#1-system-overview)
2. [Application Entry Points](#2-application-entry-points)
3. [Request Flow & API Architecture](#3-request-flow--api-architecture)
4. [Document Processing Pipeline (Critical Path)](#4-document-processing-pipeline-critical-path)
5. [Core Services Deep Dive](#5-core-services-deep-dive)
6. [Data Models & Database Schema](#6-data-models--database-schema)
7. [Component Handoffs & Integration Points](#7-component-handoffs--integration-points)
8. [Error Handling & Resilience](#8-error-handling--resilience)
9. [Performance Optimization Points](#9-performance-optimization-points)
10. [Background Processing Architecture](#10-background-processing-architecture)
11. [Frontend Architecture](#11-frontend-architecture)
12. [Configuration & Environment](#12-configuration--environment)
13. [Debugging Guide](#13-debugging-guide)
14. [Optimization Opportunities](#14-optimization-opportunities)

---

## 1. System Overview

### High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        Frontend (React)                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │DocumentUpload│  │ DocumentList │  │  Analytics   │           │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘           │
│         │                 │                 │                   │
│         └─────────────────┴─────────────────┘                   │
│                           │                                     │
│                  ┌────────▼────────┐                            │
│                  │ documentService │                            │
│                  │ (Axios Client)  │                            │
│                  └────────┬────────┘                            │
└───────────────────────────┼─────────────────────────────────────┘
                            │ HTTPS + JWT
┌───────────────────────────▼─────────────────────────────────────┐
│                  Backend (Express + Node.js)                    │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Middleware Chain: CORS → Auth → Validation → Error Handler  │ │
│ └─────────────────────────────────────────────────────────────┘ │
│                           │                                     │
│        ┌──────────────────┼──────────────────┐                  │
│        │                  │                  │                  │
│  ┌─────▼─────┐   ┌────────▼────────┐   ┌────▼───────┐           │
│  │  Routes   │   │   Controllers   │   │  Services  │           │
│  └─────┬─────┘   └────────┬────────┘   └────┬───────┘           │
│        │                  │                  │                  │
│        └──────────────────┴──────────────────┘                  │
└───────────────────────────┬─────────────────────────────────────┘
                            │
       ┌────────────────────┼────────────────────┐
       │                    │                    │
  ┌────▼─────┐       ┌──────▼──────┐     ┌───────▼───────┐
  │ Supabase │       │Google Cloud │     │   LLM APIs    │
  │(Postgres)│       │   Storage   │     │(Claude/OpenAI)│
  └──────────┘       └─────────────┘     └───────────────┘
```


### Technology Stack

**Frontend:**
- React 18 + TypeScript
- Vite (build tool)
- Axios (HTTP client)
- Firebase Auth (authentication)
- React Router (routing)

**Backend:**
- Node.js + Express + TypeScript
- Firebase Functions v2 (deployment)
- Supabase (PostgreSQL + Vector DB)
- Google Cloud Storage (file storage)
- Google Document AI (PDF text extraction)
- Puppeteer (PDF generation)

**AI/ML Services:**
- Anthropic Claude (primary LLM)
- OpenAI (fallback LLM)
- OpenRouter (LLM routing)
- OpenAI Embeddings (vector embeddings)

### Core Purpose

Automated processing and analysis of Confidential Information Memorandums (CIMs) using:

1. **Text Extraction**: Google Document AI extracts text from PDFs
2. **Semantic Chunking**: Split text into 4000-char chunks with overlap
3. **Vector Embeddings**: Generate embeddings for semantic search
4. **LLM Analysis**: Claude AI analyzes chunks and generates structured CIMReview data
5. **PDF Generation**: Create summary PDF with analysis results

---

## 2. Application Entry Points

### Backend Entry Point

**File**: `backend/src/index.ts`

```1:22:backend/src/index.ts
// Initialize Firebase Admin SDK first
import './config/firebase';

import express from 'express';
import cors from 'cors';
import helmet from 'helmet';
import morgan from 'morgan';
import rateLimit from 'express-rate-limit';
import { config } from './config/env';
import { logger } from './utils/logger';
import documentRoutes from './routes/documents';
import vectorRoutes from './routes/vector';
import monitoringRoutes from './routes/monitoring';
import auditRoutes from './routes/documentAudit';
import { jobQueueService } from './services/jobQueueService';

import { errorHandler, correlationIdMiddleware } from './middleware/errorHandler';
import { notFoundHandler } from './middleware/notFoundHandler';

// Start the job queue service for background processing
jobQueueService.start();
```

**Key Initialization Steps:**
1. Firebase Admin SDK initialization (`./config/firebase`)
2. Express app setup with middleware chain
3. Route registration (`/documents`, `/vector`, `/monitoring`, `/api/audit`)
4. Job queue service startup (legacy in-memory queue)
5. Firebase Functions export for Cloud deployment

**Scheduled Function**: `processDocumentJobs` (```210:267:backend/src/index.ts```)
- Runs every minute via Firebase Cloud Scheduler
- Processes pending/retrying jobs from the database
- Detects and resets stuck jobs

### Frontend Entry Point

**File**: `frontend/src/main.tsx`

```1:10:frontend/src/main.tsx
import React from 'react';
import ReactDOM from 'react-dom/client';
import App from './App';
import './index.css';

ReactDOM.createRoot(document.getElementById('root')!).render(
  <React.StrictMode>
    <App />
  </React.StrictMode>
);
```

**Main App Component**: `frontend/src/App.tsx`
- Sets up React Router
- Provides AuthContext
- Renders protected routes and dashboard

---

## 3. Request Flow & API Architecture

### Request Lifecycle

```
Client Request
       │
       ▼
┌─────────────────────────────────────┐
│ 1. CORS Middleware                  │
│    - Validates origin               │
│    - Sets CORS headers              │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 2. Correlation ID Middleware        │
│    - Generates/reads correlation ID │
│      (X-Correlation-ID header)      │
│    - Adds to request object         │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 3. Firebase Auth Middleware         │
│    - Verifies JWT token             │
│    - Attaches user to req.user      │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 4. Rate Limiting                    │
│    - 1000 requests per 15 minutes   │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 5. Body Parsing                     │
│    - JSON (10MB limit)              │
│    - URL-encoded (10MB limit)       │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 6. Route Handler                    │
│    - Matches route pattern          │
│    - Calls controller method        │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 7. Controller                       │
│    - Validates input                │
│    - Calls service methods          │
│    - Returns response               │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 8. Service Layer                    │
│    - Business logic                 │
│    - Database operations            │
│    - External API calls             │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│ 9. Error Handler (if error)         │
│    - Categorizes error              │
│    - Logs with correlation ID       │
│    - Returns structured response    │
└─────────────────────────────────────┘
```
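The ordering above is significant: any middleware that short-circuits (auth, rate limiting) prevents everything after it from running. A minimal model of such a chain, with illustrative handler names rather than the project's actual code:

```typescript
// Minimal model of an ordered middleware chain: each middleware either
// calls next() to pass control along, or short-circuits with a response.
type Ctx = { headers: Record<string, string>; status?: number };
type Middleware = (ctx: Ctx, next: () => void) => void;

function runChain(ctx: Ctx, chain: Middleware[]): Ctx {
  let i = -1;
  const next = () => {
    i += 1;
    if (i < chain.length) chain[i](ctx, next);
  };
  next();
  return ctx;
}

// Illustrative auth middleware: rejects with 401 unless a Bearer token
// is present; later middleware never runs on rejection.
const auth: Middleware = (ctx, next) =>
  ctx.headers['authorization']?.startsWith('Bearer ')
    ? next()
    : void (ctx.status = 401);

// Illustrative terminal handler.
const handler: Middleware = (ctx) => { ctx.status = 200; };
```

Express implements exactly this dispatch pattern internally, which is why `router.use()` registration order determines the lifecycle steps shown above.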
### Authentication Flow

**Middleware**: `backend/src/middleware/firebaseAuth.ts`

```27:81:backend/src/middleware/firebaseAuth.ts
export const verifyFirebaseToken = async (
  req: FirebaseAuthenticatedRequest,
  res: Response,
  next: NextFunction
): Promise<void> => {
  try {
    console.log('🔐 Authentication middleware called for:', req.method, req.url);
    console.log('🔐 Request headers:', Object.keys(req.headers));

    // Debug Firebase Admin initialization
    console.log('🔐 Firebase apps available:', admin.apps.length);
    console.log('🔐 Firebase app names:', admin.apps.filter(app => app !== null).map(app => app!.name));

    const authHeader = req.headers.authorization;
    console.log('🔐 Auth header present:', !!authHeader);
    console.log('🔐 Auth header starts with Bearer:', authHeader?.startsWith('Bearer '));

    if (!authHeader || !authHeader.startsWith('Bearer ')) {
      console.log('❌ No valid authorization header');
      res.status(401).json({ error: 'No valid authorization header' });
      return;
    }

    const idToken = authHeader.split('Bearer ')[1];
    console.log('🔐 Token extracted, length:', idToken?.length);

    if (!idToken) {
      console.log('❌ No token provided');
      res.status(401).json({ error: 'No token provided' });
      return;
    }

    console.log('🔐 Attempting to verify Firebase ID token...');
    console.log('🔐 Token preview:', idToken.substring(0, 20) + '...');

    // Verify the Firebase ID token
    const decodedToken = await admin.auth().verifyIdToken(idToken, true);
    console.log('✅ Token verified successfully for user:', decodedToken.email);
    console.log('✅ Token UID:', decodedToken.uid);
    console.log('✅ Token issuer:', decodedToken.iss);

    // Check if token is expired
    const now = Math.floor(Date.now() / 1000);
    if (decodedToken.exp && decodedToken.exp < now) {
      logger.warn('Token expired for user:', decodedToken.uid);
      res.status(401).json({ error: 'Token expired' });
      return;
    }

    req.user = decodedToken;

    // Log successful authentication
    logger.info('Authenticated request for user:', decodedToken.email);

    next();
```

**Frontend Auth**: `frontend/src/services/authService.ts`
- Manages Firebase Auth state
- Provides token via `getToken()`
- Axios interceptor adds token to requests

### Route Structure

**Main Routes** (`backend/src/routes/documents.ts`):
- `POST /documents/upload-url` - Get signed upload URL
- `POST /documents/:id/confirm-upload` - Confirm upload and start processing
- `GET /documents` - List user's documents
- `GET /documents/:id` - Get document details
- `GET /documents/:id/download` - Download processed PDF
- `GET /documents/analytics` - Get processing analytics
- `POST /documents/:id/process-optimized-agentic-rag` - Trigger AI processing

**Middleware Applied**:
```22:29:backend/src/routes/documents.ts
// Apply authentication and correlation ID to all routes
router.use(verifyFirebaseToken);
router.use(addCorrelationId);

// Add logging middleware for document routes
router.use((req, res, next) => {
  console.log(`📄 Document route accessed: ${req.method} ${req.path}`);
  next();
});
```

---

## 4. Document Processing Pipeline (Critical Path)

### Complete Flow Diagram

```
┌─────────────────────────────────────────────────────────────────────┐
│                    DOCUMENT PROCESSING PIPELINE                     │
└─────────────────────────────────────────────────────────────────────┘

1. UPLOAD PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ User selects PDF                                            │
   │   ↓                                                         │
   │ DocumentUpload component                                    │
   │   ↓                                                         │
   │ documentService.uploadDocument()                            │
   │   ↓                                                         │
   │ POST /documents/upload-url                                  │
   │   ↓                                                         │
   │ documentController.getUploadUrl()                           │
   │   ↓                                                         │
   │ DocumentModel.create() → documents table                    │
   │   ↓                                                         │
   │ fileStorageService.generateSignedUploadUrl()                │
   │   ↓                                                         │
   │ Direct upload to GCS via signed URL                         │
   │   ↓                                                         │
   │ POST /documents/:id/confirm-upload                          │
   └─────────────────────────────────────────────────────────────┘

2. JOB CREATION PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ documentController.confirmUpload()                          │
   │   ↓                                                         │
   │ ProcessingJobModel.create() → processing_jobs table         │
   │   ↓                                                         │
   │ Status: 'pending'                                           │
   │   ↓                                                         │
   │ Returns 202 Accepted (async processing)                     │
   └─────────────────────────────────────────────────────────────┘

3. JOB PROCESSING PHASE (Background)
   ┌─────────────────────────────────────────────────────────────┐
   │ Scheduled Function: processDocumentJobs (every 1 minute)    │
   │   OR                                                        │
   │ Immediate processing via jobProcessorService.processJob()   │
   │   ↓                                                         │
   │ JobProcessorService.processJob()                            │
   │   ↓                                                         │
   │ Download file from GCS                                      │
   │   ↓                                                         │
   │ unifiedDocumentProcessor.processDocument()                  │
   └─────────────────────────────────────────────────────────────┘

4. TEXT EXTRACTION PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ documentAiProcessor.processDocument()                       │
   │   ↓                                                         │
   │ Google Document AI API                                      │
   │   ↓                                                         │
   │ Extracted text returned                                     │
   └─────────────────────────────────────────────────────────────┘

5. CHUNKING & EMBEDDING PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ optimizedAgenticRAGProcessor.processLargeDocument()         │
   │   ↓                                                         │
   │ createIntelligentChunks()                                   │
   │   - Semantic boundary detection                             │
   │   - 4000-char chunks with 200-char overlap                  │
   │   ↓                                                         │
   │ processChunksInBatches()                                    │
   │   - Batch size: 10                                          │
   │   - Max concurrent: 5                                       │
   │   ↓                                                         │
   │ storeChunksOptimized()                                      │
   │   ↓                                                         │
   │ vectorDatabaseService.storeEmbedding()                      │
   │   - OpenAI embeddings API                                   │
   │   - Store in document_chunks table                          │
   └─────────────────────────────────────────────────────────────┘

6. LLM ANALYSIS PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ generateLLMAnalysisHybrid()                                 │
   │   ↓                                                         │
   │ llmService.processCIMDocument()                             │
   │   ↓                                                         │
   │ Vector search for relevant chunks                           │
   │   ↓                                                         │
   │ Claude/OpenAI API call with structured prompt               │
   │   ↓                                                         │
   │ Parse and validate CIMReview JSON                           │
   │   ↓                                                         │
   │ Return structured analysisData                              │
   └─────────────────────────────────────────────────────────────┘

7. PDF GENERATION PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ pdfGenerationService.generatePDF()                          │
   │   ↓                                                         │
   │ Puppeteer browser instance                                  │
   │   ↓                                                         │
   │ Render HTML template with analysisData                      │
   │   ↓                                                         │
   │ Generate PDF buffer                                         │
   │   ↓                                                         │
   │ Upload PDF to GCS                                           │
   │   ↓                                                         │
   │ Update document record with PDF path                        │
   └─────────────────────────────────────────────────────────────┘

8. STATUS UPDATE PHASE
   ┌─────────────────────────────────────────────────────────────┐
   │ DocumentModel.updateById()                                  │
   │   - status: 'completed'                                     │
   │   - pdf_path: GCS path                                      │
   │   ↓                                                         │
   │ ProcessingJobModel.markAsCompleted()                        │
   │   ↓                                                         │
   │ Frontend polls /documents/:id for status updates            │
   └─────────────────────────────────────────────────────────────┘
```
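The polling loop in phase 8 can be sketched as below. The interval, attempt cap, and function names are assumptions for illustration; the real `documentService` wraps Axios and its exact API may differ:

```typescript
// Illustrative sketch of the frontend status-polling loop: fetch the
// document until it reaches a terminal status or attempts run out.
async function pollStatus(
  fetchDoc: () => Promise<{ status: string }>, // e.g. GET /documents/:id
  { intervalMs = 3000, maxAttempts = 100 } = {}
): Promise<string> {
  for (let i = 0; i < maxAttempts; i++) {
    const { status } = await fetchDoc();
    if (status === 'completed' || status === 'failed') return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Polling timed out');
}
```

Because `confirmUpload()` returns 202 immediately, the client never blocks on processing; it only observes status transitions written by the background job.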

### Key Handoff Points

**1. Upload to Job Creation**
```138:202:backend/src/controllers/documentController.ts
async confirmUpload(req: Request, res: Response): Promise<void> {
  // ... validation ...

  // Update status to processing
  await DocumentModel.updateById(documentId, {
    status: 'processing_llm'
  });

  // Acknowledge the request immediately
  res.status(202).json({
    message: 'Upload confirmed, processing has started.',
    document: document,
    status: 'processing'
  });

  // CRITICAL FIX: Use database-backed job queue
  const { ProcessingJobModel } = await import('../models/ProcessingJobModel');
  await ProcessingJobModel.create({
    document_id: documentId,
    user_id: userId,
    options: {
      fileName: document.original_file_name,
      mimeType: 'application/pdf'
    }
  });
}
```

**2. Job Processing to Document Processing**
```109:200:backend/src/services/jobProcessorService.ts
private async processJob(jobId: string): Promise<{ success: boolean; error?: string }> {
  // Get job details
  job = await ProcessingJobModel.findById(jobId);

  // Mark job as processing
  await ProcessingJobModel.markAsProcessing(jobId);

  // Download file from GCS
  const fileBuffer = await fileStorageService.downloadFile(document.file_path);

  // Process document
  const result = await unifiedDocumentProcessor.processDocument(
    job.document_id,
    job.user_id,
    fileBuffer.toString('utf-8'), // This will be re-read as buffer
    {
      fileBuffer,
      fileName: job.options?.fileName || 'document.pdf',
      mimeType: job.options?.mimeType || 'application/pdf'
    }
  );
}
```

**3. Document Processing to Text Extraction**
```50:80:backend/src/services/documentAiProcessor.ts
async processDocument(
  documentId: string,
  userId: string,
  fileBuffer: Buffer,
  fileName: string,
  mimeType: string
): Promise<ProcessingResult> {
  // Step 1: Extract text using Document AI or fallback
  const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);

  // Step 2: Process extracted text through Agentic RAG
  const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);
}
```

**4. Text to Chunking**
```40:109:backend/src/services/optimizedAgenticRAGProcessor.ts
async processLargeDocument(
  documentId: string,
  text: string,
  options: {
    enableSemanticChunking?: boolean;
    enableMetadataEnrichment?: boolean;
    similarityThreshold?: number;
  } = {}
): Promise<ProcessingResult> {
  // Step 1: Create intelligent chunks with semantic boundaries
  const chunks = await this.createIntelligentChunks(text, documentId, options.enableSemanticChunking);

  // Step 2: Process chunks in batches to manage memory
  const processedChunks = await this.processChunksInBatches(chunks, documentId, options);

  // Step 3: Store chunks with optimized batching
  const embeddingApiCalls = await this.storeChunksOptimized(processedChunks, documentId);

  // Step 4: Generate LLM analysis using HYBRID approach
  const llmResult = await this.generateLLMAnalysisHybrid(documentId, text, processedChunks);
}
```

---

## 5. Core Services Deep Dive

### 5.1 UnifiedDocumentProcessor

**File**: `backend/src/services/unifiedDocumentProcessor.ts`
**Purpose**: Main orchestrator for document processing strategies

**Key Method**:
```123:143:backend/src/services/unifiedDocumentProcessor.ts
async processDocument(
  documentId: string,
  userId: string,
  text: string,
  options: any = {}
): Promise<ProcessingResult> {
  const strategy = options.strategy || 'document_ai_agentic_rag';

  logger.info('Processing document with unified processor', {
    documentId,
    strategy,
    textLength: text.length
  });

  // Only support document_ai_agentic_rag strategy
  if (strategy === 'document_ai_agentic_rag') {
    return await this.processWithDocumentAiAgenticRag(documentId, userId, text, options);
  } else {
    throw new Error(`Unsupported processing strategy: ${strategy}. Only 'document_ai_agentic_rag' is supported.`);
  }
}
```

**Dependencies**:
- `documentAiProcessor` - Text extraction
- `optimizedAgenticRAGProcessor` - AI processing
- `llmService` - LLM interactions
- `pdfGenerationService` - PDF generation

**Error Handling**: Wraps errors with detailed context, validates analysisData presence

### 5.2 OptimizedAgenticRAGProcessor

**File**: `backend/src/services/optimizedAgenticRAGProcessor.ts` (1885 lines)
**Purpose**: Core AI processing engine for chunking, embeddings, and LLM analysis

**Key Configuration**:
```32:35:backend/src/services/optimizedAgenticRAGProcessor.ts
private readonly maxChunkSize = 4000;          // Optimal chunk size for embeddings
private readonly overlapSize = 200;            // Overlap between chunks
private readonly maxConcurrentEmbeddings = 5;  // Limit concurrent API calls
private readonly batchSize = 10;               // Process chunks in batches
```
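A minimal sketch of what these settings imply for plain fixed-size chunking; the real `createIntelligentChunks()` additionally snaps boundaries to paragraphs and sections:

```typescript
// Fixed-size chunking with overlap: each chunk starts
// (maxChunkSize - overlapSize) characters after the previous one, so
// consecutive chunks share an overlapSize-character window of text.
function chunkText(text: string, maxChunkSize = 4000, overlapSize = 200): string[] {
  const chunks: string[] = [];
  const step = maxChunkSize - overlapSize; // 3800 with the defaults above
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxChunkSize));
    if (start + maxChunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap exists so that a sentence falling on a chunk boundary is fully present in at least one chunk, which matters for both embedding quality and retrieval.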

**Key Methods**:
- `processLargeDocument()` - Main entry point
- `createIntelligentChunks()` - Semantic chunking with boundary detection
- `processChunksInBatches()` - Batch processing for memory efficiency
- `storeChunksOptimized()` - Embedding generation and storage
- `generateLLMAnalysisHybrid()` - LLM analysis with vector search

**Performance Optimizations**:
- Semantic boundary detection (paragraphs, sections)
- Batch processing to limit memory usage
- Concurrent embedding generation (max 5)
- Vector search with document_id filtering
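The batching-with-concurrency behavior (`batchSize = 10`, `maxConcurrentEmbeddings = 5`) can be approximated with a generic worker-pool helper. This is an illustrative sketch, not the service's actual implementation:

```typescript
// Map over items with at most `limit` promises in flight at once.
// Results are returned in input order regardless of completion order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      results[i] = await fn(items[i]);
    }
  }

  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```

Capping concurrency this way keeps memory bounded and avoids tripping the embedding API's rate limits while still overlapping network latency.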
### 5.3 JobProcessorService

**File**: `backend/src/services/jobProcessorService.ts`
**Purpose**: Database-backed job processor (replaces legacy in-memory queue)

**Key Method**:
```15:97:backend/src/services/jobProcessorService.ts
async processJobs(): Promise<{
  processed: number;
  succeeded: number;
  failed: number;
  skipped: number;
}> {
  // Prevent concurrent processing runs
  if (this.isProcessing) {
    logger.info('Job processor already running, skipping this run');
    return { processed: 0, succeeded: 0, failed: 0, skipped: 0 };
  }

  this.isProcessing = true;
  const stats = { processed: 0, succeeded: 0, failed: 0, skipped: 0 };

  try {
    // Reset stuck jobs first
    const resetCount = await ProcessingJobModel.resetStuckJobs(this.JOB_TIMEOUT_MINUTES);

    // Get pending jobs
    const pendingJobs = await ProcessingJobModel.getPendingJobs(this.MAX_CONCURRENT_JOBS);

    // Get retrying jobs
    const retryingJobs = await ProcessingJobModel.getRetryableJobs(
      Math.max(0, this.MAX_CONCURRENT_JOBS - pendingJobs.length)
    );

    const allJobs = [...pendingJobs, ...retryingJobs];

    // Process jobs in parallel (up to MAX_CONCURRENT_JOBS)
    const results = await Promise.allSettled(
      allJobs.map((job) => this.processJob(job.id))
    );
```

**Configuration**:
- `MAX_CONCURRENT_JOBS = 3`
- `JOB_TIMEOUT_MINUTES = 15`

**Features**:
- Stuck job detection and recovery
- Retry logic with exponential backoff
- Parallel processing with concurrency limit
- Database-backed state management
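The exponential backoff can be sketched as a doubling delay with a cap. The base delay and cap below are assumed values for illustration, not constants taken from `ProcessingJobModel`:

```typescript
// Exponential backoff: delay doubles with each attempt, capped so a
// repeatedly failing job is still retried at a bounded interval.
// baseMs and capMs are illustrative defaults, not the real config.
function retryDelayMs(
  attempt: number,           // 1-based retry attempt number
  baseMs = 60_000,           // assumed 1-minute base delay
  capMs = 15 * 60_000        // assumed 15-minute ceiling
): number {
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}
```

In the real service the computed `next_retry_at` lives in the `processing_jobs` row, so the schedule survives function restarts; the scheduled `processDocumentJobs` run simply picks up any job whose retry time has passed.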
### 5.4 VectorDatabaseService

**File**: `backend/src/services/vectorDatabaseService.ts`
**Purpose**: Vector embeddings and similarity search

**Key Method - Vector Search**:
```88:150:backend/src/services/vectorDatabaseService.ts
async searchSimilar(
  embedding: number[],
  limit: number = 10,
  threshold: number = 0.7,
  documentId?: string
): Promise<VectorSearchResult[]> {
  try {
    if (this.provider === 'supabase') {
      // Use optimized Supabase vector search function with document_id filtering
      // This prevents timeouts by only searching within a specific document
      const rpcParams: any = {
        query_embedding: embedding,
        match_threshold: threshold,
        match_count: limit
      };

      // Add document_id filter if provided (critical for performance)
      if (documentId) {
        rpcParams.filter_document_id = documentId;
      }

      // Set a timeout for the RPC call (10 seconds)
      const searchPromise = this.supabaseClient
        .rpc('match_document_chunks', rpcParams);

      const timeoutPromise = new Promise<{ data: null; error: { message: string } }>((_, reject) => {
        setTimeout(() => reject(new Error('Vector search timeout after 10s')), 10000);
      });

      let result: any;
      try {
        result = await Promise.race([searchPromise, timeoutPromise]);
      } catch (timeoutError: any) {
        if (timeoutError.message?.includes('timeout')) {
          logger.error('Vector search timed out', { documentId, timeout: '10s' });
          throw new Error('Vector search timeout after 10s');
        }
        throw timeoutError;
      }
```

**Critical Optimization**: Always pass `documentId` to filter search scope and prevent timeouts

**SQL Function**: `backend/sql/fix_vector_search_timeout.sql`
```10:39:backend/sql/fix_vector_search_timeout.sql
CREATE OR REPLACE FUNCTION match_document_chunks (
  query_embedding vector(1536),
  match_threshold float,
  match_count int,
  filter_document_id text DEFAULT NULL
)
RETURNS TABLE (
  id UUID,
  document_id TEXT,
  content text,
  metadata JSONB,
  chunk_index INT,
  similarity float
)
LANGUAGE sql STABLE
AS $$
  SELECT
    document_chunks.id,
    document_chunks.document_id,
    document_chunks.content,
    document_chunks.metadata,
    document_chunks.chunk_index,
    1 - (document_chunks.embedding <=> query_embedding) AS similarity
  FROM document_chunks
  WHERE document_chunks.embedding IS NOT NULL
    AND (filter_document_id IS NULL OR document_chunks.document_id = filter_document_id)
    AND 1 - (document_chunks.embedding <=> query_embedding) > match_threshold
  ORDER BY document_chunks.embedding <=> query_embedding
  LIMIT match_count;
$$;
```
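The `<=>` operator in the function above is pgvector's cosine distance, so the reported `similarity` is `1 - distance`. The same quantity, computed directly:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1;
// the SQL threshold (e.g. 0.7) filters on exactly this value.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical directions score 1, orthogonal vectors score 0, which is why `match_threshold: 0.7` reads as "fairly close in embedding space".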

### 5.5 LLMService

**File**: `backend/src/services/llmService.ts`
**Purpose**: LLM interactions (Claude/OpenAI/OpenRouter)

**Provider Selection**:
```43:103:backend/src/services/llmService.ts
constructor() {
  // Read provider from config (supports openrouter, anthropic, openai)
  this.provider = config.llm.provider;

  // CRITICAL: If provider is not set correctly, log and use fallback
  if (!this.provider || (this.provider !== 'openrouter' && this.provider !== 'anthropic' && this.provider !== 'openai')) {
    logger.error('LLM provider is invalid or not set', {
      provider: this.provider,
      configProvider: config.llm.provider,
      processEnvProvider: process.env['LLM_PROVIDER'],
      defaultingTo: 'anthropic'
    });
    this.provider = 'anthropic'; // Fallback
  }

  // Set API key based on provider
  if (this.provider === 'openai') {
    this.apiKey = config.llm.openaiApiKey!;
  } else if (this.provider === 'openrouter') {
    // OpenRouter: Use OpenRouter key if provided, otherwise use Anthropic key for BYOK
    this.apiKey = config.llm.openrouterApiKey || config.llm.anthropicApiKey!;
  } else {
    this.apiKey = config.llm.anthropicApiKey!;
  }

  // Use configured model instead of hardcoded value
  this.defaultModel = config.llm.model;
  this.maxTokens = config.llm.maxTokens;
  this.temperature = config.llm.temperature;
}
```

**Key Method**:
```108:148:backend/src/services/llmService.ts
async processCIMDocument(text: string, template: string, analysis?: Record<string, any>): Promise<CIMAnalysisResult> {
  // Check and truncate text if it exceeds maxInputTokens
  const maxInputTokens = config.llm.maxInputTokens || 200000;
  const systemPromptTokens = this.estimateTokenCount(this.getCIMSystemPrompt());
  const templateTokens = this.estimateTokenCount(template);
  const promptBuffer = config.llm.promptBuffer || 1000;

  // Calculate available tokens for document text
  const reservedTokens = systemPromptTokens + templateTokens + promptBuffer + (config.llm.maxTokens || 16000);
  const availableTokens = maxInputTokens - reservedTokens;

  const textTokens = this.estimateTokenCount(text);
  let processedText = text;
  let wasTruncated = false;

  if (textTokens > availableTokens) {
    logger.warn('Document text exceeds token limit, truncating', {
      textTokens,
      availableTokens,
      maxInputTokens,
      reservedTokens,
      truncationRatio: (availableTokens / textTokens * 100).toFixed(1) + '%'
    });

    processedText = this.truncateText(text, availableTokens);
    wasTruncated = true;
  }
```

**Features**:
- Automatic token counting and truncation
- Model selection based on task complexity
- JSON schema validation with Zod
- Retry logic with exponential backoff
- Cost tracking
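The token-budget arithmetic in `processCIMDocument()` reduces to the sketch below. The ~4 characters/token heuristic is an assumption commonly used for rough estimates; the real `estimateTokenCount()` may use a different rule:

```typescript
// Token budgeting sketch: reserve tokens for system prompt, template,
// buffer, and the model's output, then truncate the document to fit.
// CHARS_PER_TOKEN is an assumed heuristic, not the service's constant.
const CHARS_PER_TOKEN = 4;

function estimateTokenCount(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function truncateToBudget(
  text: string,
  maxInputTokens: number,  // e.g. 200000
  reservedTokens: number   // system prompt + template + buffer + maxTokens
): { text: string; wasTruncated: boolean } {
  const availableTokens = maxInputTokens - reservedTokens;
  if (estimateTokenCount(text) <= availableTokens) {
    return { text, wasTruncated: false };
  }
  // Convert the token budget back to a character budget and cut there.
  return {
    text: text.slice(0, availableTokens * CHARS_PER_TOKEN),
    wasTruncated: true
  };
}
```

Note that output tokens (`maxTokens`, default 16000 above) count against the reservation: a model that is allowed a long answer leaves correspondingly less room for input.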
### 5.6 DocumentAiProcessor

**File**: `backend/src/services/documentAiProcessor.ts`
**Purpose**: Google Document AI integration for text extraction

**Key Method**:
```50:146:backend/src/services/documentAiProcessor.ts
async processDocument(
  documentId: string,
  userId: string,
  fileBuffer: Buffer,
  fileName: string,
  mimeType: string
): Promise<ProcessingResult> {
  const startTime = Date.now();

  try {
    logger.info('Starting Document AI + Agentic RAG processing', {
      documentId,
      userId,
      fileName,
      fileSize: fileBuffer.length,
      mimeType
    });

    // Step 1: Extract text using Document AI or fallback
    const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);

    if (!extractedText) {
      throw new Error('Failed to extract text from document');
    }

    logger.info('Text extraction completed', {
      textLength: extractedText.length
    });

    // Step 2: Process extracted text through Agentic RAG
    const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);

    const processingTime = Date.now() - startTime;

    return {
      success: true,
      content: agenticRagResult.summary || extractedText,
      metadata: {
        processingStrategy: 'document_ai_agentic_rag',
        processingTime,
        extractedTextLength: extractedText.length,
        agenticRagResult,
        fileSize: fileBuffer.length,
        fileName,
        mimeType
      }
    };
```

**Fallback Strategy**: Uses `pdf-parse` if Document AI fails

### 5.7 PDFGenerationService
|
|
|
|
**File**: `backend/src/services/pdfGenerationService.ts`
|
|
**Purpose**: PDF generation using Puppeteer
|
|
|
|
**Key Features**:
|
|
- Page pooling for performance
|
|
- Caching for repeated requests
|
|
- Browser instance reuse
|
|
- Fallback to PDFKit if Puppeteer fails
|
|
|
|
**Configuration**:
|
|
```65:85:backend/src/services/pdfGenerationService.ts
class PDFGenerationService {
  private browser: any = null;
  private pagePool: PagePool[] = [];
  private readonly maxPoolSize = 5;
  private readonly pageTimeout = 30000; // 30 seconds
  private readonly cache = new Map<string, { buffer: Buffer; timestamp: number }>();
  private readonly cacheTimeout = 300000; // 5 minutes

  private readonly defaultOptions: PDFGenerationOptions = {
    format: 'A4',
    margin: {
      top: '1in',
      right: '1in',
      bottom: '1in',
      left: '1in',
    },
    displayHeaderFooter: true,
    printBackground: true,
    quality: 'high',
    timeout: 30000,
  };
```

### 5.8 FileStorageService

**File**: `backend/src/services/fileStorageService.ts`
**Purpose**: Google Cloud Storage operations

**Key Methods**:
- `generateSignedUploadUrl()` - Generate signed URL for direct upload
- `downloadFile()` - Download file from GCS
- `saveBuffer()` - Save buffer to GCS
- `deleteFile()` - Delete file from GCS

**Credential Handling**:
```40:145:backend/src/services/fileStorageService.ts
constructor() {
  this.bucketName = config.googleCloud.gcsBucketName;

  // Check if we're in Firebase Functions/Cloud Run environment
  const isCloudEnvironment = process.env.FUNCTION_TARGET ||
    process.env.FUNCTION_NAME ||
    process.env.K_SERVICE ||
    process.env.GOOGLE_CLOUD_PROJECT ||
    !!process.env.GCLOUD_PROJECT ||
    process.env.X_GOOGLE_GCLOUD_PROJECT;

  // Initialize Google Cloud Storage
  const storageConfig: any = {
    projectId: config.googleCloud.projectId,
  };

  // Only use keyFilename in local development
  // In Firebase Functions/Cloud Run, use Application Default Credentials
  if (isCloudEnvironment) {
    // In cloud, ALWAYS clear GOOGLE_APPLICATION_CREDENTIALS to force use of ADC
    // Firebase Functions automatically provides credentials via metadata service
    // These credentials have signing capabilities for generating signed URLs
    const originalCreds = process.env.GOOGLE_APPLICATION_CREDENTIALS;
    if (originalCreds) {
      delete process.env.GOOGLE_APPLICATION_CREDENTIALS;
      logger.info('Using Application Default Credentials for GCS (cloud environment)', {
        clearedEnvVar: 'GOOGLE_APPLICATION_CREDENTIALS',
        originalValue: originalCreds,
        projectId: config.googleCloud.projectId
      });
    }
```

---

## 6. Data Models & Database Schema

### Core Models

**DocumentModel** (`backend/src/models/DocumentModel.ts`):
- `create()` - Create document record
- `findById()` - Get document by ID
- `updateById()` - Update document status/metadata
- `findByUserId()` - List user's documents

**ProcessingJobModel** (`backend/src/models/ProcessingJobModel.ts`):
- `create()` - Create processing job (uses direct PostgreSQL to bypass the PostgREST cache)
- `findById()` - Get job by ID
- `getPendingJobs()` - Get pending jobs (limited by concurrency)
- `getRetryableJobs()` - Get jobs ready for retry
- `markAsProcessing()` - Update job status
- `markAsCompleted()` - Mark job complete
- `markAsFailed()` - Mark job failed with error
- `resetStuckJobs()` - Reset jobs stuck in processing

**VectorDatabaseModel** (`backend/src/models/VectorDatabaseModel.ts`):
- Chunk storage and retrieval
- Embedding management

### Database Tables

**documents**:
- `id` (UUID, primary key)
- `user_id` (UUID, foreign key)
- `original_file_name` (text)
- `file_path` (text, GCS path)
- `file_size` (bigint)
- `status` (text: 'uploading', 'uploaded', 'processing_llm', 'completed', 'failed')
- `pdf_path` (text, GCS path for generated PDF)
- `created_at`, `updated_at` (timestamps)

**processing_jobs**:
- `id` (UUID, primary key)
- `document_id` (UUID, foreign key)
- `user_id` (UUID, foreign key)
- `status` (text: 'pending', 'processing', 'completed', 'failed', 'retrying')
- `attempts` (int)
- `max_attempts` (int, default 3)
- `options` (JSONB, processing options)
- `error` (text, error message if failed)
- `result` (JSONB, processing result)
- `created_at`, `started_at`, `completed_at`, `updated_at` (timestamps)

**document_chunks**:
- `id` (UUID, primary key)
- `document_id` (text, foreign key)
- `content` (text)
- `embedding` (vector(1536))
- `metadata` (JSONB)
- `chunk_index` (int)
- `created_at`, `updated_at` (timestamps)

**agentic_rag_sessions**:
- `id` (UUID, primary key)
- `document_id` (UUID, foreign key)
- `user_id` (UUID, foreign key)
- `status` (text)
- `metadata` (JSONB)
- `created_at`, `updated_at` (timestamps)

### Vector Search Optimization

**Critical SQL Function**: `match_document_chunks` with `document_id` filtering
```10:39:backend/sql/fix_vector_search_timeout.sql
CREATE OR REPLACE FUNCTION match_document_chunks (
  query_embedding vector(1536),
  match_threshold float,
  match_count int,
  filter_document_id text DEFAULT NULL
)
RETURNS TABLE (
  id UUID,
  document_id TEXT,
  content text,
  metadata JSONB,
  chunk_index INT,
  similarity float
)
LANGUAGE sql STABLE
AS $$
  SELECT
    document_chunks.id,
    document_chunks.document_id,
    document_chunks.content,
    document_chunks.metadata,
    document_chunks.chunk_index,
    1 - (document_chunks.embedding <=> query_embedding) AS similarity
  FROM document_chunks
  WHERE document_chunks.embedding IS NOT NULL
    AND (filter_document_id IS NULL OR document_chunks.document_id = filter_document_id)
    AND 1 - (document_chunks.embedding <=> query_embedding) > match_threshold
  ORDER BY document_chunks.embedding <=> query_embedding
  LIMIT match_count;
$$;
```

**Always pass `filter_document_id`** to prevent timeouts when searching across all documents.
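A sketch of the calling pattern, assuming a configured Supabase client. `buildMatchParams` is a hypothetical helper; the parameter names mirror the SQL function above.

```typescript
// Hypothetical helper illustrating the pattern of always scoping the
// RPC call to one document; only the RPC and parameter names come from
// the actual codebase.
interface MatchParams {
  query_embedding: number[];
  match_threshold: number;
  match_count: number;
  filter_document_id?: string;
}

function buildMatchParams(
  queryEmbedding: number[],
  documentId?: string,
  threshold = 0.5,
  count = 10
): MatchParams {
  const params: MatchParams = {
    query_embedding: queryEmbedding,
    match_threshold: threshold,
    match_count: count,
  };
  // Critical for performance: scope the search to a single document
  if (documentId) {
    params.filter_document_id = documentId;
  }
  return params;
}

// Usage (assuming a configured Supabase client):
// const { data, error } = await supabase.rpc('match_document_chunks',
//   buildMatchParams(embedding, documentId));
```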

---

## 7. Component Handoffs & Integration Points

### Frontend ↔ Backend

**Axios Interceptor** (`frontend/src/services/documentService.ts`):
```8:54:frontend/src/services/documentService.ts
export const apiClient = axios.create({
  baseURL: API_BASE_URL,
  timeout: 300000, // 5 minutes
});

// Add auth token to requests
apiClient.interceptors.request.use(async (config) => {
  const token = await authService.getToken();
  if (token) {
    config.headers.Authorization = `Bearer ${token}`;
  }
  return config;
});

// Handle auth errors with retry logic
apiClient.interceptors.response.use(
  (response) => response,
  async (error) => {
    const originalRequest = error.config;

    if (error.response?.status === 401 && !originalRequest._retry) {
      originalRequest._retry = true;

      try {
        // Attempt to refresh the token
        const newToken = await authService.getToken();
        if (newToken) {
          // Retry the original request with the new token
          originalRequest.headers.Authorization = `Bearer ${newToken}`;
          return apiClient(originalRequest);
        }
      } catch (refreshError) {
        console.error('Token refresh failed:', refreshError);
      }

      // If token refresh fails, logout the user
      authService.logout();
      window.location.href = '/login';
    }

    return Promise.reject(error);
  }
);
```

### Backend ↔ Database

**Two Connection Methods**:

1. **Supabase Client** (default for most operations):
```typescript
import { getSupabaseServiceClient } from '../config/supabase';
const supabase = getSupabaseServiceClient();
```

2. **Direct PostgreSQL** (for critical operations; bypasses the PostgREST cache):
```47:81:backend/src/models/ProcessingJobModel.ts
static async create(data: CreateProcessingJobData): Promise<ProcessingJob> {
  try {
    // Use direct PostgreSQL connection to bypass PostgREST cache
    // This is critical because PostgREST cache issues can block the entire processing pipeline
    const pool = getPostgresPool();

    const result = await pool.query(
      `INSERT INTO processing_jobs (
        document_id, user_id, status, attempts, max_attempts, options, created_at
      ) VALUES ($1, $2, $3, $4, $5, $6, $7)
      RETURNING *`,
      [
        data.document_id,
        data.user_id,
        'pending',
        0,
        data.max_attempts || 3,
        JSON.stringify(data.options || {}),
        new Date().toISOString()
      ]
    );

    if (result.rows.length === 0) {
      throw new Error('Failed to create processing job: No data returned');
    }

    const job = result.rows[0];

    logger.info('Processing job created via direct PostgreSQL', {
      jobId: job.id,
      documentId: data.document_id,
      userId: data.user_id,
    });

    return job;
```

### Backend ↔ GCS

**Signed URL Generation**:
```typescript
const uploadUrl = await fileStorageService.generateSignedUploadUrl(filePath, contentType);
```

**Direct Upload** (frontend):
```403:410:frontend/src/services/documentService.ts
const fetchPromise = fetch(uploadUrl, {
  method: 'PUT',
  headers: {
    'Content-Type': contentType, // Must match exactly what was used in signed URL generation
  },
  body: file,
  signal: signal,
});
```

**File Download** (for processing):
```typescript
const fileBuffer = await fileStorageService.downloadFile(document.file_path);
```

### Backend ↔ Document AI

**Text Extraction**:
```148:249:backend/src/services/documentAiProcessor.ts
private async extractTextFromDocument(fileBuffer: Buffer, fileName: string, mimeType: string): Promise<string> {
  try {
    // Check document size first
    // ... size validation ...

    // Upload to GCS for Document AI processing
    const gcsFileName = `temp/${Date.now()}_${fileName}`;
    await this.storageClient.bucket(this.gcsBucketName).file(gcsFileName).save(fileBuffer);

    // Process with Document AI
    const request = {
      name: this.processorName,
      rawDocument: {
        gcsSource: {
          uri: `gs://${this.gcsBucketName}/${gcsFileName}`
        },
        mimeType: mimeType
      }
    };

    const [result] = await this.documentAiClient.processDocument(request);

    // Extract text from result
    const text = result.document?.text || '';

    // Clean up temp file
    await this.storageClient.bucket(this.gcsBucketName).file(gcsFileName).delete();

    return text;
  } catch (error) {
    // Fallback to pdf-parse
    logger.warn('Document AI failed, using pdf-parse fallback', { error });
    const data = await pdf(fileBuffer);
    return data.text;
  }
}
```

### Backend ↔ LLM APIs

**Provider Selection** (Claude/OpenAI/OpenRouter):
- Configured via the `LLM_PROVIDER` environment variable
- Automatic API key selection based on the provider
- Model selection based on task complexity

**Request Flow**:
```typescript
// 1. Token counting and truncation
const processedText = this.truncateText(text, availableTokens);

// 2. Model selection
const model = this.selectModel(taskComplexity);

// 3. API call with retry logic
const response = await this.callLLMAPI({
  prompt: processedText,
  systemPrompt: systemPrompt,
  model: model,
  maxTokens: this.maxTokens,
  temperature: this.temperature
});

// 4. JSON parsing and validation
const parsed = JSON.parse(response.content);
const validated = cimReviewSchema.parse(parsed);
```

### Services ↔ Services

**Event-Driven Patterns**:
- `jobQueueService` emits events: `job:added`, `job:started`, `job:completed`, `job:failed`
- `uploadMonitoringService` tracks upload events

**Direct Method Calls**:
- Most service interactions are direct method calls
- Services are exported as singletons for easy access

---

## 8. Error Handling & Resilience

### Error Propagation Path

```
Service Method
  │
  ▼ (throws error)
Controller
  │
  ▼ (catches, logs, re-throws)
Express Error Handler
  │
  ▼ (categorizes, logs, responds)
Client (structured error response)
```

### Error Categories

**File**: `backend/src/middleware/errorHandler.ts`
```17:26:backend/src/middleware/errorHandler.ts
export enum ErrorCategory {
  VALIDATION = 'validation',
  AUTHENTICATION = 'authentication',
  AUTHORIZATION = 'authorization',
  NOT_FOUND = 'not_found',
  EXTERNAL_SERVICE = 'external_service',
  PROCESSING = 'processing',
  SYSTEM = 'system',
  DATABASE = 'database'
}
```

**Error Response Structure**:
```29:39:backend/src/middleware/errorHandler.ts
export interface ErrorResponse {
  success: false;
  error: {
    code: string;
    message: string;
    details?: any;
    correlationId: string;
    timestamp: string;
    retryable: boolean;
  };
}
```
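A minimal sketch of how a handler might map errors into this response shape. `AppError`, the category subset, and the category-to-status mapping are assumptions for illustration, not the actual implementation in `errorHandler.ts`.

```typescript
// Illustrative mapping from a categorized error to the ErrorResponse
// shape; AppError and STATUS_BY_CATEGORY are hypothetical.
type Category = 'validation' | 'authentication' | 'not_found' | 'system';

class AppError extends Error {
  constructor(
    message: string,
    public category: Category = 'system',
    public retryable = false
  ) {
    super(message);
  }
}

const STATUS_BY_CATEGORY: Record<Category, number> = {
  validation: 400,
  authentication: 401,
  not_found: 404,
  system: 500,
};

function toErrorResponse(err: AppError, correlationId: string) {
  return {
    status: STATUS_BY_CATEGORY[err.category],
    body: {
      success: false as const,
      error: {
        code: err.category.toUpperCase(),
        message: err.message,
        correlationId,
        timestamp: new Date().toISOString(),
        retryable: err.retryable,
      },
    },
  };
}
```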

### Retry Mechanisms

**1. Job Retries**:
- Max attempts: 3 (configurable per job)
- Exponential backoff between retries
- Jobs are marked with 'retrying' status

**2. API Retries**:
- LLM API calls: 3 retries with exponential backoff
- Document AI: falls back to pdf-parse
- Vector search: 10-second timeout, then fallback to a direct query

**3. Database Retries**:
```10:46:backend/src/models/DocumentModel.ts
private static async retryOperation<T>(
  operation: () => Promise<T>,
  operationName: string,
  maxRetries: number = 3,
  baseDelay: number = 1000
): Promise<T> {
  let lastError: any;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error: any) {
      lastError = error;
      const isNetworkError = error?.message?.includes('fetch failed') ||
        error?.message?.includes('ENOTFOUND') ||
        error?.message?.includes('ECONNREFUSED') ||
        error?.message?.includes('ETIMEDOUT') ||
        error?.name === 'TypeError';

      if (!isNetworkError || attempt === maxRetries) {
        throw error;
      }

      const delay = baseDelay * Math.pow(2, attempt - 1);
      logger.warn(`${operationName} failed (attempt ${attempt}/${maxRetries}), retrying in ${delay}ms`, {
        error: error?.message || String(error),
        code: error?.code,
        attempt,
        maxRetries
      });

      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }

  throw lastError;
}
```

### Timeout Handling

**Vector Search Timeout**:
```109:126:backend/src/services/vectorDatabaseService.ts
// Set a timeout for the RPC call (10 seconds)
const searchPromise = this.supabaseClient
  .rpc('match_document_chunks', rpcParams);

const timeoutPromise = new Promise<{ data: null; error: { message: string } }>((_, reject) => {
  setTimeout(() => reject(new Error('Vector search timeout after 10s')), 10000);
});

let result: any;
try {
  result = await Promise.race([searchPromise, timeoutPromise]);
} catch (timeoutError: any) {
  if (timeoutError.message?.includes('timeout')) {
    logger.error('Vector search timed out', { documentId, timeout: '10s' });
    throw new Error('Vector search timeout after 10s');
  }
  throw timeoutError;
}
```

**LLM API Timeout**: Handled by the axios timeout configuration

**Job Timeout**: 15 minutes; jobs stuck longer are reset

### Stuck Job Detection and Recovery
```34:37:backend/src/services/jobProcessorService.ts
// Reset stuck jobs first
const resetCount = await ProcessingJobModel.resetStuckJobs(this.JOB_TIMEOUT_MINUTES);
if (resetCount > 0) {
  logger.info('Reset stuck jobs', { count: resetCount });
}
```

**Scheduled Function Monitoring**:
```228:246:backend/src/index.ts
// Check for jobs stuck in processing status
const stuckProcessingJobs = await ProcessingJobModel.getStuckJobs(15); // Jobs stuck > 15 minutes
if (stuckProcessingJobs.length > 0) {
  logger.warn('Found stuck processing jobs', {
    count: stuckProcessingJobs.length,
    jobIds: stuckProcessingJobs.map(j => j.id),
    timestamp: new Date().toISOString(),
  });
}

// Check for jobs stuck in pending status (alert if > 2 minutes)
const stuckPendingJobs = await ProcessingJobModel.getStuckPendingJobs(2); // Jobs pending > 2 minutes
if (stuckPendingJobs.length > 0) {
  logger.warn('Found stuck pending jobs (may indicate processing issues)', {
    count: stuckPendingJobs.length,
    jobIds: stuckPendingJobs.map(j => j.id),
    oldestJobAge: stuckPendingJobs[0] ? Math.round((Date.now() - new Date(stuckPendingJobs[0].created_at).getTime()) / 1000 / 60) : 0,
    timestamp: new Date().toISOString(),
  });
}
```

### Graceful Degradation

**Document AI Failure**: Falls back to the `pdf-parse` library

**Vector Search Failure**: Falls back to a direct database query without similarity calculation

**LLM API Failure**: Returns an error with a retryable flag; the job can be retried

**PDF Generation Failure**: Falls back to PDFKit if Puppeteer fails

---

## 9. Performance Optimization Points

### Vector Search Optimization

**Critical**: Always pass the `document_id` filter to prevent timeouts
```104:107:backend/src/services/vectorDatabaseService.ts
// Add document_id filter if provided (critical for performance)
if (documentId) {
  rpcParams.filter_document_id = documentId;
}
```

**SQL Function Optimization**: `match_document_chunks` filters by `document_id` before computing vector similarity

### Chunking Strategy

**Optimal Configuration**:
```32:35:backend/src/services/optimizedAgenticRAGProcessor.ts
private readonly maxChunkSize = 4000; // Optimal chunk size for embeddings
private readonly overlapSize = 200; // Overlap between chunks
private readonly maxConcurrentEmbeddings = 5; // Limit concurrent API calls
private readonly batchSize = 10; // Process chunks in batches
```

**Semantic Chunking**: Detects paragraph and section boundaries for better chunk quality
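A minimal sketch of overlap-based chunking with paragraph-boundary snapping, using the configured sizes; the actual implementation in `optimizedAgenticRAGProcessor` may differ.

```typescript
// Illustrative chunker: fixed-size windows with overlap, preferring to
// break at a paragraph boundary ('\n\n') in the second half of the window.
function chunkText(
  text: string,
  maxChunkSize = 4000,
  overlapSize = 200
): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChunkSize, text.length);
    if (end < text.length) {
      // Prefer to break at a paragraph boundary within the window
      const boundary = text.lastIndexOf('\n\n', end);
      if (boundary > start + maxChunkSize / 2) {
        end = boundary;
      }
    }
    chunks.push(text.slice(start, end));
    if (end === text.length) break;
    // Step back by the overlap so adjacent chunks share context
    start = end - overlapSize;
  }
  return chunks;
}
```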

### Batch Processing

**Embedding Generation**:
- Processes chunks in batches of 10
- Max 5 concurrent embedding API calls
- Prevents memory overflow and API rate limiting

**Chunk Storage**:
- Batched database inserts
- Reduces database round trips
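The batching pattern above can be sketched as follows; `embedInBatches` and `embedFn` are illustrative names, not the service's actual API.

```typescript
// Illustrative batched processing with a concurrency cap: items are
// grouped into batches, and within a batch at most maxConcurrent
// embedding calls run at a time.
async function embedInBatches<T, R>(
  items: T[],
  embedFn: (item: T) => Promise<R>,
  batchSize = 10,
  maxConcurrent = 5
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Within a batch, run at most maxConcurrent calls in parallel
    for (let j = 0; j < batch.length; j += maxConcurrent) {
      const window = batch.slice(j, j + maxConcurrent);
      const settled = await Promise.all(window.map(embedFn));
      results.push(...settled);
    }
  }
  return results;
}
```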
### Memory Management

**Chunk Processing**:
- Processes chunks in batches to limit memory usage
- Cleans up processed chunks from memory after storage

**PDF Generation**:
- Page pooling (max 5 pages)
- Page timeout (30 seconds)
- Cache with 5-minute TTL

### Database Optimization

**Direct PostgreSQL for Critical Operations**:
- Job creation uses direct PostgreSQL to bypass PostgREST cache issues
- Ensures reliable job creation even when the PostgREST schema cache is stale

**Connection Pooling**:
- The Supabase client uses connection pooling
- Direct PostgreSQL uses a pg pool

### API Call Optimization

**LLM Token Management**:
- Automatic token counting
- Text truncation if input exceeds limits
- Model selection based on complexity (smaller models for simpler tasks)

**Embedding Caching**:
```31:32:backend/src/services/vectorDatabaseService.ts
private semanticCache: Map<string, { embedding: number[]; timestamp: number }> = new Map();
private readonly CACHE_TTL = 3600000; // 1 hour cache TTL
```
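A minimal sketch of a TTL lookup over such a cache; `getCachedEmbedding` is an illustrative name, and lazy eviction on read is an assumption about the implementation.

```typescript
// Illustrative TTL cache lookup: return the cached embedding if fresh,
// otherwise evict the stale entry and report a miss.
const semanticCache = new Map<string, { embedding: number[]; timestamp: number }>();
const CACHE_TTL = 3600000; // 1 hour

function getCachedEmbedding(key: string, now = Date.now()): number[] | null {
  const entry = semanticCache.get(key);
  if (!entry) return null;
  if (now - entry.timestamp > CACHE_TTL) {
    semanticCache.delete(key); // Evict stale entries lazily
    return null;
  }
  return entry.embedding;
}
```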

---

## 10. Background Processing Architecture

### Legacy vs Current System

**Legacy: In-Memory Queue** (`jobQueueService`)
- EventEmitter-based
- In-memory job storage
- Still initialized, but being phased out
- Location: `backend/src/services/jobQueueService.ts`

**Current: Database-Backed Queue** (`jobProcessorService`)
- Database-backed job storage
- Scheduled processing via Firebase Cloud Scheduler
- Location: `backend/src/services/jobProcessorService.ts`

### Job Processing Flow

```
Job Creation
  │
  ▼
ProcessingJobModel.create()
  │
  ▼
Status: 'pending' in database
  │
  ▼
Scheduled Function (every 1 minute)
  OR
Immediate processing via API
  │
  ▼
JobProcessorService.processJobs()
  │
  ▼
Get pending/retrying jobs (max 3 concurrent)
  │
  ▼
Process jobs in parallel
  │
  ▼
For each job:
  - Mark as 'processing'
  - Download file from GCS
  - Call unifiedDocumentProcessor
  - Update document status
  - Mark job as 'completed' or 'failed'
```

### Scheduled Function

**File**: `backend/src/index.ts`
```210:267:backend/src/index.ts
export const processDocumentJobs = onSchedule({
  schedule: 'every 1 minutes', // Minimum interval for Firebase Cloud Scheduler
  timeoutSeconds: 900, // 15 minutes (max for Gen2 scheduled functions)
  memory: '1GiB',
  retryCount: 2, // Retry up to 2 times on failure
}, async (event) => {
  logger.info('Processing document jobs scheduled function triggered', {
    timestamp: new Date().toISOString(),
    scheduleTime: event.scheduleTime,
  });

  try {
    const { jobProcessorService } = await import('./services/jobProcessorService');

    // Check for stuck jobs before processing (monitoring)
    const { ProcessingJobModel } = await import('./models/ProcessingJobModel');

    // Check for jobs stuck in processing status
    const stuckProcessingJobs = await ProcessingJobModel.getStuckJobs(15); // Jobs stuck > 15 minutes
    if (stuckProcessingJobs.length > 0) {
      logger.warn('Found stuck processing jobs', {
        count: stuckProcessingJobs.length,
        jobIds: stuckProcessingJobs.map(j => j.id),
        timestamp: new Date().toISOString(),
      });
    }

    // Check for jobs stuck in pending status (alert if > 2 minutes)
    const stuckPendingJobs = await ProcessingJobModel.getStuckPendingJobs(2); // Jobs pending > 2 minutes
    if (stuckPendingJobs.length > 0) {
      logger.warn('Found stuck pending jobs (may indicate processing issues)', {
        count: stuckPendingJobs.length,
        jobIds: stuckPendingJobs.map(j => j.id),
        oldestJobAge: stuckPendingJobs[0] ? Math.round((Date.now() - new Date(stuckPendingJobs[0].created_at).getTime()) / 1000 / 60) : 0,
        timestamp: new Date().toISOString(),
      });
    }

    const result = await jobProcessorService.processJobs();

    logger.info('Document jobs processing completed', {
      ...result,
      timestamp: new Date().toISOString(),
    });
  } catch (error) {
    const errorMessage = error instanceof Error ? error.message : String(error);
    const errorStack = error instanceof Error ? error.stack : undefined;

    logger.error('Error processing document jobs', {
      error: errorMessage,
      stack: errorStack,
      timestamp: new Date().toISOString(),
    });

    // Re-throw to trigger retry mechanism (up to retryCount times)
    throw error;
  }
});
```

### Job States

```
pending → processing → completed
   │           │
   │           ▼
   │         failed
   │           │
   └───────────┘
         │
         ▼
      retrying
         │
         ▼
 (back to pending)
```
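The transitions in the diagram above can be expressed as a guard table. This is purely an illustration of the state machine; the actual enforcement happens inside the `ProcessingJobModel` methods, and the `canTransition` helper is hypothetical.

```typescript
// Illustrative transition guard for the job state machine; the table
// mirrors the diagram above.
const TRANSITIONS: Record<string, string[]> = {
  pending: ['processing'],
  processing: ['completed', 'failed'],
  failed: ['retrying'],
  retrying: ['pending'],
  completed: [],
};

function canTransition(from: string, to: string): boolean {
  return (TRANSITIONS[from] ?? []).includes(to);
}
```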

### Concurrency Control

**Max Concurrent Jobs**: 3

```9:10:backend/src/services/jobProcessorService.ts
private readonly MAX_CONCURRENT_JOBS = 3;
private readonly JOB_TIMEOUT_MINUTES = 15;
```

**Processing Logic**:
```40:63:backend/src/services/jobProcessorService.ts
// Get pending jobs
const pendingJobs = await ProcessingJobModel.getPendingJobs(this.MAX_CONCURRENT_JOBS);

// Get retrying jobs (enabled - schema is updated)
const retryingJobs = await ProcessingJobModel.getRetryableJobs(
  Math.max(0, this.MAX_CONCURRENT_JOBS - pendingJobs.length)
);

const allJobs = [...pendingJobs, ...retryingJobs];

if (allJobs.length === 0) {
  logger.debug('No jobs to process');
  return stats;
}

logger.info('Processing jobs', {
  totalJobs: allJobs.length,
  pendingJobs: pendingJobs.length,
  retryingJobs: retryingJobs.length,
});

// Process jobs in parallel (up to MAX_CONCURRENT_JOBS)
const results = await Promise.allSettled(
  allJobs.map((job) => this.processJob(job.id))
);
```

---

## 11. Frontend Architecture

### Component Structure

**Main Components**:
- `DocumentUpload` - File upload with drag-and-drop
- `DocumentList` - List of the user's documents with status
- `DocumentViewer` - View the processed document and PDF
- `Analytics` - Processing statistics dashboard
- `UploadMonitoringDashboard` - Real-time upload monitoring

### State Management

**AuthContext** (`frontend/src/contexts/AuthContext.tsx`):
```11:46:frontend/src/contexts/AuthContext.tsx
export const AuthProvider: React.FC<AuthProviderProps> = ({ children }) => {
  const [user, setUser] = useState<User | null>(null);
  const [token, setToken] = useState<string | null>(null);
  const [isLoading, setIsLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);
  const [isInitialized, setIsInitialized] = useState(false);

  useEffect(() => {
    setIsLoading(true);

    // Listen for Firebase auth state changes
    const unsubscribe = authService.onAuthStateChanged(async (firebaseUser) => {
      try {
        if (firebaseUser) {
          const user = authService.getCurrentUser();
          const token = await authService.getToken();
          setUser(user);
          setToken(token);
        } else {
          setUser(null);
          setToken(null);
        }
      } catch (error) {
        console.error('Auth state change error:', error);
        setError('Authentication error occurred');
        setUser(null);
        setToken(null);
      } finally {
        setIsLoading(false);
        setIsInitialized(true);
      }
    });

    // Cleanup subscription on unmount
    return () => unsubscribe();
  }, []);
```

### API Communication

**Document Service** (`frontend/src/services/documentService.ts`):
- Axios client with an auth interceptor
- Automatic token refresh on 401 errors
- Progress tracking for uploads
- Error handling with user-friendly messages

**Upload Flow**:
```224:361:frontend/src/services/documentService.ts
async uploadDocument(
  file: File,
  onProgress?: (progress: number) => void,
  signal?: AbortSignal
): Promise<Document> {
  try {
    // Check authentication before upload
    const token = await authService.getToken();
    if (!token) {
      throw new Error('Authentication required. Please log in to upload documents.');
    }

    // Step 1: Get signed upload URL
    onProgress?.(5); // 5% - Getting upload URL

    const uploadUrlResponse = await apiClient.post('/documents/upload-url', {
      fileName: file.name,
      fileSize: file.size,
      contentType: contentTypeForSigning
    }, { signal });

    const { documentId, uploadUrl } = uploadUrlResponse.data;

    // Step 2: Upload directly to Firebase Storage
    onProgress?.(10); // 10% - Starting direct upload

    await this.uploadToFirebaseStorage(
      file,
      uploadUrl,
      contentTypeForSigning,
      (uploadProgress) => {
        // Map upload progress (10-90%)
        const mappedProgress = 10 + (uploadProgress * 0.8);
        onProgress?.(mappedProgress);
      },
      signal
    );

    // Step 3: Confirm upload
    onProgress?.(90); // 90% - Confirming upload

    const confirmResponse = await apiClient.post(
      `/documents/${documentId}/confirm-upload`,
      {},
      { signal }
    );

    onProgress?.(100); // 100% - Complete

    return confirmResponse.data.document;
  } catch (error) {
    // ... error handling ...
  }
}
```

### Real-Time Updates

**Polling for Processing Status**:
- The frontend polls the `/documents/:id` endpoint
- Updates the UI when the status changes from 'processing' to 'completed'
- Shows error messages if the status is 'failed'

**Upload Progress**:
- Real-time progress tracking via the `onProgress` callback
- Visual progress bar in the `DocumentUpload` component
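The status polling loop can be sketched as follows; `fetchStatus`, the 2-second interval, and the attempt cap are assumptions standing in for the real `GET /documents/:id` call.

```typescript
// Illustrative polling loop: keep fetching the document status until it
// reaches a terminal state or the attempt budget is exhausted.
type DocStatus = 'uploading' | 'uploaded' | 'processing_llm' | 'completed' | 'failed';

async function pollUntilDone(
  fetchStatus: () => Promise<DocStatus>,
  intervalMs = 2000,
  maxAttempts = 150
): Promise<DocStatus> {
  for (let i = 0; i < maxAttempts; i++) {
    const status = await fetchStatus();
    if (status === 'completed' || status === 'failed') {
      return status;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Polling timed out');
}
```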
---

## 12. Configuration & Environment

### Environment Variables

**File**: `backend/src/config/env.ts`

**Key Configuration Categories**:

1. **LLM Provider**:
   - `LLM_PROVIDER` - 'anthropic', 'openai', or 'openrouter'
   - `ANTHROPIC_API_KEY` - Claude API key
   - `OPENAI_API_KEY` - OpenAI API key
   - `OPENROUTER_API_KEY` - OpenRouter API key
   - `LLM_MODEL` - Model name (e.g., 'claude-sonnet-4-5-20250929')
   - `LLM_MAX_TOKENS` - Max output tokens
   - `LLM_MAX_INPUT_TOKENS` - Max input tokens (default 200000)

2. **Database**:
   - `SUPABASE_URL` - Supabase project URL
   - `SUPABASE_SERVICE_KEY` - Service role key
   - `SUPABASE_ANON_KEY` - Anonymous key

3. **Google Cloud**:
   - `GCLOUD_PROJECT_ID` - GCP project ID
   - `GCS_BUCKET_NAME` - Storage bucket name
   - `DOCUMENT_AI_PROCESSOR_ID` - Document AI processor ID
   - `DOCUMENT_AI_LOCATION` - Processor location (default 'us')

4. **Feature Flags**:
   - `AGENTIC_RAG_ENABLED` - Enable/disable agentic RAG processing

### Configuration Loading

**Priority Order**:
1. `process.env` (Firebase Functions v2)
2. `functions.config()` (Firebase Functions v1 fallback)
3. `.env` file (local development)

**Validation**: A Joi schema validates all required environment variables
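To illustrate the kind of checks that validation performs, here is a plain-TypeScript sketch; the project uses a Joi schema for this, and the required-variable list below is partial and assumed.

```typescript
// Illustration of environment validation: every required variable must
// be present, and LLM_PROVIDER must be one of the supported values.
const REQUIRED_VARS = ['LLM_PROVIDER', 'SUPABASE_URL', 'SUPABASE_SERVICE_KEY', 'GCS_BUCKET_NAME'];
const VALID_PROVIDERS = ['anthropic', 'openai', 'openrouter'];

function validateEnv(env: Record<string, string | undefined>): string[] {
  const errors: string[] = [];
  for (const name of REQUIRED_VARS) {
    if (!env[name]) errors.push(`Missing required variable: ${name}`);
  }
  if (env.LLM_PROVIDER && !VALID_PROVIDERS.includes(env.LLM_PROVIDER)) {
    errors.push(`LLM_PROVIDER must be one of: ${VALID_PROVIDERS.join(', ')}`);
  }
  return errors;
}
```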
---

## 13. Debugging Guide

### Key Log Points

**Correlation IDs**: Every request has a correlation ID for tracing

**Structured Logging**: Winston logger with structured data

**Key Log Locations**:
1. **Request Entry**: `backend/src/index.ts` - All incoming requests
2. **Authentication**: `backend/src/middleware/firebaseAuth.ts` - Auth success/failure
3. **Job Processing**: `backend/src/services/jobProcessorService.ts` - Job lifecycle
4. **Document Processing**: `backend/src/services/unifiedDocumentProcessor.ts` - Processing steps
5. **LLM Calls**: `backend/src/services/llmService.ts` - API calls and responses
6. **Vector Search**: `backend/src/services/vectorDatabaseService.ts` - Search operations
7. **Error Handling**: `backend/src/middleware/errorHandler.ts` - All errors with categorization
### Common Failure Points

**1. Vector Search Timeouts**
- **Symptom**: "Vector search timeout after 10s"
- **Cause**: Searching across all documents without a `document_id` filter
- **Fix**: Always pass `documentId` to `vectorDatabaseService.searchSimilar()`

**2. LLM API Failures**
- **Symptom**: "LLM API call failed" or "Invalid JSON response"
- **Cause**: API rate limits, network issues, or invalid response format
- **Fix**: Check API keys, retry logic, and response validation

**3. GCS Upload Failures**
- **Symptom**: "Failed to upload to GCS" or "Signed URL expired"
- **Cause**: Credential issues, bucket permissions, or URL expiration
- **Fix**: Check GCS credentials and bucket configuration

**4. Job Stuck in Processing**
- **Symptom**: Job status remains 'processing' for more than 15 minutes
- **Cause**: Process crash, timeout, or an uncaught error
- **Fix**: Check logs, reset stuck jobs, investigate the underlying error

**5. Document AI Failures**
- **Symptom**: "Failed to extract text from document"
- **Cause**: Document AI API error or invalid file format
- **Fix**: Check the Document AI processor configuration; fall back to pdf-parse
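The 15-minute threshold for stuck jobs can be checked in application code as well as in SQL. A small sketch, where the job row shape and field names are assumptions mirroring the guideline above:

```typescript
// Sketch: flag a job as stuck when it has been 'processing' longer than
// the 15-minute guideline. The JobRow shape is an assumption.
interface JobRow {
  status: string;
  startedAt: Date;
}

const STUCK_THRESHOLD_MS = 15 * 60 * 1000;

function isStuck(job: JobRow, now: Date = new Date()): boolean {
  return (
    job.status === "processing" &&
    now.getTime() - job.startedAt.getTime() > STUCK_THRESHOLD_MS
  );
}
```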
### Diagnostic Tools

**Health Check Endpoints**:
- `GET /health` - Basic health check
- `GET /health/config` - Configuration health
- `GET /health/agentic-rag` - Agentic RAG health status

**Monitoring Endpoints**:
- `GET /monitoring/upload-metrics` - Upload statistics
- `GET /monitoring/upload-health` - Upload health
- `GET /monitoring/real-time-stats` - Real-time statistics
**Database Debugging**:

```sql
-- Check pending jobs
SELECT * FROM processing_jobs WHERE status = 'pending' ORDER BY created_at DESC;

-- Check stuck jobs
SELECT * FROM processing_jobs
WHERE status = 'processing'
  AND started_at < NOW() - INTERVAL '15 minutes';

-- Check document status
SELECT id, original_file_name, status, created_at
FROM documents
WHERE user_id = '<user_id>'
ORDER BY created_at DESC;
```
**Job Inspection**:

```typescript
// Get job details (may be null if the job ID is unknown)
const job = await ProcessingJobModel.findById(jobId);

// Check job error
console.log('Job error:', job?.error);

// Check job result
console.log('Job result:', job?.result);
```
### Debugging Workflow

1. **Identify the Issue**: Check error logs with the correlation ID
2. **Trace the Request**: Follow the correlation ID through the logs
3. **Check Job Status**: Query the `processing_jobs` table for job state
4. **Check Document Status**: Query the `documents` table for document state
5. **Review Service Logs**: Check specific service logs for detailed errors
6. **Test Components**: Test individual services in isolation
7. **Check External Services**: Verify GCS, Document AI, and LLM APIs are accessible

---
## 14. Optimization Opportunities

### Identified Bottlenecks

**1. Vector Search Performance**
- **Current**: 10-second timeout; can be slow for large document sets
- **Optimization**: Ensure the `document_id` filter is always used
- **Future**: Consider indexing optimizations and batch search

**2. LLM API Calls**
- **Current**: Sequential processing, no caching of similar requests
- **Optimization**: Implement response caching for similar documents
- **Future**: Batch API calls; use smaller models for simpler tasks

**3. PDF Generation**
- **Current**: Puppeteer can be memory-intensive
- **Optimization**: Page pooling already implemented
- **Future**: Consider a serverless PDF generation service

**4. Database Queries**
- **Current**: Some queries don't use indexes effectively
- **Optimization**: Add indexes on frequently queried columns
- **Future**: Query optimization, connection pooling tuning
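The effect of the `document_id` filter can be illustrated with a toy in-memory version of similarity search: filtering to one document first shrinks the candidate set before any ranking happens. The chunk shape and function names below are illustrative, not the real `vectorDatabaseService` API:

```typescript
// Toy similarity search that scopes to one document before ranking,
// mirroring the document_id filter recommendation. Shapes are assumptions.
interface Chunk {
  documentId: string;
  embedding: number[];
  text: string;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function searchScoped(chunks: Chunk[], query: number[], documentId: string, topK = 3): Chunk[] {
  return chunks
    .filter((c) => c.documentId === documentId) // filter FIRST, then rank
    .map((c) => ({ c, score: cosine(c.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((x) => x.c);
}
```

In the real system the filter is pushed down into the `match_document_chunks` query rather than applied in memory, but the ordering principle is the same.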
### Memory Usage Patterns

**Chunk Processing**:
- Processes chunks in batches to limit memory
- Cleans up processed chunks after storage
- **Optimization**: Consider streaming for very large documents

**PDF Generation**:
- Page pooling limits memory usage
- Browser instance reuse reduces overhead
- **Optimization**: Consider headless browser optimization
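The batch-then-release pattern for chunk processing can be sketched as follows (batch size and handler are hypothetical):

```typescript
// Process chunks one batch at a time so only a single batch's worth of
// derived data is alive at once; earlier batches become garbage-collectable
// after their handler resolves.
async function processInBatches<T>(
  chunks: T[],
  batchSize: number,
  handle: (batch: T[]) => Promise<void>,
): Promise<void> {
  for (let i = 0; i < chunks.length; i += batchSize) {
    await handle(chunks.slice(i, i + batchSize));
  }
}
```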
### API Call Optimization

**Embedding Generation**:
- **Current**: Max 5 concurrent calls
- **Optimization**: Tune based on API rate limits
- **Future**: Batch embedding API if available

**LLM Calls**:
- **Current**: Single call per document
- **Optimization**: Use smaller models for simpler tasks
- **Future**: Implement response caching
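A cap on concurrent calls (the "max 5" above) can be enforced with a small worker-pool helper; this is a generic sketch, not the project's actual implementation:

```typescript
// Run fn over items with at most `limit` calls in flight at once.
// Result order matches input order.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++; // index claimed synchronously, so workers never collide
      results[i] = await fn(items[i]);
    }
  };
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}
```

Raising or lowering `limit` is then a one-line change when tuning against the embedding API's rate limits.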
### Database Query Optimization

**Frequently Queried Tables**:
- `documents` - Add index on `user_id`, `status`
- `processing_jobs` - Add index on `status`, `created_at`
- `document_chunks` - Add index on `document_id`, `chunk_index`

**Vector Search**:
- **Current**: Uses the `match_document_chunks` function
- **Optimization**: Ensure the `document_id` filter is always used
- **Future**: Consider an HNSW index for faster similarity search

---
## Appendix: Key File Locations

### Backend Services
- `backend/src/services/unifiedDocumentProcessor.ts` - Main orchestrator
- `backend/src/services/optimizedAgenticRAGProcessor.ts` - AI processing engine
- `backend/src/services/jobProcessorService.ts` - Job processor
- `backend/src/services/vectorDatabaseService.ts` - Vector operations
- `backend/src/services/llmService.ts` - LLM interactions
- `backend/src/services/documentAiProcessor.ts` - Document AI integration
- `backend/src/services/pdfGenerationService.ts` - PDF generation
- `backend/src/services/fileStorageService.ts` - GCS operations

### Backend Models
- `backend/src/models/DocumentModel.ts` - Document data model
- `backend/src/models/ProcessingJobModel.ts` - Job data model
- `backend/src/models/VectorDatabaseModel.ts` - Vector data model

### Backend Routes
- `backend/src/routes/documents.ts` - Document endpoints
- `backend/src/routes/vector.ts` - Vector endpoints
- `backend/src/routes/monitoring.ts` - Monitoring endpoints

### Backend Controllers
- `backend/src/controllers/documentController.ts` - Document controller

### Frontend Services
- `frontend/src/services/documentService.ts` - Document API client
- `frontend/src/services/authService.ts` - Authentication service

### Frontend Components
- `frontend/src/components/DocumentUpload.tsx` - Upload component
- `frontend/src/components/DocumentList.tsx` - Document list
- `frontend/src/components/DocumentViewer.tsx` - Document viewer

### Configuration
- `backend/src/config/env.ts` - Environment configuration
- `backend/src/config/supabase.ts` - Supabase configuration
- `backend/src/config/firebase.ts` - Firebase configuration

### SQL
- `backend/sql/fix_vector_search_timeout.sql` - Vector search optimization

---

**End of Architecture Summary**