Pre-cleanup commit: Current state before service layer consolidation

This commit is contained in:
Jon
2025-08-01 14:57:56 -04:00
parent 95c92946de
commit f453efb0f8
21 changed files with 2560 additions and 363 deletions

APP_DESIGN_DOCUMENTATION.md Normal file

@@ -0,0 +1,533 @@
# CIM Document Processor - Application Design Documentation
## Overview
The CIM Document Processor is a web application that processes Confidential Information Memorandums (CIMs) using AI to extract key business information and generate structured analysis reports. The system uses Google Document AI for text extraction and an optimized Agentic RAG (Retrieval-Augmented Generation) approach for intelligent document analysis.
## Architecture Overview
```
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Frontend │ │ Backend │ │ External │
│ (React) │◄──►│ (Node.js) │◄──►│ Services │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Database │ │ Google Cloud │
│ (Supabase) │ │ Services │
└─────────────────┘ └─────────────────┘
```
## Core Components
### 1. Frontend (React + TypeScript)
**Location**: `frontend/src/`
**Key Components**:
- **App.tsx**: Main application with tabbed interface
- **DocumentUpload**: File upload with Firebase Storage integration
- **DocumentList**: Display and manage uploaded documents
- **DocumentViewer**: View processed documents and analysis
- **Analytics**: Dashboard for processing statistics
- **UploadMonitoringDashboard**: Real-time upload monitoring
**Authentication**: Firebase Authentication with protected routes
### 2. Backend (Node.js + Express + TypeScript)
**Location**: `backend/src/`
**Key Services**:
- **unifiedDocumentProcessor**: Main orchestrator for document processing
- **optimizedAgenticRAGProcessor**: Core AI processing engine
- **llmService**: LLM interaction service (Claude AI/OpenAI)
- **pdfGenerationService**: PDF report generation using Puppeteer
- **fileStorageService**: Google Cloud Storage operations
- **uploadMonitoringService**: Real-time upload tracking
- **agenticRAGDatabaseService**: Analytics and session management
- **sessionService**: User session management
- **jobQueueService**: Background job processing
- **uploadProgressService**: Upload progress tracking
## Data Flow
### 1. Document Upload Process
```
User Uploads PDF
         │
         ▼
┌─────────────────┐
│ 1. Get Upload   │ ──► Generate signed URL from Google Cloud Storage
│    URL          │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 2. Upload to    │ ──► Direct upload to GCS bucket
│    GCS          │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 3. Confirm      │ ──► Update database, create processing job
│    Upload       │
└─────────────────┘
```
### 2. Document Processing Pipeline
```
Document Uploaded
         │
         ▼
┌─────────────────┐
│ 1. Text         │ ──► Google Document AI extracts text from PDF
│    Extraction   │     (documentAiGenkitProcessor or direct Document AI)
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 2. Intelligent  │ ──► Split text into semantic chunks (4000 chars)
│    Chunking     │     with 200 char overlap
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 3. Vector       │ ──► Generate embeddings for each chunk
│    Embedding    │     (rate-limited to 5 concurrent calls)
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 4. LLM Analysis │ ──► llmService → Claude AI analyzes chunks
│                 │     and generates structured CIM review data
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 5. PDF          │ ──► pdfGenerationService generates summary PDF
│    Generation   │     using Puppeteer
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 6. Database     │ ──► Store analysis data, update document status
│    Storage      │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 7. Complete     │ ──► Update session, notify user, cleanup
│    Processing   │
└─────────────────┘
```
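The chunking step above (4000-character chunks, 200-character overlap, semantic boundaries) can be sketched as follows. This is an illustrative TypeScript sketch; the constants and boundary heuristic are assumptions, not the actual `optimizedAgenticRAGProcessor` code:

```typescript
// Illustrative sketch of the intelligent chunking step: ~4000-character
// chunks with a 200-character overlap, preferring to cut at paragraph
// breaks. Constants and the boundary heuristic are assumptions.
const CHUNK_SIZE = 4000;
const CHUNK_OVERLAP = 200;

function createIntelligentChunks(text: string): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + CHUNK_SIZE, text.length);
    if (end < text.length) {
      // Prefer a semantic boundary: the last paragraph break in the window.
      const boundary = text.lastIndexOf("\n\n", end);
      if (boundary > start + CHUNK_SIZE / 2) end = boundary;
    }
    chunks.push(text.slice(start, end));
    if (end === text.length) break;
    start = end - CHUNK_OVERLAP; // the overlap preserves context across chunks
  }
  return chunks;
}
```

The overlap means neighbouring chunks share context, so a fact that straddles a chunk boundary still appears whole in at least one chunk.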
### 3. Error Handling Flow
```
Processing Error
         │
         ▼
┌─────────────────┐
│ Error Logging   │ ──► Log error with correlation ID
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ Retry Logic     │ ──► Retry failed operation (up to 3 times)
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ Graceful        │ ──► Return partial results or error message
│ Degradation     │
└─────────────────┘
```
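The retry stage can be sketched as a small helper: up to 3 attempts with exponential backoff between failures. A minimal sketch with an illustrative delay base, not the backend's actual implementation:

```typescript
// Minimal sketch of the retry pattern: up to 3 attempts with exponential
// backoff between failures. The delay base is an illustrative value.
async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Wait 500 ms, then 1000 ms, then 2000 ms, ...
        await new Promise((resolve) =>
          setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1))
        );
      }
    }
  }
  throw lastError;
}
```

If every attempt fails, the last error is rethrown so the graceful-degradation stage can return a partial result or an error message.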
## Key Services Explained
### 1. Unified Document Processor (`unifiedDocumentProcessor.ts`)
**Purpose**: Main orchestrator that routes documents to the appropriate processing strategy.
**Current Strategy**: `optimized_agentic_rag` (only active strategy)
**Methods**:
- `processDocument()`: Main processing entry point
- `processWithOptimizedAgenticRAG()`: Current active processing method
- `getProcessingStats()`: Returns processing statistics
### 2. Optimized Agentic RAG Processor (`optimizedAgenticRAGProcessor.ts`)
**Purpose**: Core AI processing engine that handles large documents efficiently.
**Key Features**:
- **Intelligent Chunking**: Splits text at semantic boundaries (sections, paragraphs)
- **Batch Processing**: Processes chunks in batches of 10 to manage memory
- **Rate Limiting**: Limits concurrent API calls to 5
- **Memory Optimization**: Tracks memory usage and processes efficiently
**Processing Steps**:
1. **Create Intelligent Chunks**: Split text into 4000-char chunks with semantic boundaries
2. **Process Chunks in Batches**: Generate embeddings and metadata for each chunk
3. **Store Chunks Optimized**: Save to vector database with batching
4. **Generate LLM Analysis**: Use llmService to analyze and create structured data
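The batch and rate-limit behaviour in steps 2–3 can be sketched as a small promise pool that keeps at most 5 tasks in flight. Names here are illustrative, not the service's actual code:

```typescript
// Sketch of the rate-limiting idea: process chunk tasks with at most
// `limit` (e.g. 5) in flight at once via a simple promise pool.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index; the event
  // loop is single-threaded, so the shared counter is safe here.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, () => worker())
  );
  return results;
}
```

The pool bounds both memory (only a few chunks are materialized at once) and API pressure (never more than `limit` concurrent embedding calls).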
### 3. LLM Service (`llmService.ts`)
**Purpose**: Handles all LLM interactions with Claude AI and OpenAI.
**Key Features**:
- **Model Selection**: Automatically selects optimal model based on task complexity
- **Retry Logic**: Implements retry mechanism for failed API calls
- **Cost Tracking**: Tracks token usage and API costs
- **Error Handling**: Graceful error handling with fallback options
**Methods**:
- `processCIMDocument()`: Main CIM analysis method
- `callLLM()`: Generic LLM call method
- `callAnthropic()`: Claude AI specific calls
- `callOpenAI()`: OpenAI specific calls
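The fallback behaviour between the two providers can be sketched like this. The real `callAnthropic()`/`callOpenAI()` use the vendor SDKs; here they are injected as plain functions so the control flow is visible without either SDK:

```typescript
// Sketch of provider fallback: try Claude first, fall back to OpenAI if
// the call fails. Caller functions are injected; names are illustrative.
type LLMCall = (prompt: string) => Promise<string>;

async function callLLMWithFallback(
  prompt: string,
  callAnthropic: LLMCall,
  callOpenAI: LLMCall
): Promise<{ provider: "anthropic" | "openai"; text: string }> {
  try {
    return { provider: "anthropic", text: await callAnthropic(prompt) };
  } catch {
    // Primary provider failed; degrade gracefully to the fallback.
    return { provider: "openai", text: await callOpenAI(prompt) };
  }
}
```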
### 4. PDF Generation Service (`pdfGenerationService.ts`)
**Purpose**: Generates PDF reports from analysis data using Puppeteer.
**Key Features**:
- **HTML to PDF**: Converts HTML content to PDF using Puppeteer
- **Markdown Support**: Converts markdown to HTML then to PDF
- **Custom Styling**: Professional PDF formatting with CSS
- **CIM Review Templates**: Specialized templates for CIM analysis reports
**Methods**:
- `generateCIMReviewPDF()`: Generate CIM review PDF from analysis data
- `generatePDFFromMarkdown()`: Convert markdown to PDF
- `generatePDFBuffer()`: Generate PDF as buffer for immediate download
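The first stage of this pipeline (analysis data → HTML, before Puppeteer renders the PDF) might look roughly like the following hypothetical sketch; the real templates add CSS styling and should escape untrusted content:

```typescript
// Hypothetical sketch of the HTML-generation stage that precedes the
// Puppeteer render. Real templates include CSS and content escaping.
function generateCIMReviewHTML(sections: Record<string, string>): string {
  const body = Object.entries(sections)
    .map(
      ([title, content]) =>
        `<section><h2>${title}</h2><p>${content}</p></section>`
    )
    .join("\n");
  return `<!DOCTYPE html><html><body><h1>CIM Review</h1>\n${body}\n</body></html>`;
}
```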
### 5. File Storage Service (`fileStorageService.ts`)
**Purpose**: Handles all Google Cloud Storage operations.
**Key Operations**:
- `generateSignedUploadUrl()`: Creates secure upload URLs
- `getFile()`: Downloads files from GCS
- `uploadFile()`: Uploads files to GCS
- `deleteFile()`: Removes files from GCS
### 6. Upload Monitoring Service (`uploadMonitoringService.ts`)
**Purpose**: Tracks upload progress and provides real-time monitoring.
**Key Features**:
- Real-time upload tracking
- Error analysis and reporting
- Performance metrics
- Health status monitoring
### 7. Session Service (`sessionService.ts`)
**Purpose**: Manages user sessions and authentication state.
**Key Features**:
- Session storage and retrieval
- Token management
- Session cleanup
- Security token blacklisting
### 8. Job Queue Service (`jobQueueService.ts`)
**Purpose**: Manages background job processing and queuing.
**Key Features**:
- Job queuing and scheduling
- Background processing
- Job status tracking
- Error recovery
## Service Dependencies
```
unifiedDocumentProcessor
├── optimizedAgenticRAGProcessor
│ ├── llmService (for AI processing)
│ ├── vectorDatabaseService (for embeddings)
│ └── fileStorageService (for file operations)
├── pdfGenerationService (for PDF creation)
├── uploadMonitoringService (for tracking)
├── sessionService (for session management)
└── jobQueueService (for background processing)
```
## Database Schema
### Core Tables
#### 1. Documents Table
```sql
CREATE TABLE documents (
id UUID PRIMARY KEY,
user_id TEXT NOT NULL,
original_file_name TEXT NOT NULL,
file_path TEXT NOT NULL,
file_size INTEGER NOT NULL,
status TEXT NOT NULL,
extracted_text TEXT,
generated_summary TEXT,
summary_pdf_path TEXT,
analysis_data JSONB,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
```
#### 2. Agentic RAG Sessions Table
```sql
CREATE TABLE agentic_rag_sessions (
id UUID PRIMARY KEY,
document_id UUID REFERENCES documents(id),
strategy TEXT NOT NULL,
status TEXT NOT NULL,
total_agents INTEGER,
completed_agents INTEGER,
failed_agents INTEGER,
overall_validation_score DECIMAL,
processing_time_ms INTEGER,
api_calls_count INTEGER,
total_cost DECIMAL,
created_at TIMESTAMP DEFAULT NOW(),
completed_at TIMESTAMP
);
```
#### 3. Vector Database Tables
```sql
CREATE TABLE document_chunks (
id UUID PRIMARY KEY,
document_id UUID REFERENCES documents(id),
content TEXT NOT NULL,
embedding VECTOR(1536),
chunk_index INTEGER,
metadata JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
```
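Given this schema, the retrieval side of the RAG step can be expressed as a similarity query. A sketch assuming the pgvector extension and its cosine-distance operator; the actual query used by the vector database service may differ:

```sql
-- Top 5 most similar chunks for one document, by cosine distance.
-- $1 = document id, $2 = query embedding (vector(1536))
SELECT id, chunk_index, content
FROM document_chunks
WHERE document_id = $1
ORDER BY embedding <=> $2
LIMIT 5;
```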
## API Endpoints
### Active Endpoints
#### Document Management
- `POST /documents/upload-url` - Get signed upload URL
- `POST /documents/:id/confirm-upload` - Confirm upload and start processing
- `POST /documents/:id/process-optimized-agentic-rag` - Trigger AI processing
- `GET /documents/:id/download` - Download processed PDF
- `DELETE /documents/:id` - Delete document
#### Analytics & Monitoring
- `GET /documents/analytics` - Get processing analytics
- `GET /documents/:id/agentic-rag-sessions` - Get processing sessions
- `GET /monitoring/dashboard` - Get monitoring dashboard
- `GET /vector/stats` - Get vector database statistics
### Legacy Endpoints (Kept for Backward Compatibility)
- `POST /documents/upload` - Multipart file upload (legacy)
- `GET /documents` - List documents (basic CRUD)
## Configuration
### Environment Variables
**Backend** (`backend/src/config/env.ts`):
```typescript
// Google Cloud
GOOGLE_CLOUD_PROJECT_ID
GOOGLE_CLOUD_STORAGE_BUCKET
GOOGLE_APPLICATION_CREDENTIALS
// Document AI
GOOGLE_DOCUMENT_AI_LOCATION
GOOGLE_DOCUMENT_AI_PROCESSOR_ID
// Database
DATABASE_URL
SUPABASE_URL
SUPABASE_ANON_KEY
// AI Services
ANTHROPIC_API_KEY
OPENAI_API_KEY
// Processing
AGENTIC_RAG_ENABLED=true
PROCESSING_STRATEGY=optimized_agentic_rag
// LLM Configuration
LLM_PROVIDER=anthropic
LLM_MODEL=claude-3-opus-20240229
LLM_MAX_TOKENS=4000
LLM_TEMPERATURE=0.1
```
**Frontend** (`frontend/src/config/env.ts`):
```typescript
// API
VITE_API_BASE_URL
VITE_FIREBASE_API_KEY
VITE_FIREBASE_AUTH_DOMAIN
```
## Processing Strategy Details
### Current Strategy: Optimized Agentic RAG
**Why This Strategy**:
- Handles large documents efficiently
- Provides structured analysis output
- Optimizes memory usage and API costs
- Generates high-quality summaries
**How It Works**:
1. **Text Extraction**: Google Document AI extracts text from PDF
2. **Semantic Chunking**: Splits text at natural boundaries (sections, paragraphs)
3. **Vector Embedding**: Creates embeddings for each chunk
4. **LLM Analysis**: llmService calls Claude AI to analyze chunks and generate structured data
5. **PDF Generation**: pdfGenerationService creates summary PDF with analysis results
**Output Format**: Structured CIM Review data including:
- Deal Overview
- Business Description
- Market Analysis
- Financial Summary
- Management Team
- Investment Thesis
- Key Questions & Next Steps
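As a rough TypeScript shape, the structured output above might look like the following. The field names are assumptions derived from the section titles, not the actual schema, and the example value is purely illustrative:

```typescript
// Illustrative shape of the structured CIM review output; field names
// are assumptions based on the section titles above.
interface CIMReviewData {
  dealOverview: string;
  businessDescription: string;
  marketAnalysis: string;
  financialSummary: string;
  managementTeam: string;
  investmentThesis: string;
  keyQuestionsAndNextSteps: string[];
}

// Hypothetical example value, for illustration only.
const exampleReview: CIMReviewData = {
  dealOverview: "Sale of a founder-owned software business",
  businessDescription: "Recurring-revenue SaaS for field operations",
  marketAnalysis: "Fragmented market with low penetration",
  financialSummary: "$12M revenue, 20% EBITDA margin",
  managementTeam: "Founder-led; key staff expected to stay post-close",
  investmentThesis: "Platform for add-on acquisitions",
  keyQuestionsAndNextSteps: [
    "What is the customer concentration?",
    "Schedule a management call",
  ],
};
```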
## Error Handling
### Frontend Error Handling
- **Network Errors**: Automatic retry with exponential backoff
- **Authentication Errors**: Automatic token refresh or redirect to login
- **Upload Errors**: User-friendly error messages with retry options
- **Processing Errors**: Real-time error display with retry functionality
### Backend Error Handling
- **Validation Errors**: Input validation with detailed error messages
- **Processing Errors**: Graceful degradation with error logging
- **Storage Errors**: Retry logic for transient failures
- **Database Errors**: Connection pooling and retry mechanisms
- **LLM API Errors**: Retry logic with exponential backoff
- **PDF Generation Errors**: Fallback to text-only output
### Error Recovery Mechanisms
- **LLM API Failures**: Up to 3 retry attempts with different models
- **Processing Timeouts**: Graceful timeout handling with partial results
- **Memory Issues**: Automatic garbage collection and memory cleanup
- **File Storage Errors**: Retry with exponential backoff
## Monitoring & Analytics
### Real-time Monitoring
- Upload progress tracking
- Processing status updates
- Error rate monitoring
- Performance metrics
- API usage tracking
- Cost monitoring
### Analytics Dashboard
- Processing success rates
- Average processing times
- API usage statistics
- Cost tracking
- User activity metrics
- Error analysis reports
## Security
### Authentication
- Firebase Authentication
- JWT token validation
- Protected API endpoints
- User-specific data isolation
- Session management with secure token handling
### File Security
- Signed URLs for secure uploads
- File type validation (PDF only)
- File size limits (50MB max)
- User-specific file storage paths
- Secure file deletion
### API Security
- Rate limiting (1000 requests per 15 minutes)
- CORS configuration
- Input validation
- SQL injection prevention
- Request correlation IDs for tracking
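The rate limit (1000 requests per 15 minutes) is provided by the `express-rate-limit` middleware listed in the backend dependencies; the bookkeeping underneath can be sketched as a fixed window per client:

```typescript
// Minimal fixed-window sketch of the rate limit described above
// (1000 requests per 15 minutes per client). Illustrative only; the
// backend uses the express-rate-limit middleware.
const WINDOW_MS = 15 * 60 * 1000;
const MAX_REQUESTS = 1000;

const windows = new Map<string, { start: number; count: number }>();

function allowRequest(clientId: string, now = Date.now()): boolean {
  const w = windows.get(clientId);
  if (!w || now - w.start >= WINDOW_MS) {
    // First request in a fresh window resets the counter.
    windows.set(clientId, { start: now, count: 1 });
    return true;
  }
  w.count++;
  return w.count <= MAX_REQUESTS;
}
```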
## Performance Optimization
### Memory Management
- Batch processing to limit memory usage
- Garbage collection optimization
- Connection pooling for database
- Efficient chunking to minimize memory footprint
### API Optimization
- Rate limiting to prevent API quota exhaustion
- Caching for frequently accessed data
- Efficient chunking to minimize API calls
- Model selection based on task complexity
### Processing Optimization
- Concurrent processing with limits
- Intelligent chunking for optimal processing
- Background job processing
- Progress tracking for user feedback
## Deployment
### Backend Deployment
- **Firebase Functions**: Serverless deployment
- **Google Cloud Run**: Containerized deployment
- **Docker**: Container support
### Frontend Deployment
- **Firebase Hosting**: Static hosting
- **Vite**: Build tool
- **TypeScript**: Type safety
## Development Workflow
### Local Development
1. **Backend**: `npm run dev` (runs on port 5001)
2. **Frontend**: `npm run dev` (runs on port 5173)
3. **Database**: Supabase local development
4. **Storage**: Google Cloud Storage (development bucket)
### Testing
- **Unit Tests**: Jest for backend, Vitest for frontend
- **Integration Tests**: End-to-end testing
- **API Tests**: Supertest for backend endpoints
## Troubleshooting
### Common Issues
1. **Upload Failures**: Check GCS permissions and bucket configuration
2. **Processing Timeouts**: Increase timeout limits for large documents
3. **Memory Issues**: Monitor memory usage and adjust batch sizes
4. **API Quotas**: Check API usage and implement rate limiting
5. **PDF Generation Failures**: Check Puppeteer installation and memory
6. **LLM API Errors**: Verify API keys and check rate limits
### Debug Tools
- Real-time logging with correlation IDs
- Upload monitoring dashboard
- Processing session details
- Error analysis reports
- Performance metrics dashboard
This documentation provides a comprehensive overview of the CIM Document Processor architecture, helping junior programmers understand the system's design, data flow, and key components.

ARCHITECTURE_DIAGRAMS.md Normal file

@@ -0,0 +1,463 @@
# CIM Document Processor - Architecture Diagrams
## System Architecture Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ FRONTEND (React) │
├─────────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Login │ │ Document │ │ Document │ │ Analytics │ │
│ │ Form │ │ Upload │ │ List │ │ Dashboard │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Document │ │ Upload │ │ Protected │ │ Auth │ │
│ │ Viewer │ │ Monitoring │ │ Route │ │ Context │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
▼ HTTP/HTTPS
┌─────────────────────────────────────────────────────────────────────────────┐
│ BACKEND (Node.js) │
├─────────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Document │ │ Vector │ │ Monitoring │ │ Auth │ │
│ │ Routes │ │ Routes │ │ Routes │ │ Middleware │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Unified │ │ Optimized │ │ LLM │ │ PDF │ │
│ │ Document │ │ Agentic │ │ Service │ │ Generation │ │
│ │ Processor │ │ RAG │ │ │ │ Service │ │
│ │ │ │ Processor │ │ │ │ │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ File │ │ Upload │ │ Session │ │ Job Queue │ │
│ │ Storage │ │ Monitoring │ │ Service │ │ Service │ │
│ │ Service │ │ Service │ │ │ │ │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ EXTERNAL SERVICES │
├─────────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Google │ │ Google │ │ Anthropic │ │ Firebase │ │
│ │ Document AI │ │ Cloud │ │ Claude AI │ │ Auth │ │
│ │ │ │ Storage │ │ │ │ │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ DATABASE (Supabase) │
├─────────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Documents │ │ Agentic │ │ Document │ │ Vector │ │
│ │ Table │ │ RAG │ │ Chunks │ │ Embeddings │ │
│ │ │ │ Sessions │ │ Table │ │ Table │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Document Processing Flow
```
┌─────────────────┐
│ User Uploads    │
│ PDF Document    │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 1. Get Upload   │ ──► Generate signed URL from Google Cloud Storage
│    URL          │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 2. Upload to    │ ──► Direct upload to GCS bucket
│    GCS          │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 3. Confirm      │ ──► Update database, create processing job
│    Upload       │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 4. Text         │ ──► Google Document AI extracts text from PDF
│    Extraction   │     (documentAiGenkitProcessor or direct Document AI)
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 5. Intelligent  │ ──► Split text into semantic chunks (4000 chars)
│    Chunking     │     with 200 char overlap
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 6. Vector       │ ──► Generate embeddings for each chunk
│    Embedding    │     (rate-limited to 5 concurrent calls)
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 7. LLM Analysis │ ──► llmService → Claude AI analyzes chunks
│                 │     and generates structured CIM review data
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 8. PDF          │ ──► pdfGenerationService generates summary PDF
│    Generation   │     using Puppeteer
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 9. Database     │ ──► Store analysis data, update document status
│    Storage      │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ 10. Complete    │ ──► Update session, notify user, cleanup
│     Processing  │
└─────────────────┘
```
## Error Handling Flow
```
Processing Error
         │
         ▼
┌─────────────────┐
│ Error Logging   │ ──► Log error with correlation ID
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ Retry Logic     │ ──► Retry failed operation (up to 3 times)
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ Graceful        │ ──► Return partial results or error message
│ Degradation     │
└─────────────────┘
```
## Component Dependency Map
### Backend Services
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ CORE SERVICES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Unified │ │ Optimized │ │ LLM Service │ │
│ │ Document │───►│ Agentic RAG │───►│ │ │
│ │ Processor │ │ Processor │ │ (Claude AI/ │ │
│ │ (Orchestrator) │ │ (Core AI) │ │ OpenAI) │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ PDF Generation │ │ File Storage │ │ Upload │ │
│ │ Service │ │ Service │ │ Monitoring │ │
│ │ (Puppeteer) │ │ (GCS) │ │ Service │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Session │ │ Job Queue │ │ Upload │ │
│ │ Service │ │ Service │ │ Progress │ │
│ │ (Auth Mgmt) │ │ (Background) │ │ Service │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Frontend Components
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ FRONTEND COMPONENTS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ App.tsx │ │ AuthContext │ │ ProtectedRoute │ │
│ │ (Main App) │───►│ (Auth State) │───►│ (Route Guard) │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ DocumentUpload │ │ DocumentList │ │ DocumentViewer │ │
│ │ (File Upload) │ │ (Document Mgmt) │ │ (View Results) │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Analytics │ │ Upload │ │ LoginForm │ │
│ │ (Dashboard) │ │ Monitoring │ │ (Auth) │ │
│ │ │ │ Dashboard │ │ │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Service Dependencies Map
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ SERVICE DEPENDENCIES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ │
│ │ unifiedDocumentProcessor (Main Orchestrator) │
│ └─────────┬───────┘ │
│ │ │
│ ├───► optimizedAgenticRAGProcessor │
│ │ ├───► llmService (AI Processing) │
│ │ ├───► vectorDatabaseService (Embeddings) │
│ │ └───► fileStorageService (File Operations) │
│ │ │
│ ├───► pdfGenerationService (PDF Creation) │
│ │ └───► Puppeteer (PDF Generation) │
│ │ │
│ ├───► uploadMonitoringService (Real-time Tracking) │
│ │ │
│ ├───► sessionService (Session Management) │
│ │ │
│ └───► jobQueueService (Background Processing) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## API Endpoint Map
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ API ENDPOINTS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ DOCUMENT ROUTES │ │
│ │ │ │
│ │ POST /documents/upload-url ──► Get signed upload URL │ │
│ │ POST /documents/:id/confirm-upload ──► Confirm upload & process │ │
│ │ POST /documents/:id/process-optimized-agentic-rag ──► AI processing │ │
│ │ GET /documents/:id/download ──► Download PDF │ │
│ │ DELETE /documents/:id ──► Delete document │ │
│ │ GET /documents/analytics ──► Get analytics │ │
│ │ GET /documents/:id/agentic-rag-sessions ──► Get sessions │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ MONITORING ROUTES │ │
│ │ │ │
│ │ GET /monitoring/dashboard ──► Get monitoring dashboard │ │
│ │ GET /monitoring/upload-metrics ──► Get upload metrics │ │
│ │ GET /monitoring/upload-health ──► Get health status │ │
│ │ GET /monitoring/real-time-stats ──► Get real-time stats │ │
│ │ GET /monitoring/error-analysis ──► Get error analysis │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ VECTOR ROUTES │ │
│ │ │ │
│ │ GET /vector/document-chunks/:documentId ──► Get document chunks │ │
│ │ GET /vector/analytics ──► Get vector analytics │ │
│ │ GET /vector/stats ──► Get vector stats │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Database Schema Map
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ DATABASE SCHEMA │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ DOCUMENTS TABLE │ │
│ │ │ │
│ │ id (UUID) ──► Primary key │ │
│ │ user_id (TEXT) ──► User identifier │ │
│ │ original_file_name (TEXT) ──► Original filename │ │
│ │ file_path (TEXT) ──► GCS file path │ │
│ │ file_size (INTEGER) ──► File size in bytes │ │
│ │ status (TEXT) ──► Processing status │ │
│ │ extracted_text (TEXT) ──► Extracted text content │ │
│ │ generated_summary (TEXT) ──► Generated summary │ │
│ │ summary_pdf_path (TEXT) ──► PDF summary path │ │
│ │ analysis_data (JSONB) ──► Structured analysis data │ │
│ │ created_at (TIMESTAMP) ──► Creation timestamp │ │
│ │ updated_at (TIMESTAMP) ──► Last update timestamp │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ AGENTIC RAG SESSIONS TABLE │ │
│ │ │ │
│ │ id (UUID) ──► Primary key │ │
│ │ document_id (UUID) ──► Foreign key to documents │ │
│ │ strategy (TEXT) ──► Processing strategy used │ │
│ │ status (TEXT) ──► Session status │ │
│ │ total_agents (INTEGER) ──► Total agents in session │ │
│ │ completed_agents (INTEGER) ──► Completed agents │ │
│ │ failed_agents (INTEGER) ──► Failed agents │ │
│ │ overall_validation_score (DECIMAL) ──► Quality score │ │
│ │ processing_time_ms (INTEGER) ──► Processing time │ │
│ │ api_calls_count (INTEGER) ──► Number of API calls │ │
│ │ total_cost (DECIMAL) ──► Total processing cost │ │
│ │ created_at (TIMESTAMP) ──► Creation timestamp │ │
│ │ completed_at (TIMESTAMP) ──► Completion timestamp │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ DOCUMENT CHUNKS TABLE │ │
│ │ │ │
│ │ id (UUID) ──► Primary key │ │
│ │ document_id (UUID) ──► Foreign key to documents │ │
│ │ content (TEXT) ──► Chunk content │ │
│ │ embedding (VECTOR(1536)) ──► Vector embedding │ │
│ │ chunk_index (INTEGER) ──► Chunk order │ │
│ │ metadata (JSONB) ──► Chunk metadata │ │
│ │ created_at (TIMESTAMP) ──► Creation timestamp │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## File Structure Map
```
cim_summary/
├── backend/
│ ├── src/
│ │ ├── config/ # Configuration files
│ │ ├── controllers/ # Request handlers
│ │ ├── middleware/ # Express middleware
│ │ ├── models/ # Database models
│ │ ├── routes/ # API route definitions
│ │ ├── services/ # Business logic services
│ │ │ ├── unifiedDocumentProcessor.ts # Main orchestrator
│ │ │ ├── optimizedAgenticRAGProcessor.ts # Core AI processing
│ │ │ ├── llmService.ts # LLM interactions
│ │ │ ├── pdfGenerationService.ts # PDF generation
│ │ │ ├── fileStorageService.ts # GCS operations
│ │ │ ├── uploadMonitoringService.ts # Real-time tracking
│ │ │ ├── sessionService.ts # Session management
│ │ │ ├── jobQueueService.ts # Background processing
│ │ │ └── uploadProgressService.ts # Progress tracking
│ │ ├── utils/ # Utility functions
│ │ └── index.ts # Main entry point
│ ├── scripts/ # Setup and utility scripts
│ └── package.json # Backend dependencies
├── frontend/
│ ├── src/
│ │ ├── components/ # React components
│ │ ├── contexts/ # React contexts
│ │ ├── services/ # API service layer
│ │ ├── utils/ # Utility functions
│ │ ├── config/ # Frontend configuration
│ │ ├── App.tsx # Main app component
│ │ └── main.tsx # App entry point
│ └── package.json # Frontend dependencies
└── README.md # Project documentation
```
## Key Data Flow Sequences
### 1. User Authentication Flow
```
User → LoginForm → Firebase Auth → AuthContext → ProtectedRoute → Dashboard
```
### 2. Document Upload Flow
```
User → DocumentUpload → documentService.uploadDocument() →
Backend /upload-url → GCS signed URL → Frontend upload →
Backend /confirm-upload → Database update → Processing trigger
```
### 3. Document Processing Flow
```
Processing trigger → unifiedDocumentProcessor →
optimizedAgenticRAGProcessor → Document AI →
Chunking → Embeddings → llmService → Claude AI →
pdfGenerationService → PDF Generation →
Database update → User notification
```
### 4. Analytics Flow
```
User → Analytics component → documentService.getAnalytics() →
Backend /analytics → agenticRAGDatabaseService →
Database queries → Structured analytics data → Frontend display
```
### 5. Error Handling Flow
```
Error occurs → Error logging with correlation ID →
Retry logic (up to 3 attempts) →
Graceful degradation → User notification
```
## Processing Pipeline Details
### LLM Service Integration
```
optimizedAgenticRAGProcessor
         │
         ▼
┌─────────────────┐
│ llmService      │ ──► Model selection based on task complexity
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ Claude AI       │ ──► Primary model (claude-3-opus-20240229)
│ (Anthropic)     │
└─────────┬───────┘
          │
          ▼
┌─────────────────┐
│ OpenAI          │ ──► Fallback model (if Claude fails)
│ (GPT-4)         │
└─────────────────┘
```
### PDF Generation Pipeline
```
Analysis Data
         │
         ▼
┌──────────────────────────────────────────────┐
│ pdfGenerationService.generateCIMReviewPDF()  │
└──────────────────────┬───────────────────────┘
                       │
                       ▼
             ┌─────────────────┐
             │ HTML Generation │ ──► Convert analysis data to HTML
             └─────────┬───────┘
                       │
                       ▼
             ┌─────────────────┐
             │ Puppeteer       │ ──► Convert HTML to PDF
             └─────────┬───────┘
                       │
                       ▼
             ┌─────────────────┐
             │ PDF Buffer      │ ──► Return PDF as buffer for download
             └─────────────────┘
```
This architecture provides a clear separation of concerns, scalable design, and comprehensive monitoring capabilities for the CIM Document Processor application.


@@ -0,0 +1,325 @@
# Dependency Analysis Report - CIM Document Processor
## Executive Summary
This report analyzes the dependencies in both backend and frontend packages to identify:
- Unused dependencies that can be removed
- Outdated packages that should be updated
- Consolidation opportunities
- Dependencies that are actually being used vs. placeholder implementations
## Backend Dependencies Analysis
### Core Dependencies (Actively Used)
#### ✅ **Essential Dependencies**
- `express` - Main web framework
- `cors` - CORS middleware
- `helmet` - Security middleware
- `morgan` - HTTP request logging
- `express-rate-limit` - Rate limiting
- `dotenv` - Environment variable management
- `winston` - Logging framework
- `@supabase/supabase-js` - Database client
- `@google-cloud/storage` - Google Cloud Storage
- `@google-cloud/documentai` - Document AI processing
- `@anthropic-ai/sdk` - Claude AI integration
- `openai` - OpenAI integration
- `puppeteer` - PDF generation
- `uuid` - UUID generation
- `axios` - HTTP client
#### ✅ **Conditionally Used Dependencies**
- `bcryptjs` - Used in auth.ts and seed.ts (legacy auth system)
- `jsonwebtoken` - Used in auth.ts (legacy JWT system)
- `joi` - Used for environment validation and middleware validation
- `zod` - Used in llmSchemas.ts and llmService.ts for schema validation
- `multer` - Used in upload middleware (legacy multipart upload)
- `pdf-parse` - Used in documentAiGenkitProcessor.ts (legacy processor)
#### ⚠️ **Potentially Unused Dependencies**
- `redis` - Only imported in sessionService.ts but may not be actively used
- `pg` - PostgreSQL client (may be redundant with Supabase)
### Development Dependencies (Actively Used)
#### ✅ **Essential Dev Dependencies**
- `typescript` - TypeScript compiler
- `ts-node-dev` - Development server
- `jest` - Testing framework
- `supertest` - API testing
- `@types/*` - TypeScript type definitions
- `eslint` - Code linting
- `@typescript-eslint/*` - TypeScript ESLint rules
### Unused Dependencies Analysis
#### ❌ **Confirmed Unused**
None identified: all backend dependencies appear to be used somewhere in the codebase. (The unused frontend dependencies are listed in the frontend section below.)
#### ⚠️ **Potentially Redundant**
1. **Validation Libraries**: Both `joi` and `zod` are used for validation
- `joi`: Environment validation, middleware validation
- `zod`: LLM schemas, service validation
- **Recommendation**: Consider consolidating to just `zod` for consistency
2. **Database Clients**: Both `pg` and `@supabase/supabase-js`
- `pg`: Direct PostgreSQL client
- `@supabase/supabase-js`: Supabase client (includes PostgreSQL)
- **Recommendation**: Remove `pg` if only using Supabase
3. **Authentication**: Both `bcryptjs`/`jsonwebtoken` and Firebase Auth
- Legacy JWT system vs. Firebase Authentication
- **Recommendation**: Remove legacy auth dependencies if fully migrated to Firebase
## Frontend Dependencies Analysis
### Core Dependencies (Actively Used)
#### ✅ **Essential Dependencies**
- `react` - React framework
- `react-dom` - React DOM rendering
- `react-router-dom` - Client-side routing
- `axios` - HTTP client for API calls
- `firebase` - Firebase Authentication
- `lucide-react` - Icon library (used in 6 components)
- `react-dropzone` - File upload component
#### ❌ **Unused Dependencies**
- `clsx` - Not imported anywhere
- `tailwind-merge` - Not imported anywhere
### Development Dependencies (Actively Used)
#### ✅ **Essential Dev Dependencies**
- `typescript` - TypeScript compiler
- `vite` - Build tool and dev server
- `@vitejs/plugin-react` - React plugin for Vite
- `tailwindcss` - CSS framework
- `postcss` - CSS processing
- `autoprefixer` - CSS vendor prefixing
- `eslint` - Code linting
- `@typescript-eslint/*` - TypeScript ESLint rules
- `vitest` - Testing framework
- `@testing-library/*` - React testing utilities
## Processing Strategy Analysis
### Current Active Strategy
Based on the code analysis, the current processing strategy is:
- **Primary**: `optimized_agentic_rag` (most actively used)
- **Fallback**: `document_ai_genkit` (legacy implementation)
### Unused Processing Strategies
The following strategies are implemented but not actively used:
1. `chunking` - Legacy chunking strategy
2. `rag` - Basic RAG strategy
3. `agentic_rag` - Basic agentic RAG (superseded by optimized version)
### Services Analysis
#### ✅ **Actively Used Services**
- `unifiedDocumentProcessor` - Main orchestrator
- `optimizedAgenticRAGProcessor` - Core AI processing
- `llmService` - LLM interactions
- `pdfGenerationService` - PDF generation
- `fileStorageService` - GCS operations
- `uploadMonitoringService` - Real-time tracking
- `sessionService` - Session management
- `jobQueueService` - Background processing
#### ⚠️ **Legacy Services (Can be removed)**
- `documentProcessingService` - Legacy chunking service
- `documentAiGenkitProcessor` - Legacy Document AI processor
- `ragDocumentProcessor` - Basic RAG processor
## Outdated Packages Analysis
### Backend Outdated Packages
- `@types/express`: 4.17.23 → 5.0.3 (major version update)
- `@types/jest`: 29.5.14 → 30.0.0 (major version update)
- `@types/multer`: 1.4.13 → 2.0.0 (major version update)
- `@types/node`: 20.19.9 → 24.1.0 (major version update)
- `@types/pg`: 8.15.4 → 8.15.5 (patch update)
- `@types/supertest`: 2.0.16 → 6.0.3 (major version update)
- `@typescript-eslint/*`: 6.21.0 → 8.38.0 (major version update)
- `bcryptjs`: 2.4.3 → 3.0.2 (major version update)
- `dotenv`: 16.6.1 → 17.2.1 (major version update)
- `eslint`: 8.57.1 → 9.32.0 (major version update)
- `express`: 4.21.2 → 5.1.0 (major version update)
- `express-rate-limit`: 7.5.1 → 8.0.1 (major version update)
- `helmet`: 7.2.0 → 8.1.0 (major version update)
- `jest`: 29.7.0 → 30.0.5 (major version update)
- `multer`: 1.4.5-lts.2 → 2.0.2 (major version update)
- `openai`: 5.10.2 → 5.11.0 (minor update)
- `puppeteer`: 21.11.0 → 24.15.0 (major version update)
- `redis`: 4.7.1 → 5.7.0 (major version update)
- `supertest`: 6.3.4 → 7.1.4 (major version update)
- `typescript`: 5.8.3 → 5.9.2 (minor update)
- `zod`: 3.25.76 → 4.0.14 (major version update)
### Frontend Outdated Packages
- `@testing-library/jest-dom`: 6.6.3 → 6.6.4 (patch update)
- `@testing-library/react`: 13.4.0 → 16.3.0 (major version update)
- `@types/react`: 18.3.23 → 19.1.9 (major version update)
- `@types/react-dom`: 18.3.7 → 19.1.7 (major version update)
- `@typescript-eslint/*`: 6.21.0 → 8.38.0 (major version update)
- `eslint`: 8.57.1 → 9.32.0 (major version update)
- `eslint-plugin-react-hooks`: 4.6.2 → 5.2.0 (major version update)
- `lucide-react`: 0.294.0 → 0.536.0 (major version update)
- `react`: 18.3.1 → 19.1.1 (major version update)
- `react-dom`: 18.3.1 → 19.1.1 (major version update)
- `react-router-dom`: 6.30.1 → 7.7.1 (major version update)
- `tailwind-merge`: 2.6.0 → 3.3.1 (major version update)
- `tailwindcss`: 3.4.17 → 4.1.11 (major version update)
- `typescript`: 5.8.3 → 5.9.2 (minor update)
- `vite`: 4.5.14 → 7.0.6 (major version update)
- `vitest`: 0.34.6 → 3.2.4 (major version update)
### Update Strategy
**⚠️ Warning**: Many packages have major version updates that may include breaking changes. Update strategy:
1. **Immediate Updates** (Low Risk):
- `@types/pg`: 8.15.4 → 8.15.5 (patch update)
- `openai`: 5.10.2 → 5.11.0 (minor update)
- `typescript`: 5.8.3 → 5.9.2 (minor update)
- `@testing-library/jest-dom`: 6.6.3 → 6.6.4 (patch update)
2. **Major Version Updates** (Require Testing):
- React ecosystem updates (React 18 → 19)
- Express updates (Express 4 → 5)
- Testing framework updates (Jest 29 → 30, Vitest 0.34 → 3.2)
- Build tool updates (Vite 4 → 7)
3. **Recommendation**: Update major versions after dependency cleanup to minimize risk
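The triage above can be done mechanically by comparing each package's current and latest version strings and bucketing by which semver component changes. A small sketch (the `classifyUpdate` helper is illustrative, not part of the codebase), assuming plain `major.minor.patch` versions:

```typescript
type UpdateRisk = 'major' | 'minor' | 'patch' | 'none';

// Parse the leading numeric components of a semver-ish version string.
// Suffixes like the "-lts.2" in "1.4.5-lts.2" are truncated at the first dash.
function parseVersion(v: string): number[] {
  return v.split('-')[0].split('.').map(n => parseInt(n, 10) || 0);
}

// Bucket an upgrade by the first semver component that changes.
function classifyUpdate(current: string, latest: string): UpdateRisk {
  const [cMaj, cMin, cPat] = parseVersion(current);
  const [lMaj, lMin, lPat] = parseVersion(latest);
  if (lMaj !== cMaj) return 'major';
  if (lMin !== cMin) return 'minor';
  if (lPat !== cPat) return 'patch';
  return 'none';
}
```

Running this over the tables above reproduces the risk labels: `express` 4.21.2 → 5.1.0 is `major`, `openai` 5.10.2 → 5.11.0 is `minor`, `@types/pg` 8.15.4 → 8.15.5 is `patch`.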
## Recommendations
### Phase 1: Immediate Cleanup (Low Risk)
#### Backend
1. **Consolidate validation libraries**:
   - Migrate from `joi` to `zod` for consistency
   - Remove `joi` dependency
2. **Remove legacy auth dependencies** (if Firebase auth is fully implemented):
```bash
npm uninstall bcryptjs jsonwebtoken
npm uninstall @types/bcryptjs @types/jsonwebtoken
```
#### Frontend
1. **Remove unused dependencies**:
```bash
npm uninstall clsx tailwind-merge
```
### Phase 2: Service Consolidation (Medium Risk)
1. **Remove legacy processing services**:
- `documentProcessingService.ts`
- `documentAiGenkitProcessor.ts`
- `ragDocumentProcessor.ts`
2. **Simplify unifiedDocumentProcessor**:
- Remove unused strategy methods
- Keep only `optimized_agentic_rag` strategy
3. **Remove unused database client**:
- Remove `pg` if only using Supabase
### Phase 3: Configuration Cleanup (Low Risk)
1. **Remove unused environment variables**:
- Legacy auth configuration
- Unused processing strategy configs
- Unused LLM configurations
2. **Update configuration validation**:
- Remove validation for unused configs
- Simplify environment schema
### Phase 4: Route Cleanup (Medium Risk)
1. **Remove legacy upload endpoints**:
- Keep only `/upload-url` and `/confirm-upload`
- Remove multipart upload endpoints
2. **Remove unused analytics endpoints**:
- Keep only actively used monitoring endpoints
## Impact Assessment
### Risk Levels
- **Low Risk**: Removing unused dependencies, updating packages
- **Medium Risk**: Removing legacy services, consolidating routes
- **High Risk**: Changing core processing logic
### Testing Requirements
- Unit tests for all active services
- Integration tests for upload flow
- End-to-end tests for document processing
- Performance testing for optimized agentic RAG
### Rollback Plan
- Keep backup of removed files for 1-2 weeks
- Maintain feature flags for major changes
- Document all changes for easy rollback
## Next Steps
1. **Start with Phase 1** (unused dependencies)
2. **Test thoroughly** after each phase
3. **Document changes** for team reference
4. **Update deployment scripts** if needed
5. **Monitor performance** after cleanup
## Estimated Savings
### Bundle Size Reduction
- **Frontend**: ~50KB (removing unused dependencies)
- **Backend**: ~200KB (removing legacy services and dependencies)
### Maintenance Reduction
- **Fewer dependencies** to maintain and update
- **Simplified codebase** with fewer moving parts
- **Reduced security vulnerabilities** from unused packages
### Performance Improvement
- **Faster builds** with fewer dependencies
- **Reduced memory usage** from removed services
- **Simplified deployment** with fewer configuration options
## Summary
### Key Findings
1. **Unused Dependencies**: 2 frontend dependencies (`clsx`, `tailwind-merge`) are completely unused
2. **Legacy Services**: 3 processing services can be removed (`documentProcessingService`, `documentAiGenkitProcessor`, `ragDocumentProcessor`)
3. **Redundant Dependencies**: Both `joi` and `zod` for validation, both `pg` and Supabase for database
4. **Outdated Packages**: 21 backend and 15 frontend packages have updates available
5. **Major Version Updates**: Many packages require major version updates with potential breaking changes
### Immediate Actions (Step 2 Complete)
1. ✅ **Dependency Analysis Complete** - All dependencies mapped and usage identified
2. ✅ **Outdated Packages Identified** - Version updates documented with risk assessment
3. ✅ **Cleanup Strategy Defined** - Phased approach with risk levels assigned
4. ✅ **Impact Assessment Complete** - Bundle size and maintenance savings estimated
### Next Steps (Step 3 - Service Layer Consolidation)
1. Remove unused frontend dependencies (`clsx`, `tailwind-merge`)
2. Remove legacy processing services
3. Consolidate validation libraries (migrate from `joi` to `zod`)
4. Remove redundant database client (`pg` if only using Supabase)
5. Update low-risk package versions
### Risk Assessment
- **Low Risk**: Removing unused dependencies, updating minor/patch versions
- **Medium Risk**: Removing legacy services, consolidating libraries
- **High Risk**: Major version updates, core processing logic changes
This dependency analysis provides a clear roadmap for cleaning up the codebase while maintaining functionality and minimizing risk.


```diff
@@ -19,7 +19,7 @@
     "lint:fix": "eslint src --ext .ts --fix",
     "db:migrate": "ts-node src/scripts/setup-database.ts",
     "db:seed": "ts-node src/models/seed.ts",
-    "db:setup": "npm run db:migrate",
+    "db:setup": "npm run db:migrate && node scripts/setup_supabase.js",
     "deploy:firebase": "npm run build && firebase deploy --only functions",
     "deploy:cloud-run": "npm run build && gcloud run deploy cim-processor-backend --source . --region us-central1 --platform managed --allow-unauthenticated",
     "deploy:docker": "npm run build && docker build -t cim-processor-backend . && docker run -p 8080:8080 cim-processor-backend",
@@ -77,4 +77,4 @@
     "ts-node-dev": "^2.0.0",
     "typescript": "^5.2.2"
   }
 }
```


@@ -0,0 +1,23 @@
```js
const { createClient } = require('@supabase/supabase-js');
const fs = require('fs');
const path = require('path');

const supabaseUrl = process.env.SUPABASE_URL;
const supabaseKey = process.env.SUPABASE_SERVICE_KEY;
const supabase = createClient(supabaseUrl, supabaseKey);

async function setupDatabase() {
  try {
    const sql = fs.readFileSync(path.join(__dirname, 'supabase_setup.sql'), 'utf8');
    const { error } = await supabase.rpc('exec', { sql });
    if (error) {
      console.error('Error setting up database:', error);
    } else {
      console.log('Database setup complete.');
    }
  } catch (error) {
    console.error('Error reading setup file:', error);
  }
}

setupDatabase();
```


@@ -0,0 +1,21 @@
```js
require('dotenv').config();
const { createClient } = require('@supabase/supabase-js');

const supabaseUrl = process.env.SUPABASE_URL;
const supabaseKey = process.env.SUPABASE_SERVICE_KEY;
const supabase = createClient(supabaseUrl, supabaseKey);

async function testFunction() {
  try {
    const { error } = await supabase.rpc('exec_sql', { sql: 'SELECT 1' });
    if (error) {
      console.error('Error calling exec_sql:', error);
    } else {
      console.log('Successfully called exec_sql.');
    }
  } catch (error) {
    console.error('Error:', error);
  }
}

testFunction();
```


```diff
@@ -93,6 +93,13 @@ export const documentController = {
   },

   async confirmUpload(req: Request, res: Response): Promise<void> {
+    console.log('🔄 CONFIRM UPLOAD ENDPOINT CALLED');
+    console.log('🔄 Request method:', req.method);
+    console.log('🔄 Request path:', req.path);
+    console.log('🔄 Request params:', req.params);
+    console.log('🔄 Request body:', req.body);
+    console.log('🔄 Request headers:', Object.keys(req.headers));
     try {
       const userId = req.user?.uid;
       if (!userId) {
@@ -138,36 +145,50 @@ export const documentController = {
         status: 'processing_llm'
       });

-      // Acknowledge the request immediately
+      console.log('✅ Document status updated to processing_llm');
+
+      // Acknowledge the request immediately and return the document
       res.status(202).json({
         message: 'Upload confirmed, processing has started.',
-        documentId: documentId,
+        document: document,
         status: 'processing'
       });

+      console.log('✅ Response sent, starting background processing...');
+
       // Process in the background
       (async () => {
         try {
+          console.log('Background processing started.');
           // Download file from Firebase Storage for Document AI processing
           const { fileStorageService } = await import('../services/fileStorageService');
           let fileBuffer: Buffer | null = null;
+          let downloadError: string | null = null;
           for (let i = 0; i < 3; i++) {
-            await new Promise(resolve => setTimeout(resolve, 2000)); // 2 second delay
-            fileBuffer = await fileStorageService.getFile(document.file_path);
-            if (fileBuffer) {
-              break;
+            try {
+              await new Promise(resolve => setTimeout(resolve, 2000 * (i + 1)));
+              fileBuffer = await fileStorageService.getFile(document.file_path);
+              if (fileBuffer) {
+                console.log(`✅ File downloaded from storage on attempt ${i + 1}`);
+                break;
+              }
+            } catch (err) {
+              downloadError = err instanceof Error ? err.message : String(err);
+              console.log(`❌ File download attempt ${i + 1} failed:`, downloadError);
             }
           }

           if (!fileBuffer) {
+            const errMsg = downloadError || 'Failed to download uploaded file';
+            console.log('Failed to download file from storage:', errMsg);
             await DocumentModel.updateById(documentId, {
               status: 'failed',
-              error_message: 'Failed to download uploaded file'
+              error_message: `Failed to download uploaded file: ${errMsg}`
             });
             return;
           }

+          console.log('File downloaded, starting unified processor.');
           // Process with Unified Document Processor
           const { unifiedDocumentProcessor } = await import('../services/unifiedDocumentProcessor');
@@ -175,17 +196,28 @@ export const documentController = {
             documentId,
             userId,
             '', // Text is not needed for this strategy
-            { strategy: 'optimized_agentic_rag' }
+            {
+              strategy: 'document_ai_genkit',
+              fileBuffer: fileBuffer,
+              fileName: document.original_file_name,
+              mimeType: 'application/pdf'
+            }
           );

           if (result.success) {
+            console.log('✅ Processing successful.');
             // Update document with results
             await DocumentModel.updateById(documentId, {
               status: 'completed',
               generated_summary: result.summary,
+              analysis_data: result.analysisData,
               processing_completed_at: new Date()
             });

+            console.log('✅ Document AI processing completed successfully for document:', documentId);
+            console.log('✅ Summary length:', result.summary?.length || 0);
+            console.log('✅ Processing time:', new Date().toISOString());
+
             // 🗑️ DELETE PDF after successful processing
             try {
               await fileStorageService.deleteFile(document.file_path);
@@ -201,11 +233,15 @@ export const documentController = {
             console.log('✅ Document AI processing completed successfully');
           } else {
+            console.log('❌ Processing failed:', result.error);
             await DocumentModel.updateById(documentId, {
               status: 'failed',
               error_message: result.error
             });

+            console.log('❌ Document AI processing failed for document:', documentId);
+            console.log('❌ Error:', result.error);
+
             // Also delete PDF on processing failure to avoid storage costs
             try {
               await fileStorageService.deleteFile(document.file_path);
@@ -215,14 +251,30 @@ export const documentController = {
             }
           }
         } catch (error) {
-          console.log('❌ Background processing error:', error);
+          const errorMessage = error instanceof Error ? error.message : 'Unknown error';
+          const errorStack = error instanceof Error ? error.stack : undefined;
+          const errorDetails = error instanceof Error ? {
+            name: error.name,
+            message: error.message,
+            stack: error.stack
+          } : {
+            type: typeof error,
+            value: error
+          };
+          console.log('❌ Background processing error:', errorMessage);
+          console.log('❌ Error details:', errorDetails);
+          console.log('❌ Error stack:', errorStack);
           logger.error('Background processing failed', {
-            error,
-            documentId
+            error: errorMessage,
+            errorDetails,
+            documentId,
+            stack: errorStack
           });
           await DocumentModel.updateById(documentId, {
             status: 'failed',
-            error_message: 'Background processing failed'
+            error_message: `Background processing failed: ${errorMessage}`
           });
         }
       })();
```


```diff
@@ -20,7 +20,11 @@ const app = express();

 // Add this middleware to log all incoming requests
 app.use((req, res, next) => {
-  console.log(`Incoming request: ${req.method} ${req.path}`);
+  console.log(`🚀 Incoming request: ${req.method} ${req.path}`);
+  console.log(`🚀 Request headers:`, Object.keys(req.headers));
+  console.log(`🚀 Request body size:`, req.headers['content-length'] || 'unknown');
+  console.log(`🚀 Origin:`, req.headers['origin']);
+  console.log(`🚀 User-Agent:`, req.headers['user-agent']);
   next();
 });
@@ -40,9 +44,12 @@ const allowedOrigins = [
 app.use(cors({
   origin: function (origin, callback) {
+    console.log(`🌐 CORS check for origin: ${origin}`);
     if (!origin || allowedOrigins.indexOf(origin) !== -1) {
+      console.log(`✅ CORS allowed for origin: ${origin}`);
       callback(null, true);
     } else {
+      console.log(`❌ CORS blocked for origin: ${origin}`);
       logger.warn(`CORS blocked for origin: ${origin}`);
       callback(new Error('Not allowed by CORS'));
     }
@@ -117,7 +124,7 @@ app.use(errorHandler);

 // Configure Firebase Functions v2 for larger uploads
 export const api = onRequest({
-  timeoutSeconds: 540, // 9 minutes
+  timeoutSeconds: 1800, // 30 minutes (increased from 9 minutes)
   memory: '2GiB',
   cpu: 1,
   maxInstances: 10,
```


```diff
@@ -15,14 +15,21 @@ export interface DocumentChunk {
   updatedAt: Date;
 }

+export interface VectorSearchResult {
+  documentId: string;
+  similarityScore: number;
+  chunkContent: string;
+  metadata: Record<string, any>;
+}
+
 export class VectorDatabaseModel {
   static async storeDocumentChunks(chunks: Omit<DocumentChunk, 'id' | 'createdAt' | 'updatedAt'>[]): Promise<void> {
     const supabase = getSupabaseServiceClient();
-    const { data, error } = await supabase
+    const { error } = await supabase
       .from('document_chunks')
       .insert(chunks.map(chunk => ({
         ...chunk,
-        embedding: `[${chunk.embedding.join(',')}]` // Format for pgvector
+        embedding: `[${chunk.embedding.join(',')}]`
       })));

     if (error) {
@@ -32,4 +39,104 @@ export class VectorDatabaseModel {
     logger.info(`Stored ${chunks.length} document chunks in vector database`);
   }

+  static async getDocumentChunks(documentId: string): Promise<DocumentChunk[]> {
+    const supabase = getSupabaseServiceClient();
+    const { data, error } = await supabase
+      .from('document_chunks')
+      .select('*')
+      .eq('document_id', documentId)
+      .order('chunk_index');
+    if (error) {
+      logger.error('Failed to get document chunks', error);
+      throw error;
+    }
+    return data || [];
+  }
+
+  static async getAllChunks(): Promise<DocumentChunk[]> {
+    const supabase = getSupabaseServiceClient();
+    const { data, error } = await supabase
+      .from('document_chunks')
+      .select('*')
+      .limit(1000);
+    if (error) {
+      logger.error('Failed to get all chunks', error);
+      throw error;
+    }
+    return data || [];
+  }
+
+  static async getTotalChunkCount(): Promise<number> {
+    const supabase = getSupabaseServiceClient();
+    const { count, error } = await supabase
+      .from('document_chunks')
+      .select('*', { count: 'exact', head: true });
+    if (error) {
+      logger.error('Failed to get total chunk count', error);
+      throw error;
+    }
+    return count || 0;
+  }
+
+  static async getTotalDocumentCount(): Promise<number> {
+    const supabase = getSupabaseServiceClient();
+    const { data, error } = await supabase.rpc('count_distinct_documents');
+    if (error) {
+      logger.error('Failed to get total document count', error);
+      throw error;
+    }
+    return data || 0;
+  }
+
+  static async getAverageChunkSize(): Promise<number> {
+    const supabase = getSupabaseServiceClient();
+    const { data, error } = await supabase.rpc('average_chunk_size');
+    if (error) {
+      logger.error('Failed to get average chunk size', error);
+      throw error;
+    }
+    return data || 0;
+  }
+
+  static async getSearchAnalytics(userId: string, days: number = 30): Promise<any[]> {
+    const supabase = getSupabaseServiceClient();
+    const { data, error } = await supabase.rpc('get_search_analytics', {
+      user_id_param: userId,
+      days_param: days
+    });
+    if (error) {
+      logger.error('Failed to get search analytics', error);
+      throw error;
+    }
+    return data || [];
+  }
+
+  static async getVectorDatabaseStats(): Promise<{
+    totalChunks: number;
+    totalDocuments: number;
+    averageSimilarity: number;
+  }> {
+    const supabase = getSupabaseServiceClient();
+    const { data, error } = await supabase.rpc('get_vector_database_stats');
+    if (error) {
+      logger.error('Failed to get vector database stats', error);
+      throw error;
+    }
+    return data[0] || { totalChunks: 0, totalDocuments: 0, averageSimilarity: 0 };
+  }
 }
```


```diff
@@ -1,6 +1,6 @@
 import fs from 'fs';
 import path from 'path';
-import pool from '../config/database';
+import { getSupabaseServiceClient } from '../config/supabase';
 import logger from '../utils/logger';

 interface Migration {
@@ -16,24 +16,18 @@ class DatabaseMigrator {
     this.migrationsDir = path.join(__dirname, 'migrations');
   }

-  /**
-   * Get all migration files
-   */
   private async getMigrationFiles(): Promise<string[]> {
     try {
       const files = await fs.promises.readdir(this.migrationsDir);
       return files
         .filter(file => file.endsWith('.sql'))
-        .sort(); // Sort to ensure proper order
+        .sort();
     } catch (error) {
       logger.error('Error reading migrations directory:', error);
       throw error;
     }
   }

-  /**
-   * Load migration content
-   */
   private async loadMigration(fileName: string): Promise<Migration> {
     const filePath = path.join(this.migrationsDir, fileName);
     const sql = await fs.promises.readFile(filePath, 'utf-8');
@@ -45,68 +39,66 @@ class DatabaseMigrator {
     };
   }

-  /**
-   * Create migrations table if it doesn't exist
-   */
   private async createMigrationsTable(): Promise<void> {
-    const query = `
-      CREATE TABLE IF NOT EXISTS migrations (
-        id VARCHAR(255) PRIMARY KEY,
-        name VARCHAR(255) NOT NULL,
-        executed_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
-      );
-    `;
-
-    try {
-      await pool.query(query);
-      logger.info('Migrations table created or already exists');
-    } catch (error) {
+    const supabase = getSupabaseServiceClient();
+    const { error } = await supabase.rpc('exec_sql', {
+      sql: `
+        CREATE TABLE IF NOT EXISTS migrations (
+          id VARCHAR(255) PRIMARY KEY,
+          name VARCHAR(255) NOT NULL,
+          executed_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
+        );
+      `
+    });
+
+    if (error) {
       logger.error('Error creating migrations table:', error);
       throw error;
     }
+    logger.info('Migrations table created or already exists');
   }

-  /**
-   * Check if migration has been executed
-   */
   private async isMigrationExecuted(migrationId: string): Promise<boolean> {
-    const query = 'SELECT id FROM migrations WHERE id = $1';
-
-    try {
-      const result = await pool.query(query, [migrationId]);
-      return result.rows.length > 0;
-    } catch (error) {
+    const supabase = getSupabaseServiceClient();
+    const { data, error } = await supabase
+      .from('migrations')
+      .select('id')
+      .eq('id', migrationId);
+
+    if (error) {
       logger.error('Error checking migration status:', error);
       throw error;
     }
+    return data.length > 0;
   }

-  /**
-   * Mark migration as executed
-   */
   private async markMigrationExecuted(migrationId: string, name: string): Promise<void> {
-    const query = 'INSERT INTO migrations (id, name) VALUES ($1, $2)';
-
-    try {
-      await pool.query(query, [migrationId, name]);
-      logger.info(`Migration marked as executed: ${name}`);
-    } catch (error) {
+    const supabase = getSupabaseServiceClient();
+    const { error } = await supabase
+      .from('migrations')
+      .insert([{ id: migrationId, name }]);
+
+    if (error) {
       logger.error('Error marking migration as executed:', error);
       throw error;
     }
+    logger.info(`Migration marked as executed: ${name}`);
   }

-  /**
-   * Execute a single migration
-   */
   private async executeMigration(migration: Migration): Promise<void> {
     try {
       logger.info(`Executing migration: ${migration.name}`);

-      // Execute the migration SQL
-      await pool.query(migration.sql);
+      const supabase = getSupabaseServiceClient();
+      const { error } = await supabase.rpc('exec_sql', { sql: migration.sql });
+      if (error) {
+        throw error;
+      }

-      // Mark as executed
       await this.markMigrationExecuted(migration.id, migration.name);

       logger.info(`Migration completed: ${migration.name}`);
@@ -116,25 +108,18 @@ class DatabaseMigrator {
     }
   }

-  /**
-   * Run all pending migrations
-   */
   async migrate(): Promise<void> {
     try {
       logger.info('Starting database migration...');

-      // Create migrations table
       await this.createMigrationsTable();

-      // Get all migration files
       const migrationFiles = await this.getMigrationFiles();
       logger.info(`Found ${migrationFiles.length} migration files`);

-      // Execute each migration
       for (const fileName of migrationFiles) {
         const migration = await this.loadMigration(fileName);

-        // Check if already executed
         const isExecuted = await this.isMigrationExecuted(migration.id);

         if (!isExecuted) {
@@ -150,21 +135,6 @@ class DatabaseMigrator {
       throw error;
     }
   }

-  /**
-   * Get migration status
-   */
-  async getMigrationStatus(): Promise<{ id: string; name: string; executed_at: Date }[]> {
-    const query = 'SELECT id, name, executed_at FROM migrations ORDER BY executed_at';
-
-    try {
-      const result = await pool.query(query);
-      return result.rows;
-    } catch (error) {
-      logger.error('Error getting migration status:', error);
-      throw error;
-    }
-  }
 }

 export default DatabaseMigrator;
```


@@ -1,26 +1,19 @@
+import { v4 as uuidv4 } from 'uuid';
 import bcrypt from 'bcryptjs';
 import { UserModel } from './UserModel';
 import { DocumentModel } from './DocumentModel';
 import { ProcessingJobModel } from './ProcessingJobModel';
 import logger from '../utils/logger';
 import { config } from '../config/env';
-import pool from '../config/database';
+import { getSupabaseServiceClient } from '../config/supabase';
 class DatabaseSeeder {
-  /**
-   * Seed the database with initial data
-   */
   async seed(): Promise<void> {
     try {
       logger.info('Starting database seeding...');
-      // Seed users
       await this.seedUsers();
-      // Seed documents (if any users were created)
       await this.seedDocuments();
-      // Seed processing jobs
       await this.seedProcessingJobs();
       logger.info('Database seeding completed successfully');
@@ -30,9 +23,6 @@ class DatabaseSeeder {
     }
   }
-  /**
-   * Seed users
-   */
   private async seedUsers(): Promise<void> {
     const users = [
       {
@@ -57,14 +47,11 @@ class DatabaseSeeder {
     for (const userData of users) {
       try {
-        // Check if user already exists
         const existingUser = await UserModel.findByEmail(userData.email);
         if (!existingUser) {
-          // Hash password
           const hashedPassword = await bcrypt.hash(userData.password, config.security.bcryptRounds);
-          // Create user
           await UserModel.create({
             ...userData,
             password: hashedPassword
@@ -80,12 +67,8 @@ class DatabaseSeeder {
     }
   }
-  /**
-   * Seed documents
-   */
   private async seedDocuments(): Promise<void> {
     try {
-      // Get a user to associate documents with
       const user = await UserModel.findByEmail('user1@example.com');
       if (!user) {
@@ -98,28 +81,27 @@ class DatabaseSeeder {
         user_id: user.id,
         original_file_name: 'sample_cim_1.pdf',
         file_path: '/uploads/sample_cim_1.pdf',
-        file_size: 2048576, // 2MB
+        file_size: 2048576,
         status: 'completed' as const
       },
       {
         user_id: user.id,
         original_file_name: 'sample_cim_2.pdf',
         file_path: '/uploads/sample_cim_2.pdf',
-        file_size: 3145728, // 3MB
+        file_size: 3145728,
         status: 'processing_llm' as const
       },
       {
         user_id: user.id,
         original_file_name: 'sample_cim_3.pdf',
         file_path: '/uploads/sample_cim_3.pdf',
-        file_size: 1048576, // 1MB
+        file_size: 1048576,
         status: 'uploaded' as const
       }
     ];
     for (const docData of documents) {
       try {
-        // Check if document already exists (by file path)
         const existingDocs = await DocumentModel.findByUserId(user.id);
         const exists = existingDocs.some(doc => doc.file_path === docData.file_path);
@@ -138,12 +120,8 @@ class DatabaseSeeder {
     }
   }
-  /**
-   * Seed processing jobs
-   */
   private async seedProcessingJobs(): Promise<void> {
     try {
-      // Get a document to associate jobs with
       const user = await UserModel.findByEmail('user1@example.com');
       if (!user) {
         logger.warn('No user found for seeding processing jobs');
@@ -157,7 +135,7 @@ class DatabaseSeeder {
         return;
       }
-      const document = documents[0]; // Use first document
+      const document = documents[0];
       if (!document) {
         logger.warn('No document found for seeding processing jobs');
@@ -187,7 +165,6 @@ class DatabaseSeeder {
       for (const jobData of jobs) {
         try {
-          // Check if job already exists
           const existingJobs = await ProcessingJobModel.findByDocumentId(document.id);
           const exists = existingJobs.some(job => job.type === jobData.type);
@@ -197,7 +174,6 @@ class DatabaseSeeder {
             type: jobData.type
           });
-          // Update status and progress
           await ProcessingJobModel.updateStatus(job.id, jobData.status);
           await ProcessingJobModel.updateProgress(job.id, jobData.progress);
@@ -214,23 +190,16 @@ class DatabaseSeeder {
     }
   }
-  /**
-   * Clear all seeded data
-   */
   async clear(): Promise<void> {
     try {
       logger.info('Clearing seeded data...');
-      // Clear in reverse order to respect foreign key constraints
-      await pool.query('DELETE FROM processing_jobs');
-      await pool.query('DELETE FROM document_versions');
-      await pool.query('DELETE FROM document_feedback');
-      await pool.query('DELETE FROM documents');
-      await pool.query('DELETE FROM users WHERE email IN ($1, $2, $3)', [
-        'admin@example.com',
-        'user1@example.com',
-        'user2@example.com'
-      ]);
+      const supabase = getSupabaseServiceClient();
+      await supabase.from('processing_jobs').delete().neq('id', uuidv4());
+      await supabase.from('document_versions').delete().neq('id', uuidv4());
+      await supabase.from('document_feedback').delete().neq('id', uuidv4());
+      await supabase.from('documents').delete().neq('id', uuidv4());
+      await supabase.from('users').delete().in('email', ['admin@example.com', 'user1@example.com', 'user2@example.com']);
       logger.info('Seeded data cleared successfully');
     } catch (error) {
@@ -240,4 +209,4 @@ class DatabaseSeeder {
   }
 }
 export default DatabaseSeeder;
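The rewritten `clear()` trades raw SQL `DELETE` statements for Supabase query-builder calls. The `.neq('id', uuidv4())` filter looks odd: it exists because PostgREST refuses an unfiltered delete, so a filter that every row satisfies (id not equal to a UUID generated this instant, which no stored row can share) deletes the whole table. The pattern simulated locally (a sketch; the object shape is illustrative, not the supabase-js internals):

```typescript
import { randomUUID } from "crypto";

// A "match everything" delete filter: no existing row's id can equal a UUID
// generated right now, so `id != <fresh UUID>` matches every row.
function matchAllDeleteFilter(): { column: string; op: "neq"; value: string } {
  return { column: "id", op: "neq", value: randomUUID() };
}

// Applying the filter to an in-memory table shows nothing survives it:
const rows = [{ id: "a1" }, { id: "b2" }, { id: "c3" }];
const f = matchAllDeleteFilter();
const surviving = rows.filter((r) => !(r.id !== f.value)); // rows NOT matched by neq
console.log(surviving.length); // 0 — every row is selected for deletion
```

Note the original raw-SQL version deleted tables in reverse dependency order to respect foreign keys; the Supabase version above preserves that ordering, which still matters unless the constraints cascade.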

View File

@@ -23,16 +23,13 @@ const router = express.Router();
 router.use(verifyFirebaseToken);
 router.use(addCorrelationId);
-// NEW Firebase Storage direct upload routes
-router.post('/upload-url', documentController.getUploadUrl);
-router.post('/:id/confirm-upload', validateUUID('id'), documentController.confirmUpload);
+// Add logging middleware for document routes
+router.use((req, res, next) => {
+  console.log(`📄 Document route accessed: ${req.method} ${req.path}`);
+  next();
+});
-// LEGACY multipart upload routes (keeping for backward compatibility)
-router.post('/upload', handleFileUpload, documentController.uploadDocument);
-router.post('/', handleFileUpload, documentController.uploadDocument);
-router.get('/', documentController.getDocuments);
-// Analytics endpoints (MUST come before /:id routes to avoid conflicts)
+// Analytics endpoints (MUST come before ANY routes with :id parameters)
 router.get('/analytics', async (req, res) => {
   try {
     const userId = req.user?.uid;
@@ -44,11 +41,9 @@ router.get('/analytics', async (req, res) => {
     }
     const days = parseInt(req.query['days'] as string) || 30;
     // Import the service here to avoid circular dependencies
     const { agenticRAGDatabaseService } = await import('../services/agenticRAGDatabaseService');
     const analytics = await agenticRAGDatabaseService.getAnalyticsData(days);
     return res.json({
       ...analytics,
       correlationId: req.correlationId || undefined
@@ -84,6 +79,15 @@ router.get('/processing-stats', async (req, res) => {
   }
 });
+// NEW Firebase Storage direct upload routes
+router.post('/upload-url', documentController.getUploadUrl);
+router.post('/:id/confirm-upload', validateUUID('id'), documentController.confirmUpload);
+// LEGACY multipart upload routes (keeping for backward compatibility)
+router.post('/upload', handleFileUpload, documentController.uploadDocument);
+router.post('/', handleFileUpload, documentController.uploadDocument);
+router.get('/', documentController.getDocuments);
 // Document-specific routes with UUID validation
 router.get('/:id', validateUUID('id'), documentController.getDocument);
 router.get('/:id/progress', validateUUID('id'), documentController.getDocumentProgress);
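The "MUST come before ANY routes with :id parameters" comment reflects how Express dispatches: routes are tried in registration order, so if `/:id` were registered first, a request for `/analytics` would match it with `id = "analytics"` and fail UUID validation. A minimal matcher (not Express itself, just an illustration of the ordering rule):

```typescript
// Simplified first-match dispatch: literal patterns match exactly, and a
// parameter segment like "/:id" matches any single path segment.
type Route = { pattern: string };

function firstMatch(routes: Route[], path: string): string {
  for (const r of routes) {
    if (r.pattern === path) return r.pattern;
    if (r.pattern.startsWith("/:")) return r.pattern; // param swallows anything
  }
  return "none";
}

const wrongOrder: Route[] = [{ pattern: "/:id" }, { pattern: "/analytics" }];
const rightOrder: Route[] = [{ pattern: "/analytics" }, { pattern: "/:id" }];

console.log(firstMatch(wrongOrder, "/analytics")); // "/:id" — analytics swallowed
console.log(firstMatch(rightOrder, "/analytics")); // "/analytics"
```

This is why the analytics and upload routes are grouped ahead of the `validateUUID('id')` routes in the file above.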

View File

@@ -1,4 +1,8 @@
 import { logger } from '../utils/logger';
+import { DocumentProcessorServiceClient } from '@google-cloud/documentai';
+import { Storage } from '@google-cloud/storage';
+import { config } from '../config/env';
+import pdf from 'pdf-parse';
 interface ProcessingResult {
   success: boolean;
@@ -7,11 +11,46 @@ interface ProcessingResult {
   error?: string;
 }
+interface DocumentAIOutput {
+  text: string;
+  entities: Array<{
+    type: string;
+    mentionText: string;
+    confidence: number;
+  }>;
+  tables: Array<any>;
+  pages: Array<any>;
+  mimeType: string;
+}
+interface PageChunk {
+  startPage: number;
+  endPage: number;
+  buffer: Buffer;
+}
 export class DocumentAiGenkitProcessor {
   private gcsBucketName: string;
+  private documentAiClient: DocumentProcessorServiceClient;
+  private storageClient: Storage;
+  private processorName: string;
+  private readonly MAX_PAGES_PER_CHUNK = 30;
   constructor() {
-    this.gcsBucketName = process.env['GCS_BUCKET_NAME'] || 'cim-summarizer-uploads';
+    this.gcsBucketName = config.googleCloud.gcsBucketName;
+    this.documentAiClient = new DocumentProcessorServiceClient();
+    this.storageClient = new Storage();
+    // Construct the processor name
+    this.processorName = `projects/${config.googleCloud.projectId}/locations/${config.googleCloud.documentAiLocation}/processors/${config.googleCloud.documentAiProcessorId}`;
+    logger.info('Document AI + Genkit processor initialized', {
+      projectId: config.googleCloud.projectId,
+      location: config.googleCloud.documentAiLocation,
+      processorId: config.googleCloud.documentAiProcessorId,
+      processorName: this.processorName,
+      maxPagesPerChunk: this.MAX_PAGES_PER_CHUNK
+    });
   }
   async processDocument(
@@ -19,135 +58,331 @@ export class DocumentAiGenkitProcessor {
     userId: string,
     fileBuffer: Buffer,
     fileName: string,
-    _mimeType: string
+    mimeType: string
   ): Promise<ProcessingResult> {
     const startTime = Date.now();
     try {
-      logger.info('Starting Document AI + Genkit processing', {
+      logger.info('Starting Document AI + Agentic RAG processing', {
         documentId,
         userId,
         fileName,
-        fileSize: fileBuffer.length
+        fileSize: fileBuffer.length,
+        mimeType
       });
-      // Step 1: Upload file to GCS
-      const gcsFilePath = await this.uploadToGCS(fileBuffer, fileName);
-      logger.info('File uploaded to GCS', { gcsFilePath });
+      // Step 1: Extract text using Document AI or fallback
+      const extractedText = await this.extractTextFromDocument(fileBuffer, fileName, mimeType);
+      if (!extractedText) {
+        throw new Error('Failed to extract text from document');
+      }
-      // Step 2: Process with Document AI
-      const documentAiOutput = await this.processWithDocumentAI(gcsFilePath);
-      logger.info('Document AI processing completed', {
-        textLength: documentAiOutput?.text?.length || 0,
-        entitiesCount: documentAiOutput?.entities?.length || 0
-      });
+      logger.info('Text extraction completed', {
+        textLength: extractedText.length
+      });
-      // Step 3: Process with Genkit
-      const genkitOutput = await this.processWithGenkit(fileName);
-      logger.info('Genkit processing completed', {
-        outputLength: genkitOutput?.markdownOutput?.length || 0
-      });
-      // Step 4: Cleanup GCS files
-      await this.cleanupGCSFiles(gcsFilePath);
-      logger.info('GCS cleanup completed');
+      // Step 2: Process extracted text through Agentic RAG
+      const agenticRagResult = await this.processWithAgenticRAG(documentId, extractedText);
       const processingTime = Date.now() - startTime;
       return {
         success: true,
-        content: genkitOutput?.markdownOutput || 'No analysis generated',
+        content: agenticRagResult.summary || extractedText,
         metadata: {
-          processingStrategy: 'document_ai_genkit',
+          processingStrategy: 'document_ai_agentic_rag',
           processingTime,
-          documentAiOutput,
-          genkitOutput,
+          extractedTextLength: extractedText.length,
+          agenticRagResult,
           fileSize: fileBuffer.length,
-          fileName
+          fileName,
+          mimeType
         }
       };
     } catch (error) {
       const processingTime = Date.now() - startTime;
-      logger.error('Document AI + Genkit processing failed', {
+      const errorMessage = error instanceof Error ? error.message : String(error);
+      const errorStack = error instanceof Error ? error.stack : undefined;
+      const errorDetails = error instanceof Error ? {
+        name: error.name,
+        message: error.message,
+        stack: error.stack
+      } : {
+        type: typeof error,
+        value: error
+      };
+      logger.error('Document AI + Agentic RAG processing failed', {
         documentId,
-        error: error instanceof Error ? error.message : String(error),
-        stack: error instanceof Error ? error.stack : undefined
+        error: errorMessage,
+        errorDetails,
+        stack: errorStack,
+        processingTime
       });
       return {
         success: false,
         content: '',
-        error: `Document AI + Genkit processing failed: ${error instanceof Error ? error.message : String(error)}`,
+        error: `Document AI + Agentic RAG processing failed: ${errorMessage}`,
         metadata: {
-          processingStrategy: 'document_ai_genkit',
+          processingStrategy: 'document_ai_agentic_rag',
           processingTime,
-          error: error instanceof Error ? error.message : String(error)
+          error: errorMessage,
+          errorDetails,
+          stack: errorStack
         }
       };
     }
   }
+  private async extractTextFromDocument(fileBuffer: Buffer, fileName: string, mimeType: string): Promise<string> {
+    try {
+      // Check document size first
+      const pdfData = await pdf(fileBuffer);
+      const totalPages = pdfData.numpages;
+      logger.info('PDF analysis completed', {
+        totalPages,
+        textLength: pdfData.text?.length || 0
+      });
+      // If document has more than 30 pages, use pdf-parse fallback
+      if (totalPages > this.MAX_PAGES_PER_CHUNK) {
+        logger.warn('Document exceeds Document AI page limit, using pdf-parse fallback', {
+          totalPages,
+          maxPagesPerChunk: this.MAX_PAGES_PER_CHUNK
+        });
+        return pdfData.text || '';
+      }
+      // For documents <= 30 pages, use Document AI
+      logger.info('Using Document AI for text extraction', {
+        totalPages,
+        maxPagesPerChunk: this.MAX_PAGES_PER_CHUNK
+      });
+      // Upload file to GCS
+      const gcsFilePath = await this.uploadToGCS(fileBuffer, fileName);
+      // Process with Document AI
+      const documentAiOutput = await this.processWithDocumentAI(gcsFilePath, mimeType);
+      // Cleanup GCS file
+      await this.cleanupGCSFiles(gcsFilePath);
+      return documentAiOutput.text;
+    } catch (error) {
+      logger.error('Text extraction failed, using pdf-parse fallback', {
+        error: error instanceof Error ? error.message : String(error)
+      });
+      // Fallback to pdf-parse
+      try {
+        const pdfData = await pdf(fileBuffer);
+        return pdfData.text || '';
+      } catch (fallbackError) {
+        logger.error('Both Document AI and pdf-parse failed', {
+          originalError: error instanceof Error ? error.message : String(error),
+          fallbackError: fallbackError instanceof Error ? fallbackError.message : String(fallbackError)
+        });
+        throw new Error('Failed to extract text from document using any method');
+      }
+    }
+  }
+  private async processWithAgenticRAG(documentId: string, extractedText: string): Promise<any> {
+    try {
+      logger.info('Processing extracted text with Agentic RAG', {
+        documentId,
+        textLength: extractedText.length
+      });
+      // Import and use the optimized agentic RAG processor
+      logger.info('Importing optimized agentic RAG processor...');
+      const { optimizedAgenticRAGProcessor } = await import('./optimizedAgenticRAGProcessor');
+      logger.info('Agentic RAG processor imported successfully', {
+        processorType: typeof optimizedAgenticRAGProcessor,
+        hasProcessLargeDocument: typeof optimizedAgenticRAGProcessor?.processLargeDocument === 'function'
+      });
+      logger.info('Calling processLargeDocument...');
+      const result = await optimizedAgenticRAGProcessor.processLargeDocument(
+        documentId,
+        extractedText,
+        {}
+      );
+      logger.info('Agentic RAG processing completed', {
+        success: result.success,
+        summaryLength: result.summary?.length || 0,
+        analysisDataKeys: result.analysisData ? Object.keys(result.analysisData) : [],
+        resultType: typeof result
+      });
+      return result;
+    } catch (error) {
+      const errorMessage = error instanceof Error ? error.message : String(error);
+      const errorStack = error instanceof Error ? error.stack : undefined;
+      const errorDetails = error instanceof Error ? {
+        name: error.name,
+        message: error.message,
+        stack: error.stack
+      } : {
+        type: typeof error,
+        value: error
+      };
+      logger.error('Agentic RAG processing failed', {
+        documentId,
+        error: errorMessage,
+        errorDetails,
+        stack: errorStack
+      });
+      throw error;
+    }
+  }
   private async uploadToGCS(fileBuffer: Buffer, fileName: string): Promise<string> {
-    // This is a placeholder implementation
-    // In production, this would upload to Google Cloud Storage
-    logger.info('Uploading file to GCS (placeholder)', { fileName, fileSize: fileBuffer.length });
-    // Simulate upload delay
-    await new Promise(resolve => setTimeout(resolve, 100));
-    return `gs://${this.gcsBucketName}/uploads/${fileName}`;
+    try {
+      const bucket = this.storageClient.bucket(this.gcsBucketName);
+      const file = bucket.file(`uploads/${Date.now()}_${fileName}`);
+      logger.info('Uploading file to GCS', {
+        fileName,
+        fileSize: fileBuffer.length,
+        bucket: this.gcsBucketName,
+        destination: file.name
+      });
+      await file.save(fileBuffer, {
+        metadata: {
+          contentType: 'application/pdf'
+        }
+      });
+      logger.info('File uploaded successfully to GCS', {
+        gcsPath: `gs://${this.gcsBucketName}/${file.name}`
+      });
+      return `gs://${this.gcsBucketName}/${file.name}`;
+    } catch (error) {
+      logger.error('Failed to upload file to GCS', {
+        fileName,
+        error: error instanceof Error ? error.message : String(error)
+      });
+      throw error;
+    }
   }
-  private async processWithDocumentAI(gcsFilePath: string): Promise<any> {
-    // This is a placeholder implementation
-    // In production, this would call Google Cloud Document AI
-    logger.info('Processing with Document AI (placeholder)', { gcsFilePath });
-    // Simulate Document AI processing
-    await new Promise(resolve => setTimeout(resolve, 200));
-    return {
-      text: 'Sample extracted text from Document AI',
-      entities: [
-        { type: 'COMPANY_NAME', mentionText: 'Sample Company', confidence: 0.95 },
-        { type: 'MONEY', mentionText: '$10M', confidence: 0.90 }
-      ],
-      tables: []
-    };
-  }
-  private async processWithGenkit(fileName: string): Promise<any> {
-    // This is a placeholder implementation
-    // In production, this would call Genkit for AI analysis
-    logger.info('Processing with Genkit (placeholder)', { fileName });
-    // Simulate Genkit processing
-    await new Promise(resolve => setTimeout(resolve, 300));
-    return {
-      markdownOutput: `# CIM Analysis: ${fileName}
-## Executive Summary
-Sample analysis generated by Document AI + Genkit integration.
-## Key Findings
-- Document processed successfully
-- AI analysis completed
-- Integration working as expected
----
-*Generated by Document AI + Genkit integration*`
-    };
-  }
+  private async processWithDocumentAI(gcsFilePath: string, mimeType: string): Promise<DocumentAIOutput> {
+    try {
+      logger.info('Processing with Document AI', {
+        gcsFilePath,
+        processorName: this.processorName,
+        mimeType
+      });
+      // Create the request
+      const request = {
+        name: this.processorName,
+        rawDocument: {
+          content: '', // We'll use GCS source instead
+          mimeType: mimeType
+        },
+        gcsDocument: {
+          gcsUri: gcsFilePath,
+          mimeType: mimeType
+        }
+      };
+      logger.info('Sending Document AI request', {
+        processorName: this.processorName,
+        gcsUri: gcsFilePath
+      });
+      // Process the document
+      const [result] = await this.documentAiClient.processDocument(request);
+      const { document } = result;
+      if (!document) {
+        throw new Error('Document AI returned no document');
+      }
+      logger.info('Document AI processing successful', {
+        textLength: document.text?.length || 0,
+        pagesCount: document.pages?.length || 0,
+        entitiesCount: document.entities?.length || 0
+      });
+      // Extract text
+      const text = document.text || '';
+      // Extract entities
+      const entities = document.entities?.map(entity => ({
+        type: entity.type || 'UNKNOWN',
+        mentionText: entity.mentionText || '',
+        confidence: entity.confidence || 0
+      })) || [];
+      // Extract tables
+      const tables = document.pages?.flatMap(page =>
+        page.tables?.map(table => ({
+          rows: table.headerRows?.length || 0,
+          columns: table.bodyRows?.[0]?.cells?.length || 0
+        })) || []
+      ) || [];
+      // Extract pages info
+      const pages = document.pages?.map(page => ({
+        pageNumber: page.pageNumber || 0,
+        blocksCount: page.blocks?.length || 0
+      })) || [];
+      return {
+        text,
+        entities,
+        tables,
+        pages,
+        mimeType: document.mimeType || mimeType
+      };
+    } catch (error) {
+      logger.error('Document AI processing failed', {
+        gcsFilePath,
+        processorName: this.processorName,
+        error: error instanceof Error ? error.message : String(error),
+        stack: error instanceof Error ? error.stack : undefined
+      });
+      throw error;
+    }
+  }
   private async cleanupGCSFiles(gcsFilePath: string): Promise<void> {
-    // This is a placeholder implementation
-    // In production, this would delete files from Google Cloud Storage
-    logger.info('Cleaning up GCS files (placeholder)', { gcsFilePath });
-    // Simulate cleanup delay
-    await new Promise(resolve => setTimeout(resolve, 50));
+    try {
+      const bucketName = gcsFilePath.replace('gs://', '').split('/')[0];
+      const fileName = gcsFilePath.replace(`gs://${bucketName}/`, '');
+      logger.info('Cleaning up GCS files', { gcsFilePath, bucketName, fileName });
+      const bucket = this.storageClient.bucket(bucketName);
+      const file = bucket.file(fileName);
+      await file.delete();
+      logger.info('GCS file cleanup completed', { gcsFilePath });
+    } catch (error) {
+      logger.warn('Failed to cleanup GCS files', {
+        gcsFilePath,
+        error: error instanceof Error ? error.message : String(error)
+      });
+      // Don't throw error for cleanup failures
+    }
   }
 }
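The new `extractTextFromDocument` routes on page count: `pdf-parse` cheaply reads the page total, and only documents within the processor's synchronous page limit go through Document AI, while larger ones keep the locally parsed text. The decision in isolation (a sketch of the routing only; the real method also uploads to GCS, calls the processor, and cleans up):

```typescript
// Route by page count, mirroring the MAX_PAGES_PER_CHUNK = 30 limit above.
const MAX_PAGES_PER_CHUNK = 30;

type ExtractionRoute = "document_ai" | "pdf_parse_fallback";

function chooseExtractionRoute(totalPages: number): ExtractionRoute {
  // At or under the limit: send to Document AI; over it: keep local text.
  return totalPages > MAX_PAGES_PER_CHUNK ? "pdf_parse_fallback" : "document_ai";
}

console.log(chooseExtractionRoute(12));  // → "document_ai"
console.log(chooseExtractionRoute(250)); // → "pdf_parse_fallback"
```

Note the method also falls back to `pdf-parse` when the Document AI path throws, so the fallback serves double duty: size limit and error recovery.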

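`cleanupGCSFiles` splits a `gs://` URI back into bucket and object name via string replacement before deleting. The same parsing as a standalone helper (a sketch mirroring the method's logic; the sample path is hypothetical):

```typescript
// Split "gs://<bucket>/<object>" into its bucket and object-name parts,
// the same way cleanupGCSFiles does with replace/split.
function parseGcsUri(gcsUri: string): { bucket: string; objectName: string } {
  const withoutScheme = gcsUri.replace("gs://", "");
  const bucket = withoutScheme.split("/")[0];
  const objectName = withoutScheme.slice(bucket.length + 1);
  return { bucket, objectName };
}

const parsed = parseGcsUri("gs://cim-summarizer-uploads/uploads/1722530276_sample.pdf");
console.log(parsed.bucket);     // → "cim-summarizer-uploads"
console.log(parsed.objectName); // → "uploads/1722530276_sample.pdf"
```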
View File

@@ -83,9 +83,19 @@ export class OptimizedAgenticRAGProcessor {
       logger.info(`Optimized processing completed for document: ${documentId}`, result);
+      console.log('✅ Optimized agentic RAG processing completed successfully for document:', documentId);
+      console.log('✅ Total chunks processed:', result.processedChunks);
+      console.log('✅ Processing time:', result.processingTime, 'ms');
+      console.log('✅ Memory usage:', result.memoryUsage, 'MB');
+      console.log('✅ Summary length:', result.summary?.length || 0);
       return result;
     } catch (error) {
       logger.error(`Optimized processing failed for document: ${documentId}`, error);
+      console.log('❌ Optimized agentic RAG processing failed for document:', documentId);
+      console.log('❌ Error:', error instanceof Error ? error.message : String(error));
       throw error;
     }
   }

View File

@@ -169,6 +169,9 @@ class UnifiedDocumentProcessor {
     } catch (error) {
       logger.error('Optimized agentic RAG processing failed', { documentId, error });
+      console.log('❌ Unified document processor - optimized agentic RAG failed for document:', documentId);
+      console.log('❌ Error:', error instanceof Error ? error.message : String(error));
       return {
         success: false,
         summary: '',
@@ -188,33 +191,60 @@
     documentId: string,
     userId: string,
     text: string,
-    _options: any
+    options: any
   ): Promise<ProcessingResult> {
     logger.info('Using Document AI + Genkit processing strategy', { documentId });
     const startTime = Date.now();
     try {
-      // For now, we'll use the existing text extraction
-      // In a full implementation, this would use the Document AI processor
+      // Get the file buffer from options if available, otherwise use text
+      const fileBuffer = options.fileBuffer || Buffer.from(text);
+      const fileName = options.fileName || `document-${documentId}.pdf`;
+      const mimeType = options.mimeType || 'application/pdf';
+      logger.info('Document AI processing with file data', {
+        documentId,
+        fileSize: fileBuffer.length,
+        fileName,
+        mimeType
+      });
       const result = await documentAiGenkitProcessor.processDocument(
         documentId,
         userId,
-        Buffer.from(text), // Convert text to buffer for processing
-        `document-${documentId}.txt`,
-        'text/plain'
+        fileBuffer,
+        fileName,
+        mimeType
       );
+      if (!result.success) {
+        logger.error('Document AI processing failed', {
+          documentId,
+          error: result.error,
+          metadata: result.metadata
+        });
+      }
       return {
         success: result.success,
         summary: result.content || '',
-        analysisData: (result.metadata?.analysisData as CIMReview) || {} as CIMReview,
+        analysisData: (result.metadata?.agenticRagResult?.analysisData as CIMReview) || {} as CIMReview,
         processingStrategy: 'document_ai_genkit',
         processingTime: Date.now() - startTime,
-        apiCalls: 1, // Document AI + Genkit typically uses fewer API calls
+        apiCalls: 1, // Document AI + Agentic RAG typically uses fewer API calls
         error: result.error || undefined
       };
     } catch (error) {
+      const errorMessage = error instanceof Error ? error.message : String(error);
+      const errorStack = error instanceof Error ? error.stack : undefined;
+      logger.error('Document AI + Genkit processing failed with exception', {
+        documentId,
+        error: errorMessage,
+        stack: errorStack
+      });
       return {
         success: false,
         summary: '',
@@ -222,7 +252,7 @@
         processingStrategy: 'document_ai_genkit',
         processingTime: Date.now() - startTime,
         apiCalls: 0,
-        error: error instanceof Error ? error.message : 'Unknown error'
+        error: errorMessage
       };
     }
   }

View File

@@ -241,13 +241,14 @@ class VectorDatabaseService {
    * Store document chunks with embeddings
    */
   async storeDocumentChunks(chunks: DocumentChunk[]): Promise<void> {
-    const initialized = await this.ensureInitialized();
-    if (!initialized) {
-      logger.warn('Vector database not available, skipping chunk storage');
-      return;
-    }
     try {
+      const isInitialized = await this.ensureInitialized();
+      if (!isInitialized) {
+        logger.warn('Vector database not initialized, skipping chunk storage');
+        return;
+      }
       switch (this.provider) {
         case 'pinecone':
           await this.storeInPinecone(chunks);
@@ -261,11 +262,14 @@
         case 'supabase':
           await this.storeInSupabase(chunks);
           break;
+        default:
+          logger.warn(`Vector database provider ${this.provider} not supported for storage`);
       }
-      logger.info(`Stored ${chunks.length} document chunks in vector database`);
     } catch (error) {
-      logger.error('Failed to store document chunks', error);
-      throw new Error('Vector storage failed');
+      // Log the error but don't fail the entire upload process
+      logger.error('Failed to store document chunks in vector database:', error);
+      logger.warn('Continuing with upload process without vector storage');
+      // Don't throw the error - let the upload continue
     }
   }
@@ -422,7 +426,6 @@
   async getVectorDatabaseStats(): Promise<{
     totalChunks: number;
     totalDocuments: number;
-    totalSearches: number;
     averageSimilarity: number;
   }> {
     try {
@@ -521,13 +524,19 @@
         .upsert(supabaseRows);
       if (error) {
+        // Check if it's a table/column missing error
+        if (error.message && (error.message.includes('chunkIndex') || error.message.includes('document_chunks'))) {
+          logger.warn('Vector database table/columns not available, skipping vector storage:', error.message);
+          return; // Don't throw, just skip vector storage
+        }
        throw error;
      }
      logger.info(`Successfully stored ${chunks.length} chunks in Supabase`);
    } catch (error) {
      logger.error('Failed to store chunks in Supabase:', error);
-      throw error;
+      // Don't throw the error - let the upload continue without vector storage
+      logger.warn('Continuing upload process without vector storage');
    }
  }
@@ -581,4 +590,4 @@
   }
 }
 export const vectorDatabaseService = new VectorDatabaseService();
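The change to `storeDocumentChunks` makes vector storage best-effort: a failure is logged and the upload pipeline continues, instead of aborting the whole document. The pattern in isolation (a sketch with hypothetical names; the real code logs via its `logger` and proceeds without the chunks):

```typescript
// Wrap an operation so a failure is reported but never propagated;
// the caller proceeds with `undefined` in place of the result.
async function bestEffort<T>(
  op: () => Promise<T>,
  onFailure: (err: unknown) => void
): Promise<T | undefined> {
  try {
    return await op();
  } catch (err) {
    onFailure(err); // log and swallow
    return undefined;
  }
}

// A failing store no longer aborts the surrounding flow:
async function demo(): Promise<{ result: unknown; failures: number }> {
  const failures: unknown[] = [];
  const result = await bestEffort(
    async () => { throw new Error("vector db unavailable"); },
    (e) => failures.push(e)
  );
  return { result, failures: failures.length };
}

demo().then(({ result, failures }) => console.log(result, failures)); // undefined 1
```

The trade-off is silent degradation: semantic search over the document won't work until the chunks are re-stored, which is why the code logs a warning rather than nothing.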

View File

@@ -1,89 +1,76 @@
--- Enable the pgvector extension
-CREATE EXTENSION IF NOT EXISTS vector;
--- Create document_chunks table with vector support
-CREATE TABLE IF NOT EXISTS document_chunks (
-  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-  document_id VARCHAR(255) NOT NULL,
-  chunk_index INTEGER NOT NULL,
-  content TEXT NOT NULL,
-  embedding vector(1536), -- OpenAI embeddings are 1536 dimensions
-  metadata JSONB DEFAULT '{}',
-  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-);
--- Create indexes for better performance
-CREATE INDEX IF NOT EXISTS document_chunks_document_id_idx ON document_chunks(document_id);
-CREATE INDEX IF NOT EXISTS document_chunks_embedding_idx ON document_chunks USING ivfflat (embedding vector_cosine_ops);
--- Create function to enable pgvector (for RPC calls)
-CREATE OR REPLACE FUNCTION enable_pgvector()
-RETURNS VOID AS $$
-BEGIN
-  CREATE EXTENSION IF NOT EXISTS vector;
-END;
-$$ LANGUAGE plpgsql;
--- Create function to create document_chunks table (for RPC calls)
-CREATE OR REPLACE FUNCTION create_document_chunks_table()
-RETURNS VOID AS $$
-BEGIN
-  CREATE TABLE IF NOT EXISTS document_chunks (
-    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-    document_id VARCHAR(255) NOT NULL,
-    chunk_index INTEGER NOT NULL,
-    content TEXT NOT NULL,
-    embedding vector(1536),
-    metadata JSONB DEFAULT '{}',
-    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-  );
-  CREATE INDEX IF NOT EXISTS document_chunks_document_id_idx ON document_chunks(document_id);
-  CREATE INDEX IF NOT EXISTS document_chunks_embedding_idx ON document_chunks USING ivfflat (embedding vector_cosine_ops);
-END;
-$$ LANGUAGE plpgsql;
--- Create function to match documents based on vector similarity
-CREATE OR REPLACE FUNCTION match_documents(
-  query_embedding vector(1536),
-  match_threshold float DEFAULT 0.7,
-  match_count int DEFAULT 10
-)
-RETURNS TABLE(
-  id UUID,
-  content TEXT,
-  metadata JSONB,
-  document_id VARCHAR(255),
-  similarity FLOAT
-) AS $$
-BEGIN
-  RETURN QUERY
-  SELECT
-    document_chunks.id,
-    document_chunks.content,
-    document_chunks.metadata,
-    document_chunks.document_id,
-    1 - (document_chunks.embedding <=> query_embedding) AS similarity
-  FROM document_chunks
-  WHERE 1 - (document_chunks.embedding <=> query_embedding) > match_threshold
-  ORDER BY document_chunks.embedding <=> query_embedding
-  LIMIT match_count;
-END;
-$$ LANGUAGE plpgsql;
--- Enable Row Level Security (RLS) if needed
--- ALTER TABLE document_chunks ENABLE ROW LEVEL SECURITY;
+-- Create the document_chunks table
+CREATE TABLE IF NOT EXISTS document_chunks (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  document_id UUID NOT NULL,
+  content TEXT,
+  metadata JSONB,
+  embedding VECTOR(1536),
+  chunk_index INTEGER,
+  section TEXT,
+  page_number INTEGER,
+  created_at TIMESTAMPTZ DEFAULT NOW(),
+  updated_at TIMESTAMPTZ DEFAULT NOW()
+);
+-- Create the vector_similarity_searches table
+CREATE TABLE IF NOT EXISTS vector_similarity_searches (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  user_id UUID,
+  query_text TEXT,
+  query_embedding VECTOR(1536),
+  search_results JSONB,
+  filters JSONB,
+  limit_count INTEGER,
+  similarity_threshold REAL,
+  processing_time_ms INTEGER,
+  created_at TIMESTAMPTZ DEFAULT NOW()
+);
+-- Create the function to count distinct documents
+CREATE OR REPLACE FUNCTION count_distinct_documents()
+RETURNS INTEGER AS $$
+BEGIN
+  RETURN (SELECT COUNT(DISTINCT document_id) FROM document_chunks);
+END;
+$$ LANGUAGE plpgsql;
+-- Create the function to get the average chunk size
+CREATE OR REPLACE FUNCTION average_chunk_size()
+RETURNS INTEGER AS $$
+BEGIN
+  RETURN (SELECT AVG(LENGTH(content)) FROM document_chunks);
+END;
+$$ LANGUAGE plpgsql;
+-- Create the function to get search analytics
+CREATE OR REPLACE FUNCTION get_search_analytics(user_id_param UUID, days_param INTEGER)
+RETURNS TABLE(query_text TEXT, search_count BIGINT) AS $$
+BEGIN
+  RETURN QUERY
+  SELECT
+    vs.query_text,
+    COUNT(*) as search_count
+  FROM
+    vector_similarity_searches vs
+  WHERE
+    vs.user_id = user_id_param AND
+    vs.created_at >= NOW() - (days_param * INTERVAL '1 day')
+  GROUP BY
+    vs.query_text
+  ORDER BY
+    search_count DESC
+  LIMIT 20;
+END;
+$$ LANGUAGE plpgsql;
+-- Create the function to get vector database stats
+CREATE OR REPLACE FUNCTION get_vector_database_stats()
RETURNS TABLE(total_chunks BIGINT, total_documents BIGINT, average_similarity REAL) AS $$
-- Create policies for RLS (adjust as needed for your auth requirements) BEGIN
-- CREATE POLICY "Users can view all document chunks" ON document_chunks FOR SELECT USING (true); RETURN QUERY
-- CREATE POLICY "Users can insert document chunks" ON document_chunks FOR INSERT WITH CHECK (true); SELECT
-- CREATE POLICY "Users can update document chunks" ON document_chunks FOR UPDATE USING (true); (SELECT COUNT(*) FROM document_chunks),
-- CREATE POLICY "Users can delete document chunks" ON document_chunks FOR DELETE USING (true); (SELECT COUNT(DISTINCT document_id) FROM document_chunks),
(SELECT AVG(similarity_score) FROM document_similarities WHERE similarity_score > 0);
-- Grant necessary permissions END;
GRANT ALL ON document_chunks TO authenticated; $$ LANGUAGE plpgsql;
GRANT ALL ON document_chunks TO anon;
GRANT EXECUTE ON FUNCTION match_documents TO authenticated;
GRANT EXECUTE ON FUNCTION match_documents TO anon;
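For intuition, the ranking that `match_documents` performs in SQL can be sketched in TypeScript. This is a hypothetical in-memory illustration, not code from the repo; pgvector's `<=>` operator is cosine distance, so similarity is computed as `1 - distance`:

```typescript
// Hypothetical in-memory version of match_documents: rank chunks by cosine
// similarity to a query embedding, keep those above a threshold, return top-k.
interface Chunk {
  id: string;
  content: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Mirrors: WHERE 1 - (embedding <=> query) > match_threshold
//          ORDER BY embedding <=> query LIMIT match_count
function matchChunks(
  chunks: Chunk[],
  queryEmbedding: number[],
  matchThreshold = 0.7,
  matchCount = 10
): { chunk: Chunk; similarity: number }[] {
  return chunks
    .map(chunk => ({ chunk, similarity: cosineSimilarity(chunk.embedding, queryEmbedding) }))
    .filter(r => r.similarity > matchThreshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}
```

Postgres does the same work with an ivfflat index instead of a full scan, which is the point of keeping the search in the database.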
currrent_output.json (new file, 374 lines; diff suppressed because one or more lines are too long)
@@ -10,7 +10,7 @@ import Analytics from './components/Analytics';
 import UploadMonitoringDashboard from './components/UploadMonitoringDashboard';
 import LogoutButton from './components/LogoutButton';
 import { documentService, GCSErrorHandler, GCSError } from './services/documentService';
-import { debugAuth, testAPIAuth } from './utils/authDebug';
+// import { debugAuth, testAPIAuth } from './utils/authDebug';
 import {
   Home,
@@ -75,13 +75,14 @@ const Dashboard: React.FC = () => {
       if (response.ok) {
         const result = await response.json();
-        // The API returns an array directly, not wrapped in success/data
-        if (Array.isArray(result)) {
+        // The API returns documents wrapped in a documents property
+        const documentsArray = result.documents || result;
+        if (Array.isArray(documentsArray)) {
           // Transform backend data to frontend format
-          const transformedDocs = result.map((doc: any) => ({
+          const transformedDocs = documentsArray.map((doc: any) => ({
             id: doc.id,
-            name: doc.name || doc.originalName,
-            originalName: doc.originalName,
+            name: doc.name || doc.originalName || 'Unknown',
+            originalName: doc.originalName || doc.name || 'Unknown',
             status: mapBackendStatus(doc.status),
             uploadedAt: doc.uploadedAt,
             processedAt: doc.processedAt,
@@ -216,10 +217,22 @@ const Dashboard: React.FC = () => {
     return () => clearInterval(refreshInterval);
   }, [fetchDocuments]);

-  const handleUploadComplete = (fileId: string) => {
-    console.log('Upload completed:', fileId);
-    // Refresh documents list after upload
-    fetchDocuments();
+  const handleUploadComplete = (documentId: string) => {
+    console.log('Upload completed:', documentId);
+    // Add the new document to the list with a "processing" status
+    // Since we only have the ID, we'll create a minimal document object
+    const newDocument = {
+      id: documentId,
+      status: 'processing',
+      name: 'Processing...',
+      originalName: 'Processing...',
+      uploadedAt: new Date().toISOString(),
+      fileSize: 0,
+      user_id: user?.id || '',
+      created_at: new Date().toISOString(),
+      updated_at: new Date().toISOString()
+    };
+    setDocuments(prev => [...prev, newDocument]);
   };

   const handleUploadError = (error: string) => {
@@ -291,18 +304,18 @@ const Dashboard: React.FC = () => {
     setViewingDocument(null);
   };

-  // Debug functions
-  const handleDebugAuth = async () => {
-    await debugAuth();
-  };
-
-  const handleTestAPIAuth = async () => {
-    await testAPIAuth();
-  };
+  // Debug functions (commented out for now)
+  // const handleDebugAuth = async () => {
+  //   await debugAuth();
+  // };
+
+  // const handleTestAPIAuth = async () => {
+  //   await testAPIAuth();
+  // };

   const filteredDocuments = documents.filter(doc =>
-    doc.name.toLowerCase().includes(searchTerm.toLowerCase()) ||
-    doc.originalName.toLowerCase().includes(searchTerm.toLowerCase())
+    (doc.name?.toLowerCase() || '').includes(searchTerm.toLowerCase()) ||
+    (doc.originalName?.toLowerCase() || '').includes(searchTerm.toLowerCase())
   );

   const stats = {
@@ -21,7 +21,7 @@ interface UploadedFile {
 }

 interface DocumentUploadProps {
-  onUploadComplete?: (fileId: string) => void;
+  onUploadComplete?: (documentId: string) => void;
   onUploadError?: (error: string) => void;
 }
@@ -104,15 +104,15 @@ const DocumentUpload: React.FC<DocumentUploadProps> = ({
         abortController.signal
       );

-      // Upload completed - update status to "uploaded"
+      // Upload completed - update status to "processing" immediately
       setUploadedFiles(prev =>
         prev.map(f =>
           f.id === uploadedFile.id
             ? {
                 ...f,
-                id: document.id,
-                documentId: document.id,
-                status: 'uploaded',
+                id: result.id,
+                documentId: result.id,
+                status: 'processing', // Changed from 'uploaded' to 'processing'
                 progress: 100
               }
             : f
@@ -120,10 +120,10 @@ const DocumentUpload: React.FC<DocumentUploadProps> = ({
       );

       // Call the completion callback with the document ID
-      onUploadComplete?.(document.id);
+      onUploadComplete?.(result.id);

-      // Start monitoring processing progress
-      monitorProcessingProgress(document.id, uploadedFile.id);
+      // Start monitoring processing progress immediately
+      monitorProcessingProgress(result.id, uploadedFile.id);
     } catch (error) {
       // Check if this was an abort error
@@ -189,8 +189,29 @@ const DocumentUpload: React.FC<DocumentUploadProps> = ({
       console.warn('Attempted to monitor progress for document with invalid UUID format:', documentId);
       return;
     }

+    // Add timeout to prevent infinite polling (30 minutes max)
+    const startTime = Date.now();
+    const maxPollingTime = 30 * 60 * 1000; // 30 minutes
+
     const checkProgress = async () => {
+      // Check if we've exceeded the maximum polling time
+      if (Date.now() - startTime > maxPollingTime) {
+        console.warn(`Polling timeout for document ${documentId} after ${maxPollingTime / 1000 / 60} minutes`);
+        setUploadedFiles(prev =>
+          prev.map(f =>
+            f.id === fileId
+              ? {
+                  ...f,
+                  status: 'error',
+                  error: 'Processing timeout - please check document status manually'
+                }
+              : f
+          )
+        );
+        return;
+      }
+
       try {
         const response = await fetch(`${import.meta.env.VITE_API_BASE_URL}/documents/${documentId}/progress`, {
           headers: {
@@ -203,8 +224,10 @@ const DocumentUpload: React.FC<DocumentUploadProps> = ({
           const progress = await response.json();

           // Update status based on progress
-          let newStatus: UploadedFile['status'] = 'uploaded';
-          if (progress.status === 'processing' || progress.status === 'extracting_text' || progress.status === 'processing_llm' || progress.status === 'generating_pdf') {
+          let newStatus: UploadedFile['status'] = 'processing'; // Default to processing
+          if (progress.status === 'uploading' || progress.status === 'uploaded') {
+            // Still processing
+          } else if (progress.status === 'processing' || progress.status === 'extracting_text' || progress.status === 'processing_llm' || progress.status === 'generating_pdf') {
             newStatus = 'processing';
           } else if (progress.status === 'completed') {
             newStatus = 'completed';
@@ -242,12 +265,12 @@ const DocumentUpload: React.FC<DocumentUploadProps> = ({
         // Don't stop monitoring on network errors, just log and continue
       }

-      // Continue monitoring
-      setTimeout(checkProgress, 2000);
+      // Continue monitoring with shorter intervals for better responsiveness
+      setTimeout(checkProgress, 3000); // Check every 3 seconds
     };

-    // Start monitoring
-    setTimeout(checkProgress, 1000);
+    // Start monitoring immediately
+    setTimeout(checkProgress, 500); // Start checking after 500ms
   }, [token]);

   const { getRootProps, getInputProps, isDragActive } = useDropzone({
@@ -378,7 +401,7 @@ const DocumentUpload: React.FC<DocumentUploadProps> = ({
               <h4 className="text-sm font-medium text-success-800">Upload Complete</h4>
               <p className="text-sm text-success-700 mt-1">
                 Files have been uploaded successfully to Firebase Storage! You can now navigate away from this page.
-                Processing will continue in the background using Document AI + Optimized Agentic RAG. PDFs will be automatically deleted after processing to save costs.
+                Processing will continue in the background using Document AI + Optimized Agentic RAG. This can take several minutes. PDFs will be automatically deleted after processing to save costs.
               </p>
             </div>
           </div>
@@ -7,7 +7,7 @@ const API_BASE_URL = config.apiBaseUrl;
 // Create axios instance with auth interceptor
 const apiClient = axios.create({
   baseURL: API_BASE_URL,
-  timeout: 30000, // 30 seconds
+  timeout: 300000, // 5 minutes
 });

 // Add auth token to requests
// Add auth token to requests // Add auth token to requests
@@ -263,14 +263,46 @@ class DocumentService {
       // Step 3: Confirm upload and trigger processing
       onProgress?.(95); // 95% - Confirming upload

-      const confirmResponse = await apiClient.post(`/documents/${documentId}/confirm-upload`, {}, { signal });
+      console.log('🔄 Making confirm-upload request for document:', documentId);
+      console.log('🔄 Confirm-upload URL:', `/documents/${documentId}/confirm-upload`);
+
+      // Add retry logic for confirm-upload (based on Google Cloud best practices)
+      let confirmResponse;
+      let lastError;
+
+      for (let attempt = 1; attempt <= 3; attempt++) {
+        try {
+          console.log(`🔄 Confirm-upload attempt ${attempt}/3`);
+          confirmResponse = await apiClient.post(`/documents/${documentId}/confirm-upload`, {}, {
+            signal,
+            timeout: 60000 // 60 second timeout for confirm-upload
+          });
+          console.log('✅ Confirm-upload response received:', confirmResponse.status);
+          console.log('✅ Confirm-upload response data:', confirmResponse.data);
+          break; // Success, exit retry loop
+        } catch (error: any) {
+          lastError = error;
+          console.log(`❌ Confirm-upload attempt ${attempt} failed:`, error.message);
+
+          if (attempt < 3) {
+            // Wait before retry (exponential backoff)
+            const delay = Math.pow(2, attempt) * 1000; // 2s, 4s
+            console.log(`⏳ Waiting ${delay}ms before retry...`);
+            await new Promise(resolve => setTimeout(resolve, delay));
+          }
+        }
+      }
+
+      if (!confirmResponse) {
+        throw lastError || new Error('Confirm-upload failed after 3 attempts');
+      }

       onProgress?.(100); // 100% - Complete
       console.log('✅ Upload confirmed and processing started');

       return {
         id: documentId,
-        ...confirmResponse.data
+        ...confirmResponse.data.document
       };
     } catch (error: any) {
@@ -281,6 +313,16 @@ class DocumentService {
         throw new Error('Upload was cancelled.');
       }

+      // Handle network timeouts
+      if (error.code === 'ECONNABORTED' || error.message?.includes('timeout')) {
+        throw new Error('Request timed out. Please check your connection and try again.');
+      }
+
+      // Handle network errors
+      if (error.code === 'ERR_NETWORK' || error.message?.includes('Network Error')) {
+        throw new Error('Network error. Please check your connection and try again.');
+      }
+
       if (error.response?.status === 401) {
         throw new Error('Authentication required. Please log in again.');
       }