Jon dccfcfaa23 Fix download functionality and clean up temporary files
FIXED ISSUES:
1. Download functionality (404 errors):
   - Added PDF generation to jobQueueService after document processing
   - PDFs are now generated from summaries and stored in summary_pdf_path
   - Download endpoint now works correctly

2. Frontend-Backend communication:
   - Verified Vite proxy configuration is correct (/api -> localhost:5000)
   - Backend is responding to health checks
   - API authentication is working

3. Temporary files cleanup:
   - Removed 50+ temporary debug/test files from backend/
   - Cleaned up check-*.js, test-*.js, debug-*.js, fix-*.js files
   - Removed one-time processing scripts and debug utilities

TECHNICAL DETAILS:
- Modified jobQueueService.ts to generate PDFs using pdfGenerationService
- Added path import for file path handling
- PDFs are generated with timestamp in filename for uniqueness
- All temporary development files have been removed

STATUS: Download functionality should now work. Frontend-backend communication verified.
2025-07-28 21:33:28 -04:00

CIM Document Processor

A comprehensive web application for processing and analyzing Confidential Information Memorandums (CIMs) using AI-powered document analysis and the BPCP CIM Review Template.

Features

🔐 Authentication & Security

  • Secure user authentication with JWT tokens
  • Role-based access control
  • Protected routes and API endpoints
  • Rate limiting and security headers

📄 Document Processing

  • Upload PDF, DOC, and DOCX files (up to 50MB)
  • Drag-and-drop file upload interface
  • Real-time upload progress tracking
  • AI-powered document text extraction
  • Automatic document analysis and insights

📊 BPCP CIM Review Template

  • Comprehensive review template with 7 sections:
    • Deal Overview: Company information, transaction details, and deal context
    • Business Description: Core operations, products/services, customer base
    • Market & Industry Analysis: Market size, growth, competitive landscape
    • Financial Summary: Historical financials, trends, and analysis
    • Management Team Overview: Leadership assessment and organizational structure
    • Preliminary Investment Thesis: Key attractions, risks, and value creation
    • Key Questions & Next Steps: Critical questions and action items
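One way to model the seven sections in TypeScript is sketched below. The section keys, the `CimReview` type, and `incompleteSections` are hypothetical names derived from the section titles; each section is treated as free text for illustration, whereas the actual template may use richer fields.

```typescript
// Hypothetical keys, one per section title listed above.
const REVIEW_SECTIONS = [
  "dealOverview",
  "businessDescription",
  "marketIndustryAnalysis",
  "financialSummary",
  "managementTeamOverview",
  "preliminaryInvestmentThesis",
  "keyQuestionsNextSteps",
] as const;

type ReviewSection = (typeof REVIEW_SECTIONS)[number];

// Assumption: every section is a single free-text field.
type CimReview = Partial<Record<ReviewSection, string>>;

// Lists the sections that are missing or contain only whitespace.
function incompleteSections(review: CimReview): ReviewSection[] {
  return REVIEW_SECTIONS.filter((key) => !(review[key] ?? "").trim());
}
```

A helper like `incompleteSections` could drive a progress indicator or block export until every section has content.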

🎯 Document Management

  • Document status tracking (pending, processing, completed, error)
  • Search and filter documents
  • View processed results and extracted data
  • Download processed documents and reports
  • Retry failed processing jobs
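The four statuses above can be captured as a union type with an explicit transition table. This is a sketch under assumptions: the status names come from the list, but the allowed transitions and the helpers `canTransition` and `canRetry` are illustrative guesses, not the backend's actual rules.

```typescript
type DocumentStatus = "pending" | "processing" | "completed" | "error";

// Assumed lifecycle: pending -> processing -> completed | error,
// and a retry moves an errored document back to pending.
const ALLOWED_TRANSITIONS: Record<DocumentStatus, DocumentStatus[]> = {
  pending: ["processing"],
  processing: ["completed", "error"],
  completed: [],
  error: ["pending"],
};

// True when moving from one status to another is permitted by the table.
function canTransition(from: DocumentStatus, to: DocumentStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Only documents whose processing failed are eligible for a retry.
function canRetry(status: DocumentStatus): boolean {
  return status === "error";
}
```

Keeping the transitions in one table makes it easy to reject inconsistent updates, such as a completed document being marked as pending again.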

📈 Analytics & Insights

  • Document processing statistics
  • Financial trend analysis
  • Risk and opportunity identification
  • Key metrics extraction
  • Export capabilities (PDF, JSON)

Technology Stack

Frontend

  • React 18 with TypeScript
  • Vite for fast development and building
  • Tailwind CSS for styling
  • React Router for navigation
  • React Hook Form for form handling
  • React Dropzone for file uploads
  • Lucide React for icons
  • Axios for API communication

Backend

  • Node.js with TypeScript
  • Express.js web framework
  • PostgreSQL database with migrations
  • Redis for job queue and caching
  • JWT for authentication
  • Multer for file uploads
  • Bull for job queue management
  • Winston for logging
  • Jest for testing

AI & Processing

  • OpenAI GPT-4 for document analysis
  • Anthropic Claude for advanced text processing
  • pdf-parse for PDF text extraction
  • Puppeteer for PDF generation

Project Structure

cim_summary/
├── frontend/                 # React frontend application
│   ├── src/
│   │   ├── components/      # React components
│   │   ├── services/        # API services
│   │   ├── contexts/        # React contexts
│   │   ├── utils/           # Utility functions
│   │   └── types/           # TypeScript type definitions
│   └── package.json
├── backend/                  # Node.js backend API
│   ├── src/
│   │   ├── controllers/     # API controllers
│   │   ├── models/          # Database models
│   │   ├── services/        # Business logic services
│   │   ├── routes/          # API routes
│   │   ├── middleware/      # Express middleware
│   │   └── utils/           # Utility functions
│   └── package.json
└── README.md

Getting Started

Prerequisites

  • Node.js 18+ and npm
  • PostgreSQL 14+
  • Redis 6+
  • OpenAI API key
  • Anthropic API key

Environment Setup

  1. Clone the repository

    git clone <repository-url>
    cd cim_summary
    
  2. Backend Setup

    cd backend
    npm install
    
    # Copy environment template
    cp .env.example .env
    
    # Edit .env with your configuration
    # Required variables:
    # - DATABASE_URL
    # - REDIS_URL
    # - JWT_SECRET
    # - OPENAI_API_KEY
    # - ANTHROPIC_API_KEY
    
  3. Frontend Setup

    cd ../frontend
    npm install
    
    # Copy environment template
    cp .env.example .env
    
    # Edit .env with your configuration
    # Required variables:
    # - VITE_API_URL (backend API URL)
    

Database Setup

  1. Create PostgreSQL database

    CREATE DATABASE cim_processor;
    
  2. Run migrations

    cd backend
    npm run db:migrate
    
  3. Seed initial data (optional)

    npm run db:seed
    

Running the Application

  1. Start Redis

    redis-server
    
  2. Start Backend

    cd backend
    npm run dev
    

    Backend will be available at http://localhost:5000

  3. Start Frontend

    cd frontend
    npm run dev
    

    Frontend will be available at http://localhost:3000

Usage

1. Authentication

  • Navigate to the login page
  • Use the seeded admin account or create a new user
  • JWT tokens are automatically managed

2. Document Upload

  • Go to the "Upload" tab
  • Drag and drop CIM documents (PDF, DOC, DOCX)
  • Monitor upload and processing progress
  • Files are automatically queued for AI processing

3. Document Review

  • View processed documents in the "Documents" tab
  • Click "View" to open the document viewer
  • Access the BPCP CIM Review Template
  • Fill out the comprehensive review sections

4. Analysis & Export

  • Review extracted financial data and insights
  • Complete the investment thesis
  • Export review as PDF
  • Download processed documents

API Endpoints

Authentication

  • POST /api/auth/login - User login
  • POST /api/auth/register - User registration
  • POST /api/auth/logout - User logout

Documents

  • GET /api/documents - List user documents
  • POST /api/documents/upload - Upload document
  • GET /api/documents/:id - Get document details
  • GET /api/documents/:id/status - Get processing status
  • GET /api/documents/:id/download - Download document
  • DELETE /api/documents/:id - Delete document
  • POST /api/documents/:id/retry - Retry processing

Reviews

  • GET /api/documents/:id/review - Get CIM review data
  • POST /api/documents/:id/review - Save CIM review
  • GET /api/documents/:id/export - Export review as PDF
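A client can keep the per-document paths above in one place instead of assembling strings at each call site. The sketch below assumes nothing beyond the routes listed; `API_BASE` and `documentRoutes` are hypothetical names, and the HTTP methods are noted only as comments.

```typescript
// Matches the /api prefix used by the routes listed above.
const API_BASE = "/api";

// Builds the per-document paths for a given document id.
function documentRoutes(id: string) {
  const base = `${API_BASE}/documents/${encodeURIComponent(id)}`;
  return {
    details: base,                  // GET (details) or DELETE (remove)
    status: `${base}/status`,       // GET processing status
    download: `${base}/download`,   // GET download
    retry: `${base}/retry`,         // POST retry processing
    review: `${base}/review`,       // GET or POST CIM review data
    exportPdf: `${base}/export`,    // GET review exported as PDF
  };
}
```

For example, `documentRoutes("42").status` yields `/api/documents/42/status`, which can be passed to whichever HTTP client the frontend uses.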

Development

Running Tests

# Backend tests
cd backend
npm test

# Frontend tests
cd frontend
npm test

Code Quality

# Backend linting
cd backend
npm run lint

# Frontend linting
cd frontend
npm run lint

Database Migrations

cd backend
npm run db:migrate  # Run migrations
npm run db:seed     # Seed data

Configuration

Environment Variables

Backend (.env)

# Database
DATABASE_URL=postgresql://user:password@localhost:5432/cim_processor

# Redis
REDIS_URL=redis://localhost:6379

# Authentication
JWT_SECRET=your-secret-key

# AI Services
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key

# Server
PORT=5000
NODE_ENV=development
FRONTEND_URL=http://localhost:3000
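
Failing fast on a missing variable is usually easier to debug than a runtime error deep inside a request. The sketch below checks the five backend variables that the setup instructions mark as required; `REQUIRED_BACKEND_VARS` and `missingEnvVars` are hypothetical names, and this is not necessarily how the backend validates its configuration.

```typescript
// The variables listed as required in the backend setup step.
const REQUIRED_BACKEND_VARS = [
  "DATABASE_URL",
  "REDIS_URL",
  "JWT_SECRET",
  "OPENAI_API_KEY",
  "ANTHROPIC_API_KEY",
] as const;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_BACKEND_VARS.filter((name) => !(env[name] ?? "").trim());
}
```

At startup, a server could call `missingEnvVars(process.env)` and exit with a clear message if the returned list is not empty.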

Frontend (.env)

VITE_API_URL=http://localhost:5000/api

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For support and questions, please contact the development team or create an issue in the repository.

Acknowledgments

  • BPCP for the CIM Review Template
  • OpenAI for GPT-4 integration
  • Anthropic for Claude integration
  • The open-source community for the excellent tools and libraries used in this project
Description
CIM Document Processor with Hybrid LLM Analysis