# Codebase Structure

**Analysis Date:** 2026-02-24

## Directory Layout

```
cim_summary/
├── backend/                    # Express.js + TypeScript backend (Node.js)
│   ├── src/
│   │   ├── index.ts            # Express app + Firebase Functions exports
│   │   ├── controllers/        # Request handlers
│   │   ├── models/             # Database access + schema
│   │   ├── services/           # Business logic + external integrations
│   │   ├── routes/             # Express route definitions
│   │   ├── middleware/         # Express middleware (auth, validation, error)
│   │   ├── config/             # Configuration (env, firebase, supabase)
│   │   ├── utils/              # Utilities (logger, validation, parsing)
│   │   ├── types/              # TypeScript type definitions
│   │   ├── scripts/            # One-off CLI scripts (diagnostics, setup)
│   │   ├── assets/             # Static assets (HTML templates)
│   │   └── __tests__/          # Test suites (unit, integration, acceptance)
│   ├── package.json            # Node dependencies
│   ├── tsconfig.json           # TypeScript config
│   ├── .eslintrc.json          # ESLint config
│   └── dist/                   # Compiled JavaScript (generated)
│
├── frontend/                   # React + Vite + TypeScript frontend
│   ├── src/
│   │   ├── main.tsx            # React entry point
│   │   ├── App.tsx             # Root component with routing
│   │   ├── components/         # React components (UI)
│   │   ├── services/           # API clients (documentService, authService)
│   │   ├── contexts/           # React Context (AuthContext)
│   │   ├── config/             # Configuration (env, firebase)
│   │   ├── types/              # TypeScript interfaces
│   │   ├── utils/              # Utilities (validation, cn, auth debug)
│   │   └── assets/             # Static images and icons
│   ├── package.json            # Node dependencies
│   ├── tsconfig.json           # TypeScript config
│   ├── vite.config.ts          # Vite bundler config
│   ├── eslintrc.json           # ESLint config
│   ├── tailwind.config.js      # Tailwind CSS config
│   ├── postcss.config.js       # PostCSS config
│   └── dist/                   # Built static assets (generated)
│
├── .planning/                  # GSD planning directory
│   └── codebase/               # Codebase analysis documents
│
├── package.json                # Monorepo root package (if used)
├── .git/                       # Git repository
├── .gitignore                  # Git ignore rules
├── .cursorrules                # Cursor IDE configuration
├── README.md                   # Project overview
├── CONFIGURATION_GUIDE.md      # Setup instructions
├── CODEBASE_ARCHITECTURE_SUMMARY.md  # Existing architecture notes
└── [PDF documents]             # Sample CIM documents for testing
```

## Directory Purposes

**backend/src/:**
- Purpose: All backend server code
- Contains: TypeScript source files
- Key files: `index.ts` (main app), routes, controllers, services, models

**backend/src/controllers/:**
- Purpose: HTTP request handlers
- Contains: `documentController.ts`, `authController.ts`
- Functions: Map HTTP requests to service calls, handle validation, construct responses

**backend/src/services/:**
- Purpose: Business logic and external integrations
- Contains: Document processing, LLM integration, file storage, database, job queue
- Key files:
  - `unifiedDocumentProcessor.ts` - Orchestrator, strategy selection
  - `singlePassProcessor.ts` - 2-LLM extraction (current default)
  - `optimizedAgenticRAGProcessor.ts` - Advanced agentic processing (stub)
  - `documentAiProcessor.ts` - Google Document AI OCR
  - `llmService.ts` - LLM API calls (Anthropic/OpenAI/OpenRouter)
  - `jobQueueService.ts` - Async job queue (in-memory, EventEmitter)
  - `jobProcessorService.ts` - Dequeue and execute jobs
  - `fileStorageService.ts` - GCS signed URLs and upload
  - `vectorDatabaseService.ts` - Supabase pgvector operations
  - `pdfGenerationService.ts` - Puppeteer PDF rendering
  - `uploadProgressService.ts` - Track upload status
  - `uploadMonitoringService.ts` - Monitor processing progress
  - `llmSchemas.ts` - Zod schemas for LLM extraction (CIMReview, financial data)

**backend/src/models/:**
- Purpose: Database access layer and schema definitions
- Contains: Document, User, ProcessingJob, Feedback models
- Key files:
  - `types.ts` - TypeScript interfaces (Document, ProcessingJob, ProcessingStatus)
  - `DocumentModel.ts` - Document CRUD with retry logic
  - `ProcessingJobModel.ts` - Job tracking in database
  - `UserModel.ts` - User management
  - `VectorDatabaseModel.ts` - Vector embedding queries
  - `migrate.ts` - Database migrations
  - `seed.ts` - Test data seeding
  - `migrations/` - SQL migration files

**backend/src/routes/:**
- Purpose: Express route definitions
- Contains: Route handlers and middleware bindings
- Key files:
  - `documents.ts` - GET/POST/PUT/DELETE document endpoints
  - `vector.ts` - Vector search endpoints
  - `monitoring.ts` - Health and status endpoints
  - `documentAudit.ts` - Audit log endpoints

**backend/src/middleware/:**
- Purpose: Express middleware for cross-cutting concerns
- Contains: Authentication, validation, error handling
- Key files:
  - `firebaseAuth.ts` - Firebase ID token verification
  - `errorHandler.ts` - Global error handling + correlation ID
  - `notFoundHandler.ts` - 404 handler
  - `validation.ts` - Request validation (UUID, pagination)

**backend/src/config/:**
- Purpose: Configuration and initialization
- Contains: Environment setup, service initialization
- Key files:
  - `env.ts` - Environment variable validation (Joi schema)
  - `firebase.ts` - Firebase Admin SDK initialization
  - `supabase.ts` - Supabase client and pool setup
  - `database.ts` - PostgreSQL connection (legacy)
  - `errorConfig.ts` - Error handling config

**backend/src/utils/:**
- Purpose: Shared utility functions
- Contains: Logging, validation, parsing
- Key files:
  - `logger.ts` - Winston logger setup (console + file transports)
  - `validation.ts` - UUID and pagination validators
  - `googleServiceAccount.ts` - Google Cloud credentials resolution
  - `financialExtractor.ts` - Financial data parsing (deprecated for single-pass)
  - `templateParser.ts` - CIM template utilities
  - `auth.ts` - Authentication helpers

**backend/src/scripts/:**
- Purpose: One-off CLI scripts for diagnostics and setup
- Contains: Database setup, testing, monitoring
- Key files:
  - `setup-database.ts` - Initialize database schema
  - `monitor-document-processing.ts` - Watch job queue status
  - `check-current-job.ts` - Debug stuck jobs
  - `test-full-llm-pipeline.ts` - End-to-end testing
  - `comprehensive-diagnostic.ts` - System health check

**backend/src/__tests__/:**
- Purpose: Test suites
- Contains: Unit, integration, acceptance tests
- Subdirectories:
  - `unit/` - Isolated component tests
  - `integration/` - Multi-component tests
  - `acceptance/` - End-to-end flow tests
  - `mocks/` - Mock data and fixtures
  - `utils/` - Test utilities

**frontend/src/:**
- Purpose: All frontend code
- Contains: React components, services, types

**frontend/src/components/:**
- Purpose: React UI components
- Contains: Page components, reusable widgets
- Key files:
  - `DocumentUpload.tsx` - File upload UI with drag-and-drop
  - `DocumentList.tsx` - List of processed documents
  - `DocumentViewer.tsx` - View and edit extracted data
  - `ProcessingProgress.tsx` - Real-time processing status
  - `UploadMonitoringDashboard.tsx` - Admin view of active jobs
  - `LoginForm.tsx` - Firebase auth login UI
  - `ProtectedRoute.tsx` - Route guard for authenticated pages
  - `Analytics.tsx` - Document analytics and statistics
  - `CIMReviewTemplate.tsx` - Display extracted CIM review data

**frontend/src/services/:**
- Purpose: API clients and external service integration
- Contains: HTTP clients for backend
- Key files:
  - `documentService.ts` - Document API calls (upload, list, process, status)
  - `authService.ts` - Firebase authentication (login, logout, token)
  - `adminService.ts` - Admin-only operations

**frontend/src/contexts/:**
- Purpose: React Context for global state
- Contains: AuthContext for user and authentication state
- Key files:
  - `AuthContext.tsx` - User, token, login/logout state

**frontend/src/config/:**
- Purpose: Configuration
- Contains: Environment variables, Firebase setup
- Key files:
  - `env.ts` - VITE_API_BASE_URL and other env vars
  - `firebase.ts` - Firebase client initialization

**frontend/src/types/:**
- Purpose: TypeScript interfaces
- Contains: API response types, component props
- Key files:
  - `auth.ts` - User, LoginCredentials, AuthContextType

**frontend/src/utils/:**
- Purpose: Shared utility functions
- Contains: Validation, CSS utilities
- Key files:
  - `validation.ts` - Email, password validators
  - `cn.ts` - Classname merger (clsx wrapper)
  - `authDebug.ts` - Authentication debugging helpers

## Key File Locations

**Entry Points:**
- `backend/src/index.ts` - Main Express app and Firebase Functions exports
- `frontend/src/main.tsx` - React entry point
- `frontend/src/App.tsx` - Root component with routing

**Configuration:**
- `backend/src/config/env.ts` - Environment variable schema and validation
- `backend/src/config/firebase.ts` - Firebase Admin SDK setup
- `backend/src/config/supabase.ts` - Supabase client and connection pool
- `frontend/src/config/firebase.ts` - Firebase client configuration
- `frontend/src/config/env.ts` - Frontend environment variables

**Core Logic:**
- `backend/src/services/unifiedDocumentProcessor.ts` - Main document processing orchestrator
- `backend/src/services/singlePassProcessor.ts` - Single-pass 2-LLM strategy
- `backend/src/services/llmService.ts` - LLM API integration with retry
- `backend/src/services/jobQueueService.ts` - Background job queue
- `backend/src/services/vectorDatabaseService.ts` - Vector search implementation

**Testing:**
- `backend/src/__tests__/unit/` - Unit tests
- `backend/src/__tests__/integration/` - Integration tests
- `backend/src/__tests__/acceptance/` - End-to-end tests

**Database:**
- `backend/src/models/types.ts` - TypeScript type definitions
- `backend/src/models/DocumentModel.ts` - Document CRUD operations
- `backend/src/models/ProcessingJobModel.ts` - Job tracking
- `backend/src/models/migrations/` - SQL migration files

**Middleware:**
- `backend/src/middleware/firebaseAuth.ts` - JWT authentication
- `backend/src/middleware/errorHandler.ts` - Global error handling
- `backend/src/middleware/validation.ts` - Input validation
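
The input validation listed above covers UUIDs and pagination, per the directory notes. A minimal, dependency-free sketch of such checks follows. The function names, default page size, and limit cap are assumptions for illustration and are not taken from the project's `validation.ts`.

```typescript
// Hypothetical helpers: names, defaults, and limits are illustrative assumptions.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

/** Returns true when `value` is a canonical UUID string (versions 1 to 5). */
function isValidUuid(value: string): boolean {
  return UUID_PATTERN.test(value);
}

interface Pagination {
  limit: number;
  offset: number;
}

/**
 * Parses raw query-string values into a bounded pagination object.
 * Missing or non-numeric values fall back to the defaults (20 and 0).
 */
function parsePagination(
  rawLimit: string | undefined,
  rawOffset: string | undefined,
  maxLimit: number = 100, // assumed cap
): Pagination {
  const limit = parseInt(rawLimit ?? "", 10);
  const offset = parseInt(rawOffset ?? "", 10);
  return {
    // Clamp the limit into [1, maxLimit]; never allow a negative offset.
    limit: isNaN(limit) ? 20 : Math.min(Math.max(limit, 1), maxLimit),
    offset: isNaN(offset) ? 0 : Math.max(offset, 0),
  };
}
```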

**Logging:**
- `backend/src/utils/logger.ts` - Winston logger configuration

## Naming Conventions

**Files:**
- Controllers: `{resource}Controller.ts` (e.g., `documentController.ts`)
- Services: `{service}Service.ts` or descriptive (e.g., `llmService.ts`, `singlePassProcessor.ts`)
- Models: `{Entity}Model.ts` (e.g., `DocumentModel.ts`)
- Routes: `{resource}.ts` (e.g., `documents.ts`)
- Middleware: `{purpose}Handler.ts` or `{purpose}.ts` (e.g., `firebaseAuth.ts`)
- Types/Interfaces: `types.ts` or `{name}Types.ts`
- Tests: `{file}.test.ts` or `{file}.spec.ts`

**Directories:**
- Plurals for collections: `services/`, `models/`, `utils/`, `routes/`, `controllers/`, `types/`, `contexts/`
- Singular for specific features: `config/`, `middleware/`
- Nested by feature in larger directories: `__tests__/unit/`, `models/migrations/`

**Functions/Variables:**
- Camel case: `processDocument()`, `getUserId()`, `documentId`
- Constants: UPPER_SNAKE_CASE (e.g., `MAX_RETRIES`, `TIMEOUT_MS`)
- Private methods: Prefix with `_` or use TypeScript `private` (e.g., `_retryOperation()`)

**Classes:**
- Pascal case: `DocumentModel`, `JobQueueService`, `SinglePassProcessor`
- Service instances exported as singletons: `export const llmService = new LLMService()`

**React Components:**
- Pascal case: `DocumentUpload.tsx`, `ProtectedRoute.tsx`
- Hooks: `use{Feature}` (e.g., `useAuth` from AuthContext)
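
The class, function, and constant conventions above can be seen together in one short sketch. Only the names `LLMService`, `llmService`, `MAX_RETRIES`, and `_retryOperation` come from this document; the method bodies, the `summarizeDocument` method, and the retry count are assumptions made for illustration.

```typescript
// Illustrative only: the real members of LLMService are not shown in this document.
const MAX_RETRIES = 3; // constant in UPPER_SNAKE_CASE (the value is an assumption)

// Class name in Pascal case.
class LLMService {
  // Private helper: this sketch uses both the `_` prefix and the `private` modifier.
  private _retryOperation<T>(operation: () => T): T {
    let lastError: unknown;
    for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
      try {
        return operation();
      } catch (error) {
        lastError = error; // remember the failure and try again
      }
    }
    throw lastError;
  }

  // Public method in camel case (hypothetical method, synchronous for brevity).
  summarizeDocument(documentId: string): string {
    return this._retryOperation(() => `summary for ${documentId}`);
  }
}

// One shared instance per service; in the project this line would carry `export`.
const llmService = new LLMService();
```

Callers would then import the shared instance rather than constructing their own, for example `llmService.summarizeDocument(documentId)`.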
## Where to Add New Code

**New Document Processing Strategy:**
- Primary code: `backend/src/services/{strategyName}Processor.ts`
- Schema: Add types to `backend/src/services/llmSchemas.ts`
- Integration: Register in `backend/src/services/unifiedDocumentProcessor.ts`
- Tests: `backend/src/__tests__/integration/{strategyName}.test.ts`
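
As a rough sketch of the "write a processor, then register it" steps above: the interfaces, the `KeywordProcessor` strategy, and the registry functions below are all hypothetical, since the actual shape of `unifiedDocumentProcessor.ts` is not shown in this document. Real processors would likely be asynchronous; this version is synchronous to stay short.

```typescript
// Hypothetical shapes: the project's real interfaces are not shown in this document.
interface ProcessingResult {
  strategy: string;
  summary: string;
}

interface DocumentProcessor {
  process(documentId: string, text: string): ProcessingResult;
}

// Would live in backend/src/services/keywordProcessor.ts (made-up strategy).
class KeywordProcessor implements DocumentProcessor {
  process(documentId: string, text: string): ProcessingResult {
    // Count whitespace-separated words as a stand-in for real extraction.
    const wordCount = text.split(/\s+/).filter((w) => w.length > 0).length;
    return { strategy: "keyword", summary: `${documentId}: ${wordCount} words` };
  }
}

// Registry the orchestrator could consult when selecting a strategy.
const strategies: { [name: string]: DocumentProcessor } = {};

function hasStrategy(name: string): boolean {
  return Object.prototype.hasOwnProperty.call(strategies, name);
}

function registerStrategy(name: string, processor: DocumentProcessor): void {
  if (hasStrategy(name)) throw new Error(`Strategy already registered: ${name}`);
  strategies[name] = processor;
}

function processWithStrategy(name: string, documentId: string, text: string): ProcessingResult {
  if (!hasStrategy(name)) throw new Error(`Unknown strategy: ${name}`);
  return strategies[name].process(documentId, text);
}

registerStrategy("keyword", new KeywordProcessor());
```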

**New API Endpoint:**
- Route: `backend/src/routes/{resource}.ts`
- Controller: `backend/src/controllers/{resource}Controller.ts`
- Service: `backend/src/services/{resource}Service.ts` (if needed)
- Model: `backend/src/models/{Resource}Model.ts` (if database access)
- Tests: `backend/src/__tests__/integration/{endpoint}.test.ts`
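
The route, controller, and service split above might look roughly like this. The sketch is framework-free so it runs on its own: `HttpRequest` and `HttpResponse` are simplified stand-ins for the Express types, and the `note` resource is invented for the example.

```typescript
// Simplified stand-ins for Express request/response types (assumption).
interface HttpRequest {
  params: { [key: string]: string };
}

interface HttpResponse {
  status: number;
  body: unknown;
}

// Service layer (services/{resource}Service.ts): business logic only.
const noteService = {
  getNote(id: string): { id: string; title: string } | undefined {
    const notes: { [id: string]: string } = { "1": "First note" }; // fake data store
    return Object.prototype.hasOwnProperty.call(notes, id)
      ? { id, title: notes[id] }
      : undefined;
  },
};

// Controller (controllers/{resource}Controller.ts): request -> service call -> response.
const noteController = {
  getNote(req: HttpRequest): HttpResponse {
    const note = noteService.getNote(req.params["id"]);
    return note === undefined
      ? { status: 404, body: { error: "Note not found" } }
      : { status: 200, body: note };
  },
};

// Route table (routes/{resource}.ts): binds method + path to a controller handler.
const noteRoutes = [
  { method: "GET", path: "/notes/:id", handler: noteController.getNote },
];
```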

**New React Component:**
- Component: `frontend/src/components/{ComponentName}.tsx`
- Types: Add to `frontend/src/types/` or inline in component
- Services: Use existing `frontend/src/services/documentService.ts`
- Tests: `frontend/src/__tests__/{ComponentName}.test.tsx` (if added)

**Shared Utilities:**
- Backend: `backend/src/utils/{utility}.ts`
- Frontend: `frontend/src/utils/{utility}.ts`
- Avoid duplication: check the existing `utils/` files first and extract repeated patterns into a shared helper

**Database Schema Changes:**
- Migration file: `backend/src/models/migrations/{timestamp}_{description}.sql`
- TypeScript interface: Update `backend/src/models/types.ts`
- Model methods: Update corresponding `*Model.ts` file
- Run: `npm run db:migrate` in backend
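
A small helper can keep migration file names consistent with the `{timestamp}_{description}.sql` pattern above. This sketch assumes the timestamp is a UTC `YYYYMMDDHHmmss` string; the project's actual timestamp format is not stated here, and `migrationFileName` is a hypothetical helper.

```typescript
// Assumption: {timestamp} is UTC YYYYMMDDHHmmss. Check existing migrations for the real format.
function pad2(value: number): string {
  return value < 10 ? `0${value}` : String(value);
}

function migrationFileName(description: string, now: Date): string {
  const timestamp =
    String(now.getUTCFullYear()) +
    pad2(now.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad2(now.getUTCDate()) +
    pad2(now.getUTCHours()) +
    pad2(now.getUTCMinutes()) +
    pad2(now.getUTCSeconds());

  // Lowercase, turn each run of non-alphanumerics into "_", and trim stray underscores.
  const slug = description
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");

  return `${timestamp}_${slug}.sql`;
}
```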

**Configuration Changes:**
- Environment: Update `backend/src/config/env.ts` (Joi schema)
- Frontend env: Update `frontend/src/config/env.ts`
- Firebase secrets: Use `firebase functions:secrets:set VAR_NAME`
- Local dev: Add to `.env` file (gitignored)
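
The point of validating environment variables at startup is to fail fast when a new variable is missing or malformed. The project does this with a Joi schema; the sketch below is a hand-rolled, dependency-free stand-in that shows the same idea, not the project's code. The variable names `PORT` and `LLM_API_KEY` and the defaults are hypothetical.

```typescript
// Dependency-free stand-in for a schema-based check; variable names are hypothetical.
interface AppEnv {
  NODE_ENV: string;
  PORT: number;
  LLM_API_KEY: string;
}

function loadEnv(source: { [key: string]: string | undefined }): AppEnv {
  const errors: string[] = [];

  // Optional variables fall back to defaults.
  const nodeEnv = source["NODE_ENV"] ?? "development";
  const port = parseInt(source["PORT"] ?? "3000", 10);
  if (isNaN(port) || port <= 0) errors.push("PORT must be a positive number");

  // Required variables must be present and non-empty.
  const apiKey = source["LLM_API_KEY"];
  if (apiKey === undefined || apiKey === "") errors.push("LLM_API_KEY is required");

  // Report every problem at once so a bad deploy surfaces at startup.
  if (errors.length > 0) throw new Error(`Invalid environment: ${errors.join("; ")}`);

  return { NODE_ENV: nodeEnv, PORT: port, LLM_API_KEY: apiKey as string };
}
```

In a Node process this would be called once at startup with `process.env`, and the returned object used everywhere else instead of reading `process.env` directly.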
## Special Directories

**backend/src/__tests__/mocks/:**
- Purpose: Mock data and fixtures for testing
- Generated: No (manually maintained)
- Committed: Yes
- Usage: Import in tests for consistent test data

**backend/src/scripts/:**
- Purpose: One-off CLI utilities for development and operations
- Generated: No (manually maintained)
- Committed: Yes
- Execution: `ts-node src/scripts/{script}.ts` or `npm run {script}`

**backend/src/assets/:**
- Purpose: Static HTML templates for PDF generation
- Generated: No (manually maintained)
- Committed: Yes
- Usage: Rendered by Puppeteer in `pdfGenerationService.ts`

**backend/src/models/migrations/:**
- Purpose: Database schema migration SQL files
- Generated: No (manually created)
- Committed: Yes
- Execution: Run via `npm run db:migrate`

**frontend/src/assets/:**
- Purpose: Images, icons, logos
- Generated: No (manually added)
- Committed: Yes
- Usage: Import in components (e.g., `bluepoint-logo.png`)

**backend/dist/ and frontend/dist/:**
- Purpose: Compiled JavaScript and optimized bundles
- Generated: Yes (build output)
- Committed: No (gitignored)
- Regeneration: `npm run build` in respective directory

**backend/node_modules/ and frontend/node_modules/:**
- Purpose: Installed dependencies
- Generated: Yes (npm install)
- Committed: No (gitignored)
- Regeneration: `npm install`

**backend/logs/:**
- Purpose: Runtime log files
- Generated: Yes (runtime)
- Committed: No (gitignored)
- Contents: `error.log`, `upload.log`, combined logs

---

*Structure analysis: 2026-02-24*