# Codebase Structure
**Analysis Date:** 2026-02-24
## Directory Layout
```
cim_summary/
├── backend/ # Express.js + TypeScript backend (Node.js)
│ ├── src/
│ │ ├── index.ts # Express app + Firebase Functions exports
│ │ ├── controllers/ # Request handlers
│ │ ├── models/ # Database access + schema
│ │ ├── services/ # Business logic + external integrations
│ │ ├── routes/ # Express route definitions
│ │ ├── middleware/ # Express middleware (auth, validation, error)
│ │ ├── config/ # Configuration (env, firebase, supabase)
│ │ ├── utils/ # Utilities (logger, validation, parsing)
│ │ ├── types/ # TypeScript type definitions
│ │ ├── scripts/ # One-off CLI scripts (diagnostics, setup)
│ │ ├── assets/ # Static assets (HTML templates)
│ │ └── __tests__/ # Test suites (unit, integration, acceptance)
│ ├── package.json # Node dependencies
│ ├── tsconfig.json # TypeScript config
│ ├── .eslintrc.json # ESLint config
│ └── dist/ # Compiled JavaScript (generated)
├── frontend/ # React + Vite + TypeScript frontend
│ ├── src/
│ │ ├── main.tsx # React entry point
│ │ ├── App.tsx # Root component with routing
│ │ ├── components/ # React components (UI)
│ │ ├── services/ # API clients (documentService, authService)
│ │ ├── contexts/ # React Context (AuthContext)
│ │ ├── config/ # Configuration (env, firebase)
│ │ ├── types/ # TypeScript interfaces
│ │ ├── utils/ # Utilities (validation, cn, auth debug)
│ │ └── assets/ # Static images and icons
│ ├── package.json # Node dependencies
│ ├── tsconfig.json # TypeScript config
│ ├── vite.config.ts # Vite bundler config
│ ├── .eslintrc.json # ESLint config
│ ├── tailwind.config.js # Tailwind CSS config
│ ├── postcss.config.js # PostCSS config
│ └── dist/ # Built static assets (generated)
├── .planning/ # GSD planning directory
│ └── codebase/ # Codebase analysis documents
├── package.json # Monorepo root package (if used)
├── .git/ # Git repository
├── .gitignore # Git ignore rules
├── .cursorrules # Cursor IDE configuration
├── README.md # Project overview
├── CONFIGURATION_GUIDE.md # Setup instructions
├── CODEBASE_ARCHITECTURE_SUMMARY.md # Existing architecture notes
└── [PDF documents] # Sample CIM documents for testing
```
## Directory Purposes
**backend/src/:**
- Purpose: All backend server code
- Contains: TypeScript source files
- Key files: `index.ts` (main app), plus the `routes/`, `controllers/`, `services/`, and `models/` subdirectories
**backend/src/controllers/:**
- Purpose: HTTP request handlers
- Contains: `documentController.ts`, `authController.ts`
- Functions: Map HTTP requests to service calls, handle validation, construct responses
**backend/src/services/:**
- Purpose: Business logic and external integrations
- Contains: Document processing, LLM integration, file storage, database, job queue
- Key files:
  - `unifiedDocumentProcessor.ts` - Orchestrator, strategy selection
  - `singlePassProcessor.ts` - 2-LLM extraction (current default)
  - `optimizedAgenticRAGProcessor.ts` - Advanced agentic processing (stub)
  - `documentAiProcessor.ts` - Google Document AI OCR
  - `llmService.ts` - LLM API calls (Anthropic/OpenAI/OpenRouter)
  - `jobQueueService.ts` - Async job queue (in-memory, EventEmitter)
  - `jobProcessorService.ts` - Dequeue and execute jobs
  - `fileStorageService.ts` - GCS signed URLs and upload
  - `vectorDatabaseService.ts` - Supabase pgvector operations
  - `pdfGenerationService.ts` - Puppeteer PDF rendering
  - `uploadProgressService.ts` - Track upload status
  - `uploadMonitoringService.ts` - Monitor processing progress
  - `llmSchemas.ts` - Zod schemas for LLM extraction (CIMReview, financial data)
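
The split between `jobQueueService.ts` (in-memory, EventEmitter) and `jobProcessorService.ts` (dequeue and execute) can be pictured with a minimal sketch. The `Job` shape, the class name `InMemoryJobQueue`, and the event names `job:queued` and `job:started` are assumptions made for illustration, not the project's actual API.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical job shape; the real fields live in the project's models.
interface Job {
  id: string;
  documentId: string;
  status: "queued" | "processing";
}

// Minimal in-memory FIFO queue that announces state changes as events.
class InMemoryJobQueue extends EventEmitter {
  private readonly jobs: Job[] = [];

  enqueue(id: string, documentId: string): Job {
    const job: Job = { id, documentId, status: "queued" };
    this.jobs.push(job);
    this.emit("job:queued", job); // listeners run synchronously
    return job;
  }

  // Returns the oldest queued job, or undefined when the queue is empty.
  dequeue(): Job | undefined {
    const job = this.jobs.shift();
    if (job) {
      job.status = "processing";
      this.emit("job:started", job);
    }
    return job;
  }

  get size(): number {
    return this.jobs.length;
  }
}

const queue = new InMemoryJobQueue();
const seen: string[] = [];
queue.on("job:queued", (job: Job) => {
  seen.push(`queued:${job.id}`);
});
queue.on("job:started", (job: Job) => {
  seen.push(`started:${job.id}`);
});

queue.enqueue("job-1", "doc-a");
queue.enqueue("job-2", "doc-b");
const first = queue.dequeue(); // the processor side would execute this job
```

Because the queue lives in process memory, queued jobs would not survive a restart; durable job state is presumably what `ProcessingJobModel.ts` tracks in the database.
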
**backend/src/models/:**
- Purpose: Database access layer and schema definitions
- Contains: Document, User, ProcessingJob, Feedback models
- Key files:
  - `types.ts` - TypeScript interfaces (Document, ProcessingJob, ProcessingStatus)
  - `DocumentModel.ts` - Document CRUD with retry logic
  - `ProcessingJobModel.ts` - Job tracking in database
  - `UserModel.ts` - User management
  - `VectorDatabaseModel.ts` - Vector embedding queries
  - `migrate.ts` - Database migrations
  - `seed.ts` - Test data seeding
  - `migrations/` - SQL migration files
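
The "retry logic" noted for `DocumentModel.ts` might look roughly like the helper below. This is a sketch under assumptions: the name `withRetry`, the fixed delay, and the attempt count are illustrative, and the real model may use a different backoff policy.

```typescript
// Retries an async operation, waiting a fixed delay between attempts.
// Rethrows the last error once every attempt has failed.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Simulated flaky query (hypothetical): fails twice, then succeeds.
let attempts = 0;
async function flakyFindById(id: string): Promise<{ id: string }> {
  attempts += 1;
  if (attempts < 3) {
    throw new Error("transient connection error");
  }
  return { id };
}

const pending = withRetry(() => flakyFindById("doc-1"), 3, 0);
```
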
**backend/src/routes/:**
- Purpose: Express route definitions
- Contains: Route handlers and middleware bindings
- Key files:
  - `documents.ts` - GET/POST/PUT/DELETE document endpoints
  - `vector.ts` - Vector search endpoints
  - `monitoring.ts` - Health and status endpoints
  - `documentAudit.ts` - Audit log endpoints
**backend/src/middleware/:**
- Purpose: Express middleware for cross-cutting concerns
- Contains: Authentication, validation, error handling
- Key files:
  - `firebaseAuth.ts` - Firebase ID token verification
  - `errorHandler.ts` - Global error handling + correlation ID
  - `notFoundHandler.ts` - 404 handler
  - `validation.ts` - Request validation (UUID, pagination)
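
As an illustration of the UUID and pagination checks that `validation.ts` is described as performing, here is a minimal sketch. The function names, the default page size of 20, and the maximum of 100 are assumptions, not values taken from the codebase.

```typescript
// Accepts any RFC 4122-style 8-4-4-4-12 hex layout, case-insensitive.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isValidUuid(value: string): boolean {
  return UUID_PATTERN.test(value);
}

interface Pagination {
  limit: number;
  offset: number;
}

// Parses a query-string value as a non-negative integer, or uses the fallback.
function parseNonNegativeInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) {
    return fallback;
  }
  if (!/^\d+$/.test(raw)) {
    throw new RangeError(`Not a non-negative integer: ${raw}`);
  }
  return Number(raw);
}

// Hypothetical defaults: 20 items per page, at most 100.
function parsePagination(
  query: { limit?: string; offset?: string },
  maxLimit = 100,
): Pagination {
  const limit = parseNonNegativeInt(query.limit, 20);
  const offset = parseNonNegativeInt(query.offset, 0);
  if (limit < 1 || limit > maxLimit) {
    throw new RangeError(`limit must be between 1 and ${maxLimit}`);
  }
  return { limit, offset };
}

const page = parsePagination({ limit: "50", offset: "100" });
const defaults = parsePagination({});
```

In an Express middleware, a thrown `RangeError` would typically be translated into a 400 response before the controller runs.
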
**backend/src/config/:**
- Purpose: Configuration and initialization
- Contains: Environment setup, service initialization
- Key files:
  - `env.ts` - Environment variable validation (Joi schema)
  - `firebase.ts` - Firebase Admin SDK initialization
  - `supabase.ts` - Supabase client and pool setup
  - `database.ts` - PostgreSQL connection (legacy)
  - `errorConfig.ts` - Error handling config
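
The idea behind `env.ts`, validating environment variables once at startup and failing fast, can be sketched without any library. The real file uses a Joi schema; the hand-rolled checks below are a stand-in, and the variable names `NODE_ENV`, `PORT`, and `SUPABASE_URL` and the default port are assumptions.

```typescript
type RawEnv = Record<string, string | undefined>;

interface AppConfig {
  nodeEnv: "development" | "production" | "test";
  port: number;
  supabaseUrl: string;
}

// Collects every problem before throwing, so one run reports all issues.
function loadConfig(env: RawEnv): AppConfig {
  const errors: string[] = [];

  const nodeEnv = env["NODE_ENV"] ?? "development";
  if (nodeEnv !== "development" && nodeEnv !== "production" && nodeEnv !== "test") {
    errors.push(`NODE_ENV has unsupported value "${nodeEnv}"`);
  }

  const port = Number(env["PORT"] ?? "5000"); // hypothetical default port
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    errors.push("PORT must be an integer between 1 and 65535");
  }

  const supabaseUrl = env["SUPABASE_URL"] ?? "";
  if (supabaseUrl === "") {
    errors.push("SUPABASE_URL is required");
  }

  if (errors.length > 0) {
    throw new Error(`Invalid environment: ${errors.join("; ")}`);
  }
  return { nodeEnv: nodeEnv as AppConfig["nodeEnv"], port, supabaseUrl };
}

// In the application this would receive process.env.
const config = loadConfig({ SUPABASE_URL: "https://example.supabase.co", PORT: "8080" });
```
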
**backend/src/utils/:**
- Purpose: Shared utility functions
- Contains: Logging, validation, parsing
- Key files:
  - `logger.ts` - Winston logger setup (console + file transports)
  - `validation.ts` - UUID and pagination validators
  - `googleServiceAccount.ts` - Google Cloud credentials resolution
  - `financialExtractor.ts` - Financial data parsing (deprecated for single-pass)
  - `templateParser.ts` - CIM template utilities
  - `auth.ts` - Authentication helpers
**backend/src/scripts/:**
- Purpose: One-off CLI scripts for diagnostics and setup
- Contains: Database setup, testing, monitoring
- Key files:
  - `setup-database.ts` - Initialize database schema
  - `monitor-document-processing.ts` - Watch job queue status
  - `check-current-job.ts` - Debug stuck jobs
  - `test-full-llm-pipeline.ts` - End-to-end testing
  - `comprehensive-diagnostic.ts` - System health check
**backend/src/__tests__/:**
- Purpose: Test suites
- Contains: Unit, integration, acceptance tests
- Subdirectories:
  - `unit/` - Isolated component tests
  - `integration/` - Multi-component tests
  - `acceptance/` - End-to-end flow tests
  - `mocks/` - Mock data and fixtures
  - `utils/` - Test utilities
**frontend/src/:**
- Purpose: All frontend code
- Contains: React components, services, types
**frontend/src/components/:**
- Purpose: React UI components
- Contains: Page components, reusable widgets
- Key files:
  - `DocumentUpload.tsx` - File upload UI with drag-and-drop
  - `DocumentList.tsx` - List of processed documents
  - `DocumentViewer.tsx` - View and edit extracted data
  - `ProcessingProgress.tsx` - Real-time processing status
  - `UploadMonitoringDashboard.tsx` - Admin view of active jobs
  - `LoginForm.tsx` - Firebase auth login UI
  - `ProtectedRoute.tsx` - Route guard for authenticated pages
  - `Analytics.tsx` - Document analytics and statistics
  - `CIMReviewTemplate.tsx` - Display extracted CIM review data
**frontend/src/services/:**
- Purpose: API clients and external service integration
- Contains: HTTP clients for backend
- Key files:
  - `documentService.ts` - Document API calls (upload, list, process, status)
  - `authService.ts` - Firebase authentication (login, logout, token)
  - `adminService.ts` - Admin-only operations
**frontend/src/contexts/:**
- Purpose: React Context for global state
- Contains: AuthContext for user and authentication state
- Key files:
  - `AuthContext.tsx` - User, token, login/logout state
**frontend/src/config/:**
- Purpose: Configuration
- Contains: Environment variables, Firebase setup
- Key files:
  - `env.ts` - `VITE_API_BASE_URL` and other env vars
  - `firebase.ts` - Firebase client initialization
**frontend/src/types/:**
- Purpose: TypeScript interfaces
- Contains: API response types, component props
- Key files:
  - `auth.ts` - User, LoginCredentials, AuthContextType
**frontend/src/utils/:**
- Purpose: Shared utility functions
- Contains: Validation, CSS utilities
- Key files:
  - `validation.ts` - Email, password validators
  - `cn.ts` - Classname merger (clsx wrapper)
  - `authDebug.ts` - Authentication debugging helpers
## Key File Locations
**Entry Points:**
- `backend/src/index.ts` - Main Express app and Firebase Functions exports
- `frontend/src/main.tsx` - React entry point
- `frontend/src/App.tsx` - Root component with routing
**Configuration:**
- `backend/src/config/env.ts` - Environment variable schema and validation
- `backend/src/config/firebase.ts` - Firebase Admin SDK setup
- `backend/src/config/supabase.ts` - Supabase client and connection pool
- `frontend/src/config/firebase.ts` - Firebase client configuration
- `frontend/src/config/env.ts` - Frontend environment variables
**Core Logic:**
- `backend/src/services/unifiedDocumentProcessor.ts` - Main document processing orchestrator
- `backend/src/services/singlePassProcessor.ts` - Single-pass 2-LLM strategy
- `backend/src/services/llmService.ts` - LLM API integration with retry
- `backend/src/services/jobQueueService.ts` - Background job queue
- `backend/src/services/vectorDatabaseService.ts` - Vector search implementation
**Testing:**
- `backend/src/__tests__/unit/` - Unit tests
- `backend/src/__tests__/integration/` - Integration tests
- `backend/src/__tests__/acceptance/` - End-to-end tests
**Database:**
- `backend/src/models/types.ts` - TypeScript type definitions
- `backend/src/models/DocumentModel.ts` - Document CRUD operations
- `backend/src/models/ProcessingJobModel.ts` - Job tracking
- `backend/src/models/migrations/` - SQL migration files
**Middleware:**
- `backend/src/middleware/firebaseAuth.ts` - Firebase ID token (JWT) verification
- `backend/src/middleware/errorHandler.ts` - Global error handling
- `backend/src/middleware/validation.ts` - Input validation
**Logging:**
- `backend/src/utils/logger.ts` - Winston logger configuration
## Naming Conventions
**Files:**
- Controllers: `{resource}Controller.ts` (e.g., `documentController.ts`)
- Services: `{service}Service.ts` or a descriptive name (e.g., `llmService.ts`, `singlePassProcessor.ts`)
- Models: `{Entity}Model.ts` (e.g., `DocumentModel.ts`)
- Routes: `{resource}.ts` (e.g., `documents.ts`)
- Middleware: `{purpose}Handler.ts` or `{purpose}.ts` (e.g., `errorHandler.ts`, `firebaseAuth.ts`)
- Types/Interfaces: `types.ts` or `{name}Types.ts`
- Tests: `{file}.test.ts` or `{file}.spec.ts`
**Directories:**
- Plurals for collections: `services/`, `models/`, `utils/`, `routes/`, `controllers/`, `types/`, `contexts/`
- Singular for specific features: `config/`, `middleware/`
- Nested by feature in larger directories: `__tests__/unit/`, `models/migrations/`
**Functions/Variables:**
- Camel case: `processDocument()`, `getUserId()`, `documentId`
- Constants: Upper snake case: `MAX_RETRIES`, `TIMEOUT_MS`
- Private methods: Prefix with `_` or use TypeScript `private`: `_retryOperation()`
**Classes:**
- Pascal case: `DocumentModel`, `JobQueueService`, `SinglePassProcessor`
- Service instances exported as singletons: `export const llmService = new LLMService()`
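
The function, constant, and class conventions above fit together as in the following sketch. Only the names `MAX_RETRIES`, `LLMService`, and `llmService` come from this document; the methods `recordCall` and `canRetry` are hypothetical placeholders.

```typescript
// Constants use UPPER_SNAKE_CASE.
const MAX_RETRIES = 3;

// Classes use PascalCase.
class LLMService {
  // Private state uses the TypeScript `private` modifier.
  private callCount = 0;

  // Functions and methods use camelCase.
  recordCall(): number {
    this.callCount += 1;
    return this.callCount;
  }

  canRetry(attempt: number): boolean {
    return attempt < MAX_RETRIES;
  }
}

// One shared instance is exported for the rest of the codebase to import.
export const llmService = new LLMService();
```

Exporting the instance rather than the class means every importer shares the same state, which suits services that hold clients or counters.
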
**React Components:**
- Pascal case: `DocumentUpload.tsx`, `ProtectedRoute.tsx`
- Hooks: `use{Feature}` (e.g., `useAuth` from AuthContext)
## Where to Add New Code
**New Document Processing Strategy:**
- Primary code: `backend/src/services/{strategyName}Processor.ts`
- Schema: Add types to `backend/src/services/llmSchemas.ts`
- Integration: Register in `backend/src/services/unifiedDocumentProcessor.ts`
- Tests: `backend/src/__tests__/integration/{strategyName}.test.ts`
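
Registering a new strategy with the orchestrator could follow a pattern like the one below. This is a sketch only: the `DocumentProcessor` interface, the `register` method, the strategy names, and the synchronous `process` signature are assumptions, and the actual `unifiedDocumentProcessor.ts` may select strategies differently.

```typescript
// Hypothetical result shape; the real schemas live in llmSchemas.ts.
interface ProcessingResult {
  strategy: string;
  summary: string;
}

// Contract every strategy implements (simplified to a synchronous call).
interface DocumentProcessor {
  readonly name: string;
  process(text: string): ProcessingResult;
}

class SinglePassProcessor implements DocumentProcessor {
  readonly name = "single_pass";

  process(text: string): ProcessingResult {
    return { strategy: this.name, summary: text.slice(0, 20) };
  }
}

// The new strategy, which would live in {strategyName}Processor.ts.
class WordCountProcessor implements DocumentProcessor {
  readonly name = "word_count";

  process(text: string): ProcessingResult {
    const words = text.split(/\s+/).filter((word) => word !== "").length;
    return { strategy: this.name, summary: `${words} words` };
  }
}

// Orchestrator: holds the registry and picks a strategy by name.
class UnifiedDocumentProcessor {
  private readonly strategies = new Map<string, DocumentProcessor>();
  private readonly defaultStrategy: string;

  constructor(defaultStrategy: string) {
    this.defaultStrategy = defaultStrategy;
  }

  register(processor: DocumentProcessor): void {
    this.strategies.set(processor.name, processor);
  }

  process(text: string, strategy?: string): ProcessingResult {
    const name = strategy ?? this.defaultStrategy;
    const processor = this.strategies.get(name);
    if (!processor) {
      throw new Error(`Unknown processing strategy: ${name}`);
    }
    return processor.process(text);
  }
}

const orchestrator = new UnifiedDocumentProcessor("single_pass");
orchestrator.register(new SinglePassProcessor());
orchestrator.register(new WordCountProcessor());

const byDefault = orchestrator.process("Confidential information memorandum");
const byName = orchestrator.process("Confidential information memorandum", "word_count");
```
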
**New API Endpoint:**
- Route: `backend/src/routes/{resource}.ts`
- Controller: `backend/src/controllers/{resource}Controller.ts`
- Service: `backend/src/services/{resource}Service.ts` (if needed)
- Model: `backend/src/models/{Resource}Model.ts` (if database access)
- Tests: `backend/src/__tests__/integration/{endpoint}.test.ts`
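
The route, controller, service, and model layers listed above hand off to one another roughly as sketched here. The `notes` resource and every name in the block are hypothetical, and the small `HttpRequest` and `HttpResponse` interfaces stand in for the Express types the real code would use.

```typescript
// Minimal stand-ins for the Express request and response objects.
interface HttpRequest {
  params: Record<string, string>;
}
interface HttpResponse {
  status: number;
  body: unknown;
}

// Model layer: data access (an in-memory map stands in for the database).
interface Note {
  id: string;
  text: string;
}
class NoteModel {
  private readonly rows = new Map<string, Note>([["n1", { id: "n1", text: "hello" }]]);

  findById(id: string): Note | undefined {
    return this.rows.get(id);
  }
}

// Service layer: business rules, with no knowledge of HTTP.
class NoteService {
  private readonly model: NoteModel;

  constructor(model: NoteModel) {
    this.model = model;
  }

  getNote(id: string): Note | undefined {
    return this.model.findById(id);
  }
}

// Controller layer: maps the request to a service call and builds the response.
function makeGetNoteController(service: NoteService) {
  return (req: HttpRequest): HttpResponse => {
    const id = req.params["id"] ?? "";
    const note = service.getNote(id);
    return note
      ? { status: 200, body: note }
      : { status: 404, body: { error: "Note not found" } };
  };
}

// Route layer: binds a method and path to the controller.
const noteService = new NoteService(new NoteModel());
const routes = {
  "GET /notes/:id": makeGetNoteController(noteService),
};

const found = routes["GET /notes/:id"]({ params: { id: "n1" } });
const missing = routes["GET /notes/:id"]({ params: { id: "nope" } });
```

Keeping HTTP details in the controller leaves the service testable without a running server.
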
**New React Component:**
- Component: `frontend/src/components/{ComponentName}.tsx`
- Types: Add to `frontend/src/types/` or inline in component
- Services: Use existing `frontend/src/services/documentService.ts`
- Tests: `frontend/src/__tests__/{ComponentName}.test.tsx` (if added)
**Shared Utilities:**
- Backend: `backend/src/utils/{utility}.ts`
- Frontend: `frontend/src/utils/{utility}.ts`
- Avoid code duplication - consider extracting common patterns
**Database Schema Changes:**
- Migration file: `backend/src/models/migrations/{timestamp}_{description}.sql`
- TypeScript interface: Update `backend/src/models/types.ts`
- Model methods: Update corresponding `*Model.ts` file
- Run: `npm run db:migrate` in backend
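
A small helper can generate file names in the `{timestamp}_{description}.sql` pattern. The timestamp layout is not specified in this document, so the `YYYYMMDDHHmmss` UTC format below is an assumption, as is the function name `migrationFileName`.

```typescript
// Builds "{timestamp}_{description}.sql" with a UTC YYYYMMDDHHmmss timestamp.
function migrationFileName(description: string, now: Date = new Date()): string {
  const pad = (value: number): string => String(value).padStart(2, "0");
  const timestamp =
    String(now.getUTCFullYear()) +
    pad(now.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad(now.getUTCDate()) +
    pad(now.getUTCHours()) +
    pad(now.getUTCMinutes()) +
    pad(now.getUTCSeconds());

  // Lowercase the description and collapse anything non-alphanumeric to "_".
  const slug = description
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
  if (slug === "") {
    throw new Error("description must contain at least one letter or digit");
  }
  return `${timestamp}_${slug}.sql`;
}

const fileName = migrationFileName(
  "Add feedback table",
  new Date(Date.UTC(2026, 1, 24, 15, 28, 22)),
);
```

A sortable timestamp prefix keeps migrations in creation order when the directory is listed alphabetically.
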
**Configuration Changes:**
- Environment: Update `backend/src/config/env.ts` (Joi schema)
- Frontend env: Update `frontend/src/config/env.ts`
- Firebase secrets: Use `firebase functions:secrets:set VAR_NAME`
- Local dev: Add to `.env` file (gitignored)
## Special Directories
**backend/src/__tests__/mocks/:**
- Purpose: Mock data and fixtures for testing
- Generated: No (manually maintained)
- Committed: Yes
- Usage: Import in tests for consistent test data
**backend/src/scripts/:**
- Purpose: One-off CLI utilities for development and operations
- Generated: No (manually maintained)
- Committed: Yes
- Execution: `ts-node src/scripts/{script}.ts` or `npm run {script}`
**backend/src/assets/:**
- Purpose: Static HTML templates for PDF generation
- Generated: No (manually maintained)
- Committed: Yes
- Usage: Rendered by Puppeteer in `pdfGenerationService.ts`
**backend/src/models/migrations/:**
- Purpose: Database schema migration SQL files
- Generated: No (manually created)
- Committed: Yes
- Execution: Run via `npm run db:migrate`
**frontend/src/assets/:**
- Purpose: Images, icons, logos
- Generated: No (manually added)
- Committed: Yes
- Usage: Import in components (e.g., `bluepoint-logo.png`)
**backend/dist/ and frontend/dist/:**
- Purpose: Compiled JavaScript and optimized bundles
- Generated: Yes (build output)
- Committed: No (gitignored)
- Regeneration: `npm run build` in respective directory
**backend/node_modules/ and frontend/node_modules/:**
- Purpose: Installed dependencies
- Generated: Yes (npm install)
- Committed: No (gitignored)
- Regeneration: `npm install`
**backend/logs/:**
- Purpose: Runtime log files
- Generated: Yes (runtime)
- Committed: No (gitignored)
- Contents: `error.log`, `upload.log`, combined logs
---
*Structure analysis: 2026-02-24*