admin/cim_summary

admin e6e1b1fa6f docs: map existing codebase

2026-02-24 10:28:22 -05:00

Coding Conventions

Analysis Date: 2026-02-24

Naming Patterns

Files:

Backend service files: camelCase.ts (e.g., llmService.ts, unifiedDocumentProcessor.ts, vectorDatabaseService.ts)
Backend middleware/controllers: camelCase.ts (e.g., errorHandler.ts, firebaseAuth.ts)
Frontend components: PascalCase.tsx (e.g., DocumentUpload.tsx, LoginForm.tsx, ProtectedRoute.tsx)
Frontend utility files: camelCase.ts (e.g., cn.ts for class name utilities)
Type definition files: camelCase, as either .ts or .d.ts (e.g., express.d.ts)
Model files: PascalCase.ts in backend/src/models/ (e.g., DocumentModel.ts)
Config files: camelCase.ts (e.g., env.ts, firebase.ts, supabase.ts)

Functions:

Both backend and frontend use camelCase: processDocument(), validateUUID(), handleUpload()
React components are PascalCase: DocumentUpload, ErrorHandler
Handler functions use a handle or on prefix followed by the event or action: handleVisibilityChange(), onDrop()
Async functions use descriptive names: fetchDocuments(), uploadDocument(), processDocument()

Variables:

camelCase for all variables: documentId, correlationId, isUploading, uploadedFiles
Constants use UPPER_SNAKE_CASE, which appears only rarely: MAX_CONCURRENT_LLM_CALLS, MAX_TOKEN_LIMITS
Boolean prefixes: is* (isUploading, isAdmin), has* (hasError), can* (canProcess)

Types:

Interfaces use PascalCase: LLMRequest, UploadedFile, DocumentUploadProps, CIMReview
Type unions and enums use PascalCase names: ProcessingStrategy (union), ErrorCategory (enum)
Generic types use single uppercase letter or descriptive name: T, K, V
Enum values use UPPER_SNAKE_CASE: ErrorCategory.VALIDATION, ErrorCategory.AUTHENTICATION

Interfaces vs Types:

Interfaces for object shapes that represent entities or components: interface Document, interface UploadedFile
Types for unions, primitives, and specialized patterns: type ProcessingStrategy = 'document_ai_agentic_rag' | 'simple_full_document'
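
As a minimal sketch of this split: the UploadedFile fields and the isProcessingStrategy helper below are hypothetical and not taken from the codebase; only the two ProcessingStrategy values come from the text above.

```typescript
// Interface: an object shape that represents an entity.
// The fields are illustrative assumptions, not the real UploadedFile.
interface UploadedFile {
  id: string;
  fileName: string;
  isUploading: boolean;
}

// Type alias: a union of string literals (values quoted from the conventions).
type ProcessingStrategy = 'document_ai_agentic_rag' | 'simple_full_document';

const PROCESSING_STRATEGIES: readonly ProcessingStrategy[] = [
  'document_ai_agentic_rag',
  'simple_full_document',
];

// Hypothetical type guard: narrows an arbitrary string to the union,
// so callers do not need an `as` assertion.
function isProcessingStrategy(value: string): value is ProcessingStrategy {
  return PROCESSING_STRATEGIES.some((strategy) => strategy === value);
}

const sampleFile: UploadedFile = {
  id: 'doc-1',
  fileName: 'cim.pdf',
  isUploading: false,
};
```

The guard keeps untrusted input (for example a query parameter) out of the union until it has been checked at runtime.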

Code Style

Formatting:

No Prettier config detected in the repo; formatting is not enforced, so tolerate variation between files
2-space indentation (observed in TypeScript files)
Semicolons required at end of statements
Single quotes for strings in TypeScript, double quotes in JSX attributes
Line length: preferably under 100 characters but not enforced

Linting:

Tool: ESLint with TypeScript support
Config: .eslintrc.js in backend
Key rules:
- @typescript-eslint/no-unused-vars: error (allows leading underscore for intentionally unused)
- @typescript-eslint/no-explicit-any: warn (use unknown instead)
- @typescript-eslint/no-non-null-assertion: warn (use proper type guards)
- no-console: off in backend (logging used via Winston)
- no-undef: error (flags references to undeclared variables)
Frontend ESLint ignores unused disable directives and has max-warnings: 0

TypeScript Standards:

Strict mode not fully enabled (noImplicitAny disabled in tsconfig.json for legacy reasons)
Prefer explicit typing over any: use unknown when type is truly unknown
Type guards required for safety checks: error instanceof Error ? error.message : String(error)
No type assertions with as for complex types; use proper type narrowing
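
A small sketch of what narrowing from unknown can look like in practice. The CodedError shape and both helper names are assumptions made for illustration; only the instanceof Error fallback is taken from the conventions above. The property checks rely on `in` narrowing, which needs TypeScript 4.9 or later.

```typescript
// Hypothetical shape of a service error that carries a code; field names are assumed.
interface CodedError {
  code: string;
  message: string;
}

// Type guard: verifies the runtime shape before the compiler trusts it.
function isCodedError(value: unknown): value is CodedError {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  return (
    'code' in value &&
    typeof value.code === 'string' &&
    'message' in value &&
    typeof value.message === 'string'
  );
}

// Converts anything caught in a catch block into a loggable string.
function toErrorMessage(error: unknown): string {
  if (error instanceof Error) {
    return error.message;
  }
  if (isCodedError(error)) {
    return `${error.code}: ${error.message}`;
  }
  return String(error);
}
```

Because the parameter is unknown rather than any, the compiler rejects property access until one of the guards has run.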

Import Organization

Order (a guideline; existing files, including the backend example below, do not follow it strictly):

1. External framework/library imports (express, react, winston)
2. Google Cloud/Firebase imports (@google-cloud/storage, firebase-admin)
3. Third-party service imports (axios, zod, joi)
4. Internal config imports ('../config/env', '../config/firebase')
5. Internal utility imports ('../utils/logger', '../utils/cn')
6. Internal model imports ('../models/DocumentModel')
7. Internal service imports ('../services/llmService')
8. Internal middleware/helper imports ('../middleware/errorHandler')
9. Type-only imports at the end: import type { ProcessingStrategy } from '...'

Examples:

Backend service pattern from optimizedAgenticRAGProcessor.ts:

```
import { logger } from '../utils/logger';
import { vectorDatabaseService } from './vectorDatabaseService';
import { VectorDatabaseModel } from '../models/VectorDatabaseModel';
import { llmService } from './llmService';
import { CIMReview } from './llmSchemas';
import { config } from '../config/env';
import type { ParsedFinancials } from './financialTableParser';
import type { StructuredTable } from './documentAiProcessor';
```

Frontend component pattern from DocumentList.tsx:

```
import React from 'react';
import {
  FileText,
  Eye,
  Download,
  Trash2,
  Calendar,
  User,
  Clock
} from 'lucide-react';
import { cn } from '../utils/cn';
```

Path Aliases:

No @ alias imports detected; all internal imports use relative paths (./ and ../)
Monorepo structure: frontend and backend in separate directories with independent module resolution

Error Handling

Patterns:

Structured Error Objects with Categories:
- Use ErrorCategory enum for classification: VALIDATION, AUTHENTICATION, AUTHORIZATION, NOT_FOUND, EXTERNAL_SERVICE, PROCESSING, DATABASE, SYSTEM
- Attach AppError interface properties: statusCode, isOperational, code, correlationId, category, retryable, context
- Example from errorHandler.ts:
```
const enhancedError: AppError = {
  category: ErrorCategory.VALIDATION,
  statusCode: 400,
  code: 'INVALID_UUID_FORMAT',
  retryable: false
};
```
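
To show how the listed properties might fit together, here is a sketch under several assumptions: AppError is modelled as an Error with optional extra fields, createAppError is a hypothetical helper, and the category string values are invented. The codebase declares ErrorCategory as an enum; the sketch uses a const object with the same member names so that it also runs under type-stripping runtimes.

```typescript
// Member names come from the conventions above; the string values are assumed.
const ErrorCategory = {
  VALIDATION: 'validation',
  AUTHENTICATION: 'authentication',
  AUTHORIZATION: 'authorization',
  NOT_FOUND: 'not_found',
  EXTERNAL_SERVICE: 'external_service',
  PROCESSING: 'processing',
  DATABASE: 'database',
  SYSTEM: 'system',
} as const;
type ErrorCategory = (typeof ErrorCategory)[keyof typeof ErrorCategory];

// Assumed shape: a standard Error plus the classification fields listed above.
interface AppError extends Error {
  statusCode?: number;
  isOperational?: boolean;
  code?: string;
  correlationId?: string;
  category?: ErrorCategory;
  retryable?: boolean;
  context?: Record<string, unknown>;
}

// Hypothetical helper: attaches classification fields to a plain Error.
// Errors are marked operational unless the caller says otherwise.
function createAppError(
  message: string,
  fields: Omit<AppError, 'name' | 'message' | 'stack'>
): AppError {
  const error: AppError = new Error(message);
  return Object.assign(error, { isOperational: true }, fields);
}

const invalidUuid = createAppError('Invalid UUID format', {
  category: ErrorCategory.VALIDATION,
  statusCode: 400,
  code: 'INVALID_UUID_FORMAT',
  retryable: false,
  correlationId: 'corr-123',
});
```

Building on a real Error instance keeps the stack trace, which the logging pattern in the next subsection relies on.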

Try-Catch with Structured Logging:

Always catch errors with explicit type checking
Log with structured data including correlation ID

Example pattern:

```
try {
  await operation();
} catch (error) {
  logger.error('Operation failed', {
    error: error instanceof Error ? error.message : String(error),
    stack: error instanceof Error ? error.stack : undefined,
    context: { documentId, userId }
  });
  throw error;
}
```

HTTP Response Pattern:
- Success responses: { success: true, data: {...} }
- Error responses: { success: false, error: { code, message, details, correlationId, timestamp, retryable } }
- User-friendly messages mapped by error category
- Include X-Correlation-ID header in responses
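
The two envelopes above can be expressed as a discriminated union. This is a sketch only: the type names and builder functions are hypothetical, and the real handlers may assemble responses differently.

```typescript
// Error payload fields as listed above; `details` is deliberately loose.
interface ErrorBody {
  code: string;
  message: string;
  details?: unknown;
  correlationId: string;
  timestamp: string;
  retryable: boolean;
}

// `success` is the discriminant: checking it narrows to one branch.
type ApiResponse<T> =
  | { success: true; data: T }
  | { success: false; error: ErrorBody };

// Hypothetical builder for the success envelope.
function successResponse<T>(data: T): ApiResponse<T> {
  return { success: true, data };
}

// Hypothetical builder for the error envelope; stamps the current time.
function errorResponse(
  code: string,
  message: string,
  correlationId: string,
  retryable = false
): ApiResponse<never> {
  return {
    success: false,
    error: {
      code,
      message,
      correlationId,
      retryable,
      timestamp: new Date().toISOString(),
    },
  };
}
```

With this typing, a client that checks `success` first gets compile-time access to either `data` or `error`, never both.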

Retry Logic:
- LLM service implements concurrency limiting: max 1 concurrent call to prevent rate limits
- 3 retry attempts for LLM API calls with exponential backoff (see llmService.ts lines 236-450)
- Jobs respect 14-minute timeout limit with graceful status updates
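
The retry and concurrency rules can be sketched as follows. This is not the code in llmService.ts: withRetry and ConcurrencyLimiter are hypothetical names, and the 1000 ms base delay is an assumed value. Only the limits (1 concurrent call, 3 attempts) come from the text above.

```typescript
const MAX_CONCURRENT_LLM_CALLS = 1; // from the conventions above
const MAX_ATTEMPTS = 3; // from the conventions above

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries `operation`, waiting baseDelayMs, then 2x, then 4x, between attempts.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = MAX_ATTEMPTS,
  baseDelayMs = 1000 // assumed value
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}

// Minimal limiter: at most `limit` tasks run at once, the rest queue in order.
class ConcurrencyLimiter {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active < this.limit) {
      this.active++;
    } else {
      // Wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiting.push(() => resolve()));
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // pass the slot directly to the next waiter
      } else {
        this.active--;
      }
    }
  }
}

const llmLimiter = new ConcurrencyLimiter(MAX_CONCURRENT_LLM_CALLS);
// Usage, with a hypothetical callModel function:
//   llmLimiter.run(() => withRetry(() => callModel(prompt)));
```

Handing the slot directly to the next waiter, instead of decrementing and re-incrementing the counter, prevents a new caller from slipping in ahead of the queue.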

External Service Errors:
- Firebase Auth errors: extract from error.message and error.name (TokenExpiredError, JsonWebTokenError)
- Supabase errors: check error.code and error.message, handle UUID validation errors
- GCS errors: extract from error objects with proper null checks

Logging

Framework: Winston logger from backend/src/utils/logger.ts

Levels:

logger.debug(): Detailed diagnostic info (disabled in production)
logger.info(): Normal operation information, upload start/completion, processing status
logger.warn(): Warning conditions, CORS rejections, non-critical issues
logger.error(): Error conditions with full context and stack traces

Structured Logging Pattern:

```
logger.info('Message', {
  correlationId: correlationId,
  category: 'operation_type',
  operation: 'specific_action',
  documentId: documentId,
  userId: userId,
  metadata: value,
  timestamp: new Date().toISOString()
});
```

StructuredLogger Class:

Use for operations requiring correlation ID tracking
Constructor: const logger = new StructuredLogger(correlationId)
Specialized methods:
- uploadStart(), uploadSuccess(), uploadError() - for file operations
- processingStart(), processingSuccess(), processingError() - for document processing
- storageOperation() - for file storage operations
- jobQueueOperation() - for background jobs
- info(), warn(), error(), debug() - general logging
All methods automatically attach correlation ID to metadata
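
The sketch below illustrates how a class like this can attach the correlation ID to every entry. It is not the real StructuredLogger: the sink parameter stands in for Winston so the example has no dependencies, and the uploadStart signature and its category and operation values are assumptions.

```typescript
type LogLevel = 'debug' | 'info' | 'warn' | 'error';
type LogMeta = Record<string, unknown>;
// Stand-in for the Winston logger (an assumption of this sketch).
type LogSink = (level: LogLevel, message: string, meta: LogMeta) => void;

const consoleSink: LogSink = (level, message, meta) => {
  console.log(JSON.stringify({ level, message, ...meta }));
};

class StructuredLogger {
  private readonly correlationId: string;
  private readonly sink: LogSink;

  constructor(correlationId: string, sink: LogSink = consoleSink) {
    this.correlationId = correlationId;
    this.sink = sink;
  }

  info(message: string, meta: LogMeta = {}): void {
    this.write('info', message, meta);
  }

  error(message: string, meta: LogMeta = {}): void {
    this.write('error', message, meta);
  }

  // Specialized method; the category and operation values are assumed.
  uploadStart(fileName: string, meta: LogMeta = {}): void {
    this.write('info', 'Upload started', {
      category: 'upload',
      operation: 'upload_start',
      fileName,
      ...meta,
    });
  }

  // Every entry gets the correlation ID and a timestamp attached last,
  // so caller-supplied metadata cannot overwrite them.
  private write(level: LogLevel, message: string, meta: LogMeta): void {
    this.sink(level, message, {
      ...meta,
      correlationId: this.correlationId,
      timestamp: new Date().toISOString(),
    });
  }
}
```

Routing every public method through one private write method is what guarantees that no entry leaves without its correlation ID.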

What NOT to Log:

Credentials, API keys, or sensitive data
Large file contents or binary data
User passwords or tokens (log only presence: "token available" or "NO_TOKEN")
Request body contents (sanitized in error handler - only whitelisted fields: documentId, id, status, fileName, fileSize, contentType, correlationId)
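
A minimal sketch of whitelist-based sanitization, assuming a flat request body. The function names are hypothetical; the field list and the two token strings are the ones given above.

```typescript
// Whitelisted fields, as listed in the conventions above.
const LOGGABLE_BODY_FIELDS = new Set<string>([
  'documentId',
  'id',
  'status',
  'fileName',
  'fileSize',
  'contentType',
  'correlationId',
]);

// Hypothetical helper: copies only whitelisted top-level keys and drops the rest.
function sanitizeRequestBody(body: unknown): Record<string, unknown> {
  const sanitized: Record<string, unknown> = {};
  if (typeof body !== 'object' || body === null) {
    return sanitized;
  }
  for (const [key, value] of Object.entries(body)) {
    if (LOGGABLE_BODY_FIELDS.has(key)) {
      sanitized[key] = value;
    }
  }
  return sanitized;
}

// Logs only whether a token is present, never its value.
function describeToken(token: string | undefined): 'token available' | 'NO_TOKEN' {
  return token ? 'token available' : 'NO_TOKEN';
}
```

A whitelist fails safe: a newly added sensitive field stays out of the logs until someone deliberately allows it.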

Console Usage:

Backend: only the Winston logger is used in production code; the no-console rule is off, so this is a convention rather than a lint failure
Frontend: console.log used in development (observed in DocumentUpload, App components)
Special case: logger initialization may use console.warn for setup diagnostics

Comments

When to Comment:

Complex algorithms or business logic: explain "why", not "what" the code does
Non-obvious type conversions or workarounds
Links to related issues, tickets, or documentation
Critical security considerations or performance implications
TODO items for incomplete work (format: // TODO: [description])

JSDoc/TSDoc:

Used for function and class documentation in utility and service files

Function signature example from test-helpers.ts:

```
/**
 * Creates a mock correlation ID for testing
 */
export function createMockCorrelationId(): string
```

Parameter and return types documented via TypeScript typing (preferred over verbose JSDoc)
Service classes include operation summaries: /** Process document using Document AI + Agentic RAG strategy */

Function Design

Size:

Keep functions focused on single responsibility
Long services (300+ lines) separate concerns into helper methods
Controller/middleware functions stay under 50 lines

Parameters:

Max 3-4 required parameters; use object for additional config
Example: processDocument(documentId: string, userId: string, text: string, options?: { strategy?: string })
Use destructuring for config objects: { strategy, maxTokens, temperature }

Return Values:

Async operations return Promise with typed success/error objects
Pattern: Promise<{ success: boolean; data: T; error?: string }>
Avoid throwing in service methods; return error in object
Controllers/middleware can throw for Express error handler
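
As an illustration of a service method that returns its error instead of throwing, and that takes its optional settings as a destructured object: countWords and its options are hypothetical. The result shape follows the pattern above, except that data is made optional here so the failure branch needs no placeholder value (an assumption of this sketch).

```typescript
// Result shape modelled on the pattern above; `data` is optional by assumption.
interface ServiceResult<T> {
  success: boolean;
  data?: T;
  error?: string;
}

// Optional configuration is passed as one object and destructured.
interface WordCountOptions {
  minLength?: number;
}

// Hypothetical service method: reports failure in the result, never throws.
async function countWords(
  text: string,
  { minLength = 1 }: WordCountOptions = {}
): Promise<ServiceResult<number>> {
  try {
    const trimmed = text.trim();
    if (trimmed.length === 0) {
      throw new Error('Document text is empty');
    }
    const words = trimmed.split(/\s+/).filter((word) => word.length >= minLength);
    return { success: true, data: words.length };
  } catch (error) {
    return {
      success: false,
      error: error instanceof Error ? error.message : String(error),
    };
  }
}
```

A controller calling this method would check success and, on failure, throw or forward the error to the Express error handler.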

Type Signatures:

Always specify parameter and return types; do not rely on implicit any, even though noImplicitAny is off in tsconfig.json
Use generics for reusable patterns: Promise<T>, Array<Document>
Union types for multiple possibilities: 'uploading' | 'uploaded' | 'processing' | 'completed' | 'error'
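
One benefit of the union approach is compile-time exhaustiveness checking. In the sketch below the UploadStatus name and the display labels are hypothetical; the five status values are the ones listed above.

```typescript
type UploadStatus = 'uploading' | 'uploaded' | 'processing' | 'completed' | 'error';

// Maps each status to a display label (labels are illustrative).
function statusLabel(status: UploadStatus): string {
  switch (status) {
    case 'uploading':
      return 'Uploading';
    case 'uploaded':
      return 'Uploaded';
    case 'processing':
      return 'Processing';
    case 'completed':
      return 'Completed';
    case 'error':
      return 'Failed';
    default: {
      // Exhaustiveness check: if a value is added to the union without a
      // matching case, this assignment stops compiling.
      const unreachable: never = status;
      return unreachable;
    }
  }
}
```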

Module Design

Exports:

Services exported as singleton instances: export const llmService = new LLMService()
Utility functions exported as named exports: export function validateUUID() { ... }
Type definitions exported from dedicated type files or alongside implementation
Classes exported as default or named based on usage pattern

Barrel Files:

Not consistently used; services import directly from implementation files
Example: import { llmService } from './llmService', not from './services/index'
Consider adding barrel files for cleaner imports once the services directory grows

Service Singletons:

All services instantiated once and exported as singletons
Examples:
- backend/src/services/llmService.ts: export const llmService = new LLMService()
- backend/src/services/fileStorageService.ts: export const fileStorageService = new FileStorageService()
- backend/src/services/vectorDatabaseService.ts: export const vectorDatabaseService = new VectorDatabaseService()
Prevents multiple initialization and enables dependency sharing

Frontend Context Pattern:

React Context for auth: AuthContext exports useAuth() hook
Services pattern: documentService is a shared module-level object that groups the API methods
Apart from documentService, the frontend has no service singletons (class instances are recreated as needed)

Deprecated Patterns (DO NOT USE)

❌ Direct PostgreSQL connections - Use Supabase client instead
❌ Custom JWT authentication - Use Firebase Auth tokens
❌ console.log in production code - Use Winston logger
❌ Type assertions with as for complex types - Use type guards
❌ Manual error handling without correlation IDs - Attach the correlation ID via AppError and StructuredLogger
❌ Redis caching - Not used in current architecture
❌ Jest testing - Use Vitest instead

Convention analysis: 2026-02-24
