cim_summary/.planning/codebase/CONVENTIONS.md
2026-02-24 10:28:22 -05:00

Coding Conventions

Analysis Date: 2026-02-24

Naming Patterns

Files:

  • Backend service files: camelCase.ts (e.g., llmService.ts, unifiedDocumentProcessor.ts, vectorDatabaseService.ts)
  • Backend middleware/controllers: camelCase.ts (e.g., errorHandler.ts, firebaseAuth.ts)
  • Frontend components: PascalCase.tsx (e.g., DocumentUpload.tsx, LoginForm.tsx, ProtectedRoute.tsx)
  • Frontend utility files: camelCase.ts (e.g., cn.ts for class name utilities)
  • Type definition files: camelCase.ts, optionally with a .d.ts suffix (e.g., express.d.ts)
  • Model files: PascalCase.ts in backend/src/models/ (e.g., DocumentModel.ts)
  • Config files: camelCase.ts (e.g., env.ts, firebase.ts, supabase.ts)

Functions:

  • Both backend and frontend use camelCase: processDocument(), validateUUID(), handleUpload()
  • React components are PascalCase: DocumentUpload, ErrorHandler
  • Handler functions use a handle or on prefix: handleVisibilityChange(), onDrop()
  • Async functions use descriptive names: fetchDocuments(), uploadDocument(), processDocument()

Variables:

  • camelCase for all variables: documentId, correlationId, isUploading, uploadedFiles
  • Constants use UPPER_SNAKE_CASE in rare cases: MAX_CONCURRENT_LLM_CALLS, MAX_TOKEN_LIMITS
  • Boolean prefixes: is* (isUploading, isAdmin), has* (hasError), can* (canProcess)

Types:

  • Interfaces use PascalCase: LLMRequest, UploadedFile, DocumentUploadProps, CIMReview
  • Type unions use PascalCase: ErrorCategory, ProcessingStrategy
  • Generic types use single uppercase letter or descriptive name: T, K, V
  • Enum values use UPPER_SNAKE_CASE: ErrorCategory.VALIDATION, ErrorCategory.AUTHENTICATION

Interfaces vs Types:

  • Interfaces for object shapes that represent entities or components: interface Document, interface UploadedFile
  • Types for unions, primitives, and specialized patterns: type ProcessingStrategy = 'document_ai_agentic_rag' | 'simple_full_document'
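
As a minimal sketch of the naming and interface-versus-type conventions above: the type names UploadedFile and ProcessingStrategy come from this document, while the field names, the PROCESSING_STRATEGIES constant, and the isProcessingStrategy helper are hypothetical and only illustrate the style.

```typescript
// Interface for an entity-like object shape (field names are illustrative).
interface UploadedFile {
  id: string;
  fileName: string;
  isUploading: boolean; // boolean prefix: is*
}

// Type alias for a union of string literals.
type ProcessingStrategy = 'document_ai_agentic_rag' | 'simple_full_document';

// UPPER_SNAKE_CASE is reserved for the rare module-level constant.
const PROCESSING_STRATEGIES: readonly ProcessingStrategy[] = [
  'document_ai_agentic_rag',
  'simple_full_document',
];

// Hypothetical type guard: narrows an arbitrary string to the union.
function isProcessingStrategy(value: string): value is ProcessingStrategy {
  return PROCESSING_STRATEGIES.some((strategy) => strategy === value);
}

const uploadedFile: UploadedFile = {
  id: 'file-1',
  fileName: 'cim.pdf',
  isUploading: false,
};
const requestedStrategy: string = 'simple_full_document';

// camelCase variable with a can* boolean prefix.
const canProcess =
  !uploadedFile.isUploading && isProcessingStrategy(requestedStrategy);
```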

Code Style

Formatting:

  • No formal Prettier config detected in repo (formatting varies and is not enforced)
  • 2-space indentation (observed in TypeScript files)
  • Semicolons required at end of statements
  • Single quotes for strings in TypeScript, double quotes in JSX attributes
  • Line length: preferably under 100 characters but not enforced

Linting:

  • Tool: ESLint with TypeScript support
  • Config: .eslintrc.js in backend
  • Key rules:
    • @typescript-eslint/no-unused-vars: error (allows leading underscore for intentionally unused)
    • @typescript-eslint/no-explicit-any: warn (use unknown instead)
    • @typescript-eslint/no-non-null-assertion: warn (use proper type guards)
    • no-console: off in backend (logging used via Winston)
    • no-undef: error (strict undefined checking)
  • Frontend ESLint ignores unused disable directives and has max-warnings: 0

TypeScript Standards:

  • Strict mode not fully enabled (noImplicitAny disabled in tsconfig.json for legacy reasons)
  • Prefer explicit typing over any: use unknown when type is truly unknown
  • Type guards required for safety checks: error instanceof Error ? error.message : String(error)
  • No type assertions with as for complex types; use proper type narrowing
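
The type-guard rules above can be sketched as follows. The helper names (getErrorMessage, isCodedError, describeFailure) and the CodedError shape are assumptions made for illustration and are not taken from the codebase. The in-operator narrowing on an object value relies on TypeScript 4.9 or later.

```typescript
// Hypothetical helper: turn any thrown value into a loggable message.
function getErrorMessage(error: unknown): string {
  return error instanceof Error ? error.message : String(error);
}

// Hypothetical shape for an error that carries a machine-readable code.
interface CodedError {
  code: string;
  message: string;
}

// Type guard: narrows unknown to CodedError without an `as` assertion.
function isCodedError(value: unknown): value is CodedError {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  return (
    'code' in value &&
    typeof value.code === 'string' &&
    'message' in value &&
    typeof value.message === 'string'
  );
}

// Prefers the coded form when available, otherwise falls back to the message.
function describeFailure(error: unknown): string {
  if (isCodedError(error)) {
    return `${error.code}: ${error.message}`;
  }
  return getErrorMessage(error);
}
```

Because the parameter is typed as unknown, the compiler forces every caller path through a check before any property is read, which is the behavior the no-explicit-any warning is meant to encourage.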

Import Organization

Order:

  1. External framework/library imports (express, react, winston)
  2. Google Cloud/Firebase imports (@google-cloud/storage, firebase-admin)
  3. Third-party service imports (axios, zod, joi)
  4. Internal config imports ('../config/env', '../config/firebase')
  5. Internal utility imports ('../utils/logger', '../utils/cn')
  6. Internal model imports ('../models/DocumentModel')
  7. Internal service imports ('../services/llmService')
  8. Internal middleware/helper imports ('../middleware/errorHandler')
  9. Type-only imports at the end: import type { ProcessingStrategy } from '...'

Examples:

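Backend service pattern from optimizedAgenticRAGProcessor.ts (type-only imports come last; the internal imports above them do not strictly follow the listed order):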

import { logger } from '../utils/logger';
import { vectorDatabaseService } from './vectorDatabaseService';
import { VectorDatabaseModel } from '../models/VectorDatabaseModel';
import { llmService } from './llmService';
import { CIMReview } from './llmSchemas';
import { config } from '../config/env';
import type { ParsedFinancials } from './financialTableParser';
import type { StructuredTable } from './documentAiProcessor';

Frontend component pattern from DocumentList.tsx:

import React from 'react';
import {
  FileText,
  Eye,
  Download,
  Trash2,
  Calendar,
  User,
  Clock
} from 'lucide-react';
import { cn } from '../utils/cn';

Path Aliases:

  • No @ alias imports detected; all use relative ../ patterns
  • Monorepo structure: frontend and backend in separate directories with independent module resolution

Error Handling

Patterns:

  1. Structured Error Objects with Categories:

    • Use ErrorCategory enum for classification: VALIDATION, AUTHENTICATION, AUTHORIZATION, NOT_FOUND, EXTERNAL_SERVICE, PROCESSING, DATABASE, SYSTEM
    • Attach AppError interface properties: statusCode, isOperational, code, correlationId, category, retryable, context
    • Example from errorHandler.ts:
      const enhancedError: AppError = {
        category: ErrorCategory.VALIDATION,
        statusCode: 400,
        code: 'INVALID_UUID_FORMAT',
        retryable: false
      };
      
  2. Try-Catch with Structured Logging:

    • Always catch errors with explicit type checking
    • Log with structured data including correlation ID
    • Example pattern:
      try {
        await operation();
      } catch (error) {
        logger.error('Operation failed', {
          error: error instanceof Error ? error.message : String(error),
          stack: error instanceof Error ? error.stack : undefined,
          context: { documentId, userId }
        });
        throw error;
      }
      
  3. HTTP Response Pattern:

    • Success responses: { success: true, data: {...} }
    • Error responses: { success: false, error: { code, message, details, correlationId, timestamp, retryable } }
    • User-friendly messages mapped by error category
    • Include X-Correlation-ID header in responses
  4. Retry Logic:

    • LLM service limits concurrency to 1 call at a time to avoid hitting rate limits
    • 3 retry attempts for LLM API calls with exponential backoff (see llmService.ts lines 236-450)
    • Jobs respect 14-minute timeout limit with graceful status updates
  5. External Service Errors:

    • Firebase Auth errors: extract from error.message and error.name (TokenExpiredError, JsonWebTokenError)
    • Supabase errors: check error.code and error.message, handle UUID validation errors
    • GCS errors: extract from error objects with proper null checks
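
The retry behavior described in item 4 can be sketched roughly as below. This is not the code in llmService.ts: the names withRetry, backoffDelayMs, and RetryOptions are hypothetical, and the base delay is an assumed parameter because this document does not state the actual delays. Only the count of 3 attempts and the use of exponential backoff come from the list above.

```typescript
// Retry settings. The document specifies 3 attempts for LLM calls;
// baseDelayMs and isRetryable are assumptions for this sketch.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  isRetryable: (error: unknown) => boolean;
}

// Delay before the retry that follows failed attempt N (1-based):
// baseDelayMs, then 2x, then 4x, and so on.
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });
}

// Runs the operation until it succeeds, the error is not retryable,
// or maxAttempts is reached. The last error is rethrown to the caller.
async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  let lastError: unknown = new Error('withRetry requires maxAttempts >= 1');
  for (let attempt = 1; attempt <= options.maxAttempts; attempt += 1) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === options.maxAttempts;
      if (isLastAttempt || !options.isRetryable(error)) {
        break;
      }
      await sleep(backoffDelayMs(attempt, options.baseDelayMs));
    }
  }
  throw lastError;
}
```

A caller would typically pass an isRetryable predicate that reads the retryable flag from the structured error, so that validation failures are not retried while transient external-service failures are.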

Logging

Framework: Winston logger from backend/src/utils/logger.ts

Levels:

  • logger.debug(): Detailed diagnostic info (disabled in production)
  • logger.info(): Normal operation information, upload start/completion, processing status
  • logger.warn(): Warning conditions, CORS rejections, non-critical issues
  • logger.error(): Error conditions with full context and stack traces

Structured Logging Pattern:

logger.info('Message', {
  correlationId: correlationId,
  category: 'operation_type',
  operation: 'specific_action',
  documentId: documentId,
  userId: userId,
  metadata: value,
  timestamp: new Date().toISOString()
});

StructuredLogger Class:

  • Use for operations requiring correlation ID tracking
  • Constructor: const logger = new StructuredLogger(correlationId)
  • Specialized methods:
    • uploadStart(), uploadSuccess(), uploadError() - for file operations
    • processingStart(), processingSuccess(), processingError() - for document processing
    • storageOperation() - for file storage operations
    • jobQueueOperation() - for background jobs
    • info(), warn(), error(), debug() - general logging
  • All methods automatically attach correlation ID to metadata
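
To illustrate how a correlation ID can be attached automatically, here is a minimal sketch. It is not the StructuredLogger implementation from the codebase: the class name StructuredLoggerSketch, the LogSink parameter (a stand-in for the Winston logger so the example is self-contained), and the uploadStart arguments are all assumptions.

```typescript
type LogLevel = 'debug' | 'info' | 'warn' | 'error';
type LogMetadata = Record<string, unknown>;

// Stand-in for the Winston logger: receives level, message, and metadata.
type LogSink = (level: LogLevel, message: string, metadata: LogMetadata) => void;

class StructuredLoggerSketch {
  private readonly correlationId: string;
  private readonly sink: LogSink;

  constructor(correlationId: string, sink: LogSink) {
    this.correlationId = correlationId;
    this.sink = sink;
  }

  info(message: string, metadata: LogMetadata = {}): void {
    this.write('info', message, metadata);
  }

  error(message: string, metadata: LogMetadata = {}): void {
    this.write('error', message, metadata);
  }

  // Specialized method: fixes category and operation for upload starts.
  uploadStart(fileName: string, fileSize: number): void {
    this.write('info', 'Upload started', {
      category: 'upload',
      operation: 'upload_start',
      fileName,
      fileSize,
    });
  }

  // Every entry passes through here, so the correlation ID and
  // timestamp are always present in the metadata.
  private write(level: LogLevel, message: string, metadata: LogMetadata): void {
    this.sink(level, message, {
      ...metadata,
      correlationId: this.correlationId,
      timestamp: new Date().toISOString(),
    });
  }
}

// Usage: capture entries in memory instead of writing to a transport.
const capturedEntries: Array<{ level: LogLevel; metadata: LogMetadata }> = [];
const uploadLogger = new StructuredLoggerSketch('corr-123', (level, _message, metadata) => {
  capturedEntries.push({ level, metadata });
});
uploadLogger.uploadStart('cim.pdf', 2048);
```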

What NOT to Log:

  • Credentials, API keys, or sensitive data
  • Large file contents or binary data
  • User passwords or tokens (log only presence: "token available" or "NO_TOKEN")
  • Request body contents (sanitized in error handler - only whitelisted fields: documentId, id, status, fileName, fileSize, contentType, correlationId)
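
A whitelist-based sanitizer along these lines could look like the sketch below. The field list is the one given above; the function names sanitizeRequestBody and describeToken are hypothetical and the real error handler may be structured differently.

```typescript
// Fields that are safe to log, as listed above.
const LOGGABLE_BODY_FIELDS = [
  'documentId',
  'id',
  'status',
  'fileName',
  'fileSize',
  'contentType',
  'correlationId',
] as const;

// Copies only whitelisted fields; anything else is dropped.
function sanitizeRequestBody(body: unknown): Record<string, unknown> {
  const sanitized: Record<string, unknown> = {};
  if (typeof body !== 'object' || body === null) {
    return sanitized;
  }
  const entries = new Map<string, unknown>(Object.entries(body));
  for (const field of LOGGABLE_BODY_FIELDS) {
    if (entries.has(field)) {
      sanitized[field] = entries.get(field);
    }
  }
  return sanitized;
}

// Logs only whether a token is present, never its value.
function describeToken(token: string | undefined): string {
  return token ? 'token available' : 'NO_TOKEN';
}

const sanitizedBody = sanitizeRequestBody({
  documentId: 'doc-1',
  password: 'secret',
  fileSize: 10,
});
```

Copying from a whitelist rather than deleting from a blacklist means a newly added sensitive field stays out of the logs by default.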

Console Usage:

  • Backend: console.log is not used in production code; use the Winston logger (the ESLint no-console rule is off in the backend, so this is a convention rather than a lint error)
  • Frontend: console.log used in development (observed in DocumentUpload, App components)
  • Special case: logger initialization may use console.warn for setup diagnostics

Comments

When to Comment:

  • Complex algorithms or business logic: explain "why", not "what" the code does
  • Non-obvious type conversions or workarounds
  • Links to related issues, tickets, or documentation
  • Critical security considerations or performance implications
  • TODO items for incomplete work (format: // TODO: [description])

JSDoc/TSDoc:

  • Used for function and class documentation in utility and service files
  • Function signature example from test-helpers.ts:
    /**
     * Creates a mock correlation ID for testing
     */
    export function createMockCorrelationId(): string
    
  • Parameter and return types documented via TypeScript typing (preferred over verbose JSDoc)
  • Service classes include operation summaries: /** Process document using Document AI + Agentic RAG strategy */

Function Design

Size:

  • Keep functions focused on single responsibility
  • Long services (300+ lines) separate concerns into helper methods
  • Controller/middleware functions stay under 50 lines

Parameters:

  • Max 3-4 required parameters; use an options object for additional config
  • Example: processDocument(documentId: string, userId: string, text: string, options?: { strategy?: string })
  • Use destructuring for config objects: { strategy, maxTokens, temperature }
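
The parameter guidance above might be applied as in this sketch. The function buildProcessingRequest, the ProcessingRequest shape, and the default values are assumptions chosen for illustration; only the parameter layout and the option names come from this document.

```typescript
// Optional settings travel in a single object.
interface ProcessDocumentOptions {
  strategy?: string;
  maxTokens?: number;
  temperature?: number;
}

// Hypothetical result shape used only by this example.
interface ProcessingRequest {
  documentId: string;
  userId: string;
  textLength: number;
  strategy: string;
  maxTokens: number;
  temperature: number;
}

// Three required parameters, then one options object.
function buildProcessingRequest(
  documentId: string,
  userId: string,
  text: string,
  options: ProcessDocumentOptions = {}
): ProcessingRequest {
  // Destructure with defaults (default values here are assumed).
  const {
    strategy = 'simple_full_document',
    maxTokens = 4000,
    temperature = 0,
  } = options;
  return {
    documentId,
    userId,
    textLength: text.length,
    strategy,
    maxTokens,
    temperature,
  };
}
```

New settings can then be added to the options interface without changing any existing call site.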

Return Values:

  • Async operations return Promise with typed success/error objects
  • Pattern: Promise<{ success: boolean; data: T; error?: string }>
  • Avoid throwing in service methods; return error in object
  • Controllers/middleware can throw for Express error handler
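
One way to express the result-object pattern is sketched below. Note that it uses a discriminated union, which is a stricter variant of the shape shown above (data is only present on success, error only on failure) and is an assumption rather than the codebase's actual type. The functions parsePageCount and requirePageCount are hypothetical.

```typescript
// Stricter variant of { success; data; error? }: each branch carries
// only the fields that make sense for it.
type ServiceResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Service-style method: reports failure in the result instead of throwing.
async function parsePageCount(rawValue: string): Promise<ServiceResult<number>> {
  const pageCount = Number.parseInt(rawValue, 10);
  if (Number.isNaN(pageCount) || pageCount < 1) {
    return { success: false, error: `Invalid page count: ${rawValue}` };
  }
  return { success: true, data: pageCount };
}

// Controller-style caller: free to throw so that an Express error
// handler further up can turn the failure into a response.
async function requirePageCount(rawValue: string): Promise<number> {
  const result = await parsePageCount(rawValue);
  if (result.success === true) {
    return result.data;
  }
  throw new Error(result.error);
}
```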

Type Signatures:

  • Always specify parameter and return types; avoid implicit any even though noImplicitAny is disabled in tsconfig.json
  • Use generics for reusable patterns: Promise<T>, Array<Document>
  • Union types for multiple possibilities: 'uploading' | 'uploaded' | 'processing' | 'completed' | 'error'

Module Design

Exports:

  • Services exported as singleton instances: export const llmService = new LLMService()
  • Utility functions exported as named exports: export function validateUUID() { ... }
  • Type definitions exported from dedicated type files or alongside implementation
  • Classes exported as default or named based on usage pattern

Barrel Files:

  • Not consistently used; services import directly from implementation files
  • Example: import { llmService } from './llmService' not from ./services/index
  • Consider adding for cleaner imports when services directory grows

Service Singletons:

  • All services instantiated once and exported as singletons
  • Examples:
    • backend/src/services/llmService.ts: export const llmService = new LLMService()
    • backend/src/services/fileStorageService.ts: export const fileStorageService = new FileStorageService()
    • backend/src/services/vectorDatabaseService.ts: export const vectorDatabaseService = new VectorDatabaseService()
  • Prevents multiple initialization and enables dependency sharing
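
The singleton export pattern can be sketched as follows. TokenUsageService and the two recording functions are hypothetical and exist only to show that every importer shares one instance and therefore one piece of state.

```typescript
// Hypothetical service class; not part of the codebase.
class TokenUsageService {
  private totalTokens = 0;

  // Adds to the running total and returns the new total.
  record(tokens: number): number {
    this.totalTokens += tokens;
    return this.totalTokens;
  }
}

// Instantiated once at module load and exported as the shared instance.
export const tokenUsageService = new TokenUsageService();

// Two independent callers that would normally live in other modules
// and import the instance above.
function recordPromptTokens(tokens: number): number {
  return tokenUsageService.record(tokens);
}

function recordCompletionTokens(tokens: number): number {
  return tokenUsageService.record(tokens);
}
```

Since module evaluation happens once per process, both callers see the same running total. The trade-off is that tests have to reset or replace the shared instance explicitly.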

Frontend Context Pattern:

  • React Context for auth: AuthContext exports useAuth() hook
  • Services pattern: documentService contains API methods, used as singleton
  • No class-based service singletons in frontend (class instances recreated as needed); documentService is the shared exception

Deprecated Patterns (DO NOT USE)

  • Direct PostgreSQL connections - Use Supabase client instead
  • Custom JWT authentication - Use Firebase Auth tokens
  • console.log in production code - Use Winston logger
  • Type assertions with as for complex types - Use type guards
  • Manual error handling without correlation IDs - Use structured errors and logging that carry a correlation ID
  • Redis caching - Not used in current architecture
  • Jest testing - Use Vitest instead

Convention analysis: 2026-02-24