Pre-cleanup commit: Current state before service layer consolidation
DEPENDENCY_ANALYSIS_REPORT.md
@@ -0,0 +1,325 @@
# Dependency Analysis Report - CIM Document Processor

## Executive Summary

This report analyzes the dependencies in both backend and frontend packages to identify:
- Unused dependencies that can be removed
- Outdated packages that should be updated
- Consolidation opportunities
- Dependencies that are actually being used vs. placeholder implementations

## Backend Dependencies Analysis

### Core Dependencies (Actively Used)

#### ✅ **Essential Dependencies**
- `express` - Main web framework
- `cors` - CORS middleware
- `helmet` - Security middleware
- `morgan` - HTTP request logging
- `express-rate-limit` - Rate limiting
- `dotenv` - Environment variable management
- `winston` - Logging framework
- `@supabase/supabase-js` - Database client
- `@google-cloud/storage` - Google Cloud Storage
- `@google-cloud/documentai` - Document AI processing
- `@anthropic-ai/sdk` - Claude AI integration
- `openai` - OpenAI integration
- `puppeteer` - PDF generation
- `uuid` - UUID generation
- `axios` - HTTP client

#### ✅ **Conditionally Used Dependencies**
- `bcryptjs` - Used in auth.ts and seed.ts (legacy auth system)
- `jsonwebtoken` - Used in auth.ts (legacy JWT system)
- `joi` - Used for environment validation and middleware validation
- `zod` - Used in llmSchemas.ts and llmService.ts for schema validation
- `multer` - Used in upload middleware (legacy multipart upload)
- `pdf-parse` - Used in documentAiGenkitProcessor.ts (legacy processor)

#### ⚠️ **Potentially Unused Dependencies**
- `redis` - Only imported in sessionService.ts but may not be actively used
- `pg` - PostgreSQL client (may be redundant with Supabase)

### Development Dependencies (Actively Used)

#### ✅ **Essential Dev Dependencies**
- `typescript` - TypeScript compiler
- `ts-node-dev` - Development server
- `jest` - Testing framework
- `supertest` - API testing
- `@types/*` - TypeScript type definitions
- `eslint` - Code linting
- `@typescript-eslint/*` - TypeScript ESLint rules

### Unused Dependencies Analysis

#### ❌ **Confirmed Unused**
None identified - all backend dependencies appear to be imported somewhere in the codebase.

#### ⚠️ **Potentially Redundant**
1. **Validation Libraries**: Both `joi` and `zod` are used for validation
   - `joi`: Environment validation, middleware validation
   - `zod`: LLM schemas, service validation
   - **Recommendation**: Consider consolidating to just `zod` for consistency

2. **Database Clients**: Both `pg` and `@supabase/supabase-js`
   - `pg`: Direct PostgreSQL client
   - `@supabase/supabase-js`: Supabase client (reaches PostgreSQL through Supabase's HTTP API)
   - **Recommendation**: Remove `pg` if all database access goes through the Supabase client

3. **Authentication**: Both `bcryptjs`/`jsonwebtoken` and Firebase Auth
   - Legacy JWT system vs. Firebase Authentication
   - **Recommendation**: Remove legacy auth dependencies if fully migrated to Firebase

## Frontend Dependencies Analysis

### Core Dependencies (Actively Used)

#### ✅ **Essential Dependencies**
- `react` - React framework
- `react-dom` - React DOM rendering
- `react-router-dom` - Client-side routing
- `axios` - HTTP client for API calls
- `firebase` - Firebase Authentication
- `lucide-react` - Icon library (used in 6 components)
- `react-dropzone` - File upload component

#### ❌ **Unused Dependencies**
- `clsx` - Not imported anywhere
- `tailwind-merge` - Not imported anywhere

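A "not imported anywhere" finding can be re-checked with a small import scan. The sketch below is illustrative only and is not the tooling used for this report: `findUnusedDependencies` is a hypothetical helper, and file contents are passed in as strings so the example stays self-contained. It only sees `import`/`require` specifiers, so packages referenced solely from config files or CLI scripts would be reported as unused and need a manual check.

```typescript
// Hypothetical helper: report which declared dependencies are never imported.
// `sources` maps file paths to file contents (reading from disk is omitted).
function findUnusedDependencies(
  dependencies: string[],
  sources: Record<string, string>,
): string[] {
  const imported = new Set<string>();

  for (const path of Object.keys(sources)) {
    // Matches: import x from "pkg", import "pkg", require("pkg"), import("pkg")
    const specifierPattern =
      /(?:from\s+|import\s+|require\(\s*|import\(\s*)["']([^"']+)["']/g;
    let match: RegExpExecArray | null;
    while ((match = specifierPattern.exec(sources[path])) !== null) {
      const specifier = match[1];
      // Reduce "pkg/sub/path" to "pkg" and "@scope/pkg/sub" to "@scope/pkg".
      const parts = specifier.split("/");
      const name =
        specifier.charAt(0) === "@" ? parts.slice(0, 2).join("/") : parts[0];
      imported.add(name);
    }
  }

  // Keep only the dependencies that never appeared as an import specifier.
  return dependencies.filter((dep) => !imported.has(dep));
}
```

Calling it with the frontend dependency names and the contents of the frontend source files should return `clsx` and `tailwind-merge` if the finding above still holds.
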
### Development Dependencies (Actively Used)

#### ✅ **Essential Dev Dependencies**
- `typescript` - TypeScript compiler
- `vite` - Build tool and dev server
- `@vitejs/plugin-react` - React plugin for Vite
- `tailwindcss` - CSS framework
- `postcss` - CSS processing
- `autoprefixer` - CSS vendor prefixing
- `eslint` - Code linting
- `@typescript-eslint/*` - TypeScript ESLint rules
- `vitest` - Testing framework
- `@testing-library/*` - React testing utilities

## Processing Strategy Analysis

### Current Active Strategy
Based on the code analysis, the current processing strategy is:
- **Primary**: `optimized_agentic_rag` (most actively used)
- **Fallback**: `document_ai_genkit` (legacy implementation)

### Unused Processing Strategies
The following strategies are implemented but not actively used:
1. `chunking` - Legacy chunking strategy
2. `rag` - Basic RAG strategy
3. `agentic_rag` - Basic agentic RAG (superseded by optimized version)

### Services Analysis

#### ✅ **Actively Used Services**
- `unifiedDocumentProcessor` - Main orchestrator
- `optimizedAgenticRAGProcessor` - Core AI processing
- `llmService` - LLM interactions
- `pdfGenerationService` - PDF generation
- `fileStorageService` - GCS operations
- `uploadMonitoringService` - Real-time tracking
- `sessionService` - Session management
- `jobQueueService` - Background processing

#### ⚠️ **Legacy Services (Can be removed)**
- `documentProcessingService` - Legacy chunking service
- `documentAiGenkitProcessor` - Legacy Document AI processor
- `ragDocumentProcessor` - Basic RAG processor

## Outdated Packages Analysis

### Backend Outdated Packages
- `@types/express`: 4.17.23 → 5.0.3 (major version update)
- `@types/jest`: 29.5.14 → 30.0.0 (major version update)
- `@types/multer`: 1.4.13 → 2.0.0 (major version update)
- `@types/node`: 20.19.9 → 24.1.0 (major version update)
- `@types/pg`: 8.15.4 → 8.15.5 (patch update)
- `@types/supertest`: 2.0.16 → 6.0.3 (major version update)
- `@typescript-eslint/*`: 6.21.0 → 8.38.0 (major version update)
- `bcryptjs`: 2.4.3 → 3.0.2 (major version update)
- `dotenv`: 16.6.1 → 17.2.1 (major version update)
- `eslint`: 8.57.1 → 9.32.0 (major version update)
- `express`: 4.21.2 → 5.1.0 (major version update)
- `express-rate-limit`: 7.5.1 → 8.0.1 (major version update)
- `helmet`: 7.2.0 → 8.1.0 (major version update)
- `jest`: 29.7.0 → 30.0.5 (major version update)
- `multer`: 1.4.5-lts.2 → 2.0.2 (major version update)
- `openai`: 5.10.2 → 5.11.0 (minor update)
- `puppeteer`: 21.11.0 → 24.15.0 (major version update)
- `redis`: 4.7.1 → 5.7.0 (major version update)
- `supertest`: 6.3.4 → 7.1.4 (major version update)
- `typescript`: 5.8.3 → 5.9.2 (minor update)
- `zod`: 3.25.76 → 4.0.14 (major version update)

### Frontend Outdated Packages
- `@testing-library/jest-dom`: 6.6.3 → 6.6.4 (patch update)
- `@testing-library/react`: 13.4.0 → 16.3.0 (major version update)
- `@types/react`: 18.3.23 → 19.1.9 (major version update)
- `@types/react-dom`: 18.3.7 → 19.1.7 (major version update)
- `@typescript-eslint/*`: 6.21.0 → 8.38.0 (major version update)
- `eslint`: 8.57.1 → 9.32.0 (major version update)
- `eslint-plugin-react-hooks`: 4.6.2 → 5.2.0 (major version update)
- `lucide-react`: 0.294.0 → 0.536.0 (0.x minor update; may still include breaking changes)
- `react`: 18.3.1 → 19.1.1 (major version update)
- `react-dom`: 18.3.1 → 19.1.1 (major version update)
- `react-router-dom`: 6.30.1 → 7.7.1 (major version update)
- `tailwind-merge`: 2.6.0 → 3.3.1 (major version update)
- `tailwindcss`: 3.4.17 → 4.1.11 (major version update)
- `typescript`: 5.8.3 → 5.9.2 (minor update)
- `vite`: 4.5.14 → 7.0.6 (major version update)
- `vitest`: 0.34.6 → 3.2.4 (major version update)

### Update Strategy
**⚠️ Warning**: Many packages have major version updates that may include breaking changes. Update strategy:

1. **Immediate Updates** (Low Risk):
   - `@types/pg`: 8.15.4 → 8.15.5 (patch update)
   - `openai`: 5.10.2 → 5.11.0 (minor update)
   - `typescript`: 5.8.3 → 5.9.2 (minor update)
   - `@testing-library/jest-dom`: 6.6.3 → 6.6.4 (patch update)

2. **Major Version Updates** (Require Testing):
   - React ecosystem updates (React 18 → 19)
   - Express updates (Express 4 → 5)
   - Testing framework updates (Jest 29 → 30, Vitest 0.34 → 3.2)
   - Build tool updates (Vite 4 → 7)

3. **Recommendation**: Update major versions after dependency cleanup to minimize risk

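The split between low-risk and test-required updates follows from comparing version components. A minimal sketch of that classification is shown below; `parseVersion`, `classifyUpdate`, and `needsTesting` are hypothetical names introduced for illustration, and the parser deliberately ignores pre-release suffixes such as `-lts.2`. Treating 0.x minor bumps as risky is an assumption based on semver's rule that anything may change before 1.0.

```typescript
type UpdateKind = "major" | "minor" | "patch" | "none";

// Parse "MAJOR.MINOR.PATCH", ignoring any pre-release suffix such as "-lts.2".
function parseVersion(version: string): [number, number, number] {
  const [core] = version.split("-");
  const [major = 0, minor = 0, patch = 0] = core.split(".").map(Number);
  return [major, minor, patch];
}

// Name the most significant version component that changed.
function classifyUpdate(current: string, latest: string): UpdateKind {
  const [curMajor, curMinor, curPatch] = parseVersion(current);
  const [newMajor, newMinor, newPatch] = parseVersion(latest);
  if (newMajor !== curMajor) return "major";
  if (newMinor !== curMinor) return "minor";
  if (newPatch !== curPatch) return "patch";
  return "none";
}

// Major bumps need testing; so do minor bumps of 0.x packages,
// because semver allows breaking changes before 1.0.
function needsTesting(current: string, latest: string): boolean {
  const kind = classifyUpdate(current, latest);
  const [curMajor] = parseVersion(current);
  return kind === "major" || (curMajor === 0 && kind === "minor");
}

console.log(classifyUpdate("8.15.4", "8.15.5")); // "patch"
console.log(classifyUpdate("4.21.2", "5.1.0")); // "major"
console.log(needsTesting("0.294.0", "0.536.0")); // true
```

Under this rule `lucide-react` lands in the test-required group even though its major version is unchanged.
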
## Recommendations

### Phase 1: Immediate Cleanup (Low Risk)

#### Backend
1. **Consolidate validation libraries**:
   - Migrate from `joi` to `zod` for consistency
   - Remove `joi` dependency

2. **Remove legacy auth dependencies** (if Firebase auth is fully implemented):
   ```bash
   npm uninstall bcryptjs jsonwebtoken
   npm uninstall @types/bcryptjs @types/jsonwebtoken
   ```

#### Frontend
1. **Remove unused dependencies**:
   ```bash
   npm uninstall clsx tailwind-merge
   ```

### Phase 2: Service Consolidation (Medium Risk)

1. **Remove legacy processing services**:
   - `documentProcessingService.ts`
   - `documentAiGenkitProcessor.ts`
   - `ragDocumentProcessor.ts`

2. **Simplify unifiedDocumentProcessor**:
   - Remove unused strategy methods
   - Keep only `optimized_agentic_rag` strategy

3. **Remove unused database client**:
   - Remove `pg` if only using Supabase

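One possible shape for the simplified processor is a single-entry strategy table that rejects the removed strategy names explicitly, so stale callers fail loudly rather than silently falling through. This is a sketch under assumptions: the actual `unifiedDocumentProcessor` interface is not shown in this report, so `ProcessingResult`, `StrategyHandler`, and `processDocument` are hypothetical, and the handler is synchronous placeholder logic rather than real AI processing.

```typescript
// Hypothetical types; the real processor's result shape is not documented here.
type ProcessingResult = { strategy: string; summary: string };
type StrategyHandler = (documentText: string) => ProcessingResult;

// Strategy names taken from this report that Phase 2 proposes to drop.
const REMOVED_STRATEGIES = ["chunking", "rag", "agentic_rag", "document_ai_genkit"];

// Only the strategy that Phase 2 keeps is registered.
const strategies: Record<string, StrategyHandler> = {
  optimized_agentic_rag: (documentText) => ({
    strategy: "optimized_agentic_rag",
    summary: `processed ${documentText.length} characters`, // placeholder work
  }),
};

function processDocument(
  documentText: string,
  strategy: string = "optimized_agentic_rag",
): ProcessingResult {
  // Fail loudly for callers that still request a removed strategy.
  if (REMOVED_STRATEGIES.indexOf(strategy) !== -1) {
    throw new Error(`Strategy "${strategy}" was removed; use "optimized_agentic_rag"`);
  }
  const handler = strategies[strategy];
  if (!handler) {
    throw new Error(`Unknown strategy "${strategy}"`);
  }
  return handler(documentText);
}
```

Note that `document_ai_genkit` is currently the fallback strategy, so a design like this would also need a decision on what replaces that fallback.
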
### Phase 3: Configuration Cleanup (Low Risk)

1. **Remove unused environment variables**:
   - Legacy auth configuration
   - Unused processing strategy configs
   - Unused LLM configurations

2. **Update configuration validation**:
   - Remove validation for unused configs
   - Simplify environment schema

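Candidate unused environment variables can be shortlisted by comparing the keys defined in a dotenv-style file against the identifiers that appear in the source. The following is a rough sketch, not part of the original analysis: `parseEnvKeys` and `findUnreferencedEnvKeys` are hypothetical helpers, and the variable names in the usage note are made up for illustration. A plain text match can miss keys that are built dynamically, so results should be reviewed by hand.

```typescript
// Return the keys defined in dotenv-style text (KEY=value),
// skipping blank lines and lines that start with a comment marker.
function parseEnvKeys(envText: string): string[] {
  const keys: string[] = [];
  for (const line of envText.split("\n")) {
    const match = /^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    if (match) {
      keys.push(match[1]);
    }
  }
  return keys;
}

// Return the keys that never appear as a whole word in any source text.
function findUnreferencedEnvKeys(envText: string, sourceTexts: string[]): string[] {
  const allSource = sourceTexts.join("\n");
  return parseEnvKeys(envText).filter((key) => {
    // Word boundaries keep PORT from matching inside SUPPORT_EMAIL.
    const pattern = new RegExp(`\\b${key}\\b`);
    return !pattern.test(allSource);
  });
}
```

For example, given an env file defining `JWT_SECRET`, `PORT`, and `GCS_BUCKET` and source code that only reads `process.env.PORT` and `process.env.GCS_BUCKET`, the function returns `["JWT_SECRET"]`.
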
### Phase 4: Route Cleanup (Medium Risk)

1. **Remove legacy upload endpoints**:
   - Keep only `/upload-url` and `/confirm-upload`
   - Remove multipart upload endpoints

2. **Remove unused analytics endpoints**:
   - Keep only actively used monitoring endpoints

## Impact Assessment

### Risk Levels
- **Low Risk**: Removing unused dependencies, applying minor/patch updates
- **Medium Risk**: Removing legacy services, consolidating routes
- **High Risk**: Changing core processing logic, applying major version updates

### Testing Requirements
- Unit tests for all active services
- Integration tests for upload flow
- End-to-end tests for document processing
- Performance testing for optimized agentic RAG

### Rollback Plan
- Keep backup of removed files for 1-2 weeks
- Maintain feature flags for major changes
- Document all changes for easy rollback

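A feature flag for a major change can be as small as one environment-driven switch that selects between the new and the legacy code path. The sketch below assumes a flag named `ENABLE_LEGACY_MULTIPART_UPLOAD`; that name, `readFlag`, and `selectUploadFlow` are all hypothetical and do not come from the codebase. The environment is passed in as a plain record so the logic can be exercised without touching `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Interpret common truthy spellings; fall back to the default when unset.
function readFlag(env: Env, name: string, defaultValue: boolean): boolean {
  const raw = env[name];
  if (raw === undefined) {
    return defaultValue;
  }
  return ["1", "true", "yes", "on"].indexOf(raw.trim().toLowerCase()) !== -1;
}

// Choose the upload flow; the legacy path stays reachable during the rollback window.
function selectUploadFlow(env: Env): "signed-url" | "legacy-multipart" {
  return readFlag(env, "ENABLE_LEGACY_MULTIPART_UPLOAD", false)
    ? "legacy-multipart"
    : "signed-url";
}

console.log(selectUploadFlow({})); // "signed-url"
console.log(selectUploadFlow({ ENABLE_LEGACY_MULTIPART_UPLOAD: "true" })); // "legacy-multipart"
```

Defaulting the flag to off means the cleaned-up path is used unless someone deliberately opts back into the legacy one, and the flag can be deleted once the backup window has passed.
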
## Next Steps

1. **Start with Phase 1** (unused dependencies)
2. **Test thoroughly** after each phase
3. **Document changes** for team reference
4. **Update deployment scripts** if needed
5. **Monitor performance** after cleanup

## Estimated Savings

### Bundle Size Reduction
- **Frontend**: ~50KB (removing unused dependencies)
- **Backend**: ~200KB (removing legacy services and dependencies)

### Maintenance Reduction
- **Fewer dependencies** to maintain and update
- **Simplified codebase** with fewer moving parts
- **Reduced security vulnerabilities** from unused packages

### Performance Improvement
- **Faster builds** with fewer dependencies
- **Reduced memory usage** from removed services
- **Simplified deployment** with fewer configuration options

## Summary

### Key Findings
1. **Unused Dependencies**: 2 frontend dependencies (`clsx`, `tailwind-merge`) are completely unused
2. **Legacy Services**: 3 processing services can be removed (`documentProcessingService`, `documentAiGenkitProcessor`, `ragDocumentProcessor`)
3. **Redundant Dependencies**: Both `joi` and `zod` for validation, both `pg` and Supabase for database
4. **Outdated Packages**: 21 backend and 16 frontend packages have updates available
5. **Major Version Updates**: Many packages require major version updates with potential breaking changes

### Immediate Actions (Step 2 Complete)
1. ✅ **Dependency Analysis Complete** - All dependencies mapped and usage identified
2. ✅ **Outdated Packages Identified** - Version updates documented with risk assessment
3. ✅ **Cleanup Strategy Defined** - Phased approach with risk levels assigned
4. ✅ **Impact Assessment Complete** - Bundle size and maintenance savings estimated

### Next Steps (Step 3 - Service Layer Consolidation)
1. Remove unused frontend dependencies (`clsx`, `tailwind-merge`)
2. Remove legacy processing services
3. Consolidate validation libraries (migrate from `joi` to `zod`)
4. Remove redundant database client (`pg` if only using Supabase)
5. Update low-risk package versions

### Risk Assessment
- **Low Risk**: Removing unused dependencies, updating minor/patch versions
- **Medium Risk**: Removing legacy services, consolidating libraries
- **High Risk**: Major version updates, core processing logic changes

This dependency analysis provides a clear roadmap for cleaning up the codebase while maintaining functionality and minimizing risk.