
Jon 5a3c961bfc feat: Complete implementation of Tasks 1-5 - CIM Document Processor

Backend Infrastructure:
- Complete Express server setup with security middleware (helmet, CORS, rate limiting)
- Comprehensive error handling and logging with Winston
- Authentication system with JWT tokens and session management
- Database models and migrations for Users, Documents, Feedback, and Processing Jobs
- API routes structure for authentication and document management
- Integration tests for all server components (86 tests passing)

Frontend Infrastructure:
- React application with TypeScript and Vite
- Authentication UI with login form, protected routes, and logout functionality
- Authentication context with proper async state management
- Component tests with proper async handling (25 tests passing)
- Tailwind CSS styling and responsive design

Key Features:
- User registration, login, and authentication
- Protected routes with role-based access control
- Comprehensive error handling and user feedback
- Database schema with proper relationships
- Security middleware and validation
- Production-ready build configuration

Test Coverage: 111/111 tests passing
Tasks Completed: 1-5 (Project setup, Database, Auth system, Frontend UI, Backend infrastructure)

Ready for Task 6: File upload backend infrastructure

2025-07-27 13:29:26 -04:00

Database Setup and Management

This document describes the database setup, migrations, and management for the CIM Document Processor backend.

Database Schema

The application uses PostgreSQL with the following tables:

Users Table

id (UUID, Primary Key)
email (VARCHAR, Unique)
name (VARCHAR)
password_hash (VARCHAR)
role (VARCHAR, 'user' or 'admin')
created_at (TIMESTAMP)
updated_at (TIMESTAMP)
last_login (TIMESTAMP, nullable)
is_active (BOOLEAN)
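
As an illustration of how these columns might map onto application code, here is a minimal TypeScript sketch of a row type for the users table. The names UserRow, UserRole, and isUserRole are hypothetical and do not come from the project; the mapping of TIMESTAMP to Date and UUID to string is an assumption about how the database driver returns values.

```typescript
// Hypothetical row type mirroring the users table columns listed above.
type UserRole = "user" | "admin";

interface UserRow {
  id: string; // UUID, primary key
  email: string; // unique
  name: string;
  password_hash: string;
  role: UserRole;
  created_at: Date;
  updated_at: Date;
  last_login: Date | null; // nullable column
  is_active: boolean;
}

// Narrow an arbitrary string to one of the two roles the schema allows.
function isUserRole(value: string): value is UserRole {
  return value === "user" || value === "admin";
}
```

A guard like isUserRole can be useful at the boundary where untyped request data or raw query results enter typed code.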

Documents Table

id (UUID, Primary Key)
user_id (UUID, Foreign Key to users.id)
original_file_name (VARCHAR)
file_path (VARCHAR)
file_size (BIGINT)
uploaded_at (TIMESTAMP)
status (VARCHAR, processing status)
extracted_text (TEXT, nullable)
generated_summary (TEXT, nullable)
summary_markdown_path (VARCHAR, nullable)
summary_pdf_path (VARCHAR, nullable)
processing_started_at (TIMESTAMP, nullable)
processing_completed_at (TIMESTAMP, nullable)
error_message (TEXT, nullable)
created_at (TIMESTAMP)
updated_at (TIMESTAMP)

Document Feedback Table

id (UUID, Primary Key)
document_id (UUID, Foreign Key to documents.id)
user_id (UUID, Foreign Key to users.id)
feedback (TEXT)
regeneration_instructions (TEXT, nullable)
created_at (TIMESTAMP)

Document Versions Table

id (UUID, Primary Key)
document_id (UUID, Foreign Key to documents.id)
version_number (INTEGER)
summary_markdown (TEXT)
summary_pdf_path (VARCHAR)
feedback (TEXT, nullable)
created_at (TIMESTAMP)

Processing Jobs Table

id (UUID, Primary Key)
document_id (UUID, Foreign Key to documents.id)
type (VARCHAR, job type)
status (VARCHAR, job status)
progress (INTEGER, 0-100)
error_message (TEXT, nullable)
created_at (TIMESTAMP)
started_at (TIMESTAMP, nullable)
completed_at (TIMESTAMP, nullable)

Setup Instructions

1. Install Dependencies

npm install

2. Configure Environment Variables

Copy the example environment file and configure your database settings:

cp .env.example .env

Update the following variables in .env:

DATABASE_URL - PostgreSQL connection string
DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD - Individual connection settings (host, port, database name, user, password)
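
The sketch below shows one way these variables could be turned into a connection configuration. It is not taken from the project: resolveDbConfig, DbConfig, and Env are hypothetical names, and the rule that DATABASE_URL takes precedence over the individual DB_* variables is an assumption. The fallback port 5432 is PostgreSQL's default.

```typescript
// Hypothetical connection settings derived from the .env variables above.
interface DbConfig {
  host: string;
  port: number;
  database: string;
  user: string;
  password: string;
}

type Env = Record<string, string | undefined>;

// Assumption: DATABASE_URL wins when present; otherwise fall back to DB_* variables.
function resolveDbConfig(env: Env): DbConfig {
  if (env.DATABASE_URL) {
    const url = new URL(env.DATABASE_URL);
    return {
      host: url.hostname,
      port: url.port ? Number(url.port) : 5432,
      database: url.pathname.replace(/^\//, ""),
      user: decodeURIComponent(url.username),
      password: decodeURIComponent(url.password),
    };
  }

  // DB_PORT is optional here; the other settings must be present.
  const required = ["DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing database settings: ${missing.join(", ")}`);
  }

  return {
    host: env.DB_HOST as string,
    port: env.DB_PORT ? Number(env.DB_PORT) : 5432,
    database: env.DB_NAME as string,
    user: env.DB_USER as string,
    password: env.DB_PASSWORD as string,
  };
}
```

Passing the environment in as an argument, rather than reading it globally, keeps the function easy to exercise in tests.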

3. Create Database

Create a PostgreSQL database:

CREATE DATABASE cim_processor;

4. Run Migrations and Seed Data

npm run db:setup

This command will:

Run all database migrations to create tables
Seed the database with initial test data

Available Scripts

Database Management

npm run db:migrate - Run database migrations
npm run db:seed - Seed database with test data
npm run db:setup - Run migrations and seed data

Development

npm run dev - Start development server
npm run build - Build for production
npm run test - Run tests
npm run lint - Run linting

Database Models

The application includes the following models:

UserModel

create(userData) - Create new user
findById(id) - Find user by ID
findByEmail(email) - Find user by email
findAll(limit, offset) - Get all users (admin)
update(id, updates) - Update user
delete(id) - Soft delete user
emailExists(email) - Check if email exists
count() - Count total users
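
To make the soft-delete behaviour concrete, here is a small in-memory sketch. It assumes that "soft delete" means setting is_active to false while keeping the row, which the schema suggests but this document does not state outright. InMemoryUserStore and UserRecord are hypothetical stand-ins, not the project's UserModel, and the methods are synchronous for brevity.

```typescript
// Hypothetical, trimmed-down user record; the real table has more columns.
interface UserRecord {
  id: string;
  email: string;
  is_active: boolean;
}

class InMemoryUserStore {
  private readonly users = new Map<string, UserRecord>();

  create(user: UserRecord): UserRecord {
    if (this.emailExists(user.email)) {
      throw new Error(`Email already registered: ${user.email}`);
    }
    const stored = { ...user };
    this.users.set(stored.id, stored);
    return { ...stored };
  }

  findById(id: string): UserRecord | null {
    const user = this.users.get(id);
    return user ? { ...user } : null;
  }

  // Soft delete (assumed semantics): keep the row, flip is_active to false.
  delete(id: string): boolean {
    const user = this.users.get(id);
    if (!user) return false;
    user.is_active = false;
    return true;
  }

  // Deactivated users still hold their email, so the address stays reserved.
  emailExists(email: string): boolean {
    let found = false;
    this.users.forEach((user) => {
      if (user.email === email) found = true;
    });
    return found;
  }
}
```

Under these assumptions, a deleted user can still be looked up by ID and still blocks reuse of the same email address.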

DocumentModel

create(documentData) - Create new document
findById(id) - Find document by ID
findByUserId(userId, limit, offset) - Get user's documents
findAll(limit, offset) - Get all documents (admin)
updateStatus(id, status) - Update document status
updateExtractedText(id, text) - Update extracted text
updateGeneratedSummary(id, summary, markdownPath, pdfPath) - Update summary
delete(id) - Delete document
countByUser(userId) - Count user's documents
findByStatus(status, limit, offset) - Get documents by status
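
Several of these methods take limit and offset for pagination. The following sketch shows how such a call might translate into a parameterized query. buildFindByUserIdQuery and SqlQuery are hypothetical, the default values (50 and 0) and the newest-first ordering are assumptions, and the actual SQL used by DocumentModel may differ. The { text, values } shape matches what node-postgres accepts as a query config, though this document does not say which driver the project uses.

```typescript
// Hypothetical query description: SQL text plus positional parameter values.
interface SqlQuery {
  text: string;
  values: unknown[];
}

// Build a paginated lookup of one user's documents, newest upload first.
function buildFindByUserIdQuery(userId: string, limit = 50, offset = 0): SqlQuery {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be a non-negative integer");
  }
  return {
    // Placeholders keep user input out of the SQL string itself.
    text:
      "SELECT * FROM documents WHERE user_id = $1 " +
      "ORDER BY uploaded_at DESC LIMIT $2 OFFSET $3",
    values: [userId, limit, offset],
  };
}
```

With a page size of 20, the third page would be requested as buildFindByUserIdQuery(userId, 20, 40).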

DocumentFeedbackModel

create(feedbackData) - Create new feedback
findByDocumentId(documentId) - Get document feedback
findByUserId(userId, limit, offset) - Get user's feedback
update(id, updates) - Update feedback
delete(id) - Delete feedback

DocumentVersionModel

create(versionData) - Create new version
findByDocumentId(documentId) - Get document versions
findLatestByDocumentId(documentId) - Get latest version
getNextVersionNumber(documentId) - Get next version number
update(id, updates) - Update version
delete(id) - Delete version
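
The logic behind getNextVersionNumber can be sketched as "highest existing version plus one". The helper below is a hypothetical illustration operating on a plain array; the real method presumably queries the document_versions table, and starting the numbering at 1 is an assumption.

```typescript
// Given the version numbers already stored for a document, return the next one.
// Assumption: numbering starts at 1 when no versions exist yet.
function nextVersionNumber(existing: number[]): number {
  if (existing.length === 0) return 1;
  return Math.max(...existing) + 1;
}
```

Taking the maximum rather than the count means gaps left by deleted versions do not cause a number to be reused.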

ProcessingJobModel

create(jobData) - Create new job
findByDocumentId(documentId) - Get document jobs
findByType(type, limit, offset) - Get jobs by type
findByStatus(status, limit, offset) - Get jobs by status
findPendingJobs(limit) - Get pending jobs
updateStatus(id, status) - Update job status
updateProgress(id, progress) - Update job progress
delete(id) - Delete job
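
Because the progress column is documented as an integer from 0 to 100, a caller of updateProgress may want to normalize values first. This is a hypothetical helper, not part of ProcessingJobModel; whether the model itself clamps or rejects out-of-range values is not stated here.

```typescript
// Round to an integer and clamp into the 0-100 range used by the progress column.
function normalizeProgress(progress: number): number {
  if (!Number.isFinite(progress)) {
    throw new RangeError("progress must be a finite number");
  }
  return Math.min(100, Math.max(0, Math.round(progress)));
}
```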

Seeded Data

The database is seeded with the following test data:

Users

admin@example.com / admin123 (Admin role)
user1@example.com / user123 (User role)
user2@example.com / user123 (User role)

Sample Documents

Sample CIM documents with different processing statuses
Associated processing jobs for testing

Indexes

The migrations create the following indexes to speed up common lookups and filters:

Users Table

idx_users_email - Email lookups
idx_users_role - Role-based queries
idx_users_is_active - Active user filtering

Documents Table

idx_documents_user_id - User document queries
idx_documents_status - Status-based queries
idx_documents_uploaded_at - Date-based queries
idx_documents_user_status - Composite index for user + status

Other Tables

Foreign key indexes on all relationship columns
Composite indexes for common query patterns

Triggers

update_users_updated_at - Automatically updates updated_at timestamp on user updates
update_documents_updated_at - Automatically updates updated_at timestamp on document updates
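
For readers unfamiliar with this pattern, the sketch below shows how a migration written in TypeScript might generate such triggers. Only the trigger names come from this document. The shared function name update_updated_at_column, the helper updatedAtTriggerSql, and the exact SQL are assumptions; the project's migration files may define them differently. EXECUTE FUNCTION requires PostgreSQL 11 or later (older versions use EXECUTE PROCEDURE).

```typescript
// Hypothetical shared trigger function that stamps updated_at on every row update.
const UPDATED_AT_FUNCTION_SQL = `
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = CURRENT_TIMESTAMP;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;`;

// Build the CREATE TRIGGER statement for one table, e.g. update_users_updated_at.
function updatedAtTriggerSql(table: string): string {
  // Identifiers cannot be passed as query parameters, so validate before interpolating.
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) {
    throw new Error(`Invalid table name: ${table}`);
  }
  return [
    `CREATE TRIGGER update_${table}_updated_at`,
    `BEFORE UPDATE ON ${table}`,
    `FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();`,
  ].join("\n");
}
```

A migration would run UPDATED_AT_FUNCTION_SQL once, then the output of updatedAtTriggerSql("users") and updatedAtTriggerSql("documents").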

Backup and Recovery

Backup

pg_dump -h localhost -U username -d cim_processor > backup.sql

Restore

psql -h localhost -U username -d cim_processor < backup.sql

Troubleshooting

Common Issues

Connection refused: Ensure PostgreSQL is running and that the host, port, and credentials in .env are correct
Permission denied: Ensure the database user has the required permissions on the cim_processor database
Migration errors: Check that the migrations table exists and is accessible
Seed data errors: Ensure all required tables exist (run migrations) before seeding

Logs

Check the application logs for detailed error information:

Database connection errors
Migration execution logs
Seed data creation logs
