cim_summary/backend/DATABASE.md
Jon 5a3c961bfc feat: Complete implementation of Tasks 1-5 - CIM Document Processor
2025-07-27 13:29:26 -04:00

Database Setup and Management

This document describes the database setup, migrations, and management for the CIM Document Processor backend.

Database Schema

The application uses PostgreSQL with the following tables:

Users Table

  • id (UUID, Primary Key)
  • email (VARCHAR, Unique)
  • name (VARCHAR)
  • password_hash (VARCHAR)
  • role (VARCHAR, 'user' or 'admin')
  • created_at (TIMESTAMP)
  • updated_at (TIMESTAMP)
  • last_login (TIMESTAMP, nullable)
  • is_active (BOOLEAN)
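As a rough illustration, the Users columns above could map to a TypeScript row type like the one below. This is a sketch, not the project's actual model code: the names UserRole, UserRow, and isUserRole are hypothetical, and representing UUID as string and TIMESTAMP as Date is an assumption about how the database client returns values.

```typescript
// Hypothetical row type mirroring the Users table columns listed above.
type UserRole = "user" | "admin";

interface UserRow {
  id: string; // UUID, assumed to arrive as a string
  email: string;
  name: string;
  password_hash: string;
  role: UserRole;
  created_at: Date; // TIMESTAMP, assumed to arrive as a Date
  updated_at: Date;
  last_login: Date | null; // nullable column
  is_active: boolean;
}

// Narrows an arbitrary string (for example, from a request body) to a valid role.
function isUserRole(value: string): value is UserRole {
  return value === "user" || value === "admin";
}
```

A guard like isUserRole is one way to reject unexpected role values before they reach the database, since the column itself is a plain VARCHAR.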

Documents Table

  • id (UUID, Primary Key)
  • user_id (UUID, Foreign Key to users.id)
  • original_file_name (VARCHAR)
  • file_path (VARCHAR)
  • file_size (BIGINT)
  • uploaded_at (TIMESTAMP)
  • status (VARCHAR, processing status)
  • extracted_text (TEXT, nullable)
  • generated_summary (TEXT, nullable)
  • summary_markdown_path (VARCHAR, nullable)
  • summary_pdf_path (VARCHAR, nullable)
  • processing_started_at (TIMESTAMP, nullable)
  • processing_completed_at (TIMESTAMP, nullable)
  • error_message (TEXT, nullable)
  • created_at (TIMESTAMP)
  • updated_at (TIMESTAMP)

Document Feedback Table

  • id (UUID, Primary Key)
  • document_id (UUID, Foreign Key to documents.id)
  • user_id (UUID, Foreign Key to users.id)
  • feedback (TEXT)
  • regeneration_instructions (TEXT, nullable)
  • created_at (TIMESTAMP)

Document Versions Table

  • id (UUID, Primary Key)
  • document_id (UUID, Foreign Key to documents.id)
  • version_number (INTEGER)
  • summary_markdown (TEXT)
  • summary_pdf_path (VARCHAR)
  • feedback (TEXT, nullable)
  • created_at (TIMESTAMP)

Processing Jobs Table

  • id (UUID, Primary Key)
  • document_id (UUID, Foreign Key to documents.id)
  • type (VARCHAR, job type)
  • status (VARCHAR, job status)
  • progress (INTEGER, 0-100)
  • error_message (TEXT, nullable)
  • created_at (TIMESTAMP)
  • started_at (TIMESTAMP, nullable)
  • completed_at (TIMESTAMP, nullable)
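To make the column list above concrete, here is one way the Processing Jobs table might be declared in a migration, held as a SQL string in TypeScript. The table name processing_jobs, the VARCHAR lengths, the defaults, the CHECK constraint, and ON DELETE CASCADE are all assumptions for illustration; the actual migration may differ.

```typescript
// Hypothetical DDL for the Processing Jobs table.
// Assumed (not stated in this document): table name, VARCHAR lengths,
// defaults, the CHECK constraint, and the ON DELETE behavior.
const createProcessingJobsTableSql: string = `
CREATE TABLE IF NOT EXISTS processing_jobs (
  id UUID PRIMARY KEY,
  document_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
  type VARCHAR(50) NOT NULL,
  status VARCHAR(50) NOT NULL,
  progress INTEGER NOT NULL DEFAULT 0 CHECK (progress BETWEEN 0 AND 100),
  error_message TEXT,
  created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  started_at TIMESTAMP,
  completed_at TIMESTAMP
);
`;
```

Enforcing the 0-100 range with a CHECK constraint keeps invalid progress values out even if application code forgets to validate them.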

Setup Instructions

1. Install Dependencies

npm install

2. Configure Environment Variables

Copy the example environment file and configure your database settings:

cp .env.example .env

Update the following variables in .env:

  • DATABASE_URL - PostgreSQL connection string
  • DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD - Database credentials
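The sketch below shows one plausible way to turn these variables into a connection configuration. It assumes DATABASE_URL takes precedence over the individual DB_* variables, which this document does not state; the names Env, DbConfig, and buildDbConfig are hypothetical. The resulting object shape (connectionString, or host/port/database/user/password) matches what node-postgres accepts, though the backend's actual client is not confirmed here.

```typescript
// Environment passed in explicitly so the function is easy to test.
type Env = Record<string, string | undefined>;

type DbConfig =
  | { connectionString: string }
  | { host: string; port: number; database: string; user: string; password: string };

// Assumption: DATABASE_URL wins when both styles are configured.
function buildDbConfig(env: Env): DbConfig {
  const url = env["DATABASE_URL"];
  if (url) {
    return { connectionString: url };
  }

  const required = ["DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing database settings: ${missing.join(", ")}`);
  }

  // 5432 is PostgreSQL's default port.
  const port = Number(env["DB_PORT"] ?? "5432");
  if (!isFinite(port) || Math.floor(port) !== port || port <= 0) {
    throw new Error(`Invalid DB_PORT: ${env["DB_PORT"]}`);
  }

  return {
    host: env["DB_HOST"] as string,
    port,
    database: env["DB_NAME"] as string,
    user: env["DB_USER"] as string,
    password: env["DB_PASSWORD"] as string,
  };
}
```

In a Node.js process this would typically be called as buildDbConfig(process.env) once at startup, so a misconfigured .env fails fast with a readable message.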

3. Create Database

Create a PostgreSQL database:

CREATE DATABASE cim_processor;

4. Run Migrations and Seed Data

npm run db:setup

This command will:

  • Run all database migrations to create tables
  • Seed the database with initial test data
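How db:setup decides which migrations to run is not described here. A common approach, sketched below under that assumption, is to record applied migrations by name and run only the remaining ones in a stable order. The Migration type, the pendingMigrations function, and the file-name pattern are hypothetical.

```typescript
// Hypothetical migration descriptor: a unique, sortable name plus its SQL.
interface Migration {
  name: string; // e.g. "001_create_users" (naming scheme is an assumption)
  sql: string;
}

// Returns the migrations that have not been applied yet, sorted by name,
// without mutating the input arrays.
function pendingMigrations(all: Migration[], appliedNames: string[]): Migration[] {
  return all
    .filter((m) => appliedNames.indexOf(m.name) < 0)
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
}
```

A runner would execute each returned migration in turn and record its name, which makes repeated runs of the setup command safe.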

Available Scripts

Database Management

  • npm run db:migrate - Run database migrations
  • npm run db:seed - Seed database with test data
  • npm run db:setup - Run migrations and seed data

Development

  • npm run dev - Start development server
  • npm run build - Build for production
  • npm run test - Run tests
  • npm run lint - Run linting

Database Models

The application includes the following models:

UserModel

  • create(userData) - Create new user
  • findById(id) - Find user by ID
  • findByEmail(email) - Find user by email
  • findAll(limit, offset) - Get all users (admin)
  • update(id, updates) - Update user
  • delete(id) - Soft delete user
  • emailExists(email) - Check if email exists
  • count() - Count total users
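Since delete(id) is described as a soft delete and the Users table has an is_active column, one plausible implementation flips that flag rather than removing the row. The sketch below builds such a statement as a parameterized query; the link between soft delete and is_active is an assumption, and SoftDeleteQuery and softDeleteUserQuery are hypothetical names.

```typescript
// Text plus positional values, the shape many PostgreSQL clients accept.
interface SoftDeleteQuery {
  text: string;
  values: string[];
}

// Assumption: "soft delete" means marking the user inactive.
// updated_at is left to the update_users_updated_at trigger.
function softDeleteUserQuery(id: string): SoftDeleteQuery {
  return {
    text: "UPDATE users SET is_active = false WHERE id = $1",
    values: [id],
  };
}
```

Passing the id as a bound parameter rather than interpolating it into the SQL string avoids injection issues.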

DocumentModel

  • create(documentData) - Create new document
  • findById(id) - Find document by ID
  • findByUserId(userId, limit, offset) - Get user's documents
  • findAll(limit, offset) - Get all documents (admin)
  • updateStatus(id, status) - Update document status
  • updateExtractedText(id, text) - Update extracted text
  • updateGeneratedSummary(id, summary, markdownPath, pdfPath) - Update summary
  • delete(id) - Delete document
  • countByUser(userId) - Count user's documents
  • findByStatus(status, limit, offset) - Get documents by status
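Several of these methods take limit and offset arguments. The sketch below shows how findByUserId might translate them into a paginated query. The default page size, the maximum of 100, and the ORDER BY uploaded_at DESC ordering are assumptions, as are the names PagedQuery and findDocumentsByUserQuery.

```typescript
interface PagedQuery {
  text: string;
  values: (string | number)[];
}

const MAX_PAGE_SIZE = 100; // assumed upper bound on rows per page

// Builds a paginated lookup of one user's documents, newest upload first.
// limit is clamped to 1..MAX_PAGE_SIZE and offset to >= 0 (finite inputs assumed).
function findDocumentsByUserQuery(userId: string, limit: number = 20, offset: number = 0): PagedQuery {
  const safeLimit = Math.min(Math.max(Math.floor(limit), 1), MAX_PAGE_SIZE);
  const safeOffset = Math.max(Math.floor(offset), 0);
  return {
    text: "SELECT * FROM documents WHERE user_id = $1 ORDER BY uploaded_at DESC LIMIT $2 OFFSET $3",
    values: [userId, safeLimit, safeOffset],
  };
}
```

A query of this shape filters on user_id and sorts on uploaded_at, which is the kind of access the idx_documents_user_id and idx_documents_uploaded_at indexes are meant to support.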

DocumentFeedbackModel

  • create(feedbackData) - Create new feedback
  • findByDocumentId(documentId) - Get document feedback
  • findByUserId(userId, limit, offset) - Get user's feedback
  • update(id, updates) - Update feedback
  • delete(id) - Delete feedback

DocumentVersionModel

  • create(versionData) - Create new version
  • findByDocumentId(documentId) - Get document versions
  • findLatestByDocumentId(documentId) - Get latest version
  • getNextVersionNumber(documentId) - Get next version number
  • update(id, updates) - Update version
  • delete(id) - Delete version
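The logic behind getNextVersionNumber is presumably "highest existing version plus one, starting at 1". The sketch below expresses that over an in-memory list; in SQL the same idea is usually written as SELECT COALESCE(MAX(version_number), 0) + 1. The function name nextVersionNumber and the starting value of 1 are assumptions.

```typescript
// Returns the next version number for a document given its existing ones.
// Assumption: numbering starts at 1 when the document has no versions yet.
function nextVersionNumber(existing: number[]): number {
  const highest = existing.reduce((max, n) => (n > max ? n : max), 0);
  return highest + 1;
}
```

Note that computing the number in application code can race when two versions are created at once; a unique constraint on (document_id, version_number) would be one way to guard against that.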

ProcessingJobModel

  • create(jobData) - Create new job
  • findByDocumentId(documentId) - Get document jobs
  • findByType(type, limit, offset) - Get jobs by type
  • findByStatus(status, limit, offset) - Get jobs by status
  • findPendingJobs(limit) - Get pending jobs
  • updateStatus(id, status) - Update job status
  • updateProgress(id, progress) - Update job progress
  • delete(id) - Delete job
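The following in-memory sketch illustrates what findPendingJobs and updateProgress might do. The status values, the oldest-first ordering, and the rounding behavior are assumptions; the names JobStatus, JobRow, clampProgress, and selectPendingJobs are hypothetical.

```typescript
// Assumed status values; the document only says status is a VARCHAR.
type JobStatus = "pending" | "processing" | "completed" | "failed";

interface JobRow {
  id: string;
  status: JobStatus;
  progress: number; // 0-100
  created_at: Date;
}

// Keeps progress inside the documented 0-100 range, rounded to an integer.
function clampProgress(progress: number): number {
  if (!isFinite(progress)) {
    throw new RangeError("progress must be a finite number");
  }
  return Math.min(100, Math.max(0, Math.round(progress)));
}

// Returns up to `limit` pending jobs, oldest first, without mutating the input.
function selectPendingJobs(jobs: JobRow[], limit: number): JobRow[] {
  return jobs
    .filter((job) => job.status === "pending")
    .sort((a, b) => a.created_at.getTime() - b.created_at.getTime())
    .slice(0, limit);
}
```

Oldest-first ordering gives simple first-in, first-out processing, which is a reasonable default for a job queue.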

Seeded Data

The database is seeded with the following test data:

Users

  • admin@example.com / admin123 (Admin role)
  • user1@example.com / user123 (User role)
  • user2@example.com / user123 (User role)

Sample Documents

  • Sample CIM documents with different processing statuses
  • Associated processing jobs for testing

Indexes

The following indexes are created to speed up common lookups and filters:

Users Table

  • idx_users_email - Email lookups
  • idx_users_role - Role-based queries
  • idx_users_is_active - Active user filtering

Documents Table

  • idx_documents_user_id - User document queries
  • idx_documents_status - Status-based queries
  • idx_documents_uploaded_at - Date-based queries
  • idx_documents_user_status - Composite index for user + status

Other Tables

  • Foreign key indexes on all relationship columns
  • Composite indexes for common query patterns
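For reference, the named indexes above might be generated from a small specification like the one below. The index names come from this document, but the indexed columns are inferred from those names and descriptions, and IndexSpec, indexSpecs, and toCreateIndexSql are hypothetical helpers.

```typescript
interface IndexSpec {
  name: string;
  table: string;
  columns: string[];
}

// Columns are inferred from the index names; the real migration may differ.
const indexSpecs: IndexSpec[] = [
  { name: "idx_users_email", table: "users", columns: ["email"] },
  { name: "idx_users_role", table: "users", columns: ["role"] },
  { name: "idx_users_is_active", table: "users", columns: ["is_active"] },
  { name: "idx_documents_user_id", table: "documents", columns: ["user_id"] },
  { name: "idx_documents_status", table: "documents", columns: ["status"] },
  { name: "idx_documents_uploaded_at", table: "documents", columns: ["uploaded_at"] },
  { name: "idx_documents_user_status", table: "documents", columns: ["user_id", "status"] },
];

// Renders one idempotent CREATE INDEX statement.
function toCreateIndexSql(spec: IndexSpec): string {
  return `CREATE INDEX IF NOT EXISTS ${spec.name} ON ${spec.table} (${spec.columns.join(", ")});`;
}
```

In the composite index, putting user_id first suits queries that always filter by user and optionally by status.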

Triggers

  • update_users_updated_at - Automatically updates updated_at timestamp on user updates
  • update_documents_updated_at - Automatically updates updated_at timestamp on document updates
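Triggers like these are usually built from one shared PL/pgSQL function plus a BEFORE UPDATE trigger per table. The sketch below shows that pattern as a SQL string for the users table. The function name update_updated_at_column is hypothetical, and EXECUTE FUNCTION requires PostgreSQL 11 or later (older versions use EXECUTE PROCEDURE).

```typescript
// Hypothetical trigger DDL; only the trigger name comes from this document.
const updatedAtTriggerSql: string = `
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
  -- Overwrite updated_at on every row update.
  NEW.updated_at = CURRENT_TIMESTAMP;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_users_updated_at
  BEFORE UPDATE ON users
  FOR EACH ROW
  EXECUTE FUNCTION update_updated_at_column();
`;
```

The same function can back update_documents_updated_at by creating a second trigger on the documents table.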

Backup and Recovery

Backup

pg_dump -h localhost -U username -d cim_processor > backup.sql

Restore

psql -h localhost -U username -d cim_processor < backup.sql

Troubleshooting

Common Issues

  1. Connection refused: Check DB_HOST and DB_PORT (or DATABASE_URL) and ensure PostgreSQL is running and accepting connections
  2. Permission denied: Ensure the database user has privileges on the cim_processor database and its tables
  3. Migration errors: Check that the migrations table exists and is accessible
  4. Seed data errors: Ensure migrations have run so that all required tables exist before seeding
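When an error surfaces in code, its code field often points to one of the issues above. The sketch below maps a few well-known codes to hints: ECONNREFUSED is a Node.js system error code, and the others are standard PostgreSQL SQLSTATE values. It assumes the thrown error exposes a string code property; ErrorWithCode and troubleshootingHint are hypothetical names.

```typescript
interface ErrorWithCode {
  code?: string;
  message: string;
}

// Maps common connection and SQLSTATE codes to a troubleshooting hint.
function troubleshootingHint(err: ErrorWithCode): string {
  switch (err.code) {
    case "ECONNREFUSED":
      return "Connection refused: check host and port, and that PostgreSQL is running.";
    case "28P01": // invalid_password
      return "Authentication failed: check the database user and password.";
    case "3D000": // invalid_catalog_name
      return "Database does not exist: create it before running migrations.";
    case "42501": // insufficient_privilege
      return "Permission denied: grant the database user the required privileges.";
    case "42P01": // undefined_table
      return "Table missing: run migrations before seeding or querying.";
    default:
      return `Unrecognized error (${err.code ?? "no code"}): ${err.message}`;
  }
}
```

Logging the hint alongside the original error keeps the raw details available while giving a faster starting point for diagnosis.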

Logs

Check the application logs for detailed error information:

  • Database connection errors
  • Migration execution logs
  • Seed data creation logs