# Database Setup and Management

This document describes the database setup, migrations, and management for the CIM Document Processor backend.

## Database Schema

The application uses PostgreSQL with the following tables:

### Users Table
- `id` (UUID, Primary Key)
- `email` (VARCHAR, Unique)
- `name` (VARCHAR)
- `password_hash` (VARCHAR)
- `role` (VARCHAR, 'user' or 'admin')
- `created_at` (TIMESTAMP)
- `updated_at` (TIMESTAMP)
- `last_login` (TIMESTAMP, nullable)
- `is_active` (BOOLEAN)

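As a rough illustration, the Users columns above could map onto a TypeScript type like the one below. `UserRow`, `UserRole`, and `isUserRole` are hypothetical names, not taken from the codebase, and the mapping of `TIMESTAMP` to `Date` is an assumption about how the database driver returns values.

```typescript
// Hypothetical row shape for the users table; field names mirror the columns listed above.
type UserRole = "user" | "admin";

interface UserRow {
  id: string; // UUID
  email: string;
  name: string;
  password_hash: string;
  role: UserRole;
  created_at: Date;
  updated_at: Date;
  last_login: Date | null; // nullable column
  is_active: boolean;
}

// Narrow an arbitrary string (for example from a request body)
// to one of the two documented role values.
function isUserRole(value: string): value is UserRole {
  return value === "user" || value === "admin";
}
```

A guard like this lets calling code reject unknown role strings before they reach the `role` column.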
### Documents Table
- `id` (UUID, Primary Key)
- `user_id` (UUID, Foreign Key to users.id)
- `original_file_name` (VARCHAR)
- `file_path` (VARCHAR)
- `file_size` (BIGINT)
- `uploaded_at` (TIMESTAMP)
- `status` (VARCHAR, processing status)
- `extracted_text` (TEXT, nullable)
- `generated_summary` (TEXT, nullable)
- `summary_markdown_path` (VARCHAR, nullable)
- `summary_pdf_path` (VARCHAR, nullable)
- `processing_started_at` (TIMESTAMP, nullable)
- `processing_completed_at` (TIMESTAMP, nullable)
- `error_message` (TEXT, nullable)
- `created_at` (TIMESTAMP)
- `updated_at` (TIMESTAMP)

### Document Feedback Table
- `id` (UUID, Primary Key)
- `document_id` (UUID, Foreign Key to documents.id)
- `user_id` (UUID, Foreign Key to users.id)
- `feedback` (TEXT)
- `regeneration_instructions` (TEXT, nullable)
- `created_at` (TIMESTAMP)

### Document Versions Table
- `id` (UUID, Primary Key)
- `document_id` (UUID, Foreign Key to documents.id)
- `version_number` (INTEGER)
- `summary_markdown` (TEXT)
- `summary_pdf_path` (VARCHAR)
- `feedback` (TEXT, nullable)
- `created_at` (TIMESTAMP)

### Processing Jobs Table
- `id` (UUID, Primary Key)
- `document_id` (UUID, Foreign Key to documents.id)
- `type` (VARCHAR, job type)
- `status` (VARCHAR, job status)
- `progress` (INTEGER, 0-100)
- `error_message` (TEXT, nullable)
- `created_at` (TIMESTAMP)
- `started_at` (TIMESTAMP, nullable)
- `completed_at` (TIMESTAMP, nullable)

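The `progress` column is documented as an integer from 0 to 100. A minimal sketch of how a caller might normalize a value before storing it follows; `normalizeProgress` is a hypothetical helper, and whether the real code clamps, rounds, or rejects out-of-range values is not stated in this document.

```typescript
// Hypothetical helper: keep a job's progress inside the documented 0-100 integer range.
function normalizeProgress(value: number): number {
  if (!isFinite(value)) {
    throw new RangeError("progress must be a finite number");
  }
  // Round to the nearest integer, then clamp to [0, 100].
  return Math.min(100, Math.max(0, Math.round(value)));
}
```

The result could then be passed as the `progress` argument to something like `ProcessingJobModel.updateProgress(id, progress)`.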
## Setup Instructions

### 1. Install Dependencies
```bash
npm install
```

### 2. Configure Environment Variables
Copy the example environment file and configure your database settings:
```bash
cp .env.example .env
```

Update the following variables in `.env`:
- `DATABASE_URL` - PostgreSQL connection string
- `DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_USER`, `DB_PASSWORD` - Database credentials

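To show how the individual `DB_*` values relate to a connection string, here is a small sketch that assembles a PostgreSQL URI from them. It assumes `DATABASE_URL` and the `DB_*` variables describe the same database, which this document does not state explicitly. `DbEnv` and `buildDatabaseUrl` are hypothetical names, and the values in the usage line are placeholders.

```typescript
// Hypothetical shape for the DB_* variables listed above.
interface DbEnv {
  DB_HOST: string;
  DB_PORT: string;
  DB_NAME: string;
  DB_USER: string;
  DB_PASSWORD: string;
}

// Assemble a PostgreSQL connection URI. User and password are percent-encoded
// so that characters such as "@" or ":" do not break the URI.
function buildDatabaseUrl(env: DbEnv): string {
  const user = encodeURIComponent(env.DB_USER);
  const password = encodeURIComponent(env.DB_PASSWORD);
  return `postgresql://${user}:${password}@${env.DB_HOST}:${env.DB_PORT}/${env.DB_NAME}`;
}

// Placeholder values for illustration only.
const exampleUrl = buildDatabaseUrl({
  DB_HOST: "localhost",
  DB_PORT: "5432",
  DB_NAME: "cim_processor",
  DB_USER: "postgres",
  DB_PASSWORD: "p@ss:word",
});
```

Encoding the credentials matters mainly when the password contains reserved URI characters; plain alphanumeric values pass through unchanged.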
### 3. Create Database
Create a PostgreSQL database:
```sql
CREATE DATABASE cim_processor;
```

### 4. Run Migrations and Seed Data
```bash
npm run db:setup
```

This command will:
- Run all database migrations to create tables
- Seed the database with initial test data

## Available Scripts

### Database Management
- `npm run db:migrate` - Run database migrations
- `npm run db:seed` - Seed database with test data
- `npm run db:setup` - Run migrations and seed data

### Development
- `npm run dev` - Start development server
- `npm run build` - Build for production
- `npm run test` - Run tests
- `npm run lint` - Run linting

## Database Models

The application includes the following models:

### UserModel
- `create(userData)` - Create new user
- `findById(id)` - Find user by ID
- `findByEmail(email)` - Find user by email
- `findAll(limit, offset)` - Get all users (admin)
- `update(id, updates)` - Update user
- `delete(id)` - Soft delete user
- `emailExists(email)` - Check if email exists
- `count()` - Count total users

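The list above describes `delete(id)` as a soft delete. One plausible reading, given the `is_active` column, is that the row is kept and only flagged inactive. The in-memory sketch below illustrates that idea without a database. `UserRecord`, `softDelete`, and `activeUsers` are hypothetical names, and the use of `is_active` as the soft-delete flag is an assumption.

```typescript
// Minimal in-memory stand-in for the users table (illustration only, no database).
interface UserRecord {
  id: string;
  email: string;
  is_active: boolean;
}

// Assumed soft-delete semantics: the row stays, only is_active flips to false.
// Returns a new array and leaves the input untouched.
function softDelete(users: UserRecord[], id: string): UserRecord[] {
  return users.map((u) => (u.id === id ? { ...u, is_active: false } : u));
}

// Queries that should ignore soft-deleted users filter on is_active.
function activeUsers(users: UserRecord[]): UserRecord[] {
  return users.filter((u) => u.is_active);
}
```

Under this reading, a soft-deleted user's documents and feedback keep a valid `user_id` reference because the user row is never removed.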
### DocumentModel
- `create(documentData)` - Create new document
- `findById(id)` - Find document by ID
- `findByUserId(userId, limit, offset)` - Get user's documents
- `findAll(limit, offset)` - Get all documents (admin)
- `updateStatus(id, status)` - Update document status
- `updateExtractedText(id, text)` - Update extracted text
- `updateGeneratedSummary(id, summary, markdownPath, pdfPath)` - Update summary
- `delete(id)` - Delete document
- `countByUser(userId)` - Count user's documents
- `findByStatus(status, limit, offset)` - Get documents by status

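Several of the methods above take `limit` and `offset` arguments. When an API exposes page numbers instead, a conversion along these lines may help. `toPageParams` and the default cap of 100 rows per page are assumptions made for illustration, not part of the documented models.

```typescript
interface PageParams {
  limit: number;
  offset: number;
}

// Convert a 1-based page number into limit/offset arguments.
// The page size is capped at maxPageSize and pages below 1 are treated as page 1.
function toPageParams(page: number, pageSize: number, maxPageSize = 100): PageParams {
  const safePage = Math.max(1, Math.floor(page));
  const limit = Math.min(maxPageSize, Math.max(1, Math.floor(pageSize)));
  return { limit, offset: (safePage - 1) * limit };
}
```

For example, `toPageParams(3, 20)` yields a `limit` of 20 and an `offset` of 40, which could be passed on to a call such as `findByUserId(userId, limit, offset)`.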
### DocumentFeedbackModel
- `create(feedbackData)` - Create new feedback
- `findByDocumentId(documentId)` - Get document feedback
- `findByUserId(userId, limit, offset)` - Get user's feedback
- `update(id, updates)` - Update feedback
- `delete(id)` - Delete feedback

### DocumentVersionModel
- `create(versionData)` - Create new version
- `findByDocumentId(documentId)` - Get document versions
- `findLatestByDocumentId(documentId)` - Get latest version
- `getNextVersionNumber(documentId)` - Get next version number
- `update(id, updates)` - Update version
- `delete(id)` - Delete version

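`getNextVersionNumber(documentId)` is listed above without a stated rule. A common convention, assumed here, is that versions are numbered from 1 per document and the next number is one more than the highest existing one. The helper below is a hypothetical in-memory sketch of that rule, not the model's actual implementation.

```typescript
// Assumed rule: versions are numbered 1, 2, 3, ... per document,
// so the next number is one more than the highest existing version_number.
// An empty list (no versions yet) yields 1.
function nextVersionNumber(existing: number[]): number {
  return existing.reduce((max, n) => Math.max(max, n), 0) + 1;
}
```

Taking the maximum rather than the count keeps numbering monotonic even if an earlier version has been deleted.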
### ProcessingJobModel
- `create(jobData)` - Create new job
- `findByDocumentId(documentId)` - Get document jobs
- `findByType(type, limit, offset)` - Get jobs by type
- `findByStatus(status, limit, offset)` - Get jobs by status
- `findPendingJobs(limit)` - Get pending jobs
- `updateStatus(id, status)` - Update job status
- `updateProgress(id, progress)` - Update job progress
- `delete(id)` - Delete job

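To make `findPendingJobs(limit)` more concrete, the sketch below selects pending jobs from an in-memory list. It rests on two assumptions that this document does not confirm: that the status value for unstarted jobs is the string `"pending"`, and that jobs are handed out oldest first. `JobRecord` and `pickPendingJobs` are hypothetical names.

```typescript
interface JobRecord {
  id: string;
  status: string;
  created_at: Date;
}

// Assumptions: "pending" marks jobs that have not started,
// and pending jobs are returned oldest first, up to `limit` entries.
function pickPendingJobs(jobs: JobRecord[], limit: number): JobRecord[] {
  return jobs
    .filter((j) => j.status === "pending") // filter returns a copy, so sort does not mutate the input
    .sort((a, b) => a.created_at.getTime() - b.created_at.getTime())
    .slice(0, limit);
}
```

In SQL terms this roughly corresponds to filtering on `status`, ordering by `created_at`, and applying a `LIMIT`.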
## Seeded Data

The database is seeded with the following test data:

### Users
- `admin@example.com` / `admin123` (Admin role)
- `user1@example.com` / `user123` (User role)
- `user2@example.com` / `user123` (User role)

### Sample Documents
- Sample CIM documents with different processing statuses
- Associated processing jobs for testing

## Indexes

The following indexes are created for optimal performance:

### Users Table
- `idx_users_email` - Email lookups
- `idx_users_role` - Role-based queries
- `idx_users_is_active` - Active user filtering

### Documents Table
- `idx_documents_user_id` - User document queries
- `idx_documents_status` - Status-based queries
- `idx_documents_uploaded_at` - Date-based queries
- `idx_documents_user_status` - Composite index for user + status

### Other Tables
- Foreign key indexes on all relationship columns
- Composite indexes for common query patterns

## Triggers

- `update_users_updated_at` - Automatically updates `updated_at` timestamp on user updates
- `update_documents_updated_at` - Automatically updates `updated_at` timestamp on document updates

## Backup and Recovery

### Backup
```bash
pg_dump -h localhost -U username -d cim_processor > backup.sql
```

### Restore
```bash
psql -h localhost -U username -d cim_processor < backup.sql
```

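If backups are scripted from Node rather than typed by hand, the `pg_dump` arguments can be built programmatically. The sketch below is one way to do that under stated assumptions: `BackupOptions`, `pgDumpArgs`, and the dated file name pattern are hypothetical, and it uses `pg_dump`'s `-f` option to name the output file instead of shell redirection.

```typescript
interface BackupOptions {
  host: string;
  user: string;
  database: string;
}

// Build the argument list for pg_dump, writing to a dated file via -f.
// The date is taken in UTC from the supplied Date so the function stays deterministic.
function pgDumpArgs(opts: BackupOptions, now: Date): string[] {
  const stamp = now.toISOString().slice(0, 10); // YYYY-MM-DD
  return ["-h", opts.host, "-U", opts.user, "-d", opts.database, "-f", `backup-${stamp}.sql`];
}
```

The resulting array could be handed to a process launcher such as Node's `child_process.execFile` with `pg_dump` as the command; passing arguments as an array avoids shell quoting issues.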
## Troubleshooting

### Common Issues

1. **Connection refused**: Check the connection settings (host, port, credentials) and ensure PostgreSQL is running
2. **Permission denied**: Ensure the database user has the required permissions on the database and its tables
3. **Migration errors**: Check that the migrations table exists and is accessible
4. **Seed data errors**: Ensure all required tables exist (run migrations first) before seeding

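The issues above can often be told apart by error code. The sketch below maps a few codes to hints, assuming the error object exposes a `code` string, as Node.js system errors and many PostgreSQL drivers do. `troubleshootingHint` is a hypothetical helper. `ECONNREFUSED` is the Node.js system error code, and the others are standard PostgreSQL SQLSTATE values.

```typescript
// Map an error code to a short troubleshooting hint.
function troubleshootingHint(code: string): string {
  switch (code) {
    case "ECONNREFUSED": // Node.js system error: nothing is listening on host:port
      return "Connection refused: check host and port, and ensure PostgreSQL is running";
    case "28P01": // SQLSTATE invalid_password
      return "Authentication failed: check the database user and password";
    case "3D000": // SQLSTATE invalid_catalog_name (database does not exist)
      return "Database not found: create the database first";
    case "42501": // SQLSTATE insufficient_privilege
      return "Permission denied: grant the database user the required permissions";
    case "42P01": // SQLSTATE undefined_table
      return "Missing table: run migrations before seeding";
    default:
      return "Unrecognized error code: check the application logs";
  }
}
```

A lookup like this is only a starting point; the full error message in the logs remains the authoritative source.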
### Logs
Check the application logs for detailed error information:
- Database connection errors
- Migration execution logs
- Seed data creation logs