HomeAudit/PAPERLESS_AI_DATABASE_ISSUE_FIX.md
admin 45363040f3 feat: Complete infrastructure cleanup phase documentation and status updates
2025-09-01 16:50:37 -04:00


Paperless AI Database Issue - Complete Fix

🚨 Problem Summary

Paperless AI and Paperless-ngx are using different databases. As a result, tags and titles applied by Paperless AI do not show up on the corresponding documents in Paperless-ngx.

🔍 Root Cause Analysis

Database Mismatch

  • Paperless-ngx: Uses PostgreSQL with host postgresql_postgresql_primary
  • Paperless AI: Uses its own local database in /app/data

Configuration Differences

  • Paperless-ngx: Properly configured with external PostgreSQL database
  • Paperless AI: Uses network_mode: bridge and doesn't connect to the same database

Missing Integration

  • Paperless AI lacks proper environment variables to connect to Paperless-ngx
  • No shared database connection between the two services
  • Different network configurations preventing proper communication
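The network mismatch described above can be checked by hand before changing anything. The following is a minimal sketch, assuming the containers are named paperless-ai and paperless (the names used in the log commands later in this document; under Docker Swarm the real container names usually carry a task suffix, so adjust accordingly). The helper names common_networks and container_networks are hypothetical.

```shell
#!/usr/bin/env bash
# Print every network name that appears in both space-separated lists.
common_networks() {
  local a="$1" b="$2" net other
  for net in $a; do
    for other in $b; do
      if [ "$net" = "$other" ]; then
        printf '%s\n' "$net"
      fi
    done
  done
}

# List the networks a container is attached to (requires the docker CLI).
container_networks() {
  docker inspect \
    --format '{{range $name, $cfg := .NetworkSettings.Networks}}{{$name}} {{end}}' "$1"
}

# Example (container names are assumptions, see the note above):
#   common_networks "$(container_networks paperless-ai)" "$(container_networks paperless)"
```

If the example prints nothing, the two containers have no network in common, which matches the "different network configurations" finding.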

🛠️ Complete Solution

1. New Paperless AI Configuration

I've created a new configuration file: stacks/ai/paperless-ai.yml

Key Features:

  • Connects to the same PostgreSQL database as Paperless-ngx
  • Uses the same Redis instance
  • Shares the same network configuration
  • Proper environment variable configuration
  • Health checks and monitoring
  • Secure secrets management

2. Setup Scripts

Diagnostic Script

./scripts/diagnose_paperless_issues.sh
  • Analyzes current configuration
  • Identifies specific issues
  • Provides detailed recommendations

Quick Fix Script

./scripts/quick_fix_paperless_ai.sh
  • Stops problematic containers
  • Creates backups
  • Sets up proper integration

Complete Setup Script

./scripts/setup_paperless_ai_integration.sh
  • Interactive configuration
  • Environment file creation
  • Deployment automation

3. Environment Configuration

The new setup requires proper environment variables:

# Paperless-ngx Connection
PAPERLESS_URL=https://paperless.pressmess.duckdns.org
PAPERLESS_USERNAME=admin
PAPERLESS_PASSWORD=your_password

# Database Connection (same as Paperless-ngx)
PAPERLESS_DBHOST=postgresql_postgresql_primary
PAPERLESS_DBNAME=paperless
PAPERLESS_DBUSER=postgres
PAPERLESS_DBPASS_FILE=/run/secrets/pg_root_password

# AI Provider (configure at least one)
OPENAI_API_KEY=your_openai_key
OLLAMA_BASE_URL=http://ollama:11434
DEEPSEEK_API_KEY=your_deepseek_key
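Before deploying, it may help to confirm that the env file actually defines these variables. The sketch below is only an illustration: the function names env_has and check_env_file are hypothetical, and the check only tests that a value is present, so leftover placeholders such as your_password still pass.

```shell
#!/usr/bin/env bash
# Return 0 if NAME is assigned a non-empty value in the env file, else 1.
env_has() {
  local file="$1" name="$2"
  grep -Eq "^${name}=.+" "$file"
}

# Check the connection settings and require at least one AI provider setting.
check_env_file() {
  local file="$1" missing=0 name
  for name in PAPERLESS_URL PAPERLESS_USERNAME PAPERLESS_PASSWORD \
              PAPERLESS_DBHOST PAPERLESS_DBNAME PAPERLESS_DBUSER \
              PAPERLESS_DBPASS_FILE; do
    if ! env_has "$file" "$name"; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  if ! env_has "$file" OPENAI_API_KEY \
     && ! env_has "$file" OLLAMA_BASE_URL \
     && ! env_has "$file" DEEPSEEK_API_KEY; then
    echo "missing: at least one AI provider setting" >&2
    missing=1
  fi
  return "$missing"
}

# Example: check_env_file stacks/ai/.env && echo "env file looks complete"
```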

🚀 Implementation Steps

Step 1: Run Diagnostic

./scripts/diagnose_paperless_issues.sh

Step 2: Quick Fix (Immediate)

./scripts/quick_fix_paperless_ai.sh

Step 3: Complete Setup

./scripts/setup_paperless_ai_integration.sh

Step 4: Deploy

cd stacks/ai
docker-compose -f paperless-ai.yml --env-file .env up -d

Step 5: Verify

./scripts/verify_paperless_ai.sh

🔧 Configuration Details

Database Integration

  • Both services now use the same PostgreSQL database
  • Shared Redis instance for caching and messaging
  • Proper network connectivity between containers

Document Processing

  • Paperless AI can access the same document storage
  • Tags and titles are applied directly to the shared database
  • Real-time synchronization between services

Security

  • Uses Docker secrets for sensitive data
  • Proper network isolation
  • Secure API token management

📊 Expected Results

After implementing this fix:

  1. Unified Database: Both services use the same PostgreSQL database
  2. Synchronized Tags: Tags applied by Paperless AI appear in Paperless-ngx
  3. Consistent Titles: Document titles are properly synchronized
  4. Real-time Updates: Changes are immediately visible in both interfaces
  5. Proper Integration: Seamless communication between services

🛡️ Backup and Recovery

Automatic Backups

  • Current Paperless AI data is backed up automatically by the quick fix script
  • Backup location: backups/paperless-ai-YYYYMMDD_HHMMSS/
  • Includes all configuration and data

Rollback Procedure

If issues occur:

# Stop new configuration
cd stacks/ai
docker-compose -f paperless-ai.yml down

# Restore from backup
tar xzf backups/paperless-ai-YYYYMMDD_HHMMSS/paperless-ai-data-backup.tar.gz
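The restore command needs the actual timestamped directory name. As a sketch, the newest backup can be located by relying on the fact that the YYYYMMDD_HHMMSS suffix sorts chronologically as plain text. The function name latest_backup is hypothetical.

```shell
#!/usr/bin/env bash
# Print the newest paperless-ai-YYYYMMDD_HHMMSS directory under the given
# root (default: backups). Returns 1 if no backup directory exists.
latest_backup() {
  local root="${1:-backups}" dir latest=""
  for dir in "$root"/paperless-ai-*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    latest="${dir%/}"           # globs expand in sorted order; keep the last
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Example: tar xzf "$(latest_backup)/paperless-ai-data-backup.tar.gz"
```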

🔍 Monitoring and Troubleshooting

Health Checks

  • Container health monitoring
  • Database connectivity verification
  • API endpoint testing

Logs and Debugging

# View Paperless AI logs
docker-compose -f stacks/ai/paperless-ai.yml logs -f

# View Paperless-ngx logs
docker logs paperless

# Check database connectivity (requires pg_isready to be installed in the paperless-ai image)
docker exec paperless-ai pg_isready -h postgresql_postgresql_primary

Common Issues and Solutions

| Issue | Solution |
|---|---|
| Database connection failed | Verify PostgreSQL container is running |
| API authentication failed | Check PAPERLESS_USERNAME/PAPERLESS_PASSWORD |
| AI processing not working | Configure at least one AI provider API key |
| Network connectivity issues | Ensure both containers are on same network |
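Connection failures right after a deploy are often just a matter of timing, because the database may not be accepting connections yet. A small retry helper makes that easy to rule out. This is a sketch only: the function name retry is hypothetical and the attempt count and delay in the example are arbitrary.

```shell
#!/usr/bin/env bash
# Run a command until it succeeds or the attempt limit is reached.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example, reusing the connectivity check from this document:
#   retry 10 3 docker exec paperless-ai pg_isready -h postgresql_postgresql_primary
```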


🎯 Success Criteria

The fix is successful when:

  • Paperless AI container starts without errors
  • Database connectivity is established
  • API authentication works
  • Tags applied by Paperless AI appear in Paperless-ngx
  • Document titles are properly synchronized
  • Health checks pass
  • No error messages in logs
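The last criterion can be roughly checked by scanning the logs for error-like lines. The sketch below assumes that the words error, fatal, and traceback are reasonable markers. That is a guess about the log format, so a harmless line mentioning "error" would also be counted. The function name count_error_lines is hypothetical.

```shell
#!/usr/bin/env bash
# Count lines on stdin that look like errors (case-insensitive match).
# grep exits non-zero when nothing matches, so "|| true" keeps the
# function's exit status at 0 while still printing the count.
count_error_lines() {
  grep -Eci 'error|fatal|traceback' || true
}

# Examples:
#   docker-compose -f stacks/ai/paperless-ai.yml logs --no-color | count_error_lines
#   docker logs paperless 2>&1 | count_error_lines
```

A count of 0 suggests the criterion is met. Any other value is a prompt to read the matching lines, not proof of a failure.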

Note: This solution ensures that Paperless AI and Paperless-ngx work together as a unified document management system with proper database synchronization and real-time updates.