HomeAudit/dev_documentation/QUICK_START.md
admin 705a2757c1 Major infrastructure migration and Vaultwarden PostgreSQL troubleshooting
2025-08-30 20:18:44 -04:00

QUICK START GUIDE - HOMEAUDIT MIGRATION

Generated: 2025-08-29
Status: READY FOR SERVICE MIGRATION - 99% Complete


🎯 PROJECT OVERVIEW

This project migrates the home infrastructure to Docker Swarm with an optimized distribution of services across nodes. All critical infrastructure is in place and ready for service migration.


📊 CURRENT STATUS DASHBOARD

COMPLETED INFRASTRUCTURE

  • Docker Swarm: All 6 nodes joined and labeled
  • Caddy Reverse Proxy: Deployed and secured on surface
  • Storage Configuration: SMB/NFS hybrid complete
  • Service Analysis: Complete with security hardening
  • Node Renaming: lenovo410 (formerly jonathan-2518f5u)
  • Backup Infrastructure: Comprehensive system with RAID-1

🔄 NEXT STEPS

  • Service Migration: Move services to Docker Swarm
  • Database Services: Deploy PostgreSQL and MariaDB
  • Monitoring Stack: Deploy Grafana + Netdata
  • GPU Acceleration: Configure for Jellyfin/Immich
  • Paperless Services: Already done; Paperless-NGX and Paperless-AI are running on OMV800

🏗️ INFRASTRUCTURE ARCHITECTURE

Docker Swarm Nodes:

OMV800 (Manager)     - role=storage, cpu=high, memory=high, gpu=false
fedora               - role=compute, cpu=medium, memory=medium, gpu=false  
lenovo410            - role=compute, cpu=medium, memory=medium, gpu=false
audrey               - role=compute, cpu=medium, memory=medium, gpu=false
surface              - role=compute, cpu=medium, memory=medium, gpu=false
lenovo420            - role=ai-ml, cpu=high, memory=high, gpu=true
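
The labels in the table above can be applied with `docker node update --label-add`. The following is a minimal sketch, assuming the node names match what `docker node ls` reports; the helper names `label_cmd` and `print_all_label_cmds` are hypothetical. It only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the `docker node update` command that would apply
# the given key=value labels to one node. Nothing is executed here.
label_cmd() {
  local node="$1"; shift
  local args=()
  local kv
  for kv in "$@"; do
    args+=(--label-add "$kv")
  done
  echo "docker node update ${args[*]} $node"
}

# Label sets copied from the node table; node names are assumed to match
# the hostnames shown by `docker node ls`.
print_all_label_cmds() {
  label_cmd OMV800    role=storage cpu=high   memory=high   gpu=false
  label_cmd fedora    role=compute cpu=medium memory=medium gpu=false
  label_cmd lenovo410 role=compute cpu=medium memory=medium gpu=false
  label_cmd audrey    role=compute cpu=medium memory=medium gpu=false
  label_cmd surface   role=compute cpu=medium memory=medium gpu=false
  label_cmd lenovo420 role=ai-ml   cpu=high   memory=high   gpu=true
}
```

After reviewing the output of `print_all_label_cmds`, the lines could be run on the manager, for example by pasting them into a shell on OMV800.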

Networks:

  • swarm-public: Overlay network for service communication
  • database-network: For database services
  • monitoring-network: For monitoring services
  • ingress: Swarm's built-in routing-mesh network for published ports
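
As a rough sketch, the three custom networks listed above could be created idempotently on the manager as shown below. The function names are hypothetical, and the `DOCKER` variable exists only so the CLI can be swapped for a stub during a dry run. `ingress` is left out because Swarm creates it when the swarm is initialised.

```shell
#!/usr/bin/env bash
# DOCKER can be overridden (for example with a stub) for dry runs;
# it defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# Create an overlay network only if it does not already exist.
ensure_overlay() {
  local name="$1"
  if $DOCKER network inspect "$name" >/dev/null 2>&1; then
    echo "exists: $name"
  else
    $DOCKER network create --driver overlay "$name" >/dev/null && echo "created: $name"
  fi
}

# Network names taken from the list above.
ensure_all_networks() {
  local n
  for n in swarm-public database-network monitoring-network; do
    ensure_overlay "$n"
  done
}
```

Running `ensure_all_networks` a second time should report every network as already existing, which makes the script safe to re-run.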

Reverse Proxy:

  • Caddy: Running on surface (192.168.50.254)
  • SSL: Automatic certificates via DuckDNS
  • Security: High-risk services removed from external access

Storage Infrastructure:

  • SMB/NFS Hybrid: Both protocols available
  • Exports Available: adguard, appflowy, caddy, homeassistant, immich, jellyfin, media, nextcloud, ollama, paperless, vaultwarden
  • Permissions: Properly configured for service access

Backup Infrastructure:

  • Primary Storage: raspberrypi with 7.3 TB RAID-1 array
  • Automated Backups: Comprehensive backup system with validation
  • Offsite Capability: Cloud integration ready
  • Restoration Testing: Automated verification procedures
  • Discovery Complete: Comprehensive backup targets identified
  • Backup Size: 1-15 GB estimated total
  • Critical Data: Databases, volumes, configurations, secrets, user data

🚀 IMMEDIATE ACTIONS

1. Deploy Database Services

# Deploy PostgreSQL and MariaDB on OMV800
ssh root@omv800.local "cd /opt/stacks/databases && docker stack deploy -c postgresql.yml databases"
ssh root@omv800.local "cd /opt/stacks/databases && docker stack deploy -c mariadb.yml databases"

2. Migrate Services to Swarm

# Start with simple services first
ssh root@omv800.local "cd /opt/stacks/apps && docker stack deploy -c jellyfin.yml media"

3. Deploy Monitoring

# Deploy basic monitoring stack
ssh root@omv800.local "cd /opt/stacks/monitoring && docker stack deploy -c grafana.yml monitoring"

🔧 DEVELOPMENT WORKFLOW

Service Deployment Process:

  1. Test locally with docker-compose
  2. Convert to stack format
  3. Deploy to swarm with proper labels
  4. Update Caddy if needed
  5. Test access via domain
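
Steps 1, 3 and 5 of the process above can be sketched as one wrapper. This is an illustration under assumptions, not the project's actual script: `deploy_service` and `run` are hypothetical names, `docker-compose config -q` stands in for "test locally" by only validating the file, and the URL is whatever domain Caddy serves for the service.

```shell
#!/usr/bin/env bash
# run: echo the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around steps 1, 3 and 5.
# Arguments: stack file, stack name, public URL to probe afterwards.
deploy_service() {
  local file="$1" stack="$2" url="$3"
  # Step 1 (partial): validate the file locally without printing it.
  run docker-compose -f "$file" config -q || return 1
  # Step 3: deploy to the swarm.
  run docker stack deploy -c "$file" "$stack" || return 1
  # Step 5: fail if the domain does not answer with a successful response.
  run curl -fsS -o /dev/null "$url"
}
```

With `DRY_RUN=1 deploy_service newservice.yml apps https://example.invalid` the three commands are printed instead of executed, which is a cheap way to check the arguments first.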

Configuration Management:

  • Stack files: /opt/stacks/ on OMV800
  • Secrets: Docker Swarm secrets
  • Volumes: NFS/SMB mounts from OMV800
  • Networks: Overlay networks for service communication
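
For the secrets item above, one possible approach is to create each Swarm secret from stdin so the value never appears as a command-line argument. The sketch below assumes it runs on the manager; `create_secret` is a hypothetical helper and any secret name passed to it is illustrative.

```shell
#!/usr/bin/env bash
# DOCKER can be overridden (for example with a stub) for dry runs.
DOCKER="${DOCKER:-docker}"

# Create a Swarm secret from stdin unless one with that name already exists.
# Swarm secrets are immutable, so an existing secret is left untouched.
create_secret() {
  local name="$1"
  if $DOCKER secret inspect "$name" >/dev/null 2>&1; then
    echo "secret already exists: $name" >&2
    return 0
  fi
  # "-" tells `docker secret create` to read the value from stdin.
  $DOCKER secret create "$name" - >/dev/null && echo "created secret: $name"
}
```

A typical call might look like `read -rs pw; printf '%s' "$pw" | create_secret some_db_password`, where the secret name is a placeholder.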

📋 ESSENTIAL FILES

Infrastructure:

  • dev_documentation/infrastructure/SERVICE_ANALYSIS_AND_CADDYFILE.md - Service mapping and routing
  • dev_documentation/infrastructure/HARDWARE_SPECIFICATIONS.md - Hardware details
  • dev_documentation/infrastructure/COMPREHENSIVE_END_STATE_ANALYSIS.md - Optimization strategy

Migration:

  • dev_documentation/migration/COMPREHENSIVE_MIGRATION_ISSUES_REPORT.md - Migration status
  • migration_scripts/scripts/ - Automation scripts
  • stacks/ - Docker Swarm stack files

Monitoring:

  • dev_documentation/monitoring/ - Monitoring configuration
  • configs/monitoring/ - Prometheus/Grafana configs

🛠️ COMMON TASKS

Deploy a New Service:

# 1. Create stack file
vim /opt/stacks/apps/newservice.yml

# 2. Deploy to swarm (use the full path, or cd to /opt/stacks/apps first)
docker stack deploy -c /opt/stacks/apps/newservice.yml apps

# 3. Update Caddy if needed
scp caddyfile.txt jon@192.168.50.254:/tmp/
ssh jon@192.168.50.254 "sudo cp /tmp/caddyfile.txt /etc/caddy/Caddyfile && sudo systemctl reload caddy"
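
A slightly safer variant of step 3 validates the staged file before it replaces the live Caddyfile. This is a sketch, assuming the `caddy` binary is on the PATH of the remote user; `caddy_install_cmd` is a hypothetical helper that only builds the remote command string. The adapter is named explicitly because the staged file is called `caddyfile.txt`.

```shell
#!/usr/bin/env bash
# Build the remote command: validate the staged file first, and only install
# and reload if validation succeeds.
caddy_install_cmd() {
  local staged="$1"
  echo "caddy validate --config $staged --adapter caddyfile && sudo cp $staged /etc/caddy/Caddyfile && sudo systemctl reload caddy"
}
```

It could then be used as `ssh jon@192.168.50.254 "$(caddy_install_cmd /tmp/caddyfile.txt)"`, so a syntax error stops the chain before the running configuration is touched.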

Check Service Status:

# Check all services
ssh root@omv800.local "docker service ls"

# Check specific service (in a stack, names take the form <stack>_<service>)
ssh root@omv800.local "docker service ps servicename"

# Check logs
ssh root@omv800.local "docker service logs servicename"
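
To spot unhealthy services quickly, the replica counts from `docker service ls` can be filtered. The helper below is a hypothetical sketch: it reads `name replicas` pairs on stdin and prints only the services whose running count differs from the desired count.

```shell
#!/usr/bin/env bash
# Read "name replicas" lines (for example "media_jellyfin 0/1") on stdin and
# print the names of services that are not at their desired replica count.
degraded_services() {
  awk '{
    split($2, r, "/")            # r[1] = running, r[2] = desired
    if (r[1] != r[2]) print $1
  }'
}
```

One way to feed it is `ssh root@omv800.local "docker service ls --format '{{.Name}} {{.Replicas}}'" | degraded_services`; empty output means every service is at its target.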

Scale Services:

# Scale a service
ssh root@omv800.local "docker service scale servicename=3"

# Update service
ssh root@omv800.local "docker service update --image newimage:tag servicename"

🚨 EMERGENCY PROCEDURES

Service Down:

# Check service status
ssh root@omv800.local "docker service ls"

# Restart service (forces all of its tasks to be redeployed)
ssh root@omv800.local "docker service update --force servicename"

# Check logs
ssh root@omv800.local "docker service logs servicename"

Node Issues:

# Check node status
ssh root@omv800.local "docker node ls"

# Drain node (move services away)
ssh root@omv800.local "docker node update --availability drain nodename"

# Remove node (only works once the node is down or has left the swarm)
ssh root@omv800.local "docker node rm nodename"
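
Draining is asynchronous, so it can help to confirm that no tasks remain before a node is shut down or removed. The following is a minimal sketch to run on the manager; `drain_and_check` is a hypothetical helper and `DOCKER` is overridable only to allow dry runs with a stub.

```shell
#!/usr/bin/env bash
# DOCKER can be overridden (for example with a stub) for dry runs.
DOCKER="${DOCKER:-docker}"

# Drain a node, then report whether any tasks are still meant to run on it.
drain_and_check() {
  local node="$1" remaining
  $DOCKER node update --availability drain "$node" >/dev/null || return 1
  # Count tasks whose desired state is still "running" on this node.
  remaining=$($DOCKER node ps --filter desired-state=running -q "$node" | wc -l | tr -d ' ')
  if [ "$remaining" -eq 0 ]; then
    echo "$node drained; safe to take down"
  else
    echo "$node still has $remaining running task(s); wait and re-check"
    return 1
  fi
}
```

If the function reports remaining tasks, calling it again after a short wait is harmless because setting the availability to drain a second time changes nothing.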

Caddy Issues:

# Check Caddy status
ssh jon@192.168.50.254 "sudo systemctl status caddy"

# Restart Caddy
ssh jon@192.168.50.254 "sudo systemctl restart caddy"

# Check logs
ssh jon@192.168.50.254 "sudo journalctl -u caddy -f"

⚠️ IMPORTANT WARNINGS

Security:

  • Never expose system management interfaces externally
  • Use secrets for all passwords and API keys
  • Keep AdGuard Home local-only for DNS security
  • Monitor access to sensitive services

Data Safety:

  • Backup before major changes
  • Test migrations on non-critical services first
  • Verify data integrity after service moves
  • Keep original configurations as backup
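
One way to verify data integrity after a service move is to record checksums before the move and compare afterwards. This sketch assumes GNU coreutils and findutils on the hosts; `make_manifest` and `verify_manifest` are hypothetical helper names, and the manifest path must be absolute because the check runs from inside the data directory.

```shell
#!/usr/bin/env bash
# Print a SHA-256 checksum for every file under a directory,
# with paths relative to that directory.
make_manifest() {
  ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum )
}

# Compare a directory against a manifest (absolute path).
# Succeeds only if every listed file is present and unchanged.
verify_manifest() {
  ( cd "$1" && sha256sum --quiet -c "$2" )
}
```

The manifest would be created on the source before the move and checked against the new location afterwards. Files added after the manifest was written are not detected by this check.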

Performance:

  • Monitor resource usage during migration
  • Scale gradually to avoid overwhelming nodes
  • Test under load before going live
  • Have rollback plan ready

📞 SUPPORT CONTACTS

Infrastructure:

  • OMV800: Primary storage and database host
  • surface: Caddy reverse proxy
  • lenovo410: Home automation services
  • lenovo420: AI/ML processing
  • audrey: Monitoring services
  • fedora: Development and automation

Access Methods:

  • SSH: Use inventory.ini for correct usernames
  • Web: Services accessible via Caddy domains
  • Monitoring: Uptime Kuma for service status

Status: READY FOR SERVICE MIGRATION 🚀
Last Updated: 2025-08-29
Next Review: After database deployment