COMPREHENSIVE CHANGES:

INFRASTRUCTURE MIGRATION:
- Migrated services to Docker Swarm on OMV800 (192.168.50.229)
- Deployed PostgreSQL database for Vaultwarden migration
- Updated all stack configurations for Docker Swarm compatibility
- Added comprehensive monitoring stack (Prometheus, Grafana, Blackbox)
- Implemented proper secret management for all services

VAULTWARDEN POSTGRESQL MIGRATION:
- Attempted migration from SQLite to PostgreSQL for NFS compatibility
- Created PostgreSQL stack with proper user/password configuration
- Built custom Vaultwarden image with PostgreSQL support
- Troubleshot persistent SQLite fallback issue despite PostgreSQL config
- Identified known issue where Vaultwarden silently falls back to SQLite
- Added ENABLE_DB_WAL=false to prevent filesystem compatibility issues
- Current status: Old Vaultwarden on lenovo410 still working, new one has config issues

PAPERLESS SERVICES:
- Successfully deployed Paperless-NGX and Paperless-AI on OMV800
- Both services running on ports 8000 and 3000 respectively
- Caddy configuration updated for external access
- Services accessible via paperless.pressmess.duckdns.org and paperless-ai.pressmess.duckdns.org

CADDY CONFIGURATION:
- Updated Caddyfile on Surface (192.168.50.254) for new service locations
- Fixed Vaultwarden reverse proxy to point to new Docker Swarm service
- Removed old notification hub reference that was causing conflicts
- All services properly configured for external access via DuckDNS

BACKUP AND DISCOVERY:
- Created comprehensive backup system for all hosts
- Generated detailed discovery reports for infrastructure analysis
- Implemented automated backup validation scripts
- Created migration progress tracking and verification reports

MONITORING STACK:
- Deployed Prometheus, Grafana, and Blackbox monitoring
- Created infrastructure and system overview dashboards
- Added proper service discovery and alerting configuration
- Implemented performance monitoring for all critical services

DOCUMENTATION:
- Reorganized documentation into logical structure
- Created comprehensive migration playbook and troubleshooting guides
- Added hardware specifications and optimization recommendations
- Documented all configuration changes and service dependencies

CURRENT STATUS:
- Paperless services: ✅ Working and accessible externally
- Vaultwarden: ❌ PostgreSQL configuration issues, old instance still working
- Monitoring: ✅ Deployed and operational
- Caddy: ✅ Updated and working for external access
- PostgreSQL: ✅ Database running, connection issues with Vaultwarden

NEXT STEPS:
- Continue troubleshooting Vaultwarden PostgreSQL configuration
- Consider alternative approaches for Vaultwarden migration
- Validate all external service access
- Complete final migration validation

TECHNICAL NOTES:
- Used Docker Swarm for orchestration on OMV800
- Implemented proper secret management for sensitive data
- Added comprehensive logging and monitoring
- Created automated backup and validation scripts
249 lines
7.3 KiB
Markdown
# QUICK START GUIDE - HOMEAUDIT MIGRATION
**Generated:** 2025-08-29
**Status:** READY FOR SERVICE MIGRATION - 99% Complete

---

## 🎯 **PROJECT OVERVIEW**

**Home infrastructure migration to Docker Swarm with optimized service distribution.** All critical infrastructure is now in place and ready for service migration.

---

## 📊 **CURRENT STATUS DASHBOARD**

### **✅ COMPLETED INFRASTRUCTURE**
- **Docker Swarm**: All 6 nodes joined and labeled ✅
- **Caddy Reverse Proxy**: Deployed and secured on surface ✅
- **Storage Configuration**: SMB/NFS hybrid complete ✅
- **Service Analysis**: Complete with security hardening ✅
- **Node Renaming**: lenovo410 (formerly jonathan-2518f5u) ✅
- **Backup Infrastructure**: Comprehensive system with RAID-1 ✅

### **🔄 NEXT STEPS**
- **Service Migration**: Move services to Docker Swarm
- **Database Services**: Deploy PostgreSQL and MariaDB
- **Monitoring Stack**: Deploy Grafana + Netdata
- **GPU Acceleration**: Configure for Jellyfin/Immich
- **Paperless Services**: ✅ Both Paperless-NGX and Paperless-AI now running on OMV800

---

## 🏗️ **INFRASTRUCTURE ARCHITECTURE**

### **Docker Swarm Nodes:**
```
OMV800 (Manager) - role=storage, cpu=high, memory=high, gpu=false
fedora - role=compute, cpu=medium, memory=medium, gpu=false
lenovo410 - role=compute, cpu=medium, memory=medium, gpu=false
audrey - role=compute, cpu=medium, memory=medium, gpu=false
surface - role=compute, cpu=medium, memory=medium, gpu=false
lenovo420 - role=ai-ml, cpu=high, memory=high, gpu=true
```
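
The labels in the table above are the kind of metadata that `docker node update --label-add` attaches to a node. As a minimal sketch, the helper below only prints the commands that would apply them, so it can be reviewed before anything touches the swarm. The function name `label_cmds` is hypothetical, and the node names are assumed to match the hostnames the swarm reports in `docker node ls`.

```bash
# Sketch: print the `docker node update` commands that would apply the labels
# listed above. Nothing is executed here; `label_cmds` is a hypothetical name.

label_cmds() {
  # $1 = node name, remaining arguments = key=value labels
  local node="$1"; shift
  local args=""
  local kv
  for kv in "$@"; do
    args="$args --label-add $kv"
  done
  echo "docker node update$args $node"
}

label_cmds OMV800    role=storage cpu=high   memory=high   gpu=false
label_cmds fedora    role=compute cpu=medium memory=medium gpu=false
label_cmds lenovo420 role=ai-ml   cpu=high   memory=high   gpu=true
```

A stack file could then pin a service to matching nodes with a placement constraint such as `node.labels.gpu == true`.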

### **Networks:**
- **swarm-public**: Overlay network for service communication
- **database-network**: For database services
- **monitoring-network**: For monitoring services
- **ingress**: For ingress traffic
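
Before deploying stacks that reference these networks, it can help to confirm they exist. The sketch below reports which of the named networks are missing. `missing_networks` is a hypothetical helper that reads existing network names on stdin, so it can be tried without a Docker daemon; the `--attachable` flag in the commented command is an assumption, not something the source specifies.

```bash
# Sketch: report which of the overlay networks named above do not exist yet.
# `missing_networks` is a hypothetical helper; input is one network name per line.

WANTED="swarm-public database-network monitoring-network"

missing_networks() {
  local existing net
  existing="$(cat)"
  for net in $WANTED; do
    # -x matches the whole line, so "swarm-public-old" does not count
    if ! printf '%s\n' "$existing" | grep -qx -- "$net"; then
      echo "$net"
    fi
  done
}

# On a manager node the check could feed the create command (GNU xargs assumed):
#   docker network ls --format '{{.Name}}' | missing_networks |
#     xargs -r -n1 docker network create --driver overlay --attachable
# `ingress` is left out because Swarm creates it automatically.
printf 'ingress\nswarm-public\n' | missing_networks
```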

### **Reverse Proxy:**
- **Caddy**: Running on surface (192.168.50.254)
- **SSL**: Certificates obtained automatically by Caddy for the DuckDNS hostnames
- **Security**: High-risk services removed from external access

### **Storage Infrastructure:**
- **SMB/NFS Hybrid**: Both protocols available
- **Exports Available**: adguard, appflowy, caddy, homeassistant, immich, jellyfin, media, nextcloud, ollama, paperless, vaultwarden
- **Permissions**: Properly configured for service access

### **Backup Infrastructure:**
- **Primary Storage**: raspberrypi with 7.3TB RAID-1 array
- **Automated Backups**: Comprehensive backup system with validation
- **Offsite Capability**: Cloud integration ready
- **Restoration Testing**: Automated verification procedures
- **Discovery Complete**: Comprehensive backup targets identified
- **Backup Size**: 1-15GB estimated total
- **Critical Data**: Databases, volumes, configurations, secrets, user data

---

## 🚀 **IMMEDIATE ACTIONS**

### **1. Deploy Database Services**
```bash
# Deploy PostgreSQL and MariaDB on OMV800
ssh root@omv800.local "cd /opt/stacks/databases && docker stack deploy -c postgresql.yml databases"
ssh root@omv800.local "cd /opt/stacks/databases && docker stack deploy -c mariadb.yml databases"
```

### **2. Migrate Services to Swarm**
```bash
# Start with simple services first
ssh root@omv800.local "cd /opt/stacks/apps && docker stack deploy -c jellyfin.yml media"
```

### **3. Deploy Monitoring**
```bash
# Deploy basic monitoring stack
ssh root@omv800.local "cd /opt/stacks/monitoring && docker stack deploy -c grafana.yml monitoring"
```

---

## 🔧 **DEVELOPMENT WORKFLOW**

### **Service Deployment Process:**
1. **Test locally** with docker-compose
2. **Convert to stack** format
3. **Deploy to swarm** with proper labels
4. **Update Caddy** if needed
5. **Test access** via domain
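
Step 5 can be scripted as a simple status-code check. The sketch below classifies an HTTP status as reachable or not. `status_ok` is a hypothetical helper, and treating 401/403 as reachable is a design assumption: those codes show that the proxy route works and the service answered, even though it wants authentication.

```bash
# Sketch for step 5: decide whether an HTTP status code means the proxied
# service is reachable. `status_ok` is a hypothetical helper.

status_ok() {
  case "$1" in
    2??|3??|401|403) return 0 ;;  # service answered through the proxy
    *) return 1 ;;                # 5xx, 404, or curl's 000 on connection failure
  esac
}

# A real check against one of the project's domains might look like:
#   code="$(curl -s -o /dev/null -w '%{http_code}' https://paperless.pressmess.duckdns.org)"
#   status_ok "$code" && echo "reachable" || echo "check Caddy and the service"
for code in 200 302 502 000; do
  if status_ok "$code"; then
    echo "$code reachable"
  else
    echo "$code not reachable"
  fi
done
```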

### **Configuration Management:**
- **Stack files**: `/opt/stacks/` on OMV800
- **Secrets**: Docker Swarm secrets
- **Volumes**: NFS/SMB mounts from OMV800
- **Networks**: Overlay networks for service communication
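
For the secrets entry above, a value can be generated and piped straight into `docker secret create`, which reads the secret from stdin when the file argument is `-`. This is a minimal sketch: `gen_secret_value` and the secret name `postgres_password` are hypothetical and not taken from the stack files.

```bash
# Sketch: generate a random hex string suitable for use as a secret value.
# `gen_secret_value` is a hypothetical helper; output length is 2 x bytes.

gen_secret_value() {
  # $1 = number of random bytes (default 16, i.e. 32 hex characters)
  local bytes="${1:-16}"
  od -An -N"$bytes" -tx1 /dev/urandom | tr -d ' \n'
}

pw="$(gen_secret_value 16)"
echo "generated ${#pw} characters"

# On a manager node the value would go directly into the secret, so it never
# lands on disk or in shell history (secret name is an example):
#   gen_secret_value 16 | docker secret create postgres_password -
```

A service then receives the value as a file under `/run/secrets/`, which is why many images accept a `*_FILE` variant of their password variables.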

---

## 📋 **ESSENTIAL FILES**

### **Infrastructure:**
- `dev_documentation/infrastructure/SERVICE_ANALYSIS_AND_CADDYFILE.md` - Service mapping and routing
- `dev_documentation/infrastructure/HARDWARE_SPECIFICATIONS.md` - Hardware details
- `dev_documentation/infrastructure/COMPREHENSIVE_END_STATE_ANALYSIS.md` - Optimization strategy

### **Migration:**
- `dev_documentation/migration/COMPREHENSIVE_MIGRATION_ISSUES_REPORT.md` - Migration status
- `migration_scripts/scripts/` - Automation scripts
- `stacks/` - Docker Swarm stack files

### **Monitoring:**
- `dev_documentation/monitoring/` - Monitoring configuration
- `configs/monitoring/` - Prometheus/Grafana configs

---

## 🛠️ **COMMON TASKS**

### **Deploy a New Service:**
```bash
# 1. Create stack file (on OMV800)
vim /opt/stacks/apps/newservice.yml

# 2. Deploy to swarm (full path, so the command works from any directory)
docker stack deploy -c /opt/stacks/apps/newservice.yml apps

# 3. Update Caddy if needed
scp caddyfile.txt jon@192.168.50.254:/tmp/
ssh jon@192.168.50.254 "sudo cp /tmp/caddyfile.txt /etc/caddy/Caddyfile && sudo systemctl reload caddy"
```
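
Step 3 copies the new Caddyfile into place without checking it first. One possible safeguard, sketched below, is to run `caddy validate` on the uploaded file before the copy. The helper `caddy_update_cmd` is hypothetical and only prints the remote command; it assumes the `caddy` binary is on the proxy host's `PATH` and that validation needs no extra environment variables there.

```bash
# Sketch: build the remote command from step 3 with a validation step in
# front, so a broken Caddyfile is never copied into place.
# `caddy_update_cmd` is a hypothetical helper; it only prints the command.

caddy_update_cmd() {
  # $1 = uploaded file on the proxy host, $2 = live Caddyfile path
  local staged="$1" live="$2"
  # --adapter is passed explicitly because the staged file is not named "Caddyfile"
  echo "caddy validate --config $staged --adapter caddyfile" \
       "&& sudo cp $staged $live" \
       "&& sudo systemctl reload caddy"
}

caddy_update_cmd /tmp/caddyfile.txt /etc/caddy/Caddyfile
# Possible use:
#   ssh jon@192.168.50.254 "$(caddy_update_cmd /tmp/caddyfile.txt /etc/caddy/Caddyfile)"
```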

### **Check Service Status:**
```bash
# Check all services
ssh root@omv800.local "docker service ls"

# Check specific service
ssh root@omv800.local "docker service ps servicename"

# Check logs
ssh root@omv800.local "docker service logs servicename"
```
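
When many services are running, scanning `docker service ls` by eye gets tedious. The sketch below filters for services whose running replica count is below the desired count. `under_replicated` is a hypothetical helper that reads `name replicas` lines on stdin, and the service names in the demo are made-up examples.

```bash
# Sketch: list services whose running replica count is below the desired count.
# `under_replicated` is a hypothetical helper; input lines look like
# "media_jellyfin 0/1" (anything after the second field is ignored).

under_replicated() {
  local name replicas _rest running desired
  while read -r name replicas _rest; do
    [ -n "$replicas" ] || continue   # skip blank or malformed lines
    running="${replicas%%/*}"        # text before the slash
    desired="${replicas##*/}"        # text after the slash
    if [ "$running" -lt "$desired" ]; then
      echo "$name $replicas"
    fi
  done
}

# On the manager this would be fed from the real service list:
#   docker service ls --format '{{.Name}} {{.Replicas}}' | under_replicated
printf 'media_jellyfin 0/1\nmonitoring_grafana 1/1\n' | under_replicated
```

Any service it prints is a candidate for `docker service ps servicename` to see why tasks are not starting.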

### **Scale Services:**
```bash
# Scale a service
ssh root@omv800.local "docker service scale servicename=3"

# Update service
ssh root@omv800.local "docker service update --image newimage:tag servicename"
```

---

## 🚨 **EMERGENCY PROCEDURES**

### **Service Down:**
```bash
# Check service status
ssh root@omv800.local "docker service ls"

# Restart service
ssh root@omv800.local "docker service update --force servicename"

# Check logs
ssh root@omv800.local "docker service logs servicename"
```

### **Node Issues:**
```bash
# Check node status
ssh root@omv800.local "docker node ls"

# Drain node (move services away)
ssh root@omv800.local "docker node update --availability drain nodename"

# Remove node (drain it first; the node must be down or have left the swarm,
# otherwise the command is rejected unless --force is added)
ssh root@omv800.local "docker node rm nodename"
```

### **Caddy Issues:**
```bash
# Check Caddy status
ssh jon@192.168.50.254 "sudo systemctl status caddy"

# Restart Caddy
ssh jon@192.168.50.254 "sudo systemctl restart caddy"

# Check logs
ssh jon@192.168.50.254 "sudo journalctl -u caddy -f"
```

---

## ⚠️ **IMPORTANT WARNINGS**

### **Security:**
- **Never expose** system management interfaces externally
- **Use secrets** for all passwords and API keys
- **Keep AdGuard Home** local-only for DNS security
- **Monitor access** to sensitive services

### **Data Safety:**
- **Backup before** major changes
- **Test migrations** on non-critical services first
- **Verify data integrity** after service moves
- **Keep original** configurations as backup
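
The "backup before" and "verify data integrity" points above can be combined into a small routine: archive a directory, record its checksum, and check both later. This is only a sketch and not the project's backup system. `backup_dir` and `verify_backup` are hypothetical helpers, and `sha256sum` from GNU coreutils (or a compatible implementation) is assumed.

```bash
# Sketch: archive a directory before a change and verify the archive later.
# `backup_dir` and `verify_backup` are hypothetical helpers; paths are examples.

backup_dir() {
  # $1 = directory to back up, $2 = destination archive (.tar.gz)
  local src="$1" dest="$2"
  tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")" &&
    sha256sum "$dest" > "$dest.sha256"
}

verify_backup() {
  # $1 = archive; succeeds only if the checksum matches and tar can list it
  local archive="$1"
  sha256sum -c "$archive.sha256" > /dev/null && tar -tzf "$archive" > /dev/null
}

# Demo in a throwaway directory so the sketch runs anywhere.
work="$(mktemp -d)"
mkdir -p "$work/config"
echo "example" > "$work/config/settings.txt"
backup_dir "$work/config" "$work/config-backup.tar.gz"
verify_backup "$work/config-backup.tar.gz" && echo "backup verified"
```

Running `verify_backup` again after copying the archive to its final location confirms that the transfer did not corrupt it.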

### **Performance:**
- **Monitor resource usage** during migration
- **Scale gradually** to avoid overwhelming nodes
- **Test under load** before going live
- **Have rollback plan** ready

---

## 📞 **SUPPORT CONTACTS**

### **Infrastructure:**
- **OMV800**: Primary storage and database host
- **surface**: Caddy reverse proxy
- **lenovo410**: Home automation services
- **lenovo420**: AI/ML processing
- **audrey**: Monitoring services
- **fedora**: Development and automation

### **Access Methods:**
- **SSH**: Use inventory.ini for correct usernames
- **Web**: Services accessible via Caddy domains
- **Monitoring**: Uptime Kuma for service status

---

**Status: READY FOR SERVICE MIGRATION** 🚀
**Last Updated:** 2025-08-29
**Next Review:** After database deployment