Major infrastructure migration and Vaultwarden PostgreSQL troubleshooting

COMPREHENSIVE CHANGES:

INFRASTRUCTURE MIGRATION:
- Migrated services to Docker Swarm on OMV800 (192.168.50.229)
- Deployed PostgreSQL database for Vaultwarden migration
- Updated all stack configurations for Docker Swarm compatibility
- Added comprehensive monitoring stack (Prometheus, Grafana, Blackbox)
- Implemented proper secret management for all services

VAULTWARDEN POSTGRESQL MIGRATION:
- Attempted migration from SQLite to PostgreSQL for NFS compatibility
- Created PostgreSQL stack with proper user/password configuration
- Built custom Vaultwarden image with PostgreSQL support
- Troubleshot persistent SQLite fallback issue despite PostgreSQL config
- Identified known issue where Vaultwarden silently falls back to SQLite
- Added ENABLE_DB_WAL=false to prevent filesystem compatibility issues
- Current status: Old Vaultwarden on lenovo410 still working, new one has config issues

PAPERLESS SERVICES:
- Successfully deployed Paperless-NGX and Paperless-AI on OMV800
- Both services running on ports 8000 and 3000 respectively
- Caddy configuration updated for external access
- Services accessible via paperless.pressmess.duckdns.org and paperless-ai.pressmess.duckdns.org

CADDY CONFIGURATION:
- Updated Caddyfile on Surface (192.168.50.254) for new service locations
- Fixed Vaultwarden reverse proxy to point to new Docker Swarm service
- Removed old notification hub reference that was causing conflicts
- All services properly configured for external access via DuckDNS

BACKUP AND DISCOVERY:
- Created comprehensive backup system for all hosts
- Generated detailed discovery reports for infrastructure analysis
- Implemented automated backup validation scripts
- Created migration progress tracking and verification reports

MONITORING STACK:
- Deployed Prometheus, Grafana, and Blackbox monitoring
- Created infrastructure and system overview dashboards
- Added proper service discovery and alerting configuration
- Implemented performance monitoring for all critical services

DOCUMENTATION:
- Reorganized documentation into logical structure
- Created comprehensive migration playbook and troubleshooting guides
- Added hardware specifications and optimization recommendations
- Documented all configuration changes and service dependencies

CURRENT STATUS:
- Paperless services: Working and accessible externally
- Vaultwarden: PostgreSQL configuration issues, old instance still working
- Monitoring: Deployed and operational
- Caddy: Updated and working for external access
- PostgreSQL: Database running, connection issues with Vaultwarden

NEXT STEPS:
- Continue troubleshooting Vaultwarden PostgreSQL configuration
- Consider alternative approaches for Vaultwarden migration
- Validate all external service access
- Complete final migration validation

TECHNICAL NOTES:
- Used Docker Swarm for orchestration on OMV800
- Implemented proper secret management for sensitive data
- Added comprehensive logging and monitoring
- Created automated backup and validation scripts
admin
2025-08-30 20:18:44 -04:00
parent 9ea31368f5
commit 705a2757c1
155 changed files with 16781 additions and 1243 deletions

@@ -0,0 +1,270 @@
# Comprehensive Home Lab Service Inventory Report
**Generated:** 2025-08-23
**Total Devices Audited:** 6 out of 7 (1 unreachable)
**Audit Status:** Complete
## Executive Summary
Your home lab infrastructure consists of **6 audited devices** running **43 Docker containers** and **50+ host-level services**. It combines centralized storage, distributed monitoring, home automation, and development environments.
### Quick Statistics
- **Total Running Containers:** 43 (across 5 hosts)
- **Host-Level Services:** 50+ unique services
- **Web Interfaces:** 15+ admin panels
- **Database Instances:** 6 (PostgreSQL, MariaDB, Redis)
- **Storage Capacity:** 26+ TB (19TB primary + 7.3TB backup)
- **Paperless Services:** Both NGX and AI now running on OMV800
---
## Host-by-Host Service Breakdown
### 1. OMV800 (192.168.50.229) - Primary Storage & Media Server
**OS:** Debian 12 | **Role:** NAS/Media/Document Hub | **Docker Containers:** 19
#### Docker Services (Running)
| Service | Port | Purpose | Status |
|---------|------|---------|--------|
| AdGuard Home | 53, 3000 | DNS filtering & ad blocking | Running |
| Paperless-NGX | 8000 | Document management | ✅ Running |
| Paperless-AI | 3000 | AI document enhancement | ✅ Running |
| Vikunja | 3456 | Task management | Running |
| PostgreSQL | 5432 | Database for Paperless | ⚠️ Restarting |
| Redis | 6379 | Cache/message broker | Running |
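
Two rows in the table above list port 3000 (AdGuard Home and Paperless-AI). Whether that is a real clash depends on how each container publishes its ports, which the table does not record. The following is a minimal Python sketch for spotting such overlaps, assuming the table rows are transcribed by hand into a dictionary; the helper name `find_shared_ports` is hypothetical.

```python
from collections import defaultdict

# Ports as listed in the OMV800 table above (transcribed by hand).
OMV800_SERVICES = {
    "AdGuard Home": [53, 3000],
    "Paperless-NGX": [8000],
    "Paperless-AI": [3000],
    "Vikunja": [3456],
    "PostgreSQL": [5432],
    "Redis": [6379],
}


def find_shared_ports(services):
    """Return {port: [service, ...]} for every port listed by more than one service."""
    users = defaultdict(list)
    for name, ports in services.items():
        for port in ports:
            users[port].append(name)
    return {port: names for port, names in users.items() if len(names) > 1}


if __name__ == "__main__":
    for port, names in sorted(find_shared_ports(OMV800_SERVICES).items()):
        print(f"port {port} is listed by: {', '.join(names)}")
```

A shared number is only a hint: one of the two services may be bound to a different host port or to a specific interface.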
#### Native Services
- **Apache2** - Web server for OMV interface
- **OpenMediaVault** - NAS management
- **Netdata** - System monitoring
- **Tailscale** - VPN mesh networking
- **19TB Storage Array** - Primary file storage
### 2. jonathan-2518f5u (192.168.50.181) - Home Automation Hub
**OS:** Ubuntu 24.04 | **Role:** IoT/Automation Center | **Docker Containers:** 6
#### Docker Services
| Service | Port | Purpose | Status |
|---------|------|---------|--------|
| Home Assistant | 8123 | Smart home automation | Running |
| ESPHome | 6052 | ESP device management | Running |
| Paperless-NGX | 8001 | Document processing | ⚠️ Not running (moved to OMV800) |
| Paperless-AI | 3000 | AI-enhanced docs | ⚠️ Not running (moved to OMV800) |
| Portainer | 9000 | Container management | Running |
| Redis | 6379 | Data broker | Running |
#### Native Services
- **Netdata** (Port 19999) - System monitoring
- **iPerf3** - Network testing
- **Auditd** - Security monitoring
- **Smartmontools** - Disk health monitoring
- **NFS Client** - Storage access to OMV800
### 3. surface (192.168.50.254) - Development & Web Services
**OS:** Ubuntu 24.04 | **Role:** Development/Collaboration | **Docker Containers:** 7
#### Docker Services (AppFlowy Stack)
| Service | Port | Purpose | Status |
|---------|------|---------|--------|
| AppFlowy Cloud | 8000 | Collaboration platform API | Running |
| AppFlowy Web | 80 | Web interface | Running |
| GoTrue | - | Authentication service | Running |
| PostgreSQL | 5432 | AppFlowy database | Running |
| Redis | 6379 | Session cache | Running |
| Nginx | 8080, 8443 | Reverse proxy | Running |
| MinIO | - | Object storage | Running |
#### Native Services
- **Apache HTTP Server** (Port 8888) - Web server
- **MariaDB** (Port 3306) - Database server
- **Caddy** (Port 80, 443) - Reverse proxy
- **PHP 8.2 FPM** - PHP processing
- **Ollama** (Port 11434) - Local LLM service
- **Netdata** (Port 19999) - Monitoring
- **CUPS** - Printing service
- **GNOME Remote Desktop** - Remote access
### 4. raspberrypi (192.168.50.107) - Backup NAS
**OS:** Debian 12 | **Role:** Backup Storage | **Docker Containers:** 0
#### Native Services Only
- **OpenMediaVault** - NAS management interface
- **NFS Server** - Network file sharing (multiple exports)
- **Samba/SMB** (Ports 139, 445) - Windows file sharing
- **Nginx** (Port 80) - OMV web interface
- **Netdata** (Port 19999) - System monitoring
- **Orb** (Port 7443) - Custom service
- **RAID 1 Array** - 7.3TB backup storage
#### Storage Exports
- `/export/audrey_backup`
- `/export/surface_backup`
- `/export/omv800_backup`
- `/export/fedora_backup`
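
As an illustration of how a client could mount these exports, the Python sketch below prints `/etc/fstab` entries for them. The server address is raspberrypi's IP from this report. The `/mnt` mount root and the `defaults,_netdev` options are assumptions chosen for the example, not settings taken from the audited hosts.

```python
SERVER = "192.168.50.107"  # raspberrypi, as listed in this report

EXPORTS = [
    "/export/audrey_backup",
    "/export/surface_backup",
    "/export/omv800_backup",
    "/export/fedora_backup",
]


def fstab_line(export, mount_root="/mnt"):
    """Build one fstab entry for an NFS export.

    mount_root is a hypothetical location; adjust it per client.
    """
    name = export.rsplit("/", 1)[-1]
    # Fields: device, mount point, filesystem type, options, dump, pass
    return f"{SERVER}:{export} {mount_root}/{name} nfs defaults,_netdev 0 0"


if __name__ == "__main__":
    for export in EXPORTS:
        print(fstab_line(export))
```

Each client would normally need only the line for its own backup share.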
### 5. fedora (192.168.50.225) - Development Workstation
**OS:** Fedora 42 | **Role:** Development | **Docker Containers:** 1
#### Docker Services
| Service | Port | Purpose | Status |
|---------|------|---------|--------|
| Portainer Agent | 9001 | Container monitoring | ⚠️ Restarting |
#### Native Services
- **Netdata** (Port 19999) - System monitoring
- **Tailscale** - VPN client
- **Nextcloud WebDAV mount** - Cloud storage access
- **GNOME Desktop** - GUI workstation environment
### 6. audrey (192.168.50.145) - Monitoring Hub
**OS:** Ubuntu 24.04 | **Role:** Monitoring/Admin | **Docker Containers:** 4
#### Docker Services
| Service | Port | Purpose | Status |
|---------|------|---------|--------|
| Portainer Agent | 9001 | Container management | Running |
| Dozzle | 9999 | Docker log viewer | Running |
| Uptime Kuma | 3001 | Service uptime monitoring | Running |
| Code Server | 8443 | Web-based VS Code | Running |
#### Native Services
- **Orb** (Port 7443) - Custom monitoring
- **Tailscale** - VPN mesh networking
- **Fail2ban** - Intrusion prevention
- **NFS Client** - Backup storage access
---
## Network Architecture & Port Summary
### Administrative Interfaces
- **9000** - Portainer (central container management)
- **9001** - Portainer Agents (distributed)
- **3001** - Uptime Kuma (service monitoring)
- **9999** - Dozzle (log aggregation)
- **19999** - Netdata (system monitoring on 4 hosts)
### Home Automation & IoT
- **8123** - Home Assistant (smart home hub)
- **6052** - ESPHome (ESP device management)
- **7443** - Orb sensors (custom monitoring)
### Development & Productivity
- **8443** - Code Server & AppFlowy HTTPS
- **8000** - AppFlowy Cloud API
- **11434** - Ollama (local AI/LLM)
- **3000** - Paperless-AI, AppFlowy Auth
### Document Management
- **8001** - Paperless-NGX (jonathan-2518f5u; not running, moved to OMV800)
- **8000** - Paperless-NGX (OMV800)
- **3456** - Vikunja (task management)
### Database Services
- **5432** - PostgreSQL (surface, OMV800)
- **3306** - MariaDB (surface)
- **6379** - Redis (multiple hosts)
### File Sharing & Storage
- **80** - Nginx/OMV interfaces
- **139/445** - Samba/SMB (raspberrypi)
- **2049** - NFS server (raspberrypi)
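
The port summary can be spot-checked from any machine on the LAN. The sketch below is a minimal example using Python's standard `socket` module; the `CHECKS` list is a hand-picked sample of host and port pairs from this report, and the function name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Sample of (host name, IP, port) entries taken from this report.
CHECKS = [
    ("OMV800", "192.168.50.229", 8000),            # Paperless-NGX
    ("jonathan-2518f5u", "192.168.50.181", 8123),  # Home Assistant
    ("audrey", "192.168.50.145", 3001),            # Uptime Kuma
    ("surface", "192.168.50.254", 11434),          # Ollama
]

if __name__ == "__main__":
    for name, ip, port in CHECKS:
        state = "open" if port_is_open(ip, port) else "closed or filtered"
        print(f"{name} {ip}:{port} {state}")
```

This checks TCP reachability only. It says nothing about whether the expected service is the one answering, and it does not cover UDP traffic such as DNS queries on port 53.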
---
## Installed But Not Running Services
### Package Analysis Summary
Based on package inventories across all hosts:
#### Security Tools (Installed)
- **AIDE** - Advanced Intrusion Detection (OMV800)
- **Fail2ban** - Available on most hosts
- **AppArmor** - Security framework (Ubuntu hosts)
- **Auditd** - Security auditing (audrey, jonathan-2518f5u)
#### Development Tools
- **Apache2** - Installed but not primary on some hosts
- **PHP** versions - Available across multiple hosts
- **Git, build tools** - Standard development stack
- **Docker/Podman** - Container runtimes
#### System Administration
- **Anacron** - Alternative to cron (all hosts)
- **APT tools** - Package management utilities
- **CUPS** - Printing system (available but not always active)
---
## Infrastructure Patterns & Architecture
### 1. **Centralized Storage with Distributed Access**
- **Primary:** OMV800 (19TB) serves files via NFS/SMB
- **Backup:** raspberrypi (7.3TB RAID-1) for redundancy
- **Access:** All hosts mount NFS shares for data access
### 2. **Layered Monitoring Architecture**
- **System Level:** Netdata on 4 hosts
- **Service Level:** Uptime Kuma for availability monitoring
- **Container Level:** Dozzle for log aggregation
- **Application Level:** Custom Orb sensors
### 3. **Hybrid Container Management**
- **Central Control:** Portainer on jonathan-2518f5u
- **Distributed Agents:** Portainer agents on remote hosts
- **Container Distribution:** Services spread based on resource needs
### 4. **Security Mesh Network**
- **Tailscale VPN:** Secure mesh networking across all hosts
- **Segmented Access:** Different hosts serve different functions
- **Monitoring:** Comprehensive logging and intrusion detection
### 5. **Home Automation Integration**
- **Central Hub:** Home Assistant with ESPHome integration
- **Storage Integration:** Document processing with NFS backend
- **Monitoring Integration:** Custom sensors feeding into monitoring stack
---
## Security Assessment
### ✅ Security Strengths
- SSH root login disabled on 4/6 hosts
- Tailscale mesh VPN implemented
- Comprehensive monitoring and logging
- Regular security updates (recent package versions)
- Fail2ban intrusion prevention deployed
### ⚠️ Security Concerns
- **OMV800** & **raspberrypi**: SSH root login enabled
- Some containers are stuck restarting (PostgreSQL on OMV800, Portainer Agent on fedora)
- UFW firewall inactive on some hosts
- Failed SSH attempts logged on surface and audrey
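
The root-login finding can be re-checked per host by reading the SSH daemon configuration. The Python sketch below is a simplified parser under stated assumptions: it reads only the global section of a single file, stops at the first `Match` block, and does not follow `Include` directives. The function name `root_login_setting` is hypothetical. Running `sshd -T` as root remains the authoritative way to see the effective configuration.

```python
def root_login_setting(sshd_config_text):
    """Return the first global PermitRootLogin value, or None if it is not set.

    sshd uses the first value it finds for a keyword, so later duplicates
    are ignored here as well.
    """
    for raw in sshd_config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # keywords are case-insensitive
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        if keyword == "permitrootlogin" and len(parts) == 2:
            return parts[1].strip().lower()
    return None


if __name__ == "__main__":
    try:
        with open("/etc/ssh/sshd_config", encoding="utf-8") as fh:
            value = root_login_setting(fh.read())
    except OSError as exc:
        print(f"could not read sshd_config: {exc}")
    else:
        print(f"PermitRootLogin: {value or 'not set (compiled-in default applies)'}")
```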
### 🔧 Recommended Actions
1. Disable SSH root login on OMV800 and raspberrypi
2. Enable UFW firewall on Ubuntu hosts
3. Investigate container health issues
4. Review SSH access logs for patterns
5. Consider centralizing authentication
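
For the container health item, one possible starting point is to list containers whose status reports a restart loop. The sketch below parses the output of `docker ps --all --format '{{json .}}'`, which prints one JSON object per container. The function name `restarting_containers` is hypothetical, and the script assumes the Docker CLI is installed on the host being checked.

```python
import json
import subprocess


def restarting_containers(ps_json_lines):
    """Return names of containers whose Status field starts with 'Restarting'."""
    names = []
    for line in ps_json_lines.splitlines():
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        if row.get("Status", "").startswith("Restarting"):
            names.append(row.get("Names", "?"))
    return names


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["docker", "ps", "--all", "--format", "{{json .}}"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not query docker: {exc}")
    else:
        for name in restarting_containers(result.stdout):
            print(f"restarting: {name}")
```

Once a container is identified, `docker logs <name>` is the usual next step for finding out why it keeps restarting.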
---
## Summary & Recommendations
Your home lab demonstrates **sophisticated infrastructure management** with well-thought-out service distribution. The combination of centralized storage, distributed monitoring, comprehensive home automation, and development services creates a highly functional environment.
### Key Strengths
- **Comprehensive monitoring** across all layers
- **Redundant storage** with backup strategies
- **Service distribution** optimized for resources
- **Modern containerized** applications
- **Integrated automation** with document management
### Optimization Opportunities
1. **Health Monitoring:** Address container restart issues (PostgreSQL on OMV800, Portainer Agent on fedora)
2. **Security Hardening:** Standardize SSH and firewall configurations
3. **Backup Automation:** Enhance the existing backup infrastructure
4. **Resource Optimization:** Consider workload balancing across hosts
5. **Documentation:** Maintain service dependency mapping
**Total Unique Services Identified:** 60+ distinct services across containerized and native deployments.