HomeAudit/migration_scripts/scripts/deploy_traefik.sh
admin 705a2757c1 Major infrastructure migration and Vaultwarden PostgreSQL troubleshooting
COMPREHENSIVE CHANGES:

INFRASTRUCTURE MIGRATION:
- Migrated services to Docker Swarm on OMV800 (192.168.50.229)
- Deployed PostgreSQL database for Vaultwarden migration
- Updated all stack configurations for Docker Swarm compatibility
- Added comprehensive monitoring stack (Prometheus, Grafana, Blackbox)
- Implemented proper secret management for all services

VAULTWARDEN POSTGRESQL MIGRATION:
- Attempted migration from SQLite to PostgreSQL for NFS compatibility
- Created PostgreSQL stack with proper user/password configuration
- Built custom Vaultwarden image with PostgreSQL support
- Troubleshot persistent SQLite fallback issue despite PostgreSQL config
- Identified known issue where Vaultwarden silently falls back to SQLite
- Added ENABLE_DB_WAL=false to prevent filesystem compatibility issues
- Current status: Old Vaultwarden on lenovo410 still working, new one has config issues

PAPERLESS SERVICES:
- Successfully deployed Paperless-NGX and Paperless-AI on OMV800
- Both services running on ports 8000 and 3000 respectively
- Caddy configuration updated for external access
- Services accessible via paperless.pressmess.duckdns.org and paperless-ai.pressmess.duckdns.org

CADDY CONFIGURATION:
- Updated Caddyfile on Surface (192.168.50.254) for new service locations
- Fixed Vaultwarden reverse proxy to point to new Docker Swarm service
- Removed old notification hub reference that was causing conflicts
- All services properly configured for external access via DuckDNS

BACKUP AND DISCOVERY:
- Created comprehensive backup system for all hosts
- Generated detailed discovery reports for infrastructure analysis
- Implemented automated backup validation scripts
- Created migration progress tracking and verification reports

MONITORING STACK:
- Deployed Prometheus, Grafana, and Blackbox monitoring
- Created infrastructure and system overview dashboards
- Added proper service discovery and alerting configuration
- Implemented performance monitoring for all critical services

DOCUMENTATION:
- Reorganized documentation into logical structure
- Created comprehensive migration playbook and troubleshooting guides
- Added hardware specifications and optimization recommendations
- Documented all configuration changes and service dependencies

CURRENT STATUS:
- Paperless services: Working and accessible externally
- Vaultwarden: PostgreSQL configuration issues, old instance still working
- Monitoring: Deployed and operational
- Caddy: Updated and working for external access
- PostgreSQL: Database running, connection issues with Vaultwarden

NEXT STEPS:
- Continue troubleshooting Vaultwarden PostgreSQL configuration
- Consider alternative approaches for Vaultwarden migration
- Validate all external service access
- Complete final migration validation

TECHNICAL NOTES:
- Used Docker Swarm for orchestration on OMV800
- Implemented proper secret management for sensitive data
- Added comprehensive logging and monitoring
- Created automated backup and validation scripts
2025-08-30 20:18:44 -04:00

#!/bin/bash
# Deploy Traefik Reverse Proxy
# This script deploys Traefik with SSL, security, and monitoring
set -euo pipefail
echo "🌐 Deploying Traefik reverse proxy..."
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Functions to print colored output
print_status() {
    echo -e "${GREEN}[INFO]${NC} $1"
}
print_warning() {
    echo -e "${YELLOW}[WARNING]${NC} $1"
}
print_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}
# Configuration
MANAGER_HOST="omv800"
TRAEFIK_CONFIG_DIR="/opt/migration/configs/traefik"
DOMAIN="yourdomain.com"
EMAIL="admin@yourdomain.com"
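# Optional guard (a sketch, not part of the original script): DOMAIN and EMAIL
# above are still template placeholders, and deploying them unchanged would ask
# Let's Encrypt for certificates on a domain you do not control. The helper
# name is_placeholder_value is hypothetical.
is_placeholder_value() {
    # Succeeds when the value is empty or still contains the template domain.
    case "$1" in
        ""|*yourdomain.com*) return 0 ;;
        *) return 1 ;;
    esac
}
# Example use, left commented so the script's behavior is unchanged:
# if is_placeholder_value "$DOMAIN" || is_placeholder_value "$EMAIL"; then
#     echo "Set DOMAIN and EMAIL before deploying" >&2
#     exit 1
# fi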
# 1. Create Traefik configuration directory
print_status "Step 1: Creating Traefik configuration directory..."
mkdir -p "$TRAEFIK_CONFIG_DIR"
mkdir -p "$TRAEFIK_CONFIG_DIR/dynamic"
mkdir -p "$TRAEFIK_CONFIG_DIR/certificates"
# 2. Create Traefik static configuration
print_status "Step 2: Creating Traefik static configuration..."
cat > "$TRAEFIK_CONFIG_DIR/traefik.yml" << EOF
# Traefik Static Configuration
global:
  checkNewVersion: false
  sendAnonymousUsage: false
api:
  dashboard: true
  insecure: false
entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
          permanent: true
  websecure:
    address: ":443"
    http:
      tls:
        certResolver: letsencrypt
        # No wildcard domains are requested here: Let's Encrypt issues
        # wildcard certificates only through the DNS-01 challenge, and this
        # resolver uses httpChallenge. Certificates are requested per router
        # from the Host() rules instead.
providers:
  # Traefik v3 replaced providers.docker.swarmMode with a separate swarm provider
  swarm:
    exposedByDefault: false
    network: traefik-public
    watch: true
  file:
    directory: /etc/traefik/dynamic
    watch: true
certificatesResolvers:
  letsencrypt:
    acme:
      email: ${EMAIL}
      storage: /certificates/acme.json
      httpChallenge:
        entryPoint: web
log:
  level: INFO
  format: json
accessLog:
  filePath: /var/log/traefik/access.log
  format: json
  fields:
    defaultMode: keep
    headers:
      defaultMode: keep
metrics:
  prometheus:
    addEntryPointsLabels: true
    addServicesLabels: true
    buckets:
      - 0.1
      - 0.3
      - 1.2
      - 5.0
ping:
  entryPoint: web
EOF
# 3. Create dynamic configuration
print_status "Step 3: Creating dynamic configuration..."
# Copy middleware configuration
cp "$(dirname "$0")/../configs/traefik/dynamic/middleware.yml" "$TRAEFIK_CONFIG_DIR/dynamic/"
# Create service-specific configurations
cat > "$TRAEFIK_CONFIG_DIR/dynamic/services.yml" << EOF
# Service-specific configurations
http:
  routers:
    # Immich Photo Management
    immich-api:
      rule: "Host(\`immich.${DOMAIN}\`) && PathPrefix(\`/api\`)"
      service: immich-api
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - rate-limit@file
        - cors@file
    immich-web:
      rule: "Host(\`immich.${DOMAIN}\`)"
      service: immich-web
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - rate-limit@file
        - compression@file
    # Jellyfin Media Server
    jellyfin:
      rule: "Host(\`jellyfin.${DOMAIN}\`)"
      service: jellyfin
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - rate-limit@file
        - compression@file
    # Home Assistant
    homeassistant:
      rule: "Host(\`home.${DOMAIN}\`)"
      service: homeassistant
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - rate-limit@file
        - websocket@file
    # AppFlowy Collaboration
    appflowy:
      rule: "Host(\`appflowy.${DOMAIN}\`)"
      service: appflowy
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - rate-limit@file
        - cors@file
    # Paperless Document Management
    paperless:
      rule: "Host(\`paperless.${DOMAIN}\`)"
      service: paperless
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - rate-limit@file
        - auth@file
    # Portainer Container Management
    portainer:
      rule: "Host(\`portainer.${DOMAIN}\`)"
      service: portainer
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - auth@file
        - ip-whitelist@file
    # Grafana Monitoring
    grafana:
      rule: "Host(\`grafana.${DOMAIN}\`)"
      service: grafana
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - auth@file
        - ip-whitelist@file
    # Prometheus Metrics
    prometheus:
      rule: "Host(\`prometheus.${DOMAIN}\`)"
      service: prometheus
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - auth@file
        - ip-whitelist@file
    # Uptime Kuma Monitoring
    uptime-kuma:
      rule: "Host(\`uptime.${DOMAIN}\`)"
      service: uptime-kuma
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      middlewares:
        - security-headers@file
        - auth@file
        - ip-whitelist@file
  services:
    # Service definitions will be auto-discovered by Docker provider
    # These are fallback definitions for external services
    # Error service for maintenance pages
    error-service:
      loadBalancer:
        servers:
          - url: "http://error-page:8080"
    # Auth service for forward authentication
    auth-service:
      loadBalancer:
        servers:
          - url: "http://auth-service:8080"
EOF
# 4. Create users file for basic auth
print_status "Step 4: Creating users file for basic auth..."
cat > "$TRAEFIK_CONFIG_DIR/users" << EOF
# Basic Auth Users
# Format: username:hashed_password
# Generate a bcrypt entry with: htpasswd -nbB username password
# The hashes below are placeholders; replace them before deploying.
admin:\$2y\$10\$92IXUNpkjO0rOQ5byMi.Ye4oKoEa3Ro9llC/.og/at2.uheWG/igi
migration:\$2y\$10\$92IXUNpkjO0rOQ5byMi.Ye4oKoEa3Ro9llC/.og/at2.uheWG/igi
EOF
# 5. Set proper permissions
print_status "Step 5: Setting proper permissions..."
chmod 600 "$TRAEFIK_CONFIG_DIR/users"
chmod 644 "$TRAEFIK_CONFIG_DIR/traefik.yml"
chmod 644 "$TRAEFIK_CONFIG_DIR/dynamic/"*.yml
# 6. Deploy Traefik stack
print_status "Step 6: Deploying Traefik stack..."
cd "$TRAEFIK_CONFIG_DIR"
# Create docker-compose file for deployment
cat > "docker-compose.yml" << EOF
version: '3.8'
services:
  traefik:
    image: traefik:v3.0
    command:
      # API and dashboard
      - --api.dashboard=true
      - --api.insecure=false
      # Swarm provider (Traefik v3 no longer accepts providers.docker.swarmMode)
      - --providers.swarm=true
      - --providers.swarm.exposedbydefault=false
      - --providers.swarm.network=traefik-public
      # Entry points
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --entrypoints.web.http.redirections.entrypoint.to=websecure
      - --entrypoints.web.http.redirections.entrypoint.scheme=https
      # SSL/TLS configuration
      - --certificatesresolvers.letsencrypt.acme.email=${EMAIL}
      - --certificatesresolvers.letsencrypt.acme.storage=/certificates/acme.json
      - --certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web
      # Security headers and rate limiting (one comma-separated list;
      # repeating the flag would keep only the last value)
      - --entrypoints.websecure.http.middlewares=security-headers@file,rate-limit@file
      # Logging
      - --log.level=INFO
      - --accesslog=true
      - --accesslog.filepath=/var/log/traefik/access.log
      - --accesslog.format=json
      # Metrics
      - --metrics.prometheus=true
      - --metrics.prometheus.addEntryPointsLabels=true
      - --metrics.prometheus.addServicesLabels=true
      # Health checks
      - --ping=true
      - --ping.entryPoint=web
      # File provider for dynamic configuration
      - --providers.file.directory=/etc/traefik/dynamic
      - --providers.file.watch=true
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080" # Dashboard (internal only)
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik-certificates:/certificates
      - traefik-logs:/var/log/traefik
      - ./dynamic:/etc/traefik/dynamic:ro
      - ./traefik.yml:/etc/traefik/traefik.yml:ro
      - ./users:/etc/traefik/users:ro
    networks:
      - traefik-public
    deploy:
      placement:
        constraints:
          - node.role == manager
        preferences:
          - spread: node.labels.zone
      replicas: 2
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
        reservations:
          memory: 256M
          cpus: '0.25'
      labels:
        # Traefik dashboard
        - "traefik.enable=true"
        - "traefik.http.routers.traefik-dashboard.rule=Host(\`traefik.${DOMAIN}\`)"
        - "traefik.http.routers.traefik-dashboard.entrypoints=websecure"
        - "traefik.http.routers.traefik-dashboard.tls.certresolver=letsencrypt"
        - "traefik.http.routers.traefik-dashboard.service=api@internal"
        - "traefik.http.routers.traefik-dashboard.middlewares=auth@file"
        # Health check
        - "traefik.http.routers.traefik-health.rule=PathPrefix(\`/ping\`)"
        - "traefik.http.routers.traefik-health.entrypoints=web"
        - "traefik.http.routers.traefik-health.service=ping@internal"
        # Metrics
        - "traefik.http.routers.traefik-metrics.rule=Host(\`traefik.${DOMAIN}\`) && PathPrefix(\`/metrics\`)"
        - "traefik.http.routers.traefik-metrics.entrypoints=websecure"
        - "traefik.http.routers.traefik-metrics.tls.certresolver=letsencrypt"
        - "traefik.http.routers.traefik-metrics.service=prometheus@internal"
        - "traefik.http.routers.traefik-metrics.middlewares=auth@file"
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
      rollback_config:
        parallelism: 1
        delay: 5s
        order: stop-first
volumes:
  traefik-certificates:
    driver: local
  traefik-logs:
    driver: local
networks:
  traefik-public:
    external: true
EOF
# 7. Deploy the stack
print_status "Step 7: Deploying Traefik stack..."
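# The compose file above declares traefik-public as an external network, so
# "docker stack deploy" fails if that network does not exist yet. A minimal
# sketch of a pre-flight check; ensure_overlay_network is a hypothetical
# helper, and its trailing arguments are the command prefix used to reach the
# Docker CLI.
ensure_overlay_network() {
    local network="$1"
    shift
    # Nothing to do when the network is already present.
    if "$@" network inspect "$network" >/dev/null 2>&1; then
        return 0
    fi
    "$@" network create --driver overlay --attachable "$network" >/dev/null
}
# Example use, left commented so the script's behavior is unchanged:
# ensure_overlay_network traefik-public ssh "$MANAGER_HOST" docker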
ssh "$MANAGER_HOST" "cd $TRAEFIK_CONFIG_DIR && docker stack deploy -c docker-compose.yml traefik"
# 8. Wait for deployment
print_status "Step 8: Waiting for deployment to complete..."
sleep 30
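# A fixed sleep can be too short or needlessly long. One possible alternative
# is to poll the replica count until it matches the desired value. This is a
# sketch; replicas_ready and wait_for_replicas are hypothetical helper names.
replicas_ready() {
    # Expects the REPLICAS column, e.g. "2/2" or "2/2 (max 1 per node)".
    local counts running desired
    counts="${1%% *}"
    case "$counts" in */*) ;; *) return 1 ;; esac
    running="${counts%%/*}"
    desired="${counts##*/}"
    case "$running" in ""|*[!0-9]*) return 1 ;; esac
    case "$desired" in ""|*[!0-9]*) return 1 ;; esac
    [ "$desired" -gt 0 ] && [ "$running" -eq "$desired" ]
}
wait_for_replicas() {
    # $1: service name, $2: number of 5-second attempts (default 30)
    local service="$1" attempts="${2:-30}" i=0 replicas
    while [ "$i" -lt "$attempts" ]; do
        replicas=$(ssh "$MANAGER_HOST" \
            "docker service ls --filter name=$service --format '{{.Replicas}}'" | head -n 1) || true
        if replicas_ready "$replicas"; then
            return 0
        fi
        sleep 5
        i=$((i + 1))
    done
    return 1
}
# Example use, left commented so the script's behavior is unchanged:
# wait_for_replicas traefik_traefik || print_warning "Traefik replicas not ready"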
# 9. Verify deployment
print_status "Step 9: Verifying deployment..."
ssh "$MANAGER_HOST" "docker service ls | grep traefik"
ssh "$MANAGER_HOST" "docker service ps traefik_traefik"
# 10. Test Traefik health
print_status "Step 10: Testing Traefik health..."
sleep 10
# Test HTTP to HTTPS redirect
if curl -s -I "http://$MANAGER_HOST" | grep -q "301\|302"; then
    print_status "✅ HTTP to HTTPS redirect working"
else
    print_warning "⚠️ HTTP to HTTPS redirect may not be working"
fi
# Test Traefik dashboard (internal)
if curl -s "http://$MANAGER_HOST:8080/api/rawdata" | grep -q "traefik"; then
    print_status "✅ Traefik dashboard accessible"
else
    print_warning "⚠️ Traefik dashboard may not be accessible"
fi
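# grep "301\|302" matches those digits anywhere in the response headers (for
# example inside a Content-Length value). A stricter variant, sketched here
# with hypothetical helper names, compares only the numeric status code.
is_redirect_status() {
    case "$1" in
        301|302|307|308) return 0 ;;
        *) return 1 ;;
    esac
}
http_status() {
    # Prints only the status code; curl prints 000 when the request fails.
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" || true
}
# Example use, left commented so the script's behavior is unchanged:
# if is_redirect_status "$(http_status "http://$MANAGER_HOST")"; then
#     print_status "HTTP to HTTPS redirect working"
# fi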
# 11. Create health check script
print_status "Step 11: Creating health check script..."
cat > "/opt/migration/scripts/check_traefik_health.sh" << 'EOF'
#!/bin/bash
# Check Traefik Health
set -euo pipefail
MANAGER_HOST="omv800"
DOMAIN="yourdomain.com"
echo "🏥 Checking Traefik health..."
# Check service status
echo "📋 Service status:"
ssh "$MANAGER_HOST" "docker service ls | grep traefik"
# Check service tasks
echo "🔧 Service tasks:"
ssh "$MANAGER_HOST" "docker service ps traefik_traefik"
# Check logs
echo "📝 Recent logs:"
ssh "$MANAGER_HOST" "docker service logs --tail 20 traefik_traefik"
# Test HTTP redirect
echo "🔄 Testing HTTP redirect:"
if curl -s -I "http://$MANAGER_HOST" | grep -q "301\|302"; then
    echo "✅ HTTP to HTTPS redirect working"
else
    echo "❌ HTTP to HTTPS redirect not working"
fi
# Test dashboard
echo "📊 Testing dashboard:"
if curl -s "http://$MANAGER_HOST:8080/api/rawdata" | grep -q "traefik"; then
    echo "✅ Traefik dashboard accessible"
else
    echo "❌ Traefik dashboard not accessible"
fi
# Test SSL certificate
echo "🔒 Testing SSL certificate:"
if curl -s -I "https://$MANAGER_HOST" | grep -q "HTTP/2\|HTTP/1.1 200"; then
    echo "✅ SSL certificate working"
else
    echo "❌ SSL certificate not working"
fi
echo "✅ Traefik health check completed"
EOF
chmod +x "/opt/migration/scripts/check_traefik_health.sh"
# 12. Create configuration summary
print_status "Step 12: Creating configuration summary..."
cat > "/opt/migration/traefik_summary.txt" << EOF
Traefik Deployment Summary
Generated: $(date)
Configuration:
Domain: ${DOMAIN}
Email: ${EMAIL}
Manager Host: ${MANAGER_HOST}
Services Configured:
- Immich Photo Management: https://immich.${DOMAIN}
- Jellyfin Media Server: https://jellyfin.${DOMAIN}
- Home Assistant: https://home.${DOMAIN}
- AppFlowy Collaboration: https://appflowy.${DOMAIN}
- Paperless Documents: https://paperless.${DOMAIN}
- Portainer Management: https://portainer.${DOMAIN}
- Grafana Monitoring: https://grafana.${DOMAIN}
- Prometheus Metrics: https://prometheus.${DOMAIN}
- Uptime Kuma: https://uptime.${DOMAIN}
- Traefik Dashboard: https://traefik.${DOMAIN}
Security Features:
- SSL/TLS with Let's Encrypt
- Security headers
- Rate limiting
- Basic authentication
- IP whitelisting
- CORS support
Monitoring:
- Prometheus metrics
- Access logging
- Health checks
- Dashboard
Configuration Files:
- Static config: ${TRAEFIK_CONFIG_DIR}/traefik.yml
- Dynamic config: ${TRAEFIK_CONFIG_DIR}/dynamic/
- Users file: ${TRAEFIK_CONFIG_DIR}/users
- Health check: /opt/migration/scripts/check_traefik_health.sh
Next Steps:
1. Update DNS records to point to ${MANAGER_HOST}
2. Test SSL certificate generation
3. Deploy monitoring stack
4. Begin service migration
EOF
print_status "✅ Traefik deployment completed successfully!"
print_status "📋 Configuration summary saved to: /opt/migration/traefik_summary.txt"
print_status "🔧 Health check script: /opt/migration/scripts/check_traefik_health.sh"
echo ""
print_status "Next steps:"
echo " 1. Update DNS: Point *.${DOMAIN} to ${MANAGER_HOST}"
echo " 2. Test SSL: ./scripts/check_traefik_health.sh"
echo " 3. Deploy monitoring: ./scripts/setup_monitoring.sh"
echo " 4. Begin migration: ./scripts/start_migration.sh"