---
phase: 02-backend-services
plan: 04
subsystem: infra
tags: [firebase-functions, cloud-scheduler, health-probes, retention-cleanup, onSchedule]

# Dependency graph
requires:
  - phase: 02-backend-services
    provides: healthProbeService.runAllProbes(), alertService.evaluateAndAlert(), HealthCheckModel.deleteOlderThan(), AlertEventModel.deleteOlderThan(), deleteProcessingEventsOlderThan()
provides:
  - runHealthProbes Cloud Function export (every 5 minutes, separate from processDocumentJobs)
  - runRetentionCleanup Cloud Function export (weekly Monday 02:00, 30-day rolling deletion)
affects: [03-api-layer, 04-frontend, phase-03, phase-04]

# Tech tracking
tech-stack:
  added: []
  patterns:
    - "onSchedule Cloud Functions use dynamic import() to avoid cold-start overhead and module-level secret access"
    - "Health probes as separate named Cloud Function - never piggybacked on processDocumentJobs (PITFALL-2)"
    - "retryCount: 0 for health probes - 5-minute schedule makes retries unnecessary"
    - "Promise.all() for parallel multi-table retention cleanup"

key-files:
  created: []
  modified:
    - backend/src/index.ts

key-decisions:
  - "runHealthProbes is completely separate from processDocumentJobs - distinct Cloud Function, distinct schedule (PITFALL-2 compliance)"
  - "retryCount: 0 on runHealthProbes - probes recur every 5 minutes, retry would create confusing duplicate results"
  - "runRetentionCleanup uses Promise.all() for parallel deletes - three tables are independent, no ordering constraint"
  - "runRetentionCleanup only deletes monitoring tables (service_health_checks, alert_events, document_processing_events) - agentic RAG tables out of scope per research Open Question 4"
  - "RETENTION_DAYS = 30 is a constant, not configurable - matches INFR-03 spec exactly"

patterns-established:
  - "Scheduled Cloud Functions: dynamic import() + explicit secrets array per function"
  - "Retention cleanup: Promise.all([model.deleteOlderThan(), ...]) pattern for parallel table cleanup"

requirements-completed: [HLTH-03, INFR-03]

# Metrics
duration: 1min
completed: 2026-02-24
---

# Phase 2 Plan 04: Scheduled Cloud Function Exports Summary

**Two new Firebase onSchedule Cloud Functions: runHealthProbes (5-minute interval) and runRetentionCleanup (weekly Monday 02:00) added to index.ts as standalone exports decoupled from document processing**

## Performance

- **Duration:** ~1 min
- **Started:** 2026-02-24T19:34:20Z
- **Completed:** 2026-02-24T19:35:17Z
- **Tasks:** 2
- **Files modified:** 1

## Accomplishments

- Added `runHealthProbes` onSchedule export that calls `healthProbeService.runAllProbes()` then `alertService.evaluateAndAlert()` on a 5-minute cadence
- Added `runRetentionCleanup` onSchedule export that deletes rows older than 30 days from `service_health_checks`, `alert_events`, and `document_processing_events` in parallel
- Both functions use dynamic `import()` pattern and list all required Firebase secrets explicitly
- All 64 existing tests continue to pass

## Task Commits

Both tasks modified the same file in a single edit operation:

1. **Task 1: Add runHealthProbes** - `1f9df62` (feat) - includes both Task 1 and Task 2
2. **Task 2: Add runRetentionCleanup** - included in `1f9df62` above

**Plan metadata:** (docs commit forthcoming)

## Files Created/Modified

- `backend/src/index.ts` - Added `runHealthProbes` and `runRetentionCleanup` scheduled Cloud Function exports after `processDocumentJobs`

## Decisions Made

- Combined both exports into one commit since they were added simultaneously to the same file - functionally equivalent to two separate commits
- `retryCount: 0` on `runHealthProbes` - with a 5-minute schedule, a failed probe run is superseded by the next run before any retry would be useful
- `timeoutSeconds: 120` on `runRetentionCleanup` - cleanup may process large batches; 60 seconds could be tight for large datasets

## Deviations from Plan

None - plan executed exactly as written.

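
## Illustrative Sketch

The handler bodies described under Accomplishments and Decisions Made can be sketched roughly as follows. This is a dependency-free illustration, not the code in `backend/src/index.ts`. The `ProbeDeps` and `RetentionTarget` interfaces, the names `healthProbeRun` and `retentionCleanupRun`, the argument passed to `evaluateAndAlert`, and the numeric return value of `deleteOlderThan` are all assumptions made for the example. The `onSchedule` wrapper, the secrets list, and the dynamic `import()` calls are omitted.

```typescript
// Hypothetical stand-ins: the real signatures of runAllProbes(),
// evaluateAndAlert() and deleteOlderThan() are not shown in this summary.
interface ProbeResult {
  service: string;
  healthy: boolean;
}

interface ProbeDeps {
  runAllProbes(): Promise<ProbeResult[]>;
  // Assumption: alert evaluation receives the fresh probe results.
  evaluateAndAlert(results: ProbeResult[]): Promise<void>;
}

interface RetentionTarget {
  table: string;
  // Assumption: resolves to the number of rows deleted.
  deleteOlderThan(days: number): Promise<number>;
}

// Fixed constant, matching the INFR-03 decision recorded above.
const RETENTION_DAYS = 30;

// One health-probe run: probe first, then evaluate alerts.
// Returns how many probe results were produced.
async function healthProbeRun(deps: ProbeDeps): Promise<number> {
  const results = await deps.runAllProbes();
  await deps.evaluateAndAlert(results);
  return results.length;
}

// One retention run: the tables are independent, so every delete is
// started immediately and awaited together with Promise.all().
async function retentionCleanupRun(
  targets: RetentionTarget[],
): Promise<Map<string, number>> {
  const entries = await Promise.all(
    targets.map(
      async (t): Promise<[string, number]> => [
        t.table,
        await t.deleteOlderThan(RETENTION_DAYS),
      ],
    ),
  );
  return new Map(entries);
}
```

One design consequence worth noting: `Promise.all()` rejects as soon as any single delete rejects, so one failing table fails the whole run. `Promise.allSettled()` would be an alternative if per-table outcomes should be reported instead.
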
## Issues Encountered

None - TypeScript compiled cleanly on first pass, and all tests passed.

## User Setup Required

None - no external service configuration required. Firebase deployment will pick up the new exports automatically.

## Next Phase Readiness

- All Phase 2 backend service plans complete (02-01 through 02-04)
- Ready for Phase 3 API layer development
- Health probe infrastructure fully wired: probes run on schedule, alerts sent via email, data retained for 30 days
- Monitoring system is operational end-to-end

---

*Phase: 02-backend-services*
*Completed: 2026-02-24*