cim_summary/.planning/milestones/v1.0-phases/02-backend-services/02-04-PLAN.md
admin 38a0f0619d chore: complete v1.0 Analytics & Monitoring milestone
Archive milestone artifacts (roadmap, requirements, audit, phase directories)
to .planning/milestones/. Evolve PROJECT.md with validated requirements and
decision outcomes. Create MILESTONES.md and RETROSPECTIVE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 10:34:18 -05:00


phase: 02-backend-services
plan: 04
type: execute
wave: 3
depends_on:
  - 02-01
  - 02-02
  - 02-03
files_modified:
  - backend/src/index.ts
autonomous: true
requirements:
  - HLTH-03
  - INFR-03
must_haves:
  truths:
    - "runHealthProbes Cloud Function export runs on 'every 5 minutes' schedule, completely separate from processDocumentJobs"
    - "runRetentionCleanup Cloud Function export runs on 'every monday 02:00' schedule"
    - "runHealthProbes calls healthProbeService.runAllProbes() and then alertService.evaluateAndAlert()"
    - "runRetentionCleanup deletes from service_health_checks, alert_events, and document_processing_events older than 30 days"
    - "Both exports list required Firebase secrets in their secrets array"
    - "Both exports use dynamic import() pattern (same as processDocumentJobs)"
  artifacts:
    - path: backend/src/index.ts
      provides: Two new onSchedule Cloud Function exports
      exports:
        - runHealthProbes
        - runRetentionCleanup
  key_links:
    - from: backend/src/index.ts (runHealthProbes)
      to: backend/src/services/healthProbeService.ts
      via: dynamic import('./services/healthProbeService')
      pattern: import('./services/healthProbeService')
    - from: backend/src/index.ts (runHealthProbes)
      to: backend/src/services/alertService.ts
      via: dynamic import('./services/alertService')
      pattern: import('./services/alertService')
    - from: backend/src/index.ts (runRetentionCleanup)
      to: backend/src/models/HealthCheckModel.ts
      via: dynamic import for deleteOlderThan(30)
      pattern: HealthCheckModel.deleteOlderThan
    - from: backend/src/index.ts (runRetentionCleanup)
      to: backend/src/services/analyticsService.ts
      via: dynamic import for deleteProcessingEventsOlderThan(30)
      pattern: deleteProcessingEventsOlderThan

Add two new Firebase Cloud Function scheduled exports to index.ts: runHealthProbes (every 5 minutes) and runRetentionCleanup (weekly).

Purpose: HLTH-03 requires health probes to run on a schedule separate from document processing (PITFALL-2). INFR-03 requires 30-day rolling data retention cleanup on schedule.

Output: Two new onSchedule exports in backend/src/index.ts.

<execution_context> @/home/jonathan/.claude/get-shit-done/workflows/execute-plan.md @/home/jonathan/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/02-backend-services/02-RESEARCH.md
@.planning/phases/02-backend-services/02-01-PLAN.md
@.planning/phases/02-backend-services/02-02-PLAN.md
@.planning/phases/02-backend-services/02-03-PLAN.md
@backend/src/index.ts

Task 1: Add runHealthProbes scheduled Cloud Function export

Files: backend/src/index.ts

Add a new `onSchedule` export to `backend/src/index.ts` after the existing `processDocumentJobs` export, following the same pattern as `processDocumentJobs`:

// Health probe scheduler — separate from document processing (PITFALL-2, HLTH-03)
export const runHealthProbes = onSchedule({
  schedule: 'every 5 minutes',
  timeoutSeconds: 60,
  memory: '256MiB',
  retryCount: 0,  // Probes should not retry — they run again in 5 minutes anyway
  secrets: [
    anthropicApiKey,    // for LLM probe
    openaiApiKey,       // for OpenAI probe fallback
    databaseUrl,        // for Supabase probe
    supabaseServiceKey,
    supabaseAnonKey,
  ],
}, async (_event) => {
  const { healthProbeService } = await import('./services/healthProbeService');
  const { alertService } = await import('./services/alertService');

  const results = await healthProbeService.runAllProbes();
  await alertService.evaluateAndAlert(results);

  logger.info('runHealthProbes: complete', {
    probeCount: results.length,
    statuses: results.map(r => ({ service: r.service_name, status: r.status })),
  });
});

Key requirements:

  • Use dynamic import() (not static import at top of file) — same pattern as processDocumentJobs
  • List ALL secrets that probes need in the secrets array (Firebase Secrets must be explicitly listed per function)
  • Use the existing anthropicApiKey, openaiApiKey, databaseUrl, supabaseServiceKey, supabaseAnonKey variables already defined via defineSecret at the top of index.ts
  • Set retryCount: 0 — probes run every 5 minutes, no need to retry failures
  • First call runAllProbes() to measure and persist, then evaluateAndAlert() to check for alerts
  • Log a summary with probe count and statuses

Verify:
cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | head -30
Verify index.ts has export const runHealthProbes as a separate export from processDocumentJobs.

Done: runHealthProbes export added to index.ts. Runs every 5 minutes. Calls healthProbeService.runAllProbes() then alertService.evaluateAndAlert(). Uses dynamic imports. Lists all required secrets. TypeScript compiles.

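The ordering and logging requirements for Task 1 can be exercised outside Firebase with stubbed services. This is a minimal sketch, not the project's implementation: the `ProbeResult` shape is assumed (the plan only shows `service_name` and `status`), and the `ProbeRunner`, `AlertEvaluator`, `summarizeProbeResults`, and `runProbeCycle` names are hypothetical.

```typescript
// Assumed shape: the plan only references `service_name` and `status` on a result.
interface ProbeResult {
  service_name: string;
  status: string; // assumed free-form, e.g. 'healthy' or 'down'
}

// Hypothetical interfaces standing in for healthProbeService and alertService.
interface ProbeRunner {
  runAllProbes(): Promise<ProbeResult[]>;
}

interface AlertEvaluator {
  evaluateAndAlert(results: ProbeResult[]): Promise<void>;
}

// Builds the structured payload the handler logs after each cycle.
function summarizeProbeResults(results: ProbeResult[]) {
  return {
    probeCount: results.length,
    statuses: results.map((r) => ({ service: r.service_name, status: r.status })),
  };
}

// Measure first, then evaluate alerts against the same result set.
async function runProbeCycle(probes: ProbeRunner, alerts: AlertEvaluator) {
  const results = await probes.runAllProbes();
  await alerts.evaluateAndAlert(results);
  return summarizeProbeResults(results);
}
```

Passing the services in as arguments keeps the ordering testable without secrets or network access; the real handler would obtain them through the dynamic `import()` calls shown in the task.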
Task 2: Add runRetentionCleanup scheduled Cloud Function export

Files: backend/src/index.ts

Add a second `onSchedule` export to `backend/src/index.ts` after runHealthProbes:

// Retention cleanup — weekly, separate from document processing (PITFALL-7, INFR-03)
export const runRetentionCleanup = onSchedule({
  schedule: 'every monday 02:00',
  timeoutSeconds: 120,
  memory: '256MiB',
  secrets: [databaseUrl, supabaseServiceKey, supabaseAnonKey],
}, async (_event) => {
  const { HealthCheckModel } = await import('./models/HealthCheckModel');
  const { AlertEventModel } = await import('./models/AlertEventModel');
  const { deleteProcessingEventsOlderThan } = await import('./services/analyticsService');

  const RETENTION_DAYS = 30;

  const [hcCount, alertCount, eventCount] = await Promise.all([
    HealthCheckModel.deleteOlderThan(RETENTION_DAYS),
    AlertEventModel.deleteOlderThan(RETENTION_DAYS),
    deleteProcessingEventsOlderThan(RETENTION_DAYS),
  ]);

  logger.info('runRetentionCleanup: complete', {
    retentionDays: RETENTION_DAYS,
    deletedHealthChecks: hcCount,
    deletedAlerts: alertCount,
    deletedProcessingEvents: eventCount,
  });
});

Key requirements:

  • Use dynamic import() for all model and service imports
  • Run all three deletes in parallel with Promise.all() (they touch different tables)
  • Only include the secrets needed for Supabase access (no LLM keys needed for cleanup)
  • Set timeoutSeconds: 120 (cleanup may take longer than probes)
  • The 30-day retention period is a constant, not configurable via env (matches INFR-03 spec)
  • Only manage monitoring tables: service_health_checks, alert_events, document_processing_events. Do NOT delete from performance_metrics, session_events, or execution_events (those are agentic RAG tables, out of scope per research Open Question 4)
  • Log the count of deleted rows from each table

Verify:
cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | head -30
Verify index.ts has export const runRetentionCleanup as a separate export. Verify it calls HealthCheckModel.deleteOlderThan, AlertEventModel.deleteOlderThan, and deleteProcessingEventsOlderThan, so that all three tables are covered.

Done: runRetentionCleanup export added to index.ts. Runs weekly Monday 02:00. Deletes from service_health_checks, alert_events, and document_processing_events older than 30 days. Uses Promise.all for parallel execution. Logs deletion counts. TypeScript compiles.

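The 30-day cutoff and the parallel deletes in Task 2 can be illustrated with in-memory tables. This is a sketch under stated assumptions: the `created_at` column name, the `Row` type, and the `retentionCutoff`, `deleteOlderThan`, and `cleanupAll` helpers here are hypothetical stand-ins, and the real models run their deletes against Supabase.

```typescript
// Hypothetical row shape; the timestamp column name is an assumption.
interface Row {
  created_at: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Rows strictly older than the returned timestamp are eligible for deletion.
function retentionCutoff(now: Date, retentionDays: number): Date {
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// Removes expired rows in place and returns how many were deleted.
async function deleteOlderThan(table: Row[], retentionDays: number, now: Date): Promise<number> {
  const cutoff = retentionCutoff(now, retentionDays).getTime();
  const kept = table.filter((row) => row.created_at.getTime() >= cutoff);
  const deleted = table.length - kept.length;
  table.splice(0, table.length, ...kept);
  return deleted;
}

interface MonitoringTables {
  healthChecks: Row[];
  alerts: Row[];
  events: Row[];
}

// The three deletes touch different tables, so they can run concurrently.
async function cleanupAll(tables: MonitoringTables, retentionDays: number, now: Date) {
  const [healthChecks, alerts, events] = await Promise.all([
    deleteOlderThan(tables.healthChecks, retentionDays, now),
    deleteOlderThan(tables.alerts, retentionDays, now),
    deleteOlderThan(tables.events, retentionDays, now),
  ]);
  return { healthChecks, alerts, events };
}
```

Note that `Promise.all` rejects as soon as any one delete fails, so in that case no counts are logged for the run; the next weekly run picks up whatever was left.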
1. `npx tsc --noEmit` passes
2. `grep 'export const runHealthProbes' backend/src/index.ts` returns a match
3. `grep 'export const runRetentionCleanup' backend/src/index.ts` returns a match
4. Both exports use `onSchedule` (not piggybacked on processDocumentJobs; PITFALL-2 compliance)
5. Both exports use dynamic `import()` pattern
6. Full test suite still passes: `npx vitest run --reporter=verbose`
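
Checks 2 through 5 could also be approximated in code by inspecting the source text of index.ts. The helpers below are hypothetical and only pattern-match the text; they are a rough sketch and do not replace the grep commands or a compile.

```typescript
// Hypothetical result shape for one scheduled export.
interface ExportCheck {
  name: string;
  exported: boolean; // `export const <name>` appears in the source
  usesOnSchedule: boolean; // the export is assigned directly from onSchedule(...)
}

// Text-based approximation of the grep checks for each expected export name.
function checkScheduledExports(source: string, names: string[]): ExportCheck[] {
  return names.map((name) => ({
    name,
    exported: source.includes(`export const ${name}`),
    usesOnSchedule: new RegExp(`export const ${name}\\s*=\\s*onSchedule\\(`).test(source),
  }));
}

// Counts dynamic import() calls; static `import ... from` lines do not match.
function countDynamicImports(source: string): number {
  return (source.match(/\bimport\(/g) ?? []).length;
}
```

A caller would read `backend/src/index.ts` into a string and pass `['runHealthProbes', 'runRetentionCleanup']` as the expected names.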

<success_criteria>

  • runHealthProbes is a separate onSchedule export running every 5 minutes
  • runRetentionCleanup is a separate onSchedule export running weekly Monday 02:00
  • Both are completely decoupled from processDocumentJobs
  • runHealthProbes calls runAllProbes() then evaluateAndAlert()
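  • runRetentionCleanup calls HealthCheckModel.deleteOlderThan(30), AlertEventModel.deleteOlderThan(30), and deleteProcessingEventsOlderThan(30), covering all three monitoring tables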
  • All required Firebase secrets listed in each function's secrets array
  • TypeScript compiles with no errors
  • Existing test suite passes with no regressions

</success_criteria>

After completion, create `.planning/phases/02-backend-services/02-04-SUMMARY.md`