| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | must_haves |
|---|---|---|---|---|---|---|---|---|
| 02-backend-services | 04 | execute | 3 | | | true | | |
Purpose: HLTH-03 requires health probes to run on a schedule separate from document processing (PITFALL-2). INFR-03 requires 30-day rolling data retention cleanup on schedule.
Output: Two new onSchedule exports in backend/src/index.ts.
<execution_context> @/home/jonathan/.claude/get-shit-done/workflows/execute-plan.md @/home/jonathan/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/02-backend-services/02-RESEARCH.md @.planning/phases/02-backend-services/02-01-PLAN.md @.planning/phases/02-backend-services/02-02-PLAN.md @.planning/phases/02-backend-services/02-03-PLAN.md @backend/src/index.ts

Task 1: Add runHealthProbes scheduled Cloud Function export

File: backend/src/index.ts

Add a new `onSchedule` export to `backend/src/index.ts` AFTER the existing `processDocumentJobs` export. Follow the exact same pattern as `processDocumentJobs`.

```typescript
// Health probe scheduler — separate from document processing (PITFALL-2, HLTH-03)
export const runHealthProbes = onSchedule({
  schedule: 'every 5 minutes',
  timeoutSeconds: 60,
  memory: '256MiB',
  retryCount: 0, // Probes should not retry — they run again in 5 minutes anyway
  secrets: [
    anthropicApiKey, // for LLM probe
    openaiApiKey, // for OpenAI probe fallback
    databaseUrl, // for Supabase probe
    supabaseServiceKey,
    supabaseAnonKey,
  ],
}, async (_event) => {
  const { healthProbeService } = await import('./services/healthProbeService');
  const { alertService } = await import('./services/alertService');
  const results = await healthProbeService.runAllProbes();
  await alertService.evaluateAndAlert(results);
  logger.info('runHealthProbes: complete', {
    probeCount: results.length,
    statuses: results.map(r => ({ service: r.service_name, status: r.status })),
  });
});
```
Key requirements:
- Use dynamic `import()` (not static import at top of file) — same pattern as `processDocumentJobs`
- List ALL secrets that probes need in the `secrets` array (Firebase Secrets must be explicitly listed per function)
- Use the existing `anthropicApiKey`, `openaiApiKey`, `databaseUrl`, `supabaseServiceKey`, `supabaseAnonKey` variables already defined via `defineSecret` at the top of index.ts
- Set `retryCount: 0` — probes run every 5 minutes, no need to retry failures
- First call `runAllProbes()` to measure and persist, then `evaluateAndAlert()` to check for alerts
- Log a summary with probe count and statuses
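The logging requirement above can be sketched in isolation. The `ProbeResult` shape below is an assumption for illustration (the real types live in `healthProbeService`); only `service_name` and `status` are taken from the plan's logging code.

```typescript
// Hypothetical probe result shape: field names inferred from the plan's
// logging code; the real definition lives in healthProbeService.
type ProbeStatus = 'healthy' | 'degraded' | 'down';

interface ProbeResult {
  service_name: string;
  status: ProbeStatus;
  latency_ms: number;
}

// Builds the summary payload logged at the end of runHealthProbes.
function summarize(results: ProbeResult[]) {
  return {
    probeCount: results.length,
    statuses: results.map(r => ({ service: r.service_name, status: r.status })),
  };
}

const sample: ProbeResult[] = [
  { service_name: 'anthropic', status: 'healthy', latency_ms: 420 },
  { service_name: 'supabase', status: 'degraded', latency_ms: 1850 },
];
console.log(summarize(sample));
```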
```bash
cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | head -30
```
Verify index.ts has `export const runHealthProbes` as a separate export from `processDocumentJobs`.

Done: runHealthProbes export added to index.ts. Runs every 5 minutes. Calls `healthProbeService.runAllProbes()` then `alertService.evaluateAndAlert()`. Uses dynamic imports. Lists all required secrets. TypeScript compiles.
Task 2: Add runRetentionCleanup scheduled Cloud Function export

File: backend/src/index.ts

Add a second `onSchedule` export after `runHealthProbes`:

```typescript
// Retention cleanup — weekly, separate from document processing (PITFALL-7, INFR-03)
export const runRetentionCleanup = onSchedule({
  schedule: 'every monday 02:00',
  timeoutSeconds: 120,
  memory: '256MiB',
  secrets: [databaseUrl, supabaseServiceKey, supabaseAnonKey],
}, async (_event) => {
  const { HealthCheckModel } = await import('./models/HealthCheckModel');
  const { AlertEventModel } = await import('./models/AlertEventModel');
  const { deleteProcessingEventsOlderThan } = await import('./services/analyticsService');
  const RETENTION_DAYS = 30;
  const [hcCount, alertCount, eventCount] = await Promise.all([
    HealthCheckModel.deleteOlderThan(RETENTION_DAYS),
    AlertEventModel.deleteOlderThan(RETENTION_DAYS),
    deleteProcessingEventsOlderThan(RETENTION_DAYS),
  ]);
  logger.info('runRetentionCleanup: complete', {
    retentionDays: RETENTION_DAYS,
    deletedHealthChecks: hcCount,
    deletedAlerts: alertCount,
    deletedProcessingEvents: eventCount,
  });
});
```
Key requirements:
- Use dynamic `import()` for all model and service imports
- Run all three deletes in parallel with `Promise.all()` (they touch different tables)
- Only include the secrets needed for Supabase access (no LLM keys needed for cleanup)
- Set `timeoutSeconds: 120` (cleanup may take longer than probes)
- The 30-day retention period is a constant, not configurable via env (matches INFR-03 spec)
- Only manage monitoring tables: `service_health_checks`, `alert_events`, `document_processing_events`. Do NOT delete from `performance_metrics`, `session_events`, or `execution_events` (those are agentic RAG tables, out of scope per research Open Question 4)
- Log the count of deleted rows from each table
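The fixed 30-day window reduces to a simple cutoff computation. The helper below is a hypothetical sketch of what a `deleteOlderThan(days)` implementation would compute; the real logic lives in the model classes.

```typescript
// Hypothetical sketch: compute the cutoff timestamp a deleteOlderThan(days)
// query would compare against (e.g. WHERE created_at < cutoff).
// The real implementation lives in HealthCheckModel / AlertEventModel.
function retentionCutoff(days: number, now: Date = new Date()): Date {
  const cutoff = new Date(now.getTime());
  // setUTCDate handles month/year underflow automatically.
  cutoff.setUTCDate(cutoff.getUTCDate() - days);
  return cutoff;
}

const cutoff = retentionCutoff(30, new Date('2024-06-30T00:00:00Z'));
console.log(cutoff.toISOString()); // 2024-05-31T00:00:00.000Z
```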
```bash
cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | head -30
```
Verify index.ts has `export const runRetentionCleanup` as a separate export. Verify it calls `deleteOlderThan` on all three tables.

Done: runRetentionCleanup export added to index.ts. Runs weekly Monday 02:00. Deletes from service_health_checks, alert_events, and document_processing_events older than 30 days. Uses Promise.all for parallel execution. Logs deletion counts. TypeScript compiles.
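The parallel-delete pattern can be isolated as a small sketch. The stub deleters below stand in for the real model methods and are purely illustrative.

```typescript
// Illustrative sketch of the parallel-delete pattern: each deleter returns the
// number of rows it removed. Promise.all is safe here because each deleter
// targets a different table, so the deletes cannot conflict.
async function cleanupInParallel(
  deleters: Array<(days: number) => Promise<number>>,
  days: number,
): Promise<number[]> {
  return Promise.all(deleters.map(d => d(days)));
}

// Stub deleter that "removes" a fixed number of rows.
const stub = (rows: number) => async (_days: number): Promise<number> => rows;

cleanupInParallel([stub(3), stub(1), stub(7)], 30).then(counts => {
  console.log(counts); // [ 3, 1, 7 ]
});
```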
<success_criteria>
- runHealthProbes is a separate onSchedule export running every 5 minutes
- runRetentionCleanup is a separate onSchedule export running weekly Monday 02:00
- Both are completely decoupled from processDocumentJobs
- runHealthProbes calls runAllProbes() then evaluateAndAlert()
- runRetentionCleanup calls deleteOlderThan(30) on all three monitoring tables
- All required Firebase secrets listed in each function's secrets array
- TypeScript compiles with no errors
- Existing test suite passes with no regressions
</success_criteria>