chore: complete v1.0 Analytics & Monitoring milestone
Archive milestone artifacts (roadmap, requirements, audit, phase directories) to .planning/milestones/. Evolve PROJECT.md with validated requirements and decision outcomes. Create MILESTONES.md and RETROSPECTIVE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -0,0 +1,197 @@
---
phase: 02-backend-services
plan: 04
type: execute
wave: 3
depends_on: [02-01, 02-02, 02-03]
files_modified:
  - backend/src/index.ts
autonomous: true
requirements: [HLTH-03, INFR-03]

must_haves:
  truths:
    - "runHealthProbes Cloud Function export runs on 'every 5 minutes' schedule, completely separate from processDocumentJobs"
    - "runRetentionCleanup Cloud Function export runs on 'every monday 02:00' schedule"
    - "runHealthProbes calls healthProbeService.runAllProbes() and then alertService.evaluateAndAlert()"
    - "runRetentionCleanup deletes from service_health_checks, alert_events, and document_processing_events older than 30 days"
    - "Both exports list required Firebase secrets in their secrets array"
    - "Both exports use dynamic import() pattern (same as processDocumentJobs)"
  artifacts:
    - path: "backend/src/index.ts"
      provides: "Two new onSchedule Cloud Function exports"
      exports: ["runHealthProbes", "runRetentionCleanup"]
  key_links:
    - from: "backend/src/index.ts (runHealthProbes)"
      to: "backend/src/services/healthProbeService.ts"
      via: "dynamic import('./services/healthProbeService')"
      pattern: "import\\('./services/healthProbeService'\\)"
    - from: "backend/src/index.ts (runHealthProbes)"
      to: "backend/src/services/alertService.ts"
      via: "dynamic import('./services/alertService')"
      pattern: "import\\('./services/alertService'\\)"
    - from: "backend/src/index.ts (runRetentionCleanup)"
      to: "backend/src/models/HealthCheckModel.ts"
      via: "dynamic import for deleteOlderThan(30)"
      pattern: "HealthCheckModel\\.deleteOlderThan"
    - from: "backend/src/index.ts (runRetentionCleanup)"
      to: "backend/src/services/analyticsService.ts"
      via: "dynamic import for deleteProcessingEventsOlderThan(30)"
      pattern: "deleteProcessingEventsOlderThan"
---

<objective>
Add two new Firebase Cloud Function scheduled exports to index.ts: runHealthProbes (every 5 minutes) and runRetentionCleanup (weekly).

Purpose: HLTH-03 requires health probes to run on a schedule separate from document processing (PITFALL-2). INFR-03 requires 30-day rolling data retention cleanup on schedule.

Output: Two new onSchedule exports in backend/src/index.ts.
</objective>

<execution_context>
@/home/jonathan/.claude/get-shit-done/workflows/execute-plan.md
@/home/jonathan/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/02-backend-services/02-RESEARCH.md
@.planning/phases/02-backend-services/02-01-PLAN.md
@.planning/phases/02-backend-services/02-02-PLAN.md
@.planning/phases/02-backend-services/02-03-PLAN.md
@backend/src/index.ts
</context>

<tasks>

<task type="auto">
<name>Task 1: Add runHealthProbes scheduled Cloud Function export</name>
<files>
backend/src/index.ts
</files>
<action>
Add a new `onSchedule` export to `backend/src/index.ts` AFTER the existing `processDocumentJobs` export. Follow the exact same pattern as `processDocumentJobs`.

```typescript
// Health probe scheduler, separate from document processing (PITFALL-2, HLTH-03)
export const runHealthProbes = onSchedule({
  schedule: 'every 5 minutes',
  timeoutSeconds: 60,
  memory: '256MiB',
  retryCount: 0, // Probes should not retry; they run again in 5 minutes anyway
  secrets: [
    anthropicApiKey, // for LLM probe
    openaiApiKey, // for OpenAI probe fallback
    databaseUrl, // for Supabase probe
    supabaseServiceKey,
    supabaseAnonKey,
  ],
}, async (_event) => {
  const { healthProbeService } = await import('./services/healthProbeService');
  const { alertService } = await import('./services/alertService');

  const results = await healthProbeService.runAllProbes();
  await alertService.evaluateAndAlert(results);

  logger.info('runHealthProbes: complete', {
    probeCount: results.length,
    statuses: results.map(r => ({ service: r.service_name, status: r.status })),
  });
});
```

Key requirements:
- Use dynamic `import()` (not static import at top of file), the same pattern as processDocumentJobs
- List ALL secrets that probes need in the `secrets` array (Firebase Secrets must be explicitly listed per function)
- Use the existing `anthropicApiKey`, `openaiApiKey`, `databaseUrl`, `supabaseServiceKey`, `supabaseAnonKey` variables already defined via `defineSecret` at the top of index.ts
- Set `retryCount: 0`: probes run every 5 minutes, so there is no need to retry failures
- First call `runAllProbes()` to measure and persist, then `evaluateAndAlert()` to check for alerts
- Log a summary with probe count and statuses
</action>
<verify>
<automated>cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | head -30</automated>
<manual>Verify index.ts has `export const runHealthProbes` as a separate export from processDocumentJobs</manual>
</verify>
<done>runHealthProbes export added to index.ts. Runs every 5 minutes. Calls healthProbeService.runAllProbes() then alertService.evaluateAndAlert(). Uses dynamic imports. Lists all required secrets. TypeScript compiles.</done>
</task>
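
The logging call in Task 1 implies that each probe result carries at least a `service_name` and a `status` field. As a rough illustration, the summary step could be factored into a pure function that is testable without Firebase. The `ProbeResult` interface, the example status values, and `summarizeProbeResults` are hypothetical names used only for this sketch; they are not confirmed exports of healthProbeService.

```typescript
// Hypothetical result shape: only service_name and status are implied by the
// plan's logging call; the real type in healthProbeService may have more fields.
interface ProbeResult {
  service_name: string;
  status: string; // assumed example values: 'healthy', 'degraded', 'down'
}

interface ProbeSummary {
  probeCount: number;
  statuses: { service: string; status: string }[];
}

// Builds the same payload that runHealthProbes passes to logger.info.
function summarizeProbeResults(results: ProbeResult[]): ProbeSummary {
  return {
    probeCount: results.length,
    statuses: results.map((r) => ({ service: r.service_name, status: r.status })),
  };
}

// Usage with made-up sample data.
const sampleResults: ProbeResult[] = [
  { service_name: 'supabase', status: 'healthy' },
  { service_name: 'llm', status: 'down' },
];
console.log(JSON.stringify(summarizeProbeResults(sampleResults)));
```

Keeping this step pure would let the log payload be asserted in a unit test while the scheduled handler itself stays a thin wrapper.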

<task type="auto">
<name>Task 2: Add runRetentionCleanup scheduled Cloud Function export</name>
<files>
backend/src/index.ts
</files>
<action>
Add a second `onSchedule` export to `backend/src/index.ts` AFTER runHealthProbes.

```typescript
// Retention cleanup: weekly, separate from document processing (PITFALL-7, INFR-03)
export const runRetentionCleanup = onSchedule({
  schedule: 'every monday 02:00',
  timeoutSeconds: 120,
  memory: '256MiB',
  secrets: [databaseUrl, supabaseServiceKey, supabaseAnonKey],
}, async (_event) => {
  const { HealthCheckModel } = await import('./models/HealthCheckModel');
  const { AlertEventModel } = await import('./models/AlertEventModel');
  const { deleteProcessingEventsOlderThan } = await import('./services/analyticsService');

  const RETENTION_DAYS = 30;

  // The three deletes touch different tables, so they can run in parallel.
  const [hcCount, alertCount, eventCount] = await Promise.all([
    HealthCheckModel.deleteOlderThan(RETENTION_DAYS),
    AlertEventModel.deleteOlderThan(RETENTION_DAYS),
    deleteProcessingEventsOlderThan(RETENTION_DAYS),
  ]);

  logger.info('runRetentionCleanup: complete', {
    retentionDays: RETENTION_DAYS,
    deletedHealthChecks: hcCount,
    deletedAlerts: alertCount,
    deletedProcessingEvents: eventCount,
  });
});
```

Key requirements:
- Use dynamic `import()` for all model and service imports
- Run all three deletes in parallel with `Promise.all()` (they touch different tables)
- Only include the secrets needed for Supabase access (no LLM keys needed for cleanup)
- Set `timeoutSeconds: 120` (cleanup may take longer than probes)
- The 30-day retention period is a constant, not configurable via env (matches INFR-03 spec)
- Only manage monitoring tables: service_health_checks, alert_events, document_processing_events. Do NOT delete from performance_metrics, session_events, or execution_events (those are agentic RAG tables, out of scope per research Open Question 4)
- Log the count of deleted rows from each table
</action>
<verify>
<automated>cd /home/jonathan/Coding/cim_summary/backend && npx tsc --noEmit --pretty 2>&1 | head -30</automated>
<manual>Verify index.ts has `export const runRetentionCleanup` as a separate export. Verify it calls `HealthCheckModel.deleteOlderThan`, `AlertEventModel.deleteOlderThan`, and `deleteProcessingEventsOlderThan`, one per monitoring table.</manual>
</verify>
<done>runRetentionCleanup export added to index.ts. Runs weekly Monday 02:00. Deletes from service_health_checks, alert_events, and document_processing_events older than 30 days. Uses Promise.all for parallel execution. Logs deletion counts. TypeScript compiles.</done>
</task>
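
The 30-day rule in Task 2 comes down to a cutoff timestamp: rows created before it are eligible for deletion. The following is a minimal sketch of how a `deleteOlderThan(days)` implementation might derive that cutoff. `retentionCutoff` and `isExpired` are hypothetical helpers for illustration only; the real models may compute the cutoff differently, for example inside the SQL query.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the instant `days` days before `now` (fixed 24-hour days, UTC-based).
function retentionCutoff(now: Date, days: number): Date {
  if (!Number.isInteger(days) || days <= 0) {
    throw new RangeError(`days must be a positive integer, got ${days}`);
  }
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// A row is expired when it was created strictly before the cutoff.
function isExpired(createdAt: Date, now: Date, days: number): boolean {
  return createdAt.getTime() < retentionCutoff(now, days).getTime();
}

// Usage: a cleanup run at 02:00 UTC with a 30-day window.
const runTime = new Date('2025-03-31T02:00:00Z');
console.log(retentionCutoff(runTime, 30).toISOString()); // 2025-03-01T02:00:00.000Z
console.log(isExpired(new Date('2025-02-28T00:00:00Z'), runTime, 30)); // true
console.log(isExpired(new Date('2025-03-15T00:00:00Z'), runTime, 30)); // false
```

Because the job runs weekly, a row can be up to roughly 37 days old before it is actually removed; the cutoff only guarantees that nothing newer than 30 days is deleted.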

</tasks>

<verification>
1. `npx tsc --noEmit` passes
2. `grep 'export const runHealthProbes' backend/src/index.ts` returns a match
3. `grep 'export const runRetentionCleanup' backend/src/index.ts` returns a match
4. Both exports use `onSchedule` (not piggybacked on processDocumentJobs, per PITFALL-2)
5. Both exports use dynamic `import()` pattern
6. Full test suite still passes: `npx vitest run --reporter=verbose`
</verification>
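
The grep-style checks above, together with the `key_links` patterns in the frontmatter, could also be run programmatically. The sketch below is one possible way to do that against the text of index.ts. `CHECKS` and `findMissing` are hypothetical names, and the regexes are adapted from the plan's patterns (with dots and slashes escaped) rather than copied from any existing tooling.

```typescript
// Each check pairs a human-readable label with a regex adapted from the
// plan's verification steps and key_links patterns.
const CHECKS: { label: string; pattern: RegExp }[] = [
  { label: 'runHealthProbes export', pattern: /export const runHealthProbes\b/ },
  { label: 'runRetentionCleanup export', pattern: /export const runRetentionCleanup\b/ },
  { label: 'healthProbeService dynamic import', pattern: /import\('\.\/services\/healthProbeService'\)/ },
  { label: 'alertService dynamic import', pattern: /import\('\.\/services\/alertService'\)/ },
  { label: 'HealthCheckModel.deleteOlderThan call', pattern: /HealthCheckModel\.deleteOlderThan/ },
  { label: 'deleteProcessingEventsOlderThan call', pattern: /deleteProcessingEventsOlderThan/ },
];

// Returns the labels of all checks that the given source text fails.
function findMissing(source: string): string[] {
  return CHECKS.filter((c) => !c.pattern.test(source)).map((c) => c.label);
}

// Usage with an inline snippet; in practice the text would be read from
// backend/src/index.ts.
const snippet = `
export const runHealthProbes = onSchedule({}, async () => {
  const { healthProbeService } = await import('./services/healthProbeService');
});
`;
console.log(findMissing(snippet));
```

An empty array means every pattern matched. This only confirms that the expected text is present; it does not replace the `tsc` and `vitest` steps.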

<success_criteria>
- runHealthProbes is a separate onSchedule export running every 5 minutes
- runRetentionCleanup is a separate onSchedule export running weekly Monday 02:00
- Both are completely decoupled from processDocumentJobs
- runHealthProbes calls runAllProbes() then evaluateAndAlert()
- runRetentionCleanup calls `HealthCheckModel.deleteOlderThan(30)`, `AlertEventModel.deleteOlderThan(30)`, and `deleteProcessingEventsOlderThan(30)`, covering all three monitoring tables
- All required Firebase secrets listed in each function's secrets array
- TypeScript compiles with no errors
- Existing test suite passes with no regressions
</success_criteria>

<output>
After completion, create `.planning/phases/02-backend-services/02-04-SUMMARY.md`
</output>