cim_summary/.planning/milestones/v1.0-phases/02-backend-services/02-03-SUMMARY.md
admin 38a0f0619d chore: complete v1.0 Analytics & Monitoring milestone
Archive milestone artifacts (roadmap, requirements, audit, phase directories)
to .planning/milestones/. Evolve PROJECT.md with validated requirements and
decision outcomes. Create MILESTONES.md and RETROSPECTIVE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 10:34:18 -05:00

---
phase: 02-backend-services
plan: 03
subsystem: infra
tags: [nodemailer, smtp, alerting, deduplication, email, vitest]
# Dependency graph
requires:
  - phase: 02-backend-services
    provides: "AlertEventModel with findRecentByService() and create() for deduplication"
  - phase: 02-backend-services
    provides: "ProbeResult type from healthProbeService for alert evaluation"
provides:
  - "alertService with evaluateAndAlert(probeResults): deduplication, row creation, email send"
  - "SMTP email via nodemailer with lazy transporter (Firebase Secret timing safe)"
  - "Config-based recipient via process.env.EMAIL_WEEKLY_RECIPIENT (never hardcoded)"
  - "8 unit tests covering all alert scenarios and edge cases"
affects: [02-04-scheduler, 03-api]
# Tech tracking
tech-stack:
  added: []
  patterns:
    - "Lazy transporter pattern: nodemailer.createTransport() called inside function, not at module level (Firebase Secret timing)"
    - "Alert deduplication: findRecentByService() cooldown check before row creation AND email"
    - "Non-throwing email: catch email errors, log them, never re-throw (probe pipeline safety)"
    - "vi.mock factories with inline vi.fn() only; no outer variable references, to avoid TDZ hoisting"
key-files:
  created:
    - backend/src/services/alertService.ts
    - backend/src/__tests__/unit/alertService.test.ts
  modified: []
key-decisions:
  - "Transporter created inside sendAlertEmail() on each call, not at module level; avoids Firebase Secret not-yet-available error (PITFALL A)"
  - "Suppressed alerts skip BOTH AlertEventModel.create() AND sendMail; prevents duplicate DB rows in addition to duplicate emails"
  - "Email failure caught in try/catch and logged via logger.error; never re-thrown, so the probe pipeline continues"
patterns-established:
  - "Alert deduplication pattern: check findRecentByService before creating row or sending email"
  - "Non-throwing side effects: email, analytics, and similar fire-and-forget paths must never throw"
requirements-completed: [ALRT-01, ALRT-02, ALRT-04]
# Metrics
duration: 12min
completed: 2026-02-24
---
# Phase 2 Plan 03: Alert Service Summary
**Nodemailer SMTP alert service with cooldown deduplication via AlertEventModel, config-based recipient, and lazy transporter pattern for Firebase Secret compatibility**
## Performance
- **Duration:** 12 min
- **Started:** 2026-02-24T19:27:42Z
- **Completed:** 2026-02-24T19:39:30Z
- **Tasks:** 2
- **Files modified:** 2
## Accomplishments
- `alertService.evaluateAndAlert()` evaluates ProbeResults and sends email alerts for degraded/down services
- Deduplication via `AlertEventModel.findRecentByService()` with configurable `ALERT_COOLDOWN_MINUTES` env var
- Email recipient read from `process.env.EMAIL_WEEKLY_RECIPIENT` — never hardcoded (ALRT-04)
- Lazy transporter pattern: `nodemailer.createTransport()` called inside `sendAlertEmail()` function (Firebase Secret timing fix)
- 8 unit tests cover all alert scenarios: healthy skip, down/degraded alerts, deduplication, recipient config, missing recipient, email failure, and multi-probe processing
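
The evaluation flow listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `alertService.ts`: the `ProbeResult` and `AlertEvent` shapes, the `AlertStore` and `SendEmail` types, the `cooldownMinutes()` helper, the 60-minute default, and passing `env` as a parameter are all assumptions made so the example stands alone.

```typescript
// Hypothetical shapes: the real ProbeResult and AlertEventModel may differ.
type ProbeStatus = "healthy" | "degraded" | "down";

interface ProbeResult {
  service: string;
  status: ProbeStatus;
}

interface AlertEvent {
  service: string;
  status: ProbeStatus;
  createdAt: Date;
}

// Stand-in for AlertEventModel, reduced to the two methods the summary names.
interface AlertStore {
  findRecentByService(service: string, cooldownMinutes: number): Promise<AlertEvent | null>;
  create(event: AlertEvent): Promise<void>;
}

// Stand-in for the email path (assumed signature).
type SendEmail = (recipient: string, probe: ProbeResult) => Promise<void>;

// Parse ALERT_COOLDOWN_MINUTES; the fallback of 60 is an assumption.
function cooldownMinutes(raw: string | undefined, fallback = 60): number {
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Returns the names of the services that produced a new alert.
async function evaluateAndAlert(
  probes: ProbeResult[],
  store: AlertStore,
  sendEmail: SendEmail,
  env: Record<string, string | undefined>,
): Promise<string[]> {
  const alerted: string[] = [];
  const cooldown = cooldownMinutes(env.ALERT_COOLDOWN_MINUTES);

  for (const probe of probes) {
    // Healthy services never alert.
    if (probe.status === "healthy") continue;

    // Cooldown check runs first, so a suppressed alert writes no row and sends no email.
    const recent = await store.findRecentByService(probe.service, cooldown);
    if (recent !== null) continue;

    await store.create({ service: probe.service, status: probe.status, createdAt: new Date() });

    // Recipient comes from configuration, never from a literal in the code.
    const recipient = env.EMAIL_WEEKLY_RECIPIENT;
    if (recipient) {
      await sendEmail(recipient, probe);
    }
    alerted.push(probe.service);
  }
  return alerted;
}
```

Injecting the store and the email sender keeps the sketch testable without a database or SMTP server; the actual service may import its dependencies directly.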
## Task Commits
Each task was committed atomically:
1. **Task 1: Create alertService with deduplication and email** - `91f609c` (feat)
2. **Task 2: Create alertService unit tests** - `4b5afe2` (test)
**Plan metadata:** `0acacd1` (docs: complete alertService plan)
## Files Created/Modified
- `backend/src/services/alertService.ts` - Alert evaluation, deduplication, and email delivery
- `backend/src/__tests__/unit/alertService.test.ts` - 8 unit tests, all passing
## Decisions Made
- **Lazy transporter:** `nodemailer.createTransport()` called inside `sendAlertEmail()` on each call, not cached at module level. This is required because Firebase Secrets (`EMAIL_PASS`) are not injected into `process.env` at module load time — only when the function is invoked.
- **Suppress both row and email:** When `findRecentByService()` returns a non-null alert, both `AlertEventModel.create()` and `sendMail` are skipped. This prevents duplicate DB rows in the alert_events table in addition to preventing duplicate emails.
- **Non-throwing email path:** Email send failures are caught in try/catch and logged via `logger.error`. The function never re-throws, so email outages cannot break the health probe pipeline.
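
A minimal sketch of the lazy transporter and the non-throwing email path, under stated assumptions: the interfaces below are trimmed stand-ins for the nodemailer types, `createTransport`, `env`, and `logger` are injected so the example stands alone, and the port default of 587, the boolean return value, and the missing-recipient behavior are illustrative choices rather than confirmed details of `alertService.ts`.

```typescript
// Trimmed stand-ins for the nodemailer pieces involved (illustrative only).
interface MailOptions { from: string; to: string; subject: string; text: string; }
interface Transporter { sendMail(options: MailOptions): Promise<unknown>; }
interface SmtpConfig {
  host: string;
  port: number;
  secure: boolean;
  auth: { user: string; pass: string };
}
type CreateTransport = (config: SmtpConfig) => Transporter;
type Env = Record<string, string | undefined>;
interface Logger { error(message: string, meta?: unknown): void; }

// Resolves to true when the mail was handed to the transport, false otherwise.
// It never rejects, so callers do not need their own try/catch.
async function sendAlertEmail(
  subject: string,
  text: string,
  env: Env,
  createTransport: CreateTransport,
  logger: Logger,
): Promise<boolean> {
  const recipient = env.EMAIL_WEEKLY_RECIPIENT;
  if (!recipient) {
    // Assumed behavior: log and skip when no recipient is configured.
    logger.error("Alert email skipped: EMAIL_WEEKLY_RECIPIENT is not set");
    return false;
  }
  try {
    // Built on every call, so secrets are read at invocation time, not at module load.
    const transporter = createTransport({
      host: env.EMAIL_HOST ?? "",
      port: Number(env.EMAIL_PORT ?? 587), // 587 is an assumed default
      secure: env.EMAIL_SECURE === "true",
      auth: { user: env.EMAIL_USER ?? "", pass: env.EMAIL_PASS ?? "" },
    });
    await transporter.sendMail({ from: env.EMAIL_USER ?? "", to: recipient, subject, text });
    return true;
  } catch (error) {
    // Log and swallow: an SMTP outage must not break the probe pipeline.
    logger.error("Alert email failed", error);
    return false;
  }
}
```

In the real service the factory would presumably be `nodemailer.createTransport` and the environment `process.env`. Because the transport is created inside the `try` block, configuration errors are logged the same way as send failures.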
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 3 - Blocking] Restructured nodemailer mock to avoid Vitest TDZ hoisting error**
- **Found during:** Task 2 (alertService unit tests)
- **Issue:** Test file declared `const mockSendMail = vi.fn()` outside the `vi.mock()` factory and referenced it inside. Because `vi.mock()` is hoisted to the top of the file, `mockSendMail` was accessed before initialization, causing `ReferenceError: Cannot access 'mockSendMail' before initialization`.
- **Fix:** Removed the outer `mockSendMail` variable. The nodemailer mock factory uses only inline `vi.fn()` calls. Tests access the mock's `sendMail` via `vi.mocked(nodemailer.createTransport).mock.results[0].value` through a `getMockSendMail()` helper. This is consistent with the project decision: "vi.mock() factories must use only inline vi.fn() to avoid Vitest hoisting TDZ errors" (established in 01-02).
- **Files modified:** `backend/src/__tests__/unit/alertService.test.ts`
- **Verification:** All 8 tests pass after fix
- **Committed in:** `4b5afe2` (Task 2 commit)
---
**Total deviations:** 1 auto-fixed (1 blocking — Vitest TDZ hoisting)
**Impact on plan:** Required fix for tests to run. No scope creep. Consistent with established project pattern from 01-02.
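
The idea behind the `getMockSendMail()` helper can be sketched without Vitest: the factory creates every mock inline, and the test later recovers the transporter from the recorded return values instead of from an outer variable. `recordingFn()` below is a hand-rolled stand-in for `vi.fn()`, and `nodemailerMock` stands in for the mocked module; both are assumptions for illustration. In the real test file the lookup goes through `vi.mocked(nodemailer.createTransport).mock.results[0].value`, as described above.

```typescript
// Hand-rolled stand-in for vi.fn(), only so this sketch runs without Vitest.
// It records each call's return value, similar to Vitest's mock.results.
interface MockResult<R> { type: "return"; value: R; }
interface RecordingFn<A extends unknown[], R> {
  (...args: A): R;
  mock: { results: MockResult<R>[] };
}

function recordingFn<A extends unknown[], R>(impl: (...args: A) => R): RecordingFn<A, R> {
  const results: MockResult<R>[] = [];
  const fn = ((...args: A): R => {
    const value = impl(...args);
    results.push({ type: "return", value });
    return value;
  }) as RecordingFn<A, R>;
  fn.mock = { results };
  return fn;
}

// Mirrors the shape of a mock factory for nodemailer: everything is created
// inline, so nothing declared outside the factory is read while it runs.
const nodemailerMock = {
  createTransport: recordingFn(() => ({
    sendMail: recordingFn(async (_message: { to: string }) => undefined),
  })),
};

// Counterpart of getMockSendMail(): read the transporter returned by the first
// createTransport() call, then take its sendMail mock.
function getMockSendMail() {
  const first = nodemailerMock.createTransport.mock.results[0];
  if (!first) throw new Error("createTransport has not been called yet");
  return first.value.sendMail;
}
```

The helper only works after the code under test has called `createTransport()` at least once, which fits the lazy transporter pattern: each send creates a transporter, so the first recorded result exists by the time a test inspects it.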
## Issues Encountered
None beyond the auto-fixed TDZ hoisting issue above.
## User Setup Required
None. No external service configuration is required beyond the existing email env vars (`EMAIL_HOST`, `EMAIL_PORT`, `EMAIL_SECURE`, `EMAIL_USER`, `EMAIL_PASS`, `EMAIL_WEEKLY_RECIPIENT`, `ALERT_COOLDOWN_MINUTES`) documented in prior research.
## Next Phase Readiness
- `alertService.evaluateAndAlert()` ready to be called from the health probe scheduler (Plan 02-04)
- All 3 alert requirements satisfied: ALRT-01 (email on degraded/down), ALRT-02 (cooldown deduplication), ALRT-04 (recipient from config)
- No blockers for Phase 2 Plan 04 (scheduler)
---
*Phase: 02-backend-services*
*Completed: 2026-02-24*