Moltbot/tmp-refactoring-strategy.md

# Refactoring Strategy — Oversized Files

> **Target:** ~500–700 LOC per file (AGENTS.md guideline)
> **Baseline:** 681K total lines across 3,781 code files (avg 180 LOC)
> **Problem:** 50+ files exceed 700 LOC; top offenders are 2–4× over target

---

## Progress Summary

| Item                                | Before | After                                | Status  |
| ----------------------------------- | ------ | ------------------------------------ | ------- |
| `src/config/schema.ts`              | 1,114  | 353 + 729 (field-metadata)           | ✅ Done |
| `src/security/audit-extra.ts`       | 1,199  | 31 barrel + 559 (sync) + 668 (async) | ✅ Done |
| `src/infra/session-cost-usage.ts`   | 984    | —                                    | Pending |
| `src/media-understanding/runner.ts` | 1,232  | —                                    | Pending |

### All Targets (current LOC)

| Phase | File                             | Current LOC | Target |
| ----- | -------------------------------- | ----------- | ------ |
| 1     | session-cost-usage.ts            | 984         | ~700   |
| 1     | media-understanding/runner.ts    | 1,232       | ~700   |
| 2a    | heartbeat-runner.ts              | 956         | ~560   |
| 2a    | message-action-runner.ts         | 1,082       | ~620   |
| 2b    | tts/tts.ts                       | 1,445       | ~950   |
| 2b    | exec-approvals.ts                | 1,437       | ~700   |
| 2b    | update-cli.ts                    | 1,245       | ~1,000 |
| 3     | memory/manager.ts                | 2,280       | ~1,300 |
| 3     | bash-tools.exec.ts               | 1,546       | ~1,000 |
| 3     | ws-connection/message-handler.ts | 970         | ~720   |
| 4     | ui/views/usage.ts                | 3,076       | ~1,200 |
| 4     | ui/views/agents.ts               | 1,894       | ~950   |
| 4     | ui/views/nodes.ts                | 1,118       | ~440   |
| 4     | bluebubbles/monitor.ts           | 2,348       | ~650   |

---

## Naming Convention (Established Pattern)

The codebase uses **dot-separated module decomposition**: `<base-module>.<concern>.ts`

**Examples from codebase:**

- `provider-usage.ts` → `provider-usage.types.ts`, `provider-usage.fetch.ts`, `provider-usage.shared.ts`
- `zod-schema.ts` → `zod-schema.core.ts`, `zod-schema.agents.ts`, `zod-schema.session.ts`
- `directive-handling.ts` → `directive-handling.parse.ts`, `directive-handling.impl.ts`, `directive-handling.shared.ts`

**Pattern:**

- `<base>.ts` — main barrel, re-exports public API
- `<base>.types.ts` — type definitions
- `<base>.shared.ts` — shared constants/utilities
- `<base>.<domain>.ts` — domain-specific implementations

**Consequences for this refactoring:**

- ✅ Renamed: `audit-collectors-sync.ts` → `audit-extra.sync.ts`, `audit-collectors-async.ts` → `audit-extra.async.ts`
- Use `session-cost-usage.types.ts` (not `session-cost-types.ts`)
- Use `runner.binary.ts` (not `binary-resolve.ts`)

---

## Triage: What NOT to split

| File                                           | LOC   | Reason to skip                                                                   |
| ---------------------------------------------- | ----- | -------------------------------------------------------------------------------- |
| `ui/src/ui/views/usageStyles.ts`               | 1,911 | Pure CSS-in-JS data. Zero logic.                                                 |
| `apps/macos/.../GatewayModels.swift`           | 2,790 | Generated/shared protocol models. Splitting fragments the schema.                |
| `apps/shared/.../GatewayModels.swift`          | 2,790 | Same — shared protocol definitions.                                              |
| `*.test.ts` files (bot.test, audit.test, etc.) | 1K–3K | Tests naturally grow with the module. Split only if parallel execution needs it. |
| `ui/src/ui/app-render.ts`                      | 1,222 | Mechanical prop-wiring glue. Large but low complexity. Optional.                 |

---

## Phase 1 — Low-Risk, High-Impact (Pure Data / Independent Functions)

These files contain cleanly separable sections with no shared mutable state. Each extraction is a straightforward "move functions + update imports" operation.

### 1. ✅ `src/config/schema.ts` (1,114 → 353 LOC) — DONE

| Extract to                        | What moves                                                          | LOC |
| --------------------------------- | ------------------------------------------------------------------- | --- |
| `config/schema.field-metadata.ts` | `FIELD_LABELS`, `FIELD_HELP`, `FIELD_PLACEHOLDERS`, sensitivity map | 729 |

**Result:** schema.ts reduced to 353 LOC. Field metadata extracted to schema.field-metadata.ts (729 LOC).

### 2. ✅ `src/security/audit-extra.ts` (1,199 → 31 LOC barrel) — DONE

| Extract to                      | What moves                                                                                                                                             | LOC |
| ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | --- |
| `security/audit-extra.sync.ts`  | 7 sync collectors (config-based, no I/O): attack surface, synced folders, secrets, hooks, model hygiene, small model risk, exposure matrix             | 559 |
| `security/audit-extra.async.ts` | 6 async collectors (filesystem/plugin checks): plugins trust, include perms, deep filesystem, config snapshot, plugins code safety, skills code safety | 668 |

**Result:** Used centralized sync vs. async split (2 files) instead of domain scatter (3 files). audit-extra.ts is now a 31-line re-export barrel for backward compatibility. Files renamed to follow `<base>.<concern>.ts` convention.

### 3. `src/infra/session-cost-usage.ts` (984 → ~700 LOC)

| Extract to                            | What moves                                                                                                                                                                      | LOC  |
| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---- |
| `infra/session-cost-usage.types.ts`   | 20+ exported type definitions                                                                                                                                                   | ~130 |
| `infra/session-cost-usage.parsers.ts` | `emptyTotals`, `toFiniteNumber`, `extractCostBreakdown`, `parseTimestamp`, `parseTranscriptEntry`, `formatDayKey`, `computeLatencyStats`, `apply*` helpers, `scan*File` helpers | ~240 |

**Why:** Types + pure parser functions. Zero side effects. Consumers just import them.

### 4. `src/media-understanding/runner.ts` (1,232 → ~700 LOC)

| Extract to                             | What moves                                                                                                         | LOC  |
| -------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | ---- |
| `media-understanding/runner.binary.ts` | `findBinary`, `hasBinary`, `isExecutable`, `candidateBinaryNames` + caching                                        | ~150 |
| `media-understanding/runner.cli.ts`    | `extractGeminiResponse`, `extractSherpaOnnxText`, `probeGeminiCli`, `resolveCliOutput`                             | ~200 |
| `media-understanding/runner.entry.ts`  | local entry resolvers, `resolveAutoEntries`, `resolveAutoImageModel`, `resolveActiveModelEntry`, `resolveKeyEntry` | ~250 |

**Why:** Three clean layers (binary discovery → CLI output parsing → entry resolution). One-way dependency flow.

---

## Phase 2 — Medium-Risk, Clean Boundaries

These require converting private methods or closure variables to explicit parameters, but the seams are well-defined.

### 5. `src/infra/heartbeat-runner.ts` (956 → ~560 LOC)

| Extract to                         | What moves                                                                                                     | LOC  |
| ---------------------------------- | -------------------------------------------------------------------------------------------------------------- | ---- |
| `infra/heartbeat-runner.config.ts` | Active hours logic, config/agent/session resolution, `resolveHeartbeat*` helpers, `isHeartbeatEnabledForAgent` | ~370 |
| `infra/heartbeat-runner.reply.ts`  | Reply payload helpers: `resolveHeartbeatReplyPayload`, `normalizeHeartbeatReply`, `restoreHeartbeatUpdatedAt`  | ~100 |

### 6. `src/infra/outbound/message-action-runner.ts` (1,082 → ~620 LOC)

| Extract to                                        | What moves                                                                                           | LOC  |
| ------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | ---- |
| `infra/outbound/message-action-runner.media.ts`   | Attachment handling (max bytes, filename, base64, sandbox) + hydration (group icon, send attachment) | ~330 |
| `infra/outbound/message-action-runner.context.ts` | Cross-context decoration + Slack/Telegram auto-threading                                             | ~190 |

### 7. `src/tts/tts.ts` (1,445 → ~950 LOC, then follow-up)

| Extract to              | What moves                                               | LOC  |
| ----------------------- | -------------------------------------------------------- | ---- |
| `tts/tts.directives.ts` | `parseTtsDirectives` + related types/constants           | ~260 |
| `tts/tts.providers.ts`  | `elevenLabsTTS`, `openaiTTS`, `edgeTTS`, `summarizeText` | ~200 |
| `tts/tts.prefs.ts`      | 15 TTS preference get/set functions                      | ~165 |

**Note:** Still ~955 LOC after this. A second pass could extract config resolution (~100 LOC) into `tts-config.ts`.

### 8. `src/infra/exec-approvals.ts` (1,437 → ~700 LOC)

| Extract to                          | What moves                                                                                                                                                       | LOC  |
| ----------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---- |
| `infra/exec-approvals.shell.ts`     | `iterateQuoteAware`, `splitShellPipeline`, `analyzeWindowsShellCommand`, `tokenizeWindowsSegment`, `analyzeShellCommand`, `analyzeArgvCommand`                   | ~250 |
| `infra/exec-approvals.allowlist.ts` | `matchAllowlist`, `matchesPattern`, `globToRegExp`, `isSafeBinUsage`, `evaluateSegments`, `evaluateExecAllowlist`, `splitCommandChain`, `evaluateShellAllowlist` | ~350 |

**Note:** Still ~942 LOC. Follow-up: `exec-command-resolution.ts` (~220 LOC) and `exec-approvals-io.ts` (~200 LOC) would bring it under 700.

### 9. `src/cli/update-cli.ts` (1,245 → ~1,000 LOC)

| Extract to                  | What moves                                                                                | LOC  |
| --------------------------- | ----------------------------------------------------------------------------------------- | ---- |
| `cli/update-cli.helpers.ts` | Version/tag helpers, constants, shell completion, git checkout, global manager resolution | ~340 |

**Note:** The 3 command functions (`updateCommand`, `updateStatusCommand`, `updateWizardCommand`) are large but procedural with heavy shared context. Deeper splitting needs an interface layer.

---

## Phase 3 — Higher Risk / Structural Refactors

These files need more than "move functions" — they need closure variable threading, class decomposition, or handler-per-method patterns.

### 10. `src/memory/manager.ts` (2,280 → ~1,300 LOC, then follow-up)

| Extract to                    | What moves                                                                                                                                    | LOC  |
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- | ---- |
| `memory/manager.embedding.ts` | `embedChunksWithVoyageBatch`, `embedChunksWithOpenAiBatch`, `embedChunksWithGeminiBatch` (3 functions ~90% identical — **dedup opportunity**) | ~600 |
| `memory/manager.batch.ts`     | `embedBatchWithRetry`, `runBatchWithFallback`, `runBatchWithTimeoutRetry`, `recordBatchFailure`, `resetBatchFailureCount`                     | ~300 |
| `memory/manager.cache.ts`     | `loadEmbeddingCache`, `upsertEmbeddingCache`, `computeProviderKey`                                                                            | ~150 |

**Key insight:** The 3 provider embedding methods share ~90% identical structure. After extraction, refactor into a single generic `embedChunksWithProvider(config)` with provider-specific config objects. This is both a size and a logic DRY win.

**Still ~1,362 LOC** — session sync + search could be a follow-up split.

### 11. `src/agents/bash-tools.exec.ts` (1,546 → ~1,000 LOC)

| Extract to                          | What moves                                                       | LOC  |
| ----------------------------------- | ---------------------------------------------------------------- | ---- |
| `agents/bash-tools.exec.process.ts` | `runExecProcess` + supporting spawn helpers                      | ~400 |
| `agents/bash-tools.exec.helpers.ts` | Security constants, `validateHostEnv`, normalizers, PATH helpers | ~200 |

**Challenge:** `runExecProcess` reads closure variables from `createExecTool`. Extraction requires passing explicit params.

### 12. `src/gateway/server/ws-connection/message-handler.ts` (970 → ~720 LOC)

| Extract to                                 | What moves                              | LOC  |
| ------------------------------------------ | --------------------------------------- | ---- |
| `ws-connection/message-handler.auth.ts`    | Device signature/nonce/key verification | ~180 |
| `ws-connection/message-handler.pairing.ts` | Pairing flow                            | ~110 |

**Challenge:** Everything is inside a single deeply-nested closure sharing `send`, `close`, `frame`, `connectParams`. Extraction requires threading many parameters. Consider refactoring to a class or state machine first.

---

## UI Files

### 13. `ui/src/ui/views/usage.ts` (3,076 → ~1,200 LOC)

| Extract to                   | What moves                                                                                       | LOC  |
| ---------------------------- | ------------------------------------------------------------------------------------------------ | ---- |
| `views/usage.aggregation.ts` | Data builders, CSV export, query engine                                                          | ~550 |
| `views/usage.charts.ts`      | `renderDailyChartCompact`, `renderCostBreakdown`, `renderTimeSeriesCompact`, `renderUsageMosaic` | ~600 |
| `views/usage.sessions.ts`    | `renderSessionsCard`, `renderSessionDetailPanel`, `renderSessionLogsCompact`                     | ~800 |

### 14. `ui/src/ui/views/agents.ts` (1,894 → ~950 LOC)

| Extract to                 | What moves                            | LOC  |
| -------------------------- | ------------------------------------- | ---- |
| `views/agents.tools.ts`    | Tools panel + policy matching helpers | ~350 |
| `views/agents.skills.ts`   | Skills panel + grouping logic         | ~280 |
| `views/agents.channels.ts` | Channels + cron panels                | ~380 |

### 15. `ui/src/ui/views/nodes.ts` (1,118 → ~440 LOC)

| Extract to                      | What moves                                  | LOC  |
| ------------------------------- | ------------------------------------------- | ---- |
| `views/nodes.exec-approvals.ts` | Exec approvals rendering + state resolution | ~500 |
| `views/nodes.devices.ts`        | Device management rendering                 | ~230 |

---

## Extension: BlueBubbles

### 16. `extensions/bluebubbles/src/monitor.ts` (2,348 → ~650 LOC)

| Extract to                         | What moves                                                                                      | LOC    |
| ---------------------------------- | ----------------------------------------------------------------------------------------------- | ------ |
| `monitor.normalize.ts`             | `normalizeWebhookMessage`, `normalizeWebhookReaction`, field extractors, participant resolution | ~500   |
| `monitor.debounce.ts`              | Debounce infrastructure, combine/flush logic                                                    | ~200   |
| `monitor.webhook.ts`               | `handleBlueBubblesWebhookRequest` + registration                                                | ~1,050 |
| Merge into existing `reactions.ts` | tapback parsing, reaction normalization                                                         | ~120   |

**Key insight:** Message/reaction normalization share ~300 lines of near-identical field extraction — dedup opportunity similar to memory providers.

---

## Execution Plan

| Wave        | Files                                                          | Total extractable LOC | Est. effort  | Status                                |
| ----------- | -------------------------------------------------------------- | --------------------- | ------------ | ------------------------------------- |
| **Wave 1**  | #1–#4 (schema, audit-extra, session-cost, media-understanding) | ~2,600                | 1 session    | ✅ #1 done, ✅ #2 done, #3–#4 pending |
| **Wave 2a** | #5–#6 (heartbeat, message-action-runner)                       | ~990                  | 1 session    | Not started                           |
| **Wave 2b** | #7–#9 (tts, exec-approvals, update-cli)                        | ~1,565                | 1–2 sessions | Not started                           |
| **Wave 3**  | #10–#12 (memory, bash-tools, message-handler)                  | ~1,830                | 2 sessions   | Not started                           |
| **Wave 4**  | #13–#16 (UI + BlueBubbles)                                     | ~4,560                | 2–3 sessions | Not started                           |

### Ground Rules

1. **No behavior changes.** Every extraction is a pure structural move + import update.
2. **Tests must pass.** Run `pnpm test` after each file extraction.
3. **Imports only.** New files re-export from old paths if needed to avoid breaking external consumers.
4. **Dot-naming convention.** Use `<base>.<concern>.ts` pattern (e.g., `runner.binary.ts`, not `binary-resolve.ts`).
5. **Centralized patterns over scatter.** Prefer 2 logical groupings (e.g., sync vs async) over 3-4 domain-specific fragments.
6. **Update colocated tests.** If `foo.test.ts` imports from `foo.ts`, update imports to the new module.
7. **CI gate.** Each PR must pass `pnpm build && pnpm check && pnpm test`.

---

## Metrics

After all waves complete, the expected result:

| Metric                          | Before | After (est.)               |
| ------------------------------- | ------ | -------------------------- |
| Files > 1,000 LOC (non-test TS) | 17     | ~5                         |
| Files > 700 LOC (non-test TS)   | 50+    | ~15–20                     |
| New files created               | 0      | ~35                        |
| Net LOC change                  | 0      | ~0 (moves only)            |
| Largest core `src/` file        | 2,280  | ~1,300 (memory/manager.ts) |