Commit Graph

450 Commits

Author SHA1 Message Date
Peter Steinberger
5193189953 refactor(tests): dedupe cron store migration setup 2026-03-03 01:54:27 +00:00
Peter Steinberger
fbb88d5063 refactor(tests): dedupe isolated agent cron turn assertions 2026-03-03 01:54:27 +00:00
Peter Steinberger
61f29830bc fix(test): resolve upstream typing drift in feishu and cron suites 2026-03-03 01:44:21 +00:00
Peter Steinberger
47736e3432 refactor(test): extract cron issue-regression harness and frozen-time helper 2026-03-03 01:44:21 +00:00
David Rudduck
11e1363d2d feat(hooks): add trigger and channelId to plugin hook agent context (#28623)
* feat(hooks): add trigger and channelId to plugin hook agent context

Adds `trigger` and `channelId` fields to `PluginHookAgentContext` so
plugins can determine what initiated the agent run and which channel
it originated from, without session-key parsing or Redis bridging.

trigger values: "user", "heartbeat", "cron", "memory"
channelId values: "telegram", "discord", "whatsapp", etc.

Both fields are threaded through run.ts and attempt.ts hookCtx so all
hook phases receive them (before_model_resolve, before_prompt_build,
before_agent_start, llm_input, llm_output, agent_end).

channelId falls back from messageChannel to messageProvider when the
former is not set. followup-runner passes originatingChannel so queued
followup runs also carry channel context.

* docs(changelog): note hook context parity fix for #28623

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-03-02 17:39:20 -08:00
Peter Steinberger
0f5f20ee6b refactor(tests): dedupe cron delivered status assertions 2026-03-03 01:37:12 +00:00
HCL
586f057c24 fix(cron): let resolveOutboundTarget handle missing delivery target fallback
The cron delivery path short-circuits with an error when `toCandidate` is
falsy (line 151), before reaching `resolveOutboundTarget()` which provides
the `plugin.config.resolveDefaultTo()` fallback. The direct send path in
`targets.ts` already uses this fallback correctly.

Remove the early `!toCandidate` exit so that `resolveOutboundTarget()`
can attempt the plugin-provided default. Guard the WhatsApp allowFrom
override against falsy `toCandidate` to maintain existing behavior when
a target IS resolved.

Fixes #32355

Signed-off-by: HCL <chenglunhu@gmail.com>
2026-03-03 01:09:47 +00:00
Peter Steinberger
d7bafae387 perf(test): trim fixture and serialization overhead in integration suites 2026-03-03 01:09:07 +00:00
Peter Steinberger
282b107e99 test(perf): speed up cron, memory, and secrets hotspots 2026-03-03 00:43:01 +00:00
Peter Steinberger
6a42d09129 refactor: dedupe gateway config and infra flows 2026-03-03 00:15:14 +00:00
ningding97
4d19dc8671 test(cron): assert embedded model on last call to avoid bun ordering flake
Bun runs can trigger multiple embedded agent invocations in a single cron
turn (e.g. retries/fallbacks), making assertions against call[0] flaky.
Assert against the last invocation instead.
2026-03-02 23:36:13 +00:00
Peter Steinberger
f9cbcfca0d refactor: modularize slack/config/cron/daemon internals 2026-03-02 22:30:21 +00:00
Peter Steinberger
7253e91300 fix: strengthen cron heartbeat multi-payload suppression (#32131) (thanks @adhishthite) 2026-03-02 22:16:18 +00:00
Adhish
2330c71b63 fix(cron): suppress delivery when multi-payload response contains HEARTBEAT_OK
When a cron agent emits multiple text payloads (narration + tool
summaries) followed by a final HEARTBEAT_OK, the delivery suppression
check `isHeartbeatOnlyResponse` fails because it uses `.every()` —
requiring ALL payloads to be heartbeat tokens. In practice, agents
narrate their work before signaling nothing needs attention.

Fix: check if ANY payload contains HEARTBEAT_OK (`.some()`) while
preserving the media delivery exception (if any payload has media,
always deliver). This matches the semantic intent: HEARTBEAT_OK is
the agent's explicit signal that nothing needs user attention.

Real-world example: heartbeat agent returns 3 payloads:
1. "It's 12:49 AM — quiet hours. Let me run the checks quickly."
2. "Emails: Just 2 calendar invites. Not urgent."
3. "HEARTBEAT_OK"

Previously: all 3 delivered to Telegram. Now: correctly suppressed.

Related: #32013 (fixed a different HEARTBEAT_OK leak path via system
events in timer.ts)
2026-03-02 22:16:18 +00:00
Peter Steinberger
91dd89313a refactor(core): dedupe command, hook, and cron fixtures 2026-03-02 21:31:36 +00:00
Peter Steinberger
f7765bc151 perf(cron): cache schedule evaluators and stagger offsets 2026-03-02 20:19:10 +00:00
Peter Steinberger
b1c30f0ba9 refactor: dedupe cli config cron and install flows 2026-03-02 19:57:33 +00:00
bmendonca3
4cd04e4652 fix(cron): migrate legacy string schedule and command jobs 2026-03-02 19:55:19 +00:00
scoootscooob
a3c5d21b4d fix(cron): suppress HEARTBEAT_OK summary from leaking into main session (#32013)
When an isolated cron agent returns HEARTBEAT_OK (nothing to announce),
the direct delivery is correctly skipped, but the fallback path in
timer.ts still enqueues the summary as a system event to the main
session. Filter out heartbeat-only summaries using isCronSystemEvent
before enqueuing, so internal ack tokens never reach user conversations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 19:51:51 +00:00
Peter Steinberger
39afcee864 test(perf): trim cron and audit fixture overhead 2026-03-02 19:48:02 +00:00
Peter Steinberger
1616113170 perf(runtime): reduce cron persistence and logger overhead 2026-03-02 19:34:04 +00:00
Peter Steinberger
83ec545bed test(perf): trim repeated setup in cron memory and config suites 2026-03-02 19:16:46 +00:00
Peter Steinberger
99392f9868 chore: keep #31930 scoped to voice webhook path fix 2026-03-02 18:57:46 +00:00
Andrii Furmanets
662f389f45 Tests: isolate webhook path suite and reset cron auth state 2026-03-02 18:57:46 +00:00
scoootscooob
4030de6c73 fix(cron): move session reaper to finally block so it runs reliably (#31996)
* fix(cron): move session reaper to finally block so it runs reliably

The cron session reaper was placed inside the try block of onTimer(),
after job execution and state updates. If the locked persist section
threw, the reaper was skipped — causing isolated cron run sessions to
accumulate indefinitely in sessions.json.

Move the reaper into the finally block so it always executes after a
timer tick, regardless of whether job execution succeeded. The reaper
is already self-throttled (MIN_SWEEP_INTERVAL_MS = 5 min) so calling
it more reliably has no performance impact.

Closes #31946

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: strengthen cron reaper failure-path coverage and changelog (#31996) (thanks @scoootscooob)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-03-02 18:38:59 +00:00
bboyyan
d94de5c4a1 fix(cron): normalize topic-qualified target.to in messaging tool suppress check (#29480)
* fix(cron): pass job.delivery.accountId through to delivery target resolution

* fix(cron): normalize topic-qualified target.to in messaging tool suppress check

When a cron job targets a Telegram forum topic (e.g. delivery.to =
"-1003597428309:topic:462"), delivery.to is stripped to the chatId
only by resolveOutboundTarget. However, the agent's message tool may
pass the full topic-qualified address as its target, causing
matchesMessagingToolDeliveryTarget to fail the equality check and not
suppress the tool send.

Strip the :topic:NNN suffix from target.to before comparing so the
suppress check works correctly for topic-bound cron deliveries.
Without this, the agent's message tool fires separately using the
announce session's accountId (often "default"), hitting 403 when
default bot is not in the multi-account target group.

* fix(cron): remove duplicate accountId keys after rebase

---------

Co-authored-by: jaxpkm <jaxpkm@jaxpkmdeMac-mini.local>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 10:32:06 -06:00
Glucksberg
09f49cd921 fix(cron): accept delivery mode "none" for sessionTarget="main" (#27431) (#28871) 2026-03-02 10:32:00 -06:00
Peter Steinberger
14fbd0e6b6 test(perf): reduce timer teardown overhead in cron issue regressions 2026-03-02 16:06:04 +00:00
Peter Steinberger
a3d2021eea test(cron): stabilize model precedence mocks in bun runs (#31594) 2026-03-02 15:47:21 +00:00
Peter Steinberger
998d477f5e test: stabilize cross-platform regression suites (#31594) 2026-03-02 15:47:21 +00:00
Evgeny Zislis
4b4ea5df8b feat(cron): add failure destination support to failed cron jobs (#31059)
* feat(cron): add failure destination support with webhook mode and bestEffort handling

Extends PR #24789 failure alerts with features from PR #29145:
- Add webhook delivery mode for failure alerts (mode: 'webhook')
- Add accountId support for multi-account channel configurations
- Add bestEffort handling to skip alerts when job has bestEffort=true
- Add separate failureDestination config (global + per-job in delivery)
- Add duplicate prevention (prevents sending to same as primary delivery)
- Add CLI flags: --failure-alert-mode, --failure-alert-account-id
- Add UI fields for new options in web cron editor

* fix(cron): merge failureAlert mode/accountId and preserve failureDestination on updates

- Fix mergeCronFailureAlert to merge mode and accountId fields
- Fix mergeCronDelivery to preserve failureDestination on updates
- Fix isSameDeliveryTarget to use 'announce' as default instead of 'none'
  to properly detect duplicates when delivery.mode is undefined

* fix(cron): validate webhook mode requires URL in resolveFailureDestination

When mode is 'webhook' but no 'to' URL is provided, return null
instead of creating an invalid plan that silently fails later.

* fix(cron): fail closed on webhook mode without URL and make failureDestination fields clearable

- sendCronFailureAlert: fail closed when mode is webhook but URL is missing
- mergeCronDelivery: use per-key presence checks so callers can clear
  nested failureDestination fields via cron.update

Note: protocol:check shows missing internalEvents in Swift models - this is
a pre-existing issue unrelated to these changes (upstream sync needed).

* fix(cron): use separate schema for failureDestination and fix type cast

- Create CronFailureDestinationSchema excluding after/cooldownMs fields
- Fix type cast in sendFailureNotificationAnnounce to use CronMessageChannel

* fix(cron): merge global failureDestination with partial job overrides

When job has partial failureDestination config, fall back to global
config for unset fields instead of treating it as a full override.

* fix(cron): avoid forcing announce mode and clear inherited to on mode change

- UI: only include mode in patch if explicitly set to non-default
- delivery.ts: clear inherited 'to' when job overrides mode, since URL
  semantics differ between announce and webhook modes

* fix(cron): preserve explicit to on mode override and always include mode in UI patches

- delivery.ts: preserve job-level explicit 'to' when overriding mode
- UI: always include mode in failureAlert patch so users can switch between announce/webhook

* fix(cron): allow clearing accountId and treat undefined global mode as announce

- UI: always include accountId in patch so users can clear it
- delivery.ts: treat undefined global mode as announce when comparing for clearing inherited 'to'

* Cron: harden failure destination routing and add regression coverage

* Cron: resolve failure destination review feedback

* Cron: drop unrelated timeout assertions from conflict resolution

* Cron: format cron CLI regression test

* Cron: align gateway cron test mock types

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 09:27:41 -06:00
Peter Steinberger
a905b6dabc test(perf): merge duplicate one-shot retry regression paths 2026-03-02 15:23:58 +00:00
Peter Steinberger
d212721df1 test(perf): merge forum-topic direct-delivery scenarios 2026-03-02 15:17:28 +00:00
Peter Steinberger
a469d00345 test(perf): reuse cron heartbeat delivery temp homes per suite 2026-03-02 15:14:17 +00:00
Peter Steinberger
3fb0ab7435 test(perf): tighten cron issue-regression timeout windows 2026-03-02 15:11:14 +00:00
Peter Steinberger
64ac790aa8 test(perf): reuse temp-home root in cron announce delivery suite 2026-03-02 15:08:35 +00:00
Peter Steinberger
c5f1cf3c3b test(perf): reuse isolated-agent temp home root per suite 2026-03-02 15:00:08 +00:00
Peter Steinberger
cd18472405 test(perf): trim redundant cron regression setup coverage 2026-03-02 14:25:49 +00:00
Peter Steinberger
2a2a9902d9 test(perf): merge isolated-agent model precedence cases 2026-03-02 14:24:32 +00:00
Peter Steinberger
5561a6b659 test(perf): dedupe isolated-agent delivery announce cases 2026-03-02 14:24:32 +00:00
Peter Steinberger
a05b8f47b1 test(perf): tighten cron regression timeout windows 2026-03-02 14:20:31 +00:00
Yasunori Morishima(盛島康徳)
be8930d6f9 fix: clear stale runningAtMs in cron.run() before already-running check (#17949)
Add recomputeNextRunsForMaintenance() call in run() so that stale
runningAtMs markers (from a crashed Phase-1 persist) are cleared by the
existing normalizeJobTickState logic before the already-running guard.

Without this, a manual cron.run() could be blocked for up to
STUCK_RUN_MS (2 hours) even though no job was actually running.

Fixes #17554

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 07:47:36 -06:00
Kate Chapman
6df8bd9741 fix(cron): wrap computeJobNextRunAtMs in try-catch inside applyJobResult (#30905)
* fix(cron): wrap computeJobNextRunAtMs in try-catch inside applyJobResult

Without this guard, if the croner library throws during schedule
computation (timezone/expression edge cases), the exception propagates
out of applyJobResult and the entire state update is lost — runningAtMs
never clears, lastRunAtMs never advances, nextRunAtMs never recomputes.
After STUCK_RUN_MS (2h), stuck detection clears runningAtMs and the job
re-fires, creating a ~2h repeat cycle instead of the intended schedule.

The sibling function recomputeJobNextRunAtMs in jobs.ts already wraps
computeJobNextRunAtMs in try-catch; this was an oversight in the
applyJobResult call sites.

Changes:
- Error-backoff path: catch and fall back to backoff-only schedule
- Success path: catch and fall through to the MIN_REFIRE_GAP_MS safety net
- applyOutcomeToStoredJob: log a warning when job not found after forceReload

* fix(cron): use recordScheduleComputeError in applyJobResult catch blocks

Address review feedback: the original catch blocks only logged a warning,
which meant a persistent computeJobNextRunAtMs throw would cause a
MIN_REFIRE_GAP_MS (2s) hot loop on cron-kind jobs.

Now both catch blocks call recordScheduleComputeError (exported from
jobs.ts), which tracks consecutive schedule errors and auto-disables the
job after 3 failures — matching the existing behavior in
recomputeJobNextRunAtMs.

* test(cron): cover applyJobResult schedule-throw fallback paths

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 07:38:33 -06:00
Andrey
21e19e42a3 fix(cron): skip isError payloads when picking summary/delivery content (#21454)
* fix(cron): skip isError payloads when picking summary/delivery content

buildEmbeddedRunPayloads appends isError warnings as the last payload.
Three functions in helpers.ts iterate last-to-first and pick the error
over real agent output. Use two-pass selection: prefer non-error payloads,
fall back to error-only when no real content exists.

Fixes: pickSummaryFromPayloads, pickLastNonEmptyTextFromPayloads,
pickLastDeliverablePayload — all now accept and filter isError.

* Changelog: note cron payload isError filtering (#21454)

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 07:38:05 -06:00
Peter Steinberger
2c192a3795 test(perf): reduce cron overlap timer advance slack 2026-03-02 13:37:26 +00:00
Yuzuru Suzuki
6513c42d2d fix(cron): treat announce delivery failure as ok when execution succeeded (#31082)
* cron: treat announce delivery failure as ok when agent execution succeeded

* fix: set delivered:false and error on announce delivery failure paths

* Changelog: note cron announce delivery status handling (#31082)

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 07:27:57 -06:00
Peter Steinberger
79cb5e2c9b test(perf): trim cron regression timeout windows 2026-03-02 13:25:49 +00:00
Tak Hoffman
254bb7ceee ui(cron): add advanced controls for run-if-due and routing (#31244)
* ui(cron): add advanced run controls and routing fields

* ui(cron): gate delivery account id to announce mode

* ui(cron): allow clearing delivery account id in editor

* cron: persist payload lightContext updates

* tests(cron): fix payload lightContext assertion typing
2026-03-02 07:24:33 -06:00
Peter Steinberger
6adc93cc92 test(perf): skip scheduler startup in cron delivery-plan tests 2026-03-02 12:54:53 +00:00
Peter Steinberger
afda085b39 test(perf): disable scheduler startup in manual-only cron regressions 2026-03-02 12:41:56 +00:00