Commit Graph

111 Commits

Author SHA1 Message Date
Glucksberg
09f49cd921 fix(cron): accept delivery mode "none" for sessionTarget="main" (#27431) (#28871) 2026-03-02 10:32:00 -06:00
Evgeny Zislis
4b4ea5df8b feat(cron): add failure destination support to failed cron jobs (#31059)
* feat(cron): add failure destination support with webhook mode and bestEffort handling

Extends PR #24789 failure alerts with features from PR #29145:
- Add webhook delivery mode for failure alerts (mode: 'webhook')
- Add accountId support for multi-account channel configurations
- Add bestEffort handling to skip alerts when job has bestEffort=true
- Add separate failureDestination config (global + per-job in delivery)
- Add duplicate prevention (prevents sending to same as primary delivery)
- Add CLI flags: --failure-alert-mode, --failure-alert-account-id
- Add UI fields for new options in web cron editor

* fix(cron): merge failureAlert mode/accountId and preserve failureDestination on updates

- Fix mergeCronFailureAlert to merge mode and accountId fields
- Fix mergeCronDelivery to preserve failureDestination on updates
- Fix isSameDeliveryTarget to use 'announce' as default instead of 'none'
  to properly detect duplicates when delivery.mode is undefined

* fix(cron): validate webhook mode requires URL in resolveFailureDestination

When mode is 'webhook' but no 'to' URL is provided, return null
instead of creating an invalid plan that silently fails later.

* fix(cron): fail closed on webhook mode without URL and make failureDestination fields clearable

- sendCronFailureAlert: fail closed when mode is webhook but URL is missing
- mergeCronDelivery: use per-key presence checks so callers can clear
  nested failureDestination fields via cron.update

Note: protocol:check shows missing internalEvents in Swift models - this is
a pre-existing issue unrelated to these changes (upstream sync needed).

* fix(cron): use separate schema for failureDestination and fix type cast

- Create CronFailureDestinationSchema excluding after/cooldownMs fields
- Fix type cast in sendFailureNotificationAnnounce to use CronMessageChannel

* fix(cron): merge global failureDestination with partial job overrides

When job has partial failureDestination config, fall back to global
config for unset fields instead of treating it as a full override.

* fix(cron): avoid forcing announce mode and clear inherited to on mode change

- UI: only include mode in patch if explicitly set to non-default
- delivery.ts: clear inherited 'to' when job overrides mode, since URL
  semantics differ between announce and webhook modes

* fix(cron): preserve explicit to on mode override and always include mode in UI patches

- delivery.ts: preserve job-level explicit 'to' when overriding mode
- UI: always include mode in failureAlert patch so users can switch between announce/webhook

* fix(cron): allow clearing accountId and treat undefined global mode as announce

- UI: always include accountId in patch so users can clear it
- delivery.ts: treat undefined global mode as announce when comparing for clearing inherited 'to'

* Cron: harden failure destination routing and add regression coverage

* Cron: resolve failure destination review feedback

* Cron: drop unrelated timeout assertions from conflict resolution

* Cron: format cron CLI regression test

* Cron: align gateway cron test mock types

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 09:27:41 -06:00
Yasunori Morishima(盛島康徳)
be8930d6f9 fix: clear stale runningAtMs in cron.run() before already-running check (#17949)
Add recomputeNextRunsForMaintenance() call in run() so that stale
runningAtMs markers (from a crashed Phase-1 persist) are cleared by the
existing normalizeJobTickState logic before the already-running guard.

Without this, a manual cron.run() could be blocked for up to
STUCK_RUN_MS (2 hours) even though no job was actually running.

Fixes #17554

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 07:47:36 -06:00
Kate Chapman
6df8bd9741 fix(cron): wrap computeJobNextRunAtMs in try-catch inside applyJobResult (#30905)
* fix(cron): wrap computeJobNextRunAtMs in try-catch inside applyJobResult

Without this guard, if the croner library throws during schedule
computation (timezone/expression edge cases), the exception propagates
out of applyJobResult and the entire state update is lost — runningAtMs
never clears, lastRunAtMs never advances, nextRunAtMs never recomputes.
After STUCK_RUN_MS (2h), stuck detection clears runningAtMs and the job
re-fires, creating a ~2h repeat cycle instead of the intended schedule.

The sibling function recomputeJobNextRunAtMs in jobs.ts already wraps
computeJobNextRunAtMs in try-catch; this was an oversight in the
applyJobResult call sites.

Changes:
- Error-backoff path: catch and fall back to backoff-only schedule
- Success path: catch and fall through to the MIN_REFIRE_GAP_MS safety net
- applyOutcomeToStoredJob: log a warning when job not found after forceReload

* fix(cron): use recordScheduleComputeError in applyJobResult catch blocks

Address review feedback: the original catch blocks only logged a warning,
which meant a persistent computeJobNextRunAtMs throw would cause a
MIN_REFIRE_GAP_MS (2s) hot loop on cron-kind jobs.

Now both catch blocks call recordScheduleComputeError (exported from
jobs.ts), which tracks consecutive schedule errors and auto-disables the
job after 3 failures — matching the existing behavior in
recomputeJobNextRunAtMs.

* test(cron): cover applyJobResult schedule-throw fallback paths

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 07:38:33 -06:00
Tak Hoffman
254bb7ceee ui(cron): add advanced controls for run-if-due and routing (#31244)
* ui(cron): add advanced run controls and routing fields

* ui(cron): gate delivery account id to announce mode

* ui(cron): allow clearing delivery account id in editor

* cron: persist payload lightContext updates

* tests(cron): fix payload lightContext assertion typing
2026-03-02 07:24:33 -06:00
Peter Steinberger
6b78544f82 refactor(commands): unify repeated ACP and routing flows 2026-03-02 05:20:19 +00:00
Glucksberg
08c35eb13f fix(cron): re-arm one-shot at-jobs when rescheduled after completion (openclaw#28915) thanks @Glucksberg
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: Glucksberg <80581902+Glucksberg@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-01 21:31:24 -06:00
FlamesCN
aaa7de45fa fix(cron): prevent armTimer tight loop when job has stuck runningAtMs (openclaw#29853) thanks @FlamesCN
Verified:
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: FlamesCN <12966659+FlamesCN@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-01 19:58:58 -06:00
C. Liao
313a655d13 fix(cron): reject sessionTarget "main" for non-default agents at creation time (openclaw#30217) thanks @liaosvcaf
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: liaosvcaf <51533973+liaosvcaf@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-01 19:54:53 -06:00
0xbrak
4637b90c07 feat(cron): configurable failure alerts for repeated job errors (openclaw#24789) thanks @0xbrak
Verified:
- pnpm install --frozen-lockfile
- pnpm check
- pnpm test -- --run src/cron/service.failure-alert.test.ts src/cli/cron-cli.test.ts src/gateway/protocol/cron-validators.test.ts

Co-authored-by: 0xbrak <181251288+0xbrak@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-01 08:18:15 -06:00
Sid
29a55948d6 fix(cron): guard list sorting against malformed legacy jobs (#28896)
* fix(cron): guard list sorting against malformed legacy jobs

Prevent list operations from crashing when old or corrupted cron entries are missing name/id fields by hardening sort comparators.

Closes #28862

* cron: format list sort guard test imports

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-01 07:01:36 -06:00
NIO
ea3955cd78 fix(cron): add retry policy for one-shot jobs on transient errors (#24355) (openclaw#24435) thanks @hugenshen
Verified:
- pnpm install --frozen-lockfile
- pnpm check
- pnpm test -- --run src/cron/service.issue-regressions.test.ts src/config/config-misc.test.ts

Co-authored-by: hugenshen <16300669+hugenshen@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-01 06:58:03 -06:00
StingNing
5b49cc4129 fix(cron): notify user when cron job is auto-disabled after repeated errors (openclaw#29098) thanks @ningding97
Verified:
- pnpm install --frozen-lockfile
- pnpm check
- pnpm test -- --run src/cron/service.runs-one-shot-main-job-disables-it.test.ts

Co-authored-by: ningding97 <17723822+ningding97@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-01 06:54:02 -06:00
Sid
504c1f3607 fix(cron): migrate legacy schedule cron fields on load (#28889)
Backfill legacy jobs that still use schedule.cron and jobId so upgraded instances keep firing existing cron schedules instead of failing silently.

Closes #28861
2026-03-01 06:53:39 -06:00
Vignesh Natarajan
2050fd7539 Cron: preserve session scope for main-target reminders 2026-02-28 14:53:19 -08:00
Marcus Widing
8ae1987f2a fix(cron): pass heartbeat target=last for main-session cron jobs (#28508) (#28583)
* fix(cron): pass heartbeat target=last for main-session cron jobs

When a cron job with sessionTarget=main and wakeMode=now fires, it
triggers a heartbeat via runHeartbeatOnce. Since e2362d35 changed the
default heartbeat target from "last" to "none", these cron-triggered
heartbeats silently discard their responses instead of delivering them
to the last active channel (e.g. Telegram).

Fix: pass heartbeat: { target: "last" } from the cron timer to
runHeartbeatOnce for main-session jobs, and wire the override through
the gateway cron service builder. This restores delivery for
sessionTarget=main cron jobs without reverting the intentional default
change for regular heartbeats.

Regression introduced in: e2362d35 (2026-02-25)

Fixes #28508

* Cron: align server-cron wake routing expectations for main-target jobs

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-28 11:14:24 -06:00
Sid
fe9a7c4082 fix(cron): force main-target system events onto main session (#28898)
Ignore persisted sessionKey overrides for sessionTarget=main jobs so cron system events consistently route to the agent main session after upgrades.

Closes #28770
2026-02-28 11:08:53 -06:00
Marvin
5e2ef0e883 feat(cron): add --account flag for multi-account delivery routing (#26284)
* feat(cron): add --account flag for multi-account delivery routing

Add support for explicit delivery account routing in cron jobs across CLI, normalization, delivery planning, and isolated delivery target resolution.

Highlights:
- Add --account <id> to cron add and cron edit
- Add optional delivery.accountId to cron types and delivery plan
- Normalize and trim delivery.accountId in cron create/update normalization
- Prefer explicit accountId over session lastAccountId and bindings fallback
- Thread accountId through isolated cron run delivery resolution
- Preserve cron edit --best-effort-deliver/--no-best-effort-deliver behavior by keeping implicit announce mode
- Expand tests for account passthrough/merge/precedence and CLI account flows

* cron: resolve rebase duplicate accountId fields

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-28 10:57:49 -06:00
Pierre
e1c8094ad0 fix: schedule nextWakeAtMs for isolated sessionTarget cron jobs (#19541)
* fix(cron): repair isolated next wake scheduling

* cron: harden isolated next-wake timestamp guards

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-28 10:48:31 -06:00
Peter Steinberger
b402770f63 refactor(reply): split abort cutoff and timeout policy modules 2026-02-26 14:00:35 +01:00
Peter Steinberger
c397a02c9a fix(queue): harden drain/abort/timeout race handling
- reject new lane enqueues once gateway drain begins
- always reset lane draining state and isolate onWait callback failures
- persist per-session abort cutoff and skip stale queued messages
- avoid false 600s agentTurn timeout in isolated cron jobs

Fixes #27407
Fixes #27332
Fixes #27427

Co-authored-by: Kevin Shenghui <shenghuikevin@github.com>
Co-authored-by: zjmy <zhangjunmengyang@gmail.com>
Co-authored-by: suko <miha.sukic@gmail.com>
2026-02-26 13:43:39 +01:00
Peter Steinberger
b37dc42240 fix(cron): suppress fallback summary after attempted announce delivery 2026-02-26 03:09:14 +00:00
Ayaan Zaidi
03122e5933 fix(cron): preserve telegram announce target + delivery truth 2026-02-23 11:45:18 +05:30
Evgeny Zislis
78f801e243 Validate Telegram delivery targets to reject invalid formats (#21930)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 02c9b1c3dd4273988d571d513403e02e3b062e46
Co-authored-by: kesor <7056+kesor@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus
2026-02-23 10:44:46 +05:30
Tak Hoffman
77c3b142a9 Web UI: add full cron edit parity, all-jobs run history, and compact filters (openclaw#24155) thanks @Takhoffman
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: Takhoffman <781889+Takhoffman@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-22 23:05:42 -06:00
Tak Hoffman
a54dc7fe80 Cron: suppress fallback main summary for delivery-target errors (openclaw#24074) thanks @Takhoffman
Verified:
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: Takhoffman <781889+Takhoffman@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-22 20:24:08 -06:00
Tak Hoffman
f6c2e99f5d Cron: preserve due jobs after manual runs (#23994) 2026-02-22 19:02:05 -06:00
Tak Hoffman
211ab9e4f6 Cron: persist manual run marker before unlock (#23993)
* Cron: persist manual run marker before unlock

* Cron tests: relax wakeMode now microtask wait after run lock persist
2026-02-22 18:39:37 -06:00
Tak Hoffman
3efe63d1ad Cron: respect aborts in main wake-now retries (#23967)
* Cron: respect aborts in main wake-now retries

* Changelog: add main-session cron abort retry fix note

* Cron tests: format post-rebase conflict resolution
2026-02-22 17:19:27 -06:00
Tak Hoffman
73e5bb7635 Cron: apply timeout to startup catch-up runs (#23966)
* Cron: apply timeout to startup catch-up runs

* Changelog: add cron startup timeout catch-up note
2026-02-22 17:04:30 -06:00
Tak Hoffman
556af3f08b fix(cron): cancel timed-out runs before side effects (openclaw#22411) thanks @Takhoffman
Verified:
- pnpm check
- pnpm vitest run src/memory/qmd-manager.test.ts src/cron/service.issue-regressions.test.ts src/cron/isolated-agent.delivers-response-has-heartbeat-ok-but-includes.test.ts --maxWorkers=1

Co-authored-by: Takhoffman <781889+Takhoffman@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-22 15:45:27 -06:00
Peter Steinberger
5d90e31807 refactor(cron): share timed job-execution helper 2026-02-22 20:18:20 +00:00
Peter Steinberger
7e83e7b3a7 fix(cron): narrow manual run execution state 2026-02-22 20:19:23 +01:00
Peter Steinberger
9cf445e37c fix(cron): restore interval cadence after restart 2026-02-22 20:19:23 +01:00
Peter Steinberger
aa4c250eb8 fix(cron): split run and delivery status tracking 2026-02-22 20:19:23 +01:00
Peter Steinberger
c3bb723673 fix(cron): enforce timeout for manual cron runs 2026-02-22 20:19:23 +01:00
Peter Steinberger
8bf3c37c6c fix(cron): keep watchdog timer armed during ticks 2026-02-22 20:19:23 +01:00
Peter Steinberger
5db1ee4ec6 fix(cron): keep manual runs non-blocking 2026-02-22 20:19:22 +01:00
Peter Steinberger
34ea33f057 refactor: dedupe core config and runtime helpers 2026-02-22 17:11:54 +00:00
Frank Yang
1051f42f96 fix(stability): patch regex retries and timeout abort handling 2026-02-22 10:59:34 +01:00
Vignesh Natarajan
961bde27fe Cron: guard missing expr in schedule parsing 2026-02-21 20:18:11 -08:00
Vignesh Natarajan
2830dafbe9 Cron: keep list/status responsive during startup catch-up 2026-02-21 19:13:04 -08:00
Simone Macario
09d5f508b1 fix(cron): persist delivered flag in job state to surface delivery failures (openclaw#19174) thanks @simonemacario
Verified:
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: simonemacario <2116609+simonemacario@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-21 12:47:29 -06:00
Tak Hoffman
7417c36268 fix(cron): honor maxConcurrentRuns in timer loop (openclaw#22413) thanks @Takhoffman
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini (failed on unrelated baseline test: src/memory/qmd-manager.test.ts > throws when sqlite index is busy)

Co-authored-by: Takhoffman <781889+Takhoffman@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-20 22:31:58 -06:00
Peter Steinberger
b8b43175c5 style: align formatting with oxfmt 0.33 2026-02-18 01:34:35 +00:00
Peter Steinberger
31f9be126c style: run oxfmt and fix gate failures 2026-02-18 01:29:02 +00:00
Peter Steinberger
6dcc052bb4 fix: stabilize model catalog and pi discovery auth storage compatibility 2026-02-18 02:09:40 +01:00
Peter Steinberger
dd4eb8bf63 fix(cron): retry next-second schedule compute on undefined 2026-02-17 23:48:14 +01:00
Peter Steinberger
c26cf6aa83 feat(cron): add default stagger controls for scheduled jobs 2026-02-17 23:48:14 +01:00
Tyler Yust
75001a0490 fix cron announce routing and timeout handling 2026-02-17 11:40:04 -08:00