499 Commits

Author SHA1 Message Date
Peter Steinberger
f4a4b50cd5 refactor: compile allowlist matchers 2026-03-11 00:07:47 +00:00
Peter Steinberger
fa0329c340 test: cover cron nested lane selection 2026-03-11 00:02:00 +00:00
Peter Steinberger
825a435709 fix: avoid cron embedded lane deadlock 2026-03-10 23:56:21 +00:00
Teconomix
6d0547dc2e mattermost: fix DM media upload for unprefixed user IDs (#29925)
Merged via squash.

Prepared head SHA: 5cffcb072cc82394fe4c93d6c1c0c520325180b7
Co-authored-by: teconomix <6959299+teconomix@users.noreply.github.com>
Co-authored-by: mukhtharcm <56378562+mukhtharcm@users.noreply.github.com>
Reviewed-by: @mukhtharcm
2026-03-10 14:22:24 +05:30
futuremind2026
382287026b cron: record lastErrorReason in job state (#14382)
Merged via squash.

Prepared head SHA: baa6b5d566a41950dea0a214881eef48697326d8
Co-authored-by: futuremind2026 <258860756+futuremind2026@users.noreply.github.com>
Co-authored-by: BunsDev <68980965+BunsDev@users.noreply.github.com>
Reviewed-by: @BunsDev
2026-03-10 00:01:45 -05:00
Altay
531e8362b1 Agents: add fallback error observations (#41337)
Merged via squash.

Prepared head SHA: 852469c82ff28fb0e1be7f1019f5283e712c4283
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-10 01:12:10 +03:00
Robin Waslander
2b2e5e2038 fix(cron): do not misclassify empty/NO_REPLY as interim acknowledgement (#41401)
* fix(cron): do not misclassify empty/NO_REPLY as interim acknowledgement

When a cron task's agent returns NO_REPLY, the payload filter strips the
silent token, leaving an empty text string. isLikelyInterimCronMessage()
previously returned true for empty input, causing the cron runner to
inject a forced rerun prompt ('Your previous response was only an
acknowledgement...').

Change the empty-string branch to return false: empty text after payload
filtering means the agent deliberately chose silent completion, not that
it sent an interim 'on it' message.

Fixes #41246
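
The empty-string rule described above can be sketched as follows. Only the "empty text returns false" branch comes from this commit message; the function name `isLikelyInterimSketch` and the acknowledgement pattern are hypothetical stand-ins, since the real `isLikelyInterimCronMessage()` heuristic is not shown here.

```typescript
// Hypothetical stand-in for isLikelyInterimCronMessage(). The regex below is
// an illustrative assumption, not the project's actual heuristic.
const ACK_PATTERN = /^(on it|working on it|will do|got it)[.!]*$/i;

function isLikelyInterimSketch(text: string): boolean {
  const trimmed = text.trim();
  // Empty text after payload filtering means the agent chose silent
  // completion (NO_REPLY), so it is not an interim acknowledgement.
  if (trimmed === "") {
    return false;
  }
  // Assumed heuristic: a short, bare acknowledgement counts as interim.
  return ACK_PATTERN.test(trimmed);
}
```

With this shape, a stripped NO_REPLY payload no longer triggers the forced rerun prompt, while a bare "On it!" still would.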

* fix(cron): do not misclassify empty/NO_REPLY as interim acknowledgement

Fixes #41246. (#41383) thanks @jackal092927.

---------

Co-authored-by: xaeon2026 <xaeon2026@gmail.com>
2026-03-09 21:16:28 +01:00
Mariano
d4e59a3666 Cron: enforce cron-owned delivery contract (#40998)
Merged via squash.

Prepared head SHA: 5877389e33d5b3a518925b5793a6f6294cb3fb3d
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Reviewed-by: @mbelinky
2026-03-09 20:12:37 +01:00
Peter Steinberger
e86b38f09d refactor: split cron startup catch-up flow 2026-03-09 06:19:10 +00:00
Peter Steinberger
96d17f3cb1 fix: stagger missed cron jobs on restart (#18925) (thanks @rexlunae) 2026-03-09 06:07:43 +00:00
rexlunae
79853aca9c fix(cron): stagger missed jobs on restart to prevent gateway overload
When the gateway restarts with many overdue cron jobs, they are now
executed with staggered delays to prevent overwhelming the gateway.

- Add missedJobStaggerMs config (default 5s between jobs)
- Add maxMissedJobsPerRestart limit (default 5 jobs immediately)
- Prioritize most overdue jobs by sorting by nextRunAtMs
- Reschedule deferred jobs to fire gradually via normal timer
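
A minimal sketch of the staggering plan listed above, assuming deferred jobs are spaced one stagger interval apart starting from the restart time. The option names and defaults come from the commit message; the `planMissedJobs` function, its types, and the exact spacing formula are illustrative assumptions.

```typescript
interface OverdueJob {
  id: string;
  nextRunAtMs: number; // when the job was originally due
}

interface CatchUpPlan {
  runNow: string[];
  deferred: { id: string; nextRunAtMs: number }[];
}

// Hypothetical planner: the real scheduler code is not shown in this log.
function planMissedJobs(
  jobs: OverdueJob[],
  nowMs: number,
  missedJobStaggerMs: number = 5000,
  maxMissedJobsPerRestart: number = 5,
): CatchUpPlan {
  // Most overdue first: the smallest nextRunAtMs has waited the longest.
  const overdue = jobs
    .filter((job) => job.nextRunAtMs <= nowMs)
    .sort((a, b) => a.nextRunAtMs - b.nextRunAtMs);

  const runNow = overdue.slice(0, maxMissedJobsPerRestart).map((job) => job.id);

  // Assumed spacing: each deferred job fires one stagger interval after the
  // previous one, so the normal timer picks them up gradually.
  const deferred = overdue.slice(maxMissedJobsPerRestart).map((job, index) => ({
    id: job.id,
    nextRunAtMs: nowMs + (index + 1) * missedJobStaggerMs,
  }));

  return { runNow, deferred };
}
```

With seven overdue jobs and the defaults, five would run immediately and the remaining two would be rescheduled 5s and 10s after the restart.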

Fixes #18892
2026-03-09 06:07:43 +00:00
Peter Steinberger
03a6e3b460 test(cron): cover owner-only tool availability 2026-03-09 05:52:04 +00:00
Peter Steinberger
41e023a80b fix(cron): restore owner-only tools for isolated runs 2026-03-09 05:49:20 +00:00
GazeKingNuWu
41450187dd fix: clear plugin discovery cache after plugin installation (openclaw#39752)
Verified:
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: GazeKingNuWu <264914544+GazeKingNuWu@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-09 00:16:25 -05:00
Ayaan Zaidi
a40c29b11a Fix cron text announce delivery for Telegram targets (#40575)
Merged via squash.

Prepared head SHA: 54b1513c78613bddd8cae16ab2d617788a0dacb6
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus
2026-03-09 10:26:17 +05:30
Tyler Yust
38543d8196 fix(cron): consolidate announce delivery, fire-and-forget trigger, and minimal prompt mode (#40204)
* fix(cron): consolidate announce delivery and detach manual runs

* fix: queue detached cron runs (#40204)
2026-03-08 14:46:33 -07:00
Peter Steinberger
386b811ddd test(cron): relax concurrent start race timeout 2026-03-08 13:44:10 +00:00
gambletan
8a20f51460 fix: add rate limit patterns for 'too many tokens' and 'tokens per day' (#39377)
Merged via squash.

Prepared head SHA: 132a45728694053c0e3220e7d861508524f17244
Co-authored-by: gambletan <266203672+gambletan@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-08 13:03:33 +03:00
Peter Steinberger
149ae45bad fix(cron): preserve manual timeoutSeconds on add 2026-03-08 00:48:57 +00:00
Peter Steinberger
e66c418c45 refactor(cron): normalize legacy delivery at ingress 2026-03-08 00:48:57 +00:00
Peter Steinberger
9b99787c31 refactor(cron): extract delivery tool policy helpers 2026-03-08 00:48:57 +00:00
Peter Steinberger
45d3e62f50 refactor(cron): extract agent defaults merge helpers 2026-03-08 00:48:56 +00:00
Peter Steinberger
6b18ec479c refactor(cron): centralize initial delivery defaults 2026-03-08 00:48:56 +00:00
Peter Steinberger
4e07bdbdfd fix(cron): restore isolated delivery defaults 2026-03-08 00:18:45 +00:00
Peter Steinberger
92f5a2e252 fix(models): refresh gpt/gemini alias defaults (#38638, thanks @ademczuk)
Co-authored-by: ademczuk <andrew.demczuk@gmail.com>
2026-03-07 21:10:58 +00:00
Tyler Yust
e554c59aac fix(cron): eliminate double-announce and replace delivery polling with push-based flow (#39089)
* fix(cron): eliminate double-announce and replace delivery polling with push-based flow

- Set deliveryAttempted=true in announce early-return paths (active-subagent
  suppression and stale-interim suppression) so the heartbeat timer no longer
  fires a redundant enqueueSystemEvent fallback (double-announce bug).

- Refactor waitForDescendantSubagentSummary to use event-based agent.wait RPC
  calls instead of a 500ms busy-poll loop.  Each active descendant run is now
  awaited concurrently via Promise.allSettled, and only a short bounded grace
  period (5s) remains to capture the cron agent's post-orchestration synthesis.
  Eliminates O(n*timeoutMs/500ms) gateway calls and wasted wall-clock time.

- Add FAST_TEST_MODE (OPENCLAW_TEST_FAST=1) to subagent-followup.ts to keep
  the grace-period tests instant in CI.

- Add comprehensive tests for the new waitForDescendantSubagentSummary behaviour
  (push-based wait, error resilience, NO_REPLY handling, multi-descendant waits).
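
The concurrent wait described above can be sketched roughly as below. The `WaitForRun` callback is a hypothetical stand-in for the `agent.wait` RPC, whose real signature is not given in this log, and the 5s grace period is left out to keep the sketch short.

```typescript
type RunOutcome = { runId: string; status: "ok" | "error" };

// Hypothetical signature standing in for the agent.wait RPC call.
type WaitForRun = (runId: string) => Promise<RunOutcome>;

// Await every active descendant run concurrently instead of polling.
// A rejected wait is recorded as an error rather than aborting the others.
async function waitForDescendants(
  runIds: string[],
  waitForRun: WaitForRun,
): Promise<RunOutcome[]> {
  const settled = await Promise.allSettled(runIds.map((id) => waitForRun(id)));
  return settled.map((result, index) =>
    result.status === "fulfilled"
      ? result.value
      : { runId: runIds[index], status: "error" as const },
  );
}
```

Because each run is awaited once, the number of gateway calls grows with the number of descendants rather than with the timeout divided by the poll interval.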

* fix: prep cron double-announce followup tests (#39089) (thanks @tyler6204)
2026-03-07 12:13:37 -08:00
Peter Steinberger
c5bb6db85b refactor(cron): share isolated-agent turn core test setup 2026-03-07 17:58:31 +00:00
Peter Steinberger
41e0c35b61 refactor(cron): reuse cron job builder in issue-13992 tests 2026-03-07 17:58:31 +00:00
Peter Steinberger
8fd043abac refactor(cron): dedupe interim retry fallback assertions 2026-03-07 17:05:23 +00:00
Peter Steinberger
d01cb7b65f refactor(cron): share cron schedule resolver 2026-03-07 17:05:23 +00:00
Peter Steinberger
3c71e2bd48 refactor(core): extract shared dedup helpers 2026-03-07 10:41:05 +00:00
拐爷&&老拐瘦
2e31aead39 fix(gateway): invalidate bootstrap cache on session rollover (openclaw#38535)
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: yfge <1186273+yfge@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-06 23:46:02 -06:00
Altay
6e962d8b9e fix(agents): handle overloaded failover separately (#38301)
* fix(agents): skip auth-profile failure on overload

* fix(agents): note overload auth-profile fallback fix

* fix(agents): classify overloaded failures separately

* fix(agents): back off before overload failover

* fix(agents): tighten overload probe and backoff state

* fix(agents): persist overloaded cooldown across runs

* fix(agents): tighten overloaded status handling

* test(agents): add overload regression coverage

* fix(agents): restore runner imports after rebase

* test(agents): add overload fallback integration coverage

* fix(agents): harden overloaded failover abort handling

* test(agents): tighten overload classifier coverage

* test(agents): cover all-overloaded fallback exhaustion

* fix(cron): retry overloaded fallback summaries

* fix(cron): treat HTTP 529 as overloaded retry
2026-03-07 01:42:11 +03:00
Josh Lehman
fee91fefce feature(context): extend plugin system to support custom context management (#22201)
* feat(context-engine): add ContextEngine interface and registry

Introduce the pluggable ContextEngine abstraction that allows external
plugins to register custom context management strategies.

- ContextEngine interface with lifecycle methods: bootstrap, ingest,
  ingestBatch, afterTurn, assemble, compact, prepareSubagentSpawn,
  onSubagentEnded, dispose
- Module-level singleton registry with registerContextEngine() and
  resolveContextEngine() (config-driven slot selection)
- LegacyContextEngine: pass-through implementation wrapping existing
  compaction behavior for 100% backward compatibility
- ensureContextEnginesInitialized() guard for safe one-time registration
- 19 tests covering contract, registry, resolution, and legacy parity

* feat(plugins): add context-engine slot and registerContextEngine API

Wire the ContextEngine abstraction into the plugin system so external
plugins can register context engines via the standard plugin API.

- Add 'context-engine' to PluginKind union type
- Add 'contextEngine' slot to PluginSlotsConfig (default: 'legacy')
- Wire registerContextEngine() through OpenClawPluginApi
- Export ContextEngine types from plugin-sdk for external consumers
- Restore proper slot-based resolution in registry

* feat(context-engine): wire ContextEngine into agent run lifecycle

Integrate the ContextEngine abstraction into the core agent run path:

- Resolve context engine once per run (reused across retries)
- Bootstrap: hydrate canonical store from session file on first run
- Assemble: route context assembly through pluggable engine
- Auto-compaction guard: disable built-in auto-compaction when
  the engine declares ownsCompaction (prevents double-compaction)
- AfterTurn: post-turn lifecycle hook for ingest + background
  compaction decisions
- Overflow compaction: route through contextEngine.compact()
- Dispose: clean up engine resources in finally block
- Notify context engine on subagent lifecycle events

Legacy engine: all lifecycle methods are pass-through/no-op, preserving
100% backward compatibility for users without a context engine plugin.

* feat(plugins): add scoped subagent methods and gateway request scope

Expose runtime.subagent.{run, waitForRun, getSession, deleteSession}
so external plugins can spawn sub-agent sessions without raw gateway
dispatch access.

Uses AsyncLocalStorage request-scope bridge to dispatch internally via
handleGatewayRequest with a synthetic operator client. Methods are only
available during gateway request handling.

- Symbol.for-backed global singleton for cross-module-reload safety
- Fallback gateway context for non-WS dispatch paths (Telegram/WhatsApp)
- Set gateway request scope for all handlers, not just plugin handlers
- 3 staleness tests for fallback context hardening

* feat(context-engine): route /compact and sessions.get through context engine

Wire the /compact command and sessions.get handler through the pluggable
ContextEngine interface.

- Thread tokenBudget and force parameters to context engine compact
- Route /compact through contextEngine.compact() when registered
- Wire sessions.get as runtime alias for plugin subagent dispatch
- Add .pebbles/ to .gitignore

* style: format with oxfmt 0.33.0

Fix duplicate import (ControlUiRootState in server.impl.ts) and
import ordering across all changed files.

* fix: update extension test mocks for context-engine types

Add missing subagent property to bluebubbles PluginRuntime mock.
Add missing registerContextEngine to lobster OpenClawPluginApi mock.

* fix(subagents): keep deferred delete cleanup retryable

* style: format run attempt for CI

* fix(rebase): remove duplicate embedded-run imports

* test: add missing gateway context mock export

* fix: pass resolved auth profile into afterTurn compaction

Ensure the embedded runner forwards resolved auth profile context into
legacy context-engine compaction params on the normal afterTurn path,
matching overflow compaction behavior. This allows downstream LCM
summarization to use the intended provider auth/profile consistently.

Also fix strict TS typing in external-link token dedupe and align an
attempt unit test reasoningLevel value with the current ReasoningLevel
enum.

Regeneration-Prompt: |
  We were debugging context-engine compaction where downstream summary
  calls were missing the right auth/profile context in normal afterTurn
  flow, while overflow compaction already propagated it. Preserve current
  behavior and keep changes additive: thread the resolved authProfileId
  through run -> attempt -> legacy compaction param builder without
  broad refactors.

  Add tests that prove the auth profile is included in afterTurn legacy
  params and that overflow compaction still passes it through run
  attempts. Keep existing APIs stable, and only adjust small type issues
  needed for strict compilation.

* fix: remove duplicate imports from rebase

* feat: add context-engine system prompt additions

* fix(rebase): dedupe attempt import declarations

* test: fix fetch mock typing in ollama autodiscovery

* fix(test): add registerContextEngine to diffs extension mock APIs

* test(windows): use path.delimiter in ios-team-id fixture PATH

* test(cron): add model formatting and precedence edge case tests

Covers:
- Provider/model string splitting (whitespace, nested paths, empty segments)
- Provider normalization (casing, aliases like bedrock→amazon-bedrock)
- Anthropic model alias normalization (opus-4.5→claude-opus-4-5)
- Precedence: job payload > session override > config default
- Sequential runs with different providers (CI flake regression pattern)
- forceNew session preserving stored model overrides
- Whitespace/empty model string edge cases
- Config model as string vs object format
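
To make the covered cases concrete, here is a hedged sketch of provider/model splitting and the stated precedence. The `bedrock` alias and the precedence order come from the list above; `parseModelRef`, `resolveModel`, the split-on-first-slash rule, and the choice to skip unparseable values are assumptions for illustration only.

```typescript
// Only the bedrock -> amazon-bedrock alias is taken from the commit message.
const PROVIDER_ALIASES: Record<string, string> = { bedrock: "amazon-bedrock" };

interface ModelRef {
  provider: string;
  model: string;
}

// Hypothetical parser: split on the first "/" so nested model paths survive.
function parseModelRef(raw: string | undefined): ModelRef | undefined {
  const trimmed = (raw ?? "").trim();
  const slash = trimmed.indexOf("/");
  if (slash <= 0) {
    return undefined; // empty string or missing provider segment
  }
  const provider = trimmed.slice(0, slash).trim().toLowerCase();
  const model = trimmed.slice(slash + 1).trim();
  if (provider === "" || model === "") {
    return undefined;
  }
  return { provider: PROVIDER_ALIASES[provider] ?? provider, model };
}

// Precedence: job payload > session override > config default.
// Assumption: a value that fails to parse falls through to the next level.
function resolveModel(
  jobPayload?: string,
  sessionOverride?: string,
  configDefault?: string,
): ModelRef | undefined {
  return (
    parseModelRef(jobPayload) ??
    parseModelRef(sessionOverride) ??
    parseModelRef(configDefault)
  );
}
```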

* test(cron): fix model formatting test config types

* test(phone-control): add registerContextEngine to mock API

* fix: re-export ChannelKind from config-reload-plan

* fix: add subagent mock to plugin-runtime-mock test util

* docs: add changelog fragment for context engine PR #22201
2026-03-06 05:31:59 -08:00
Vincent Koc
44ec3e4111 Cron: stabilize runs-one-shot migration tests 2026-03-06 01:27:23 -05:00
Vincent Koc
a622aee45a Cron: migrate legacy provider delivery hints 2026-03-06 01:27:23 -05:00
aerelune
0e2bc588c4 fix: enforce 600 perms for cron store and run logs (#36078)
* fix: enforce secure permissions for cron store and run logs

* fix(cron): enforce dir perms and gate posix tests on windows

* Cron store tests: cover existing directory permission hardening

* Cron run-log tests: cover existing directory permission hardening

* Changelog: note cron file permission hardening

---------

Co-authored-by: linhey <linhey@mini.local>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-03-06 00:48:35 -05:00
Vignesh Natarajan
d45353f95b fix(agents): honor explicit rate-limit cooldown probes in fallback runs 2026-03-05 20:03:06 -08:00
Tyler Yust
81b93b9ce0 fix(subagents): announce delivery with descendant gating, frozen result refresh, and cron retry (#35080)
Thanks @tyler6204
2026-03-05 19:20:24 -08:00
Jacob Riff
aad372e15f feat: append UTC time alongside local time in shared Current time lines (#32423)
Merged via squash.

Prepared head SHA: 9e8ec13933b5317e7cff3f0bc048de515826c31a
Co-authored-by: jriff <50276+jriff@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-06 01:26:34 +03:00
Tak Hoffman
9741e91a64 test(cron): add cross-channel announce fallback regression coverage (openclaw#36197)
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check (fails on pre-existing origin/main lint debt in extensions/mattermost imports)
- pnpm test:macmini

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-05 07:37:37 -06:00
Tak Hoffman
544abc927f fix(cron): restore direct fallback after announce failure in best-effort mode (openclaw#36177)
Verified:
- pnpm build
- pnpm check (fails on pre-existing origin/main lint debt in extensions/mattermost imports)
- pnpm test:macmini

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-05 07:25:24 -06:00
Tak Hoffman
cc5dad81bc cron: unify stale-run recovery and preserve manual-run every anchors (#35363)
* cron: unify stale-run recovery and preserve manual every anchors

* cron: address unresolved review threads on recovery paths

* cron: remove duplicate timestamp helper after rebase
2026-03-04 22:12:32 -06:00
Tak Hoffman
28dc2e8a40 cron: narrow startup replay backoff guard (#35391) 2026-03-04 22:11:11 -06:00
Tak Hoffman
79d00ae398 fix(cron): stabilize restart catch-up replay semantics (#35351)
* Cron: stabilize restart catch-up replay semantics

* Cron: respect backoff in startup missed-run replay
2026-03-04 21:50:16 -06:00
sline
1059b406a8 fix: cron backup should preserve pre-edit snapshot (#35195) (#35234)
* fix(cron): avoid overwriting .bak during normalization

Fixes openclaw/openclaw#35195

* test(cron): preserve pre-edit bak snapshot in normalization path

---------

Co-authored-by: 0xsline <sline@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-04 21:46:27 -06:00
Sid
8b8167d547 fix(agents): bypass pendingDescendantRuns guard for cron announce delivery (#35185)
* fix(agents): bypass pendingDescendantRuns guard for cron announce delivery

Standalone cron job completions were blocked from direct channel delivery
when the cron run had spawned subagents that were still registered as
pending. The pendingDescendantRuns guard exists for live orchestration
coordination and should not apply to fire-and-forget cron announce sends.

Thread the announceType through the delivery chain and skip both the
child-descendant and requester-descendant pending-run guards when the
announce originates from a cron job.

Closes #34966

* fix: ensure outbound session entry for cron announce with named agents (#32432)

Named agents may not have a session entry for their delivery target,
causing the announce flow to silently fail (delivered=false, no error).

Two fixes:
1. Call ensureOutboundSessionEntry when resolving the cron announce
   session key so downstream delivery can find channel metadata.
2. Fall back to direct outbound delivery when announce delivery fails
   to ensure cron output reaches the target channel.

Closes #32432

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: guard announce direct-delivery fallback against suppression leaks (#32432)

The `!delivered` fallback condition was too broad — it caught intentional
suppressions (active subagents, interim messages, SILENT_REPLY_TOKEN) in
addition to actual announce delivery failures.  Add an
`announceDeliveryWasAttempted` flag so the direct-delivery fallback only
fires when `runSubagentAnnounceFlow` was actually called and failed.
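
The narrowed fallback condition can be illustrated with a small sketch. The `announceDeliveryWasAttempted` flag name comes from this commit message; the `AnnounceOutcome` shape and `shouldUseDirectFallback` helper are hypothetical.

```typescript
interface AnnounceOutcome {
  delivered: boolean;
  // True only when the announce flow was actually invoked.
  announceDeliveryWasAttempted: boolean;
}

// Hypothetical helper: fall back to direct delivery only for real failures,
// not for intentional suppressions where no announce was attempted.
function shouldUseDirectFallback(outcome: AnnounceOutcome): boolean {
  return outcome.announceDeliveryWasAttempted && !outcome.delivered;
}
```

A suppressed interim message reports `delivered: false` with `announceDeliveryWasAttempted: false`, so under this check it would no longer leak out through the fallback path.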

Also remove the redundant `if (route)` guard in
`resolveCronAnnounceSessionKey` since `resolved` being truthy guarantees
`route` is non-null.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(cron): harden announce synthesis follow-ups

---------

Co-authored-by: scoootscooob <zhentongfan@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-04 21:31:33 -06:00
Madoka
63ce7c74bd fix(feishu): comprehensive reply mechanism — outbound replyToId forwarding + topic-aware reply targeting (#33789)
* fix(feishu): comprehensive reply mechanism fix — outbound replyToId forwarding + topic-aware reply targeting

- Forward replyToId from ChannelOutboundContext through sendText/sendMedia
  to sendMessageFeishu/sendMarkdownCardFeishu/sendMediaFeishu, enabling
  reply-to-message via the message tool.

- Fix group reply targeting: use ctx.messageId (triggering message) in
  normal groups to prevent silent topic thread creation (#32980). Preserve
  ctx.rootId targeting for topic-mode groups (group_topic/group_topic_sender)
  and groups with explicit replyInThread config.

- Add regression tests for both fixes.

Fixes #32980
Fixes #32958
Related #19784

* fix: normalize Feishu delivery.to before comparing with messaging tool targets

- Add normalizeDeliveryTarget helper to strip user:/chat: prefixes for Feishu
- Apply normalization in matchesMessagingToolDeliveryTarget before comparison
- This ensures cron duplicate suppression works when session uses prefixed targets
  (user:ou_xxx) but messaging tool extract uses normalized bare IDs (ou_xxx)
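
A minimal sketch of the prefix normalization described above, assuming only the `user:` and `chat:` prefixes are stripped. The function names here are illustrative and may not match the real `normalizeDeliveryTarget` or `matchesMessagingToolDeliveryTarget` signatures.

```typescript
// Prefixes named in the commit message.
const FEISHU_TARGET_PREFIXES = ["user:", "chat:"];

// Hypothetical version of the helper: strip one known prefix, if present.
function normalizeDeliveryTargetSketch(target: string): string {
  const trimmed = target.trim();
  for (const prefix of FEISHU_TARGET_PREFIXES) {
    if (trimmed.startsWith(prefix)) {
      return trimmed.slice(prefix.length);
    }
  }
  return trimmed;
}

// Compare targets after normalization so "user:ou_xxx" matches "ou_xxx".
function deliveryTargetsMatch(sessionTarget: string, toolTarget: string): boolean {
  return (
    normalizeDeliveryTargetSketch(sessionTarget) ===
    normalizeDeliveryTargetSketch(toolTarget)
  );
}
```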

Fixes review comment on PR #32755

(cherry picked from commit fc20106f16ccc88a5f02e58922bb7b7999fe9dcd)

* fix(feishu): catch thrown SDK errors for withdrawn reply targets

The Feishu Lark SDK can throw exceptions (SDK errors with .code or
AxiosErrors with .response.data.code) for withdrawn/deleted reply
targets, in addition to returning error codes in the response object.

Wrap reply calls in sendMessageFeishu and sendCardFeishu with
try-catch to handle thrown withdrawn/not-found errors (230011,
231003) and fall back to client.im.message.create, matching the
existing response-level fallback behavior.

Also extract sendFallbackDirect helper to deduplicate the
direct-send fallback block across both functions.
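
The thrown-error handling described above might look roughly like this. The error codes and the two error shapes (`.code` and `.response.data.code`) come from the commit message; the helper names and the generic `sendWithFallback` wrapper are assumptions rather than the extension's actual code.

```typescript
// Withdrawn / not-found reply target codes named in the commit message.
const WITHDRAWN_REPLY_CODES = new Set<number>([230011, 231003]);

// Read a numeric code from either an SDK error (.code) or an
// Axios-style error (.response.data.code).
function extractErrorCode(err: unknown): number | undefined {
  if (typeof err !== "object" || err === null) {
    return undefined;
  }
  const e = err as { code?: unknown; response?: { data?: { code?: unknown } } };
  if (typeof e.code === "number") {
    return e.code;
  }
  const nested = e.response?.data?.code;
  return typeof nested === "number" ? nested : undefined;
}

function isWithdrawnReplyError(err: unknown): boolean {
  const code = extractErrorCode(err);
  return code !== undefined && WITHDRAWN_REPLY_CODES.has(code);
}

// Hypothetical wrapper: try the reply call, and send directly only when the
// reply target was withdrawn or deleted. Other errors are rethrown.
async function sendWithFallback<T>(
  reply: () => Promise<T>,
  sendDirect: () => Promise<T>,
): Promise<T> {
  try {
    return await reply();
  } catch (err) {
    if (isWithdrawnReplyError(err)) {
      return sendDirect();
    }
    throw err;
  }
}
```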

Closes #33496

(cherry picked from commit ad0901aec103a2c52f186686cfaf5f8ba54b4a48)

* feishu: forward outbound reply target context

(cherry picked from commit c129a691fcf552a1cebe1e8a22ea8611ffc3b377)

* feishu extension: tighten reply target fallback semantics

(cherry picked from commit f85ec610f267020b66713c09e648ec004b2e26f1)

* fix(feishu): align synthesized fallback typing and changelog attribution

* test(feishu): cover group_topic_sender reply targeting

---------

Co-authored-by: Xu Zimo <xuzimojimmy@163.com>
Co-authored-by: Munem Hashmi <munem.hashmi@gmail.com>
Co-authored-by: bmendonca3 <bmendonca3@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-04 20:32:28 -06:00
Gustavo Madeira Santana
e4b4486a96 Agent: unify bootstrap truncation warning handling (#32769)
Merged via squash.

Prepared head SHA: 5d6d4ddfa620011e267d892b402751847d5ac0c3
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-03-03 16:28:38 -05:00
Peter Steinberger
5193189953 refactor(tests): dedupe cron store migration setup 2026-03-03 01:54:27 +00:00