Commit Graph

2696 Commits

Author SHA1 Message Date
Mariano Belinky
4ec0af00fe Agents: fix embedded auth-profile failure helper typing 2026-02-24 15:16:11 +00:00
Peter Steinberger
878b4e0ed7 refactor: unify tools.fs workspaceOnly resolution 2026-02-24 15:14:05 +00:00
Peter Steinberger
13bfe7faa6 refactor(sandbox): share bind parsing and host-path policy checks 2026-02-24 15:04:47 +00:00
Peter Steinberger
370d115549 fix: enforce workspaceOnly for native prompt image autoload 2026-02-24 14:47:59 +00:00
Peter Steinberger
6da03eabe2 fix: add changelog and clean regression comment for tool-result guard (#25429) (thanks @mikaeldiakhate-cell) 2026-02-24 14:42:09 +00:00
Leakim
8db7ca8c02 fix: prevent synthetic toolResult for aborted/errored assistant messages
When an assistant message with toolCalls has stopReason 'aborted' or 'error',
the guard should not add those tool call IDs to the pending map. Creating
synthetic tool results for incomplete/aborted tool calls causes API 400 errors:
'unexpected tool_use_id found in tool_result blocks'

This aligns the WRITE path (session-tool-result-guard.ts) with the READ path
(session-transcript-repair.ts) which already skips aborted messages.

Fixes: orphaned tool_result causing session corruption

Tests added:
- does NOT create synthetic toolResult for aborted assistant messages
- does NOT create synthetic toolResult for errored assistant messages
2026-02-24 14:42:09 +00:00
Elarwei
aa2826b5b1 fix(usage): parse Kimi K2 cached_tokens from prompt_tokens_details
Kimi K2 models use automatic prefix caching and return cache stats in
a nested field: usage.prompt_tokens_details.cached_tokens

This fixes issue #7073 where cacheRead was showing 0 for K2.5 users.

Also adds cached_tokens (top-level) for moonshot-v1 explicit caching API.

Closes #7073
2026-02-24 14:40:52 +00:00
Peter Steinberger
b5787e4abb fix(sandbox): harden bind validation for symlink missing-leaf paths 2026-02-24 14:37:35 +00:00
Peter Steinberger
39631639b7 fix: add changelog + typed omission test note (#25314) (thanks @lbo728) 2026-02-24 14:22:02 +00:00
lbo728
b863316e7b fix(models): preserve user reasoning override when merging with built-in catalog
When a built-in provider model has reasoning:true (e.g. MiniMax-M2.5) and
the user explicitly sets reasoning:false in their config, mergeProviderModels
unconditionally overwrote the user's value with the built-in catalog value.

The merge code refreshes capability metadata (input, contextWindow, maxTokens,
reasoning) from the implicit catalog. This is correct for fields like
contextWindow and maxTokens — the catalog has authoritative values that
shouldn't be stale. But reasoning is a user preference, not just a
capability descriptor: users may need to disable it to avoid 'Message
ordering conflict' errors with certain models or backends.

Fix: check whether 'reasoning' is present in the explicit (user-supplied)
model entry. If the user has set it (even to false), honour that value.
If the user hasn't set it, fall back to the built-in catalog default.

This allows users to configure tools.models.providers.minimax.models with
reasoning:false for MiniMax-M2.5 without being silently overridden.

Fixes #25244
2026-02-24 14:22:02 +00:00
SidQin-cyber
99d854db82 fix(agents): await block-reply flush before tool execution starts
handleToolExecutionStart() flushed pending block replies and then called
onBlockReplyFlush() as fire-and-forget (`void`). This created a race where
fast tool results (especially media on Telegram) could be delivered before
the text block that preceded the tool call.

Await onBlockReplyFlush() so the block pipeline finishes before tool
execution continues, preserving delivery order.

Fixes #25267

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 14:11:40 +00:00
Peter Steinberger
d3ecc234da test: align flaky CI expectations after main changes (#24991) (thanks @stakeswky) 2026-02-24 04:34:49 +00:00
Peter Steinberger
d427d09b5e fix: align reasoning payload typing for #24991 (thanks @stakeswky) 2026-02-24 04:34:49 +00:00
User
7d76c241f8 fix: suppress reasoning payloads from generic channel dispatch path
When reasoningLevel is 'on', reasoning content was being sent as a
visible message to WhatsApp and other non-Telegram channels via two
paths:
1. Block reply: emitted via onBlockReply in handleMessageEnd
2. Final payloads: added to replyItems in buildEmbeddedRunPayloads

Telegram has its own dispatch path (bot-message-dispatch.ts) that
splits reasoning into a dedicated lane and handles suppression.
The generic dispatch-from-config.ts path used by WhatsApp, web, etc.
had no such filtering.

Fix:
- Add isReasoning?: boolean flag to ReplyPayload
- Tag reasoning payloads at both emission points
- Filter isReasoning payloads in dispatch-from-config.ts for both
  block reply and final reply paths

Telegram is unaffected: it uses its own deliver callback that detects
reasoning via the 'Reasoning:\n' prefix and routes to a separate lane.

Fixes #24954
2026-02-24 04:34:49 +00:00
zerone0x
bc52d4a459 fix(openrouter): skip reasoning effort injection for 'auto' routing model
The 'auto' model on OpenRouter dynamically routes to any underlying model
OpenRouter selects, including reasoning-required endpoints. Previously,
OpenClaw would unconditionally inject `reasoning.effort: "none"` into
every request when the thinking level was "off", which causes a 400 error
on models where reasoning is mandatory and cannot be disabled.

Root cause:
- openrouter/auto has reasoning: false in the built-in catalog
- With thinking level "off", createOpenRouterWrapper injects
  `reasoning: { effort: "none" }` via mapThinkingLevelToOpenRouterReasoningEffort
- For any OpenRouter-routed model that requires reasoning this results in:
  "400 Reasoning is mandatory for this endpoint and cannot be disabled"
- The reasoning: false is then persisted back to models.json on every
  ensureOpenClawModelsJson call, so manually removing it has no lasting effect

Fix:
- In applyExtraParamsToAgent, when provider is "openrouter" and the model
  id is "auto", pass undefined as thinkingLevel to createOpenRouterWrapper
  so no reasoning.effort is injected at all, letting OpenRouter's upstream
  model handle it natively
- Add an explanatory comment in buildOpenrouterProvider clarifying that the
  reasoning: false catalog value does NOT cause effort injection for "auto"

Users who need explicit reasoning control should target a specific model
id (e.g. openrouter/deepseek/deepseek-r1) rather than the auto router.

Fixes #24851

(cherry picked from commit aa554397980972d917dece09ab03c4cc15f5d100)
2026-02-24 04:33:51 +00:00
Ben Marvell
eae13d9367 test(agents): update test to match universal tool-result repair for OpenAI
The previous test asserted that OpenAI-responses sessions would NOT get
synthetic tool results for orphaned tool calls. With repairToolUseResultPairing
now running universally, the correct behavior is that orphaned tool calls
get a synthetic tool_result — matching what OpenAI actually requires.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 2edb0ffe0bf96e9e415c03458ff9cee6bf29bcbe)
2026-02-24 04:33:51 +00:00
Ben Marvell
252079f001 fix(agents): repair orphaned tool results for OpenAI after history truncation
repairToolUseResultPairing was gated behind !isOpenAi, skipping orphaned
tool_result cleanup for OpenAI providers. When limitHistoryTurns truncated
conversation history, tool_result messages whose matching tool_call was
before the truncation point survived and were sent as function_call_output
items with stale call_id references. OpenAI rejects these with:
"No tool call found for function call output with call_id ..."

Enable the repair universally — all providers need it after truncation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 97b065aa6e56fff97414bee26a6b6fc5a33f019a)
2026-02-24 04:33:50 +00:00
Marc Gratch
75969ed5c4 fix(plugins): pass session context to before_compaction hook in subscribe handler
The handleAutoCompactionStart handler was calling runBeforeCompaction with
only messageCount and an empty hook context. Plugins receiving this hook
could not identify the session or snapshot the transcript during
auto-compaction.

The other call site in compact.ts already passes the full payload
(messages, sessionFile, sessionKey). This aligns the subscribe handler
to do the same using ctx.params.session and ctx.params.sessionKey.

(cherry picked from commit 318a19d1a1a428ff1be2e03f51777c3829c6e322)
2026-02-24 04:33:50 +00:00
JackyWay
792bd6195c fix: recognize Bedrock as Anthropic-compatible in transcript policy
(cherry picked from commit 3b5154081cdd6f9ff94b35c50b8f57714f9ad381)
2026-02-24 04:33:50 +00:00
github-actions[bot]
3823587ada fix(agents): allow empty edit replacement text
(cherry picked from commit 3c21fc30d38ae69f59c7200cfae76642473b2f03)
2026-02-24 04:33:50 +00:00
Mitch McAlister
5710d72527 feat(agents): configurable default runTimeoutSeconds for subagent spawns
When sessions_spawn is called without runTimeoutSeconds, subagents
previously defaulted to 0 (no timeout). This adds a config key at
agents.defaults.subagents.runTimeoutSeconds so operators can set a
global default timeout for all subagent runs.

The agent-provided value still takes precedence when explicitly passed.
When neither the agent nor the config specifies a timeout, behavior is
unchanged (0 = no timeout), preserving backwards compatibility.

Updated for the subagent-spawn.ts refactor (logic moved from
sessions-spawn-tool.ts to spawnSubagentDirect).

Closes #19288

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 04:22:43 +00:00
zerone0x
ac6cec7677 fix(providers): strip trailing /v1 from Anthropic baseUrl to prevent double-path
The pi-ai Anthropic provider constructs the full API endpoint as
`${baseUrl}/v1/messages`. If a user configures
`models.providers.anthropic.baseUrl` with a trailing `/v1`
(e.g. "https://api.anthropic.com/v1"), the resolved URL becomes
"https://api.anthropic.com/v1/v1/messages" which the Anthropic API
rejects with a 404 / connection failure.

This regression appeared in v2026.2.22 when @mariozechner/pi-ai bumped
from 0.54.0 to 0.54.1, which started appending the /v1 segment where
the previous version did not.

Fix: in normalizeModelCompat(), detect anthropic-messages models and
strip a single trailing /v1 (with optional trailing slash) from the
configured baseUrl before it is handed to pi-ai. Models with baseUrls
that do not end in /v1 are unaffected. Non-anthropic-messages models
are not touched.

Adds 6 unit tests covering the normalisation scenarios.

Fixes #24709

(cherry picked from commit 4c4857fdcb3506dc277f9df75d4df5879dca8d41)
2026-02-24 04:20:30 +00:00
User
c7bf0dacb8 chore: remove unused isMinimal param from buildSkillsSection
Address review feedback: isMinimal is no longer referenced after the
early-return guard was removed in the parent commit.

(cherry picked from commit 2efe04d301c386f1c7dc93d4ae60de8fac8a63b2)
2026-02-24 04:20:30 +00:00
User
2398b51378 fix: include available_skills in isolated cron agentTurn sessions (closes #24888)
buildSkillsSection() had an early-return guard on isMinimal that silently
dropped the entire <available_skills> block for any session using
promptMode="minimal" — which includes all isolated cron agentTurn sessions
(isCronSessionKey → promptMode="minimal" in attempt.ts:497-500).

Fix: remove the isMinimal guard from buildSkillsSection so that skills are
emitted whenever a non-empty skillsPrompt is provided, regardless of mode.
Memory, docs, reply-tags, and other verbose sections remain gated on isMinimal.

Tests added:
- "includes skills in minimal prompt mode when skillsPrompt is provided (cron regression)"
- "omits skills in minimal prompt mode when skillsPrompt is absent"
- Updated existing minimal-mode test expectation to match corrected behaviour.

(cherry picked from commit 66af86e7eede75721a0439cff595209aa4548eff)
2026-02-24 04:20:30 +00:00
SidQin-cyber
f3459d71e8 fix(exec): treat shell exit codes 126/127 as failures instead of completed
When a command exits with code 127 (command not found) or 126 (not
executable), the exec tool previously returned status "completed" with
the error buried in the output text. This caused cron jobs to report
status "ok" and never increment consecutiveErrors, silently swallowing
failures like `python: command not found` across multiple daily cycles.

Now these shell-reserved exit codes are classified as "failed", which
propagates through the cron pipeline to properly increment
consecutiveErrors and surface the issue for operator attention.

Fixes #24587

Co-authored-by: Cursor <cursoragent@cursor.com>
(cherry picked from commit 2b1d1985ef09000977131bbb1a5c2d732b6cd6e4)
2026-02-24 04:20:30 +00:00
HeMuling
3c13f4c2b4 test(subagents): mock sessions store in steer-restart coverage 2026-02-24 04:17:56 +00:00
HeMuling
d0e008d460 chore(status): clarify bootstrap file semantics 2026-02-24 04:17:56 +00:00
HeMuling
c3b3065cc9 fix(subagents): reconcile orphaned restored runs 2026-02-24 04:17:56 +00:00
Keith
b2719d00ff fix(subagents): restore isInternalMessageChannel guard in resolveAnnounceOrigin
Restores the narrower internal-channel guard from PR #22223 (fe57bea08) that was
inadvertently reverted by f555835b0.

The original !isDeliverableMessageChannel() check strips the requester's channel
whenever it is not in the registered deliverable set. This causes delivery
failures for plugin channels whose adapter ID differs from their plugin ID (e.g.
"gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce
falls back to stale session routes — typically WhatsApp — resulting in a timeout
followed by an E.164 format error.

Replacing with isInternalMessageChannel() limits stripping to explicitly internal
channels (webchat), preserving the requester origin for all external channels
regardless of whether they are currently in the deliverable list.

Fixes: #22223 regression introduced in f555835b0

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-24 04:13:40 +00:00
Sahil Satralkar
420d8c663c Tests/Typing: stabilize subagent completion routing changes 2026-02-24 04:12:25 +00:00
Sahil Satralkar
f9ffd41cfa Subagents: fallback completion announce to internal session when outbound route is incomplete 2026-02-24 04:12:25 +00:00
Sahil Satralkar
3eabd53898 Tests: add regressions for subagent completion fallback and explicit direct route 2026-02-24 04:12:25 +00:00
root
8d2035633b fix(agents): include SOUL.md, IDENTITY.md, USER.md in subagent/cron bootstrap allowlist
Subagent and isolated cron sessions only loaded AGENTS.md and TOOLS.md,
causing subagents to lose their role personality, identity, and user
preferences. Expand MINIMAL_BOOTSTRAP_ALLOWLIST to include the three
missing identity files.

Closes #24852

(cherry picked from commit c33377150eeddb42c2a24f4a48c2d01b5cdf8d3e)
2026-02-24 04:04:35 +00:00
Peter Steinberger
de0e01259a fix: expand openrouter thinking-off regression coverage (#24863) (thanks @DevSecTim) 2026-02-24 03:54:29 +00:00
Tim Jones
b96d32c1c2 chore: fix oxfmt formatting in extraparams test
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 03:54:29 +00:00
Tim Jones
3e974dc93f fix: don't inject reasoning: { effort: "none" } for OpenRouter when thinking is off
"off" is a truthy string, so the existing guard `if (thinkingLevel && ...)`
was always entering the injection block and sending `reasoning: { effort: "none" }`
to every OpenRouter request — even when thinking wasn't enabled. Models that
require reasoning (e.g. deepseek/deepseek-r1) reject this with:
  400 Reasoning is mandatory for this endpoint and cannot be disabled.

Fix: skip the reasoning injection entirely when thinkingLevel is "off".
The reasoning_effort flat-field cleanup still runs. Omitting the reasoning
field lets each model use its own default behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 03:54:29 +00:00
Peter Steinberger
6c1ed9493c fix: harden queue retry debounce and add regression tests 2026-02-24 03:52:49 +00:00
Peter Steinberger
dd145f1346 fix: suppress sessions_send warning leakage coverage (#24740) (thanks @Glucksberg) 2026-02-24 03:49:52 +00:00
Glucksberg
947883d2e0 fix: suppress sessions_send error warnings from leaking to chat (#23989)
sessions_send timeout/error results were being surfaced as raw warning
messages in Telegram chats because the tool is classified as mutating,
which forces error warnings to always be shown. However, sessions_send
failures are transient inter-session communication issues where the
message may still have been delivered, so they should not leak to users.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 03:49:52 +00:00
Sid
d95ee859f8 fix(cron): use full prompt mode for isolated cron sessions to include skills (#24944)
Isolated cron sessions (agentTurn) were grouped with subagent sessions
under the "minimal" prompt mode, which causes buildSkillsSection to
return an empty array. This meant <available_skills> was never included
in the system prompt for isolated cron runs.

Subagent sessions legitimately need minimal prompts (reduced context),
but isolated cron sessions are full agent turns that should have access
to all configured skills, matching the behavior of normal chat sessions
and non-isolated cron runs.

Remove isCronSessionKey from the minimal prompt condition so only
subagent sessions use "minimal" mode.

Fixes openclaw#24888

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 03:33:54 +00:00
Peter Machona
9ced64054f fix(auth): classify missing OAuth scopes as auth failures (#24761) 2026-02-24 03:33:44 +00:00
banna-commits
e3da57d956 fix: add exponential backoff to announce queue drain on failure (#24783)
When the gateway rejects connections (e.g. scope-upgrade 'pairing required'),
the announce queue drain loop would retry every ~1s indefinitely because
the only delay was the fixed debounceMs (default 1000ms).

This adds a consecutiveFailures counter with exponential backoff:
2s, 4s, 8s, 16s, 32s, 60s (capped). The counter resets on successful drain.

The backoff is applied by shifting lastEnqueuedAt forward so that
waitForQueueDebounce naturally delays the next attempt.

Fixes #24777

Co-authored-by: Knut <knut@Knut-sin-Mac-mini.local>
2026-02-24 03:33:34 +00:00
Adam
d07d24eebe fix: clamp poll sleep duration to non-negative in bash-tools process (#24889)
`Math.min(250, deadline - Date.now())` could return a negative value if
the deadline expired between the while-condition check and the setTimeout
call. Wrap with `Math.max(0, ...)` to ensure the sleep is never negative.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 03:22:58 +00:00
Omair Afzal
19c43eade2 fix(memory): strip null bytes from workspace paths causing ENOTDIR (#24876)
Add stripNullBytes() helper and apply it to all return paths in
resolveAgentWorkspaceDir() including configured, default, and
state-dir-derived paths. Null bytes in paths cause ENOTDIR errors
when Node tries to resolve them as directories.
2026-02-24 03:22:42 +00:00
Omair Afzal
177f167eab fix: guard .trim() calls on potentially undefined workspaceDir (#24875)
Change workspaceDir param type from string to string | undefined in
resolvePluginSkillDirs and use nullish coalescing before .trim() to
prevent TypeError when workspaceDir is undefined.
2026-02-24 03:22:39 +00:00
Peter Steinberger
7b2b86c60a fix(exec): add approval race changelog and regressions 2026-02-24 03:22:05 +00:00
Peter Steinberger
6f0dd61795 fix(exec): restore two-phase approval registration flow 2026-02-24 03:16:36 +00:00
Peter Steinberger
204d9fb404 refactor(security): dedupe shell env probe and add path regression test 2026-02-24 03:11:33 +00:00
Peter Steinberger
4a3f8438e5 fix(gateway): bind node exec approvals to nodeId 2026-02-24 03:05:58 +00:00
Peter Steinberger
e578521ef4 fix(security): harden session export image data-url handling 2026-02-24 02:53:39 +00:00