Arrow function passed to registerInternalHook was implicitly returning
the number from Array.push(), which is not assignable to void | Promise<void>.
Use block body to discard the return value.
Adds two new internal hook events that fire after media/link processing:
- message:transcribed: fires when audio has been transcribed, providing
the transcript text alongside the original body and media metadata.
Useful for logging, analytics, or routing based on spoken content.
- message:preprocessed: fires for every message after all media + link
understanding completes. Gives hooks access to the fully enriched body
(transcripts, image descriptions, link summaries) before the agent sees it.
Both hooks are added in get-reply.ts, after applyMediaUnderstanding and
applyLinkUnderstanding. message:received and message:sent are already
in upstream (f07bb8e8) and are not duplicated here.
Typed contexts (MessageTranscribedHookContext, MessagePreprocessedHookContext)
and type guards (isMessageTranscribedEvent, isMessagePreprocessedEvent) added
to internal-hooks.ts alongside the existing received/sent types.
Test coverage in src/hooks/message-hooks.test.ts.
* fix(hooks): deduplicate after_tool_call hook in embedded runs
(cherry picked from commit c129a1a74ba247460c6c061776ceeb995c757ecf)
* fix(hooks): propagate sessionKey in after_tool_call context
The after_tool_call hook in handleToolExecutionEnd was passing
`sessionKey: undefined` in the ToolContext, even though the value is
available on ctx.params. This broke plugins that need session context
in after_tool_call handlers (e.g., for per-session audit trails or
security logging).
- Add `sessionKey` to the `ToolHandlerParams` Pick type
- Pass `ctx.params.sessionKey` through to the hook context
- Add test assertion to prevent regression
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit b7117384fc1a09d60b25db0f80847a3519ddb3c3)
* fix(hooks): thread agentId through to after_tool_call hook context
Follow-up to #30511 — the after_tool_call hook context was passing
`agentId: undefined` because SubscribeEmbeddedPiSessionParams did not
carry the agent identity. This threads sessionAgentId (resolved in
attempt.ts) through the session params into the tool handler context,
giving plugins accurate agent-scoped context for both before_tool_call
and after_tool_call hooks.
Changes:
- Add `agentId?: string` to SubscribeEmbeddedPiSessionParams
- Add "agentId" to ToolHandlerParams Pick type
- Pass `agentId: sessionAgentId` at the subscribeEmbeddedPiSession()
call site in attempt.ts
- Wire ctx.params.agentId into the after_tool_call hook context
- Update tests to assert agentId propagation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit aad01edd3e0d367ad0defeb691d5689ddf2ee617)
* changelog: credit after_tool_call hook contributors
* Update CHANGELOG.md
* agents: preserve adjusted params until tool end
* agents: emit after_tool_call with adjusted args
* tests: cover adjusted after_tool_call params
* tests: align adapter after_tool_call expectation
---------
Co-authored-by: jbeno <jim@jimbeno.net>
Co-authored-by: scoootscooob <zhentongfan@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix(sandbox): prevent Windows PATH from poisoning docker exec shell lookup
On Windows hosts, `buildDockerExecArgs` passes the host PATH env var
(containing Windows paths like `C:\Windows\System32`) to `docker exec -e
PATH=...`. Docker uses this PATH to resolve the executable argument
(`sh`), which fails because Windows paths don't exist in the Linux
container — producing `exec: "sh": executable file not found in $PATH`.
Two changes:
- Skip PATH in the `-e` env loop (it's already handled separately via
OPENCLAW_PREPEND_PATH + shell export)
- Use absolute `/bin/sh` instead of bare `sh` to eliminate PATH
dependency entirely
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style: add braces around continue to satisfy linter
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(test): update assertion to match /bin/sh in buildDockerExecArgs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When a cron agent emits multiple text payloads (narration + tool
summaries) followed by a final HEARTBEAT_OK, the delivery suppression
check `isHeartbeatOnlyResponse` fails because it uses `.every()` —
requiring ALL payloads to be heartbeat tokens. In practice, agents
narrate their work before signaling nothing needs attention.
Fix: check if ANY payload contains HEARTBEAT_OK (`.some()`) while
preserving the media delivery exception (if any payload has media,
always deliver). This matches the semantic intent: HEARTBEAT_OK is
the agent's explicit signal that nothing needs user attention.
Real-world example: heartbeat agent returns 3 payloads:
1. "It's 12:49 AM — quiet hours. Let me run the checks quickly."
2. "Emails: Just 2 calendar invites. Not urgent."
3. "HEARTBEAT_OK"
Previously: all 3 delivered to Telegram. Now: correctly suppressed.
Related: #32013 (fixed a different HEARTBEAT_OK leak path via system
events in timer.ts)
Fixes duplicate message processing in Slack DMs where both message.im
and app_mention events fire for the same message, causing:
- 2x token/credit usage per message
- 2x API calls
- Duplicate agent invocations with same runId
Root cause: app_mention events should only fire for channel mentions,
not DMs. Added channel_type check to skip im/mpim in app_mention handler.
Evidence of bug (from production logs):
- Same runId firing twice within 200-300ms
- Example: runId 13cd482c... at 20:32:42.699Z and 20:32:42.954Z
After fix:
- One message = one runId = one processing run
- 50% reduction in duplicate processing
Slack's Events API includes the parent message's files array in every
thread reply event payload. This caused OpenClaw to re-download and
attach the parent's files to every text-only thread reply, creating
ghost media attachments.
The fix filters out files that belong to the thread starter by comparing
file IDs. The resolveSlackThreadStarter result is already cached, so
this adds no extra API calls.
Closes#32203
The session-store cache used only mtime for invalidation. In fast CI
runs (especially under bun), test writes to the session store can
complete within the same filesystem mtime granularity (~1s on HFS+/ext4),
so the cache returns stale data. This caused non-deterministic failures
in model precedence tests where a session override written to disk was
not observed by the next loadSessionStore() call.
Fix: add file size as a secondary cache invalidation signal. The cache
now checks both mtimeMs and sizeBytes — if either differs from the
cached values, it reloads from disk.
Changes:
- cache-utils.ts: add getFileSizeBytes() helper
- sessions/store.ts: extend SessionStoreCacheEntry with sizeBytes field,
check size in cache-hit path, populate size on cache writes
- sessions.cache.test.ts: add regression test for same-mtime rewrite
When echoTranscript is enabled in tools.media.audio config, the
transcription text is sent back to the originating chat immediately
after successful audio transcription — before the agent processes it.
This lets users verify what was heard from their voice note.
Changes:
- config/types.tools.ts: add echoTranscript (bool) and echoFormat
(string template) to MediaUnderstandingConfig
- media-understanding/apply.ts: sendTranscriptEcho() helper that
resolves channel/to from ctx, guards on isDeliverableMessageChannel,
and calls deliverOutboundPayloads best-effort
- config/schema.help.ts: help text for both new fields
- config/schema.labels.ts: labels for both new fields
- media-understanding/apply.echo-transcript.test.ts: 10 vitest cases
covering disabled/enabled/custom-format/no-audio/failed-transcription/
non-deliverable-channel/missing-from/OriginatingTo/delivery-failure
Default echoFormat: '📝 "{transcript}"'
Closes#32102
Expose audio transcription through the PluginRuntime so external
plugins (e.g. marmot) can use openclaw's media-understanding provider
framework without importing unexported internal modules.
The new transcribeAudioFile() wraps runCapability({capability: "audio"})
and reads provider/model/apiKey from tools.media.audio in the config,
matching the pattern used by the Discord VC implementation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Increase test audio file sizes to meet MIN_AUDIO_FILE_BYTES (1024) threshold
introduced by the skip-empty-audio feature. Fix localPathRoots in skip-tiny-audio
tests so temp files pass path validation. Remove undefined loadApply() call
in apply.test.ts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a minimum file size guard (MIN_AUDIO_FILE_BYTES = 1024) before
sending audio to transcription APIs. Files below this threshold are
almost certainly empty or corrupt and would cause unhelpful errors
from Whisper/Deepgram/Groq providers.
Changes:
- Add 'tooSmall' skip reason to MediaUnderstandingSkipError
- Add MIN_AUDIO_FILE_BYTES constant (1024 bytes) to defaults
- Guard both provider and CLI audio paths in runner.ts
- Add comprehensive tests for tiny, empty, and valid audio files
- Update existing test fixtures to use audio files above threshold
runProviderEntry now calls resolveProxyFetchFromEnv() and passes the
result as fetchFn to transcribeAudio/describeVideo, so media provider
API calls respect HTTPS_PROXY/HTTP_PROXY behind corporate proxies.
Move makeProxyFetch to src/infra/net/proxy-fetch.ts and add
resolveProxyFetchFromEnv which reads standard proxy env vars
(HTTPS_PROXY, HTTP_PROXY, and lowercase variants) and returns a
proxy-aware fetch via undici's EnvHttpProxyAgent. Telegram re-exports
from the shared location to avoid duplication.
The openai provider implements transcribeAudio via
transcribeOpenAiCompatibleAudio (Whisper API), but its capabilities
array only declared ["image"]. This caused the media-understanding
runner to skip the openai provider when processing inbound audio
messages, resulting in raw audio files being passed to agents
instead of transcribed text.
Fix: Add "audio" to the capabilities array so the runner correctly
selects the openai provider for audio transcription.
Co-authored-by: Cursor <cursoragent@cursor.com>