Commit Graph

10552 Commits

Author SHA1 Message Date
AI南柯(KingMo)
30ab9b2068 fix(agents): recognize connection errors as retryable timeout failures (#31697)
* fix(agents): recognize connection errors as retryable timeout failures

## Problem

When a model endpoint becomes unreachable (e.g., local proxy down,
relay server offline), the failover system fails to switch to the
next candidate model. Errors like "Connection error." are not
classified as retryable, causing the session to hang on a broken
endpoint instead of falling back to healthy alternatives.

## Root Cause

Connection/network errors are not recognized by the current failover
classifier:
- Text patterns like "Connection error.", "fetch failed", "network error"
- Error codes like ECONNREFUSED, ENOTFOUND, EAI_AGAIN (in message text)

While `failover-error.ts` handles these as error codes (err.code),
it misses them when they appear as plain text in error messages.

## Solution

Extend timeout error patterns to include connection/network failures:

**In `errors.ts` (ERROR_PATTERNS.timeout):**
- Text: "connection error", "network error", "fetch failed", etc.
- Regex: /\beconn(?:refused|reset|aborted)\b/i, /\benotfound\b/i, /\beai_again\b/i

**In `failover-error.ts` (TIMEOUT_HINT_RE):**
- Same patterns for non-assistant error paths

## Testing

Added test cases covering:
- "Connection error."
- "fetch failed"
- "network error: ECONNREFUSED"
- "ENOTFOUND" / "EAI_AGAIN" in message text

## Impact

- **Compatibility:** High - only expands retryable error detection
- **Behavior:** Connection failures now trigger automatic fallback
- **Risk:** Low - changes are additive and well-tested

* style: fix code formatting for test file
2026-03-03 02:37:23 +00:00
AaronWander
4c32411bee fix(exec): suggest increasing timeout on timeouts 2026-03-03 02:35:10 +00:00
Gustavo Madeira Santana
91cdb703bd Agents: add context metadata warmup retry backoff 2026-03-02 21:34:55 -05:00
john
04ac688dff fix(acp): use publishable acpx install hint 2026-03-03 02:34:07 +00:00
苏敏童0668001043
b29e913efe fix(docker): correct awk quoting in Docker GPG fingerprint check (#32153) 2026-03-03 02:32:46 +00:00
Peter Steinberger
895abc5a64 perf(security): allow audit snapshot and summary cache reuse 2026-03-03 02:32:13 +00:00
Peter Steinberger
62582fc088 perf(agents): cache per-pass context char estimates 2026-03-03 02:32:13 +00:00
Peter Steinberger
57336203d5 test(telegram): move preview-finalization cases to lane unit tests 2026-03-03 02:32:13 +00:00
Peter Steinberger
1929151103 refactor(telegram): extract sequential key module 2026-03-03 02:32:13 +00:00
Peter Steinberger
6ab9e00e17 fix: resolve pi-tools typing regressions 2026-03-03 02:27:59 +00:00
Peter Steinberger
1dd77e4106 refactor(slack): extract socket reconnect policy helpers 2026-03-03 02:19:34 +00:00
Peter Steinberger
4d52dfe85b refactor(sessions): add explicit merge activity policies 2026-03-03 02:19:34 +00:00
Peter Steinberger
d380ed710d refactor(agents): split pi-tools param and host-edit wrappers 2026-03-03 02:19:34 +00:00
Peter Steinberger
03755f8463 test(telegram): dedupe streaming cases and tighten sequential key checks 2026-03-03 02:14:15 +00:00
Peter Steinberger
7fdbf1202e test(security): reduce audit fixture setup overhead 2026-03-03 02:14:15 +00:00
Peter Steinberger
70db52de71 test(agents): centralize AgentMessage fixtures and remove unsafe casts 2026-03-03 02:14:15 +00:00
Gustavo Madeira Santana
15a0455d04 CLI: unify routed config positional parsing 2026-03-02 21:11:53 -05:00
倪汉杰0668001185
0fb3f188b2 fix(agents): only recover edit when oldText no longer in file (review feedback) 2026-03-03 02:06:59 +00:00
倪汉杰0668001185
bf6aa7ca67 fix(agents): treat host edit tool as success when file contains newText after upstream throw (fixes #32333) 2026-03-03 02:06:59 +00:00
Peter Steinberger
0fd77c9856 refactor: modularize plugin runtime and test hooks 2026-03-03 02:06:58 +00:00
Peter Steinberger
c7ec237089 fix: fail fast on non-recoverable slack auth errors (#32377) (thanks @scoootscooob) 2026-03-03 01:59:47 +00:00
scoootscooob
1ae82be55a fix(slack): fail fast on non-recoverable auth errors instead of retry loop
When a Slack bot is removed from a workspace while still configured in
OpenClaw, the gateway enters an infinite retry loop on account_inactive
or invalid_auth errors, making the entire gateway unresponsive.

Add isNonRecoverableSlackAuthError() to detect permanent credential
failures (account_inactive, invalid_auth, token_revoked, etc.) and
throw immediately instead of retrying.  This mirrors how the Telegram
provider already distinguishes recoverable network errors from fatal
auth errors via isRecoverableTelegramNetworkError().

The check is applied in both the startup catch block and the disconnect
reconnect path so stale credentials always fail fast with a clear error
message.

Closes #32366

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 01:59:47 +00:00
romeodiaz
a467517b2b fix(sessions): preserve idle reset timestamp on inbound metadata 2026-03-03 01:57:53 +00:00
nico-hoff
3eec79bd6c feat(memory): add Ollama embedding provider (#26349)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: ac413865431614c352c3b29f2dfccc5593f0605a
Co-authored-by: nico-hoff <43175972+nico-hoff@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-03-02 20:56:40 -05:00
Peter Steinberger
4ba5937ef9 refactor(tests): dedupe tools invoke http request helpers 2026-03-03 01:54:28 +00:00
Peter Steinberger
6fc3f504d6 refactor(tests): dedupe media transcript echo config setup 2026-03-03 01:54:28 +00:00
Peter Steinberger
b17687b775 refactor(tests): dedupe security fix scenario helpers 2026-03-03 01:54:27 +00:00
Peter Steinberger
eca242b971 refactor(tests): dedupe manifest registry link fixture setup 2026-03-03 01:54:27 +00:00
Peter Steinberger
4494844d17 refactor(tests): dedupe discord monitor e2e fixtures 2026-03-03 01:54:27 +00:00
Peter Steinberger
5193189953 refactor(tests): dedupe cron store migration setup 2026-03-03 01:54:27 +00:00
Peter Steinberger
fbb88d5063 refactor(tests): dedupe isolated agent cron turn assertions 2026-03-03 01:54:27 +00:00
Peter Steinberger
c0715db3c8 fix: add session hook context regression tests (#26394) (thanks @tempeste) 2026-03-03 01:48:46 +00:00
tempeste
20c15ccc63 Plugins: add sessionKey to session lifecycle hooks 2026-03-03 01:48:46 +00:00
Peter Steinberger
61f29830bc fix(test): resolve upstream typing drift in feishu and cron suites 2026-03-03 01:44:21 +00:00
Peter Steinberger
47736e3432 refactor(test): extract cron issue-regression harness and frozen-time helper 2026-03-03 01:44:21 +00:00
Peter Steinberger
39520ad21b test(agents): tighten pi message typing and dedupe malformed tool-call cases 2026-03-03 01:44:21 +00:00
Sk Akram
bd8c3230e8 fix: force supportsDeveloperRole=false for non-native OpenAI endpoints (#29479)
Merged via squash.

Prepared head SHA: 1416c584ac4cdc48af9f224e3d870ef40900c752
Co-authored-by: akramcodez <179671552+akramcodez@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-03-02 20:43:49 -05:00
Peter Steinberger
ebbb572639 fix: add requestHeartbeatNow runtime coverage (#19464) (thanks @AustinEral) 2026-03-03 01:40:31 +00:00
Austin Eral
40e5c6a18d feat(plugins): expose requestHeartbeatNow on plugin runtime
Add requestHeartbeatNow to PluginRuntime.system so extensions can
trigger an immediate heartbeat wake without importing internal modules.

This enables extensions to inject a system event and wake the agent
in one step — useful for inbound message handlers that use the
heartbeat model (e.g. agent-to-agent DMs via Nostr).

Changes:
- src/plugins/runtime/types.ts: add RequestHeartbeatNow type alias
  and requestHeartbeatNow to PluginRuntime.system
- src/plugins/runtime/index.ts: import and wire requestHeartbeatNow
  into createPluginRuntime()
2026-03-03 01:40:31 +00:00
David Rudduck
11e1363d2d feat(hooks): add trigger and channelId to plugin hook agent context (#28623)
* feat(hooks): add trigger and channelId to plugin hook agent context

Adds `trigger` and `channelId` fields to `PluginHookAgentContext` so
plugins can determine what initiated the agent run and which channel
it originated from, without session-key parsing or Redis bridging.

trigger values: "user", "heartbeat", "cron", "memory"
channelId values: "telegram", "discord", "whatsapp", etc.

Both fields are threaded through run.ts and attempt.ts hookCtx so all
hook phases receive them (before_model_resolve, before_prompt_build,
before_agent_start, llm_input, llm_output, agent_end).

channelId falls back from messageChannel to messageProvider when the
former is not set. followup-runner passes originatingChannel so queued
followup runs also carry channel context.

* docs(changelog): note hook context parity fix for #28623

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-03-02 17:39:20 -08:00
Peter Steinberger
ee646dae82 fix: add runtime.events regression tests (#16044) (thanks @scifantastic) 2026-03-03 01:37:56 +00:00
SciFantastic
85f01cd9eb Fix styles 2026-03-03 01:37:56 +00:00
SciFantastic
bab5d994bc docs: expand JSDoc for onSessionTranscriptUpdate params and return 2026-03-03 01:37:56 +00:00
SciFantastic
2365c6c86a docs: add JSDoc to onSessionTranscriptUpdate 2026-03-03 01:37:56 +00:00
SciFantastic
b91a22a3fb style: fix indentation in transcript-events 2026-03-03 01:37:56 +00:00
SciFantastic
2aab6dff76 fix: wrap transcript event listeners in try/catch to prevent throw propagation 2026-03-03 01:37:56 +00:00
SciFantastic
980388fcf0 plugin-sdk: expose onAgentEvent + onSessionTranscriptUpdate via PluginRuntime.events 2026-03-03 01:37:56 +00:00
Peter Steinberger
2f6718b8e7 refactor(gateway): extract channel health policy and timing aliases 2026-03-03 01:37:39 +00:00
Peter Steinberger
b5350bf46f refactor(outbound): unify channel selection and action input normalization 2026-03-03 01:37:39 +00:00
Peter Steinberger
0f5f20ee6b refactor(tests): dedupe cron delivered status assertions 2026-03-03 01:37:12 +00:00