Peter Steinberger
adfbbcf1f6
chore: merge origin/main into main
2026-02-22 13:42:52 +00:00
Peter Steinberger
1becebe188
fix: harden session lock contention and cleanup
2026-02-22 13:40:55 +00:00
Peter Steinberger
2c40a20737
test: trim background hold duration in abort coverage
2026-02-22 12:38:57 +00:00
Peter Steinberger
5b23159c4c
test: create homedir before sandbox image mkdtemp
2026-02-22 12:35:38 +00:00
Peter Steinberger
96515a5729
test: merge duplicate read-tool content coverage cases
2026-02-22 12:32:05 +00:00
Peter Steinberger
c8a4977378
test: replace mtime sleep with explicit utimes bump
2026-02-22 12:29:53 +00:00
Peter Steinberger
dc356ae1c2
test: remove duplicate workspace path-resolution case
2026-02-22 12:27:55 +00:00
Peter Steinberger
c7a4346e4d
test: remove sharp dependency from read-tool metadata test
2026-02-22 12:27:10 +00:00
Peter Steinberger
60a0291bf8
test: dedupe workspace path-resolution scenarios
2026-02-22 12:25:57 +00:00
Peter Steinberger
07527e22ce
refactor(auth-profiles): centralize active-window logic + strengthen regression coverage
2026-02-22 13:23:19 +01:00
Peter Steinberger
1152b25866
fix(gateway): guard trim crashes in subagent flow
2026-02-22 13:21:26 +01:00
Peter Steinberger
0d0f4c6992
refactor(exec): centralize safe-bin policy checks
2026-02-22 13:18:25 +01:00
Artale
51e9c54f09
fix(agents): skip bootstrap files with undefined path (#22698)
* fix(agents): skip bootstrap files with undefined path
buildBootstrapContextFiles() called file.path.replace() without checking
that path was defined. If a hook pushed a bootstrap file using 'filePath'
instead of 'path', the function threw TypeError and crashed every agent
session — not just the misconfigured hook.
Fix: add a null-guard before the path.replace() call. Files with undefined
path are skipped with a warning so one bad hook can't take down all agents.
Also adds a test covering the undefined-path case.
Fixes #22693
* fix: harden bootstrap path validation and report guards (#22698) (thanks @arosstale)
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-22 13:17:07 +01:00
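The null-guard described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `BootstrapFile` shape, the `collectBootstrapFiles` name, the `warn` callback, and the backslash normalization passed to `replace()` are all assumptions. The commit only states that files with an undefined `path` are skipped with a warning instead of throwing.

```typescript
// Hypothetical shape: the real bootstrap file type is not shown in the commit.
interface BootstrapFile {
  path?: string;
  content: string;
}

// Hypothetical helper illustrating the guard; not the real buildBootstrapContextFiles().
function collectBootstrapFiles(
  files: BootstrapFile[],
  warn: (message: string) => void = console.warn,
): Array<{ path: string; content: string }> {
  const result: Array<{ path: string; content: string }> = [];
  for (const file of files) {
    // Guard: a hook may have set `filePath` instead of `path`, leaving `path` undefined.
    if (typeof file.path !== "string") {
      warn("skipping bootstrap file with undefined path");
      continue;
    }
    // Assumed normalization; the arguments of the original replace() call are unknown.
    result.push({ path: file.path.replace(/\\/g, "/"), content: file.content });
  }
  return result;
}
```

With this shape, one malformed entry produces a warning and is dropped, while the remaining files are still returned.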
Peter Steinberger
7c3c406a35
fix: keep auth-profile cooldown windows immutable in-window (#23536) (thanks @arosstale)
2026-02-22 13:14:02 +01:00
artale
dc69610d51
fix(auth-profiles): never shorten cooldown deadline on retry
When the backoff saturates at 60 min and retries fire every 30 min
(e.g. cron jobs), each failed request was resetting cooldownUntil to
now+60m. Because now+60m < existing deadline, the window kept getting
renewed and the profile never recovered without manually clearing
usageStats in auth-profiles.json.
Fix: only write a new cooldownUntil (or disabledUntil for billing) when
the new deadline is strictly later than the existing one. This lets the
original window expire naturally while still allowing genuine backoff
extension when error counts climb further.
Fixes #23516
[AI-assisted]
2026-02-22 13:14:02 +01:00
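The rule stated in the fix above ("only write a new cooldownUntil ... when the new deadline is strictly later than the existing one") can be sketched as below. This is an illustration under assumptions: deadlines are taken to be epoch milliseconds, and `UsageStats`, `laterDeadline`, and `recordFailure` are hypothetical names rather than the project's API.

```typescript
// Hypothetical stats shape; only the two field names come from the commit message.
interface UsageStats {
  cooldownUntil?: number; // epoch milliseconds (assumed unit)
  disabledUntil?: number; // epoch milliseconds (assumed unit)
}

// Returns the candidate only when it is strictly later than the existing deadline.
function laterDeadline(existing: number | undefined, candidate: number): number {
  if (existing === undefined || candidate > existing) {
    return candidate;
  }
  return existing;
}

// Hypothetical update step: billing failures move disabledUntil, others cooldownUntil.
function recordFailure(stats: UsageStats, candidate: number, isBilling: boolean): UsageStats {
  if (isBilling) {
    return { ...stats, disabledUntil: laterDeadline(stats.disabledUntil, candidate) };
  }
  return { ...stats, cooldownUntil: laterDeadline(stats.cooldownUntil, candidate) };
}
```

Under this rule a candidate that is earlier than or equal to the stored deadline leaves the stored value untouched, so a deadline is never shortened.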
Peter Steinberger
47c3f742b6
fix(exec): require explicit safe-bin profiles
2026-02-22 12:58:55 +01:00
Peter Steinberger
29cc7f431f
test: share runtime scan filters and cached test scans
2026-02-22 12:44:44 +01:00
Peter Steinberger
3a65e4b523
test: make snapshot env override assertion independent of host env
2026-02-22 12:40:30 +01:00
Peter Steinberger
a4607277a9
test: consolidate sessions_spawn and guardrail helpers
2026-02-22 12:34:55 +01:00
Peter Steinberger
c343132dbb
fix(agents): harden bash tool and reply directive handling
2026-02-22 11:29:31 +00:00
Peter Steinberger
50c7aef22f
test: stabilize session lock tests and move out of e2e
2026-02-22 11:28:20 +00:00
Peter Steinberger
401106b963
fix: harden flaky tests and cover native google thought signatures (#23457) (thanks @echoVic)
2026-02-22 12:24:53 +01:00
echoVic
9176571ec1
fix(gemini): sanitize thoughtSignatures for native Google provider
Native Google Gemini provider was accumulating 2K-8K tokens of Base64
thoughtSignature blobs per turn, causing premature context overflow.
The sanitizer was only enabled for OpenRouter Gemini, not native Google.
Fixes #23392
2026-02-22 12:24:53 +01:00
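The gating change described above (enabling the sanitizer for the native Google provider as well as OpenRouter Gemini) might look something like the sketch below. Everything here is assumed for illustration: the provider identifiers `"openrouter"` and `"google"`, the `Part` and `Message` shapes, and the choice to drop the `thoughtSignature` field outright. The commit does not show what the real sanitizer does to the blobs.

```typescript
// Hypothetical message shapes; only the thoughtSignature field name comes from the commit.
type Part = { text?: string; thoughtSignature?: string };
type Message = { role: string; parts: Part[] };

// Assumed gating: sanitize Gemini models on both OpenRouter and native Google.
function shouldSanitizeThoughtSignatures(provider: string, modelId: string): boolean {
  const isGemini = modelId.toLowerCase().includes("gemini");
  return isGemini && (provider === "openrouter" || provider === "google");
}

// Assumed sanitizer behavior: return copies of the messages without thoughtSignature.
function stripThoughtSignatures(messages: Message[]): Message[] {
  return messages.map((message) => ({
    ...message,
    parts: message.parts.map((part) => {
      const copy: Part = { ...part };
      delete copy.thoughtSignature;
      return copy;
    }),
  }));
}
```

The input messages are left unmodified; callers would apply `stripThoughtSignatures` to the history only when `shouldSanitizeThoughtSignatures` returns true.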
Peter Steinberger
78c3c2a542
fix: stabilize flaky tests and sanitize directive-only chat tags
2026-02-22 12:19:33 +01:00
Peter Steinberger
7d09a9e74d
test: update agent tool assertions and reclassify suites
2026-02-22 11:18:50 +00:00
Peter Steinberger
fcb86408fd
test: move embedded and tool agent suites out of e2e
2026-02-22 11:17:47 +00:00
Peter Steinberger
e441390fd1
test: reclassify agent local suites out of e2e
2026-02-22 11:16:37 +00:00
Peter Steinberger
713e2928b2
test: move duplicate local scenario suites out of agents e2e
2026-02-22 10:56:58 +00:00
Peter Steinberger
bfada9e425
test: move more local agents helper suites out of e2e
2026-02-22 10:55:22 +00:00
Peter Steinberger
4267fc8593
test: reclassify pi embedded helper suites out of agents e2e
2026-02-22 10:53:50 +00:00
Peter Steinberger
adace58505
test: reclassify local helper suites out of agents e2e
2026-02-22 10:53:40 +00:00
Peter Steinberger
1d4e9ad8d1
test: reclassify remaining bash suites as unit tests
2026-02-22 10:48:32 +00:00
Peter Steinberger
ab38e1e6b2
test: reclassify image tool suite as unit test
2026-02-22 10:47:16 +00:00
Peter Steinberger
aa487bd4f3
test: reclassify bash pty suites as unit tests
2026-02-22 10:47:10 +00:00
Peter Steinberger
3c9f98452e
test: reclassify tool-result persist hook suite as unit test
2026-02-22 10:46:02 +00:00
Peter Steinberger
047e18693e
test: reclassify exec approval-id suite as unit test
2026-02-22 10:45:23 +00:00
Peter Steinberger
17a65a6f4c
test: split pure docker exec arg checks from bash e2e suite
2026-02-22 10:44:40 +00:00
Peter Steinberger
239963ac44
perf(test): shrink bash command fixtures and polling windows
2026-02-22 10:43:22 +00:00
Peter Steinberger
1d7dbd8cd9
test: reclassify web fetch/readability suites as unit tests
2026-02-22 10:41:29 +00:00
Peter Steinberger
304eef575b
test: reclassify sandbox and web/image tool suites as unit tests
2026-02-22 10:40:40 +00:00
Peter Steinberger
3b09a0d2d0
perf(test): trim bash e2e log fixtures and abort wait bounds
2026-02-22 10:39:18 +00:00
Peter Steinberger
c68bb8d6d5
test: stabilize bash e2e suites with explicit exec approvals mode
2026-02-22 10:37:44 +00:00
Peter Steinberger
97eb4af01e
test: harden models-config env isolation list
2026-02-22 10:34:23 +00:00
Peter Steinberger
744df0fbe7
test: reclassify models-config suites from e2e to unit lane
2026-02-22 10:34:23 +00:00
Peter Steinberger
740fd7ae35
test: reclassify skills suites from e2e to unit lane
2026-02-22 10:34:23 +00:00
Peter Steinberger
c56ab39da5
perf(test): reduce bash e2e wait windows
2026-02-22 10:28:43 +00:00
Peter Steinberger
abff3f0f61
test: reclassify sessions_spawn lifecycle suite as unit test
2026-02-22 10:28:43 +00:00
Peter Steinberger
0b7c7ee1aa
perf(test): speed up sessions_spawn lifecycle suite setup
2026-02-22 10:28:43 +00:00
Peter Steinberger
c962bcba37
test: reclassify sandbox merge and exec path suites as unit tests
2026-02-22 10:28:43 +00:00
Peter Steinberger
9ab7b85a66
perf(test): tighten background abort timing windows
2026-02-22 10:28:43 +00:00