Commit Graph

117 Commits

Author SHA1 Message Date
Peter Steinberger
faec6ccb1d perf(test): reduce module reload churn in unit suites 2026-02-13 15:19:13 +00:00
Rodrigo Uroz
b912d3992d (fix): handle Cloudflare 521 and transient 5xx errors gracefully (#13500)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: a8347e95c55c6244bbf2e9066c8bf77bf62de6c9
Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: Takhoffman <781889+Takhoffman@users.noreply.github.com>
Reviewed-by: @Takhoffman
2026-02-11 21:42:33 -06:00
Vignesh Natarajan
36e27ad561 Memory: make qmd search-mode flags compatible 2026-02-11 17:51:08 -08:00
Vignesh Natarajan
6d9d4d04ed Memory/QMD: add configurable search mode 2026-02-11 17:51:08 -08:00
Vignesh Natarajan
2f1f82674a Memory/QMD: harden no-results parsing 2026-02-11 15:39:28 -08:00
Vignesh Natarajan
3d343932cf Memory/QMD: treat plain-text no-results as empty 2026-02-11 15:39:28 -08:00
Rodrigo Uroz
7f1712c1ba (fix): enforce embedding model token limit to prevent overflow (#13455)
* fix: enforce embedding model token limit to prevent 8192 overflow

- Replace EMBEDDING_APPROX_CHARS_PER_TOKEN=1 with UTF-8 byte length
  estimation (safe upper bound for tokenizer output)
- Add EMBEDDING_MODEL_MAX_TOKENS=8192 hard cap
- Add splitChunkToTokenLimit() that binary-searches for the largest
  safe split point, with surrogate pair handling
- Add enforceChunkTokenLimit() wrapper called in indexFile() after
  chunkMarkdown(), before any embedding API call
- Fixes: session files with large JSONL entries could produce chunks
  exceeding text-embedding-3-small's 8192 token limit

Tests: 2 new colocated tests in manager.embedding-token-limit.test.ts
- Verifies oversized ASCII chunks are split to <=8192 bytes each
- Verifies multibyte (emoji) content batching respects byte limits

* fix: make embedding token limit provider-aware

- Add optional maxInputTokens to EmbeddingProvider interface
- Each provider (openai, gemini, voyage) reports its own limit
- Known-limits map as fallback: openai 8192, gemini 2048, voyage 32K
- Resolution: provider field > known map > default 8192
- Backward compatible: local/llama uses fallback

* fix: enforce embedding input size limits (#13455) (thanks @rodrigouroz)

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-10 20:10:17 -06:00
Marcus Castro
45488e4ec9 fix: remap session JSONL chunk line numbers to original source positions (#12102)
* fix: remap session JSONL chunk line numbers to original source positions

buildSessionEntry() flattens JSONL messages into plain text before
chunkMarkdown() assigns line numbers. The stored startLine/endLine
values therefore reference positions in the flattened text, not the
original JSONL file.

- Add lineMap to SessionFileEntry tracking which JSONL line each
  extracted message came from
- Add remapChunkLines() to translate chunk positions back to original
  JSONL lines after chunking
- Guard remap with source === "sessions" to prevent misapplication
- Include lineMap in content hash so existing sessions get re-indexed

Fixes #12044

* memory: dedupe session JSONL parsing

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-10 18:09:24 -06:00
Vignesh
ef4a0e92b7 fix(memory/qmd): scope query to managed collections (#11645) 2026-02-09 23:35:27 -08:00
max
ec910a235e refactor: consolidate duplicate utility functions (#12439)
* refactor: consolidate duplicate utility functions

- Add escapeRegExp to src/utils.ts and remove 10 local duplicates
- Rename bash-tools clampNumber to clampWithDefault (different signature)
- Centralize formatError calls to use formatErrorMessage from infra/errors.ts
- Re-export formatErrorMessage from cli/cli-utils.ts to preserve API

* refactor: consolidate remaining escapeRegExp duplicates

* refactor: consolidate sleep, stripAnsi, and clamp duplicates
2026-02-08 23:59:43 -08:00
Tyler Yust
e4651d6afa Memory/QMD: reuse default model cache and skip ENOENT warnings (#12114)
* Memory/QMD: symlink default model cache into custom XDG_CACHE_HOME

QmdMemoryManager overrides XDG_CACHE_HOME to isolate the qmd index
per-agent, but this also moves where qmd looks for its ML models
(~2.1GB). Since models are installed at the default location
(~/.cache/qmd/models/), every qmd invocation would attempt to
re-download them from HuggingFace and time out.

Fix: on initialization, symlink ~/.cache/qmd/models/ into the custom
XDG_CACHE_HOME path so the index stays isolated per-agent while the
shared models are reused. The symlink is only created when the default
models directory exists and the target path does not already exist.

Includes tests for the three key scenarios: symlink creation, existing
directory preservation, and graceful skip when no default models exist.

* Memory/QMD: skip model symlink warning on ENOENT

* test: stabilize warning-filter visibility assertion (#12114) (thanks @tyler6204)

* fix: add changelog entry for QMD cache reuse (#12114) (thanks @tyler6204)

* fix: handle plain context-overflow strings in compaction detection (#12114) (thanks @tyler6204)
2026-02-08 23:43:08 -08:00
max
223eee0a20 refactor: unify peer kind to ChatType, rename dm to direct (#11881)
* fix: use .js extension for ESM imports of RoutePeerKind

The imports incorrectly used .ts extension which doesn't resolve
with moduleResolution: NodeNext. Changed to .js and added 'type'
import modifier.

* fix tsconfig

* refactor: unify peer kind to ChatType, rename dm to direct

- Replace RoutePeerKind with ChatType throughout codebase
- Change 'dm' literal values to 'direct' in routing/session keys
- Keep backward compat: normalizeChatType accepts 'dm' -> 'direct'
- Add ChatType export to plugin-sdk, deprecate RoutePeerKind
- Update session key parsing to accept both 'dm' and 'direct' markers
- Update all channel monitors and extensions to use ChatType

BREAKING CHANGE: Session keys now use 'direct' instead of 'dm'.
Existing 'dm' keys still work via backward compat layer.

* fix tests

* test: update session key expectations for dmdirect migration

- Fix test expectations to expect :direct: in generated output
- Add explicit backward compat test for normalizeChatType('dm')
- Keep input test data with :dm: keys to verify backward compat

* fix: accept legacy 'dm' in session key parsing for backward compat

getDmHistoryLimitFromSessionKey now accepts both :dm: and :direct:
to ensure old session keys continue to work correctly.

* test: add explicit backward compat tests for dmdirect migration

- session-key.test.ts: verify both :dm: and :direct: keys are valid
- getDmHistoryLimitFromSessionKey: verify both formats work

* feat: backward compat for resetByType.dm config key

* test: skip unix-path Nix tests on Windows
2026-02-09 09:20:52 +09:00
Vignesh Natarajan
7f7d49aef0 Memory/QMD: warn when scope denies search 2026-02-08 09:21:17 -08:00
max
a1123dd9be Centralize date/time formatting utilities (#11831) 2026-02-08 04:53:31 -08:00
Gustavo Madeira Santana
e2dea2684f Tests: harden flake hotspots and consolidate provider-auth suites (#11598)
* Tests: harden flake hotspots and consolidate provider-auth suites

* Tests: restore env vars by deleting missing snapshot values

* Tests: use real newline in memory summary filter case

* Tests(memory): use fake timers for qmd timeout coverage

* Changelog: add tests hardening entry for #11598
2026-02-07 21:32:23 -05:00
Vignesh Natarajan
95263f4e60 Memory: add SQLITE_BUSY fallback regression test 2026-02-07 17:55:34 -08:00
Vignesh Natarajan
6f1ba986b3 Memory: make QMD cache eviction callback idempotent 2026-02-07 17:55:34 -08:00
Vignesh Natarajan
c741d008dd Memory: chain forced QMD queue and fail over on busy index 2026-02-07 17:55:34 -08:00
Vignesh Natarajan
0d60ef6fef Memory: queue forced QMD sync and handle sqlite busy reads 2026-02-07 17:55:34 -08:00
Vignesh Natarajan
ce715c4c56 Memory: harden QMD startup, timeouts, and fallback recovery 2026-02-07 17:55:34 -08:00
Jake
e78ae48e69 fix(memory): add input_type to Voyage AI embeddings for improved retrieval (#10818)
* fix(memory): add input_type to Voyage AI embeddings for improved retrieval

Voyage AI recommends passing input_type='document' when indexing and
input_type='query' when searching. This improves retrieval quality by
optimising the embedding space for each direction.

Changes:
- embedQuery now passes input_type: 'query'
- embedBatch now passes input_type: 'document'
- Batch API request_params includes input_type: 'document'
- Tests updated to verify input_type is passed correctly

* Changelog: note Voyage embeddings input_type fix (#10818) (thanks @mcinteerj)

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-06 21:55:09 -06:00
Jake
6965a2cc9d feat(memory): native Voyage AI support (#7078)
* feat(memory): add native Voyage AI embedding support with batching

Cherry-picked from PR #2519, resolved conflict in memory-search.ts
(hasRemote -> hasRemoteConfig rename + added voyage provider)

* fix(memory): optimize voyage batch memory usage with streaming and deduplicate code

Cherry-picked from PR #2519. Fixed lint error: changed this.runWithConcurrency
to use imported runWithConcurrency function after extraction to internal.ts
2026-02-06 15:09:32 -06:00
Vignesh Natarajan
30098b04d7 chore: fix lint warnings 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
f72214725d chore: restore OpenClaw branding 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
9bef525944 chore: apply formatter 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
d0b98c75e5 fix: make QMD cache key deterministic 2026-02-02 23:45:05 -08:00
Benjamin Jesuiter
e332a717a8 Lint: add braces for single-line ifs 2026-02-02 23:45:05 -08:00
Benjamin Jesuiter
23cfcd60df Fix build regressions after merge 2026-02-02 23:45:05 -08:00
Benjamin Jesuiter
465536e811 QMD: use OpenClaw config types 2026-02-02 23:45:05 -08:00
Benjamin Jesuiter
3d1c3b78ec Tests: cover QMD scope, reads, and citation clamp 2026-02-02 23:45:05 -08:00
Benjamin Jesuiter
c248da0317 Memory: harden QMD memory_get path checks 2026-02-02 23:45:05 -08:00
Benjamin Jesuiter
b7f4755020 Memory: fix QMD scope channel parsing 2026-02-02 23:45:05 -08:00
Benjamin Jesuiter
3e82cbd55b Memory: parse quoted qmd command 2026-02-02 23:45:05 -08:00
Benjamin Jesuiter
5d8c665baf Tests: use OPENCLAW_STATE_DIR in qmd manager 2026-02-02 23:45:05 -08:00
vignesh07
9df78b3379 fix(memory/qmd): throttle embed + citations auto + restore --force 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
564fe6f089 fix(memory-qmd): create collections via qmd CLI (no YAML) 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
dd8373a424 fix(memory-qmd): write XDG index.yml + legacy compat 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
9be3c27bb7 fix(qmd): use XDG dirs for qmd home; drop ollama docs 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
e12184661e Fix build errors 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
3a57106c1e Add more tests; make fall back more resilient and visible 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
2c30ba400b Make memory more resilient to failure 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
5d3af3bc62 feat (memory): Implement new (opt-in) QMD memory backend 2026-02-02 23:45:05 -08:00
Sk Akram
5020bfa2a9 fix: L2-normalize local embedding vectors to fix semantic search (#5332)
* fix: L2-normalize local embedding vectors to fix semantic search

* fix: handle non‑finite magnitude in L2 normalization and remove stale test reset

* refactor: add braces to l2Normalize guard clause in embeddings

* fix: sanitize local embeddings (#5332) (thanks @akramcodez)

---------

Co-authored-by: Gustavo Madeira Santana <gumadeiras@gmail.com>
2026-02-01 22:56:44 -05:00
cpojer
f06dd8df06 chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
cpojer
5ceff756e1 chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
cpojer
009b16fab8 chore: more lint cleanup. 2026-01-31 16:16:13 +09:00
cpojer
15792b153f chore: Enable more lint rules, disable some that trigger a lot. Will clean up later. 2026-01-31 16:04:04 +09:00
cpojer
7a9ddcd590 chore: Enable some "perf" lint rules. 2026-01-31 15:58:24 +09:00
Peter Steinberger
9a7160786a refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
Gustavo Madeira Santana
a44da67069 fix: local updates for PR #3600
Co-authored-by: kira-ariaki <kira-ariaki@users.noreply.github.com>
2026-01-28 22:00:11 -05:00