Commit Graph

68 Commits

Author SHA1 Message Date
Peter Steinberger
b8b43175c5 style: align formatting with oxfmt 0.33 2026-02-18 01:34:35 +00:00
Peter Steinberger
31f9be126c style: run oxfmt and fix gate failures 2026-02-18 01:29:02 +00:00
cpojer
d0cb8c19b2 chore: wtf. 2026-02-17 13:36:48 +09:00
Sebastian
ed11e93cf2 chore(format) 2026-02-16 23:20:16 -05:00
Peter Steinberger
5115f6fdf3 style: normalize imports for oxfmt 0.33 2026-02-17 00:59:54 +00:00
Peter Steinberger
ddef3cadba refactor: replace memory manager prototype mixing 2026-02-17 01:50:04 +01:00
cpojer
1dc9bb8d62 chore: Fix more type issues. 2026-02-17 09:29:47 +09:00
cpojer
90ef2d6bdf chore: Update formatting. 2026-02-17 09:18:40 +09:00
Rodrigo Uroz
6b3e0710f4 feat(memory): Add opt-in temporal decay for hybrid search scoring
Exponential decay (half-life configurable, default 30 days) applied
before MMR re-ranking. Dated daily files (memory/YYYY-MM-DD.md) use
filename date; evergreen files (MEMORY.md, topic files) are not
decayed; other sources fall back to file mtime.

Config: memorySearch.query.hybrid.temporalDecay.{enabled, halfLifeDays}
Default: disabled (backwards compatible, opt-in).
2026-02-16 23:59:19 +01:00
康熙
bcab2469de feat: LLM-based query expansion for FTS mode
When searching in FTS-only mode (no embedding provider), extract meaningful
keywords from conversational queries using LLM to improve search results.

Changes:
- New query-expansion module with keyword extraction
- Supports English and Chinese stop word filtering
- Null safety guards for FTS-only mode (provider can be null)
- Lint compliance fixes for string iteration

This helps users find relevant memory entries even with vague queries.
2026-02-16 23:53:21 +01:00
康熙
65aedac20e fix: enable FTS fallback when no embedding provider available (#17725)
When no embedding provider is available (e.g., OAuth mode without API keys),
memory_search now falls back to FTS-only mode instead of returning disabled: true.

Changes:
- embeddings.ts: return null provider with reason instead of throwing
- manager.ts: handle null provider, use FTS-only search mode
- manager-search.ts: allow searching all models when provider is undefined
- memory-tool.ts: expose search mode in results

The search results now include a 'mode' field indicating 'hybrid' or 'fts-only'.
2026-02-16 23:53:21 +01:00
Vignesh Natarajan
7addb519da fix (memory/builtin): keep status dirty state stable across invocations 2026-02-14 18:48:58 -08:00
Peter Steinberger
4c401d336d refactor(memory): extract manager sync and embedding ops 2026-02-13 19:08:37 +00:00
Rodrigo Uroz
7f1712c1ba (fix): enforce embedding model token limit to prevent overflow (#13455)
* fix: enforce embedding model token limit to prevent 8192 overflow

- Replace EMBEDDING_APPROX_CHARS_PER_TOKEN=1 with UTF-8 byte length
  estimation (safe upper bound for tokenizer output)
- Add EMBEDDING_MODEL_MAX_TOKENS=8192 hard cap
- Add splitChunkToTokenLimit() that binary-searches for the largest
  safe split point, with surrogate pair handling
- Add enforceChunkTokenLimit() wrapper called in indexFile() after
  chunkMarkdown(), before any embedding API call
- Fixes: session files with large JSONL entries could produce chunks
  exceeding text-embedding-3-small's 8192 token limit

Tests: 2 new colocated tests in manager.embedding-token-limit.test.ts
- Verifies oversized ASCII chunks are split to <=8192 bytes each
- Verifies multibyte (emoji) content batching respects byte limits

* fix: make embedding token limit provider-aware

- Add optional maxInputTokens to EmbeddingProvider interface
- Each provider (openai, gemini, voyage) reports its own limit
- Known-limits map as fallback: openai 8192, gemini 2048, voyage 32K
- Resolution: provider field > known map > default 8192
- Backward compatible: local/llama uses fallback

* fix: enforce embedding input size limits (#13455) (thanks @rodrigouroz)

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-10 20:10:17 -06:00
Marcus Castro
45488e4ec9 fix: remap session JSONL chunk line numbers to original source positions (#12102)
* fix: remap session JSONL chunk line numbers to original source positions

buildSessionEntry() flattens JSONL messages into plain text before
chunkMarkdown() assigns line numbers. The stored startLine/endLine
values therefore reference positions in the flattened text, not the
original JSONL file.

- Add lineMap to SessionFileEntry tracking which JSONL line each
  extracted message came from
- Add remapChunkLines() to translate chunk positions back to original
  JSONL lines after chunking
- Guard remap with source === "sessions" to prevent misapplication
- Include lineMap in content hash so existing sessions get re-indexed

Fixes #12044

* memory: dedupe session JSONL parsing

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-10 18:09:24 -06:00
max
a1123dd9be Centralize date/time formatting utilities (#11831) 2026-02-08 04:53:31 -08:00
Jake
6965a2cc9d feat(memory): native Voyage AI support (#7078)
* feat(memory): add native Voyage AI embedding support with batching

Cherry-picked from PR #2519, resolved conflict in memory-search.ts
(hasRemote -> hasRemoteConfig rename + added voyage provider)

* fix(memory): optimize voyage batch memory usage with streaming and deduplicate code

Cherry-picked from PR #2519. Fixed lint error: changed this.runWithConcurrency
to use imported runWithConcurrency function after extraction to internal.ts
2026-02-06 15:09:32 -06:00
Benjamin Jesuiter
23cfcd60df Fix build regressions after merge 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
3a57106c1e Add more tests; make fall back more resilient and visible 2026-02-02 23:45:05 -08:00
Vignesh Natarajan
5d3af3bc62 feat (memory): Implement new (opt-in) QMD memory backend 2026-02-02 23:45:05 -08:00
cpojer
f06dd8df06 chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
cpojer
5ceff756e1 chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
cpojer
15792b153f chore: Enable more lint rules, disable some that trigger a lot. Will clean up later. 2026-01-31 16:04:04 +09:00
cpojer
7a9ddcd590 chore: Enable some "perf" lint rules. 2026-01-31 15:58:24 +09:00
Peter Steinberger
9a7160786a refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
Gustavo Madeira Santana
a44da67069 fix: local updates for PR #3600
Co-authored-by: kira-ariaki <kira-ariaki@users.noreply.github.com>
2026-01-28 22:00:11 -05:00
Kira
0fd9d3abd1 feat(memory): add explicit paths config for memory search
Add a `paths` option to `memorySearch` config, allowing users to
explicitly specify additional directories or files to include in
memory search.

Follow-up to #2961 as suggested by @gumadeiras — instead of auto-following
symlinks (which has security implications), users can now explicitly
declare additional search paths.

- Add `memorySearch.paths` config option (array of strings)
- Paths can be absolute or relative (resolved from workspace)
- Directories are recursively scanned for `.md` files
- Single `.md` files can also be specified
- Paths from defaults and agent overrides are merged
- Added 4 test cases for listMemoryFiles
2026-01-28 22:00:11 -05:00
Peter Steinberger
6d16a658e5 refactor: rename clawdbot to moltbot with legacy compat 2026-01-27 12:21:02 +00:00
Peter Steinberger
472b8fe15d fix: prevent memory CLI hangs 2026-01-22 03:14:59 +00:00
Peter Steinberger
8479dc97da fix: make session memory indexing async 2026-01-21 10:39:00 +00:00
Peter Steinberger
d4df747f9f fix: harden doctor config cleanup 2026-01-20 01:43:59 +00:00
Peter Steinberger
4bac76e66d fix: improve memory status and batch fallback 2026-01-19 22:49:06 +00:00
Peter Steinberger
50fdd514ae refactor(logging): split config + subsystem imports 2026-01-19 00:15:44 +00:00
Peter Steinberger
ab340c82fb fix: stabilize tests and logging 2026-01-18 18:43:31 +00:00
Peter Steinberger
0be9d773cb fix(memory): preserve fallback source id 2026-01-18 16:12:10 +00:00
Peter Steinberger
9464774133 feat(memory): add gemini batches + safe reindex
Co-authored-by: Gustavo Madeira Santana <gumadeiras@gmail.com>
2026-01-18 16:12:10 +00:00
Peter Steinberger
20c26eb303 fix: prevent sqlite-vec duplicate id failures 2026-01-18 13:55:56 +00:00
Peter Steinberger
676d41d415 fix: seed embedding cache for atomic reindex 2026-01-18 09:28:42 +00:00
Peter Steinberger
a3a4996adb feat: add gemini memory embeddings 2026-01-18 09:09:45 +00:00
Peter Steinberger
ca350fc66c chore(format): oxfmt memory 2026-01-18 07:30:07 +00:00
Peter Steinberger
c9c9516206 refactor(memory): extract sync + status helpers 2026-01-18 07:03:06 +00:00
Peter Steinberger
9c0ff87c86 fix: align plugin runtime and exec wiring 2026-01-18 05:44:22 +00:00
Peter Steinberger
c00ea63bb0 refactor: split memory manager internals 2026-01-18 05:40:10 +00:00
Peter Steinberger
55aff22274 feat: surface batch request progress 2026-01-18 04:30:15 +00:00
Peter Steinberger
e4e1396a98 perf: improve batch status logging 2026-01-18 04:28:14 +00:00
Peter Steinberger
afb877a96b perf: speed up memory batch polling 2026-01-18 03:55:14 +00:00
Peter Steinberger
0c93b9b7bb style: apply oxfmt 2026-01-18 02:19:35 +00:00
Peter Steinberger
ccb30665f7 feat: add hybrid memory search 2026-01-18 01:47:58 +00:00
Peter Steinberger
0fb2777c6d feat: add memory embedding cache 2026-01-18 01:47:58 +00:00
Peter Steinberger
8b1bec11d0 feat: speed up memory batch indexing 2026-01-18 01:24:51 +00:00