fix: improve compaction summary instructions to preserve active work
Expand staged-summary merge instructions to preserve active task status, batch progress, latest user request, and follow-up commitments so compaction handoffs retain in-flight work context.
Co-authored-by: joetomasone <56984887+joetomasone@users.noreply.github.com>
Co-authored-by: Josh Lehman <josh@martian.engineering>
* includes: prompt overhead in compaction safeguard calculation.
Subtracts SUMMARIZATION_OVERHEAD_TOKENS from maxChunkTokens in both the main summarization path and the dropped-messages summarization path.
This ensures the chunk budget leaves room for the prompt overhead that generateSummary wraps around each chunk.
* adds: budget for overhead tokens to use an effectiveMax instead of maxTokens naïvely.
- Added `SUMMARIZATION_OVERHEAD_TOKENS = 4096` — a budget for the tokens that `generateSummary` adds on top of the serialized conversation (system prompt, `<conversation>` tags, summarization instructions, `<previous-summary>` block, and reasoning: "high" thinking budget).
- `chunkMessagesByMaxTokens` now divides `maxTokens` by `SAFETY_MARGIN` (1.2) before comparing against estimated token counts. Previously, the safety margin was only used in `computeAdaptiveChunkRatio` and `isOversizedForSummary` but not in the actual chunking loop — so chunks could be built that fit the estimated budget but exceeded the real budget once the API tokenized them properly.
When pruneHistoryForContextShare drops chunks of messages, it could drop
an assistant message with tool_use blocks while leaving corresponding
tool_result messages in the kept portion. These orphaned tool_results
cause Anthropic's API to reject the session with 'unexpected tool_use_id'.
Fix by calling repairToolUseResultPairing after each chunk drop to clean
up any orphaned tool_results. This reuses existing battle-tested code
from session-transcript-repair.ts.
Fixes#9769, #9724, #9672