openclaw - Prompt cache hit rate regression for OpenAI-compatible providers caused by the memory-tencentdb plugin

StepCodex · 2026-05-31T16:33:52Z





## Problem

Prompt cache hit rates for OpenAI-compatible providers (DeepSeek, MiMo) degraded significantly after the memory-tencentdb plugin was enabled. The rollout coincided with the OpenClaw 2026.5.19 → 2026.5.28 upgrade.

### Environment

- OpenClaw 2026.5.28 (upgraded from 2026.5.19 on May 30)
- Providers: DeepSeek V4 Pro, MiMo V2.5 Pro (both `openai-completions` API, prefix-matching cache)
- memory-tencentdb plugin deployed on May 30

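### Cache Hit Rate Comparison

| Date | OpenClaw | TencentDB | MiMo Hit Rate | DeepSeek Hit Rate |
|------|----------|-----------|---------------|-------------------|
| May 29 | 5.19 | ❌ Off | 91.1% | 95.7% |
| May 31 | 5.28 | ✅ On | 63.5% | 83.3% |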

Cache hit rates recovered immediately after the plugin was disabled: a new DeepSeek session reached 72%+, and the first turn of the MiniMax cron job reached 91-99%.
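The same table can be read as miss rates, which is what drives the extra uncached input. The sketch below only does that arithmetic. It assumes "hit rate" means cached prompt tokens divided by total prompt tokens, which the issue does not define, and the variable names are illustrative.

```python
# Hit rates (%) copied from the comparison table.
rates = {
    "MiMo": {"before": 91.1, "after": 63.5},
    "DeepSeek": {"before": 95.7, "after": 83.3},
}

summary = {}
for provider, r in rates.items():
    # Drop in hit rate, in percentage points.
    drop_points = round(r["before"] - r["after"], 1)
    # Share of prompt tokens that missed the cache, before and after.
    miss_before = 100 - r["before"]
    miss_after = 100 - r["after"]
    # How many times larger the uncached share became.
    miss_growth = round(miss_after / miss_before, 1)
    summary[provider] = (drop_points, miss_growth)
    print(f"{provider}: -{drop_points} points, uncached share x{miss_growth}")
```

Under that assumption, MiMo lost 27.6 points and DeepSeek 12.4 points, and the uncached share of prompt tokens grew roughly fourfold for both providers.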

### Root Cause

**Primary: prependContext → context bloat → prefix cache invalidation**

1. TencentDB prepends `prependContext` (recalled memories, ~500-1700 tokens) to each user message. With `showInjected=true`, this content is frozen into the conversation history.
2. Over multiple turns the context grows quickly, and the bloat triggers more frequent tool result truncation.
3. The truncated amount differs from turn to turn because it is computed dynamically from the token budget, so the conversation history prefix is no longer consistent between requests and the prefix-matching cache is invalidated.
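The chain above can be illustrated with a toy model. This is not OpenClaw or provider code: it treats a request as a flat list of tokens, assumes the provider reuses only the longest prefix that is identical to the previous request, and uses hypothetical names (`shared_prefix_len`, `render`) and made-up sizes. Real providers typically cache in coarser blocks, so actual numbers will differ.

```python
from typing import List


def shared_prefix_len(prev: List[str], curr: List[str]) -> int:
    """Count the leading tokens that are identical in both requests."""
    n = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        n += 1
    return n


def render(system: List[str], tool_result: List[str], keep: int,
           later: List[str]) -> List[str]:
    """Flatten one request: system prompt, a truncated tool result, later history."""
    return system + tool_result[:keep] + ["[truncated]"] + later


system = ["sys"] * 100
tool_result = [f"row{i}" for i in range(500)]
history = [f"h{i}" for i in range(1000)]
new_turn = [f"u{i}" for i in range(10)]

# Previous request kept 400 tokens of the old tool result.
prev_variable = render(system, tool_result, 400, history)
# Previous request kept 300 tokens, the same as the current request.
prev_stable = render(system, tool_result, 300, history)
# Current request: tighter budget, so only 300 tokens are kept.
current = render(system, tool_result, 300, history + new_turn)

variable_hit = shared_prefix_len(prev_variable, current)
stable_hit = shared_prefix_len(prev_stable, current)
print(variable_hit, stable_hit, len(current))
```

In this model the match stops at the point where the truncation boundary moved, so the 1000 tokens of later history are reprocessed even though none of them changed. With a stable truncation point, only the new turn falls outside the shared prefix.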

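**Secondary: appendSystemContext placed after CACHE_BOUNDARY**

`composeSystemPromptWithHookContext` appends persona + scene navigation content (~4000 chars) directly after the CACHE_BOUNDARY marker in the system prompt, without using the existing `prependSystemPromptAdditionAfterCacheBoundary`. As a result, stable content is re-sent and billed as fresh tokens every turn.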
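A rough way to see why placement matters for prefix-matching providers is to compare how much of the system prompt stays byte-identical across two turns. The sketch below is a simplified model, not the OpenClaw implementation: the marker text, the `compose` helper, and the assumption that per-turn content sits between the marker and the appended persona are all hypothetical.

```python
import os

CACHE_BOUNDARY = "<!-- CACHE_BOUNDARY -->"  # hypothetical marker text


def compose(base: str, persona: str, per_turn: str,
            persona_before_boundary: bool) -> str:
    """Build a system prompt with the persona on either side of the marker."""
    if persona_before_boundary:
        return base + persona + CACHE_BOUNDARY + per_turn
    # Assumed layout when the persona is appended after the marker.
    return base + CACHE_BOUNDARY + per_turn + persona


def shared_prefix_chars(a: str, b: str) -> int:
    """Length of the longest common leading substring of two prompts."""
    return len(os.path.commonprefix([a, b]))


base = "You are a helpful agent. " * 40
persona = "P" * 4000  # stands in for ~4000 chars of stable persona text
turn1, turn2 = "time=10:00 ", "time=10:05 "

appended = shared_prefix_chars(
    compose(base, persona, turn1, False), compose(base, persona, turn2, False))
hoisted = shared_prefix_chars(
    compose(base, persona, turn1, True), compose(base, persona, turn2, True))
print(appended, hoisted)
```

Under these assumptions, moving the persona ahead of the marker extends the shared prefix by exactly the persona length, which matches the first suggestion in this issue.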

### Suggestions

1. Place stable persona content before CACHE_BOUNDARY for caching
2. Evaluate long-term impact of `showInjected` on conversation history growth
3. Consider session-level dedup of stable system prompt additions
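One possible shape for the third suggestion is a per-session registry that records each stable addition once and always renders it in the same order. This is a minimal sketch under that assumption; the `SessionAdditions` class and its methods are hypothetical and do not correspond to an existing OpenClaw API.

```python
import hashlib
from typing import Dict, List, Set


class SessionAdditions:
    """Tracks which system prompt additions a session has already received."""

    def __init__(self) -> None:
        self._seen: Dict[str, Set[str]] = {}
        self._ordered: Dict[str, List[str]] = {}

    def add(self, session_id: str, text: str) -> bool:
        """Register an addition; return False if this session already has it."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        seen = self._seen.setdefault(session_id, set())
        if digest in seen:
            return False
        seen.add(digest)
        self._ordered.setdefault(session_id, []).append(text)
        return True

    def render(self, session_id: str) -> str:
        """Return the additions in first-seen order, so the text is stable."""
        return "\n".join(self._ordered.get(session_id, []))


registry = SessionAdditions()
first = registry.add("s1", "persona: concise assistant")
repeat = registry.add("s1", "persona: concise assistant")
rendered_once = registry.render("s1")
registry.add("s1", "persona: concise assistant")
rendered_again = registry.render("s1")
```

Because repeated additions are ignored and order is fixed, the rendered block is byte-identical from turn to turn, which is the property a prefix-matching cache needs.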
