openclaw - [Feature]: Skills prompt injection consumes ~10K chars with 80+ skills, causing model selection noise and context window pressure [1 comment, 2 participants]

GitHub stats
openclaw/openclaw#73248 · Fetched 2026-04-29 06:21:51
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 0
Timeline (top): commented ×1, labeled ×1



Problem to solve

Problem Description

With 80+ skills installed across multiple skill directories (~/.openclaw/skills, ~/.agents/skills, <workspace>/.agents/skills, etc.), the XML-formatted skills list injected into the system prompt consumes approximately 7,000–10,000 characters.

According to the documented formula:

total = 195 + Σ (97 + len(name_escaped) + len(description_escaped) + len(location_escaped))

With 80 skills, this results in:

  • ~2,000–2,500 tokens overhead (at ~4 chars/token)
  • 15–30% of context window consumed on 8K/16K models before any user message
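
The formula above can be checked with a few lines of Python. This is a minimal sketch, not OpenClaw's implementation: the function names are hypothetical, and it assumes "escaped" means XML escaping of `&`, `<` and `>` as done by `xml.sax.saxutils.escape`. It shows that the fixed costs alone (195 plus 97 per skill) already come to 7,955 characters for 80 skills, before any name, description, or location text is counted.

```python
from xml.sax.saxutils import escape

BASE_CHARS = 195       # fixed cost, taken from the formula in the issue
PER_SKILL_CHARS = 97   # per-skill markup cost, taken from the formula
CHARS_PER_TOKEN = 4    # rough ratio quoted in the issue


def skills_prompt_chars(skills):
    """Return 195 + sum(97 + len(name) + len(description) + len(location)).

    Assumption: "escaped" means XML escaping of &, < and >.
    """
    total = BASE_CHARS
    for skill in skills:
        total += PER_SKILL_CHARS
        for field in ("name", "description", "location"):
            total += len(escape(skill.get(field, "")))
    return total


def approx_tokens(chars):
    """Convert characters to tokens at ~4 chars/token (integer division)."""
    return chars // CHARS_PER_TOKEN


# Floor for 80 skills: every field empty, so only the fixed costs remain.
floor_chars = skills_prompt_chars([{} for _ in range(80)])
print(floor_chars, approx_tokens(floor_chars))  # 7955 1988
```

Passing real skill dictionaries (with `name`, `description`, and `location` keys) gives the estimate for an actual install; longer descriptions push the total up linearly.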

Impact

  1. Context window pressure: Reduces effective reasoning space for complex multi-step tasks
  2. Model selection noise: Forces the model to choose from 80+ unstructured skill options, increasing mismatch rate
  3. No hierarchical discovery: The flat XML list provides no categorization, tagging, or priority mechanism

Current Workaround

Users are forced to maintain manual external indexes (e.g., a TOOLS.md file) to compensate for the lack of structured skill discovery within the system.

Proposed solution

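Why this matters for intelligent agent design

The current flat skill injection violates a basic principle of intelligent agent architecture: hierarchical abstraction. Human experts do not hold 80 tools in working memory at once. They organize knowledge into domains: "This is a finance problem → open the finance toolbox → pick a specific tool." OpenClaw's current mechanism forces the model to do the opposite: "Here are 80 unrelated tools; figure out which one fits."

This produces three failure modes:

  1. Choice paralysis: Faced with 80+ options, the model's probability mass spreads across irrelevant skills, increasing hallucination and mismatch rates
  2. Context pollution: 10K characters of skill descriptions crowd out the actual user intent and conversation history, degrading reasoning quality on the real task
  3. Skill underutilization: Less frequent but highly relevant skills are "drowned out" by common ones, because the model's attention mechanism naturally favors familiar patterns

What a truly intelligent agent should do:

  • Layer 1: Maintain a lightweight "skill catalog" (category → skill names only)
  • Layer 2: At intent classification time, dynamically load the full skill definitions for the relevant categories only
  • Layer 3: Execute through precise tool calls

This mirrors how human organizations work: you do not hand every employee the entire company procedure manual. You give them an index and let them pull the relevant sections when needed.

The current "dump everything into context" approach scales poorly beyond roughly 20 skills and fundamentally limits OpenClaw's ability to act as a complex multi-domain agent orchestrator.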

Alternatives considered

Option A: Hierarchical/Categorized Prompt Format

Instead of flat XML injection, inject a category index first:

<skills>
  <category name="Finance">
    <skill name="stock-analysis" description="..."/>
    <skill name="market-analysis" description="..."/>
  </category>
  <category name="Productivity">
    <skill name="lark-doc" description="..."/>
  </category>
</skills>

The model selects a category first, then a specific skill. With roughly √n categories of roughly √n skills each, the number of options weighed at each step drops from O(n) to O(√n).
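
As a rough illustration of Option A, the nested layout could be generated from a flat skill list. This is a sketch under assumptions: the per-skill `category` field and the function name are hypothetical (the issue only shows the target XML), and `ET.indent` requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample data mirroring the issue's example; "category" is a hypothetical field.
SKILLS = [
    {"name": "stock-analysis", "description": "...", "category": "Finance"},
    {"name": "market-analysis", "description": "...", "category": "Finance"},
    {"name": "lark-doc", "description": "...", "category": "Productivity"},
]


def render_categorized_skills(skills):
    """Group skills by category and emit a nested <skills> document."""
    by_category = defaultdict(list)
    for skill in skills:
        by_category[skill["category"]].append(skill)

    root = ET.Element("skills")
    for category in sorted(by_category):
        category_el = ET.SubElement(root, "category", name=category)
        for skill in by_category[category]:
            ET.SubElement(
                category_el,
                "skill",
                name=skill["name"],
                description=skill["description"],
            )
    ET.indent(root, space="  ")  # pretty-print; Python 3.9+
    return ET.tostring(root, encoding="unicode")


print(render_categorized_skills(SKILLS))
```

Building the tree with `ElementTree` rather than string concatenation keeps attribute values escaped correctly, which matters when descriptions contain `&` or quotes.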

Option B: Lazy-Load Mechanism

Inject only a lightweight skill catalog at session start. When the model expresses intent (e.g., "analyze stock"), dynamically retrieve and inject the relevant skill definitions via RAG-style retrieval.
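
The lazy-load idea in Option B can be sketched as two steps: build a names-only catalog up front, then fetch full definitions when an intent arrives. In this sketch a simple keyword-overlap score stands in for RAG-style (embedding) retrieval, and the function names and skill descriptions are hypothetical, since the issue elides the real descriptions.

```python
import re

# Hypothetical descriptions; the issue shows only "..." for these skills.
SKILLS = [
    {"name": "stock-analysis", "description": "Analyze stock prices and fundamentals"},
    {"name": "market-analysis", "description": "Summarize market trends"},
    {"name": "lark-doc", "description": "Create and edit Lark documents"},
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_catalog(skills):
    """Lightweight catalog for session start: skill names only."""
    return [skill["name"] for skill in skills]


def retrieve_skills(intent, skills, top_k=2):
    """Return up to top_k full definitions that share words with the intent.

    Keyword overlap is a stand-in for embedding-based retrieval.
    """
    intent_words = tokenize(intent)
    scored = []
    for skill in skills:
        skill_words = tokenize(skill["name"] + " " + skill["description"])
        overlap = len(intent_words & skill_words)
        if overlap:
            scored.append((overlap, skill))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [skill for _, skill in scored[:top_k]]


print(build_catalog(SKILLS))
print([s["name"] for s in retrieve_skills("analyze stock", SKILLS)])
```

Only the catalog is paid for on every turn; the full definitions enter the prompt when an intent matches, which is where the context savings would come from.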

Option C: Skill Tagging + Grouped Allowlists

Extend the existing agents.defaults.skills allowlist with tag-based grouping:

{
  agents: {
    defaults: {
      skillTags: ["finance", "lark-ecosystem"],
    }
  }
}
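
To make Option C concrete, here is a minimal sketch of how a `skillTags` setting might be applied. Everything here is an assumption for illustration: `skillTags` is a proposed key that does not exist yet, the per-skill `tags` field and the `image-resize` skill are hypothetical, and treating an empty tag list as "no filtering" is one possible design choice among several.

```python
def filter_skills_by_tags(skills, skill_tags):
    """Keep skills that carry at least one of the allowed tags.

    Assumption: an empty or missing skill_tags list disables tag filtering.
    """
    if not skill_tags:
        return list(skills)
    allowed = set(skill_tags)
    return [s for s in skills if allowed & set(s.get("tags", []))]


# Mirrors the proposed config shape; "skillTags" is not an existing option.
config = {"agents": {"defaults": {"skillTags": ["finance", "lark-ecosystem"]}}}

# Hypothetical skills with a hypothetical "tags" field.
SKILLS = [
    {"name": "stock-analysis", "tags": ["finance"]},
    {"name": "lark-doc", "tags": ["lark-ecosystem"]},
    {"name": "image-resize", "tags": ["media"]},
]

selected = filter_skills_by_tags(SKILLS, config["agents"]["defaults"]["skillTags"])
print([s["name"] for s in selected])  # ['stock-analysis', 'lark-doc']
```

Filtering before prompt construction means untagged or out-of-scope skills never reach the injected list, so the character cost scales with the active tag groups rather than with everything installed.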


extent analysis

TL;DR

Implement a hierarchical or categorized prompt format to reduce the cognitive load on the model and alleviate context window pressure.

Guidance

  • Consider using a category index to organize skills, allowing the model to select a category first and then a specific skill, reducing the cognitive load from O(n) to O(√n).
  • Evaluate the feasibility of implementing a lazy-load mechanism, where only a lightweight skill catalog is injected at session start, and relevant skill definitions are dynamically retrieved and injected when the model expresses intent.
  • Explore extending the existing agents.defaults.skills allowlist with tag-based grouping to enable more structured skill discovery.
  • Assess the potential benefits of implementing a hierarchical discovery mechanism, such as reducing model selection noise and improving skill utilization.

Example

<skills>
  <category name="Finance">
    <skill name="stock-analysis" description="..."/>
    <skill name="market-analysis" description="..."/>
  </category>
  <category name="Productivity">
    <skill name="lark-doc" description="..."/>
  </category>
</skills>

This example illustrates a possible implementation of a categorized prompt format, where skills are organized into categories, and the model can select a category first and then a specific skill.

Notes

The proposed solutions aim to address the limitations of the current flat XML injection approach, which can lead to context window pressure, model selection noise, and poor skill utilization. However, the effectiveness of these solutions may depend on the specific requirements and constraints of the OpenClaw system.

Recommendation

Apply a hierarchical or categorized prompt format, such as the one described in Option A, to reduce the cognitive load on the model and alleviate context window pressure. This approach has the potential to improve the overall performance and efficiency of the OpenClaw system.
