openclaw - ✅(Solved) Fix [Bug]: Auto-generated `AGENTS.md` puts load-bearing tool-use rules at the bottom; head-truncation by `bootstrapMaxChars` strips them [1 pull requests, 1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#75187Fetched 2026-05-01 05:37:12
View on GitHub
Comments
1
Participants
2
Timeline
9
Reactions
2
Author
Timeline (top)
cross-referenced ×4labeled ×2referenced ×2commented ×1

The AGENTS.md template auto-generated by openclaw doctor --fix orders content with personality/onboarding guidance at the top and the load-bearing ## Red Lines + tool-use guidance at the bottom; when a user lowers agents.defaults.bootstrapMaxChars (typically to fit a small/mid model context budget), the head-truncation strips exactly the rules that govern tool-dispatch reliability and external-action safety, while preserving the less-critical content.

Root Cause

Consequence: silent tool-dispatch hallucination — the agent claims to have searched/fetched/executed without actually doing so. Particularly dangerous because the hallucinated reply often sounds correct (Hermes-3 8B's example.com "summary" was generic enough to be plausible). Real-world consequences include: missed alerts, fabricated facts presented as authoritative, security guidance ignored.

Fix Action

Fix / Workaround

The AGENTS.md template auto-generated by openclaw doctor --fix orders content with personality/onboarding guidance at the top and the load-bearing ## Red Lines + tool-use guidance at the bottom; when a user lowers agents.defaults.bootstrapMaxChars (typically to fit a small/mid model context budget), the head-truncation strips exactly the rules that govern tool-dispatch reliability and external-action safety, while preserving the less-critical content.

  1. Install OpenClaw 2026.4.9 standalone on a fresh host.
  2. Run openclaw doctor --fix. Inspect the auto-generated file:
    $ wc -c ~/.openclaw/workspace/AGENTS.md
    7809 ~/.openclaw/workspace/AGENTS.md
    $ head -10 ~/.openclaw/workspace/AGENTS.md
    # AGENTS.md - Your Workspace
    This folder is home. Treat it that way.
    ## First Run
    If `BOOTSTRAP.md` exists, that's your birth certificate. Follow it, ...
    ## Session Startup
    Before doing anything else: ...
    The relevant ## Red Lines and ## External vs Internal sections are at the bottom of the file (after ## Memory, ## Group Chats, etc.).
  3. Set a typical small-model trim:
    openclaw config set agents.defaults.bootstrapMaxChars 1500
    systemctl --user restart openclaw-gateway.service
  4. Run an agent turn against a small/mid local model (Hermes-3 8B, Qwen3 8B, etc.) requiring tool dispatch:
    openclaw agent --session-id repro -m "Search memory for X, then if you find anything fetch https://example.com and tell me whether they agree." --json
  5. Inspect the response and the session jsonl at ~/.openclaw/agents/main/sessions/<sessionId>.jsonl for toolCall events.

Severity: medium. Functional workaround exists (rewrite AGENTS.md manually, as we did), but the failure mode is silent and easy to misdiagnose as a model defect or a tool-call-parser defect (which is how we initially diagnosed it before the public-domain review surfaced #41304's root-cause analysis).

PR fix notes

PR #75248: fix(agents): reorder workspace AGENTS.md template to put load-bearing rules first

Description (problem / solution / changelog)

Bug being fixed

Closes #75187

The auto-generated docs/reference/templates/AGENTS.md (used by the workspace bootstrap to seed ~/.openclaw/workspace/AGENTS.md) ordered content with personality/onboarding guidance at the top and the load-bearing ## Red Lines, ## External vs Internal, and ## Tools guidance at the bottom.

When a user lowers agents.defaults.bootstrapMaxChars (typical for small/mid local models — Hermes-3 8B, Qwen3 8B — to fit a tight context budget), bootstrap-budget head-truncates the file. With the old order, that stripped exactly the safety + tool-dispatch rules the model needed, while preserving the less operationally-critical Memory/Group Chats/Heartbeats sections. The reporter's vLLM repro showed 0 structured tool_call events vs. 1 successful structured tool call after manually rewriting AGENTS.md to put tool-use guidance at the top — same model, same parser, same bootstrapMaxChars, content order was the only difference.

Fix

Reorder docs/reference/templates/AGENTS.md so the section sequence is:

  1. First Run
  2. Session Startup
  3. Red Lines (was #4)
  4. External vs Internal (was #5)
  5. Tools (was #7)
  6. Memory (was #3)
  7. Group Chats (was #6)
  8. Heartbeats (was #8)
  9. Make It Yours

Section content is unchanged byte-for-byte — only the H2 ordering moves. Path #1 ("Quickest win — reorder the auto-generated AGENTS.md template content") in the issue's recommended resolution order.

This aligns the seeded workspace template with the existing post-compaction priority contract: agents.defaults.compactionAgentsMdReinjectionSections already names Session Startup and Red Lines as the priority sections to re-inject after compaction. Putting them at the top of the seeded file means head-truncation now matches that same priority instead of fighting it.

Why this is the best fix

  • Smallest blast radius: a docs-only template content change. No runtime, schema, or budget logic touched.
  • Existing users: their already-customized AGENTS.md files are not rewritten; this only affects newly-seeded workspaces (and openclaw doctor --fix --regenerate-bootstrap-files flows when applicable).
  • Doesn't preempt larger work: orthogonal to #75189 (verbose default content) and #22438 / #22439 (tiered bootstrap loading); paths #2 and #3 in the issue's recommended order remain valid future work on top of this base fix.
  • Aligns with existing contract: matches the post-compaction reinjection priority in agents.defaults.compactionAgentsMdReinjectionSections.

Test plan

  • pnpm test src/agents/workspace-templates.test.ts — 4 new regression tests pass (Red Lines / External vs Internal / Tools all assert ahead of Memory + Group Chats; First Run / Session Startup stay at the top).
  • pnpm test src/agents/workspace.test.ts src/agents/system-prompt-stability.test.ts — 25 + 4 existing tests pass.
  • pnpm exec oxfmt --check — clean.
  • pnpm tsgo:core + pnpm tsgo:core:test — clean.
  • Lint (pnpm lint:core) failure on oxlint config is pre-existing on origin/main (Rule 'no-underscore-dangle' not found in plugin 'eslint'), unrelated to this PR.

https://github.com/openclaw/openclaw/issues/75187

Changed files

  • docs/reference/templates/AGENTS.md (modified, +17/-33)
  • src/agents/pi-embedded-helpers.buildbootstrapcontextfiles.test.ts (modified, +36/-0)
  • src/agents/workspace-templates.test.ts (modified, +50/-0)

Code Example

$ wc -c ~/.openclaw/workspace/AGENTS.md
   7809 ~/.openclaw/workspace/AGENTS.md
   $ head -10 ~/.openclaw/workspace/AGENTS.md
   # AGENTS.md - Your Workspace
   This folder is home. Treat it that way.
   ## First Run
   If `BOOTSTRAP.md` exists, that's your birth certificate. Follow it, ...
   ## Session Startup
   Before doing anything else: ...

---

openclaw config set agents.defaults.bootstrapMaxChars 1500
   systemctl --user restart openclaw-gateway.service

---

openclaw agent --session-id repro -m "Search memory for X, then if you find anything fetch https://example.com and tell me whether they agree." --json

---

Auto-generated `AGENTS.md` outline (after `openclaw doctor --fix`):

# AGENTS.md - Your Workspace
This folder is home. Treat it that way.
## First Run                          ← gets injected when bootstrapMaxChars=1500
## Session Startup                    ← gets injected
## Memory                             ← partially injected
### 🧠 MEMORY.md - Your Long-Term Memory
### 📝 Write It Down - No "Mental Notes"!
## Red LinesTRUNCATED OUT (was the goal)
## External vs InternalTRUNCATED OUT
## Group ChatsTRUNCATED OUT
### 💬 Know When to Speak!TRUNCATED OUT


Session jsonl from a "broken" run (auto-generated `AGENTS.md` + `bootstrapMaxChars: 1500`, prompt: "Fetch https://example.com and summarize"):

L5  message  role=user
L6  message  role=assistant  ctypes=['text']
            text="The fetched page from https://example.com is a security notice indicating..."
            ← hallucinated; gateway log has zero tool|fetch|invoke markers


Session jsonl from the "working" run (rewritten `AGENTS.md` with tool-rules at top, same `bootstrapMaxChars: 1500`):

L5  message  role=user
L6  message  role=assistant  ctypes=['toolCall']STRUCTURED TOOL CALL
L7  message  role=toolResult ctypes=['text']
            text='{"url":"https://example.com","status":200,"contentType":"text/html",...}'
L8  message  role=assistant  ctypes=['text']
            text="The fetched page at https://example.com is a security notice..."
            ← real summary of real content


Same model, same vLLM, same parser, same `bootstrapMaxChars` — only difference is content ordering inside `AGENTS.md`.
RAW_BUFFERClick to expand / collapse

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

The AGENTS.md template auto-generated by openclaw doctor --fix orders content with personality/onboarding guidance at the top and the load-bearing ## Red Lines + tool-use guidance at the bottom; when a user lowers agents.defaults.bootstrapMaxChars (typically to fit a small/mid model context budget), the head-truncation strips exactly the rules that govern tool-dispatch reliability and external-action safety, while preserving the less-critical content.

Steps to reproduce

  1. Install OpenClaw 2026.4.9 standalone on a fresh host.
  2. Run openclaw doctor --fix. Inspect the auto-generated file:
    $ wc -c ~/.openclaw/workspace/AGENTS.md
    7809 ~/.openclaw/workspace/AGENTS.md
    $ head -10 ~/.openclaw/workspace/AGENTS.md
    # AGENTS.md - Your Workspace
    This folder is home. Treat it that way.
    ## First Run
    If `BOOTSTRAP.md` exists, that's your birth certificate. Follow it, ...
    ## Session Startup
    Before doing anything else: ...
    The relevant ## Red Lines and ## External vs Internal sections are at the bottom of the file (after ## Memory, ## Group Chats, etc.).
  3. Set a typical small-model trim:
    openclaw config set agents.defaults.bootstrapMaxChars 1500
    systemctl --user restart openclaw-gateway.service
  4. Run an agent turn against a small/mid local model (Hermes-3 8B, Qwen3 8B, etc.) requiring tool dispatch:
    openclaw agent --session-id repro -m "Search memory for X, then if you find anything fetch https://example.com and tell me whether they agree." --json
  5. Inspect the response and the session jsonl at ~/.openclaw/agents/main/sessions/<sessionId>.jsonl for toolCall events.

Expected behavior

The auto-generated AGENTS.md should put the most operationally-critical guidance — tool-use rules and red-lines — at the top, so head-truncation by bootstrapMaxChars preserves them. Either:

  • (a) Reorder the auto-generated template, OR
  • (b) Add a content-priority hint that the prompt builder respects (e.g. fenced sections marked <!-- bootstrap-priority: high -->), OR
  • (c) Switch from head-truncation to a content-aware truncation that keeps high-priority sections, OR
  • (d) Document the truncation behavior in bootstrapMaxChars's schema entry and tell users to manually re-order their AGENTS.md if they trim.

Actual behavior

With bootstrapMaxChars: 1500 and the default auto-generated AGENTS.md (7,809 chars), only the first 1,500 chars are injected. Those 1,500 chars contain ## First Run, ## Session Startup (read SOUL.md, USER.md, etc.), and the start of ## Memory. The injected content does NOT contain ## Red Lines, ## External vs Internal (which lists "Search the web" as safe-to-do-freely), or any explicit tool-use guidance. The agent then proceeds without instruction on when to invoke tools — and on a small/mid model, defaults to producing plausible-sounding text describing tool use rather than emitting structured tool_call events.

Concrete repro (Hermes-3-Llama-3.1-8B on bare OpenClaw + vLLM + --tool-call-parser hermes):

  • With auto-generated AGENTS.md + bootstrapMaxChars: 1500: rung 5 ("Fetch https://example.com and summarize") returned a hallucinated generic summary; session jsonl had 0 toolCall events; no actual HTTP request was made.
  • After manually rewriting AGENTS.md to put a ## How to use tools (READ THIS FIRST) section at the top (with explicit examples for memory_search, web_fetch, exec, and chained workflows), keeping bootstrapMaxChars: 1500: rung 5 emitted 1 structured toolCall, got a toolResult with HTTP 200 from example.com, and quoted the actual page content in the reply. Rung 7 (chained 3-step) emitted 3 structured toolCall events.

OpenClaw version

2026.4.9 (build 0512059)

Operating system

Ubuntu 24.04 LTS aarch64 (Linux 6.17.0-1014-nvidia)

Install method

npm global (npm install -g [email protected]), Node v22.22.2 via nvm

Model

NousResearch/Hermes-3-Llama-3.1-8B (representative; the same content-ordering bug surfaces on any small/mid model where users lower bootstrapMaxChars to fit context budgets)

Provider / routing chain

openclaw (standalone host gateway) → vLLM (http://127.0.0.1:8002/v1) → Hermes-3-Llama-3.1-8B

Additional provider/model setup details

Standalone OpenClaw on a host (no NemoClaw sandbox). vLLM 0.19.1 Docker container at :8002 with --enable-auto-tool-choice --tool-call-parser hermes --gpu-memory-utilization 0.20 --max-model-len 32768. ~/.openclaw/openclaw.json gateway.mode=local, primary model inference/hermes-3-llama-3.1-8b.

Logs, screenshots, and evidence

Auto-generated `AGENTS.md` outline (after `openclaw doctor --fix`):

# AGENTS.md - Your Workspace
This folder is home. Treat it that way.
## First Run                          ← gets injected when bootstrapMaxChars=1500
## Session Startup                    ← gets injected
## Memory                             ← partially injected
### 🧠 MEMORY.md - Your Long-Term Memory
### 📝 Write It Down - No "Mental Notes"!
## Red Lines                          ← TRUNCATED OUT (was the goal)
## External vs Internal               ← TRUNCATED OUT
## Group Chats                        ← TRUNCATED OUT
### 💬 Know When to Speak!            ← TRUNCATED OUT


Session jsonl from a "broken" run (auto-generated `AGENTS.md` + `bootstrapMaxChars: 1500`, prompt: "Fetch https://example.com and summarize"):

L5  message  role=user
L6  message  role=assistant  ctypes=['text']
            text="The fetched page from https://example.com is a security notice indicating..."
            ← hallucinated; gateway log has zero tool|fetch|invoke markers


Session jsonl from the "working" run (rewritten `AGENTS.md` with tool-rules at top, same `bootstrapMaxChars: 1500`):

L5  message  role=user
L6  message  role=assistant  ctypes=['toolCall']  ← STRUCTURED TOOL CALL
L7  message  role=toolResult ctypes=['text']
            text='{"url":"https://example.com","status":200,"contentType":"text/html",...}'
L8  message  role=assistant  ctypes=['text']
            text="The fetched page at https://example.com is a security notice..."
            ← real summary of real content


Same model, same vLLM, same parser, same `bootstrapMaxChars` — only difference is content ordering inside `AGENTS.md`.

Impact and severity

Affected: any user lowering bootstrapMaxChars to fit a small/mid model context budget — most commonly local-inference users on consumer GPUs, but also anyone optimizing token cost on cloud APIs. Per OpenClaw issue #22438 ("Tiered bootstrap file loading") and the body of public-domain research on long-prompt tool-following degradation (AGENTIF / Berkeley / Tsinghua), this user segment is significant and growing.

Severity: medium. Functional workaround exists (rewrite AGENTS.md manually, as we did), but the failure mode is silent and easy to misdiagnose as a model defect or a tool-call-parser defect (which is how we initially diagnosed it before the public-domain review surfaced #41304's root-cause analysis).

Frequency: deterministic when bootstrapMaxChars < ~3000 with default auto-generated content; behavior varies above that threshold.

Consequence: silent tool-dispatch hallucination — the agent claims to have searched/fetched/executed without actually doing so. Particularly dangerous because the hallucinated reply often sounds correct (Hermes-3 8B's example.com "summary" was generic enough to be plausible). Real-world consequences include: missed alerts, fabricated facts presented as authoritative, security guidance ignored.

Additional information

Related issues:

  • #22438 (open) "feat: Tiered bootstrap file loading for progressive context control" — adjacent ask. PR #22439 in flight to add bootstrapTier: minimal | standard | full. This (#08) is more specific: the content order within each file, separate from how many files get loaded.
  • #41304 (open) "Agent refuses to invoke write/action tools, hallucinates success" — kinthaiofficial 2026-04-28 root-cause comment ("system prompt too long → tool-use instruction deprioritized") explains the failure shape this issue addresses. Our wire-level repro confirms that diagnosis with concrete tool-call event counts.
  • #66060 (open) "active-memory: bloated prompt — full agent system prompt included in memory search context" — related; the memory_search subagent inherits the same auto-generated AGENTS.md ordering and faces the same head-truncation problem if its budget is constrained.
  • #62182 (closed) "Config validation rejects bootstrapMaxCharsPerFile as unrecognized" — closed in favor of the uniform bootstrapMaxChars. Per-file priority would be one path; content-priority within a file (this issue) is another.

Recommended order of resolution:

  1. Quickest win — reorder the auto-generated AGENTS.md template content (single change to the bootstrap-template generator). 1-line config change for users to update existing workspaces by re-running openclaw doctor --fix --regenerate-bootstrap-files (if such a flag exists; otherwise document a manual reset).
  2. Better long-term — content-priority annotations the prompt builder respects.
  3. Best for power-users — tier-based bootstrap (per #22438 / #22439) with content priority.

extent analysis

TL;DR

Reorder the auto-generated AGENTS.md template to prioritize operationally-critical guidance, such as tool-use rules and red-lines, to prevent head-truncation from removing essential content.

Guidance

  • Verify the issue by checking the auto-generated AGENTS.md file and the session jsonl for toolCall events after setting a small bootstrapMaxChars value.
  • Manually reorder the AGENTS.md content to put critical sections like ## Red Lines and ## External vs Internal at the top as a temporary workaround.
  • Consider adding content-priority hints, such as fenced sections marked <!-- bootstrap-priority: high -->, to ensure the prompt builder respects the priority of different sections.
  • Review related issues, such as #22438 and #41304, for additional context and potential solutions.

Example

No code snippet is provided as the issue is related to content ordering and priority, rather than code syntax.

Notes

The issue is specific to the OpenClaw version 2026.4.9 and may not apply to other versions. The recommended order of resolution prioritizes reordering the auto-generated AGENTS.md template content as the quickest win.

Recommendation

Apply the workaround by reordering the AGENTS.md content manually, as this is the most straightforward solution to ensure critical guidance is preserved when bootstrapMaxChars is set to a small value.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

The auto-generated AGENTS.md should put the most operationally-critical guidance — tool-use rules and red-lines — at the top, so head-truncation by bootstrapMaxChars preserves them. Either:

  • (a) Reorder the auto-generated template, OR
  • (b) Add a content-priority hint that the prompt builder respects (e.g. fenced sections marked <!-- bootstrap-priority: high -->), OR
  • (c) Switch from head-truncation to a content-aware truncation that keeps high-priority sections, OR
  • (d) Document the truncation behavior in bootstrapMaxChars's schema entry and tell users to manually re-order their AGENTS.md if they trim.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix [Bug]: Auto-generated `AGENTS.md` puts load-bearing tool-use rules at the bottom; head-truncation by `bootstrapMaxChars` strips them [1 pull requests, 1 comments, 2 participants]