openclaw - [Feature]: Token optimization flags: compact tool descriptions and system prompt sections for effective use of locally hosted models

GitHub stats
openclaw/openclaw#73786 · Fetched 2026-04-29 06:15:11
Comments: 0 · Participants: 1 · Timeline: 1 · Reactions: 0
Timeline (top): labeled ×1

Summary

Add experimental token-saving flags (compact tool descriptions, compact system prompt sections, and optional memory injection skip) to reduce context consumption for resource-constrained deployments.

Problem to solve

Users running OpenClaw with smaller language models (GPT-4, Sonnet, Haiku), local model backends (Ollama), or cost-sensitive deployments face significant constraints from token limits and costs. The current system prompt and tool descriptions are verbose and designed for large context windows, consuming 800-900+ tokens for fixed sections alone. Users need a way to reduce token consumption without losing tool functionality or manually editing/removing features. Currently, there's no built-in mechanism to intelligently compact the system prompt and tool descriptions, forcing users to either accept high token costs, manually strip features, or avoid using OpenClaw with resource-constrained models.

Proposed solution

Introduce three new experimental flags under agentDefaults.experimental:

  1. compactToolDescriptions - Replaces verbose tool descriptions with short display summaries before sending to the model. Significantly reduces token usage while keeping all tools callable.

  2. compactSystemPromptSections - Replaces verbose fixed system prompt sections (Documentation, CLI Quick Reference, Self-Update, Output Directives) with compact 1-2 line summaries. Saves approximately 800-900 tokens per run while maintaining all sections.

  3. skipAutoMemoryInjection - Skips auto-injecting memory files (MEMORY.md) and the memory plugin prompt section into the system prompt. With this flag enabled, the agent can still read memory on demand via tools (e.g., when instructed to do so in AGENTS.md).
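
To make the first flag concrete, here is a minimal sketch of what compacting tool descriptions could look like. It assumes each tool is a dictionary with name, description, and parameters keys and that a short summary exists per tool; the function name, the dictionary shape, and the summaries mapping are hypothetical illustrations, not OpenClaw's actual internals.

```python
from copy import deepcopy


def compact_tool_descriptions(tools, summaries):
    """Return copies of `tools` with descriptions swapped for short summaries.

    Hypothetical helper: the tool dict shape and the `summaries` mapping are
    assumptions for illustration, not OpenClaw's real data structures.
    """
    compacted = []
    for tool in tools:
        slim = deepcopy(tool)
        # Only the description changes; the name and parameter schema are
        # left intact, so the model can still call the tool.
        slim["description"] = summaries.get(tool["name"], tool["description"])
        compacted.append(slim)
    return compacted


tools = [
    {
        "name": "read_file",
        "description": (
            "Reads a file from the workspace and returns its contents as text. "
            "Use this before editing so that you know the current state."
        ),
        "parameters": {"type": "object", "properties": {"path": {"type": "string"}}},
    }
]
summaries = {"read_file": "Read a file."}

compacted = compact_tool_descriptions(tools, summaries)
print(compacted[0]["description"])
```

A tool without an entry in the summaries mapping keeps its original description, so a partial mapping never leaves a tool undocumented.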

These flags are particularly useful for:

  • Users running agents with smaller/local language models
  • Cost-conscious deployments with large token-heavy interactions
  • Scenarios where context window is the primary constraint

All flags default to false to preserve current behavior, and users can enable them individually in openclaw.config under agentDefaults.experimental.
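
The default-to-false behavior can be sketched as follows. This assumes the config is JSON-shaped and has already been parsed into a dictionary; the resolve_experimental_flags helper is hypothetical, and only the three flag names and the agentDefaults.experimental path come from the proposal.

```python
import json

# Flag names are taken from the proposal; every flag defaults to False so
# that an unmodified config keeps the current behavior.
DEFAULT_FLAGS = {
    "compactToolDescriptions": False,
    "compactSystemPromptSections": False,
    "skipAutoMemoryInjection": False,
}


def resolve_experimental_flags(config):
    """Overlay user-set flags on the all-False defaults (hypothetical helper)."""
    user = config.get("agentDefaults", {}).get("experimental", {})
    return {
        name: bool(user.get(name, default))
        for name, default in DEFAULT_FLAGS.items()
    }


# Assumption: the config file is JSON; only one flag is enabled here.
config = json.loads(
    '{"agentDefaults": {"experimental": {"compactToolDescriptions": true}}}'
)
flags = resolve_experimental_flags(config)
print(flags)
```

Because each flag is resolved independently, enabling one leaves the other two at their defaults.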

Alternatives considered

No response

Impact

Affected users/systems:

  • Users deploying OpenClaw with local model backends (Ollama, LLaMA, etc.)
  • Self-hosted or on-premise deployments with limited infrastructure budgets
  • High-volume agent deployments where per-token costs significantly impact viability

Severity: Annoying

  • Users wanting to use OpenClaw with smaller/local models currently cannot, as verbose prompts and tool descriptions consume too much of the limited context window
  • Cost-sensitive deployments are forced to use larger, more expensive models to maintain functionality

Frequency: Always

  • Every agent run incurs the token cost of verbose system prompt sections (~800-900 tokens) and detailed tool descriptions
  • Users with context-window-limited models are continuously blocked from adoption

Consequences:

  • Inability to use OpenClaw with resource-constrained setups (local models, smaller cloud models)
  • Increased operational costs for cost-conscious deployments
  • Lost opportunities to serve users who prefer or require local/smaller model deployments
  • Users must choose between OpenClaw capabilities or model/cost constraints, rather than having both

Evidence/examples

I am running the changes in my local fork of openclaw and seeing ~44% token savings (measured below) without any visible impact on the functioning of the agent.

All flags enabled simultaneously:

Baseline total:   9,405 tokens
  System prompt:  2,311 tokens
  Tool schemas:   7,094 tokens

Optimized total:  5,244 tokens
  System prompt:  1,571 tokens
  Tool schemas:   3,673 tokens
  ─────────────────────────────
  Total saved:    4,161 tokens  (44.2%)
    From prompt:    740 tokens
    From tools:   3,421 tokens
  + memory skip: ~200–500 tokens additional  (production only)

With all three flags: up to ~4,661 tokens saved (~47%) per agent turn.
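
The figures above can be checked with a few lines of arithmetic. The token counts are the ones reported in the issue; the only assumption is how the ~47% figure is derived, since the issue does not state its denominator. It matches if the upper-bound 500 memory tokens are also counted in the baseline.

```python
# Token counts as reported in the issue.
baseline = {"system_prompt": 2311, "tool_schemas": 7094}
optimized = {"system_prompt": 1571, "tool_schemas": 3673}

baseline_total = sum(baseline.values())
optimized_total = sum(optimized.values())
saved = baseline_total - optimized_total
saved_pct = round(100 * saved / baseline_total, 1)

# Per-component savings.
saved_prompt = baseline["system_prompt"] - optimized["system_prompt"]
saved_tools = baseline["tool_schemas"] - optimized["tool_schemas"]

# Upper bound of the reported ~200-500 token memory-skip saving.
memory_skip_max = 500
saved_all = saved + memory_skip_max

# Assumption: the baseline for the "all three flags" figure includes the
# memory tokens that would otherwise be injected.
saved_all_pct = round(100 * saved_all / (baseline_total + memory_skip_max), 1)

print(baseline_total, optimized_total, saved, saved_pct)
print(saved_prompt, saved_tools, saved_all, saved_all_pct)
```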

Additional information

No response

extent analysis

TL;DR

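The issue proposes three flags (compactToolDescriptions, compactSystemPromptSections, and skipAutoMemoryInjection) under agentDefaults.experimental in openclaw.config to reduce token consumption. This is a feature request that the author has tested in a local fork, so the flags apply only if your build includes them.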

Guidance

  • Set compactToolDescriptions to true to replace verbose tool descriptions with short summaries, reducing token usage while keeping tools callable.
  • Set compactSystemPromptSections to true to replace verbose system prompt sections with compact summaries, saving approximately 800-900 tokens per run.
  • Set skipAutoMemoryInjection to true to skip auto-injecting memory files and the memory plugin prompt section into the system prompt, allowing users to access memory on-demand via tools.
  • Verify the token savings by comparing the total tokens used before and after enabling these flags, as shown in the provided example.

Example

{
  "agentDefaults": {
    "experimental": {
      "compactToolDescriptions": true,
      "compactSystemPromptSections": true,
      "skipAutoMemoryInjection": true
    }
  }
}

Notes

The exact token savings will vary with the use case and configuration. The issue author reports no visible impact on agent behavior in a local fork, but that is a single report with no comments or confirmations, so verify behavior in your own setup before relying on the flags.

Recommendation

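If your build includes the three experimental flags, enabling them can cut token consumption substantially (about 44% in the author's measurement) while keeping all tools callable. Because the flags come from a feature request and were measured only in the author's fork, treat them as experimental rather than as a guaranteed fix, and enable them one at a time so that any regression can be traced to a single flag.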
