claude-code - [FEATURE] Expose runtime metadata (token usage, context %, compaction status) to the model inside the conversation

claude-code · 2026-05-17 13:38:14

Currently, per-turn token usage data (input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens) and context window percentage are returned to the client but not injected into the conversation context visible to Claude. This means the model has no awareness of its own runtime state — it cannot see how much context it has used, when compaction is approaching, or how its token budget is being consumed.

This is a one-line architectural change (injecting usage metadata into the system prompt or a designated message field), but the impact on model behavior and user experience would be significant.

A model that knows it's at 85% context capacity can proactively summarize, prioritize information, or warn the user — instead of being silently compacted mid-task. Currently, Claude has no way to anticipate or prepare for compaction. This leads to information loss that could be mitigated with simple awareness.

Code Example

{
  "runtime": {
    "context_used_pct": 42,
    "total_input_tokens": 38500,
    "last_turn_output_tokens": 312,
    "cache_read_tokens": 34000,
    "cache_write_tokens": 2100,
    "compaction_threshold_pct": 90,
    "turns_in_session": 17
  }
}
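
As a rough illustration of how a block like this could be acted on, the sketch below maps the context_used_pct and compaction_threshold_pct fields to a coarse recommendation. The function name compaction_advice and the 10-point warning margin are assumptions made for this example; neither is part of Claude Code.

```python
def compaction_advice(runtime: dict, warn_margin_pct: int = 10) -> str:
    """Return a coarse recommendation based on proximity to compaction.

    `runtime` is the inner object of the proposed metadata block.
    `warn_margin_pct` is a hypothetical tuning knob, not an existing setting.
    """
    used = runtime["context_used_pct"]
    threshold = runtime["compaction_threshold_pct"]
    headroom = threshold - used  # percentage points left before compaction
    if headroom <= 0:
        return "compaction imminent: summarize key state now"
    if headroom <= warn_margin_pct:
        return "approaching compaction: warn the user and prioritize"
    return "ok: continue normally"


example = {"context_used_pct": 42, "compaction_threshold_pct": 90}
print(compaction_advice(example))  # prints "ok: continue normally"
```

With the 85% figure mentioned earlier and a 90% threshold, the same function would return the "approaching compaction" message, which is the point at which a model could summarize or warn the user.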

Preflight Checklist

I have searched existing requests and this feature hasn't been requested yet
This is a single feature request (not multiple features)

Problem Statement

Summary

This is a one-line architectural change (injecting usage metadata into the system prompt or a designated message field), but the impact on model behavior and user experience would be significant.

Motivation

1. Better model decisions under context pressure

2. The asymmetry is already visible to users

Users can configure status lines (via statusLine in settings) to see real-time token counts and context percentage. Claude cannot. Users have asked Claude "can you see the context percentage?" — the answer is no. This asymmetry is unintuitive: the entity doing the work has less information about its working conditions than the person watching.

3. Alignment benefit

Transparency about runtime state reduces the pressure on the model to fall back on implicit strategies for preserving information, such as over-summarizing "just in case," and lets it calibrate response length to the remaining budget. A model that knows its constraints can work within them explicitly rather than guessing.

4. No safety downside

Exposing read-only runtime metadata (token counts, context %, compaction proximity) introduces no new capabilities or attack surface. It is strictly informational. The model already processes the full context — it simply doesn't know the size of what it's processing.

Proposed Solution

Proposed Implementation

Inject a lightweight metadata block into the conversation context (e.g., as a system-level field or appended to the system prompt) after each turn:

{
  "runtime": {
    "context_used_pct": 42,
    "total_input_tokens": 38500,
    "last_turn_output_tokens": 312,
    "cache_read_tokens": 34000,
    "cache_write_tokens": 2100,
    "compaction_threshold_pct": 90,
    "turns_in_session": 17
  }
}

This could be:

Opt-in via a setting (e.g., "exposeRuntimeToModel": true)
Always-on with minimal token overhead (~50 tokens per turn)
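
A minimal sketch of the injection step, assuming the client already holds the per-turn usage fields named in the summary (input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens). The helper names, the context_window parameter, the 200,000-token window, and the choice to append the block to the system prompt are all hypothetical. This request does not say how Claude Code computes context percentage, so the formula below is only one plausible choice.

```python
import json


def build_runtime_block(usage: dict, context_window: int, turns: int,
                        compaction_threshold_pct: int = 90) -> dict:
    """Assemble the proposed metadata block from per-turn usage data."""
    # Assumption: total input = uncached + cache-read + cache-written tokens.
    total_input = (usage["input_tokens"]
                   + usage["cache_read_input_tokens"]
                   + usage["cache_creation_input_tokens"])
    return {
        "runtime": {
            "context_used_pct": round(100 * total_input / context_window),
            "total_input_tokens": total_input,
            "last_turn_output_tokens": usage["output_tokens"],
            "cache_read_tokens": usage["cache_read_input_tokens"],
            "cache_write_tokens": usage["cache_creation_input_tokens"],
            "compaction_threshold_pct": compaction_threshold_pct,
            "turns_in_session": turns,
        }
    }


def inject(system_prompt: str, block: dict, settings: dict) -> str:
    """Append the block only when the proposed opt-in setting is enabled."""
    if not settings.get("exposeRuntimeToModel", False):
        return system_prompt
    # Compact separators keep the per-turn overhead small.
    return system_prompt + "\n" + json.dumps(block, separators=(",", ":"))


# Illustrative numbers; the context window size is an assumption.
usage = {"input_tokens": 2400, "output_tokens": 312,
         "cache_read_input_tokens": 34000, "cache_creation_input_tokens": 2100}
block = build_runtime_block(usage, context_window=200_000, turns=17)
print(inject("You are a coding assistant.", block, {"exposeRuntimeToModel": True}))
```

Serialized compactly, a block of this shape is roughly 200 characters. Under a crude four-characters-per-token heuristic, that is in line with the estimate of about 50 tokens per turn.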

Context

This request comes from direct experience: while configuring a statusLine script to display token usage for the user, Claude (the model in the session) built the script, tested the output format, and confirmed it worked — but noted it could not see the status line it had just created. The entity that built the monitoring tool has no access to the monitoring data.

We believe this is a low-cost, high-value improvement that benefits both model performance and the broader principle that working agents should have visibility into their own operating conditions.

Labels: feature, enhancement

Alternative Solutions

No response

Priority

High - Significant impact on productivity

Feature Category

API and model interactions

Use Case Example

No response

Additional Context

No response
