claude-code - Expose `anthropic-ratelimit-*` response headers as OpenTelemetry attributes on `api_request` [1 comment, 2 participants]

GitHub stats
anthropics/claude-code#60502 · Fetched 2026-05-20 03:56:57
Comments: 1 · Participants: 2 · Timeline events: 3 · Reactions: 0
Timeline (top): labeled ×2, commented ×1


Summary

Expose Anthropic's anthropic-ratelimit-* response headers as attributes on Claude Code's OpenTelemetry log records (api_request event) and/or spans (claude_code.llm_request).

Currently the API call's outcome makes it into telemetry (token counts, cost, model, stop_reason), but the rate-limit budget headers that the Anthropic API returns on every response do not. That makes it impossible to build a "how much budget do I have left" indicator without leaving the terminal for console.anthropic.com (and for users on org-managed enterprise plans, the console isn't even visible from their side).

Which headers

Per the public API docs, the Anthropic Messages API returns the following rate-limit headers on its responses (retry-after appears only on 429 responses):

anthropic-ratelimit-requests-limit
anthropic-ratelimit-requests-remaining
anthropic-ratelimit-requests-reset
anthropic-ratelimit-input-tokens-limit
anthropic-ratelimit-input-tokens-remaining
anthropic-ratelimit-input-tokens-reset
anthropic-ratelimit-output-tokens-limit
anthropic-ratelimit-output-tokens-remaining
anthropic-ratelimit-output-tokens-reset
anthropic-ratelimit-tokens-limit
anthropic-ratelimit-tokens-remaining
anthropic-ratelimit-tokens-reset
retry-after  (on 429 responses)

These would naturally map to OTel attributes on the existing api_request log event, with a ratelimit. prefix:

ratelimit.requests_limit
ratelimit.requests_remaining
ratelimit.requests_reset
ratelimit.input_tokens_limit
...
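
The mapping above could be sketched as follows. This is a minimal illustration, not Claude Code's implementation: the function name `ratelimit_attributes` and the `ratelimit.retry_after` attribute name are hypothetical, and it assumes that purely numeric values should become integers while the `*-reset` values stay as the strings the API sends.

```python
PREFIX = "anthropic-ratelimit-"


def ratelimit_attributes(headers):
    """Map rate-limit response headers to ratelimit.* attributes.

    Hypothetical helper: header names are matched case-insensitively and
    unrelated headers are ignored.
    """
    attrs = {}
    for name, value in headers.items():
        key = name.lower()
        if key.startswith(PREFIX):
            # anthropic-ratelimit-input-tokens-limit -> ratelimit.input_tokens_limit
            attr = "ratelimit." + key[len(PREFIX):].replace("-", "_")
        elif key == "retry-after":
            # Assumed attribute name; the issue does not specify one.
            attr = "ratelimit.retry_after"
        else:
            continue
        # Counts become ints; anything non-numeric (e.g. reset timestamps) stays a string.
        attrs[attr] = int(value) if value.isdigit() else value
    return attrs
```

Keeping the reset values as strings avoids guessing at a timestamp format on the producer side; a consumer can parse them as needed.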

Motivation

I run a local OpenTelemetry collector that captures Claude Code's emitted log records into DuckDB for analysis. I can compute spend ($ / hour, $ / session, $ / day-of-week, cache thrash signatures, etc.) — but I can't tell whether I'm about to hit the rate-limit wall, because the limit/remaining numbers never enter the local pipeline.

Concrete pain: on 2026-05-18 a long-running session hit 3× 429 errors in 5 minutes. The 429s show up in the telemetry, but only after the fact. A burn-rate monitor that surfaced "remaining input-tokens this window = N%" 15 minutes before that wall would have let me throttle or pause cleanly instead of seeing failures land.
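
If the remaining-budget numbers were present in telemetry, a burn-rate projection of the kind described here might look like the sketch below. The function name and sample format are hypothetical, and the model is deliberately crude: a straight-line extrapolation between the oldest and newest sample that ignores window resets.

```python
def minutes_until_exhausted(samples):
    """Project how many minutes remain before the budget reaches zero.

    samples: list of (minutes_elapsed, remaining_tokens) tuples, oldest first.
    Returns None when there is too little data or the budget is not shrinking.
    """
    if len(samples) < 2:
        return None
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    if t1 <= t0 or r1 >= r0:
        return None
    burn_per_minute = (r0 - r1) / (t1 - t0)
    return r1 / burn_per_minute
```

A monitor could raise a warning whenever the projected value drops below some lead time, such as the 15 minutes mentioned above.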

This matters even more on Anthropic's enterprise plans, where users typically cannot see the org-level budget in the console — only the admin can. Telemetry-side visibility would let individual engineers self-monitor without needing to ping their admin.

What I tried

  • Inspected OTEL_LOG_RAW_API_BODIES=file:... raw-body capture: response JSON contains model, id, type, role, content, stop_reason, usage, diagnostics — no header data.
  • Inspected claude_code.api_request log events via OTLP: attributes include input_tokens, output_tokens, cache_read_tokens, cache_creation_tokens, cost_usd, duration_ms, model, session.id, organization.id, etc. — no ratelimit.* or equivalent.
  • Inspected claude_code.llm_request spans: same shape, no rate-limit fields.

So the binary appears to drop the response headers before the OTel signal-emission stage.
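
The attribute inspection can be repeated with a small script. This sketch assumes the log records have been exported as OTLP/JSON (for example by a collector's file exporter) with the standard `resourceLogs` / `scopeLogs` / `logRecords` nesting; the function name is hypothetical.

```python
import json


def ratelimit_keys(otlp_json):
    """Return the sorted attribute keys that mention 'ratelimit' in an OTLP/JSON logs payload."""
    payload = json.loads(otlp_json)
    found = set()
    for resource_logs in payload.get("resourceLogs", []):
        for scope_logs in resource_logs.get("scopeLogs", []):
            for record in scope_logs.get("logRecords", []):
                for attr in record.get("attributes", []):
                    key = attr.get("key", "")
                    if "ratelimit" in key.lower():
                        found.add(key)
    return sorted(found)
```

An empty result for a payload that contains api_request events is consistent with the observation that the headers never reach the emitted signals.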

Proposed API surface

No new env var needed — just add the attributes to the existing api_request log record (or claude_code.llm_request span; either or both works). Attribute names with a ratelimit. prefix would be a natural fit and avoid colliding with anything else.

Optional follow-up: emit a separate api_ratelimit_warning event when remaining < 10% to make alerting trivial without consumer-side math.
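
Until such an event exists, the same check is easy to do on the consumer side once the attributes are available. A minimal sketch, assuming integer `ratelimit.*_limit` and `ratelimit.*_remaining` attributes as proposed above; the function name and the return shape are hypothetical.

```python
def low_budget_dimensions(attrs, threshold=0.10):
    """Return the rate-limit dimensions whose remaining share is below threshold.

    attrs: dict of attribute name -> value, e.g. {"ratelimit.requests_limit": 1000, ...}.
    """
    suffix = "_remaining"
    low = []
    for key, remaining in attrs.items():
        if not key.endswith(suffix):
            continue
        base = key[: -len(suffix)]
        limit = attrs.get(base + "_limit")
        # Skip dimensions with a missing, non-numeric, or zero limit.
        if not isinstance(limit, int) or not isinstance(remaining, int) or limit <= 0:
            continue
        if remaining / limit < threshold:
            low.append(base)
    return sorted(low)
```

The 10% default mirrors the threshold suggested above; an emitted warning event would simply move this comparison into Claude Code itself.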

Why this generalises beyond me

Any user running observability on Claude Code (Datadog / Honeycomb / Grafana Cloud / etc.) hits the same gap. The community is building burn-rate monitors against the OTel signals already (mine is one of several I've seen on GitHub) — every one of them currently has the same "spend is observable, budget isn't" blind spot.

Environment

  • Claude Code 2.1.133
  • CLAUDE_CODE_ENABLE_TELEMETRY=1 + CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1
  • OTEL_LOGS_EXPORTER=otlp, OTEL_LOG_RAW_API_BODIES=file:...
  • Both Pro/Max (personal) and TomTom enterprise org IDs observed in the same data
