claude-code - 💡(How to fix) Fix [FEATURE] Include tool execution duration_ms in PostToolUse and PostToolUseFailure hook inputs [1 comments, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#52047Fetched 2026-04-23 07:37:59
View on GitHub
Comments
1
Participants
1
Timeline
6
Reactions
0
Participants
Timeline (top)
labeled ×2closed ×1commented ×1mentioned ×1

Error Message

PostToolUse and PostToolUseFailure hook stdin include tool_name, tool_input, tool_response/error, tool_use_id, and the base fields, but no duration.

Fix Action

Fix / Workaround

  • PreToolUsePostToolUse correlation: works but adds one extra hook invocation per tool (~80–120ms Node startup). Worse, the resulting "duration" includes permission wait time, which in default permission mode can be hours (user away from keyboard). Misleading.
  • Transcript parsing: read transcript_path jsonl backwards to find matching tool_use_id's timestamp. Undocumented format, I/O cost, brittle across Claude Code versions.
  • is_interrupt as a proxy: doesn't help, tells me what happened but not when. - Reporting -1 / omitting duration: current workaround; analytics lose a core signal.
RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing requests and this feature hasn't been requested yet
  • This is a single feature request (not multiple features)

Problem Statement

I build session level telemetry on top of Claude Code hooks which captures every tool call event for downstream analytics (which tools are slow, which fail, how long a session spends in verification vs. editing, etc.). The one field I can't get is tool execution duration.

PostToolUse and PostToolUseFailure hook stdin include tool_name, tool_input, tool_response/error, tool_use_id, and the base fields, but no duration.

Claude Code does measure this internally as that value is used for OTel / internal stats but never flows into the hook input.

Proposed Solution

Add duration_ms to both hook input schemas (PostToolUse and PostToolUseFailure)

Alternative Solutions

  • PreToolUsePostToolUse correlation: works but adds one extra hook invocation per tool (~80–120ms Node startup). Worse, the resulting "duration" includes permission wait time, which in default permission mode can be hours (user away from keyboard). Misleading.
  • Transcript parsing: read transcript_path jsonl backwards to find matching tool_use_id's timestamp. Undocumented format, I/O cost, brittle across Claude Code versions.
  • is_interrupt as a proxy: doesn't help, tells me what happened but not when. - Reporting -1 / omitting duration: current workaround; analytics lose a core signal.

Cursor's postToolUseFailure hook already ships a duration field (see docs.cursor.com/hooks). Parity would simplify cross-IDE telemetry pipelines.

Priority

Critical - Blocking my work

Feature Category

Developer tools/SDK

Use Case Example

Scenario: session-level tool performance analytics for an AI coding agent

  1. Agent runs a turn with 40 tool calls: 30 Read, 5 Edit, 4 Bash, 1 long-running Bash (npm test).
  2. PostToolUse fires for each; my hook captures the event + forwards to a collector.
  3. Today's downstream analytics can answer: "which tools ran, in what order, what failed." Cannot answer: "which tool calls exceeded 2 seconds."
  4. With duration_ms, I can:
    • Flag slow tools per session (most common culprit: long Bash commands on unindexed codebases).
    • Build p50/p95/p99 distributions per tool type across many sessions.
    • Correlate "time in tools" vs. "time in model" to surface agent efficiency.
  5. Same data is already captured internally by Claude Code — exposing it costs ~1 line of schema + ~1 line to pass it through.

Concrete downstream query (after the field is added):

"In the last 100 agent sessions, which tool_calls took >5s, sorted by duration, grouped by tool_name?"

Today: impossible without the PreToolUse-correlation hack. After: trivial.

Additional Context

No response

extent analysis

TL;DR

Add a duration_ms field to the PostToolUse and PostToolUseFailure hook input schemas to capture tool execution duration.

Guidance

  • Review the proposed solution to add duration_ms to both hook input schemas, which would provide the necessary information for downstream analytics.
  • Consider the alternative solutions mentioned, such as PreToolUsePostToolUse correlation, transcript parsing, and using is_interrupt as a proxy, but note their limitations and potential drawbacks.
  • Evaluate the priority and use case example provided to understand the importance of capturing tool execution duration for session-level tool performance analytics.
  • Investigate the existing implementation of duration_ms in Cursor's postToolUseFailure hook, as mentioned in the issue, to ensure parity and simplify cross-IDE telemetry pipelines.

Example

No code snippet is provided as the issue does not contain specific code examples.

Notes

The issue lacks information about the internal implementation of Claude Code and how the duration_ms value is currently measured and used for OTel/internal stats. Additional context or clarification may be necessary to fully understand the requirements and constraints of the proposed solution.

Recommendation

Apply the proposed solution to add duration_ms to the PostToolUse and PostToolUseFailure hook input schemas, as it appears to be a straightforward and effective way to capture tool execution duration and address the blocking issue.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING