claude-code: Feature request: AssistantMessageDelta hook for streaming-aware tooling (TTS, telemetry, mirror) [1 pull request, 1 participant]

GitHub stats
anthropics/claude-code#60564 · Fetched 2026-05-20 03:55:18
Comments: 0 · Participants: 1 · Timeline events: 3 · Reactions: 0
Timeline (top): labeled ×2, cross-referenced ×1

PR fix notes

PR #5: Highlight-and-speak: clip subcommand + Hammerspoon hotkey

Description (problem / solution / changelog)

Summary

Stacked on top of #4 (per-terminal routing). Adds on-demand TTS: highlight any text on screen, press a hotkey, the daemon speaks it. Bypasses mute and claim gates because it's user-initiated.

Why this instead of streaming-while-Claude-types? I went down that rabbit hole — verified empirically that Claude Code flushes the JSONL transcript in one shot at end-of-turn (not incrementally), and that the Stop hook is structurally end-of-turn. Filed anthropics/claude-code#60564 for an AssistantMessageDelta hook. Until that ships, on-demand select-and-speak is the better UX anyway — you choose what gets read.

What changes

  • claude-voice clip — reads the macOS clipboard (pbpaste), sends it to the daemon, daemon speaks it.
  • override: true flag on speak — bypasses both the per-tty mute set and the active-tty claim. Used by clip because a user-initiated speak should always run regardless of which terminal is claimed.
  • integrations/hammerspoon-snippet.lua — drop into ~/.hammerspoon/init.lua. Default binding is Cmd+Shift+T. The hotkey sends Cmd+C, waits ~80ms for the clipboard to update, runs claude-voice clip, then restores the previous clipboard contents so the user doesn't lose what they had copied.

Karabiner-Elements, Raycast, BetterTouchTool, or any other macOS hotkey tool also work — Hammerspoon is just the lightest dependency. Documented in the README.
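The override behaviour described above can be sketched as a small gating function. This is only an illustration under assumptions: the `Router` class, its field names, and `should_speak` are hypothetical and are not taken from the claude-voice source; the only behaviour drawn from the PR text is that `override` skips both the per-tty mute set and the active-tty claim.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Router:
    """Hypothetical routing state: muted ttys plus an optional active claim."""
    muted: Set[str] = field(default_factory=set)
    claimed_tty: Optional[str] = None

    def should_speak(self, tty: str, override: bool = False) -> bool:
        # User-initiated requests (for example from `clip`) skip both gates.
        if override:
            return True
        # Gate 1: per-tty mute set.
        if tty in self.muted:
            return False
        # Gate 2: while a terminal holds the claim, only that terminal speaks.
        if self.claimed_tty is not None and tty != self.claimed_tty:
            return False
        return True
```

With `Router(muted={"/dev/ttys001"}, claimed_tty="/dev/ttys002")`, a plain request from `/dev/ttys001` is rejected, while the same request with `override=True` passes, which is the pair of cases the two new tests are described as covering.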

Tests

2 new unittest cases in tests/test_routing.py:

  • test_override_bypasses_mute: speak with override + muted tty → still queued
  • test_override_bypasses_claim: speak with override + wrong tty under active claim → still queued

$ python -m unittest discover tests -v
Ran 17 tests in 0.931s
OK

Reviewer notes

Built on feat/per-terminal-routing (#4). GitHub will show the cumulative diff against main — view the third commit alone to see only this PR's changes. Once #4 merges, I'll rebase this onto the new main and the diff collapses to just the clip work.

If #4 turns out to be too much scope, the override flag and the clip command can be cleanly transplanted onto #3 (daemon-only) instead. Let me know which structure works best for you.

🤖 Generated with Claude Code

Changed files

  • README.md (modified, +86/-7)
  • claude_voice.py (modified, +533/-49)
  • integrations/hammerspoon-snippet.lua (added, +47/-0)
  • speak.py (modified, +533/-49)
  • tests/test_daemon_protocol.py (added, +163/-0)
  • tests/test_routing.py (added, +211/-0)

Issue body

Motivation

The current hook surface fires only at boundary events: UserPromptSubmit, PreToolUse, PostToolUse, Stop, etc. There is no way to react to assistant text while it is being generated.

For an entire class of downstream tooling — local TTS, transcript mirroring, on-the-fly translation, accessibility narrators, telemetry — that boundary-only model means the user has to wait for the full response to finish before any side-effect can begin. On a 500-word response that is a 15–30s wait before any narration starts.

I just shipped a local Kokoro-TTS daemon (Null-Phnix/claude-voice#3) that drops warm time-to-first-audio from ~6s to ~0.6s. The bottleneck is no longer the model; it is that the Stop hook only fires after the assistant turn completes.

I verified by polling the JSONL transcript file every 100ms during an assistant response: the file grew exactly once, with the entire 7.2KB assistant message written in a single flush at end-of-turn. So tailing the transcript can't substitute either.
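A measurement of this kind can be reproduced with a short polling loop. The sketch below is not the script used for the observation above; `watch_growth` is a hypothetical helper, and the transcript path must be supplied by the reader. It records each moment the file size changes, so a single end-of-turn flush shows up as exactly one entry.

```python
import os
import time
from typing import List, Tuple


def watch_growth(path: str, duration_s: float,
                 interval_s: float = 0.1) -> List[Tuple[float, int]]:
    """Poll a file's size; return (elapsed_seconds, bytes_added) per change."""
    events: List[Tuple[float, int]] = []
    start = time.monotonic()
    last_size = os.path.getsize(path) if os.path.exists(path) else 0
    while time.monotonic() - start < duration_s:
        time.sleep(interval_s)
        size = os.path.getsize(path) if os.path.exists(path) else 0
        if size != last_size:
            # One entry per observed size change, not per poll.
            events.append((time.monotonic() - start, size - last_size))
            last_size = size
    return events
```

Run against a transcript while a response is being generated, an incremental writer would yield many small entries; a one-shot flush yields a single large one.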

Proposed hook

A new hook event fired periodically during assistant generation with the in-progress text:

{
  "event": "AssistantMessageDelta",
  "session_id": "...",
  "message_id": "...",
  "delta": {
    "text": "...the text added since the last delta..."
  },
  "cumulative_text": "...full assistant text so far..."  // optional but very useful
}

Configuration would be the same as other hooks in ~/.claude/settings.json:

"hooks": {
  "AssistantMessageDelta": [
    { "matcher": "", "hooks": [{ "type": "command", "command": "...", "async": true, "timeout": 5 }] }
  ]
}
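To make the proposed contract concrete, here is a minimal sketch of what a consumer command might do with one event. It assumes the payload arrives as JSON on stdin, as it does for existing Claude Code hooks; since the event itself does not exist yet, the field names simply follow the proposal above, and `new_text` and `new_text_from_stream` are hypothetical helper names.

```python
import json
from typing import IO


def new_text(payload: dict, already_seen: str = "") -> str:
    """Return the text added by this event, per the proposed payload shape."""
    if payload.get("event") != "AssistantMessageDelta":
        return ""
    delta = payload.get("delta") or {}
    if "text" in delta:
        return delta["text"]
    # No delta present: derive it from cumulative_text instead.
    cumulative = payload.get("cumulative_text", "")
    if cumulative.startswith(already_seen):
        return cumulative[len(already_seen):]
    return cumulative  # prefix mismatch: treat the whole text as new


def new_text_from_stream(stream: IO[str], already_seen: str = "") -> str:
    """Parse one JSON payload from a text stream (for example sys.stdin)."""
    return new_text(json.load(stream), already_seen)
```

A hook script would call `new_text_from_stream(sys.stdin)` and forward the result to its TTS or mirroring backend. Because each hook invocation is presumably a separate process, `already_seen` would have to be persisted somewhere (a file or a daemon) for the `cumulative_text` fallback to be useful.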

Implementation notes / requests

A few things would make this hook actually usable for streaming consumers:

  1. async: true is the only sensible mode. A synchronous hook on every delta would tank generation throughput, so the documentation should say so clearly.
  2. Debounce/batch. Firing per-token is overkill. Firing every ~100ms or on every sentence-terminator (.!? followed by whitespace) would cover almost every consumer's needs without flooding hook processes.
  3. Require either delta or cumulative_text, not both. Consumers can derive one from the other. Sending only delta is fine if each delta is flagged with its order; sending only cumulative_text is fine if you accept the redundancy.
  4. Final delta marker. A flag like "is_final": true on the last delta of a turn (or a separate AssistantMessageEnd event) helps consumers flush their last sentence cleanly.
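Items 2 and 4 can be illustrated from the consumer side. The sketch below buffers incoming delta text, releases only complete sentences (a terminator followed by whitespace), and flushes the remainder when a final marker arrives. `SentenceBuffer` is a hypothetical name, `is_final` is the flag proposed in item 4 rather than an existing field, and the boundary rule is deliberately naive (abbreviations such as "e.g." will split early).

```python
import re
from typing import List

# Sentence terminator (. ! ?) followed by whitespace, as in item 2.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


class SentenceBuffer:
    """Accumulates delta text and emits only complete sentences."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, delta_text: str, is_final: bool = False) -> List[str]:
        self._pending += delta_text
        parts = _BOUNDARY.split(self._pending)
        # The last part may be an unfinished sentence; hold it back.
        self._pending = parts.pop()
        if is_final:
            # End of turn: flush whatever is left, even without a terminator.
            tail = self._pending.strip()
            if tail:
                parts.append(tail)
            self._pending = ""
        return [p for p in parts if p]
```

Feeding `"Hello world. How are"` yields `["Hello world."]`; the fragment `"How are"` stays buffered until a later delta completes it or `is_final=True` forces a flush. A TTS consumer could queue each returned sentence for synthesis as soon as it appears.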

Use cases this unlocks

  • Real-time TTS — start speaking sentence 1 while Claude is still generating sentence 2. The reason I filed this.
  • Accessibility — screen readers that narrate as the assistant types, the way visually-impaired users already use screen readers with chat apps.
  • Live transcript mirror — push the response to a second device / dashboard as it's being typed, instead of post-hoc.
  • Per-sentence translation — pipe deltas to a translator and emit in the user's language as fast as the assistant emits in English.
  • Content moderation / safety filters — abort or rewrite a response mid-stream when an external rule fires, instead of after the user has already seen it.
  • Token-level telemetry for tools that need finer-grained data than Stop provides.

Workarounds considered (and why they don't suffice)

  • Tail the JSONL transcript — file is flushed once at end-of-turn (verified empirically). No streaming visibility.
  • Wrap Claude Code's stdout via PTY — fragile, breaks every time the TUI redraws or cursor-moves, and forces users to launch via a wrapper.
  • PostToolUse — fires only on tool use, not on plain-text assistant turns.
  • Polling the API directly via a custom client — duplicates Claude Code's session, auth, MCP plumbing.

A first-class hook avoids all of those failure modes.

Happy to help

If a delta hook is on the roadmap, I'd be happy to:

  • Test against any preview build and give feedback before stable.
  • Update Null-Phnix/claude-voice as a reference consumer the moment it ships.
  • Write user-facing docs once the contract is settled.

Thanks for building Claude Code.
