claude-code - ✅(Solved) Fix Feature request: AssistantMessageDelta hook for streaming-aware tooling (TTS, telemetry, mirror) [1 pull requests, 1 participants]

claude-code · 2026-05-19 14:41:16

anthropics/claude-code#60564 • Fetched 2026-05-20 03:55:18

Author

eli-newman

Participants

eli-newman

Timeline (top)

labeled ×2, cross-referenced ×1

PR fix notes

PR #5: Highlight-and-speak: clip subcommand + Hammerspoon hotkey

Repository: Null-Phnix/claude-voice
Author: eli-newman
State: open | merged: False
Link: https://github.com/Null-Phnix/claude-voice/pull/5

Description (problem / solution / changelog)

Summary

Stacked on top of #4 (per-terminal routing). Adds on-demand TTS: highlight any text on screen, press a hotkey, the daemon speaks it. Bypasses mute and claim gates because it's user-initiated.

Why this instead of streaming-while-Claude-types? I went down that rabbit hole — verified empirically that Claude Code flushes the JSONL transcript in one shot at end-of-turn (not incrementally), and that the Stop hook is structurally end-of-turn. Filed anthropics/claude-code#60564 for an AssistantMessageDelta hook. Until that ships, on-demand select-and-speak is the better UX anyway — you choose what gets read.

What changes

claude-voice clip — reads the macOS clipboard (pbpaste), sends it to the daemon, daemon speaks it.
override: true flag on speak — bypasses both the per-tty mute set and the active-tty claim. Used by clip because a user-initiated speak should always run regardless of which terminal is claimed.
integrations/hammerspoon-snippet.lua — drop into ~/.hammerspoon/init.lua. Default binding is Cmd+Shift+T. The hotkey sends Cmd+C, waits ~80ms for the clipboard to update, runs claude-voice clip, then restores the previous clipboard contents so the user doesn't lose what they had copied.

Karabiner-Elements, Raycast, BetterTouchTool, or any other macOS hotkey tool also work — Hammerspoon is just the lightest dependency. Documented in the README.
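
The override behaviour described above can be sketched as a small gate in front of the speech queue. This is a minimal illustration only: the `Router` class, its fields, and `should_queue` are hypothetical names, not the daemon's actual code, and the tty strings are placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Router:
    """Hypothetical routing state; the real daemon's structure may differ."""
    muted_ttys: Set[str] = field(default_factory=set)  # per-tty mute set
    active_tty: Optional[str] = None                   # tty holding the claim, if any

    def should_queue(self, tty: str, override: bool = False) -> bool:
        # A user-initiated speak (override) skips both gates.
        if override:
            return True
        # Gate 1: drop requests from muted terminals.
        if tty in self.muted_ttys:
            return False
        # Gate 2: under an active claim, only the claiming terminal may speak.
        if self.active_tty is not None and tty != self.active_tty:
            return False
        return True
```

Checking `override` first keeps the two gates unaware of the clip path, which matches the stated intent that a user-initiated speak always runs.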

Tests

2 new unittest cases in tests/test_routing.py:

test_override_bypasses_mute — speak with override + muted tty → still queued
test_override_bypasses_claim — speak with override + wrong tty under active claim → still queued

$ python -m unittest discover tests -v
Ran 17 tests in 0.931s
OK

Reviewer notes

Built on feat/per-terminal-routing (#4). GitHub will show the cumulative diff against main — view the third commit alone to see only this PR's changes. Once #4 merges, I'll rebase this onto the new main and the diff collapses to just the clip work.

If #4 turns out to be too much scope, the override flag and the clip command can be cleanly transplanted onto #3 (daemon-only) instead. Let me know which structure works best for you.

🤖 Generated with Claude Code

Changed files

README.md (modified, +86/-7)
claude_voice.py (modified, +533/-49)
integrations/hammerspoon-snippet.lua (added, +47/-0)
speak.py (modified, +533/-49)
tests/test_daemon_protocol.py (added, +163/-0)
tests/test_routing.py (added, +211/-0)

Issue body

Motivation

The current hook surface fires only at boundary events: UserPromptSubmit, PreToolUse, PostToolUse, Stop, etc. There is no way to react to assistant text while it is being generated.

For an entire class of downstream tooling — local TTS, transcript mirroring, on-the-fly translation, accessibility narrators, telemetry — that boundary-only model means the user has to wait for the full response to finish before any side-effect can begin. On a 500-word response that is a 15–30s wait before any narration starts.

I just shipped a local Kokoro-TTS daemon (Null-Phnix/claude-voice#3) that drops warm time-to-first-audio from ~6s to ~0.6s. The bottleneck is no longer the model; it is that the Stop hook only fires after the assistant turn completes.

I verified by polling the JSONL transcript file every 100ms during an assistant response: the file grew exactly once, with the entire 7.2KB assistant message written in a single flush at end-of-turn. So tailing the transcript can't substitute either.
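
A check of this kind could look roughly like the sketch below. It is not the script used for the measurement above; `record_growth` and its parameters are hypothetical, and the `sleep` argument is injectable only so the loop can be exercised without waiting.

```python
import os
import time
from typing import Callable, List, Tuple


def record_growth(path: str, polls: int, interval: float = 0.1,
                  sleep: Callable[[float], None] = time.sleep) -> List[Tuple[int, int]]:
    """Poll a file's size and return (poll_index, new_size) for each change seen."""
    events: List[Tuple[int, int]] = []
    last = os.path.getsize(path) if os.path.exists(path) else 0
    for i in range(polls):
        sleep(interval)
        size = os.path.getsize(path) if os.path.exists(path) else 0
        if size != last:
            # The file changed size between two consecutive polls.
            events.append((i, size))
            last = size
    return events
```

A single entry in the returned list during an assistant turn would correspond to the one-shot flush described above; an incrementally written transcript would produce many entries.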

Proposed hook

A new hook event fired periodically during assistant generation with the in-progress text:

{
  "event": "AssistantMessageDelta",
  "session_id": "...",
  "message_id": "...",
  "delta": {
    "text": "...the text added since the last delta..."
  },
  "cumulative_text": "...full assistant text so far..."  // optional but very useful
}

Configuration would be the same as other hooks in ~/.claude/settings.json:

"hooks": {
  "AssistantMessageDelta": [
    { "matcher": "", "hooks": [{ "type": "command", "command": "...", "async": true, "timeout": 5 }] }
  ]
}
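
On the consumer side, a command hook for this event might read the payload from stdin, as existing Claude Code command hooks do, and pull out the new text. The sketch below assumes the proposed payload shape shown above; the `event` and `delta.text` field names come from this proposal and are not an existing contract, and `extract_delta` and `main` are hypothetical names.

```python
import json
import sys
from typing import Optional, TextIO


def extract_delta(raw: str) -> Optional[str]:
    """Return the new text from one proposed AssistantMessageDelta payload, else None."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Ignore anything that is not the proposed event.
    if not isinstance(payload, dict) or payload.get("event") != "AssistantMessageDelta":
        return None
    delta = payload.get("delta")
    if not isinstance(delta, dict):
        return None
    text = delta.get("text")
    return text if isinstance(text, str) else None


def main(stdin: TextIO = sys.stdin, stdout: TextIO = sys.stdout) -> int:
    """Read one payload from stdin and write its delta text to stdout."""
    text = extract_delta(stdin.read())
    if text:
        stdout.write(text)
    return 0
```

A script used as the hook's `command` would end by calling `main()`; a real consumer would forward the text to a TTS queue or socket rather than stdout.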

Implementation notes / requests

A few things would make this hook actually usable for streaming consumers:

async: true should be the only sensible mode. A sync hook on every delta would tank generation throughput. Document it clearly.
Debounce/batch. Firing per-token is overkill. Firing every ~100ms or on every sentence-terminator (.!? followed by whitespace) would cover almost every consumer's needs without flooding hook processes.
Either delta or cumulative_text, not both required. Consumers can derive one from the other. Sending only delta is fine if you flag the order; sending cumulative_text is fine if you accept the redundancy.
Final delta marker. A flag like "is_final": true on the last delta of a turn (or a separate AssistantMessageEnd event) helps consumers flush their last sentence cleanly.
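
To make the batching and final-marker requests concrete, here is a minimal sketch of how a consumer might buffer deltas into sentences and derive a delta when only `cumulative_text` is sent. `SentenceBuffer` and `delta_from_cumulative` are hypothetical helpers, `is_final` is the flag proposed above rather than an existing field, and the append-only assumption is mine.

```python
import re
from typing import List

# Sentence terminator (. ! ?) followed by whitespace, as suggested above.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


class SentenceBuffer:
    """Accumulates deltas and releases only complete sentences."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, delta: str, is_final: bool = False) -> List[str]:
        self._pending += delta
        parts = _BOUNDARY.split(self._pending)
        # The last part has no terminator-plus-whitespace after it yet, so hold it.
        self._pending = parts.pop()
        if is_final:
            # End of turn: flush whatever is left, even without a terminator.
            tail = self._pending.strip()
            if tail:
                parts.append(tail)
            self._pending = ""
        return [p for p in parts if p]


def delta_from_cumulative(previous: str, cumulative: str) -> str:
    """Derive the new text when the producer sends only cumulative_text."""
    if not cumulative.startswith(previous):
        # Assumption: text is append-only; if not, treat everything as new.
        return cumulative
    return cumulative[len(previous):]
```

This naive boundary rule also splits after abbreviations such as "e.g.", so a TTS consumer would likely want a smarter segmenter. Without an `is_final` signal the last sentence stays in the buffer until the next whitespace arrives, which is why the final marker matters.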

Use cases this unlocks

Real-time TTS — start speaking sentence 1 while Claude is still generating sentence 2. The reason I filed this.
Accessibility — screen readers that narrate as the assistant types, the way visually-impaired users already use screen readers with chat apps.
Live transcript mirror — push the response to a second device / dashboard as it's being typed, instead of post-hoc.
Per-sentence translation — pipe deltas to a translator and emit in the user's language as fast as the assistant emits in English.
Content moderation / safety filters — abort or rewrite a response mid-stream when an external rule fires, instead of after the user has already seen it.
Token-level telemetry for tools that need finer-grained data than Stop provides.

Workarounds considered (and why they don't suffice)

Tail the JSONL transcript — file is flushed once at end-of-turn (verified empirically). No streaming visibility.
Wrap Claude Code's stdout via PTY — fragile, breaks every time the TUI redraws or cursor-moves, and forces users to launch via a wrapper.
PostToolUse — fires only on tool use, not on plain-text assistant turns.
Polling the API directly via a custom client — duplicates Claude Code's session, auth, MCP plumbing.

A first-class hook avoids all of those failure modes.

Happy to help

If a delta hook is on the roadmap, I'd be happy to:

Test against any preview build and give feedback before stable.
Update Null-Phnix/claude-voice as a reference consumer the moment it ships.
Write user-facing docs once the contract is settled.

Thanks for building Claude Code.
