Codex Desktop: voice transcription failures can discard unrecoverable audio

What happened?

In Codex Desktop, voice input (speech-to-text, STT) can fail or produce an obviously incomplete transcription. When that happens, the original recorded audio appears to be unavailable: it is not attached to the conversation, not referenced in the session JSONL, and not discoverable in the local Codex cache or transcription history.

This makes voice input risky for longer or important recordings, because a transcription failure can mean the user loses both the transcription and the original audio.

Why this matters

For use cases like interviews, meetings, notes, or spoken bug reports, users may rely on Codex voice input as the capture mechanism. If STT fails and the raw audio is discarded, there is no recovery path.

Observed behavior

  • The session JSONL only recorded the user text event and did not contain an audio attachment or local audio path.
  • ~/.codex/transcription-history.jsonl contained only the transcription text, not the source audio file path.
  • Searching common local locations such as the workspace, ~/.codex, Codex application support/cache directories, /tmp, and /var/folders did not reveal a corresponding audio file.
  • In one case the resulting transcription was only a tiny fragment (for example, 嗯?, roughly "Hm?") even though the intended recording was much longer.
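
The checks above can be repeated with a short script. This is a minimal sketch, not an official tool: the function names are hypothetical, the list of audio extensions is a guess, and the scan makes no assumption about the session JSONL schema (which is not documented in this report). It flags any string value whose key mentions "audio" or whose value ends in a common audio extension, and it lists audio files modified recently under the directories you pass in.

```python
import json
import os
import time
from pathlib import Path

# Assumption: extensions a desktop recorder might plausibly use.
AUDIO_EXTENSIONS = (".wav", ".m4a", ".mp3", ".ogg", ".opus", ".webm", ".flac", ".caf", ".aac")


def iter_strings(node, key=""):
    """Yield (key, value) for every string value in a parsed JSON tree."""
    if isinstance(node, dict):
        for child_key, child in node.items():
            yield from iter_strings(child, child_key)
    elif isinstance(node, list):
        for child in node:
            yield from iter_strings(child, key)
    elif isinstance(node, str):
        yield key, node


def find_audio_references(jsonl_path):
    """Return (line_number, key, value) for values that look like audio references."""
    hits = []
    with open(jsonl_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of aborting the scan
            for key, value in iter_strings(record):
                if "audio" in key.lower() or value.lower().endswith(AUDIO_EXTENSIONS):
                    hits.append((line_number, key, value))
    return hits


def find_recent_audio_files(roots, max_age_hours=24.0):
    """Return audio files under the given roots modified within max_age_hours."""
    cutoff = time.time() - max_age_hours * 3600
    found = []
    for root in roots:
        root = Path(root).expanduser()
        if not root.is_dir():
            continue
        # os.walk silently skips directories it is not allowed to read.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = Path(dirpath) / name
                if path.suffix.lower() not in AUDIO_EXTENSIONS:
                    continue
                try:
                    if path.stat().st_mtime >= cutoff:
                        found.append(path)
                except OSError:
                    continue  # file vanished or is unreadable
    return sorted(found)
```

For example, `find_audio_references(Path("~/.codex/transcription-history.jsonl").expanduser())` scans the history file named above, and `find_recent_audio_files(["~/.codex", "/tmp", "/var/folders"])` covers some of the locations listed. An empty result from both is consistent with the behavior reported here, though it does not prove the audio was never written to disk.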

Expected behavior

Codex should provide a recovery path when voice transcription fails or produces incomplete output. Possible fixes:

  1. Keep the original audio locally until the transcription succeeds and the user sends/accepts it.
  2. If transcription fails, expose a retry option using the original audio.
  3. Attach or link the temporary audio file in the session record until the turn is complete.
  4. Make the retention behavior explicit in the UI, especially for longer recordings.
  5. Consider saving failed voice captures to a recoverable local folder, with a clear privacy/retention policy.
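
Fixes 1, 2, and 5 can be sketched together as a small retention store. This is an illustration of the requested behavior, not how Codex Desktop is implemented: `PendingCaptureStore`, its methods, the seven-day default, and the injected `transcribe_fn` callable are all hypothetical. The idea is that the recording is written to disk before any transcription attempt, that a failed or empty attempt never deletes it, and that deletion happens only on explicit acceptance or when a stated retention window expires.

```python
import time
import uuid
from pathlib import Path


class PendingCaptureStore:
    """Keeps raw audio on disk until the user accepts a transcription."""

    def __init__(self, directory, retention_seconds=7 * 24 * 3600):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.retention_seconds = retention_seconds  # hypothetical default policy

    def save(self, audio_bytes, suffix=".wav"):
        """Persist the recording before any transcription attempt."""
        path = self.directory / f"{uuid.uuid4().hex}{suffix}"
        path.write_bytes(audio_bytes)
        return path

    def transcribe(self, path, transcribe_fn):
        """Run one attempt. Returns None on failure; the audio is kept either way."""
        try:
            text = transcribe_fn(path.read_bytes())
        except Exception:
            return None  # the caller can retry later with the same file
        if not text or not text.strip():
            return None
        return text

    def accept(self, path):
        """Delete the audio once the user has sent or accepted the text."""
        path.unlink(missing_ok=True)

    def purge_expired(self, now=None):
        """Remove captures older than the retention window; return what was removed."""
        now = time.time() if now is None else now
        removed = []
        for path in self.directory.iterdir():
            if path.is_file() and now - path.stat().st_mtime > self.retention_seconds:
                path.unlink()
                removed.append(path)
        return removed
```

A retry is then just a second call to `transcribe` with the same path. Because an obviously incomplete result (such as the short fragment described under observed behavior) still leaves the file in place until `accept` is called, the user can reject the text and retry instead of losing the recording.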

Environment

  • App: Codex Desktop
  • Platform: macOS
  • CLI version observed in session metadata: 0.133.0-alpha.1
  • Source: VS Code / Codex Desktop session

Additional context

This issue is about reliability and data recovery, not transcription accuracy alone. Even if transcription can fail occasionally, the user should not lose the original audio without a way to retry or recover it.
