codex - 💡(How to fix) Fix Responses WebSocket connect failures wait through all stream retries before HTTP fallback [2 comments, 3 participants]

GitHub stats
openai/codex#19821 · Fetched 2026-04-28 06:36:32
Comments: 2 · Participants: 3 · Timeline: 6 · Reactions: 0
Timeline (top): labeled ×3, commented ×2, subscribed ×1


Summary

Many users behind proxies, especially in mainland China, see Codex print Reconnecting... 1/5 through 5/5 at the start of a turn before it finally begins responding. A practical workaround reported in #14297 is to define a custom provider with supports_websockets = false, which makes Codex use HTTP/SSE immediately.

I investigated the code path and prepared a small patch in my fork:

Root Cause

The default OpenAI provider supports Responses WebSocket transport. When the local/proxy environment cannot carry WebSocket traffic correctly, WebSocket connect fails with timeout/network errors. Today those failures are treated like retryable stream failures, so the turn loop consumes the full stream_max_retries budget before activating HTTP fallback.

This matches user logs from #14297: every attempt is transport="responses_websocket"; after Reconnecting... 5/5, Codex logs falling back to HTTP, then the HTTP Responses request completes quickly.
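The behavior described above can be modeled with a small sketch. This is a hypothetical simulation, not Codex's actual turn loop: the `Event` enum and `simulate_current_behavior` function are names invented here for illustration, and the only details taken from the issue are the `Reconnecting... n/5` messages followed by the HTTP fallback.

```rust
/// Hypothetical event log entry; not a type from the Codex codebase.
#[derive(Debug, PartialEq)]
enum Event {
    /// Mirrors the "Reconnecting... attempt/max" message users see.
    Reconnecting { attempt: u32, max: u32 },
    /// Mirrors the "falling back to HTTP" log line.
    FallbackToHttp,
}

/// Simulates a turn in which every WebSocket connect fails with a
/// retryable error, so the whole retry budget is spent before fallback.
fn simulate_current_behavior(stream_max_retries: u32) -> Vec<Event> {
    let mut events = Vec::new();
    for attempt in 1..=stream_max_retries {
        events.push(Event::Reconnecting {
            attempt,
            max: stream_max_retries,
        });
    }
    // Only after the budget is exhausted does the HTTP path activate.
    events.push(Event::FallbackToHttp);
    events
}
```

With a budget of 5, the simulated log contains five reconnect events before the single fallback event, which is the delay the issue reports.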

Proposed Change

Fall back to HTTP/SSE immediately when Responses WebSocket connection setup fails with:

  • TransportError::Timeout
  • TransportError::Network(_)

Keep the existing behavior for established stream failures and for explicit 426 Upgrade Required fallback.

The patch adds a small helper:

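// Returns true when a WebSocket connect error is a timeout or network
// failure, the two cases that should trigger immediate HTTP fallback.
fn should_fallback_to_http_after_websocket_connect_error(error: &ApiError) -> bool {
    matches!(
        error,
        ApiError::Transport(TransportError::Timeout | TransportError::Network(_))
    )
}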

and applies it in both WebSocket preconnect/prewarm and normal turn-time WebSocket connection setup.
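To make the helper's role at a connection-setup call site concrete, here is a minimal self-contained sketch. The `ApiError` and `TransportError` enums below are stand-ins defined only for this example (the real types live in Codex and may carry different payloads), and `ConnectOutcome`, `on_websocket_connect_error`, and the `Other` variants are hypothetical names not taken from the patch.

```rust
// Stand-in error types for illustration only; the real definitions
// in Codex may differ in variants and payloads.
#[derive(Debug)]
enum TransportError {
    Timeout,
    Network(String),
    Other(String), // hypothetical catch-all variant
}

#[derive(Debug)]
enum ApiError {
    Transport(TransportError),
    Other(String), // hypothetical catch-all variant
}

// The helper from the issue, unchanged in logic.
fn should_fallback_to_http_after_websocket_connect_error(error: &ApiError) -> bool {
    matches!(
        error,
        ApiError::Transport(TransportError::Timeout | TransportError::Network(_))
    )
}

/// Hypothetical result of handling a connect-time error.
#[derive(Debug, PartialEq)]
enum ConnectOutcome {
    FallbackToHttp,
    KeepExistingHandling,
}

/// Hypothetical call site: consult the helper once, at connection setup.
fn on_websocket_connect_error(error: &ApiError) -> ConnectOutcome {
    if should_fallback_to_http_after_websocket_connect_error(error) {
        ConnectOutcome::FallbackToHttp
    } else {
        // Anything else goes through whatever handling already exists.
        ConnectOutcome::KeepExistingHandling
    }
}
```

In this sketch, timeouts and network errors short-circuit to HTTP, while every other error keeps its existing handling, which matches the issue's stated intent of leaving other failure paths alone.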

Why This Helps

Users whose proxies do not support WebSocket/TUN routing should no longer wait through all 5 reconnect attempts before Codex switches to the HTTP path that already works for them. Users with working WebSocket transport should continue using WebSocket as before.

Test Coverage

The fork adds a regression test that simulates a WebSocket handshake timeout and asserts Codex performs only one WebSocket attempt before using HTTP/SSE successfully.

cargo test -p codex-core websocket_fallback_switches_to_http_on_connect_timeout -- --exact

I could not complete the test locally on my Windows machine because the environment is missing the MSVC linker link.exe, but formatting and git diff --check passed locally.
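The shape of such a regression test can be sketched without the Codex test harness. Everything below is hypothetical: `FakeWebSocket`, `ConnectError`, `Transport`, and `run_turn` are illustrative stand-ins, not the fork's actual test code. The sketch only shows the assertion pattern of one WebSocket attempt followed by HTTP.

```rust
/// Which transport ended up serving the turn (hypothetical type).
#[derive(Debug, PartialEq)]
enum Transport {
    WebSocket,
    Http,
}

/// Hypothetical connect-time error; only the timeout case is modeled.
#[derive(Debug)]
enum ConnectError {
    Timeout,
}

/// Fake WebSocket endpoint whose handshake always times out,
/// counting how many times a connection was attempted.
struct FakeWebSocket {
    attempts: u32,
}

impl FakeWebSocket {
    fn connect(&mut self) -> Result<(), ConnectError> {
        self.attempts += 1;
        Err(ConnectError::Timeout)
    }
}

/// Models the proposed behavior: a connect timeout switches to HTTP
/// immediately instead of retrying the WebSocket handshake.
fn run_turn(ws: &mut FakeWebSocket) -> Transport {
    match ws.connect() {
        Ok(()) => Transport::WebSocket,
        Err(ConnectError::Timeout) => Transport::Http,
    }
}
```

A test built on this would construct `FakeWebSocket { attempts: 0 }`, call `run_turn`, and assert both that the result is `Transport::Http` and that `attempts` equals 1.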

extent analysis

TL;DR

Apply the proposed patch so that Codex falls back to HTTP/SSE immediately when Responses WebSocket connection setup fails with a timeout or network error.

Guidance

  • Review the patch in the provided branch and commit to understand the changes made to the code.
  • Test the patch using the added regression test websocket_fallback_switches_to_http_on_connect_timeout to ensure it works as expected.
  • Consider defining a custom provider with supports_websockets = false as a temporary workaround for users experiencing issues.
  • Verify that the patch does not introduce any regressions for users with working WebSocket transport.

Example

fn should_fallback_to_http_after_websocket_connect_error(error: &ApiError) -> bool {
    matches!(
        error,
        ApiError::Transport(TransportError::Timeout | TransportError::Network(_))
    )
}

This function decides whether Codex should fall back to HTTP after a WebSocket connection error: it returns true only for transport timeouts and network errors.

Notes

The patch only addresses the issue of WebSocket connection setup failures and does not affect established stream failures or explicit 426 Upgrade Required fallback.

Recommendation

Apply the workaround by defining a custom provider with supports_websockets = false until the patch is fully tested and merged.
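As a rough illustration of the workaround, a custom provider entry in the Codex config.toml might look like the fragment below. Only `supports_websockets = false` comes from the issue. The provider id `openai-http` is a made-up name, and the remaining keys (`model_provider`, `name`, `base_url`, `env_key`, `wire_api`) are assumptions about the provider configuration format that should be checked against the Codex configuration documentation for your version.

```toml
# Hypothetical provider id; any unused name should work.
model_provider = "openai-http"

[model_providers.openai-http]
# Assumed keys; verify against the Codex config documentation.
name = "OpenAI (HTTP only)"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "responses"
# From the issue: makes Codex use HTTP/SSE immediately.
supports_websockets = false
```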
