codex: Linux remote compact can time out at ~31s despite 20m request timeout due to reqwest TCP_USER_TIMEOUT



Summary

On Linux, legacy remote compaction can fail at ~31s per attempt even though Codex configures /responses/compact with a 20 minute request timeout.

I reproduced this on Codex CLI 0.133.0 / source tag rust-v0.133.0 and verified that changing only the reqwest TCP_USER_TIMEOUT setting made the same /compact request succeed.

What was measured

Codex selected the legacy compact path:

remote_compaction_v2=false
provider_name=OpenAI
provider_base_url=https://chatgpt.com/backend-api/codex
endpoint="/responses/compact"
stream_idle_timeout_ms=300000
compact_request_timeout_ms=1200000

Despite the 20 minute compact request timeout, each HTTP attempt timed out at ~31s:

attempt=0 duration_ms=30885 error.message="timeout" endpoint="/responses/compact"
attempt=1 duration_ms=31043 error.message="timeout" endpoint="/responses/compact"
attempt=2 duration_ms=31334 error.message="timeout" endpoint="/responses/compact"
attempt=3 duration_ms=30869 error.message="timeout" endpoint="/responses/compact"
attempt=4 duration_ms=31043 error.message="timeout" endpoint="/responses/compact"

codex_api::endpoint::session: close time.idle=158s
codex_core::compact_remote: remote compaction failed ... compact_error=request timed out

So the visible ~2m37s failure was five attempts of ~31s each (about 155s of request time in total), not Codex's 20 minute compact request timeout.
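The arithmetic can be checked directly from the five logged durations. The sketch below is only an illustration: the function name is hypothetical, and the logs do not explain the gap between the summed request time and the 158s idle time (retry backoff is one plausible reason, but that is an assumption).

```rust
/// Sum per-attempt durations in milliseconds, as logged in `duration_ms`.
/// Hypothetical helper, not part of Codex.
fn total_attempt_ms(durations_ms: &[u64]) -> u64 {
    durations_ms.iter().sum()
}

fn main() {
    // The five `duration_ms` values from the failing /responses/compact attempts.
    let attempts = [30_885, 31_043, 31_334, 30_869, 31_043];
    let total = total_attempt_ms(&attempts);

    // `time.idle=158s` from the session close log line.
    let idle_ms: u64 = 158_000;

    println!("summed request time: {} ms", total);
    println!("unaccounted gap:     {} ms", idle_ms - total);
}
```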

Patch tested

I rebuilt Codex with only this reqwest client change:

use std::time::Duration;

// Raise TCP_USER_TIMEOUT from reqwest's 30s default to 10 minutes.
let tcp_user_timeout = Duration::from_secs(600);
let mut builder = reqwest::Client::builder()
    .default_headers(default_headers())
    .tcp_user_timeout(tcp_user_timeout);

After that change, the same thread and same /compact operation succeeded on the first attempt:

debug reqwest tcp_user_timeout override tcp_user_timeout_ms=600000
debug compact request timeout ... compact_request_timeout_ms=1200000 endpoint="/responses/compact"

event.name="codex.api_request"
duration_ms=42722
http.response.status_code=200
attempt=0
endpoint="/responses/compact"

codex_api::endpoint::session: close time.idle=42.7s

The backend returned successfully after ~42.7s. The unpatched Linux client never waited that long because the lower transport-level timeout fired first.

Likely cause

reqwest 0.12 sets Linux TCP_USER_TIMEOUT to 30 seconds by default. For long-running unary requests such as legacy POST /responses/compact, that can become the effective lower timeout, even when Codex sets a much longer per-request timeout.
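One way to picture the interaction is a deliberately simplified model in which each configured limit acts as an independent deadline and the first one to expire ends the attempt. This is not how the kernel or reqwest implement it, and `effective_timeout` is a hypothetical name; the sketch only shows why a 30s transport limit would mask a 20 minute request timeout.

```rust
use std::time::Duration;

/// Simplified model: the shortest configured limit wins.
/// `None` means no TCP_USER_TIMEOUT is set on the socket.
fn effective_timeout(request_timeout: Duration, tcp_user_timeout: Option<Duration>) -> Duration {
    match tcp_user_timeout {
        Some(t) => request_timeout.min(t),
        None => request_timeout,
    }
}

fn main() {
    // compact_request_timeout_ms=1200000 from the logs.
    let request = Duration::from_millis(1_200_000);

    // Unpatched: the 30s default described in this issue.
    let unpatched = effective_timeout(request, Some(Duration::from_secs(30)));
    // Patched: the 600s override that was tested.
    let patched = effective_timeout(request, Some(Duration::from_secs(600)));

    println!("unpatched limit: {:?}", unpatched);
    println!("patched limit:   {:?}", patched);
}
```

Under this model the unpatched limit (30s) is below the ~42.7s the backend needed, while the patched limit (600s) is above it, which matches the observed behavior.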

This specific reproduction is Linux-only. In reqwest 0.12.28, both the public tcp_user_timeout(...) builder method and the default tcp_user_timeout: Some(Duration::from_secs(30)) are compiled only for:

#[cfg(any(target_os = "android", target_os = "fuchsia", target_os = "linux"))]

So macOS developers are unlikely to reproduce this exact ~31s per attempt failure mode. They may still see other remote compaction failures, but not this specific Linux reqwest TCP_USER_TIMEOUT=30s cause.
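To check whether a given build target falls inside that cfg gate, the same predicate can be evaluated with `cfg!`. This is a standalone sketch with hypothetical function names; the 30s value is the default this issue attributes to reqwest 0.12.28, not something the snippet reads from reqwest.

```rust
use std::time::Duration;

/// Mirrors the cfg gate quoted above: true only on targets where reqwest
/// compiles in its TCP_USER_TIMEOUT support.
fn target_has_tcp_user_timeout() -> bool {
    cfg!(any(target_os = "android", target_os = "fuchsia", target_os = "linux"))
}

/// The default assumed from this issue's reading of reqwest 0.12.28.
fn assumed_default_tcp_user_timeout() -> Option<Duration> {
    if target_has_tcp_user_timeout() {
        Some(Duration::from_secs(30))
    } else {
        None
    }
}

fn main() {
    match assumed_default_tcp_user_timeout() {
        Some(t) => println!("this target is affected: assumed default {:?}", t),
        None => println!("this target is outside the cfg gate"),
    }
}
```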

Suggested fix

Ensure the reqwest TCP_USER_TIMEOUT used by Codex HTTP traffic is not shorter than the request timeout for long-running unary endpoints such as /responses/compact.

Possible approaches:

  1. Disable TCP_USER_TIMEOUT for Codex's normal HTTP client.
  2. Set it to a larger value.
  3. Override it for legacy /responses/compact.
  4. Move affected OpenAI traffic to a streamed compaction path where this unary idle request shape no longer exists.
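The invariant behind approaches 1 and 2 can be sketched as a small check. Both function names and the 30s margin are hypothetical choices for illustration, not values taken from Codex or reqwest.

```rust
use std::time::Duration;

/// Returns true when the transport-level limit cannot undercut the request
/// timeout. `None` stands for "TCP_USER_TIMEOUT disabled" (approach 1).
fn covers_request_timeout(tcp_user_timeout: Option<Duration>, request_timeout: Duration) -> bool {
    match tcp_user_timeout {
        None => true,
        Some(t) => t >= request_timeout,
    }
}

/// Approach 2: derive the value from the longest request timeout plus a margin.
fn tcp_user_timeout_for(longest_request_timeout: Duration, margin: Duration) -> Duration {
    longest_request_timeout.saturating_add(margin)
}

fn main() {
    // compact_request_timeout_ms=1200000 from the logs.
    let compact = Duration::from_millis(1_200_000);

    // Arbitrary illustrative margin.
    let chosen = tcp_user_timeout_for(compact, Duration::from_secs(30));

    println!("derived value: {:?}", chosen);
    println!("derived value covers compact: {}", covers_request_timeout(Some(chosen), compact));
    println!("30s default covers compact:   {}", covers_request_timeout(Some(Duration::from_secs(30)), compact));
}
```

Note that the 600s override used in the tested patch would not pass this check against the 1200s compact timeout, even though it was long enough for the ~42.7s response in the reproduction.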

Related:

  • #11282 (/compact SSE keepalive approach for long-running compactions)
  • #18450
  • #22798
  • #23451
  • #23697 (remote_compaction_v2 for OpenAI, likely avoids the legacy unary path)
