claude-code - 💡(How to fix) Fix claude.ai code-execution sandbox: HTTP 503 'DNS cache overflow' on egress to Vercel-hosted host [2 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#53149Fetched 2026-04-26 05:23:09
View on GitHub
Comments
2
Participants
2
Timeline
3
Reactions
1
Timeline (top)
commented ×2labeled ×1

Outbound HTTPS requests from the claude.ai code-execution sandbox to a Vercel-hosted host (shelf.grafals.net, CNAME cname.vercel-dns.com.76.76.21.142, 66.33.60.66) return:

HTTP/2 503
content-length: 18
content-type: text/plain

DNS cache overflow

The same request from any other source (a personal Mac on residential ISP, a Hetzner VPS) returns the expected HTTP/2 307 redirect to /login with server: Vercel. So the failure is specific to the sandbox's egress path, not the destination.

Error Message

  • Not Vercel emitting the error. The 18-byte plaintext body and the literal phrase "DNS cache overflow" don't match Vercel's branded error responses. Vercel uses HTML or JSON error pages with x-vercel-id / x-vercel-cache headers; the 503 here has none of those (only content-length, content-type, date). The error string and shape look like an internal sandbox-egress proxy resolver error — the egress proxy's DNS cache filling up and emitting a custom 503 instead of resolving the destination. Pointing this out because:
  • It's a custom error string (not standard HTTP, not Vercel, not Cloudflare, not standard nginx/envoy).

Root Cause

It breaks any MCP-server integration that mints signed URLs and expects the model to curl raw bytes to them from the sandbox — a common pattern for moving large payloads (audio, PDFs, screenshots) without round-tripping bytes through model context. MCP traffic itself is unaffected (different relay path), so the failure mode is asymmetric and surprising.

Code Example

HTTP/2 503
content-length: 18
content-type: text/plain

DNS cache overflow

---

curl -sS -i "https://shelf.grafals.net/"
RAW_BUFFERClick to expand / collapse

Note: this is a bug in claude.ai's code-execution (bash) sandbox, not the Claude Code CLI. Filing here because there's no public claude.ai issue tracker — please redirect if there's a better channel.

Summary

Outbound HTTPS requests from the claude.ai code-execution sandbox to a Vercel-hosted host (shelf.grafals.net, CNAME cname.vercel-dns.com.76.76.21.142, 66.33.60.66) return:

HTTP/2 503
content-length: 18
content-type: text/plain

DNS cache overflow

The same request from any other source (a personal Mac on residential ISP, a Hetzner VPS) returns the expected HTTP/2 307 redirect to /login with server: Vercel. So the failure is specific to the sandbox's egress path, not the destination.

Repro

From a claude.ai code-execution bash session:

curl -sS -i "https://shelf.grafals.net/"

Expected: HTTP/2 307 redirect (Vercel responding normally). Actual: HTTP/2 503 with body DNS cache overflow.

Reproduces consistently over a span of minutes, on retry, and across multiple *.grafals.net hosts that resolve to Vercel anycast.

What I ruled out

  • Not the destination. Two unrelated source IPs (Mac, VPS) get clean 307s from shelf.grafals.net at the same time the sandbox is getting 503.
  • Not our reverse proxy. shelf.grafals.net is on Vercel anycast (CNAME cname.vercel-dns.com.); my own VPS reverse proxies (Caddy/nginx) aren't in the traffic path and don't contain the string "DNS cache overflow" anywhere in their configs.
  • Not Vercel emitting the error. The 18-byte plaintext body and the literal phrase "DNS cache overflow" don't match Vercel's branded error responses. Vercel uses HTML or JSON error pages with x-vercel-id / x-vercel-cache headers; the 503 here has none of those (only content-length, content-type, date).
  • Not Anthropic egress allowlist. No x-deny-reason header on the response. The host is reachable enough to complete TLS and exchange HTTP/2 frames.

Hypothesis

The error string and shape look like an internal sandbox-egress proxy resolver error — the egress proxy's DNS cache filling up and emitting a custom 503 instead of resolving the destination. Pointing this out because:

  • It's a custom error string (not standard HTTP, not Vercel, not Cloudflare, not standard nginx/envoy).
  • It surfaces only from sandbox-origin traffic.
  • TLS completes (so the traffic is reaching something on Anthropic's path, not failing at the network layer).

Why this matters

It breaks any MCP-server integration that mints signed URLs and expects the model to curl raw bytes to them from the sandbox — a common pattern for moving large payloads (audio, PDFs, screenshots) without round-tripping bytes through model context. MCP traffic itself is unaffected (different relay path), so the failure mode is asymmetric and surprising.

Suggested next steps for triage

  • Grep the sandbox egress proxy config / source for the literal string "DNS cache overflow".
  • Check resolver cache size / eviction behavior under load — the phrasing suggests a bounded LRU that returns this verbatim on overflow.
  • If reproducible: confirm whether it's bound to specific destination ASNs (Vercel anycast AS13335/AS396982) or general.

extent analysis

TL;DR

The issue can be resolved by investigating and adjusting the DNS cache configuration of the claude.ai code-execution sandbox's egress proxy.

Guidance

  • Investigate the egress proxy config/source for the literal string "DNS cache overflow" to understand the custom error handling.
  • Check the resolver cache size and eviction behavior under load to determine if it's causing the overflow.
  • Verify if the issue is specific to Vercel anycast ASNs (AS13335/AS396982) or a more general problem.
  • Consider increasing the DNS cache size or adjusting the eviction policy to mitigate the overflow.

Example

No code snippet is provided as the issue is related to the configuration of the egress proxy, not a code-specific problem.

Notes

The issue seems to be specific to the claude.ai code-execution sandbox's egress path, and the error string suggests a custom error handling mechanism. The problem may be related to the DNS cache size or eviction behavior under load.

Recommendation

Apply a workaround by adjusting the DNS cache configuration of the egress proxy to prevent the overflow, as the root cause seems to be related to the custom error handling mechanism.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING