litellm - 💡(How to fix) Fix A2A: `/a2a/{id}/message/send` returns -32603 "Stream consumed" on raw JSON-RPC POST [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
BerriAI/litellm#27836Fetched 2026-05-14 03:30:20
View on GitHub
Comments
0
Participants
1
Timeline
0
Reactions
0
Author
Participants

LiteLLM proxy's A2A endpoints (POST /a2a/{agent_id} and POST /a2a/{agent_id}/message/send) return a JSON-RPC 2.0 error response {"jsonrpc":"2.0","id":null,"error":{"code":-32603,"message":"Internal error: Stream consumed"}} when invoked via raw HTTP POST (curl).

Affects: LiteLLM main branch as of 2026-05-13, observed on Docker image ghcr.io/berriai/litellm-non_root:main-stable (running locally as v1.83.10-equivalent with a2a-sdk==0.3.24 bundled).

Error Message

{"jsonrpc":"2.0","id":null,"error":{"code":-32603,"message":"Internal error: Stream consumed"}}

Root Cause

The Google a2a-sdk Python client works correctly because it discovers the AgentCard url and POSTs directly to the AgentCard endpoint, bypassing the LiteLLM proxy /a2a/* paths entirely. This works on Docker DNS (http://<agent-card-host>:<port>/) or any reachable URL.

Fix Action

Workaround

The Google a2a-sdk Python client works correctly because it discovers the AgentCard url and POSTs directly to the AgentCard endpoint, bypassing the LiteLLM proxy /a2a/* paths entirely. This works on Docker DNS (http://<agent-card-host>:<port>/) or any reachable URL.

So: Python/SDK clients are unaffected. Raw curl / manual JSON-RPC clients hit the bug.

Code Example

TRACE=$(uuidgen)
curl -fsSk -X POST \
  -H "Authorization: Bearer ${VK}" \
  -H "x-litellm-trace-id: ${TRACE}" \
  -H "Content-Type: application/json" \
  "https://${LITELLM_HOST}/a2a/${AGENT_ID}/message/send" \
  -d "{
    \"jsonrpc\": \"2.0\",
    \"id\": \"$(uuidgen)\",
    \"method\": \"message/send\",
    \"params\": {
      \"message\": {
        \"kind\": \"message\",
        \"messageId\": \"$(uuidgen)\",
        \"role\": \"user\",
        \"parts\": [{\"kind\": \"text\", \"text\": \"ok\"}]
      },
      \"metadata\": {\"kind\": \"warm_ping\"}
    }
  }"

---

{"jsonrpc":"2.0","id":null,"error":{"code":-32603,"message":"Internal error: Stream consumed"}}
RAW_BUFFERClick to expand / collapse

Bug: /a2a/{id}/message/send returns -32603 "Stream consumed" on JSON-RPC POST

Summary

LiteLLM proxy's A2A endpoints (POST /a2a/{agent_id} and POST /a2a/{agent_id}/message/send) return a JSON-RPC 2.0 error response {"jsonrpc":"2.0","id":null,"error":{"code":-32603,"message":"Internal error: Stream consumed"}} when invoked via raw HTTP POST (curl).

Affects: LiteLLM main branch as of 2026-05-13, observed on Docker image ghcr.io/berriai/litellm-non_root:main-stable (running locally as v1.83.10-equivalent with a2a-sdk==0.3.24 bundled).

Repro

  1. Register an AgentCard via POST /v1/agents with admin master-key — succeeds, AgentCard visible at GET /v1/agents, dashboard playground works.
  2. From outside the proxy, POST a JSON-RPC message/send request:
TRACE=$(uuidgen)
curl -fsSk -X POST \
  -H "Authorization: Bearer ${VK}" \
  -H "x-litellm-trace-id: ${TRACE}" \
  -H "Content-Type: application/json" \
  "https://${LITELLM_HOST}/a2a/${AGENT_ID}/message/send" \
  -d "{
    \"jsonrpc\": \"2.0\",
    \"id\": \"$(uuidgen)\",
    \"method\": \"message/send\",
    \"params\": {
      \"message\": {
        \"kind\": \"message\",
        \"messageId\": \"$(uuidgen)\",
        \"role\": \"user\",
        \"parts\": [{\"kind\": \"text\", \"text\": \"ok\"}]
      },
      \"metadata\": {\"kind\": \"warm_ping\"}
    }
  }"

Result: HTTP 200 with JSON-RPC error body:

{"jsonrpc":"2.0","id":null,"error":{"code":-32603,"message":"Internal error: Stream consumed"}}

Same error fires for POST /a2a/{agent_id} (no /message/send subpath).

Expected

The downstream AgentCard endpoint (set via the AgentCard url field when registering) should receive the JSON-RPC request body and respond with a valid Task envelope.

Workaround

The Google a2a-sdk Python client works correctly because it discovers the AgentCard url and POSTs directly to the AgentCard endpoint, bypassing the LiteLLM proxy /a2a/* paths entirely. This works on Docker DNS (http://<agent-card-host>:<port>/) or any reachable URL.

So: Python/SDK clients are unaffected. Raw curl / manual JSON-RPC clients hit the bug.

Root cause hypothesis (educated guess)

LiteLLM's proxy middleware reads the request body once (for auth / logging / VK validation), then attempts to forward the body to the upstream AgentCard. The Starlette Request.stream() can only be consumed once — second consumer gets the error.

Suggested fix: cache await request.body() early in the auth middleware + provide the cached bytes to the forward step instead of re-consuming the stream.

Context

  • Affects raw curl / manual testing only — Python a2a-sdk client works via direct AgentCard URL POST.
  • Reproduced with master-key AND with regular VK that has object_permission.agents allowlisting the target AgentCard.
  • Trace propagation header x-litellm-trace-id is set on the request; no x-correlation-id etc.
  • Server-side error log entry not yet captured — would help confirm the stream-double-read hypothesis.

Why this matters for users

Phase 19.1 of a homelab project shipped an A2A mesh on top of LiteLLM. The Python-side flows work great via the a2a-sdk client. The bug only blocks ad-hoc curl smoke tests + non-Python clients. Documenting + tracking it so curl-shaped probing of A2A endpoints doesn't mislead operators.

Happy to provide more reproducer detail / collect server-side logs / test a fix patch.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

litellm - 💡(How to fix) Fix A2A: `/a2a/{id}/message/send` returns -32603 "Stream consumed" on raw JSON-RPC POST [1 participants]