litellm - 💡(How to fix) Fix [Bug]: AgentCore: plain string SSE events from Strands-based agents dropped silently [3 comments, 2 participants]

GitHub stats
BerriAI/litellm#25691 · Fetched 2026-04-16 06:37:10
Comments: 3 · Participants: 2 · Timeline: 8 · Reactions: 0
Timeline (top): commented ×3, labeled ×3, mentioned ×1, subscribed ×1


Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When invoking a Bedrock AgentCore runtime backed by the Strands Agents SDK, all streaming chunks are silently dropped. The response always returns content: "" with completion_tokens: 0, even though the agent is generating output.

LiteLLM Configuration File

model_list:
  - model_name: my-agent
    litellm_params:
      model: bedrock/agentcore/arn:aws:bedrock-agentcore:eu-west-1:<account_id>:runtime/<runtime_id>
      aws_region_name: "eu-west-1"
      aws_role_name: "arn:aws:iam::<account_id>:role/<role_name>"

LiteLLM Version

1.82.3 (also reproduced on main as of 2026-04-14)

Example Strands AgentCore agent code

from strands import Agent
from strands.models import BedrockModel
from bedrock_agentcore.runtime import BedrockAgentCoreApp

MODEL_ID = "global.anthropic.claude-sonnet-4-5-20250929-v1:0"
app = BedrockAgentCoreApp()

@app.entrypoint
async def invoke(payload, context):
    # Build a Strands agent backed by a Bedrock model.
    agent = Agent(
        model=BedrockModel(model_id=MODEL_ID),
        system_prompt="You are a friendly customer support agent",
    )

    # Stream the agent's output and yield only the plain-string text chunks.
    async for event in agent.stream_async(payload.get("prompt")):
        if "data" in event and isinstance(event["data"], str):
            yield event["data"]

if __name__ == "__main__":
    app.run()

LiteLLM Debug Logs

The request is sent and receives HTTP 200 without errors, but the stream iterates silently and yields zero content chunks.

transformation.py - AgentCore transform_request - optional_params keys: ['aws_region_name', 'aws_role_name']
transformation.py - PAYLOAD: {'prompt': 'hi'}
litellm_logging.py - RAW RESPONSE: first stream response received
router.py - litellm.acompletion(...) 200 OK
proxy_server.py - inside generator
proxy_server.py - streaming chunk model mismatch ...

Expected behavior

Streaming chunks with actual content should be returned.

Actual behavior

{"choices": [{"finish_reason": "stop", "index": 0, "delta": {}}]}

completion_tokens: 0, content: ""
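The failure mode named in the issue title can be illustrated with a minimal sketch. It assumes the runtime emits each yielded string as a JSON-encoded SSE line such as `data: "Hello"`, and it uses a hypothetical `{"text": ...}` dict shape for structured events. Neither `text_dict_only` nor `text_tolerant` is LiteLLM code; they only show how a handler that understands dict events alone would drop bare strings without raising an error.

```python
import json

def parse_sse_data(line):
    """Return the decoded JSON value of one 'data:' SSE line, or None."""
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if not payload:
        return None
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return payload  # not JSON: keep the raw text

def text_dict_only(value):
    """Hypothetical handler that only understands dict events."""
    if isinstance(value, dict):
        return value.get("text")  # "text" is an assumed key, for illustration
    return None  # bare strings fall through and are dropped silently

def text_tolerant(value):
    """Hypothetical handler that also accepts bare string events."""
    if isinstance(value, str):
        return value
    return text_dict_only(value)

# Assumed wire format: two plain-string events, then one dict event.
SAMPLE_LINES = ['data: "Hello"', 'data: " world"', 'data: {"text": "!"}']

decoded = [parse_sse_data(line) for line in SAMPLE_LINES]
dropped = [text_dict_only(value) for value in decoded]
kept = [text_tolerant(value) for value in decoded]
```

With the dict-only handler, the two string events produce no text at all, which would match the empty content reported above if the real handler behaves similarly.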

Steps to Reproduce

  1. Create an AWS Bedrock AgentCore agent with Strands and streaming, as described above.
  2. Add a config entry for the Bedrock AgentCore model, as in the example YAML above.
  3. curl -N http://localhost:4000/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer {YOUR_TOKEN_HERE}" -d '{"model": "{YOUR_BEDROCK_AGENTCORE_ARN_HERE}", "messages": [{"role": "user", "content": "Hello, what can you do?"}], "stream": false}'
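The same request can be built in Python, which avoids shell quoting mistakes. This is a minimal sketch using only the standard library: `build_chat_request` is a hypothetical helper, and using the config's `model_name` (`my-agent`) as the model value is an assumption, since the curl command uses an ARN placeholder instead.

```python
import json
import urllib.request

def build_chat_request(base_url, token, model, prompt, stream=True):
    """Build (but do not send) a POST to the proxy's chat completions route."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )

# Placeholder values; replace the token and model with your own.
req = build_chat_request(
    "http://localhost:4000", "YOUR_TOKEN_HERE", "my-agent", "Hello, what can you do?"
)
```

Passing `req` to `urllib.request.urlopen` sends it; iterating over the returned response object then yields the raw response lines as bytes.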

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

1.82.3

Twitter / LinkedIn details

No response

extent analysis

TL;DR

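As the issue title indicates, the agent yields plain strings, so each SSE event from the AgentCore runtime carries a bare string rather than a structured event object, and LiteLLM appears to drop such events silently. The root cause has not been confirmed here. A possible workaround is to change the agent so that it yields a structured event shape, but which shape LiteLLM's AgentCore handling accepts would need to be verified.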

Guidance

  • Confirm that stream_async in the agent code yields content chunks by logging each event's data before it is yielded.
  • Check the request payload rather than the LiteLLM configuration file: the example curl command sets "stream": false. Set it to true to exercise the streaming path end to end.
  • Investigate the "streaming chunk model mismatch" log message from proxy_server.py to determine whether it is related to the dropped chunks or incidental.
  • Review the agent code to confirm that it reads the prompt from the payload and produces output that can be streamed.

Example

async for event in agent.stream_async(payload.get("prompt")):
    if "data" in event and isinstance(event["data"], str):
        print(f"Yielding content chunk: {event['data']}")
        yield event["data"]
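To check the proxy side as well, a small client-side helper can count how many chunks arrive and how much text they carry. This is a minimal sketch that assumes the proxy streams OpenAI-style `data: {...}` lines terminated by `data: [DONE]`; `collect_content` is a hypothetical helper, not part of LiteLLM.

```python
import json

def collect_content(sse_lines):
    """Return (joined text, number of JSON chunks) from OpenAI-style SSE lines."""
    parts = []
    chunks = 0
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and non-data fields
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        if payload == "[DONE]":
            break  # assumed end-of-stream marker
        chunk = json.loads(payload)
        chunks += 1
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts), chunks

# The only chunk shown under "Actual behavior": a stop chunk with an empty delta.
observed = [
    'data: {"choices": [{"finish_reason": "stop", "index": 0, "delta": {}}]}',
    "data: [DONE]",
]
text, chunk_count = collect_content(observed)
```

A non-zero chunk count with empty text matches the reported symptom: the stream completes, but no chunk carries content.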

Notes

The issue may be specific to the combination of the Bedrock AgentCore runtime and the Strands Agents SDK; the root cause is not established here. Separately, the reproduction request sets "stream": false, so retest with "stream": true to rule that setting out.

Recommendation

Apply workaround: first confirm that the agent yields text chunks, then retest with "stream": true. If the content is still empty, try adjusting the event shape the agent yields, since the likely cause is a mismatch between what the agent emits and what LiteLLM expects.
