litellm - ✅ (Solved) Fix [Bug]: OCI sync streaming missing split_chunks causes JSONDecodeError [3 pull requests, 1 participant]

GitHub stats
BerriAI/litellm#24819 (fetched 2026-04-08 01:53:48)
Comments: 0
Participants: 1
Timeline: 7
Reactions: 0
Timeline (top): cross-referenced ×3, referenced ×2, labeled ×1, renamed ×1

Error Message

litellm.APIConnectionError: Extra data: line 3 column 1 (char 123)

File "litellm/litellm_core_utils/streaming_handler.py", line 1611, in __next__
    response: Optional[ModelResponseStream] = self.chunk_creator(chunk=chunk)

File "litellm/llms/oci/chat/transformation.py", line 1082, in chunk_creator
    dict_chunk = json.loads(chunk[5:])
                 ^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Extra data: line 3 column 1 (char 123)
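
The failure can be reproduced in isolation with the standard library alone. The sketch below is illustrative rather than taken from litellm: the payloads and the parse_event name are assumptions, and only the chunk[5:] slice followed by json.loads mirrors the traceback.

```python
import json

# Two SSE events delivered in one HTTP chunk (illustrative payloads).
batched = 'data: {"index": 0}\n\ndata: {"index": 1}'


def parse_event(chunk: str) -> dict:
    """Mimic the step in the traceback: drop the 'data:' prefix, then decode."""
    return json.loads(chunk[5:])


try:
    parse_event(batched)
    error = None
except json.JSONDecodeError as exc:
    # "Extra data": a second event follows the first complete JSON object.
    error = exc

# Splitting on the blank line between events makes each piece decodable.
events = [parse_event(part) for part in batched.split("\n\n") if part]
print(error)
print(events)
```

The decoder stops after the first complete JSON object and reports whatever follows as extra data, which is why the error points at line 3 (the line after the blank separator).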

Root Cause

In litellm/llms/oci/chat/transformation.py:

Async path (line ~846) correctly splits multi-event chunks:

async def split_chunks(completion_stream: AsyncIterator[str]):
    async for item in completion_stream:
        for chunk in item.split("\n\n"):
            if not chunk:
                continue
            yield chunk.strip()

streaming_response = OCIStreamWrapper(
    completion_stream=split_chunks(completion_stream), ...
)

Sync path (line ~799) is missing this:

completion_stream = response.iter_text()

streaming_response = OCIStreamWrapper(
    completion_stream=completion_stream, ...  # no split_chunks!
)

PR fix notes

PR #24830: [Fix] OCI sync streaming missing split_chunks causes JSONDecodeError

Description (problem / solution / changelog)

Relevant issues

Fixes #24819

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run Link:

  • CI run for the last commit Link:

  • Merge / cherry-pick CI run Links:

Type

🐛 Bug Fix

Changes

The sync streaming path in OCIChatConfig.get_sync_custom_stream_wrapper is missing the split_chunks logic that the async path already has. When the OCI endpoint batches multiple SSE events into a single HTTP chunk, chunk_creator receives a multi-line string and json.loads() fails with:

json.decoder.JSONDecodeError: Extra data: line 3 column 1 (char 123)

Fix

  • Extract a shared _split_sse_text() helper that splits batched SSE text on "\n\n" boundaries
  • Add a sync split_chunks generator to get_sync_custom_stream_wrapper using the helper
  • Refactor the existing async split_chunks in get_async_custom_stream_wrapper to use the same helper
  • Add 7 unit tests covering the helper and the sync streaming path
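
The PR's tests are not reproduced on this page. As a rough sketch of what a unit test for the sync path might check, the example below feeds a fake stream through a sync splitter copied from the issue's suggested fix; the test name and the payloads are assumptions, and a plain iterator stands in for response.iter_text().

```python
import json
from typing import Iterable, Iterator


def split_chunks(completion_stream: Iterable[str]) -> Iterator[str]:
    """Sync splitter as proposed in the issue: one SSE event per yielded item."""
    for item in completion_stream:
        for chunk in item.split("\n\n"):
            if not chunk:
                continue
            yield chunk.strip()


def test_sync_split_handles_batched_events() -> None:
    # Stand-in for response.iter_text(): the first HTTP chunk carries two events.
    fake_stream = iter(
        [
            'data: {"index": 0}\n\ndata: {"index": 1}\n\n',
            'data: {"index": 2}\n\n',
        ]
    )
    events = list(split_chunks(fake_stream))
    assert len(events) == 3
    # Every event must now decode on its own, as chunk_creator expects.
    assert [json.loads(e[5:])["index"] for e in events] == [0, 1, 2]


test_sync_split_handles_batched_events()
```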

Changed files

  • litellm/llms/oci/chat/transformation.py (modified, +17/-7)
  • tests/test_litellm/llms/oci/chat/test_oci_chat_transformation.py (modified, +71/-1)

PR #24835: fix(oci): sync streaming missing split_chunks causes JSONDecodeError

Description (problem / solution / changelog)

Relevant issues

Fixes #24819

Type

🐛 Bug Fix

Changes

The sync streaming path in OCIChatConfig.get_sync_custom_stream_wrapper is missing the split_chunks logic that the async path already has. When the OCI endpoint batches multiple SSE events into a single HTTP chunk, chunk_creator receives a multi-line string and json.loads() fails with:

json.decoder.JSONDecodeError: Extra data: line 3 column 1 (char 123)

Fix

  • Extract a shared _split_sse_text() helper that splits batched SSE text on "\n\n" boundaries
  • Add a sync split_chunks generator to get_sync_custom_stream_wrapper using the helper
  • Refactor the existing async split_chunks in get_async_custom_stream_wrapper to use the same helper
  • Add 7 unit tests covering the helper and the sync streaming path

Changed files

  • litellm/llms/oci/chat/transformation.py (modified, +17/-7)
  • tests/test_litellm/llms/oci/chat/test_oci_chat_transformation.py (modified, +70/-1)

PR #25177: feat(oci): official OCI Generative AI integration — production-ready chat, embeddings & tool use across all model families

Description (problem / solution / changelog)

Overview

Hi LiteLLM team 👋

I'm a Senior Principal Engineer on the Agentic AI team at Oracle, and I'd like to officially land a production-ready OCI Generative AI integration into LiteLLM. OCI GenAI is Oracle's managed inference service, covering Meta, Google, xAI, Cohere, and OpenAI-hosted models — all behind a single RSA-SHA256-signed endpoint. Our team builds agentic AI systems on top of it and we want LiteLLM to be the standard gateway for customers running on OCI.

This PR extends the existing community-contributed OCI chat support into a complete, well-tested integration: embeddings, bug fixes found during production use, and a current model catalog covering all 43 OCI models available today.


What's in this PR

New: Embeddings (litellm/llms/oci/embed/)

  • OCIEmbedConfig implementing BaseEmbeddingConfig with full OCI Cohere embed support
  • All 7 Cohere embed models: cohere.embed-english-v3.0, cohere.embed-multilingual-v3.0, cohere.embed-v4.0, and image variants
  • Batch up to 96 documents per request; inputType mapped from OpenAI input_type; outputDimensions from dimensions
  • ON_DEMAND and DEDICATED (endpoint OCID) serving modes
  • RSA-SHA256 signing via sign_request hook (same pattern already used by OCI chat)
  • OCIEmbeddingConfig alias for backwards compatibility
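
The 96-document limit implies that larger inputs have to be sent in several requests. Whether litellm batches automatically is not stated here, so the sketch below only shows how a caller might pre-split inputs; batched_inputs and MAX_DOCS_PER_REQUEST are hypothetical names, not part of litellm.

```python
from typing import Iterator, List, Sequence

MAX_DOCS_PER_REQUEST = 96  # limit quoted in the PR description


def batched_inputs(
    texts: Sequence[str], size: int = MAX_DOCS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start : start + size])


docs = [f"doc-{i}" for i in range(200)]
batch_sizes = [len(batch) for batch in batched_inputs(docs)]
print(batch_sizes)  # 200 documents split into batches of 96, 96 and 8
```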

Bug Fixes (found in production)

  • Sync streaming JSONDecodeError (Extra data): iter_text() returns raw byte chunks that can span multiple SSE events; fixed by splitting on \n\n before JSON parsing
  • Reasoning model crash (google.gemini-2.5-flash, xai.grok-3-mini) — when max_tokens is exhausted on reasoning, completionTokens and message are absent from the OCI response; made both Optional in the Pydantic schema
  • oci_compartment_id env var silently ignored in chat — transform_request was calling optional_params.get() directly instead of going through resolve_oci_credentials()
  • Tool call id absent for Google Gemini via OCI — OCI's GENERIC apiFormat doesn't include id on tool calls for Gemini; made OCIToolCall.id optional, added UUID fallback
  • get_complete_url falling back to litellm.api_base — both chat and embed URL builders were using the global litellm.api_base which could belong to another provider; now only the explicit api_base parameter is honoured
  • Cohere usage AttributeError: CohereChatResponse.usage can be None; added guard with zero-value fallback
  • Dead code in __init__ and transform_response — removed unreachable locals() loop and reordered isinstance/error checks
  • OCI_KEY env var now supported for inline PEM keys (previously only OCI_KEY_FILE path was read)
  • Fixed datetime.utcnow() deprecation warning in request signing
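
Two of the fixes above follow a common "optional field with fallback" pattern. The sketch below illustrates that pattern with plain dataclasses; ToolCall, Usage, ensure_tool_call_id, usage_or_zero and the "call_" prefix are all hypothetical stand-ins, not the PR's actual Pydantic models or helpers.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCall:  # hypothetical stand-in for OCIToolCall
    name: str
    arguments: str
    id: Optional[str] = None  # may be absent in some provider responses


@dataclass
class Usage:  # hypothetical stand-in for a usage record
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0


def ensure_tool_call_id(call: ToolCall) -> str:
    """Return the provider's id, or a generated one when it is missing."""
    # The "call_" prefix is an assumption made for this example.
    return call.id if call.id else f"call_{uuid.uuid4().hex}"


def usage_or_zero(usage: Optional[Usage]) -> Usage:
    """Guard against a missing usage object by falling back to zero counts."""
    return usage if usage is not None else Usage()
```

Keeping the fallback in one small function means downstream code never has to check for None itself.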

Model Catalog (model_prices_and_context_window.json)

43 total OCI models in the catalog (includes models added by upstream main since PR was opened).

Code Quality

  • Removed dead code paths identified during coverage analysis
  • All OCI source files formatted with Black
  • CodeQL SHA256 false positive suppressed via codeql-config.yml
  • Cyclic import edges reduced (between litellm.utils and litellm.types.utils / litellm_core_utils)

Tests

Unit tests — 322 tests, no credentials required, run in CI

File | Tests | Coverage
test_oci_chat_transformation.py | Chat config, request/response transforms, split_chunks, streaming | 100%
test_oci_cohere_tool_calls.py | Cohere tool call adaptation, preamble, stream chunks | 100%
test_oci_generic_chat.py | Generic message adaptation, content types, tool validation | 100%
test_oci_streaming_tool_calls.py | Streaming tool call field handling | 100%
test_oci_common_utils.py | Signing, credentials, schema resolution, URL building | 98%
test_oci_embed_transformation.py | Embed request/response transforms, validation | 100%
test_oci_embedding.py | Embed config integration | 100%
test_oci_coverage_boost.py | Param mapping, vendor routing, all finish reasons | 100%
test_oci_coverage_patch.py | Error paths, streaming wrappers, import guards, dead code removal | 100%

Patch coverage: 99.5% — 5 of 6 source files at 100%, common_utils.py at 98% (4 remaining lines are except ImportError branches tested via subprocess isolation).

Live integration tests — 46 tests against OCI GenAI (us-chicago-1)

Auto-skipped in CI when ~/.oci/config is absent. Configurable via OCI_CONFIG_PROFILE env var.
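
The PR's actual skip mechanism is not shown on this page. One way to get the same behaviour with the standard library is sketched below; the class name, the placeholder test and the "DEFAULT" profile fallback are assumptions, while ~/.oci/config and OCI_CONFIG_PROFILE come from the description above.

```python
import os
import unittest
from pathlib import Path

OCI_CONFIG_PATH = Path.home() / ".oci" / "config"
# OCI_CONFIG_PROFILE is named in the PR; the "DEFAULT" fallback is an assumption.
OCI_PROFILE = os.environ.get("OCI_CONFIG_PROFILE", "DEFAULT")


@unittest.skipUnless(
    OCI_CONFIG_PATH.is_file(),
    "~/.oci/config not found; skipping live OCI tests",
)
class LiveOCITests(unittest.TestCase):
    """Runs only on machines that have OCI credentials configured."""

    def test_profile_is_configured(self) -> None:
        # Placeholder for a real request against OCI GenAI.
        self.assertIsInstance(OCI_PROFILE, str)
```

On a CI runner without the config file, the whole class is reported as skipped rather than failed.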

All 46 verified passing (52s):

Category | Models | Tests
Basic completion | Meta, Google, xAI, Cohere | 4 sync + 4 async
Streaming | Meta, Google, xAI, Cohere | 4 sync + 4 async
Multi-turn | Meta, Google, xAI, Cohere | 4
System message | Meta, Google, xAI, Cohere | 4
Tool use | Meta, Cohere, Google | 3 sync + 3 async
Usage populated | Meta, Google, xAI, Cohere | 4
Embeddings | english-v3, multilingual-v3, embed-v4, batch, input_type, similarity, usage | 7 sync + 3 async
Env var credentials | completion + embedding | 2

All P0/P1 Review Findings — Resolved

Finding | Fix
Real network calls in mock-only test folder | Moved to tests/llm_translation/
Duplicate embed JSON keys (6 entries) | Removed duplicates
Duplicate chat JSON keys with conflicting pricing | Reconciled and deduplicated
get_complete_url falls back to litellm.api_base | Removed fallback in both chat + embed
AttributeError when Cohere usage is None | Added None guard with zero-value fallback
CodeQL SHA256 false positive | Excluded in codeql-config.yml

Checklist

  • Tests added (tests/test_litellm/)
  • make test-unit passes — unit tests run without OCI credentials
  • 46 live integration tests pass against OCI GenAI
  • 99.5% patch coverage (codecov/patch passes)
  • All P0/P1 Greptile findings addressed
  • No changes to shared infrastructure beyond the provider-agnostic sign_request hook already established by the existing OCI chat implementation
  • Rebased on latest main, clean rebase

Closes #25082

Changed files

  • .github/codeql/codeql-config.yml (modified, +7/-0)
  • litellm/llms/custom_httpx/llm_http_handler.py (modified, +29/-7)
  • litellm/llms/oci/chat/cohere.py (added, +283/-0)
  • litellm/llms/oci/chat/generic.py (added, +404/-0)
  • litellm/llms/oci/chat/transformation.py (modified, +215/-1154)
  • litellm/llms/oci/common_utils.py (modified, +533/-1)
  • litellm/llms/oci/embed/transformation.py (modified, +173/-225)
  • litellm/main.py (modified, +18/-0)
  • litellm/types/llms/oci.py (modified, +68/-10)
  • litellm/utils.py (modified, +4/-4)
  • model_prices_and_context_window.json (modified, +168/-75)
  • tests/llm_translation/test_oci_integration.py (added, +531/-0)
  • tests/test_litellm/llms/oci/chat/test_oci_chat_transformation.py (modified, +467/-55)
  • tests/test_litellm/llms/oci/chat/test_oci_chat_transformation_for_14158.py (modified, +54/-60)
  • tests/test_litellm/llms/oci/chat/test_oci_cohere_tool_calls.py (modified, +265/-151)
  • tests/test_litellm/llms/oci/chat/test_oci_generic_chat.py (added, +302/-0)
  • tests/test_litellm/llms/oci/chat/test_oci_streaming_tool_calls.py (modified, +62/-131)
  • tests/test_litellm/llms/oci/embed/test_oci_embed_transformation.py (added, +390/-0)
  • tests/test_litellm/llms/oci/embed/test_oci_embedding.py (modified, +29/-14)
  • tests/test_litellm/llms/oci/rerank/__init__.py (added, +0/-0)
  • tests/test_litellm/llms/oci/test_oci_common_utils.py (added, +469/-0)
  • tests/test_litellm/llms/oci/test_oci_coverage_boost.py (added, +960/-0)
  • tests/test_litellm/llms/oci/test_oci_coverage_patch.py (added, +965/-0)


What happened?

The sync streaming path in OCIChatConfig.get_sync_custom_stream_wrapper is missing the split_chunks logic that the async path (get_async_custom_stream_wrapper) already has. When the OCI endpoint batches multiple SSE events into a single HTTP chunk, chunk_creator receives a multi-line string and json.loads() fails with JSONDecodeError: Extra data.

Expected: Streaming works the same in sync and async paths.

Steps to Reproduce

  1. Configure litellm to use an OCI model (e.g., oci/openai.gpt-5.4)
  2. Make a sync streaming chat completion call
  3. Observe JSONDecodeError: Extra data when the endpoint batches multiple SSE events into one chunk

Relevant log output

litellm.APIConnectionError: Extra data: line 3 column 1 (char 123)

File "litellm/litellm_core_utils/streaming_handler.py", line 1611, in __next__
    response: Optional[ModelResponseStream] = self.chunk_creator(chunk=chunk)

File "litellm/llms/oci/chat/transformation.py", line 1082, in chunk_creator
    dict_chunk = json.loads(chunk[5:])
                 ^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Extra data: line 3 column 1 (char 123)

Root Cause

In litellm/llms/oci/chat/transformation.py:

Async path (line ~846) correctly splits multi-event chunks:

async def split_chunks(completion_stream: AsyncIterator[str]):
    async for item in completion_stream:
        for chunk in item.split("\n\n"):
            if not chunk:
                continue
            yield chunk.strip()

streaming_response = OCIStreamWrapper(
    completion_stream=split_chunks(completion_stream), ...
)

Sync path (line ~799) is missing this:

completion_stream = response.iter_text()

streaming_response = OCIStreamWrapper(
    completion_stream=completion_stream, ...  # no split_chunks!
)

Suggested Fix

Add a sync equivalent of split_chunks to the sync path:

def split_chunks(completion_stream):
    for item in completion_stream:
        for chunk in item.split("\n\n"):
            if not chunk:
                continue
            yield chunk.strip()

streaming_response = OCIStreamWrapper(
    completion_stream=split_chunks(completion_stream), ...
)

Component

SDK (litellm Python package)

Version

v1.82.6 (also reproduced on v1.78.2)

extent analysis

Fix Plan

To fix the issue, add a sync equivalent of split_chunks to the sync path:

  • Create a new function split_chunks_sync that splits multi-event chunks:
def split_chunks_sync(completion_stream):
    for item in completion_stream:
        for chunk in item.split("\n\n"):
            if not chunk:
                continue
            yield chunk.strip()
  • Modify the sync path to use the new split_chunks_sync function:
completion_stream = response.iter_text()

streaming_response = OCIStreamWrapper(
    completion_stream=split_chunks_sync(completion_stream), ...
)

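Alternatively, share the splitting logic between both paths. A single function cannot serve as both a sync and an async generator (async for is only valid inside async def), so keep one plain helper and two thin wrappers. This mirrors the _split_sse_text() helper described in the PRs; its exact signature is an assumption here:

from typing import AsyncIterator, Iterator

def _split_sse_text(item: str) -> Iterator[str]:
    # Yield each non-empty SSE event from one (possibly batched) text chunk.
    for chunk in item.split("\n\n"):
        if not chunk:
            continue
        yield chunk.strip()

def split_chunks_sync(completion_stream: Iterator[str]) -> Iterator[str]:
    # Sync wrapper: for use with response.iter_text().
    for item in completion_stream:
        yield from _split_sse_text(item)

async def split_chunks(completion_stream: AsyncIterator[str]):
    # Async wrapper: same splitting, driven by an async stream.
    async for item in completion_stream:
        for chunk in _split_sse_text(item):
            yield chunk

# Usage (sync path)
streaming_response = OCIStreamWrapper(
    completion_stream=split_chunks_sync(completion_stream), ...
)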

Verification

To verify that the fix worked, you can test the sync streaming chat completion call again and check that the JSONDecodeError: Extra data exception is no longer raised.

Extra Tips

  • Test the fix with varied inputs: one event per chunk, several events batched into one chunk, and chunks with leading or trailing blank lines.
  • Consider logging the offending chunk when JSON parsing fails, so that a malformed or partial event can be told apart from a splitting bug.
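
One scenario the simple split does not cover is a single event arriving in two HTTP chunks. The issue does not report this happening with OCI, so the following is only a defensive sketch; split_sse_buffered is a hypothetical name and is not part of litellm or of the PRs above.

```python
from typing import Iterable, Iterator


def split_sse_buffered(stream: Iterable[str]) -> Iterator[str]:
    """Yield whole SSE events even when one event spans several text chunks."""
    buffer = ""
    for text in stream:
        buffer += text
        # Everything before the last blank-line separator is a complete event;
        # the remainder stays in the buffer until more text arrives.
        *complete, buffer = buffer.split("\n\n")
        for event in complete:
            event = event.strip()
            if event:
                yield event
    # Flush what is left if the stream ends without a trailing blank line.
    tail = buffer.strip()
    if tail:
        yield tail


# The first event is cut in the middle of its JSON payload.
pieces = ['data: {"a"', ': 1}\n\ndata: {"b": 2}\n\n']
print(list(split_sse_buffered(pieces)))
```

The trade-off is latency: an event that arrives without its trailing blank line is held back until the next chunk or the end of the stream.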
