litellm - 💡(How to fix) Fix [Bug]: litellm.APIConnectionError: Unable to parse ollama chunk [1 participants]

GitHub stats
BerriAI/litellm#23155 · Fetched 2026-04-08 00:38:18
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): labeled ×3

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When using the model hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M from Hugging Face with Ollama, LiteLLM throws an error while processing the streamed response.

Steps to Reproduce

Relevant log output

proxy_server.py:5164 - litellm.proxy.proxy_server.async_data_generator(): Exception occured - litellm.ServiceUnavailableError: litellm.MidStreamFallbackError: litellm.APIConnectionError: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/streaming_handler.py", line 1915, in __anext__
    async for chunk in self.completion_stream:
    ...<71 lines>...
        return processed_chunk
  File "/usr/lib/python3.13/site-packages/litellm/llms/base_llm/base_model_iterator.py", line 172, in __anext__
    chunk = self._handle_string_chunk(str_line=str_line)
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 509, in _handle_string_chunk
    return self.chunk_parser(json.loads(str_line))
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 593, in chunk_parser
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 591, in chunk_parser
    raise Exception(f"Unable to parse ollama chunk - {chunk}")
Exception: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}
. Received Model Group=qwen3.5:27b
Available Model Group Fallbacks=None Original exception: APIConnectionError: litellm.APIConnectionError: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/streaming_handler.py", line 1915, in __anext__
    async for chunk in self.completion_stream:
    ...<71 lines>...
        return processed_chunk
  File "/usr/lib/python3.13/site-packages/litellm/llms/base_llm/base_model_iterator.py", line 172, in __anext__
    chunk = self._handle_string_chunk(str_line=str_line)
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 509, in _handle_string_chunk
    return self.chunk_parser(json.loads(str_line))
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 593, in chunk_parser
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 591, in chunk_parser
    raise Exception(f"Unable to parse ollama chunk - {chunk}")
Exception: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/streaming_handler.py", line 1915, in __anext__
    async for chunk in self.completion_stream:
    ...<71 lines>...
        return processed_chunk
  File "/usr/lib/python3.13/site-packages/litellm/llms/base_llm/base_model_iterator.py", line 172, in __anext__
    chunk = self._handle_string_chunk(str_line=str_line)
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 509, in _handle_string_chunk
    return self.chunk_parser(json.loads(str_line))
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 593, in chunk_parser
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 591, in chunk_parser
    raise Exception(f"Unable to parse ollama chunk - {chunk}")
Exception: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/proxy/proxy_server.py", line 5123, in async_data_generator
    async for chunk in proxy_logging_obj.async_post_call_streaming_iterator_hook(
    ...<32 lines>...
            yield f"data: {str(e)}\n\n"
  File "/usr/lib/python3.13/site-packages/litellm/proxy/utils.py", line 2131, in async_post_call_streaming_iterator_hook
    async for chunk in current_response:
        yield chunk
  File "/usr/lib/python3.13/site-packages/litellm/integrations/custom_logger.py", line 459, in async_post_call_streaming_iterator_hook
    async for item in response:
        yield item
  File "/usr/lib/python3.13/site-packages/litellm/integrations/custom_logger.py", line 459, in async_post_call_streaming_iterator_hook
    async for item in response:
        yield item
  File "/usr/lib/python3.13/site-packages/litellm/integrations/custom_logger.py", line 459, in async_post_call_streaming_iterator_hook
    async for item in response:
        yield item
  File "/usr/lib/python3.13/site-packages/litellm/proxy/hooks/responses_id_security.py", line 270, in async_post_call_streaming_iterator_hook
    async for chunk in response:
    ...<7 lines>...
        yield chunk
  File "/usr/lib/python3.13/site-packages/litellm/integrations/custom_logger.py", line 459, in async_post_call_streaming_iterator_hook
    async for item in response:
        yield item
  File "/usr/lib/python3.13/site-packages/litellm/integrations/custom_logger.py", line 459, in async_post_call_streaming_iterator_hook
    async for item in response:
        yield item
  File "/usr/lib/python3.13/site-packages/litellm/integrations/custom_logger.py", line 459, in async_post_call_streaming_iterator_hook
    async for item in response:
        yield item
  [Previous line repeated 4 more times]
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1563, in __anext__
    return await self._async_generator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1664, in stream_with_fallbacks
    raise fallback_error
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1608, in stream_with_fallbacks
    await self.async_function_with_fallbacks_common_utils(
    ...<8 lines>...
    )
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5061, in async_function_with_fallbacks_common_utils
    raise original_exception
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1568, in stream_with_fallbacks
    async for item in model_response:
        yield item
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/streaming_handler.py", line 2141, in __anext__
    raise MidStreamFallbackError(
    ...<6 lines>...
    )
litellm.exceptions.MidStreamFallbackError: litellm.ServiceUnavailableError: litellm.MidStreamFallbackError: litellm.APIConnectionError: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/streaming_handler.py", line 1915, in __anext__
    async for chunk in self.completion_stream:
    ...<71 lines>...
        return processed_chunk
  File "/usr/lib/python3.13/site-packages/litellm/llms/base_llm/base_model_iterator.py", line 172, in __anext__
    chunk = self._handle_string_chunk(str_line=str_line)
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 509, in _handle_string_chunk
    return self.chunk_parser(json.loads(str_line))
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 593, in chunk_parser
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 591, in chunk_parser
    raise Exception(f"Unable to parse ollama chunk - {chunk}")
Exception: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}
. Received Model Group=qwen3.5:27b
Available Model Group Fallbacks=None Original exception: APIConnectionError: litellm.APIConnectionError: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/streaming_handler.py", line 1915, in __anext__
    async for chunk in self.completion_stream:
    ...<71 lines>...
        return processed_chunk
  File "/usr/lib/python3.13/site-packages/litellm/llms/base_llm/base_model_iterator.py", line 172, in __anext__
    chunk = self._handle_string_chunk(str_line=str_line)
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 509, in _handle_string_chunk
    return self.chunk_parser(json.loads(str_line))
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 593, in chunk_parser
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/ollama/completion/transformation.py", line 591, in chunk_parser
    raise Exception(f"Unable to parse ollama chunk - {chunk}")
Exception: Unable to parse ollama chunk - {'model': 'hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M', 'created_at': '2026-03-09T14:15:14.997957676Z', 'response': '', 'done': False}

Calling end() on an ended span.

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.81.14

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

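The traceback shows that json.loads succeeded in _handle_string_chunk and that the exception is raised inside chunk_parser in ollama/completion/transformation.py. The chunk it rejects has an empty response ('response': '') with 'done': False, so the parser appears to have no branch for an empty, non-final chunk. To fix this, chunk_parser needs to tolerate such chunks instead of raising.

Here are the steps to fix the issue:

  • Modify the chunk_parser function to check for an empty response on a non-final chunk before it reaches the "Unable to parse ollama chunk" error.
  • If the response is empty and done is False, return an empty chunk so the stream continues. Raising a custom exception would still abort the stream.

Example code (the returned dict is a placeholder; use whatever type the real chunk_parser returns):

def chunk_parser(self, chunk):
    # Existing branches (done, non-empty response, ...) stay as they are
    # New last branch before the final raise: empty, non-final chunk
    if not chunk.get("done") and not chunk.get("response"):
        return {"text": "", "is_finished": False, "finish_reason": ""}
    raise Exception(f"Unable to parse ollama chunk - {chunk}")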
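
The empty-response check can be tried outside LiteLLM with a standalone sketch. The function name parse_ollama_chunk and the keys of the returned dict (text, is_finished, finish_reason) are assumptions made for illustration, not LiteLLM's actual internals; the sample chunk is the one from the log.

```python
import json
from typing import Any, Dict


def parse_ollama_chunk(chunk: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for chunk_parser that tolerates empty, non-final chunks."""
    if "error" in chunk:
        raise ValueError(f"Ollama error - {chunk}")
    if chunk.get("done") is True:
        return {"text": "", "is_finished": True, "finish_reason": "stop"}
    # An empty or missing "response" on a non-final chunk becomes an empty delta.
    return {"text": chunk.get("response") or "", "is_finished": False, "finish_reason": ""}


# One JSON line shaped like the chunk reported in the log.
line = json.dumps({
    "model": "hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M",
    "created_at": "2026-03-09T14:15:14.997957676Z",
    "response": "",
    "done": False,
})
parsed = parse_ollama_chunk(json.loads(line))
print(parsed)
```

Returning an empty delta keeps the stream alive, so later chunks that do carry text are still delivered to the client.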

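Alternatively, you can add a try-except block around the chunk_parser call in _handle_string_chunk. Catching errors from json.loads would not help here, because the traceback shows json.loads already succeeded and chunk_parser receives a dict.

def _handle_string_chunk(self, str_line):
    chunk = json.loads(str_line)
    try:
        return self.chunk_parser(chunk)
    except Exception:
        # Swallow only the empty, non-final case; re-raise everything else
        if chunk.get("done") is False and not chunk.get("response"):
            return {"text": "", "is_finished": False, "finish_reason": ""}  # placeholder shape
        raise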

Verification

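To verify that the fix worked, pass the chunk from the log (empty response, done set to False) to chunk_parser and check that it returns an empty chunk instead of raising. Then repeat a streaming request to the same model through the proxy and confirm that the MidStreamFallbackError no longer appears.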
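
A minimal sketch of such a check follows. Both parsers are hypothetical models written for this example (strict_parser imitates the failing behaviour seen in the log, tolerant_parser imitates a fixed one); neither is LiteLLM's real code, and the helper raises_on_empty_chunk is likewise an assumption.

```python
from typing import Any, Callable, Dict

# The chunk from the log: empty response, stream not finished.
EMPTY_CHUNK: Dict[str, Any] = {
    "model": "hf.co/mradermacher/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF:Q4_K_M",
    "created_at": "2026-03-09T14:15:14.997957676Z",
    "response": "",
    "done": False,
}


def raises_on_empty_chunk(parser: Callable[[Dict[str, Any]], Any]) -> bool:
    """Return True if the parser raises on the empty, non-final chunk."""
    try:
        parser(dict(EMPTY_CHUNK))
    except Exception:
        return True
    return False


def strict_parser(chunk: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical model of the failing behaviour: no branch for an empty delta.
    if chunk["done"]:
        return {"text": "", "is_finished": True}
    if chunk["response"]:
        return {"text": chunk["response"], "is_finished": False}
    raise Exception(f"Unable to parse ollama chunk - {chunk}")


def tolerant_parser(chunk: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical fixed behaviour: an empty delta is passed through.
    if chunk["done"]:
        return {"text": "", "is_finished": True}
    return {"text": chunk.get("response") or "", "is_finished": False}


print(raises_on_empty_chunk(strict_parser))    # True
print(raises_on_empty_chunk(tolerant_parser))  # False
```

In a real check, the patched chunk_parser would be passed to raises_on_empty_chunk in place of the two stand-ins.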

Extra Tips

  • Make sure to test the chunk_parser function thoroughly with different types of responses to ensure that it handles all possible scenarios correctly.
  • Consider adding logging statements to the chunk_parser function to track any errors or exceptions that occur during parsing.
  • The chunk is decoded by json.loads and parsed by LiteLLM's own chunk_parser rather than a third-party library, so check newer LiteLLM releases and the upstream issue (BerriAI/litellm#23155) for a fix before patching locally.
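
The logging tip can be sketched as a thin wrapper around whatever parser is in use. The wrapper name parse_with_logging and the logger name are assumptions for this example; only the standard logging module is used.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("ollama_chunk_debug")  # hypothetical logger name


def parse_with_logging(parser: Callable[[Dict[str, Any]], Any], chunk: Dict[str, Any]) -> Any:
    """Call the parser and log the shape of the chunk if it fails, then re-raise."""
    try:
        return parser(chunk)
    except Exception:
        # Log which keys were present and the done flag, plus the traceback.
        logger.exception(
            "Failed to parse chunk: keys=%s done=%r response_empty=%r",
            sorted(chunk),
            chunk.get("done"),
            not chunk.get("response"),
        )
        raise
```

Logging the keys and the done flag rather than the whole chunk keeps the log line short while still showing which chunk shape the parser could not handle.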
