litellm - ✅(Solved) Fix [Bug]: AzureException APIError - utf-8 codec can't encode character '\ud83e' (surrogates not allowed) causes retry/fallback loop [5 pull requests, 1 participants]

seonghobae · 2026-03-18T05:17:21Z

[litellm] PR 1: fix: reject malformed Azure surrogate payloads - Repository: seonghobae/litellm - Author: seonghobae - State: closed | merged: False - Link: ht… # PR #1: fix: reject malformed Azure surrogate payloads - Repository: seonghobae/litellm - Author: seonghobae - State: closed | merged: False - Link: https://github.com/seonghobae/litellm/pull/1 ## Description (problem / solution / changelog) ## Relevant issues Fixes BerriAI/litellm#23959 ## Pre-Submission checklist **Please complete all items before asking a LiteLLM maintainer to review your PR** - [x] I have Added testing in the [`tests/test_litellm/`](https://github.com/BerriAI/litellm/tree/main/tests/test_litellm) directory, **Adding at least 1 test is a hard requirement** - [see details](https://docs.litellm.ai/docs/extras/contributing_code) - [ ] My PR passes all unit tests on [`make test-unit`](https://docs.litellm.ai/docs/extras/contributing_code) - [x] My PR's scope is as isolated as possible, it only solves 1 specific problem - [ ] I have requested a Greptile review by commenting `@greptileai` and received a **Confidence Score of at least 4/5** before requesting a maintainer review ## Delays in PR merge? If you're seeing a delay in your PR being merged, ping the LiteLLM Team on [Slack (#pr-review)](https://join.slack.com/t/litellmossslack/shared_invite/zt-3o7nkuyfr-p_kbNJj8taRfXGgQI1~YyA). ## CI (LiteLLM team) > **CI status guideline:** > > - 50-55 passing tests: main is stable with minor issues. > - 45-49 passing tests: acceptable but needs attention > - <= 40 passing tests: unstable; be careful with your merges and assess the risk. - [ ] **Branch creation CI run** Link: https://github.com/seonghobae/litellm/actions/runs/23329340758 - [ ] **CI run for the last commit** Link: https://github.com/seonghobae/litellm/actions/runs/23329340758 - [ ] **Merge / cherry-pick CI run** Links: ## Type 🐛 Bug Fix ✅ Test ## Changes - reject lone surrogate Unicode code points in Azure request payloads before transport dispatch - reject escaped lone surrogates at proxy request parsing time with a 400 `request_body` error - add regression coverage for Azure preflight validation, proxy parsing, and router fail-fast retry behavior ## Changed files - `AGENTS.md` (modified, +2/-1) - `ARCHITECTURE.md` (modified, +5/-0) - `docs/agents/README.md` (added, +17/-0) - `docs/coderabbit/review-commands.md` (added, +17/-0) - `docs/engineering/acceptance-criteria.md` (added, +35/-0) - `docs/engineering/harness-engineering.md` (added, +28/-0) - `docs/workflow/one-day-delivery-plan.md` (added, +26/-0) - `docs/workflow/pr-continuity.md` (added, +22/-0) - `litellm/litellm_core_utils/llm_request_utils.py` (modified, +37/-1) - `litellm/llms/azure/azure.py` (modified, +34/-3) - `litellm/llms/azure/common_utils.py` (modified, +21/-0) - `litellm/llms/azure/completion/handler.py` (modified, +9/-1) - `litellm/proxy/common_utils/http_parsing_utils.py` (modified, +24/-5) - `responses.py` (added, +145/-0) - `tests/test_litellm/conftest.py` (modified, +36/-42) - `tests/test_litellm/interactions/test_openapi_compliance.py` (modified, +72/-38) - `tests/test_litellm/llms/azure/test_azure_exception_mapping.py` (modified, +86/-79) - `tests/test_litellm/llms/databricks/responses/__init__.py` (removed, +0/-0) - `tests/test_litellm/llms/manus/responses/__init__.py` (removed, +0/-2) - `tests/test_litellm/llms/openai_like/responses/__init__.py` (removed, +0/-0) - `tests/test_litellm/proxy/client/test_chat.py` (modified, +2/-14) - `tests/test_litellm/proxy/client/test_credentials.py` (modified, +0/-7) - `tests/test_litellm/proxy/client/test_http_client.py` (modified, +0/-6) - `tests/test_litellm/proxy/client/test_http_commands.py` (modified, +0/-6) - `tests/test_litellm/proxy/client/test_keys.py` (modified, +2/-13) - `tests/test_litellm/proxy/client/test_model_groups.py` (modified, +0/-7) - `tests/test_litellm/proxy/client/test_models.py` (modified, +5/-25) - `tests/test_litellm/proxy/common_utils/test_http_parsing_utils.py` (modified, +99/-100) - `tests/test_litellm/test_router_retry_non_retryable_errors.py` (modified, +87/-25) --- # PR #2: fix: reject malformed Azure surrogate payloads - Repository: seonghobae/litellm - Author: seonghobae - State: closed | merged: False - Link: https://github.com/seonghobae/litellm/pull/2 ## Description (problem / solution / changelog) ## Relevant issues Fixes BerriAI/litellm#23959 ## Stack - base branch: `test-google-interactions-openapi-refs` - issue diff stays isolated to the Azure surrogate validation fix while branch CI unblocks land underneath it in separate PRs ## Pre-Submission checklist **Please complete all items before asking a LiteLLM maintainer to review your PR** - [x] I have Added testing in the [`tests/test_litellm/`](https://github.com/BerriAI/litellm/tree/main/tests/test_litellm) directory, **Adding at least 1 test is a hard requirement** - [see details](https://docs.litellm.ai/docs/extras/contributing_code) - [ ] My PR passes all unit tests on

litellm2026-03-18 05:17:21

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

BerriAI/litellm#23959•Fetched 2026-04-08 00:53:43

View on GitHub

Comments

Participants

Timeline

Reactions

Author

seonghobae

Participants

seonghobae

Timeline (top)

cross-referenced ×5labeled ×2

Error Message

Alert type: llm_exceptions Level: High Timestamp: 03:47:12

Message: LLM API call failed: litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325105: surrogates not allowed. Received Model Group=gpt-5.4 Available Model Group Fallbacks=['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5'] Error doing the fallback: litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325103: surrogates not allowedNo fallback model group found for original model_group=gpt-5. Fallbacks=[{'gpt-5.2-codex': ['gpt-5.1-codex-max', 'gpt-5.1', 'gpt-5-pro']}, {'gpt-5.2': ['gpt-5.1', 'gpt-5.1-codex-max', 'gpt-5.2-codex', 'gpt-5', 'gpt-5-pro']}, {'gpt-5.4': ['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5']}]. Received Model Group=gpt-5 Available Model Group Fallbacks=None Error doing the fallback: litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325103: surrogates not allowedNo fallback model group found for original model_group=gpt-5. Fallbacks=[{'gpt-5.2-codex': ['gpt-5.1-codex-max', 'gpt-5.1', 'gpt-5-pro']}, {'gpt-5.2': ['gpt-5.1', 'gpt-5.1-codex-max', 'gpt-5.2-codex', 'gpt-5', 'gpt-5-pro']}, {'gpt-5.4': ['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5']}] LiteLLM Retried: 5 times, LiteLLM Max Retries: 5 LiteLLM Retried: 5 times, LiteLLM Max Retries: 5 LiteLLM Retried: 5 times, LiteLLM Max Retries: 5 Model: azure/gpt-5.4 API Base: [redacted Azure endpoint] Messages: None

Proxy URL: [redacted internal proxy URL]

Root Cause

Expected behavior:

LiteLLM should sanitize / reject lone surrogate characters before sending the request to the provider
or escape the offending value before transport/logging
and fallback handling should preserve the original root cause without cascading misleading fallback errors

Code Example

Alert type: llm_exceptions
Level: High
Timestamp: 03:47:12

Message: LLM API call failed: `litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325105: surrogates not allowed. Received Model Group=gpt-5.4
Available Model Group Fallbacks=['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5']
Error doing the fallback: litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325103: surrogates not allowedNo fallback model group found for original model_group=gpt-5. Fallbacks=[{'gpt-5.2-codex': ['gpt-5.1-codex-max', 'gpt-5.1', 'gpt-5-pro']}, {'gpt-5.2': ['gpt-5.1', 'gpt-5.1-codex-max', 'gpt-5.2-codex', 'gpt-5', 'gpt-5-pro']}, {'gpt-5.4': ['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5']}]. Received Model Group=gpt-5
Available Model Group Fallbacks=None
Error doing the fallback: litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325103: surrogates not allowedNo fallback model group found for original model_group=gpt-5. Fallbacks=[{'gpt-5.2-codex': ['gpt-5.1-codex-max', 'gpt-5.1', 'gpt-5-pro']}, {'gpt-5.2': ['gpt-5.1', 'gpt-5.1-codex-max', 'gpt-5.2-codex', 'gpt-5', 'gpt-5-pro']}, {'gpt-5.4': ['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5']}] LiteLLM Retried: 5 times, LiteLLM Max Retries: 5 LiteLLM Retried: 5 times, LiteLLM Max Retries: 5 LiteLLM Retried: 5 times, LiteLLM Max Retries: 5
Model: azure/gpt-5.4
API Base: [redacted Azure endpoint]
Messages: None`

Proxy URL: [redacted internal proxy URL]

RAW_BUFFERClick to expand / collapse

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

LiteLLM Proxy fails to send a request to Azure with:

litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' ... surrogates not allowed

This happens while routing azure/gpt-5.4 with configured model-group fallbacks.

Observed behavior:

the original request fails with a UnicodeEncodeError on \ud83e
LiteLLM retries 5 times
fallback attempts also fail with the same encoding error
one fallback path additionally emits No fallback model group found for original model_group=gpt-5, which looks like secondary noise after the original malformed input already poisoned the request

Expected behavior:

LiteLLM should sanitize / reject lone surrogate characters before sending the request to the provider
or escape the offending value before transport/logging
and fallback handling should preserve the original root cause without cascading misleading fallback errors

This looks related to malformed Unicode input (likely a broken/truncated emoji surrogate) reaching the Azure request serialization path.

Possibly related to prior surrogate/UTF-8 issues such as #8583, but this case is on Azure + proxy fallback flow.

Steps to Reproduce

Run LiteLLM Proxy with an Azure model group such as gpt-5.4.
Configure fallbacks for that model group.
Send a request where one of the text fields (message content, tool output, or other forwarded text) contains a lone surrogate character such as \ud83e instead of a valid Unicode scalar value.
Observe that the Azure call fails before completion with:
- 'utf-8' codec can't encode character '\ud83e' ... surrogates not allowed
Observe repeated retries and fallback attempts failing with the same root error.

Relevant log output

Alert type: llm_exceptions
Level: High
Timestamp: 03:47:12

Message: LLM API call failed: `litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325105: surrogates not allowed. Received Model Group=gpt-5.4
Available Model Group Fallbacks=['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5']
Error doing the fallback: litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325103: surrogates not allowedNo fallback model group found for original model_group=gpt-5. Fallbacks=[{'gpt-5.2-codex': ['gpt-5.1-codex-max', 'gpt-5.1', 'gpt-5-pro']}, {'gpt-5.2': ['gpt-5.1', 'gpt-5.1-codex-max', 'gpt-5.2-codex', 'gpt-5', 'gpt-5-pro']}, {'gpt-5.4': ['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5']}]. Received Model Group=gpt-5
Available Model Group Fallbacks=None
Error doing the fallback: litellm.APIError: AzureException APIError - 'utf-8' codec can't encode character '\ud83e' in position 325103: surrogates not allowedNo fallback model group found for original model_group=gpt-5. Fallbacks=[{'gpt-5.2-codex': ['gpt-5.1-codex-max', 'gpt-5.1', 'gpt-5-pro']}, {'gpt-5.2': ['gpt-5.1', 'gpt-5.1-codex-max', 'gpt-5.2-codex', 'gpt-5', 'gpt-5-pro']}, {'gpt-5.4': ['gpt-5.4-pro', 'gpt-5.2', 'gpt-5.1', 'gpt-5.2-codex', 'gpt-5.1-codex-max', 'gpt-5-pro', 'gpt-5']}] LiteLLM Retried: 5 times, LiteLLM Max Retries: 5 LiteLLM Retried: 5 times, LiteLLM Max Retries: 5 LiteLLM Retried: 5 times, LiteLLM Max Retries: 5
Model: azure/gpt-5.4
API Base: [redacted Azure endpoint]
Messages: None`

Proxy URL: [redacted internal proxy URL]

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

1.82.3

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

To address the issue of UnicodeEncodeError caused by lone surrogate characters, we need to sanitize the input text before sending it to Azure. We can achieve this by using Python's built-in unicode functions to detect and replace or remove surrogate characters.

Step-by-Step Solution:

Identify and replace surrogate characters: Before sending the request to Azure, iterate through the input text and check for surrogate characters. You can use the following Python function to achieve this:

import re

def remove_surrogate_characters(text): # Use regular expression to find surrogate characters surrogate_pattern = re.compile(r'[\ud800-\udfff]') return surrogate_pattern.sub('', text)

Example usage:

input_text = "Hello, \ud83e world!" cleaned_text = remove_surrogate_characters(input_text) print(cleaned_text) # Output: "Hello, world!"

2. **Integrate the function into the request pipeline**: Modify the LiteLLM Proxy code to call the `remove_surrogate_characters` function before sending the request to Azure. This will ensure that all input text is sanitized before being sent.

#### Code Changes:
You will need to modify the part of the code that handles the request to Azure. The exact changes will depend on the structure of your code, but it should look something like this:
```python
# Assuming 'request_text' is the variable holding the input text
cleaned_request_text = remove_surrogate_characters(request_text)

# Send the cleaned request text to Azure
azure_response = send_request_to_azure(cleaned_request_text)

Verification

To verify that the fix worked, you can test the LiteLLM Proxy with input text containing lone surrogate characters. The proxy should now successfully send the request to Azure without encountering the UnicodeEncodeError.

Extra Tips

Make sure to test the remove_surrogate_characters function thoroughly to ensure it correctly handles different types of input text.
Consider adding logging to track any instances where surrogate characters are detected and removed, to help with debugging and monitoring.
If you're using a library or framework to handle Unicode text, check its documentation for built-in functions or options to handle surrogate characters.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #ssr #installation #tensor shape #autograd error #cache issue #memory leak #API versioning #request timeout

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

litellm - ✅(Solved) Fix [Bug]: AzureException APIError - utf-8 codec can't encode character '\ud83e' (surrogates not allowed) causes retry/fallback loop [5 pull requests, 1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Fix Action

Fixed

PR fix notes

PR #1: fix: reject malformed Azure surrogate payloads

Description (problem / solution / changelog)

Relevant issues

Pre-Submission checklist

Delays in PR merge?

CI (LiteLLM team)

Type

Changes

Changed files

PR #2: fix: reject malformed Azure surrogate payloads

Description (problem / solution / changelog)

Relevant issues

Stack

Pre-Submission checklist

Type

Changes

Verification

Changed files

PR #24380: fix: reject malformed Azure surrogate payloads

Description (problem / solution / changelog)

Relevant issues

Pre-Submission checklist

Delays in PR merge?

CI (LiteLLM team)

Type

Changes

Verification

Changed files

PR #16: fix: sanitize malformed Azure surrogate payloads

Description (problem / solution / changelog)

Relevant issues

Pre-Submission checklist

Type

Changes

Verification

Changed files

PR #24382: fix: sanitize malformed Azure surrogate payloads

Description (problem / solution / changelog)

Relevant issues

Pre-Submission checklist

Type

Changes

Verification

Changed files

Code Example

Check for existing issues

What happened?

Steps to Reproduce

Relevant log output

What part of LiteLLM is this about?

What LiteLLM version are you on ?

Twitter / LinkedIn details

extent analysis

Fix Plan

Step-by-Step Solution:

Example usage:

Verification

Extra Tips

Still need to ship something?

RELATED_DISCOVERY

TRENDING