litellm - ✅(Solved) Fix [Bug]: Async embedding requests to Azure AI Foundry (v1 API) fail with ResourceNotFound [2 pull requests, 1 participants]

apujari-netapp · 2026-03-31T07:35:12Z



GitHub stats

BerriAI/litellm#24848•Fetched 2026-04-08 01:59:12


Author

apujari-netapp

Participants

apujari-netapp

Timeline (top)

cross-referenced ×2, referenced ×2, labeled ×1

Error Message

openai.NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}

PR fix notes

PR #24911: fix(azure): forward api_version to aembedding() for Azure AI Foundry v1 endpoints

Repository: BerriAI/litellm
Author: Sameerlite
State: closed | merged: True
Link: https://github.com/BerriAI/litellm/pull/24911

Description (problem / solution / changelog)

What

Fixes a silent parameter drop in the async embedding path for Azure AI Foundry.

Root Cause

In BaseAzureLLM.embedding(), when aembedding=True, the call to self.aembedding() was missing api_version=api_version. This meant get_azure_openai_client() received None instead of "v1", causing _is_azure_v1_api_version() to return False and AsyncAzureOpenAI to be selected instead of AsyncOpenAI. The wrong client constructs Azure-specific URLs that don't exist on AI Foundry endpoints, resulting in a 404 ResourceNotFound.

The sync path was unaffected — it passed api_version directly to get_azure_openai_client(). Only the async path (which the proxy always uses) had this bug.
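The hand-off described above can be illustrated with a toy model. This is a minimal sketch, not LiteLLM's code: ToyAzureLLM, is_v1_api_version() and select_async_client() are hypothetical stand-ins for BaseAzureLLM, _is_azure_v1_api_version() and get_azure_openai_client(), the set of v1 variants is taken from the PR's test list, and the methods are synchronous for brevity.

```python
from typing import Optional

# Variants named in the PR's test list; the real predicate may accept more.
V1_API_VERSIONS = {"v1", "latest", "preview"}


def is_v1_api_version(api_version: Optional[str]) -> bool:
    """Hypothetical stand-in for _is_azure_v1_api_version()."""
    return api_version in V1_API_VERSIONS


def select_async_client(api_version: Optional[str]) -> str:
    """Hypothetical stand-in for get_azure_openai_client(); returns a class name."""
    return "AsyncOpenAI" if is_v1_api_version(api_version) else "AsyncAzureOpenAI"


class ToyAzureLLM:
    """Hypothetical miniature of the embedding() -> aembedding() hand-off."""

    def aembedding(self, api_version: Optional[str] = None) -> str:
        return select_async_client(api_version)

    def embedding_buggy(self, api_version: str) -> str:
        # api_version is known here but not forwarded, so aembedding() sees None.
        return self.aembedding()

    def embedding_fixed(self, api_version: str) -> str:
        # The one-line fix: forward the keyword argument.
        return self.aembedding(api_version=api_version)


llm = ToyAzureLLM()
buggy_client = llm.embedding_buggy(api_version="v1")
fixed_client = llm.embedding_fixed(api_version="v1")
print(buggy_client, fixed_client)
```

In this toy model the buggy path picks AsyncAzureOpenAI and the fixed path picks AsyncOpenAI, which mirrors the behavior the PR describes.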

Fix

One-line fix in litellm/llms/azure/azure.py: add api_version=api_version to the self.aembedding() call.

Tests

Added tests/litellm/llms/azure/test_azure_embedding.py with:

Regression test verifying api_version is forwarded through embedding() → aembedding()
Tests verifying get_azure_openai_client() returns AsyncOpenAI (not AsyncAzureOpenAI) for api_version="v1"
Tests verifying the /openai/v1/ base URL is used for v1 clients
Parametrized test covering all v1 variants ("v1", "latest", "preview")
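
The actual test file is not reproduced on this page. A regression test of the kind listed above might look roughly like the following sketch, which uses unittest.mock against a hypothetical ToyAzureLLM class rather than LiteLLM's BaseAzureLLM; the method signatures are assumptions made for illustration.

```python
from unittest.mock import MagicMock


class ToyAzureLLM:
    """Hypothetical stand-in for the class under test; not LiteLLM's real code."""

    def aembedding(self, *, input, api_version=None):
        raise NotImplementedError("replaced by a mock in the test")

    def embedding(self, *, input, api_version, aembedding=False):
        if aembedding:
            # The behavior under test: api_version must be forwarded.
            return self.aembedding(input=input, api_version=api_version)
        return "sync-result"


def check_api_version_forwarded(api_version: str) -> None:
    llm = ToyAzureLLM()
    llm.aembedding = MagicMock(return_value="async-result")
    result = llm.embedding(input=["hello"], api_version=api_version, aembedding=True)
    assert result == "async-result"
    # The regression check: the keyword must reach aembedding() unchanged.
    assert llm.aembedding.call_args.kwargs["api_version"] == api_version


# Loop over the v1 variants named in the PR, in place of a parametrized test.
for variant in ("v1", "latest", "preview"):
    check_api_version_forwarded(variant)
```

If the forwarding line in embedding() dropped api_version, the second assertion would fail with a KeyError, which is the kind of failure such a test is meant to catch.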

Fixes #24848

Changed files

litellm/llms/azure/azure.py (modified, +1/-0)
tests/litellm/llms/azure/__init__.py (added, +0/-0)
tests/litellm/llms/azure/test_azure_embedding.py (added, +94/-0)

PR #24958: Litellm oss staging 04 01 2026

Repository: BerriAI/litellm
Author: krrish-berri-2
State: closed | merged: False
Link: https://github.com/BerriAI/litellm/pull/24958

Description (problem / solution / changelog)

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Type

🆕 New Feature 🐛 Bug Fix 🧹 Refactoring 📖 Documentation 🚄 Infrastructure ✅ Test

Changes

Changed files

.circleci/config.yml (modified, +1022/-923)
.circleci/requirements.txt (added, +21/-0)
.devcontainer/post-create.sh (modified, +7/-7)
.gitguardian.yaml (modified, +1/-1)
.github/pull_request_template.md (modified, +0/-7)
.github/workflows/_test-unit-base.yml (modified, +19/-58)
.github/workflows/_test-unit-services-base.yml (modified, +39/-65)
.github/workflows/auto_update_price_and_context_window.yml (modified, +4/-5)
.github/workflows/codspeed.yml (modified, +5/-13)
.github/workflows/create-release.yml (modified, +1/-15)
.github/workflows/guard-main-branch.yml (removed, +0/-42)
.github/workflows/llm-translation-testing.yml (modified, +12/-11)
.github/workflows/publish_to_pypi.yml (modified, +37/-54)
.github/workflows/run_llm_translation_tests.py (modified, +3/-3)
.github/workflows/test-linting.yml (modified, +15/-20)
.github/workflows/test-litellm-matrix.yml (added, +214/-0)
.github/workflows/test-litellm.yml (modified, +15/-7)
.github/workflows/test-mcp.yml (modified, +17/-7)
.github/workflows/test-proxy-e2e-azure-batches.yml (added, +97/-0)
.github/workflows/test-unit-caching-redis.yml (added, +38/-0)
.github/workflows/test-unit-core-utils.yml (modified, +0/-3)
.github/workflows/test-unit-documentation.yml (modified, +21/-14)
.github/workflows/test-unit-enterprise-routing.yml (modified, +0/-3)
.github/workflows/test-unit-integrations.yml (modified, +0/-3)
.github/workflows/test-unit-llm-providers.yml (modified, +0/-10)
.github/workflows/test-unit-misc.yml (modified, +0/-3)
.github/workflows/test-unit-proxy-auth.yml (modified, +0/-3)
.github/workflows/test-unit-proxy-db.yml (modified, +2/-6)
.github/workflows/test-unit-proxy-endpoints.yml (modified, +0/-3)
.github/workflows/test-unit-proxy-infra.yml (modified, +0/-3)
.github/workflows/test-unit-proxy-legacy.yml (modified, +18/-11)
.github/workflows/test-unit-responses-caching-types.yml (modified, +0/-3)
.github/workflows/test-unit-security.yml (modified, +1/-3)
.github/workflows/test_server_root_path.yml (modified, +1/-7)
AGENTS.md (modified, +12/-13)
CLAUDE.md (modified, +4/-16)
CONTRIBUTING.md (modified, +4/-25)
Dockerfile (modified, +86/-55)
GEMINI.md (modified, +4/-4)
Makefile (modified, +47/-41)
README.md (modified, +76/-122)
cookbook/litellm-ollama-docker-image/requirements.txt (modified, +1/-1)
cookbook/misc/RELEASE_NOTES_GENERATION_INSTRUCTIONS.md (modified, +0/-26)
docker/Dockerfile.alpine (modified, +30/-41)
docker/Dockerfile.custom_ui (modified, +0/-8)
docker/Dockerfile.database (modified, +79/-50)
docker/Dockerfile.dev (modified, +82/-53)
docker/Dockerfile.health_check (modified, +11/-17)
docker/Dockerfile.non_root (modified, +174/-145)
docker/README.md (modified, +4/-4)
docker/build_from_pip/Dockerfile.build_from_pip (modified, +13/-39)
docker/build_from_pip/requirements.txt (added, +6/-0)
docker/entrypoint.sh (modified, +8/-11)
docker/install_auto_router.sh (modified, +2/-3)
docs/my-website/Dockerfile (modified, +2/-25)
docs/my-website/blog/april_townhall_announcement/index.md (removed, +0/-40)
docs/my-website/blog/april_townhall_updates/index.md (removed, +0/-162)
docs/my-website/blog/authors.yml (modified, +1/-1)
docs/my-website/blog/ci_cd_v2_improvements/index.md (modified, +0/-35)
docs/my-website/blog/redis_circuit_breaker/diagrams.js (removed, +0/-159)
docs/my-website/blog/redis_circuit_breaker/index.md (removed, +0/-141)
docs/my-website/blog/security_hardening_april_2026/index.md (removed, +0/-66)
docs/my-website/blog/security_townhall_updates/index.md (modified, +1/-34)
docs/my-website/blog/security_update_march_2026/index.md (modified, +0/-34)
docs/my-website/docs/adding_provider/generic_prompt_management_api.md (modified, +1/-1)
docs/my-website/docs/benchmarks.md (modified, +43/-51)
docs/my-website/docs/caching/all_caches.md (modified, +6/-6)
docs/my-website/docs/completion/anthropic_advisor_tool.md (removed, +0/-489)
docs/my-website/docs/completion/message_sanitization.md (modified, +1/-1)
docs/my-website/docs/completion/prompt_compression.md (removed, +0/-123)
docs/my-website/docs/contributing.md (modified, +1/-1)
docs/my-website/docs/default_code_snippet.md (modified, +1/-1)
docs/my-website/docs/extras/contributing_code.md (modified, +1/-1)
docs/my-website/docs/index.md (modified, +3/-3)
docs/my-website/docs/integrations/letta.md (modified, +2/-2)
docs/my-website/docs/langchain/langchain.md (modified, +1/-1)
docs/my-website/docs/learn/gateway_quickstart.md (modified, +1/-1)
docs/my-website/docs/learn/sdk_quickstart.md (modified, +1/-1)
docs/my-website/docs/load_test.md (modified, +1/-1)
docs/my-website/docs/load_test_advanced.md (modified, +2/-2)
docs/my-website/docs/mcp.md (modified, +1/-2)
docs/my-website/docs/mcp_aws_sigv4.md (modified, +3/-52)
docs/my-website/docs/mcp_oauth.md (modified, +1/-1)
docs/my-website/docs/mcp_toolsets.md (removed, +0/-231)
docs/my-website/docs/observability/braintrust.md (modified, +1/-1)
docs/my-website/docs/observability/lago.md (modified, +1/-1)
docs/my-website/docs/observability/langfuse_integration.md (modified, +4/-4)
docs/my-website/docs/observability/langfuse_otel_integration.md (modified, +1/-1)
docs/my-website/docs/observability/langsmith_integration.md (modified, +1/-1)
docs/my-website/docs/observability/levo_integration.md (modified, +3/-3)
docs/my-website/docs/observability/literalai_integration.md (modified, +1/-1)
docs/my-website/docs/observability/logfire_integration.md (modified, +5/-5)
docs/my-website/docs/observability/lunary_integration.md (modified, +2/-2)
docs/my-website/docs/observability/mlflow.md (modified, +2/-2)
docs/my-website/docs/observability/openmeter.md (modified, +1/-1)
docs/my-website/docs/observability/opentelemetry_integration.md (modified, +3/-3)
docs/my-website/docs/observability/phoenix_integration.md (modified, +2/-2)
docs/my-website/docs/observability/qualifire_integration.md (modified, +1/-1)
docs/my-website/docs/observability/ramp_integration.md (removed, +0/-131)
docs/my-website/docs/observability/raw_request_response.md (modified, +1/-1)


Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When LiteLLM routes an embedding request through the proxy (which is always async), the api_version value that was correctly resolved in main.py is silently dropped before it reaches the client selection logic. As a result, the wrong SDK client type is instantiated (AsyncAzureOpenAI instead of AsyncOpenAI), and that client constructs the wrong request URL for Azure AI Foundry endpoints.

Steps to Reproduce

Run the LiteLLM proxy.
Set api_version to v1 for the embedding model.
Send an embedding request through the proxy using that model.
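
The issue was reported through the proxy, and no client code is included. As a rough sketch only, the same async path might also be exercised directly from the SDK with litellm.aembedding(). The deployment name, endpoint and key below are hypothetical placeholders, and whether this reproduces the 404 outside the proxy is an assumption based on the PR's statement that only the async path was affected.

```python
def build_embedding_kwargs() -> dict:
    """Arguments for an async embedding call; all values are placeholders."""
    return {
        "model": "azure/my-embedding-deployment",  # hypothetical deployment name
        "input": ["hello world"],
        "api_base": "https://example.invalid",  # hypothetical AI Foundry endpoint
        "api_key": "placeholder-key",  # hypothetical credential
        "api_version": "v1",  # the setting named in the steps above
    }


async def reproduce() -> None:
    # Imported lazily so the sketch loads even where LiteLLM is not installed.
    import litellm

    response = await litellm.aembedding(**build_embedding_kwargs())
    print(response)


# To run against a real endpoint:
#   import asyncio; asyncio.run(reproduce())
```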

Relevant log output

openai.NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.82.6

Twitter / LinkedIn details

No response

extent analysis

TL;DR

The issue is resolved by forwarding api_version from BaseAzureLLM.embedding() to self.aembedding(), so that the client selection logic receives "v1" instead of None. PR #24911 (merged) makes this change.

Guidance

Confirm that api_version is set to v1 (or another v1 variant such as latest or preview) for the embedding model.
Check whether the installed LiteLLM includes PR #24911; the issue was reported on v1.82.6.
In litellm/llms/azure/azure.py, verify that the self.aembedding() call inside BaseAzureLLM.embedding() passes api_version=api_version.
Verify that get_azure_openai_client() returns AsyncOpenAI (not AsyncAzureOpenAI) for api_version="v1" and that requests use the /openai/v1/ base URL.

Example

The issue itself includes no code. The relevant call site is the self.aembedding() call in litellm/llms/azure/azure.py, as described in the PR fix notes.
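
To make the URL mismatch concrete, the sketch below contrasts the two URL shapes. The /openai/v1/ base path comes from the PR's test description, and the deployment-scoped shape is the one used by the classic Azure OpenAI API. The endpoint, deployment name and api-version value are hypothetical placeholders, and the exact URL the wrong client produced is not shown in the issue.

```python
def deployment_scoped_url(endpoint: str, deployment: str, api_version: str) -> str:
    """URL shape of the classic, deployment-scoped Azure OpenAI embeddings API."""
    base = endpoint.rstrip("/")
    return f"{base}/openai/deployments/{deployment}/embeddings?api-version={api_version}"


def v1_url(endpoint: str) -> str:
    """URL shape under the /openai/v1/ base path used for v1 clients."""
    base = endpoint.rstrip("/")
    return f"{base}/openai/v1/embeddings"


ENDPOINT = "https://example.invalid"  # hypothetical endpoint

# Placeholder deployment name and api-version, for illustration only.
print(deployment_scoped_url(ENDPOINT, "my-embedding-deployment", "2024-02-01"))
print(v1_url(ENDPOINT))
```

A server that only serves the second shape would plausibly answer the first with a 404, which is consistent with the reported ResourceNotFound error.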

Notes

The bug surfaces through the proxy because the proxy always uses the async path, but the defect itself is in BaseAzureLLM.embedding() in litellm/llms/azure/azure.py, not in the proxy. The sync path was unaffected.

Recommendation

Use a LiteLLM build that includes PR #24911, which forwards api_version to aembedding() so that the correct SDK client type (AsyncOpenAI) is instantiated for v1 endpoints.
