litellm - ✅(Solved) Fix [Bug]: Confirmed bug: ssl_verify=false not working if using streaming Text Completion API [2 pull requests, 1 comment, 2 participants]

GitHub stats
BerriAI/litellm#26053 · Fetched 2026-04-19 15:06:07
Comments: 1 · Participants: 2 · Timeline: 8 · Reactions: 0
Timeline (top): labeled ×3, cross-referenced ×2, referenced ×2, commented ×1

Error Message

# WORKS: direct to llama-server with -k (skip SSL check manually)
curl -sk https://127.0.0.1:8099/v1/completions \
-H "Content-Type: application/json" \
-d '{"model":"ggml-org/gemma-3-1b-it-GGUF:Q4_K_M","prompt":"Once upon a time","stream":true,"max_tokens":10}'

# WORKS: LiteLLM non-streaming; ssl_verify: false is correctly applied here
curl -s http://localhost:4001/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-test" \
-d '{"model":"test-textcompletion","prompt":"Once upon a time","max_tokens":10}'

# BUG: LiteLLM streaming text completion; ssl_verify: false is NOT applied
# Returns: {"error":{"message":"litellm.APIConnectionError: ... Connection error."}}
curl -s http://localhost:4001/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-test" \
-d '{"model":"test-textcompletion","prompt":"Once upon a time","stream":true,"max_tokens":10}'

Fix Action

Fixed

PR fix notes

PR #26057: fix(openai): use ssl-aware http client in text completion streaming

Description (problem / solution / changelog)

Root Cause

The streaming(), async_streaming(), and sync non-streaming completion() methods in OpenAITextCompletion (litellm/llms/openai/completion/handler.py) used litellm.client_session and litellm.aclient_session directly when creating OpenAI clients. These global sessions bypass SSL verification settings configured via ssl_verify=False.

Meanwhile, the acompletion() method already used BaseOpenAILLM._get_async_http_client() which correctly calls get_ssl_configuration() to respect the ssl_verify parameter.
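
As an illustration of what "respecting ssl_verify" amounts to at the TLS layer, here is a minimal standard-library sketch. The function name build_ssl_context is hypothetical, and this is not LiteLLM's get_ssl_configuration(); it only shows one common way a boolean flag can be mapped onto an ssl.SSLContext.

```python
import ssl


def build_ssl_context(ssl_verify: bool) -> ssl.SSLContext:
    """Hypothetical helper: map a boolean ssl_verify flag onto an SSLContext."""
    # The default context verifies both the certificate chain and the hostname.
    ctx = ssl.create_default_context()
    if not ssl_verify:
        # check_hostname has to be switched off before verify_mode can be
        # set to CERT_NONE; the ssl module raises ValueError otherwise.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


strict = build_ssl_context(True)
relaxed = build_ssl_context(False)
print(strict.verify_mode, relaxed.verify_mode)
```

A client built from the relaxed context would accept the self-signed certificate from the reproduction steps, while one built from the strict context would reject it. The bug described here is that some code paths never received the relaxed configuration.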

Fix

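Changed the three remaining code paths (streaming(), async_streaming(), and sync non-streaming completion()) to use BaseOpenAILLM._get_sync_http_client() and BaseOpenAILLM._get_async_http_client(), which call get_ssl_configuration() and respect the ssl_verify setting. Together with acompletion(), which already used the async helper, all four code paths now build their HTTP client the same way.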

Before

# streaming() - line 227
http_client=litellm.client_session,        # ignores ssl_verify

# async_streaming() - line 288  
http_client=litellm.aclient_session,       # ignores ssl_verify

# completion() sync non-streaming - line 117
http_client=litellm.client_session,        # ignores ssl_verify

After

# streaming()
http_client=BaseOpenAILLM._get_sync_http_client(),    # respects ssl_verify

# async_streaming()
http_client=BaseOpenAILLM._get_async_http_client(),   # respects ssl_verify

# completion() sync non-streaming
http_client=BaseOpenAILLM._get_sync_http_client(),    # respects ssl_verify

Testing

The fix is consistent with the existing acompletion() method (line 171) which already used BaseOpenAILLM._get_async_http_client(). The existing test test_acompletion_uses_optimized_http_client in tests/llm_translation/test_text_completion_unit_tests.py validates this pattern.
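
The kind of check described above can be sketched with unittest.mock. This is not the test from tests/llm_translation/test_text_completion_unit_tests.py; FakeTextCompletionHandler and streaming_uses_helper are hypothetical stand-ins, used only to show how a test can assert that a code path obtains its client from the helper.

```python
from unittest import mock


class FakeTextCompletionHandler:
    """Hypothetical stand-in for a handler with a streaming() code path."""

    @staticmethod
    def _get_sync_http_client():
        return object()

    def streaming(self):
        # Mirrors the fixed pattern: the client comes from the helper,
        # not from a module-level session.
        return {"http_client": self._get_sync_http_client()}


def streaming_uses_helper() -> bool:
    """Return True if streaming() passes along exactly what the helper returns."""
    sentinel = object()
    with mock.patch.object(
        FakeTextCompletionHandler, "_get_sync_http_client", return_value=sentinel
    ) as helper:
        kwargs = FakeTextCompletionHandler().streaming()
    # The helper must have been consulted exactly once.
    helper.assert_called_once()
    return kwargs["http_client"] is sentinel


print(streaming_uses_helper())
```

A regression that goes back to reading a global session would make assert_called_once() fail, which is the property such a test is meant to guard.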

Impact

  • Changes 3 lines in a single file
  • No API changes, no new dependencies
  • Fixes #26053

Changed files

  • litellm/llms/openai/completion/handler.py (modified, +3/-3)

PR #26058: fix: apply ssl_verify to streaming text completion requests (#26053)

Description (problem / solution / changelog)

Summary

Fixes #26053 — ssl_verify: false was silently ignored for streaming Text Completion API requests while working correctly for non-streaming and Chat Completion requests.

Root cause: streaming() and async_streaming() in litellm/llms/openai/completion/handler.py passed litellm.client_session / litellm.aclient_session directly as the http_client for the OpenAI SDK client. Both values are None by default, causing the SDK to create its own httpx client with no SSL overrides.

The acompletion() method (async non-streaming) already used BaseOpenAILLM._get_async_http_client(), which reads litellm.ssl_verify and builds a properly configured httpx client — that's why async non-streaming worked.

Fix: Replace the three raw litellm.client_session / litellm.aclient_session references with BaseOpenAILLM._get_sync_http_client() / BaseOpenAILLM._get_async_http_client(), consistent with how chat completions and async text completions already work.
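
The mechanism described in this summary can be modeled with a small toy example. Everything here is a stand-in: FakeHttpClient, sdk_client, GLOBAL_SESSION, SSL_VERIFY, and get_ssl_aware_client are hypothetical names, not LiteLLM or OpenAI SDK APIs. The only behavior assumed is what the summary states: a None client makes the SDK fall back to its own default, verifying client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeHttpClient:
    verify: bool


def sdk_client(http_client: Optional[FakeHttpClient]) -> FakeHttpClient:
    # Models an SDK that builds its own default (verifying) client
    # when the caller passes None.
    if http_client is None:
        return FakeHttpClient(verify=True)
    return http_client


GLOBAL_SESSION: Optional[FakeHttpClient] = None  # stands in for a session that is None by default
SSL_VERIFY = False  # stands in for the user's ssl_verify: false setting


def get_ssl_aware_client() -> FakeHttpClient:
    # Models a helper that reads the setting before building the client.
    return FakeHttpClient(verify=SSL_VERIFY)


buggy = sdk_client(GLOBAL_SESSION)          # setting never reaches the client
fixed = sdk_client(get_ssl_aware_client())  # setting is applied
print(buggy.verify, fixed.verify)
```

In the buggy path the setting is never consulted, which matches the "silently ignored" behavior reported in the issue.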

Changed paths:

  • litellm/llms/openai/completion/handler.py
    • completion() — sync non-streaming path
    • streaming() — sync streaming path
    • async_streaming() — async streaming path (the primary proxy code path, and the one that manifests the bug)

Test plan

  • Start a server with a self-signed cert (no IP SAN) so ssl_verify: true always fails
  • Configure ssl_verify: false in litellm_settings
  • Confirm POST /v1/completions non-streaming returns a valid response
  • Confirm POST /v1/completions with stream: true now streams successfully (previously returned APIConnectionError)
  • Confirm POST /v1/chat/completions with stream: true still works (no regression)
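
When scripting the streaming checks in this test plan, the response body has to be read as server-sent events. The sketch below assumes the OpenAI-style format, where each chunk arrives as a `data: {json}` line and the stream ends with `data: [DONE]`. The function name collect_stream_text and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, List


def collect_stream_text(lines: Iterable[str]) -> List[str]:
    """Extract the text fragments from OpenAI-style SSE lines."""
    fragments: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            fragments.append(choice.get("text", ""))
    return fragments


# Made-up sample chunks in the text completion streaming shape.
sample = [
    'data: {"choices": [{"text": " there", "index": 0}]}',
    "",
    'data: {"choices": [{"text": " was", "index": 0}]}',
    "data: [DONE]",
]
print("".join(collect_stream_text(sample)))
```

An empty result means no `data:` chunks arrived, which is what a script would see if the proxy answered with a plain JSON error body instead of a stream.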

Changed files

  • .circleci/config.yml (modified, +9/-161)
  • .github/scripts/close_duplicate_issues.py (modified, +31/-9)
  • .github/scripts/scan_keywords.py (modified, +14/-6)
  • .github/workflows/guard-main-branch.yml (added, +42/-0)
  • .github/workflows/test-linting.yml (modified, +5/-1)
  • .github/workflows/test-litellm-ui-build.yml (modified, +5/-1)
  • .github/workflows/test-mcp.yml (modified, +5/-1)
  • .github/workflows/test-model-map.yaml (modified, +5/-1)
  • .github/workflows/test-unit-core-utils.yml (modified, +5/-1)
  • .github/workflows/test-unit-documentation.yml (modified, +5/-1)
  • .github/workflows/test-unit-enterprise-routing.yml (modified, +5/-1)
  • .github/workflows/test-unit-integrations.yml (modified, +5/-1)
  • .github/workflows/test-unit-llm-providers.yml (modified, +5/-1)
  • .github/workflows/test-unit-misc.yml (modified, +5/-1)
  • .github/workflows/test-unit-proxy-auth.yml (modified, +5/-1)
  • .github/workflows/test-unit-proxy-db.yml (modified, +1/-1)
  • .github/workflows/test-unit-proxy-endpoints.yml (modified, +5/-1)
  • .github/workflows/test-unit-proxy-infra.yml (modified, +5/-1)
  • .github/workflows/test-unit-proxy-legacy.yml (modified, +5/-1)
  • .github/workflows/test-unit-responses-caching-types.yml (modified, +5/-1)
  • .github/workflows/test-unit-security.yml (modified, +1/-1)
  • .github/workflows/test_server_root_path.yml (modified, +6/-2)
  • CLAUDE.md (modified, +1/-1)
  • ci_cd/run_migration.py (modified, +3/-1)
  • cookbook/anthropic_agent_sdk/agent_with_mcp.py (modified, +27/-23)
  • cookbook/anthropic_agent_sdk/common.py (modified, +33/-29)
  • cookbook/anthropic_agent_sdk/main.py (modified, +23/-19)
  • cookbook/litellm_proxy_server/batch_api/bedrock/bedrock.py (modified, +3/-3)
  • cookbook/litellm_proxy_server/cli_token_usage.py (modified, +10/-9)
  • cookbook/litellm_proxy_server/mcp/mcp_with_litellm_proxy.py (modified, +6/-5)
  • cookbook/litellm_proxy_server/secret_manager/my_secret_manager.py (modified, +4/-3)
  • cookbook/livekit_agent_sdk/main.py (modified, +42/-33)
  • cookbook/misc/RELEASE_NOTES_GENERATION_INSTRUCTIONS.md (modified, +26/-0)
  • cookbook/misc/test_responses_api.py (modified, +5/-9)
  • cookbook/nova_sonic_realtime.py (modified, +7/-4)
  • cookbook/veo_video_generation.py (modified, +87/-77)
  • deploy/charts/litellm-helm/templates/deployment.yaml (modified, +2/-2)
  • deploy/charts/litellm-helm/templates/migrations-job.yaml (modified, +2/-2)
  • deploy/charts/litellm-helm/tests/deployment_tests.yaml (modified, +58/-0)
  • deploy/charts/litellm-helm/tests/migrations-job_tests.yaml (modified, +66/-0)
  • docs/my-website/blog/claude_opus_4_7/index.md (added, +366/-0)
  • docs/my-website/docs/completion/prompt_compression.md (added, +123/-0)
  • docs/my-website/docs/index.md (modified, +4/-0)
  • docs/my-website/docs/providers/github_copilot.md (modified, +7/-0)
  • docs/my-website/docs/proxy/config_settings.md (modified, +11/-0)
  • docs/my-website/docs/proxy/cost_tracking.md (modified, +4/-0)
  • docs/my-website/docs/proxy/guardrails/hiddenlayer.md (modified, +1/-0)
  • docs/my-website/docs/proxy/guardrails/policy_flow_builder.md (modified, +99/-5)
  • docs/my-website/docs/proxy/health.md (modified, +25/-1)
  • docs/my-website/docs/proxy/ui_team_soft_budget_alerts.md (modified, +10/-0)
  • docs/my-website/docs/proxy/users.md (modified, +61/-0)
  • docs/my-website/docs/skills_gateway.md (added, +111/-0)
  • docs/my-website/docs/troubleshoot/cost_discrepancy.md (added, +205/-0)
  • docs/my-website/docs/tutorials/claude_code_byok.md (modified, +11/-0)
  • docs/my-website/docusaurus.config.js (modified, +27/-1)
  • docs/my-website/img/release_notes/guardrail_fallbacks.png (added, +0/-0)
  • docs/my-website/package-lock.json (modified, +403/-0)
  • docs/my-website/package.json (modified, +2/-0)
  • docs/my-website/release_notes/v1.83.3/index.md (modified, +337/-51)
  • docs/my-website/release_notes/v1.83.7.rc.1/index.md (added, +223/-0)
  • docs/my-website/sidebars.js (modified, +14/-0)
  • docs/my-website/src/pages/index.md (modified, +4/-0)
  • docs/my-website/static/img/cost-discrepancy-debug/date-range-picker.png (added, +0/-0)
  • docs/my-website/static/img/cost-discrepancy-debug/go-to-model-activity.png (added, +0/-0)
  • docs/my-website/static/img/cost-discrepancy-debug/scroll-to-model.png (added, +0/-0)
  • docs/my-website/static/img/cost-discrepancy-debug/token-categories.png (added, +0/-0)
  • enterprise/litellm_enterprise/enterprise_callbacks/send_emails/base_email.py (modified, +168/-28)
  • enterprise/pyproject.toml (modified, +2/-2)
  • litellm-proxy-extras/litellm_proxy_extras/_logging.py (modified, +5/-2)
  • litellm-proxy-extras/litellm_proxy_extras/migrations/20260401000000_add_budget_limits/migration.sql (added, +5/-0)
  • litellm-proxy-extras/litellm_proxy_extras/migrations/20260401000000_add_team_member_model_scope/migration.sql (added, +9/-0)
  • litellm-proxy-extras/litellm_proxy_extras/migrations/20260414140000_add_mcp_server_instructions/migration.sql (added, +2/-0)
  • litellm-proxy-extras/litellm_proxy_extras/migrations/20260415120000_health_check_latest_per_model_index/migration.sql (added, +12/-0)
  • litellm-proxy-extras/litellm_proxy_extras/schema.prisma (modified, +7/-1)
  • litellm-proxy-extras/litellm_proxy_extras/utils.py (modified, +76/-21)
  • litellm-proxy-extras/pyproject.toml (modified, +2/-2)
  • litellm/__init__.py (modified, +70/-54)
  • litellm/_internal_context.py (added, +13/-0)
  • litellm/_lazy_imports.py (modified, +1/-0)
  • litellm/_logging.py (modified, +2/-0)
  • litellm/a2a_protocol/main.py (modified, +3/-3)
  • litellm/a2a_protocol/providers/bedrock_agentcore/handler.py (modified, +1/-3)
  • litellm/a2a_protocol/streaming_iterator.py (modified, +3/-3)
  • litellm/anthropic_beta_headers_config.json (modified, +4/-4)
  • litellm/anthropic_interface/__init__.py (modified, +1/-0)
  • litellm/anthropic_interface/messages/__init__.py (modified, +2/-2)
  • litellm/caching/caching_handler.py (modified, +6/-4)
  • litellm/caching/gcs_cache.py (modified, +1/-0)
  • litellm/caching/in_memory_cache.py (modified, +4/-3)
  • litellm/completion_extras/litellm_responses_transformation/handler.py (modified, +1/-3)
  • litellm/completion_extras/litellm_responses_transformation/transformation.py (modified, +20/-16)
  • litellm/compression/__init__.py (added, +3/-0)
  • litellm/compression/compress.py (added, +257/-0)
  • litellm/compression/content_detection.py (added, +45/-0)
  • litellm/compression/message_stubbing.py (added, +120/-0)
  • litellm/compression/retrieval_tool.py (added, +35/-0)
  • litellm/compression/scoring/__init__.py (added, +4/-0)
  • litellm/compression/scoring/bm25.py (added, +123/-0)
  • litellm/compression/scoring/embedding_scorer.py (added, +95/-0)
  • litellm/constants.py (modified, +51/-2)


Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

(AI disclaimer: I ran into this issue myself on my homelab, I just asked AI to reproduce it locally on a single system so I can attach easy reproduction steps to this issue. I tested the steps before posting.)

The bug: when proxying to an upstream server that has a bad SSL certificate, ssl_verify=false allows non-streaming Text Completions to work, but streaming Text Completions fail. (Chat Completions also work, of course.)

Steps to Reproduce

REQUIRED: llama-server and docker installed

Initial setup:

mkdir /tmp/litellm-sslbug && cd /tmp/litellm-sslbug

# 1. Self-signed cert with no IP SAN (so SSL verification always fails without curl -k)
openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem \
-days 30 -nodes -subj "/CN=TestCert" 2>/dev/null

# 2. Start llama-server with SSL. (NOTE: downloads 800MB model once)
llama-server -hf ggml-org/gemma-3-1b-it-GGUF:Q4_K_M \
--host 0.0.0.0 --port 8099 \
--ssl-key-file key.pem --ssl-cert-file cert.pem

# Confirm llama-server HTTPS is up
curl -sk https://127.0.0.1:8099/health

# 3. LiteLLM config with ssl_verify: false

cat > litellm-config.yaml << 'EOF'
model_list:
  - model_name: test-textcompletion
    litellm_params:
      model: text-completion-openai/ggml-org/gemma-3-1b-it-GGUF:Q4_K_M
      api_base: https://host.docker.internal:8099/v1  # host IP as seen from Docker container
      api_key: no-key

litellm_settings:
  ssl_verify: false  # intended to skip SSL verification - but broken for streaming text completions

general_settings:
  master_key: sk-test
EOF

# 4. LiteLLM in Docker on port 4001

cat > docker-compose.yml << 'EOF'
services:
  litellm:
    image: docker.litellm.ai/berriai/litellm:main-latest
    container_name: litellm-sslbug
    ports:
      - 4001:4000
    volumes:
      - ./litellm-config.yaml:/app/config.yaml
    extra_hosts:
      - "host.docker.internal:host-gateway"
    command: [--config, /app/config.yaml, --port, "4000", --host, "0.0.0.0"]
EOF

docker compose up -d

# Confirm LiteLLM is up
curl -sk -H 'Authorization: Bearer sk-test' http://127.0.0.1:4001/health

Tests:

# WORKS: direct to llama-server with -k (skip SSL check manually)
curl -sk https://127.0.0.1:8099/v1/completions \
-H "Content-Type: application/json" \
-d '{"model":"ggml-org/gemma-3-1b-it-GGUF:Q4_K_M","prompt":"Once upon a time","stream":true,"max_tokens":10}'

# WORKS: LiteLLM non-streaming — ssl_verify: false is correctly applied here
curl -s http://localhost:4001/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-test" \
-d '{"model":"test-textcompletion","prompt":"Once upon a time","max_tokens":10}'

# BUG: LiteLLM streaming text completion — ssl_verify: false is NOT applied
# Returns: {"error":{"message":"litellm.APIConnectionError: ... Connection error."}}
curl -s http://localhost:4001/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-test" \
-d '{"model":"test-textcompletion","prompt":"Once upon a time","stream":true,"max_tokens":10}'
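
For scripted regression checks, the two proxy requests above can also be built in Python with the standard library. This is a sketch: build_completion_request is a hypothetical helper name, and the URL, model name, and key are the ones from the reproduction steps. The code only constructs the requests; nothing is sent until urllib.request.urlopen() is called on them.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, stream=False,
                             api_key=None, max_tokens=10):
    """Build a POST request for /v1/completions without sending it."""
    body = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    if stream:
        body["stream"] = True  # the only difference between the two test cases
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


non_streaming = build_completion_request(
    "http://localhost:4001", "test-textcompletion", "Once upon a time",
    api_key="sk-test",
)
streaming = build_completion_request(
    "http://localhost:4001", "test-textcompletion", "Once upon a time",
    stream=True, api_key="sk-test",
)
print(streaming.full_url, json.loads(streaming.data))
```

Passing either request to urllib.request.urlopen() against the running proxy sends it; urlopen raises urllib.error.HTTPError on an error status, so a script can tell the failing streaming case apart from the working non-streaming one.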

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

latest docker image (5cfceb7aa09c)

Twitter / LinkedIn details

No response

extent analysis

TL;DR

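The configuration in the report is correct. The bug was in LiteLLM's code: the streaming text completion paths in litellm/llms/openai/completion/handler.py passed litellm.client_session / litellm.aclient_session as the HTTP client, so ssl_verify: false never reached the upstream request. PRs #26057 and #26058 switch those paths to the SSL-aware client helpers.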

Guidance

  • Confirm the symptom matches: with ssl_verify: false under litellm_settings, a non-streaming POST /v1/completions succeeds while the same request with stream: true returns litellm.APIConnectionError.
  • Do not expect a configuration change to help on an affected build; the setting is valid, but the streaming code paths do not use it.
  • Run a LiteLLM build that includes the handler.py change described in PR #26057 / PR #26058.
  • After upgrading, re-run the streaming request from the reproduction steps and confirm that it streams; also confirm that POST /v1/chat/completions with stream: true still works.

Example

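The code change is shown in the PR #26057 notes above: in streaming(), async_streaming(), and the sync non-streaming completion() path, http_client=litellm.client_session (or litellm.aclient_session) is replaced with BaseOpenAILLM._get_sync_http_client() (or BaseOpenAILLM._get_async_http_client()).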

Notes

Per PR #26058, litellm.client_session and litellm.aclient_session are None by default, so the OpenAI SDK created its own httpx client with no SSL overrides. acompletion() already used BaseOpenAILLM._get_async_http_client(), which is why async non-streaming requests were unaffected.

Recommendation

Upgrade to a LiteLLM build that contains the fix rather than changing configuration. On an affected build, the report shows non-streaming text completions and Chat Completions honoring ssl_verify: false, so those remain usable in the meantime.
