langchain-openrouter: ChatOpenRouter creates fresh httpx clients per instantiation (no default-client caching)


Summary

ChatOpenRouter.__init__ constructs a brand-new pair of httpx.Client / httpx.AsyncClient for every instance, with no default-client caching. Under any usage pattern that instantiates the model per request (LangGraph factory graphs, FastAPI dependency injection that returns a fresh model, etc.), each instance leaves TLS keep-alive sockets to openrouter.ai and httpx pool state open for the lifetime of its unclosed pools.

This is the same class of bug that was fixed for AzureChatOpenAI in #32489 / PR #32531. langchain_openai.BaseChatOpenAI uses _get_default_httpx_client / _get_default_async_httpx_client from _client_utils to cache a single default httpx client per (base_url, timeout); langchain-openrouter does not.

Source

Package: langchain-openrouter==0.2.3 (latest), Python 3.13.

Relevant code in langchain_openrouter/chat_models.py (approximately lines 390–397 in the installed wheel):

if extra_headers:
    import httpx

    client_kwargs["client"] = httpx.Client(
        headers=extra_headers, follow_redirects=True
    )
    client_kwargs["async_client"] = httpx.AsyncClient(
        headers=extra_headers, follow_redirects=True
    )

These clients are then passed into openrouter.OpenRouter(**client_kwargs). When extra_headers is empty, the underlying openrouter SDK still constructs its own httpx clients per instance, so the leak path exists in both branches.

There is no _get_default_*_httpx_client style caching, no aclose() on instance teardown, and no shared module-level client.

Minimal reproduction

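import asyncio
import gc
import os

import psutil  # third-party: pip install psutil
from langchain_openrouter import ChatOpenRouter

os.environ.setdefault("OPENROUTER_API_KEY", "sk-or-...")  # any valid key

def count_sockets():
    # Count established TCP connections owned by this process.
    # Process.net_connections() requires psutil >= 6.0.
    p = psutil.Process()
    return sum(1 for c in p.net_connections(kind="tcp") if c.status == "ESTABLISHED")

async def main():
    print("before:", count_sockets())
    for _ in range(20):
        # Each fresh instance brings its own httpx clients and pools.
        m = ChatOpenRouter(model="openai/gpt-4o-mini")
        await m.ainvoke("hi")
        del m
    gc.collect()  # rule out "not yet collected" as the explanation
    print("after 20 fresh instances:", count_sockets())

asyncio.run(main())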

Expected (with proper default-client caching): socket count stays roughly constant.

Observed: socket count grows linearly with instance count and does not return to baseline after del + gc.collect(), because the httpx pools embedded in each discarded instance retain their TLS connections until interpreter exit.

Why this matters in practice

LangGraph "factory graphs" (the graph constructor takes a RunnableConfig and is invoked per request; this is documented behavior in aegra / langgraph-server) create a fresh ChatOpenRouter per request unless the caller wraps the factory in a manual cache. Streaming responses pin connections (see openai/openai-python#763) and each connection pool is sized for a single client, so per-request instantiation hits resource limits in production well before users hit any model rate limit.
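Until a fix lands, the "manual cache" mentioned above could look roughly like the following. This is a minimal sketch, not a LangGraph or langchain-openrouter API: StubChatModel stands in for ChatOpenRouter so the example runs without network access, and get_model / make_graph are hypothetical names chosen for illustration.

```python
from functools import lru_cache


class StubChatModel:
    """Stand-in for ChatOpenRouter; counts how many instances get built."""

    built = 0

    def __init__(self, model: str, temperature: float = 0.0) -> None:
        StubChatModel.built += 1
        self.model = model
        self.temperature = temperature


@lru_cache(maxsize=8)
def get_model(model: str, temperature: float = 0.0) -> StubChatModel:
    # Only hashable, config-level arguments belong in the cache key.
    # In real code this would return ChatOpenRouter(model=model, ...).
    return StubChatModel(model=model, temperature=temperature)


def make_graph(config: dict) -> dict:
    """Hypothetical per-request factory that reuses the cached model."""
    configurable = config.get("configurable", {})
    model_name = configurable.get("model", "openai/gpt-4o-mini")
    return {"model": get_model(model_name)}


first = make_graph({"configurable": {"model": "openai/gpt-4o-mini"}})
second = make_graph({})
print(first["model"] is second["model"], StubChatModel.built)
```

Both factory calls resolve to the same model name, so only one instance is built and both requests share it (and, with the real class, its httpx pools). The trade-off is that every caller has to remember to go through the cache, which is why caching at the library level is the more robust fix.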

Suggested fix

Mirror PR #32531:

  • Import _get_default_httpx_client / _get_default_async_httpx_client from langchain_openai._client_utils (or replicate in langchain_openrouter).
  • In the model-validator path that currently constructs httpx.Client(...) / httpx.AsyncClient(...), fall through to the cached helper when the user hasn't provided http_client / http_async_client explicitly.
  • Expose http_client / http_async_client as constructor kwargs so callers who do want a custom pool can pass one in (matching ChatOpenAI's public surface).
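The caching idea in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the code from PR #32531 or from langchain_openai._client_utils: make_cached_client_getter and RecordingClient are hypothetical names, RecordingClient stands in for httpx.Client so the block runs offline, and including headers in the cache key is an assumption made here because ChatOpenRouter bakes extra_headers into its clients.

```python
from functools import lru_cache
from typing import Callable, Mapping, Optional, Tuple, TypeVar

T = TypeVar("T")
HeaderKey = Tuple[Tuple[str, str], ...]


def make_cached_client_getter(
    factory: Callable[..., T], maxsize: int = 16
) -> Callable[..., T]:
    """Return a getter that builds one client per (base_url, timeout, headers)."""

    @lru_cache(maxsize=maxsize)
    def _cached(base_url: str, timeout: Optional[float], headers: HeaderKey) -> T:
        # Runs only on a cache miss; later calls with the same key reuse the client.
        return factory(base_url=base_url, timeout=timeout, headers=dict(headers))

    def get(
        base_url: str,
        timeout: Optional[float] = None,
        headers: Optional[Mapping[str, str]] = None,
    ) -> T:
        # Dicts are unhashable, so freeze headers into a sorted tuple for the key.
        frozen = tuple(sorted((headers or {}).items()))
        return _cached(base_url, timeout, frozen)

    return get


class RecordingClient:
    """Stand-in for httpx.Client so the sketch runs without network access."""

    created = 0

    def __init__(self, base_url: str, timeout: Optional[float], headers: dict) -> None:
        RecordingClient.created += 1
        self.base_url, self.timeout, self.headers = base_url, timeout, headers


get_default_client = make_cached_client_getter(RecordingClient)

# The URL and header values below are illustrative only.
a = get_default_client("https://openrouter.ai/api/v1", 30.0, {"X-Title": "demo"})
b = get_default_client("https://openrouter.ai/api/v1", 30.0, {"X-Title": "demo"})
c = get_default_client("https://openrouter.ai/api/v1", 60.0, {"X-Title": "demo"})
print(a is b, a is c, RecordingClient.created)
```

Identical settings return the same client object, while a different timeout produces a second one. In a real patch the factory would wrap httpx.Client (and a second getter would wrap httpx.AsyncClient), and the cached path would apply only when the caller has not supplied a client of their own.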

Happy to send a PR mirroring #32531 if a maintainer can confirm this is the desired direction.

Environment

  • langchain-openrouter==0.2.3
  • httpx>=0.27
  • Python 3.13
  • Linux (also reproduces on macOS)
