langchain - ✅(Solved) Fix [RFC] Feature: TieredSemanticRouter – Automated Cost/Latency Optimization via Confidence-Based Model Fallbacks [1 pull requests, 3 comments, 3 participants]

langchain • 2026-02-14 22:15:45

GitHub stats

langchain-ai/langchain#35228 • Fetched 2026-04-08 00:27:02

Timeline (top)

labeled ×4, commented ×3, closed ×1, cross-referenced ×1

Fix Action

Proposed fix (not merged)

Proposed in PR: Feat/tiered semantic router (https://github.com/langchain-ai/langchain/pull/35245), which was closed without being merged.

PR fix notes

PR #35245: Feat/tiered semantic router

Repository: langchain-ai/langchain
Author: rimysore
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35245

Description (problem / solution / changelog)

Closes #35228

feat(core): implement TieredSemanticRouter for proactive routing

Summary: Implements the TieredSemanticRouter for proactive, complexity-based routing between models.

Verification:

Ran make format, make lint, and make test in libs/core.
All 1,690+ tests passed locally.
Added specific unit tests for routing logic, async, and config propagation.

Disclaimer: Note that this contribution was assisted by AI.

Changed files

libs/core/langchain_core/runnables/__init__.py (modified, +1/-0)
libs/core/langchain_core/runnables/tiered_router.py (added, +46/-0)
libs/core/tests/unit_tests/runnables/test_router.py (added, +72/-0)
libs/core/uv.lock (modified, +2/-2)

Code Example

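```python
# Proposed API: the linked PR adds this class under langchain_core/runnables/
# and was closed without merging.
from langchain_core.runnables import TieredSemanticRouter
from langchain_openai import ChatOpenAI

small_model = ChatOpenAI(model="gpt-4o-mini")
large_model = ChatOpenAI(model="gpt-4o")

router = TieredSemanticRouter(
    primary=small_model,
    fallback=large_model,
    threshold=0.85,
)

# prompt and parser are assumed to be defined elsewhere.
chain = prompt | router | parser
```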

RAW_BUFFER

Checked other resources
- [x] This is a feature request, not a bug report or usage question.
- [x] I added a clear and descriptive title that summarizes the feature request.
- [x] I used the GitHub search to find a similar feature request and didn't find it.
- [x] I checked the LangChain documentation and API reference to see if this feature already exists.
- [x] This is not related to the langchain-community package.

Package (Required)
- [x] langchain
- [x] langchain-core

Feature Description
Current fallback logic in LangChain is reactive, only triggering when a model crashes or times out. I propose a TieredSemanticRouter that intelligently routes traffic between small, fast models like GPT-4o-mini or Llama-3-8B and larger, more complex models based on a confidence check.
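To make the reactive/proactive distinction concrete, here is a minimal pure-Python sketch. It is not LangChain code: the `Model` signature, the `(answer, confidence)` return shape, and both helper functions are assumptions made for illustration only.

```python
from typing import Callable, Tuple

# Hypothetical model signature for this sketch: prompt -> (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def reactive_call(primary: Model, fallback: Model, prompt: str) -> str:
    """Reactive fallback: the second model runs only if the first one raises."""
    try:
        answer, _ = primary(prompt)
    except Exception:
        answer, _ = fallback(prompt)
    return answer


def confidence_call(primary: Model, fallback: Model, prompt: str, threshold: float) -> str:
    """Confidence-based fallback: also escalate when the primary answer scores low."""
    try:
        answer, confidence = primary(prompt)
    except Exception:
        answer, confidence = "", 0.0
    if confidence < threshold:
        answer, _ = fallback(prompt)
    return answer


def small(prompt: str) -> Tuple[str, float]:
    # Stub: pretends to be unsure about long prompts.
    return ("small:" + prompt, 0.95 if len(prompt) < 20 else 0.40)


def large(prompt: str) -> Tuple[str, float]:
    # Stub: always answers with high confidence.
    return ("large:" + prompt, 0.99)
```

With these stubs, `reactive_call` keeps the small model's answer for a long prompt because nothing raised, while `confidence_call` with `threshold=0.85` escalates the same prompt to the large model.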

Motivation
Enterprise users often default to expensive models to avoid edge-case failures, leading to much higher costs than necessary. A proactive router allows simple traffic to be handled at a lower cost while maintaining a high-quality fallback for complex queries.

Proposed API
```python
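# Proposed import path; langchain_core exposes runnables under `runnables` (plural).
from langchain_core.runnables import TieredSemanticRouter
from langchain_openai import ChatOpenAI

small_model = ChatOpenAI(model="gpt-4o-mini")
large_model = ChatOpenAI(model="gpt-4o")

router = TieredSemanticRouter(
    primary=small_model,
    fallback=large_model,
    threshold=0.85,
)

# prompt and parser are assumed to be defined elsewhere.
chain = prompt | router | parser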
```
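The issue does not say how the confidence score is computed or what happens at the threshold. As a rough sketch under stated assumptions, the class below scores the input before any model is called and sends it to `primary` when the score is at or above `threshold`. `TieredRouterSketch`, `scorer`, and `length_scorer` are hypothetical names and do not come from the PR.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TieredRouterSketch:
    """Hypothetical stand-in for the proposed TieredSemanticRouter (not the PR's code)."""

    primary: Callable[[str], str]
    fallback: Callable[[str], str]
    # Assumed scorer: confidence in [0, 1] that the primary model can handle the input.
    scorer: Callable[[str], float]
    threshold: float = 0.85

    def invoke(self, text: str) -> str:
        # Route before calling any model: high confidence goes to the primary tier.
        if self.scorer(text) >= self.threshold:
            return self.primary(text)
        return self.fallback(text)


def length_scorer(text: str) -> float:
    # Toy heuristic (assumption): inputs with fewer words are treated as simpler.
    return max(0.0, 1.0 - len(text.split()) / 50)


router = TieredRouterSketch(
    primary=lambda text: "small",
    fallback=lambda text: "large",
    scorer=length_scorer,
    threshold=0.85,
)
```

A real scorer would more plausibly be an embedding-based or classifier-based estimate; the word-count heuristic is only there to keep the sketch self-contained.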

Key Use Cases
1. Cost Reduction: Offloading basic tasks to smaller models.
2. Latency Optimization: Responding faster to simple queries.
3. Profitability: Improving the unit economics of AI applications.
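The cost argument can be written as a small expected-value calculation. The function below is a sketch: it ignores the cost of the confidence check itself, and any prices passed to it are hypothetical units rather than real provider pricing.

```python
def expected_cost(
    p_small: float,
    cost_small: float,
    cost_large: float,
    check_after_primary: bool,
) -> float:
    """Expected cost per request when a fraction p_small is served by the small model."""
    if not 0.0 <= p_small <= 1.0:
        raise ValueError("p_small must be in [0, 1]")
    if check_after_primary:
        # Every request pays for the small model; escalated ones also pay for the large one.
        return cost_small + (1.0 - p_small) * cost_large
    # Routing is decided up front, so each request pays for exactly one model.
    return p_small * cost_small + (1.0 - p_small) * cost_large
```

For example, with hypothetical costs of 1 unit for the small model and 10 units for the large one, and 70% of traffic handled by the small model, up-front routing costs about 3.7 units per request against 10 for always using the large model. The same formula applies to latency if the costs are replaced by per-call latencies.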

extent analysis

Problem Summary: LangChain has no built-in router that sends requests to a small or a large model based on a confidence check.

Root Cause Analysis: The current fallback logic in LangChain is reactive; it triggers only when a model crashes or times out.

Fix Plan: Implement a Tiered Semantic Router that routes traffic between small, fast models and larger, more capable models based on a confidence check.

Step-by-Step Solution Plan

1. Define Small and Large Models

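```python
# ChatOpenAI is provided by the langchain-openai package, not langchain_core.
from langchain_openai import ChatOpenAI

small_model = ChatOpenAI(model="gpt-4o-mini")
large_model = ChatOpenAI(model="gpt-4o")
```

2. Create a Tiered Semantic Router

```python
# Proposed import path; the linked PR adds the class under langchain_core/runnables/.
from langchain_core.runnables import TieredSemanticRouter

router = TieredSemanticRouter(
    primary=small_model,
    fallback=large_model,
    threshold=0.85,
)
```

3. Integrate the Router with the Chain

```python
# prompt and parser are assumed to be defined elsewhere.
chain = prompt | router | parser
```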

4. Update the Chain Configuration

Wherever an existing chain references a model directly, replace that reference with `router` so that requests pass through the routing step.

Example Use Case

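```python
from langchain_core.prompts import ChatPromptTemplate

# Define a prompt template; a bare string cannot be piped into a chain
prompt = ChatPromptTemplate.from_template("{question}")

# Define a parser (left unspecified here; StrOutputParser from
# langchain_core.output_parsers is one common choice)
parser = ...

# Create a chain with the Tiered Semantic Router (router as created in step 2)
chain = prompt | router | parser

# Run the chain; LCEL chains are executed with invoke(), not run()
result = chain.invoke({"question": "What is the capital of France?"})
print(result)
```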

Verification: Check the chain's output and confirm that each query is handled by the intended model.
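One way to check which model handled a query is to substitute stub models that record their calls. The sketch below assumes a simple threshold rule and fixed, made-up scores; `make_recording_model` and `route` are hypothetical helpers, not part of LangChain or the PR.

```python
from typing import Callable, List, Tuple


def make_recording_model(name: str, log: List[Tuple[str, str]]) -> Callable[[str], str]:
    """Return a stub model that records (model name, query) for each call."""

    def model(query: str) -> str:
        log.append((name, query))
        return f"{name} answer"

    return model


def route(query: str, scorer, primary, fallback, threshold: float) -> str:
    # Assumed rule: a score at or above the threshold goes to the primary model.
    return primary(query) if scorer(query) >= threshold else fallback(query)


calls: List[Tuple[str, str]] = []
small = make_recording_model("small", calls)
large = make_recording_model("large", calls)

# Hypothetical fixed scores so the check is deterministic.
scores = {
    "What is the capital of France?": 0.95,
    "Prove this theorem step by step.": 0.30,
}

for query in scores:
    route(query, scores.get, small, large, threshold=0.85)
```

After the loop, `calls` lists the model chosen for each query in order, which can be compared against the expected routing in a unit test.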
