langchain - (How to fix) Add support for server-side sparse inference in QdrantVectorStore

GitHub stats
langchain-ai/langchain#36196, fetched 2026-04-08 01:21:35
Comments: 0 · Participants: 1 · Timeline: 4 · Reactions: 1
Timeline (top): labeled ×3, issue_type_added ×1

Checked other resources

  • This is a feature request, not a bug report or usage question.
  • I added a clear and descriptive title that summarizes the feature request.
  • I used the GitHub search to find a similar feature request and didn't find it.
  • I checked the LangChain documentation and API reference to see if this feature already exists.
  • This is not related to the langchain-community package.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Feature Description

I would like LangChain to support server-side inference for BM25 sparse representations of points in Qdrant.

The qdrant_client library supports generating sparse embeddings from text input directly on the server (see here), as well as inference at query time.

Using QdrantVectorStore in hybrid retrieval mode requires an instance of SparseEmbeddings for the sparse representations, and FastEmbedSparse is currently the only implementation.
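
For readers unfamiliar with sparse representations, the sketch below shows the general shape of what such an encoder produces: a list of token indices paired with weights. It is an illustration only. The hashing scheme (`token_index`, `VOCAB_SIZE`) and the default `avg_len` are hypothetical, the weights use BM25-style term-frequency saturation without the IDF component (which needs corpus statistics), and none of this is claimed to match what Qdrant or FastEmbed actually computes.

```python
import re
import zlib
from collections import Counter

VOCAB_SIZE = 2**20  # hypothetical index space, chosen for illustration
K1 = 1.2            # commonly used BM25 saturation parameter
B = 0.75            # commonly used BM25 length-normalization parameter


def token_index(token: str) -> int:
    """Map a token to a stable integer index (hypothetical hashing scheme)."""
    return zlib.crc32(token.encode("utf-8")) % VOCAB_SIZE


def sparse_tf(text: str, avg_len: float = 10.0) -> tuple[list[int], list[float]]:
    """Return (indices, values) holding BM25-style term-frequency weights."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(token_index(t) for t in tokens)
    # Length normalization: longer-than-average texts get smaller weights.
    norm = K1 * (1 - B + B * len(tokens) / avg_len)
    indices = sorted(counts)
    values = [counts[i] * (K1 + 1) / (counts[i] + norm) for i in indices]
    return indices, values


indices, values = sparse_tf("sparse vectors keep only the non-zero terms")
```

Only the tokens that occur in the text get an entry, which is why the representation is stored as parallel index and value lists rather than as a dense array.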

Use Case

I'm trying to build an application that restricts external API calls, such as the ones the FastEmbedSparse class makes to HuggingFace.

Currently, I'm trying to work around this by implementing the SparseEmbeddings interface and overriding its methods to use the underlying qdrant_client functions to generate the sparse embedding.
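
The workaround described above could be sketched roughly as follows. To stay self-contained, the block defines local stand-ins (`SparseVector`, `SparseEmbeddings`) that mirror the shape of the langchain_qdrant types instead of importing them, and the server call is injected as a plain callable (`encode`), because the issue does not say which qdrant_client call is used. All names in the block are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable


@dataclass
class SparseVector:
    """Stand-in for the sparse vector type: parallel indices and values."""
    indices: list[int]
    values: list[float]


class SparseEmbeddings(ABC):
    """Stand-in for the interface a hybrid vector store expects."""

    @abstractmethod
    def embed_documents(self, texts: list[str]) -> list[SparseVector]: ...

    @abstractmethod
    def embed_query(self, text: str) -> SparseVector: ...


# Hypothetical signature: text in, (indices, values) out.
Encoder = Callable[[str], tuple[list[int], list[float]]]


class DelegatingSparseEmbeddings(SparseEmbeddings):
    """Forwards every embedding request to an injected encoder callable."""

    def __init__(self, encode: Encoder) -> None:
        self._encode = encode

    def embed_query(self, text: str) -> SparseVector:
        indices, values = self._encode(text)
        return SparseVector(indices=indices, values=values)

    def embed_documents(self, texts: list[str]) -> list[SparseVector]:
        return [self.embed_query(text) for text in texts]


def fake_encode(text: str) -> tuple[list[int], list[float]]:
    """Toy encoder used only to exercise the wrapper; not a real model."""
    words = text.split()
    return list(range(len(words))), [1.0] * len(words)


embedder = DelegatingSparseEmbeddings(fake_encode)
vector = embedder.embed_query("hello sparse world")
```

Injecting the encoder keeps the class independent of where inference runs, so the same wrapper can be exercised with a toy function in tests and with a server-backed function in the application.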

Proposed Solution

No response

Alternatives Considered

No response

Additional Context

No response

extent analysis

Fix Plan

To support server-side inference for BM25 sparse representations of points in Qdrant, we need to implement a new class that extends the SparseEmbeddings interface and utilizes the qdrant_client library for server-side inference.

Here are the steps to achieve this:

  • Create a new class, e.g., QdrantSparseEmbeddings, that implements the SparseEmbeddings interface.
  • Override embed_documents and embed_query, the two abstract methods of the interface, so that they rely on server-side inference through the qdrant_client library.
  • Use the qdrant_client library to generate sparse embeddings from text input directly on the server.

Example code:

from langchain_qdrant import SparseEmbeddings, SparseVector
from qdrant_client import QdrantClient

class QdrantSparseEmbeddings(SparseEmbeddings):
    def __init__(self, qdrant_client: QdrantClient):
        self.qdrant_client = qdrant_client

    def embed_documents(self, texts: list[str]) -> list[SparseVector]:
        # Embed each document through the same path used for queries
        return [self.embed_query(text) for text in texts]

    def embed_query(self, text: str) -> SparseVector:
        # Placeholder (assumption): substitute the qdrant_client call that
        # performs server-side sparse inference in your setup
        raise NotImplementedError("server-side sparse inference call")
  • Pass an instance of the new QdrantSparseEmbeddings class to QdrantVectorStore as its sparse embedding in hybrid retrieval mode.

Verification

To verify the fix, implement embed_query, generate a sparse embedding from a text input with QdrantSparseEmbeddings, and check that the result contains the expected indices and values.

Example code:

qdrant_client = QdrantClient()
qdrant_sparse_embeddings = QdrantSparseEmbeddings(qdrant_client)
text = "Example text input"
embedding = qdrant_sparse_embeddings.embed_query(text)
print(embedding)
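
What counts as a correct embedding is not spelled out here. One structural check that does not depend on any particular model is sketched below. The function name and the rules it enforces (equal lengths, unique non-negative indices, finite values) are assumptions chosen for illustration.

```python
import math


def check_sparse_vector(indices: list[int], values: list[float]) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if len(indices) != len(values):
        problems.append("indices and values differ in length")
    if len(set(indices)) != len(indices):
        problems.append("duplicate indices")
    if any(i < 0 for i in indices):
        problems.append("negative index")
    if any(not math.isfinite(v) for v in values):
        problems.append("non-finite value")
    return problems


ok = check_sparse_vector([3, 17, 42], [0.5, 1.2, 0.9])
bad = check_sparse_vector([3, 3], [0.5, float("nan")])
```

A check like this catches malformed output early, but it says nothing about retrieval quality, which would still need to be compared against a known-good encoder.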

Extra Tips

  • Make sure to handle any errors that may occur during server-side inference, such as network errors or invalid input.
  • Consider adding additional logging or monitoring to track the performance of the server-side inference.
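
The two tips above can be combined in a small wrapper. This is a sketch under assumptions: `with_retries`, its parameters, and the choice to retry on `ConnectionError` and `TimeoutError` are illustrative, and a real client may raise its own exception types that would need to be listed instead.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("sparse_inference")


def with_retries(call: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Run `call`, retrying on connection-style errors and logging each attempt."""
    for attempt in range(1, attempts + 1):
        start = time.perf_counter()
        try:
            result = call()
        except (ConnectionError, TimeoutError) as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            time.sleep(delay)
        else:
            elapsed = time.perf_counter() - start
            logger.info("attempt %d succeeded in %.3fs", attempt, elapsed)
            return result
    raise ValueError("attempts must be at least 1")


calls = {"n": 0}


def flaky() -> str:
    """Toy stand-in for a server call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("server unreachable")
    return "ok"


result = with_retries(flaky)
```

Errors outside the listed types, such as invalid input, are deliberately not retried and propagate immediately.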
