dify - ✅(Solved) Fix Score threshold not applied to reranked score in hybrid search [2 pull requests, 1 comments, 2 participants]

GitHub stats
langgenius/dify#35233 · Fetched 2026-04-15 06:45:29
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 1
Timeline (top): commented ×1, labeled ×1

Fix Action

PR fix notes

PR #35263: fix: apply score threshold after reranking in hybrid search

Description (problem / solution / changelog)

Summary

Fixes #35233

In hybrid search with reranking enabled, the score threshold is applied to the pre-rerank/fusion score instead of the post-rerank score. This causes documents with high reranked scores (0.84-0.96) to be filtered out because their pre-rerank scores were below the threshold.

Root Cause

RetrievalService.embedding_search passes score_threshold to the vector DB query for HYBRID_SEARCH, which filters on pre-rerank similarity scores. After reranking produces new scores, the threshold-filtered results are already gone.

Fix

  • For HYBRID_SEARCH, pass score_threshold=None to the vector DB search step so no pre-filtering occurs
  • Let the reranking pipeline (RerankModelRunner, WeightRerankRunner) apply the threshold to the final reranked/fused scores
  • When no rerank runner is configured, fall back to filtering by vector score threshold after retrieval
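
The three steps above can be sketched as follows. This is a minimal illustration of the ordering only, not the code from PR #35263: `Chunk`, `Reranker`, `filter_by_threshold`, and `hybrid_retrieve` are hypothetical names, and the candidates are assumed to come from a vector DB search that was called without a threshold.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Chunk:
    text: str
    score: float  # score assigned by the most recent pipeline stage


# Hypothetical reranker interface: takes the query and the candidates,
# returns the same chunks carrying new (reranked) scores.
Reranker = Callable[[str, Sequence[Chunk]], List[Chunk]]


def filter_by_threshold(
    chunks: Sequence[Chunk], score_threshold: Optional[float]
) -> List[Chunk]:
    """Keep chunks whose current score meets the threshold (None disables it)."""
    if score_threshold is None:
        return list(chunks)
    return [c for c in chunks if c.score >= score_threshold]


def hybrid_retrieve(
    query: str,
    candidates: Sequence[Chunk],
    reranker: Optional[Reranker],
    score_threshold: Optional[float],
) -> List[Chunk]:
    """Apply the threshold once, to the final scores.

    `candidates` are assumed to be unfiltered search results, i.e. the
    search step received score_threshold=None.
    """
    if reranker is None:
        # No rerank runner configured: fall back to the retrieval scores.
        scored = list(candidates)
    else:
        # Rerank first, so the threshold sees the post-rerank scores.
        scored = reranker(query, candidates)
    ranked = sorted(scored, key=lambda c: c.score, reverse=True)
    return filter_by_threshold(ranked, score_threshold)
```

The design point is that filtering happens exactly once, on whichever score ends up being shown to the user.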

Made with Cursor

Changed files

  • api/core/rag/datasource/retrieval_service.py (modified, +30/-2)

PR #35281: feat: when enable rerank vdb not use threshold to filter

Description (problem / solution / changelog)

Summary

When a rerank model is enabled, score_threshold is applied in the rerank stage, not in the vector search stage.

fix #35233

Changed files

  • api/core/rag/data_post_processor/data_post_processor.py (modified, +4/-0)
  • api/core/rag/datasource/retrieval_service.py (modified, +13/-2)
  • api/tests/unit_tests/core/rag/data_post_processor/test_data_post_processor.py (modified, +85/-0)
  • api/tests/unit_tests/core/rag/datasource/test_datasource_retrieval.py (modified, +187/-0)

Dify version

1.13.3

Cloud or Self Hosted

Cloud

Steps to reproduce

Environment

  • Platform: Dify Cloud
  • Retrieval mode: Hybrid search
  • Embedding model: text-embedding-3-large (OpenAI)
  • Rerank model: jina-reranker-v1-base-en (Jina)

Describe the bug

When using hybrid search with a rerank model configured, setting a score threshold of 0.60 returns empty results. However, lowering the threshold to 0.45 returns chunks that display scores of 0.84-0.96.

If the returned chunks are displaying scores of 0.84-0.96, they should have easily passed a threshold of 0.60. This is contradictory and confusing behavior.

This indicates that the threshold is being applied against a different score than what is displayed in the result. The user sets a threshold expecting it to filter against the displayed score — but it does not.


Steps to reproduce

  1. Create a knowledge base with hybrid search enabled
  2. Configure Jina reranker
  3. Set score threshold to 0.60 on retrieval node
  4. Send a query
  5. Retrieval node returns empty result (screenshot: https://github.com/user-attachments/assets/140e67a5-5701-46d7-b300-0a3921115a24)
  6. Lower threshold to 0.45 on a second retrieval node with the same query
  7. Returns 10 chunks with displayed scores of 0.84-0.96

Proof

| Node | Threshold | Chunks returned | Displayed score range |
|---|---|---|---|
| Node 1 | 0.60 | 0 | - |
| Node 2 | 0.45 | 10 | 0.84-0.96 |

The same chunks that were blocked at threshold 0.60 display scores of 0.84-0.96 when retrieved at threshold 0.45. This means the threshold and the displayed score are not measuring the same thing.


✔️ Expected Behavior

Expected behavior

The score threshold should filter against the reranked score — which is the score displayed to the user in the result.

If a chunk displays a score of 0.84, it must pass a threshold of 0.60. The threshold value and the displayed score must be consistent so that users can set a meaningful threshold based on what they see.


❌ Actual Behavior

Actual behavior

The threshold filters against the raw hybrid score before reranking. The displayed score is the reranked score. These are two completely different scoring systems with different value distributions — but the same threshold value is silently applied to both without any indication to the user.

This means:

  • A user sets threshold 0.60 expecting to filter out chunks scoring below 0.60
  • But chunks scoring 0.84-0.96 (displayed) are blocked
  • The user has no visibility into the raw hybrid score that is actually used for filtering
  • The displayed score gives a false impression of what passed the threshold
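
The mismatch described above can be reproduced in a few lines. This is a toy model, not Dify code: the raw hybrid scores are invented for illustration (the issue only implies they fall between 0.45 and 0.60), the reranked scores are taken from the reported range, and `retrieve_threshold_before_rerank` is a hypothetical name.

```python
# Illustrative values only. The raw hybrid scores are made up; the issue
# reporter had no visibility into the real ones.
raw_hybrid_scores = {"chunk-1": 0.52, "chunk-2": 0.47}
reranked_scores = {"chunk-1": 0.96, "chunk-2": 0.84}


def retrieve_threshold_before_rerank(threshold: float) -> dict:
    """Filter on the raw hybrid score, then display the reranked score."""
    survivors = [cid for cid, s in raw_hybrid_scores.items() if s >= threshold]
    return {cid: reranked_scores[cid] for cid in survivors}


print(retrieve_threshold_before_rerank(0.60))  # {}
print(retrieve_threshold_before_rerank(0.45))  # {'chunk-1': 0.96, 'chunk-2': 0.84}
```

Both chunks display scores above 0.60, yet neither survives a 0.60 threshold, because the comparison is made against a score the user never sees.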

Impact

  • Valid queries return empty results at reasonable threshold values
  • The score threshold setting is misleading — it does not behave as the UI implies
  • Users are forced to use unnecessary workarounds such as multiple retrieval nodes with different thresholds to compensate for this behavior

Current workaround

Two retrieval nodes are required:

| Node | Threshold | Purpose |
|---|---|---|
| Node 1 | 0.60 | Primary retrieval |
| Node 2 | 0.45 | Fallback when Node 1 returns empty |
| Code node | 0.65 | Manual filter on displayed reranked score |

This workaround should not be necessary.
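
The manual filter in the Code node could look roughly like the sketch below. It rests on two assumptions that are not confirmed by the issue: that the Code node calls a `main` function and expects a dict back, and that each retrieved item is a dict carrying the displayed score under `metadata["score"]`. Adjust the field names to whatever the retrieval node actually emits.

```python
SCORE_THRESHOLD = 0.65  # value used by the Code node in the table above


def main(result: list) -> dict:
    """Keep only items whose displayed (reranked) score meets the threshold.

    `result` is assumed to be the list of chunk dicts produced by the
    retrieval node; items without a score are dropped.
    """
    kept = []
    for item in result:
        score = (item.get("metadata") or {}).get("score")
        if score is not None and score >= SCORE_THRESHOLD:
            kept.append(item)
    return {"result": kept}
```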


Suggested fix

When a rerank model is configured, the score threshold should be applied against the reranked score — the same score that is displayed to the user. The threshold and the displayed score must always refer to the same value.


Related issue

#3146

extent analysis

TL;DR

Apply the score threshold against the reranked score displayed to the user, rather than the raw hybrid score, to ensure consistency and accurate filtering.

Guidance

  • Review the current implementation of the score threshold in the hybrid search with reranking to identify where the threshold is being applied against the raw hybrid score instead of the reranked score.
  • Modify the code to apply the score threshold against the reranked score, ensuring that the threshold and displayed score refer to the same value.
  • Test the modified implementation with different threshold values to verify that it behaves as expected and returns the correct results.
  • Consider adding documentation or UI updates to clearly indicate how the score threshold is applied and what score it refers to, to avoid user confusion.

Example

No specific code example can be provided without access to the actual implementation, but the fix would involve updating the logic that applies the score threshold to use the reranked score instead of the raw hybrid score.
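
Without touching the implementation, the expected behavior can still be expressed as a check that a test could run against any retrieval function. The sketch below is an assumption-laden illustration: `retrieve` is a hypothetical callable that takes a threshold (or `None` for no threshold) and returns a mapping from chunk ID to displayed score, and retrieval is assumed to be deterministic with the same top-k across calls.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple

Retrieve = Callable[[Optional[float]], Dict[str, float]]


def check_threshold_consistency(
    retrieve: Retrieve, thresholds: Iterable[float]
) -> List[Tuple[float, Set[str], Set[str]]]:
    """Compare each thresholded result with the unthresholded baseline.

    Returns one (threshold, missing_ids, unexpected_ids) tuple for every
    threshold whose result differs from "all chunks whose displayed score
    is >= threshold". An empty list means the threshold and the displayed
    score refer to the same value.
    """
    baseline = retrieve(None)
    failures = []
    for t in thresholds:
        expected = {cid for cid, score in baseline.items() if score >= t}
        actual = set(retrieve(t))
        if actual != expected:
            failures.append((t, expected - actual, actual - expected))
    return failures
```

Applied to the behavior reported in this issue, the 0.60 threshold would show up as a failure with the high-scoring chunks listed as missing.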

Notes

The suggested fix assumes that the reranked score is the intended score to filter against, as implied by the issue description. However, the actual implementation details may vary, and additional considerations may be necessary to ensure the fix works correctly in all scenarios.

Recommendation

Until a fix such as PR #35263 or PR #35281 is available in your deployment, use the workaround of multiple retrieval nodes with different thresholds to limit the impact of the current behavior on users.
