dify - ✅(Solved) Fix RAG mixed retrieval problem [1 pull requests, 1 comments, 2 participants]

GitHub stats
langgenius/dify#35482 (fetched 2026-04-23 07:45:34)
Comments: 1
Participants: 2
Timeline: 2
Reactions: 1
Timeline (top): commented ×1, cross-referenced ×1

Fix Action

Fixed

PR fix notes

PR #35498: fix(rag): decouple hybrid recall from top_k for stable merged ranking…

Description (problem / solution / changelog)

Summary

Fixes mixed (hybrid) RAG retrieval so changing only Top K no longer changes which segments participate in score fusion, which could reorder the head and make the first N results differ between a small Top K and a large Top K for the same query.

Problem

Hybrid search runs vector and full-text retrieval in parallel, deduplicates the results, then merges scores (weighted score or reranker). The final top_k was also used as the per-channel limit for each sub-retriever. A segment can rank below position k in both channels and still earn a high combined score. With a small k it never entered the candidate pool; with a larger k it did, so the merged top results changed with Top K, contrary to user expectations (reported in #35482).
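The failure mode described above can be reproduced with a toy model. This is a minimal sketch, not Dify's code: the score tables, the 0.5/0.5 weights, and the names `channel_top` and `hybrid_search` are all assumptions made up for illustration.

```python
# Toy per-channel relevance scores (invented values, not taken from Dify).
VECTOR = {"A": 0.90, "B": 0.80, "E": 0.60, "D": 0.58, "C": 0.10, "F": 0.05}
KEYWORD = {"A": 0.90, "C": 0.76, "F": 0.60, "D": 0.58, "B": 0.10, "E": 0.05}


def channel_top(scores, limit):
    """Return the `limit` best segment ids of one channel, best first."""
    ranked = sorted(scores, key=lambda seg: (-scores[seg], seg))
    return ranked[:limit]


def hybrid_search(top_k, per_channel_limit):
    """Union both channels' candidates, fuse scores, cut to top_k."""
    pool = set(channel_top(VECTOR, per_channel_limit))
    pool |= set(channel_top(KEYWORD, per_channel_limit))
    # Assumed fusion rule: equal-weight average of both channel scores.
    fused = {seg: 0.5 * VECTOR[seg] + 0.5 * KEYWORD[seg] for seg in pool}
    ranked = sorted(fused, key=lambda seg: (-fused[seg], seg))
    return ranked[:top_k]


# Coupled behavior: the per-channel limit equals the final top_k.
coupled_small = hybrid_search(top_k=3, per_channel_limit=3)
coupled_large = hybrid_search(top_k=8, per_channel_limit=8)

print(coupled_small)      # D is missing: it ranks 4th in both channels
print(coupled_large[:3])  # D now enters the pool and ranks second
```

Segment D ranks fourth in each channel, so a per-channel limit of 3 drops it before fusion even though its combined score is the second highest in this toy data.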

Solution

  • Introduce a dedicated per-channel recall for hybrid: min(200, max(50, final_top_k)).
  • Use it for embedding_search and full_text_index_search in hybrid mode.
  • Keep post-merge cut at the user’s top_k via DataPostProcessor.invoke(..., top_n=top_k).
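The recall formula in the list above can be sketched as follows. The function names `hybrid_recall_limit` and `cut_after_merge` are hypothetical and are not claimed to match the identifiers in the PR; only the expression `min(200, max(50, final_top_k))` comes from the PR notes.

```python
from typing import List


def hybrid_recall_limit(final_top_k: int) -> int:
    """Per-channel fetch size in hybrid mode (hypothetical helper name).

    Floors the limit at 50 and caps it at 200, as stated in the PR notes.
    """
    return min(200, max(50, final_top_k))


def cut_after_merge(ranked_segments: List[str], top_k: int) -> List[str]:
    """Apply the user's top_k only after scores have been merged."""
    return ranked_segments[:top_k]


# Small values of top_k all map to the same per-channel limit, so the
# candidate pool no longer depends on which small top_k the user picked.
limits = {k: hybrid_recall_limit(k) for k in (3, 8, 50, 120, 500)}
print(limits)
```

Under this formula, Top K 3 and Top K 8 both fetch 50 candidates per channel, which is why their merged heads should agree. For a Top K above 200 the per-channel limit is capped at 200.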

Trade-offs

Slightly more work per hybrid query (larger per-channel fetch) in exchange for stable, correct fusion behavior.

Test plan

  • uv run pytest tests/unit_tests/core/rag/datasource/test_datasource_retrieval.py::test_hybrid_recall_top_k_for_merge_contract -q
  • uv run pytest tests/unit_tests/core/rag/retrieval/test_dataset_retrieval.py -k hybrid -q
  • Manual: same query, hybrid + weighted / rerank, Top K 3 vs 8 — first 3 should match (same ordering and segments).
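The manual check in the last item could also be automated as a prefix comparison. This is a minimal sketch under stated assumptions: `first_n_match` is a hypothetical helper, and the two retrievers are stand-ins that return fixed lists rather than calls into Dify.

```python
from typing import Callable, Sequence


def first_n_match(
    retrieve: Callable[[str, int], Sequence[str]],
    query: str,
    small_k: int,
    large_k: int,
) -> bool:
    """True if the small-k result equals the head of the large-k result."""
    small = list(retrieve(query, small_k))
    large = list(retrieve(query, large_k))
    return small == large[: len(small)]


RANKED = ["A", "D", "B", "C", "E", "F"]  # invented merged ranking


def stable_retrieve(query: str, top_k: int) -> Sequence[str]:
    """Stand-in for the fixed behavior: one ranking, cut at top_k."""
    return RANKED[:top_k]


def unstable_retrieve(query: str, top_k: int) -> Sequence[str]:
    """Stand-in for the reported behavior: D is absent at small top_k."""
    return ["A", "B", "C"][:top_k] if top_k <= 3 else RANKED[:top_k]


stable_ok = first_n_match(stable_retrieve, "same query", 3, 8)
unstable_ok = first_n_match(unstable_retrieve, "same query", 3, 8)
print(stable_ok, unstable_ok)
```

Comparing segment identifiers and their order, rather than raw scores, matches the "same ordering and segments" wording of the manual step.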

Related

  • Closes / fixes #35482

Changed files

  • api/core/rag/datasource/retrieval_service.py (modified, +33/-3)
  • api/tests/unit_tests/core/rag/datasource/test_datasource_retrieval.py (modified, +9/-0)

Self Checks

  • I have read the Contributing Guide and Language Policy.
  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report, otherwise it will be closed.
  • [Chinese & non-English users] Please submit in English, otherwise the issue will be closed :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

1.9.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

With the same text input in both cases:

1. With hybrid search and a rerank model selected, Top K set to 3, and no score threshold, the results were: A scored 0.58, B scored 0.47, and C scored 0.45.
2. With hybrid search and a rerank model selected, Top K set to 8, and no score threshold, the results were: A scored 0.58, D scored 0.52, B scored 0.47, C scored 0.45, ...

My question: I only changed Top K, so why does D, which has a higher score than B and C, not appear in the top 3 results, yet appear in the top 8 results ranked second?

✔️ Expected Behavior

Changing only Top K should not change the head of the list: the first 3 results with Top K = 8 should be the same as the 3 results returned with Top K = 3.

❌ Actual Behavior

No response

extent analysis

TL;DR

Per the linked PR, the inconsistency comes from candidate recall rather than from scoring: in hybrid search the final Top K was also used as each sub-retriever's limit, so a segment such as D could be left out of score fusion at Top K = 3 and included at Top K = 8.

Guidance

  • Check whether the final Top K is also passed as the per-channel limit to the vector and full-text retrievers; per the PR notes, this is what can keep a segment out of the candidate pool at a small Top K.
  • Verify that scores are fused over the same candidate pool regardless of Top K, and that the cut to Top K happens only after the merge.
  • Check whether any score threshold or filter behaves differently when Top K is 3 versus 8 (no threshold was set in this report).
  • Compare weighted-score and rerank modes to confirm whether the problem is specific to one model or general to hybrid recall.

Example

No code snippet can be provided without more information about the implementation details of hybrid search and the rerank model.

Notes

The issue might be related to the specific implementation of hybrid search and the rerank model; more information about the retrieval logic and configuration would be necessary for a more accurate diagnosis.

Recommendation

Workaround until the fix is deployed: temporarily set Top K to a higher value (e.g., 8) so that more candidates enter score fusion and relevant segments are not dropped before the merge.
