LlamaIndex - [Question]: Ambiguity in DEFAULT_FACT_CONDENSE_PROMPT: Full snapshot vs incremental facts?

GitHub stats
run-llama/llama_index#21103 · Fetched 2026-04-08 01:08:11
Comments: 0 · Participants: 1 · Timeline: 2 · Reactions: 0
Timeline (top): comment_deleted ×1, labeled ×1


Question Validation

  • I have searched both the documentation and discord for an answer.

Question

Description

Hi team,

I’m currently using FactExtractionMemoryBlock with the default DEFAULT_FACT_CONDENSE_PROMPT, and I’ve encountered some ambiguity in how the prompt is intended to behave.

Specifically, it is unclear whether the LLM is expected to return:

  • A full, condensed list of facts (final snapshot) or
  • Only new / incremental facts extracted from the latest conversation

Relevant Prompt Section

Do not duplicate facts that are already in the existing facts list

<existing_facts>
{{ existing_facts }}
</existing_facts>

Observed Ambiguity

The prompt provides existing_facts and asks the model to "condense the facts", which suggests a full rewrite of the fact list.

However, the instruction:

"Do not duplicate facts that are already in the existing facts list"

can be interpreted as:

  • ❌ "Do not re-output existing facts" → implies incremental output
  • ✅ "Do not repeat semantically duplicate facts" → implies full output with deduplication

This dual interpretation makes it unclear how the output should be handled downstream.


Current Behavior (Based on Code)

From the typical usage pattern:

self.facts = new_facts

it appears that the system expects a full replacement (snapshot) rather than incremental updates.
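
To illustrate why the two readings lead to different stored state, here is a minimal sketch. The helper names `apply_snapshot` and `apply_incremental` are hypothetical and are not part of LlamaIndex; they only mimic replacing versus appending a fact list, assuming the model returned a full condensed snapshot.

```python
# Hypothetical helpers for illustration only; not LlamaIndex code.

def apply_snapshot(existing: list[str], llm_output: list[str]) -> list[str]:
    """Treat the LLM output as the complete fact list (replace)."""
    return list(llm_output)


def apply_incremental(existing: list[str], llm_output: list[str]) -> list[str]:
    """Treat the LLM output as new facts only (append)."""
    return existing + list(llm_output)


existing = ["User lives in Berlin", "User likes tea"]
# Assumption: the model returned a full condensed snapshot.
llm_output = ["User lives in Berlin", "User likes tea", "User owns a cat"]

snapshot_result = apply_snapshot(existing, llm_output)
incremental_result = apply_incremental(existing, llm_output)

print(snapshot_result)          # the three facts, each once
print(len(incremental_result))  # existing facts are now stored twice
```

If the output is a snapshot but the caller appends it, every previously known fact is stored twice, which matches the incorrect handling described under Additional Context.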


Questions

  1. Is the intended behavior for the LLM to return:

    • (A) A full, deduplicated, condensed list of facts, or
    • (B) Only new facts (incremental updates)?
  2. If (A) is correct (full snapshot), should the prompt wording be clarified to avoid confusion?


Suggested Improvement

To reduce ambiguity, I suggest updating the instruction to something like:

Do not include duplicate or semantically redundant facts in the final output list.
Ensure each fact appears only once.

This makes it clear that:

  • The output is a complete list
  • Deduplication is about content, not presence in existing_facts

Why This Matters

This distinction affects:

  • How developers merge or replace facts
  • Whether downstream logic should handle deduplication or conflict resolution
  • Memory consistency over time
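
Until the intended behavior is confirmed, downstream code could apply its own content-based deduplication as a safety net. The sketch below is an assumption about how that might look, not something the library provides: `normalize` and `dedupe_facts` are hypothetical helpers, and they only catch textual near-duplicates (case, whitespace, trailing punctuation), not semantically redundant facts.

```python
import re


def normalize(fact: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing punctuation."""
    collapsed = re.sub(r"\s+", " ", fact.strip().lower())
    return collapsed.rstrip(".!?")


def dedupe_facts(facts: list[str]) -> list[str]:
    """Keep the first occurrence of each fact, compared by normalized text."""
    seen: set[str] = set()
    unique: list[str] = []
    for fact in facts:
        key = normalize(fact)
        if key and key not in seen:
            seen.add(key)
            unique.append(fact.strip())
    return unique


merged = dedupe_facts(["User likes tea.", "user likes  tea", "User owns a cat"])
print(merged)
```

Running this keeps the first spelling of each fact, so the result is safe to store whether the model returned a snapshot or an increment that was merged with the existing list beforehand.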

Additional Context

I initially interpreted the prompt as incremental, which led to incorrect handling (e.g., appending instead of replacing facts). Clarifying this would help avoid similar confusion for others.


Thanks for the great work on the memory system—this is a very powerful abstraction!

Happy to help further if needed 🙌

extent analysis

Fix Plan

To resolve the ambiguity in the FactExtractionMemoryBlock prompt, we need to clarify the instruction to ensure it aligns with the intended behavior of returning a full, deduplicated, condensed list of facts.

Here are the steps:

  • Update the prompt instruction to:

Do not include duplicate or semantically redundant facts in the final output list.
Ensure each fact appears only once.

  • Ensure the code handles the output as a full replacement (snapshot) rather than incremental updates. The current usage pattern:

self.facts = new_facts

already suggests this approach.

Verification

To verify the fix, test the updated prompt with various input scenarios, including:

  • Empty existing_facts list
  • existing_facts list with duplicate or semantically redundant facts
  • existing_facts list with a mix of old and new facts

Verify that the output is a full, deduplicated, condensed list of facts in all cases.
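
The scenarios above could be organized as a small harness like the following sketch. `fake_condense` is a hypothetical stand-in for the real LLM call (an order-preserving union), used here only so the example runs without a model; in practice it would be swapped for whatever function returns the condensed facts.

```python
# Hypothetical test harness; fake_condense stands in for the LLM call.

def fake_condense(existing: list[str], new: list[str]) -> list[str]:
    """Order-preserving, case-insensitive union of existing and new facts."""
    seen: set[str] = set()
    out: list[str] = []
    for fact in existing + new:
        key = fact.strip().lower()
        if key not in seen:
            seen.add(key)
            out.append(fact.strip())
    return out


def is_deduplicated(facts: list[str]) -> bool:
    """True when no fact repeats, ignoring case and surrounding whitespace."""
    keys = [f.strip().lower() for f in facts]
    return len(keys) == len(set(keys))


# (existing_facts, newly extracted facts) for each scenario.
scenarios = {
    "empty_existing": ([], ["User likes tea"]),
    "duplicate_existing": (["User likes tea", "user likes tea"], ["User owns a cat"]),
    "mixed": (["User likes tea"], ["User likes tea", "User owns a cat"]),
}

results = {
    name: fake_condense(existing, new)
    for name, (existing, new) in scenarios.items()
}

for name, facts in results.items():
    assert is_deduplicated(facts), f"{name}: output contains duplicates"
```

With a real model, a condensed snapshot may legitimately merge or rephrase facts, so the duplicate check is a safer assertion than exact equality against an expected list.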

Extra Tips

  • Consider adding a comment or documentation to the code to explain the intended behavior of the FactExtractionMemoryBlock prompt.
  • Review downstream logic to ensure it correctly handles the full, deduplicated list of facts.
  • Test the updated prompt with different LLMs to check that the output is consistent across models.
