openclaw - 💡 (How to fix) Cognee plugin silently skips re-indexing of modified files (filename deduplication bug) [1 comment, 2 participants]

GitHub stats
openclaw/openclaw#49465 · Fetched 2026-04-08 00:55:00
Comments: 1 · Participants: 2 · Timeline: 1 (commented ×1) · Reactions: 0


When a file watched by the Cognee plugin is modified and re-uploaded via /api/v1/add, Cognee deduplicates by filename — it never overwrites the stored file, and the existing pipeline_status row is already marked DATA_ITEM_PROCESSING_COMPLETED. The result is that the cognify pipeline silently skips the file with no error or warning. Updated content is never extracted.

This affects any use of the Cognee plugin with OpenClaw's memory/**/*.md files, which are inherently dynamic — daily notes are appended every session, MEMORY.md is updated continuously, and discussion files grow over time.

Reproduction

  1. Index a file via the Cognee plugin
  2. Append content to the same file
  3. Wait for or trigger re-indexing
  4. Observe: cognify returns HTTP 200, but Docker logs show no extract_graph_from_data activity and no new nodes are extracted

Root cause

Cognee's /add endpoint skips writing the file if a storage entry with the same filename already exists. The data table row retains pipeline_status: COMPLETED for the current dataset, so cognify treats it as already done.
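To make the failure mode concrete, here is a minimal in-memory model of the behaviour described above. It is not Cognee's code: `ToyStore`, its `add` and `cognify` methods, and the `"PENDING"` status are hypothetical names chosen for illustration. Only the `DATA_ITEM_PROCESSING_COMPLETED` string comes from the issue.

```python
COMPLETED = "DATA_ITEM_PROCESSING_COMPLETED"


class ToyStore:
    """Hypothetical stand-in for a store that deduplicates uploads by filename."""

    def __init__(self):
        self.files = {}      # filename -> stored bytes
        self.status = {}     # filename -> pipeline status
        self.processed = []  # (filename, content) pairs the pipeline actually saw

    def add(self, name, content):
        # Dedup by name only: a second upload with the same name is dropped.
        if name in self.files:
            return
        self.files[name] = content
        self.status[name] = "PENDING"  # hypothetical "not yet processed" value

    def cognify(self):
        # Items already marked completed are skipped without any warning.
        for name, content in self.files.items():
            if self.status[name] == COMPLETED:
                continue
            self.processed.append((name, content))
            self.status[name] = COMPLETED


store = ToyStore()
store.add("MEMORY.md", b"v1")
store.cognify()
store.add("MEMORY.md", b"v1 plus new notes")  # silently ignored
store.cognify()                               # nothing left to process
print(store.processed)
```

In this model the second upload never reaches the pipeline, which mirrors the symptom in the reproduction steps: the call succeeds, but only the first version of the file is ever processed.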

Workaround

Before re-uploading a changed file, delete the data table row by name (via sqlite) and remove the stored file from Cognee's .data_storage directory.
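The workaround could be scripted roughly as follows. This is a sketch under stated assumptions, not a verified procedure: the issue does not give the database path, the column names, or the layout of `.data_storage`, so the table name `data`, the column `name`, and the idea that the stored copy keeps the uploaded filename are all guesses to be checked against a real installation. `forget_file` is a hypothetical helper name.

```python
import os
import sqlite3
import tempfile


def forget_file(db_path, storage_dir, file_name):
    """Delete the `data` row and the stored copy for file_name.

    Assumptions (not confirmed by the issue): the table is called `data`,
    its filename column is `name`, and the stored copy keeps the same name.
    Returns the number of rows deleted.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commit on success, roll back on error
            cur = conn.execute("DELETE FROM data WHERE name = ?", (file_name,))
            deleted = cur.rowcount
    finally:
        conn.close()

    stored = os.path.join(storage_dir, file_name)
    if os.path.exists(stored):
        os.remove(stored)
    return deleted


# Demo against a throwaway database and directory (never a live instance).
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "demo.db")
storage_dir = os.path.join(workdir, ".data_storage")
os.makedirs(storage_dir)
with open(os.path.join(storage_dir, "MEMORY.md"), "wb") as f:
    f.write(b"old content")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE data (name TEXT, pipeline_status TEXT)")
conn.execute(
    "INSERT INTO data VALUES (?, ?)",
    ("MEMORY.md", "DATA_ITEM_PROCESSING_COMPLETED"),
)
conn.commit()
conn.close()

deleted = forget_file(db_path, storage_dir, "MEMORY.md")
remaining = os.listdir(storage_dir)
```

After `forget_file` returns, the changed file can be uploaded again via /api/v1/add. Backing up the database before deleting rows by hand is advisable.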

Expected behavior

Re-uploading a file with changed content should replace the stored copy and re-run the pipeline for that file.
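One possible way to get this behaviour is to compare content rather than the filename alone. The sketch below uses a SHA-256 digest for that comparison. This is a swapped-in technique for illustration, not something the issue proposes or that Cognee is known to do, and `HashAwareStore`, `digest`, and the `"PENDING"` status are hypothetical names.

```python
import hashlib

COMPLETED = "DATA_ITEM_PROCESSING_COMPLETED"


def digest(content):
    """Hex SHA-256 digest of a bytes payload."""
    return hashlib.sha256(content).hexdigest()


class HashAwareStore:
    """Hypothetical in-memory store that re-queues a file when its content changes."""

    def __init__(self):
        self.files = {}   # filename -> stored bytes
        self.status = {}  # filename -> pipeline status

    def add(self, name, content):
        """Return True if the upload was stored and queued for processing."""
        old = self.files.get(name)
        if old is not None and digest(old) == digest(content):
            return False  # identical re-upload: keep the existing status
        self.files[name] = content     # replace the stored copy
        self.status[name] = "PENDING"  # hypothetical "not yet processed" value
        return True


store = HashAwareStore()
first = store.add("MEMORY.md", b"v1")
store.status["MEMORY.md"] = COMPLETED  # pretend the pipeline has run
same = store.add("MEMORY.md", b"v1")
changed = store.add("MEMORY.md", b"v1 plus new notes")
```

An unchanged re-upload stays a no-op, so the deduplication still saves work, while an appended daily note resets the status and would be picked up on the next pipeline run.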

extent analysis

Fix Plan

To fix the issue, a re-upload of a file whose name already exists must replace the stored file and reset its pipeline_status. The root cause places the deduplication in Cognee's /add endpoint, so the change may belong in Cognee itself rather than in the OpenClaw plugin. The steps:

  • Modify the /api/v1/add endpoint to check whether a file with the same name already exists in the storage directory.
  • If it does, delete the existing data table row and remove the stored file from the .data_storage directory before writing the new copy.
  • Set pipeline_status to a not-yet-processed value so that cognify re-processes the file. PENDING and NEW are placeholder names here; the issue only confirms DATA_ITEM_PROCESSING_COMPLETED.

Example code snippet (illustrative only; the database path, the data table layout, and the 'PENDING' status are assumptions, not Cognee's confirmed schema):

import os
import sqlite3

# Assumed locations; adjust to the actual Cognee installation.
DB_PATH = 'cognee.db'
STORAGE_DIR = '.data_storage'

def add_file(file_name, file_content):
    """Store file_content (bytes) under file_name and mark it for re-processing."""
    os.makedirs(STORAGE_DIR, exist_ok=True)
    file_path = os.path.join(STORAGE_DIR, file_name)

    conn = sqlite3.connect(DB_PATH)
    try:
        with conn:  # commits on success, rolls back if anything below raises
            # Drop any stale row so the old completed status cannot survive
            conn.execute("DELETE FROM data WHERE file_name = ?", (file_name,))

            # Opening with 'wb' truncates an existing file, so no os.remove is needed
            with open(file_path, 'wb') as f:
                f.write(file_content)

            # Insert a fresh row with a not-yet-processed status
            conn.execute(
                "INSERT INTO data (file_name, pipeline_status) VALUES (?, ?)",
                (file_name, 'PENDING'),
            )
    finally:
        conn.close()

Verification

To verify the fix, repeat the reproduction steps and check that:

  • cognify runs for the re-uploaded file: Docker logs show extract_graph_from_data activity and new nodes are extracted.
  • The file's pipeline_status is reset to a not-yet-processed value (such as PENDING or NEW) immediately after the re-upload.
  • The stored file in the .data_storage directory contains the new content.
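Two of these checks can be automated with a small read-only script. The sketch below makes the same unverified assumptions as the workaround: a SQLite table `data` with columns `name` and `pipeline_status`, and stored copies that keep the uploaded filename. `stored_copy_is_current` and `pipeline_status` are hypothetical helper names.

```python
import os
import sqlite3
import tempfile


def stored_copy_is_current(local_path, storage_dir):
    """True if the stored copy exists and is byte-identical to the local file."""
    stored = os.path.join(storage_dir, os.path.basename(local_path))
    if not os.path.exists(stored):
        return False
    with open(local_path, "rb") as a, open(stored, "rb") as b:
        return a.read() == b.read()


def pipeline_status(db_path, file_name):
    """Return the status recorded for file_name, or None if no row exists."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT pipeline_status FROM data WHERE name = ?", (file_name,)
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row else None


# Demo with throwaway files: a stale stored copy, then an updated one.
workdir = tempfile.mkdtemp()
storage_dir = os.path.join(workdir, ".data_storage")
os.makedirs(storage_dir)
local_path = os.path.join(workdir, "MEMORY.md")
stored_path = os.path.join(storage_dir, "MEMORY.md")

with open(local_path, "wb") as f:
    f.write(b"v2")
with open(stored_path, "wb") as f:
    f.write(b"v1")
stale = stored_copy_is_current(local_path, storage_dir)

with open(stored_path, "wb") as f:
    f.write(b"v2")
fresh = stored_copy_is_current(local_path, storage_dir)

db_path = os.path.join(workdir, "demo.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE data (name TEXT, pipeline_status TEXT)")
conn.execute("INSERT INTO data VALUES (?, ?)", ("MEMORY.md", "PENDING"))
conn.commit()
conn.close()
status = pipeline_status(db_path, "MEMORY.md")
missing = pipeline_status(db_path, "other.md")
```

The first check (new graph nodes and extract_graph_from_data log lines) still has to be confirmed in the Docker logs, since it depends on the running pipeline.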

Extra Tips

  • Make sure to handle errors and exceptions properly when deleting files and updating the database.
  • Consider adding logging to track changes to the pipeline_status and file uploads.
  • Test the fix thoroughly to ensure that it works correctly for different file types and sizes.
