LlamaIndex - ✅ (Solved) Fix Feature: compaction boundary event for behavioral drift monitoring in long-running agents [1 pull request, 3 comments, 3 participants]

run-llama/llama_index#21207 · Fetched 2026-04-08 01:45:27

Root Cause

LlamaIndex's March 26 blog post ("Files Are All You Need") makes a compelling case for files as the primary context management abstraction for long-running agents — including storing compressed conversation histories when context compaction triggers.

This pattern solves the token budget problem. It creates a new monitoring problem: file-based context compaction is a behavioral boundary, and there's currently no standardized way to observe whether agent behavior changed after crossing one.

Fix Action

Fixed

PR fix notes

PR #21208: callbacks: add CONTEXT_COMPACTION event type and payload keys to CBEventType

Description (problem / solution / changelog)

Summary

Adds CONTEXT_COMPACTION to CBEventType and four matching EventPayload keys:

  • PRE_COMPACTION_TOKEN_COUNT
  • POST_COMPACTION_TOKEN_COUNT
  • COMPACTION_SUMMARY
  • DROPPED_MESSAGE_COUNT

This is a pure schema addition: no existing behaviour changes, no breaking changes, fully backward-compatible.
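
As a rough illustration of what a pure schema addition of this kind looks like, the sketch below uses stand-in enums rather than the real CBEventType and EventPayload classes. The names SketchEventType and SketchPayload and every string value are assumptions made for this example; the actual values are whatever the PR's schema.py change defines.

```python
from enum import Enum


class SketchEventType(str, Enum):
    """Stand-in for CBEventType; only the new member is shown."""

    CONTEXT_COMPACTION = "context_compaction"  # assumed string value


class SketchPayload(str, Enum):
    """Stand-in for EventPayload; the four keys named in the PR summary."""

    PRE_COMPACTION_TOKEN_COUNT = "pre_compaction_token_count"  # assumed value
    POST_COMPACTION_TOKEN_COUNT = "post_compaction_token_count"  # assumed value
    COMPACTION_SUMMARY = "compaction_summary"  # assumed value
    DROPPED_MESSAGE_COUNT = "dropped_message_count"  # assumed value


# Payloads are plain dicts keyed by enum members. Handlers that read them with
# .get() and a default keep working when a key is absent, which is why adding
# new members does not disturb existing handlers.
payload = {SketchPayload.PRE_COMPACTION_TOKEN_COUNT: 12000}
pre_tokens = payload.get(SketchPayload.PRE_COMPACTION_TOKEN_COUNT, 0)
dropped = payload.get(SketchPayload.DROPPED_MESSAGE_COUNT, 0)
print(pre_tokens, dropped)
```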

Motivation

When a long-running agent compacts its context window (summarizes + drops older messages), the compaction event is invisible to the callback system today. Observability tools, monitors, and production operators have no standard hook to:

  • Record what was dropped and what summary was produced
  • Compare agent behaviour before and after the boundary
  • Alert on or roll back from unexpected behavioural drift

This addition follows the request in Issue #21207 and mirrors the CBEventType/EventPayload patterns already established for AGENT_STEP, RERANKING, and others.

Usage

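from llama_index.core.callbacks import CallbackManager, CBEventType
from llama_index.core.callbacks.base_handler import BaseCallbackHandler
from llama_index.core.callbacks.schema import EventPayload

class CompactionMonitor(BaseCallbackHandler):
    def __init__(self):
        # Do not ignore any event types
        super().__init__(event_starts_to_ignore=[], event_ends_to_ignore=[])

    def on_event_start(self, event_type, payload=None, event_id="", **kwargs):
        if event_type == CBEventType.CONTEXT_COMPACTION:
            payload = payload or {}  # payload may be None
            pre = payload.get(EventPayload.PRE_COMPACTION_TOKEN_COUNT, 0)
            print(f"Compaction starting: {pre} tokens in context")
        return event_id

    def on_event_end(self, event_type, payload=None, event_id="", **kwargs):
        if event_type == CBEventType.CONTEXT_COMPACTION:
            payload = payload or {}  # payload may be None
            post = payload.get(EventPayload.POST_COMPACTION_TOKEN_COUNT, 0)
            summary = payload.get(EventPayload.COMPACTION_SUMMARY, "")
            dropped = payload.get(EventPayload.DROPPED_MESSAGE_COUNT, 0)
            print(f"Compaction done: {post} tokens, {dropped} messages dropped")
            # attach behavioral fingerprint here for drift detection

    def start_trace(self, trace_id=None):
        pass  # required by BaseCallbackHandler; nothing to do here

    def end_trace(self, trace_id=None, trace_map=None):
        pass  # required by BaseCallbackHandler; nothing to do here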

Scope

  • llama-index-core/llama_index/core/callbacks/schema.py only
  • Docstring updated; no logic changes
  • Existing LEAF_EVENTS tuple unchanged (compaction is not a leaf event)

Closes #21207 (partial — schema foundation; emit-side wiring in compaction paths is a follow-up)

Changed files

  • llama-index-core/llama_index/core/callbacks/schema.py (modified, +111/-101)

Code Example

from dataclasses import dataclass
from datetime import datetime

@dataclass
class CompactionEvent:
    pre_compaction_message_count: int
    post_compaction_message_count: int
    summary_text: str
    dropped_token_count: int
    timestamp: datetime

Context

LlamaIndex's March 26 blog post ("Files Are All You Need") makes a compelling case for files as the primary context management abstraction for long-running agents — including storing compressed conversation histories when context compaction triggers.

This pattern solves the token budget problem. It creates a new monitoring problem: file-based context compaction is a behavioral boundary, and there's currently no standardized way to observe whether agent behavior changed after crossing one.

What I mean

When an agent compacts context into a file (or summarizes + discards older messages), two things happen:

  1. The agent's effective "memory" is now a summary, not the original trace
  2. The vocabulary, task focus, and tool-use patterns may have shifted silently

The agent continues running. If there's no instrument watching for the shift, you won't know until an output is visibly wrong — which in long-horizon agents is often too late.

The gap

LlamaIndex has excellent per-query and per-tool instrumentation via callbacks. What's missing is a compaction boundary event with enough metadata to enable cross-boundary behavioral comparison:

  • Which messages were dropped?
  • What was the summary produced?
  • Did topic focus, tool-use distribution, or vocabulary shift between pre/post windows?
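
The cross-boundary comparison asked about above could be approximated with two simple metrics. This is a minimal sketch, not part of LlamaIndex or of the issue's proposal: the function names tool_use_distance and vocabulary_overlap are hypothetical, and total variation distance and Jaccard similarity are just two plausible choices for comparing a pre-compaction window with a post-compaction window.

```python
from collections import Counter


def tool_use_distance(pre_calls, post_calls):
    """Total variation distance between two tool-call frequency distributions.

    0.0 means the same mix of tools, 1.0 means no tool in common.
    """
    pre, post = Counter(pre_calls), Counter(post_calls)
    if not pre or not post:
        # Two empty windows are identical; one empty window is maximally different
        return 0.0 if not pre and not post else 1.0
    pre_total, post_total = sum(pre.values()), sum(post.values())
    tools = set(pre) | set(post)
    return 0.5 * sum(abs(pre[t] / pre_total - post[t] / post_total) for t in tools)


def vocabulary_overlap(pre_text, post_text):
    """Jaccard similarity of the lowercase word sets (1.0 = same vocabulary)."""
    pre_words = set(pre_text.lower().split())
    post_words = set(post_text.lower().split())
    if not pre_words and not post_words:
        return 1.0
    return len(pre_words & post_words) / len(pre_words | post_words)


# Example windows: tool names called before and after a compaction boundary
shift = tool_use_distance(["search", "search", "calc"], ["search", "calc", "calc"])
overlap = vocabulary_overlap("refund the order", "the order shipped")
print(f"tool-use shift: {shift:.2f}, vocabulary overlap: {overlap:.2f}")
```

Whitespace tokenization is deliberately crude here; a real monitor would likely use the same tokenizer as the agent or an embedding-based comparison.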

What I'm proposing

A CompactionEvent or equivalent callback hook (similar to existing CBEventType patterns) that fires at the context compaction boundary, emitting:

from dataclasses import dataclass
from datetime import datetime

@dataclass
class CompactionEvent:
    pre_compaction_message_count: int
    post_compaction_message_count: int
    summary_text: str
    dropped_token_count: int
    timestamp: datetime

This would let observability tools, monitoring libraries, and production operators attach a behavioral fingerprint before and after compaction — enabling rollback, alerting, and drift detection without modifying the core compaction logic.
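
One way a monitor might use such an event for alerting and rollback is sketched below. Everything here is an assumption for illustration: CompactionGuard, its method names, and the missing_word_fraction metric are hypothetical and do not exist in LlamaIndex or in compression-monitor. The guard snapshots the message list before compaction, compares recent agent outputs across the boundary, and hands back the snapshot when drift exceeds a threshold.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


def missing_word_fraction(pre_outputs: List[str], post_outputs: List[str]) -> float:
    """Fraction of pre-compaction output words that no longer appear afterwards."""
    pre_words = set(" ".join(pre_outputs).lower().split())
    post_words = set(" ".join(post_outputs).lower().split())
    return len(pre_words - post_words) / len(pre_words) if pre_words else 0.0


@dataclass
class CompactionGuard:
    """Hypothetical helper: snapshot before compaction, check drift after it."""

    drift_score: Callable[[List[str], List[str]], float]  # metric in [0, 1]
    threshold: float = 0.5
    snapshot: Optional[List[str]] = None
    pre_outputs: List[str] = field(default_factory=list)
    alerts: List[str] = field(default_factory=list)

    def before_compaction(self, messages: List[str], outputs: List[str]) -> None:
        # Keep copies so later mutation by the agent cannot alter the snapshot
        self.snapshot = list(messages)
        self.pre_outputs = list(outputs)

    def after_compaction(self, messages: List[str], outputs: List[str]) -> List[str]:
        if self.snapshot is None:
            return messages  # nothing to compare against
        score = self.drift_score(self.pre_outputs, outputs)
        if score > self.threshold:
            self.alerts.append(f"drift {score:.2f} exceeds {self.threshold:.2f}")
            return list(self.snapshot)  # roll back to the pre-compaction messages
        return messages


guard = CompactionGuard(drift_score=missing_word_fraction, threshold=0.5)
guard.before_compaction(["m1", "m2", "m3"], ["refund order status"])
restored = guard.after_compaction(["summary", "m3"], ["weather forecast today"])
print(restored, guard.alerts)
```

Rolling back to the full history undoes the token savings, so in practice the caller would probably retry compaction with a different summary rather than keep the snapshot indefinitely.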

Reference

I built a toolkit for exactly this gap: compression-monitor. It currently hooks into frameworks via filesystem inspection (LangChain compaction markers), but first-class events from the framework would be cleaner and more reliable.

Happy to draft a PR for the callback type if there's interest in adding this to the core.callbacks surface.

extent analysis

Fix Plan

To address the issue, we need to implement a CompactionEvent callback hook. Here are the steps:

  • Define the CompactionEvent class with the required metadata:
from datetime import datetime

class CompactionEvent:
    def __init__(self, pre_compaction_message_count: int, post_compaction_message_count: int, 
                 summary_text: str, dropped_token_count: int, timestamp: datetime):
        self.pre_compaction_message_count = pre_compaction_message_count
        self.post_compaction_message_count = post_compaction_message_count
        self.summary_text = summary_text
        self.dropped_token_count = dropped_token_count
        self.timestamp = timestamp
  • Create a callback hook in the core.callbacks surface that fires at the context compaction boundary:
from typing import Callable

class Callbacks:
    def __init__(self):
        self.compaction_event_callbacks = []

    def add_compaction_event_callback(self, callback: Callable[[CompactionEvent], None]):
        self.compaction_event_callbacks.append(callback)

    def emit_compaction_event(self, event: CompactionEvent):
        for callback in self.compaction_event_callbacks:
            callback(event)
  • Modify the context compaction logic to emit the CompactionEvent:
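def compact_context(messages, callbacks, summarize, count_tokens, keep_last=4):
    # Hypothetical sketch: summarize and count_tokens are caller-supplied callables,
    # and keep_last (must be >= 1) is how many recent messages survive compaction.
    dropped, kept = messages[:-keep_last], messages[-keep_last:]
    summary_text = summarize(dropped)
    compacted = [summary_text] + kept

    event = CompactionEvent(
        pre_compaction_message_count=len(messages),
        post_compaction_message_count=len(compacted),
        summary_text=summary_text,
        dropped_token_count=sum(count_tokens(m) for m in dropped),
        timestamp=datetime.now(),
    )
    callbacks.emit_compaction_event(event)
    return compacted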

Verification

To verify that the fix worked, you can:

  • Add a test callback to the compaction_event_callbacks list and check that it is called with the correct CompactionEvent metadata.
  • Use a monitoring library or toolkit (such as compression-monitor) to attach a behavioral fingerprint before and after compaction.
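
The first verification step can be sketched as a small self-contained test. It restates the CompactionEvent and Callbacks classes from the Fix Plan (as a dataclass, so events compare by value) so that it runs on its own, and it emits the event directly instead of going through a real compaction path, which is an assumption of this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass
class CompactionEvent:
    pre_compaction_message_count: int
    post_compaction_message_count: int
    summary_text: str
    dropped_token_count: int
    timestamp: datetime


class Callbacks:
    def __init__(self) -> None:
        self.compaction_event_callbacks: List[Callable[[CompactionEvent], None]] = []

    def add_compaction_event_callback(self, callback: Callable[[CompactionEvent], None]) -> None:
        self.compaction_event_callbacks.append(callback)

    def emit_compaction_event(self, event: CompactionEvent) -> None:
        for callback in self.compaction_event_callbacks:
            callback(event)


def test_compaction_callback_receives_event() -> None:
    received: List[CompactionEvent] = []
    callbacks = Callbacks()
    # list.append serves as the test callback: it records every emitted event
    callbacks.add_compaction_event_callback(received.append)

    event = CompactionEvent(40, 6, "User asked about refunds.", 9500, datetime.now())
    callbacks.emit_compaction_event(event)

    assert received == [event]
    assert received[0].post_compaction_message_count < received[0].pre_compaction_message_count


test_compaction_callback_receives_event()
```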

Extra Tips

  • Consider adding additional metadata to the CompactionEvent class as needed.
  • Ensure that the CompactionEvent callback hook is properly documented and exposed in the core.callbacks surface.
