openclaw - Commitment delivery should revalidate source context before sending inferred open-loop follow-ups

openclaw, 2026-05-25 06:38:42

Native inferred commitments can send stale follow-ups because due commitment delivery is currently driven by stored commitment metadata without replaying the source conversation or allowing tools/live verification. This is documented behavior, but it creates a sharp edge for mutable open loops: the original issue can be resolved, corrected, or superseded before the due window, while the commitment still delivers the old suggestedText.

This caused a real stale Slack thread follow-up in production. We have disabled commitments.enabled locally for now.

Environment

OpenClaw: 2026.5.22 (a374c3a)
Config at time of incident:

{
  "commitments": {
    "enabled": true,
    "maxPerDay": 3
  },
  "agents": {
    "defaults": {
      "heartbeat": {
        "every": "1h",
        "target": "slack",
        "lightContext": true,
        "isolatedSession": true
      }
    }
  }
}

Documented behavior checked

This does not appear to be a local config misunderstanding. The local docs describe commitments as opt-in inferred follow-up memories delivered through heartbeat, and explicitly state that commitment delivery prompts do not replay original conversation text and run without OpenClaw tools.

Relevant local docs:

- docs/concepts/commitments.md: commitments are short-lived inferred follow-up memories; heartbeat delivers them when due.
- docs/concepts/commitments.md: delivery prompts do not replay the original conversation text, and due commitment heartbeat turns run without tools.
- docs/cli/commitments.md: CLI lists/manages inferred follow-up commitments and exposes suggested check-in text.
- Config type: CommitmentsConfig.enabled enables inferred follow-up extraction, storage, and heartbeat delivery.

So the implementation seems aligned with docs; the issue is that the documented delivery model is unsafe for some inferred open_loop commitments.

Incident, sanitized

A Slack thread asked the agent to build a competitor review. The first assistant turn asked the user to confirm a competitor list. Shortly after, the user corrected the list: remove one brand and add another. The agent then completed the work using the corrected list, created/updated the artifact, and later repaired the artifact after a user follow-up.

Despite that resolution, a native commitment created from the initial confirmation question remained pending and was delivered later with the stale original list.

Sanitized commitment record excerpt:

{
  "kind": "open_loop",
  "source": "agent_promise",
  "status": "sent",
  "reason": "The assistant asked the user to confirm the competitor list before proceeding with the broader competitor review, creating an unscheduled open loop if they do not respond.",
  "suggestedText": "Checking back on the competitor list: should I keep [old competitor list]?",
  "dedupeKey": "competitor-review:confirm-list:2026-05-21",
  "confidence": 0.86,
  "createdAt": "2026-05-21T14:00:42.977Z",
  "sentAt": "2026-05-21T17:53:21.473Z",
  "sourceMessageId": "<redacted-thread-root>",
  "sourceRunId": "<redacted>"
}

Relevant timeline:

- 10:00 ET: user asks for competitor review; agent asks for confirmation of initial list.
- 10:01 ET: user corrects the list.
- 10:12 ET: agent completes the review using the corrected list and explicitly says the removed brand was excluded.
- 13:17 ET: agent repairs the sheet/artifact and reports the corrected state.
- 13:53 ET: commitment delivery sends the stale original-list check-in.
- 14:49 ET: user replies with a short acknowledgement.
- Main thread handler returns NO_REPLY because the latest visible user message was a bare acknowledgement without enough surrounding context.

Important nuance: the normal hourly heartbeat scan behaved correctly earlier and stayed quiet. The stale message came from the commitment-delivery subpath of the heartbeat runner.
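The record timestamps and the timeline can be cross-checked with a few lines of Python. This is only a sanity check on the sanitized excerpt above; it assumes "ET" in the timeline means EDT (UTC-4), which is the offset in effect in late May, and parse_utc is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(ts: str) -> datetime:
    # Replace the trailing "Z" so this also parses on Python versions before 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


# Values copied from the sanitized commitment record.
created_at = parse_utc("2026-05-21T14:00:42.977Z")
sent_at = parse_utc("2026-05-21T17:53:21.473Z")

# Assumption: "ET" in the timeline is EDT (UTC-4).
EDT = timezone(timedelta(hours=-4))

delay = sent_at - created_at
print(created_at.astimezone(EDT).strftime("%H:%M"))  # 10:00
print(sent_at.astimezone(EDT).strftime("%H:%M"))     # 13:53
print(delay)                                         # 3:52:38.496000
```

The commitment was extracted at the 10:00 ET turn and delivered at 13:53 ET, so the thread had almost four hours to change state before the stored suggestedText went out.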

Observed implementation behavior

In the installed bundle, buildCommitmentHeartbeatPrompt creates a prompt containing commitment fields like reason, suggestedText, sourceMessageId, and sourceRunId. The prompt says commitment metadata is untrusted, but also says not to use tools because of commitment content. The delivery turn therefore cannot inspect the latest thread, artifact state, or related sessions before deciding whether to send.

The heartbeat runner then separately enumerates due commitment session keys and calls runOnce with reason: "commitment" for each due session. That means due commitments can be delivered independently of the normal heartbeat task prompt/guardrails.

Why this is risky

Stored inferred commitments are hypotheses, not facts. For mutable open loops, the state can change after extraction:

- the user may answer in the same thread after the source message,
- the agent may complete or supersede the work,
- an artifact/sheet/doc may be updated,
- the user may close the loop in a different channel/thread,
- a cron/background job may finish the work,
- memory may record newer resolution state.

If the delivery turn only sees stale commitment metadata, it can re-open completed work, ask questions that were already answered, or post outdated artifact/list/status claims.

Proposed design

Commitment delivery should be revalidation-first, not metadata-first.

Suggested flow:

1. Commitment becomes due.
2. Trigger an internal review turn, not an immediate user-facing send.
3. Load the source session/thread from sessionKey and sourceMessageId, including messages after the source message.
4. Search relevant recent memory/sessions/artifacts since createdAtMs for closure/supersession evidence, especially for the same topic/entities.
5. Classify current state:
   - still_pending_user: user input is still needed; optional follow-up may be appropriate.
   - still_pending_agent: agent owes work; do/continue the work or report current status.
   - completed: dismiss silently.
   - superseded: dismiss silently.
   - unclear: retry later or dismiss silently; do not poke the user.
6. Only then decide whether to send a visible message.
7. If context/tools are unavailable, default to no visible send rather than using stale suggestedText.
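The classification and decision steps of the flow above could be sketched roughly as follows. This is a minimal Python illustration, not OpenClaw code: Evidence, classify, decide, and the Action values are hypothetical names, and the boolean evidence fields stand in for whatever the internal review turn would actually derive from the thread, memory, and artifacts.

```python
from dataclasses import dataclass
from enum import Enum


class LoopState(Enum):
    STILL_PENDING_USER = "still_pending_user"
    STILL_PENDING_AGENT = "still_pending_agent"
    COMPLETED = "completed"
    SUPERSEDED = "superseded"
    UNCLEAR = "unclear"


class Action(Enum):
    SEND_FOLLOW_UP = "send_follow_up"
    CONTINUE_WORK = "continue_work"
    DISMISS = "dismiss"
    RETRY_LATER = "retry_later"


@dataclass
class Evidence:
    # Hypothetical summary of what the review turn found (assumed fields).
    context_available: bool
    user_replied_after_source: bool = False
    agent_completed_after_source: bool = False
    superseded_elsewhere: bool = False


def classify(e: Evidence) -> LoopState:
    if not e.context_available:
        return LoopState.UNCLEAR              # cannot verify current state
    if e.agent_completed_after_source:
        return LoopState.COMPLETED
    if e.superseded_elsewhere:
        return LoopState.SUPERSEDED
    if e.user_replied_after_source:
        return LoopState.STILL_PENDING_AGENT  # user answered; agent owes work
    return LoopState.STILL_PENDING_USER


def decide(state: LoopState) -> Action:
    # Only still_pending_user may lead to a visible follow-up.
    return {
        LoopState.STILL_PENDING_USER: Action.SEND_FOLLOW_UP,
        LoopState.STILL_PENDING_AGENT: Action.CONTINUE_WORK,
        LoopState.COMPLETED: Action.DISMISS,
        LoopState.SUPERSEDED: Action.DISMISS,
        LoopState.UNCLEAR: Action.RETRY_LATER,
    }[state]


# The incident: the user corrected the list and the agent completed the work.
incident = Evidence(
    context_available=True,
    user_replied_after_source=True,
    agent_completed_after_source=True,
)
print(decide(classify(incident)).value)  # dismiss
```

In this sketch the unclear state maps to retrying later; dismissing silently would be equally consistent with the flow, as long as nothing is sent to the user.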

Minimum viable safety checks

Even before a full revalidation flow, these would prevent the incident above:

- Suppress inferred open_loop delivery when there is a newer assistant completion in the same session/thread after sourceMessageId.
- Suppress when there is a newer user correction/answer in the same session/thread after sourceMessageId.
- Include recent thread context after sourceMessageId in the commitment delivery prompt.
- Do not directly send or closely paraphrase suggestedText; treat it as a stale hint.
- Allow tools or provide a precomputed current-state snapshot for due commitments that mention artifacts, sheets, docs, counts, lists, or external state.
- If delivery cannot verify current state, mark/dismiss/retry internally instead of sending externally.
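The first two checks and the last one could be approximated with a guard like the one below. It is a deliberately crude sketch under stated assumptions: ThreadMessage and should_suppress are hypothetical names, "newer" is approximated as "timestamped after the commitment's createdAt" rather than by position after sourceMessageId, and any newer user or assistant message is treated as possible closure without trying to tell a completion from other replies.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass
class ThreadMessage:
    # Hypothetical shape for a thread entry; not an OpenClaw type.
    role: str          # "user" or "assistant"
    sent_at: datetime  # timezone-aware timestamp


def parse_utc(ts: str) -> datetime:
    # Replace the trailing "Z" so this also parses on Python versions before 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def should_suppress(commitment: dict,
                    thread: Optional[Sequence[ThreadMessage]]) -> bool:
    """Return True when a due inferred open_loop should not be sent."""
    if commitment.get("kind") != "open_loop":
        return False  # this sketch only guards open-loop commitments
    if thread is None:
        return True   # state cannot be verified: do not send externally
    created_at = parse_utc(commitment["createdAt"])
    return any(
        m.role in ("user", "assistant") and m.sent_at > created_at
        for m in thread
    )


commitment = {"kind": "open_loop", "createdAt": "2026-05-21T14:00:42.977Z"}
# Timeline entries at minute precision; the seconds are assumed.
thread = [
    ThreadMessage("user", parse_utc("2026-05-21T14:01:00Z")),       # correction
    ThreadMessage("assistant", parse_utc("2026-05-21T14:12:00Z")),  # completion
]
print(should_suppress(commitment, thread))  # True
print(should_suppress(commitment, []))      # False
print(should_suppress(commitment, None))    # True
```

A guard this conservative will also suppress some follow-ups that are still live, which seems an acceptable trade for a minimum viable check given that the failure mode is a stale message posted to the user.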

Acceptance criteria

- A due commitment from an earlier open loop does not send if the same thread later contains a user answer/correction and assistant completion.
- A due commitment does not send stale artifact/list/status text when a newer artifact/session/memory state contradicts it.
- Commitment delivery has access to enough recent context to decide whether the commitment is still live.
- HEARTBEAT.md or equivalent heartbeat policy guardrails are applied consistently to commitment-only heartbeat turns, or commitment delivery has its own equivalent safety policy.
- Tests cover stale open-loop suppression and cross-session closure/supersession where feasible.
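A regression test for the first criterion might look something like this. It is a sketch only: deliver_due is a stand-in for the real delivery step and is not an OpenClaw function, the thread entries are reduced to (role, timestamp) pairs, and the timestamps replay the incident timeline at minute precision.

```python
from datetime import datetime, timezone
from typing import Callable, List, Tuple

# Hypothetical thread entry: (role, sent_at).
Entry = Tuple[str, datetime]

STALE_TEXT = "Checking back on the competitor list: should I keep [old competitor list]?"


def utc(hour: int, minute: int, second: int = 0) -> datetime:
    return datetime(2026, 5, 21, hour, minute, second, tzinfo=timezone.utc)


def deliver_due(created_at: datetime, thread: List[Entry],
                send: Callable[[str], None], text: str) -> str:
    """Stand-in delivery step: send only if the thread was quiet since extraction."""
    if any(sent_at > created_at for _, sent_at in thread):
        return "dismissed"
    send(text)
    return "sent"


def test_stale_open_loop_is_not_sent() -> None:
    sent: List[str] = []
    thread = [
        ("user", utc(14, 0)),        # 10:00 ET request
        ("assistant", utc(14, 0)),   # confirmation question
        ("user", utc(14, 1)),        # 10:01 ET correction
        ("assistant", utc(14, 12)),  # 10:12 ET completion
    ]
    status = deliver_due(utc(14, 0, 42), thread, sent.append, STALE_TEXT)
    assert status == "dismissed"
    assert sent == []


def test_quiet_thread_still_sends() -> None:
    sent: List[str] = []
    thread = [("user", utc(14, 0)), ("assistant", utc(14, 0))]
    status = deliver_due(utc(14, 0, 42), thread, sent.append, STALE_TEXT)
    assert status == "sent"
    assert sent == [STALE_TEXT]
```

Both functions follow the pytest naming convention, so a runner would collect them as-is; the second test guards against a fix that simply disables delivery altogether.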

Workaround

Disable native inferred commitments:

openclaw config set commitments.enabled false

This avoids the stale delivery path, but loses the useful parts of inferred commitments.
