openclaw - 💡(How to fix) Fix [Bug]: commitments store has no write serialization — concurrent `markCommitmentsStatus` / `markCommitmentsAttempted` calls silently lose dismiss/sent/attempt updates (data-loss class; can re-deliver an already-sent commitment) [1 pull requests]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

The commitments store (~/.openclaw/commitments/commitments.json) is updated by markCommitmentsAttempted / markCommitmentsStatus (src/commitments/store.ts:434-461, 463-493) and by upsertInferredCommitments (src/commitments/store.ts:295-331) via an unsynchronized load-modify-save pattern: each call independently reads the entire store, mutates in memory, and writes the whole file back. There is no lockfile, mutex, or writer queue around the critical section.

When two callers race (background heartbeat + user CLI dismiss, or two heartbeat ticks for different sessions, or two Promise.all-style calls within one heartbeat tick), whichever save lands second silently overwrites the first caller's mutation with stale data from before the first save.

By contrast, the session store (src/config/sessions/store.ts:432, 442, 556, 638) wraps every mutation in runExclusiveSessionStoreWrite(storePath, fn) — a per-path async writer queue defined in src/config/sessions/store-writer.ts:14-97 (uses WRITER_QUEUES: Map<string, SessionStoreWriterQueue>). That exact pattern is exported from the session-store and could be reused for commitments. Commitments has no equivalent protection.

Failure modes that manifest under concurrency:

  1. Lost dismiss — User runs openclaw commitments dismiss cm_xyz while the heartbeat is bumping attempts on another commitment. Heartbeat's snapshot lacks the dismiss; heartbeat's save wins → cm_xyz reverts to pending and gets delivered to the user.
  2. Lost "sent" status → duplicate delivery — Heartbeat marks cm_xyz as sent (src/infra/heartbeat-runner.ts:1929). A concurrent markCommitmentsAttempted (src/infra/heartbeat-runner.ts:1568) on a different commitment overwrites the snapshot → next heartbeat tick sees cm_xyz as still pending and re-sends the same commitment. User-visible: duplicate follow-up message.
  3. Lost attempts counter — Same race: attempts: N+1 gets rolled back to N; the retry-budget cap (maxPerDay) is silently extended; the user gets more reminders than the configured policy allows.

Verified live on v2026.5.10-beta.1 (152ea9af34); same source on v2026.5.7 stable.

Root Cause

The commitments store (~/.openclaw/commitments/commitments.json) is updated by markCommitmentsAttempted / markCommitmentsStatus (src/commitments/store.ts:434-461, 463-493) and by upsertInferredCommitments (src/commitments/store.ts:295-331) via an unsynchronized load-modify-save pattern: each call independently reads the entire store, mutates in memory, and writes the whole file back. There is no lockfile, mutex, or writer queue around the critical section.

When two callers race (background heartbeat + user CLI dismiss, or two heartbeat ticks for different sessions, or two Promise.all-style calls within one heartbeat tick), whichever save lands second silently overwrites the first caller's mutation with stale data from before the first save.

By contrast, the session store (src/config/sessions/store.ts:432, 442, 556, 638) wraps every mutation in runExclusiveSessionStoreWrite(storePath, fn) — a per-path async writer queue defined in src/config/sessions/store-writer.ts:14-97 (uses WRITER_QUEUES: Map<string, SessionStoreWriterQueue>). That exact pattern is exported from the session-store and could be reused for commitments. Commitments has no equivalent protection.

Failure modes that manifest under concurrency:

  1. Lost dismiss — User runs openclaw commitments dismiss cm_xyz while the heartbeat is bumping attempts on another commitment. Heartbeat's snapshot lacks the dismiss; heartbeat's save wins → cm_xyz reverts to pending and gets delivered to the user.
  2. Lost "sent" status → duplicate delivery — Heartbeat marks cm_xyz as sent (src/infra/heartbeat-runner.ts:1929). A concurrent markCommitmentsAttempted (src/infra/heartbeat-runner.ts:1568) on a different commitment overwrites the snapshot → next heartbeat tick sees cm_xyz as still pending and re-sends the same commitment. User-visible: duplicate follow-up message.
  3. Lost attempts counter — Same race: attempts: N+1 gets rolled back to N; the retry-budget cap (maxPerDay) is silently extended; the user gets more reminders than the configured policy allows.

Verified live on v2026.5.10-beta.1 (152ea9af34); same source on v2026.5.7 stable.

Fix Action

Fixed

Code Example

cd ~/Gittensor/OpenCoven/openclaw-upstream     # any current checkout (HEAD 152ea9af34 verified)
export PATH=~/.nvm/versions/node/v22.22.2/bin:$PATH
pnpm install --prefer-offline                  # if dist/ isn't already built
pnpm build                                     # if needed

cat > race-repro.mjs <<'EOF'
import {
  loadCommitmentStore,
  saveCommitmentStore,
  markCommitmentsStatus,
  resolveCommitmentStorePath,
} from "./dist/commitments/store.js";

function seedStore() {
  const now = Date.now();
  return {
    version: 1,
    commitments: ["A", "B"].map((tag) => ({
      id: `cm_race${tag}`,
      agentId: "qa-race",
      sessionKey: "qa-race-session",
      channel: "slack",
      kind: "follow_up",
      sensitivity: "normal",
      source: "qa-seed",
      status: "pending",
      reason: `QA race repro id ${tag}`,
      suggestedText: `ping ${tag}`,
      dedupeKey: `qa-race-${tag}`,
      confidence: 0.9,
      dueWindow: { earliestMs: now - 1000, latestMs: now + 3600_000, timezone: "UTC" },
      createdAtMs: now,
      updatedAtMs: now,
      attempts: 0,
    })),
  };
}
async function snap() {
  const s = await loadCommitmentStore();
  return Object.fromEntries(s.commitments.map((c) => [c.id, c.status]));
}

let loss = 0, ok = 0;
for (let i = 0; i < 20; i++) {
  await saveCommitmentStore(undefined, seedStore());
  await Promise.all([
    markCommitmentsStatus({ ids: ["cm_raceA"], status: "dismissed", nowMs: Date.now() }),
    markCommitmentsStatus({ ids: ["cm_raceB"], status: "dismissed", nowMs: Date.now() }),
  ]);
  const a = await snap();
  if (a.cm_raceA === "dismissed" && a.cm_raceB === "dismissed") ok++; else loss++;
  console.log(`trial ${i+1}: A=${a.cm_raceA} B=${a.cm_raceB}`);
}
console.log(`\n${loss}/20 trials lost a write`);
EOF

node race-repro.mjs

---

trial 1: A=pending B=dismissed
trial 2: A=pending B=dismissed
trial 3: A=pending B=dismissed
trial 20: A=dismissed B=pending

20/20 trials lost a write

---

// src/commitments/store.ts:434-461
export async function markCommitmentsAttempted(params: {}): Promise<void> {
  const store = await loadCommitmentStore();          // ← read snapshot S0
  let changed = false;
  store.commitments = store.commitments.map((commitment) => {});   // ← mutate in memory
  if (changed) {
    await saveCommitmentStore(undefined, store);      // ← write back
  }
}

---

caller A                                  caller B                                disk
─────────                                 ─────────                                ─────
loadCommitmentStore()S0
                                          loadCommitmentStore()S0
mutate(A=dismissed in S0)S1_A
                                          mutate(B=dismissed in S0)S1_B
saveCommitmentStore(S1_A) ───────────────────────────────────────────────────→  {A:dismissed, B:pending}
                                          saveCommitmentStore(S1_B) ─────────→  {A:pending,   B:dismissed}A's dismiss is gone

---
RAW_BUFFERClick to expand / collapse

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

The commitments store (~/.openclaw/commitments/commitments.json) is updated by markCommitmentsAttempted / markCommitmentsStatus (src/commitments/store.ts:434-461, 463-493) and by upsertInferredCommitments (src/commitments/store.ts:295-331) via an unsynchronized load-modify-save pattern: each call independently reads the entire store, mutates in memory, and writes the whole file back. There is no lockfile, mutex, or writer queue around the critical section.

When two callers race (background heartbeat + user CLI dismiss, or two heartbeat ticks for different sessions, or two Promise.all-style calls within one heartbeat tick), whichever save lands second silently overwrites the first caller's mutation with stale data from before the first save.

By contrast, the session store (src/config/sessions/store.ts:432, 442, 556, 638) wraps every mutation in runExclusiveSessionStoreWrite(storePath, fn) — a per-path async writer queue defined in src/config/sessions/store-writer.ts:14-97 (uses WRITER_QUEUES: Map<string, SessionStoreWriterQueue>). That exact pattern is exported from the session-store and could be reused for commitments. Commitments has no equivalent protection.

Failure modes that manifest under concurrency:

  1. Lost dismiss — User runs openclaw commitments dismiss cm_xyz while the heartbeat is bumping attempts on another commitment. Heartbeat's snapshot lacks the dismiss; heartbeat's save wins → cm_xyz reverts to pending and gets delivered to the user.
  2. Lost "sent" status → duplicate delivery — Heartbeat marks cm_xyz as sent (src/infra/heartbeat-runner.ts:1929). A concurrent markCommitmentsAttempted (src/infra/heartbeat-runner.ts:1568) on a different commitment overwrites the snapshot → next heartbeat tick sees cm_xyz as still pending and re-sends the same commitment. User-visible: duplicate follow-up message.
  3. Lost attempts counter — Same race: attempts: N+1 gets rolled back to N; the retry-budget cap (maxPerDay) is silently extended; the user gets more reminders than the configured policy allows.

Verified live on v2026.5.10-beta.1 (152ea9af34); same source on v2026.5.7 stable.

Steps to reproduce

Deterministic in-process repro using the built dist/commitments/store.js directly. The same race triggers via CLI + heartbeat overlap in practice but the timing window is wider in-process.

cd ~/Gittensor/OpenCoven/openclaw-upstream     # any current checkout (HEAD 152ea9af34 verified)
export PATH=~/.nvm/versions/node/v22.22.2/bin:$PATH
pnpm install --prefer-offline                  # if dist/ isn't already built
pnpm build                                     # if needed

cat > race-repro.mjs <<'EOF'
import {
  loadCommitmentStore,
  saveCommitmentStore,
  markCommitmentsStatus,
  resolveCommitmentStorePath,
} from "./dist/commitments/store.js";

function seedStore() {
  const now = Date.now();
  return {
    version: 1,
    commitments: ["A", "B"].map((tag) => ({
      id: `cm_race${tag}`,
      agentId: "qa-race",
      sessionKey: "qa-race-session",
      channel: "slack",
      kind: "follow_up",
      sensitivity: "normal",
      source: "qa-seed",
      status: "pending",
      reason: `QA race repro id ${tag}`,
      suggestedText: `ping ${tag}`,
      dedupeKey: `qa-race-${tag}`,
      confidence: 0.9,
      dueWindow: { earliestMs: now - 1000, latestMs: now + 3600_000, timezone: "UTC" },
      createdAtMs: now,
      updatedAtMs: now,
      attempts: 0,
    })),
  };
}
async function snap() {
  const s = await loadCommitmentStore();
  return Object.fromEntries(s.commitments.map((c) => [c.id, c.status]));
}

let loss = 0, ok = 0;
for (let i = 0; i < 20; i++) {
  await saveCommitmentStore(undefined, seedStore());
  await Promise.all([
    markCommitmentsStatus({ ids: ["cm_raceA"], status: "dismissed", nowMs: Date.now() }),
    markCommitmentsStatus({ ids: ["cm_raceB"], status: "dismissed", nowMs: Date.now() }),
  ]);
  const a = await snap();
  if (a.cm_raceA === "dismissed" && a.cm_raceB === "dismissed") ok++; else loss++;
  console.log(`trial ${i+1}: A=${a.cm_raceA} B=${a.cm_raceB}`);
}
console.log(`\n${loss}/20 trials lost a write`);
EOF

node race-repro.mjs

Result (verified on v2026.5.10-beta.1 / 152ea9af34): 20/20 trials lose one of the two dismisses. One of the commitments reverts to pending every single time.

trial 1: A=pending B=dismissed
trial 2: A=pending B=dismissed
trial 3: A=pending B=dismissed
trial 20: A=dismissed B=pending

20/20 trials lost a write

Expected behavior

Both dismisses should succeed; final state should have both cm_raceA and cm_raceB marked dismissed. Equivalent reasoning applies to markCommitmentsAttempted and upsertInferredCommitments: independent mutations to disjoint records should not lose each other.

The session store achieves this with runExclusiveSessionStoreWrite(storePath, fn) (src/config/sessions/store-writer.ts:72) — every load-modify-save is queued per-store-path so a second writer can only start after the first writer's save commits. The commitments store should use the same pattern.

Actual behavior

markCommitmentsStatus and markCommitmentsAttempted perform the following sequence with no lock:

// src/commitments/store.ts:434-461
export async function markCommitmentsAttempted(params: {}): Promise<void> {
  const store = await loadCommitmentStore();          // ← read snapshot S0
  let changed = false;
  store.commitments = store.commitments.map((commitment) => {});   // ← mutate in memory
  if (changed) {
    await saveCommitmentStore(undefined, store);      // ← write back
  }
}

The save path uses privateFileStore(...).writeJson(...) which DOES atomic-rename on disk, so the file is never torn. But atomic-rename only protects against partial reads; it does not protect against last-writer-wins data loss between two read-modify-save cycles whose reads happen before the first cycle's save commits.

Sequence diagram of the bug:

caller A                                  caller B                                disk
─────────                                 ─────────                                ─────
loadCommitmentStore()           → S0
                                          loadCommitmentStore()         → S0
mutate(A=dismissed in S0)→S1_A
                                          mutate(B=dismissed in S0)→S1_B
saveCommitmentStore(S1_A) ───────────────────────────────────────────────────→  {A:dismissed, B:pending}
                                          saveCommitmentStore(S1_B) ─────────→  {A:pending,   B:dismissed}   ← A's dismiss is gone

20/20 reproduction shows this is deterministic when both markCommitmentsStatus calls share an event-loop tick (typical of Promise.all and concurrent CLI/heartbeat invocations).

OpenClaw version

v2026.5.10-beta.1, v2026.5.7

Operating system

Ubuntu 24.04

Install method

No response

Model

Not applicable

Provider / routing chain

Not applicable

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

Both dismisses should succeed; final state should have both cm_raceA and cm_raceB marked dismissed. Equivalent reasoning applies to markCommitmentsAttempted and upsertInferredCommitments: independent mutations to disjoint records should not lose each other.

The session store achieves this with runExclusiveSessionStoreWrite(storePath, fn) (src/config/sessions/store-writer.ts:72) — every load-modify-save is queued per-store-path so a second writer can only start after the first writer's save commits. The commitments store should use the same pattern.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix [Bug]: commitments store has no write serialization — concurrent `markCommitmentsStatus` / `markCommitmentsAttempted` calls silently lose dismiss/sent/attempt updates (data-loss class; can re-deliver an already-sent commitment) [1 pull requests]