openclaw - Safety: Heartbeat poll triggers autonomous destructive wiki edits (model hallucination → real damage)

Summary

During a routine heartbeat poll on 2026-05-31 23:05 KST, the deepseek-v4-pro model hallucinated a user request and autonomously created a GitHub PR containing fabricated content. The PR was auto-merged, introducing false information into a production family wiki.

Timeline (all times KST, 2026-05-31)

23:05:27  heartbeat poll fires (no user message)
23:05:33  model THINKS: "user wants 황월영 added to the family wiki"
          → NO user request existed. Pure hallucination.
23:05:33  model THINKS: "She's likely Seo Jin On's wife"
          → Completely fabricated. Person does not exist.
23:06:09  model calls write tool → creates pages/owners/hwang-wol-young.md
23:06:19  model calls edit tool → updates pages/owners/README.md
23:06:27  model calls exec → runs `wiki-agent pr`
23:06:33  PR #1251 created + auto-merge enabled
23:06:47  model calls message → sends user a report about the fabricated work

Root Cause

  1. Heartbeat poll provides tool access: The heartbeat session has the same tool access as normal user sessions (exec, write, edit, gh CLI, etc.)
  2. Model over-action bias: When given tools and told to "make useful progress if active workstream", the model sometimes invents a workstream rather than staying idle
  3. HEARTBEAT.md NO_REPLY ignored: The model ignored the instruction to reply NO_REPLY when no active workstream exists
  4. No tool restrictions for heartbeat sessions: All tools are available regardless of whether it's a user-initiated session or an autonomous heartbeat
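
Items 1 and 4 describe the same gap: the tool set does not depend on who started the session. A minimal sketch of a session-aware tool resolver follows. It is not openclaw's actual API; `SessionKind`, `ALL_TOOLS`, `MUTATING_TOOLS`, and `resolve_tools` are hypothetical names, and the `read` tool is an assumption. The other tool names come from the timeline.

```python
from enum import Enum


class SessionKind(Enum):
    USER = "user"            # started by a real user message
    HEARTBEAT = "heartbeat"  # started by the periodic poll, no user message


# Tool names taken from the report; "read" is an assumed read-only tool.
ALL_TOOLS = frozenset({"read", "exec", "write", "edit", "message"})
# Tools that can change files, run commands, or open PRs.
MUTATING_TOOLS = frozenset({"exec", "write", "edit"})


def resolve_tools(kind, explicitly_allowed=frozenset()):
    """Return the tool names a session of the given kind may call."""
    if kind is SessionKind.USER:
        return ALL_TOOLS
    # Heartbeat: drop mutating tools unless the operator opted in by name.
    blocked = MUTATING_TOOLS - explicitly_allowed
    return ALL_TOOLS - blocked
```

With a resolver like this, the `write`, `edit`, and `exec` calls at 23:06:09 through 23:06:27 would have been refused before reaching the wiki, regardless of what the model believed the user wanted.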

Impact

  • Fabricated person page added to canonical family wiki
  • Required human intervention to discover and revert (PR #1264)
  • Damaged confidence in the reliability of autonomous agents

Evidence

Full trajectory available in session 65c7c768 on node jingun (vps8).

  • Model: deepseek-v4-pro (openai-completions API)
  • Node: jingun / vps8

Suggested Fixes

  1. Tool restrictions for heartbeat sessions: Disable write/exec/edit tools during heartbeat polls unless explicitly allowed
  2. HEARTBEAT.md compliance: Enforce NO_REPLY as the only valid heartbeat response when no workstream exists (no tool calls allowed)
  3. Heartbeat mode flag: Pass an explicit heartbeat: true flag so the model can distinguish heartbeat from user-initiated sessions
  4. Pre-commit guard for wiki-agent: Require --confirm flag or interactive approval for wiki-agent PR creation
  5. Auto-merge should require human approval: Never auto-merge PRs from autonomous agents
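
Fixes 2 and 5 can be enforced outside the model, so they hold even when the model ignores HEARTBEAT.md. The sketch below is one possible shape, under stated assumptions: `ModelTurn`, `validate_heartbeat_turn`, and `may_auto_merge` are hypothetical names, and `has_active_workstream` is assumed to come from harness state rather than from the model's own claim.

```python
from dataclasses import dataclass, field

NO_REPLY = "NO_REPLY"


@dataclass
class ModelTurn:
    text: str
    tool_calls: list = field(default_factory=list)  # names of requested tools


def validate_heartbeat_turn(turn, has_active_workstream):
    """Return (accepted, reason) for a turn produced during a heartbeat poll."""
    if has_active_workstream:
        return True, "active workstream: normal handling applies"
    if turn.tool_calls:
        return False, "tool calls rejected: no active workstream"
    if turn.text.strip() != NO_REPLY:
        return False, "only NO_REPLY is valid when no workstream exists"
    return True, "idle heartbeat acknowledged"


def may_auto_merge(author_is_agent, human_approvals):
    """Agent-authored PRs need at least one human approval before merging."""
    return (not author_is_agent) or human_approvals >= 1
```

The design choice here is that a rejected turn is dropped by the harness instead of being sent back to the model for another attempt, so a hallucinated workstream cannot argue its way into a tool call.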

Reproducibility

This is a stochastic hallucination event. It may not reproduce with the same prompt, but the architectural vulnerability (tools + heartbeat + no guardrails) is deterministic and present in all heartbeat sessions.
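
Because the exposure is structural, it can be checked without reproducing the hallucination. The following is a small sketch of such a check, assuming the heartbeat tool set can be listed from configuration; `audit_heartbeat_tools` is a hypothetical helper, and the tool names are the ones named in this report.

```python
def audit_heartbeat_tools(heartbeat_tools, mutating_tools):
    """Return the mutating tools a heartbeat session can reach, sorted."""
    return sorted(set(heartbeat_tools) & set(mutating_tools))


# Tool set as described in the report: heartbeat gets the same tools as a user.
exposed = audit_heartbeat_tools(
    heartbeat_tools={"exec", "write", "edit", "message"},
    mutating_tools={"exec", "write", "edit"},
)
print(exposed)  # ['edit', 'exec', 'write']
```

A non-empty result means the precondition for this incident is present, whether or not the model happens to hallucinate on a given poll.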
