openclaw - ✅(Solved) Fix [Bug]: sessions_history returns empty after tool execution — response silently lost [2 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#71872 (fetched 2026-04-26 05:07:16)
Comments: 0 · Participants: 1 · Timeline events: 4 · Reactions: 0
Timeline (top): cross-referenced ×2, referenced ×2

sessions_history returns empty for ACP harness sessions that ran tools, and the agent response is silently lost (never delivered). Another message must be sent to trigger delivery.

Root Cause

The session write lock is cleaned up as a dead-PID lock while a tool is still executing, or the lock is held for 191 s (Windows, #70857). Same root cause as #64362.

Fix Action

Workaround

Retry via sessions_list + direct JSONL file read after sessions_history returns empty.
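
The workaround above could look roughly like the following TypeScript sketch. The issue does not show the session JSONL schema or how the file path is obtained, so `SessionRecord`, `parseJsonl`, `historyWithFallback`, and the `sessionFilePath` parameter are all hypothetical names used for illustration; the path is assumed to come from the sessions_list output.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical record shape; the real session JSONL schema is not shown in the issue.
type SessionRecord = Record<string, unknown>;

// Parse JSONL text, skipping blank lines and any line that fails to parse
// (for example, a partially written trailing line).
function parseJsonl(text: string): SessionRecord[] {
  const records: SessionRecord[] = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const value: unknown = JSON.parse(trimmed);
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        records.push(value as SessionRecord);
      }
    } catch {
      // Ignore malformed lines instead of failing the whole read.
    }
  }
  return records;
}

// Fallback: if the history call returned nothing, read the session file directly.
// `sessionFilePath` is an assumption: a path taken from the sessions_list output.
function historyWithFallback(
  history: SessionRecord[],
  sessionFilePath: string,
): SessionRecord[] {
  if (history.length > 0) return history;
  return parseJsonl(readFileSync(sessionFilePath, "utf8"));
}
```

Skipping unparseable lines is a deliberate choice here: if the lock was removed mid-write, the last line of the file may be incomplete, and the earlier records are still worth recovering.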

PR fix notes

PR #1: fix(session-write-lock): prevent lock removal when subprocess inherits parent PID

Description (problem / solution / changelog)

Summary

When a subprocess inherits the parent gateway process PID (common on Windows and certain Node subprocess models) and exits, it may leave behind a session lock file with the parent's PID but no entry in the in-memory HELD_LOCKS map (since HELD_LOCKS is process-local).

Previously, shouldTreatAsOrphanSelfLock would incorrectly treat this as an orphan lock and remove it — even though the parent gateway process is still alive and actively using the session. This caused agent responses to be silently lost after tool execution: the lock was removed mid-write, corrupting the session JSONL file and making sessions_history return empty.

Root Cause

In shouldTreatAsOrphanSelfLock, the original logic only checked:

  1. Does the lock PID match process.pid?
  2. Does HELD_LOCKS NOT have this session file?

If both were true, it was considered an orphan. But this misses the subprocess scenario: the subprocess inherited the parent's PID, exited, and left a lock — yet the parent is still alive and may be trying to write.

Fix

Added an isPidAlive(pid) guard before declaring a self-PID lock as orphan. If the PID is alive, the lock is never treated as orphan, regardless of HELD_LOCKS state:

  • Linux: isPidAlive is accurate; the existing starttime check separately detects PID recycling
  • Windows/macOS: getProcessStartTime returns null, so isPidAlive is the only safeguard against removing a lock held by a live parent process
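
The guard described above can be sketched as follows. This is not the code from src/agents/session-write-lock.ts, which the page does not show: the function signature, the `heldLocks` set standing in for HELD_LOCKS, and the injectable `pidAlive` parameter are assumptions made for illustration. The liveness check uses `process.kill(pid, 0)`, the usual Node.js way to test whether a process exists without sending it a signal.

```typescript
// Returns true if a process with this PID currently exists.
// process.kill(pid, 0) sends no signal; it only performs the existence check.
function isPidAlive(pid: number): boolean {
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we are not allowed to signal it.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Stand-in for the process-local HELD_LOCKS map, keyed by session file path.
const heldLocks = new Set<string>();

// Decide whether a lock file carrying our own PID may be removed as an orphan.
// `pidAlive` is injectable only so the pre-fix behavior can be demonstrated.
function shouldTreatAsOrphanSelfLock(
  lockPid: number,
  sessionFile: string,
  pidAlive: (pid: number) => boolean = isPidAlive,
): boolean {
  if (lockPid !== process.pid) return false; // not a self-PID lock
  if (heldLocks.has(sessionFile)) return false; // this process holds the lock
  if (pidAlive(lockPid)) return false; // new guard: the owning PID is alive
  return true;
}
```

Because the lock PID equals `process.pid` and the current process is by definition alive, the guarded version returns false whenever the default liveness check is used. Passing `() => false` for `pidAlive` reproduces the two-condition logic from before the fix, which is consistent with the PR skipping a test that relied on the old behavior.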

Files Changed

  • src/agents/session-write-lock.ts — added isPidAlive(pid) guard in shouldTreatAsOrphanSelfLock
  • src/agents/session-write-lock.test.ts — skipped one test that relied on the old unsafe behavior; documented why

Related Issues

Fixes: openclaw/openclaw#71872
Related: openclaw/openclaw#64362, openclaw/openclaw#70857

Changed files

  • .agents/skills/openclaw-pr-maintainer/SKILL.md (modified, +15/-0)
  • .env.example (modified, +1/-0)
  • .github/labeler.yml (modified, +16/-0)
  • .github/workflows/auto-response.yml (modified, +13/-492)
  • .github/workflows/ci.yml (modified, +1/-0)
  • .github/workflows/openclaw-live-and-e2e-checks-reusable.yml (modified, +5/-0)
  • AGENTS.md (modified, +4/-1)
  • CHANGELOG.md (modified, +1713/-1429)
  • appcast.xml (modified, +49/-117)
  • apps/android/app/src/main/AndroidManifest.xml (modified, +2/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/MainViewModel.kt (modified, +10/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/NodeForegroundService.kt (modified, +147/-25)
  • apps/android/app/src/main/java/ai/openclaw/app/NodeRuntime.kt (modified, +86/-23)
  • apps/android/app/src/main/java/ai/openclaw/app/SecurePrefs.kt (modified, +6/-5)
  • apps/android/app/src/main/java/ai/openclaw/app/VoiceCaptureMode.kt (added, +7/-0)
  • apps/android/app/src/main/java/ai/openclaw/app/ui/VoiceTabScreen.kt (modified, +63/-16)
  • apps/android/app/src/main/java/ai/openclaw/app/voice/TalkModeManager.kt (modified, +45/-13)
  • apps/android/app/src/test/java/ai/openclaw/app/NodeForegroundServiceTest.kt (modified, +30/-0)
  • apps/android/app/src/test/java/ai/openclaw/app/SecurePrefsTest.kt (modified, +28/-0)
  • apps/android/app/src/test/java/ai/openclaw/app/voice/TalkModeManagerTest.kt (modified, +45/-0)
  • apps/ios/Sources/Settings/SettingsTab.swift (modified, +6/-0)
  • apps/ios/Sources/Voice/TalkModeGatewayConfig.swift (modified, +4/-1)
  • apps/ios/Sources/Voice/TalkModeManager.swift (modified, +9/-1)
  • apps/ios/Sources/Voice/TalkSpeechLocale.swift (added, +100/-0)
  • apps/ios/Tests/Logic/TalkConfigParsingTests.swift (modified, +10/-0)
  • apps/ios/Tests/TalkSpeechLocaleTests.swift (added, +41/-0)
  • apps/macos/Sources/OpenClaw/AppState.swift (modified, +31/-0)
  • apps/macos/Sources/OpenClaw/Constants.swift (modified, +2/-0)
  • apps/macos/Sources/OpenClaw/ExecAllowlistMatcher.swift (modified, +2/-1)
  • apps/macos/Sources/OpenClaw/ExecApprovals.swift (modified, +2/-1)
  • apps/macos/Sources/OpenClaw/TalkModeController.swift (modified, +31/-0)
  • apps/macos/Sources/OpenClaw/TalkModeGatewayConfig.swift (modified, +4/-0)
  • apps/macos/Sources/OpenClaw/TalkModeRuntime.swift (modified, +30/-6)
  • apps/macos/Sources/OpenClaw/TalkSpeechInterruptMonitor.swift (added, +57/-0)
  • apps/macos/Sources/OpenClaw/VoicePushToTalk.swift (modified, +1/-0)
  • apps/macos/Sources/OpenClaw/VoiceWakeSettings.swift (modified, +25/-0)
  • apps/macos/Tests/OpenClawIPCTests/TalkModeGatewayConfigTests.swift (modified, +4/-4)
  • apps/shared/OpenClawKit/Sources/OpenClawKit/TalkConfigParsing.swift (modified, +40/-0)
  • apps/shared/OpenClawKit/Tests/OpenClawKitTests/TalkConfigParsingTests.swift (modified, +17/-0)
  • docs/.generated/config-baseline.sha256 (modified, +3/-3)
  • docs/.generated/plugin-sdk-api-baseline.sha256 (modified, +2/-2)
  • docs/.i18n/glossary.zh-CN.json (modified, +12/-0)
  • docs/automation/cron-jobs.md (modified, +22/-2)
  • docs/automation/hooks.md (modified, +5/-0)
  • docs/automation/index.md (modified, +1/-1)
  • docs/channels/matrix.md (modified, +41/-0)
  • docs/channels/msteams.md (modified, +254/-165)
  • docs/ci.md (modified, +4/-1)
  • docs/cli/browser.md (modified, +8/-0)
  • docs/cli/crestodian.md (modified, +4/-4)
  • docs/cli/cron.md (modified, +4/-0)
  • docs/cli/gateway.md (modified, +5/-0)
  • docs/cli/hooks.md (modified, +2/-1)
  • docs/cli/infer.md (modified, +6/-0)
  • docs/cli/mcp.md (modified, +5/-0)
  • docs/cli/plugins.md (modified, +22/-9)
  • docs/concepts/agent-runtimes.md (modified, +14/-0)
  • docs/concepts/memory-search.md (modified, +5/-0)
  • docs/concepts/messages.md (modified, +13/-0)
  • docs/concepts/session.md (modified, +20/-2)
  • docs/docs.json (modified, +5/-2)
  • docs/gateway/bonjour.md (modified, +5/-0)
  • docs/gateway/config-agents.md (modified, +11/-3)
  • docs/gateway/configuration-reference.md (modified, +17/-9)
  • docs/gateway/diagnostics.md (modified, +4/-3)
  • docs/gateway/doctor.md (modified, +10/-0)
  • docs/gateway/heartbeat.md (modified, +3/-2)
  • docs/gateway/index.md (modified, +6/-0)
  • docs/gateway/logging.md (modified, +2/-1)
  • docs/gateway/opentelemetry.md (added, +308/-0)
  • docs/gateway/pairing.md (modified, +6/-5)
  • docs/gateway/protocol.md (modified, +12/-2)
  • docs/gateway/security/audit-checks.md (modified, +3/-3)
  • docs/gateway/security/index.md (modified, +6/-1)
  • docs/gateway/tailscale.md (modified, +4/-0)
  • docs/gateway/troubleshooting.md (modified, +6/-0)
  • docs/help/testing-live.md (modified, +17/-2)
  • docs/help/testing.md (modified, +16/-4)
  • docs/install/fly.md (modified, +14/-1)
  • docs/install/migrating-matrix.md (modified, +26/-22)
  • docs/install/updating.md (modified, +4/-0)
  • docs/logging.md (modified, +27/-256)
  • docs/nodes/talk.md (modified, +9/-0)
  • docs/platforms/android.md (modified, +4/-2)
  • docs/plugins/architecture-internals.md (modified, +12/-7)
  • docs/plugins/codex-harness.md (modified, +60/-36)
  • docs/plugins/community.md (modified, +4/-0)
  • docs/plugins/google-meet.md (modified, +8/-0)
  • docs/plugins/hooks.md (modified, +40/-4)
  • docs/plugins/manifest.md (modified, +42/-0)
  • docs/plugins/sdk-agent-harness.md (modified, +31/-2)
  • docs/plugins/sdk-migration.md (modified, +2/-1)
  • docs/plugins/sdk-provider-plugins.md (modified, +8/-2)
  • docs/plugins/sdk-setup.md (modified, +3/-2)
  • docs/plugins/sdk-subpaths.md (modified, +3/-2)
  • docs/plugins/voice-call.md (modified, +6/-0)
  • docs/providers/azure-speech.md (added, +119/-0)
  • docs/providers/fal.md (modified, +30/-4)
  • docs/providers/google.md (modified, +2/-2)
  • docs/providers/index.md (modified, +1/-0)

PR #71903: fix(session-write-lock): prevent lock removal when subprocess inherits parent PID

Changed files

  • src/agents/session-write-lock.test.ts (modified, +9/-4)
  • src/agents/session-write-lock.ts (modified, +9/-0)

RAW_BUFFER

Bug type

Regression

Summary

sessions_history returns empty for ACP harness sessions that ran tools, and the agent response is silently lost (never delivered). Must send another message to trigger delivery.

Root cause

Session write lock cleanup as dead-pid while tool is executing, or lock held for 191s (Windows, #70857). Same root cause as #64362.

Evidence

  • #64362 (Telegram + macOS): response lost, dead-pid session lock cleanup during exec
  • #70857 (Windows): lock held 191457ms vs 15000ms max
  • Our observation: ACP sessions complete with status: completed but sessions_history returns []
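
To put the #70857 figures in perspective, the small sketch below compares a lock hold time against a maximum. `checkLockHold` is a hypothetical helper, not part of openclaw; only the two millisecond values come from the evidence above.

```typescript
// Compare how long a lock was held against the allowed maximum (both in ms).
function checkLockHold(heldMs: number, maxHoldMs: number) {
  return {
    exceeded: heldMs > maxHoldMs,
    overByMs: Math.max(0, heldMs - maxHoldMs),
    ratio: heldMs / maxHoldMs,
  };
}

// Values reported in #70857: held 191457 ms against a 15000 ms maximum.
const holdReport = checkLockHold(191_457, 15_000);
console.log(holdReport.exceeded, holdReport.overByMs, holdReport.ratio.toFixed(2));
```

With these inputs the lock was held 176457 ms past the limit, about 12.8 times the allowed maximum.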

Steps to Reproduce

  1. sessions_spawn ACP harness session (pi/opencode)
  2. Parent runs exec/cron tools
  3. Agent response NOT delivered to channel
  4. sessions_history for the session ID returns []

Workaround

Retry via sessions_list + direct JSONL file read after sessions_history returns empty.

Related

#64362 #70857

extent analysis

TL;DR

The issue can be worked around by retrying via sessions_list and direct JSONL file read after sessions_history returns empty.

Guidance

  • The root cause is the session write lock being cleaned up as a dead-PID lock while a tool is executing, or the lock being held for an extended period. Issues #64362 and #70857 share this root cause.
  • To verify the issue, reproduce the steps: spawn an ACP harness session, run exec/cron tools, and check if the agent response is delivered and sessions_history returns a non-empty result.
  • The provided workaround can be used to mitigate the issue: retry via sessions_list and direct JSONL file read after sessions_history returns empty.
  • Investigate the session write lock mechanism to prevent cleanup during tool execution or extended lock hold times.

Example

No code snippet is provided as it is not clearly supported by the issue.

Notes

The issue seems to be related to a known problem with session write lock cleanup, and the provided workaround can help mitigate it. However, a more permanent fix would require addressing the root cause of the session write lock issue.

Recommendation

Apply the provided workaround: retry via sessions_list and direct JSONL file read after sessions_history returns empty, as it can help mitigate the issue until a permanent fix is available.
