openclaw: Feature: emit plan/session lifecycle events to `/var/log/openclaw-events.jsonl` [1 comment, 2 participants]

GitHub stats
openclaw/openclaw#73603 · Fetched 2026-04-29 06:17:36
Comments: 1 · Participants: 2 · Timeline: 3 · Reactions: 0
Timeline (top): commented ×1, mentioned ×1, subscribed ×1


Suggested title: Feature: emit plan/session lifecycle events to /var/log/openclaw-events.jsonl

Suggested labels: enhancement, observability, events, plugin-sdk

Status: Draft (not yet published — awaiting gh auth login + co-author confirmation from @sergio-barrera).


Summary

Two independent OpenClaw deployments (CKS by @David-CKS, Lobster by @sergio-barrera) need to audit Maestro task completion: detect plans that the model started but never finished within a sliding window. Today this is not implementable cleanly because plan/session lifecycle events are referenced inside dist/ but are not emitted to the canonical structured event sink (/var/log/openclaw-events.jsonl).

We are asking for a config flag events.emit.plan: true (default false for back-compat) that makes the gateway publish plan/session transitions to the JSONL stream alongside the existing skill events.

This unblocks third-party "task-completion-audit" tools as a defense-in-depth complement to existing verification approaches (e.g. our LLM-judge pattern; Sergio's Verification Judge).


Repro: events exist as concepts in code, but nothing reaches the canonical events sink

Verified empirically against 2026.4.21 (f788c88) running in container openclaw-ald8-openclaw-1.

1. Concepts exist in dist

docker exec <container> sh -c '
  cd /usr/local/lib/node_modules/openclaw/dist &&
  for term in update_plan sessions_spawn sessions_yield closes_plan; do
    n=$(grep -roh "$term" . | wc -l)
    printf "%-20s %s refs\n" "$term" "$n"
  done
'

Result on a clean install:

symbol          refs in dist/
update_plan     5
sessions_spawn  63
sessions_yield  25
closes_plan     (small but present)

So the planner subsystem clearly thinks in these terms internally.

2. The canonical events sink shows zero plan/session events

# Last 1000 events from the canonical JSONL sink:
tail -n 1000 /var/log/openclaw-events.jsonl | jq -r '.kind' | sort | uniq -c

In our deployment, 227 of the last 227 events are kind: "skill" — these are emissions from external run-skill.sh wrappers (cron jobs we wrote ourselves). Zero entries of kind: "plan", kind: "session", kind: "agent", or anything related to Maestro / leaf lifecycle.
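The same tally can be produced without jq, for instance from a cron job that already runs Python. This is a minimal sketch that assumes only what the issue states: one JSON object per line with a kind field. The function name count_kinds and the "(malformed)" / "(no kind)" buckets are illustrative, not part of OpenClaw.

```python
import json
from collections import Counter
from typing import Iterable


def count_kinds(lines: Iterable[str]) -> Counter:
    """Tally the `kind` field across JSONL lines, tolerating bad input."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            counts["(malformed)"] += 1
            continue
        if not isinstance(event, dict):
            counts["(malformed)"] += 1  # valid JSON, but not an event object
            continue
        counts[str(event.get("kind", "(no kind)"))] += 1
    return counts


if __name__ == "__main__":
    # Path taken from the issue; reads the whole file rather than the last 1000 lines.
    with open("/var/log/openclaw-events.jsonl", encoding="utf-8") as fh:
        for kind, n in count_kinds(fh).most_common():
            print(f"{n:7d} {kind}")
```

Counting malformed lines instead of aborting keeps the check usable on a sink that several writers append to.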

3. The internal text/JSON gateway log only exposes a subset

The persistent JSON log inside the container (/tmp/openclaw-0/openclaw-YYYY-MM-DD.log) does carry some lifecycle data, but only error-side:

docker exec <container> grep -oE '"event":"[a-z_]+"' /tmp/openclaw-0/openclaw-YYYY-MM-DD.log | sort -u

Yields just three event names in our window:

  • embedded_run_agent_end (only end, never _start — see "What's missing" below)
  • embedded_run_failover_decision
  • model_fallback_decision

There is no update_plan, sessions_spawn, sessions_yield, closes_plan, agent_final_response, embedded_run_agent_start, or any plan-id correlator.

4. The text gateway log carries even less (and is transient)

/var/log/openclaw-gateway.log (inside the container, ~28 KB per session, truncated on restart) only shows [ws] ⇄ res lines, telegram side effects, and the same three event names re-rendered as text. It is not a viable auditing source.


Use case

Two use cases, both real:

Use case A — "Task completion audit" (Sergio Barrera, Lobster)

Detect plans the Maestro update_plan-ed but never closes_plan-ed within 24 h. Defense-in-depth: independent of the model's own self-verification, so it catches both forgotten tasks (model moved on without finishing) and silently failed tasks (failover cascade ended in unrecovered error).

Use case B — "Forgotten task watcher" (David Utrero, CKS)

Same need. We have a working MVP today (scripts/cks-task-completion-audit.sh in our repo) that parses the text log to correlate embedded_run_agent_end isError=true events without a later isError=false for the same runId. It is brittle — one log format change upstream breaks it — and it has zero plan-level granularity (we cannot tell "this was step 3 of a 5-step plan that the model abandoned at step 4").

A canonical event stream would let both of us drop the brittle parsers and rely on a stable contract.


Proposed shape

A new boolean config flag, default false:

{
  "events": {
    "emit": {
      "plan": true        // emit plan/session lifecycle events to events.jsonl
    }
  }
}

When true, the gateway publishes events of kind: "plan" to /var/log/openclaw-events.jsonl at the same persistence point that the skill events use, alongside their existing emission paths. Concretely something like:

{"ts":"2026-04-28T07:38:11Z","kind":"plan","event":"update_plan","plan_id":"plan_a1b2c3","steps":4,"agent":"main"}
{"ts":"2026-04-28T07:38:14Z","kind":"plan","event":"sessions_spawn","plan_ref":"plan_a1b2c3","leaf":"cks-dev","subagent_id":"a5076e18-37e7-4414-a716-79e3954ea428"}
{"ts":"2026-04-28T07:38:42Z","kind":"plan","event":"sessions_yield","plan_ref":"plan_a1b2c3","subagent_id":"a5076e18-37e7-4414-a716-79e3954ea428","result":"ok","duration_ms":28032}
{"ts":"2026-04-28T07:39:05Z","kind":"plan","event":"agent_final_response","closes_plan":"plan_a1b2c3","status":"ok"}

Field naming follows what already exists in dist/ (we'd take whatever you actually use internally) — the important contract is: every plan emits an open event with a plan_id, every spawn carries plan_ref, every yield carries plan_ref + outcome, and every plan terminates with exactly one closes_plan: <plan_id> event (or an explicit timeout/abort event if the gateway gives up on a plan).
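Under that contract, the audit from Use case A reduces to a set difference. The sketch below is hypothetical: it assumes the field names from the example events above (ts, kind, event, plan_id, closes_plan), which the gateway does not emit today, and it treats only closes_plan as terminal because the name of the timeout/abort event is not defined in this proposal. The function names are illustrative.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List


def parse_ts(ts: str) -> datetime:
    """Parse timestamps such as 2026-04-28T07:38:11Z as UTC."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def unclosed_plans(lines: Iterable[str], now: datetime,
                   window: timedelta = timedelta(hours=24)) -> List[str]:
    """Return plan_ids opened more than `window` ago with no closing event."""
    opened: Dict[str, datetime] = {}  # plan_id -> time of first update_plan
    closed = set()
    for line in lines:
        try:
            ev = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore malformed lines
        if not isinstance(ev, dict) or ev.get("kind") != "plan":
            continue
        if ev.get("event") == "update_plan" and "plan_id" in ev and "ts" in ev:
            opened.setdefault(ev["plan_id"], parse_ts(ev["ts"]))
        if "closes_plan" in ev:
            closed.add(ev["closes_plan"])
    return sorted(p for p, t in opened.items()
                  if p not in closed and now - t > window)
```

Passing now explicitly rather than reading the clock inside the function keeps the audit reproducible against archived logs.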


Trade-offs

We see one real concern and two non-concerns:

  • Volume. A single plan with 4 steps generates ≥5 events, rising to 10 in the worst case (1 update_plan + 4 spawn + 4 yield + 1 close). For our deployments this is fine: events.jsonl already rotates daily and gzips after 1 day. For larger fleets we'd suggest also exposing events.emit.plan.sample: 1.0 (or similar) so operators can sample. Optional, can come in a follow-up.
  • Schema churn. We're aware the planner internals may still evolve. A v0/experimental flag would already unblock both of us; we are happy to track field renames behind a major-version bump.
  • Performance. Negligible compared to the actual model calls. The events sink already absorbs ~250 events/day in our setup with no measurable overhead.
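If sampling is added, one way to keep it compatible with the audits in this issue is to decide per plan rather than per event, so a sampled plan keeps its whole lifecycle. The sketch below is an assumption about how that could work, not a description of the gateway: keep_plan is a hypothetical helper, and sample stands in for the suggested events.emit.plan.sample value.

```python
import hashlib


def keep_plan(plan_id: str, sample: float) -> bool:
    """Decide once per plan_id whether its events are emitted.

    Hashing the id (instead of drawing a random number per event) keeps a
    sampled plan's open/spawn/yield/close events together, so an auditor
    never sees half a lifecycle.
    """
    if sample >= 1.0:
        return True
    if sample <= 0.0:
        return False
    digest = hashlib.sha256(plan_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < sample
```

Because the decision depends only on plan_id, every component that emits events for the same plan reaches the same answer without shared state.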

Workaround we ship today (and want to drop)

scripts/cks-task-completion-audit.sh (CKS) — see the script header for the full design notes and acknowledged limitations. In one paragraph:

The script parses /tmp/openclaw-0/openclaw-YYYY-MM-DD.log line by line, ignores malformed lines, extracts embedded_run_agent_end {runId, isError, providerRuntimeFailureKind}, groups by runId within a 24 h window, and reports a runId as "forgotten" iff every end inside the window had isError: true (i.e. no later success recovered the same runId). It then emits kind: "task_forgotten" events into events.jsonl and exits non-zero past a threshold.
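The grouping rule in that paragraph can be sketched as follows. This is not the CKS script itself: it assumes each log line is a JSON object with top-level event, runId and isError keys (the real layout may differ), and it expects the caller to have already restricted the input to the 24 h window. The name forgotten_runs is illustrative.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List


def forgotten_runs(lines: Iterable[str]) -> List[str]:
    """Return runIds whose every embedded_run_agent_end had isError true.

    Assumed layout: top-level "event", "runId" and "isError" keys on each
    JSON line. `lines` should already be limited to the audit window.
    """
    outcomes: Dict[str, List[bool]] = defaultdict(list)
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore malformed lines, as the script does
        if not isinstance(rec, dict):
            continue
        if rec.get("event") != "embedded_run_agent_end":
            continue
        run_id = rec.get("runId")
        if run_id is None:
            continue
        outcomes[str(run_id)].append(rec.get("isError") is True)
    # A run is "forgotten" iff no end event in the window succeeded.
    return sorted(r for r, errs in outcomes.items() if all(errs))
```

The sketch makes the limitation visible: the only key available is runId, so nothing here can say which plan or step was abandoned.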

That works as a blunt watchdog (we can tell something went wrong) but cannot answer the actual question users want — "which plan did the Maestro abandon, and at which step?" That requires the canonical events asked for in this issue.

Sergio's Lobster has a parallel script with the same caveats.


Co-author

cc @sergio-barrera — happy to co-author once you confirm the proposed event shape works for your auditor too. We can sign off jointly so it's clear two independent operators are asking for the same contract.


How to verify a fix

When this lands, the following should hold against any real Maestro session:

  1. tail -n 200 /var/log/openclaw-events.jsonl | jq -r '.kind' | sort -u includes plan.
  2. For every update_plan event there is exactly one agent_final_response with closes_plan: <same plan_id> (or an explicit timeout/abort marker).
  3. The number of sessions_spawn events equals the number of sessions_yield events for any closed plan (no orphan spawns).
  4. The brittle text-log parser in our repo can be deleted and replaced with jq 'select(.kind=="plan")' against the canonical sink.
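Checks 2 and 3 can be automated along these lines. The sketch is hypothetical and relies on the proposed field names (plan_id, plan_ref, closes_plan) from the example events in this issue. It does not recognise a timeout/abort marker, since that event's name is not yet defined, so such plans would be reported as missing a close.

```python
import json
from collections import Counter
from typing import Iterable, List


def lifecycle_violations(lines: Iterable[str]) -> List[str]:
    """Check items 2 and 3 above over proposed kind="plan" events."""
    opened = set()
    closes: Counter = Counter()
    spawns: Counter = Counter()
    yields: Counter = Counter()
    for line in lines:
        try:
            ev = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore malformed lines
        if not isinstance(ev, dict) or ev.get("kind") != "plan":
            continue
        name = ev.get("event")
        if name == "update_plan" and "plan_id" in ev:
            opened.add(ev["plan_id"])
        elif name == "sessions_spawn":
            spawns[ev.get("plan_ref")] += 1
        elif name == "sessions_yield":
            yields[ev.get("plan_ref")] += 1
        if "closes_plan" in ev:
            closes[ev["closes_plan"]] += 1
    problems = []
    for plan in sorted(opened):
        if closes[plan] != 1:
            problems.append(f"{plan}: expected 1 close, found {closes[plan]}")
        elif spawns[plan] != yields[plan]:  # only checked for closed plans
            problems.append(f"{plan}: {spawns[plan]} spawns vs {yields[plan]} yields")
    return problems
```

An empty list means both invariants hold for every plan seen in the input.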

Happy to write a verify suite from our side once a flag/PR is up — we already have a TDD harness covering the Maestro lifecycle (scripts/cks-task-completion-audit.test.sh).


Drafted: 2026-04-28
Authors: @David-CKS, @sergio-barrera (pending sign-off)
Source repo: David-CKS/openclaw-cks-ops (private; happy to share log samples if helpful, redacted of bot tokens / chat content)

extent analysis

TL;DR

Enable the emission of plan/session lifecycle events to the canonical event sink by adding a config flag events.emit.plan: true.

Guidance

  • Add a new boolean config flag events.emit.plan with a default value of false to control the emission of plan/session lifecycle events.
  • When events.emit.plan is true, the gateway should publish events of kind: "plan" to /var/log/openclaw-events.jsonl.
  • Verify that the events are being emitted correctly by checking the contents of /var/log/openclaw-events.jsonl for kind: "plan" events.
  • Use a tool like jq to parse and filter the events in /var/log/openclaw-events.jsonl to ensure that the expected plan/session lifecycle events are present.

Notes

The proposal assumes the gateway already tracks these transitions internally and that only the emission to the sink is missing; the dist/ references suggest this but do not prove it. Volume is the one real concern the authors flag, and they suggest an optional sampling setting as a follow-up. They consider performance overhead negligible.

Recommendation

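The events.emit.plan flag is a proposal, not a shipped setting, so adding it to a config today has no effect. Until it lands, the only available workaround is the text-log parser described above (scripts/cks-task-completion-audit.sh), with the limitations the authors list. Once the flag exists, setting it to true should emit plan/session lifecycle events to the canonical event sink, allowing more accurate auditing and monitoring of Maestro task completion.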
