openclaw - [Feature]: agent_end (or new turn_end) hook payload should expose per-turn tool-call outcomes + completion_reason



Problem

Plugins can subscribe to agent_end to observe when a turn finishes, but the payload is structurally insufficient for any observability use case that needs per-turn outcome data. Today's agent_end payload (as observed against OC 2026.5.22) carries:

{
  messages: Message[],  // full conversation history
  success: boolean,
  durationMs: number
}

Plus ctx with session identity. Notably missing:

  • Tool calls made during the turn (which tools, with what arguments)
  • Tool results (success/failure per call, error messages on failure)
  • Completion reason ("stop" | "max_tokens" | "tool_use" | "error" from the LLM provider)
  • Per-tool wall-clock durations

Tool-use and tool-result blocks DO appear inline inside messages[] as Anthropic-format content blocks, so a careful plugin can parse them out of the message array. But that approach has costs:

  1. Couples plugins to the message-block schema. Any provider that doesn't emit Anthropic-shape content blocks (some local backends, future provider integrations) silently breaks the plugin's tool-call parsing — without a typed error surface.
  2. Doesn't expose per-tool durations at all. Plugins computing tool-call latency stats must read OC's session JSONL post-hoc.
  3. No completion_reason in any payload field, so plugins can't distinguish "model decided to stop" from "ran out of context" from "tool-use loop continues."
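To make the first cost concrete, here is a minimal sketch of the kind of parsing a plugin has to do today. The `Message` and `ContentBlock` types are assumptions modeled on Anthropic's `tool_use` / `tool_result` content blocks, not OC's actual SDK types, and `parseToolCalls` is a hypothetical helper.

```typescript
// Assumed shapes, modeled on Anthropic-format content blocks.
// OC's real Message type may differ; these are illustrative only.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> }
  | { type: "tool_result"; tool_use_id: string; is_error?: boolean };

interface Message {
  role: "user" | "assistant";
  content: string | ContentBlock[];
}

interface ParsedToolCall {
  name: string;
  // undefined when no matching tool_result block was found
  succeeded: boolean | undefined;
}

// Hypothetical helper: pair each tool_use block with its tool_result by id.
function parseToolCalls(messages: Message[]): ParsedToolCall[] {
  const calls = new Map<string, ParsedToolCall>();
  for (const message of messages) {
    // Providers that emit plain strings contribute nothing, and no error is raised.
    if (typeof message.content === "string") continue;
    for (const block of message.content) {
      if (block.type === "tool_use") {
        calls.set(block.id, { name: block.name, succeeded: undefined });
      } else if (block.type === "tool_result") {
        const call = calls.get(block.tool_use_id);
        if (call) call.succeeded = block.is_error !== true;
      }
    }
  }
  return Array.from(calls.values());
}

const exampleMessages: Message[] = [
  { role: "assistant", content: [{ type: "tool_use", id: "t1", name: "read_file", input: { path: "a.txt" } }] },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "t1", is_error: true }] },
  { role: "assistant", content: "plain string content from a non-block provider" },
];
const parsedCalls = parseToolCalls(exampleMessages);
```

Note that a backend emitting only string content yields an empty array rather than an error, which is the silent failure described in item 1.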

Concrete ask

Either of:

Option A: Enrich agent_end payload (preferred for backward compatibility)

interface AgentEndEvent {
  messages: Message[];           // existing
  success: boolean;              // existing
  durationMs: number;            // existing
  // New fields (added, no existing field changed):
  tool_calls: Array<{
    name: string;
    arguments: Record<string, unknown>;
    duration_ms?: number;
    succeeded: boolean;
    error?: { message: string; code?: string };
  }>;
  completion_reason: "stop" | "max_tokens" | "tool_use" | "error" | "interrupted";
  llm_finish_reason?: string;    // raw value from provider (when distinct)
}
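One way the relationship between `completion_reason` and `llm_finish_reason` could work is sketched below. The mapping table is an assumption: its keys are stop/finish reasons documented by Anthropic (`stop_reason`) and OpenAI (`finish_reason`), but how OC would actually normalize them, and the choice to treat unknown values as `"error"`, are hypothetical.

```typescript
type CompletionReason = "stop" | "max_tokens" | "tool_use" | "error" | "interrupted";

interface NormalizedReason {
  completion_reason: CompletionReason;
  llm_finish_reason?: string; // raw provider value, only when distinct
}

// Hypothetical mapping from raw provider values to the normalized enum.
const FINISH_REASON_MAP = new Map<string, CompletionReason>([
  ["end_turn", "stop"],        // Anthropic
  ["stop_sequence", "stop"],   // Anthropic
  ["stop", "stop"],            // OpenAI
  ["max_tokens", "max_tokens"],// Anthropic
  ["length", "max_tokens"],    // OpenAI
  ["tool_use", "tool_use"],    // Anthropic
  ["tool_calls", "tool_use"],  // OpenAI
]);

function normalizeFinishReason(raw: string | undefined, aborted = false): NormalizedReason {
  let completion_reason: CompletionReason;
  if (aborted) {
    completion_reason = "interrupted";
  } else {
    // Assumption: unmapped or missing raw values are surfaced as "error".
    completion_reason = (raw !== undefined ? FINISH_REASON_MAP.get(raw) : undefined) ?? "error";
  }
  const result: NormalizedReason = { completion_reason };
  if (raw !== undefined && raw !== completion_reason) {
    result.llm_finish_reason = raw;
  }
  return result;
}
```

Keeping the raw value alongside the normalized one lets plugins branch on a stable enum while still logging provider-specific detail.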

Option B: New turn_end hook with the rich payload

If extending agent_end is undesirable (semver, downstream coupling), add a complementary turn_end hook that fires alongside agent_end and carries the new fields. Plugins opt in by subscribing.
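As a rough illustration of Option B, the sketch below uses a hypothetical in-memory `HookBus` (not the OC plugin API) to show both hooks firing for the same turn, with only `turn_end` subscribers seeing the new fields. The event types are trimmed for brevity.

```typescript
type Handler<T> = (event: T) => void;

// Trimmed, assumed event shapes; messages[] omitted for brevity.
interface AgentEndEvent {
  success: boolean;
  durationMs: number;
}

interface TurnEndEvent extends AgentEndEvent {
  completion_reason: "stop" | "max_tokens" | "tool_use" | "error" | "interrupted";
}

// Hypothetical stand-in for the host's hook dispatcher.
class HookBus {
  private agentEndHandlers: Handler<AgentEndEvent>[] = [];
  private turnEndHandlers: Handler<TurnEndEvent>[] = [];

  onAgentEnd(handler: Handler<AgentEndEvent>): void {
    this.agentEndHandlers.push(handler);
  }

  onTurnEnd(handler: Handler<TurnEndEvent>): void {
    this.turnEndHandlers.push(handler);
  }

  // Fires the legacy hook with the unchanged payload, then the rich hook.
  finishTurn(event: TurnEndEvent): void {
    const legacy: AgentEndEvent = { success: event.success, durationMs: event.durationMs };
    for (const handler of this.agentEndHandlers) handler(legacy);
    for (const handler of this.turnEndHandlers) handler(event);
  }
}

const bus = new HookBus();
const seenLegacyKeys: string[] = [];
const seenReasons: string[] = [];
bus.onAgentEnd((event) => seenLegacyKeys.push(...Object.keys(event)));
bus.onTurnEnd((event) => seenReasons.push(event.completion_reason));
bus.finishTurn({ success: true, durationMs: 1200, completion_reason: "stop" });
```

Existing `agent_end` subscribers see exactly the payload they see today, which is the main argument for this option if changing the existing event shape is a concern.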

Why this matters

Concrete consumer: Evolve is building a per-turn struggle/cost analyzer (see linked spec). It needs tool_error_count, tool_retry_count, and completion_reason as primary features. Right now we derive these by parsing messages[] content blocks after the fact. That works, but it couples our plugin to a content-block schema we don't own and misses per-tool durations entirely.
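With the proposed `tool_calls` field, those features reduce to a short pass over typed data. The sketch below is illustrative: `computeTurnFeatures` is a hypothetical helper, and the retry definition (the same tool invoked immediately after its own failure) is our assumption, not something the proposal specifies.

```typescript
// Mirrors the tool_calls element shape from the proposed AgentEndEvent.
interface ToolCallOutcome {
  name: string;
  arguments: Record<string, unknown>;
  duration_ms?: number;
  succeeded: boolean;
  error?: { message: string; code?: string };
}

interface TurnFeatures {
  tool_error_count: number;
  tool_retry_count: number;
}

function computeTurnFeatures(toolCalls: ToolCallOutcome[]): TurnFeatures {
  let tool_error_count = 0;
  let tool_retry_count = 0;
  toolCalls.forEach((call, index) => {
    if (!call.succeeded) tool_error_count += 1;
    // Assumed retry definition: same tool called right after its own failure.
    const previous = index > 0 ? toolCalls[index - 1] : undefined;
    if (previous !== undefined && !previous.succeeded && previous.name === call.name) {
      tool_retry_count += 1;
    }
  });
  return { tool_error_count, tool_retry_count };
}

const exampleFeatures = computeTurnFeatures([
  { name: "read_file", arguments: { path: "a.txt" }, succeeded: true },
  { name: "write_file", arguments: { path: "b.txt" }, succeeded: false, error: { message: "EACCES" } },
  { name: "write_file", arguments: { path: "b.txt" }, succeeded: false, error: { message: "EACCES" } },
  { name: "write_file", arguments: { path: "/tmp/b.txt" }, succeeded: true },
]);
// exampleFeatures: { tool_error_count: 2, tool_retry_count: 2 }
```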

More broadly, this is the foundational data shape any cost/performance/quality observability plugin needs. Cascadia, Helicone-style traces, FrugalGPT-style cascading, and RouteLLM-style classifier training data all want per-turn tool-outcome data as first-class fields, not heuristics parsed out of message blocks. The same need is implied by issue #82548 (AI safety and quality observability events), but at a finer granularity.

What I'm NOT asking for

  • Tool-call-level hooks (before_tool_call, after_tool_call). The hook-governance discussion explicitly classifies these as out-of-scope for the plugin API. Per-turn aggregate data on agent_end (or turn_end) is the right interception point.
  • Direct OC session-store access. That's the OC bundle's concern; plugins shouldn't read OC's on-disk format.
  • A tool-call interceptor. We don't want to gate tools; we just want to observe their outcomes.

Workaround we're shipping while waiting

For our struggle detector, we currently parse tool_use and tool_result blocks out of messages[] and count them. This works for Anthropic-shape providers; other backends will need a fallback path. We have no workaround for per-tool wall-clock duration, so we fall back to the turn's total durationMs divided by the tool-call count as a proxy. Acceptable for v1 but lossy.
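The duration proxy amounts to the following. `averageToolDurationProxy` is a hypothetical name, and returning `undefined` for turns without tool calls is our own choice for the sketch.

```typescript
// Lossy proxy: the turn's total durationMs also includes model latency,
// so this overestimates the time actually spent inside tools.
function averageToolDurationProxy(durationMs: number, toolCallCount: number): number | undefined {
  if (toolCallCount <= 0) return undefined; // no tool calls, nothing to average
  return durationMs / toolCallCount;
}

const proxyMs = averageToolDurationProxy(1200, 3); // 400
```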

Acceptance criteria

  • agent_end (or new turn_end) payload includes the four new field families above
  • TypeScript types in the plugin SDK reflect the new shape
  • A short doc page or changelog entry explains the difference between message-block-parsing and the new typed fields
  • Existing plugins that subscribed to agent_end continue to work unchanged (additive change)
  • Available in plugin SDK ≥ 2026.6.x (no rush)

Environment

  • OC version observed: 2026.5.22 (npm openclaw global on macOS)
  • Plugin SDK: openclaw/plugin-sdk shipping with that release
  • Plugin: @openclaw/evolve (private; not relevant to the ask)
  • Reproducer: api.on("agent_end", (event, ctx) => console.log(Object.keys(event))) produces only messages, success, durationMs.
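Since the change is meant to be additive, a plugin could feature-detect the new fields and keep its message-block parsing as a fallback on older SDKs. The sketch below assumes the Option A field names; `hasTypedOutcomes` and `describeEvent` are hypothetical helpers, and the event types are trimmed.

```typescript
// Payload as observed today.
interface LegacyAgentEndEvent {
  messages: unknown[];
  success: boolean;
  durationMs: number;
}

// Trimmed version of the proposed Option A payload.
interface EnrichedAgentEndEvent extends LegacyAgentEndEvent {
  tool_calls: Array<{ name: string; succeeded: boolean }>;
  completion_reason: "stop" | "max_tokens" | "tool_use" | "error" | "interrupted";
}

// Type guard: true only when the host actually populated the new fields.
function hasTypedOutcomes(event: LegacyAgentEndEvent): event is EnrichedAgentEndEvent {
  const candidate = event as Partial<EnrichedAgentEndEvent>;
  return Array.isArray(candidate.tool_calls) && typeof candidate.completion_reason === "string";
}

function describeEvent(event: LegacyAgentEndEvent): string {
  if (hasTypedOutcomes(event)) {
    const failed = event.tool_calls.filter((call) => !call.succeeded).length;
    return `typed: ${event.tool_calls.length} calls, ${failed} failed, reason=${event.completion_reason}`;
  }
  return "legacy: fall back to parsing messages[]";
}

const legacyEvent: LegacyAgentEndEvent = { messages: [], success: true, durationMs: 900 };
const enrichedEvent: EnrichedAgentEndEvent = {
  ...legacyEvent,
  tool_calls: [
    { name: "read_file", succeeded: true },
    { name: "write_file", succeeded: false },
  ],
  completion_reason: "stop",
};
```

A handler registered with `api.on("agent_end", ...)` could call `describeEvent(event)` unchanged across both payload shapes.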
