hermes - ✅(Solved) Fix feat: structured response object from run_conversation() [1 pull requests, 1 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#28474 (fetched 2026-05-20 04:03:37)
Comments: 1 · Participants: 2 · Timeline events: 5 · Reactions: 0
Timeline (top): labeled ×3, commented ×1, cross-referenced ×1

Error Message

outcome: Literal["success", "skipped", "error", "retry"]

Fix Action

Fixed

PR fix notes

PR #28538: feat: expose tool_call_log and metadata from run_conversation()

Description (problem / solution / changelog)

Summary

Exposes two new structured fields from run_conversation():

  • tool_call_log: List[dict] — one entry per tool invocation with tool_name, arguments, outcome (success|error|skipped|blocked), response, duration_ms, call_id.
  • metadata: RunMetadata dataclass with model, turns, total_tokens, finish_reason, duration_seconds.

Also preserves the existing content_segments field as dataclass instances (unchanged from PR #28453).

Changes

  • agent/conversation_loop.py: Added ToolCallResult / RunMetadata dataclasses; initialise _tool_call_log and _run_start_time on _init_conversation; collect per-call records in sequential and concurrent tool-execution paths via agent._tool_call_log.append(ToolCallResult(...)); return dict augmented with tool_call_log and metadata fields.
  • agent/tool_executor.py: Import ToolCallResult; replace raw dict appends with ToolCallResult(...) instances in both sequential and concurrent paths.
  • tests/run_agent/test_run_agent.py: Added TestToolCallLog with 4 tests (success, error, metadata, coexist).
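
The per-call collection described in the list above could look roughly like the following. This is a minimal sketch, not the PR's code: `record_tool_call` and the `run_tool` callable are hypothetical names, and treating any raised exception as `outcome="error"` is an assumption. Only the `ToolCallResult` field names come from the summary.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToolCallResult:
    call_id: str
    tool_name: str
    arguments: dict
    outcome: str  # "success" | "error" | "skipped" | "blocked" per the PR summary
    response: str | None
    duration_ms: float | None


def record_tool_call(
    log: list[ToolCallResult],
    call_id: str,
    tool_name: str,
    arguments: dict,
    run_tool: Callable[..., Any],  # hypothetical: whatever actually executes the tool
) -> ToolCallResult:
    """Run one tool, time it, and append a structured record to the log."""
    start = time.perf_counter()
    try:
        response = str(run_tool(**arguments))
        outcome = "success"
    except Exception as exc:  # assumption: any exception is recorded as "error"
        response = f"{type(exc).__name__}: {exc}"
        outcome = "error"
    duration_ms = (time.perf_counter() - start) * 1000.0
    result = ToolCallResult(call_id, tool_name, arguments, outcome, response, duration_ms)
    log.append(result)
    return result
```

Appending inside one helper keeps the sequential and concurrent paths consistent, since both would produce records of the same shape.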

Closes #28474

Changed files

  • .dev-workflow/code-graph.db (added, +0/-0)
  • .dev-workflow/experiences.jsonl (added, +2/-0)
  • agent/conversation_loop.py (modified, +122/-1)
  • agent/tool_executor.py (modified, +48/-1)
  • analyzed_issues.json (added, +17/-0)
  • tests/run_agent/test_run_agent.py (modified, +135/-0)

Issue body

Problem

The return value of run_conversation() (and run_agent()) is effectively opaque: beyond the final response text, callers (API users, other agents, plugins) cannot distinguish:

  • Which content came from the initial user turn vs. a follow-up
  • Whether a tool call succeeded, was skipped, or failed
  • Where in a multi-turn conversation a particular piece of content originated

This forces downstream consumers to guess from heuristics, leading to issues like #28326, #28431, #28456.

Proposed Solution

Replace the flat string return with a structured RunResponse dataclass that exposes:

1. Content segment trail

A content_segments: list[ContentSegment] field giving the full per-turn breakdown:

@dataclass
class ContentSegment:
    turn_index: int
    role: Literal["user", "assistant", "tool"]
    content: str
    tool_calls: list[ToolCall] | None = None
    is_final: bool = False
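
As an illustration of how a caller might use such a segment trail, here is a minimal sketch. The helpers `final_answer` and `by_turn` are hypothetical, the sample data is made up, and `tool_calls` is typed with `Any` because `ToolCall` is not defined in this issue.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Literal


@dataclass
class ContentSegment:
    turn_index: int
    role: Literal["user", "assistant", "tool"]
    content: str
    tool_calls: list[Any] | None = None  # stand-in for list[ToolCall]
    is_final: bool = False


def final_answer(segments: list[ContentSegment]) -> str | None:
    """Return the content of the first segment flagged is_final, if any."""
    for seg in segments:
        if seg.is_final:
            return seg.content
    return None


def by_turn(segments: list[ContentSegment]) -> dict[int, list[ContentSegment]]:
    """Group segments by the turn they originated from."""
    grouped: dict[int, list[ContentSegment]] = {}
    for seg in segments:
        grouped.setdefault(seg.turn_index, []).append(seg)
    return grouped


# Made-up sample trail for illustration only.
segments = [
    ContentSegment(0, "user", "What is 2 + 2?"),
    ContentSegment(0, "tool", "4"),
    ContentSegment(1, "assistant", "The answer is 4.", is_final=True),
]
```

With `turn_index` and `role` on every segment, the three questions listed under Problem become simple filters rather than heuristics.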

2. Tool call log

A tool_call_log: list[ToolCallResult] that records every tool invocation:

@dataclass
class ToolCallResult:
    call_id: str
    tool_name: str
    arguments: dict
    outcome: Literal["success", "skipped", "error", "retry"]
    response: str | None
    duration_ms: float | None
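
A consumer could then summarise the log instead of parsing text. The sketch below assumes the dataclass exactly as proposed here; `outcome_counts` and `failed_calls` are hypothetical helper names, not part of the project.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Literal


@dataclass
class ToolCallResult:
    call_id: str
    tool_name: str
    arguments: dict
    outcome: Literal["success", "skipped", "error", "retry"]
    response: str | None
    duration_ms: float | None


def outcome_counts(log: list[ToolCallResult]) -> Counter:
    """Count how many calls ended in each outcome."""
    return Counter(entry.outcome for entry in log)


def failed_calls(log: list[ToolCallResult]) -> list[ToolCallResult]:
    """Return only the calls whose outcome is "error"."""
    return [entry for entry in log if entry.outcome == "error"]


# Made-up log entries for illustration only.
log = [
    ToolCallResult("c1", "search", {"q": "hermes"}, "success", "3 hits", 12.5),
    ToolCallResult("c2", "fetch", {"url": "x"}, "error", "timeout", 3000.0),
    ToolCallResult("c3", "search", {"q": "agent"}, "success", "1 hit", 9.0),
]
```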

3. Conversation metadata

@dataclass
class RunMetadata:
    model: str
    turns: int
    total_tokens: int | None
    finish_reason: str | None
    duration_seconds: float | None
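
One way such metadata might be assembled at the end of a run is sketched below. `build_metadata` and its `now` parameter are hypothetical, and using a monotonic clock for the start time is an assumption. `dataclasses.asdict` shows how the instance could be flattened if the result dict carries metadata as a plain dict.

```python
from __future__ import annotations

import time
from dataclasses import asdict, dataclass


@dataclass
class RunMetadata:
    model: str
    turns: int
    total_tokens: int | None
    finish_reason: str | None
    duration_seconds: float | None


def build_metadata(
    model: str,
    turns: int,
    start_time: float,  # assumption: captured with time.monotonic() at run start
    total_tokens: int | None = None,
    finish_reason: str | None = None,
    now: float | None = None,  # injectable clock value, mainly for tests
) -> RunMetadata:
    """Compute the elapsed time and bundle the run-level fields."""
    end = time.monotonic() if now is None else now
    return RunMetadata(
        model=model,
        turns=turns,
        total_tokens=total_tokens,
        finish_reason=finish_reason,
        duration_seconds=end - start_time,
    )


# Illustrative values only; "example-model" is not a real model name.
meta = build_metadata("example-model", 3, start_time=10.0, finish_reason="stop", now=12.5)
meta_dict = asdict(meta)
```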

Backward compatibility

run_conversation() returns a dict today. The new dict shape is:

{
    "final_response": str,               # existing field, unchanged
    "content_segments": list,             # new
    "tool_call_log": list,               # new
    "metadata": dict,                    # new
    # ... existing fields unchanged
}
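
Because the new keys are additive, a caller can read them defensively and keep working against results that lack them. A minimal sketch, assuming the dict shape above with `metadata` as a plain dict; `summarize_result` is a hypothetical helper:

```python
from __future__ import annotations

from typing import Any


def summarize_result(result: dict[str, Any]) -> str:
    """Build a one-line summary that tolerates results without the new keys."""
    text = result["final_response"]            # existing field, always expected
    log = result.get("tool_call_log") or []    # new field, may be absent
    meta = result.get("metadata") or {}        # new field, may be absent
    turns = meta.get("turns", "?")
    return f"{text} [tools={len(log)}, turns={turns}]"


# Made-up results for illustration only.
old_style = {"final_response": "hi"}
new_style = {
    "final_response": "hi",
    "tool_call_log": [{"tool_name": "search", "outcome": "success"}],
    "metadata": {"turns": 2},
}
```

Code that only reads `final_response` is unaffected either way.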

Implementation Notes

  • content_segments is already implemented in PR #28453; this feature request extends that work with tool_call_log and metadata
  • ContentSegment.role enum already exists in conversation_loop.py from #28453
  • Tool call instrumentation exists in execute_tool_calls(); the missing piece is persisting structured results to the response object
  • Affects: agent/conversation_loop.py, run_agent.py

Verification

pytest tests/run_agent/test_run_agent.py -k "test_oneshot_response_has_structured_segments"
pytest tests/run_agent/test_run_agent.py -k "test_tool_call_log_recorded"
