hermes - ✅(Solved) Fix [Resilience] has_incomplete_scratchpad false positive causes wasted retries and lost output [1 pull request, 1 participant]

GitHub stats
NousResearch/hermes-agent#11663 (fetched 2026-04-18 05:59:30)
Comments: 0
Participants: 1
Timeline: 1
Reactions: 0
Timeline (top): cross-referenced ×1

Fix Action

Fixed

PR fix notes

PR #11743: fix(agent): ignore quoted scratchpad tags in retry detection

Description (problem / solution / changelog)

Summary

  • stop has_incomplete_scratchpad() from treating quoted or code-form scratchpad tag mentions as an unfinished reasoning block
  • strip fenced code blocks, inline code, and blockquotes before checking for unclosed <REASONING_SCRATCHPAD> tags
  • add focused regression coverage for real incomplete tags plus the false-positive markdown cases
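
The stripping step described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from PR #11743: the function name `has_incomplete_scratchpad_md_aware`, the regular expressions, and the rule "opening tag present, closing tag absent" are all assumptions made for the example.

```python
import re

OPEN_TAG = "<REASONING_SCRATCHPAD>"
CLOSE_TAG = "</REASONING_SCRATCHPAD>"

# Assumed patterns for the three markdown contexts named in the summary.
_FENCED_CODE = re.compile(r"```.*?```", re.DOTALL)       # fenced code blocks
_INLINE_CODE = re.compile(r"`[^`\n]*`")                  # inline code spans
_BLOCKQUOTE = re.compile(r"^[ \t]*>.*$", re.MULTILINE)   # blockquote lines


def has_incomplete_scratchpad_md_aware(content: str) -> bool:
    """Hypothetical markdown-aware variant of the retry guard."""
    if not content:
        return False
    # Remove quoted/code contexts first so tag mentions inside them are ignored.
    stripped = _FENCED_CODE.sub("", content)
    stripped = _INLINE_CODE.sub("", stripped)
    stripped = _BLOCKQUOTE.sub("", stripped)
    # Only a tag left in ordinary prose counts as an unfinished block.
    return OPEN_TAG in stripped and CLOSE_TAG not in stripped


grep_output = "```\nagent/trajectory.py: <REASONING_SCRATCHPAD>\n```\nDone."
inline_mention = "Open the block with `<REASONING_SCRATCHPAD>` first."
quoted_user = "> the user wrote <REASONING_SCRATCHPAD>\nNoted."
truncated = "<REASONING_SCRATCHPAD>\nLet me think about"

for sample in (grep_output, inline_mention, quoted_user, truncated):
    print(has_incomplete_scratchpad_md_aware(sample))
```

Fenced blocks are stripped before inline code so that the triple backticks of a fence are not half-consumed by the inline-code pattern. The regexes are deliberately simple and do not cover every markdown form (tilde fences or indented code blocks, for example).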

Root Cause

The retry guard in agent/trajectory.py used a plain substring check for <REASONING_SCRATCHPAD> without understanding markdown context. That meant grep output, quoted user text, or inline code containing the tag could trigger the incomplete-scratchpad path, causing wasted retries and dropped output.
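
To make the failure mode concrete, here is a minimal reconstruction of a context-blind check. The function name `naive_has_incomplete_scratchpad` and its exact logic are assumptions for illustration; the original body of the guard in agent/trajectory.py is not shown on this page.

```python
OPEN_TAG = "<REASONING_SCRATCHPAD>"
CLOSE_TAG = "</REASONING_SCRATCHPAD>"


def naive_has_incomplete_scratchpad(content: str) -> bool:
    """Hypothetical plain-substring guard with no markdown awareness."""
    if not content:
        return False
    return OPEN_TAG in content and CLOSE_TAG not in content


# A finished answer that only mentions the tag inside inline code.
quoted_reply = "The grep hit is in agent/trajectory.py: `<REASONING_SCRATCHPAD>` appears once."
# A reply that really was cut off in the middle of its reasoning block.
truncated_reply = "<REASONING_SCRATCHPAD>\nLet me think about"

print(naive_has_incomplete_scratchpad(quoted_reply))     # flagged, although the reply is complete
print(naive_has_incomplete_scratchpad(truncated_reply))  # flagged, correctly
```

Because both inputs are flagged, a complete reply that merely quotes the tag is sent down the retry path just like a truncated one.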

Closes #11663.

Validation

  • python3 -m py_compile agent/trajectory.py tests/agent/test_trajectory.py
  • uv run --extra dev pytest -o addopts='' tests/agent/test_trajectory.py -q

Platform Tested

  • macOS 15.x (Apple Silicon)

Contribution Guide Notes

  • Reviewed CONTRIBUTING.md and checked for existing open PRs before submitting this scoped change.
  • Ran the targeted verification commands listed above for this PR. I have not claimed a full repo-wide pytest tests/ -q pass unless explicitly noted.

Changed files

  • agent/trajectory.py (modified, +18/-3)
  • tests/agent/test_trajectory.py (added, +36/-0)
RAW_BUFFER

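Problem description

The has_incomplete_scratchpad check in trajectory.py uses a plain substring containment test. When the context contains the text REASONING_SCRATCHPAD (for example, in grep search results) and the model quotes it, the check fires by mistake.

Impact

  • All 3 retries fail
  • Output is lost

Suggested improvements

  • Check whether the tag is inside a code block or a quote
  • Or validate with an XML parser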
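
The second suggestion could look roughly like the sketch below. A strict XML parser would likely reject free-form model output (stray `<` or `&` characters), so this example swaps in a simple tag-balance scan instead. It is not a real XML parser, and the helper name `scratchpad_tags_balanced` is hypothetical.

```python
import re

# Matches either the opening or the closing scratchpad tag;
# group 1 is "/" for a closing tag and "" for an opening tag.
_SCRATCHPAD_TAG = re.compile(r"<(/?)REASONING_SCRATCHPAD>")


def scratchpad_tags_balanced(content: str) -> bool:
    """Return True when every opening tag has a later matching closing tag."""
    depth = 0
    for match in _SCRATCHPAD_TAG.finditer(content):
        if match.group(1):       # closing tag
            if depth == 0:
                return False     # closing tag with nothing open
            depth -= 1
        else:                    # opening tag
            depth += 1
    return depth == 0


print(scratchpad_tags_balanced("<REASONING_SCRATCHPAD>plan</REASONING_SCRATCHPAD> answer"))
print(scratchpad_tags_balanced("<REASONING_SCRATCHPAD>plan cut off"))
```

A balance scan alone still treats a lone quoted opening tag as unbalanced, so it would complement the code-block and quote check from the first suggestion, not replace it.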

Version

Hermes v0.9.0

extent analysis

TL;DR

The issue can be fixed by improving the has_incomplete_scratchpad detection in trajectory.py to avoid false positives when the context contains specific text like REASONING_SCRATCHPAD.

Guidance

  • The current implementation uses a simple string containment check, which is too broad and leads to false positives.
  • To improve this, consider checking whether the detected tag sits inside a code block or a quoted passage (blockquote), as the issue suggests.
  • Alternatively, using an XML parser to validate the context could provide a more robust solution.
  • Before implementing any changes, verify that the issue is indeed caused by the has_incomplete_scratchpad detection and not by another factor.

Example

No specific code example can be provided without more context, but the suggested improvement could involve modifying the has_incomplete_scratchpad function to use a more sophisticated parsing or checking mechanism.

Notes

The provided information does not specify the exact implementation details of has_incomplete_scratchpad or the context in which it is used, which might limit the applicability of the suggested solutions.

Recommendation

Apply a workaround by modifying the has_incomplete_scratchpad detection to ignore the tag when it appears inside specific markdown contexts, such as code blocks or blockquotes. This approach directly addresses the reported issue without requiring an upgrade to a potentially non-existent fixed version.
