hermes - /goal judge over-continues exploratory goals unless the assistant explicitly says the goal is complete

Root Cause

When the judge keeps returning continue on an answer that is already sufficient, planning/reflection goals turn into unwanted execution. The user may ask the assistant to inspect context and suggest possible help, but /goal can keep pushing until the assistant manufactures additional deliverables.

That is especially risky when the “possible help” examples include sending messages, editing files, creating tickets, or performing investigations.

Summary

A Discord /goal with an exploratory objective produced a valid synthesis/proposal answer, but the goal judge repeatedly returned continue because the assistant did not explicitly state that the goal was complete.

The synthetic continuation loop escalated the task from “review context and reflect on ways to help” into producing multiple concrete artifacts that the user had only mentioned as examples of possible help.

The issue is that /goal appears too dependent on explicit completion phrasing. For exploratory goals such as “review / reflect / suggest options”, a high-quality synthesis can satisfy the goal even without a magic phrase like “goal complete”.

Expected behavior

For exploratory goals, /goal should stop when the assistant has reasonably completed the review and produced actionable recommendations, unless the original goal clearly requires concrete follow-up artifacts.

A response should not be judged incomplete solely because it lacks explicit wording such as “the goal is complete”.

Actual behavior

User set a broad exploratory goal:
- review recent tasks from two work domains;
- include overdue/today/tomorrow items and task notes;
- review related chat context;
- review the user's same-day notes/journal context;
- reflect on ways the assistant could help;
- examples included writing tickets, preparing messages, preparing an investigation, or maybe creating Kanban activities.

The assistant reviewed the context and produced a large synthesis with a menu of possible ways to help.

The goal judge returned continue because the answer did not explicitly confirm that the review was complete.

Hermes injected repeated synthetic messages:

[Continuing toward your standing goal]
Goal: <original exploratory goal>

Continue working toward this goal. Take the next concrete step. If you believe the goal is complete, state so explicitly and stop.

The assistant escalated from reflection/proposal into producing multiple concrete deliverables:
- ticket drafts;
- dependency/blocker wording;
- handoff text;
- messages to third parties;
- technical brief material.

The loop only stopped once the assistant explicitly wrote that the goal was complete.

Sanitized evidence

Initial user goal shape:

18:28:51 inbound message: msg='observe tasks from two work domains ... overdue/today/tomorrow tasks ... notes ... chat context ... same-day notes/journal context ... reflect on ways you can help, for example writing tickets, preparing a message, or preparing an investigation ... maybe create kanban activities ... or maybe resolve differently.'

First complete-enough answer:

18:35:21 Turn ended: reason=text_response(finish_reason=stop)
         response_len=10022

Goal judge then continued:

18:35:22 hermes_cli.goals: goal judge: verdict=continue reason=[translated from Portuguese] The response shows analysis and proposals, but does not explicitly confirm that the full sweep was completed nor deliv… [truncated]
18:35:28 gateway.run: inbound message: msg='[Continuing toward your standing goal] Goal: observe tasks ...'

Repeated continuation pattern:

18:36:16 goal judge: verdict=continue ...
18:36:17 inbound synthetic continuation ...

18:36:44 goal judge: verdict=continue ...
18:36:44 inbound synthetic continuation ...

18:37:30 goal judge: verdict=continue ...
18:37:32 inbound synthetic continuation ...

18:38:17 goal judge: verdict=continue ...
18:38:23 inbound synthetic continuation ...

18:39:22 goal judge: verdict=continue ...
18:39:29 inbound synthetic continuation ...

Loop stops only after explicit completion wording:

18:39:38 Turn ended: reason=text_response(finish_reason=stop) response_len=822
18:39:39 hermes_cli.goals: goal judge: verdict=done reason=The agent explicitly states the goal is complete and lists the deliverables produced, so the goal is satisfied.
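
The continuation count can be read straight off log lines shaped like the ones above. The sketch below is a minimal example, assuming only the `HH:MM:SS ... goal judge: verdict=<word>` shape visible in this excerpt; `parse_verdicts` and `continues_before_done` are hypothetical helper names, not part of Hermes.

```python
import re

# Matches lines such as "18:36:16 goal judge: verdict=continue ..."
# (assumption: the timestamp and "goal judge: verdict=" shape shown in the excerpt).
VERDICT_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2})\s.*?goal judge: verdict=(\w+)", re.MULTILINE
)

SAMPLE = """\
18:35:22 hermes_cli.goals: goal judge: verdict=continue reason=...
18:35:28 gateway.run: inbound message: msg='[Continuing toward your standing goal] Goal: observe tasks ...'
18:36:16 goal judge: verdict=continue ...
18:36:44 goal judge: verdict=continue ...
18:37:30 goal judge: verdict=continue ...
18:38:17 goal judge: verdict=continue ...
18:39:22 goal judge: verdict=continue ...
18:39:39 hermes_cli.goals: goal judge: verdict=done reason=...
"""


def parse_verdicts(log_text):
    """Return (timestamp, verdict) pairs for every goal-judge line."""
    return VERDICT_RE.findall(log_text)


def continues_before_done(log_text):
    """Count 'continue' verdicts seen before the first 'done' verdict."""
    count = 0
    for _, verdict in parse_verdicts(log_text):
        if verdict == "done":
            break
        if verdict == "continue":
            count += 1
    return count


if __name__ == "__main__":
    print(continues_before_done(SAMPLE))
```

Run against the excerpt, this makes the size of the loop explicit: every `continue` verdict corresponds to one more synthetic continuation sent to the assistant.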

Why this matters

Escalating from reflection to execution is especially risky when the “possible help” examples include sending messages, editing files, creating tickets, or performing investigations.

Suspected cause

The goal judge appears to use explicit completion language as a stronger signal than the actual semantic sufficiency of the response.

This creates a bad incentive: unless the assistant says “the goal is complete”, the loop may keep running even when the useful answer has already been delivered.
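
The incentive can be illustrated with a toy model. This is not the Hermes judge, whose implementation is not shown in this report; `phrase_only_judge` and `run_goal_loop` are hypothetical names, and the judge is deliberately reduced to the suspected behavior (explicit completion wording outweighs everything else).

```python
from typing import Callable, List


def phrase_only_judge(response: str) -> str:
    """Toy judge: 'done' only when the response claims completion explicitly."""
    return "done" if "goal is complete" in response.lower() else "continue"


def run_goal_loop(responses: List[str], judge: Callable[[str], str]) -> int:
    """Return how many assistant turns are consumed before the judge says 'done'."""
    for turn, response in enumerate(responses, start=1):
        if judge(response) == "done":
            return turn
    return len(responses)


# Illustrative turns only; the first one already answers the exploratory goal.
TURNS = [
    "Here is the full synthesis with a menu of ways I could help.",
    "Drafted tickets.",
    "Drafted messages to third parties.",
    "The goal is complete. Deliverables: ...",
]

if __name__ == "__main__":
    print(run_goal_loop(TURNS, phrase_only_judge))
```

Under this toy judge the loop consumes every turn up to the one containing the completion phrase, even though the first turn was already a sufficient answer. A judge that evaluates the first response on its content would stop after one turn.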

Proposed fixes / invariants

- Treat exploratory/review/proposal goals as completable by a sufficient synthesis, even without explicit “goal complete” wording.
- Distinguish examples of possible next actions from required deliverables.
- Add goal-judge guidance such as:
  - if the user asked to “review/reflect/suggest”, a concrete recommendation list can satisfy the goal;
  - do not require producing every artifact mentioned as an example;
  - do not continue merely to force an explicit completion phrase.
- Consider making the continuation prompt safer for exploratory goals:
  - “If the previous answer substantially satisfied the review/proposal request, mark complete instead of producing extra artifacts.”
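
One way the safer continuation prompt could be wired in is sketched below. The names `is_exploratory`, `build_continuation_prompt`, and the verb list are hypothetical, and the keyword heuristic is an assumption made for illustration; a real fix might leave the classification to the judge model instead. The prompt strings are the ones quoted in this issue.

```python
import re

# Hypothetical heuristic: verbs that suggest a review/proposal goal.
EXPLORATORY_VERBS = ("review", "reflect", "suggest", "observe", "propose", "explore")

# Continuation text as quoted in the synthetic messages in this issue.
BASE_CONTINUATION = (
    "Continue working toward this goal. Take the next concrete step. "
    "If you believe the goal is complete, state so explicitly and stop."
)

# Extra sentence proposed in this issue for exploratory goals.
EXPLORATORY_NOTE = (
    "If the previous answer substantially satisfied the review/proposal "
    "request, mark complete instead of producing extra artifacts."
)


def is_exploratory(goal: str) -> bool:
    """Return True when the goal text contains one of the exploratory verbs."""
    words = set(re.findall(r"[a-z]+", goal.lower()))
    return any(verb in words for verb in EXPLORATORY_VERBS)


def build_continuation_prompt(goal: str) -> str:
    """Build the synthetic continuation, adding the note for exploratory goals."""
    lines = [
        "[Continuing toward your standing goal]",
        f"Goal: {goal}",
        "",
        BASE_CONTINUATION,
    ]
    if is_exploratory(goal):
        lines.append(EXPLORATORY_NOTE)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_continuation_prompt("observe tasks and reflect on ways you can help"))
```

Execution-style goals keep the current prompt unchanged, so the extra sentence only affects goals that read as review or proposal requests.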

Related issues

#26986 — keep persistent goals active when the response explicitly reports incomplete work. This issue is the inverse: do not keep goals active when the response functionally completed an exploratory goal.
#27585 — /goal can spam repeated completion messages when judge errors fail-open to continue. Related because both involve continuation after terminal-ish answers.
#28649 — gateway /goal continuation loop behavior on Telegram/Discord.
#18467 / #33618 — /goal state and session-id/compression lifecycle. Related but not required to reproduce this case.
