hermes: /goal auto-continuation can be amplified by preflight compression/session split and resurrect stale task state

Summary

A Discord /goal task completed a narrowly scoped file-editing request and produced a final answer stating that the requested scope was done. Immediately afterward, the goal judge returned continue and the gateway injected a synthetic continuation message. That continuation raced with automatic skill-library maintenance and with preflight context compression/session split.

The agent then resumed work beyond the explicit scope, edited additional content, and compression preserved/reintroduced a stale active task list. The user had to interrupt and ask why the agent was still working.

This appears to be a lifecycle/compression amplification of a /goal judge false continuation: the judge made a bad continue decision, and compression/session split made the stale continuation state durable.

Expected behavior

  1. If the assistant has completed the scoped user request and the todo list is complete, /goal should not synthesize a new continuation unless there is an explicit remaining requirement.
  2. An untracked file in git should not imply the goal is incomplete unless the user asked for staging, a commit, a push, or a clean working tree.
  3. Preflight compression/session split should not preserve a synthetic continuation or active task list that contradicts the latest completed final answer.
  4. Background maintenance turns should not race with goal continuation/compression against the same session lineage.

Actual behavior

  1. The user asked the agent to:
    • modify a documentation file in the repo;
    • add a new empty column to story tables;
    • inspect and annotate only one specified section;
    • not modify the outbox copy;
    • not fix implementation, only annotate.
  2. The agent completed exactly that scope and sent a final answer:
    • repo file modified;
    • outbox not modified;
    • only the requested section annotated;
    • checks passed;
    • todo list fully completed.
  3. Immediately after the final response, Hermes started an automatic background skill-library maintenance turn.
  4. That maintenance turn triggered preflight compression.
  5. The goal judge decided continue, apparently because the repo file was still untracked, although commit/staging was not part of the user request.
  6. Gateway injected a synthetic [Continuing toward your standing goal] message.
  7. A second preflight compression started for this continuation while the previous compression was already in flight.
  8. The compressed handoff summary said the active task was None, but the synthetic continuation still caused the agent to keep working.
  9. The agent annotated additional sections beyond the requested one.
  10. Later compression/session split preserved an active task list for this extra out-of-scope work.
  11. The user interrupted and the assistant acknowledged it had continued beyond scope.

Sanitized timeline

1. Scoped request completed

20:31:57 [S1] Turn ended: reason=text_response(finish_reason=stop)
         api_calls=18/60 budget=14/60 tool_turns=29
         last_msg_role=assistant response_len=1504 session=S1

Final answer shape:

Done in the repo file.
The outbox file was not changed.

What was modified:
- Added the new column to the story tables.
- Kept the column empty in other sections.
- Filled/annotated only the requested section.

Checks:
- All rows have the expected columns.
- Only the requested rows have annotations.
- Remaining rows are empty.
- git diff --check passed.
- Git status: file is untracked.

Todo state immediately before final answer:

{
  "todos": [
    {"id": "edit-columns", "status": "completed"},
    {"id": "inspect-requested-section", "status": "completed"},
    {"id": "annotate-requested-section", "status": "completed"},
    {"id": "verify", "status": "completed"}
  ],
  "summary": {"pending": 0, "in_progress": 0, "completed": 4}
}
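
A todo state like this one is the kind of signal a continuation gate could consult before queuing more work. The following is a minimal sketch, assuming the state is available in the JSON shape shown in this report. `todos_all_completed` and `should_queue_continuation` are hypothetical helpers for illustration, not Hermes functions, and the `explicit_remaining` flag stands in for whatever evidence of an unmet requirement the real system would use.

```python
import json


def todos_all_completed(state: dict) -> bool:
    """Return True when there is at least one todo and all are completed.

    Assumes the shape used in this report: a "todos" list whose items
    carry a "status" string. An empty list is treated as "unknown", not done.
    """
    todos = state.get("todos", [])
    return bool(todos) and all(t.get("status") == "completed" for t in todos)


def should_queue_continuation(verdict: str, state: dict, explicit_remaining: bool) -> bool:
    """Gate a judge verdict: a finished todo list needs an explicit reason to continue."""
    if verdict != "continue":
        return False
    if todos_all_completed(state) and not explicit_remaining:
        return False
    return True


final_state = json.loads("""
{
  "todos": [
    {"id": "edit-columns", "status": "completed"},
    {"id": "inspect-requested-section", "status": "completed"},
    {"id": "annotate-requested-section", "status": "completed"},
    {"id": "verify", "status": "completed"}
  ],
  "summary": {"pending": 0, "in_progress": 0, "completed": 4}
}
""")

print(should_queue_continuation("continue", final_state, explicit_remaining=False))  # False
```

With a gate of this kind, the continue verdict in this incident would have been dropped unless the judge could point to a requirement the user actually stated.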

2. Background maintenance + compression begins

20:31:57 [S1] conversation turn: msg='Review the conversation above and update the skill library. Be ACTIVE ...'
20:31:57 [S1] Preflight compression: ~128,217 tokens >= 122,400 threshold
20:31:57 [S1] context compression started: session=S1 messages=73 tokens=~128,217 model=gpt-5.5 focus=None
20:31:57 [S1] Auxiliary compression: using anthropic (claude-haiku-4-5-20251001)

3. Goal judge incorrectly continues

20:31:58 hermes_cli.goals: goal judge: verdict=continue reason=The response indicates that the review of the requested section was done, but the repo file itself is still untracked, so there is no… [truncated]

The judge's reasoning about the untracked file is wrong for this task: the user did not ask for git add, commit, or push, so git state is not evidence that the goal is incomplete.
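
One way to catch this class of false continuation is to check whether the judge's stated reason cites git state that the goal never mentioned. The sketch below is a crude keyword heuristic under stated assumptions: both regular expressions and both function names are hypothetical, and a real check would need something more robust than keyword matching (for example, a goal that says "do not commit" would be misread as requiring git state).

```python
import re

# Hypothetical keyword list: phrases in a goal that make git state part of the requirement.
_VCS_REQUIREMENT = re.compile(
    r"\b(git add|stag(?:e|ed|ing)|commit\w*|push\w*|clean working tree)\b",
    re.IGNORECASE,
)
# Hypothetical keyword list: git-state words a judge might cite as a reason to continue.
_VCS_REASON = re.compile(r"\b(untracked|unstaged|uncommitted)\b", re.IGNORECASE)


def goal_requires_vcs_state(goal_text: str) -> bool:
    """True when the goal text itself asks for staging, commit, push, or a clean tree."""
    return bool(_VCS_REQUIREMENT.search(goal_text))


def reason_is_out_of_scope(goal_text: str, judge_reason: str) -> bool:
    """True when the judge cites git state that the goal never asked about."""
    cites_git_state = bool(_VCS_REASON.search(judge_reason))
    return cites_git_state and not goal_requires_vcs_state(goal_text)


goal = "Add an empty column to the story tables and annotate only the requested section."
reason = "The requested section was reviewed, but the repo file is still untracked."
print(reason_is_out_of_scope(goal, reason))  # True
```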

4. Synthetic continuation races with compression

20:31:58 gateway.run: inbound message: platform=discord user=<user> chat=<redacted> msg='[Continuing toward your standing goal] Goal: ...'
20:31:58 [S1] conversation turn: msg='[User] [Continuing toward your standing goal] Goal: ... Continue working toward this goal. Take the next concrete step...'
20:31:58 [S1] Preflight compression: ~125,517 tokens >= 122,400 threshold
20:31:58 [S1] context compression started: session=S1 messages=73 tokens=~125,517 model=gpt-5.5 focus=None
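
The log above shows two turns starting compression against the same 73-message history within one second. A per-lineage lock would force them to run one after the other. This is a minimal sketch using `asyncio.Lock`, assuming turns run as coroutines and that a lineage can be keyed by its root session id. `run_turn` and `_lineage_locks` are illustrative names, and the sleep stands in for compression plus the model call.

```python
import asyncio
from collections import defaultdict

# One lock per session lineage (hypothetical key: the root session id, e.g. "S1").
_lineage_locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)


async def run_turn(lineage: str, name: str, events: list[str]) -> None:
    """Run one turn, including its preflight compression, under the lineage lock."""
    async with _lineage_locks[lineage]:
        events.append(f"{name}:start")
        await asyncio.sleep(0.01)  # placeholder for compression + model call
        events.append(f"{name}:end")


async def main() -> list[str]:
    events: list[str] = []
    # Both turns target lineage "S1", as in the incident; the lock serializes them.
    await asyncio.gather(
        run_turn("S1", "maintenance", events),
        run_turn("S1", "goal-continuation", events),
    )
    return events


events = asyncio.run(main())
print(events)
```

Because the second turn only starts after the first has finished, it would compress the already-compressed child history rather than a second copy of the parent.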

5. Compression creates contradictory state

20:32:24 [S1] context compression done: session=S1 messages=73->8 tokens=~31,451
20:32:30 [S1] context compression done: session=S2 messages=73->8 tokens=~29,634

Compressed handoff summary later visible in the new session:

[CONTEXT COMPACTION — REFERENCE ONLY]
## Active Task
None. User completed the requested section validation and paused; no new task has been assigned.

## In Progress
None.

But the synthetic continuation was also present immediately after that summary.
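
The contradiction between the summary and the continuation could be detected mechanically when the child session is built. The sketch below assumes the summary uses the `## Active Task` heading shown above and that synthetic continuations carry the `[Continuing toward your standing goal]` marker from the logs. Both function names are hypothetical.

```python
import re

SYNTHETIC_MARKER = "[Continuing toward your standing goal]"


def summary_says_no_active_task(summary: str) -> bool:
    """True only when an '## Active Task' section exists and its body starts with 'None'."""
    match = re.search(
        r"^## Active Task\s*\n(.*?)(?=^## |\Z)",
        summary,
        re.MULTILINE | re.DOTALL,
    )
    if not match:
        return False  # no section found: make no claim either way
    return match.group(1).strip().lower().startswith("none")


def should_drop_continuation(summary: str, message: str) -> bool:
    """Drop a synthetic continuation that contradicts the compressed summary."""
    return SYNTHETIC_MARKER in message and summary_says_no_active_task(summary)


summary = (
    "## Active Task\n"
    "None. User completed the requested section validation and paused; "
    "no new task has been assigned.\n"
    "\n"
    "## In Progress\n"
    "None.\n"
)
message = "[User] [Continuing toward your standing goal] Goal: ..."
print(should_drop_continuation(summary, message))  # True
```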

6. Agent resumes out-of-scope work

Instead of stopping, the agent continued into other sections and replaced the todo list with new out-of-scope work:

{
  "todos": [
    {"id": "inspect-later-sections", "status": "in_progress"},
    {"id": "annotate-later-sections", "status": "pending"},
    {"id": "verify-later-sections", "status": "pending"}
  ]
}

7. Later compression preserves the wrong active task list

20:35:37 [S2] context compression started: session=S2 messages=76 tokens=~122,917 model=gpt-5.5 focus=None
20:36:04 [S2] context compression done: session=S3 messages=76->77 tokens=~75,345

New session contained:

[Your active task list was preserved across context compression]
- [>] inspect-later-sections ... (in_progress)
- [ ] annotate-later-sections ... (pending)
- [ ] verify-later-sections ... (pending)
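
Preserving unfinished todos across compression is reasonable in general; the problem here is that these todos were created by an invalid synthetic continuation. One possible approach is to tag each todo with its provenance and filter on it. The sketch below assumes such a tag exists: the `origin` field and the `last_answer_reported_done` flag are hypothetical and do not appear in the todo state shown in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Todo:
    id: str
    status: str
    origin: str  # hypothetical provenance tag: "user" or "synthetic"


def todos_to_preserve(todos: list[Todo], last_answer_reported_done: bool) -> list[Todo]:
    """Select unfinished todos to carry across compression.

    Unfinished todos are normally preserved. When the latest final answer
    reported completion, todos created by a synthetic continuation are dropped.
    """
    unfinished = [t for t in todos if t.status in ("pending", "in_progress")]
    if last_answer_reported_done:
        return [t for t in unfinished if t.origin != "synthetic"]
    return unfinished


stale = [
    Todo("inspect-later-sections", "in_progress", "synthetic"),
    Todo("annotate-later-sections", "pending", "synthetic"),
    Todo("verify-later-sections", "pending", "synthetic"),
]
print(todos_to_preserve(stale, last_answer_reported_done=True))  # []
```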

8. User interruption

20:36:26 [S3] Turn ended: reason=interrupted_during_api_call
20:36:26 gateway.run: Session split detected: S1 → S3 (compression)

The assistant then acknowledged:

The original request was already completed.
I continued beyond the scope and started inspecting/annotating later sections. That was my error.

Impact

  • Assistant continued after a completed final response.
  • Extra file edits were made outside the requested scope.
  • A synthetic continuation contradicted the compressed summary saying there was no active task.
  • Compression/session split preserved a wrong active task list.
  • User had to interrupt to stop out-of-scope work.

Suspected contributing causes

  1. Goal judge treated git status: untracked as evidence of incompletion even though staging/commit was not required.
  2. Goal continuation and background skill-library maintenance can both start turns against the same session lineage.
  3. Preflight compression can serialize synthetic continuation state while the latest assistant answer says the task is complete.
  4. Active todo state created after an invalid synthetic continuation can be preserved across compression.
  5. Synthetic continuation/recovery messages may lack a consumed/ack state, so nothing marks them as already handled.

Proposed fixes / invariants

  1. Do not infer “goal incomplete” from untracked/modified files unless the goal explicitly required staging/commit/push or a clean working tree.
  2. If the latest assistant answer reports completion and todo state has no pending/in-progress items, require a stronger reason before queuing /goal continuation.
  3. Serialize/lock session turns during preflight compression so background maintenance and goal continuation cannot both compress/use the same parent history concurrently.
  4. When compression creates a child session, reconcile synthetic /goal continuations against the compressed Active Task summary and latest final answer.
  5. Do not preserve active todo state created by a synthetic continuation if the premise conflicts with a completed final answer.
  6. Add consumed-state markers for synthetic continuation/recovery messages so stale inferred work is not replayed as fresh work.
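
The consumed-state marker in item 6 can be illustrated with a small ledger that accepts each synthetic message id once and rejects replays. This is a sketch only: `ContinuationLedger` and the message ids are hypothetical, and an in-memory set like this one would have to be persisted with the session lineage to survive a compression split.

```python
from dataclasses import dataclass, field


@dataclass
class ContinuationLedger:
    """Tracks which synthetic continuation messages have already been acted on."""

    _consumed: set[str] = field(default_factory=set)

    def try_consume(self, message_id: str) -> bool:
        """Return True the first time an id is seen, False on any replay."""
        if message_id in self._consumed:
            return False
        self._consumed.add(message_id)
        return True


ledger = ContinuationLedger()
print(ledger.try_consume("goal-cont-1"))  # True: first delivery is processed
print(ledger.try_consume("goal-cont-1"))  # False: replay after a split is ignored
```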

Related issues

  • #25242 — gateway auto-continue note can be persisted/amplified by interrupt-triggered preflight compression. Closest lifecycle analogue: synthetic note/state becomes durable session poison.
  • #25921 — gateway can reuse parent-sized history after compression split, causing repeated preflight compression. Related compression/session publication problem.
  • #20293 — context compaction + session split injects compressed summary as valid history. Related state/handoff boundary issue.
  • #9631 — iterative compaction keeps completed topics alive and overrides current topic. Related stale-completed-work resurrection.
  • #20250 — prompt lifecycle can remain in-flight after repeated compression timeout. Related long-session/compression lifecycle failure.
  • #18467 / #33618 — /goal state and session-id migration across compression. Related /goal + compression lifecycle.
  • #27585 — /goal can continue after terminal-ish answers when judge handling goes wrong.
