hermes - [Bug]: Copilot Responses API 401 'input item ID does not belong to this connection' on assistant codex_message_items replay across credential/connection changes



Bug Description

When using the GitHub Copilot provider with a model that routes to the Responses API (e.g. gpt-5.5), Hermes can permanently break a multi-turn session by replaying assistant codex_message_items IDs that were minted under a different "connection." The Copilot backend rejects the request with:

HTTP 401: input item ID does not belong to this connection

Unlike the model_not_supported / token-rotation failure modes, this 401 reproduces even after restoring the original token and is independent of the gh account in use — the bad IDs live in the session transcript, not the credentials. Every subsequent turn re-sends the same poisoned input array and gets the same 401 back, making the session unrecoverable.

This is distinct from #27038 (which is about the chatgpt.com Codex backend enforcing a 64-char id length limit). Different provider (copilot), different endpoint (api.githubcopilot.com/responses), different error code/message, different root cause (connection binding, not length).

Steps to Reproduce

  1. Use the copilot provider with gpt-5.5 (or any model that Hermes routes through codex_responses — anything matched by _should_use_copilot_responses_api).
  2. Take a few multi-turn steps so the gateway captures assistant codex_message_items with server-assigned id fields.
  3. Trigger any event that changes the Copilot "connection" out from under the session. Reliable repros:
    • Credential-pool rotation (e.g. one gh token in the pool transiently fails and the pool rotates to another).
    • Restarting the gateway with a different gh account active vs. the one that produced the prior turns.
    • GitHub-side load-balancer churn between turns (this happens organically in long sessions; observed against a Microsoft Enterprise Copilot seat during a model rollout window).
  4. Continue the conversation.
  5. The next /responses request returns "HTTP 401: input item ID does not belong to this connection". The session is now permanently broken for that model.

Expected Behavior

Hermes should not replay opaque assistant message IDs back to the Copilot /responses endpoint, because Copilot binds those IDs to a backend "connection" that does not survive credential rotation, gateway restart, or even routine load-balancer churn. Falling back to plain assistant content (the existing else path in _chat_messages_to_responses_input) keeps multi-turn coherence working without the 401.

Actual Behavior

agent/codex_responses_adapter.py::_chat_messages_to_responses_input unconditionally replays codex_message_items and preserves the id field on assistant messages (see lines ~344-356). The Copilot Responses path treats those IDs as connection-scoped and rejects them with:

HTTP 401: input item ID does not belong to this connection

Example request shape that triggers it (redacted from a real request dump):

{
  "model": "gpt-5.5",
  "input": [
    ...,
    {"type": "message", "role": "assistant", "id": "PPz6c1ySbeU1EfegTmhITWv/PTjyH/FV6wGF4Ft2rUJXCYts3VfAdV+MoBhvXH+WKUujB/LJIzhyG8XV", "status": "completed", "content": [...]},
    ...
  ],
  "store": false
}

The IDs persist in the session transcript (state.db + JSONL), so the failure survives every gateway restart and credential reset until something edits the history.
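To check whether a captured request dump contains replayed IDs, a small diagnostic along the following lines may help. It is only a sketch: the helper name find_replayed_assistant_ids is hypothetical and not part of Hermes, and the payload shape is assumed to match the redacted example above.

```python
from typing import Any, Dict, List, Tuple


def find_replayed_assistant_ids(payload: Dict[str, Any]) -> List[Tuple[int, str]]:
    """Return (index, id) for every assistant message item in payload["input"] that carries an id.

    Hypothetical helper for inspecting a request dump; not part of Hermes.
    """
    hits: List[Tuple[int, str]] = []
    for index, item in enumerate(payload.get("input", [])):
        if not isinstance(item, dict):
            continue
        is_assistant_message = (
            item.get("type") == "message" and item.get("role") == "assistant"
        )
        if is_assistant_message and isinstance(item.get("id"), str):
            hits.append((index, item["id"]))
    return hits


# Minimal payload in the same shape as the failing request (the id is shortened).
payload = {
    "model": "gpt-5.5",
    "input": [
        {"role": "user", "content": "hello"},
        {"type": "message", "role": "assistant", "id": "abc", "status": "completed",
         "content": [{"type": "output_text", "text": "hi"}]},
        {"role": "user", "content": "test"},
    ],
    "store": False,
}

print(find_replayed_assistant_ids(payload))
```

Any non-empty result means the request still carries assistant message IDs that the Copilot endpoint may reject once the connection has changed.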

Proposed Fix

Mirror the is_xai_responses plumbing for Copilot: add an is_github_responses flag through agent/transports/codex.py::ResponsesApiTransport.{convert_messages, build_kwargs} into _chat_messages_to_responses_input, and skip the codex_message_items replay path when it's True. The existing content_text / content_parts fallback already produces a valid {"role": "assistant", "content": ...} item, which Copilot accepts. Verified locally — patch + sanity test below.

--- a/agent/codex_responses_adapter.py
+++ b/agent/codex_responses_adapter.py
@@ def _chat_messages_to_responses_input(
     messages: List[Dict[str, Any]],
     *,
     is_xai_responses: bool = False,
+    is_github_responses: bool = False,
 ) -> List[Dict[str, Any]]:
@@
                 codex_message_items = msg.get("codex_message_items")
                 replayed_message_items = 0
-                if isinstance(codex_message_items, list):
+                if isinstance(codex_message_items, list) and not is_github_responses:

--- a/agent/transports/codex.py
+++ b/agent/transports/codex.py
@@ class ResponsesApiTransport(ProviderTransport):
     def convert_messages(self, messages, **kwargs):
         return _chat_messages_to_responses_input(
             messages,
             is_xai_responses=bool(kwargs.get("is_xai_responses")),
+            is_github_responses=bool(kwargs.get("is_github_responses")),
         )
@@
             "input": _chat_messages_to_responses_input(
                 payload_messages,
                 is_xai_responses=is_xai_responses,
+                is_github_responses=is_github_responses,
             ),

Sanity test against the real failing payload shape:

# Assumes this runs from the repository root so the agent package is importable.
from agent.codex_responses_adapter import _chat_messages_to_responses_input

items = _chat_messages_to_responses_input([
    {'role': 'assistant', 'content': 'hi',
     'codex_message_items': [{'type': 'message', 'role': 'assistant',
                              'id': 'abc', 'status': 'completed',
                              'content': [{'type': 'output_text', 'text': 'hi'}]}]},
    {'role': 'user', 'content': 'test'},
], is_github_responses=True)
# => [{'role': 'assistant', 'content': 'hi'}, {'role': 'user', 'content': 'test'}]
# No 'id' field replayed -> Copilot accepts the request.

I'm happy to send this as a PR with a regression test if a maintainer can confirm the shape.
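For discussion, here is a rough sketch of what such a regression test could assert. It runs against a toy stand-in (toy_to_responses_input, a hypothetical name) rather than the real adapter, and the stand-in only models the replay-versus-fallback branch from the diff. The real function handles many more cases.

```python
from typing import Any, Dict, List


def toy_to_responses_input(
    messages: List[Dict[str, Any]],
    *,
    is_github_responses: bool = False,
) -> List[Dict[str, Any]]:
    """Toy model of the adapter branch: replay stored items unless the Copilot flag is set."""
    out: List[Dict[str, Any]] = []
    for msg in messages:
        items = msg.get("codex_message_items")
        replayed = 0
        if isinstance(items, list) and not is_github_responses:
            # Replay path: stored items go back verbatim, server-assigned ids included.
            for item in items:
                if isinstance(item, dict):
                    out.append(dict(item))
                    replayed += 1
        if replayed == 0:
            # Fallback path: plain role/content with no opaque ids.
            out.append({"role": msg["role"], "content": msg.get("content", "")})
    return out


MESSAGES = [
    {"role": "assistant", "content": "hi",
     "codex_message_items": [{"type": "message", "role": "assistant",
                              "id": "abc", "status": "completed",
                              "content": [{"type": "output_text", "text": "hi"}]}]},
    {"role": "user", "content": "test"},
]


def test_copilot_skips_item_replay() -> None:
    # Default behaviour: the stored item (and its id) is replayed.
    replayed = toy_to_responses_input(MESSAGES)
    assert replayed[0]["id"] == "abc"

    # With the Copilot flag set, only plain content is sent and no id survives.
    plain = toy_to_responses_input(MESSAGES, is_github_responses=True)
    assert plain == [{"role": "assistant", "content": "hi"},
                     {"role": "user", "content": "test"}]
    assert all("id" not in item for item in plain)


test_copilot_skips_item_replay()
```

A real test would import _chat_messages_to_responses_input instead of the toy function and keep the same two assertions: IDs are replayed by default and absent when is_github_responses is set.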

Affected Component

  • Agent Core (conversation loop, Responses adapter)
  • Provider integration: copilot / api.githubcopilot.com/responses

Environment

  • Hermes: running from ~/.hermes/hermes-agent, current main.
  • Provider: copilot
  • Endpoint: https://api.githubcopilot.com/responses
  • Model: gpt-5.5
  • API mode: codex_responses (resolved via _should_use_copilot_responses_api)
  • OS: WSL2 Ubuntu on Windows 11
  • Auth source: gh auth token (Copilot via personal + Microsoft Enterprise seats; both reproduce)

Related Issues

  • #27038 — similar shape on the Codex backend, but enforced by length (string_above_max_length), not connection binding. Fix there normalizes oversized IDs; that doesn't help Copilot because the IDs aren't oversized here, they're just scoped to a backend connection.
  • #21665 — clears opaque history on /model switch. Helpful, but doesn't cover the in-model case where the connection rotates underneath you without a model switch.
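Because the two failure modes look alike in logs, a triage helper such as the sketch below can tell them apart from the error text alone. The function name and the returned labels are hypothetical; the only strings taken from the reports are the Copilot 401 message and the string_above_max_length code from #27038.

```python
def classify_id_rejection(message: str) -> str:
    """Map an error message to a coarse label (the labels are illustrative, not Hermes constants)."""
    text = message.lower()
    if "does not belong to this connection" in text:
        # Copilot /responses: the id is bound to a backend connection.
        return "connection_bound_id"
    if "string_above_max_length" in text:
        # Codex backend (#27038): the id exceeds the length limit.
        return "id_too_long"
    return "other"


print(classify_id_rejection("HTTP 401: input item ID does not belong to this connection"))
```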

Workaround

Manually edit state.db (and any active session JSONL) to strip id from every codex_message_items[*] for the session. Or apply the diff above locally and restart the gateway.
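The JSONL half of the manual edit could be scripted roughly as follows. This is a sketch under assumptions: it treats each JSONL line as a JSON value that may contain a codex_message_items list at any depth, and the helper names are hypothetical. The state.db layout is not described in this issue, so it is not covered here.

```python
import json
from typing import Any


def strip_item_ids(node: Any) -> int:
    """Remove "id" from every dict inside any "codex_message_items" list; return how many were removed."""
    removed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "codex_message_items" and isinstance(value, list):
                for item in value:
                    if isinstance(item, dict) and "id" in item:
                        del item["id"]
                        removed += 1
            # Keep walking in case the items list is nested deeper in the record.
            removed += strip_item_ids(value)
    elif isinstance(node, list):
        for child in node:
            removed += strip_item_ids(child)
    return removed


def clean_jsonl_text(text: str) -> str:
    """Return the JSONL text with ids stripped from codex_message_items, one record per line."""
    cleaned = []
    for line in text.splitlines():
        if not line.strip():
            cleaned.append(line)  # preserve blank lines as they are
            continue
        record = json.loads(line)
        strip_item_ids(record)
        cleaned.append(json.dumps(record, ensure_ascii=False))
    trailing_newline = "\n" if text.endswith("\n") else ""
    return "\n".join(cleaned) + trailing_newline


sample = json.dumps({
    "role": "assistant",
    "content": "hi",
    "codex_message_items": [{"type": "message", "role": "assistant", "id": "abc"}],
}) + "\n"
print(clean_jsonl_text(sample))
```

Stop the gateway and back up the session file before rewriting it. Only id keys that sit directly on entries of a codex_message_items list are removed, so other id fields in the transcript are left alone.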
