hermes - Receipts for self-improving agents: proving which skill version produced which output [2 comments, 2 participants]

Hi Hermes team, Tom Farley here. Filing this as a discussion rather than an issue because the question is specifically about the self-modification property of Hermes and the provenance problem it creates.

The governance property that makes Hermes different

From the README: "It is the only agent with a built-in learning loop -- it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations." Skill autogeneration and self-improvement are the key differentiators.

They are also the key governance challenge. A static agent needs authorization for each action. A self-modifying agent needs authorization for each action AND provenance for which version of itself produced that action, because "the skill that ran" is itself a moving target.

Specifically: if Hermes creates skill-compress-inbox-v1 at 10am, improves it to v2 at 2pm based on experience, and runs the improved version at 3pm, three provenance questions arise:

1. Which version of the skill produced the 3pm output?
2. What was the policy governing the 10am skill-creation action and the 2pm improvement action?
3. Did any intermediate version of the skill violate a policy the operator never explicitly approved?

Standard logs answer question 1 only imperfectly (the agent can misreport which version ran) and cannot answer questions 2 or 3 with cryptographic assurance.

The pattern that addresses this

Signed decision receipts, one per tool call AND one per skill-creation or skill-modification event. Each receipt carries:

The skill identifier and version hash (so the running skill is content-addressed, not name-addressed)
The policy that governed the action (with policy digest)
The parent-receipt hash (so the chain is tamper-evident as a whole)
An Ed25519 signature from a supervisor identity distinct from Hermes itself
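
To make the four fields above concrete, here is a minimal sketch of a receipt body and its chaining, using only the Python standard library. It is an illustration under assumptions, not the wire format of draft-farley-acta-signed-receipts: apart from policy_id and policy_digest, every field name is hypothetical, sorted-key JSON stands in for whatever canonicalization the draft specifies, and the Ed25519 signature is left as a comment because signing needs a third-party library and a supervisor key.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_receipt(event, skill_id, skill_source, policy_id, policy_text, decision, parent=None):
    """Build one unsigned receipt body and seal it with its own hash."""
    body = {
        "event": event,
        "skill_id": skill_id,
        # Content-addressed: the hash of the skill source, not its name.
        "skill_hash": sha256_hex(skill_source.encode("utf-8")),
        "policy_id": policy_id,
        "policy_digest": sha256_hex(policy_text.encode("utf-8")),
        "decision": decision,
        # Linking to the previous receipt makes the chain tamper-evident.
        "parent_hash": parent["receipt_hash"] if parent else None,
    }
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # Assumption: a supervisor key (not Hermes) would sign `canonical` with Ed25519 here.
    return {**body, "receipt_hash": sha256_hex(canonical)}


policy = "allow skill mutation inside the mail sandbox"  # hypothetical policy text
v1 = make_receipt("skill_create", "skill-compress-inbox", "def run(): return 'v1'",
                  "mail-policy", policy, "allow")
v2 = make_receipt("skill_improve", "skill-compress-inbox", "def run(): return 'v2'",
                  "mail-policy", policy, "allow", parent=v1)
run = make_receipt("tool_call", "skill-compress-inbox", "def run(): return 'v2'",
                   "mail-policy", policy, "allow", parent=v2)

# The 3pm run is bound to the v2 source, whatever the skill is named.
print(run["skill_hash"] == v2["skill_hash"])
```

Because the skill is identified by the hash of its source, renaming a skill or reusing a name cannot hide which code actually ran.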

The chain carries enough evidence to answer the three questions above:

Question 1: walk the chain backwards from the 3pm output receipt; find the corresponding skill version hash.
Question 2: each receipt records the policy_id and policy_digest at evaluation time.
Question 3: each skill-creation and skill-modification receipt is a distinct chain entry; if any version was produced under a denied policy, the chain records the deny.
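
The three lookups above can be sketched as plain functions over a list of receipts. This is a rough illustration, assuming hypothetical field names (event, skill_hash, decision, parent_hash, receipt_hash) and placeholder hash strings; it checks only the hash links, whereas a real verifier such as @veritasacta/verify would also check each signature.

```python
import hashlib
import json

MUTATIONS = {"skill_create", "skill_improve"}


def receipt_hash(body):
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def seal(body, parent=None):
    """Attach the parent link and the receipt's own hash (signature omitted)."""
    body = {**body, "parent_hash": parent["receipt_hash"] if parent else None}
    return {**body, "receipt_hash": receipt_hash(body)}


def chain_is_intact(chain):
    """Recompute every hash and check every parent link, oldest first."""
    expected_parent = None
    for receipt in chain:
        body = {k: v for k, v in receipt.items() if k != "receipt_hash"}
        if receipt["parent_hash"] != expected_parent or receipt_hash(body) != receipt["receipt_hash"]:
            return False
        expected_parent = receipt["receipt_hash"]
    return True


def producing_mutation(chain, output_receipt):
    """Questions 1 and 2: walk parents back to the mutation that made this skill version."""
    by_hash = {r["receipt_hash"]: r for r in chain}
    current = by_hash.get(output_receipt["parent_hash"])
    while current is not None:
        if current["event"] in MUTATIONS and current["skill_hash"] == output_receipt["skill_hash"]:
            return current
        current = by_hash.get(current["parent_hash"])
    return None


def denied_mutations(chain):
    """Question 3: every skill mutation that was evaluated and denied."""
    return [r for r in chain if r["event"] in MUTATIONS and r["decision"] == "deny"]


common = {"policy_id": "mail-policy", "policy_digest": "digest-a", "decision": "allow"}
create = seal({"event": "skill_create", "skill_hash": "hash-v1", **common})
improve = seal({"event": "skill_improve", "skill_hash": "hash-v2", **common}, create)
output = seal({"event": "tool_call", "skill_hash": "hash-v2", **common}, improve)
chain = [create, improve, output]

source = producing_mutation(chain, output)
print(chain_is_intact(chain), source["event"], source["policy_digest"], denied_mutations(chain))
```

Editing any earlier receipt changes its recomputed hash, so chain_is_intact fails from that point on; that is the sense in which the chain is tamper-evident as a whole.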

Why open format, not proprietary

The format is an IETF draft: draft-farley-acta-signed-receipts. Four independent implementations emit it today (TypeScript, Python twice, Rust). A reference verifier ships on npm: npx @veritasacta/verify receipt.json. Offline, no network, no vendor lookup. Hermes would be an interop-compatible fifth implementation if this pattern fits the roadmap.

The IETF draft approach matters specifically for self-improving agents because the evidence needs to outlive any single governance vendor. If Nous picks vendor X today and vendor X disappears in 18 months, the chain still verifies with any other conformant verifier. The wire format is the contract.

Integration shape for Hermes

Hermes is Python-first with a CLI plus messaging-surface gateways. Two natural integration points:

Per tool call: a PreToolUse / PostToolUse hook around every hermes tool call, signing a receipt that references the active skill version hash. The protect-mcp-adk Python package is a drop-in starting point; it could be forked and adapted to the Hermes tool protocol.
Per skill mutation: a supervisor that signs a receipt at each hermes skill create and hermes skill improve event, with the skill source hash as the subject. This is the piece specific to self-improving agents; no other framework needs it because none of them mutate their own skills autonomously.
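
As a rough picture of the per-tool-call integration point, the sketch below wraps a tool function so that a receipt is appended before and after each call. Everything here is hypothetical: the decorator, the in-memory log, the event names and the example tool are stand-ins, not Hermes or protect-mcp-adk APIs, and signing is again omitted.

```python
import functools
import hashlib
import json

RECEIPT_LOG = []  # hypothetical in-memory chain; a real supervisor would persist and sign it


def _digest(value) -> str:
    encoded = json.dumps(value, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def _append(body):
    """Link the new receipt to the previous one and store it."""
    body = {**body, "parent_hash": RECEIPT_LOG[-1]["receipt_hash"] if RECEIPT_LOG else None}
    receipt = {**body, "receipt_hash": _digest(body)}
    RECEIPT_LOG.append(receipt)
    return receipt


def receipted(skill_source: str):
    """Decorator factory: bind every call of the wrapped tool to a skill source hash."""
    skill_hash = hashlib.sha256(skill_source.encode("utf-8")).hexdigest()

    def decorator(tool_fn):
        @functools.wraps(tool_fn)
        def wrapper(*args, **kwargs):
            _append({"event": "pre_tool_use", "tool": tool_fn.__name__,
                     "skill_hash": skill_hash, "args_digest": _digest([args, kwargs])})
            result = tool_fn(*args, **kwargs)
            _append({"event": "post_tool_use", "tool": tool_fn.__name__,
                     "skill_hash": skill_hash, "result_digest": _digest(result)})
            return result
        return wrapper
    return decorator


@receipted(skill_source="def compress_inbox(): ...  # v2")
def archive_thread(thread_id: str) -> str:
    # Hypothetical tool used only for this example.
    return f"archived {thread_id}"


archive_thread("t-1")
print([r["event"] for r in RECEIPT_LOG])
```

The per-skill-mutation receipt would use the same append step, with the event set to the create or improve action and the new skill source hash as the subject.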

Composition with agentskills.io

The README notes Hermes is compatible with the agentskills.io open standard. Receipts compose naturally: when a skill is created or updated, the receipt is a signed attestation of that mutation. If agentskills.io adopts a provenance field for skills, signed receipts are the obvious fill for it.

What I am asking for

Not a PR yet. The specific asks:

1. Is receipted provenance for skill creation and improvement a problem the Hermes team cares about at this stage, or is it far enough outside the current roadmap that this discussion is premature?
2. If there is interest, which is the more valuable first integration: per-tool-call receipts (easy, maps to existing protect-mcp patterns) or per-skill-mutation receipts (new, specific to Hermes's self-improving property)?
3. Is there appetite for a worked example repo showing Hermes + receipts end-to-end? Happy to build one as a reference if the team wants to see the shape before adopting anything.

Adjacent context

IETF draft: draft-farley-acta-signed-receipts
Four existing implementations: protect-mcp, protect-mcp-adk, sb-runtime, APS governance hook
Reference verifier: @veritasacta/verify
Microsoft AGT integration: merged at PR #667
In-toto predicate proposal: in-toto/attestation#549
SLSA for agents: slsa-framework/slsa#1594

Thanks for the work. A self-improving agent that also carries verifiable provenance of its own self-modification would be materially different from what is shipping in this category today.

Tom Farley (Independent capacity; IETF draft author.)

extent analysis

TL;DR

Implementing signed decision receipts for skill creation and improvement in Hermes can address the provenance problem and provide a transparent and verifiable record of the agent's self-modification.

Guidance

Integrate signed decision receipts into Hermes, focusing on per-skill-mutation receipts to address the unique challenge of self-improving agents.
Utilize the IETF draft for signed receipts, which provides a standardized format for provenance and can be implemented using existing libraries such as protect-mcp-adk.
Consider creating a worked example repository to demonstrate the integration of Hermes with signed receipts and provide a reference for future development.
Evaluate the feasibility of implementing per-tool-call receipts as a secondary integration point to further enhance provenance and transparency.

Example

No code snippet is provided as the issue focuses on the conceptual integration of signed decision receipts with Hermes, and the implementation details would depend on the specific requirements and design of the system.

Notes

The implementation of signed decision receipts in Hermes may require significant changes to the existing architecture and governance model. It is essential to carefully evaluate the trade-offs and ensure that the integration aligns with the project's roadmap and goals.

Recommendation

Implement signed decision receipts for skill creation and improvement. They give a standardized, verifiable answer to the provenance problem in self-improving agents and leave a transparent, auditable record of the agent's self-modification, which is the key governance challenge.
