hermes - feat(audit): periodic mirror of GH comment stream to object storage

Summary

The GH comment streaming system (implemented in #94) posts agent cognitive events as GitHub issue comments using the <!-- agent-event:v1 --> wire format. While GitHub provides reasonable durability, it is ultimately "at the company's pleasure" — issues can be locked, archived, or deleted, which would break the audit substrate.

This issue tracks building a periodic mirror of the agent event comment stream to object storage (S3 or GCS).

Motivation

  • Audit durability: Agent decisions are recorded in GH comments. Losing them (via issue archival, repo deletion, or GitHub outage) would destroy the audit trail.
  • Queryability: Object storage + Athena/BigQuery enables ad-hoc analysis of agent behavior over time.
  • Independence: Decouples the audit substrate from GitHub's availability and policies.

Proposed Design

Source

  • Periodically (e.g., hourly) scan all tracked issues via the GitHub API
  • Extract comments containing the <!-- agent-event:v1 --> marker
  • Parse with tools/gh_comment_parser.py (single source of truth)
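
The extraction step above can be sketched as a filter over comments that have already been fetched. This is a minimal sketch, assuming each comment is a dict with a `body` field, as the GitHub REST API returns them. `extract_agent_event_comments` is a hypothetical helper name; the actual parsing of each matched body would still go through tools/gh_comment_parser.py, whose interface is not shown in this issue.

```python
from typing import Iterable

# Marker taken from the wire format described in this issue.
AGENT_EVENT_MARKER = "<!-- agent-event:v1 -->"


def extract_agent_event_comments(comments: Iterable[dict]) -> list[dict]:
    """Keep only comments whose body carries the agent-event marker.

    Hypothetical helper: comments with a missing or null body are skipped.
    """
    return [c for c in comments if AGENT_EVENT_MARKER in (c.get("body") or "")]
```

Filtering on the marker before parsing keeps non-agent comments out of the mirror, which matches the non-goal stated further down.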

Target

  • S3 bucket or GCS bucket (to be provisioned)
  • Partitioned by repo/issue_id/date
  • Format: newline-delimited JSON (one AgentEvent per line) or Parquet
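
One way to picture the layout above is a key builder plus a JSON-lines serializer. This is a sketch under assumptions: the `events.jsonl` file name, the `YYYY-MM-DD` date format, and the helper names `object_key` and `to_ndjson` are illustrative and not part of the proposal, and events are treated as plain dicts rather than AgentEvent objects.

```python
import json
from datetime import datetime, timezone


def object_key(repo: str, issue_id: int, created_at: str) -> str:
    """Build a repo/issue_id/date object key from an ISO 8601 timestamp."""
    # GitHub timestamps end in "Z"; rewrite it so fromisoformat accepts it
    # on older Python versions as well.
    ts = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    # "events.jsonl" is an assumed file name, not specified in the issue.
    return f"{repo}/{issue_id}/{day}/events.jsonl"


def to_ndjson(events: list[dict]) -> str:
    """Serialize events as newline-delimited JSON, one event per line."""
    return "".join(json.dumps(e, sort_keys=True) + "\n" for e in events)
```

Partitioning on the UTC date keeps keys stable regardless of where the mirror job runs.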

Pipeline

  • Scheduled agent run (cron) or GitHub Actions schedule
  • Idempotent: uses event_id as dedup key
  • Checkpoint: store last mirrored created_at timestamp to avoid full scans
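
The idempotency and checkpoint rules above might look roughly like the following. This is a minimal sketch, assuming each event is a dict with `event_id` and `created_at` fields and that `created_at` values are UTC ISO 8601 strings in a uniform format, so they sort correctly as plain strings. `select_new_events` and the in-memory `seen_ids` set are hypothetical; a real run would need to persist both the seen IDs and the checkpoint somewhere durable.

```python
from typing import Optional


def select_new_events(
    events: list[dict], seen_ids: set, checkpoint: Optional[str]
) -> tuple[list[dict], Optional[str]]:
    """Return the events not yet mirrored and the advanced checkpoint.

    event_id is the dedup key, so re-running over an overlapping window
    writes nothing new. seen_ids is updated in place.
    """
    fresh = []
    newest = checkpoint
    for event in sorted(events, key=lambda e: e["created_at"]):
        if event["event_id"] in seen_ids:
            continue  # already mirrored in this or an earlier run
        seen_ids.add(event["event_id"])
        fresh.append(event)
        if newest is None or event["created_at"] > newest:
            newest = event["created_at"]
    return fresh, newest
```

The returned checkpoint can be passed as the lower bound of the next scan. Because deduplication relies on event_id rather than the timestamp, a scan that overlaps the previous window stays safe.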

Consumer Interface

  • tools/gh_comment_mirror.py: CLI that accepts --repo, --issue, --since
  • Writes to object storage via boto3 / google-cloud-storage
  • Emits a mirror-status summary as a GitHub issue comment
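
The command-line surface described above could be sketched with argparse. Only the three flag names come from this issue; whether `--repo` is required, the integer type for `--issue`, the help text, and the `build_parser` function name are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument parser for a hypothetical tools/gh_comment_mirror.py."""
    parser = argparse.ArgumentParser(
        prog="gh_comment_mirror.py",
        description="Mirror agent-event comments to object storage.",
    )
    # Assumed to be required and given in owner/name form.
    parser.add_argument("--repo", required=True, help="repository, e.g. owner/name")
    # Assumed optional: omit to mirror every tracked issue.
    parser.add_argument("--issue", type=int, default=None, help="single issue number")
    # Assumed optional: omit to resume from the stored checkpoint.
    parser.add_argument("--since", default=None, help="ISO 8601 lower bound on created_at")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"mirroring {args.repo} issue={args.issue} since={args.since}")
```

A scheduled run would call this with only `--repo`, while `--issue` and `--since` allow a manual backfill of a narrower range.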

Non-Goals

  • Real-time streaming (batch is fine for audit purposes)
  • Mirroring non-agent comments (only <!-- agent-event:v1 --> marked comments)
  • Replacing GH comments as the primary store (mirror is backup only)

Dependencies

  • tools/gh_comment_parser.py (already implemented)
  • Object storage credentials (to be provisioned)

Acceptance Criteria

  • tools/gh_comment_mirror.py exists and is tested
  • Hourly cron job mirrors all tracked agent-event comments to S3/GCS
  • Deduplication works correctly (same event_id not written twice)
  • Mirror status is posted as a periodic summary comment

Related

  • #94: GH comment streaming implementation (parser, streamer, outbox)
  • tools/gh_comment_parser.py: wire format parser
  • tools/gh_comment_streamer.py: outbound streamer with durable outbox
