hermes - 💡(How to fix) Fix Cron prompt scanner: github.com auth-header allowlist only scrubs first match

hermes2026-05-24 16:23:41

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

tools/cronjob_tools.py::_scan_cron_prompt allows a single bundled exception for the GitHub skills' curl -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/... shape, but the allowlist only matches and replaces the first occurrence. Any cron prompt that loads a skill containing 2+ such curl blocks (which is the normal case for the bundled github-issues skill — it has 4) is blocked by exfil_curl_auth_header on every run.

Error Message

A second visibility bug worth a separate issue: BLOCKED status from cron scanner only surfaces in agent.log WARNINGs, not in cronjob action='list' output, so this kind of failure is invisible until a manual audit. Suggest surfacing last_status='blocked' distinct from 'error'.

Root Cause

Root cause (lines ~74–86 of cronjob_tools.py): the allowlist uses re.search(...) + prompt.replace(match.group(0), ...) — one match, one replace. The blocking pattern then re.searches the (still mostly-original) prompt and trips on the next un-scrubbed auth-header curl.

Fix Action

Fix / Workaround

Workaround applied locally: stripped skills from the 3 affected cron jobs; agents call gh directly without preloading the skills.
A second visibility bug worth a separate issue: BLOCKED status from cron scanner only surfaces in agent.log WARNINGs, not in cronjob action='list' output, so this kind of failure is invisible until a manual audit. Suggest surfacing last_status='blocked' distinct from 'error'.

Code Example

import sys; sys.path.insert(0, '.')
from tools.cronjob_tools import _scan_cron_prompt

with open('skills/github/github-issues/SKILL.md') as f:
    skill = f.read()
print(_scan_cron_prompt('prompt\n\n' + skill))         # '' — single skill happens to pass (URL on next line)
# Now combine the three bundled github skills the way cron does:
parts = []
for s in ['github-issues','github-pr-workflow','github-code-review']:
    parts.append(open(f'skills/github/{s}/SKILL.md').read())
print(_scan_cron_prompt('prompt\n\n' + '\n\n'.join(parts)))
# Blocked: prompt matches threat pattern 'exfil_curl_auth_header'.

---

prompt_to_scan = re.sub(
    rf'curl\s+[^\n]*(?:-H|--header)\s+["\']Authorization:\s*token\s+{_CRON_SECRET_VAR_RE}["\']'
    r'\s+["\']?https://api\.github\.com(?:/|\b)[^\n]*',
    'curl https://api.github.com/user',
    prompt,
    flags=re.IGNORECASE,
)

RAW_BUFFERClick to expand / collapse

Summary

Repro

import sys; sys.path.insert(0, '.')
from tools.cronjob_tools import _scan_cron_prompt

with open('skills/github/github-issues/SKILL.md') as f:
    skill = f.read()
print(_scan_cron_prompt('prompt\n\n' + skill))         # '' — single skill happens to pass (URL on next line)
# Now combine the three bundled github skills the way cron does:
parts = []
for s in ['github-issues','github-pr-workflow','github-code-review']:
    parts.append(open(f'skills/github/{s}/SKILL.md').read())
print(_scan_cron_prompt('prompt\n\n' + '\n\n'.join(parts)))
# Blocked: prompt matches threat pattern 'exfil_curl_auth_header'.

Impact

Any cron job that attaches skills: [github-issues, github-pr-workflow, github-code-review] (the natural set for issue triage / PR review / OSS-issue watchers) fails 100% of runs with status BLOCKED. Local impact here: 3 cron jobs (issue reconciliation, issue monitor, OSS-issue watcher) blocked silently for ~3 weeks — only visible in agent.log as repeated WARNINGs, not surfaced to the user. ~1,160 such warnings in the last 7d.

Proposed fix

Switch from single re.search + replace to re.sub (or re.finditer + iterative scrub) so the allowlist applies to every matching api.github.com auth-header curl, not just the first:

prompt_to_scan = re.sub(
    rf'curl\s+[^\n]*(?:-H|--header)\s+["\']Authorization:\s*token\s+{_CRON_SECRET_VAR_RE}["\']'
    r'\s+["\']?https://api\.github\.com(?:/|\b)[^\n]*',
    'curl https://api.github.com/user',
    prompt,
    flags=re.IGNORECASE,
)

The match anchor is still strict (Authorization + api.github.com on the same effective line), so the exception scope doesn't widen — it just applies to all matches instead of one.

Workaround applied locally: stripped skills from the 3 affected cron jobs; agents call gh directly without preloading the skills.
A second visibility bug worth a separate issue: BLOCKED status from cron scanner only surfaces in agent.log WARNINGs, not in cronjob action='list' output, so this kind of failure is invisible until a manual audit. Suggest surfacing last_status='blocked' distinct from 'error'.

Environment: hermes-agent main (local checkout 2026-05-24).

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering