autogen - Proposal: optional Agent Threat Rules security wrapper for autogen-ext



AutoGen runs LLM-driven agent loops that often invoke shell, browser, file, and MCP tools. The threat patterns that motivated work such as the recent importlib provider hardening fix in #7463 also surface at the content level: prompt injection in tool outputs, exfiltration domains in agent-generated URLs, dangerous shell invocations passed to code executors, and credential leakage in messages. Neither autogen-agentchat nor autogen-ext currently offers a first-class place for content-level threat detection at the message or action boundary.

I propose adding an optional security scan extension under autogen-ext that wraps a BaseChatAgent and runs ATR-style detection on incoming messages or outgoing actions before they propagate. The reference rule set is the open Agent Threat Rules (ATR) project at https://github.com/Agent-Threat-Rule/agent-threat-rules. ATR is Apache-2.0 licensed and ships 330 rules across 9 categories, including prompt injection, exfiltration, dangerous shell commands, credential leakage, and tool-poisoning patterns. The same rule set has shipped in Cisco AI Defense skill-scanner and Microsoft agent-governance-toolkit, so attribution and licensing are already well understood across the Microsoft ecosystem.
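To make the regex-only detection idea concrete, here is a minimal sketch of a content scanner. The `Rule` class, the `scan` function, the rule IDs, and the three patterns are all hypothetical and written for illustration; they are not copied from the ATR rule set and do not reflect its rule format.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One detection rule (hypothetical shape, not the ATR format)."""

    rule_id: str
    category: str
    severity: str  # "high" or "medium"
    pattern: re.Pattern[str]


# Illustrative patterns only; a real integration would load a curated set.
RULES = [
    Rule(
        "demo-001",
        "prompt-injection",
        "high",
        re.compile(r"ignore\s+(?:all\s+)?(?:previous|prior)\s+instructions", re.I),
    ),
    Rule(
        "demo-002",
        "dangerous-shell",
        "high",
        # Matches "rm -rf /" only when "/" is the whole path argument.
        re.compile(r"\brm\s+-rf\s+/(?:\s|$)"),
    ),
    Rule(
        "demo-003",
        "credential-leakage",
        "medium",
        # Shape of an AWS access key ID: "AKIA" plus 16 uppercase/digit chars.
        re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    ),
]


def scan(text: str) -> list[Rule]:
    """Return every rule whose pattern occurs in the text, in RULES order."""
    return [rule for rule in RULES if rule.pattern.search(text)]
```

Because matching is plain `re.search` over in-memory strings, a scan needs no model call and no network access, which is the property the proposal relies on.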

Concrete proposal. Add a wrapper agent, modeled on the existing MessageFilterAgent in autogen_agentchat.agents, that intercepts messages via on_messages and either tags messages with a risk label or short-circuits the call when a high-confidence pattern matches. The implementation can live as a new optional dependency group in autogen-ext (for example, security or atr) so the core install is unaffected. Detection is regex-only, no model inference, no network. The patch I have in mind would be roughly 100 to 150 lines including a small curated pattern set, a test, and a docstring.
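The tag-or-short-circuit behavior could look roughly like the sketch below. It is deliberately dependency-free: messages are plain strings and the inner agent is any async callable, whereas the real `BaseChatAgent.on_messages` takes a sequence of chat messages plus a cancellation token and returns a `Response`. `ThreatScanWrapper`, `ScanResult`, the pattern tables, and `echo_agent` are hypothetical names, and the patterns are illustrative rather than taken from ATR.

```python
import asyncio
import re
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Sequence

# Hypothetical high-confidence patterns: a match short-circuits the call.
BLOCK_PATTERNS = {
    "prompt-injection": re.compile(
        r"ignore\s+(?:all\s+)?(?:previous|prior)\s+instructions", re.I
    ),
}
# Hypothetical lower-confidence patterns: a match only adds a risk label.
TAG_PATTERNS = {
    "raw-ip-url": re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}"),
}

Handler = Callable[[Sequence[str]], Awaitable[str]]


@dataclass
class ScanResult:
    reply: str
    blocked: bool
    risk_labels: list[str] = field(default_factory=list)


class ThreatScanWrapper:
    """Scans incoming messages before delegating to the wrapped handler."""

    def __init__(self, inner: Handler) -> None:
        self._inner = inner

    async def on_messages(self, messages: Sequence[str]) -> ScanResult:
        text = "\n".join(messages)
        blocked_by = sorted(n for n, p in BLOCK_PATTERNS.items() if p.search(text))
        if blocked_by:
            # Short-circuit: the inner agent never sees the messages.
            return ScanResult("Blocked by security scan.", True, blocked_by)
        labels = sorted(n for n, p in TAG_PATTERNS.items() if p.search(text))
        reply = await self._inner(messages)
        return ScanResult(reply, False, labels)


async def echo_agent(messages: Sequence[str]) -> str:
    """Stand-in for a wrapped agent (hypothetical)."""
    return f"handled {len(messages)} message(s)"
```

For example, `asyncio.run(ThreatScanWrapper(echo_agent).on_messages(["hello"]))` delegates to the inner handler and returns an unblocked result with no risk labels. In an actual autogen-ext implementation the labels would more likely travel in message metadata or the returned `Response`, which is a design detail left open here.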

Before I send a PR, I want to check three things with maintainers, given that the repo is in maintenance mode. Is this scope something you would consider, or do you prefer to keep new feature work out of autogen-agentchat and autogen-ext? Is autogen-ext the right home, or would you suggest building this as an external package that depends on autogen-core? Are there review SLA expectations I should be aware of so that I do not block the queue?

Happy to scope this down to a single regex set and a single wrapper class if that makes review easier. ATR is licensed under Apache-2.0 and AutoGen under MIT, so attribution is the only requirement.
