openclaw - ✅ (Solved) Feature: URL routing rules via before_tool_call hook, redirect x.com/twitter.com to skill instead of web_fetch [1 pull request, 1 participant]

GitHub stats
openclaw/openclaw#69591 (fetched 2026-04-22 07:50:29)
Comments: 0
Participants: 1
Timeline: 1
Reactions: 0
Timeline (top): cross-referenced ×1

When a user drops an X.com (or any other domain) URL, the agent reflexively calls web_fetch. For X.com this always fails because X blocks unauthenticated scrapers. The right path is the xread skill (Grok API), but nothing enforces that today. Text rules in AGENTS.md are not reliable enough: the agent's reflex of answering a URL with web_fetch fires before the rule is consulted.

The before_tool_call hook already exists and is the perfect enforcement point.

Fix Action

Fixed

PR fix notes

PR #69596: feat: URL routing rules for web_fetch — redirect/warn/block by domain pattern

Description (problem / solution / changelog)

Summary

Adds tools.web.fetch.urlRouting config — an ordered list of domain routing rules that fire inside runBeforeToolCallHook() before any web_fetch call.

Closes #69591

Problem

Agents reflexively call web_fetch on any URL, including domains that actively block scraping (x.com, linkedin.com) or domains that should route to a different tool entirely. Text rules in AGENTS.md are not reliably enforced — the pattern-matching behavior fires before context-level rules are applied.

This adds code-level enforcement via the existing before_tool_call infrastructure.

Config

{
  "tools": {
    "web": {
      "fetch": {
        "urlRouting": [
          {
            "match": "x\\.com|twitter\\.com",
            "action": "redirect",
            "redirectTo": "skill:xread",
            "reason": "X.com blocks unauthenticated fetch. Use the xread skill (Grok API) instead."
          },
          {
            "match": "linkedin\\.com",
            "action": "warn",
            "reason": "LinkedIn blocks scrapers — results may be empty."
          },
          {
            "match": ".*\\.internal\\.corp",
            "action": "block",
            "reason": "Internal URLs are not accessible from the agent runtime."
          }
        ]
      }
    }
  }
}

Three actions

Action    Effect
redirect  Blocks the fetch with a message telling the agent to use redirectTo instead
warn      Allows the fetch but surfaces a warning in the tool result
block     Blocks the fetch with no redirect suggestion
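
As an illustration of the config shape and the three actions above, a rule could be typed roughly as follows. This is a sketch only: the names UrlRoutingAction, UrlRoutingRule and ToolsConfigLike, and the body and signature of resolveUrlRoutingRules(), are assumptions inferred from the JSON example and the test names in this PR description, not the PR's actual source.

```typescript
// Hypothetical types inferred from the JSON config above; the real
// definitions in src/config/types.tools.ts may differ.
type UrlRoutingAction = "redirect" | "warn" | "block";

interface UrlRoutingRule {
  match: string;        // regex source tested against the URL
  action: UrlRoutingAction;
  redirectTo?: string;  // e.g. "skill:xread"; only meaningful for redirect
  reason?: string;      // human-readable explanation surfaced to the agent
}

// Minimal stand-in for the relevant slice of the config tree (assumption).
interface ToolsConfigLike {
  tools?: { web?: { fetch?: { urlRouting?: UrlRoutingRule[] } } };
}

// Returns the configured rules, or an empty list when nothing is configured,
// so an unconfigured install sees no change in behavior.
function resolveUrlRoutingRules(config?: ToolsConfigLike): UrlRoutingRule[] {
  return config?.tools?.web?.fetch?.urlRouting ?? [];
}
```

Defaulting to an empty array keeps callers free of null checks and matches the "no rules, no behavior change" guarantee stated in the PR description.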

Implementation

New files:

  • src/web-fetch/url-routing.ts: evaluateUrlRouting() + resolveUrlRoutingRules()
  • src/web-fetch/url-routing.test.ts: 19 tests covering all three actions, edge cases, invalid regex

Modified files:

  • src/config/types.tools.ts — added urlRouting?: UrlRoutingConfig to fetch config block with JSDoc examples
  • src/agents/pi-tools.before-tool-call.ts — wired routing check into runBeforeToolCallHook() after loop detection, before plugin hook runner

Tests: 19/19 ✅

✓ no rules → matched=false
✓ redirect: x.com blocked with skill:xread hint
✓ redirect: twitter.com matched
✓ redirect: case-insensitive match (X.COM)
✓ redirect: unrelated URL not matched
✓ redirect: reason included in blockReason
✓ redirect: works without reason field
✓ redirect: works without redirectTo field
✓ block: blocks with reason, no warnMessage
✓ block: unrelated URL not matched
✓ warn: returns warnMessage, no blockReason
✓ warn: no block — fetch proceeds
✓ rule precedence: first match wins
✓ invalid regex: skipped without throwing
✓ invalid regex: matched=false when all rules invalid
✓ resolveUrlRoutingRules: undefined config → []
✓ resolveUrlRoutingRules: missing urlRouting → []
✓ resolveUrlRoutingRules: configured rules returned
✓ resolveUrlRoutingRules: explicit empty array → []
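
The behaviors named in the test list (first match wins, case-insensitive matching, invalid regexes skipped, optional reason and redirectTo) can be sketched as a single function. The result field names matched, blockReason and warnMessage are taken from the test names; the signature, the message wording, and the choice to test the pattern against the full URL string are assumptions, not the PR's actual code.

```typescript
// Sketch only: not the PR's source. Types are redeclared here so the
// example stands on its own.
interface UrlRoutingRule {
  match: string;
  action: "redirect" | "warn" | "block";
  redirectTo?: string;
  reason?: string;
}

interface UrlRoutingResult {
  matched: boolean;
  blockReason?: string; // set for redirect and block
  warnMessage?: string; // set for warn
}

function evaluateUrlRouting(url: string, rules: UrlRoutingRule[]): UrlRoutingResult {
  for (const rule of rules) {
    let pattern: RegExp;
    try {
      pattern = new RegExp(rule.match, "i"); // case-insensitive, so X.COM matches
    } catch {
      continue; // invalid regex: skip the rule instead of throwing
    }
    if (!pattern.test(url)) continue;

    // First matching rule wins; later rules are never consulted.
    const reason = rule.reason ? ` Reason: ${rule.reason}` : "";
    switch (rule.action) {
      case "redirect": {
        const hint = rule.redirectTo ? ` Use ${rule.redirectTo} instead.` : "";
        return { matched: true, blockReason: `URL routing: fetch blocked.${hint}${reason}` };
      }
      case "block":
        return { matched: true, blockReason: `URL routing: fetch blocked.${reason}` };
      case "warn":
        return { matched: true, warnMessage: `URL routing warning.${reason}` };
    }
  }
  return { matched: false };
}
```

A caller would block the tool call when blockReason is set and let it proceed, carrying warnMessage along, otherwise.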

Backwards compatibility

Empty routing table (default) = zero behavior change. Fully additive.

Changed files

  • src/agents/pi-tools.before-tool-call.ts (modified, +16/-1)
  • src/config/types.tools.ts (modified, +22/-0)
  • src/web-fetch/url-routing.test.ts (added, +208/-0)
  • src/web-fetch/url-routing.ts (added, +137/-0)


Summary

When a user drops an X.com (or any other domain) URL, the agent reflexively calls web_fetch. For X.com this always fails because X blocks unauthenticated scrapers. The right path is the xread skill (Grok API), but nothing enforces that today. Text rules in AGENTS.md are not reliable enough: the agent's reflex of answering a URL with web_fetch fires before the rule is consulted.

The before_tool_call hook already exists and is the perfect enforcement point.

Proposed: URL routing config in openclaw.json

{
  "tools": {
    "web": {
      "fetch": {
        "urlRouting": [
          {
            "match": "x\\.com|twitter\\.com",
            "action": "redirect",
            "redirectTo": "skill:xread",
            "reason": "X.com blocks unauthenticated fetch. Use xread skill (Grok API) instead."
          },
          {
            "match": "linkedin\\.com",
            "action": "warn",
            "reason": "LinkedIn blocks scrapers — results may be empty."
          }
        ]
      }
    }
  }
}

How it would work

The before_tool_call hook already receives:

type PluginHookBeforeToolCallEvent = {
  toolName: string;   // e.g. "web_fetch"
  params: Record<string, unknown>;  // includes { url: "https://x.com/..." }
};

And can return:

type PluginHookBeforeToolCallResult = {
  block?: boolean;
  blockReason?: string;
  params?: Record<string, unknown>;  // mutate params (e.g. rewrite URL)
  requireApproval?: { ... };
};

The routing logic would live in a bundled plugin that:

  1. Reads tools.web.fetch.urlRouting from config
  2. On before_tool_call where toolName === "web_fetch":
    • Extracts the url param
    • Checks against routing rules (regex match on domain)
    • If match + action is redirect: blocks the tool call and injects a system note telling the agent to use the specified skill/tool instead
    • If match + action is warn: allows but prepends a warning to the tool result
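
The step "regex match on domain" leaves open whether the pattern is tested against the whole URL or only its host. A pattern such as x\.com tested against the full URL string also matches netflix.com, or any URL whose query string mentions x.com. One possible way to tighten this, sketched below with the hypothetical helper matchesDomain (not part of the issue or the PR), is to parse the URL and anchor the pattern to the hostname.

```typescript
// Hypothetical helper, not from the issue or PR: test a rule pattern against
// the hostname only, anchored so that "x\\.com" matches x.com and its
// subdomains but not netflix.com.
function matchesDomain(url: string, domainPattern: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(url).hostname; // throws on strings that are not absolute URLs
  } catch {
    return false;
  }
  let pattern: RegExp;
  try {
    // (^|\.) requires the match to start at the host's beginning or at a
    // label boundary; $ requires it to run to the end of the host.
    pattern = new RegExp(`(^|\\.)(?:${domainPattern})$`, "i");
  } catch {
    return false; // invalid pattern: treat as no match
  }
  return pattern.test(hostname);
}
```

Wrapping the configured pattern in a non-capturing group keeps alternations such as x\.com|twitter\.com intact when the anchors are added around them.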

Why this matters beyond x.com

This generalizes to any domain-specific routing:

  • Paywalled sites → redirect to a summarize skill
  • Internal tools → redirect to an MCP server
  • Rate-limited APIs → warn + suggest caching
  • Security policy → block specific domains entirely

Agents today have no structured way to express "when you see a URL from domain X, use capability Y." This fills that gap with a config-driven, code-enforced approach rather than relying on text rules that can be skipped.

Implementation sketch

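// src/plugins/url-routing.plugin.ts

import type { OpenClawPlugin } from './types.js';
// Assumed import path: PR #69596 lists resolveUrlRoutingRules() in src/web-fetch/url-routing.ts.
import { resolveUrlRoutingRules } from '../web-fetch/url-routing.js';

export const urlRoutingPlugin: OpenClawPlugin = {
  id: 'core:url-routing',
  hooks: [
    {
      hookName: 'before_tool_call',
      handler: async (event, ctx) => {
        // Only web_fetch calls are subject to routing.
        if (event.toolName !== 'web_fetch') return;

        const url = event.params?.url;
        if (typeof url !== 'string' || !url) return;

        const rules = resolveUrlRoutingRules(ctx); // reads from config

        for (const rule of rules) {
          let pattern: RegExp;
          try {
            pattern = new RegExp(rule.match, 'i'); // case-insensitive, so X.COM matches
          } catch {
            continue; // skip rules whose regex does not compile
          }
          if (!pattern.test(url)) continue;

          // First matching rule wins.
          if (rule.action === 'redirect' || rule.action === 'block') {
            const hint = rule.action === 'redirect' && rule.redirectTo ? `: use ${rule.redirectTo} instead` : '';
            const reason = rule.reason ? ` Reason: ${rule.reason}` : '';
            return {
              block: true,
              blockReason: `URL routing: ${url} matches "${rule.match}"${hint}.${reason}`,
            };
          }

          // warn: allow the fetch; inject warning into result via params mutation or post-hook
          return;
        }
      },
      priority: 100, // run early
    },
  ],
};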
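
The warn branch in the sketch above is left as a comment. One possible shape, assuming some post-hook can see the tool's textual result (the issue does not specify this mechanism, and applyRoutingWarning is a hypothetical name), is to prepend the warning to the result text:

```typescript
// Hypothetical post-processing step: prepend a routing warning to a tool's
// textual result. The hook or API that would call this is an assumption.
function applyRoutingWarning(result: string, warnMessage?: string): string {
  if (!warnMessage) return result; // no warning: leave the result untouched
  return `[url-routing warning] ${warnMessage}\n\n${result}`;
}
```

Putting the warning before the fetched content means the agent reads the caveat first, for example before interpreting an empty LinkedIn response as "page has no content".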

Related

  • before_tool_call hook already exists in src/plugins/hook-types.ts
  • Existing test coverage: src/plugins/hooks.before-tool-call.test.ts
  • PluginHookBeforeToolCallResult.block + blockReason already supported

This is additive — no breaking changes to existing plugin or hook behavior. The routing table defaults to empty (no rules = no change in behavior).


Surfaced by a real agent (Bartok) that keeps reflexively calling web_fetch on x.com despite text rules saying not to. Code enforcement > text rules.

extent analysis

TL;DR

Adding a URL routing configuration to openclaw.json and enforcing it in the before_tool_call hook lets the runtime redirect, warn on, or block web_fetch calls to specific domains, such as X.com, which blocks unauthenticated scrapers.

Guidance

  1. Define URL routing rules: In openclaw.json, add a urlRouting section under tools.web.fetch with match patterns for specific domains and corresponding actions (e.g., redirect to skill:xread for X.com).
  2. Utilize the before_tool_call hook: Leverage the existing hook to intercept web_fetch calls, extract the URL, and apply the defined routing rules to either block the call with a reason or prepend a warning to the result.
  3. Implement the routing logic: The issue proposes a bundled plugin that reads the urlRouting config and enforces the rules based on domain matches, using PluginHookBeforeToolCallResult to block or modify the tool call as needed. PR #69596 instead wires the check directly into runBeforeToolCallHook().
  4. Test the implementation: Ensure the new functionality works as expected by testing it with various URLs and domains, including X.com, to verify the correct behavior (redirect or warning) is applied.

Example

The provided urlRoutingPlugin code snippet demonstrates how to implement the routing logic, reading from the openclaw.json config and applying the rules within the before_tool_call hook.

Notes

This solution focuses on utilizing the existing before_tool_call hook and adding a configuration-driven approach for URL routing, which should be more reliable than text rules. However, the effectiveness depends on the accuracy of the defined routing rules and the plugin's implementation.

Recommendation

Apply the proposed approach by configuring URL routing rules, as it provides a structured and enforceable way to handle domain-specific routing without relying on potentially unreliable text rules.
