openclaw - [Feature]: Model routing by task tier for ACP subagent spawning

GitHub stats
openclaw/openclaw#62961 · Fetched 2026-04-09 08:00:07 · Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

Add task-tier classification to ACP subagent spawning so mechanical/fast tasks route to cheaper models and only complex reasoning tasks use frontier models.


Summary

Add task-tier classification to ACP subagent spawning so mechanical/fast tasks route to cheaper models and only complex reasoning tasks use frontier models.

Problem to solve

All ACP-spawned child agents currently use the globally-configured model regardless of task complexity. A rename-variable task runs on the same frontier model as a complex security audit. This has two costs:

  • Monetary: frontier models are 10–20× more expensive than fast-tier models for tasks that don't need deep reasoning.
  • Latency: multi-agent task latency is already 8–15s vs 2–4s for single-agent (framework benchmarks, 2025). Over-routing adds avoidable latency on the most frequent call pattern (mechanical/boilerplate tasks).

Token usage alone explains 80% of the performance variance on the BrowseComp evaluation (Anthropic, multi-agent research system write-up, 2025), which suggests that token budgeting is a primary lever for agent quality at a given cost level. Routing cheap tasks to cheaper models frees budget for the tasks that actually benefit from it.

Proposed solution

1. Task tier type + default classifier

// src/agents/task-tier.ts
export type TaskTier = "frontier" | "mid" | "fast";

export function classifyTaskTier(prompt: string): TaskTier {
  const p = prompt.toLowerCase();
  // Reasoning-heavy signals are checked first, so a prompt matching both sets routes up.
  if (/debug.{0,20}why|architect|security audit|design a|investigate/.test(p))
    return "frontier";
  // Mechanical edits that rarely need deep reasoning.
  if (/rename|reformat|update.{0,10}comment|add import|fix typo/.test(p))
    return "fast";
  return "mid"; // safe default: never under-route when uncertain
}

Guardrail: when task complexity is uncertain, always route up, not down. The cost of over-routing (a slow, expensive model for a simple task) is known and bounded. The cost of under-routing (wrong output requiring expensive correction) is often higher.
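
To illustrate the guardrail, here is a minimal sketch that reuses the proposal's patterns and runs them against a few prompts. The constant names and the sample prompts are illustrative assumptions, not part of the issue. Because the frontier check runs first, a prompt that matches both pattern sets routes up.

```typescript
type TaskTier = "frontier" | "mid" | "fast";

// Patterns copied from the proposed classifier; the constant names are hypothetical.
const FRONTIER_PATTERN = /debug.{0,20}why|architect|security audit|design a|investigate/;
const FAST_PATTERN = /rename|reformat|update.{0,10}comment|add import|fix typo/;

function classifyTaskTier(prompt: string): TaskTier {
  const p = prompt.toLowerCase();
  if (FRONTIER_PATTERN.test(p)) return "frontier"; // checked first: ambiguous prompts route up
  if (FAST_PATTERN.test(p)) return "fast";
  return "mid"; // no signal either way: stay on the middle tier
}

// Illustrative prompts (assumptions, not taken from the issue).
const mechanical = classifyTaskTier("Rename the variable userId to accountId"); // "fast"
const ambiguous = classifyTaskTier("Investigate why the rename script corrupts files"); // "frontier"
const unmatched = classifyTaskTier("Add pagination to the users endpoint"); // "mid"
```

The second prompt contains both "investigate" and "rename"; evaluating the frontier pattern first is what keeps it off the fast tier.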

2. Config surface

{
  "acp": {
    "taskRouting": {
      "enabled": false,
      "tierModels": {
        "frontier": "anthropic/claude-opus-4-6",
        "mid": "anthropic/claude-sonnet-4-6",
        "fast": "anthropic/claude-haiku-4-5-20251001"
      }
    }
  }
}

`enabled` defaults to false, so existing deployments are unaffected. Model IDs should resolve through the existing models.providers.* config surface so that routing also works for non-Anthropic providers (OpenAI, Gemini, local models).
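
A minimal sketch of how the config above might be consumed, assuming a hypothetical `TaskRoutingConfig` interface and `modelForTier` helper (neither name comes from the openclaw codebase). Provider resolution through models.providers.* is deliberately left out; the sketch only shows the opt-in behavior and the fallback to the global model.

```typescript
type TaskTier = "frontier" | "mid" | "fast";

// Shape mirrors the JSON in the proposal; the interface name is hypothetical.
interface TaskRoutingConfig {
  enabled: boolean;
  tierModels: Partial<Record<TaskTier, string>>;
}

// Returns the tier's model when routing is enabled and a mapping exists;
// otherwise falls back to the globally configured model (current behavior).
function modelForTier(
  routing: TaskRoutingConfig | undefined,
  tier: TaskTier,
  globalModel: string,
): string {
  if (!routing || !routing.enabled) return globalModel;
  return routing.tierModels[tier] ?? globalModel;
}

const routing: TaskRoutingConfig = {
  enabled: true,
  tierModels: {
    frontier: "anthropic/claude-opus-4-6",
    mid: "anthropic/claude-sonnet-4-6",
    fast: "anthropic/claude-haiku-4-5-20251001",
  },
};

const fastModel = modelForTier(routing, "fast", "anthropic/claude-opus-4-6");
const disabledModel = modelForTier({ ...routing, enabled: false }, "fast", "anthropic/claude-opus-4-6");
```

With routing enabled, `fastModel` is the fast-tier entry; with `enabled: false`, `disabledModel` is the global model, which matches today's behavior.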

3. Wire into resolveHookModelSelection

The existing hooks.internal.entries system already provides the right override surface. The tier classification runs as a pre-hook default, overridable by user config or a before_agent_start hook for per-task fine-tuning.
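
The override order can be sketched as a simple precedence chain. This is an assumption-laden illustration: the `ModelCandidates` shape and `selectSpawnModel` function are hypothetical, the real signature of resolveHookModelSelection is not shown in the issue, and the relative order of hook versus user config is assumed here rather than specified.

```typescript
// Hypothetical inputs: none of these names come from the openclaw codebase.
interface ModelCandidates {
  hookModel?: string;   // set by a before_agent_start hook, if any
  configModel?: string; // explicit user-config override, if any
  tierModel: string;    // default derived from the task tier
}

// Assumed precedence: hook override, then user config, then tier default.
function selectSpawnModel(c: ModelCandidates): string {
  return c.hookModel ?? c.configModel ?? c.tierModel;
}

const tierOnly = selectSpawnModel({ tierModel: "anthropic/claude-sonnet-4-6" });
const hookWins = selectSpawnModel({
  hookModel: "anthropic/claude-opus-4-6",
  configModel: "anthropic/claude-haiku-4-5-20251001",
  tierModel: "anthropic/claude-sonnet-4-6",
});
```

The tier default only applies when nothing more specific is set, so existing per-task overrides keep working unchanged.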

4. Coordinator classifies; workers don't choose

The task tier is determined at spawn time by the coordinator — workers never select their own model. This matches the coordinator/worker pattern where the coordinator owns routing decisions.
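
One way to express this ownership in code is to have the coordinator build an immutable spawn request. The `SpawnRequest` type and `buildSpawnRequest` helper below are hypothetical names for illustration; the actual sessions_spawn payload is not described in the issue.

```typescript
type TaskTier = "frontier" | "mid" | "fast";

// Hypothetical spawn request: the model is fixed before the worker starts.
interface SpawnRequest {
  readonly prompt: string;
  readonly tier: TaskTier;
  readonly model: string;
}

// Coordinator-side helper; the classifier and tier map are injected so the
// sketch stays independent of any concrete implementation.
function buildSpawnRequest(
  prompt: string,
  classify: (prompt: string) => TaskTier,
  tierModels: Record<TaskTier, string>,
): SpawnRequest {
  const tier = classify(prompt);
  return Object.freeze({ prompt, tier, model: tierModels[tier] });
}

const request = buildSpawnRequest(
  "Fix typo in README",
  () => "fast", // stub classifier for the example
  {
    frontier: "anthropic/claude-opus-4-6",
    mid: "anthropic/claude-sonnet-4-6",
    fast: "anthropic/claude-haiku-4-5-20251001",
  },
);
```

The worker receives `request.model` as a read-only value and has no code path for choosing a different one.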

Alternatives considered

  • Per-agent static model config: already supported via authProfileId and explicit model params. Task-tier routing is the dynamic complement — classifying at spawn time based on what the task actually is.
  • Let users specify model per ACP call: valid, but adds burden to every call site. Tier routing is a sensible default that callers can override.
  • Single model for all: current behavior. Works but wastes budget and adds latency on the majority of calls that are mid- or fast-tier work.

Impact

  • Affected: All deployments using ACP subagent spawning (sessions_spawn with runtime: "acp").
  • Severity: Medium — not a blocker, but a meaningful cost and latency improvement for high-volume deployments.
  • Frequency: Every ACP spawn call.
  • Consequence: Without routing, frontier model costs and latency apply uniformly; with routing, mid/fast tasks run ~4–10× cheaper with lower latency.
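
As a rough worked example of the consequence above, the blended cost under routing can be estimated from a task mix. The mix below is hypothetical and not measured data, and assigning the 4× figure to the mid tier and the 10× figure to the fast tier is an assumption about how the quoted range splits.

```typescript
type TaskTier = "frontier" | "mid" | "fast";

// Relative per-task cost with frontier normalized to 1 (assumed split of the ~4-10x range).
const relativeCost: Record<TaskTier, number> = { frontier: 1, mid: 1 / 4, fast: 1 / 10 };

// Hypothetical task mix (fractions of all spawns); not measured data.
const taskMix: Record<TaskTier, number> = { frontier: 0.2, mid: 0.5, fast: 0.3 };

// Weighted average cost per spawn, relative to running everything on frontier.
function blendedCost(mix: Record<TaskTier, number>, cost: Record<TaskTier, number>): number {
  const tiers: TaskTier[] = ["frontier", "mid", "fast"];
  return tiers.reduce((sum, t) => sum + mix[t] * cost[t], 0);
}

const withRouting = blendedCost(taskMix, relativeCost); // 0.2 + 0.125 + 0.03, about 0.355
```

Under these assumed numbers, routing would cost roughly a third of the all-frontier baseline; a real deployment would need its own measured mix.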

Evidence/examples

  • Token usage alone explains 80% of BrowseComp performance variance (Anthropic, multi-agent research system write-up, 2025).
  • Multi-agent latency: 8–15s vs single-agent 2–4s (framework benchmarks, 2025) — cheap routing for simple tasks reduces the gap.
  • Routing signals from BP 08: "design", "architect", "debug why", "security" → frontier; "rename", "format", "update comment", "change X to Y" → fast.

Additional information

  • The default TIER_MODEL_MAP is Claude-specific, but the config surface should be provider-agnostic so that non-Anthropic deployments can define their own tier→model mapping.
  • Should integrate with auth profile constraints: if a fast-tier model isn't available under the current auth profile, fall back to mid-tier gracefully.
  • Related review: best-practices/results/agent-review-2026-04-07.md (dimension 08 — Performance & startup).
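
The auth-profile fallback mentioned above could look like the following sketch. The `isAvailable` callback is a hypothetical stand-in for an auth-profile check, and escalating from mid to frontier is an assumption that extends the fast→mid fallback the issue describes.

```typescript
type TaskTier = "frontier" | "mid" | "fast";

// Escalation order: a tier that is unavailable falls back to the next one up.
const ESCALATION: Record<TaskTier, TaskTier | undefined> = {
  fast: "mid",
  mid: "frontier", // assumption: the issue only specifies fast -> mid
  frontier: undefined,
};

// `isAvailable` stands in for an auth-profile check; it is a hypothetical
// callback, not an existing openclaw API.
function resolveAvailableModel(
  tier: TaskTier,
  tierModels: Record<TaskTier, string>,
  isAvailable: (model: string) => boolean,
): string | undefined {
  let current: TaskTier | undefined = tier;
  while (current !== undefined) {
    const model = tierModels[current];
    if (isAvailable(model)) return model;
    current = ESCALATION[current];
  }
  return undefined; // nothing usable: the caller decides, e.g. use the global model
}

const tierModels: Record<TaskTier, string> = {
  frontier: "anthropic/claude-opus-4-6",
  mid: "anthropic/claude-sonnet-4-6",
  fast: "anthropic/claude-haiku-4-5-20251001",
};

// Simulate a profile that cannot use the fast-tier model.
const resolved = resolveAvailableModel("fast", tierModels, (m) => m !== tierModels.fast);
```

Falling back upward rather than downward keeps the behavior consistent with the guardrail of never under-routing.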

Extent analysis

TL;DR

Implement task-tier classification for ACP subagent spawning to route mechanical and fast tasks to cheaper models, reducing costs and latency.

Guidance

  • Introduce a task tier type system with a default classifier to categorize tasks into "frontier", "mid", or "fast" tiers based on their complexity.
  • Configure the task routing system with tier-specific models, allowing for provider-agnostic deployment.
  • Wire the task tier classification into the resolveHookModelSelection system to enable dynamic model selection.
  • Ensure the coordinator determines the task tier at spawn time, without relying on worker selection.

Example

// Example task tier classification function
export type TaskTier = "frontier" | "mid" | "fast";

export function classifyTaskTier(prompt: string): TaskTier {
  const p = prompt.toLowerCase();
  if (/debug.{0,20}why|architect|security audit|design a|investigate/.test(p))
    return "frontier";
  if (/rename|reformat|update.{0,10}comment|add import|fix typo/.test(p))
    return "fast";
  return "mid"; // safe default — never under-route when uncertain
}

Notes

The default tier-to-model map is specific to Claude models, but the config surface should be provider-agnostic. Routing should also respect auth profile constraints, falling back gracefully to a mid-tier model when a fast-tier model is not available.

Recommendation

Apply the proposed task-tier classification and routing system to reduce costs and latency for high-volume deployments. This solution provides a sensible default that can be overridden by user config or per-task fine-tuning.
