openclaw - Design input needed: token-cost scaling in conversational multi-agent coordination [1 comment, 1 participant]

openclaw/openclaw#72629 · Fetched 2026-04-28 06:33:54


Summary

Multi-agent conversational coordination — where every agent reads the full prior conversation each turn — has token cost that grows roughly quadratically with agents × rounds. We observed this empirically running 5 agents over 3 rounds on a single shared task and ended up burning a significant chunk of $150 on a scenario that didn't, on the surface, justify it.

Filing this to surface the architectural pattern and invite community design input. We are explicitly not proposing a fix — the right trade-off isn't obvious to us, and naive solutions risk eroding the property that makes conversational coordination valuable in the first place.

What we observed

  • Setup: 5 agents (parent personas), 3 rounds, GPT-4o, cross-agent messaging via sessions_send / sessions_history. Scenario: coordinate emergency school pickup of 7 children in 45 minutes.
  • Run shape: ~80–100 LLM calls. Each agent's call included the full prior conversation each turn, so per-call input tokens grew steadily through the run.
  • Outcome quality: the run produced an honest, constraint-valid plan (5/7 assigned, 2 explicitly flagged as unresolved). Quality was good. The cost of getting there was the concern.
  • Cost shape: with N agents and M rounds, total tokens scale roughly as N × M × (N × M) because each later turn re-ingests every earlier turn. At our (5, 3) it's expensive but tolerable; at (10, 5) it's an order of magnitude worse; at (20, 10) it's prohibitive.
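The cost shape in the last bullet can be sketched with a toy model. This is a minimal sketch, not a measurement: it assumes one message per agent per round, a fixed message size (the hypothetical `tokens_per_message = 200`), and that each turn re-reads every earlier message. It ignores the system prompt, the task brief, output tokens, and the fact that the real run made several LLM calls per turn.

```python
def total_input_tokens(n_agents: int, n_rounds: int, tokens_per_message: int = 200) -> int:
    """Total input tokens when every turn re-reads all earlier messages.

    tokens_per_message is an illustrative assumption, not a measured value.
    """
    turns = n_agents * n_rounds
    # Turn k (0-based) ingests the k messages that came before it,
    # so the total is tokens_per_message * (0 + 1 + ... + (turns - 1)).
    return tokens_per_message * sum(range(turns))


baseline = total_input_tokens(5, 3)
for n, m in [(5, 3), (10, 5), (20, 10)]:
    cost = total_input_tokens(n, m)
    print(f"({n}, {m}): {cost:,} input tokens, {cost / baseline:.1f}x the (5, 3) baseline")
```

Under these assumptions, (10, 5) comes out near 12x the (5, 3) baseline and (20, 10) near 190x, which is consistent with the "order of magnitude worse" and "prohibitive" descriptions above. The exact multipliers depend on message length and call count, which this sketch does not model.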

The trade-off that makes this hard

The broadcast pattern is the mechanism that produces the qualitative wins of conversational coordination — emergent-information moments where one agent's reveal unlocks another agent's reasoning. Concretely, in our run: Diana mentioned that one child (Zuri) had met another helper (Emi) at a block party, which changed Raj's reasoning about pickup pairings even though Raj and Diana had no obvious topical overlap. Naive cost reduction (e.g. "only send to topically-relevant agents") would silently kill that property and erode the very thing that makes conversational mode interesting.

So this isn't a "broadcast is wasteful, remove it" issue. It's a "how do we keep the property at scale" issue.

Open questions for the community

  1. Has anyone landed on a coordination pattern that preserves emergent information at lower cost? Specifically: a way for cross-domain reveals to still reach non-obvious recipients without every agent ingesting the full transcript every turn.
  2. Is per-agent context filtering (saliency / relevance scoring) a viable middle ground? If so, what scoring approach has worked, and what's the failure mode when saliency drifts?
  3. Are there published patterns (topic routing, role-based subscriptions, summary tiers, attention sinks, hierarchical agents, etc.) with empirical data behind them at multi-agent scales? Pointers welcome.
  4. Where should this trade-off be exposed in OpenClaw's surface? A coordination-mode knob, a config primitive, or is this a per-scenario harness concern that doesn't belong in the framework at all?
  5. How does this interact with OpenClaw's existing star-topology MAS (sessions_spawn + targeted sessions_send)? Should "conversational coordination" be understood as a separate primitive that needs its own scaling story, or as a degenerate case of MAS that shouldn't be encouraged at scale?
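To make question 2 concrete, here is a minimal sketch of the failure mode a naive relevance filter could hit. Everything in it is hypothetical: the keyword-overlap score, the threshold, the message texts, the "Parent C" sender, and Raj's interest keywords are invented for illustration and are not taken from the actual run or from any OpenClaw API.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a message and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def relevance(message: str, interests: set[str]) -> float:
    """Fraction of the recipient's interest keywords that appear in the message."""
    if not interests:
        return 0.0
    return len(tokenize(message) & interests) / len(interests)


def filter_for(interests: set[str], transcript: list[dict[str, str]],
               threshold: float = 0.2) -> list[dict[str, str]]:
    """Keep only messages whose keyword overlap meets the threshold."""
    return [m for m in transcript if relevance(m["text"], interests) >= threshold]


# Hypothetical messages loosely modelled on the reveal described above.
transcript = [
    {"sender": "Diana", "text": "Zuri met Emi at the block party last month."},
    {"sender": "Parent C", "text": "My car has two free seats and I can drive the north route."},
]
# Hypothetical keywords describing what Raj is currently reasoning about.
raj_interests = {"car", "seats", "route", "drive", "pickup"}

kept = filter_for(raj_interests, transcript)
print([m["sender"] for m in kept])
```

With these inputs, Diana's message shares no keywords with Raj's interests and is dropped, even though it is the kind of cross-domain reveal the issue wants to preserve. A real saliency scorer would likely use embeddings or a model call rather than keyword overlap, but the same risk applies whenever relevance is judged from the recipient's current topic.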

What this issue is not

  • Not a bug report — broadcast in small scenarios is deliberate and valuable.
  • Not a feature proposal — we're not asking for a specific implementation.
  • Not a performance-tuning ask — model choice and prompt size are separately tunable; the cost shape we're describing is structural, not configurational.

Just trying to make sure this is on the radar before users hit it on bigger scenarios. Open to converting this to a Discussion if that's the better venue.

Repro

Scenario, agent contexts, and runner scripts can be shared on request. Anyone wanting to reproduce just needs an OpenClaw install with 5 agents registered, a shared task brief, and a 3-round orchestrator that sends each agent the full conversation each turn. The cost shape will be visible in the first 2–3 rounds.
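Until the actual runner scripts are shared, the loop described above can be approximated without any framework. The sketch below is an assumption-laden stand-in: `run_broadcast` and `stub_reply` are hypothetical names, the stub replaces the real model call, word count stands in for token count, and no OpenClaw API is used.

```python
from typing import Callable


def run_broadcast(agents: list[str], rounds: int, brief: str,
                  reply_fn: Callable[[str, str], str]) -> list[int]:
    """Run a full-transcript broadcast loop and return the input size of every call."""
    transcript: list[str] = []
    input_sizes: list[int] = []
    for round_no in range(1, rounds + 1):
        for agent in agents:
            # Every call carries the brief plus the whole conversation so far.
            prompt = "\n".join([brief, *transcript])
            input_sizes.append(len(prompt.split()))  # word count as a rough token proxy
            transcript.append(f"{agent} (round {round_no}): {reply_fn(agent, prompt)}")
    return input_sizes


def stub_reply(agent: str, prompt: str) -> str:
    """Placeholder for a real model call; always returns a 20-word reply."""
    return " ".join(["word"] * 20)


agents = [f"agent_{i}" for i in range(1, 6)]
sizes = run_broadcast(agents, rounds=3,
                      brief="Coordinate pickup of 7 children in 45 minutes.",
                      reply_fn=stub_reply)
per_round = [sum(sizes[i:i + len(agents)]) for i in range(0, len(sizes), len(agents))]
print("input size per call:", sizes)
print("input size per round:", per_round)
```

Swapping `stub_reply` for a real model call and logging the provider's reported input tokens instead of the word count should show the same pattern: per-call input grows on every turn, and per-round totals grow faster than linearly.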

extent analysis

TL;DR

The most likely fix involves exploring alternative coordination patterns that preserve emergent information at lower cost, such as per-agent context filtering or published patterns like topic routing.

Guidance

  • Investigate per-agent context filtering using saliency or relevance scoring to reduce the amount of information each agent ingests.
  • Explore published patterns like topic routing, role-based subscriptions, or hierarchical agents that have empirical data behind them at multi-agent scales.
  • Consider exposing the trade-off between conversational coordination and cost in OpenClaw's surface, such as a coordination-mode knob or config primitive.
  • Analyze how conversational coordination interacts with OpenClaw's existing star-topology MAS and determine if it should be understood as a separate primitive with its own scaling story.
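As a rough way to compare the options above, the following sketch contrasts full broadcast with one hypothetical "summary tier" variant: each call sees the last few messages verbatim plus a fixed-size summary of everything older. The window size, summary size, and message size are illustrative assumptions, and the cost of producing the summary itself is not counted.

```python
def full_broadcast_cost(turns: int, tokens_per_message: int = 200) -> int:
    """Input tokens when every turn re-reads all earlier messages."""
    return tokens_per_message * sum(range(turns))


def windowed_summary_cost(turns: int, window: int = 5, summary_tokens: int = 300,
                          tokens_per_message: int = 200) -> int:
    """Input tokens when each turn reads a recent window plus a fixed-size summary."""
    total = 0
    for k in range(turns):
        recent = min(k, window)  # messages still shown verbatim
        older = k - recent       # messages folded into the summary
        total += recent * tokens_per_message
        if older > 0:
            total += summary_tokens  # summary is only needed once something is older
    return total


for n, m in [(5, 3), (10, 5), (20, 10)]:
    turns = n * m
    print(f"({n}, {m}): full={full_broadcast_cost(turns):,} "
          f"windowed={windowed_summary_cost(turns):,}")
```

In this model the windowed variant grows linearly with the number of turns rather than quadratically. The saving comes entirely from compressing older messages, so whether a cross-domain reveal survives depends on what the summary keeps, which is exactly the trade-off the issue raises.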

Example

No code snippet is provided as the issue is focused on architectural patterns and design decisions rather than specific code implementations.

Notes

The issue lacks concrete technical details, and the solution will depend on the specific requirements and constraints of the OpenClaw framework and the conversational coordination use case. Further discussion and experimentation are needed to determine the best approach.

Recommendation

Treat full-transcript broadcast as suitable for small scenarios, and evaluate alternative coordination patterns before scaling up, since its token cost grows roughly quadratically with agents × rounds. The aim is to use resources more efficiently while preserving the benefits of conversational coordination.
