openclaw: Add per-agent maxConcurrent to allow shared gateways with single-lane agents


Bug type

Feature gap / reliability workaround

Summary

OpenClaw currently exposes agents.defaults.maxConcurrent as a gateway-wide embedded-agent concurrency limit, but does not appear to expose a per-agent concurrency limit such as agents.defaults.maxConcurrentPerAgent or agents.list[].maxConcurrent.

This creates an awkward reliability tradeoff for shared multi-agent gateways:

  • setting agents.defaults.maxConcurrent = 1 protects fragile agents/sessions from concurrent runs, but serializes every agent in the gateway;
  • raising agents.defaults.maxConcurrent allows different agents to run in parallel, but there is no config-supported way to keep each individual agent single-file/single-run;
  • splitting agents into separate gateway processes works, but costs additional RAM and channel/provider runtime overhead.

A useful target behavior would be:

{
  "agents": {
    "defaults": {
      "maxConcurrent": 2,
      "maxConcurrentPerAgent": 1
    },
    "list": [
      { "id": "hexa" },
      { "id": "hkerbot" }
    ]
  }
}

With that shape, a shared gateway could run Hexa and HKerBot concurrently, while still serializing each agent's own turns.
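
The intended behavior can be sketched as a simple admission check. This is only an illustration of the proposed semantics, not OpenClaw code: the `ConcurrencyLimits` type and `canStartRun` function are hypothetical names, and `maxConcurrentPerAgent` is the proposed (not yet existing) setting.

```typescript
// Hypothetical types for illustration; not part of OpenClaw's API.
interface ConcurrencyLimits {
  maxConcurrent: number;         // existing gateway-wide cap
  maxConcurrentPerAgent: number; // proposed per-agent cap
}

// Returns true if a new run for `agentId` may start, given the agent ids
// of the runs that are currently active on the gateway.
function canStartRun(
  activeAgentIds: readonly string[],
  agentId: string,
  limits: ConcurrencyLimits,
): boolean {
  const total = activeAgentIds.length;
  const sameAgent = activeAgentIds.filter((id) => id === agentId).length;
  return total < limits.maxConcurrent && sameAgent < limits.maxConcurrentPerAgent;
}

const limits: ConcurrencyLimits = { maxConcurrent: 2, maxConcurrentPerAgent: 1 };

console.log(canStartRun(["hexa"], "hkerbot", limits));         // true: different agent, gateway has room
console.log(canStartRun(["hkerbot"], "hkerbot", limits));      // false: hkerbot already has a run
console.log(canStartRun(["hexa", "hkerbot"], "hexa", limits)); // false: both caps reached
```

With only the gateway-wide cap available today, the second case can be blocked only by setting `maxConcurrent` to 1, which also blocks the first case.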

Field case

On a shared Hexa/HKerBot gateway running OpenClaw 2026.5.20, HKerBot hit repeated user-visible generic failures:

⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

Logs showed the real error was session JSONL write-lock contention:

SessionWriteLockTimeoutError: session file locked (timeout 60000ms): pid=... /home/molt/.openclaw-hexa-qa/agents/hkerbot/sessions/05477859-4cd4-408a-bd95-9d4846610afb.jsonl.lock
errorCode=UNAVAILABLE
code=OPENCLAW_SESSION_WRITE_LOCK_TIMEOUT

This overlaps with #84193, where auto-compaction can hold a session JSONL write lock long enough to block later turns. In the field case, a restart cleared the lock briefly, but the same old HKerBot session immediately re-entered a compaction/lock loop. The practical recovery was to back up the session and remove only the affected session mapping so HKerBot could start a fresh session.

As a mitigation, the gateway was configured with:

{
  "messages": {
    "queue": {
      "mode": "collect",
      "byChannel": { "telegram": "collect" }
    }
  },
  "agents": {
    "defaults": {
      "timeoutSeconds": 1800,
      "maxConcurrent": 1
    }
  }
}

That is reliable but too blunt: it serializes both Hexa and HKerBot even when they are separate logical agents and could safely run in parallel if each agent were limited to one active run.

Schema evidence

Running config.schema.lookup on agents.list.* lists neither maxConcurrent nor timeoutSeconds as per-agent keys. A direct schema search also found no maxConcurrentPerAgent, perAgent, or agents.list.*.maxConcurrent. The only concurrency settings that appear are the gateway-wide agents.defaults.maxConcurrent, subagent max concurrency, ACP concurrency, and cron concurrency.

Expected behavior

OpenClaw should support at least one of these shapes:

"agents": {
  "defaults": {
    "maxConcurrent": 2,
    "maxConcurrentPerAgent": 1
  }
}

and/or:

"agents": {
  "list": [
    { "id": "hkerbot", "maxConcurrent": 1 },
    { "id": "hexa", "maxConcurrent": 1 }
  ]
}
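
If both shapes were supported, the gateway would need a rule for combining them. The sketch below shows one plausible resolution order. It is an assumption, not something the issue specifies: a per-agent `maxConcurrent` in `agents.list[]` wins over `agents.defaults.maxConcurrentPerAgent`, and the result is clamped to the gateway-wide cap. The `AgentsConfig` type and `resolveAgentLimit` function are hypothetical names.

```typescript
interface AgentEntry {
  id: string;
  maxConcurrent?: number; // proposed agents.list[].maxConcurrent
}

interface AgentsConfig {
  defaults?: {
    maxConcurrent?: number;         // existing gateway-wide cap
    maxConcurrentPerAgent?: number; // proposed default per-agent cap
  };
  list?: AgentEntry[];
}

// Assumed precedence: list entry, then defaults.maxConcurrentPerAgent, then
// the gateway-wide cap. An unset gateway cap is treated as "no limit" here,
// which may not match OpenClaw's real default.
function resolveAgentLimit(config: AgentsConfig, agentId: string): number {
  const globalCap = config.defaults?.maxConcurrent ?? Infinity;
  const entry = config.list?.find((agent) => agent.id === agentId);
  const perAgent =
    entry?.maxConcurrent ?? config.defaults?.maxConcurrentPerAgent ?? globalCap;
  // A per-agent value can never exceed the gateway-wide total.
  return Math.min(perAgent, globalCap);
}

const config: AgentsConfig = {
  defaults: { maxConcurrent: 2, maxConcurrentPerAgent: 1 },
  list: [{ id: "hkerbot" }, { id: "hexa", maxConcurrent: 5 }],
};

console.log(resolveAgentLimit(config, "hkerbot")); // 1 (from defaults)
console.log(resolveAgentLimit(config, "hexa"));    // 2 (5 clamped to the gateway cap)
```

The value 5 for hexa is only there to show the clamp. In the configurations proposed above, both agents would resolve to 1.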

The embedded runner should then enforce an agent-level concurrency lane/semaphore in addition to the existing session lane and global lane, roughly:

session:<sessionKey> -> agent:<agentId> -> main/global

or an equivalent composition.
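
A minimal sketch of that composition, assuming one counting semaphore per lane. None of these classes exist in OpenClaw under these names: `Semaphore`, `LaneScheduler`, and `LaneLimits` are hypothetical, and the real embedded runner may structure its lanes differently. The point is only that acquiring lanes in a fixed order (session, then agent, then global) and releasing them in reverse keeps the three limits composable without deadlock.

```typescript
// Minimal FIFO counting semaphore (illustrative, not an OpenClaw class).
class Semaphore {
  private available: number;
  private readonly waiters: Array<() => void> = [];

  constructor(permits: number) {
    this.available = permits;
  }

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available -= 1;
      return;
    }
    // No permit free: park until release() hands one over.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) next(); // transfer the permit directly to the oldest waiter
    else this.available += 1;
  }
}

interface LaneLimits {
  maxConcurrent: number;         // gateway-wide cap
  maxConcurrentPerAgent: number; // proposed per-agent cap
}

class LaneScheduler {
  private readonly limits: LaneLimits;
  private readonly globalLane: Semaphore;
  private readonly agentLanes = new Map<string, Semaphore>();
  private readonly sessionLanes = new Map<string, Semaphore>();

  constructor(limits: LaneLimits) {
    this.limits = limits;
    this.globalLane = new Semaphore(limits.maxConcurrent);
  }

  // Look up a lane by key, creating it on first use.
  private lane(lanes: Map<string, Semaphore>, key: string, permits: number): Semaphore {
    let lane = lanes.get(key);
    if (!lane) {
      lane = new Semaphore(permits);
      lanes.set(key, lane);
    }
    return lane;
  }

  // Acquire session -> agent -> global in a fixed order; release in reverse.
  async run<T>(sessionKey: string, agentId: string, task: () => Promise<T>): Promise<T> {
    const sessionLane = this.lane(this.sessionLanes, sessionKey, 1);
    const agentLane = this.lane(this.agentLanes, agentId, this.limits.maxConcurrentPerAgent);
    await sessionLane.acquire();
    try {
      await agentLane.acquire();
      try {
        await this.globalLane.acquire();
        try {
          return await task();
        } finally {
          this.globalLane.release();
        }
      } finally {
        agentLane.release();
      }
    } finally {
      sessionLane.release();
    }
  }
}
```

With `new LaneScheduler({ maxConcurrent: 2, maxConcurrentPerAgent: 1 })`, two runs for hkerbot on different sessions queue behind each other, while a hexa run can proceed alongside whichever hkerbot run is active. The sketch never evicts idle lanes from its maps, so a long-lived gateway would need some cleanup.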

Why this matters

For small hosts, splitting every reliability-sensitive agent into a separate gateway wastes RAM. A per-agent concurrency limit lets operators keep one shared gateway process while avoiding same-agent overlapping runs from Telegram, Paperclip/API wakes, cron/system events, compaction, or other embedded-agent entry points.

This does not replace the root fix for #84193. Auto-compaction still needs correct lock release/fencing on timeout/abort. But per-agent concurrency is a practical reliability control that avoids choosing between:

  • one global lane for all agents, or
  • many gateway processes.

Acceptance criteria

  • Add schema support for agents.defaults.maxConcurrentPerAgent and/or agents.list[].maxConcurrent.
  • Enforce per-agent concurrency across embedded agent run entry points, while keeping same-session serialization intact.
  • Preserve gateway-wide agents.defaults.maxConcurrent as a total cap.
  • Add tests showing two different agents can run concurrently when global max is 2, while two runs for the same agent queue/serialize when per-agent max is 1.
  • Document interaction with messages.queue, cron.maxConcurrentRuns, session lanes, and compaction paths.
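
The concurrency test in the criteria above could be built around a small probe that records peak in-flight runs. The sketch below is an assumption about how such a test might look, not an existing OpenClaw test: `measurePeaks` and `makePerAgentSerializer` are hypothetical names, and the serializer is only a stand-in for the real embedded-run entry point. It chains each agent's runs but does not enforce a gateway-wide cap.

```typescript
type RunFn = (agentId: string, task: () => Promise<void>) => Promise<void>;

interface Peaks {
  total: number;                 // most runs in flight at once, any agent
  perAgent: Map<string, number>; // most runs in flight at once, per agent
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Submits one run per entry in `agentIds` at the same time and records the
// highest number of runs observed in flight, overall and per agent.
async function measurePeaks(run: RunFn, agentIds: string[]): Promise<Peaks> {
  const active = new Map<string, number>();
  const peaks: Peaks = { total: 0, perAgent: new Map() };
  let total = 0;

  await Promise.all(
    agentIds.map((agentId) =>
      run(agentId, async () => {
        total += 1;
        const mine = (active.get(agentId) ?? 0) + 1;
        active.set(agentId, mine);
        peaks.total = Math.max(peaks.total, total);
        peaks.perAgent.set(agentId, Math.max(peaks.perAgent.get(agentId) ?? 0, mine));
        await sleep(10); // simulated agent turn
        total -= 1;
        active.set(agentId, (active.get(agentId) ?? 0) - 1);
      }),
    ),
  );
  return peaks;
}

// Stand-in scheduler: runs for the same agent execute one at a time by
// chaining promises. Replace with the real entry point in an actual test.
function makePerAgentSerializer(): RunFn {
  const tails = new Map<string, Promise<void>>();
  return (agentId, task) => {
    const previous = tails.get(agentId) ?? Promise.resolve();
    const next = previous.then(task);
    tails.set(agentId, next.catch(() => {})); // keep the chain alive after a failure
    return next;
  };
}

measurePeaks(makePerAgentSerializer(), ["hexa", "hkerbot", "hkerbot"]).then((peaks) => {
  console.log(peaks.total, peaks.perAgent.get("hkerbot"));
});
```

A passing test would assert that the total peak reaches 2 (different agents overlap) while the peak for any single agent stays at 1 (same-agent runs serialize).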

Related

  • #84193 — auto-compaction leaves session JSONL write lock held after timeout, blocking later turns.
  • #43367 — multi-agent orchestration instability and session-lock failures.
