openclaw - 💡(How to fix) Fix [Feature]: Batch API support for async-tolerant cron jobs (50% token discount) [1 participants]

GitHub stats
openclaw/openclaw#70606 · Fetched 2026-04-24 05:55:49
Comments: 0 · Participants: 1 · Timeline: 1 · Reactions: 0
Timeline (top): labeled ×1

Summary

Add support for Anthropic's Message Batches API (POST /v1/messages/batches) as a delivery path for cron jobs whose output isn't latency-sensitive. Batch API offers a flat 50% discount on input + output tokens, with up to 24h completion — most complete in under 1h. Natural fit for digest/report jobs that fire overnight or on a weekly cadence. Reference: https://platform.claude.com/docs/en/build-with-claude/batch-processing

Problem to solve

Cron jobs in OpenClaw are billed at full real-time rates regardless of how latency-sensitive they are. An overnight digest that fires at 03:00 and a live chat reply both go through the same messages.create path and pay the same per-token price. For users on paid API-key billing, this means a large class of jobs (weekly reports, nightly summaries, scheduled research digests) is paying a 2× premium for latency nobody is using. The user doesn't care whether the message arrives at 03:05 or 03:45.

There's also no way today to isolate these async jobs from the real-time RPM/TPM budget. A long overnight report consumes the same rate-limit pool as interactive requests, which can cause collisions during busy hours. The Batch API runs against a separate quota.

Proposed solution

A new cron flag --batch (or the equivalent payload field, batch.enabled: true) that reroutes the agent turn through /v1/messages/batches instead of /v1/messages. The gateway:

  1. Enqueues the request with a custom_id of the cron run ID
  2. Polls the batch status endpoint (~once per minute, with exponential backoff up to the user-configured --batch-timeout)
  3. On batch completion, extracts the result and proceeds through the normal delivery pipeline (announce / none / tool-call output)
  4. On batch timeout or failure, falls through to the existing failure path (--failure-alert-*)
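
The polling and outcome steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: pollSchedule, nextStep, and the 8-minute cap are hypothetical and not part of OpenClaw. The four per-request result types are the ones Anthropic documents for the Message Batches API.

```typescript
// Hypothetical sketch: none of these names exist in OpenClaw today.

/** Per-request result types documented for the Message Batches API. */
type BatchResultType = "succeeded" | "errored" | "canceled" | "expired";

/** Where the cron run goes next (illustrative names). */
type NextStep = "deliver" | "failure-alert";

/**
 * Poll delays in ms: start at `baseMs`, double each time up to `capMs`,
 * and stop once the cumulative wait would exceed `timeoutMs`.
 */
function pollSchedule(baseMs: number, capMs: number, timeoutMs: number): number[] {
  const delays: number[] = [];
  let elapsed = 0;
  let delay = baseMs;
  while (elapsed + delay <= timeoutMs) {
    delays.push(delay);
    elapsed += delay;
    delay = Math.min(delay * 2, capMs);
  }
  return delays;
}

/** Only a succeeded result continues to the normal delivery pipeline. */
function nextStep(result: BatchResultType): NextStep {
  return result === "succeeded" ? "deliver" : "failure-alert";
}

// First poll after 1 min, backing off to at most 8 min, within a 2 h budget.
const schedule = pollSchedule(60_000, 480_000, 7_200_000);
console.log(schedule.length, schedule.slice(0, 5));
```

When the schedule runs out before the batch has ended, the run would be treated as a timeout and handed to the same failure path as an errored or expired result.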

Suggested CLI surface:

openclaw cron create \
  --name "Weekly Report" \
  --cron "0 3 * * 0" \
  --tz UTC \
  --model anthropic/claude-sonnet-4-6 \
  --batch \
  --batch-timeout 2h \
  ...

Suggested payload field:

"payload": {
  "kind": "agentTurn",
  "message": "...",
  "batch": { "enabled": true, "maxWaitMs": 7200000 }
}

Fallback behavior:

  • Skip batch mode when the provider is OAuth-backed (Claude Pro/Max subscription plans can't use Batch API): fall back to real-time automatically with a one-line log note, no error surfaced to the user.
  • Skip batch mode when the configured provider isn't anthropic: fall back to real-time. OpenAI's batch endpoint is a separate surface and out of scope for the initial implementation.
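
One way to picture the fallback rules is as a small pure function. The types and names below (AuthKind, DispatchInput, decideDispatch) are hypothetical, and OpenClaw's real provider and auth model may look quite different. The sketch only encodes the two skip conditions listed above.

```typescript
// Hypothetical types: OpenClaw's real provider/auth model may differ.
type AuthKind = "api-key" | "oauth";

interface DispatchInput {
  provider: string;        // e.g. "anthropic"
  auth: AuthKind;
  batchRequested: boolean; // --batch or payload.batch.enabled
}

interface DispatchDecision {
  mode: "batch" | "realtime";
  note?: string; // one-line log note when batch was requested but skipped
}

function decideDispatch(input: DispatchInput): DispatchDecision {
  if (!input.batchRequested) return { mode: "realtime" };
  if (input.provider !== "anthropic") {
    return { mode: "realtime", note: `batch skipped: provider "${input.provider}" is not anthropic` };
  }
  if (input.auth === "oauth") {
    return { mode: "realtime", note: "batch skipped: OAuth-backed provider cannot use the Batch API" };
  }
  return { mode: "batch" };
}

console.log(decideDispatch({ provider: "anthropic", auth: "oauth", batchRequested: true }));
```

Returning the note instead of logging inside the function keeps the decision testable and leaves the choice of log channel to the caller.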

Compatibility:

  • Works cleanly with existing cron flags: --thinking off, --light-context, --announce, delivery.mode
  • Prompt caching (cache_control) is already supported by Batch API, so no changes are needed there
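
To illustrate why prompt caching should need no changes: each entry in a batch carries ordinary Messages API params, so a cache_control block passes through untouched. The sketch below follows the public request shape (requests[] with custom_id and params). The buildBatchBody helper, the run ID value, and dropping the anthropic/ prefix from the model string are assumptions for illustration only.

```typescript
// Shapes follow the public Message Batches request format; helper names are hypothetical.
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface MessageParams {
  model: string;
  max_tokens: number;
  system?: TextBlock[];
  messages: { role: "user" | "assistant"; content: string }[];
}

interface BatchCreateBody {
  requests: { custom_id: string; params: MessageParams }[];
}

/** Wrap one agent turn as a single-request batch, keyed by the cron run ID. */
function buildBatchBody(cronRunId: string, params: MessageParams): BatchCreateBody {
  return { requests: [{ custom_id: cronRunId, params }] };
}

const body = buildBatchBody("cron-run-0001", {
  model: "claude-sonnet-4-6", // assumption: provider prefix dropped from the --model value
  max_tokens: 2048,
  system: [
    { type: "text", text: "You write the weekly report.", cache_control: { type: "ephemeral" } },
  ],
  messages: [{ role: "user", content: "..." }],
});

console.log(JSON.stringify(body, null, 2));
```

The resulting object is what would be sent as the JSON body of POST /v1/messages/batches.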

Alternatives considered

  1. Local batch plugin — custom plugin wrapping the Anthropic provider. Doable but requires maintaining a fork of provider logic, breaks on every OpenClaw update, and doesn't benefit anyone else.
  2. Manual HTTP batching outside OpenClaw — bypasses the agent loop entirely, losing tool use, memory, session context. Only works for pure-prompt jobs; most cron jobs use tools.
  3. Do nothing — leaves 50% on the table for anyone on API-key billing.

Impact

Affected: OpenClaw users on paid Anthropic API-key billing who run scheduled cron jobs (reports, digests, summaries, research pulls) where delivery latency is measured in hours, not seconds.

Not affected: OAuth/subscription users, interactive channel users, users whose cron jobs are latency-sensitive.

Severity: Medium. No functionality is broken, but users are paying roughly 2× what they could for a well-defined class of jobs. Also medium-impact on rate-limit headroom: every async cron job currently eats from the same real-time RPM/TPM budget as interactive traffic.

Frequency: Continuous for anyone running daily/weekly crons on API-key billing. Compounds over time: a single nightly digest at ~30K tokens/run costs roughly 2× what it needs to, every day, indefinitely.

Consequence:

  • Direct cost: full-rate billing on jobs that could run at 50% off
  • Indirect cost: async cron jobs compete with interactive requests for rate-limit budget, occasionally causing throttling on real-time traffic during overnight batch windows
  • Workaround cost: users who want the discount today must bypass OpenClaw entirely (raw HTTP to the Batch API), losing tool use, memory, and session context, so most don't bother
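
The direct-cost claim can be checked with simple arithmetic. In the sketch below, only the ~30K tokens per run and the 50% discount come from the issue. The per-token prices are placeholders, not real list prices, and the 25K/5K input/output split is an assumption.

```typescript
/** Prices in USD per million tokens. The values used below are placeholders, not real list prices. */
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

/** Cost of one run; `discount` is a fraction, e.g. 0.5 for the Batch API's 50%. */
function runCost(inputTokens: number, outputTokens: number, p: Pricing, discount = 0): number {
  const full = (inputTokens * p.inputPerMTok + outputTokens * p.outputPerMTok) / 1_000_000;
  return full * (1 - discount);
}

const placeholder: Pricing = { inputPerMTok: 4, outputPerMTok: 16 };

// ~30K tokens per nightly run; the input/output split is assumed.
const realtime = runCost(25_000, 5_000, placeholder);
const batched = runCost(25_000, 5_000, placeholder, 0.5);
const yearlySaving = (realtime - batched) * 365;

console.log({ realtime, batched, yearlySaving });
```

Whatever prices are plugged in, the batched cost is half the real-time cost, so the saving scales linearly with token volume and run frequency.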

Evidence/examples

No response

Additional information

Environment:

  • OpenClaw version: 2026.4.21
  • Node: 24.x
  • OS: Linux (VPS)
  • Provider plugin: anthropic (paid API key)

Willing to contribute: Happy to prototype the Anthropic adapter if there's interest. Main open question is where in the provider plugin lifecycle (extensions/anthropic/ internally) the dispatch decision should happen — per-turn vs per-session — and how the polling loop should surface progress events to the agent runtime so the SSE stream and wake-up logic stay consistent.

extent analysis

TL;DR

To reduce costs for non-latency-sensitive cron jobs, implement a --batch flag that reroutes requests through the Anthropic Batch API, utilizing a 50% discount on input and output tokens.

Guidance

  • Introduce a new --batch flag for cron jobs to opt-in for batch processing, which would use the Anthropic Batch API for non-latency-sensitive tasks.
  • Modify the gateway to enqueue requests with a custom ID and poll the batch status endpoint with exponential backoff up to the user-configured --batch-timeout.
  • Implement fallback behavior to skip batch mode for OAuth-backed providers or non-Anthropic providers, defaulting to real-time processing.
  • Ensure compatibility with existing cron flags and prompt caching mechanisms.

Example

"payload": {
  "kind": "agentTurn",
  "message": "...",
  "batch": { "enabled": true, "maxWaitMs": 7200000 }
}
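
The maxWaitMs value in this example is 2 hours, matching --batch-timeout 2h from the suggested CLI. A minimal sketch of that conversion follows. parseTimeout is a hypothetical helper, and accepting s, m, and h suffixes is an assumption, since the issue only shows the 2h form.

```typescript
/** Convert a duration such as "2h", "45m", or "90s" into milliseconds. */
function parseTimeout(value: string): number {
  const match = /^(\d+)(s|m|h)$/.exec(value.trim());
  if (!match) throw new Error(`unsupported duration: ${value}`);
  // Assumed unit suffixes; the issue itself only uses "2h".
  const unitMs: Record<string, number> = { s: 1_000, m: 60_000, h: 3_600_000 };
  return Number(match[1]) * unitMs[match[2]];
}

// Build the payload's batch field from the CLI flag value.
const batch = { enabled: true, maxWaitMs: parseTimeout("2h") };
console.log(batch);
```

Rejecting unknown formats with an error, instead of silently defaulting, avoids a cron job waiting far longer or shorter than the user intended.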

Notes

The proposed solution requires modifications to the OpenClaw cron job system and the Anthropic provider plugin. The exact implementation details, such as the dispatch decision point in the provider plugin lifecycle, need further discussion.

Recommendation

This is a feature request, not a bug with a workaround. The suggested path is to implement the --batch flag and have the gateway route opted-in cron jobs through the Anthropic Batch API, which offers a clear cost reduction for non-latency-sensitive jobs.
