openclaw - ✅ (Solved) Fix: Empty payload (payloads=0) does not trigger model fallback chain [1 pull request, 1 comment, 2 participants]

openclaw • 2026-05-06 05:27:56

openclaw/openclaw#78293 • Fetched 2026-05-07 03:38:45

Author: HendrikHarren
Participants: clawsweeper[bot], HendrikHarren
Timeline (top): closed ×1, commented ×1, cross-referenced ×1

When a provider returns a technically successful response with stopReason=stop and payloads=0 (empty content), runWithModelFallback does not advance to the next configured fallback model. The user sees ⚠️ Agent couldn't generate a response. Please try again. even when a valid fallbacks: [...] chain is configured.

We see this 1–2 times per day with google/gemini-3-flash-preview as primary and anthropic/claude-haiku-4-5 as fallback.

Error Message

[agent/embedded] incomplete turn detected: runId=2d12b7f7-... sessionId=8fe3bddf-... stopReason=stop payloads=0 — surfacing error to user

runWithModelFallback (src/agents/model-fallback.ts:633) advances candidates only when the run callback throws. The embedded Pi runner (src/agents/pi-embedded-runner/run.ts:1901, the if (incompleteTurnText) branch) returns a successful result with an isError: true payload instead of throwing, so the fallback loop sees success (model-fallback.ts:788) and returns the user-facing error to the caller without trying any fallback.

Root Cause

HTTP 4xx/5xx errors go through coerceToFailoverError and become FailoverError instances, so the fallback chain triggers correctly for them. Only the "successful but empty" case falls through.

The internal DEFAULT_EMPTY_RESPONSE_RETRY_LIMIT = 1 (src/agents/pi-embedded-runner/run/incomplete-turn.ts:83) retries the same provider/model — it does not switch models.
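
The difference between the two paths can be illustrated with a minimal sketch. Everything here is a simplified, hypothetical stand-in (a synchronous loop, a two-field result type, a marker property on the error); it is not the code in model-fallback.ts, only an illustration of why a returned isError payload never reaches the next candidate while a thrown FailoverError does.

```typescript
// Simplified, hypothetical stand-ins for the real OpenClaw types.
type RunResult = { payloads: string[]; isError: boolean };

class FailoverError extends Error {
  // Marker property so detection does not depend on `instanceof`.
  readonly isFailover = true;
}

function isFailoverError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { isFailover?: boolean }).isFailover === true
  );
}

// Tries candidates in order and advances only when `run` throws a FailoverError.
function runWithFallback(
  candidates: string[],
  run: (model: string) => RunResult,
): { model: string; result: RunResult } {
  let lastError: unknown = new Error("no candidates configured");
  for (const model of candidates) {
    try {
      // Any returned value ends the loop, even an isError payload.
      return { model, result: run(model) };
    } catch (err) {
      if (!isFailoverError(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}

const candidates = ["primary", "fallback"];

// HTTP-style failure: the runner throws, so the loop reaches the fallback.
const afterThrow = runWithFallback(candidates, (model) => {
  if (model === "primary") throw new FailoverError("HTTP 503");
  return { payloads: ["answer"], isError: false };
});

// Empty-but-successful turn: the runner returns, so the loop stops at the primary.
const afterEmpty = runWithFallback(candidates, (model) =>
  model === "primary"
    ? { payloads: [], isError: true }
    : { payloads: ["answer"], isError: false },
);
```

In this sketch afterThrow is served by "fallback", while afterEmpty stays on "primary" and carries the error payload, which matches the behaviour described in the issue.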

Fix Action

Fixed

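Linked PR: Add TypeScript build pipeline for openclaw 2026.5+ compat (https://github.com/deankroker/openclaw-composio-plugin/pull/1)

This PR lives in a downstream plugin repository and was closed without merging. It does not change the fallback logic. According to its description, the payloads=0 fallback fix itself shipped upstream in openclaw v2026.5.4, and the PR only makes the plugin loadable on that version.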

PR fix notes

PR #1: Add TypeScript build pipeline for openclaw 2026.5+ compat

Repository: deankroker/openclaw-composio-plugin
Author: chrislangston
State: closed | merged: False
Link: https://github.com/deankroker/openclaw-composio-plugin/pull/1

Description (problem / solution / changelog)

Summary

openclaw 2026.5.0+ plugin loader rejects packages whose openclaw.extensions points at .ts source — it requires compiled JS at one of:

./dist/index.js, ./dist/index.mjs, ./dist/index.cjs
or top-level index.js / index.mjs / index.cjs

When the dev fleet was upgraded to openclaw 2026.5.4 to validate the upstream payloads=0 fix (deankroker/claw issues #642 / #638 / #614 — see also openclaw/openclaw#78293), the gateway crashed at boot:

[plugins] installed plugin package requires compiled runtime output for TypeScript entry index.ts:
  expected ./dist/index.js, ./dist/index.mjs, ./dist/index.cjs, index.js, index.mjs, index.cjs
  (plugin=composio, source=/home/node/.openclaw/extensions/composio)

This PR adds the build pipeline so the fork loads cleanly on 2026.5+.
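
As a rough illustration of the loader rule described above, the check can be sketched as a pure function. The candidate paths are taken from the error message; the helper name resolvePluginEntry, the result type, and the exists callback are hypothetical, and the real loader may apply additional rules.

```typescript
// Compiled-output locations named in the loader's error message.
const COMPILED_ENTRY_CANDIDATES = [
  "./dist/index.js",
  "./dist/index.mjs",
  "./dist/index.cjs",
  "index.js",
  "index.mjs",
  "index.cjs",
];

type EntryResolution =
  | { ok: true; entry: string }
  | { ok: false; message: string };

// Hypothetical helper; `exists` abstracts the filesystem so the sketch needs no Node APIs.
function resolvePluginEntry(
  declaredEntry: string,
  exists: (path: string) => boolean,
): EntryResolution {
  // Treat .ts files as source, but not .d.ts declaration files.
  const isTypeScriptSource =
    /\.ts$/.test(declaredEntry) && !/\.d\.ts$/.test(declaredEntry);
  if (!isTypeScriptSource) {
    return { ok: true, entry: declaredEntry };
  }
  // A TypeScript entry is only acceptable if compiled output sits next to it.
  for (const candidate of COMPILED_ENTRY_CANDIDATES) {
    if (exists(candidate)) return { ok: true, entry: candidate };
  }
  return {
    ok: false,
    message:
      "requires compiled runtime output for TypeScript entry " +
      declaredEntry +
      ": expected " +
      COMPILED_ENTRY_CANDIDATES.join(", "),
  };
}

// Before this PR: the entry is index.ts and no compiled files exist.
const beforeBuild = resolvePluginEntry("./index.ts", () => false);

// After this PR: openclaw.extensions points at the committed dist/index.js.
const afterBuild = resolvePluginEntry(
  "./dist/index.js",
  (path) => path === "./dist/index.js",
);
```

Under these assumptions beforeBuild is rejected with a message similar to the one in the boot log, and afterBuild resolves to ./dist/index.js.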

Changes

Add tsconfig.json (target ES2022, module ESNext, outDir ./dist, strict, declarations)
Add typescript + @types/node + openclaw as devDependencies
Add openclaw as peerDependency (>=2026.4.0)
Add build script: tsc
Update package.json:
- main: ./dist/index.js
- openclaw.extensions: ["./dist/index.js"] (was "./index.ts")
- files includes dist
Commit dist/ output

Why commit `dist/`

The Fly bridge install path (infra/fly/start.sh:436) uses tarball install:

wget -q -O composio-fork.tgz "$TARBALL"
tar -xzf composio-fork.tgz -C composio-fork --strip-components=1
cd composio-fork
npm install --omit=dev --no-audit --no-fund --silent

npm install from a tarball does not run prepare lifecycle scripts (those only run for git URL installs and local directory installs). And --omit=dev skips devDependencies, so tsc wouldn't be available even if we tried to build at install time.

Two viable paths:

1. Commit dist/ (this PR): the install path stays unchanged, with no build step at install time
2. Modify start.sh to install full deps + run build: adds time + complexity to every cold boot

Path 1 is operationally simpler.

Validation

Built locally with npm install && npm run build against [email protected]. tsc compiles cleanly with no errors.

Smoke-tested in dev-clawos-fly environment with the locally-built fork baked into the Docker image — gateway boots, composioReady: true, plugin loads.

Compatibility

Source-level changes are additive; no runtime behavior changes
Existing consumers pinned to the prior SHA (11191acb) are unaffected
The version spoof in infra/fly/start.sh:476 (p.version="0.0.11") continues to work — it operates on package.json and is independent of this PR

Related

deankroker/claw issue #642 (P0 fleet-wide textLen=0)
deankroker/claw issue #638 (DeepSeek payloads=0 with COMPOSIO_SEARCH_TOOLS)
openclaw/openclaw#78293 (upstream payloads=0 fallback fix in v2026.5.4)

Changed files

dist/index.d.ts (added, +32/-0)
dist/index.js (added, +71/-0)
dist/src/cli.d.ts (added, +2/-0)
dist/src/cli.js (added, +131/-0)
dist/src/client.d.ts (added, +14/-0)
dist/src/client.js (added, +58/-0)
dist/src/config.d.ts (added, +32/-0)
dist/src/config.js (added, +53/-0)
dist/src/prompt.d.ts (added, +5/-0)
dist/src/prompt.js (added, +89/-0)
dist/src/state.d.ts (added, +11/-0)
dist/src/state.js (added, +15/-0)
dist/src/tools.d.ts (added, +7/-0)
dist/src/tools.js (added, +118/-0)
dist/src/types.d.ts (added, +17/-0)
dist/src/types.js (added, +1/-0)
openclaw.plugin.json (modified, +3/-0)
package-lock.json (added, +8186/-0)
package.json (modified, +15/-1)
tsconfig.json (added, +21/-0)

Summary

We see this 1–2 times per day with google/gemini-3-flash-preview as primary and anthropic/claude-haiku-4-5 as fallback.

Reproduction

Configure an agent with:

model: {
  primary: "google/gemini-3-flash-preview",
  fallbacks: ["anthropic/claude-haiku-4-5"],
}

Trigger a turn where Gemini returns stopReason=stop with empty content blocks (intermittent, ~1/200 turns in our deployment).
Expected: fallback to Haiku, user sees Haiku's answer.
Actual: user sees ⚠️ Agent couldn't generate a response. Please try again.

Root cause

HTTP 4xx/5xx errors go through coerceToFailoverError and become FailoverError instances, so the fallback chain triggers correctly for them. Only the "successful but empty" case falls through.

The internal DEFAULT_EMPTY_RESPONSE_RETRY_LIMIT = 1 (src/agents/pi-embedded-runner/run/incomplete-turn.ts:83) retries the same provider/model — it does not switch models.

Logs (from production deploy)

[agent/embedded] incomplete turn detected: runId=2d12b7f7-... sessionId=8fe3bddf-... stopReason=stop payloads=0 — surfacing error to user

Suggested fix

In src/agents/pi-embedded-runner/run.ts ~line 1901, inside the if (incompleteTurnText) block, throw a FailoverError with reason: "empty_response" when:

attempt.replayMetadata.replaySafe === true (no side effects yet; tool calls that have already run would be unsafe to replay), and
a fallback candidate is still available (caller-provided hint, e.g. params.fallbacksRemaining > 0).

Otherwise keep the existing return path so we don't replay turns that already sent messages or executed mutating tools.

Sketch:

// run.ts ~1901
if (incompleteTurnText) {
  // existing logging + lifecycle marks ...

  if (
    attempt.replayMetadata.replaySafe &&
    (params.fallbacksRemaining ?? 0) > 0
  ) {
    throw new FailoverError(
      "empty response (payloads=0) — failing over to next model",
      {
        reason: "empty_response",
        provider: activeErrorContext.provider,
        model: activeErrorContext.model,
      },
    );
  }
  // existing return path ...
}

FailoverError reason union (src/agents/pi-embedded-helpers/errors.ts) needs "empty_response" added. runWithModelFallback already treats every FailoverError instance as a failover trigger via isFailoverError(normalized) (model-fallback.ts:868), so no additional change to the fallback loop should be needed.

The runner needs to know whether a fallback candidate remains, so runWithModelFallback should pass fallbacksRemaining (or a similar hint) into the run callback.
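
One possible shape for this plumbing is sketched below. It is an assumption-laden illustration: the loop is synchronous, RunParams and RunResult are invented for the example, and "other" is only a placeholder for the existing members of the reason union, which the issue does not list. The point is how the loop could compute fallbacksRemaining per attempt and how the runner could use it.

```typescript
// "other" is a placeholder for the existing union members (not listed in the issue).
type FailoverReason = "empty_response" | "other";

class FailoverError extends Error {
  // Marker property so detection does not depend on `instanceof`.
  readonly isFailover = true;
  constructor(message: string, readonly reason: FailoverReason) {
    super(message);
  }
}

function isFailoverError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { isFailover?: boolean }).isFailover === true
  );
}

// Hypothetical parameter and result shapes for the run callback.
type RunParams = { model: string; fallbacksRemaining: number };
type RunResult = { text: string; isError: boolean };

function runWithFallback(
  candidates: string[],
  run: (params: RunParams) => RunResult,
): RunResult {
  let lastError: unknown = new Error("no candidates configured");
  let index = 0;
  for (const model of candidates) {
    // Number of candidates still untried after this one.
    const fallbacksRemaining = candidates.length - 1 - index;
    index += 1;
    try {
      return run({ model, fallbacksRemaining });
    } catch (err) {
      if (!isFailoverError(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}

// Stub runner: every model except "mock-good" produces an empty turn.
const seen: RunParams[] = [];
function runStub(params: RunParams): RunResult {
  seen.push(params);
  if (params.model === "mock-good") return { text: "hello", isError: false };
  if (params.fallbacksRemaining > 0) {
    throw new FailoverError("empty response (payloads=0)", "empty_response");
  }
  // Last candidate: keep the existing user-facing error path.
  return { text: "Agent couldn't generate a response.", isError: true };
}

const finalResult = runWithFallback(
  ["mock-empty-a", "mock-empty-b", "mock-good"],
  runStub,
);
```

With three candidates the stub sees hints 2, 1, and 0 in that order, and the final result comes from mock-good. Passing a count rather than a boolean is a design choice in this sketch only; a boolean hint would work equally well for the condition in the issue.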

Test plan

Mock provider returns stopReason=stop, payloads=[].
Agent with model.primary=mock-empty, fallbacks=["mock-good"] → second call hits mock-good, final result is from the fallback.
Negative: no fallbacks configured → keep current user-error behavior.
Negative: replayMetadata.replaySafe=false → no failover (side-effects already happened).
Cache stability: empty-payload turns must not invalidate the cached prefix on the failover model unless re-prompting requires it.
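
The first four cases of the test plan reduce to one decision, which can be sketched as a pure function and a small case table. The AttemptSummary type and decideOutcome are hypothetical names for illustration; the cache-stability case is not covered because it depends on provider behaviour that a pure function cannot model.

```typescript
// Hypothetical summary of one attempt; field names loosely follow the issue's sketch.
type AttemptSummary = {
  payloadCount: number;
  replaySafe: boolean; // false once messages were sent or mutating tools ran
  fallbacksRemaining: number; // candidates left after the current model
};

type Outcome = "deliver" | "failover" | "user_error";

function decideOutcome(attempt: AttemptSummary): Outcome {
  if (attempt.payloadCount > 0) return "deliver";
  // Fail over only when replaying is safe and another model is available.
  if (attempt.replaySafe && attempt.fallbacksRemaining > 0) return "failover";
  return "user_error";
}

// Cases mirroring the test plan above.
const testPlanCases: { label: string; attempt: AttemptSummary; expected: Outcome }[] = [
  {
    label: "empty payload, fallback configured",
    attempt: { payloadCount: 0, replaySafe: true, fallbacksRemaining: 1 },
    expected: "failover",
  },
  {
    label: "empty payload, no fallbacks configured",
    attempt: { payloadCount: 0, replaySafe: true, fallbacksRemaining: 0 },
    expected: "user_error",
  },
  {
    label: "empty payload, replaySafe=false",
    attempt: { payloadCount: 0, replaySafe: false, fallbacksRemaining: 1 },
    expected: "user_error",
  },
  {
    label: "normal response",
    attempt: { payloadCount: 1, replaySafe: true, fallbacksRemaining: 1 },
    expected: "deliver",
  },
];

// Labels of any cases whose outcome differs from the expectation.
const failedCases = testPlanCases
  .filter((c) => decideOutcome(c.attempt) !== c.expected)
  .map((c) => c.label);
```

An empty failedCases array means the predicate agrees with the plan. An end-to-end test against mock providers would still be needed to confirm that the second call really reaches mock-good.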

Environment

OpenClaw: 2026.5.3
Primary: google/gemini-3-flash-preview
Fallback: anthropic/claude-haiku-4-5
Frequency: 1–2× per day in our production deployment

Downstream context

I'm tracking this on the deploy side as HendrikHarren/openclaw-deploy#124 and #127, and have a deploy-repo PR (HendrikHarren/openclaw-deploy#TBD) that fixes a separate config-side bug (Helm template was emitting fallback: "string" singular instead of fallbacks: [array], which OpenClaw silently ignored). The deploy fix at minimum restores HTTP-5xx fallback behaviour. The empty-payload case described here remains.
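
The config-side bug is the kind of mistake a strict check could surface early. The following is a hypothetical deploy-side validation, not OpenClaw's actual config parser: it accepts the documented primary/fallbacks shape and emits a warning when the singular fallback key appears instead of dropping it silently.

```typescript
type ModelConfig = { primary: string; fallbacks: string[] };
type ParsedModelConfig = { config: ModelConfig; warnings: string[] };

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Hypothetical validator for the `model` block of an agent config.
function parseModelConfig(raw: { [key: string]: unknown }): ParsedModelConfig {
  const primary = raw["primary"];
  if (typeof primary !== "string") {
    throw new Error("model.primary must be a string");
  }

  const warnings: string[] = [];
  // The singular key is the typo described above; flag it rather than ignore it.
  if ("fallback" in raw) {
    warnings.push('unknown key "fallback": did you mean "fallbacks: [...]"?');
  }

  const fallbacksValue = raw["fallbacks"];
  if (fallbacksValue !== undefined && !isStringArray(fallbacksValue)) {
    throw new Error("model.fallbacks must be an array of strings");
  }
  const fallbacks = isStringArray(fallbacksValue) ? fallbacksValue : [];

  return { config: { primary, fallbacks }, warnings };
}

// Shape emitted by the broken Helm template: singular key, string value.
const brokenConfig = parseModelConfig({
  primary: "google/gemini-3-flash-preview",
  fallback: "anthropic/claude-haiku-4-5",
});

// Shape after the deploy fix.
const fixedConfig = parseModelConfig({
  primary: "google/gemini-3-flash-preview",
  fallbacks: ["anthropic/claude-haiku-4-5"],
});
```

In this sketch brokenConfig ends up with an empty fallbacks list plus one warning, while fixedConfig carries the Haiku fallback and no warnings.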

Happy to send a PR with the fix + tests if maintainers think the suggested approach is right.
