openclaw - ✅(Solved) Fix Main session prompt crash: Cannot read properties of undefined (reading 'length') in compaction token estimation [1 pull requests, 3 comments, 4 participants]

GitHub stats
openclaw/openclaw#63612 · Fetched 2026-04-10 03:42:31
Comments: 3 · Participants: 4 · Timeline events: 6 (commented ×3, cross-referenced ×3) · Reactions: 0

A long-lived main session can become unrecoverable with the error:

Agent failed before reply: Cannot read properties of undefined (reading 'length')

This is not a provider/model error. The crash happens before prompt submission, during pre-prompt compaction token estimation.

Error Message

TypeError: Cannot read properties of undefined (reading 'length')
    at estimateTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/node_modules/@mariozechner/pi-coding-agent/src/core/compaction/compaction.ts:257:26)
    at file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:20243:73
    at Array.reduce (<anonymous>)
    at estimateMessagesTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:20243:42)
    at estimatePrePromptTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:31147:20)
    at shouldPreemptivelyCompactBeforePrompt (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:31151:32)
    at runEmbeddedAttempt (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:32286:35)

Root Cause

The crash point is in pi-coding-agent compaction token estimation logic. estimateTokens() assumes message block fields always exist:

  • assistant.content is iterable
  • block.text.length is always safe
  • block.thinking.length is always safe
  • block.name.length is always safe for tool calls
  • toolResult/custom.content is always an array or string

That assumption breaks on malformed or partially-normalized history blocks.
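
To make the failure mode concrete, here is a minimal sketch of an estimator written with the same assumptions. It is not the pi-coding-agent source: the `Block` and `Message` shapes, the block type names, the function name `unguardedEstimate`, and the chars / 4 ratio are all illustrative assumptions.

```typescript
// Hypothetical, simplified shapes; the real pi-coding-agent types differ.
type Block = { type: string; text?: string; thinking?: string; name?: string };
type Message = { role: string; content?: unknown };

// Unguarded estimator that mirrors the assumptions listed above.
// The chars / 4 ratio is an illustrative heuristic, not a confirmed upstream value.
function unguardedEstimate(message: Message): number {
  let chars = 0;
  for (const block of message.content as Block[]) {
    if (block.type === "text") chars += block.text!.length;
    else if (block.type === "thinking") chars += block.thinking!.length;
    else if (block.type === "toolCall") chars += block.name!.length;
  }
  return Math.ceil(chars / 4);
}

// A block that carries a type tag but no text field.
const malformed: Message = { role: "assistant", content: [{ type: "text" }] };

let caught: unknown;
try {
  unguardedEstimate(malformed);
} catch (err) {
  caught = err; // reading .length on the missing text field throws here
}
console.log(caught instanceof TypeError);
```

A single block of this kind is enough to abort the whole estimation pass, which matches the stack trace in the report.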

Fix Action

Fix / Workaround

Local mitigation that restored the session

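Two layers were patched locally:

  1. Replayed message content is normalized before replay sanitization/validation, in dist/pi-embedded-CNTNdlGw.js.
  2. Null guards were added to compaction token estimation, in node_modules/@mariozechner/pi-coding-agent/dist/core/compaction/compaction.js.

After patching the compaction estimateTokens() path, the main session recovered.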

Why this matters

This is a high-impact session-killer bug because it affects the recovery path itself:

  • once a main session contains one malformed block
  • every subsequent prompt attempt can fail before provider dispatch
  • model fallback cannot help

PR fix notes

PR #63636: fix(compaction): guard malformed token estimation

Description (problem / solution / changelog)

Summary

  • Problem: long-lived main sessions could crash before provider dispatch when compaction token estimation hit malformed replay history and estimateTokens() read missing .length fields.
  • Why it matters: once a session contained one malformed history block, every later prompt attempt could fail in pre-prompt compaction, making the session effectively unrecoverable.
  • What changed: added a guarded estimateMessageTokens() path in src/agents/compaction.ts and switched preemptive compaction plus embedded compaction metrics/sanity checks to reuse it.
  • What did NOT change (scope boundary): this PR does not redesign replay-history normalization or patch @mariozechner/pi-coding-agent; it only hardens OpenClaw’s local compaction estimation path.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #63612

  • This likely also addresses the malformed-history / reading 'length' Telegram manifestation discussed in #64053, and may partially reduce the Telegram crash facet mentioned in #64034, but it does not address the broader Discord overflow/lag symptoms tracked there.

  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: compaction-side token estimation assumed replayed message blocks always had fully normalized shapes, but malformed assistant/toolResult blocks could still reach estimation and trigger unchecked .length reads.
  • Missing detection / guardrail: OpenClaw had replay sanitization and some downstream try/catch sites, but no shared safe estimator for all compaction-related estimateTokens() call sites.
  • Contributing context (if known): long-lived main sessions exercise pre-prompt compaction on every turn, so one malformed history block could repeatedly fail the recovery path itself.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/compaction.test.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
  • Scenario the test should lock in: malformed assistant/toolResult history blocks do not throw during token estimation or pre-prompt compaction checks.
  • Why this is the smallest reliable guardrail: the crash happens in pure estimation logic before provider dispatch, so unit coverage at the estimation and precheck seam is enough to lock in the failure mode.
  • Existing test that already covers this (if any): none
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Long-lived sessions with malformed replay history now fail soft in compaction token estimation instead of crashing before reply generation.

Diagram (if applicable)

Before:
[new user turn] -> [pre-prompt compaction estimation] -> [throws on malformed block] -> [session cannot reply]

After:
[new user turn] -> [guarded token estimation] -> [invalid block counted as 0/safe fallback] -> [reply flow continues]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local Node 22+/pnpm workspace
  • Model/provider: N/A for repro; crash occurs before provider dispatch
  • Integration/channel (if any): embedded main session
  • Relevant config (redacted): default compaction path with long-lived session history

Steps

  1. Build or replay a session history containing malformed assistant/toolResult blocks.
  2. Trigger a new turn that runs pre-prompt compaction estimation.
  3. Observe the behavior before and after the patch.

Expected

  • Token estimation tolerates malformed blocks and the session continues.

Actual

  • Before this fix, estimation could throw Cannot read properties of undefined (reading 'length') and abort the reply before provider dispatch.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: ran pnpm test src/agents/compaction.test.ts, pnpm test src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts, and pnpm check.
  • Edge cases checked: malformed assistant content entries, missing assistant content arrays, malformed toolResult content during pre-prompt estimation.
  • What you did not verify: no live reproduction against a real damaged long-lived session transcript.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: local fallback token estimation may slightly differ from upstream estimateTokens() for malformed messages.
    • Mitigation: fallback is only used on malformed inputs where the previous behavior was to throw; valid messages still use upstream estimation first.
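
The upstream-first, fallback-on-failure behavior described in the mitigation can be sketched as follows. This is not the code from the PR: `estimateMessageTokensSafe`, `fallbackEstimate`, `estimateHistoryTokens`, and the stub `strictUpstream` are hypothetical names, and the chars / 4 ratio is an assumed heuristic.

```typescript
type Estimator = (message: unknown) => number;

// Local fallback: count characters only from fields that are really strings.
function fallbackEstimate(message: unknown): number {
  if (!message || typeof message !== "object") return 0;
  const content = (message as { content?: unknown }).content;
  const blocks: unknown[] = Array.isArray(content) ? content : [content];
  let chars = 0;
  for (const block of blocks) {
    if (typeof block === "string") {
      chars += block.length;
    } else if (block && typeof block === "object") {
      const fields = block as Record<string, unknown>;
      for (const field of [fields.text, fields.thinking, fields.name]) {
        if (typeof field === "string") chars += field.length;
      }
    }
  }
  return Math.ceil(chars / 4); // assumed heuristic
}

// Valid messages use the upstream estimator; the fallback runs only when it
// throws or returns an unusable number.
function estimateMessageTokensSafe(message: unknown, upstream: Estimator): number {
  try {
    const value = upstream(message);
    return Number.isFinite(value) && value >= 0 ? value : fallbackEstimate(message);
  } catch {
    return fallbackEstimate(message);
  }
}

function estimateHistoryTokens(messages: unknown[], upstream: Estimator): number {
  return messages.reduce<number>((sum, m) => sum + estimateMessageTokensSafe(m, upstream), 0);
}

// Stub standing in for a strict upstream estimator (hypothetical).
const strictUpstream: Estimator = (message) => {
  const content = (message as { content: { text: string }[] }).content;
  return Math.ceil(content.reduce((n, b) => n + b.text.length, 0) / 4);
};

const history: unknown[] = [
  { role: "user", content: [{ type: "text", text: "hello world!" }] }, // 12 chars
  { role: "assistant", content: [{ type: "text" }] }, // missing text
  { role: "assistant" }, // missing content
];
const total = estimateHistoryTokens(history, strictUpstream);
console.log(total);
```

Passing the upstream estimator as a parameter keeps the sketch independent of the real library and makes the fallback path easy to exercise in a unit test.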

Changed files

  • src/agents/compaction.test.ts (modified, +116/-0)
  • src/agents/compaction.token-sanitize.test.ts (modified, +113/-1)
  • src/agents/compaction.ts (modified, +232/-2)
  • src/agents/pi-embedded-runner/compact.ts (modified, +9/-13)
  • src/agents/pi-embedded-runner/compact.types.ts (modified, +1/-1)
  • src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts (modified, +31/-0)
  • src/agents/pi-embedded-runner/run/preemptive-compaction.ts (modified, +2/-3)


Main session prompt crash: Cannot read properties of undefined (reading 'length')

Date: 2026-04-09
Host: macOS
OpenClaw version: 2026.4.8

Summary

A long-lived main session can become unrecoverable with the error:

Agent failed before reply: Cannot read properties of undefined (reading 'length')

This is not a provider/model error. The crash happens before prompt submission, during pre-prompt compaction token estimation.

Exact stack

Observed via added local diagnostic logging:

TypeError: Cannot read properties of undefined (reading 'length')
    at estimateTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/node_modules/@mariozechner/pi-coding-agent/src/core/compaction/compaction.ts:257:26)
    at file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:20243:73
    at Array.reduce (<anonymous>)
    at estimateMessagesTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:20243:42)
    at estimatePrePromptTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:31147:20)
    at shouldPreemptivelyCompactBeforePrompt (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:31151:32)
    at runEmbeddedAttempt (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:32286:35)

Impact

  • The main session cannot reply at all.
  • All fallback models fail the same way because the error occurs before the request reaches the provider.
  • Long-lived sessions are affected most often because they pass through preemptive compaction/token estimation on every new turn.

What seems to trigger it

  • A long-lived main session with mixed message history:
    • user/assistant messages
    • toolResult messages
    • compaction/summary-related messages
  • At least one replayed message block appears to have a shape that is valid enough to survive history loading, but invalid for compaction token estimation.
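
One way to locate such a block in a saved history is a small scan that reports every message an unguarded estimator would choke on. The sketch below is a hypothetical diagnostic, not part of OpenClaw or pi-coding-agent; the block type names ("text", "thinking", "toolCall") and the `Finding` shape are assumptions.

```typescript
type Finding = { index: number; reason: string };

// Assumed mapping from block type to the string field the estimator reads.
const REQUIRED_STRING = new Map<string, string>([
  ["text", "text"],
  ["thinking", "thinking"],
  ["toolCall", "name"],
]);

function findMalformedMessages(history: unknown[]): Finding[] {
  const findings: Finding[] = [];
  history.forEach((message, index) => {
    if (!message || typeof message !== "object") {
      findings.push({ index, reason: "message is not an object" });
      return;
    }
    const content = (message as { content?: unknown }).content;
    if (typeof content === "string") return; // string content is measurable as-is
    if (!Array.isArray(content)) {
      findings.push({ index, reason: "content is missing or not an array" });
      return;
    }
    content.forEach((block: unknown, blockIndex: number) => {
      if (!block || typeof block !== "object") {
        findings.push({ index, reason: `block ${blockIndex} is not an object` });
        return;
      }
      const fields = block as Record<string, unknown>;
      const required =
        typeof fields.type === "string" ? REQUIRED_STRING.get(fields.type) : undefined;
      if (required !== undefined && typeof fields[required] !== "string") {
        findings.push({ index, reason: `block ${blockIndex} lacks string field "${required}"` });
      }
    });
  });
  return findings;
}

const sample: unknown[] = [
  { role: "user", content: "hi" },
  { role: "assistant", content: [{ type: "text" }] },
  { role: "toolResult" },
];
console.log(findMalformedMessages(sample));
```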

Root cause

The crash point is in pi-coding-agent compaction token estimation logic. estimateTokens() assumes message block fields always exist:

  • assistant.content is iterable
  • block.text.length is always safe
  • block.thinking.length is always safe
  • block.name.length is always safe for tool calls
  • toolResult/custom.content is always an array or string

That assumption breaks on malformed or partially-normalized history blocks.

Local evidence from runtime logs

Representative runtime log lines:

embedded_prompt_error_stack
embedded run failover decision: stage=prompt decision=surface_error
lane task error: lane=session:agent:main:main error="TypeError: Cannot read properties of undefined (reading 'length')"
Embedded agent failed before reply: Cannot read properties of undefined (reading 'length')

Local mitigation that restored the session

Two layers were patched locally:

  1. Normalize replayed message content before replay sanitization/validation in:
    • dist/pi-embedded-CNTNdlGw.js
  2. Add null guards in compaction token estimation in:
    • node_modules/@mariozechner/pi-coding-agent/dist/core/compaction/compaction.js

Specifically, local guards were added for:

  • missing/non-array assistant.content
  • missing/non-array toolResult/custom.content
  • missing block.text
  • missing block.thinking
  • missing block.name

After patching the compaction estimateTokens() path, the main session recovered.
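
The first layer, normalizing replayed content before it reaches sanitization, might look roughly like the sketch below. The report does not show the actual local patch, so `normalizeContent`, `normalizeMessage`, and the type-to-field mapping are hypothetical.

```typescript
type Block = Record<string, unknown>;

// Assumed mapping from block type to the string field that must exist.
const DEFAULT_STRING_FIELD = new Map<string, string>([
  ["text", "text"],
  ["thinking", "thinking"],
  ["toolCall", "name"],
]);

function normalizeContent(content: unknown): string | Block[] {
  if (typeof content === "string") return content; // already measurable
  if (!Array.isArray(content)) return []; // missing content becomes empty
  const blocks: Block[] = [];
  for (const raw of content as unknown[]) {
    if (!raw || typeof raw !== "object") continue; // drop non-object entries
    const block: Block = { ...(raw as Block) };
    const field =
      typeof block.type === "string" ? DEFAULT_STRING_FIELD.get(block.type) : undefined;
    // Fill the field later code reads .length on, if it is absent or not a string.
    if (field !== undefined && typeof block[field] !== "string") block[field] = "";
    blocks.push(block);
  }
  return blocks;
}

function normalizeMessage(message: { role: string; content?: unknown }): {
  role: string;
  content: string | Block[];
} {
  return { ...message, content: normalizeContent(message.content) };
}

const repaired = normalizeMessage({ role: "assistant", content: [{ type: "text" }, null] });
console.log(repaired.content);
```

Normalizing at load time and guarding at estimation time are complementary: the first repairs the stored shape, the second protects against anything the first misses.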

Suggested upstream fix

In pi-coding-agent compaction token estimation, treat malformed blocks as zero-length instead of throwing.

Minimal expectation:

  • return 0 for invalid message objects
  • treat missing content as empty
  • gate all .length access behind type checks

Why this matters

This is a high-impact session-killer bug because it affects the recovery path itself:

  • once a main session contains one malformed block
  • every subsequent prompt attempt can fail before provider dispatch
  • model fallback cannot help

Notes

This issue was observed on a heavily-used main session, but the underlying bug looks general and could affect any long-running session that accumulates malformed history blocks.

extent analysis

TL;DR

The most likely fix is to add null guards and type checks to estimateTokens() in pi-coding-agent's compaction token estimation, so that malformed or partially-normalized history blocks count as zero instead of throwing.

Guidance

  • Identify and normalize replayed message content before replay sanitization/validation to prevent malformed blocks from causing errors.
  • Add null guards in compaction token estimation for missing or non-array fields such as assistant.content, toolResult/custom.content, block.text, block.thinking, and block.name.
  • Treat malformed blocks as zero-length instead of throwing an error by returning 0 for invalid message objects and treating missing content as empty.
  • Gate all .length access behind type checks to prevent Cannot read properties of undefined errors.

Example

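// Sketch of null guards in an estimateTokens()-style function.
// The message shape and the chars / 4 ratio are assumptions, not the upstream code.
function estimateTokens(message) {
  if (!message || typeof message !== "object") return 0; // invalid message counts as 0
  const content = message.content;
  if (typeof content === "string") return Math.ceil(content.length / 4);
  if (!Array.isArray(content)) return 0; // missing content is treated as empty
  let chars = 0;
  for (const block of content) {
    if (!block || typeof block !== "object") continue; // skip malformed blocks
    for (const field of [block.text, block.thinking, block.name]) {
      if (typeof field === "string") chars += field.length; // .length only after a type check
    }
  }
  return Math.ceil(chars / 4);
}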

Notes

The suggested fix is to modify the pi-coding-agent compaction token estimation logic to handle malformed or partially-normalized history blocks. This fix should be applied upstream to prevent similar errors in the future.

Recommendation

Apply the suggested upstream fix by adding null guards and type checks in the estimateTokens() function to handle malformed blocks and prevent session-killing errors. This fix is necessary to ensure the stability and reliability of long-running sessions.
