openclaw - ✅(Solved) Fix Main session prompt crash: Cannot read properties of undefined (reading 'length') in compaction token estimation [1 pull requests, 3 comments, 4 participants]

DSVinC · 2026-04-09T07:40:47Z


openclaw/openclaw#63612•Fetched 2026-04-10 03:42:31


A long-lived main session can become unrecoverable with the error:

Agent failed before reply: Cannot read properties of undefined (reading 'length')

This is not a provider/model error. The crash happens before prompt submission, during pre-prompt compaction token estimation.

Error Message

TypeError: Cannot read properties of undefined (reading 'length')
    at estimateTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/node_modules/@mariozechner/pi-coding-agent/src/core/compaction/compaction.ts:257:26)
    at file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:20243:73
    at Array.reduce (<anonymous>)
    at estimateMessagesTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:20243:42)
    at estimatePrePromptTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:31147:20)
    at shouldPreemptivelyCompactBeforePrompt (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:31151:32)
    at runEmbeddedAttempt (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:32286:35)

Root Cause

The crash point is in pi-coding-agent compaction token estimation logic. estimateTokens() assumes message block fields always exist:

assistant.content is iterable
block.text.length is always safe
block.thinking.length is always safe
block.name.length is always safe for tool calls
toolResult/custom.content is always an array or string

That assumption breaks on malformed or partially-normalized history blocks.
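
To make the failure mode concrete, here is a minimal sketch of an unguarded estimator hitting a block without `text`. The `HistoryBlock` and `HistoryMessage` shapes, the `unguardedEstimate` name, and the chars/4 heuristic are assumptions for illustration; this is not the actual pi-coding-agent source.

```typescript
// Hypothetical shapes, loosely modeled on the fields named in the issue.
type HistoryBlock = { type: string; text?: string; thinking?: string; name?: string };
type HistoryMessage = { role: string; content: HistoryBlock[] };

// Unguarded estimator: assumes every text block carries a string `text`.
function unguardedEstimate(message: HistoryMessage): number {
  let chars = 0;
  for (const block of message.content) {
    if (block.type === "text") {
      // The non-null assertion mirrors the unchecked assumption.
      // At runtime this throws a TypeError when `text` is missing.
      chars += block.text!.length;
    }
  }
  return Math.ceil(chars / 4); // rough chars-per-token heuristic (assumed)
}

const healthy: HistoryMessage = {
  role: "assistant",
  content: [{ type: "text", text: "hello world" }],
};
// A block that survives loading but has no `text` field.
const malformed: HistoryMessage = { role: "assistant", content: [{ type: "text" }] };

const healthyTokens = unguardedEstimate(healthy);

let caught = "";
try {
  unguardedEstimate(malformed);
} catch (err) {
  caught = err instanceof Error ? err.message : String(err);
}
console.log(healthyTokens, caught);
```

The healthy message estimates normally, while the malformed one throws before any token count is returned, which is the same class of failure as the stack trace above.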

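Fix / Workaround

Local mitigation that restored the session

Two layers were patched locally:

1. Normalize replayed message content before replay sanitization/validation in dist/pi-embedded-CNTNdlGw.js.
2. Add null guards in compaction token estimation in node_modules/@mariozechner/pi-coding-agent/dist/core/compaction/compaction.js.

After patching the compaction estimateTokens() path, the main session recovered.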

Why this matters

This is a high-impact session-killer bug because it affects the recovery path itself:

once a main session contains one malformed block
every subsequent prompt attempt can fail before provider dispatch
model fallback cannot help

PR fix notes

PR #63636: fix(compaction): guard malformed token estimation

Repository: openclaw/openclaw
Author: GaosCode
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/63636

Description (problem / solution / changelog)

Summary

Problem: long-lived main sessions could crash before provider dispatch when compaction token estimation hit malformed replay history and estimateTokens() read missing .length fields.
Why it matters: once a session contained one malformed history block, every later prompt attempt could fail in pre-prompt compaction, making the session effectively unrecoverable.
What changed: added a guarded estimateMessageTokens() path in src/agents/compaction.ts and switched preemptive compaction plus embedded compaction metrics/sanity checks to reuse it.
What did NOT change (scope boundary): this PR does not redesign replay-history normalization or patch @mariozechner/pi-coding-agent; it only hardens OpenClaw’s local compaction estimation path.

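Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor required for the fix
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

Scope (select all touched areas)

- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra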

Linked Issue/PR

Closes #63612
This likely also addresses the malformed-history / reading 'length' Telegram manifestation discussed in #64053, and may partially reduce the Telegram crash facet mentioned in #64034, but it does not address the broader Discord overflow/lag symptoms tracked there.
[x] This PR fixes a bug or regression

Root Cause (if applicable)

Root cause: compaction-side token estimation assumed replayed message blocks always had fully normalized shapes, but malformed assistant/toolResult blocks could still reach estimation and trigger unchecked .length reads.
Missing detection / guardrail: OpenClaw had replay sanitization and some downstream try/catch sites, but no shared safe estimator for all compaction-related estimateTokens() call sites.
Contributing context (if known): long-lived main sessions exercise pre-prompt compaction on every turn, so one malformed history block could repeatedly fail the recovery path itself.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- [x] Unit test
- [ ] Seam / integration test
- [ ] End-to-end test
- [ ] Existing coverage already sufficient
Target test or file: src/agents/compaction.test.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
Scenario the test should lock in: malformed assistant/toolResult history blocks do not throw during token estimation or pre-prompt compaction checks.
Why this is the smallest reliable guardrail: the crash happens in pure estimation logic before provider dispatch, so unit coverage at the estimation and precheck seam is enough to lock in the failure mode.
Existing test that already covers this (if any): none
If no new test is added, why not: N/A
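
The scenario in the test plan could be locked in roughly like this. It is a sketch only: `safeEstimate`, `totalTokens`, and the fixture shapes are hypothetical stand-ins, not the contents of `src/agents/compaction.test.ts`, and the chars/4 heuristic is assumed.

```typescript
// Hypothetical guarded estimator standing in for the function under test.
function safeEstimate(message: unknown): number {
  if (typeof message !== "object" || message === null) return 0;
  const content = (message as { content?: unknown }).content;
  if (typeof content === "string") return Math.ceil(content.length / 4);
  if (!Array.isArray(content)) return 0;
  let chars = 0;
  for (const block of content) {
    if (typeof block !== "object" || block === null) continue;
    const { text, thinking, name } = block as Record<string, unknown>;
    // Only count fields that are verifiably strings.
    for (const field of [text, thinking, name]) {
      if (typeof field === "string") chars += field.length;
    }
  }
  return Math.ceil(chars / 4); // rough chars-per-token heuristic (assumed)
}

// Fixtures mirroring the edge cases listed under Human Verification.
const malformedHistory: unknown[] = [
  { role: "assistant" }, // missing content array
  { role: "assistant", content: [null, { type: "text" }] }, // malformed content entries
  { role: "toolResult", content: { unexpected: true } }, // neither array nor string
  undefined, // invalid message object
];

function totalTokens(messages: unknown[]): number {
  return messages.reduce<number>((sum, message) => sum + safeEstimate(message), 0);
}

// Neither call should throw; malformed entries contribute zero tokens.
const malformedTotal = totalTokens(malformedHistory);
const mixedTotal = totalTokens([...malformedHistory, { role: "user", content: "12345678" }]);
console.log(malformedTotal, mixedTotal);
```

The point of the fixture is that a single well-formed message still gets counted even when it sits next to damaged ones.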

User-visible / Behavior Changes

Long-lived sessions with malformed replay history now fail soft in compaction token estimation instead of crashing before reply generation.

Diagram (if applicable)

Before:
[new user turn] -> [pre-prompt compaction estimation] -> [throws on malformed block] -> [session cannot reply]

After:
[new user turn] -> [guarded token estimation] -> [invalid block counted as 0/safe fallback] -> [reply flow continues]

Security Impact (required)

New permissions/capabilities? (No)
Secrets/tokens handling changed? (No)
New/changed network calls? (No)
Command/tool execution surface changed? (No)
Data access scope changed? (No)
If any Yes, explain risk + mitigation:

Repro + Verification

Environment

OS: macOS
Runtime/container: local Node 22+/pnpm workspace
Model/provider: N/A for repro; crash occurs before provider dispatch
Integration/channel (if any): embedded main session
Relevant config (redacted): default compaction path with long-lived session history

Steps

Build or replay a session history containing malformed assistant/toolResult blocks.
Trigger a new turn that runs pre-prompt compaction estimation.
Observe the behavior before and after the patch.

Expected

Token estimation tolerates malformed blocks and the session continues.

Actual

Before this fix, estimation could throw Cannot read properties of undefined (reading 'length') and abort the reply before provider dispatch.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

Verified scenarios: ran pnpm test src/agents/compaction.test.ts, pnpm test src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts, and pnpm check.
Edge cases checked: malformed assistant content entries, missing assistant content arrays, malformed toolResult content during pre-prompt estimation.
What you did not verify: no live reproduction against a real damaged long-lived session transcript.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? (Yes)
Config/env changes? (No)
Migration needed? (No)
If yes, exact upgrade steps:

Risks and Mitigations

Risk: local fallback token estimation may slightly differ from upstream estimateTokens() for malformed messages.
- Mitigation: fallback is only used on malformed inputs where the previous behavior was to throw; valid messages still use upstream estimation first.
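
The mitigation described here (upstream estimation first, local fallback only when that fails) might look roughly like the following. This is a sketch under assumptions: `guardedEstimate`, `fallbackEstimate`, and `strictUpstream` are hypothetical names, the upstream estimator is passed in as a parameter rather than imported, and none of this is the PR's actual `estimateMessageTokens()` code.

```typescript
type Estimator = (message: unknown) => number;

// Hypothetical local fallback: count only fields that are verifiably strings.
function fallbackEstimate(message: unknown): number {
  const content = (message as { content?: unknown } | null | undefined)?.content;
  if (typeof content === "string") return Math.ceil(content.length / 4);
  if (!Array.isArray(content)) return 0;
  let chars = 0;
  for (const block of content) {
    const text = (block as { text?: unknown } | null | undefined)?.text;
    if (typeof text === "string") chars += text.length;
  }
  return Math.ceil(chars / 4); // rough chars-per-token heuristic (assumed)
}

// Try the upstream estimator first; use the fallback only if it throws
// or returns something that is not a usable token count.
function guardedEstimate(message: unknown, upstream: Estimator): number {
  try {
    const tokens = upstream(message);
    if (Number.isFinite(tokens) && tokens >= 0) return tokens;
  } catch {
    // Malformed input: fall through to the local estimate.
  }
  return fallbackEstimate(message);
}

// Stand-in for a strict upstream estimator that assumes normalized blocks.
const strictUpstream: Estimator = (message) => {
  const content = (message as { content: { text: string }[] }).content;
  return content.reduce((sum, block) => sum + Math.ceil(block.text.length / 4), 0);
};

const validTokens = guardedEstimate(
  { role: "assistant", content: [{ type: "text", text: "abcdefgh" }] },
  strictUpstream,
);
const malformedTokens = guardedEstimate(
  { role: "assistant", content: [{ type: "text" }, { type: "text", text: "abcd" }] },
  strictUpstream,
);
console.log(validTokens, malformedTokens);
```

Because valid messages never reach the fallback, any difference between the two estimators is confined to inputs that previously threw.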

Changed files

src/agents/compaction.test.ts (modified, +116/-0)
src/agents/compaction.token-sanitize.test.ts (modified, +113/-1)
src/agents/compaction.ts (modified, +232/-2)
src/agents/pi-embedded-runner/compact.ts (modified, +9/-13)
src/agents/pi-embedded-runner/compact.types.ts (modified, +1/-1)
src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts (modified, +31/-0)
src/agents/pi-embedded-runner/run/preemptive-compaction.ts (modified, +2/-3)

Main session prompt crash: `Cannot read properties of undefined (reading 'length')`

Date: 2026-04-09
Host: macOS
OpenClaw version: 2026.4.8

Summary

A long-lived main session can become unrecoverable with the error:

Agent failed before reply: Cannot read properties of undefined (reading 'length')

This is not a provider/model error. The crash happens before prompt submission, during pre-prompt compaction token estimation.

Exact stack

Observed via added local diagnostic logging:

TypeError: Cannot read properties of undefined (reading 'length')
    at estimateTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/node_modules/@mariozechner/pi-coding-agent/src/core/compaction/compaction.ts:257:26)
    at file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:20243:73
    at Array.reduce (<anonymous>)
    at estimateMessagesTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:20243:42)
    at estimatePrePromptTokens (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:31147:20)
    at shouldPreemptivelyCompactBeforePrompt (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:31151:32)
    at runEmbeddedAttempt (file:///Users/vvc/.local/lib/node_modules/openclaw/dist/pi-embedded-CNTNdlGw.js:32286:35)

Impact

main session cannot reply at all.
All fallback models fail the same way because the error occurs before the request reaches the provider.
Long-lived sessions are affected most often because they pass through preemptive compaction/token estimation on every new turn.
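
A simplified sketch of why model fallback cannot help: the token precheck runs on the shared session history before any provider call, so it fails identically for every model. `runWithFallback`, `precheckTokens`, and the model names are hypothetical; the real control flow in OpenClaw is more involved.

```typescript
type HistoryBlock = { text?: string };
type HistoryMessage = { role: string; content?: HistoryBlock[] };
type Attempt = { model: string; outcome: "replied" | "failed-before-dispatch" };

// Hypothetical unguarded precheck over the whole session history.
function precheckTokens(messages: HistoryMessage[]): number {
  let chars = 0;
  for (const message of messages) {
    for (const block of message.content!) {
      chars += block.text!.length; // throws on a block without `text`
    }
  }
  return Math.ceil(chars / 4); // rough chars-per-token heuristic (assumed)
}

function runWithFallback(
  models: string[],
  messages: HistoryMessage[],
  dispatch: (model: string) => void,
): Attempt[] {
  const attempts: Attempt[] = [];
  for (const model of models) {
    try {
      precheckTokens(messages); // same history for every model
    } catch {
      attempts.push({ model, outcome: "failed-before-dispatch" });
      continue; // the provider is never contacted for this model
    }
    dispatch(model);
    attempts.push({ model, outcome: "replied" });
    break;
  }
  return attempts;
}

const dispatched: string[] = [];
const damaged: HistoryMessage[] = [
  { role: "user", content: [{ text: "hi" }] },
  { role: "assistant", content: [{}] }, // the one malformed block
];
const attempts = runWithFallback(["primary", "fallback-a", "fallback-b"], damaged, (model) => {
  dispatched.push(model);
});
console.log(attempts, dispatched);
```

In this sketch every attempt ends as `failed-before-dispatch` and the dispatch callback is never invoked.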

What seems to trigger it

A long-lived main session with mixed message history:
- user/assistant messages
- toolResult messages
- compaction/summary-related messages
At least one replayed message block appears to have a shape that is valid enough to survive history loading, but invalid for compaction token estimation.

Root cause

The crash point is in pi-coding-agent compaction token estimation logic. estimateTokens() assumes message block fields always exist:

assistant.content is iterable
block.text.length is always safe
block.thinking.length is always safe
block.name.length is always safe for tool calls
toolResult/custom.content is always an array or string

That assumption breaks on malformed or partially-normalized history blocks.

Local evidence from runtime logs

Representative runtime log lines:

embedded_prompt_error_stack
embedded run failover decision: stage=prompt decision=surface_error
lane task error: lane=session:agent:main:main error="TypeError: Cannot read properties of undefined (reading 'length')"
Embedded agent failed before reply: Cannot read properties of undefined (reading 'length')

Local mitigation that restored the session

Two layers were patched locally:

1. Normalize replayed message content before replay sanitization/validation in:
   - dist/pi-embedded-CNTNdlGw.js
2. Add null guards in compaction token estimation in:
   - node_modules/@mariozechner/pi-coding-agent/dist/core/compaction/compaction.js

Specifically, local guards were added for:

missing/non-array assistant.content
missing/non-array toolResult/custom.content
missing block.text
missing block.thinking
missing block.name

After patching the compaction estimateTokens() path, the main session recovered.
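
The first layer (normalizing replayed content) could be sketched as below. The local patch itself is not shown in the issue, so `normalizeMessage`, `asString`, and the normalized shapes are assumptions chosen to cover the guard list above.

```typescript
// Hypothetical normalized shapes: every field the estimator reads is a string.
type NormalizedBlock = { type: string; text: string; thinking: string; name: string };
type NormalizedMessage = { role: string; content: NormalizedBlock[] };

function asString(value: unknown): string {
  return typeof value === "string" ? value : "";
}

// Coerce a replayed message into a shape that is safe for token estimation.
function normalizeMessage(raw: unknown): NormalizedMessage {
  const record = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;
  const rawContent = record.content;
  // A bare string becomes one text block; anything else non-array becomes empty.
  const blocks: unknown[] =
    typeof rawContent === "string"
      ? [{ type: "text", text: rawContent }]
      : Array.isArray(rawContent)
        ? rawContent
        : [];
  const content = blocks
    .filter((block): block is Record<string, unknown> => typeof block === "object" && block !== null)
    .map((block) => ({
      type: asString(block.type) || "unknown",
      text: asString(block.text),
      thinking: asString(block.thinking),
      name: asString(block.name),
    }));
  return { role: asString(record.role) || "unknown", content };
}

const missingContent = normalizeMessage({ role: "assistant" });
const stringContent = normalizeMessage({ role: "toolResult", content: "ok" });
const mixedContent = normalizeMessage({ role: "assistant", content: [null, { type: "text" }] });
console.log(missingContent, stringContent, mixedContent);
```

Normalizing early means later code can read `.length` on these fields without its own checks, although guarding the estimator as well still protects against blocks that bypass this step.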

Suggested upstream fix

In pi-coding-agent compaction token estimation, treat malformed blocks as zero-length instead of throwing.

Minimal expectation:

return 0 for invalid message objects
treat missing content as empty
gate all .length access behind type checks

Why this matters

This is a high-impact session-killer bug because it affects the recovery path itself:

once a main session contains one malformed block
every subsequent prompt attempt can fail before provider dispatch
model fallback cannot help

Notes

This issue was observed on a heavily-used main session, but the underlying bug looks general and could affect any long-running session that accumulates malformed history blocks.

extent analysis

TL;DR

The most likely fix is to add null guards and type checks in the estimateTokens() function of pi-coding-agent compaction token estimation logic to handle malformed or partially-normalized history blocks.

Guidance

Identify and normalize replayed message content before replay sanitization/validation to prevent malformed blocks from causing errors.
Add null guards in compaction token estimation for missing or non-array fields such as assistant.content, toolResult/custom.content, block.text, block.thinking, and block.name.
Treat malformed blocks as zero-length instead of throwing an error by returning 0 for invalid message objects and treating missing content as empty.
Gate all .length access behind type checks to prevent Cannot read properties of undefined errors.

Example

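// Example of added null guards in estimateTokens() (illustrative; message shape is assumed)
function estimateTokens(message) {
  const content = message?.content;
  if (typeof content === "string") return Math.ceil(content.length / 4);
  if (!Array.isArray(content)) {
    return 0; // treat missing or invalid content as zero-length
  }
  let chars = 0;
  for (const block of content) {
    // Gate every .length read behind a type check
    for (const field of [block?.text, block?.thinking, block?.name]) {
      if (typeof field === "string") chars += field.length;
    }
  }
  return Math.ceil(chars / 4); // rough chars-per-token heuristic (assumed)
}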

Notes

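The suggested fix modifies the pi-coding-agent compaction token estimation logic so that it tolerates malformed or partially-normalized history blocks. PR #63636 takes a narrower route: it does not patch @mariozechner/pi-coding-agent and instead adds a guarded estimateMessageTokens() path in OpenClaw's own src/agents/compaction.ts.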

Recommendation

Apply the suggested upstream fix by adding null guards and type checks in the estimateTokens() function. Without them, a single malformed block makes every later prompt attempt in a long-running session fail before provider dispatch.
