openclaw - 💡(How to fix) Fix [Feature]: Add compaction.model config to use faster model for summarization [1 participants]

stwith · 2026-03-12T08:05:17Z



openclaw/openclaw#43834•Fetched 2026-04-08 00:18:47


### Summary

Add an `agents.defaults.compaction.model` config option to specify a different (faster) model for compaction summarization.

### Problem to solve

Compaction uses the session's primary model for summarization. When the session uses a slow model like Opus, compaction becomes extremely slow:

- Each chunk summarization requires a full LLM API call
- Large sessions (150k+ tokens) split into 2-3 chunks of ~80k tokens each
- With qualityGuard retries, this can mean 4-6 Opus API calls
- Each Opus call for 80k tokens takes 1-3 minutes
- Total compaction time: 8-15+ minutes, hitting timeout limits

Real-world example:

- Session: 4.5MB JSONL, ~160k tokens, claude-opus-4-5
- Compaction timed out at 900s (15 min)
- Even after increasing timeout to 1800s, large sessions remain at risk
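The arithmetic behind these figures can be sketched as below. This is a back-of-the-envelope estimate only: the function and field names are illustrative, and `callsPerChunk = 2` is an assumption inferred from the statement that disabling qualityGuard halves the number of calls.

```typescript
// Rough compaction-time estimate from the figures quoted in the issue.
// All names are illustrative; none come from the OpenClaw codebase.
interface CompactionEstimate {
  calls: number;
  minSeconds: number;
  maxSeconds: number;
}

function estimateCompaction(
  chunks: number,
  callsPerChunk: number, // assumed: 1 without retries, 2 with one qualityGuard retry per chunk
  secondsPerCall: [number, number], // [best case, worst case] latency of one summarization call
): CompactionEstimate {
  const calls = chunks * callsPerChunk;
  return {
    calls,
    minSeconds: calls * secondsPerCall[0],
    maxSeconds: calls * secondsPerCall[1],
  };
}

// 3 chunks, one retry each, 1-3 minutes per Opus call.
const opus = estimateCompaction(3, 2, [60, 180]);
console.log(opus); // { calls: 6, minSeconds: 360, maxSeconds: 1080 }
console.log(opus.maxSeconds > 900); // true: the worst case exceeds the 900 s timeout
```

Under these assumptions the worst case (1080 s) overshoots the 900 s limit, which is consistent with the timeout reported in the issue.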

### Proposed solution

Add a `compaction.model` config option:

```jsonc
{
  "agents": {
    "defaults": {
      "compaction": {
        "model": "anthropic/claude-sonnet-4-5",  // or "google/gemini-2.0-flash"
        // ... other compaction settings
      }
    }
  }
}
```

Behavior:

- If `compaction.model` is set, use that model for `generateSummary()` calls
- If not set, fall back to session's model (current behavior)
- The model should have sufficient context window for the chunk size
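The three behavior rules could be combined in a single selection helper, sketched below. Everything here is hypothetical: the helper name, the context-window table, and its numbers are placeholders, and falling back to the session model when the configured model's window is too small is only one possible policy (failing loudly would be another).

```typescript
// Hypothetical helper: choose the summarization model for one chunk.
// The window sizes below are placeholders, not real limits of any model.
type ContextWindows = Record<string, number | undefined>;

function pickCompactionModel(
  configured: string | undefined,
  sessionModel: string,
  chunkTokens: number,
  windows: ContextWindows,
): string {
  // Option not set: keep the current behavior.
  if (!configured) return sessionModel;
  const limit = windows[configured];
  // Unknown or insufficient context window: fall back to the session model.
  if (limit === undefined || limit < chunkTokens) return sessionModel;
  return configured;
}

const windows: ContextWindows = {
  "example/fast-model": 100_000,
  "example/tiny-model": 32_000,
};

console.log(pickCompactionModel("example/fast-model", "claude-opus-4-5", 80_000, windows));
console.log(pickCompactionModel("example/tiny-model", "claude-opus-4-5", 80_000, windows));
console.log(pickCompactionModel(undefined, "claude-opus-4-5", 80_000, windows));
```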

### Alternatives considered

| Approach | Why weaker |
|----------|-----------|
| Increase timeout | Delays but doesn't fundamentally solve; 30+ min compaction is poor UX |
| Disable qualityGuard | Reduces calls by half, but Opus is still slow |
| Increase reserveTokensFloor | Triggers compaction earlier with smaller batches, but multiple compactions still accumulate |
| User manually runs `/compact` early | Relies on user vigilance, doesn't scale |

### Impact

- **Affected**: All users with Opus/slow models and long-running sessions
- **Severity**: High. Compaction timeout causes sessions to get permanently stuck in `processing` state
- **Frequency**: Every session that exceeds ~150k tokens
- **Consequence**: Stuck sessions, lost messages, requires manual session reset

### Evidence/examples

From OpenClaw 2026.3.7 logs:

```
embedded run timeout: sessionId=985aa0c1-... timeoutMs=900000
using current snapshot: timed out during compaction
```

Session details:

- 4.5MB JSONL, 940 lines, 3 prior compactions
- Model: claude-opus-4-5
- Timeout: 900s → increased to 1800s as workaround
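For anyone checking whether their own sessions hit the same failure, a small parser for the two log lines above might look like this. It is a sketch only: the regular expression is derived from the single sample quoted here, so the real log format may differ in ways this does not handle, and `parseTimeout` is a made-up name.

```typescript
// Extracts the timeout details from log text shaped like the sample in the issue.
interface TimeoutEvent {
  sessionId: string;
  timeoutMs: number;
  duringCompaction: boolean;
}

function parseTimeout(log: string): TimeoutEvent | null {
  const m = /embedded run timeout: sessionId=(\S+) timeoutMs=(\d+)/.exec(log);
  if (!m) return null;
  return {
    sessionId: m[1],
    timeoutMs: Number(m[2]),
    // The second sample line signals that the timeout happened mid-compaction.
    duringCompaction: log.includes("timed out during compaction"),
  };
}

const sample = [
  "embedded run timeout: sessionId=985aa0c1-... timeoutMs=900000",
  "using current snapshot: timed out during compaction",
].join("\n");

const parsed = parseTimeout(sample);
console.log(parsed);
```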

### Additional information

- Sonnet is ~3-5x faster than Opus for summarization tasks
- Flash/Gemini would be even faster
- Summarization doesn't require Opus-level reasoning depth
- This would also reduce API costs for compaction

## Extent analysis

### Problem Summary

Compaction currently always uses the session's primary model. When that model is slow and expensive (e.g., claude-opus-4-5), each ~80k-token chunk takes 1-3 minutes to summarize, and a large session that needs 4-6 calls often hits the 900 s (15-minute) timeout.

### Root Cause

`generateSummary()` is called without an explicit model argument, so it inherits the session's model. There is no way to override the model for the compaction step.

### Fix Plan

#### 1. Extend the configuration schema

Add a new optional field `agents.defaults.compaction.model` (string).

```ts
// src/config/schema.ts (or wherever the JSON schema lives)
export const agentsDefaults = {
  type: "object",
  properties: {
    // … existing defaults …
    compaction: {
      type: "object",
      properties: {
        model: { type: "string", description: "LLM model used for compaction summarisation" },
        // keep any other compaction settings already present
      },
      additionalProperties: false,
    },
  },
  additionalProperties: false,
};
```
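If the option is also read outside schema validation, a small runtime guard can mirror the schema fragment above. This is a sketch under assumptions: `readCompactionModel` is a hypothetical name, and rejecting empty strings is an extra rule not stated in the issue or the schema.

```typescript
// Hypothetical runtime check for the raw `compaction` object from a config file.
// `model` is optional, but when present it must be a non-empty string
// (the non-empty requirement is an assumption, not part of the proposal).
function readCompactionModel(raw: unknown): string | undefined {
  if (typeof raw !== "object" || raw === null) return undefined;
  const model = (raw as { model?: unknown }).model;
  if (model === undefined) return undefined;
  if (typeof model !== "string" || model.trim() === "") {
    throw new TypeError("agents.defaults.compaction.model must be a non-empty string");
  }
  return model;
}

console.log(readCompactionModel({ model: "anthropic/claude-sonnet-4-5" }));
console.log(readCompactionModel({})); // option not set
```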

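#### 2. Load the new option at runtime

```ts
// src/config/index.ts
// `Config` is assumed to be the project's existing configuration type.
export function getCompactionModel(config: Config, sessionModel: string): string {
  // Use the configured compaction model when set; otherwise keep the
  // current behaviour and fall back to the session's model.
  return config.agents?.defaults?.compaction?.model ?? sessionModel;
}
```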

#### 3. Pass the model to the summarisation routine

```ts
// src/agents/compaction.ts
import { getCompactionModel } from "../config";

export async function compactSession(session: Session, cfg: Config) {
  const modelForCompaction = getCompactionModel(cfg, session.model);

  // Split the session into chunks (unchanged).
  // `splitIntoChunks` is a placeholder name for the existing chunking step.
  const chunks = splitIntoChunks(session);

  for (const chunk of chunks) {
    const summary = await generateSummary({
      model: modelForCompaction,   // <-- explicit model
      messages: chunk,
      // keep existing qualityGuard / retry options
    });
    // … store summary, continue …
  }
}
```
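Steps 2 and 3 can be exercised together without the real codebase by injecting the summarizer. The sketch below is self-contained and uses stand-in types throughout: `Config`, `Summarize`, and `compactChunks` are illustrative, not OpenClaw's actual definitions.

```typescript
// Stand-in types; the real project types will differ.
interface Config {
  agents?: { defaults?: { compaction?: { model?: string } } };
}
type Summarize = (opts: { model: string; messages: string[] }) => Promise<string>;

function getCompactionModel(config: Config, sessionModel: string): string {
  return config.agents?.defaults?.compaction?.model ?? sessionModel;
}

async function compactChunks(
  chunks: string[][],
  sessionModel: string,
  config: Config,
  summarize: Summarize,
): Promise<string[]> {
  const model = getCompactionModel(config, sessionModel);
  const summaries: string[] = [];
  for (const messages of chunks) {
    // One call per chunk, in order, as in the loop from step 3.
    summaries.push(await summarize({ model, messages }));
  }
  return summaries;
}

// Stub summarizer that reports which model each call was routed to.
const stub: Summarize = async ({ model, messages }) =>
  `${messages.length} messages summarized by ${model}`;

const cfg: Config = {
  agents: { defaults: { compaction: { model: "anthropic/claude-sonnet-4-5" } } },
};

const done = compactChunks([["a", "b"], ["c"]], "claude-opus-4-5", cfg, stub);
done.then((summaries) => console.log(summaries));
```

Passing the summarizer as a parameter keeps the routing logic testable in isolation; with an empty config the same call would route every chunk to the session model.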

If `generateSummary` already accepts a `model` field in its options object, just forward it; otherwise add it:

```ts
// src/llm/generator.ts
export async function generateSummary(opts: {
  model: string;
  messages: Message[];
  // other flags
}) {
  // … existing implementation, now using opts.model …
}
```
