openclaw - [Feature]: Add compaction.model config to use faster model for summarization

GitHub stats
openclaw/openclaw#43834 · Fetched 2026-04-08 00:18:47
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

Add agents.defaults.compaction.model config option to specify a different (faster) model for compaction summarization.


Fix / Workaround

Session details:

  • 4.5MB JSONL, 940 lines, 3 prior compactions
  • Model: claude-opus-4-5
  • Timeout: 900s → increased to 1800s as workaround

Code Example

{
  "agents": {
    "defaults": {
      "compaction": {
        "model": "anthropic/claude-sonnet-4-5",  // or "google/gemini-2.0-flash"
        // ... other compaction settings
      }
    }
  }
}

---

embedded run timeout: sessionId=985aa0c1-... timeoutMs=900000
using current snapshot: timed out during compaction

Summary

Add agents.defaults.compaction.model config option to specify a different (faster) model for compaction summarization.

Problem to solve

Compaction uses the session's primary model for summarization. When the session uses a slow model like Opus, compaction becomes extremely slow:

  • Each chunk summarization requires a full LLM API call
  • Large sessions (150k+ tokens) split into 2-3 chunks of ~80k tokens each
  • With qualityGuard retries, this can mean 4-6 Opus API calls
  • Each Opus call for 80k tokens takes 1-3 minutes
  • Total compaction time: 8-15+ minutes, hitting timeout limits
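The arithmetic behind these figures can be sketched as follows. This is only a rough model under stated assumptions: the function and type names are illustrative and do not come from the OpenClaw codebase, and "one retry per chunk" is an assumed reading of how qualityGuard doubles the call count.

```typescript
// Rough estimate of compaction wall-clock time from the figures quoted above.
// All names here are illustrative; none come from the OpenClaw codebase.
interface CompactionEstimate {
  chunks: number;
  calls: number;
  minSeconds: number;
  maxSeconds: number;
}

function estimateCompaction(
  sessionTokens: number,
  chunkTokens: number,
  attemptsPerChunk: number, // 1 = no retries, 2 = one retry per chunk (assumed)
  secondsPerCall: [number, number], // [fastest, slowest] time for a single call
): CompactionEstimate {
  const chunks = Math.ceil(sessionTokens / chunkTokens);
  const calls = chunks * attemptsPerChunk;
  return {
    chunks,
    calls,
    minSeconds: calls * secondsPerCall[0],
    maxSeconds: calls * secondsPerCall[1],
  };
}

// ~160k-token session, ~80k-token chunks, one retry per chunk, 1-3 minutes per call.
const estimate = estimateCompaction(160000, 80000, 2, [60, 180]);
console.log(estimate); // { chunks: 2, calls: 4, minSeconds: 240, maxSeconds: 720 }
```

With three chunks instead of two, the same inputs give 6 calls and an upper bound of 6 × 180 = 1080 s, which is past the 900 s timeout before any network or queueing overhead is counted.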

Real-world example:

  • Session: 4.5MB JSONL, ~160k tokens, claude-opus-4-5
  • Compaction timed out at 900s (15 min)
  • Even after increasing timeout to 1800s, large sessions remain at risk

Proposed solution

Add a compaction.model config option:

{
  "agents": {
    "defaults": {
      "compaction": {
        "model": "anthropic/claude-sonnet-4-5",  // or "google/gemini-2.0-flash"
        // ... other compaction settings
      }
    }
  }
}

Behavior:

  • If compaction.model is set, use that model for generateSummary() calls
  • If not set, fall back to the session's model (current behavior)
  • The configured model should have a context window large enough for the chunk size
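The three rules above could be combined into one selection step, sketched here. The issue does not say what should happen when the override's context window is too small. Falling back to the session model is an assumption made for illustration, as are the function name, the `reason` field, and the caller-supplied table of context windows.

```typescript
// Hypothetical sketch: choose the model for compaction summaries.
// Falls back to the session model when no override is set, when the override's
// context window is unknown, or when it is smaller than the chunk size.
interface ModelChoice {
  model: string;
  reason: "override" | "no-override" | "override-unknown" | "override-too-small";
}

function chooseCompactionModel(
  override: string | undefined,
  sessionModel: string,
  chunkTokens: number,
  contextWindows: Map<string, number>, // model id -> max input tokens (caller-supplied)
): ModelChoice {
  if (!override) return { model: sessionModel, reason: "no-override" };
  const limit = contextWindows.get(override);
  if (limit === undefined) return { model: sessionModel, reason: "override-unknown" };
  if (limit < chunkTokens) return { model: sessionModel, reason: "override-too-small" };
  return { model: override, reason: "override" };
}
```

Returning a reason alongside the model makes it easy to log why an override was ignored. A stricter variant could instead reject the configuration at load time.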

Alternatives considered

| Approach | Why weaker |
| --- | --- |
| Increase timeout | Delays but doesn't fundamentally solve; 30+ min compaction is poor UX |
| Disable qualityGuard | Reduces calls by half, but Opus is still slow |
| Increase reserveTokensFloor | Triggers compaction earlier with smaller batches, but multiple compactions still accumulate |
| User manually runs /compact early | Relies on user vigilance, doesn't scale |

Impact

  • Affected: All users with Opus/slow models and long-running sessions
  • Severity: High — compaction timeout causes sessions to get permanently stuck in processing state
  • Frequency: Every session that exceeds ~150k tokens
  • Consequence: Stuck sessions, lost messages, requires manual session reset

Evidence/examples

From OpenClaw 2026.3.7 logs:

embedded run timeout: sessionId=985aa0c1-... timeoutMs=900000
using current snapshot: timed out during compaction

Session details:

  • 4.5MB JSONL, 940 lines, 3 prior compactions
  • Model: claude-opus-4-5
  • Timeout: 900s → increased to 1800s as workaround

Additional information

  • Sonnet is ~3-5x faster than Opus for summarization tasks
  • Flash/Gemini would be even faster
  • Summarization doesn't require Opus-level reasoning depth
  • This would also reduce API costs for compaction
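As a back-of-the-envelope check, the ~3-5x speedup can be applied to the 8-15 minute compaction time quoted earlier. The helper below is purely illustrative, and the result is a projection from the issue's own rough figures, not a measurement.

```typescript
// Scale a [min, max] duration range by a [min, max] speedup range.
function projectRange(
  baseline: [number, number],
  speedup: [number, number],
): [number, number] {
  // Best case: shortest baseline with the largest speedup; worst case: the reverse.
  return [baseline[0] / speedup[1], baseline[1] / speedup[0]];
}

// 8-15 minutes on Opus, assuming a 3-5x faster summarization model.
const projected = projectRange([8, 15], [3, 5]);
console.log(projected); // [1.6, 5] (minutes)
```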

extent analysis

Problem Summary

Compaction currently always uses the session's primary model. When that model is slow and expensive (e.g., claude-opus-4-5), summarising a large session takes 1-3 minutes per chunk, and the whole compaction can hit the 15-minute (900s) timeout.

Root Cause

generateSummary() is called without an explicit model argument, so it inherits the session's default model. There is currently no way to override the model for the compaction step.

Fix Plan

1. Extend the configuration schema

Add a new optional field agents.defaults.compaction.model (string).

// src/config/schema.ts (or wherever the JSON schema lives)
export const agentsDefaults = {
  type: "object",
  properties: {
    // … existing defaults …
    compaction: {
      type: "object",
      properties: {
        model: { type: "string", description: "LLM model used for compaction summarisation" },
        // keep any other compaction settings already present
      },
      additionalProperties: false,
    },
  },
  additionalProperties: false,
};
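Beyond the `type: "string"` check in the schema, the value could also be validated at load time. The sketch below assumes the `provider/model` shape seen in the examples in this issue (e.g. "anthropic/claude-sonnet-4-5"). Whether OpenClaw actually requires that shape is not confirmed here, and `parseModelRef` is a hypothetical helper name.

```typescript
// Hypothetical validator for a "provider/model" string such as
// "anthropic/claude-sonnet-4-5". Returns null for anything else.
interface ModelRef {
  provider: string;
  model: string;
}

function parseModelRef(value: unknown): ModelRef | null {
  if (typeof value !== "string") return null;
  const slash = value.indexOf("/");
  // Require a non-empty provider before the first slash and a non-empty model after it.
  if (slash <= 0 || slash === value.length - 1) return null;
  return { provider: value.slice(0, slash), model: value.slice(slash + 1) };
}

console.log(parseModelRef("anthropic/claude-sonnet-4-5"));
```

Rejecting a malformed value when the config is loaded would surface a typo immediately, rather than at the first compaction of a long session.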

2. Load the new option at runtime

// src/config/index.ts
export function getCompactionModel(config: Config, sessionModel: string): string {
  // `config.agents.defaults.compaction?.model` may be undefined
  return config.agents?.defaults?.compaction?.model ?? sessionModel;
}

3. Pass the model to the summarisation routine

// src/agents/compaction.ts
import { getCompactionModel } from "../config";

export async function compactSession(session: Session, cfg: Config) {
  const modelForCompaction = getCompactionModel(cfg, session.model);

  // `chunks` is produced by the existing session-splitting step (unchanged, not shown here)
  for (const chunk of chunks) {
    const summary = await generateSummary({
      model: modelForCompaction,   // <-- explicit model
      messages: chunk,
      // keep existing qualityGuard / retry options
    });
    // … store summary, continue …
  }
}
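To check that the override actually reaches the summarizer, the loop above can be exercised with a stub in place of the real LLM call. Everything in this sketch is an assumption made for illustration: the `Message` and `Summarize` types, the `compactChunks` name, and passing the summarizer as a parameter so it can be swapped in tests.

```typescript
// Self-contained sketch of the per-chunk loop with an injectable summarizer.
type Message = { role: string; content: string };
type Summarize = (opts: { model: string; messages: Message[] }) => Promise<string>;

interface CompactionConfig {
  model?: string;
}

async function compactChunks(
  chunks: Message[][],
  sessionModel: string,
  compaction: CompactionConfig | undefined,
  summarize: Summarize,
): Promise<string[]> {
  // Use the override when present, otherwise keep the current behavior.
  const model = compaction?.model ?? sessionModel;
  const summaries: string[] = [];
  for (const messages of chunks) {
    // Sequential, mirroring the per-chunk loop in the plan above.
    summaries.push(await summarize({ model, messages }));
  }
  return summaries;
}

// Stub summarizer that records which model it was asked to use.
const seenModels: string[] = [];
const stubSummarize: Summarize = async ({ model, messages }) => {
  seenModels.push(model);
  return `${messages.length} message(s) summarized by ${model}`;
};
```

Calling `compactChunks(chunks, "claude-opus-4-5", { model: "anthropic/claude-sonnet-4-5" }, stubSummarize)` should record only the Sonnet id in `seenModels`, while passing `undefined` for the config should record the session model.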

If generateSummary already accepts a model field in its options object, just forward it; otherwise add it:

// src/llm/generator.ts
export async function generateSummary(opts: {
  model: string;
  messages: Message[];
  // other flags
