gemini-cli: Large tool output (e.g. run_shell_command) sent to the model uncapped exceeds the 1M input limit and permanently wedges the session [1 pull request]


Fix Action

Fixed

Code Example

HTTP 500: Failed to generate content: The input token count exceeds the maximum number of tokens allowed 1048576.

---

~/.gemini/tmp/<user>/tool-outputs/run_shell_command_1.txt   # 24,039,557 bytes, 1 line

What happened

In a long-running headless session (driven over ACP), the agent ran a single run_shell_command whose output was a ~24 MB single-line JSON blob (a package-lock.json / npm-metadata dump). The very next model turn failed with:

HTTP 500: Failed to generate content: The input token count exceeds the maximum number of tokens allowed 1048576.

Assuming roughly 4 bytes per token, ~24 MB of text is about 6M tokens, nearly 6x the model's 2^20 (1,048,576) token input limit, all contributed by a single tool result.
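
The arithmetic behind that estimate can be checked with a short sketch. The 4-bytes-per-token figure is an assumed rule of thumb, not a measurement from the Gemini tokenizer, and `estimate_tokens` is a hypothetical helper introduced here for illustration.

```python
# Rough token estimate for the oversized tool output described in the report.
# ASSUMPTION: ~4 bytes of text per token (rule of thumb, not a tokenizer measurement).
OUTPUT_BYTES = 24_039_557      # size of run_shell_command_1.txt, from the report
INPUT_LIMIT_TOKENS = 2 ** 20   # 1,048,576, from the error message
BYTES_PER_TOKEN = 4            # assumed average


def estimate_tokens(num_bytes: int, bytes_per_token: int = BYTES_PER_TOKEN) -> int:
    """Hypothetical helper: integer estimate of tokens for a byte count."""
    return num_bytes // bytes_per_token


estimated = estimate_tokens(OUTPUT_BYTES)
ratio = estimated / INPUT_LIMIT_TOKENS
print(f"{estimated:,} estimated tokens, {ratio:.1f}x the input limit")
```

Under that assumption the single tool result alone is several times the whole input window, before any other conversation history is counted.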

The worse problem is that the session never recovers: the oversized functionResponse stays in the conversation history, so every subsequent turn fails with the identical 500, and an automated retry just re-runs the same command and re-poisons the context. The session is permanently wedged.
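
Because the failure comes from the oversized result staying in history, one conceivable recovery is to replace the largest tool results with a short stub until the estimated request fits. The following is a minimal sketch of that idea, not gemini-cli's implementation: `Turn`, `prune_history`, the stub text, and the 4-bytes-per-token estimate are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Turn:
    role: str   # "user", "model", or "tool" (hypothetical roles for this sketch)
    text: str


def estimate_tokens(text: str, bytes_per_token: int = 4) -> int:
    # ASSUMPTION: ~4 bytes per token; a real client would use the model's tokenizer.
    return len(text.encode("utf-8")) // bytes_per_token + 1


def prune_history(history: List[Turn], budget_tokens: int,
                  stub: str = "[tool output dropped: too large for context]") -> List[Turn]:
    """Replace the largest tool results with a stub until the estimate fits the budget."""
    turns = list(history)
    total = sum(estimate_tokens(t.text) for t in turns)
    stub_cost = estimate_tokens(stub)
    # Visit tool turns from largest to smallest estimated size.
    tool_indices = sorted(
        (i for i, t in enumerate(turns) if t.role == "tool"),
        key=lambda i: estimate_tokens(turns[i].text),
        reverse=True,
    )
    for i in tool_indices:
        if total <= budget_tokens:
            break
        old_cost = estimate_tokens(turns[i].text)
        if old_cost <= stub_cost:
            continue  # replacing would not save anything
        turns[i] = replace(turns[i], text=stub)
        total += stub_cost - old_cost
    return turns
```

Pruning only tool turns keeps user and model messages intact, so the conversation can continue even though the dropped output is no longer visible to the model.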

Evidence

gemini-cli does persist the full tool output to a file:

~/.gemini/tmp/<user>/tool-outputs/run_shell_command_1.txt   # 24,039,557 bytes, 1 line

but it also appears to include that full output in the model request (hence the 1M overflow). I could not find any setting that caps the portion of tool output sent to the model — grepping the installed bundle (@google/gemini-cli/bundle/gemini.js) for truncateToolOutput*, enableToolOutputTruncation, maxOutput*, "Output too large", "truncated" returned nothing in this version.
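
The search described above can be repeated with a small script. This is a sketch only: the marker strings are the ones listed in the report, `find_markers` is a hypothetical helper, and the location of the installed bundle depends on how gemini-cli was installed, so the path in the usage comment is a placeholder.

```python
from pathlib import Path
from typing import Dict, Iterable

# Marker strings taken from the report; none were found in the reporter's bundle.
MARKERS = [
    "truncateToolOutput",
    "enableToolOutputTruncation",
    "maxOutput",
    "Output too large",
    "truncated",
]


def find_markers(bundle_path: str, markers: Iterable[str] = MARKERS) -> Dict[str, int]:
    """Count plain-substring occurrences of each marker in the bundle file."""
    text = Path(bundle_path).read_text(encoding="utf-8", errors="replace")
    return {marker: text.count(marker) for marker in markers}


# Usage (placeholder path, adjust to the local install):
# counts = find_markers("node_modules/@google/gemini-cli/bundle/gemini.js")
# print({m: n for m, n in counts.items() if n})
```

An all-zero result matches the report's finding for this version; a bundler that renames identifiers could hide code-level names, although string literals such as "Output too large" would normally survive.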

Expected behavior

When a tool returns output far larger than the model's input window, gemini-cli should bound what reaches the model — e.g. send a head/tail slice plus a "full output saved to <file>" note (it already writes the full output to disk), and/or detect that the assembled request exceeds the model's input limit and drop/summarize the largest historical tool results instead of returning the same error on every future turn. Today one oversized command permanently kills the session.
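
The head/tail idea could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: `bound_tool_output`, the `max_chars` default, and the wording of the note are hypothetical. It slices by characters rather than lines because the output in this report was a single 24 MB line, which a line-based cap would pass through whole.

```python
def bound_tool_output(output: str, saved_path: str,
                      max_chars: int = 8000, head_fraction: float = 0.5) -> str:
    """Return output unchanged if small, else a head/tail slice plus a pointer to the file."""
    if len(output) <= max_chars:
        return output
    head_len = int(max_chars * head_fraction)
    tail_len = max_chars - head_len
    omitted = len(output) - head_len - tail_len
    head = output[:head_len]
    # Guard the zero case: output[-0:] would return the whole string.
    tail = output[-tail_len:] if tail_len > 0 else ""
    note = f"\n[... {omitted} characters omitted; full output saved to {saved_path} ...]\n"
    return head + note + tail
```

With a cap like this the model still sees the start and end of the output and knows where the rest lives, so it can read a narrower slice of the file in a later tool call instead of receiving everything at once.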

Question

Is there a configuration knob in 0.42.0 to cap tool-output size sent to the model? If not, would a default cap (with full output preserved on disk) be in scope?

Environment

  • gemini --version: 0.42.0
  • Auth: oauth-personal
  • Mode: headless, driven via ACP (non-interactive)
  • Model: default Gemini (1,048,576-token input window)
  • OS: Linux container (node image)
