gemini-cli - ✅(Solved) Fix [Feature] Core Reliability: Tool Name Fuzzy Matching and Response Length Continuation [1 pull requests, 1 participants]

achernez · 2026-04-08T22:06:11Z



Issue: google-gemini/gemini-cli#24977 • Fetched 2026-04-09 08:16:39


Author

achernez

Participants

achernez

Timeline (top)

labeled ×2, cross-referenced ×1

Fix Action

Fixed

Fixed by PR: feat(core): implement tool repair and length continuation auto-recovery (https://github.com/google-gemini/gemini-cli/pull/24978)

PR fix notes

PR #24978: feat(core): implement tool repair and length continuation auto-recovery

Repository: google-gemini/gemini-cli
Author: achernez
State: open | merged: False
Link: https://github.com/google-gemini/gemini-cli/pull/24978

Description (problem / solution / changelog)

Summary

This PR adds two reliability features to @google/gemini-cli-core to improve agent robustness:

1. Tool Hallucination Auto-Repair (Fuzzy Matching)

Kebab-to-Snake Normalization: Automatically converts tool names from kebab-case (read-file) to snake_case (read_file).
Levenshtein Distance: Attempts to repair hallucinated tool names using a strict threshold (Max 2 edits).
Ambiguity Check: Aborts repair if multiple valid tools are equally close to the hallucination.
Implementation: Logic isolated in packages/core/src/utils/fuzzy-matcher.ts and integrated into the Scheduler.
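
The repair steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in packages/core/src/utils/fuzzy-matcher.ts: the function names `levenshtein` and `repairToolName` are assumptions, and only the kebab-to-snake normalization, the two-edit threshold, and the ambiguity abort come from the PR description.

```typescript
// Hypothetical sketch; names are illustrative and not taken from the PR.

/** Dynamic-programming Levenshtein distance (insert, delete, substitute). */
function levenshtein(a: string, b: string): number {
  // prev[j] is the distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

/**
 * Returns the repaired tool name, or undefined when no single registered
 * tool is within maxEdits of the requested name.
 */
function repairToolName(
  requested: string,
  validNames: readonly string[],
  maxEdits = 2, // threshold stated in the PR description
): string | undefined {
  if (validNames.includes(requested)) return requested;

  // Step 1: kebab-case to snake_case, e.g. read-file -> read_file.
  const normalized = requested.replace(/-/g, "_");
  if (validNames.includes(normalized)) return normalized;

  // Step 2: collect the closest names within the edit threshold.
  let best = maxEdits + 1;
  let matches: string[] = [];
  for (const name of validNames) {
    const d = levenshtein(normalized, name);
    if (d < best) {
      best = d;
      matches = [name];
    } else if (d === best && d <= maxEdits) {
      matches.push(name);
    }
  }

  // Step 3: abort on ambiguity (several tools equally close) or no match.
  return matches.length === 1 ? matches[0] : undefined;
}
```

With `read_file` and `write_file` registered, `repairToolName("read-file", tools)` resolves through normalization alone, while a name that is equally close to two registered tools yields `undefined`, leaving the caller free to report the original unknown-tool error.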

2. Length Continuation

Auto-Recovery: Detects when a model response is truncated due to length (finishReason: 'MAX_TOKENS').
System Injection: Automatically prompts the model to continue where it left off.
Safety Cap: Limits continuations to 3 iterations per turn to prevent token drain.
Implementation: Handled in the LegacyAgentSession run loop.
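
A continuation loop of the kind described above might look like the sketch below. It is not the `LegacyAgentSession` code: the `Generate` callback, the `runTurn` function, the result shape, and the wording of the continuation prompt are all assumptions made for illustration. Only the `MAX_TOKENS` finish reason and the cap of 3 continuations per turn come from the PR description.

```typescript
// Hypothetical sketch; types and names are illustrative and not taken from the PR.

type FinishReason = "STOP" | "MAX_TOKENS";

interface ModelChunk {
  text: string;
  finishReason: FinishReason;
}

/** Stand-in for a model call; a real session would also carry conversation history. */
type Generate = (prompt: string) => Promise<ModelChunk>;

const MAX_CONTINUATIONS = 3; // safety cap stated in the PR description

// Assumed wording; the PR does not quote its injected prompt.
const CONTINUE_PROMPT =
  "Your previous response was cut off. Continue exactly where you left off.";

interface TurnResult {
  text: string;
  continuations: number;
  stillTruncated: boolean;
}

async function runTurn(generate: Generate, userPrompt: string): Promise<TurnResult> {
  let chunk = await generate(userPrompt);
  let text = chunk.text;
  let continuations = 0;

  // Keep asking for more only while the model stopped for length and the cap allows it.
  while (chunk.finishReason === "MAX_TOKENS" && continuations < MAX_CONTINUATIONS) {
    continuations++;
    chunk = await generate(CONTINUE_PROMPT);
    text += chunk.text;
  }

  // stillTruncated tells the caller that the cap was hit before the model finished.
  return { text, continuations, stillTruncated: chunk.finishReason === "MAX_TOKENS" };
}
```

Returning a `stillTruncated` flag is a design choice of this sketch: it lets the caller distinguish a response that completed after continuation from one that exhausted the cap.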

Verification

Added unit tests for fuzzy-matcher.ts.
Added integration tests in scheduler.test.ts for tool repair logic and ambiguity handling.
Added integration tests in legacy-agent-session.test.ts for length continuation and iteration limits.
Verified with npm run lint and npm run build.
Fixed a bug in scripts/build.js that caused build failures in some Linux environments.

Fixes #24977

Changed files

packages/core/src/agent/legacy-agent-session.test.ts (modified, +107/-0)
packages/core/src/agent/legacy-agent-session.ts (modified, +34/-3)
packages/core/src/agent/types.ts (modified, +4/-0)
packages/core/src/scheduler/scheduler.test.ts (modified, +109/-0)
packages/core/src/scheduler/scheduler.ts (modified, +55/-3)
packages/core/src/utils/fuzzy-matcher.test.ts (added, +63/-0)
packages/core/src/utils/fuzzy-matcher.ts (added, +68/-0)
scripts/build.js (modified, +25/-6)


Problem

Tool Hallucinations: The model sometimes emits tool names in the wrong format (e.g., kebab-case instead of snake_case) or with minor typos, causing execution failures.
Truncated Responses: Responses that exceed the max token limit are cut off, leading to incomplete information or broken JSON/code blocks.

Proposed Solution

Implement two reliability features:

Tool Hallucination Auto-Repair: A fuzzy matcher utility using Levenshtein distance (threshold 2) and kebab-to-snake normalization.
Length Continuation: Automatic recovery from MAX_TOKENS finish reason with a safety cap of 3 iterations.

extent analysis

TL;DR

Implementing a fuzzy matcher utility with Levenshtein distance and kebab-to-snake normalization may help mitigate tool hallucination issues.

Guidance

Review the proposed Tool Hallucination Auto-Repair feature to ensure it correctly handles kebab-case to snake_case conversions and minor typos with a suitable threshold.
Consider testing the Length Continuation feature with responses of different lengths to verify that it recovers from a MAX_TOKENS finish reason within the safety cap of 3 iterations.
Evaluate the performance of the fuzzy matcher utility to prevent potential false positives or negatives that could affect the overall reliability of the system.
Assess the impact of the Length Continuation feature on response completeness and JSON/code block integrity to ensure it does not introduce new issues.
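
One way to check the integrity concern in the last point is to assert on the stitched response text in tests. The helpers below are a hedged sketch: `hasBalancedFences` and `isCompleteJson` are hypothetical names that do not appear in the PR, and counting fence lines is only a rough heuristic for Markdown code-block completeness.

```typescript
// Hypothetical test helpers; not part of the PR.

// Built at runtime so this sample can itself be placed inside a fenced block.
const FENCE = "`".repeat(3);

/** True if the number of Markdown fence lines is even, i.e. every opened block is closed. */
function hasBalancedFences(text: string): boolean {
  const fenceLines = text.split("\n").filter((line) => line.trim().startsWith(FENCE));
  return fenceLines.length % 2 === 0;
}

/** True if the text parses as one complete JSON document. */
function isCompleteJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}
```

A test could run these against the text before and after continuation: a payload cut off mid-string should fail `isCompleteJson`, and the same payload with its continuation appended should pass.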

Example

No code from the PR is reproduced here. In outline, the fuzzy matcher combines a string normalization step (kebab-case to snake_case) with a Levenshtein distance calculation against the registered tool names.

Notes

The effectiveness of these features may depend on the specific requirements and constraints of the system, such as the maximum allowed token limit and the acceptable threshold for Levenshtein distance.

Recommendation

Adopt the Tool Hallucination Auto-Repair and Length Continuation features proposed in PR #24978, since they address both identified problems directly. Note that the PR is still open and unmerged.
