gemini-cli - ✅ (Solved) Fix [Feature] Core Reliability: Tool Name Fuzzy Matching and Response Length Continuation [1 pull request, 1 participant]

GitHub stats

google-gemini/gemini-cli#24977 · Fetched 2026-04-09 08:16:39
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): labeled ×2, cross-referenced ×1

Fix Action

Fixed

PR fix notes

PR #24978: feat(core): implement tool repair and length continuation auto-recovery

Description (problem / solution / changelog)

Summary

This PR adds two reliability features to @google/gemini-cli-core to improve agent robustness:

1. Tool Hallucination Auto-Repair (Fuzzy Matching)

  • Kebab-to-Snake Normalization: Automatically converts tool names from kebab-case (read-file) to snake_case (read_file).
  • Levenshtein Distance: Attempts to repair a hallucinated tool name by matching it to the nearest valid tool name within a strict threshold (at most 2 edits).
  • Ambiguity Check: Aborts repair if multiple valid tools are equally close to the hallucination.
  • Implementation: Logic isolated in packages/core/src/utils/fuzzy-matcher.ts and integrated into the Scheduler.
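
The three steps above (normalization, distance threshold, ambiguity check) can be sketched as follows. This is a minimal illustration, not the code from packages/core/src/utils/fuzzy-matcher.ts, which the source does not show. The names kebabToSnake, levenshtein, repairToolName, and MAX_EDITS are hypothetical, and the order in which the steps are applied is an assumption.

```typescript
// Hypothetical sketch; function and constant names are illustrative only.
const MAX_EDITS = 2; // threshold stated in the PR description

/** Convert kebab-case to snake_case, e.g. "read-file" becomes "read_file". */
function kebabToSnake(name: string): string {
  return name.replace(/-/g, "_");
}

/** Levenshtein distance via dynamic programming with two rolling rows. */
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

/**
 * Return the repaired tool name, or null when no valid tool is within
 * MAX_EDITS or when two or more valid tools tie for the closest match.
 */
function repairToolName(requested: string, validTools: string[]): string | null {
  const normalized = kebabToSnake(requested);
  if (validTools.includes(normalized)) return normalized;

  let best: string | null = null;
  let bestDistance = MAX_EDITS + 1;
  let tied = false;
  for (const tool of validTools) {
    const distance = levenshtein(normalized, tool);
    if (distance < bestDistance) {
      best = tool;
      bestDistance = distance;
      tied = false;
    } else if (distance === bestDistance && distance <= MAX_EDITS) {
      tied = true; // another tool is equally close: ambiguous
    }
  }
  return tied ? null : best;
}
```

Under these assumptions, repairToolName("read-file", ["read_file", "write_file"]) resolves through normalization alone, while repairToolName("bet_file", ["get_file", "set_file"]) returns null because both candidates are one edit away.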

2. Length Continuation

  • Auto-Recovery: Detects when a model response is truncated due to length (finishReason: 'MAX_TOKENS').
  • System Injection: Automatically prompts the model to continue where it left off.
  • Safety Cap: Limits continuations to 3 iterations per turn to prevent token drain.
  • Implementation: Handled in the LegacyAgentSession run loop.
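
The continuation loop described above can be sketched as follows. This is a simplified illustration, not the LegacyAgentSession code, which the source does not show. The ModelResponse and Generate types, the generateWithContinuation function, and the wording of CONTINUE_PROMPT are all hypothetical. The sketch also assumes that generate is stateful, so that the model sees its earlier partial output when asked to continue.

```typescript
// Hypothetical sketch; types, names, and prompt wording are assumptions.
interface ModelResponse {
  text: string;
  finishReason: string; // e.g. "STOP" or "MAX_TOKENS"
}

// Assumed to carry conversation history between calls.
type Generate = (prompt: string) => Promise<ModelResponse>;

const MAX_CONTINUATIONS = 3; // safety cap stated in the PR description
const CONTINUE_PROMPT =
  "Your previous response was cut off. Continue exactly where you left off.";

interface ContinuationResult {
  text: string; // concatenated output of all iterations
  continuations: number; // how many continuation prompts were sent
  truncated: boolean; // true if the cap was hit while still truncated
}

async function generateWithContinuation(
  generate: Generate,
  prompt: string,
): Promise<ContinuationResult> {
  let response = await generate(prompt);
  let text = response.text;
  let continuations = 0;

  // Keep prompting while the model stops for length, up to the cap.
  while (
    response.finishReason === "MAX_TOKENS" &&
    continuations < MAX_CONTINUATIONS
  ) {
    continuations++;
    response = await generate(CONTINUE_PROMPT);
    text += response.text;
  }

  return {
    text,
    continuations,
    truncated: response.finishReason === "MAX_TOKENS",
  };
}
```

Returning a truncated flag is a design choice of this sketch: it lets the caller tell a completed response apart from one that was still cut off when the cap was reached.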

Verification

  • Added unit tests for fuzzy-matcher.ts.
  • Added integration tests in scheduler.test.ts for tool repair logic and ambiguity handling.
  • Added integration tests in legacy-agent-session.test.ts for length continuation and iteration limits.
  • Verified with npm run lint and npm run build.
  • Fixed a bug in scripts/build.js that caused build failures in some Linux environments.

Fixes #24977

Changed files

  • packages/core/src/agent/legacy-agent-session.test.ts (modified, +107/-0)
  • packages/core/src/agent/legacy-agent-session.ts (modified, +34/-3)
  • packages/core/src/agent/types.ts (modified, +4/-0)
  • packages/core/src/scheduler/scheduler.test.ts (modified, +109/-0)
  • packages/core/src/scheduler/scheduler.ts (modified, +55/-3)
  • packages/core/src/utils/fuzzy-matcher.test.ts (added, +63/-0)
  • packages/core/src/utils/fuzzy-matcher.ts (added, +68/-0)
  • scripts/build.js (modified, +25/-6)

Problem

  1. Tool Hallucinations: The model sometimes emits tool names in the wrong format (e.g., kebab-case instead of snake_case) or with minor typos, causing execution failures.
  2. Truncated Responses: Responses that exceed the max token limit are cut off, leading to incomplete information or broken JSON/code blocks.

Proposed Solution

Implement two reliability features:

  1. Tool Hallucination Auto-Repair: A fuzzy matcher utility using Levenshtein distance (threshold 2) and kebab-to-snake normalization.
  2. Length Continuation: Automatic recovery from MAX_TOKENS finish reason with a safety cap of 3 iterations.

Extended analysis

TL;DR

A fuzzy matcher using Levenshtein distance (threshold 2) and kebab-to-snake normalization addresses tool-name hallucinations, and automatic continuation on a MAX_TOKENS finish reason (capped at 3 iterations) addresses truncated responses.

Guidance

  • Review the Tool Hallucination Auto-Repair feature to confirm it handles kebab-case to snake_case conversion and minor typos within the 2-edit threshold.
  • Test the Length Continuation feature with responses of different lengths to verify that it recovers from a MAX_TOKENS finish reason within the 3-iteration safety cap.
  • Evaluate the fuzzy matcher for false positives (repairing to the wrong tool) and false negatives (failing to repair an obvious typo), since either would reduce overall reliability.
  • Assess how Length Continuation affects response completeness and the integrity of JSON and code blocks, to ensure it does not introduce new issues.

Example

This page does not reproduce the PR's code. In outline, the fuzzy matcher combines a string normalization step (kebab-case to snake_case) with a Levenshtein distance calculation against the list of valid tool names.

Notes

The effectiveness of these features may depend on the specific requirements and constraints of the system, such as the maximum allowed token limit and the acceptable threshold for Levenshtein distance.

Recommendation

Apply the fix from PR #24978, which implements the Tool Hallucination Auto-Repair and Length Continuation features and addresses both identified issues directly.
