gemini-cli - 💡(How to fix) Fix [Bug/Perf] Redundant utility_router calls in LocalAgentExecutor during subagent execution [2 comments, 3 participants]

Code Example

> /about
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                     │
│ About Gemini CLI                                                                                                                                    │
│                                                                                                                                                     │
│ CLI Version                                        0.39.0-nightly.20260408.e77b22e63                                                                │
│ Git Commit                                         21e1c6092                                                                                        │
│ Model                                              Auto (Gemini 3)                                                                                  │
│ Sandbox                                            no sandbox                                                                                       │
│ OS                                                 win32                                                                                            │
│ Auth Method                                        Signed in with Google (i@lm902.cn)                                                               │
│ Tier                                               Gemini Code Assist in Google One AI Pro                                                          │
│                                                                                                                                                     │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

What happened?

When invoking a subagent (e.g., @codebase_investigator or @generalist) with the model configuration set to auto (or inherited auto), the LocalAgentExecutor triggers a fresh call to ModelRouterService.route() on every single turn of the subagent's loop.

Because the utility_router (typically using gemini-2.5-flash-lite) is used to classify task complexity, calling it repeatedly within the same subagent session is redundant. For a subagent task that requires 10 turns to complete, the system makes 10 extra AI requests for classification.

This behavior differs from the main GeminiClient implementation. The GeminiClient correctly implements "model stickiness" by caching the routing decision in currentSequenceModel and reusing it for the duration of a sequence. LocalAgentExecutor currently lacks this caching mechanism, leading to:

Increased Latency: Every turn waits for an extra classification call before the actual model generation.
Wasted Quota: Multiplies the number of API requests sent to the routing model.
Redundant Telemetry: Floods the logs with identical routing decisions for the same task.

Image: Example of Flash Lite usage almost mirroring Flash usage after heavy usage of subagents.

What did you expect to happen?

The LocalAgentExecutor should cache the routing decision made during the first turn of a subagent's execution. Subsequent turns within that same subagent session should reuse the cached model choice instead of re-invoking the utility_router.

Client information

<details> <summary>Client Information</summary>

Run gemini to enter the interactive CLI, then run the /about command.

> /about
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                     │
│ About Gemini CLI                                                                                                                                    │
│                                                                                                                                                     │
│ CLI Version                                        0.39.0-nightly.20260408.e77b22e63                                                                │
│ Git Commit                                         21e1c6092                                                                                        │
│ Model                                              Auto (Gemini 3)                                                                                  │
│ Sandbox                                            no sandbox                                                                                       │
│ OS                                                 win32                                                                                            │
│ Auth Method                                        Signed in with Google ([email protected])                                                               │
│ Tier                                               Gemini Code Assist in Google One AI Pro                                                          │
│                                                                                                                                                     │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

</details>

Login information

Signed in with Google

Anything else we need to know?

Users can verify this by running /stats model and observing the disproportionately high call count for the utility_router role compared to the number of user-initiated prompts.

extent analysis

TL;DR

Implementing a caching mechanism in LocalAgentExecutor to store the routing decision for the duration of a subagent session is likely to fix the issue.

Guidance

Identify the LocalAgentExecutor class and locate the method responsible for calling ModelRouterService.route() to understand where the caching mechanism needs to be implemented.
Consider adding a cache, such as a dictionary, to store the routing decision for each subagent session, using a unique identifier for the session as the key.
Modify the LocalAgentExecutor to check the cache before calling ModelRouterService.route(), and update the cache with the routing decision if it's not already stored.
Verify the fix by running /stats model and checking that the call count for the utility_router role is reduced.

Example

# Pseudo-code example of caching mechanism
class LocalAgentExecutor:
    def __init__(self):
        self.route_cache = {}

    def execute_subagent(self, subagent_id):
        if subagent_id not in self.route_cache:
            # Call ModelRouterService.route() and store the result in the cache
            route_decision = ModelRouterService.route()
            self.route_cache[subagent_id] = route_decision
        else:
            # Reuse the cached routing decision
            route_decision = self.route_cache[subagent_id]
        # Proceed with the subagent execution using the route decision

Notes

The exact implementation of the caching mechanism may vary depending on the specific requirements and constraints of the system. Additionally, the cache should be properly invalidated when a subagent session ends to avoid stale data.

Recommendation

Apply a workaround by implementing a caching mechanism in LocalAgentExecutor, as this is likely to fix the issue and reduce the redundant API requests.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

gemini-cli - 💡(How to fix) Fix [Bug/Perf] Redundant utility_router calls in LocalAgentExecutor during subagent execution [2 comments, 3 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Code Example

What happened?

What did you expect to happen?

Client information

Login information

Anything else we need to know?

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

gemini-cli - 💡(How to fix) Fix [Bug/Perf] Redundant utility_router calls in LocalAgentExecutor during subagent execution [2 comments, 3 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Code Example

What happened?

What did you expect to happen?

Client information

Login information

Anything else we need to know?

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING