gemini-cli - 💡(How to fix) Fix Build program failure: no matching function for call to 'convert_float' on RTX 5080 with gpu_artisan backend [1 comment, 2 participants]

GitHub stats
google-gemini/gemini-cli#26502 · Fetched 2026-05-06 06:36:03
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 0
Timeline (top): commented ×1, labeled ×1

Error Message

[Routing] GemmaClassifierStrategy failed: _ApiError: {"error":{"message":"failed to create the engine: failed to create engine for model gemma3-1b-gpu-custom with backend gpu_artisan: failed to create engine\n","code":500,"status":"Internal Server Error"}}

Code Example

> /about
About Gemini CLI
CLI Version    0.40.1
Git Commit     7a382e066
Model          Auto (Gemini 3)
Sandbox        no sandbox
OS             win32
Auth Method    Signed in with Google (***@gmail.com)
Tier           Gemini Code Assist in Google One AI Pro

What happened?

The Gemini CLI fails to initialize the local inference engine for gemma3-1b-gpu-custom, even though the local server is running on port 9379 and the GPU drivers are up to date. The logs indicate a failure during the OpenCL kernel build, specifically in FP16 (half-precision) code.

As a result, the router always chooses Flash.

What did you expect to happen?

Requests should be routed correctly to either Flash or Pro.

Client information

<details> <summary>Client Information</summary>

Run gemini to enter the interactive CLI, then run the /about command.

> /about
About Gemini CLI

CLI Version    0.40.1
Git Commit     7a382e066
Model          Auto (Gemini 3)
Sandbox        no sandbox
OS             win32
Auth Method    Signed in with Google (***@gmail.com)
Tier           Gemini Code Assist in Google One AI Pro
</details>

Login information

[Routing] GemmaClassifierStrategy failed: _ApiError: {"error":{"message":"failed to create the engine: failed to create engine for model gemma3-1b-gpu-custom with backend gpu_artisan: failed to create engine\n","code":500,"status":"Internal Server Error"}}
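
When collecting several of these log lines, it can help to pull out the status code, model, and backend programmatically. The following is a minimal sketch, assuming the line has the shape shown above with the stray box-drawing characters removed; the function name `parse_routing_failure` is hypothetical and not part of Gemini CLI.

```python
import json
import re

# The routing failure line from the report, with terminal box-drawing
# characters stripped out of the message.
LOG_LINE = (
    '[Routing] GemmaClassifierStrategy failed: _ApiError: '
    '{"error":{"message":"failed to create the engine: failed to create engine '
    'for model gemma3-1b-gpu-custom with backend gpu_artisan: failed to create '
    'engine\\n","code":500,"status":"Internal Server Error"}}'
)


def parse_routing_failure(line: str) -> dict:
    """Extract the JSON error payload plus model and backend names from a log line."""
    start = line.find("{")
    if start == -1:
        raise ValueError("no JSON payload found in log line")
    error = json.loads(line[start:])["error"]
    # The model and backend appear in the human-readable message text.
    match = re.search(r"for model (\S+) with backend (\S+?):", error["message"])
    return {
        "code": error["code"],
        "status": error["status"],
        "model": match.group(1) if match else None,
        "backend": match.group(2) if match else None,
    }


if __name__ == "__main__":
    print(parse_routing_failure(LOG_LINE))
```

The regular expression depends on the exact wording of the message, so it may need adjusting if the server phrases the error differently.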

Anything else we need to know?

  1. Enable gemmaModelRouter in settings.json.
  2. Start the Gemini CLI and send any request (e.g., "How are you?").
  3. The router attempts to classify the request using the local Gemma 3 model, leading to the engine initialization crash.
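
Before reproducing, it may be worth confirming that something is actually listening on the local port. The sketch below is a plain TCP reachability check, assuming the server is bound to 127.0.0.1 on port 9379 as described in the report; `is_port_open` is a hypothetical helper and does not call any Gemini CLI or server API.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the local inference server listens on 127.0.0.1:9379.
    state = "listening" if is_port_open("127.0.0.1", 9379) else "not reachable"
    print(f"127.0.0.1:9379 is {state}")
```

An open port only shows that the server process is up. As this issue illustrates, engine creation can still fail afterwards.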

extent analysis

TL;DR

The Gemini CLI fails to initialize the local inference engine for the gemma3-1b-gpu-custom model due to an issue with FP16 calculations during the OpenCL kernel build process, resulting in the router always choosing Flash.
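
The "always Flash" symptom is consistent with a router that falls back to a default model whenever its classifier raises. The sketch below illustrates that general pattern only; it is not the Gemini CLI implementation, and `route`, `broken_classifier`, and `DEFAULT_MODEL` are hypothetical names.

```python
from typing import Callable

# Assumption: the fallback choice matches the behavior observed in the report.
DEFAULT_MODEL = "flash"


def route(prompt: str, classify: Callable[[str], str]) -> str:
    """Return the classifier's choice, or the default model if the classifier raises."""
    try:
        return classify(prompt)
    except RuntimeError as err:
        print(f"[Routing] classifier failed: {err}")
        return DEFAULT_MODEL


def broken_classifier(prompt: str) -> str:
    """Stand-in for a local engine that cannot be created."""
    raise RuntimeError("failed to create engine")


if __name__ == "__main__":
    print(route("How are you?", broken_classifier))
```

With this pattern, a classifier that fails on every request makes the router return the same default every time, whatever the prompt.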

Guidance

  • Confirm that the GPU driver and its OpenCL runtime are current. The reporter states that the drivers are already up to date, so a driver update alone may not resolve the kernel build failure.
  • Check the gemmaModelRouter setting in settings.json. Enabling it is what makes the router classify requests with the local Gemma 3 model; it is not required for the CLI itself, so disabling it should avoid the failing engine initialization, at the cost of local routing.
  • If the local server or backend exposes a way to avoid FP16 kernels, try it, or switch to a model or backend that does not depend on FP16, and check whether the failure persists.
  • Review the Gemini CLI documentation for any requirements or recommendations that apply to running the gemma3-1b-gpu-custom model on this system configuration (RTX 5080, win32).
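
To check the router setting without opening the file by hand, a small script can read it. This is a sketch under a stated assumption: the key path `experimental.gemmaModelRouter.enabled` is illustrative and may not match your CLI version, so compare it against your own settings.json. The helper names are hypothetical.

```python
import json
from typing import Any

# Assumption: illustrative key path; verify it against your own settings.json.
ROUTER_KEY_PATH = ("experimental", "gemmaModelRouter", "enabled")


def get_nested(settings: dict, path: tuple, default: Any = None) -> Any:
    """Walk a nested dict along path, returning default if any key is missing."""
    node: Any = settings
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def router_enabled(settings_text: str) -> bool:
    """Return True only if the router flag is present and set to true."""
    return get_nested(json.loads(settings_text), ROUTER_KEY_PATH, False) is True


if __name__ == "__main__":
    sample = '{"experimental": {"gemmaModelRouter": {"enabled": true}}}'
    print(router_enabled(sample))
```

The sketch expects plain JSON, so a settings file that contains comments would need them stripped first.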

Notes

The issue may be specific to the gemma3-1b-gpu-custom model or the system configuration, and further troubleshooting may be required to determine the root cause.

Recommendation

Until a permanent fix is available, work around the failure by avoiding FP16 kernels (if the backend allows it) or by switching to a different model. Either may let the router initialize its local classifier.
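
One thing worth checking is whether the GPU's OpenCL driver advertises half-precision support at all. OpenCL reports extensions as a space-separated string (for example in the output of a tool such as clinfo), and `cl_khr_fp16` is the standard extension name for FP16. The sketch below only inspects such a string; the issue does not confirm that a missing extension is the cause here, and the sample strings are made up for illustration.

```python
def supports_fp16(extensions: str) -> bool:
    """Return True if an OpenCL extension string advertises cl_khr_fp16."""
    return "cl_khr_fp16" in extensions.split()


if __name__ == "__main__":
    # Hypothetical extension strings, not taken from the reporter's machine.
    with_fp16 = "cl_khr_fp16 cl_khr_fp64 cl_khr_icd"
    without_fp16 = "cl_khr_fp64 cl_khr_icd cl_khr_byte_addressable_store"
    print(supports_fp16(with_fp16), supports_fp16(without_fp16))
```

Splitting on whitespace rather than using a substring test avoids false positives from longer extension names that merely contain the same text.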
