openclaw - ✅(Solved) Fix [Bug]: NVIDIA provider plugin omits apiKey field, silently breaks pi-coding-agent's models.json validation [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#73013 · Fetched 2026-04-28 06:28:41
Comments: 1
Participants: 2
Timeline events: 4
Reactions: 0
Timeline (top): referenced ×2, commented ×1, cross-referenced ×1

Error Message

ModelRegistry.find('nvidia', 'nemotron-3-super-120b-a12b')
// => null
// Underlying error: Failed to load models.json:
//   Provider nvidia: "apiKey" is required when defining custom models

Root Cause

@openclaw/nvidia-provider's catalog.buildProvider writes { baseUrl, api, models, ... } to models.json without an apiKey field.

pi-coding-agent's validateConfig() (lines 373-380 of model-registry.js) requires apiKey for any non-built-in provider that declares custom models. When apiKey is missing, it throws, and the registry falls back to built-in models only, silently discarding every other custom provider entry in the file.
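
To make the all-or-nothing behavior concrete, here is a minimal sketch of the failure mode described above. It is not pi-coding-agent's actual source: the types, the BUILT_IN_PROVIDERS list, the placeholder URLs, and loadCustomProviders are hypothetical names chosen for illustration. Only the error text and the "one bad entry rejects the file" behavior come from the issue.

```typescript
// Hypothetical, simplified model of the validation described in the issue.
interface ProviderEntry {
  baseUrl?: string;
  api?: string;
  apiKey?: string;
  models?: { id: string }[];
}

interface ModelsFile {
  providers: Record<string, ProviderEntry>;
}

// Assumed list of providers that need no apiKey in models.json.
const BUILT_IN_PROVIDERS = ["anthropic", "openai"];

function validateConfig(config: ModelsFile): void {
  for (const name of Object.keys(config.providers)) {
    const entry = config.providers[name];
    const definesCustomModels = (entry.models?.length ?? 0) > 0;
    const isBuiltIn = BUILT_IN_PROVIDERS.indexOf(name) !== -1;
    if (!isBuiltIn && definesCustomModels && !entry.apiKey) {
      // Same message as the one quoted in the issue.
      throw new Error(`Provider ${name}: "apiKey" is required when defining custom models`);
    }
  }
}

// Returns the custom provider names that survive loading.
function loadCustomProviders(config: ModelsFile): string[] {
  try {
    validateConfig(config);
    return Object.keys(config.providers);
  } catch (err) {
    // One invalid entry rejects the whole file; only built-ins remain.
    return [];
  }
}

const broken: ModelsFile = {
  providers: {
    xai: { baseUrl: "https://example.invalid/v1", apiKey: "XAI_API_KEY", models: [{ id: "m1" }] },
    nvidia: { baseUrl: "https://example.invalid/v1", models: [{ id: "nemotron-3-super-120b-a12b" }] },
  },
};

const fixed: ModelsFile = {
  providers: {
    xai: broken.providers.xai,
    nvidia: { ...broken.providers.nvidia, apiKey: "NVIDIA_API_KEY" },
  },
};
```

In this sketch, loadCustomProviders(broken) yields an empty list even though the xai entry is valid on its own, while loadCustomProviders(fixed) keeps both entries.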

Fix Action

Workaround

Configure the provider with an explicit apiKey field in models.json, or in openclaw.json under models.providers.<id>.


Reported by FortiumPartners after running into both bugs while wiring NVIDIA NIM into a multi-agent NemoClaw deployment. Happy to test patches.

PR fix notes

PR #73042: fix(nvidia): add NVIDIA_API_KEY marker to provider catalog output

Description (problem / solution / changelog)

Fixes #73013

Summary

  • Problem: buildNvidiaProvider() in extensions/nvidia/provider-catalog.ts returns a ModelProviderConfig without an apiKey field. When that config is persisted to ~/.openclaw/agents/<id>/agent/models.json, pi-coding-agent's validateConfig() rejects the entry because non-built-in providers with custom models must declare apiKey. The whole models.json then fails validation, silently dropping every other custom provider entry in the same file (codex, xai, github-copilot, etc.) and falling through to built-in models only.
  • Why it matters: A single missing field on one bundled plugin silently disables every custom provider a user has registered. There is no warn/error log — the file just stops loading. Reporter discovered this only by direct ModelRegistry.find('nvidia', ...) probing.
  • What changed: Added apiKey: "NVIDIA_API_KEY" to the object returned by buildNvidiaProvider(). This is the bare env-var name; OpenClaw's resolveUsableCustomProviderApiKey already recognizes that form and resolves to process.env.NVIDIA_API_KEY at infer time. Same pattern already used by extensions/codex/provider-catalog.ts:68 (apiKey: CODEX_APP_SERVER_AUTH_MARKER). Added a regression test asserting the field is present.
  • What did NOT change (scope boundary): Auth manifest (already declares envVar: "NVIDIA_API_KEY" in extensions/nvidia/index.ts:21). Model definitions, baseUrl, api, contextWindow, costs. Other provider plugins (the same omission may exist elsewhere — left for follow-up issues so this PR stays focused). Did not change resolveUsableCustomProviderApiKey or the validation contract — the fix conforms to existing behavior.
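
A rough sketch of the shape of the change follows. The ModelProviderConfig interface here is a simplified, hypothetical stand-in for the real type, and the baseUrl and api values are placeholders rather than the plugin's actual defaults. The only detail taken from the PR is that the returned object now carries apiKey: "NVIDIA_API_KEY".

```typescript
// Hypothetical, simplified stand-in for the real ModelProviderConfig type.
interface ModelProviderConfig {
  baseUrl: string;
  api: string;
  apiKey?: string;
  models: { id: string }[];
}

// Bare env-var name, matching the auth manifest's envVar per the PR.
const NVIDIA_API_KEY_MARKER = "NVIDIA_API_KEY";

function buildNvidiaProvider(): ModelProviderConfig {
  return {
    baseUrl: "https://example.invalid/v1", // placeholder, not the real endpoint
    api: "openai-completions", // assumed value, for illustration only
    apiKey: NVIDIA_API_KEY_MARKER, // marker (env-var name), never the secret itself
    models: [{ id: "nemotron-3-super-120b-a12b" }],
  };
}
```

Keeping the marker in a named constant is one way to make the intent explicit to readers who later find the string in models.json.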

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #73013
  • Related #
  • This PR fixes a bug or regression

Root Cause

  • Root cause: When catalog.buildProvider is called and its result is persisted to models.json, downstream consumers (pi-coding-agent's validateConfig()) require apiKey for any non-built-in provider that declares custom models. buildNvidiaProvider() omitted that field, so the persisted entry failed validation. The validation failure isn't local to NVIDIA — it throws synchronously, which causes the entire models.json parse to fail, silently dropping every other custom provider entry in the same file.
  • Missing detection / guardrail: extensions/nvidia/provider-catalog.test.ts only asserted baseUrl, api, and the model id list. It did not assert the presence of apiKey, so the missing field never tripped a test.
  • Contributing context (if known): Several other provider plugins already include apiKey in their buildProvider output — codex uses CODEX_APP_SERVER_AUTH_MARKER, anthropic uses an actual credential value in provider-discovery.ts. The convention exists; nvidia just missed it.

Regression Test Plan

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/nvidia/provider-catalog.test.ts
  • Scenario the test should lock in: buildNvidiaProvider().apiKey === "NVIDIA_API_KEY" — explicit assertion that the env-var marker is present and exactly matches the env-var name declared in the auth manifest.
  • Why this is the smallest reliable guardrail: buildNvidiaProvider is a pure function with no I/O. Asserting one field is enough; the wider validation contract is covered by pi-coding-agent's own tests.
  • Existing test that already covers this (if any): None. The existing "builds the bundled NVIDIA provider defaults" test asserts baseUrl, api, and model ids, but never reads apiKey.
  • If no new test is added, why not: N/A — 1 new test added.
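
The guardrail in the plan above can be sketched without a test framework. Everything here is a hypothetical stand-in: authManifest mimics the envVar declaration the PR cites, buildNvidiaProvider is a stub with placeholder values, and checkApiKeyMarker is an illustrative helper. The actual regression test lives in extensions/nvidia/provider-catalog.test.ts and uses the repository's test runner.

```typescript
// Hypothetical stand-in for the auth manifest's declared env var.
const authManifest = { envVar: "NVIDIA_API_KEY" };

// Stub of the pure builder under test; values are placeholders.
function buildNvidiaProvider(): { baseUrl: string; api: string; apiKey?: string } {
  return {
    baseUrl: "https://example.invalid/v1",
    api: "openai-completions",
    apiKey: "NVIDIA_API_KEY",
  };
}

// Collects failures instead of throwing so every problem is reported at once.
function checkApiKeyMarker(): string[] {
  const failures: string[] = [];
  const provider = buildNvidiaProvider();
  if (provider.apiKey === undefined) {
    failures.push("apiKey is missing from the provider config");
  } else if (provider.apiKey !== authManifest.envVar) {
    failures.push(`apiKey "${provider.apiKey}" does not match envVar "${authManifest.envVar}"`);
  }
  return failures;
}
```

Comparing against the manifest value, rather than a second string literal, would also catch the two names drifting apart.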

User-visible / Behavior Changes

  • A user who has registered other custom providers (codex, xai, github-copilot, etc.) in ~/.openclaw/agents/<id>/agent/models.json will no longer have those entries silently dropped because of NVIDIA's missing field.
  • For users who only configured NVIDIA: nvidia/nemotron-3-super-120b-a12b and the other three NVIDIA models now appear configured (instead of configured,missing) when NVIDIA_API_KEY is set in the env.
  • No defaults removed or renamed.

Diagram

models.json validation chain

Before:
[user sets NVIDIA_API_KEY, runs `openclaw models list`]
  -> models.json contains nvidia entry without apiKey
  -> pi-coding-agent validateConfig() throws
  -> models.json fails to load entirely
  -> codex/xai/copilot/... all silently dropped
  -> registry falls through to built-in models only
  -> nvidia/* shows as `configured,missing`

After:
[user sets NVIDIA_API_KEY, runs `openclaw models list`]
  -> models.json contains nvidia entry with apiKey: "NVIDIA_API_KEY"
  -> pi-coding-agent validateConfig() accepts the entry
  -> models.json loads successfully; all custom providers stay registered
  -> resolveUsableCustomProviderApiKey() resolves "NVIDIA_API_KEY" to process.env at infer time
  -> nvidia/* shows as `configured`
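
The last resolution step in the "After" chain can be illustrated as follows. This is not the real resolveUsableCustomProviderApiKey, whose rules are not shown in the issue. resolveApiKey, the Env type, and the "looks like an env-var name" pattern are assumptions made for this sketch.

```typescript
// The environment is passed in explicitly so the sketch stays easy to test.
type Env = Record<string, string | undefined>;

// Assumption: an all-caps identifier is treated as an env-var marker.
const ENV_VAR_NAME = /^[A-Z][A-Z0-9_]*$/;

function resolveApiKey(configured: string | undefined, env: Env): string | undefined {
  if (!configured) return undefined;
  if (ENV_VAR_NAME.test(configured)) {
    // Marker form: look the secret up at call time.
    // Returns undefined when the variable is not set.
    return env[configured];
  }
  // Anything else is treated as a literal key.
  return configured;
}

// Example: the marker written by the plugin resolves to the env value.
const resolved = resolveApiKey("NVIDIA_API_KEY", { NVIDIA_API_KEY: "nvapi-example" });
```

Under these assumptions the secret never has to be written to models.json, since only the variable name is persisted.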

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No (the field is the env-var name, not the secret value; existing resolveUsableCustomProviderApiKey reads the actual secret from process.env at infer time)
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Linux (any)
  • Runtime/container: Node 22.14+
  • Model/provider: nvidia/nemotron-3-super-120b-a12b (NVIDIA NIM)
  • Integration/channel (if any): N/A — provider plugin
  • Relevant config (redacted): NVIDIA_API_KEY env var set; one or more custom providers registered in ~/.openclaw/agents/<id>/agent/models.json

Steps

  1. Set NVIDIA_API_KEY env var.
  2. Have at least one other custom provider entry in ~/.openclaw/agents/<id>/agent/models.json alongside the bundled nvidia entry.
  3. Run openclaw models list.

Expected

  • nvidia/nemotron-3-super-120b-a12b row tagged configured.
  • All other custom providers in the same models.json continue to load.

Actual (before fix)

  • nvidia/nemotron-3-super-120b-a12b row tagged configured,missing.
  • ModelRegistry.find('nvidia', 'nemotron-3-super-120b-a12b') returns null.
  • All other custom providers in the same file silently disappear.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)
$ pnpm test extensions/nvidia
 Test Files  2 passed (2)
      Tests  3 passed (3)   # 2 existing + 1 new (apiKey marker assertion)

pnpm check:changed is green: conflict markers, typecheck, lint, runtime import cycles all pass.

Human Verification (required)

  • Verified scenarios:
    • Targeted vitest run for extensions/nvidia (3/3 pass locally on Node 22).
    • Full pnpm check:changed gate (all lanes green).
    • Cross-checked the convention by reading extensions/codex/provider-catalog.ts:68 (apiKey: CODEX_APP_SERVER_AUTH_MARKER) — the new apiKey: "NVIDIA_API_KEY" field follows the same shape and matches the auth manifest's declared envVar exactly.
    • Confirmed extensions/nvidia/index.ts:21 declares envVar: "NVIDIA_API_KEY", so the marker string in buildNvidiaProvider is consistent with the auth method.
  • Edge cases checked:
    • No other place writes to NVIDIA's persisted models.json (grep confirms buildNvidiaProvider is the only producer).
    • Test still asserts the bundled defaults (baseUrl, api, model ids) so we haven't regressed coverage of the rest of the config shape.
  • What you did not verify:
    • Live end-to-end against a real NVIDIA API key (openclaw models list against an actual environment with NVIDIA_API_KEY set and pi-coding-agent loaded). I do not have a NIM subscription. The fix is mechanical and matches an existing convention, and the unit test locks in the field shape; the live path (resolveUsableCustomProviderApiKey reading process.env.NVIDIA_API_KEY) is already covered by other plugins that use bare env-var markers.
    • The same field omission may exist on other provider plugins. The reporter notes "the same fix likely applies to any other plugin whose provider id matches the model id prefix." I scoped this PR to NVIDIA only; auditing the rest is a separate task.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

(Both will be checked once review activity lands. Currently no bot review conversations on this PR.)

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: apiKey: "NVIDIA_API_KEY" is a marker value, not a real secret; if resolveUsableCustomProviderApiKey were ever changed to require a literal secret (instead of resolving env-var names), this entry would break.
    • Mitigation: codex already uses the same bare-marker convention (CODEX_APP_SERVER_AUTH_MARKER), so any such future change would have to handle that case generically. We follow an established pattern.
  • Risk: writing the env-var name into a file might look confusing to users who inspect models.json directly.
    • Mitigation: the value is a documented marker, not a credential. The file's existing style (codex's marker) sets the precedent. A docs improvement covering this convention would be a separate PR.
  • Risk: the same omission likely exists on other plugins (the reporter flagged this).
    • Mitigation: scoped to NVIDIA in this PR per the original issue. If a maintainer wants a sweep, that's a separate refactor — happy to follow up if asked.

Changed files

  • extensions/nvidia/provider-catalog.test.ts (modified, +6/-0)
  • extensions/nvidia/provider-catalog.ts (modified, +12/-4)

Original issue text

Symptom

Any custom provider entry in ~/.openclaw/agents/<id>/agent/models.json without an apiKey field causes pi-coding-agent's validateConfig() to throw, which silently drops the entire models.json. The runtime falls through to built-in models only — every other custom provider (codex, xai, github-copilot, etc.) registered in the same file is also dropped if any single entry lacks apiKey.

This is a silent failure: nothing logs at warn/error level, the file just stops loading.

Reproduction

  1. Install OpenClaw 2026.4.23 (commit a979721).
  2. Set NVIDIA_API_KEY env var.
  3. Run openclaw models list.

Expected: nvidia/nemotron-3-super-120b-a12b row tagged configured. Actual: row tagged configured,missing.

Direct probe:

ModelRegistry.find('nvidia', 'nemotron-3-super-120b-a12b')
// => null
// Underlying error: Failed to load models.json:
//   Provider nvidia: "apiKey" is required when defining custom models

Root cause

@openclaw/nvidia-provider's catalog.buildProvider writes { baseUrl, api, models, ... } to models.json without an apiKey field.

pi-coding-agent's validateConfig() (lines 373-380 of model-registry.js) requires apiKey for any non-built-in provider that declares custom models. When apiKey is missing, it throws, and the registry falls back to built-in models only, silently discarding every other custom provider entry in the file.

Suggested fix

The plugin should write apiKey: "NVIDIA_API_KEY" (matching the env var name declared in the plugin's auth manifest) when persisting catalog.buildProvider output. OpenClaw's resolveUsableCustomProviderApiKey already recognizes the bare env-var name and resolves to process.env.NVIDIA_API_KEY at infer time.

Alternatively: don't write to models.json at all and let runtime resolution use the catalog hook. The same fix likely applies to any other plugin whose provider id matches the model id prefix.

Environment

  • OpenClaw 2026.4.23 (commit a979721)
  • Linux container
  • Node 22.22.1

Workaround

Configure the provider with an explicit apiKey field in models.json, or in openclaw.json under models.providers.<id>.


Reported by FortiumPartners after running into both bugs while wiring NVIDIA NIM into a multi-agent NemoClaw deployment. Happy to test patches.

extent analysis

TL;DR

The most likely fix is to add an apiKey field to the custom provider entry in models.json or update the @openclaw/nvidia-provider plugin to include the apiKey when writing to models.json.

Guidance

  • Verify that the NVIDIA_API_KEY environment variable is set and accessible to the OpenClaw application.
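  • Check the models.json file for any custom provider entries without an apiKey field and add the field, set either to the API key itself or to the bare env-var name (for example "NVIDIA_API_KEY"), which the issue says OpenClaw resolves from the environment at infer time.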
  • Consider updating the @openclaw/nvidia-provider plugin to include the apiKey field when writing to models.json, as suggested in the issue.
  • As a temporary workaround, configure the provider with an explicit apiKey field in models.json or in openclaw.json under models.providers.<id>.
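
The second guidance item can be automated with a small check over the parsed file. The sketch below is a hypothetical helper, not part of OpenClaw or pi-coding-agent. Because the exact layout of models.json is not shown in the issue, it accepts both a providers object keyed by id and a providers array whose entries carry an id field.

```typescript
interface ProviderEntry {
  id?: string;
  apiKey?: string;
  models?: unknown;
}

// Returns the names of providers that define models but lack an apiKey.
function findProvidersMissingApiKey(file: { providers?: unknown }): string[] {
  const raw = file.providers;
  const named: { name: string; entry: ProviderEntry }[] = [];
  if (Array.isArray(raw)) {
    // Array layout: each entry is expected to carry its own id.
    for (const entry of raw as ProviderEntry[]) {
      named.push({ name: entry.id ?? "(unnamed)", entry });
    }
  } else if (raw && typeof raw === "object") {
    // Object layout: the key is the provider id.
    const map = raw as Record<string, ProviderEntry>;
    for (const name of Object.keys(map)) {
      named.push({ name, entry: map[name] });
    }
  }
  return named
    .filter(({ entry }) => Array.isArray(entry.models) && entry.models.length > 0 && !entry.apiKey)
    .map(({ name }) => name);
}

// Example input: one valid entry and one missing apiKey.
const parsed = JSON.parse(
  '{"providers":{"xai":{"apiKey":"XAI_API_KEY","models":[{"id":"m1"}]},"nvidia":{"models":[{"id":"nemotron-3-super-120b-a12b"}]}}}'
);
const missing = findProvidersMissingApiKey(parsed);
```

To check a real file, read ~/.openclaw/agents/<id>/agent/models.json, parse it with JSON.parse, and pass the result to the helper.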

Example

The issue itself does not include a config snippet. As an illustration only (the surrounding structure and the "..." values are placeholders), adding an apiKey field to a custom provider entry in models.json might look like:

{
  "providers": [
    {
      "id": "nvidia",
      "baseUrl": "...",
      "api": "...",
      "models": "...",
      "apiKey": "NVIDIA_API_KEY"
    }
  ]
}

Notes

The issue seems to be specific to the @openclaw/nvidia-provider plugin and the pi-coding-agent's validateConfig() function. The suggested fix may need to be adapted for other plugins or custom providers.

Recommendation

Apply the workaround by adding an explicit apiKey field to the custom provider entry in models.json or in openclaw.json under models.providers.<id>, as this is a more immediate solution that can be implemented without modifying the plugin code.
