openclaw - ✅(Solved) Fix [Bug]: NVIDIA provider plugin omits apiKey field, silently breaks pi-coding-agent's models.json validation [1 pull request, 1 comment, 2 participants]

bautrey · 2026-04-27T20:00:37Z



GitHub stats

openclaw/openclaw#73013•Fetched 2026-04-28 06:28:41


Author: bautrey

Participants: bautrey, iot2edge

Timeline (top): referenced ×2, commented ×1, cross-referenced ×1

Error Message

ModelRegistry.find('nvidia', 'nemotron-3-super-120b-a12b')
// => null
// Underlying error: Failed to load models.json:
//   Provider nvidia: "apiKey" is required when defining custom models

Root Cause

@openclaw/nvidia-provider's catalog.buildProvider writes { baseUrl, api, models, ... } to models.json without an apiKey field.

pi-coding-agent's validateConfig() (lines 373-380 of model-registry.js) requires apiKey for any non-built-in provider that declares custom models. When apiKey is missing, it throws and the registry falls back to built-in models only, silently discarding every other custom provider entry in the file.
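
The all-or-nothing behavior described above can be illustrated with a minimal sketch. The types and the functions `validateProviders` and `loadCustomProviders` are hypothetical stand-ins, not pi-coding-agent's actual code. The layout (providers keyed by id), the base URLs, and the `api` values are assumptions. Only the error wording comes from the issue, and the "non-built-in" check is omitted for brevity.

```typescript
// Hypothetical shapes for illustration only; not pi-coding-agent's real types.
interface ProviderEntry {
  baseUrl: string;
  api: string;
  apiKey?: string;
  models: { id: string }[];
}

type ProvidersFile = Record<string, ProviderEntry>;

// Throws on the first entry that defines custom models without an apiKey.
function validateProviders(providers: ProvidersFile): void {
  for (const [name, entry] of Object.entries(providers)) {
    if (entry.models.length > 0 && !entry.apiKey) {
      throw new Error(
        `Provider ${name}: "apiKey" is required when defining custom models`,
      );
    }
  }
}

// Validation covers the whole file, so one bad entry discards every entry.
function loadCustomProviders(providers: ProvidersFile): ProvidersFile {
  try {
    validateProviders(providers);
    return providers;
  } catch {
    return {}; // fall back to built-in models only
  }
}

const file: ProvidersFile = {
  codex: {
    baseUrl: "https://example.invalid/codex", // placeholder
    api: "placeholder-api",                   // placeholder
    apiKey: "CODEX_MARKER",                   // placeholder marker
    models: [{ id: "m1" }],
  },
  nvidia: {
    baseUrl: "https://example.invalid/nvidia", // placeholder
    api: "placeholder-api",                    // placeholder
    models: [{ id: "nemotron-3-super-120b-a12b" }], // no apiKey: triggers the throw
  },
};

const loaded = loadCustomProviders(file);
console.log(Object.keys(loaded)); // the valid codex entry is lost along with nvidia
```

In this sketch the valid codex entry disappears only because it shares a file with the invalid nvidia entry, which mirrors the symptom in the report.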

Fix Action

Workaround

Configure the provider with an explicit apiKey field in models.json, or in openclaw.json under models.providers.<id>.

Reported by FortiumPartners after running into both bugs while wiring NVIDIA NIM into a multi-agent NemoClaw deployment. Happy to test patches.

PR fix notes

PR #73042: fix(nvidia): add NVIDIA_API_KEY marker to provider catalog output

Repository: openclaw/openclaw
Author: iot2edge
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/73042

Description (problem / solution / changelog)

Fixes #73013

Summary

Problem: buildNvidiaProvider() in extensions/nvidia/provider-catalog.ts returns a ModelProviderConfig without an apiKey field. When that config is persisted to ~/.openclaw/agents/<id>/agent/models.json, pi-coding-agent's validateConfig() rejects the entry because non-built-in providers with custom models must declare apiKey. The whole models.json then fails validation, silently dropping every other custom provider entry in the same file (codex, xai, github-copilot, etc.) and falling through to built-in models only.
Why it matters: A single missing field on one bundled plugin silently disables every custom provider a user has registered. There is no warn/error log — the file just stops loading. Reporter discovered this only by direct ModelRegistry.find('nvidia', ...) probing.
What changed: Added apiKey: "NVIDIA_API_KEY" to the object returned by buildNvidiaProvider(). This is the bare env-var name; OpenClaw's resolveUsableCustomProviderApiKey already recognizes that form and resolves to process.env.NVIDIA_API_KEY at infer time. Same pattern already used by extensions/codex/provider-catalog.ts:68 (apiKey: CODEX_APP_SERVER_AUTH_MARKER). Added a regression test asserting the field is present.
What did NOT change (scope boundary): Auth manifest (already declares envVar: "NVIDIA_API_KEY" in extensions/nvidia/index.ts:21). Model definitions, baseUrl, api, contextWindow, costs. Other provider plugins (the same omission may exist elsewhere — left for follow-up issues so this PR stays focused). Did not change resolveUsableCustomProviderApiKey or the validation contract — the fix conforms to existing behavior.
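
A minimal sketch of the shape of the change, assuming a simplified config type. `ProviderConfigSketch` and `buildNvidiaProviderSketch` are hypothetical names, and the `baseUrl` and `api` values are placeholders rather than the values in `extensions/nvidia/provider-catalog.ts`. Only the `apiKey: "NVIDIA_API_KEY"` field and the model id come from the PR description.

```typescript
// Simplified stand-in for ModelProviderConfig; the real type has more fields.
interface ProviderConfigSketch {
  baseUrl: string;
  api: string;
  apiKey: string;
  models: { id: string }[];
}

// Bare env-var name used as a marker; the secret itself is never written out.
const NVIDIA_API_KEY_MARKER = "NVIDIA_API_KEY";

function buildNvidiaProviderSketch(): ProviderConfigSketch {
  return {
    baseUrl: "https://example.invalid/v1", // placeholder, not the real endpoint
    api: "placeholder-api",                // placeholder, not the real api id
    apiKey: NVIDIA_API_KEY_MARKER,         // the field the fix adds
    models: [{ id: "nemotron-3-super-120b-a12b" }],
  };
}

console.log(buildNvidiaProviderSketch().apiKey); // prints "NVIDIA_API_KEY"
```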

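Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor required for the fix
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

Scope (select all touched areas)

- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [x] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra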

Linked Issue/PR

Closes #73013
Related #
This PR fixes a bug or regression

Root Cause

Root cause: When catalog.buildProvider is called and its result is persisted to models.json, downstream consumers (pi-coding-agent's validateConfig()) require apiKey for any non-built-in provider that declares custom models. buildNvidiaProvider() omitted that field, so the persisted entry failed validation. The validation failure isn't local to NVIDIA — it throws synchronously, which causes the entire models.json parse to fail, silently dropping every other custom provider entry in the same file.
Missing detection / guardrail: extensions/nvidia/provider-catalog.test.ts only asserted baseUrl, api, and the model id list. It did not assert the presence of apiKey, so the missing field never tripped a test.
Contributing context (if known): Several other provider plugins already include apiKey in their buildProvider output — codex uses CODEX_APP_SERVER_AUTH_MARKER, anthropic uses an actual credential value in provider-discovery.ts. The convention exists; nvidia just missed it.

Regression Test Plan

Coverage level that should have caught this:
- [x] Unit test
- [ ] Seam / integration test
- [ ] End-to-end test
- [ ] Existing coverage already sufficient
Target test or file: extensions/nvidia/provider-catalog.test.ts
Scenario the test should lock in: buildNvidiaProvider().apiKey === "NVIDIA_API_KEY" — explicit assertion that the env-var marker is present and exactly matches the env-var name declared in the auth manifest.
Why this is the smallest reliable guardrail: buildNvidiaProvider is a pure function with no I/O. Asserting one field is enough; the wider validation contract is covered by pi-coding-agent's own tests.
Existing test that already covers this (if any): None — the existing builds the bundled NVIDIA provider defaults test asserts baseUrl, api, and model ids, but never reads apiKey.
If no new test is added, why not: N/A — 1 new test added.
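
The scenario in the plan above could be sketched as follows. This is not the test added in the PR: `authManifest` and `buildProvider` are hypothetical stand-ins for what the real test would import, and a small local helper replaces the vitest assertions so the sketch runs without a test framework.

```typescript
// Stand-ins for the values the real test would import; both are assumptions.
const authManifest = { envVar: "NVIDIA_API_KEY" };
const buildProvider = () => ({
  apiKey: "NVIDIA_API_KEY",
  models: [{ id: "nemotron-3-super-120b-a12b" }],
});

// Minimal assertion helper so the sketch needs no test framework.
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

const provider = buildProvider();

// The marker must be present with the exact expected value...
assertEqual(provider.apiKey, "NVIDIA_API_KEY", "apiKey marker");
// ...and must match the env-var name declared in the auth manifest.
assertEqual(provider.apiKey, authManifest.envVar, "marker matches manifest envVar");
```

Checking the marker against the manifest value, rather than only against a literal, would also catch a future rename on one side but not the other.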

User-visible / Behavior Changes

A user who has registered other custom providers (codex, xai, github-copilot, etc.) in ~/.openclaw/agents/<id>/agent/models.json will no longer have those entries silently dropped because of NVIDIA's missing field.
For users who only configured NVIDIA: nvidia/nemotron-3-super-120b-a12b and the other three NVIDIA models now appear configured (instead of configured,missing) when NVIDIA_API_KEY is set in the env.
No defaults removed or renamed.

Diagram

models.json validation chain

Before:
[user sets NVIDIA_API_KEY, runs `openclaw models list`]
  -> models.json contains nvidia entry without apiKey
  -> pi-coding-agent validateConfig() throws
  -> models.json fails to load entirely
  -> codex/xai/copilot/... all silently dropped
  -> registry falls through to built-in models only
  -> nvidia/* shows as `configured,missing`

After:
[user sets NVIDIA_API_KEY, runs `openclaw models list`]
  -> models.json contains nvidia entry with apiKey: "NVIDIA_API_KEY"
  -> pi-coding-agent validateConfig() accepts the entry
  -> models.json loads successfully; all custom providers stay registered
  -> resolveUsableCustomProviderApiKey() resolves "NVIDIA_API_KEY" to process.env at infer time
  -> nvidia/* shows as `configured`

Security Impact

New permissions/capabilities? No
Secrets/tokens handling changed? No (the field is the env-var name, not the secret value; existing resolveUsableCustomProviderApiKey reads the actual secret from process.env at infer time)
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation: N/A
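
To illustrate why storing the env-var name does not expose the secret, here is a hedged sketch of marker resolution. It is not the implementation of `resolveUsableCustomProviderApiKey`: the function name `resolveApiKeySketch`, the all-caps heuristic, and the explicit `env` parameter are assumptions made for this example.

```typescript
type Env = Record<string, string | undefined>;

// Treat an all-caps identifier as an env-var name; this heuristic is an assumption.
const ENV_NAME_PATTERN = /^[A-Z][A-Z0-9_]*$/;

function resolveApiKeySketch(apiKey: string, env: Env): string | undefined {
  if (ENV_NAME_PATTERN.test(apiKey)) {
    return env[apiKey]; // undefined when the variable is not set
  }
  return apiKey; // anything else is treated as a literal value
}

// The persisted file only ever holds the marker; the value lives in the environment.
const fakeEnv: Env = { NVIDIA_API_KEY: "example-value-not-a-real-key" };

console.log(resolveApiKeySketch("NVIDIA_API_KEY", fakeEnv)); // prints "example-value-not-a-real-key"
console.log(resolveApiKeySketch("NVIDIA_API_KEY", {}));      // prints undefined
```

Passing the environment in as a parameter keeps the sketch testable; real code would presumably read `process.env` at infer time, as the PR describes.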

Repro + Verification

Environment

OS: Linux (any)
Runtime/container: Node 22.14+
Model/provider: nvidia/nemotron-3-super-120b-a12b (NVIDIA NIM)
Integration/channel (if any): N/A — provider plugin
Relevant config (redacted): NVIDIA_API_KEY env var set; one or more custom providers registered in ~/.openclaw/agents/<id>/agent/models.json

Steps

Set NVIDIA_API_KEY env var.
Have at least one other custom provider entry in ~/.openclaw/agents/<id>/agent/models.json alongside the bundled nvidia entry.
Run openclaw models list.

Expected

nvidia/nemotron-3-super-120b-a12b row tagged configured.
All other custom providers in the same models.json continue to load.

Actual (before fix)

nvidia/nemotron-3-super-120b-a12b row tagged configured,missing.
ModelRegistry.find('nvidia', 'nemotron-3-super-120b-a12b') returns null.
All other custom providers in the same file silently disappear.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

$ pnpm test extensions/nvidia
 Test Files  2 passed (2)
      Tests  3 passed (3)   # 2 existing + 1 new (apiKey marker assertion)

pnpm check:changed is green: conflict markers, typecheck, lint, runtime import cycles all pass.

Human Verification (required)

Verified scenarios:
- Targeted vitest run for extensions/nvidia (3/3 pass locally on Node 22).
- Full pnpm check:changed gate (all lanes green).
- Cross-checked the convention by reading extensions/codex/provider-catalog.ts:68 (apiKey: CODEX_APP_SERVER_AUTH_MARKER) — the new apiKey: "NVIDIA_API_KEY" field follows the same shape and matches the auth manifest's declared envVar exactly.
- Confirmed extensions/nvidia/index.ts:21 declares envVar: "NVIDIA_API_KEY", so the marker string in buildNvidiaProvider is consistent with the auth method.
Edge cases checked:
- No other place writes to NVIDIA's persisted models.json (grep confirms buildNvidiaProvider is the only producer).
- Test still asserts the bundled defaults (baseUrl, api, model ids) so we haven't regressed coverage of the rest of the config shape.
What you did not verify:
- Live end-to-end against a real NVIDIA API key (openclaw models list against an actual environment with NVIDIA_API_KEY set and pi-coding-agent loaded). I do not have a NIM subscription. The fix is mechanical and matches an existing convention, and the unit test locks in the field shape; the live path (resolveUsableCustomProviderApiKey reading process.env.NVIDIA_API_KEY) is already covered by other plugins that use bare env-var markers.
- The same field omission may exist on other provider plugins. The reporter notes "the same fix likely applies to any other plugin whose provider id matches the model id prefix." I scoped this PR to NVIDIA only; auditing the rest is a separate task.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

(Both will be checked once review activity lands. Currently no bot review conversations on this PR.)

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: apiKey: "NVIDIA_API_KEY" is a marker value, not a real secret; if resolveUsableCustomProviderApiKey were ever changed to require a literal secret (instead of resolving env-var names), this entry would break.
- Mitigation: codex already uses the same bare-marker convention (CODEX_APP_SERVER_AUTH_MARKER), so any such future change would have to handle that case generically. We follow an established pattern.
Risk: writing the env-var name into a file might look confusing to users who inspect models.json directly.
- Mitigation: the value is a documented marker, not a credential. The file's existing style (codex's marker) sets the precedent. A docs improvement covering this convention would be a separate PR.
Risk: the same omission likely exists on other plugins (the reporter flagged this).
- Mitigation: scoped to NVIDIA in this PR per the original issue. If a maintainer wants a sweep, that's a separate refactor — happy to follow up if asked.

Changed files

extensions/nvidia/provider-catalog.test.ts (modified, +6/-0)
extensions/nvidia/provider-catalog.ts (modified, +12/-4)


Original issue report (openclaw/openclaw#73013)

Symptom

Any custom provider entry in ~/.openclaw/agents/<id>/agent/models.json without an apiKey field causes pi-coding-agent's validateConfig() to throw, which silently drops the entire models.json. The runtime falls through to built-in models only — every other custom provider (codex, xai, github-copilot, etc.) registered in the same file is also dropped if any single entry lacks apiKey.

This is a silent failure: nothing logs at warn/error level, the file just stops loading.

Reproduction

Install OpenClaw 2026.4.23 (commit a979721).
Set NVIDIA_API_KEY env var.
Run openclaw models list.

Expected: nvidia/nemotron-3-super-120b-a12b row tagged configured. Actual: row tagged configured,missing.

Direct probe:

ModelRegistry.find('nvidia', 'nemotron-3-super-120b-a12b')
// => null
// Underlying error: Failed to load models.json:
//   Provider nvidia: "apiKey" is required when defining custom models

Root cause

@openclaw/nvidia-provider's catalog.buildProvider writes { baseUrl, api, models, ... } to models.json without an apiKey field.

Suggested fix

The plugin should write apiKey: "NVIDIA_API_KEY" (matching the env var name declared in the plugin's auth manifest) when persisting catalog.buildProvider output. OpenClaw's resolveUsableCustomProviderApiKey already recognizes the bare env-var name and resolves to process.env.NVIDIA_API_KEY at infer time.

Alternatively: don't write to models.json at all and let runtime resolution use the catalog hook. The same fix likely applies to any other plugin whose provider id matches the model id prefix.

Environment

OpenClaw 2026.4.23 (commit a979721)
Linux container
Node 22.22.1

Workaround

Configure the provider with an explicit apiKey field in models.json, or in openclaw.json under models.providers.<id>.

Reported by FortiumPartners after running into both bugs while wiring NVIDIA NIM into a multi-agent NemoClaw deployment. Happy to test patches.

extent analysis

TL;DR

The most likely fix is to add an apiKey field to the custom provider entry in models.json or update the @openclaw/nvidia-provider plugin to include the apiKey when writing to models.json.

Guidance

Verify that the NVIDIA_API_KEY environment variable is set and accessible to the OpenClaw application.
Check the models.json file for any custom provider entry without an apiKey field and add the field. For the bundled NVIDIA entry, the PR uses the bare env-var name "NVIDIA_API_KEY" as the value rather than the secret itself.
Consider updating the @openclaw/nvidia-provider plugin to include the apiKey field when writing to models.json, as suggested in the issue.
As a temporary workaround, configure the provider with an explicit apiKey field in models.json or in openclaw.json under models.providers.<id>.
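
The check described in the guidance above could be sketched as a small helper that lists provider entries lacking an apiKey. The name `findProvidersMissingApiKey` is hypothetical, and the sketch assumes providers are keyed by id, as the `models.providers.<id>` path in the workaround suggests. Adapt it if the file uses a different layout.

```typescript
// Hypothetical minimal shape; real entries carry more fields.
interface EntrySketch {
  apiKey?: string;
  models?: unknown[];
}

// Returns the ids of entries that define models but have no apiKey.
function findProvidersMissingApiKey(
  providers: Record<string, EntrySketch>,
): string[] {
  return Object.entries(providers)
    .filter(([, entry]) => (entry.models?.length ?? 0) > 0 && !entry.apiKey)
    .map(([name]) => name);
}

// Sample data for illustration; ids other than "nvidia" are placeholders.
const sampleProviders: Record<string, EntrySketch> = {
  nvidia: { models: [{ id: "nemotron-3-super-120b-a12b" }] },
  other: { apiKey: "OTHER_API_KEY", models: [{ id: "m1" }] },
};

console.log(findProvidersMissingApiKey(sampleProviders)); // lists only "nvidia"
```

Running a check like this before the file is handed to the registry would surface the offending entry by name instead of letting the whole file drop silently.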

Example

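The issue does not include a config snippet. A hypothetical entry with an apiKey field added might look like the following. The "..." values are placeholders, and the surrounding layout is illustrative only: the models.providers.<id> path in the workaround suggests providers may be keyed by id rather than listed in an array.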

{
  "providers": [
    {
      "id": "nvidia",
      "baseUrl": "...",
      "api": "...",
      "models": "...",
      "apiKey": "NVIDIA_API_KEY"
    }
  ]
}

Notes

The issue seems to be specific to the @openclaw/nvidia-provider plugin and the pi-coding-agent's validateConfig() function. The suggested fix may need to be adapted for other plugins or custom providers.

Recommendation

Apply the workaround by adding an explicit apiKey field to the custom provider entry in models.json or in openclaw.json under models.providers.<id>, as this is a more immediate solution that can be implemented without modifying the plugin code.
