openclaw - ✅(Solved) Fix [Bug]: `openclaw doctor --repair` Re-embeds Sensitive Tokens in Systemd Service [3 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#66219 · Fetched 2026-04-14 05:38:48
Comments: 0 · Participants: 1 · Timeline: 6 (cross-referenced ×3, labeled ×2, closed ×1) · Reactions: 0

The OpenClaw CLI flags tokens embedded in the systemd service file as a security issue and recommends running openclaw gateway install --force or openclaw doctor --repair. Both commands write the same tokens back into the unit, creating an endless loop of ineffective repairs.

Root Cause

The repair paths (openclaw gateway install --force and openclaw doctor --repair) regenerate the unit through the same writer, which re-emits secrets sourced from the state-dir .env as inline Environment= lines. The service audit then flags the same embedded token again, so the warning never clears.

Fix Action

Workaround

Manually edit ~/.config/systemd/user/openclaw-gateway.service to remove the Environment= lines containing secrets and replace them with:

EnvironmentFile=/root/.openclaw/.env

Then run systemctl --user daemon-reload and systemctl --user restart openclaw-gateway. Note that running openclaw doctor --repair again will revert this manual fix and re-embed the tokens.

PR fix notes

PR #66249: fix(daemon): avoid inline dotenv secrets in systemd unit during service repair

Description (problem / solution / changelog)

Summary

Fixes #66219.

openclaw doctor --repair / gateway service rewrites could keep reintroducing inline Environment=... secret values (for example OPENCLAW_GATEWAY_TOKEN) when those values came from state-dir .env. This caused recurring embedded-token findings and a repair loop.

This PR changes Linux systemd unit rendering to:

  • add EnvironmentFile=<state-dir>/.env when state-dir dotenv values exist
  • remove matching dotenv-backed key/value pairs from inline Environment= lines
  • keep non-dotenv service env (for example OPENCLAW_GATEWAY_PORT) inline
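The three rendering rules above can be illustrated with a minimal sketch. It is not the code from the PR: the ServiceEnvArgs type, the renderServiceEnvLines name, and the assumption that values need no systemd quoting are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical types and names for illustration; not OpenClaw's actual API.
interface ServiceEnvArgs {
  environment: Record<string, string>; // everything the service needs at runtime
  dotenv: Record<string, string>;      // key/value pairs parsed from <state-dir>/.env
  dotenvPath: string;                  // absolute path to that .env file
}

// Returns the [Service] env lines: one EnvironmentFile= line (only if the
// dotenv file supplies anything) plus inline Environment= lines for the rest.
// Assumption: values need no systemd quoting or escaping.
function renderServiceEnvLines(args: ServiceEnvArgs): string[] {
  const lines: string[] = [];
  if (Object.keys(args.dotenv).length > 0) {
    lines.push(`EnvironmentFile=${args.dotenvPath}`);
  }
  for (const key of Object.keys(args.environment).sort()) {
    const value = args.environment[key];
    // Skip pairs that the dotenv file already provides with the same value.
    if (args.dotenv[key] === value) continue;
    lines.push(`Environment=${key}=${value}`);
  }
  return lines;
}
```

With a token present in both maps and a port present only in environment, the sketch emits the EnvironmentFile= line followed by the inline port, and the token never appears in the unit text.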

Why

The service audit treats inline OPENCLAW_GATEWAY_TOKEN as embedded and recommends reinstall. If reinstall writes the same token inline again, the warning never clears.

Using EnvironmentFile for dotenv-sourced secrets keeps runtime behavior while avoiding plaintext secret embedding in the unit file.
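To make the audit behavior concrete, the following sketch shows one way an inline-secret check could work. The findInlineSecretKeys helper is hypothetical and is not the audit implemented in OpenClaw; it also assumes one unquoted assignment per Environment= line, which is narrower than what systemd accepts.

```typescript
// Hypothetical helper: returns the sensitive keys that appear as inline
// Environment=KEY=value assignments in a unit file's text.
// Assumption: one unquoted assignment per Environment= line.
function findInlineSecretKeys(unitText: string, sensitiveKeys: string[]): string[] {
  const sensitive = new Set(sensitiveKeys);
  const found: string[] = [];
  for (const rawLine of unitText.split("\n")) {
    const line = rawLine.trim();
    // "EnvironmentFile=" does not match this prefix, so file references pass.
    if (!line.startsWith("Environment=")) continue;
    const assignment = line.slice("Environment=".length);
    const eq = assignment.indexOf("=");
    if (eq <= 0) continue;
    const key = assignment.slice(0, eq);
    if (sensitive.has(key) && found.indexOf(key) === -1) found.push(key);
  }
  return found;
}
```

A unit containing Environment=OPENCLAW_GATEWAY_TOKEN=<RAW_TOKEN> yields a finding, while a unit that only references EnvironmentFile= yields none. That is why a repair which rewrites the same inline line can never clear the warning.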

Changes

  • src/daemon/systemd.ts
    • detect state-dir dotenv vars
    • pass environmentFiles to unit renderer
    • filter dotenv-sourced entries from inline environment map
  • src/daemon/systemd-unit.ts
    • support rendering EnvironmentFile= lines
  • src/daemon/service-types.ts
    • extend GatewayServiceRenderArgs with environmentFiles?: string[]
  • tests:
    • src/daemon/systemd.test.ts (new regression for stageSystemdService)
    • src/daemon/systemd-unit.test.ts (new rendering coverage)

Validation

Ran:

  • PATH=$HOME/.nvm/versions/node/v22.14.0/bin:$PATH pnpm test src/daemon/systemd-unit.test.ts src/daemon/systemd.test.ts

Result:

  • unit-fast project: pass
  • daemon project: pass (46 tests)

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/config/state-dir-dotenv.ts (modified, +22/-18)
  • src/daemon/service-types.ts (modified, +1/-0)
  • src/daemon/systemd-unit.test.ts (modified, +16/-0)
  • src/daemon/systemd-unit.ts (modified, +16/-0)
  • src/daemon/systemd.test.ts (modified, +112/-0)
  • src/daemon/systemd.ts (modified, +54/-1)

PR #66295: Systemd: keep managed env in drop-in so upgrades preserve user directives

Description (problem / solution / changelog)

Summary

  • Problem: Every openclaw update regenerates ~/.config/systemd/user/openclaw-gateway.service and silently drops user-added EnvironmentFile= or extra Environment= directives. The next gateway start then crash-loops on SecretRefResolutionError: Environment variable "KIMI_API_KEY" is missing or empty (or whichever secret the lost env file was carrying).
  • Why it matters: Every minor upgrade re-bites the same users, the failure mode is misleading (SecretRefResolutionError looks like fresh secrets misconfiguration, not "your unit was rewritten 2 minutes ago"), and the existing preservation pipeline can't reliably recover inline env when the read path encounters permission or timing issues.
  • What changed: Managed env (the OPENCLAW_SERVICE_MANAGED_ENV_KEYS set) now lives in its own OpenClaw-owned drop-in at <unit>.d/openclaw-managed.conf. The main unit is written once on fresh install and from then on is user-owned; upgrades only touch ExecStart= if the entry path drifted (targeted single-line edit) and leave everything else byte-identical. Legacy units with inline managed env get migrated into the new layout on the first install/update under the new code, with a .bak captured so operators can diff.
  • What did NOT change (scope boundary): macOS launchd, Windows Scheduled Task, and the CLI entry points. This PR is systemd-only. Issue #45163 is the launchd analog and needs a different mechanism (launchd has no drop-in equivalent); Windows has a similar shape. Both are out of scope here. The managed-env tracking mechanism (OPENCLAW_SERVICE_MANAGED_ENV_KEYS in src/commands/daemon-install-helpers.ts) is reused as-is — this PR only changes where that set is written.

Fixes #66248.

Change Type (select all)

  • Bug fix
  • Refactor required for the fix

Scope (select all touched areas)

  • Gateway / orchestration
  • Auth / tokens
  • CI/CD / infra

Linked Issue/PR

  • Closes #66248
  • Related #41914 (update removes unit entirely — different symptom, related lifecycle bug)
  • Related #66219 (doctor --repair re-embeds sensitive tokens — same root cause: regeneration loses user state)
  • Related #54521, #57104 (inlining .env secrets / stale tokens in unit — relieved by moving managed keys to a drop-in that owners know not to hand-edit)
  • Related #53926 (install-time EnvironmentFile path mismatch)
  • Related #45163 (same pattern on macOS LaunchAgent — NOT addressed here, needs its own fix)
  • This PR fixes a bug or regression

Root Cause

  • Root cause: writeSystemdUnit() (src/daemon/systemd.ts) rewrites the entire unit file on every install/update. Managed env is written as inline Environment= lines alongside user-added directives, so any regeneration that misses a user's EnvironmentFile= (for permissions, timing, or path-resolution reasons at read time) silently strips it. The collectPreservedExistingServiceEnvVars() helper in daemon-install-helpers.ts is meant to preserve inline user env, but (a) it depends on readSystemdServiceExecStart() successfully reading the previous unit and resolving every referenced EnvironmentFile=, and (b) even when it works, it loses the EnvironmentFile= directive itself — only the resolved key/value pairs are fed forward, inline-ified, and often stripped again next round.
  • Missing detection / guardrail: No test or CI gate asserted that user-added directives in the main unit survive an install → install round-trip. The unit generator didn't distinguish OpenClaw-owned content from user-owned content even though the OPENCLAW_SERVICE_MANAGED_ENV_KEYS sentinel already tracks exactly that distinction.
  • Contributing context (if known): The pattern has bitten users across multiple minor upgrades (documented locally as far back as 2026.4.2, reproduced again on 2026.4.12). The gateway install --force and openclaw doctor --fix paths flow through the same writer, so the same behavior shows up as several different-looking issues (#41914, #66219, #54521, #57104).

Regression Test Plan

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
  • Target test or file: src/daemon/systemd-unit.test.ts (pure-function coverage for strip/split/drop-in/ExecStart helpers) + src/daemon/systemd.test.ts (integration against real tempdir for stageSystemdService fresh/migrate/no-op paths).
  • Scenario the test should lock in: user-added EnvironmentFile= and Environment=USER_ADDED_VAR=... in the main unit survive a legacy-to-new-layout install; ExecStart drift is updated with a single-line edit; repeat installs with identical inputs leave the main unit byte-identical (no .bak appears).
  • Why this is the smallest reliable guardrail: the pure-function tests pin the transform invariants (strip preserves everything outside the managed set, ExecStart update touches only that one line, drop-in output is deterministic). The integration tests pin the file-system wiring. Together they cover the load-bearing behavior without standing up systemctl.
  • Existing test that already covers this (if any): none — systemd-unit.test.ts and systemd.test.ts previously had no coverage for the regen/preservation contract.

User-visible / Behavior Changes

  • A new drop-in file appears at ~/.config/systemd/user/openclaw-gateway.service.d/openclaw-managed.conf after the next install/update. It carries OPENCLAW_GATEWAY_TOKEN, OPENCLAW_SERVICE_VERSION, TELEGRAM_BOT_TOKEN, and any other keys listed in OPENCLAW_SERVICE_MANAGED_ENV_KEYS.
  • The main unit no longer contains those keys after the migration runs. A .bak of the pre-migration unit is written alongside it.
  • openclaw gateway uninstall now removes the drop-in and its <unit>.d/ directory (when empty).
  • readSystemdServiceExecStart (used by gateway status --deep and doctor) now merges drop-in values into its audit view so reporting stays consistent with what systemd actually loads.
  • No config or env changes required from users.
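The merged audit view described for readSystemdServiceExecStart can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: parseInlineEnvironment and mergeEnvView are hypothetical names, and the parser handles only one unquoted assignment per Environment= line. It relies on the systemd behavior that drop-ins are parsed after the main unit and that a later assignment to the same variable overrides an earlier one.

```typescript
// Hypothetical parser: collects Environment=KEY=value assignments from unit text.
function parseInlineEnvironment(unitText: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of unitText.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("Environment=")) continue;
    const assignment = line.slice("Environment=".length);
    const eq = assignment.indexOf("=");
    if (eq <= 0) continue;
    env[assignment.slice(0, eq)] = assignment.slice(eq + 1);
  }
  return env;
}

// Merges the main unit's env with the drop-in's env. Drop-in values win
// because systemd applies drop-ins after the main unit. A null drop-in
// (absent or unreadable) falls back to the main-unit-only view.
function mergeEnvView(mainUnit: string, dropIn: string | null): Record<string, string> {
  const merged = parseInlineEnvironment(mainUnit);
  if (dropIn === null) return merged;
  const extra = parseInlineEnvironment(dropIn);
  for (const key of Object.keys(extra)) {
    const value = extra[key];
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}
```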

Diagram

Before:
openclaw update
  -> writeSystemdUnit(env) rewrites ENTIRE main unit
  -> user EnvironmentFile= / Environment= lost
  -> gateway restart -> SecretRefResolutionError

After:
First install:
  -> main unit written once with only user/unmanaged env inline
  -> drop-in written at <unit>.d/openclaw-managed.conf with managed env + sentinel

Subsequent update:
  -> main unit read
  -> inline managed env (if legacy) stripped into drop-in (one-time migration, .bak taken)
  -> ExecStart updated ONLY if the entry path drifted (single-line edit)
  -> everything else in main unit byte-identical
  -> drop-in rewritten freely with latest managed values
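The one-time migration step in the diagram can be sketched as a pure text transform. The names below (SplitResult, splitManagedEnv, inlineKeyOf) and the drop-in header wording are assumptions for illustration; the PR's own helper is stripManagedEnvFromSystemdUnit, whose exact signature is not shown in the description.

```typescript
interface SplitResult {
  mainUnit: string;      // unit text with managed Environment= lines removed
  dropIn: string | null; // drop-in text, or null when nothing was moved
}

// Returns KEY for a line of the form Environment=KEY=value, else null.
function inlineKeyOf(line: string): string | null {
  const trimmed = line.trim();
  if (!trimmed.startsWith("Environment=")) return null;
  const assignment = trimmed.slice("Environment=".length);
  const eq = assignment.indexOf("=");
  return eq > 0 ? assignment.slice(0, eq) : null;
}

// Moves Environment= lines whose key is exactly in managedKeys into a drop-in.
// Keys are compared whole, so a user key sharing a prefix with a managed key
// (e.g. OPENCLAW_GATEWAY_TOKEN_BACKUP) stays in the main unit.
function splitManagedEnv(unitText: string, managedKeys: string[]): SplitResult {
  const managed = new Set(managedKeys);
  const kept: string[] = [];
  const moved: string[] = [];
  for (const line of unitText.split("\n")) {
    const key = inlineKeyOf(line);
    if (key !== null && managed.has(key)) moved.push(line.trim());
    else kept.push(line);
  }
  if (moved.length === 0) return { mainUnit: unitText, dropIn: null };
  // Header text is a placeholder; the real drop-in header is not quoted in the PR.
  const dropIn = ["# Auto-managed; do not edit.", "[Service]", ...moved].join("\n") + "\n";
  return { mainUnit: kept.join("\n"), dropIn };
}
```

When no managed key is present, the sketch returns the input text unchanged, which mirrors the byte-identical guarantee the PR describes for repeat installs.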

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? Yes (managed secrets move from main unit to a managed drop-in; both live under the same user-owned ~/.config/systemd/user/ tree with identical 0644 permissions, so the trust boundary and storage location are effectively unchanged)
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: The drop-in file has the same owner and permissions as the main unit. Splitting managed state into its own file actually reduces the blast radius of accidental hand-edits or regen bugs: users who edit the main unit no longer risk clobbering OpenClaw-managed tokens, and regen bugs in the managed writer no longer risk clobbering user directives. The drop-in header explicitly flags the file as auto-managed.

Repro + Verification

Environment

  • OS: Ubuntu 24.04 (kernel 6.17), and reproducible in the repo's openclaw-prbuild LXC (unprivileged)
  • Runtime/container: systemd user service, Node 22 + pnpm 10.32.1
  • Model/provider: N/A (install/update path, no model involvement)
  • Integration/channel (if any): N/A
  • Relevant config (redacted): ~/.openclaw/workspace/.env with KIMI_API_KEY, OPENCLAW_GATEWAY_TOKEN, etc.; user-added EnvironmentFile=/home/<user>/.openclaw/workspace/.env in the main unit after a prior install.

Steps

  1. Install OpenClaw 2026.4.11 or earlier, let gateway install write the systemd unit.
  2. Add EnvironmentFile=/home/<user>/.openclaw/workspace/.env to the unit (needed because the shipped unit inlines tokens but does not pull supplementary secrets from the .env the rest of OpenClaw writes to).
  3. Run openclaw update --yes.
  4. Gateway restart attempt runs; service fails.

Expected

  • Gateway starts cleanly after the update, loading KIMI_API_KEY (or whichever secret the user-added EnvironmentFile= was carrying).
  • Main unit's user-added directives stay in place.

Actual (before this PR)

  • Main unit regenerated wholesale without the EnvironmentFile= directive.
  • Gateway crash-loops: [secrets] [SECRETS_RELOADER_DEGRADED] SecretRefResolutionError: Environment variable "KIMI_API_KEY" is missing or empty. Gateway failed to start: Error: Startup failed: required secrets are unavailable.

Evidence

  • Failing test/log before + passing after

Before — actual crash log captured on a real host that ran openclaw update from 2026.4.11 → 2026.4.12:

Apr 13 19:49:53 rocinante node[2776813]: [secrets] [SECRETS_RELOADER_DEGRADED] SecretRefResolutionError: Environment variable "KIMI_API_KEY" is missing or empty.
Apr 13 19:49:53 rocinante node[2776813]: Gateway failed to start: Error: Startup failed: required secrets are unavailable.
Apr 13 19:49:53 rocinante systemd[3958]: openclaw-gateway.service: Main process exited, code=exited, status=1/FAILURE
Apr 13 19:49:53 rocinante systemd[3958]: openclaw-gateway.service: Failed with result 'exit-code'.
Apr 13 19:49:59 rocinante systemd[3958]: openclaw-gateway.service: Scheduled restart job, restart counter is at 1.

The pre-upgrade unit had EnvironmentFile=/home/<user>/.openclaw/workspace/.env between Environment=GEMINI_API_KEY=… and Environment=HOME=…. The post-upgrade unit was missing that line entirely, causing the restart loop.

After — test suite in CT 112:

RUN v4.1.4 /home/solomon/openclaw-fork
✓ daemon src/daemon/systemd.test.ts (48 tests) 21ms
✓ unit-fast src/daemon/systemd-unit.test.ts (19 tests) 7ms

Test Files  2 passed (2)
     Tests  67 passed (67)

Key integration test names (see src/daemon/systemd.test.ts):

  • stageSystemdService drop-in migration > writes main unit + managed drop-in on a fresh install
  • stageSystemdService drop-in migration > migrates inline managed env out of a legacy unit while preserving user EnvironmentFile= and Environment=
  • stageSystemdService drop-in migration > leaves main unit byte-identical when nothing drifted between upgrades

Human Verification (required)

  • Verified scenarios:
    • Full test suite for src/daemon/ (67 tests) passes clean in the repo's openclaw-prbuild LXC (Ubuntu + systemd, pnpm 10.32.1, Node 22).
    • pnpm tsgo clean.
    • Repro reproduced on a live production gateway on 2026.4.11 → 2026.4.12 (see Evidence); same host then upgraded by hand, EnvironmentFile= restored manually, confirmed gateway comes up clean with the restored directive.
  • Edge cases checked:
    • Unit with no sentinel (OPENCLAW_SERVICE_MANAGED_ENV_KEYS) present — stripManagedEnvFromSystemdUnit is a no-op, nothing gets rewritten (covered in systemd-unit.test.ts).
    • Environment=OPENCLAW_GATEWAY_TOKEN_BACKUP=… (user-added entry that shares a prefix with a managed key) is NOT stripped (covered in systemd-unit.test.ts).
    • ExecStart matches current args → no write, no .bak (covered in systemd.test.ts).
    • Drop-in absent/unreadable during readSystemdServiceExecStart merge → audit view gracefully falls back to main-unit-only view (covered by an existing systemd.test.ts test whose call-count assertion I updated for the additional probe).
  • What I did not verify:
    • Windows Scheduled Task path and macOS LaunchAgent (intentionally out of scope — see #45163 for the launchd analog).
    • Interaction with operator-installed third-party drop-ins in the same <unit>.d/ directory beyond the openclaw-managed.conf we write (systemd's composition rules should handle this, but I haven't stress-tested with deliberately conflicting operator drop-ins).

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes (automatic one-time migration on first install/update under the new code; older OpenClaw versions reading a unit written by this code still see the ExecStart, and the drop-in is plain systemd — no OpenClaw-specific parsing required)
  • Config/env changes? No
  • Migration needed? Yes (automatic, no user action required)
  • If yes, exact upgrade steps:
    • Install the new OpenClaw version as usual (openclaw update --yes or openclaw gateway install --force).
    • On the first run, writeSystemdUnit notices the sentinel inline in the main unit, moves those lines into <unit>.d/openclaw-managed.conf, and writes a .bak of the pre-migration main unit.
    • Subsequent installs/updates leave the main unit byte-identical unless ExecStart has drifted.

Risks and Mitigations

  • Risk: The targeted ExecStart= rewrite could miss an edge-case formatting that parseSystemdExecStart handles but the line-level matcher doesn't.
    • Mitigation: updateExecStartInSystemdUnit uses startsWith("ExecStart=") and reconstructs the line through the same systemdEscapeArg path the fresh-install writer uses, so the output format stays consistent. Covered by a test asserting other lines (including Environment=, EnvironmentFile=, [Install]) are untouched when only ExecStart needs updating.
  • Risk: If an operator has manually populated <unit>.d/openclaw-managed.conf before upgrade (unlikely — the filename is OpenClaw-specific), the drop-in writer would overwrite it.
    • Mitigation: Drop-in header explicitly labels the file as auto-managed and links to this issue. Operator drop-ins should use a different filename — systemd composes all *.conf files alphabetically from <unit>.d/, so coexistence is trivial.
  • Risk: On the first upgrade under this PR, users will see a new .bak file appear and the main unit shrink. If they diff without context, it can look alarming.
    • Mitigation: The install-complete output now includes a Migrated inline managed env to drop-in line pointing at the drop-in path, plus the existing Previous unit backed up to line.
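The targeted ExecStart= rewrite discussed in the first risk can be sketched as a line-level replacement. This is only an illustration of the startsWith("ExecStart=") approach the mitigation mentions: replaceExecStartLine is a hypothetical name, and the sketch takes an already escaped command line instead of calling systemdEscapeArg.

```typescript
// Replaces only lines that begin with "ExecStart=" and leaves every other
// line untouched. If the existing line already matches, the returned text is
// identical to the input. Indented or otherwise unusual ExecStart lines are
// not matched, which is the edge case the risk above describes.
function replaceExecStartLine(unitText: string, escapedCommand: string): string {
  return unitText
    .split("\n")
    .map((line) => (line.startsWith("ExecStart=") ? `ExecStart=${escapedCommand}` : line))
    .join("\n");
}
```

Comparing the result with the input string is enough to decide whether a write (and a .bak) is needed at all.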

Changed files

  • src/daemon/systemd-unit.test.ts (modified, +223/-1)
  • src/daemon/systemd-unit.ts (modified, +184/-1)
  • src/daemon/systemd.test.ts (modified, +185/-1)
  • src/daemon/systemd.ts (modified, +247/-57)

PR #66444: fix(systemd): reconcile managed-env unit migrations

Description (problem / solution / changelog)

Summary

  • carry #66295 onto current main
  • fix the two real migration regressions from review: rebuild malformed units when ExecStart= is missing, and reconcile stale WorkingDirectory= in-place instead of leaving CHDIR failures behind
  • remove the redundant managed-env split and clean up the temp-home test leakage

Fixes #66248. Related #66219. Supersedes #66295.

Why this carry exists

#66295 had the right core direction: move OpenClaw-managed systemd env into a drop-in so upgrades stop clobbering user EnvironmentFile= and Environment= directives.

The branch still needed maintainer follow-up in three spots:

  1. malformed existing units with no ExecStart= were no longer repaired
  2. stale WorkingDirectory= values could survive migration and fail startup at CHDIR
  3. the new integration tests were leaving temp dirs behind

This carry fixes those and keeps the user-preserving drop-in design intact.
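The first two follow-up fixes can be pictured with a small decision sketch. Everything here is hypothetical (DesiredUnit, reconcileUnit, and the renderFresh callback are illustrative names, not identifiers from the PR); it only shows the shape of the logic: rebuild when ExecStart= is missing, otherwise patch WorkingDirectory= in place.

```typescript
interface DesiredUnit {
  execStart: string;        // already escaped command line
  workingDirectory: string; // directory the service should start in
}

// If the existing unit has no ExecStart= line it is treated as malformed and
// rebuilt from scratch via renderFresh. Otherwise only WorkingDirectory=
// lines are rewritten, so user-added directives elsewhere survive.
function reconcileUnit(
  existing: string,
  desired: DesiredUnit,
  renderFresh: (unit: DesiredUnit) => string,
): string {
  const lines = existing.split("\n");
  const hasExecStart = lines.some((line) => line.startsWith("ExecStart="));
  if (!hasExecStart) return renderFresh(desired);
  return lines
    .map((line) =>
      line.startsWith("WorkingDirectory=")
        ? `WorkingDirectory=${desired.workingDirectory}`
        : line,
    )
    .join("\n");
}
```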

Verification

  • pnpm test:serial src/daemon/systemd-unit.test.ts src/daemon/systemd.test.ts
  • pnpm build

Changelog

Added under ## Unreleased with:

  • Thanks @solomonneas.
  • Carry follow-up: @vincentkoc.

Changed files

  • CHANGELOG.md (modified, +2/-20)
  • src/daemon/systemd-unit.test.ts (modified, +282/-8)
  • src/daemon/systemd-unit.ts (modified, +258/-14)
  • src/daemon/systemd.test.ts (modified, +320/-113)
  • src/daemon/systemd.ts (modified, +236/-107)

Code Example

- Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. (Run
      `openclaw gateway install --force` to remove embedded service token.)

---

[Service]
...
Environment=OPENCLAW_GATEWAY_TOKEN=<RAW_TOKEN>
Environment=LLM_API_KEY=<RAW_KEY>
...

---

openclaw doctor output: 
Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. (Run `openclaw gateway install --force` to remove embedded service token.)

---

EnvironmentFile=/root/.openclaw/.env

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

The OpenClaw CLI identifies embedded tokens in the systemd service file as a security issue and recommends running openclaw gateway install --force or openclaw doctor --repair, but the repair re-embeds the same tokens, creating an endless loop of ineffective repairs.

Steps to reproduce

  1. Configure OpenClaw with sensitive tokens (e.g., OPENCLAW_GATEWAY_TOKEN, LLM_API_KEY) defined in ~/.openclaw/openclaw.json (even if referenced via placeholders like ${OPENCLAW_GATEWAY_TOKEN}).
  2. Install the gateway service: openclaw gateway install.
  3. Run openclaw doctor.
  4. Observe the warning:
    - Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. (Run
      `openclaw gateway install --force` to remove embedded service token.)
  5. Accept the prompt to "Update gateway service config to the recommended defaults now?" (or run openclaw doctor --repair).
  6. Inspect the generated systemd service file: cat ~/.config/systemd/user/openclaw-gateway.service.

Expected behavior

When openclaw doctor attempts to repair the "embedded token" issue, it should update the systemd service file to read secrets securely, such as by using EnvironmentFile=/path/to/.env, and should not write raw tokens into the Environment= directives of the unit file.

Actual behavior

The CLI correctly identifies the security risk but fails to resolve it during the repair process. The generated unit file explicitly sets the tokens in plain text:

[Service]
...
Environment=OPENCLAW_GATEWAY_TOKEN=<RAW_TOKEN>
Environment=LLM_API_KEY=<RAW_KEY>
...

Subsequent runs of openclaw doctor will continue to flag the same issue, creating an endless loop of ineffective repairs.

OpenClaw version

2026.4.12 (1c0672b)

Operating system

Linux (Debian-based, x64)

Install method

Systemd User Service (openclaw gateway install)

Model

google/gemini-3.1-pro-preview

Provider / routing chain

openclaw -> gemini

Additional provider/model setup details

No response

Logs, screenshots, and evidence

openclaw doctor output: 
Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled. (Run `openclaw gateway install --force` to remove embedded service token.)

Impact and severity

No response

Additional information

Workaround

Manually edit ~/.config/systemd/user/openclaw-gateway.service to remove the Environment= lines containing secrets and replace them with:

EnvironmentFile=/root/.openclaw/.env

Then run systemctl --user daemon-reload and systemctl --user restart openclaw-gateway. Note that running openclaw doctor --repair again will revert this manual fix and re-embed the tokens.

extent analysis

TL;DR

Manually editing the systemd service file to use an EnvironmentFile instead of embedding secrets directly may resolve the issue.

Guidance

  • The openclaw doctor command is not correctly updating the systemd service file to securely read secrets, leading to an endless loop of ineffective repairs.
  • To verify the issue, inspect the generated systemd service file after running openclaw doctor --repair and check for embedded tokens in plain text.
  • A potential workaround is to manually edit the ~/.config/systemd/user/openclaw-gateway.service file to remove the Environment= lines containing secrets and replace them with an EnvironmentFile directive.
  • After applying the manual fix, run systemctl --user daemon-reload and systemctl --user restart openclaw-gateway to apply the changes.

Example

The manual edit involves replacing lines like:

Environment=OPENCLAW_GATEWAY_TOKEN=<RAW_TOKEN>
Environment=LLM_API_KEY=<RAW_KEY>

with:

EnvironmentFile=/root/.openclaw/.env

assuming the secrets are stored in the .env file.

Notes

This workaround is not permanent: per the issue report, running openclaw doctor --repair again reverts the manual edit and re-embeds the tokens. The root cause is in the repair path's unit regeneration, which writes the secrets back as inline Environment= lines.

Recommendation

Apply the manual workaround as a temporary measure, and avoid re-running openclaw doctor --repair or openclaw gateway install --force until your installed version includes the fix from the linked PRs listed above (#66249, #66295, #66444).
