gemini-cli - ✅(Solved) Fix Add integration tests for tool sandboxing with plans and tasks [2 pull requests, 1 participants]

jerop · 2026-04-08T15:50:31Z

[gemini-cli] PR #25298: refactor(core): abstract OS sandbox managers and fix virtual command permissions. Repository: google-gemini/gemini-cli. Author: ehedlund. State: closed | merged: False. Link: https://github.com/google-gemini/gemini-cli/pull/25298

Related Issues: Fixes https://github.com/google-gemini/gemini-cli/issues/24932.

Pre-Merge Checklist (as submitted): documentation and README updated (if needed); tests added/updated (if needed); breaking changes not noted; validated on macOS via npm run only. The npx, Docker, Podman, and Seatbelt boxes for macOS and all Windows and Linux boxes were left unchecked.

The full description and changed-file list for this PR appear under "PR fix notes".


GitHub stats

google-gemini/gemini-cli#24932 • Fetched 2026-04-09 08:17:12

Author: jerop
Participants: jerop
Assignees: DavidAPierce
Timeline (top): labeled ×4, assigned ×1, issue_type_added ×1

PR fix notes

PR #25298: refactor(core): abstract OS sandbox managers and fix virtual command permissions

Repository: google-gemini/gemini-cli
Author: ehedlund
State: closed | merged: False
Link: https://github.com/google-gemini/gemini-cli/pull/25298

Description (problem / solution / changelog)

Summary

This PR introduces a significant architectural refactor to the Sandbox Management system across all supported platforms (macOS, Linux, Windows), vastly improving security, maintainability, and testing infrastructure. Most importantly, it closes a critical security vulnerability regarding the escalation of virtual command permissions.

Details

Architectural Changes:

- Abstract Base Class: Introduced AbstractOsSandboxManager to standardize the command preparation pipeline (sanitization, overrides, path resolution) using the Template Method pattern. OS-specific managers (MacOsSandboxManager, LinuxSandboxManager, WindowsSandboxManager) now extend this base class, reducing duplicated orchestration logic.
- Path Utilities: Core path utilities (sanitizePaths, getPathIdentity) were relocated from the bloated sandboxManager.ts to a dedicated utils/paths.ts and renamed to deduplicateAbsolutePaths and toPathKey. This enforces consistent path case-insensitivity rules across the agent lifecycle.
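To make the Template Method idea concrete, here is a minimal sketch of a base class that fixes the order of the preparation steps and leaves only the final, OS-specific step abstract. Only the class names AbstractOsSandboxManager and LinuxSandboxManager come from the PR description; the interfaces, method names, and step bodies are assumptions made for illustration and are not the repository's actual code.

```typescript
// Hypothetical request/result shapes; not taken from the gemini-cli source.
interface SandboxRequest {
  command: string;
  args: string[];
  allowedPaths: string[];
}

interface PreparedCommand {
  program: string;
  args: string[];
}

abstract class AbstractOsSandboxManager {
  // Template method: every platform runs the same steps in the same order.
  prepareCommand(request: SandboxRequest): PreparedCommand {
    const sanitized = this.sanitize(request);
    const overridden = this.applyOverrides(sanitized);
    const resolved = this.resolvePaths(overridden);
    return this.buildSandboxedCommand(resolved);
  }

  // Shared step (assumed behavior): drop empty arguments.
  protected sanitize(request: SandboxRequest): SandboxRequest {
    return { ...request, args: request.args.filter((a) => a.length > 0) };
  }

  // Shared step: no-op by default, subclasses may override it.
  protected applyOverrides(request: SandboxRequest): SandboxRequest {
    return request;
  }

  // Shared step (assumed behavior): remove duplicate allowed paths.
  protected resolvePaths(request: SandboxRequest): SandboxRequest {
    const unique = request.allowedPaths.filter(
      (p, i, all) => all.indexOf(p) === i,
    );
    return { ...request, allowedPaths: unique };
  }

  // The only step each OS-specific manager must supply.
  protected abstract buildSandboxedCommand(
    request: SandboxRequest,
  ): PreparedCommand;
}

class LinuxSandboxManager extends AbstractOsSandboxManager {
  protected buildSandboxedCommand(request: SandboxRequest): PreparedCommand {
    const binds: string[] = [];
    for (const p of request.allowedPaths) {
      binds.push('--bind-try', p, p);
    }
    return {
      program: 'bwrap',
      args: [...binds, '--', request.command, ...request.args],
    };
  }
}

const prepared = new LinuxSandboxManager().prepareCommand({
  command: 'ls',
  args: ['-la', ''],
  allowedPaths: ['/work', '/work'],
});
```

Because the orchestration lives in prepareCommand, a fix to a shared step reaches every platform at once, which is the duplication reduction the description refers to.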

Behavioral & Security Fixes:

- Virtual Command Privilege Escalation (CRITICAL FIX): Previously, virtual __write commands were spoofed as cat on POSIX systems during early pipeline adjustments to bypass isToolApproved checks. This incorrectly granted the sandboxed tee process full workspaceWrite permissions. The AbstractOsSandboxManager now injects the targeted write path into the dynamic fileSystem permissions array before path resolution, keeping workspaceWrite = false and ensuring the sandbox restricts write access solely to the target file.
- Governance File Verification: touch() operations for governance files (.git, .gitignore, .geminiignore) are now universally enforced by the abstract base class prior to sandbox execution. Previously, macOS skipped this step, leaving workspaces vulnerable to malicious .git initialization when those files were absent.
- Windows Case-Insensitivity: isKnownSafeCommand now explicitly enforces case-insensitive matching against the approvedTools list.
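The privilege-escalation fix can be illustrated with a small sketch: instead of widening the sandbox to the whole workspace, a virtual __write adds only its target file to the writable list. The permission shape below (workspaceWrite plus a fileSystem object with read and write arrays) and the function name are assumptions for illustration; the PR text names only workspaceWrite, fileSystem, and __write.

```typescript
// Hypothetical permission and command shapes; not the gemini-cli types.
interface SandboxPermissions {
  workspaceWrite: boolean;
  fileSystem: { read: string[]; write: string[] };
}

interface VirtualCommand {
  command: string; // e.g. '__write' or '__read'
  targetPath: string;
}

// Returns a new permission set; the input object is never mutated.
function permissionsForVirtualCommand(
  base: SandboxPermissions,
  cmd: VirtualCommand,
): SandboxPermissions {
  if (cmd.command !== '__write') {
    return base;
  }
  const alreadyWritable = base.fileSystem.write.indexOf(cmd.targetPath) !== -1;
  return {
    // workspaceWrite is carried over untouched, so a read-only
    // workspace (such as Plan Mode) stays read-only.
    workspaceWrite: base.workspaceWrite,
    fileSystem: {
      read: [...base.fileSystem.read],
      write: alreadyWritable
        ? [...base.fileSystem.write]
        : [...base.fileSystem.write, cmd.targetPath],
    },
  };
}

const planMode: SandboxPermissions = {
  workspaceWrite: false,
  fileSystem: { read: ['/work'], write: [] },
};

const narrowed = permissionsForVirtualCommand(planMode, {
  command: '__write',
  targetPath: '/work/plans/plan.md',
});
```

In this sketch the only writable location after the call is the single target file, which matches the validation step of checking that the rest of the workspace remains read-only.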

Testing Enhancements:

- Deleted the monolithic sandboxManager.test.ts.
- Introduced abstractOsSandboxManager.test.ts and structurally standardized LinuxSandboxManager.test.ts, MacOsSandboxManager.test.ts, and WindowsSandboxManager.test.ts with consistent describe groupings for delegated methods.
- Added more test cases to sandboxManager.integration.test.ts, including coverage for Plan Mode.

How to Validate

1. Run the workspace test suite: npm run test -w @google/gemini-cli-core
2. Execute the global preflight: npm run preflight
3. Verify the behavior of virtual __write commands inside a restricted sandbox environment (e.g. Plan Mode) to ensure only the specified target file is writable and the rest of the workspace remains read-only.

Pre-Merge Checklist

Changed files

packages/core/src/policy/sandboxPolicyManager.ts (modified, +4/-4)
packages/core/src/sandbox/abstractOsSandboxManager.test.ts (added, +315/-0)
packages/core/src/sandbox/abstractOsSandboxManager.ts (added, +326/-0)
packages/core/src/sandbox/linux/LinuxSandboxManager.test.ts (modified, +59/-17)
packages/core/src/sandbox/linux/LinuxSandboxManager.ts (modified, +43/-139)
packages/core/src/sandbox/linux/bwrapArgsBuilder.test.ts (modified, +28/-2)
packages/core/src/sandbox/linux/bwrapArgsBuilder.ts (modified, +12/-2)
packages/core/src/sandbox/macos/MacOsSandboxManager.test.ts (modified, +62/-166)
packages/core/src/sandbox/macos/MacOsSandboxManager.ts (modified, +35/-118)
packages/core/src/sandbox/macos/seatbeltArgsBuilder.test.ts (modified, +1/-1)
packages/core/src/sandbox/macos/seatbeltArgsBuilder.ts (modified, +1/-1)
packages/core/src/sandbox/utils/sandboxReadWriteUtils.ts (modified, +18/-25)
packages/core/src/sandbox/windows/WindowsSandboxManager.test.ts (modified, +463/-413)
packages/core/src/sandbox/windows/WindowsSandboxManager.ts (modified, +103/-155)
packages/core/src/services/sandboxManager.integration.test.ts (modified, +608/-372)
packages/core/src/services/sandboxManager.test.ts (removed, +0/-465)
packages/core/src/services/sandboxManager.ts (modified, +0/-236)
packages/core/src/tools/shell.ts (modified, +4/-9)
packages/core/src/utils/paths.test.ts (modified, +59/-0)
packages/core/src/utils/paths.ts (modified, +39/-0)

PR #25307: test(core): improve sandbox integration test coverage and fix OS-specific failures

Repository: google-gemini/gemini-cli
Author: ehedlund
State: closed | merged: True
Link: https://github.com/google-gemini/gemini-cli/pull/25307

Description (problem / solution / changelog)

Summary

This PR significantly improves the integration test coverage for the sandbox environments across all major operating systems (Linux, macOS, Windows). In the process of expanding this test suite to rigorously verify sandbox boundaries and policies, several OS-specific failures were identified and resolved, including initialization inefficiencies and a Bubblewrap permission bug on Linux.

Details

Test Coverage Enhancements (Primary)

- Expanded sandboxManager.integration.test.ts to comprehensively exercise sandbox boundaries across Linux, macOS, and Windows.
- Added explicit verification for edge cases including recursive directory protection, symlink traversal restrictions, and governance file modification prevention.
- Added explicit verification for writing to authorized paths when the workspace is otherwise read-only (Plan Mode).
- Refactored Workspace Setup: Migrated the integration test suite from beforeAll to beforeEach for workspace initialization. Each test case now receives a fresh, isolated temporary directory to prevent state leakage (critical for Windows ACL checks). Added an afterEach cleanup block that specifically tracks and removes isolated test directories, drastically reducing the disk footprint during testing.
- Improved unit testing for bwrapArgsBuilder (Linux) and MacOsSandboxManager (macOS).
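The per-test workspace isolation described above can be sketched with two small helpers: one that creates a fresh temporary directory and records it, and one that removes everything recorded. The helper names and the directory prefix are hypothetical; the wiring into beforeEach and afterEach is shown only as a comment because the actual suite's setup code is not reproduced here.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Directories created by the current test, tracked for cleanup.
const createdWorkspaces: string[] = [];

// Creates a unique temporary directory and remembers it.
function createIsolatedWorkspace(prefix = 'sandbox-test-'): string {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix));
  createdWorkspaces.push(dir);
  return dir;
}

// Removes every tracked directory, including its contents.
function removeIsolatedWorkspaces(): void {
  while (createdWorkspaces.length > 0) {
    const dir = createdWorkspaces.pop() as string;
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

// Assumed wiring in a Vitest- or Jest-style suite (not executed here):
//   let workspace: string;
//   beforeEach(() => { workspace = createIsolatedWorkspace(); });
//   afterEach(() => { removeIsolatedWorkspaces(); });
```

Tracking the directories explicitly, rather than relying on a shared parent folder, means a test that fails midway still has its directory removed by the next cleanup pass.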

Sandbox Logic Fixes (Secondary)

Linux (bwrap) Security:
- Fixed a bug where write access to policy-authorized paths was incorrectly denied if the path did not yet exist and the command was not explicitly recognized as a write command.
- The builder now grants read-write access (--bind-try) to any path in policyAllowed and its parent directory, unless the command is an explicit read-only virtual command (__read), in which case the parent directory is safely bound as read-only.

Linux (bwrap) & Windows Optimization:
- Refactored sandbox initialization (ensureGovernanceFilesExist) to ensure that governance files (e.g., .git, .gitignore) are secured and verified exactly once per session workspace, rather than redundantly on every single command execution. This reduces disk I/O and improves performance for long-running sessions.

Windows:
- Updated tests to pass pre-existing parent directories to policy.allowedPaths instead of non-existent target files, satisfying the Windows Sandbox requirement that granular access can only be granted to existing filesystem objects.
- Refactored the native C# helper compilation (ensureHelperCompiled) to be a globally static initialization, ensuring it only compiles once per Node process.
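The Linux binding rule for policy-allowed paths can be sketched as a function that emits Bubblewrap arguments. The function name and signature are hypothetical and this is not the code in bwrapArgsBuilder.ts. The description states only that the parent directory becomes read-only for __read; how the allowed path itself is bound in that case is not stated, so this sketch leaves it read-write as a labeled assumption. --bind-try and --ro-bind-try are existing bwrap options that skip the mount when the source does not exist.

```typescript
import * as path from 'node:path';

// Builds bwrap bind arguments for policy-allowed paths (illustrative only).
function policyBindArgs(policyAllowed: string[], command: string): string[] {
  // Assumption: '__read' is the only command that downgrades the parent.
  const readOnlyParent = command === '__read';
  const args: string[] = [];
  for (const allowed of policyAllowed) {
    const parent = path.posix.dirname(allowed);
    // Parent first, so the more specific bind for the allowed path
    // is applied after it and is not hidden by the parent mount.
    args.push(readOnlyParent ? '--ro-bind-try' : '--bind-try', parent, parent);
    // Assumption: the allowed path itself keeps a read-write bind.
    args.push('--bind-try', allowed, allowed);
  }
  return args;
}

const writeArgs = policyBindArgs(['/work/out/report.txt'], 'ls');
const readArgs = policyBindArgs(['/work/out/report.txt'], '__read');
```

Binding the parent directory matters for the bug described above: a file that does not exist yet can only be created if its containing directory is writable inside the sandbox.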

How to Validate

Windows, Linux, macOS

Run integration tests:

npm run test -w @google/gemini-cli-core -- src/services/sandboxManager.integration.test.ts

Linux

Verify unit tests:

npm run test -w @google/gemini-cli-core -- src/sandbox/linux/bwrapArgsBuilder.test.ts

Changed files

packages/core/src/sandbox/linux/LinuxSandboxManager.ts (modified, +22/-8)
packages/core/src/sandbox/linux/bwrapArgsBuilder.test.ts (modified, +23/-3)
packages/core/src/sandbox/linux/bwrapArgsBuilder.ts (modified, +8/-5)
packages/core/src/sandbox/macos/MacOsSandboxManager.test.ts (modified, +1/-23)
packages/core/src/sandbox/windows/WindowsSandboxManager.ts (modified, +46/-35)
packages/core/src/services/sandboxManager.integration.test.ts (modified, +641/-377)

Issue body (raw)

We had some fixes to make the experimental tool sandboxing work well with plans (and tasks).

We need to add integration tests to ensure these do not regress, ideally before promoting tool sandboxing from experimental to stable.

extent analysis

TL;DR

Add integration tests to ensure the stability of the experimental tool sandboxing feature before promoting it to stable.

Guidance

- Review the changes made in the provided pull requests (24047, 24638, 24762) to understand the fixes and updates to the experimental tool sandboxing feature.
- Identify key scenarios and edge cases that should be covered by integration tests to prevent regressions.
- Develop and run integration tests to validate the functionality and stability of the tool sandboxing feature in conjunction with plans and tasks.
- Consider prioritizing tests based on the most critical fixes and updates made in the referenced pull requests.

Example

No specific code example can be provided without more context, but tests should ideally cover various interactions between tool sandboxing, plans, and tasks.

Notes

The exact nature and scope of the integration tests will depend on the specific requirements and functionality of the experimental tool sandboxing feature, as well as the changes made in the referenced pull requests.

Recommendation

Apply workaround: Develop and implement comprehensive integration tests before promoting the experimental tool sandboxing feature to stable, to ensure its reliability and prevent potential regressions.
