langchain - ✅(Solved) Fix ShellToolMiddleware may kill caller process group when create_process_group=False [3 pull requests, 2 comments, 2 participants]

GitHub stats
langchain-ai/langchain#36358 (fetched 2026-04-08 01:48:40)
Comments: 2
Participants: 2
Timeline events: 16
Reactions: 0
Timeline (top): cross-referenced ×3, labeled ×3, commented ×2, mentioned ×2

  • I am using ShellToolMiddleware internals with HostExecutionPolicy(create_process_group=False).
  • I expect timeout cleanup to terminate only the child shell process.
  • Instead, _kill_process uses process-group kill unconditionally via os.killpg(os.getpgid(child_pid), SIGKILL).
  • When create_process_group=False, the child shares the caller process group, so the kill target can include the caller and sibling processes.

Error Message

No Python exception is required to observe the bug. The bug is behavioral and visible from the printed output.

Example output:
create_process_group: False
current_pgid: 249430
child_pgid: 249430
kill_calls: [(249430, 9)]

Root Cause

  • _kill_process calls os.killpg(os.getpgid(child_pid), SIGKILL) unconditionally, regardless of the execution policy.
  • With HostExecutionPolicy(create_process_group=False), the child shell stays in the caller's process group, so os.getpgid(child_pid) returns the caller's own PGID (249430 for both in the example output).
  • Timeout cleanup therefore signals the whole shared group, which can include the caller and its sibling processes, instead of only the child shell.
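
The process-group sharing described above can be observed without LangChain. The following is a minimal sketch, assuming a POSIX system; it uses subprocess.Popen with start_new_session as a stand-in for a dedicated process group, which is an assumption for illustration and not necessarily how HostExecutionPolicy creates one. The helper name child_shares_caller_group is hypothetical.

```python
import os
import subprocess
import sys

# A child that only sleeps, so it is still alive while its group is inspected.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def child_shares_caller_group(new_group: bool) -> bool:
    """Spawn a child and report whether it is in the caller's process group."""
    # start_new_session=True makes the child call setsid(), which also puts it
    # in a new process group; False leaves it in the caller's group.
    child = subprocess.Popen(SLEEPER, start_new_session=new_group)
    try:
        return os.getpgid(child.pid) == os.getpgrp()
    finally:
        child.kill()  # signals this single PID, never the group
        child.wait()  # reap the child so no zombie is left behind


if __name__ == "__main__":
    print("inherited group is shared:", child_shares_caller_group(False))
    print("new session is shared:", child_shares_caller_group(True))
```

When the first call reports a shared group, any os.killpg aimed at the child's PGID is also aimed at the process that spawned it, which is the situation the issue describes.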

PR fix notes

PR #36359: fix(langchain): avoid shared process-group kill in shell middleware

Description (problem / solution / changelog)

Fixes #36358


Summary

This PR fixes a process termination safety bug in ShellToolMiddleware where cleanup could kill the caller process group when create_process_group is False.

Why this change

When the shell child is started without a dedicated process group, it can share the parent group. The existing cleanup path used group kill unconditionally, which could terminate unrelated processes including the caller. This is a high-impact availability risk.

What changed

  • Updated shell session kill logic to use group kill only when the child is in a different process group than the caller.
  • Added safe fallback to child-only kill when both share the same process group.
  • Added regression tests for both scenarios:
  • Shared process group: no group kill, child-only kill.
  • Dedicated process group: existing group kill behavior is preserved.

Validation

  • Targeted new tests passed.
  • Full shell tool unit test file passed.

AI assistance disclosure

This contribution was prepared with assistance from an AI coding agent. I reviewed, validated, and finalized the proposed changes and test coverage before submission.

Areas for careful review

  • Process-group detection behavior on Linux and other POSIX environments.
  • Any implications for existing timeout and shutdown flows in shell middleware.
  • Whether additional integration-level coverage is desirable for process cleanup behavior.

Changed files

  • libs/langchain_v1/langchain/agents/middleware/shell_tool.py (modified, +12/-5)
  • libs/langchain_v1/tests/unit_tests/agents/middleware/implementations/test_shell_tool.py (modified, +44/-0)

PR #36360: fix: ShellToolMiddleware may kill caller process group when create_process_group=False

Description (problem / solution / changelog)

Summary

When HostExecutionPolicy(create_process_group=False) is used, the child process shares the caller's process group. The _kill_process method was unconditionally using os.killpg() which kills the entire process group, potentially killing the caller and sibling processes.

Changes

  • Store the create_process_group setting from the policy in ShellSession
  • Use os.kill() (single process kill) when create_process_group=False
  • Use os.killpg() (process group kill) when create_process_group=True

Related Issue

Fixes #36358

Testing

The fix was verified with a simple test script that confirms:

  • When create_process_group=True: uses os.killpg (process group kill)
  • When create_process_group=False: uses process.kill() (single process kill)

This prevents killing the caller's process group when create_process_group=False.
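
The PR does not include its test script, so the following is only a sketch of how such a check could look. DemoSession is a hypothetical stand-in for ShellSession (not LangChain's implementation), and os.kill, os.killpg and os.getpgid are replaced with mocks so that nothing is actually signalled. The PID 4242 is arbitrary.

```python
import os
import signal
from types import SimpleNamespace
from unittest import mock


class DemoSession:
    """Hypothetical stand-in for ShellSession; stores the policy flag."""

    def __init__(self, process, create_process_group: bool) -> None:
        self._process = process
        self._create_process_group = create_process_group

    def _kill_process(self) -> None:
        if self._process is None:
            return
        if self._create_process_group:
            # The child leads its own group, so a group kill is safe.
            os.killpg(os.getpgid(self._process.pid), signal.SIGKILL)
        else:
            # The child shares the caller's group: signal the single PID.
            os.kill(self._process.pid, signal.SIGKILL)


def record_kill_calls(create_process_group: bool) -> dict:
    """Run _kill_process with mocked os functions and count the calls."""
    fake_process = SimpleNamespace(pid=4242)  # arbitrary PID, never signalled
    with mock.patch("os.kill") as kill, \
            mock.patch("os.killpg") as killpg, \
            mock.patch("os.getpgid", return_value=4242):
        DemoSession(fake_process, create_process_group)._kill_process()
    return {"kill": kill.call_count, "killpg": killpg.call_count}


if __name__ == "__main__":
    print(record_kill_calls(create_process_group=False))
    print(record_kill_calls(create_process_group=True))
```

Mocking the os functions keeps the check safe to run anywhere, at the cost of not exercising real process groups; the PR's own verification may differ.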

Changed files

  • libs/langchain_v1/langchain/agents/middleware/shell_tool.py (modified, +3/-1)

PR #36363: fix(langchain): prevent killpg from terminating caller when child shares process group

Description (problem / solution / changelog)

When create_process_group=False, the child shell inherits the caller's process group. The previous _kill_process implementation unconditionally called os.killpg(), which would kill the entire shared group — including the parent process and siblings.

Now we compare the child's PGID against the current process's PGID via os.getpgrp(). Group kill is only used when the child has its own dedicated process group; otherwise we fall back to killing the child process directly.

Fixes #36358
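
The PGID comparison described in this PR can be sketched as follows. This is an illustration under stated assumptions, not the PR's actual diff: terminate_child is a hypothetical helper, it takes a plain subprocess.Popen rather than a ShellSession, and it assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys

# A child that only sleeps, so there is something to terminate.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def terminate_child(process: subprocess.Popen, sig: int = signal.SIGKILL) -> str:
    """Group-kill only when the child has a process group of its own."""
    try:
        child_pgid = os.getpgid(process.pid)
    except ProcessLookupError:
        return "gone"  # the child has already exited and been reaped
    if child_pgid != os.getpgrp():
        # Dedicated group: killing the group cannot reach the caller.
        os.killpg(child_pgid, sig)
        return "group"
    # Shared group: fall back to signalling the child PID only.
    os.kill(process.pid, sig)
    return "child"


if __name__ == "__main__":
    shared = subprocess.Popen(SLEEPER)
    print(terminate_child(shared), shared.wait())

    dedicated = subprocess.Popen(SLEEPER, start_new_session=True)
    print(terminate_child(dedicated), dedicated.wait())
```

Compared with reading the policy flag, this check looks at the actual group membership at kill time, so it does not depend on how the child was started.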


Changed files

  • libs/langchain_v1/langchain/agents/middleware/shell_tool.py (modified, +12/-5)
  • libs/langchain_v1/tests/unit_tests/agents/middleware/implementations/test_shell_tool.py (modified, +56/-0)

Code Example

import os
import tempfile
from pathlib import Path

from langchain.agents.middleware._execution import HostExecutionPolicy
from langchain.agents.middleware.shell_tool import ShellSession


def main() -> None:
    policy = HostExecutionPolicy(
        create_process_group=False,
        command_timeout=0.1,
        termination_timeout=0.05,
    )

    session = ShellSession(
        workspace=Path(tempfile.mkdtemp(prefix="lc-shell-repro-")),
        policy=policy,
        command=("/bin/bash",),
        environment={},
    )

    kill_calls: list[tuple[int, int]] = []
    original_killpg = os.killpg
    try:
        # Safety: capture killpg call instead of actually killing the process group.
        os.killpg = lambda pgid, sig: kill_calls.append((pgid, sig))  # type: ignore[assignment]

        session.start()
        _ = session.execute("sleep 5", timeout=policy.command_timeout)

        current_pgid = os.getpgrp()
        child_pgid = os.getpgid(session._process.pid) if session._process else None

        print("create_process_group:", policy.create_process_group)
        print("current_pgid:", current_pgid)
        print("child_pgid:", child_pgid)
        print("kill_calls:", kill_calls)

        if not kill_calls:
            raise RuntimeError("Expected _kill_process to call os.killpg at least once.")
        if kill_calls[0][0] != current_pgid:
            raise RuntimeError(
                "Expected killpg target process group to match caller process group."
            )
    finally:
        os.killpg = original_killpg  # type: ignore[assignment]
        session.stop(0.2)


if __name__ == "__main__":
    main()


Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Related Issues / PRs

No response

System Info

System Information

OS: Linux
OS Version: #19~24.04.2-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 6 23:08:46 UTC 2
Python Version: 3.12.3 (main, Mar 3 2026, 12:15:18) [GCC 13.3.0]

Package Information

Optional packages not installed

  • langsmith
  • deepagents
  • deepagents-cli

extent analysis

Fix Plan

To fix the issue, we need to modify the _kill_process logic in libs/langchain_v1/langchain/agents/middleware/shell_tool.py (the file all three PRs change) so that it uses os.kill instead of os.killpg when create_process_group is False.

Here are the steps:

  • Make the policy's create_process_group value available to the kill logic. HostExecutionPolicy already accepts this setting, as the reproduction shows, so it does not need to be redefined.
  • Update _kill_process to check that value and use os.kill when it's False.

Code Changes

create_process_group is set on each HostExecutionPolicy instance, so reading it from the class (HostExecutionPolicy.create_process_group) would not reflect the configured value. The kill helper should accept the policy instance as an argument instead:

# Sketch only. HostExecutionPolicy comes from
# langchain.agents.middleware._execution and is not redefined here.
import os


def _kill_process(policy, process, sig):
    """Signal only the child unless the policy gave it a dedicated group."""
    if not process:
        return
    if not policy.create_process_group:
        # Child shares the caller's process group: signal the child PID only.
        os.kill(process.pid, sig)
    else:
        # Child has its own process group: signal the whole group.
        os.killpg(os.getpgid(process.pid), sig)

Then, update the caller of _kill_process to pass the policy instance:

# langchain.agents.middleware.shell_tool
import signal


class ShellSession:
    # ... existing code ...

    def stop(self, timeout):
        # ... existing code ...
        # Pass the session's policy so the helper can choose kill vs. killpg.
        _kill_process(self.policy, self._process, signal.SIGKILL)
        # ... existing code ...

Verification

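To verify the fix, run the provided example code and check the printed output. The kill_calls list should be empty when create_process_group is False, indicating that os.killpg is not called. Note that the reproduction script asserts the buggy behavior, so after the fix it is expected to print kill_calls: [] and then raise RuntimeError("Expected _kill_process to call os.killpg at least once."); that exception confirms the fix rather than signaling a failure.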

Extra Tips

  • Make sure to test the fix on different platforms, as process group behavior may vary.
  • Consider adding a test case to the langchain test suite to ensure this fix is not reverted in the future.
