openclaw - 💡(How to fix) Fix [Feature]: Unixsocket Provider plugin

StepCodex · 2026-05-28T08:29:08Z


Summary

Add a new unixsocket provider plugin (extensions/unixsocket/) that implements the provider contract via createStreamFn, replacing HTTP with direct Unix domain socket communication.

Problem to solve

1. Mobile platforms are blocking HTTP model calls

A growing number of mobile OS vendors (Android, HarmonyOS, etc.) enforce strict network policies that prohibit on-device AI model inference over HTTP. This makes it impossible to use OpenClaw's existing HTTP-based providers (ollama, lmstudio, vllm) to call models running locally on phones.

2. On-device models are the future

Running models directly on-device offers major advantages over cloud-based inference:

Benefit	Description
Privacy & security	Data never leaves the device — conversations, files, and images are processed locally. No cloud upload, no data leakage, stronger compliance.
Offline capable	Works without internet connectivity — no cellular, no Wi-Fi, no base station required.
Faster response	Zero network round-trip latency. Local compute delivers smoother interaction.
Cost efficient	No API call fees, no data egress charges. Zero per-call cost long-term.
Full control	Customize model versions, parameters, and behavior without platform restrictions or API lock-in.
Low-latency interaction	Real-time conversation, local plugin integration, streaming output — ideal for device-local workflows.

As on-device compute grows more capable, we expect an explosion of local model deployments.

3. On-device daemons use Unix Socket — but OpenClaw doesn't support it

The vast majority of on-device AI daemons communicate over a Unix domain socket (UDS) using raw JSON rather than HTTP. This is the natural IPC choice for local services: no network stack overhead, filesystem-based access control, and no port conflicts.

However, OpenClaw's provider transport layer is entirely HTTP-based. There is no way to connect to:

On-device model daemons on phones / edge devices that expose a Unix socket
llama.cpp in its UDS mode
Ollama in its early UDS mode
Custom embedded daemons that skip HTTP for latency and security reasons

This gap blocks OpenClaw from being used as the agent gateway on mobile devices with local AI inference.

Proposed solution

How it works

Agent.streamFn()
  → registerProviderStreamForModel()
    → resolveProviderStreamFn()
      → plugin.createStreamFn(ctx)  ← returns custom StreamFn
        → connectSocket() → encodeFrame() → socket.write()
        → readFrame() → JSON.parse → handleStreaming()
        → push text_delta/toolcall_delta/done events

The plugin auto-detects two response formats without configuration:

Binary frame: [4B big-endian length header][UTF-8 JSON]
Newline-delimited JSON: {json}\n (used by llama.cpp and others)
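The two wire formats above could be handled by a codec along these lines. This is only a minimal sketch, not the plugin's actual frame.ts: the function names are hypothetical, and the detection heuristic (a message starting with `{` or `[` is treated as newline-delimited JSON, anything else as length-prefixed) is an assumption. That heuristic only holds while binary frames stay far below the roughly 1.5 GB size at which the first length byte would itself be 0x5b or 0x7b.

```typescript
// Hypothetical codec sketch for the two formats described above.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

/** Encode a value as [4-byte big-endian length][UTF-8 JSON]. */
function encodeFrame(payload: unknown): Uint8Array {
  const body = encoder.encode(JSON.stringify(payload));
  const frame = new Uint8Array(4 + body.length);
  new DataView(frame.buffer).setUint32(0, body.length, false); // false = big-endian
  frame.set(body, 4);
  return frame;
}

type Decoded = { messages: unknown[]; rest: Uint8Array };

/**
 * Decode every complete message in `buf` and return the unconsumed tail,
 * which the caller should prepend to the next chunk read from the socket.
 */
function decodeFrames(buf: Uint8Array): Decoded {
  const messages: unknown[] = [];
  let offset = 0;
  while (offset < buf.length) {
    const first = buf[offset];
    if (first === 0x7b || first === 0x5b) {
      // Assumed NDJSON: "{" or "[" up to the next "\n".
      const newline = buf.indexOf(0x0a, offset);
      if (newline === -1) break; // line not complete yet
      messages.push(JSON.parse(decoder.decode(buf.subarray(offset, newline))));
      offset = newline + 1;
    } else {
      // Assumed binary frame: 4-byte big-endian length, then the JSON body.
      if (buf.length - offset < 4) break; // header not complete yet
      const length = new DataView(buf.buffer, buf.byteOffset + offset, 4).getUint32(0, false);
      if (buf.length - offset - 4 < length) break; // body not complete yet
      const start = offset + 4;
      messages.push(JSON.parse(decoder.decode(buf.subarray(start, start + length))));
      offset = start + length;
    }
  }
  return { messages, rest: buf.subarray(offset) };
}
```

Returning the unconsumed tail keeps the decoder stateless, so a partial frame split across two socket reads is simply retried once more bytes arrive.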

It also supports both streaming (choices[0].delta) and non-streaming (choices[0].message) responses.
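One way to picture the streaming versus non-streaming detection is a small mapper from an OpenAI-style chunk to the events named in the flow above. This is a sketch under assumptions: the event shapes, the `toEvents` name, and the rule that a full `message` always ends the turn are illustrative, not taken from the plugin's stream.ts.

```typescript
// Hypothetical event shapes; only the event names come from the flow above.
type StreamEvent =
  | { type: "text_delta"; text: string }
  | { type: "toolcall_delta"; index: number; name: string | undefined; argumentsDelta: string | undefined }
  | { type: "done"; finishReason: string };

interface ToolCallPart { index?: number; function?: { name?: string; arguments?: string } }
interface ChoicePayload { content?: string | null; tool_calls?: ToolCallPart[] }
interface Choice { delta?: ChoicePayload; message?: ChoicePayload; finish_reason?: string | null }
interface Chunk { choices?: Choice[] }

function toEvents(chunk: Chunk): StreamEvent[] {
  const choice = chunk.choices?.[0];
  if (!choice) return [];
  // Streaming responses carry `delta`; non-streaming ones carry a full `message`.
  const payload = choice.delta ?? choice.message;
  const events: StreamEvent[] = [];
  if (payload?.content) {
    events.push({ type: "text_delta", text: payload.content });
  }
  (payload?.tool_calls ?? []).forEach((call, position) => {
    events.push({
      type: "toolcall_delta",
      index: call.index ?? position,
      name: call.function?.name,
      argumentsDelta: call.function?.arguments,
    });
  });
  // Assumption: a full message is terminal even if finish_reason is missing.
  const finish = choice.finish_reason ?? (choice.message ? "stop" : null);
  if (finish) events.push({ type: "done", finishReason: finish });
  return events;
}
```

Because both shapes go through the same function, the caller never needs a configuration flag to say which mode the daemon uses.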

Config surface

{
  "models": {
    "providers": {
      "unixsocket": {
        "api": "openai-completions",
        "params": {
          "socketPath": "/var/run/ai-daemon.sock",
          "modelId": "local-model",
          "connectTimeoutMs": 10000,
          "readTimeoutMs": 120000,
          "maxRetries": 3
        }
      }
    }
  }
}
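For illustration, the params block above could be resolved and validated roughly as follows. This is a sketch, not the plugin's models.ts or runtime.ts: `resolveParams` is a hypothetical name, the defaults simply reuse the values from the example config, and only filesystem socket paths are handled (abstract-namespace sockets are out of scope here).

```typescript
interface UnixSocketParams {
  socketPath: string;
  modelId: string;
  connectTimeoutMs: number;
  readTimeoutMs: number;
  maxRetries: number;
}

// Assumed defaults, copied from the example config rather than from the plugin.
const DEFAULTS = { connectTimeoutMs: 10_000, readTimeoutMs: 120_000, maxRetries: 3 };

// Linux limits sun_path to 108 bytes including the trailing NUL;
// some other platforms allow less, so treat this as an upper bound.
const MAX_SOCKET_PATH_BYTES = 107;

function nonNegativeInt(value: unknown, fallback: number, name: string): number {
  if (value === undefined) return fallback;
  if (typeof value !== "number" || !Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer`);
  }
  return value;
}

function resolveParams(raw: Record<string, unknown>): UnixSocketParams {
  const { socketPath, modelId } = raw;
  if (typeof socketPath !== "string" || !socketPath.startsWith("/")) {
    throw new Error("socketPath must be an absolute filesystem path");
  }
  if (new TextEncoder().encode(socketPath).length > MAX_SOCKET_PATH_BYTES) {
    throw new Error(`socketPath must be at most ${MAX_SOCKET_PATH_BYTES} bytes`);
  }
  if (typeof modelId !== "string" || modelId.length === 0) {
    throw new Error("modelId must be a non-empty string");
  }
  return {
    socketPath,
    modelId,
    connectTimeoutMs: nonNegativeInt(raw.connectTimeoutMs, DEFAULTS.connectTimeoutMs, "connectTimeoutMs"),
    readTimeoutMs: nonNegativeInt(raw.readTimeoutMs, DEFAULTS.readTimeoutMs, "readTimeoutMs"),
    maxRetries: nonNegativeInt(raw.maxRetries, DEFAULTS.maxRetries, "maxRetries"),
  };
}
```

Failing early on an over-long or relative path gives a clearer message than the connect error the socket layer would otherwise raise.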

Features

Retry with exponential backoff on connection failure
Auto-truncation of context + tools for small-context-window models (4K)

Hot-reload support (re-reads config from disk at request time)
Chinese-friendly validation error messages for invalid socketPath
Streaming + non-streaming auto-detection
Socket error guard (prevents uncaught 'error' crash)
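The retry and error-guard items in this list might look something like the sketch below, built on Node's `net` module. The helper names, the 200 ms base delay, and the 5 s cap are assumptions for illustration and are not taken from the plugin's transport.ts.

```typescript
import * as net from "node:net";

/** Exponential backoff: base * 2^attempt, capped. Base and cap are assumed values. */
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

function connectOnce(socketPath: string, timeoutMs: number): Promise<net.Socket> {
  return new Promise((resolve, reject) => {
    const socket = net.createConnection({ path: socketPath });
    // destroy(err) emits 'error', which the listener below turns into a rejection.
    const timer = setTimeout(
      () => socket.destroy(new Error(`connect timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    socket.once("connect", () => {
      clearTimeout(timer);
      resolve(socket);
    });
    // Error guard: this listener stays attached for the socket's lifetime, so a
    // late 'error' event is never unhandled. After a successful connect the
    // reject() call is a no-op; callers still add their own listener to react.
    socket.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
  });
}

async function connectWithRetry(
  socketPath: string,
  { connectTimeoutMs = 10_000, maxRetries = 3 } = {},
): Promise<net.Socket> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await connectOnce(socketPath, connectTimeoutMs);
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        await new Promise<void>((done) => setTimeout(done, backoffDelayMs(attempt)));
      }
    }
  }
  throw lastError;
}
```

With `maxRetries: 3` this makes up to four connection attempts, waiting 200 ms, 400 ms, and 800 ms between them before rethrowing the last error.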

Plugin architecture

Follows the standard bundled provider pattern (similar to ollama/lmstudio):

extensions/unixsocket/
├── index.ts          # plugin entry, definePluginEntry()
├── api.ts            # public API surface
├── runtime-api.ts    # runtime API surface
├── openclaw.plugin.json  # manifest
├── src/
│   ├── frame.ts      # frame codec + auto-detection
│   ├── stream.ts     # custom StreamFn (core)
│   ├── transport.ts  # socket connect/retry
│   ├── models.ts     # model config resolution + validation
│   ├── runtime.ts    # config value resolvers
│   ├── setup.ts      # setup wizard + catalog
│   └── defaults.ts   # shared constants
└── test/             # 53 tests across 5 test files

Alternatives considered

No response

Impact

Dimension	Detail
Affected users	All mobile/edge device users who want to run OpenClaw gateway with local on-device models. Also affects llama.cpp and embedded daemon users who rely on UDS transport.
Affected systems	Mobile phones (ARM64, Android/HarmonyOS), embedded devices, edge gateways running local AI daemons over Unix Domain Socket.
Severity	Blocks workflow — OpenClaw cannot be used at all on platforms that prohibit HTTP model calls. The gateway simply cannot communicate with the local model daemon.
Frequency	Always — 100% of requests to Unix Socket daemons fail because there is no transport path. This is not intermittent; it's a missing capability.
Consequence	Users are forced to either (a) run a separate HTTP proxy layer between the daemon and OpenClaw, adding latency and complexity, or (b) abandon OpenClaw entirely in favor of custom solutions. This blocks adoption on mobile platforms where local inference is the only viable option.

Evidence/examples

No response

Additional information

No response
