openclaw - 💡(How to fix) Fix feat(oap-sidecar): ship /usr/local/bin/oap-healthcheck binary for portable container healthchecks [1 comments, 2 participants]

GitHub stats
openclaw/openclaw#75626 · Fetched 2026-05-02 05:32:38
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 2
Timeline (top): closed ×1, commented ×1

Request that the ghcr.io/openclaw/oap-sidecar image ship a small /usr/local/bin/oap-healthcheck binary so consumers running the sidecar in container orchestrators (ECS, Kubernetes, Nomad) can configure a healthcheck that does not depend on a shell or curl being present in the image.

Root Cause

Today, consumers wiring up an ECS task definition (or Kubernetes liveness probe) for the OAP sidecar have a few imperfect options:

  1. curl -f http://localhost:8443/health || exit 1: works only if the image contains curl. The current oap-sidecar image does include curl and /bin/sh, but this is an undocumented contract that future image rebuilds (e.g., a minimal/distroless variant) could break silently. Consumers have to read the Dockerfile to know it's safe.

  2. /dev/tcp/localhost/8443: a bash-only feature. CMD-SHELL runs the command through the image's default shell, which is ash on Alpine and dash on Debian-slim. Neither implements /dev/tcp, so the probe stays UNHEALTHY forever. This was already shipped and rolled back as a real bug (see ender-stack PR #237 / Greptile P1).

  3. nc -z localhost 8443: netcat is not in default busybox/Debian-slim/Ubuntu-slim images either. Adding it expands the attack surface.

  4. TCP-only socket probe (Kubernetes tcpSocket): works, but only proves the listener is bound, not that /health returns 200. It catches a crashed process but misses a half-broken handler.
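
The gap described in option 4 can be illustrated with a small Go sketch. It is only a demonstration under stated assumptions: the stand-in server, the 503 response, and the helper names tcpOpen, httpHealthy, and demo are hypothetical and are not part of the oap-sidecar image.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// tcpOpen reports whether something accepts connections on addr,
// which is all a tcpSocket-style probe can tell you.
func tcpOpen(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// httpHealthy reports whether a GET on url returns HTTP 200.
func httpHealthy(url string) bool {
	client := &http.Client{Timeout: time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

// demo starts a stand-in server (hypothetical, not the real sidecar)
// whose handler always answers 503, then runs both probes against it.
func demo() (tcpOK, httpOK bool) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusServiceUnavailable)
	}))
	defer srv.Close()
	return tcpOpen(srv.Listener.Addr().String()), httpHealthy(srv.URL + "/health")
}

func main() {
	tcpOK, httpOK := demo()
	fmt.Println("tcp probe passes:", tcpOK)   // listener is bound
	fmt.Println("http probe passes:", httpOK) // handler is broken
}
```

In this sketch the TCP probe passes while the HTTP probe fails, which is the "half-broken handler" case a tcpSocket probe cannot detect.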

Fix / Workaround

We're documenting the implicit curl + /bin/sh contract in our consumer Terraform module README so future bumps don't silently break, but that's a workaround, not a fix.

Code Example

/usr/local/bin/oap-healthcheck

---

healthcheck = ["CMD", "/usr/local/bin/oap-healthcheck"]

Proposed shape

A static or near-static binary in the image at a stable path:

/usr/local/bin/oap-healthcheck

Behavior:

  • Default: probe http://127.0.0.1:8443/health, exit 0 on HTTP 200, non-zero otherwise.
  • Optional flags: --port, --path, --timeout for non-default deployments.
  • No external dependencies (no curl, no shell).

This means consumers can write a portable healthcheck:

healthcheck = ["CMD", "/usr/local/bin/oap-healthcheck"]

…and it works unchanged across slim, alpine, distroless, or future image variants.
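
The behavior listed under "Proposed shape" could be sketched in Go roughly as follows. This is a minimal sketch, not the maintainers' implementation: the flag names come from the proposal, but the 2-second default timeout and the helper names buildURL and probe are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"os"
	"time"
)

// buildURL assembles the loopback URL the probe will request.
func buildURL(port int, path string) string {
	return fmt.Sprintf("http://127.0.0.1:%d%s", port, path)
}

// probe returns nil only when the endpoint answers HTTP 200 within timeout.
func probe(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	// Defaults mirror the proposal; the timeout default is an assumption.
	port := flag.Int("port", 8443, "port the sidecar listens on")
	path := flag.String("path", "/health", "health endpoint path")
	timeout := flag.Duration("timeout", 2*time.Second, "overall probe timeout")
	flag.Parse()

	if err := probe(buildURL(*port, *path), *timeout); err != nil {
		fmt.Fprintln(os.Stderr, "unhealthy:", err)
		os.Exit(1) // non-zero exit marks the container unhealthy
	}
}
```

Go's standard flag package accepts both -port and --port, so the double-dash spelling from the proposal works without extra parsing code. Building with CGO_ENABLED=0 typically yields a static binary, which is one way to meet the "no external dependencies" requirement.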

Prior art

  • mongo-healthcheck binary in the official MongoDB image
  • redis-cli ping for the Redis image
  • /bin/grpc_health_probe shipped in many gRPC service images

What we'd commit to upstream

If the maintainers are open to this, we'd be willing to contribute the binary + Dockerfile changes against the next oap-sidecar release. Happy to align on Go vs Rust vs a small Node script (whichever matches the existing toolchain best).

Until this lands

We're documenting the implicit curl + /bin/sh contract in our consumer Terraform module README so future bumps don't silently break, but that's a workaround, not a fix.

Cross-reference

  • ender-stack PR #237 (consumer-side correctness bundle)
  • ender-stack issue #166 (the issue this would close on our side)

extent analysis

TL;DR

Include a static /usr/local/bin/oap-healthcheck binary in the ghcr.io/openclaw/oap-sidecar image to enable a reliable healthcheck without depending on external tools like curl or shell scripts.

Guidance

  • Consider contributing a small Go or Rust binary to the oap-sidecar image, following the example of mongo-healthcheck or redis-cli ping.
  • The binary should probe http://127.0.0.1:8443/health by default and allow optional flags for custom port, path, and timeout.
  • Document the implicit curl + /bin/sh contract in consumer documentation to avoid silent breakage until the binary is included.
  • Consider a tcpSocket probe as a temporary workaround, keeping in mind that it only proves the listener is bound and will miss a handler that no longer returns 200.

Example

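package main

import (
	"fmt"
	"net/http"
	"os"
	"time"
)

func main() {
	// Bound the probe so a hung handler fails the check instead of blocking it.
	// The 3-second value is an illustrative assumption, not a project default.
	client := &http.Client{Timeout: 3 * time.Second}

	resp, err := client.Get("http://127.0.0.1:8443/health")
	if err != nil {
		fmt.Fprintln(os.Stderr, "Healthcheck failed:", err)
		os.Exit(1) // a non-zero exit code is what marks the container unhealthy
	}
	status := resp.StatusCode
	resp.Body.Close()

	if status != http.StatusOK {
		fmt.Fprintln(os.Stderr, "Healthcheck failed: status", status)
		os.Exit(1)
	}
	fmt.Println("Healthcheck ok")
}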

Notes

The proposal aims to give consumers a healthcheck with no dependencies beyond the binary itself. The implementation language (Go, Rust, or a small Node script) and the exact flag behavior are still open and need to be aligned with the maintainers and the existing toolchain.

Recommendation

Until the oap-healthcheck binary ships in the oap-sidecar image, document the implicit curl + /bin/sh contract in consumer documentation so that image bumps do not break the healthcheck silently.
