openclaw: [Proposal] OpenClaw Viz, an enhanced Control UI for agent monitoring, intervention and enterprise operations

Enhanced Control UI with agent topology visualization, human intervention console, session replay, RBAC, SSO, multi-cluster monitoring, and Prometheus/Grafana integration.

Summary

Enhanced Control UI with agent topology visualization, human intervention console, session replay, RBAC, SSO, multi-cluster monitoring, and Prometheus/Grafana integration.

Problem to solve

The current Control UI is a basic web interface serving primarily as a Gateway health check. As OpenClaw grows from a single-user tool to a multi-agent, multi-channel, multi-operator system, operators face several pain points:

  1. No agent visibility — No topology graph or dependency map showing how agents, sessions, modules, and cron jobs relate in real time.

  2. No human intervention — The only way to intervene with an off-track agent is via CLI. No UI to send messages, steer sub-agents, or terminate runaway sessions.

  3. No session history analysis — Past sessions cannot be replayed, searched, or exported for debugging.

  4. No multi-user support — Teams share the same auth; there are no roles (viewer/operator/admin), no audit trail, no accountability.

  5. No enterprise observability — No Prometheus metrics endpoint, no Grafana dashboard, no multi-cluster monitoring for organizations running multiple Gateways.

  6. No SSO — Teams cannot integrate with existing identity providers (Google OIDC, GitHub).

  7. No policy engine — Automated responses to failure modes (error spikes, token overruns, stale sessions) cannot be configured from the UI.

  8. No project-level intelligence — Workspace project relationships, milestones, activity patterns, and task flow pipelines are invisible.

These gaps force operators to juggle terminals, spreadsheets, and custom scripts — operational friction that does not scale beyond single-user setups.

Proposed solution

OpenClaw Viz is an open-source Express + React dashboard that extends the Control UI across 22 modules, including:

  • Agent Topology Graph (D3.js force-directed): Real-time agent/session/cron/module visualization
  • Session Monitoring & Intervention Console: Search, filter, sort sessions; message/steer/kill agents from UI
  • Session Replay: 378-frame playback with 0.5x–10x speed and timeline scrubber
  • Cron Management: Enable/disable, manual trigger, run history
  • System Metrics: CPU/memory/disk, hourly token chart, per-module KPIs, error tracking
  • Project Intelligence: Dependency graph, Gantt timeline, activity heatmap, milestone tracker
  • Smart Alerts: Error spike, stale session, token budget, cost limit, model failure detection
  • Multi-User & RBAC: 3 roles × 10 permissions with audit log
  • Immutable Audit Trail: SHA-256 hash chain with tamper detection
  • SSO / OAuth2: Local JWT + Google OIDC (full PKCE flow)
  • Multi-Cluster Monitoring: Remote connections, DNS-SD auto-discovery
  • Prometheus / Grafana: OpenMetrics endpoint + ready-to-import 7-panel dashboard
  • API Rate Limiting: Per-role limits with Retry-After headers
  • Intervention Policy Engine: 5 built-in rules with create/toggle/delete
  • A/B Test Comparison: Week-over-week model/channel/module metrics
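As an illustration of the "Immutable Audit Trail" item above, a SHA-256 hash chain can be sketched as follows. This is a minimal sketch, not the Viz implementation: the `AuditEntry` shape, the field names, and the all-zero genesis hash are assumptions made for the example. Each entry stores the hash of its predecessor, so editing any earlier entry invalidates every hash that follows it.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; field names are illustrative, not taken from the Viz repo.
interface AuditEntry {
  actor: string;
  action: string;
  timestamp: string;
  prevHash: string;
  hash: string;
}

// Assumed convention: the first entry links to a fixed all-zero hash.
const GENESIS_HASH = "0".repeat(64);

// Hash the entry's fields together with the previous entry's hash.
function hashEntry(actor: string, action: string, timestamp: string, prevHash: string): string {
  return createHash("sha256")
    .update(JSON.stringify([actor, action, timestamp, prevHash]))
    .digest("hex");
}

// Return a new chain with one more entry linked to the current tail.
function appendEntry(chain: AuditEntry[], actor: string, action: string, timestamp: string): AuditEntry[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS_HASH;
  const hash = hashEntry(actor, action, timestamp, prevHash);
  return [...chain, { actor, action, timestamp, prevHash, hash }];
}

// Return the index of the first entry that fails verification, or -1 if the chain is intact.
function findTamperedIndex(chain: AuditEntry[]): number {
  let expectedPrev = GENESIS_HASH;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    const recomputed = hashEntry(e.actor, e.action, e.timestamp, e.prevHash);
    if (e.prevHash !== expectedPrev || e.hash !== recomputed) {
      return i;
    }
    expectedPrev = e.hash;
  }
  return -1;
}
```

Verification only needs the entries themselves, so a periodic job or an export tool could run `findTamperedIndex` without access to any secret. A chain like this detects edits to stored entries; it does not by itself prevent someone from rewriting the whole chain, which would need an external anchor for the latest hash.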

Tech stack: React 18, Vite 6, TailwindCSS 3, D3.js 7, Zustand 5, Express 4, WebSocket (ws), Docker
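For the "SSO / OAuth2" item, the PKCE part of the flow is defined by RFC 7636 and can be sketched with Node's built-in `crypto` module. The helper names below are hypothetical and the scope value is an assumption; only the verifier/challenge derivation (the `S256` method) and the standard OAuth 2.0 parameter names come from the specification.

```typescript
import { createHash, randomBytes } from "node:crypto";

// RFC 7636: the verifier is 43-128 characters from the unreserved set.
// 32 random bytes encode to exactly 43 base64url characters (no padding).
function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// S256 method: challenge = BASE64URL(SHA-256(ASCII(verifier))), without padding.
function createCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Hypothetical helper: builds the query string for the authorization request.
// The scope value is an assumption; the parameter names are standard OAuth 2.0 / PKCE.
function buildAuthorizationQuery(
  clientId: string,
  redirectUri: string,
  state: string,
  challenge: string,
): string {
  const params = new URLSearchParams({
    response_type: "code",
    client_id: clientId,
    redirect_uri: redirectUri,
    scope: "openid email",
    state,
    code_challenge: challenge,
    code_challenge_method: "S256",
  });
  return params.toString();
}
```

The server would keep the verifier alongside the `state` value and send it as `code_verifier` when exchanging the authorization code for tokens, so an intercepted code alone cannot be redeemed.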

Repo: https://github.com/sltogethertao-sudo/openclaw-viz
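The "API Rate Limiting" item (per-role limits with Retry-After headers) could work roughly like the fixed-window limiter below. This is a sketch under stated assumptions: the role names follow the viewer/operator/admin roles mentioned in the problem statement, but the numeric limits, the one-minute window, and the class name are invented for illustration and are not taken from the Viz repo.

```typescript
type Role = "viewer" | "operator" | "admin";

// Hypothetical per-role budgets (requests per window); the proposal does not state actual values.
const LIMITS: Record<Role, number> = { viewer: 60, operator: 120, admin: 300 };
const WINDOW_MS = 60_000;

interface Decision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when the request is allowed
}

class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();

  // nowMs is passed in explicitly so the limiter is easy to test.
  check(userId: string, role: Role, nowMs: number): Decision {
    const w = this.windows.get(userId);
    // Start a fresh window on first use or once the previous window has elapsed.
    if (!w || nowMs - w.start >= WINDOW_MS) {
      this.windows.set(userId, { start: nowMs, count: 1 });
      return { allowed: true, retryAfterSeconds: 0 };
    }
    if (w.count < LIMITS[role]) {
      w.count += 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    // Retry-After carries whole seconds, so round the remaining window time up.
    const retryAfterSeconds = Math.ceil((w.start + WINDOW_MS - nowMs) / 1000);
    return { allowed: false, retryAfterSeconds };
  }
}
```

In an Express handler, a rejected decision would typically map to a 429 response with `res.set("Retry-After", String(decision.retryAfterSeconds))`. A fixed window is the simplest choice; a sliding window or token bucket smooths out bursts at window boundaries at the cost of more state.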

Alternatives considered

  1. Fork and extend the built-in Control UI: harder to maintain, risks divergence from upstream, and the core project prefers a lean design.
  2. CLI-only workflows: already the status quo; does not scale for teams and lacks visualization.
  3. Commercial tools (Datadog, Grafana Cloud): overkill for a personal AI assistant; introduces cost and external dependencies.
  4. Build each module as a separate plugin: over-engineered; Viz provides a cohesive experience across all modules.

Impact

Affected users:

  • Single-user operators needing better agent visibility
  • Teams needing role-based access and audit trails
  • Organizations managing multiple Gateways needing centralized monitoring

Severity: Medium-High

  • Multi-agent debugging requires manual log grepping and terminal juggling
  • Untracked interventions create security gaps in team settings
  • Multi-cluster monitoring means SSH-ing into each Gateway individually

Frequency:

  • Session monitoring: continuous
  • Intervention: multiple times per day
  • Project intelligence: weekly

Consequence:

  • An extra 15–30 minutes per operator per day spent on manual debugging
  • Lost context from untracked interventions
  • Delayed incident response without alerting
  • No growth path from single-user to team deployments

Evidence/examples

<img width="1579" height="1095" alt="Image" src="https://github.com/user-attachments/assets/581fae79-27be-4bf0-b93c-4c89298fb908" />

Additional information

We'd love guidance on:

  1. Whether Viz fits as an OpenClaw plugin (native openclaw.plugin.json) or should stay standalone
  2. Any Control UI extension points we should target for tighter integration
  3. Interest in specific components as upstream core PRs (e.g. topology graph, intervention console)

@velvet-shark @BunsDev @steipete — we'd love your thoughts on this
