hermes - 💡(How to fix) Fix [i18n] Thai Translation: Developer Guide Part c - environments, extending-the-cli, gateway-internals, memory-provider-plugin, prompt-assembly [1 participants]

hermes2026-04-24 12:29:17

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

NousResearch/hermes-agent#15128•Fetched 2026-04-25 06:24:26

View on GitHub

Comments

Participants

Timeline

Reactions

Author

nanobro

Participants

nanobro

Timeline (top)

labeled ×2

Error Message

def sync_turn(self, user_content, assistant_content): def _sync(): try: self._api.ingest(user_content, assistant_content) except Exception as e: logger.warning("Sync failed: %s", e)

if self._sync_thread and self._sync_thread.is_alive():
    self._sync_thread.join(timeout=5.0)
self._sync_thread = threading.Thread(target=_sync, daemon=True)
self._sync_thread.start()

Fix Action

Fix / Workaround

เป็น layer ของ hermes-agent (environments/hermes_base_env.py) เพิ่มเติม:

Terminal backend configuration - กำหนด TERMINAL_ENV สำหรับการรันแบบ sandboxed (local, Docker, Modal, Daytona, SSH, Singularity)
Tool resolution - _resolve_tools_for_group() เรียกใช้ get_tool_definitions() ของ hermes-agent เพื่อรับ tool schemas ที่ถูกต้องตาม toolsets ที่เปิด/ปิดใช้งาน
Agent loop integration - collect_trajectory() รัน HermesAgentLoop และให้คะแนนผลลัพธ์
Two-phase operation - Phase 1 (OpenAI server) สำหรับ eval/SFT, Phase 2 (VLLM ManagedServer) สำหรับ full RL พร้อม logprobs
Async safety patches - monkey-patches Modal backend เพื่อให้ทำงานภายใน event loop ของ Atropos

ส่ง messages + tool schemas ไปยัง API ผ่าน server.chat_completion()
หาก response มี tool_calls ให้ dispatch แต่ละตัวผ่าน handle_function_call()
แนบ tool results เข้าไปใน conversation, กลับไปที่ขั้นตอน 1
หากไม่มี tool_calls, agent จะเสร็จสิ้น

@dataclass
class AgentResult:
    messages: List[Dict[str, Any]]       # Full conversation history
    turns_used: int                       # Number of LLM calls made
    finished_naturally: bool              # True if model stopped on its own
    reasoning_per_turn: List[Optional[str]]  # Extracted reasoning content
    tool_errors: List[ToolError]          # Errors encountered during tool dispatch
    managed_state: Optional[Dict]         # VLLM ManagedServer state (Phase 2)

Code Example

classDiagram
    class BaseEnv {
      Server management
      Worker scheduling
      Wandb logging
      CLI: serve / process / evaluate
    }

    class HermesAgentBaseEnv {
      Terminal backend configuration
      Tool resolution
      Agent loop engine
      ToolContext access
    }

    class TerminalTestEnv {
      Stack testing
    }

    class HermesSweEnv {
      SWE training
    }

    class TerminalBench2EvalEnv {
      Benchmark evaluation
    }

    class TBLiteEvalEnv {
      Fast benchmark
    }

    class YCBenchEvalEnv {
      Long-horizon benchmark
    }

    BaseEnv <|-- HermesAgentBaseEnv
    HermesAgentBaseEnv <|-- TerminalTestEnv
    HermesAgentBaseEnv <|-- HermesSweEnv
    HermesAgentBaseEnv <|-- TerminalBench2EvalEnv
    TerminalBench2EvalEnv <|-- TBLiteEvalEnv
    TerminalBench2EvalEnv <|-- YCBenchEvalEnv

---

@dataclass
class AgentResult:
    messages: List[Dict[str, Any]]       # Full conversation history
    turns_used: int                       # Number of LLM calls made
    finished_naturally: bool              # True if model stopped on its own
    reasoning_per_turn: List[Optional[str]]  # Extracted reasoning content
    tool_errors: List[ToolError]          # Errors encountered during tool dispatch
    managed_state: Optional[Dict]         # VLLM ManagedServer state (Phase 2)

---

async def compute_reward(self, item, result, ctx: ToolContext):
    # Run tests in the model's terminal sandbox
    test = ctx.terminal("pytest -v")
    if test["exit_code"] == 0:
        return 1.0

    # Check if a file was created
    content = ctx.read_file("/workspace/solution.py")
    if content.get("content"):
        return 0.5

    # Download files for local verification
    ctx.download_file("/remote/output.bin", "/local/output.bin")
    return 0.0

---

from environments.tool_call_parsers import get_parser

parser = get_parser("hermes")  # or "mistral", "llama3_json", "qwen", "deepseek_v3", etc.
content, tool_calls = parser.parse(raw_model_output)

---

python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
    --config environments/benchmarks/terminalbench_2/default.yaml

# Run specific tasks
python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
    --config environments/benchmarks/terminalbench_2/default.yaml \
    --env.task_filter fix-git,git-multibranch

---

python environments/benchmarks/tblite/tblite_env.py evaluate \
    --config environments/benchmarks/tblite/default.yaml

---

# ติดตั้ง yc-bench (optional dependency)
pip install "hermes-agent[yc-bench]"

# Run evaluation
bash environments/benchmarks/yc_bench/run_eval.sh

# หรือโดยตรง
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
    --config environments/benchmarks/yc_bench/default.yaml

# Quick single-preset test
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
    --config environments/benchmarks/yc_bench/default.yaml \
    --env.presets '["fast_test"]' --env.seeds '[1]'

---

# Process mode (บันทึก rollouts เป็น JSONL, ไม่ต้องใช้ training server)
python environments/terminal_test_env/terminal_test_env.py process \
    --env.data_path_to_save_groups terminal_test_output.jsonl

# Serve mode (เชื่อมต่อกับ Atropos API สำหรับ RL training)
python environments/terminal_test_env/terminal_test_env.py serve

---

python environments/hermes_swe_env/hermes_swe_env.py serve \
    --openai.model_name YourModel \
    --env.dataset_name bigcode/humanevalpack \
    --env.terminal_backend modal

---

python environments/benchmarks/tblite/tblite_env.py evaluate \
    --config environments/benchmarks/tblite/default.yaml \
    --openai.model_name anthropic/claude-sonnet-4.6

---

python environments/terminal_test_env/terminal_test_env.py process \
    --env.data_path_to_save_groups output.jsonl \
    --openai.model_name anthropic/claude-sonnet-4.6

---

# Terminal 1: Start the Atropos API
run-api

# Terminal 2: Start the environment
python environments/hermes_swe_env/hermes_swe_env.py serve \
    --openai.model_name YourModel

---

from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
from atroposlib.envs.server_handling.server_manager import APIServerConfig

class MyEnvConfig(HermesAgentEnvConfig):
    my_custom_field: str = "default_value"

class MyEnv(HermesAgentBaseEnv):
    name = "my-env"
    env_config_cls = MyEnvConfig

    @classmethod
    def config_init(cls):
        env_config = MyEnvConfig(
            enabled_toolsets=["terminal", "file"],
            terminal_backend="modal",
            max_agent_turns=30,
        )
        server_configs = [APIServerConfig(
            base_url="https://openrouter.ai/api/v1",
            model_name="anthropic/claude-sonnet-4.6",
            server_type="openai",
        )]
        return env_config, server_configs

    async def setup(self):
        from datasets import load_dataset
        self.dataset = list(load_dataset("my-dataset", split="train"))
        self.iter = 0

    async def get_next_item(self):
        item = self.dataset[self.iter % len(self.dataset)]
        self.iter += 1
        return item

    def format_prompt(self, item):
        return item["instruction"]

    async def compute_reward(self, item, result, ctx):
        # ctx gives full tool access to the rollout's sandbox
        test = ctx.terminal("pytest -v")
        return 1.0 if test["exit_code"] == 0 else 0.0

    async def evaluate(self, *args, **kwargs):
        # Periodic evaluation during training
        pass

if __name__ == "__main__":
    MyEnv.cli()

---

env:
  enabled_toolsets: ["terminal", "file"]
  max_agent_turns: 60
  max_token_length: 32000
  agent_temperature: 0.8
  terminal_backend: "modal"
  terminal_timeout: 300
  dataset_name: "NousResearch/terminal-bench-2"
  tokenizer_name: "NousResearch/Hermes-3-Llama-3.1-8B"
  use_wandb: true
  wandb_name: "my-benchmark"

openai:
  base_url: "https://openrouter.ai/api/v1"
  model_name: "anthropic/claude-sonnet-4.6"
  server_type: "openai"
  health_check: false

---

python my_env.py evaluate \
    --config my_config.yaml \
    --openai.model_name anthropic/claude-opus-4.6  # overrides YAML

---

environments/
├── hermes_base_env.py          # Abstract base class (HermesAgentBaseEnv)
├── agent_loop.py               # Multi-turn agent engine (HermesAgentLoop)
├── tool_context.py             # Per-rollout tool access for reward functions
├── patches.py                  # Async-safety patches for Modal backend
│
├── tool_call_parsers/          # Phase 2 client-side parsers
│   ├── hermes_parser.py        # Hermes/ChatML <tool_call> format
│   ├── mistral_parser.py       # Mistral [TOOL_CALLS] format
│   ├── llama_parser.py         # Llama 3 JSON tool calling
│   ├── qwen_parser.py          # Qwen format
│   ├── deepseek_v3_parser.py   # DeepSeek V3 format
│   └── ...                     # + kimi_k2, longcat, glm45/47, etc.
│
├── terminal_test_env/          # Stack validation (inline tasks)
├── hermes_swe_env/             # SWE-bench training environment
│
└── benchmarks/                 # Evaluation benchmarks
    ├── terminalbench_2/        # 89 terminal tasks, Modal sandboxes
    ├── tblite/                 # 100 calibrated tasks (fast TB2 proxy)
    └── yc_bench/               # Long-horizon strategic benchmark

---

#!/usr/bin/env python3
"""my_cli.py - ตัวอย่าง wrapper CLI ที่ขยาย Hermes."""

from cli import HermesCLI
from prompt_toolkit.layout import FormattedTextControl, Window
from prompt_toolkit.filters import Condition


class MyCLI(HermesCLI):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._panel_visible = False

    def _get_extra_tui_widgets(self):
        """เพิ่ม info panel แบบสลับได้เหนือ status bar."""
        cli_ref = self
        return [
            Window(
                FormattedTextControl(lambda: "📊 My custom panel content"),
                height=1,
                filter=Condition(lambda: cli_ref._panel_visible),
            ),
        ]

    def _register_extra_tui_keybindings(self, kb, *, input_area):
        """F2 สลับ panel ที่กำหนดเอง."""
        cli_ref = self

        @kb.add("f2")
        def _toggle_panel(event):
            cli_ref._panel_visible = not cli_ref._panel_visible

    def process_command(self, cmd: str) -> bool:
        """เพิ่ม slash command /panel."""
        if cmd.strip().lower() == "/panel":
            self._panel_visible = not self._panel_visible
            state = "visible" if self._panel_visible else "hidden"
            print(f"Panel is now {state}")
            return True
        return super().process_command(cmd)


if __name__ == "__main__":
    cli = MyCLI()
    cli.run()

---

cd ~/.hermes/hermes-agent
source .venv/bin/activate
python my_cli.py

---

def _get_extra_tui_widgets(self) -> list:
    return []  # default: no extra widgets

---

from prompt_toolkit.layout import ConditionalContainer, Window, FormattedTextControl
from prompt_toolkit.filters import Condition

def _get_extra_tui_widgets(self):
    return [
        ConditionalContainer(
            Window(FormattedTextControl("Status: connected"), height=1),
            filter=Condition(lambda: self._show_status),
        ),
    ]

---

def _register_extra_tui_keybindings(self, kb, *, input_area):
    pass  # default: no extra keybindings

---

def _register_extra_tui_keybindings(self, kb, *, input_area):
    cli_ref = self

    @kb.add("f3")
    def _clear_input(event):
        input_area.text = ""

    @kb.add("f4")
    def _insert_template(event):
        input_area.text = "/search "

---

def _build_tui_layout_children(self, *, sudo_widget, secret_widget,
    approval_widget, clarify_widget, spinner_widget, spacer,
    status_bar, input_rule_top, image_bar, input_area,
    input_rule_bot, voice_status_bar, completions_menu) -> list:

---

[
    Window(height=0),       # anchor
    sudo_widget,            # sudo password prompt (conditional)
    secret_widget,          # secret input prompt (conditional)
    approval_widget,        # dangerous command approval (conditional)
    clarify_widget,         # clarify question UI (conditional)
    spinner_widget,         # thinking spinner (conditional)
    spacer,                 # fills remaining vertical space
    *self._get_extra_tui_widgets(),  # YOUR WIDGETS GO HERE
    status_bar,             # model/token/context status line
    input_rule_top,         # ─── border above input
    image_bar,              # attached images indicator
    input_area,             # user text input
    input_rule_bot,         # ─── border below input
    voice_status_bar,       # voice mode status (conditional)
    completions_menu,       # autocomplete dropdown
]

---

┌─────────────────────────────────────────────────┐
│                  GatewayRunner                  │
│                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Telegram │  │ Discord  │  │  Slack   │       │
│  │ Adapter  │  │ Adapter  │  │ Adapter  │       │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘       │
│       │             │             │             │
│       └─────────────┼─────────────┘             │
│                     ▼                           │
│              _handle_message()                  │
│                     │                           │
│         ┌───────────┼───────────┐               │
│         ▼           ▼           ▼               │
│  Slash command   AIAgent    Queue/BG            │
│    dispatch      creation   sessions            │
│                     │                           │
│                     ▼                           │
│                 SessionStore                    │
│              (SQLite persistence)               │
└─────────────────────────────────────────────────┘

---

agent:main:{platform}:{chat_type}:{chat_id}

---

Admin: /pair
Gateway: "Pairing code: ABC123. Share with the user."
New user: ABC123
Gateway: "Paired! You're now authorized."

---

if _quick_key in self._running_agents:
    if canonical == "model":
        return "⏳ Agent is running - wait for it to finish or /stop first."

---

gateway/platforms/
├── base.py              # BaseAdapter - shared logic for all platforms
├── telegram.py          # Telegram Bot API (long polling or webhook)
├── discord.py           # Discord bot via discord.py
├── slack.py             # Slack Socket Mode
├── whatsapp.py          # WhatsApp Business Cloud API
├── signal.py            # Signal via signal-cli REST API
├── matrix.py            # Matrix via mautrix (optional E2EE)
├── mattermost.py        # Mattermost WebSocket API
├── email.py             # Email via IMAP/SMTP
├── sms.py               # SMS via Twilio
├── dingtalk.py          # DingTalk WebSocket
├── feishu.py            # Feishu/Lark WebSocket or webhook
├── wecom.py             # WeCom (WeChat Work) callback
├── weixin.py            # Weixin (personal WeChat) via iLink Bot API
├── bluebubbles.py       # Apple iMessage via BlueBubbles macOS server
├── qqbot.py             # QQ Bot (Tencent QQ) via Official API v2
├── webhook.py           # Inbound/outbound webhook adapter
├── api_server.py        # REST API server adapter
└── homeassistant.py     # Home Assistant conversation integration

---

AIAgent._invoke_tool()
  → self._memory_manager.handle_tool_call(name, args)
    → provider.handle_tool_call(name, args)

---

plugins/memory/my-provider/
├── __init__.py      # MemoryProvider implementation + register() entry point
├── plugin.yaml      # Metadata (name, description, hooks)
└── README.md        # Setup instructions, config reference, tools

---

from agent.memory_provider import MemoryProvider

class MyMemoryProvider(MemoryProvider):
    @property
    def name(self) -> str:
        return "my-provider"

    def is_available(self) -> bool:
        """Check if this provider can activate. NO network calls."""
        return bool(os.environ.get("MY_API_KEY"))

    def initialize(self, session_id: str, **kwargs) -> None:
        """Called once at agent startup.

        kwargs always includes:
          hermes_home (str): Active HERMES_HOME path. Use for storage.
        """
        self._api_key = os.environ.get("MY_API_KEY", "")
        self._session_id = session_id

    # ... implement remaining methods

---

def get_config_schema(self):
    return [
        {
            "key": "api_key",
            "description": "My Provider API key",
            "secret": True,           # → written to .env
            "required": True,
            "env_var": "MY_API_KEY",   # explicit env var name
            "url": "https://my-provider.com/keys",  # where to get it
        },
        {
            "key": "region",
            "description": "Server region",
            "default": "us-east",
            "choices": ["us-east", "eu-west", "ap-south"],
        },
        {
            "key": "project",
            "description": "Project identifier",
            "default": "hermes",
        },
    ]

---

def save_config(self, values: dict, hermes_home: str) -> None:
    """Write non-secret config to your native location."""
    import json
    from pathlib import Path
    config_path = Path(hermes_home) / "my-provider.json"
    config_path.write_text(json.dumps(values, indent=2))

---

def register(ctx) -> None:
    """Called by the memory plugin discovery system."""
    ctx.register_memory_provider(MyMemoryProvider())

---

name: my-provider
version: 1.0.0
description: "Short description of what this provider does."
hooks:
  - on_session_end    # list hooks you implement

---

def sync_turn(self, user_content, assistant_content):
    def _sync():
        try:
            self._api.ingest(user_content, assistant_content)
        except Exception as e:
            logger.warning("Sync failed: %s", e)

    if self._sync_thread and self._sync_thread.is_alive():
        self._sync_thread.join(timeout=5.0)
    self._sync_thread = threading.Thread(target=_sync, daemon=True)
    self._sync_thread.start()

---

# CORRECT - profile-scoped
from hermes_constants import get_hermes_home
data_dir = get_hermes_home() / "my-provider"

# WRONG - shared across all profiles
data_dir = Path("~/.hermes/my-provider").expanduser()

---

from agent.memory_manager import MemoryManager

mgr = MemoryManager()
mgr.add_provider(my_provider)
mgr.initialize_all(session_id="test-1", platform="cli")

# Test tool routing
result = mgr.handle_tool_call("my_tool", {"action": "add", "content": "test"})

# Test lifecycle
mgr.sync_all("user msg", "assistant msg")
mgr.on_session_end([])
mgr.shutdown_all()

---

# plugins/memory/my-provider/cli.py

def my_command(args):
    """Handler dispatched by argparse."""
    sub = getattr(args, "my_command", None)
    if sub == "status":
        print("Provider is active and connected.")
    elif sub == "config":
        print("Showing config...")
    else:
        print("Usage: hermes my-provider <status|config>")

def register_cli(subparser) -> None:
    """Build the hermes my-provider argparse tree.

    Called by discover_plugin_cli_commands() at argparse setup time.
    """
    subs = subparser.add_subparsers(dest="my_command")
    subs.add_parser("status", help="Show provider status")
    subs.add_parser("config", help="Show provider config")
    subparser.set_defaults(func=my_command)

---

plugins/memory/my-provider/
├── __init__.py      # MemoryProvider implementation + register()
├── plugin.yaml      # Metadata
├── cli.py           # register_cli(subparser) - CLI commands
└── README.md        # Setup instructions

---

# Layer 1: Agent Identity (from ~/.hermes/SOUL.md)
You are Hermes, an AI assistant created by Nous Research.
You are an expert software engineer and researcher.
You value correctness, clarity, and efficiency.
...

# Layer 2: Tool-aware behavior guidance
You have persistent memory across sessions. Save durable facts using
the memory tool: user preferences, environment details, tool quirks,
and stable conventions. Memory is injected into every turn, so keep
it compact and focused on facts that will still matter later.
...
When the user references something from a past conversation or you
suspect relevant cross-session context exists, use session_search
to recall it before asking them to repeat themselves.

# Tool-use enforcement (for GPT/Codex models only)
You MUST use your tools to take action - do not describe what you
would do or plan to do without actually doing it.
...

# Layer 3: Honcho static block (when active)
[Honcho personality/context data]

# Layer 4: Optional system message (from config or API)
[User-configured system message override]

# Layer 5: Frozen MEMORY snapshot
## Persistent Memory
- User prefers Python 3.12, uses pyproject.toml
- Default editor is nvim
- Working on project "atlas" in ~/code/atlas
- Timezone: US/Pacific

# Layer 6: Frozen USER profile snapshot
## User Profile
- Name: Alice
- GitHub: alice-dev

# Layer 7: Skills index
## Skills (mandatory)
Before replying, scan the skills below. If one clearly matches
your task, load it with skill_view(name) and follow its instructions.
...
<available_skills>
  software-development:
    - code-review: Structured code review workflow
    - test-driven-development: TDD methodology
  research:
    - arxiv: Search and summarize arXiv papers
</available_skills>

# Layer 8: Context files (from project directory)
# Project Context
The following project context files have been loaded and should be followed:

## AGENTS.md
This is the atlas project. Use pytest for testing. The main
entry point is src/atlas/main.py. Always run `make lint` before
committing.

# Layer 9: Timestamp + session
Current time: 2026-03-30T14:30:00-07:00
Session: abc123

# Layer 10: Platform hint
You are a CLI AI Agent. Try not to use markdown but simple text
renderable inside a terminal.

---

# From agent/prompt_builder.py (simplified)
def load_soul_md() -> Optional[str]:
    soul_path = get_hermes_home() / "SOUL.md"
    if not soul_path.exists():
        return None
    content = soul_path.read_text(encoding="utf-8").strip()
    content = _scan_context_content(content, "SOUL.md")  # Security scan
    content = _truncate_content(content, "SOUL.md")       # Cap at 20k chars
    return content

---

You are Hermes Agent, an intelligent AI assistant created by Nous Research.
You are helpful, knowledgeable, and direct. You assist users with a wide
range of tasks including answering questions, writing and editing code,
analyzing information, creative work, and executing actions via your tools.
You communicate clearly, admit uncertainty when appropriate, and prioritize
being genuinely useful over being verbose unless otherwise directed below.
Be targeted and efficient in your exploration and investigations.

---

# From agent/prompt_builder.py (simplified)
def build_context_files_prompt(cwd=None, skip_soul=False):
    cwd_path = Path(cwd).resolve()

    # Priority: first match wins - only ONE project context loaded
    project_context = (
        _load_hermes_md(cwd_path)       # 1. .hermes.md / HERMES.md (walks to git root)
        or _load_agents_md(cwd_path)    # 2. AGENTS.md (cwd only)
        or _load_claude_md(cwd_path)    # 3. CLAUDE.md (cwd only)
        or _load_cursorrules(cwd_path)  # 4. .cursorrules / .cursor/rules/*.mdc
    )

    sections = []
    if project_context:
        sections.append(project_context)

    # SOUL.md from HERMES_HOME (independent of project context)
    if not skip_soul:
        soul_content = load_soul_md()
        if soul_content:
            sections.append(soul_content)

    if not sections:
        return ""

    return (
        "# Project Context\n\n"
        "The following project context files have been loaded "
        "and should be followed:\n\n"
        + "\n".join(sections)
    )

RAW_BUFFERClick to expand / collapse

📄 developer-guide/environments.md

sidebar_position: 5 title: "Environments, Benchmarks & Data Generation" description: "การสร้างสภาพแวดล้อมสำหรับการฝึก RL, การรัน evaluation benchmarks, และการสร้าง SFT data ด้วยการรวมระบบ Hermes-Agent Atropos"

Environments, Benchmarks & Data Generation

Hermes Agent มี full environment framework ที่เชื่อมโยง tool-calling capabilities เข้ากับ Atropos RL training framework สิ่งนี้ช่วยให้สามารถทำงานได้สามรูปแบบหลัก:

RL Training - ฝึก language models บน multi-turn agentic tasks ด้วย GRPO
Benchmarks - ประเมิน models บน standardized agentic benchmarks
Data Generation - สร้าง SFT training data จาก agent rollouts

ทั้งสามส่วนนี้ใช้ core เดียวกันคือ environment class ที่ทำหน้าที่กำหนด tasks, รัน agent loop, และให้คะแนน output

:::info Repo environments vs RL training tools Python environment framework ที่ระบุไว้ในเอกสารนี้อยู่ใน directory environments/ ของ repo และเป็น API ระดับ implementation สำหรับการรวมระบบ Hermes/Atropos ซึ่งแยกต่างหากจาก rl_* tools ที่ผู้ใช้ใช้งาน ซึ่งทำหน้าที่เป็น orchestration surface สำหรับ workflow การฝึก RL ระยะไกล :::

:::tip Quick Links

ต้องการรัน benchmarks? ข้ามไปที่ Available Benchmarks
ต้องการฝึกด้วย RL? ดู RL Training Tools สำหรับ agent-driven interface หรือ Running Environments สำหรับการรันด้วยตนเอง
ต้องการสร้าง environment ใหม่? ดู Creating Environments :::

Architecture

ระบบ environment ถูกสร้างขึ้นบน three-layer inheritance chain:

classDiagram
    class BaseEnv {
      Server management
      Worker scheduling
      Wandb logging
      CLI: serve / process / evaluate
    }

    class HermesAgentBaseEnv {
      Terminal backend configuration
      Tool resolution
      Agent loop engine
      ToolContext access
    }

    class TerminalTestEnv {
      Stack testing
    }

    class HermesSweEnv {
      SWE training
    }

    class TerminalBench2EvalEnv {
      Benchmark evaluation
    }

    class TBLiteEvalEnv {
      Fast benchmark
    }

    class YCBenchEvalEnv {
      Long-horizon benchmark
    }

    BaseEnv <|-- HermesAgentBaseEnv
    HermesAgentBaseEnv <|-- TerminalTestEnv
    HermesAgentBaseEnv <|-- HermesSweEnv
    HermesAgentBaseEnv <|-- TerminalBench2EvalEnv
    TerminalBench2EvalEnv <|-- TBLiteEvalEnv
    TerminalBench2EvalEnv <|-- YCBenchEvalEnv

BaseEnv (Atropos)

เป็น foundation จาก atroposlib ให้บริการ:

Server management - เชื่อมต่อกับ OpenAI-compatible APIs (VLLM, SGLang, OpenRouter)
Worker scheduling - การประสานงาน rollout แบบ parallel
Wandb integration - การบันทึก metrics และการแสดงผล rollout
CLI interface - มี subcommands สามตัว: serve, process, evaluate
Eval logging - evaluate_log() บันทึกผลลัพธ์เป็น JSON + JSONL

HermesAgentBaseEnv

เป็น layer ของ hermes-agent (environments/hermes_base_env.py) เพิ่มเติม:

Terminal backend configuration - กำหนด TERMINAL_ENV สำหรับการรันแบบ sandboxed (local, Docker, Modal, Daytona, SSH, Singularity)
Tool resolution - _resolve_tools_for_group() เรียกใช้ get_tool_definitions() ของ hermes-agent เพื่อรับ tool schemas ที่ถูกต้องตาม toolsets ที่เปิด/ปิดใช้งาน
Agent loop integration - collect_trajectory() รัน HermesAgentLoop และให้คะแนนผลลัพธ์
Two-phase operation - Phase 1 (OpenAI server) สำหรับ eval/SFT, Phase 2 (VLLM ManagedServer) สำหรับ full RL พร้อม logprobs
Async safety patches - monkey-patches Modal backend เพื่อให้ทำงานภายใน event loop ของ Atropos

Concrete Environments

Environment ของคุณจะสืบทอดมาจาก HermesAgentBaseEnv และต้อง implement ห้า methods:

Method	Purpose
`setup()`	โหลด dataset, initialise state
`get_next_item()`	คืนค่า item ถัดไปสำหรับ rollout
`format_prompt(item)`	แปลง item ให้เป็น user message
`compute_reward(item, result, ctx)`	ให้คะแนน rollout (0.0–1.0)
`evaluate()`	logic การ evaluation เป็นระยะ

Core Components

Agent Loop

HermesAgentLoop (environments/agent_loop.py) คือ multi-turn agent engine ที่นำกลับมาใช้ใหม่ได้ มันรัน tool-calling pattern เดียวกับ main loop ของ hermes-agent:

ส่ง messages + tool schemas ไปยัง API ผ่าน server.chat_completion()
หาก response มี tool_calls ให้ dispatch แต่ละตัวผ่าน handle_function_call()
แนบ tool results เข้าไปใน conversation, กลับไปที่ขั้นตอน 1
หากไม่มี tool_calls, agent จะเสร็จสิ้น

Tool calls จะถูก execute ใน thread pool (ThreadPoolExecutor(128)) เพื่อป้องกันไม่ให้ async backends (Modal, Docker) deadlock ภายใน event loop ของ Atropos

ส่งคืนค่า AgentResult:

@dataclass
class AgentResult:
    messages: List[Dict[str, Any]]       # Full conversation history
    turns_used: int                       # Number of LLM calls made
    finished_naturally: bool              # True if model stopped on its own
    reasoning_per_turn: List[Optional[str]]  # Extracted reasoning content
    tool_errors: List[ToolError]          # Errors encountered during tool dispatch
    managed_state: Optional[Dict]         # VLLM ManagedServer state (Phase 2)

Tool Context

ToolContext (environments/tool_context.py) ให้ reward functions เข้าถึง sandbox เดียวกัน ที่ model ใช้ระหว่างการ rollout การกำหนดขอบเขตด้วย task_id หมายความว่า state ทั้งหมด (files, processes, browser tabs) จะถูกเก็บรักษาไว้

async def compute_reward(self, item, result, ctx: ToolContext):
    # Run tests in the model's terminal sandbox
    test = ctx.terminal("pytest -v")
    if test["exit_code"] == 0:
        return 1.0

    # Check if a file was created
    content = ctx.read_file("/workspace/solution.py")
    if content.get("content"):
        return 0.5

    # Download files for local verification
    ctx.download_file("/remote/output.bin", "/local/output.bin")
    return 0.0

Available methods:

Category	Methods
Terminal	`terminal(command, timeout)`
Files	`read_file(path)`, `write_file(path, content)`, `search(query, path)`
Transfers	`upload_file()`, `upload_dir()`, `download_file()`, `download_dir()`
Web	`web_search(query)`, `web_extract(urls)`
Browser	`browser_navigate(url)`, `browser_snapshot()`
Generic	`call_tool(name, args)` - escape hatch สำหรับ tool ใดๆ ของ hermes-agent
Cleanup	`cleanup()` - ปล่อยทรัพยากรทั้งหมด

Tool Call Parsers

สำหรับ Phase 2 (VLLM ManagedServer), server จะส่งคืน raw text โดยไม่มี structured tool calls client-side parsers ใน environments/tool_call_parsers/ จะดึง tool_calls จาก raw output:

from environments.tool_call_parsers import get_parser

parser = get_parser("hermes")  # or "mistral", "llama3_json", "qwen", "deepseek_v3", etc.
content, tool_calls = parser.parse(raw_model_output)

Available parsers: hermes, mistral, llama3_json, qwen, qwen3_coder, deepseek_v3, deepseek_v3_1, kimi_k2, longcat, glm45, glm47.

ใน Phase 1 (ประเภท OpenAI server), ไม่จำเป็นต้องใช้ parsers - server จะจัดการ tool call parsing โดยธรรมชาติ

Available Benchmarks

TerminalBench2

89 terminal tasks พร้อมสภาพแวดล้อม Docker sandbox ต่อ task


What it tests	ความสามารถในการเขียนโค้ด/sysadmin แบบ single-task
Scoring	ผ่าน/ไม่ผ่าน แบบ binary (การตรวจสอบ test suite)
Sandbox	Modal cloud sandboxes (Docker images ต่อ task)
Tools	`terminal` + `file`
Tasks	89 tasks ในหลาย categories
Cost	~$50–200 สำหรับ full eval (parallel execution)
Time	~2–4 ชั่วโมง

python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
    --config environments/benchmarks/terminalbench_2/default.yaml

# Run specific tasks
python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
    --config environments/benchmarks/terminalbench_2/default.yaml \
    --env.task_filter fix-git,git-multibranch

Dataset: NousResearch/terminal-bench-2 บน HuggingFace.

TBLite (OpenThoughts Terminal Bench Lite)

100 tasks ที่มีการปรับระดับความยาก - เป็น proxy ที่เร็วกว่า TerminalBench2


What it tests	เหมือน TB2 (coding/sysadmin), มีการปรับระดับความยาก
Scoring	ผ่าน/ไม่ผ่าน แบบ binary
Sandbox	Modal cloud sandboxes
Tools	`terminal` + `file`
Tasks	100 tasks: Easy (40), Medium (26), Hard (26), Extreme (8)
Correlation	r=0.911 กับ full TB2
Speed	เร็วกว่า TB2 2.6–8×

python environments/benchmarks/tblite/tblite_env.py evaluate \
    --config environments/benchmarks/tblite/default.yaml

TBLite เป็น thin subclass ของ TerminalBench2 - มีเพียง dataset และ timeouts ที่แตกต่างกัน สร้างโดยทีม OpenThoughts Agent (Snorkel AI + Bespoke Labs). Dataset: NousResearch/openthoughts-tblite.

YC-Bench

Long-horizon strategic benchmark - agent จะรับบทเป็น CEO ของ AI startup


What it tests	ความสอดคล้องเชิงกลยุทธ์ multi-turn ตลอดหลายร้อย turns
Scoring	Composite: `0.5 × survival + 0.5 × normalised_funds`
Sandbox	Local terminal (ไม่จำเป็นต้องใช้ Modal)
Tools	`terminal` เท่านั้น
Runs	9 default (3 presets × 3 seeds), sequential
Cost	~$50–200 สำหรับ full eval
Time	~3–6 ชั่วโมง

# ติดตั้ง yc-bench (optional dependency)
pip install "hermes-agent[yc-bench]"

# Run evaluation
bash environments/benchmarks/yc_bench/run_eval.sh

# หรือโดยตรง
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
    --config environments/benchmarks/yc_bench/default.yaml

# Quick single-preset test
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
    --config environments/benchmarks/yc_bench/default.yaml \
    --env.presets '["fast_test"]' --env.seeds '[1]'

YC-Bench ใช้ collinear-ai/yc-bench - ซึ่งเป็นการจำลองแบบ deterministic ที่มี 4 skill domains (research, inference, data_environment, training), prestige system, employee management, และ financial pressure ต่างจาก YC-Bench ที่ใช้ binary scoring ต่อ task ของ TB2, YC-Bench จะวัดว่า agent สามารถรักษา coherent strategy ตลอดหลายร้อยการตัดสินใจที่สะสมกันได้หรือไม่

Training Environments

TerminalTestEnv

environment แบบ minimal self-contained ที่มี inline tasks (ไม่มี external dataset) ใช้สำหรับ validating the full stack end-to-end แต่ละ task จะขอให้ model สร้างไฟล์ที่ path ที่ทราบล่วงหน้า และ verifier จะตรวจสอบ content

# Process mode (บันทึก rollouts เป็น JSONL, ไม่ต้องใช้ training server)
python environments/terminal_test_env/terminal_test_env.py process \
    --env.data_path_to_save_groups terminal_test_output.jsonl

# Serve mode (เชื่อมต่อกับ Atropos API สำหรับ RL training)
python environments/terminal_test_env/terminal_test_env.py serve

HermesSweEnv

environment สำหรับการฝึกสไตล์ SWE-bench model จะได้รับ coding task, ใช้ terminal + file + web tools ในการแก้ไขปัญหา และ reward function จะรัน tests ใน Modal sandbox เดียวกัน

python environments/hermes_swe_env/hermes_swe_env.py serve \
    --openai.model_name YourModel \
    --env.dataset_name bigcode/humanevalpack \
    --env.terminal_backend modal

Running Environments

ทุก environment เป็น standalone Python script ที่มีสาม CLI subcommands:

`evaluate` - รัน benchmark

สำหรับ environments ที่ใช้ eval เท่านั้น (benchmarks) รันทุก items, คำนวณ metrics, และบันทึกไปยัง wandb

python environments/benchmarks/tblite/tblite_env.py evaluate \
    --config environments/benchmarks/tblite/default.yaml \
    --openai.model_name anthropic/claude-sonnet-4.6

ไม่จำเป็นต้องมี training server หรือ run-api environment จะจัดการทุกอย่าง

`process` - สร้าง SFT data

รัน rollouts และบันทึก scored trajectories เป็น JSONL มีประโยชน์สำหรับการสร้าง training data โดยไม่ต้องมี full RL loop

python environments/terminal_test_env/terminal_test_env.py process \
    --env.data_path_to_save_groups output.jsonl \
    --openai.model_name anthropic/claude-sonnet-4.6

รูปแบบ output: แต่ละบรรทัดคือ scored trajectory ที่มี full conversation history, reward, และ metadata

`serve` - เชื่อมต่อกับ Atropos สำหรับ RL training

เชื่อมต่อ environment กับ Atropos API server ที่กำลังรันอยู่ (run-api) ใช้ระหว่าง live RL training

# Terminal 1: Start the Atropos API
run-api

# Terminal 2: Start the environment
python environments/hermes_swe_env/hermes_swe_env.py serve \
    --openai.model_name YourModel

environment จะรับ items จาก Atropos, รัน agent rollouts, คำนวณ rewards, และส่ง scored trajectories กลับไปสำหรับการฝึก

Two-Phase Operation

Phase 1: OpenAI Server (Eval / SFT)

ใช้ server.chat_completion() ด้วยพารามิเตอร์ tools= server (VLLM, SGLang, OpenRouter, OpenAI) จะจัดการ tool call parsing โดยธรรมชาติ ส่งคืน ChatCompletion objects ที่มี tool_calls แบบ structured

ใช้สำหรับ: evaluation, การสร้าง SFT data, benchmarks, testing
Placeholder tokens ถูกสร้างขึ้นสำหรับ Atropos pipeline (เนื่องจากไม่มี real token IDs จาก OpenAI API)

Phase 2: VLLM ManagedServer (Full RL)

ใช้ ManagedServer สำหรับ exact token IDs + logprobs ผ่าน /generate client-side tool call parser จะ reconstruct structured tool_calls จาก raw output

ใช้สำหรับ: full RL training ด้วย GRPO/PPO
Real tokens, masks, และ logprobs จะไหลผ่าน pipeline
กำหนด tool_call_parser ใน config ให้ตรงกับ format ของ model ของคุณ (เช่น "hermes", "qwen", "mistral")

Creating Environments

Training Environment

from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
from atroposlib.envs.server_handling.server_manager import APIServerConfig

class MyEnvConfig(HermesAgentEnvConfig):
    my_custom_field: str = "default_value"

class MyEnv(HermesAgentBaseEnv):
    name = "my-env"
    env_config_cls = MyEnvConfig

    @classmethod
    def config_init(cls):
        env_config = MyEnvConfig(
            enabled_toolsets=["terminal", "file"],
            terminal_backend="modal",
            max_agent_turns=30,
        )
        server_configs = [APIServerConfig(
            base_url="https://openrouter.ai/api/v1",
            model_name="anthropic/claude-sonnet-4.6",
            server_type="openai",
        )]
        return env_config, server_configs

    async def setup(self):
        from datasets import load_dataset
        self.dataset = list(load_dataset("my-dataset", split="train"))
        self.iter = 0

    async def get_next_item(self):
        item = self.dataset[self.iter % len(self.dataset)]
        self.iter += 1
        return item

    def format_prompt(self, item):
        return item["instruction"]

    async def compute_reward(self, item, result, ctx):
        # ctx gives full tool access to the rollout's sandbox
        test = ctx.terminal("pytest -v")
        return 1.0 if test["exit_code"] == 0 else 0.0

    async def evaluate(self, *args, **kwargs):
        # Periodic evaluation during training
        pass

if __name__ == "__main__":
    MyEnv.cli()

Eval-Only Benchmark

สำหรับ benchmarks ให้ทำตาม pattern ที่ใช้โดย TerminalBench2, TBLite, และ YC-Bench:

สร้างภายใต้ environments/benchmarks/your-benchmark/
ตั้งค่า config สำหรับ eval เท่านั้น: eval_handling=STOP_TRAIN, steps_per_eval=1, total_steps=1
Stub training methods: collect_trajectories() คืนค่า (None, []), score() คืนค่า None
Implement rollout_and_score_eval(eval_item) - agent loop ต่อ item + scoring
Implement evaluate() - จัดการการรันทั้งหมด, คำนวณ metrics รวม
เพิ่ม streaming JSONL สำหรับการบันทึกผลลัพธ์ที่ปลอดภัยจากการ crash
เพิ่ม cleanup: การจัดการ KeyboardInterrupt, cleanup_all_environments(), _tool_executor.shutdown()
รันด้วย subcommand evaluate

ดูที่ environments/benchmarks/yc_bench/yc_bench_env.py สำหรับ reference implementation ที่สะอาดและมีเอกสารประกอบที่ดี

Configuration Reference

HermesAgentEnvConfig Fields

Field	Type	Default	Description
`enabled_toolsets`	`List[str]`	`None` (all)	hermes toolsets ที่ต้องการเปิดใช้งาน
`disabled_toolsets`	`List[str]`	`None`	Toolsets ที่ต้องการกรองออก
`distribution`	`str`	`None`	ชื่อการกระจาย toolset แบบ probabilistic
`max_agent_turns`	`int`	`30`	จำนวน LLM calls สูงสุดต่อ rollout
`agent_temperature`	`float`	`1.0`	Sampling temperature
`system_prompt`	`str`	`None`	System message สำหรับ agent
`terminal_backend`	`str`	`"local"`	`local`, `docker`, `modal`, `daytona`, `ssh`, `singularity`
`terminal_timeout`	`int`	`120`	วินาทีต่อ terminal command
`terminal_lifetime`	`int`	`3600`	อายุ sandbox สูงสุด
`dataset_name`	`str`	`None`	HuggingFace dataset identifier
`tool_pool_size`	`int`	`128`	ขนาด thread pool สำหรับ tool execution
`tool_call_parser`	`str`	`"hermes"`	Parser สำหรับ raw output ของ Phase 2
`extra_body`	`Dict`	`None`	extra params สำหรับ OpenAI API (เช่น OpenRouter provider prefs)
`eval_handling`	`Enum`	`STOP_TRAIN`	`STOP_TRAIN`, `LIMIT_TRAIN`, `NONE`

YAML Configuration

Environments สามารถกำหนดค่าผ่านไฟล์ YAML ที่ส่งด้วย --config:

env:
  enabled_toolsets: ["terminal", "file"]
  max_agent_turns: 60
  max_token_length: 32000
  agent_temperature: 0.8
  terminal_backend: "modal"
  terminal_timeout: 300
  dataset_name: "NousResearch/terminal-bench-2"
  tokenizer_name: "NousResearch/Hermes-3-Llama-3.1-8B"
  use_wandb: true
  wandb_name: "my-benchmark"

openai:
  base_url: "https://openrouter.ai/api/v1"
  model_name: "anthropic/claude-sonnet-4.6"
  server_type: "openai"
  health_check: false

ค่าใน YAML จะ override ค่า default ของ config_init() อาร์กิวเมนต์ CLI จะ override ค่า YAML:

python my_env.py evaluate \
    --config my_config.yaml \
    --openai.model_name anthropic/claude-opus-4.6  # overrides YAML

Prerequisites

สำหรับทุก environments

Python >= 3.11
atroposlib: pip install git+https://github.com/NousResearch/atropos.git
API key ของ LLM (OpenRouter, OpenAI, หรือ self-hosted VLLM/SGLang)

สำหรับ Modal-sandboxed benchmarks (TB2, TBLite)

บัญชีและ CLI ของ Modal: pip install "hermes-agent[modal]"
environment variables MODAL_TOKEN_ID และ MODAL_TOKEN_SECRET

สำหรับ YC-Bench

pip install "hermes-agent[yc-bench]" (ติดตั้ง yc-bench CLI + SQLAlchemy)
ไม่จำเป็นต้องใช้ Modal - รันด้วย local terminal backend

สำหรับ RL training

TINKER_API_KEY - API key สำหรับบริการฝึก Tinker
WANDB_API_KEY - สำหรับการติดตาม metrics ของ Weights & Biases
submodule tinker-atropos (ที่ tinker-atropos/ ใน repo)

ดูที่ RL Training สำหรับ workflow RL แบบ agent-driven

Directory Structure

environments/
├── hermes_base_env.py          # Abstract base class (HermesAgentBaseEnv)
├── agent_loop.py               # Multi-turn agent engine (HermesAgentLoop)
├── tool_context.py             # Per-rollout tool access for reward functions
├── patches.py                  # Async-safety patches for Modal backend
│
├── tool_call_parsers/          # Phase 2 client-side parsers
│   ├── hermes_parser.py        # Hermes/ChatML <tool_call> format
│   ├── mistral_parser.py       # Mistral [TOOL_CALLS] format
│   ├── llama_parser.py         # Llama 3 JSON tool calling
│   ├── qwen_parser.py          # Qwen format
│   ├── deepseek_v3_parser.py   # DeepSeek V3 format
│   └── ...                     # + kimi_k2, longcat, glm45/47, etc.
│
├── terminal_test_env/          # Stack validation (inline tasks)
├── hermes_swe_env/             # SWE-bench training environment
│
└── benchmarks/                 # Evaluation benchmarks
    ├── terminalbench_2/        # 89 terminal tasks, Modal sandboxes
    ├── tblite/                 # 100 calibrated tasks (fast TB2 proxy)
    └── yc_bench/               # Long-horizon strategic benchmark

📄 developer-guide/extending-the-cli.md

sidebar_position: 8 title: "การขยาย CLI" description: "สร้าง wrapper CLIs ที่ขยาย Hermes TUI ด้วย widgets, keybindings, และการเปลี่ยนแปลง layout ที่กำหนดเอง"

การขยาย CLI

Hermes เปิดเผย protected extension hooks บน HermesCLI เพื่อให้ wrapper CLIs สามารถเพิ่ม widgets, keybindings, และการปรับแต่ง layout ได้ โดยไม่ต้อง override method run() ที่มีความยาวกว่า 1000 บรรทัด วิธีนี้ช่วยให้ extension ของคุณแยกออกจาก internal changes ได้

จุดขยาย (Extension points)

มีจุดขยาย (extension seams) ห้าจุดให้ใช้งาน:

Hook	วัตถุประสงค์	ต้อง override เมื่อ...
`_get_extra_tui_widgets()`	แทรก widgets เข้าไปใน layout	คุณต้องการองค์ประกอบ UI ที่คงอยู่ (panel, status line, mini-player)
`_register_extra_tui_keybindings(kb, *, input_area)`	เพิ่ม hotkeys (ปุ่มลัด)	คุณต้องการ hotkeys (สลับ panel, ควบคุม transport, hotkeys แบบ modal)
`_build_tui_layout_children(**widgets)`	ควบคุมลำดับ widgets ทั้งหมด	คุณต้องการจัดเรียงใหม่หรือห่อ widgets ที่มีอยู่ (กรณีหายาก)
`process_command()`	เพิ่ม slash commands ที่กำหนดเอง	คุณต้องการการจัดการ `/mycommand` (hook ที่มีอยู่ก่อนแล้ว)
`_build_tui_style_dict()`	custom prompt_toolkit styles	คุณต้องการสีหรือการจัดรูปแบบที่กำหนดเอง (hook ที่มีอยู่ก่อนแล้ว)

สามตัวแรกเป็น protected hooks ที่เป็นใหม่ สองตัวหลังมีอยู่แล้ว

เริ่มต้นอย่างรวดเร็ว: wrapper CLI

#!/usr/bin/env python3
"""my_cli.py - ตัวอย่าง wrapper CLI ที่ขยาย Hermes."""

from cli import HermesCLI
from prompt_toolkit.layout import FormattedTextControl, Window
from prompt_toolkit.filters import Condition


class MyCLI(HermesCLI):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._panel_visible = False

    def _get_extra_tui_widgets(self):
        """เพิ่ม info panel แบบสลับได้เหนือ status bar."""
        cli_ref = self
        return [
            Window(
                FormattedTextControl(lambda: "📊 My custom panel content"),
                height=1,
                filter=Condition(lambda: cli_ref._panel_visible),
            ),
        ]

    def _register_extra_tui_keybindings(self, kb, *, input_area):
        """F2 สลับ panel ที่กำหนดเอง."""
        cli_ref = self

        @kb.add("f2")
        def _toggle_panel(event):
            cli_ref._panel_visible = not cli_ref._panel_visible

    def process_command(self, cmd: str) -> bool:
        """เพิ่ม slash command /panel."""
        if cmd.strip().lower() == "/panel":
            self._panel_visible = not self._panel_visible
            state = "visible" if self._panel_visible else "hidden"
            print(f"Panel is now {state}")
            return True
        return super().process_command(cmd)


if __name__ == "__main__":
    cli = MyCLI()
    cli.run()

วิธีรัน:

cd ~/.hermes/hermes-agent
source .venv/bin/activate
python my_cli.py

การอ้างอิง Hook

`_get_extra_tui_widgets()`

ส่งคืนรายการของ prompt_toolkit widgets เพื่อแทรกเข้าไปใน TUI layout Widgets จะปรากฏ ระหว่าง spacer และ status bar - อยู่เหนือ input area แต่ใต้ main output

def _get_extra_tui_widgets(self) -> list:
    return []  # default: no extra widgets

widgets แต่ละตัวควรเป็น prompt_toolkit container (เช่น Window, ConditionalContainer, HSplit) ใช้ ConditionalContainer หรือ filter=Condition(...) เพื่อให้ widgets สามารถสลับได้

from prompt_toolkit.layout import ConditionalContainer, Window, FormattedTextControl
from prompt_toolkit.filters import Condition

def _get_extra_tui_widgets(self):
    return [
        ConditionalContainer(
            Window(FormattedTextControl("Status: connected"), height=1),
            filter=Condition(lambda: self._show_status),
        ),
    ]

`_register_extra_tui_keybindings(kb, *, input_area)`

ถูกเรียกใช้หลังจากที่ Hermes ลงทะเบียน keybindings ของตัวเอง และก่อนที่ layout จะถูกสร้างขึ้น เพิ่ม keybindings ของคุณเข้าไปใน kb

def _register_extra_tui_keybindings(self, kb, *, input_area):
    pass  # default: no extra keybindings

Parameters:

kb - instance ของ KeyBindings สำหรับแอปพลิเคชัน prompt_toolkit
input_area - widget TextArea หลัก หากคุณต้องการอ่านหรือจัดการ input ของผู้ใช้

def _register_extra_tui_keybindings(self, kb, *, input_area):
    cli_ref = self

    @kb.add("f3")
    def _clear_input(event):
        input_area.text = ""

    @kb.add("f4")
    def _insert_template(event):
        input_area.text = "/search "

หลีกเลี่ยงความขัดแย้ง กับ keybindings ที่มีอยู่: Enter (submit), Escape Enter (ขึ้นบรรทัดใหม่), Ctrl-C (interrupt), Ctrl-D (exit), Tab (auto-suggest accept). ปุ่มฟังก์ชัน F2+ และ Ctrl-combinations โดยทั่วไปปลอดภัย

`_build_tui_layout_children(**widgets)`

ให้ override เฉพาะเมื่อคุณต้องการควบคุมลำดับ widgets อย่างสมบูรณ์ ส่วนใหญ่แล้ว extension ควรใช้ _get_extra_tui_widgets() แทน

def _build_tui_layout_children(self, *, sudo_widget, secret_widget,
    approval_widget, clarify_widget, spinner_widget, spacer,
    status_bar, input_rule_top, image_bar, input_area,
    input_rule_bot, voice_status_bar, completions_menu) -> list:

การ implement ค่าเริ่มต้นจะส่งคืน:

[
    Window(height=0),       # anchor
    sudo_widget,            # sudo password prompt (conditional)
    secret_widget,          # secret input prompt (conditional)
    approval_widget,        # dangerous command approval (conditional)
    clarify_widget,         # clarify question UI (conditional)
    spinner_widget,         # thinking spinner (conditional)
    spacer,                 # fills remaining vertical space
    *self._get_extra_tui_widgets(),  # YOUR WIDGETS GO HERE
    status_bar,             # model/token/context status line
    input_rule_top,         # ─── border above input
    image_bar,              # attached images indicator
    input_area,             # user text input
    input_rule_bot,         # ─── border below input
    voice_status_bar,       # voice mode status (conditional)
    completions_menu,       # autocomplete dropdown
]

แผนภาพ Layout

Layout ค่าเริ่มต้นจากบนลงล่าง:

Output area - ประวัติการสนทนาที่เลื่อนได้
Spacer
Extra widgets - จาก _get_extra_tui_widgets()
Status bar - model, context %, elapsed time
Image bar - จำนวน image ที่แนบมา
Input area - prompt ของผู้ใช้
Voice status - ตัวบ่งชี้การบันทึก
Completions menu - คำแนะนำ autocomplete

เคล็ดลับ

ทำให้ display ไม่ถูกต้อง (Invalidate the display) หลังจากการเปลี่ยนแปลง state: เรียกใช้ self._invalidate() เพื่อกระตุ้นการ redraw ของ prompt_toolkit
เข้าถึง agent state: self.agent, self.model, self.conversation_history สามารถใช้งานได้ทั้งหมด
Custom styles: Override _build_tui_style_dict() และเพิ่ม entries สำหรับ custom style classes ของคุณ
Slash commands: Override process_command(), จัดการคำสั่งของคุณ, และเรียกใช้ super().process_command(cmd) สำหรับทุกอย่างที่เหลือ
อย่า override run() เว้นแต่จำเป็นจริงๆ - extension hooks มีอยู่โดยเฉพาะเพื่อหลีกเลี่ยงการผูกติดกันแบบนั้น

📄 developer-guide/gateway-internals.md

sidebar_position: 7 title: "Gateway Internals" description: "วิธีการบูตของ messaging gateway, การอนุญาตผู้ใช้, การกำหนดเส้นทางเซสชัน, และการส่งข้อความ"

Gateway Internals

messaging gateway คือ process ที่ทำงานต่อเนื่องซึ่งเชื่อมต่อ Hermes เข้ากับแพลตฟอร์ม messaging ภายนอกกว่า 14 แพลตฟอร์มผ่านสถาปัตยกรรมแบบรวมศูนย์

Key Files

File	Purpose
`gateway/run.py`	`GatewayRunner` - main loop, slash commands, message dispatch (~9,000 lines)
`gateway/session.py`	`SessionStore` - conversation persistence and session key construction
`gateway/delivery.py`	Outbound message delivery to target platforms/channels
`gateway/pairing.py`	DM pairing flow for user authorization
`gateway/channel_directory.py`	Maps chat IDs to human-readable names for cron delivery
`gateway/hooks.py`	Hook discovery, loading, and lifecycle event dispatch
`gateway/mirror.py`	Cross-session message mirroring for `send_message`
`gateway/status.py`	Token lock management for profile-scoped gateway instances
`gateway/builtin_hooks/`	Always-registered hooks (e.g., BOOT.md system prompt hook)
`gateway/platforms/`	Platform adapters (one per messaging platform)

Architecture Overview

┌─────────────────────────────────────────────────┐
│                  GatewayRunner                  │
│                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Telegram │  │ Discord  │  │  Slack   │       │
│  │ Adapter  │  │ Adapter  │  │ Adapter  │       │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘       │
│       │             │             │             │
│       └─────────────┼─────────────┘             │
│                     ▼                           │
│              _handle_message()                  │
│                     │                           │
│         ┌───────────┼───────────┐               │
│         ▼           ▼           ▼               │
│  Slash command   AIAgent    Queue/BG            │
│    dispatch      creation   sessions            │
│                     │                           │
│                     ▼                           │
│                 SessionStore                    │
│              (SQLite persistence)               │
└─────────────────────────────────────────────────┘

Message Flow

เมื่อมีข้อความเข้ามาจากแพลตฟอร์มใดๆ:

Platform adapter จะรับ raw event และทำให้เป็นรูปแบบ MessageEvent
Base adapter จะตรวจสอบ active session guard:
- หาก agent กำลังทำงานสำหรับ session นี้ -> จะ queue message และตั้งค่า interrupt event
- หากเป็น /approve, /deny, /stop -> จะข้าม guard (dispatched inline)
GatewayRunner._handle_message() จะรับ event:
- Resolve session key ผ่าน _session_key_for_source() (format: agent:main:{platform}:{chat_type}:{chat_id})
- ตรวจสอบการอนุญาต (ดู Authorization ด้านล่าง)
- ตรวจสอบว่าเป็น slash command หรือไม่ -> dispatch ไปยัง command handler
- ตรวจสอบว่า agent กำลังทำงานอยู่หรือไม่ -> intercept commands เช่น /stop, /status
- หากไม่เป็นเช่นนั้น -> สร้าง instance ของ AIAgent และรัน conversation
Response จะถูกส่งกลับผ่าน platform adapter

Session Key Format

Session keys จะเข้ารหัส context การกำหนดเส้นทางทั้งหมด:

agent:main:{platform}:{chat_type}:{chat_id}

ตัวอย่างเช่น: agent:main:telegram:private:123456789

แพลตฟอร์มที่รองรับ thread (เช่น Telegram forum topics, Discord threads, Slack threads) อาจรวม thread IDs ไว้ในส่วน chat_id ห้ามสร้าง session keys ด้วยตนเองเด็ดขาด - ให้ใช้ build_session_key() จาก gateway/session.py เสมอ

Two-Level Message Guard

เมื่อ agent กำลังทำงานอยู่ ข้อความขาเข้าจะผ่าน guard สองระดับตามลำดับ:

Level 1 - Base adapter (gateway/platforms/base.py): ตรวจสอบ _active_sessions หาก session นั้น active จะ queue ข้อความใน _pending_messages และตั้งค่า interrupt event สิ่งนี้จะดักจับข้อความ ก่อน ที่จะถึง gateway runner
Level 2 - Gateway runner (gateway/run.py): ตรวจสอบ _running_agents จะ intercept commands เฉพาะ (/stop, /new, /queue, /status, /approve, /deny) และกำหนดเส้นทางอย่างเหมาะสม ส่วนอื่นๆ ทั้งหมดจะเรียกใช้ running_agent.interrupt()

Commands ที่ต้องถึง runner แม้ว่า agent จะถูกบล็อก (เช่น /approve) จะถูก dispatch inline ผ่าน await self._message_handler(event) - ซึ่งจะข้ามระบบ background task เพื่อหลีกเลี่ยง race conditions

Authorization

gateway ใช้การตรวจสอบการอนุญาตแบบหลายชั้น (multi-layer authorization check) ซึ่งจะถูกประเมินตามลำดับ:

Per-platform allow-all flag (เช่น TELEGRAM_ALLOW_ALL_USERS) - หากตั้งค่าไว้ ผู้ใช้ทุกคนบนแพลตฟอร์มนั้นจะได้รับอนุญาต
Platform allowlist (เช่น TELEGRAM_ALLOWED_USERS) - user IDs ที่คั่นด้วยเครื่องหมายจุลภาค
DM pairing - ผู้ใช้ที่ได้รับการยืนยันตัวตนสามารถจับคู่ผู้ใช้ใหม่ผ่าน pairing code
Global allow-all (GATEWAY_ALLOW_ALL_USERS) - หากตั้งค่าไว้ ผู้ใช้ทุกคนบนทุกแพลตฟอร์มจะได้รับอนุญาต
Default: deny - ผู้ใช้ที่ไม่ได้รับอนุญาตจะถูกปฏิเสธ

DM Pairing Flow

Admin: /pair
Gateway: "Pairing code: ABC123. Share with the user."
New user: ABC123
Gateway: "Paired! You're now authorized."

สถานะการจับคู่ (Pairing state) จะถูกบันทึกใน gateway/pairing.py และคงอยู่แม้จะมีการรีสตาร์ท

Slash Command Dispatch

slash commands ทั้งหมดใน gateway จะไหลผ่าน pipeline การ resolve เดียวกัน:

resolve_command() จาก hermes_cli/commands.py จะแมป input ไปยัง canonical name (จัดการ aliases, prefix matching)
canonical name จะถูกตรวจสอบกับ GATEWAY_KNOWN_COMMANDS
Handler ใน _handle_message() จะ dispatch ตาม canonical name
คำสั่งบางอย่างถูกจำกัดด้วย config (gateway_config_gate บน CommandDef)

Running-Agent Guard

Commands ที่ต้องไม่ถูก execute ขณะที่ agent กำลังประมวลผลจะถูกปฏิเสธตั้งแต่เนิ่นๆ:

if _quick_key in self._running_agents:
    if canonical == "model":
        return "⏳ Agent is running - wait for it to finish or /stop first."

Commands ที่ข้าม (Bypass commands) (เช่น /stop, /new, /approve, /deny, /queue, /status) มีการจัดการพิเศษ

Config Sources

gateway อ่าน configuration จากหลายแหล่ง:

Source	What it provides
`~/.hermes/.env`	API keys, bot tokens, platform credentials
`~/.hermes/config.yaml`	Model settings, tool configuration, display options
Environment variables	Override any of the above

ต่างจาก CLI (ซึ่งใช้ load_cli_config() พร้อมค่า default ที่ hardcoded) gateway จะอ่าน config.yaml โดยตรงผ่าน YAML loader ซึ่งหมายความว่า config keys ที่มีอยู่ใน defaults dict ของ CLI แต่ไม่มีในไฟล์ config ของผู้ใช้อาจมีพฤติกรรมที่แตกต่างกันระหว่าง CLI และ gateway

Platform Adapters

แต่ละแพลตฟอร์ม messaging มี adapter อยู่ใน gateway/platforms/:

gateway/platforms/
├── base.py              # BaseAdapter - shared logic for all platforms
├── telegram.py          # Telegram Bot API (long polling or webhook)
├── discord.py           # Discord bot via discord.py
├── slack.py             # Slack Socket Mode
├── whatsapp.py          # WhatsApp Business Cloud API
├── signal.py            # Signal via signal-cli REST API
├── matrix.py            # Matrix via mautrix (optional E2EE)
├── mattermost.py        # Mattermost WebSocket API
├── email.py             # Email via IMAP/SMTP
├── sms.py               # SMS via Twilio
├── dingtalk.py          # DingTalk WebSocket
├── feishu.py            # Feishu/Lark WebSocket or webhook
├── wecom.py             # WeCom (WeChat Work) callback
├── weixin.py            # Weixin (personal WeChat) via iLink Bot API
├── bluebubbles.py       # Apple iMessage via BlueBubbles macOS server
├── qqbot.py             # QQ Bot (Tencent QQ) via Official API v2
├── webhook.py           # Inbound/outbound webhook adapter
├── api_server.py        # REST API server adapter
└── homeassistant.py     # Home Assistant conversation integration

Adapters จะ implement interface ทั่วไป:

connect() / disconnect() - lifecycle management
send_message() - outbound message delivery
on_message() - inbound message normalization -> MessageEvent

Token Locks

Adapters ที่เชื่อมต่อด้วย credentials เฉพาะ จะเรียกใช้ acquire_scoped_lock() ใน connect() และ release_scoped_lock() ใน disconnect() สิ่งนี้ป้องกันไม่ให้โปรไฟล์สองโปรไฟล์ใช้ bot token เดียวกันพร้อมกัน

Delivery Path

การส่งออก (Outgoing deliveries) (gateway/delivery.py) จัดการ:

Direct reply - ส่ง response กลับไปยัง chat ต้นทาง
Home channel delivery - กำหนดเส้นทางผลลัพธ์ของ cron job และ background results ไปยัง home channel ที่กำหนดค่าไว้
Explicit target delivery - send_message tool ที่ระบุ telegram:-1001234567890
Cross-platform delivery - ส่งไปยังแพลตฟอร์มที่แตกต่างจากข้อความต้นทาง

การส่งมอบจาก cron job จะไม่ถูก mirror เข้าไปใน history ของ session ของ gateway - มันจะอยู่ใน cron session ของตัวเองเท่านั้น นี่คือการออกแบบโดยเจตนาเพื่อหลีกเลี่ยงการละเมิดข้อจำกัดเรื่อง message alternation

Hooks

Gateway hooks คือ Python modules ที่ตอบสนองต่อ lifecycle events:

Gateway Hook Events

Event	When fired
`gateway:startup`	Gateway process starts
`session:start`	New conversation session begins
`session:end`	Session completes or times out
`session:reset`	User resets session with `/new`
`agent:start`	Agent begins processing a message
`agent:step`	Agent completes one tool-calling iteration
`agent:end`	Agent finishes and returns response
`command:*`	Any slash command is executed

Hooks ถูกค้นพบจาก gateway/builtin_hooks/ (active เสมอ) และ ~/.hermes/hooks/ (ติดตั้งโดยผู้ใช้) แต่ละ hook คือ directory ที่มี manifest ชื่อ HOOK.yaml และ handler.py

Memory Provider Integration

เมื่อเปิดใช้งาน memory provider plugin (เช่น Honcho):

Gateway จะสร้าง AIAgent ต่อข้อความหนึ่งๆ พร้อมด้วย session ID
MemoryManager จะ initialize provider ด้วย session context
Provider tools (เช่น honcho_profile, viking_search) จะถูกส่งผ่าน:

AIAgent._invoke_tool()
  → self._memory_manager.handle_tool_call(name, args)
    → provider.handle_tool_call(name, args)

เมื่อ session end/reset จะมีการเรียกใช้ on_session_end() สำหรับการ cleanup และการ flush data ครั้งสุดท้าย

Memory Flush Lifecycle

เมื่อ session ถูก reset, resume, หรือหมดอายุ:

Built-in memories จะถูก flush ไปยัง disk
hook on_session_end() ของ memory provider จะถูกเรียกใช้
AIAgent ชั่วคราวจะรัน turn conversation ที่เกี่ยวกับ memory เท่านั้น
Context จะถูก discard หรือ archive

Background Maintenance

gateway จะรันการบำรุงรักษาแบบ periodic ควบคู่ไปกับการจัดการข้อความ:

Cron ticking - ตรวจสอบตารางงานและเรียกใช้งานที่ถึงกำหนด
Session expiry - ทำความสะอาด session ที่ถูกทิ้งไว้หลังจากหมดเวลา
Memory flush - flush memory ล่วงหน้าก่อนที่ session จะหมดอายุ
Cache refresh - refresh รายชื่อ model และสถานะของ provider

Process Management

gateway ทำงานเป็น process ที่มีอายุการใช้งานยาวนาน (long-lived process) โดยจัดการผ่าน:

hermes gateway start / hermes gateway stop - การควบคุมด้วยตนเอง
systemctl (Linux) หรือ launchctl (macOS) - การจัดการ service
PID file ที่ ~/.hermes/gateway.pid - การติดตาม process ที่จำกัดขอบเขตโปรไฟล์ (profile-scoped)

Profile-scoped vs global: start_gateway() ใช้ PID files ที่จำกัดขอบเขตโปรไฟล์ hermes gateway stop จะหยุดเฉพาะ gateway ของ profile ปัจจุบันเท่านั้น hermes gateway stop --all ใช้การสแกน ps aux แบบ global เพื่อ kill gateway processes ทั้งหมด (ใช้ระหว่างการอัปเดต)

Related Docs

📄 developer-guide/memory-provider-plugin.md

sidebar_position: 8 title: "ปลั๊กอิน Memory Provider" description: "วิธีการสร้างปลั๊กอิน memory provider สำหรับ Hermes Agent"

การสร้าง Memory Provider Plugin

Memory provider plugins ช่วยให้ Hermes Agent มีความรู้ที่คงอยู่ข้ามเซสชัน (persistent, cross-session knowledge) นอกเหนือจาก MEMORY.md และ USER.md ที่มีมาให้แล้ว คู่มือนี้จะครอบคลุมวิธีการสร้างปลั๊กอินดังกล่าว

:::tip Memory providers เป็นหนึ่งในสองประเภทของ provider plugin อีกประเภทหนึ่งคือ Context Engine Plugins ซึ่งทำหน้าที่แทนที่ context compressor ที่มีมาให้ ทั้งสองประเภทนี้ใช้รูปแบบเดียวกัน: เลือกใช้ได้เพียงรายการเดียว (single-select) และขับเคลื่อนด้วยการตั้งค่า (config-driven) โดยจัดการผ่าน hermes plugins :::

โครงสร้างไดเรกทอรี

Memory provider แต่ละตัวจะอยู่ใน plugins/memory/<name>/:

plugins/memory/my-provider/
├── __init__.py      # MemoryProvider implementation + register() entry point
├── plugin.yaml      # Metadata (name, description, hooks)
└── README.md        # Setup instructions, config reference, tools

MemoryProvider ABC

ปลั๊กอินของคุณต้องใช้งาน MemoryProvider abstract base class จาก agent/memory_provider.py:

from agent.memory_provider import MemoryProvider

class MyMemoryProvider(MemoryProvider):
    @property
    def name(self) -> str:
        return "my-provider"

    def is_available(self) -> bool:
        """Check if this provider can activate. NO network calls."""
        return bool(os.environ.get("MY_API_KEY"))

    def initialize(self, session_id: str, **kwargs) -> None:
        """Called once at agent startup.

        kwargs always includes:
          hermes_home (str): Active HERMES_HOME path. Use for storage.
        """
        self._api_key = os.environ.get("MY_API_KEY", "")
        self._session_id = session_id

    # ... implement remaining methods

เมธอดที่จำเป็น

Core Lifecycle

Method	When Called	Must Implement?
`name` (property)	Always	Yes
`is_available()`	Agent init, before activation	Yes - no network calls
`initialize(session_id, **kwargs)`	Agent startup	Yes
`get_tool_schemas()`	After init, for tool injection	Yes
`handle_tool_call(name, args)`	When agent uses your tools	Yes (if you have tools)

Config

Method	Purpose	Must Implement?
`get_config_schema()`	Declare config fields for `hermes memory setup`	Yes
`save_config(values, hermes_home)`	Write non-secret config to native location	Yes (unless env-var-only)

Optional Hooks

Method	When Called	Use Case
`system_prompt_block()`	System prompt assembly	Static provider info
`prefetch(query)`	Before each API call	Return recalled context
`queue_prefetch(query)`	After each turn	Pre-warm for next turn
`sync_turn(user, assistant)`	After each completed turn	Persist conversation
`on_session_end(messages)`	Conversation ends	Final extraction/flush
`on_pre_compress(messages)`	Before context compression	Save insights before discard
`on_memory_write(action, target, content)`	Built-in memory writes	Mirror to your backend
`shutdown()`	Process exit	Clean up connections

Schema การตั้งค่า

get_config_schema() คืนค่ารายการของ field descriptors ที่ใช้โดย hermes memory setup:

def get_config_schema(self):
    return [
        {
            "key": "api_key",
            "description": "My Provider API key",
            "secret": True,           # → written to .env
            "required": True,
            "env_var": "MY_API_KEY",   # explicit env var name
            "url": "https://my-provider.com/keys",  # where to get it
        },
        {
            "key": "region",
            "description": "Server region",
            "default": "us-east",
            "choices": ["us-east", "eu-west", "ap-south"],
        },
        {
            "key": "project",
            "description": "Project identifier",
            "default": "hermes",
        },
    ]

Field ที่มี secret: True และ env_var จะถูกบันทึกไปยัง .env ส่วน Field ที่ไม่เป็นความลับจะถูกส่งไปยัง save_config().

:::tip Minimal vs Full Schema ทุก field ใน get_config_schema() จะถูกถามระหว่างการเรียกใช้ hermes memory setup สำหรับ Provider ที่มีตัวเลือกจำนวนมาก ควรทำให้ schema มีความเรียบง่ายที่สุด - ให้รวมเฉพาะ field ที่ผู้ใช้ ต้อง กำหนดค่าเท่านั้น (เช่น API key, credentials ที่จำเป็น) ให้จัดทำเอกสารเกี่ยวกับการตั้งค่าทางเลือกในไฟล์อ้างอิง config (เช่น $HERMES_HOME/myprovider.json) แทนการถามทั้งหมดในระหว่างการตั้งค่า วิธีนี้จะทำให้ setup wizard ทำงานได้รวดเร็ว ในขณะที่ยังคงรองรับการตั้งค่าขั้นสูง ดูตัวอย่างจาก Supermemory provider - ตัวนี้จะถามเฉพาะ API key เท่านั้น ส่วนตัวเลือกอื่น ๆ จะอยู่ใน supermemory.json. :::

การบันทึก Config

def save_config(self, values: dict, hermes_home: str) -> None:
    """Write non-secret config to your native location."""
    import json
    from pathlib import Path
    config_path = Path(hermes_home) / "my-provider.json"
    config_path.write_text(json.dumps(values, indent=2))

สำหรับ Provider ที่ใช้ env-var เท่านั้น ให้ปล่อยค่า default no-op ไว้

จุดเข้าใช้งาน Plugin

def register(ctx) -> None:
    """Called by the memory plugin discovery system."""
    ctx.register_memory_provider(MyMemoryProvider())

plugin.yaml

name: my-provider
version: 1.0.0
description: "Short description of what this provider does."
hooks:
  - on_session_end    # list hooks you implement

ข้อกำหนดเรื่อง Threading

sync_turn() ต้องไม่บล็อก (non-blocking) อย่างเด็ดขาด หาก backend ของคุณมีการหน่วงเวลา (latency) (เช่น API calls, LLM processing) ให้รันงานนั้นใน daemon thread:

def sync_turn(self, user_content, assistant_content):
    def _sync():
        try:
            self._api.ingest(user_content, assistant_content)
        except Exception as e:
            logger.warning("Sync failed: %s", e)

    if self._sync_thread and self._sync_thread.is_alive():
        self._sync_thread.join(timeout=5.0)
    self._sync_thread = threading.Thread(target=_sync, daemon=True)
    self._sync_thread.start()

การแยก Profile

เส้นทางจัดเก็บข้อมูลทั้งหมด ต้อง ใช้ hermes_home kwarg จาก initialize(), ห้ามใช้ ~/.hermes แบบ hardcoded:

# CORRECT - profile-scoped
from hermes_constants import get_hermes_home
data_dir = get_hermes_home() / "my-provider"

# WRONG - shared across all profiles
data_dir = Path("~/.hermes/my-provider").expanduser()

การทดสอบ

ดูที่ tests/agent/test_memory_plugin_e2e.py สำหรับรูปแบบการทดสอบ E2E ที่สมบูรณ์โดยใช้ SQLite provider จริง

from agent.memory_manager import MemoryManager

mgr = MemoryManager()
mgr.add_provider(my_provider)
mgr.initialize_all(session_id="test-1", platform="cli")

# Test tool routing
result = mgr.handle_tool_call("my_tool", {"action": "add", "content": "test"})

# Test lifecycle
mgr.sync_all("user msg", "assistant msg")
mgr.on_session_end([])
mgr.shutdown_all()

การเพิ่มคำสั่ง CLI

Memory provider plugins สามารถลงทะเบียน subcommand tree ของ CLI ของตัวเองได้ (เช่น hermes my-provider status, hermes my-provider config) วิธีนี้ใช้ระบบการค้นพบแบบ convention-based - ไม่จำเป็นต้องแก้ไขไฟล์หลักใด ๆ

วิธีการทำงาน

เพิ่มไฟล์ cli.py ในไดเรกทอรีปลั๊กอินของคุณ
กำหนดฟังก์ชัน register_cli(subparser) ที่สร้างโครงสร้าง argparse tree
ระบบ memory plugin จะค้นพบมันเมื่อเริ่มต้นผ่าน discover_plugin_cli_commands()
คำสั่งของคุณจะปรากฏภายใต้ hermes <provider-name> <subcommand>

Active-provider gating: คำสั่ง CLI ของคุณจะปรากฏก็ต่อเมื่อ Provider ของคุณเป็น memory.provider ที่ใช้งานอยู่ใน config เท่านั้น หากผู้ใช้ยังไม่ได้ตั้งค่า Provider ของคุณ คำสั่งของคุณจะไม่แสดงใน hermes --help.

ตัวอย่าง

# plugins/memory/my-provider/cli.py

def my_command(args):
    """Handler dispatched by argparse."""
    sub = getattr(args, "my_command", None)
    if sub == "status":
        print("Provider is active and connected.")
    elif sub == "config":
        print("Showing config...")
    else:
        print("Usage: hermes my-provider <status|config>")

def register_cli(subparser) -> None:
    """Build the hermes my-provider argparse tree.

    Called by discover_plugin_cli_commands() at argparse setup time.
    """
    subs = subparser.add_subparsers(dest="my_command")
    subs.add_parser("status", help="Show provider status")
    subs.add_parser("config", help="Show provider config")
    subparser.set_defaults(func=my_command)

Reference implementation

ดูที่ plugins/memory/honcho/cli.py สำหรับตัวอย่างเต็มรูปแบบที่มี 13 subcommands, การจัดการข้าม profile (--target-profile), และการอ่าน/เขียน config

โครงสร้างไดเรกทอรีพร้อม CLI

plugins/memory/my-provider/
├── __init__.py      # MemoryProvider implementation + register()
├── plugin.yaml      # Metadata
├── cli.py           # register_cli(subparser) - CLI commands
└── README.md        # Setup instructions

กฎของ Single Provider

สามารถมี Memory provider ภายนอกที่ใช้งานได้เพียง หนึ่งเดียว เท่านั้น หากผู้ใช้พยายามลงทะเบียนตัวที่สอง MemoryManager จะปฏิเสธด้วยคำเตือน สิ่งนี้ช่วยป้องกันปัญหา tool schema bloat และ backend ที่ขัดแย้งกัน

📄 developer-guide/prompt-assembly.md

sidebar_position: 5 title: "Prompt Assembly" description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"

Prompt Assembly

Hermes แยกส่วนอย่างจงใจระหว่าง:

cached system prompt state
ephemeral API-call-time additions

นี่คือหนึ่งในการตัดสินใจด้านการออกแบบที่สำคัญที่สุดในโปรเจกต์นี้ เพราะมันส่งผลกระทบต่อ:

token usage
prompt caching effectiveness
session continuity
memory correctness

ไฟล์หลัก:

run_agent.py
agent/prompt_builder.py
tools/memory_tool.py

Cached system prompt layers

ระบบ prompt ของ agent ที่ถูกแคชจะถูกประกอบขึ้นตามลำดับโดยประมาณดังนี้:

agent identity - SOUL.md จาก HERMES_HOME เมื่อมีอยู่ มิฉะนั้นจะใช้ DEFAULT_AGENT_IDENTITY ใน prompt_builder.py เป็นค่าสำรอง
tool-aware behavior guidance
Honcho static block (เมื่อเปิดใช้งาน)
optional system message
frozen MEMORY snapshot
frozen USER profile snapshot
skills index
context files (AGENTS.md, .cursorrules, .cursor/rules/*.mdc) - SOUL.md จะ ไม่ ถูกรวมอยู่ในส่วนนี้เมื่อถูกโหลดเป็น identity ในขั้นตอนที่ 1 แล้ว
timestamp / optional session ID
platform hint

เมื่อตั้งค่า skip_context_files (เช่น การมอบหมายงานให้ subagent) SOUL.md จะไม่ถูกโหลด และจะใช้ DEFAULT_AGENT_IDENTITY ที่กำหนดไว้แบบ hardcode แทน

ตัวอย่างที่เป็นรูปธรรม: assembled system prompt

นี่คือมุมมองแบบง่ายๆ ว่าระบบ prompt สุดท้ายมีหน้าตาอย่างไรเมื่อมีทุก layer (ความคิดเห็นแสดงแหล่งที่มาของแต่ละส่วน):

# Layer 1: Agent Identity (from ~/.hermes/SOUL.md)
You are Hermes, an AI assistant created by Nous Research.
You are an expert software engineer and researcher.
You value correctness, clarity, and efficiency.
...

# Layer 2: Tool-aware behavior guidance
You have persistent memory across sessions. Save durable facts using
the memory tool: user preferences, environment details, tool quirks,
and stable conventions. Memory is injected into every turn, so keep
it compact and focused on facts that will still matter later.
...
When the user references something from a past conversation or you
suspect relevant cross-session context exists, use session_search
to recall it before asking them to repeat themselves.

# Tool-use enforcement (for GPT/Codex models only)
You MUST use your tools to take action - do not describe what you
would do or plan to do without actually doing it.
...

# Layer 3: Honcho static block (when active)
[Honcho personality/context data]

# Layer 4: Optional system message (from config or API)
[User-configured system message override]

# Layer 5: Frozen MEMORY snapshot
## Persistent Memory
- User prefers Python 3.12, uses pyproject.toml
- Default editor is nvim
- Working on project "atlas" in ~/code/atlas
- Timezone: US/Pacific

# Layer 6: Frozen USER profile snapshot
## User Profile
- Name: Alice
- GitHub: alice-dev

# Layer 7: Skills index
## Skills (mandatory)
Before replying, scan the skills below. If one clearly matches
your task, load it with skill_view(name) and follow its instructions.
...
<available_skills>
  software-development:
    - code-review: Structured code review workflow
    - test-driven-development: TDD methodology
  research:
    - arxiv: Search and summarize arXiv papers
</available_skills>

# Layer 8: Context files (from project directory)
# Project Context
The following project context files have been loaded and should be followed:

## AGENTS.md
This is the atlas project. Use pytest for testing. The main
entry point is src/atlas/main.py. Always run `make lint` before
committing.

# Layer 9: Timestamp + session
Current time: 2026-03-30T14:30:00-07:00
Session: abc123

# Layer 10: Platform hint
You are a CLI AI Agent. Try not to use markdown but simple text
renderable inside a terminal.

How SOUL.md appears in the prompt

SOUL.md อยู่ที่ ~/.hermes/SOUL.md และทำหน้าที่เป็น identity ของ agent - ซึ่งเป็นส่วนแรกสุดของระบบ prompt ลอจิกการโหลดใน prompt_builder.py ทำงานดังนี้:

# From agent/prompt_builder.py (simplified)
def load_soul_md() -> Optional[str]:
    soul_path = get_hermes_home() / "SOUL.md"
    if not soul_path.exists():
        return None
    content = soul_path.read_text(encoding="utf-8").strip()
    content = _scan_context_content(content, "SOUL.md")  # Security scan
    content = _truncate_content(content, "SOUL.md")       # Cap at 20k chars
    return content

เมื่อ load_soul_md() ส่งคืน content มันจะแทนที่ DEFAULT_AGENT_IDENTITY ที่กำหนดไว้แบบ hardcode จากนั้นจะเรียกใช้ฟังก์ชัน build_context_files_prompt() ด้วย skip_soul=True เพื่อป้องกันไม่ให้ SOUL.md ปรากฏซ้ำสองครั้ง (ครั้งหนึ่งเป็น identity และอีกครั้งเป็น context file)

หาก SOUL.md ไม่มีอยู่ ระบบจะใช้ค่าสำรองดังนี้:

You are Hermes Agent, an intelligent AI assistant created by Nous Research.
You are helpful, knowledgeable, and direct. You assist users with a wide
range of tasks including answering questions, writing and editing code,
analyzing information, creative work, and executing actions via your tools.
You communicate clearly, admit uncertainty when appropriate, and prioritize
being genuinely useful over being verbose unless otherwise directed below.
Be targeted and efficient in your exploration and investigations.

How context files are injected

build_context_files_prompt() ใช้ priority system - โดยจะโหลด project context ได้เพียงประเภทเดียว (first match wins):

# From agent/prompt_builder.py (simplified)
def build_context_files_prompt(cwd=None, skip_soul=False):
    cwd_path = Path(cwd).resolve()

    # Priority: first match wins - only ONE project context loaded
    project_context = (
        _load_hermes_md(cwd_path)       # 1. .hermes.md / HERMES.md (walks to git root)
        or _load_agents_md(cwd_path)    # 2. AGENTS.md (cwd only)
        or _load_claude_md(cwd_path)    # 3. CLAUDE.md (cwd only)
        or _load_cursorrules(cwd_path)  # 4. .cursorrules / .cursor/rules/*.mdc
    )

    sections = []
    if project_context:
        sections.append(project_context)

    # SOUL.md from HERMES_HOME (independent of project context)
    if not skip_soul:
        soul_content = load_soul_md()
        if soul_content:
            sections.append(soul_content)

    if not sections:
        return ""

    return (
        "# Project Context\n\n"
        "The following project context files have been loaded "
        "and should be followed:\n\n"
        + "\n".join(sections)
    )

Context file discovery details

Priority	Files	Search scope	Notes
1	`.hermes.md`, `HERMES.md`	CWD up to git root	Hermes-native project config
2	`AGENTS.md`	CWD only	Common agent instruction file
3	`CLAUDE.md`	CWD only	Claude Code compatibility
4	`.cursorrules`, `.cursor/rules/*.mdc`	CWD only	Cursor compatibility

Context files ทั้งหมดจะผ่านกระบวนการ:

Security scanned - ตรวจสอบรูปแบบ prompt injection (invisible unicode, "ignore previous instructions", credential exfiltration attempts)
Truncated - ถูกจำกัดความยาวสูงสุดที่ 20,000 characters โดยใช้อัตราส่วน head/tail 70/20 พร้อมเครื่องหมาย truncation
YAML frontmatter stripped - ส่วน frontmatter ของ .hermes.md จะถูกลบออก (สงวนไว้สำหรับการ override config ในอนาคต)

API-call-time-only layers

ส่วนเหล่านี้ถูกตั้งใจให้ ไม่ ถูกบันทึกเป็นส่วนหนึ่งของ cached system prompt:

ephemeral_system_prompt
prefill messages
gateway-derived session context overlays
later-turn Honcho recall injected into the current-turn user message

การแยกส่วนนี้ช่วยให้ stable prefix ยังคงเสถียรสำหรับการแคช

Memory snapshots

ข้อมูล memory ท้องถิ่นและ user profile จะถูกฉีดเป็น frozen snapshots เมื่อเริ่ม session การเขียนข้อมูลระหว่าง session จะอัปเดตสถานะบน disk แต่จะไม่เปลี่ยนแปลง system prompt ที่สร้างขึ้นแล้ว จนกว่าจะเกิด session ใหม่หรือมีการบังคับ rebuild

Context files

agent/prompt_builder.py จะสแกนและ sanitize context files ของโปรเจกต์โดยใช้ priority system - โดยจะโหลดได้เพียงประเภทเดียว (first match wins):

.hermes.md / HERMES.md (walks to git root)
AGENTS.md (CWD เมื่อเริ่มต้น; subdirectories จะถูกค้นพบอย่างต่อเนื่องระหว่าง session ผ่าน agent/subdirectory_hints.py)
CLAUDE.md (CWD เท่านั้น)
.cursorrules / .cursor/rules/*.mdc (CWD เท่านั้น)

SOUL.md ถูกโหลดแยกต่างหากผ่าน load_soul_md() สำหรับ slot ของ identity เมื่อโหลดสำเร็จ build_context_files_prompt(skip_soul=True) จะป้องกันไม่ให้มันปรากฏซ้ำ

ไฟล์ที่มีความยาวมากจะถูก truncate ก่อนการ injection

Skills index

ระบบ skills จะช่วยเพิ่ม skills index ที่กระชับเข้าไปใน prompt เมื่อมีเครื่องมือ skills tooling ให้ใช้งาน

Why prompt assembly is split this way

สถาปัตยกรรมนี้ถูกออกแบบมาโดยเจตนาเพื่อ:

preserve provider-side prompt caching
avoid mutating history unnecessarily
keep memory semantics understandable
let gateway/ACP/CLI add context without poisoning persistent prompt state

Related docs

extent analysis

TL;DR

To resolve the issue, identify the root cause by analyzing the error message and the code, then apply the necessary fix, which may involve modifying the code, updating dependencies, or adjusting configuration settings.

Guidance

Analyze the error message: Carefully read the error message to understand the nature of the issue.
Review the code: Examine the code to identify potential causes of the error, such as syntax errors, incorrect function calls, or missing dependencies.
Check dependencies and versions: Verify that all dependencies are up-to-date and compatible with the current version of the code.
Apply the fix: Based on the analysis, apply the necessary fix, which may involve modifying the code, updating dependencies, or adjusting configuration settings.
Test the fix: Thoroughly test the fix to ensure that it resolves the issue without introducing new problems.

Example

Since the issue body does not provide a specific error message or code snippet, it is not possible to provide a concrete example. However, in general, when encountering an error, it is essential to analyze the error message, review the code, and apply the necessary fix.

Notes

The issue body appears to be a collection of documentation pages for the Hermes Agent project, which provides a framework for building conversational AI models.
The pages cover various topics, including environments, benchmarks, data generation, and memory provider plugins.
Without a specific error message or code snippet, it is challenging to provide a precise solution.

Recommendation

Based on the information provided, it is recommended to review the documentation pages carefully and identify the specific section or code snippet that is causing the issue. Then, apply the necessary fix, which may involve modifying the code, updating dependencies, or adjusting configuration settings. If the issue persists, consider seeking additional support from the Hermes Agent community or a qualified developer.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #conversation history #latency issue #model loading #environment variable

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.