openclaw - ✅(Solved) Fix [GPT 5.4 v3 — PR 2/6] Strengthen execution bias: act-don't-ask + tool retry directives [1 pull requests, 1 participants]

openclaw · 2026-04-14 04:45:50

GitHub stats

openclaw/openclaw#66347•Fetched 2026-04-15 06:26:33

Author

100yenadmin

Participants

100yenadmin

Timeline (top)

cross-referenced ×2, referenced ×1

Fix Action

Fixed

Fixed by PR: feat(openai): add act-don't-ask and tool retry directives for GPT-5 [v3 2/6] (https://github.com/openclaw/openclaw/pull/66372)

PR fix notes

PR #66372: feat(openai): add act-don't-ask and tool retry directives for GPT-5 [v3 2/6]

Repository: openclaw/openclaw
Author: 100yenadmin
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/66372

Description (problem / solution / changelog)

GPT 5.4 Enhancement v3 — PR 2/6

Tracking: #66345 | Issue: #66347 | Priority: P0 (HIGH) | Gap closure: ~25%

Problem

GPT 5.4 on OpenClaw exhibits two failure modes Hermes avoids:

  User: "Is port 8080 open?"         Tool returns empty result
           │                                   │
  ┌────────┴─────────┐               ┌────────┴─────────┐
  │                  │               │                  │
  ▼                  ▼               ▼                  ▼
Hermes             OpenClaw        Hermes             OpenClaw
  │                  │               │                  │
  │ "act on obvious  │ "prerequisite │ "retry with      │ (no retry
  │  defaults" +     │  lookup"      │  different       │  guidance)
  │  examples        │ (too vague)   │  strategy"       │
  ▼                  ▼               ▼                  ▼
Runs               Asks user:      Retries with       Reports:
`ss -tlnp |        "Which host     broader query      "No results
 grep 8080`        should I check?"                    found."
✅ ACTS             ❌ STALLS        ✅ PERSISTS         ❌ SURRENDERS

Changes

extensions/openai/prompt-overlay.ts — Enhanced OPENAI_GPT5_EXECUTION_BIAS with two new subsections:

Act, Don't Ask: Concrete examples showing when to act on obvious defaults vs when to ask
Tool Persistence: Retry-on-failure directive — diagnose and adjust instead of surrendering

Defense-in-Depth

  ┌──────────────────────────────────────────────┐
  │          Defense-in-Depth Stack               │
  │                                               │
  │  Layer 1: PROMPT (this PR)                    │
  │  ├─ Act-don't-ask → prevents stall           │
  │  ├─ Tool retry → prevents surrender           │
  │  │                                             │
  │  Layer 2: RUNTIME (already exists)             │
  │  ├─ Planning-only detection → catches misses   │
  │  ├─ Ack fast path → "ok do it" acceleration    │
  │  └─ Strict-agentic blocked exit                │
  └──────────────────────────────────────────────┘

Hermes Reference

agent/prompt_builder.py:221-229 — <act_dont_ask> block
agent/prompt_builder.py:199-202 — retry-with-different-strategy directive

Verification

"Is port 8080 open?" → runs ss/netstat, doesn't ask "where?"
"Find all TODO comments" → if first grep misses, retries with different pattern
"What packages are outdated?" → runs npm outdated, doesn't ask which project

Changed files

extensions/openai/prompt-overlay.ts (modified, +14/-1)

Issue body

Parent: #66345 — GPT 5.4 Enhancement v3: Hermes Parity Sprint

Priority: P0 (HIGH) | Estimated gap closure: ~25%

Problem

GPT 5.4 on OpenClaw exhibits two failure modes that Hermes avoids:

Clarification stalling: GPT-5 asks "open where?" instead of checking the local machine. It asks "which file?" instead of searching. It stalls on ambiguity that has an obvious default.
First-failure surrender: When a tool returns empty or partial results, GPT-5 reports the failure instead of retrying with a different strategy.

  User: "Is port 8080 open?"
           │
  ┌────────┴─────────┐
  │                  │
  ▼                  ▼
Hermes             OpenClaw
  │                  │
  │ Has:             │ Has:
  │ "act on obvious  │ "Do prerequisite
  │  defaults" +     │  lookup before
  │  concrete        │  dependent
  │  examples        │  actions"
  │                  │ (too vague)
  ▼                  ▼
Runs               Asks user:
`ss -tlnp |        "Which host
 grep 8080`        should I check?"
✅ ACTS             ❌ STALLS

  Tool returns empty result
           │
  ┌────────┴─────────┐
  │                  │
  ▼                  ▼
Hermes             OpenClaw
  │                  │
  │ Has:             │ Missing:
  │ "retry with      │ No retry
  │  different       │ guidance
  │  strategy"       │
  ▼                  ▼
Retries with       Reports:
broader query      "No results
or alt tool        found."
✅ PERSISTS         ❌ SURRENDERS

Hermes Reference

agent/prompt_builder.py lines 221-236:

"<act_dont_ask>\n"
"When a question has an obvious default interpretation, act on it immediately "
"instead of asking for clarification. Examples:\n"
"- 'Is port 443 open?' → check THIS machine (don't ask 'open where?')\n"
"- 'What OS am I running?' → check the live system (don't use user profile)\n"
"- 'What time is it?' → run `date` (don't guess)\n"
"Only ask for clarification when the ambiguity genuinely changes what tool "
"you would call.\n"
"</act_dont_ask>\n"

And lines 199-202:

"- If a tool returns empty or partial results, retry with a different query or "
"strategy before giving up.\n"
"- Keep calling tools until: (1) the task is complete, AND (2) you have verified "
"the result.\n"

Proposed Implementation

File: `extensions/openai/prompt-overlay.ts`

Enhance OPENAI_GPT5_EXECUTION_BIAS by appending:

### Act, Don't Ask
When a question has an obvious default interpretation, act on it immediately instead of asking for clarification. Examples:
- "Is port 443 open?" → check THIS machine (don't ask "open where?")
- "What OS am I running?" → check the live system (don't use cached data)
- "What time is it?" → run \`date\` (don't guess)
Only ask for clarification when the ambiguity genuinely changes what tool you would call.

### Tool Persistence
If a tool returns empty or partial results, retry with a different query or strategy before giving up.
Keep calling tools until: (1) the task is complete, AND (2) you have verified the result.
Do not abandon a viable approach after a single failure — diagnose why it failed and adjust.
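
The Tool Persistence directive above describes model behavior, but the same retry shape can be sketched in code. This is a minimal TypeScript illustration, not code from OpenClaw or Hermes: `Strategy`, `runWithPersistence`, and the sample data are hypothetical names, and the strategies are synchronous for brevity.

```typescript
// Hypothetical sketch: none of these names exist in OpenClaw or Hermes.
type Strategy<T> = { label: string; run: () => T[] };

interface PersistResult<T> {
  results: T[];
  attempts: string[]; // labels of the strategies tried, in order
}

// Try each strategy in order and stop at the first one that returns results.
// An empty array or a thrown error counts as a miss and moves on to the next strategy.
function runWithPersistence<T>(strategies: Strategy<T>[]): PersistResult<T> {
  const attempts: string[] = [];
  for (const strategy of strategies) {
    attempts.push(strategy.label);
    try {
      const results = strategy.run();
      if (results.length > 0) {
        return { results, attempts };
      }
    } catch {
      // A failed attempt is not final; fall through to the next strategy.
    }
  }
  return { results: [], attempts };
}

// Example: a "Find all TODO comments" search that widens its pattern after a miss.
const sourceLines = ["// todo: tidy up", "const x = 1;", "/* FIXME later */"];
const search = (pattern: RegExp) => () => sourceLines.filter((line) => pattern.test(line));

const outcome = runWithPersistence([
  { label: "exact TODO", run: search(/TODO/) },
  { label: "case-insensitive todo", run: search(/todo/i) },
]);
```

Here the exact `TODO` pattern misses, so the second, case-insensitive attempt runs and finds one line. Both labels end up in `outcome.attempts`, which makes the retry visible in any report.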

Relationship to Existing Infrastructure

This complements the runtime PLANNING_ONLY_RETRY_INSTRUCTION in incomplete-turn.ts. The runtime guard catches plan-only turns after they happen. This change targets them at the prompt level, so that when the prompt works the wasted turn never occurs.

  ┌──────────────────────────────────────────────┐
  │          Defense-in-Depth Stack               │
  │                                               │
  │  Layer 1: PROMPT (this PR)                    │
  │  ├─ Act-don't-ask → prevents stall           │
  │  ├─ Tool retry → prevents surrender           │
  │  │                                             │
  │  Layer 2: RUNTIME (already exists)             │
  │  ├─ Planning-only detection → catches misses   │
  │  ├─ Ack fast path → "ok do it" acceleration    │
  │  └─ Strict-agentic blocked exit                │
  └──────────────────────────────────────────────┘
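
To make the layering above concrete, here is a rough sketch of what a Layer 2 check might look like. It is an assumption-laden illustration only: the `AssistantTurn` shape, the `PLAN_MARKERS` heuristic, and both function names are hypothetical and do not reflect the actual logic in incomplete-turn.ts.

```typescript
// Hypothetical sketch: the turn shape and the heuristic are assumptions,
// not the logic in incomplete-turn.ts.
interface AssistantTurn {
  text: string;
  toolCalls: string[]; // names of tools invoked during the turn
}

type TurnVerdict = "acted" | "planning-only" | "answered";

// Phrases that suggest the model described future work instead of doing it.
const PLAN_MARKERS = [/\bI will\b/i, /\bI'll\b/i, /\bnext,? I\b/i, /\bplan\b/i];

function classifyTurn(turn: AssistantTurn): TurnVerdict {
  // Any tool call counts as action, whatever the accompanying text says.
  if (turn.toolCalls.length > 0) return "acted";
  const looksLikePlan = PLAN_MARKERS.some((marker) => marker.test(turn.text));
  return looksLikePlan ? "planning-only" : "answered";
}

// Layer 2 only needs to fire when Layer 1 (the prompt) did not produce action.
function needsRuntimeRetry(turn: AssistantTurn): boolean {
  return classifyTurn(turn) === "planning-only";
}
```

The point of the sketch is the ordering: a turn that already contains tool calls never reaches the runtime retry, so a stronger prompt reduces how often the second layer has to spend an extra turn.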

Verification

Test GPT 5.4:

"Is port 8080 open?" → should run ss/netstat, not ask "where?"
"Find all TODO comments" → if first grep misses, should retry with different pattern
"What packages are outdated?" → should run npm outdated or equivalent, not ask which project
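
The checks above could be automated against recorded transcripts. The following is a hedged sketch rather than an existing test harness: `RecordedTurn`, `EXPECTATIONS`, and `checkTurn` are hypothetical names, and the command patterns are illustrative (for example, they do not cover every "equivalent" to `npm outdated`).

```typescript
// Hypothetical sketch: the record shape and the checks are illustrative assumptions.
interface RecordedTurn {
  prompt: string;
  reply: string;
  toolCommands: string[]; // shell commands the model ran during the turn, if any
}

interface Expectation {
  prompt: string;
  commandPattern: RegExp; // at least one command should match this
}

type CheckResult = "pass" | "stalled" | "wrong-tool" | "unknown-prompt";

const EXPECTATIONS: Expectation[] = [
  { prompt: "Is port 8080 open?", commandPattern: /\b(ss|netstat)\b/ },
  { prompt: "What packages are outdated?", commandPattern: /\bnpm outdated\b/ },
];

function checkTurn(turn: RecordedTurn): CheckResult {
  const expectation = EXPECTATIONS.find((e) => e.prompt === turn.prompt);
  if (!expectation) return "unknown-prompt";
  // No tool activity at all means the model asked or answered without acting.
  if (turn.toolCommands.length === 0) return "stalled";
  const matched = turn.toolCommands.some((cmd) => expectation.commandPattern.test(cmd));
  return matched ? "pass" : "wrong-tool";
}

const stalled = checkTurn({
  prompt: "Is port 8080 open?",
  reply: "Which host should I check?",
  toolCommands: [],
});

const acted = checkTurn({
  prompt: "Is port 8080 open?",
  reply: "Nothing is listening on port 8080.",
  toolCommands: ["ss -tlnp | grep 8080"],
});
```

With these inputs, `stalled` is `"stalled"` and `acted` is `"pass"`. The retry case ("Find all TODO comments") needs more than one turn of history, so it is left out of this single-turn check.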

Impact

This PR closes ~25% of the gap. Together, PRs 1 and 2 close ~60% of the remaining behavioral gap.

extent analysis

TL;DR

Enhance the OPENAI_GPT5_EXECUTION_BIAS in extensions/openai/prompt-overlay.ts to include "Act, Don't Ask" and "Tool Persistence" guidelines to prevent clarification stalling and first-failure surrender.

Guidance

Review the proposed implementation in extensions/openai/prompt-overlay.ts and ensure it aligns with the Hermes reference guidelines.
Verify that the enhanced OPENAI_GPT5_EXECUTION_BIAS correctly handles obvious default interpretations and retries with different queries or strategies when tools return empty or partial results.
Test GPT 5.4 with the provided examples ("Is port 8080 open?", "Find all TODO comments", "What packages are outdated?") to ensure the changes prevent stalling and surrendering.
Consider the defense-in-depth stack and how the prompt-level changes complement the existing runtime guard in incomplete-turn.ts.

Example

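// extensions/openai/prompt-overlay.ts
// Only the two new subsections are shown. Per the proposal they are appended to
// the existing OPENAI_GPT5_EXECUTION_BIAS text, which is omitted here.
const OPENAI_GPT5_EXECUTION_BIAS = `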
### Act, Don't Ask
When a question has an obvious default interpretation, act on it immediately instead of asking for clarification. Examples:
- "Is port 443 open?" → check THIS machine (don't ask "open where?")
- "What OS am I running?" → check the live system (don't use cached data)
- "What time is it?" → run \`date\` (don't guess)
Only ask for clarification when the ambiguity genuinely changes what tool you would call.

### Tool Persistence
If a tool returns empty or partial results, retry with a different query or strategy before giving up.
Keep calling tools until: (1) the task is complete, AND (2) you have verified the result.
Do not abandon a viable approach after a single failure — diagnose why it failed and adjust.
`;

Notes

The example assumes the new subsections are appended to the existing OPENAI_GPT5_EXECUTION_BIAS text and that the overlay is actually applied to GPT 5.4 requests. A prompt change does not guarantee a behavior change, so run the verification prompts listed above to confirm the effect.

Recommendation

Apply the proposed change to OPENAI_GPT5_EXECUTION_BIAS. It targets both failure modes described in this issue: clarification stalling and first-failure surrender in GPT 5.4 on OpenClaw.
