claude-code - 💡(How to fix) Fix [BUG] Cowork — multiple silent failure modes (write tools, Read cache, MCP cache lock) — extends #54847 [1 participants]

GitHub stats
anthropics/claude-code#54891 · Fetched 2026-05-01 05:51:42
Comments: 0 · Participants: 1 · Timeline events: 6 · Reactions: 0
Timeline (top): labeled ×5, cross-referenced ×1


Production usage context

I've been using the Claude Cowork desktop app in production for ~1 week to build an Obsidian-based knowledge management system on a 3,197-file vault. Between 2026-04-29 and 2026-04-30 I encountered 10+ systemic silent failure issues that caused real data loss and required Tier 2 manual recovery (clipboard rescue from the Obsidian app to chat to disk).

This issue extends the symptom reported in #54847 (Tool dispatch stalls silently in 2.1.121-2.1.123) — same class of failure mode but observed across multiple tools, not only tool_use emission stalls.

I'm filing this as a master issue covering related silent failure modes. Happy to split into per-component issues if engineering prefers — wanted to keep the cross-tool narrative together first.

What's wrong

Multiple Cowork tools (write tools, MCP tool calls, Read tool) silently fail or serve stale data without any error indication. The agent thinks the operation succeeded; the disk shows otherwise; recovery requires direct host-level inspection (xxd, wc -l, stat).

Issues encountered (10+ over 2 days)

P0 — Data loss / corruption

1. obsidian_create_or_update_note MCP silent failure (Obsidian Local REST API plugin via FastMCP)

  • Files ≥ 30 KB with CJK content get truncated to ~13.7 KB
  • Tool returns success; Read tool shows complete content; disk shows truncated
  • Reproducible by writing 30+ KB markdown with mixed Chinese/English content

2. Cowork host-side Write tool silent failure

  • Returns "has been updated successfully"; mtime unchanged; size unchanged
  • Same symptom as Issue #1 — the documented fallback also lies
  • Tested with 5-50 KB files

3. Read tool serves cache, not disk content (most critical)

  • Read returns complete content while cat / xxd on disk shows truncated/corrupted
  • Cache is not invalidated when external writes occur
  • This masks Issues #1, #2, #4, #5 — the agent has no way to detect that writes failed
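
A script running on the host could surface the mismatch described in Issue #3 by comparing the text a tool reported against the raw bytes on disk. The following is a minimal sketch, assuming the Read tool output is available as a Python string; `diff_against_disk` is a hypothetical helper, not part of Cowork, and it must run outside any stale mount to be meaningful.

```python
from pathlib import Path


def diff_against_disk(read_tool_text: str, path: str) -> dict:
    """Compare text reported by a tool with the bytes actually on disk."""
    expected = read_tool_text.encode("utf-8")
    actual = Path(path).read_bytes()
    # Find the first byte offset at which the two buffers disagree.
    limit = min(len(expected), len(actual))
    first_diff = next((i for i in range(limit) if expected[i] != actual[i]), None)
    if first_diff is None and len(expected) != len(actual):
        # One buffer is a strict prefix of the other (e.g. truncation).
        first_diff = limit
    return {
        "match": expected == actual,
        "expected_bytes": len(expected),
        "disk_bytes": len(actual),
        "first_diff_offset": first_diff,
    }
```

A `first_diff_offset` equal to `disk_bytes` indicates the on-disk file is a truncated prefix of what the tool reported.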

4. Edit tool appends NULL bytes to file tail

  • Memory.md ended up with 38,228 trailing NULL bytes after a single Edit operation
  • Reproduced ≥ 3 times during 2026-04-29 work
  • Detection: python3 -c "print(open(path,'rb').read().count(b'\x00'))"
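
The one-liner above counts every NULL byte in the file. To separate trailing padding (the symptom in Issue #4) from NULL bytes embedded elsewhere, a slightly longer sketch may help. Both function names are hypothetical, and stripping is only reasonable for text files after taking a backup.

```python
from pathlib import Path


def null_byte_report(path: str) -> dict:
    """Count NUL bytes in a file and how many of them sit at the very end."""
    data = Path(path).read_bytes()
    total = data.count(b"\x00")
    trailing = len(data) - len(data.rstrip(b"\x00"))
    return {"total": total, "trailing": trailing, "size": len(data)}


def strip_trailing_nulls(path: str) -> int:
    """Remove trailing NUL bytes in place; return the number removed."""
    p = Path(path)
    data = p.read_bytes()
    cleaned = data.rstrip(b"\x00")
    if len(cleaned) != len(data):
        p.write_bytes(cleaned)
    return len(data) - len(cleaned)
```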

5. bash heredoc + Python stdin UTF-8 corruption

  • Multi-byte CJK characters truncated at end-of-stream
  • Last 1-2 bytes of final UTF-8 character missing
  • Detection: xxd file | tail shows incomplete byte sequence (e.g. e6 b5 without third byte)
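
The same check can be automated instead of reading `xxd` output by eye. This sketch uses the standard library's incremental UTF-8 decoder, which buffers an incomplete final character rather than rejecting it; `dangling_utf8_bytes` is a hypothetical name.

```python
import codecs


def dangling_utf8_bytes(data: bytes) -> bytes:
    """Return the bytes of an incomplete final UTF-8 character, or b'' if none.

    Raises UnicodeDecodeError if the data is invalid before the tail,
    which points to a different kind of corruption than truncation.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    decoder.decode(data, final=False)
    pending, _flags = decoder.getstate()
    return pending
```

For the example in the bullet above, a file ending in `e6 b5` would yield a two-byte result, meaning the third byte of the last character never reached disk.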

P1 — Data risk

6. bash mount cache stale — mtime/size doesn't sync with disk after writes; cannot use bash mount for write verification

7. Chat client URL detection corrupts long markdown: xxx.md references in long messages are auto-linkified to [xxx.md](http://xxx.md), breaking markdown source fidelity when copying back to the vault
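
When text has to be rescued from chat anyway, the auto-linkified references from Issue #7 can be reverted before writing back to the vault. This is a rough sketch that assumes the corruption always has the form shown above (label and URL host identical); `unlinkify_md_refs` is a hypothetical helper.

```python
import re

# Matches [name.md](http://name.md) or the https variant, but only when the
# link label and the URL host are the same string, so real links are untouched.
_AUTOLINK = re.compile(r"\[([^\]\s]+\.md)\]\(https?://\1/?\)")


def unlinkify_md_refs(text: str) -> str:
    """Replace auto-generated [x.md](http://x.md) links with plain x.md."""
    return _AUTOLINK.sub(r"\1", text)
```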

P2 — Operation low-efficiency

8. Dispatcher → worker session isolation — Workers cannot read dispatcher's conversation history, must re-paste long content, hit message size limits

9. Dispatcher does not proactively call SendUserMessage — Architectural limitation; SubagentStop hook bug (related: #33049); soft rules in agent memory not reliably followed

10. ~/.claude/skills/ mount protected — Cowork cannot mount user-global skills directory; forces vault staging + manual copy workaround

Bonus issue (encountered during this very feedback task)

11. Gmail MCP create_draft hangs mid-call

  • Single tool call with ~13 KB body got "tool permission stream closed before response received"
  • Worker session became unresponsive for 20+ minutes
  • send_message to the worker did not recover; had to terminate task
  • Suggests long-body operations have an unhandled timeout/buffer issue

Steps to reproduce (representative — Issue #1)

  1. In Cowork desktop app, open a vault folder with ≥3,000 markdown files
  2. Use obsidian_create_or_update_note MCP tool to write a 30+ KB markdown file with mixed CJK + English + tables
  3. Tool returns success
  4. Read the file with Read tool — shows complete content
  5. From bash: wc -l file.md → shows a truncated line count (file is ~13.7 KB on disk)
  6. From bash: xxd file.md | tail → shows incomplete UTF-8 sequence at file end
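
For step 2, a test payload of known size makes the truncation easy to confirm. The generator below is a sketch with an invented table layout (the original report does not specify the exact content); it produces mixed CJK and English markdown of at least 30 KB.

```python
def make_repro_note(min_bytes: int = 30 * 1024) -> str:
    """Build a markdown note with mixed CJK/English table rows of >= min_bytes."""
    lines = [
        "# 重现测试 / Repro note",
        "",
        "| 编号 | 说明 | Notes |",
        "| --- | --- | --- |",
    ]
    # Track the UTF-8 size incrementally (+1 per line for the newline).
    size = sum(len(line.encode("utf-8")) + 1 for line in lines)
    i = 0
    while size < min_bytes:
        i += 1
        line = f"| {i} | 第{i}行：混合中文内容 | row {i} mixed English text |"
        lines.append(line)
        size += len(line.encode("utf-8")) + 1
    return "\n".join(lines) + "\n"
```

Comparing `len(note.encode("utf-8"))` and `note.count("\n")` against `wc -c` and `wc -l` on the written file shows immediately whether bytes were lost.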

Expected behavior

  • Write tools should return an explicit error when the disk write fails (cache lock, truncation, or any non-success condition)
  • Read tool should serve the disk's actual content, not in-memory cache (or expose a force-fresh-read option)
  • File integrity should be preserved — no NULL byte appending, no UTF-8 byte-boundary truncation
  • Long-body tool calls should either complete, fail with clear error, or surface a timeout warning
  • Dispatcher → worker boundary should provide a context-handle mechanism
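
The "complete, fail with clear error, or surface a timeout warning" expectation can be illustrated from the caller's side. The wrapper below is a generic sketch, not Cowork's dispatch code: `call_with_deadline` and `ToolCallTimeout` are hypothetical names, and the wrapped call is not cancelled, only abandoned.

```python
import concurrent.futures


class ToolCallTimeout(RuntimeError):
    """Raised when a wrapped call does not finish within the deadline."""


def call_with_deadline(fn, timeout_s: float, *args, **kwargs):
    """Run fn in a worker thread and raise loudly if it exceeds timeout_s."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError as exc:
        raise ToolCallTimeout(f"call exceeded {timeout_s}s") from exc
    finally:
        # Do not block on a stuck call; the worker thread finishes on its own.
        executor.shutdown(wait=False)
```

The point of the design is that the caller always gets either a result or an exception, never an indefinite hang like the one described in Issue #11.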

Workaround (Iron Rule 7)

After 2 days of debugging, the only reliable write path discovered:

# Write to /tmp via heredoc (avoids cache layers)
cat > /tmp/file.md << 'EOF_MARKER'
[content]
EOF_MARKER

# Copy to vault disk (avoids obsidian MCP cache + Cowork Write cache)
cp /tmp/file.md /vault/path/file.md

Verification (must use direct disk inspection, NOT Read tool):

ls -la file.md                                                # mtime + size
xxd file.md | tail -3                                         # last bytes (LF? NULL? incomplete UTF-8?)
python3 -c "open('file.md','rb').read().decode('utf-8')"      # UTF-8 strict
python3 -c "print(open('file.md','rb').read().count(b'\\x00'))"  # NULL byte count
wc -l file.md && wc -c file.md                                # line and byte counts
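
The five commands above can be folded into one host-side check that returns a single report. This is a sketch under the assumption that it runs with direct disk access (not through the Read tool or a stale mount); `inspect_file` is a hypothetical name.

```python
import os
from pathlib import Path


def inspect_file(path: str) -> dict:
    """Collect the same facts as ls -la, xxd | tail, wc and the python3 checks."""
    st = os.stat(path)
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")  # strict decode, raises on any invalid sequence
        utf8_ok = True
    except UnicodeDecodeError:
        utf8_ok = False
    return {
        "size": st.st_size,                 # ls -la / wc -c
        "mtime": st.st_mtime,               # ls -la
        "lines": data.count(b"\n"),         # wc -l counts newline bytes
        "null_bytes": data.count(b"\x00"),  # NULL byte count
        "utf8_ok": utf8_ok,                 # UTF-8 strict
        "ends_with_newline": data.endswith(b"\n"),
        "tail_hex": data[-16:].hex(" "),    # last bytes, as in xxd | tail
    }
```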

Impact

  • ~8-10 hours of recovery work over 2 days
  • Three critical files (Memory.md, daily note, CLAUDE.md) reconstructed via Tier 2 clipboard rescue
  • Production workflow at risk — agents claim success while data is silently lost
  • Required establishing internal protocol of 7 iron rules + 9-item validation checklist for every vault write

Suggested fixes (ordered by impact)

  1. Remove or invalidate cache layer between Read tool and disk — always read from filesystem source of truth, or expose explicit force_fresh option
  2. Make obsidian MCP and Cowork Write tool fail-loud on cache lock / truncation — return errors instead of success
  3. Add automatic atomic write verification to all write tools (post-write read-back from disk + hash check)
  4. Document silent failure modes prominently in Cowork/Dispatch docs while a fix is in progress (currently nothing in errors docs or troubleshooting docs covers these)
  5. Provide dispatcher-level context-handle mechanism so workers don't need re-paste (also avoids Issue #11 long-body hangs)
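
Suggested fix 3 (post-write read-back plus hash check) could look roughly like the following. This is an illustrative sketch, not the Write tool's implementation: `write_verified` is a hypothetical name, and a read-back in the same process confirms logical content but may still be served from the OS page cache.

```python
import hashlib
import os
import tempfile


def write_verified(path: str, text: str) -> str:
    """Write text atomically, read it back, and raise if the hash differs."""
    payload = text.encode("utf-8")
    expected = hashlib.sha256(payload).hexdigest()
    directory = os.path.dirname(os.path.abspath(path))
    # Stage in the target directory so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic swap of the staged file into place
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    with open(path, "rb") as fh:
        actual = hashlib.sha256(fh.read()).hexdigest()
    if actual != expected:
        raise OSError(f"read-back mismatch for {path}: {actual} != {expected}")
    return actual
```

The key property is fail-loud behavior: the function either returns the verified hash or raises, so a caller cannot report success for a write that did not land.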

Environment

  • Platform: Cowork desktop app (Windows version)
  • OS: Windows 11
  • Vault size: 3,197 files / 1.6 GB Obsidian vault
  • Active workers: 4-5 parallel for complex tasks
  • Time period: 2026-04-29 ~ 04-30 (2 days)
  • MCPs in use: Obsidian (FastMCP), bash (workspace), Read/Write/Edit (host-side), Gmail, scheduled-tasks, dispatch, Claude in Chrome

Related issues

  • #54847 — Tool dispatch stalls silently (closely related — same silent failure class, narrower scope)
  • #33049 — SubagentStop hook bug (related to Issue #9)
  • #54821, #54806, #54786, #54772, #54861, #54744, #54803 — recent dispatch / tool reliability reports

Note on the "single issue" preflight

This is filed as a master issue to keep the cross-tool narrative coherent (multiple silent failures share the underlying cache layer and atomic-write problem). Happy to split into 10+ separate issues if the engineering team prefers per-component tickets — please advise.


Filed via GitHub REST API by an AI agent (Cowork dispatcher) on behalf of the user as part of a feedback workflow validating the very tools being reported on.

extent analysis

TL;DR

The most likely fix for the silent failure issues in the Cowork desktop app is to remove or invalidate the cache layer between the Read tool and disk, ensuring that the tool always reads from the filesystem source of truth.

Guidance

  • Investigate the cache layer implementation to identify the cause of silent failures and data corruption.
  • Consider adding automatic atomic write verification to all write tools, including a post-write read-back from disk and hash check.
  • Review the error handling mechanisms in the obsidian MCP and Cowork Write tool to ensure they return errors instead of success in case of cache lock or truncation.
  • Evaluate the need for a dispatcher-level context-handle mechanism to avoid re-pasting content and prevent long-body hangs.
  • Verify the fixes by using direct disk inspection and tools like xxd, wc, and python to check file integrity.

Example

The provided workaround using cat and cp commands to write to /tmp and then copy to the vault disk can be used as a temporary solution to avoid cache layers:

cat > /tmp/file.md << 'EOF_MARKER'
[content]
EOF_MARKER

cp /tmp/file.md /vault/path/file.md

Verification can be done using:

ls -la file.md
xxd file.md | tail -3
python3 -c "open('file.md','rb').read().decode('utf-8')"
python3 -c "print(open('file.md','rb').read().count(b'\x00'))"
wc -l file.md && wc -c file.md

Notes

The issue is complex and affects multiple tools, so a thorough investigation and testing are necessary to ensure the fixes do not introduce new problems. The provided workaround may not be suitable for all use cases, and a more robust solution is needed.

Recommendation

Apply the workaround using cat and cp commands to avoid the cache layers, and verify each write with direct disk inspection.
