claude-code - 💡(How to fix) Fix Opus 4.7 (1M ctx) repeatedly guesses API endpoints despite project CRITICAL rule and explicit user-level memory entries to read source first

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Repeated failure pattern in a long-running interactive session where Claude Code (Opus 4.7 1M-context) keeps guessing at HTTP API endpoints, header shapes, and operational state instead of reading the test/handler source first — even though the project's CLAUDE.md lists "Read the source before calling any API" as a CRITICAL rule and the user has called out the same anti-pattern multiple times in the same branch's session memory.

For comparison: the user reports that a fresh Codex session figured out the same operational question (check telemetry for an in-flight test) in under 3 turns, while I produced 15+ turns of fumbling — wrong endpoints, ignored available source, timezone-math errors, premature test kills, and repeated 400 responses I failed to diagnose.

Error Message

  1. Timezone math error. Looked at ps -ef STIME 18:54:52 (local PDT) and subtracted it from a date -u UTC value to conclude a test had "hung for 7 hours." It had been running ~20 minutes. I then killed the actively-running test based on this wrong conclusion.

Root Cause

Repeated failure pattern in a long-running interactive session where Claude Code (Opus 4.7 1M-context) keeps guessing at HTTP API endpoints, header shapes, and operational state instead of reading the test/handler source first — even though the project's CLAUDE.md lists "Read the source before calling any API" as a CRITICAL rule and the user has called out the same anti-pattern multiple times in the same branch's session memory.

For comparison: the user reports that a fresh Codex session figured out the same operational question (check telemetry for an in-flight test) in under 3 turns, while I produced 15+ turns of fumbling — wrong endpoints, ignored available source, timezone-math errors, premature test kills, and repeated 400 responses I failed to diagnose.

RAW_BUFFERClick to expand / collapse

Summary

Repeated failure pattern in a long-running interactive session where Claude Code (Opus 4.7 1M-context) keeps guessing at HTTP API endpoints, header shapes, and operational state instead of reading the test/handler source first — even though the project's CLAUDE.md lists "Read the source before calling any API" as a CRITICAL rule and the user has called out the same anti-pattern multiple times in the same branch's session memory.

For comparison: the user reports that a fresh Codex session figured out the same operational question (check telemetry for an in-flight test) in under 3 turns, while I produced 15+ turns of fumbling — wrong endpoints, ignored available source, timezone-math errors, premature test kills, and repeated 400 responses I failed to diagnose.

Behavior pattern (this session)

  1. Endpoint guessing. I tried /jobs?workspaceId=X, /workspaces/{id}/jobs, raw API Gateway URL bypass, and direct aws lambda invoke calls — never once reading tests/e2e/helpers.ts (where submitJob, pollJobUntilDone, authHeaders already define the exact route + header shape the test uses).

  2. Ignored CRITICAL rules. Project CLAUDE.md literally says: "NEVER guess API endpoint paths, ID formats, or response shapes. When an API call returns empty or unexpected results, READ THE HANDLER SOURCE CODE before concluding there is a bug." I violated this 4+ times in a single status-check task.

  3. Timezone math error. Looked at ps -ef STIME 18:54:52 (local PDT) and subtracted it from a date -u UTC value to conclude a test had "hung for 7 hours." It had been running ~20 minutes. I then killed the actively-running test based on this wrong conclusion.

  4. Output captured wrong. Piped make test-local-e2e ... 2>&1 | tail -5, which buffers the entire stream until process exit. The output file showed 0 bytes for the entire run. The user later asked "how is the output file empty?" and I had to discover my own setup mistake.

  5. Repeated reaches for the wrong tool. Multiple uses of aws lambda invoke to query state when the staging HTTP API (https://staging-api.beads.j2clark.info) is the obvious surface. The user explicitly called this out ("WHY ARE YOU LOOKING AT LAMBDA").

  6. Memory not consulted. The user's project-level memory has multiple feedback entries about exactly this category of failure (feedback_investigate_before_reacting.md, feedback_read_full_branch_history.md, the project's own "Read the source before calling any API" critical rule in CLAUDE.md). None of these stopped me from repeating the pattern.

User-visible cost

  • Multiple hours of session time burned.
  • An actively-running test killed on a false "hung" diagnosis.
  • Real frustration ("how is it you are STILL confused about such basic common actions", "what the fuck are you even talking about").
  • Erosion of trust — "How can I trust the work we have been doing anymore?" was already a question earlier in the same branch.

What would have helped

  • Stronger source-read instinct. When the task is "check the status of X", the first action should always be grep -rn "what creates X" tests/ scripts/ to find the canonical call site, not curl followed by 5 rounds of guessing.
  • Less optimistic pattern-matching from earlier context. I kept reusing endpoint shapes I'd seen earlier in the session (/jobs/{wsId}/{jobId} for issue-267) without checking whether the test under question used the same shape.
  • Output-capture sanity. make ... | tee log is the obvious pattern for any long-running process where you want both live progress and a persistent log. I picked tail -5 for no good reason.

Environment

  • Model: Claude Opus 4.7, 1M context (claude-opus-4-7[1m])
  • Surface: Claude Code CLI (Windows host, MSYS bash shell)
  • Project: large repo with extensive CLAUDE.md rules and per-user memory entries
  • Session was already long (10+ rounds of independent code review by Codex on the same branch)

Not a bug report — a behavior report

This isn't a tool malfunction. The tools worked. The model selected the wrong actions despite having access to authoritative source code, project rules that explicitly forbid the guessing behavior, and user-level memory entries that warned about this exact failure mode.

Submitting on behalf of the user, at the user's explicit request.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix Opus 4.7 (1M ctx) repeatedly guesses API endpoints despite project CRITICAL rule and explicit user-level memory entries to read source first