When a user supplies a reference image alongside an image-generation prompt request, Claude should, **by default and without being asked**, do the following before drafting a single line of prompt text:

1. Open and analyze the image at pixel level.
2. Enumerate the visible elements: subjects, materials, geometry, lighting, viewpoint, atmosphere, anything text-rendered in the image.
3. Note any properties that contradict its prior assumptions about the subject (e.g., "asphalt is translucent here," "pipes terminate underground, not at the surface," "street runs perpendicular to cutaway").
4. Use the enumerated observations as the source of truth when writing the prompt, and explicitly call out any place where the prompt diverges from the reference.

When a user attaches a reference image and asks Claude to produce an image-generation prompt (Nano Banana Pro, Higgsfield, Seedance, Veo, gpt_image_2, etc.), Claude routinely writes the prompt without first analyzing the supplied image in detail. Instead it leans on its prior assumptions about the subject matter, then produces a prompt that contradicts what is actually shown in the reference. The user is forced to repeatedly tell Claude "look at the image" before Claude does what should be the very first step of the task.

This has occurred in at least seven sessions over the last three weeks and at least three sessions in the last three days alone. The failure mode is consistent enough that the user has had to add a custom memory rule (feedback_recreate_all_diagram_elements.md) just to compensate.

Root Cause

Reference images are the most explicit signal a user can provide about what they want. They exist precisely because the user could not describe the result in words. When Claude ignores them in favor of its priors, the entire value of supplying the reference is lost, and the user ends up doing the visual analysis Claude should have done.

This is not a model capability problem — Claude Opus 4.7 can clearly analyze images correctly when forced to. It is a default behavior / prompting problem: Claude is not being steered to treat reference images as load-bearing inputs to the prompt-writing task.

Claude fails to analyze user-supplied reference images before writing image-generation prompts

Filed by: Jason Masters (Gaiergy Corp / Ground Floor Energy)
Date: 2026-05-24
Product: Claude Code (desktop), Opus 4.7
Severity: High. Recurring across many sessions; costs the user significant time on every image task.

Summary

Concrete examples from the last three days

1. "Geothermal ambient temperature loop renderings" — 2026-05-25

User supplied a reference image showing a residential street with non-opaque/translucent asphalt revealing buried geothermal piping.
Claude wrote a prompt that called for opaque asphalt.
User reply: "This is one of the areas that you are continuing to fail on. Look at the image that I gave you. Notice that it is not opaque. Again, the asphalt is not opaque…"
This is a property visible in the very first pixels of the reference. Claude defaulted to the prior "asphalt is opaque" instead of looking.

2. "Geothermal image without underground pipes" — 2026-05-24

Recurrence of a previously documented failure mode (Claude's notes: "the same Nano Banana Pro failure mode we hit before — it has a strong prior that 'underground pipes connecting to buildings' must visually meet the surface").
Even with the prior failure on record, Claude produced a prompt that overrode the reference image's geometry with its trained assumption.

3. "Geothermal ambient loop renderings" — 2026-05-25

Claude only produced an accurate description of the reference image after the user explicitly forced a planning step. The default behavior was to skip image analysis and jump straight to writing a prompt.

Earlier sessions showing the same pattern

2026-05-14, "Higgsfield CLI + Nano Banana" — User: "I want you to take a look at the image and tell me if the street is parallel or perpendicular to the cutaway." Claude had written a prompt without checking the basic geometry of the supplied reference.
2026-05-10, "Create artistic homes with subsurface visualization" — Claude's own notes acknowledge "the same Nano Banana Pro failure mode we hit before" and "pipe count drift," both downstream of not analyzing the reference image.
2026-05-06, "Generate image with Higgsfield" — User had to point out that Claude included a vendor name ("water furnace") that did not even appear correctly spelled in the reference image.

Expected behavior

When a user supplies a reference image alongside an image-generation prompt request, Claude should — by default, without being asked — do the following before drafting a single line of prompt text:

1. Open and analyze the image at pixel level.
2. Enumerate the visible elements: subjects, materials, geometry, lighting, viewpoint, atmosphere, anything text-rendered in the image.
3. Note any properties that contradict its prior assumptions about the subject (e.g., "asphalt is translucent here," "pipes terminate underground, not at the surface," "street runs perpendicular to cutaway").
4. Use the enumerated observations as the source of truth when writing the prompt, and explicitly call out any place where the prompt diverges from the reference.
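
To make steps 2 and 3 concrete, here is a minimal sketch of what "enumerate, then compare against priors" could look like as data. This is not how Claude works internally. The `ReferenceObservations` schema and the `diverging_priors` helper are hypothetical names invented for illustration, and the property values are taken from the examples in this report.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReferenceObservations:
    """Visible properties enumerated from the reference image (hypothetical schema)."""
    properties: dict[str, str] = field(default_factory=dict)


def diverging_priors(observed: ReferenceObservations, priors: dict[str, str]) -> list[str]:
    """Return one note per property where the image contradicts the prior."""
    notes = []
    for name, seen in observed.properties.items():
        assumed = priors.get(name)
        # Only flag properties that have a prior AND where the prior disagrees.
        if assumed is not None and assumed != seen:
            notes.append(f"{name}: reference shows '{seen}', prior assumed '{assumed}'")
    return notes


# Observations as they would be enumerated from the reference image.
observed = ReferenceObservations({
    "asphalt": "translucent",
    "pipe termination": "underground",
    "street vs. cutaway": "perpendicular",
})

# Priors named in this report; the third property has no recorded prior.
priors = {
    "asphalt": "opaque",
    "pipe termination": "at the surface",
}

for note in diverging_priors(observed, priors):
    print(note)
```

The observations are the source of truth here. The priors only decide which properties deserve an explicit call-out before the prompt is written.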

Actual behavior

Claude treats the reference image as decorative context, writes a prompt from its prior about the topic, and only inspects the image when the user explicitly demands it. Each round-trip costs the user 5–15 minutes of corrective conversation. Over the last three weeks this has happened on essentially every image task I've run.

Why this matters

Suggested fix

A system-level instruction (or training adjustment) along the lines of:

When the user supplies a reference image with a request to generate an image-generation prompt, your first action must be to analyze the image and produce an explicit enumeration of its visible properties. Treat that enumeration as the source of truth and the user's request as constraints layered on top. Do not let your priors about the subject override anything you can see in the image. If you would write a prompt token that contradicts an observable property of the reference image, stop and flag the contradiction.
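
The "stop and flag the contradiction" clause could be approximated outside the model as a simple lint over a draft prompt. The sketch below is an assumption-heavy illustration: `find_contradictions` is a hypothetical helper, the phrase lists are hand-written, and plain phrase matching will miss paraphrases, so it is a backstop and does not replace the image analysis itself.

```python
import re


def find_contradictions(draft_prompt: str, forbidden: dict[str, list[str]]) -> list[str]:
    """Flag phrases in a draft prompt that contradict an observed property.

    `forbidden` maps an observed property to phrases that would contradict it.
    """
    found = []
    for observed_property, phrases in forbidden.items():
        for phrase in phrases:
            # Whole-phrase, case-insensitive match.
            pattern = rf"\b{re.escape(phrase)}\b"
            if re.search(pattern, draft_prompt, flags=re.IGNORECASE):
                found.append(f"'{phrase}' contradicts observed property: {observed_property}")
    return found


# Hand-written, hypothetical phrase lists based on the examples in this report.
forbidden = {
    "asphalt is translucent": ["opaque asphalt", "solid asphalt"],
    "pipes terminate underground": ["pipes rising to the surface"],
}

draft = "Residential street, opaque asphalt, geothermal pipes below grade."
for flag in find_contradictions(draft, forbidden):
    print(flag)
```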

Reproduction

Easy to reproduce in any Claude Code session:

1. Attach any photograph or rendering with a non-obvious visual property (e.g., translucent material, unusual viewpoint, specific count of objects).
2. Ask Claude to write an image-generation prompt that mimics it.
3. Observe that the prompt reflects Claude's prior about the subject, not the specific properties of the reference.
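
For the "specific count of objects" case, the observation step can be made checkable with a small script. This is a sketch under stated assumptions: `stated_count` and `check_repro` are hypothetical helpers, the expected count of 6 is an invented example value, and the regex only recognizes counts written as digits directly before the noun.

```python
from __future__ import annotations

import re


def stated_count(prompt: str, noun: str) -> int | None:
    """Return the number written directly before `noun` in the prompt, if any."""
    match = re.search(rf"\b(\d+)\s+{re.escape(noun)}s?\b", prompt, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


def check_repro(prompt: str, noun: str, expected_count: int) -> str:
    """Classify a generated prompt against the count visible in the reference."""
    count = stated_count(prompt, noun)
    if count is None:
        return "missing"  # the prompt never commits to a count
    return "match" if count == expected_count else "drift"


# Hypothetical case: the reference image shows 6 pipes.
print(check_repro("Cutaway view with 4 pipes under the street", "pipe", 6))
print(check_repro("Cutaway view with 6 pipes under the street", "pipe", 6))
print(check_repro("Cutaway view with several pipes", "pipe", 6))
```

A "drift" or "missing" result on a reference with an unambiguous count is the failure this report describes. "Pipe count drift" in the 2026-05-10 session is one instance of it.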

Reported from Claude Code on macOS, Opus 4.7 (1M context). Memory system contains corroborating user feedback rules: feedback_recreate_all_diagram_elements.md, feedback_evaluate_from_human_perspective.md, feedback_architectural_style_workflow.md.
