litellm - ✅(Solved) Fix [Bug]: amazon.titan-embed-image-v1 — usage.prompt_tokens=0 for image inputs (EmbeddingResponse field, not cost calc) [1 pull requests, 1 participants]

shay314 · 2026-04-16T13:40:35Z



GitHub stats

BerriAI/litellm#25857 • Fetched 2026-04-17 08:28:39


Author

shay314

Participants

shay314

Timeline (top)

labeled ×3, cross-referenced ×1, referenced ×1, subscribed ×1

Root Cause

In litellm/llms/bedrock/embed/amazon_titan_multimodal_transformation.py, _transform_response():

total_prompt_tokens += _parsed_response["inputTextTokenCount"]  # always 0 for image inputs

# image_count IS correctly counted (fix from #21646):
if image_count > 0:
    prompt_tokens_details = PromptTokensDetailsWrapper(image_count=image_count)

# But prompt_tokens is set from total_prompt_tokens, which is still 0:
usage = Usage(
    prompt_tokens=total_prompt_tokens,   # <- 0 for image-only input
    total_tokens=total_prompt_tokens,    # <- 0
    prompt_tokens_details=prompt_tokens_details,
)

AWS returns inputTextTokenCount=0 for image inputs by design — amazon.titan-embed-image-v1 charges per image at a flat rate of $0.00006/image, not per token. LiteLLM's pricing registry already encodes this correctly:

"amazon.titan-embed-image-v1": {
    "input_cost_per_image": 6e-05,
    "input_cost_per_token": 8e-07
}

PR #21646 fixed the cost calculator path (generic_cost_per_token in utils.py now reads image_count from prompt_tokens_details and applies input_cost_per_image). But the usage.prompt_tokens field on the EmbeddingResponse was never updated, so it remains 0.
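The per-image pricing described above can be made concrete with a little arithmetic. The following is a minimal sketch, not LiteLLM's actual `generic_cost_per_token` implementation: the function name `estimate_titan_image_embed_cost` is hypothetical, and the two rates are copied from the registry entry quoted above.

```python
# Rates copied from the pricing registry entry quoted in the issue.
INPUT_COST_PER_IMAGE = 6e-05
INPUT_COST_PER_TOKEN = 8e-07


def estimate_titan_image_embed_cost(image_count: int, text_tokens: int) -> float:
    """Hypothetical helper: flat per-image charge plus a per-token charge for text."""
    return image_count * INPUT_COST_PER_IMAGE + text_tokens * INPUT_COST_PER_TOKEN


# One image, no text: AWS reports inputTextTokenCount=0, so only the
# per-image term contributes to the cost.
image_only_cost = estimate_titan_image_embed_cost(image_count=1, text_tokens=0)
print(f"{image_only_cost:.5f}")  # prints 0.00006
```

This shows why a cost of $0.00006 can be correct while `prompt_tokens` is still 0: the cost comes entirely from `image_count`, which `prompt_tokens` did not include.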

Fix Action

Fix / Workaround

| Field | Expected | Actual (v1.83.8) |
|---|---|---|
| `usage.prompt_tokens` | `N` (number of images, e.g. `1`) | `0` ← bug |
| `usage.total_tokens` | `N` | `0` ← bug |
| `usage.prompt_tokens_details.image_count` | `N` | correct (fixed by #21646) |
| `litellm.completion_cost(response)` | correct (`$0.00006`) | correct (fixed by #21646) |

PR fix notes

PR #25907: fix: count image tokens in Titan embed response

Repository: BerriAI/litellm
Author: ianliuy
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/25907

Description (problem / solution / changelog)

What's broken?

When using amazon.titan-embed-image-v1 with image inputs, EmbeddingResponse.usage.prompt_tokens is always 0, even though the model successfully processed the image and returned an embedding.

Who is affected?

Any downstream consumer (quota/billing system, logging layer, monitoring tool) reading response.usage.prompt_tokens directly — rather than going through LiteLLM's cost calculator — gets 0 and cannot detect that work was performed.

Text-only embedding calls are not affected (they correctly report inputTextTokenCount).
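Until a fix is released, a downstream consumer that reads the usage object directly could fall back to `image_count` on its own side. The snippet below is a sketch under stated assumptions: `billable_units` is a hypothetical helper, and `SimpleNamespace` objects stand in for LiteLLM's response types so the example runs without AWS. Only the attribute names `prompt_tokens` and `prompt_tokens_details.image_count` come from the issue.

```python
from types import SimpleNamespace


def billable_units(usage) -> int:
    """Hypothetical helper: never report 0 when images were processed."""
    prompt_tokens = getattr(usage, "prompt_tokens", 0) or 0
    details = getattr(usage, "prompt_tokens_details", None)
    image_count = getattr(details, "image_count", 0) or 0
    # Taking the max never exceeds the value a fixed version would report,
    # so it cannot double count. On unfixed versions it may still undercount
    # inputs that mix text and images.
    return max(prompt_tokens, image_count)


# Stand-ins for the usage objects described in the issue (not real LiteLLM types).
buggy_image_usage = SimpleNamespace(
    prompt_tokens=0,
    prompt_tokens_details=SimpleNamespace(image_count=1),
)
text_only_usage = SimpleNamespace(prompt_tokens=7, prompt_tokens_details=None)

print(billable_units(buggy_image_usage), billable_units(text_only_usage))  # prints 1 7
```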

Root Cause

In _transform_response() of amazon_titan_multimodal_transformation.py, prompt_tokens is set solely from inputTextTokenCount, which AWS intentionally returns as 0 for image inputs (Titan charges per-image at a flat rate, not per-token).

PR #21646 previously fixed the cost calculator path by reading image_count from prompt_tokens_details, but the usage.prompt_tokens field on the EmbeddingResponse was never updated.

Fix

Added image_count to both prompt_tokens and total_tokens in the Usage object:

usage = Usage(
    prompt_tokens=total_prompt_tokens + image_count,   # was: total_prompt_tokens
    total_tokens=total_prompt_tokens + image_count,     # was: total_prompt_tokens
    ...
)

This is backward-compatible: text-only paths are unaffected since image_count=0.
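The effect of this change can be checked without AWS by modelling the aggregation with stand-in types. This is only a sketch: `UsageSketch` and `build_usage` are hypothetical and are not LiteLLM's `Usage` class or `_transform_response()`. The issue does not show how `image_count` is derived in the real code, so here it is passed in as a parameter.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UsageSketch:
    """Stand-in for LiteLLM's Usage; only the two fields discussed here."""
    prompt_tokens: int
    total_tokens: int


def build_usage(parsed_responses: List[Dict], image_count: int) -> UsageSketch:
    # Sum the text token counts reported by Bedrock (0 for image inputs).
    total_prompt_tokens = sum(r.get("inputTextTokenCount", 0) for r in parsed_responses)
    # The change described in the PR: add image_count to both fields.
    units = total_prompt_tokens + image_count
    return UsageSketch(prompt_tokens=units, total_tokens=units)


image_only = build_usage([{"inputTextTokenCount": 0}], image_count=1)
text_only = build_usage([{"inputTextTokenCount": 5}], image_count=0)
print(image_only.prompt_tokens, text_only.prompt_tokens)  # prints 1 5
```

The text-only case is unchanged because `image_count` is 0 there, which is the backward-compatibility argument made above.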

Testing

Updated existing unit test test_titan_multimodal_embedding_image_cost_tracking to assert prompt_tokens == 1 and total_tokens == 1 for image inputs. All 3 related tests pass:

test_titan_multimodal_embedding_image_cost_tracking ✅
test_titan_multimodal_embedding_text_no_image_count ✅
test_titan_multimodal_embedding_backward_compat_no_batch_data ✅

Fixes #25857

Changed files

litellm/llms/bedrock/embed/amazon_titan_multimodal_transformation.py (modified, +2/-2)
tests/test_litellm/llms/bedrock/embed/test_bedrock_embedding.py (modified, +2/-0)


Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When calling litellm.embedding() with amazon.titan-embed-image-v1 and an image input (data URI), the returned EmbeddingResponse.usage.prompt_tokens is always 0, even though the model successfully processed the image and returned an embedding.

This is distinct from the cost-calculation bug fixed in PR #21646 (merged Feb 20, 2026). That PR correctly fixed LiteLLM's internal cost calculator to use input_cost_per_image via prompt_tokens_details.image_count. However, usage.prompt_tokens itself is still 0 in the returned response object.

Any downstream consumer (e.g., a quota/billing system, logging layer, or monitoring tool) reading response.usage.prompt_tokens directly — rather than going through LiteLLM's cost calculator — gets 0 and cannot detect that work was performed.

Root cause

In litellm/llms/bedrock/embed/amazon_titan_multimodal_transformation.py, _transform_response():

total_prompt_tokens += _parsed_response["inputTextTokenCount"]  # always 0 for image inputs

# image_count IS correctly counted (fix from #21646):
if image_count > 0:
    prompt_tokens_details = PromptTokensDetailsWrapper(image_count=image_count)

# But prompt_tokens is set from total_prompt_tokens, which is still 0:
usage = Usage(
    prompt_tokens=total_prompt_tokens,   # <- 0 for image-only input
    total_tokens=total_prompt_tokens,    # <- 0
    prompt_tokens_details=prompt_tokens_details,
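)

AWS returns inputTextTokenCount=0 for image inputs by design: amazon.titan-embed-image-v1 charges per image at a flat rate of $0.00006/image, not per token. LiteLLM's pricing registry already encodes this correctly:

"amazon.titan-embed-image-v1": {
    "input_cost_per_image": 6e-05,
    "input_cost_per_token": 8e-07
}

PR #21646 fixed the cost calculator path (generic_cost_per_token in utils.py now reads image_count from prompt_tokens_details and applies input_cost_per_image). But the usage.prompt_tokens field on the EmbeddingResponse was never updated, so it remains 0.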

Expected vs. actual

| Field | Expected | Actual (v1.83.8) |
|---|---|---|
| `usage.prompt_tokens` | `N` (number of images, e.g. `1`) | `0` ← bug |
| `usage.total_tokens` | `N` | `0` ← bug |
| `usage.prompt_tokens_details.image_count` | `N` | correct (fixed by #21646) |
| `litellm.completion_cost(response)` | correct (`$0.00006`) | correct (fixed by #21646) |

Proposed fix

In _transform_response() of amazon_titan_multimodal_transformation.py:

prompt_tokens_details = None
if image_count > 0:
    prompt_tokens_details = PromptTokensDetailsWrapper(
        image_count=image_count,
        image_tokens=image_count,  # populate the existing but currently unused field
    )

usage = Usage(
    prompt_tokens=total_prompt_tokens + image_count,   # non-zero when images were processed
    completion_tokens=0,
    total_tokens=total_prompt_tokens + image_count,
    prompt_tokens_details=prompt_tokens_details,
)

This ensures prompt_tokens > 0 for any completed image embedding call and is fully backward-compatible (text-only paths unaffected since image_count=0).

The same gap likely exists in amazon_nova_transformation.py — worth checking AmazonNovaEmbeddingConfig._transform_response() for identical behaviour.
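One way to check another transformation for the same gap is to flag any usage object that records images but reports zero prompt tokens. The helper below is a hypothetical sketch (`has_image_usage_gap` is not part of LiteLLM), and the `SimpleNamespace` objects are stand-ins. It makes no claim about what the Nova transformation actually returns.

```python
from types import SimpleNamespace


def has_image_usage_gap(usage) -> bool:
    """Hypothetical check: images were counted, yet prompt_tokens is 0."""
    details = getattr(usage, "prompt_tokens_details", None)
    image_count = getattr(details, "image_count", 0) or 0
    prompt_tokens = getattr(usage, "prompt_tokens", 0) or 0
    return image_count > 0 and prompt_tokens == 0


# Stand-in usage objects: one showing the gap, one as reported after the fix.
gap_usage = SimpleNamespace(
    prompt_tokens=0, prompt_tokens_details=SimpleNamespace(image_count=2)
)
fixed_usage = SimpleNamespace(
    prompt_tokens=2, prompt_tokens_details=SimpleNamespace(image_count=2)
)
print(has_image_usage_gap(gap_usage), has_image_usage_gap(fixed_usage))  # prints True False
```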

Related

PR #21646: fixed cost calculation, but not usage.prompt_tokens

Steps to Reproduce

Minimal LiteLLM-only repro (the bug is in LiteLLM's response handling rather than in AWS, but running the snippet requires an AWS account):

import base64
from io import BytesIO
from PIL import Image
from litellm import embedding

# Image must be a data URI — litellm's is_base64_encoded() requires `data:` prefix
img = Image.new("RGB", (8, 8), color="blue")
buf = BytesIO()
img.save(buf, format="PNG")
img_b64 = base64.b64encode(buf.getvalue()).decode("utf-8")
img_data_uri = f"data:image/png;base64,{img_b64}"

response = embedding(
    model="amazon.titan-embed-image-v1",
    input=[img_data_uri],
    aws_region_name="us-east-1",
)

print(response.usage)
# Usage(prompt_tokens=0, total_tokens=0, ...)  ← BUG: should be non-zero

print(response.usage.prompt_tokens_details)
# PromptTokensDetailsWrapper(image_count=1)  ← image IS tracked here, just not surfaced in prompt_tokens
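The repro uses Pillow only to produce some PNG bytes. If image bytes are already at hand, the data URI can be built with the standard library alone. This is a small sketch: `to_data_uri` is a hypothetical helper, and the requirement for a `data:` prefix is taken from the comment in the repro above.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Hypothetical helper: wrap raw image bytes in a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes, not a real image; in practice pass the contents of an image file.
uri = to_data_uri(b"abc")
print(uri)  # prints data:image/png;base64,YWJj
```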

Raw boto3 confirmation (proves this is a LiteLLM-level issue, not AWS):

import json, base64, boto3
from io import BytesIO
from PIL import Image

img = Image.new("RGB", (8, 8), color="blue")
buf = BytesIO()
img.save(buf, format="PNG")
img_b64 = base64.b64encode(buf.getvalue()).decode("utf-8")

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
resp = bedrock.invoke_model(
    modelId="amazon.titan-embed-image-v1",
    body=json.dumps({"inputImage": img_b64}),
    accept="application/json",
    contentType="application/json",
)

print(json.loads(resp["body"].read()))
# {"embedding": [...], "inputTextTokenCount": 0}
# AWS intentionally returns inputTextTokenCount=0 for images —
# the model is priced PER IMAGE ($0.00006/image), not per token.
# There is no "inputImageTokenCount" field — AWS does not expose one.

Relevant log output

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on?

v1.83.8

Twitter / LinkedIn details

https://www.linkedin.com/in/shay-margalit-40581668/

extent analysis

TL;DR

Update the _transform_response() method in amazon_titan_multimodal_transformation.py to correctly calculate usage.prompt_tokens for image inputs.

Guidance

Review the proposed fix in the issue description, which suggests updating the usage calculation to include image_count when processing image inputs.
Verify that the image_count is correctly populated in the prompt_tokens_details object.
Check for similar issues in other transformation modules, such as amazon_nova_transformation.py.
Test the updated code with the provided minimal repro example to ensure that usage.prompt_tokens is correctly calculated.

Example

prompt_tokens_details = None
if image_count > 0:
    prompt_tokens_details = PromptTokensDetailsWrapper(
        image_count=image_count,
        image_tokens=image_count,  # populate the existing but currently unused field
    )

usage = Usage(
    prompt_tokens=total_prompt_tokens + image_count,   # non-zero when images were processed
    completion_tokens=0,
    total_tokens=total_prompt_tokens + image_count,
    prompt_tokens_details=prompt_tokens_details,
)

Notes

The issue was reported against the LiteLLM SDK (Python package), version v1.83.8. The proposed fix is backward-compatible: text-only paths are unaffected because image_count is 0 for them.

Recommendation

Apply the proposed fix by updating the _transform_response() method in amazon_titan_multimodal_transformation.py so that usage.prompt_tokens and usage.total_tokens include image_count for image inputs. This lets downstream consumers that read usage directly detect that work was performed on image inputs. PR #25907 implements this change but was still open and unmerged when this page was fetched.
