LlamaIndex - ✅(Solved) Fix [Bug]: OpenAILike / FunctionAgent: Kimi-K2.5 sometimes returns final answer in reasoning_content [1 pull request, 2 comments, 2 participants]

GitHub stats
run-llama/llama_index#21337 · Fetched 2026-04-09 07:51:07
Comments: 2 · Participants: 2 · Timeline: 10 · Reactions: 0
Timeline (top): commented ×2, labeled ×2, mentioned ×2, subscribed ×2

Root Cause

When the model returns its final answer in reasoning_content instead of content, the agent can finish with an empty result, even though the model actually produced a valid final answer.

Fix / Workaround

I added a small custom OpenAILike wrapper:

  • if the final stream returns no chunks, I fall back to a non-streaming call
  • if all of these are true:
    • role = assistant
    • finish_reason = stop
    • no tool calls are present
    • content is empty
    • reasoning_content exists

then I copy reasoning_content into content.

PR fix notes

PR #21345: fix(FunctionAgent): fall back to ThinkingBlock content when response content is empty

Description (problem / solution / changelog)

Fixes #21337

Problem

Some OpenAI-compatible models (e.g. Kimi-K2.5) occasionally return the final answer in reasoning_content instead of content. Because FunctionAgent doesn't validate that the response content is non-empty (unlike ReActAgent, which raises ValueError("Got empty message")), it silently returns an empty answer even though the model produced a valid response.

The streaming code in the OpenAI LLM already captures reasoning_content and stores it in a ThinkingBlock inside ChatMessage.blocks. However, ChatMessage.content only aggregates TextBlock text, so the ThinkingBlock content is invisible to the caller.

Solution

In FunctionAgent.take_step, after extracting tool calls, add a fallback: when there are no tool calls and content is empty, scan the response blocks for a ThinkingBlock with non-empty content and, if found, reconstruct the ChatResponse using that content as the text response.

This handles both streaming and non-streaming paths (the check is in take_step, which is called for both).

Conditions for the fallback to trigger — all must be true:

  • No tool calls in the response
  • message.content is empty / None
  • At least one ThinkingBlock in message.blocks has non-empty content

This does not change behaviour for models that behave correctly.
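
The fallback conditions listed above can be sketched in isolation. The block below is a minimal illustration, not the code from the PR: `TextBlock`, `ThinkingBlock`, and `Message` are simplified stand-ins written for this example (they are not the real llama_index classes), and `resolve_final_text` is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


# Hypothetical stand-ins for the library's block types. They are NOT the
# real llama_index classes, only enough structure to show the logic.
@dataclass
class TextBlock:
    text: str = ""


@dataclass
class ThinkingBlock:
    content: Optional[str] = None


@dataclass
class Message:
    blocks: List[Union[TextBlock, ThinkingBlock]] = field(default_factory=list)

    @property
    def content(self) -> str:
        # Mirrors the behaviour described in the PR: only TextBlock text
        # is aggregated, so ThinkingBlock content stays invisible here.
        return "".join(b.text for b in self.blocks if isinstance(b, TextBlock))


def resolve_final_text(message: Message, tool_calls: list) -> str:
    """Return the message text, falling back to thinking content if needed."""
    # Leave the response alone when tools were called or content exists.
    if tool_calls or message.content:
        return message.content
    # Otherwise use the first ThinkingBlock that carries non-empty content.
    for block in message.blocks:
        if isinstance(block, ThinkingBlock) and block.content:
            return block.content
    return message.content


# One message with the answer stuck in a ThinkingBlock, one well-behaved.
answer_in_thinking = Message(blocks=[ThinkingBlock(content="Paris")])
normal = Message(blocks=[ThinkingBlock(content="hmm"), TextBlock(text="Paris")])
```

Under these assumptions, `resolve_final_text(answer_in_thinking, [])` yields the thinking content, while `normal` is returned unchanged because its `content` is already non-empty.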

Testing

The fix can be validated by mocking an LLM that returns a ChatMessage with only a ThinkingBlock (no TextBlock content) and verifying that FunctionAgent returns the thinking content as its response rather than an empty string.

Existing tests in tests/agent/workflow/test_single_agent_workflow.py continue to pass unchanged.

Changed files

  • llama-index-core/llama_index/core/agent/workflow/function_agent.py (modified, +26/-1)

Bug Description

Hello again 😄

I think I may have found a small compatibility issue with Kimi-K2.5 on an OpenAI-compatible endpoint (behind LiteLLM 1.78.5 and Kong 3.9.1).

Problem

In some tool-calling runs with FunctionAgent, the final answer does not come back in the normal content field.

Instead, I sometimes see this behavior:

  • content is empty / None
  • no final stream chunks arrive
  • but the real final answer is present in reasoning_content

Because of this, the agent can sometimes finish with an empty result, even though the model actually produced a valid final answer.

Important note

I have only seen this issue with FunctionAgent.

I did not have this problem before when using ReActAgent, so maybe this detail helps narrow it down.

My workaround

I added a small custom OpenAILike wrapper:

  • if the final stream returns no chunks, I fall back to a non-streaming call
  • if all of these are true:
    • role = assistant
    • finish_reason = stop
    • no tool calls are present
    • content is empty
    • reasoning_content exists

then I copy reasoning_content into content.

This fixes the problem for me and allows the final answer to be shown correctly (it's not reasoning text, it's the answer of the model that never got out of reasoning_content).
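
The reporter's wrapper code is not included in the issue, so the following is only a rough sketch of the two steps described above. It assumes each response is a plain dict carrying the fields named in the list, and that each streamed item is the accumulated message so far; `normalize`, `final_message`, `stream`, and `complete` are hypothetical names, not part of OpenAILike.

```python
from typing import Callable, Iterable, Optional


def normalize(message: dict) -> dict:
    """Copy reasoning_content into content when the final answer landed there."""
    is_final = (
        message.get("role") == "assistant"
        and message.get("finish_reason") == "stop"
        and not message.get("tool_calls")
    )
    if is_final and not message.get("content") and message.get("reasoning_content"):
        # Return a copy so the caller's original dict is left untouched.
        return {**message, "content": message["reasoning_content"]}
    return message


def final_message(
    stream: Callable[[], Iterable[dict]],
    complete: Callable[[], dict],
) -> dict:
    """Use the last streamed message, or a non-streaming call if none arrived."""
    last: Optional[dict] = None
    for chunk in stream():
        last = chunk
    if last is None:
        # The stream produced no chunks: fall back to a non-streaming call.
        last = complete()
    return normalize(last)
```

Keeping the normalization separate from the stream fallback means the same check applies whether the message came from the stream or from the non-streaming retry.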

Question

Would this be something that could be handled by:

  • an official wrapper for this provider/model
  • or a small normalization option in OpenAILike

Thank you very much!

Version

llama-index==0.12.52

Steps to Reproduce

With Kimi-K2.5 on an OpenAI-compatible endpoint and tool calling enabled, the final answer step sometimes (roughly every second or third answer) returns the final answer in reasoning_content instead of content, or the stream returns no final chunks. Kimi’s tool-calling flow is supposed to continue after tool results until the model can answer normally, and K2.5 also has a documented reasoning_content field in thinking-mode style responses.

Relevant Logs/Tracebacks

extent analysis

TL;DR

Implement a custom wrapper or normalization option in OpenAILike to handle cases where the final answer is returned in reasoning_content instead of content.

Guidance

  • Verify that the issue only occurs with FunctionAgent and not with other agents like ReActAgent to confirm the scope of the problem.
  • Consider implementing a fallback mechanism to non-streaming calls when the final stream returns no chunks, as described in the user's workaround.
  • Investigate adding a normalization option in OpenAILike to handle cases where content is empty and reasoning_content exists, and copy reasoning_content into content when certain conditions are met (e.g., role = assistant, finish_reason = stop, no tool calls present).
  • Test the custom wrapper or normalization option with different scenarios to ensure it fixes the issue and does not introduce new problems.

Example

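class CustomOpenAILike:
    """Illustrative helper only; not a real OpenAILike subclass."""

    def get_final_answer(self, response: dict) -> str:
        # `response` is assumed to be a plain dict with the fields named in the issue.
        content = response.get("content") or ""
        reasoning = response.get("reasoning_content") or ""
        is_final = (
            response.get("role") == "assistant"
            and response.get("finish_reason") == "stop"
            and not response.get("tool_calls")  # never override a tool-call turn
        )
        # Promote reasoning_content only when the final answer is otherwise empty.
        if not content and reasoning and is_final:
            content = reasoning
        return content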

Notes

The issue seems to be specific to FunctionAgent and Kimi-K2.5 with an OpenAI-compatible endpoint, and the user's workaround provides a potential solution. However, further testing and verification are needed to ensure the custom wrapper or normalization option works correctly in all scenarios.

Recommendation

Apply the workaround by implementing a custom wrapper or normalization option in OpenAILike, as it provides a targeted solution to the issue and does not require upgrading to a different version.
