langchain - 💡(How to fix) Support BytePlus video input by accepting the `input_video` block type when building OpenAI Responses API input [1 comment, 1 participant]

langchain · 2026-04-01 05:07:54

GitHub stats

langchain-ai/langchain#36414 • Fetched 2026-04-08 02:22:31

Author

starcloud530

Participants

starcloud530

Timeline (top)

labeled ×3, commented ×1, issue_type_added ×1

I'm a developer from China building a multimodal analysis Agent using LangGraph. I'm using BytePlus (ByteDance's overseas model service) which supports text, image, and video inputs through their responses API. However, I've encountered an issue when trying to process video inputs with the langchain-openai integration.

Root Cause

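The input_video block is dropped from the API request because _construct_responses_api_input does not recognize it: "input_video" is missing from the accepted tuple ("input_text", "input_image", "input_file"), so the block falls through to the final else branch and is silently skipped.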

Fix Action

Solution

I've made a minimal change to support the input_video type:

def _construct_responses_api_input(messages: Sequence[BaseMessage]) -> list:
    # ...
    elif msg["role"] in ("user", "system", "developer"):
        if isinstance(msg["content"], list):
            new_blocks = []
            non_message_item_types = ("mcp_approval_response",)
            for block in msg["content"]:
                if block["type"] in ("text", "image_url", "file"):
                    new_blocks.append(
                        _convert_chat_completions_blocks_to_responses(block)
                    )
                elif block["type"] in ("input_text", "input_image", "input_file", "input_video"):
                    new_blocks.append(block)
                elif block["type"] in non_message_item_types:
                    input_.append(block)
                else:
                    pass
    # ...

Code Example

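from langchain.messages import HumanMessage

# Placeholders for illustration only; substitute your own video URL and prompt.
video_path = "https://example.com/video.mp4"
instruction = "Describe what happens in this video."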

messages = [
    HumanMessage(
        content=[
            {
                "type": "input_video",
                "video_url": video_path,
            },
            {
                "type": "input_text",
                "text": instruction,
            },
        ]
    )
]

---

Current implementation, where input_video is missing from the accepted block types:

def _construct_responses_api_input(messages: Sequence[BaseMessage]) -> list:
    # ...
    elif msg["role"] in ("user", "system", "developer"):
        if isinstance(msg["content"], list):
            new_blocks = []
            non_message_item_types = ("mcp_approval_response",)
            for block in msg["content"]:
                if block["type"] in ("text", "image_url", "file"):
                    new_blocks.append(
                        _convert_chat_completions_blocks_to_responses(block)
                    )
                elif block["type"] in ("input_text", "input_image", "input_file"):
                    new_blocks.append(block)
                elif block["type"] in non_message_item_types:
                    input_.append(block)
                else:
                    pass
    # ...

Checked other resources

This is a feature request, not a bug report or usage question.
I added a clear and descriptive title that summarizes the feature request.
I used the GitHub search to find a similar feature request and didn't find it.
I checked the LangChain documentation and API reference to see if this feature already exists.
This is not related to the langchain-community package.

Package (Required)

Feature Description

Support for `input_video` type in responses API for BytePlus/ByteDance models

Description

Problem

BytePlus models support video input via the input_video type in their responses API, but the current implementation of _construct_responses_api_input in LangChain's OpenAI integration doesn't include this type, causing video inputs to be ignored.

Current Behavior

When sending a message with input_video type:

from langchain.messages import HumanMessage

messages = [
    HumanMessage(
        content=[
            {
                "type": "input_video",
                "video_url": video_path,
            },
            {
                "type": "input_text",
                "text": instruction,
            },
        ]
    )
]

The input_video block is not included in the API request because it's not recognized in the following code:

def _construct_responses_api_input(messages: Sequence[BaseMessage]) -> list:
    # ...
    elif msg["role"] in ("user", "system", "developer"):
        if isinstance(msg["content"], list):
            new_blocks = []
            non_message_item_types = ("mcp_approval_response",)
            for block in msg["content"]:
                if block["type"] in ("text", "image_url", "file"):
                    new_blocks.append(
                        _convert_chat_completions_blocks_to_responses(block)
                    )
                elif block["type"] in ("input_text", "input_image", "input_file"):
                    new_blocks.append(block)
                elif block["type"] in non_message_item_types:
                    input_.append(block)
                else:
                    pass
    # ...
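To see why the block disappears, the branch logic can be reproduced in isolation. The following is a minimal sketch, not LangChain's actual code: `keep_passthrough_blocks` and `ACCEPTED_TYPES` are hypothetical names, the URL is a placeholder, and the branch that converts Chat Completions blocks is left out.

```python
from typing import Any

# Hypothetical stand-in for the tuple checked in the excerpt above.
ACCEPTED_TYPES = ("input_text", "input_image", "input_file")


def keep_passthrough_blocks(
    content: list[dict[str, Any]],
    accepted: tuple[str, ...] = ACCEPTED_TYPES,
) -> list[dict[str, Any]]:
    """Return only the blocks whose type is in `accepted`.

    Mirrors the append / `else: pass` shape of the excerpt:
    anything unrecognized is skipped without a warning.
    """
    new_blocks = []
    for block in content:
        if block["type"] in accepted:
            new_blocks.append(block)
    return new_blocks


content = [
    {"type": "input_video", "video_url": "https://example.com/clip.mp4"},
    {"type": "input_text", "text": "Describe this video."},
]

before = keep_passthrough_blocks(content)
after = keep_passthrough_blocks(content, ACCEPTED_TYPES + ("input_video",))
print([b["type"] for b in before])  # ['input_text']
print([b["type"] for b in after])   # ['input_video', 'input_text']
```

With the default tuple the video block never reaches the request; extending the tuple by one entry is enough for it to pass through unchanged.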

Expected Behavior

The input_video type should be included in the API request, similar to how input_text, input_image, and input_file are handled.

Solution

I've made a minimal change to support the input_video type:

def _construct_responses_api_input(messages: Sequence[BaseMessage]) -> list:
    # ...
    elif msg["role"] in ("user", "system", "developer"):
        if isinstance(msg["content"], list):
            new_blocks = []
            non_message_item_types = ("mcp_approval_response",)
            for block in msg["content"]:
                if block["type"] in ("text", "image_url", "file"):
                    new_blocks.append(
                        _convert_chat_completions_blocks_to_responses(block)
                    )
                elif block["type"] in ("input_text", "input_image", "input_file", "input_video"):
                    new_blocks.append(block)
                elif block["type"] in non_message_item_types:
                    input_.append(block)
                else:
                    pass
    # ...

Context

BytePlus is ByteDance's overseas model service, which is gaining significant market share in China's AI model space
Their models offer strong multimodal capabilities for text, image, and video
They use the same API structure as OpenAI's responses API but extend it with video support
Currently, there's no dedicated LangChain integration for BytePlus/ByteDance models, so users are leveraging the OpenAI integration with custom base URLs

Impact

This change would enable users of BytePlus and potentially other model providers that support video input through similar structures to use video capabilities with LangChain's OpenAI integration.

Thank you for considering this request!

Use Case

This change would enable users of BytePlus and potentially other model providers that support video input through similar structures to use video capabilities with LangChain's OpenAI integration.

Proposed Solution

base.py

Alternatives Considered

BytePlus is ByteDance's overseas model service, which is gaining significant market share in China's AI model space
Their models offer strong multimodal capabilities for text, image, and video
They use the same API structure as OpenAI's responses API but extend it with video support
Currently, there's no dedicated LangChain integration for BytePlus/ByteDance models, so users are leveraging the OpenAI integration with custom base URLs

Additional Context

extent analysis

TL;DR

To support input_video type in responses API for BytePlus/ByteDance models, modify the _construct_responses_api_input function to include input_video in the list of accepted block types.

Guidance

Update the _construct_responses_api_input function to handle input_video blocks by adding "input_video" to the elif condition that checks block["type"].
Verify that the updated function correctly includes input_video blocks in the API request by checking the new_blocks list.
Test the updated function with a sample message containing an input_video block to ensure it is properly handled.
Consider adding additional error handling or logging to ensure that any issues with input_video blocks are properly reported.
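The testing and logging suggestions above could look roughly like this. It is a sketch under stated assumptions: `collect_blocks`, `PASSTHROUGH_TYPES`, and the logger name are hypothetical and stand in for the relevant branch of the real function, which is not imported here.

```python
import logging
from typing import Any

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("responses_input_sketch")

# Assumed accepted types after the proposed change.
PASSTHROUGH_TYPES = ("input_text", "input_image", "input_file", "input_video")


def collect_blocks(content: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep recognized blocks and log a warning for each skipped one."""
    new_blocks = []
    for block in content:
        block_type = block.get("type")
        if block_type in PASSTHROUGH_TYPES:
            new_blocks.append(block)
        else:
            # Report the drop instead of passing silently.
            logger.warning("Skipping unrecognized block type: %r", block_type)
    return new_blocks


def test_input_video_is_kept() -> None:
    video = {"type": "input_video", "video_url": "https://example.com/clip.mp4"}
    text = {"type": "input_text", "text": "Summarize the clip."}
    assert collect_blocks([video, text]) == [video, text]


def test_unknown_type_is_skipped() -> None:
    assert collect_blocks([{"type": "input_hologram"}]) == []


test_input_video_is_kept()
test_unknown_type_is_skipped()
```

Logging the skipped type would have made the original problem visible immediately, since the dropped input_video block would have produced a warning rather than a silently incomplete request.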

Example

def _construct_responses_api_input(messages: Sequence[BaseMessage]) -> list:
    # ...
    elif msg["role"] in ("user", "system", "developer"):
        if isinstance(msg["content"], list):
            new_blocks = []
            non_message_item_types = ("mcp_approval_response",)
            for block in msg["content"]:
                if block["type"] in ("text", "image_url", "file"):
                    new_blocks.append(
                        _convert_chat_completions_blocks_to_responses(block)
                    )
                elif block["type"] in ("input_text", "input_image", "input_file", "input_video"):
                    new_blocks.append(block)
                elif block["type"] in non_message_item_types:
                    input_.append(block)
                else:
                    pass
    # ...

Notes

The proposed solution assumes that the input_video block has a similar structure to the existing input_text, input_image, and input_file blocks. If the input_video block has a different structure, additional modifications may be necessary.
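One way to guard against the structural assumption in this note is a small shape check before the block is sent. The sketch below is hypothetical: `is_valid_input_video` is not part of LangChain, and the only field it checks is `video_url`, because that is the only field shown in the issue's example. BytePlus may accept or require other fields.

```python
from typing import Any


def is_valid_input_video(block: dict[str, Any]) -> bool:
    """Check the shape used in the issue's example.

    Assumption: a valid block has type "input_video" and a
    non-empty string `video_url`. Other fields are not checked.
    """
    video_url = block.get("video_url")
    return (
        block.get("type") == "input_video"
        and isinstance(video_url, str)
        and bool(video_url)
    )


good = {"type": "input_video", "video_url": "https://example.com/clip.mp4"}
missing_url = {"type": "input_video"}
print(is_valid_input_video(good))         # True
print(is_valid_input_video(missing_url))  # False
```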

Recommendation

Apply the workaround by modifying the _construct_responses_api_input function to include input_video in the list of accepted block types, as this will allow users to leverage the video capabilities of BytePlus and potentially other model providers.
