litellm: [Bug] GET /v1/files returns raw provider IDs for batch output files (wrong created_by in CheckBatchCost)

LiteLLM Version

1.83.14

Environment

Managed Files / Managed Batches mode (Postgres required, model_info.mode: batch)

Describe the bug

When using managed batches, GET /v1/files returns raw provider file IDs (e.g. file-94bdfa884ea5458ead0f07b8) for output files created when batches complete, while returning correct unified managed IDs for input files uploaded via POST /v1/files.

Three consequences:

1. Callers cannot use the returned output file ID with GET /v1/files/{id}/content: the raw ID bypasses managed-files routing and falls through to default OpenAI credentials, returning a 500 or routing to the wrong deployment.
2. Ownership checks (from PR #19981) are not enforced on output files, so a user who obtains a raw output file ID can retrieve another user's batch output.
3. GET /v1/files returns an inconsistent mix of managed and raw IDs.

Steps to Reproduce

1. Configure the proxy with a batch model (model_info.mode: batch) and Postgres enabled.
2. Upload a batch input file via POST /v1/files, create a batch, and wait for it to complete.
3. Call GET /v1/files.
4. Observe: input file IDs are unified managed IDs (base64-encoded litellm_proxy:...), but output file IDs are raw provider IDs.
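The observation in the last step can be checked mechanically. The following is a minimal sketch, not LiteLLM's own helper: it assumes a managed ID is URL-safe base64 (possibly with padding stripped) whose decoded text starts with `litellm_proxy:`, as described above. The function name `looks_like_managed_id` and the sample payload are hypothetical.

```python
import base64
import binascii

# Prefix named in this report; the rest of the decoded payload is not assumed.
MANAGED_PREFIX = "litellm_proxy:"


def looks_like_managed_id(file_id: str) -> bool:
    """Return True if file_id base64-decodes to text starting with MANAGED_PREFIX."""
    # Assumption: IDs may omit trailing '=' padding, so restore it before decoding.
    padded = file_id + "=" * (-len(file_id) % 4)
    try:
        decoded = base64.urlsafe_b64decode(padded).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        # Raw provider IDs such as "file-..." are not valid base64 text payloads.
        return False
    return decoded.startswith(MANAGED_PREFIX)


# Hypothetical managed ID built only for illustration.
sample_managed = (
    base64.urlsafe_b64encode(b"litellm_proxy:example-payload").decode().rstrip("=")
)
raw_output_id = "file-94bdfa884ea5458ead0f07b8"  # raw ID quoted in this report

print(looks_like_managed_id(sample_managed))
print(looks_like_managed_id(raw_output_id))
```

An end-to-end test could apply this predicate to every `id` returned by GET /v1/files and fail on the first raw one.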

Root Cause (verified via DB inspection)

Output file entries in LiteLLM_ManagedFileTable are created with created_by = 'default_user_id' instead of the batch creator's user ID.

DB evidence (queried directly on v1.83.14 Postgres):

SELECT flat_model_file_ids, created_by FROM "LiteLLM_ManagedFileTable" ORDER BY created_at DESC;

 flat_model_file_ids              | created_by
----------------------------------+-----------------
 {file-94bdfa884ea5458ead0f07b8}  | default_user_id   ← output file (WRONG)
 {file-abc123456def...}           | actual-user-id    ← input file (correct)

Why this happens:

CheckBatchCost (enterprise/litellm_enterprise/proxy/common_utils/check_batch_cost.py) explicitly runs as default_user_id (noted in a comment around line 248). When the poller calls logging_obj.async_success_handler after a batch completes, the user_api_key_dict.user_id in scope is default_user_id. This causes the managed files hook to register the output file with the wrong owner.

The batch creator's user ID is available as job.created_by (around line 324) but is not threaded into the user_api_key_dict passed to async_success_handler.

Why this causes the file list to return raw IDs:

get_user_created_file_ids in managed_files.py:336 queries:

where={
    "created_by": user_api_key_dict.user_id,       # the calling user's ID
    "flat_model_file_ids": {"hasSome": model_object_ids},
}

Output files have created_by = 'default_user_id', which never matches the calling user's ID, so the ownership filter excludes them and they fall through to the provider's raw file list response.
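The effect of that filter can be reproduced in memory. This is an illustrative sketch only: `ManagedFileRow` and `owned_provider_ids` are hypothetical stand-ins that mimic the `created_by` equality and the `hasSome` overlap from the query above, and `file-input-example` is a placeholder ID, not a value from the database.

```python
from dataclasses import dataclass


@dataclass
class ManagedFileRow:
    """Hypothetical stand-in for one LiteLLM_ManagedFileTable row."""
    flat_model_file_ids: list[str]
    created_by: str


def owned_provider_ids(rows, user_id, model_object_ids):
    """Mimic the where-clause: created_by == user_id AND ids overlap (hasSome)."""
    wanted = set(model_object_ids)
    owned = set()
    for row in rows:
        overlap = wanted & set(row.flat_model_file_ids)
        if row.created_by == user_id and overlap:
            owned |= overlap
    return owned


rows = [
    # Output file registered by the poller under the wrong owner.
    ManagedFileRow(["file-94bdfa884ea5458ead0f07b8"], "default_user_id"),
    # Input file registered under the real caller (placeholder ID).
    ManagedFileRow(["file-input-example"], "actual-user-id"),
]
listed = ["file-94bdfa884ea5458ead0f07b8", "file-input-example"]

owned = owned_provider_ids(rows, "actual-user-id", listed)
not_converted = [fid for fid in listed if fid not in owned]
print(owned)          # only the input file matches the caller
print(not_converted)  # the output file stays a raw provider ID
```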

Secondary Bug: Wrong type check in `list_files` endpoint

In litellm/proxy/openai_files_endpoints/files_endpoints.py (~line 1378):

_response = await proxy_logging_obj.post_call_success_hook(
    data=data, user_api_key_dict=user_api_key_dict, response=response
)
if _response is not None and isinstance(_response, OpenAIFileObject):  # ← wrong type
    response = _response

The managed files hook returns AsyncCursorPage for file list responses (correctly handled at managed_files.py:1166-1189), but the endpoint only reassigns the response when it's OpenAIFileObject. isinstance(AsyncCursorPage_instance, OpenAIFileObject) is always False, so the hook's return value is discarded. The in-place mutation of response.data inside the hook partially masks this, but the type check is still incorrect.
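To make the discarded return value concrete, here is a small self-contained sketch. The two classes are empty stand-ins that share only their names with the real `OpenAIFileObject` and `AsyncCursorPage` types, and `pick_response` is a hypothetical helper that isolates the endpoint's reassignment logic.

```python
class OpenAIFileObject:
    """Stand-in for the single-file response type."""


class AsyncCursorPage:
    """Stand-in for the paginated list response type."""

    def __init__(self, data):
        self.data = data


def pick_response(response, hook_response, accepted_types):
    """Return the hook's response only if it passes the isinstance check."""
    if hook_response is not None and isinstance(hook_response, accepted_types):
        return hook_response
    return response


original = AsyncCursorPage(["raw-id"])
from_hook = AsyncCursorPage(["managed-id"])

# Current check: the page object is never an OpenAIFileObject, so it is dropped.
narrow = pick_response(original, from_hook, OpenAIFileObject)
# Broadened check: the hook's page object is accepted.
broad = pick_response(original, from_hook, (OpenAIFileObject, AsyncCursorPage))

print(narrow is original)
print(broad is from_hook)
```

In the real endpoint the hook also mutates `response.data` in place, which is why the narrow check does not always produce visibly wrong output.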

Expected vs Actual

Expected: GET /v1/files returns unified managed IDs for all files (input and output), consistent with individual GET /v1/files/{id} responses.

Actual: Output file IDs are raw provider IDs; input file IDs are managed IDs. The list is inconsistent and output files bypass access control.

Proposed Fix

Fix 1 (check_batch_cost.py): When calling logging_obj.async_success_handler, construct user_api_key_dict with user_id = job.created_by (the batch creator's ID) rather than the poller's default_user_id. This ensures the managed files hook writes created_by = batch_creator_user_id for output file entries.
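A rough sketch of the idea behind Fix 1, under stated assumptions: `UserAuth` and `BatchJob` are hypothetical stand-ins for `user_api_key_dict` and the poller's job row, and `auth_for_output_file` is an illustrative name, not a function in check_batch_cost.py. Falling back to the poller identity when `created_by` is missing is a design choice of this sketch rather than something the report specifies.

```python
from dataclasses import dataclass, replace
from typing import Optional

POLLER_USER_ID = "default_user_id"  # identity the poller runs as, per this report


@dataclass(frozen=True)
class UserAuth:
    """Hypothetical stand-in for user_api_key_dict."""
    user_id: Optional[str] = None


@dataclass(frozen=True)
class BatchJob:
    """Hypothetical stand-in for the batch job row read by the poller."""
    created_by: Optional[str] = None


def auth_for_output_file(job: BatchJob, poller_auth: UserAuth) -> UserAuth:
    """Return an auth context attributed to the batch creator when known."""
    if job.created_by:
        # Copy rather than mutate, so the poller's own context is untouched.
        return replace(poller_auth, user_id=job.created_by)
    # Assumption: keep the poller identity if the creator is unknown.
    return poller_auth


poller_auth = UserAuth(user_id=POLLER_USER_ID)
attributed = auth_for_output_file(BatchJob(created_by="actual-user-id"), poller_auth)
print(attributed.user_id)
print(poller_auth.user_id)
```

The context returned here is what would be passed to `logging_obj.async_success_handler`, so that the managed files hook records the batch creator as the owner of the output file.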

Fix 2 (files_endpoints.py ~line 1378): Broaden the type check:

# Before:
if _response is not None and isinstance(_response, OpenAIFileObject):
# After:
if _response is not None and isinstance(_response, (OpenAIFileObject, AsyncCursorPage)):

Additional Context

This was discovered while testing managed batches end-to-end. The test asserts that every file ID returned by GET /v1/files is a unified managed ID.
PR #27984 addressed a related issue (converting raw output_file_id to managed ID in CheckBatchCost), but the created_by attribution problem was not addressed by that PR.
The secondary bug (type check) would also affect any future hook that correctly returns AsyncCursorPage from post_call_success_hook.
