litellm - [Bug]: Post API hook is not enabled for passthrough endpoints


Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Passthrough endpoints are relatively transparent and let users apply their own customization, but only the pre-API hook is available for them. Suppose a user has to expose a non-OpenAI endpoint, say "/tokenize", with a specific request and response format; hooks are the way to translate between that format and the provider's. The pre-API hook is really helpful for translating the request dynamically based on the LLM provider, but because there is no post-API hook, the response isn't translated back to the fixed format.

The proposal is to add the post-api hook for the passthrough endpoint.

Open challenges:

  1. The post-API hook has no call_type argument, so we are on our own in filtering responses, and the hook would be triggered for every configured passthrough endpoint.
  2. What kind of dict should be passed as the data argument? It has to be either request_data or the generated kwargs. kwargs carries more information, which lets users do better customization in their custom callbacks.
  3. If the response is changed, do we need to update the Content-Length header or just pop it?
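
For the third challenge, here is a minimal sketch of the "update it" option, assuming the proxy holds the upstream headers as a plain dict and the hooked body as a JSON-serializable object. The helper name `reserialize_body` is hypothetical and is not part of LiteLLM; popping the header instead would leave the length to whatever layer builds the final response, which this sketch does not model.

```python
import json
from typing import Dict, Tuple


def reserialize_body(
    original: bytes, new_body: object, headers: Dict[str, str]
) -> Tuple[bytes, Dict[str, str]]:
    """Return the bytes and headers to send after a hook may have changed the body.

    Hypothetical helper for illustration only; not a LiteLLM function.
    """
    new_content = json.dumps(new_body).encode("utf-8")
    if new_content == original:
        # Nothing changed on the wire, so the upstream headers are still valid.
        return original, dict(headers)

    # Drop any stale Content-Length (header names are case-insensitive),
    # then recompute it from the new byte length.
    fixed = {k: v for k, v in headers.items() if k.lower() != "content-length"}
    fixed["content-length"] = str(len(new_content))
    return new_content, fixed


body, hdrs = reserialize_body(
    b'{"token_ids": [1, 2, 3]}',
    {"tokens": [1, 2, 3]},
    {"Content-Type": "application/json", "Content-Length": "24"},
)
```

Recomputing keeps the header consistent with the bytes actually sent; leaving the old value in place would advertise the upstream length for a body that no longer has it.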

Suggested changes:

--- a/litellm/proxy/pass_through_endpoints/pass_through_endpoints.py
+++ b/litellm/proxy/pass_through_endpoints/pass_through_endpoints.py
@@ -23,6 +23,7 @@ from fastapi import (
 )
 from fastapi.responses import StreamingResponse
 from starlette.datastructures import UploadFile as StarletteUploadFile
+from litellm.types.passthrough_endpoints import pass_through_endpoints
 from starlette.websockets import WebSocketState
 from websockets.asyncio.client import connect
 from websockets.exceptions import (
@@ -1001,6 +1002,16 @@ async def pass_through_request(  # noqa: PLR0915
                 "pass_through_endpoint: response body not JSON-parseable, skipping post-call guardrails"
             )
 
+        response_body = await proxy_logging_obj.post_call_success_hook(
+            data=kwargs,
+            user_api_key_dict=user_api_key_dict,
+            response=response_body,  # type: ignore[arg-type]
+        )
+        post_hook_content = json.dumps(response_body).encode("utf-8")
+        if content != post_hook_content:
+            content = post_hook_content
+            _content_modified = True
+
         ## LOG SUCCESS
         passthrough_logging_payload["response_body"] = response_body
         end_time = datetime.now()

Steps to Reproduce

  1. Create a custom callback that implements async_pre_call_hook and async_post_call_success_hook.
  2. Configure a passthrough endpoint and the custom callback in config.yaml.
  3. Call the passthrough endpoint: LiteLLM runs only the pre-API hook.
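
For reference, a rough sketch of the kind of callback used in step 1. It is written as a plain stand-in class so it runs without LiteLLM installed; in a real setup it would subclass LiteLLM's `CustomLogger`, and the hook signatures below are assumed to mirror that class. The `"pass_through_endpoint"` call type, the `"route"` key in `data`, and the `text`/`prompt`/`token_ids` field names are assumptions made for illustration, not confirmed behavior.

```python
import asyncio
from typing import Any, Dict, Optional


class TokenizeTranslator:
    """Stand-in for a custom callback; names and data keys are hypothetical."""

    async def async_pre_call_hook(
        self, user_api_key_dict: Any, cache: Any, data: Dict[str, Any], call_type: str
    ) -> Dict[str, Any]:
        # Translate the fixed client format {"text": ...} into the
        # provider format {"prompt": ...} (both formats are made up here).
        if call_type != "pass_through_endpoint" or "text" not in data:
            return data
        translated = {k: v for k, v in data.items() if k != "text"}
        translated["prompt"] = data["text"]
        return translated

    async def async_post_call_success_hook(
        self, data: Dict[str, Any], user_api_key_dict: Any, response: Any
    ) -> Any:
        # There is no call_type argument here, so filter on the request data.
        # The "route" key is an assumption about what `data` would contain.
        route: Optional[str] = data.get("route")
        if route != "/tokenize" or not isinstance(response, dict):
            return response
        token_ids = response.get("token_ids", [])
        return {"tokens": token_ids, "count": len(token_ids)}


async def _demo():
    cb = TokenizeTranslator()
    request = await cb.async_pre_call_hook(
        None, None, {"text": "hi"}, "pass_through_endpoint"
    )
    translated = await cb.async_post_call_success_hook(
        {"route": "/tokenize"}, None, {"token_ids": [1, 2]}
    )
    untouched = await cb.async_post_call_success_hook(
        {"route": "/other"}, None, {"token_ids": [1, 2]}
    )
    return request, translated, untouched


request, translated, untouched = asyncio.run(_demo())
```

On the reported version only the request translation would take effect for a passthrough endpoint, so the client would receive the provider's raw response rather than the translated one.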

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.84.0

Twitter / LinkedIn details

No response
