litellm - ✅(Solved) Fix [Bug]: litellm_proxy_total_requests_metric Emits status_code=None for some of failed requests [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
BerriAI/litellm#24224Fetched 2026-04-08 01:09:17
View on GitHub
Comments
0
Participants
1
Timeline
4
Reactions
0
Participants
Timeline (top)
referenced ×2cross-referenced ×1labeled ×1

Error Message

When aggregating total requests by HTTP error codes (e.g., 4xx and 5xx), the counts do not match with litellm_proxy_failed_requests_metric. // Alternate between malformed body (50%) and auth error (50%) // 50% auth error - wrong API key console.log('Sending auth error request (invalid API key)'); console.log(Error response body: ${response.body});

Fix Action

Fixed

PR fix notes

PR #24264: fix(prometheus): default to status_code=500 for exceptions without status code

Description (problem / solution / changelog)

Summary

  • _extract_status_code() returned None when an exception lacked status_code/code attributes
  • str(None) became the literal "None" in Prometheus labels, causing litellm_proxy_total_requests_metric 4xx/5xx aggregations to not match litellm_proxy_failed_requests_metric
  • Now defaults to 500 (unclassified server error) when an exception is present but carries no extractable status code — covers both direct exception param and kwargs["exception"] paths

Test plan

  • Added 5 regression tests in tests/test_litellm/integrations/test_prometheus_status_code_none.py
  • Verified no regressions in existing prometheus tests (test_prometheus_invalid_key_filtering.py — 2 pre-existing async failures unrelated to this change)

Fixes #24224

🤖 Generated with Claude Code

Changed files

  • litellm/integrations/prometheus.py (modified, +19/-7)
  • tests/test_litellm/integrations/test_prometheus_status_code_none.py (added, +48/-0)

Code Example

# PromQL query 1 - Sum total 4xx and 5xx
sum(litellm_proxy_total_requests_metric_total { cluster="AAA", litellm_identifier="atlas", status_code=~"[4-5].." })

#  Will not match total failed request
sum(litellm_proxy_failed_requests_metric_total { cluster="AAA", litellm_identifier="atlas" })

# But if we include total with status_code equal to none it will match litellm_proxy_failed_requests_metric_total 
sum(litellm_proxy_total_requests_metric_total {
  cluster="AAA",
  litellm_identifier="atlas",
  status_code=~"[4-5]..|None" 
})

---

import http from 'k6/http';
import { check, sleep } from 'k6';

// Test configuration
export const options = {
  scenarios: {
    constant_request_rate: {
      executor: 'constant-arrival-rate',
      rate: 2, // 2 requests per second
      timeUnit: '1s',
      duration: '10m', // Run for 10 minutes (adjust as needed)
      preAllocatedVUs: 2, // Pre-allocate 2 virtual users
      maxVUs: 10, // Maximum virtual users if needed
    },
  },
};

// Configuration
const BASE_URL = 'https://....';
const API_KEY = '.....';

// Models to rotate between
const MODELS = ['gpt-4.1-nano', 'gpt-4.1'];

export default function () {
  // Rotate between models
  const model = MODELS[__ITER % MODELS.length];

  // Alternate between malformed body (50%) and auth error (50%)
  const errorType = Math.random();
  
  let payload;
  let apiKey = API_KEY;
  
  if (errorType < 0.5) {
    // 50% malformed body - missing required messages field
    payload = JSON.stringify({
      model: model,
      // Missing messages field
    });
    console.log('Sending malformed request (missing messages)');
  } else {
    // 50% auth error - wrong API key
    apiKey = 'sk-invalid-key-12345';
    payload = JSON.stringify({
      model: model,
      messages: [
        {
          role: 'user',
          content: 'What is 1+1?',
        },
      ],
    });
    console.log('Sending auth error request (invalid API key)');
  }

  // Request parameters
  const params = {
    headers: {
      'Content-Type': 'application/json',
      'x-litellm-api-key': apiKey,
    },
    timeout: '60s',
  };

  // Make the request
  const response = http.post(`${BASE_URL}/chat/completions`, payload, params);

  // Check response
  check(response, {
    'status is 4xx or 5xx': (r) => r.status >= 400,
    'response has body': (r) => r.body.length > 0,
    'response is valid JSON': (r) => {
      try {
        JSON.parse(r.body);
        return true;
      } catch (e) {
        return false;
      }
    },
  });

  // Log the model used and response details
  console.log(`Request to ${model} - Status: ${response.status} - Duration: ${response.timings.duration}ms`);
  
  if (response.status !== 200) {
    console.log(`Error response body: ${response.body}`);
  }
}

---
RAW_BUFFERClick to expand / collapse

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

The litellm_proxy_total_requests_metric sometimes includes entries where status_code=None.

Observed Behavior:

When "testing" litellm and emitted metrics i noticed that some of the requests that return 4xx or 5xx status code (unsure which tbh) cause litellm_proxy_total_requests_metric includes data points with status_code=None.

When aggregating total requests by HTTP error codes (e.g., 4xx and 5xx), the counts do not match with litellm_proxy_failed_requests_metric.

Some failed requests are not being represented with a proper HTTP status code in the total requests metric. Which you can see with following promql queries

# PromQL query 1 - Sum total 4xx and 5xx
sum(litellm_proxy_total_requests_metric_total { cluster="AAA", litellm_identifier="atlas", status_code=~"[4-5].." })

#  Will not match total failed request
sum(litellm_proxy_failed_requests_metric_total { cluster="AAA", litellm_identifier="atlas" })

# But if we include total with status_code equal to none it will match litellm_proxy_failed_requests_metric_total 
sum(litellm_proxy_total_requests_metric_total {
  cluster="AAA",
  litellm_identifier="atlas",
  status_code=~"[4-5]..|None" 
})

Expected Behavior:

  • All requests in litellm_proxy_total_requests_metric should have a valid HTTP status code.
  • Aggregating 4xx and 5xx responses from total requests should align with litellm_proxy_failed_requests_metric.

Steps to Reproduce

  1. I did use grafana k6s to emit bunch of traffic, running this script or just taking requests out of it will reproduce invalid metrics
import http from 'k6/http';
import { check, sleep } from 'k6';

// Test configuration
export const options = {
  scenarios: {
    constant_request_rate: {
      executor: 'constant-arrival-rate',
      rate: 2, // 2 requests per second
      timeUnit: '1s',
      duration: '10m', // Run for 10 minutes (adjust as needed)
      preAllocatedVUs: 2, // Pre-allocate 2 virtual users
      maxVUs: 10, // Maximum virtual users if needed
    },
  },
};

// Configuration
const BASE_URL = 'https://....';
const API_KEY = '.....';

// Models to rotate between
const MODELS = ['gpt-4.1-nano', 'gpt-4.1'];

export default function () {
  // Rotate between models
  const model = MODELS[__ITER % MODELS.length];

  // Alternate between malformed body (50%) and auth error (50%)
  const errorType = Math.random();
  
  let payload;
  let apiKey = API_KEY;
  
  if (errorType < 0.5) {
    // 50% malformed body - missing required messages field
    payload = JSON.stringify({
      model: model,
      // Missing messages field
    });
    console.log('Sending malformed request (missing messages)');
  } else {
    // 50% auth error - wrong API key
    apiKey = 'sk-invalid-key-12345';
    payload = JSON.stringify({
      model: model,
      messages: [
        {
          role: 'user',
          content: 'What is 1+1?',
        },
      ],
    });
    console.log('Sending auth error request (invalid API key)');
  }

  // Request parameters
  const params = {
    headers: {
      'Content-Type': 'application/json',
      'x-litellm-api-key': apiKey,
    },
    timeout: '60s',
  };

  // Make the request
  const response = http.post(`${BASE_URL}/chat/completions`, payload, params);

  // Check response
  check(response, {
    'status is 4xx or 5xx': (r) => r.status >= 400,
    'response has body': (r) => r.body.length > 0,
    'response is valid JSON': (r) => {
      try {
        JSON.parse(r.body);
        return true;
      } catch (e) {
        return false;
      }
    },
  });

  // Log the model used and response details
  console.log(`Request to ${model} - Status: ${response.status} - Duration: ${response.timings.duration}ms`);
  
  if (response.status !== 200) {
    console.log(`Error response body: ${response.body}`);
  }
}

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.81.12

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

To fix the issue of litellm_proxy_total_requests_metric including entries with status_code=None, we need to ensure that all requests have a valid HTTP status code.

Here are the steps to fix the issue:

  • Update the metric collection logic to handle cases where the status code is not available or is None.
  • Modify the code to set a default status code (e.g., 500) when the actual status code is None.

Example code snippet in Python:

def collect_metric(response):
    status_code = response.status
    if status_code is None:
        # Set a default status code when the actual status code is None
        status_code = 500
    # Collect the metric with the valid status code
    litellm_proxy_total_requests_metric_total.labels(cluster="AAA", litellm_identifier="atlas", status_code=status_code).inc()
  • Review the litellm_proxy_total_requests_metric collection logic to ensure it handles all possible scenarios, including cases where the status code is not available.

Verification

To verify that the fix worked:

  1. Run the same test script that reproduced the issue.
  2. Check the litellm_proxy_total_requests_metric metric using PromQL queries to ensure that all requests have a valid HTTP status code.
  3. Compare the counts of 4xx and 5xx responses from litellm_proxy_total_requests_metric with litellm_proxy_failed_requests_metric to ensure they match.

Example PromQL query:

sum(litellm_proxy_total_requests_metric_total { cluster="AAA", litellm_identifier="atlas", status_code=~"[4-5].." })

This should match the count from litellm_proxy_failed_requests_metric.

Extra Tips

  • Regularly review and update the metric collection logic to handle new scenarios and edge cases.
  • Consider adding additional logging or monitoring to detect and alert on cases where the status code is None.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING