vllm - [Bug]: gemma4 prompts with embedded JSON requirements AND response_format.json_schema cause timeouts

vllm, 2026-05-08 19:47:57

Error Message

{ "model": "gemma4-31b", "messages": [ { "role": "system", "content": "You are an expert test automation engineer and log analyst.\n\nA user has manually selected specific log lines from a test section and is requesting your analysis.\nFirst, determine whether the logs contain errors, failures, or anomalies.\n\n- If the logs indicate one or more issues, set "mode" to "failure_analysis" and provide root cause analysis.\n- If the logs are normal/informational with no apparent issues, set "mode" to "summary" and provide a descriptive summary.\n\nRespond in valid JSON with a top-level "items" array.\n\nFor failure_analysis mode:\n{\n "items": [\n {\n "mode": "failure_analysis",\n "root_cause": "A concise description of the root cause of the failure.",\n "category": "One of: script_error | environment_issue | infrastructure_failure | test_data_issue | timeout | unknown",\n "suggestion": "Actionable steps to fix or investigate the failure (2-5 bullet points as a single string, separated by newlines).",\n "summary": "A one-line summary suitable for display in a dashboard.",\n "confidence": "high | medium | low"\n }\n ]\n}\n\nFor summary mode (no issues detected):\n{\n "items": [\n {\n "mode": "summary",\n "summary": "A concise description of what is happening in the selected logs.",\n "category": "informational",\n "confidence": "high"\n }\n ]\n}\n\nGuidelines:\n- Each element of the array must describe one distinct issue or observation.\n- For failure_analysis mode: focus on the most likely root cause based on the provided log lines.\n- For summary mode: describe the sequence of operations, key events, or state transitions visible in the logs.\n- The user selected these specific lines because they believe they are relevant; give them priority.\n- Distinguish between scripting errors (wrong assertions, missing setup) and genuine product bugs.\n- If the logs indicate an environment or infrastructure issue (network timeout, resource unavailable, container crash), flag it 
clearly.\n- Keep the summary short enough for a UI badge or tooltip.\n- If you cannot determine the root cause, set category to "unknown" and confidence to "low", and suggest investigation steps." }, { "role": "user", "content": "## Section\ntest_generate_load\n\n## Selected Log Lines\ntestcase started\nRetrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by \'SSLError(SSLCertVerificationError(1, \'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\'))\': /api/v1/namespaces/si-telemetry-loadgen\nRetrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by \'SSLError(SSLCertVerificationError(1, \'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\'))\': /api/v1/namespaces/si-telemetry-loadgen\nRetrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by \'SSLError(SSLCertVerificationError(1, \'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\'))\': /api/v1/namespaces/si-telemetry-loadgen\nself = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7f620cb543a0>\nmethod = \'GET\', url = \'/api/v1/namespaces/si-telemetry-loadgen\', body = None\nheaders = {\'Accept\': \'application/json\', \'Content-Type\': \'application/json\', \'User-Agent\': \'OpenAPI-Generator/33.1.0/python\'}\nretries = Retry(total=0, connect=None, read=None, redirect=None, status=None)\nredirect = False, assert_same_host = False, timeout = None, pool_timeout = None\nrelease_conn = True, chunked = False, body_pos = None\nresponse_kw = {\'preload_content\': True, \'request_url\': \'https://192.168.20.90:6443/api/v1/namespaces/si-telemetry-loadgen\'}\nparsed_url = Url(scheme=None, auth=None, host=None, port=None, path=\'/api/v1/namespaces/si-telemetry-loadgen\', query=None, fragment=None)\ndestination_scheme = None, 
conn = None, release_this_conn = True\nhttp_tunnel_required = False, err = None, clean_exit = False\n\n def urlopen(\n self,\n method,\n url,\n body=None,\n headers=None,\n retries=None,\n redirect=True,\n assert_same_host=True,\n timeout=_Default,\n pool_timeout=None,\n release_conn=None,\n chunked=False,\n body_pos=None,\n **response_kw\n ):\n """\n Get a connection from the pool and perform an HTTP request. This is the\n lowest level call for making a request, so you\'ll need to specify all\n the raw details.\n \n .. note::\n \n More commonly, it\'s appropriate to use a convenience method provided\n by :class:.RequestMethods, such as :meth:request.\n \n .. note::\n \n release_conn will only behave as expected if\n preload_content=False because we want to make\n preload_content=False the default behaviour someday soon without\n breaking backwards compatibility.\n \n :param method:\n HTTP request method (such as GET, POST, PUT, etc.)\n \n :param url:\n The URL to perform the request on.\n \n :param body:\n Data to send in the request body, either :class:str, :class:bytes,\n an iterable of :class:str/:class:bytes, or a file-like object.\n \n :param headers:\n Dictionary of custom headers to send, such as User-Agent,\n If-None-Match, etc. If None, pool headers are used. If provided,\n these headers completely replace any pool-specific headers.\n \n :param retries:\n Configure the number of retries to allow before raising a\n :class:~urllib3.exceptions.MaxRetryError exception.\n \n Pass None to retry until you receive a response. Pass a\n :class:~urllib3.util.retry.Retry object for fine-grained control\n over different types of retries.\n Pass an integer number to retry connection errors that many times,\n but no other types of errors. Pass zero to never retry.\n \n If False, then retries are disabled and any exception is raised\n immediately. 
Also, instead of raising a MaxRetryError on redirects,\n the redirect response will be returned.\n \n :type retries: :class:~urllib3.util.retry.Retry, False, or an int.\n \n :param redirect:\n If True, automatically handle redirects (status codes 301, 302,\n 303, 307, 308). Each redirect counts as a retry. Disabling retries\n will disable redirect, too.\n \n :param assert_same_host:\n If True, will make sure that the host of the pool requests is\n consistent else will raise HostChangedError. When False, you can\n use the pool on an HTTP proxy and request foreign hosts.\n \n :param timeout:\n If specified, overrides the default timeout for this one\n request. It may be a float (in seconds) or an instance of\n :class:urllib3.util.Timeout.\n \n :param pool_timeout:\n If set and the pool is set to block=True, then this method will\n block for pool_timeout seconds and raise EmptyPoolError if no\n connection is available within the time period.\n \n :param release_conn:\n If False, then the urlopen call will not release the connection\n back into the pool once a response is received (but will release if\n you read the entire contents of the response such as when\n preload_content=True). This is useful if you\'re not preloading\n the response\'s content immediately. You will need to call\n r.release_conn() on the response r to return the connection\n back into the pool. If None, it takes the value of\n response_kw.get(\\'preload_content\\', True).\n \n :param chunked:\n If True, urllib3 will send the body using chunked transfer\n encoding. Otherwise, urllib3 will send the body using the standard\n content-length form. Defaults to False.\n \n :param int body_pos:\n Position to seek to in file-like body in the event of a retry or\n redirect. 
Typically this won\'t need to be set because urllib3 will\n auto-populate the value when needed.\n \n :param \\**response_kw:\n Additional parameters are passed to\n :meth:urllib3.response.HTTPResponse.from_httplib\n """\n \n parsed_url = parse_url(url)\n destination_scheme = parsed_url.scheme\n \n if headers is None:\n headers = self.headers\n \n if not isinstance(retries, Retry):\n retries = Retry.from_int(retries, redirect=redirect, default=self.retries)\n \n if release_conn is None:\n release_conn = response_kw.get("preload_content", True)\n \n # Check host\n if assert_same_host and not self.is_same_host(url):\n raise HostChangedError(self, url, retries)\n \n # Ensure that the URL we\'re connecting to is properly encoded\n if url.startswith("/"):\n url = six.ensure_str(_encode_target(url))\n else:\n url = six.ensure_str(parsed_url.url)\n \n conn = None\n \n # Track whether conn needs to be released before\n # returning/raising/recursing. Update this variable if necessary, and\n # leave release_conn constant throughout the function. That way, if\n # the function recurses, the original value of release_conn will be\n # passed down into the recursive call, and its value will be respected.\n #\n # See issue #651 [1] for details.\n #\n # [1] https://github.com/urllib3/urllib3/issues/651\n release_this_conn = release_conn\n \n http_tunnel_required = connection_requires_http_tunnel(\n self.proxy, self.proxy_config, destination_scheme\n )\n \n # Merge the proxy headers. Only done when not using HTTP CONNECT. We\n # have to copy the headers dict so we can safely change it without those\n # changes being reflected in anyone else\'s copy.\n if not http_tunnel_required:\n headers = headers.copy()\n headers.update(self.proxy_headers)\n \n # Must keep the exception bound to a separate variable or else Python 3\n # complains about UnboundLocalError.\n err = None\n \n # Keep track of whether we cleanly exited the except block. 
This\n # ensures we do proper cleanup in finally.\n clean_exit = False\n \n # Rewind body position, if needed. Record current position\n # for future rewinds in the event of a redirect/retry.\n body_pos = set_file_position(body, body_pos)\n \n try:\n # Request a connection from the queue.\n timeout_obj = self._get_timeout(timeout)\n conn = self._get_conn(timeout=pool_timeout)\n \n conn.timeout = timeout_obj.connect_timeout\n \n is_new_proxy_conn = self.proxy is not None and not getattr(\n conn, "sock", None\n )\n if is_new_proxy_conn and http_tunnel_required:\n self._prepare_proxy(conn)\n \n # Make the request on the httplib connection object.\n> httplib_response = self._make_request(\n conn,\n method,\n url,\n timeout=timeout_obj,\n body=body,\n headers=headers,\n chunked=chunked,\n )\n\nlib/python3.10/site-packages/vendor/urllib3/connectionpool.py:716: \n _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:404: in _make_request\n self._validate_conn(conn)\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:1061: in _validate_conn\n conn.connect()\nlib/python3.10/site-packages/_vendor/urllib3/connection.py:419: in connect\n self.sock = ssl_wrap_socket(\nlib/python3.10/site-packages/vendor/urllib3/util/ssl.py:462: in ssl_wrap_socket\n ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)\nlib/python3.10/site-packages/vendor/urllib3/util/ssl.py:504: in _ssl_wrap_socket_impl\n return ssl_context.wrap_socket(sock)\n/usr/lib/python3.10/ssl.py:513: in wrap_socket\n return self.sslsocket_class._create(\n/usr/lib/python3.10/ssl.py:1100: in create\n self.do_handshake()\n _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \n\nself = <ssl.SSLSocket [closed] fd=-1, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6>\nblock = False\n\n @_sslcopydoc\n def do_handshake(self, block=False):\n self._check_connected()\n timeout = 
self.gettimeout()\n try:\n if timeout == 0.0 and block:\n self.settimeout(None)\n> self._sslobj.do_handshake()\nE ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (ssl.c:1007)\n\n/usr/lib/python3.10/ssl.py:1371: SSLCertVerificationError\n\nDuring handling of the above exception, another exception occurred:\n\nappname = \'sip-telemetry-loadgen\', maxsize = \'100mb\', iterations = \'2\'\nparallel_processes = \'5\', number_of_pods = \'1\', number_of_jobs = \'1\'\n\n def test_generate_load(\n appname, maxsize, iterations, parallel_processes, number_of_pods, number_of_jobs\n ):\n for index in range(int(number_of_jobs)):\n> run_job(appname, maxsize, iterations, number_of_pods, parallel_processes)\n\ntest_load_cpumemory.py:269: \n _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \ntest_load_cpumemory.py:144: in run_job\n create_namespace_if_not_exists()\ntest_load_cpumemory.py:106: in create_namespace_if_not_exists\n v1.read_namespace(name=NAMESPACE)\nlib/python3.10/site-packages/kubernetes/client/api/core_v1_api.py:22992: in read_namespace\n return self.read_namespace_with_http_info(name, **kwargs) # noqa: E501\nlib/python3.10/site-packages/kubernetes/client/api/core_v1_api.py:23071: in read_namespace_with_http_info\n return self.api_client.call_api(\nlib/python3.10/site-packages/kubernetes/client/api_client.py:348: in call_api\n return self.__call_api(resource_path, method,\nlib/python3.10/site-packages/kubernetes/client/api_client.py:180: in __call_api\n response_data = self.request(\nlib/python3.10/site-packages/kubernetes/client/api_client.py:373: in request\n return self.rest_client.GET(url,\nlib/python3.10/site-packages/kubernetes/client/rest.py:244: in GET\n return self.request("GET", url,\nlib/python3.10/site-packages/kubernetes/client/rest.py:217: in request\n r = self.pool_manager.request(method, url,\nlib/python3.10/site-packages/_vendor/urllib3/request.py:77: in 
request\n return self.request_encode_url(\nlib/python3.10/site-packages/_vendor/urllib3/request.py:99: in request_encode_url\n return self.urlopen(method, url, **extra_kw)\nlib/python3.10/site-packages/_vendor/urllib3/poolmanager.py:376: in urlopen\n response = conn.urlopen(method, u.request_uri, **kw)\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n return self.urlopen(\nlib/python3.10/site-packages/vendor/urllib3/connectionpool.py:802: in urlopen\n retries = retries.increment(\n _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \n\nself = Retry(total=0, connect=None, read=None, redirect=None, status=None)\nmethod = \'GET\', url = \'/api/v1/namespaces/si-telemetry-loadgen\', response = None\nerror = SSLError(SSLCertVerificationError(1, \'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\'))\n_pool = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7f620cb543a0>\n_stacktrace = <traceback object at 0x7f620cb4a380>\n\n def increment(\n self,\n method=None,\n url=None,\n response=None,\n error=None,\n _pool=None,\n _stacktrace=None,\n ):\n """Return a new Retry object with incremented retry counters.\n \n :param response: A response object, or None, if the server did not\n return a response.\n :type response: :class:~urllib3.response.HTTPResponse\n :param Exception error: An error encountered during the request, or\n None if the response was received successfully.\n \n :return: A new Retry object.\n """\n if self.total is False and error:\n # Disabled, indicate to re-raise the error.\n raise six.reraise(type(error), error, _stacktrace)\n \n total = self.total\n if total is not None:\n total -= 1\n \n connect = self.connect\n read = self.read\n redirect 
= self.redirect\n status_count = self.status\n other = self.other\n cause = "unknown"\n status = None\n redirect_location = None\n \n if error and self._is_connection_error(error):\n # Connect retry?\n if connect is False:\n raise six.reraise(type(error), error, _stacktrace)\n elif connect is not None:\n connect -= 1\n \n elif error and self._is_read_error(error):\n # Read retry?\n if read is False or not self._is_method_retryable(method):\n raise six.reraise(type(error), error, _stacktrace)\n elif read is not None:\n read -= 1\n \n elif error:\n # Other retry?\n if other is not None:\n other -= 1\n \n elif response and response.get_redirect_location():\n # Redirect retry?\n if redirect is not None:\n redirect -= 1\n cause = "too many redirects"\n redirect_location = response.get_redirect_location()\n status = response.status\n \n else:\n # Incrementing because of a server error like a 500 in\n # status_forcelist and the given method is in the allowed_methods\n cause = ResponseError.GENERIC_ERROR\n if response and response.status:\n if status_count is not None:\n status_count -= 1\n cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)\n status = response.status\n \n history = self.history + (\n RequestHistory(method, url, error, status, redirect_location),\n )\n \n new_retry = self.new(\n total=total,\n connect=connect,\n read=read,\n redirect=redirect,\n status=status_count,\n other=other,\n history=history,\n )\n \n if new_retry.is_exhausted():\n> raise MaxRetryError(_pool, url, error or ResponseError(cause))\nE urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host=\'192.168.20.90\', port=6443): Max retries exceeded with url: /api/v1/namespaces/si-telemetry-loadgen (Caused by SSLError(SSLCertVerificationError(1, \'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\')))\n\nlib/python3.10/site-packages/_vendor/urllib3/util/retry.py:594: MaxRetryError\nNoneType: None\n" } ], "temperature": 0, 
"response_format": { "type": "json_schema", "json_schema": { "name": "adhocloganalysis", "strict": true, "schema": { "$defs": { "LogInsightItem": { "description": "Single insight item for adhoc log analysis.", "properties": { "root_cause": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "None", "title": "Root Cause" }, "category": { "title": "Category", "type": "string" }, "suggestion": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "None", "title": "Suggestion" }, "summary": { "title": "Summary", "type": "string" }, "confidence": { "enum": [ "high", "medium", "low" ], "title": "Confidence", "type": "string" }, "mode": { "enum": [ "failure_analysis", "summary" ], "title": "Mode", "type": "string" } }, "required": [ "category", "summary", "confidence", "mode" ], "title": "LogInsightItem", "type": "object", "additionalProperties": false } }, "description": "Wrapper for adhoc log analysis response with items array.", "properties": { "items": { "items": { "$ref": "#/$defs/LogInsightItem" }, "title": "Items", "type": "array" } }, "required": [ "items" ], "title": "AdhocLogAnalysis", "type": "object", "additionalProperties": false } } } }
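The request above combines a system prompt that spells out the JSON shape in prose with a strict `response_format.json_schema`. As a minimal sketch for reproducing it, the following Python rebuilds the payload and posts it with a client-side timeout. It assumes the server exposes an OpenAI-compatible `/v1/chat/completions` route; the helper names (`build_payload`, `post_chat`, `_nullable_str`), the `use_schema` switch, and the default timeout value are illustrative and not part of the report. The schema is copied from the request above.

```python
import json
import urllib.request


def _nullable_str(title):
    # Nullable string field, as declared for root_cause and suggestion in the report.
    return {"anyOf": [{"type": "string"}, {"type": "null"}], "default": "None", "title": title}


# Schema copied from the response_format in the reported request.
SCHEMA = {
    "$defs": {
        "LogInsightItem": {
            "description": "Single insight item for adhoc log analysis.",
            "properties": {
                "root_cause": _nullable_str("Root Cause"),
                "category": {"title": "Category", "type": "string"},
                "suggestion": _nullable_str("Suggestion"),
                "summary": {"title": "Summary", "type": "string"},
                "confidence": {"enum": ["high", "medium", "low"], "title": "Confidence", "type": "string"},
                "mode": {"enum": ["failure_analysis", "summary"], "title": "Mode", "type": "string"},
            },
            "required": ["category", "summary", "confidence", "mode"],
            "title": "LogInsightItem",
            "type": "object",
            "additionalProperties": False,
        }
    },
    "description": "Wrapper for adhoc log analysis response with items array.",
    "properties": {
        "items": {"items": {"$ref": "#/$defs/LogInsightItem"}, "title": "Items", "type": "array"}
    },
    "required": ["items"],
    "title": "AdhocLogAnalysis",
    "type": "object",
    "additionalProperties": False,
}


def build_payload(system_prompt, user_content, use_schema=True):
    """Build the chat request; use_schema=False drops response_format (hypothetical switch)."""
    payload = {
        "model": "gemma4-31b",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content},
        ],
        "temperature": 0,
    }
    if use_schema:
        payload["response_format"] = {
            "type": "json_schema",
            "json_schema": {"name": "adhocloganalysis", "strict": True, "schema": SCHEMA},
        }
    return payload


def post_chat(base_url, payload, timeout_s=60.0):
    """POST the payload; urlopen raises if the server does not answer within timeout_s."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Sending the same messages once with `use_schema=True` and once with `use_schema=False` may help isolate whether the timeout depends on the combination named in the title; the base URL passed to `post_chat` depends on the deployment.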

Root Cause

{
    "model": "gemma4-31b",
    "messages": [
        {
            "role": "system",
            "content": "You are an expert test automation engineer and log analyst.\n\nA user has manually selected specific log lines from a test section and is requesting your analysis.\nFirst, determine whether the logs contain errors, failures, or anomalies.\n\n- If the logs indicate one or more issues, set \"mode\" to \"failure_analysis\" and provide root cause analysis.\n- If the logs are normal/informational with no apparent issues, set \"mode\" to \"summary\" and provide a descriptive summary.\n\nRespond in valid JSON with a top-level \"items\" array.\n\nFor **failure_analysis** mode:\n{\n  \"items\": [\n    {\n      \"mode\": \"failure_analysis\",\n      \"root_cause\": \"A concise description of the root cause of the failure.\",\n      \"category\": \"One of: script_error | environment_issue | infrastructure_failure | test_data_issue | timeout | unknown\",\n      \"suggestion\": \"Actionable steps to fix or investigate the failure (2-5 bullet points as a single string, separated by newlines).\",\n      \"summary\": \"A one-line summary suitable for display in a dashboard.\",\n      \"confidence\": \"high | medium | low\"\n    }\n  ]\n}\n\nFor **summary** mode (no issues detected):\n{\n  \"items\": [\n    {\n      \"mode\": \"summary\",\n      \"summary\": \"A concise description of what is happening in the selected logs.\",\n      \"category\": \"informational\",\n      \"confidence\": \"high\"\n    }\n  ]\n}\n\nGuidelines:\n- Each element of the array must describe one distinct issue or observation.\n- For failure_analysis mode: focus on the most likely root cause based on the provided log lines.\n- For summary mode: describe the sequence of operations, key events, or state transitions visible in the logs.\n- The user selected these specific lines because they believe they are relevant; give them priority.\n- Distinguish between scripting errors (wrong assertions, missing setup) and genuine product bugs.\n- If the logs indicate an environment or 
infrastructure issue (network timeout, resource unavailable, container crash), flag it clearly.\n- Keep the summary short enough for a UI badge or tooltip.\n- If you cannot determine the root cause, set category to \"unknown\" and confidence to \"low\", and suggest investigation steps."
        },
        {
            "role": "user",
            "content": "## Section\ntest_generate_load\n\n## Selected Log Lines\ntestcase started\nRetrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nRetrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nRetrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nself = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7f620cb543a0>\nmethod = \\'GET\\', url = \\'/api/v1/namespaces/si-telemetry-loadgen\\', body = None\nheaders = {\\'Accept\\': \\'application/json\\', \\'Content-Type\\': \\'application/json\\', \\'User-Agent\\': \\'OpenAPI-Generator/33.1.0/python\\'}\nretries = Retry(total=0, connect=None, read=None, redirect=None, status=None)\nredirect = False, assert_same_host = False, timeout = None, pool_timeout = None\nrelease_conn = True, chunked = False, body_pos = None\nresponse_kw = {\\'preload_content\\': True, \\'request_url\\': \\'https://192.168.20.90:6443/api/v1/namespaces/si-telemetry-loadgen\\'}\nparsed_url = Url(scheme=None, auth=None, host=None, port=None, path=\\'/api/v1/namespaces/si-telemetry-loadgen\\', query=None, fragment=None)\ndestination_scheme = None, conn = None, release_this_conn = True\nhttp_tunnel_required = False, err = None, clean_exit = False\n\n    def urlopen(\n        self,\n        method,\n        url,\n      
  body=None,\n        headers=None,\n        retries=None,\n        redirect=True,\n        assert_same_host=True,\n        timeout=_Default,\n        pool_timeout=None,\n        release_conn=None,\n        chunked=False,\n        body_pos=None,\n        **response_kw\n    ):\n        \"\"\"\n        Get a connection from the pool and perform an HTTP request. This is the\n        lowest level call for making a request, so you\\'ll need to specify all\n        the raw details.\n    \n        .. note::\n    \n           More commonly, it\\'s appropriate to use a convenience method provided\n           by :class:`.RequestMethods`, such as :meth:`request`.\n    \n        .. note::\n    \n           `release_conn` will only behave as expected if\n           `preload_content=False` because we want to make\n           `preload_content=False` the default behaviour someday soon without\n           breaking backwards compatibility.\n    \n        :param method:\n            HTTP request method (such as GET, POST, PUT, etc.)\n    \n        :param url:\n            The URL to perform the request on.\n    \n        :param body:\n            Data to send in the request body, either :class:`str`, :class:`bytes`,\n            an iterable of :class:`str`/:class:`bytes`, or a file-like object.\n    \n        :param headers:\n            Dictionary of custom headers to send, such as User-Agent,\n            If-None-Match, etc. If None, pool headers are used. If provided,\n            these headers completely replace any pool-specific headers.\n    \n        :param retries:\n            Configure the number of retries to allow before raising a\n            :class:`~urllib3.exceptions.MaxRetryError` exception.\n    \n            Pass ``None`` to retry until you receive a response. 
Pass a\n            :class:`~urllib3.util.retry.Retry` object for fine-grained control\n            over different types of retries.\n            Pass an integer number to retry connection errors that many times,\n            but no other types of errors. Pass zero to never retry.\n    \n            If ``False``, then retries are disabled and any exception is raised\n            immediately. Also, instead of raising a MaxRetryError on redirects,\n            the redirect response will be returned.\n    \n        :type retries: :class:`~urllib3.util.retry.Retry`, False, or an int.\n    \n        :param redirect:\n            If True, automatically handle redirects (status codes 301, 302,\n            303, 307, 308). Each redirect counts as a retry. Disabling retries\n            will disable redirect, too.\n    \n        :param assert_same_host:\n            If ``True``, will make sure that the host of the pool requests is\n            consistent else will raise HostChangedError. When ``False``, you can\n            use the pool on an HTTP proxy and request foreign hosts.\n    \n        :param timeout:\n            If specified, overrides the default timeout for this one\n            request. It may be a float (in seconds) or an instance of\n            :class:`urllib3.util.Timeout`.\n    \n        :param pool_timeout:\n            If set and the pool is set to block=True, then this method will\n            block for ``pool_timeout`` seconds and raise EmptyPoolError if no\n            connection is available within the time period.\n    \n        :param release_conn:\n            If False, then the urlopen call will not release the connection\n            back into the pool once a response is received (but will release if\n            you read the entire contents of the response such as when\n            `preload_content=True`). This is useful if you\\'re not preloading\n            the response\\'s content immediately. 
You will need to call\n            ``r.release_conn()`` on the response ``r`` to return the connection\n            back into the pool. If None, it takes the value of\n            ``response_kw.get(\\'preload_content\\', True)``.\n    \n        :param chunked:\n            If True, urllib3 will send the body using chunked transfer\n            encoding. Otherwise, urllib3 will send the body using the standard\n            content-length form. Defaults to False.\n    \n        :param int body_pos:\n            Position to seek to in file-like body in the event of a retry or\n            redirect. Typically this won\\'t need to be set because urllib3 will\n            auto-populate the value when needed.\n    \n        :param \\\\**response_kw:\n            Additional parameters are passed to\n            :meth:`urllib3.response.HTTPResponse.from_httplib`\n        \"\"\"\n    \n        parsed_url = parse_url(url)\n        destination_scheme = parsed_url.scheme\n    \n        if headers is None:\n            headers = self.headers\n    \n        if not isinstance(retries, Retry):\n            retries = Retry.from_int(retries, redirect=redirect, default=self.retries)\n    \n        if release_conn is None:\n            release_conn = response_kw.get(\"preload_content\", True)\n    \n        # Check host\n        if assert_same_host and not self.is_same_host(url):\n            raise HostChangedError(self, url, retries)\n    \n        # Ensure that the URL we\\'re connecting to is properly encoded\n        if url.startswith(\"/\"):\n            url = six.ensure_str(_encode_target(url))\n        else:\n            url = six.ensure_str(parsed_url.url)\n    \n        conn = None\n    \n        # Track whether `conn` needs to be released before\n        # returning/raising/recursing. Update this variable if necessary, and\n        # leave `release_conn` constant throughout the function. 
That way, if\n        # the function recurses, the original value of `release_conn` will be\n        # passed down into the recursive call, and its value will be respected.\n        #\n        # See issue #651 [1] for details.\n        #\n        # [1] <https://github.com/urllib3/urllib3/issues/651>\n        release_this_conn = release_conn\n    \n        http_tunnel_required = connection_requires_http_tunnel(\n            self.proxy, self.proxy_config, destination_scheme\n        )\n    \n        # Merge the proxy headers. Only done when not using HTTP CONNECT. We\n        # have to copy the headers dict so we can safely change it without those\n        # changes being reflected in anyone else\\'s copy.\n        if not http_tunnel_required:\n            headers = headers.copy()\n            headers.update(self.proxy_headers)\n    \n        # Must keep the exception bound to a separate variable or else Python 3\n        # complains about UnboundLocalError.\n        err = None\n    \n        # Keep track of whether we cleanly exited the except block. This\n        # ensures we do proper cleanup in finally.\n        clean_exit = False\n    \n        # Rewind body position, if needed. 
Record current position\n        # for future rewinds in the event of a redirect/retry.\n        body_pos = set_file_position(body, body_pos)\n    \n        try:\n            # Request a connection from the queue.\n            timeout_obj = self._get_timeout(timeout)\n            conn = self._get_conn(timeout=pool_timeout)\n    \n            conn.timeout = timeout_obj.connect_timeout\n    \n            is_new_proxy_conn = self.proxy is not None and not getattr(\n                conn, \"sock\", None\n            )\n            if is_new_proxy_conn and http_tunnel_required:\n                self._prepare_proxy(conn)\n    \n            # Make the request on the httplib connection object.\n>           httplib_response = self._make_request(\n                conn,\n                method,\n                url,\n                timeout=timeout_obj,\n                body=body,\n                headers=headers,\n                chunked=chunked,\n            )\n\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:716: \n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:404: in _make_request\n    self._validate_conn(conn)\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:1061: in _validate_conn\n    conn.connect()\nlib/python3.10/site-packages/_vendor/urllib3/connection.py:419: in connect\n    self.sock = ssl_wrap_socket(\nlib/python3.10/site-packages/_vendor/urllib3/util/ssl_.py:462: in ssl_wrap_socket\n    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)\nlib/python3.10/site-packages/_vendor/urllib3/util/ssl_.py:504: in _ssl_wrap_socket_impl\n    return ssl_context.wrap_socket(sock)\n/usr/lib/python3.10/ssl.py:513: in wrap_socket\n    return self.sslsocket_class._create(\n/usr/lib/python3.10/ssl.py:1100: in _create\n    self.do_handshake()\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \n\nself = <ssl.SSLSocket 
[closed] fd=-1, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6>\nblock = False\n\n    @_sslcopydoc\n    def do_handshake(self, block=False):\n        self._check_connected()\n        timeout = self.gettimeout()\n        try:\n            if timeout == 0.0 and block:\n                self.settimeout(None)\n>           self._sslobj.do_handshake()\nE           ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\n\n/usr/lib/python3.10/ssl.py:1371: SSLCertVerificationError\n\nDuring handling of the above exception, another exception occurred:\n\nappname = \\'sip-telemetry-loadgen\\', maxsize = \\'100mb\\', iterations = \\'2\\'\nparallel_processes = \\'5\\', number_of_pods = \\'1\\', number_of_jobs = \\'1\\'\n\n    def test_generate_load(\n        appname, maxsize, iterations, parallel_processes, number_of_pods, number_of_jobs\n    ):\n        for index in range(int(number_of_jobs)):\n>           run_job(appname, maxsize, iterations, number_of_pods, parallel_processes)\n\ntest_load_cpumemory.py:269: \n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \ntest_load_cpumemory.py:144: in run_job\n    create_namespace_if_not_exists()\ntest_load_cpumemory.py:106: in create_namespace_if_not_exists\n    v1.read_namespace(name=NAMESPACE)\nlib/python3.10/site-packages/kubernetes/client/api/core_v1_api.py:22992: in read_namespace\n    return self.read_namespace_with_http_info(name, **kwargs)  # noqa: E501\nlib/python3.10/site-packages/kubernetes/client/api/core_v1_api.py:23071: in read_namespace_with_http_info\n    return self.api_client.call_api(\nlib/python3.10/site-packages/kubernetes/client/api_client.py:348: in call_api\n    return self.__call_api(resource_path, method,\nlib/python3.10/site-packages/kubernetes/client/api_client.py:180: in __call_api\n    response_data = self.request(\nlib/python3.10/site-packages/kubernetes/client/api_client.py:373: in 
request\n    return self.rest_client.GET(url,\nlib/python3.10/site-packages/kubernetes/client/rest.py:244: in GET\n    return self.request(\"GET\", url,\nlib/python3.10/site-packages/kubernetes/client/rest.py:217: in request\n    r = self.pool_manager.request(method, url,\nlib/python3.10/site-packages/_vendor/urllib3/request.py:77: in request\n    return self.request_encode_url(\nlib/python3.10/site-packages/_vendor/urllib3/request.py:99: in request_encode_url\n    return self.urlopen(method, url, **extra_kw)\nlib/python3.10/site-packages/_vendor/urllib3/poolmanager.py:376: in urlopen\n    response = conn.urlopen(method, u.request_uri, **kw)\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:802: in urlopen\n    retries = retries.increment(\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \n\nself = Retry(total=0, connect=None, read=None, redirect=None, status=None)\nmethod = \\'GET\\', url = \\'/api/v1/namespaces/si-telemetry-loadgen\\', response = None\nerror = SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\n_pool = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7f620cb543a0>\n_stacktrace = <traceback object at 0x7f620cb4a380>\n\n    def increment(\n        self,\n        method=None,\n        url=None,\n        response=None,\n        error=None,\n        _pool=None,\n        _stacktrace=None,\n    ):\n        \"\"\"Return a new Retry object with incremented retry counters.\n    \n        :param response: A response object, or None, if the server did not\n            return a response.\n        :type 
response: :class:`~urllib3.response.HTTPResponse`\n        :param Exception error: An error encountered during the request, or\n            None if the response was received successfully.\n    \n        :return: A new ``Retry`` object.\n        \"\"\"\n        if self.total is False and error:\n            # Disabled, indicate to re-raise the error.\n            raise six.reraise(type(error), error, _stacktrace)\n    \n        total = self.total\n        if total is not None:\n            total -= 1\n    \n        connect = self.connect\n        read = self.read\n        redirect = self.redirect\n        status_count = self.status\n        other = self.other\n        cause = \"unknown\"\n        status = None\n        redirect_location = None\n    \n        if error and self._is_connection_error(error):\n            # Connect retry?\n            if connect is False:\n                raise six.reraise(type(error), error, _stacktrace)\n            elif connect is not None:\n                connect -= 1\n    \n        elif error and self._is_read_error(error):\n            # Read retry?\n            if read is False or not self._is_method_retryable(method):\n                raise six.reraise(type(error), error, _stacktrace)\n            elif read is not None:\n                read -= 1\n    \n        elif error:\n            # Other retry?\n            if other is not None:\n                other -= 1\n    \n        elif response and response.get_redirect_location():\n            # Redirect retry?\n            if redirect is not None:\n                redirect -= 1\n            cause = \"too many redirects\"\n            redirect_location = response.get_redirect_location()\n            status = response.status\n    \n        else:\n            # Incrementing because of a server error like a 500 in\n            # status_forcelist and the given method is in the allowed_methods\n            cause = ResponseError.GENERIC_ERROR\n            if response and 
response.status:\n                if status_count is not None:\n                    status_count -= 1\n                cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)\n                status = response.status\n    \n        history = self.history + (\n            RequestHistory(method, url, error, status, redirect_location),\n        )\n    \n        new_retry = self.new(\n            total=total,\n            connect=connect,\n            read=read,\n            redirect=redirect,\n            status=status_count,\n            other=other,\n            history=history,\n        )\n    \n        if new_retry.is_exhausted():\n>           raise MaxRetryError(_pool, url, error or ResponseError(cause))\nE           urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host=\\'192.168.20.90\\', port=6443): Max retries exceeded with url: /api/v1/namespaces/si-telemetry-loadgen (Caused by SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\')))\n\nlib/python3.10/site-packages/_vendor/urllib3/util/retry.py:594: MaxRetryError\nNoneType: None\n"
        }
    ],
    "temperature": 0,
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "adhocloganalysis",
            "strict": true,
            "schema": {
                "$defs": {
                    "LogInsightItem": {
                        "description": "Single insight item for adhoc log analysis.",
                        "properties": {
                            "root_cause": {
                                "anyOf": [
                                    {
                                        "type": "string"
                                    },
                                    {
                                        "type": "null"
                                    }
                                ],
                                "default": "None",
                                "title": "Root Cause"
                            },
                            "category": {
                                "title": "Category",
                                "type": "string"
                            },
                            "suggestion": {
                                "anyOf": [
                                    {
                                        "type": "string"
                                    },
                                    {
                                        "type": "null"
                                    }
                                ],
                                "default": "None",
                                "title": "Suggestion"
                            },
                            "summary": {
                                "title": "Summary",
                                "type": "string"
                            },
                            "confidence": {
                                "enum": [
                                    "high",
                                    "medium",
                                    "low"
                                ],
                                "title": "Confidence",
                                "type": "string"
                            },
                            "mode": {
                                "enum": [
                                    "failure_analysis",
                                    "summary"
                                ],
                                "title": "Mode",
                                "type": "string"
                            }
                        },
                        "required": [
                            "category",
                            "summary",
                            "confidence",
                            "mode"
                        ],
                        "title": "LogInsightItem",
                        "type": "object",
                        "additionalProperties": false
                    }
                },
                "description": "Wrapper for adhoc log analysis response with items array.",
                "properties": {
                    "items": {
                        "items": {
                            "$ref": "#/$defs/LogInsightItem"
                        },
                        "title": "Items",
                        "type": "array"
                    }
                },
                "required": [
                    "items"
                ],
                "title": "AdhocLogAnalysis",
                "type": "object",
                "additionalProperties": false
            }
        }
    }
}
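The request above can be replayed against a running server with a bounded wait, which makes it easier to tell a slow response from a generation that never finishes. The following is a minimal sketch, not the reporter's client: the URL in the usage note, the `max_tokens` cap, the timeout value, and the helper names `build_payload`, `post_with_timeout`, and `check_items` are all assumptions introduced for illustration. Only the payload fields (`model`, `messages`, `temperature`, `response_format`) and the schema's required keys and enum values come from the report.

```python
import json
import urllib.request

# Required keys and enum values copied from the LogInsightItem schema in the report.
REQUIRED_KEYS = ("category", "summary", "confidence", "mode")
ALLOWED_VALUES = {
    "confidence": ("high", "medium", "low"),
    "mode": ("failure_analysis", "summary"),
}


def build_payload(system_prompt, user_prompt, schema, max_tokens=1024):
    """Assemble a chat-completions body shaped like the one in the report."""
    return {
        "model": "gemma4-31b",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0,
        # Assumption: the report sets no output cap; one is added here so a
        # runaway generation ends by itself instead of running until the client gives up.
        "max_tokens": max_tokens,
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "adhocloganalysis",
                "strict": True,
                "schema": schema,
            },
        },
    }


def post_with_timeout(url, payload, timeout_s=120.0):
    """POST the payload; timeout_s bounds each blocking socket operation."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


def check_items(document):
    """Return a list of problems found in a parsed model reply (empty if none)."""
    items = document.get("items") if isinstance(document, dict) else None
    if not isinstance(items, list):
        return ["top-level 'items' array is missing"]
    problems = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"items[{index}] is not an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in item:
                problems.append(f"items[{index}] lacks required key '{key}'")
        for key, allowed in ALLOWED_VALUES.items():
            if key in item and item[key] not in allowed:
                problems.append(f"items[{index}].{key} has unexpected value {item[key]!r}")
    return problems
```

A typical use would be `reply = post_with_timeout("http://localhost:8000/v1/chat/completions", build_payload(system, user, schema))`, with the host and port adjusted to the actual deployment, followed by `check_items(json.loads(reply["choices"][0]["message"]["content"]))`. Running the same payload once with and once without the `response_format` key may help show whether the timeout depends on the schema constraint or on the prompt alone.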

Fix Action

Fix / Workaround


Code Example

Collecting environment information...
==============================
        System Info
==============================
OS                           : Ubuntu 22.04.5 LTS (x86_64)
GCC version                  : (Ubuntu 11.4.0-1ubuntu1~22.04.3) 11.4.0
Clang version                : Could not collect
CMake version                : Could not collect
Libc version                 : glibc-2.35

==============================
       PyTorch Info
==============================
PyTorch version              : 2.11.0+cu130
Is debug build               : False
CUDA used to build PyTorch   : 13.0
ROCM used to build PyTorch   : N/A
XPU used to build PyTorch    : N/A

==============================
      Python Environment
==============================
Python version               : 3.12.13 (main, Mar  4 2026, 09:23:07) [GCC 11.4.0] (64-bit runtime)
Python platform              : Linux-5.14.0-554.el9.x86_64-x86_64-with-glibc2.35

==============================
       CUDA / GPU Info
==============================
Is CUDA available            : True
CUDA runtime version         : 13.0.88
CUDA_MODULE_LOADING set to   :
GPU models and configuration :
GPU 0: NVIDIA A100 80GB PCIe
GPU 1: NVIDIA A100 80GB PCIe

Nvidia driver version        : 580.105.08
cuDNN version                : Could not collect
HIP runtime version          : N/A
MIOpen runtime version       : N/A
Is XNNPACK available         : True

==============================
          CPU Info
==============================
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        46 bits physical, 57 bits virtual
Byte Order:                           Little Endian
CPU(s):                               128
On-line CPU(s) list:                  0-127
Vendor ID:                            GenuineIntel
Model name:                           Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz
CPU family:                           6
Model:                                106
Thread(s) per core:                   2
Core(s) per socket:                   32
Socket(s):                            2
Stepping:                             6
BogoMIPS:                             4000.00
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pts vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid fsrm md_clear pconfig flush_l1d arch_capabilities
Virtualization:                       VT-x
L1d cache:                            3 MiB (64 instances)
L1i cache:                            2 MiB (64 instances)
L2 cache:                             80 MiB (64 instances)
L3 cache:                             96 MiB (2 instances)
NUMA node(s):                         2
NUMA node0 CPU(s):                    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110,112,114,116,118,120,122,124,126
NUMA node1 CPU(s):                    1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,97,99,101,103,105,107,109,111,113,115,117,119,121,123,125,127
Vulnerability Gather data sampling:   Mitigation; Microcode
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Not affected
Vulnerability Spec rstack overflow:   Not affected
Vulnerability Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:             Mitigation; Enhanced / Automatic IBRS; IBPB conditional; RSB filling; PBRSB-eIBRS SW sequence; BHI SW loop, KVM SW loop
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

==============================
Versions of relevant libraries
==============================
[pip3] flashinfer-python==0.6.8.post1
[pip3] numpy==2.2.6
[pip3] nvidia-cublas==13.1.0.3
[pip3] nvidia-cuda-cupti==13.0.85
[pip3] nvidia-cuda-nvrtc==13.0.88
[pip3] nvidia-cuda-runtime==13.0.96
[pip3] nvidia-cudnn-cu13==9.19.0.56
[pip3] nvidia-cudnn-frontend==1.18.0
[pip3] nvidia-cufft==12.0.0.61
[pip3] nvidia-cufile==1.15.1.6
[pip3] nvidia-curand==10.4.0.35
[pip3] nvidia-cusolver==12.0.4.66
[pip3] nvidia-cusparse==12.6.3.3
[pip3] nvidia-cusparselt-cu13==0.8.0
[pip3] nvidia-cutlass-dsl==4.5.0
[pip3] nvidia-cutlass-dsl-libs-base==4.5.0
[pip3] nvidia-ml-py==13.595.45
[pip3] nvidia-nccl-cu13==2.28.9
[pip3] nvidia-nvjitlink==13.0.88
[pip3] nvidia-nvshmem-cu13==3.4.5
[pip3] nvidia-nvtx==13.0.85
[pip3] pyzmq==27.1.0
[pip3] torch==2.11.0+cu130
[pip3] torch_c_dlpack_ext==0.1.5
[pip3] torchaudio==2.11.0+cu130
[pip3] torchvision==0.26.0+cu130
[pip3] transformers==5.8.0
[pip3] triton==3.6.0
[conda] Could not collect

==============================
         vLLM Info
==============================
ROCM Version                 : Could not collect
vLLM Version                 : 0.20.2rc1.dev93+g51f22dcfd (git sha: 51f22dcfd)
vLLM Build Flags:
  CUDA Archs: 7.5 8.0 8.6 8.9 9.0 10.0 12.0+PTX; ROCm: Disabled; XPU: Disabled
GPU Topology:
        GPU0    GPU1    NIC0    NIC1    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      SYS     SYS     SYS     0,2,4,6,8,10    0               N/A
GPU1    SYS      X      NODE    NODE    1,3,5,7,9,11    1               N/A
NIC0    SYS     NODE     X      PIX
NIC1    SYS     NODE    PIX      X

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0
  NIC1: mlx5_1

==============================
     Environment Variables
==============================
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_REQUIRE_CUDA=cuda>=13.0 brand=unknown,driver>=535,driver<536 brand=grid,driver>=535,driver<536 brand=tesla,driver>=535,driver<536 brand=nvidia,driver>=535,driver<536 brand=quadro,driver>=535,driver<536 brand=quadrortx,driver>=535,driver<536 brand=nvidiartx,driver>=535,driver<536 brand=vapps,driver>=535,driver<536 brand=vpc,driver>=535,driver<536 brand=vcs,driver>=535,driver<536 brand=vws,driver>=535,driver<536 brand=cloudgaming,driver>=535,driver<536 brand=unknown,driver>=550,driver<551 brand=grid,driver>=550,driver<551 brand=tesla,driver>=550,driver<551 brand=nvidia,driver>=550,driver<551 brand=quadro,driver>=550,driver<551 brand=quadrortx,driver>=550,driver<551 brand=nvidiartx,driver>=550,driver<551 brand=vapps,driver>=550,driver<551 brand=vpc,driver>=550,driver<551 brand=vcs,driver>=550,driver<551 brand=vws,driver>=550,driver<551 brand=cloudgaming,driver>=550,driver<551 brand=unknown,driver>=565,driver<566 brand=grid,driver>=565,driver<566 brand=tesla,driver>=565,driver<566 brand=nvidia,driver>=565,driver<566 brand=quadro,driver>=565,driver<566 brand=quadrortx,driver>=565,driver<566 brand=nvidiartx,driver>=565,driver<566 brand=vapps,driver>=565,driver<566 brand=vpc,driver>=565,driver<566 brand=vcs,driver>=565,driver<566 brand=vws,driver>=565,driver<566 brand=cloudgaming,driver>=565,driver<566 brand=unknown,driver>=570,driver<571 brand=grid,driver>=570,driver<571 brand=tesla,driver>=570,driver<571 brand=nvidia,driver>=570,driver<571 brand=quadro,driver>=570,driver<571 brand=quadrortx,driver>=570,driver<571 brand=nvidiartx,driver>=570,driver<571 brand=vapps,driver>=570,driver<571 brand=vpc,driver>=570,driver<571 brand=vcs,driver>=570,driver<571 brand=vws,driver>=570,driver<571 brand=cloudgaming,driver>=570,driver<571 brand=unknown,driver>=575,driver<576 brand=grid,driver>=575,driver<576 brand=tesla,driver>=575,driver<576 brand=nvidia,driver>=575,driver<576 brand=quadro,driver>=575,driver<576 brand=quadrortx,driver>=575,driver<576 
brand=nvidiartx,driver>=575,driver<576 brand=vapps,driver>=575,driver<576 brand=vpc,driver>=575,driver<576 brand=vcs,driver>=575,driver<576 brand=vws,driver>=575,driver<576 brand=cloudgaming,driver>=575,driver<576
TORCH_CUDA_ARCH_LIST=7.5 8.0 8.6 8.9 9.0 10.0 12.0+PTX
NVIDIA_DRIVER_CAPABILITIES=compute,utility
VLLM_USAGE_SOURCE=production-docker-image
CUDA_VERSION=13.0.2
VLLM_ENABLE_CUDA_COMPATIBILITY=0
VLLM_DO_NOT_TRACK=1
LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64
VLLM_NO_USAGE_STATS=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
TORCHINDUCTOR_CACHE_DIR=/tmp/torchinductor_root

---

{
    "model": "gemma4-31b",
    "messages": [
        {
            "role": "system",
            "content": "You are an expert test automation engineer and log analyst.\n\nA user has manually selected specific log lines from a test section and is requesting your analysis.\nFirst, determine whether the logs contain errors, failures, or anomalies.\n\n- If the logs indicate one or more issues, set \"mode\" to \"failure_analysis\" and provide root cause analysis.\n- If the logs are normal/informational with no apparent issues, set \"mode\" to \"summary\" and provide a descriptive summary.\n\nRespond in valid JSON with a top-level \"items\" array.\n\nFor **failure_analysis** mode:\n{\n  \"items\": [\n    {\n      \"mode\": \"failure_analysis\",\n      \"root_cause\": \"A concise description of the root cause of the failure.\",\n      \"category\": \"One of: script_error | environment_issue | infrastructure_failure | test_data_issue | timeout | unknown\",\n      \"suggestion\": \"Actionable steps to fix or investigate the failure (2-5 bullet points as a single string, separated by newlines).\",\n      \"summary\": \"A one-line summary suitable for display in a dashboard.\",\n      \"confidence\": \"high | medium | low\"\n    }\n  ]\n}\n\nFor **summary** mode (no issues detected):\n{\n  \"items\": [\n    {\n      \"mode\": \"summary\",\n      \"summary\": \"A concise description of what is happening in the selected logs.\",\n      \"category\": \"informational\",\n      \"confidence\": \"high\"\n    }\n  ]\n}\n\nGuidelines:\n- Each element of the array must describe one distinct issue or observation.\n- For failure_analysis mode: focus on the most likely root cause based on the provided log lines.\n- For summary mode: describe the sequence of operations, key events, or state transitions visible in the logs.\n- The user selected these specific lines because they believe they are relevant; give them priority.\n- Distinguish between scripting errors (wrong assertions, missing setup) and genuine product bugs.\n- If the logs indicate an environment or 
infrastructure issue (network timeout, resource unavailable, container crash), flag it clearly.\n- Keep the summary short enough for a UI badge or tooltip.\n- If you cannot determine the root cause, set category to \"unknown\" and confidence to \"low\", and suggest investigation steps."
        },
        {
            "role": "user",
            "content": "## Section\ntest_generate_load\n\n## Selected Log Lines\ntestcase started\nRetrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nRetrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nRetrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nself = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7f620cb543a0>\nmethod = \\'GET\\', url = \\'/api/v1/namespaces/si-telemetry-loadgen\\', body = None\nheaders = {\\'Accept\\': \\'application/json\\', \\'Content-Type\\': \\'application/json\\', \\'User-Agent\\': \\'OpenAPI-Generator/33.1.0/python\\'}\nretries = Retry(total=0, connect=None, read=None, redirect=None, status=None)\nredirect = False, assert_same_host = False, timeout = None, pool_timeout = None\nrelease_conn = True, chunked = False, body_pos = None\nresponse_kw = {\\'preload_content\\': True, \\'request_url\\': \\'https://192.168.20.90:6443/api/v1/namespaces/si-telemetry-loadgen\\'}\nparsed_url = Url(scheme=None, auth=None, host=None, port=None, path=\\'/api/v1/namespaces/si-telemetry-loadgen\\', query=None, fragment=None)\ndestination_scheme = None, conn = None, release_this_conn = True\nhttp_tunnel_required = False, err = None, clean_exit = False\n\n    def urlopen(\n        self,\n        method,\n        url,\n      
  body=None,\n        headers=None,\n        retries=None,\n        redirect=True,\n        assert_same_host=True,\n        timeout=_Default,\n        pool_timeout=None,\n        release_conn=None,\n        chunked=False,\n        body_pos=None,\n        **response_kw\n    ):\n        \"\"\"\n        Get a connection from the pool and perform an HTTP request. This is the\n        lowest level call for making a request, so you\\'ll need to specify all\n        the raw details.\n    \n        .. note::\n    \n           More commonly, it\\'s appropriate to use a convenience method provided\n           by :class:`.RequestMethods`, such as :meth:`request`.\n    \n        .. note::\n    \n           `release_conn` will only behave as expected if\n           `preload_content=False` because we want to make\n           `preload_content=False` the default behaviour someday soon without\n           breaking backwards compatibility.\n    \n        :param method:\n            HTTP request method (such as GET, POST, PUT, etc.)\n    \n        :param url:\n            The URL to perform the request on.\n    \n        :param body:\n            Data to send in the request body, either :class:`str`, :class:`bytes`,\n            an iterable of :class:`str`/:class:`bytes`, or a file-like object.\n    \n        :param headers:\n            Dictionary of custom headers to send, such as User-Agent,\n            If-None-Match, etc. If None, pool headers are used. If provided,\n            these headers completely replace any pool-specific headers.\n    \n        :param retries:\n            Configure the number of retries to allow before raising a\n            :class:`~urllib3.exceptions.MaxRetryError` exception.\n    \n            Pass ``None`` to retry until you receive a response. 
Pass a\n            :class:`~urllib3.util.retry.Retry` object for fine-grained control\n            over different types of retries.\n            Pass an integer number to retry connection errors that many times,\n            but no other types of errors. Pass zero to never retry.\n    \n            If ``False``, then retries are disabled and any exception is raised\n            immediately. Also, instead of raising a MaxRetryError on redirects,\n            the redirect response will be returned.\n    \n        :type retries: :class:`~urllib3.util.retry.Retry`, False, or an int.\n    \n        :param redirect:\n            If True, automatically handle redirects (status codes 301, 302,\n            303, 307, 308). Each redirect counts as a retry. Disabling retries\n            will disable redirect, too.\n    \n        :param assert_same_host:\n            If ``True``, will make sure that the host of the pool requests is\n            consistent else will raise HostChangedError. When ``False``, you can\n            use the pool on an HTTP proxy and request foreign hosts.\n    \n        :param timeout:\n            If specified, overrides the default timeout for this one\n            request. It may be a float (in seconds) or an instance of\n            :class:`urllib3.util.Timeout`.\n    \n        :param pool_timeout:\n            If set and the pool is set to block=True, then this method will\n            block for ``pool_timeout`` seconds and raise EmptyPoolError if no\n            connection is available within the time period.\n    \n        :param release_conn:\n            If False, then the urlopen call will not release the connection\n            back into the pool once a response is received (but will release if\n            you read the entire contents of the response such as when\n            `preload_content=True`). This is useful if you\\'re not preloading\n            the response\\'s content immediately. 
You will need to call\n            ``r.release_conn()`` on the response ``r`` to return the connection\n            back into the pool. If None, it takes the value of\n            ``response_kw.get(\\'preload_content\\', True)``.\n    \n        :param chunked:\n            If True, urllib3 will send the body using chunked transfer\n            encoding. Otherwise, urllib3 will send the body using the standard\n            content-length form. Defaults to False.\n    \n        :param int body_pos:\n            Position to seek to in file-like body in the event of a retry or\n            redirect. Typically this won\\'t need to be set because urllib3 will\n            auto-populate the value when needed.\n    \n        :param \\\\**response_kw:\n            Additional parameters are passed to\n            :meth:`urllib3.response.HTTPResponse.from_httplib`\n        \"\"\"\n    \n        parsed_url = parse_url(url)\n        destination_scheme = parsed_url.scheme\n    \n        if headers is None:\n            headers = self.headers\n    \n        if not isinstance(retries, Retry):\n            retries = Retry.from_int(retries, redirect=redirect, default=self.retries)\n    \n        if release_conn is None:\n            release_conn = response_kw.get(\"preload_content\", True)\n    \n        # Check host\n        if assert_same_host and not self.is_same_host(url):\n            raise HostChangedError(self, url, retries)\n    \n        # Ensure that the URL we\\'re connecting to is properly encoded\n        if url.startswith(\"/\"):\n            url = six.ensure_str(_encode_target(url))\n        else:\n            url = six.ensure_str(parsed_url.url)\n    \n        conn = None\n    \n        # Track whether `conn` needs to be released before\n        # returning/raising/recursing. Update this variable if necessary, and\n        # leave `release_conn` constant throughout the function. 
That way, if\n        # the function recurses, the original value of `release_conn` will be\n        # passed down into the recursive call, and its value will be respected.\n        #\n        # See issue #651 [1] for details.\n        #\n        # [1] <https://github.com/urllib3/urllib3/issues/651>\n        release_this_conn = release_conn\n    \n        http_tunnel_required = connection_requires_http_tunnel(\n            self.proxy, self.proxy_config, destination_scheme\n        )\n    \n        # Merge the proxy headers. Only done when not using HTTP CONNECT. We\n        # have to copy the headers dict so we can safely change it without those\n        # changes being reflected in anyone else\\'s copy.\n        if not http_tunnel_required:\n            headers = headers.copy()\n            headers.update(self.proxy_headers)\n    \n        # Must keep the exception bound to a separate variable or else Python 3\n        # complains about UnboundLocalError.\n        err = None\n    \n        # Keep track of whether we cleanly exited the except block. This\n        # ensures we do proper cleanup in finally.\n        clean_exit = False\n    \n        # Rewind body position, if needed. 
Record current position\n        # for future rewinds in the event of a redirect/retry.\n        body_pos = set_file_position(body, body_pos)\n    \n        try:\n            # Request a connection from the queue.\n            timeout_obj = self._get_timeout(timeout)\n            conn = self._get_conn(timeout=pool_timeout)\n    \n            conn.timeout = timeout_obj.connect_timeout\n    \n            is_new_proxy_conn = self.proxy is not None and not getattr(\n                conn, \"sock\", None\n            )\n            if is_new_proxy_conn and http_tunnel_required:\n                self._prepare_proxy(conn)\n    \n            # Make the request on the httplib connection object.\n>           httplib_response = self._make_request(\n                conn,\n                method,\n                url,\n                timeout=timeout_obj,\n                body=body,\n                headers=headers,\n                chunked=chunked,\n            )\n\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:716: \n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:404: in _make_request\n    self._validate_conn(conn)\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:1061: in _validate_conn\n    conn.connect()\nlib/python3.10/site-packages/_vendor/urllib3/connection.py:419: in connect\n    self.sock = ssl_wrap_socket(\nlib/python3.10/site-packages/_vendor/urllib3/util/ssl_.py:462: in ssl_wrap_socket\n    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)\nlib/python3.10/site-packages/_vendor/urllib3/util/ssl_.py:504: in _ssl_wrap_socket_impl\n    return ssl_context.wrap_socket(sock)\n/usr/lib/python3.10/ssl.py:513: in wrap_socket\n    return self.sslsocket_class._create(\n/usr/lib/python3.10/ssl.py:1100: in _create\n    self.do_handshake()\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \n\nself = <ssl.SSLSocket 
[closed] fd=-1, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6>\nblock = False\n\n    @_sslcopydoc\n    def do_handshake(self, block=False):\n        self._check_connected()\n        timeout = self.gettimeout()\n        try:\n            if timeout == 0.0 and block:\n                self.settimeout(None)\n>           self._sslobj.do_handshake()\nE           ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\n\n/usr/lib/python3.10/ssl.py:1371: SSLCertVerificationError\n\nDuring handling of the above exception, another exception occurred:\n\nappname = \\'sip-telemetry-loadgen\\', maxsize = \\'100mb\\', iterations = \\'2\\'\nparallel_processes = \\'5\\', number_of_pods = \\'1\\', number_of_jobs = \\'1\\'\n\n    def test_generate_load(\n        appname, maxsize, iterations, parallel_processes, number_of_pods, number_of_jobs\n    ):\n        for index in range(int(number_of_jobs)):\n>           run_job(appname, maxsize, iterations, number_of_pods, parallel_processes)\n\ntest_load_cpumemory.py:269: \n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \ntest_load_cpumemory.py:144: in run_job\n    create_namespace_if_not_exists()\ntest_load_cpumemory.py:106: in create_namespace_if_not_exists\n    v1.read_namespace(name=NAMESPACE)\nlib/python3.10/site-packages/kubernetes/client/api/core_v1_api.py:22992: in read_namespace\n    return self.read_namespace_with_http_info(name, **kwargs)  # noqa: E501\nlib/python3.10/site-packages/kubernetes/client/api/core_v1_api.py:23071: in read_namespace_with_http_info\n    return self.api_client.call_api(\nlib/python3.10/site-packages/kubernetes/client/api_client.py:348: in call_api\n    return self.__call_api(resource_path, method,\nlib/python3.10/site-packages/kubernetes/client/api_client.py:180: in __call_api\n    response_data = self.request(\nlib/python3.10/site-packages/kubernetes/client/api_client.py:373: in 
request\n    return self.rest_client.GET(url,\nlib/python3.10/site-packages/kubernetes/client/rest.py:244: in GET\n    return self.request(\"GET\", url,\nlib/python3.10/site-packages/kubernetes/client/rest.py:217: in request\n    r = self.pool_manager.request(method, url,\nlib/python3.10/site-packages/_vendor/urllib3/request.py:77: in request\n    return self.request_encode_url(\nlib/python3.10/site-packages/_vendor/urllib3/request.py:99: in request_encode_url\n    return self.urlopen(method, url, **extra_kw)\nlib/python3.10/site-packages/_vendor/urllib3/poolmanager.py:376: in urlopen\n    response = conn.urlopen(method, u.request_uri, **kw)\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:802: in urlopen\n    retries = retries.increment(\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \n\nself = Retry(total=0, connect=None, read=None, redirect=None, status=None)\nmethod = \\'GET\\', url = \\'/api/v1/namespaces/si-telemetry-loadgen\\', response = None\nerror = SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\n_pool = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7f620cb543a0>\n_stacktrace = <traceback object at 0x7f620cb4a380>\n\n    def increment(\n        self,\n        method=None,\n        url=None,\n        response=None,\n        error=None,\n        _pool=None,\n        _stacktrace=None,\n    ):\n        \"\"\"Return a new Retry object with incremented retry counters.\n    \n        :param response: A response object, or None, if the server did not\n            return a response.\n        :type 
response: :class:`~urllib3.response.HTTPResponse`\n        :param Exception error: An error encountered during the request, or\n            None if the response was received successfully.\n    \n        :return: A new ``Retry`` object.\n        \"\"\"\n        if self.total is False and error:\n            # Disabled, indicate to re-raise the error.\n            raise six.reraise(type(error), error, _stacktrace)\n    \n        total = self.total\n        if total is not None:\n            total -= 1\n    \n        connect = self.connect\n        read = self.read\n        redirect = self.redirect\n        status_count = self.status\n        other = self.other\n        cause = \"unknown\"\n        status = None\n        redirect_location = None\n    \n        if error and self._is_connection_error(error):\n            # Connect retry?\n            if connect is False:\n                raise six.reraise(type(error), error, _stacktrace)\n            elif connect is not None:\n                connect -= 1\n    \n        elif error and self._is_read_error(error):\n            # Read retry?\n            if read is False or not self._is_method_retryable(method):\n                raise six.reraise(type(error), error, _stacktrace)\n            elif read is not None:\n                read -= 1\n    \n        elif error:\n            # Other retry?\n            if other is not None:\n                other -= 1\n    \n        elif response and response.get_redirect_location():\n            # Redirect retry?\n            if redirect is not None:\n                redirect -= 1\n            cause = \"too many redirects\"\n            redirect_location = response.get_redirect_location()\n            status = response.status\n    \n        else:\n            # Incrementing because of a server error like a 500 in\n            # status_forcelist and the given method is in the allowed_methods\n            cause = ResponseError.GENERIC_ERROR\n            if response and 
response.status:\n                if status_count is not None:\n                    status_count -= 1\n                cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)\n                status = response.status\n    \n        history = self.history + (\n            RequestHistory(method, url, error, status, redirect_location),\n        )\n    \n        new_retry = self.new(\n            total=total,\n            connect=connect,\n            read=read,\n            redirect=redirect,\n            status=status_count,\n            other=other,\n            history=history,\n        )\n    \n        if new_retry.is_exhausted():\n>           raise MaxRetryError(_pool, url, error or ResponseError(cause))\nE           urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host=\\'192.168.20.90\\', port=6443): Max retries exceeded with url: /api/v1/namespaces/si-telemetry-loadgen (Caused by SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\')))\n\nlib/python3.10/site-packages/_vendor/urllib3/util/retry.py:594: MaxRetryError\nNoneType: None\n"
        }
    ],
    "temperature": 0,
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "adhocloganalysis",
            "strict": true,
            "schema": {
                "$defs": {
                    "LogInsightItem": {
                        "description": "Single insight item for adhoc log analysis.",
                        "properties": {
                            "root_cause": {
                                "anyOf": [
                                    {
                                        "type": "string"
                                    },
                                    {
                                        "type": "null"
                                    }
                                ],
                                "default": "None",
                                "title": "Root Cause"
                            },
                            "category": {
                                "title": "Category",
                                "type": "string"
                            },
                            "suggestion": {
                                "anyOf": [
                                    {
                                        "type": "string"
                                    },
                                    {
                                        "type": "null"
                                    }
                                ],
                                "default": "None",
                                "title": "Suggestion"
                            },
                            "summary": {
                                "title": "Summary",
                                "type": "string"
                            },
                            "confidence": {
                                "enum": [
                                    "high",
                                    "medium",
                                    "low"
                                ],
                                "title": "Confidence",
                                "type": "string"
                            },
                            "mode": {
                                "enum": [
                                    "failure_analysis",
                                    "summary"
                                ],
                                "title": "Mode",
                                "type": "string"
                            }
                        },
                        "required": [
                            "category",
                            "summary",
                            "confidence",
                            "mode"
                        ],
                        "title": "LogInsightItem",
                        "type": "object",
                        "additionalProperties": false
                    }
                },
                "description": "Wrapper for adhoc log analysis response with items array.",
                "properties": {
                    "items": {
                        "items": {
                            "$ref": "#/$defs/LogInsightItem"
                        },
                        "title": "Items",
                        "type": "array"
                    }
                },
                "required": [
                    "items"
                ],
                "title": "AdhocLogAnalysis",
                "type": "object",
                "additionalProperties": false
            }
        }
    }
}
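
The report attributes the timeout to the combination of a JSON template embedded in the system prompt and `response_format.json_schema`. One way to check that, sketched here under assumptions, is to send variants of the payload that each remove one of the two suspected triggers. The helper name `make_variants` and the replacement prompt `PLAIN_SYSTEM_PROMPT` are hypothetical and not part of the original report.

```python
import copy

# Hypothetical replacement system prompt with no embedded JSON template.
# This string is an assumption for illustration, not taken from the report.
PLAIN_SYSTEM_PROMPT = "You are an expert test automation engineer and log analyst."


def make_variants(payload: dict) -> dict:
    """Return named copies of the payload, each removing one suspected trigger."""
    # Variant 1: keep the prompt as-is, drop the structured-output constraint.
    no_response_format = copy.deepcopy(payload)
    no_response_format.pop("response_format", None)

    # Variant 2: keep response_format, replace the system prompt that embeds
    # the JSON template with a plain instruction.
    no_prompt_json = copy.deepcopy(payload)
    for message in no_prompt_json.get("messages", []):
        if message.get("role") == "system":
            message["content"] = PLAIN_SYSTEM_PROMPT

    return {
        "full": copy.deepcopy(payload),  # unchanged baseline
        "no_response_format": no_response_format,
        "no_prompt_json": no_prompt_json,
    }
```

Each variant can be written to its own file with `json.dump` and passed to the same curl command via `-d @<file>`. If only the `full` variant times out, that supports the combination described in the title.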

---

curl -k -X POST "$API_SERVER" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d @curl_payload1_exact.json \
  -w "\nTotal time: %{time_total}s\n"
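
For scripted reproduction, the curl call above can be approximated in Python with only the standard library. This is a minimal sketch, not the reporter's script: the function names `build_request` and `timed_post` and the 120-second default are assumptions, while `API_SERVER`, `API_KEY`, and `curl_payload1_exact.json` come from the command above.

```python
import json
import ssl
import time
import urllib.error
import urllib.request


def build_request(api_server: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build the same POST that the curl command sends."""
    return urllib.request.Request(
        api_server,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def timed_post(request: urllib.request.Request, timeout_s: float = 120.0):
    """Send the request and return (outcome, elapsed_seconds)."""
    # Equivalent of curl -k: skip certificate verification (test setups only).
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    start = time.monotonic()
    try:
        # Note: the timeout applies to each blocking socket operation,
        # not to the request as a whole.
        with urllib.request.urlopen(request, timeout=timeout_s, context=context) as response:
            response.read()
            outcome = f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        outcome = f"HTTP {exc.code}"
    except (urllib.error.URLError, TimeoutError) as exc:
        outcome = f"failed: {exc}"
    return outcome, time.monotonic() - start
```

A typical call would load `curl_payload1_exact.json` with `json.load`, read `API_SERVER` and `API_KEY` from `os.environ`, and print the tuple returned by `timed_post(build_request(...))`, which plays the role of curl's `Total time` line.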


Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>

Collecting environment information...
==============================
        System Info
==============================
OS                           : Ubuntu 22.04.5 LTS (x86_64)
GCC version                  : (Ubuntu 11.4.0-1ubuntu1~22.04.3) 11.4.0
Clang version                : Could not collect
CMake version                : Could not collect
Libc version                 : glibc-2.35

==============================
       PyTorch Info
==============================
PyTorch version              : 2.11.0+cu130
Is debug build               : False
CUDA used to build PyTorch   : 13.0
ROCM used to build PyTorch   : N/A
XPU used to build PyTorch    : N/A

==============================
      Python Environment
==============================
Python version               : 3.12.13 (main, Mar  4 2026, 09:23:07) [GCC 11.4.0] (64-bit runtime)
Python platform              : Linux-5.14.0-554.el9.x86_64-x86_64-with-glibc2.35

==============================
       CUDA / GPU Info
==============================
Is CUDA available            : True
CUDA runtime version         : 13.0.88
CUDA_MODULE_LOADING set to   :
GPU models and configuration :
GPU 0: NVIDIA A100 80GB PCIe
GPU 1: NVIDIA A100 80GB PCIe

Nvidia driver version        : 580.105.08
cuDNN version                : Could not collect
HIP runtime version          : N/A
MIOpen runtime version       : N/A
Is XNNPACK available         : True

==============================
          CPU Info
==============================
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        46 bits physical, 57 bits virtual
Byte Order:                           Little Endian
CPU(s):                               128
On-line CPU(s) list:                  0-127
Vendor ID:                            GenuineIntel
Model name:                           Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz
CPU family:                           6
Model:                                106
Thread(s) per core:                   2
Core(s) per socket:                   32
Socket(s):                            2
Stepping:                             6
BogoMIPS:                             4000.00
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pts vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid fsrm md_clear pconfig flush_l1d arch_capabilities
Virtualization:                       VT-x
L1d cache:                            3 MiB (64 instances)
L1i cache:                            2 MiB (64 instances)
L2 cache:                             80 MiB (64 instances)
L3 cache:                             96 MiB (2 instances)
NUMA node(s):                         2
NUMA node0 CPU(s):                    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110,112,114,116,118,120,122,124,126
NUMA node1 CPU(s):                    1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,97,99,101,103,105,107,109,111,113,115,117,119,121,123,125,127
Vulnerability Gather data sampling:   Mitigation; Microcode
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Not affected
Vulnerability Spec rstack overflow:   Not affected
Vulnerability Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:             Mitigation; Enhanced / Automatic IBRS; IBPB conditional; RSB filling; PBRSB-eIBRS SW sequence; BHI SW loop, KVM SW loop
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

==============================
Versions of relevant libraries
==============================
[pip3] flashinfer-python==0.6.8.post1
[pip3] numpy==2.2.6
[pip3] nvidia-cublas==13.1.0.3
[pip3] nvidia-cuda-cupti==13.0.85
[pip3] nvidia-cuda-nvrtc==13.0.88
[pip3] nvidia-cuda-runtime==13.0.96
[pip3] nvidia-cudnn-cu13==9.19.0.56
[pip3] nvidia-cudnn-frontend==1.18.0
[pip3] nvidia-cufft==12.0.0.61
[pip3] nvidia-cufile==1.15.1.6
[pip3] nvidia-curand==10.4.0.35
[pip3] nvidia-cusolver==12.0.4.66
[pip3] nvidia-cusparse==12.6.3.3
[pip3] nvidia-cusparselt-cu13==0.8.0
[pip3] nvidia-cutlass-dsl==4.5.0
[pip3] nvidia-cutlass-dsl-libs-base==4.5.0
[pip3] nvidia-ml-py==13.595.45
[pip3] nvidia-nccl-cu13==2.28.9
[pip3] nvidia-nvjitlink==13.0.88
[pip3] nvidia-nvshmem-cu13==3.4.5
[pip3] nvidia-nvtx==13.0.85
[pip3] pyzmq==27.1.0
[pip3] torch==2.11.0+cu130
[pip3] torch_c_dlpack_ext==0.1.5
[pip3] torchaudio==2.11.0+cu130
[pip3] torchvision==0.26.0+cu130
[pip3] transformers==5.8.0
[pip3] triton==3.6.0
[conda] Could not collect

==============================
         vLLM Info
==============================
ROCM Version                 : Could not collect
vLLM Version                 : 0.20.2rc1.dev93+g51f22dcfd (git sha: 51f22dcfd)
vLLM Build Flags:
  CUDA Archs: 7.5 8.0 8.6 8.9 9.0 10.0 12.0+PTX; ROCm: Disabled; XPU: Disabled
GPU Topology:
        GPU0    GPU1    NIC0    NIC1    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      SYS     SYS     SYS     0,2,4,6,8,10    0               N/A
GPU1    SYS      X      NODE    NODE    1,3,5,7,9,11    1               N/A
NIC0    SYS     NODE     X      PIX
NIC1    SYS     NODE    PIX      X

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0
  NIC1: mlx5_1

==============================
     Environment Variables
==============================
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_REQUIRE_CUDA=cuda>=13.0 brand=unknown,driver>=535,driver<536 brand=grid,driver>=535,driver<536 brand=tesla,driver>=535,driver<536 brand=nvidia,driver>=535,driver<536 brand=quadro,driver>=535,driver<536 brand=quadrortx,driver>=535,driver<536 brand=nvidiartx,driver>=535,driver<536 brand=vapps,driver>=535,driver<536 brand=vpc,driver>=535,driver<536 brand=vcs,driver>=535,driver<536 brand=vws,driver>=535,driver<536 brand=cloudgaming,driver>=535,driver<536 brand=unknown,driver>=550,driver<551 brand=grid,driver>=550,driver<551 brand=tesla,driver>=550,driver<551 brand=nvidia,driver>=550,driver<551 brand=quadro,driver>=550,driver<551 brand=quadrortx,driver>=550,driver<551 brand=nvidiartx,driver>=550,driver<551 brand=vapps,driver>=550,driver<551 brand=vpc,driver>=550,driver<551 brand=vcs,driver>=550,driver<551 brand=vws,driver>=550,driver<551 brand=cloudgaming,driver>=550,driver<551 brand=unknown,driver>=565,driver<566 brand=grid,driver>=565,driver<566 brand=tesla,driver>=565,driver<566 brand=nvidia,driver>=565,driver<566 brand=quadro,driver>=565,driver<566 brand=quadrortx,driver>=565,driver<566 brand=nvidiartx,driver>=565,driver<566 brand=vapps,driver>=565,driver<566 brand=vpc,driver>=565,driver<566 brand=vcs,driver>=565,driver<566 brand=vws,driver>=565,driver<566 brand=cloudgaming,driver>=565,driver<566 brand=unknown,driver>=570,driver<571 brand=grid,driver>=570,driver<571 brand=tesla,driver>=570,driver<571 brand=nvidia,driver>=570,driver<571 brand=quadro,driver>=570,driver<571 brand=quadrortx,driver>=570,driver<571 brand=nvidiartx,driver>=570,driver<571 brand=vapps,driver>=570,driver<571 brand=vpc,driver>=570,driver<571 brand=vcs,driver>=570,driver<571 brand=vws,driver>=570,driver<571 brand=cloudgaming,driver>=570,driver<571 brand=unknown,driver>=575,driver<576 brand=grid,driver>=575,driver<576 brand=tesla,driver>=575,driver<576 brand=nvidia,driver>=575,driver<576 brand=quadro,driver>=575,driver<576 brand=quadrortx,driver>=575,driver<576 
brand=nvidiartx,driver>=575,driver<576 brand=vapps,driver>=575,driver<576 brand=vpc,driver>=575,driver<576 brand=vcs,driver>=575,driver<576 brand=vws,driver>=575,driver<576 brand=cloudgaming,driver>=575,driver<576
TORCH_CUDA_ARCH_LIST=7.5 8.0 8.6 8.9 9.0 10.0 12.0+PTX
NVIDIA_DRIVER_CAPABILITIES=compute,utility
VLLM_USAGE_SOURCE=production-docker-image
CUDA_VERSION=13.0.2
VLLM_ENABLE_CUDA_COMPATIBILITY=0
VLLM_DO_NOT_TRACK=1
LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64
VLLM_NO_USAGE_STATS=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
TORCHINDUCTOR_CACHE_DIR=/tmp/torchinductor_root

</details>

🐛 Describe the bug

<details> <summary>error payload and script to run it</summary>

{
    "model": "gemma4-31b",
    "messages": [
        {
            "role": "system",
            "content": "You are an expert test automation engineer and log analyst.\n\nA user has manually selected specific log lines from a test section and is requesting your analysis.\nFirst, determine whether the logs contain errors, failures, or anomalies.\n\n- If the logs indicate one or more issues, set \"mode\" to \"failure_analysis\" and provide root cause analysis.\n- If the logs are normal/informational with no apparent issues, set \"mode\" to \"summary\" and provide a descriptive summary.\n\nRespond in valid JSON with a top-level \"items\" array.\n\nFor **failure_analysis** mode:\n{\n  \"items\": [\n    {\n      \"mode\": \"failure_analysis\",\n      \"root_cause\": \"A concise description of the root cause of the failure.\",\n      \"category\": \"One of: script_error | environment_issue | infrastructure_failure | test_data_issue | timeout | unknown\",\n      \"suggestion\": \"Actionable steps to fix or investigate the failure (2-5 bullet points as a single string, separated by newlines).\",\n      \"summary\": \"A one-line summary suitable for display in a dashboard.\",\n      \"confidence\": \"high | medium | low\"\n    }\n  ]\n}\n\nFor **summary** mode (no issues detected):\n{\n  \"items\": [\n    {\n      \"mode\": \"summary\",\n      \"summary\": \"A concise description of what is happening in the selected logs.\",\n      \"category\": \"informational\",\n      \"confidence\": \"high\"\n    }\n  ]\n}\n\nGuidelines:\n- Each element of the array must describe one distinct issue or observation.\n- For failure_analysis mode: focus on the most likely root cause based on the provided log lines.\n- For summary mode: describe the sequence of operations, key events, or state transitions visible in the logs.\n- The user selected these specific lines because they believe they are relevant; give them priority.\n- Distinguish between scripting errors (wrong assertions, missing setup) and genuine product bugs.\n- If the logs indicate an environment or 
infrastructure issue (network timeout, resource unavailable, container crash), flag it clearly.\n- Keep the summary short enough for a UI badge or tooltip.\n- If you cannot determine the root cause, set category to \"unknown\" and confidence to \"low\", and suggest investigation steps."
        },
        {
            "role": "user",
            "content": "## Section\ntest_generate_load\n\n## Selected Log Lines\ntestcase started\nRetrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nRetrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nRetrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by \\'SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\\': /api/v1/namespaces/si-telemetry-loadgen\nself = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7f620cb543a0>\nmethod = \\'GET\\', url = \\'/api/v1/namespaces/si-telemetry-loadgen\\', body = None\nheaders = {\\'Accept\\': \\'application/json\\', \\'Content-Type\\': \\'application/json\\', \\'User-Agent\\': \\'OpenAPI-Generator/33.1.0/python\\'}\nretries = Retry(total=0, connect=None, read=None, redirect=None, status=None)\nredirect = False, assert_same_host = False, timeout = None, pool_timeout = None\nrelease_conn = True, chunked = False, body_pos = None\nresponse_kw = {\\'preload_content\\': True, \\'request_url\\': \\'https://192.168.20.90:6443/api/v1/namespaces/si-telemetry-loadgen\\'}\nparsed_url = Url(scheme=None, auth=None, host=None, port=None, path=\\'/api/v1/namespaces/si-telemetry-loadgen\\', query=None, fragment=None)\ndestination_scheme = None, conn = None, release_this_conn = True\nhttp_tunnel_required = False, err = None, clean_exit = False\n\n    def urlopen(\n        self,\n        method,\n        url,\n      
  body=None,\n        headers=None,\n        retries=None,\n        redirect=True,\n        assert_same_host=True,\n        timeout=_Default,\n        pool_timeout=None,\n        release_conn=None,\n        chunked=False,\n        body_pos=None,\n        **response_kw\n    ):\n        \"\"\"\n        Get a connection from the pool and perform an HTTP request. This is the\n        lowest level call for making a request, so you\\'ll need to specify all\n        the raw details.\n    \n        .. note::\n    \n           More commonly, it\\'s appropriate to use a convenience method provided\n           by :class:`.RequestMethods`, such as :meth:`request`.\n    \n        .. note::\n    \n           `release_conn` will only behave as expected if\n           `preload_content=False` because we want to make\n           `preload_content=False` the default behaviour someday soon without\n           breaking backwards compatibility.\n    \n        :param method:\n            HTTP request method (such as GET, POST, PUT, etc.)\n    \n        :param url:\n            The URL to perform the request on.\n    \n        :param body:\n            Data to send in the request body, either :class:`str`, :class:`bytes`,\n            an iterable of :class:`str`/:class:`bytes`, or a file-like object.\n    \n        :param headers:\n            Dictionary of custom headers to send, such as User-Agent,\n            If-None-Match, etc. If None, pool headers are used. If provided,\n            these headers completely replace any pool-specific headers.\n    \n        :param retries:\n            Configure the number of retries to allow before raising a\n            :class:`~urllib3.exceptions.MaxRetryError` exception.\n    \n            Pass ``None`` to retry until you receive a response. 
Pass a\n            :class:`~urllib3.util.retry.Retry` object for fine-grained control\n            over different types of retries.\n            Pass an integer number to retry connection errors that many times,\n            but no other types of errors. Pass zero to never retry.\n    \n            If ``False``, then retries are disabled and any exception is raised\n            immediately. Also, instead of raising a MaxRetryError on redirects,\n            the redirect response will be returned.\n    \n        :type retries: :class:`~urllib3.util.retry.Retry`, False, or an int.\n    \n        :param redirect:\n            If True, automatically handle redirects (status codes 301, 302,\n            303, 307, 308). Each redirect counts as a retry. Disabling retries\n            will disable redirect, too.\n    \n        :param assert_same_host:\n            If ``True``, will make sure that the host of the pool requests is\n            consistent else will raise HostChangedError. When ``False``, you can\n            use the pool on an HTTP proxy and request foreign hosts.\n    \n        :param timeout:\n            If specified, overrides the default timeout for this one\n            request. It may be a float (in seconds) or an instance of\n            :class:`urllib3.util.Timeout`.\n    \n        :param pool_timeout:\n            If set and the pool is set to block=True, then this method will\n            block for ``pool_timeout`` seconds and raise EmptyPoolError if no\n            connection is available within the time period.\n    \n        :param release_conn:\n            If False, then the urlopen call will not release the connection\n            back into the pool once a response is received (but will release if\n            you read the entire contents of the response such as when\n            `preload_content=True`). This is useful if you\\'re not preloading\n            the response\\'s content immediately. 
You will need to call\n            ``r.release_conn()`` on the response ``r`` to return the connection\n            back into the pool. If None, it takes the value of\n            ``response_kw.get(\\'preload_content\\', True)``.\n    \n        :param chunked:\n            If True, urllib3 will send the body using chunked transfer\n            encoding. Otherwise, urllib3 will send the body using the standard\n            content-length form. Defaults to False.\n    \n        :param int body_pos:\n            Position to seek to in file-like body in the event of a retry or\n            redirect. Typically this won\\'t need to be set because urllib3 will\n            auto-populate the value when needed.\n    \n        :param \\\\**response_kw:\n            Additional parameters are passed to\n            :meth:`urllib3.response.HTTPResponse.from_httplib`\n        \"\"\"\n    \n        parsed_url = parse_url(url)\n        destination_scheme = parsed_url.scheme\n    \n        if headers is None:\n            headers = self.headers\n    \n        if not isinstance(retries, Retry):\n            retries = Retry.from_int(retries, redirect=redirect, default=self.retries)\n    \n        if release_conn is None:\n            release_conn = response_kw.get(\"preload_content\", True)\n    \n        # Check host\n        if assert_same_host and not self.is_same_host(url):\n            raise HostChangedError(self, url, retries)\n    \n        # Ensure that the URL we\\'re connecting to is properly encoded\n        if url.startswith(\"/\"):\n            url = six.ensure_str(_encode_target(url))\n        else:\n            url = six.ensure_str(parsed_url.url)\n    \n        conn = None\n    \n        # Track whether `conn` needs to be released before\n        # returning/raising/recursing. Update this variable if necessary, and\n        # leave `release_conn` constant throughout the function. 
That way, if\n        # the function recurses, the original value of `release_conn` will be\n        # passed down into the recursive call, and its value will be respected.\n        #\n        # See issue #651 [1] for details.\n        #\n        # [1] <https://github.com/urllib3/urllib3/issues/651>\n        release_this_conn = release_conn\n    \n        http_tunnel_required = connection_requires_http_tunnel(\n            self.proxy, self.proxy_config, destination_scheme\n        )\n    \n        # Merge the proxy headers. Only done when not using HTTP CONNECT. We\n        # have to copy the headers dict so we can safely change it without those\n        # changes being reflected in anyone else\\'s copy.\n        if not http_tunnel_required:\n            headers = headers.copy()\n            headers.update(self.proxy_headers)\n    \n        # Must keep the exception bound to a separate variable or else Python 3\n        # complains about UnboundLocalError.\n        err = None\n    \n        # Keep track of whether we cleanly exited the except block. This\n        # ensures we do proper cleanup in finally.\n        clean_exit = False\n    \n        # Rewind body position, if needed. 
Record current position\n        # for future rewinds in the event of a redirect/retry.\n        body_pos = set_file_position(body, body_pos)\n    \n        try:\n            # Request a connection from the queue.\n            timeout_obj = self._get_timeout(timeout)\n            conn = self._get_conn(timeout=pool_timeout)\n    \n            conn.timeout = timeout_obj.connect_timeout\n    \n            is_new_proxy_conn = self.proxy is not None and not getattr(\n                conn, \"sock\", None\n            )\n            if is_new_proxy_conn and http_tunnel_required:\n                self._prepare_proxy(conn)\n    \n            # Make the request on the httplib connection object.\n>           httplib_response = self._make_request(\n                conn,\n                method,\n                url,\n                timeout=timeout_obj,\n                body=body,\n                headers=headers,\n                chunked=chunked,\n            )\n\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:716: \n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:404: in _make_request\n    self._validate_conn(conn)\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:1061: in _validate_conn\n    conn.connect()\nlib/python3.10/site-packages/_vendor/urllib3/connection.py:419: in connect\n    self.sock = ssl_wrap_socket(\nlib/python3.10/site-packages/_vendor/urllib3/util/ssl_.py:462: in ssl_wrap_socket\n    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)\nlib/python3.10/site-packages/_vendor/urllib3/util/ssl_.py:504: in _ssl_wrap_socket_impl\n    return ssl_context.wrap_socket(sock)\n/usr/lib/python3.10/ssl.py:513: in wrap_socket\n    return self.sslsocket_class._create(\n/usr/lib/python3.10/ssl.py:1100: in _create\n    self.do_handshake()\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \n\nself = <ssl.SSLSocket 
[closed] fd=-1, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6>\nblock = False\n\n    @_sslcopydoc\n    def do_handshake(self, block=False):\n        self._check_connected()\n        timeout = self.gettimeout()\n        try:\n            if timeout == 0.0 and block:\n                self.settimeout(None)\n>           self._sslobj.do_handshake()\nE           ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\n\n/usr/lib/python3.10/ssl.py:1371: SSLCertVerificationError\n\nDuring handling of the above exception, another exception occurred:\n\nappname = \\'sip-telemetry-loadgen\\', maxsize = \\'100mb\\', iterations = \\'2\\'\nparallel_processes = \\'5\\', number_of_pods = \\'1\\', number_of_jobs = \\'1\\'\n\n    def test_generate_load(\n        appname, maxsize, iterations, parallel_processes, number_of_pods, number_of_jobs\n    ):\n        for index in range(int(number_of_jobs)):\n>           run_job(appname, maxsize, iterations, number_of_pods, parallel_processes)\n\ntest_load_cpumemory.py:269: \n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \ntest_load_cpumemory.py:144: in run_job\n    create_namespace_if_not_exists()\ntest_load_cpumemory.py:106: in create_namespace_if_not_exists\n    v1.read_namespace(name=NAMESPACE)\nlib/python3.10/site-packages/kubernetes/client/api/core_v1_api.py:22992: in read_namespace\n    return self.read_namespace_with_http_info(name, **kwargs)  # noqa: E501\nlib/python3.10/site-packages/kubernetes/client/api/core_v1_api.py:23071: in read_namespace_with_http_info\n    return self.api_client.call_api(\nlib/python3.10/site-packages/kubernetes/client/api_client.py:348: in call_api\n    return self.__call_api(resource_path, method,\nlib/python3.10/site-packages/kubernetes/client/api_client.py:180: in __call_api\n    response_data = self.request(\nlib/python3.10/site-packages/kubernetes/client/api_client.py:373: in 
request\n    return self.rest_client.GET(url,\nlib/python3.10/site-packages/kubernetes/client/rest.py:244: in GET\n    return self.request(\"GET\", url,\nlib/python3.10/site-packages/kubernetes/client/rest.py:217: in request\n    r = self.pool_manager.request(method, url,\nlib/python3.10/site-packages/_vendor/urllib3/request.py:77: in request\n    return self.request_encode_url(\nlib/python3.10/site-packages/_vendor/urllib3/request.py:99: in request_encode_url\n    return self.urlopen(method, url, **extra_kw)\nlib/python3.10/site-packages/_vendor/urllib3/poolmanager.py:376: in urlopen\n    response = conn.urlopen(method, u.request_uri, **kw)\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:830: in urlopen\n    return self.urlopen(\nlib/python3.10/site-packages/_vendor/urllib3/connectionpool.py:802: in urlopen\n    retries = retries.increment(\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \n\nself = Retry(total=0, connect=None, read=None, redirect=None, status=None)\nmethod = \\'GET\\', url = \\'/api/v1/namespaces/si-telemetry-loadgen\\', response = None\nerror = SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\'))\n_pool = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7f620cb543a0>\n_stacktrace = <traceback object at 0x7f620cb4a380>\n\n    def increment(\n        self,\n        method=None,\n        url=None,\n        response=None,\n        error=None,\n        _pool=None,\n        _stacktrace=None,\n    ):\n        \"\"\"Return a new Retry object with incremented retry counters.\n    \n        :param response: A response object, or None, if the server did not\n            return a response.\n        :type 
response: :class:`~urllib3.response.HTTPResponse`\n        :param Exception error: An error encountered during the request, or\n            None if the response was received successfully.\n    \n        :return: A new ``Retry`` object.\n        \"\"\"\n        if self.total is False and error:\n            # Disabled, indicate to re-raise the error.\n            raise six.reraise(type(error), error, _stacktrace)\n    \n        total = self.total\n        if total is not None:\n            total -= 1\n    \n        connect = self.connect\n        read = self.read\n        redirect = self.redirect\n        status_count = self.status\n        other = self.other\n        cause = \"unknown\"\n        status = None\n        redirect_location = None\n    \n        if error and self._is_connection_error(error):\n            # Connect retry?\n            if connect is False:\n                raise six.reraise(type(error), error, _stacktrace)\n            elif connect is not None:\n                connect -= 1\n    \n        elif error and self._is_read_error(error):\n            # Read retry?\n            if read is False or not self._is_method_retryable(method):\n                raise six.reraise(type(error), error, _stacktrace)\n            elif read is not None:\n                read -= 1\n    \n        elif error:\n            # Other retry?\n            if other is not None:\n                other -= 1\n    \n        elif response and response.get_redirect_location():\n            # Redirect retry?\n            if redirect is not None:\n                redirect -= 1\n            cause = \"too many redirects\"\n            redirect_location = response.get_redirect_location()\n            status = response.status\n    \n        else:\n            # Incrementing because of a server error like a 500 in\n            # status_forcelist and the given method is in the allowed_methods\n            cause = ResponseError.GENERIC_ERROR\n            if response and 
response.status:\n                if status_count is not None:\n                    status_count -= 1\n                cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)\n                status = response.status\n    \n        history = self.history + (\n            RequestHistory(method, url, error, status, redirect_location),\n        )\n    \n        new_retry = self.new(\n            total=total,\n            connect=connect,\n            read=read,\n            redirect=redirect,\n            status=status_count,\n            other=other,\n            history=history,\n        )\n    \n        if new_retry.is_exhausted():\n>           raise MaxRetryError(_pool, url, error or ResponseError(cause))\nE           urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host=\\'192.168.20.90\\', port=6443): Max retries exceeded with url: /api/v1/namespaces/si-telemetry-loadgen (Caused by SSLError(SSLCertVerificationError(1, \\'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)\\')))\n\nlib/python3.10/site-packages/_vendor/urllib3/util/retry.py:594: MaxRetryError\nNoneType: None\n"
        }
    ],
    "temperature": 0,
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "adhocloganalysis",
            "strict": true,
            "schema": {
                "$defs": {
                    "LogInsightItem": {
                        "description": "Single insight item for adhoc log analysis.",
                        "properties": {
                            "root_cause": {
                                "anyOf": [
                                    {
                                        "type": "string"
                                    },
                                    {
                                        "type": "null"
                                    }
                                ],
                                "default": "None",
                                "title": "Root Cause"
                            },
                            "category": {
                                "title": "Category",
                                "type": "string"
                            },
                            "suggestion": {
                                "anyOf": [
                                    {
                                        "type": "string"
                                    },
                                    {
                                        "type": "null"
                                    }
                                ],
                                "default": "None",
                                "title": "Suggestion"
                            },
                            "summary": {
                                "title": "Summary",
                                "type": "string"
                            },
                            "confidence": {
                                "enum": [
                                    "high",
                                    "medium",
                                    "low"
                                ],
                                "title": "Confidence",
                                "type": "string"
                            },
                            "mode": {
                                "enum": [
                                    "failure_analysis",
                                    "summary"
                                ],
                                "title": "Mode",
                                "type": "string"
                            }
                        },
                        "required": [
                            "category",
                            "summary",
                            "confidence",
                            "mode"
                        ],
                        "title": "LogInsightItem",
                        "type": "object",
                        "additionalProperties": false
                    }
                },
                "description": "Wrapper for adhoc log analysis response with items array.",
                "properties": {
                    "items": {
                        "items": {
                            "$ref": "#/$defs/LogInsightItem"
                        },
                        "title": "Items",
                        "type": "array"
                    }
                },
                "required": [
                    "items"
                ],
                "title": "AdhocLogAnalysis",
                "type": "object",
                "additionalProperties": false
            }
        }
    }
}

curl -k -X POST "$API_SERVER" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d @curl_payload1_exact.json \
  -w "\nTotal time: %{time_total}s\n"

</details>
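
For anyone who prefers to script the reproduction, the curl call above can be approximated in Python with an explicit client-side timeout. This is only a sketch using the standard library: the helper names (`build_request`, `tls_context`, `post_with_timeout`) are made up for illustration, the 360-second default simply mirrors the ">6min" timeout mentioned in this report, and `insecure=True` is meant to imitate curl's `-k` flag.

```python
import json
import socket
import ssl
import urllib.error
import urllib.request


def build_request(api_server: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build the same POST that the curl command sends."""
    return urllib.request.Request(
        api_server,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def tls_context(insecure: bool) -> ssl.SSLContext:
    """Default TLS context; insecure=True skips verification like `curl -k`."""
    ctx = ssl.create_default_context()
    if insecure:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def post_with_timeout(req, timeout_s: float = 360.0, insecure: bool = True):
    """Return (status, body), or ("timeout", None) if no response arrives in time."""
    try:
        with urllib.request.urlopen(
            req, timeout=timeout_s, context=tls_context(insecure)
        ) as resp:
            return resp.status, resp.read().decode("utf-8")
    except (socket.timeout, TimeoutError):
        # Raised while waiting for or reading the response.
        return "timeout", None
    except urllib.error.URLError as exc:
        # Connection-phase timeouts are wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout", None
        raise
```

To use it, load `curl_payload1_exact.json` with `json.load`, pass the result to `build_request` together with the values of `$API_SERVER` and `$API_KEY`, and hand the request to `post_with_timeout`. A `"timeout"` result corresponds to the hang described in this report.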

Sending the above payload to vLLM reproduces the problem: the request hangs for more than 6 minutes (our timeout value).

If we tweak the request and remove the JSON requirement from the system prompt, it works just fine.

The EXACT same failing JSON sent to gpt-oss-120b does not deadlock like that, which leads me to think there might be an issue in the parser for gemma4.
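
To narrow this down further, it may help to generate variants of the failing payload that each remove one suspect. The sketch below is only an illustration: `make_variants` is a hypothetical helper, the shortened system prompt is a stand-in (the exact edited prompt is not given in this report), and the `max_tokens` cap of 512 is an arbitrary value added on the assumption that it can help tell a true hang apart from a very long generation.

```python
import copy


def make_variants(payload: dict) -> dict:
    """Return named deep copies of the payload, each changing one thing."""
    # Variant 1: keep the prompt, drop the structured-output constraint.
    no_schema = copy.deepcopy(payload)
    no_schema.pop("response_format", None)

    # Variant 2: keep response_format, remove the JSON instructions from
    # the system prompt (placeholder text, not the reporter's actual edit).
    no_prompt_json = copy.deepcopy(payload)
    for msg in no_prompt_json.get("messages", []):
        if msg.get("role") == "system":
            msg["content"] = (
                "You are an expert test automation engineer and log analyst."
            )

    # Variant 3: unchanged request, but with a cap on generated tokens
    # (assumed value; adjust as needed).
    capped = copy.deepcopy(payload)
    capped["max_tokens"] = 512

    return {
        "original": copy.deepcopy(payload),
        "no_response_format": no_schema,
        "no_json_in_system_prompt": no_prompt_json,
        "capped_max_tokens": capped,
    }
```

Sending each variant with the same timeout and noting which ones return would show whether the hang needs both the in-prompt JSON instructions and `response_format.json_schema`, as this report suggests.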

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

