transformers - ✅(Solved) Fix AttributeError: 'Qwen3_5Config' object has no attribute 'num_attention_heads' [2 pull requests, 3 comments, 4 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
huggingface/transformers#44322Fetched 2026-04-08 00:29:10
View on GitHub
Comments
3
Participants
4
Timeline
23
Reactions
1
Author
Assignees
Timeline (top)
subscribed ×8mentioned ×6commented ×3cross-referenced ×3

Error Message

2026-02-27T18:44:43.002681005+08:00 stdout F INFO: ::1:46248 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error 2026-02-27T18:44:43.002993914+08:00 stderr F Error in generation loop: 'Qwen3_5Config' object has no attribute 'num_attention_heads' 2026-02-27T18:44:43.002996802+08:00 stderr F Traceback (most recent call last): 2026-02-27T18:44:43.007934561+08:00 stderr F ERROR: Exception in ASGI application 2026-02-27T18:44:43.00795328+08:00 stderr F + Exception Group Traceback (most recent call last): 2026-02-27T18:44:43.007967316+08:00 stderr F | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception) 2026-02-27T18:44:43.007970393+08:00 stderr F | Traceback (most recent call last): 2026-02-27T18:44:43.008000468+08:00 stderr F | self.gen.throw(typ, value, traceback) 2026-02-27T18:44:43.008156224+08:00 stderr F During handling of the above exception, another exception occurred: 2026-02-27T18:44:43.008159094+08:00 stderr F Traceback (most recent call last): 2026-02-27T18:44:43.008195609+08:00 stderr F self.gen.throw(typ, value, traceback)

Fix Action

Fix / Workaround

  • use transformers serve-cli to deploy transformers serve --force-model Qwen/Qwen3.5-27B --port 9016 --continuous-batching
  • logs
2026-02-27T18:44:43.002681005+08:00 stdout F INFO:     ::1:46248 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
2026-02-27T18:44:43.002993914+08:00 stderr F Error in generation loop: 'Qwen3_5Config' object has no attribute 'num_attention_heads'
2026-02-27T18:44:43.002996802+08:00 stderr F Traceback (most recent call last):
2026-02-27T18:44:43.002998847+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/generation/continuous_batching/continuous_api.py", line 799, in _run_generation_loop
2026-02-27T18:44:43.003000958+08:00 stderr F     paged_attention_cache = PagedAttentionCache(
2026-02-27T18:44:43.003002582+08:00 stderr F                             ^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.003004339+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/generation/continuous_batching/cache.py", line 144, in __init__
2026-02-27T18:44:43.003006276+08:00 stderr F     self.num_key_value_heads: int = kv_heads if kv_heads is not None else config.num_attention_heads
2026-02-27T18:44:43.003007921+08:00 stderr F                                                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.003009773+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/configuration_utils.py", line 164, in __getattribute__
2026-02-27T18:44:43.003011403+08:00 stderr F     return super().__getattribute__(key)
2026-02-27T18:44:43.003013125+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.003016089+08:00 stderr F AttributeError: 'Qwen3_5Config' object has no attribute 'num_attention_heads'
2026-02-27T18:44:43.007934561+08:00 stderr F ERROR:    Exception in ASGI application
2026-02-27T18:44:43.00795328+08:00 stderr F   + Exception Group Traceback (most recent call last):
2026-02-27T18:44:43.007956234+08:00 stderr F   |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_utils.py", line 76, in collapse_excgroups
2026-02-27T18:44:43.007958168+08:00 stderr F   |     yield
2026-02-27T18:44:43.007960098+08:00 stderr F   |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 186, in __call__
2026-02-27T18:44:43.007961889+08:00 stderr F   |     async with anyio.create_task_group() as task_group:
2026-02-27T18:44:43.007963958+08:00 stderr F   |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 763, in __aexit__
2026-02-27T18:44:43.007965588+08:00 stderr F   |     raise BaseExceptionGroup(
2026-02-27T18:44:43.007967316+08:00 stderr F   | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
2026-02-27T18:44:43.007968864+08:00 stderr F   +-+---------------- 1 ----------------
2026-02-27T18:44:43.007970393+08:00 stderr F     | Traceback (most recent call last):
2026-02-27T18:44:43.007972218+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
2026-02-27T18:44:43.007973802+08:00 stderr F     |     result = await app(  # type: ignore[func-returns-value]
2026-02-27T18:44:43.007975466+08:00 stderr F     |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.007977074+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2026-02-27T18:44:43.007978615+08:00 stderr F     |     return await self.app(scope, receive, send)
2026-02-27T18:44:43.007980263+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.007982039+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
2026-02-27T18:44:43.007983675+08:00 stderr F     |     await super().__call__(scope, receive, send)
2026-02-27T18:44:43.007985308+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
2026-02-27T18:44:43.007986948+08:00 stderr F     |     await self.middleware_stack(scope, receive, send)
2026-02-27T18:44:43.007988502+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 187, in __call__
2026-02-27T18:44:43.007990248+08:00 stderr F     |     raise exc
2026-02-27T18:44:43.007991832+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
2026-02-27T18:44:43.007993575+08:00 stderr F     |     await self.app(scope, receive, _send)
2026-02-27T18:44:43.007995117+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 185, in __call__
2026-02-27T18:44:43.007996831+08:00 stderr F     |     with collapse_excgroups():
2026-02-27T18:44:43.007998751+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/contextlib.py", line 158, in __exit__
2026-02-27T18:44:43.008000468+08:00 stderr F     |     self.gen.throw(typ, value, traceback)
2026-02-27T18:44:43.008002057+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_utils.py", line 82, in collapse_excgroups
2026-02-27T18:44:43.008003849+08:00 stderr F     |     raise exc
2026-02-27T18:44:43.0080053+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 187, in __call__
2026-02-27T18:44:43.008006815+08:00 stderr F     |     response = await self.dispatch_func(request, call_next)
2026-02-27T18:44:43.008008509+08:00 stderr F     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008021307+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 551, in get_or_set_request_id
2026-02-27T18:44:43.008023938+08:00 stderr F     |     response = await call_next(request)
2026-02-27T18:44:43.008026277+08:00 stderr F     |                ^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008028766+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
2026-02-27T18:44:43.008031295+08:00 stderr F     |     raise app_exc
2026-02-27T18:44:43.008033757+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 149, in coro
2026-02-27T18:44:43.008037063+08:00 stderr F     |     await self.app(scope, receive_or_disconnect, send_no_error)
2026-02-27T18:44:43.008039529+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2026-02-27T18:44:43.008042034+08:00 stderr F     |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2026-02-27T18:44:43.008044727+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-27T18:44:43.008047494+08:00 stderr F     |     raise exc
2026-02-27T18:44:43.008049912+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-27T18:44:43.008052506+08:00 stderr F     |     await app(scope, receive, sender)
2026-02-27T18:44:43.008054965+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
2026-02-27T18:44:43.008057432+08:00 stderr F     |     await self.middleware_stack(scope, receive, send)
2026-02-27T18:44:43.008059883+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
2026-02-27T18:44:43.008062259+08:00 stderr F     |     await route.handle(scope, receive, send)
2026-02-27T18:44:43.008067601+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
2026-02-27T18:44:43.008069285+08:00 stderr F     |     await self.app(scope, receive, send)
2026-02-27T18:44:43.008072004+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
2026-02-27T18:44:43.008074277+08:00 stderr F     |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2026-02-27T18:44:43.008076397+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-27T18:44:43.008078576+08:00 stderr F     |     raise exc
2026-02-27T18:44:43.008080726+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-27T18:44:43.008082688+08:00 stderr F     |     await app(scope, receive, sender)
2026-02-27T18:44:43.008084702+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 73, in app
2026-02-27T18:44:43.008086675+08:00 stderr F     |     response = await f(request)
2026-02-27T18:44:43.008089112+08:00 stderr F     |                ^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008091164+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 301, in app
2026-02-27T18:44:43.008093467+08:00 stderr F     |     raw_response = await run_endpoint_function(
2026-02-27T18:44:43.008095807+08:00 stderr F     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008098262+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2026-02-27T18:44:43.008103772+08:00 stderr F     |     return await run_in_threadpool(dependant.call, **values)
2026-02-27T18:44:43.008105888+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008108133+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2026-02-27T18:44:43.008111537+08:00 stderr F     |     return await anyio.to_thread.run_sync(func, *args)
2026-02-27T18:44:43.008113878+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008116457+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
2026-02-27T18:44:43.008118569+08:00 stderr F     |     return await get_async_backend().run_sync_in_worker_thread(
2026-02-27T18:44:43.00812082+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008123449+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2026-02-27T18:44:43.00812586+08:00 stderr F     |     return await future
2026-02-27T18:44:43.008128231+08:00 stderr F     |            ^^^^^^^^^^^^
2026-02-27T18:44:43.008130589+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2026-02-27T18:44:43.008133+08:00 stderr F     |     result = context.run(func, *args)
2026-02-27T18:44:43.008135637+08:00 stderr F     |              ^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008138117+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 504, in chat_completion
2026-02-27T18:44:43.008140634+08:00 stderr F     |     return self.continuous_batching_chat_completion(body, request.state.request_id)
2026-02-27T18:44:43.008143826+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008146686+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 828, in continuous_batching_chat_completion
2026-02-27T18:44:43.008148308+08:00 stderr F     |     ).to(model.device)["input_ids"][0]
2026-02-27T18:44:43.008150004+08:00 stderr F     |       ^^
2026-02-27T18:44:43.008151596+08:00 stderr F     | AttributeError: 'str' object has no attribute 'to'
2026-02-27T18:44:43.008153096+08:00 stderr F     +------------------------------------
2026-02-27T18:44:43.008154576+08:00 stderr F 
2026-02-27T18:44:43.008156224+08:00 stderr F During handling of the above exception, another exception occurred:
2026-02-27T18:44:43.008157577+08:00 stderr F 
2026-02-27T18:44:43.008159094+08:00 stderr F Traceback (most recent call last):
2026-02-27T18:44:43.008160837+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
2026-02-27T18:44:43.00816237+08:00 stderr F     result = await app(  # type: ignore[func-returns-value]
2026-02-27T18:44:43.008163893+08:00 stderr F              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008165505+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2026-02-27T18:44:43.008167042+08:00 stderr F     return await self.app(scope, receive, send)
2026-02-27T18:44:43.008168552+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008170161+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
2026-02-27T18:44:43.008171651+08:00 stderr F     await super().__call__(scope, receive, send)
2026-02-27T18:44:43.008173262+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
2026-02-27T18:44:43.008177248+08:00 stderr F     await self.middleware_stack(scope, receive, send)
2026-02-27T18:44:43.008178717+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 187, in __call__
2026-02-27T18:44:43.008180261+08:00 stderr F     raise exc
2026-02-27T18:44:43.008181747+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
2026-02-27T18:44:43.00818324+08:00 stderr F     await self.app(scope, receive, _send)
2026-02-27T18:44:43.008188175+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 185, in __call__
2026-02-27T18:44:43.008190636+08:00 stderr F     with collapse_excgroups():
2026-02-27T18:44:43.008193214+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/contextlib.py", line 158, in __exit__
2026-02-27T18:44:43.008195609+08:00 stderr F     self.gen.throw(typ, value, traceback)
2026-02-27T18:44:43.008198108+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_utils.py", line 82, in collapse_excgroups
2026-02-27T18:44:43.008201128+08:00 stderr F     raise exc
2026-02-27T18:44:43.00820262+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 187, in __call__
2026-02-27T18:44:43.008204139+08:00 stderr F     response = await self.dispatch_func(request, call_next)
2026-02-27T18:44:43.008205637+08:00 stderr F                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008207155+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 551, in get_or_set_request_id
2026-02-27T18:44:43.008208639+08:00 stderr F     response = await call_next(request)
2026-02-27T18:44:43.008210188+08:00 stderr F                ^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008211761+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
2026-02-27T18:44:43.008213272+08:00 stderr F     raise app_exc
2026-02-27T18:44:43.008214914+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 149, in coro
2026-02-27T18:44:43.008216424+08:00 stderr F     await self.app(scope, receive_or_disconnect, send_no_error)
2026-02-27T18:44:43.008218014+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2026-02-27T18:44:43.00821985+08:00 stderr F     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2026-02-27T18:44:43.008221537+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-27T18:44:43.008223062+08:00 stderr F     raise exc
2026-02-27T18:44:43.00822454+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-27T18:44:43.008226123+08:00 stderr F     await app(scope, receive, sender)
2026-02-27T18:44:43.008227586+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
2026-02-27T18:44:43.008229068+08:00 stderr F     await self.middleware_stack(scope, receive, send)
2026-02-27T18:44:43.008230552+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
2026-02-27T18:44:43.008232061+08:00 stderr F     await route.handle(scope, receive, send)
2026-02-27T18:44:43.008233752+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
2026-02-27T18:44:43.008235227+08:00 stderr F     await self.app(scope, receive, send)
2026-02-27T18:44:43.008238047+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
2026-02-27T18:44:43.00824318+08:00 stderr F     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2026-02-27T18:44:43.008244715+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-27T18:44:43.008246386+08:00 stderr F     raise exc
2026-02-27T18:44:43.008247856+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-27T18:44:43.008249355+08:00 stderr F     await app(scope, receive, sender)
2026-02-27T18:44:43.008251094+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 73, in app
2026-02-27T18:44:43.008252601+08:00 stderr F     response = await f(request)
2026-02-27T18:44:43.008254103+08:00 stderr F                ^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008255743+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 301, in app
2026-02-27T18:44:43.008257325+08:00 stderr F     raw_response = await run_endpoint_function(
2026-02-27T18:44:43.008259113+08:00 stderr F                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.00826063+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2026-02-27T18:44:43.008262108+08:00 stderr F     return await run_in_threadpool(dependant.call, **values)
2026-02-27T18:44:43.008263768+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008265222+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2026-02-27T18:44:43.008266775+08:00 stderr F     return await anyio.to_thread.run_sync(func, *args)
2026-02-27T18:44:43.00826834+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008269891+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
2026-02-27T18:44:43.008271492+08:00 stderr F     return await get_async_backend().run_sync_in_worker_thread(
2026-02-27T18:44:43.008273117+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.00827473+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2026-02-27T18:44:43.008276317+08:00 stderr F     return await future
2026-02-27T18:44:43.008277785+08:00 stderr F            ^^^^^^^^^^^^
2026-02-27T18:44:43.008279313+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2026-02-27T18:44:43.008280816+08:00 stderr F     result = context.run(func, *args)
2026-02-27T18:44:43.008282353+08:00 stderr F              ^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008287887+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 504, in chat_completion
2026-02-27T18:44:43.008290487+08:00 stderr F     return self.continuous_batching_chat_completion(body, request.state.request_id)
2026-02-27T18:44:43.008293356+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008296116+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 828, in continuous_batching_chat_completion
2026-02-27T18:44:43.008299386+08:00 stderr F     ).to(model.device)["input_ids"][0]
2026-02-27T18:44:43.008300914+08:00 stderr F       ^^
2026-02-27T18:44:43.008302548+08:00 stderr F AttributeError: 'str' object has no attribute 'to'

PR fix notes

PR #44349: fix: support linear_attention in continuous batching and fix serve ch…

Description (problem / solution / changelog)

What does this PR do?

Inspired by https://github.com/huggingface/transformers/pull/44347#issuecomment-3976028358

Fixes transformers serve failing with hybrid models like Qwen3.5 that use linear_attention layers.

Two issues are addressed:

  1. ValueError in PagedAttentionCache: group_layers_by_attn_type() correctly groups linear_attention layers, but PagedAttentionCache.__init__ only handles full_attention and sliding_attention, raising ValueError("Invalid group type: linear_attention"). Since linear attention layers (e.g. GatedDeltaNet) use recurrent state rather than KV cache, they should be filtered out of paged attention cache management entirely.

  2. AttributeError in chat completion: processor.apply_chat_template() (ProcessorMixin) returns a str, but the code expects tensors. The tokenizer is already extracted at L807 but L829 mistakenly calls processor instead.

Fix: Filter linear_attention groups in group_layers_by_attn_type() and use tokenizer.apply_chat_template() instead of processor.apply_chat_template().

Before submitting

Who can review?

@ArthurZucker @Cyrilvallez — continuous batching / cache owners

Changed files

  • src/transformers/cli/serve.py (modified, +1/-1)
  • src/transformers/generation/continuous_batching/cache.py (modified, +9/-1)
  • tests/generation/test_continuous_batching.py (modified, +39/-1)

Code Example

- logs
RAW_BUFFERClick to expand / collapse

System Info

  • transfomers: 5.3.0.dev0

Who can help?

@remi-or @ArthurZucker @McPatate

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  • use transformers serve-cli to deploy transformers serve --force-model Qwen/Qwen3.5-27B --port 9016 --continuous-batching
  • logs
2026-02-27T18:44:43.002681005+08:00 stdout F INFO:     ::1:46248 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
2026-02-27T18:44:43.002993914+08:00 stderr F Error in generation loop: 'Qwen3_5Config' object has no attribute 'num_attention_heads'
2026-02-27T18:44:43.002996802+08:00 stderr F Traceback (most recent call last):
2026-02-27T18:44:43.002998847+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/generation/continuous_batching/continuous_api.py", line 799, in _run_generation_loop
2026-02-27T18:44:43.003000958+08:00 stderr F     paged_attention_cache = PagedAttentionCache(
2026-02-27T18:44:43.003002582+08:00 stderr F                             ^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.003004339+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/generation/continuous_batching/cache.py", line 144, in __init__
2026-02-27T18:44:43.003006276+08:00 stderr F     self.num_key_value_heads: int = kv_heads if kv_heads is not None else config.num_attention_heads
2026-02-27T18:44:43.003007921+08:00 stderr F                                                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.003009773+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/configuration_utils.py", line 164, in __getattribute__
2026-02-27T18:44:43.003011403+08:00 stderr F     return super().__getattribute__(key)
2026-02-27T18:44:43.003013125+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.003016089+08:00 stderr F AttributeError: 'Qwen3_5Config' object has no attribute 'num_attention_heads'
2026-02-27T18:44:43.007934561+08:00 stderr F ERROR:    Exception in ASGI application
2026-02-27T18:44:43.00795328+08:00 stderr F   + Exception Group Traceback (most recent call last):
2026-02-27T18:44:43.007956234+08:00 stderr F   |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_utils.py", line 76, in collapse_excgroups
2026-02-27T18:44:43.007958168+08:00 stderr F   |     yield
2026-02-27T18:44:43.007960098+08:00 stderr F   |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 186, in __call__
2026-02-27T18:44:43.007961889+08:00 stderr F   |     async with anyio.create_task_group() as task_group:
2026-02-27T18:44:43.007963958+08:00 stderr F   |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 763, in __aexit__
2026-02-27T18:44:43.007965588+08:00 stderr F   |     raise BaseExceptionGroup(
2026-02-27T18:44:43.007967316+08:00 stderr F   | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
2026-02-27T18:44:43.007968864+08:00 stderr F   +-+---------------- 1 ----------------
2026-02-27T18:44:43.007970393+08:00 stderr F     | Traceback (most recent call last):
2026-02-27T18:44:43.007972218+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
2026-02-27T18:44:43.007973802+08:00 stderr F     |     result = await app(  # type: ignore[func-returns-value]
2026-02-27T18:44:43.007975466+08:00 stderr F     |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.007977074+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2026-02-27T18:44:43.007978615+08:00 stderr F     |     return await self.app(scope, receive, send)
2026-02-27T18:44:43.007980263+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.007982039+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
2026-02-27T18:44:43.007983675+08:00 stderr F     |     await super().__call__(scope, receive, send)
2026-02-27T18:44:43.007985308+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
2026-02-27T18:44:43.007986948+08:00 stderr F     |     await self.middleware_stack(scope, receive, send)
2026-02-27T18:44:43.007988502+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 187, in __call__
2026-02-27T18:44:43.007990248+08:00 stderr F     |     raise exc
2026-02-27T18:44:43.007991832+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
2026-02-27T18:44:43.007993575+08:00 stderr F     |     await self.app(scope, receive, _send)
2026-02-27T18:44:43.007995117+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 185, in __call__
2026-02-27T18:44:43.007996831+08:00 stderr F     |     with collapse_excgroups():
2026-02-27T18:44:43.007998751+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/contextlib.py", line 158, in __exit__
2026-02-27T18:44:43.008000468+08:00 stderr F     |     self.gen.throw(typ, value, traceback)
2026-02-27T18:44:43.008002057+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_utils.py", line 82, in collapse_excgroups
2026-02-27T18:44:43.008003849+08:00 stderr F     |     raise exc
2026-02-27T18:44:43.0080053+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 187, in __call__
2026-02-27T18:44:43.008006815+08:00 stderr F     |     response = await self.dispatch_func(request, call_next)
2026-02-27T18:44:43.008008509+08:00 stderr F     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008021307+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 551, in get_or_set_request_id
2026-02-27T18:44:43.008023938+08:00 stderr F     |     response = await call_next(request)
2026-02-27T18:44:43.008026277+08:00 stderr F     |                ^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008028766+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
2026-02-27T18:44:43.008031295+08:00 stderr F     |     raise app_exc
2026-02-27T18:44:43.008033757+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 149, in coro
2026-02-27T18:44:43.008037063+08:00 stderr F     |     await self.app(scope, receive_or_disconnect, send_no_error)
2026-02-27T18:44:43.008039529+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2026-02-27T18:44:43.008042034+08:00 stderr F     |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2026-02-27T18:44:43.008044727+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-27T18:44:43.008047494+08:00 stderr F     |     raise exc
2026-02-27T18:44:43.008049912+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-27T18:44:43.008052506+08:00 stderr F     |     await app(scope, receive, sender)
2026-02-27T18:44:43.008054965+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
2026-02-27T18:44:43.008057432+08:00 stderr F     |     await self.middleware_stack(scope, receive, send)
2026-02-27T18:44:43.008059883+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
2026-02-27T18:44:43.008062259+08:00 stderr F     |     await route.handle(scope, receive, send)
2026-02-27T18:44:43.008067601+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
2026-02-27T18:44:43.008069285+08:00 stderr F     |     await self.app(scope, receive, send)
2026-02-27T18:44:43.008072004+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
2026-02-27T18:44:43.008074277+08:00 stderr F     |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2026-02-27T18:44:43.008076397+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-27T18:44:43.008078576+08:00 stderr F     |     raise exc
2026-02-27T18:44:43.008080726+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-27T18:44:43.008082688+08:00 stderr F     |     await app(scope, receive, sender)
2026-02-27T18:44:43.008084702+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 73, in app
2026-02-27T18:44:43.008086675+08:00 stderr F     |     response = await f(request)
2026-02-27T18:44:43.008089112+08:00 stderr F     |                ^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008091164+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 301, in app
2026-02-27T18:44:43.008093467+08:00 stderr F     |     raw_response = await run_endpoint_function(
2026-02-27T18:44:43.008095807+08:00 stderr F     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008098262+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2026-02-27T18:44:43.008103772+08:00 stderr F     |     return await run_in_threadpool(dependant.call, **values)
2026-02-27T18:44:43.008105888+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008108133+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2026-02-27T18:44:43.008111537+08:00 stderr F     |     return await anyio.to_thread.run_sync(func, *args)
2026-02-27T18:44:43.008113878+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008116457+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
2026-02-27T18:44:43.008118569+08:00 stderr F     |     return await get_async_backend().run_sync_in_worker_thread(
2026-02-27T18:44:43.00812082+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008123449+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2026-02-27T18:44:43.00812586+08:00 stderr F     |     return await future
2026-02-27T18:44:43.008128231+08:00 stderr F     |            ^^^^^^^^^^^^
2026-02-27T18:44:43.008130589+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2026-02-27T18:44:43.008133+08:00 stderr F     |     result = context.run(func, *args)
2026-02-27T18:44:43.008135637+08:00 stderr F     |              ^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008138117+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 504, in chat_completion
2026-02-27T18:44:43.008140634+08:00 stderr F     |     return self.continuous_batching_chat_completion(body, request.state.request_id)
2026-02-27T18:44:43.008143826+08:00 stderr F     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008146686+08:00 stderr F     |   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 828, in continuous_batching_chat_completion
2026-02-27T18:44:43.008148308+08:00 stderr F     |     ).to(model.device)["input_ids"][0]
2026-02-27T18:44:43.008150004+08:00 stderr F     |       ^^
2026-02-27T18:44:43.008151596+08:00 stderr F     | AttributeError: 'str' object has no attribute 'to'
2026-02-27T18:44:43.008153096+08:00 stderr F     +------------------------------------
2026-02-27T18:44:43.008154576+08:00 stderr F 
2026-02-27T18:44:43.008156224+08:00 stderr F During handling of the above exception, another exception occurred:
2026-02-27T18:44:43.008157577+08:00 stderr F 
2026-02-27T18:44:43.008159094+08:00 stderr F Traceback (most recent call last):
2026-02-27T18:44:43.008160837+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
2026-02-27T18:44:43.00816237+08:00 stderr F     result = await app(  # type: ignore[func-returns-value]
2026-02-27T18:44:43.008163893+08:00 stderr F              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008165505+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2026-02-27T18:44:43.008167042+08:00 stderr F     return await self.app(scope, receive, send)
2026-02-27T18:44:43.008168552+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008170161+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
2026-02-27T18:44:43.008171651+08:00 stderr F     await super().__call__(scope, receive, send)
2026-02-27T18:44:43.008173262+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
2026-02-27T18:44:43.008177248+08:00 stderr F     await self.middleware_stack(scope, receive, send)
2026-02-27T18:44:43.008178717+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 187, in __call__
2026-02-27T18:44:43.008180261+08:00 stderr F     raise exc
2026-02-27T18:44:43.008181747+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
2026-02-27T18:44:43.00818324+08:00 stderr F     await self.app(scope, receive, _send)
2026-02-27T18:44:43.008188175+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 185, in __call__
2026-02-27T18:44:43.008190636+08:00 stderr F     with collapse_excgroups():
2026-02-27T18:44:43.008193214+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/contextlib.py", line 158, in __exit__
2026-02-27T18:44:43.008195609+08:00 stderr F     self.gen.throw(typ, value, traceback)
2026-02-27T18:44:43.008198108+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_utils.py", line 82, in collapse_excgroups
2026-02-27T18:44:43.008201128+08:00 stderr F     raise exc
2026-02-27T18:44:43.00820262+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 187, in __call__
2026-02-27T18:44:43.008204139+08:00 stderr F     response = await self.dispatch_func(request, call_next)
2026-02-27T18:44:43.008205637+08:00 stderr F                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008207155+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 551, in get_or_set_request_id
2026-02-27T18:44:43.008208639+08:00 stderr F     response = await call_next(request)
2026-02-27T18:44:43.008210188+08:00 stderr F                ^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008211761+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
2026-02-27T18:44:43.008213272+08:00 stderr F     raise app_exc
2026-02-27T18:44:43.008214914+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/base.py", line 149, in coro
2026-02-27T18:44:43.008216424+08:00 stderr F     await self.app(scope, receive_or_disconnect, send_no_error)
2026-02-27T18:44:43.008218014+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2026-02-27T18:44:43.00821985+08:00 stderr F     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2026-02-27T18:44:43.008221537+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-27T18:44:43.008223062+08:00 stderr F     raise exc
2026-02-27T18:44:43.00822454+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-27T18:44:43.008226123+08:00 stderr F     await app(scope, receive, sender)
2026-02-27T18:44:43.008227586+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
2026-02-27T18:44:43.008229068+08:00 stderr F     await self.middleware_stack(scope, receive, send)
2026-02-27T18:44:43.008230552+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
2026-02-27T18:44:43.008232061+08:00 stderr F     await route.handle(scope, receive, send)
2026-02-27T18:44:43.008233752+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
2026-02-27T18:44:43.008235227+08:00 stderr F     await self.app(scope, receive, send)
2026-02-27T18:44:43.008238047+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
2026-02-27T18:44:43.00824318+08:00 stderr F     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2026-02-27T18:44:43.008244715+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-27T18:44:43.008246386+08:00 stderr F     raise exc
2026-02-27T18:44:43.008247856+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-27T18:44:43.008249355+08:00 stderr F     await app(scope, receive, sender)
2026-02-27T18:44:43.008251094+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 73, in app
2026-02-27T18:44:43.008252601+08:00 stderr F     response = await f(request)
2026-02-27T18:44:43.008254103+08:00 stderr F                ^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008255743+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 301, in app
2026-02-27T18:44:43.008257325+08:00 stderr F     raw_response = await run_endpoint_function(
2026-02-27T18:44:43.008259113+08:00 stderr F                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.00826063+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2026-02-27T18:44:43.008262108+08:00 stderr F     return await run_in_threadpool(dependant.call, **values)
2026-02-27T18:44:43.008263768+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008265222+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2026-02-27T18:44:43.008266775+08:00 stderr F     return await anyio.to_thread.run_sync(func, *args)
2026-02-27T18:44:43.00826834+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008269891+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
2026-02-27T18:44:43.008271492+08:00 stderr F     return await get_async_backend().run_sync_in_worker_thread(
2026-02-27T18:44:43.008273117+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.00827473+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2026-02-27T18:44:43.008276317+08:00 stderr F     return await future
2026-02-27T18:44:43.008277785+08:00 stderr F            ^^^^^^^^^^^^
2026-02-27T18:44:43.008279313+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2026-02-27T18:44:43.008280816+08:00 stderr F     result = context.run(func, *args)
2026-02-27T18:44:43.008282353+08:00 stderr F              ^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008287887+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 504, in chat_completion
2026-02-27T18:44:43.008290487+08:00 stderr F     return self.continuous_batching_chat_completion(body, request.state.request_id)
2026-02-27T18:44:43.008293356+08:00 stderr F            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-27T18:44:43.008296116+08:00 stderr F   File "/home/work/.conda/envs/swift/lib/python3.11/site-packages/transformers/cli/serve.py", line 828, in continuous_batching_chat_completion
2026-02-27T18:44:43.008299386+08:00 stderr F     ).to(model.device)["input_ids"][0]
2026-02-27T18:44:43.008300914+08:00 stderr F       ^^
2026-02-27T18:44:43.008302548+08:00 stderr F AttributeError: 'str' object has no attribute 'to'

Expected behavior

support deploy qwen3.5

extent analysis

Fix Plan

The error message indicates that the Qwen3_5Config object has no attribute num_attention_heads. This suggests that the model configuration is not properly defined.

To fix this issue, you need to ensure that the Qwen3_5Config class has the required attributes.

Here are the steps to follow:

  • Check the model configuration file to ensure that it has the num_attention_heads attribute.
  • If the attribute is missing, add it to the configuration file.
  • If you are using a custom model, ensure that the Qwen3_5Config class is properly defined and has the required attributes.

Here is an example of how you can define the Qwen3_5Config class:

from transformers import PreTrainedConfig

class Qwen3_5Config(PreTrainedConfig):
    model_type = "qwen3.5"
    num_attention_heads = 16  # Add this attribute
    # Add other required attributes here
  • After making the changes, try deploying the model again using the transformers serve command.

Verification

To verify that the fix worked, you can check the model deployment logs for any error messages. If the model is deployed successfully, you should not see any error messages related to the num_attention_heads attribute.

You can also test the model by sending a request to the deployment endpoint. If the model is working correctly, you should receive a response without any error messages.

Extra Tips

  • Ensure that you are using the correct version of the transformers library.
  • Check the model configuration file for any typos or missing attributes.
  • If you are using a custom model, ensure that the model is properly defined and has the required attributes.
  • You can also try debugging the model deployment code to identify the root cause of the issue.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

support deploy qwen3.5

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

transformers - ✅(Solved) Fix AttributeError: 'Qwen3_5Config' object has no attribute 'num_attention_heads' [2 pull requests, 3 comments, 4 participants]