vllm - ✅(Solved) Fix Tokenizer error with Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated model - TokenizersBackend not found [1 pull requests, 3 comments, 3 participants]

ArtemSultanov-PG · 2026-03-24T17:44:41Z

[vllm] PR 38099: Bugfix Add fallback for TokenizersBackend tokenizer class - Repository: vllm-project/vllm - Author: Lidang-Jiang - State: open | merged: False… # PR #38099: [Bugfix] Add fallback for TokenizersBackend tokenizer class - Repository: vllm-project/vllm - Author: Lidang-Jiang - State: open | merged: False - Link: https://github.com/vllm-project/vllm/pull/38099 ## Description (problem / solution / changelog) ## Summary Fixes https://github.com/vllm-project/vllm/issues/38024 Models using transformers 5.x may set `"tokenizer_class": "TokenizersBackend"` in `tokenizer_config.json`. When loaded with older transformers, this causes: ``` ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported. ``` This error **cannot** be resolved by `--trust-remote-code` since `TokenizersBackend` is a built-in transformers backend class, not a custom tokenizer. **This PR adds:** - An allowlist (`_FAST_TOKENIZER_FALLBACK_CLASSES`) of known forward-compatible tokenizer class names - A helper to read `tokenizer_class` from `tokenizer_config.json` - Fallback logic in `CachedHfTokenizer.from_pretrained()` that transparently loads via `PreTrainedTokenizerFast` when the class is in the allowlist - A warning logged suggesting users upgrade transformers - 10 unit tests covering all branches (fallback success, fallback failure, unknown class, unrelated errors, etc.) **Design decisions:** - **Allowlist-based** (not a catch-all fallback) — prevents misfiring on models that genuinely need custom tokenizers - **Triggers regardless of `trust_remote_code`** — because `trust_remote_code` cannot resolve this specific issue - **Precedent**: `vllm/tokenizers/deepseek_v32.py:88` already uses `PreTrainedTokenizerFast.from_pretrained()` for a similar purpose ## Before / After Before fix (error) ``` ====================================================================== BEFORE FIX: Simulating AutoTokenizer.from_pretrained() with tokenizer_config.json containing tokenizer_class=TokenizersBackend ====================================================================== RuntimeError: Failed to load the tokenizer. If the tokenizer is a custom tokenizer not yet available in the HuggingFace transformers library, consider setting `trust_remote_code=True` in LLM or using the `--trust-remote-code` flag in the CLI. >>> Users are told to use --trust-remote-code, >>> but that does NOT resolve this issue. ``` After fix (graceful fallback) ``` ====================================================================== AFTER FIX: Simulating AutoTokenizer.from_pretrained() with tokenizer_config.json containing tokenizer_class=TokenizersBackend ====================================================================== WARNING 03-25 19:00:21 [hf.py:172] Tokenizer class 'TokenizersBackend' is not available in the installed version of transformers. Falling back to PreTrainedTokenizerFast. Consider upgrading: `pip install --upgrade transformers` SUCCESS: Tokenizer loaded via PreTrainedTokenizerFast fallback type = MagicMock vocab = 2 special = [' '] >>> Model loads successfully with graceful fallback. >>> A warning is logged suggesting: pip install --upgrade transformers ``` ## Test plan ```bash pytest tests/tokenizers_/test_hf.py -v -k "not test_cached_tokenizer" ``` Test results (10 passed) ``` tests/tokenizers_/test_hf.py::TestFastTokenizerFallbackHelpers::test_fallback_classes_is_frozenset PASSED [ 10%] tests/tokenizers_/test_hf.py::TestFastTokenizerFallbackHelpers::test_fallback_possible_when_class_in_allowlist PASSED [ 20%] tests/tokenizers_/test_hf.py::TestFastTokenizerFallbackHelpers::test_fallback_not_possible_for_unknown_class PASSED [ 30%] tests/tokenizers_/test_hf.py::TestFastTokenizerFallbackHelpers::test_fallback_not_possible_when_config_unreadable PASSED [ 40%] tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_falls_back_to_pretrained_tokenizer_fast PASSED [ 50%] tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_raises_runtime_error_for_unknown_class PASSED [ 60%] tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_unrelated_valueerror_is_reraised PASSED [ 70%] tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_fallback_works_even_with_trust_remote_code PASSED [ 80%] tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_fallback_triggers_on_execute_tokenizer_file_error PASSED [ 90%] tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_fallback_failure_raises_runtime_error PASSED [100%] ================= 10 passed, 2 deselected, 2 warnings in 2.09s ================= ``` **Pre-commit:** All hooks passed (ruff, mypy, typos, SPDX headers, etc.) ## Notes - This PR is NOT a duplicate — no existing open PR addresses #38024 - AI assistance was used (Claude Code) for implementation and testing - This is a defensive fix; upgrading `

vllm2026-03-24 17:44:41

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

vllm-project/vllm#38024•Fetched 2026-04-08 01:27:00

View on GitHub

Comments

Participants

Timeline

Reactions

Author

Participants

Timeline (top)

referenced ×4commented ×3cross-referenced ×1subscribed ×1

Error Message

ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported.

Fix Action

Fix / Workaround

The error suggests that the model uses a custom tokenizer class TokenizersBackend that is not recognized by vLLM. Is this a known issue? Are there any workarounds or fixes available?

PR fix notes

PR #38099: [Bugfix] Add fallback for TokenizersBackend tokenizer class

Repository: vllm-project/vllm
Author: Lidang-Jiang
State: open | merged: False
Link: https://github.com/vllm-project/vllm/pull/38099

Description (problem / solution / changelog)

Summary

Fixes https://github.com/vllm-project/vllm/issues/38024

Models using transformers 5.x may set "tokenizer_class": "TokenizersBackend" in tokenizer_config.json. When loaded with older transformers, this causes:

ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported.

This error cannot be resolved by --trust-remote-code since TokenizersBackend is a built-in transformers backend class, not a custom tokenizer.

This PR adds:

An allowlist (_FAST_TOKENIZER_FALLBACK_CLASSES) of known forward-compatible tokenizer class names
A helper to read tokenizer_class from tokenizer_config.json
Fallback logic in CachedHfTokenizer.from_pretrained() that transparently loads via PreTrainedTokenizerFast when the class is in the allowlist
A warning logged suggesting users upgrade transformers
10 unit tests covering all branches (fallback success, fallback failure, unknown class, unrelated errors, etc.)

Design decisions:

Allowlist-based (not a catch-all fallback) — prevents misfiring on models that genuinely need custom tokenizers
Triggers regardless of trust_remote_code — because trust_remote_code cannot resolve this specific issue
Precedent: vllm/tokenizers/deepseek_v32.py:88 already uses PreTrainedTokenizerFast.from_pretrained() for a similar purpose

Before / After

<details> <summary>Before fix (error)</summary>

======================================================================
BEFORE FIX: Simulating AutoTokenizer.from_pretrained() with
tokenizer_config.json containing tokenizer_class=TokenizersBackend
======================================================================
RuntimeError: Failed to load the tokenizer. If the tokenizer is a custom
tokenizer not yet available in the HuggingFace transformers library, consider
setting `trust_remote_code=True` in LLM or using the `--trust-remote-code`
flag in the CLI.

>>> Users are told to use --trust-remote-code,
>>> but that does NOT resolve this issue.

</details> <details> <summary>After fix (graceful fallback)</summary>

======================================================================
AFTER FIX: Simulating AutoTokenizer.from_pretrained() with
tokenizer_config.json containing tokenizer_class=TokenizersBackend
======================================================================
WARNING 03-25 19:00:21 [hf.py:172] Tokenizer class 'TokenizersBackend' is not
available in the installed version of transformers. Falling back to
PreTrainedTokenizerFast. Consider upgrading: `pip install --upgrade transformers`
SUCCESS: Tokenizer loaded via PreTrainedTokenizerFast fallback
  type    = MagicMock
  vocab   = 2
  special = ['<s>']

>>> Model loads successfully with graceful fallback.
>>> A warning is logged suggesting: pip install --upgrade transformers

</details>

Test plan

pytest tests/tokenizers_/test_hf.py -v -k "not test_cached_tokenizer"

<details> <summary>Test results (10 passed)</summary>

tests/tokenizers_/test_hf.py::TestFastTokenizerFallbackHelpers::test_fallback_classes_is_frozenset PASSED [ 10%]
tests/tokenizers_/test_hf.py::TestFastTokenizerFallbackHelpers::test_fallback_possible_when_class_in_allowlist PASSED [ 20%]
tests/tokenizers_/test_hf.py::TestFastTokenizerFallbackHelpers::test_fallback_not_possible_for_unknown_class PASSED [ 30%]
tests/tokenizers_/test_hf.py::TestFastTokenizerFallbackHelpers::test_fallback_not_possible_when_config_unreadable PASSED [ 40%]
tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_falls_back_to_pretrained_tokenizer_fast PASSED [ 50%]
tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_raises_runtime_error_for_unknown_class PASSED [ 60%]
tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_unrelated_valueerror_is_reraised PASSED [ 70%]
tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_fallback_works_even_with_trust_remote_code PASSED [ 80%]
tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_fallback_triggers_on_execute_tokenizer_file_error PASSED [ 90%]
tests/tokenizers_/test_hf.py::TestTokenizersBackendFallback::test_fallback_failure_raises_runtime_error PASSED [100%]

================= 10 passed, 2 deselected, 2 warnings in 2.09s =================

</details>

Pre-commit: All hooks passed (ruff, mypy, typos, SPDX headers, etc.)

Notes

This PR is NOT a duplicate — no existing open PR addresses #38024
AI assistance was used (Claude Code) for implementation and testing
This is a defensive fix; upgrading transformers remains the recommended long-term solution

🤖 Generated with Claude Code

Changed files

vllm/tokenizers/hf.py (modified, +4/-1)

Code Example

ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported.

---

sudo docker run -d --name vllm-huihui-qwen35-claude \
  --runtime=nvidia \
  -e CUDA_VISIBLE_DEVICES=4,5,6,7 \
  -v /data:/data \
  -p 8002:8000 \
  --ipc=host \
  vllm/vllm-openai:v0.17.1 \
  python3 -m vllm.entrypoints.openai.api_server \
  --model /data/models/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated \
  --tensor-parallel-size 4 \
  --max-model-len 100000 \
  --gpu-memory-utilization 0.85 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_xml \
  --dtype bfloat16 \
  --kv-cache-dtype fp8 \
  --max-num-seqs 10 \
  --host 0.0.0.0 \
  --port 8000

RAW_BUFFERClick to expand / collapse

Issue Description

When trying to run the model Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated with vLLM, I encounter a tokenizer error.

Model Information

Model Name: Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated
Model Architecture: Qwen3_5MoeForConditionalGeneration (MoE)
Parameters: 35B total, 256 experts, 8 experts per token
Quantization: AWQ 4-bit
Location: /data/models/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated

Error Message

ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported.

Environment

vLLM Version: 0.17.1 and nightly
CUDA: Available
GPU: 4x GPUs (tensor-parallel-size=4)
Python: 3.x

Command Used

sudo docker run -d --name vllm-huihui-qwen35-claude \
  --runtime=nvidia \
  -e CUDA_VISIBLE_DEVICES=4,5,6,7 \
  -v /data:/data \
  -p 8002:8000 \
  --ipc=host \
  vllm/vllm-openai:v0.17.1 \
  python3 -m vllm.entrypoints.openai.api_server \
  --model /data/models/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated \
  --tensor-parallel-size 4 \
  --max-model-len 100000 \
  --gpu-memory-utilization 0.85 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_xml \
  --dtype bfloat16 \
  --kv-cache-dtype fp8 \
  --max-num-seqs 10 \
  --host 0.0.0.0 \
  --port 8000

Model Tokenizer Files

The model contains:

tokenizer.json (19 MB)
tokenizer_config.json (5.3 KB)

Tried Solutions

vLLM v0.17.1 - ❌ Error persists
vLLM nightly - ❌ Error persists
--tokenizer-mode auto - ❌ Error persists
--trust-remote-code - ❌ Error persists
--enforce-eager - ❌ Error persists

Question

The error suggests that the model uses a custom tokenizer class TokenizersBackend that is not recognized by vLLM. Is this a known issue? Are there any workarounds or fixes available?

Logs

Full logs are available in the attached file or can be reproduced by running the command above.

Thank you!

extent analysis

Fix Plan

To resolve the ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported error, follow these steps:

Ensure the TokenizersBackend class is correctly defined and imported in your code.
If using a custom tokenizer, verify that it is properly registered with the vllm library.
Try updating the vllm library to the latest version, as this issue may have been resolved in a recent update.

Here's an example of how to register a custom tokenizer:

import vllm

# Define the custom tokenizer class
class TokenizersBackend:
    # Tokenizer implementation

# Register the custom tokenizer
vllm.tokenization.register_tokenizer("TokenizersBackend", TokenizersBackend)

Alternatively, you can try specifying the tokenizer class explicitly when running the vllm command:

sudo docker run -d --name vllm-huihui-qwen35-claude \
  --runtime=nvidia \
  -e CUDA_VISIBLE_DEVICES=4,5,6,7 \
  -v /data:/data \
  -p 8002:8000 \
  --ipc=host \
  vllm/vllm-openai:v0.17.1 \
  python3 -m vllm.entrypoints.openai.api_server \
  --model /data/models/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated \
  --tensor-parallel-size 4 \
  --max-model-len 100000 \
  --gpu-memory-utilization 0.85 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_xml \
  --dtype bfloat16 \
  --kv-cache-dtype fp8 \
  --max-num-seqs 10 \
  --host 0.0.0.0 \
  --port 8000 \
  --tokenizer-class TokenizersBackend

Verification

To verify that the fix worked, run the vllm command again and check for any error messages related to the tokenizer. If the issue persists, try checking the logs for more detailed error information.

Extra Tips

Ensure that the custom tokenizer class is correctly implemented and compatible with the vllm library.
If using a custom tokenizer, consider submitting a pull request to the vllm repository to add support for your tokenizer.
Keep your vllm library up to date to ensure you have the latest features and bug fixes.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #tokenizer error #authentication setup #request error #file not found

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

vllm - ✅(Solved) Fix Tokenizer error with Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated model - TokenizersBackend not found [1 pull requests, 3 comments, 3 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Fix Action

Fix / Workaround

PR fix notes

PR #38099: [Bugfix] Add fallback for TokenizersBackend tokenizer class

Description (problem / solution / changelog)

Summary

Before / After

Test plan

Notes

Changed files

Code Example

Issue Description

Model Information

Error Message

Environment

Command Used

Model Tokenizer Files

Tried Solutions

Question

Logs

extent analysis

Fix Plan

Verification

Extra Tips

Still need to ship something?

RELATED_DISCOVERY

TRENDING