vllm - 💡(How to fix) Fix [Feature]: Support n_positions config field for nomic_bert models to enable inference beyond max_position_embeddings [1 comments, 1 participants]

GitHub stats
vllm-project/vllm#40130 · Fetched 2026-04-18 05:52:26
Comments: 1 · Participants: 1 · Timeline: 2 (commented ×1, labeled ×1) · Reactions: 0

Fix Action

Fix / Workaround

Current workaround:

Code Example

"n_positions": 8192,              ← TRUE max positions (8192, not 2048!)
"max_position_embeddings": 2048,  ← what vLLM reads by default (misleading)
"max_trained_positions": 2048,    ← trained up to 2048

"rope_parameters": {
    "rope_theta": 1000.0,         ← RoPE base frequency
    "rope_type": "default"        ← standard RoPE (no dynamic scaling)
},
"rotary_emb_base": 1000,          ← confirms RoPE is active
"rotary_emb_fraction": 1.0,       ← RoPE applied to 100% of head dims
"rotary_emb_interleaved": false,
"rotary_emb_scale_base": null,    ← no trained scaling factor set
"rotary_scaling_factor": null,    ← no scaling factor set

---

--max-model-len 8192 \
--rope-scaling '{"type": "dynamic", "factor": 4.0}' \
--rope-theta 1000 \
--trust-remote-code

🚀 The feature, motivation and pitch

Support the n_positions config field for nomic_bert-like models to enable inference beyond max_position_embeddings

nomic-embed-text-v1.5 publishes n_positions: 8192 in its config.json to signal that RoPE frequencies are precomputed to 8192 tokens, enabling inference beyond the training length of 2048. vLLM currently has no awareness of this field and always derives max context from max_position_embeddings, capping at 2048. This RFE requests that vLLM recognise n_positions for nomic_bert model types and use it as the effective max context length.
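As a rough illustration of what "precomputed to 8192 tokens" means, the sketch below builds a standard RoPE angle table in plain Python. The head dimension of 64 and the helper name `rope_angle_table` are assumptions made for illustration; neither comes from the issue or from vLLM.

```python
def rope_angle_table(num_positions: int, head_dim: int, base: float) -> list[list[float]]:
    """Return angles[pos][i] = pos * base**(-2i/head_dim) for standard RoPE."""
    # One inverse frequency per pair of head dimensions.
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    # One row of rotation angles per position.
    return [[pos * f for f in inv_freq] for pos in range(num_positions)]


# n_positions and the RoPE base are taken from the quoted config;
# head_dim=64 is an assumed value.
table = rope_angle_table(num_positions=8192, head_dim=64, base=1000.0)
print(len(table))     # positions covered by the table
print(len(table[0]))  # frequency pairs per position
```

A table sized this way has an entry for every position up to n_positions, which is the property the request relies on; whether quality holds beyond the trained 2048 tokens is a separate question.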

Config.json https://huggingface.co/nomic-ai/nomic-embed-text-v1.5/blob/main/config.json

"n_positions": 8192,              ← TRUE max positions (8192, not 2048!)
"max_position_embeddings": 2048,  ← what vLLM reads by default (misleading)
"max_trained_positions": 2048,    ← trained up to 2048

"rope_parameters": {
    "rope_theta": 1000.0,          ← RoPE base frequency
    "rope_type": "default"        ← standard RoPE (no dynamic scaling)
},
"rotary_emb_base": 1000,          ← confirms RoPE is active
"rotary_emb_fraction": 1.0,        ← RoPE applied to 100% of head dims
"rotary_emb_interleaved": false,
"rotary_emb_scale_base": null,     ← no trained scaling factor set
"rotary_scaling_factor": null,    ← no scaling factor set

1. When model_type == "nomic_bert" and n_positions is present, use n_positions as the effective max context length instead of max_position_embeddings.
2. No --rope-scaling flag should be required, since the precomputed frequency buffer already covers the full range.
3. --rope-theta should still be respected or auto-derived from rotary_emb_base in config.
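The requested behaviour can be sketched as a small resolution function. This is a minimal sketch of the proposal, not vLLM's implementation: `resolve_context_settings` and its `cli_rope_theta` parameter are hypothetical names, and the config is treated as a plain dict.

```python
from typing import Any, Mapping, Optional


def resolve_context_settings(
    config: Mapping[str, Any],
    cli_rope_theta: Optional[float] = None,  # hypothetical stand-in for --rope-theta
) -> tuple[Optional[int], Optional[float]]:
    """Return (effective max context length, RoPE base) per the proposal."""
    # Default: what vLLM reads today, according to the issue.
    max_len = config.get("max_position_embeddings")
    # Point 1: prefer n_positions, but only for nomic_bert configs that define it.
    if config.get("model_type") == "nomic_bert" and config.get("n_positions") is not None:
        max_len = config["n_positions"]
    # Point 3: an explicit value wins; otherwise fall back to rotary_emb_base.
    rope_theta = cli_rope_theta if cli_rope_theta is not None else config.get("rotary_emb_base")
    return max_len, rope_theta


nomic_config = {
    "model_type": "nomic_bert",
    "n_positions": 8192,
    "max_position_embeddings": 2048,
    "rotary_emb_base": 1000,
}
print(resolve_context_settings(nomic_config))  # (8192, 1000)
```

Configs with a different model_type, or without n_positions, fall through to max_position_embeddings, so other models would be unaffected.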

Alternatives

Current workaround:

--max-model-len 8192 \
--rope-scaling '{"type": "dynamic", "factor": 4.0}' \
--rope-theta 1000 \
--trust-remote-code
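The factor of 4.0 in the workaround appears to correspond to n_positions / max_trained_positions (8192 / 2048). The sketch below derives the same flags from the config values; `workaround_flags` is a hypothetical helper, and the flag names are copied from the workaround as written in the issue rather than checked against any particular vLLM version.

```python
import json
import shlex


def workaround_flags(config: dict) -> list[str]:
    """Build the workaround flags from values in config.json."""
    target = config["n_positions"]
    trained = config["max_trained_positions"]
    factor = target / trained  # 8192 / 2048 for the quoted config
    rope_scaling = json.dumps({"type": "dynamic", "factor": factor})
    return [
        f"--max-model-len {target}",
        f"--rope-scaling {shlex.quote(rope_scaling)}",
        f"--rope-theta {config['rotary_emb_base']}",
        "--trust-remote-code",
    ]


config = {"n_positions": 8192, "max_trained_positions": 2048, "rotary_emb_base": 1000}
print(" \\\n".join(workaround_flags(config)))
```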

Additional context

No response


extent analysis

TL;DR

Update vLLM to recognize the n_positions config field for nomic_bert models and use it as the effective max context length.

Guidance

  • Check the model_type and n_positions fields in the config.json to determine if the update is applicable.
  • Verify that the n_positions value is used as the max context length when model_type is nomic_bert.
  • Test the updated model with the nomic-embed-text-v1.5 config to ensure it can handle inference beyond the training length of 2048.
  • Review the rope_parameters and rotary_emb_base fields to ensure RoPE frequencies are precomputed correctly.

Example

The issue contains no implementation code; its only concrete snippets are the config.json excerpt and the workaround flags quoted above.

Notes

The update should only be applied when the model_type is nomic_bert and the n_positions field is present in the config.json.

Recommendation

Apply workaround: Use the provided --max-model-len, --rope-scaling, --rope-theta, and --trust-remote-code flags as a temporary solution until the vLLM update is implemented.
