vllm - [Feature]: Support n_positions config field for nomic_bert models to enable inference beyond max_position_embeddings (1 comment, 1 participant)

vllm, 2026-04-17 09:49:17


GitHub stats

vllm-project/vllm#40130 • Fetched 2026-04-18 05:52:26


Author

Roy214

Participants

Roy214

Timeline (top)

commented ×1, labeled ×1



🚀 The feature, motivation and pitch

Support the n_positions config field for nomic_bert-like models to enable inference beyond max_position_embeddings.

nomic-embed-text-v1.5 publishes n_positions: 8192 in its config.json to signal that RoPE frequencies are precomputed to 8192 tokens, enabling inference beyond the training length of 2048. vLLM currently has no awareness of this field and always derives max context from max_position_embeddings, capping at 2048. This RFE requests that vLLM recognise n_positions for nomic_bert model types and use it as the effective max context length.

config.json: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5/blob/main/config.json

"n_positions": 8192,               ← TRUE max positions (8192, not 2048!)
"max_position_embeddings": 2048,   ← what vLLM reads by default (misleading)
"max_trained_positions": 2048,     ← trained up to 2048

"rope_parameters": {
    "rope_theta": 1000.0,          ← RoPE base frequency
    "rope_type": "default"         ← standard RoPE (no dynamic scaling)
},
"rotary_emb_base": 1000,           ← confirms RoPE is active
"rotary_emb_fraction": 1.0,        ← RoPE applied to 100% of head dims
"rotary_emb_interleaved": false,
"rotary_emb_scale_base": null,     ← no trained scaling factor set
"rotary_scaling_factor": null,     ← no scaling factor set
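To illustrate what "precomputed to 8192 tokens" could mean for the values above, here is a minimal sketch of a standard RoPE angle table built from rope_theta = 1000.0 and n_positions = 8192. The head dimension of 64 and both function names are assumptions made for this example; they are not taken from the issue or from vLLM.

```python
def rope_inv_freq(head_dim: int, base: float) -> list[float]:
    """Standard RoPE inverse frequencies: base^(-2i/head_dim), i = 0..head_dim/2 - 1."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def rope_angle_table(n_positions: int, head_dim: int, base: float) -> list[list[float]]:
    """Rotation angle for every (position, frequency) pair.

    The cos/sin buffers used at inference time are derived from these angles.
    """
    inv_freq = rope_inv_freq(head_dim, base)
    return [[pos * f for f in inv_freq] for pos in range(n_positions)]


HEAD_DIM = 64  # assumption for illustration; not stated in the issue

# Sized by n_positions (8192), not by max_position_embeddings (2048).
table = rope_angle_table(n_positions=8192, head_dim=HEAD_DIM, base=1000.0)
print(len(table), len(table[0]))
```

A table sized this way has an entry for every position from 0 to 8191, so no scaling step is needed just to index positions past 2048. Whether embedding quality holds at those positions is a separate question that this sketch does not address.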

1. When model_type == "nomic_bert" and n_positions is present, use n_positions as the effective max context length instead of max_position_embeddings.
2. No --rope-scaling flag should be required; the precomputed frequency buffer already covers the full range.
3. --rope-theta should still be respected or auto-derived from rotary_emb_base in config.
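The requested resolution logic can be sketched as follows. This is only an illustration of the proposal under stated assumptions: resolve_max_context_len and resolve_rope_theta are hypothetical helper names, not vLLM functions, and the config is treated as a plain dictionary loaded from config.json.

```python
from typing import Any, Mapping, Optional


def resolve_max_context_len(config: Mapping[str, Any]) -> int:
    """Hypothetical helper: prefer n_positions for nomic_bert configs."""
    n_positions = config.get("n_positions")
    if config.get("model_type") == "nomic_bert" and n_positions is not None:
        return int(n_positions)
    # Every other model keeps the existing behaviour.
    return int(config["max_position_embeddings"])


def resolve_rope_theta(
    config: Mapping[str, Any], cli_rope_theta: Optional[float] = None
) -> Optional[float]:
    """Hypothetical helper: an explicit --rope-theta wins, else use rotary_emb_base."""
    if cli_rope_theta is not None:
        return float(cli_rope_theta)
    base = config.get("rotary_emb_base")
    return float(base) if base is not None else None


# Values copied from the config excerpt in the issue.
nomic_cfg = {
    "model_type": "nomic_bert",
    "n_positions": 8192,
    "max_position_embeddings": 2048,
    "rotary_emb_base": 1000,
}
print(resolve_max_context_len(nomic_cfg))  # 8192
print(resolve_rope_theta(nomic_cfg))       # 1000.0
```

Gating on model_type keeps the change narrow: other architectures that happen to carry an n_positions field would not be affected.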

Alternatives

Current workaround:

--max-model-len 8192 \
--rope-scaling '{"type": "dynamic", "factor": 4.0}' \
--rope-theta 1000 \
--trust-remote-code
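The issue does not say how the factor of 4.0 was chosen. It matches the ratio between the target context length (n_positions) and the trained length (max_trained_positions), which the following sketch computes. The function name is hypothetical.

```python
def scaling_factor(target_len: int, trained_len: int) -> float:
    """Ratio of the desired context length to the trained context length."""
    if trained_len <= 0:
        raise ValueError("trained_len must be positive")
    return target_len / trained_len


# n_positions = 8192 and max_trained_positions = 2048, from the config excerpt.
factor = scaling_factor(8192, 2048)
print(factor)  # 4.0
```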

Additional context

No response


extent analysis

TL;DR

Update vLLM to recognize the n_positions config field for nomic_bert models and use it as the effective max context length.

Guidance

Check the model_type and n_positions fields in the config.json to determine if the update is applicable.
Verify that the n_positions value is used as the max context length when model_type is nomic_bert.
Test the updated model with the nomic-embed-text-v1.5 config to ensure it can handle inference beyond the training length of 2048.
Review the rope_parameters and rotary_emb_base fields to ensure RoPE frequencies are precomputed correctly.

Example

The issue contains no implementation code; it provides only the config.json excerpt and the workaround flags.

Notes

The update should only be applied when the model_type is nomic_bert and the n_positions field is present in the config.json.

Recommendation

Apply workaround: Use the provided --max-model-len, --rope-scaling, --rope-theta, and --trust-remote-code flags as a temporary solution until the vLLM update is implemented.
