vllm - ✅(Solved) Fix [Bug]: LMCache MP fallback adapter rejects cache_salt/cache_salts kwargs after #39837 [1 pull requests, 1 comments, 1 participants]

GitHub stats
vllm-project/vllm#40040 (fetched 2026-04-17 08:27:30)
Comments: 1 · Participants: 1 · Timeline events: 2 · Reactions: 0
Timeline (top): commented ×1, cross-referenced ×1

Error Message

TypeError: LMCacheMPSchedulerAdapter.maybe_submit_lookup_request() got an unexpected keyword argument 'cache_salt'
TypeError: LMCacheMPWorkerAdapter.batched_submit_store_requests() got an unexpected keyword argument 'cache_salts'
TypeError: LMCacheMPWorkerAdapter.batched_submit_retrieve_requests() got an unexpected keyword argument 'cache_salts'

Fix Action

Fixed

PR fix notes

PR #40041: [Bugfix][LMCache MP Connector] Fix fallback adapter cache_salt signature mismatch

Description (problem / solution / changelog)

Fixes #40040.

Summary

#39837 propagated cache_salt / cache_salts through the LMCache MP connector call sites, but the repo-local fallback adapter copy in vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/multi_process_adapter.py still had the old compatibility surface.

On current main, that could raise a TypeError at runtime in the fallback/internal adapter path when the connector exercises:

  • LMCacheMPSchedulerAdapter.__init__(..., mq_timeout=..., heartbeat_interval=...)
  • LMCacheMPWorkerAdapter.__init__(..., mq_timeout=..., heartbeat_interval=...)
  • maybe_submit_lookup_request(..., cache_salt=...)
  • batched_submit_store_requests(..., cache_salts=...)
  • batched_submit_retrieve_requests(..., cache_salts=...)

Root cause

lmcache_mp_connector.py tries to import the external LMCache MP adapter first, but explicitly falls back to the repo-local adapter copy on ImportError.

After #39837, the connector call sites and the fallback adapter copy no longer matched each other:

  • the factory helpers still passed mq_timeout / heartbeat_interval
  • the request path now passed cache_salt / cache_salts
  • the fallback adapter copy still exposed the older signatures
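The import-then-fall-back pattern and the resulting mismatch can be sketched in isolation. This is a minimal illustration, not the vLLM code: the module name `hypothetical_external_lmcache_adapter` and the class `LocalFallbackSchedulerAdapter` are assumptions made up for the example, and the stand-in method only mirrors the pre-fix signature shown in this issue.

```python
import importlib


class LocalFallbackSchedulerAdapter:
    """Hypothetical stand-in with the pre-fix (no cache_salt) signature."""

    def maybe_submit_lookup_request(self, request_id, block_hashes=None, token_ids=None):
        return None


def load_adapter_class():
    """Prefer an external adapter; fall back to the local copy on ImportError."""
    try:
        # Assumed module name, chosen so that the import fails in this sketch.
        module = importlib.import_module("hypothetical_external_lmcache_adapter")
        return module.SchedulerAdapter
    except ImportError:
        return LocalFallbackSchedulerAdapter


adapter = load_adapter_class()()

# The call site passes the new keyword, but the fallback copy does not accept it.
try:
    adapter.maybe_submit_lookup_request(
        "req-0", token_ids=[1, 2, 3], cache_salt="tenant-a"
    )
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Because the fallback is selected silently at import time, the mismatch only surfaces when a request actually reaches the adapter method.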

Fix

  • Accept mq_timeout and heartbeat_interval in the fallback scheduler/worker adapter constructors
  • Accept cache_salt in LMCacheMPSchedulerAdapter.maybe_submit_lookup_request
  • Accept cache_salts in LMCacheMPWorkerAdapter.batched_submit_store_requests
  • Accept cache_salts in LMCacheMPWorkerAdapter.batched_submit_retrieve_requests
  • Add a self-contained regression test that:
    • verifies the constructor and method signatures match the connector call sites
    • instantiates the fallback adapters through create_scheduler_adapter / create_worker_adapter
    • exercises the connector path without requiring the full LMCache runtime stack
    • restores temporary module stubs after the test to avoid sys.modules pollution
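The regression-test ideas listed above (checking signatures and cleaning up temporary module stubs) could look roughly like the following. This is a sketch under assumptions, not the test added in the PR: `StandInWorkerAdapter`, `temporary_module_stubs`, `accepts_keyword`, and the stub name `hypothetical_heavy_dependency` are all invented for illustration.

```python
import inspect
import sys
import types
from contextlib import contextmanager


@contextmanager
def temporary_module_stubs(names):
    """Install empty stub modules, then restore sys.modules on exit."""
    missing = object()
    saved = {name: sys.modules.get(name, missing) for name in names}
    try:
        for name in names:
            sys.modules[name] = types.ModuleType(name)
        yield
    finally:
        for name, original in saved.items():
            if original is missing:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = original


def accepts_keyword(func, keyword):
    """True if func can be called with `keyword=...` (directly or via **kwargs)."""
    params = inspect.signature(func).parameters
    if keyword in params:
        return params[keyword].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class StandInWorkerAdapter:
    """Hypothetical stand-in mirroring the post-fix method signatures."""

    def batched_submit_store_requests(self, request_ids, ops, event, cache_salts=None):
        return None

    def batched_submit_retrieve_requests(self, request_ids, ops, event, cache_salts=None):
        return None


stub_name = "hypothetical_heavy_dependency"
was_present_before = stub_name in sys.modules
with temporary_module_stubs([stub_name]):
    present_during = stub_name in sys.modules
    store_ok = accepts_keyword(
        StandInWorkerAdapter.batched_submit_store_requests, "cache_salts"
    )
    retrieve_ok = accepts_keyword(
        StandInWorkerAdapter.batched_submit_retrieve_requests, "cache_salts"
    )
present_after = stub_name in sys.modules
```

Restoring the saved entries in a `finally` block keeps the stubs from leaking into other tests even if an assertion inside the `with` block fails.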

Scope / non-goals

This PR intentionally does not change fallback key-construction semantics or make broader claims about external LMCache behavior.

It only fixes the confirmed compatibility regression in the repo-local fallback/internal adapter path.

Duplicate-work check

gh pr list --repo vllm-project/vllm --state open --search "40040 in:body"
gh pr list --repo vllm-project/vllm --state open --search "fallback adapter cache_salt"

Both searches returned no matching open PR before opening this PR.

Test Plan

.venv/bin/python -m py_compile \
  vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/multi_process_adapter.py \
  tests/v1/kv_connector/unit/test_lmcache_mp_fallback_adapter.py

.venv/bin/python -m pytest --noconftest \
  tests/v1/kv_connector/unit/test_lmcache_mp_fallback_adapter.py -q

Test Result

..                                                                       [100%]
2 passed in 0.02s

AI assistance disclosure

AI assistance was used to help draft and validate this change. I reviewed the final issue/PR text, code diff, and test evidence.

Changed files

  • tests/v1/kv_connector/unit/test_lmcache_mp_fallback_adapter.py (added, +393/-0)
  • vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/multi_process_adapter.py (modified, +24/-0)


Your current environment

  • vLLM checkout: main at 3daca38e2279538b420641bd41853c19e5ad01e4
  • Python: 3.10.20
  • Latest stable checked for comparison: v0.19.0 (published 2026-04-03)

This appears to be a current main regression rather than a latest-stable regression.

🐛 Describe the bug

#39837 added cache_salt / cache_salts arguments at the LMCache MP connector call sites, but the repo-local fallback/internal adapter copy in vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/multi_process_adapter.py still has the old method signatures.

As a result, the fallback/internal LMCache MP path can raise TypeError on current main.

Why this seems real and not intended

cache_salt is documented as a real cache-isolation feature for shared environments, and #39837 explicitly extends that support into the LMCache MP connector.

Also, #39837 already had a review comment pointing out that the fallback adapter methods needed to accept the new arguments to avoid TypeError, but the merged code still has the old signatures.

Relevant call sites on current main

# vllm/distributed/kv_transfer/kv_connector/v1/lmcache_mp_connector.py
self.worker_adapter.batched_submit_retrieve_requests(
    request_ids, ops, event, cache_salts=cache_salts
)

self.worker_adapter.batched_submit_store_requests(
    request_ids, ops, event, cache_salts=cache_salts
)

self.scheduler_adapter.maybe_submit_lookup_request(
    request.request_id,
    token_ids=list(request.all_token_ids),
    cache_salt=tracker.cache_salt,
)

Current fallback signatures

# vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/multi_process_adapter.py

def maybe_submit_lookup_request(
    self,
    request_id: str,
    block_hashes: list[bytes] | None = None,
    token_ids: list[int] | None = None,
) -> None:
    ...

def batched_submit_store_requests(
    self,
    request_ids: list[str],
    ops: list[LoadStoreOp],
    event: torch.cuda.Event,
):
    ...

def batched_submit_retrieve_requests(
    self,
    request_ids: list[str],
    ops: list[LoadStoreOp],
    event: torch.cuda.Event,
):
    ...

Observed failure mode

Using a local runtime-oriented repro against the checked-out source, I can reproduce:

TypeError: LMCacheMPSchedulerAdapter.maybe_submit_lookup_request() got an unexpected keyword argument 'cache_salt'
TypeError: LMCacheMPWorkerAdapter.batched_submit_store_requests() got an unexpected keyword argument 'cache_salts'
TypeError: LMCacheMPWorkerAdapter.batched_submit_retrieve_requests() got an unexpected keyword argument 'cache_salts'

The same mismatch is reachable through the actual connector methods:

  • LMCacheMPConnector.get_num_new_matched_tokens()
  • LMCacheMPConnector.start_load_kv()
  • LMCacheMPConnector.wait_for_save()

Scope narrowing

I am not claiming that the latest external LMCache adapter path is broken end-to-end.

This issue is specifically about the repo-local fallback/internal MP adapter path still rejecting the keywords introduced by #39837.

Why I think this should be fixed

lmcache_mp_connector.py explicitly falls back to the repo-local adapter when the external LMCache adapter import is unavailable, so this fallback path should stay interface-compatible with the call sites in the same repo.

Duplicate-work check

I searched the current open PRs/issues and did not find an existing open tracker for this fallback signature mismatch.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

TL;DR

Update the fallback/internal adapter methods in vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/multi_process_adapter.py to accept the new cache_salt and cache_salts arguments.

Guidance

  • Review the method signatures in lmcache_integration/multi_process_adapter.py and update them to match the new arguments introduced by #39837.
  • Verify that the updated methods accept the new keyword arguments without raising TypeError.
  • Check the call sites in vllm/distributed/kv_transfer/kv_connector/v1/lmcache_mp_connector.py to confirm that the keyword names they pass (cache_salt, cache_salts) match the updated signatures.
  • Test the fallback path, i.e. the one taken when the external LMCache adapter import fails, to confirm it no longer raises TypeError.

Example

def maybe_submit_lookup_request(
    self,
    request_id: str,
    block_hashes: list[bytes] | None = None,
    token_ids: list[int] | None = None,
    cache_salt: str | None = None,  # Add cache_salt argument
) -> None:
    ...

def batched_submit_store_requests(
    self,
    request_ids: list[str],
    ops: list[LoadStoreOp],
    event: torch.cuda.Event,
    cache_salts: list[str] | None = None,  # Add cache_salts argument
):
    ...

def batched_submit_retrieve_requests(
    self,
    request_ids: list[str],
    ops: list[LoadStoreOp],
    event: torch.cuda.Event,
    cache_salts: list[str] | None = None,  # Add cache_salts argument
):
    ...
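One way to confirm that a signature matches a call shape without running the method body is `inspect.signature(...).bind(...)`. The sketch below compares a pre-fix and a post-fix stand-in against the store call shape quoted in this issue. The two classes and the helper `call_shape_is_accepted` are hypothetical names for illustration; the real adapters need the LMCache runtime and are not imported here.

```python
import inspect


class PreFixWorkerAdapter:
    """Hypothetical stand-in with the old signature (no cache_salts)."""

    def batched_submit_store_requests(self, request_ids, ops, event):
        return None


class PostFixWorkerAdapter:
    """Hypothetical stand-in with the updated signature."""

    def batched_submit_store_requests(self, request_ids, ops, event, cache_salts=None):
        return None


def call_shape_is_accepted(method, *args, **kwargs):
    """Return True if the arguments bind to the signature; the method is not called."""
    try:
        inspect.signature(method).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


# Placeholder arguments shaped like the connector call site.
request_ids = ["req-0", "req-1"]
ops = [object(), object()]
event = object()
cache_salts = ["tenant-a", None]

pre_ok = call_shape_is_accepted(
    PreFixWorkerAdapter().batched_submit_store_requests,
    request_ids, ops, event, cache_salts=cache_salts,
)
post_ok = call_shape_is_accepted(
    PostFixWorkerAdapter().batched_submit_store_requests,
    request_ids, ops, event, cache_salts=cache_salts,
)
print(pre_ok, post_ok)
```

Binding against the signature checks only argument compatibility, which is the property this issue is about; it says nothing about how the salts are used inside the method.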

Notes

This fix only addresses the fallback/internal adapter path and does not affect the external LMCache adapter path.

Recommendation

Apply the fix by updating the fallback/internal adapter methods to accept the new arguments, so that the fallback path stays interface-compatible with the call sites in the same repo.
