pytorch - Expose public API for clearing cuBLAS workspaces (currently only private torch._C._cuda_clearCublasWorkspaces)

pytorch 2026-05-17 08:42:55


Fix Action

Fix / Workaround

NVIDIA's cuda-checkpoint tool requires ALL GPU-side allocations to be released before checkpointing a process. After calling torch.cuda.empty_cache() and gc.collect(), cuBLAS workspaces still hold GPU memory, causing checkpoint failures. The only workaround is calling the private torch._C._cuda_clearCublasWorkspaces().


🚀 The feature, motivation and pitch

torch._C._cuda_clearCublasWorkspaces() is the only way to release cuBLAS workspace allocations, but it's a private API. torch.cuda.empty_cache() explicitly does not free them (as documented in docs/source/notes/cuda.rst). There should be a public API for this.
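One way to observe the gap described above is to compare allocator statistics before and after the private call. The following is a minimal sketch, assuming cuBLAS workspaces are allocated through PyTorch's caching allocator and therefore show up in torch.cuda.memory_allocated(). The helper name measure_workspace_retention is hypothetical, and the import is guarded only so the sketch is a no-op on machines without PyTorch or a CUDA device.

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch run as a no-op where PyTorch is not installed
    torch = None


def measure_workspace_retention():
    """Return bytes still allocated after empty_cache(), before and after the private call.

    Returns None when PyTorch or a CUDA device is unavailable.
    """
    if torch is None or not torch.cuda.is_available():
        return None

    # A GPU matmul goes through cuBLAS, which may allocate a workspace.
    a = torch.randn(1024, 1024, device="cuda")
    b = a @ a
    torch.cuda.synchronize()
    del a, b

    gc.collect()
    torch.cuda.empty_cache()
    after_empty_cache = torch.cuda.memory_allocated()

    # Private API named in this issue; guarded because it may change or disappear.
    clear = getattr(torch._C, "_cuda_clearCublasWorkspaces", None)
    if clear is not None:
        clear()
    torch.cuda.empty_cache()
    after_clear = torch.cuda.memory_allocated()

    return {"after_empty_cache": after_empty_cache, "after_clear": after_clear}


print(measure_workspace_retention())
```

If the first number is non-zero and the second is lower, the difference is memory that empty_cache() alone did not release. The exact values depend on the GPU, the PyTorch build, and the cuBLAS workspace configuration.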

Motivation

This is becoming a practical blocker as cuda-checkpoint adoption grows for GPU process migration and cold start optimization (see vLLM RFC #34303, cuda-checkpoint #4).

Beyond checkpoint/restore, any workflow that needs to fully reclaim GPU memory between models (multi-model serving, benchmarking, testing) hits this gap.

Current workaround

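import gc
import torch

# Private API: works, but has no stability guarantee and could break.
torch._C._cuda_clearCublasWorkspaces()  # drop the cuBLAS workspace buffers
gc.collect()                            # drop unreachable Python objects that still hold tensors
torch.cuda.empty_cache()                # return the now-unused cached blocks to the driver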

Proposed API

Either of these would work:

Option A: Dedicated public function

torch.cuda.clear_cublas_workspaces()

Option B: Flag on existing empty_cache()

torch.cuda.empty_cache(include_cublas_workspaces=True)

Option B has the advantage of giving users a single call for "release everything", which is what most people expect empty_cache() to do already.
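Until either option lands, user code can approximate the "release everything" behaviour with a small shim. This is only a sketch: release_all_cuda_memory is a hypothetical helper name, and torch.cuda.clear_cublas_workspaces is the name proposed in Option A, which does not exist at the time of this issue. The shim looks it up dynamically so that it would be preferred if a future release adds it, and otherwise falls back to the private binding.

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch run as a no-op where PyTorch is not installed
    torch = None


def release_all_cuda_memory() -> bool:
    """Best-effort 'release everything'. Returns True if cuBLAS workspaces were cleared."""
    if torch is None or not torch.cuda.is_available():
        return False

    # Prefer the public name proposed in Option A if it ever exists
    # (hypothetical today), then fall back to the private binding.
    clear = getattr(torch.cuda, "clear_cublas_workspaces", None)
    if clear is None:
        clear = getattr(torch._C, "_cuda_clearCublasWorkspaces", None)

    cleared = clear is not None
    if cleared:
        clear()

    gc.collect()              # drop unreachable Python objects holding tensors
    torch.cuda.empty_cache()  # return unused cached blocks to the driver
    return cleared


print(release_all_cuda_memory())
```

Returning a boolean lets callers such as a checkpointing wrapper decide whether to proceed or warn when neither function is available.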

Alternatives

Continue using the private torch._C._cuda_clearCublasWorkspaces(), but this is fragile and not discoverable. Users who need full GPU memory release (for cuda-checkpoint, multi-model serving, etc.) have to find it through the source code or Stack Overflow.

Additional context

PyTorch's own test utilities (torch/testing/_internal/common_utils.py) already use clearCublasWorkspaces() for clean test state
The function is documented in docs/source/notes/cuda.rst but only as a private API
Related issues about incomplete memory release: #17157, #46602, #173382, #20837
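The clean-test-state use case mentioned above can be reproduced in user test suites with a teardown hook. The sketch below does not reproduce PyTorch's internal test utilities; CleanCudaStateTestCase and _reset_cuda_state are hypothetical names, and the private call is guarded because it carries no stability guarantee.

```python
import gc
import unittest

try:
    import torch
except ImportError:  # lets the sketch run (and skip) where PyTorch is not installed
    torch = None


def _reset_cuda_state() -> None:
    """Release cached GPU memory, including cuBLAS workspaces when possible."""
    if torch is None or not torch.cuda.is_available():
        return
    clear = getattr(torch._C, "_cuda_clearCublasWorkspaces", None)  # private API
    if clear is not None:
        clear()
    gc.collect()
    torch.cuda.empty_cache()


class CleanCudaStateTestCase(unittest.TestCase):
    """Base class that resets GPU memory state after every test."""

    def tearDown(self) -> None:
        _reset_cuda_state()
        super().tearDown()


class ExampleTest(CleanCudaStateTestCase):
    def test_matmul_shape(self) -> None:
        if torch is None or not torch.cuda.is_available():
            self.skipTest("requires PyTorch with a CUDA device")
        x = torch.ones(8, 8, device="cuda")
        self.assertEqual(tuple((x @ x).shape), (8, 8))
```

Saved as a module, this runs with python -m unittest. The test is skipped rather than failed on machines without a CUDA device.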

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia @csarofeen
