ollama - Is it possible to run both M60 GPUs in Linux? [7 comments, 2 participants]

GitHub stats
ollama/ollama#15131, fetched 2026-04-08 01:48:53
Comments: 7
Participants: 2
Timeline: 8
Reactions: 0

Error Message

CUDA error: an illegal memory access was encountered
  current device: 0, in function ggml_backend_cuda_synchronize at /ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:2981
  cudaStreamSynchronize(cuda_ctx->stream())

What is the issue?

I picked up an M60 on eBay. GPU 0 can only run in Vulkan mode with no CUDA version specified. GPU 1 can run on Vulkan with CUDA version 12 specified, but I can also run it without Vulkan. I can run two instances of Ollama with different settings. I tried CUDA 11. I tried multiple driver versions and CUDA versions. I also tried multiple Ollama versions. I've seen posts of people running only one GPU of the Tesla M60. I also tried snapd and Docker with no luck. memtest_vulkan runs fine on both GPUs. Any suggestions? Thanks.

Relevant log output

CUDA error: an illegal memory access was encountered
  current device: 0, in function ggml_backend_cuda_synchronize at /ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:2981
  cudaStreamSynchronize(cuda_ctx->stream())

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.18.3

extent analysis

Fix Plan

The error is reported by the CUDA backend on device 0 at a stream synchronization point, which suggests a CUDA compatibility or memory access problem on that GPU rather than a fault in the synchronize call itself. The steps below are things to try.

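  • Use an NVIDIA driver release that still supports the Tesla M60 (Maxwell, compute capability 5.2) together with a CUDA 11 or 12 runtime. Ollama bundles its own CUDA runtime libraries and its ggml backend does not use cuDNN, so no separate cuDNN install is needed.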
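  • Note that the failing cudaStreamSynchronize call in ggml-cuda.cu is already error-checked (that check is what prints the message above), so adding a check there will not remove the fault. Kernel launches are asynchronous, so the illegal access happened in earlier work on the stream and is only reported at the synchronization. The existing check is equivalent to:
cudaError_t err = cudaStreamSynchronize(cuda_ctx->stream());
if (err != cudaSuccess) {
    // Report the failure on stderr and stop: the CUDA context is unusable after an illegal access
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    abort();
}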
  • Try running Ollama with the CUDA_DEVICE_ORDER environment variable set to PCI_BUS_ID so that CUDA device indices follow PCI bus order and match what nvidia-smi reports:
export CUDA_DEVICE_ORDER=PCI_BUS_ID
  • If using Docker, install the NVIDIA driver and the NVIDIA Container Toolkit on the host (the driver is not installed inside the container), and start the container with GPU access, e.g. with the --gpus=all flag.
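
The device-ordering step can be combined with the reporter's approach of running one Ollama instance per GPU. The following is a minimal sketch, assuming each instance is pinned to a single GPU with CUDA_VISIBLE_DEVICES and given its own OLLAMA_HOST address. The helper name instance_env and the second port number are hypothetical, and the code only builds the environment; it does not start anything.

```python
import os


def instance_env(gpu_index, port, base_env=None):
    """Return an environment for one Ollama instance pinned to one GPU.

    Assumption: the helper name and the port choice are illustrative only,
    they do not come from the issue.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Enumerate CUDA devices in PCI bus order so indices match nvidia-smi.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Expose only the selected GPU to this instance.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    # Give each instance its own listen address.
    env["OLLAMA_HOST"] = f"127.0.0.1:{port}"
    return env


if __name__ == "__main__":
    for gpu_index, port in ((0, 11434), (1, 11435)):
        env = instance_env(gpu_index, port)
        print(gpu_index, env["CUDA_VISIBLE_DEVICES"], env["OLLAMA_HOST"])
```

Each returned dictionary could then be passed as the env argument of subprocess.Popen(["ollama", "serve"], env=env), which keeps a fault on one GPU from taking down the instance that serves the other.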

Verification

To verify, load a model on each GPU and check the Ollama server log for CUDA errors. If both GPUs complete a generation without the illegal memory access error, the setup is working.
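
Checking a log for CUDA errors can be automated. The sketch below assumes the server log has been captured as text and that errors follow the layout shown in the log output above; the function name find_cuda_errors is hypothetical, and the pattern is derived only from that one sample, so other error layouts may not match.

```python
import re

# Pattern based on the error layout quoted in this issue.
CUDA_ERROR_RE = re.compile(
    r"CUDA error: (?P<message>.+?)\s+"
    r"current device: (?P<device>\d+), in function (?P<function>\w+) "
    r"at (?P<file>\S+):(?P<line>\d+)"
)


def find_cuda_errors(log_text):
    """Return one dict per CUDA error found in the given log text."""
    errors = []
    for match in CUDA_ERROR_RE.finditer(log_text):
        errors.append({
            "message": match.group("message"),
            "device": int(match.group("device")),
            "function": match.group("function"),
            "file": match.group("file"),
            "line": int(match.group("line")),
        })
    return errors
```

An empty result means no error of this shape was logged; a non-empty one reports which device index failed, which helps tell a fault on GPU 0 from one on GPU 1.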

Extra Tips

  • Make sure the installed NVIDIA driver supports the CUDA runtime version in use; cuDNN is not required by Ollama's ggml CUDA backend.
  • If using a Docker container, ensure that the container has the necessary dependencies and environment variables set.
  • Consider using a tool like nvidia-smi to monitor GPU usage and detect any potential issues.
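
For the nvidia-smi tip, a small wrapper makes it easier to confirm that both GPUs of the M60 board are visible and to see their PCI bus IDs. This is a rough sketch, assuming nvidia-smi is on the PATH and using its CSV query output; the function names parse_gpu_csv and query_gpus are hypothetical.

```python
import csv
import io
import subprocess

QUERY_FIELDS = ["index", "name", "pci.bus_id", "memory.used", "memory.total"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        row = dict(zip(QUERY_FIELDS, (field.strip() for field in record)))
        row["index"] = int(row["index"])
        # Memory values stay as strings because nvidia-smi may print "[N/A]".
        rows.append(row)
    return rows


def query_gpus():
    """Run nvidia-smi and return the parsed rows (needs the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    )
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu["index"], gpu["name"], gpu["pci.bus_id"],
              gpu["memory.used"], gpu["memory.total"])
```

If only one row comes back, the second GPU is not visible to the driver at all, which is a different problem from the CUDA fault in the log.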
