Ollama Running on AMD GPUs – Random Linux Thoughts

This is a quick guide to enabling GPU hardware acceleration for Ollama on AMD GPUs (using the open-source amdgpu driver that ships with the kernel, not the proprietary AMD drivers). Supported AMD GPUs are listed on the Ollama GitHub page.

It is sometimes possible to get GPU hardware acceleration working on Linux with AMD GPUs that are not listed as officially supported. However, be aware that this can lead to system instability. I have successfully used this method on:

A desktop PC with an ancient Radeon RX 5500 XT with 8GB of VRAM.
A laptop with a Ryzen 7 4800H (Vega 7 iGPU).
A laptop with a Ryzen 7 8840U (Radeon 780M iGPU).
A laptop with a Ryzen 7 8845HS (which also has a Radeon 780M iGPU).

This guide is written for Arch Linux, but should be easy enough to use for other (inferior) Linux distros.

Installation

Install the following ROCm packages:

$ sudo pacman -S rocminfo rocm-opencl-sdk rocm-hip-sdk rocm-ml-sdk

Install Ollama with ROCm support:

$ sudo pacman -S ollama-rocm

If your AMD GPU is supported, then that’s all you have to do!
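Before loading a model, it can help to confirm that ROCm itself sees the GPU. The sketch below is not part of the original setup: it assumes that rocminfo (installed above) reports each GPU agent under a gfx name, and the helper name extract_gfx_targets is made up for illustration.

```shell
# Hypothetical helper: pull the unique gfx target names out of text on stdin.
extract_gfx_targets() {
  grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Only query the hardware if rocminfo is actually installed.
if command -v rocminfo >/dev/null 2>&1; then
  rocminfo | extract_gfx_targets
fi
```

If nothing is printed, ROCm probably does not see a GPU at all, and Ollama is likely to fall back to the CPU.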

List the installed models and run one of them:

$ ollama list

NAME                     ID              SIZE      MODIFIED     
codegeex4:latest         867b8e81d038    5.5 GB    16 hours ago    
deepseek-coder:6.7b      ce298d984115    3.8 GB    16 hours ago    
tinyllama:latest         2644915ede35    637 MB    16 hours ago    
yi-coder:1.5b            186c460ee707    866 MB    16 hours ago    
yi-coder:9b              39c63e7675d7    5.0 GB    16 hours ago    
deepseek-coder:latest    3ddd2d3fc8d2    776 MB    16 hours ago    
deepseek-r1:1.5b         a42b25d8c10a    1.1 GB    16 hours ago    
deepseek-r1:latest       0a8c26691023    4.7 GB    16 hours ago    

$ ollama run deepseek-r1:1.5b
>>> Send a message (/? for help)

Open another terminal window and check that Ollama is using the GPU:

$ ollama ps

NAME                ID              SIZE      PROCESSOR    UNTIL              
deepseek-r1:1.5b    a42b25d8c10a    2.0 GB    100% GPU     4 minutes from now
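The same check can be scripted. This is a minimal sketch that assumes the PROCESSOR column contains the word GPU whenever at least part of a model is offloaded (a mixed CPU/GPU split would match too); the helper name loaded_on_gpu is hypothetical.

```shell
# Hypothetical helper: succeed if any model row of `ollama ps` output (read
# from stdin) mentions GPU. The header row is skipped.
loaded_on_gpu() {
  awk 'NR > 1 && /GPU/ { found = 1 } END { exit !found }'
}

# Only run the check if the ollama binary is available.
if command -v ollama >/dev/null 2>&1; then
  if ollama ps | loaded_on_gpu; then
    echo "a loaded model is using the GPU"
  else
    echo "no loaded model is using the GPU"
  fi
fi
```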

Unsupported GPUs (Radeon 780M iGPU)

Unsupported AMD GPUs can sometimes be used on Linux by using the HSA_OVERRIDE_GFX_VERSION environment variable.

The Radeon 780M in my laptop is not listed among the supported GPUs. As noted above, it’s still sometimes possible to get unsupported AMD GPUs working with Ollama, although it can lead to system instability.

IMPORTANT: Initially, this method did not work for me because my BIOS did not include an option to select the amount of VRAM to reserve for the iGPU (it was stuck at 512MB, and couldn’t be changed). You need at least 4GB of VRAM for hardware acceleration to work with AMD GPUs (depending on the model you’re running). I had to download and install a compatible BIOS from the OEM’s website. Only do this if you have nerves of steel, as there is a chance you could brick your laptop.
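To see how much VRAM is currently reserved for the GPU without rebooting into the BIOS, the amdgpu driver exposes the total in sysfs. The following is a rough sketch: the helper name vram_report and its optional argument are made up here, and the default path assumes the usual /sys/class/drm layout.

```shell
# Hypothetical helper: print the VRAM total of each amdgpu device in MiB.
# The optional argument (an alternative sysfs root) is only there for testing.
vram_report() {
  local root f card bytes
  root="${1:-/sys/class/drm}"
  for f in "$root"/card*/device/mem_info_vram_total; do
    [ -r "$f" ] || continue   # skip entries that have no such file
    card=${f#"$root"/}        # strip the root prefix
    card=${card%%/*}          # keep only the cardN component
    bytes=$(cat "$f")
    echo "$card: $((bytes / 1024 / 1024)) MiB"
  done
}

vram_report
```

A result around 512 MiB would match the situation described above, where the BIOS reservation was too small.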

Find out the LLVM target of your AMD GPU:

$ glxinfo | grep -iA9 'extended renderer'

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: AMD (0x1002)
    Device: AMD Radeon Graphics (radeonsi, gfx1103_r1, LLVM 19.1.7, DRM 3.59, 6.13.1-arch2-1) (0x1900)
    Version: 24.3.4
    Accelerated: yes
    Video memory: 8192MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6

From the above, my GPU is a gfx1103_r1, which corresponds to LLVM target gfx1103, or version 11.0.3 (which isn’t listed as a supported AMD GPU on the Ollama GitHub page).
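The mapping from a gfx name to a dotted version can be sketched as a small helper. It assumes the usual naming scheme, in which the last digit is the stepping, the digit before it is the minor version, and everything before that is the major version. The function name gfx_to_override is made up for illustration, and targets containing a letter (such as gfx90a) are deliberately not handled.

```shell
# Hypothetical helper: turn a name such as gfx1103_r1 into 11.0.3.
# A trailing suffix like _r1 is dropped. Names that do not consist of
# "gfx" followed by digits only (e.g. gfx90a) print nothing.
gfx_to_override() {
  printf '%s\n' "$1" | sed -nE 's/^gfx([0-9]+)([0-9])([0-9])(_.*)?$/\1.\2.\3/p'
}

gfx_to_override gfx1103_r1   # prints 11.0.3
```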

I assumed that I needed to set HSA_OVERRIDE_GFX_VERSION to 11.0.3:

$ sudo systemctl edit ollama.service

...
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.3"
...

$ sudo systemctl restart ollama.service

This did not work. Looking at the Ollama logs I could see what the problem was:

$ sudo journalctl -fu ollama.service

...
Feb 09 14:35:13 lafite ollama[16809]: rocBLAS error: Cannot read /opt/rocm/lib/rocblas/library/TensileLibrary.dat: Illegal seek for GPU arch : gfx1103
Feb 09 14:35:13 lafite ollama[16809]:  List of available TensileLibrary Files :
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1012.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1010.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx900.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx941.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx940.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx906.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat"
Feb 09 14:35:13 lafite ollama[16809]: "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat"
Feb 09 14:35:14 lafite systemd-coredump[16855]: [🡕] Process 16845 (ollama_llama_se) of user 966 dumped core.
Feb 09 14:35:14 lafite ollama[16809]: time=2025-02-09T14:35:14.993Z level=ERROR source=sched.go:455 msg="error loading llama server" error="llama runner process has terminated: error:Cannot read /opt/rocm/lib/rocblas/library/TensileLibrary.dat: Illegal seek for GPU arch : gfx1103"
...

Based on this, I decided to set HSA_OVERRIDE_GFX_VERSION to 11.0.2 (gfx1102 appears in the list of available TensileLibrary files), and it worked! I confirmed Ollama was using the GPU by running ollama ps in another terminal tab, and by watching nvtop after entering a prompt into Ollama.
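The list of available targets does not have to come from a crash log. As a sketch, it can be read straight from the rocBLAS library directory named in the error message. The helper name list_rocblas_targets is hypothetical, and the default path is simply the one shown in the log, which may differ on other distros.

```shell
# Hypothetical helper: list the gfx targets that have rocBLAS Tensile files.
# The default directory is the one that appears in the log output.
list_rocblas_targets() {
  local dir
  dir="${1:-/opt/rocm/lib/rocblas/library}"
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  ls "$dir" | grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Show only the targets from the same family as gfx1103 (gfx110x), if any.
if [ -d /opt/rocm/lib/rocblas/library ]; then
  list_rocblas_targets | grep '^gfx110' || echo "no gfx110x targets found"
fi
```

Picking the closest target from the same family (gfx1102 in my case, hence 11.0.2) is a guess that happened to work here, not a guarantee.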

Stability

As warned above, when you use an unsupported GPU there are no guarantees your system will be stable. When I run Ollama on my desktop (which has the RX 5500 XT), I don’t have to set the HSA_OVERRIDE_GFX_VERSION environment variable (the card is automatically detected). However, if I play a video in Firefox with hardware-accelerated video playback enabled, my system becomes unusable within a couple of minutes of entering a prompt into Ollama. To avoid this, I have to turn off GPU hardware acceleration in Firefox (or use Chromium, which still doesn’t have GPU hardware acceleration for AMD on Linux).

I haven’t yet had the chance to compare the CPU-only vs GPU-only performance of Ollama on my systems, but will do this in the near future.