
llama.cpp, the backend Ollama uses, definitely supports those older cards via its OpenCL and Vulkan backends, though performance is worse than with ROCm or CUDA. In the Vulkan PR thread, for instance, I see people getting it working with Polaris and even Hawaii cards.

https://github.com/ggerganov/llama.cpp/pull/2059

Personally, I just run it on the CPU, and several tokens/s is good enough for my purposes.
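For what it's worth, a sketch of how one might build llama.cpp with the Vulkan backend enabled (the `GGML_VULKAN` CMake option is from llama.cpp's build docs; the exact flag and binary names can vary between versions, so check the repo for yours):

```shell
# Clone and build llama.cpp with the Vulkan backend enabled.
# Requires the Vulkan SDK/headers installed; a plain CPU build
# needs no extra flags at all.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Then run inference against a GGUF model, e.g.:
# ./build/bin/llama-cli -m model.gguf -p "Hello" -n 64
```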
