• afk_strats@lemmy.world · 9 points · 5 days ago (edited)

    Ollama does use ROCm; however, so does llama.cpp. Vulkan is simply another backend that llama.cpp supports.

    GitHub: llama.cpp Supported Backends

    There are old PRs which attempted to bring Vulkan support to Ollama - a logical and helpful move, given that the Ollama engine is based on llama.cpp - but the Ollama maintainers weren't interested.

    As for performance vs ROCm, it does fine. Against CUDA, it also does well unless you're in a multi-GPU setup. Its magic trick is compatibility: pretty much everything runs Vulkan, and Vulkan is interoperable across generations of cards, architectures AND vendors. That's how I'm running a single PC with Nvidia and AMD cards together (rough sketch below).
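
    To make the cross-vendor point concrete, here's a minimal sketch (plain Vulkan C API, not llama.cpp code; the filename and compile line are just assumptions) that lists every GPU the Vulkan loader exposes. On a mixed box, the AMD and the Nvidia card both show up through the same interface, which is what a Vulkan build of llama.cpp rides on.

    ```cpp
    // list_gpus.cpp - enumerate every GPU visible to the Vulkan loader,
    // regardless of vendor. Build (assumption): g++ list_gpus.cpp -lvulkan
    #include <vulkan/vulkan.h>
    #include <cstdio>
    #include <vector>

    int main() {
        VkApplicationInfo app{};
        app.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
        app.apiVersion = VK_API_VERSION_1_1;

        VkInstanceCreateInfo info{};
        info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
        info.pApplicationInfo = &app;

        VkInstance instance;
        if (vkCreateInstance(&info, nullptr, &instance) != VK_SUCCESS) {
            std::fprintf(stderr, "failed to create Vulkan instance\n");
            return 1;
        }

        uint32_t count = 0;
        vkEnumeratePhysicalDevices(instance, &count, nullptr);
        std::vector<VkPhysicalDevice> devices(count);
        vkEnumeratePhysicalDevices(instance, &count, devices.data());

        // An AMD and an Nvidia card both appear here through the same API,
        // which is what lets one backend drive a mixed-vendor machine.
        for (VkPhysicalDevice dev : devices) {
            VkPhysicalDeviceProperties props;
            vkGetPhysicalDeviceProperties(dev, &props);
            std::printf("GPU: %s (vendor 0x%04x)\n",
                        props.deviceName, (unsigned)props.vendorID);
        }

        vkDestroyInstance(instance, nullptr);
        return 0;
    }
    ```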

    • hendrik@palaver.p3x.de · 2 points · 5 days ago
      I think llama.cpp merged ROCm support in 2023 already. It’s called HIP on their Readme, but I’m not super educated on all the acronyms and compute frameworks and instruction sets.

      • afk_strats@lemmy.world · 5 points · 5 days ago
        ROCm is a software stack which includes a bunch of SDKs and APIs.

        HIP is a subset of ROCm which lets you program AMD GPUs with a focus on portability from Nvidia's CUDA.
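
        To give a feel for what that portability means, here's a minimal sketch of a HIP vector-add (the filename and build line are assumptions). Every call maps almost 1:1 onto a CUDA counterpart - hipMalloc ~ cudaMalloc, hipMemcpy ~ cudaMemcpy, and the kernel launch mirrors CUDA's - which is why CUDA code can often be ported to AMD GPUs fairly mechanically (e.g. with the hipify tools).

        ```cpp
        // vec_add.cpp - HIP version of the classic CUDA vector add.
        // Build (assumption): hipcc vec_add.cpp -o vec_add
        #include <hip/hip_runtime.h>
        #include <cstdio>

        __global__ void vec_add(const float* a, const float* b, float* c, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) c[i] = a[i] + b[i];
        }

        int main() {
            const int n = 1024;
            static float ha[n], hb[n], hc[n];
            for (int i = 0; i < n; ++i) { ha[i] = float(i); hb[i] = 2.0f * i; }

            // hipMalloc / hipMemcpy are drop-in analogues of cudaMalloc / cudaMemcpy.
            float *da, *db, *dc;
            hipMalloc(&da, n * sizeof(float));
            hipMalloc(&db, n * sizeof(float));
            hipMalloc(&dc, n * sizeof(float));
            hipMemcpy(da, ha, n * sizeof(float), hipMemcpyHostToDevice);
            hipMemcpy(db, hb, n * sizeof(float), hipMemcpyHostToDevice);

            // Same launch semantics as CUDA; hipcc also accepts the <<<...>>> syntax.
            hipLaunchKernelGGL(vec_add, dim3(n / 256), dim3(256), 0, 0, da, db, dc, n);

            hipMemcpy(hc, dc, n * sizeof(float), hipMemcpyDeviceToHost);
            std::printf("c[42] = %f\n", hc[42]);  // expect 126.0

            hipFree(da); hipFree(db); hipFree(dc);
            return 0;
        }
        ```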