• PlanterTree@discuss.tchncs.de
    2 days ago

    Intel and ARM Ampere systems.

    Does this mean they optimized for CPU instead of GPU? I doubt they target Intel GPUs tbh, so they really optimized for CPU… interesting!

    • brucethemoose@lemmy.world
      17 hours ago

      All the runtimes except the Intel ones are llama.cpp Q4_K_M quants, so the Ampere ones aren’t anything special.

      …The Intel ones kinda are, though. They actually have runtimes for CPU, GPU, and NPU, and AFAIK the CPU ones may be able to use AMX if you are on a server CPU.

      It’s still not great for a lot of reasons, but one could do worse.
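
For anyone wondering whether their own box would even expose AMX: a minimal sketch, assuming a Linux x86 machine where `/proc/cpuinfo` lists CPU features on a `flags` line. The helper name `amx_flags` is made up for illustration, and this only shows what the kernel reports, not whether any particular runtime actually uses AMX.

```python
def amx_flags(cpuinfo_text: str) -> set[str]:
    """Return the AMX-related feature flags found in /proc/cpuinfo-style text.

    Hypothetical helper: it only reads the first "flags" line, which is how
    x86 Linux kernels list CPU features. ARM kernels use a "Features" line
    instead, so this returns an empty set there.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # AMX shows up as flags such as amx_tile, amx_int8, amx_bf16.
            return {flag for flag in value.split() if flag.startswith("amx")}
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as fh:
            found = amx_flags(fh.read())
    except FileNotFoundError:
        # Not Linux, or /proc is not available.
        found = set()
    print("AMX flags reported:", sorted(found) if found else "none")
```

An empty result on a desktop or laptop CPU would be expected, since AMX is currently a server (Xeon) feature, which matches the "if you are on a server CPU" caveat above.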