Any frontend/model that runs on a 9070XT

CheeseNoodle@lemmy.world · edit-2 7 months ago

Any frontend/model that runs on a 9070XT

vivendi@programming.dev · 7 months ago

llama.cpp

The Only Inference Engine You’ll Ever Need™

CheeseNoodle@lemmy.world · edit-2 7 months ago

I found this guide, which seems very comprehensive, but a few sections assume knowledge I don’t have and don’t suggest a clear way to get it.

For the section just after “Grab the content of SmolLM2 1.7B Instruct”, I assume it boils down to opening a program called MSYS first and then running this command in it? “GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct”

vivendi@programming.dev · edit-2 7 months ago

That section is for quantizing a model yourself. You can (read that as “should”) download an already quantized model instead. Quantized versions are usually linked from the Hugging Face page of your model of choice. (Pro tip: quants by Bartowski, Unsloth and Mradermacher are high quality.)
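If it helps, here is a rough Python sketch of how a direct download link for a single quantized file is put together. The `/resolve/<revision>/<filename>` pattern is how Hugging Face serves individual files; the repo and file names below are assumptions for illustration only, so check the actual model page for the real ones.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"


def gguf_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL for one file in a Hugging Face repo."""
    # quote() leaves "/" alone by default, so "user/repo" stays intact.
    return f"{HF_BASE}/{quote(repo_id)}/resolve/{quote(revision)}/{quote(filename)}"


if __name__ == "__main__":
    # Assumed repo and file names; look them up on the model page.
    url = gguf_url(
        "bartowski/SmolLM2-1.7B-Instruct-GGUF",
        "SmolLM2-1.7B-Instruct-Q4_K_M.gguf",
    )
    print(url)
    # To actually fetch the file (it can be large):
    #   import urllib.request
    #   urllib.request.urlretrieve(url, "model.gguf")
```

Clicking the download button next to the .gguf file on the model page does the same thing; the script is only useful if you want to automate it.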

And then you just run it.
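As a minimal sketch of what “just run it” can look like, this builds a `llama-server` command line for a downloaded GGUF file. It assumes `llama-server` from a GPU-enabled llama.cpp build (for an AMD card, typically Vulkan or ROCm) is on your PATH; the model path, context size and port are placeholder values, not recommendations.

```python
def llama_server_cmd(model_path: str, gpu_layers: int = 99,
                     ctx_size: int = 4096, port: int = 8080) -> list[str]:
    """Return an argument list that starts llama.cpp's HTTP server."""
    return [
        "llama-server",
        "-m", model_path,         # path to the quantized .gguf file
        "-ngl", str(gpu_layers),  # layers to offload to the GPU (a large value offloads all)
        "-c", str(ctx_size),      # context window in tokens
        "--port", str(port),      # port the server listens on
    ]


if __name__ == "__main__":
    # "model.gguf" is a placeholder path.
    cmd = llama_server_cmd("model.gguf")
    print(" ".join(cmd))
    # To launch it for real:
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Once the server is up, it serves a small built-in web UI on that port in your browser.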

You can also use Kobold.cpp or OpenWebUI as friendly front ends for llama.cpp.

Also, to answer your question, yes.