Xylight@lemdro.id to LocalLLaMA@sh.itjust.works · English · edited, 23 hours ago
My 8gb vram system as I try to load GLM-4.6-Q0.00001_XXXS.gguf:
media1.tenor.com
afk_strats@lemmy.world · 22 hours ago
That fixed it.
I am a fan of this quant cook. He often posts perplexity charts.
https://huggingface.co/ubergarm
All of his quants require ik_llama.cpp, which works best with Nvidia CUDA, but they can do a lot with RAM + VRAM offloading, or even hard drive + RAM. I don’t know if 8 GB of VRAM is enough for everything, but the rough idea of that kind of offload is sketched below.
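Not the exact setup, just a minimal sketch of what RAM + VRAM offloading looks like with an ik_llama.cpp-style server. The model filename and the tensor-override pattern are placeholders, and flag names can differ between builds, so check ubergarm's model cards for the invocation he actually recommends.

```
# Minimal sketch (placeholder filename and pattern; flags vary by build):
./llama-server \
  -m GLM-4.6-IQ2_KS.gguf \   # placeholder ubergarm-style quant file
  -c 8192 \                  # context size
  -ngl 99 \                  # try to put the repeating layers on the GPU...
  -ot "exps=CPU" \           # ...but keep the big MoE expert tensors in system RAM
  -fa                        # flash attention, if the build supports it
```

With something like that, the shared weights sit in the 8 GB of VRAM and the expert tensors get pulled from system RAM (or mmap'd from disk when RAM runs short), which is why these quants can run at all on small cards, just slowly.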