Qwen3.6-35B-A3B released (huggingface.co)
Posted by TheCornCollector@piefed.zip to LocalLLaMA@sh.itjust.works · English · edited · 7 hours ago · 14 comments
The Qwen3.5 models are still the best local models I’ve used, so I’m excited to see how this updated version performs.
venusaur@lemmy.world · 6 hours ago
What kind of system requirements does it take to run this new model decently?

TheCornCollector@piefed.zip (OP) · 4 hours ago
I'm running it with the UD_Q4_K_XL quant on a 24 GB VRAM 7900 XTX at ~85 tokens/s. Since it's an MoE model, CPU inference with 32 GB of RAM should be doable, but I won't make any promises on speed.

fonix232@fedia.io · 1 hour ago
Wonder what the wombo-combo of a Ryzen AI APU can do with this. Time to fire up the trusty 370.

venusaur@lemmy.world · 4 hours ago
Thanks! That sounds expensive. Hopefully 24 GB VRAM gets cheaper or models get more efficient soon.

Jakeroxs@sh.itjust.works · 3 hours ago
You'd want to wait until smaller 3.6 models are released; I'd assume that'll be soon.

venusaur@lemmy.world · 2 hours ago
Thanks! I'm hoping to run at least a 20B model. Not sure I can do that fast enough without 24 GB; that seems to be the sweet spot.

Infinite@lemmy.zip · 5 hours ago
Probably 24 GB VRAM and 32-64 GB RAM for minimum specs with 4-bit quantization. This is a beefy boi.

venusaur@lemmy.world · 4 hours ago
Thanks! Not for me yet. Hope to save up enough to get 24 GB VRAM in the near future.
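[For reference, the partial-offload setup discussed above can be sketched as a llama.cpp launch command. This is a hedged sketch only: the GGUF filename is an assumption, and the flag values are illustrative, not taken from the thread.]

```sh
# Sketch: serving a ~4-bit GGUF quant with llama.cpp's llama-server.
# The model filename below is a guess at how the quant would be named.
./llama-server \
  --model Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf \
  --n-gpu-layers 99 \
  --ctx-size 8192 \
  --port 8080
# With less than 24 GB VRAM, lower --n-gpu-layers so the remaining layers
# run on CPU. Because an MoE model activates only ~3B parameters per token,
# partial CPU offload can remain usable, though slower.
```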