The change is a result of MTP support landing in llama.cpp. The Qwen3.6 Unsloth GGUFs are now out of experimental mode: llama.cpp has merged many PRs, and MTP is now properly supported in Unsloth.
I’ve been using Qwen3.6 35B since it came out, with really good results. This is the cherry on top. Thanks for sharing!
@davel@lemmy.ml the requirements for running Qwen just got significantly lower. It’s basically the best local model at the moment.
Thanks. I haven’t bought hardware to run things locally yet. I did buy some DeepSeek tokens this weekend to play around with. Maybe I should rent until the bubble pops and then buy a supercomputer at fire sale prices.
Oh yeah, that’s definitely the best approach if you don’t already have the hardware since DeepSeek is just absurdly cheap to use. Eventually, hardware prices are going to come down, and local models are going to keep getting more efficient too. So, dumping a few grand on a rig right now doesn’t really make much sense.




