Run Qwen3.6 MTP GGUFs locally ~1.4–2.2× faster with no accuracy loss and with only 18 GB VRAM

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 2 months ago

kmartburrito@lemmy.world · 1 month ago

Agreed! It really is neat to be present and participating during this time. I know the future will hold great things, but it’s crazy how quickly things move, to your point.

I know there will be some demand for turnkey AI solutions, since people unlike us won’t have the time or patience (or hardware) to make it work, but it’s so rewarding when it does work.

And boy does it work!

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 1 month ago

For sure, it’s pretty magical, and I feel like this year has been a real breakthrough for local models where they really can do non-trivial work. I’m really excited to see what things look like by next year.

unsloth/Qwen3.6-27B-MTP-GGUF · Hugging Face