Run Qwen3.6 MTP GGUFs locally ~1.4–2.2× faster with no accuracy loss and with only 18 GB VRAM

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 2 months ago

Oh yeah, that’s definitely the best approach if you don’t already have the hardware since DeepSeek is just absurdly cheap to use. Eventually, hardware prices are going to come down, and local models are going to keep getting more efficient too. So, dumping a few grand on a rig right now doesn’t really make much sense.

unsloth/Qwen3.6-27B-MTP-GGUF · Hugging Face