
  • update: I tried GLM4.5 air and it was awesome until I remembered how censored it is by the Chinese government. That’s probably fine if I’m just coding, but on principle I didn’t like running a model that will refuse to talk about things China doesn’t like. I tried Dolphin-Mistral-24B, which will answer anything but isn’t particularly smart.

    So I’m trying out gpt-oss-120b, which was running at an amazing 5.21 t/s, but the reasoning output was broken, and it seems the way to fix it is to switch from the llama.cpp Python wrapper to pure llama.cpp

    …which I did, and it fixed the reasoning output… but now I only get 0.61 t/s :| (a setup sketch is at the end of this comment)

    anyway, I’m on my journey :) thanks y’all
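
    For anyone following along: a drop from ~5 t/s to under 1 t/s after switching runtimes is very often just the GPU offload setting not carrying over, so that’s worth ruling out first. Below is a minimal sketch of where that knob lives in the llama-cpp-python wrapper; the model path and numbers are placeholders, not my actual config. In pure llama.cpp the equivalent is the -ngl / --n-gpu-layers flag.

```python
# Minimal llama-cpp-python sketch showing the GPU offload knob.
# The path and parameter values are placeholders, not a real config.
from llama_cpp import Llama

llm = Llama(
    model_path="models/gpt-oss-120b.gguf",  # hypothetical local GGUF path
    n_gpu_layers=-1,  # -1 offloads all layers to the GPU; 0 runs CPU-only (very slow for a 120B model)
    n_ctx=8192,       # context window size
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```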