My experience with local LLM

ntn888@lemmy.ml · 22 hours ago

My experience with local LLM

☂️-@lemmy.ml · 17 hours ago

is it just me or the smaller models that fit in my vram are very dumb?

SuspciousCarrot78@lemmy.world · edit-2 10 hours ago

It’s not just you. But while they may be natively “dumb”, they can be augmented quite significantly. Even adding a simple web-search tool can help a lot.

So, there are levels of “dumb”. Some - like Qwen3-4B 2507 instruct - may not have the world knowledge of a SOTA, but its reasoning abilities can be quite impressive. See HERE as an example of a self made test suite. You can run something similar yourself.

I guess it depends what you mean by “dumb” and how that affects what you’re trying to do with them. Some are dumb at tool use, some have poor world knowledge etc. You can find small models that are good at what’s important to you if you dig around. Except for coding - that’s rough. Probably the smallest stand-alone that might make you sit up and pay attention is something like Qwen2.5-Coder-14B-Instruct or FrogMini-14B-2510…but I wouldn’t trust them to go spelunking a code base.

☂️-@lemmy.ml · 6 hours ago

how are some other ways to make it better beyond just adding a search tool? is 16gb vram sufficient for usable results?

where do you think is the best place to go into this rabbit hole?

ntn888@lemmy.ml · 10 minutes ago

I didn;t try any 7b ones lately, they may be better fit for 16gb I think. I was able to try the 2b ones as I mentioned (on cpu). they are subpar. like mentioned the usable ones were 31b, I think you need atleast 24gb vram for most models though. maybe someone else can suggest better.