I think you’re right. Saw a post on Reddit basically mentioning the same things I’m seeing.
It looks like AutoAWQ supports it, but it might be an issue with how oobabooga implements it or something…
I have two 3090 Turbo GPUs, and it seems like oobabooga doesn’t split the load between the two cards when I try to run TheBloke/dolphin-2.7-mixtral-8x7b-AWQ.
Does anyone know how to make text generation webui use both cards? Do I need an NVLink bridge between the two cards?
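For what it’s worth, NVLink shouldn’t be required just to split a model across cards: the Hugging Face accelerate loading path can place layers on each GPU over PCIe via `device_map="auto"` and a `max_memory` map. A minimal sketch, assuming the transformers/AutoAWQ loading path (the VRAM figures and helper name are illustrative, not from the post):

```python
# Sketch: spreading an AWQ model across two GPUs with a max_memory map.
# Assumes the Hugging Face transformers/accelerate loading path; the
# per-GPU budgets below are illustrative assumptions, not exact values.

def build_max_memory(num_gpus: int, per_gpu: str = "22GiB", cpu: str = "32GiB") -> dict:
    """Build the max_memory dict that accelerate uses to place layers per device."""
    mem = {i: per_gpu for i in range(num_gpus)}
    mem["cpu"] = cpu  # overflow spills to system RAM
    return mem

if __name__ == "__main__":
    # Leave headroom below the 3090's 24 GiB for activations and KV cache.
    max_memory = build_max_memory(num_gpus=2)
    print(max_memory)

    # Hypothetical loading call (requires transformers + autoawq installed):
    # from transformers import AutoModelForCausalLM
    # model = AutoModelForCausalLM.from_pretrained(
    #     "TheBloke/dolphin-2.7-mixtral-8x7b-AWQ",
    #     device_map="auto",      # let accelerate place layers on both GPUs
    #     max_memory=max_memory,
    # )
```

If oobabooga isn’t passing something like this through for AWQ models, that would explain the single-card behavior.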
I tried this, but I much prefer to use the oobabooga text generation webui. I couldn’t get your solution to work in Unraid for whatever reason.
Check out Onju Voice!
I’m sure that’s not that big of a deal to some people. For example, I’m mainly using LLMs with my Home Assistant instance.
You should probably also sync them. I use Orbital Sync for this: https://github.com/mattwebbio/orbital-sync
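In case it helps, a minimal docker-compose sketch for Orbital Sync along the lines of the project’s README (hostnames, passwords, and the interval here are placeholder assumptions; double-check the variable names against the repo before using):

```yaml
# Sketch only: env var names follow the orbital-sync README as I recall them;
# verify against https://github.com/mattwebbio/orbital-sync
services:
  orbital-sync:
    image: mattwebbio/orbital-sync:1
    environment:
      PRIMARY_HOST_BASE_URL: "http://pihole-primary.local"   # placeholder
      PRIMARY_HOST_PASSWORD: "changeme"                      # placeholder
      SECONDARY_HOSTS_1_BASE_URL: "http://pihole-backup.local"  # placeholder
      SECONDARY_HOSTS_1_PASSWORD: "changeme"                 # placeholder
      INTERVAL_MINUTES: 30
```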
Do you know if there is OpenAI API compatibility? More specifically, I’d like to get Home Assistant to interact with the LLM via the custom OpenAI HACS add-on.
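text-generation-webui does ship an OpenAI-compatible API when launched with the `--api` flag, so anything that speaks the standard chat-completions format should be able to point at it. A stdlib-only sketch of what such a request looks like (the URL/port are the webui defaults and may differ in your setup; the model name is illustrative since the webui serves whatever model is loaded):

```python
import json
import urllib.request

# Assumed default endpoint for text-generation-webui launched with --api;
# adjust host/port to your instance.
API_URL = "http://localhost:5000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,  # illustrative; the webui uses whatever model is loaded
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 200,
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """POST the request and pull the assistant reply out of the response."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

A client like Home Assistant’s OpenAI integration essentially just needs the base URL swapped to the local endpoint.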