

OpenWebUI has TTS and STT (text-to-speech and speech-to-text) built in.


Adding here: most Docker images support semver tag pinning! It’s a great balance between automated updates and avoiding breakage.
I’d recommend a solid backup client. This isn’t something you want to find broken when you need it.
Kopia is what I use, and it supports local (LAN) targets as well as cloud storage if you want 3-2-1 backups for some or all of your data. Good luck!
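For what it’s worth, semver pinning in a compose file looks like this (image name and tags are hypothetical, just to show the pattern — tag granularity varies per image):

```yaml
services:
  app:
    # Pin to a minor series: "1.4" follows 1.4.x patch releases, so you get
    # fixes automatically but never an unreviewed 2.x breaking change.
    # "1.4.7" would freeze the version entirely; "latest" pulls whatever
    # ships next, breakage included.
    image: ghcr.io/example/app:1.4
```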


There’s a Romm app in ports you can use for this.


I was interested until I saw that crypto staking to vote on changes is baked in.
Similar setup here with a 7900 XTX, works great, and the 20-30B models are honestly pretty good these days. Magistral, Qwen 3 Coder, and GPT-OSS are most of what I use.
Yeah, similar-sized environments here too, but I’ve had good experiences with Ansible. I saw Chef struggle at even smaller scales. And Puppet. And Saltstack. But I’ve also seen all of them succeed. Like most things, it depends on how you run it. Nothing is a perfect solution, but I think Ansible has few game-breaking tradeoffs for its advantages.
Wow, huge disagree on saltstack and chef being ahead of Ansible. I’ve used all 3 in production (and even Puppet) and watched Ansible absolutely surge onto the scene and displace everyone else in the enterprise space in a scant few years.
Ansible is just so much lower overhead and so much easier to understand and make changes to. It’s dominating the configuration management space for a reason. And nearly all of the self-hosted/homelab space is active in Ansible and has tons of well-baked playbooks.
Of course, and this is why the new hotness is Mixture of Experts, where a single model routes each input to specialized expert subnetworks, or, at a different scale, Mixture of Agents setups where different specialized agents perform specialized tasks.
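For the curious, the MoE routing idea can be sketched in a few lines (purely illustrative: the gate here is hard-coded, real gates are learned networks, and real experts are neural sub-layers rather than lambdas):

```python
def moe_forward(x, experts, gate, top_k=2):
    """Route input x to the top_k highest-scoring experts and blend them."""
    scores = gate(x)                              # one score per expert
    ranked = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]                       # only these experts run
    total = sum(scores[i] for i in chosen)
    # weighted sum of the selected experts' outputs
    return sum(scores[i] / total * experts[i](x) for i in chosen)

experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
gate = lambda x: [0.1, 0.7, 0.2]                  # fixed gate, for the demo only
print(moe_forward(5.0, experts, gate))            # top-2 experts blended by weight
```

The key point is that the gate picks a subset per input, so only a fraction of the total parameters run on any given token.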


What controller?


Can I guarantee? There are no guarantees in self-hosting. By this logic you can never move away from Plex. There are always unknowns. There are always new issues to trip over. Plex is hardly without its own warts, but because they’re ‘known’ to you and your users, nothing else will ever be able to measure up.
It’s a logical fallacy and a trap.
I set up Jellyfin basically overnight when the Plex pass changes occurred. Reverse proxies are trivial, as are docker containers, don’t let the anecdotes about things being hard or VPN being needed intimidate you.
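As an example of how trivial it can be, a Jellyfin reverse proxy in Caddy is roughly this (the domain is a placeholder; it assumes Jellyfin on its default port 8096 and lets Caddy handle TLS certificates automatically):

```
jellyfin.example.com {
    reverse_proxy localhost:8096
}
```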
There were absolutely bumps in the road. I had to make users for each person and email them customized sign-up links. Yes, that kinda sucked, but that’s the price for running and controlling the authentication yourself instead of through a 3rd-party service that can and absolutely will eventually use that data to snoop.
Most of the time, once sent the link, the users were fine; 9/10 of my users had no further issues and quickly adapted. For the last 1/10, I had to troubleshoot a few things and eventually ended up recommending a different device to connect with (it was an old TV with a really old version of Plex for TVs; they ended up buying a $40 Google TV device from Walmart and got set up that way).
The whole time I was running both Plex and Jellyfin so the migration process could happen at my speed.
My point is this: no, it wasn’t painless to switch. Yes, some tech support was required. Yes, the user who was getting hundreds of dollars (annually) of streaming services effectively for free had to shell out a paltry sum to upgrade and actually enjoys their experience much more now. No, that didn’t make it impossible or not worth doing.
I’m not saying what’s best for you and your users, and I’m absolutely not guaranteeing you’ll have no issues beyond these, but I hope you understand your hands aren’t actually tied, you’re just boxing yourself in.


Depends on your goals. For raw tokens per second, yeah you want an Nvidia card with enough™ memory for your target model(s).
But if you don’t care so much about speed beyond a certain point, or you’re okay sacrificing some speed for economy, the AMD RX 7900 XT/XTX or 9070 both work pretty well for small-to-mid-sized local models.
Otherwise you can look at SoC-type solutions like AMD Strix Halo or Nvidia DGX for more model size at the cost of speed, but always look for reputable benchmarks showing ‘enough’ speed for your use case.


Can you provide your docker-compose entry or your docker run command?


Okay, thanks for giving me that, I’ll investigate further tonight


So I just spot-checked. Both shows work; you just have to not click an episode anymore.
E.g., https://pbskids.org/videos/design-squad -> design-squad
Thank you for telling me, I’ll update the readme
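The URL-to-slug mapping above is just the last path segment of the show URL; a quick sketch (the function name is mine, not from the project):

```python
from urllib.parse import urlparse

def show_slug(url: str) -> str:
    """Return the show slug from a pbskids.org /videos/ show URL."""
    path = urlparse(url).path          # e.g. "/videos/design-squad"
    return path.rstrip("/").split("/")[-1]

print(show_slug("https://pbskids.org/videos/design-squad"))  # -> design-squad
```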


Hmmm. I just double-checked and my episodes are still downloading. But maybe newer shows have a different format… What’s the exact error? I’ll try to reproduce and fix.
Totally agreed. LLMs shouldn’t be asked to know things; it’s counterproductive. They should be asked to DO things, and use available tools to do that.