A lemmy nomad. Wish there was a way to migrate posts and comments from .world to .ml to here… 😪

  • 0 Posts
  • 6 Comments
Joined 3 months ago
Cake day: March 14th, 2025


  • Sure, I run OpenWebUI in a docker container from my TrueNAS SCALE home server (it’s one of their standard packages, so basically a 1-click install). From there I’ve configured API use with OpenAI, Gemini, Anthropic and DeepSeek (part of my job involves evaluating the performance of these big models for various in-house tasks), along with pipelines for some of our specific workflows and MCP via mcpo.

    I previously had my Ollama installation in another Docker container but didn’t like having a big GPU in my NAS box, so I moved it to its own box. I am mostly interested in testing small/tiny models there. I again have Ollama running in a Docker container (just the official Docker image), but this time on a Debian bare-metal server, and I configured another OpenWebUI pipeline to point to that (OpenWebUI lets you select which LLM(s) you want to use on a conversation-by-conversation basis, so there’s no problem having a bunch of them hooked up at the same time).
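    In case it helps picture the “point it at the other box” part: anything that can reach Ollama’s HTTP API on that server can use its models, which is essentially all the OpenWebUI connection does. A minimal sketch, assuming the Debian box is reachable as ollama-box on Ollama’s default port and already has a small model pulled (both placeholders):

    ```python
    import requests

    # Placeholder host: the bare-metal Debian box running the official
    # Ollama Docker image, listening on Ollama's default port 11434.
    OLLAMA_URL = "http://ollama-box:11434"

    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={
            "model": "qwen2.5:3b",          # any small model you've pulled on that box
            "prompt": "Say hi in five words.",
            "stream": False,                # return one JSON object instead of a stream
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])
    ```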




  • will@lemm.ee to LocalLLaMA@sh.itjust.works · Specialize LLM · 1 point · 1 month ago

    Making your own embeddings is for RAG. Most base model providers have standardized on OpenAI’s embeddings scheme, but there are many ways to do it. Typically you embed a few tokens’ worth of data at a time and store that in your vector database. This lets your AI later do some vector math (usually a cosine similarity search) to see how similar (related) the embeddings are to each other and to what you asked about. There are fine-tuning schemes where you make embeddings before the tuning as well, but most people today use whatever fine-tuning service their base model provider offers, which usually comes with some layers of abstraction.
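    To make that vector math concrete, here is a tiny sketch in plain NumPy; the three-dimensional vectors are toy stand-ins for real embeddings (which usually have hundreds to thousands of dimensions) returned by whatever embedding model you use:

    ```python
    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        # Cosine similarity: 1.0 = same direction, 0.0 = unrelated, -1.0 = opposite
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Pretend these came back from your embedding model for three stored chunks
    # and one query.
    chunk_vectors = {
        "chapter 1 summary": np.array([0.9, 0.1, 0.0]),
        "appendix tables":   np.array([0.1, 0.8, 0.2]),
        "author biography":  np.array([0.0, 0.2, 0.9]),
    }
    query_vector = np.array([0.8, 0.2, 0.1])

    # Rank stored chunks by similarity to the query; a vector database does
    # essentially this, just at scale and with smarter indexing.
    ranked = sorted(
        chunk_vectors.items(),
        key=lambda kv: cosine_similarity(query_vector, kv[1]),
        reverse=True,
    )
    for name, vec in ranked:
        print(f"{name}: {cosine_similarity(query_vector, vec):.3f}")
    ```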


  • will@lemm.ee to LocalLLaMA@sh.itjust.works · Specialize LLM · 7 points · 1 month ago

    The easiest option for a layperson is retrieval-augmented generation, or RAG. Basically, you encode your books, upload them into a special kind of database, and then tell a regular base model LLM to check that data when composing an answer. I know ChatGPT has a built-in UI for this (and maybe Anthropic does too), but you can also build something yourself using LangChain or OpenWebUI and the model of your choice; a rough sketch of the flow follows below.
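    Here is that flow stripped down to code, assuming the OpenAI Python SDK, an OPENAI_API_KEY in the environment, and a handful of in-memory chunks standing in for the vector database; the model names and excerpts are placeholder assumptions, not a recommendation:

    ```python
    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    def embed(texts: list[str]) -> np.ndarray:
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([d.embedding for d in resp.data])

    # "Encode your books": in practice you'd split the books into many chunks first.
    chunks = [
        "Chapter 3 describes the protagonist's move to the coast.",
        "The appendix lists every ship mentioned in the trilogy.",
        "Chapter 7 covers the storm and the loss of the first ship.",
    ]
    chunk_vecs = embed(chunks)

    question = "What happens in the storm?"
    q_vec = embed([question])[0]

    # Cosine-similarity retrieval: grab the most relevant chunk.
    sims = chunk_vecs @ q_vec / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
    best_chunk = chunks[int(np.argmax(sims))]

    # "Check the data when composing an answer": put the retrieved text in the prompt.
    answer = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided excerpt."},
            {"role": "user", "content": f"Excerpt:\n{best_chunk}\n\nQuestion: {question}"},
        ],
    )
    print(answer.choices[0].message.content)
    ```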

    The next step up from there is fine-tuning, where you partially retrain a base model on your books. This is more complex and time-consuming, but it can give more nuanced answers. It’s often done in combination with RAG for particularly large bodies of information.
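    For a sense of what that retraining actually consumes, here is a minimal sketch of the training data you would prepare for an OpenAI-style fine-tuning service: a JSONL file of example conversations. The questions and answers below are made-up placeholders you would generate from your books.

    ```python
    import json

    # Each line of the JSONL file is one example conversation the model
    # should learn to reproduce.
    examples = [
        {
            "messages": [
                {"role": "system", "content": "You are an expert on the Example Trilogy."},
                {"role": "user", "content": "Who captains the first ship?"},
                {"role": "assistant", "content": "Captain Arden, introduced in Chapter 2."},
            ]
        },
        {
            "messages": [
                {"role": "system", "content": "You are an expert on the Example Trilogy."},
                {"role": "user", "content": "What causes the storm in Chapter 7?"},
                {"role": "assistant", "content": "A rivalry between the wind spirits."},
            ]
        },
    ]

    with open("training_data.jsonl", "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

    # This file then gets uploaded to the provider's fine-tuning service,
    # which handles the actual training run for you.
    ```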