Last year when Framework announced the Framework Desktop I immediately ordered one. I’d been wanting a new gaming PC, but I’d also been kicking around the idea of running a local LLM. When it finally arrived it worked great for gaming… but on the LLM side there wasn’t much that would run on the AMD hardware. Over the next few months more tools became available, but it was very slow going. I had many long nights where I’d work and work and work and end up right back where I started.

So I got a Claude Code subscription and used it to help me build out my LLM setup. I made a lot of progress, but now I was comparing my local LLM to Claude, and there was no comparison.

Then I started messing with OpenClaw. First with Claude (expensive, fast), then with my local llama.cpp (cheap, frustrating). I didn’t know enough about it, so I used Claude to help me build a custom app around my llama.cpp. That was fun and I learned a lot, but I was spending most of my time chasing bugs instead of actually optimizing anything.

Around that time I heard about Qwen3-Coder-Next, dropped it into llama.cpp, and wow that was a huge step forward. Better direction-following, better tool calls, just better. I felt like my homegrown app was now holding the model back, so I converted over to OpenClaw. Some growing pains, but once things settled I was impressed again.

We built a lot of tooling along the way: a vector database memory system that cleans itself up each night, a filesystem-based context system, speech-to-text and text-to-speech, and a vision model. At this point my local LLM could see me, hear me, speak to me, and remember things about me, and all of it was built to be LLM-agnostic so Claude and my local system could share the same tools.
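To make that concrete: the “LLM-agnostic” part mostly comes down to describing each tool once in a neutral structure, then rendering it into whatever schema a given backend wants. A rough Python sketch of the idea (the tool names and schema here are illustrative, not my actual definitions):

```python
# Hypothetical LLM-agnostic tool registry: each tool is described once,
# then rendered into a backend-specific schema on demand.

TOOLS = {
    "journal_add": {
        "description": "Append an entry to the daily journal",
        "params": {"text": "string"},
    },
    "memory_search": {
        "description": "Semantic search over the memory database",
        "params": {"query": "string", "limit": "integer"},
    },
}

def to_openai_style(name: str) -> dict:
    """Render one tool in the OpenAI-style function schema llama.cpp accepts."""
    tool = TOOLS[name]
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": tool["description"],
            "parameters": {
                "type": "object",
                "properties": {k: {"type": t} for k, t in tool["params"].items()},
                "required": list(tool["params"]),
            },
        },
    }
```

The same registry can get a second renderer for Claude’s tool-use format, so both backends stay in sync from one source of truth.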

I was still leaning on Claude heavily for coding, because honestly it’s amazing at it. I decided to give Qwen a small test project: build a web-based kanban board that works on both desktop and mobile. It built it… but it sucked. Drag between columns? Broken. Fixed that, now you can’t add items. Fixed that, dragging broke on mobile. I kept asking Claude to help troubleshoot, and it kept wanting to just rewrite the app. Finally I gave in and said “just fix it”; Claude rewrote the whole thing and it was great. I was disheartened. On top of that, Qwen kept getting into loops, sometimes running for hours doing nothing productive.

So about a week and a half ago I decided to rethink what I even wanted my local LLM to do. Coding was obviously out. I decided to start fresh and use it to help me journal. A few times a day it reaches out, asks what I’m doing, and if it’s relevant, adds an entry to my journal.
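The scheduling side of that doesn’t need to be fancy. A sketch of the timing logic in Python (the waking-hours window and jitter are illustrative numbers, not what I actually run):

```python
import random
from datetime import datetime, timedelta

def next_checkin(now: datetime, per_day: int = 3) -> datetime:
    """Space check-ins roughly evenly across a waking window, with jitter.

    Assumes a 14-hour waking window; the jitter keeps prompts from
    landing at rigid, predictable times.
    """
    gap_hours = 14 / per_day
    jitter = random.uniform(-0.5, 0.5)
    return now + timedelta(hours=gap_hours + jitter)
```

At each check-in the model asks what I’m doing, and only writes a journal entry if the answer is actually relevant.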

I went through a couple more model swaps trying to get it stable. Qwen3.5 was better than Coder-Next for this use case, but I was still hitting loop issues. It was consistently prompting me and doing a decent job with the journal, which was at least a step in the right direction.

Then Qwen3.6 dropped. I loaded the Q6 quant the day it was released, and I could immediately tell it was faster and the output quality was much higher. And I realized earlier today that since I switched to Qwen3.6 I haven’t had to ask Claude to check in on Qwen even once. The looping is gone. It’s actually following the anti-loop protocols I’ve been trying to get models to follow for months.

I haven’t tried coding with it yet (I don’t have high hopes there) but I’ve given it the ability to create and modify its own skills and it’s been doing that beautifully. Scheduled tasks, multiple agents (voice assistant, primary, Home Assistant), all running smoothly.

My reliance on Claude has dropped off sharply since moving to Qwen3.6, and my system resource usage has gone down significantly too. If you’ve tried to get a local LLM setup running and gave up out of frustration… now might be a good time to jump back in, especially if you know your hardware should be able to handle it.

  • Bob Robertson IX @discuss.tchncs.de (OP) · 2 hours ago

    I hadn’t heard of OcuLink before the Framework event yesterday… and now I’m intrigued. How does that connect to the Framework Desktop?

    And as far as tools go, I tend to borrow ideas from others and then build them specifically for my setup. For instance, a few months ago I came across a project called OpenBrain that uses a vector database for memory storage and retrieval. I asked Claude to evaluate the OpenBrain project and tell me how we could use its concepts in my local system. Claude chugged away and came back with recommendations: spin up PostgreSQL, create an ‘Observer’ that watches my OpenClaw and Claude sessions and pulls memories into the database, set up an embedding LLM server to generate the embeddings, and run a nightly process that removes duplicate memories, combines similar ones, and archives old ones. If I recall correctly, almost all of these were components of OpenBrain, but I’m cautious about using other people’s code, and Claude turns out to be a good tool for borrowing ideas from a project without needing to adopt the whole project itself.
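    The nightly dedupe step is conceptually simple: compare embeddings, and when two memories are near-identical, keep the more detailed one. A stdlib-only Python sketch of that pass (the threshold and in-memory representation are illustrative; the real embeddings live in PostgreSQL):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def dedupe(memories, threshold=0.95):
    """memories: list of (text, embedding) pairs. Returns the survivors.

    When a new memory is near-identical to one already kept, the longer
    (more detailed) prose version wins.
    """
    kept = []
    for text, emb in memories:
        dup = next((k for k in kept if cosine(emb, k[1]) >= threshold), None)
        if dup is None:
            kept.append((text, emb))
        elif len(text) > len(dup[0]):
            kept[kept.index(dup)] = (text, emb)
    return kept
```

    The merge-similar and archive-old steps follow the same pattern with looser thresholds and a timestamp check.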

    One recent instance where I’m glad I did this was Milla Jovovich’s MemPalace. Since I already had a memory system in place I didn’t need her project, but one thing I did find very intriguing was AAAK, which she had described as a lossless compression language for AI agents. I asked Claude to evaluate all of MemPalace for anything that would improve my setup, and Claude also got excited about AAAK. It said everything else we were doing was as good as or better than MemPalace, but that adding an AAAK version of each memory would lower token usage (we kept the full prose memory to help with search). We implemented it, and then I asked Claude to pull some random memories from the db… and it was immediately clear that there was no way AAAK was lossless: a lot of my memories involve system details, including IP addresses, and the AAAK versions didn’t include any IP addresses. Of the 1,800 memories I had stored, Claude could only find 3 where AAAK had preserved enough meaning to be usable. It was easy to remove from my system, and I just checked: the project no longer refers to AAAK as lossless.
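    If you want to sanity-check a “lossless” claim like that yourself, a cheap test is to extract concrete identifiers from the original memory and verify each one survives compression. For example, checking IPv4 addresses only (purely illustrative):

```python
import re

# Matches dotted-quad IPv4 addresses; extend with more patterns
# (hostnames, ports, paths) for a stricter check.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def lost_specifics(original: str, compressed: str) -> list:
    """Return identifiers present in the original but missing after compression."""
    return [ip for ip in IPV4.findall(original) if ip not in compressed]
```

    Run it over a random sample of memories and any non-empty result disproves the lossless claim immediately.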

    Using this method I also built a filesystem-based context system that lets any LLM on my system use the same skills, context, agents, projects, and memories (although the memory folder just holds instructions on how to access the memory database). This was another project I saw from someone else; I used a lot of their ideas but tweaked things to fit my needs.
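    For a concrete picture, here’s one way such a tree could be scaffolded; the folder names and purposes are my guesses at a reasonable structure, not a spec:

```python
from pathlib import Path

# Hypothetical layout: every LLM on the system reads the same tree, and
# the memories folder holds only access instructions, not the data.
LAYOUT = {
    "skills": "one file per skill the agents can invoke",
    "context": "standing facts about the user and the machines",
    "agents": "per-agent system prompts (voice, primary, Home Assistant)",
    "projects": "working notes scoped to each active project",
    "memories": "README pointing at the memory database, not the data itself",
}

def scaffold(root: str) -> list:
    """Create the folders with a README each; returns the created paths."""
    paths = []
    for name, purpose in LAYOUT.items():
        p = Path(root) / name
        p.mkdir(parents=True, exist_ok=True)
        (p / "README.md").write_text(purpose + "\n")
        paths.append(str(p))
    return paths
```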

    Other tools where I’m using others’ projects directly are ‘Faster-Whisper’ for speech-to-text, ‘Piper TTS’ for text-to-speech, ‘Moondream2’ for image analysis, and ‘QMD Search’ for indexing my Obsidian vault and putting it into the memory system.