

I don’t see any mention of whether this uses local models or cloud models. I’m not interested in sending anything I care about into the cloud.
If you think this isn’t related to human rights, then you’ve missed the point.
People have the right to use technology, and indeed we effectively need technology to exercise our right to free speech. You cannot have one without the other. Not anymore.
The right way to think about this is that they are arbitrarily banning a topic of discussion simply because it is not dead-center average. This isn’t even a legal issue, and the justification is utter nonsense (Facebook itself runs on Linux, like >90% of the internet). No government has officially asked them to do this, though the timing suggests that it is unofficially from the Trump administration.
This is about exerting control, establishing precedent, and applying a chilling effect to anything not directly aligned with their interests. This obviously extends to human rights issues. This is a test run.
Maybe if they distilled the coder version of Qwen 14B it might be a little better, but I doubt it. I think a really high-quant 70B model is more in the range of cooking up functioning code off the bat. It’s not really fair to compare a low-quant local model to o1 or Claude in the cloud; those models are much bigger.
That’s a good point. I got mixed up and thought it was distilled from qwen2.5-coder, which I was using for comparison at the same size and quant. qwen2.5-coder-32b@4bit gave me better (but not entirely correct) responses, without spending several minutes on CoT.
I think I need to play around with this more to see if CoT is really useful for coding. I should probably also compare 32b@4bit to 14b@8bit to see which is better, since both can run within my memory constraints.
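For reference, the back-of-the-envelope math I’m using for that comparison (weights only; KV cache and runtime overhead add more on top):

```python
# Weights-only estimate: params (billions) x bits-per-weight / 8 = GB.
# Context (KV cache) and runtime overhead add a few GB on top of this.
def weights_gb(params_billions: float, bits: int) -> float:
    return params_billions * bits / 8

for name, params, bits in [("32b @ 4-bit", 32, 4), ("14b @ 8-bit", 14, 8)]:
    print(f"{name}: ~{weights_gb(params, bits):.0f} GB")
# 32b @ 4-bit: ~16 GB
# 14b @ 8-bit: ~14 GB
```

So they land in roughly the same footprint, which is what makes it an interesting comparison.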
Sounds cool. I’m using LM Studio and I don’t think it has that built in. I should reevaluate others.
I’m not entirely sure how I need to effectively use these models, I guess. I tried some basic coding prompts, and the results were very bad. Using R1 Distill Qwen 32B, 4-bit quant.
The first answer had incorrect, non-runnable syntax. I was able to get it to fix that after multiple followup prompts, but I was NOT able to get it to fix the bugs. It took several minutes of thinking time for each prompt, and gave me worse answers than the stock Qwen model.
For comparison, GPT-4o and Claude 3.5 Sonnet gave me code that would at least run on the first shot. 4o’s was even functional in one shot (Sonnet’s was close but had bugs). And that took just a few seconds instead of 10+ minutes.
Looking over its chain of thought, it seems to get caught in circles, just stating the same points again and again.
Not sure exactly what the use case is for this. For coding, it seems worse than useless.
But any 50-watt chip will get absolutely destroyed by a 500-watt GPU.
If you are memory-bound (and since OP’s talking about 192GB, it’s pretty safe to assume they are), then it’s hard to make a direct comparison here.
You’d need 8 high-end consumer GPUs to get 192GB. Not only is that insanely expensive to buy and run, but you won’t even be able to support it on a standard residential electrical circuit, or any consumer-level motherboard. Even 4 GPUs (which would be great for 70B models) would cost more than a Mac.
The speed advantage you get from discrete GPUs rapidly disappears as your memory requirements exceed VRAM capacity. Partial offloading to GPU is better than nothing, but if we’re talking about standard PC hardware, it’s not going to be as fast as Apple Silicon for anything that requires a lot of memory.
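A rough way to see why: memory-bound generation speed is capped at roughly memory bandwidth divided by the bytes read per token, which is roughly the model size. Here’s a sketch with ballpark bandwidth numbers (all approximate):

```python
# Rough ceiling for memory-bound token generation:
# tokens/sec ~ memory bandwidth / bytes read per token (~ model size).
def tok_per_sec_ceiling(bandwidth_gbs: float, model_gb: float) -> float:
    return bandwidth_gbs / model_gb

model_gb = 40  # e.g. a 70B model at ~4-bit quantization
for name, bw in [
    ("RTX 4090 VRAM (~1008 GB/s)", 1008),
    ("M2 Ultra unified memory (~800 GB/s)", 800),
    ("dual-channel DDR5 (~90 GB/s)", 90),
]:
    print(f"{name}: ~{tok_per_sec_ceiling(bw, model_gb):.0f} tok/s")
```

Once the model spills out of VRAM, that last number starts dominating the average, which is why partial offloading falls off a cliff.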
This might change in the near future as AMD and Intel catch up to Apple Silicon in terms of memory bandwidth and integrated NPU performance. Then you can sidestep the Apple tax, and perhaps you will be able to pair a discrete GPU and get a meaningful performance boost even with larger models.
This will be highly platform-dependent, and also dependent on your threat model.
On PC laptops, you should probably enable Secure Boot (if it’s not enabled by default), and password-protect your BIOS. On Macs you can disable booting from external media (I think that’s even the default now, but not totally sure). You should definitely enable full-disk encryption – that’s FileVault on Mac and BitLocker on Windows.
On Apple devices, you can enable USB Restricted Mode, which will protect against some attacks with USB cables or devices.
Apple devices also have lockdown mode, which restricts or disables a whole bunch of functionality in an effort to reduce your attack surface against a variety of sophisticated attacks.
If you’re worried about hardware hacks, then on a laptop you’d want to apply some tamper-evident stickers or something similar, so if an evil maid opens it up and tampers with the hardware, at least you’ll know something fishy happened, so you can go drop your laptop in an active volcano or something.
If you use any external devices, like a keyboard, mouse, hard drive, whatever…well…how paranoid are you? I’m going to be honest: there is a near 0% chance I would even notice if someone replaced my charging cables or peripheral cables with malicious ones. I wouldn’t even notice if someone plugged in a USB keylogger between my desktop PC and my keyboard, because I only look at the back of my PC once in a blue moon. Digital security begins with physical security.
On the software side, make sure you’re the only one with admin rights, and ideally you shouldn’t even log into admin accounts on a day-to-day basis.
If you’re running a consumer-level GPU, you’ll be operating with 24GB of VRAM max (RTX 4090, RTX 3090, or Radeon 7900XTX).
90b model = 90GB at 8-bit quantization (plus some extra based on your context size and general overhead, but as a ballpark estimate, just going by the model size is good enough). You would need to drop down to 2-bit quantization to have any hope of fitting it on a single consumer GPU. At that point you’d probably be better off using a smaller model with less aggressive quantization, like a 32b model at 4-bit quantization.
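Quick sanity check on how quantization changes that picture (again, weights only; context adds more):

```python
# Weights-only footprint for a 90b model at different quantization levels.
# KV cache and runtime overhead come on top of this.
params = 90e9
for bits in (16, 8, 4, 2):
    gb = params * bits / 8 / 1e9
    fits = "fits" if gb <= 24 else "does not fit"
    print(f"{bits}-bit: {gb:5.1f} GB -> {fits} in 24GB VRAM")
# 16-bit: 180.0 GB -> does not fit in 24GB VRAM
#  8-bit:  90.0 GB -> does not fit in 24GB VRAM
#  4-bit:  45.0 GB -> does not fit in 24GB VRAM
#  2-bit:  22.5 GB -> fits in 24GB VRAM
```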
So forget about consumer GPUs for that size of model. Instead, you can look at systems with integrated memory, like a Mac with 96-128GB of memory, or something similar. HP has announced a mini PC that might be good, and Nvidia has announced a dedicated AI box as well. Neither of those are available for purchase yet, though.
You could also consider using multiple consumer GPUs. You might be able to get multiple RTX 3090s for cheaper than a Mac with the same amount of memory. But then you’ll be using several times more power to run it, so keep that in mind.
Also works on Twitch with the added benefit of NOT playing ads (you still get breaks, just with a placeholder screen instead of the commercial).
mpv has yt-dlp support built in, so it can just play the streams directly.
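So you can literally just point mpv at the page URL. If you’d rather resolve the stream yourself, yt-dlp’s Python API can do the extraction too; here’s a minimal sketch (the channel URL is a placeholder, and the exact info-dict keys vary by extractor):

```python
import subprocess
import yt_dlp

# Placeholder URL; swap in a real channel or video page.
page_url = "https://www.twitch.tv/some_channel"

# Resolve the actual media URL without downloading anything.
with yt_dlp.YoutubeDL({"quiet": True, "format": "best"}) as ydl:
    info = ydl.extract_info(page_url, download=False)

# Hand the resolved stream URL straight to mpv.
subprocess.run(["mpv", info["url"]])
```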
vd (VisiData) is a wonderful TUI spreadsheet program. It can read lots of formats, like csv, sqlite, and even nested formats like json. It supports Python expressions and replayable commands.
I find it most useful for large CSV files from various sources. Logs and reports from a lot of the tools I use can easily be tens of thousands of rows, and it can take many minutes just to open them in GUI apps like Excel or LibreOffice.
I frequently need to re-export fresh data, so I find myself needing to re-process and re-arrange it every time, which VisiData makes easy (well, easier) with its replayable command files. So e.g. I can write a script to open a raw csv, add a formula column, resize all columns to fit their content, set the column types as appropriate, and sort it the way I need it. That way I can go directly from exporting the data to reading it, with no manual preprocessing in between.
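For comparison, the same kind of pipeline written out in plain Python looks roughly like this (column names are made up for illustration); VisiData’s replay files just record the equivalent interactive steps so you never have to write any of it:

```python
import csv

# Column names ("total", "quantity") are hypothetical.
with open("raw.csv", newline="") as f:
    rows = list(csv.DictReader(f))

for row in rows:
    # The "formula column" step: derive a new field from existing ones.
    row["unit_price"] = float(row["total"]) / int(row["quantity"])

# The sort step.
rows.sort(key=lambda r: r["unit_price"], reverse=True)

with open("processed.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
```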
My experience might be a bit outdated, but I remember finding the default Mac OS X Terminal extremely slow. A few years back I ran an output-heavy command, and the speed difference between displaying the output in terminal vs outputting it to a file was orders of magnitude. The same thing on my Linux system was much, much faster. I’m not sure how much of that was due specifically to rendering, vs memory management or something else, though.
I might see if I can still reproduce this in Sequoia and if Ghostty is faster on Mac.
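If I do, something like this is probably the simplest way to test it (prints a million lines to the terminal vs. a file and times both):

```python
import sys
import time

def dump(stream, n=1_000_000):
    # Write n numbered lines and time how long it takes.
    start = time.perf_counter()
    for i in range(n):
        print(i, file=stream)
    stream.flush()
    return time.perf_counter() - start

with open("out.txt", "w") as f:
    file_time = dump(f)
term_time = dump(sys.stdout)
print(f"file: {file_time:.2f}s, terminal: {term_time:.2f}s", file=sys.stderr)
```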
I agree that the models themselves are clearly transformative. That doesn’t mean it’s legal for Meta to pirate everything on earth to use for training. THAT’S where the infringement is. And they admitted they used pirated material: https://www.techspot.com/news/101507-meta-admits-using-pirated-books-train-ai-but.html
You want to use the same bullshit tactics and unreasonable math that the RIAA used in their court cases?
I would enjoy seeing megacorps held to at least the same standards as individuals. I would prefer for those standards to be reasonable across the board, but that’s not really on the table here.
I guess the idea is that the models themselves are not infringing copyright, but the training process DID. Some of the big players have admitted to using pirated material in training data. The rest obviously did even if they haven’t admitted it.
While language models have the capacity to produce infringing output, I don’t think the models themselves are infringing (though there are probably exceptions). I mean, gzip can reproduce infringing material too with the correct input. If producing infringing work requires both the algorithm AND specific, intentional user input, then I don’t think you should put the blame solely on the algorithm.
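To make the gzip analogy concrete:

```python
import gzip

# The compressed blob is just the algorithm's encoding of whatever input
# you gave it; the string below is a stand-in for any copyrighted text.
original = b"Pretend this is the full text of a copyrighted novel."
compressed = gzip.compress(original)

# Decompression reproduces the input byte-for-byte.
assert gzip.decompress(compressed) == original
print(f"{len(original)} bytes in, {len(compressed)} bytes stored, exact copy out")
```

Nobody blames gzip for that; the specific input is what makes the output infringing.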
Either way, I don’t think existing legal frameworks are suitable to answer these questions, so I think it’s more important to think about what the law should be rather than what it currently is.
I remember stories about the RIAA suing individuals for many thousands of dollars per mp3 they downloaded. If you applied that logic to OpenAI — maximum fine for every individual work used — it’d instantly bankrupt them. Honestly, I’d love to see it. But I don’t think any copyright holder has the balls to try that against someone who can afford lawyers. They’re just bullies.
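For a sense of scale ($150,000 is the US statutory maximum per work for willful infringement; the work counts below are deliberately rough guesses, not claims about any specific training set):

```python
# $150,000 per work is the US statutory max for willful infringement
# (17 U.S.C. 504(c)). Work counts are rough illustrations only.
max_per_work = 150_000
for label, works in [
    ("Books3-scale corpus (~200k books)", 200_000),
    ("a few million scraped works", 3_000_000),
]:
    print(f"{label}: ${max_per_work * works:,}")
# Books3-scale corpus (~200k books): $30,000,000,000
# a few million scraped works: $450,000,000,000
```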
Laptops are a crapshoot, so I’d recommend sticking with distros that are known to support your specific model.
Desktops should, in general, just work.
That said, I’ve never personally had a seamless experience. There’s always something I need to struggle to configure. Usually it’s because I’m very picky and I like things to work MY way. The alternative on Windows would not be that it works my way; it would be that there’d be no way to do that, so I’d just have to deal with it. If you’re willing to just roll with the defaults, then yeah, most basic things should just work.
The biggest gotcha is GPU drivers. Not all distros ship with recent kernel versions with modern drivers. You should be pretty safe with Fedora and derivatives.
Why? This cannot possibly have any legal weight. Some adults look young. Some kids look old. The very idea is broken from the outset.
I can’t tell if this is incompetence or malice.
Thanks for the info. I was not aware that Bluesky had public, shareable block lists. That is indeed a great feature.
For anyone else like me who was not aware, I found this site with an index of a lot of public block lists: https://blueskydirectory.com/lists . I was not able to load some of them, but others did load successfully. Maybe some were deleted or are not public? I’m not sure.
I’ve never been heavily invested in microblogging, so my first-hand experience is limited and mostly academic. I have accounts on Mastodon and Bluesky, though. I would not have realized this feature was available in Bluesky if you hadn’t mentioned it and I hadn’t found that index site in a web search. It doesn’t seem easily discoverable within Bluesky’s own UI.
Edit: I agree, of course, that there is a larger systemic problem at the society level. I recently read this excellent piece (very long but worth it!) that talks a bit about how that relates to social media: https://www.wrecka.ge/against-the-dark-forest/ . Here’s a relevant excerpt:
If this truly is the case—if the only way to improve our public internet is to convert all humans one by one to a state of greater enlightenment—then a full retreat into the bushes is the only reasonable course.
But it isn’t the case. Because yes, the existence of dipshits is indeed unfixable, but building arrays of Dipshit Accelerators that allow a small number of bad actors to build destructive empires defended by Dipshit Armies is a choice. The refusal to genuinely remodel that machinery when its harms first appear is another choice. Mega-platform executives, themselves frequently dipshits, who make these choices, lie about them to governments and ordinary people, and refuse to materially alter them.
Do you think this is a systemic problem, or just the happenstance of today? Is there something about Bluesky’s architecture or governance that makes it more resilient against that (particularly in the long term)? Or will they have all the same problems as they gain more users and enable more federation with other servers?
I’d rather have something like a “code grammar checker” that highlights potential errors for my examination rather than something that generates code from scratch itself
Agreed. The other good use case I’ve found is as a faster reference for simple things. LLMs are absolutely great for one-liners and generating troublesome (but logically simple) things like complex xpath queries. But I still haven’t seen one generate a good script of even moderate complexity without hand-holding. In some cases I’ve been able to get usable output with a few shots, saving me a bit of time compared to if I’d written the whole darned thing from scratch.
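For example, this is the kind of fiddly-but-logically-simple XPath I mean (the HTML and the query are invented for illustration):

```python
# Hypothetical HTML and query, just to show the flavor: select the data
# rows of whichever table has a "Price" header.
from lxml import html

doc = html.fromstring("""
<html><body><table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
</table></body></html>
""")

rows = doc.xpath('//table[.//th[contains(text(), "Price")]]//tr[td]')
for row in rows:
    print([cell.text for cell in row.xpath("./td")])  # ['Widget', '9.99']
```

Logically trivial, but exactly the sort of thing I’d otherwise fumble through by trial and error.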
I’ve found LLMs very useful for coding, but they aren’t replacing my actual coding, per se. They replace looking things up, like through man pages, language references, or StackOverflow. Something like ffmpeg, for example, has a million options and it is always a little annoying to sift through the docs manually when I just need to do one specific task.
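E.g. instead of digging through the ffmpeg docs, I can just ask for “cut 30 seconds starting at 1:00 and scale it to 720p” and sanity-check the result, which ends up looking something like this (filenames are placeholders; the flags are standard ffmpeg options):

```python
import subprocess

# Filenames are placeholders for whatever you're actually converting.
subprocess.run([
    "ffmpeg",
    "-ss", "00:01:00",      # seek to the start point
    "-i", "input.mp4",      # source file
    "-t", "30",             # keep 30 seconds
    "-vf", "scale=-2:720",  # scale to 720p, preserving aspect ratio
    "-c:a", "copy",         # copy audio without re-encoding
    "output.mp4",
], check=True)
```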
I’m sure it’ll happen sooner or later. I’m not naive enough to claim that “computers will never be able to do $THING” anymore. I’ll say “not in the next year”, though.
Right, not an IDE. The BB stands for “bare bones”, but it has a robust feature set as far as general text editing goes. Autocomplete is minimal so I tend to use an IDE for more complex coding tasks.
Kids these days and their “Plasma”. BACK IN MY DAY it was just KDE!
I’m not sure why this feels new to me. Perhaps it’s because I spent a lot of time on other DEs after 2009.
But also, from that link:
So I don’t feel like it’s wrong to just call it KDE, just imprecise.