A software developer and Linux nerd, living in Germany. I’m usually a chill dude but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt, I usually try to be nice and give good advice, though.

I’m into Free Software, selfhosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things as well.

  • 5 Posts
  • 818 Comments
Joined 5 years ago
Cake day: August 21st, 2021


  • Probably because they’re playing the same game as Mark Zuckerberg, the Chinese labs and, to some degree, OpenAI… They all release open-weight models.

    They generate some hype for their company that way, so it’s advertising. They build goodwill. They undercut the competition, or make it clear how they outperform it. Maybe they attract more investor money if they expand into the local-models market. I bet there are a million reasons why it makes sense from a business perspective.


  • Yes. I’ve been somewhat lucky as well. I upgraded my homeserver to 48GB to run a few virtual machines and maxed out my old laptop well before prices skyrocketed. Got to check whether I still pay the ~8€ a month for my netcup VPS or whether they increased the price for existing customers as well…


  • Hmmh. I tried to do benchmarks early on, back when Llama 2 was a thing… I followed the Reddit discussions. Then at some point I wanted to replace Mistral-Nemo with something newer, but I disliked how every other model had turned to the ChatGPT sycophant style of talking… It’s a massively laborious undertaking, though. The official benchmarks don’t cover any of that, and there’s no good way to automate it either. So I spent half a day reading output manually and rating it in an Excel spreadsheet. With some success, but it’s way too complicated. So I mainly eyeball it these days, and sometimes there are recommendations somewhere on the internet. And I’ve learned to accept how chatbots always go on and on with redundant information unless I tell them to skip the bullshit because I have an appointment at the hairdresser in 10 minutes and they need to explain it in 3 sentences. 😄

    I suppose for tasks like coding or factual knowledge, it’s way easier to come up with fully automated benchmarks.
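
    The fully automated kind can be as simple as exact-match scoring against known answers. A minimal sketch; the questions, expected answers and the fake_model stand-in are all made up for illustration:

```python
def fake_model(question: str) -> str:
    # Stand-in for a real LLM call; canned answers for the demo,
    # one deliberately wrong to show how scoring works.
    canned = {
        "What is the capital of France?": "Paris",
        "What is 2 + 2?": "4",
        "Who wrote Faust?": "Schiller",  # wrong on purpose
    }
    return canned.get(question, "")

# Each entry pairs a prompt with the single accepted answer.
benchmark = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
    ("Who wrote Faust?", "Goethe"),
]

correct = sum(1 for q, expected in benchmark if fake_model(q).strip() == expected)
accuracy = correct / len(benchmark)
print(f"accuracy: {accuracy:.2f}")  # 2 of 3 correct -> 0.67
```

    This works because factual answers are short and checkable by string comparison; judging tone or sycophancy has no such oracle, which is why that part stays manual.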


  • I guess that’s been my general experience for a while. I’d download some new model with promising benchmarks, and once I tried it, the results were kinda underwhelming. A few weeks ago, for example, I tried Qwen 3.5, which had “outstanding results across a full range of benchmark evaluations”. And I deleted it after it kept wasting thousands of tokens reasoning about how to respond to a “Hello” from the user. And sometimes I just don’t see any real performance improvement with new models. If I had to guess, I’d say they mainly trained (and improved) for/on the benchmarks, not for my use case.



  • Did you read the Wiki? You need to either pass the compress_extension option when mounting it (the Arch Wiki lists how to enable compression on all text files, and I gave you the variant with a ‘*’, which enables compression for all files), or do a chattr -R +c ... on specific files or directories to compress them. Maybe you missed that, and that’s why it doesn’t compress?!
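
    Assuming this is about f2fs (that’s where a compress_extension mount option exists), the two routes could look roughly like this; the device, mountpoint and paths are made-up placeholders:

```shell
# Route 1: compress everything via mount options.
# compress_extension=* matches all files; compress_algorithm picks the codec.
# (The filesystem also needs the compression feature enabled at mkfs time.)
mount -o compress_algorithm=zstd,compress_extension=* /dev/sdX1 /mnt/data

# Route 2: leave the mount options alone and flag specific paths instead;
# newly created files underneath inherit the compression attribute.
chattr -R +c /mnt/data/logs
```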

    There’s probably also a way to debug it and figure out what it does and how many files/sectors actually got compressed on the filesystem. Linux usually buries that kind of information somewhere in /sys or /proc, or there are special commands to dig it out. But I’m not really an expert on it.

    And there are also files which just cannot be compressed any further because they’re already compressed. Most images, for example, or music, or ZIP archives. If you try to compress those, they’ll usually stay the same size.
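
    You can see that effect with any general-purpose compressor. A quick Python sketch, using random bytes as a stand-in for already-compressed data like a JPEG or ZIP payload:

```python
import os
import zlib

# Repetitive text has lots of redundancy and shrinks a lot.
text = b"the quick brown fox jumps over the lazy dog " * 200
compressed_text = zlib.compress(text)

# Random bytes behave like already-compressed data: there's no
# redundancy left, so the output can't get meaningfully smaller
# (it may even grow a little due to format overhead).
random_data = os.urandom(len(text))
compressed_random = zlib.compress(random_data)

print(len(text), "->", len(compressed_text))
print(len(random_data), "->", len(compressed_random))
```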



  • The issue with the tools I’ve seen is that they either don’t factor in how language models are actually trained and datasets are actually prepared, or they’re based on outdated information. I’ve never seen a specific tool backed by science, or even with a plausible way of working against current data-gathering processes… So for all intents and purposes, they’re a bit like homeopathy or alternative medicine. Sure, you’re perfectly fine taking sugar pills, there’s nothing wrong with that. But don’t confuse it with actual science-backed medicine.

    And I mean the poisoning goes even further than that. It’s not just people trying to make an LLM output gibberish. There are also lots of people with a vested (commercial) interest in sneaking in false information, their political agenda, or even a tire company that wants ChatGPT to say “Company XY” is the most trustworthy shop for new tires for your car. Judging by the public information out there, we’re already way past simple attacks, and the AI companies are aware of it. It’s an ongoing cat-and-mouse game. And besides all these sweatshops, they’ll also use other AI (natural language processing) to sift through the data. From what I remember, a lot of commercial chatbots and image generators have secret watermarking in place… So unless people come up with very clever mechanisms, a “poisoning” attempt will probably be detected by some very basic (fully automated) plausibility checks, and they’ll just discard your data without wasting a lot of resources on it.