A software developer and Linux nerd, living in Germany. I’m usually a chill dude but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt, I usually try to be nice and give good advice, though.

I’m into Free Software, selfhosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things as well.

  • 4 Posts
  • 633 Comments
Joined 4 years ago
Cake day: August 21st, 2021



  • There’s always a possibility of someone posting arbitrary content when a platform allows user content or combines content from many sources. I mean, we do have moderation here and illegal content is supposed to be removed or flagged. However, as the operator of some internet service, you are ultimately responsible for what’s on your instance. So you definitely do need to make an effort to stay in control. Btw, there are possible compromises, such as using an allow-list of instances you federate with, so you don’t pull content from sources you don’t trust and didn’t approve.
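
    A minimal sketch of the allow-list idea in Python (the instance names are made up, and real fediverse software implements this in its own config, so this is just to illustrate the logic):

    ```python
    # Only pull content from instances that were explicitly approved.
    from urllib.parse import urlparse

    ALLOWED_INSTANCES = {"lemmy.example.org", "feddit.example.net"}  # hypothetical

    def should_federate(actor_url: str) -> bool:
        host = urlparse(actor_url).hostname or ""
        return host in ALLOWED_INSTANCES

    print(should_federate("https://lemmy.example.org/u/alice"))     # True
    print(should_federate("https://random.example.com/u/mallory"))  # False
    ```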




  • I think kids find ways to play and tinker with stuff. I’d give them an office suite to practice writing letters or advertisements or whatever they come up with, something to draw… maybe not Gimp, because that’s not easy to use… I’ve seen people give their kids an instant messenger which connects to their dad/mom so they’re incentivised to type something. And then of course we have games. From Supertux, PlanetPenguin Racer and Tuxkart to commercial games. There are some kids’ games in the repos: Kartoffelknülch, drawing programs, programming languages to learn coding with puzzle pieces and blocks or by animating turtles. There are educational games too; at least my local library has some and I played some as a kid. But maybe at least try to balance the gaming, there’s so much more interesting stuff in computers. And then of course you could put some content into some directories. I think unrestricted internet access isn’t great at 6yo and the computer will be empty without it, so idk. Maybe put some templates there, ideas what to draw, music or audiobooks or whatever fits the purpose…


  • Thanks.

    FIBO, which looks interesting in many ways.

    Indeed. Seems it has good performance, licensed training material… That’s all looking great. I wonder who has to come up with the JSON but I guess that’d be another AI and not my task. Guess I’ll put it on my list of things to try.

    It’s possible that there’ll be companies at some point who proudly train their models with renewable energy

    I said it in another comment, I think that’s a bit hypothetical. It’s possible. I think we should do it. But in reality we ramp up natural gas and coal. US companies hype small nuclear reactors, and some people have voiced concerns that China might want to take advantage of Russia’s situation to feed its insatiable demand for (fossil-fuel) energy. I mean, they also invest massively in solar. It just looks to me like we’re currently headed in the other direction overall, and we need substantial change to maybe turn that around some time in the future. So I categorize it more as wishful thinking.


  • Your experience with AI coding seems to align with mine. I think it’s awesome for generating boilerplate code, placeholders including images, and for quick mockups. Or asking questions about some documentation. The more complicated it gets, the more it fails me. I’ve measured the time once or twice and I’m fairly sure it took more time than usual, though I didn’t do any proper scientific study. It was just similar tasks and me running a timer. I believe the more complicated maths and trigonometry I mentioned was me yelling at AI for 90 or 120 minutes or so until it was close, and then I kept the code around it, deleted the maths part and wrote that myself. Maybe AI is going to become more “intelligent” in the future. I think a lot of people hope that’s going to happen. But as of today we need to pay close attention to whether it fools us while being a big time and energy waster, or whether it’s actually a good fit for a given task.

    Local AI will likely have a long lasting impact as it won’t just go away.

    I like to believe that as well, but I don’t think there’s any guarantee they’ll continue to release new models. Sure, they can’t ever take Mistral-Nemo from us. But that’s going to be old and obsolete tech in the world of 2030 and dwarfed by any new tech then. So I think the question is more, are they going to continue? And I think we’re kind of picking up what the big companies dumped when battling and outcompeting each other. I’d imagine this could change once China and the USA settle their battle. Or multiple competitors can’t afford it any more. And they’d all like to become profitable one day. Their motivation is going to change with that as well. Or the AI bubble pops and that’s also going to have a dramatic effect. So I’m really not sure if this is going to continue indefinitely. Ultimately, it’s all speculation. A lot of things could possibly happen in the future.

    At what point is generative AI ethically and legally fine?

    If that’s a question about the development of AI in general, it’s an entire can of worms. And I suppose also difficult to answer for your or my individual use. What part of the overall environmental footprint gets attributed to a single user? Even more difficult to answer with local models. Do the copyright violations the companies committed translate to the product and then to the user? Then what impact do you have on society as a single person using AI for something? Does what you achieve with it outweigh all the cost?

    Firefox for realtime website translations

    Yes, I think that and text to speech and speech to text are massively underrated. Firefox Translate is something I use quite often and I can do crazy stuff with it like casually browse Japanese websites.


  • That’s not comparable. You can’t compare software or even research with a physical object like that. You need a dead cow for salami, if demand increases they have to kill more cows. For these models the training already happened, how many people use it does not matter.

    I’d really like to disagree here. Sure, today’s cow is already dead and turned into sausage. But the pack of salami I buy this week is going to make the supermarket order another pack next week, so what I’m really doing is having someone kill the next cow. Or at least a tiny bit of it, because I’m only having a few slices; the bigger picture is that I’m part of a large group of people creating the overall demand.

    And I think it’s at least questionable if and how this translates. It’s still part of generating demand for AI. Sure, it’s kind of a byproduct, but Meta directly invests additional research, alignment and preparation into these byproducts. And we got an entire ecosystem around it with Huggingface, CivitAI etc. which cater to us; sometimes a substantial amount of their business is the broader AI community and not just researchers. They provide us with datacenters for storage, bandwidth and sometimes compute. So it’s certainly not nothing which gets added due to us. And despite it being immaterial, it has a proper effect on the world. It’s going to direct technology and society in some direction, and have real-world consequences when used. The pollution during the process of creating this non-physical product is real. And Meta seems to pay attention. At least that’s what I got from everything that happened from LLaMA 1 to today. I think if and how we use it is going to affect what they do with the next iteration. Similar to the salami pack analogy. Of course it’s a crude image, and we don’t really know what would happen if we did things differently. Maybe it’d be the same, so it comes down to the more philosophical question of whether it’s ethical to benefit from things that have been made in an unethical way. Though this requires today’s use not to have any effect on future demand, like the Nazi example, where me using the medicine is not going to bring back Nazi experiments in the future. And that’s not exactly the situation of AI. They’re still there and actively working on the next iteration. So the logic is more complicated than that.

    And I’m a bit wary because I have no clue about the true motive behind why Meta gifts us these things. It costs them money and they hand control to us, which isn’t exactly how large companies operate. My hunch is it’s mainly the usual war: they’re showing off, and they accept cutting into their own business when it does more damage to OpenAI. And the Chinese are battling the USA… And we’re somewhere in the middle of it. Maybe we pick up the crumbs. Maybe we’re chess pieces being used/exploited in some bigger corporate battles. And I don’t think we’re emancipated with AI; we don’t own the compute necessary to properly shape it, so we might be closer to the chess pieces. I don’t want to start any conspiracy theory, but I think these dynamics are part of the picture. I (personally) don’t think there’s a general and easy answer to the question of whether it’s ethical to use these models. And reality is a bit messy.

    But you don’t have to. I can run small models on my NITRO+ RX 580 with 8 GB VRAM, which I bought 7 years ago. It’s maybe not the best experience, but it certainly “works”. Last time our house used external electricity was 34h ago.

    I think this is the common difference between theory and practice. What you do is commendable. In reality though, AI is in fact mostly made from coal and natural gas, and China and the US ramp up dirty fossil-fuel electricity for AI. There’s hype around small nuclear reactors to satisfy the urgent demand for more electricity, and they’re a bit problematic with all the nuclear waste due to how nuclear power plants scale. So yes, I think we could do better. And we should. But that’s kind of a theoretical point unless we actually do it.

    it makes sense to train new models on public domain and cc0 materials

    Yes, I’d like to see this as well. I suppose it’s a long way from pirating books (which they get away with, given enough money and lawyers) to a proper, consensual use.





  • Thanks. That sounds reasonable. Btw you’re not the only poor person around, I don’t even own a graphics card… I’m not a gamer so I never saw any reason to buy one before I took interest in AI. I’ll do inference on my CPU and that’s connected to more than 8GB of memory. It’s just slow 😉 But I guess I’m fine with that. I don’t rely on AI, it’s just tinkering and I’m patient. And a few times a year I’ll rent some cloud GPU by the hour. Maybe one day I’ll buy one myself.
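
    For reference, a sketch of what CPU-only inference can look like with llama-cpp-python (the model file name and parameters are placeholders; any quantized GGUF model works):

    ```python
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/mistral-nemo-q4_k_m.gguf",  # hypothetical file
        n_ctx=4096,    # context window
        n_threads=8,   # CPU threads; no GPU offload by default
    )

    out = llm("Explain federation in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])
    ```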


  • Sure. I’m all for the usual system design strategy with strong cohesion within one component and loose coupling on the outside to interconnect all of that. Every single household appliance should be perfectly functional on its own. Without any hubs or other stuff needed.

    For self-contained products, or ones without elaborate features, I kind of hate these external dependencies. I wouldn’t want to be without my NAS and the way I can access my files from my phone, computer or TV. But other than that, I think the TV and all other electronics should work without being connected to other things.

    I mean, edge computing is mainly there to save cost and power. It doesn’t make sense to fit each of the devices with a high-end computer and maybe half a graphics card so they can all do AI inference themselves. That’s expensive and you can’t have battery-powered devices that way. If they need internet anyway (and that’s the important requirement), just buy one GPU and let them all use that. They’ll fail without the network connection anyway, so it doesn’t matter, and this is easier to maintain and upgrade, probably faster and cheaper.
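
    A sketch of that pattern, loosely modeled on the OpenAI-compatible API that local servers like llama.cpp expose (the LAN address and payload shape are assumptions):

    ```python
    import requests

    # A lightweight device sends its request to the one shared inference box
    # on the LAN instead of running a model itself.
    resp = requests.post(
        "http://192.168.1.50:8080/v1/completions",  # hypothetical inference server
        json={"prompt": "Is the laundry done?", "max_tokens": 32},
        timeout=10,
    )
    print(resp.json())
    ```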

    A bit like me buying one NAS instead of one 10TB harddisk for the laptop, one for the phone, one for the TV… And then I can’t listen to the song on the stereo because it was sent to my phone.

    But my premise is that the voice stuff and AI features are optional. If they’re essential, my suggestion wouldn’t really work. I rarely see the need. I mean, in your example the smoke alarm could trigger and Home Assistant would send me a push notification on the phone. I’d whip it out and have an entire screen with status information and buttons to deal with the situation. I think that’d be superior to talking to the washing machine. I don’t have a good solution for the timer. One day my phone will do that as well. But mind that your solution also needs the devices to communicate via one protocol and be connected. The washing machine would need to get informed by the kitchen, be clever enough to know what to do about it, and also tell the dryer next to it to shut up… So we’d need to design a smart home system. If the devices all connect to a coordinator, perfect. That could be the edge computing “edge”. If not, it’d be some sort of decentralized system, and I’m not aware of any in existence. It’d be challenging to design and implement. And such systems tend to be problematic with innovation because everything needs to stay compatible, pretty much indefinitely. It’d be nice, though. And I can see some benefits if arbitrary things just connect, or stay separate, without an entire buying-into-some-ecosystem involved.
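
    For the smoke-alarm example, the push notification could go through Home Assistant’s REST API; a sketch (host, token and notify service name are placeholders for your setup):

    ```python
    import requests

    HA = "http://homeassistant.local:8123"
    TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"  # created in the HA user profile

    # Call the notify service for a (hypothetical) phone registered in HA.
    requests.post(
        f"{HA}/api/services/notify/mobile_app_my_phone",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"message": "Smoke alarm triggered in the kitchen!"},
        timeout=10,
    )
    ```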




  • I think they should be roughly in a similar range for selfhosting?! They’re both power-efficient and probably have enough speed for the average task. There might be a few perks with the ThinkCentre Tiny. I haven’t looked it up, but I think you should be able to fit an SSD and a hard drive, and maybe swap the RAM if you need more. And they’re sometimes on sale somewhere and should be cheaper than a Raspberry Pi 5 plus required extras.


  • I’m a bit below 20W. But I custom-built the computer a long time ago with an energy-efficient mainboard and a PicoPSU. I think other options for people who don’t need a lot of hard disks or a graphics card include old laptops or Mini-PCs. Those should idle at something like 10-15W. It stretches the definition of “desktop PC” a bit, but I guess you could place them on a desk as well 😉
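
    For a rough sense of what idle power means in money, a back-of-the-envelope calculation (the electricity price is an assumption, adjust for your tariff):

    ```python
    idle_watts = 20
    price_per_kwh = 0.30  # EUR, a rough German household rate (assumption)

    kwh_per_year = idle_watts * 24 * 365 / 1000
    print(f"{kwh_per_year:.0f} kWh/year, {kwh_per_year * price_per_kwh:.0f} EUR/year")
    # -> 175 kWh/year, 53 EUR/year
    ```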


  • You just described your subjective experience of thinking.

    Well, I didn’t just do that. We have MRIs and have looked into the brain and we can see how it’s a process. We know how we learn and change by interacting with the world. None of that is subjective.

    I would say that the LLM-based agent thinks. And thinking is not only “steps of reasoning”, but also using external tools for RAG.

    Yes, that’s right. An LLM alone certainly can’t think. It doesn’t have a state of mind; it’s reset a few seconds after it did something and forgets about everything. It’s strictly tokens from left to right. And it also doesn’t interact with the world, which would otherwise have an impact on it. It’s just limited to what we bake in during the training process from what’s on Reddit and other sources. So there are many fundamental differences here.

    The rest of it emerges from an LLM being embedded into a system. We provide tools to it, a scratchpad to write something down, we devise a pipeline of agents so it’s able to draft something and later return to it, something to wrap it all up and not just output all the countless steps before. It’s all a bit limited due to the representation, and we have to cram everything into a context window; it’s also a bit limited to concepts it was able to learn during the training process.
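
    To make the “system around the LLM” idea concrete, a toy sketch of such a loop (the model call is a stand-in for whatever local or hosted model you’d actually use):

    ```python
    def call_llm(prompt: str) -> str:
        """Stand-in for a real model call; returns a canned answer here."""
        return "FINAL: done"

    def run_agent(task: str, max_steps: int = 5) -> str:
        scratchpad = []  # the persistent state lives out here, not in the model
        for _ in range(max_steps):
            prompt = (
                f"Task: {task}\n"
                "Notes so far:\n" + "\n".join(scratchpad)
                + "\nWrite the next note, or FINAL: <answer> when done."
            )
            reply = call_llm(prompt)
            if reply.startswith("FINAL:"):
                return reply[len("FINAL:"):].strip()
            scratchpad.append(reply)  # the model only "remembers" via this list
        return "\n".join(scratchpad)

    print(run_agent("summarize the report"))  # -> "done"
    ```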

    However, those abilities are not in the LLM itself, but in the bigger thing we build around it. And it depends a bit on the performance of the system. As I said, the current “thinking” processes are more of a mirage, and I’m pretty sure I’ve read papers on how models don’t really use them to think. And that aligns with what I see once I open the “reasoning” texts. Theoretically, the approach surely makes everything possible (within the limits of how much context we have and how much computing power we spend; that’s all finite in practice). But what kind of performance we actually get is an entirely different story. And we’re not anywhere close to proper cognition. We hope we’re eventually going to get there, but there’s no guarantee.

    The LLM can for sure make abstract models of reality, generalize, create analogies and then extrapolate.

    I’m fairly sure extrapolation is generally difficult with machine learning. There’s a lot of research on it and it’s just massively difficult to make machine learning models do it. Interpolation on the other hand is far easier. And I’ll agree. The entire point of LLMs and other types of machine learning is to force them to generalize and form models. That’s what makes them useful in the first place.
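
    A quick way to see the interpolation/extrapolation gap with a classic curve fit (not an LLM, but the same underlying issue):

    ```python
    import numpy as np

    # Fit a polynomial to sin(x) on [0, 2*pi], then evaluate inside and
    # far outside that training range.
    xs = np.linspace(0, 2 * np.pi, 50)
    coeffs = np.polyfit(xs, np.sin(xs), deg=7)

    print(np.polyval(coeffs, np.pi))      # ~0: interpolation works fine
    print(np.polyval(coeffs, 4 * np.pi))  # huge: extrapolation falls apart
    ```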

    It doesn’t even have to be an LLM. Some kind of generative or inference engine that produce useful information which can then be modified and corrected by other more specialized components and also inserted into some feedback loop

    I completely agree with that. LLMs are our current approach, and the best approach we have. They just have a scalability problem (and a few other issues). We don’t have infinite datasets to feed in and infinite compute, and everything seems to grow exponentially more costly, so maybe we can’t make them substantially more intelligent than they are today. We also don’t teach them to stick to the truth, or be creative, or follow any goals. We just feed in random (curated) text and hope for the best, with a bit of fine-tuning and reinforcement learning with human feedback on top. But that doesn’t rule out anything. There are other machine learning architectures with feedback loops that are way more powerful; they’re just too complicated to calculate. We could teach AI about factuality and creativity and expose some control mechanisms to guide it. We could train a model with a different goal than just producing the next token so that the output looks like text from the dataset. That’s all possible. I just think LLMs are limited in the ways I mentioned, and we need one of the hypothetical new approaches to get them anywhere close to a level a human can achieve… I mean, I frequently use LLMs. And they all fail spectacularly at computer programming tasks I do in 30 minutes. And I don’t see how they’d ever be able to do them, given the level of improvement we see as of today. I think that needs a radically new approach in AI.


  • Agreed.

    Those models could be easily trained with renewables alone but you know, capitalism.

    It’s really sad to read the articles about how they’re planning to bulldoze Texas and do fracking and all these massively invasive things, while we also run a lot of the compute on coal and want more nuclear plants as well. That doesn’t really sound that progressive and sophisticated to me.

    The thing is, those models are already out there and the people training them do not gain anything when people download their open weights/open source models for free for local use.

    You’re right, though the argument doesn’t translate into anything absolute. I can’t buy salami in the supermarket and justify it by saying the cow is dead anyway and someone already sliced it up. It comes down to demand, and that’s really complex. Does Mark Zuckerberg really gift an open-weights model to me out of pure altruism? Is it ethical if I get some profit out of some waste or by-product of some AI war/competition? It is certainly correct that we don’t invest money in that form here. However, that’s not the entire story either: we still buy the graphics cards from Nvidia, and we also set free some CO2 when doing inference, even if we didn’t pay for the training process. And they spend some extra compute to prepare those public models, so it’s not zero extra footprint, but it’s comparatively small.

    I’m not perfect, though. I’ll still eat salami from time to time. And I’ll also use my computer for things I like. Sometimes it serves a purpose and then it’s justified. Sometimes I’ll also do it for fun. And that in itself isn’t something that makes it wrong.

    I’m a huge fan of RAG because it cites where it got the information from

    Yeah, that’s really great and very welcome. Though I think it still needs some improvement on picking sources. If I use some research mode from one of the big AI services, it’ll randomly google things, but some weird blog post or a wrong Reddit comment will show up on the same level as a reputable source. So it’s not really fit for those use cases. It’s awesome to sift through documentation, though. Or a company’s knowledge base. And I think those are the real use cases for RAG.
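
    To illustrate why RAG can cite its sources, a toy sketch (real systems use embeddings for retrieval instead of this word-overlap score, and the documents here are made up):

    ```python
    DOCS = {
        "handbook.md#backups": "Backups run nightly at 02:00 and are kept for 30 days.",
        "handbook.md#updates": "Security updates are applied automatically every Sunday.",
    }

    def retrieve(question: str) -> tuple[str, str]:
        # Pick the snippet with the largest word overlap with the question.
        q = set(question.lower().split())
        return max(DOCS.items(), key=lambda kv: len(q & set(kv[1].lower().split())))

    source, snippet = retrieve("how long are backups kept")
    prompt = f"Answer using only this source and cite it.\n[{source}] {snippet}"
    print(prompt)  # this is what gets handed to the model
    ```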