Basically a deer with a human face. Despite probably being some sort of magical nature spirit, his interests are primarily in technology and politics and science fiction.

Spent many years on Reddit before joining the Threadiverse as well.

  • 0 Posts
  • 189 Comments
Joined 2 years ago
Cake day: March 3rd, 2024


  • Conversely, if just 10 percent of users in a given social media community largely agree with your stances, you will be more tolerant toward diverse opinions that contradict your own. “There’s a certain chance that some users will end up in communities where it’s very homogenous and 99 percent of users are disagreeing with them,” said Törnberg. “That will cause them to leave, and you get this feedback effect just because of the structure of interaction. But if you have a filter bubble effect, where everyone is shown 10 percent of their own type, that creates a possibility for you to find the people who you agree with within the community. And that stabilizes the entire dynamics so it doesn’t tip over to one side or the other and become extreme or overly homogenous.”

    Ooh, this is interesting. It suggests the possibility of automating this; since most social media allows for upvoting and downvoting, it should be possible to automatically determine which users are “agreeable” and which are “disagreeable” and filter thread contents to push them toward this 10 percent threshold (a rough sketch of what I mean is below).

    Probably wouldn’t work on the Threadiverse, though; there’s not a large enough population here yet.
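    Very roughly, the sort of bookkeeping I have in mind is below. Everything here is invented for illustration - the scoring, the data shapes, the threshold - it’s just to show that the mechanism is simple.

```python
# Toy sketch: score each comment author by how often the viewing user has
# upvoted vs. downvoted them, then assemble a thread view in which roughly
# 10 percent of comments come from "agreeable" authors.
# All names, data shapes, and thresholds here are invented for illustration.

def agreement_score(viewer_votes: dict[str, list[int]], author: str) -> float:
    """Average of the +1/-1 votes the viewer has given this author (0 if none)."""
    votes = viewer_votes.get(author, [])
    return sum(votes) / len(votes) if votes else 0.0

def filtered_thread(comments: list[dict], viewer_votes: dict[str, list[int]],
                    agreeable_fraction: float = 0.10) -> list[dict]:
    """Keep every 'disagreeable' comment and just enough 'agreeable' ones
    to land near the target fraction."""
    agreeable = [c for c in comments
                 if agreement_score(viewer_votes, c["author"]) > 0]
    disagreeable = [c for c in comments
                    if agreement_score(viewer_votes, c["author"]) <= 0]
    quota = max(1, int(agreeable_fraction * len(comments)))
    return disagreeable + agreeable[:quota]
```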





  • FaceDeer@fedia.io to Technology@beehaw.org · *Permanently Deleted* · ↑1 ↓1 · 21 days ago

    Alright, so instead of simply saying “include external data in your training run”, extend that to “and also filter the data to exclude erroneous stuff.” That’s a routine part of curating training data in real-world AI training as well; I was already writing a lot, so I didn’t feel like adding more detail there would have enhanced it.
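    To be a bit more concrete about what that filtering pass can look like, here’s a toy sketch. The heuristics and thresholds are made up, and real pipelines are far more involved, but the idea is just a curation step before training:

```python
# Toy curation pass over raw training documents: drop near-empty,
# heavily repetitive, and exact-duplicate texts before training.
# The heuristics and thresholds are illustrative, not from any real pipeline.
import hashlib

def looks_low_quality(text: str) -> bool:
    """Reject documents that are very short or mostly repeated lines."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(text) < 200:
        return True
    if lines and len(set(lines)) / len(lines) < 0.5:
        return True
    return False

def curate(docs: list[str]) -> list[str]:
    """Return the documents that pass the quality check, with duplicates removed."""
    seen, kept = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest in seen or looks_low_quality(doc):
            continue
        seen.add(digest)
        kept.append(doc)
    return kept
```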

    The basic point remains the same: real-world training accounts for the things that were necessary to force model collapse to happen in that old paper I linked. It’s a solved problem. We can see that it’s solved by the fact that AI models continue to get better, despite an increasing amount of AI-generated data being present in the sources that training data is drawn from. Indeed, most models these days use synthetic training data that is intentionally AI-generated.

    A lot of people really want to believe that AI is going to just “go away” somehow, and this notion of model collapse is a convenient way to support that belief. So it’s very persistent and makes for great clickbait. But it’s just not so. If nothing else, the exact same training data that was used to create those earlier models is still around. AI models are never going to get worse than they are now, because if they did get worse we’d just throw them out and go back to the earlier ones that worked better, perhaps re-training with the same data but with better training techniques or model architectures.


  • FaceDeer@fedia.io to Technology@beehaw.org · *Permanently Deleted* · ↑1 ↓1 · 21 days ago

    Model collapse comes from using only training data generated by previous generations.

    All that’s needed to avoid it is to add training data that isn’t directly from the previous “generation” of the LLM in question. The thing that causes model collapse is the loss of data from generation to generation, so you just need to keep the training data “fresh” with stuff that wasn’t directly generated by the earlier generation of your model.

    You could do that with archived material you used for previous training runs. For more recent events you could do that with social media feeds. The Fediverse, for example, would probably be a perfectly fine source of new stuff. Sure, there’s some AI-generated stuff mixed in, but that’s not “poison.”
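    The shape of the idea, in toy form (the proportions and source names here are invented, not anyone’s actual recipe):

```python
# Toy illustration of assembling a "fresh" training mix instead of training
# only on a previous model generation's output. Proportions are invented.
import random

def build_training_mix(archived_docs: list[str], fresh_docs: list[str],
                       synthetic_docs: list[str], total: int = 10_000,
                       seed: int = 0) -> list[str]:
    """Mostly archived human data, topped up with recent material and a
    deliberately limited share of synthetic text."""
    rng = random.Random(seed)
    mix = (rng.choices(archived_docs, k=int(total * 0.6))      # stable archive
           + rng.choices(fresh_docs, k=int(total * 0.3))       # recent human content
           + rng.choices(synthetic_docs, k=int(total * 0.1)))  # synthetic share
    rng.shuffle(mix)
    return mix
```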

    As I mentioned, the article that demonstrated model collapse did it using a very artificial set of circumstances. It’s not how real AI training is done.


  • FaceDeer@fedia.io to Technology@beehaw.org · *Permanently Deleted* · ↑3 ↓2 · 21 days ago

    The main mechanism leading to model collapse in that paper, as I understand it, is the loss of “rare” elements in the training data as each generation of model omits things that just don’t happen to be asked of it. Like, if the original training data has just one single line somewhere that says “birds are nice”, but the first generation of model never happens to be asked what it thinks of birds, then this bit of information won’t be present in the second generation. Over time the training data becomes homogenized. It probably also picks up an increasing load of false or idiosyncratic bits of information that were hallucinated and got reinforced due to random happenstance; it’s been a long time since I read the article, though, and the details slip my mind.
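    You can see that drift with a trivial simulation: keep resampling a corpus from its own samples and rare entries vanish and never come back. This is only an illustration of the mechanism, not the actual setup from the paper:

```python
# Toy simulation: each "generation" is trained only on samples drawn from
# the previous one, so rare items drift toward zero and then stay there.
# Purely illustrative; this is not the paper's protocol.
import random
from collections import Counter

def next_generation(counts: Counter, sample_size: int, rng: random.Random) -> Counter:
    """Resample a new corpus from the previous generation's frequencies."""
    items = list(counts)
    weights = [counts[item] for item in items]
    return Counter(rng.choices(items, weights=weights, k=sample_size))

rng = random.Random(42)
corpus = Counter({"common fact": 990, "birds are nice": 10})  # one rare element
for generation in range(10):
    corpus = next_generation(corpus, sample_size=1000, rng=rng)
    print(generation, corpus.get("birds are nice", 0))
# Once the rare element's count hits zero it can never reappear.
```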

    I’m really not seeing how human filtering would mimic this process, so I think it’s safe. The filtering is being done with intent in that case, not due to random drift as happens with the purely automated generation process used in the paper.


  • FaceDeer@fedia.io to Technology@beehaw.org · *Permanently Deleted* · ↑11 ↓2 · 22 days ago

    Semantic quibbling is one of the least interesting kinds of internet debate, so replace the word “understanding” with whatever word makes you happy. I continued with “and talking about” right afterwards so you can just delete the word entirely and the sentence still works fine. You could have just kept reading.

    Since you didn’t read the rest of my comment, I should note that the rest of it after that sentence is about the other issue that OP raised and not even about model collapse at all.

    Anyway. The article about model collapse that I see still crop up every once in a while is this one. It’s not that it has “methodological errors”, though; it’s just that it uses a very artificial training protocol to illustrate model collapse, one that doesn’t align with how LLMs are actually trained in real life. It’s like demonstrating the effects of inbreeding in animals by crossing brothers and sisters for twenty generations straight - you’ll almost certainly see strong effects, but it’s not a pattern of breeding that you’re actually going to see in the wild.


  • FaceDeer@fedia.io to Technology@beehaw.org · *Permanently Deleted* · ↑16 ↓2 · 22 days ago

    Only in trivial cases where the training data isn’t being curated properly. There was a paper done on the subject a few years back where “model collapse” was demonstrated by repeatedly training generation after generation of models on the output of previous generations, and sure enough, the results were bad. This result gets paraded around every once in a while to “prove” that AI is doomed. However, in the real world this is not remotely close to how AI is actually trained. You can prevent model collapse simply by enriching the training data with good data - stuff that is already archived, that can’t be “contaminated.”

    Indeed, the best models these days are trained largely on synthetic data - data that’s been pre-processed by other AIs to turn it into stuff that makes for better training material. For example, a textbook could be processed by an LLM to turn it into a conversation about the information in the textbook, with questions and answers, and the result is training data that produces an AI that’s better at understanding and talking about the content than if it were just fed the raw text.
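    Mechanically it’s nothing exotic. A minimal sketch, assuming any OpenAI-compatible chat endpoint (the model name and prompt wording here are placeholders, not anyone’s actual recipe):

```python
# Minimal sketch of turning raw textbook text into Q&A-style training data
# with an LLM. Assumes an OpenAI-compatible chat API; the model name and
# prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # API key / base URL come from the environment

def passage_to_qa(passage: str) -> str:
    """Ask the model to rewrite a passage as a short Q&A dialogue."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Rewrite the given passage as a short Q&A dialogue "
                        "between a student and a teacher. Stay factual."},
            {"role": "user", "content": passage},
        ],
    )
    return response.choices[0].message.content
```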

    If so are these programs that claim to ‘poison’ the training datasets effective?

    This is a separate issue from the usual “model collapse” argument. I assume you’re talking about stuff like Nightshade, which claims to put false patterns into images that cause AIs to miscategorize them. These techniques also only work in a “toy” environment; the adversarial patterns are tailored to affect specific AIs and won’t work on other AIs they weren’t specifically designed for. So for example you might “poison” an image so that a classifier based on Dall-E would become confused by it, but a GPT-Image classifier wouldn’t care. The most obvious illustration of this is the fact that humans are a separate lineage of image classifier, and these “poisonings” have no effect on us.

    There’s also the added problem that these adversarial patterns tend to be fragile; they break if you resample the image to resize or crop it. Since that’s usually a routine part of preparing training data for an image AI, it may end up making the poison ineffective even for the image AIs it was designed for.
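    That preprocessing really is as mundane as it sounds - roughly this, with arbitrary example sizes (assuming Pillow):

```python
# Sketch of the routine resize-and-crop step that tends to wash out fragile
# adversarial perturbations. The target size is an arbitrary example.
from PIL import Image

def preprocess(path: str, size: int = 512) -> Image.Image:
    """Resize the short side to `size`, then center-crop to a square."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    scale = size / min(w, h)
    img = img.resize((round(w * scale), round(h * scale)),
                     Image.Resampling.LANCZOS)
    w, h = img.size
    left, top = (w - size) // 2, (h - size) // 2
    return img.crop((left, top, left + size, top + size))
```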

    Essentially, all these things are just added background noise of the sort that AI training operations already have mechanisms for dealing with. But they make people feel better, I suppose.


  • FaceDeer@fedia.io to Technology@beehaw.org · *Permanently Deleted* · ↑20 ↓2 · 1 month ago

    Where are you getting these limitations from? They’re not in that article, and I went to the project’s page to double check and they’re not there either.

    Connect any ACP-compatible agent or any model with an OpenAI-compatible API

    At this point, that’s basically anything, including all the popular open frameworks for running local AIs.

    Automate workflows and recurring tasks: Completely removes your ability to make decisions and understand what is happening.

    What? This is like setting a cron job. Does cron remove your ability to make decisions or understand what is happening?

    Work seamlessly across devices with native applications for Windows, macOS, Linux, iOS, and Android: Until we decide it doesn’t, or maybe it will only be window.

    It’s open source, like the other projects Mozilla maintains. Do you apply this “they could take it away from us at any time!” concern to Firefox as well?

    Maintain security with self-hosted deployment, optional end-to-end encryption, and device-level access controls: While allowing us to monitor your whole work flow remotely and monetize everything you know.

    Any source for this? Seriously, I know there’s a lot of anti-AI sentiment around here but you’re hallucinating worse than Gemini.




  • Ah. After poking around in the Gradio UI a bit, I found an “Enable ADG” option, but the tooltip says it’s “Angle Domain Guidance” - same thing?

    I’m a programmer, but sometimes with AI I feel like a primitive tribesperson blindly attempting various rituals in an effort to appease the machine spirits. Eventually something works, and then I just keep on doing that.

    Edit: I have angered the gods! My ritual failed! When I enabled ADG the spirits smote me with the following:

    RuntimeError: The size of tensor a (11400) must match the size of tensor b (5700) at non-singleton dimension 1

    Guess I won’t be trying that for now. :)


  • ADG == Audio-Driven Guidance? I haven’t played around with that part much. I tried it out and couldn’t get it to work, but it turned out that the reason ACE Step wasn’t working was unrelated to that, and I only figured out what was wrong after I stopped experimenting with ADG. So I haven’t gone back to try it again.

    I’m not really much of a music connoisseur, I just know what I like when I hear it. So mostly I just put together lyrics and then throw them at the wall to see what sounds good. :)



  • I’d love to hear what local model you settle on for lyrics. I’ve been having a lot of fun with ACE-Step 1.5, but the lyric generator it’s bundled with produces semi-nonsense lyrics that have nothing to do with what I prompt it with. Which is actually kind of fun in its own way - I literally never know what the song’s going to be about - but I’d like a little control sometimes too. :)


  • When the regular controller of the car - be it human, another AI, whatever - isn’t sending control signals, the onboard controller knows that the car is uncontrolled. Of course it’s a “failure scenario”; I’m suggesting that this chip would be ideal for picking up when that sort of thing happens. The alternative is to just fall over.
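    Concretely, the sort of thing I mean is just a heartbeat watchdog. A sketch with a made-up timeout and invented names, not how any particular vehicle stack actually does it:

```python
# Toy heartbeat watchdog: if the normal controller stops sending commands,
# the onboard system notices and can hand control to a local fallback.
# The timeout and class/method names are invented for illustration.
import time

class ControlWatchdog:
    def __init__(self, timeout_s: float = 0.5):
        self.timeout_s = timeout_s
        self.last_command = time.monotonic()

    def on_command_received(self) -> None:
        """Call whenever a control signal arrives from the normal controller."""
        self.last_command = time.monotonic()

    def is_uncontrolled(self) -> bool:
        """True if no control signal has arrived within the timeout."""
        return time.monotonic() - self.last_command > self.timeout_s
```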

    I, too, am not sure what you’re arguing. I suggested that a low-power high-speed AI chip like this would be ideal for putting in robots, which have power constraints and aren’t always in reliable contact with outside controllers. That’s a very broad “niche” indeed. I don’t know what all this landmine stuff or probabilities of brake-slamming is all about or how it relates to what I suggested.