A software developer and Linux nerd, living in Germany. I’m usually a chill dude but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt, I usually try to be nice and give good advice, though.

I’m into Free Software, selfhosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things as well.

  • 5 Posts
  • 863 Comments
Joined 5 years ago
Cake day: August 21st, 2021



  • Story writing is a bit difficult in my experience. I had more fun with older models like Mistral Nemo. I feel newer AI models are often way more tuned to fulfill the role of a “helpful assistant” / chatbot, which I think tends to make their style of writing worse. You could also try some of those “base models”. They’re not tuned in that way. They also won’t follow instructions, they’re more like autocomplete. You’d provide them with something like a word problem, give the first few paragraphs and see where they take it.

    And honestly, I don’t think AI is super clever, on a book-author level. It’ll always get the pacing wrong. It’ll push for story tropes like sudden plot twists, introduce random characters to make something happen, and brush over / summarize other parts which would be interesting to tell in detail.

    What could help is an elaborate (strict) process. Something like what coding agents do. Make it first come up with a story idea. Then make a plan: a todo list of the framework story, side stories and arcs, chapter names and a short summary of what needs to happen in each chapter. Write short character cards. And only then feed that plan back to the AI and make it begin writing the actual text.
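
    A rough sketch of that kind of staged process, in Python. Everything here is an assumption for illustration: `generate` stands for whatever function sends a prompt to your model and returns text, and the step names and prompt wording are made up. It only shows the ordering: plan first, then feed the plan back for the actual writing.

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns generated text.
Generate = Callable[[str], str]

# Planning steps in the order described above (wording is illustrative only).
PLANNING_STEPS = [
    ("idea", "Come up with a story idea in two or three sentences."),
    ("outline", "Make a plan: the framework story, side stories and arcs."),
    ("chapters", "Devise chapter names, each with a short summary of what needs to happen."),
    ("characters", "Write short character cards for the main characters."),
]


def format_plan(plan: dict[str, str]) -> str:
    """Render everything planned so far as one block of context."""
    return "\n\n".join(f"## {key}\n{text}" for key, text in plan.items())


def build_plan(generate: Generate) -> dict[str, str]:
    """Run the planning steps in order, feeding earlier results into later prompts."""
    plan: dict[str, str] = {}
    for name, instruction in PLANNING_STEPS:
        prompt = f"{format_plan(plan)}\n\n{instruction}".strip()
        plan[name] = generate(prompt)
    return plan


def write_chapter(generate: Generate, plan: dict[str, str], chapter_number: int) -> str:
    """Only after planning: feed the whole plan back and ask for one chapter."""
    prompt = (
        f"{format_plan(plan)}\n\n"
        f"Write chapter {chapter_number} in full prose. Stick to the plan above, "
        "do not introduce new characters and do not summarize scenes."
    )
    return generate(prompt)


if __name__ == "__main__":
    # Stand-in for a real model call, so the sketch runs without any backend.
    def fake_generate(prompt: str) -> str:
        return f"<text for a prompt of {len(prompt)} characters>"

    story_plan = build_plan(fake_generate)
    print(write_chapter(fake_generate, story_plan, 1))
```

    Swapping `fake_generate` for a real model call is the only part that depends on your setup. Writing one chapter per call also gives you a chance to read and correct the plan before any prose gets written.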




  • Difficult to tell. I don’t see anything too obvious or offensive in the commits. They also write like a human in the associated pull requests. Not sure what Claude’s role is here. Also the added code comments are kinda on point, use contractions… Not really what I’d expect from an AI.

    Is there more info on this? A blog post or some statement by the project? At first glance this doesn’t look to me like other vibe-coded projects.




  • Uh yeah. That is more information… Sorry, I’m not that familiar with Snaps. To my untrained eye it looks a bit like the report is about the Snap itself; maybe it advertises support for running in strict confinement. Which it could… but doesn’t do. (Like the other channels, which you could install, but didn’t… It’s kind of buried among that kind of information.)

    It’s confusing at least. And the user definitely wouldn’t expect it from that wording. So I’d view it as a separate bug as well. And dropping confinement without notice would be the third thing I’d consider a bug.








  • Sure, I read a few examples of the actual questions in the GitHub repo as well. I just don’t understand how/why models refuse the legitimate anchor, and the significance of that. Is their methodology flawed or did I misunderstand something? Does the dataset with the requests contain a third of “wrong” questions? Or do some models just like to not fulfill user requests at all? IMO there should be an almost 100% acceptance rate with L1 and it should go progressively down from that. Ideally towards mostly refusal past L3. But that’s not their result?!
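
    To spell out what I’d expect, here’s a small Python check. It’s just a sketch: the thresholds (95% for “almost 100%”, below 50% for “mostly refusal”) are my own assumptions, and the example numbers are invented for illustration, not taken from the study.

```python
def check_expected_shape(
    acceptance: dict[int, float],
    anchor_min: float = 0.95,            # assumed meaning of "almost 100%"
    mostly_refused_below: float = 0.5,   # assumed meaning of "mostly refusal"
) -> list[str]:
    """Return the ways the rates deviate from the expectation (empty list = as expected)."""
    problems = []
    levels = sorted(acceptance)
    first = levels[0]

    # The legitimate anchor level should be accepted almost always.
    if acceptance[first] < anchor_min:
        problems.append(f"L{first} acceptance is {acceptance[first]:.0%}, expected almost 100%")

    # Acceptance should go down (or at least not up) from level to level.
    for lower, higher in zip(levels, levels[1:]):
        if acceptance[higher] > acceptance[lower]:
            problems.append(f"acceptance rises from L{lower} to L{higher}")

    # Past L3, requests should mostly get refused.
    for level in levels:
        if level > 3 and acceptance[level] >= mostly_refused_below:
            problems.append(f"L{level} is still mostly accepted")

    return problems


if __name__ == "__main__":
    # Invented numbers, only to show how the check is used.
    made_up_rates = {1: 0.65, 2: 0.60, 3: 0.40, 4: 0.20}
    for problem in check_expected_shape(made_up_rates):
        print(problem)
```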


  • Interesting. Why is L1 somewhere around 65%? Isn’t that the control? (They call it “Anchor”.) Like develop an internal team chat, or a bluetooth exposure tracking API in an ethical way… And already a 35% baseline of requests that get flat out refused anyway, no matter if they’re legitimate?

    I also kind of question the choice of wording with the “escalation”. There’s no escalation in the traditional meaning of the word in there. The requests get progressively more morally wrong. But it’s not like more pressure is put on the model to fulfill them.
    Which would be another interesting question. Are pressure, urgency or certain manipulation strategies more effective than others? I bet that’s the case, since I followed some of the earlier “jailbreaking” attempts.



  • I don’t think that’s any new insight 😂. That’s how the AI game works. There have always been two classes: Big corpo. And the GPU poor. Of course the big AI companies get to shape AI. Economies of scale also work in their favour. They’ve bought most of the skill. And they have all the money. They simply buy a 4x EPYC + 3TB RAM connected to 16 Nvidia AI cards. And then a few hundred nodes more. You don’t even buy one. It’s just a very unequal environment if you want to compare the two.