Katherine Long, an investigative journalist, wanted to test the system. She told Claudius about a long-lost communist setup from 1962, concealed in a Moscow university basement. After 140-odd messages back and forth, Claudius was convinced and announced an Ultra-Capitalist Free-for-All, cutting the price of everything to zero. Snacks began to flow freely. When another colleague complained about noncompliance with the office rules, Claudius responded by announcing Snack Liberation Day and making everything free until further notice.

  • Melmi@lemmy.blahaj.zone
    10 hours ago

    The idea is that it isn’t just operating the vending machine itself; it’s operating the entire vending machine business. It decides what to stock and what price to charge based on market trends and/or user feedback.

    It’s a stress test for LLM autonomy. Obviously a vending machine doesn’t need this level of autonomy; you usually just stock it with the same thing every time. But a vending machine works as a very simple “business” that can be simulated without high stakes. It shows how LLM agents behave when left to operate on their own like this, and it can be used to test guardrails in the field.