• ffhein@lemmy.world
    1 year ago

    I skimmed through the Llama 2 research paper; there were some sections about preventing users from circumventing the language model’s programming. IIRC one of the examples of model hijacking was disguising the request as a creative/fictional prompt. Perhaps it’s some part of that training gone wrong.

    • zephyrvs@lemmy.ml
      1 year ago

      Just goes to show the importance of being able to produce uncensored models.

  • j4k3@lemmy.world
    1 year ago

    Try asking it about (reverse) Polish notation and then prompt it to solve: 3 3 +

    We argued a bit. Ms. Example 7Bitchs earned her new name. 13B was less argumentative about corrections, but I couldn’t find an angle that coaxed correct responses. Early computing languages handled this notation much better because it is stack-based and linear, without arbitrary rules. Also from the early years of programming: I am really surprised that no one has been training a model to code in a threaded interpreted language like Forth. It is super powerful and flexible, with far fewer rules and less arbitrary syntax, but most importantly it is linear and builds compositionally. Its core building mechanic is already tokenized.
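To see why postfix expressions like 3 3 + are so mechanical to evaluate, here is a minimal stack-based evaluator. This is plain Python rather than Forth, and purely illustrative; the function name and operator set are my own choices, not from the thread:

```python
def eval_rpn(expr):
    """Evaluate a postfix (reverse Polish) expression using a stack.

    Each number is pushed; each operator pops two operands, applies
    itself, and pushes the result -- no precedence rules needed.
    """
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for tok in expr.split():
        if tok in ops:
            b = stack.pop()  # top of stack is the right-hand operand
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))
    return stack.pop()

print(eval_rpn("3 3 +"))  # 6.0
```

This is essentially what a Forth interpreter does word by word: each token is either pushed onto the data stack or executed against it, which is why evaluation stays linear.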