• 0 Posts
  • 54 Comments
Joined 2 years ago
Cake day: June 4th, 2023

  • LLMs cannot:

    • Tell fact from fiction
    • Accurately recall data from their training set
    • Count

    LLMs can:

    • Translate
    • Get the general vibe of a text (sentiment analysis)
    • Generate plausible text

    Semantics aside, they’re very different skills that require different setups to accomplish. Just because counting is an easier task than analysing text for humans doesn’t mean it’s the same for an LLM. You can’t use that as evidence for its inability to do the “harder” tasks.
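    Since sentiment analysis is on the “can” list, here’s a minimal sketch of what one such setup looks like in practice. The library choice (Hugging Face transformers) and the example strings are my own, not the commenter’s:

    ```python
    from transformers import pipeline

    # "General vibe of a text": a fine-tuned transformer classifier handles this well.
    classifier = pipeline("sentiment-analysis")  # downloads a default English model
    print(classifier("This movie was a complete waste of time."))
    # e.g. [{'label': 'NEGATIVE', 'score': 0.99...}]

    # Counting, by contrast, is trivial for ordinary code but foreign to a model
    # that only predicts plausible next tokens:
    print("strawberry".count("r"))  # 3, exact and deterministic
    ```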





  • howrar@lemmy.ca to AI@lemmy.ml · Do I understand LLMs? · 10 months ago

    mathematically “correct” sounding output

    It’s hard to say, because that’s a rather ambiguous description (“correct” could mean anything), but it is a valid way of characterising the mechanism.

    “Correct” in the context of LLMs would be a token that is likely to follow the preceding sequence of tokens. The model computes a probability for every possible token, takes a random sample from that distribution* to choose the next one, and repeats until some termination condition is met. The training side of this is what we call maximum likelihood estimation (MLE) in machine learning (ML): we’re learning a distribution that makes the training data as likely as possible. MLE is indeed the basis of a lot of ML, but not all of it.

    *Oversimplification.
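    As a rough illustration of that decoding loop, here’s a minimal Python sketch. Everything in it is a toy of my own making (the model stub, the vocabulary size, the stop token), not anything from a real LLM stack:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB_SIZE, EOS = 10, 9   # toy vocabulary; token id 9 ends the sequence

    def toy_model(tokens):
        """Stand-in for the LLM (it ignores the context here); a real model
        would score every vocabulary token given the sequence so far."""
        return rng.normal(size=VOCAB_SIZE)

    def sample_next(logits):
        """Softmax the raw scores into a probability distribution, then draw
        one token from it. (The '*': real decoders also apply temperature,
        top-k/top-p filtering, and so on.)"""
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return int(rng.choice(VOCAB_SIZE, p=probs))

    tokens = [0]  # start-of-sequence token
    while tokens[-1] != EOS and len(tokens) < 20:  # termination condition
        tokens.append(sample_next(toy_model(tokens)))
    print(tokens)
    ```

    Sampling from the distribution, rather than always taking the single most likely token, is why the same prompt can produce different outputs from run to run.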



  • I don’t understand the image. Is that supposed to be a Venn diagram?

    Anyway, to answer your question, I use GitHub Copilot for all of my coding work, and ChatGPT here and there throughout the week. They’ve both been great productivity boosters. Sometimes it also gets foisted on me when I don’t want it, like when I’m trying to talk to customer service, or when Notion tries to put words in my mouth because I hit the wrong keyboard shortcut.




  • It’s not completely subjective. Think about it from an information theory perspective. We want a word that maximizes the amount of information conveyed, and there are many situations where you need a word that distinguishes AGI, LLMs, deep learning, reinforcement learning, pathfinding, decision trees and the like from the outputs of other computer science subfields. “AI” has historically been that word, so redefining it without a replacement means we don’t have a word for this thing we want to talk about anymore.

    I refuse to replace a single commonly used word in my vocabulary with a full sentence. If anyone wants to see this changed, then offer an alternative.