They are referencing this paper: "LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset", from September 30.

The paper itself provides some insight into how people use LLMs and how the different use cases are distributed.

The researchers looked at conversations with 25 LLMs. The data was collected from 210K unique IP addresses in the wild on their Vicuna demo and Chatbot Arena website.

  • rufus@discuss.tchncs.de (OP)
    1 year ago

    That is also my observation. Even for (simple) tasks like summarization, I’ve seen LLMs insert too much inaccurate information to be useful in my own life. The tasks I see them handle are somewhat narrow and require a human in the loop, despite some people claiming we’re close to AGI.