When I first got into local LLMs nearly 3 years ago, in mid-2023, the frontier closed models were of course impressively capable.

I then tried my hand at running 7B-size local models, primarily one called Zephyr-7B (what happened to these models?? Dolphin, anyone??), on my gaming PC with an 8GB AMD RX 580 GPU. Fair to say it was just a curiosity exercise (in terms of model performance).

Fast forward to this month: I’m revisiting local LLMs. (Although I no longer have the gaming PC; cost-of-living crisis, anyone? 😫)

And the ~30B-size models now look more than sufficient. #Qwen has taken the helm in this tier, which is still quite expensive to set up locally, although within grasp.

I’m rooting for the edge-computing models now: the ~2B-size models. Thanks to their low footprint, they’re practical for many people to run 24/7 on an SBC at home.

But these edge models are in the ‘curiosity category’ now.
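
A minimal sketch of what “run on an SBC 24/7” can look like in practice, assuming an Ollama server is already running on the box with a small model pulled (the model tag below is just an example):

```python
# Minimal sketch: query a small local model via Ollama's HTTP API.
# Assumes `ollama serve` is running and a ~2B model has been pulled,
# e.g. `ollama pull qwen2.5:1.5b` (the model tag is illustrative).
import requests

def ask(prompt: str, model: str = "qwen2.5:1.5b") -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,  # small models on an SBC can still take a while
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask("Why do small models suit always-on home servers?"))
```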

  • SuspciousCarrot78@lemmy.world · 16 hours ago (edited)

    Probably not; the models they use all tend to be quite lightweight and inexpensive, tbh.

    EDIT:
    https://proton.me/support/lumo-privacy


    Open-source language models

    Lumo is powered by open-source large language models (LLMs) which have been optimized by Proton to give you the best answer based on the model most capable of dealing with your request. The models we’re using currently are Nemo, OpenHands 32B, OLMO 2 32B, GPT-OSS 120B, Qwen, Ernie 4.5 VL 28B, Apertus, and Kimi K2. These run exclusively on servers Proton controls so your data is never stored on a third-party platform.

    Lumo’s code is open source, meaning anyone can see it’s secure and does what it claims to. We’re constantly improving Lumo with the latest models that give the best user experience.


    Quite a lightweight swarm for a cloud service, barring Kimi K2.
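
    Out of curiosity, the routing Proton describes above (“the model most capable of dealing with your request”) could be as simple as per-request dispatch. Their actual routing logic isn’t public, so this is only a toy sketch; the rules and model identifiers below are guesses loosely based on the list they publish:

    ```python
    # Toy sketch of per-request model routing (NOT Proton's actual logic;
    # rules and model names here are guesses for illustration only).
    def pick_model(request: str) -> str:
        text = request.lower()
        if any(k in text for k in ("image", "photo", "screenshot")):
            return "ernie-4.5-vl-28b"   # vision-language requests
        if any(k in text for k in ("code", "bug", "function", "compile")):
            return "openhands-32b"      # coding-oriented requests
        if len(text) > 2000:
            return "gpt-oss-120b"       # long/complex prompts go to the big model
        return "olmo-2-32b"             # lightweight general default
    ```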

    • NoiseColor@lemmy.world · 15 hours ago

      They have been working on this. Only 3 months ago it was pretty terrible. Today it’s almost on par with ChatGPT: a bit worse at RAG, slower… but good enough for normal use.

      • SuspciousCarrot78@lemmy.world · 10 hours ago (edited)

        I was playing around with it a bit earlier today (I use ProtonMail, so I figured why not).

        I can’t tell much about it. It seems very… safety theater / personality removed.

        Any idea what models they use now? I get the feeling the main brain is ~14B (based on how it responds to questions / drops nuance).