I’m looking for storage classes for a multi-node cluster. I’m currently using Longhorn and NFS, but I’m not happy with the performance. My cluster doesn’t have beefy nodes, so Ceph/Rook is out of the question (for now).

Nodes:

  1. 8 GB RAM, 4 cores VM, control plane. 256 GB SSD
  2. 4 GB RAM, 2 cores, control plane, currently cordoned. 128 GB SSD
  3. 8 GB RAM, 4 cores, ARM, control plane. 512 GB SSD
  4. 8 GB RAM, 4 cores. 256 GB SSD
  5. 16 GB RAM, 6 cores. 256 GB SSD + 1 TB HD
  6. RPi 4, 4 GB RAM. 128 GB SSD
  • h3ron@lemmy.zip · 1 day ago

    I have two storage nodes and one is much faster than the other.

    I’m currently evaluating a JuiceFS deployment backed by two MinIO instances (one per node, kept in sync with async bucket replication) behind a load balancer (sidekick) in failover mode. Because JuiceFS also needs a database for its metadata, I went with Valkey + Sentinel.
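
    On the Docker side it’s roughly this on each storage node (the MinIO image, command and env variable names are real; paths and credentials are placeholders, and I’ve left sidekick and the Valkey/Sentinel containers out for brevity):

    ```yaml
    # docker-compose.yml (sketch) - one MinIO instance per storage node.
    # The second node runs the same thing, and the two buckets are then
    # tied together with async bucket replication.
    services:
      minio:
        image: minio/minio
        command: server /data --console-address ":9001"
        environment:
          MINIO_ROOT_USER: minioadmin      # placeholder
          MINIO_ROOT_PASSWORD: change-me   # placeholder
        volumes:
          - /srv/minio/data:/data          # placeholder path
        ports:
          - "9000:9000"                    # S3 API
          - "9001:9001"                    # web console
        restart: unless-stopped
    ```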

    JuiceFS provides a CSI driver that supports ReadWriteMany volumes and CSI snapshots, and it manages both read and write caches. Performance is much, much better than Ceph. In theory it should be riskier (because of the async replication), but in practice I haven’t lost a bit yet.
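
    The CSI side boils down to a Secret with the object-store and metadata URLs plus a StorageClass that points at it. Roughly like this (the key and parameter names are what I remember from the JuiceFS CSI docs, so double-check them against your driver version; all values are placeholders):

    ```yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: juicefs-secret
      namespace: kube-system
    type: Opaque
    stringData:
      name: jfs-volume                                  # filesystem name (placeholder)
      metaurl: redis://valkey.db.svc:6379/1             # placeholder; mine goes through Sentinel
      storage: minio
      bucket: http://sidekick.storage.svc:9000/juicefs  # sidekick in front of the two MinIOs
      access-key: minioadmin                            # placeholder
      secret-key: change-me                             # placeholder
    ---
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: juicefs
    provisioner: csi.juicefs.com
    reclaimPolicy: Retain
    parameters:
      csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
      csi.storage.k8s.io/provisioner-secret-namespace: kube-system
      csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
      csi.storage.k8s.io/node-publish-secret-namespace: kube-system
    ```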

    • eutampieri@feddit.itOP · 1 day ago

      Thanks! A bit more involved than I’d have thought, but still worth considering! Could you update us after your evaluation?

      • h3ron@lemmy.zip · 8 hours ago

        Well, actually it’s very easy to spin up in Docker, and most of the configuration happens through env variables.

        JuiceFS itself only lives on the client side, so you basically only have to install and configure the CSI driver with Helm.
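
        Something along these lines (the repo and chart names are from memory, so verify them against the JuiceFS CSI README; the claim just targets the StorageClass from my comment above):

        ```yaml
        # Install the CSI driver with Helm (names from memory - verify first):
        #   helm repo add juicefs https://juicedata.github.io/charts/
        #   helm repo update
        #   helm install juicefs-csi-driver juicefs/juicefs-csi-driver -n kube-system
        #
        # After that, a shared volume is just a normal ReadWriteMany claim:
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: shared-data            # placeholder
        spec:
          accessModes:
            - ReadWriteMany
          storageClassName: juicefs
          resources:
            requests:
              storage: 50Gi
        ```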

        Since it took me a few days to come up with this solution, I’d be happy to share my config files.

        Performance-wise it’s quite fast on sequential reads (it saturates my 2.5 Gbps link) and slower than I expected on sequential writes (for me it caps at 60 MB/s). PostgreSQL seems happy. I saw no visible performance degradation with Authentik, Immich and OpenCloud. The Nextcloud installation took ages. I’ve yet to try it with Jellyfin and the *arr suite.

        A simple NFS share would be faster, but it doesn’t support replication, failover, or CSI snapshots.

  • Decronym@lemmy.decronym.xyz [bot] · 2 hours ago

    Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

    Fewer Letters More Letters
    NFS Network File System, a Unix-based file-sharing protocol known for performance and efficiency
    SSD Solid State Drive mass storage
    k8s Kubernetes container management package

    3 acronyms in this thread; the most compressed thread commented on today has 6 acronyms.


  • custard_swollower@lemmy.world · 1 day ago

    I’m not sure you’ll get nice performance on a local network with small appliances (consumer network hardware, mini PCs and a Raspberry Pi 4). I’ve never gotten sub-millisecond network disk access on a 1 Gbps switch and router. In the end I did the opposite: I added one k8s host with a lot of storage, and any storage services are deployed there. All the other k8s services rely on local SSDs.
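
    Concretely that just means labelling the storage node and pinning anything stateful to it. A minimal sketch (the node label, the workload and the local-path class are only examples of what I’d use):

    ```yaml
    # Label the big node first, e.g.: kubectl label node big-node storage=true
    # Then pin the storage service to it and give it a local volume.
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: postgres                   # example workload
    spec:
      serviceName: postgres
      replicas: 1
      selector:
        matchLabels:
          app: postgres
      template:
        metadata:
          labels:
            app: postgres
        spec:
          nodeSelector:
            storage: "true"            # label on the node with the disks
          containers:
            - name: postgres
              image: postgres:16
              volumeMounts:
                - name: data
                  mountPath: /var/lib/postgresql/data
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: local-path   # or whatever local provisioner you use
            resources:
              requests:
                storage: 20Gi
    ```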

    • eutampieri@feddit.itOP · 2 days ago

      Found a Reddit thread that says LINSTOR has lower CPU usage (which is my main gripe with Longhorn). Might as well try it and report back. Is there a good way to migrate PVs and PVCs?

      • nebula@lemmy.ca · 6 hours ago

        It’s great. I spent 2 years looking for the perfect CSI driver for my homelab and landed on Piraeus. The best part is that you get the full read performance of your local disk, so I didn’t have to go 10G; writes are limited by the network link between nodes, but that hasn’t been a problem for me. Also, they’re super responsive about any issues/bugs you hit.
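
        For reference, once the operator is running the StorageClass is about all you have to touch. Mine looks roughly like this (the parameter names are from memory and have changed between Piraeus releases, so check the docs for your version):

        ```yaml
        apiVersion: storage.k8s.io/v1
        kind: StorageClass
        metadata:
          name: linstor-replicated
        provisioner: linstor.csi.linbit.com
        allowVolumeExpansion: true
        volumeBindingMode: WaitForFirstConsumer
        parameters:
          # two replicas: reads stay local, writes cross the inter-node link
          linstor.csi.linbit.com/placementCount: "2"
          linstor.csi.linbit.com/storagePool: pool1   # whatever pool you defined
        ```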

        Let me know if you have any specific questions about this.

      • supersheep@lemmy.world · 2 days ago

        I can confirm that the resource usage is quite low indeed. I only used it with Nomad instead of Kubernetes, so I can’t comment on how to best migrate PVs and PVCs.

    • eutampieri@feddit.itOP · 2 days ago

      I used to use Ceph at work and I’m a bit reluctant to use it at home. Don’t get me wrong, it’s really cool, but those were beefy nodes, and I only have 1 Gbps between nodes.

    • jonathan@piefed.social · 1 day ago

      I’d expect the performance to be awful, but it still has some relatively niche use cases, especially where performance isn’t a concern. I’m imagining legacy apps that don’t speak S3.