is this Little Bobby Tables?

  • Strit@lemmy.linuxuserspace.show
    English
    arrow-up
    5
    ·
    13 hours ago

    I actually have a similar issue. My server goes unresponsive/freezes after N hours of uptime. N varies; so far I’ve measured between 6 and 72 hours. I tried working around it by auto-rebooting the server each night, but it still sometimes happens before the 24-hour mark.

    Nothing in logs, so my best option is to auto-reboot at this time. 😆

    • Drathro@sh.itjust.works
      English
      arrow-up
      1
      ·
      4 hours ago

      Do you have an Intel Ethernet NIC? That’s a known issue, in particular with the more recent Linux kernels used in Debian-based distros. This also means it extends to TrueNAS, Proxmox, etc. There’s a known fix for it too (or you could just downgrade the kernel).

    • LastYearsIrritant@sopuli.xyz
      English
      arrow-up
      2
      ·
      6 hours ago

      I had a bad NVMe drive that caused that on two separate computers.

      On one of them I slowly replaced every single piece of hardware except the NVMe drive, and it still crashed about once a day. I finally sucked it up and bought a new drive, and magically everything stopped crashing.

      Then it started happening on my server, so I just immediately replaced the NVMe drive, and magically no more crashes.

      Zero issues in the logs, no failures on bootup, no issues with any hardware scanners, just random hard freezes.

    • essell@lemmy.world
      English
      arrow-up
      17
      ·
      13 hours ago

      I just solved this exact issue after living with it for a few months.

      For me it was a bad PSU: a voltage drop was probably stopping the HDDs and SSDs, which knocked over the kernel.

    • wltr@discuss.tchncs.de
      English
      arrow-up
      5
      ·
      13 hours ago

      Hey, I have the same thing with my second router, which works as an extender to cover a remote area. I auto-reboot it every 3 hours during the day (not at night). Sometimes it stops transmitting data before the 3-hour mark, so I have to go and physically reboot it. That always helps, whereas on very rare occasions the software reboot does not.

      I have no idea what’s going on. I bought it cheap as a broken unit, but re-flashing it with OpenWrt seems to have solved all its issues. However, I’m not qualified to say there are no issues with it. It’s just that, from a user’s perspective, it works exceptionally well. I see no issues, except this forced auto-reboot thing, but I think that could be me not understanding networking properly and doing something wrong or suboptimal. It gets the signal wirelessly over the 5 GHz band (for speed) and shares it over the 2.4 GHz band (for distance). I fixed some obvious mistakes with the help of a GPT, and it seems to work better now, but I’m not really sure. It could be that it’s winter and it was cold in there; I’ll have to see how it behaves during the summer.

      Honestly, I even started thinking maybe it has no issues now, and I can remove that cron job. But I think I can live with being offline for a minute or two a few times a day, when I’m in that remote location.

      Yeah, I mean, I just tried to complement your story with mine.