I’ve had two server OSes on this machine: AlmaLinux and Debian (currently). Both of them hang at the very end of the shutdown process when I shut them down from Cockpit.

Also, it takes anywhere from an hour to a day after boot for the issue to start. If the machine is restarted twice in quick succession, it works perfectly fine for some reason.

What I’ve tried:

  • setting “acpi=off” and “acpi=force” kernel parameters in grub
  • removing my NVIDIA GPU (I was using the nouveau driver)
  • changing distros

Nothing worked. Here are some things both distros had in common:

  • systemd
  • cockpit
  • libvirt & qemu
  • docker

Does anyone have advice? Nothing I’ve seen online has worked. Thank you for any suggestions.

  • FauxLiving@lemmy.world
    11 hours ago

    reboot: machine restart

    This makes me think it’s a motherboard issue.

    The system is done with its shutdown process and issued the reboot command, but the motherboard didn’t restart.

    There could be some electronic components that get wedged over time. My sound card will occasionally not boot unless it has been completely powered off for 30 seconds or so.

  • catloaf@lemm.ee
    1 day ago

    Hardware? Do they shut down properly if you do it from the console or ssh?

      • catloaf@lemm.ee
        1 day ago

        Interesting. BIOS update? Maybe check through all the settings, or do a factory reset on the BIOS? I have a similar board (H510 something) running proxmox and it works fine.

        • potentiallynotfelix@lemmy.fish (OP)
          1 day ago

          I would like to note that this may have been caused by a BIOS update, as it started sometime after one. I’ll try another update now.

          Edit: already on the latest BIOS version.

  • nanook@friendica.eskimo.com
    1 day ago

    @potentiallynotfelix As a diagnostic, I would suggest SSHing in and using systemctl to shut them down. If that works, then you know the issue is with Cockpit. If it hangs even when systemd is asked to halt, then I would consider reverting to the previous BIOS to see if the problem persists.

    • potentiallynotfelix@lemmy.fish (OP)
      1 day ago

      OK. Cockpit uses the shutdown command to shut down[src], but systemctl poweroff might work. I will also attempt to revert the BIOS if MSI supports it. Thank you very much!

          • nanook@friendica.eskimo.com
            14 hours ago

            @potentiallynotfelix Well, flash to an older version and see how it goes. I’ve seen some weird BIOS issues. I’ve got an i7-6850K machine on an Asus motherboard, and after I flashed to the latest BIOS, the USB power strobed on and off every few seconds, so the keyboard and mouse would work, then not work, then work again. I thought something was broken in the hardware, but then found others had the same issue with the most current BIOS. I flashed to one release earlier and all was good.

  • CameronDev@programming.dev
    1 day ago

    Is it actual server hardware? I’ve seen some very weird things with real servers that take ages to reboot (I was assuming it was self-checking or something). Are you sure it’s hung, and not just very slow to shut down/reboot?

    Is there any serial/monitor output before the hang?

      • ⲇⲅⲇ@lemmy.ml
        1 day ago

        Seems it’s an NVIDIA issue. I have that issue too: the GPU locks up and the VM with the NVIDIA passthrough freezes. I need a full reboot of the bare-metal machine to stop the GPU from sitting stuck at full power. Don’t leave it on like that for hours or you will kill your hardware.

        • potentiallynotfelix@lemmy.fish (OP)
          15 hours ago

          sudo dmsetup info returns:

          Name:              raven--vg-root
          State:             ACTIVE
          Read Ahead:        256
          Tables present:    LIVE
          Open count:        1
          Event number:      0
          Major, minor:      254, 0
          Number of targets: 1
          UUID: LVM-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
          
          Name:              raven--vg-swap_1
          State:             ACTIVE
          Read Ahead:        256
          Tables present:    LIVE
          Open count:        2
          Event number:      0
          Major, minor:      254, 1
          Number of targets: 1
          UUID: LVM-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
          
          • ReversalHatchery@beehaw.org
            13 minutes ago

            Did you make these yourself? If not, could you do an ls -l /dev/mapper? It shows which name corresponds to which dm device.
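
            If the ls -l output is hard to read, here is a minimal sketch that prints only the name-to-node mapping. The function name dm_names and its optional directory argument are made up for this example (the argument just makes it easy to try on a scratch directory); it only reads symlinks and changes nothing.

            ```shell
            # List device-mapper names next to the dm-N node each symlink points to.
            # The directory argument is an assumption for testing; it defaults to /dev/mapper.
            dm_names() {
                dir="${1:-/dev/mapper}"
                for entry in "$dir"/*; do
                    # Skip anything that is not a symlink, such as /dev/mapper/control.
                    [ -L "$entry" ] || continue
                    printf '%s -> %s\n' "$(basename "$entry")" "$(basename "$(readlink "$entry")")"
                done
            }

            dm_names
            ```

            On a layout like the one posted above, I would expect raven--vg-root and raven--vg-swap_1 to show up against dm-0 and dm-1 (the number normally matches the minor number from dmsetup info), but check the real output rather than trusting that guess.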

      • BCsven@lemmy.ca
        1 day ago

        It says reboot. Are you issuing a reboot or a shutdown/poweroff? Entering sleep state 5 should be power off, right?

        • potentiallynotfelix@lemmy.fish (OP)
          24 hours ago

          I click the reboot button in Cockpit, which issues a shutdown --reboot command as root. I agree that sleep state S5 is powered off. From the ACPI docs:

          A computer state where the computer consumes a minimal amount of power. No user mode or system mode code is run. This state requires a large latency in order to return to the Working state. The system’s context will not be preserved by the hardware. The system must be restarted to return to the Working state. It is not safe to disassemble the machine in this state.

          This likely means my system is failing to reach that S5/G2 state.

          • BCsven@lemmy.ca
            24 hours ago

            If you log in directly over ssh and issue the same command, not through the Cockpit interface, does it react the same way?

    • potentiallynotfelix@lemmy.fish (OP)
      24 hours ago

      thanks for the suggestion, could you elaborate on what this would do differently from the regular shutdown command that systemctl uses? thanks again

      • undrwater@lemmy.world
        6 hours ago

        My understanding is that ‘halt’ had been an alias for ‘halt -p’, but that changed recently. -p tells the command to power off. Without it, it just shuts down processes.
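
        Whichever way the alias went, on a systemd distro like Debian the halt, poweroff and reboot commands are usually symlinks to systemctl, where systemctl halt stops the system without cutting power and systemctl poweroff also powers it off. A minimal sketch to check what each name resolves to on your machine (the helper name resolve_cmd is made up for this example, and nothing here shuts anything down):

        ```shell
        # Show what a shutdown-related command name resolves to on this machine.
        resolve_cmd() {
            # command -v prints the path of the command, or fails if it is missing.
            path="$(command -v "$1" 2>/dev/null)" || { printf '%s: not found\n' "$1"; return 0; }
            # readlink -f follows symlinks to the real binary.
            printf '%s -> %s\n' "$1" "$(readlink -f "$path")"
        }

        for name in halt poweroff reboot shutdown; do
            resolve_cmd "$name"
        done
        ```

        Since Cockpit passes --poweroff or --reboot explicitly, a plain halt should not be what is running here, but this at least rules out a surprising alias.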

  • just_another_person@lemmy.world
    1 day ago

    Your machine isn’t shutting down, it’s trying to sleep.

    You also have active KVM instances which are fighting to keep it alive.

    • potentiallynotfelix@lemmy.fish (OP)
      24 hours ago

      can you elaborate on why you suspect this? The cockpit reboot or shutdown button uses the shutdown command directly along with a --reboot or --poweroff flag.

          onSubmit(event) {
              const Dialogs = this.context;
              const arg = this.props.shutdown ? "--poweroff" : "--reboot";
              if (!this.props.shutdown)
                  cockpit.hint("restart");
      
              cockpit.spawn(["shutdown", arg, this.state.when, this.state.message], { superuser: "require", err: "message" })
                      .then(this.props.onClose || Dialogs.close)
                      .catch(e => this.setState({ error: e.toString() }));
      
              event.preventDefault();
              return false;
          }
      

      (source)
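
      For anyone who wants to repeat this over ssh: going by the snippet above, the button boils down to a single shutdown call. Here is a small sketch that only prints the equivalent command line so it can be reviewed before running it as root. The function name, the +0 delay and the message are placeholders for illustration, not values taken from Cockpit.

      ```shell
      # Print the command line the Cockpit dialog would spawn, without running it.
      # $1: "poweroff" or "reboot"; $2: time spec; $3: wall message.
      cockpit_equivalent() {
          printf 'shutdown --%s %s "%s"\n' "$1" "$2" "$3"
      }

      cockpit_equivalent reboot +0 "testing reboot over ssh"
      ```

      If the printed command hangs the same way when run from an ssh session, that would point away from Cockpit itself.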