Forcing a reboot when shutdown command fails
Posted on 2025-08-23
Backups are important. Money is important. I don't want backups to take up all my money, and I have several terabytes to back up. So, what's a girl to do?
I've found that hosting my own backups, on my own hardware, is the cheapest way to get a 3-2-1 backup solution without paying an expensive monthly fee to some cloud company.
As a bonus, this setup comes with a bunch of free problems and complications. You get to run two servers, one in a geographically diverse location that's difficult to maintain!
I should describe my backups. I have two main devices doing backups for me: my home server, and a small remote server that I keep at my mother's house, 3500 miles away.
My home server is the primary backup server, and has a 12TB RAID array that it uses for many things, including backups. The secondary backup server is a low-power headless ARM machine, a Pine Rockpro64 with an 8TB HD attached. Each server emails backup logs to me via a third server, my Linode VPS.
This remote server is a weak point in self-hosted backups... at least in my case. It connects back to my network using a VPN, and restarts itself weekly as a fail-safe. If something gets hung up, I should only have to wait seven days at the most before it automatically restarts. It's not foolproof, but it's a great way to avoid a lot of issues.
This setup worked flawlessly for over a year. A couple of weeks before I was set to visit my family over the summer, I stopped receiving backup logs from my remote backup server. I attempted SSH connections and ping tests, with no luck. My VPN server had no active connections.
For a couple of harrowing weeks, I held my breath and hoped nothing went wrong with my primary backups. After flying across the US, and settling in at Mom's house, I tried to SSH into my backup machine, now that I was on the LAN. Still no luck! I checked the power, and all of the status LEDs on the system seemed normal. I didn't have a way to connect a display, so I unplugged the power supply, plugged it back in, and let it boot up.
Instantly, weeks of daily backup logs started to flood into my inbox. All failures. The oldest log had some clues as to what had happened, mentioning that the last shutdown job couldn't "connect".
I didn't even know the shutdown binary needed to connect to anything! My immediate suspicion was that this was systemd-related, but that was probably prejudice talking. I still don't know what shutdown connects to, because I'm still on vacation and can't understand how shutdowns have become complicated now.
Regardless, I needed to make SURE the system was going to restart when I asked it to. I did some research, and found that echoing a series of characters to /proc/sysrq-trigger would force the system to sync all block devices, remount read-only, and force a restart without any fancy connecting or prompting.
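Before relying on that trigger file, it may be worth confirming that it exists and is writable on the machine in question. The following is a minimal sketch, not part of the original script: the function name sysrq_status and its optional path arguments are hypothetical, and the arguments exist only so the check can be pointed at stand-in files. As far as I understand the kernel documentation, the kernel.sysrq setting restricts keyboard-invoked SysRq only, so it is printed here purely for information.

```shell
#!/usr/bin/env bash
# Sketch: report whether the SysRq trigger looks usable.
# sysrq_status and its arguments are hypothetical names for illustration.
sysrq_status() {
    local trigger="${1:-/proc/sysrq-trigger}"
    local setting="${2:-/proc/sys/kernel/sysrq}"

    if [ ! -e "$trigger" ]; then
        echo "missing: $trigger does not exist"
        return 1
    fi
    if [ ! -w "$trigger" ]; then
        echo "not writable: $trigger (are you root?)"
        return 1
    fi
    # kernel.sysrq is shown for information only; it is assumed not to
    # block writes to the trigger file made by root.
    if [ -r "$setting" ]; then
        echo "ok: $trigger is writable; kernel.sysrq=$(cat "$setting")"
    else
        echo "ok: $trigger is writable"
    fi
}
```

Running sysrq_status with no arguments as root checks the real paths. It only inspects the files and never writes to them, so it is safe to run on a live system.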
When I let the first version of the script run as scheduled, I didn't get a notification that the server had actually restarted, because the machine went down immediately, before the email could be sent. I slightly changed the syntax of the shutdown command so that it schedules the reboot for a few minutes later instead of rebooting right away.
The final version of the script attempts to schedule a normal reboot, just as the original script did. If it succeeds, it outputs a success message to the cron runner, and sends a message to any logged-in users (an improvement over the original).
If it fails for some reason, we're in emergency reboot territory. We print several status messages at each point in the process, but there is little chance that they will actually be received unless you're running the command interactively.
First it syncs all mounted partitions by echoing the letter "s" to the special file /proc/sysrq-trigger. Then it echoes the letter "u" to remount everything read-only. Finally, it echoes a "b" to tell the system to reboot, immediately, without whatever usual preparations are handled by the shutdown command. This is obviously not the best case scenario, so it's only used if an ordinary reboot fails.
Here's the completed script:
#!/usr/bin/env bash

if shutdown --reboot '+5m' 'The server will be shutting down in 5 minutes'; then
    # I gave you the chance of aiding me willingly...
    echo 'Restart successfully scheduled for five minutes from now.'
else
    # ...but you have elected the way of pain.
    echo 'Standard reboot has failed; attempting to force a reboot.'
    echo
    echo 'Manually sync disks...'
    echo s > /proc/sysrq-trigger
    echo 'Remount all filesystems in read-only mode...'
    echo u > /proc/sysrq-trigger
    echo 'Immediately reboot. Do not pass go, do not collect $200.'
    echo b > /proc/sysrq-trigger
    echo "If you're reading this, then we have bigger issues."
fi
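One possible refinement, sketched here under assumptions rather than tested on the Rockpro64: my understanding is that the sync and remount requests can return before the kernel has finished the work, so a short pause between keys may give the disks a better chance of settling before the reboot. The function name force_reboot, its arguments, and the five-second default are all hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch: the forced-reboot fallback as a function with a pause between keys.
# force_reboot, its arguments, and the default pause are hypothetical.
force_reboot() {
    local trigger="${1:-/proc/sysrq-trigger}"
    local pause="${2:-5}"   # seconds to wait after sync and remount (a guess)
    local key

    for key in s u b; do
        echo "sysrq: sending '$key'"
        echo "$key" > "$trigger" || return 1
        # No point sleeping after "b": on a real system we never get here.
        if [ "$key" != b ]; then
            sleep "$pause"
        fi
    done
}
```

In the script above, the three echo lines in the else branch could be swapped for a single call to force_reboot. Passing a scratch file and a pause of 0 as arguments exercises the sequence without rebooting anything.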
This is the first time I've encountered this behavior, so hopefully this will save someone else the headache, and a few lost hours of vacation!
Tags: linux server troubleshooting