He’s Dead Jim… How are your backups doing?


Home Media Servers are Great

For several years now, my siblings and I have enjoyed playing Minecraft together, and in recent years my kids have joined in on the fun. When we first started playing together, our worlds were hosted on an old Windows 7 computer that sat in a corner at my parents’ house. It worked for a time, but eventually the system had enough problems and hiccups that we needed a better solution.

I took it upon myself to set up some hardware in my home to act as a multi-purpose server. I had long wanted to try out an in-home streaming service like Plex, so it seemed like as good a time as any. I put together some old computer hardware, installed Linux on it, and got everything set up. It’s been pretty much smooth sailing from there.

For the last six years or so, I’ve been responsible for our Minecraft hosting, and I’ve continued to run a media server for my family and for any of my siblings who want to watch the odd show that you can’t easily find on streaming services (Animaniacs, anyone?).

Everything worked great with relatively few hiccups until a few days ago, when my wife came to me and said the media server wasn’t working. Sure enough, I couldn’t SSH into it. I tried rebooting it manually and checking the networking equipment, and while I could see the machine on the network, something was clearly wrong.

Troubleshooting

So where do you start troubleshooting? The first thing I did was plug a monitor into it, since for some reason I couldn’t access it over SSH the way you normally would for a headless server like this. Once I had a screen hooked up, the problem was immediately apparent: the operating system was not booting completely, and the system had dropped into recovery mode.

Not being a master of Linux, I figured I would try rebooting again to see if any errors during startup would give me a clue as to what was going on. While watching the boot sequence, I saw a log message showing the computer trying to detect the boot disk. After about 30 seconds, it timed out. It appeared the problem was the SSD holding the operating system.

That’s not good.

After a few attempts to check the overall health of the drive (which was obviously already in question), it turned out that S.M.A.R.T. monitoring had never been enabled, so there was no way to get a read on how the drive was doing. Normally I would have expected a warning at some point before a failure like this, but without the monitoring tools enabled, there was no real way to diagnose the drive further. It was possible that all the data on that drive was toast.
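If you want to avoid that same blind spot, the smartmontools package can enable and query S.M.A.R.T. data from the command line. Here’s a minimal sketch; /dev/sda is a placeholder for whatever your boot drive actually is:

# Install the S.M.A.R.T. tools (Debian/Ubuntu)
sudo apt install smartmontools

# Enable S.M.A.R.T. monitoring on the drive
sudo smartctl -s on /dev/sda

# Quick pass/fail health check
sudo smartctl -H /dev/sda

# Full report: attributes, error counters, self-test logs
sudo smartctl -a /dev/sda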

Backups Anyone?

So does that mean all the data is toast? Well, if you store everything on a single disk, then yes, you’re hosed. But thankfully, this is not a sad story about data loss, followed by a lecture on why you should follow the “3-2-1 Backup Strategy” so you don’t suffer the same fate.

The server served (no pun intended) two main functions: hosting Minecraft worlds and acting as a media server. For performance, the Minecraft worlds were kept on the local SSD to keep loading times down; we’ve had ten people logged in at the same time with no discernible latency or trouble. But when the SSD containing those worlds goes kaput, so does the data. This would have been a real problem if I hadn’t set up automated daily backups.

I wrote several script files to handle the entire backup process automatically, and I check once in a while to make sure the backups are working. There is probably a way I could send myself an alert of some kind if there were ever a problem, but I haven’t looked into that yet.

Backups with AWS Simple Storage Service (S3)

All of the Minecraft worlds are backed up and uploaded to offsite storage every day. Each morning at 1:00 AM, every running server is systematically stopped, and its world is copied, compressed, and uploaded to an S3 bucket. AWS S3 is an object storage service that lets you save files and retrieve them from anywhere. Setup and usage are a topic for another article, but essentially, after you set up an account, you create an access key for the system you want to use, and then you can programmatically upload and download data at will with the AWS CLI.
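As a rough sketch of what that flow looks like from the command line (the bucket name here is a placeholder, not my real bucket):

# One-time setup: prompts for your access key, secret key, and default region
aws configure

# Upload a file to a bucket
aws s3 cp backup.tar.gz s3://my-backup-bucket/minecraft/

# Download it again from anywhere
aws s3 cp s3://my-backup-bucket/minecraft/backup.tar.gz .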

Since all of the Minecraft servers are run within Docker containers, it is a simple matter of stopping the container while the backup file is created, and then starting it up again after a copy of the world files has been made.

Because there are multiple worlds to do, I used a combination of two script files - one to list all the worlds that needed to be backed up, and one that does the actual backing up.

Here is the script that does the actual work:

#!/bin/bash

CONTAINER_NAME=$1
WORLD_DIRECTORY=$2

if [ -z "$CONTAINER_NAME" ] || [ -z "$WORLD_DIRECTORY" ]
then
  echo "Usage: backupMinecraftDockerWorld.sh <docker-container-name> <world-directory>"
  exit 1
fi

echo "--------------------------------------"
echo "Beginning backup of $CONTAINER_NAME..."
echo "--------------------------------------"

DATE=$(date +%Y-%m-%d)
FILENAME="$DATE-$CONTAINER_NAME.tar.gz"
BUCKET_NAME="s3://<redacted>"

# Stop the server while the backup is being made
echo "Stopping $CONTAINER_NAME server for backup"
docker stop "$CONTAINER_NAME"

# Create the backup
echo "Creating backup file"
tar -czf "$FILENAME" "$WORLD_DIRECTORY"

# Restart the server
echo "Restarting the $CONTAINER_NAME server"
docker start "$CONTAINER_NAME"

# Upload the backup file to S3
echo "Uploading $CONTAINER_NAME backup to S3"
aws s3 cp "$FILENAME" "$BUCKET_NAME/$CONTAINER_NAME/"

# Remove the backup file from disk to save space
echo "Cleaning up backup file from local disk"
rm "$FILENAME"

echo "-------------------------------"
echo "$CONTAINER_NAME Backup Complete"
echo "-------------------------------"

As you can see, you simply pass in the name of the Docker container and the location of the world files you want to back up. The world files are compressed to save space and then uploaded to S3 for storage.

Each world you want to back up can be listed in another script file like this:

#!/bin/bash
BACKUP_SCRIPT_HOME="/home/chris/server-scripts/minecraft"
WORLD_HOME="/home/chris/minecraft-worlds"

$BACKUP_SCRIPT_HOME/backupMinecraftDockerWorld.sh minecraft_kiddoSurvival $WORLD_HOME/kiddoSurvival
$BACKUP_SCRIPT_HOME/backupMinecraftDockerWorld.sh minecraft_game_world    $WORLD_HOME/game_world
$BACKUP_SCRIPT_HOME/backupMinecraftDockerWorld.sh minecraft_creative      $WORLD_HOME/minecraft_creative
$BACKUP_SCRIPT_HOME/backupMinecraftDockerWorld.sh minecraft_hydrogen      $WORLD_HOME/minecraft_hydrogen

The second script file is executed via a Cron job. Cron is a utility, available on Unix-like operating systems, that runs scheduled tasks at predetermined times. It’s a process that runs in the background and performs the specified tasks on schedule without a user needing to do anything.

To set this up on my server, I add a line to the crontab via the crontab -e command. For the Minecraft backup script, it looks like this:

0 1 * * * /home/chris/server-scripts/minecraft/backupAllMinecraftWorlds.sh

Cron jobs can be really confusing the first time (or first several times) you see them, but essentially the first five fields of the line (numbers or asterisks) indicate the minute, hour, day of the month, month, and day of the week.
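As a quick reference, the five fields line up like this:

# ┌───────── minute (0-59)
# │ ┌─────── hour (0-23)
# │ │ ┌───── day of the month (1-31)
# │ │ │ ┌─── month (1-12)
# │ │ │ │ ┌─ day of the week (0-6, Sunday = 0)
# │ │ │ │ │
  0 1 * * *  /home/chris/server-scripts/minecraft/backupAllMinecraftWorlds.sh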

Like S3, Cron jobs are a topic for another article entirely, but what this line says is: at minute 0 of hour 1, on every day of the month, in every month, on every day of the week, run the script located at /home/chris/server-scripts/minecraft/backupAllMinecraftWorlds.sh. It works like a charm.

What about the Media?

While I don’t have an enormous media collection by some standards, my collection of TV shows and movies amounts to about 2.5 TB of data. Because of the size, I have not backed this up to the cloud. I’ve considered it a couple of times, because I would hate to have to manually back up all of our movies again, but uploading 2.5 TB would take a very long time: on a typical residential uplink of, say, 20 Mbps, 2.5 TB works out to roughly a million seconds, or more than 11 days of continuous uploading. That doesn’t mean I don’t have a second copy of it, though.

Synology DS918+

Luckily, I didn’t even need to restore a backup of the media. I keep all of the media files on a NAS (Network Attached Storage) device in my server rack: a Synology DS918+. It has four hard drive bays. Three of them are populated with 4 TB Seagate IronWolf drives, and the fourth holds a 2 TB Seagate drive that is just for ephemeral data. (Really, it’s just there to fill the last bay because I haven’t purchased a fourth IronWolf drive yet.) The IronWolf drives are configured in RAID 5, so if one of them were to fail, none of the data would be lost; one drive’s worth of capacity goes to parity, leaving roughly 8 TB of usable space across the three 4 TB drives.

RAID is not a backup, but it’s better than having everything stored on a single disk. Between the RAID array and the second copy of the data on an 8 TB external drive, I have some decent redundancy.
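If you’re scripting a second copy like that yourself, rsync is a simple way to keep an external drive in sync. A minimal sketch, assuming the NAS share is mounted at /nfs/media; the external drive’s mount point here is a placeholder, not my actual path:

# Mirror the media share onto the external drive; --delete keeps the mirror exact
rsync -av --delete /nfs/media/ /mnt/external-backup/media/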

Restoring the Media Server

Besides Minecraft, the system also acts as a media server. I don’t have a single DVD or Blu-ray player in the house; we just stream everything to the two TVs via Chromecast or Roku. The home media server holds all the obscure movies that tend to disappear from streaming services, as well as the sets of movies we would otherwise need a specific streaming service just to watch.

I used Plex for several years, but recently I’d been having some trouble with it around local downloads and connecting to our Chromecast. I switched over to Jellyfin and have been having a better experience with it, for the most part.

I run both Plex and Jellyfin in Docker containers, the same way I do the Minecraft servers. Setup instructions for both are listed on their respective websites, but here are the instructions for Jellyfin, should you be interested in trying it out.

Like with Minecraft, I’ve written scripts that will automate the entire process.

Connecting the NAS

First, I connect the server to the NAS using NFS. NFS (Network File System) is a protocol that lets a user on a client computer access files over a network as if they were stored locally. File access is transparent: the user doesn’t need to know where a file physically lives in order to open it.

NFS uses the Remote Procedure Call (RPC) protocol for communication between client and server and follows the client-server model: the server exports (makes available) certain file systems, which the client can then mount (access) as if they were local. It is most common in Linux and Unix environments, but it can also be used with other operating systems such as Windows. I use a similar setup to connect to the NAS from my Windows machine and my wife’s macOS computer.
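Before touching anything, it’s worth confirming what the NAS is actually exporting. Once the NFS client tools are installed, showmount will list the exports (the IP address here is a placeholder for your NAS):

# List the file systems the NAS makes available to clients
showmount -e 192.168.1.50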

Connecting is quite simple once you know how to do it. To make the server automatically mount the NAS on boot, I update the /etc/fstab file, which specifies which file systems should be mounted at boot time and how. Using the following script, I back up the fstab file and then add my changes:

#!/bin/bash

echo "Configuring NAS connection"
sudo apt update
sudo apt install nfs-common -y

# Create a directory for the NAS mount point
sudo mkdir -p /nfs/media

# Backup and modify /etc/fstab to auto-mount the NAS volume
sudo cp /etc/fstab /etc/fstab_backup
echo "<server-ip-address>:<path-to-media> /nfs/media nfs defaults 0 0" | sudo tee -a /etc/fstab

# Mount everything listed in /etc/fstab, including the new NAS entry
sudo mount -a

Installing Jellyfin

Once the media share is mounted, it’s time to install Jellyfin. I prefer the Docker Compose method because it gives me a few more options out of the box when setting up the Docker environment. Once the container is up, we’re ready to roll; from there it’s just a matter of reconfiguring the media libraries.
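For reference, here’s a minimal sketch of what that Compose setup might look like, based on Jellyfin’s official Docker image; the host paths are assumptions, not necessarily my exact layout:

#!/bin/bash

# Write a minimal docker-compose.yml for Jellyfin (host paths are placeholders)
mkdir -p ~/jellyfin && cd ~/jellyfin
cat > docker-compose.yml <<'EOF'
services:
  jellyfin:
    image: jellyfin/jellyfin
    container_name: jellyfin
    network_mode: host          # simplest option for local discovery and casting
    volumes:
      - ./config:/config        # server configuration
      - ./cache:/cache          # transcoding cache
      - /nfs/media:/media:ro    # the NFS-mounted media share, read-only
    restart: unless-stopped
EOF

# Start the container in the background
docker compose up -d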

Of course, there is a way I could back up the data the Docker containers use so I wouldn’t have to configure the media libraries from scratch again, but since that only takes about five minutes, I’m not too concerned about it right now.
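If I ever do automate it, it would probably look a lot like the Minecraft backups: stop the container, archive the config directory, and start it back up. A rough sketch, assuming the config directory from the Compose sketch above:

#!/bin/bash

# Archive Jellyfin's configuration while the container is stopped
docker stop jellyfin
tar -czf "$(date +%Y-%m-%d)-jellyfin-config.tar.gz" -C ~/jellyfin config
docker start jellyfin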

The Server is Back Online

Overall, once I had the replacement SSD, it took me about a day to bring the server back online.

Why did it take that long when everything is automated? Well, because everything is automated now. Before the failure, I only had the Minecraft world creation and backups semi-automated, and I had nothing for the media server. I created all of the server-setup scripts as I rebuilt it, because this was the first time I’d had a catastrophic failure like this. Unfortunately, even when we know we should have disaster recovery systems in place, it often takes an actual disaster, and the work of cleaning it up, to make us realize we don’t want to do things the “hard way” again.

Looking back now, though, I’m glad it died. The learning experience of setting everything up again after so many years was well worth the effort. Now the system is better than ever, with additional redundancies in place, and the automated backups are working as expected.

I have also created a new GitHub repository to house all of my script files in case anyone else can find them useful, or use them as a starting point for setting up their own server.

Enjoy!