Intro
The goal in this section is to add a hard drive to our primary node, make it accessible to all nodes as a network drive, partition it so that each of the RPi3B+ nodes can have some disk space that is more reliable than its SD card, configure the nodes so that they log to the network drive, and set up a backup system for the whole cluster.
Mounting an external drive
I bought a 2TB external hard drive and connected it to one of the USB 3.0 ports on my primary node RPi4. (The other USB 3.0 port is used for the external 1TB SSD drive.)
Connecting a drive will make a Linux system aware of it. Running lsblk will show you all connected disks. In this case, my two disks show up as:
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
...
sda        8:0    0 931.5G  0 disk
├─sda1     8:1    0   256M  0 part /boot/firmware
└─sda2     8:2    0 931.3G  0 part /
sdb        8:16   0   1.8T  0 disk
└─sdb1     8:17   0   1.8T  0 part
The key thing here is that the SSD drive is “mounted” (as indicated by the paths under MOUNTPOINT in the rows for the two partitions of disk sda). If a disk is not mounted, then there are no read-write operations going on between the machine and the disk and it is therefore safe to remove it. If a disk is mounted then you can mess up its internal state if you suddenly remove it — or if, say, the machine loses power — during a read-write operation.
To mount a disk, you need to create a location for it in the file system with mkdir. For temporary/exploratory purposes, it is conventional to create such a location in the /mnt directory. However, we will go ahead and create a mount point in the root directory, since we intend to make it a network-wide permanent mount point in the next section. In this case, I can run the following:
sudo mkdir /networkshared
sudo mount /dev/sdb1 /networkshared
ls /networkshared
At this point, I can read from and write to the external 2TB hard drive from the primary node, and I risk corrupting it if I suddenly disconnect it. To safely remove a disk, you need to unmount it first with the sudo umount path/to/mountpoint command.
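For example, before physically disconnecting the drive, a minimal safe-removal sequence looks like this (the lsof check is optional, for the case where umount complains that the target is busy):

sync                          # flush any cached writes out to the disk
sudo umount /networkshared    # detach the filesystem from the mount point
# if umount reports "target is busy", list the processes holding files open:
# sudo lsof +D /networkshared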
If you want to automatically mount a drive on start up, then you need to add an entry to the /etc/fstab file. In this case, you first need to determine the PARTUUID (partition universally unique id) with:
❯ lsblk -o NAME,PARTUUID,FSTYPE,TYPE /dev/sdb1
NAME PARTUUID    FSTYPE TYPE
sdb1 a692fa77-01 ext4   part
… and then add a corresponding line to /etc/fstab (“File System TABle”) as follows:
PARTUUID=a692fa77-01 /networkshared ext4 defaults 0 0
This line basically reads to the system on boot up as “look for a disk partition of filesystem type ext4 with id a692fa77-01 and mount it at /networkshared”. (The last three fields (defaults 0 0) are the mount options, the dump flag, and the fsck pass order; these values mean default mount options and no dump or boot-time fsck.)
To test that this works, you can reboot the machine or, even easier, run sudo mount -a (for mount ‘all’ in fstab).
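To double-check that the fstab entry produced a working mount, here are a couple of quick read-only checks:

findmnt /networkshared    # prints the source device and mount options; prints nothing if not mounted
df -h /networkshared      # should report the 1.8T partition rather than the root filesystem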
Setting up a network drive
Our goal here is not just to mount the 2TB hard drive for use on the primary node, but to make it available to all nodes. To do that, we need the disk mounted on the primary node as in the last section, and we need the primary node to run a server whose task is to read/write to the disk on behalf of requests coming in over the network. There are a few different types of server out there that will do this.
If you want to make the disk available to mixed kinds of devices on a network (Mac, Windows, Linux), then the common wisdom is to run a “Samba” server on the node on which the disk is mounted. Samba implements the SMB transfer protocol, which was designed primarily by/for Windows but is generally supported elsewhere.
If you are just sharing files between linux machines, then the common wisdom is that it is better to use the linux-designed NFS (Network File System) server/transfer-protocol.
Also, since the permanent mount point we created above will be usable by all machines in our cluster network, we’ll give it maximally permissive permissions:
sudo chown nobody:nogroup /networkshared
sudo chmod 777 /networkshared
Now, on our primary node, we will run:
sudo apt install nfs-kernel-server
… to install and enable the NFS server service, and now we need to edit /etc/exports by adding the following line:
/networkshared 10.0.0.0/24(rw,sync,no_subtree_check)
… and then run the following commands to restart the server with these settings:
sudo exportfs -a
sudo systemctl restart nfs-kernel-server
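Before touching the minor nodes, you can sanity-check the export from the primary node itself:

sudo exportfs -v          # lists the active exports together with their effective options
showmount -e localhost    # shows the export list as clients will see it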
Now, on each of the RPi3B+ nodes we need to install software to make NFS requests across the network:
sudo apt install nfs-common
sudo mkdir /networkshared
… and add this line to /etc/fstab:
10.0.0.1:/networkshared /networkshared nfs defaults 0 0
… and run sudo mount -a to activate it. You can now expect to find the external disk accessible on each node at /networkshared.
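A quick end-to-end check is to write a file from a minor node and look for it on the primary node (the test filename here is arbitrary):

# on rpi2:
df -h /networkshared                   # should show 10.0.0.1:/networkshared as the source
touch /networkshared/hello-from-rpi2

# on rpi1:
ls -l /networkshared/hello-from-rpi2   # the file written over NFS appears on the local disk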
Testing Read/Write Speeds to Disk
To see how quickly we can read/write to the network-mounted disk, we can use dd, a low-level command-line tool for copying devices/files. Running it on the primary node, which has direct access to the disk, yields:
❯ dd if=/dev/zero of=/networkshared/largefile1 bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.95901 s, 217 MB/s
This figure — 217 MB/s — is a reasonable write speed to a directly connected hard drive. When we try the same command from e.g. rpi2 writing to the network-mounted disk:
❯ dd if=/dev/zero of=/networkshared/largefile2 bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 212.259 s, 5.1 MB/s
… we get the terrible speed of 5.1 MB/s. Why is this so slow? Is it the CPU or RAM on rpi2? Is it the CPU or RAM on rpi1? Or is the network the bottleneck?
We can first crudely monitor the CPU/RAM status by restarting the dd command above on rpi2 and, while it is running, starting htop on both rpi2 and rpi1. On both machines, the CPU and RAM seemed only lightly taxed by the dd process.
What about network speed? In our case, we want to measure how fast data can be sent from a minor node like rpi2 to the primary node rpi1 where the disk is physically located.
First, you can monitor realtime network transfer speed using the cbm tool (installed with sudo apt install cbm) running on rpi1 while writing from rpi2. This reveals the transfer speed to be only in the 5-12 MB/s ballpark. Is that because the network can only go that fast, or is it due to something else?
Another handy tool for measuring network capacity is iperf. We need to install it on both the source node and the target node (rpi1 and rpi2 in this case) with sudo apt install iperf. Now, on the target node (rpi1) we start iperf in server mode with iperf -s. This will start a server listening on port 5001 on rpi1 awaiting a signal from another instance of iperf running in “client” mode. So on rpi2 we run the following command to tell iperf to ping the target host: iperf -c rpi1. iperf will then take a few seconds to stress and measure the network connection between the two instances of iperf and display the network speed. You can see a screen shot of the results here:
[Screenshot: iperf communicating between two nodes; rpi1 in the upper half, rpi2 in the lower half.]
As you can see, it turns out that the max network throughput between these two nodes is only about ~95 Mbit/s, which is actually what this $9 switch is advertised as (10/100 Mbit/s). That corresponds to only ~(95/8) MB/s ≈ ~12 MB/s. So the actual write speed of 5.1 MB/s is certainly within the order of magnitude of what the network switch will support. On a second run, I got the write speed up to 8.1 MB/s; the gap between the maximum network speed (~12 MB/s) and the disk-write speeds (~5-8 MB/s) is likely due to overhead on both ends of the NFS connection.
To test read speeds, you can simply swap the input for the output files like so on the rpi2:
❯ dd of=/dev/null if=/networkshared/largefile2 bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 92.3027 s, 11.6 MB/s
Read speeds were, as you can see from this output, likewise bottlenecked to ~12 MB/s by the network switch.
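One caveat with read tests like this: if the client has recently written or read the file, dd may partly measure the client’s page cache rather than the network. Dropping the caches on the client first gives a more honest number (a standard Linux trick; the ~11.6 MB/s above already matches the network ceiling, so caching evidently wasn’t a factor here):

sync                                          # flush dirty pages first
echo 3 | sudo tee /proc/sys/vm/drop_caches    # drop page cache, dentries, and inodes
dd of=/dev/null if=/networkshared/largefile2 bs=1M count=1024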
In conclusion, the network switch is bottlenecking the read/write speeds, which would probably have been an order of magnitude faster if I’d just shelled out another $4 for a gigabit switch.
An Aside on Network Booting
Now, ideally, we would not be using SD cards on our RPi3B+ nodes to house their entire file systems. A better approach would be to boot each of these nodes with a network file system mounted on the external 2TB disk.
(This would require getting DNSMasq to act as a TFTP server on the primary node, adjusting the /boot/firmware/cmdline.txt file on the minor nodes, and pointing them to a file system on the network drive. See here: https://docs.oracle.com/cd/E37670_01/E41137/html/ol-dnsmasq-conf.html)
This is probably possible with these RPi models and Ubuntu server, and I hope to explore this in the future but, for right now, this is a bridge too far since I need to get the primary node open to the internet.
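For future reference, the dnsmasq side of this would look something like the following sketch (untested; the tftp-root path is an assumption, and the per-model boot details are not worked out here):

# additions to /etc/dnsmasq.conf on the primary node (untested sketch)
enable-tftp
tftp-root=/networkshared/tftpboot    # serve boot files from the network drive
pxe-service=0,"Raspberry Pi Boot"    # boot string the RPi3 boot ROM looks for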
Backing up the primary node
My backup philosophy has long been not to try to preserve the exact state of a disk so as to restore things exactly as they were in the event of e.g. disk failure. Rather, I have long preferred to just keep copies of files so that, if my disk were to fail, I could set up a fresh disk and simply copy over any specific files, or even subsets of files.
Why? First of all, this just simplifies the backup process, IMO. It’s conceptually easier to make copies of files than copies of disks. Second, restoration feels cleaner to me. Over time my computer practices tend to improve, and my files tend to get more disorderly. So I like to “start fresh” whenever I get a new machine, installing only what I currently think I’ll need, rather than copying over God-knows-what cluttersome programs and config files I had conjured up in a past life. OK, that might mean I need to do some more configuration, but I am happy to do so if it means my files get some spring cleaning.
Anyhow, in this instance, the minor nodes are simple enough that, were one to fail, it would not be too much work to restore one from scratch, especially given that I have recorded what I have been doing here.
However, the primary node is already more complex and will soon become more so, and we need to think about making regular backups. There are two basic ways to do backups of a non-GUI Linux server.
One way is to set it up manually with cronjobs and rsync. You can see how that is done in this guide. It’s actually not as complicated as you might expect, and going through this guide gives you a sense of how incremental backups work under the hood.
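The heart of such a setup is rsync’s --link-dest option: unchanged files in a new snapshot are hard-linked against the previous one, so every snapshot looks like a full copy but only costs the space of what changed. Here is a minimal sketch (the paths and schedule are illustrative, not taken from the guide):

#!/bin/bash
# minimal incremental snapshot using rsync --link-dest (illustrative sketch)
SRC=/home
DEST=/networkshared/backups
STAMP=$(date +%Y-%m-%d)

mkdir -p "$DEST"
rsync -a --delete \
      --link-dest="$DEST/latest" \
      "$SRC/" "$DEST/$STAMP/"

# repoint "latest" at the snapshot just taken
ln -sfn "$DEST/$STAMP" "$DEST/latest"

On the first run, rsync just warns that the --link-dest target doesn’t exist and makes a full copy; run the script nightly from a cronjob (e.g. 0 3 * * * /usr/local/bin/snapshot.sh) and you have the essence of what rsnapshot automates.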
The other way is to use a program or service that aims to abstract away the details of the underlying tools, such as rsync. Which is what I decided to do here.
The first tool I tried after some googling was bacula. However, after running it for several months, I realized that it was not backing things up incrementally (but, rather, performing full backups every day), and that the configuration scripts were so ridiculously convoluted that I would lose the will to live trying to get it to work for my very simple use case. So I decided to look for an alternative much closer to simple snapshots wrapped around rsync.
And that’s exactly what I found: rsnapshot. Unlike bacula, which when installed requires you to configure and manage two background services with insanely idiosyncratic config files and bconsole commands (don’t ask what that is), rsnapshot is a much simpler tool that runs no background service. Rather, you simply edit a config file for a tool that just wraps around rsync, and then you set up cronjobs to execute that tool as you prefer. And, in accordance with my backup philosophy, you just specify which files/dirs you want backed up, so restoration in the future involves simply copying/consulting these backed-up files when reestablishing your machine afresh.
(With bacula, you can’t just browse your backed-up files — no no no — the files are stored in some shitty byte-code file format which means that you can only restore those files by working with the god-awful bacula bconsole. Honestly, it’s tools like bacula that really suck the life out of you.)
In fact, I was so happy with the utter simplicity of rsnapshot that I also installed it on one of my minor nodes in order to back up its files too. For reference, here is my rsnapshot config file:
# ...
#
#######################
# CONFIG FILE VERSION #
#######################
config_version 1.2
###########################
# SNAPSHOT ROOT DIRECTORY #
###########################
# All snapshots will be stored under this root directory.
snapshot_root /networkshared/rsnapshot
# If no_create_root is enabled, rsnapshot will not automatically create the
# snapshot_root directory. This is particularly useful if you are backing
# up to removable media, such as a FireWire or USB drive.
#
no_create_root 1
#################################
# EXTERNAL PROGRAM DEPENDENCIES #
#################################
# LINUX USERS: Be sure to uncomment "cmd_cp". This gives you extra features.
# EVERYONE ELSE: Leave "cmd_cp" commented out for compatibility.
#
# See the README file or the man page for more details.
#
cmd_cp /bin/cp
# uncomment this to use the rm program instead of the built-in perl routine.
#
cmd_rm /bin/rm
# rsync must be enabled for anything to work. This is the only command that
# must be enabled.
#
cmd_rsync /usr/bin/rsync
# Uncomment this to enable remote ssh backups over rsync.
#
#cmd_ssh /usr/bin/ssh
# Comment this out to disable syslog support.
#
cmd_logger /usr/bin/logger
# Uncomment this to specify the path to "du" for disk usage checks.
# If you have an older version of "du", you may also want to check the
# "du_args" parameter below.
#
cmd_du /usr/bin/du
# Uncomment this to specify the path to rsnapshot-diff.
#
cmd_rsnapshot_diff /usr/bin/rsnapshot-diff
# Specify the path to a script (and any optional arguments) to run right
# before rsnapshot syncs files
#
#cmd_preexec /path/to/preexec/script
# Specify the path to a script (and any optional arguments) to run right
# after rsnapshot syncs files
#
#cmd_postexec /path/to/postexec/script
# Paths to lvcreate, lvremove, mount and umount commands, for use with
# Linux LVMs.
#
linux_lvm_cmd_lvcreate /sbin/lvcreate
linux_lvm_cmd_lvremove /sbin/lvremove
linux_lvm_cmd_mount /bin/mount
linux_lvm_cmd_umount /bin/umount
#########################################
# BACKUP LEVELS / INTERVALS #
# Must be unique and in ascending order #
# e.g. alpha, beta, gamma, etc. #
#########################################
# Days
retain alpha 6
# Weeks
retain beta 6
# Months
retain gamma 6
############################################
# GLOBAL OPTIONS #
# All are optional, with sensible defaults #
############################################
# Verbose level, 1 through 5.
# 1 Quiet Print fatal errors only
# 2 Default Print errors and warnings only
# 3 Verbose Show equivalent shell commands being executed
# 4 Extra Verbose Show extra verbose information
# 5 Debug mode Everything
#
verbose 2
# Same as "verbose" above, but controls the amount of data sent to the
# logfile, if one is being used. The default is 3.
# If you want the rsync output, you have to set it to 4
#
loglevel 3
# If you enable this, data will be written to the file you specify. The
# amount of data written is controlled by the "loglevel" parameter.
#
logfile /var/log/rsnapshot.log
# If enabled, rsnapshot will write a lockfile to prevent two instances
# from running simultaneously (and messing up the snapshot_root).
# If you enable this, make sure the lockfile directory is not world
# writable. Otherwise anyone can prevent the program from running.
#
lockfile /var/run/rsnapshot.pid
# By default, rsnapshot check lockfile, check if PID is running
# and if not, consider lockfile as stale, then start
# Enabling this stop rsnapshot if PID in lockfile is not running
#
#stop_on_stale_lockfile 0
# Default rsync args. All rsync commands have at least these options set.
#
#rsync_short_args -a
#rsync_long_args --delete --numeric-ids --relative --delete-excluded
# ssh has no args passed by default, but you can specify some here.
#
#ssh_args -p 22
# Default arguments for the "du" program (for disk space reporting).
# The GNU version of "du" is preferred. See the man page for more details.
# If your version of "du" doesn't support the -h flag, try -k flag instead.
#
du_args -csh
# If this is enabled, rsync won't span filesystem partitions within a
# backup point. This essentially passes the -x option to rsync.
# The default is 0 (off).
#
#one_fs 0
# The include and exclude parameters, if enabled, simply get passed directly
# to rsync. If you have multiple include/exclude patterns, put each one on a
# separate line. Please look up the --include and --exclude options in the
# rsync man page for more details on how to specify file name patterns.
#
#include "/"
#exclude "/networkshared"
# The include_file and exclude_file parameters, if enabled, simply get
# passed directly to rsync. Please look up the --include-from and
# --exclude-from options in the rsync man page for more details.
#
#include_file /path/to/include/file
#exclude_file /path/to/exclude/file
# If your version of rsync supports --link-dest, consider enabling this.
# This is the best way to support special files (FIFOs, etc) cross-platform.
# The default is 0 (off).
#
#link_dest 0
# When sync_first is enabled, it changes the default behaviour of rsnapshot.
# Normally, when rsnapshot is called with its lowest interval
# (i.e.: "rsnapshot alpha"), it will sync files AND rotate the lowest
# intervals. With sync_first enabled, "rsnapshot sync" handles the file sync,
# and all interval calls simply rotate files. See the man page for more
# details. The default is 0 (off).
#
#sync_first 0
# If enabled, rsnapshot will move the oldest directory for each interval
# to [interval_name].delete, then it will remove the lockfile and delete
# that directory just before it exits. The default is 0 (off).
#
#use_lazy_deletes 0
# Number of rsync re-tries. If you experience any network problems or
# network card issues that tend to cause ssh to fail with errors like
# "Corrupted MAC on input", for example, set this to a non-zero value
# to have the rsync operation re-tried.
#
#rsync_numtries 0
# LVM parameters. Used to backup with creating lvm snapshot before backup
# and removing it after. This should ensure consistency of data in some special
# cases
#
# LVM snapshot(s) size (lvcreate --size option).
#
#linux_lvm_snapshotsize 100M
# Name to be used when creating the LVM logical volume snapshot(s).
#
#linux_lvm_snapshotname rsnapshot
# Path to the LVM Volume Groups.
#
#linux_lvm_vgpath /dev
# Mount point to use to temporarily mount the snapshot(s).
#
#linux_lvm_mountpath /path/to/mount/lvm/snapshot/during/backup
###############################
### BACKUP POINTS / SCRIPTS ###
###############################
# LOCALHOST
# DWD: Careful -- you need to copy each line and modify, otherwise your tabs will be spaces!
backup /home/ localhost/
backup /etc/ localhost/
backup /usr/ localhost/
backup /var/ localhost/
# You must set linux_lvm_* parameters below before using lvm snapshots
#backup lvm://vg0/xen-home/ lvm-vg0/xen-home/
# EXAMPLE.COM
#backup_exec /bin/date "+ backup of example.com started at %c"
#backup root@example.com:/home/ example.com/ +rsync_long_args=--bwlimit=16,exclude=core
#backup root@example.com:/etc/ example.com/ exclude=mtab,exclude=core
#backup_exec ssh root@example.com "mysqldump -A > /var/db/dump/mysql.sql"
#backup root@example.com:/var/db/dump/ example.com/
#backup_exec /bin/date "+ backup of example.com ended at %c"
# CVS.SOURCEFORGE.NET
#backup_script /usr/local/bin/backup_rsnapshot_cvsroot.sh rsnapshot.cvs.sourceforge.net/
# RSYNC.SAMBA.ORG
#backup rsync://rsync.samba.org/rsyncftp/ rsync.samba.org/rsyncftp/
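rsnapshot does no scheduling of its own; the retain levels above only rotate when cron invokes them. Here is the sort of crontab that drives the alpha/beta/gamma levels (the times are illustrative; the convention is to run the larger levels just before the smaller ones):

# /etc/cron.d/rsnapshot (illustrative schedule)
0  4 1 * *  root  /usr/bin/rsnapshot gamma   # monthly
30 4 * * 1  root  /usr/bin/rsnapshot beta    # weekly, on Mondays
0  5 * * *  root  /usr/bin/rsnapshot alpha   # daily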