Tag: rpi

  • Raspberry Pi Cluster Part IV: Opening Up to the Internet

    Intro

    We want to serve content from the cluster and access it over the Internet from a home connection. To do that, we need to set up our home router with port forwarding and open up ports for SSH, HTTP and HTTPS.

    All traffic goes through our primary node (RPi1). If I want to connect to a non-primary node from the outside world, then I can just ssh into RPi1 and then ssh onto the target node.

    So far, we have been moving around the network using passwords. This is not ideal. Having to type in a password each time slows us down and makes our automations trickier to secure. It’s also just not good practice to use passwords, since hackers will spam your ssh servers with brute-force password attacks.

    Network SSH Inter Communication

    So we want to be able to ssh from our laptop into any node, and from any node to any node, without using a password. We do this with private-public key pairs.

    First, if you have not done so already, create a public-private key pair on your unix laptop with ssh-keygen -t rsa and skip adding a passphrase:

    ❯ ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/me/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/me/.ssh/id_rsa
    Your public key has been saved in /home/me/.ssh/id_rsa.pub
    The key fingerprint is:
    ....
    The key's randomart image is:
    +---[RSA 3072]----+
    |      . +.o.     |
    ...
    |       .o++Boo   |
    +----[SHA256]-----+

    This will generate two files (“public” and “private”) in ~/.ssh (id_rsa.pub and id_rsa respectively). Now, in order to ssh into another host, we need to copy the content of the public key file into the file ~/.ssh/authorized_keys on the destination server. Once that is in place, you can ssh without needing a password.

    Rather than manually copying the content of the public key file onto a remote host, Unix machines ship with a script called `ssh-copy-id` that does this for you. Running ssh-copy-id name@server will copy the default id_rsa.pub file content into the /home/user/.ssh/authorized_keys file on the server. If you want to use a non-default public key file, you can specify it with the -i flag.
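
    For example (the key file name here is just a hypothetical non-default key):

    ssh-copy-id -i ~/.ssh/id_rsa_cluster.pub name@server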

    Once the public-private keys are available on the laptop, we can use the following snippet to copy the public portion into each of the nodes. We’ll need to type the password once per node, but afterwards we can ssh in without passwords.

    while read SERVER
    do
        ssh-copy-id user@"${SERVER}"
    done <<\EOF
    rpi1
    rpi2
    rpi3
    rpi4
    EOF

    Next, we want to be able to ssh from any node into any other node. We could just copy the private key we just created on the laptop into each node, but this is not the safest practice. So, in this case, I went into each node and repeated the process (created a private-public pair and then ran the above script).

    Securing SSH Connections

    Raspberry Pis are notorious for getting pwned by bots as soon as they’re opened up to the internet. One way to help ensure the security of your RPi is to use Two-Factor Authentication (2FA).

    I am not going to do that because it creates added complexity to keep up with, and the other measures I’ll be taking are good enough.

    Now that we have set up the ssh keys on our nodes, we need to switch off the ability to ssh in using a password, especially on our primary node, which will be open to the internet. We do this by editing the file /etc/ssh/sshd_config and changing the line PasswordAuthentication yes to PasswordAuthentication no. Having done that, the only way to ssh in now is with the public-private key pairs, and those only exist on my laptop. (If my laptop gets lost or destroyed, then the only way to access these nodes will be by directly connecting them to a monitor and keyboard, and logging in with a password.)

    The next precaution we’ll take is to change the port on which the sshd service runs on the primary node from the default 22 to some random port. We do this by uncommenting the line #Port 22 in the sshd_config file and changing the number to, say, Port 60022 and restarting the sshd service.
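
    For reference, the two relevant settings in /etc/ssh/sshd_config end up looking something like this (60022 is just the example port used here):

    # /etc/ssh/sshd_config (relevant excerpts)
    PasswordAuthentication no
    Port 60022

    After editing, restart the service with sudo systemctl restart sshd so the changes take effect.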

    Having made this change, you will need to specify this non-default port whenever you try to get into that node with e.g. ssh -p 60022 user@rpi1. A bit annoying but, so I have heard on the grapevine, this will stop 99% of hacker bots in their tracks.

    Finally, we can install a service called fail2ban with the usual sudo apt install fail2ban. This is a service that scans log files for suspicious behavior (e.g. failed logins, email-searching bots) and takes action by temporarily modifying the firewall (e.g. banning the offending IP address).
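
    fail2ban ships with a jail for sshd out of the box. A minimal /etc/fail2ban/jail.local along these lines (the values are just illustrative, and the port should match wherever you moved sshd to) is enough to get started:

    [sshd]
    enabled  = true
    port     = 60022
    maxretry = 5
    bantime  = 3600

    Restart the service with sudo systemctl restart fail2ban for it to pick up the change.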

    With these three measures in place, we can be confident in the security of the cluster when opening it up to the internet.

    Dynamic DNS

    To open up our cluster to the internet, we need to get our home’s public IP address. This is assigned to your home router/modem box by your Internet Service Provider (ISP). In my case, I am using a basic Xfinity service with very meager upload speeds. I’ve used Xfinity for several years now, and it tends to provide fairly stable IP addresses, and does not seem to care if you use your public IP address to host content. (By contrast, I once tried setting up port-forwarding for a friend who had a basic home internet connection provided by Cox, and Cox seemed to actively adapt to block this friend’s outgoing traffic. I.e. Cox wants you to upgrade to a business account to serve content from your home connection.)

    To see your public IP Address, run curl http://checkip.amazonaws.com.

    We want to point a domain name towards our home IP address, but since ISPs can change your public IP address without notice, we need to come up with a method to adapt to any such change. The “classic” way to adapt to these changes in IP address is to use a “dynamic” DNS (DDNS) service. Most modern modem/router devices will give you the option to set up with an account from a company like no-ip.com, and there are plenty of tutorials on the web if you wish to go this route.

    However, I don’t want to pay a company like no-ip.com, or mess about with their “free-tier” service that requires you to e.g. confirm/renew the free service every month.

    Since my DNS records are managed with AWS Route 53, we can use a cron-scheduled script to periodically check whether the IP address assigned by the ISP is the same as the one the AWS DNS record points to and, if it has changed, use the AWS CLI to update the record. The process is described here and the script that I am using is adapted from this gist. The only change I made was to extract the two key variables HOSTED_ZONE_ID and NAME into the script’s arguments (in order to allow me to run this same script for multiple domains).
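
    For reference, here is a minimal sketch of what such a script can look like (this is not the exact gist; the hosted-zone ID and record name are taken as arguments, and it assumes the AWS CLI is installed and configured with credentials allowed to edit the hosted zone):

    #!/bin/bash
    # Sketch: update a Route 53 A record when the home IP changes
    # Usage: _update_ddns HOSTED_ZONE_ID NAME
    HOSTED_ZONE_ID="$1"
    NAME="$2"

    # The public IP currently assigned by the ISP
    CURRENT_IP=$(curl -s http://checkip.amazonaws.com)

    # The IP the DNS record currently points to
    DNS_IP=$(dig +short "$NAME")

    # Only touch Route 53 if they differ
    if [ "$CURRENT_IP" != "$DNS_IP" ]; then
        aws route53 change-resource-record-sets \
            --hosted-zone-id "$HOSTED_ZONE_ID" \
            --change-batch "{
                \"Changes\": [{
                    \"Action\": \"UPSERT\",
                    \"ResourceRecordSet\": {
                        \"Name\": \"$NAME\",
                        \"Type\": \"A\",
                        \"TTL\": 300,
                        \"ResourceRecords\": [{\"Value\": \"$CURRENT_IP\"}]
                    }
                }]
            }"
    fi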

    Once I had the script _update_ddns in place, I decided to run it every 10 minutes by opening crontab -e and adding the line:

    */10 * * * * /home/myname/ddns_update/_update_ddns [MY_HOSTED_ZONE] [DOMAIN] >/dev/null 2>&1

    Port Forwarding

    Finally, we need to tell our modem/router device to forward requests submitted to our ssh, http and https ports onto the primary node’s wifi IP Address. Every router/modem device will be different depending on the brand and model, so you’ll have to poke around for the “port-forwarding” setup.

    In my case, I’m using an Arris router, and it was not too hard to find the port-forwarding settings. You’ll then need to set up a bunch of rules that tell the device which packets coming from the external network on a given port (80 and 443 in the figure below) should be forwarded, and to what internal address and port they should be directed (192.168.0.51 in the figure below). Also add a rule if you want to be able to ssh into your primary node on the non-default port.

    Port forwarding on my home modem/router device.

    Make sure you have a web server (e.g. apache) running when you test the URL you set up through Route 53.

    And that’s it — we have a pretty secure server open to the internet.

  • RaspberryPi Cluster III: Network storage and backup

    Intro

    The goal in this section is to add a hard drive to our primary node, make it accessible to all nodes as a network drive, partition it so that each of the RPi3B+ nodes can have some disk space that is more reliable than the SD cards, configure them so that they log to the network drive, and set up a backup system for the whole cluster.

    Mounting an external drive

    I bought a 2TB external hard drive and connected it to one of the USB 3.0 slots in my primary node RPi4. (The other USB 3.0 slot is used for the external 1TB SSD drive.)

    Connecting a drive will make a Linux system aware of it. Running lsblk will show you all connected disks. In this case, my two disks show up as:

    NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
    ...
    sda      8:0    0 931.5G  0 disk
    ├─sda1   8:1    0   256M  0 part /boot/firmware
    └─sda2   8:2    0 931.3G  0 part /
    sdb      8:16   0   1.8T  0 disk
    └─sdb1   8:17   0   1.8T  0 part

    The key thing here is that the SSD drive is “mounted” (as indicated by the paths under MOUNTPOINT in the rows for the two partitions of disk sda). If a disk is not mounted, then there are no read-write operations going on between the machine and the disk and it is therefore safe to remove it. If a disk is mounted then you can mess up its internal state if you suddenly remove it — or if, say, the machine loses power — during a read-write operation.

    To mount a disk, you need to create a location for it in the file system with mkdir. For temporary/exploratory purposes, it is conventional to create such a location in the /mnt directory. However, we will go ahead and create a mount point in the root directory, since we intend to create a network-wide permanent mount point in the next section. In this case, I can run the following:

    sudo mkdir /networkshared 
    sudo mount /dev/sdb1 /networkshared 
    ls /networkshared

    At this point, I can read-write to the external 2TB hard drive from the primary node, and I risk corrupting it if I suddenly disconnect. To safely remove a disk, you need to unmount it first with the sudo umount path/to/mountpoint command.

    If you want to automatically mount a drive on start up, then you need to add an entry to the /etc/fstab file. In this case, you first need to determine the PARTUUID (partition universally unique id) with:

    ❯ lsblk -o NAME,PARTUUID,FSTYPE,TYPE /dev/sdb1
    NAME PARTUUID                             FSTYPE TYPE
    sdb1 a692fa77-01                          ext4   part

    … and then add a corresponding line to /etc/fstab (“File System TABle”) as follows:

    PARTUUID=a692fa77-01 /networkshared ext4 defaults 0 0

    On boot up, this line basically tells the system to “look for a disk partition of filesystem type ext4 with PARTUUID a692fa77-01 and mount it at /networkshared”. (The last three fields (defaults 0 0) are default values for further parameters that determine how the disk will be mounted and how it will behave.)

    To test that this works, you can reboot the machine or, even easier, run sudo mount -a (for mount ‘all’ in fstab).

    Setting up a network drive

    Our goal here is not to just mount the 2TB hard drive for usage on the primary node, but to make it available to all nodes. To do that, we need to have the disk mounted on the primary node as we did in the last section, and we need the primary node to run a server whose task is to read/write to the disk on behalf of requests that will come from the network. There are a few different types of server out there that will do this.

    If you want to make the disk available to mixed kinds of devices on a network (Mac, Windows, Linux), then the common wisdom is to run a “Samba” server on the node on which the disk is mounted. Samba implements the SMB file-sharing protocol, which was designed primarily by/for Windows but is generally supported elsewhere.

    If you are just sharing files between linux machines, then the common wisdom is that it is better to use the Unix-native NFS (Network File System) server/transfer-protocol.

    Also, since we will be creating a permanent mount point useable by all machines in our cluster network, we’ll create a root folder with maximally permissive permissions:

    sudo mkdir /networkshared 
    sudo chown nobody:nogroup /networkshared
    sudo chmod 777 /networkshared 

    Now, on our primary node, we will run:

    sudo apt install nfs-kernel-server

    … to install and enable the NFS server service, and now we need to edit /etc/exports by adding the following line:

    /networkshared 10.0.0.0/24(rw,sync,no_subtree_check)

    … and then run the following commands to restart the server with these settings:

    sudo exportfs -a
    sudo systemctl restart nfs-kernel-server
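
    At this point you can sanity-check that the share really is being exported. A couple of commands that should show it (showmount comes with the nfs-common package, which nfs-kernel-server pulls in on the server):

    sudo exportfs -v
    showmount -e 10.0.0.1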

    Now, on each of the RPi3B+ nodes we need to install software to make NFS requests across the network:

    sudo apt install nfs-common
    sudo mkdir /networkshared

    … and add this line to /etc/fstab:

    10.0.0.1:/networkshared /networkshared nfs defaults 0 0

    … and run sudo mount -a to activate it. You can now expect to find the external disk accessible on each node at /networkshared.
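
    A quick sanity check from a minor node:

    df -h /networkshared          # should list 10.0.0.1:/networkshared as the filesystem
    touch /networkshared/hello    # confirm we can write to it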

    Testing Read/Write Speeds to Disk

    To see how quickly we can read/write to the network-mounted disk, we can use dd, a low-level command-line tool for copying devices/files. Running it on the primary node (which has direct access to the disk) yields:

    ❯ dd if=/dev/zero of=/networkshared/largefile1 bs=1M count=1024
    1024+0 records in
    1024+0 records out
    1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.95901 s, 217 MB/s

    This figure — 217 MB/s — is a reasonable write speed to a directly connected hard drive. When we try the same command from e.g. rpi2 writing to the network-mounted disk:

    ❯ dd if=/dev/zero of=/networkshared/largefile2 bs=1M count=1024
    1024+0 records in
    1024+0 records out
    1073741824 bytes (1.1 GB, 1.0 GiB) copied, 212.259 s, 5.1 MB/s

    … we get the terrible speed of 5.1 MB/s. Why is this so slow? Is it the CPU or RAM on the rpi2? Is it the CPU or RAM on the rpi1? Is the network the bottleneck?

    We can first crudely monitor the CPU/RAM status by re-running the dd command above on rpi2 and, while it is running, starting htop on both rpi2 and rpi1. On both machines, the CPU and RAM seemed to be only lightly taxed by the dd process.

    What about network speed? In our case, we want to measure how fast data can be sent from a minor node like rpi2 to the primary node rpi1 where the disk is physically located.

    First, you can monitor the realtime network transfer speed using the cbm tool (installed with sudo apt install cbm) running on rpi1 while writing from rpi2. This reveals the transfer speed to be only in the 5-12 MB/s ballpark. Is that because the network can only go that fast, or is it due to something else?

    Another handy tool for measuring network capacity is iperf. We need to install it on both the source node and the target node (rpi2 and rpi1 respectively in this case) with sudo apt install iperf. Now, on the target node (rpi1) we start iperf in server mode with iperf -s. This will start a server listening on port 5001 on rpi1, awaiting a signal from another instance of iperf running in “client” mode. So on rpi2 we run the following command to tell iperf to connect to the target host: iperf -c rpi1. iperf will then take a few seconds to stress and measure the network connection between the two instances and display the network speed. You can see a screenshot of the results here:

    Demo of iperf communicating between two nodes; rpi1 upper half; rpi2 lower half

    As you can see, it turns out that the max network throughput between these two nodes is only about ~95 Mbit/s, which is actually what this $9 switch is advertised as (10/100 Mbit/s). This corresponds to only ~(95/8) MB/s ≈ 12 MB/s. So the actual write speed of 5.1 MB/s is certainly within the order of magnitude of what the network switch will support. On a second run, I got the write speed up to 8.1 MB/s, and the difference between the maximum network speed (~12 MB/s) and the disk-write speeds (~5-8 MB/s) is likely due to overhead on both ends of the nfs connection.

    To test read speeds, you can simply swap the input for the output files like so on the rpi2:

    ❯ dd of=/dev/null if=/networkshared/largefile2 bs=1M count=1024
    1024+0 records in
    1024+0 records out
    1073741824 bytes (1.1 GB, 1.0 GiB) copied, 92.3027 s, 11.6 MB/s

    Read speeds were, as you can see from this output, likewise bottlenecked to ~12 MB/s by the network switch.

    In conclusion, the network switch is bottlenecking the read/write speeds, which would probably have been an order of magnitude faster if I’d just shelled out another $4 for a gigabit switch.

    An Aside on Network Booting

    Now, ideally, we would not be using SD cards on our RPi3B+ nodes to house their entire file systems. A better approach would be to boot each of these nodes with a network file system mounted on the external 2TB disk.

    (This would require getting dnsmasq to act as a TFTP server on the primary node, adjusting the /boot/firmware/cmdline.txt file on the minor nodes, and pointing them to a file system on the network drive. See here: https://docs.oracle.com/cd/E37670_01/E41137/html/ol-dnsmasq-conf.html)

    This is probably possible with these RPi models and Ubuntu server, and I hope to explore this in the future but, for right now, this is a bridge too far since I need to get the primary node open to the internet.

    Backing up the primary node

    My backup philosophy has long been not to try to preserve the exact state of a disk in order to restore things exactly as they were in the event of, e.g., disk failure. Rather, I have long preferred to just keep copies of files so that, if my disk were to fail, I would be able to set up a fresh disk and then simply copy over any specific files, or even subsets of files.

    Why? First of all, this just simplifies the backup process IMO. It’s conceptually easier to make copies of files versus copies of disks. Second, restoration feels cleaner to me. Over time my computer practices tend to improve, and my files tend to get more disorderly. So I like to “start fresh” whenever I get a new machine, only installing what I most currently think I’ll need, rather than copying over God-knows-what cluttersome programs and config files I had conjured up in a past life. OK, that might mean that I need to do some more configuration, but I am happy to do so if it means that my files get some spring cleaning.

    Anyhow, in this instance, the minor nodes are simple enough that, were one to fail, it would not be too much work to restore one from scratch, especially given that I have recorded what I have been doing here.

    However, the primary node is already more complex and will only become more so, and we need to think about making regular backups. There are two basic ways to do backups of a non-GUI linux server.

    One way is to set it up manually with cronjobs and rsync. You can see how that is done in this guide. It’s actually not as complicated as you might expect, and going through this guide gives you a sense of how incremental backups work under the hood.
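
    To give a flavour of what those guides do: the core trick is rsync’s --link-dest option, which hard-links unchanged files against the previous snapshot so that each snapshot looks complete on disk but only changed files consume new space. A bare-bones sketch (the paths are just placeholders, not the guide’s exact script):

    #!/bin/bash
    # Sketch: incremental backup where unchanged files are hard-linked
    # against the previous snapshot via rsync's --link-dest option.
    SRC=/home/
    DEST=/networkshared/backups
    TODAY=$(date +%F)

    # On the very first run "latest" won't exist yet; rsync warns and
    # simply makes a full copy.
    rsync -a --delete \
        --link-dest="$DEST/latest" \
        "$SRC" "$DEST/$TODAY/"

    # Point "latest" at the snapshot we just created
    ln -sfn "$DEST/$TODAY" "$DEST/latest"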

    The other way is to use a program or service that aims to abstract away the details of the underlying tools, such as rsync. Which is what I decided to do here.

    The first tool I tried after some googling was bacula. However, after running it for several months and realizing that it was not backing things up incrementally (but, rather, performing full backups every day), and that the configuration scripts were so ridiculously convoluted that I would lose the will to live trying to get it to work for my very simple use case, I decided that I would look for an alternative much closer to running simple snapshots wrapped around rsync.

    And that’s exactly what I found: rsnapshot. Unlike bacula, which when installed requires you to configure and manage two background services with insanely idiosyncratic config files and bconsole commands (don’t ask what that is), rsnapshot is a much simpler tool that runs no background service. Rather, you simply edit a config file for a tool that just wraps around rsync, and then you set up cronjobs to execute that tool as you prefer (a sketch follows the config file below). And, in accordance with my backup philosophy, you just specify what files/dirs you want backed up, and so restoration in the future involves simply copying/consulting these backed-up files when re-establishing your machine afresh.

    (With bacula, you can’t just browse your backed-up files — no no no — the files are stored in some shitty byte-code file format which means that you can only restore those files by working with the god-awful bacula bconsole. Honestly, it’s tools like bacula that really suck the life out of you.)

    In fact, I was so happy with the utter simplicity of rsnapshot that I also installed it on one of my minor nodes to back up its files too. For reference, here is my rsnapshot config file:

    # ...
    #######################
    # CONFIG FILE VERSION #
    #######################
    
    config_version	1.2
    
    ###########################
    # SNAPSHOT ROOT DIRECTORY #
    ###########################
    
    # All snapshots will be stored under this root directory.
    snapshot_root	/networkshared/rsnapshot
    
    
    # If no_create_root is enabled, rsnapshot will not automatically create the
    # snapshot_root directory. This is particularly useful if you are backing
    # up to removable media, such as a FireWire or USB drive.
    #
    no_create_root	1
    
    #################################
    # EXTERNAL PROGRAM DEPENDENCIES #
    #################################
    
    # LINUX USERS:   Be sure to uncomment "cmd_cp". This gives you extra features.
    # EVERYONE ELSE: Leave "cmd_cp" commented out for compatibility.
    #
    # See the README file or the man page for more details.
    #
    cmd_cp		/bin/cp
    
    # uncomment this to use the rm program instead of the built-in perl routine.
    #
    cmd_rm		/bin/rm
    
    # rsync must be enabled for anything to work. This is the only command that
    # must be enabled.
    #
    cmd_rsync	/usr/bin/rsync
    
    # Uncomment this to enable remote ssh backups over rsync.
    #
    #cmd_ssh	/usr/bin/ssh
    
    # Comment this out to disable syslog support.
    #
    cmd_logger	/usr/bin/logger
    
    # Uncomment this to specify the path to "du" for disk usage checks.
    # If you have an older version of "du", you may also want to check the
    # "du_args" parameter below.
    #
    cmd_du		/usr/bin/du
    
    # Uncomment this to specify the path to rsnapshot-diff.
    #
    cmd_rsnapshot_diff	/usr/bin/rsnapshot-diff
    
    # Specify the path to a script (and any optional arguments) to run right
    # before rsnapshot syncs files
    #
    #cmd_preexec	/path/to/preexec/script
    
    # Specify the path to a script (and any optional arguments) to run right
    # after rsnapshot syncs files
    #
    #cmd_postexec	/path/to/postexec/script
    
    # Paths to lvcreate, lvremove, mount and umount commands, for use with
    # Linux LVMs.
    #
    linux_lvm_cmd_lvcreate	/sbin/lvcreate
    linux_lvm_cmd_lvremove	/sbin/lvremove
    linux_lvm_cmd_mount	/bin/mount
    linux_lvm_cmd_umount	/bin/umount
    
    #########################################
    #     BACKUP LEVELS / INTERVALS         #
    # Must be unique and in ascending order #
    # e.g. alpha, beta, gamma, etc.         #
    #########################################
    
    # Days
    retain	alpha	6
    # Weeks
    retain	beta	6
    # Months
    retain	gamma	6
    
    ############################################
    #              GLOBAL OPTIONS              #
    # All are optional, with sensible defaults #
    ############################################
    
    # Verbose level, 1 through 5.
    # 1     Quiet           Print fatal errors only
    # 2     Default         Print errors and warnings only
    # 3     Verbose         Show equivalent shell commands being executed
    # 4     Extra Verbose   Show extra verbose information
    # 5     Debug mode      Everything
    #
    verbose		2
    
    # Same as "verbose" above, but controls the amount of data sent to the
    # logfile, if one is being used. The default is 3.
    # If you want the rsync output, you have to set it to 4
    #
    loglevel	3
    
    # If you enable this, data will be written to the file you specify. The
    # amount of data written is controlled by the "loglevel" parameter.
    #
    logfile	/var/log/rsnapshot.log
    
    # If enabled, rsnapshot will write a lockfile to prevent two instances
    # from running simultaneously (and messing up the snapshot_root).
    # If you enable this, make sure the lockfile directory is not world
    # writable. Otherwise anyone can prevent the program from running.
    #
    lockfile	/var/run/rsnapshot.pid
    
    # By default, rsnapshot check lockfile, check if PID is running
    # and if not, consider lockfile as stale, then start
    # Enabling this stop rsnapshot if PID in lockfile is not running
    #
    #stop_on_stale_lockfile		0
    
    # Default rsync args. All rsync commands have at least these options set.
    #
    #rsync_short_args	-a
    #rsync_long_args	--delete --numeric-ids --relative --delete-excluded
    
    # ssh has no args passed by default, but you can specify some here.
    #
    #ssh_args	-p 22
    
    # Default arguments for the "du" program (for disk space reporting).
    # The GNU version of "du" is preferred. See the man page for more details.
    # If your version of "du" doesn't support the -h flag, try -k flag instead.
    #
    du_args	-csh
    
    # If this is enabled, rsync won't span filesystem partitions within a
    # backup point. This essentially passes the -x option to rsync.
    # The default is 0 (off).
    #
    #one_fs		0
    
    # The include and exclude parameters, if enabled, simply get passed directly
    # to rsync. If you have multiple include/exclude patterns, put each one on a
    # separate line. Please look up the --include and --exclude options in the
    # rsync man page for more details on how to specify file name patterns.
    #
    #include	"/"
    #exclude	"/networkshared"
    
    # The include_file and exclude_file parameters, if enabled, simply get
    # passed directly to rsync. Please look up the --include-from and
    # --exclude-from options in the rsync man page for more details.
    #
    #include_file	/path/to/include/file
    #exclude_file	/path/to/exclude/file
    
    # If your version of rsync supports --link-dest, consider enabling this.
    # This is the best way to support special files (FIFOs, etc) cross-platform.
    # The default is 0 (off).
    #
    #link_dest	0
    
    # When sync_first is enabled, it changes the default behaviour of rsnapshot.
    # Normally, when rsnapshot is called with its lowest interval
    # (i.e.: "rsnapshot alpha"), it will sync files AND rotate the lowest
    # intervals. With sync_first enabled, "rsnapshot sync" handles the file sync,
    # and all interval calls simply rotate files. See the man page for more
    # details. The default is 0 (off).
    #
    #sync_first	0
    
    # If enabled, rsnapshot will move the oldest directory for each interval
    # to [interval_name].delete, then it will remove the lockfile and delete
    # that directory just before it exits. The default is 0 (off).
    #
    #use_lazy_deletes	0
    
    # Number of rsync re-tries. If you experience any network problems or
    # network card issues that tend to cause ssh to fail with errors like
    # "Corrupted MAC on input", for example, set this to a non-zero value
    # to have the rsync operation re-tried.
    #
    #rsync_numtries 0
    
    # LVM parameters. Used to backup with creating lvm snapshot before backup
    # and removing it after. This should ensure consistency of data in some special
    # cases
    #
    # LVM snapshot(s) size (lvcreate --size option).
    #
    #linux_lvm_snapshotsize	100M
    
    # Name to be used when creating the LVM logical volume snapshot(s).
    #
    #linux_lvm_snapshotname	rsnapshot
    
    # Path to the LVM Volume Groups.
    #
    #linux_lvm_vgpath	/dev
    
    # Mount point to use to temporarily mount the snapshot(s).
    #
    #linux_lvm_mountpath	/path/to/mount/lvm/snapshot/during/backup
    
    ###############################
    ### BACKUP POINTS / SCRIPTS ###
    ###############################
    
    # LOCALHOST
    # DWD: Careful -- you need to copy each line and modify, otherwise your tabs will be spaces!
    backup	/home/	localhost/
    backup	/etc/	localhost/
    backup	/usr/	localhost/
    backup	/var/	localhost/
    # You must set linux_lvm_* parameters below before using lvm snapshots
    #backup	lvm://vg0/xen-home/	lvm-vg0/xen-home/
    
    # EXAMPLE.COM
    #backup_exec	/bin/date "+ backup of example.com started at %c"
    #backup	root@example.com:/home/	example.com/	+rsync_long_args=--bwlimit=16,exclude=core
    #backup	root@example.com:/etc/	example.com/	exclude=mtab,exclude=core
    #backup_exec	ssh root@example.com "mysqldump -A > /var/db/dump/mysql.sql"
    #backup	root@example.com:/var/db/dump/	example.com/
    #backup_exec	/bin/date "+ backup of example.com ended at %c"
    
    # CVS.SOURCEFORGE.NET
    #backup_script	/usr/local/bin/backup_rsnapshot_cvsroot.sh	rsnapshot.cvs.sourceforge.net/
    
    # RSYNC.SAMBA.ORG
    #backup	rsync://rsync.samba.org/rsyncftp/	rsync.samba.org/rsyncftp/
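
    With that config in place, the snapshots themselves are driven by cron. Something along these lines (the times are just an example, with the higher retain levels scheduled to run before the lower ones) matches the alpha/beta/gamma levels above:

    # /etc/cron.d/rsnapshot (sketch)
    # Daily snapshots
    0 3  * * *    root    /usr/bin/rsnapshot alpha
    # Weekly snapshots (Mondays, before the daily run)
    30 2 * * 1    root    /usr/bin/rsnapshot beta
    # Monthly snapshots (1st of the month)
    0 2  1 * *    root    /usr/bin/rsnapshot gamma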

  • Raspberry Pi Cluster Part II: Network Setup

    Introduction

    In the last post we got the hardware in order and made each of our 4 RPi nodes production ready with Ubuntu Server 20.04. We also established wifi connections between each node and the home router.

    In this post, I’m going to describe how to set up the “network topology” that will enable the cluster to become easily transportable. The primary RPi4 node will act as the gateway/router to the cluster. It will communicate with the home router on behalf of the whole network. If I move in the future, then I’ll only have to re-establish a wifi connection with this single node in order to restore total network access to each node. I also only need to focus on securing this node in order to expose the whole cluster to the internet. Here’s the schematic again:

    In my experience, it’s tough to learn hardware and networking concepts because the field is thick with jargon. I am therefore going to write as though to my younger self keenly interested in becoming self-reliant in the field of computer networking.

    Networking Fundamentals

    If you’re not confident with your network fundamentals, then I suggest you review the following topics by watching the linked explainer videos. (All these videos are made by the YouTube channel “Power Cert Animated Videos” and are terrific.)

    Before we get into the details of our cluster, let’s quickly review the three main things we need to think about when setting up a network: IP-address assignment, domain-name resolution, and routing.

    IP-Address Assignment

    At its core, networking is about getting fixed-length “packets” of 1s and 0s from one program running on a computer to another program running on any connected computer (including programs running on the same computer). For that to happen, each computer needs to have an address – an IP Address – assigned to it. As explained in the above video, the usual way in which that happens is by interacting with a DHCP server. (However, most computers nowadays run a process in the background that will attempt to negotiate an IP Address automatically in the event that no machine on its network identifies itself as a DHCP server.) In short, we’ll need to make sure that we have a DHCP server on our primary node in order to assign IP addresses to the other nodes.

    Domain-Name Resolution

    Humans do not like to write instructions as 1s and 0s, so we need each node in our network to be generally capable of translating a human-readable address (e.g. ‘www.google.com’, ‘rpi3’) into a binary IP address. This is where domain-name servers (DNS) and related concepts come in.

    The word “resolve” is used to describe the process of converting a human-readable address into an IP address. In general, an application that needs to resolve an IP address will interact with a whole bunch of other programs, networks and servers to obtain its target IP address. The term “resolver” is sometimes used to refer to this entire system of programs, networks and servers. The term resolver is also sometimes used to refer to a single element within such a system. (Context usually makes it clear.) From hereon, we’ll use “resolver” to refer to a single element within a system of programs, networks and servers whose job is to convert strings of letters to an IP Address, and “resolver system” to refer to the whole system.

    Three types of resolver to understand here are “stub resolvers”, “recursive resolvers”, and “authoritative resolvers”. A stub resolver is a program that basically acts as a cache within the resolver system. If it has recently received a request to return an IP address in exchange for a domain name (and therefore has it in its cache), then it will return that IP address. Otherwise, it will pass the request on to another resolver (which might also be a stub resolver that just has to pass the buck on).

    A recursive resolver will also act as a cache and, if it does not have all of the information needed to return a complete result, it will pass a request for information on to another resolver. Unlike a stub resolver though, it might not receive back a final answer to its question but, rather, the address of another resolver that might have the final answer. The recursive resolver will keep following any such lead until it gets its final answer.

    An “authoritative” resolver is a server that does not pass the buck on. It’s the final link in the chain, and if it does not have the answer or suggestions for another server to consult, then the resolution will fail, and all of these resolvers will send back a failure message.

    In summary, domain-name resolution is all about finding a simple lookup table that associates a string (domain name) with a number (the IP Address). This entry in the table is called an “A Record” (A for Address).

    Routing

    Once a program has an IP Address to send data to, it needs to know where to send the packet first in order to get it relayed. For this to happen, each network interface needs to have a router (gateway) address applied to it when configured. You can see the configured route(s) on a Linux machine with route -n. In a home setup, this router will be the address of the wifi/modem box. Once the router address is determined, the application can just send packets there and the magic of Internet routing will take over.

    Ubuntu Server Networking Fundamentals

    Overview

    Ubuntu Server 20.04, which we’re using here, comes with several key services/tools that are installed/enabled by default or by common practice: systemd-resolved, systemd-networkd, NetworkManager and netplan.

    systemd-resolved

    You can learn the basics about it by running:

    man systemd-resolved

    This service is a stub resolver making it possible for applications running on the system to resolve hostnames. Applications running on the system can interact with it by issuing some low-level kernel jazz via their underlying C libraries, or by pinging the internal (“loopback”) network address 127.0.0.53. To see it in use as a stub server, you can run dig @127.0.0.53 www.google.com.

    You can check what DNS servers it is set up to consult by running resolvectl status. (resolvectl is a pre-installed tool that lets you interact with the running systemd-resolved service; see resolvectl --help to get a sense of what you can do with it.)

    Now we need to ask: how does systemd-resolved resolve hostnames? It does so by communicating over a network with a DNS server. And how do we configure it so it knows which DNS servers to consult, and in what order of priority?

    systemd-networkd

    systemd-networkd is a pre-installed and pre-enabled service on Ubuntu that acts as a DHCP client (listening on port 68 for signals from a DHCP server). So when you switch on your machine and this service starts up, it will negotiate the assignment of an IP Address on the network based upon DHCP broadcast signals. In the absence of a DHCP server on the network, it will fall back to negotiating an address with the other devices on the link (link-local addressing). I believe it is also involved in the configuration of interfaces.

    NetworkManager

    This is an older service that does much the same as networkd. It is NOT enabled by default, but is so prominent that I thought it would be worth mentioning in this discussion. (Also, during my research to try and get the cluster configured the way I want it, I installed NetworkManager and messed with it only to ultimately conclude that this was unnecessary and confusing.)

    Netplan

    Netplan is a pre-installed tool (not a service) that, in theory, makes it easier to configure systemd-resolved and either networkd or NetworkManager. The idea is that you declare your desired network end state in a YAML file (/etc/netplan/50-cloud-init.yaml) so that after start up (or after running netplan apply), it will do whatever needs to be done under the hood with the relevant services to get the network into your desired state.

    Other Useful Tools

    In general, when doing networking on linux machines, it’s useful to install a couple more packages:

    sudo apt install net-tools traceroute

    The net-tools package gives us a bunch of classic command-line utilities, such as netstat. I often use it (in an alias) to check what ports are in use on my machine: sudo netstat -tulpn.

    traceroute is useful in making sense of how your network is presently set up. Right off the bat, running traceroute google.com will show you how you reach Google.

    Research References

    For my own reference, the research I am presenting here is derived in large part from the following articles:

    • This is the main article I consulted that shows someone using dnsmasq to set up a cluster very similar to this one, but using Raspbian instead of Ubuntu.
    • This article and this article on getting dnsmasq and system-resolved to handle single-word domain names.
    • Overview of netplan, NetworkManager, etc.
    • https://unix.stackexchange.com/questions/612416/why-does-etc-resolv-conf-point-at-127-0-0-53
    • This explains why you get the message “ignoring nameserver 127.0.0.1” when starting up dnsmasq.
    • Nice general intro to key concepts with linux
    • This aids understanding of systemd-resolved’s priorities when multiple DNS’s are configured on same system
    • https://opensource.com/business/16/8/introduction-linux-network-routing
    • https://www.grandmetric.com/2018/03/08/how-does-switch-work-2/
    • https://www.cloudsavvyit.com/3103/how-to-roll-your-own-dynamic-dns-with-aws-route-53/

    Setting the Primary Node

    OK, enough preliminaries, let’s get down to setting up our cluster.

    A chief goal is to try to set up the network so that as much of the configuration as possible is on the primary node. For example, if we want to be able to ssh from rpi2 to rpi3, then we do NOT want to have to go to each node and explicitly state where each hostname is to be found.

    So we want our RPi4 to operate as the single source of truth for domain-name resolution and IP-address assignment. We do this by running dnsmasq – a simple service that turns our node into a DNS and DHCP server:

    sudo apt install dnsmasq
    sudo systemctl status dnsmasq

    We configure dnsmasq with /etc/dnsmasq.conf. On this fresh install, this conf file will be full of fairly detailed notes. Still, it takes some time to get the hang of how it all fits together. This is the file I ended up with:

    # Choose the device interface to configure
    interface=eth0
    
    # Also listen on the loopback address
    # Note: this might be redundant
    listen-address=127.0.0.1
    
    # Enable addresses in range 10.0.0.1-128 to be leased out for 12 hours
    dhcp-range=10.0.0.1,10.0.0.128,12h
    
    # Assign static IPs to cluster members
    # Format = MAC:hostname:IP
    dhcp-host=ZZ:YY:XX:WW:VV:UU,rpi1,10.0.0.1
    dhcp-host=ZZ:YY:XX:WW:VV:UU,rpi2,10.0.0.2
    dhcp-host=ZZ:YY:XX:WW:VV:UU,rpi3,10.0.0.3
    dhcp-host=ZZ:YY:XX:WW:VV:UU,rpi4,10.0.0.4
    
    # Broadcast the router, DNS and netmask to this LAN
    dhcp-option=option:router,10.0.0.1
    dhcp-option=option:dns-server,10.0.0.1
    dhcp-option=option:netmask,255.255.255.0
    
    # Broadcast host-IP relations defined in /etc/hosts
    # And enable single-name domains
    # See here for more details
    expand-hosts
    domain=mydomain.net
    local=/mydomain.net/
    
    # Declare upstream DNS's; we'll just use Google's
    server=8.8.8.8
    server=8.8.4.4
    
    # Useful for debugging issues
    # Run 'journalctl -u dnsmasq' for resultant logs
    log-queries
    log-dhcp
    
    # These two are recommended default settings
    # though the exact scenarios they guard against 
    # are not entirely clear to me; see man for further details
    domain-needed
    bogus-priv

    Hopefully these comments are sufficient to convey what is going on here. Next, we make sure that the /etc/hosts file associates the primary node with its domain name, rpi1. It’s not entirely clear to me why this is needed. The block of dhcp-host definitions above does succeed in enabling dnsmasq to resolve rpi2, rpi3, and rpi4, but the line for rpi1 does not work. I assume that this is because dnsmasq does not assign rpi1’s IP address (rpi1 is configured statically below), and this type of entry only works for hosts whose IP address dnsmasq itself assigns. (Why that is the case seems odd to me.)

    # /etc/hosts
    10.0.0.1 rpi1

    Finally, we need to configure the file /etc/netplan/50-cloud-init.yaml on the primary node in order to declare this node with a static IP Address on both the wifi and ethernet networks.

    network:
        version: 2
        ethernets:
            eth0:
                dhcp4: no
                addresses: [10.0.0.1/24]
        wifis:
            wlan0:
                optional: true
                access-points:
                    "MY-WIFI-NAME":
                        password: "MY-PASSWORD"
                dhcp4: no
                addresses: [192.168.0.51/24]
                gateway4: 192.168.0.1
                nameservers:
                    addresses: [8.8.8.8,8.8.4.4]

    Once these configurations are set up and rpi1 is rebooted, you can expect to find that ifconfig will show ip addresses assigned to eth0 and wlan0 as expected, and that resolvectl dns will read something like:

    Global: 127.0.0.1
    Link 3 (wlan0): 8.8.8.8 8.8.4.4 2001:558:feed::1 2001:558:feed::2
    Link 2 (eth0): 10.0.0.1

    Setting up the Non-Primary Nodes

    Next we jump onto the rpi2 node and edit /etc/netplan/50-cloud-init.yaml to:

    network:
        version: 2
        ethernets:
            eth0:
                dhcp4: true
                optional: true
                gateway4: 10.0.0.1
                nameservers:
                    addresses: [10.0.0.1]
        wifis:
            wlan0:
                optional: true
                access-points:
                    "MY-WIFI-NAME":
                        password: "MY-PASSWORD"
                dhcp4: no
                addresses: [192.168.0.52/24]
                gateway4: 192.168.0.1
                nameservers:
                    addresses: [8.8.8.8,8.8.4.4]

    This tells netplan to set up systemd-networkd to get its IP Address from a DHCP server on the ethernet network (which will be found to be on rpi1 when the broadcast event happens), and to route traffic and submit DNS queries to 10.0.0.1.

    To reiterate, the wifi config isn’t part of the network topology; it’s optionally added because it makes life easier, when setting up the network, to be able to ssh straight into a node. In my current setup, I am assigning all the nodes static IP addresses on the wifi network of 192.168.0.51-54.

    Next, as described here, in order for our network to be able to resolve single-word domain names, we need to alter the behavior of systemd-resolved by linking these two files together:

    sudo ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf

    This causes the systemd-resolved stub resolver to dynamically determine a bunch of settings based upon what dnsmasq broadcasts on rpi1.

    After rebooting, and doing the same configuration on rpi3 and rpi4, we can run dig rpi1, dig rpi2, etc. on any of the non-primary nodes and expect to get the single-word hostnames resolved as we intend.

    If we go to rpi1 and check the IP-address leases:

    cat /var/lib/misc/dnsmasq.leases

    … then we can expect to see that dnsmasq has successfully acted as a DHCP server. You can also check that dnsmasq has been receiving DNS queries by examining the system logs: journalctl -u dnsmasq.

    Routing All Ethernet Traffic Through the Primary Node

    Finally, we want all nodes to be able to connect to the internet by routing through the primary node. This is achieved by first uncommenting the line net.ipv4.ip_forward=1 in the file /etc/sysctl.conf and then running the following commands:

    sudo iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE
    sudo iptables -A FORWARD -i wlan0 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
    sudo iptables -A FORWARD -i eth0 -o wlan0 -j ACCEPT

    These lines mean something like the following:

    1. When doing network-address translation (-t nat), and just before the packet is to go out via the wifi interface (-A POSTROUTING = “append a postrouting rule”), replace the source ip address with the ip address of this machine on the outbound network
    2. forward packets coming in from wifi out through ethernet, but only for connections that are already established or related
    3. forward packets in from ethernet to go out through wifi

    For these rules to survive across reboots you need to install:

    sudo apt install iptables-persistent

    and agree to storing the rules in /etc/iptables/rules.v4. Reboot, and you can now expect to be able to access the internet from any node, even when the wifi interface is down (sudo ifconfig wlan0 down).
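
    To double-check that a minor node really is routing through the primary node, you can look at its routing table and trace a route out:

    ip route                 # the default route should point at 10.0.0.1
    traceroute google.com    # the first hop should be 10.0.0.1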

    Summary

    So there we have it – an easily portable network. If you move location then you only need to adjust the wifi-connection details in the primary node, and the whole network will be connected to the Internet.

    In the next part, we’ll open the cluster up to the internet through our home router and discuss security and backups.

  • Raspberry Pi Cluster Part I: Goals, Hardware, Choosing OS

    A while back I built a Raspberry Pi cluster with 1 x RPi4 and 2 x RPi3b devices. This was shortly after the release of the RPi4 and, due to the many fixes that it required, I didn’t get far beyond hooking them up through a network switch.

    Now that RPi4 has had some time to mature, I decided to start again from scratch and to document my journey in some detail.

    Goals

    Being able to get computers to coordinate together over a network to achieve various tasks is a valuable skill set that has been made affordable to acquire thanks to the RPi Foundation.

    My goals are to build a cluster in order to figure out and/or practice the following technical competencies:

    • Hardware Skills: acquiring, organizing, and monitoring the cluster hardware
    • Networking Skills: setting up a network switch, DHCP server, DNS server, network-mounting drives, etc.
    • Dev Ops: installing, updating, managing the software, and backing everything up in a scalable manner
    • Web Server Skills: installing apache and/or nginx, with load balancing across the cluster nodes; also, being able to distribute python and node processes over the cluster nodes
    • Distributed-Computing Skills: e.g. being able to distribute CPU-intensive tasks across the nodes in the cluster
    • Database Skills: being able to create shards and/or replica nodes for Mysql, Postgres, and Mongo
    • Kubernetes Skills: implement a kluster across my nodes

    Those are the goals; I hope to make this a multi-part series with a lot of documentation that will help others learn from my research.

    Hardware

    RPi Devices

    This is a 4-node cluster with the following nodes:

    • 1 x RPi4b (8GB RAM)
    • 3 x RPi3b

    I’ll drop the ‘b’s from hereon.

    The RPi4 will serve as the master/entry node. If you’re building from scratch then you may well want to go with 4 x RPi4. I chose to use 3 x RPi3 because I already had three from previous projects, and I liked the thought of having less power-hungry devices running 24×7. (Since their role is entirely one of cluster pedagogy/experimentation, it doesn’t bother me that their IO speed is less than that of the RPi4. Also, the RPi4 really needs some sort of active cooling solution, while the RPi3b arguably does not, so my cluster will only have one fan running 24/7 instead of 4.)

    Equipment Organization

    I know from my previous attempt that it’s really hard keeping your hardware neat, tidy and portable. It is important to me to be able to transport the cluster with minimal disassembly, and I therefore sought to house everything on a single tray and with a single power cable to operate it. That means that my cluster’s primary connection to the Internet would be by wifi but, importantly, I’ve insisted that the nodes communicate to each other over ethernet through a switch. The network schematic therefore looks something like this:

    RPi Cluster Network Schematic

    The RPi4 will thus need to act as a router so that the other nodes can access the internet. Since each node has built-in wifi, I’m also going to establish direct links between each node and my home wifi router, but these shall only be used for initial setup and debugging purposes (if/when the network switch fails).

    To keep the RPi nodes arranged neatly, I got a cluster case for $20-$30. Unfortunately, the RPi4 has a different physical layout which spoils the symmetry of the build, but it also makes it easy to identify it. I also invested in a strip plug with USB-power connectors, so that I would only need a single plug to connect the cluster to the outside world. I was keen to power the RPi3s through the USB connectors on the strip plug in order to avoid having 5 power supplies,​*​ which gets bulky and ugly IMO.

    Finally, I had to decide what sort of storage drives I would use on my RPi3s. For the RPi4, there was no question that I would need an external SSD drive to make the most of its performance.

    BEWARE about purchasing an SSD for your RPi4! Not all drives work on the RPi4 and I lost a ton of time/money with Sabrent. This time round I went with this 1TB drive made by Netac. So far, so good. If $130 is too pricey then just get a 120/240GB version in the $20-40 range. (I only got 1TB because I have plans to use my cluster to do some serious picture-file backups and serving.)

    For the RPi3s, which I expected to use a lot less in general, there is not nearly as much to be gained from an external SSD. Also, I wanted to limit the cost of the setup as well as the number of cables floating around the cluster, so I decided to start off with SD cards for the RPi3bs, though I am wary of this decision (and deem it likely that I will regret it as soon as one of them fails). I’m using 3 x 64GB Samsung Evo Plus (U3 speed). I’ll be sure to benchmark their performance once set up.

    I also got a 2TB HDD drive to provide the RPi3s with some more durable read-write space, and on which I’ll be able to backup everything on the cluster.

    I got a simple network switch, some short micro-USB cables, and some short flexible ethernet cables. Be careful with your ethernet cables; you want them short to keep your cluster tidy, but make sure they are not too rigid as a result. In my previous attempt I got cables that were short but so rigid that they created a lot of torque between the switch and node connectors, and made the whole cluster look/feel contorted.

    I also got a high-quality power supply for my RPi4 since, being the primary node that will undergo the most work and having two external storage drives to power, it needs a reliable voltage.

    Finally, I also got a bunch of USB-A and USB-C Volt/Amp-Meters for a few bucks from China, because I like to know the state of the power going through the nodes.

    So, in total, I calculate that the equipment will have cost ~$500. It adds up, but that’s not bad for a computing cluster.

    4-Node RPi Cluster Hardware

    And, yes, I need a tray upgrade.

    Choosing an OS

    When it came to choosing an OS, the only two I considered viable candidates were Raspbian OS (64 bit beta), or Ubuntu 20.04 LTS server (64 bit).

    I went with Ubuntu in the end because my project is primarily pedagogic in nature and so, by choosing Ubuntu, I figured I’d be deepening my knowledge of a “real world” OS. I also just generally like Ubuntu, and it has long been my OS of choice on cloud servers.

    For the RPi3s, I used the Raspberry Pi Imager application to select Ubuntu Server 20.04 and burn that image onto each SD card.

    Raspberry Pi Imager

    For the RPi4, though, I wanted to boot from an external SSD drive, and this isn’t trivial yet with the official Ubuntu image. I therefore opted to use an image posted here that someone had built from the official image with a few tweaks to enable booting from an external USB device. (It required you to first update the RPi4’s EEPROM, but I had already done that. It’s easily googled.)

    Initial Setup

    Once the cluster hardware had been assembled and wired up, I powered everything on and then had to go through each fresh install of ubuntu and perform the following:

    • Login with ‘ubuntu’, ‘ubuntu’ credentials
    • Connect to wifi following this advice; note: you need to reboot after calling sudo netplan apply before it will work! (My netplan conf is included in the next part of this series.)
    • Update with sudo apt update; sudo apt upgrade -y
    • Set the timezone with sudo timedatectl set-timezone America/New_York; if you want to use a different timezone then list the ones available with timedatectl list-timezones.
    • Add a new user ‘dwd’ (sudo adduser dwd) and assign him the groups belonging to the original ubuntu user (sudo usermod -aG $(groups | sed "s/ /,/g") dwd)
    • Switch to dwd and disable ubuntu (sudo passwd -l ubuntu)
    • Install myconfig and use it to further install super versions of vim, tmux, etc. See this post for more details.
    • Install oh-my-zsh, powerlevel10k, and zsh-autosuggestions.
    • Install iTerm2 shell integration
    • Install nvm
    • Create a ~/.ssh/authorized_keys file enabling public-key ssh-ing
    • Change the value of /etc/hostname in order to call our nodes rpi1, rpi2, rpi3 and rpi4 (see the command sketch after this list)
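
    For that last item, hostnamectl will edit /etc/hostname for you. For example, on the primary node:

    sudo hostnamectl set-hostname rpi1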

    This workflow allowed me to get my four nodes into a productive state in a reasonably short amount of time. I also set up an iTerm2 profile so that my cluster nodes have a groovy Raspberry Pi background, making it quick and easy to distinguish where I am.

    RPi4 node at the ready with tmux, vim, oh-my-zsh, powerlevel10k

    Finally, we also want to allocate memory “swap space” on any device not using an SD card. (Swap space is the space you allocate on your storage disk that will get used if you use up your RAM. Most linux distros nowadays will not allocate swap space to your boot drive by default, so it has to be done manually.)

    Since only the RPi4 has an external drive, that’s all we’ll set up for now. (Later, once we have a single network-mounted HDD available to each node, we’ll allocate swap space there.) Use the following to add 8GB of swap:​†​

    sudo fallocate -l 8G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile

    Finally, add the following line to /etc/fstab to make this change permanent: /swapfile swap swap defaults 0 0
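
    You can confirm that the swap is active with:

    swapon --show
    free -h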

    Summary

    That’s it for part I. In the next part, we’re going to set up our ethernet connections between the RPi nodes using our network switch.


    1. ​*​
      4 x RPi + 1 x Network Switch
    2. ​†​
      According to lore it’s best practice to only add ~1/2 your RAM size as swap. However, I’ve never encountered issues by going up to x2.