Pi Cluster Hat – MMM

MMM stands for the “Missing Missing Manual”. Like all tech, it is built on the shoulders of giants, and this is no exception. This MMM fills in the gaps left in the following post by Davin L:

https://medium.com/@dhuck/the-missing-clusterhat-tutorial-45ad2241d738

This in turn is based on the following post by Garrett Mills:

https://glmdev.medium.com/building-a-raspberry-pi-cluster-784f0df9afbd

I make no bones about the fact that I built my cluster using a Mac, so some of the specifics may be Mac-only.

My Hardware

  • 4x Raspberry Pi Zero — for the compute nodes
  • 1x Raspberry Pi 4 — for the master/login node
  • 5x MicroSD Cards
  • 1x Cluster HAT
  • 1x 8-port 10/100/1000 network switch
  • Various cables to connect things up

I bought the full kit from The Pi Hut, which comes with the HAT, 4 Zeros, 4 SD cards and the USB link cable, and then used a Pi 4 I had lying around. All of the relevant assembly instructions are available from the 8086 website:

  • Assembly – Connecting the Cluster HAT (plugging/unplugging the Pi Zeroes).
  • Software – Installing the customised Raspberry Pi OS for the controller and Zeroes.
  • Control – Controlling power to the Pi Zeroes.

But there is so much that is missing, irrelevant or misleading for new starters that it was only through trial and error (a lot of error) that I finally got a cluster up and running.

Lessons Learnt

Choose the correct images. This was a complete amateur mistake. Because I was using a Raspberry Pi 4, I downloaded the 64-bit versions of all of the images. The master Pi worked fine, but I spent a day and a half reflashing the Zero images over and over again thinking I had done something wrong. Had I downloaded the 32-bit versions first time around (the Pi Zero is 32-bit only), I would have been up and running in a couple of hours. Doh!

Use the right USB cable. You may boot it all up and see the lights come on when you run ‘sudo clusterhat on’, but see nothing when using ‘ifconfig -a’. After several frustrating hours I found that this was because the USB cable I was using was power-only, not a data cable. Switching to a decent cable that you have verified can carry data solves this.

Running raspi-config after flashing the SD card. Raspberry Pi OS has changed the way SSH is enabled (hint: it is no longer enabled by default). It needs to be enabled with raspi-config. You cannot do this after your Zero has booted, because you cannot SSH to it until you have enabled SSH! The easiest method I found was to flash each Zero’s SD card, pop it in the master Pi, boot, run raspi-config to enable SSH and set the password, and then pop it into the Zero. The Zero will then boot with SSH enabled, and from then on you have full access.
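If you prefer to script it, raspi-config also has a non-interactive mode. A minimal sketch, run while the Zero’s SD card is booted in the master Pi:

    # Enable SSH (in raspi-config’s nonint mode, 0 means enable)
    sudo raspi-config nonint do_ssh 0
    # Set the password for the default pi user
    sudo passwd pi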

Copy SSH keys. Once you are able to SSH into each of your nodes, you need to copy your SSH public key to them all. This is most easily done with the following commands.
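(This assumes the default Cluster HAT hostnames p1.local to p4.local and the default pi user; substitute your own node names if they differ.)

    # Generate a key pair if you do not already have one
    ssh-keygen -t ed25519
    # Copy the public key to each node in turn
    for node in p1 p2 p3 p4; do ssh-copy-id pi@${node}.local; done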

Setting Up a Shared Drive

First plug in your USB SSD, then run ‘lsblk’ to find its device name.
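On my system it looked something like this (names and sizes will differ on yours; output trimmed):

    $ lsblk
    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sda      8:0    0  465G  0 disk
    └─sda1   8:1    0  465G  0 part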

As we can see it shows up as sda1, a single 465G partition. As I am sharing my storage through pimaster, the controller, I installed GParted via the UI and used it to format the drive to ext4: first unmount any partitions, then create a new partition table. This leaves a single bare partition, which you can then format as ext4.
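If you prefer the command line, a rough equivalent is below. This assumes the drive really is /dev/sda (double-check with ‘lsblk’ first), and it will erase everything on the drive:

    # Unmount the partition first if it is mounted
    sudo umount /dev/sda1
    # New partition table with a single partition spanning the whole drive
    sudo parted --script /dev/sda mklabel gpt mkpart primary ext4 0% 100%
    # Format the new partition as ext4
    sudo mkfs.ext4 /dev/sda1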

Next we create a mount directory:
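I used /clusterfs; any path will do, as long as you use it consistently throughout:

    sudo mkdir /clusterfs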

Then we need it to mount on boot. For this we need the UUID, which we get from the ‘blkid’ utility:
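For example (output trimmed):

    $ sudo blkid /dev/sda1
    /dev/sda1: UUID="44783497-e748-497b-8d55-9294f612124a" TYPE="ext4"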

And from that output we need this bit of data: UUID=”44783497-e748-497b-8d55-9294f612124a”

Now we edit /etc/fstab and add the following line:
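(This assumes the /clusterfs mount point from above; use the UUID that blkid reported for your drive.)

    UUID=44783497-e748-497b-8d55-9294f612124a /clusterfs ext4 defaults 0 2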

Set loose permissions:
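These are deliberately wide-open permissions so that every node and user can read and write the share; tighten them if that worries you:

    sudo chown -R nobody:nogroup /clusterfs
    sudo chmod -R 777 /clusterfs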

Install an NFS Server

For me it was already installed, but this should add it to your system:
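    sudo apt install nfs-kernel-server -y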

Now we export the NFS share. Edit /etc/exports and add the following line:
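(This assumes the default Cluster HAT subnet of 172.19.181.0/24; substitute your own network range if yours differs.)

    /clusterfs 172.19.181.0/24(rw,sync,no_root_squash,no_subtree_check)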

Reload the shares so the new export is validated and applied:
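    sudo exportfs -a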

Now that we’ve got the NFS share exported from the master node, we want to mount it on all of the other nodes so they can access it. Repeat this process on each of the other nodes. First we need to install the NFS client on each node:
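    sudo apt install nfs-common -y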

After the client is installed, we can create a mount point on each client. This should be the same directory that you mounted the USB SSD to on the master node. In my case, this is /clusterfs:
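(Same loose permissions as on the master, so any node can write to the share.)

    sudo mkdir /clusterfs
    sudo chown -R nobody:nogroup /clusterfs
    sudo chmod -R 777 /clusterfs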

We want the NFS share to mount automatically when the nodes boot. Edit /etc/fstab to accomplish this by adding the following line:
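(172.19.181.254 is the default Cluster HAT address for the controller; substitute your master node’s IP or hostname if yours differs.)

    172.19.181.254:/clusterfs /clusterfs nfs defaults 0 0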

Now mount it with ‘sudo mount -a’ and you should be able to create a file in /clusterfs and have it show up at the same path across all the nodes.
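A quick way to verify, once everything is mounted:

    touch /clusterfs/hello.txt   # create a file on one node...
    ls /clusterfs                # ...and it should appear on all the others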

The End Result

And when I had finally built mine, this is what it looks like…

Next Steps

So we have a cluster. There are a few things we can do with it for real-world HPC computing; we could:

  • Install and run Kubernetes
  • Install Open MPI for parallel computing
  • Install Slurm for compute node scheduling

Install Kubernetes