Shutting down nodes in Azure Kubernetes Service

A while back I wrote a post on Adjusting Pod Eviction Timings in Kubernetes. To test the changes made in that post I had to shut down nodes in an Azure Kubernetes Service cluster.

This can be done easily in the Azure portal: –

However, I did a presentation recently and didn’t want to keep jumping from VS Code into the portal…so I wanted to be able to shut down the nodes in code.

So here’s how to use the azure-cli to shut down a node in an Azure Kubernetes Service cluster.


DISCLAIMER – the following code should only be run against a test cluster!


Firstly, to test, let’s deploy a simple application to the cluster: –

kubectl create deployment test --image=nginx

Confirm: –

kubectl get all

To check the node that the pod is running on: –

kubectl get pods -o wide

kubectl get pods -o jsonpath="{.items[0].spec.nodeName}"

Assign the node that the pod is running on to a variable (we’ll use this in a minute): –

NODE=$(kubectl get pods -o jsonpath="{.items[0].spec.nodeName}" | sed 's/.$/\U&/')  && echo $NODE

N.B. – the sed command at the end of the statement above is to make sure that the last character of the node name is in upper case, which is needed to get the instance ID of the VM in a later statement.

OK now we can look at shutting down the node that the pod is running on.

Nodes in AKS run in a nodepool, which is a virtual machine scale set that lives in a different resource group than the Kubernetes cluster itself. The naming convention of that resource group is: –

MC_resourcegroup_clustername_location

The cluster in the examples here is called kubernetes1 in the resource group kubernetes in EASTUS.

So set the resource group name: –

RESOURCEGROUP="MC_kubernetes_kubernetes1_eastus"
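
As an aside, rather than building that name by hand, the node resource group can be pulled straight from the cluster with az aks show (a quick sketch, assuming the cluster and resource group names used in this post): –

RESOURCEGROUP=$(az aks show --resource-group kubernetes --name kubernetes1 --query nodeResourceGroup -o tsv) && echo $RESOURCEGROUP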

Now we can grab the VMSS name in two ways, firstly by running: –

VMSSNAME=$(az vmss list --resource-group $RESOURCEGROUP --query "[].name" -o tsv) && echo $VMSSNAME 

N.B. – AKS clusters can have multiple nodepools in which case the query above will return multiple values and won’t work.

Or we use the $NODE variable we set earlier and strip out the last few characters: –

VMSSNAME=${NODE:0:27} && echo $VMSSNAME 
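
Another option (just a sketch, and it assumes the standard AKS naming where the six-character instance suffix sits at the end of the node name) is to strip that suffix rather than hardcode the length: –

VMSSNAME=${NODE%??????} && echo $VMSSNAME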

Once we have the NODE name and the VMSS name, we need to get the instance ID of the VM in the scale set:-

INSTANCEID=$(az vmss list-instances --name $VMSSNAME --resource-group $RESOURCEGROUP --query "[?osProfile.computerName=='$NODE'].[instanceId]" -o tsv) && echo $INSTANCEID

And now we can shut down the node: –

az vmss deallocate --name $VMSSNAME --instance-ids $INSTANCEID --resource-group $RESOURCEGROUP

Confirm that the node is offline: –

kubectl get nodes

Great! The node is offline! Our test pod didn’t have any tolerations set so it’ll take 5 minutes for a new pod to be created on a healthy node. You can check out how to adjust this in my previous post.

Finally, to restart the node: –

az vmss start --name $VMSSNAME --instance-ids $INSTANCEID --resource-group $RESOURCEGROUP

And that’s how to shut down a node in AKS to test pod eviction!

Thanks for reading!

Space Invaders on Kubernetes

A while ago I blogged about an awesome Chaos Engineering tool built by Eugenio Marzo (t) called KubeInvaders.

Since then Eugenio has updated the repo to make it easier to deploy KubeInvaders using Helm! So here’s how to deploy KubeInvaders to Azure Kubernetes Service using Helm.

Prerequisites that need to be installed to run the code here are: –

Windows Subsystem for Linux (or a bash terminal)
Azure CLI
kubectl
Helm

First thing to do is log in with the azure cli: –

az login

Create a resource group: –

az group create --name kubeinvaders --location EASTUS

Spin up an AKS cluster: –

az aks create --resource-group kubeinvaders --name kubeinvadersclu --node-count 2

Get credentials to connect kubectl to the AKS cluster: –

az aks get-credentials --resource-group kubeinvaders --name kubeinvadersclu

Confirm connection to AKS cluster: –

kubectl get nodes

Add the helm repo for the ingress-nginx controller: –

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx

Confirm helm repositories: –

helm repo list

Install the ingress-nginx controller: –

helm install ingress-nginx ingress-nginx/ingress-nginx \
--create-namespace \
--namespace ingress-basic \
--set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz

EDIT – 2023-01 – Updated to add in the annotation

List resources in the ingress-basic namespace: –

kubectl get all -n ingress-basic

Note the external IP of the controller and set the IP address to a variable: –

IP="XX.XX.XXX.XX"
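
If you’d rather not copy the IP by hand, it can be grabbed with a jsonpath query (a sketch, assuming the controller service is named ingress-nginx-controller, which is the chart’s default for this release name): –

IP=$(kubectl get service ingress-nginx-controller -n ingress-basic -o jsonpath='{.status.loadBalancer.ingress[0].ip}') && echo $IP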

Set a DNS name for the external IP address to a variable: –

DNSNAME="SOMETHING-kubeinvaders"

Get the resource-id of the external ip: –

PUBLICIPID=$(az network public-ip list --query "[?ipAddress!=null]|[?contains(ipAddress, '$IP')].[id]" --output tsv)

Update external ip address with DNS name: –

az network public-ip update --ids $PUBLICIPID --dns-name $DNSNAME

Display the FQDN: –

az network public-ip show --ids $PUBLICIPID --query "[dnsSettings.fqdn]" --output tsv

Now we’re ready to deploy KubeInvaders!

Add the kubeinvaders helm repository: –

helm repo add kubeinvaders https://lucky-sideburn.github.io/helm-charts/

Confirm helm repositories: –

helm repo list

Create a kubeinvaders namespace: –

kubectl create namespace kubeinvaders

Deploy kubeinvaders: –

helm install kubeinvaders --set-string target_namespace="default" \
-n kubeinvaders kubeinvaders/kubeinvaders \
--set ingress.enabled=true \
--set ingress.hostName=SOMETHING-kubeinvaders.eastus.cloudapp.azure.com \
--set image.tag=v1.9

EDIT – 2023-01 – Updated to add in --set ingress.enabled=true

Now go to the FQDN set above in your browser.

If you get a 404 when going to the website it is because there’s a line missing from the annotations of the kubeinvaders ingress.

To fix this edit the ingress: –

kubectl edit ingress -n kubeinvaders

And add the following line: –

kubernetes.io/ingress.class: "nginx"

Save the updated ingress and go back to your FQDN and there is KubeInvaders!
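
As an aside, if you’d rather avoid the interactive edit, kubectl annotate should do the same job (a sketch, assuming the ingress object is also named kubeinvaders; check the name with kubectl get ingress -n kubeinvaders): –

kubectl annotate ingress kubeinvaders -n kubeinvaders kubernetes.io/ingress.class=nginx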

Thanks for reading!

A storage failover issue with SQL Server on Kubernetes

I’ve been running a proof of concept for SQL Server on Kubernetes over the last year or so (ok, probably longer than that…hey, I’m a busy guy 🙂 ) and have come across an issue that has been sort of a show stopper.


UPDATE – This issue has been resolved in Kubernetes version 1.26.
Details are on this github issue: –
https://github.com/kubernetes/kubernetes/issues/65392

And there’s more on the official Kubernetes blog (from when a feature called non-graceful node shutdown went into beta): –
https://kubernetes.io/blog/2022/12/16/kubernetes-1-26-non-graceful-node-shutdown-beta/


There are currently no HA solutions for SQL Server running on plain K8s (not discussing Azure Arc here), so my tests have been relying on the in-built HA that Kubernetes provides…but there’s a problem.

Let’s see this in action.

First, as we’re running in AKS for this demo, check the storage classes available: –

kubectl get storageclass

We’re going to be using the default storage class for this demo. Note that the VOLUMEBINDINGMODE is set to WaitForFirstConsumer.
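
If you want to confirm that rather than eyeballing the output, the binding mode can be pulled out directly (a quick check against the storage class named default): –

kubectl get storageclass default -o jsonpath='{.volumeBindingMode}'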

Now create the PVC definitions referencing the default storage class: –

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-system
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-log
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 1Gi

Save that yaml as sqlserver_pvcs.yaml and deploy: –

kubectl apply -f sqlserver_pvcs.yaml

Confirm the PVCs have been created: –

kubectl get pvc

N.B. – The PVCs are in a status of Pending as the VOLUMEBINDINGMODE of the storage class is set to WaitForFirstConsumer

Now create a sqlserver.yaml file: –

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: sqlserver
  name: sqlserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sqlserver
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: sqlserver
    spec:
      securityContext:
        fsGroup: 10001
      containers:
      - image: mcr.microsoft.com/mssql/server:2019-CU11-ubuntu-18.04
        name: sqlserver
        resources: {}
        env:
        - name: ACCEPT_EULA
          value: "Y"
        - name: MSSQL_SA_PASSWORD
          value: "Testing1122"
        volumeMounts:
        - name: system
          mountPath: /var/opt/mssql
        - name: user
          mountPath: /var/opt/sqlserver/data
        - name: log
          mountPath: /var/opt/sqlserver/log
      tolerations:
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 10
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 10
      volumes:
      - name: system
        persistentVolumeClaim:
          claimName: mssql-system
      - name: user
        persistentVolumeClaim:
          claimName: mssql-data
      - name: log
        persistentVolumeClaim:
          claimName: mssql-log
status: {}

N.B. – Note the tolerations set for this deployment; if you want to learn more you can check out my blog post here

Deploy that: –

kubectl apply -f sqlserver.yaml

And check that the deployment was successful: –

kubectl get deployments

Now the PVCs and corresponding PVs will have been created: –

kubectl get pvc
kubectl get pv

OK let’s have a look at the events of the pod: –

kubectl describe pod -l app=sqlserver

So we had a couple of errors from the scheduler initially (probably because the PVCs weren’t created in time)…but then the attachdetach-controller kicked in and attached the volumes for the pod to use.

Now that the pod is up, confirm the node that it’s running on: –

kubectl get pods -o wide

OK, shut down the node in the Azure portal to simulate a node failure: –
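
If you’d rather stay out of the portal, the az vmss deallocate approach from an earlier post should work here too (a sketch, assuming the RESOURCEGROUP, VMSSNAME and INSTANCEID variables have been set for this cluster as in that post): –

az vmss deallocate --name $VMSSNAME --instance-ids $INSTANCEID --resource-group $RESOURCEGROUP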

Wait for the node to become unavailable: –

kubectl get nodes --watch

Once the node is reported as unavailable, check the status of the pod. A new one should be spun up on a new, available node:-

kubectl get pods -o wide

The old pod is in a Terminating state, a new one has been created but is in the ContainerCreating state and there it will stay…never coming online.

We can see why if we look at the events of the new pod: –

kubectl describe pod sqlserver-59c78ddc9f-tj9qr

And here’s the issue. The attachdetach-controller cannot move the volumes for the new pod to use as they’re still attached to the old pod.

(EDIT – Technically the volumes are attached to the node but the error reports that the volumes are in use by the old pod)

This is because the node that the old pod is on is in a state of NotReady…so the cluster has no idea of the state of that pod (it’s being reported as terminating but hasn’t been removed completely).
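
One way to see this from the cluster side is to list the VolumeAttachment objects, which show the node each persistent volume is still attached to (the exact names will differ in your cluster): –

kubectl get volumeattachments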

Let’s restart the node: –

And wait for it to come online: –

kubectl get nodes --watch

Once the node is online, the old pod will be removed and the new pod will come online: –

kubectl get pods -o wide

Looking at the pod events again: –

kubectl describe pod sqlserver-59c78ddc9f-tj9qr

We can see that once the node came online the attachdetach-controller was able to attach the volumes.

This is an issue as it requires manual intervention for the new pod to come online. Someone has to either bring the node back online or remove it from the cluster completely, which is not what you want as it means extended downtime for the SQL Server running in the pod.
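
For reference, that manual intervention usually boils down to one of the following (sketches only, and again, only against a test cluster; the node name is a placeholder and the pod is whichever one is stuck Terminating): –

kubectl delete pod <old-pod-name> --grace-period=0 --force
kubectl delete node <node-name>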

So what can we do about this? Well we’ve been looking at a couple of solutions which I’ll cover in upcoming blog posts 🙂

Note, if anyone out there knows how to get around this issue (specifically altering the behaviour of the attachdetach-controller) please get in contact!

Thanks for reading!

Provisioning storage for Azure SQL Edge running on a Raspberry Pi Kubernetes cluster

In a previous post we went through how to setup a Kubernetes cluster on Raspberry Pis and then deploy Azure SQL Edge to it.

In this post I want to go through how to configure an NFS server so that we can use it to provision persistent volumes in the Kubernetes cluster.

Once again, doing this on a Raspberry Pi 4 with an external USB SSD. The kit I bought was: –

1 x Raspberry Pi 4 Model B – 2GB RAM
1 x SanDisk Ultra 16 GB microSDHC Memory Card
1 x SanDisk 128 GB Solid State Flash Drive

The initial set up steps are the same as the previous posts, but we’re going to run through them here (as I don’t just want to link back to the previous blog).

So let’s go ahead and run through setting up a Raspberry Pi NFS server and then deploying persistent volumes for Azure SQL Edge.


Flashing the OS

The first thing to do is flash the SD card using Rufus: –

Grab the Ubuntu 20.04 ARM image from the website and flash the SD card: –

Once that’s done, connect the Pi to an internet connection, plug in the USB drive, and then power the Pi on.


Setting a static IP

Once the Pi is powered on, find its IP address on the network. Nmap can be used for this: –

nmap -sP 192.168.1.0/24

Or use a Network Analyzer application on your phone (I find the output of nmap can be confusing at times).

Then we can ssh to the Pi: –

ssh pi@192.168.1.xx

And then change the password of the default ubuntu user (default password is ubuntu): –

Ok, now we can ssh back into the Pi and set a static IP address. Edit the file /etc/netplan/50-cloud-init.yaml to look something like this: –

eth0 is the network interface the Pi is on (confirm with ip a), 192.168.1.160 is the IP address I’m setting, 192.168.1.254 is the gateway on my network, and 192.168.1.5 is my DNS server (my Pi-hole).
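
The screenshot of the file isn’t reproduced here, but a netplan config matching those values would look something like this (a sketch, assuming a /24 subnet): –

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses: [192.168.1.160/24]
      gateway4: 192.168.1.254
      nameservers:
        addresses: [192.168.1.5]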

There is a warning there about changes not persisting, but they do 🙂

Now that the file is configured, we need to run: –

sudo netplan apply

Once this is executed it will break the current shell. Wait for the Pi to come back on the network on the new IP address and ssh back into it.


Creating a custom user

Let’s now create a custom user with sudo access and disable the default ubuntu user.

To create a new user: –

sudo adduser dbafromthecold

Add to the sudo group: –

sudo usermod -aG sudo dbafromthecold

Then log out of the Pi and log back in with the new user. Once in, disable the default ubuntu user: –

sudo usermod --expiredate 1 ubuntu

Cool! So we’re good to go to set up key based authentication into the Pi.


Setting up key based authentication

In the post about creating the cluster we already created an ssh key pair to use to log into the Pi but if we needed to create a new key we could just run: –

ssh-keygen

And follow the prompts to create a new key pair.

Now we can copy the public key to the Pi. Log out of the Pi and navigate to the location of the public key: –

ssh-copy-id -i ./raspberrypi_k8s.pub dbafromthecold@192.168.1.160

Once the key has been copied to the Pi, add an entry for the Pi into the ssh config file: –

Host pi-nfs-server
    HostName 192.168.1.160
    User dbafromthecold
    IdentityFile ~/raspberrypi_k8s

To make sure that’s all working, try logging into the Pi with: –

ssh dbafromthecold@pi-nfs-server

Installing and configuring the NFS server

Great! Ok, now we can configure the Pi. First thing, let’s rename it to pi-nfs-server and bounce: –

sudo hostnamectl set-hostname pi-nfs-server
sudo reboot

Once the Pi comes back up, log back in and install the nfs server itself: –

sudo apt-get install -y nfs-kernel-server

Now we need to find the USB drive on the Pi so that we can mount it: –

lsblk

And here you can see the USB drive as sda: –

Another way to find the disk is to run: –

sudo lshw -class disk

So we need to get some more information about /dev/sda in order to mount it: –

sudo blkid /dev/sda

Here you can see the UUID of the drive and that it’s got a type of NTFS.

Now we’re going to create a folder to mount the drive (/mnt/sqledge): –

sudo mkdir /mnt/sqledge/

And then add a record for the mount into /etc/fstab using the UUID we got earlier for the drive: –

sudo vim /etc/fstab

And add (changing the UUID to the value retrieved earlier): –

UUID=242EC6792EC64390 /mnt/sqledge ntfs defaults 0 0

Then mount the drive to /mnt/sqledge: –

sudo mount -a

To confirm the disk is mounted: –

df -h

Great! We have our disk mounted. Now let’s create some subfolders for the SQL system, data, and log files: –

sudo mkdir /mnt/sqledge/{sqlsystem,sqldata,sqllog}

Ok, now we need to modify the exports file so that the server knows which directories to share. Get your user and group ID using the id command: –
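
The output isn’t shown here, but it’s just a case of running the command and noting the uid and gid values (for the first additional user on Ubuntu these will typically both be 1001, which is what the exports entry below assumes): –

id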

Then edit the /etc/exports file: –

sudo vim /etc/exports

Add the following to the file: –

/mnt/sqledge *(rw,all_squash,insecure,async,no_subtree_check,anonuid=1001,anongid=1001)

N.B. – Update the final two numbers with the values from the id command. A full breakdown of what’s happening in this file is detailed here.

And then update: –

sudo exportfs -ra

Configuring the Kubernetes Nodes

Each node in the cluster needs to have the nfs tools installed: –

sudo apt-get install nfs-common

And each one will need a reference to the NFS server in its /etc/hosts file. Here’s what the hosts file on k8s-node-1 now looks like: –
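
The screenshot isn’t reproduced here, but the entry itself is just the NFS server’s static IP and hostname set earlier, for example: –

192.168.1.160 pi-nfs-server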


Creating a persistent volume

Excellent stuff! Now we’re good to go to create three persistent volumes for our Azure SQL Edge pod: –

apiVersion: v1
kind: PersistentVolume
metadata:
  name: sqlsystem-pv
spec:
  capacity:
    storage: 1024Mi
  accessModes:
    - ReadWriteOnce
  nfs:
    server: pi-nfs-server
    path: "/mnt/sqledge/sqlsystem"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sqldata-pv
spec:
  capacity:
    storage: 1024Mi
  accessModes:
    - ReadWriteOnce
  nfs:
    server: pi-nfs-server
    path: "/mnt/sqledge/sqldata"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sqllog-pv
spec:
  capacity:
    storage: 1024Mi
  accessModes:
    - ReadWriteOnce
  nfs:
    server: pi-nfs-server
    path: "/mnt/sqledge/sqllog"

What this file will do is create three persistent volumes, 1GB in size (although that will kinda be ignored as we’re using NFS shares), in the ReadWriteOnce access mode, pointing at each of the folders we’ve created on the NFS server.

We can either create the file and deploy or run (do this locally with kubectl pointed at the Pi K8s cluster): –

kubectl apply -f https://gist.githubusercontent.com/dbafromthecold/da751e8c93a401524e4e59266812dc63/raw/d97c0a78887b6fcc41d0e48c46f05fe48981c530/azure-sql-edge-pv.yaml

To confirm: –

kubectl get pv

Now we can create three persistent volume claims for the persistent volumes: –

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqlsystem-pvc
spec:
  volumeName: sqlsystem-pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1024Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqldata-pvc
spec:
  volumeName: sqldata-pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1024Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqllog-pvc
spec:
  volumeName: sqllog-pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1024Mi

Each one with the same AccessMode and size as the corresponding persistent volume.

Again, we can create the file and deploy or just run: –

kubectl apply -f https://gist.githubusercontent.com/dbafromthecold/0c8fcd74480bba8455672bb5f66a9d3c/raw/f3fdb63bdd039739ef7d7b6ab71196803bdfebb2/azure-sql-edge-pvc.yaml

And confirm with: –

kubectl get pvc

The PVCs should all have a status of Bound, meaning that they’ve found their corresponding PVs. We can confirm this with: –

kubectl get pv


Deploying Azure SQL Edge with persistent storage

Awesome stuff! Now we are good to go and deploy Azure SQL Edge to our Pi K8s cluster with persistent storage! Here’s the yaml file for Azure SQL Edge: –

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sqledge-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sqledge
  template:
    metadata:
      labels:
        app: sqledge
    spec:
      volumes:
        - name: sqlsystem
          persistentVolumeClaim:
            claimName: sqlsystem-pvc
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata-pvc
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog-pvc
      containers:
        - name: azuresqledge
          image: mcr.microsoft.com/azure-sql-edge:latest
          ports:
            - containerPort: 1433
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql
            - name: sqldata
              mountPath: /var/opt/sqlserver/data
            - name: sqllog
              mountPath: /var/opt/sqlserver/log
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: SA_PASSWORD
              value: "Testing1122"
            - name: MSSQL_AGENT_ENABLED
              value: "TRUE"
            - name: MSSQL_COLLATION
              value: "SQL_Latin1_General_CP1_CI_AS"
            - name: MSSQL_LCID
              value: "1033"
            - name: MSSQL_DATA_DIR
              value: "/var/opt/sqlserver/data"
            - name: MSSQL_LOG_DIR
              value: "/var/opt/sqlserver/log"
      terminationGracePeriodSeconds: 30
      securityContext:
        fsGroup: 10001

So we’re referencing our three persistent volume claims and mounting them as: –

  • sqlsystem-pvc – /var/opt/mssql
  • sqldata-pvc – /var/opt/sqlserver/data
  • sqllog-pvc – /var/opt/sqlserver/log

We’re also setting environment variables to set the default data and log paths to the paths mounted by persistent volume claims.

To deploy: –

kubectl apply -f https://gist.githubusercontent.com/dbafromthecold/92ddea343d525f6c680d9e3fff4906c9/raw/4d1c071e9c515266662361e7c01a27cc162d08b1/azure-sql-edge-persistent.yaml

To confirm: –

kubectl get all

All looks good! To dig in a little deeper: –

kubectl describe pods -l app=sqledge


Testing the persistent volumes

But let’s not take Kubernetes’ word for it! Let’s create a database and see it persist across pods.

So expose the deployment: –

kubectl expose deployment sqledge-deployment --type=LoadBalancer --port=1433 --target-port=1433

Get the external IP of the service created (provided by MetalLB, configured in the previous post): –

kubectl get services
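
To save copying the IP by hand, it can also be pulled into a variable with a jsonpath query and used in place of the literal address below (a sketch, assuming the service created by the expose command is named sqledge-deployment): –

SQLIP=$(kubectl get service sqledge-deployment -o jsonpath='{.status.loadBalancer.ingress[0].ip}') && echo $SQLIP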

And now create a database with the mssql-cli: –

mssql-cli -S 192.168.1.101 -U sa -P Testing1122 -Q "CREATE DATABASE [testdatabase];"

Confirm the database is there: –

mssql-cli -S 192.168.1.101 -U sa -P Testing1122 -Q "SELECT [name] FROM sys.databases;"

Confirm the database files: –

mssql-cli -S 192.168.1.101 -U sa -P Testing1122 -Q "USE [testdatabase]; EXEC sp_helpfile;"

We can even check on the NFS server itself: –

ls -al /mnt/sqledge/sqldata
ls -al /mnt/sqledge/sqllog

Ok, so the “real” test. Let’s delete the existing pod in the deployment and see if the new pod has the database: –

kubectl delete pod -l app=sqledge

Wait for the new pod to come up: –

kubectl get pods -o wide

And then see if our database is in the new pod: –

mssql-cli -S 192.168.1.101 -U sa -P Testing1122 -Q "SELECT [name] FROM sys.databases;"

And that’s it! We’ve successfully built a Pi NFS server to deploy persistent volumes to our Raspberry Pi Kubernetes cluster so that we can persist databases from one pod to another! Phew!

Thanks for reading!

Updating my Kubernetes Raspberry Pi Cluster to containerd


EDIT – December 2021 – Unfortunately the steps documented here no longer work for the newer versions of containerd and runc. I will update once I get the steps working again.


There have been a lot of conversations happening on Twitter over the last couple of days due to the fact that Docker is deprecated in Kubernetes v1.20.

If you want to know more about the reasons why, I highly recommend checking out this Twitter thread.

I’ve recently built a Raspberry Pi Kubernetes cluster so I thought I’d run through updating them in-place to use containerd as the container runtime instead of Docker.


DISCLAIMER – You’d never do this for a production cluster. For those clusters, you’d simply get rid of the existing nodes and bring new ones in on a rolling basis. This blog is just me mucking about with my Raspberry Pi cluster to see if the update can be done in-place without having to rebuild the nodes (as I really didn’t want to have to do that).


So the first thing to do is drain and cordon the node that is to be updated (my node is called k8s-node-1): –

kubectl drain k8s-node-1 --ignore-daemonsets

Then ssh onto the node and stop the kubelet: –

systemctl stop kubelet

Then remove Docker: –

apt-get remove docker.io

Remove old dependencies: –

apt-get autoremove

Now unmask the existing containerd service (containerd is used by Docker so that’s why it’s already there): –

systemctl unmask containerd

Install the dependencies required:-

apt-get install unzip make golang-go libseccomp2 libseccomp-dev btrfs-progs libbtrfs-dev

OK, now we’re following the instructions to install containerd from source detailed here.

I installed from source as I tried to use apt-get to install (as detailed here on the Kubernetes docs) but it wouldn’t work for me. No idea why, I didn’t spend too much time looking and tbh, I hadn’t installed anything from source before so this was kinda fun (once it worked).

Anyway, doing everything as root, grab the containerd source: –

go get -d github.com/containerd/containerd

Now grab protoc and install: –

wget -c https://github.com/google/protobuf/releases/download/v3.11.4/protoc-3.11.4-linux-x86_64.zip
sudo unzip protoc-3.11.4-linux-x86_64.zip -d /usr/local

Get the runc code: –

go get -d github.com/opencontainers/runc

Navigate to the downloaded package (check your $GOPATH variable; mine was set to ~/go), cd into it, and use make to build and install: –

cd ~/go/src/github.com/opencontainers/runc
make
make install

Now we’re going to do the same thing with containerd itself: –

cd ~/go/src/github.com/containerd/containerd
make
make install

Cool. Now copy the containerd.service file to systemd to create the containerd service: –

cp containerd.service /etc/systemd/system/
chmod 644 /etc/systemd/system/containerd.service

And start containerd: –

systemctl daemon-reload
systemctl start containerd
systemctl enable containerd

Let’s confirm containerd is up and running: –

systemctl status containerd

Awesome! Nearly done. Now we need to update the kubelet to use containerd, as it defaults to Docker. We can do this by running: –

sed -i 's/3.2/3.2 --container-runtime=remote --container-runtime-endpoint=unix:\/\/\/run\/containerd\/containerd.sock/g' /var/lib/kubelet/kubeadm-flags.env

The flags for the kubelet are detailed here

I’m using sed to append the flags to the kubeadm-flags.env file, but if that doesn’t work, edit it manually with vim: –

vim /var/lib/kubelet/kubeadm-flags.env

And the following flags need to be added: –

--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock
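
Before restarting the kubelet it’s worth eyeballing the file to make sure the flags landed correctly (just a sanity check): –

cat /var/lib/kubelet/kubeadm-flags.env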

OK, now that’s done we can start the kubelet: –

systemctl start kubelet

And confirm that it’s working:-

systemctl status kubelet


N.B. – Scroll to the right and we can see the new flags

Finally, uncordon the node. So back on the local machine:-

kubectl uncordon k8s-node-1

Run through that for all the worker nodes in the cluster. I did the control node as well following these instructions (didn’t drain/cordon it) and it worked a charm!

kubectl get nodes -o wide

Thanks for reading!