Getting Back Into Kubernetes
Thomas Büttner
Things that used to be
I used to run my own OpenShift Kubernetes cluster that hosted most of my Homelab. I dabbled with the technology for quite some time, trying different installation methods like manual setups, kubespray, Ansible playbooks, and the Ansible-based OpenShift 3.x installer. I experimented with various Kubernetes distributions, including Vanilla K8s, OpenShift, and RKE/Rancher, on different host OSs like Rancher OS, Fedora Atomic, CoreOS, and Fedora 24ish. However, most of these options are no longer maintained.
Having a Kubernetes cluster in a Homelab is really great: being able to provision whole services with a single Helm chart is bliss. But maintaining a cluster on 10+ year old bare metal wasn't. Some of the nodes had a tendency to simply hard-lock under high I/O pressure, and I had to babysit them so often that I ended up writing an external cluster node watchdog program which monitored the nodes and reset them once they became unresponsive.
The cluster consisted of five Dell R610s with dual Intel Nehalem CPUs, 196GB of DDR3, and 6x4TB HDDs as worker/storage nodes, along with three old Intel NUCs with 16GB of RAM and i5 CPUs, which were sufficient as OpenShift control-plane nodes.
Due to the recent energy price explosion in Germany, I decided to shut down my cluster to save about 2.5–3 kW of power draw… old hardware may be cheap to acquire, but it is not very energy efficient after all.
The Titan that replaced my cluster
So, thinking that I was done with the jank, I decided to build a “small NAS”. I named it Atlas, since it would have to carry all of the things my cluster used to provide. Atlas features an Intel Skylake Xeon, 64GB of DDR4 (ECC DDR4 is still quite expensive), 8x12TB HDDs in a raidz2 ZFS pool, and would have had an AMD MI25 if I hadn’t managed to kill my second card.
Atlas runs RHEL 9, which may be overkill for just NFS shares, but I had to replace my cluster and move all of my services off into standalone containers, a part of which moved to a Hetzner VPS “temporarily” for a year… Now all my services are back running on Atlas as Podman containers, either started by systemd or as autostarting pods.
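As a rough sketch of that setup (the container name vaultwarden is just an example, and the exact units differ per service), a running container can be turned into a systemd-managed one with podman generate systemd:
# assumes an existing container named "vaultwarden"; the name is only an example
podman generate systemd --new --files --name vaultwarden
# install the generated unit for the current user and enable it at boot
mkdir -p ~/.config/systemd/user
mv container-vaultwarden.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now container-vaultwarden.service
# let user services start without an active login session
loginctl enable-linger "$USER"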
The services currently running on Atlas are:
- Airsonic
- Briefkasten
- Jellyfin
- Minio
- Nextcloud
- WikiJS
- Home Assistant
- Gitea/Forgejo
- Firefly-III
- UniFi
- Observium
- Paperless-NGX
- Prometheus/Grafana/Alertmanager
- Vaultwarden
Installing K3S
After some consideration about which Kubernetes distribution to use, I settled on K3S, mostly because (single-node) OKD is extremely resource-hungry and I would have to reinstall Atlas, which… I won’t do (my data now lives on there, and I don’t have anywhere else to store ~42TB).
The installation is probably as straightforward as one can get; my only recommendation is to download and inspect the install script instead of blindly piping something downloaded into /bin/sh with potential root privileges!
Here is the installation command I used:
curl -sfLo get-k3s.sh https://get.k3s.io
sh ./get-k3s.sh --selinux --secrets-encryption --flannel-backend=wireguard-native --cluster-cidr=172.16.0.0/16 --service-cidr=172.30.0.0/16
Some of these options are there to prevent network collisions, since my server network collides with the defaults; SELinux, because that is pretty much the point of a RHEL-like system; secrets encryption, since I used that on my OpenShift cluster; and the WireGuard backend for flannel, just in case this ever becomes more than a standalone node.
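A quick sanity check after the install, as a minimal sketch assuming the K3s defaults (bundled kubectl, kubeconfig under /etc/rancher/k3s/k3s.yaml):
# K3s ships its own kubectl and kubeconfig, so this works without extra setup
sudo k3s kubectl get nodes -o wide
# confirm that the wireguard-native flannel backend actually created a tunnel interface
sudo wg show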
After the installation, running kubectl get pods --all-namespaces shows the following:
NAMESPACE     NAME                                     READY   STATUS      RESTARTS   AGE
kube-system   local-path-provisioner-957fdf8bc-wx26s   1/1     Running     0          2d15h
kube-system   coredns-77ccd57875-9zrf9                 1/1     Running     0          2d15h
kube-system   helm-install-traefik-crd-dsx66           0/1     Completed   0          2d15h
kube-system   helm-install-traefik-rxdlx               0/1     Completed   1          2d15h
kube-system   svclb-traefik-913a5d09-7gn7m             2/2     Running     0          2d15h
kube-system   traefik-64f55bb67d-mb2v8                 1/1     Running     0          2d15h
kube-system   metrics-server-648b5df564-6jx2s          1/1     Running     0          2d15h
Wow, that is… not much… especially if you’re used to OpenShift’s bloat.
K3S uses Traefik as its Ingress controller, and the svclb- pod is its LoadBalancer implementation: it grabs a node with the requested ports free and proxies them through. It will be interesting to see how it fares against the OpenShift HAProxy router with MetalLB that I used back on my OpenShift cluster.
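To see what the bundled LoadBalancer actually exposed, plain kubectl is enough; a small sketch (ports 80/443 are the Traefik defaults K3s ships with):
# the traefik Service of type LoadBalancer should list the node IP as its external address
kubectl get svc -n kube-system traefik
# the svclb pods reveal which node binds the requested ports (80/443 here)
kubectl get pods -n kube-system -o wide | grep svclb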
Things to come
Now that I have a functioning Kubernetes cluster again, I will need to brush up my rusty Helm chart writing skills to migrate my services back into a Kubernetes environment. I’m also excited to catch up with all the recent developments in the Kubernetes ecosystem.
Stay tuned for updates as I will try to document my personal “Cloud” this time around.