Getting Back Into Kubernetes
Thomas Büttner
Things that used to be
I used to run my own OpenShift Kubernetes cluster that hosted most of my Homelab. I dabbled with the technology for quite some time, trying different installation methods like manual setups, kubespray, Ansible playbooks, and the Ansible-based OpenShift 3.x installer. I experimented with various Kubernetes distributions, including Vanilla K8s, OpenShift, and RKE/Rancher, on different host OSs like Rancher OS, Fedora Atomic, CoreOS, and Fedora 24ish. However, most of these options are no longer maintained.
Having a Kubernetes cluster in a Homelab is really great: being able to provision whole services with a single Helm chart is bliss. But maintaining a cluster on 10+ year old bare metal wasn't. Some of the nodes had a tendency to simply hard-lock under high I/O pressure, and I had to babysit them so often that I ended up writing an external cluster node watchdog program which monitored the nodes and reset them once they became unresponsive.
The cluster consisted of five Dell R610s with dual Intel Nehalem CPUs, 196GB of DDR3, and 6x4TB HDDs as worker/storage nodes, along with three old Intel NUCs with 16GB of RAM and i5 CPUs, which were sufficient as OpenShift control-plane nodes.
Due to the recent energy price explosion in Germany, I decided to shut down my cluster to save about 2.5–3 kW of power draw… old hardware may be cheap to acquire, but it is not very energy efficient after all.
The Titan that replaced my cluster
So, thinking that I was done with the jank, I decided to build a “small NAS”. I named it Atlas, since it would have to carry all of the things my cluster used to provide. Atlas features an Intel Skylake Xeon, 64GB of DDR4 (ECC DDR4 is still quite expensive), 8x12TB HDDs in a raidz2 ZFS pool, and would have had an AMD MI25 if I hadn’t managed to kill my second card.
Atlas runs RHEL 9, which may be overkill for just NFS shares, but I had to replace my cluster and move all of my services off into standalone containers, a part of which moved to a Hetzner VPS “temporarily” for a year… Now all my services are back running on Atlas as Podman containers, either started by systemd or as autostarting pods.
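As a rough sketch of that setup (the container name vaultwarden is just an example, and the exact units differ per service), a running container can be turned into a systemd-managed one with podman generate systemd:
# assumes an existing container named "vaultwarden"; the name is only an example
podman generate systemd --new --files --name vaultwarden
# install the generated unit for the current user and enable it at boot
mkdir -p ~/.config/systemd/user
mv container-vaultwarden.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now container-vaultwarden.service
# let user services start without an active login session
loginctl enable-linger "$USER"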
The services currently running on Atlas are:
- Airsonic
- Briefkasten
- Jellyfin
- Minio
- Nextcloud
- WikiJS
- Home Assistant
- Gitea/Forgejo
- Firefly-III
- UniFi
- Observium
- Paperless-NGX
- Prometheus/Grafana/Alertmanager
- Vaultwarden
Installing K3S
After some consideration about which Kubernetes distribution to use, I settled on K3S, mostly because (single-node) OKD is extremely resource-hungry and I would have to reinstall Atlas, which… I won’t do (my data now lives on there, and I don’t have anywhere else to store ~42TB).
The installation is probably as straightforward as one can get; my only recommendation is to download and inspect the install script instead of blindly piping something downloaded into /bin/sh with potential root privileges!
Here is the installation command I used:
curl -sfLo get-k3s.sh https://get.k3s.io
sh ./get-k3s.sh --selinux --secrets-encryption --flannel-backend=wireguard-native --cluster-cidr=172.16.0.0/16 --service-cidr=172.30.0.0/16
Some of these options are there to prevent network collisions, since my server network collides with the defaults; SELinux, because that is pretty much the point of a RHEL-like system; secrets encryption, since I used that on my OpenShift cluster; and the WireGuard backend for flannel, just in case this ever becomes more than a standalone node.
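A quick sanity check after the install, as a minimal sketch assuming the K3s defaults (bundled kubectl, kubeconfig under /etc/rancher/k3s/k3s.yaml):
# K3s ships its own kubectl and kubeconfig, so this works without extra setup
sudo k3s kubectl get nodes -o wide
# confirm that the wireguard-native flannel backend actually created a tunnel interface
sudo wg show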
After the installation, running kubectl get pods --all-namespaces shows the following:
NAMESPACE     NAME                                     READY   STATUS      RESTARTS   AGE
kube-system   local-path-provisioner-957fdf8bc-wx26s   1/1     Running     0          2d15h
kube-system   coredns-77ccd57875-9zrf9                 1/1     Running     0          2d15h
kube-system   helm-install-traefik-crd-dsx66           0/1     Completed   0          2d15h
kube-system   helm-install-traefik-rxdlx               0/1     Completed   1          2d15h
kube-system   svclb-traefik-913a5d09-7gn7m             2/2     Running     0          2d15h
kube-system   traefik-64f55bb67d-mb2v8                 1/1     Running     0          2d15h
kube-system   metrics-server-648b5df564-6jx2s          1/1     Running     0          2d15h
Wow, that is… not much… especially if you’re used to OpenShift’s bloat.
K3S uses Traefik as its Ingress controller, and the svclb- pod is its LoadBalancer implementation: it grabs a node with the requested ports free and proxies them through. It will be interesting to see how it fares against the OpenShift HAProxy router with MetalLB that I used back on my OpenShift cluster.
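To see what the bundled LoadBalancer actually exposed, plain kubectl is enough; a small sketch (ports 80/443 are the Traefik defaults K3s ships with):
# the traefik Service of type LoadBalancer should list the node IP as its external address
kubectl get svc -n kube-system traefik
# the svclb pods reveal which node binds the requested ports (80/443 here)
kubectl get pods -n kube-system -o wide | grep svclb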
Things to come
Now that I have a functioning Kubernetes cluster again, I will need to brush up my rusty Helm chart writing skills to migrate my services back into a Kubernetes environment. I’m also excited to catch up with all the recent developments in the Kubernetes ecosystem.
Stay tuned for updates as I will try to document my personal “Cloud” this time around.