Running a personal Kubernetes cluster with Calico-connected services on bare-metal

Out of boredom, I decided to undertake an infrastructural experiment: setting up a personal Kubernetes cluster and moving as many of my personal project workloads into it as possible. The purpose of this exercise was not to improve the resiliency of these fairly inconsequential workloads, but rather to see how far I could stretch this setup to fit the low-cost servers I acquired from some infamous European providers.

Architecture of personal Kubernetes cluster

It took some trial and error over a couple of weeks, but eventually I arrived at a setup that is functional and reasonably painless to maintain.

Two servers are used in the setup:

  • A bare-metal Debian host server running QEMU-KVM (libvirt), which in turn runs a number of Ubuntu guest VMs, each running a Kubernetes master or worker node, or a GlusterFS replicated storage node.

    • The host server runs on former VPS host-grade hardware, and was therefore fairly inexpensive to lease from the right provider, yet still powerful enough to run my cluster.
    • The Kubernetes node network (10.100.0.0/25) is segregated from the public internet.
    • Two IP addresses are used: one exclusively for ingress to web services running in Kubernetes (10.100.0.128/25), and another for host maintenance and protected kubectl access.
    • Ubuntu guest images were built with Cloud-Init and run in DHCP mode.
  • An auxiliary server, a low-cost yet fairly powerful virtual machine hosted with a different provider.

    • It was originally intended to be set up as an off-site Kubernetes worker node connected to the main cluster via WireGuard. While I managed to get kubelet to join the master node and its Calico node to reach the main cluster network, I ran into some weird issues with send/receive offloading that caused longer-than-MTU pod traffic packets to be dropped on Calico over WireGuard, and had to abandon this idea.
    • If you know why this happens, and how to fix it, please do get in touch -- I'm intrigued.
    • The auxiliary server runs workloads which are tricky to containerise, including my private Docker build environment and container repository (major iptables screw-up) and MySQL for backing some legacy projects.
  • WireGuard runs as a virtualised bridge between the host server and the auxiliary server hosted elsewhere.
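
The two private ranges mentioned in the list above read like the two halves of a single /24. A minimal sketch of that layout using Python's standard `ipaddress` module follows; it assumes the node network is 10.100.0.0/25 and that both ranges are carved out of 10.100.0.0/24, which the description suggests but does not state outright.

```python
import ipaddress

# Assumption: both /25 ranges are halves of one private /24.
parent = ipaddress.ip_network("10.100.0.0/24")
node_net, ingress_net = parent.subnets(new_prefix=25)

def describe(name, net):
    """Summarise a subnet: first and last usable host, and host count."""
    hosts = net.num_addresses - 2  # exclude network and broadcast addresses
    return f"{name}: {net} ({net[1]} to {net[-2]}, {hosts} usable hosts)"

print(describe("node network", node_net))
print(describe("ingress range", ingress_net))
print("overlap:", node_net.overlaps(ingress_net))
```

Keeping the two ranges disjoint means addresses handed out for ingress can never collide with addresses leased to the guest VMs.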

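For reference, the MTU budget of pod traffic carried over stacked tunnels can be worked out with simple arithmetic. The sketch below is not a diagnosis of the offloading problem described above: it assumes a 1500-byte underlay MTU and a separately managed WireGuard tunnel. Which Calico encapsulation this cluster used is not stated, so all three common cases are shown. The function name `pod_mtu` is hypothetical.

```python
# Per-packet encapsulation overheads in bytes.
# WireGuard: outer IP header (20 for IPv4, 40 for IPv6) + 8 UDP + 32 WireGuard.
WIREGUARD_OVERHEAD = {"ipv4": 60, "ipv6": 80}
# Calico: no encapsulation, IP-in-IP, or VXLAN over IPv4.
CALICO_OVERHEAD = {"none": 0, "ipip": 20, "vxlan": 50}

def pod_mtu(underlay_mtu=1500, outer="ipv4", calico_encap="ipip"):
    """Largest pod packet that fits without fragmentation (hypothetical helper)."""
    tunnel_mtu = underlay_mtu - WIREGUARD_OVERHEAD[outer]
    return tunnel_mtu - CALICO_OVERHEAD[calico_encap]

for encap in CALICO_OVERHEAD:
    print(f"{encap}: pod MTU {pod_mtu(calico_encap=encap)}")
```

If the pod interfaces advertise a larger MTU than this budget allows, oversized packets have to be fragmented or dropped somewhere along the path, so the configured MTUs are worth comparing against these figures.
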
Running Kubernetes on self-managed virtualisation, which in turn runs on bare metal, is fairly unorthodox these days, given the availability of managed Kubernetes offerings such as those from Google and DigitalOcean. A wise production setup would at least avoid maintaining one's own hypervisor, which is exactly what I do here. This setup is therefore by no means commercially sensible for most use cases; it is a personal hobby.

Some infrastructural notes for the setup:

  • The process of setting up libvirt to run KVM in a segregated private subnet, and managing it with virsh, was fairly well documented.

  • Setting up Kubernetes master and worker nodes with Docker as the container runtime, joining them together, and wiring their pods together with Calico was surprisingly easy, as the bare-metal setup process is very mature.

  • The biggest pain point of running a bare-metal setup of Kubernetes is the lack of a ready-made load-balancer and ingress solution, such as ELB/NLB available when your cluster runs on AWS EC2.

    • Instead, I used MetalLB in Layer 2 mode to front the Cluster IP of an internal NGINX ingress service, with traffic to the external ingress IP forwarded via NAT into MetalLB's own ingress subnet.
    • The BGP mode of MetalLB would be really nice to have, but it is unfortunately not compatible with Calico's BGP setup.
  • I use GlusterFS as a replicated storage backend, which in this setup is not really redundant, since all the storage nodes run on the same physical hard drive of the host server. In a setup with a more accommodating budget, they could easily be distributed across hosts. GlusterFS is wired into Kubernetes as an endpoint for persistent volumes.
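
As a rough illustration of the MetalLB Layer 2 arrangement described above, the sketch below generates an address pool and a Layer 2 advertisement as JSON that `kubectl apply -f` can consume. It uses the `metallb.io/v1beta1` custom resources found in recent MetalLB releases; older releases were configured through a ConfigMap instead, and the text does not say which was used here. The pool name `ingress-pool` and the helper name are hypothetical, and the address range is the ingress subnet quoted earlier.

```python
import json

def metallb_l2_manifests(pool_name, cidr, namespace="metallb-system"):
    """Build an IPAddressPool plus an L2Advertisement that announces it."""
    pool = {
        "apiVersion": "metallb.io/v1beta1",
        "kind": "IPAddressPool",
        "metadata": {"name": pool_name, "namespace": namespace},
        "spec": {"addresses": [cidr]},
    }
    advertisement = {
        "apiVersion": "metallb.io/v1beta1",
        "kind": "L2Advertisement",
        "metadata": {"name": f"{pool_name}-l2", "namespace": namespace},
        "spec": {"ipAddressPools": [pool_name]},
    }
    # A v1 List lets kubectl apply both objects from a single JSON document.
    return {"apiVersion": "v1", "kind": "List", "items": [pool, advertisement]}

# Hypothetical pool name; the CIDR is the ingress subnet from the text.
manifests = metallb_l2_manifests("ingress-pool", "10.100.0.128/25")
print(json.dumps(manifests, indent=2))
```

A Service of type LoadBalancer in front of the ingress controller would then receive an address from this pool, and the host's NAT rules decide which of those addresses is reachable from outside.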

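Wiring GlusterFS into Kubernetes as an endpoint for persistent volumes could look roughly like the sketch below, which builds an Endpoints object listing the storage nodes and a PersistentVolume that refers to it. It relies on the in-tree `glusterfs` volume type, which newer Kubernetes releases have removed, so it only applies to clusters of that era. The node IPs, object names and Gluster volume name are all hypothetical placeholders.

```python
import json

def gluster_endpoints(name, node_ips):
    """Endpoints object listing the GlusterFS storage nodes."""
    return {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {"name": name},
        # A port is required by the schema; its value is not used by the plugin.
        "subsets": [{"addresses": [{"ip": ip} for ip in node_ips],
                     "ports": [{"port": 1}]}],
    }

def gluster_volume(pv_name, endpoints_name, gluster_volume_name, size="5Gi"):
    """PersistentVolume backed by a Gluster volume via the Endpoints above."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": pv_name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "glusterfs": {"endpoints": endpoints_name,
                          "path": gluster_volume_name,
                          "readOnly": False},
        },
    }

# Hypothetical storage node addresses inside the node network.
endpoints = gluster_endpoints("glusterfs-cluster", ["10.100.0.21", "10.100.0.22"])
volume = gluster_volume("example-pv", "glusterfs-cluster", "example_vol")
print(json.dumps({"apiVersion": "v1", "kind": "List",
                  "items": [endpoints, volume]}, indent=2))
```
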
For each of my existing personal projects, I wrote Dockerfiles and supporting Makefiles to enable them to be containerised. These mostly run with three replicas for load-balancing. These projects include:

Due to the non-distributed nature of most of these setups, these projects don't really benefit from the additional redundancy and resiliency that Kubernetes is supposed to provide, but this hopefully serves as a good technical demonstrator for the feasibility of managing small-scale projects in Kubernetes.
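
To make the three-replica pattern mentioned above concrete, here is a minimal sketch of a Deployment manifest generated as JSON. The project name, image reference and container port are hypothetical placeholders, not the actual projects or registry.

```python
import json

def deployment(name, image, replicas=3, port=8080):
    """Build an apps/v1 Deployment running `replicas` copies of one container."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                    }]
                },
            },
        },
    }

# Hypothetical project name and image reference.
manifest = deployment("example-project",
                      "registry.example.com/example-project:latest")
print(json.dumps(manifest, indent=2))
```

The output can be piped to `kubectl apply -f -`; the selector and the pod template labels must match, which is why both are built from the same `labels` dictionary.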
