      Thibault Martin: Kubernetes is not just for Black Friday

      news.movim.eu / PlanetGnome • 9 July

    I self-host services mostly for myself. My threat model is particular: the highest threats I face are my own incompetence and hardware failures. To mitigate the weight of my incompetence, I relied on podman containers to minimize the number of things I could misconfigure. I also wrote ansible playbooks to deploy the containers on my VPS, making it easy to redeploy them elsewhere if my VPS failed.

    I had always ruled out Kubernetes as overly complex machinery designed for large organizations that face significant surges in traffic during specific events like Black Friday sales. I thought Kubernetes had too many moving parts and would work against my objectives.

    I was wrong. Kubernetes is not just for large organizations with scalability needs I will never have. Kubernetes makes perfect sense for a homelabber who cares about having a simple, sturdy setup. It has fewer moving parts than my podman and ansible setup, follows more standard development and deployment practices, and lets me rely on the cumulative expertise of thousands of experts.

    I don't want to do things manually or alone

    Self-hosting services is much more difficult than just putting them online. This is a hobby for me, something I do in my free time, so I need to spend as little time on maintenance as possible. I also know I don't have peers to review my deployments. If I have to choose between using standardized methods that have been reviewed by others and doing things my own way, I will use the standardized method.

    My main threats are:

    I can and will make mistakes. I am an engineer, but my current job is not to maintain services online. In my homelab, I am also a team of one. This means I don't have colleagues to spot the mistakes I make.

    [!info] This means I need to use battle-tested and standardized software and deployment methods.

    I have limited time for it. I am not on call 24/7. I want to enjoy time off with my family. I have work to do. I can't afford to spend my life in front of a computer to figure out what's wrong.

    [!info] This means I need to have a reliable deployment. I need to be notified when something is heading in the wrong direction, and when something has gone completely wrong.

    My hardware can fail, or be stolen. Having working backups is critical. But if my hardware failed, I would still need to restore backups somewhere.

    [!info] This means I need to be able to rebuild my infrastructure quickly and reliably, and restore backups on it.

    I was doing things too manually and alone

    Since I wanted standardized software, containers seemed like a good idea. podman was particularly interesting to me because it can generate systemd services that keep the containers up and running across restarts.

    I could have deployed the containers manually on my VPS and generated the systemd services by invoking the CLI. But I would then risk making small tweaks on the spot, resulting in a deployment that is difficult to replicate elsewhere.
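
    For reference, that manual flow would have looked roughly like this (a sketch, assuming the pod-synapse pod from the role shown below already exists):

    $ # Generate unit files for the pod and its containers, then enable them
    $ podman generate systemd --new --files --name pod-synapse
    $ mv pod-pod-synapse.service container-*.service /etc/systemd/system/
    $ systemctl enable --now pod-pod-synapse.service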

    Instead, I wrote an ansible playbook based on the containers.podman collection and other ansible modules. This way, ansible deploys the right containers on my VPS, copies or updates the config files for my services, and I can easily replicate this elsewhere.

    It has served me well and worked decently for years now, but I'm starting to see the limits of this approach. Indeed, in their introduction, the ansible maintainers state:

    Ansible uses simple, human-readable scripts called playbooks to automate your tasks. You declare the desired state of a local or remote system in your playbook. Ansible ensures that the system remains in that state.

    This is mostly true of ansible itself, but not really the case for the podman collection. In practice I still have to perform manual steps in a specific order, like creating a pod first, then adding containers to the pod, then generating a systemd service for the pod, etc.

    To give you a very concrete example, this is what the tasks/main.yaml of my Synapse (Matrix) server deployment role looks like.

    - name: Create synapse pod
      containers.podman.podman_pod:
        name: pod-synapse
        publish:
          - "10.8.0.2:9000:9000"
        state: created
    
    - name: Stop synapse pod
      containers.podman.podman_pod:
        name: pod-synapse
        publish:
          - "10.8.0.2:9000:9000"
        state: stopped
    
    - name: Create synapse's postgresql
      containers.podman.podman_container:
        name: synapse-postgres
        image: docker.io/library/postgres:{{ synapse_container_pg_tag }}
        pod: pod-synapse
        volume:
          - synapse_pg_pdata:/var/lib/postgresql/data
          - synapse_backup:/tmp/backup
        env:
          {
            "POSTGRES_USER": "{{ synapse_pg_username }}",
            "POSTGRES_PASSWORD": "{{ synapse_pg_password }}",
            "POSTGRES_INITDB_ARGS": "--encoding=UTF-8 --lc-collate=C --lc-ctype=C",
          }
    
    - name: Copy Postgres config
      ansible.builtin.copy:
        src: postgresql.conf
        dest: /var/lib/containers/storage/volumes/synapse_pg_pdata/_data/postgresql.conf
        mode: "600"
    
    - name: Create synapse container and service
      containers.podman.podman_container:
        name: synapse
        image: docker.io/matrixdotorg/synapse:{{ synapse_container_tag }}
        pod: pod-synapse
        volume:
          - synapse_data:/data
          - synapse_backup:/tmp/backup
        labels:
          {
            "traefik.enable": "true",
            "traefik.http.routers.synapse.entrypoints": "websecure",
            "traefik.http.routers.synapse.rule": "Host(`matrix.{{ base_domain }}`)",
            "traefik.http.services.synapse.loadbalancer.server.port": "8008",
            "traefik.http.routers.synapse.tls": "true",
            "traefik.http.routers.synapse.tls.certresolver": "letls",
          }
    
    - name: Copy Synapse's homeserver configuration file
      ansible.builtin.template:
        src: homeserver.yaml.j2
        dest: /var/lib/containers/storage/volumes/synapse_data/_data/homeserver.yaml
        mode: "600"
    
    - name: Copy Synapse's logging configuration file
      ansible.builtin.template:
        src: log.config.j2
        dest: /var/lib/containers/storage/volumes/synapse_data/_data/{{ matrix_server_name }}.log.config
        mode: "600"
    
    - name: Copy Synapse's signing key
      ansible.builtin.template:
        src: signing.key.j2
        dest: /var/lib/containers/storage/volumes/synapse_data/_data/{{ matrix_server_name }}.signing.key
        mode: "600"
    
    - name: Generate the systemd unit for Synapse
      containers.podman.podman_pod:
        name: pod-synapse
        publish:
          - "10.8.0.2:9000:9000"
        generate_systemd:
          path: /etc/systemd/system
          restart_policy: always
    
    - name: Enable synapse unit
      ansible.builtin.systemd:
        name: pod-pod-synapse.service
        enabled: true
        daemon_reload: true
    
    - name: Make sure synapse is running
      ansible.builtin.systemd:
        name: pod-pod-synapse.service
        state: started
        daemon_reload: true
    
    - name: Allow traffic in monitoring firewalld zone for synapse metrics
      ansible.posix.firewalld:
        zone: internal
        port: "9000/tcp"
        permanent: true
        state: enabled
      notify: firewalld reload
    

    I'm certain I'm doing some things wrong and that this file can be shortened and improved, but that is precisely my point: I'm writing a file specifically for my needs, one that is not peer reviewed.

    Upgrades are also not necessarily trivial. While in theory it's as simple as updating the image tag in my playbook variables, in practice things get more complex when some containers depend on others.

    [!info] With Ansible, I must describe precisely the steps my server has to go through to deploy the new containers, how to check their health, and how to roll back if needed.
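
    To make this concrete, here is a sketch of the kind of upgrade choreography this implies. It is illustrative only, and the synapse_health_url variable is hypothetical, standing in for wherever Synapse's /health endpoint is reachable:

    # Illustrative sketch: the upgrade steps I have to spell out by hand
    - name: Recreate the Synapse container with the new image tag
      containers.podman.podman_container:
        name: synapse
        image: docker.io/matrixdotorg/synapse:{{ synapse_container_tag }}
        pod: pod-synapse
        recreate: true

    - name: Wait until Synapse answers again
      ansible.builtin.uri:
        # synapse_health_url is a hypothetical variable pointing at Synapse's /health endpoint
        url: "{{ synapse_health_url }}"
      register: synapse_health
      retries: 5
      delay: 10
      until: synapse_health.status == 200
      # ...and if it never comes back up, rolling back to the previous tag is still on me.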

    Finally, discoverability of services is not great. I used traefik as a reverse proxy and gave it access to the docker socket so it could read the labels of my other containers (like the labels section of the yaml file above, which contains the domain to use for Synapse), figure out what domain names I used, and route traffic to the correct containers. I wish a similar mechanism existed for, e.g., Prometheus to find new resources and scrape their metrics automatically, but I didn't find one. Configuring Prometheus to scrape my pods was brittle and required a lot of manual work.
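
    For the curious, "brittle" concretely meant hand-maintained static targets, roughly like the sketch below. The first target matches the metrics port published by the Synapse pod above; the second job is purely illustrative:

    # Hand-maintained Prometheus targets: every new pod means editing this file
    scrape_configs:
      - job_name: synapse
        static_configs:
          - targets: ["10.8.0.2:9000"]
      - job_name: node-exporter        # illustrative second job
        static_configs:
          - targets: ["10.8.0.2:9100"]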

    Working with Kubernetes and its community

    What I need is a tool that lets me write "I want to deploy version X of Keycloak." I need it to figure out by itself what version is currently running, what needs to be done to deploy version X, whether the new deployment is going well, and how to roll back automatically if it can't deploy the new version.

    The good news is that this tool exists. It's called Kubernetes, and contrary to popular belief it's not just for large organizations that run services for millions of people and see surges in traffic during Black Friday sales. Kubernetes is software that runs on one or several servers, forming a cluster, and uses their resources to run the containers you ask it to.

    Kubernetes gives me more standardized deployments

    To deploy services on Kubernetes, you have to describe what containers to use, how many of them to deploy, how they're related to one another, etc. You do so with yaml manifests that you apply to your cluster. Kubernetes takes care of the low-level implementation so you can describe what you want to run, what resources to allocate to it, and how to expose it.

    The Kubernetes docs give the following manifest for an example Deployment that will spin up 3 nginx containers (without exposing them outside of the cluster):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
      labels:
        app: nginx
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.14.2
            ports:
            - containerPort: 80
    

    From my laptop, I can apply this file to my cluster using kubectl, the CLI to control Kubernetes clusters.

    $ kubectl apply -f nginx-deployment.yaml
    

    [!info] With Kubernetes, I can describe my infrastructure as yaml files. Kubernetes will read them, deploy the right containers, and monitor their health.
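
    Checking that a rollout went through is just as declarative; these are stock kubectl commands:

    $ kubectl rollout status deployment/nginx-deployment
    $ kubectl get pods -l app=nginx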

    If there is already a well-established community around a project, chances are that some people already maintain Helm charts for it. Helm charts describe how the containers, volumes, and all the other Kubernetes objects are related to one another. When a project is popular enough, people will write charts for it and publish them on https://artifacthub.io/.

    Those charts are open source and can be peer reviewed. They already describe all the containers, services, and other Kubernetes objects that need to be deployed to make a service run. To use them, I only have to define configuration variables, called Helm values, and I can get a service running in minutes.

    To deploy a fully fledged Keycloak instance on my cluster, I need to override default parameters in a values.yaml file, for example:

    ingress:
      enabled: true
      hostname: keycloak.ergaster.org
      tls: true
    

    This is a short example. In practice, I need to override more values to fine-tune my deployment and make it production-ready. Once that's done, I can deploy a configured Keycloak on my cluster by typing these commands on my laptop:

    $ helm repo add bitnami https://charts.bitnami.com/bitnami
    $ helm install my-keycloak -f values.yaml bitnami/keycloak --version 24.7.4
    
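    Upgrading to a newer chart later (and rolling back if the upgrade misbehaves) stays just as short; helm upgrade and helm rollback are standard Helm commands, and the version below is only a placeholder:

    $ helm upgrade my-keycloak -f values.yaml bitnami/keycloak --version <new version>
    $ helm rollback my-keycloak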

    In practice I don't use helm on my laptop. Instead, I write yaml files in a git repository to describe which Helm charts I want to use on my cluster and how to configure them. Then Flux, a software suite running on my Kubernetes cluster, detects changes in the git repository and applies them to my cluster. This makes changes even easier to track and roll back. More on that in a future blog post.
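
    As a teaser, the file describing the Keycloak chart above looks roughly like the sketch below. It is a hedged example of a Flux HelmRelease: the apiVersion and the namespace depend on your Flux version and repository layout, and the HelmRepository named bitnami has to be declared separately:

    apiVersion: helm.toolkit.fluxcd.io/v2
    kind: HelmRelease
    metadata:
      name: keycloak
      namespace: keycloak
    spec:
      interval: 30m
      chart:
        spec:
          chart: keycloak
          version: "24.7.4"
          sourceRef:
            kind: HelmRepository
            name: bitnami
      values:
        ingress:
          enabled: true
          hostname: keycloak.ergaster.org
          tls: true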

    Of course, it would be reckless to deploy charts you don't understand, because you wouldn't be able to chase problems down as they arise. But having community-maintained, peer-reviewed charts gives you a solid base to configure and deploy services. This minimizes the room for error.

    [!info] Helm charts let me benefit from the expertise of thousands of experts. I need to understand what they did, but I have a solid foundation to build on.

    But while a Kubernetes cluster is easy to use, it can be difficult to set up. Or is it?

    Kubernetes doesn't have to be complex

    A fully fledged Kubernetes cluster is a complex beast. Kubernetes, often abbreviated k8s, was initially designed to run on several machines. When setting up a cluster, you have to choose between components you know nothing about. What do I want for the network of my cluster? I don't know, I don't even know how the cluster network is supposed to work, and I don't want to know! I want a Kubernetes cluster, not a Lego toolset!

    [!info] A fully fledged cluster solves problems for large companies with public facing services, not problems for a home lab.

    Quite a few cloud providers offer managed cluster options so you don't have to worry about it, but they are expensive for an individual. In particular, they charge fees depending on the amount of outgoing traffic (egress fees). Those are difficult to predict.

    Fortunately, a team of brilliant people has created an opinionated bundle of software to deploy a Kubernetes cluster on a single server (though it can form a cluster with several nodes too). They cheekily call it k3s to advertise it as a smaller k8s.

    [!info] For a self-hosting enthusiast who wants to run Kubernetes on a single server, k3s works like k8s, but installing and maintaining the cluster itself is much simpler.

    Since I don't have High Availability needs and can afford to have my services go offline occasionally, k3s on a single node is more than enough to let me play with Kubernetes without the extra complexity.

    Installing k3s can be as simple as running the one-liner they advertise on their website:

    $ curl -sfL https://get.k3s.io | sh -
    

    I've found their k3s-ansible playbook more useful, since it performs a few checks on the cluster host and copies the kubectl configuration files to your laptop automatically.

    Discoverability on Kubernetes is fantastic

    In my podman setup, I loved how traefik could read the labels of a container, figure out what domains I used, and where to route traffic for those domains.

    Not only is the same thing true for Kubernetes, it goes further. cert-manager, the standard way to retrieve certs in a Kubernetes cluster, will read Ingress properties to figure out what domains I use and retrieve certificates for them. If you're not familiar with Kubernetes, an Ingress is a Kubernetes object telling your cluster "I want to expose those containers to the web, and this is how they can be reached."
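
    As a hedged illustration, a minimal Ingress for Synapse could look like the sketch below. The names, the domain, and the letsencrypt ClusterIssuer are illustrative, but the cert-manager.io/cluster-issuer annotation is genuinely all cert-manager needs to go and fetch a certificate for the host listed in the tls section:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: synapse
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt
    spec:
      rules:
        - host: matrix.example.org
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: synapse
                    port:
                      number: 8008
      tls:
        - hosts:
            - matrix.example.org
          secretName: synapse-tls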

    Kubernetes also has an Operator pattern. When I install a service on my Kubernetes cluster, it often comes with a ServiceMonitor, a custom Kubernetes object that the Prometheus Operator can read to know how to scrape the service's metrics.
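
    A minimal ServiceMonitor looks roughly like this; the labels and the metrics port name are illustrative and depend on how the service and the Prometheus Operator are deployed:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: synapse
      labels:
        release: prometheus   # illustrative: whatever label your Prometheus instance selects on
    spec:
      selector:
        matchLabels:
          app: synapse
      endpoints:
        - port: metrics       # the name of the port exposing metrics on the Service
          interval: 30s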

    All of this happens by adding an annotation or two in yaml files. I don't have to fiddle with networks. I don't have to configure things manually. Kubernetes handles all that complexity for me.

    Conclusion

    It was a surprise for me to realize that Kubernetes is not the complex beast I thought it was. Kubernetes internals can be difficult to grasp, and there is a steeper learning curve than with docker-compose. But it's well worth the effort.

    I got into Kubernetes by deploying k3s on my Raspberry Pi 4. I then moved to a beefier mini PC, not because Kubernetes added too much overhead, but because the CPU of the Raspberry Pi 4 is too weak to handle my encrypted backups.

    With Kubernetes and Helm, I have more standardized deployments. The open source services I deploy have been crafted and reviewed by a community of enthusiasts and professionals. Kubernetes handles a lot of the complexity for me, so I don't have to. My k3s cluster runs on a single node. My volumes live on my disk (via Rancher's local path provisioner). And I still don't do Black Friday sales!