
# k8sbc

## Original description of the 38C3 planning session

We want to build a Kubernetes cluster on distributed single board computers. The idea is a self-hosted / home-hosted system with nodes distributed among the homes of friends or other people one trusts.

Nodes should be easy to bootstrap, which is why we want to start with a single kind of SBC: that way we only need to provide one system image, built specifically for the chosen hardware. Nodes are connected via WireGuard.

This is not a hands-on workshop but a self-organized planning session.

## Communication / Collaboration

Check out the 'kickoff' repo and add your nick and mail address, and we will create a Gitea account for you.

Mailing list? Matrix room?

## Notes from 38C3

We had a first planning session at 38C3, discussing some aspects of the project, especially latency issues with Kubernetes / etcd and distributed storage solutions. Furthermore, we discussed how to handle the cluster IP / entry point, ingresses, and how nodes discover each other and update their (dynamic) IP addresses.

Running the control plane on a cloud provider (e.g. Hetzner) would ease things, but is also less interesting. The more challenging option is to run the control plane within our SBC cluster as well.

WireGuard seems to be the preferred technology to establish the 'node interconnect'. Alternatives are Tailscale and Tinc.
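
Since Talos is already on the table (see 'Software' below), one hedged sketch of such an interconnect is WireGuard configured directly in the Talos machine config; the keys, addresses and hostname below are placeholders:

```yaml
machine:
  network:
    interfaces:
      - interface: wg0
        addresses:
          - 10.100.0.2/24                  # overlay address of this node (placeholder)
        wireguard:
          privateKey: <this-node-private-key>
          listenPort: 51820
          peers:
            - publicKey: <peer-public-key>
              # public DNS name the peer keeps up to date (see 'Node discovery and Ingress')
              endpoint: node1.k8sbc.example.org:51820
              persistentKeepaliveInterval: 25s   # helps NAT traversal on home connections
              allowedIPs:
                - 10.100.0.0/24
```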

### Node discovery and Ingress

TODO: Split this topic into 'node discovery' (for nodes joining the cluster) and 'ingress' (service discovery inside the cluster)

- DNS-only solution: nodes update their address in a public DNS server (see the sketch after this list)
- Cilium
- cloud-controller-manager
- Tor hidden service
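
A hedged sketch of the DNS-only idea, assuming an RFC 2136-capable authoritative server and a TSIG key; every name, the zone and the IP-lookup service are placeholders. A node (or a CronJob pinned to it) re-publishes its current public address every few minutes:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: ddns-update
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: nsupdate
              image: alpine:3.21
              env:
                - name: TSIG_SECRET
                  valueFrom:
                    secretKeyRef:
                      name: ddns-tsig
                      key: secret
              command: ["/bin/sh", "-c"]
              args:
                - |
                  apk add --no-cache bind-tools curl
                  IP=$(curl -s https://ifconfig.me)   # placeholder IP-lookup service
                  nsupdate -y "hmac-sha256:ddns-key:$TSIG_SECRET" <<EOF
                  server ns1.example.org
                  zone example.org
                  update delete node1.k8sbc.example.org A
                  update add node1.k8sbc.example.org 300 A $IP
                  send
                  EOF
```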

### Storage

- S3 compat layer provided by Ceph as one option
- Ceph Stretch
  - Needs 3 zones (arbiter versus data zones) with a maximum of 700ms latency
  - Ceph Stretch specifically caters to higher latencies, because its use case is to span several geographically separated data centers
- Possibly Rook as provider (sketch below)
- OpenEBS
- Longhorn
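
If we go the Rook route, stretch mode is configured on the CephCluster resource. A hedged sketch with three zones (one arbiter, two data); zone names and the Ceph release are placeholders:

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v18           # example release, pick a current one
  dataDirHostPath: /var/lib/rook
  mon:
    count: 5                               # 2 mons per data zone + 1 arbiter
    allowMultiplePerNode: false
    stretchCluster:
      failureDomainLabel: topology.kubernetes.io/zone
      subFailureDomain: host
      zones:
        - name: home-a                     # tie-breaker only, stores no data
          arbiter: true
        - name: home-b
        - name: home-c
```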

### Problems

- etcd latency (see the tuning sketch below)
- Network bandwidth and latencies are not predictable for the individual participants' connections
- Low bandwidth, and not symmetric with regard to upload / download
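
etcd can be tuned somewhat for higher round-trip times. A hedged sketch via Talos' machine config (cluster.etcd.extraArgs); the values are illustrative, etcd's upstream guidance being an election timeout of roughly 10x the heartbeat interval:

```yaml
cluster:
  etcd:
    extraArgs:
      heartbeat-interval: "500"   # etcd default is 100 ms; raised for WAN RTTs
      election-timeout: "5000"    # etcd default is 1000 ms; keep at ~10x heartbeat
```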

## What we did not discuss

What kind of SBC to use or support:

- The more RAM the better. Many SBCs are low on RAM, typically offering only up to 4 GB; SBCs with much more RAM exist, but they might not be as well supported by Linux or as readily available as the more common ones.
- An interesting candidate seems to be the Radxa ROCK 3-5 with up to 32 GB of RAM.

While there is no immediate or inherent reason to restrict that, it's easiest to commit to one kind of SBC, providing one type of system image to users. On the other hand, the people with initial interest in the project already own SBCs they can dedicate to the cluster, and those people are generally capable of creating and / or running their own system image for their SBC.

## What we have

### Hardware

- Pine64
- Pine ROCKPro64
- Raspberry Pi

### Software

- Talos on Pine64
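
A hedged sketch of what the install section of a Talos machine config for an SBC might look like; the disk, the Image Factory schematic ID and the version tag are placeholders to be verified per board:

```yaml
machine:
  install:
    disk: /dev/mmcblk1                     # eMMC on many SBCs (assumption, verify per board)
    # installer image from the Talos Image Factory with the board overlay baked in;
    # <schematic-id> and the version tag are placeholders
    image: factory.talos.dev/installer/<schematic-id>:v1.9.0
    wipe: false
```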

## Next steps

- Please add your name, mail address, notes and thoughts to this README
- Decide on and set up communication infrastructure
- Decide on a cluster endpoint / entry point solution
- Explore Talos more in-depth