Initial commit

mue 2024-12-29 17:02:26 +01:00
commit 0cb8bc2e85

README.md Normal file

@ -0,0 +1,90 @@
# k8sbc
## Original description of the 38C3 planning session
We want to build a Kubernetes cluster on distributed single board computers.
The idea is a self-hosted / home-hosted system with nodes being distributed among the homes
of friends or people one trusts.
Nodes should be easy to bootstrap, which is why we want to start with
one kind of SBC, so that we only need to provide one kind of system image
for the chosen hardware specifically. Nodes are connected with WireGuard.
This is not a hands-on workshop but a self-organized planning session.
## Communication / Collaboration
Check out the 'kickoff' repo and add your nick and mail address, and we will
create a Gitea account for you:
- mue mue-k8sbc@e2m.io
- steigr
- juwi
- hybris

Still open: a mailing list? A Matrix room?
## Notes from 38C3
We had a first planning session at 38C3 and discussed some aspects of the project,
especially latency issues with Kubernetes / etcd and distributed storage solutions.
Furthermore, we discussed how to handle the cluster IP / entry point and ingresses,
and how nodes discover each other and update their (dynamic) IP addresses.

Running the control plane on a cloud provider (e.g. Hetzner) would ease things but is also less interesting.
Running the control plane within our SBC cluster as well is more challenging.

WireGuard seems to be the preferred technology for establishing the 'node interconnect'.
Alternatives are Tailscale and Tinc.
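
To make the 'node interconnect' idea more tangible, here is a minimal sketch that renders a full-mesh WireGuard configuration for one node in the `wg-quick` file format. Everything specific in it is an assumption made for illustration and was not decided in the session: the node names, the placeholder keys, the `10.90.0.0/24` tunnel range, the example hostnames, and the helper `render_wg_config` itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    name: str
    public_key: str           # placeholder string, not a real WireGuard key
    tunnel_ip: str            # address inside the assumed 10.90.0.0/24 mesh
    endpoint: Optional[str]   # "host:port" if reachable from outside, else None


def render_wg_config(me: Node, private_key: str, nodes: list[Node],
                     listen_port: int = 51820) -> str:
    """Render a wg-quick style config that peers `me` with every other node."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {me.tunnel_ip}/24",
        f"ListenPort = {listen_port}",
    ]
    for peer in nodes:
        if peer.name == me.name:
            continue  # a node does not peer with itself
        lines += [
            "",
            "[Peer]",
            f"# {peer.name}",
            f"PublicKey = {peer.public_key}",
            f"AllowedIPs = {peer.tunnel_ip}/32",
        ]
        if peer.endpoint:
            # Only peers with a known public address get an Endpoint line.
            lines.append(f"Endpoint = {peer.endpoint}")
        # Keepalives help nodes behind home routers keep their NAT mapping open.
        lines.append("PersistentKeepalive = 25")
    return "\n".join(lines) + "\n"


# Hypothetical three-node cluster; "bob" sits behind NAT without a public endpoint.
NODES = [
    Node("alice", "<alice-public-key>", "10.90.0.1", "alice.example.org:51820"),
    Node("bob", "<bob-public-key>", "10.90.0.2", None),
    Node("carol", "<carol-public-key>", "10.90.0.3", "carol.example.org:51820"),
]

if __name__ == "__main__":
    print(render_wg_config(NODES[1], "<bob-private-key>", NODES))
```

A full mesh means every new node changes every other node's configuration, which is one reason the node discovery question matters.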
### Node discovery and Ingress
TODO: Split this topic into 'node discovery' (for nodes joining the cluster)
and 'ingress' (service discovery inside the cluster)
- DNS-only solution: nodes update their address in the public DNS server
- [Cilium](https://cilium.io/)
- [cloud-controller-manager](https://kubernetes.io/docs/concepts/architecture/cloud-controller)
- [Tor hidden service](https://community.torproject.org/onion-services/setup/)
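
For the DNS-only idea, the core decision each node has to make is whether its currently observed public address differs from the published record. The sketch below shows only that decision. How a node learns its public address and how the record is pushed to the DNS server depend on the provider and are left out. The function name `plan_dns_update`, the record layout and the default TTL of 60 seconds are assumptions for illustration.

```python
import ipaddress
from typing import Optional


def plan_dns_update(hostname: str, observed_ip: str,
                    recorded_ip: Optional[str], ttl: int = 60) -> Optional[dict]:
    """Return the record to publish, or None if the published record is current."""
    addr = ipaddress.ip_address(observed_ip)
    if not addr.is_global:
        # A LAN or carrier-grade NAT address is useless to other nodes.
        raise ValueError(f"{observed_ip} is not a publicly routable address")
    if recorded_ip is not None and ipaddress.ip_address(recorded_ip) == addr:
        return None  # nothing changed, no update needed
    return {
        "name": hostname,
        "type": "AAAA" if addr.version == 6 else "A",
        "ttl": ttl,  # short TTL so peers notice address changes quickly
        "value": str(addr),
    }
```

A node would call this periodically and only contact the DNS server when the result is not `None`.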
### Storage
- S3 compatibility layer provided by Ceph as one option
- [Ceph Stretch](https://docs.ceph.com/en/latest/rados/operations/stretch-mode/)
  - Needs 3 zones (arbiter versus data zones) with a maximum of 700 ms latency
  - Ceph Stretch specifically caters for higher latencies
    because its use case is to span several geographically separated data centers
- Possibly [rook](https://rook.io/docs/rook/v1.9/ceph-storage.html) as provider
- [OpenEBS](https://openebs.io/)
- [Longhorn](https://longhorn.io/)
### Problems
- etcd latency
- Network bandwidth and latency are not predictable for the individual participants' connections
- Low, asymmetric bandwidth
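
One way to approach the etcd latency problem is to measure round-trip times between the prospective nodes and derive etcd's timing parameters from them. The sketch below assumes the rule of thumb from the etcd tuning guide (heartbeat interval roughly the round-trip time between members, election timeout about ten times that, with an upper limit of 50 s) and should be checked against the current etcd documentation before use. The function name `suggest_etcd_timing` and the sample values are made up for illustration.

```python
import math

# etcd defaults and limit, as assumed from the etcd tuning guide.
ETCD_DEFAULT_HEARTBEAT_MS = 100
ETCD_DEFAULT_ELECTION_MS = 1000
ETCD_MAX_ELECTION_MS = 50000


def suggest_etcd_timing(rtt_samples_ms: list[float]) -> dict:
    """Suggest etcd timing flags from measured round-trip times in milliseconds."""
    if not rtt_samples_ms:
        raise ValueError("need at least one RTT sample")
    worst_rtt = max(rtt_samples_ms)
    # Never go below the defaults; on fast links they are already fine.
    heartbeat = max(ETCD_DEFAULT_HEARTBEAT_MS, math.ceil(worst_rtt))
    election = max(ETCD_DEFAULT_ELECTION_MS, 10 * heartbeat)
    if election > ETCD_MAX_ELECTION_MS:
        raise ValueError(f"RTT of {worst_rtt} ms is too high for etcd")
    return {
        "heartbeat_interval_ms": heartbeat,
        "election_timeout_ms": election,
        "flags": [
            f"--heartbeat-interval={heartbeat}",
            f"--election-timeout={election}",
        ],
    }


if __name__ == "__main__":
    # Hypothetical RTTs between three home connections.
    print(suggest_etcd_timing([45.0, 120.0, 180.5]))
```

Longer timeouts make the cluster tolerate slow links, at the cost of slower detection of a failed leader.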
### What we did _not_ discuss
What kind of SBC to use or support.

The more RAM the better. Many SBCs are low on RAM, typically offering only up to 4GB,
and while SBCs with much more RAM exist, they might not be as well supported by Linux
or as readily available as the more common ones.
An interesting candidate seems to be the [Radxa Rock 3-5](https://wiki.radxa.com/Rock3) with up to 32GB of RAM.

While there is no immediate or inherent reason to restrict the choice,
it's easiest to commit to one kind of SBC and provide one type of system image to users.
On the other hand, the people with initial interest in the project already own SBCs
to dedicate to the cluster, and those people are generally capable of creating
and / or running their own system image for their SBC.
## What we have
### Hardware
- Pine64
- Pine ROCKPro64
- Raspberry Pi
### Software
Talos on Pine64
## Next steps
- Please add your name, mail address, notes and thoughts to this README
- Decide on and set up communication infrastructure
- Decide on cluster endpoint / entry point solution
- Explore Talos more in-depth