# k8sbc

## Original description of the 38C3 planning session
We want to build a Kubernetes cluster on distributed single-board computers (SBCs).
The idea is a self-hosted / home-hosted system with nodes distributed among the homes
of friends or other people one trusts.

Nodes should be easy to bootstrap, which is why we want to start with
one kind of SBC, so that we only need to provide one system image
for the chosen hardware. Nodes are connected with WireGuard.

This is not a hands-on workshop but a self-organized planning session.

## Communication / Collaboration
Check out the 'kickoff' repo and add your nick and mail address, and we will
create a Gitea account for you.

- mue mue-k8sbc@e2m.io
- steigr
- juwi
- hybris

Mailing list?
Matrix room?

## Notes from 38C3
We had a first planning session at 38C3 discussing some of the project aspects,
especially latency issues with Kubernetes / etcd and distributed storage solutions.
Furthermore, we discussed how to handle the cluster IP / entry point, ingresses, and
how nodes discover each other and update their (dynamic) IP addresses.

Running the control plane on a cloud provider (e.g. Hetzner) would make things easier but is also less interesting.
The more challenging option is to run the control plane within our SBC cluster as well.

WireGuard seems to be the preferred technology to establish the 'node interconnect'.
Alternatives are Tailscale and Tinc.

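To make the WireGuard 'node interconnect' a bit more concrete, here is a minimal sketch of how a per-node config could be generated. It is only an illustration under assumptions: the tunnel network `10.99.0.0/24`, port `51820`, the host name `node-a.example.org`, the key placeholders, and the helper names `Peer` and `render_wg_config` are all made up for this example. Only the config keys themselves (`PrivateKey`, `Address`, `ListenPort`, `PublicKey`, `AllowedIPs`, `Endpoint`, `PersistentKeepalive`) are standard `wg-quick` settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """One remote node as seen from the local node (all values are placeholders)."""
    public_key: str      # base64 WireGuard public key of the peer
    tunnel_ip: str       # address of the peer inside the tunnel network
    endpoint: str = ""   # "host:port"; empty if the peer has no reachable address


def render_wg_config(private_key: str, tunnel_cidr: str, listen_port: int,
                     peers: list[Peer], keepalive: int = 25) -> str:
    """Render a wg-quick style config listing every known peer."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {tunnel_cidr}",
        f"ListenPort = {listen_port}",
    ]
    for peer in peers:
        lines += ["", "[Peer]",
                  f"PublicKey = {peer.public_key}",
                  f"AllowedIPs = {peer.tunnel_ip}/32"]
        if peer.endpoint:
            # A DNS name works here, which fits nodes with dynamic IP addresses.
            lines.append(f"Endpoint = {peer.endpoint}")
        # Periodic keepalives help keep NAT mappings open on home connections.
        lines.append(f"PersistentKeepalive = {keepalive}")
    return "\n".join(lines) + "\n"


# Hypothetical example: one peer with a public endpoint, one behind NAT.
config = render_wg_config(
    private_key="<local-private-key>",
    tunnel_cidr="10.99.0.1/24",
    listen_port=51820,
    peers=[Peer("<peer-a-public-key>", "10.99.0.2", "node-a.example.org:51820"),
           Peer("<peer-b-public-key>", "10.99.0.3")],
)
print(config)
```

A peer without an `Endpoint` can still be reached once it has initiated the connection itself, which is why the sketch sets `PersistentKeepalive` for every peer. Whether a full mesh or a hub-and-spoke layout is used is an open design decision.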
### Node discovery and Ingress
TODO: Split this topic into 'node discovery' (for nodes joining the cluster)
and 'ingress' (service discovery inside the cluster).

DNS-only solution: nodes update their address in the public DNS server.

- [Cilium](https://cilium.io/)
- [cloud-controller-manager](https://kubernetes.io/docs/concepts/architecture/cloud-controller)
- [Tor hidden service](https://community.torproject.org/onion-services/setup/)

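The DNS-only idea from this section could look roughly like the sketch below: a node compares its current public address with the published record and only sends an update when the record is stale. Everything here is an assumption for illustration. The function names, the host name `node1.k8sbc.example.org`, and the `update_record` hook are hypothetical, and the hook stands in for whichever DNS provider API or dynamic-update mechanism ends up being chosen. The sample addresses are arbitrary public IPs used only because Python's `ipaddress` module does not classify documentation ranges as globally routable.

```python
import ipaddress
from typing import Callable, Optional


def is_publishable(address: str) -> bool:
    """True if the address is a valid, globally routable IP address."""
    try:
        return ipaddress.ip_address(address).is_global
    except ValueError:
        return False


def sync_dns_record(hostname: str,
                    current_ip: str,
                    published_ip: Optional[str],
                    update_record: Callable[[str, str], None]) -> bool:
    """Call update_record(hostname, ip) only when the published record is stale.

    Returns True if an update was sent.
    """
    if not is_publishable(current_ip):
        return False  # private or carrier-grade NAT address: nothing useful to publish
    try:
        stale = (published_ip is None or
                 ipaddress.ip_address(published_ip) != ipaddress.ip_address(current_ip))
    except ValueError:
        stale = True  # the published value is not a valid address, so replace it
    if stale:
        update_record(hostname, current_ip)
    return stale


# Hypothetical usage: collect the updates instead of calling a real DNS API.
sent = []
record = lambda host, ip: sent.append((host, ip))
host = "node1.k8sbc.example.org"
sync_dns_record(host, "8.8.8.8", None, record)            # first publication
sync_dns_record(host, "8.8.8.8", "8.8.8.8", record)       # unchanged, no update
sync_dns_record(host, "8.8.4.4", "8.8.8.8", record)       # address changed
sync_dns_record(host, "192.168.1.20", "8.8.8.8", record)  # not routable, skipped
print(sent)
```

Skipping non-routable addresses matters for participants behind carrier-grade NAT, who would otherwise publish an address nobody can reach. Such nodes would need a different entry path, for example via the WireGuard tunnel.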
### Storage
- S3 compatibility layer provided by Ceph as one option
- [Ceph Stretch](https://docs.ceph.com/en/latest/rados/operations/stretch-mode/):
  needs 3 zones (arbiter versus data zones) with a maximum of 700ms latency.
  Ceph Stretch specifically caters for higher latencies
  because its use case is to span several geographically separated data centers.
- Possibly [Rook](https://rook.io/docs/rook/v1.9/ceph-storage.html) as provider
- [OpenEBS](https://openebs.io/)
- [Longhorn](https://longhorn.io/)

### Problems
- etcd latency
- Network bandwidth and latencies are not predictable for the individual participants' connections
- Low and asymmetric bandwidth

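For the etcd latency problem, the etcd tuning guidance is to set the heartbeat interval to roughly the round-trip time (RTT) between members and the election timeout to at least 10 times the RTT, with defaults of 100 ms and 1000 ms and an upper limit of 50000 ms for the election timeout. The sketch below turns measured RTTs into candidate values for the `--heartbeat-interval` and `--election-timeout` flags. The function names and the sample RTTs are made up for illustration, and taking the worst peer as the basis is an assumption on the conservative side, not a rule from the etcd documentation.

```python
import math

DEFAULT_HEARTBEAT_MS = 100   # etcd default heartbeat interval
DEFAULT_ELECTION_MS = 1000   # etcd default election timeout
MAX_ELECTION_MS = 50_000     # upper limit etcd accepts for the election timeout


def suggest_etcd_timing(avg_rtt_ms: dict[str, float]) -> dict[str, int]:
    """Suggest heartbeat and election values from average RTTs per peer (in ms)."""
    worst_rtt = max(avg_rtt_ms.values())
    # Never go below the defaults; otherwise follow the slowest peer.
    heartbeat = max(DEFAULT_HEARTBEAT_MS, math.ceil(worst_rtt))
    # 10x the heartbeat is at least 10x the worst RTT, capped at etcd's limit.
    election = min(MAX_ELECTION_MS, max(DEFAULT_ELECTION_MS, 10 * heartbeat))
    return {"heartbeat_ms": heartbeat, "election_ms": election}


def as_flags(timing: dict[str, int]) -> list[str]:
    """Format the suggestion as etcd command-line flags."""
    return [f"--heartbeat-interval={timing['heartbeat_ms']}",
            f"--election-timeout={timing['election_ms']}"]


# Hypothetical average RTTs measured over the tunnel, in milliseconds.
measured = {"node-a": 35.0, "node-b": 180.0, "node-c": 92.5}
timing = suggest_etcd_timing(measured)
print(" ".join(as_flags(timing)))
```

With the sample values, the slowest peer (180 ms) determines the result: a heartbeat of 180 ms and an election timeout of 1800 ms. Larger values make the cluster tolerate slow links but also slow down the detection of a failed leader, so real measurements from the participants' connections are needed before settling on anything.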
### What we did _not_ discuss
What kind of SBC to use or support.

The more RAM the better. Many SBCs are low on RAM, typically only up to 4GB,
and while SBCs with much more RAM exist, they might not be as well supported by Linux
or as readily available as the more common ones.
An interesting candidate seems to be the [Radxa Rock 3-5](https://wiki.radxa.com/Rock3) with up to 32GB of RAM.

While there is no immediate or inherent reason to restrict the choice,
it's easiest to commit to one kind of SBC, providing one type of system image to users.
On the other hand, the people with initial interest in the project already own SBCs
to dedicate to the cluster, and those people are generally capable of creating
and / or running their own system image for their SBC.

## What we have
### Hardware
- Pine64
- Pine ROCKPro64
- Raspberry Pi

### Software
- Talos on Pine64

## Next steps
- Please add your name, mail address, notes and thoughts to this README
- Decide on and set up communication infrastructure
- Decide on cluster endpoint / entry point solution
- Explore Talos more in-depth