Setting Up a Kubernetes Cluster: From Theory to Practice
How to create a Kubernetes cluster from scratch. Options compared, step-by-step setup, and decisions to make before starting.
You want to create a Kubernetes cluster. The question is: how? There are many options, and the right choice depends on context. Cloud managed? Self-hosted? Which distribution? How many nodes?
This guide covers the main options and helps you decide. Then we look at a concrete setup with kubeadm for those who want to get their hands dirty.
First Decision: Managed vs Self-Hosted
The first choice is whether to manage Kubernetes yourself or let someone else do it.
Managed Kubernetes (Cloud)
Cloud providers offer Kubernetes as a Service:
- EKS (AWS)
- GKE (Google Cloud)
- AKS (Azure)
- DOKS (DigitalOcean)
Pros:
- Managed control plane — you don't have to worry about etcd, API server, scheduler
- Simplified upgrades
- Native integration with cloud services (load balancer, storage, IAM)
- Support from the provider
Cons:
- Cost — you pay for the control plane in addition to worker nodes
- Vendor lock-in — native integrations tie you to the provider
- Less control — some configurations aren't possible
When to choose it: If you're on cloud and don't want to manage Kubernetes infrastructure. For most enterprise production cases, it's the reasonable choice.
Self-Hosted
You install and manage everything yourself. The options:
- kubeadm — Official tool for bootstrapping clusters
- K3s — Lightweight Kubernetes (discussed in another article)
- RKE2 (Rancher) — Enterprise-ready Kubernetes
- Kubespray — Ansible playbook for deployment
Pros:
- Total control
- No cost for control plane
- Works anywhere (cloud, on-premise, bare metal)
Cons:
- Responsibility for upgrades, backups, high availability
- More operational complexity
- No support (unless you pay for a commercial distribution)
When to choose it: On-premise, edge computing, specific control requirements, or when you want to learn how Kubernetes really works.
Cluster Architecture
Before creating the cluster, some architectural decisions.
How Many Control Plane Nodes?
- 1 node: Ok for development and testing. If it dies, you lose the control plane (and, without an etcd backup, the cluster state).
- 3 nodes: Minimum for production with HA. etcd needs a majority quorum (floor(n/2)+1 members), so 3 members tolerate the loss of 1; always use an odd number.
- 5 nodes: For very critical clusters; 5 members tolerate the loss of 2. More than 5 is rarely needed and only adds replication overhead.
How Many Worker Nodes?
It depends on the workload. Start with what you need and scale later: Kubernetes makes adding nodes easy. A rough sizing example follows the list below.
Consider:
- Resources needed for your pods
- Margin for scheduling (you don't want nodes at 100%)
- Fault tolerance (if one node dies, can the others handle it?)
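As a rough sizing sketch (hypothetical numbers): if your pods request a total of 12 CPUs and 48 GB of RAM, three workers with 8 CPUs / 32 GB each leave roughly 50% headroom and still fit the workload if one node fails (16 CPUs / 64 GB remaining). Two larger nodes with the same total capacity would cover the raw requests but leave almost no margin when one of them goes down.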
Networking CNI
The Container Network Interface (CNI) manages networking between pods. Common options:
- Calico — Probably the most widely deployed. Network policies, optional BGP, flexible.
- Cilium — Based on eBPF, excellent performance, advanced observability and security features.
- Flannel — Simple and lightweight, but no network policy support.
- Weave — Easy to configure, built-in encryption, but no longer actively maintained.
For most cases, Calico is fine. If you have advanced performance or security requirements, look at Cilium.
Setup with kubeadm
Let's look at a practical setup with kubeadm on Ubuntu 22.04. This is the "official" method for creating self-hosted clusters.
Prerequisites (All Nodes)
# Disable swap (the kubelet requires swap to be off by default)
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Load required kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Configure sysctl
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
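A quick sanity check before moving on: verify that the modules are loaded and the sysctl values took effect.
# Verify kernel modules and sysctl settings
lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward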
Install Container Runtime (containerd)
# Install containerd
sudo apt-get update
sudo apt-get install -y containerd
# Configure containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
# Enable SystemdCgroup
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
sudo systemctl restart containerd
sudo systemctl enable containerd
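To double-check that the cgroup driver change took effect and containerd is healthy:
# Confirm SystemdCgroup is enabled and containerd is running
grep SystemdCgroup /etc/containerd/config.toml
systemctl is-active containerd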
Install kubeadm, kubelet, kubectl
# Add Kubernetes repository
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
# Install packages
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
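A quick check that everything is installed at the expected version:
# Verify installed versions
kubeadm version
kubectl version --client
kubelet --version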
Initialize the Control Plane
On the first control plane node:
sudo kubeadm init \
--control-plane-endpoint="LOAD_BALANCER_IP:6443" \
--upload-certs \
--pod-network-cidr=10.244.0.0/16
If you have a single control plane node, you can omit --control-plane-endpoint. The --pod-network-cidr must match the CNI you'll install: 10.244.0.0/16 is Flannel's default, while Calico defaults to 192.168.0.0/16 and can be configured for other ranges.
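For example, a minimal single-control-plane init for a cluster that will use Calico could look like this (a sketch; adjust the CIDR to your environment):
# Single control plane, Calico's default pod CIDR
sudo kubeadm init --pod-network-cidr=192.168.0.0/16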
After completion, kubeadm gives you:
- Command to configure kubectl
- Command to join other control planes
- Command to join workers
# Configure kubectl for your user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install CNI (Calico)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
Wait for Calico pods to be running:
kubectl get pods -n kube-system
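If you'd rather block until the CNI is ready than poll manually, something like this should work (the k8s-app=calico-node label comes from the manifest above):
kubectl wait --namespace kube-system \
  --for=condition=Ready pod \
  --selector k8s-app=calico-node \
  --timeout=300s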
Join Worker Nodes
On each worker node, run the command kubeadm gave you:
sudo kubeadm join CONTROL_PLANE_IP:6443 \
--token TOKEN \
--discovery-token-ca-cert-hash sha256:HASH
If the token has expired, generate a new one on the control plane:
kubeadm token create --print-join-command
Verify the Cluster
kubectl get nodes
You should see all nodes in Ready state.
High Availability
For control plane HA you need at least 3 nodes. Configuration is similar but with some adjustments.
Load Balancer
You need a load balancer in front of the control planes. Options:
- HAProxy
- Nginx
- Cloud load balancer
- kube-vip (software load balancer for Kubernetes)
The load balancer must balance port 6443 (API server) to all control planes.
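As a sketch, a minimal HAProxy configuration that does this (the IPs are hypothetical placeholders):
# /etc/haproxy/haproxy.cfg (excerpt)
frontend kubernetes-api
    bind *:6443
    mode tcp
    option tcplog
    default_backend control-planes
backend control-planes
    mode tcp
    balance roundrobin
    option tcp-check
    server cp1 10.0.0.11:6443 check
    server cp2 10.0.0.12:6443 check
    server cp3 10.0.0.13:6443 check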
HA Setup
# First control plane
sudo kubeadm init \
--control-plane-endpoint="LOAD_BALANCER_IP:6443" \
--upload-certs \
--pod-network-cidr=10.244.0.0/16
# Other control planes
sudo kubeadm join LOAD_BALANCER_IP:6443 \
--token TOKEN \
--discovery-token-ca-cert-hash sha256:HASH \
--control-plane \
--certificate-key CERT_KEY
The --certificate-key is shown in the output of the first kubeadm init.
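Note that the uploaded certificates are deleted after two hours. If the key has expired, you can re-upload them and print a fresh one:
# Re-upload control plane certificates and print a new certificate key
sudo kubeadm init phase upload-certs --upload-certs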
Storage
Kubernetes needs storage for persistent volumes. Options:
Cloud: Use your provider's CSI driver (EBS for AWS, PD for GCP, etc.)
Self-hosted:
- Longhorn — Distributed storage, easy to install
- Rook/Ceph — Powerful but complex
- OpenEBS — Simpler than Ceph, various backend options
- Local Path Provisioner — Local storage, no replication
To start, Local Path Provisioner is fine:
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
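To use it, point your PersistentVolumeClaims at the local-path StorageClass it creates. A minimal example:
# pvc.yaml: claims 1Gi from the local-path StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
Apply it with kubectl apply -f pvc.yaml; the volume is provisioned on the node where the first consuming pod gets scheduled.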
For production, consider Longhorn or your storage's CSI driver.
Monitoring and Logging
A cluster without monitoring is a ticking time bomb.
Metrics Server
Required for kubectl top and autoscaling:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
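On kubeadm clusters, metrics-server often fails to start because the kubelets serve self-signed certificates. If you hit that, a common workaround (weigh the security tradeoff) is to let it skip kubelet certificate verification:
# Allow metrics-server to skip kubelet TLS verification
kubectl patch deployment metrics-server -n kube-system --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'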
Prometheus + Grafana
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack
This installs Prometheus, Grafana, and pre-configured dashboards for Kubernetes.
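To reach Grafana without exposing it, a port-forward is enough. With the release name used above, the service should be called prometheus-grafana (default login is admin / prom-operator unless you changed it):
# Forward Grafana to http://localhost:3000
kubectl port-forward svc/prometheus-grafana 3000:80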
Cluster Upgrade
Kubernetes ships a new minor version roughly every four months (three releases per year). How to upgrade:
Control Plane
# Update the apt repository to the new minor version first
# (pkgs.k8s.io repositories are per minor release: change v1.29 to v1.30
# in /etc/apt/sources.list.d/kubernetes.list, then apt-get update)
# Find available versions
apt-cache madison kubeadm
# Upgrade kubeadm
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm=1.30.0-1.1
sudo apt-mark hold kubeadm
# Check upgrade plan
sudo kubeadm upgrade plan
# Apply upgrade
sudo kubeadm upgrade apply v1.30.0
# Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get update && sudo apt-get install -y kubelet=1.30.0-1.1 kubectl=1.30.0-1.1
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
Worker Nodes
# On control plane: drain the node
# (add --delete-emptydir-data if drain complains about pods using emptyDir)
kubectl drain NODE_NAME --ignore-daemonsets
# On worker: upgrade
sudo apt-mark unhold kubeadm kubelet kubectl
sudo apt-get update && sudo apt-get install -y kubeadm=1.30.0-1.1 kubelet=1.30.0-1.1 kubectl=1.30.0-1.1
sudo apt-mark hold kubeadm kubelet kubectl
sudo kubeadm upgrade node
sudo systemctl daemon-reload
sudo systemctl restart kubelet
# On control plane: uncordon the node
kubectl uncordon NODE_NAME
Repeat for each worker, one at a time.
Backup
Backup of etcd is critical. If you lose etcd, you lose the cluster.
# Backup (kubeadm does not install etcdctl: install the etcd client tools or run etcdctl from inside the etcd static pod)
ETCDCTL_API=3 etcdctl snapshot save backup.db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
# Verify
ETCDCTL_API=3 etcdctl snapshot status backup.db
Automate this backup with a CronJob or external script. Keep backups off-site.
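For example, a simple cron entry on a control plane node (a sketch: the backup directory and schedule are assumptions to adapt, and the snapshots should then be copied off the node):
# /etc/cron.d/etcd-backup: nightly snapshot at 02:00
0 2 * * * root ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-$(date +\%F).db --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key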
Alternatives to kubeadm
kubeadm is the standard method but not the only one.
Kubespray: Ansible playbook that automates everything. Good if you already know Ansible and want repeatable deployments.
RKE2: Rancher distribution. Includes security hardening, easier to manage than vanilla kubeadm.
K3s: If you don't need all standard Kubernetes features. Much simpler.
Talos Linux: Immutable OS made specifically for Kubernetes. Interesting for those who want security and immutability.
Conclusion
Creating a Kubernetes cluster isn't hard, but requires attention to detail. Decisions made at the beginning (HA, networking, storage) affect everything that comes after.
If you're on cloud and don't have particular requirements, use your provider's managed Kubernetes. It's the pragmatic choice.
If you must go self-hosted, kubeadm is the standard starting point. Invest time in automation (Ansible, Terraform) so you can recreate the cluster quickly if something goes wrong.
And above all: test upgrades and recovery before going to production. Discovering your backup doesn't work during an emergency is an experience you want to avoid.
A Kubernetes cluster is easy to create. A reliable Kubernetes cluster requires planning and ongoing maintenance.