Setting Up a Kubernetes Cluster: From Theory to Practice
How to create a Kubernetes cluster from scratch. Options compared, step-by-step setup, and decisions to make before starting.
You want to create a Kubernetes cluster. The question is: how? There are many options, and the right choice depends on context. Cloud managed? Self-hosted? Which distribution? How many nodes?
This guide covers the main options and helps you decide. Then we look at a concrete setup with kubeadm for those who want to get their hands dirty.
First Decision: Managed vs Self-Hosted
The first choice is whether to manage Kubernetes yourself or let someone else do it.
Managed Kubernetes (Cloud)
Cloud providers offer Kubernetes as a Service:
- EKS (AWS)
- GKE (Google Cloud)
- AKS (Azure)
- DOKS (DigitalOcean)
Pros:
- Managed control plane — you don't have to worry about etcd, API server, scheduler
- Simplified upgrades
- Native integration with cloud services (load balancer, storage, IAM)
- Support from the provider
Cons:
- Cost — you pay for the control plane in addition to worker nodes
- Vendor lock-in — native integrations tie you to the provider
- Less control — some configurations aren't possible
When to choose it: If you're on cloud and don't want to manage Kubernetes infrastructure. For most enterprise production cases, it's the reasonable choice.
Self-Hosted
You install and manage everything yourself. The options:
- kubeadm — Official tool for bootstrapping clusters
- K3s — Lightweight Kubernetes (discussed in another article)
- RKE2 (Rancher) — Enterprise-ready Kubernetes
- Kubespray — Ansible playbook for deployment
Pros:
- Total control
- No cost for control plane
- Works anywhere (cloud, on-premise, bare metal)
Cons:
- Responsibility for upgrades, backups, high availability
- More operational complexity
- No support (unless you pay for a commercial distribution)
When to choose it: On-premise, edge computing, specific control requirements, or when you want to learn how Kubernetes really works.
Cluster Architecture
Before creating the cluster, some architectural decisions.
How Many Control Plane Nodes?
- 1 node: Ok for development and testing. If it dies, you lose the control plane (and, without an etcd backup, the cluster state).
- 3 nodes: Minimum for production with HA. etcd needs a majority quorum (floor(n/2)+1 members), so 3 members tolerate the loss of 1; always use an odd number.
- 5 nodes: For very critical clusters; 5 members tolerate the loss of 2. More than 5 is rarely needed and only adds replication overhead.
How Many Worker Nodes?
It depends on the workload. Start with what you need and scale later: Kubernetes makes adding nodes easy. A rough sizing example follows the list below.
Consider:
- Resources needed for your pods
- Margin for scheduling (you don't want nodes at 100%)
- Fault tolerance (if one node dies, can the others handle it?)
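As a rough sizing sketch (hypothetical numbers): if your pods request a total of 12 CPUs and 48 GB of RAM, three workers with 8 CPUs / 32 GB each leave roughly 50% headroom and still fit the workload if one node fails (16 CPUs / 64 GB remaining). Two larger nodes with the same total capacity would cover the raw requests but leave almost no margin when one of them goes down.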
Networking CNI
The Container Network Interface (CNI) manages networking between pods. Common options:
- Calico — Probably the most widely deployed. Network policies, optional BGP, flexible.
- Cilium — Based on eBPF, excellent performance, advanced observability and security features.
- Flannel — Simple and lightweight, but no network policy support.
- Weave — Easy to configure, built-in encryption, but no longer actively maintained.
For most cases, Calico is fine. If you have advanced performance or security requirements, look at Cilium.
Setup with kubeadm
Let's look at a practical setup with kubeadm on Ubuntu 22.04. This is the "official" method for creating self-hosted clusters.
Prerequisites (All Nodes)
# Disable swap (the kubelet requires swap to be off by default)
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Load required kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Configure sysctl
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
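A quick sanity check before moving on: verify that the modules are loaded and the sysctl values took effect.
# Verify kernel modules and sysctl settings
lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward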
Install Container Runtime (containerd)
# Install containerd
sudo apt-get update
sudo apt-get install -y containerd
# Configure containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
# Enable SystemdCgroup
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
sudo systemctl restart containerd
sudo systemctl enable containerd
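To double-check that the cgroup driver change took effect and containerd is healthy:
# Confirm SystemdCgroup is enabled and containerd is running
grep SystemdCgroup /etc/containerd/config.toml
systemctl is-active containerd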
Install kubeadm, kubelet, kubectl
# Add Kubernetes repository
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
# Install packages
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
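A quick check that everything is installed at the expected version:
# Verify installed versions
kubeadm version
kubectl version --client
kubelet --version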
Initialize the Control Plane
On the first control plane node:
sudo kubeadm init \
--control-plane-endpoint="LOAD_BALANCER_IP:6443" \
--upload-certs \
--pod-network-cidr=10.244.0.0/16
If you have a single control plane node, you can omit --control-plane-endpoint. The --pod-network-cidr must match the CNI you'll install: 10.244.0.0/16 is Flannel's default, while Calico defaults to 192.168.0.0/16 and can be configured for other ranges.
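For example, a minimal single-control-plane init for a cluster that will use Calico could look like this (a sketch; adjust the CIDR to your environment):
# Single control plane, Calico's default pod CIDR
sudo kubeadm init --pod-network-cidr=192.168.0.0/16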
After completion, kubeadm gives you:
- Command to configure kubectl
- Command to join other control planes
- Command to join workers
# Configure kubectl for your user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install CNI (Calico)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
Wait for Calico pods to be running:
kubectl get pods -n kube-system
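If you'd rather block until the CNI is ready than poll manually, something like this should work (the k8s-app=calico-node label comes from the manifest above):
kubectl wait --namespace kube-system \
  --for=condition=Ready pod \
  --selector k8s-app=calico-node \
  --timeout=300s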
Join Worker Nodes
On each worker node, run the command kubeadm gave you:
sudo kubeadm join CONTROL_PLANE_IP:6443 \
--token TOKEN \
--discovery-token-ca-cert-hash sha256:HASH
If the token has expired, generate a new one on the control plane:
kubeadm token create --print-join-command
Verify the Cluster
kubectl get nodes
You should see all nodes in Ready state.
High Availability
For control plane HA you need at least 3 nodes. Configuration is similar but with some adjustments.
Load Balancer
You need a load balancer in front of the control planes. Options:
- HAProxy
- Nginx
- Cloud load balancer
- kube-vip (software load balancer for Kubernetes)
The load balancer must balance port 6443 (API server) to all control planes.
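As a sketch, a minimal HAProxy configuration that does this (the IPs are hypothetical placeholders):
# /etc/haproxy/haproxy.cfg (excerpt)
frontend kubernetes-api
    bind *:6443
    mode tcp
    option tcplog
    default_backend control-planes
backend control-planes
    mode tcp
    balance roundrobin
    option tcp-check
    server cp1 10.0.0.11:6443 check
    server cp2 10.0.0.12:6443 check
    server cp3 10.0.0.13:6443 check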
HA Setup
# First control plane
sudo kubeadm init \
--control-plane-endpoint="LOAD_BALANCER_IP:6443" \
--upload-certs \
--pod-network-cidr=10.244.0.0/16
# Other control planes
sudo kubeadm join LOAD_BALANCER_IP:6443 \
--token TOKEN \
--discovery-token-ca-cert-hash sha256:HASH \
--control-plane \
--certificate-key CERT_KEY
The --certificate-key is shown in the output of the first kubeadm init.
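Note that the uploaded certificates are deleted after two hours. If the key has expired, you can re-upload them and print a fresh one:
# Re-upload control plane certificates and print a new certificate key
sudo kubeadm init phase upload-certs --upload-certs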
Storage
Kubernetes needs storage for persistent volumes. Options:
Cloud: Use your provider's CSI driver (EBS for AWS, PD for GCP, etc.)
Self-hosted:
- Longhorn — Distributed storage, easy to install
- Rook/Ceph — Powerful but complex
- OpenEBS — Simpler than Ceph, various backend options
- Local Path Provisioner — Local storage, no replication
To start, Local Path Provisioner is fine:
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
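To use it, point your PersistentVolumeClaims at the local-path StorageClass it creates. A minimal example:
# pvc.yaml: claims 1Gi from the local-path StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
Apply it with kubectl apply -f pvc.yaml; the volume is provisioned on the node where the first consuming pod gets scheduled.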
For production, consider Longhorn or your storage's CSI driver.
Monitoring and Logging
A cluster without monitoring is a ticking time bomb.
Metrics Server
Required for kubectl top and autoscaling:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
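On kubeadm clusters, metrics-server often fails to start because the kubelets serve self-signed certificates. If you hit that, a common workaround (weigh the security tradeoff) is to let it skip kubelet certificate verification:
# Allow metrics-server to skip kubelet TLS verification
kubectl patch deployment metrics-server -n kube-system --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'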
Prometheus + Grafana
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack
This installs Prometheus, Grafana, and pre-configured dashboards for Kubernetes.
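To reach Grafana without exposing it, a port-forward is enough. With the release name used above, the service should be called prometheus-grafana (default login is admin / prom-operator unless you changed it):
# Forward Grafana to http://localhost:3000
kubectl port-forward svc/prometheus-grafana 3000:80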
Cluster Upgrade
Kubernetes ships a new minor version roughly every four months (three releases per year). How to upgrade:
Control Plane
# Update the apt repository to the new minor version first
# (pkgs.k8s.io repositories are per minor release: change v1.29 to v1.30
# in /etc/apt/sources.list.d/kubernetes.list, then apt-get update)
# Find available versions
apt-cache madison kubeadm
# Upgrade kubeadm
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm=1.30.0-1.1
sudo apt-mark hold kubeadm
# Check upgrade plan
sudo kubeadm upgrade plan
# Apply upgrade
sudo kubeadm upgrade apply v1.30.0
# Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get update && sudo apt-get install -y kubelet=1.30.0-1.1 kubectl=1.30.0-1.1
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
Worker Nodes
# On control plane: drain the node
# (add --delete-emptydir-data if drain complains about pods using emptyDir)
kubectl drain NODE_NAME --ignore-daemonsets
# On worker: upgrade
sudo apt-mark unhold kubeadm kubelet kubectl
sudo apt-get update && sudo apt-get install -y kubeadm=1.30.0-1.1 kubelet=1.30.0-1.1 kubectl=1.30.0-1.1
sudo apt-mark hold kubeadm kubelet kubectl
sudo kubeadm upgrade node
sudo systemctl daemon-reload
sudo systemctl restart kubelet
# On control plane: uncordon the node
kubectl uncordon NODE_NAME
Repeat for each worker, one at a time.
Backup
Backup of etcd is critical. If you lose etcd, you lose the cluster.
# Backup (kubeadm does not install etcdctl: install the etcd client tools or run etcdctl from inside the etcd static pod)
ETCDCTL_API=3 etcdctl snapshot save backup.db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
# Verify
ETCDCTL_API=3 etcdctl snapshot status backup.db
Automate this backup with a CronJob or external script. Keep backups off-site.
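For example, a simple cron entry on a control plane node (a sketch: the backup directory and schedule are assumptions to adapt, and the snapshots should then be copied off the node):
# /etc/cron.d/etcd-backup: nightly snapshot at 02:00
0 2 * * * root ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-$(date +\%F).db --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key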
Alternatives to kubeadm
kubeadm is the standard method but not the only one.
Kubespray: Ansible playbook that automates everything. Good if you already know Ansible and want repeatable deployments.
RKE2: Rancher distribution. Includes security hardening, easier to manage than vanilla kubeadm.
K3s: If you don't need all standard Kubernetes features. Much simpler.
Talos Linux: Immutable OS made specifically for Kubernetes. Interesting for those who want security and immutability.
Conclusion
Creating a Kubernetes cluster isn't hard, but requires attention to detail. Decisions made at the beginning (HA, networking, storage) affect everything that comes after.
If you're on cloud and don't have particular requirements, use your provider's managed Kubernetes. It's the pragmatic choice.
If you must go self-hosted, kubeadm is the standard starting point. Invest time in automation (Ansible, Terraform) so you can recreate the cluster quickly if something goes wrong.
And above all: test upgrades and recovery before going to production. Discovering your backup doesn't work during an emergency is an experience you want to avoid.
A Kubernetes cluster is easy to create. A reliable Kubernetes cluster requires planning and ongoing maintenance.