Longhorn: Distributed Storage for Kubernetes - A Complete Guide
Storage is one of the hardest problems in Kubernetes. Containers are ephemeral, but data must persist. Longhorn, a CNCF Incubating project, addresses this with distributed, resilient block storage that is easy to operate. This guide covers installing and configuring Longhorn on your Kubernetes cluster.
The Storage Problem in Kubernetes
Ephemeral Storage
By default, Pod storage is ephemeral:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      # Anything written to /data is lost when the container restarts!
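To make the lifetime explicit, here is a minimal sketch of the same Pod with an `emptyDir` volume mounted at `/data`. The volume name `scratch` is a placeholder chosen for illustration. An `emptyDir` survives container restarts but is deleted together with the Pod, so it still does not provide persistence.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      volumeMounts:
        - name: scratch          # placeholder volume name
          mountPath: /data
  volumes:
    - name: scratch
      emptyDir: {}               # removed when the Pod is deleted
```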
Traditional Solutions
| Solution | Pros | Cons |
|---|---|---|
| hostPath | Simple | Not portable, no HA |
| NFS | Shared | Single point of failure |
| Cloud provider | Managed | Vendor lock-in, cost |
| Ceph | Powerful | High complexity |
| Longhorn | Simple and distributed | Consumes CPU, RAM, and disk on the nodes |
What Is Longhorn
Longhorn is a distributed block storage system for Kubernetes that:
- Uses the nodes' local disks
- Replicates data across multiple nodes
- Manages snapshots and backups
- Provides a built-in web UI
- Supports disaster recovery (DR) and migration
Architecture
Main Components
| Component | Function |
|---|---|
| Longhorn Manager | Orchestration and the API that backs the UI |
| Longhorn Engine | Volume controller: exposes the volume over iSCSI and manages its replicas |
| Replica | Copy of the data on a local disk |
| CSI Driver | Integration with Kubernetes |
Requirements
Hardware
| Resource | Minimum | Recommended |
|---|---|---|
| Nodes | 3 | 3+ |
| CPU per node | 2 cores | 4+ cores |
| RAM per node | 4 GB | 8+ GB |
| Disk | SSD, 50 GB | SSD/NVMe, 200+ GB |
Software
Kubernetes: 1.25+
OS: Ubuntu 20.04+, RHEL 8+, SLES 15+
Filesystem: ext4, XFS
Node Prerequisites
# Every node must have open-iscsi installed and iscsid running
# Ubuntu/Debian
sudo apt install open-iscsi
sudo systemctl enable iscsid
sudo systemctl start iscsid
# RHEL/CentOS
sudo yum install iscsi-initiator-utils
sudo systemctl enable iscsid
sudo systemctl start iscsid
Installation
Method 1: Helm (Recommended)
# Add the chart repository
helm repo add longhorn https://charts.longhorn.io
helm repo update
# Install
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --set defaultSettings.defaultDataPath="/var/lib/longhorn" \
  --set defaultSettings.defaultReplicaCount=3
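The same two settings can live in a values file instead of `--set` flags, which is easier to review and keep in version control. This is a minimal sketch; the file name `values.yaml` is an assumption, and the keys simply mirror the flags used in the command.

```yaml
# values.yaml (hypothetical file name)
defaultSettings:
  defaultDataPath: /var/lib/longhorn   # where replicas are stored on each node
  defaultReplicaCount: 3               # replicas for volumes created without an explicit count
```

It would be passed with `helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace -f values.yaml`.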
Method 2: kubectl
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.6.0/deploy/longhorn.yaml
Verifying the Installation
# Check the pods
kubectl -n longhorn-system get pods
# Expected output (abridged)
NAME                             READY   STATUS    RESTARTS
longhorn-manager-xxxxx           1/1     Running   0
longhorn-driver-deployer-xxxxx   1/1     Running   0
longhorn-ui-xxxxx                1/1     Running   0
engine-image-ei-xxxxx            1/1     Running   0
instance-manager-xxxxx           1/1     Running   0
Accessing the UI
# Port-forward the UI service
kubectl -n longhorn-system port-forward svc/longhorn-frontend 8080:80
# Then open http://localhost:8080
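Port forwarding is fine for a quick look. For longer-lived access, one option is an Ingress in front of the same `longhorn-frontend` service. The sketch below assumes an NGINX ingress controller is installed; the Ingress name and the hostname `longhorn.example.com` are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: longhorn-ui                 # hypothetical name
  namespace: longhorn-system
spec:
  ingressClassName: nginx           # assumption: NGINX ingress controller
  rules:
    - host: longhorn.example.com    # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: longhorn-frontend
                port:
                  number: 80
```

The UI has no authentication enabled by default, so put authentication in front of it (for example at the ingress controller) before exposing it outside the cluster.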
StorageClass Configuration
Default StorageClass
Longhorn automatically creates a StorageClass:
kubectl get storageclass
# NAME                 PROVISIONER          RECLAIMPOLICY
# longhorn (default)   driver.longhorn.io   Delete
Custom StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-ssd
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Retain
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"   # minutes (48 hours)
  fromBackup: ""
  fsType: "ext4"
  dataLocality: "best-effort"
Key Parameters
| Parameter | Description | Default |
|---|---|---|
| numberOfReplicas | Number of replicas | 3 |
| dataLocality | Data locality (disabled, best-effort, strict-local) | disabled |
| diskSelector | Restricts replicas to disks carrying the given tags | - |
| nodeSelector | Restricts replicas to nodes carrying the given tags | - |
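As an illustration of the two selectors, the sketch below defines a StorageClass whose replicas may only land on disks tagged `ssd` that sit on nodes tagged `storage`. The class name and both tag values are assumptions; the tags must already be set on the Longhorn disks and nodes, otherwise volumes cannot be scheduled.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-tagged             # hypothetical name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  dataLocality: "best-effort"
  diskSelector: "ssd"               # comma-separated Longhorn disk tags
  nodeSelector: "storage"           # comma-separated Longhorn node tags
```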
Creating and Using Volumes
PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi
Using the Volume in a Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-data
StatefulSet with Longhorn
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          env:
            - name: POSTGRES_PASSWORD
              value: "password"   # demo only; use a Secret in production
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: longhorn
        resources:
          requests:
            storage: 20Gi
Backup and Disaster Recovery
Configuring the Backup Target
Longhorn supports backups to S3 or NFS.
S3 backup:
# Secret holding the S3 credentials
apiVersion: v1
kind: Secret
metadata:
  name: s3-secret
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "your-access-key"
  AWS_SECRET_ACCESS_KEY: "your-secret-key"
# Configure the target in the UI or from the CLI
# Settings > Backup Target
# s3://bucket-name@region/path
# Settings > Backup Target Credential Secret: s3-secret
NFS backup:
nfs://server-ip:/path/to/backup
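If Longhorn is installed with Helm, the backup target can also be declared at install time rather than in the UI. The fragment below is a sketch that assumes the v1.6 chart, where both keys sit under `defaultSettings`; later chart versions reorganize backup target configuration, so check the values reference for the version you deploy. The bucket, region, and path are the placeholders from the URL format above.

```yaml
# Helm values fragment (assumes the Longhorn v1.6 chart layout)
defaultSettings:
  backupTarget: s3://bucket-name@region/path    # placeholder bucket, region, and path
  backupTargetCredentialSecret: s3-secret       # Secret in the longhorn-system namespace
```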
Creating a Manual Backup
apiVersion: longhorn.io/v1beta2
kind: Backup
metadata:
  name: my-data-backup
  namespace: longhorn-system
spec:
  snapshotName: my-data-snapshot
  labels:
    app: my-app
    type: manual
Recurring Backups
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"   # Every day at 02:00
  task: backup
  retain: 7
  concurrency: 1
  groups:
    - default
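A RecurringJob only runs against volumes that are associated with it. Volumes with no recurring job of their own fall into the `default` group; to bind a job to every volume created from a given StorageClass, Longhorn accepts a `recurringJobSelector` parameter. The sketch below uses a hypothetical class name and refers to the `daily-backup` job by name.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-daily-backup       # hypothetical name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  # JSON list: isGroup false selects a job by name, true selects a job group
  recurringJobSelector: '[{"name":"daily-backup","isGroup":false}]'
```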
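Restoring from a Backup
1. Via the UI: Volumes > Create Volume > From Backup
2. Via a PVC:
# Note: confirm that your Longhorn version accepts a Backup dataSource.
# The restore routes described in the Longhorn documentation are the UI,
# the StorageClass fromBackup parameter, and CSI VolumeSnapshots.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi
  dataSource:
    name: my-data-backup
    kind: Backup
    apiGroup: longhorn.io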
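A declarative restore can also go through the `fromBackup` StorageClass parameter, the same one left empty in the custom StorageClass earlier. In the sketch below the class name is hypothetical and the backup URL is a placeholder: the real value combines the backup target with the volume and backup names, and can be copied from the backup's entry in the Longhorn UI.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-restore            # hypothetical name
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  # Placeholder URL: substitute the real volume and backup names
  fromBackup: "s3://bucket-name@region/path?volume=<volume-name>&backup=<backup-name>"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-restore
  resources:
    requests:
      storage: 10Gi                 # assumed to be at least the size of the backed-up volume
```

Every PVC created from this class restores the same backup, so a class like this is normally a one-off that is deleted once the restore is done.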
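Snapshots
Creating a Snapshot
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: my-data-snap-1
  namespace: longhorn-system
spec:
  volume: my-data          # Longhorn volume name (for a PVC this is usually pvc-<uid>)
  createSnapshot: true     # ask Longhorn to take a new snapshot
  labels:
    app: my-app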
Recurring Snapshots
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: hourly-snapshot
  namespace: longhorn-system
spec:
  cron: "0 * * * *"   # Every hour
  task: snapshot
  retain: 24          # Keep the latest 24
  concurrency: 2
  groups:
    - default
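Snapshots can also be requested through the standard Kubernetes CSI snapshot API, which lets you refer to the PVC instead of the underlying Longhorn volume. This sketch assumes the CSI snapshot CRDs and the snapshot controller are installed in the cluster and that the Longhorn release supports the `type: snap` parameter (in-cluster snapshot; `bak` would create a backup instead). Both object names are hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-snapshot-vsc       # hypothetical name
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: snap                        # in-cluster Longhorn snapshot
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-data-csi-snap            # hypothetical name
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    persistentVolumeClaimName: my-data
```

The VolumeSnapshot must be created in the same namespace as the `my-data` PVC.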
Monitoring
Prometheus Metrics
Longhorn exposes Prometheus metrics:
# ServiceMonitor for the Prometheus Operator
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: longhorn
  namespace: longhorn-system
spec:
  selector:
    matchLabels:
      app: longhorn-manager
  endpoints:
    - port: manager
Key Metrics
| Metric | Description |
|---|---|
| longhorn_volume_actual_size_bytes | Space actually used by the volume |
| longhorn_volume_capacity_bytes | Configured volume capacity |
| longhorn_volume_state | Volume state |
| longhorn_node_storage_capacity_bytes | Node storage capacity |
| longhorn_node_storage_usage_bytes | Node storage usage |
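These metrics combine naturally into alerts. The sketch below is a PrometheusRule for the Prometheus Operator that fires when a node's Longhorn storage is more than 70% used; it relies only on the two node metrics in the table. The rule name, the 70% threshold, and the 5 minute window are assumptions to tune, and your Prometheus instance may need extra labels on the object to pick it up.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: longhorn-alerts             # hypothetical name
  namespace: longhorn-system
spec:
  groups:
    - name: longhorn.rules
      rules:
        - alert: LonghornNodeStorageWarning
          # Percentage of node storage in use
          expr: (longhorn_node_storage_usage_bytes / longhorn_node_storage_capacity_bytes) * 100 > 70
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: Longhorn node storage usage is above 70%
```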
Grafana Dashboard
Import dashboard ID 13032 (Longhorn Dashboard).
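Production Best Practices
Replica Count
# At least 3 replicas for HA
parameters:
  numberOfReplicas: "3"
Node Scheduling
Make sure replicas are spread across nodes and zones:
# Settings > Replica Node Level Soft Anti-Affinity: false (never place two replicas of a volume on the same node)
# Settings > Replica Zone Level Soft Anti-Affinity: true if you have fewer zones than replicas; false enforces one replica per zone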
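The two anti-affinity settings can also be pinned as Helm values. Note the polarity: "soft" anti-affinity set to `true` permits replicas of one volume to share a node or zone when no other placement is available, while `false` makes Longhorn refuse. The fragment below is a sketch for a cluster with at least three nodes but fewer zones than replicas; on an existing installation, changing the values in the UI may be the more reliable route.

```yaml
# Helm values fragment
defaultSettings:
  replicaSoftAntiAffinity: false      # never co-locate two replicas of a volume on one node
  replicaZoneSoftAntiAffinity: true   # allow sharing a zone when there are fewer zones than replicas
```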
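Disk Scheduling
# Disk tags are Longhorn tags, not Kubernetes node labels. Set them in the
# Longhorn UI (Node > Edit Node and Disks) or on the Longhorn Node resource:
kubectl -n longhorn-system edit nodes.longhorn.io node1
# Then select the tagged disks in the StorageClass
parameters:
  diskSelector: "ssd"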
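For reference, this is roughly where tags live on a Longhorn Node resource (`nodes.longhorn.io`). It is an excerpt meant to show the fields to edit, not a manifest to apply as-is: the disk key `default-disk-xxxx` is a placeholder for the generated name already present on your node, and the tag values are assumptions.

```yaml
# Excerpt of a nodes.longhorn.io object, showing only the tag-related fields
apiVersion: longhorn.io/v1beta2
kind: Node
metadata:
  name: node1
  namespace: longhorn-system
spec:
  tags:
    - storage                  # node tag, matched by nodeSelector
  disks:
    default-disk-xxxx:         # placeholder: keep the key that already exists
      path: /var/lib/longhorn/
      allowScheduling: true
      tags:
        - ssd                  # disk tag, matched by diskSelector
```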
Backup Policy
# Daily backup, keeping the 7 most recent backups (about 7 days)
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 3 * * *"
  task: backup
  retain: 7
  concurrency: 1
  groups:
    - default
Troubleshooting
Degraded Volume
# Check volume state
kubectl -n longhorn-system get volumes.longhorn.io
# Details
kubectl -n longhorn-system describe volumes.longhorn.io my-volume
Common causes:
- Node down
- Disk full
- Corrupted replica
Slow Replica Rebuild
# Raise the concurrent rebuild limit
# Settings > Concurrent Replica Rebuild Per Node Limit: 5
Insufficient Space
# Check space per node
kubectl -n longhorn-system get nodes.longhorn.io -o wide
Fixes:
- Add disks
- Delete old snapshots
- Temporarily reduce the replica count
Comparison with Alternatives
| Feature | Longhorn | Rook-Ceph | OpenEBS | Portworx |
|---|---|---|---|---|
| Complexity | Low | High | Medium | Medium |
| Performance | Good | Excellent | Good | Excellent |
| Built-in UI | Yes | No | Yes | Yes |
| S3 backup | Yes | Yes | Yes | Yes |
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 | Commercial |
| CNCF | Incubating | Graduated | Sandbox | No |
When to choose Longhorn:
- Small to medium clusters
- Teams with limited Kubernetes experience
- Need for a quick setup
- Limited budget
Conclusions
Longhorn is a strong fit for anyone who wants distributed storage on Kubernetes without the complexity of Ceph or the cost of enterprise solutions.
Implementation checklist:
- Node prerequisites (open-iscsi)
- Installation via Helm
- StorageClass configuration
- Backup target setup (S3/NFS)
- Recurring backups configured
- Monitoring in place
- Restore test