Longhorn: Distributed Storage for Kubernetes - A Complete Guide

Storage is one of the hardest problems in Kubernetes: containers are ephemeral, but data has to persist. Longhorn, a CNCF Incubating project, addresses this with distributed block storage that is resilient and easy to operate. This guide shows how to install and configure Longhorn on your Kubernetes cluster.

The Storage Problem in Kubernetes

Ephemeral Storage

By default, Pod storage is ephemeral:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      # Everything written to /data is lost when the container restarts!

Traditional Solutions

Solution       | Pros                 | Cons
---------------|----------------------|-------------------------
hostPath       | Simple               | Not portable, no HA
NFS            | Shared               | Single point of failure
Cloud provider | Managed              | Vendor lock-in, cost
Ceph           | Powerful             | High complexity
Longhorn       | Simple + distributed | Needs node resources

What Is Longhorn

Longhorn is a distributed block storage system for Kubernetes that:

  • Uses the nodes' local disks
  • Replicates data across multiple nodes
  • Manages snapshots and backups
  • Ships with a built-in web UI
  • Supports disaster recovery (DR) and migration
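Replication has a direct cost in raw disk space, because each replica is a full copy of the volume. The sketch below works through that arithmetic. The function names are illustrative and not part of Longhorn, and snapshot overhead is deliberately ignored.

```shell
#!/bin/sh
# Hypothetical helpers for back-of-the-envelope sizing; not part of Longhorn.

# Upper bound on raw disk consumed by one volume: size x replicas.
# Snapshots can add to this and are ignored here.
raw_capacity_gib() {
  size_gib=$1
  replicas=$2
  echo $(( size_gib * replicas ))
}

# A volume stays available while at least one healthy replica remains,
# so it can lose (replicas - 1) replicas.
tolerated_replica_failures() {
  echo $(( $1 - 1 ))
}

raw_capacity_gib 10 3            # prints 30
tolerated_replica_failures 3     # prints 2
```

A 10 GiB volume with 3 replicas can therefore take up to 30 GiB across the cluster, which is worth keeping in mind when sizing node disks.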

Architecture

[Figure: Longhorn architecture: distributed storage with replicas on the nodes' local disks. The diagram shows a Kubernetes cluster with Longhorn Manager, Engine, replicas, and local disks.]

Main Components

Component        | Function
-----------------|-----------------------------------
Longhorn Manager | Orchestration, API, UI
Longhorn Engine  | iSCSI target, manages replicas
Replica          | Copy of the data on a local disk
CSI Driver       | Integration with Kubernetes

Requirements

Hardware

Resource     | Minimum   | Recommended
-------------|-----------|------------------
Nodes        | 3         | 3+
CPU per node | 2 cores   | 4+ cores
RAM per node | 4 GB      | 8+ GB
Disk         | SSD 50 GB | SSD/NVMe 200+ GB

Software

Kubernetes: 1.25+
OS: Ubuntu 20.04+, RHEL 8+, SLES 15+
Filesystem: ext4, XFS

Node Prerequisites

# Every node needs open-iscsi
# Ubuntu/Debian
sudo apt install open-iscsi
sudo systemctl enable iscsid
sudo systemctl start iscsid

# RHEL/CentOS
sudo yum install iscsi-initiator-utils
sudo systemctl enable iscsid
sudo systemctl start iscsid
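Before installing, it can help to confirm on each node that the initiator tools are present and the daemon is running. The following is a minimal sketch of such a check. The function names `have_cmd` and `check_iscsi` are made up for this example, and it assumes the node uses systemd when `systemctl` is available.

```shell
#!/bin/sh
# Hypothetical preflight helper; run it on each node before installing Longhorn.

# Succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Checks for the iSCSI initiator tools and, where systemctl is present,
# that the iscsid service is active.
check_iscsi() {
  if ! have_cmd iscsiadm; then
    echo "missing: iscsiadm (install open-iscsi or iscsi-initiator-utils)"
    return 1
  fi
  if have_cmd systemctl && ! systemctl is-active --quiet iscsid; then
    echo "iscsid is installed but not active"
    return 1
  fi
  echo "iSCSI prerequisites look OK"
}

# Usage on a node:
#   check_iscsi || echo "fix the issue above before installing Longhorn"
```

The check is defined as a function so that sourcing the file has no side effects; call `check_iscsi` explicitly on each node.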

Installation

Method 1: Helm (Recommended)

# Add the chart repository
helm repo add longhorn https://charts.longhorn.io
helm repo update

# Install
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --set defaultSettings.defaultDataPath="/var/lib/longhorn" \
  --set defaultSettings.defaultReplicaCount=3
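As the number of settings grows, a values file is usually easier to review and version than a chain of `--set` flags. The sketch below writes the two settings from the command above into a file and wraps the install in a function. The file name `longhorn-values.yaml` and the function name `install_longhorn` are arbitrary choices for this example.

```shell
#!/bin/sh
# Write the settings from the --set flags above into a values file.
# The file name is an arbitrary choice.
cat > longhorn-values.yaml <<'EOF'
defaultSettings:
  defaultDataPath: /var/lib/longhorn
  defaultReplicaCount: 3
EOF

# Install, or upgrade if the release already exists.
# Defined as a function so nothing touches the cluster until you call it.
install_longhorn() {
  helm upgrade --install longhorn longhorn/longhorn \
    --namespace longhorn-system \
    --create-namespace \
    --values longhorn-values.yaml
}

# Usage: install_longhorn
```

Using `helm upgrade --install` makes the command safe to re-run after editing the values file. Adding Helm's `--version` flag would pin the chart release.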

Method 2: kubectl

kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.6.0/deploy/longhorn.yaml

Verifying the Installation

# Check the pods
kubectl -n longhorn-system get pods

# Expected output
NAME                                        READY   STATUS    RESTARTS
longhorn-manager-xxxxx                      1/1     Running   0
longhorn-driver-deployer-xxxxx              1/1     Running   0
longhorn-ui-xxxxx                           1/1     Running   0
engine-image-ei-xxxxx                       1/1     Running   0
instance-manager-xxxxx                      1/1     Running   0

Accessing the UI

# Port forward
kubectl -n longhorn-system port-forward svc/longhorn-frontend 8080:80

# Open http://localhost:8080

StorageClass Configuration

Default StorageClass

Longhorn automatically creates a StorageClass:

kubectl get storageclass
# NAME                 PROVISIONER          RECLAIMPOLICY
# longhorn (default)   driver.longhorn.io   Delete

Custom StorageClass

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-ssd
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Retain
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  fsType: "ext4"
  dataLocality: "best-effort"

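Key Parameters

Parameter        | Description                                          | Default
-----------------|------------------------------------------------------|---------
numberOfReplicas | Number of replicas                                   | 3
dataLocality     | Data locality (disabled, best-effort, strict-local)  | disabled
diskSelector     | Restrict replicas to disks with the given tags       | -
nodeSelector     | Restrict replicas to nodes with the given tags       | -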

Creating and Using Volumes

PersistentVolumeClaim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi

Using the Volume in a Pod

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-data

StatefulSet with Longhorn

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          env:
            - name: POSTGRES_PASSWORD
              value: "password"  # demo only; use a Secret (secretKeyRef) in production
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: longhorn
        resources:
          requests:
            storage: 20Gi

Backup and Disaster Recovery

Configuring the Backup Target

Longhorn supports backups to S3 or NFS.

S3 backup:

# Secret holding the S3 credentials
apiVersion: v1
kind: Secret
metadata:
  name: s3-secret
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "your-access-key"
  AWS_SECRET_ACCESS_KEY: "your-secret-key"
# Configure via the UI or CLI
# Settings > Backup Target
# s3://bucket-name@region/path
# Settings > Backup Target Credential Secret: s3-secret

NFS backup:

nfs://server-ip:/path/to/backup
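The two URL formats are easy to get wrong, especially the `@region` part of the S3 form. The helpers below simply assemble the strings in the formats shown above. The function names and the sample bucket, region, server, and paths are placeholders invented for this example.

```shell
#!/bin/sh
# Hypothetical helpers that assemble backup target URLs in the formats shown above.

# s3_backup_target BUCKET REGION PATH  ->  s3://BUCKET@REGION/PATH
s3_backup_target() {
  printf 's3://%s@%s/%s\n' "$1" "$2" "$3"
}

# nfs_backup_target SERVER EXPORT_PATH  ->  nfs://SERVER:EXPORT_PATH
nfs_backup_target() {
  printf 'nfs://%s:%s\n' "$1" "$2"
}

# Placeholder values, for illustration only.
s3_backup_target my-bucket eu-west-1 longhorn-backups    # prints s3://my-bucket@eu-west-1/longhorn-backups
nfs_backup_target 10.0.0.5 /srv/longhorn-backups         # prints nfs://10.0.0.5:/srv/longhorn-backups
```

The resulting string is what goes into the Backup Target setting.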

Creating a Manual Backup

apiVersion: longhorn.io/v1beta2
kind: Backup
metadata:
  name: my-data-backup
  namespace: longhorn-system
spec:
  snapshotName: my-data-snapshot
  labels:
    app: my-app
    type: manual

Recurring Backups

apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"  # Every day at 02:00
  task: backup
  retain: 7
  concurrency: 1
  groups:
    - default

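Restoring from a Backup

1. Via the UI: Volumes > Create Volume > From Backup

2. Via a StorageClass and PVC:

# A PVC dataSource accepts only VolumeSnapshot or PersistentVolumeClaim,
# so the restore goes through a StorageClass with fromBackup set.
# Copy the backup URL from the UI (Backup > Get URL); the value below is a placeholder.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-restore
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: "s3://bucket-name@region/path?volume=<volume-name>&backup=<backup-name>"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-restore
  resources:
    requests:
      storage: 10Gi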

Snapshots

Creating a Snapshot

apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: my-data-snap-1
  namespace: longhorn-system
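spec:
  volume: my-data  # Longhorn volume name (for dynamically provisioned PVCs, usually pvc-<uid>)
  createSnapshot: true  # ask Longhorn to take the snapshot
  labels:
    app: my-app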

Recurring Snapshots

apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: hourly-snapshot
  namespace: longhorn-system
spec:
  cron: "0 * * * *"  # Every hour
  task: snapshot
  retain: 24  # Keep the latest 24
  concurrency: 2
  groups:
    - default
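The `cron` and `retain` fields together determine how far back you can recover. As a rough sketch, assuming every scheduled run succeeds, the history window is the job interval multiplied by `retain`. The helper name below is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: how far back a recurring job's history reaches,
# assuming every scheduled run succeeds.
# retention_window_hours INTERVAL_HOURS RETAIN
retention_window_hours() {
  echo $(( $1 * $2 ))
}

retention_window_hours 1 24    # hourly snapshots, retain 24: prints 24
retention_window_hours 24 7    # daily backups, retain 7: prints 168 (7 days)
```

With the hourly snapshot job above, that gives one day of local restore points. A daily backup job with `retain: 7` covers one week.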

Monitoring

Prometheus Metrics

Longhorn exposes Prometheus metrics:

# ServiceMonitor for the Prometheus Operator
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: longhorn
  namespace: longhorn-system
spec:
  selector:
    matchLabels:
      app: longhorn-manager
  endpoints:
    - port: manager

Key Metrics

Metric                               | Description
-------------------------------------|---------------------------------
longhorn_volume_actual_size_bytes    | Actual space used by the volume
longhorn_volume_capacity_bytes       | Configured volume capacity
longhorn_volume_state                | Volume state
longhorn_node_storage_capacity_bytes | Node storage capacity
longhorn_node_storage_usage_bytes    | Node storage usage
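The two node-level metrics are most useful as a ratio. The sketch below computes the percentage of node storage in use from a pair of sampled values. The function name and the sample numbers are made up, and the PromQL line in the comment is an assumption about how the same ratio could be expressed, not a query taken from the Longhorn documentation.

```shell
#!/bin/sh
# Hypothetical helper: percentage of a node's storage in use, from the two
# node-level metrics above. In PromQL the same ratio would be along the lines of:
#   longhorn_node_storage_usage_bytes / longhorn_node_storage_capacity_bytes * 100
storage_used_pct() {
  awk -v used="$1" -v cap="$2" 'BEGIN {
    if (cap <= 0) { print "n/a"; exit }
    printf "%.1f\n", used / cap * 100
  }'
}

# Example with made-up values: 50 GiB used out of 200 GiB.
storage_used_pct 53687091200 214748364800   # prints 25.0
```

A ratio like this is a natural basis for an alert threshold, since running out of node disk space is one of the common causes of degraded volumes.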

Grafana Dashboard

Import dashboard ID 13032 (Longhorn Dashboard).

Production Best Practices

Replica Count

# At least 3 replicas for HA
parameters:
  numberOfReplicas: "3"

Node Scheduling

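Make sure replicas are spread across nodes (and zones, if you have them):

# Settings > Replica Node Level Soft Anti-Affinity: false
#   (when true, Longhorn may place replicas of the same volume on the same node)
# Settings > Replica Zone Level Soft Anti-Affinity: false
#   (only if nodes are labeled across several zones; true lets replicas share a zone)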

Disk Scheduling

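# Disk tags live on Longhorn's Node resource (or in the UI: Node > Edit node and disks),
# not on Kubernetes node labels. Add tags: ["ssd"] under spec.disks.<disk-name>:
kubectl -n longhorn-system edit nodes.longhorn.io node1
# Then select the tag in the StorageClass:
parameters:
  diskSelector: "ssd"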

Backup Policy

# Daily backup with 7-day retention
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 3 * * *"  # Every day at 03:00
  task: backup
  retain: 7
  concurrency: 1
  groups:
    - default

Troubleshooting

Volume Degraded

# Check volume state
kubectl -n longhorn-system get volumes.longhorn.io

# Details
kubectl -n longhorn-system describe volumes.longhorn.io my-volume

Common causes:

  • Node down
  • Disk full
  • Corrupted replica
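On a cluster with many volumes it is quicker to list only the ones that need attention. The sketch below filters the volume list by robustness. It assumes the Longhorn Volume resource reports its health in `.status.robustness` with the value `healthy` for fully replicated volumes, so check that against your Longhorn version. The function names are made up for this example.

```shell
#!/bin/sh
# Keep only rows whose third column (ROBUSTNESS) is not "healthy";
# the first input line is the header and is skipped.
filter_unhealthy() {
  awk 'NR > 1 && $3 != "healthy" { print $1, $3 }'
}

# Lists volumes that are not healthy. Requires cluster access; call it explicitly.
# Assumption: the Volume resource exposes .status.state and .status.robustness.
list_unhealthy_volumes() {
  kubectl -n longhorn-system get volumes.longhorn.io \
    -o custom-columns=NAME:.metadata.name,STATE:.status.state,ROBUSTNESS:.status.robustness \
    | filter_unhealthy
}

# Usage: list_unhealthy_volumes
```

Detached volumes may also show up in the output if they do not report `healthy`, so read the result alongside the volume state.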

Slow Replica Rebuild

# Raise the concurrent rebuild limit if node I/O and network bandwidth allow it
# Settings > Concurrent Replica Rebuild Per Node Limit (default: 5)

Insufficient Space

# Check space per node
kubectl -n longhorn-system get nodes.longhorn.io -o wide

Solutions:

  • Add disks
  • Delete old snapshots
  • Temporarily reduce the replica count

Comparison with Alternatives

Feature     | Longhorn   | Rook-Ceph  | OpenEBS    | Portworx
------------|------------|------------|------------|-----------
Complexity  | Low        | High       | Medium     | Medium
Performance | Good       | Excellent  | Good       | Excellent
Built-in UI | Yes        | No         | Yes        | Yes
S3 backup   | Yes        | Yes        | Yes        | Yes
License     | Apache 2.0 | Apache 2.0 | Apache 2.0 | Commercial
CNCF        | Incubating | Graduated  | Sandbox    | No

When to choose Longhorn:

  • Small to medium clusters
  • Teams with limited Kubernetes experience
  • Need for a quick setup
  • Limited budget

Conclusions

Longhorn is a strong fit for teams that want distributed storage on Kubernetes without the complexity of Ceph or the cost of enterprise solutions.

Implementation checklist:

  • Node prerequisites (open-iscsi)
  • Installation via Helm
  • StorageClass configuration
  • Backup target setup (S3/NFS)
  • Recurring backups configured
  • Monitoring in place
  • Restore test

Resources