Control plane high availability (HA) keeps the cluster operational when a control plane node (master) fails, as long as etcd retains quorum.

Overview

  • At least 3 masters, so that a majority (2 of 3) survives one failure
  • A load balancer distributes requests across the API servers
  • etcd runs as a cluster (3 or 5 members) for redundancy
  • Only one scheduler and one controller-manager are active at a time (leader election); the others stand by
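
The quorum arithmetic behind the member counts above can be sketched in a few lines of shell. The function names `quorum` and `fault_tolerance` are illustrative, not part of any Kubernetes or etcd tool.

```shell
#!/usr/bin/env bash
# Majority (quorum) needed for a cluster of N etcd members.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while a quorum still remains.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A 4-member cluster tolerates the same single failure as a 3-member one, which is why odd sizes (3 or 5) are the usual choice.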

HA Architecture

kubectl/Clients
   ↓
Load Balancer (HAProxy/Nginx)
   :6443
   ↓
+----------+----------+----------+
|          |          |          |
Master01   Master02   Master03
(API)      (API)      (API)
(Sched)    (Sched*)   (Sched*)
(CtrlMgr)  (CtrlMgr*) (CtrlMgr*)
|          |          |          |
+----------+----------+----------+
            |
         etcd cluster
      (3 or 5 members)

* = standby (leader election)
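
To see which scheduler and controller-manager instance is currently active, one option is to read the leader election Leases. This is a minimal sketch, assuming the components use Lease objects named `kube-scheduler` and `kube-controller-manager` in `kube-system` (the default in recent releases); the helper names `lease_holder` and `show_leaders` are hypothetical.

```shell
# Print the current leader recorded in a kube-system Lease.
# $1: lease name, e.g. kube-scheduler or kube-controller-manager
lease_holder() {
  kubectl -n kube-system get lease "$1" -o jsonpath='{.spec.holderIdentity}'
}

# Show the active instance of each leader-elected component.
show_leaders() {
  local component
  for component in kube-scheduler kube-controller-manager; do
    echo "$component: $(lease_holder "$component")"
  done
}

# Usage (requires a working kubeconfig):
#   show_leaders
```

The holder identity normally starts with the hostname of the master running the active instance.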

Initialize the First Master

kubeadm init --control-plane-endpoint "loadbalancer.example.com:6443" \
  --upload-certs \
  --pod-network-cidr=10.244.0.0/16

Join Additional Masters

# On Master02 and Master03
kubeadm join loadbalancer.example.com:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane \
  --certificate-key <cert-key>

HAProxy Config

# /etc/haproxy/haproxy.cfg
frontend k8s-api
    bind *:6443
    mode tcp
    option tcplog
    default_backend k8s-api-backend

backend k8s-api-backend
    mode tcp
    balance roundrobin
    option tcp-check
    server master01 10.0.0.11:6443 check
    server master02 10.0.0.12:6443 check
    server master03 10.0.0.13:6443 check
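
A quick way to confirm that each backend and the load balancer itself answer is to probe the API server's `/readyz` endpoint. This is a sketch under assumptions: anonymous access to `/readyz` is allowed (the kubeadm default), and `probe_apiserver` is a hypothetical helper name.

```shell
# Probe the /readyz endpoint of one API server (or of the load balancer).
# $1: host or IP; prints "<host> <response body>" (the body is "ok" when ready)
# or "<host> unreachable" when the connection fails.
# -k skips TLS verification for brevity; prefer --cacert /etc/kubernetes/pki/ca.crt.
probe_apiserver() {
  local body
  if body=$(curl -ks --max-time 3 "https://$1:6443/readyz"); then
    echo "$1 $body"
  else
    echo "$1 unreachable"
  fi
}

# Usage: probe each backend from the HAProxy config, then the load balancer.
#   for host in 10.0.0.11 10.0.0.12 10.0.0.13 loadbalancer.example.com; do
#     probe_apiserver "$host"
#   done
```

The config file itself can be validated before a reload with `haproxy -c -f /etc/haproxy/haproxy.cfg`.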

External etcd Cluster

# Configure etcd on 3 separate nodes (values shown for etcd01)
ETCD_NAME=etcd01
ETCD_INITIAL_CLUSTER="etcd01=https://10.0.0.21:2380,etcd02=https://10.0.0.22:2380,etcd03=https://10.0.0.23:2380"

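# In kubeadm-config.yaml (use with: kubeadm init --config kubeadm-config.yaml --upload-certs)
# apiVersion is an assumption; use the version your kubeadm release supports
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "loadbalancer.example.com:6443"
networking:
  podSubnet: 10.244.0.0/16
etcd:
  external:
    endpoints:
      - https://10.0.0.21:2379
      - https://10.0.0.22:2379
      - https://10.0.0.23:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key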

Verify Cluster Health

# Status of the masters
kubectl get nodes -l node-role.kubernetes.io/control-plane

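# etcd membership
ETCDCTL_API=3 etcdctl member list \
  --endpoints=https://10.0.0.21:2379,https://10.0.0.22:2379,https://10.0.0.23:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Endpoint health of every member (needs the same endpoint and TLS flags)
ETCDCTL_API=3 etcdctl endpoint health --cluster \
  --endpoints=https://10.0.0.21:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key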

Failover Test

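# Simulate a failure on one master.
# Note: stopping only the kubelet marks the node NotReady but leaves the
# static pod containers (API server, etcd) running; power the node off
# for a full failure test.
systemctl stop kubelet
# or: poweroff

# From another machine, verify that the cluster keeps working through the load balancer
kubectl get nodes
kubectl get pods --all-namespaces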
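
To quantify how the API behaves while a master is down, a small polling loop can count failed requests. This is only a sketch: `count_api_failures` is a hypothetical helper, and it assumes the kubeconfig points at the load balancer endpoint.

```shell
# Poll the API server and count failed requests.
# $1: number of probes (default 30)
# $2: seconds to wait between probes (default 1)
count_api_failures() {
  local total=${1:-30} interval=${2:-1} failures=0 i
  for i in $(seq 1 "$total"); do
    # "kubectl get --raw" issues a plain GET against the given API path.
    kubectl get --raw='/readyz' >/dev/null 2>&1 || failures=$((failures + 1))
    sleep "$interval"
  done
  echo "$failures/$total probes failed"
}

# Usage: start this on a workstation, then take one master down.
#   count_api_failures 60 1
```

A few failed probes right after the outage are plausible until the load balancer health check marks the backend as down.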

Regenerate the Certificate Key for HA

kubeadm init phase upload-certs --upload-certs
# Copy the generated certificate key (it expires after two hours)

# Join a new master
kubeadm join <endpoint> \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane \
  --certificate-key <key>
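
When the original token has also expired, both steps can be combined into one helper that prints a ready-to-use join command. This is a sketch that assumes the certificate key is the last line printed by `upload-certs`; `print_control_plane_join` is a hypothetical name.

```shell
# Re-upload the control plane certificates and print a complete
# "kubeadm join ... --control-plane" command. Run on an existing master as root.
print_control_plane_join() {
  local cert_key
  # Assumption: the last line of the upload-certs output is the new certificate key.
  cert_key=$(kubeadm init phase upload-certs --upload-certs | tail -n 1)
  # Create a fresh bootstrap token and print the join command with the key attached.
  kubeadm token create --print-join-command --certificate-key "$cert_key"
}

# Usage:
#   print_control_plane_join
```

Run the printed command on the new master within the key's validity window.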