Control Plane High Availability keeps the cluster operational even if one or more masters fail.
General Concept
- At least 3 masters for quorum (tolerates 1 failure)
- A load balancer distributes requests across the API servers
- etcd runs as a cluster (3 or 5 members) for redundancy
- Only one scheduler and one controller-manager are active at a time (leader election); the others stand by
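The member counts above follow from majority quorum: a cluster of n members needs floor(n/2) + 1 of them to agree. A minimal sketch of that arithmetic (the function names `quorum` and `tolerance` are illustrative, not part of any tool):

```shell
# Quorum arithmetic for a majority-based cluster such as etcd.
# quorum(n)    = floor(n/2) + 1   members that must agree
# tolerance(n) = n - quorum(n)    members that may fail

quorum() {
  echo $(( $1 / 2 + 1 ))
}

tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerance "$n")"
done
```

With 3 members the quorum is 2 and one failure is tolerated; with 5 the quorum is 3 and two failures are tolerated. An even count such as 4 tolerates no more failures than 3, which is why odd sizes are the usual choice.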
HA Architecture
                kubectl/Clients
                       ↓
          Load Balancer (HAProxy/Nginx)
                     :6443
                       ↓
       +---------------+---------------+
       |               |               |
   Master01        Master02        Master03
    (API)           (API)           (API)
   (Sched)         (Sched*)        (Sched*)
  (CtrlMgr)       (CtrlMgr*)      (CtrlMgr*)
       |               |               |
       +---------------+---------------+
                       |
                 etcd cluster
               (3 or 5 members)
* = standby (leader election)
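To see which instance is currently active rather than on standby, the leader lease can be read from the cluster. This is a sketch that assumes kubectl is already configured and that the scheduler and controller-manager use Lease objects in the kube-system namespace (the default lock type in current releases). The helper name `leader_of` is hypothetical.

```shell
# Print the identity of the instance holding a control-plane leader lease.
# Argument: kube-scheduler or kube-controller-manager.
leader_of() (
  component=$1
  kubectl -n kube-system get lease "$component" \
    -o jsonpath='{.spec.holderIdentity}'
)
```

Calling `leader_of kube-scheduler` prints a holder identity that normally starts with the hostname of the active master.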
Initialize the First Master
kubeadm init --control-plane-endpoint "loadbalancer.example.com:6443" \
--upload-certs \
--pod-network-cidr=10.244.0.0/16
Add Additional Masters
# On Master02 and Master03
kubeadm join loadbalancer.example.com:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash> \
--control-plane \
--certificate-key <cert-key>
HAProxy Config
# /etc/haproxy/haproxy.cfg
frontend k8s-api
bind *:6443
mode tcp
option tcplog
default_backend k8s-api-backend
backend k8s-api-backend
mode tcp
balance roundrobin
option tcp-check
server master01 10.0.0.11:6443 check
server master02 10.0.0.12:6443 check
server master03 10.0.0.13:6443 check
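Before relying on the load balancer, it can help to confirm that each backend address accepts TCP connections on 6443. The sketch below assumes bash and the coreutils `timeout` command are available; `port_open` and `check_backends` are hypothetical helper names.

```shell
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Report the state of each host:port pair given as an argument.
check_backends() {
  for target in "$@"; do
    host=${target%:*}
    port=${target##*:}
    if port_open "$host" "$port"; then
      echo "$target up"
    else
      echo "$target DOWN"
    fi
  done
}
```

For the backend shown above this would be run as `check_backends 10.0.0.11:6443 10.0.0.12:6443 10.0.0.13:6443`. The configuration file itself can be syntax-checked with `haproxy -c -f /etc/haproxy/haproxy.cfg`.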
External etcd Cluster
# Configure etcd on 3 separate nodes
ETCD_NAME=etcd01
ETCD_INITIAL_CLUSTER="etcd01=https://10.0.0.21:2380,etcd02=https://10.0.0.22:2380,etcd03=https://10.0.0.23:2380"
# In kubeadm-config.yaml (under kind: ClusterConfiguration)
etcd:
  external:
    endpoints:
      - https://10.0.0.21:2379
      - https://10.0.0.22:2379
      - https://10.0.0.23:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
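The ETCD_INITIAL_CLUSTER value in this section has to be identical on every etcd node, so generating it from a single list avoids typos. A minimal sketch, assuming the peer port 2380 and https scheme used above; `build_initial_cluster` is a hypothetical helper.

```shell
# Build the ETCD_INITIAL_CLUSTER string from name=ip pairs.
# Each pair becomes name=https://ip:2380, joined with commas.
build_initial_cluster() (
  out=""
  for pair in "$@"; do
    name=${pair%%=*}
    ip=${pair#*=}
    out="${out:+$out,}${name}=https://${ip}:2380"
  done
  printf '%s\n' "$out"
)

ETCD_INITIAL_CLUSTER=$(build_initial_cluster \
  etcd01=10.0.0.21 etcd02=10.0.0.22 etcd03=10.0.0.23)
echo "$ETCD_INITIAL_CLUSTER"
```

Only ETCD_NAME then differs from node to node.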
Verify Cluster Health
# Status of the masters
kubectl get nodes -l node-role.kubernetes.io/control-plane
# etcd health
ETCDCTL_API=3 etcdctl member list \
--endpoints=https://10.0.0.21:2379,https://10.0.0.22:2379,https://10.0.0.23:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
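# Endpoint health (needs the same TLS flags; --cluster discovers all members from one endpoint)
ETCDCTL_API=3 etcdctl endpoint health --cluster \
--endpoints=https://10.0.0.21:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key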
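Failover Test
# Take one master down (run on that master)
# Note: stopping only the kubelet leaves the static control-plane pods running;
# power the node off (systemctl poweroff) to simulate a full failure
systemctl stop kubelet
# Verify that the cluster keeps working (run from another machine, through the load balancer)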
kubectl get nodes
kubectl get pods --all-namespaces
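To measure how the API endpoint behaves while a master is down, a probe loop can poll the load balancer during the test. This sketch assumes curl is installed and that the API server's /readyz path answers unauthenticated requests (the default when anonymous access is left enabled). `probe_api` is a hypothetical helper.

```shell
# Poll /readyz through the load balancer once per second and count failures.
# -k skips TLS verification for brevity; prefer --cacert with the cluster CA.
probe_api() (
  endpoint=$1   # e.g. https://loadbalancer.example.com:6443
  attempts=$2   # number of probes to send
  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if ! curl -ksf --max-time 2 "$endpoint/readyz" >/dev/null; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "attempts=$attempts failures=$failures"
)
```

Starting `probe_api https://loadbalancer.example.com:6443 60` just before taking the master down gives a rough count of requests lost while HAProxy marks the backend as failed.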
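Regenerate the Certificate Key for HA
# The key uploaded by --upload-certs expires after 2 hours; re-upload the certificates to get a new one
kubeadm init phase upload-certs --upload-certs
# Copy the certificate key printed at the end of the output
# Add a new master
kubeadm join <endpoint> --token <token> \
--discovery-token-ca-cert-hash sha256:<hash> \
--control-plane --certificate-key <key>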
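The two steps can be combined so that kubeadm prints a complete control-plane join command, including a fresh token and CA hash. This is a sketch to run as root on an existing master. It assumes the certificate key is the last line printed by the upload-certs phase, and `print_control_plane_join` is a hypothetical helper name.

```shell
# Re-upload the control-plane certificates, capture the new certificate key,
# and print a ready-to-use "kubeadm join ... --control-plane" command.
print_control_plane_join() (
  cert_key=$(kubeadm init phase upload-certs --upload-certs | tail -n 1)
  kubeadm token create --print-join-command --certificate-key "$cert_key"
)
```

The printed command is then run on the node that should become the new master.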