How to Configure a High Availability Kubernetes Cluster on Ubuntu 20.04
In this post I will share how to configure a highly available Kubernetes cluster on Ubuntu using HAProxy and Keepalived, which provide the load balancing and high availability. Things that need to be prepared:
- Hosts/instances/VMs
- HAProxy and Keepalived configuration
- Kubernetes cluster configuration using kubeadm
Cluster Architecture
Here we use 3 masters, 3 workers, 2 load balancers, and 1 IP address to be used as the Virtual IP (VIP). If an error occurs or one load balancer node dies, the VIP fails over to the surviving node, so that high availability is achieved, which is the goal of this lab.
It should be underlined that HAProxy and Keepalived are not installed on the master nodes; they are installed only on the rb-k8s-lb1 and rb-k8s-lb2 load balancer nodes.
Hosts used:
Hostname | IP Address | Roles |
---|---|---|
rb-k8s-lb1 | 10.60.60.43 | HAProxy & Keepalived |
rb-k8s-lb2 | 10.60.60.44 | HAProxy & Keepalived |
rb-k8s-master1 | 10.60.60.51 | master, etcd |
rb-k8s-master2 | 10.60.60.52 | master, etcd |
rb-k8s-master3 | 10.60.60.53 | master, etcd |
rb-k8s-worker1 | 10.60.60.54 | worker |
rb-k8s-worker2 | 10.60.60.55 | worker |
rb-k8s-worker3 | 10.60.60.56 | worker |
- | 10.60.60.45 | Virtual IP (VIP) |
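If you do not have DNS records for these hostnames, you can optionally add them to /etc/hosts on every node so the machines can resolve each other by name. This is only a sketch based on the table above; skip it if the names already resolve in your environment:
10.60.60.43 rb-k8s-lb1
10.60.60.44 rb-k8s-lb2
10.60.60.51 rb-k8s-master1
10.60.60.52 rb-k8s-master2
10.60.60.53 rb-k8s-master3
10.60.60.54 rb-k8s-worker1
10.60.60.55 rb-k8s-worker2
10.60.60.56 rb-k8s-worker3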
Load Balancing Configuration:
Keepalived and HAProxy are installed on nodes rb-k8s-lb1 and rb-k8s-lb2. In this scenario, if one of the load balancer nodes goes down, the Virtual IP automatically moves to the load balancer node that is still running, so the Kubernetes cluster does not lose access to the API server.
Install Keepalived and HAProxy on both nodes
- Run on nodes rb-k8s-lb1 and rb-k8s-lb2:
sudo apt install keepalived haproxy psmisc -y
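Optionally, verify that both packages were installed correctly by printing their versions (the exact version numbers will depend on the Ubuntu repositories):
haproxy -v
keepalived --version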
- Configure HAProxy on both load balancer nodes:
sudo vi /etc/haproxy/haproxy.cfg
global
    log /dev/log local0 warning
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    log global
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server rb-k8s-master1 10.60.60.51:6443 check # Replace the IP address with your own.
    server rb-k8s-master2 10.60.60.52:6443 check # Replace the IP address with your own.
    server rb-k8s-master3 10.60.60.53:6443 check # Replace the IP address with your own.
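Before restarting, you can optionally check the configuration file for syntax errors with HAProxy's built-in check mode:
sudo haproxy -f /etc/haproxy/haproxy.cfg -c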
- Then enable and restart the HAProxy service:
sudo systemctl enable haproxy
sudo systemctl restart haproxy
- Make sure the HAProxy service is active:
root@rb-k8s-lb1:~# systemctl status haproxy.service
● haproxy.service - HAProxy Load Balancer
Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2022-02-12 17:29:05 UTC; 10min ago
Docs: man:haproxy(1)
file:/usr/share/doc/haproxy/configuration.txt.gz
Process: 463986 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS (code=exited, status=0/SUCCESS)
Main PID: 464001 (haproxy)
Tasks: 5 (limit: 4676)
Memory: 3.6M
CGroup: /system.slice/haproxy.service
├─464001 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock
└─464002 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock
Feb 12 17:29:05 rb-k8s-lb1 systemd[1]: Starting HAProxy Load Balancer...
Feb 12 17:29:05 rb-k8s-lb1 haproxy[464001]: [WARNING] 042/172905 (464001) : parsing [/etc/haproxy/haproxy.cfg:28] : backend 'kube-apiserver' >
Feb 12 17:29:05 rb-k8s-lb1 haproxy[464001]: [NOTICE] 042/172905 (464001) : New worker #1 (464002) forked
Feb 12 17:29:05 rb-k8s-lb1 systemd[1]: Started HAProxy Load Balancer.
Feb 12 17:29:08 rb-k8s-lb1 haproxy[464002]: [WARNING] 042/172908 (464002) : Server kube-apiserver/rb-k8s-master2 is DOWN, reason: Layer4 conn>
Feb 12 17:29:12 rb-k8s-lb1 haproxy[464002]: [WARNING] 042/172912 (464002) : Server kube-apiserver/rb-k8s-master3 is DOWN, reason: Layer4 conn>
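As an extra sanity check, you can confirm that HAProxy is listening on port 6443 on both load balancer nodes (ss is part of iproute2, which ships with Ubuntu 20.04):
sudo ss -lntp | grep 6443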
Keepalived Configuration
- Configure Keepalived on both nodes rb-k8s-lb1 and rb-k8s-lb2:
sudo vi /etc/keepalived/keepalived.conf
Configure keepalived.conf on rb-k8s-lb1:
global_defs {
  notification_email {
  }
  router_id LVS_DEVEL
  vrrp_skip_check_adv_addr
  vrrp_garp_interval 0
  vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
  script "killall -0 haproxy"
  interval 2
  weight 2
}

vrrp_instance haproxy-vip {
  state BACKUP
  priority 100
  interface ens3                # Network card
  virtual_router_id 60
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 1111
  }
  unicast_src_ip 10.60.60.43    # The IP address of this machine (rb-k8s-lb1)
  unicast_peer {
    10.60.60.44                 # The IP address of the peer machine (rb-k8s-lb2)
  }
  virtual_ipaddress {
    10.60.60.45/24              # The VIP address
  }
  track_script {
    chk_haproxy
  }
}
Configure keepalived.conf on rb-k8s-lb2:
global_defs {
  notification_email {
  }
  router_id LVS_DEVEL
  vrrp_skip_check_adv_addr
  vrrp_garp_interval 0
  vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
  script "killall -0 haproxy"
  interval 2
  weight 2
}

vrrp_instance haproxy-vip {
  state BACKUP
  priority 100
  interface ens3                # Network card
  virtual_router_id 60
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 1111
  }
  unicast_src_ip 10.60.60.44    # The IP address of this machine (rb-k8s-lb2)
  unicast_peer {
    10.60.60.43                 # The IP address of the peer machine (rb-k8s-lb1)
  }
  virtual_ipaddress {
    10.60.60.45/24              # The VIP address
  }
  track_script {
    chk_haproxy
  }
}
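The chk_haproxy script uses `killall -0 haproxy`, which sends signal 0 to check whether an haproxy process exists without actually killing it (killall comes from the psmisc package installed earlier). You can run the same check manually on either load balancer node; an exit code of 0 means HAProxy is running:
killall -0 haproxy; echo $?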
- Then enable and restart the Keepalived service:
sudo systemctl enable keepalived
sudo systemctl restart keepalived
- Make sure the Keepalived service is running.
Keepalived status on rb-k8s-lb1:
root@rb-k8s-lb1:~# sudo systemctl status keepalived.service
● keepalived.service - Keepalive Daemon (LVS and VRRP)
Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-02-02 16:07:10 UTC; 1 weeks 3 days ago
Main PID: 647 (keepalived)
Tasks: 2 (limit: 4676)
Memory: 6.5M
CGroup: /system.slice/keepalived.service
├─647 /usr/sbin/keepalived --dont-fork
└─708 /usr/sbin/keepalived --dont-fork
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: Registering Kernel netlink command channel
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: Opening file '/etc/keepalived/keepalived.conf'.
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: WARNING - default user 'keepalived_script' for script execution does not exist - please crea>
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: WARNING - script `killall` resolved by path search to `/usr/bin/killall`. Please specify ful>
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: Registering gratuitous ARP shared channel
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: (haproxy-vip) Entering BACKUP STATE (init)
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: VRRP_Script(chk_haproxy) succeeded
Feb 02 16:07:10 rb-k8s-lb1 Keepalived_vrrp[708]: (haproxy-vip) Changing effective priority from 100 to 102
Feb 02 16:07:14 rb-k8s-lb1 Keepalived_vrrp[708]: (haproxy-vip) Entering MASTER STATE
Keepalived status on rb-k8s-lb2:
root@rb-k8s-lb2:~# sudo systemctl status keepalived.service
● keepalived.service - Keepalive Daemon (LVS and VRRP)
Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-02-02 16:07:12 UTC; 1 weeks 3 days ago
Main PID: 647 (keepalived)
Tasks: 2 (limit: 4676)
Memory: 6.6M
CGroup: /system.slice/keepalived.service
├─647 /usr/sbin/keepalived --dont-fork
└─699 /usr/sbin/keepalived --dont-fork
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: Registering Kernel netlink reflector
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: Registering Kernel netlink command channel
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: Opening file '/etc/keepalived/keepalived.conf'.
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: WARNING - default user 'keepalived_script' for script execution does not exist - please crea>
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: WARNING - script `killall` resolved by path search to `/usr/bin/killall`. Please specify ful>
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: Registering gratuitous ARP shared channel
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: (haproxy-vip) Entering BACKUP STATE (init)
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: VRRP_Script(chk_haproxy) succeeded
Feb 02 16:07:13 rb-k8s-lb2 Keepalived_vrrp[699]: (haproxy-vip) Changing effective priority from 100 to 102
High Availability Verification
Before we create the Kubernetes cluster, we have to make sure our high availability configuration is working properly.
- Check the IP Address on the node rb-k8s-lb1:
root@rb-k8s-lb1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:72:9a:4b brd ff:ff:ff:ff:ff:ff
inet 10.60.60.43/24 brd 10.60.60.255 scope global ens3
valid_lft forever preferred_lft forever
inet 10.60.60.45/24 scope global secondary ens3
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe72:9a4b/64 scope link
valid_lft forever preferred_lft forever
We can see that VIP 10.60.60.45 was successfully added to node rb-k8s-lb1. Next we simulate node rb-k8s-lb1 going down; VIP 10.60.60.45 should then automatically move to node rb-k8s-lb2.
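A quicker way to see whether a node currently holds the VIP is to filter the interface addresses for it (a minimal sketch, assuming the interface is ens3 as in the Keepalived configuration):
ip -4 addr show ens3 | grep 10.60.60.45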
- We turn off the HAProxy service on the rb-k8s-lb1 node:
sudo systemctl stop haproxy.service
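You can also watch the Keepalived logs on rb-k8s-lb1 to see it react; the chk_haproxy check should fail and the VIP should be released (journalctl options shown here are standard systemd flags):
sudo journalctl -u keepalived -n 10 --no-pager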
- Then we check the IP address on node rb-k8s-lb1 again:
root@rb-k8s-lb1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:72:9a:4b brd ff:ff:ff:ff:ff:ff
inet 10.60.60.43/24 brd 10.60.60.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe72:9a4b/64 scope link
valid_lft forever preferred_lft forever
On node rb-k8s-lb1 the VIP 10.60.60.45 is no longer present, so we now verify that the VIP is on node rb-k8s-lb2.
- Check the IP Address on the rb-k8s-lb2 node:
root@rb-k8s-lb2:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:aa:3d:02 brd ff:ff:ff:ff:ff:ff
inet 10.60.60.44/24 brd 10.60.60.255 scope global ens3
valid_lft forever preferred_lft forever
inet 10.60.60.45/24 scope global secondary ens3
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:feaa:3d02/64 scope link
valid_lft forever preferred_lft forever
As expected, the VIP moved to node rb-k8s-lb2 when we simulated node rb-k8s-lb1 failing by stopping its HAProxy service.
- Don’t forget to start the HAProxy service again on node rb-k8s-lb1:
sudo systemctl start haproxy.service
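After starting it, confirm that the service is active again:
systemctl is-active haproxy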
In the next post we will continue with configuring the Kubernetes cluster using the VIP address we created here.