Docker Swarm Failover using VIP

Installing Keepalived for Docker Swarm Failover
Keepalived is commonly used to provide high availability in a Docker Swarm cluster: it manages a Virtual IP (VIP) that automatically moves to another node if the node holding it becomes unavailable. The steps below install and configure Keepalived for Docker Swarm failover.
1. Install Keepalived on All Nodes
On each node (manager and optionally worker nodes), install Keepalived using your package manager.
For Ubuntu/Debian:
sudo apt update
sudo apt install -y keepalived
For CentOS/RHEL:
sudo yum install -y keepalived
2. Enable Non-local IP Binding (Optional but Recommended)
To allow services on a node (for example, a reverse proxy) to bind to the VIP even while the VIP is not assigned to that node, append the following setting to /etc/sysctl.conf and reload it:
echo "net.ipv4.ip_nonlocal_bind = 1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
3. Configure Keepalived
You need to create a Keepalived configuration file on each node, typically at /etc/keepalived/keepalived.conf. The configuration will differ slightly between the master and backup nodes. Here is a sample setup for three nodes:
Example: Master Node (dockerswm1)
vrrp_instance VI_1 {
state MASTER
interface ens192 # Replace with your primary network interface
virtual_router_id 51
priority 255
advert_int 1
authentication {
auth_type PASS
auth_pass 12345
}
unicast_peer {
192.168.0.101
192.168.0.102
}
virtual_ipaddress {
192.168.0.99/24
}
}
Example: Backup Node (dockerswm2)
vrrp_instance VI_1 {
state BACKUP
interface ens192
virtual_router_id 51
priority 254
advert_int 1
authentication {
auth_type PASS
auth_pass 12345
}
unicast_peer {
192.168.0.100
192.168.0.102
}
virtual_ipaddress {
192.168.0.99/24
}
}
Example: Backup Node (dockerswm3)
vrrp_instance VI_1 {
state BACKUP
interface ens18
virtual_router_id 51
priority 253
advert_int 1
authentication {
auth_type PASS
auth_pass 12345
}
unicast_peer {
192.168.0.100
192.168.0.101
}
virtual_ipaddress {
192.168.0.99/24
}
}
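Replace the interface names and IP addresses as appropriate for your environment. Keep virtual_router_id and auth_pass identical on all nodes, and give each node a distinct priority.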
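Each node's unicast_peer block lists every node except itself. As a rough sketch of that rule, the helper below prints the peers for a given node. The name peers_for and the NODES variable are hypothetical, and the address 192.168.0.100 for dockerswm1 is inferred from the peer lists in the examples above.

```shell
#!/bin/sh
# Node addresses taken from the example configurations (assumed mapping:
# dockerswm1=.100, dockerswm2=.101, dockerswm3=.102).
NODES="192.168.0.100 192.168.0.101 192.168.0.102"

# peers_for SELF: print every address in NODES except SELF, one per line.
# (peers_for is a hypothetical helper, not part of Keepalived.)
peers_for() {
    for ip in $NODES; do
        [ "$ip" = "$1" ] || printf '%s\n' "$ip"
    done
}
```

For example, peers_for 192.168.0.101 prints 192.168.0.100 and 192.168.0.102, which matches the unicast_peer block shown for dockerswm2.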
4. (Optional) Add Health Checks
You can make failover more accurate by having Keepalived run a health check script (e.g., one that checks whether Docker or a specific service such as Traefik is running). Save your script (e.g., as /etc/keepalived/healthcheck.sh) and make it executable:
sudo chmod +x /etc/keepalived/healthcheck.sh
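A minimal sketch of such a script follows, assuming that "healthy" means the Docker daemon answers docker info. Keepalived treats exit status 0 as success and any non-zero status as a failed check. The function name docker_healthy and the DOCKER_BIN override are illustrative and not part of Keepalived or Docker.

```shell
#!/bin/sh
# Hypothetical /etc/keepalived/healthcheck.sh (a sketch, not a canonical script).
# Exit status 0 = healthy, non-zero = failed check.

# docker_healthy succeeds when the Docker daemon answers `docker info`.
# DOCKER_BIN is an illustrative override for the docker binary.
docker_healthy() {
    "${DOCKER_BIN:-docker}" info >/dev/null 2>&1
}

# Run the check only when executed under the assumed file name, so the
# function can also be sourced and exercised without a Docker daemon.
case "$0" in
    *healthcheck.sh) docker_healthy ;;
esac
```

To check a specific service instead, such as Traefik, you would replace the body of docker_healthy with a probe suited to that service.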
Define the script in a vrrp_script block in your Keepalived configuration and reference it from the vrrp_instance using the track_script directive.
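As an illustration, the fragment below wires the script into the existing VI_1 instance. The name chk_docker and the interval, fall, and rise values are assumptions to tune for your environment. With no weight set, a failed check should put the instance into the FAULT state so that a backup takes over the VIP.

```
vrrp_script chk_docker {
    script "/etc/keepalived/healthcheck.sh"
    interval 2   # run the check every 2 seconds
    fall 2       # consider it failed after 2 consecutive failures
    rise 2       # consider it healthy after 2 consecutive successes
}

vrrp_instance VI_1 {
    # ... existing settings from step 3 ...
    track_script {
        chk_docker
    }
}
```

The vrrp_script block must appear before the vrrp_instance that references it.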
5. Start and Enable Keepalived
Enable and start Keepalived on all nodes, then check its status:
sudo systemctl enable --now keepalived
sudo systemctl status keepalived
6. Update DNS
Point your application's DNS (A record) to the VIP (e.g., 192.168.0.99).
This ensures clients always connect to the active node.
7. Test Failover
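Stop Docker or the relevant service on the master node to simulate a failure. This only triggers failover if the health check from step 4 is configured; otherwise, stop Keepalived itself or shut the node down.
Observe that the VIP moves to the backup node with the next-highest priority.
Restart the stopped services to restore normal operation. Because the master has the highest priority and preemption is enabled by default, the VIP should return to it.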
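One way to see which node currently holds the VIP is to inspect each node's IPv4 addresses. The sketch below assumes the iproute2 ip command is available and uses the example VIP 192.168.0.99. has_vip is a hypothetical helper name.

```shell
#!/bin/sh
# has_vip VIP: read `ip -o -4 addr show` output on stdin and succeed
# if VIP appears as an assigned address. (has_vip is a hypothetical helper.)
has_vip() {
    grep -qF -- "inet $1/"
}

# Usage on a node (not executed here):
#   ip -o -4 addr show | has_vip 192.168.0.99 && echo "VIP is on $(hostname)"
```

Run the usage line on each node before and after stopping the master. Exactly one node should report the VIP at any time.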