Failover when service is down

1. Create a Health Check Script
Create a script (e.g., /etc/keepalived/check_port.sh) to test the port’s availability.
Example using nc (netcat):
#!/bin/bash
PORT=22      # Replace with your target port (e.g., 80, 443, 22)
TIMEOUT=3

# Check if the port is reachable
nc -z -w "$TIMEOUT" localhost "$PORT" > /dev/null 2>&1
exit $?
Make it executable:
chmod +x /etc/keepalived/check_port.sh
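If nc is not installed on the node, the same probe can be sketched with bash's built-in /dev/tcp redirection. This is a minimal sketch, not part of the original setup: the function name check_port_devtcp and its argument order are hypothetical, and it assumes bash and the coreutils timeout command are available.

```shell
#!/bin/bash
# Hypothetical alternative to the nc-based check, using bash's /dev/tcp.
# Usage: check_port_devtcp HOST PORT [TIMEOUT_SECONDS]
check_port_devtcp() {
    local host="$1" port="$2" wait="${3:-3}"

    # Reject anything that is not a plausible TCP port number
    # (empty, non-numeric, or more than five digits).
    case $port in
        ''|*[!0-9]*|??????*) return 2 ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        return 2
    fi

    # Opening /dev/tcp/HOST/PORT makes bash attempt a TCP connection;
    # timeout kills the attempt if it hangs longer than $wait seconds.
    timeout "$wait" bash -c "exec 3<>/dev/tcp/$host/$port" > /dev/null 2>&1
}
```

Called as check_port_devtcp localhost 22, it returns 0 when the connection succeeds, 2 for an invalid port argument, and another non-zero status when the port is closed or the attempt times out.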
2. Configure Keepalived to Use the Script
Modify your keepalived.conf to include the script and adjust priorities based on its status.
Example configuration:
vrrp_script chk_port {
    script "/etc/keepalived/check_port.sh"
    interval 2   # Check every 2 seconds
    timeout 1    # Allow 1 second for the script to complete
    rise 2       # Require 2 successes to consider healthy
    fall 2       # Require 2 failures to consider unhealthy
    weight -50   # Adjust priority by -50 if the check fails
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0          # Replace with your interface
    virtual_router_id 51
    priority 100            # Base priority (100 - 50 = 50 if check fails)
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 12345
    }
    virtual_ipaddress {
        192.168.0.99/24     # VIP
    }
    track_script {
        chk_port            # Reference the health check script
    }
}
3. Key Parameters Explained
Parameter   Purpose
interval    How often the script runs (e.g., every 2 seconds).
timeout     Maximum time allowed for the script to complete; a run that takes longer counts as a failure.
rise/fall   Number of successes/failures required to change state.
weight      Priority adjustment if the check fails (e.g., -50 lowers priority).
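To make the weight arithmetic concrete, the following sketch works through the numbers from the example configuration. It is illustrative only: the function names are hypothetical, and the backup priority of 90 is an assumption, since the guide shows only the master's configuration. The point it illustrates is that a backup node takes over only if its priority is higher than the master's reduced priority.

```shell
#!/bin/bash
# Illustrative arithmetic only; keepalived performs this internally.

# effective_priority BASE WEIGHT CHECK_FAILED(0|1)
# Models only the negative-weight case used in this guide: the weight
# is added to the base priority while the check is failing.
effective_priority() {
    local base="$1" weight="$2" failed="$3"
    if [ "$failed" -eq 1 ] && [ "$weight" -lt 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

# detection_delay INTERVAL FALL
# Approximate time in seconds until `fall` failed checks have been seen
# (script run time and VRRP election time are not included).
detection_delay() {
    echo $(( $1 * $2 ))
}

master=$(effective_priority 100 -50 1)   # master with the service down
backup=90                                # assumed backup priority
echo "master=$master backup=$backup delay~$(detection_delay 2 2)s"
if [ "$backup" -gt "$master" ]; then
    echo "backup takes over the VIP"
fi
```

With the values above, the master drops to 50, so any backup configured with a priority between 51 and 99 would win the election while the check is failing and hand the VIP back once the check recovers.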
4. Test the Configuration
Manually test the script:
/etc/keepalived/check_port.sh
echo $? # Should return 0 (success) or 1 (failure)
Simulate a failure:
Stop the service on the master node (e.g., systemctl stop sshd for port 22).
Verify the VIP migrates to a backup node:
ip addr show dev eth0 | grep "192.168.0.99"
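Because the VIP moves only after the failed checks and the VRRP election have completed, a single check right after stopping the service can be misleading. The sketch below polls for the address instead. The function names has_vip and wait_for_vip are hypothetical helpers, and the parsing assumes the one-line output format of ip -o -4 addr show from iproute2.

```shell
#!/bin/bash
# has_vip VIP: reads `ip -o -4 addr show` output on stdin and succeeds
# if one of the listed addresses is exactly VIP (prefix length ignored).
has_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), a, "/")
                    if (a[1] == vip) found = 1
                }
        }
        END { exit found ? 0 : 1 }
    '
}

# wait_for_vip VIP DEV [TRIES]: poll the local interface once per second
# until the VIP appears or TRIES attempts (default 10) are used up.
wait_for_vip() {
    local vip="$1" dev="$2" tries="${3:-10}"
    while [ "$tries" -gt 0 ]; do
        if ip -o -4 addr show dev "$dev" 2>/dev/null | has_vip "$vip"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

On the backup node, wait_for_vip 192.168.0.99 eth0 returns 0 as soon as the address shows up and 1 if it has not appeared after ten seconds.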
5. Troubleshooting
Check logs:
journalctl -u keepalived -f
Firewall rules: Ensure VRRP (IP protocol 112) is allowed between nodes and that the health check can reach the target port on localhost.
Script permissions: Ensure keepalived has execute rights on the script.
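The permission check can be sketched as a small helper. The function name script_perms_ok is hypothetical. The test for group or world write access is an extra precaution that goes beyond the execute-rights point above; it is included on the assumption that keepalived's enable_script_security option is in use, which refuses scripts that non-root users can modify.

```shell
#!/bin/bash
# script_perms_ok PATH: succeeds if PATH is a regular, executable file
# that is not writable by group or others.
script_perms_ok() {
    local path="$1" perms
    [ -f "$path" ] && [ -x "$path" ] || return 1
    perms=$(ls -ld "$path" | cut -c1-10)     # e.g. -rwxr-xr-x
    case $perms in
        ?????w????|????????w?) return 1 ;;   # group- or world-writable
    esac
    return 0
}

# Example: script_perms_ok /etc/keepalived/check_port.sh || echo "fix permissions"
```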
Alternative: Built-in TCP Check (LVS Context)
For LVS-based setups, use TCP_CHECK in the real_server block:
virtual_server 192.168.0.99 22 {
    delay_loop 15
    protocol TCP
    real_server 192.168.0.100 22 {
        weight 1
        TCP_CHECK {
            connect_port 22
            connect_timeout 3
        }
    }
}