keepalived HA, float VIP when a service failure.

luke_devon
Posts: 6
Joined: 2009/11/14 14:36:46

Post by luke_devon » 2018/12/01 16:33:59

Hi

I have configured keepalived on CentOS 7 as an HA implementation.

1st CentOS box: httpd (port 80), MySQL (port 3306), and keepalived + LVS (ipvsadm) as the MASTER, priority 101

2nd CentOS box: httpd (port 80), MySQL (port 3306), and keepalived + LVS (ipvsadm) as the BACKUP, priority 100

httpd VIP - 192.168.1.200
mysql VIP - 192.168.56.200

However, during testing I found the following problems:

1. I stopped mysqld. The BACKUP was notified, but the mysql VIP did not float to the BACKUP server.
2. I started mysqld again, but the mysql VIP was not assigned back to the MASTER.
3. I ran the same test for httpd and got the same result.

Could you please help me resolve this issue? I really want to implement keepalived HA for service failures. Please help.

Thanks
Luke.

hunter86_bg
Posts: 2019
Joined: 2015/02/17 15:14:33
Location: Bulgaria

Re: keepalived HA, float VIP when a service failure.

Post by hunter86_bg » 2018/12/02 02:49:01

Paste your cluster config so that people are able to help.

luke_devon
Posts: 6
Joined: 2009/11/14 14:36:46

Re: keepalived HA, float VIP when a service failure.

Post by luke_devon » 2018/12/02 05:19:40

Hi

In the current configuration, SELinux, firewalld, and iptables are all disabled.

On the MASTER node:

inet 192.168.56.103/24 brd 192.168.56.255 scope global noprefixroute enp0s8
inet 192.168.56.100/24 scope global secondary enp0s8 ---> VIP

Code: Select all

! Configuration File for keepalived

global_defs {
   router_id LVS_WEB
   script_user root
   enable_script_security
}

vrrp_instance WEB {
    state MASTER
    interface enp0s8
    virtual_router_id 109
    priority 101
    advert_int 1

    unicast_src_ip 192.168.56.103   # IP address of local interface

    unicast_peer {                  # IP address of peer interface
        192.168.56.104
    }

    virtual_ipaddress {
        192.168.56.100/24 dev enp0s8
    }
}

virtual_server 192.168.56.100 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP

    real_server 192.168.56.103 80 {
        weight 1
        HTTP_GET {
            url {
              path /var/www/html/node.html
              digest 90f7e9a5f1e12ba93d3086e9e314b201
            }

            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }

    real_server 192.168.56.104 80 {
        HTTP_GET {
            url {
              path /var/www/html/node.html
              digest 90f7e9a5f1e12ba93d3086e9e314b201
            }

            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 2
        }
    }
}

[root@test1 keepalived]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.56.100:80 rr persistent 50
  -> 192.168.56.103:80            Masq    1      0          0
  -> 192.168.56.104:80            Masq    1      0          0

==============================================================
On the BACKUP node:

inet 192.168.56.104/24 brd 192.168.56.255 scope global noprefixroute enp0s8

Code: Select all

! Configuration File for keepalived

global_defs {
   router_id LVS_WEB
   script_user root
   enable_script_security
}

vrrp_instance WEB {
    state BACKUP
    interface enp0s8
    virtual_router_id 109
    priority 100
    advert_int 1

    unicast_src_ip 192.168.56.104   # IP address of local interface

    unicast_peer {                  # IP address of peer interface
        192.168.56.103
    }

    virtual_ipaddress {
        192.168.56.100/24 dev enp0s8
    }
}

virtual_server 192.168.56.100 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP

    real_server 192.168.56.103 80 {
        weight 1
        HTTP_GET {
            url {
              path /var/www/html/node.html
              digest 90f7e9a5f1e12ba93d3086e9e314b201
            }

            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }

    real_server 192.168.56.104 80 {
        weight 1
        HTTP_GET {
            url {
              path /var/www/html/node.html
              digest 90f7e9a5f1e12ba93d3086e9e314b201
            }

            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 2
        }
    }
}

[root@test2 keepalived]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.56.100:80 rr persistent 50
  -> 192.168.56.103:80            Masq    1      0          0
  -> 192.168.56.104:80            Masq    1      0          0

=====================================================
I then stopped httpd on the MASTER.

/var/log/messages:

Code: Select all

Dec  2 13:01:27 test1 Keepalived_vrrp[1643]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  2 13:01:27 test1 Keepalived_vrrp[1643]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  2 13:01:32 test1 Keepalived_vrrp[1643]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  2 13:01:32 test1 Keepalived_vrrp[1643]: VRRP_Instance(WEB) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.56.100
Dec  2 13:01:32 test1 Keepalived_vrrp[1643]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  2 13:01:32 test1 Keepalived_vrrp[1643]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  2 13:01:32 test1 Keepalived_vrrp[1643]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  2 13:01:32 test1 Keepalived_vrrp[1643]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  2 13:09:35 test1 systemd: Stopping The Apache HTTP Server...
Dec  2 13:09:36 test1 systemd: Stopped The Apache HTTP Server.
Dec  2 13:09:38 test1 Keepalived_healthcheckers[1642]: Error connecting server [192.168.56.103]:80.
Dec  2 13:09:41 test1 Keepalived_healthcheckers[1642]: Error connecting server [192.168.56.103]:80.
Dec  2 13:09:44 test1 Keepalived_healthcheckers[1642]: Error connecting server [192.168.56.103]:80.
Dec  2 13:09:47 test1 Keepalived_healthcheckers[1642]: Error connecting server [192.168.56.103]:80.
Dec  2 13:09:47 test1 Keepalived_healthcheckers[1642]: Check on service [192.168.56.103]:80 failed after 3 retry.
Dec  2 13:09:47 test1 Keepalived_healthcheckers[1642]: Removing service [192.168.56.103]:80 from VS [192.168.56.100]:80

The VIP is still on the MASTER; it did not move to the BACKUP.

On the BACKUP node:

Code: Select all

Dec  2 13:09:38 test2 Keepalived_healthcheckers[1581]: Error connecting server [192.168.56.103]:80.
Dec  2 13:09:41 test2 Keepalived_healthcheckers[1581]: Error connecting server [192.168.56.103]:80.
Dec  2 13:09:44 test2 Keepalived_healthcheckers[1581]: Error connecting server [192.168.56.103]:80.
Dec  2 13:09:47 test2 Keepalived_healthcheckers[1581]: Error connecting server [192.168.56.103]:80.
Dec  2 13:09:47 test2 Keepalived_healthcheckers[1581]: Check on service [192.168.56.103]:80 failed after 3 retry.
Dec  2 13:09:47 test2 Keepalived_healthcheckers[1581]: Removing service [192.168.56.103]:80 from VS [192.168.56.100]:80
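
The logs above show only Keepalived_healthcheckers activity: the HTTP_GET checker removes 192.168.56.103:80 from the LVS pool, but keepalived itself is still running on the MASTER and keeps sending VRRP adverts, so the VRRP priority never changes and the VIP stays where it is. One common way to tie the VIP to a service is a vrrp_script referenced from a track_script block. The following is a minimal, untested sketch for the MASTER config, assuming a stock CentOS 7 where systemctl lives at /usr/bin/systemctl. The name chk_httpd and the values chosen for interval, fall, rise, and weight are illustrative assumptions, not taken from the thread.

```
! Sketch only: the WEB instance as posted, plus a tracked script.
! chk_httpd is a hypothetical name; the systemctl path assumes stock CentOS 7.

vrrp_script chk_httpd {
    script "/usr/bin/systemctl is-active --quiet httpd"   # exits 0 while httpd is active
    interval 2      # run the check every 2 seconds
    fall 2          # 2 consecutive failures mark the script as failed
    rise 2          # 2 consecutive successes mark it as healthy again
    weight -20      # while failed, effective priority drops from 101 to 81
}

vrrp_instance WEB {
    state MASTER
    interface enp0s8
    virtual_router_id 109
    priority 101
    advert_int 1

    unicast_src_ip 192.168.56.103   # IP address of local interface

    unicast_peer {                  # IP address of peer interface
        192.168.56.104
    }

    virtual_ipaddress {
        192.168.56.100/24 dev enp0s8
    }

    track_script {
        chk_httpd                   # ties the instance priority to the check above
    }
}
```

With weight -20, the MASTER's effective priority is 81 while httpd is down, which is below the BACKUP's 100, so the BACKUP should take over the VIP. Once the check passes again, the priority returns to 101 and default preemption should move the VIP back. The BACKUP node would need the same vrrp_script and track_script entries, keeping its own state, priority, and unicast addresses.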

hunter86_bg
Posts: 2019
Joined: 2015/02/17 15:14:33
Location: Bulgaria

Re: keepalived HA, float VIP when a service failure.

Post by hunter86_bg » 2018/12/02 15:49:22

I'm not a keepalived expert.
The first thing that comes to mind: are you sure that both nodes are sending proper heartbeats?
What happens when node1 is powered off?

luke_devon
Posts: 6
Joined: 2009/11/14 14:36:46

Re: keepalived HA, float VIP when a service failure.

Post by luke_devon » 2018/12/03 02:04:59

Hi,

Thank you for the support so far. I have been struggling with this issue for 2-3 months. I have tried most of the possible approaches, but I have not got it working yet. Maybe I have made a mistake somewhere in the configuration, but I am still unable to figure out where.

LVS is working fine, and the VIP also floats correctly in general test cases such as restarting or powering off a node.
===========================================================

When Node 1 is powered off, the VIP moves to Node 2, and Node 2 becomes the MASTER.

Node 2 logs:

Code: Select all

Dec  3 09:45:06 test2 Keepalived_healthcheckers[1376]: Error connecting server [192.168.56.103]:80.
Dec  3 09:45:06 test2 Keepalived_vrrp[1377]: VRRP_Instance(WEB) Transition to MASTER STATE
Dec  3 09:45:07 test2 Keepalived_vrrp[1377]: VRRP_Instance(WEB) Entering MASTER STATE
Dec  3 09:45:07 test2 Keepalived_vrrp[1377]: VRRP_Instance(WEB) setting protocol VIPs.
Dec  3 09:45:07 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:07 test2 Keepalived_vrrp[1377]: VRRP_Instance(WEB) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.56.100
Dec  3 09:45:07 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:07 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:07 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:07 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:12 test2 Keepalived_healthcheckers[1376]: Timeout connecting server [192.168.56.103]:80.
Dec  3 09:45:12 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:12 test2 Keepalived_vrrp[1377]: VRRP_Instance(WEB) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.56.100
Dec  3 09:45:12 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:12 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:12 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:12 test2 Keepalived_vrrp[1377]: Sending gratuitous ARP on enp0s8 for 192.168.56.100
Dec  3 09:45:18 test2 Keepalived_healthcheckers[1376]: Timeout connecting server [192.168.56.103]:80.
Dec  3 09:45:24 test2 Keepalived_healthcheckers[1376]: Timeout connecting server [192.168.56.103]:80.
Dec  3 09:45:24 test2 Keepalived_healthcheckers[1376]: Check on service [192.168.56.103]:80 failed after 3 retry.
Dec  3 09:45:24 test2 Keepalived_healthcheckers[1376]: Removing service [192.168.56.103]:80 from VS [192.168.56.100]:80

The VIP moved to Node 2:

Code: Select all

[root@test2 ~]# ip addr show enp0s8 | grep 'inet '
    inet 192.168.56.104/24 brd 192.168.56.255 scope global noprefixroute enp0s8
    inet 192.168.56.100/24 scope global secondary enp0s8
==========================================
I powered Node 1 on again, and the VIP moved back to Node 1:

Code: Select all

[root@test1 ~]# ip addr show enp0s8 | grep 'inet '
    inet 192.168.56.103/24 brd 192.168.56.255 scope global noprefixroute enp0s8
    inet 192.168.56.100/24 scope global secondary enp0s8

Status on Node 2, which becomes the BACKUP node:

Code: Select all

Dec  3 09:51:41 test2 Keepalived_vrrp[1377]: VRRP_Instance(WEB) Received advert with higher priority 101, ours 100
Dec  3 09:51:41 test2 Keepalived_vrrp[1377]: VRRP_Instance(WEB) Entering BACKUP STATE
Dec  3 09:51:41 test2 Keepalived_vrrp[1377]: VRRP_Instance(WEB) removing protocol VIPs.
Dec  3 09:51:51 test2 Keepalived_healthcheckers[1376]: MD5 digest success to [192.168.56.103]:80 url(1).
Dec  3 09:51:51 test2 Keepalived_healthcheckers[1376]: Remote Web server [192.168.56.103]:80 succeed on service.
Dec  3 09:51:51 test2 Keepalived_healthcheckers[1376]: Adding service [192.168.56.103]:80 to VS [192.168.56.100]:80

The VIP was removed from Node 2:

Code: Select all

[root@test2 ~]# ip addr show enp0s8 | grep 'inet '
    inet 192.168.56.104/24 brd 192.168.56.255 scope global noprefixroute enp0s8

Keepalived experts out there, could you please help me resolve this issue? I want to use CentOS for my project, and I cannot switch to another OS at the moment.

Please help.

Thank you
Luke.
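
The power-off test succeeds because a dead node stops sending VRRP adverts, whereas a stopped service leaves keepalived running and advertising. For the mysql VIP mentioned in the first post, a separate VRRP instance could track mysqld. Below is a minimal, untested sketch for Node 1, assuming the service unit is named mysqld, that systemctl lives at /usr/bin/systemctl, and that the VIP 192.168.56.200 belongs on enp0s8. The instance name DB, the script name chk_mysqld, and virtual_router_id 110 are hypothetical.

```
! Sketch only: a second VRRP instance for the mysql VIP.
! DB, chk_mysqld, and virtual_router_id 110 are hypothetical choices.

vrrp_script chk_mysqld {
    script "/usr/bin/systemctl is-active --quiet mysqld"   # exits 0 while mysqld is active
    interval 2      # run the check every 2 seconds
    fall 2          # 2 consecutive failures mark the script as failed
    rise 2          # 2 consecutive successes mark it as healthy again
    # no weight line: weight defaults to 0
}

vrrp_instance DB {
    state MASTER
    interface enp0s8
    virtual_router_id 110           # must differ from the WEB instance (109)
    priority 101
    advert_int 1

    unicast_src_ip 192.168.56.103   # IP address of local interface

    unicast_peer {                  # IP address of peer interface
        192.168.56.104
    }

    virtual_ipaddress {
        192.168.56.200/24 dev enp0s8
    }

    track_script {
        chk_mysqld
    }
}
```

Because the script carries no weight (it defaults to 0), a failed check should put the instance into FAULT state and release the VIP rather than adjust the priority. Node 2 would carry the same block with state BACKUP, priority 100, and the unicast addresses swapped.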
