LVS-DR keepalived problem - SOLUTION

Paolo Perrucci p.perrucci at ludonet.it
Sat Jun 17 19:57:48 BST 2006


Hi all,
after some days (and nights) of work I found the problem and the solution.
Below is an explanation of the problem, followed by the solution.
I hope this helps others sleep easy...

keepalived controls the ipvs configuration on both the master and the
slave director (in my case these are also the real servers).
The ipvs on the backup director is not actually idle. If the module is
loaded in the kernel and the lvs table is not empty, ipvs inspects the
network traffic and applies the configured rules.
Therefore, in my configuration, ipvs on the 2nd real server handles the
network traffic forwarded by the master ipvs, which creates the loop.
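
A quick way to see whether ipvs is active on the backup director is to
count the virtual services in its table. The helper below is only a
sketch: the function name is made up for illustration, and it assumes the
usual `ipvsadm -L -n` listing where each virtual service line starts with
TCP, UDP or FWM.

```shell
#!/bin/sh
# count_virtual_services (hypothetical helper): read an `ipvsadm -L -n`
# listing on stdin and print how many virtual services it contains.
# Assumption: service lines start with TCP, UDP or FWM, while real-server
# lines start with whitespace and "->".
count_virtual_services() {
    grep -c -E '^(TCP|UDP|FWM) '
}

# Typical use on the backup director (needs root and ipvsadm installed):
#   ipvsadm -L -n | count_virtual_services
```

Any result other than 0 on the backup means its ipvs will also act on
packets addressed to the VIP.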

For the 1st client request the flow is:
- the request reaches the master ipvs
- according to the configuration table (it's in my first mail) and the
counter (I configured the rr scheduler), ipvs forwards the packets to the
2nd real server
- on the 2nd real server ipvs handles the request using its own table, so
the request is forwarded to the 2nd real server RIP (i.e. handled locally)

For the 2nd client request the flow is:
- the request reaches the master ipvs
- according to the configuration table and the counter, ipvs forwards the
packets to the 1st real server RIP

For the 3rd client request the flow is:
- the request reaches the master ipvs
- according to the configuration table and the counter, ipvs forwards the
packets to the 2nd real server RIP
- on the 2nd real server ipvs handles the request using its own table, so
the request is forwarded to the 1st real server
- on the 1st real server ipvs handles the request using its table, so the
request is forwarded back to the 2nd real server
- ...loop...
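
The three flows above can be replayed with a toy model. This is only a
sketch, not real ipvs code. It assumes that both nodes walk the rr list
in the order 24, 23 (the order shown by ipvsadm in my first mail) and
that each ipvs remembers the destination it picked for a connection. The
names pick and request are invented for the example.

```shell
#!/bin/sh
# Toy model of the loop. Nodes are named by the last octet of their RIP:
# 23 is the master director, 24 the backup. Both run ipvs with rr.
conns=""       # remembered connections, entries look like " node:conn=dest "
next_23=24     # next rr pick on node 23 (assumed starting point)
next_24=24     # next rr pick on node 24 (assumed starting point)

# pick NODE CONN: set $dest to the real server chosen by ipvs on NODE.
pick() {
    case "$conns" in
        *" $1:$2="*)
            # Known connection: reuse the destination chosen earlier.
            dest=${conns#*" $1:$2="}
            dest=${dest%% *}
            ;;
        *)
            # New connection: take the rr pick, advance the counter, remember.
            eval "dest=\$next_$1"
            if [ "$dest" = 24 ]; then eval "next_$1=23"; else eval "next_$1=24"; fi
            conns="$conns $1:$2=$dest "
            ;;
    esac
}

# request CONN: follow one client SYN, starting at the master (node 23).
request() {
    node=23
    hops=0
    while [ "$hops" -lt 6 ]; do
        pick "$node" "$1"
        if [ "$dest" = "$node" ]; then
            echo "request $1: delivered on $node after $hops forward(s)"
            return 0
        fi
        node=$dest
        hops=$((hops + 1))
    done
    echo "request $1: still bouncing after $hops forwards (loop)"
}

request 1
request 2
request 3
```

In this model request 1 is delivered on node 24, request 2 on node 23,
and request 3 bounces between the two nodes until the hop limit stops it.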

To solve the problem I removed the hidden VIP on the real servers and
used the following iptables nat rule

-A PREROUTING -d 10.0.91.25 -p tcp -j REDIRECT

activated by keepalived on the slave director.
In this way, packets arriving on the slave director are rewritten so that
they bypass ipvs: REDIRECT changes the destination to the director's own
address, and ipvs only handles packets addressed to the VIP 10.0.91.25.
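
One way keepalived could switch the rule on and off is a small notify
script, in the same style as the ip_localhost script from my first mail.
This is a sketch under assumptions: the script name vip_redirect and the
IPTABLES dry-run variable are made up for the example, and it does not
check whether the rule is already present before adding it.

```shell
#!/bin/sh
# Hypothetical notify helper, e.g. /etc/keepalived/vip_redirect.
# "add" installs the REDIRECT rule for the VIP, "del" removes it.
VIP=10.0.91.25
IPTABLES=${IPTABLES:-iptables}   # set IPTABLES="echo iptables" for a dry run

rule() {
    # $1 is -A (append the rule) or -D (delete the rule)
    $IPTABLES -t nat "$1" PREROUTING -d "$VIP" -p tcp -j REDIRECT
}

vip_redirect() {
    case "$1" in
        add) rule -A ;;
        del) rule -D ;;
        *)   echo "Usage: vip_redirect {add|del}" >&2
             return 1 ;;
    esac
}

# Act only when an argument is given, so the file can also be sourced.
if [ $# -gt 0 ]; then
    vip_redirect "$1"
fi
```

With such a script, notify_backup and notify_fault would call it with
"add" and notify_master with "del", so the rule exists only while the
node is not holding the VIP.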

Paolo

Paolo Perrucci wrote:
> Hi all,
>
> I am trying to configure an LVS-DR with 2 servers (centos 4.3) using
> keepalived 1.1.12 for an http service.
> The 2 servers act as master director/slave director and as real servers.
>
> The problem arises when the 3rd client request arrives on the director.
> On the client side, the browser waits for the connection to be
> established without success, and after a while it fails.
> On the real servers, I see a LOT of network traffic
> consisting only of SYN packets.
> My configuration is:
>
> VIP: 10.0.91.25
> RIP1: 10.0.91.23
> RIP2: 10.0.91.24
> Client: 10.0.90.116
>
> --------------------------- keepalived.conf on real server 1 (10.0.91.23)
> vrrp_instance VI_1 {
>        state MASTER
>        interface eth0
>        track_interface {
>                eth0
>        }
>        lvs_sync_daemon_interface eth0
>        virtual_router_id 25
>        priority 150
>        advert_int 2
>        authentication {
>                auth_type PASS
>                auth_pass tps
>        }
>        virtual_ipaddress {
>                10.0.91.25/24
>        }
>        notify_master "/etc/keepalived/ip_localhost del"
>        notify_backup "/etc/keepalived/ip_localhost add"
>        notify_fault "/etc/keepalived/ip_localhost add"
> }
>
> virtual_server 10.0.91.25 80  {
>        delay_loop 5
>        lb_algo rr
>        lb_kind DR
>        protocol TCP
>        real_server 10.0.91.23 80 {
>                weight 1
>                inhibit_on_failure
>                TCP_CHECK {
>                        connect_port 80
>                        connect_timeout 3
>                        nb_get_retry 3
>                        delay_before_retry 1
>                }
>        }
>        real_server 10.0.91.24 80 {
>                weight 1
>                inhibit_on_failure
>                TCP_CHECK {
>                        connect_port 80
>                        connect_timeout 3
>                        nb_get_retry 3
>                        delay_before_retry 1
>                }
>        }
> }
> -------------------------------------------------------------------------------------- 
>
>
>
> --------------------------- keepalived.conf on real server 2 (10.0.91.24)
> vrrp_instance VI_1 {
>        state BACKUP
>        interface eth0
>        track_interface {
>                eth0
>        }
>        lvs_sync_daemon_interface eth0
>        virtual_router_id 25
>        priority 100
>        advert_int 2
>        authentication {
>                auth_type PASS
>                auth_pass tps
>        }
>        virtual_ipaddress {
>                10.0.91.25/24
>        }
>        notify_master "/etc/keepalived/ip_localhost del"
>        notify_backup "/etc/keepalived/ip_localhost add"
>        notify_fault "/etc/keepalived/ip_localhost add"
> }
>
> virtual_server 10.0.91.25 80  {
>        delay_loop 5
>        lb_algo rr
>        lb_kind DR
>        protocol TCP
>        real_server 10.0.91.23 80 {
>                weight 1
>                inhibit_on_failure
>                TCP_CHECK {
>                        connect_port 80
>                        connect_timeout 3
>                        nb_get_retry 3
>                        delay_before_retry 1
>                }
>        }
>        real_server 10.0.91.24 80 {
>                weight 1
>                inhibit_on_failure
>                TCP_CHECK {
>                        connect_port 80
>                        connect_timeout 3
>                        nb_get_retry 3
>                        delay_before_retry 1
>                }
>        }
> }
> -------------------------------------------------------------------------------------- 
>
>
>
> -------------------------------------------------------------------------------------- 
>
> /etc/keepalived/ip_localhost is the script used to setup the VIP (bound
> to lo) on the real servers:
>
> #!/bin/sh
> # Add or remove the VIP on the loopback interface.
> case "$1" in
>  add)
>        ip addr add 10.0.91.25/32 dev lo brd + scope host
>        ;;
>  del)
>        ip addr del 10.0.91.25/32 dev lo
>        ;;
>  *)
>        echo "Usage: $0 {add|del}"
>        exit 1
>        ;;
> esac
> exit 0
> -------------------------------------------------------------------------------------- 
>
>
> -------------------------------------------------------------------------------------- 
>
> /etc/sysctl.conf
>
> net.ipv4.ip_forward = 1
> net.ipv4.conf.default.rp_filter = 0
> net.ipv4.conf.default.accept_source_route = 1
> net.ipv4.conf.all.arp_ignore = 1
> net.ipv4.conf.all.arp_announce = 2
> net.ipv4.conf.eth0.arp_ignore = 1
> net.ipv4.conf.eth0.arp_announce = 2
> -------------------------------------------------------------------------------------- 
>
>
> After starting the keepalived service on the two servers I have this
> network configuration on the first real server:
>
> 1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
>    inet6 ::1/128 scope host
>       valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
>    link/ether 00:0c:29:1a:ce:fe brd ff:ff:ff:ff:ff:ff
>    inet 10.0.91.23/24 brd 10.0.91.255 scope global eth0
>    inet 10.0.91.25/24 scope global secondary eth0
>    inet6 fe80::20c:29ff:fe1a:cefe/64 scope link
>       valid_lft forever preferred_lft forever
>
> and this one on the 2nd real server:
>
> 1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
>    inet 10.0.91.25/32 scope host lo
>    inet6 ::1/128 scope host
>       valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
>    link/ether 00:0c:29:7a:c2:d3 brd ff:ff:ff:ff:ff:ff
>    inet 10.0.91.24/24 brd 10.0.91.255 scope global eth0
>    inet6 fe80::20c:29ff:fe7a:c2d3/64 scope link
>       valid_lft forever preferred_lft forever
>
> The ipvsadm status seems to be correct.
> On the 1st server is:
>
> IP Virtual Server version 1.2.0 (size=4096)
> Prot LocalAddress:Port Scheduler Flags
>  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
> TCP  10.0.91.25:http rr
>  -> 10.0.91.24:http              Route   1      0          0
>  -> 10.0.91.23:http              Local   1      0          0
>
> On the 2nd server is:
>
> IP Virtual Server version 1.2.0 (size=4096)
> Prot LocalAddress:Port Scheduler Flags
>  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
> TCP  10.0.91.25:http rr
>  -> 10.0.91.24:http              Local   1      0          0
>  -> 10.0.91.23:http              Route   1      0          0
>
> When the 3rd client request arrives on the server, this is the tcpdump
> output on the first node:
>
> ...
> 00:49:02.366902 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.366929 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.367082 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.367095 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.367878 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.367902 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.367881 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.367910 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.367882 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.367916 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 00:49:02.368584 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> ...
>
> and you can see the same in the tcpdump output from the 2nd node:
>
> ...
> 22:51:39.744887 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.746808 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.746843 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.746816 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.746862 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.746818 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.746884 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.747879 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.747909 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.747881 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.747949 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.748892 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.748923 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> 22:51:39.749745 IP 10.0.90.116.3724 > 10.0.91.25.http: S
> 2143602042:2143602042(0) win 32768 <mss 1460,nop,nop,timestamp 0
> 0,nop,nop,sackOK>
> ...
>
> As you can see from the timestamps, it's a lot of network traffic.
> It looks like there is a loop between the two servers.
> The first two client requests are handled correctly: the first one goes
> to the first node and the 2nd one goes to the other node.
>
> Can anyone give me some hints to debug (and hopefully solve) the problem?
> Thank you
> Paolo
>
