[lvs-users] ldirectord does not transfer connections when a real server dies

Konstantin Boyanov kkboyanov at gmail.com
Tue Apr 30 10:30:31 BST 2013


Hello LVS users,

I am using ldirectord to load balance two IIS servers. The
ldirectord.cf looks like this:


    autoreload = yes
    quiescent = yes
    checkinterval = 1
    negotiatetimeout = 2
    emailalertfreq = 60
    emailalert = Konstantin.Boyanov at mysite.com
    failurecount = 1

    virtual = 172.22.9.100:80
        checktimeout = 1
        checktype = negotiate
        protocol = tcp
        real = 172.22.1.133:80 masq 2048
        real = 172.22.1.134:80 masq 2048
        request = "alive.htm"
        receive = "I am not a zombie"
        scheduler = wrr

The load balancing works fine, the real servers are visible, etc.
Nevertheless, I am encountering a problem with a simple test:

1. I open some connections from a client browser (IE 8) to the sites that
are hosted on the real servers
2. I change the weight of the real server which serves the above connections
to 0 and leave only the other real server alive
3. I reload the pages to regenerate the connections
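For reference, step 2 can also be performed directly from the command line
(addresses taken from the ldirectord.cf above; ldirectord may of course
reset the weight again on its next check, since autoreload is enabled):

```shell
# Watch the IPVS connection table while testing:
watch -n1 ipvsadm -Ln

# Drain the first real server by setting its weight to 0
# (this is what ldirectord does when "quiescent = yes"):
ipvsadm -e -t 172.22.9.100:80 -r 172.22.1.133:80 -m -w 0
```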

What I am seeing with ipvsadm -Ln is that the connections are still on the
"dead" server. I have to wait up to one minute (I suppose some TCP timeout
on the browser side) for them to transfer to the "living" server. And if
during this minute I keep pressing the reload button, the connections
stay on the "dead" server and their TCP timeout counter gets restarted.

So my question is: Is there a way to tell the load balancer in NAT mode to
terminate / redirect existing connections to a dead server *immediately*
(or close to immediately)?

It seems to me a blunder that a reload on the client side can make a
connection become a "zombie", i.e. stay bound to a dead real server, although
persistence is not used and the other server is ready and available.

The only thing that I found affecting this timeout is changing the
KeepAliveTimeout on the Windows machine running the IE8 which I use for the
tests. When I changed it from the default value of 60 seconds to 30 seconds,
the connections could be transferred after 30 seconds. It seems very odd
to me that a client setting can affect the operation of a network component
such as the load balancer.
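For completeness, this is the client-side change I made: the IE
KeepAliveTimeout registry value (registry path as given in the Microsoft
documentation; the value is in milliseconds), set from a Windows command
prompt on the test client:

```shell
# Lower IE's HTTP keep-alive timeout from the 60000 ms default to 30000 ms:
reg add "HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings" /v KeepAliveTimeout /t REG_DWORD /d 30000 /f
```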

And another thing: what is the column named "InActConn" (inactive
connections) in the output of ipvsadm used for? Which connections are
considered inactive?

Also, in the output of ipvsadm I see a couple of connections in the
state TIME_WAIT. What are these for?
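For illustration, this is roughly what the per-connection listing looks
like on my director (output abridged; the client address shown is made up):

```shell
# List individual IPVS connection entries with their states:
ipvsadm -Lnc

# Abridged example output:
# IPVS connection entries
# pro expire state       source              virtual           destination
# TCP 01:58  TIME_WAIT   192.168.8.50:51234  172.22.9.100:80   172.22.1.133:80
# TCP 14:35  ESTABLISHED 192.168.8.50:51240  172.22.9.100:80   172.22.1.134:80
```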

Any insight and suggestions are highly appreciated!

Cheers,
Konstantin



P.S: Here is some more information about the configuration:

    # uname -a
    Linux 3.0.58-0.6.2-default #1 SMP Fri Jan 25 08:31:01 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

    # ipvsadm -L
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    TCP  lb-mysite.com wrr
      -> spwfe001.mysite.com:h Masq    10     0          0
      -> spwfe002.mysite.com:h Masq    10     0          0

    # iptables -t nat -L
    Chain PREROUTING (policy ACCEPT)
    target     prot opt source               destination

    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination

    Chain OUTPUT (policy ACCEPT)
    target     prot opt source               destination

    Chain POSTROUTING (policy ACCEPT)
    target     prot opt source               destination
    SNAT       all  --  anywhere             anywhere             to:172.22.9.100
    SNAT       all  --  anywhere             anywhere             to:172.22.1.130


    # ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
        inet 127.0.0.2/8 brd 127.255.255.255 scope host secondary lo
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
        link/ether 00:50:56:a5:77:ae brd ff:ff:ff:ff:ff:ff
        inet 192.168.8.216/22 brd 192.168.11.255 scope global eth0
    3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
        link/ether 00:50:56:a5:77:af brd ff:ff:ff:ff:ff:ff
        inet 172.22.9.100/22 brd 172.22.11.255 scope global eth1:1
        inet 172.22.8.213/22 brd 172.22.11.255 scope global secondary eth1
    4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
        link/ether 00:50:56:a5:77:b0 brd ff:ff:ff:ff:ff:ff
        inet 172.22.1.130/24 brd 172.22.1.255 scope global eth2


    # cat /proc/sys/net/ipv4/ip_forward
    1
    # cat /proc/sys/net/ipv4/vs/conntrack
    1
    # cat /proc/sys/net/ipv4/vs/expire_nodest_conn
    1
    # cat /proc/sys/net/ipv4/vs/expire_quiescent_template
    1

