[lvs-users] UDP packet loss when real server removed from farm
pdm at pobox.com
Sun Jul 16 16:38:51 BST 2017
I faced a similar issue with HTTP traffic. The cause for me was that
keepalived, by default, removes a real server from the configuration when
it is detected down, breaking any sessions that were going to it.

If you specify one of the weighted scheduling algorithms and use
'inhibit_on_failure', then instead of removing the real server from the
config keepalived will mark it with a weight of 0. Existing connections
will be able to complete and new ones will be routed to a different
destination. Weighted round robin (wrr) might be a reasonable choice.
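
The same weight-0 quiescing can be tried by hand with ipvsadm, which may
help when reproducing the problem without keepalived. The following is a
minimal sketch, not a tested procedure: the function names and the
192.0.2.x addresses are hypothetical, and it assumes a UDP virtual service
using direct routing (-g). It only runs ipvsadm when dry_run is False.

```python
import subprocess


def quiesce_command(vip, port, rip, udp=True):
    """Build the ipvsadm argv that edits a real server to weight 0.

    Assumes the real server listens on the same port as the virtual
    service and is reached via direct routing (-g).
    """
    proto_flag = "-u" if udp else "-t"   # -u: UDP service, -t: TCP service
    return [
        "ipvsadm", "-e", proto_flag, f"{vip}:{port}",  # -e: edit real server
        "-r", f"{rip}:{port}",
        "-g",                                          # direct routing
        "-w", "0",                                     # weight 0 = quiesced
    ]


def quiesce(vip, port, rip, udp=True, dry_run=True):
    """Return the command; execute it only if dry_run is False (needs root)."""
    argv = quiesce_command(vip, port, rip, udp=udp)
    if not dry_run:
        subprocess.run(argv, check=True)
    return argv


if __name__ == "__main__":
    # Hypothetical addresses from the documentation range.
    print(" ".join(quiesce("192.0.2.10", 53, "192.0.2.21")))
```

With dry_run left at its default, this just prints the command so it can
be reviewed before being run on a load balancer.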

As a caveat, when I did this we had the persistence value set to non-zero,
and this meant some busy clients never drained from the node that had been
taken out. To work around this I set the
'net.ipv4.vs.expire_quiescent_template' sysctl to 1. Without this setting,
even if the real server was hard down, IPVS (admittedly an older version)
would keep sending traffic to the down real server.

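
The sysctl named above can also be set from a script. This is a minimal
sketch, assuming the ip_vs module is loaded (so that
/proc/sys/net/ipv4/vs/ exists) and that the script runs as root. The
function name and the proc_sys parameter are made up for illustration;
the parameter exists only so the function can be pointed at a scratch
directory.

```python
from pathlib import Path

# Relative path of the sysctl under /proc/sys
# (net.ipv4.vs.expire_quiescent_template in dotted form).
SYSCTL = "net/ipv4/vs/expire_quiescent_template"


def set_expire_quiescent_template(enabled=True, proc_sys="/proc/sys"):
    """Write 1 or 0 to the sysctl file and return the value read back."""
    path = Path(proc_sys) / SYSCTL
    path.write_text("1\n" if enabled else "0\n")
    return path.read_text().strip()


if __name__ == "__main__":
    # Requires root and a loaded ip_vs module on a real system.
    print(set_expire_quiescent_template(True))
```

A value written this way does not survive a reboot; a persistent setting
would normally go in the sysctl configuration files instead.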
On Sun, Jul 16, 2017 at 9:41 AM, Ed Ravin <eravin at panix.com> wrote:
> I'm using keepalived to distribute DNS requests (UDP port 53) to a
> group of DNS servers. The farm is using source hashing. Environment
> is RHEL, with the stock keepalived and IPVS. I've reproduced the
> problem with RHEL 7.2, 6.8, and an older 6.x version.
> When a health check fails and keepalived takes a real server out of the
> farm, tests show that a client using the removed server has its packets
> discarded until it is remapped to a new server. I can also provoke the
> problem without keepalived, by using ipvsadm to remove a real server from
> the farm.
> I ran tcpdump on the load-balancing server during the test. When the
> IPVS load balancing is working as expected, I see the packets arrive
> on the incoming interface (a 2-interface bond) and then immediately get
> forwarded to a real server. We are using direct response so there's
> no manipulation of the IP headers.
> After the real server is removed from the farm, requests from clients
> that were hashed to that server still arrive, but they don't get forwarded
> out. I haven't calculated all the numbers yet, but on a farm that gets
> roughly 7500 requests per second, when one of the five real servers is
> removed, around 3400 requests do not get forwarded. Under various test
> scenarios it can take as long as a second for the farm to work normally
> again from the impacted clients' perspective - the problem gets worse
> when request rates are increased.
> I didn't see any loss for clients who were not using the removed server
> during the transition. I also didn't see any loss when a real server
> was added back into the farm.
> When I change the farm from source hashing to round-robin, the problem
> is reduced by an order of magnitude - instead of hundreds of lost
> requests, I get at most a few dozen.
> I'm kind of stuck at this point as I don't know much about IPVS internals.
> I've looked at the IPVS stats in /proc but those only cover packets
> successfully processed, there don't seem to be any counters for errors
> or drops.
> iptables is in use on the load balancer hosts (a very simple list with 3 or
> 4 drop rules), but in my test environment I didn't see any difference when
> the iptables modules were unloaded ("service iptables stop" and then
> confirmed with lsmod). The modules iptables uses when in NAT mode (I think
> it's nf_conntrack and a couple of others) are already blacklisted as they
> caused havoc one day last year when they were accidentally loaded.
> So my questions are:
> * Could there be a bug in the connection table code when a real server
> is removed and the farm mappings have to be recalculated?
> * Is it realistic to expect that no packets will be dropped when a real
> server is removed from the farm?
> * If not, what can I do to minimize the packet loss?
> -- Ed