[lvs-users] IPVS/NAT - no connection after real server down

Graeme Fowler graeme at graemef.net
Fri Sep 5 14:15:47 BST 2008

On Fri, 2008-09-05 at 14:47 +0200, Pitscheider, Oswald wrote:
> I’ve tried the LVS with these changes with a little success, but there is still the problem that if I remove a real server, requests to the server are answered very slowly.
> From the moment the real server is removed from the pool, some requests have to wait seconds for an answer.
> After a minute, the LVS works as it should.

This is fairly predictable, from your configuration and from the way TCP
works.

Each realserver is checked every 20 seconds (delay_loop 20). If you stop
Apache just as the check is done successfully, requests will stall for
20 seconds until the next check (because the server isn't responding).

If a request arrives fractionally after the successful check and the
server isn't responding, the client will retry at the following intervals:

 -0.002  RS1 Check succeeds
 -0.001  RS1 Apache stopped
  0.000  Request arrives at RS1
  3.000  retry 1 to RS1
  9.000  retry 2 to RS1
 19.998  RS1 Check fails
         keepalived removes RS1 from pool
 21.000  retry 3 sent to RS2

Note however that it may take a short period for keepalived to do the
server removal, which may overlap with retry 3 - and the next delay to
retry 4 is another 24 seconds (3, 6, 12, 24 and so on) which takes you
towards 45 seconds altogether.
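
As a rough sketch of the arithmetic above, the retry times can be worked
out in a few lines of Python. The function name and its defaults are only
illustrative; the 3 second initial timeout and the doubling come from the
timeline above, and real TCP stacks may use different values.

```python
def syn_retry_times(initial_timeout=3.0, retries=4):
    """Return the times (seconds after the first request) of each retry.

    Assumes the timeout doubles after every unanswered attempt,
    i.e. gaps of 3, 6, 12, 24 seconds and so on.
    """
    times = []
    elapsed = 0.0
    timeout = initial_timeout
    for _ in range(retries):
        elapsed += timeout      # wait out the current timeout
        times.append(elapsed)   # the retry is sent at this moment
        timeout *= 2            # the next gap is twice as long
    return times


print(syn_retry_times())  # [3.0, 9.0, 21.0, 45.0]
```

Retry 3 lands at 21 seconds and retry 4 at 45 seconds, which is where the
"towards 45 seconds" figure comes from.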

> I’ve tested the LVS using jmeter with 25 threads.

And depending on the way jmeter is configured, alongside your webserver
config, this will mean a minimum of 20 seconds (and likely much longer)
delay between you dropping the webserver and the clients recovering.

It is perfectly permissible to bring down the delay_loop as much as you
or your app server can tolerate. For fast failover you need a short
delay. I would argue that for most web clients, 20 seconds is perfectly
acceptable but that can depend entirely on what you're trying to do.

Try "delay_loop 1" and see what you get. What you will get, possibly,
are a lot of log entries - but you should get very fast recovery.
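
To get a feel for how delay_loop interacts with the client retries, here
is a small Python sketch. It is a simplification, not a model of
keepalived itself: it assumes the request arrives just after a successful
check, that the next check fails one delay_loop later, and that the
realserver is removed "removal_lag" seconds after that. The function and
its parameter names are made up for this illustration.

```python
def first_retry_after_removal(delay_loop, removal_lag=0.0,
                              initial_timeout=3.0, max_retries=5):
    """Return the time of the first client retry sent after the dead
    realserver has been removed, or None if no retry gets that far.

    Assumptions: request sent at t=0 right after a successful check,
    retry gaps double each time (3, 6, 12, 24 ... seconds).
    """
    removal_time = delay_loop + removal_lag
    elapsed = 0.0
    timeout = initial_timeout
    for _ in range(max_retries):
        elapsed += timeout
        if elapsed >= removal_time:
            return elapsed      # this retry can be sent to another server
        timeout *= 2
    return None


print(first_retry_after_removal(20))                   # 21.0
print(first_retry_after_removal(20, removal_lag=2.0))  # 45.0
print(first_retry_after_removal(1))                    # 3.0
```

Under these assumptions a request that has already stalled still has to
wait for its first retry at 3 seconds even with "delay_loop 1", but it no
longer has to wait for retry 3 or retry 4.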

