thundering herd problem [Was Re: Questions about LVS-TUN

Joseph Mack NA3T jmack at wm7d.net
Sat Dec 16 15:32:44 GMT 2006


On Tue, 12 Dec 2006, Bill Omer wrote:

>
> For example, say you have 1000 people connected via telnet (those
> connections could be from terminals, wireless scanners, etc.) across
> 10 servers.  That's 100 users per server when it is weighted.  If you
> were using rr and one server died, the rest of the servers would
> pick up about 11 extra users each.  When the dead server is restored, you
> will have 9 servers with over a hundred users and 1 server with hardly
> any connections.  That is a poor utilization of resources and could
> take a day to recover.

It should only take the average time for connections to 
expire. Is that about a day for you?

You could use wrr with w=2 for the new server, until its 
number of connections is about right.
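
As a rough sketch of what w=2 buys you, here is a toy weighted
round-robin in Python. It is not the IPVS wrr algorithm (the order
inside a cycle differs); it only shows the proportions. The function
name wrr_assign and the figure of 110 new connections are made up
for illustration.

```python
from itertools import cycle, islice

def wrr_assign(weights, n_new):
    """Spread n_new connections over servers in proportion to weights.

    Simplified model: one scheduling cycle lists server i weights[i]
    times, and new connections walk that cycle in order.
    """
    slots = [i for i, w in enumerate(weights) for _ in range(w)]
    counts = [0] * len(weights)
    for server in islice(cycle(slots), n_new):
        counts[server] += 1
    return counts

# Nine established realservers at weight 1, the restored one at weight 2.
weights = [1] * 9 + [2]
print(wrr_assign(weights, 110))
```

With these assumed numbers the restored server takes 20 of the 110
new connections and each of the others takes 10, so it catches up at
twice the rate it would under plain rr.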

The thundering herd problem is one of allocating resources.
Say you bring up a new realserver and, with lc scheduling, 
1000 people connect to a database on the new machine within 
seconds; then you're going to have problems. In your case, if 
all new connections (say 100 connections) go to one 
machine, over what time interval would that happen?
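
To make the lc case concrete, here is a toy least-connection
scheduler in Python. It is a simplification: it ignores connection
expiry and does not model how IPVS weighs inactive connections. The
function name lc_assign and the starting count of 111 connections on
each old server are assumptions for illustration.

```python
def lc_assign(active, n_new):
    """Send each new connection to the server with the fewest active ones.

    Returns how many of the n_new connections each server received.
    Ties go to the lowest-numbered server.
    """
    active = list(active)  # work on a copy of the current counts
    received = [0] * len(active)
    for _ in range(n_new):
        target = min(range(len(active)), key=active.__getitem__)
        active[target] += 1
        received[target] += 1
    return received

# Nine busy realservers and one that has just come back empty.
active = [111] * 9 + [0]
print(lc_assign(active, 100))
```

In this model all 100 new connections land on the restored server,
since it stays below 111 the whole time. Whether that hurts depends
on how quickly those 100 arrive.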

Joe

-- 
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!

