[lvs-users] LVS RR becomes unbalanced after time

Luc van Donkersgoed luc at codingdutchmen.com
Mon Jul 4 14:25:19 BST 2011


Hi all,

I'm running a small Apache cluster (2 loadbalancers in an active-passive setup, 2 realservers serving HTTP and HTTPS).

All machines are Dell PowerEdge (R200 and R410 series), none older than 2 years, running Ubuntu 11.04 with all packages updated.

My loadbalancers are configured with:

Heartbeat 3.0.4
Linux Director v1.186-ha
ipvsadm v1.25 2008/5/15 (compiled with popt and IPVS v1.2.1)

====  ldirectord.cf ====

checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=no

virtual=x.y.z.141:80
        real=x.y.z.135:80 gate 50
        real=x.y.z.136:80 gate 50
        fallback=127.0.0.1:80 gate
        service=http
        request="ldirector.html"
        receive="Test Page"
        scheduler=wrr
        protocol=tcp
        checktype=negotiate

virtual=x.y.z.141:443
        real=x.y.z.135:443 gate 50
        real=x.y.z.136:443 gate 50
        fallback=127.0.0.1:80 gate
        service=https
        request="ldirector.html"
        receive="Test Page"
        scheduler=wrr
        protocol=tcp
        checktype=negotiate

====  ipvsadm -ln ====

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  x.y.z.141:80 wrr
  -> x.y.z.135:80             Route   50     110        51        
  -> x.y.z.136:80             Route   50     103        59        
TCP  x.y.z.141:443 wrr
  -> x.y.z.135:443            Route   50     12         14        
  -> x.y.z.136:443            Route   50     12         6  

==== the problem ====

When I (re)start my loadbalancer, the load is evenly balanced over my two realservers. The output of ipvsadm -ln is at that moment comparable to the output above. This is all as I would expect. Then, after somewhere between 30 minutes and 2 hours, the output of ipvsadm -ln changes to something like this, while the load on the webservers does not significantly change:

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  x.y.z.141:80 wrr
  -> x.y.z.135:80             Route   50     0          1
  -> x.y.z.136:80             Route   50     1          1
TCP  x.y.z.141:443 wrr
  -> x.y.z.135:443            Route   50     2          1
  -> x.y.z.136:443            Route   50     1          0
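
To pin down when the counters change, the per-realserver columns of the ipvsadm -ln output could be extracted and logged at intervals. Below is a minimal Python sketch of the parsing step only, assuming the column layout shown above. The name parse_ipvsadm and the SAMPLE text are made up for illustration; the sketch does not run ipvsadm itself.

```python
import re

# Matches a realserver line such as:
#   -> x.y.z.135:80             Route   50     0        1
# (the column header line does not match, because "Port" is not numeric)
REAL_RE = re.compile(r"^\s*->\s+(\S+:\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")
# Matches a virtual service line such as: TCP  x.y.z.141:80 wrr
VIRT_RE = re.compile(r"^(TCP|UDP)\s+(\S+:\d+)\s+(\S+)")


def parse_ipvsadm(text):
    """Return {virtual: {real: (weight, active, inactive)}} from `ipvsadm -ln` text."""
    table = {}
    current = None
    for line in text.splitlines():
        m = VIRT_RE.match(line)
        if m:
            current = m.group(2)
            table[current] = {}
            continue
        m = REAL_RE.match(line)
        if m and current is not None:
            real, _forward, weight, active, inactive = m.groups()
            table[current][real] = (int(weight), int(active), int(inactive))
    return table


# Hypothetical input: the second ipvsadm -ln listing from this message.
SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  x.y.z.141:80 wrr
  -> x.y.z.135:80             Route   50     0        1
  -> x.y.z.136:80             Route   50     1        1
TCP  x.y.z.141:443 wrr
  -> x.y.z.135:443            Route   50     2         1
  -> x.y.z.136:443            Route   50     1         0
"""

if __name__ == "__main__":
    for virtual, reals in parse_ipvsadm(SAMPLE).items():
        for real, (weight, active, inactive) in reals.items():
            print(virtual, real, weight, active, inactive)
```

Logging these tuples with a timestamp every few seconds would show whether ActiveConn drops on both realservers at once or on one before the other.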

Around the same time, the loadbalancer seems to become unbalanced, sending all requests to one server. It is not always the same server; the choice seems to be random. This server then becomes heavily loaded, while the other server sits idle. After a while (perhaps 30 minutes) the loadbalancer starts to send packets to the unused realserver again. A little while after that, the balance tips again, often preferring the other server this time.

The problem is always solved by restarting heartbeat, at which time the other loadbalancer takes over and starts to distribute the load evenly again. Then, after a while, this server starts to display the same issue.

A clue to my problem might be found in the output of ipvsadm -ln --stats, which displays the following:

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  x.y.z.141:80                20899   198954        0 21196756        0
  -> x.y.z.135:80                10449   101877        0 10743318        0
  -> x.y.z.136:80                10450    97077        0 10453438        0
TCP  x.y.z.141:443                2852    46171        0  8971996        0
  -> x.y.z.135:443                1426    23411        0  4485847        0
  -> x.y.z.136:443                1426    22760        0  4486149        0

This would suggest that the connections are still evenly distributed over the two realservers, even though the load on the realservers themselves says otherwise.
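
To make that comparison explicit, the counters can be turned into per-realserver shares. This is a minimal Python sketch using the numbers copied from the --stats output above. The names STATS and shares are made up for illustration.

```python
# Per-realserver counters copied from the `ipvsadm -ln --stats` output above:
# (realserver, Conns, InPkts, InBytes)
STATS = {
    "x.y.z.141:80": [
        ("x.y.z.135:80", 10449, 101877, 10743318),
        ("x.y.z.136:80", 10450, 97077, 10453438),
    ],
    "x.y.z.141:443": [
        ("x.y.z.135:443", 1426, 23411, 4485847),
        ("x.y.z.136:443", 1426, 22760, 4486149),
    ],
}


def shares(rows, column):
    """Return {realserver: fraction of total} for one column (1=Conns, 2=InPkts, 3=InBytes)."""
    total = sum(row[column] for row in rows)
    # Guard against an all-zero column so a fresh service does not divide by zero.
    return {row[0]: (row[column] / total if total else 0.0) for row in rows}


if __name__ == "__main__":
    for virtual, rows in STATS.items():
        for name, col in (("Conns", 1), ("InPkts", 2), ("InBytes", 3)):
            parts = ", ".join(
                f"{real} {frac:.1%}" for real, frac in shares(rows, col).items()
            )
            print(f"{virtual} {name}: {parts}")
```

Note that these counters are cumulative totals, so a recent imbalance could be masked by an earlier balanced period. Taking two snapshots some minutes apart and running the same calculation on the differences would show how connections were distributed during that interval only.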

Can anyone help me locate the reason for the round robin scheduler not distributing my requests evenly?

Thanks in advance,
Luc van Donkersgoed

