DR Load balancing active/inactive connections

RU Admin lvs-user at camden.rutgers.edu
Tue Nov 21 13:57:59 GMT 2006


I've been using IPVS for almost two years now.  I started out with 6 
machines (1 director, 5 real servers) and was using LVS-NAT.  During the 
first year that I was running that email server everything worked 
perfectly with LVS-NAT.  About a year ago, I decided to set up another 
email server, this time with 5 machines (1 director, 4 real servers) and 
decided it was time to get LVS-DR working, which I successfully did.  I 
then decided to switch over my first email server (the one with 6 
machines) to LVS-DR, since the other LVS-DR server was working great. 
Both of my email servers have been working great with LVS-DR for the past 
year, with one major exception (which has just recently started getting 
worse, because of the large volumes of connections coming into the 
servers).  The problem I am having is that my active/inactive connections 
are not being counted properly.  What I mean is that the counters for my 
active/inactive connections just keep going up and up, and are constantly 
skewed.  I read through a good number of archived messages on this 
mailing list, and I keep seeing everyone saying "Those numbers ipvsadm 
is showing are just for reference, they don't really mean anything, 
don't worry about them."  Well, I can tell you first hand, when you use 
wlc (weighted least connections), those numbers obviously DO mean 
something.  My machines are no longer being balanced equally, because my 
connection counts are off, and this is really affecting the 
performance of my email servers.  When running "ipvsadm -lcn", I can see 
connections in the CLOSE state count down from 00:59 to 00:01, and then 
magically jump back to 00:59 again for no reason.  The same holds true 
for ESTABLISHED connections: I see them go from 29:59 to 00:01 and then 
back to 29:59, even though I know for a fact that the connection from the 
client has ended.
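
To illustrate why the counters matter with wlc, here is a minimal sketch.
It assumes the scheduler picks the real server with the lowest
overhead-to-weight ratio, where overhead = ActiveConn * 256 + InActConn
(my understanding of the kernel's wlc scheduler, not something stated in
this thread).  The RealServer class, the wlc_pick function and the counts
for realserver2 and realserver3 are hypothetical, made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class RealServer:
    name: str
    weight: int
    active: int    # ActiveConn as reported by ipvsadm
    inactive: int  # InActConn as reported by ipvsadm


def overhead(rs: RealServer) -> int:
    # Assumption: an active connection counts 256 times an inactive one.
    return rs.active * 256 + rs.inactive


def wlc_pick(servers):
    """Return the server with the lowest overhead/weight ratio."""
    candidates = [rs for rs in servers if rs.weight > 0]
    return min(candidates, key=lambda rs: overhead(rs) / rs.weight)


# Hypothetical snapshot: realserver1 carries stale entries that never
# expired, the other two have realistic counts.
servers = [
    RealServer("realserver1", 50, 648, 2357),
    RealServer("realserver2", 50, 50, 200),
    RealServer("realserver3", 50, 48, 210),
]
picked = wlc_pick(servers)
print(picked.name)
```

Under these assumptions a server whose counters are inflated by stale
entries is skipped until the others catch up, regardless of its real
load, so the counters are more than a reference value.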

I'm currently using "IP Virtual Server version 1.2.0", and I know that 
there is a 1.2.1 version available, but my problem is that my email 
servers are in a production environment, and I really don't want to 
recompile a new kernel with the latest IPVS if that isn't going to solve 
the problem.  I'd hate to cause other problems with my system because of a 
major kernel upgrade.

I can only hope that someone has some suggestions.  I am a firm supporter 
of IPVS, and as I said I've been using it for 2 years now, and one of my 
email servers handles over 30,000,000 emails in one month (or roughly 1 
million emails a day).  So we rely heavily on IPVS.  There is another 
department in our organization that spent thousands of dollars on 
FoundryNet load balancing products, and I've been able to accomplish 
the same tasks (and handle a higher load) by using IPVS, so clearly IPVS 
is a solid product.  Unfortunately, I just really need to figure out what 
is going on with the connection count problems.

I'm not sure what information you guys need, but here's some info about my 
setup.  If you need any more details, feel free to ask.

6 Dell PowerEdge SC1425
Dual Xeon 3.06GHz processors
2GB DDR
160GB SATA
Running Debian Sarge

1 machine is the director, the other 5 are the real servers.  All 6 
machines are on the same subnet (with public IPs), and the director is 
using LVS-DR for load balancing.  Just to give you an idea as to the types 
of connection numbers I'm getting:
   Prot LocalAddress:Port Scheduler Flags
     -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
   TCP  vip.address.here:smtp wlc
     -> realserver1.ip.here:smtp     Route   50     648        2357
     -> realserver2.ip.here:smtp     Route   50     650        2231
     -> realserver3.ip.here:smtp     Route   50     648        2209
Whereas when using LVS-NAT (which was 100% perfect), my numbers would be 
something like:
     -> realserver1.ip.here:smtp     Masq    50     16         56
     -> realserver2.ip.here:smtp     Masq    50     14         50
     -> realserver3.ip.here:smtp     Masq    50     15         48
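
For anyone who wants to track these numbers over time, the listing can be
pulled apart with a few lines of Python.  This is only a sketch: it
assumes the column layout shown in the LVS-DR listing above, and ROW and
parse_rows are names I made up for illustration.

```python
import re

# Matches a real-server row such as:
#   -> realserver1.ip.here:smtp     Route   50     648        2357
# The header row does not match because "Weight" is not a number.
ROW = re.compile(r"^\s*->\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_rows(listing: str):
    """Return (address, forward, weight, active, inactive) per real server."""
    rows = []
    for line in listing.splitlines():
        m = ROW.match(line)
        if m:
            addr, fwd, weight, active, inactive = m.groups()
            rows.append((addr, fwd, int(weight), int(active), int(inactive)))
    return rows


listing = """\
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  vip.address.here:smtp wlc
  -> realserver1.ip.here:smtp     Route   50     648        2357
  -> realserver2.ip.here:smtp     Route   50     650        2231
  -> realserver3.ip.here:smtp     Route   50     648        2209
"""

rows = parse_rows(listing)
total_active = sum(r[3] for r in rows)
total_inactive = sum(r[4] for r in rows)
print(total_active, total_inactive)
```

Logging these totals at intervals would show whether the counters only
ever grow, which is the symptom described here.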

I use keepalived to manage the director and to monitor the real servers. 
The only "tweaking" that I've done to IPVS is that I have to run this:
   /sbin/ipvsadm --set 1800 0 0
before starting up keepalived, so that active connections will 
stay active for 30 minutes.  In other words, we allow our users to idle 
their connection for 30 minutes, and after that the connection 
should be terminated.  I put "0 0" there because, from what I've 
read, that tells ipvsadm not to change the other two values (in other 
words, leave the defaults as is).
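
As a sketch of what that command should leave me with, here is a small
model of the --set semantics, where the three arguments are the tcp,
tcpfin and udp timeouts in seconds and 0 means "keep the current value".
The starting values of 900/120/300 are an assumption (what I believe an
untouched director reports), and apply_set is a hypothetical helper, not
part of ipvsadm.

```python
# Assumed starting timeouts in seconds (tcp, tcpfin, udp) on an
# untouched director; check "ipvsadm -L --timeout" for the real values.
DEFAULTS = {"tcp": 900, "tcpfin": 120, "udp": 300}


def apply_set(current, tcp, tcpfin, udp):
    """Model 'ipvsadm --set tcp tcpfin udp': a 0 keeps the current value."""
    requested = {"tcp": tcp, "tcpfin": tcpfin, "udp": udp}
    return {k: (v if v != 0 else current[k]) for k, v in requested.items()}


timeouts = apply_set(DEFAULTS, 1800, 0, 0)
print(timeouts)
```

The 1800 second tcp value matches the 29:59 countdown I see on
ESTABLISHED entries in "ipvsadm -lcn".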

That's about all I can think of.  The only other weird thing that I had 
to do was to tweak some networking settings on the real servers to fix 
the pain-in-the-@$$ ARP issues that come with DR.  But I doubt those 
changes would have anything to do with the director's load balancing 
problems.  Those tweaks were only done on the real servers, and they were 
just to silence the broadcasting of the MAC address for the VIP (dummy0) 
interfaces on the real servers.

And for those interested, I switched from LVS-NAT to LVS-DR, because I 
really feel that you can get much better network throughput by using DR 
instead of NAT.  I know I've read a bunch of messages on the mailing list 
saying that NAT is just as good, but I think the one major advantage of 
IPVS is that it supports DR, whereas almost every other load balancing 
product I've seen uses some type of NATing (in other words, all network 
traffic goes in and out of the director).  To have a setup like I do now, 
where only incoming traffic has to go through the director, is absolutely 
fantastic, because the cluster (for lack of a better word) can be easily 
expanded.  With LVS-NAT, when you add more real servers, all you get is 
more CPU power; you don't get any more network throughput.  With LVS-DR, 
when you add a new real server, you expand the whole cluster, not 
just one part of it.

Sorry for the long email.  But I really would appreciate any help that can 
be provided.

Thanks!

Craig



