Small packets handling : LVS-DR vs LVS-NAT

Nicolas Chiappero Nicolas.Chiappero at estat.com
Tue May 6 18:48:32 BST 2003


Hello all,

I've been using this simple LVS-DR setup for nearly 4 years
without encountering any problems:
                       _________
                      |         |
                      | Client  |
                      |_________|
                           |
                        (router)
                      _____|_____ 
                     |           |
                     | FireWall  |
                     |___________|
                           |
                      _____|_____ 
                     |           |
                     | DirectoR  |
                     |___________|
                           |
          +----------------+----------------+
   _______|_____     ______|______    ______|______
  |             |   |             |  |             |
  | RealServer  |   | RealServer  |  | RealServer  |
  |_____________|   |_____________|  |_____________|

The DirectoR has one Fast Ethernet NIC handling around 4000 inbound
packets/s according to ipvsadm --rate output.
The DirectoR is running a Linux 2.4 kernel and ipvs 1.0.6 with a
hash table size of 2^15.
The default gateway for the RealServers is the FireWall, so returning
packets are not handled by the DirectoR.

I've recently set up this new LVS-NAT architecture to provide 
failover:
                        ________
                       |        |
                       | Client |
                       |________|
                           |
                        (router)
             ____________  |  ____________
            |   Master   | | |   Backup   |
            |  DirectoR  |-+-|  DirectoR  |
            |(firewalled)| | |(firewalled)|
            |____________| | |____________|
                           |
          +----------------+----------------+
   _______|_____     ______|______    ______|______
  |             |   |             |  |             |
  | RealServer  |   | RealServer  |  | RealServer  |
  |_____________|   |_____________|  |_____________|

Each DirectoR has 2 Fast Ethernet NICs. One NIC is dedicated
to inbound packets; the other one handles outbound packets.
The DirectoRs are running a Linux 2.4 kernel and ipvs 1.0.4 with a
hash table size of 2^16.

I noticed that, under heavy traffic, there is a bottleneck
with the LVS-NAT setup which didn't show up with the older
LVS-DR setup. LVS-NAT seems to handle hardly more than
6000 packets/s (3000 in / 3000 out).
`ipvsadm -lnc | wc -l` shows a value around 62,000.

I guess I have missed something, because I thought these
2 setups were logically equivalent in terms of NIC use:
    - LVS-DR : one NIC handling inbound packets, returning packets
    not handled.
    - LVS-NAT: one NIC for inbound packets, other NIC for outbound
    packets.

(For those who may ask, firewalling is not an issue. I deactivated
it and didn't notice any change. BTW, if anyone is interested
in this specific setup director+firewall+LVS-NAT, let me know so
that I can explain clearly what I was able to do.)

I was using the eepro100 driver and successfully changed to the
Intel e100 driver. This driver offers a feature called
"CPU Cycle Saver": the adapter does not generate an
interrupt for every frame it receives. Instead, it waits until it
receives N frames before generating an interrupt.
As this LVS setup is mainly handling small packets, I tried
different values for N and noticed that it can push the limit back.
At least, it can now sustain 4000 inbound/4000 outbound packets/s.
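
To give a rough idea of why N matters, here is a minimal sketch of
the interrupt rate seen by one NIC. It assumes exactly one interrupt
per N received frames and ignores any timer-based flush the adapter
may also apply; the values of N are hypothetical, not the ones I
actually tested.

```python
# Rough estimate of receive interrupts per second under interrupt coalescing.
# Assumption: exactly one interrupt per N received frames, ignoring any
# timer-based flush the adapter may also apply.

def interrupts_per_second(packets_per_second, frames_per_interrupt):
    """Return the approximate RX interrupt rate for a given coalescing factor."""
    if frames_per_interrupt < 1:
        raise ValueError("frames_per_interrupt must be >= 1")
    return packets_per_second / frames_per_interrupt


if __name__ == "__main__":
    rate = 4000  # inbound packets/s, the figure quoted above
    for n in (1, 2, 4, 8, 16):  # hypothetical values of N
        print(f"N={n:2d}: ~{interrupts_per_second(rate, n):.0f} interrupts/s")
```

With small packets the per-interrupt cost dominates, so dividing the
interrupt rate by N is the effect being measured here, at the price
of some added latency per frame.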

I'm not sure I can deduce that IRQ handling is an issue when having
more than one NIC, and hence whether to consider this a workaround,
an optimization or a real solution.

Also, I don't know much about hardware design, but there may be some
bus (DMA?) limitations when using 2 or more NICs...

Reading my mail again, I just noticed that maybe my hash table
size is not correct. I can have up to 2000 requests/s. Each request
is served immediately (or so) and my tcp_fin_timeout value is 30.
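
As an order-of-magnitude check on these numbers, here is a small
sketch. It assumes every connection stays in the table for about
30 seconds after it is created (taking the tcp_fin_timeout value as
the hold time, which is only an assumption, since the IPVS per-state
timeouts may differ) and that the hash table uses chaining with
2^bits buckets.

```python
# Back-of-the-envelope estimate of connection table occupancy.
# Assumption: every connection stays in the table for about `hold_seconds`
# after it is created (here taken to be the tcp_fin_timeout value); the
# real IPVS per-state timeouts may differ.

def steady_state_entries(requests_per_second, hold_seconds):
    """Little's law: entries = arrival rate x time each entry is held."""
    return requests_per_second * hold_seconds


def average_chain_length(entries, table_bits):
    """Average entries per bucket for a hash table with 2**table_bits buckets."""
    return entries / (2 ** table_bits)


if __name__ == "__main__":
    entries = steady_state_entries(2000, 30)
    print(entries)  # 60000
    for bits in (15, 16):  # the two table sizes mentioned in this mail
        print(bits, round(average_chain_length(entries, bits), 2))
```

The estimate of 60000 entries is close to the 62,000 lines reported
by `ipvsadm -lnc | wc -l`, which would mean roughly one entry per
bucket on average with a 2^16 table.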

As you may have understood :), it is not clear to me where the
bottleneck is: NIC? IRQ? LVS type? Bus transfers? Connection table?
I'm a bit confused and I would greatly appreciate any help to figure
out which assumption is correct so as to find the appropriate solution.

Regards,
Nicolas.


