graeme at graemef.net
Thu Aug 27 16:20:04 BST 2009
On Thu, 2009-08-27 at 06:36 -0700, Robinson, Eric wrote:
> We have lvs+ldirectord running with 87 virtual servers and 285
> realservers, and growing. Most of the virtual servers use checktype=3.
> (Pertinent global settings are: checktimeout=3, checkinterval=5,
> Performance is excellent on a dual-core 3.0GHz Xeon with 1GB RAM. Does
> anyone know where the practical limits of scalability are? Where should
> we watch for potential bottlenecks?
It's important that you separate the two things which ldirectord/the
director are doing and view them independently of each other:
1. Doing load balancing (the director's kernel).
2. Health checking the realservers and manipulating the LVS table
(ldirectord).
Both of these are limited, in some ways which relate to each other and
in others which are completely separate. For example:
* In theory a given director with a given CPU and given NIC/driver
combination can push a given maximum number of packets/sec between two
interfaces. That number depends on the CPU speed, number of cores or SMP
architecture, PCI bus in use, location of NIC controllers relative to
each other (i.e. on the same bus or on different buses) and a few other
factors.
* The amount of RAM you have restricts the number of table entries your
director can keep state on (this is magnified if you're using the
netfilter conntrack stuff at the same time).
* The amount of RAM you have, coupled with available CPU cycles not
servicing interrupts caused by packets being routed, restricts the
number of concurrent processes you can use for health checking.
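
To put rough numbers on the last two points, here is a back-of-envelope
sketch in Python. It is not how ldirectord actually schedules its checks:
it simply assumes every realserver is checked once per checkinterval, and
the 128 bytes per connection entry is an assumed figure that you should
verify against your own kernel. The function names are made up for
illustration.

```python
# Back-of-envelope estimates only; every constant here is an assumption.

def checks_per_second(realservers: int, checkinterval: float) -> float:
    """Average number of health checks started per second, assuming each
    realserver is checked once every `checkinterval` seconds."""
    return realservers / checkinterval


def worst_case_concurrent_checks(realservers: int, checkinterval: float,
                                 checktimeout: float) -> float:
    """Checks in flight = start rate * duration of each check.
    The worst case is every check running until it times out."""
    return checks_per_second(realservers, checkinterval) * checktimeout


def table_memory_mib(connections: int, bytes_per_entry: int = 128) -> float:
    """Memory needed to keep state on `connections` entries.
    bytes_per_entry=128 is an assumed size, not a measured one."""
    return connections * bytes_per_entry / (1024 * 1024)


if __name__ == "__main__":
    # Figures from the original question: 285 realservers,
    # checkinterval=5, checktimeout=3.
    print(checks_per_second(285, 5))
    print(worst_case_concurrent_checks(285, 5, 3))
    print(round(table_memory_mib(1_000_000), 1))
```

Under those assumptions the setup in the question starts about 57 checks
per second, with at most 171 in flight if every realserver stops
answering at once, and a million table entries would need roughly 122 MiB.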
These days, though, RAM sizes and multi-core CPUs (with NICs using CPU
affinity automagically to spread their interrupt load) have become so
large in raw terms that your network is likely to be the bottleneck!
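
For a feel of where the network ceiling sits, the sketch below works out
the theoretical maximum frame rate of an Ethernet link. The gigabit link
speed is an assumption (the original question doesn't say what the
director is plugged into), and the function name is made up for
illustration. The per-frame overhead is the standard 8 bytes of preamble
plus a 12 byte inter-frame gap.

```python
# Theoretical Ethernet frame rate at line speed.

PREAMBLE_BYTES = 8          # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def max_frames_per_second(link_bits_per_second: float,
                          frame_bytes: int) -> float:
    """Frames per second a link can carry if every frame is
    `frame_bytes` long (64 is the Ethernet minimum, 1518 the usual
    maximum without VLAN tags or jumbo frames)."""
    wire_bytes = frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES
    return link_bits_per_second / (wire_bytes * 8)


if __name__ == "__main__":
    GIGABIT = 1e9  # assumed link speed
    for size in (64, 512, 1518):
        rate = max_frames_per_second(GIGABIT, size)
        print(f"{size:5d} byte frames: {rate:,.0f} frames/sec")
```

At gigabit speed that comes to just under 1.49 million minimum-size
frames per second, or about 81,000 full-size ones. Comparing those
ceilings with the packet rate your director actually sees should tell
you how much headroom the wire has left.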