[lvs-users] ldirectord sometimes stops balancing/counting connections after some time

Samy Ascha | Xel Media B.V. samy at xel.nl
Thu Jul 14 17:19:56 BST 2011

Dear members,

I joined the list today because I'm having some issues with ldirectord and cannot find any related information through Google etc.

I've been playing with linux-ha for some years now, and today, for the second time, I noticed that ldirectord had stopped balancing connections across the cluster nodes.

I have a simple test setup with heartbeat3, pacemaker and ldirectord/ipvs, plus 2 cluster nodes running Apache. What happens is the following:

The cluster works very nicely, balancing requests using 'wlc', and I check it with: ipvsadm -L -n. The output shows perfectly balanced connection counts, and there are no problems with heartbeat + crm whatsoever. Sometimes, after some minutes or hours, I start refreshing my test pages (just to see how cool linux-ha is) and notice I'm stuck on a single cluster node. Everything is up, the ldirectord check requests appear in the webserver logs as '200 OK', and all seems OK, but the second node of the 2-node cluster is never used.

At this point, monitoring with ipvsadm shows no updates to connection counts and I have to either restart the heartbeat service or switch to the failover balancer. After the switch things always work fine.
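
A minimal sketch of how such a stall could be spotted automatically: take two snapshots of the ipvsadm -L -n output some time apart and list the real servers whose ActiveConn/InActConn counters did not move. The addresses, the sample output and the function names below are hypothetical and only for illustration; an idle real server will also show unchanged counters, so a match is a hint and not proof of a fault.

```python
import re

# Matches a real-server row of `ipvsadm -L -n`, for example:
#   -> 192.0.2.21:80                Route   1      2          10
# The column header row ("-> RemoteAddress:Port ...") does not match,
# because its Weight/ActiveConn/InActConn fields are not numeric.
REAL_SERVER_ROW = re.compile(
    r"^\s*->\s+(?P<addr>\S+)\s+(?P<fwd>\S+)\s+(?P<weight>\d+)"
    r"\s+(?P<active>\d+)\s+(?P<inactive>\d+)\s*$"
)


def parse_counts(ipvsadm_output):
    """Return {real_server: (ActiveConn, InActConn)} from `ipvsadm -L -n` text."""
    counts = {}
    for line in ipvsadm_output.splitlines():
        match = REAL_SERVER_ROW.match(line)
        if match:
            counts[match["addr"]] = (int(match["active"]), int(match["inactive"]))
    return counts


def stalled_servers(before, after):
    """Return the real servers whose counters are identical in both snapshots."""
    return sorted(addr for addr in after if before.get(addr) == after[addr])


# Hypothetical snapshots (documentation-range addresses, made-up counters).
SAMPLE_BEFORE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.0.2.10:80 wlc
  -> 192.0.2.21:80                Route   1      2          10
  -> 192.0.2.22:80                Route   1      2          11
"""

SAMPLE_AFTER = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.0.2.10:80 wlc
  -> 192.0.2.21:80                Route   1      3          14
  -> 192.0.2.22:80                Route   1      2          11
"""

if __name__ == "__main__":
    before = parse_counts(SAMPLE_BEFORE)
    after = parse_counts(SAMPLE_AFTER)
    print(stalled_servers(before, after))
```

On a live balancer the two snapshots would come from running ipvsadm -L -n twice (as root) while generating test traffic in between, instead of from the sample strings.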

I cannot seem to find any useful info in the logs, and no errors or warnings appear to be reported.

My questions:

* Have any of you experienced the issue where ldirectord stops balancing requests after working fine for some time?

* If so, were there any traces of this failure in logs / command output, which I can use as a starting point in drilling down to the cause?

I'll be happy to supply configs / command output if needed.

Setup details:

* Complete setup inside VMware ESX 4.1 (the free version)
* 2x Ubuntu Lucid (LTS) active/passive loadbalancers, with heartbeat+pacemaker, with resources: IPAddr and ldirectord
* 2x Ubuntu Lucid (LTS) Apache webservers, failover-ip added on lo:0

I can give more details, but can't think of anything now ;)

Thanks very much! Hope to hear from you soon. I can't wait to introduce linux-ha in our critical systems and really need this thing cleared up before we do ;)

Kind regards,
Samy Ascha

