Kernel sync daemon causing lockups?

Bradley McLean bradlist at bradm.net
Mon Jul 29 14:33:48 BST 2002


We've just upgraded our load balancers (LBs) from LVS 1.02 to 1.04, and
keepalived from 0.5.6 to 0.6.8.

All addresses below are in the non-routable 192.168.x.x range; only the
last two octets are shown.

As part of the upgrade, we enabled the sync daemon again.  We'd
disabled it in the past because it was suspected of causing the
systems to hang.

Hardware:  Dell 2450, 600 MHz P3, 128 MB, DE570TX quad NIC.
OS:  RH7.2 w/ kernel.org 2.4.18 kernel.

Old configuration:  LVS-NAT, 4096 connections, 1800 persistence.
LB1: eth0: .100.4  eth1: .110.4  eth2: .120.4  eth3: .130.4
LB2: eth0: .100.5  eth1: .110.5  eth2: .120.5  eth3: .130.5
(eth0 admin, eth1 outside, eth2 inside, eth3 syncdaemon).
VIP: .110.10  RIPs in the .120.x net.  Sync daemon alone on .130
via crossover cable.

New configuration:  LVS-DR, 4096 connections, 900 persistence.
LB1: eth0: .100.4  eth1: .110.4  eth2: .120.4  eth3: .130.4
LB2: eth0: .100.5  eth1: .110.5  eth2: .120.5  eth3: .130.5
(eth0 admin, eth1 outside/inside, eth2 unused, eth3 syncdaemon)
VIP: .110.10  RIPs in the .110.x net.  Sync daemon alone on .130
via crossover cable.

Symptom:  After running well for 24-72 hours, the primary load balancer
locks up hard.  No kernel messages, no keyboard, no mouse, no ping.
A system reset is required.  The secondary load balancer runs for an
additional 4-12 hours, then fails as well.  With the sync daemon shut
off, they run well forever (or at least months at a time).

We've shut off the sync daemon again.

Anybody else see this?  What other information can I provide?

regards,

-Brad



