heartbeat trying to start ldirectord twice

Kenny Dail kend at amigo.net
Sat Jan 20 12:57:38 GMT 2007


Hello happy list,

I have a heartbeat + ldirectord setup spanning several IPs. From
haresources:
cerberus        ip1/24/eth0 ip2/24/eth0 ip3/24/eth0 ip4/24/eth0 ip5/24/eth0
ip6/24/eth1 ip7/24/eth1 ldirectord

That one line is all I have; it is slightly edited and wrapped here.
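
For readers less familiar with the format, here is a minimal Python sketch of how that haresources entry breaks down: the first field is the preferred node and the remaining fields are the resources, which heartbeat starts left to right on takeover (hence ldirectord coming up after the addresses). The function name and the dictionary keys are made up for illustration, and the sketch only handles the address/prefix/interface form and bare script names seen above, not the full haresources syntax.

```python
def parse_haresources_line(line):
    """Split one (unwrapped) haresources entry into (node, resources).

    Hypothetical helper for illustration only; it is not part of heartbeat.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty haresources line")
    node, raw_resources = fields[0], fields[1:]

    resources = []
    for item in raw_resources:
        parts = item.split("/")
        if len(parts) == 3:
            # address/prefix/interface shorthand, handled by the IPaddr script
            resources.append({
                "type": "IPaddr",
                "address": parts[0],
                "prefix": parts[1],
                "interface": parts[2],
            })
        else:
            # anything else is treated here as a plain resource script name
            resources.append({"type": "script", "name": item})
    return node, resources
```

Feeding it the unwrapped line from above gives cerberus as the node, seven IPaddr entries, and ldirectord as the last resource in the group.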

ha.cf is pretty simple:
logfacility     local0
bcast eth1
node    hydra cerberus


ldirectord has a fairly large ldirectord.cf in /etc/ha.d/, and it all
works fine on the main node, cerberus. The secondary node, hydra,
has undergone some software updates. Heartbeat failover works in that
hydra detects when cerberus dies and takes over the network interfaces.
However, hydra brings up the interfaces and ldirectord, things work for
a few seconds, and then heartbeat tries to start everything again.
ldirectord complains that it is already running, and heartbeat gives up
the resources.

So what do I have set up wrong?

This is logged in messages:
Jan 20 04:49:02 hydra heartbeat: [14686]: info: Status update for node cerberus: status active
Jan 20 04:49:02 hydra heartbeat: [14696]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Jan 20 04:49:03 hydra harc[14696]: info: Running /etc/ha.d/rc.d/status status
Jan 20 04:49:03 hydra heartbeat: [14686]: info: Link hydra:eth1 up.
Jan 20 04:49:58 hydra heartbeat: [14686]: info: Received shutdown notice from 'cerberus'.
Jan 20 04:49:58 hydra heartbeat: [14686]: info: Resources being acquired from cerberus.
Jan 20 04:49:58 hydra heartbeat: [14706]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Jan 20 04:49:59 hydra harc[14706]: info: Running /etc/ha.d/rc.d/status status
Jan 20 04:49:59 hydra heartbeat: [14707]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys hydra] to acquire.
Jan 20 04:49:59 hydra heartbeat: [14686]: debug: StartNextRemoteRscReq(): child count 1
Jan 20 04:49:59 hydra mach_down[14719]: info: Taking over resource group ip1/24/eth0
Jan 20 04:49:59 hydra ResourceManager[14746]: info: Acquiring resource group:<snip>
[many lines cut concerning the start of IPaddr for each ip]
Jan 20 04:50:11 hydra ResourceManager[14746]: info: Running /etc/ha.d/resource.d/ldirectord start
Jan 20 04:50:11 hydra ResourceManager[14746]: debug: Starting /etc/ha.d/resource.d/ldirectord  start
Jan 20 04:50:13 hydra ldirectord[16829]: Starting Linux Director v1.77.2.5 as daemon
Jan 20 04:50:13 hydra ResourceManager[14746]: debug: /etc/ha.d/resource.d/ldirectord  start done. RC=0
Jan 20 04:50:13 hydra mach_down[14719]: info: mach_down takeover complete for node cerberus.
[21 lines: ldirectord[16831]: Added virtual server: xxx]
[3 lines: ldirectord[16831]: Added fallback server: xxx]
[40 lines: ldirectord[16831]: Quiescent real server: xxx]
[20 lines: ldirectord[16831]: Restored real server: xxx]
Jan 20 04:50:29 hydra heartbeat: [14686]: WARN: node cerberus: is dead
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Dead node cerberus gave up resources.
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Resources being acquired from cerberus.
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Link cerberus:eth1 dead.
Jan 20 04:50:29 hydra heartbeat: [17044]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Jan 20 04:50:29 hydra harc[17044]: info: Running /etc/ha.d/rc.d/status status
Jan 20 04:50:29 hydra heartbeat: [17045]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys hydra] to acquire.
Jan 20 04:50:29 hydra heartbeat: [14686]: debug: StartNextRemoteRscReq(): child count 1
Jan 20 04:50:29 hydra mach_down[17064]: info: Taking over resource group ip1/24/eth0
Jan 20 04:50:29 hydra ResourceManager[17084]: info: Acquiring resource group: <snip>
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Comm_now_up(): updating status to active
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Local status now set to: 'active'
Jan 20 04:50:30 hydra heartbeat: [17108]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys hydra] to acquire.
Jan 20 04:50:30 hydra heartbeat: [14686]: debug: StartNextRemoteRscReq(): child count 1
Jan 20 04:50:30 hydra IPaddr[17112]: INFO: IPaddr Running OK
Jan 20 04:50:31 hydra IPaddr[17224]: INFO: IPaddr Running OK
Jan 20 04:50:31 hydra IPaddr[17330]: INFO: IPaddr Running OK
Jan 20 04:50:32 hydra IPaddr[17436]: INFO: IPaddr Running OK
Jan 20 04:50:32 hydra IPaddr[17542]: INFO: IPaddr Running OK
Jan 20 04:50:33 hydra IPaddr[17654]: INFO: IPaddr Running OK
Jan 20 04:50:34 hydra IPaddr[17760]: INFO: IPaddr Running OK
Jan 20 04:50:35 hydra ResourceManager[17084]: info: Running /etc/ha.d/resource.d/ldirectord  start
Jan 20 04:50:35 hydra ResourceManager[17084]: debug: Starting /etc/ha.d/resource.d/ldirectord  start
Jan 20 04:50:37 hydra ResourceManager[17084]: debug: /etc/ha.d/resource.d/ldirectord  start done. RC=1
Jan 20 04:50:37 hydra ResourceManager[17084]: ERROR: Return code 1 from /etc/ha.d/resource.d/ldirectord
Jan 20 04:50:37 hydra ResourceManager[17084]: CRIT: Giving up resources due to failure of ldirectord
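
To make the double takeover easier to spot in a long log, here is a rough Python sketch that pulls out every "Resources being acquired from" line and every resource script "start done" line with its return code. The function name and regular expressions are my own and are matched only against the message formats shown above, so treat them as assumptions rather than anything heartbeat provides.

```python
import re

# Patterns written against the log excerpts in this post (assumed formats).
ACQUIRE = re.compile(r"Resources being acquired from (\S+?)\.?$")
START_DONE = re.compile(r"resource\.d/(\S+)\s+start done\. RC=(\d+)")


def summarize_takeovers(lines):
    """Return (acquisitions, starts) found in an iterable of syslog lines.

    acquisitions: list of (timestamp, node)
    starts:       list of (timestamp, resource, return_code)
    """
    acquisitions = []
    starts = []
    for line in lines:
        line = line.rstrip()
        stamp = line[:15]  # e.g. "Jan 20 04:49:58"
        match = ACQUIRE.search(line)
        if match:
            acquisitions.append((stamp, match.group(1)))
            continue
        match = START_DONE.search(line)
        if match:
            starts.append((stamp, match.group(1), int(match.group(2))))
    return acquisitions, starts
```

Run over the log above, this reports two acquisitions from cerberus (04:49:58 after the shutdown notice, and 04:50:29 after the node is declared dead) and two ldirectord starts, the first with RC=0 and the second with RC=1.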

-- 
Kenny Dail <kend at amigo.net>

