heartbeat not able to start local resources

Jiang bearie66 at gmail.com
Thu Jun 29 05:23:03 BST 2006


Hi,

I recently upgraded heartbeat to version 2.0.2, with the following packages installed:
heartbeat-stonith-2.0.2-1
heartbeat-pils-2.0.2-1
heartbeat-2.0.2-1
heartbeat-ldirectord-1.2.3-2.rh.el.3.0

In haresources (without CRM), I have the following configuration:

****************************************
SMSCONV11 \
        172.16.1.80 \
        172.16.1.81 \
        ldirectord::/etc/ha.d/ldirectord.cf

SMSCONV11 interopOB1

SMSCONV12 interopOB2
****************************************
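
To make the grouping in the file above explicit, here is a minimal Python sketch. It assumes the usual haresources layout: the first token on a logical line is the preferred node, the remaining tokens are its resources, and a trailing backslash continues the line. The helper name parse_haresources is made up for this illustration and is not part of heartbeat.

```python
def parse_haresources(text):
    """Return a list of (node, [resources]) tuples, one per resource group."""
    groups = []
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            # Trailing backslash: the group continues on the next line.
            pending += line[:-1] + " "
            continue
        tokens = (pending + line).split()
        pending = ""
        # First token is the preferred node, the rest are its resources.
        groups.append((tokens[0], tokens[1:]))
    return groups


# The configuration quoted in the post, unchanged.
CONFIG = r"""
SMSCONV11 \
        172.16.1.80 \
        172.16.1.81 \
        ldirectord::/etc/ha.d/ldirectord.cf

SMSCONV11 interopOB1

SMSCONV12 interopOB2
"""

for node, resources in parse_haresources(CONFIG):
    print(node, "->", resources)
```

Read this way, SMSCONV11 is the preferred node for two groups (the two addresses plus ldirectord, and interopOB1), while SMSCONV12 is the preferred node for interopOB2. That matches the log, where SMSCONV11 takes over interopOB2 once smsconv12 is declared dead.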

With heartbeat not running on SMSCONV12, I tried to start heartbeat on
SMSCONV11, but according to /var/log/messages the process seems to get
stuck after bringing up interopOB2:

****************************************
Jun 29 12:03:45 SMSCONV11 heartbeat: [1202]: WARN: glib: TTY write timeout on [/dev/ttyS0] (no connection or bad cable? [see documentation])
Jun 29 12:04:00 SMSCONV11 heartbeat: [1198]: WARN: node smsconv12: is dead
Jun 29 12:04:00 SMSCONV11 heartbeat: [1198]: info: Local status now set to: 'active'
Jun 29 12:04:00 SMSCONV11 heartbeat: [1198]: info: Starting child client "/usr/lib/heartbeat/ipfail" (1001,104)
Jun 29 12:04:00 SMSCONV11 heartbeat: [1209]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 1001  gid 104 (pid 1209)
Jun 29 12:04:00 SMSCONV11 heartbeat: [1198]: WARN: No STONITH device configured.
Jun 29 12:04:00 SMSCONV11 heartbeat: [1198]: WARN: Shared disks are not protected.
Jun 29 12:04:00 SMSCONV11 heartbeat: [1198]: info: Resources being acquired from smsconv12.
Jun 29 12:04:00 SMSCONV11 harc[1210]: info: Running /etc/ha.d/rc.d/status status
Jun 29 12:04:00 SMSCONV11 mach_down[1239]: info: Taking over resource group interopOB2
Jun 29 12:04:00 SMSCONV11 heartbeat: [1211]: info: Local Resource acquisition completed.
Jun 29 12:04:00 SMSCONV11 ResourceManager[1299]: info: Acquiring resource group: smsconv12 interopOB2
Jun 29 12:04:00 SMSCONV11 ResourceManager[1299]: info: Running /etc/init.d/interopOB2  start
Jun 29 12:04:10 SMSCONV11 heartbeat: [1198]: info: Local Resource acquisition completed. (none)
Jun 29 12:04:10 SMSCONV11 heartbeat: [1198]: info: local resource transition completed.
****************************************
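
One way to see where each component stops logging is to keep only the last message per syslog tag. The following Python sketch does that for a few lines taken from the log above. The function name last_message_per_tag is hypothetical, and the parsing assumes the "month day time host tag: message" layout seen in this log.

```python
def last_message_per_tag(lines):
    """Map each syslog tag (pid suffix stripped) to its last (time, message)."""
    last = {}
    for line in lines:
        parts = line.split(None, 4)  # month, day, time, host, remainder
        if len(parts) < 5:
            continue
        tag, _, message = parts[4].partition(": ")
        tag = tag.split("[")[0]  # "ResourceManager[1299]" -> "ResourceManager"
        last[tag] = (parts[2], message)
    return last


# Lines copied from the log in the post, with mail line-wrapping undone.
LOG = [
    "Jun 29 12:04:00 SMSCONV11 mach_down[1239]: info: Taking over resource group interopOB2",
    "Jun 29 12:04:00 SMSCONV11 ResourceManager[1299]: info: Acquiring resource group: smsconv12 interopOB2",
    "Jun 29 12:04:00 SMSCONV11 ResourceManager[1299]: info: Running /etc/init.d/interopOB2  start",
    "Jun 29 12:04:10 SMSCONV11 heartbeat: [1198]: info: local resource transition completed.",
]

for tag, (when, message) in last_message_per_tag(LOG).items():
    print(when, tag, message)
```

For this excerpt, the last ResourceManager entry is the one that runs /etc/init.d/interopOB2 start, with nothing logged by ResourceManager after it. That is only suggestive, since a missing log line does not prove the script is still running.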

ps -e shows ResourceManager still waiting for something:
****************************************
 1117 pts/10   00:00:00 ha_logd
 1118 pts/10   00:00:00 ha_logd
 1198 ?        00:00:00 heartbeat
 1201 ?        00:00:00 heartbeat
 1202 ?        00:00:00 heartbeat
 1203 ?        00:00:00 heartbeat
 1204 ?        00:00:00 heartbeat
 1205 ?        00:00:00 heartbeat
 1206 ?        00:00:00 heartbeat
 1207 ?        00:00:00 heartbeat
 1209 ?        00:00:00 ipfail
 1210 ?        00:00:00 status
 1239 ?        00:00:00 mach_down
 1299 ?        00:00:00 ResourceManager
 1343 ?        00:00:00 ResourceManager
****************************************

[root@SMSCONV11 init.d]# ps -ef | grep Re
root      1299  1239  0 12:04 ?        00:00:00 /bin/sh /usr/lib/heartbeat/ResourceManager takegroup interopOB2
root      1343  1299  0 12:04 ?        00:00:00 /bin/sh /usr/lib/heartbeat/ResourceManager takegroup interopOB2
****************************************
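
The PID and PPID columns in the ps -ef output above can be followed upward to see which process started which. Below is a minimal Python sketch of that walk. The helper names parse_ps_ef and parent_chain are invented for this example, and the process names are taken from the ps -e listing in the post.

```python
def parse_ps_ef(text):
    """Build a {pid: ppid} map from `ps -ef` rows (UID PID PPID ...)."""
    parents = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            parents[int(fields[1])] = int(fields[2])
    return parents


def parent_chain(parents, pid):
    """Follow PPID links upward from pid while the parent is in the map."""
    chain = [pid]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


# Rows copied from the post, with mail line-wrapping undone.
PS_EF = """\
root      1299  1239  0 12:04 ?        00:00:00 /bin/sh /usr/lib/heartbeat/ResourceManager takegroup interopOB2
root      1343  1299  0 12:04 ?        00:00:00 /bin/sh /usr/lib/heartbeat/ResourceManager takegroup interopOB2
"""

# Names as shown by `ps -e` in the post.
NAMES = {1239: "mach_down", 1299: "ResourceManager", 1343: "ResourceManager"}

chain = parent_chain(parse_ps_ef(PS_EF), 1343)
print(" <- ".join(f"{NAMES.get(p, '?')}({p})" for p in chain))
```

The chain runs from ResourceManager 1343 back to ResourceManager 1299 and then to mach_down 1239, so both ResourceManager processes belong to the takeover of interopOB2. The output in the post does not show what 1343 itself is blocked on.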

After that, it never carries on to start interopOB1 and ldirectord on
SMSCONV11. I previously had the same problem with heartbeat version
1.2.3, and someone told me it had been fixed in 1.2.4. Since I couldn't
get 1.2.4 RPMs for RH EL3, I upgraded to heartbeat 2.0.2 instead.

Is this a known bug in heartbeat 2.0.2 as well? How can I fix it, or
have I configured something wrongly?

Your help is highly appreciated!!

- Jiang -

More information about the lvs-users mailing list