[lvs-users] Checkcommand defunct processes
Bruno Corsi dos Santos
corsibp at yahoo.com.br
Thu Apr 24 15:22:13 BST 2008
Hi there,
I'm using ldirectord with the heartbeat daemon on CentOS 5 (x86_64). When I use the "checkcommand" option to run a script or executable that checks whether a service is alive, the check process exits properly but then stays in the "defunct" state. Some of the services being tested are up and some are down, and I noticed that this only happens when the "checkcommand" process takes more than "checktimeout" seconds to complete.
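For reference, a "defunct" (zombie) process is a child that has already exited but whose parent has not yet collected its exit status. The following is a minimal, Linux-only sketch of that general mechanism in Python. It is not ldirectord code and makes no claim about where ldirectord loses track of its children; the helper names (process_state, spawn_and_exit, wait_for_state) are hypothetical.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field is the first token after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def spawn_and_exit():
    """Fork a child that exits at once; return its PID without reaping it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately
    return pid       # parent: deliberately does NOT call waitpid() here


def wait_for_state(pid, wanted, timeout=2.0):
    """Poll until the process reaches the wanted state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if process_state(pid) == wanted:
            return True
        time.sleep(0.01)
    return False
```

After spawn_and_exit(), the child shows state "Z" (what ps labels "defunct") until the parent calls os.waitpid(pid, 0), at which point the entry disappears from the process table.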
I wrote my own code to check whether my UDP service is up or down, which is why I have to use the "checkcommand" option.
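A check of this kind might look roughly like the sketch below. It is not the actual script from this setup: the probe payload, the 2-second timeout, and the function name udp_service_alive are all assumptions, and it assumes the UDP service answers any datagram with some reply. It also assumes the usual convention that exit status 0 means "alive"; how ldirectord passes the host and port to the command is not covered here. The internal timeout should be kept below "checktimeout" so the check finishes on its own.

```python
import socket
import sys


def udp_service_alive(host, port, probe=b"ping", timeout=2.0):
    """Send one datagram and report whether any reply arrives in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(probe, (host, port))
            sock.recvfrom(4096)
            return True
        except OSError:
            # socket.timeout is a subclass of OSError, so this covers
            # both a missing reply and other socket-level errors.
            return False


# Hypothetical command-line use: check.py <host> <port>
if __name__ == "__main__" and len(sys.argv) == 3:
    alive = udp_service_alive(sys.argv[1], int(sys.argv[2]))
    sys.exit(0 if alive else 1)
```

Because the function enforces its own timeout, the check returns a definite answer even when the server is down, instead of blocking until something else kills it.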
I have also seen the heartbeat restart hang occasionally, but that doesn't happen often.
Has anyone already run into this problem? After a while I end up with thousands of "defunct" processes and have to restart ldirectord to get rid of them. I'm running the version of the script from the heartbeat-ldirectord-2.1.3-3.el5.centos rpm package from the CentOS repository. As a temporary workaround, I increased "checktimeout" so that the ldirectord timeout never triggers.
Any help would be welcome,
Thanks in advance,
Bruno