[lvs-users] Checkcommand defunct processes

Bruno Corsi dos Santos corsibp at yahoo.com.br
Thu Apr 24 15:22:13 BST 2008


Hi there,

I'm using ldirectord with the heartbeat daemon on CentOS 5 (x86_64). I noticed that when I use the "checkcommand" option to run a script or executable that checks whether a service is alive, the process exits properly but remains in the "defunct" state. Both up and down services are being tested, and I found that it happens only when the "checkcommand" process takes more than "checktimeout" seconds to complete.
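
To illustrate the symptom, here is a minimal sketch of how a "defunct" (zombie) process appears on Linux when a parent stops waiting for a slow child after a timeout and never reaps it. This is not ldirectord's code (ldirectord is a Perl script); the function names and the short timings are assumptions made up for the demonstration.

```python
import os
import time


def state_of(pid):
    """Return the state letter from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # Layout is "pid (comm) state ..."; split after the last ')'.
        return f.read().rsplit(")", 1)[1].split()[0]


def state_after_timeout(reap):
    """Fork a slow 'check', give up on it early, then report its state.

    Returns "Z" if the exited child is left unreaped (defunct),
    or "gone" if the parent collected it with waitpid().
    """
    pid = os.fork()
    if pid == 0:
        time.sleep(0.2)   # the check takes longer than the timeout
        os._exit(0)       # ...but still exits cleanly

    time.sleep(0.05)      # stand-in for "checktimeout" expiring
    time.sleep(0.5)       # parent carries on; child exits meanwhile

    if reap:
        os.waitpid(pid, 0)  # collecting the exit status removes the entry
        return "gone" if not os.path.exists(f"/proc/{pid}") else state_of(pid)

    state = state_of(pid)   # still in the process table as a zombie
    os.waitpid(pid, 0)      # clean up so the demo leaves nothing behind
    return state
```

Calling state_after_timeout(False) shows the zombie state, and state_after_timeout(True) shows that a single waitpid() call clears it, which is consistent with the defunct processes disappearing only when the parent is restarted.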

I wrote my own program to check whether my UDP service is up or down, which is why I have to use the "checkcommand" option.
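
For reference, a check of this kind could look roughly like the sketch below. It is not my actual program: the probe payload, the assumption that the service answers any datagram, and the argument order (virtual IP, virtual port, real IP, real port) are all assumptions for illustration. The socket timeout is deliberately kept short so that the check finishes well before "checktimeout".

```python
import socket
import sys


def udp_alive(host, port, payload=b"ping", timeout=2.0):
    """Send one datagram and report whether any reply arrives in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.connect((host, port))
            s.send(payload)
            s.recv(1024)
            return True
        except OSError:
            # Covers timeouts and ICMP "port unreachable" errors alike.
            return False


if __name__ == "__main__" and len(sys.argv) == 5:
    # Assumed argument order: virtual ip, virtual port, real ip, real port.
    try:
        ok = udp_alive(sys.argv[3], int(sys.argv[4]))
    except ValueError:
        ok = False
    sys.exit(0 if ok else 1)  # exit status 0 means the service is up
```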

I also saw that restarting heartbeat sometimes hangs, but that doesn't happen very often.

Has anyone already run into this problem? After a while I end up with thousands of "defunct" processes, so I have to restart ldirectord to get rid of them. I'm running the version of the script from the heartbeat-ldirectord-2.1.3-3.el5.centos rpm package in the CentOS repository. As a temporary workaround, I increased "checktimeout" so that the ldirectord timeout never triggers.

Any help would be welcome,

Thanks in advance,
Bruno
