[lvs-users] ldirectord deletes pid when config is broken

Mark Riede mark.riede at aoe.com
Thu Apr 28 11:39:36 BST 2016


Hello,

When the config is broken and you try to stop or restart the daemon, or query its status, the newly started process checks the config, reports the error, exits with code 2 and, unfortunately, deletes the pid file of the healthy process that is still running.

OS: Ubuntu 14.04.4 LTS
ldirectord: 1:3.9.3+git20121009-3ubuntu2
Pacemaker: 1.1.10+git20130802-1ubuntu2.3

There are two nodes in a cluster.
ldirectord is checked with '/usr/sbin/ldirectord /etc/ldirectord.cf status' every 10 seconds.
When the config is broken, the check prints the error message and exits with code 2.
Pacemaker switches the resources to node 2.
Unfortunately, the process of ldirectord on node 1 is still running.

This is caused by the config check, which also runs when you stop or restart the process or query its status.
I think it should be enough to check the config only when the process starts.
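
To illustrate the idea, here is a minimal sketch of a status check that relies only on the pid file and never parses the config. The function name pidfile_is_alive is hypothetical and not part of ldirectord; it assumes the pid file holds a single numeric PID on its first line.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of ldirectord.
# Returns 1 if the pid file names a process that currently exists, else 0.
sub pidfile_is_alive {
    my ($pidfile) = @_;

    # A missing or unreadable pid file counts as "not running".
    open(my $fh, '<', $pidfile) or return 0;
    my $line = <$fh>;
    close($fh);

    # Assumption: the first line contains nothing but the PID.
    return 0 unless defined $line && $line =~ /^\s*(\d+)\s*$/;
    my $pid = $1;

    # Signal 0 delivers nothing; it only tests whether the process exists.
    # EPERM means the process exists but belongs to another user.
    return 1 if kill(0, $pid) || $!{EPERM};
    return 0;
}
```

With a check like this, a broken config or a failed DNS lookup would not affect the result of 'status', because the config is never read.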

By the way, the config errors are the result of using host names in the config and a failing DNS lookup when the process checks the config.

The pid file gets deleted by the function config_error.

###########

sub config_error
{
    my ($line, $msg) = @_;

    __config_log($line, "Error", $msg);
    if ($DAEMON_STATUS == $DAEMON_STATUS_STARTING) {
        &ld_rm_file("$RUNPID.$CFGNAME.pid");
        &ld_exit(2, "config_error: Configuration Error");
    } else {
        die;
    }
}

###########
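
One possible way to guard the removal, shown only as a sketch: delete the pid file only when the PID stored in it is that of the current process. The helpers read_pid and remove_own_pidfile are hypothetical names and do not exist in ldirectord; the sketch again assumes the pid file holds a single numeric PID.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of ldirectord:
# read the PID stored in a pid file, or return undef if there is none.
sub read_pid {
    my ($pidfile) = @_;
    open(my $fh, '<', $pidfile) or return undef;
    my $line = <$fh>;
    close($fh);
    return undef unless defined $line && $line =~ /^\s*(\d+)\s*$/;
    return $1;
}

# Hypothetical helper, not part of ldirectord:
# remove the pid file only if it belongs to the current process ($$).
# Returns 1 if the file was removed, 0 if it was left alone.
sub remove_own_pidfile {
    my ($pidfile) = @_;
    my $pid = read_pid($pidfile);
    return 0 unless defined $pid && $pid == $$;
    return unlink($pidfile) ? 1 : 0;
}
```

A short-lived 'status' or 'stop' invocation that hits a config error would then leave the pid file of the running daemon in place, since the stored PID is not its own.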


It looks like there are distinct phases (ld_init, ld_setup, etc.) that the process runs through every time you execute it.

###########

$DAEMON_STATUS = $DAEMON_STATUS_STARTING;
ld_init();
ld_setup();
ld_start();
ld_cmd_children("start", %LD_INSTANCE);
$DAEMON_STATUS = $DAEMON_STATUS_RUNNING;
ld_main();

###########


But as far as I was able to debug it, it seems that everything is handled by the function ld_init and the process never runs through the following functions.

You can reproduce the error by entering a typo in the config, such as 'checktimeoutt=3'.

Best regards,
Mark

