[lvs-users] Oddity from monitoring server

Graeme Fowler graeme at graemef.net
Thu Feb 4 10:13:11 GMT 2010


On Wed, 2010-02-03 at 17:36 -0600, Anoop Bhat wrote:
> Occasionally, we have seen periods where the monitoring server (opennms) will report that it can't get to SSH on any of the VIPs that are hosted on our LVS clusters. We have several clusters and it seems that at once all of them will be unresponsive. We've only seen this happen with SSH and none of the other services like http.

This comes up on the OpenNMS mailing lists from time to time.

More often than not, it's because a DNS server is unresponsive. The
polling host connects to the IP it expects SSH to be on; the SSH daemon
wants to do a reverse lookup of the incoming IP, and gets a delay in DNS
response. It won't reveal the banner until the lookup completes, so the
poll times out and OpenNMS flags an outage, which goes away 30 seconds
later.
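
If you want to see the delay for yourself, something along these lines
will time how long a VIP takes to hand over its SSH banner. This is only
a rough sketch in Python, not what OpenNMS actually runs; the function
name and the default timeout are my own inventions.

```python
import socket
import time


def time_ssh_banner(host, port=22, timeout=10.0):
    """Connect to host:port and time how long the SSH banner takes to arrive.

    Returns (elapsed_seconds, banner_line). Raises OSError (which includes
    socket.timeout) if the connect or a read fails. Note that the timeout
    applies to each socket operation, not to the whole exchange.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        data = b""
        # The server sends its identification string first, ending in a newline.
        while b"\n" not in data:
            chunk = sock.recv(256)
            if not chunk:
                break  # peer closed the connection before sending a full line
            data += chunk
    elapsed = time.monotonic() - start
    banner = data.split(b"\n", 1)[0].decode("ascii", "replace").strip()
    return elapsed, banner
```

Run it from the monitoring host against one of the VIPs during an
"outage". A TCP connect that succeeds immediately followed by a banner
that takes many seconds to show up points at the reverse lookup rather
than at LVS.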

One workaround (not a fix!) is to make sure the OpenNMS host is listed
in /etc/hosts on the affected system, and to make sure that your name
resolution system is configured to use files before DNS.
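
On a Linux/glibc box (an assumption on my part; other systems configure
this elsewhere) the lookup order lives on the "hosts:" line of
/etc/nsswitch.conf. A quick way to check it, again just a sketch with a
made-up function name:

```python
import re


def files_before_dns(nsswitch_text):
    """Check the 'hosts:' line of nsswitch.conf-style text.

    Returns True if 'files' is consulted before 'dns' (or 'dns' is absent),
    False if 'files' is missing or comes after 'dns', and None if there is
    no 'hosts:' line at all.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0]          # drop comments
        if ":" not in line:
            continue
        name, rest = line.split(":", 1)
        if name.strip() != "hosts":
            continue
        # Remove action items such as [NOTFOUND=return]; only sources matter here.
        rest = re.sub(r"\[[^\]]*\]", " ", rest)
        sources = rest.split()
        if "files" not in sources:
            return False
        if "dns" not in sources:
            return True
        return sources.index("files") < sources.index("dns")
    return None


# Example: read the real file on the affected system.
# with open("/etc/nsswitch.conf") as fh:
#     print(files_before_dns(fh.read()))
```

The matching /etc/hosts entry is a single line of the form
"192.0.2.10  opennms.example.com opennms", where both the address and
the names are placeholders for your own monitoring host.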

You did look at the OpenNMS archives first, didn't you?

Graeme



