[lvs-users] Dead servers not being removed from pool, ldirectord

Michael Moody michael at gsc.cc
Thu May 29 02:20:21 BST 2008


The /etc/resolv.conf is identical.

The /etc/hosts is identical except this entry on the backup load balancer:

192.168.1.100   lvs1.bodybuilding.com lvs1

(I highly doubt that would have any bearing on it).
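
One way to check whether name resolution really behaves the same on both boxes is to time a reverse lookup of the same address on each. What follows is only a minimal Python sketch under stated assumptions, not anything taken from ldirectord: `time_reverse_lookup` is a hypothetical helper name, and the `resolver` parameter exists only so the function can be exercised without touching real DNS.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for a reverse lookup of ip.

    `resolver` defaults to socket.gethostbyaddr, which consults the system
    resolver (typically /etc/hosts and the servers in /etc/resolv.conf).
    """
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        name = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        name = None
    return name, time.monotonic() - start


# Example call (address taken from the hosts entry above; substitute the
# address the daemon would actually be looking up):
#   name, elapsed = time_reverse_lookup("192.168.1.100")
#   print(name, round(elapsed, 3))
```

A lookup that takes several seconds on one machine and returns immediately on the other would point at the resolver setup rather than at ldirectord.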

Any other suggestions?

Michael

Graeme Fowler wrote:
> On Wed, 2008-05-28 at 07:27 -0600, Michael S. Moody wrote:
>   
>> This happened again today, dead servers were not being removed. I had to
>> stop heartbeat, and allow the resources to transfer to the second load
>> balancer. Something is seriously wrong, but I don't know what it is. It
>> doesn't seem to happen on the second load balancer.
>>     
>
> Your strace (which I'll edit, and which is missing timestamps - next
> time, if you can, please use the "-tt" switch to get microsecond
> timing) shows the following:
>
> Setting up file descriptor 22, which is to be used to open a TCP stream
> socket:
>
>   
>> socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 22
>> ioctl(22, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffd25ea8c0) = -1 EINVAL (Invalid argument)
>> lseek(22, 0, SEEK_CUR)                  = -1 ESPIPE (Illegal seek)
>> ioctl(22, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffd25ea8c0) = -1 EINVAL (Invalid argument)
>> lseek(22, 0, SEEK_CUR)                  = -1 ESPIPE (Illegal seek)
>>     
>
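> The ioctl/lseek errors there are harmless (a socket is neither a
> terminal nor seekable), so ignore them. Now it sets close-on-exec,
> reads the file status flags, and switches the socket to non-blocking: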
>
>   
>> fcntl(22, F_SETFD, FD_CLOEXEC)          = 0
>> fcntl(22, F_GETFL)                      = 0x2 (flags O_RDWR)
>> fcntl(22, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
>>     
>
> ...and now we connect to your realserver:
>
>   
>> connect(22, {sa_family=AF_INET, sin_port=htons(21), sin_addr=inet_addr("192.168.1.195")}, 16) = -1 EINPROGRESS (Operation now in progress)
>>     
>
> ...and here select() reports FD 22 as writable, which is how a
> non-blocking connect signals that it has finished:
>
>   
>> select(24, NULL, [22], NULL, {0, 0})    = 1 (out [22], left {0, 0})
>>     
>
> ...and is now connected, so we get flags, set flags, and wait to read
> from it:
>
>   
>> connect(22, {sa_family=AF_INET, sin_port=htons(21), sin_addr=inet_addr("192.168.1.195")}, 16) = 0
>> fcntl(22, F_GETFL)                      = 0x802 (flags O_RDWR|O_NONBLOCK)
>> fcntl(22, F_SETFL, O_RDWR)              = 0
>>     
>
> The wait for data to read from FD 22 times out:
>
>   
>> select(24, [22], NULL, NULL, {0, 0})    = 0 (Timeout)
>>     
>
> ...and FD 22 - the FTP connection - is closed.
>
>   
>> close(22)                               = 0
>>     
>
> Rinse, repeat, etc.
>
> The lack of timestamps is a bit of a blocker here, as there's no way to
> discern how long ldirectord is waiting before the timeouts occur.
>
> I'll suggest one thing, however: does the affected realserver have the
> exact same hosts file (with obvious differences if that isn't a complete
> oxymoron) and resolver configuration as the working one?
>
> It strikes me that the connection may be timing out because the FTP
> daemon, xinetd, or some other wrapper is trying to do a reverse DNS
> lookup of the calling IP. If the daemon has to wait for that lookup to
> complete before returning the banner, perhaps ldirectord's timeout is
> shorter than the lookup, so it gives up and moves on?
>
> I think you've unearthed a config problem in your local setup, but it
> could be a bug. Let's go with making sure the realserver knows who
> everyone is first.
>
> Graeme
>
>
>
>   
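The system-call sequence walked through in the quoted strace (non-blocking connect, select for writability, select for readability, close) can be sketched in Python roughly as follows. This is an illustration of the pattern only, not ldirectord's code (ldirectord is written in Perl); `banner_check` and its `timeout` parameter are hypothetical names, and the sketch checks `SO_ERROR` where the trace shows a second connect() call.

```python
import errno
import select
import socket


def banner_check(host, port, timeout=5.0):
    """Return the first bytes the service sends, or None if none arrive in time."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)
        # Non-blocking connect: 0 means already connected,
        # EINPROGRESS means the handshake is still under way.
        err = sock.connect_ex((host, port))
        if err not in (0, errno.EINPROGRESS):
            return None
        # The socket becomes writable once the connect attempt has finished.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return None
        # SO_ERROR tells us whether that attempt succeeded.
        if sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) != 0:
            return None
        # The socket becomes readable once the server has sent its banner.
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            return None  # connected, but nothing arrived before the timeout
        return sock.recv(1024)
    finally:
        sock.close()
```

With a check shaped like this, a server that accepts the connection but is slow to send its banner looks the same as one that never sends it, which is why the timestamps matter.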

-- 

Michael S. Moody
Sr. Systems Engineer
Global Systems Consulting
Direct: (650) 265-4154
Web: http://www.GlobalSystemsConsulting.com

Engineering Support: support at gsc.cc
Billing Support: billing at gsc.cc
Customer Support Portal:  http://my.gsc.cc


NOTICE - This message contains privileged and confidential information intended only for the use of the addressee named above. If you are not the intended recipient of this message, you are hereby notified that you must not disseminate, copy or take any action in reliance on it. If you have received this message in error, please immediately notify Global Systems Consulting, its subsidiaries or associates. Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the view of Global Systems Consulting, its subsidiaries and associates.


