[lvs-users] LVS Cluster with lighttpd servers doesn't react to SYN packets

jan_abraham at gmx.net
Tue Jan 20 13:06:56 GMT 2009


Hi list,

I'm seeing strange behaviour on a cluster of lighttpd servers which I can't explain.

Here are the facts: 
Xeon CPUs (2x Quad Core)
Gentoo Linux
Kernel 2.6.27.7
Glibc 2.6.1
Lighttpd 1.4.20
Cluster using LVS-DR

What happens:
Lighttpd simply doesn't accept connections. In a tcpdump on the realserver I can see the SYN packets arrive, but the connection isn't accepted on the server socket. No SYN-ACK goes out, nor a RST or anything else. After 3 SYN retries, the browser shows a timeout error.
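
To make the symptom measurable from a client, something like the following could be run against the virtual IP. It is only a minimal sketch: the function names and the 2.5 second threshold are my own assumptions (the threshold is picked to sit just below a 3 second initial SYN retransmit, which is what clients of this era typically use), not anything taken from lighttpd or LVS.

```python
import socket
import time


def time_connect(host, port, timeout=30.0):
    """Return the seconds needed to complete a TCP handshake, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection() returns once the three-way handshake is done.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return None


def classify(elapsed, retry_threshold=2.5):
    """Label one attempt. retry_threshold is an assumed cutoff, see the text above."""
    if elapsed is None:
        return "failed"
    # A handshake that took this long most likely needed at least one SYN retry.
    return "delayed" if elapsed >= retry_threshold else "ok"
```

Calling `classify(time_connect(vip, 80))` in a loop, with pauses of varying length between attempts, should show whether the "failed" or "delayed" results only appear after an idle period (`vip` being whatever address the director answers on).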

The strange thing is: if I try to connect again after the timeout, the connection is established and further connects work as well! At least most of the time; sometimes it only starts working after the 2nd or 3rd timeout/reload. After a short period without new connections, lighty again stops accepting new connections...

I have a huge installation of Apache webservers working well with the very same LVS-DR setup. The only notable difference between these two server types is that Apache uses traditional blocking I/O with forking processes, while lighttpd uses non-blocking I/O with select()/epoll-based multiplexing.
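
For readers not familiar with the second model, here is a rough sketch of a non-blocking accept path in the style lighttpd uses. It is not lighttpd's code; `make_listener` and `accept_ready` are hypothetical helper names, and Python's `selectors` module simply picks epoll on Linux.

```python
import selectors
import socket


def make_listener(host="127.0.0.1", port=0, backlog=128):
    """Create a non-blocking listening socket (port 0 lets the OS pick one)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    sock.setblocking(False)
    return sock


def accept_ready(listener, wait=5.0):
    """Wait until the listener is readable, then drain its accept queue."""
    accepted = []
    with selectors.DefaultSelector() as sel:  # epoll on Linux
        sel.register(listener, selectors.EVENT_READ)
        if not sel.select(timeout=wait):
            return accepted  # nothing arrived within `wait` seconds
        while True:
            try:
                conn, addr = listener.accept()
            except BlockingIOError:
                break  # queue is empty again
            accepted.append((conn, addr))
    return accepted
```

A blocking server such as a forking Apache child just sits in accept() instead of waiting on the selector first. In both models accept() only picks up connections whose handshake the kernel has already completed, so a missing SYN-ACK is hard to explain by the accept loop alone.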

While investigating this issue, I found a very similar problem reported on this list, titled "problems with ipvsadm - 3 seconds delay", dated February 2008 (http://lists.graemef.net/pipermail/lvs-users/2008-February/020541.html). I think it describes the very same problem, especially as it also occurs on Quad Core Xeons, which makes me suspect a timing problem or race condition in the kernel or glibc.

Any ideas 'round here that might help me track this issue down?

Regards,
Jan