[lvs-users] SOLVED Re: LVS + Xen issue

Matthias Saou thias at spam.spam.spam.spam.spam.spam.spam.egg.and.spam.freshrpms.net
Wed Aug 8 18:13:19 BST 2007


Matthias Saou wrote :

> I'm setting up various Xen guests, and want to use LVS to load-balance
> web traffic across them. I've tried two similar simple setups, and with
> both I see the same issue where LVS doesn't work properly when the
> director sends the request to a real server on the same physical Xen
> host.
> 
> Scenario 1 :
> - 3 physical servers (Xen Hosts) with eth0 and eth1
> - 3 web servers (Xen guests), one per host, listening only on eth1
> - LVS NAT is configured using keepalived on the first Xen Host
> 
> When I make a web request to the LVS director, it works fine when the
> request goes to the 2nd or 3rd web server, but I only get about the
> first 12 KB of the page when it goes to the 1st web server (the only
> one on the same Xen Host as LVS). Pages smaller than 12 KB are fine.
> 
> Scenario 2 :
> - 3 physical servers (Xen Hosts) with eth0 and eth1
> - 3 web servers (Xen guests), one per host, listening only on eth1
> - 1 LVS director (Xen guest), on the first Xen Host, eth0 and eth1
> 
> The exact same problem happens.
> 
> Has anyone already seen this issue?
> Here are a few more details :
> - RHEL5 x86_64 with latest 2.6.18-8.1.8.el5xen kernel
> - keepalived package from Fedora recompiled for RHEL5
> - net.ipv4.ip_forward = 1 on the LVS director
> - -A POSTROUTING -s 192.168.0.0/255.255.0.0 -o eth0 -j MASQUERADE
> 
> If I remove the "local" web server from the keepalived/LVS
> configuration, it works fine, since it only sends to the real servers
> on the other physical servers, but that would mean not using the first
> physical server's CPU power and memory, which I don't want to be
> wasting.
> 
> I'm pretty sure this has something to do with connection tracking
> and/or the bridges Xen configures, but I don't know what to try to fix
> the issue.

Daniel P. Berrange (who hacks extensively on Xen over at Red Hat)
suggested it might be a TCP checksum offload issue... and it was!

The solution is simply to use 'ethtool -K ethX tx off' on all relevant
interfaces, and it all starts working as expected.
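
As a minimal sketch of the fix above: the helper below just loops the
same 'ethtool -K ethX tx off' command over a list of interfaces. The
function name, the DRY_RUN switch and the interface names eth0/eth1 are
assumptions made for illustration; which interfaces are "relevant"
depends on the setup (Xen host, guests, or both).

```shell
#!/bin/sh
# Sketch: turn off TX checksum offload on each interface given as argument.
# Set DRY_RUN=1 to print the commands instead of running them (no root needed).
disable_tx_offload() {
    for ifname in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ethtool -K $ifname tx off"
        else
            # Same command as in the message above; requires root.
            ethtool -K "$ifname" tx off || return 1
        fi
    done
}

# Example (interface names are assumptions, adjust to your hosts and guests):
#   disable_tx_offload eth0 eth1
#   ethtool -k eth0    # lowercase -k lists the current offload settings
```

Note that settings changed with ethtool do not survive a reboot, so the
command probably needs to be re-applied at boot in whatever way the
distribution provides.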

I hope this will be useful to others. I'm certainly glad to know it
now :-)

Matthias

-- 
Clean custom Red Hat Linux rpm packages : http://freshrpms.net/
Fedora release 7 (Moonshine) - Linux kernel 2.6.22.1-41.fc7
Load : 0.56 0.46 0.45



