[lvs-users] LVS-DR + 2 pools in 2 networks = hair pulling
tom at calixo.net
Thu Apr 19 10:49:49 BST 2012
I would like to have the following flow, with A and B being two different networks (one public and one private range) on the same load balancer.
A client hits a server in pool A via VIPA. That server, using its RIP (RIPA1 or RIPA2), then hits VIPB and should receive an answer from a server in pool B. Everything works until RIPA1 or RIPA2 tries to connect to VIPB. The packet from RIPA1 or RIPA2 arrives on the load balancer (lb1), then nothing: it's as if the packet disappears.
lb1 is in both networks A _and_ B and only uses one gateway, the gateway of A.
lb1 eth0=18.104.22.168/24 - gw is 22.214.171.124
lb1 eth1= 10.1.1.10/24 - no gw
A servers are only in network A and use their respective gateway for this network
VIPA=126.96.36.199/24 (on eth0 so gw is 188.8.131.52)
B servers are only in network B and use their respective gateway for this network
VIPB=10.1.1.11/24 (on eth1 so gw is still 184.108.40.206)
lb1# ipvsadm -Ln
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 220.127.116.11:443 rr
-> 18.104.22.168:443 Route 1 0 0
-> 22.214.171.124:443 Route 1 0 0
TCP 10.1.1.11:80 rr
-> 10.1.1.12:80 Route 1 0 0
-> 10.1.1.13:80 Route 1 0 0
lb1# netstat -nr
Destination Gateway Genmask Flags MSS Window irtt Iface
126.96.36.199 0.0.0.0 255.255.255.0 U 0 0 0 eth0
10.1.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
0.0.0.0 188.8.131.52 0.0.0.0 UG 0 0 0 eth0
lb1# tcpdump -i eth1 ip dst 10.1.1.11 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
19:18:50.969769 IP 184.108.40.206.60877 > 10.1.1.11.http: Flags [S], seq 1331246417, win 5840, options [mss 1460,nop,nop,TS val 1603281160 ecr 0,nop,wscale 7], length 0
lb1# arp -a
RIPA1(220.127.116.11) at 00:18:51:28:aa:d3 [ether] on eth0
RIPB1 (10.1.1.12) at 00:18:51:5f:cd:11 [ether] on eth1
RIPA2 (18.104.22.168) at 00:18:51:e0:c6:e3 [ether] on eth0
gwA (22.214.171.124) at 00:18:19:9e:cf:ef [ether] on eth0
RIPB2 (10.1.1.13) at 00:18:51:9f:88:bd [ether] on eth1
*What I see*
Everything is load balanced properly in pool A 126.96.36.199/24 from ANY network.
Everything is load balanced properly in pool B 10.1.1.0/24 from this network ONLY. So any client in 10.1.1.0/24 will be load balanced and will hit RIPB1 or RIPB2.
But when, let's say, RIPA1 (188.8.131.52) sends a packet to VIPB (10.1.1.11), I see the packet coming in via eth1 on the load balancer lb1 and then nothing. No ARP rewrite, nothing. Out of desperation, I changed the gw on lb1 to 10.1.1.1, the gateway of B, so it's not 184.108.40.206 anymore. And guess what: then it works, but I lose a working pool A. It's like I cannot have both. So why do I need a gateway to make it work? The packet is coming in on the right interface (eth1), therefore on the right network, so an ARP rewrite should happen and Bob's your uncle... except it doesn't ;)
*What I CANNOT see*
A packet arriving on one of the B servers (RIPB1 or RIPB2).
1. So is it possible to use one load balancer to load balance 2 different networks __AND__ let the servers in the pools hit each other? Or is it wrong by design?
2. Where is the packet going on the load balancer ? How can I track it ? (tcpdump is not enough or I am doing it wrong):
- CentOS 6.2 - kernel: 2.6.32-220.4.2.el6.x86_64
- VMs on VMware with VMXNET3 NICs
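Regarding question 2, one thing that could be checked is the kernel's reverse path filter (rp_filter), which silently drops a packet when the route back to its source does not leave through the interface the packet arrived on. That it is involved here is only an assumption; nothing in the output above confirms it. A minimal read-only sketch that prints the current mode per interface (the helper name rp_filter_mode is made up for this example):

```shell
#!/bin/sh
# Read-only sketch: show the reverse path filter mode of every interface.
# Assumption: rp_filter may be what drops the packet; this is not confirmed.

# Map the numeric rp_filter value to its documented meaning.
rp_filter_mode() {
    case "$1" in
        0) echo "off" ;;
        1) echo "strict" ;;
        2) echo "loose" ;;
        *) echo "unknown" ;;
    esac
}

# The kernel applies the higher of conf/all and conf/<iface>, so print both.
for f in /proc/sys/net/ipv4/conf/*/rp_filter; do
    [ -r "$f" ] || continue
    iface=$(basename "$(dirname "$f")")
    value=$(cat "$f")
    echo "$iface: rp_filter=$value ($(rp_filter_mode "$value"))"
done

# Further steps, left commented out because they need root or change state:
#   sysctl -w net.ipv4.conf.all.log_martians=1    # log martian packets to the kernel log
#   ip route get 10.1.1.11 from <RIPA1> iif eth1  # ask the kernel how it would route the packet
```

If eth1 or "all" reports strict, a packet from network A entering on eth1 would be a candidate for being dropped before IPVS ever sees it, and the kernel log should show it once log_martians is enabled.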
Thanks for reading this far. Any kind of hint will be greatly appreciated. I am really curious to understand why this happens.