'no hit' for LVS connection tracking (SYN+ACK not translated)

Jari Takkala Jari.Takkala at Q9.com
Mon Aug 15 19:46:53 BST 2005


Hello,

We have a fairly large installation of both single and failover-pair load balancers in an LVS-NAT setup. On one of these failover pairs, we have been experiencing a problem where the SYN+ACK reply from a real server does not get source NAT'd to the VIP.

We are load balancing both HTTP and FTP connections, and the problem seems to manifest itself only when we add the FTP service to ipvsadm. The odd thing is that it affects all VIPs.

Here is a tcpdump on the external interface of the load balancer:

03:20:42.944203 0:12:d9:92:d2:b2 0:2:b3:a1:90:d4 ip 74: 216.220.XX.XXX.9345 > 10.99.23.64.http: S [tcp sum ok] 250882373:250882373(0) win 5840 <mss 1380,sackOK,timestamp 2855118908 0,nop,wscale 0> (DF) (ttl 63, id 41526, len 60)
03:20:42.944533 0:2:b3:a1:90:d4 0:12:d9:92:d2:b2 ip 78: 10.99.22.53.http > 216.220.XX.XXX.9345: S [tcp sum ok] 3305203912:3305203912(0) ack 250882374 win 16560 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK> (DF) (ttl 127, id 55367, len 64)
03:20:42.944671 0:12:d9:92:d2:b2 0:2:b3:a1:90:d4 ip 78: 216.220.XX.XXX.9345 > 10.99.22.53.http: R [tcp sum ok] 1:1(0) ack 1 win 16560 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK> (DF) (ttl 127, id 55367, len 64)

216.220.XX.XXX is the client IP (CIP)
10.99.23.64 is the virtual IP on the outside interface of the load balancer (VIP)
10.99.22.53 is the IP of the real server (RIP)

As can be seen, the incoming SYN from 216.220.XX.XXX goes to the VIP (10.99.23.64) and is translated to one of the real servers (10.99.22.53). The real server returns a SYN+ACK to the CIP, but the reply is not translated: it leaves the external interface with the RIP as its source address, and the client answers with a RST.
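
The behaviour expected from LVS-NAT here can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the IPVS kernel code: the names ConnTable, Conn, schedule, lookup_in, lookup_out, and translate_reply are hypothetical, and the model assumes that a connection entry can be found both by (CIP, VIP) for incoming packets and by (RIP, CIP) for replies.

```python
# Toy model of LVS-NAT connection tracking. NOT the IPVS kernel code.
# All class and function names below are hypothetical, for illustration only.
from typing import Dict, NamedTuple, Optional, Tuple

Endpoint = Tuple[str, int]  # (address, port)


class Conn(NamedTuple):
    client: Endpoint  # CIP:port
    vip: Endpoint     # VIP:port
    rip: Endpoint     # RIP:port


class ConnTable:
    """Connection entries indexed for both packet directions."""

    def __init__(self) -> None:
        self._by_in: Dict[Tuple[Endpoint, Endpoint], Conn] = {}
        self._by_out: Dict[Tuple[Endpoint, Endpoint], Conn] = {}

    def schedule(self, client: Endpoint, vip: Endpoint, rip: Endpoint) -> Conn:
        """Create an entry when a new connection is assigned to a real server."""
        conn = Conn(client, vip, rip)
        self._by_in[(client, vip)] = conn   # key for client -> VIP packets
        self._by_out[(rip, client)] = conn  # key for real server -> client replies
        return conn

    def lookup_in(self, src: Endpoint, dst: Endpoint) -> Optional[Conn]:
        return self._by_in.get((src, dst))

    def lookup_out(self, src: Endpoint, dst: Endpoint) -> Optional[Conn]:
        return self._by_out.get((src, dst))


def translate_reply(table: ConnTable, src: Endpoint,
                    dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
    """Rewrite the source of a reply to the VIP if the outgoing lookup hits.

    On a miss the packet is returned unchanged, which corresponds to the
    untranslated SYN+ACK seen on the external interface.
    """
    conn = table.lookup_out(src, dst)
    if conn is None:
        return src, dst
    return conn.vip, dst


if __name__ == "__main__":
    cip = ("216.220.XX.XXX", 9345)  # masked client address, as in the captures
    vip = ("10.99.23.64", 80)
    rip = ("10.99.22.53", 80)

    table = ConnTable()
    table.schedule(cip, vip, rip)
    print("expected reply:", translate_reply(table, rip, cip))
    print("on a miss:     ", translate_reply(ConnTable(), rip, cip))
```

In this model a reply from the RIP always finds its entry once the connection has been scheduled. What we observe corresponds to the second case, where the outgoing lookup finds nothing and the packet is forwarded as is.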

Here is the tcpdump from the internal interface:

03:20:42.944316 0:2:b3:a1:90:d5 0:2:55:9c:55:c3 ip 74: 216.220.XX.XXX.9345 > 10.99.22.53.http: S [tcp sum ok] 250882373:250882373(0) win 5840 <mss 1380,sackOK,timestamp 2855118908 0,nop,wscale 0> (DF) (ttl 63, id 41526, len 60)
03:20:42.944483 0:2:55:9c:55:c3 0:2:b3:a1:90:d5 ip 78: 10.99.22.53.http > 216.220.XX.XXX.9345: S [tcp sum ok] 3305203912:3305203912(0) ack 250882374 win 16560 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK> (DF) (ttl 128, id 55367, len 64)
03:20:42.944703 0:2:b3:a1:90:d5 0:2:55:9c:55:c3 ip 78: 216.220.XX.XXX.9345 > 10.99.22.53.http: R [tcp sum ok] 1:1(0) ack 1 win 16560 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK> (DF) (ttl 126, id 55367, len 64)

Packet flow on this interface is normal. MAC addresses in both tcpdump captures appear normal.

Here is the debug output:

Aug 13 03:20:43 kernel: IPVS: lookup/in TCP 216.220.XX.XXX:9345->10.99.23.64:80 hit
Aug 13 03:20:43 kernel: IPVS: Incoming TCP 216.220.XX.XXX:9345->10.99.23.64:80
Aug 13 03:20:43 kernel: Enter: ip_vs_nat_xmit, ip_vs_conn.c line 680
Aug 13 03:20:43 kernel: IPVS: NAT to 10.99.22.53:80
Aug 13 03:20:43 kernel: Leave: ip_vs_nat_xmit, ip_vs_conn.c line 820
Aug 13 03:20:43 kernel: Enter: ip_vs_out, ip_vs_core.c line 646
Aug 13 03:20:43 kernel: IPVS: lookup/out TCP 10.99.22.53:80->216.220.XX.XXX:9345 not hit
Aug 13 03:20:43 kernel: IPVS: packet for TCP 216.220.XX.XXX:9345 continue traversal as normal.
Aug 13 03:20:43 kernel: Enter: ip_vs_out, ip_vs_core.c line 646
Aug 13 03:20:43 kernel: IPVS: lookup/out TCP 216.220.XX.XXX:9345->10.99.22.53:80 not hit
Aug 13 03:20:43 kernel: IPVS: packet for TCP 10.99.22.53:80 continue traversal as normal.
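
To find other affected connections in a larger log, the lookup lines can be filtered with a short script. The following is a minimal sketch that assumes the debug messages have exactly the format shown above; the names LOOKUP_RE, Lookup, parse_lookups, and missed_replies are hypothetical helpers, not part of IPVS or ipvsadm.

```python
# Sketch: pick out IPVS "lookup/out ... not hit" lines that are addressed to a
# client whose incoming lookup hit. Assumes the log format shown above.
import re
from typing import Iterable, List, NamedTuple

LOOKUP_RE = re.compile(
    r"IPVS: lookup/(?P<dir>in|out) (?P<proto>\S+) "
    r"(?P<src>\S+)->(?P<dst>\S+) (?P<result>not hit|hit)$"
)


class Lookup(NamedTuple):
    direction: str  # "in" or "out"
    proto: str
    src: str        # "address:port" as printed in the log
    dst: str
    hit: bool


def parse_lookups(lines: Iterable[str]) -> List[Lookup]:
    """Extract every lookup/in and lookup/out message from the log lines."""
    lookups = []
    for line in lines:
        match = LOOKUP_RE.search(line.strip())
        if match:
            lookups.append(Lookup(match["dir"], match["proto"], match["src"],
                                  match["dst"], match["result"] == "hit"))
    return lookups


def missed_replies(lookups: Iterable[Lookup]) -> List[Lookup]:
    """Return outgoing misses whose destination had an incoming hit earlier."""
    known_clients = set()
    missed = []
    for lookup in lookups:
        if lookup.direction == "in" and lookup.hit:
            known_clients.add(lookup.src)
        elif (lookup.direction == "out" and not lookup.hit
              and lookup.dst in known_clients):
            missed.append(lookup)
    return missed


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < kernel-debug.log
    for reply in missed_replies(parse_lookups(sys.stdin)):
        print(f"untranslated reply: {reply.src} -> {reply.dst}")
```

Matching on the client endpoint alone is a simplification; it does not verify that the missed reply came from the real server the connection was actually scheduled to.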

The VIP is an interface alias on eth0. The VIPs are shared between the load balancers using heartbeat.

eth0:4    Link encap:Ethernet  HWaddr 00:02:B3:A1:90:D4  
          inet addr:10.99.23.64  Bcast:10.99.23.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:11 Base address:0xece0 Memory:febff000-febff038 

We have IP forwarding enabled in /proc/sys/net/ipv4/ip_forward so that we can send traffic directly to the real servers if we need to. Software versions in use are:

# uname -a
Linux lb-a 2.4.26 #3 Fri Jul 2 14:12:31 EDT 2004 i686 i686 i386 GNU/Linux
# ipvsadm -v
ipvsadm v1.21 2002/07/09 (compiled with popt and IPVS v1.0.11)
# ipvsadm -l -n
IP Virtual Server version 1.0.11 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.99.23.64:80 wlc persistent 300
  -> 10.99.22.52:80               Masq    1      11         4
  -> 10.99.22.58:80               Masq    1      10         16
  -> 10.99.22.53:80               Masq    1      10         2
  -> 10.99.22.215:80              Masq    1      11         2
TCP  10.99.23.57:21 wlc
  -> 10.99.22.208:21              Masq    1      0          0
  -> 10.99.22.207:21              Masq    1      0          0
TCP  10.99.23.51:80 wlc persistent 300
  -> 10.99.22.199:80              Masq    1      12         3
  -> 10.99.22.197:80              Masq    1      12         3

Does anyone have any idea why this problem would be occurring? Looking at the debug output, it appears that IPVS is not matching the outgoing packet against its connection hash table entry. Also, would anyone be able to tell me what the relationship is between the ip_vs module and the ip_tables, iptable_nat, and ip_conntrack modules? All of those modules seem to get loaded at boot time, although we are not doing any firewalling on this load balancer ('iptables -L -n' and 'iptables -L -n -t nat' show no rules loaded).

Thank you,

Jari

