[lvs-users] SYN storm with DR, not your average ARP issue

Julian Anastasov ja at ssi.bg
Sat Nov 18 13:48:26 GMT 2017


	Hello,

On Thu, 16 Nov 2017, Christian Balzer wrote:

> I've been using LVS for ages (as my posts here 9 years ago show ^o^), and
> consider myself well versed (and happy except for SH and quiescent) with
> it.
> 
> Facts first:
> Debian Stretch, kernel 4.9, ipvsadm 1.28.
> Network is bonded (CLAG) to 2 Arctica switches, tagged ports, actual
> interface is a VLAN (bond1.284). 
> 
> 2 servers, pacemaker, ldirectord, 1 having the LB and public VIP as well
> as the service (LDAP), the other being "just" an LDAP server by default.
> Again, not the first time I'm doing LVS by a long shot (though first time with
> LDAP and bonded VLANs) and everything worked as expected.
> 
> However once in a while I'm seeing a SYN storm between the two LDAP nodes,
> supposedly coming from a client node (the busiest one).
> And at that time "ipvsadm -Lcn" will indeed show one connection from that
> client in SYN state.
> However:
> 
> 1. The packets are not originating from the client at all.
> 2. Other connections from that client (and the rest) work fine.
> 
> The failure clearly is related to the "slave" LDAP server, this never
> happens on the one actually running LVS and having the public VIP.
> Bringing the lo: interface with the VIP down and up on the slave fixes
> things, until it happens again a day or so later. 
> 
> Unfortunately I didn't have time to do a complete analysis the last time
> on the "master" server, but I definitely can say the SYN packets were
> local to the 2 servers and maybe the switches. 
> tcpdump on the slave showed that while they had the IP address of the
> client the MAC was that of the master (LVS node). 
> 
> I'm wondering if this is a load issue or a corner case, as the rate of LDAP
> connections is quite high (can peak to 500/s per server).
> OTOH, on exactly the same HW but with another bonded (but no VLAN)
> interface pair I'm also running another LVS setup for POP/IMAP for a
> dovecot proxy that can see 50 connections per second per server.
> 
> Typical, normal state of the LDAP LVS (07 is the local one running LVS, 08
> the "slave"):
> ---
> # ipvsadm -L
> IP Virtual Server version 1.2.1 (size=1048576)
> Prot LocalAddress:Port Scheduler Flags
>   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn  
> TCP  in-lbldap2:ldap rr
>   -> inside-pp08:ldap             Route   1      292        3473      
>   -> inside-pp07:ldap             Route   1      35         3745        
> ---
> 
> Anybody seen this before?

	Yes, and we know of two solutions. One is setting the
sysctl var "backup_only" to 1 on all directors that can take
the role of backup server. Here is a recent thread that has more
info:

https://marc.info/?l=linux-virtual-server&m=148621038304357&w=2
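
A minimal sketch of the first solution, assuming the ip_vs module is
loaded so that the variable is exposed as net.ipv4.vs.backup_only
(that is, /proc/sys/net/ipv4/vs/backup_only). The helper name
enable_backup_only and its optional root-directory argument are
hypothetical, added only so the function can be tried against a scratch
directory before touching a live director:

```shell
#!/bin/sh
# Hypothetical helper (not part of ipvsadm or ldirectord).
# Sets the IPVS "backup_only" sysctl to 1 and prints the resulting value.
# $1: sysctl root directory, defaults to /proc/sys (override for a dry run).
enable_backup_only() {
    root="${1:-/proc/sys}"
    file="$root/net/ipv4/vs/backup_only"

    # The entry only exists once the ip_vs module has been loaded.
    if [ ! -e "$file" ]; then
        echo "backup_only not found under $root (is ip_vs loaded?)" >&2
        return 1
    fi

    # Write the flag, then read it back so the caller can verify it.
    echo 1 > "$file" && cat "$file"
}
```

On a real director this would be run as root with no argument
(enable_backup_only), which should be equivalent to
"sysctl -w net.ipv4.vs.backup_only=1". To keep the setting across
reboots, a line "net.ipv4.vs.backup_only = 1" in /etc/sysctl.conf or in
a file under /etc/sysctl.d/ is the usual approach. As far as I
understand the kernel's IPVS sysctl documentation, the flag only has an
effect while the server is in backup mode, so it is worth checking that
against your own setup.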

> Any other data needed?

Regards

--
Julian Anastasov <ja at ssi.bg>


