[lvs-users] connection broken after 2MB of data transmitted
Julian Anastasov
ja at ssi.bg
Thu May 17 21:01:44 BST 2018
Hello,
On Mon, 14 May 2018, Robert.Grange at swisscom.com wrote:
> We are using this tool in one of our project, and we are facing a disconnect every ~2MB of data transferred.
>
> rhel 7
> ipvsadm.x86_64 1.27-7.el7
> keepalived.x86_64 1.3.5-1.el7
> kernel.x86_64 3.10.0-514.21.1.el7
>
>
> Our configuration:
> VIP 10.1.1.130
> LB1 10.1.1.131 Virtual Server keepalived Active
> LB2 10.1.1.132 Virtual Server keepalived backup
> MQ1 10.1.1.151 Real Server MQ Active
> MQ2 10.1.1.152 Real Server MQ Standby
>
> Our keepalived.conf (simplified)
> global_defs {
> notification_email {
> blablalba at mymail.com
> }
> notification_email_from blablalba at mymail.com
> smtp_server sysmail.mymail.com
> smtp_connect_timeout 30
> }
>
> vrrp_instance vi_y-maas {
> state BACKUP
> virtual_router_id 100
> interface ens32
> priority 150
> advert_int 5
> nopreempt
> smtp_alert
> virtual_ipaddress {
> 10.1.1.130/25
> }
> }
>
> # My MQ
> virtual_server 10.1.1.130 1423 {
> delay_loop 2
> protocol TCP
> lb_algo rr
> lb_kind DR
>
> real_server 10.1.1.151 1423 {
> weight 10
> TCP_CHECK {
> }
> }
> real_server 10.1.1.152 1423 {
> weight 10
> TCP_CHECK {
> }
> }
> }
>
> On MQ1 and MQ2, we have added ARP rules (due to Direct Routing)
> :INPUT ACCEPT
> :OUTPUT ACCEPT
> :FORWARD ACCEPT
> -A INPUT -j DROP -d 10.1.1.130
> -A OUTPUT -j mangle -s 10.1.1.130 --mangle-ip-s 10.1.1.151
> And
> :INPUT ACCEPT
> :OUTPUT ACCEPT
> :FORWARD ACCEPT
> -A INPUT -j DROP -d 10.1.1.130
> -A OUTPUT -j mangle -s 10.1.1.130 --mangle-ip-s 10.1.1.152
>
> MQ1 and MQ2 also have the VIP Address as a secondary address of the interface
> 2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
> link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
> inet 10.1.1.151/25 brd 10.1.1.255 scope global ens32
> valid_lft forever preferred_lft forever
> inet 10.1.1.130/25 scope global secondary ens32
> valid_lft forever preferred_lft forever
>
> This permits us to direct the routing to the active MQ without intervention (if the active MQ fails, the standby takes over and the LB detects that MQ1 is down and MQ2 is up)
>
> My problem
>
> When trying to read messages (~8'000) from MQ, using the VIP to connect, the program can read ~2MB, then the connection is broken. In the Wireshark trace we can see 5 TCP retransmissions with increasing delay between them, between the IP where the application program runs and the VIP address; at the same time, there is no more traffic from LB1 (active LB) to MQ1 (active MQ).
It would be useful to see a trace from just before the
retransmissions start, taken on the client, the director and the real server:
tcpdump -lnnnv -i any -s 0 port 1423 or icmp
If you prefer, you can scramble the addresses; we
care about things like checksums, packet sizes and PMTU (ICMP errors?).
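
A minimal sketch of how the three captures could be saved to files for sharing, assuming the same filter as above. The helper name capture_cmd and the trace-LABEL.pcap file names are illustrative only; -w and -r are the standard tcpdump options for writing and reading capture files.

```shell
# Hypothetical helper: print the capture command for one host.
# $1 is a label for the host (client, director, realserver).
# The port defaults to 1423, the MQ port from the configuration above.
PORT=${PORT:-1423}

capture_cmd() {
    echo "tcpdump -nn -i any -s 0 -w trace-$1.pcap port $PORT or icmp"
}
```

Run the printed command as root on each host while reproducing the problem, then something like `tcpdump -nn -v -r trace-director.pcap icmp` shows whether any ICMP errors were seen around the time the retransmissions start.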
Also, you can try to disable GRO/GSO on the director:
ethtool -K ETH gso off
ethtool -K ETH gro off
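
To confirm that the change took effect, the current offload state can be read back with ethtool -k (lower-case k). The following is only a sketch: the function name offloads_on is made up, and it assumes the usual "feature-name: on/off" line format of ethtool -k output.

```shell
# Hypothetical helper: read `ethtool -k DEV` output on stdin and print
# which of GSO/GRO are still enabled. Prints nothing if both are off.
offloads_on() {
    awk -F: '/^[ \t]*generic-(segmentation|receive)-offload: on/ {
        gsub(/^[ \t]+/, "", $1)   # drop any leading indentation
        print $1
    }'
}
```

For example, `ethtool -k ens32 | offloads_on` on the director should print nothing once both offloads are disabled (ens32 is the interface name from the keepalived.conf above; adjust as needed).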
Check on the client with arp -an that the MAC for the VIP is correct,
just to be sure.
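
A small sketch of that check, assuming the usual Linux "? (IP) at MAC [ether] on DEV" format of arp -an output. The function name vip_mac is hypothetical. With DR, the client should see the MAC of the active director for the VIP, not the MAC of a real server.

```shell
# Hypothetical helper: read `arp -an` output on stdin and print the MAC
# address the client has cached for the address given as $1.
vip_mac() {
    awk -v ip="($1)" '$2 == ip { print $4 }'
}
```

Usage on the client: `arp -an | vip_mac 10.1.1.130`, then compare the result with the link/ether address of the interface on LB1.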
> With the same program and the same read of messages, when connecting directly to MQ1, there are no problems.
>
> Could this be a problem related to keepalived or to linux-lvs itself?
>
> Many thanks and regards
> Robert
Regards
--
Julian Anastasov <ja at ssi.bg>