[lvs-users] One second connection delay in masquerading mode

Andrew Smalley asmalley at loadbalancer.org
Wed Jan 24 22:11:16 GMT 2018


Hello Sergey

I had a quick look at your configuration and the first thing that
comes to mind is can you do this on a single host?

First what I see is you are in SYN_RECV as shown below. this means you
have not resolved the ARP issue. now again back to can you do this on
the same host like you are?

NONE->SYN_RECV conn->refcnt:2

I don't want to say NO however if you move your docker nginx guest
somewhere else and not use 127.0.0.1 as this is where the VIP/32 sits
on the real server and as LVS DR Mode can not talk to its self I
suggest this is because you cant do internal to director hosting of
services. or connect same host to VIP of the director.

Anyone wish to correct me ? It makes sense when the director is a
separate VM (Kvm etc) director where the docker host can share the
same host we now longer use 127.0.0.x addresses for communications.

Andruw Smalley

Loadbalancer.org Ltd.

www.loadbalancer.org
+1 888 867 9504 / +44 (0)330 380 1064
asmalley at loadbalancer.org

Leave a Review | Deployment Guides | Blog


On 24 January 2018 at 21:51, Sergey Urbanovich
<surbanovich at mesosphere.io> wrote:
> Hi all,
>
> I encountered an issue with IPVS load balancing in case of short-lived
> connections. I've seen it in masqurading mode on CentOS 7 (kernel
> 3.10.0-693.11.6) and CoreOS 1235.12.0 (4.7.3-coreos-r3). After opening and
> closing of thousands of TCP connctions, new connections are being delayed for 1
> second.
>
> Please see a short example [4], there are steps to reproduce the issue. It
> starts nginx on port 8080 and creates a virtual service 127.0.0.1:80 ->
> 127.0.0.1:8080. After that an HTTP load generator (rakyll/hey) sends 30k queries
> with disabled keep-alive option. All records in ip_vs_conn table are in
> TIME_WAIT state. Then ipvs debug level is enabled and strace starts curl(1) with
> the same virtual service. Curl encounters the 1 second delay as shown. Attached
> you can find the full versions of strace.log and dmesg.log as well as their short
> versions [1] [2].
>
> Setting conn_reuse_mode to 0 resolves the issue, but doesn't fit our needs and
> doesn't work well in case of changing the list of real servers.
>
> What could be causing the delay? How can we get rid of it?
>
> Thank you.
>
> [1] strace.log
>
> 1516823661.570756 connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
> 1516823661.571578 poll([{fd=3, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
> 1516823661.572153 poll([{fd=3, events=POLLOUT}], 1, 1000) = 1 ([{fd=3, revents=POLLOUT}])
> 1516823662.573304 poll([{fd=3, events=POLLOUT|POLLWRNORM}], 1, 0) = 1 ([{fd=3, revents=POLLOUT|POLLWRNORM}])
> 1516823662.573556 getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
> 1516823662.573947 getpeername(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
> 1516823662.574139 getsockname(3, {sa_family=AF_INET, sin_port=htons(34258), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
> 1516823662.574641 sendto(3, "GET / HTTP/1.1\r\nUser-Agent: curl"..., 73, MSG_NOSIGNAL, NULL, 0) = 73
>
> [2] dmesg.log
>
> 706.699647 IPVS: Enter: ip_vs_out, net/netfilter/ipvs/ip_vs_core.c line 1132
> 706.699656 IPVS: lookup/out TCP 127.0.0.1:34258->127.0.0.1:80 not hit
> 706.699662 IPVS: lookup/in TCP 127.0.0.1:34258->127.0.0.1:80 hit
> 706.700092 IPVS: ip_vs_conn_drop_conntrack: dropping conntrack with tuple=127.0.0.1:34258->127.0.0.1:80/6 for conn 127.0.0.1:34258->127.0.0.1:80->127.0.0.1:8080/6:5
> 706.700097 IPVS: ip_vs_conn_drop_conntrack: no conntrack for tuple=127.0.0.1:34258->127.0.0.1:80/6
> 706.700102 IPVS: ip_vs_conn_drop_conntrack: dropping conntrack with tuple=127.0.0.1:34258->127.0.0.1:80/6 for conn 127.0.0.1:34258->127.0.0.1:80->127.0.0.1:8080/6:5
> 706.700105 IPVS: ip_vs_conn_drop_conntrack: no conntrack for tuple=127.0.0.1:34258->127.0.0.1:80/6
> 706.700110 IPVS: Unbind-dest TCP c:127.0.0.1:34258 v:127.0.0.1:80 d:127.0.0.1:8080 fwd:M s:5 conn->flags:10100 conn->refcnt:0 dest->refcnt:15505
> 707.701103 IPVS: Enter: ip_vs_out, net/netfilter/ipvs/ip_vs_core.c line 1132
> 707.701128 IPVS: lookup/out TCP 127.0.0.1:34258->127.0.0.1:80 not hit
> 707.701143 IPVS: lookup/in TCP 127.0.0.1:34258->127.0.0.1:80 not hit
> 707.701156 IPVS: lookup service: fwm 0 TCP 127.0.0.1:80 hit
> 707.701169 IPVS: lookup/in TCP 127.0.0.1:80->127.0.0.1:34258 not hit
> 707.701180 IPVS: ip_vs_wlc_schedule(): Scheduling...
> 707.701193 IPVS: WLC: server 127.0.0.1:8080 activeconns 0 refcnt 15504 weight 1 overhead 15503
> 707.701210 IPVS: Bind-dest TCP c:127.0.0.1:34258 v:127.0.0.1:80 d:127.0.0.1:8080 fwd:M s:5 conn->flags:100 conn->refcnt:1 dest->refcnt:15505
> 707.701225 IPVS: Schedule fwd:M c:127.0.0.1:34258 v:127.0.0.1:80 d:127.0.0.1:8080 conn->flags:10140 conn->refcnt:2
> 707.701239 IPVS: TCP input  [S...] 127.0.0.1:8080->127.0.0.1:34258 state: NONE->SYN_RECV conn->refcnt:2
> 707.701252 IPVS: Enter: ip_vs_nat_xmit, net/netfilter/ipvs/ip_vs_xmit.c line 625
> 707.701269 IPVS: ip_vs_update_conntrack: Updating conntrack ct=ffff8800ce14e5d0, status=0x100, ctinfo=2, old reply=127.0.0.1:80->127.0.0.1:34258/6, new reply=127.0.0.1:8080->127.0.0.1:34258/6, cp=127.0.0.1:34258->127.0.0.1:80->127.0.0.1:8080/6:3
> 707.701282 IPVS: Leave: ip_vs_nat_xmit, net/netfilter/ipvs/ip_vs_xmit.c line 698
> 707.701330 IPVS: Enter: ip_vs_out, net/netfilter/ipvs/ip_vs_core.c line 1132
> 707.701374 IPVS: Enter: ip_vs_out, net/netfilter/ipvs/ip_vs_core.c line 1132
>
> [3] sysctl
>
> # uname -r
> 3.10.0-693.11.6.el7.x86_64.debug
> # sysctl net.ipv4.vs
> net.ipv4.vs.am_droprate = 10
> net.ipv4.vs.amemthresh = 1024
> net.ipv4.vs.backup_only = 0
> net.ipv4.vs.cache_bypass = 0
> net.ipv4.vs.conn_reuse_mode = 1
> net.ipv4.vs.conntrack = 1
> net.ipv4.vs.debug_level = 0
> net.ipv4.vs.drop_entry = 0
> net.ipv4.vs.drop_packet = 0
> net.ipv4.vs.expire_nodest_conn = 0
> net.ipv4.vs.expire_quiescent_template = 0
> net.ipv4.vs.nat_icmp_send = 0
> net.ipv4.vs.pmtu_disc = 1
> net.ipv4.vs.secure_tcp = 0
> net.ipv4.vs.snat_reroute = 1
> net.ipv4.vs.sync_ports = 1
> net.ipv4.vs.sync_qlen_max = 61432
> net.ipv4.vs.sync_refresh_period = 0
> net.ipv4.vs.sync_retries = 0
> net.ipv4.vs.sync_sock_size = 0
> net.ipv4.vs.sync_threshold = 3  50
> net.ipv4.vs.sync_version = 1
> # cat /sys/module/ip_vs/parameters/conn_tab_bits
> 18
>
> [4] Steps to reproduce
>
> # modprobe ip_vs_wlc
> # sysctl -w net.ipv4.vs.conntrack=1
> # sysctl -w net.ipv4.ip_local_port_range='32768 60999'
> # docker run --rm --net=host --detach --name=nginx nginx sh -c 'echo "daemon off; worker_processes 2; events { worker_connections 16384; } http { server { listen 8080 backlog=16384; access_log off; server_name localhost; location / { return 200; } } }" > /etc/nginx/ping.conf && nginx -c /etc/nginx/ping.conf'
> # docker run --rm --net=host --privileged mesosphere/net-toolbox ipvsadm -C
> # docker run --rm --net=host --privileged mesosphere/net-toolbox ipvsadm -A -t 127.0.0.1:80
> # docker run --rm --net=host --privileged mesosphere/net-toolbox ipvsadm -a -t 127.0.0.1:80 -r 127.0.0.1:8080 -m
> # docker run --rm --net=host mesosphere/hey -n 30000 -c 1000 -disable-keepalive -t 3 http://127.0.0.1/
> # awk '{print $8}' /proc/net/ip_vs_conn | sort | uniq -c
> # sysctl -w net.ipv4.vs.debug_level=15; strace -ttt curl --silent http://127.0.0.1/ 2>&1 | tee strace.log; sysctl -w net.ipv4.vs.debug_level=0
> # dmesg | sed -re 's!^\[ *([0-9.]+)\]!\1!' | tee dmesg.log
> # docker run --rm --net=host --privileged mesosphere/net-toolbox ipvsadm -C
> # docker kill nginx
>
>
>
> --
> Yours,
> Sergey Urbanovich
>
>
> _______________________________________________
> Please read the documentation before posting - it's available at:
> http://www.linuxvirtualserver.org/
>
> LinuxVirtualServer.org mailing list - lvs-users at LinuxVirtualServer.org
> Send requests to lvs-users-request at LinuxVirtualServer.org
> or go to http://lists.graemef.net/mailman/listinfo/lvs-users



More information about the lvs-users mailing list