[lvs-users] One second connection delay in masquerading mode

Sergey Urbanovich surbanovich at mesosphere.io
Wed Jan 24 21:51:43 GMT 2018


Hi all,

I encountered an issue with IPVS load balancing in case of short-lived
connections. I've seen it in masqurading mode on CentOS 7 (kernel
3.10.0-693.11.6) and CoreOS 1235.12.0 (4.7.3-coreos-r3). After opening and
closing of thousands of TCP connctions, new connections are being delayed for 1
second.

Please see a short example [4], there are steps to reproduce the issue. It
starts nginx on port 8080 and creates a virtual service 127.0.0.1:80 ->
127.0.0.1:8080. After that an HTTP load generator (rakyll/hey) sends 30k queries
with disabled keep-alive option. All records in ip_vs_conn table are in
TIME_WAIT state. Then ipvs debug level is enabled and strace starts curl(1) with
the same virtual service. Curl encounters the 1 second delay as shown. Attached
you can find the full versions of strace.log and dmesg.log as well as their short
versions [1] [2].

Setting conn_reuse_mode to 0 resolves the issue, but doesn't fit our needs and
doesn't work well in case of changing the list of real servers.

What could be causing the delay? How can we get rid of it?

Thank you.

[1] strace.log

1516823661.570756 connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
1516823661.571578 poll([{fd=3, events=POLLOUT|POLLWRNORM}], 1, 0) = 0 (Timeout)
1516823661.572153 poll([{fd=3, events=POLLOUT}], 1, 1000) = 1 ([{fd=3, revents=POLLOUT}])
1516823662.573304 poll([{fd=3, events=POLLOUT|POLLWRNORM}], 1, 0) = 1 ([{fd=3, revents=POLLOUT|POLLWRNORM}])
1516823662.573556 getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
1516823662.573947 getpeername(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
1516823662.574139 getsockname(3, {sa_family=AF_INET, sin_port=htons(34258), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
1516823662.574641 sendto(3, "GET / HTTP/1.1\r\nUser-Agent: curl"..., 73, MSG_NOSIGNAL, NULL, 0) = 73

[2] dmesg.log

706.699647 IPVS: Enter: ip_vs_out, net/netfilter/ipvs/ip_vs_core.c line 1132
706.699656 IPVS: lookup/out TCP 127.0.0.1:34258->127.0.0.1:80 not hit
706.699662 IPVS: lookup/in TCP 127.0.0.1:34258->127.0.0.1:80 hit
706.700092 IPVS: ip_vs_conn_drop_conntrack: dropping conntrack with tuple=127.0.0.1:34258->127.0.0.1:80/6 for conn 127.0.0.1:34258->127.0.0.1:80->127.0.0.1:8080/6:5
706.700097 IPVS: ip_vs_conn_drop_conntrack: no conntrack for tuple=127.0.0.1:34258->127.0.0.1:80/6
706.700102 IPVS: ip_vs_conn_drop_conntrack: dropping conntrack with tuple=127.0.0.1:34258->127.0.0.1:80/6 for conn 127.0.0.1:34258->127.0.0.1:80->127.0.0.1:8080/6:5
706.700105 IPVS: ip_vs_conn_drop_conntrack: no conntrack for tuple=127.0.0.1:34258->127.0.0.1:80/6
706.700110 IPVS: Unbind-dest TCP c:127.0.0.1:34258 v:127.0.0.1:80 d:127.0.0.1:8080 fwd:M s:5 conn->flags:10100 conn->refcnt:0 dest->refcnt:15505
707.701103 IPVS: Enter: ip_vs_out, net/netfilter/ipvs/ip_vs_core.c line 1132
707.701128 IPVS: lookup/out TCP 127.0.0.1:34258->127.0.0.1:80 not hit
707.701143 IPVS: lookup/in TCP 127.0.0.1:34258->127.0.0.1:80 not hit
707.701156 IPVS: lookup service: fwm 0 TCP 127.0.0.1:80 hit
707.701169 IPVS: lookup/in TCP 127.0.0.1:80->127.0.0.1:34258 not hit
707.701180 IPVS: ip_vs_wlc_schedule(): Scheduling...
707.701193 IPVS: WLC: server 127.0.0.1:8080 activeconns 0 refcnt 15504 weight 1 overhead 15503
707.701210 IPVS: Bind-dest TCP c:127.0.0.1:34258 v:127.0.0.1:80 d:127.0.0.1:8080 fwd:M s:5 conn->flags:100 conn->refcnt:1 dest->refcnt:15505
707.701225 IPVS: Schedule fwd:M c:127.0.0.1:34258 v:127.0.0.1:80 d:127.0.0.1:8080 conn->flags:10140 conn->refcnt:2
707.701239 IPVS: TCP input  [S...] 127.0.0.1:8080->127.0.0.1:34258 state: NONE->SYN_RECV conn->refcnt:2
707.701252 IPVS: Enter: ip_vs_nat_xmit, net/netfilter/ipvs/ip_vs_xmit.c line 625
707.701269 IPVS: ip_vs_update_conntrack: Updating conntrack ct=ffff8800ce14e5d0, status=0x100, ctinfo=2, old reply=127.0.0.1:80->127.0.0.1:34258/6, new reply=127.0.0.1:8080->127.0.0.1:34258/6, cp=127.0.0.1:34258->127.0.0.1:80->127.0.0.1:8080/6:3
707.701282 IPVS: Leave: ip_vs_nat_xmit, net/netfilter/ipvs/ip_vs_xmit.c line 698
707.701330 IPVS: Enter: ip_vs_out, net/netfilter/ipvs/ip_vs_core.c line 1132
707.701374 IPVS: Enter: ip_vs_out, net/netfilter/ipvs/ip_vs_core.c line 1132

[3] sysctl

# uname -r
3.10.0-693.11.6.el7.x86_64.debug
# sysctl net.ipv4.vs
net.ipv4.vs.am_droprate = 10
net.ipv4.vs.amemthresh = 1024
net.ipv4.vs.backup_only = 0
net.ipv4.vs.cache_bypass = 0
net.ipv4.vs.conn_reuse_mode = 1
net.ipv4.vs.conntrack = 1
net.ipv4.vs.debug_level = 0
net.ipv4.vs.drop_entry = 0
net.ipv4.vs.drop_packet = 0
net.ipv4.vs.expire_nodest_conn = 0
net.ipv4.vs.expire_quiescent_template = 0
net.ipv4.vs.nat_icmp_send = 0
net.ipv4.vs.pmtu_disc = 1
net.ipv4.vs.secure_tcp = 0
net.ipv4.vs.snat_reroute = 1
net.ipv4.vs.sync_ports = 1
net.ipv4.vs.sync_qlen_max = 61432
net.ipv4.vs.sync_refresh_period = 0
net.ipv4.vs.sync_retries = 0
net.ipv4.vs.sync_sock_size = 0
net.ipv4.vs.sync_threshold = 3	50
net.ipv4.vs.sync_version = 1
# cat /sys/module/ip_vs/parameters/conn_tab_bits
18

[4] Steps to reproduce

# modprobe ip_vs_wlc
# sysctl -w net.ipv4.vs.conntrack=1
# sysctl -w net.ipv4.ip_local_port_range='32768 60999'
# docker run --rm --net=host --detach --name=nginx nginx sh -c 'echo "daemon off; worker_processes 2; events { worker_connections 16384; } http { server { listen 8080 backlog=16384; access_log off; server_name localhost; location / { return 200; } } }" > /etc/nginx/ping.conf && nginx -c /etc/nginx/ping.conf'
# docker run --rm --net=host --privileged mesosphere/net-toolbox ipvsadm -C
# docker run --rm --net=host --privileged mesosphere/net-toolbox ipvsadm -A -t 127.0.0.1:80
# docker run --rm --net=host --privileged mesosphere/net-toolbox ipvsadm -a -t 127.0.0.1:80 -r 127.0.0.1:8080 -m
# docker run --rm --net=host mesosphere/hey -n 30000 -c 1000 -disable-keepalive -t 3 http://127.0.0.1/
# awk '{print $8}' /proc/net/ip_vs_conn | sort | uniq -c
# sysctl -w net.ipv4.vs.debug_level=15; strace -ttt curl --silent http://127.0.0.1/ 2>&1 | tee strace.log; sysctl -w net.ipv4.vs.debug_level=0
# dmesg | sed -re 's!^\[ *([0-9.]+)\]!\1!' | tee dmesg.log
# docker run --rm --net=host --privileged mesosphere/net-toolbox ipvsadm -C
# docker kill nginx

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg.log
Type: application/octet-stream
Size: 6034 bytes
Desc: not available
Url : http://lists.graemef.net/pipermail/lvs-users/attachments/20180124/a6d2a90c/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: strace.log
Type: application/octet-stream
Size: 32009 bytes
Desc: not available
Url : http://lists.graemef.net/pipermail/lvs-users/attachments/20180124/a6d2a90c/attachment-0003.obj 
-------------- next part --------------

--
Yours,
Sergey Urbanovich



More information about the lvs-users mailing list