[lvs-users] source hashing some times land on wrong server (with FTP)
pdm at pobox.com
Mon Nov 4 20:48:11 GMT 2019
Thank you for the response. I'm not quite sure I understand, though; maybe
I did not describe our setup clearly.
We have a set of director hosts that are all active (each running the master
and backup sync daemons at the same time), BGP/anycasting their VIPs. They use
TUN mode to load balance across separate real servers running the FTP server.
From the documentation, it sounds to me as if backup_only is meant for setups
where the real server is the same host as the director. But we aren't doing
that.
We had previously used WLC instead of SH as the LB algorithm, but customers
faced similar problems: connections to the data channel would land on a
different (and unprepared to handle it) FTP server. SH seemed a good fit
since it only looks at the source IP, so requests from the same client
(regardless of src or dst port) should land on the same real FTP server for
the data port.
Thank you for your input. I would appreciate it if you could expand a
little on whether backup_only would help in this case.
On Mon, Nov 4, 2019 at 2:21 PM Julian Anastasov <ja at ssi.bg> wrote:
> On Fri, 1 Nov 2019, Phillip Moore wrote:
> > Hello!
> > We have FTP set up on its own VIP, map all ports (:0), and use
> > source hashing. Sometimes when the FTP client opens the data channel it
> > lands on the wrong real server, causing a reset. I stress sometimes
> > because FTP mostly seems to work, but we do see this behavior of requests
> > landing on the wrong server.
> > The FTP client connects to VIP:0 on the FTP port and is asked to open the
> > data channel on VIP:0 on an alternate port. The client sends a SYN packet,
> > but that packet doesn't land on the correct real FTP server, so the
> > connection is reset. That SYN packet likely came through a different IPVS
> > server, which should have had the synced connection state by this time.
> > Example of our config:
> > -A -t x.y.z.220:0 -s sh -p 600 -b sh-fallback
> > -a -t x.y.z.220:0 -r a.b.c.4:0 -i -w 1
> > -a -t x.y.z.220:0 -r a.b.c.5:0 -i -w 1
> > -a -t x.y.z.220:0 -r a.b.c.6:0 -i -w 1
> > -a -t x.y.z.220:0 -r a.b.c.7:0 -i -w 1
> > 3.10.0-1062.1.1.el7.x86_64
> > We have this config running on multiple active IPVS servers, all running
> > active/backup sync processes.
> > We've also tried a non-1 weight (1000) to see if it was the overload
> > handling kicking in and sending requests to an alternate server, but that
> > did not seem to be it.
> > Is there any reason why subsequent connections from the same source IP
> > would land on a different server?
> Try to set the backup_only sysctl var to 1 on all directors
> that are backup servers and that can be used also as real servers.
> The flag can stay at 1 even while the director runs as master. For the
> rare setups that run both master and backup function at the same time,
> this flag should not be used.
> As a result, when the backup function is active any traffic received on
> backup servers will be delivered locally; it will not be rescheduled to
> other real servers. The backup_only flag is useful for DR/TUN setups to
> avoid packet loops or, as in your case, to avoid rescheduling to a different
> real server. Why does this happen? Maybe because sync messages are delayed,
> sometimes by up to 2 seconds.
> Also, make sure the real servers are listed (added) in the same order
> on all directors that use the SH scheduler. If not, the scheduling can
> select a different real server on each director. There is no such
> requirement for the MH scheduler, which is more advanced but was added only
> in more recent kernels (4.18+).
> Julian Anastasov <ja at ssi.bg>
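
To make the point about real-server ordering concrete, here is a minimal
Python sketch of an SH-style lookup table. It is a simplified model, not the
kernel code: the 256-bucket table and the multiplicative hash constant are
modelled on the SH scheduler, while the function names (build_table, pick),
the director lists, and the client addresses (taken from the 198.51.100.0/24
documentation range) are assumptions made up for illustration. The real-server
names are just the placeholders from the config above, used as labels.

```python
import ipaddress

TABLE_SIZE = 256  # assumed bucket count, modelled on the SH scheduler


def build_table(real_servers):
    """Fill buckets by walking the server list in order (simplified model).

    Each (label, weight) entry gets `weight` consecutive buckets, and the
    list is cycled until the table is full, so list order decides which
    server owns which bucket.
    """
    if not real_servers or any(weight < 1 for _, weight in real_servers):
        raise ValueError("need at least one server and positive weights")
    table = []
    i = 0
    while len(table) < TABLE_SIZE:
        label, weight = real_servers[i % len(real_servers)]
        table.extend([label] * weight)
        i += 1
    return table[:TABLE_SIZE]


def pick(table, client_ip):
    """Choose a real server from the source address only; ports play no part."""
    key = int(ipaddress.ip_address(client_ip))
    bucket = (key * 2654435761) % TABLE_SIZE  # multiplicative hash
    return table[bucket]


# Same real servers, same weights, but added in a different order.
ORDER_A = [("a.b.c.4", 1), ("a.b.c.5", 1), ("a.b.c.6", 1), ("a.b.c.7", 1)]
ORDER_B = [("a.b.c.5", 1), ("a.b.c.4", 1), ("a.b.c.6", 1), ("a.b.c.7", 1)]

table_a = build_table(ORDER_A)
table_b = build_table(ORDER_B)

if __name__ == "__main__":
    for last_octet in range(8):
        client = f"198.51.100.{last_octet}"
        a, b = pick(table_a, client), pick(table_b, client)
        note = "" if a == b else "  <-- directors disagree"
        print(f"{client}: director A -> {a}, director B -> {b}{note}")
```

In this model a given client always maps to the same real server on one
director, whatever port it uses, but two directors whose lists differ only in
order can map that client to different real servers. If the control
connection arrives via one director and the data connection via another
before the sync message does, that would produce the kind of reset described
above.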