[lvs-users] 2-node setup connections hanging to backup

Lloyd Brown lloyd_brown at byu.edu
Thu Aug 21 22:18:15 BST 2014

Hi, all.  I'm having another problem that I hope someone can help me
with, or at least point me in the right direction for diagnosing.
It's a little weird, and I'm running out of ideas to test.

I'm in the middle of testing a two-node LVS balancing setup (see
to balance SSH connections.  But for reasons that aren't clear, some of
my connections are getting hung up.

Here's the general info on my setup:

- I'm running on (basically) a modified RHEL 6.2 image with an updated
kernel package
- This setup uses LVS-DR
- The two nodes have the IP addresses and
for direct access
- The virtual IP that's being balanced is
- The testing client is
- I'm using keepalived to manage the setup, VRRP, etc.
- I've already started using an FWMARK balancing setup, using iptables,
to avoid the double-balancing/packet-storm/battling-directors issue
described in section 9.3 of the LVS-HOWTO (URL above)
- All connections that reach the active director and are handled
locally seem to be fine
- Some, but not all, of the connections that go through the active
director and are forwarded to the backup director (also acting as a
realserver) hang
- When I do something short (e.g. a loop around "ssh
hostname"), I can frequently get several good connections through to the
backup director before one of them hangs.
- If I try a larger stream of data, e.g. scp of a file, then my
connection stalls/hangs every time I'm sent to the backup director/RS
- There doesn't seem to be any pattern yet in the number of good
connections, packet count, or amount of data before the hang occurs
- When the problem occurs, I see a very high packet/byte rate on the
"lo" interfaces, which seems to be a lot of SSH packet retransmissions
from (client), to (VIP).  Why this is ending up on "lo" is a mystery to
me.
- The problem only occurs when using the floating VIP interface to
connect, and only when it's redirected to the backup director host.
Connecting directly to that same host (eg. or works just fine every time.
- I've already tried flushing iptables completely on the backup
director, and it didn't seem to help.
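
For anyone who wants to reproduce the short-connection test from the
list above, here is a minimal sketch of that kind of loop.  It is not
the exact script I ran; the probe() helper, the attempt count and the
timeout value are all made up for illustration, and the target is
whatever address or name you pass on the command line.  It runs a
command repeatedly, treats anything that exceeds the timeout as a hang,
and records the output (for "ssh <target> hostname" that shows which
realserver answered).

```python
import subprocess
import sys


def probe(cmd, attempts=20, timeout=10.0):
    """Run cmd repeatedly and classify each attempt.

    Returns a list of (status, output) tuples, where status is
    "ok" (exit code 0), "fail" (non-zero exit) or "hang" (timed out).
    The helper name and defaults are illustrative assumptions.
    """
    results = []
    for _ in range(attempts):
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.DEVNULL,
                timeout=timeout,  # the child is killed if this expires
            )
        except subprocess.TimeoutExpired:
            results.append(("hang", ""))
            continue
        status = "ok" if proc.returncode == 0 else "fail"
        output = proc.stdout.decode(errors="replace").strip()
        results.append((status, output))
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the VIP (or a name resolving to it).
    target = sys.argv[1]
    # BatchMode avoids blocking on a password prompt.
    ssh_cmd = ["ssh", "-o", "BatchMode=yes", target, "hostname"]
    for n, (status, output) in enumerate(probe(ssh_cmd), 1):
        print(n, status, output)
```

Grouping the results by the hostname that came back should show
whether the hangs line up only with the backup director, as described
above.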

I'm going to attach copies of several files (keepalived.conf, iptables
setup, etc.) to see if they're helpful.  If anyone can point me in the
right direction to figure this out, I'd appreciate it greatly.

Thanks again,

Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lvs_diagnosis_21August2014.tar.gz
Type: application/x-gzip
Size: 13005 bytes
Desc: not available
Url : http://lists.graemef.net/pipermail/lvs-users/attachments/20140821/3852c49b/attachment.gz 