load balancing trouble at a high load

Hideaki Kondo kondo.hideaki at oss.ntt.co.jp
Fri May 26 05:56:34 BST 2006


Hello,

I am replying to my own earlier message.

> > (4) Then I intentionally recover the NIC (eth0) of RS2 by manually running
> >    "/etc/init.d/network restart".
> >    After a while, LB1 starts sending http packets to both RS1 and RS2 even
> >    though the weight of RS2 is still 0. Moreover, LB1 sends far fewer
> >    packets to RS2 than to RS1.
> >   (This strange behavior continues indefinitely, so I think the cause is
> >    not necessarily the retransmit process of the TCP layer.
> >    In fact, the strange behavior stops when I stop the high load from CL1.)
> 
> Checking with "ipvsadm -Lc", there are many connections in the TIME_WAIT
> state, and the InActConn number seems to reflect them.
> By the way, according to the ip_vs source code (ip_vs_proto_tcp.c),
> the timeout for IP_VS_TCP_S_TIME_WAIT is 2*60*HZ.
> When I changed it from 2*60*HZ to 10*HZ or so (much smaller
> than 2*60*HZ), the strange behavior seemed to improve.

I tried setting the IP_VS_TCP_S_TIME_WAIT timeout (default: 2*60*HZ) to an
extreme value of 1*HZ.
As I expected, InActConn no longer grew to its maximum,
and this seems to avoid the trouble.
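
A rough way to see why the timeout matters, as a back-of-the-envelope sketch
rather than a measurement: if connections finish at a steady rate and each
entry then stays in TIME_WAIT for the full timeout, the number of inactive
entries settles near rate x timeout (Little's law). The rate of 1000
connections per second below is a hypothetical figure, not one measured on
LB1; only the 120 s, 10 s and 1 s timeouts come from the values discussed
in this thread.

```python
# Steady-state estimate of IPVS entries sitting in TIME_WAIT (shown as InActConn).
# Assumption: connections close at a constant rate and each entry stays in
# TIME_WAIT for the whole timeout, so entries ~= rate * timeout (Little's law).

def estimated_time_wait_entries(conn_per_sec: float, timeout_sec: float) -> float:
    """Return the approximate number of entries held in TIME_WAIT."""
    return conn_per_sec * timeout_sec


HYPOTHETICAL_RATE = 1000.0  # connections per second; illustrative only

# Timeouts in seconds: 2*60*HZ, 10*HZ and 1*HZ jiffies, each divided by HZ.
for timeout in (120, 10, 1):
    entries = estimated_time_wait_entries(HYPOTHETICAL_RATE, timeout)
    print(f"timeout {timeout:>3} s -> about {entries:.0f} entries in TIME_WAIT")
```

Under these assumptions the entry count scales linearly with the timeout,
which would be consistent with InActConn staying small at 1*HZ.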

But I think 1*HZ is probably not a realistic value for IP_VS_TCP_S_TIME_WAIT.
So I'd like to find a more realistic solution.
I wonder whether Malcolm's advice (using expire_nodest_conn) would help,
or whether a similar approach would be better.
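
For reference, expire_nodest_conn is a sysctl exposed under
/proc/sys/net/ipv4/vs/ when the ip_vs module is loaded. The following is only
a sketch of how it could be switched on from a script. The helper name
set_ipvs_sysctl and its base_dir parameter are made up for illustration,
writing to the real file needs root, and whether this setting actually helps
in the weight-0 case described here is not confirmed.

```python
from pathlib import Path

# Directory where the ip_vs module exposes its sysctls (present only when loaded).
IPVS_SYSCTL_DIR = Path("/proc/sys/net/ipv4/vs")


def set_ipvs_sysctl(name: str, value: int, base_dir: Path = IPVS_SYSCTL_DIR) -> int:
    """Write an integer to an IPVS sysctl file and return the value read back.

    Hypothetical helper for illustration; base_dir is a parameter only so the
    function can be tried against a scratch directory without root.
    """
    target = base_dir / name
    target.write_text(f"{value}\n")
    return int(target.read_text().strip())


if __name__ == "__main__":
    if (IPVS_SYSCTL_DIR / "expire_nodest_conn").exists():
        try:
            # Ask IPVS to expire connections whose real server is unavailable.
            print(set_ipvs_sysctl("expire_nodest_conn", 1))
        except PermissionError:
            print("root privileges are needed to change this sysctl")
    else:
        print("ip_vs sysctls not found; is the ip_vs module loaded?")
```

The same change is commonly made with
"sysctl -w net.ipv4.vs.expire_nodest_conn=1".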

By the way, which parameter (or value) in the LVS source code determines
the maximum of InActConn and ActiveConn?

> 
> Is IP_VS_TCP_S_TIME_WAIT related to the cause of the trouble?
> I think some timers in LVS are related to this behavior ...??

I'm sorry for so many questions and comments.
Thanks a lot.

--
Hideaki Kondo



