[lvs-users] Understanding granularity, timeouts and unexpected balance of traffic on reals

Abhijeet Rastogi abhijeet.1989 at gmail.com
Mon Aug 26 22:05:57 BST 2019


Hi Julian,

Thanks for taking a look.

>  Yes, if they are for same virtual service. Otherwise,
> it would be a bug if connections from same subnet go to different
> real servers.

I can confirm that traffic is coming to only one virtual service. Column 5
of "sudo ipvsadm -lnc" is the destination virtual service; this is the
output:

[arastogi at esv5-app02 ~]$ sudo ipvsadm -lnc | grep <source_ip_/48_prefix> |
grep ESTAB | grep 29: | head -n 100 | awk '{print $5}' | sort | uniq -c  |
sort -nk 1
    100 [VIP_configured]:443

I see only the VIP as the destination for the currently active ESTABLISHED
connections created less than 60 seconds ago.
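
The same count can be sketched in Python, which makes it easier to match
only on the intended columns. This is a rough sketch, not a replacement for
ipvsadm: the sample rows are made up (addresses from the 2001:db8::/32
documentation range), count_column is a hypothetical helper, and it assumes
whitespace-separated rows with the virtual service in column 5 and the real
in column 6, as described above.

```python
from collections import Counter

# Made-up sample rows in the column order described above:
# proto, expire, state, source, virtual, destination.
SAMPLE = """\
TCP 29:41 ESTABLISHED [2001:db8:aaaa:1::10]:50112 [2001:db8:ffff::1]:443 [2001:db8:bbbb::9222]:443
TCP 29:12 ESTABLISHED [2001:db8:aaaa:2::20]:50344 [2001:db8:ffff::1]:443 [2001:db8:bbbb::9223]:443
TCP 12:03 ESTABLISHED [2001:db8:aaaa:3::30]:50999 [2001:db8:ffff::1]:443 [2001:db8:bbbb::9224]:443
TCP 29:55 ESTABLISHED [2001:db8:cccc:1::40]:51000 [2001:db8:ffff::1]:443 [2001:db8:bbbb::9222]:443
TCP 01:58 FIN_WAIT [2001:db8:aaaa:1::10]:50113 [2001:db8:ffff::1]:443 [2001:db8:bbbb::9222]:443
"""

def count_column(text, source_prefix, column, minute="29"):
    """Count values of a 1-based column for recent ESTABLISHED rows from one prefix."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip short or blank lines
        expire, state, source = fields[1], fields[2], fields[3]
        if not state.startswith("ESTAB"):
            continue
        if not expire.startswith(minute + ":"):
            continue  # same idea as "grep 29:", but only on the expire column
        if not source.startswith("[" + source_prefix):
            continue
        counts[fields[column - 1]] += 1
    return counts

virtuals = count_column(SAMPLE, "2001:db8:aaaa:", column=5)
reals = count_column(SAMPLE, "2001:db8:aaaa:", column=6)
print(virtuals)
print(reals)
```

Checking the expire column explicitly avoids a false match when "29:"
happens to appear inside an IPv6 address, which a plain "grep 29:" cannot
rule out.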

>      I'm not sure how many virtual servers you are using.
> Please, list your configuration, even with scrambled IPs:
>
>         ipvsadm -Ln

Now that you say it would be a bug otherwise, it looks like I missed a key
section of the ipvsadm output:

FWM  97284778 IPv6 rr persistent 120
  -> [v6_reals:9222]:0 Route   1      0          0
  -> [v6_reals:9223]:0 Route   1      0          0
  -> [v6_reals:9224]:0 Route   1      0          0

No mask is shown on the service line (the first line above). As per the
ipvsadm code, that should mean the mask is 128, i.e. persistence per address
rather than per /48:

                  if (se->af == AF_INET6)
                        if (se->netmask != 128)
                              printf(" mask %i", se->netmask);
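
To illustrate what the mask changes, here is a minimal Python sketch,
assuming the mask simply truncates the client address to form the key used
for persistence. persistence_key is a hypothetical helper, not an IPVS or
ipvsadm function, and the addresses are from the documentation range.

```python
import ipaddress

def persistence_key(client_ip, mask):
    """Zero every bit of client_ip after the first `mask` bits."""
    net = ipaddress.ip_network(f"{client_ip}/{mask}", strict=False)
    return str(net.network_address)

a = "2001:db8:aaaa:1::10"
b = "2001:db8:aaaa:2::20"

# With mask 48 both clients collapse to the same key.
print(persistence_key(a, 48), persistence_key(b, 48))
# With mask 128 every client address is its own key.
print(persistence_key(a, 128), persistence_key(b, 128))
```

Under that assumption, two clients in the same /48 share a key only when the
mask is 48; with 128 each address is balanced independently, which would
match the spread across reals that I am seeing.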

Thanks, Julian.

Cheers,
Abhijeet


On Sun, Aug 25, 2019 at 3:28 AM Julian Anastasov <ja at ssi.bg> wrote:

>
>         Hello,
>
> On Thu, 15 Aug 2019, Abhijeet Rastogi wrote:
>
> > Hi everyone,
> >
> > I'm investigating a typical configuration for an L4 TCP load balancer using
> > ipvs+keepalived. Settings:-
> >
> > persistence_timeout: 120 seconds.  (# LVS persistence timeout, sec)
> > /sbin/ipvsadm --set 1800 120 300 (30 min timeout for TCP)
> > persistence_granularity: "48" for ipv6.
> > lb_algo: rr (round-robin)
> >
> > My expectation is, all the IPs from the same /48 v6 subnet should always
> > reach the same real_server because of setting granularity. (at least the
> > connections created in last 120 seconds)
>
>         Yes, if they are for same virtual service. Otherwise,
> it would be a bug if connections from same subnet go to different
> real servers.
>
> > However, I can see that established connections from the same /48 v6 subnet
> > are spread across multiple reals, even for recently established
> > connections.
> >
> > # Same /48 going to different reals (very recent connections)
> > 1. Grepping with only the first 3 quibbles to see how a /48 is being
> > distributed.
> > 2. " | grep ESTAB | grep 29: | head -n 100" to only see the first 100
> > established connections created in the last 60 seconds, as my timeout is
> > set to 30:00 (1800 seconds)
> > 3. 6th column is the real IP.
> > I see that the same /48 is getting distributed across multiple different
> > reals. (should be same real because of persistence_granularity set to 48).
> > $ sudo ipvsadm -lnc | grep "xxxx:xxxx:xxxx" | grep ESTAB | grep 29: | head
> > -n 100 | awk '{print $6}' | sort | uniq -c
> >       2 [V6IP_REDACTED:9222]:443
> >       9 [V6IP_REDACTED:9223]:443
> >       7 [V6IP_REDACTED:9224]:443
> >      13 [V6IP_REDACTED:9225]:443
> >       1 [V6IP_REDACTED:9226]:443
> >       ............
> >       ............ output redacted
>
>         I'm not sure how many virtual servers you are using.
> Please, list your configuration, even with scrambled IPs:
>
>         ipvsadm -Ln
>
> >    - Why are recent connections going to different reals?
> >    - For recent connections, shouldn't they always end up on same real?
> >    - For older connections, I guess, persistence_timeout causes the traffic
> >    to balance to other reals via round robin.
>
>         The persistence timeout (-p N) is used as minimum time.
> For this period other connections can come and expire.
>
>         When the timer expires it can be extended each time with new 60
> seconds if there are existing connections (even if not ESTAB anymore) that
> refer to the persistence template (connection with zeros after the 48-th
> bit) created to remember which real server is used. So, the persistence
> template can live very long time if the subnet is very active. You should
> see one such template for every subnet.
>
> Regards
>
> --
> Julian Anastasov <ja at ssi.bg>
>
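
The timer rule quoted above can be pictured with a toy model in Python. This
is only a sketch of the rule as stated in the quoted text, not of the kernel
code: template_lifetime is a hypothetical name, times are seconds since the
template was created, and it only models the 60-second extension while some
connection still refers to the template.

```python
def template_lifetime(persist_timeout, conn_end_times, extension=60):
    """Return when (seconds after creation) the template would be released."""
    expires = persist_timeout
    # At each expiry, keep the template while some connection still refers to it.
    while any(end > expires for end in conn_end_times):
        expires += extension
    return expires

# No connections outlive the 120 s minimum: template goes away at 120 s.
print(template_lifetime(120, [100]))
# One connection lingering until 130 s forces one 60 s extension.
print(template_lifetime(120, [130]))
# A connection kept for the 1800 s TCP timeout keeps the template that long.
print(template_lifetime(120, [1800]))
```

In this model the template for a busy subnet stays alive for as long as any
of its connections do, which is the "can live very long time" case.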


-- 
Cheers,
Abhijeet (https://abhi.host)

