[net-next,0/4] net: tcp: make txhash use consistent for IPv4

Message ID 20230511093456.672221-1-atenart@kernel.org (mailing list archive)

Message

Antoine Tenart May 11, 2023, 9:34 a.m. UTC
Hello,

The series is divided in two parts. The first two commits make the txhash
(used for the skb hash in TCP) consistent for all IPv4/TCP packets (IPv6
doesn't have the same issue). The last two commits improve hash-related
documentation and comments.

One use case is OvS with dp_hash, which uses skb->hash to select a path.
We'd like all packets from the same flow to carry the same hash, and the
hash to be stable over time when net.core.txrehash=0. The same applies to
kernel ECMP, which can also use skb->hash.
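
To illustrate why a consistent hash matters for path selection, here is a
minimal userspace sketch. It is not kernel or OvS code: pick_path(),
struct flow_model and flow_is_path_stable() are hypothetical names, and the
modulo selection is only an assumption standing in for whatever a real
selector does with skb->hash.

```c
#include <stdint.h>

/* Hypothetical helper: map a flow hash onto one of n_paths next hops.
 * Real selectors (OvS dp_hash groups, kernel ECMP) are more elaborate. */
static unsigned int pick_path(uint32_t hash, unsigned int n_paths)
{
	return hash % n_paths;
}

/* Hypothetical model of one flow: the hash carried by packets sent while
 * the connection is established, and the hash carried by packets sent
 * from TIME_WAIT or SYN_RECV. */
struct flow_model {
	uint32_t established_hash;
	uint32_t timewait_hash;
};

/* Returns 1 if both kinds of packets map to the same path, 0 otherwise. */
static int flow_is_path_stable(const struct flow_model *f, unsigned int n_paths)
{
	return pick_path(f->established_hash, n_paths) ==
	       pick_path(f->timewait_hash, n_paths);
}
```

In this model, a flow whose two hashes differ can end up on two different
paths, which is the situation the first two commits aim to avoid.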

IMHO the series makes sense in net-next, but one could argue that (some)
commits are fixes; I can resend if necessary.

Thanks!
Antoine

Antoine Tenart (4):
  net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too
  net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
  Documentation: net: net.core.txrehash is not specific to listening
    sockets
  net: skbuff: fix l4_hash comment

 Documentation/admin-guide/sysctl/net.rst |  4 ++--
 include/linux/skbuff.h                   |  4 ++--
 include/net/ip.h                         |  2 +-
 net/ipv4/ip_output.c                     |  4 +++-
 net/ipv4/tcp_ipv4.c                      | 14 +++++++++-----
 net/ipv4/tcp_minisocks.c                 |  2 +-
 6 files changed, 18 insertions(+), 12 deletions(-)

Comments

Eric Dumazet May 11, 2023, 10:24 a.m. UTC | #1
On Thu, May 11, 2023 at 11:35 AM Antoine Tenart <atenart@kernel.org> wrote:
>
> Hello,
>
> Series is divided in two parts. First two commits make the txhash (used
> for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
> doesn't have the same issue). Last two commits improve doc/comment
> hash-related parts.
>
> One example is when using OvS with dp_hash, which uses skb->hash, to
> select a path. We'd like packets from the same flow to be consistent, as
> well as the hash being stable over time when using net.core.txrehash=0.
> Same applies for kernel ECMP which also can use skb->hash.
>

SGTM, thanks.

Reviewed-by: Eric Dumazet <edumazet@google.com>

FYI while reviewing your patches, I found that I have to send this fix:

I suggest we hold your patch series until this fix reaches the net-next
tree, to avoid merge conflicts.

Bug was added in commit f6c0f5d209fa ("tcp: honor SO_PRIORITY in
TIME_WAIT state")


diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 39bda2b1066e1d607a59fb79c6305d0ca30cb28d..06d2573685ca993a3a0a89807f09d7b5c153cc72 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -829,6 +829,9 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
                                   inet_twsk(sk)->tw_priority : sk->sk_priority;
                transmit_time = tcp_transmit_time(sk);
                xfrm_sk_clone_policy(ctl_sk, sk);
+       } else {
+               ctl_sk->sk_mark = 0;
+               ctl_sk->sk_priority = 0;
        }
        ip_send_unicast_reply(ctl_sk,
                              skb, &TCP_SKB_CB(skb)->header.h4.opt,
@@ -836,7 +839,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
                              &arg, arg.iov[0].iov_len,
                              transmit_time);

-       ctl_sk->sk_mark = 0;
        xfrm_sk_free_policy(ctl_sk);
        sock_net_set(ctl_sk, &init_net);
        __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
@@ -935,7 +937,6 @@ static void tcp_v4_send_ack(const struct sock *sk,
                              &arg, arg.iov[0].iov_len,
                              transmit_time);

-       ctl_sk->sk_mark = 0;
        sock_net_set(ctl_sk, &init_net);
        __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
        local_bh_enable();
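
As a reading aid, the behaviour the diff above appears to establish in
tcp_v4_send_reset() can be modelled in userspace as follows. This is only a
sketch under assumptions: struct model_sock and prepare_ctl_sock() are
hypothetical names, only the two fields touched by the diff are modelled,
and the TIME_WAIT special-casing of the real code is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel socket; only the two fields
 * touched by the diff are modelled. */
struct model_sock {
	uint32_t mark;
	uint32_t priority;
};

/* Model of the fixed flow: the shared control socket either inherits mark
 * and priority from the socket being answered, or has both cleared, so a
 * value set for a previous reply cannot carry over to this one. */
static void prepare_ctl_sock(struct model_sock *ctl, const struct model_sock *sk)
{
	if (sk) {
		ctl->mark = sk->mark;
		ctl->priority = sk->priority;
	} else {
		ctl->mark = 0;
		ctl->priority = 0;
	}
}
```

Clearing both fields before sending, rather than clearing only the mark
after sending, means a reply built without a socket does not reuse a
priority left over from an earlier reply.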

Antoine Tenart May 11, 2023, 11:55 a.m. UTC | #2
Quoting Eric Dumazet (2023-05-11 12:24:15)
> On Thu, May 11, 2023 at 11:35 AM Antoine Tenart <atenart@kernel.org> wrote:
> >
> > Series is divided in two parts. First two commits make the txhash (used
> > for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
> > doesn't have the same issue). Last two commits improve doc/comment
> > hash-related parts.
> >
> > One example is when using OvS with dp_hash, which uses skb->hash, to
> > select a path. We'd like packets from the same flow to be consistent, as
> > well as the hash being stable over time when using net.core.txrehash=0.
> > Same applies for kernel ECMP which also can use skb->hash.
> >
> 
> SGTM, thanks.
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> 
> FYI while reviewing your patches, I found that I have to send this fix:
> 
> I suggest we hold your patch series a bit before this reaches net-next tree,
> to avoid merge conflicts.

Sure, no problem. Thanks for the review!

> Bug was added in commit f6c0f5d209fa ("tcp: honor SO_PRIORITY in
> TIME_WAIT state")
> 
> 
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 39bda2b1066e1d607a59fb79c6305d0ca30cb28d..06d2573685ca993a3a0a89807f09d7b5c153cc72 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -829,6 +829,9 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
>                                    inet_twsk(sk)->tw_priority : sk->sk_priority;
>                 transmit_time = tcp_transmit_time(sk);
>                 xfrm_sk_clone_policy(ctl_sk, sk);
> +       } else {
> +               ctl_sk->sk_mark = 0;
> +               ctl_sk->sk_priority = 0;
>         }
>         ip_send_unicast_reply(ctl_sk,
>                               skb, &TCP_SKB_CB(skb)->header.h4.opt,
> @@ -836,7 +839,6 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
>                               &arg, arg.iov[0].iov_len,
>                               transmit_time);
> 
> -       ctl_sk->sk_mark = 0;
>         xfrm_sk_free_policy(ctl_sk);
>         sock_net_set(ctl_sk, &init_net);
>         __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
> @@ -935,7 +937,6 @@ static void tcp_v4_send_ack(const struct sock *sk,
>                               &arg, arg.iov[0].iov_len,
>                               transmit_time);
> 
> -       ctl_sk->sk_mark = 0;
>         sock_net_set(ctl_sk, &init_net);
>         __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
>         local_bh_enable();
>

Ilya Maximets May 11, 2023, 11:59 a.m. UTC | #3
On 5/11/23 11:34, Antoine Tenart wrote:
> Hello,
> 
> Series is divided in two parts. First two commits make the txhash (used
> for the skb hash in TCP) to be consistent for all IPv4/TCP packets (IPv6
> doesn't have the same issue). Last two commits improve doc/comment
> hash-related parts.
> 
> One example is when using OvS with dp_hash, which uses skb->hash, to
> select a path. We'd like packets from the same flow to be consistent, as
> well as the hash being stable over time when using net.core.txrehash=0.
> Same applies for kernel ECMP which also can use skb->hash.

FWIW, the same also applies to seg6_flowlabel, which is used for
flowlabel-based load balancing, because seg6_make_flowlabel() uses the
skb hash.
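
A rough userspace sketch of why the hash matters here, assuming the flow
label is simply the low 20 bits of the hash. flowlabel_from_hash() and
FLOWLABEL_MASK are hypothetical names, and this is a simplification, not
the actual body of seg6_make_flowlabel().

```c
#include <stdint.h>

/* The IPv6 flow label is a 20-bit field. */
#define FLOWLABEL_MASK 0x000fffffu

/* Hypothetical helper: fold a 32-bit packet hash into a 20-bit flow label.
 * A simplification for illustration, not the kernel's computation. */
static uint32_t flowlabel_from_hash(uint32_t hash)
{
	return hash & FLOWLABEL_MASK;
}
```

Since the label is a pure function of the hash, two packets of one flow
that carry different hashes can also carry different flow labels.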

Best regards, Ilya Maximets.

> 
> IMHO the series makes sense in net-next, but we could argue (some)
> commits be seen as fixes and I can resend if necessary.
> 
> Thanks!
> Antoine
> 
> Antoine Tenart (4):
>   net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too
>   net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
>   Documentation: net: net.core.txrehash is not specific to listening
>     sockets
>   net: skbuff: fix l4_hash comment
> 
>  Documentation/admin-guide/sysctl/net.rst |  4 ++--
>  include/linux/skbuff.h                   |  4 ++--
>  include/net/ip.h                         |  2 +-
>  net/ipv4/ip_output.c                     |  4 +++-
>  net/ipv4/tcp_ipv4.c                      | 14 +++++++++-----
>  net/ipv4/tcp_minisocks.c                 |  2 +-
>  6 files changed, 18 insertions(+), 12 deletions(-)
>