From patchwork Thu May 11 09:34:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Antoine Tenart
X-Patchwork-Id: 13237731
X-Patchwork-Delegate: kuba@kernel.org
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id C743027734
	for ; Thu, 11 May 2023 09:35:03 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 21208C433D2;
	Thu, 11 May 2023 09:35:02 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1683797703;
	bh=TyVMmLfQNyRUMre/Tdz35CHIY/sV2VSF64eZdStd4OM=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=CsjC4u/NMGYAleWM6QuCYt6B6MQzRmWNUFAtPrOKqHsLCu6S5gbrGMBctnW+sAakQ
	 WV4uKeWANMGdIqGKxG9fTyGB+D19j72Gsd1UZy3c3DoL+VLTJ94O2Mj+EhtgMD4KOZ
	 nkziaZJv7bo2JHfdFYpM/h820SeQMEA04JfNKdZiBE52LLGvUnOYxkSUzU612TrRCa
	 YZP3mRrdl5ST1aMZK/j55oOLPwWf76bKbkzgP5zk087M+0SkDQftb1CFqMVCqcBj5p
	 f+koSdX37mWAjPXKNim0WXkX+zqOSFIEXNLSX4rb9GEPOjDONOyR/LdhiYz6kC18Ca
	 3/75g2hvj0AqQ==
From: Antoine Tenart
To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com
Cc: Antoine Tenart , netdev@vger.kernel.org
Subject: [PATCH net-next 1/4] net: tcp: make the txhash available in TIME_WAIT sockets for IPv4 too
Date: Thu, 11 May 2023 11:34:53 +0200
Message-Id: <20230511093456.672221-2-atenart@kernel.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230511093456.672221-1-atenart@kernel.org>
References: <20230511093456.672221-1-atenart@kernel.org>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: kuba@kernel.org

Commit c67b85558ff2 ("ipv6: tcp: send consistent autoflowlabel in TIME_WAIT
state") made the socket txhash also available in TIME_WAIT sockets but for IPv6
only. Make it available for IPv4 too as we'll use it in later commits.

Signed-off-by: Antoine Tenart
---
 net/ipv4/tcp_minisocks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index dac0d62120e6..04fc328727e6 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -303,6 +303,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 		tcptw->tw_ts_offset = tp->tsoffset;
 		tcptw->tw_last_oow_ack_time = 0;
 		tcptw->tw_tx_delay = tp->tcp_tx_delay;
+		tw->tw_txhash = sk->sk_txhash;
 #if IS_ENABLED(CONFIG_IPV6)
 		if (tw->tw_family == PF_INET6) {
 			struct ipv6_pinfo *np = inet6_sk(sk);
@@ -311,7 +312,6 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 			tw->tw_v6_rcv_saddr = sk->sk_v6_rcv_saddr;
 			tw->tw_tclass = np->tclass;
 			tw->tw_flowlabel = be32_to_cpu(np->flow_label & IPV6_FLOWLABEL_MASK);
-			tw->tw_txhash = sk->sk_txhash;
 			tw->tw_ipv6only = sk->sk_ipv6only;
 		}
 #endif

From patchwork Thu May 11 09:34:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Antoine Tenart
X-Patchwork-Id: 13237732
X-Patchwork-Delegate: kuba@kernel.org
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 331591C741
	for ; Thu, 11 May 2023 09:35:07 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 847FAC4339E;
	Thu, 11 May 2023 09:35:06 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1683797707;
	bh=g8sb9pf4OKwGdWHzdapCjTXp8I1OFygag9Q8PZORahs=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=EHXVPjmlfzrVD2wTjoicsASB/iaPmxRUN10zX4y7+7iWpgxaszdJBk0gZLpWXZwh8
	 kz2q/9xbx68fAgX7FDTW6vYac1vB3lglbkQaL9Q5QkuiCp7H99Y9l+rSvxFmWtPebE
	 HNkhDxj42/PzZYtGcfgbY71+GYI71Yiwz5+r1QQX0fNZ9m77OuRYpJywUXAm6ryq0C
	 yjQQXTyFz9cbzRFKvomB3fE3o7YufDwiRs3pwbqOFmh1GB1O/BzJHLQ3TT4DWB7Djz
	 KidzK1QoG37lXvTRqpxJAQp1sZYL1Z4BSOV5PmLHcFVLNiV8hRXmk2YKAs+UjGESwE
	 5WL6auT8x80Qw==
From: Antoine Tenart
To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com
Cc: Antoine Tenart , netdev@vger.kernel.org
Subject: [PATCH net-next 2/4] net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
Date: Thu, 11 May 2023 11:34:54 +0200
Message-Id: <20230511093456.672221-3-atenart@kernel.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230511093456.672221-1-atenart@kernel.org>
References: <20230511093456.672221-1-atenart@kernel.org>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: kuba@kernel.org

When using IPv4/TCP, skb->hash comes from sk->sk_txhash except in TIME_WAIT and
SYN_RECV where it's not set in the reply skb from ip_send_unicast_reply. Those
packets will have a mismatched hash with others from the same flow as their
hashes will be 0. IPv6 does not have the same issue as the hash is set from the
socket txhash in those cases.

This commit sets the hash in the reply skb from ip_send_unicast_reply, which
makes the IPv4 code behave like IPv6.

Signed-off-by: Antoine Tenart
---
 include/net/ip.h     |  2 +-
 net/ipv4/ip_output.c |  4 +++-
 net/ipv4/tcp_ipv4.c  | 14 +++++++++-----
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index c3fffaa92d6e..749735171e2c 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -280,7 +280,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 			   const struct ip_options *sopt,
 			   __be32 daddr, __be32 saddr,
 			   const struct ip_reply_arg *arg,
-			   unsigned int len, u64 transmit_time);
+			   unsigned int len, u64 transmit_time, u32 txhash);
 
 #define IP_INC_STATS(net, field) SNMP_INC_STATS64((net)->mib.ip_statistics, field)
 #define __IP_INC_STATS(net, field) __SNMP_INC_STATS64((net)->mib.ip_statistics, field)
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 61892268e8a6..a1bead441026 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1692,7 +1692,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 			   const struct ip_options *sopt,
 			   __be32 daddr, __be32 saddr,
 			   const struct ip_reply_arg *arg,
-			   unsigned int len, u64 transmit_time)
+			   unsigned int len, u64 transmit_time, u32 txhash)
 {
 	struct ip_options_data replyopts;
 	struct ipcm_cookie ipc;
@@ -1755,6 +1755,8 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 								arg->csum));
 		nskb->ip_summed = CHECKSUM_NONE;
 		nskb->mono_delivery_time = !!transmit_time;
+		if (txhash)
+			skb_set_hash(nskb, txhash, PKT_HASH_TYPE_L4);
 		ip_push_pending_frames(sk, &fl4);
 	}
 out:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 39bda2b1066e..8fd4b548d448 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -692,6 +692,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
 	u64 transmit_time = 0;
 	struct sock *ctl_sk;
 	struct net *net;
+	u32 txhash = 0;
 
 	/* Never send a reset in response to a reset. */
 	if (th->rst)
@@ -829,12 +830,14 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
 				   inet_twsk(sk)->tw_priority : sk->sk_priority;
 		transmit_time = tcp_transmit_time(sk);
 		xfrm_sk_clone_policy(ctl_sk, sk);
+		txhash = (sk->sk_state == TCP_TIME_WAIT) ?
+			 inet_twsk(sk)->tw_txhash : sk->sk_txhash;
 	}
 	ip_send_unicast_reply(ctl_sk,
 			      skb, &TCP_SKB_CB(skb)->header.h4.opt,
 			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
 			      &arg, arg.iov[0].iov_len,
-			      transmit_time);
+			      transmit_time, txhash);
 
 	ctl_sk->sk_mark = 0;
 	xfrm_sk_free_policy(ctl_sk);
@@ -857,7 +860,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
 			    struct sk_buff *skb, u32 seq, u32 ack,
 			    u32 win, u32 tsval, u32 tsecr, int oif,
 			    struct tcp_md5sig_key *key,
-			    int reply_flags, u8 tos)
+			    int reply_flags, u8 tos, u32 txhash)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct {
@@ -933,7 +936,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
 			      skb, &TCP_SKB_CB(skb)->header.h4.opt,
 			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
 			      &arg, arg.iov[0].iov_len,
-			      transmit_time);
+			      transmit_time, txhash);
 
 	ctl_sk->sk_mark = 0;
 	sock_net_set(ctl_sk, &init_net);
@@ -954,7 +957,8 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
 			tw->tw_bound_dev_if,
 			tcp_twsk_md5_key(tcptw),
 			tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0,
-			tw->tw_tos
+			tw->tw_tos,
+			tw->tw_txhash
 			);
 
 	inet_twsk_put(tw);
@@ -987,7 +991,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
 			0,
 			tcp_md5_do_lookup(sk, l3index, addr, AF_INET),
 			inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0,
-			ip_hdr(skb)->tos);
+			ip_hdr(skb)->tos, tcp_rsk(req)->txhash);
 }
 
 /*

From patchwork Thu May 11 09:34:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Antoine Tenart
X-Patchwork-Id: 13237733
X-Patchwork-Delegate: kuba@kernel.org
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 247C91C741
	for ; Thu, 11 May 2023 09:35:10 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7B286C4339E;
	Thu, 11 May 2023 09:35:09 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1683797709;
	bh=+5crpwNYqK528To5BDEJCT90P3XZ+NwSQhG7Xu/xuLQ=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=iXu8BbWHoUAdGmdnuEhdwK2wOaxEnVk3rCTag80bjge8Hwv/SDFH2bqLJhszlGPIN
	 VyFeRGiMOLgQkv+uaNirxQHNnhNBvHMuyzZxx+CufUzwgmM04xxgvBcaOGnxxgxviv
	 2ORR2EQZyieUiS5b5bRhkpYnCb11wP2/OKmKK19L3yLvn+mVxlIPss2/oCxEse3U6+
	 jEsIvl4GvqO3319zhFzrzKssBFSxpxy5HaNoOyg6H7oqzO3CicYB8r9OJVaJ17XOT4
	 oiSX2yg8o0u4begPQG32BVq4Bv6QCglX835HESGtOHMmWW0GG/ji70+yCOhIzrSsj5
	 X0niAG9oZp5Jg==
From: Antoine Tenart
To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com
Cc: Antoine Tenart , netdev@vger.kernel.org
Subject: [PATCH net-next 3/4] Documentation: net: net.core.txrehash is not specific to listening sockets
Date: Thu, 11 May 2023 11:34:55 +0200
Message-Id: <20230511093456.672221-4-atenart@kernel.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230511093456.672221-1-atenart@kernel.org>
References: <20230511093456.672221-1-atenart@kernel.org>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: kuba@kernel.org

The net.core.txrehash documentation mentions this knob is
for listening sockets only, while sk_rethink_txhash can be called on SYN and
RTO retransmits on all TCP sockets. Remove the listening socket part.

Signed-off-by: Antoine Tenart
---
 Documentation/admin-guide/sysctl/net.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 466c560b0c30..4877563241f3 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -386,8 +386,8 @@ Default : 0 (for compatibility reasons)
 txrehash
 --------
 
-Controls default hash rethink behaviour on listening socket when SO_TXREHASH
-option is set to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
+Controls default hash rethink behaviour on socket when SO_TXREHASH option is set
+to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
 
 If set to 1 (default), hash rethink is performed on listening socket.
 If set to 0, hash rethink is not performed.
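
The interaction between the per-socket SO_TXREHASH option and the
net.core.txrehash sysctl described in the documentation hunk above can be
sketched as a small userspace model. This is only an illustration of the
documented rule, not the kernel implementation: the toy_* names are
hypothetical, and the three constants are assumed to mirror the
SOCK_TXREHASH_* values from the Linux UAPI headers.

```c
#include <stdbool.h>

/* Assumed to mirror SOCK_TXREHASH_{DEFAULT,DISABLED,ENABLED}; the toy_
 * names themselves are hypothetical and exist only for this sketch.
 */
#define TOY_TXREHASH_DEFAULT	255
#define TOY_TXREHASH_DISABLED	0
#define TOY_TXREHASH_ENABLED	1

/* Decide whether hash rethink is performed for a socket.
 * so_txrehash:     value the socket holds for SO_TXREHASH
 * sysctl_txrehash: current value of net.core.txrehash (0 or 1)
 */
static bool toy_rehash_enabled(int so_txrehash, int sysctl_txrehash)
{
	/* Not overridden by setsockopt: fall back to the sysctl. */
	if (so_txrehash == TOY_TXREHASH_DEFAULT)
		return sysctl_txrehash != 0;

	/* An explicit per-socket setting wins over the sysctl. */
	return so_txrehash == TOY_TXREHASH_ENABLED;
}
```

With this model, a socket left at the default follows the sysctl, while a
socket that explicitly enabled or disabled rehashing ignores it.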

From patchwork Thu May 11 09:34:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Antoine Tenart
X-Patchwork-Id: 13237734
X-Patchwork-Delegate: kuba@kernel.org
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id CB5DF1C741
	for ; Thu, 11 May 2023 09:35:12 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 493A4C433D2;
	Thu, 11 May 2023 09:35:12 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1683797712;
	bh=Jl3566aH/3ykmNfm+L6zyhIBRl0U9RjudKH62vLSUOQ=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=h+dFT7p3R5psV7Lz2wjGu9W31GWFYP+s4bI7lQ6f1iJIPxOaPQXl0SIknQpv+L2XH
	 gCTupsJPd6OAHCU4yJT3nTIQOofQZ3RSfXSOpbh3FHSn1UuLBRCJPb8StC9vQqciuT
	 9KoJ2Ov++Zw2zPviruQ2FvUkz7DsIKbm7NiSI6k/eIKwObnrNLIYS8e/buL3xF+BeP
	 Zti2C2HXzFCx+7P5hWaolrAkA/VDcnxGAOKyBpu4NHvLZlvpn8ruGF8JixC98iCEu6
	 JBXBIU8MWROOJDgaf+v+P8Y34UwsBI8obYGO0YoQ4jSZvAWnp2ny5ktYvzJ+z2vOUL
	 FkceZvsKjGKnQ==
From: Antoine Tenart
To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com
Cc: Antoine Tenart , netdev@vger.kernel.org
Subject: [PATCH net-next 4/4] net: skbuff: fix l4_hash comment
Date: Thu, 11 May 2023 11:34:56 +0200
Message-Id: <20230511093456.672221-5-atenart@kernel.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230511093456.672221-1-atenart@kernel.org>
References: <20230511093456.672221-1-atenart@kernel.org>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: kuba@kernel.org

Since commit 877d1f6291f8 ("net: Set sk_txhash from a random number")
sk->sk_txhash is not a canonical 4-tuple hash.
sk->sk_txhash is used in the TCP Tx path to populate skb->hash, with
skb->l4_hash=1. With this, skb->l4_hash does not always indicate the hash is a
"canonical 4-tuple hash over transport ports" but rather a hash from the L4
layer that provides a uniform distribution over flows. Reword the comment
accordingly, to avoid misunderstandings.

Signed-off-by: Antoine Tenart
---
 include/linux/skbuff.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 738776ab8838..f54c84193b23 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -791,8 +791,8 @@ typedef unsigned char *sk_buff_data_t;
  *	@active_extensions: active extensions (skb_ext_id types)
  *	@ndisc_nodetype: router type (from link layer)
  *	@ooo_okay: allow the mapping of a socket to a queue to be changed
- *	@l4_hash: indicate hash is a canonical 4-tuple hash over transport
- *		ports.
+ *	@l4_hash: indicate hash is from layer 4 and provides a uniform
+ *		distribution over flows.
  *	@sw_hash: indicates hash was computed in software stack
  *	@wifi_acked_valid: wifi_acked was set
  *	@wifi_acked: whether frame was acked on wifi or not
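
As a reading aid for patches 2/4 and 4/4, the way a socket txhash ends up in a
reply skb can be modelled in plain userspace C. This is a simplified sketch
under stated assumptions, not kernel code: struct toy_skb, enum toy_hash_type
and both toy_* helpers are hypothetical stand-ins that only mimic the shape of
skb_set_hash() and of the check added to ip_send_unicast_reply().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few sk_buff fields discussed in this
 * series; this is NOT the kernel structure.
 */
struct toy_skb {
	uint32_t hash;
	bool l4_hash;	/* hash is from layer 4 */
	bool sw_hash;	/* hash was computed in the software stack */
};

/* Hypothetical counterpart of the kernel's packet hash types. */
enum toy_hash_type { TOY_HASH_NONE, TOY_HASH_L2, TOY_HASH_L3, TOY_HASH_L4 };

/* Record the hash value and whether it is a layer 4 hash. */
static void toy_skb_set_hash(struct toy_skb *skb, uint32_t hash,
			     enum toy_hash_type type)
{
	skb->hash = hash;
	skb->l4_hash = (type == TOY_HASH_L4);
	skb->sw_hash = false;
}

/* Same shape as the check added in patch 2/4: a txhash of 0 means no
 * socket hash is known, so the reply skb is left untouched.
 */
static void toy_reply_set_hash(struct toy_skb *nskb, uint32_t txhash)
{
	if (txhash)
		toy_skb_set_hash(nskb, txhash, TOY_HASH_L4);
}
```

Note that the value stored here is whatever the socket carries, which since
commit 877d1f6291f8 is a random number, so l4_hash only says where the hash
came from, not that it was derived from the 4-tuple.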