From patchwork Fri Oct 20 05:32:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yan Zhai
X-Patchwork-Id: 13430057
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6198920E7 for ; Fri, 20 Oct 2023 05:32:52 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cloudflare.com header.i=@cloudflare.com header.b="HzbvMgu0"
Received: from mail-qt1-x832.google.com (mail-qt1-x832.google.com [IPv6:2607:f8b0:4864:20::832]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D2BFAD51 for ; Thu, 19 Oct 2023 22:32:50 -0700 (PDT)
Received: by mail-qt1-x832.google.com with SMTP id d75a77b69052e-41cc0e9d92aso2610091cf.3 for ; Thu, 19 Oct 2023 22:32:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google09082023; t=1697779969; x=1698384769; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=KV869UUQKRDRmnZ80A1/n7z6+idYBNckQF54inYjS5g=; b=HzbvMgu0lCF6faIChfXVinjI8bdH8kCWR2utK3tXvu34fwxKuN+u7vQC/G26UceGv+ SzAlBbmmwoyVsMMJDSYTFzUfmx4HYcSpKW17SlKRor46zKxpjr6mwSaGGbxtPh0q8TgV yFKM4Ca7zSRagheV0IgDbFz4qAx4bITuix1XYKvWn7fxL9oGRbvNwfXRzE8OlLpwFzQo ypWPvQR5Vq97WzVambzsJJY1rCxxp6gIoMAoPfLEWaABiCZp25xVa0DQyTSr5qstDPcT 5/t822WT84u42eN7Th3pb3hEGHGm5n+IjmmX95weEKhR59aX+AtSUMlj/PwMdXA9gR5B QijA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697779969; x=1698384769; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to;
bh=KV869UUQKRDRmnZ80A1/n7z6+idYBNckQF54inYjS5g=; b=nQW/gRRUgDyp+N/Epfcfs5X7EW8p9BFf1diAenk0L7okXftfq3ao13gAzCFeBQ9m45 bb7ixgfvVCbc7VhO3zusbmKyuKqu0+7sjlF3jAyzseaasBy8AmkZ/tS1Sp0SJFAaoWx9 9sxLPbOzrKoIgEXCD3CvB+UtBG4YyrYLgX+gCGFPanO6MFNa6sFQVgw4ro3J7J/aiiSd WSmc18ApQmbEOp3IhoSVT0pPIGbewBxllJm0ubxJEW/5Uw0rLARs/Q8jyje7/ku69uz6 3yv3FbDvvmwupfj1WlhZuYnlObaAnqj8zTFtQah3lX/c4xI85kL5ub3PUvor/JjeOHCI FeGg==
X-Gm-Message-State: AOJu0YxcLhx+svMANy2u71kdrfgpg73VsmjlukcBWLJt80+VDwlqCr/A YYXpvKnhZMeL7B5X7IJLYr+YKBkXCzK12ZH1gK9vZw==
X-Google-Smtp-Source: AGHT+IE/sGedLpUr4sLQdyiagGVmU5aZdQlN8A0vLvzrjxYrZUHQz/La835JeQyomnGiOmnneXe/Bg==
X-Received: by 2002:a05:6214:d66:b0:66d:1e25:9774 with SMTP id 6-20020a0562140d6600b0066d1e259774mr916666qvs.61.1697779969555; Thu, 19 Oct 2023 22:32:49 -0700 (PDT)
Received: from debian.debian ([140.141.197.139]) by smtp.gmail.com with ESMTPSA id g20-20020ad457b4000000b0065d0dcc28e3sm421513qvx.73.2023.10.19.22.32.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Oct 2023 22:32:49 -0700 (PDT)
Date: Thu, 19 Oct 2023 22:32:47 -0700
From: Yan Zhai
To: netdev@vger.kernel.org
Cc: "David S. Miller", David Ahern, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Aya Levin, Tariq Toukan, linux-kernel@vger.kernel.org, kernel-team@cloudflare.com, Florian Westphal, Willem de Bruijn, Alexander H Duyck
Subject: [PATCH v3 net-next 1/3] ipv6: remove dst_allfrag test on ipv6 output
Message-ID:
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
X-Patchwork-Delegate: kuba@kernel.org

dst_allfrag was added before the first git commit:

https://www.mail-archive.com/bk-commits-head@vger.kernel.org/msg03399.html

The feature would send packets to the fragmentation path if a box
receives a PMTU value of less than 1280 bytes.
However, since commit 9d289715eb5c ("ipv6: stop sending PTB packets for
MTU < 1280"), such messages are simply discarded. The feature flag is
not supported by the iproute2 utility either. In theory one can still
manipulate it with a direct netlink message, but that is not ideal
because the flag was based on obsolete guidance from RFC-2460 (replaced
by RFC-8200).

The feature test would always return false at the moment, so remove it
from the output path.

Signed-off-by: Yan Zhai
---
 net/ipv6/ip6_output.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index a471c7e91761..ae87a3817d4a 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -189,7 +189,6 @@ static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff
 		return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu);
 
 	if ((skb->len > mtu && !skb_is_gso(skb)) ||
-	    dst_allfrag(skb_dst(skb)) ||
 	    (IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size))
 		return ip6_fragment(net, sk, skb, ip6_finish_output2);
 	else

From patchwork Fri Oct 20 05:32:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yan Zhai
X-Patchwork-Id: 13430058
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E26546AAB for ; Fri, 20 Oct 2023 05:32:53 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cloudflare.com header.i=@cloudflare.com header.b="W5HYo6Cz"
Received: from mail-qk1-x733.google.com (mail-qk1-x733.google.com [IPv6:2607:f8b0:4864:20::733]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6615D4C for ; Thu, 19 Oct 2023 22:32:52 -0700 (PDT)
Received: by mail-qk1-x733.google.com with SMTP id
af79cd13be357-7781b176131so23900285a.1 for ; Thu, 19 Oct 2023 22:32:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google09082023; t=1697779971; x=1698384771; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=9xpTogBhloZZ3cgxOHAvGjGFV2B6QzeETWE0jVApmhU=; b=W5HYo6CzHn0n+hoSyGPdXLRPk7sDe9kyTLVSvCoWtjMgPhbsnd6DDmh+tFU5wF8rfQ RJqEmYP3AU7cmo2moTlMJpLaQ3puZno8yFfEbkDM5H4pLKeTQmWxq8uEsW9WfOFD6I00 /fB2kMHootI9Y+koucBakT4yLrQPVCLHsxVuvi2cEVfQaTUWOhAUc65d1dHSyABO+ANL G9IiC9klk/NZuF12Lc9wfpFHF5RDocpAEGuncVjd3Dps2HpEI8lih07s0p4oQ2FrN67f osEGTtzFj/Nt/tAswwLlMTqWHyBSK0FbtVkfRqFOGX7QzKpWAgu15VCQBYGUQoOtQfTM kYwQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697779971; x=1698384771; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=9xpTogBhloZZ3cgxOHAvGjGFV2B6QzeETWE0jVApmhU=; b=aylCAAOa1GHjRxC9dKPnKn8xGaxRPRaHId6/rG0NgooNaQ2X+gJmaNoiH/0g/kIqZg l6MvpzotuqdBXH1KRf04SZfHkKhKe0qa5v7ZittwMy9c/8vURuPxs+S1G9oG/og9p/6+ wIbB9CR9X6GfnkexEEuHMmWv2qD4WBn/RVVt/mzq1ahcT0dthtHUhIIGZYFG6WqEbjGU 9BxqbVmh7JwlWeXMF5SDU5+QSX9qncH8epbgsPC5v8ACX6RxOPKA4Xmk74GsmIVBt9H3 kOH/iRVE3fq8HhhV8Trt6NYnKwbQjd31e21UQGK01B1dfQO9FFmBAKm4O3Won+cWKJ2E 899g==
X-Gm-Message-State: AOJu0YyPHvOi/x1vD90Ty4J70hB3IpetZDd2a3c2XafAvmBZRz+JHduG 1RJi+qr1C/XOBB3ZMmyhxxGeivWNugJPe0tQtGqrig==
X-Google-Smtp-Source: AGHT+IGSkCTT/zquQUwd3swviH/Fg+gESUjwa/1PtV2qfkXpbULyGWMHdHpXDFA8evkAJp9nQqvX3Q==
X-Received: by 2002:a05:620a:40c5:b0:778:8ce0:221a with SMTP id g5-20020a05620a40c500b007788ce0221amr824644qko.63.1697779971342; Thu, 19 Oct 2023 22:32:51 -0700 (PDT)
Received: from debian.debian ([140.141.197.139]) by smtp.gmail.com with ESMTPSA id bq12-20020a05620a468c00b007678973eaa1sm356816qkb.127.2023.10.19.22.32.50
(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Oct 2023 22:32:50 -0700 (PDT)
Date: Thu, 19 Oct 2023 22:32:49 -0700
From: Yan Zhai
To: netdev@vger.kernel.org
Cc: "David S. Miller", David Ahern, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Aya Levin, Tariq Toukan, linux-kernel@vger.kernel.org, kernel-team@cloudflare.com, Florian Westphal, Willem de Bruijn, Alexander H Duyck
Subject: [PATCH v3 net-next 2/3] ipv6: refactor ip6_finish_output for GSO handling
Message-ID: <496ccff707e16e98163d2a3fbcfbc1f824fd8ec3.1697779681.git.yan@cloudflare.com>
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
X-Patchwork-Delegate: kuba@kernel.org

Separate the handling of GSO and non-GSO packets to make the logic
cleaner. For GSO packets, the frag_max_size check can be omitted because
it is only useful for packets defragmented by netfilter hooks. Neither
the local output path nor the GRO logic produces GSO packets when
defragmentation is needed. This also mirrors what the IPv4 side code is
doing.

Suggested-by: Florian Westphal
Signed-off-by: Yan Zhai
Reviewed-by: Willem de Bruijn
---
 net/ipv6/ip6_output.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ae87a3817d4a..3270d56b5c37 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -170,6 +170,16 @@ ip6_finish_output_gso_slowpath_drop(struct net *net, struct sock *sk,
 	return ret;
 }
 
+static int ip6_finish_output_gso(struct net *net, struct sock *sk,
+				 struct sk_buff *skb, unsigned int mtu)
+{
+	if (!(IP6CB(skb)->flags & IP6SKB_FAKEJUMBO) &&
+	    !skb_gso_validate_network_len(skb, mtu))
+		return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu);
+
+	return ip6_finish_output2(net, sk, skb);
+}
+
 static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
 	unsigned int mtu;
@@ -183,16 +193,14 @@ static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff
 #endif
 
 	mtu = ip6_skb_dst_mtu(skb);
-	if (skb_is_gso(skb) &&
-	    !(IP6CB(skb)->flags & IP6SKB_FAKEJUMBO) &&
-	    !skb_gso_validate_network_len(skb, mtu))
-		return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu);
+	if (skb_is_gso(skb))
+		return ip6_finish_output_gso(net, sk, skb, mtu);
 
-	if ((skb->len > mtu && !skb_is_gso(skb)) ||
+	if (skb->len > mtu ||
 	    (IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size))
 		return ip6_fragment(net, sk, skb, ip6_finish_output2);
-	else
-		return ip6_finish_output2(net, sk, skb);
+
+	return ip6_finish_output2(net, sk, skb);
 }
 
 static int ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff *skb)

From patchwork Fri Oct 20 05:32:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yan Zhai
X-Patchwork-Id: 13430059
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 23F577476 for ; Fri, 20 Oct 2023 05:32:55 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cloudflare.com header.i=@cloudflare.com header.b="UCHRbXYL"
Received: from mail-vk1-xa33.google.com (mail-vk1-xa33.google.com [IPv6:2607:f8b0:4864:20::a33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 800E3D57 for ; Thu, 19 Oct 2023 22:32:54 -0700 (PDT)
Received: by mail-vk1-xa33.google.com with SMTP id 71dfb90a1353d-49d0a704ac7so168676e0c.1 for ; Thu, 19 Oct 2023 22:32:54 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google09082023; t=1697779973; x=1698384773; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=QERO0uRdQXqqSaQdJAhee2LXGn28nhRHE84yn6ioXDc=; b=UCHRbXYL3xC0knyXFuBAsnyrkLGlT5t1Rqg+cpKWHrQednZoHGoTfiInP7UE+NCSFR wXSUe6Jwu7zzaVd3n85LoZnXW8+dqlE1Bhuxl3jPFfvL6x46DSNj9/D3DLF2LqmuCB9D tWbot+5hrUbZVTCTn349B4xeWIal6d8nHMFgT+1sGLA7ZjBc5ojYa5D9rR9B8t5ub5VW NV0+7zd6FNrsiehybQSpW6+JXRZkUN448cicU2ypFjj3plmy96MfLdNhMBjvLdQQpkcG wiKlBsYehe+nSW73EcWYNU+qIicC6wCqfKAZSWXlkk4Az0wwxr7oLzYz2pnDrUz+I0OD VnwQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697779973; x=1698384773; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=QERO0uRdQXqqSaQdJAhee2LXGn28nhRHE84yn6ioXDc=; b=OJ8nbghIa30DOxeU2SZ+Yk07djvHLcVjHd5vvKH5z2soqyrBMJBrOFL+27SRO0Cufi ZZ9BgYq6rP4TKeNEe2B/WFV+c0PCb7xznm6s6Km6LcCX/GdEJBhJv+bXCx4Z3VtN0+sF Z3GBUoqHdvcNE9DYSAwM5dls1mwxOwcNrzxNELp5qRfSuEas/jE4MCr9CnNtzd8R/WlG VNntvmAfAADzip/uowQpKzUf7qNRlYpe5BNv+a+w/1TFTdJf/MTO1wPRl0+ykDh+bm3x
n/2FWsb+ruNFGIh8I/e+sS90kv93rdS4gd3R9XTayGBClNHL0F5LE6SKkFG8uTzH0cfZ n9WQ==
X-Gm-Message-State: AOJu0Ywnf+hkJI/9ghRiWfkCe+rPGT67urQ8yPdDQe8JSaHFmzn1+j1x fQwdm6WwNRFr3aEmcRJyYu1zwsZHwzsbmICkR8ZlEA==
X-Google-Smtp-Source: AGHT+IEuHUJNvQq/HPA6H9apAJz8f05F78aDSMAqahV84NRcnvPy5dpCqHEpkhIlb+0srBHZfqHiYw==
X-Received: by 2002:a67:ae05:0:b0:452:5798:64bd with SMTP id x5-20020a67ae05000000b00452579864bdmr920744vse.35.1697779973258; Thu, 19 Oct 2023 22:32:53 -0700 (PDT)
Received: from debian.debian ([140.141.197.139]) by smtp.gmail.com with ESMTPSA id vq6-20020a05620a558600b0076e672f535asm359098qkn.57.2023.10.19.22.32.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Oct 2023 22:32:52 -0700 (PDT)
Date: Thu, 19 Oct 2023 22:32:51 -0700
From: Yan Zhai
To: netdev@vger.kernel.org
Cc: "David S. Miller", David Ahern, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Aya Levin, Tariq Toukan, linux-kernel@vger.kernel.org, kernel-team@cloudflare.com, Florian Westphal, Willem de Bruijn, Alexander H Duyck
Subject: [PATCH v3 net-next 3/3] ipv6: avoid atomic fragment on GSO packets
Message-ID: <77423bb774612f0e8eaabfd9501d03389ff2cdbd.1697779681.git.yan@cloudflare.com>
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
X-Patchwork-Delegate: kuba@kernel.org

When the ipv6 stack outputs a GSO packet, if its gso_size is larger than
the dst MTU, then all segments would be fragmented. However, it is
possible for a GSO packet to have a trailing segment whose actual size
is smaller than both gso_size and the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
existing report from APNIC also shows that atomic fragments are more
likely to be dropped even though they are equivalent to a no-op [1].

Add an extra check in the GSO slow output path.
For each segment from the original over-sized packet, if it fits within
the path MTU, then avoid generating an atomic fragment.

Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
Reported-by: David Wragg
Signed-off-by: Yan Zhai
---
 net/ipv6/ip6_output.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3270d56b5c37..3d4e8edaa10b 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -162,7 +162,13 @@ ip6_finish_output_gso_slowpath_drop(struct net *net, struct sock *sk,
 		int err;
 
 		skb_mark_not_on_list(segs);
-		err = ip6_fragment(net, sk, segs, ip6_finish_output2);
+		/* Last GSO segment can be smaller than gso_size (and MTU).
+		 * Adding a fragment header would produce an "atomic fragment",
+		 * which is considered harmful (RFC-8021). Avoid that.
+		 */
+		err = segs->len > mtu ?
+			ip6_fragment(net, sk, segs, ip6_finish_output2) :
+			ip6_finish_output2(net, sk, segs);
 		if (err && ret == 0)
 			ret = err;
 	}
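
The per-segment decision in the 3/3 hunk can be modeled in userspace to see
which segments end up on the fragmentation path. This is only a rough
sketch, not kernel code: split_gso() and needs_fragment() are hypothetical
helpers invented for illustration, and the model assumes every segment
carries the same hdr_len bytes of headers in front of its payload.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these helpers do not exist in the kernel. */

/* Same comparison as the "segs->len > mtu" test in the 3/3 hunk:
 * only a segment larger than the MTU is handed to fragmentation.
 */
static bool needs_fragment(unsigned int seg_len, unsigned int mtu)
{
	return seg_len > mtu;
}

/* Split payload_len bytes into chunks of at most gso_size bytes and
 * store each resulting segment length (hdr_len + chunk) in out[].
 * Returns the number of segments written, at most max.
 */
static size_t split_gso(unsigned int payload_len, unsigned int gso_size,
			unsigned int hdr_len, unsigned int *out, size_t max)
{
	size_t n = 0;

	if (gso_size == 0)
		return 0;	/* nothing sensible to segment */

	while (payload_len > 0 && n < max) {
		unsigned int chunk = payload_len < gso_size ?
				     payload_len : gso_size;

		out[n++] = hdr_len + chunk;
		payload_len -= chunk;
	}
	return n;
}
```

For a 3000-byte payload with gso_size 1400, 60 bytes of headers and an MTU
of 1280, the model yields segments of 1460, 1460 and 260 bytes. Only the
first two exceed the MTU, so the 260-byte tail would be sent as-is rather
than wrapped in a fragment header.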