From patchwork Mon May 6 09:35:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Gobert
X-Patchwork-Id: 13655212
X-Patchwork-Delegate: kuba@kernel.org
From: Richard Gobert
To: davem@davemloft.net, edumazet@google.com,
 willemdebruijn.kernel@gmail.com, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, alobakin@pm.me, shuah@kernel.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org
Cc: Richard Gobert
Subject: [PATCH net-next v8 1/3] net: gro: use cb instead of skb->network_header
Date: Mon, 6 May 2024 11:35:48 +0200
Message-Id: <20240506093550.128210-2-richardbgobert@gmail.com>
In-Reply-To: <20240506093550.128210-1-richardbgobert@gmail.com>
References: <20240506093550.128210-1-richardbgobert@gmail.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

This patch converts references of skb->network_header to napi_gro_cb's
network_offset and inner_network_offset.

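For illustration only (this is not part of the patch), the lookup done by
skb_gro_network_offset() can be modeled in userspace roughly as follows.
struct gro_cb_model and model_network_offset() are hypothetical names, and
the layout assumes that network_offset and inner_network_offset alias a
two-element network_offsets[] array, so that the array can be indexed by
the encapsulation flag:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the L3 offset fields of napi_gro_cb. */
struct gro_cb_model {
	union {
		struct {
			uint16_t network_offset;       /* outer L3 header */
			uint16_t inner_network_offset; /* inner L3 header */
		};
		uint16_t network_offsets[2];           /* same storage, indexable */
	};
	uint8_t encap_mark; /* assumed to be 1 once encapsulation was seen */
};

/* Models skb_gro_network_offset(): pick the offset for the current level. */
static int model_network_offset(const struct gro_cb_model *cb)
{
	assert(cb->encap_mark <= 1); /* only two nesting levels are tracked */
	return cb->network_offsets[cb->encap_mark];
}
```

With offsets 14 (outer) and 34 (inner), the model returns 14 while
encap_mark is 0 and 34 once it is 1. The same indexing is what lets
tcp4_gro_complete and tcp6_gro_complete use skb->encapsulation as the
index.
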
Signed-off-by: Richard Gobert
---
 include/net/gro.h        | 9 +++++++--
 net/ipv4/af_inet.c       | 4 ----
 net/ipv4/tcp_offload.c   | 3 ++-
 net/ipv6/ip6_offload.c   | 5 ++---
 net/ipv6/tcpv6_offload.c | 3 ++-
 5 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index c1d4ca0463a1..1faff23b66e8 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -181,12 +181,17 @@ static inline void *skb_gro_header(struct sk_buff *skb, unsigned int hlen,
 	return ptr;
 }
 
+static inline int skb_gro_network_offset(const struct sk_buff *skb)
+{
+	return NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark];
+}
+
 static inline void *skb_gro_network_header(const struct sk_buff *skb)
 {
 	if (skb_gro_may_pull(skb, skb_gro_offset(skb)))
-		return skb_gro_header_fast(skb, skb_network_offset(skb));
+		return skb_gro_header_fast(skb, skb_gro_network_offset(skb));
 
-	return skb_network_header(skb);
+	return skb->data + skb_gro_network_offset(skb);
 }
 
 static inline __wsum inet_gro_compute_pseudo(const struct sk_buff *skb,
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index a7bad18bc8b5..428196e1541f 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1569,10 +1569,6 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
 
 	NAPI_GRO_CB(skb)->is_atomic = !!(iph->frag_off & htons(IP_DF));
 	NAPI_GRO_CB(skb)->flush |= flush;
-	skb_set_network_header(skb, off);
-	/* The above will be needed by the transport layer if there is one
-	 * immediately following this IP hdr.
-	 */
 	NAPI_GRO_CB(skb)->inner_network_offset = off;
 
 	/* Note : No need to call skb_gro_postpull_rcsum() here,
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index fab0973f995b..b70ae50e658d 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -330,7 +330,8 @@ struct sk_buff *tcp4_gro_receive(struct list_head *head, struct sk_buff *skb)
 
 INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
 {
-	const struct iphdr *iph = ip_hdr(skb);
+	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
 	struct tcphdr *th = tcp_hdr(skb);
 
 	th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index c8b909a9904f..5d6b875a4638 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -67,7 +67,7 @@ static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
 		off += len;
 	}
 
-	skb_gro_pull(skb, off - skb_network_offset(skb));
+	skb_gro_pull(skb, off - skb_gro_network_offset(skb));
 	return proto;
 }
 
@@ -236,7 +236,6 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 	if (unlikely(!iph))
 		goto out;
 
-	skb_set_network_header(skb, off);
 	NAPI_GRO_CB(skb)->inner_network_offset = off;
 
 	flush += ntohs(iph->payload_len) != skb->len - hlen;
@@ -260,7 +259,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 	NAPI_GRO_CB(skb)->proto = proto;
 
 	flush--;
-	nlen = skb_network_header_len(skb);
+	nlen = skb_gro_offset(skb) - off;
 
 	list_for_each_entry(p, head, list) {
 		const struct ipv6hdr *iph2;
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index 4b07d1e6c952..e70d60e0f86f 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -29,7 +29,8 @@ struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
 
 INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int thoff)
 {
-	const struct ipv6hdr *iph = ipv6_hdr(skb);
+	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
+	const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
 	struct tcphdr *th = tcp_hdr(skb);
 
 	th->check = ~tcp_v6_check(skb->len - thoff, &iph->saddr,

From patchwork Mon May 6 09:47:09 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Gobert
X-Patchwork-Id: 13655230
X-Patchwork-Delegate: kuba@kernel.org
Message-ID: <1ed21e6d-7cbc-43e3-8933-fc40562b70b2@gmail.com>
Date: Mon, 6 May 2024 11:47:09 +0200
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
Subject: [PATCH net-next v8 2/3] net: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment
To: davem@davemloft.net, edumazet@google.com,
 willemdebruijn.kernel@gmail.com, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, alobakin@pm.me, shuah@kernel.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org
References: <20240506093550.128210-1-richardbgobert@gmail.com>
From: Richard Gobert
In-Reply-To: <20240506093550.128210-1-richardbgobert@gmail.com>

{inet,ipv6}_gro_receive functions perform flush checks (ttl, flags,
iph->id, ...) against all packets in a loop. These flush checks are used in
all merging UDP and TCP flows.

These checks need to be done only once and only against the found p skb,
since they only affect flush and not same_flow.

This patch leverages correct network header offsets from the cb for both
outer and inner network headers - allowing these checks to be done only
once, in tcp_gro_receive and udp_gro_receive_segment. As a result,
NAPI_GRO_CB(p)->flush is not used at all. In addition, flush_id checks are
more declarative and contained in inet_gro_flush, thus removing the need
for flush_id in napi_gro_cb.

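As a rough userspace sketch (not kernel code), the way the L3 headers are
recovered from the transport header can be modeled as below. struct walk_cb,
l3_header(), ip_version() and l3_levels() are hypothetical names; the only
behavior taken from this patch is that the L3 header at level i sits
(off - network_offsets[i]) bytes before the transport header, and that the
IP version field selects between the IPv4 and IPv6 checks:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the napi_gro_cb fields used by the walk. */
struct walk_cb {
	uint16_t network_offsets[2]; /* [0] outer, [1] inner L3 offset */
	uint8_t  encap_mark;         /* assumed 1 when encapsulated */
};

/*
 * Recover the L3 header at nesting level i from the transport header
 * pointer th, where off is the offset of th from the packet start.
 */
static const uint8_t *l3_header(const struct walk_cb *cb, const uint8_t *th,
				int off, int i)
{
	uint16_t diff;

	assert(i >= 0 && i <= cb->encap_mark);
	diff = (uint16_t)(off - cb->network_offsets[i]);

	return th - diff;
}

/* The version is the top nibble of the first byte for IPv4 and IPv6. */
static int ip_version(const uint8_t *nh)
{
	return nh[0] >> 4;
}

/* Number of L3 headers a single flush pass would visit. */
static int l3_levels(const struct walk_cb *cb)
{
	return cb->encap_mark + 1;
}
```

For an Ethernet + IPv4 + IPv6 + TCP packet with offsets 14, 34 and 74, level
0 resolves to the IPv4 header and level 1 to the IPv6 header. In the patch
the same byte distance is applied to the held packet's transport header
(th2), which presumably relies on both packets of a flow sharing the same
header layout.
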
This results in less parsing code for non-loop flush tests for TCP and UDP
flows.

To make sure results are not within noise range - I've made netfilter drop
all TCP packets, and measured CPU performance in GRO (in this case GRO is
responsible for about 50% of the CPU utilization).

perf top while replaying 64 parallel IP/TCP streams merging in GRO:
(gro_network_flush is compiled inline to tcp_gro_receive)
net-next:
        6.94% [kernel] [k] inet_gro_receive
        3.02% [kernel] [k] tcp_gro_receive

patch applied:
        4.27% [kernel] [k] tcp_gro_receive
        4.22% [kernel] [k] inet_gro_receive

perf top while replaying 64 parallel IP/IP/TCP streams merging in GRO (same
results for any encapsulation, in this case inet_gro_receive is top
offender in net-next)
net-next:
        10.09% [kernel] [k] inet_gro_receive
        2.08% [kernel] [k] tcp_gro_receive

patch applied:
        6.97% [kernel] [k] inet_gro_receive
        3.68% [kernel] [k] tcp_gro_receive

Signed-off-by: Richard Gobert
---
 include/net/gro.h      | 66 ++++++++++++++++++++++++++++++++++++++----
 net/core/gro.c         |  3 --
 net/ipv4/af_inet.c     | 41 +-------------------------
 net/ipv4/tcp_offload.c | 15 ++--------
 net/ipv4/udp_offload.c | 10 ++-----
 net/ipv6/ip6_offload.c | 11 -------
 6 files changed, 65 insertions(+), 81 deletions(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index 1faff23b66e8..0565dd716ab7 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -36,15 +36,15 @@ struct napi_gro_cb {
 	/* This is non-zero if the packet cannot be merged with the new skb. */
 	u16	flush;
 
-	/* Save the IP ID here and check when we get to the transport layer */
-	u16	flush_id;
-
 	/* Number of segments aggregated. */
 	u16	count;
 
 	/* Used in ipv6_gro_receive() and foo-over-udp and esp-in-udp */
 	u16	proto;
 
+	/* used to support CHECKSUM_COMPLETE for tunneling protocols */
+	__wsum	csum;
+
 /* Used in napi_gro_cb::free */
 #define NAPI_GRO_FREE             1
 #define NAPI_GRO_FREE_STOLEN_HEAD 2
@@ -85,9 +85,6 @@ struct napi_gro_cb {
 		u8	is_flist:1;
 	);
 
-	/* used to support CHECKSUM_COMPLETE for tunneling protocols */
-	__wsum	csum;
-
 	/* L3 offsets */
 	union {
 		struct {
@@ -442,6 +439,63 @@ static inline __wsum ip6_gro_compute_pseudo(const struct sk_buff *skb,
 					    skb_gro_len(skb), proto, 0));
 }
 
+static inline int inet_gro_flush(const struct iphdr *iph, const struct iphdr *iph2,
+				 struct sk_buff *p, bool outer)
+{
+	const u32 id = ntohl(*(__be32 *)&iph->id);
+	const u32 id2 = ntohl(*(__be32 *)&iph2->id);
+	const u16 flush_id = (id >> 16) - (id2 >> 16);
+	const u16 count = NAPI_GRO_CB(p)->count;
+	const u32 df = id & IP_DF;
+	u32 is_atomic;
+	int flush;
+
+	/* All fields must match except length and checksum. */
+	flush = (iph->ttl ^ iph2->ttl) | (iph->tos ^ iph2->tos) | (df ^ (id2 & IP_DF));
+
+	if (outer && df)
+		return flush;
+
+	/* When we receive our second frame we can make a decision on if we
+	 * continue this flow as an atomic flow with a fixed ID or if we use
+	 * an incrementing ID.
+	 */
+	NAPI_GRO_CB(p)->is_atomic |= (count == 1 && df && flush_id == 0);
+	is_atomic = (df && NAPI_GRO_CB(p)->is_atomic) - 1;
+
+	return flush | (flush_id ^ (count & is_atomic));
+}
+
+static inline int ipv6_gro_flush(const struct ipv6hdr *iph, const struct ipv6hdr *iph2)
+{
+	/* */
+	__be32 first_word = *(__be32 *)iph ^ *(__be32 *)iph2;
+
+	/* Flush if Traffic Class fields are different. */
+	return !!((first_word & htonl(0x0FF00000)) |
+		(__force __be32)(iph->hop_limit ^ iph2->hop_limit));
+}
+
+static inline int gro_network_flush(const void *th, const void *th2, struct sk_buff *p, int off)
+{
+	const bool encap_mark = NAPI_GRO_CB(p)->encap_mark;
+	int flush = 0;
+	int i;
+
+	for (i = 0; i <= encap_mark; i++) {
+		const u16 diff = off - NAPI_GRO_CB(p)->network_offsets[i];
+		const void *nh = th - diff;
+		const void *nh2 = th2 - diff;
+
+		if (((struct iphdr *)nh)->version == 6)
+			flush |= ipv6_gro_flush(nh, nh2);
+		else
+			flush |= inet_gro_flush(nh, nh2, p, i != encap_mark);
+	}
+
+	return flush;
+}
+
 int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb);
 
 /* Pass the currently batched GRO_NORMAL SKBs up to the stack. */
diff --git a/net/core/gro.c b/net/core/gro.c
index 99a45a5211c9..3e9422c23bc9 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -331,8 +331,6 @@ static void gro_list_prepare(const struct list_head *head,
 	list_for_each_entry(p, head, list) {
 		unsigned long diffs;
 
-		NAPI_GRO_CB(p)->flush = 0;
-
 		if (hash != skb_get_hash_raw(p)) {
 			NAPI_GRO_CB(p)->same_flow = 0;
 			continue;
@@ -472,7 +470,6 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 					 sizeof(u32))); /* Avoid slow unaligned acc */
 	*(u32 *)&NAPI_GRO_CB(skb)->zeroed = 0;
 	NAPI_GRO_CB(skb)->flush = skb_has_frag_list(skb);
-	NAPI_GRO_CB(skb)->is_atomic = 1;
 	NAPI_GRO_CB(skb)->count = 1;
 	if (unlikely(skb_is_gso(skb))) {
 		NAPI_GRO_CB(skb)->count = skb_shinfo(skb)->gso_segs;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 428196e1541f..44564d009e95 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1482,7 +1482,6 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
 	struct sk_buff *p;
 	unsigned int hlen;
 	unsigned int off;
-	unsigned int id;
 	int flush = 1;
 	int proto;
 
@@ -1508,13 +1507,10 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
 		goto out;
 
 	NAPI_GRO_CB(skb)->proto = proto;
-	id = ntohl(*(__be32 *)&iph->id);
-	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id & ~IP_DF));
-	id >>= 16;
+	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
 
 	list_for_each_entry(p, head, list) {
 		struct iphdr *iph2;
-		u16 flush_id;
 
 		if (!NAPI_GRO_CB(p)->same_flow)
 			continue;
@@ -1531,43 +1527,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
 			NAPI_GRO_CB(p)->same_flow = 0;
 			continue;
 		}
-
-		/* All fields must match except length and checksum. */
-		NAPI_GRO_CB(p)->flush |=
-			(iph->ttl ^ iph2->ttl) |
-			(iph->tos ^ iph2->tos) |
-			((iph->frag_off ^ iph2->frag_off) & htons(IP_DF));
-
-		NAPI_GRO_CB(p)->flush |= flush;
-
-		/* We need to store of the IP ID check to be included later
-		 * when we can verify that this packet does in fact belong
-		 * to a given flow.
-		 */
-		flush_id = (u16)(id - ntohs(iph2->id));
-
-		/* This bit of code makes it much easier for us to identify
-		 * the cases where we are doing atomic vs non-atomic IP ID
-		 * checks. Specifically an atomic check can return IP ID
-		 * values 0 - 0xFFFF, while a non-atomic check can only
-		 * return 0 or 0xFFFF.
-		 */
-		if (!NAPI_GRO_CB(p)->is_atomic ||
-		    !(iph->frag_off & htons(IP_DF))) {
-			flush_id ^= NAPI_GRO_CB(p)->count;
-			flush_id = flush_id ? 0xFFFF : 0;
-		}
-
-		/* If the previous IP ID value was based on an atomic
-		 * datagram we can overwrite the value and ignore it.
-		 */
-		if (NAPI_GRO_CB(skb)->is_atomic)
-			NAPI_GRO_CB(p)->flush_id = flush_id;
-		else
-			NAPI_GRO_CB(p)->flush_id |= flush_id;
 	}
 
-	NAPI_GRO_CB(skb)->is_atomic = !!(iph->frag_off & htons(IP_DF));
 	NAPI_GRO_CB(skb)->flush |= flush;
 	NAPI_GRO_CB(skb)->inner_network_offset = off;
 
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index b70ae50e658d..625b4800b3ed 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -232,9 +232,7 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb)
 	goto out_check_final;
 
 found:
-	/* Include the IP ID check below from the inner most IP hdr */
-	flush = NAPI_GRO_CB(p)->flush;
-	flush |= (__force int)(flags & TCP_FLAG_CWR);
+	flush = (__force int)(flags & TCP_FLAG_CWR);
 	flush |= (__force int)((flags ^ tcp_flag_word(th2)) &
 		  ~(TCP_FLAG_CWR | TCP_FLAG_FIN | TCP_FLAG_PSH));
 	flush |= (__force int)(th->ack_seq ^ th2->ack_seq);
@@ -242,16 +240,7 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb)
 		flush |= *(u32 *)((u8 *)th + i) ^
 			 *(u32 *)((u8 *)th2 + i);
 
-	/* When we receive our second frame we can made a decision on if we
-	 * continue this flow as an atomic flow with a fixed ID or if we use
-	 * an incrementing ID.
-	 */
-	if (NAPI_GRO_CB(p)->flush_id != 1 ||
-	    NAPI_GRO_CB(p)->count != 1 ||
-	    !NAPI_GRO_CB(p)->is_atomic)
-		flush |= NAPI_GRO_CB(p)->flush_id;
-	else
-		NAPI_GRO_CB(p)->is_atomic = false;
+	flush |= gro_network_flush(th, th2, p, off);
 
 	mss = skb_shinfo(p)->gso_size;
 
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 8721fe5beca2..5d9696eaab8a 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -466,6 +466,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 					       struct sk_buff *skb)
 {
 	struct udphdr *uh = udp_gro_udphdr(skb);
+	int off = skb_gro_offset(skb);
 	struct sk_buff *pp = NULL;
 	struct udphdr *uh2;
 	struct sk_buff *p;
@@ -505,14 +506,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 			return p;
 		}
 
-		flush = NAPI_GRO_CB(p)->flush;
-
-		if (NAPI_GRO_CB(p)->flush_id != 1 ||
-		    NAPI_GRO_CB(p)->count != 1 ||
-		    !NAPI_GRO_CB(p)->is_atomic)
-			flush |= NAPI_GRO_CB(p)->flush_id;
-		else
-			NAPI_GRO_CB(p)->is_atomic = false;
+		flush = gro_network_flush(uh, uh2, p, off);
 
 		/* Terminate the flow on len mismatch or if it grow "too much".
 		 * Under small packet flood GRO count could elsewhere grow a lot
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 5d6b875a4638..72991a02cb30 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -290,19 +290,8 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 				   nlen - sizeof(struct ipv6hdr)))
 				goto not_same_flow;
 		}
-		/* flush if Traffic Class fields are different */
-		NAPI_GRO_CB(p)->flush |= !!((first_word & htonl(0x0FF00000)) |
-			(__force __be32)(iph->hop_limit ^ iph2->hop_limit));
-		NAPI_GRO_CB(p)->flush |= flush;
-
-		/* If the previous IP ID value was based on an atomic
-		 * datagram we can overwrite the value and ignore it.
-		 */
-		if (NAPI_GRO_CB(skb)->is_atomic)
-			NAPI_GRO_CB(p)->flush_id = 0;
 	}
 
-	NAPI_GRO_CB(skb)->is_atomic = true;
 	NAPI_GRO_CB(skb)->flush |= flush;
 
 	skb_gro_postpull_rcsum(skb, iph, nlen);

From patchwork Mon May 6 09:51:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Gobert
X-Patchwork-Id: 13655237
X-Patchwork-Delegate: kuba@kernel.org
Message-ID: <761374d3-1c76-4dc2-a4cc-7bd693deb453@gmail.com>
Date: Mon, 6 May 2024 11:51:28 +0200
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
Subject: [PATCH net-next v8 3/3] selftests/net: add flush id selftests
To: davem@davemloft.net, edumazet@google.com,
 willemdebruijn.kernel@gmail.com, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, alobakin@pm.me, shuah@kernel.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org
References: <20240506093550.128210-1-richardbgobert@gmail.com>
From: Richard Gobert
In-Reply-To: <20240506093550.128210-1-richardbgobert@gmail.com>

Added flush id selftests to test different cases where DF flag is set or
unset and id value changes in the following packets. All cases where the
packets should coalesce or should not coalesce are tested.

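The expected outcome of each case can be reasoned about with a small
userspace model of the ID/DF decision. This is a simplified sketch, not
the kernel code: struct flow_model, model_can_merge() and model_coalesced()
are hypothetical names, only the IP ID and DF handling is modeled (ttl, tos
and the transport-layer checks are ignored), and a rejected packet simply
ends the run instead of starting a new flow:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow state, standing in for the held skb's GRO cb. */
struct flow_model {
	uint16_t first_id;  /* IP ID of the first packet of the flow */
	uint16_t count;     /* packets merged so far */
	bool     df;        /* DF bit of the flow */
	bool     is_atomic; /* flow was detected as fixed-ID */
};

/* Decide whether a packet with the given ID and DF bit may be merged. */
static bool model_can_merge(struct flow_model *p, uint16_t id, bool df)
{
	const uint16_t flush_id = (uint16_t)(id - p->first_id);
	uint16_t mask;

	if (df != p->df)
		return false;

	/* The second packet decides: fixed ID with DF set means atomic. */
	if (p->count == 1 && df && flush_id == 0)
		p->is_atomic = true;

	/* Atomic flows need an ID delta of 0, others a delta of count. */
	mask = (df && p->is_atomic) ? 0 : 0xFFFF;

	return (flush_id ^ (p->count & mask)) == 0;
}

/* Feed the IDs in order; return how many packets end up in the first skb. */
static int model_coalesced(const uint16_t *ids, int n, bool df)
{
	struct flow_model p = { 0 };
	int i;

	assert(n > 0);
	p.first_id = ids[0];
	p.count = 1;
	p.df = df;

	for (i = 1; i < n; i++) {
		if (!model_can_merge(&p, ids[i], df))
			break;
		p.count++;
	}

	return p.count;
}
```

Under these assumptions the six cases map to IDs {8,9}, {8,8}, {8,9} and
{8,8} (the last two with DF=0), then {8,9,9} and {8,8,9}; only the DF=0
fixed-ID case leaves the first packet unmerged, and the three-packet cases
stop after two packets.
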
Signed-off-by: Richard Gobert
---
 tools/testing/selftests/net/gro.c | 147 ++++++++++++++++++++++++++++++
 1 file changed, 147 insertions(+)

diff --git a/tools/testing/selftests/net/gro.c b/tools/testing/selftests/net/gro.c
index 353e1e867fbb..5dc7b539ccbf 100644
--- a/tools/testing/selftests/net/gro.c
+++ b/tools/testing/selftests/net/gro.c
@@ -617,6 +617,123 @@ static void add_ipv6_exthdr(void *buf, void *optpkt, __u8 exthdr_type, char *ext
 	iph->payload_len = htons(ntohs(iph->payload_len) + MIN_EXTHDR_SIZE);
 }
 
+static void fix_ip4_checksum(struct iphdr *iph)
+{
+	iph->check = 0;
+	iph->check = checksum_fold(iph, sizeof(struct iphdr), 0);
+}
+
+static void send_flush_id_case(int fd, struct sockaddr_ll *daddr, int tcase)
+{
+	static char buf1[MAX_HDR_LEN + PAYLOAD_LEN];
+	static char buf2[MAX_HDR_LEN + PAYLOAD_LEN];
+	static char buf3[MAX_HDR_LEN + PAYLOAD_LEN];
+	bool send_three = false;
+	struct iphdr *iph1;
+	struct iphdr *iph2;
+	struct iphdr *iph3;
+
+	iph1 = (struct iphdr *)(buf1 + ETH_HLEN);
+	iph2 = (struct iphdr *)(buf2 + ETH_HLEN);
+	iph3 = (struct iphdr *)(buf3 + ETH_HLEN);
+
+	create_packet(buf1, 0, 0, PAYLOAD_LEN, 0);
+	create_packet(buf2, PAYLOAD_LEN, 0, PAYLOAD_LEN, 0);
+	create_packet(buf3, PAYLOAD_LEN * 2, 0, PAYLOAD_LEN, 0);
+
+	switch (tcase) {
+	case 0: /* DF=1, Incrementing - should coalesce */
+		iph1->frag_off |= htons(IP_DF);
+		iph1->id = htons(8);
+		fix_ip4_checksum(iph1);
+
+		iph2->frag_off |= htons(IP_DF);
+		iph2->id = htons(9);
+		fix_ip4_checksum(iph2);
+		break;
+
+	case 1: /* DF=1, Fixed - should coalesce */
+		iph1->frag_off |= htons(IP_DF);
+		iph1->id = htons(8);
+		fix_ip4_checksum(iph1);
+
+		iph2->frag_off |= htons(IP_DF);
+		iph2->id = htons(8);
+		fix_ip4_checksum(iph2);
+		break;
+
+	case 2: /* DF=0, Incrementing - should coalesce */
+		iph1->frag_off &= ~htons(IP_DF);
+		iph1->id = htons(8);
+		fix_ip4_checksum(iph1);
+
+		iph2->frag_off &= ~htons(IP_DF);
+		iph2->id = htons(9);
+		fix_ip4_checksum(iph2);
+		break;
+
+	case 3: /* DF=0, Fixed - should not coalesce */
+		iph1->frag_off &= ~htons(IP_DF);
+		iph1->id = htons(8);
+		fix_ip4_checksum(iph1);
+
+		iph2->frag_off &= ~htons(IP_DF);
+		iph2->id = htons(8);
+		fix_ip4_checksum(iph2);
+		break;
+
+	case 4: /* DF=1, two packets incrementing, and one fixed - should
+		 * coalesce only the first two packets
+		 */
+		iph1->frag_off |= htons(IP_DF);
+		iph1->id = htons(8);
+		fix_ip4_checksum(iph1);
+
+		iph2->frag_off |= htons(IP_DF);
+		iph2->id = htons(9);
+		fix_ip4_checksum(iph2);
+
+		iph3->frag_off |= htons(IP_DF);
+		iph3->id = htons(9);
+		fix_ip4_checksum(iph3);
+		send_three = true;
+		break;
+
+	case 5: /* DF=1, two packets fixed, and one incrementing - should
+		 * coalesce only the first two packets
+		 */
+		iph1->frag_off |= htons(IP_DF);
+		iph1->id = htons(8);
+		fix_ip4_checksum(iph1);
+
+		iph2->frag_off |= htons(IP_DF);
+		iph2->id = htons(8);
+		fix_ip4_checksum(iph2);
+
+		iph3->frag_off |= htons(IP_DF);
+		iph3->id = htons(9);
+		fix_ip4_checksum(iph3);
+		send_three = true;
+		break;
+	}
+
+	write_packet(fd, buf1, total_hdr_len + PAYLOAD_LEN, daddr);
+	write_packet(fd, buf2, total_hdr_len + PAYLOAD_LEN, daddr);
+
+	if (send_three)
+		write_packet(fd, buf3, total_hdr_len + PAYLOAD_LEN, daddr);
+}
+
+static void test_flush_id(int fd, struct sockaddr_ll *daddr, char *fin_pkt)
+{
+	for (int i = 0; i < 6; i++) {
+		sleep(1);
+		send_flush_id_case(fd, daddr, i);
+		sleep(1);
+		write_packet(fd, fin_pkt, total_hdr_len, daddr);
+	}
+}
+
 static void send_ipv6_exthdr(int fd, struct sockaddr_ll *daddr, char *ext_data1, char *ext_data2)
 {
 	static char buf[MAX_HDR_LEN + PAYLOAD_LEN];
@@ -935,6 +1052,8 @@ static void gro_sender(void)
 			send_fragment4(txfd, &daddr);
 			sleep(1);
 			write_packet(txfd, fin_pkt, total_hdr_len, &daddr);
+
+			test_flush_id(txfd, &daddr, fin_pkt);
 		} else if (proto == PF_INET6) {
 			sleep(1);
 			send_fragment6(txfd, &daddr);
@@ -1061,6 +1180,34 @@ static void gro_receiver(void)
 
 			printf("fragmented ip4 doesn't coalesce: ");
 			check_recv_pkts(rxfd, correct_payload, 2);
+
+			/* is_atomic checks */
+			printf("DF=1, Incrementing - should coalesce: ");
+			correct_payload[0] = PAYLOAD_LEN * 2;
+			check_recv_pkts(rxfd, correct_payload, 1);
+
+			printf("DF=1, Fixed - should coalesce: ");
+			correct_payload[0] = PAYLOAD_LEN * 2;
+			check_recv_pkts(rxfd, correct_payload, 1);
+
+			printf("DF=0, Incrementing - should coalesce: ");
+			correct_payload[0] = PAYLOAD_LEN * 2;
+			check_recv_pkts(rxfd, correct_payload, 1);
+
+			printf("DF=0, Fixed - should not coalesce: ");
+			correct_payload[0] = PAYLOAD_LEN;
+			correct_payload[1] = PAYLOAD_LEN;
+			check_recv_pkts(rxfd, correct_payload, 2);
+
+			printf("DF=1, 2 Incrementing and one fixed - should coalesce only first 2 packets: ");
+			correct_payload[0] = PAYLOAD_LEN * 2;
+			correct_payload[1] = PAYLOAD_LEN;
+			check_recv_pkts(rxfd, correct_payload, 2);
+
+			printf("DF=1, 2 Fixed and one incrementing - should coalesce only first 2 packets: ");
+			correct_payload[0] = PAYLOAD_LEN * 2;
+			correct_payload[1] = PAYLOAD_LEN;
+			check_recv_pkts(rxfd, correct_payload, 2);
 		} else if (proto == PF_INET6) {
 			/* GRO doesn't check for ipv6 hop limit when flushing.
 			 * Hence no corresponding test to the ipv4 case.