diff mbox series

[net-next] net: core: allow fast GRO for skbs with Ethernet header in head

Message ID 20191112122843.30636-1-alobakin@dlink.ru (mailing list archive)
State Not Applicable
Delegated to: Johannes Berg
Headers show
Series [net-next] net: core: allow fast GRO for skbs with Ethernet header in head | expand

Commit Message

Alexander Lobakin Nov. 12, 2019, 12:28 p.m. UTC
Commit 78d3fd0b7de8 ("gro: Only use skb_gro_header for completely
non-linear packets") back in May'09 (2.6.31-rc1) has changed the
original condition '!skb_headlen(skb)' to the current
'skb_mac_header(skb) == skb_tail_pointer(skb)' in gro_reset_offset()
saying: "Since the drivers that need this optimisation all provide
completely non-linear packets".

For now, we have the following rough statistics for 5.4-rc7:
1) napi_gro_frags: 14
2) napi_gro_receive with skb->head containing (most of) payload: 83
3) napi_gro_receive with skb->head containing all the headers: 20
4) napi_gro_receive with skb->head containing only Ethernet header: 2

With the current condition, fast GRO with the usage of
NAPI_GRO_CB(skb)->frag0 is available only in the [1] case.
Packets pushed by [2] and [3] go through the 'slow' path, but
it's not a problem for them as they already contains all the needed
headers in skb->head, so pskb_may_pull() only moves skb->data.

The layout of skbs in the fourth [4] case at the moment of
dev_gro_receive() is identical to skbs that have come through [1],
as napi_frags_skb() pulls Ethernet header to skb->head. The only
difference is that the mentioned condition is always false for them,
because skb_put() and friends irreversibly alter the tail pointer.
They also go though the 'slow' path, but now every single
pskb_may_pull() in every single .gro_receive() will call the *really*
slow __pskb_pull_tail() to pull headers to head. This significantly
decreases the overall performance for no visible reasons.

The only two users of method [4] is:
* drivers/staging/qlge
* drivers/net/wireless/iwlwifi (all three variants: dvm, mvm, mvm-mq)

Note that in case with wireless drivers we can't use [1]
(napi_gro_frags()) at least for now and mac80211 stack always
performs pushes and pulls anyways, so performance hit is inavoidable.

We can simply change the condition in gro_reset_offset() to allow
skbs from [4] go through 'fast' path just like in case [1].

This was tested on a custom driver and this patch gave boosts up to
40 Mbps to method [4] in both directions comparing to net-next, which
made overall performance relatively close to [1] (without it, [4] is
the slowest).

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/core/dev.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

David Miller Nov. 15, 2019, 1:25 a.m. UTC | #1
From: Alexander Lobakin <alobakin@dlink.ru>
Date: Tue, 12 Nov 2019 15:28:43 +0300

> Commit 78d3fd0b7de8 ("gro: Only use skb_gro_header for completely
> non-linear packets") back in May'09 (2.6.31-rc1) has changed the
> original condition '!skb_headlen(skb)' to the current
> 'skb_mac_header(skb) == skb_tail_pointer(skb)' in gro_reset_offset()
> saying: "Since the drivers that need this optimisation all provide
> completely non-linear packets".

Please reference the appropriate SHA1-ID both here in this paragraph and
also in an appropriate Fixes: tag.

If this goes so far back that it is before GIT, then you need to provide
a reference to the patch posting via lore.kernel.org or similar because
it is absolutely essentialy for people reviewing this patch to be able
to do some digging into why the condition is code the way that it is
currently.

Thank you.
Alexander Lobakin Nov. 15, 2019, 7:36 a.m. UTC | #2
Hi Dave,

David Miller wrote 15.11.2019 04:25:

> From: Alexander Lobakin <alobakin@dlink.ru>
> Date: Tue, 12 Nov 2019 15:28:43 +0300
> 
>> Commit 78d3fd0b7de8 ("gro: Only use skb_gro_header for completely
>> non-linear packets") back in May'09 (2.6.31-rc1) has changed the
>> original condition '!skb_headlen(skb)' to the current
>> 'skb_mac_header(skb) == skb_tail_pointer(skb)' in gro_reset_offset()
>> saying: "Since the drivers that need this optimisation all provide
>> completely non-linear packets".
> 
> Please reference the appropriate SHA1-ID both here in this paragraph 
> and
> also in an appropriate Fixes: tag.

Sorry for confusing. The SHA1-ID from commit message is correct
actually. At the moment of 2.6.31 we used skb->mac_header and skb->tail
pointers directly, so the original condition was
'skb->mac_header == skb->tail'.
Commit ced14f6804a9 ("net: Correct comparisons and calculations using
skb->tail and skb-transport_header") has changed this condition to
the referred 'skb_mac_header(skb) == skb_tail_pointer(skb)' without
any functional changes.
I didn't add the "Fixes:" tag because at the moment of 2.6.31 it was
a needed change, but it became obsolete later, so now we can revert
it back to speed up skbs with only Ethernet header in head.
Please let me know if I must send v2 of this patch with corrected
description before getting any further reviews.

Thanks.

> If this goes so far back that it is before GIT, then you need to 
> provide
> a reference to the patch posting via lore.kernel.org or similar because
> it is absolutely essentialy for people reviewing this patch to be able
> to do some digging into why the condition is code the way that it is
> currently.
> 
> Thank you.

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ
David Miller Nov. 15, 2019, 7:49 a.m. UTC | #3
From: Alexander Lobakin <alobakin@dlink.ru>
Date: Fri, 15 Nov 2019 10:36:08 +0300

> Please let me know if I must send v2 of this patch with corrected
> description before getting any further reviews.

I would say that you do, thanks for asking.

The more details and information in the commit message, the better.
Alexander Lobakin Nov. 15, 2019, 7:52 a.m. UTC | #4
David Miller wrote 15.11.2019 10:49:

> From: Alexander Lobakin <alobakin@dlink.ru>
> Date: Fri, 15 Nov 2019 10:36:08 +0300
> 
>> Please let me know if I must send v2 of this patch with corrected
>> description before getting any further reviews.
> 
> I would say that you do, thanks for asking.
> 
> The more details and information in the commit message, the better.

Thank you, I'll publish v2 soon to clarify all the possible
misunderstandings.

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ
diff mbox series

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 1c799d486623..da78a433c10c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5611,8 +5611,7 @@  static void skb_gro_reset_offset(struct sk_buff *skb)
 	NAPI_GRO_CB(skb)->frag0 = NULL;
 	NAPI_GRO_CB(skb)->frag0_len = 0;
 
-	if (skb_mac_header(skb) == skb_tail_pointer(skb) &&
-	    pinfo->nr_frags &&
+	if (!skb_headlen(skb) && pinfo->nr_frags &&
 	    !PageHighMem(skb_frag_page(frag0))) {
 		NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
 		NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,