[v2,net] gro_cells: Avoid packet re-ordering for cloned skbs

Message ID 20250121115010.110053-1-tbogendoerfer@suse.de (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series [v2,net] gro_cells: Avoid packet re-ordering for cloned skbs

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 2 maintainers not CCed: soheil@google.com dsahern@kernel.org
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 27 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-01-21--15-00 (tests: 885)

Commit Message

Thomas Bogendoerfer Jan. 21, 2025, 11:50 a.m. UTC
gro_cells_receive() passes a cloned skb directly up the stack, which
could cause re-ordering against segments still held in GRO. To avoid
this, queue cloned skbs and use gro_normal_one() to pass them up during
normal NAPI work.

Fixes: c9e6bc644e55 ("net: add gro_cells infrastructure")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
--
v2: don't use skb_copy(), but make decision how to pass cloned skbs in
    napi poll function (suggested by Eric)
v1: https://lore.kernel.org/lkml/20250109142724.29228-1-tbogendoerfer@suse.de/
  
 net/core/gro_cells.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
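
The re-ordering described in the commit message can be illustrated with a
small user-space model. This is only a sketch under simplifying assumptions,
not kernel code: every toy_* name is hypothetical, a packet is reduced to a
sequence number and a "cloned" flag, and the pre-patch netif_rx() path is
modelled as immediate delivery, which ignores the real backlog queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PKTS 16

/* Toy stand-in for an skb: a sequence number plus a "cloned" flag. */
struct toy_pkt {
	int seq;
	bool cloned;
};

/* Toy stand-in for one gro_cell: a FIFO plus a log of delivered packets. */
struct toy_cell {
	struct toy_pkt queue[MAX_PKTS];	/* models cell->napi_skbs */
	size_t queued;
	int delivered[MAX_PKTS];	/* order in which the stack saw packets */
	size_t n_delivered;
};

static void toy_deliver(struct toy_cell *c, int seq)
{
	assert(c->n_delivered < MAX_PKTS);
	c->delivered[c->n_delivered++] = seq;
}

/*
 * Receive side. With bypass_clones set (modelling the pre-patch check),
 * a cloned packet skips the queue and is delivered at once.
 */
static void toy_receive(struct toy_cell *c, struct toy_pkt p, bool bypass_clones)
{
	if (bypass_clones && p.cloned) {
		toy_deliver(c, p.seq);
		return;
	}
	assert(c->queued < MAX_PKTS);
	c->queue[c->queued++] = p;
}

/* Poll side: drain the FIFO in arrival order. */
static void toy_poll(struct toy_cell *c)
{
	for (size_t i = 0; i < c->queued; i++)
		toy_deliver(c, c->queue[i].seq);
	c->queued = 0;
}

/*
 * Feed packets 1 (uncloned), 2 (cloned), 3 (uncloned), then poll once.
 * Returns true if the stack saw all three in sequence order.
 */
static bool toy_in_order(bool bypass_clones)
{
	struct toy_cell c = { 0 };
	const struct toy_pkt in[] = { { 1, false }, { 2, true }, { 3, false } };

	for (size_t i = 0; i < 3; i++)
		toy_receive(&c, in[i], bypass_clones);
	toy_poll(&c);

	for (size_t i = 1; i < c.n_delivered; i++)
		if (c.delivered[i] < c.delivered[i - 1])
			return false;
	return c.n_delivered == 3;
}
```

In this model, toy_in_order(true) reports a re-ordering because packet 2
overtakes packet 1, while toy_in_order(false), where every packet goes
through the queue, keeps arrival order. Whether mixed cloned and uncloned
packets of one flow occur in practice is a separate question that the model
does not answer.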

Comments

Eric Dumazet Jan. 21, 2025, 4:33 p.m. UTC | #1
On Tue, Jan 21, 2025 at 12:50 PM Thomas Bogendoerfer
<tbogendoerfer@suse.de> wrote:
>
> gro_cells_receive() passes a cloned skb directly up the stack and
> could cause re-ordering against segments still in GRO. To avoid
> this queue cloned skbs and use gro_normal_one() to pass it during
> normal NAPI work.
>
> Fixes: c9e6bc644e55 ("net: add gro_cells infrastructure")
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
> --
> v2: don't use skb_copy(), but make decision how to pass cloned skbs in
>     napi poll function (suggested by Eric)
> v1: https://lore.kernel.org/lkml/20250109142724.29228-1-tbogendoerfer@suse.de/
>

Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks.

Paolo Abeni Jan. 23, 2025, 8:42 a.m. UTC | #2
On 1/21/25 12:50 PM, Thomas Bogendoerfer wrote:
> gro_cells_receive() passes a cloned skb directly up the stack and
> could cause re-ordering against segments still in GRO. To avoid
> this queue cloned skbs and use gro_normal_one() to pass it during
> normal NAPI work.
> 
> Fixes: c9e6bc644e55 ("net: add gro_cells infrastructure")
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
> --
> v2: don't use skb_copy(), but make decision how to pass cloned skbs in
>     napi poll function (suggested by Eric)
> v1: https://lore.kernel.org/lkml/20250109142724.29228-1-tbogendoerfer@suse.de/
>   
>  net/core/gro_cells.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c
> index ff8e5b64bf6b..762746d18486 100644
> --- a/net/core/gro_cells.c
> +++ b/net/core/gro_cells.c
> @@ -2,6 +2,7 @@
>  #include <linux/skbuff.h>
>  #include <linux/slab.h>
>  #include <linux/netdevice.h>
> +#include <net/gro.h>
>  #include <net/gro_cells.h>
>  #include <net/hotdata.h>
>  
> @@ -20,7 +21,7 @@ int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb)
>  	if (unlikely(!(dev->flags & IFF_UP)))
>  		goto drop;
>  
> -	if (!gcells->cells || skb_cloned(skb) || netif_elide_gro(dev)) {
> +	if (!gcells->cells || netif_elide_gro(dev)) {
>  		res = netif_rx(skb);
>  		goto unlock;
>  	}
> @@ -58,7 +59,11 @@ static int gro_cell_poll(struct napi_struct *napi, int budget)
>  		skb = __skb_dequeue(&cell->napi_skbs);
>  		if (!skb)
>  			break;
> -		napi_gro_receive(napi, skb);
> +		/* Core GRO stack does not play well with clones. */
> +		if (skb_cloned(skb))
> +			gro_normal_one(napi, skb, 1);
> +		else
> +			napi_gro_receive(napi, skb);

I must admit it's not clear to me how/why the above will avoid
out-of-order (OoO) delivery. I assume OoO happens when we observe both
cloned and uncloned packets belonging to the same connection/flow.

What if we have an (uncloned) packet for the relevant flow held in GRO,
'rx_count - 1' packets already sitting in 'rx_list', and a cloned packet
for the critical flow reaches gro_cells_receive()?

Don't we need to unconditionally flush any packets belonging to the same
flow?

Thanks!

Paolo
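
The scenario in the question above can be sketched as a user-space model.
All toy_* names, TOY_BATCH and the sequence numbers are hypothetical;
toy_normal_one() only approximates the batching behaviour of
gro_normal_one() (append to a list, flush once a threshold is reached), and
the GRO hold is reduced to a single slot. It is a sketch of the hypothetical
case being asked about, not a reproduction of a real trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BATCH 8	/* hypothetical flush threshold for rx_list */
#define TOY_MAX   32

struct toy_napi {
	int gro_held;		/* seq of a packet held for merging, 0 if none */
	int rx_list[TOY_MAX];	/* packets waiting to go up the stack */
	size_t rx_count;
	int delivered[TOY_MAX];	/* order in which the stack saw packets */
	size_t n_delivered;
};

/* Hand every packet on rx_list to the stack, in list order. */
static void toy_flush_rx_list(struct toy_napi *n)
{
	for (size_t i = 0; i < n->rx_count; i++) {
		assert(n->n_delivered < TOY_MAX);
		n->delivered[n->n_delivered++] = n->rx_list[i];
	}
	n->rx_count = 0;
}

/* Append to rx_list and flush once the batch threshold is reached. */
static void toy_normal_one(struct toy_napi *n, int seq)
{
	assert(n->rx_count < TOY_MAX);
	n->rx_list[n->rx_count++] = seq;
	if (n->rx_count >= TOY_BATCH)
		toy_flush_rx_list(n);
}

/* End of poll: release the held packet, then flush what is left. */
static void toy_complete(struct toy_napi *n)
{
	if (n->gro_held) {
		toy_normal_one(n, n->gro_held);
		n->gro_held = 0;
	}
	toy_flush_rx_list(n);
}

/* Position of seq in the delivery log, or -1 if it was never delivered. */
static int toy_position(const struct toy_napi *n, int seq)
{
	for (size_t i = 0; i < n->n_delivered; i++)
		if (n->delivered[i] == seq)
			return (int)i;
	return -1;
}

/* True if cloned packet 2 reaches the stack before held packet 1. */
static bool toy_clone_overtakes(void)
{
	struct toy_napi n = { 0 };

	/* TOY_BATCH - 1 unrelated packets already sit on rx_list. */
	for (int i = 0; i < TOY_BATCH - 1; i++)
		toy_normal_one(&n, 100 + i);

	n.gro_held = 1;		/* uncloned packet 1 of the flow is held */
	toy_normal_one(&n, 2);	/* cloned packet 2 fills the batch: flush */
	toy_complete(&n);	/* packet 1 is only released here */

	return toy_position(&n, 2) < toy_position(&n, 1);
}
```

In this model toy_clone_overtakes() returns true: the cloned packet completes
the batch and is flushed while the uncloned packet of the same flow is still
held. The model only shows that the ordering is possible when the cloned
status differs within one flow; it says nothing about how often that happens.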

Eric Dumazet Jan. 23, 2025, 10:07 a.m. UTC | #3
On Thu, Jan 23, 2025 at 9:43 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> I must admit it's not clear to me how/why the above will avoid OoO. I
> assume OoO happens when we observe both cloned and uncloned packets
> belonging to the same connection/flow.
>
> What if we have a (uncloned) packet for the relevant flow in the GRO,
> 'rx_count - 1' packets already sitting in 'rx_list' and a cloned packet
> for the critical flow reaches gro_cells_receive()?
>
> Don't we need to unconditionally flush any packets belonging to the same
> flow?

It would only matter if we had 2 or more segments that would belong
to the same flow and packet train (potential 'GRO super packet'), with
the 'cloned' status being of mixed value on various segments.

In practice, the cloned status will be the same for all segments.

The same issue would arise when/if NETIF_F_GRO in dev->features is
flipped back and forth: we do not really care.

Paolo Abeni Jan. 23, 2025, 10:42 a.m. UTC | #4
On 1/23/25 11:07 AM, Eric Dumazet wrote:
> On Thu, Jan 23, 2025 at 9:43 AM Paolo Abeni <pabeni@redhat.com> wrote:
>> I must admit it's not clear to me how/why the above will avoid OoO. I
>> assume OoO happens when we observe both cloned and uncloned packets
>> belonging to the same connection/flow.
>>
>> What if we have a (uncloned) packet for the relevant flow in the GRO,
>> 'rx_count - 1' packets already sitting in 'rx_list' and a cloned packet
>> for the critical flow reaches gro_cells_receive()?
>>
>> Don't we need to unconditionally flush any packets belonging to the same
>> flow?
> 
> It would only matter if we had 2 or more segments that would belong
> to the same flow and packet train (potential 'GRO super packet'), with
> the 'cloned' status being of mixed value on various segments.
> 
> In practice, the cloned status will be the same for all segments.

I agree with the above, but my doubt is: does it also mean that, in
practice, there is no OoO to deal with, even without this patch?

To rephrase my doubt: which scenario addressed by this patch would
lead to OoO without it?

Thanks,

Paolo

Eric Dumazet Jan. 23, 2025, 10:43 a.m. UTC | #5
On Thu, Jan 23, 2025 at 11:42 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 1/23/25 11:07 AM, Eric Dumazet wrote:
> > On Thu, Jan 23, 2025 at 9:43 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >> I must admit it's not clear to me how/why the above will avoid OoO. I
> >> assume OoO happens when we observe both cloned and uncloned packets
> >> belonging to the same connection/flow.
> >>
> >> What if we have a (uncloned) packet for the relevant flow in the GRO,
> >> 'rx_count - 1' packets already sitting in 'rx_list' and a cloned packet
> >> for the critical flow reaches gro_cells_receive()?
> >>
> >> Don't we need to unconditionally flush any packets belonging to the same
> >> flow?
> >
> > It would only matter if we had 2 or more segments that would belong
> > to the same flow and packet train (potential 'GRO super packet'), with
> > the 'cloned' status being of mixed value on various segments.
> >
> > In practice, the cloned status will be the same for all segments.
>
> I agree with the above, but my doubt is: does the above also mean that
> in practice there are no OoO to deal with, even without this patch?
>
> To rephrase my doubt: which scenario is addressed by this patch that
> would lead to OoO without it?

Fair point, a detailed changelog would be really nice.

Patch

diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c
index ff8e5b64bf6b..762746d18486 100644
--- a/net/core/gro_cells.c
+++ b/net/core/gro_cells.c
@@ -2,6 +2,7 @@ 
 #include <linux/skbuff.h>
 #include <linux/slab.h>
 #include <linux/netdevice.h>
+#include <net/gro.h>
 #include <net/gro_cells.h>
 #include <net/hotdata.h>
 
@@ -20,7 +21,7 @@  int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb)
 	if (unlikely(!(dev->flags & IFF_UP)))
 		goto drop;
 
-	if (!gcells->cells || skb_cloned(skb) || netif_elide_gro(dev)) {
+	if (!gcells->cells || netif_elide_gro(dev)) {
 		res = netif_rx(skb);
 		goto unlock;
 	}
@@ -58,7 +59,11 @@  static int gro_cell_poll(struct napi_struct *napi, int budget)
 		skb = __skb_dequeue(&cell->napi_skbs);
 		if (!skb)
 			break;
-		napi_gro_receive(napi, skb);
+		/* Core GRO stack does not play well with clones. */
+		if (skb_cloned(skb))
+			gro_normal_one(napi, skb, 1);
+		else
+			napi_gro_receive(napi, skb);
 		work_done++;
 	}