diff mbox series

[net,2/2] net: Call skb destructor on NAPI_GRO_FREE_STOLEN_HEAD

Message ID 20201117203355.389661-2-saeedm@nvidia.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net,1/2] net/tls: Protect from calling tls_dev_del for TLS RX twice | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 10 this patch: 10
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 10 this patch: 10
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Saeed Mahameed Nov. 17, 2020, 8:33 p.m. UTC
From: Maxim Mikityanskiy <maximmi@mellanox.com>

All GRO flows except one call skb->destructor, however, GRO_MERGED_FREE
doesn't do it in case of NAPI_GRO_FREE_STOLEN_HEAD. For better
consistency and to add resiliency against the drivers that may pass SKBs
with a destructor, this patch changes napi_skb_free_stolen_head to use
skb_release_head_state, which should perform all the needed cleanups,
including a call to the destructor. This way the code of GRO_MERGED_FREE
becomes similar to kfree_skb_partial.

Fixes: e44699d2c280 ("net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()")
Fixes: d7e8883cfcf4 ("net: make GRO aware of skb->head_frag")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
For -stable: 5.4

 net/core/dev.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Jakub Kicinski Nov. 18, 2020, 7:22 p.m. UTC | #1
On Tue, 17 Nov 2020 12:33:55 -0800 Saeed Mahameed wrote:
> From: Maxim Mikityanskiy <maximmi@mellanox.com>
> 
> All GRO flows except one call skb->destructor, however, GRO_MERGED_FREE
> doesn't do it in case of NAPI_GRO_FREE_STOLEN_HEAD. For better
> consistency and to add resiliency against the drivers that may pass SKBs
> with a destructor, this patch changes napi_skb_free_stolen_head to use
> skb_release_head_state, which should perform all the needed cleanups,
> including a call to the destructor. This way the code of GRO_MERGED_FREE
> becomes similar to kfree_skb_partial.
> 
> Fixes: e44699d2c280 ("net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()")
> Fixes: d7e8883cfcf4 ("net: make GRO aware of skb->head_frag")
> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

CC Eric for GRO expertise.

Makes sense to me, but do you still need "net/mlx5e: Fix refcount leak
on kTLS RX resync" even with this applied?

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 82dc6b48e45f..85dcc7f19902 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6048,8 +6048,7 @@ EXPORT_SYMBOL(gro_find_complete_by_type);
>  
>  static void napi_skb_free_stolen_head(struct sk_buff *skb)
>  {
> -	skb_dst_drop(skb);
> -	skb_ext_put(skb);
> +	skb_release_head_state(skb);
>  	kmem_cache_free(skbuff_head_cache, skb);
>  }
>
Eric Dumazet Nov. 18, 2020, 8:02 p.m. UTC | #2
On Wed, Nov 18, 2020 at 8:22 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 17 Nov 2020 12:33:55 -0800 Saeed Mahameed wrote:
> > From: Maxim Mikityanskiy <maximmi@mellanox.com>
> >
> > All GRO flows except one call skb->destructor, however, GRO_MERGED_FREE
> > doesn't do it in case of NAPI_GRO_FREE_STOLEN_HEAD. For better
> > consistency and to add resiliency against the drivers that may pass SKBs
> > with a destructor, this patch changes napi_skb_free_stolen_head to use
> > skb_release_head_state, which should perform all the needed cleanups,
> > including a call to the destructor. This way the code of GRO_MERGED_FREE
> > becomes similar to kfree_skb_partial.
> >
> > Fixes: e44699d2c280 ("net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()")
> > Fixes: d7e8883cfcf4 ("net: make GRO aware of skb->head_frag")
> > Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> > Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
>
> CC Eric for GRO expertise.

Thanks for CCing me.

Since when drivers can pass funny skbs with destructors ???

Can we please stop adding more cycles to _already_ expensive GRO ?




>
> Makes sense to me, but do you still need "net/mlx5e: Fix refcount leak
> on kTLS RX resync" even with this applied?
>
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 82dc6b48e45f..85dcc7f19902 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6048,8 +6048,7 @@ EXPORT_SYMBOL(gro_find_complete_by_type);
> >
> >  static void napi_skb_free_stolen_head(struct sk_buff *skb)
> >  {
> > -     skb_dst_drop(skb);
> > -     skb_ext_put(skb);
> > +     skb_release_head_state(skb);
> >       kmem_cache_free(skbuff_head_cache, skb);
> >  }
> >
>
Jakub Kicinski Nov. 18, 2020, 8:14 p.m. UTC | #3
On Wed, 18 Nov 2020 21:02:29 +0100 Eric Dumazet wrote:
> On Wed, Nov 18, 2020 at 8:22 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 17 Nov 2020 12:33:55 -0800 Saeed Mahameed wrote:  
> > > From: Maxim Mikityanskiy <maximmi@mellanox.com>
> > >
> > > All GRO flows except one call skb->destructor, however, GRO_MERGED_FREE
> > > doesn't do it in case of NAPI_GRO_FREE_STOLEN_HEAD. For better
> > > consistency and to add resiliency against the drivers that may pass SKBs
> > > with a destructor, this patch changes napi_skb_free_stolen_head to use
> > > skb_release_head_state, which should perform all the needed cleanups,
> > > including a call to the destructor. This way the code of GRO_MERGED_FREE
> > > becomes similar to kfree_skb_partial.
> > >
> > > Fixes: e44699d2c280 ("net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()")
> > > Fixes: d7e8883cfcf4 ("net: make GRO aware of skb->head_frag")
> > > Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> > > Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> > > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>  
> >
> > CC Eric for GRO expertise.  
> 
> Thanks for CCing me.
> 
> Since when drivers can pass funny skbs with destructors ???
> 
> Can we please stop adding more cycles to _already_ expensive GRO ?

I don't think they do that today much (save for the ktls optimization
in mlx5 Maxim is fixing separately). But I believe the idea of early
demux in XDP had been floated in the past.

If we don't want that to happen we should document it (stating the
obvious).
Eric Dumazet Nov. 18, 2020, 8:21 p.m. UTC | #4
On Wed, Nov 18, 2020 at 9:14 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 18 Nov 2020 21:02:29 +0100 Eric Dumazet wrote:
> > On Wed, Nov 18, 2020 at 8:22 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Tue, 17 Nov 2020 12:33:55 -0800 Saeed Mahameed wrote:
> > > > From: Maxim Mikityanskiy <maximmi@mellanox.com>
> > > >
> > > > All GRO flows except one call skb->destructor, however, GRO_MERGED_FREE
> > > > doesn't do it in case of NAPI_GRO_FREE_STOLEN_HEAD. For better
> > > > consistency and to add resiliency against the drivers that may pass SKBs
> > > > with a destructor, this patch changes napi_skb_free_stolen_head to use
> > > > skb_release_head_state, which should perform all the needed cleanups,
> > > > including a call to the destructor. This way the code of GRO_MERGED_FREE
> > > > becomes similar to kfree_skb_partial.
> > > >
> > > > Fixes: e44699d2c280 ("net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()")
> > > > Fixes: d7e8883cfcf4 ("net: make GRO aware of skb->head_frag")
> > > > Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> > > > Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> > > > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> > >
> > > CC Eric for GRO expertise.
> >
> > Thanks for CCing me.
> >
> > Since when drivers can pass funny skbs with destructors ???
> >
> > Can we please stop adding more cycles to _already_ expensive GRO ?
>
> I don't think they do that today much (save for the ktls optimization
> in mlx5 Maxim is fixing separately). But I believe the idea of early
> demux in XDP had been floated in the past.
>
> If we don't want that to happen we should document it (stating the
> obvious).

This is a patch targeting the net tree, with Fixes: tag pretending
this is an old bug.

How can we possibly merge two skbs if they have destructors ?

We do not make sure it is even possible.

Many destructors track skb->truesize against a socket wmem_alloc or rmem_alloc,
this stuff can not possibly work, unless stronger checks in GRO, since
GRO changes skb->truesize
without checking skb->destructor.

If skb has a destructor, just bypass GRO completely, this is the only
thing we can do.
This would be quite unfortunate to add such a check "just because
someone tries to fool us"

diff --git a/net/core/dev.c b/net/core/dev.c
index 4bfdcd6b20e8836e2884c51c6ce349ed54130bfa..76f0a627b6a1ee02339a724ecb6e4dbade80501b
100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5920,7 +5920,7 @@ static enum gro_result dev_gro_receive(struct
napi_struct *napi, struct sk_buff
        int same_flow;
        int grow;

-       if (netif_elide_gro(skb->dev))
+       if (netif_elide_gro(skb->dev) || skb->destructor)
                goto normal;

        gro_head = gro_list_prepare(napi, skb);
Saeed Mahameed Nov. 25, 2020, 10:11 p.m. UTC | #5
On Wed, 2020-11-18 at 21:21 +0100, Eric Dumazet wrote:
> On Wed, Nov 18, 2020 at 9:14 PM Jakub Kicinski <kuba@kernel.org>
> wrote:
> > On Wed, 18 Nov 2020 21:02:29 +0100 Eric Dumazet wrote:
> > > On Wed, Nov 18, 2020 at 8:22 PM Jakub Kicinski <kuba@kernel.org>
> > > wrote:
> > > > On Tue, 17 Nov 2020 12:33:55 -0800 Saeed Mahameed wrote:
> > > > > From: Maxim Mikityanskiy <maximmi@mellanox.com>
> > > > > 
> > > > > All GRO flows except one call skb->destructor, however,
> > > > > GRO_MERGED_FREE
> > > > > doesn't do it in case of NAPI_GRO_FREE_STOLEN_HEAD. For
> > > > > better
> > > > > consistency and to add resiliency against the drivers that
> > > > > may pass SKBs
> > > > > with a destructor, this patch changes
> > > > > napi_skb_free_stolen_head to use
> > > > > skb_release_head_state, which should perform all the needed
> > > > > cleanups,
> > > > > including a call to the destructor. This way the code of
> > > > > GRO_MERGED_FREE
> > > > > becomes similar to kfree_skb_partial.
> > > > > 
> > > > > Fixes: e44699d2c280 ("net: handle NAPI_GRO_FREE_STOLEN_HEAD
> > > > > case also in napi_frags_finish()")
> > > > > Fixes: d7e8883cfcf4 ("net: make GRO aware of skb->head_frag")
> > > > > Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> > > > > Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> > > > > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> > > > 
> > > > CC Eric for GRO expertise.
> > > 
> > > Thanks for CCing me.
> > > 
> > > Since when drivers can pass funny skbs with destructors ???
> > > 
> > > Can we please stop adding more cycles to _already_ expensive GRO
> > > ?
> > 
> > I don't think they do that today much (save for the ktls
> > optimization
> > in mlx5 Maxim is fixing separately). But I believe the idea of
> > early
> > demux in XDP had been floated in the past.
> > 
> > If we don't want that to happen we should document it (stating the
> > obvious).
> 
> This is a patch targeting the net tree, with Fixes: tag pretending
> this is an old bug.
> 
> How can we possibly merge two skbs if they have destructors ?
> 
> We do not make sure it is even possible.
> 
> Many destructors track skb->truesize against a socket wmem_alloc or
> rmem_alloc,
> this stuff can not possibly work, unless stronger checks in GRO,
> since
> GRO changes skb->truesize
> without checking skb->destructor.
> 
> If skb has a destructor, just bypass GRO completely, this is the only
> thing we can do.
> This would be quite unfortunate to add such a check "just because
> someone tries to fool us"
> 

Thanks Eric !!
We don't actually need this patch, as the kTLS SKBs are handled locally
in the drivers, I think we don't need to add any extra check in the
datapath and just enforce the policy somehow with debug macros
maybe WARN_ONE_ONCE()

I will drop this patch, but the XDP folks who are going to implement
XDP early demux should take care of this themselves.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index
> 4bfdcd6b20e8836e2884c51c6ce349ed54130bfa..76f0a627b6a1ee02339a724ecb6
> e4dbade80501b
> 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5920,7 +5920,7 @@ static enum gro_result dev_gro_receive(struct
> napi_struct *napi, struct sk_buff
>         int same_flow;
>         int grow;
> 
> -       if (netif_elide_gro(skb->dev))
> +       if (netif_elide_gro(skb->dev) || skb->destructor)
>                 goto normal;
> 
>         gro_head = gro_list_prepare(napi, skb);
diff mbox series

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 82dc6b48e45f..85dcc7f19902 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6048,8 +6048,7 @@  EXPORT_SYMBOL(gro_find_complete_by_type);
 
 static void napi_skb_free_stolen_head(struct sk_buff *skb)
 {
-	skb_dst_drop(skb);
-	skb_ext_put(skb);
+	skb_release_head_state(skb);
 	kmem_cache_free(skbuff_head_cache, skb);
 }