
[net-next] xdp: Fixing skb->pp_recycle flag in generic XDP handling

Message ID 20230802070454.22534-1-liangchen.linux@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series [net-next] xdp: Fixing skb->pp_recycle flag in generic XDP handling

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1336 this patch: 1336
netdev/cc_maintainers warning 2 maintainers not CCed: hawk@kernel.org bpf@vger.kernel.org
netdev/build_clang success Errors and warnings before: 1351 this patch: 1351
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1359 this patch: 1359
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Liang Chen Aug. 2, 2023, 7:04 a.m. UTC
In the generic XDP processing flow, if an skb with a page pool page
(skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
undergo head expansion and linearization of fragment data. As a result,
skb->head points to a reallocated buffer without any fragments. At this
point, the skb will not contain any page pool pages. However, the
skb->pp_recycle flag is still set to 1, which is inconsistent with the
actual situation. Although it doesn't seem to cause much real harm at the
moment(a little nagetive impact on skb_try_coalesce), to avoid potential
issues associated with using incorrect skb->pp_recycle information,
setting skb->pp_recycle to 0 to reflect the pp state of the skb.

Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
---
 net/core/dev.c | 2 ++
 1 file changed, 2 insertions(+)
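
For context, the path described in the commit message sits in
netif_receive_generic_xdp() in net/core/dev.c. A trimmed paraphrase of the
code around the patched hunk (not the verbatim source; the tailroom handling
is elided) looks roughly like:

	/* XDP packets must be linear and have XDP_PACKET_HEADROOM of
	 * headroom, matching the guarantee native XDP provides.
	 */
	if (skb_cloned(skb) || skb_is_nonlinear(skb) ||
	    skb_headroom(skb) < XDP_PACKET_HEADROOM) {
		int hroom = XDP_PACKET_HEADROOM - skb_headroom(skb);

		/* Reallocate the head once with enough headroom ... */
		if (pskb_expand_head(skb,
				     hroom > 0 ? ALIGN(hroom, NET_SKB_PAD) : 0,
				     0, GFP_ATOMIC))
			goto do_drop;
		/* ... then pull all fragment data into the linear area. */
		if (skb_linearize(skb))
			goto do_drop;
	}

	act = bpf_prog_run_generic_xdp(skb, xdp, xdp_prog);

After a successful skb_linearize() the skb holds no page pool pages, while
skb->pp_recycle is still 1, which is the inconsistency the patch clears.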

Comments

Alexander Lobakin Aug. 2, 2023, 5:09 p.m. UTC | #1
From: Liang Chen <liangchen.linux@gmail.com>
Date: Wed,  2 Aug 2023 15:04:54 +0800

> In the generic XDP processing flow, if an skb with a page pool page
> (skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
> undergo head expansion and linearization of fragment data. As a result,
> skb->head points to a reallocated buffer without any fragments. At this
> point, the skb will not contain any page pool pages. However, the
> skb->pp_recycle flag is still set to 1, which is inconsistent with the
> actual situation. Although it doesn't seem to cause much real harm at the

This means it must be handled in the function which replaces the head,
i.e. pskb_expand_head(). Your change only suppresses one symptom of the
issue.

> moment(a little nagetive impact on skb_try_coalesce), to avoid potential

                  ^^^^^^^^
                  negative

> issues associated with using incorrect skb->pp_recycle information,
> setting skb->pp_recycle to 0 to reflect the pp state of the skb.
> 
> Signed-off-by: Liang Chen <liangchen.linux@gmail.com>

I won't say for sure, but may be a candidate for the fixes tree, not
next. This way it would need a "Fixes:" tag here (above the SoB).

> ---
>  net/core/dev.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 10e5a036c706..07baf72be7d7 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4934,6 +4934,8 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
>  			goto do_drop;
>  		if (skb_linearize(skb))
>  			goto do_drop;
> +		if (skb->pp_recycle)
> +			skb->pp_recycle = 0;
>  	}
>  
>  	act = bpf_prog_run_generic_xdp(skb, xdp, xdp_prog);

Thanks,
Olek
Jakub Kicinski Aug. 2, 2023, 6:37 p.m. UTC | #2
On Wed,  2 Aug 2023 15:04:54 +0800 Liang Chen wrote:
> In the generic XDP processing flow, if an skb with a page pool page
> (skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
> undergo head expansion and linearization of fragment data. As a result,
> skb->head points to a reallocated buffer without any fragments. At this
> point, the skb will not contain any page pool pages. However, the
> skb->pp_recycle flag is still set to 1, which is inconsistent with the
> actual situation. Although it doesn't seem to cause much real harm at the
> moment(a little nagetive impact on skb_try_coalesce), to avoid potential
> issues associated with using incorrect skb->pp_recycle information,
> setting skb->pp_recycle to 0 to reflect the pp state of the skb.

pp_recycle just means that the skb is "page pool aware", there's
absolutely no harm in having an skb with pp_recycle = 1 and no
page pool pages attached.

I vote not to apply this.
Liang Chen Aug. 3, 2023, 8:25 a.m. UTC | #3
On Thu, Aug 3, 2023 at 1:11 AM Alexander Lobakin
<aleksander.lobakin@intel.com> wrote:
>
> From: Liang Chen <liangchen.linux@gmail.com>
> Date: Wed,  2 Aug 2023 15:04:54 +0800
>
> > In the generic XDP processing flow, if an skb with a page pool page
> > (skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
> > undergo head expansion and linearization of fragment data. As a result,
> > skb->head points to a reallocated buffer without any fragments. At this
> > point, the skb will not contain any page pool pages. However, the
> > skb->pp_recycle flag is still set to 1, which is inconsistent with the
> > actual situation. Although it doesn't seem to cause much real harm at the
>
> This means it must be handled in the function which replaces the head,
> i.e. pskb_expand_head(). Your change only suppresses one symptom of the
> issue.
>

I attempted to do so. But after pskb_expand_head(), there may still be
skb frags with pp pages left. It is only after skb_linearize() that
those frags are removed.
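
For reference, skb_linearize() is just a thin wrapper around
__pskb_pull_tail(), which is why the frags only go away at that step.
Roughly, from include/linux/skbuff.h (comments mine):

static inline int __skb_linearize(struct sk_buff *skb)
{
	/* Pull all fragment data into the linear area. */
	return __pskb_pull_tail(skb, skb->data_len) ? 0 : -ENOMEM;
}

static inline int skb_linearize(struct sk_buff *skb)
{
	/* No-op for an already-linear skb. */
	return skb_is_nonlinear(skb) ? __skb_linearize(skb) : 0;
}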

Thanks,
Liang

> > moment(a little nagetive impact on skb_try_coalesce), to avoid potential
>
>                   ^^^^^^^^
>                   negative
>

Sure, Thanks.


> > issues associated with using incorrect skb->pp_recycle information,
> > setting skb->pp_recycle to 0 to reflect the pp state of the skb.
> >
> > Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
>
> I won't say for sure, but may be a candidate for the fixes tree, not
> next. This way it would need a "Fixes:" tag here (above the SoB).
>
> > ---
> >  net/core/dev.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 10e5a036c706..07baf72be7d7 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -4934,6 +4934,8 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
> >                       goto do_drop;
> >               if (skb_linearize(skb))
> >                       goto do_drop;
> > +             if (skb->pp_recycle)
> > +                     skb->pp_recycle = 0;
> >       }
> >
> >       act = bpf_prog_run_generic_xdp(skb, xdp, xdp_prog);
>
> Thanks,
> Olek
Liang Chen Aug. 3, 2023, 8:26 a.m. UTC | #4
On Thu, Aug 3, 2023 at 2:37 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed,  2 Aug 2023 15:04:54 +0800 Liang Chen wrote:
> > In the generic XDP processing flow, if an skb with a page pool page
> > (skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
> > undergo head expansion and linearization of fragment data. As a result,
> > skb->head points to a reallocated buffer without any fragments. At this
> > point, the skb will not contain any page pool pages. However, the
> > skb->pp_recycle flag is still set to 1, which is inconsistent with the
> > actual situation. Although it doesn't seem to cause much real harm at the
> > moment(a little nagetive impact on skb_try_coalesce), to avoid potential
> > issues associated with using incorrect skb->pp_recycle information,
> > setting skb->pp_recycle to 0 to reflect the pp state of the skb.
>
> pp_recycle just means that the skb is "page pool aware", there's
> absolutely no harm in having an skb with pp_recycle = 1 and no
> page pool pages attached.

I don't see it causing an error right now either. But it affects
skb_try_coalesce in a negative way from a performance perspective:
from->pp_recycle can be falsely true, leading to a coalescing failure
in the "(from->pp_recycle && skb_cloned(from))" test, which would
otherwise let the coalesce continue if from->pp_recycle were false. I
wonder if that justifies the need for a fix.
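
For reference, the test in skb_try_coalesce() (net/core/skbuff.c) reads
roughly like this (paraphrased, comment condensed):

	/* Avoid mixing page_pool and non-page_pool allocated pages in the
	 * same skb, and don't coalesce when a pp_recycle 'from' is cloned.
	 */
	if (to->pp_recycle != from->pp_recycle ||
	    (from->pp_recycle && skb_cloned(from)))
		return false;

so a stale from->pp_recycle == 1 on a cloned skb makes it bail out even
though no page pool pages remain.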

Thanks,
Liang


>
> I vote not to apply this.
Ilias Apalodimas Aug. 3, 2023, 8:58 a.m. UTC | #5
Hi Liang

On Thu, 3 Aug 2023 at 11:26, Liang Chen <liangchen.linux@gmail.com> wrote:
>
> On Thu, Aug 3, 2023 at 2:37 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Wed,  2 Aug 2023 15:04:54 +0800 Liang Chen wrote:
> > > In the generic XDP processing flow, if an skb with a page pool page
> > > (skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
> > > undergo head expansion and linearization of fragment data. As a result,
> > > skb->head points to a reallocated buffer without any fragments. At this
> > > point, the skb will not contain any page pool pages. However, the
> > > skb->pp_recycle flag is still set to 1, which is inconsistent with the
> > > actual situation. Although it doesn't seem to cause much real harm at the
> > > moment(a little nagetive impact on skb_try_coalesce), to avoid potential
> > > issues associated with using incorrect skb->pp_recycle information,
> > > setting skb->pp_recycle to 0 to reflect the pp state of the skb.
> >
> > pp_recycle just means that the skb is "page pool aware", there's
> > absolutely no harm in having an skb with pp_recycle = 1 and no
> > page pool pages attached.
>
> I don't see it causing an error right now either. But it affects
> skb_try_coalesce in a negative way from a performance perspective -
> from->pp_recycle can be falsely true leading to a coalescing failure
> in "(from->pp_recycle && skb_cloned(from))" test, which otherwise
> would let the coalesce continue if from->pp_recycle was false. I
> wonder if that justifies the need for a fix.

This applies to non-native XDP code only though, right?  IIRC that
code was added to make XDP easier to use, but it's not used when
XDP-like performance is needed.

Thanks
/Ilias
>
> Thanks,
> Liang
>
>
> >
> > I vote not to apply this.
Liang Chen Aug. 3, 2023, 12:17 p.m. UTC | #6
On Thu, Aug 3, 2023 at 4:59 PM Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> Hi Liang
>
> On Thu, 3 Aug 2023 at 11:26, Liang Chen <liangchen.linux@gmail.com> wrote:
> >
> > On Thu, Aug 3, 2023 at 2:37 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Wed,  2 Aug 2023 15:04:54 +0800 Liang Chen wrote:
> > > > In the generic XDP processing flow, if an skb with a page pool page
> > > > (skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
> > > > undergo head expansion and linearization of fragment data. As a result,
> > > > skb->head points to a reallocated buffer without any fragments. At this
> > > > point, the skb will not contain any page pool pages. However, the
> > > > skb->pp_recycle flag is still set to 1, which is inconsistent with the
> > > > actual situation. Although it doesn't seem to cause much real harm at the
> > > > moment(a little nagetive impact on skb_try_coalesce), to avoid potential
> > > > issues associated with using incorrect skb->pp_recycle information,
> > > > setting skb->pp_recycle to 0 to reflect the pp state of the skb.
> > >
> > > pp_recycle just means that the skb is "page pool aware", there's
> > > absolutely no harm in having an skb with pp_recycle = 1 and no
> > > page pool pages attached.
> >
> > I don't see it causing an error right now either. But it affects
> > skb_try_coalesce in a negative way from a performance perspective -
> > from->pp_recycle can be falsely true leading to a coalescing failure
> > in "(from->pp_recycle && skb_cloned(from))" test, which otherwise
> > would let the coalesce continue if from->pp_recycle was false. I
> > wonder if that justifies the need for a fix.
>
> This applies to non-native XDP code only though, right?  IIRC that
> code was to make xdp easier to use but it's not used when xdp-like
> performance is needed.

Sure, this applies to non-native XDP code only. Thanks.

Thanks,
Liang

>
> Thanks
> /Ilias
> >
> > Thanks,
> > Liang
> >
> >
> > >
> > > I vote not to apply this.
Alexander Lobakin Aug. 3, 2023, 3:14 p.m. UTC | #7
From: Liang Chen <liangchen.linux@gmail.com>
Date: Thu, 3 Aug 2023 16:25:49 +0800

> On Thu, Aug 3, 2023 at 1:11 AM Alexander Lobakin
> <aleksander.lobakin@intel.com> wrote:
>>
>> From: Liang Chen <liangchen.linux@gmail.com>
>> Date: Wed,  2 Aug 2023 15:04:54 +0800
>>
>>> In the generic XDP processing flow, if an skb with a page pool page
>>> (skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
>>> undergo head expansion and linearization of fragment data. As a result,
>>> skb->head points to a reallocated buffer without any fragments. At this
>>> point, the skb will not contain any page pool pages. However, the
>>> skb->pp_recycle flag is still set to 1, which is inconsistent with the
>>> actual situation. Although it doesn't seem to cause much real harm at the
>>
>> This means it must be handled in the function which replaces the head,
>> i.e. pskb_expand_head(). Your change only suppresses one symptom of the
>> issue.
>>
> 
> I attempted to do so. But after pskb_expand_head, there may still be
> skb frags with pp pages left. It is after skb_linearize those frags
> are removed.

Ah, right.
Then you need to handle that in __pskb_pull_tail(). Check at the end of
the function whether the skb still has any frags, and if not, clear
skb->pp_recycle.

The most correct fix would be to do that in both pskb_expand_head() and
__pskb_pull_tail(): iterate over the frags and check whether any page
still belongs to a page_pool. Then page_pool_return_skb_page() wouldn't
hit the false branch after the skb has been re-laid out.
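
A minimal sketch of that idea (hypothetical helper, name and placement
made up; the pp-page test mirrors what page_pool_return_skb_page() keys
off):

/* Hypothetical: clear pp_recycle once no frag belongs to a page_pool.
 * Would be called at the end of pskb_expand_head()/__pskb_pull_tail().
 */
static void skb_pp_refresh_recycle(struct sk_buff *skb)
{
	int i;

	if (!skb->pp_recycle)
		return;

	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
		struct page *page =
			compound_head(skb_frag_page(&skb_shinfo(skb)->frags[i]));

		/* page_pool pages carry PP_SIGNATURE in page->pp_magic. */
		if ((page->pp_magic & ~0x3UL) == PP_SIGNATURE)
			return; /* still references a page_pool page */
	}

	skb->pp_recycle = 0;
}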

[...]

Thanks,
Olek
Liang Chen Aug. 7, 2023, 12:49 p.m. UTC | #8
On Thu, Aug 3, 2023 at 11:17 PM Alexander Lobakin
<aleksander.lobakin@intel.com> wrote:
>
> From: Liang Chen <liangchen.linux@gmail.com>
> Date: Thu, 3 Aug 2023 16:25:49 +0800
>
> > On Thu, Aug 3, 2023 at 1:11 AM Alexander Lobakin
> > <aleksander.lobakin@intel.com> wrote:
> >>
> >> From: Liang Chen <liangchen.linux@gmail.com>
> >> Date: Wed,  2 Aug 2023 15:04:54 +0800
> >>
> >>> In the generic XDP processing flow, if an skb with a page pool page
> >>> (skb->pp_recycle == 1) fails to meet XDP packet requirements, it will
> >>> undergo head expansion and linearization of fragment data. As a result,
> >>> skb->head points to a reallocated buffer without any fragments. At this
> >>> point, the skb will not contain any page pool pages. However, the
> >>> skb->pp_recycle flag is still set to 1, which is inconsistent with the
> >>> actual situation. Although it doesn't seem to cause much real harm at the
> >>
> >> This means it must be handled in the function which replaces the head,
> >> i.e. pskb_expand_head(). Your change only suppresses one symptom of the
> >> issue.
> >>
> >
> > I attempted to do so. But after pskb_expand_head, there may still be
> > skb frags with pp pages left. It is after skb_linearize those frags
> > are removed.
>
> Ah, right.
> Then you need to handle that in __pskb_pull_tail(). Check at the end of
> the function whether the skb still has any frags, and if not, clear
> skb->pp_recycle.
>
> The most correct fix would be to do that in both pskb_expand_head() and
> __pskb_pull_tail(): iterate over the frags and check if any page still
> belongs to a page_pool. Then page_pool_return_skb_page() wouldn't hit
> false-branch after the skb was re-layout.
>

Yeah, I agree. netif_receive_generic_xdp may not be the only place where
the flag can become stale. To make the pp_recycle flag strictly reflect
the page pool state of the skb, the check would need to go into both
functions. But since the flag is merely an indication that the skb is
'page pool aware', and the performance impact we observed doesn't affect
the native XDP path, addressing this doesn't seem worthwhile based on
the feedback I've received.


Thanks,
Liang

> [...]
>
> Thanks,
> Olek

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 10e5a036c706..07baf72be7d7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4934,6 +4934,8 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 			goto do_drop;
 		if (skb_linearize(skb))
 			goto do_drop;
+		if (skb->pp_recycle)
+			skb->pp_recycle = 0;
 	}
 
 	act = bpf_prog_run_generic_xdp(skb, xdp, xdp_prog);