[bpf-next,v3,3/4] xdp: recycle Page Pool backed skbs built from XDP frames

Message ID 20230313215553.1045175-4-aleksander.lobakin@intel.com (mailing list archive)
State Accepted
Commit 9c94bbf9a87b264294f42e6cc0f76d87854733ec
Delegated to: BPF
Series xdp: recycle Page Pool backed skbs built from XDP frames

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 22 this patch: 22
netdev/cc_maintainers success CCed 10 of 10 maintainers
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 22 this patch: 22
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 10 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_verifier on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-17 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 fail Logs for test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-19 fail Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 fail Logs for test_progs_no_alu32 on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 fail Logs for test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 fail Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc

Commit Message

Alexander Lobakin March 13, 2023, 9:55 p.m. UTC
__xdp_build_skb_from_frame() state(d):

/* Until page_pool get SKB return path, release DMA here */

Page Pool got skb pages recycling in April 2021, but missed this
function.

xdp_release_frame() is relevant only for Page Pool backed frames and it
detaches the page from the corresponding page_pool in order to make it
freeable via page_frag_free(). It can instead just mark the output skb
as eligible for recycling if the frame is backed by a pp. No change for
other memory model types (the same condition check as before).
cpumap redirect and veth on Page Pool drivers now become zero-alloc (or
almost).

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 net/core/xdp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
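
For context, skb_mark_for_recycle() only flags the skb as page_pool
backed; the pages are handed back to their originating pool on the skb
free path instead of being released via page_frag_free(). A rough
sketch of both sides of that mechanism (simplified from the ~6.3 kernel
sources, so exact helper names and bodies are approximate):

/* Marking side (include/linux/skbuff.h): just a flag, no refcount or
 * DMA work happens here.
 */
static inline void skb_mark_for_recycle(struct sk_buff *skb)
{
	skb->pp_recycle = 1;
}

/* Freeing side (net/core/skbuff.c), conceptually: a pp_recycle skb
 * returns its pages to the page_pool they were allocated from, so the
 * DMA mapping set up by the Rx driver stays valid and reusable.
 */
static bool skb_pp_recycle(struct sk_buff *skb, void *data)
{
	if (!IS_ENABLED(CONFIG_PAGE_POOL) || !skb->pp_recycle)
		return false;

	return page_pool_return_skb_page(virt_to_head_page(data));
}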

Comments

Jesper Dangaard Brouer March 15, 2023, 2:55 p.m. UTC | #1
On 13/03/2023 22.55, Alexander Lobakin wrote:
> __xdp_build_skb_from_frame() state(d):
> 
> /* Until page_pool get SKB return path, release DMA here */
> 
> Page Pool got skb pages recycling in April 2021, but missed this
> function.
> 
> xdp_release_frame() is relevant only for Page Pool backed frames and it
> detaches the page from the corresponding page_pool in order to make it
> freeable via page_frag_free(). It can instead just mark the output skb
> as eligible for recycling if the frame is backed by a pp. No change for
> other memory model types (the same condition check as before).
> cpumap redirect and veth on Page Pool drivers now become zero-alloc (or
> almost).
> 
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> ---
>   net/core/xdp.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 8c92fc553317..a2237cfca8e9 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -658,8 +658,8 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
>   	 * - RX ring dev queue index	(skb_record_rx_queue)
>   	 */
>   
> -	/* Until page_pool get SKB return path, release DMA here */
> -	xdp_release_frame(xdpf);
> +	if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
> +		skb_mark_for_recycle(skb);

I hope this is safe ;-) ... meaning hopefully drivers do the correct
thing when XDP_REDIRECT'ing page_pool pages.

Looking for drivers doing weird refcnt tricks and XDP_REDIRECT'ing, I
noticed the aquantia/atlantic driver (in aq_get_rxpages_xdp), but I now
see it is not using page_pool, so it should not be affected by this
(though I worry whether the atlantic driver has a potential race
condition in its refcnt scheme).

>   
>   	/* Allow SKB to reuse area used by xdp_frame */
>   	xdp_scrub_frame(xdpf);
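
The "correct thing" here essentially means the driver registers its
page_pool as the Rx queue's memory model and then lets the XDP core own
the frame's lifetime after a redirect, without taking extra page
references of its own. A minimal, hypothetical driver-side sketch (not
taken from any particular driver; the page_pool declarations lived in
<net/page_pool.h> at the time):

#include <net/xdp.h>
#include <net/page_pool.h>

/* Hypothetical Rx queue setup: once the pool is registered as the
 * memory model, frames redirected out of this queue carry
 * mem.type == MEM_TYPE_PAGE_POOL, which is exactly what the new
 * skb_mark_for_recycle() check in __xdp_build_skb_from_frame() keys on.
 */
static int drv_rxq_init_mem_model(struct xdp_rxq_info *rxq,
				  struct page_pool *pool)
{
	return xdp_rxq_info_reg_mem_model(rxq, MEM_TYPE_PAGE_POOL, pool);
}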
Alexander Lobakin March 15, 2023, 2:58 p.m. UTC | #2
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Wed, 15 Mar 2023 15:55:44 +0100

> 
> On 13/03/2023 22.55, Alexander Lobakin wrote:
>> __xdp_build_skb_from_frame() state(d):
>>
>> /* Until page_pool get SKB return path, release DMA here */
>>
>> Page Pool got skb pages recycling in April 2021, but missed this
>> function.
>>
>> xdp_release_frame() is relevant only for Page Pool backed frames and it
>> detaches the page from the corresponding page_pool in order to make it
>> freeable via page_frag_free(). It can instead just mark the output skb
>> as eligible for recycling if the frame is backed by a pp. No change for
>> other memory model types (the same condition check as before).
>> cpumap redirect and veth on Page Pool drivers now become zero-alloc (or
>> almost).
>>
>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>> ---
>>   net/core/xdp.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index 8c92fc553317..a2237cfca8e9 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -658,8 +658,8 @@ struct sk_buff *__xdp_build_skb_from_frame(struct
>> xdp_frame *xdpf,
>>        * - RX ring dev queue index    (skb_record_rx_queue)
>>        */
>>   -    /* Until page_pool get SKB return path, release DMA here */
>> -    xdp_release_frame(xdpf);
>> +    if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
>> +        skb_mark_for_recycle(skb);
> 
> I hope this is safe ;-) ... meaning hopefully drivers do the correct
> thing when XDP_REDIRECT'ing page_pool pages.

Safe when it's done by the book. For now I'm observing only one syzbot
issue with test_run, because it assumes yet another bunch of things I
wouldn't rely on :D (separate subthread)

> 
> Looking for drivers doing weird refcnt tricks and XDP_REDIRECT'ing, I
> noticed the aquantia/atlantic driver (in aq_get_rxpages_xdp), but I now
> see it is not using page_pool, so it should not be affected by this
> (though I worry whether the atlantic driver has a potential race
> condition in its refcnt scheme).

If we encounter some driver using Page Pool, but mangling refcounts on
redirect, we'll fix it ;)

> 
>>         /* Allow SKB to reuse area used by xdp_frame */
>>       xdp_scrub_frame(xdpf);
> 

Thanks,
Olek
Jesper Dangaard Brouer March 16, 2023, 5:10 p.m. UTC | #3
On 15/03/2023 15.58, Alexander Lobakin wrote:
> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
> Date: Wed, 15 Mar 2023 15:55:44 +0100
> 
>> On 13/03/2023 22.55, Alexander Lobakin wrote:
[...]
>>>
>>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>>> index 8c92fc553317..a2237cfca8e9 100644
>>> --- a/net/core/xdp.c
>>> +++ b/net/core/xdp.c
>>> @@ -658,8 +658,8 @@ struct sk_buff *__xdp_build_skb_from_frame(struct
>>> xdp_frame *xdpf,
>>>         * - RX ring dev queue index    (skb_record_rx_queue)
>>>         */
>>>    -    /* Until page_pool get SKB return path, release DMA here */
>>> -    xdp_release_frame(xdpf);
>>> +    if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
>>> +        skb_mark_for_recycle(skb);
>>
>> I hope this is safe ;-) ... meaning hopefully drivers do the correct
>> thing when XDP_REDIRECT'ing page_pool pages.
> 
> Safe when it's done by the book. For now I'm observing only one syzbot
> issue with test_run, because it assumes yet another bunch of things I
> wouldn't rely on :D (separate subthread)
> 
>>
>> Looking for drivers doing weird refcnt tricks and XDP_REDIRECT'ing, I
>> noticed the aquantia/atlantic driver (in aq_get_rxpages_xdp), but I now
>> see it is not using page_pool, so it should not be affected by this
>> (though I worry whether the atlantic driver has a potential race
>> condition in its refcnt scheme).
> 
> If we encounter some driver using Page Pool, but mangling refcounts on
> redirect, we'll fix it ;)
> 

Thanks for signing up for fixing these issues down the road :-)

For what it's worth, I've rebased my testlab to include this patchset.

For now, I've tested mlx5 with cpumap redirect and net stack processing;
everything seems to be working nicely. When disabling GRO, the cpumap
path gets the same and sometimes better TCP throughput performance,
even though checksums have to be done in software. (Hopefully we can
soon close the missing HW checksum gap with XDP-hints.)

--Jesper
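
For reference, the cpumap redirect being exercised above follows the
usual pattern: an XDP program bounces frames to a remote CPU via a
BPF_MAP_TYPE_CPUMAP entry, and the kernel on that CPU builds the skb
with __xdp_build_skb_from_frame(), which is exactly where this patch's
recycling mark now applies. A minimal sketch (not the exact program
used in the test above):

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_CPUMAP);
	__uint(max_entries, 64);	/* one slot per destination CPU */
	__type(key, __u32);
	__type(value, struct bpf_cpumap_val);
} cpu_map SEC(".maps");

SEC("xdp")
int xdp_redirect_cpu(struct xdp_md *ctx)
{
	/* Bounce every frame to CPU 0 (key 0 of the cpumap); fall back
	 * to XDP_PASS if that entry has not been populated yet.
	 */
	return bpf_redirect_map(&cpu_map, 0, XDP_PASS);
}

char _license[] SEC("license") = "GPL";

Userspace still has to fill entry 0 with a struct bpf_cpumap_val
carrying the queue size (and optionally a second-stage program fd)
before the redirect takes effect.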
Alexander Lobakin March 17, 2023, 1:36 p.m. UTC | #4
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Thu, 16 Mar 2023 18:10:26 +0100

> 
> On 15/03/2023 15.58, Alexander Lobakin wrote:
>> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
>> Date: Wed, 15 Mar 2023 15:55:44 +0100

[...]

> Thanks for signing up for fixing these issues down the road :-)

At some point, I wasn't sure which commit tags to put in Fixes:. Like,
from one PoV, it's not my patch that introduced them. On the other
hand, there was no chance of having 0x42 overwritten in the metadata
during the selftest before that switch and no one could even predict it
(I didn't expect XDP_PASS frames from test_run to reach neigh xmit at
all), so the original code is not buggy by itself either ._.

> 
> For what it's worth, I've rebased my testlab to include this
> patchset.
> 
> For now, I've tested mlx5 with cpumap redirect and net stack
> processing; everything seems to be working nicely. When disabling GRO,
> the cpumap path gets the same and sometimes better TCP throughput
> performance, even though checksums have to be done in software.
> (Hopefully we can soon close the missing HW checksum gap with
> XDP-hints.)

Yeah, I'm also looking forward to having some hints passed to
cpumap/veth, so that __xdp_build_skb_from_frame() could consume them.
Then I could bring back a bunch of patches from my RFC to finally
switch cpumap to GRO :D

> 
> --Jesper
> 

Thanks,
Olek

Patch

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 8c92fc553317..a2237cfca8e9 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -658,8 +658,8 @@  struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 	 * - RX ring dev queue index	(skb_record_rx_queue)
 	 */
 
-	/* Until page_pool get SKB return path, release DMA here */
-	xdp_release_frame(xdpf);
+	if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
+		skb_mark_for_recycle(skb);
 
 	/* Allow SKB to reuse area used by xdp_frame */
 	xdp_scrub_frame(xdpf);