diff mbox series

[bpf-next,v2,5/6] mm, bpf: Use __GFP_ACCOUNT in try_alloc_pages().

Message ID 20241210023936.46871-6-alexei.starovoitov@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series bpf, mm: Introduce __GFP_TRYLOCK | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-17 / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-17 / veristat-meta
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-44 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-45 success Logs for x86_64-llvm-18 / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-46 success Logs for x86_64-llvm-18 / veristat-meta
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-43 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 569 this patch: 569
netdev/build_tools success Errors and warnings before: 0 (+0) this patch: 0 (+0)
netdev/cc_maintainers success CCed 3 of 3 maintainers
netdev/build_clang success Errors and warnings before: 979 this patch: 979
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 15904 this patch: 15904
netdev/checkpatch warning WARNING: line length of 87 exceeds 80 columns WARNING: line length of 97 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / veristat-meta
bpf/vmtest-bpf-next-VM_Test-9 fail Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / test
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / veristat-meta
bpf/vmtest-bpf-next-VM_Test-14 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-17 success Logs for x86_64-gcc / test
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-20 fail Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / veristat-meta
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-llvm-17 / test
bpf/vmtest-bpf-next-VM_Test-21 fail Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-llvm-17 / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-llvm-17 / veristat-meta
bpf/vmtest-bpf-next-VM_Test-25 fail Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-26 fail Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-18 / test
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-18 / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-18 / veristat-meta

Commit Message

Alexei Starovoitov Dec. 10, 2024, 2:39 a.m. UTC
From: Alexei Starovoitov <ast@kernel.org>

Unconditionally use __GFP_ACCOUNT in try_alloc_pages().
The caller is responsible to setup memcg correctly.
All BPF memory accounting is memcg based.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/gfp.h | 5 ++---
 mm/page_alloc.c     | 5 ++++-
 2 files changed, 6 insertions(+), 4 deletions(-)

Comments

Vlastimil Babka Dec. 11, 2024, 12:05 p.m. UTC | #1
On 12/10/24 03:39, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Unconditionally use __GFP_ACCOUNT in try_alloc_pages().
> The caller is responsible to setup memcg correctly.
> All BPF memory accounting is memcg based.

Hm I guess if there was later another user besides bpf that didn't want
memcg for some reason, a variant of the function (or a bool param, to avoid
passing full gfp param which would be thrown away besides __GFP_ACCOUNT)
could be easily created.

> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  include/linux/gfp.h | 5 ++---
>  mm/page_alloc.c     | 5 ++++-
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index dcae733ed006..820c4938c9cd 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -356,18 +356,17 @@ static inline struct page *try_alloc_pages_noprof(int nid, unsigned int order)
>  	 */
>  	if (preemptible() && !rcu_preempt_depth())
>  		return alloc_pages_node_noprof(nid,
> -					       GFP_NOWAIT | __GFP_ZERO,
> +					       GFP_NOWAIT | __GFP_ZERO | __GFP_ACCOUNT,
>  					       order);
>  	/*
>  	 * Best effort allocation from percpu free list.
>  	 * If it's empty attempt to spin_trylock zone->lock.
>  	 * Do not specify __GFP_KSWAPD_RECLAIM to avoid wakeup_kswapd
>  	 * that may need to grab a lock.
> -	 * Do not specify __GFP_ACCOUNT to avoid local_lock.
>  	 * Do not warn either.
>  	 */
>  	return alloc_pages_node_noprof(nid,
> -				       __GFP_TRYLOCK | __GFP_NOWARN | __GFP_ZERO,
> +				       __GFP_TRYLOCK | __GFP_NOWARN | __GFP_ZERO | __GFP_ACCOUNT,
>  				       order);
>  }
>  #define try_alloc_pages(...)			alloc_hooks(try_alloc_pages_noprof(__VA_ARGS__))
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a969a62ec0c3..1fada16b8a14 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4818,7 +4818,10 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
>  out:
>  	if (memcg_kmem_online() && (gfp & __GFP_ACCOUNT) && page &&
>  	    unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
> -		__free_pages(page, order);
> +		if (unlikely(gfp & __GFP_TRYLOCK))
> +			free_pages_nolock(page, order);
> +		else
> +			__free_pages(page, order);
>  		page = NULL;
>  	}
>
Alexei Starovoitov Dec. 12, 2024, 2:54 a.m. UTC | #2
On Wed, Dec 11, 2024 at 4:05 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 12/10/24 03:39, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Unconditionally use __GFP_ACCOUNT in try_alloc_pages().
> > The caller is responsible to setup memcg correctly.
> > All BPF memory accounting is memcg based.
>
> Hm I guess if there was later another user besides bpf that didn't want
> memcg for some reason, a variant of the function (or a bool param, to avoid
> passing full gfp param which would be thrown away besides __GFP_ACCOUNT)
> could be easily created.

Let's cross that bridge when we get there.
tbh the code that doesn't play nice with memcg should be shamed.
I think CONFIG_MEMCG should disappear and become a default.
diff mbox series

Patch

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index dcae733ed006..820c4938c9cd 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -356,18 +356,17 @@  static inline struct page *try_alloc_pages_noprof(int nid, unsigned int order)
 	 */
 	if (preemptible() && !rcu_preempt_depth())
 		return alloc_pages_node_noprof(nid,
-					       GFP_NOWAIT | __GFP_ZERO,
+					       GFP_NOWAIT | __GFP_ZERO | __GFP_ACCOUNT,
 					       order);
 	/*
 	 * Best effort allocation from percpu free list.
 	 * If it's empty attempt to spin_trylock zone->lock.
 	 * Do not specify __GFP_KSWAPD_RECLAIM to avoid wakeup_kswapd
 	 * that may need to grab a lock.
-	 * Do not specify __GFP_ACCOUNT to avoid local_lock.
 	 * Do not warn either.
 	 */
 	return alloc_pages_node_noprof(nid,
-				       __GFP_TRYLOCK | __GFP_NOWARN | __GFP_ZERO,
+				       __GFP_TRYLOCK | __GFP_NOWARN | __GFP_ZERO | __GFP_ACCOUNT,
 				       order);
 }
 #define try_alloc_pages(...)			alloc_hooks(try_alloc_pages_noprof(__VA_ARGS__))
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a969a62ec0c3..1fada16b8a14 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4818,7 +4818,10 @@  struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
 out:
 	if (memcg_kmem_online() && (gfp & __GFP_ACCOUNT) && page &&
 	    unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
-		__free_pages(page, order);
+		if (unlikely(gfp & __GFP_TRYLOCK))
+			free_pages_nolock(page, order);
+		else
+			__free_pages(page, order);
 		page = NULL;
 	}