[bpf-next,1/6] mm/percpu.c: introduce alloc_size_percpu()

Message ID: 20231007135106.3031284-2-houtao@huaweicloud.com (mailing list archive)
State: Superseded
Delegated to: BPF
Series: bpf: Fixes for per-cpu kptr

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 15602 this patch: 15602
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 3927 this patch: 3927
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 17005 this patch: 17005
netdev/checkpatch warning CHECK: extern prototypes should be avoided in .h files; WARNING: line length of 87 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-0 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-1 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-6 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 fail Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for veristat

Commit Message

Hou Tao Oct. 7, 2023, 1:51 p.m. UTC
From: Hou Tao <houtao1@huawei.com>

Introduce alloc_size_percpu() to get the size of the dynamic per-cpu
area. It will be used by the BPF memory allocator in the following
patches. The BPF memory allocator maintains multiple per-cpu area
caches for multiple area sizes, and it needs the size of the dynamic
per-cpu area to select the corresponding cache when a bpf program
frees the dynamic per-cpu area.

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 include/linux/percpu.h |  1 +
 mm/percpu.c            | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)
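
For illustration, a sketch of the intended consumer on the free path (all the helper names below are placeholders for this sketch, not code from this series):

	/* illustrative only: pick the matching size-class cache on free */
	static void bpf_percpu_free_sketch(void __percpu *ptr)
	{
		size_t size = alloc_size_percpu(ptr);	/* size of the dynamic area */
		unsigned int idx = size_to_cache_idx(size);	/* placeholder: size -> cache index */

		return_to_cache(idx, ptr);		/* placeholder: stash for reuse */
	}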

Comments

Andrew Morton Oct. 7, 2023, 2:04 p.m. UTC | #1
On Sat,  7 Oct 2023 21:51:01 +0800 Hou Tao <houtao@huaweicloud.com> wrote:

> From: Hou Tao <houtao1@huawei.com>
> 
> Introduce alloc_size_percpu() to get the size of the dynamic per-cpu
> area. It will be used by the BPF memory allocator in the following
> patches. The BPF memory allocator maintains multiple per-cpu area
> caches for multiple area sizes, and it needs the size of the dynamic
> per-cpu area to select the corresponding cache when a bpf program
> frees the dynamic per-cpu area.
> 
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -2244,6 +2244,35 @@ static void pcpu_balance_workfn(struct work_struct *work)
>  	mutex_unlock(&pcpu_alloc_mutex);
>  }
>  
> +/**
> + * alloc_size_percpu - the size of the dynamic percpu area
> + * @ptr: pointer to the dynamic percpu area
> + *
> + * Return the size of the dynamic percpu area @ptr.
> + *
> + * RETURNS:
> + * The size of the dynamic percpu area.
> + *
> + * CONTEXT:
> + * Can be called from atomic context.
> + */
> +size_t alloc_size_percpu(void __percpu *ptr)
> +{
> +	struct pcpu_chunk *chunk;
> +	int bit_off, end;

It's minor, but I'd suggest unsigned long for both.

> +	void *addr;
> +
> +	if (!ptr)
> +		return 0;
> +
> +	addr = __pcpu_ptr_to_addr(ptr);
> +	/* No pcpu_lock here: ptr has not been freed, so chunk is still alive */
> +	chunk = pcpu_chunk_addr_search(addr);
> +	bit_off = (addr - chunk->base_addr) / PCPU_MIN_ALLOC_SIZE;

void* - void* is a ptrdiff_t, which is long or int.

> +	end = find_next_bit(chunk->bound_map, pcpu_chunk_map_bits(chunk), bit_off + 1);

find_next_bit takes an unsigned long

> +	return (end - bit_off) * PCPU_MIN_ALLOC_SIZE;

And then we don't need to worry about signedness issues.

> +}
> +
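
Taken together, a sketch with those types (illustrative; the posted v2 may differ):

	size_t alloc_size_percpu(void __percpu *ptr)
	{
		struct pcpu_chunk *chunk;
		unsigned long bit_off, end;
		void *addr;

		if (!ptr)
			return 0;

		addr = __pcpu_ptr_to_addr(ptr);
		/* No pcpu_lock here: ptr has not been freed, so chunk is still alive */
		chunk = pcpu_chunk_addr_search(addr);
		bit_off = (unsigned long)(addr - chunk->base_addr) / PCPU_MIN_ALLOC_SIZE;
		end = find_next_bit(chunk->bound_map, pcpu_chunk_map_bits(chunk),
				    bit_off + 1);
		return (end - bit_off) * PCPU_MIN_ALLOC_SIZE;
	}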
Hou Tao Oct. 8, 2023, 2:47 a.m. UTC | #2
Hi,

On 10/7/2023 10:04 PM, Andrew Morton wrote:
> On Sat,  7 Oct 2023 21:51:01 +0800 Hou Tao <houtao@huaweicloud.com> wrote:
>
>> From: Hou Tao <houtao1@huawei.com>
>>
>> Introduce alloc_size_percpu() to get the size of the dynamic per-cpu
>> area. It will be used by the BPF memory allocator in the following
>> patches. The BPF memory allocator maintains multiple per-cpu area
>> caches for multiple area sizes, and it needs the size of the dynamic
>> per-cpu area to select the corresponding cache when a bpf program
>> frees the dynamic per-cpu area.
>>
>> --- a/mm/percpu.c
>> +++ b/mm/percpu.c
>> @@ -2244,6 +2244,35 @@ static void pcpu_balance_workfn(struct work_struct *work)
>>  	mutex_unlock(&pcpu_alloc_mutex);
>>  }
>>  
>> +/**
>> + * alloc_size_percpu - the size of the dynamic percpu area
>> + * @ptr: pointer to the dynamic percpu area
>> + *
>> + * Return the size of the dynamic percpu area @ptr.
>> + *
>> + * RETURNS:
>> + * The size of the dynamic percpu area.
>> + *
>> + * CONTEXT:
>> + * Can be called from atomic context.
>> + */
>> +size_t alloc_size_percpu(void __percpu *ptr)
>> +{
>> +	struct pcpu_chunk *chunk;
>> +	int bit_off, end;
> It's minor, but I'd suggest unsigned long for both.

Thanks for this and all the following suggestions. Will do in v2.
>
>> +	void *addr;
>> +
>> +	if (!ptr)
>> +		return 0;
>> +
>> +	addr = __pcpu_ptr_to_addr(ptr);
>> +	/* No pcpu_lock here: ptr has not been freed, so chunk is still alive */
>> +	chunk = pcpu_chunk_addr_search(addr);
>> +	bit_off = (addr - chunk->base_addr) / PCPU_MIN_ALLOC_SIZE;
> void* - void* is a ptrdiff_t, which is long or int.
>
>> +	end = find_next_bit(chunk->bound_map, pcpu_chunk_map_bits(chunk), bit_off + 1);
> find_next_bit takes an unsigned long
>
>> +	return (end - bit_off) * PCPU_MIN_ALLOC_SIZE;
> And then we don't need to worry about signedness issues.
>
>> +}
>> +
Dennis Zhou Oct. 8, 2023, 10:32 p.m. UTC | #3
Hello,

On Sat, Oct 07, 2023 at 09:51:01PM +0800, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> Introduce alloc_size_percpu() to get the size of the dynamic per-cpu
> area. It will be used by the BPF memory allocator in the following
> patches. The BPF memory allocator maintains multiple per-cpu area
> caches for multiple area sizes, and it needs the size of the dynamic
> per-cpu area to select the corresponding cache when a bpf program
> frees the dynamic per-cpu area.
> 
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>  include/linux/percpu.h |  1 +
>  mm/percpu.c            | 29 +++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 68fac2e7cbe6..d140d9d79567 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -132,6 +132,7 @@ extern void __init setup_per_cpu_areas(void);
>  extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp) __alloc_size(1);
>  extern void __percpu *__alloc_percpu(size_t size, size_t align) __alloc_size(1);
>  extern void free_percpu(void __percpu *__pdata);
> +extern size_t alloc_size_percpu(void __percpu *__pdata);
>  
>  DEFINE_FREE(free_percpu, void __percpu *, free_percpu(_T))
>  
> diff --git a/mm/percpu.c b/mm/percpu.c
> index 7b40b3963f10..f541cfc3cb2d 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -2244,6 +2244,35 @@ static void pcpu_balance_workfn(struct work_struct *work)
>  	mutex_unlock(&pcpu_alloc_mutex);
>  }
>  
> +/**
> + * alloc_size_percpu - the size of the dynamic percpu area

Can we name this pcpu_alloc_size()? A few other functions are
exposed under pcpu_*, so it's a bit easier to keep track of.

> + * @ptr: pointer to the dynamic percpu area
> + *
> + * Return the size of the dynamic percpu area @ptr.
> + *
> + * RETURNS:
> + * The size of the dynamic percpu area.
> + *
> + * CONTEXT:
> + * Can be called from atomic context.
> + */
> +size_t alloc_size_percpu(void __percpu *ptr)
> +{
> +	struct pcpu_chunk *chunk;
> +	int bit_off, end;
> +	void *addr;
> +
> +	if (!ptr)
> +		return 0;
> +
> +	addr = __pcpu_ptr_to_addr(ptr);
> +	/* No pcpu_lock here: ptr has not been freed, so chunk is still alive */

Now that percpu variables are floating around more commonly, I think we
(or I) need to add more validation guards so it's easier to debug
bogus/stale pointers. Potentially gated behind a static_key or a Kconfig
option, so that we take the lock and `test_bit()`.

> +	chunk = pcpu_chunk_addr_search(addr);
> +	bit_off = (addr - chunk->base_addr) / PCPU_MIN_ALLOC_SIZE;
> +	end = find_next_bit(chunk->bound_map, pcpu_chunk_map_bits(chunk), bit_off + 1);

Nit: can you please reflow `bit_off + 1` to the next line? I know we
dropped the 80-column line limit, but percpu.c almost completely still
follows it.

> +	return (end - bit_off) * PCPU_MIN_ALLOC_SIZE;
> +}
> +
>  /**
>   * free_percpu - free percpu area
>   * @ptr: pointer to area to free
> -- 
> 2.29.2
> 

Thanks,
Dennis
Hou Tao Oct. 11, 2023, 6:30 a.m. UTC | #4
Hi,

On 10/9/2023 6:32 AM, Dennis Zhou wrote:
> Hello,
>
> On Sat, Oct 07, 2023 at 09:51:01PM +0800, Hou Tao wrote:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> Introduce alloc_size_percpu() to get the size of the dynamic per-cpu
>> area. It will be used by the BPF memory allocator in the following
>> patches. The BPF memory allocator maintains multiple per-cpu area
>> caches for multiple area sizes, and it needs the size of the dynamic
>> per-cpu area to select the corresponding cache when a bpf program
>> frees the dynamic per-cpu area.
>>
>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>> ---
>>  include/linux/percpu.h |  1 +
>>  mm/percpu.c            | 29 +++++++++++++++++++++++++++++
>>  2 files changed, 30 insertions(+)
>>
>> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
>> index 68fac2e7cbe6..d140d9d79567 100644
>> --- a/include/linux/percpu.h
>> +++ b/include/linux/percpu.h
>> @@ -132,6 +132,7 @@ extern void __init setup_per_cpu_areas(void);
>>  extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp) __alloc_size(1);
>>  extern void __percpu *__alloc_percpu(size_t size, size_t align) __alloc_size(1);
>>  extern void free_percpu(void __percpu *__pdata);
>> +extern size_t alloc_size_percpu(void __percpu *__pdata);
>>  
>>  DEFINE_FREE(free_percpu, void __percpu *, free_percpu(_T))
>>  
>> diff --git a/mm/percpu.c b/mm/percpu.c
>> index 7b40b3963f10..f541cfc3cb2d 100644
>> --- a/mm/percpu.c
>> +++ b/mm/percpu.c
>> @@ -2244,6 +2244,35 @@ static void pcpu_balance_workfn(struct work_struct *work)
>>  	mutex_unlock(&pcpu_alloc_mutex);
>>  }
>>  
>> +/**
>> + * alloc_size_percpu - the size of the dynamic percpu area
> Can we name this pcpu_alloc_size()? A few other functions are
> exposed under pcpu_*, so it's a bit easier to keep track of.

Will do in v2.
>
>> + * @ptr: pointer to the dynamic percpu area
>> + *
>> + * Return the size of the dynamic percpu area @ptr.
>> + *
>> + * RETURNS:
>> + * The size of the dynamic percpu area.
>> + *
>> + * CONTEXT:
>> + * Can be called from atomic context.
>> + */
>> +size_t alloc_size_percpu(void __percpu *ptr)
>> +{
>> +	struct pcpu_chunk *chunk;
>> +	int bit_off, end;
>> +	void *addr;
>> +
>> +	if (!ptr)
>> +		return 0;
>> +
>> +	addr = __pcpu_ptr_to_addr(ptr);
>> +	/* No pcpu_lock here: ptr has not been freed, so chunk is still alive */
> Now that percpu variables are floating around more commonly, I think we
> (or I) need to add more validation guards so it's easier to debug
> bogus/stale pointers. Potentially gated behind a static_key or a Kconfig
> option, so that we take the lock and `test_bit()`.

Before this patch, it seems that free_percpu() is the only user of
__pcpu_ptr_to_addr(). I can move the invocations of both
__pcpu_ptr_to_addr() and pcpu_chunk_addr_search() into a common helper
if you are OK with it. I think that would also ease the work of adding
a validation guard on the per-cpu pointer.
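
Something like the following is what I have in mind (the helper name is
illustrative):

	static struct pcpu_chunk *pcpu_ptr_to_chunk(void __percpu *ptr,
						    void **addrp)
	{
		void *addr = __pcpu_ptr_to_addr(ptr);

		*addrp = addr;
		/* a guard (e.g. Kconfig-gated) could take pcpu_lock and test_bit() here */
		return pcpu_chunk_addr_search(addr);
	}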
>
>> +	chunk = pcpu_chunk_addr_search(addr);
>> +	bit_off = (addr - chunk->base_addr) / PCPU_MIN_ALLOC_SIZE;
>> +	end = find_next_bit(chunk->bound_map, pcpu_chunk_map_bits(chunk), bit_off + 1);
> Nit: can you please reflow `bit_off + 1` to the next line? I know we
> dropped the 80-column line limit, but percpu.c almost completely still
> follows it.

Will do in v2.
>
>> +	return (end - bit_off) * PCPU_MIN_ALLOC_SIZE;
>> +}
>> +
>>  /**
>>   * free_percpu - free percpu area
>>   * @ptr: pointer to area to free
>> -- 
>> 2.29.2
>>
> Thanks,
> Dennis

Patch

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 68fac2e7cbe6..d140d9d79567 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -132,6 +132,7 @@  extern void __init setup_per_cpu_areas(void);
 extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp) __alloc_size(1);
 extern void __percpu *__alloc_percpu(size_t size, size_t align) __alloc_size(1);
 extern void free_percpu(void __percpu *__pdata);
+extern size_t alloc_size_percpu(void __percpu *__pdata);
 
 DEFINE_FREE(free_percpu, void __percpu *, free_percpu(_T))
 
diff --git a/mm/percpu.c b/mm/percpu.c
index 7b40b3963f10..f541cfc3cb2d 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -2244,6 +2244,35 @@  static void pcpu_balance_workfn(struct work_struct *work)
 	mutex_unlock(&pcpu_alloc_mutex);
 }
 
+/**
+ * alloc_size_percpu - the size of the dynamic percpu area
+ * @ptr: pointer to the dynamic percpu area
+ *
+ * Return the size of the dynamic percpu area @ptr.
+ *
+ * RETURNS:
+ * The size of the dynamic percpu area.
+ *
+ * CONTEXT:
+ * Can be called from atomic context.
+ */
+size_t alloc_size_percpu(void __percpu *ptr)
+{
+	struct pcpu_chunk *chunk;
+	int bit_off, end;
+	void *addr;
+
+	if (!ptr)
+		return 0;
+
+	addr = __pcpu_ptr_to_addr(ptr);
+	/* No pcpu_lock here: ptr has not been freed, so chunk is still alive */
+	chunk = pcpu_chunk_addr_search(addr);
+	bit_off = (addr - chunk->base_addr) / PCPU_MIN_ALLOC_SIZE;
+	end = find_next_bit(chunk->bound_map, pcpu_chunk_map_bits(chunk), bit_off + 1);
+	return (end - bit_off) * PCPU_MIN_ALLOC_SIZE;
+}
+
 /**
  * free_percpu - free percpu area
  * @ptr: pointer to area to free
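
For illustration, a minimal caller of the new helper (assuming this patch is applied; the returned size is in PCPU_MIN_ALLOC_SIZE granularity):

	void __percpu *p = __alloc_percpu(24, 8);

	if (p) {
		size_t sz = alloc_size_percpu(p);

		pr_debug("dynamic percpu area size: %zu\n", sz);	/* >= 24 here */
		free_percpu(p);
	}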