[bpf-next,0/3] bpf: inline bpf_kptr_xchg()

Message ID 20231219135615.2656572-1-houtao@huaweicloud.com (mailing list archive)

Message

Hou Tao Dec. 19, 2023, 1:56 p.m. UTC
From: Hou Tao <houtao1@huawei.com>

Hi,

The motivation for this patch set comes from performance profiling of a
bpf memory allocator benchmark (to be posted soon). The benchmark was
originally written to check whether there is any performance degradation
when using c->unit_size instead of ksize() to select the target cache
for free [1]. The benchmark uses bpf_kptr_xchg() to stash the allocated
objects and later fetches the stashed objects back to free them. With
the fix proposed in [1] applied, inlining bpf_kptr_xchg() improves the
performance of object free by about 4%.
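
As a rough illustration of the stash-and-free pattern (this is not the
actual benchmark, which has not been posted yet; the map layout, struct
names and section name below are illustrative assumptions, with the
headers as used by the BPF selftests):

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include "bpf_experimental.h"

struct node {
	int data;
};

struct map_value {
	struct node __kptr *ptr;
};

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__type(key, int);
	__type(value, struct map_value);
	__uint(max_entries, 1);
} array SEC(".maps");

SEC("tc")
int stash_and_free(void *ctx)
{
	struct map_value *v;
	struct node *new, *old;
	int key = 0;

	v = bpf_map_lookup_elem(&array, &key);
	if (!v)
		return 0;

	/* allocate from the bpf memory allocator */
	new = bpf_obj_new(typeof(*new));
	if (!new)
		return 0;

	/* stash the new object; bpf_kptr_xchg() returns the old one */
	old = bpf_kptr_xchg(&v->ptr, new);
	if (old)
		/* free the previously stashed object */
		bpf_obj_drop(old);
	return 0;
}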

Initially the inlining was implemented directly in do_jit() for x86-64,
but implementing it in the verifier is more portable. Please see the
individual patches for more details. And comments are always welcome.
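
The rewrite itself amounts to replacing the helper call with an atomic
exchange plus a register move. A minimal sketch of what the verifier
fixup could emit (the exact guard for JIT support is in the patches, and
insn_buf is just an illustrative name):

/* R1 = address of the kptr field, R2 = new kptr. BPF_XCHG
 * atomically stores R2 to *(u64 *)(R1 + 0) and leaves the old
 * value in R2, which is then moved to R0 as the return value
 * of bpf_kptr_xchg().
 */
struct bpf_insn insn_buf[] = {
	BPF_ATOMIC_OP(BPF_DW, BPF_XCHG, BPF_REG_1, BPF_REG_2, 0),
	BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
};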

[1]: https://lore.kernel.org/bpf/20231216131052.27621-1-houtao@huaweicloud.com

Hou Tao (3):
  bpf: Support inlining bpf_kptr_xchg() helper
  bpf, x86: Don't generate lock prefix for BPF_XCHG
  bpf, x86: Inline bpf_kptr_xchg() on x86-64

 arch/x86/net/bpf_jit_comp.c |  9 ++++++++-
 include/linux/filter.h      |  1 +
 kernel/bpf/core.c           | 10 ++++++++++
 kernel/bpf/verifier.c       | 17 +++++++++++++++++
 4 files changed, 36 insertions(+), 1 deletion(-)
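
Patch 2 drops a redundant LOCK prefix: on x86-64, XCHG with a memory
operand is implicitly locked, so the prefix emitted for the other BPF
atomics is unnecessary for BPF_XCHG. A sketch of the emit_atomic()
change (surrounding context simplified):

/* XCHG with a memory operand implies LOCK on x86-64, so only emit
 * the prefix for the other atomic ops.
 */
if (atomic_op != BPF_XCHG)
	EMIT1(0xf0); /* lock prefix */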

Comments

Daniel Borkmann Dec. 20, 2023, 2:54 p.m. UTC | #1
On 12/19/23 2:56 PM, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> Hi,
> 
> The motivation for this patch set comes from performance profiling of a
> bpf memory allocator benchmark (to be posted soon). The benchmark was
> originally written to check whether there is any performance degradation
> when using c->unit_size instead of ksize() to select the target cache
> for free [1]. The benchmark uses bpf_kptr_xchg() to stash the allocated
> objects and later fetches the stashed objects back to free them. With
> the fix proposed in [1] applied, inlining bpf_kptr_xchg() improves the
> performance of object free by about 4%.

It would probably make more sense to also place this in the actual
patch as the motivation / use case for /why/ it's needed.

> Initially the inlining was implemented directly in do_jit() for x86-64,
> but implementing it in the verifier is more portable. Please see the
> individual patches for more details. And comments are always welcome.
> 
> [1]: https://lore.kernel.org/bpf/20231216131052.27621-1-houtao@huaweicloud.com
> 
> Hou Tao (3):
>    bpf: Support inlining bpf_kptr_xchg() helper
>    bpf, x86: Don't generate lock prefix for BPF_XCHG
>    bpf, x86: Inline bpf_kptr_xchg() on x86-64
> 
>   arch/x86/net/bpf_jit_comp.c |  9 ++++++++-
>   include/linux/filter.h      |  1 +
>   kernel/bpf/core.c           | 10 ++++++++++
>   kernel/bpf/verifier.c       | 17 +++++++++++++++++
>   4 files changed, 36 insertions(+), 1 deletion(-)
> 

nit: Needs a rebase.
Hou Tao Dec. 21, 2023, 11:32 a.m. UTC | #2
Hi,

On 12/20/2023 10:54 PM, Daniel Borkmann wrote:
> On 12/19/23 2:56 PM, Hou Tao wrote:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> Hi,
>>
>> The motivation for this patch set comes from performance profiling of a
>> bpf memory allocator benchmark (to be posted soon). The benchmark was
>> originally written to check whether there is any performance degradation
>> when using c->unit_size instead of ksize() to select the target cache
>> for free [1]. The benchmark uses bpf_kptr_xchg() to stash the allocated
>> objects and later fetches the stashed objects back to free them. With
>> the fix proposed in [1] applied, inlining bpf_kptr_xchg() improves the
>> performance of object free by about 4%.
>
> It would probably make more sense to also place this in the actual
> patch as the motivation / use case for /why/ it's needed.

Thanks for the suggestion. Will add it to the inlining patch.
>
>> Initially the inlining was implemented directly in do_jit() for x86-64,
>> but implementing it in the verifier is more portable. Please see the
>> individual patches for more details. And comments are always welcome.
>>
>> [1]:
>> https://lore.kernel.org/bpf/20231216131052.27621-1-houtao@huaweicloud.com
>>
>> Hou Tao (3):
>>    bpf: Support inlining bpf_kptr_xchg() helper
>>    bpf, x86: Don't generate lock prefix for BPF_XCHG
>>    bpf, x86: Inline bpf_kptr_xchg() on x86-64
>>
>>   arch/x86/net/bpf_jit_comp.c |  9 ++++++++-
>>   include/linux/filter.h      |  1 +
>>   kernel/bpf/core.c           | 10 ++++++++++
>>   kernel/bpf/verifier.c       | 17 +++++++++++++++++
>>   4 files changed, 36 insertions(+), 1 deletion(-)
>>
>
> nit: Needs a rebase.

Will do.