[bpf,2/2] bpf: Use __llist_del_all() whenever possible during memory draining

Message ID 20221019115539.983394-3-houtao@huaweicloud.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series Wait for busy refill_work when destroying bpf memory allocator

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5 this patch: 5
netdev/cc_maintainers success CCed 12 of 12 maintainers
netdev/build_clang success Errors and warnings before: 5 this patch: 5
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 5 this patch: 5
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 19 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-PR fail PR summary
bpf/vmtest-bpf-VM_Test-4 success Logs for llvm-toolchain
bpf/vmtest-bpf-VM_Test-5 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-2 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-3 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-1 success Logs for build for s390x with gcc
bpf/vmtest-bpf-VM_Test-16 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-17 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-6 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-VM_Test-7 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-8 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-10 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-11 fail Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-12 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-13 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-14 fail Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-9 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-VM_Test-15 success Logs for test_verifier on s390x with gcc

Commit Message

Hou Tao Oct. 19, 2022, 11:55 a.m. UTC
From: Hou Tao <houtao1@huawei.com>

Except for waiting_for_gp list, there are no concurrent operations on
free_by_rcu, free_llist and free_llist_extra lists, so use
__llist_del_all() instead of llist_del_all(). waiting_for_gp list can be
deleted by RCU callback concurrently, so still use llist_del_all().

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 kernel/bpf/memalloc.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
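
For reference, a simplified sketch of the two helpers involved, paraphrased
from include/linux/llist.h (check the header for the authoritative
definitions):

/* llist_del_all() detaches the whole list with an atomic xchg(), so it is
 * safe against concurrent lockless llist_add() callers.
 */
static inline struct llist_node *llist_del_all(struct llist_head *head)
{
	return xchg(&head->first, NULL);
}

/* __llist_del_all() uses plain loads and stores, so it is only valid when the
 * caller knows the list cannot be touched concurrently -- which is what this
 * patch asserts for free_by_rcu, free_llist and free_llist_extra during
 * draining. waiting_for_gp keeps the atomic variant because the RCU callback
 * can still empty it concurrently.
 */
static inline struct llist_node *__llist_del_all(struct llist_head *head)
{
	struct llist_node *first = head->first;

	head->first = NULL;
	return first;
}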

Comments

Stanislav Fomichev Oct. 19, 2022, 7 p.m. UTC | #1
On 10/19, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>

> Except for waiting_for_gp list, there are no concurrent operations on
> free_by_rcu, free_llist and free_llist_extra lists, so use
> __llist_del_all() instead of llist_del_all(). waiting_for_gp list can be
> deleted by RCU callback concurrently, so still use llist_del_all().

> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>   kernel/bpf/memalloc.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)

> diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
> index 48e606aaacf0..7f45744a09f7 100644
> --- a/kernel/bpf/memalloc.c
> +++ b/kernel/bpf/memalloc.c
> @@ -422,14 +422,17 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
>   	/* No progs are using this bpf_mem_cache, but htab_map_free() called
>   	 * bpf_mem_cache_free() for all remaining elements and they can be in
>   	 * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
> +	 *
> +	 * Except for waiting_for_gp list, there are no concurrent operations
> +	 * on these lists, so it is safe to use __llist_del_all().
>   	 */
>   	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
>   		free_one(c, llnode);
>   	llist_for_each_safe(llnode, t, llist_del_all(&c->waiting_for_gp))
>   		free_one(c, llnode);
> -	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist))
> +	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist))
>   		free_one(c, llnode);
> -	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
> +	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist_extra))
>   		free_one(c, llnode);

Acked-by: Stanislav Fomichev <sdf@google.com>

Seems safe even without the previous patch? OTOH, do we really care
about __llist vs llist in the cleanup path? Might be safer to always
do llist_del_all everywhere?

>   }

> --
> 2.29.2
Hou Tao Oct. 20, 2022, 1:17 a.m. UTC | #2
Hi,

On 10/20/2022 3:00 AM, sdf@google.com wrote:
> On 10/19, Hou Tao wrote:
>> From: Hou Tao <houtao1@huawei.com>
>
>> Except for waiting_for_gp list, there are no concurrent operations on
>> free_by_rcu, free_llist and free_llist_extra lists, so use
>> __llist_del_all() instead of llist_del_all(). waiting_for_gp list can be
>> deleted by RCU callback concurrently, so still use llist_del_all().
>
>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>> ---
>>   kernel/bpf/memalloc.c | 7 +++++--
>>   1 file changed, 5 insertions(+), 2 deletions(-)
>
>> diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
>> index 48e606aaacf0..7f45744a09f7 100644
>> --- a/kernel/bpf/memalloc.c
>> +++ b/kernel/bpf/memalloc.c
>> @@ -422,14 +422,17 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
>>       /* No progs are using this bpf_mem_cache, but htab_map_free() called
>>        * bpf_mem_cache_free() for all remaining elements and they can be in
>>        * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
>> +     *
>> +     * Except for waiting_for_gp list, there are no concurrent operations
>> +     * on these lists, so it is safe to use __llist_del_all().
>>        */
>>       llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
>>           free_one(c, llnode);
>>       llist_for_each_safe(llnode, t, llist_del_all(&c->waiting_for_gp))
>>           free_one(c, llnode);
>> -    llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist))
>> +    llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist))
>>           free_one(c, llnode);
>> -    llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
>> +    llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist_extra))
>>           free_one(c, llnode);
>
> Acked-by: Stanislav Fomichev <sdf@google.com>
Thanks for the Acked-by.
>
> Seems safe even without the previous patch? OTOH, do we really care
> about __llist vs llist in the cleanup path? Might be safer to always
> do llist_del_all everywhere?
No. free_llist is manipulated by both the irq work and the memory draining
concurrently before patch #1. Using llist_del_all(&c->free_llist) also doesn't
help, because the irq work uses the __llist_add()/__llist_del() helpers.
Basically there is no difference between the __llist and llist helpers for the
cleanup path, but I think it is better to clarify the possible concurrent
accesses and codify these assumptions.
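
To make it concrete, here is a minimal sketch of the pre-patch-#1 situation
(hypothetical code, not taken from the tree; the irq-work side follows the
description above):

/* irq-work side (refill/free path): updates free_llist with the non-atomic
 * helper, relying on IRQs being disabled on its own CPU.
 */
static void irq_work_side_sketch(struct bpf_mem_cache *c,
				 struct llist_node *llnode)
{
	__llist_add(llnode, &c->free_llist);	/* plain update of head->first */
}

/* draining side (process context, possibly another CPU): even the atomic
 * xchg() in llist_del_all() does not help, because a concurrent __llist_add()
 * that already read the old head can re-install it with a plain store after
 * the drain has detached and freed the nodes.
 */
static void drain_side_sketch(struct bpf_mem_cache *c)
{
	struct llist_node *llnode, *t;

	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist))
		free_one(c, llnode);
}

With patch #1 the two sides no longer overlap, so the helper choice in
drain_mem_cache() is mostly about documenting that assumption.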
>
>>   }
>
>> -- 
>> 2.29.2
>
> .
Stanislav Fomichev Oct. 20, 2022, 5:52 p.m. UTC | #3
On Wed, Oct 19, 2022 at 6:18 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Hi,
>
> On 10/20/2022 3:00 AM, sdf@google.com wrote:
> > On 10/19, Hou Tao wrote:
> >> From: Hou Tao <houtao1@huawei.com>
> >
> >> Except for waiting_for_gp list, there are no concurrent operations on
> >> free_by_rcu, free_llist and free_llist_extra lists, so use
> >> __llist_del_all() instead of llist_del_all(). waiting_for_gp list can be
> >> deleted by RCU callback concurrently, so still use llist_del_all().
> >
> >> Signed-off-by: Hou Tao <houtao1@huawei.com>
> >> ---
> >>   kernel/bpf/memalloc.c | 7 +++++--
> >>   1 file changed, 5 insertions(+), 2 deletions(-)
> >
> >> diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
> >> index 48e606aaacf0..7f45744a09f7 100644
> >> --- a/kernel/bpf/memalloc.c
> >> +++ b/kernel/bpf/memalloc.c
> >> @@ -422,14 +422,17 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
> >>       /* No progs are using this bpf_mem_cache, but htab_map_free() called
> >>        * bpf_mem_cache_free() for all remaining elements and they can be in
> >>        * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
> >> +     *
> >> +     * Except for waiting_for_gp list, there are no concurrent operations
> >> +     * on these lists, so it is safe to use __llist_del_all().
> >>        */
> >>       llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
> >>           free_one(c, llnode);
> >>       llist_for_each_safe(llnode, t, llist_del_all(&c->waiting_for_gp))
> >>           free_one(c, llnode);
> >> -    llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist))
> >> +    llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist))
> >>           free_one(c, llnode);
> >> -    llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
> >> +    llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist_extra))
> >>           free_one(c, llnode);
> >
> > Acked-by: Stanislav Fomichev <sdf@google.com>
> Thanks for the Acked-by.
> >
> > Seems safe even without the previous patch? OTOH, do we really care
> > about __llist vs llist in the cleanup path? Might be safer to always
> > do llist_del_all everywhere?
> No. free_llist is manipulated by both the irq work and the memory draining
> concurrently before patch #1. Using llist_del_all(&c->free_llist) also doesn't
> help, because the irq work uses the __llist_add()/__llist_del() helpers.
> Basically there is no difference between the __llist and llist helpers for the
> cleanup path, but I think it is better to clarify the possible concurrent
> accesses and codify these assumptions.

But this is still mostly relevant only for the preempt_rt/has_interrupt
case, right?
For non-preempt, the irq work should've finished long before we got to
drain_mem_cache().
Hou Tao Oct. 21, 2022, 1:09 a.m. UTC | #4
Hi,

On 10/21/2022 1:52 AM, Stanislav Fomichev wrote:
> On Wed, Oct 19, 2022 at 6:18 PM Hou Tao <houtao@huaweicloud.com> wrote:
SNIP
>>>> diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
>>>> index 48e606aaacf0..7f45744a09f7 100644
>>>> --- a/kernel/bpf/memalloc.c
>>>> +++ b/kernel/bpf/memalloc.c
>>>> @@ -422,14 +422,17 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
>>>>       /* No progs are using this bpf_mem_cache, but htab_map_free() called
>>>>        * bpf_mem_cache_free() for all remaining elements and they can be in
>>>>        * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
>>>> +     *
>>>> +     * Except for waiting_for_gp list, there are no concurrent operations
>>>> +     * on these lists, so it is safe to use __llist_del_all().
>>>>        */
>>>>       llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
>>>>           free_one(c, llnode);
>>>>       llist_for_each_safe(llnode, t, llist_del_all(&c->waiting_for_gp))
>>>>           free_one(c, llnode);
>>>> -    llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist))
>>>> +    llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist))
>>>>           free_one(c, llnode);
>>>> -    llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
>>>> +    llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist_extra))
>>>>           free_one(c, llnode);
>>> Acked-by: Stanislav Fomichev <sdf@google.com>
>> Thanks for the Acked-by.
>>> Seems safe even without the previous patch? OTOH, do we really care
>>> about __llist vs llist in the cleanup path? Might be safer to always
>>> do llist_del_all everywhere?
>> No. free_llist is manipulated by both the irq work and the memory draining
>> concurrently before patch #1. Using llist_del_all(&c->free_llist) also doesn't
>> help, because the irq work uses the __llist_add()/__llist_del() helpers.
>> Basically there is no difference between the __llist and llist helpers for the
>> cleanup path, but I think it is better to clarify the possible concurrent
>> accesses and codify these assumptions.
> But this is still mostly relevant only for the preempt_rt/has_interrupt
> case, right?
> For non-preempt, the irq work should've finished long before we got to
> drain_mem_cache().
Yes. Concurrent access to free_llist is only possible in the
preempt_rt / does-not-have-interrupt cases.
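
For completeness, a hypothetical sketch of the "wait before drain" idea.
Whether patch #1 uses irq_work_sync() or a different mechanism is not shown in
this excerpt, so treat the helper choice here as an assumption:

/* Hypothetical: make sure the per-CPU refill_work has finished before
 * drain_mem_cache() runs, so free_llist is quiescent even on PREEMPT_RT,
 * where the irq work may run in a per-CPU kthread instead of hard IRQ
 * context.
 */
static void wait_then_drain_sketch(struct bpf_mem_cache *c)
{
	irq_work_sync(&c->refill_work);
	drain_mem_cache(c);
}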

Patch

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 48e606aaacf0..7f45744a09f7 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -422,14 +422,17 @@  static void drain_mem_cache(struct bpf_mem_cache *c)
 	/* No progs are using this bpf_mem_cache, but htab_map_free() called
 	 * bpf_mem_cache_free() for all remaining elements and they can be in
 	 * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
+	 *
+	 * Except for waiting_for_gp list, there are no concurrent operations
+	 * on these lists, so it is safe to use __llist_del_all().
 	 */
 	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
 		free_one(c, llnode);
 	llist_for_each_safe(llnode, t, llist_del_all(&c->waiting_for_gp))
 		free_one(c, llnode);
-	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist))
+	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist))
 		free_one(c, llnode);
-	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
+	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist_extra))
 		free_one(c, llnode);
 }