mbox series

[bpf-next,0/3] Fix lockdep warning for htab of map

Message ID 20241106063542.357743-1-houtao@huaweicloud.com (mailing list archive)
Headers show
Series Fix lockdep warning for htab of map | expand

Message

Hou Tao Nov. 6, 2024, 6:35 a.m. UTC
From: Hou Tao <houtao1@huawei.com>

Hi,

The patch set fixes a lockdep warning for htab of map. The
warning is found when running test_maps. The warning occurs when
htab_put_fd_value() attempts to acquire map_idr_lock to free the map id
of the inner map while already holding the bucket lock (raw_spinlock_t).

The fix moves the invocation of free_htab_elem() after
htab_unlock_bucket() and adds a test case to verify the solution. Please
see the individual patches for details. Comments are always welcome.

Hou Tao (3):
  bpf: Call free_htab_elem() after htab_unlock_bucket()
  selftests/bpf: Move ENOTSUPP from bpf_util.h
  selftests/bpf: Test the update operations for htab of maps

 kernel/bpf/hashtab.c                          |  56 +++++---
 tools/testing/selftests/bpf/bpf_util.h        |   4 +
 .../selftests/bpf/prog_tests/bpf_tcp_ca.c     |   4 -
 .../selftests/bpf/prog_tests/lsm_cgroup.c     |   4 -
 .../selftests/bpf/prog_tests/map_in_map.c     | 132 +++++++++++++++++-
 .../selftests/bpf/prog_tests/sock_addr.c      |   4 -
 .../selftests/bpf/progs/update_map_in_htab.c  |  30 ++++
 tools/testing/selftests/bpf/test_maps.c       |   4 -
 tools/testing/selftests/bpf/test_verifier.c   |   4 -
 9 files changed, 204 insertions(+), 38 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/update_map_in_htab.c

Comments

Sebastian Andrzej Siewior Nov. 6, 2024, 8:45 a.m. UTC | #1
On 2024-11-06 14:35:39 [+0800], Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> Hi,
Hi Hou,

> The patch set fixes a lockdep warning for htab of map. The
> warning is found when running test_maps. The warning occurs when
> htab_put_fd_value() attempts to acquire map_idr_lock to free the map id
> of the inner map while already holding the bucket lock (raw_spinlock_t).
> 
> The fix moves the invocation of free_htab_elem() after
> htab_unlock_bucket() and adds a test case to verify the solution. Please
> see the individual patches for details. Comments are always welcome.

Thank you.

Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

I've seen that you didn't move check_and_free_fields() out of the bucket
locked section. Type BPF_TIMER does hrtimer_cancel() if the timer
happens to run on a remote CPU. On PREEMPT_RT this will acquire a
sleeping lock which is problematic due to the raw_spinlock_t.
Would it be okay, to cleanup the timer unconditionally via the
workqueue?

Sebastian
Hou Tao Nov. 6, 2024, 9:48 a.m. UTC | #2
Hi,

On 11/6/2024 4:45 PM, Sebastian Andrzej Siewior wrote:
> On 2024-11-06 14:35:39 [+0800], Hou Tao wrote:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> Hi,
> Hi Hou,
>
>> The patch set fixes a lockdep warning for htab of map. The
>> warning is found when running test_maps. The warning occurs when
>> htab_put_fd_value() attempts to acquire map_idr_lock to free the map id
>> of the inner map while already holding the bucket lock (raw_spinlock_t).
>>
>> The fix moves the invocation of free_htab_elem() after
>> htab_unlock_bucket() and adds a test case to verify the solution. Please
>> see the individual patches for details. Comments are always welcome.
> Thank you.
>
> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>
> I've seen that you didn't move check_and_free_fields() out of the bucket
> locked section. Type BPF_TIMER does hrtimer_cancel() if the timer
> happens to run on a remote CPU. On PREEMPT_RT this will acquire a
> sleeping lock which is problematic due to the raw_spinlock_t.
> Would it be okay, to cleanup the timer unconditionally via the
> workqueue?

Yes. The patch set still invokes check_and_free_fields() under the
bucket lock when updating an existing element in a pre-allocated htab. I
missed the hrtimer case. For the sleeping lock, you mean the
cpu_base->softirq_expiry_lock in hrtimer_cancel_waiting_running(), right
? Instead of cancelling the timer in workqueue, maybe we could save the
old value temporarily in the bucket lock, and try to free it outside of
the bucket lock or disabling the extra_elems logic temporarily for the
case ?
> Sebastian
Sebastian Andrzej Siewior Nov. 6, 2024, 10:05 a.m. UTC | #3
On 2024-11-06 17:48:59 [+0800], Hou Tao wrote:
> Hi,
Hi,

> Yes. The patch set still invokes check_and_free_fields() under the
> bucket lock when updating an existing element in a pre-allocated htab. I
> missed the hrtimer case. For the sleeping lock, you mean the
> cpu_base->softirq_expiry_lock in hrtimer_cancel_waiting_running(), right

Yes.

> ? Instead of cancelling the timer in workqueue, maybe we could save the
> old value temporarily in the bucket lock, and try to free it outside of
> the bucket lock or disabling the extra_elems logic temporarily for the
> case ?

Well, it is up to you. Either:

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 1a43d06eab286..b077af12fc9b4 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1593,10 +1593,7 @@ void bpf_timer_cancel_and_free(void *val)
 	 * To avoid these issues, punt to workqueue context when we are in a
 	 * timer callback.
 	 */
-	if (this_cpu_read(hrtimer_running))
-		queue_work(system_unbound_wq, &t->cb.delete_work);
-	else
-		bpf_timer_delete_work(&t->cb.delete_work);
+	queue_work(system_unbound_wq, &t->cb.delete_work);
 }
 
 /* This function is called by map_delete/update_elem for individual element and

Or something smarter where you cancel the timer outside of the bucket
lock.

Sebastian