Message ID | 20250129-test-kunit-v2-0-fe59c43805d5@kernel.org (mailing list archive) |
---|---|
Headers | show |
Series | drm/tests: Fix locking issues (kind of) | expand |
On Wed, Jan 29, 2025 at 03:21:52PM +0100, Maxime Ripard wrote: > Hi, > > Here's another attempt at fixing the current locking issues with the > HDMI kunit tests. > > The initial issue was reported by Dave here: > https://lore.kernel.org/all/CAPM=9tzJ4-ERDxvuwrCyUPY0=+P44orhp1kLWVGL7MCfpQjMEQ@mail.gmail.com/ > > After fixing it, there was still a lockdep warning for a circular > dependency. This series is also fixing the issue. So this looks like it's a kthread_exit, which yes is broken. You cannot acquire a lock in one thread and release it in another thread, that does not work for lockdep and therefore is forbidden for mutexes. It's kinda allowed for semaphore, but that's why semaphores cannot be automatically checked by lockdep. So yeah we cannot use such a deferred action, it would need to be a deferred action that's run synchronously. -Sima > > There's still an issue though. When running the tests, I get: > > KTAP version 1 > 1..1 > KTAP version 1 > # Subtest: drm_atomic_helper_connector_hdmi_check > # module: drm_hdmi_state_helper_test > 1..1 > > ==================================== > WARNING: kunit_try_catch/25 still has locks held! > 6.13.0-rc2-00410-gbd9d16533367 #18 Tainted: G N > ------------------------------------ > 2 locks held by kunit_try_catch/25: > #0: fff00000021586f0 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x5c/0xf0 > #1: fff0000002158718 (crtc_ww_class_mutex){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x5c/0xf0 > > stack backtrace: > CPU: 0 UID: 0 PID: 25 Comm: kunit_try_catch Tainted: G N 6.13.0-rc2-00410-gbd9d16533367 #18 > Tainted: [N]=TEST > Hardware name: linux,dummy-virt (DT) > Call trace: > show_stack+0x18/0x30 (C) > dump_stack_lvl+0x70/0x98 > dump_stack+0x18/0x24 > debug_check_no_locks_held+0x9c/0xa4 > do_exit+0x52c/0x970 > kthread_exit+0x28/0x30 > kthread+0xdc/0xf0 > ret_from_fork+0x10/0x20 > ok 1 drm_test_check_hdmi_funcs_reject_rate > ok 1 drm_atomic_helper_connector_hdmi_check > > I believe it's due to the fact that drm_kunit_helper_acquire_ctx_alloc() > will acquire the lock directly, but will release it as a deferred kunit > action. It's not unsafe, as the lock is eventually released, but I don't > really know what the best course of action is here: > > * Forget about the idea of a context tied to the lifetime of a test > * Make lockdep know that it's ok, and we know what to do > > I've tried the latter, using lockdep_pin/unpin_lock, but that didn't fix > the issue so I must have done something wrong. > > Let me know what you think, > Maxime > > Signed-off-by: Maxime Ripard <mripard@kernel.org> > --- > Changes in v2: > - Fix circular dependency warning > - Link to v1: https://lore.kernel.org/r/20241031091558.2435850-1-mripard@kernel.org > > --- > Maxime Ripard (4): > drm/tests: hdmi: Fix WW_MUTEX_SLOWPATH failures > drm/tests: hdmi: Remove redundant assignments > drm/tests: hdmi: Reorder DRM entities variables assignment > drm/tests: hdmi: Fix recursive locking > > drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c | 200 +++++++++++---------- > 1 file changed, 103 insertions(+), 97 deletions(-) > --- > base-commit: e2a81c0cd7de6cb063058be304b18f200c64802b > change-id: 20250129-test-kunit-5ba3c03bffb0 > > Best regards, > -- > Maxime Ripard <mripard@kernel.org> >
Hi, Here's another attempt at fixing the current locking issues with the HDMI kunit tests. The initial issue was reported by Dave here: https://lore.kernel.org/all/CAPM=9tzJ4-ERDxvuwrCyUPY0=+P44orhp1kLWVGL7MCfpQjMEQ@mail.gmail.com/ After fixing it, there was still a lockdep warning for a circular dependency. This series is also fixing the issue. There's still an issue though. When running the tests, I get: KTAP version 1 1..1 KTAP version 1 # Subtest: drm_atomic_helper_connector_hdmi_check # module: drm_hdmi_state_helper_test 1..1 ==================================== WARNING: kunit_try_catch/25 still has locks held! 6.13.0-rc2-00410-gbd9d16533367 #18 Tainted: G N ------------------------------------ 2 locks held by kunit_try_catch/25: #0: fff00000021586f0 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x5c/0xf0 #1: fff0000002158718 (crtc_ww_class_mutex){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x5c/0xf0 stack backtrace: CPU: 0 UID: 0 PID: 25 Comm: kunit_try_catch Tainted: G N 6.13.0-rc2-00410-gbd9d16533367 #18 Tainted: [N]=TEST Hardware name: linux,dummy-virt (DT) Call trace: show_stack+0x18/0x30 (C) dump_stack_lvl+0x70/0x98 dump_stack+0x18/0x24 debug_check_no_locks_held+0x9c/0xa4 do_exit+0x52c/0x970 kthread_exit+0x28/0x30 kthread+0xdc/0xf0 ret_from_fork+0x10/0x20 ok 1 drm_test_check_hdmi_funcs_reject_rate ok 1 drm_atomic_helper_connector_hdmi_check I believe it's due to the fact that drm_kunit_helper_acquire_ctx_alloc() will acquire the lock directly, but will release it as a deferred kunit action. It's not unsafe, as the lock is eventually released, but I don't really know what the best course of action is here: * Forget about the idea of a context tied to the lifetime of a test * Make lockdep know that it's ok, and we know what to do I've tried the latter, using lockdep_pin/unpin_lock, but that didn't fix the issue so I must have done something wrong. Let me know what you think, Maxime Signed-off-by: Maxime Ripard <mripard@kernel.org> --- Changes in v2: - Fix circular dependency warning - Link to v1: https://lore.kernel.org/r/20241031091558.2435850-1-mripard@kernel.org --- Maxime Ripard (4): drm/tests: hdmi: Fix WW_MUTEX_SLOWPATH failures drm/tests: hdmi: Remove redundant assignments drm/tests: hdmi: Reorder DRM entities variables assignment drm/tests: hdmi: Fix recursive locking drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c | 200 +++++++++++---------- 1 file changed, 103 insertions(+), 97 deletions(-) --- base-commit: e2a81c0cd7de6cb063058be304b18f200c64802b change-id: 20250129-test-kunit-5ba3c03bffb0 Best regards,