mbox series

[v2,0/4] drm/tests: Fix locking issues (kind of)

Message ID 20250129-test-kunit-v2-0-fe59c43805d5@kernel.org (mailing list archive)
Headers show
Series drm/tests: Fix locking issues (kind of) | expand

Message

Maxime Ripard Jan. 29, 2025, 2:21 p.m. UTC
Hi,

Here's another attempt at fixing the current locking issues with the
HDMI kunit tests.

The initial issue was reported by Dave here:
https://lore.kernel.org/all/CAPM=9tzJ4-ERDxvuwrCyUPY0=+P44orhp1kLWVGL7MCfpQjMEQ@mail.gmail.com/

After fixing it, there was still a lockdep warning for a circular
dependency. This series is also fixing the issue.

There's still an issue though. When running the tests, I get:

KTAP version 1
1..1
    KTAP version 1
    # Subtest: drm_atomic_helper_connector_hdmi_check
    # module: drm_hdmi_state_helper_test
    1..1

====================================
WARNING: kunit_try_catch/25 still has locks held!
6.13.0-rc2-00410-gbd9d16533367 #18 Tainted: G                 N
------------------------------------
2 locks held by kunit_try_catch/25:
 #0: fff00000021586f0 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x5c/0xf0
 #1: fff0000002158718 (crtc_ww_class_mutex){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x5c/0xf0

stack backtrace:
CPU: 0 UID: 0 PID: 25 Comm: kunit_try_catch Tainted: G                 N 6.13.0-rc2-00410-gbd9d16533367 #18
Tainted: [N]=TEST
Hardware name: linux,dummy-virt (DT)
Call trace:
 show_stack+0x18/0x30 (C)
 dump_stack_lvl+0x70/0x98
 dump_stack+0x18/0x24
 debug_check_no_locks_held+0x9c/0xa4
 do_exit+0x52c/0x970
 kthread_exit+0x28/0x30
 kthread+0xdc/0xf0
 ret_from_fork+0x10/0x20
    ok 1 drm_test_check_hdmi_funcs_reject_rate
ok 1 drm_atomic_helper_connector_hdmi_check

I believe it's due to the fact that drm_kunit_helper_acquire_ctx_alloc()
will acquire the lock directly, but will release it as a deferred kunit
action. It's not unsafe, as the lock is eventually released, but I don't
really know what the best course of action is here:

  * Forget about the idea of a context tied to the lifetime of a test
  * Make lockdep know that it's ok, and we know what to do 

I've tried the latter, using lockdep_pin/unpin_lock, but that didn't fix
the issue so I must have done something wrong.

Let me know what you think,
Maxime

Signed-off-by: Maxime Ripard <mripard@kernel.org>
---
Changes in v2:
- Fix circular dependency warning
- Link to v1: https://lore.kernel.org/r/20241031091558.2435850-1-mripard@kernel.org

---
Maxime Ripard (4):
      drm/tests: hdmi: Fix WW_MUTEX_SLOWPATH failures
      drm/tests: hdmi: Remove redundant assignments
      drm/tests: hdmi: Reorder DRM entities variables assignment
      drm/tests: hdmi: Fix recursive locking

 drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c | 200 +++++++++++----------
 1 file changed, 103 insertions(+), 97 deletions(-)
---
base-commit: e2a81c0cd7de6cb063058be304b18f200c64802b
change-id: 20250129-test-kunit-5ba3c03bffb0

Best regards,

Comments

Simona Vetter Feb. 4, 2025, 11 a.m. UTC | #1
On Wed, Jan 29, 2025 at 03:21:52PM +0100, Maxime Ripard wrote:
> Hi,
> 
> Here's another attempt at fixing the current locking issues with the
> HDMI kunit tests.
> 
> The initial issue was reported by Dave here:
> https://lore.kernel.org/all/CAPM=9tzJ4-ERDxvuwrCyUPY0=+P44orhp1kLWVGL7MCfpQjMEQ@mail.gmail.com/
> 
> After fixing it, there was still a lockdep warning for a circular
> dependency. This series is also fixing the issue.

So this looks like it's a kthread_exit, which yes is broken. You cannot
acquire a lock in one thread and release it in another thread, that does
not work for lockdep and therefore is forbidden for mutexes.

It's kinda allowed for semaphore, but that's why semaphores cannot be
automatically checked by lockdep. So yeah we cannot use such a deferred
action, it would need to be a deferred action that's run synchronously.
-Sima

> 
> There's still an issue though. When running the tests, I get:
> 
> KTAP version 1
> 1..1
>     KTAP version 1
>     # Subtest: drm_atomic_helper_connector_hdmi_check
>     # module: drm_hdmi_state_helper_test
>     1..1
> 
> ====================================
> WARNING: kunit_try_catch/25 still has locks held!
> 6.13.0-rc2-00410-gbd9d16533367 #18 Tainted: G                 N
> ------------------------------------
> 2 locks held by kunit_try_catch/25:
>  #0: fff00000021586f0 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x5c/0xf0
>  #1: fff0000002158718 (crtc_ww_class_mutex){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x5c/0xf0
> 
> stack backtrace:
> CPU: 0 UID: 0 PID: 25 Comm: kunit_try_catch Tainted: G                 N 6.13.0-rc2-00410-gbd9d16533367 #18
> Tainted: [N]=TEST
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  show_stack+0x18/0x30 (C)
>  dump_stack_lvl+0x70/0x98
>  dump_stack+0x18/0x24
>  debug_check_no_locks_held+0x9c/0xa4
>  do_exit+0x52c/0x970
>  kthread_exit+0x28/0x30
>  kthread+0xdc/0xf0
>  ret_from_fork+0x10/0x20
>     ok 1 drm_test_check_hdmi_funcs_reject_rate
> ok 1 drm_atomic_helper_connector_hdmi_check
> 
> I believe it's due to the fact that drm_kunit_helper_acquire_ctx_alloc()
> will acquire the lock directly, but will release it as a deferred kunit
> action. It's not unsafe, as the lock is eventually released, but I don't
> really know what the best course of action is here:
> 
>   * Forget about the idea of a context tied to the lifetime of a test
>   * Make lockdep know that it's ok, and we know what to do 
> 
> I've tried the latter, using lockdep_pin/unpin_lock, but that didn't fix
> the issue so I must have done something wrong.
> 
> Let me know what you think,
> Maxime
> 
> Signed-off-by: Maxime Ripard <mripard@kernel.org>
> ---
> Changes in v2:
> - Fix circular dependency warning
> - Link to v1: https://lore.kernel.org/r/20241031091558.2435850-1-mripard@kernel.org
> 
> ---
> Maxime Ripard (4):
>       drm/tests: hdmi: Fix WW_MUTEX_SLOWPATH failures
>       drm/tests: hdmi: Remove redundant assignments
>       drm/tests: hdmi: Reorder DRM entities variables assignment
>       drm/tests: hdmi: Fix recursive locking
> 
>  drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c | 200 +++++++++++----------
>  1 file changed, 103 insertions(+), 97 deletions(-)
> ---
> base-commit: e2a81c0cd7de6cb063058be304b18f200c64802b
> change-id: 20250129-test-kunit-5ba3c03bffb0
> 
> Best regards,
> -- 
> Maxime Ripard <mripard@kernel.org>
>