mutex: Report recursive ww_mutex locking early

Message ID 1464293297-19777-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson May 26, 2016, 8:08 p.m. UTC
Recursive locking for ww_mutexes was originally conceived as an
exception. However, it is heavily used by the DRM atomic modesetting
code. Currently, the recursive deadlock is checked after we have queued
up for a busy-spin and as we never release the lock, we spin until
kicked, whereupon the deadlock is discovered and reported.

A simple solution for the now common problem is to move the recursive
deadlock discovery to the first action when taking the ww_mutex.

Testcase: igt/kms_cursor_legacy
Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: linux-kernel@vger.kernel.org
---

Maarten suggested this as a simpler fix to the immediate problem. Imo,
we still want to perform deadlock detection within the spin in order to
catch more complicated deadlocks without osq_lock() forcing fairness!
-Chris

---
 kernel/locking/mutex.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
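
For illustration, a minimal caller-side sketch of the recursive case the
patch now reports early. The class, mutex and function names below
(demo_ww_class, obj, demo_recursive_lock) are made up and not part of the
patch; they only show the -EALREADY path:

#include <linux/ww_mutex.h>

static DEFINE_WW_CLASS(demo_ww_class);
static struct ww_mutex obj;

static int demo_recursive_lock(void)
{
	struct ww_acquire_ctx ctx;
	int ret;

	ww_mutex_init(&obj, &demo_ww_class);
	ww_acquire_init(&ctx, &demo_ww_class);

	ret = ww_mutex_lock(&obj, &ctx);	/* first acquisition succeeds */
	if (ret)
		goto out_fini;

	/*
	 * Second acquisition with the same acquire context.  Before this
	 * patch the thread first queued up for the busy-spin; with it,
	 * ww->ctx is compared against ww_ctx on entry and -EALREADY is
	 * returned immediately.
	 */
	ret = ww_mutex_lock(&obj, &ctx);

	ww_mutex_unlock(&obj);			/* drop the first acquisition */
out_fini:
	ww_acquire_fini(&ctx);
	return ret;				/* -EALREADY in the recursive case */
}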

Comments

Maarten Lankhorst May 30, 2016, 7:43 a.m. UTC | #1
Op 26-05-16 om 22:08 schreef Chris Wilson:
> Recursive locking for ww_mutexes was originally conceived as an
> exception. However, it is heavily used by the DRM atomic modesetting
> code. Currently, the recursive deadlock is checked after we have queued
> up for a busy-spin and as we never release the lock, we spin until
> kicked, whereupon the deadlock is discovered and reported.
>
> A simple solution for the now common problem is to move the recursive
> deadlock discovery to the first action when taking the ww_mutex.
>
> Testcase: igt/kms_cursor_legacy
> Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: linux-kernel@vger.kernel.org
> ---
>
> Maarten suggested this as a simpler fix to the immediate problem. Imo,
> we still want to perform deadlock detection within the spin in order to
> catch more complicated deadlocks without osq_lock() forcing fairness!
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Should this be Cc: stable@vger.kernel.org ?

I think in the normal case things would move forward even with osq_lock,
but you can make a separate patch to add it to mutex_can_spin_on_owner,
with the same comment as in mutex_optimistic_spin.
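
For reference, a rough sketch of the kind of check such a follow-up could
add before queueing on the osq lock. This is not part of this patch; the
helper name and signature are made up and assume the acquire context were
threaded through to the spin decision:

static inline bool ww_mutex_spin_is_futile(struct mutex *lock,
					   struct ww_acquire_ctx *ww_ctx,
					   const bool use_ww_ctx)
{
	/* Only relevant once this context already holds other ww_mutexes. */
	if (!use_ww_ctx || !ww_ctx->acquired)
		return false;

	/*
	 * Same rationale as the ww->ctx check in mutex_optimistic_spin():
	 * once another acquire context owns the ww_mutex, spinning cannot
	 * succeed without going through the wait/wound handling in the
	 * slow path, so don't bother queueing on the osq lock at all.
	 */
	return READ_ONCE(container_of(lock, struct ww_mutex, base)->ctx) != NULL;
}

mutex_can_spin_on_owner() would then return 0 whenever this helper says
spinning is futile.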
> ---
>  kernel/locking/mutex.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index d60f1ba3e64f..1659398dc8f8 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -502,9 +502,6 @@ __ww_mutex_lock_check_stamp(struct mutex *lock, struct ww_acquire_ctx *ctx)
>  	if (!hold_ctx)
>  		return 0;
>  
> -	if (unlikely(ctx == hold_ctx))
> -		return -EALREADY;
> -
>  	if (ctx->stamp - hold_ctx->stamp <= LONG_MAX &&
>  	    (ctx->stamp != hold_ctx->stamp || ctx > hold_ctx)) {
>  #ifdef CONFIG_DEBUG_MUTEXES
> @@ -530,6 +527,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  	unsigned long flags;
>  	int ret;
>  
> +	if (use_ww_ctx) {
> +		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
> +		if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
> +			return -EALREADY;
> +	}
> +
>  	preempt_disable();
>  	mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
>
Peter Zijlstra May 30, 2016, 9:11 a.m. UTC | #2
On Mon, May 30, 2016 at 09:43:53AM +0200, Maarten Lankhorst wrote:
> Op 26-05-16 om 22:08 schreef Chris Wilson:
> > Recursive locking for ww_mutexes was originally conceived as an
> > exception. However, it is heavily used by the DRM atomic modesetting
> > code. Currently, the recursive deadlock is checked after we have queued
> > up for a busy-spin and as we never release the lock, we spin until
> > kicked, whereupon the deadlock is discovered and reported.
> >
> > A simple solution for the now common problem is to move the recursive
> > deadlock discovery to the first action when taking the ww_mutex.
> >
> > Testcase: igt/kms_cursor_legacy

I've no idea what this tag is or where to find the actual testcase, so
I've killed it.

> > Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >
> > Maarten suggested this as a simpler fix to the immediate problem. Imo,
> > we still want to perform deadlock detection within the spin in order to
> > catch more complicated deadlocks without osq_lock() forcing fairness!
> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> 
> Should this be Cc: stable@vger.kernel.org ?

Can do; how far back?
Maarten Lankhorst May 30, 2016, 9:43 a.m. UTC | #3
Op 30-05-16 om 11:11 schreef Peter Zijlstra:
> On Mon, May 30, 2016 at 09:43:53AM +0200, Maarten Lankhorst wrote:
>> Op 26-05-16 om 22:08 schreef Chris Wilson:
>>> Recursive locking for ww_mutexes was originally conceived as an
>>> exception. However, it is heavily used by the DRM atomic modesetting
>>> code. Currently, the recursive deadlock is checked after we have queued
>>> up for a busy-spin and as we never release the lock, we spin until
>>> kicked, whereupon the deadlock is discovered and reported.
>>>
>>> A simple solution for the now common problem is to move the recursive
>>> deadlock discovery to the first action when taking the ww_mutex.
>>>
>>> Testcase: igt/kms_cursor_legacy
> I've no idea what this tag is or where to find the actual testcase, so
> I've killed it.
https://cgit.freedesktop.org/xorg/app/intel-gpu-tools/

tests/kms_cursor_legacy tries to do as many updates as possible with SCHED_RR.

Patch not applied, SCHED_RR:
# ./kms_cursor_legacy 
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[3] count=86
[2] count=91
[1] count=78
[0] count=104
Subtest stress-bo: SUCCESS (22,372s)

Patch not applied, SCHED_NORMAL:
# ./kms_cursor_legacy 
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[2] count=4713
[0] count=4288
[3] count=4776
[1] count=4521
Subtest stress-bo: SUCCESS (21,492s)

Patch applied, SCHED_NORMAL and SCHED_RR give roughly the same results:
# nfs/intel-gpu-tools/tests/kms_cursor_legacy 
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[0] count=77631
[1] count=77740
[3] count=77612
[2] count=77666
Subtest stress-bo: SUCCESS (21,487s)

>>> Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: linux-kernel@vger.kernel.org
>>> ---
>>>
>>> Maarten suggested this as a simpler fix to the immediate problem. Imo,
>>> we still want to perform deadlock detection within the spin in order to
>>> catch more complicated deadlocks without osq_lock() forcing fairness!
>> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>
>> Should this be Cc: stable@vger.kernel.org ?
> Can do; how far back?
>
Peter Zijlstra May 30, 2016, 10:27 a.m. UTC | #4
On Mon, May 30, 2016 at 11:43:31AM +0200, Maarten Lankhorst wrote:
> Patch not applied, SCHED_RR:

ww_mutex isn't RT aware at all; it's one of the things I still have on a
todo list. Should I look harder at finding time for this?
Chris Wilson May 30, 2016, 10:45 a.m. UTC | #5
On Mon, May 30, 2016 at 12:27:46PM +0200, Peter Zijlstra wrote:
> On Mon, May 30, 2016 at 11:43:31AM +0200, Maarten Lankhorst wrote:
> > Patch not applied, SCHED_RR:
> 
> ww_mutex isn't RT aware at all; its one of the things I still have on a
> todo list. Should I look harder at finding time for this?

The RT usage in the test is just to try and starve the kernel threads
that may be used behind the atomic modeset - a problem we have
encountered in the past. Afaik, no one is using ww_mutex from RT in the
wild; calling the atomic modeset from RT was just a shortcut to having
the system fully populated with RT threads. To be more realistic we
should be using a couple of normal modesetting threads vs a set of RT
cpu hogs.

Otoh, i915.ko always draws the ire of rt-linux so ww_mutex is likely to
be in their sights in the near future (when i915.ko completes its
transition to full atomic modesetting).
-Chris
Maarten Lankhorst May 30, 2016, 11:16 a.m. UTC | #6
Op 30-05-16 om 12:45 schreef Chris Wilson:
> On Mon, May 30, 2016 at 12:27:46PM +0200, Peter Zijlstra wrote:
>> On Mon, May 30, 2016 at 11:43:31AM +0200, Maarten Lankhorst wrote:
>>> Patch not applied, SCHED_RR:
>> ww_mutex isn't RT aware at all; its one of the things I still have on a
>> todo list. Should I look harder at finding time for this?
> The RT usage in the test is to just try and starve the kernel threads
> that may be used behind the atomic modeset - a problem we have
> encountered in the past. Afaik, no one is using ww_mutex from RT in the
> wild, calling the atomic modeset from the RT was just a shortcut to
> having the system fully populated with RT threads. To be more realistic
> we should be using a couple of normal modesetting threads vs a set of RT
> cpu hogs.
Yeah, unfortunately this doesn't work as you intend it to. You'd need to
spawn a few more threads at slightly lower priority, so that when a thread
is blocked waiting to acquire the mutexes, the workqueues still can't run.

ssh is still responsive with the rest running.
> Otoh, i915.ko always draws the ire of rt-linux so ww_mutex is likely to
> be in their sights in the near future (when i915.ko completes its
> transition to full atomic modesetting).

Patch

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index d60f1ba3e64f..1659398dc8f8 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -502,9 +502,6 @@  __ww_mutex_lock_check_stamp(struct mutex *lock, struct ww_acquire_ctx *ctx)
 	if (!hold_ctx)
 		return 0;
 
-	if (unlikely(ctx == hold_ctx))
-		return -EALREADY;
-
 	if (ctx->stamp - hold_ctx->stamp <= LONG_MAX &&
 	    (ctx->stamp != hold_ctx->stamp || ctx > hold_ctx)) {
 #ifdef CONFIG_DEBUG_MUTEXES
@@ -530,6 +527,12 @@  __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	unsigned long flags;
 	int ret;
 
+	if (use_ww_ctx) {
+		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
+		if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
+			return -EALREADY;
+	}
+
 	preempt_disable();
 	mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);