
mutex: fix deadlock injection

Message ID 51F775B5.201@canonical.com (mailing list archive)
State New, archived

Commit Message

Maarten Lankhorst July 30, 2013, 8:13 a.m. UTC
The check needs to be for > 1, because ctx->acquired is already incremented.
This will prevent ww_mutex_lock_slow from returning -EDEADLK and not locking
the mutex. It caused a lot of false gpu lockups on radeon with
CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y because a function that shouldn't be able
to return -EDEADLK did.

Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---

Comments

Peter Zijlstra July 30, 2013, 8:41 a.m. UTC | #1
On Tue, Jul 30, 2013 at 10:13:41AM +0200, Maarten Lankhorst wrote:
> The check needs to be for > 1, because ctx->acquired is already incremented.
> This will prevent ww_mutex_lock_slow from returning -EDEADLK and not locking
> the mutex. It caused a lot of false gpu lockups on radeon with
> CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y because a function that shouldn't be able
> to return -EDEADLK did.
> 
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>

Thanks!
Alex Deucher July 30, 2013, 1:09 p.m. UTC | #2
On Tue, Jul 30, 2013 at 4:13 AM, Maarten Lankhorst
<maarten.lankhorst@canonical.com> wrote:
> The check needs to be for > 1, because ctx->acquired is already incremented.
> This will prevent ww_mutex_lock_slow from returning -EDEADLK and not locking
> the mutex. It caused a lot of false gpu lockups on radeon with
> CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y because a function that shouldn't be able
> to return -EDEADLK did.
>

I haven't followed the new reservation stuff too closely, but seems plausible.

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index ff05f4b..a52ee7bb 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -686,7 +686,7 @@ __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
>         might_sleep();
>         ret =  __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
>                                    0, &ctx->dep_map, _RET_IP_, ctx);
> -       if (!ret && ctx->acquired > 0)
> +       if (!ret && ctx->acquired > 1)
>                 return ww_mutex_deadlock_injection(lock, ctx);
>
>         return ret;
> @@ -702,7 +702,7 @@ __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
>         ret = __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
>                                   0, &ctx->dep_map, _RET_IP_, ctx);
>
> -       if (!ret && ctx->acquired > 0)
> +       if (!ret && ctx->acquired > 1)
>                 return ww_mutex_deadlock_injection(lock, ctx);
>
>         return ret;
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
Daniel Vetter Aug. 5, 2013, 7:58 a.m. UTC | #3
On Tue, Jul 30, 2013 at 10:13:41AM +0200, Maarten Lankhorst wrote:
> The check needs to be for > 1, because ctx->acquired is already incremented.
> This will prevent ww_mutex_lock_slow from returning -EDEADLK and not locking
> the mutex. It caused a lot of false gpu lockups on radeon with
> CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y because a function that shouldn't be able
> to return -EDEADLK did.
> 
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>

Oops, thanks for catching this.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index ff05f4b..a52ee7bb 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -686,7 +686,7 @@ __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
>  	might_sleep();
>  	ret =  __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
>  				   0, &ctx->dep_map, _RET_IP_, ctx);
> -	if (!ret && ctx->acquired > 0)
> +	if (!ret && ctx->acquired > 1)
>  		return ww_mutex_deadlock_injection(lock, ctx);
>  
>  	return ret;
> @@ -702,7 +702,7 @@ __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
>  	ret = __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
>  				  0, &ctx->dep_map, _RET_IP_, ctx);
>  
> -	if (!ret && ctx->acquired > 0)
> +	if (!ret && ctx->acquired > 1)
>  		return ww_mutex_deadlock_injection(lock, ctx);
>  
>  	return ret;
Dave Airlie Aug. 7, 2013, 12:05 a.m. UTC | #4
On Tue, Jul 30, 2013 at 6:41 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Jul 30, 2013 at 10:13:41AM +0200, Maarten Lankhorst wrote:
>> The check needs to be for > 1, because ctx->acquired is already incremented.
>> This will prevent ww_mutex_lock_slow from returning -EDEADLK and not locking
>> the mutex. It caused a lot of false gpu lockups on radeon with
>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y because a function that shouldn't be able
>> to return -EDEADLK did.
>>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>

Should this be merged via Ingo? or will I queue it in my -fixes?

Dave.
Maarten Lankhorst Aug. 7, 2013, 6:22 a.m. UTC | #5
On 07-08-13 02:05, Dave Airlie wrote:
> On Tue, Jul 30, 2013 at 6:41 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Tue, Jul 30, 2013 at 10:13:41AM +0200, Maarten Lankhorst wrote:
>>> The check needs to be for > 1, because ctx->acquired is already incremented.
>>> This will prevent ww_mutex_lock_slow from returning -EDEADLK and not locking
>>> the mutex. It caused a lot of false gpu lockups on radeon with
>>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y because a function that shouldn't be able
>>> to return -EDEADLK did.
>>>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> Should this be merged via Ingo? or will I queue it in my -fixes?
>
> Dave.
>
It's in tip:core/urgent, so I imagine you don't need to queue it.

Patch

diff --git a/kernel/mutex.c b/kernel/mutex.c
index ff05f4b..a52ee7bb 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -686,7 +686,7 @@  __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 	might_sleep();
 	ret =  __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
 				   0, &ctx->dep_map, _RET_IP_, ctx);
-	if (!ret && ctx->acquired > 0)
+	if (!ret && ctx->acquired > 1)
 		return ww_mutex_deadlock_injection(lock, ctx);
 
 	return ret;
@@ -702,7 +702,7 @@  __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 	ret = __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
 				  0, &ctx->dep_map, _RET_IP_, ctx);
 
-	if (!ret && ctx->acquired > 0)
+	if (!ret && ctx->acquired > 1)
 		return ww_mutex_deadlock_injection(lock, ctx);
 
 	return ret;
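
The off-by-one described in the commit message can be reproduced in a small user-space model. This is only a sketch, not the kernel code: `struct toy_ctx`, `toy_deadlock_injection()`, `toy_ww_mutex_lock()` and `toy_first_lock()` are hypothetical names, the re-arm value is arbitrary, and the model assumes (as the commit message states) that the lock path has already incremented `acquired` by the time the check runs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ww_acquire_ctx; only what the sketch needs. */
struct toy_ctx {
	unsigned acquired;         /* locks currently held in this context */
	unsigned inject_countdown; /* a fake -EDEADLK is injected when this is 0 */
};

/* Models the debug injection: drop the lock again and report -EDEADLK. */
static int toy_deadlock_injection(struct toy_ctx *ctx)
{
	if (ctx->inject_countdown-- == 0) {
		ctx->inject_countdown = 3; /* arbitrary re-arm value for the sketch */
		ctx->acquired--;           /* the lock is released before failing */
		return -EDEADLK;
	}
	return 0;
}

/*
 * Models the patched check. The lock is taken first, which bumps
 * acquired, and only then is the injection condition evaluated.
 * threshold is 0 for the old check and 1 for the new one.
 */
static int toy_ww_mutex_lock(struct toy_ctx *ctx, unsigned threshold)
{
	assert(ctx != NULL);
	ctx->acquired++; /* assumed to happen inside the common lock path */
	if (ctx->acquired > threshold)
		return toy_deadlock_injection(ctx);
	return 0;
}

/* Result of taking the very first lock in a fresh context. */
static int toy_first_lock(unsigned threshold)
{
	struct toy_ctx ctx = { .acquired = 0, .inject_countdown = 0 };

	return toy_ww_mutex_lock(&ctx, threshold);
}
```

In this model `toy_first_lock(0)` fails with `-EDEADLK` even though no other lock is held, which corresponds to the false failures reported on radeon, while `toy_first_lock(1)` succeeds. Injection remains possible from the second lock onward, so the debug option still exercises the backoff path.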