diff mbox

drm/i915: Fix mutex->owner inspection race under DEBUG_MUTEXES

Message ID 1420192030-10206-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson Jan. 2, 2015, 9:47 a.m. UTC
If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
if the mutex debugging is enabled which introduces a race in our
mutex_is_locked_by() - i.e. we may inspect the old owner value before it
is acquired by the new task.

This is the root cause of this error:

Comments

Daniel Vetter Jan. 5, 2015, 10:51 a.m. UTC | #1
On Fri, Jan 02, 2015 at 09:47:10AM +0000, Chris Wilson wrote:
> If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
> if the mutex debugging is enabled which introduces a race in our
> mutex_is_locked_by() - i.e. we may inspect the old owner value before it
> is acquired by the new task.
> 
> This is the root cause of this error:
> 
> diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
> index 5cf6731..3ef3736 100644
> --- a/kernel/locking/mutex-debug.c
> +++ b/kernel/locking/mutex-debug.c
> @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
>                         DEBUG_LOCKS_WARN_ON(lock->owner != current);
> 
>                 DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
> -               mutex_clear_owner(lock);
>         }
> 
>         /*
>          * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
>          * mutexes so that we can do it here after we've verified state.
>          */
> +       mutex_clear_owner(lock);
>         atomic_set(&lock->count, 1);
>  }
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: stable@vger.kernel.org

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ad55b06a3cb1..4c14db9587a6 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5024,7 +5024,7 @@ static bool mutex_is_locked_by(struct mutex *mutex, struct task_struct *task)
>  	if (!mutex_is_locked(mutex))
>  		return false;
>  
> -#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_MUTEXES)
> +#if defined(CONFIG_SMP) && !defined(CONFIG_DEBUG_MUTEXES)
>  	return mutex->owner == task;
>  #else
>  	/* Since UP may be pre-empted, we cannot assume that we own the lock */
> -- 
> 2.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Jani Nikula Jan. 7, 2015, 12:12 p.m. UTC | #2
On Mon, 05 Jan 2015, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Fri, Jan 02, 2015 at 09:47:10AM +0000, Chris Wilson wrote:
>> If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
>> if the mutex debugging is enabled which introduces a race in our
>> mutex_is_locked_by() - i.e. we may inspect the old owner value before it
>> is acquired by the new task.
>> 
>> This is the root cause of this error:
>> 
>> diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
>> index 5cf6731..3ef3736 100644
>> --- a/kernel/locking/mutex-debug.c
>> +++ b/kernel/locking/mutex-debug.c
>> @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
>>                         DEBUG_LOCKS_WARN_ON(lock->owner != current);
>> 
>>                 DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
>> -               mutex_clear_owner(lock);
>>         }
>> 
>>         /*
>>          * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
>>          * mutexes so that we can do it here after we've verified state.
>>          */
>> +       mutex_clear_owner(lock);
>>         atomic_set(&lock->count, 1);
>>  }
>> 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: stable@vger.kernel.org
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Pushed to drm-intel-fixes, thanks for the patch and review.

BR,
Jani.

>
>> ---
>>  drivers/gpu/drm/i915/i915_gem.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index ad55b06a3cb1..4c14db9587a6 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -5024,7 +5024,7 @@ static bool mutex_is_locked_by(struct mutex *mutex, struct task_struct *task)
>>  	if (!mutex_is_locked(mutex))
>>  		return false;
>>  
>> -#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_MUTEXES)
>> +#if defined(CONFIG_SMP) && !defined(CONFIG_DEBUG_MUTEXES)
>>  	return mutex->owner == task;
>>  #else
>>  	/* Since UP may be pre-empted, we cannot assume that we own the lock */
>> -- 
>> 2.1.4
>> 
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Chris Wilson Jan. 7, 2015, 1:20 p.m. UTC | #3
On Wed, Jan 07, 2015 at 02:12:14PM +0200, Jani Nikula wrote:
> On Mon, 05 Jan 2015, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Fri, Jan 02, 2015 at 09:47:10AM +0000, Chris Wilson wrote:
> >> If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
> >> if the mutex debugging is enabled which introduces a race in our
> >> mutex_is_locked_by() - i.e. we may inspect the old owner value before it
> >> is acquired by the new task.
> >> 
> >> This is the root cause of this error:
> >> 
> >> diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
> >> index 5cf6731..3ef3736 100644
> >> --- a/kernel/locking/mutex-debug.c
> >> +++ b/kernel/locking/mutex-debug.c
> >> @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
> >>                         DEBUG_LOCKS_WARN_ON(lock->owner != current);
> >> 
> >>                 DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
> >> -               mutex_clear_owner(lock);
> >>         }
> >> 
> >>         /*
> >>          * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
> >>          * mutexes so that we can do it here after we've verified state.
> >>          */
> >> +       mutex_clear_owner(lock);
> >>         atomic_set(&lock->count, 1);
> >>  }
> >> 
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: stable@vger.kernel.org
> >
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> Pushed to drm-intel-fixes, thanks for the patch and review.

For the record, I plan to post a revert of this to -next if the core fix
lands upstream (looks good so far).
-Chris
Jani Nikula Jan. 7, 2015, 1:27 p.m. UTC | #4
On Wed, 07 Jan 2015, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, Jan 07, 2015 at 02:12:14PM +0200, Jani Nikula wrote:
>> On Mon, 05 Jan 2015, Daniel Vetter <daniel@ffwll.ch> wrote:
>> > On Fri, Jan 02, 2015 at 09:47:10AM +0000, Chris Wilson wrote:
>> >> If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
>> >> if the mutex debugging is enabled which introduces a race in our
>> >> mutex_is_locked_by() - i.e. we may inspect the old owner value before it
>> >> is acquired by the new task.
>> >> 
>> >> This is the root cause of this error:
>> >> 
>> >> diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
>> >> index 5cf6731..3ef3736 100644
>> >> --- a/kernel/locking/mutex-debug.c
>> >> +++ b/kernel/locking/mutex-debug.c
>> >> @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
>> >>                         DEBUG_LOCKS_WARN_ON(lock->owner != current);
>> >> 
>> >>                 DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
>> >> -               mutex_clear_owner(lock);
>> >>         }
>> >> 
>> >>         /*
>> >>          * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
>> >>          * mutexes so that we can do it here after we've verified state.
>> >>          */
>> >> +       mutex_clear_owner(lock);
>> >>         atomic_set(&lock->count, 1);
>> >>  }
>> >> 
>> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
>> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> >> Cc: stable@vger.kernel.org
>> >
>> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> 
>> Pushed to drm-intel-fixes, thanks for the patch and review.
>
> For the record, I plan to post a revert of this to -next if the core fix
> lands upstream (looks good so far).

Does it look like that could be cc:stable?

BR,
Jani.

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
Chris Wilson Jan. 7, 2015, 1:38 p.m. UTC | #5
On Wed, Jan 07, 2015 at 03:27:41PM +0200, Jani Nikula wrote:
> On Wed, 07 Jan 2015, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Wed, Jan 07, 2015 at 02:12:14PM +0200, Jani Nikula wrote:
> >> On Mon, 05 Jan 2015, Daniel Vetter <daniel@ffwll.ch> wrote:
> >> > On Fri, Jan 02, 2015 at 09:47:10AM +0000, Chris Wilson wrote:
> >> >> If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
> >> >> if the mutex debugging is enabled which introduces a race in our
> >> >> mutex_is_locked_by() - i.e. we may inspect the old owner value before it
> >> >> is acquired by the new task.
> >> >> 
> >> >> This is the root cause of this error:
> >> >> 
> >> >> diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
> >> >> index 5cf6731..3ef3736 100644
> >> >> --- a/kernel/locking/mutex-debug.c
> >> >> +++ b/kernel/locking/mutex-debug.c
> >> >> @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
> >> >>                         DEBUG_LOCKS_WARN_ON(lock->owner != current);
> >> >> 
> >> >>                 DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
> >> >> -               mutex_clear_owner(lock);
> >> >>         }
> >> >> 
> >> >>         /*
> >> >>          * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
> >> >>          * mutexes so that we can do it here after we've verified state.
> >> >>          */
> >> >> +       mutex_clear_owner(lock);
> >> >>         atomic_set(&lock->count, 1);
> >> >>  }
> >> >> 
> >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
> >> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> >> Cc: stable@vger.kernel.org
> >> >
> >> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >> 
> >> Pushed to drm-intel-fixes, thanks for the patch and review.
> >
> > For the record, I plan to post a revert of this to -next if the core fix
> > lands upstream (looks good so far).
> 
> Does it look like that could be cc:stable?

I didn't suggest it should be, so I wait with bated breath. I know
disabling lock-stealing in i915 is fairly safe (some corner cases may
now get oom, but that's a perenial risk anyway) and so felt more
comfortable with pushing this i915 patch through stable.
-Chris
diff mbox

Patch

diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index 5cf6731..3ef3736 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -80,13 +80,13 @@  void debug_mutex_unlock(struct mutex *lock)
                        DEBUG_LOCKS_WARN_ON(lock->owner != current);

                DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
-               mutex_clear_owner(lock);
        }

        /*
         * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
         * mutexes so that we can do it here after we've verified state.
         */
+       mutex_clear_owner(lock);
        atomic_set(&lock->count, 1);
 }

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ad55b06a3cb1..4c14db9587a6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5024,7 +5024,7 @@  static bool mutex_is_locked_by(struct mutex *mutex, struct task_struct *task)
 	if (!mutex_is_locked(mutex))
 		return false;
 
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_MUTEXES)
+#if defined(CONFIG_SMP) && !defined(CONFIG_DEBUG_MUTEXES)
 	return mutex->owner == task;
 #else
 	/* Since UP may be pre-empted, we cannot assume that we own the lock */