[1/3] locking/ww_mutex: cleanup lock->ctx usage in amdgpu

Message ID	20180215141944.4332-1-christian.koenig@amd.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> From: "=?UTF-8?q?Christian=20K=C3=B6nig?=" <ckoenig.leichtzumerken@gmail.com> To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org Subject: [PATCH 1/3] locking/ww_mutex: cleanup lock->ctx usage in amdgpu Date: Thu, 15 Feb 2018 15:19:42 +0100 Message-Id: <20180215141944.4332-1-christian.koenig@amd.com> MIME-Version: 1.0 Precedence: list Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Christian König Feb. 15, 2018, 2:19 p.m. UTC

amdgpu needs to verify if userspace sends us valid addresses and the simplest
way of doing this is to check if the buffer object is locked with the ticket
of the current submission.

Clean up the access to the ww_mutex internals by providing a function
for this and extend the check to the thread owning the underlying mutex.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  3 ++-
 include/linux/ww_mutex.h               | 17 +++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

Alex Deucher Feb. 15, 2018, 8:36 p.m. UTC | #1

On Thu, Feb 15, 2018 at 9:19 AM, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> amdgpu needs to verify if userspace sends us valid addresses and the simplest
> way of doing this is to check if the buffer object is locked with the ticket
> of the current submission.
>
> Clean up the access to the ww_mutex internals by providing a function
> for this and extend the check to the thread owning the underlying mutex.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>

Shouldn't this be two patches?  One to add the new ww_mutex code and
one to update amdgpu?

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  3 ++-
>  include/linux/ww_mutex.h               | 17 +++++++++++++++++
>  2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index eaa3cb0c3ad1..4c04b560e358 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1594,7 +1594,8 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
>         *map = mapping;
>
>         /* Double check that the BO is reserved by this CS */
> -       if (READ_ONCE((*bo)->tbo.resv->lock.ctx) != &parser->ticket)
> +       if (!ww_mutex_is_owned_by(&(*bo)->tbo.resv->lock, current,
> +                                 &parser->ticket))
>                 return -EINVAL;
>
>         if (!((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
> index 39fda195bf78..dd580db289e8 100644
> --- a/include/linux/ww_mutex.h
> +++ b/include/linux/ww_mutex.h
> @@ -358,4 +358,21 @@ static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
>         return mutex_is_locked(&lock->base);
>  }
>
> +/**
> + * ww_mutex_is_owned_by - is the w/w mutex locked by this task in that context
> + * @lock: the mutex to be queried
> + * @task: the task structure to check
> + * @ctx: the w/w acquire context to test
> + *
> + * Returns true if the mutex is locked in the context by the given task, false
> + * otherwise.
> + */
> +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
> +                                       struct task_struct *task,
> +                                       struct ww_acquire_ctx *ctx)
> +{
> +       return likely(__mutex_owner(&lock->base) == task) &&
> +               READ_ONCE(lock->ctx) == ctx;
> +}
> +
>  #endif
> --
> 2.14.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Daniel Vetter Feb. 19, 2018, 3:24 p.m. UTC | #2

On Thu, Feb 15, 2018 at 03:19:42PM +0100, Christian König wrote:
> amdgpu needs to verify if userspace sends us valid addresses and the simplest
> way of doing this is to check if the buffer object is locked with the ticket
> of the current submission.
> 
> Clean up the access to the ww_mutex internals by providing a function
> for this and extend the check to the thread owning the underlying mutex.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  3 ++-
>  include/linux/ww_mutex.h               | 17 +++++++++++++++++
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index eaa3cb0c3ad1..4c04b560e358 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1594,7 +1594,8 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
>  	*map = mapping;
>  
>  	/* Double check that the BO is reserved by this CS */
> -	if (READ_ONCE((*bo)->tbo.resv->lock.ctx) != &parser->ticket)
> +	if (!ww_mutex_is_owned_by(&(*bo)->tbo.resv->lock, current,
> +				  &parser->ticket))
>  		return -EINVAL;
>  
>  	if (!((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
> index 39fda195bf78..dd580db289e8 100644
> --- a/include/linux/ww_mutex.h
> +++ b/include/linux/ww_mutex.h
> @@ -358,4 +358,21 @@ static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
>  	return mutex_is_locked(&lock->base);
>  }
>  
> +/**
> + * ww_mutex_is_owned_by - is the w/w mutex locked by this task in that context
> + * @lock: the mutex to be queried
> + * @task: the task structure to check
> + * @ctx: the w/w acquire context to test
> + *
> + * Returns true if the mutex is locked in the context by the given task, false
> + * otherwise.
> + */
> +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
> +					struct task_struct *task,
> +					struct ww_acquire_ctx *ctx)
> +{
> +	return likely(__mutex_owner(&lock->base) == task) &&
> +		READ_ONCE(lock->ctx) == ctx;

Just comparing the context should be good enough. If you ever pass a
ww_acquire_ctx which does not belong to your own thread your seriously
wreaking things much worse already (and if we do catch that, should
probably lock the ctx to a given task when ww-mutex debugging is enabled).

That also simplifies the function signature.

Of course that means if you don't have a ctx, you can't test ownership of
a ww_mute, but I think that's not a really valid use-case. And not needed
for cmd submission, where you need the ctx anyway.

Besides this interface nit looks all good. With the task check&parameter
removed:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

-Daniel

> +}
> +
>  #endif
> -- 
> 2.14.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Christian König Feb. 19, 2018, 3:41 p.m. UTC | #3

Am 19.02.2018 um 16:24 schrieb Daniel Vetter:
> On Thu, Feb 15, 2018 at 03:19:42PM +0100, Christian König wrote:
>> amdgpu needs to verify if userspace sends us valid addresses and the simplest
>> way of doing this is to check if the buffer object is locked with the ticket
>> of the current submission.
>>
>> Clean up the access to the ww_mutex internals by providing a function
>> for this and extend the check to the thread owning the underlying mutex.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  3 ++-
>>   include/linux/ww_mutex.h               | 17 +++++++++++++++++
>>   2 files changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index eaa3cb0c3ad1..4c04b560e358 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -1594,7 +1594,8 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
>>   	*map = mapping;
>>   
>>   	/* Double check that the BO is reserved by this CS */
>> -	if (READ_ONCE((*bo)->tbo.resv->lock.ctx) != &parser->ticket)
>> +	if (!ww_mutex_is_owned_by(&(*bo)->tbo.resv->lock, current,
>> +				  &parser->ticket))
>>   		return -EINVAL;
>>   
>>   	if (!((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
>> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
>> index 39fda195bf78..dd580db289e8 100644
>> --- a/include/linux/ww_mutex.h
>> +++ b/include/linux/ww_mutex.h
>> @@ -358,4 +358,21 @@ static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
>>   	return mutex_is_locked(&lock->base);
>>   }
>>   
>> +/**
>> + * ww_mutex_is_owned_by - is the w/w mutex locked by this task in that context
>> + * @lock: the mutex to be queried
>> + * @task: the task structure to check
>> + * @ctx: the w/w acquire context to test
>> + *
>> + * Returns true if the mutex is locked in the context by the given task, false
>> + * otherwise.
>> + */
>> +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
>> +					struct task_struct *task,
>> +					struct ww_acquire_ctx *ctx)
>> +{
>> +	return likely(__mutex_owner(&lock->base) == task) &&
>> +		READ_ONCE(lock->ctx) == ctx;
> Just comparing the context should be good enough. If you ever pass a
> ww_acquire_ctx which does not belong to your own thread your seriously
> wreaking things much worse already (and if we do catch that, should
> probably lock the ctx to a given task when ww-mutex debugging is enabled).
>
> That also simplifies the function signature.
>
> Of course that means if you don't have a ctx, you can't test ownership of
> a ww_mute, but I think that's not a really valid use-case.

Well exactly that is the use case in TTM, see patch #3 in this series.

In TTM the evicted BOs are trylocked and so we need a way of testing for 
ownership without a context.

Christian.

>   And not needed
> for cmd submission, where you need the ctx anyway.
>
> Besides this interface nit looks all good. With the task check&parameter
> removed:
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> -Daniel
>
>> +}
>> +
>>   #endif
>> -- 
>> 2.14.1
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Daniel Vetter Feb. 19, 2018, 4:15 p.m. UTC | #4

On Mon, Feb 19, 2018 at 04:41:55PM +0100, Christian König wrote:
> Am 19.02.2018 um 16:24 schrieb Daniel Vetter:
> > On Thu, Feb 15, 2018 at 03:19:42PM +0100, Christian König wrote:
> > > amdgpu needs to verify if userspace sends us valid addresses and the simplest
> > > way of doing this is to check if the buffer object is locked with the ticket
> > > of the current submission.
> > > 
> > > Clean up the access to the ww_mutex internals by providing a function
> > > for this and extend the check to the thread owning the underlying mutex.
> > > 
> > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  3 ++-
> > >   include/linux/ww_mutex.h               | 17 +++++++++++++++++
> > >   2 files changed, 19 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > index eaa3cb0c3ad1..4c04b560e358 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > @@ -1594,7 +1594,8 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
> > >   	*map = mapping;
> > >   	/* Double check that the BO is reserved by this CS */
> > > -	if (READ_ONCE((*bo)->tbo.resv->lock.ctx) != &parser->ticket)
> > > +	if (!ww_mutex_is_owned_by(&(*bo)->tbo.resv->lock, current,
> > > +				  &parser->ticket))
> > >   		return -EINVAL;
> > >   	if (!((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
> > > diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
> > > index 39fda195bf78..dd580db289e8 100644
> > > --- a/include/linux/ww_mutex.h
> > > +++ b/include/linux/ww_mutex.h
> > > @@ -358,4 +358,21 @@ static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
> > >   	return mutex_is_locked(&lock->base);
> > >   }
> > > +/**
> > > + * ww_mutex_is_owned_by - is the w/w mutex locked by this task in that context
> > > + * @lock: the mutex to be queried
> > > + * @task: the task structure to check
> > > + * @ctx: the w/w acquire context to test
> > > + *
> > > + * Returns true if the mutex is locked in the context by the given task, false
> > > + * otherwise.
> > > + */
> > > +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
> > > +					struct task_struct *task,
> > > +					struct ww_acquire_ctx *ctx)
> > > +{
> > > +	return likely(__mutex_owner(&lock->base) == task) &&
> > > +		READ_ONCE(lock->ctx) == ctx;
> > Just comparing the context should be good enough. If you ever pass a
> > ww_acquire_ctx which does not belong to your own thread your seriously
> > wreaking things much worse already (and if we do catch that, should
> > probably lock the ctx to a given task when ww-mutex debugging is enabled).
> > 
> > That also simplifies the function signature.
> > 
> > Of course that means if you don't have a ctx, you can't test ownership of
> > a ww_mute, but I think that's not a really valid use-case.
> 
> Well exactly that is the use case in TTM, see patch #3 in this series.
> 
> In TTM the evicted BOs are trylocked and so we need a way of testing for
> ownership without a context.

I don't think your final patch to keep ww_mutex locked until the end
works. You can't really nest ww_mutex_trylock with ww_mutex at will (since
trylock bypasses the entire deadlock avoidance).

If this is really what you want to do, then we need a
ww_mutex_trylock_ctx, which also fills out the ctx value (so that other
threads can correctly resolve deadlocks when you hold that lock while
trying to grab additional locks). In which case you really don't need the
task pointer.

Yes it's a disappointment that lockdep doesn't correctly track trylocks,
it just does basic sanity checks, but then drops them on the floor wrt
depency tracking. Just in case you wonder why you're not getting a
lockdeps splat for this. Unfortunately I don't understand lockdep enough
to be able to fix this gap.
-Daniel

> 
> Christian.
> 
> >   And not needed
> > for cmd submission, where you need the ctx anyway.
> > 
> > Besides this interface nit looks all good. With the task check&parameter
> > removed:
> > 
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > 
> > -Daniel
> > 
> > > +}
> > > +
> > >   #endif
> > > -- 
> > > 2.14.1
> > > 
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Christian König Feb. 19, 2018, 4:29 p.m. UTC | #5

Am 19.02.2018 um 17:15 schrieb Daniel Vetter:
> On Mon, Feb 19, 2018 at 04:41:55PM +0100, Christian König wrote:
>> Am 19.02.2018 um 16:24 schrieb Daniel Vetter:
>>> On Thu, Feb 15, 2018 at 03:19:42PM +0100, Christian König wrote:
>>>> amdgpu needs to verify if userspace sends us valid addresses and the simplest
>>>> way of doing this is to check if the buffer object is locked with the ticket
>>>> of the current submission.
>>>>
>>>> Clean up the access to the ww_mutex internals by providing a function
>>>> for this and extend the check to the thread owning the underlying mutex.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  3 ++-
>>>>    include/linux/ww_mutex.h               | 17 +++++++++++++++++
>>>>    2 files changed, 19 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> index eaa3cb0c3ad1..4c04b560e358 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> @@ -1594,7 +1594,8 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
>>>>    	*map = mapping;
>>>>    	/* Double check that the BO is reserved by this CS */
>>>> -	if (READ_ONCE((*bo)->tbo.resv->lock.ctx) != &parser->ticket)
>>>> +	if (!ww_mutex_is_owned_by(&(*bo)->tbo.resv->lock, current,
>>>> +				  &parser->ticket))
>>>>    		return -EINVAL;
>>>>    	if (!((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
>>>> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
>>>> index 39fda195bf78..dd580db289e8 100644
>>>> --- a/include/linux/ww_mutex.h
>>>> +++ b/include/linux/ww_mutex.h
>>>> @@ -358,4 +358,21 @@ static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
>>>>    	return mutex_is_locked(&lock->base);
>>>>    }
>>>> +/**
>>>> + * ww_mutex_is_owned_by - is the w/w mutex locked by this task in that context
>>>> + * @lock: the mutex to be queried
>>>> + * @task: the task structure to check
>>>> + * @ctx: the w/w acquire context to test
>>>> + *
>>>> + * Returns true if the mutex is locked in the context by the given task, false
>>>> + * otherwise.
>>>> + */
>>>> +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
>>>> +					struct task_struct *task,
>>>> +					struct ww_acquire_ctx *ctx)
>>>> +{
>>>> +	return likely(__mutex_owner(&lock->base) == task) &&
>>>> +		READ_ONCE(lock->ctx) == ctx;
>>> Just comparing the context should be good enough. If you ever pass a
>>> ww_acquire_ctx which does not belong to your own thread your seriously
>>> wreaking things much worse already (and if we do catch that, should
>>> probably lock the ctx to a given task when ww-mutex debugging is enabled).
>>>
>>> That also simplifies the function signature.
>>>
>>> Of course that means if you don't have a ctx, you can't test ownership of
>>> a ww_mute, but I think that's not a really valid use-case.
>> Well exactly that is the use case in TTM, see patch #3 in this series.
>>
>> In TTM the evicted BOs are trylocked and so we need a way of testing for
>> ownership without a context.
> I don't think your final patch to keep ww_mutex locked until the end
> works. You can't really nest ww_mutex_trylock with ww_mutex at will (since
> trylock bypasses the entire deadlock avoidance).

Well that is not a problem at all. See we don't nest trylock with normal 
lock acquiring, cause that would indeed bypass the whole deadlock detection.

Instead we first use ww_mutex_acquire to lock all BOs which are needed 
for a command submission, including the deadlock detection.

Then all additional BOs which needed to be evicted to fulfill the 
current request are trylocked.

> If this is really what you want to do, then we need a
> ww_mutex_trylock_ctx, which also fills out the ctx value (so that other
> threads can correctly resolve deadlocks when you hold that lock while
> trying to grab additional locks). In which case you really don't need the
> task pointer.

Actually considered that as well, but it turned out that this is exactly 
what I don't want.

Cause then we wouldn't be able to distinct ww_mutex locked with a 
context (because they are part of the submission) and without (because 
TTM trylocked them).

> Yes it's a disappointment that lockdep doesn't correctly track trylocks,
> it just does basic sanity checks, but then drops them on the floor wrt
> depency tracking. Just in case you wonder why you're not getting a
> lockdeps splat for this. Unfortunately I don't understand lockdep enough
> to be able to fix this gap.

Sorry to disappoint you, but lockdep is indeed capable to correctly 
track those trylocked BOs.

Got quite a bunch of warning before I was able to resolve to this solution.

Christian.

> -Daniel
>
>> Christian.
>>
>>>    And not needed
>>> for cmd submission, where you need the ctx anyway.
>>>
>>> Besides this interface nit looks all good. With the task check&parameter
>>> removed:
>>>
>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>
>>> -Daniel
>>>
>>>> +}
>>>> +
>>>>    #endif
>>>> -- 
>>>> 2.14.1
>>>>
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Daniel Vetter Feb. 19, 2018, 4:43 p.m. UTC | #6

On Mon, Feb 19, 2018 at 05:29:46PM +0100, Christian König wrote:
> Am 19.02.2018 um 17:15 schrieb Daniel Vetter:
> > On Mon, Feb 19, 2018 at 04:41:55PM +0100, Christian König wrote:
> > > Am 19.02.2018 um 16:24 schrieb Daniel Vetter:
> > > > On Thu, Feb 15, 2018 at 03:19:42PM +0100, Christian König wrote:
> > > > > amdgpu needs to verify if userspace sends us valid addresses and the simplest
> > > > > way of doing this is to check if the buffer object is locked with the ticket
> > > > > of the current submission.
> > > > > 
> > > > > Clean up the access to the ww_mutex internals by providing a function
> > > > > for this and extend the check to the thread owning the underlying mutex.
> > > > > 
> > > > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > > > ---
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  3 ++-
> > > > >    include/linux/ww_mutex.h               | 17 +++++++++++++++++
> > > > >    2 files changed, 19 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > index eaa3cb0c3ad1..4c04b560e358 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > @@ -1594,7 +1594,8 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
> > > > >    	*map = mapping;
> > > > >    	/* Double check that the BO is reserved by this CS */
> > > > > -	if (READ_ONCE((*bo)->tbo.resv->lock.ctx) != &parser->ticket)
> > > > > +	if (!ww_mutex_is_owned_by(&(*bo)->tbo.resv->lock, current,
> > > > > +				  &parser->ticket))
> > > > >    		return -EINVAL;
> > > > >    	if (!((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
> > > > > diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
> > > > > index 39fda195bf78..dd580db289e8 100644
> > > > > --- a/include/linux/ww_mutex.h
> > > > > +++ b/include/linux/ww_mutex.h
> > > > > @@ -358,4 +358,21 @@ static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
> > > > >    	return mutex_is_locked(&lock->base);
> > > > >    }
> > > > > +/**
> > > > > + * ww_mutex_is_owned_by - is the w/w mutex locked by this task in that context
> > > > > + * @lock: the mutex to be queried
> > > > > + * @task: the task structure to check
> > > > > + * @ctx: the w/w acquire context to test
> > > > > + *
> > > > > + * Returns true if the mutex is locked in the context by the given task, false
> > > > > + * otherwise.
> > > > > + */
> > > > > +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
> > > > > +					struct task_struct *task,
> > > > > +					struct ww_acquire_ctx *ctx)
> > > > > +{
> > > > > +	return likely(__mutex_owner(&lock->base) == task) &&
> > > > > +		READ_ONCE(lock->ctx) == ctx;
> > > > Just comparing the context should be good enough. If you ever pass a
> > > > ww_acquire_ctx which does not belong to your own thread your seriously
> > > > wreaking things much worse already (and if we do catch that, should
> > > > probably lock the ctx to a given task when ww-mutex debugging is enabled).
> > > > 
> > > > That also simplifies the function signature.
> > > > 
> > > > Of course that means if you don't have a ctx, you can't test ownership of
> > > > a ww_mute, but I think that's not a really valid use-case.
> > > Well exactly that is the use case in TTM, see patch #3 in this series.
> > > 
> > > In TTM the evicted BOs are trylocked and so we need a way of testing for
> > > ownership without a context.
> > I don't think your final patch to keep ww_mutex locked until the end
> > works. You can't really nest ww_mutex_trylock with ww_mutex at will (since
> > trylock bypasses the entire deadlock avoidance).
> 
> Well that is not a problem at all. See we don't nest trylock with normal
> lock acquiring, cause that would indeed bypass the whole deadlock detection.
> 
> Instead we first use ww_mutex_acquire to lock all BOs which are needed for a
> command submission, including the deadlock detection.
> 
> Then all additional BOs which needed to be evicted to fulfill the current
> request are trylocked.

Yeah it's the lock; trylock; lock combo that deadlocks. If lockdep indeed
catches that one (and not some other recursion combo) then I think we
don't have to worry about holding tons of trylock'ed locks.
> 
> > If this is really what you want to do, then we need a
> > ww_mutex_trylock_ctx, which also fills out the ctx value (so that other
> > threads can correctly resolve deadlocks when you hold that lock while
> > trying to grab additional locks). In which case you really don't need the
> > task pointer.
> 
> Actually considered that as well, but it turned out that this is exactly
> what I don't want.
> 
> Cause then we wouldn't be able to distinct ww_mutex locked with a context
> (because they are part of the submission) and without (because TTM trylocked
> them).

Out of curiosity, why do you need to know that?

> > Yes it's a disappointment that lockdep doesn't correctly track trylocks,
> > it just does basic sanity checks, but then drops them on the floor wrt
> > depency tracking. Just in case you wonder why you're not getting a
> > lockdeps splat for this. Unfortunately I don't understand lockdep enough
> > to be able to fix this gap.
> 
> Sorry to disappoint you, but lockdep is indeed capable to correctly track
> those trylocked BOs.
> 
> Got quite a bunch of warning before I was able to resolve to this solution.

Hm, I thought it didn't detect a lock; trylock; lock combo because the
trylock didn't show up in the dependency stack. Maybe that got fixed
meanwhile.

Assuming that we indeed need both, could we split up the two use-cases for
clarity? I.e. ww_mutex_is_owned_by_ctx, which only takes the ctx (and
forgoes checking for a task, since that's implied) and requires a non-NULL
ctx. And ww_mutex_is_owned_by_task, which only takes the task (and maybe
we should limit it to trylock and _single locks, i.e. WARN_ON(lock->ctx)
if ww-mutex debugging is enabled).

Or does that hit another requirement of your use-case?
-Daniel

> 
> Christian.
> 
> > -Daniel
> > 
> > > Christian.
> > > 
> > > >    And not needed
> > > > for cmd submission, where you need the ctx anyway.
> > > > 
> > > > Besides this interface nit looks all good. With the task check&parameter
> > > > removed:
> > > > 
> > > > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > 
> > > > -Daniel
> > > > 
> > > > > +}
> > > > > +
> > > > >    #endif
> > > > > -- 
> > > > > 2.14.1
> > > > > 
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Christian König Feb. 20, 2018, 9:43 a.m. UTC | #7

Am 19.02.2018 um 17:43 schrieb Daniel Vetter:
> On Mon, Feb 19, 2018 at 05:29:46PM +0100, Christian König wrote:
>> [SNIP]
>> Well that is not a problem at all. See we don't nest trylock with normal
>> lock acquiring, cause that would indeed bypass the whole deadlock detection.
>>
>> Instead we first use ww_mutex_acquire to lock all BOs which are needed for a
>> command submission, including the deadlock detection.
>>
>> Then all additional BOs which needed to be evicted to fulfill the current
>> request are trylocked.
> Yeah it's the lock; trylock; lock combo that deadlocks. If lockdep indeed
> catches that one (and not some other recursion combo) then I think we
> don't have to worry about holding tons of trylock'ed locks.

Well I haven't explicitly tested the lock; trylock; lock case, but you 
get a warning anyway in the lock; ... anything ...; lock case.

See the first and the second lock can't use the same acquire context, 
because that isn't known down the call stack and lockdep warns about 
that quite intensively.

What is a problem is that lockdep sometimes runs out of space to keep 
track of all the trylocked mutexes, but that could have happened before 
as well if I'm not completely mistaken.

>>> If this is really what you want to do, then we need a
>>> ww_mutex_trylock_ctx, which also fills out the ctx value (so that other
>>> threads can correctly resolve deadlocks when you hold that lock while
>>> trying to grab additional locks). In which case you really don't need the
>>> task pointer.
>> Actually considered that as well, but it turned out that this is exactly
>> what I don't want.
>>
>> Cause then we wouldn't be able to distinct ww_mutex locked with a context
>> (because they are part of the submission) and without (because TTM trylocked
>> them).
> Out of curiosity, why do you need to know that?

The control flow in TTM is that when you trylocked a BO you start to 
evict it.

Now sometimes it happens that we evict it from VRAM to GTT, but then 
find that we don't have enough GTT space and need to evict something 
from there to the SYSTEM domain.

The problem is now since the reservation object is trylocked because of 
the VRAM to GTT eviction we can't lock it again because of the GTT to 
the SYSTEM domain.

>>> Yes it's a disappointment that lockdep doesn't correctly track trylocks,
>>> it just does basic sanity checks, but then drops them on the floor wrt
>>> depency tracking. Just in case you wonder why you're not getting a
>>> lockdeps splat for this. Unfortunately I don't understand lockdep enough
>>> to be able to fix this gap.
>> Sorry to disappoint you, but lockdep is indeed capable to correctly track
>> those trylocked BOs.
>>
>> Got quite a bunch of warning before I was able to resolve to this solution.
> Hm, I thought it didn't detect a lock; trylock; lock combo because the
> trylock didn't show up in the dependency stack. Maybe that got fixed
> meanwhile.

Yeah I can confirm that this indeed got fixed.

> Assuming that we indeed need both, could we split up the two use-cases for
> clarity? I.e. ww_mutex_is_owned_by_ctx, which only takes the ctx (and
> forgoes checking for a task, since that's implied) and requires a non-NULL
> ctx. And ww_mutex_is_owned_by_task, which only takes the task (and maybe
> we should limit it to trylock and _single locks, i.e. WARN_ON(lock->ctx)
> if ww-mutex debugging is enabled).
>
> Or does that hit another requirement of your use-case?

Well we could add two tests, one which only checks the context and one 
which checks that the context is NULL and then checks the mutex owner.

But to me it actually looks more like that makes it unnecessary 
complicated. The use case in amdgpu which could only check the context 
isn't performance critical.

Christian.

> -Daniel
>

Daniel Vetter Feb. 20, 2018, 11:33 a.m. UTC | #8

On Tue, Feb 20, 2018 at 10:43:48AM +0100, Christian König wrote:
> Am 19.02.2018 um 17:43 schrieb Daniel Vetter:
> > On Mon, Feb 19, 2018 at 05:29:46PM +0100, Christian König wrote:
> > > [SNIP]
> > > Well that is not a problem at all. See we don't nest trylock with normal
> > > lock acquiring, cause that would indeed bypass the whole deadlock detection.
> > > 
> > > Instead we first use ww_mutex_acquire to lock all BOs which are needed for a
> > > command submission, including the deadlock detection.
> > > 
> > > Then all additional BOs which needed to be evicted to fulfill the current
> > > request are trylocked.
> > Yeah it's the lock; trylock; lock combo that deadlocks. If lockdep indeed
> > catches that one (and not some other recursion combo) then I think we
> > don't have to worry about holding tons of trylock'ed locks.
> 
> Well I haven't explicitly tested the lock; trylock; lock case, but you get a
> warning anyway in the lock; ... anything ...; lock case.
> 
> See the first and the second lock can't use the same acquire context,
> because that isn't known down the call stack and lockdep warns about that
> quite intensively.

Ah, so the ttm_ctx I've spotted was something entirely different and
doesn't contain the ww_acquire_ctx (I didn't check)? I'd assume you have
the same ctx passed around to everything in ttm, but if that doesn't exist
then we can indeed not annotate ww_mutex_trylock_ctx with the right ctx.

> What is a problem is that lockdep sometimes runs out of space to keep track
> of all the trylocked mutexes, but that could have happened before as well if
> I'm not completely mistaken.
> 
> > > > If this is really what you want to do, then we need a
> > > > ww_mutex_trylock_ctx, which also fills out the ctx value (so that other
> > > > threads can correctly resolve deadlocks when you hold that lock while
> > > > trying to grab additional locks). In which case you really don't need the
> > > > task pointer.
> > > Actually considered that as well, but it turned out that this is exactly
> > > what I don't want.
> > > 
> > > Cause then we wouldn't be able to distinct ww_mutex locked with a context
> > > (because they are part of the submission) and without (because TTM trylocked
> > > them).
> > Out of curiosity, why do you need to know that?
> 
> The control flow in TTM is that when you trylocked a BO you start to evict
> it.
> 
> Now sometimes it happens that we evict it from VRAM to GTT, but then find
> that we don't have enough GTT space and need to evict something from there
> to the SYSTEM domain.
> 
> The problem is now since the reservation object is trylocked because of the
> VRAM to GTT eviction we can't lock it again because of the GTT to the SYSTEM
> domain.
> 
> > > > Yes it's a disappointment that lockdep doesn't correctly track trylocks,
> > > > it just does basic sanity checks, but then drops them on the floor wrt
> > > > depency tracking. Just in case you wonder why you're not getting a
> > > > lockdeps splat for this. Unfortunately I don't understand lockdep enough
> > > > to be able to fix this gap.
> > > Sorry to disappoint you, but lockdep is indeed capable to correctly track
> > > those trylocked BOs.
> > > 
> > > Got quite a bunch of warning before I was able to resolve to this solution.
> > Hm, I thought it didn't detect a lock; trylock; lock combo because the
> > trylock didn't show up in the dependency stack. Maybe that got fixed
> > meanwhile.
> 
> Yeah I can confirm that this indeed got fixed.
> 
> > Assuming that we indeed need both, could we split up the two use-cases for
> > clarity? I.e. ww_mutex_is_owned_by_ctx, which only takes the ctx (and
> > forgoes checking for a task, since that's implied) and requires a non-NULL
> > ctx. And ww_mutex_is_owned_by_task, which only takes the task (and maybe
> > we should limit it to trylock and _single locks, i.e. WARN_ON(lock->ctx)
> > if ww-mutex debugging is enabled).
> > 
> > Or does that hit another requirement of your use-case?
> 
> Well we could add two tests, one which only checks the context and one which
> checks that the context is NULL and then checks the mutex owner.
> 
> But to me it actually looks more like that makes it unnecessary complicated.
> The use case in amdgpu which could only check the context isn't performance
> critical.

Oh I'm not worried about the runtime overhead at all, I'm worried about
conceptual clarity of this stuff. If you have a ctx there's no need to
also look at ->owner.

Another idea: We drop the task argument from functions and go with the
following logic:

ww_mutex_is_owner(lock, ctx)
{
	if (ctx)
		return lock->ctx == ctx;
	else
		return lock->owner == current;
}

I think that would solve your use case, and gives us the neat interface
I'm aiming for. Kerneldoc can then explain what's happening for a NULL
ctx.
-Daniel

Christian König Feb. 20, 2018, 12:31 p.m. UTC | #9

Am 20.02.2018 um 12:33 schrieb Daniel Vetter:
> [SNIP]
> Ah, so the ttm_ctx I've spotted was something entirely different and
> doesn't contain the ww_acquire_ctx (I didn't check)? I'd assume you have
> the same ctx passed around to everything in ttm, but if that doesn't exist
> then we can indeed not annotate ww_mutex_trylock_ctx with the right ctx.

Yes, exactly.

I actually tried this approach, e.g. put the ww_acquire_context into the 
ttm_operation_context and then use that with ww_mutex_trylock_ctx.

But a) that turned out to be to much hassle, e.g. at least amdgpu 
doesn't use a ww_acquire context in most cases.

And b) it actually wasn't what I was looking for, e.g. I couldn't 
distinct between the trylocked BOs an everything else any more.

>> [SNIP]
>> But to me it actually looks more like that makes it unnecessary complicated.
>> The use case in amdgpu which could only check the context isn't performance
>> critical.
> Oh I'm not worried about the runtime overhead at all, I'm worried about
> conceptual clarity of this stuff. If you have a ctx there's no need to
> also look at ->owner.
>
> Another idea: We drop the task argument from functions and go with the
> following logic:
>
> ww_mutex_is_owner(lock, ctx)
> {
> 	if (ctx)
> 		return lock->ctx == ctx;
> 	else
> 		return lock->owner == current;
> }
>
> I think that would solve your use case, and gives us the neat interface
> I'm aiming for. Kerneldoc can then explain what's happening for a NULL
> ctx.

Good point, going to adjust the patches this way and resend.

Christian.

Peter Zijlstra Feb. 20, 2018, 12:35 p.m. UTC | #10

This really should've been Cc'ed to me.

On Thu, Feb 15, 2018 at 03:19:42PM +0100, Christian König wrote:
> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
> index 39fda195bf78..dd580db289e8 100644
> --- a/include/linux/ww_mutex.h
> +++ b/include/linux/ww_mutex.h
> @@ -358,4 +358,21 @@ static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
>  	return mutex_is_locked(&lock->base);
>  }
>  
> +/**
> + * ww_mutex_is_owned_by - is the w/w mutex locked by this task in that context
> + * @lock: the mutex to be queried
> + * @task: the task structure to check
> + * @ctx: the w/w acquire context to test
> + *
> + * Returns true if the mutex is locked in the context by the given task, false
> + * otherwise.
> + */
> +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
> +					struct task_struct *task,
> +					struct ww_acquire_ctx *ctx)
> +{
> +	return likely(__mutex_owner(&lock->base) == task) &&
> +		READ_ONCE(lock->ctx) == ctx;
> +}

Nak on that interface, that's racy and broken by design.

Christian König Feb. 20, 2018, 1:08 p.m. UTC | #11

Am 20.02.2018 um 13:35 schrieb Peter Zijlstra:
> This really should've been Cc'ed to me.
>
> On Thu, Feb 15, 2018 at 03:19:42PM +0100, Christian König wrote:
>> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
>> index 39fda195bf78..dd580db289e8 100644
>> --- a/include/linux/ww_mutex.h
>> +++ b/include/linux/ww_mutex.h
>> @@ -358,4 +358,21 @@ static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
>>   	return mutex_is_locked(&lock->base);
>>   }
>>   
>> +/**
>> + * ww_mutex_is_owned_by - is the w/w mutex locked by this task in that context
>> + * @lock: the mutex to be queried
>> + * @task: the task structure to check
>> + * @ctx: the w/w acquire context to test
>> + *
>> + * Returns true if the mutex is locked in the context by the given task, false
>> + * otherwise.
>> + */
>> +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
>> +					struct task_struct *task,
>> +					struct ww_acquire_ctx *ctx)
>> +{
>> +	return likely(__mutex_owner(&lock->base) == task) &&
>> +		READ_ONCE(lock->ctx) == ctx;
>> +}
> Nak on that interface, that's racy and broken by design.

Why?

Regards,
Christian.

Peter Zijlstra Feb. 20, 2018, 1:22 p.m. UTC | #12

On Tue, Feb 20, 2018 at 02:08:26PM +0100, Christian König wrote:
> Am 20.02.2018 um 13:35 schrieb Peter Zijlstra:
> > > +static inline bool ww_mutex_is_owned_by(struct ww_mutex *lock,
> > > +					struct task_struct *task,
> > > +					struct ww_acquire_ctx *ctx)
> > > +{
> > > +	return likely(__mutex_owner(&lock->base) == task) &&
> > > +		READ_ONCE(lock->ctx) == ctx;
> > > +}
> > Nak on that interface, that's racy and broken by design.
> 
> Why?

If task != current you can race with a concurrent mutex_unlock().

[1/3] locking/ww_mutex: cleanup lock->ctx usage in amdgpu

Commit Message

Comments

Patch