drm/i915: Reduce nested prepare_remote_context() to a trylock

Message ID 20191126065521.2331017-1-chris@chris-wilson.co.uk
State New

Commit Message

Chris Wilson Nov. 26, 2019, 6:55 a.m. UTC
When retiring a context, we may use the kernel_context to unpin it.
Elsewhere, we may use the kernel_context to modify that same context.
The two paths take the same pair of locks in opposite order, which
currently leads to an AB-BA lock inversion, so we need to back off from
the contended lock and retry.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
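
The back-off described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the i915 code: `demo_timeline`, `demo_trylock()`, `demo_unlock()` and `demo_queue_remote()` are hypothetical names, and a C11 `atomic_flag` stands in for `tl->mutex` so the sketch stays single-file. The point it tries to show is that the second lock is only ever tried, never waited on, and that failure releases the first lock too.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a timeline; not the i915 structure. */
struct demo_timeline {
	atomic_flag locked;	/* stand-in for tl->mutex */
	int last_request;	/* id of the most recently queued request */
};

/* Returns 1 if the lock was taken, 0 if it was already held. */
static int demo_trylock(struct demo_timeline *tl)
{
	return !atomic_flag_test_and_set_explicit(&tl->locked,
						  memory_order_acquire);
}

static void demo_unlock(struct demo_timeline *tl)
{
	atomic_flag_clear_explicit(&tl->locked, memory_order_release);
}

/*
 * Holding `own`, queue `rq` on the `remote` timeline. If `remote` is
 * contended, do not wait: drop `own` as well and report -EAGAIN so the
 * caller can start over. Sleeping on `remote` while holding `own` is
 * what would close an AB-BA cycle with a path locking in the other order.
 */
static int demo_queue_remote(struct demo_timeline *own,
			     struct demo_timeline *remote, int rq)
{
	if (!demo_trylock(own))
		return -EAGAIN;

	if (!demo_trylock(remote)) {
		demo_unlock(own);	/* back off completely */
		return -EAGAIN;
	}

	remote->last_request = rq;
	demo_unlock(remote);
	demo_unlock(own);
	return 0;
}
```

A caller would treat -EAGAIN as "try again later" rather than as a hard failure. How the real driver repeats the operation is not shown in this patch, so the retry policy is left out of the sketch.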

Comments

Chris Wilson Nov. 26, 2019, 7:53 a.m. UTC | #1
Quoting Chris Wilson (2019-11-26 06:55:21)
> On context retiring, we may invoke the kernel_context to unpin this
> context. Elsewhere, we may use the kernel_context to modify this
> context. This currently leads to an AB-BA lock inversion, so we need to
> back-off from the contended lock, and repeat.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732
Fixes: a9877da2d629 ("drm/i915/oa: Reconfigure contexts on the fly")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

Tvrtko Ursulin Nov. 26, 2019, 12:15 p.m. UTC | #2
On 26/11/2019 06:55, Chris Wilson wrote:
> On context retiring, we may invoke the kernel_context to unpin this
> context. Elsewhere, we may use the kernel_context to modify this
> context. This currently leads to an AB-BA lock inversion, so we need to
> back-off from the contended lock, and repeat.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_context.c | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> index ee9d2bcd2c13..4fcb98f96da6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -310,10 +310,8 @@ int intel_context_prepare_remote_request(struct intel_context *ce,
>   	GEM_BUG_ON(rq->hw_context == ce);
>   
>   	if (rcu_access_pointer(rq->timeline) != tl) { /* timeline sharing! */
> -		err = mutex_lock_interruptible_nested(&tl->mutex,
> -						      SINGLE_DEPTH_NESTING);
> -		if (err)
> -			return err;
> +		if (!mutex_trylock(&tl->mutex))
> +			return -EAGAIN;
>   
>   		/* Queue this switch after current activity by this context. */
>   		err = i915_active_fence_set(&tl->last_request, rq);
> 

Please just drop a short comment above the trylock since with git blame 
it is often very hard to find the commit.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

Chris Wilson Nov. 26, 2019, 12:18 p.m. UTC | #3
Quoting Tvrtko Ursulin (2019-11-26 12:15:58)
> 
> On 26/11/2019 06:55, Chris Wilson wrote:
> > On context retiring, we may invoke the kernel_context to unpin this
> > context. Elsewhere, we may use the kernel_context to modify this
> > context. This currently leads to an AB-BA lock inversion, so we need to
> > back-off from the contended lock, and repeat.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_context.c | 6 ++----
> >   1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> > index ee9d2bcd2c13..4fcb98f96da6 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -310,10 +310,8 @@ int intel_context_prepare_remote_request(struct intel_context *ce,
> >       GEM_BUG_ON(rq->hw_context == ce);
> >   
> >       if (rcu_access_pointer(rq->timeline) != tl) { /* timeline sharing! */
> > -             err = mutex_lock_interruptible_nested(&tl->mutex,
> > -                                                   SINGLE_DEPTH_NESTING);
> > -             if (err)
> > -                     return err;
> > +             if (!mutex_trylock(&tl->mutex))
> > +                     return -EAGAIN;
> >   
> >               /* Queue this switch after current activity by this context. */
> >               err = i915_active_fence_set(&tl->last_request, rq);
> > 
> 
> Please just drop a short comment above the trylock since with git blame 
> it is often very hard to find the commit.

Ok. I'm still hoping to find another way to provide the serialisation
cleanly, but with engine_retire() being more aggressive, the rate of
contention has increased :(
-Chris

Chris Wilson Nov. 26, 2019, 12:22 p.m. UTC | #4
Quoting Chris Wilson (2019-11-26 12:18:07)
> Quoting Tvrtko Ursulin (2019-11-26 12:15:58)
> > 
> > On 26/11/2019 06:55, Chris Wilson wrote:
> > > On context retiring, we may invoke the kernel_context to unpin this
> > > context. Elsewhere, we may use the kernel_context to modify this
> > > context. This currently leads to an AB-BA lock inversion, so we need to
> > > back-off from the contended lock, and repeat.
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > >   drivers/gpu/drm/i915/gt/intel_context.c | 6 ++----
> > >   1 file changed, 2 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> > > index ee9d2bcd2c13..4fcb98f96da6 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > > @@ -310,10 +310,8 @@ int intel_context_prepare_remote_request(struct intel_context *ce,
> > >       GEM_BUG_ON(rq->hw_context == ce);
> > >   
> > >       if (rcu_access_pointer(rq->timeline) != tl) { /* timeline sharing! */
> > > -             err = mutex_lock_interruptible_nested(&tl->mutex,
> > > -                                                   SINGLE_DEPTH_NESTING);
> > > -             if (err)
> > > -                     return err;
> > > +             if (!mutex_trylock(&tl->mutex))
> > > +                     return -EAGAIN;
> > >   
> > >               /* Queue this switch after current activity by this context. */
> > >               err = i915_active_fence_set(&tl->last_request, rq);
> > > 
> > 
> > Please just drop a short comment above the trylock since with git blame 
> > it is often very hard to find the commit.
> 
> Ok. I'm still hoping to find another way to provide the serialisation
> cleanly, but with engine_retire() being more aggressive, the rate of
> contention has increased :(

Hmm. Staring at i915_active_fence_set()... That could be made to be
atomic with only a small amount of hassle. (By small, I mean by the
usual RCU standards.)
-Chris
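
For readers wondering what an "atomic" set of the last-request slot might look like in principle, here is a rough userspace sketch using C11 atomics. It is an assumption-laden illustration, not how `i915_active_fence_set()` is or was implemented: `demo_request`, `demo_active` and `demo_active_set()` are hypothetical, and the hard part hinted at by "the usual RCU standards" (keeping the previous request alive while it is being inspected) is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical request type, for illustration only. */
struct demo_request {
	int id;
	struct demo_request *wait_on;	/* request this one is ordered after */
};

/* Hypothetical slot tracking the last request on a timeline. */
struct demo_active {
	_Atomic(struct demo_request *) last;
};

/*
 * Publish `rq` as the newest request and order it after whatever was
 * there before. The exchange is a single atomic step, so concurrent
 * setters cannot lose each other's update even without a mutex.
 * Lifetime of the returned request is NOT handled here.
 */
static struct demo_request *demo_active_set(struct demo_active *slot,
					    struct demo_request *rq)
{
	struct demo_request *prev;

	prev = atomic_exchange_explicit(&slot->last, rq,
					memory_order_acq_rel);
	rq->wait_on = prev;
	return prev;
}
```

With something of this shape, the remote path would not need to take the other timeline's mutex at all, which is presumably why it is attractive as an alternative to the trylock.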

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index ee9d2bcd2c13..4fcb98f96da6 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -310,10 +310,8 @@  int intel_context_prepare_remote_request(struct intel_context *ce,
 	GEM_BUG_ON(rq->hw_context == ce);
 
 	if (rcu_access_pointer(rq->timeline) != tl) { /* timeline sharing! */
-		err = mutex_lock_interruptible_nested(&tl->mutex,
-						      SINGLE_DEPTH_NESTING);
-		if (err)
-			return err;
+		if (!mutex_trylock(&tl->mutex))
+			return -EAGAIN;
 
 		/* Queue this switch after current activity by this context. */
 		err = i915_active_fence_set(&tl->last_request, rq);