diff mbox series

drm/i915/gt/uc: Evaluate GuC priority within locks

Message ID 20240606001702.59005-1-andi.shyti@linux.intel.com (mailing list archive)
State New, archived
Series drm/i915/gt/uc: Evaluate GuC priority within locks

Commit Message

Andi Shyti June 6, 2024, 12:17 a.m. UTC
The ce->guc_state.lock was made to protect guc_prio, which
indicates the GuC priority level.

But at the beginning of the function we perform some sanity checks
of guc_prio outside its protected section. Move them within the
locked region.

Use this occasion to expand the if statement to make it clearer.

Fixes: ee242ca704d3 ("drm/i915/guc: Implement GuC priority management")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v5.15+
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Comments

Matthew Brost June 6, 2024, 2:57 a.m. UTC | #1
On Thu, Jun 06, 2024 at 02:17:02AM +0200, Andi Shyti wrote:
> The ce->guc_state.lock was made to protect guc_prio, which
> indicates the GuC priority level.
> 
> But at the beginning of the function we perform some sanity checks
> of guc_prio outside its protected section. Move them within the
> locked region.
> 
> Use this occasion to expand the if statement to make it clearer.
> 
> Fixes: ee242ca704d3 ("drm/i915/guc: Implement GuC priority management")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> Cc: <stable@vger.kernel.org> # v5.15+
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 0eaa1064242c..1181043bc5e9 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -4267,13 +4267,18 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>  	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>  
>  	/* Short circuit function */
> -	if (prio < I915_PRIORITY_NORMAL ||
> -	    rq->guc_prio == GUC_PRIO_FINI ||
> -	    (rq->guc_prio != GUC_PRIO_INIT &&
> -	     !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
> +	if (prio < I915_PRIORITY_NORMAL)
>  		return;
>  
>  	spin_lock(&ce->guc_state.lock);
> +
> +	if (rq->guc_prio == GUC_PRIO_FINI)
> +		goto exit;
> +
> +	if (rq->guc_prio != GUC_PRIO_INIT &&
> +	    !new_guc_prio_higher(rq->guc_prio, new_guc_prio))
> +		goto exit;
> +
>  	if (rq->guc_prio != GUC_PRIO_FINI) {
>  		if (rq->guc_prio != GUC_PRIO_INIT)
>  			sub_context_inflight_prio(ce, rq->guc_prio);
> @@ -4281,6 +4286,8 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>  		add_context_inflight_prio(ce, rq->guc_prio);
>  		update_context_prio(ce);
>  	}
> +
> +exit:
>  	spin_unlock(&ce->guc_state.lock);
>  }
>  
> -- 
> 2.45.1
>
Daniele Ceraolo Spurio June 7, 2024, 6:19 p.m. UTC | #2
On 6/5/2024 5:17 PM, Andi Shyti wrote:
> The ce->guc_state.lock was made to protect guc_prio, which
> indicates the GuC priority level.
>
> But at the beginning of the function we perform some sanity checks
> of guc_prio outside its protected section. Move them within the
> locked region.
>
> Use this occasion to expand the if statement to make it clearer.
>
> Fixes: ee242ca704d3 ("drm/i915/guc: Implement GuC priority management")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: <stable@vger.kernel.org> # v5.15+
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +++++++++++----
>   1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 0eaa1064242c..1181043bc5e9 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -4267,13 +4267,18 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>   	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>   
>   	/* Short circuit function */
> -	if (prio < I915_PRIORITY_NORMAL ||
> -	    rq->guc_prio == GUC_PRIO_FINI ||
> -	    (rq->guc_prio != GUC_PRIO_INIT &&
> -	     !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
> +	if (prio < I915_PRIORITY_NORMAL)
>   		return;
>   

My understanding was that those checks are purposely done outside of the 
lock to avoid taking it when not needed and that the early exit is not 
racy. In particular:

- GUC_PRIO_FINI is the end state for the priority, so if we're there 
that's not changing anymore and therefore the lock is not required.

- the priority only goes up with the bumping, so if 
new_guc_prio_higher() is false that's not going to be changed by a 
different thread running at the same time and increasing the priority 
even more.

I think there is still a possible race if new_guc_prio_higher() is
true when we check it outside the lock but then changes before we
execute the protected chunk inside, so a fix would still be required for
that.

All this said, I don't really have anything against moving the whole 
thing inside the lock since this isn't on a critical path, just wanted 
to point out that it's not all strictly required.

One nit on the code below.

>   	spin_lock(&ce->guc_state.lock);
> +
> +	if (rq->guc_prio == GUC_PRIO_FINI)
> +		goto exit;
> +
> +	if (rq->guc_prio != GUC_PRIO_INIT &&
> +	    !new_guc_prio_higher(rq->guc_prio, new_guc_prio))
> +		goto exit;
> +
>   	if (rq->guc_prio != GUC_PRIO_FINI) {

You're now checking for rq->guc_prio == GUC_PRIO_FINI inside the lock, 
so no need to check it again here as it can't have changed.

Daniele

>   		if (rq->guc_prio != GUC_PRIO_INIT)
>   			sub_context_inflight_prio(ce, rq->guc_prio);
> @@ -4281,6 +4286,8 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>   		add_context_inflight_prio(ce, rq->guc_prio);
>   		update_context_prio(ce);
>   	}
> +
> +exit:
>   	spin_unlock(&ce->guc_state.lock);
>   }
>
Andi Shyti June 11, 2024, 1:31 p.m. UTC | #3
Hi Daniele,

thanks for checking this patch.

> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 0eaa1064242c..1181043bc5e9 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -4267,13 +4267,18 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
> >   	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
> >   	/* Short circuit function */
> > -	if (prio < I915_PRIORITY_NORMAL ||
> > -	    rq->guc_prio == GUC_PRIO_FINI ||
> > -	    (rq->guc_prio != GUC_PRIO_INIT &&
> > -	     !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
> > +	if (prio < I915_PRIORITY_NORMAL)
> >   		return;
> 
> My understanding was that those checks are purposely done outside of the
> lock to avoid taking it when not needed and that the early exit is not racy.
> In particular:
> 
> - GUC_PRIO_FINI is the end state for the priority, so if we're there that's
> not changing anymore and therefore the lock is not required.

Yeah... then I thought that we should either remove the lock
completely or have everything inside it.

> - the priority only goes up with the bumping, so if new_guc_prio_higher() is
> false that's not going to be changed by a different thread running at the
> same time and increasing the priority even more.
> 
> I think there is still a possible race if new_guc_prio_higher() is true
> when we check it outside the lock but then changes before we execute the
> protected chunk inside, so a fix would still be required for that.

This is the reason why I made the patch :-)

> All this said, I don't really have anything against moving the whole thing
> inside the lock since this isn't on a critical path, just wanted to point
> out that it's not all strictly required.
> 
> One nit on the code below.
> 
> >   	spin_lock(&ce->guc_state.lock);
> > +
> > +	if (rq->guc_prio == GUC_PRIO_FINI)
> > +		goto exit;
> > +
> > +	if (rq->guc_prio != GUC_PRIO_INIT &&
> > +	    !new_guc_prio_higher(rq->guc_prio, new_guc_prio))
> > +		goto exit;
> > +
> >   	if (rq->guc_prio != GUC_PRIO_FINI) {
> 
> You're now checking for rq->guc_prio == GUC_PRIO_FINI inside the lock, so no
> need to check it again here as it can't have changed.

True, will resend.

Thanks, Daniele!

Andi

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 0eaa1064242c..1181043bc5e9 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4267,13 +4267,18 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
 	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
 
 	/* Short circuit function */
-	if (prio < I915_PRIORITY_NORMAL ||
-	    rq->guc_prio == GUC_PRIO_FINI ||
-	    (rq->guc_prio != GUC_PRIO_INIT &&
-	     !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
+	if (prio < I915_PRIORITY_NORMAL)
 		return;
 
 	spin_lock(&ce->guc_state.lock);
+
+	if (rq->guc_prio == GUC_PRIO_FINI)
+		goto exit;
+
+	if (rq->guc_prio != GUC_PRIO_INIT &&
+	    !new_guc_prio_higher(rq->guc_prio, new_guc_prio))
+		goto exit;
+
 	if (rq->guc_prio != GUC_PRIO_FINI) {
 		if (rq->guc_prio != GUC_PRIO_INIT)
 			sub_context_inflight_prio(ce, rq->guc_prio);
@@ -4281,6 +4286,8 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
 		add_context_inflight_prio(ce, rq->guc_prio);
 		update_context_prio(ce);
 	}
+
+exit:
 	spin_unlock(&ce->guc_state.lock);
 }