Message ID | 20240606001702.59005-1-andi.shyti@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915/gt/uc: Evaluate GuC priority within locks |
On Thu, Jun 06, 2024 at 02:17:02AM +0200, Andi Shyti wrote:
> The ce->guc_state.lock was made to protect guc_prio, which
> indicates the GuC priority level.
>
> But at the beginning of the function we perform some sanity checks
> of guc_prio outside its protected section. Move them within the
> locked region.
>
> Use this occasion to expand the if statement to make it clearer.
>
> Fixes: ee242ca704d3 ("drm/i915/guc: Implement GuC priority management")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> Cc: <stable@vger.kernel.org> # v5.15+
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 0eaa1064242c..1181043bc5e9 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -4267,13 +4267,18 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>  	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>
>  	/* Short circuit function */
> -	if (prio < I915_PRIORITY_NORMAL ||
> -	    rq->guc_prio == GUC_PRIO_FINI ||
> -	    (rq->guc_prio != GUC_PRIO_INIT &&
> -	     !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
> +	if (prio < I915_PRIORITY_NORMAL)
>  		return;
>
>  	spin_lock(&ce->guc_state.lock);
> +
> +	if (rq->guc_prio == GUC_PRIO_FINI)
> +		goto exit;
> +
> +	if (rq->guc_prio != GUC_PRIO_INIT &&
> +	    !new_guc_prio_higher(rq->guc_prio, new_guc_prio))
> +		goto exit;
> +
>  	if (rq->guc_prio != GUC_PRIO_FINI) {
>  		if (rq->guc_prio != GUC_PRIO_INIT)
>  			sub_context_inflight_prio(ce, rq->guc_prio);
> @@ -4281,6 +4286,8 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>  		add_context_inflight_prio(ce, rq->guc_prio);
>  		update_context_prio(ce);
>  	}
> +
> +exit:
>  	spin_unlock(&ce->guc_state.lock);
>  }
>
> --
> 2.45.1
>
On 6/5/2024 5:17 PM, Andi Shyti wrote:
> The ce->guc_state.lock was made to protect guc_prio, which
> indicates the GuC priority level.
>
> But at the beginning of the function we perform some sanity checks
> of guc_prio outside its protected section. Move them within the
> locked region.
>
> Use this occasion to expand the if statement to make it clearer.
>
> Fixes: ee242ca704d3 ("drm/i915/guc: Implement GuC priority management")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: <stable@vger.kernel.org> # v5.15+
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 0eaa1064242c..1181043bc5e9 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -4267,13 +4267,18 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>  	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>
>  	/* Short circuit function */
> -	if (prio < I915_PRIORITY_NORMAL ||
> -	    rq->guc_prio == GUC_PRIO_FINI ||
> -	    (rq->guc_prio != GUC_PRIO_INIT &&
> -	     !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
> +	if (prio < I915_PRIORITY_NORMAL)
>  		return;
>

My understanding was that those checks are purposely done outside of the
lock to avoid taking it when not needed and that the early exit is not racy.
In particular:

- GUC_PRIO_FINI is the end state for the priority, so if we're there that's
  not changing anymore and therefore the lock is not required.

- the priority only goes up with the bumping, so if new_guc_prio_higher() is
  false that's not going to be changed by a different thread running at the
  same time and increasing the priority even more.

I think there is still a possible race if new_guc_prio_higher() is true
when we check it outside the lock but then changes before we execute the
protected chunk inside, so a fix would still be required for that.

All this said, I don't really have anything against moving the whole thing
inside the lock since this isn't on a critical path, just wanted to point
out that it's not all strictly required.

One nit on the code below.

>  	spin_lock(&ce->guc_state.lock);
> +
> +	if (rq->guc_prio == GUC_PRIO_FINI)
> +		goto exit;
> +
> +	if (rq->guc_prio != GUC_PRIO_INIT &&
> +	    !new_guc_prio_higher(rq->guc_prio, new_guc_prio))
> +		goto exit;
> +
>  	if (rq->guc_prio != GUC_PRIO_FINI) {

You're now checking for rq->guc_prio == GUC_PRIO_FINI inside the lock, so no
need to check it again here as it can't have changed.

Daniele

>  		if (rq->guc_prio != GUC_PRIO_INIT)
>  			sub_context_inflight_prio(ce, rq->guc_prio);
> @@ -4281,6 +4286,8 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>  		add_context_inflight_prio(ce, rq->guc_prio);
>  		update_context_prio(ce);
>  	}
> +
> +exit:
>  	spin_unlock(&ce->guc_state.lock);
>  }
>
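The remaining race described above (the check passes outside the lock, another thread bumps the priority, and the locked section then overwrites it) can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: every `model_*` name is hypothetical, the lock is represented by comments rather than a real spinlock, and the convention that a numerically lower value means a higher priority is assumed for the model rather than taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for GUC_PRIO_INIT / GUC_PRIO_FINI (values assumed). */
enum { MODEL_PRIO_INIT = 0xff, MODEL_PRIO_FINI = 0xfe };

struct model_request {
	uint8_t guc_prio; /* in the driver, protected by ce->guc_state.lock */
};

/* Assumed convention: a numerically lower value is a higher priority. */
static bool model_prio_higher(uint8_t old_prio, uint8_t new_prio)
{
	return new_prio < old_prio;
}

/* The short-circuit test on guc_prio, as it was evaluated before the patch. */
static bool model_should_bump(const struct model_request *rq, uint8_t new_prio)
{
	if (rq->guc_prio == MODEL_PRIO_FINI)
		return false;
	if (rq->guc_prio != MODEL_PRIO_INIT &&
	    !model_prio_higher(rq->guc_prio, new_prio))
		return false;
	return true;
}

/* Locked section that trusts the earlier unlocked check (pre-patch shape). */
static void model_update_trusting(struct model_request *rq, uint8_t new_prio)
{
	/* the lock would be taken here */
	if (rq->guc_prio != MODEL_PRIO_FINI)
		rq->guc_prio = new_prio;
	/* the lock would be released here */
}

/* Locked section that repeats the check under the lock (post-patch shape). */
static void model_update_rechecking(struct model_request *rq, uint8_t new_prio)
{
	/* the lock would be taken here */
	if (model_should_bump(rq, new_prio))
		rq->guc_prio = new_prio;
	/* the lock would be released here */
}

/*
 * Replays one interleaving: thread A checks, thread B bumps, thread A updates.
 * Returns the priority the request ends up with.
 */
static uint8_t model_replay(bool recheck)
{
	struct model_request rq = { .guc_prio = 3 };
	bool a_passed = model_should_bump(&rq, 2); /* A: 2 beats 3 */

	model_update_rechecking(&rq, 1);           /* B: bumps to 1 */

	if (a_passed) {
		if (recheck)
			model_update_rechecking(&rq, 2);
		else
			model_update_trusting(&rq, 2);
	}
	return rq.guc_prio;
}
```

With `recheck` false the request ends at 2, a lower priority than the 1 that thread B had set; with `recheck` true it stays at 1. The model does not exercise real concurrency, it only fixes one ordering of the steps to make the window visible.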
Hi Daniele,

thanks for checking this patch.

> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 0eaa1064242c..1181043bc5e9 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -4267,13 +4267,18 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
> >  	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
> >
> >  	/* Short circuit function */
> > -	if (prio < I915_PRIORITY_NORMAL ||
> > -	    rq->guc_prio == GUC_PRIO_FINI ||
> > -	    (rq->guc_prio != GUC_PRIO_INIT &&
> > -	     !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
> > +	if (prio < I915_PRIORITY_NORMAL)
> >  		return;
>
> My understanding was that those checks are purposely done outside of the
> lock to avoid taking it when not needed and that the early exit is not racy.
> In particular:
>
> - GUC_PRIO_FINI is the end state for the priority, so if we're there that's
> not changing anymore and therefore the lock is not required.

yeah... then I thought that we should either remove the lock
completely or have everything inside the lock.

> - the priority only goes up with the bumping, so if new_guc_prio_higher() is
> false that's not going to be changed by a different thread running at the
> same time and increasing the priority even more.
>
> I think there is still a possible race if new_guc_prio_higher() is true
> when we check it outside the lock but then changes before we execute the
> protected chunk inside, so a fix would still be required for that.

This is the reason why I made the patch :-)

> All this said, I don't really have anything against moving the whole thing
> inside the lock since this isn't on a critical path, just wanted to point
> out that it's not all strictly required.
>
> One nit on the code below.
>
> >  	spin_lock(&ce->guc_state.lock);
> > +
> > +	if (rq->guc_prio == GUC_PRIO_FINI)
> > +		goto exit;
> > +
> > +	if (rq->guc_prio != GUC_PRIO_INIT &&
> > +	    !new_guc_prio_higher(rq->guc_prio, new_guc_prio))
> > +		goto exit;
> > +
> >  	if (rq->guc_prio != GUC_PRIO_FINI) {
>
> You're now checking for rq->guc_prio == GUC_PRIO_FINI inside the lock, so no
> need to check it again here as it can't have changed.

True, will resend.

Thanks, Daniele!
Andi
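For illustration, the locked region with the redundant GUC_PRIO_FINI test dropped might look roughly like the userspace model below. It is not the resent patch. All `sketch_*` names are hypothetical, the spinlock is replaced by a boolean that asserts balanced lock and unlock calls, the per-priority accounting is reduced to a plain counter array, update_context_prio() is left out, and the function returns a bool (the driver function returns void) so that the outcome can be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values; lower number = higher priority is an assumption. */
enum { SKETCH_PRIO_INIT = 0xff, SKETCH_PRIO_FINI = 0xfe, SKETCH_PRIO_LEVELS = 4 };

struct sketch_context {
	bool locked;                               /* stand-in for ce->guc_state.lock */
	unsigned int inflight[SKETCH_PRIO_LEVELS]; /* requests in flight per level */
};

struct sketch_request {
	struct sketch_context *ce;
	uint8_t guc_prio;
};

static void sketch_lock(struct sketch_context *ce)
{
	assert(!ce->locked);
	ce->locked = true;
}

static void sketch_unlock(struct sketch_context *ce)
{
	assert(ce->locked);
	ce->locked = false;
}

/* Returns true when the request's priority was raised. */
static bool sketch_bump(struct sketch_request *rq, uint8_t new_prio)
{
	struct sketch_context *ce = rq->ce;
	bool bumped = false;

	assert(new_prio < SKETCH_PRIO_LEVELS);
	sketch_lock(ce);

	if (rq->guc_prio == SKETCH_PRIO_FINI)
		goto exit;

	if (rq->guc_prio != SKETCH_PRIO_INIT && new_prio >= rq->guc_prio)
		goto exit;

	/* No second FINI test: the lock has been held since the one above. */
	if (rq->guc_prio != SKETCH_PRIO_INIT)
		ce->inflight[rq->guc_prio]--;
	rq->guc_prio = new_prio;
	ce->inflight[rq->guc_prio]++;
	bumped = true;

exit:
	sketch_unlock(ce);
	return bumped;
}

/* Walks INIT -> 2, rejects 3, accepts 1, then rejects a bump after FINI. */
static int sketch_demo(void)
{
	struct sketch_context ce = { 0 };
	struct sketch_request rq = { .ce = &ce, .guc_prio = SKETCH_PRIO_INIT };
	int accepted = 0;

	accepted += sketch_bump(&rq, 2);
	accepted += sketch_bump(&rq, 3);
	accepted += sketch_bump(&rq, 1);
	assert(ce.inflight[2] == 0 && ce.inflight[1] == 1);

	rq.guc_prio = SKETCH_PRIO_FINI; /* models the request reaching its end state */
	accepted += sketch_bump(&rq, 0);
	assert(!ce.locked);             /* every path, including goto exit, unlocked */
	return accepted;
}
```

Calling `sketch_demo()` counts how many of the four bump attempts were accepted; the single `exit:` label keeps the unlock in one place for both early-exit paths and the normal path.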
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 0eaa1064242c..1181043bc5e9 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4267,13 +4267,18 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
 	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
 
 	/* Short circuit function */
-	if (prio < I915_PRIORITY_NORMAL ||
-	    rq->guc_prio == GUC_PRIO_FINI ||
-	    (rq->guc_prio != GUC_PRIO_INIT &&
-	     !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
+	if (prio < I915_PRIORITY_NORMAL)
 		return;
 
 	spin_lock(&ce->guc_state.lock);
+
+	if (rq->guc_prio == GUC_PRIO_FINI)
+		goto exit;
+
+	if (rq->guc_prio != GUC_PRIO_INIT &&
+	    !new_guc_prio_higher(rq->guc_prio, new_guc_prio))
+		goto exit;
+
 	if (rq->guc_prio != GUC_PRIO_FINI) {
 		if (rq->guc_prio != GUC_PRIO_INIT)
 			sub_context_inflight_prio(ce, rq->guc_prio);
@@ -4281,6 +4286,8 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
 		add_context_inflight_prio(ce, rq->guc_prio);
 		update_context_prio(ce);
 	}
+
+exit:
 	spin_unlock(&ce->guc_state.lock);
 }
The ce->guc_state.lock was made to protect guc_prio, which
indicates the GuC priority level.

But at the beginning of the function we perform some sanity checks
of guc_prio outside its protected section. Move them within the
locked region.

Use this occasion to expand the if statement to make it clearer.

Fixes: ee242ca704d3 ("drm/i915/guc: Implement GuC priority management")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v5.15+
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)