
[17/27] drm/i915/guc: Flush G2H work queue during reset

Message ID 20210819061639.21051-18-matthew.brost@intel.com
State New, archived
Series Clean up GuC CI failures, simplify locking, and kernel DOC

Commit Message

Matthew Brost Aug. 19, 2021, 6:16 a.m. UTC
It isn't safe to scrub for missing G2H or continue with the reset until
all G2H processing is complete. Flush the G2H work queue during reset to
ensure it is done running.

Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

Comments

Daniele Ceraolo Spurio Aug. 21, 2021, 12:25 a.m. UTC | #1
On 8/18/2021 11:16 PM, Matthew Brost wrote:
> It isn't safe to scrub for missing G2H or continue with the reset until
> all G2H processing is complete. Flush the G2H work queue during reset to
> ensure it is done running.

Might be worth moving this patch closer to "drm/i915/guc: Process all 
G2H message at once in work queue".

> Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 ++----------------
>   1 file changed, 2 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 4cf5a565f08e..9a53bae367b1 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -714,8 +714,6 @@ static void guc_flush_submissions(struct intel_guc *guc)
>   
>   void intel_guc_submission_reset_prepare(struct intel_guc *guc)
>   {
> -	int i;
> -
>   	if (unlikely(!guc_submission_initialized(guc))) {
>   		/* Reset called during driver load? GuC not yet initialised! */
>   		return;
> @@ -731,20 +729,8 @@ void intel_guc_submission_reset_prepare(struct intel_guc *guc)
>   
>   	guc_flush_submissions(guc);
>   
> -	/*
> -	 * Handle any outstanding G2Hs before reset. Call IRQ handler directly
> -	 * each pass as interrupt have been disabled. We always scrub for
> -	 * outstanding G2H as it is possible for outstanding_submission_g2h to
> -	 * be incremented after the context state update.
> -	 */
> -	for (i = 0; i < 4 && atomic_read(&guc->outstanding_submission_g2h); ++i) {
> -		intel_guc_to_host_event_handler(guc);
> -#define wait_for_reset(guc, wait_var) \
> -		intel_guc_wait_for_pending_msg(guc, wait_var, false, (HZ / 20))
> -		do {
> -			wait_for_reset(guc, &guc->outstanding_submission_g2h);
> -		} while (!list_empty(&guc->ct.requests.incoming));
> -	}
> +	flush_work(&guc->ct.requests.worker);
> +

We're no longer waiting for the outstanding G2H requests, just ensuring
that the processing of the ones we already received is done. Is this
intended? We do still handle the remaining outstanding submissions in the
scrub, so it's functionally correct, but the commit message doesn't state
the change in waiting behavior, so I wanted to double check it was planned.

Daniele

>   	scrub_guc_desc_for_outstanding_g2h(guc);
>   }
>
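To make the behavior change raised above concrete, here is a minimal
userspace sketch, not driver code. Every name in it (toy_ct, toy_flush,
toy_wait_then_flush, toy_flush_only) is hypothetical, and the "one more
reply arrives per pass" rule is an assumption made purely so the toy is
deterministic. It only contrasts a bounded wait-and-process loop with a
plain flush of what has already been received.

```c
#include <assert.h>

/* Hypothetical toy state; none of these names exist in i915. */
struct toy_ct {
	int expected;	/* G2H replies the host is still owed */
	int queued;	/* replies received but not yet processed */
	int in_flight;	/* replies the firmware has not delivered yet */
};

/* Process everything already queued: the part a flush guarantees. */
static void toy_flush(struct toy_ct *ct)
{
	ct->expected -= ct->queued;
	ct->queued = 0;
}

/*
 * Old behavior, heavily simplified: up to max_passes rounds. In each
 * round one more reply is assumed to arrive, then the queue is processed.
 * Returns the number of replies still missing afterwards.
 */
static int toy_wait_then_flush(struct toy_ct *ct, int max_passes)
{
	int i;

	for (i = 0; i < max_passes && ct->expected; ++i) {
		if (ct->in_flight) {
			ct->in_flight--;
			ct->queued++;
		}
		toy_flush(ct);
	}
	return ct->expected;
}

/* New behavior: no waiting, only finish what was already received. */
static int toy_flush_only(struct toy_ct *ct)
{
	toy_flush(ct);
	return ct->expected;
}
```

With three replies owed, one queued and two still in flight, the
flush-only path leaves two replies missing for the scrub to handle,
while the old loop would have drained them within its four passes.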
Matthew Brost Aug. 24, 2021, 3:44 p.m. UTC | #2
On Fri, Aug 20, 2021 at 05:25:41PM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 8/18/2021 11:16 PM, Matthew Brost wrote:
> > It isn't safe to scrub for missing G2H or continue with the reset until
> > all G2H processing is complete. Flush the G2H work queue during reset to
> > ensure it is done running.
> 
> Might be worth moving this patch closer to "drm/i915/guc: Process all G2H
> message at once in work queue".
> 

Sure.

> > Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 ++----------------
> >   1 file changed, 2 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 4cf5a565f08e..9a53bae367b1 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -714,8 +714,6 @@ static void guc_flush_submissions(struct intel_guc *guc)
> >   void intel_guc_submission_reset_prepare(struct intel_guc *guc)
> >   {
> > -	int i;
> > -
> >   	if (unlikely(!guc_submission_initialized(guc))) {
> >   		/* Reset called during driver load? GuC not yet initialised! */
> >   		return;
> > @@ -731,20 +729,8 @@ void intel_guc_submission_reset_prepare(struct intel_guc *guc)
> >   	guc_flush_submissions(guc);
> > -	/*
> > -	 * Handle any outstanding G2Hs before reset. Call IRQ handler directly
> > -	 * each pass as interrupt have been disabled. We always scrub for
> > -	 * outstanding G2H as it is possible for outstanding_submission_g2h to
> > -	 * be incremented after the context state update.
> > -	 */
> > -	for (i = 0; i < 4 && atomic_read(&guc->outstanding_submission_g2h); ++i) {
> > -		intel_guc_to_host_event_handler(guc);
> > -#define wait_for_reset(guc, wait_var) \
> > -		intel_guc_wait_for_pending_msg(guc, wait_var, false, (HZ / 20))
> > -		do {
> > -			wait_for_reset(guc, &guc->outstanding_submission_g2h);
> > -		} while (!list_empty(&guc->ct.requests.incoming));
> > -	}
> > +	flush_work(&guc->ct.requests.worker);
> > +
> 
> We're no longer waiting for the outstanding G2H requests, just ensuring
> that the processing of the ones we already received is done. Is this
> intended? We do still handle the remaining outstanding submissions in the
> scrub, so it's functionally correct, but the commit message doesn't state
> the change in waiting behavior, so I wanted to double check it was planned.
> 

Yes, it is planned, as the scrub code should be able to cope with any
missing G2H. I will update the commit message to reflect that.

Matt

> Daniele
> 
> >   	scrub_guc_desc_for_outstanding_g2h(guc);
> >   }
>
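As a rough illustration of what "cope with any missing G2H" could mean,
the sketch below models a scrub pass in plain userspace C. It is not the
body of scrub_guc_desc_for_outstanding_g2h(); toy_ctx, pending_reply and
toy_scrub are hypothetical names, and the assumption is simply that each
context carries a flag saying a reply is still owed, which the scrub
clears as if the reply had arrived.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-context state; not the i915 context structure. */
struct toy_ctx {
	bool pending_reply;	/* a G2H reply for this context never arrived */
};

/*
 * Walk all contexts and retire every reply that is still pending,
 * doing the bookkeeping the missing reply would have triggered.
 * Returns how many contexts had to be scrubbed.
 */
static int toy_scrub(struct toy_ctx *ctxs, size_t n, int *outstanding)
{
	int scrubbed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!ctxs[i].pending_reply)
			continue;
		ctxs[i].pending_reply = false;	/* treat as if replied */
		(*outstanding)--;		/* drop it from the count */
		scrubbed++;
	}
	return scrubbed;
}
```

Under that assumption, the number of replies that were never received
does not matter to the caller: after the pass, no context is left
waiting and the outstanding count is back to zero.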
Daniele Ceraolo Spurio Aug. 25, 2021, 1:22 a.m. UTC | #3
On 8/24/2021 8:44 AM, Matthew Brost wrote:
> On Fri, Aug 20, 2021 at 05:25:41PM -0700, Daniele Ceraolo Spurio wrote:
>>
>> On 8/18/2021 11:16 PM, Matthew Brost wrote:
>>> It isn't safe to scrub for missing G2H or continue with the reset until
>>> all G2H processing is complete. Flush the G2H work queue during reset to
>>> ensure it is done running.
>> Might be worth moving this patch closer to "drm/i915/guc: Process all G2H
>> message at once in work queue".
>>
> Sure.
>
>>> Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>>    .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 ++----------------
>>>    1 file changed, 2 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>>> index 4cf5a565f08e..9a53bae367b1 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>>> @@ -714,8 +714,6 @@ static void guc_flush_submissions(struct intel_guc *guc)
>>>    void intel_guc_submission_reset_prepare(struct intel_guc *guc)
>>>    {
>>> -	int i;
>>> -
>>>    	if (unlikely(!guc_submission_initialized(guc))) {
>>>    		/* Reset called during driver load? GuC not yet initialised! */
>>>    		return;
>>> @@ -731,20 +729,8 @@ void intel_guc_submission_reset_prepare(struct intel_guc *guc)
>>>    	guc_flush_submissions(guc);
>>> -	/*
>>> -	 * Handle any outstanding G2Hs before reset. Call IRQ handler directly
>>> -	 * each pass as interrupt have been disabled. We always scrub for
>>> -	 * outstanding G2H as it is possible for outstanding_submission_g2h to
>>> -	 * be incremented after the context state update.
>>> -	 */
>>> -	for (i = 0; i < 4 && atomic_read(&guc->outstanding_submission_g2h); ++i) {
>>> -		intel_guc_to_host_event_handler(guc);
>>> -#define wait_for_reset(guc, wait_var) \
>>> -		intel_guc_wait_for_pending_msg(guc, wait_var, false, (HZ / 20))
>>> -		do {
>>> -			wait_for_reset(guc, &guc->outstanding_submission_g2h);
>>> -		} while (!list_empty(&guc->ct.requests.incoming));
>>> -	}
>>> +	flush_work(&guc->ct.requests.worker);
>>> +
>> We're no longer waiting for the outstanding G2H requests, just ensuring
>> that the processing of the ones we already received is done. Is this
>> intended? We do still handle the remaining outstanding submissions in the
>> scrub, so it's functionally correct, but the commit message doesn't state
>> the change in waiting behavior, so I wanted to double check it was planned.
>>
> Yes, it is planned, as the scrub code should be able to cope with any
> missing G2H. I will update the commit message to reflect that.

With the updated commit msg:

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Daniele

>
> Matt
>
>> Daniele
>>
>>>    	scrub_guc_desc_for_outstanding_g2h(guc);
>>>    }

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 4cf5a565f08e..9a53bae367b1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -714,8 +714,6 @@  static void guc_flush_submissions(struct intel_guc *guc)
 
 void intel_guc_submission_reset_prepare(struct intel_guc *guc)
 {
-	int i;
-
 	if (unlikely(!guc_submission_initialized(guc))) {
 		/* Reset called during driver load? GuC not yet initialised! */
 		return;
@@ -731,20 +729,8 @@  void intel_guc_submission_reset_prepare(struct intel_guc *guc)
 
 	guc_flush_submissions(guc);
 
-	/*
-	 * Handle any outstanding G2Hs before reset. Call IRQ handler directly
-	 * each pass as interrupt have been disabled. We always scrub for
-	 * outstanding G2H as it is possible for outstanding_submission_g2h to
-	 * be incremented after the context state update.
-	 */
-	for (i = 0; i < 4 && atomic_read(&guc->outstanding_submission_g2h); ++i) {
-		intel_guc_to_host_event_handler(guc);
-#define wait_for_reset(guc, wait_var) \
-		intel_guc_wait_for_pending_msg(guc, wait_var, false, (HZ / 20))
-		do {
-			wait_for_reset(guc, &guc->outstanding_submission_g2h);
-		} while (!list_empty(&guc->ct.requests.incoming));
-	}
+	flush_work(&guc->ct.requests.worker);
+
 	scrub_guc_desc_for_outstanding_g2h(guc);
 }