diff mbox series

drm/i915/selftests: re-init the GT in live_gt_pm

Message ID 20191120000425.31588-1-daniele.ceraolospurio@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915/selftests: re-init the GT in live_gt_pm | expand

Commit Message

Daniele Ceraolo Spurio Nov. 20, 2019, 12:04 a.m. UTC
When GuC is in use we need to make sure it is re-loaded before the call
to gt_resume, otherwise communication from the engines to the GuC will
not be processed, which blocks the engines from ctx switching and from
being reset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112205
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_gt_pm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Chris Wilson Nov. 20, 2019, 12:21 a.m. UTC | #1
Quoting Daniele Ceraolo Spurio (2019-11-20 00:04:25)
> When GuC is in use we need to make sure it is re-loaded before the call
> to gt_resume, otherwise communication from the engines to the GuC will
> not be processed, which blocks the engines from ctx switching and from
> being reset.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112205
> Cc: Andi Shyti <andi.shyti@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/selftest_gt_pm.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
> index d1752f15702a..0bb17c806dfc 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
> @@ -11,8 +11,11 @@ static int live_gt_resume(void *arg)
>  {
>         struct intel_gt *gt = arg;
>         IGT_TIMEOUT(end_time);
> +       intel_wakeref_t wakeref;
>         int err;
>  
> +       wakeref = intel_runtime_pm_get(gt->uncore->rpm);

That defeats the point of gt pm, no?

> +
>         /* Do several suspend/resume cycles to check we don't explode! */
>         do {
>                 intel_gt_suspend_prepare(gt);
> @@ -25,6 +28,10 @@ static int live_gt_resume(void *arg)
>                         break;
>                 }
>  
> +               err = intel_gt_init_hw(gt);

Hmm. I have that as part of intel_gt_resume.  Which also pulls it into
the pm.

I think I prefer my plan/patches :)
-Chris
Daniele Ceraolo Spurio Nov. 20, 2019, 12:32 a.m. UTC | #2
On 11/19/19 4:21 PM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-11-20 00:04:25)
>> When GuC is in use we need to make sure it is re-loaded before the call
>> to gt_resume, otherwise communication from the engines to the GuC will
>> not be processed, which blocks the engines from ctx switching and from
>> being reset.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112205
>> Cc: Andi Shyti <andi.shyti@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/selftest_gt_pm.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
>> index d1752f15702a..0bb17c806dfc 100644
>> --- a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
>> +++ b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
>> @@ -11,8 +11,11 @@ static int live_gt_resume(void *arg)
>>   {
>>          struct intel_gt *gt = arg;
>>          IGT_TIMEOUT(end_time);
>> +       intel_wakeref_t wakeref;
>>          int err;
>>   
>> +       wakeref = intel_runtime_pm_get(gt->uncore->rpm);
> 
> That defeats the point of gt pm, no?
> 
>> +
>>          /* Do several suspend/resume cycles to check we don't explode! */
>>          do {
>>                  intel_gt_suspend_prepare(gt);
>> @@ -25,6 +28,10 @@ static int live_gt_resume(void *arg)
>>                          break;
>>                  }
>>   
>> +               err = intel_gt_init_hw(gt);
> 
> Hmm. I have that as part of intel_gt_resume.  Which also pulls it into
> the pm.

I also considered moving init_hw() inside resume(), but in the end opted 
not to to keep the fix isolated to the test. But if you have already 
done the work...

> 
> I think I prefer my plan/patches :)

Can you point me to them if they're already on the list?

Thanks,
Daniele

> -Chris
>
Daniele Ceraolo Spurio Dec. 4, 2019, 11:17 p.m. UTC | #3
On 11/19/19 4:32 PM, Daniele Ceraolo Spurio wrote:
> 
> 
> On 11/19/19 4:21 PM, Chris Wilson wrote:
>> Quoting Daniele Ceraolo Spurio (2019-11-20 00:04:25)
>>> When GuC is in use we need to make sure it is re-loaded before the call
>>> to gt_resume, otherwise communication from the engines to the GuC will
>>> not be processed, which blocks the engines from ctx switching and from
>>> being reset.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112205
>>> Cc: Andi Shyti <andi.shyti@intel.com>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/selftest_gt_pm.c | 9 +++++++++
>>>   1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c 
>>> b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
>>> index d1752f15702a..0bb17c806dfc 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
>>> @@ -11,8 +11,11 @@ static int live_gt_resume(void *arg)
>>>   {
>>>          struct intel_gt *gt = arg;
>>>          IGT_TIMEOUT(end_time);
>>> +       intel_wakeref_t wakeref;
>>>          int err;
>>> +       wakeref = intel_runtime_pm_get(gt->uncore->rpm);
>>
>> That defeats the point of gt pm, no?
>>
>>> +
>>>          /* Do several suspend/resume cycles to check we don't 
>>> explode! */
>>>          do {
>>>                  intel_gt_suspend_prepare(gt);
>>> @@ -25,6 +28,10 @@ static int live_gt_resume(void *arg)
>>>                          break;
>>>                  }
>>> +               err = intel_gt_init_hw(gt);
>>
>> Hmm. I have that as part of intel_gt_resume.  Which also pulls it into
>> the pm.
> 
> I also considered moving init_hw() inside resume(), but in the end opted 
> not to to keep the fix isolated to the test. But if you have already 
> done the work...
> 
>>
>> I think I prefer my plan/patches :)
> 
> Can you point me to them if they're already on the list?
> 

Hey Chris,

If your solution is still going to take a while, do you mind if we go 
ahead with this fix in the meantime (switching to gt pm), so we can fix 
the bug? More GuC code is going to slowly trickle in in the next few 
weeks, including attempting to turn HuC auth on by default again if 
things look good enough, so we need to get the GuC CI health under 
control before that.

Thanks,
Daniele

> Thanks,
> Daniele
> 
>> -Chris
>>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
index d1752f15702a..0bb17c806dfc 100644
--- a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
@@ -11,8 +11,11 @@  static int live_gt_resume(void *arg)
 {
 	struct intel_gt *gt = arg;
 	IGT_TIMEOUT(end_time);
+	intel_wakeref_t wakeref;
 	int err;
 
+	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
+
 	/* Do several suspend/resume cycles to check we don't explode! */
 	do {
 		intel_gt_suspend_prepare(gt);
@@ -25,6 +28,10 @@  static int live_gt_resume(void *arg)
 			break;
 		}
 
+		err = intel_gt_init_hw(gt);
+		if (err)
+			break;
+
 		err = intel_gt_resume(gt);
 		if (err)
 			break;
@@ -44,6 +51,8 @@  static int live_gt_resume(void *arg)
 		}
 	} while (!__igt_timeout(end_time, NULL));
 
+	intel_runtime_pm_put(gt->uncore->rpm, wakeref);
+
 	return err;
 }