diff mbox series

[1/6] drm/i915/selftests: Serialise nop reset with retirement

Message ID 20190626065303.31624-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series [1/6] drm/i915/selftests: Serialise nop reset with retirement | expand

Commit Message

Chris Wilson June 26, 2019, 6:52 a.m. UTC
In order for the reset count to be accurate across our selftest, we need
to prevent the background retire worker from modifying our expected
state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Mika Kuoppala June 26, 2019, 1:11 p.m. UTC | #1
Chris Wilson <chris@chris-wilson.co.uk> writes:

> In order for the reset count to be accurate across our selftest, we need
> to prevent the background retire worker from modifying our expected
> state.
>

Ok, to summarize the irc discussion we had: The above holds true
for igt_reset_engine_nop only. As there is no race in
global reset path...currently.

But there is intent towards symmetry on both paths
so it makes sense to keep the tests aligned.

The commit msg could be enhanced on this regard.

Also while looking through this, we do increase
the reset_count rather early before the failpaths.
even with the resets disabled it gets incremented.
So now it is more of a attempted reset count.

But that is not a topic for this patch so,
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 3ceb397c8645..0e0b6c572ae9 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -398,6 +398,7 @@ static int igt_reset_nop(void *arg)
>  	count = 0;
>  	do {
>  		mutex_lock(&i915->drm.struct_mutex);
> +
>  		for_each_engine(engine, i915, id) {
>  			int i;
>  
> @@ -413,11 +414,12 @@ static int igt_reset_nop(void *arg)
>  				i915_request_add(rq);
>  			}
>  		}
> -		mutex_unlock(&i915->drm.struct_mutex);
>  
>  		igt_global_reset_lock(i915);
>  		i915_reset(i915, ALL_ENGINES, NULL);
>  		igt_global_reset_unlock(i915);
> +
> +		mutex_unlock(&i915->drm.struct_mutex);
>  		if (i915_reset_failed(i915)) {
>  			err = -EIO;
>  			break;
> @@ -511,9 +513,8 @@ static int igt_reset_nop_engine(void *arg)
>  
>  				i915_request_add(rq);
>  			}
> -			mutex_unlock(&i915->drm.struct_mutex);
> -
>  			err = i915_reset_engine(engine, NULL);
> +			mutex_unlock(&i915->drm.struct_mutex);
>  			if (err) {
>  				pr_err("i915_reset_engine failed\n");
>  				break;
> -- 
> 2.20.1
Chris Wilson June 26, 2019, 1:22 p.m. UTC | #2
Quoting Mika Kuoppala (2019-06-26 14:11:49)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > In order for the reset count to be accurate across our selftest, we need
> > to prevent the background retire worker from modifying our expected
> > state.
> >
> 
> Ok, to summarize the irc discussion we had: The above holds true
> for igt_reset_engine_nop only. As there is no race in
> global reset path...currently.
> 
> But there is intent towards symmetry on both paths
> so it makes sense to keep the tests aligned.
> 
> The commit msg could be enhanced on this regard.
> 
> Also while looking through this, we do increase
> the reset_count rather early before the failpaths.
> even with the resets disabled it gets incremented.
> So now it is more of a attempted reset count.

We don't expect it to fail, and if it does we wedge and report -EIO not
just a boring -EINVAL (or in theory we do -- that's generally the
approach we take elsewhere, treating the GPU going south as a more
severe failure than the test itself failing to fulfil its contract).
-Chris
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 3ceb397c8645..0e0b6c572ae9 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -398,6 +398,7 @@  static int igt_reset_nop(void *arg)
 	count = 0;
 	do {
 		mutex_lock(&i915->drm.struct_mutex);
+
 		for_each_engine(engine, i915, id) {
 			int i;
 
@@ -413,11 +414,12 @@  static int igt_reset_nop(void *arg)
 				i915_request_add(rq);
 			}
 		}
-		mutex_unlock(&i915->drm.struct_mutex);
 
 		igt_global_reset_lock(i915);
 		i915_reset(i915, ALL_ENGINES, NULL);
 		igt_global_reset_unlock(i915);
+
+		mutex_unlock(&i915->drm.struct_mutex);
 		if (i915_reset_failed(i915)) {
 			err = -EIO;
 			break;
@@ -511,9 +513,8 @@  static int igt_reset_nop_engine(void *arg)
 
 				i915_request_add(rq);
 			}
-			mutex_unlock(&i915->drm.struct_mutex);
-
 			err = i915_reset_engine(engine, NULL);
+			mutex_unlock(&i915->drm.struct_mutex);
 			if (err) {
 				pr_err("i915_reset_engine failed\n");
 				break;