diff mbox series

[i-g-t] lib/i915/gem_engine_topology.c - intel_get_current_engine invalid result

Message ID 20200213192606.63025-1-dale.b.stimson@intel.com (mailing list archive)
State New, archived
Headers show
Series [i-g-t] lib/i915/gem_engine_topology.c - intel_get_current_engine invalid result | expand

Commit Message

Stimson, Dale B Feb. 13, 2020, 7:26 p.m. UTC
Function intel_get_current_engine() should return NULL (instead of
engine 0) if there are no engines.

Function intel_init_engine_list() should not store potential engine
data in the output structure unless the engine is present.

Function intel_init_engine_list() should arguably not filter the static
engine list with gem_has_ring if fd == -1, so that subtests can still
be individually invoked to show subtest FAIL instead of test notrun.

Symptom: A device open failure in gem_ctx_isolation resulted in
an endless __for_each_physical_engine "per-engine" loop with the
purported last potential engine being processed every time.

Diagnosis: device open (or debugfs open) failed, leaving fd == -1.
Control skipped the rest of the initial igt_fixture block, after
which an attempt was made to iterate through engines using macro
__for_each_physical_engine.

Macro __for_each_physical_engine called intel_init_engine_list()
to initialize the loop control data.  Because fd == -1,
intel_init_engine_list() fell back to using __for_each_static_engine().
All of the engines in the static engine list are rejected due to
gem_has_ring returning false (because of fd == -1), leaving 0 engines.
That resulted in loop control data with engine_data.nengines == 0
and the data for the last engine considered stored at index 0.

Still in macro __for_each_physical_engine, intel_get_current_engine()
was called to get the engine to process.  It should have returned NULL,
but instead returned the engine entry at index 0, which
had received information describing the last potential engine.
This happened without end.

Signed-off-by: Dale B Stimson <dale.b.stimson@intel.com>
---
 lib/i915/gem_engine_topology.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

Comments

Chris Wilson Feb. 13, 2020, 8:44 p.m. UTC | #1
Quoting Dale B Stimson (2020-02-13 19:26:06)
> Function intel_get_current_engine() should return NULL (instead of
> engine 0) if there are no engines.

There should be some igt to put basic use of for_each_engine() though
its paces. Nothing fancy, just complete a loop....

Andi, am I imagining this? I swear saw patches from you to do the
basics.

Anyway, there should be some, and this is worth adding to them,
for_each_context_engine() on an empty engines[] and assert we do not
enter the loop.
-Chris
Antonio Argenziano Feb. 14, 2020, 6:43 p.m. UTC | #2
On 13/02/20 11:26, Dale B Stimson wrote:
> Function intel_get_current_engine() should return NULL (instead of
> engine 0) if there are no engines.
> 
> Function intel_init_engine_list() should not store potential engine
> data in the output structure unless the engine is present.
> 
> Function intel_init_engine_list() should arguably not filter the static
> engine list with gem_has_ring if fd == -1, so that subtests can still
> be individually invoked to show subtest FAIL instead of test notrun.
> 
> Symptom: A device open failure in gem_ctx_isolation resulted in
> an endless __for_each_physical_engine "per-engine" loop with the
> purported last potential engine being processed every time.
> 
> Diagnosis: device open (or debugfs open) failed, leaving fd == -1.
> Control skipped the rest of the initial igt_fixture block, after
> which an attempt was made to iterate through engines using macro
> __for_each_physical_engine.
> 
> Macro __for_each_physical_engine called intel_init_engine_list()
> to initialize the loop control data.  Because fd == -1,
> intel_init_engine_list() fell back to using __for_each_static_engine().
> All of the engines in the static engine list are rejected due to
> gem_has_ring returning false (because of fd == -1), leaving 0 engines.
> That resulted in loop control data with engine_data.nengines == 0
> and the data for the last engine considered stored at index 0.
> 
> Still in macro __for_each_physical_engine, intel_get_current_engine()
> was called to get the engine to process.  It should have returned NULL,
> but instead returned the engine entry at index 0, which
> had received information describing the last potential engine.
> This happened without end.
> 
> Signed-off-by: Dale B Stimson <dale.b.stimson@intel.com>
> ---
>   lib/i915/gem_engine_topology.c | 29 ++++++++++++++++-------------
>   1 file changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
> index 9daa03df4..b8ed49bc9 100644
> --- a/lib/i915/gem_engine_topology.c
> +++ b/lib/i915/gem_engine_topology.c
> @@ -156,10 +156,10 @@ static void query_engine_list(int fd, struct intel_engine_data *ed)
>   struct intel_execution_engine2 *
>   intel_get_current_engine(struct intel_engine_data *ed)
>   {
> -	if (!ed->n)
> -		ed->current_engine = &ed->engines[0];
> -	else if (ed->n >= ed->nengines)
> +	if (ed->n >= ed->nengines)
>   		ed->current_engine = NULL;
> +	else if (!ed->n)
> +		ed->current_engine = &ed->engines[0];
>   
>   	return ed->current_engine;
>   }
> @@ -222,18 +222,21 @@ struct intel_engine_data intel_init_engine_list(int fd, uint32_t ctx_id)
>   		igt_debug("using pre-allocated engine list\n");
>   
>   		__for_each_static_engine(e2) {
> -			struct intel_execution_engine2 *__e2 =
> -				&engine_data.engines[engine_data.nengines];
> -
> -			strcpy(__e2->name, e2->name);
> -			__e2->instance   = e2->instance;
> -			__e2->class      = e2->class;
> -			__e2->flags      = e2->flags;
> -			__e2->is_virtual = false;
> -
>   			if (igt_only_list_subtests() ||
> -			    gem_has_ring(fd, e2->flags))
> +			    (fd < 0) ||

Patch LGTM, Chris do you have any issues merging this before someone 
implements some tests for the infrastructure?

Acked-by: Antonio Argenziano <antonio.argenziano@intel.com>

> +			    gem_has_ring(fd, e2->flags)) {
> +				struct intel_execution_engine2 *__e2 =
> +					&engine_data.engines[
> +					engine_data.nengines];
> +
> +				strcpy(__e2->name, e2->name);
> +				__e2->instance   = e2->instance;
> +				__e2->class      = e2->class;
> +				__e2->flags      = e2->flags;
> +				__e2->is_virtual = false;
> +
>   				engine_data.nengines++;
> +                        }
>   		}
>   		return engine_data;
>   	}
>
Chris Wilson Feb. 14, 2020, 6:44 p.m. UTC | #3
Quoting Antonio Argenziano (2020-02-14 18:43:01)
> 
> 
> On 13/02/20 11:26, Dale B Stimson wrote:
> > Function intel_get_current_engine() should return NULL (instead of
> > engine 0) if there are no engines.
> > 
> > Function intel_init_engine_list() should not store potential engine
> > data in the output structure unless the engine is present.
> > 
> > Function intel_init_engine_list() should arguably not filter the static
> > engine list with gem_has_ring if fd == -1, so that subtests can still
> > be individually invoked to show subtest FAIL instead of test notrun.
> > 
> > Symptom: A device open failure in gem_ctx_isolation resulted in
> > an endless __for_each_physical_engine "per-engine" loop with the
> > purported last potential engine being processed every time.
> > 
> > Diagnosis: device open (or debugfs open) failed, leaving fd == -1.
> > Control skipped the rest of the initial igt_fixture block, after
> > which an attempt was made to iterate through engines using macro
> > __for_each_physical_engine.
> > 
> > Macro __for_each_physical_engine called intel_init_engine_list()
> > to initialize the loop control data.  Because fd == -1,
> > intel_init_engine_list() fell back to using __for_each_static_engine().
> > All of the engines in the static engine list are rejected due to
> > gem_has_ring returning false (because of fd == -1), leaving 0 engines.
> > That resulted in loop control data with engine_data.nengines == 0
> > and the data for the last engine considered stored at index 0.
> > 
> > Still in macro __for_each_physical_engine, intel_get_current_engine()
> > was called to get the engine to process.  It should have returned NULL,
> > but instead returned the engine entry at index 0, which
> > had received information describing the last potential engine.
> > This happened without end.
> > 
> > Signed-off-by: Dale B Stimson <dale.b.stimson@intel.com>
> > ---
> >   lib/i915/gem_engine_topology.c | 29 ++++++++++++++++-------------
> >   1 file changed, 16 insertions(+), 13 deletions(-)
> > 
> > diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
> > index 9daa03df4..b8ed49bc9 100644
> > --- a/lib/i915/gem_engine_topology.c
> > +++ b/lib/i915/gem_engine_topology.c
> > @@ -156,10 +156,10 @@ static void query_engine_list(int fd, struct intel_engine_data *ed)
> >   struct intel_execution_engine2 *
> >   intel_get_current_engine(struct intel_engine_data *ed)
> >   {
> > -     if (!ed->n)
> > -             ed->current_engine = &ed->engines[0];
> > -     else if (ed->n >= ed->nengines)
> > +     if (ed->n >= ed->nengines)
> >               ed->current_engine = NULL;
> > +     else if (!ed->n)
> > +             ed->current_engine = &ed->engines[0];
> >   
> >       return ed->current_engine;
> >   }
> > @@ -222,18 +222,21 @@ struct intel_engine_data intel_init_engine_list(int fd, uint32_t ctx_id)
> >               igt_debug("using pre-allocated engine list\n");
> >   
> >               __for_each_static_engine(e2) {
> > -                     struct intel_execution_engine2 *__e2 =
> > -                             &engine_data.engines[engine_data.nengines];
> > -
> > -                     strcpy(__e2->name, e2->name);
> > -                     __e2->instance   = e2->instance;
> > -                     __e2->class      = e2->class;
> > -                     __e2->flags      = e2->flags;
> > -                     __e2->is_virtual = false;
> > -
> >                       if (igt_only_list_subtests() ||
> > -                         gem_has_ring(fd, e2->flags))
> > +                         (fd < 0) ||
> 
> Patch LGTM, Chris do you have any issues merging this before someone 
> implements some tests for the infrastructure?

It seems like a really trivial one to write a test for. 3 minutes
tops... Just do it.
-Chris
Chris Wilson Feb. 14, 2020, 6:54 p.m. UTC | #4
+static void libapi(int i915)
+{
+       I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 0);
+       struct drm_i915_gem_context_param p = {
+               .ctx_id = gem_context_create(i915),
+               .param = I915_CONTEXT_PARAM_ENGINES,
+               .value = to_user_pointer(&engines),
+               .size = sizeof(engines),
+       };
+       const struct intel_execution_engine2 *e;
+       unsigned int count = 0;
+
+       gem_context_set_param(i915, &p);
+
+       for_each_context_engine(i915, p.ctx_id, e)
+               count++;
+       igt_assert_eq(count, 0);
+
+       ____for_each_physical_engine(i915, p.ctx_id, e)
+               count++;
+       igt_assert_eq(count, 0);
+
+       gem_context_destroy(i915, p.ctx_id);
+}

I leave find a home and correcting the whitespace to the reader.
-Chris
Chris Wilson Feb. 14, 2020, 7:14 p.m. UTC | #5
Quoting Chris Wilson (2020-02-14 18:54:43)
> +static void libapi(int i915)
> +{
> +       I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 0);

I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 0) = {};
or
struct i915_gem_context_param_engines engines = {};

> +       struct drm_i915_gem_context_param p = {
> +               .ctx_id = gem_context_create(i915),
> +               .param = I915_CONTEXT_PARAM_ENGINES,
> +               .value = to_user_pointer(&engines),
> +               .size = sizeof(engines),
> +       };
> +       const struct intel_execution_engine2 *e;
> +       unsigned int count = 0;
> +
> +       gem_context_set_param(i915, &p);
> +
> +       for_each_context_engine(i915, p.ctx_id, e)
> +               count++;
> +       igt_assert_eq(count, 0);

Of course this says that this for_each_context_engine() loop doesn't
work anyway.
-Chris
diff mbox series

Patch

diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
index 9daa03df4..b8ed49bc9 100644
--- a/lib/i915/gem_engine_topology.c
+++ b/lib/i915/gem_engine_topology.c
@@ -156,10 +156,10 @@  static void query_engine_list(int fd, struct intel_engine_data *ed)
 struct intel_execution_engine2 *
 intel_get_current_engine(struct intel_engine_data *ed)
 {
-	if (!ed->n)
-		ed->current_engine = &ed->engines[0];
-	else if (ed->n >= ed->nengines)
+	if (ed->n >= ed->nengines)
 		ed->current_engine = NULL;
+	else if (!ed->n)
+		ed->current_engine = &ed->engines[0];
 
 	return ed->current_engine;
 }
@@ -222,18 +222,21 @@  struct intel_engine_data intel_init_engine_list(int fd, uint32_t ctx_id)
 		igt_debug("using pre-allocated engine list\n");
 
 		__for_each_static_engine(e2) {
-			struct intel_execution_engine2 *__e2 =
-				&engine_data.engines[engine_data.nengines];
-
-			strcpy(__e2->name, e2->name);
-			__e2->instance   = e2->instance;
-			__e2->class      = e2->class;
-			__e2->flags      = e2->flags;
-			__e2->is_virtual = false;
-
 			if (igt_only_list_subtests() ||
-			    gem_has_ring(fd, e2->flags))
+			    (fd < 0) ||
+			    gem_has_ring(fd, e2->flags)) {
+				struct intel_execution_engine2 *__e2 =
+					&engine_data.engines[
+					engine_data.nengines];
+
+				strcpy(__e2->name, e2->name);
+				__e2->instance   = e2->instance;
+				__e2->class      = e2->class;
+				__e2->flags      = e2->flags;
+				__e2->is_virtual = false;
+
 				engine_data.nengines++;
+                        }
 		}
 		return engine_data;
 	}