diff mbox series

[i-g-t,14/21] gem_wsim: Engine map load balance command

Message ID 20190508121058.27038-15-tvrtko.ursulin@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series Media scalability tooling | expand

Commit Message

Tvrtko Ursulin May 8, 2019, 12:10 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command for enabling a load balanced context map (aka
Virtual Engine). Example usage:

  B.1

This turns on load balancing for context one, assuming it has already been
configured with an engine map. Only DEFAULT engine specifier can be used
with load balanced engine maps.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 73 ++++++++++++++++++++++++++++++++++++++----
 benchmarks/wsim/README | 18 +++++++++++
 2 files changed, 84 insertions(+), 7 deletions(-)

Comments

Chris Wilson May 10, 2019, 1:31 p.m. UTC | #1
Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A new workload command for enabling a load balanced context map (aka
> Virtual Engine). Example usage:
> 
>   B.1
> 
> This turns on load balancing for context one, assuming it has already been
> configured with an engine map. Only DEFAULT engine specifier can be used
> with load balanced engine maps.

Restriction makes sense for keeping linenoise^W file format simple.

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>                 if (ctx->engine_map) {
>                         I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>                                                           ctx->engine_map_count + 1);
> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
> +                                                                ctx->engine_map_count);
>                         struct drm_i915_gem_context_param param = {
>                                 .ctx_id = ctx_id,
>                                 .param = I915_CONTEXT_PARAM_ENGINES,
> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>                                 .value = to_user_pointer(&set_engines),
>                         };
>  
> -                       set_engines.extensions = 0;
> +                       if (ctx->wants_balance) {
> +                               set_engines.extensions =
> +                                       to_user_pointer(&load_balance);
> +
> +                               memset(&load_balance, 0, sizeof(load_balance));
> +                               load_balance.base.name =
> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
> +                               load_balance.num_siblings =
> +                                       ctx->engine_map_count;
> +
> +                               for (j = 0; j < ctx->engine_map_count; j++) {
> +                                       load_balance.engines[j].engine_class =
> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
> +                                       load_balance.engines[j].engine_instance =
> +                                               ctx->engine_map[j] - VCS1; /* FIXME */

Ok, more fallout from fixing ctx->engine_map[] first?

Otherwise looks fine.
-Chris
Tvrtko Ursulin May 15, 2019, 11:44 a.m. UTC | #2
On 10/05/2019 14:31, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A new workload command for enabling a load balanced context map (aka
>> Virtual Engine). Example usage:
>>
>>    B.1
>>
>> This turns on load balancing for context one, assuming it has already been
>> configured with an engine map. Only DEFAULT engine specifier can be used
>> with load balanced engine maps.
> 
> Restriction makes sense for keeping linenoise^W file format simple.
> 
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>                  if (ctx->engine_map) {
>>                          I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>>                                                            ctx->engine_map_count + 1);
>> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
>> +                                                                ctx->engine_map_count);
>>                          struct drm_i915_gem_context_param param = {
>>                                  .ctx_id = ctx_id,
>>                                  .param = I915_CONTEXT_PARAM_ENGINES,
>> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>                                  .value = to_user_pointer(&set_engines),
>>                          };
>>   
>> -                       set_engines.extensions = 0;
>> +                       if (ctx->wants_balance) {
>> +                               set_engines.extensions =
>> +                                       to_user_pointer(&load_balance);
>> +
>> +                               memset(&load_balance, 0, sizeof(load_balance));
>> +                               load_balance.base.name =
>> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
>> +                               load_balance.num_siblings =
>> +                                       ctx->engine_map_count;
>> +
>> +                               for (j = 0; j < ctx->engine_map_count; j++) {
>> +                                       load_balance.engines[j].engine_class =
>> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
>> +                                       load_balance.engines[j].engine_instance =
>> +                                               ctx->engine_map[j] - VCS1; /* FIXME */
> 
> Ok, more fallout from fixing ctx->engine_map[] first?

Not sure I understand the question.

I am at the moment updating the series with review feedback and some 
small thing here and there. When done with that I'll see if these VCS 
hardcoded assumptions can be easily solved. Basically I will have a go 
at integrating engine discovery which I think its definitely needed now 
that I have added class based engine map building ability.

Regards,

Tvrtko

> Otherwise looks fine.
> -Chris
>
Chris Wilson May 15, 2019, 11:48 a.m. UTC | #3
Quoting Tvrtko Ursulin (2019-05-15 12:44:41)
> 
> On 10/05/2019 14:31, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> A new workload command for enabling a load balanced context map (aka
> >> Virtual Engine). Example usage:
> >>
> >>    B.1
> >>
> >> This turns on load balancing for context one, assuming it has already been
> >> configured with an engine map. Only DEFAULT engine specifier can be used
> >> with load balanced engine maps.
> > 
> > Restriction makes sense for keeping linenoise^W file format simple.
> > 
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
> >>                  if (ctx->engine_map) {
> >>                          I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
> >>                                                            ctx->engine_map_count + 1);
> >> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
> >> +                                                                ctx->engine_map_count);
> >>                          struct drm_i915_gem_context_param param = {
> >>                                  .ctx_id = ctx_id,
> >>                                  .param = I915_CONTEXT_PARAM_ENGINES,
> >> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
> >>                                  .value = to_user_pointer(&set_engines),
> >>                          };
> >>   
> >> -                       set_engines.extensions = 0;
> >> +                       if (ctx->wants_balance) {
> >> +                               set_engines.extensions =
> >> +                                       to_user_pointer(&load_balance);
> >> +
> >> +                               memset(&load_balance, 0, sizeof(load_balance));
> >> +                               load_balance.base.name =
> >> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
> >> +                               load_balance.num_siblings =
> >> +                                       ctx->engine_map_count;
> >> +
> >> +                               for (j = 0; j < ctx->engine_map_count; j++) {
> >> +                                       load_balance.engines[j].engine_class =
> >> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
> >> +                                       load_balance.engines[j].engine_instance =
> >> +                                               ctx->engine_map[j] - VCS1; /* FIXME */
> > 
> > Ok, more fallout from fixing ctx->engine_map[] first?
> 
> Not sure I understand the question.

The proliferation of FIXME, the assumption of CLASS_VIDEO and an
impedance mismatch between engine_map and class:instance. Basically
those FIXME raise the question of what do you intend this to look like?
-Chris
Tvrtko Ursulin May 15, 2019, 11:55 a.m. UTC | #4
On 15/05/2019 12:48, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-15 12:44:41)
>>
>> On 10/05/2019 14:31, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> A new workload command for enabling a load balanced context map (aka
>>>> Virtual Engine). Example usage:
>>>>
>>>>     B.1
>>>>
>>>> This turns on load balancing for context one, assuming it has already been
>>>> configured with an engine map. Only DEFAULT engine specifier can be used
>>>> with load balanced engine maps.
>>>
>>> Restriction makes sense for keeping linenoise^W file format simple.
>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>>>                   if (ctx->engine_map) {
>>>>                           I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>>>>                                                             ctx->engine_map_count + 1);
>>>> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
>>>> +                                                                ctx->engine_map_count);
>>>>                           struct drm_i915_gem_context_param param = {
>>>>                                   .ctx_id = ctx_id,
>>>>                                   .param = I915_CONTEXT_PARAM_ENGINES,
>>>> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>>>                                   .value = to_user_pointer(&set_engines),
>>>>                           };
>>>>    
>>>> -                       set_engines.extensions = 0;
>>>> +                       if (ctx->wants_balance) {
>>>> +                               set_engines.extensions =
>>>> +                                       to_user_pointer(&load_balance);
>>>> +
>>>> +                               memset(&load_balance, 0, sizeof(load_balance));
>>>> +                               load_balance.base.name =
>>>> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
>>>> +                               load_balance.num_siblings =
>>>> +                                       ctx->engine_map_count;
>>>> +
>>>> +                               for (j = 0; j < ctx->engine_map_count; j++) {
>>>> +                                       load_balance.engines[j].engine_class =
>>>> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
>>>> +                                       load_balance.engines[j].engine_instance =
>>>> +                                               ctx->engine_map[j] - VCS1; /* FIXME */
>>>
>>> Ok, more fallout from fixing ctx->engine_map[] first?
>>
>> Not sure I understand the question.
> 
> The proliferation of FIXME, the assumption of CLASS_VIDEO and an
> impedance mismatch between engine_map and class:instance. Basically
> those FIXME raise the question of what do you intend this to look like?

Intention that implicit and explicit engine maps get populated by 
available engines.

When "-b i915":

1.VCS.1000.0.0 -> implicit map of available vcs engines

M.1.VCS
B.1	\-> explicit map of available vcs engines

That would support Icelake vcs0+vcs2 SKUs. And explicit engine map wsims 
would be more portable, like the original ones were.

Also, I am contemplating using VCS2 in wsim as meaning the 2nd VCS 
engine, so logical instances. So:

M.1.VCS1|VCS2 -> also works on both SKL and ICL (two vcs SKUs)
B.1

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index c2e13d9939c2..b610a603f7b0 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -83,7 +83,8 @@  enum w_type
 	SW_FENCE_SIGNAL,
 	CTX_PRIORITY,
 	PREEMPTION,
-	ENGINE_MAP
+	ENGINE_MAP,
+	LOAD_BALANCE,
 };
 
 struct deps
@@ -121,6 +122,7 @@  struct w_step
 			unsigned int engine_map_count;
 			enum intel_engine_id *engine_map;
 		};
+		bool load_balance;
 	};
 
 	/* Implementation details */
@@ -501,6 +503,25 @@  parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = PREEMPTION;
 				goto add_step;
+			} else if (!strcmp(field, "B")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 0,
+						  "Invalid load balance format at step %u!\n",
+						  nr_steps);
+
+					step.context = tmp;
+					step.load_balance = true;
+
+					nr++;
+				}
+
+				step.type = LOAD_BALANCE;
+				goto add_step;
 			}
 
 			if (!field) {
@@ -833,7 +854,7 @@  find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
 			return i + 1;
 	}
 
-	igt_assert(0);
+	igt_assert(ctx->wants_balance);
 	return 0;
 }
 
@@ -1044,12 +1065,19 @@  prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				wrk->ctx_list[j].engine_map = w->engine_map;
 				wrk->ctx_list[j].engine_map_count =
 					w->engine_map_count;
+			} else if (w->type == LOAD_BALANCE) {
+				if (!wrk->ctx_list[j].engine_map) {
+					wsim_err("Load balancing needs an engine map!\n");
+					return 1;
+				}
+				wrk->ctx_list[j].wants_balance =
+					w->load_balance;
 			}
 		}
 
 		wrk->ctx_list[j].targets_instance = targets;
 		if (flags & I915)
-			wrk->ctx_list[j].wants_balance = balance;
+			wrk->ctx_list[j].wants_balance |= balance;
 	}
 
 	/*
@@ -1063,10 +1091,19 @@  prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			if (w->type != BATCH)
 				continue;
 
-			if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
+			if (wrk->ctx_list[j].engine_map &&
+			    !wrk->ctx_list[j].wants_balance &&
+			    (w->engine == VCS || w->engine == DEFAULT)) {
 				wsim_err("Batches targetting engine maps must use explicit engines!\n");
 				return -1;
 			}
+
+			if (wrk->ctx_list[j].engine_map &&
+			    wrk->ctx_list[j].wants_balance &&
+			    w->engine != DEFAULT) {
+				wsim_err("Batches targetting load balanced maps must not use explicit engines!\n");
+				return -1;
+			}
 		}
 	}
 
@@ -1111,7 +1148,8 @@  prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				break;
 			}
 
-			if (!ctx->engine_map && !ctx->targets_instance)
+			if ((!ctx->engine_map && !ctx->targets_instance) ||
+			    (ctx->engine_map && ctx->wants_balance))
 				args.flags |=
 				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
 
@@ -1172,6 +1210,8 @@  prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		if (ctx->engine_map) {
 			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
 							  ctx->engine_map_count + 1);
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
+								 ctx->engine_map_count);
 			struct drm_i915_gem_context_param param = {
 				.ctx_id = ctx_id,
 				.param = I915_CONTEXT_PARAM_ENGINES,
@@ -1179,7 +1219,25 @@  prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				.value = to_user_pointer(&set_engines),
 			};
 
-			set_engines.extensions = 0;
+			if (ctx->wants_balance) {
+				set_engines.extensions =
+					to_user_pointer(&load_balance);
+
+				memset(&load_balance, 0, sizeof(load_balance));
+				load_balance.base.name =
+					I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
+				load_balance.num_siblings =
+					ctx->engine_map_count;
+
+				for (j = 0; j < ctx->engine_map_count; j++) {
+					load_balance.engines[j].engine_class =
+						I915_ENGINE_CLASS_VIDEO; /* FIXME */
+					load_balance.engines[j].engine_instance =
+						ctx->engine_map[j] - VCS1; /* FIXME */
+				}
+			} else {
+				set_engines.extensions = 0;
+			}
 
 			/* Reserve slot for virtual engine. */
 			set_engines.engines[0].engine_class =
@@ -2164,7 +2222,8 @@  static void *run_workload(void *data)
 				}
 				continue;
 			} else if (w->type == PREEMPTION ||
-				   w->type == ENGINE_MAP) {
+				   w->type == ENGINE_MAP ||
+				   w->type == LOAD_BALANCE) {
 				continue;
 			}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 4b14aa28bfa7..45fa1e0a1d76 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -3,6 +3,7 @@  Workload descriptor format
 
 ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
@@ -24,6 +25,7 @@  Additional workload steps are also supported:
  'q' - Throttle to n max queue depth.
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
+ 'B' - Turn on context load balancing.
  'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
@@ -176,3 +178,19 @@  Example:
 
 This sets up context 1 with an engine map containing VCS1 and VCS2 engine.
 Submission to this context can now only reference these two engines.
+
+Context load balancing
+----------------------
+
+Context load balancing (aka Virtual Engine) is an i915 feature where the driver
+will pick the best engine (most idle) to submit to given previously configured
+engine map.
+
+Example:
+
+  B.1
+
+This enables load balancing for context number one.
+
+Submissions to load balanced contexts are only allowed to use the DEFAULT engine
+specifier.