Message ID | 20180905142222.3251-4-tvrtko.ursulin@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Per context dynamic (sub)slice power-gating | expand |
Quoting Tvrtko Ursulin (2018-09-05 15:22:18) > From: Chris Wilson <chris@chris-wilson.co.uk> > > We want to expose the ability to reconfigure the slices, subslice and > eu per context and per engine. To facilitate that, store the current > configuration on the context for each engine, which is initially set > to the device default upon creation. > > v2: record sseu configuration per context & engine (Chris) > > v3: introduce the i915_gem_context_sseu to store powergating > programming, sseu_dev_info has grown quite a bit (Lionel) > > v4: rename i915_gem_sseu into intel_sseu (Chris) > use to_intel_context() (Chris) > > v5: More to_intel_context() (Tvrtko) > Switch intel_sseu from union to struct (Tvrtko) > Move context default sseu in existing loop (Chris) > > v6: s/intel_sseu_from_device_sseu/intel_device_default_sseu/ (Tvrtko) > > Tvrtko Ursulin: > > v7: > * Pass intel_sseu by pointer instead of value to make_rpcs. > * Rebase for make_rpcs changes. > > v8: > * Rebase for RPCS edit on pin. > > v9: > * Rebase for context image setup changes. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> I feel this is substantially different (since I just outlined a v1!) to merit a Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> and probably deserves a different author. I think Lionel is still the principle author here, but Tvrtko has done a lot of refactoring and integrating in the new scheme. > -static u32 make_rpcs(struct drm_i915_private *dev_priv); > +static u32 make_rpcs(struct drm_i915_private *dev_priv, > + struct intel_sseu *ctx_sseu); > > static struct intel_context * > __execlists_context_pin(struct intel_engine_cs *engine, > @@ -1349,7 +1350,7 @@ __execlists_context_pin(struct intel_engine_cs *engine, > /* RPCS */ > if (engine->class == RENDER_CLASS) { > ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = > - make_rpcs(engine->i915); > + make_rpcs(engine->i915, &ce->sseu); We have different habits here; my vim config just gives this a single tab indent beyond the incomplete line. (Was going to say it earlier ;) -Chris
On 05/09/2018 16:18, Chris Wilson wrote: > Quoting Tvrtko Ursulin (2018-09-05 15:22:18) >> From: Chris Wilson <chris@chris-wilson.co.uk> >> >> We want to expose the ability to reconfigure the slices, subslice and >> eu per context and per engine. To facilitate that, store the current >> configuration on the context for each engine, which is initially set >> to the device default upon creation. >> >> v2: record sseu configuration per context & engine (Chris) >> >> v3: introduce the i915_gem_context_sseu to store powergating >> programming, sseu_dev_info has grown quite a bit (Lionel) >> >> v4: rename i915_gem_sseu into intel_sseu (Chris) >> use to_intel_context() (Chris) >> >> v5: More to_intel_context() (Tvrtko) >> Switch intel_sseu from union to struct (Tvrtko) >> Move context default sseu in existing loop (Chris) >> >> v6: s/intel_sseu_from_device_sseu/intel_device_default_sseu/ (Tvrtko) >> >> Tvrtko Ursulin: >> >> v7: >> * Pass intel_sseu by pointer instead of value to make_rpcs. >> * Rebase for make_rpcs changes. >> >> v8: >> * Rebase for RPCS edit on pin. >> >> v9: >> * Rebase for context image setup changes. >> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > I feel this is substantially different (since I just outlined a v1!) to > merit a > > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > > and probably deserves a different author. I think Lionel is still the > principle author here, but Tvrtko has done a lot of refactoring and > integrating in the new scheme. Agreed Lionel is the real author here - mine were just small tweaks. >> -static u32 make_rpcs(struct drm_i915_private *dev_priv); >> +static u32 make_rpcs(struct drm_i915_private *dev_priv, >> + struct intel_sseu *ctx_sseu); >> >> static struct intel_context * >> __execlists_context_pin(struct intel_engine_cs *engine, >> @@ -1349,7 +1350,7 @@ __execlists_context_pin(struct intel_engine_cs *engine, >> /* RPCS */ >> if (engine->class == RENDER_CLASS) { >> ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = >> - make_rpcs(engine->i915); >> + make_rpcs(engine->i915, &ce->sseu); > > We have different habits here; my vim config just gives this a single > tab indent beyond the incomplete line. (Was going to say it earlier ;) Not sure if my approach here is always consistent, but I *think* I first try to indent it to where it "looks good". If neither indentation looks decidedly better, then I push it so end aligns with the wrap marker. I in this particular case I wasn't too happy with any of the options. :( Regards, Tvrtko
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 767615ecdea5..e4682fc572e6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3449,6 +3449,20 @@ mkwrite_device_info(struct drm_i915_private *dev_priv) return (struct intel_device_info *)&dev_priv->info; } +static inline struct intel_sseu +intel_device_default_sseu(struct drm_i915_private *i915) +{ + const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu; + struct intel_sseu value = { + .slice_mask = sseu->slice_mask, + .subslice_mask = sseu->subslice_mask[0], + .min_eus_per_subslice = sseu->max_eus_per_subslice, + .max_eus_per_subslice = sseu->max_eus_per_subslice, + }; + + return value; +} + /* modesetting */ extern void intel_modeset_init_hw(struct drm_device *dev); extern int intel_modeset_init(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 747b8170a15a..ca2c8fcd1090 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -343,6 +343,8 @@ __create_hw_context(struct drm_i915_private *dev_priv, struct intel_context *ce = &ctx->__engine[n]; ce->gem_context = ctx; + /* Use the whole device by default */ + ce->sseu = intel_device_default_sseu(dev_priv); } INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL); diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h index e09673ca731d..79d2e8f62ad1 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.h +++ b/drivers/gpu/drm/i915/i915_gem_context.h @@ -31,6 +31,7 @@ #include "i915_gem.h" #include "i915_scheduler.h" +#include "intel_device_info.h" struct pid; @@ -165,6 +166,9 @@ struct i915_gem_context { int pin_count; const struct intel_context_ops *ops; + + /** sseu: Control eu/slice partitioning */ + struct intel_sseu sseu; } __engine[I915_NUM_ENGINES]; /** ring_size: size for allocating the per-engine ring buffer */ diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 9898301ab7ef..eb6f8cce16c4 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -39,6 +39,16 @@ struct drm_i915_gem_object; struct i915_request; struct i915_timeline; +/* + * Powergating configuration for a particular (context,engine). + */ +struct intel_sseu { + u8 slice_mask; + u8 subslice_mask; + u8 min_eus_per_subslice; + u8 max_eus_per_subslice; +}; + struct intel_wait { struct rb_node node; struct task_struct *tsk; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3bdc1ac3e926..8a477e43dbca 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,7 +1305,8 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } -static u32 make_rpcs(struct drm_i915_private *dev_priv); +static u32 make_rpcs(struct drm_i915_private *dev_priv, + struct intel_sseu *ctx_sseu); static struct intel_context * __execlists_context_pin(struct intel_engine_cs *engine, @@ -1349,7 +1350,7 @@ __execlists_context_pin(struct intel_engine_cs *engine, /* RPCS */ if (engine->class == RENDER_CLASS) { ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = - make_rpcs(engine->i915); + make_rpcs(engine->i915, &ce->sseu); } ce->state->obj->pin_global++; @@ -2493,12 +2494,13 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine) return logical_ring_init(engine); } -static u32 -make_rpcs(struct drm_i915_private *dev_priv) +static u32 make_rpcs(struct drm_i915_private *dev_priv, + struct intel_sseu *ctx_sseu) { - bool subslice_pg = INTEL_INFO(dev_priv)->sseu.has_subslice_pg; - u8 slices = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask); - u8 subslices = hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]); + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu; + bool subslice_pg = sseu->has_subslice_pg; + u8 slices = hweight8(ctx_sseu->slice_mask); + u8 subslices = hweight8(ctx_sseu->subslice_mask); u32 rpcs = 0; /* @@ -2539,7 +2541,7 @@ make_rpcs(struct drm_i915_private *dev_priv) * must make an explicit request through RPCS for full * enablement. */ - if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) { + if (sseu->has_slice_pg) { u32 mask, val = slices; if (INTEL_GEN(dev_priv) >= 11) { @@ -2567,18 +2569,16 @@ make_rpcs(struct drm_i915_private *dev_priv) rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val; } - if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) { + if (sseu->has_eu_pg) { u32 val; - val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MIN_SHIFT; + val = ctx_sseu->min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK); val &= GEN8_RPCS_EU_MIN_MASK; rpcs |= val; - val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MAX_SHIFT; + val = ctx_sseu->max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK); val &= GEN8_RPCS_EU_MAX_MASK;