Message ID | 1524138288-23646-1-git-send-email-tomasz.lis@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Tomasz Lis (2018-04-19 12:44:48) > The patch adds support of preempt-to-idle requesting by setting a proper > bit within Execlist Control Register, and receiving preemption result from > Context Status Buffer. > > Preemption in previous gens required a special batch buffer to be executed, > so the Command Streamer never preempted to idle directly. In Icelake it is > possible, as there is a hardware mechanism to inform the kernel about > status of the preemption request. > > This patch does not cover using the new preemption mechanism when GuC is > active. > > v2: Added needs_preempt_context() change so that it is not created when > preempt-to-idle is supported. (Chris) > Updated setting HWACK flag so that it is cleared after > preempt-to-dle. (Chris, Daniele) > Updated to use I915_ENGINE_HAS_PREEMPTION flag. (Chris) > > Bspec: 18922 > Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> > --- > drivers/gpu/drm/i915/i915_drv.h | 2 ++ > drivers/gpu/drm/i915/i915_gem_context.c | 4 ++- > drivers/gpu/drm/i915/i915_pci.c | 3 +- > drivers/gpu/drm/i915/intel_device_info.h | 1 + > drivers/gpu/drm/i915/intel_lrc.c | 47 ++++++++++++++++++++++++++++---- > drivers/gpu/drm/i915/intel_lrc.h | 1 + > 6 files changed, 51 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 0286911..f445340 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -2518,6 +2518,8 @@ intel_info(const struct drm_i915_private *dev_priv) > ((dev_priv)->info.has_logical_ring_elsq) > #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \ > ((dev_priv)->info.has_logical_ring_preemption) > +#define HAS_HW_PREEMPT_TO_IDLE(dev_priv) \ > + ((dev_priv)->info.has_hw_preempt_to_idle) > > #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv) > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 74435af..d65f469 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -454,7 +454,9 @@ destroy_kernel_context(struct i915_gem_context **ctxp) > > static bool needs_preempt_context(struct drm_i915_private *i915) > { > - return HAS_LOGICAL_RING_PREEMPTION(i915); > + return HAS_LOGICAL_RING_PREEMPTION(i915) && > + !HAS_HW_PREEMPT_TO_IDLE(i915) && > + !USES_GUC_SUBMISSION(i915); Pardon? The guc uses the preempt_context for its preempt_client. > int i915_gem_contexts_init(struct drm_i915_private *dev_priv) > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c > index 4364922..66b6700 100644 > --- a/drivers/gpu/drm/i915/i915_pci.c > +++ b/drivers/gpu/drm/i915/i915_pci.c > @@ -595,7 +595,8 @@ static const struct intel_device_info intel_cannonlake_info = { > GEN(11), \ > .ddb_size = 2048, \ > .has_csr = 0, \ > - .has_logical_ring_elsq = 1 > + .has_logical_ring_elsq = 1, \ > + .has_hw_preempt_to_idle = 1 > > static const struct intel_device_info intel_icelake_11_info = { > GEN11_FEATURES, > diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h > index 933e316..4eb97b5 100644 > --- a/drivers/gpu/drm/i915/intel_device_info.h > +++ b/drivers/gpu/drm/i915/intel_device_info.h > @@ -98,6 +98,7 @@ enum intel_platform { > func(has_logical_ring_contexts); \ > func(has_logical_ring_elsq); \ > func(has_logical_ring_preemption); \ > + func(has_hw_preempt_to_idle); \ > func(has_overlay); \ > func(has_pooled_eu); \ > func(has_psr); \ > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index 029901a..4c94488 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -154,6 +154,7 @@ > #define GEN8_CTX_STATUS_ACTIVE_IDLE (1 << 3) > #define GEN8_CTX_STATUS_COMPLETE (1 << 4) > #define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15) > +#define GEN11_CTX_STATUS_PREEMPT_IDLE (1 << 29) > > #define GEN8_CTX_STATUS_COMPLETED_MASK \ > (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED) > @@ -552,6 +553,25 @@ static void inject_preempt_context(struct intel_engine_cs *engine) > execlists_set_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT); > } > > +static void gen11_preempt_to_idle(struct intel_engine_cs *engine) > +{ > + struct intel_engine_execlists *execlists = &engine->execlists; > + > + GEM_TRACE("%s\n", engine->name); > + > + /* > + * hardware which HAS_HW_PREEMPT_TO_IDLE(), always also > + * HAS_LOGICAL_RING_ELSQ(), so we can assume ctrl_reg is set > + */ > + GEM_BUG_ON(execlists->ctrl_reg == NULL); > + > + /* trigger preemption to idle */ > + writel(EL_CTRL_PREEMPT_TO_IDLE, execlists->ctrl_reg); > + > + execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK); > + execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT); > +} > + > static void execlists_dequeue(struct intel_engine_cs *engine) > { > struct intel_engine_execlists * const execlists = &engine->execlists; > @@ -611,7 +631,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine) > goto unlock; > > if (need_preempt(engine, last, execlists->queue_priority)) { > - inject_preempt_context(engine); > + if (HAS_HW_PREEMPT_TO_IDLE(engine->i915)) > + gen11_preempt_to_idle(engine); > + else > + inject_preempt_context(engine); Please do move this to inject_preempt_context. The conflict with other work in flight is not worth the hassle, especially to reiterate such points as we already have the equivalent machine check and so avoid repeating it in even more pointer dancing. > goto unlock; > } > > @@ -1010,7 +1033,15 @@ static void execlists_submission_tasklet(unsigned long data) > GEN8_CTX_STATUS_PREEMPTED)) > execlists_set_active(execlists, > EXECLISTS_ACTIVE_HWACK); > - if (status & GEN8_CTX_STATUS_ACTIVE_IDLE) > + > + /* > + * Check if switched to idle or preempted to idle. > + * The STATUS_IDLE_ACTIVE flag is really used to mark > + * preemtion from idle to idle, this is not a mistake. > + */ > + if ((status & GEN8_CTX_STATUS_ACTIVE_IDLE) || > + ((status & GEN8_CTX_STATUS_IDLE_ACTIVE) && > + (status & GEN11_CTX_STATUS_PREEMPT_IDLE))) > execlists_clear_active(execlists, > EXECLISTS_ACTIVE_HWACK); But still pointless, no? > @@ -1020,8 +1051,13 @@ static void execlists_submission_tasklet(unsigned long data) > /* We should never get a COMPLETED | IDLE_ACTIVE! */ > GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE); > > - if (status & GEN8_CTX_STATUS_COMPLETE && > - buf[2*head + 1] == execlists->preempt_complete_status) { > + /* > + * Check if preempted to real idle, either directly or > + * the preemptive context already finished executing > + */ > + if ((status & GEN11_CTX_STATUS_PREEMPT_IDLE) || > + (status & GEN8_CTX_STATUS_COMPLETE && > + buf[2*head + 1] == execlists->preempt_complete_status)) { > GEM_TRACE("%s preempt-idle\n", engine->name); > > execlists_cancel_port_requests(execlists); > @@ -2157,7 +2193,8 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine) > engine->unpark = NULL; > > engine->flags |= I915_ENGINE_SUPPORTS_STATS; > - if (engine->i915->preempt_context) > + if (engine->i915->preempt_context || > + HAS_HW_PREEMPT_TO_IDLE(engine->i915)) > engine->flags |= I915_ENGINE_HAS_PREEMPTION; -Chris
<snip> >> >> @@ -1010,7 +1033,15 @@ static void execlists_submission_tasklet(unsigned long data) >> GEN8_CTX_STATUS_PREEMPTED)) >> execlists_set_active(execlists, >> EXECLISTS_ACTIVE_HWACK); >> - if (status & GEN8_CTX_STATUS_ACTIVE_IDLE) >> + >> + /* >> + * Check if switched to idle or preempted to idle. >> + * The STATUS_IDLE_ACTIVE flag is really used to mark >> + * preemtion from idle to idle, this is not a mistake. >> + */ >> + if ((status & GEN8_CTX_STATUS_ACTIVE_IDLE) || >> + ((status & GEN8_CTX_STATUS_IDLE_ACTIVE) && >> + (status & GEN11_CTX_STATUS_PREEMPT_IDLE))) >> execlists_clear_active(execlists, >> EXECLISTS_ACTIVE_HWACK); > > But still pointless, no? > Just to understand, is it pointless because we have a preemption in flight and we're thus going to call execlists_dequeue below, which will eventually clear the flag in execlists_submit_ports? Or do we just don't care if this gets cleared here because we always clear it before a write to the elsp and we're only interested in it being clear between the write and the subsequent csb event? Also, now that I think about it, with the current flow it doesn't look like we would clear EXECLISTS_ACTIVE_PREEMPT if a preempt-to-idle happens on idle HW, so we still need a condition for that even if we drop the one for EXECLISTS_ACTIVE_HWACK. Daniele >> @@ -1020,8 +1051,13 @@ static void execlists_submission_tasklet(unsigned long data) >> /* We should never get a COMPLETED | IDLE_ACTIVE! */ >> GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE); >> >> - if (status & GEN8_CTX_STATUS_COMPLETE && >> - buf[2*head + 1] == execlists->preempt_complete_status) { >> + /* >> + * Check if preempted to real idle, either directly or >> + * the preemptive context already finished executing >> + */ >> + if ((status & GEN11_CTX_STATUS_PREEMPT_IDLE) || >> + (status & GEN8_CTX_STATUS_COMPLETE && >> + buf[2*head + 1] == execlists->preempt_complete_status)) { >> GEM_TRACE("%s preempt-idle\n", engine->name); >> >> execlists_cancel_port_requests(execlists); >> @@ -2157,7 +2193,8 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine) >> engine->unpark = NULL; >> >> engine->flags |= I915_ENGINE_SUPPORTS_STATS; >> - if (engine->i915->preempt_context) >> + if (engine->i915->preempt_context || >> + HAS_HW_PREEMPT_TO_IDLE(engine->i915)) >> engine->flags |= I915_ENGINE_HAS_PREEMPTION; > > -Chris >
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 0286911..f445340 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2518,6 +2518,8 @@ intel_info(const struct drm_i915_private *dev_priv) ((dev_priv)->info.has_logical_ring_elsq) #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \ ((dev_priv)->info.has_logical_ring_preemption) +#define HAS_HW_PREEMPT_TO_IDLE(dev_priv) \ + ((dev_priv)->info.has_hw_preempt_to_idle) #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 74435af..d65f469 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -454,7 +454,9 @@ destroy_kernel_context(struct i915_gem_context **ctxp) static bool needs_preempt_context(struct drm_i915_private *i915) { - return HAS_LOGICAL_RING_PREEMPTION(i915); + return HAS_LOGICAL_RING_PREEMPTION(i915) && + !HAS_HW_PREEMPT_TO_IDLE(i915) && + !USES_GUC_SUBMISSION(i915); } int i915_gem_contexts_init(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 4364922..66b6700 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -595,7 +595,8 @@ static const struct intel_device_info intel_cannonlake_info = { GEN(11), \ .ddb_size = 2048, \ .has_csr = 0, \ - .has_logical_ring_elsq = 1 + .has_logical_ring_elsq = 1, \ + .has_hw_preempt_to_idle = 1 static const struct intel_device_info intel_icelake_11_info = { GEN11_FEATURES, diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index 933e316..4eb97b5 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -98,6 +98,7 @@ enum intel_platform { func(has_logical_ring_contexts); \ func(has_logical_ring_elsq); \ func(has_logical_ring_preemption); \ + func(has_hw_preempt_to_idle); \ func(has_overlay); \ func(has_pooled_eu); \ func(has_psr); \ diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 029901a..4c94488 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -154,6 +154,7 @@ #define GEN8_CTX_STATUS_ACTIVE_IDLE (1 << 3) #define GEN8_CTX_STATUS_COMPLETE (1 << 4) #define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15) +#define GEN11_CTX_STATUS_PREEMPT_IDLE (1 << 29) #define GEN8_CTX_STATUS_COMPLETED_MASK \ (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED) @@ -552,6 +553,25 @@ static void inject_preempt_context(struct intel_engine_cs *engine) execlists_set_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT); } +static void gen11_preempt_to_idle(struct intel_engine_cs *engine) +{ + struct intel_engine_execlists *execlists = &engine->execlists; + + GEM_TRACE("%s\n", engine->name); + + /* + * hardware which HAS_HW_PREEMPT_TO_IDLE(), always also + * HAS_LOGICAL_RING_ELSQ(), so we can assume ctrl_reg is set + */ + GEM_BUG_ON(execlists->ctrl_reg == NULL); + + /* trigger preemption to idle */ + writel(EL_CTRL_PREEMPT_TO_IDLE, execlists->ctrl_reg); + + execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK); + execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT); +} + static void execlists_dequeue(struct intel_engine_cs *engine) { struct intel_engine_execlists * const execlists = &engine->execlists; @@ -611,7 +631,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine) goto unlock; if (need_preempt(engine, last, execlists->queue_priority)) { - inject_preempt_context(engine); + if (HAS_HW_PREEMPT_TO_IDLE(engine->i915)) + gen11_preempt_to_idle(engine); + else + inject_preempt_context(engine); goto unlock; } @@ -1010,7 +1033,15 @@ static void execlists_submission_tasklet(unsigned long data) GEN8_CTX_STATUS_PREEMPTED)) execlists_set_active(execlists, EXECLISTS_ACTIVE_HWACK); - if (status & GEN8_CTX_STATUS_ACTIVE_IDLE) + + /* + * Check if switched to idle or preempted to idle. + * The STATUS_IDLE_ACTIVE flag is really used to mark + * preemtion from idle to idle, this is not a mistake. + */ + if ((status & GEN8_CTX_STATUS_ACTIVE_IDLE) || + ((status & GEN8_CTX_STATUS_IDLE_ACTIVE) && + (status & GEN11_CTX_STATUS_PREEMPT_IDLE))) execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK); @@ -1020,8 +1051,13 @@ static void execlists_submission_tasklet(unsigned long data) /* We should never get a COMPLETED | IDLE_ACTIVE! */ GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE); - if (status & GEN8_CTX_STATUS_COMPLETE && - buf[2*head + 1] == execlists->preempt_complete_status) { + /* + * Check if preempted to real idle, either directly or + * the preemptive context already finished executing + */ + if ((status & GEN11_CTX_STATUS_PREEMPT_IDLE) || + (status & GEN8_CTX_STATUS_COMPLETE && + buf[2*head + 1] == execlists->preempt_complete_status)) { GEM_TRACE("%s preempt-idle\n", engine->name); execlists_cancel_port_requests(execlists); @@ -2157,7 +2193,8 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine) engine->unpark = NULL; engine->flags |= I915_ENGINE_SUPPORTS_STATS; - if (engine->i915->preempt_context) + if (engine->i915->preempt_context || + HAS_HW_PREEMPT_TO_IDLE(engine->i915)) engine->flags |= I915_ENGINE_HAS_PREEMPTION; engine->i915->caps.scheduler = diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index 59d7b86..958d1b3 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -45,6 +45,7 @@ #define RING_EXECLIST_SQ_CONTENTS(engine) _MMIO((engine)->mmio_base + 0x510) #define RING_EXECLIST_CONTROL(engine) _MMIO((engine)->mmio_base + 0x550) #define EL_CTRL_LOAD (1 << 0) +#define EL_CTRL_PREEMPT_TO_IDLE (1 << 1) /* The docs specify that the write pointer wraps around after 5h, "After status * is written out to the last available status QW at offset 5h, this pointer
The patch adds support of preempt-to-idle requesting by setting a proper bit within Execlist Control Register, and receiving preemption result from Context Status Buffer. Preemption in previous gens required a special batch buffer to be executed, so the Command Streamer never preempted to idle directly. In Icelake it is possible, as there is a hardware mechanism to inform the kernel about status of the preemption request. This patch does not cover using the new preemption mechanism when GuC is active. v2: Added needs_preempt_context() change so that it is not created when preempt-to-idle is supported. (Chris) Updated setting HWACK flag so that it is cleared after preempt-to-dle. (Chris, Daniele) Updated to use I915_ENGINE_HAS_PREEMPTION flag. (Chris) Bspec: 18922 Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem_context.c | 4 ++- drivers/gpu/drm/i915/i915_pci.c | 3 +- drivers/gpu/drm/i915/intel_device_info.h | 1 + drivers/gpu/drm/i915/intel_lrc.c | 47 ++++++++++++++++++++++++++++---- drivers/gpu/drm/i915/intel_lrc.h | 1 + 6 files changed, 51 insertions(+), 7 deletions(-)