Message ID | 20180611104808.24295-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 11/06/2018 11:48, Chris Wilson wrote: > An issue encountered with switching mm on gen7 is that the GPU likes to > hang (with the VS unit busy) when told to force restore the current > context. We can simply workaround this by substituting the > MI_FORCE_RESTORE flag with a round-trip through the kernel_context, > forcing the context to be saved and restored; thereby reloading the > PP_DIR registers and updating the modified page directory! > > v2: Undo attempted optimisation in caller (Tvrtko) > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Cc: Matthew Auld <matthew.william.auld@gmail.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > --- > drivers/gpu/drm/i915/intel_ringbuffer.c | 27 +++++++++++++++++++++++++ > 1 file changed, 27 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index 65811e2fa7da..39108d8dcec5 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -1458,6 +1458,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) > (HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ? > INTEL_INFO(i915)->num_rings - 1 : > 0; > + bool force_restore = false; > int len; > u32 *cs; > > @@ -1471,6 +1472,12 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) > len = 4; > if (IS_GEN7(i915)) > len += 2 + (num_rings ? 4*num_rings + 6 : 0); > + if (flags & MI_FORCE_RESTORE) { > + GEM_BUG_ON(flags & MI_RESTORE_INHIBIT); > + flags &= ~MI_FORCE_RESTORE; > + force_restore = true; > + len += 2; > + } > > cs = intel_ring_begin(rq, len); > if (IS_ERR(cs)) > @@ -1495,6 +1502,26 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) > } > } > > + if (force_restore) { > + /* > + * The HW doesn't handle being told to restore the current > + * context very well. Quite often it likes goes to go off and > + * sulk, especially when it is meant to be reloading PP_DIR. > + * A very simple fix to force the reload is to simply switch > + * away from the current context and back again. > + * > + * Note that the kernel_context will contain random state > + * following the INHIBIT_RESTORE. We accept this since we > + * never use the kernel_context state; it is merely a > + * placeholder we use to flush other contexts. > + */ > + *cs++ = MI_SET_CONTEXT; > + *cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context, > + engine)->state) | > + MI_MM_SPACE_GTT | > + MI_RESTORE_INHIBIT; > + } > + > *cs++ = MI_NOOP; > *cs++ = MI_SET_CONTEXT; > *cs++ = i915_ggtt_offset(rq->hw_context->state) | flags; > One more thing - I assume there is some performance penalty in switching two times, and since the commit message says the issue is on Gen7 - should you skip the double-switch on Gen6? Or when full ppgtt is not enabled? (If that will be supported at all.) Regards, Tvrtko
Quoting Tvrtko Ursulin (2018-06-11 14:30:44) > > On 11/06/2018 11:48, Chris Wilson wrote: > > An issue encountered with switching mm on gen7 is that the GPU likes to > > hang (with the VS unit busy) when told to force restore the current > > context. We can simply workaround this by substituting the > > MI_FORCE_RESTORE flag with a round-trip through the kernel_context, > > forcing the context to be saved and restored; thereby reloading the > > PP_DIR registers and updating the modified page directory! > > > > v2: Undo attempted optimisation in caller (Tvrtko) > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > > Cc: Matthew Auld <matthew.william.auld@gmail.com> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 27 +++++++++++++++++++++++++ > > 1 file changed, 27 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index 65811e2fa7da..39108d8dcec5 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -1458,6 +1458,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) > > (HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ? > > INTEL_INFO(i915)->num_rings - 1 : > > 0; > > + bool force_restore = false; > > int len; > > u32 *cs; > > > > @@ -1471,6 +1472,12 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) > > len = 4; > > if (IS_GEN7(i915)) > > len += 2 + (num_rings ? 4*num_rings + 6 : 0); > > + if (flags & MI_FORCE_RESTORE) { > > + GEM_BUG_ON(flags & MI_RESTORE_INHIBIT); > > + flags &= ~MI_FORCE_RESTORE; > > + force_restore = true; > > + len += 2; > > + } > > > > cs = intel_ring_begin(rq, len); > > if (IS_ERR(cs)) > > @@ -1495,6 +1502,26 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) > > } > > } > > > > + if (force_restore) { > > + /* > > + * The HW doesn't handle being told to restore the current > > + * context very well. Quite often it likes goes to go off and > > + * sulk, especially when it is meant to be reloading PP_DIR. > > + * A very simple fix to force the reload is to simply switch > > + * away from the current context and back again. > > + * > > + * Note that the kernel_context will contain random state > > + * following the INHIBIT_RESTORE. We accept this since we > > + * never use the kernel_context state; it is merely a > > + * placeholder we use to flush other contexts. > > + */ > > + *cs++ = MI_SET_CONTEXT; > > + *cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context, > > + engine)->state) | > > + MI_MM_SPACE_GTT | > > + MI_RESTORE_INHIBIT; > > + } > > + > > *cs++ = MI_NOOP; > > *cs++ = MI_SET_CONTEXT; > > *cs++ = i915_ggtt_offset(rq->hw_context->state) | flags; > > > > One more thing - I assume there is some performance penalty in switching > two times, and since the commit message says the issue is on Gen7 - > should you skip the double-switch on Gen6? Or when full ppgtt is not > enabled? (If that will be supported at all.) (Hmm, apologies if I sent this twice.) No. We only do FORCE_RESTORE if the dirty flag is set, and that will only be set on the first load for aliasing_ppgtt, so no runtime impact. We don't see any effect in gem_ctx_switch, but that as a static PD, as will most context switches! I expected to see the LRI, w/a, SRM and second EMIT_INVALIDATE to add extra overhead to a context switch (and worse to a nop request), but the results seem encouraging that they are in the noise. -Chris
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 65811e2fa7da..39108d8dcec5 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1458,6 +1458,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) (HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ? INTEL_INFO(i915)->num_rings - 1 : 0; + bool force_restore = false; int len; u32 *cs; @@ -1471,6 +1472,12 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) len = 4; if (IS_GEN7(i915)) len += 2 + (num_rings ? 4*num_rings + 6 : 0); + if (flags & MI_FORCE_RESTORE) { + GEM_BUG_ON(flags & MI_RESTORE_INHIBIT); + flags &= ~MI_FORCE_RESTORE; + force_restore = true; + len += 2; + } cs = intel_ring_begin(rq, len); if (IS_ERR(cs)) @@ -1495,6 +1502,26 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) } } + if (force_restore) { + /* + * The HW doesn't handle being told to restore the current + * context very well. Quite often it likes goes to go off and + * sulk, especially when it is meant to be reloading PP_DIR. + * A very simple fix to force the reload is to simply switch + * away from the current context and back again. + * + * Note that the kernel_context will contain random state + * following the INHIBIT_RESTORE. We accept this since we + * never use the kernel_context state; it is merely a + * placeholder we use to flush other contexts. + */ + *cs++ = MI_SET_CONTEXT; + *cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context, + engine)->state) | + MI_MM_SPACE_GTT | + MI_RESTORE_INHIBIT; + } + *cs++ = MI_NOOP; *cs++ = MI_SET_CONTEXT; *cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;