diff mbox

drm/i915/ringbuffer: Brute force context restore

Message ID 20180611104808.24295-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson June 11, 2018, 10:48 a.m. UTC
An issue encountered with switching mm on gen7 is that the GPU likes to
hang (with the VS unit busy) when told to force restore the current
context. We can simply workaround this by substituting the
MI_FORCE_RESTORE flag with a round-trip through the kernel_context,
forcing the context to be saved and restored; thereby reloading the
PP_DIR registers and updating the modified page directory!

v2: Undo attempted optimisation in caller (Tvrtko)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 27 +++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Comments

Tvrtko Ursulin June 11, 2018, 1:30 p.m. UTC | #1
On 11/06/2018 11:48, Chris Wilson wrote:
> An issue encountered with switching mm on gen7 is that the GPU likes to
> hang (with the VS unit busy) when told to force restore the current
> context. We can simply workaround this by substituting the
> MI_FORCE_RESTORE flag with a round-trip through the kernel_context,
> forcing the context to be saved and restored; thereby reloading the
> PP_DIR registers and updating the modified page directory!
> 
> v2: Undo attempted optimisation in caller (Tvrtko)
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Matthew Auld <matthew.william.auld@gmail.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 27 +++++++++++++++++++++++++
>   1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 65811e2fa7da..39108d8dcec5 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1458,6 +1458,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>   		(HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ?
>   		INTEL_INFO(i915)->num_rings - 1 :
>   		0;
> +	bool force_restore = false;
>   	int len;
>   	u32 *cs;
>   
> @@ -1471,6 +1472,12 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>   	len = 4;
>   	if (IS_GEN7(i915))
>   		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
> +	if (flags & MI_FORCE_RESTORE) {
> +		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
> +		flags &= ~MI_FORCE_RESTORE;
> +		force_restore = true;
> +		len += 2;
> +	}
>   
>   	cs = intel_ring_begin(rq, len);
>   	if (IS_ERR(cs))
> @@ -1495,6 +1502,26 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>   		}
>   	}
>   
> +	if (force_restore) {
> +		/*
> +		 * The HW doesn't handle being told to restore the current
> +		 * context very well. Quite often it likes goes to go off and
> +		 * sulk, especially when it is meant to be reloading PP_DIR.
> +		 * A very simple fix to force the reload is to simply switch
> +		 * away from the current context and back again.
> +		 *
> +		 * Note that the kernel_context will contain random state
> +		 * following the INHIBIT_RESTORE. We accept this since we
> +		 * never use the kernel_context state; it is merely a
> +		 * placeholder we use to flush other contexts.
> +		 */
> +		*cs++ = MI_SET_CONTEXT;
> +		*cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context,
> +							  engine)->state) |
> +			MI_MM_SPACE_GTT |
> +			MI_RESTORE_INHIBIT;
> +	}
> +
>   	*cs++ = MI_NOOP;
>   	*cs++ = MI_SET_CONTEXT;
>   	*cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;
> 

One more thing - I assume there is some performance penalty in switching 
two times, and since the commit message says the issue is on Gen7 - 
should you skip the double-switch on Gen6? Or when full ppgtt is not 
enabled? (If that will be supported at all.)

Regards,

Tvrtko
Chris Wilson June 11, 2018, 1:48 p.m. UTC | #2
Quoting Tvrtko Ursulin (2018-06-11 14:30:44)
> 
> On 11/06/2018 11:48, Chris Wilson wrote:
> > An issue encountered with switching mm on gen7 is that the GPU likes to
> > hang (with the VS unit busy) when told to force restore the current
> > context. We can simply workaround this by substituting the
> > MI_FORCE_RESTORE flag with a round-trip through the kernel_context,
> > forcing the context to be saved and restored; thereby reloading the
> > PP_DIR registers and updating the modified page directory!
> > 
> > v2: Undo attempted optimisation in caller (Tvrtko)
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Matthew Auld <matthew.william.auld@gmail.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/intel_ringbuffer.c | 27 +++++++++++++++++++++++++
> >   1 file changed, 27 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 65811e2fa7da..39108d8dcec5 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1458,6 +1458,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
> >               (HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ?
> >               INTEL_INFO(i915)->num_rings - 1 :
> >               0;
> > +     bool force_restore = false;
> >       int len;
> >       u32 *cs;
> >   
> > @@ -1471,6 +1472,12 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
> >       len = 4;
> >       if (IS_GEN7(i915))
> >               len += 2 + (num_rings ? 4*num_rings + 6 : 0);
> > +     if (flags & MI_FORCE_RESTORE) {
> > +             GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
> > +             flags &= ~MI_FORCE_RESTORE;
> > +             force_restore = true;
> > +             len += 2;
> > +     }
> >   
> >       cs = intel_ring_begin(rq, len);
> >       if (IS_ERR(cs))
> > @@ -1495,6 +1502,26 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
> >               }
> >       }
> >   
> > +     if (force_restore) {
> > +             /*
> > +              * The HW doesn't handle being told to restore the current
> > +              * context very well. Quite often it likes goes to go off and
> > +              * sulk, especially when it is meant to be reloading PP_DIR.
> > +              * A very simple fix to force the reload is to simply switch
> > +              * away from the current context and back again.
> > +              *
> > +              * Note that the kernel_context will contain random state
> > +              * following the INHIBIT_RESTORE. We accept this since we
> > +              * never use the kernel_context state; it is merely a
> > +              * placeholder we use to flush other contexts.
> > +              */
> > +             *cs++ = MI_SET_CONTEXT;
> > +             *cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context,
> > +                                                       engine)->state) |
> > +                     MI_MM_SPACE_GTT |
> > +                     MI_RESTORE_INHIBIT;
> > +     }
> > +
> >       *cs++ = MI_NOOP;
> >       *cs++ = MI_SET_CONTEXT;
> >       *cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;
> > 
> 
> One more thing - I assume there is some performance penalty in switching 
> two times, and since the commit message says the issue is on Gen7 - 
> should you skip the double-switch on Gen6? Or when full ppgtt is not 
> enabled? (If that will be supported at all.)

(Hmm, apologies if I sent this twice.)

No. We only do FORCE_RESTORE if the dirty flag is set, and that will
only be set on the first load for aliasing_ppgtt, so no runtime impact.
We don't see any effect in gem_ctx_switch, but that as a static PD, as
will most context switches! I expected to see the LRI, w/a, SRM and
second EMIT_INVALIDATE to add extra overhead to a context switch (and
worse to a nop request), but the results seem encouraging that they are
in the noise.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 65811e2fa7da..39108d8dcec5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1458,6 +1458,7 @@  static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		(HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ?
 		INTEL_INFO(i915)->num_rings - 1 :
 		0;
+	bool force_restore = false;
 	int len;
 	u32 *cs;
 
@@ -1471,6 +1472,12 @@  static inline int mi_set_context(struct i915_request *rq, u32 flags)
 	len = 4;
 	if (IS_GEN7(i915))
 		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
+	if (flags & MI_FORCE_RESTORE) {
+		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
+		flags &= ~MI_FORCE_RESTORE;
+		force_restore = true;
+		len += 2;
+	}
 
 	cs = intel_ring_begin(rq, len);
 	if (IS_ERR(cs))
@@ -1495,6 +1502,26 @@  static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		}
 	}
 
+	if (force_restore) {
+		/*
+		 * The HW doesn't handle being told to restore the current
+		 * context very well. Quite often it likes goes to go off and
+		 * sulk, especially when it is meant to be reloading PP_DIR.
+		 * A very simple fix to force the reload is to simply switch
+		 * away from the current context and back again.
+		 *
+		 * Note that the kernel_context will contain random state
+		 * following the INHIBIT_RESTORE. We accept this since we
+		 * never use the kernel_context state; it is merely a
+		 * placeholder we use to flush other contexts.
+		 */
+		*cs++ = MI_SET_CONTEXT;
+		*cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context,
+							  engine)->state) |
+			MI_MM_SPACE_GTT |
+			MI_RESTORE_INHIBIT;
+	}
+
 	*cs++ = MI_NOOP;
 	*cs++ = MI_SET_CONTEXT;
 	*cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;