diff mbox

drm/i915/gvt: Implement WaForceWakeRenderDuringMmioTLBInvalidate

Message ID 87a8dzqejz.fsf@gaia.fi.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Mika Kuoppala Oct. 20, 2016, 2:29 p.m. UTC
Arkadiusz Hiler <arkadiusz.hiler@intel.com> writes:

> When invalidating RCS TLB the device can enter RC6 state interrupting
> the process, therefore the need for render forcewake for the whole
> procedure.
>
> This WA is needed for all production SKL SKUs.
>
> References: HSD#2136899, HSD#1404391274
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> ---
>  drivers/gpu/drm/i915/gvt/render.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gvt/render.c b/drivers/gpu/drm/i915/gvt/render.c
> index f54ab85..f5000ea 100644
> --- a/drivers/gpu/drm/i915/gvt/render.c
> +++ b/drivers/gpu/drm/i915/gvt/render.c
> @@ -134,11 +134,22 @@ static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
>  
>  	reg = _MMIO(regs[ring_id]);
>

Ok not so familiar with the gvt side but I assume this is the host
side code and thus the vgpu is not active at this stage.

Then you could avoid some of the implicit fw dancing
by:

> +	 * we need to put a forcewake when invalidating RCS TLB caches,
> +	 * otherwise device can go to RC6 state and interrupt invalidation
> +	 * process */
> +	if (IS_SKYLAKE(dev_priv) && ring_id == RCS)
> +		intel_uncore_forcewake_get(dev_priv, FORCEWAKE_RENDER);
> +
>  	I915_WRITE(reg, 0x1);
>  
>  	if (wait_for_atomic((I915_READ(reg) == 0), 50))
>  		gvt_err("timeout in invalidate ring (%d) tlb\n", ring_id);
>  
> +	if (IS_SKYLAKE(dev_priv) && ring_id == RCS)
> +		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_RENDER);
> +
> +
>  	gvt_dbg_core("invalidate TLB for ring %d\n", ring_id);
>  }
>  
> -- 
> 2.7.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Comments

Arkadiusz Hiler Oct. 20, 2016, 3:20 p.m. UTC | #1
On Thu, Oct 20, 2016 at 05:29:36PM +0300, Mika Kuoppala wrote:
> Arkadiusz Hiler <arkadiusz.hiler@intel.com> writes:
> 
> > When invalidating RCS TLB the device can enter RC6 state interrupting
> > the process, therefore the need for render forcewake for the whole
> > procedure.
> >
> > This WA is needed for all production SKL SKUs.
> >
> > References: HSD#2136899, HSD#1404391274
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gvt/render.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gvt/render.c b/drivers/gpu/drm/i915/gvt/render.c
> > index f54ab85..f5000ea 100644
> > --- a/drivers/gpu/drm/i915/gvt/render.c
> > +++ b/drivers/gpu/drm/i915/gvt/render.c
> > @@ -134,11 +134,22 @@ static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
> >  
> >  	reg = _MMIO(regs[ring_id]);
> >
> 
> Ok not so familiar with the gvt side but I assume this is the host
> side code and thus the vgpu is not active at this stage.

That's my understanding as well. It's a code that is setting up gvt for
further use (shadow context to be exact). It's called indirectly from
intel_gvt_create_vgpu.

We should wait for Zhenyu to verify that.

> Then you could avoid some of the implicit fw dancing
> by:
> 
> diff --git a/drivers/gpu/drm/i915/gvt/render.c b/drivers/gpu/drm/i915/gvt/render.c
> index feebb65..93ba156 100644
> --- a/drivers/gpu/drm/i915/gvt/render.c
> +++ b/drivers/gpu/drm/i915/gvt/render.c
> @@ -118,6 +118,7 @@ static u32 gen9_render_mocs_L3[32];
>  static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
>  {
>         struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
> +       enum forcewake_domains fw;
>         i915_reg_t reg;
>         u32 regs[] = {
>                 [RCS] = 0x4260,
> @@ -135,11 +136,21 @@ static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
>  
>         reg = _MMIO(regs[ring_id]);
>  
> -       I915_WRITE(reg, 0x1);
> +       fw = intel_uncore_forcewake_for_reg(dev_priv, reg,
> +                                           FW_REG_READ | FW_REG_WRITE);
>  
> -       if (wait_for_atomic((I915_READ(reg) == 0), 50))
> +       if (ring_id == RCS && IS_SKYLAKE(dev_priv))
> +               fw |= FORCEWAKE_RENDER;
> +
> +       intel_uncore_forcewake_get(dev_priv, fw);
> +
> +       I915_WRITE_FW(reg, 0x1);
> +
> +       if (wait_for_atomic((I915_READ_FW(reg) == 0), 50))
>                 gvt_err("timeout in invalidate ring (%d) tlb\n", ring_id);
>  
> +       intel_uncore_forcewake_put(dev_priv, fw);
> +
> 

I can go with it, although I do not have strong preference. I think my
version is a little bit easier to follow, but his is less error prone,
as you check for the WA SKU only once, during setting the FW.

Any recommendations?
Zhenyu Wang Oct. 21, 2016, 4:15 a.m. UTC | #2
On 2016.10.20 17:20:04 +0200, Arkadiusz Hiler wrote:
> On Thu, Oct 20, 2016 at 05:29:36PM +0300, Mika Kuoppala wrote:
> > Arkadiusz Hiler <arkadiusz.hiler@intel.com> writes:
> > 
> > > When invalidating RCS TLB the device can enter RC6 state interrupting
> > > the process, therefore the need for render forcewake for the whole
> > > procedure.
> > >
> > > This WA is needed for all production SKL SKUs.
> > >
> > > References: HSD#2136899, HSD#1404391274
> > > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/gvt/render.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gvt/render.c b/drivers/gpu/drm/i915/gvt/render.c
> > > index f54ab85..f5000ea 100644
> > > --- a/drivers/gpu/drm/i915/gvt/render.c
> > > +++ b/drivers/gpu/drm/i915/gvt/render.c
> > > @@ -134,11 +134,22 @@ static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
> > >  
> > >  	reg = _MMIO(regs[ring_id]);
> > >
> > 
> > Ok not so familiar with the gvt side but I assume this is the host
> > side code and thus the vgpu is not active at this stage.
> 
> That's my understanding as well. It's a code that is setting up gvt for
> further use (shadow context to be exact). It's called indirectly from
> intel_gvt_create_vgpu.
> 

yes, it's for host not for vgpu to handle context switch state for vgpu.

> > Then you could avoid some of the implicit fw dancing
> > by:
> > 
> > diff --git a/drivers/gpu/drm/i915/gvt/render.c b/drivers/gpu/drm/i915/gvt/render.c
> > index feebb65..93ba156 100644
> > --- a/drivers/gpu/drm/i915/gvt/render.c
> > +++ b/drivers/gpu/drm/i915/gvt/render.c
> > @@ -118,6 +118,7 @@ static u32 gen9_render_mocs_L3[32];
> >  static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
> >  {
> >         struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
> > +       enum forcewake_domains fw;
> >         i915_reg_t reg;
> >         u32 regs[] = {
> >                 [RCS] = 0x4260,
> > @@ -135,11 +136,21 @@ static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
> >  
> >         reg = _MMIO(regs[ring_id]);
> >  
> > -       I915_WRITE(reg, 0x1);
> > +       fw = intel_uncore_forcewake_for_reg(dev_priv, reg,
> > +                                           FW_REG_READ | FW_REG_WRITE);
> >  
> > -       if (wait_for_atomic((I915_READ(reg) == 0), 50))
> > +       if (ring_id == RCS && IS_SKYLAKE(dev_priv))
> > +               fw |= FORCEWAKE_RENDER;
> > +
> > +       intel_uncore_forcewake_get(dev_priv, fw);
> > +
> > +       I915_WRITE_FW(reg, 0x1);
> > +
> > +       if (wait_for_atomic((I915_READ_FW(reg) == 0), 50))
> >                 gvt_err("timeout in invalidate ring (%d) tlb\n", ring_id);
> >  
> > +       intel_uncore_forcewake_put(dev_priv, fw);
> > +
> > 
> 
> I can go with it, although I do not have strong preference. I think my
> version is a little bit easier to follow, but his is less error prone,
> as you check for the WA SKU only once, during setting the FW.
> 
> Any recommendations?
> 

I like this one to be safer. Would Mika like to send another one or I
just take your commit message?

thanks
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/gvt/render.c b/drivers/gpu/drm/i915/gvt/render.c
index feebb65..93ba156 100644
--- a/drivers/gpu/drm/i915/gvt/render.c
+++ b/drivers/gpu/drm/i915/gvt/render.c
@@ -118,6 +118,7 @@  static u32 gen9_render_mocs_L3[32];
 static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
 {
        struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
+       enum forcewake_domains fw;
        i915_reg_t reg;
        u32 regs[] = {
                [RCS] = 0x4260,
@@ -135,11 +136,21 @@  static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
 
        reg = _MMIO(regs[ring_id]);
 
-       I915_WRITE(reg, 0x1);
+       fw = intel_uncore_forcewake_for_reg(dev_priv, reg,
+                                           FW_REG_READ | FW_REG_WRITE);
 
-       if (wait_for_atomic((I915_READ(reg) == 0), 50))
+       if (ring_id == RCS && IS_SKYLAKE(dev_priv))
+               fw |= FORCEWAKE_RENDER;
+
+       intel_uncore_forcewake_get(dev_priv, fw);
+
+       I915_WRITE_FW(reg, 0x1);
+
+       if (wait_for_atomic((I915_READ_FW(reg) == 0), 50))
                gvt_err("timeout in invalidate ring (%d) tlb\n", ring_id);
 
+       intel_uncore_forcewake_put(dev_priv, fw);
+

-Mika

> +	/* WaForceWakeRenderDuringMmioTLBInvalidate:skl