diff mbox

[3/3] drm/i915: Enable GTT caching on gen8

Message ID 1432056777-963-3-git-send-email-ville.syrjala@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ville Syrjälä May 19, 2015, 5:32 p.m. UTC
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

GTT caching was disabled by default on gen8 due to not working with
big pages. Some information suggests that it got fixed, but still
GTT caching has been left disabled by default. Or could be it just
meant that the default was changed to off, and hence the problem
got solved.

Enable GTT caching in the hopes of some performance increase.
Whether or not the big pages issue has been fixed is irrelevant
at this stage since we don't use big pages.

This gives me a 1-2% improvement in xonotic on my BSW. Haven't tried
BDW, but supposedly it has larger TLBs so might not benefit as much.
On HSW GTT caching is enabled by default.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h |  2 ++
 drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++++
 2 files changed, 15 insertions(+)

Comments

Shuang He May 21, 2015, 9:48 a.m. UTC | #1
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6434
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                                  234/234              234/234
ILK                                  262/262              262/262
SNB                 -1              282/282              281/282
IVB                                  300/300              300/300
BYT                                  254/254              254/254
BDW                                  275/275              275/275
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
 SNB  igt@pm_rpm@dpms-mode-unset-non-lpsp      DMESG_WARN(11)PASS(1)      DMESG_WARN(1)
(dmesg patch applied)WARNING:at_drivers/gpu/drm/i915/intel_uncore.c:#assert_device_not_suspended[i915]()@WARNING:.* at .* assert_device_not_suspended+0x
Note: You need to pay more attention to line start with '*'
Jesse Barnes May 21, 2015, 8:18 p.m. UTC | #2
On 05/19/2015 10:32 AM, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> GTT caching was disabled by default on gen8 due to not working with
> big pages. Some information suggests that it got fixed, but still
> GTT caching has been left disabled by default. Or could be it just
> meant that the default was changed to off, and hence the problem
> got solved.
> 
> Enable GTT caching in the hopes of some performance increase.
> Whether or not the big pages issue has been fixed is irrelevant
> at this stage since we don't use big pages.
> 
> This gives me a 1-2% improvement in xonotic on my BSW. Haven't tried
> BDW, but supposedly it has larger TLBs so might not benefit as much.
> On HSW GTT caching is enabled by default.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 84af255..90640d5 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1461,6 +1461,8 @@ enum skl_disp_power_wells {
>  #define RING_HWS_PGA(base)	((base)+0x80)
>  #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
>  
> +#define HSW_GTT_CACHE_EN	0x4024
> +#define   GTT_CACHE_EN_ALL	0xF0007FFF
>  #define GEN7_WR_WATERMARK	0x4028
>  #define GEN7_GFX_PRIO_CTRL	0x402C
>  #define ARB_MODE		0x4030
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 5ec56b6..58517a50 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -6205,6 +6205,13 @@ static void broadwell_init_clock_gating(struct drm_device *dev)
>  	I915_WRITE(GEN8_L3SQCREG1, BDW_WA_L3SQCREG1_DEFAULT);
>  	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
>  
> +	/*
> +	 * WaGttCachingOffByDefault:bdw
> +	 * GTT cache may not work with big pages, so if those
> +	 * are ever enabled GTT cache may need to be disabled.
> +	 */
> +	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> +
>  	lpt_init_clock_gating(dev);
>  }
>  
> @@ -6480,6 +6487,12 @@ static void cherryview_init_clock_gating(struct drm_device *dev)
>  	/* WaDisableSDEUnitClockGating:chv */
>  	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
>  		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
> +
> +	/*
> +	 * GTT cache may not work with big pages, so if those
> +	 * are ever enabled GTT cache may need to be disabled.
> +	 */
> +	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
>  }
>  
>  static void g4x_init_clock_gating(struct drm_device *dev)
> 

Looks ok to me; I guess testing will be the real review here.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Daniel Vetter May 22, 2015, 6:10 a.m. UTC | #3
On Thu, May 21, 2015 at 01:18:44PM -0700, Jesse Barnes wrote:
> On 05/19/2015 10:32 AM, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > GTT caching was disabled by default on gen8 due to not working with
> > big pages. Some information suggests that it got fixed, but still
> > GTT caching has been left disabled by default. Or could be it just
> > meant that the default was changed to off, and hence the problem
> > got solved.
> > 
> > Enable GTT caching in the hopes of some performance increase.
> > Whether or not the big pages issue has been fixed is irrelevant
> > at this stage since we don't use big pages.
> > 
> > This gives me a 1-2% improvement in xonotic on my BSW. Haven't tried
> > BDW, but supposedly it has larger TLBs so might not benefit as much.
> > On HSW GTT caching is enabled by default.
> > 
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h |  2 ++
> >  drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 84af255..90640d5 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -1461,6 +1461,8 @@ enum skl_disp_power_wells {
> >  #define RING_HWS_PGA(base)	((base)+0x80)
> >  #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
> >  
> > +#define HSW_GTT_CACHE_EN	0x4024
> > +#define   GTT_CACHE_EN_ALL	0xF0007FFF
> >  #define GEN7_WR_WATERMARK	0x4028
> >  #define GEN7_GFX_PRIO_CTRL	0x402C
> >  #define ARB_MODE		0x4030
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 5ec56b6..58517a50 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -6205,6 +6205,13 @@ static void broadwell_init_clock_gating(struct drm_device *dev)
> >  	I915_WRITE(GEN8_L3SQCREG1, BDW_WA_L3SQCREG1_DEFAULT);
> >  	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
> >  
> > +	/*
> > +	 * WaGttCachingOffByDefault:bdw
> > +	 * GTT cache may not work with big pages, so if those
> > +	 * are ever enabled GTT cache may need to be disabled.
> > +	 */
> > +	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> > +
> >  	lpt_init_clock_gating(dev);
> >  }
> >  
> > @@ -6480,6 +6487,12 @@ static void cherryview_init_clock_gating(struct drm_device *dev)
> >  	/* WaDisableSDEUnitClockGating:chv */
> >  	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
> >  		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
> > +
> > +	/*
> > +	 * GTT cache may not work with big pages, so if those
> > +	 * are ever enabled GTT cache may need to be disabled.
> > +	 */
> > +	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> >  }
> >  
> >  static void g4x_init_clock_gating(struct drm_device *dev)
> > 
> 
> Looks ok to me; I guess testing will be the real review here.

Just aside: If you think the real test for a patch is the real world imo
an important part of the review work is to make sure we do have that
testing coverage. We have a few igts that specifically exercise gtt tlb
issues (from previous generations), so I think we're covered here.

> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Merged all three, thanks for patches&review.
-Daniel
Jesse Barnes May 22, 2015, 3:31 p.m. UTC | #4
On 05/21/2015 11:10 PM, Daniel Vetter wrote:
> On Thu, May 21, 2015 at 01:18:44PM -0700, Jesse Barnes wrote:
>> On 05/19/2015 10:32 AM, ville.syrjala@linux.intel.com wrote:
>>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>>
>>> GTT caching was disabled by default on gen8 due to not working with
>>> big pages. Some information suggests that it got fixed, but still
>>> GTT caching has been left disabled by default. Or could be it just
>>> meant that the default was changed to off, and hence the problem
>>> got solved.
>>>
>>> Enable GTT caching in the hopes of some performance increase.
>>> Whether or not the big pages issue has been fixed is irrelevant
>>> at this stage since we don't use big pages.
>>>
>>> This gives me a 1-2% improvement in xonotic on my BSW. Haven't tried
>>> BDW, but supposedly it has larger TLBs so might not benefit as much.
>>> On HSW GTT caching is enabled by default.
>>>
>>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> ---
>>>  drivers/gpu/drm/i915/i915_reg.h |  2 ++
>>>  drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++++
>>>  2 files changed, 15 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>>> index 84af255..90640d5 100644
>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>> @@ -1461,6 +1461,8 @@ enum skl_disp_power_wells {
>>>  #define RING_HWS_PGA(base)	((base)+0x80)
>>>  #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
>>>  
>>> +#define HSW_GTT_CACHE_EN	0x4024
>>> +#define   GTT_CACHE_EN_ALL	0xF0007FFF
>>>  #define GEN7_WR_WATERMARK	0x4028
>>>  #define GEN7_GFX_PRIO_CTRL	0x402C
>>>  #define ARB_MODE		0x4030
>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>>> index 5ec56b6..58517a50 100644
>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>> @@ -6205,6 +6205,13 @@ static void broadwell_init_clock_gating(struct drm_device *dev)
>>>  	I915_WRITE(GEN8_L3SQCREG1, BDW_WA_L3SQCREG1_DEFAULT);
>>>  	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
>>>  
>>> +	/*
>>> +	 * WaGttCachingOffByDefault:bdw
>>> +	 * GTT cache may not work with big pages, so if those
>>> +	 * are ever enabled GTT cache may need to be disabled.
>>> +	 */
>>> +	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
>>> +
>>>  	lpt_init_clock_gating(dev);
>>>  }
>>>  
>>> @@ -6480,6 +6487,12 @@ static void cherryview_init_clock_gating(struct drm_device *dev)
>>>  	/* WaDisableSDEUnitClockGating:chv */
>>>  	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
>>>  		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
>>> +
>>> +	/*
>>> +	 * GTT cache may not work with big pages, so if those
>>> +	 * are ever enabled GTT cache may need to be disabled.
>>> +	 */
>>> +	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
>>>  }
>>>  
>>>  static void g4x_init_clock_gating(struct drm_device *dev)
>>>
>>
>> Looks ok to me; I guess testing will be the real review here.
> 
> Just aside: If you think the real test for a patch is the real world imo
> an important part of the review work is to make sure we do have that
> testing coverage. We have a few igts that specifically exercise gtt tlb
> issues (from previous generations), so I think we're covered here.
> 
>> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> 
> Merged all three, thanks for patches&review.

Yeah I don't think it's something that requires a dedicated test, just
lots of coverage with our existing stuff to sniff out problems.

Thanks,
Jesse
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 84af255..90640d5 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1461,6 +1461,8 @@  enum skl_disp_power_wells {
 #define RING_HWS_PGA(base)	((base)+0x80)
 #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
 
+#define HSW_GTT_CACHE_EN	0x4024
+#define   GTT_CACHE_EN_ALL	0xF0007FFF
 #define GEN7_WR_WATERMARK	0x4028
 #define GEN7_GFX_PRIO_CTRL	0x402C
 #define ARB_MODE		0x4030
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 5ec56b6..58517a50 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6205,6 +6205,13 @@  static void broadwell_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(GEN8_L3SQCREG1, BDW_WA_L3SQCREG1_DEFAULT);
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 
+	/*
+	 * WaGttCachingOffByDefault:bdw
+	 * GTT cache may not work with big pages, so if those
+	 * are ever enabled GTT cache may need to be disabled.
+	 */
+	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+
 	lpt_init_clock_gating(dev);
 }
 
@@ -6480,6 +6487,12 @@  static void cherryview_init_clock_gating(struct drm_device *dev)
 	/* WaDisableSDEUnitClockGating:chv */
 	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
 		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+
+	/*
+	 * GTT cache may not work with big pages, so if those
+	 * are ever enabled GTT cache may need to be disabled.
+	 */
+	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
 }
 
 static void g4x_init_clock_gating(struct drm_device *dev)