diff mbox

[4/5] drm/i915: Only set gem object L3 cache level for IVB devices

Message ID 1449514270-15171-5-git-send-email-wayne.boyer@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wayne Boyer Dec. 7, 2015, 6:51 p.m. UTC
Do some further clean up based on the initial review of
drm/i915: Separate cherryview from valleyview.

In this case, in i915_gem_alloc_context_obj() only call
i915_gem_object_set_cache_level() for Ivy Bridge devices
since later platforms don't have L3 control bits in the PTE.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

Ville Syrjälä Dec. 7, 2015, 7:28 p.m. UTC | #1
On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote:
> Do some further clean up based on the initial review of
> drm/i915: Separate cherryview from valleyview.
> 
> In this case, in i915_gem_alloc_context_obj() only call
> i915_gem_object_set_cache_level() for Ivy Bridge devices
> since later platforms don't have L3 control bits in the PTE.
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 4b1161d..e4de433 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device *dev, size_t size)
>  	/*
>  	 * Try to make the context utilize L3 as well as LLC.
>  	 *
> -	 * On VLV we don't have L3 controls in the PTEs so we
> -	 * shouldn't touch the cache level, especially as that
> -	 * would make the object snooped which might have a
> -	 * negative performance impact.
> +	 * This is only applicable for Ivy Bridge devices since
> +	 * later platforms don't have L3 control bits in the PTE.
>  	 */
> -	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev)) {
> +	if (IS_IVYBRIDGE(dev)) {

This would actually change the snoop setting on BXT, but that
should be fine since BXT still has the GTT -> PAT 0 thing which
means we get snooping anyway IIRC.

Imre, we discussed this stuff at some point, and I hope I'm not
forgetting something crucial here that would break BXT. Probably best
if you sanity check my thinking here...

We should probably mention this PAT 0 business in a comment somewhere
here since it's a very easy thing to miss.

>  		ret = i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC);
>  		/* Failure shouldn't ever happen this early */
>  		if (WARN_ON(ret)) {
> -- 
> 2.6.3
Imre Deak Dec. 7, 2015, 7:56 p.m. UTC | #2
On Mon, 2015-12-07 at 21:28 +0200, Ville Syrjälä wrote:
> On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote:
> > Do some further clean up based on the initial review of
> > drm/i915: Separate cherryview from valleyview.
> > 
> > In this case, in i915_gem_alloc_context_obj() only call
> > i915_gem_object_set_cache_level() for Ivy Bridge devices
> > since later platforms don't have L3 control bits in the PTE.
> > 
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_context.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> > b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 4b1161d..e4de433 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device
> > *dev, size_t size)
> >  	/*
> >  	 * Try to make the context utilize L3 as well as LLC.
> >  	 *
> > -	 * On VLV we don't have L3 controls in the PTEs so we
> > -	 * shouldn't touch the cache level, especially as that
> > -	 * would make the object snooped which might have a
> > -	 * negative performance impact.
> > +	 * This is only applicable for Ivy Bridge devices since
> > +	 * later platforms don't have L3 control bits in the PTE.
> >  	 */
> > -	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) &&
> > !IS_CHERRYVIEW(dev)) {
> > +	if (IS_IVYBRIDGE(dev)) {
> 
> This would actually change the snoop setting on BXT, but that
> should be fine since BXT still has the GTT -> PAT 0 thing which
> means we get snooping anyway IIRC.
> 
> Imre, we discussed this stuff at some point, and I hope I'm not
> forgetting something crucial here that would break BXT. Probably best
> if you sanity check my thinking here...

Yes, AFAIU it doesn't matter what level we set, because of the PAT
mapping constraint you describe above. Atm we also don't use this
function on BXT, b/c of the different codepaths for LRC vs. ringbuffer
mode. But that could/should be unified at one point, so yea it's better
to keep things working for BXT too here. In the corresponding
intel_lr_context_deferred_alloc() we don't set the cache level, which
also agrees with the above, so I think this change is fine.

> We should probably mention this PAT 0 business in a comment somewhere
> here since it's a very easy thing to miss.

Yep, a comment about this would be in order.

> 
> >  		ret = i915_gem_object_set_cache_level(obj,
> > I915_CACHE_L3_LLC);
> >  		/* Failure shouldn't ever happen this early */
> >  		if (WARN_ON(ret)) {
> > -- 
> > 2.6.3
>
Wayne Boyer Dec. 7, 2015, 10:26 p.m. UTC | #3
On 12/7/15, 11:56 AM, "Deak, Imre" <imre.deak@intel.com> wrote:


>On Mon, 2015-12-07 at 21:28 +0200, Ville Syrjälä wrote:
>> On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote:
>> > Do some further clean up based on the initial review of
>> > drm/i915: Separate cherryview from valleyview.
>> > 
>> > In this case, in i915_gem_alloc_context_obj() only call
>> > i915_gem_object_set_cache_level() for Ivy Bridge devices
>> > since later platforms don't have L3 control bits in the PTE.
>> > 
>> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/i915_gem_context.c | 8 +++-----
>> >  1 file changed, 3 insertions(+), 5 deletions(-)
>> > 
>> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
>> > b/drivers/gpu/drm/i915/i915_gem_context.c
>> > index 4b1161d..e4de433 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> > @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device
>> > *dev, size_t size)
>> >  	/*
>> >  	 * Try to make the context utilize L3 as well as LLC.
>> >  	 *
>> > -	 * On VLV we don't have L3 controls in the PTEs so we
>> > -	 * shouldn't touch the cache level, especially as that
>> > -	 * would make the object snooped which might have a
>> > -	 * negative performance impact.
>> > +	 * This is only applicable for Ivy Bridge devices since
>> > +	 * later platforms don't have L3 control bits in the PTE.
>> >  	 */
>> > -	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) &&
>> > !IS_CHERRYVIEW(dev)) {
>> > +	if (IS_IVYBRIDGE(dev)) {
>> 
>> This would actually change the snoop setting on BXT, but that
>> should be fine since BXT still has the GTT -> PAT 0 thing which
>> means we get snooping anyway IIRC.
>> 
>> Imre, we discussed this stuff at some point, and I hope I'm not
>> forgetting something crucial here that would break BXT. Probably best
>> if you sanity check my thinking here...
>
>Yes, AFAIU it doesn't matter what level we set, because of the PAT
>mapping constraint you describe above. Atm we also don't use this
>function on BXT, b/c of the different codepaths for LRC vs. ringbuffer
>mode. But that could/should be unified at one point, so yea it's better
>to keep things working for BXT too here. In the corresponding
>intel_lr_context_deferred_alloc() we don't set the cache level, which
>also agrees with the above, so I think this change is fine.
>
>> We should probably mention this PAT 0 business in a comment somewhere
>> here since it's a very easy thing to miss.
>
>Yep, a comment about this would be in order.

Ville, Imre, how does this look for an updated comment?

/*
 * Try to make the context utilize L3 as well as LLC.
 *
 * On VLV we don't have L3 controls in the PTE so we
 * shouldn't touch the cache level, especially as that
 * would make the object snooped which might have a
 * negative performance impact.
 *
 * On CHV and BXT we set PPAT 0 in chv_setup_private_ppat()
 * which means we get snooping anyway.
 *
 * This is only applicable for Ivy Bridge devices since
 * later platforms don't have L3 control bits in the PTE.
 */
Ville Syrjälä Dec. 8, 2015, 11:47 a.m. UTC | #4
On Mon, Dec 07, 2015 at 10:26:15PM +0000, Boyer, Wayne wrote:
> On 12/7/15, 11:56 AM, "Deak, Imre" <imre.deak@intel.com> wrote:
> 
> 
> >On Mon, 2015-12-07 at 21:28 +0200, Ville Syrjälä wrote:
> >> On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote:
> >> > Do some further clean up based on the initial review of
> >> > drm/i915: Separate cherryview from valleyview.
> >> > 
> >> > In this case, in i915_gem_alloc_context_obj() only call
> >> > i915_gem_object_set_cache_level() for Ivy Bridge devices
> >> > since later platforms don't have L3 control bits in the PTE.
> >> > 
> >> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >> > Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/i915_gem_context.c | 8 +++-----
> >> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >> > 
> >> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> >> > b/drivers/gpu/drm/i915/i915_gem_context.c
> >> > index 4b1161d..e4de433 100644
> >> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> >> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> >> > @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device
> >> > *dev, size_t size)
> >> >  	/*
> >> >  	 * Try to make the context utilize L3 as well as LLC.
> >> >  	 *
> >> > -	 * On VLV we don't have L3 controls in the PTEs so we
> >> > -	 * shouldn't touch the cache level, especially as that
> >> > -	 * would make the object snooped which might have a
> >> > -	 * negative performance impact.
> >> > +	 * This is only applicable for Ivy Bridge devices since
> >> > +	 * later platforms don't have L3 control bits in the PTE.
> >> >  	 */
> >> > -	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) &&
> >> > !IS_CHERRYVIEW(dev)) {
> >> > +	if (IS_IVYBRIDGE(dev)) {
> >> 
> >> This would actually change the snoop setting on BXT, but that
> >> should be fine since BXT still has the GTT -> PAT 0 thing which
> >> means we get snooping anyway IIRC.
> >> 
> >> Imre, we discussed this stuff at some point, and I hope I'm not
> >> forgetting something crucial here that would break BXT. Probably best
> >> if you sanity check my thinking here...
> >
> >Yes, AFAIU it doesn't matter what level we set, because of the PAT
> >mapping constraint you describe above. Atm we also don't use this
> >function on BXT, b/c of the different codepaths for LRC vs. ringbuffer
> >mode. But that could/should be unified at one point, so yea it's better
> >to keep things working for BXT too here. In the corresponding
> >intel_lr_context_deferred_alloc() we don't set the cache level, which
> >also agrees with the above, so I think this change is fine.
> >
> >> We should probably mention this PAT 0 business in a comment somewhere
> >> here since it's a very easy thing to miss.
> >
> >Yep, a comment about this would be in order.
> 
> Ville, Imre, how does this look for an updated comment?
> 
> /*
>  * Try to make the context utilize L3 as well as LLC.
>  *
>  * On VLV we don't have L3 controls in the PTE so we
>  * shouldn't touch the cache level, especially as that
>  * would make the object snooped which might have a
>  * negative performance impact.
>  *
>  * On CHV and BXT we set PPAT 0 in chv_setup_private_ppat()
>  * which means we get snooping anyway.

Maybe something like this:
"Snooping is required on non-llc platforms in execlist
mode, but since all GGTT accesses use PAT entry 0 we
get snooping anyway regardless of cache_level.

Otherwise lgtm.

>  *
>  * This is only applicable for Ivy Bridge devices since
>  * later platforms don't have L3 control bits in the PTE.
>  */
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4b1161d..e4de433 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -185,12 +185,10 @@  i915_gem_alloc_context_obj(struct drm_device *dev, size_t size)
 	/*
 	 * Try to make the context utilize L3 as well as LLC.
 	 *
-	 * On VLV we don't have L3 controls in the PTEs so we
-	 * shouldn't touch the cache level, especially as that
-	 * would make the object snooped which might have a
-	 * negative performance impact.
+	 * This is only applicable for Ivy Bridge devices since
+	 * later platforms don't have L3 control bits in the PTE.
 	 */
-	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev)) {
+	if (IS_IVYBRIDGE(dev)) {
 		ret = i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC);
 		/* Failure shouldn't ever happen this early */
 		if (WARN_ON(ret)) {