diff mbox

drm/i915: Avoid snooping with userptr where not supported

Message ID 1456849775-15048-1-git-send-email-tvrtko.ursulin@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tvrtko Ursulin March 1, 2016, 4:29 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

   commit e5756c10d841ddb448293c849392f3d6b809561f
   Author: Imre Deak <imre.deak@intel.com>
   Date:   Fri Aug 14 18:43:30 2015 +0300

       drm/i915/bxt: don't allow cached GEM mappings on A stepping

Added an exception of disallowing snooping for Broxton A
stepping hardware but userptr was still enabling it regardless.

Move the check to HAS_SNOOP now that it is used from multiple
call sites and use it.

Idea is that userspace which relies on userptr snooping, for
example for fine grained buffered SVM, can query the caching
mode on a created userptr object to determine whether coherency
is supported or not.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 1 +
 drivers/gpu/drm/i915/i915_gem.c         | 2 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

Comments

Chris Wilson March 1, 2016, 5:14 p.m. UTC | #1
On Tue, Mar 01, 2016 at 04:29:35PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
>    commit e5756c10d841ddb448293c849392f3d6b809561f
>    Author: Imre Deak <imre.deak@intel.com>
>    Date:   Fri Aug 14 18:43:30 2015 +0300
> 
>        drm/i915/bxt: don't allow cached GEM mappings on A stepping
> 
> Added an exception of disallowing snooping for Broxton A
> stepping hardware but userptr was still enabling it regardless.
> 
> Move the check to HAS_SNOOP now that it is used from multiple
> call sites and use it.
> 
> Idea is that userspace which relies on userptr snooping, for
> example for fine grained buffered SVM, can query the caching
> mode on a created userptr object to determine whether coherency
> is supported or not.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 1 +
>  drivers/gpu/drm/i915/i915_gem.c         | 2 +-
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 671295523317..73f0db17b990 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2629,6 +2629,7 @@ struct drm_i915_cmd_table {
>  #define HAS_LLC(dev)		(INTEL_INFO(dev)->has_llc)
>  #define HAS_WT(dev)		((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \
>  				 __I915__(dev)->ellc_size)
> +#define HAS_SNOOP(dev)		(!IS_BXT_REVID(dev, 0, BXT_REVID_A1))
>  #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
>  
>  #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3d31d3ac589e..1c6e8fb9d392 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3949,7 +3949,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>  		 * cacheline, whereas normally such cachelines would get
>  		 * invalidated.
>  		 */
> -		if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
> +		if (!HAS_SNOOP(dev))

if (!HAS_LLC && !HAS_SNOOP)

and HAS_SNOOP should be true for everything that is not llc, with the
exception above. So make a dev_info->has_snoop feature bit (or
uses_snoop?) and then clear the copy inside the struct for the special
case.

>  			return -ENODEV;
>  
>  		level = I915_CACHE_LLC;
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 4b09c840d493..48aaff019b42 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -782,7 +782,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
>  
>  	drm_gem_private_object_init(dev, &obj->base, args->user_size);
>  	i915_gem_object_init(obj, &i915_gem_userptr_ops);
> -	obj->cache_level = I915_CACHE_LLC;
> +	obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;

This should be
if (!HAS_SNOOP && !HAS_LLC)
	return -ENODEV;

as we can not support a coherent pointer shared between memory and the GPU
in this case.
-Chris
Tvrtko Ursulin March 1, 2016, 5:21 p.m. UTC | #2
On 01/03/16 17:14, Chris Wilson wrote:
> On Tue, Mar 01, 2016 at 04:29:35PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>>     commit e5756c10d841ddb448293c849392f3d6b809561f
>>     Author: Imre Deak <imre.deak@intel.com>
>>     Date:   Fri Aug 14 18:43:30 2015 +0300
>>
>>         drm/i915/bxt: don't allow cached GEM mappings on A stepping
>>
>> Added an exception of disallowing snooping for Broxton A
>> stepping hardware but userptr was still enabling it regardless.
>>
>> Move the check to HAS_SNOOP now that it is used from multiple
>> call sites and use it.
>>
>> Idea is that userspace which relies on userptr snooping, for
>> example for fine grained buffered SVM, can query the caching
>> mode on a created userptr object to determine whether coherency
>> is supported or not.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Imre Deak <imre.deak@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         | 1 +
>>   drivers/gpu/drm/i915/i915_gem.c         | 2 +-
>>   drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
>>   3 files changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 671295523317..73f0db17b990 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2629,6 +2629,7 @@ struct drm_i915_cmd_table {
>>   #define HAS_LLC(dev)		(INTEL_INFO(dev)->has_llc)
>>   #define HAS_WT(dev)		((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \
>>   				 __I915__(dev)->ellc_size)
>> +#define HAS_SNOOP(dev)		(!IS_BXT_REVID(dev, 0, BXT_REVID_A1))
>>   #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
>>
>>   #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 3d31d3ac589e..1c6e8fb9d392 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3949,7 +3949,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>>   		 * cacheline, whereas normally such cachelines would get
>>   		 * invalidated.
>>   		 */
>> -		if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
>> +		if (!HAS_SNOOP(dev))
>
> if (!HAS_LLC && !HAS_SNOOP)
>
> and HAS_SNOOP should be true for everything that is not llc, with the
> exception above. So make a dev_info->has_snoop feature bit (or
> uses_snoop?) and then clear the copy inside the struct for the special
> case.

Sure I can do that, but isn't it the same as this patch? Unless 
something with LLC and no snooping appears?

>>   			return -ENODEV;
>>
>>   		level = I915_CACHE_LLC;
>> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
>> index 4b09c840d493..48aaff019b42 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
>> @@ -782,7 +782,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
>>
>>   	drm_gem_private_object_init(dev, &obj->base, args->user_size);
>>   	i915_gem_object_init(obj, &i915_gem_userptr_ops);
>> -	obj->cache_level = I915_CACHE_LLC;
>> +	obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
>
> This should be
> if (!HAS_SNOOP && !HAS_LLC)
> 	return -ENODEV;
>
> as we can not support a coherent pointer shared between memory and the GPU
> in this case.

Hm interesting, why not? I mean, what is the difference compared to 
normal BOs which will be uncached on !LLC?

Regards,

Tvrtko
Chris Wilson March 1, 2016, 5:36 p.m. UTC | #3
On Tue, Mar 01, 2016 at 05:21:49PM +0000, Tvrtko Ursulin wrote:
> 
> On 01/03/16 17:14, Chris Wilson wrote:
> >On Tue, Mar 01, 2016 at 04:29:35PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>    commit e5756c10d841ddb448293c849392f3d6b809561f
> >>    Author: Imre Deak <imre.deak@intel.com>
> >>    Date:   Fri Aug 14 18:43:30 2015 +0300
> >>
> >>        drm/i915/bxt: don't allow cached GEM mappings on A stepping
> >>
> >>Added an exception of disallowing snooping for Broxton A
> >>stepping hardware but userptr was still enabling it regardless.
> >>
> >>Move the check to HAS_SNOOP now that it is used from multiple
> >>call sites and use it.
> >>
> >>Idea is that userspace which relies on userptr snooping, for
> >>example for fine grained buffered SVM, can query the caching
> >>mode on a created userptr object to determine whether coherency
> >>is supported or not.
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>Cc: Imre Deak <imre.deak@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_drv.h         | 1 +
> >>  drivers/gpu/drm/i915/i915_gem.c         | 2 +-
> >>  drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
> >>  3 files changed, 3 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index 671295523317..73f0db17b990 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -2629,6 +2629,7 @@ struct drm_i915_cmd_table {
> >>  #define HAS_LLC(dev)		(INTEL_INFO(dev)->has_llc)
> >>  #define HAS_WT(dev)		((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \
> >>  				 __I915__(dev)->ellc_size)
> >>+#define HAS_SNOOP(dev)		(!IS_BXT_REVID(dev, 0, BXT_REVID_A1))
> >>  #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
> >>
> >>  #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index 3d31d3ac589e..1c6e8fb9d392 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -3949,7 +3949,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> >>  		 * cacheline, whereas normally such cachelines would get
> >>  		 * invalidated.
> >>  		 */
> >>-		if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
> >>+		if (!HAS_SNOOP(dev))
> >
> >if (!HAS_LLC && !HAS_SNOOP)
> >
> >and HAS_SNOOP should be true for everything that is not llc, with the
> >exception above. So make a dev_info->has_snoop feature bit (or
> >uses_snoop?) and then clear the copy inside the struct for the special
> >case.
> 
> Sure I can do that, but isn't it the same as this patch? Unless
> something with LLC and no snooping appears?

I am generalising into something that makes sense here.

if (!HAS_SNOOP(dev)) does not make sense by itself since LLC is not
snooping and so this would appear to also reject all of LLC.

> 
> >>  			return -ENODEV;
> >>
> >>  		level = I915_CACHE_LLC;
> >>diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> >>index 4b09c840d493..48aaff019b42 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> >>@@ -782,7 +782,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
> >>
> >>  	drm_gem_private_object_init(dev, &obj->base, args->user_size);
> >>  	i915_gem_object_init(obj, &i915_gem_userptr_ops);
> >>-	obj->cache_level = I915_CACHE_LLC;
> >>+	obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
> >
> >This should be
> >if (!HAS_SNOOP && !HAS_LLC)
> >	return -ENODEV;
> >
> >as we can not support a coherent pointer shared between memory and the GPU
> >in this case.
> 
> Hm interesting, why not? I mean, what is the difference compared to
> normal BOs which will be uncached on !LLC?

They are allocated as device memory. userptr is for importing normal system
pages into the device with the expectation that such is coherent. To
change that we should have additional flags that tell us that the user
is expecting non-coherent bo which requires the user to manage sync.
That is also typically (thinking about the GL spec) also explicitly
requested at the client API level.
-Chris
Tvrtko Ursulin March 2, 2016, 1:49 p.m. UTC | #4
On 02/03/16 12:31, Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915: Avoid snooping with userptr where not supported (rev3)
> URL   : https://patchwork.freedesktop.org/series/3979/
> State : failure
>
> == Summary ==
>
> Series 3979v3 drm/i915: Avoid snooping with userptr where not supported
> http://patchwork.freedesktop.org/api/1.0/series/3979/revisions/3/mbox/
>
> Test kms_flip:
>          Subgroup basic-flip-vs-wf_vblank:
>                  fail       -> PASS       (snb-x220t)
>                  dmesg-warn -> PASS       (hsw-gt2)
>          Subgroup basic-plain-flip:
>                  pass       -> DMESG-WARN (hsw-brixbox)

Unrelated: https://bugs.freedesktop.org/show_bug.cgi?id=94349

> Test kms_pipe_crc_basic:
>          Subgroup read-crc-pipe-c:
>                  dmesg-warn -> PASS       (bsw-nuc-2)
>          Subgroup suspend-read-crc-pipe-b:
>                  pass       -> FAIL       (byt-nuc)
>          Subgroup suspend-read-crc-pipe-c:
>                  skip       -> FAIL       (byt-nuc)

New unrelated: https://bugs.freedesktop.org/show_bug.cgi?id=94370

> Test pm_rpm:
>          Subgroup basic-pci-d3-state:
>                  pass       -> DMESG-WARN (bsw-nuc-2)

Unrelated: https://bugs.freedesktop.org/show_bug.cgi?id=94164

>          Subgroup basic-rte:
>                  pass       -> DMESG-WARN (snb-dellxps)

Again: https://bugs.freedesktop.org/show_bug.cgi?id=94349

>
> bdw-nuci7        total:169  pass:158  dwarn:0   dfail:0   fail:0   skip:11
> bdw-ultra        total:169  pass:155  dwarn:0   dfail:0   fail:0   skip:14
> bsw-nuc-2        total:169  pass:137  dwarn:1   dfail:0   fail:1   skip:30
> byt-nuc          total:169  pass:143  dwarn:0   dfail:0   fail:2   skip:24
> hsw-brixbox      total:169  pass:153  dwarn:1   dfail:0   fail:0   skip:15
> hsw-gt2          total:169  pass:159  dwarn:0   dfail:0   fail:0   skip:10
> ilk-hp8440p      total:169  pass:118  dwarn:0   dfail:0   fail:1   skip:50
> ivb-t430s        total:169  pass:154  dwarn:0   dfail:0   fail:0   skip:15
> skl-i5k-2        total:169  pass:153  dwarn:0   dfail:0   fail:0   skip:16
> skl-i7k-2        total:169  pass:153  dwarn:0   dfail:0   fail:0   skip:16
> snb-dellxps      total:169  pass:145  dwarn:1   dfail:0   fail:0   skip:23
> snb-x220t        total:169  pass:146  dwarn:0   dfail:1   fail:0   skip:22
>
> Results at /archive/results/CI_IGT_test/Patchwork_1513/
>
> 37415cefa4f17869f5ed40a42dae85f1e6b519be drm-intel-nightly: 2016y-03m-02d-10h-20m-17s UTC integration manifest
> d37b311fb28390fd8131caa5db1a623a86e81842 drm/i915: Avoid snooping with userptr where not supported

Merged, thanks for the review!

Regards,

Tvrtko
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 671295523317..73f0db17b990 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2629,6 +2629,7 @@  struct drm_i915_cmd_table {
 #define HAS_LLC(dev)		(INTEL_INFO(dev)->has_llc)
 #define HAS_WT(dev)		((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \
 				 __I915__(dev)->ellc_size)
+#define HAS_SNOOP(dev)		(!IS_BXT_REVID(dev, 0, BXT_REVID_A1))
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 
 #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3d31d3ac589e..1c6e8fb9d392 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3949,7 +3949,7 @@  int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 		 * cacheline, whereas normally such cachelines would get
 		 * invalidated.
 		 */
-		if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
+		if (!HAS_SNOOP(dev))
 			return -ENODEV;
 
 		level = I915_CACHE_LLC;
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 4b09c840d493..48aaff019b42 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -782,7 +782,7 @@  i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
 
 	drm_gem_private_object_init(dev, &obj->base, args->user_size);
 	i915_gem_object_init(obj, &i915_gem_userptr_ops);
-	obj->cache_level = I915_CACHE_LLC;
+	obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;