Message ID | 1456849775-15048-1-git-send-email-tvrtko.ursulin@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Mar 01, 2016 at 04:29:35PM +0000, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > commit e5756c10d841ddb448293c849392f3d6b809561f > Author: Imre Deak <imre.deak@intel.com> > Date: Fri Aug 14 18:43:30 2015 +0300 > > drm/i915/bxt: don't allow cached GEM mappings on A stepping > > Added an exception of disallowing snooping for Broxton A > stepping hardware but userptr was still enabling it regardless. > > Move the check to HAS_SNOOP now that it is used from multiple > call sites and use it. > > Idea is that userspace which relies on userptr snooping, for > example for fine grained buffered SVM, can query the caching > mode on a created userptr object to determine whether coherency > is supported or not. > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Imre Deak <imre.deak@intel.com> > --- > drivers/gpu/drm/i915/i915_drv.h | 1 + > drivers/gpu/drm/i915/i915_gem.c | 2 +- > drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- > 3 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 671295523317..73f0db17b990 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -2629,6 +2629,7 @@ struct drm_i915_cmd_table { > #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) > #define HAS_WT(dev) ((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \ > __I915__(dev)->ellc_size) > +#define HAS_SNOOP(dev) (!IS_BXT_REVID(dev, 0, BXT_REVID_A1)) > #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) > > #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 6) > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 3d31d3ac589e..1c6e8fb9d392 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3949,7 +3949,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > * cacheline, whereas normally such cachelines would get > * invalidated. > */ > - if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) > + if (!HAS_SNOOP(dev)) if (!HAS_LLC && !HAS_SNOOP) and HAS_SNOOP should be true for everything that is not llc, with the exception above. So make a dev_info->has_snoop feature bit (or uses_snoop?) and then clear the copy inside the struct for the special case. > return -ENODEV; > > level = I915_CACHE_LLC; > diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c > index 4b09c840d493..48aaff019b42 100644 > --- a/drivers/gpu/drm/i915/i915_gem_userptr.c > +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c > @@ -782,7 +782,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file > > drm_gem_private_object_init(dev, &obj->base, args->user_size); > i915_gem_object_init(obj, &i915_gem_userptr_ops); > - obj->cache_level = I915_CACHE_LLC; > + obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE; This should be if (!HAS_SNOOP && !HAS_LLC) return -ENODEV; as we can not support a coherent pointer shared between memory and the GPU in this case. -Chris
On 01/03/16 17:14, Chris Wilson wrote: > On Tue, Mar 01, 2016 at 04:29:35PM +0000, Tvrtko Ursulin wrote: >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >> >> commit e5756c10d841ddb448293c849392f3d6b809561f >> Author: Imre Deak <imre.deak@intel.com> >> Date: Fri Aug 14 18:43:30 2015 +0300 >> >> drm/i915/bxt: don't allow cached GEM mappings on A stepping >> >> Added an exception of disallowing snooping for Broxton A >> stepping hardware but userptr was still enabling it regardless. >> >> Move the check to HAS_SNOOP now that it is used from multiple >> call sites and use it. >> >> Idea is that userspace which relies on userptr snooping, for >> example for fine grained buffered SVM, can query the caching >> mode on a created userptr object to determine whether coherency >> is supported or not. >> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> >> Cc: Imre Deak <imre.deak@intel.com> >> --- >> drivers/gpu/drm/i915/i915_drv.h | 1 + >> drivers/gpu/drm/i915/i915_gem.c | 2 +- >> drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- >> 3 files changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h >> index 671295523317..73f0db17b990 100644 >> --- a/drivers/gpu/drm/i915/i915_drv.h >> +++ b/drivers/gpu/drm/i915/i915_drv.h >> @@ -2629,6 +2629,7 @@ struct drm_i915_cmd_table { >> #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) >> #define HAS_WT(dev) ((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \ >> __I915__(dev)->ellc_size) >> +#define HAS_SNOOP(dev) (!IS_BXT_REVID(dev, 0, BXT_REVID_A1)) >> #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) >> >> #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 6) >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c >> index 3d31d3ac589e..1c6e8fb9d392 100644 >> --- a/drivers/gpu/drm/i915/i915_gem.c >> +++ b/drivers/gpu/drm/i915/i915_gem.c >> @@ -3949,7 +3949,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, >> * cacheline, whereas normally such cachelines would get >> * invalidated. >> */ >> - if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) >> + if (!HAS_SNOOP(dev)) > > if (!HAS_LLC && !HAS_SNOOP) > > and HAS_SNOOP should be true for everything that is not llc, with the > exception above. So make a dev_info->has_snoop feature bit (or > uses_snoop?) and then clear the copy inside the struct for the special > case. Sure I can do that, but isn't it the same as this patch? Unless something with LLC and no snooping appears? >> return -ENODEV; >> >> level = I915_CACHE_LLC; >> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c >> index 4b09c840d493..48aaff019b42 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c >> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c >> @@ -782,7 +782,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file >> >> drm_gem_private_object_init(dev, &obj->base, args->user_size); >> i915_gem_object_init(obj, &i915_gem_userptr_ops); >> - obj->cache_level = I915_CACHE_LLC; >> + obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE; > > This should be > if (!HAS_SNOOP && !HAS_LLC) > return -ENODEV; > > as we can not support a coherent pointer shared between memory and the GPU > in this case. Hm interesting, why not? I mean, what is the difference compared to normal BOs which will be uncached on !LLC? Regards, Tvrtko
On Tue, Mar 01, 2016 at 05:21:49PM +0000, Tvrtko Ursulin wrote: > > On 01/03/16 17:14, Chris Wilson wrote: > >On Tue, Mar 01, 2016 at 04:29:35PM +0000, Tvrtko Ursulin wrote: > >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > >> > >> commit e5756c10d841ddb448293c849392f3d6b809561f > >> Author: Imre Deak <imre.deak@intel.com> > >> Date: Fri Aug 14 18:43:30 2015 +0300 > >> > >> drm/i915/bxt: don't allow cached GEM mappings on A stepping > >> > >>Added an exception of disallowing snooping for Broxton A > >>stepping hardware but userptr was still enabling it regardless. > >> > >>Move the check to HAS_SNOOP now that it is used from multiple > >>call sites and use it. > >> > >>Idea is that userspace which relies on userptr snooping, for > >>example for fine grained buffered SVM, can query the caching > >>mode on a created userptr object to determine whether coherency > >>is supported or not. > >> > >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > >>Cc: Imre Deak <imre.deak@intel.com> > >>--- > >> drivers/gpu/drm/i915/i915_drv.h | 1 + > >> drivers/gpu/drm/i915/i915_gem.c | 2 +- > >> drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- > >> 3 files changed, 3 insertions(+), 2 deletions(-) > >> > >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > >>index 671295523317..73f0db17b990 100644 > >>--- a/drivers/gpu/drm/i915/i915_drv.h > >>+++ b/drivers/gpu/drm/i915/i915_drv.h > >>@@ -2629,6 +2629,7 @@ struct drm_i915_cmd_table { > >> #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) > >> #define HAS_WT(dev) ((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \ > >> __I915__(dev)->ellc_size) > >>+#define HAS_SNOOP(dev) (!IS_BXT_REVID(dev, 0, BXT_REVID_A1)) > >> #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) > >> > >> #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 6) > >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > >>index 3d31d3ac589e..1c6e8fb9d392 100644 > >>--- a/drivers/gpu/drm/i915/i915_gem.c > >>+++ b/drivers/gpu/drm/i915/i915_gem.c > >>@@ -3949,7 +3949,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > >> * cacheline, whereas normally such cachelines would get > >> * invalidated. > >> */ > >>- if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) > >>+ if (!HAS_SNOOP(dev)) > > > >if (!HAS_LLC && !HAS_SNOOP) > > > >and HAS_SNOOP should be true for everything that is not llc, with the > >exception above. So make a dev_info->has_snoop feature bit (or > >uses_snoop?) and then clear the copy inside the struct for the special > >case. > > Sure I can do that, but isn't it the same as this patch? Unless > something with LLC and no snooping appears? I am generalising into something that makes sense here. if (!HAS_SNOOP(dev)) does not make sense by itself since LLC is not snooping and so this would appear to also reject all of LLC. > > >> return -ENODEV; > >> > >> level = I915_CACHE_LLC; > >>diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c > >>index 4b09c840d493..48aaff019b42 100644 > >>--- a/drivers/gpu/drm/i915/i915_gem_userptr.c > >>+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c > >>@@ -782,7 +782,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file > >> > >> drm_gem_private_object_init(dev, &obj->base, args->user_size); > >> i915_gem_object_init(obj, &i915_gem_userptr_ops); > >>- obj->cache_level = I915_CACHE_LLC; > >>+ obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE; > > > >This should be > >if (!HAS_SNOOP && !HAS_LLC) > > return -ENODEV; > > > >as we can not support a coherent pointer shared between memory and the GPU > >in this case. > > Hm interesting, why not? I mean, what is the difference compared to > normal BOs which will be uncached on !LLC? They are allocated as device memory. userptr is for importing normal system pages into the device with the expectation that such is coherent. To change that we should have additional flags that tell us that the user is expecting non-coherent bo which requires the user to manage sync. That is also typically (thinking about the GL spec) also explicitly requested at the client API level. -Chris
On 02/03/16 12:31, Patchwork wrote: > == Series Details == > > Series: drm/i915: Avoid snooping with userptr where not supported (rev3) > URL : https://patchwork.freedesktop.org/series/3979/ > State : failure > > == Summary == > > Series 3979v3 drm/i915: Avoid snooping with userptr where not supported > http://patchwork.freedesktop.org/api/1.0/series/3979/revisions/3/mbox/ > > Test kms_flip: > Subgroup basic-flip-vs-wf_vblank: > fail -> PASS (snb-x220t) > dmesg-warn -> PASS (hsw-gt2) > Subgroup basic-plain-flip: > pass -> DMESG-WARN (hsw-brixbox) Unrelated: https://bugs.freedesktop.org/show_bug.cgi?id=94349 > Test kms_pipe_crc_basic: > Subgroup read-crc-pipe-c: > dmesg-warn -> PASS (bsw-nuc-2) > Subgroup suspend-read-crc-pipe-b: > pass -> FAIL (byt-nuc) > Subgroup suspend-read-crc-pipe-c: > skip -> FAIL (byt-nuc) New unrelated: https://bugs.freedesktop.org/show_bug.cgi?id=94370 > Test pm_rpm: > Subgroup basic-pci-d3-state: > pass -> DMESG-WARN (bsw-nuc-2) Unrelated: https://bugs.freedesktop.org/show_bug.cgi?id=94164 > Subgroup basic-rte: > pass -> DMESG-WARN (snb-dellxps) Again: https://bugs.freedesktop.org/show_bug.cgi?id=94349 > > bdw-nuci7 total:169 pass:158 dwarn:0 dfail:0 fail:0 skip:11 > bdw-ultra total:169 pass:155 dwarn:0 dfail:0 fail:0 skip:14 > bsw-nuc-2 total:169 pass:137 dwarn:1 dfail:0 fail:1 skip:30 > byt-nuc total:169 pass:143 dwarn:0 dfail:0 fail:2 skip:24 > hsw-brixbox total:169 pass:153 dwarn:1 dfail:0 fail:0 skip:15 > hsw-gt2 total:169 pass:159 dwarn:0 dfail:0 fail:0 skip:10 > ilk-hp8440p total:169 pass:118 dwarn:0 dfail:0 fail:1 skip:50 > ivb-t430s total:169 pass:154 dwarn:0 dfail:0 fail:0 skip:15 > skl-i5k-2 total:169 pass:153 dwarn:0 dfail:0 fail:0 skip:16 > skl-i7k-2 total:169 pass:153 dwarn:0 dfail:0 fail:0 skip:16 > snb-dellxps total:169 pass:145 dwarn:1 dfail:0 fail:0 skip:23 > snb-x220t total:169 pass:146 dwarn:0 dfail:1 fail:0 skip:22 > > Results at /archive/results/CI_IGT_test/Patchwork_1513/ > > 37415cefa4f17869f5ed40a42dae85f1e6b519be drm-intel-nightly: 2016y-03m-02d-10h-20m-17s UTC integration manifest > d37b311fb28390fd8131caa5db1a623a86e81842 drm/i915: Avoid snooping with userptr where not supported Merged, thanks for the review! Regards, Tvrtko
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 671295523317..73f0db17b990 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2629,6 +2629,7 @@ struct drm_i915_cmd_table { #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) #define HAS_WT(dev) ((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \ __I915__(dev)->ellc_size) +#define HAS_SNOOP(dev) (!IS_BXT_REVID(dev, 0, BXT_REVID_A1)) #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 6) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 3d31d3ac589e..1c6e8fb9d392 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3949,7 +3949,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, * cacheline, whereas normally such cachelines would get * invalidated. */ - if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) + if (!HAS_SNOOP(dev)) return -ENODEV; level = I915_CACHE_LLC; diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 4b09c840d493..48aaff019b42 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -782,7 +782,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file drm_gem_private_object_init(dev, &obj->base, args->user_size); i915_gem_object_init(obj, &i915_gem_userptr_ops); - obj->cache_level = I915_CACHE_LLC; + obj->cache_level = HAS_SNOOP(dev) ? I915_CACHE_LLC : I915_CACHE_NONE; obj->base.write_domain = I915_GEM_DOMAIN_CPU; obj->base.read_domains = I915_GEM_DOMAIN_CPU;