drm/i915/selftests: Provide full mb() around clflush

Message ID 20180706174926.4712-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson July 6, 2018, 5:49 p.m. UTC
clflush is an unserialised instruction, and the IA manual strongly advises
serialising it with an mb(). To be cautious, apply one before and one
after, so that the flush is serialised against both writes and reads
without worrying too much about the required direction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../drm/i915/selftests/i915_gem_coherency.c   | 21 ++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

Comments

Rodrigo Vivi July 6, 2018, 8:23 p.m. UTC | #1
On Fri, Jul 06, 2018 at 06:49:26PM +0100, Chris Wilson wrote:
> clflush is an unserialised instruction and the IA manual strongly advises
> you to serialise it with a mb. To be cautious, apply one before and one
> after,

my understanding is that we need one before and one after anyway;
it's not just a matter of being cautious for caution's sake..

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Chris Wilson July 6, 2018, 8:27 p.m. UTC | #2
Quoting Rodrigo Vivi (2018-07-06 21:23:00)
> On Fri, Jul 06, 2018 at 06:49:26PM +0100, Chris Wilson wrote:
> > clflush is an unserialised instruction and the IA manual strongly advises
> > you to serialise it with a mb. To be cautious, apply one before and one
> > after,
> 
> my understanding is that we need one before and one after anyways,
> not just a matter of being cautious for being cautious..
> 
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

If you are really careful, you can reduce it to an mb() between writes and
the clflush; and vice versa, an mb() between the clflush and subsequent
reads. It's all about preventing the clflush from overtaking, or being
overtaken by, the operations it is meant to flush or invalidate
respectively.

But since we're being paranoid here, do both.
-Chris
Rodrigo Vivi July 6, 2018, 8:31 p.m. UTC | #3
On Fri, Jul 06, 2018 at 09:27:47PM +0100, Chris Wilson wrote:
> Quoting Rodrigo Vivi (2018-07-06 21:23:00)
> > On Fri, Jul 06, 2018 at 06:49:26PM +0100, Chris Wilson wrote:
> > > clflush is an unserialised instruction and the IA manual strongly advises
> > > you to serialise it with a mb. To be cautious, apply one before and one
> > > after,
> > 
> > my understanding is that we need one before and one after anyways,
> > not just a matter of being cautious for being cautious..
> > 
> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> If you are really careful, you can reduce it to an mb() between writes and
> the clflush; and vice versa, an mb() between the clflush and subsequent
> reads. It's all about preventing the clflush from overtaking, or being
> overtaken by, the operations it is meant to flush or invalidate
> respectively.
> 
> But since we're being paranoid here, do both.

hm... makes sense... better to be paranoid...

> -Chris
Patch

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
index 294c58aba2c1..df44c302a9fe 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
@@ -42,11 +42,21 @@ static int cpu_set(struct drm_i915_gem_object *obj,
 
 	page = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
 	map = kmap_atomic(page);
-	if (needs_clflush & CLFLUSH_BEFORE)
+
+	if (needs_clflush & CLFLUSH_BEFORE) {
+		mb();
 		clflush(map+offset_in_page(offset) / sizeof(*map));
+		mb();
+	}
+
 	map[offset_in_page(offset) / sizeof(*map)] = v;
-	if (needs_clflush & CLFLUSH_AFTER)
+
+	if (needs_clflush & CLFLUSH_AFTER) {
+		mb();
 		clflush(map+offset_in_page(offset) / sizeof(*map));
+		mb();
+	}
+
 	kunmap_atomic(map);
 
 	i915_gem_obj_finish_shmem_access(obj);
@@ -68,8 +78,13 @@ static int cpu_get(struct drm_i915_gem_object *obj,
 
 	page = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
 	map = kmap_atomic(page);
-	if (needs_clflush & CLFLUSH_BEFORE)
+
+	if (needs_clflush & CLFLUSH_BEFORE) {
+		mb();
 		clflush(map+offset_in_page(offset) / sizeof(*map));
+		mb();
+	}
+
 	*v = map[offset_in_page(offset) / sizeof(*map)];
 	kunmap_atomic(map);