diff mbox series

[i-g-t,v3] tests/prime_vgem: Examine blitter access path

Message ID 20200204114113.22436-1-janusz.krzysztofik@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Janusz Krzysztofik Feb. 4, 2020, 11:41 a.m. UTC
On future hardware with missing GGTT BAR we won't be able to exercise
dma-buf access via that path.  An alternative to basic-gtt subtest for
testing dma-buf access is required, as well as basic-fence-mmap and
coherency-gtt subtest alternatives for testing WC coherency.

Access to the dma sg list feature exposed by dma-buf can be tested
through blitter.  Unfortunately we don't have any equivalently simple
tests that use blitter.  Provide them.

XY_SRC_COPY_BLT method implemented by igt_blitter_src_copy() IGT
library helper has been chosen.

v2: As fast copy is not supported on platforms older than Gen 9,
    use XY_SRC_COPY instead (Chris),
  - add subtest descriptions.
v3: Don't calculate the pitch, use scratch.pitch returned by
    vgem_create() (Chris),
  - replace constants with values from respective fields of scratch
    (Chris),
  - use _u32 variant of igt_assert_eq() for better readability of
    possible error messages (Chris),
  - sleep a bit to emphasize that the only thing stopping the blitter
    is the fence (Chris),
  - use prime_sync_start/end() as the recommended practice for
    inter-device sync, not gem_sync() (Chris),
  - update the name of used XY_SRC_COPY_BLT helper to match the name of
    its library version just merged.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
---
Hi Chris,

I hope I've understood and addressed your comments correctly so your
R-b still applies.

Thanks,
Janusz

 tests/prime_vgem.c | 187 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 187 insertions(+)

Comments

Chris Wilson Feb. 11, 2020, 11:39 a.m. UTC | #1
Quoting Janusz Krzysztofik (2020-02-04 11:41:13)
> On future hardware with missing GGTT BAR we won't be able to exercise
> dma-buf access via that path.  An alternative to basic-gtt subtest for
> testing dma-buf access is required, as well as basic-fence-mmap and
> coherency-gtt subtest alternatives for testing WC coherency.
> 
> Access to the dma sg list feature exposed by dma-buf can be tested
> through blitter.  Unfortunately we don't have any equivalently simple
> tests that use blitter.  Provide them.
> 
> XY_SRC_COPY_BLT method implemented by igt_blitter_src_copy() IGT
> library helper has been chosen.
> 
> v2: As fast copy is not supported on platforms older than Gen 9,
>     use XY_SRC_COPY instead (Chris),
>   - add subtest descriptions.
> v3: Don't calculate the pitch, use scratch.pitch returned by
>     vgem_create() (Chris),
>   - replace constants with values from respective fields of scratch
>     (Chris),
>   - use _u32 variant of igt_assert_eq() for better readability of
>     possible error messages (Chris),
>   - sleep a bit to emphasize that the only thing stopping the blitter
>     is the fence (Chris),
>   - use prime_sync_start/end() as the recommended practice for
>     inter-device sync, not gem_sync() (Chris),
>   - update the name of used XY_SRC_COPY_BLT helper to match the name of
>     its library version just merged.
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> ---
> Hi Chris,
> 
> I hope I've understood and addressed your comments correctly so your
> R-b still applies.

Sure, just spotted one slight slip in uapi usage,

> +static void test_blt_interleaved(int vgem, int i915)
> +{
> +       struct vgem_bo scratch;
> +       uint32_t prime, native;
> +       uint32_t *foreign, *local;
> +       int dmabuf, i;
> +
> +       scratch.width = 1024;
> +       scratch.height = 1024;
> +       scratch.bpp = 32;
> +       vgem_create(vgem, &scratch);
> +
> +       dmabuf = prime_handle_to_fd(vgem, scratch.handle);
> +       prime = prime_fd_to_handle(i915, dmabuf);
> +
> +       native = gem_create(i915, scratch.size);
> +
> +       foreign = vgem_mmap(vgem, &scratch, PROT_WRITE);
> +       local = gem_mmap__wc(i915, native, 0, scratch.size, PROT_WRITE);
> +
> +       for (i = 0; i < scratch.height; i++) {
> +               local[scratch.pitch * i / sizeof(*local)] = i;
> +               igt_blitter_src_copy(i915, native, 0, scratch.pitch,
> +                                    I915_TILING_NONE, 0, i, scratch.width, 1,
> +                                    scratch.bpp, prime, 0, scratch.pitch,
> +                                    I915_TILING_NONE, 0, i);
> +               prime_sync_start(dmabuf, true);
> +               prime_sync_end(dmabuf, true);
> +               igt_assert_eq_u32(foreign[scratch.pitch * i / sizeof(*foreign)],
> +                                 i);

sync_start()
igt_assert...
sync_end()

> +
> +               foreign[scratch.pitch * i / sizeof(*foreign)] = ~i;
> +               igt_blitter_src_copy(i915, prime, 0, scratch.pitch,
> +                                    I915_TILING_NONE, 0, i, scratch.width, 1,
> +                                    scratch.bpp, native, 0, scratch.pitch,
> +                                    I915_TILING_NONE, 0, i);
> +               gem_sync(i915, native);
> +               igt_assert_eq_u32(local[scratch.pitch * i / sizeof(*local)],
> +                                 ~i);
> +       }

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Janusz Krzysztofik Feb. 12, 2020, 9:08 a.m. UTC | #2
Hi Chris,

On Tuesday, February 11, 2020 12:39:36 PM CET Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2020-02-04 11:41:13)
> > On future hardware with missing GGTT BAR we won't be able to exercise
> > dma-buf access via that path.  An alternative to basic-gtt subtest for
> > testing dma-buf access is required, as well as basic-fence-mmap and
> > coherency-gtt subtest alternatives for testing WC coherency.
> > 
> > Access to the dma sg list feature exposed by dma-buf can be tested
> > through blitter.  Unfortunately we don't have any equivalently simple
> > tests that use blitter.  Provide them.
> > 
> > XY_SRC_COPY_BLT method implemented by igt_blitter_src_copy() IGT
> > library helper has been chosen.
> > 
> > v2: As fast copy is not supported on platforms older than Gen 9,
> >     use XY_SRC_COPY instead (Chris),
> >   - add subtest descriptions.
> > v3: Don't calculate the pitch, use scratch.pitch returned by
> >     vgem_create() (Chris),
> >   - replace constants with values from respective fields of scratch
> >     (Chris),
> >   - use _u32 variant of igt_assert_eq() for better readability of
> >     possible error messages (Chris),
> >   - sleep a bit to emphasize that the only thing stopping the blitter
> >     is the fence (Chris),
> >   - use prime_sync_start/end() as the recommended practice for
> >     inter-device sync, not gem_sync() (Chris),
> >   - update the name of used XY_SRC_COPY_BLT helper to match the name of
> >     its library version just merged.
> > 
> > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > ---
> > Hi Chris,
> > 
> > I hope I've understood and addressed your comments correctly so your
> > R-b still applies.
> 
> Sure, just spotted one slight slip in uapi usage,
> 
> > +static void test_blt_interleaved(int vgem, int i915)
> > +{
> > +       struct vgem_bo scratch;
> > +       uint32_t prime, native;
> > +       uint32_t *foreign, *local;
> > +       int dmabuf, i;
> > +
> > +       scratch.width = 1024;
> > +       scratch.height = 1024;
> > +       scratch.bpp = 32;
> > +       vgem_create(vgem, &scratch);
> > +
> > +       dmabuf = prime_handle_to_fd(vgem, scratch.handle);
> > +       prime = prime_fd_to_handle(i915, dmabuf);
> > +
> > +       native = gem_create(i915, scratch.size);
> > +
> > +       foreign = vgem_mmap(vgem, &scratch, PROT_WRITE);
> > +       local = gem_mmap__wc(i915, native, 0, scratch.size, PROT_WRITE);
> > +
> > +       for (i = 0; i < scratch.height; i++) {
> > +               local[scratch.pitch * i / sizeof(*local)] = i;
> > +               igt_blitter_src_copy(i915, native, 0, scratch.pitch,
> > +                                    I915_TILING_NONE, 0, i, scratch.width, 1,
> > +                                    scratch.bpp, prime, 0, scratch.pitch,
> > +                                    I915_TILING_NONE, 0, i);
> > +               prime_sync_start(dmabuf, true);
> > +               prime_sync_end(dmabuf, true);
> > +               igt_assert_eq_u32(foreign[scratch.pitch * i / sizeof(*foreign)],
> > +                                 i);
> 
> sync_start()
> igt_assert...
> sync_end()

While your modification seems harmless to me, could you please explain why?
'foreign' is not a map to dma-buf, it's a map to a vgem object.  Why should
we surround access to an mmapped vgem object with prime_sync_start/end()?
I think the vgem driver should take care of synchronization/serialization
as needed.

My intention was to use an empty prime_sync_start/end() instead of 
gem_sync(prime) for the sole purpose of making sure blitter copy was completed 
before we examine results from the vgem side.

Thanks,
Janusz


> > +
> > +               foreign[scratch.pitch * i / sizeof(*foreign)] = ~i;
> > +               igt_blitter_src_copy(i915, prime, 0, scratch.pitch,
> > +                                    I915_TILING_NONE, 0, i, scratch.width, 1,
> > +                                    scratch.bpp, native, 0, scratch.pitch,
> > +                                    I915_TILING_NONE, 0, i);
> > +               gem_sync(i915, native);
> > +               igt_assert_eq_u32(local[scratch.pitch * i / sizeof(*local)],
> > +                                 ~i);
> > +       }
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
>
Chris Wilson Feb. 12, 2020, 9:52 a.m. UTC | #3
Quoting Janusz Krzysztofik (2020-02-12 09:08:53)
> Hi Chris,
> 
> On Tuesday, February 11, 2020 12:39:36 PM CET Chris Wilson wrote:
> > Quoting Janusz Krzysztofik (2020-02-04 11:41:13)
> > > On future hardware with missing GGTT BAR we won't be able to exercise
> > > dma-buf access via that path.  An alternative to basic-gtt subtest for
> > > testing dma-buf access is required, as well as basic-fence-mmap and
> > > coherency-gtt subtest alternatives for testing WC coherency.
> > > 
> > > Access to the dma sg list feature exposed by dma-buf can be tested
> > > through blitter.  Unfortunately we don't have any equivalently simple
> > > tests that use blitter.  Provide them.
> > > 
> > > XY_SRC_COPY_BLT method implemented by igt_blitter_src_copy() IGT
> > > library helper has been chosen.
> > > 
> > > v2: As fast copy is not supported on platforms older than Gen 9,
> > >     use XY_SRC_COPY instead (Chris),
> > >   - add subtest descriptions.
> > > v3: Don't calculate the pitch, use scratch.pitch returned by
> > >     vgem_create() (Chris),
> > >   - replace constants with values from respective fields of scratch
> > >     (Chris),
> > >   - use _u32 variant of igt_assert_eq() for better readability of
> > >     possible error messages (Chris),
> > >   - sleep a bit to emphasize that the only thing stopping the blitter
> > >     is the fence (Chris),
> > >   - use prime_sync_start/end() as the recommended practice for
> > >     inter-device sync, not gem_sync() (Chris),
> > >   - update the name of used XY_SRC_COPY_BLT helper to match the name of
> > >     its library version just merged.
> > > 
> > > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > > ---
> > > Hi Chris,
> > > 
> > > I hope I've understood and addressed your comments correctly so your
> > > R-b still applies.
> > 
> > Sure, just spotted one slight slip in uapi usage,
> > 
> > > +static void test_blt_interleaved(int vgem, int i915)
> > > +{
> > > +       struct vgem_bo scratch;
> > > +       uint32_t prime, native;
> > > +       uint32_t *foreign, *local;
> > > +       int dmabuf, i;
> > > +
> > > +       scratch.width = 1024;
> > > +       scratch.height = 1024;
> > > +       scratch.bpp = 32;
> > > +       vgem_create(vgem, &scratch);
> > > +
> > > +       dmabuf = prime_handle_to_fd(vgem, scratch.handle);
> > > +       prime = prime_fd_to_handle(i915, dmabuf);
> > > +
> > > +       native = gem_create(i915, scratch.size);
> > > +
> > > +       foreign = vgem_mmap(vgem, &scratch, PROT_WRITE);
> > > +       local = gem_mmap__wc(i915, native, 0, scratch.size, PROT_WRITE);
> > > +
> > > +       for (i = 0; i < scratch.height; i++) {
> > > +               local[scratch.pitch * i / sizeof(*local)] = i;
> > > +               igt_blitter_src_copy(i915, native, 0, scratch.pitch,
> > > +                                    I915_TILING_NONE, 0, i, scratch.width, 1,
> > > +                                    scratch.bpp, prime, 0, scratch.pitch,
> > > +                                    I915_TILING_NONE, 0, i);
> > > +               prime_sync_start(dmabuf, true);
> > > +               prime_sync_end(dmabuf, true);
> > > +               igt_assert_eq_u32(foreign[scratch.pitch * i / sizeof(*foreign)],
> > > +                                 i);
> > 
> > sync_start()
> > igt_assert...
> > sync_end()
> 
> While your modification seems harmless to me, could you please explain why?
> 'foreign' is not a map to dma-buf, it's a map to a vgem object.  Why should
> we surround access to an mmapped vgem object with prime_sync_start/end()?
> I think the vgem driver should take care of synchronization/serialization
> as needed.

As I understood the flow of mmaps, dmabuf is pointing to vgem and so we
are using the dmabuf synchronisation to regulate access to the vgem
mmap.

Your intention was for the sequence to be:

i915: do stuff

dmabuf: sync

vgem: assume coherent.

Whereas I was looking at it from the perspective of having to use dmabuf
as the access control for vgem, since it has none of its own.
 
> My intention was to use an empty prime_sync_start/end() instead of 
> gem_sync(prime) for the sole purpose of making sure blitter copy was completed 
> before we examine results from the vgem side.

I see. We have a triangle, with sync only on the dmabuf vertex.
The way I was looking at it is that our api should be focusing on /only/
having the dmabuf as the conduit:

         dmabuf
vgem |<---------->| i915


What does that mean for the api when we add in more nodes and have the
same buffer shared between multiple devices? Ugh.

And you can point out that since dmabuf has its own mmap and accessors,
it isn't a plain conduit, but more of a node in its own right.

The first one to figure it out wins the prize of writing the dmabuf
ioctls and documentation.
-Chris

Patch

diff --git a/tests/prime_vgem.c b/tests/prime_vgem.c
index 3bdb23007..96c527287 100644
--- a/tests/prime_vgem.c
+++ b/tests/prime_vgem.c
@@ -28,6 +28,8 @@ 
 #include <sys/poll.h>
 #include <time.h>
 
+#include "intel_batchbuffer.h"	/* igt_blitter_src_copy() */
+
 IGT_TEST_DESCRIPTION("Basic check of polling for prime/vgem fences.");
 
 static void test_read(int vgem, int i915)
@@ -181,6 +183,79 @@  static void test_fence_mmap(int i915, int vgem)
 	close(slave[1]);
 }
 
+static void test_fence_blt(int i915, int vgem)
+{
+	struct vgem_bo scratch;
+	uint32_t prime;
+	uint32_t *ptr;
+	uint32_t fence;
+	int dmabuf, i;
+	int master[2], slave[2];
+
+	igt_assert(pipe(master) == 0);
+	igt_assert(pipe(slave) == 0);
+
+	scratch.width = 1024;
+	scratch.height = 1024;
+	scratch.bpp = 32;
+	vgem_create(vgem, &scratch);
+
+	dmabuf = prime_handle_to_fd(vgem, scratch.handle);
+	prime = prime_fd_to_handle(i915, dmabuf);
+	close(dmabuf);
+
+	igt_fork(child, 1) {
+		uint32_t native;
+
+		close(master[0]);
+		close(slave[1]);
+
+		native = gem_create(i915, scratch.size);
+
+		ptr = gem_mmap__wc(i915, native, 0, scratch.size, PROT_READ);
+		for (i = 0; i < scratch.height; i++)
+			igt_assert_eq_u32(ptr[scratch.pitch * i / sizeof(*ptr)],
+					  0);
+
+		write(master[1], &child, sizeof(child));
+		read(slave[0], &child, sizeof(child));
+
+		igt_blitter_src_copy(i915, prime, 0, scratch.pitch,
+				     I915_TILING_NONE, 0, 0, scratch.width,
+				     scratch.height, scratch.bpp, native, 0,
+				     scratch.pitch, I915_TILING_NONE, 0, 0);
+		gem_sync(i915, native);
+
+		for (i = 0; i < scratch.height; i++)
+			igt_assert_eq_u32(ptr[scratch.pitch * i / sizeof(*ptr)],
+					  i);
+
+		munmap(ptr, scratch.size);
+		gem_close(i915, native);
+		gem_close(i915, prime);
+	}
+
+	close(master[1]);
+	close(slave[0]);
+	read(master[0], &i, sizeof(i));
+	fence = vgem_fence_attach(vgem, &scratch, VGEM_FENCE_WRITE);
+	write(slave[1], &i, sizeof(i));
+
+	/* Emphasize that the only thing stopping the blitter is the fence */
+	usleep(50*1000);
+
+	ptr = vgem_mmap(vgem, &scratch, PROT_WRITE);
+	for (i = 0; i < scratch.height; i++)
+		ptr[scratch.pitch * i / sizeof(*ptr)] = i;
+	munmap(ptr, scratch.size);
+	vgem_fence_signal(vgem, fence);
+	gem_close(vgem, scratch.handle);
+
+	igt_waitchildren();
+	close(master[0]);
+	close(slave[1]);
+}
+
 static void test_write(int vgem, int i915)
 {
 	struct vgem_bo scratch;
@@ -249,6 +324,57 @@  static void test_gtt(int vgem, int i915)
 	gem_close(vgem, scratch.handle);
 }
 
+static void test_blt(int vgem, int i915)
+{
+	struct vgem_bo scratch;
+	uint32_t prime, native;
+	uint32_t *ptr;
+	int dmabuf, i;
+
+	scratch.width = 1024;
+	scratch.height = 1024;
+	scratch.bpp = 32;
+	vgem_create(vgem, &scratch);
+
+	dmabuf = prime_handle_to_fd(vgem, scratch.handle);
+	prime = prime_fd_to_handle(i915, dmabuf);
+
+	native = gem_create(i915, scratch.size);
+
+	ptr = gem_mmap__wc(i915, native, 0, scratch.size, PROT_WRITE);
+	for (i = 0; i < scratch.height; i++)
+		ptr[scratch.pitch * i / sizeof(*ptr)] = i;
+	munmap(ptr, scratch.size);
+
+	igt_blitter_src_copy(i915, native, 0, scratch.pitch, I915_TILING_NONE,
+			     0, 0, scratch.width, scratch.height, scratch.bpp,
+			     prime, 0, scratch.pitch, I915_TILING_NONE, 0, 0);
+	prime_sync_start(dmabuf, true);
+	prime_sync_end(dmabuf, true);
+	close(dmabuf);
+
+	ptr = vgem_mmap(vgem, &scratch, PROT_READ | PROT_WRITE);
+	for (i = 0; i < scratch.height; i++) {
+		igt_assert_eq_u32(ptr[scratch.pitch * i / sizeof(*ptr)], i);
+		ptr[scratch.pitch * i / sizeof(*ptr)] = ~i;
+	}
+	munmap(ptr, scratch.size);
+
+	igt_blitter_src_copy(i915, prime, 0, scratch.pitch, I915_TILING_NONE,
+			     0, 0, scratch.width, scratch.height, scratch.bpp,
+			     native, 0, scratch.pitch, I915_TILING_NONE, 0, 0);
+	gem_sync(i915, native);
+
+	ptr = gem_mmap__wc(i915, native, 0, scratch.size, PROT_READ);
+	for (i = 0; i < scratch.height; i++)
+		igt_assert_eq_u32(ptr[scratch.pitch * i / sizeof(*ptr)], ~i);
+	munmap(ptr, scratch.size);
+
+	gem_close(i915, native);
+	gem_close(i915, prime);
+	gem_close(vgem, scratch.handle);
+}
+
 static void test_shrink(int vgem, int i915)
 {
 	struct vgem_bo scratch = {
@@ -332,6 +458,56 @@  static void test_gtt_interleaved(int vgem, int i915)
 	gem_close(vgem, scratch.handle);
 }
 
+static void test_blt_interleaved(int vgem, int i915)
+{
+	struct vgem_bo scratch;
+	uint32_t prime, native;
+	uint32_t *foreign, *local;
+	int dmabuf, i;
+
+	scratch.width = 1024;
+	scratch.height = 1024;
+	scratch.bpp = 32;
+	vgem_create(vgem, &scratch);
+
+	dmabuf = prime_handle_to_fd(vgem, scratch.handle);
+	prime = prime_fd_to_handle(i915, dmabuf);
+
+	native = gem_create(i915, scratch.size);
+
+	foreign = vgem_mmap(vgem, &scratch, PROT_WRITE);
+	local = gem_mmap__wc(i915, native, 0, scratch.size, PROT_WRITE);
+
+	for (i = 0; i < scratch.height; i++) {
+		local[scratch.pitch * i / sizeof(*local)] = i;
+		igt_blitter_src_copy(i915, native, 0, scratch.pitch,
+				     I915_TILING_NONE, 0, i, scratch.width, 1,
+				     scratch.bpp, prime, 0, scratch.pitch,
+				     I915_TILING_NONE, 0, i);
+		prime_sync_start(dmabuf, true);
+		prime_sync_end(dmabuf, true);
+		igt_assert_eq_u32(foreign[scratch.pitch * i / sizeof(*foreign)],
+				  i);
+
+		foreign[scratch.pitch * i / sizeof(*foreign)] = ~i;
+		igt_blitter_src_copy(i915, prime, 0, scratch.pitch,
+				     I915_TILING_NONE, 0, i, scratch.width, 1,
+				     scratch.bpp, native, 0, scratch.pitch,
+				     I915_TILING_NONE, 0, i);
+		gem_sync(i915, native);
+		igt_assert_eq_u32(local[scratch.pitch * i / sizeof(*local)],
+				  ~i);
+	}
+	close(dmabuf);
+
+	munmap(local, scratch.size);
+	munmap(foreign, scratch.size);
+
+	gem_close(i915, native);
+	gem_close(i915, prime);
+	gem_close(vgem, scratch.handle);
+}
+
 static bool prime_busy(int fd, bool excl)
 {
 	struct pollfd pfd = { .fd = fd, .events = excl ? POLLOUT : POLLIN };
@@ -849,12 +1025,20 @@  igt_main
 	igt_subtest("basic-gtt")
 		test_gtt(vgem, i915);
 
+	igt_describe("Examine blitter access path");
+	igt_subtest("basic-blt")
+		test_blt(vgem, i915);
+
 	igt_subtest("shrink")
 		test_shrink(vgem, i915);
 
 	igt_subtest("coherency-gtt")
 		test_gtt_interleaved(vgem, i915);
 
+	igt_describe("Examine blitter access path WC coherency");
+	igt_subtest("coherency-blt")
+		test_blt_interleaved(vgem, i915);
+
 	for (e = intel_execution_engines; e->name; e++) {
 		igt_subtest_f("%ssync-%s",
 			      e->exec_id == 0 ? "basic-" : "",
@@ -904,6 +1088,9 @@  igt_main
 			test_fence_read(i915, vgem);
 		igt_subtest("basic-fence-mmap")
 			test_fence_mmap(i915, vgem);
+		igt_describe("Examine blitter access path fencing");
+		igt_subtest("basic-fence-blt")
+			test_fence_blt(i915, vgem);
 
 		for (e = intel_execution_engines; e->name; e++) {
 			igt_subtest_f("%sfence-wait-%s",