[v2] igt/gem_workarounds: Test all types of workarounds

Message ID 1507745740-4192-1-git-send-email-oscar.mateo@intel.com (mailing list archive)
State New, archived

Commit Message

oscar.mateo@intel.com Oct. 11, 2017, 6:15 p.m. UTC
Apart from context based workarounds, we can now also test for global
MMIO and whitelisting ones.

Do take into account that this test does not guarantee that all known
WAs for a given platform are applied. It only checks that the WAs the
kernel does know about are correctly applied (e.g. they didn't get
lost on a GPU reset or a suspend/resume).

v2:
  - Do not wait for the GPU unnecessarily (Chris)
  - Make a comment that this test only looks for regressions (Chris)

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 tests/gem_workarounds.c | 185 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 147 insertions(+), 38 deletions(-)
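
The check described in the commit message comes down to a masked comparison: a workaround counts as applied when the bits selected by its mask read back with the expected value. A minimal sketch of that comparison follows. The `struct wa_reg` type and the `wa_applied()` helper are hypothetical names used only for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the three fields parsed from i915_wa_registers. */
struct wa_reg {
	uint32_t addr;   /* register offset */
	uint32_t value;  /* value the kernel says it programmed */
	uint32_t mask;   /* bits that belong to the workaround */
};

/*
 * Return true when the bits covered by the mask match between the
 * expected value and the value read back from the hardware. Bits
 * outside the mask are ignored on both sides.
 */
static bool wa_applied(const struct wa_reg *wa, uint32_t read_back)
{
	assert(wa);
	return (wa->value & wa->mask) == (read_back & wa->mask);
}
```

Because both sides are masked, unrelated bits in the same register do not cause a false failure.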

Comments

Chris Wilson Oct. 11, 2017, 6:19 p.m. UTC | #1
Quoting Oscar Mateo (2017-10-11 19:15:40)
> @@ -241,39 +350,34 @@ igt_main
>         }, *m;
>  
>         igt_fixture {
> +               struct pci_device *pci_dev;
>                 FILE *file;
> -               char *line = NULL;
> -               size_t line_size;
> -               int i, fd;
> +               int fd;
>  
>                 device = drm_open_driver(DRIVER_INTEL);
>                 igt_require_gem(device);
>  
>                 gen = intel_gen(intel_get_drm_devid(device));
>  
> +               pci_dev = intel_get_pci_device();
> +               igt_require(pci_dev);
> +
> +               intel_register_access_init(pci_dev, 0, device);

intel_register_access_init() takes i915-user-forcewake. Can we limit the
register access to just the mmio tests?
-Chris
Chris Wilson Oct. 11, 2017, 6:36 p.m. UTC | #2
Quoting Oscar Mateo (2017-10-11 19:15:40)
> Apart from context based workarounds, we can now also test for global
> MMIO and whitelisting ones.
> 
> Do take into account that this test does not guarantee that all known
> WAs for a given platform are applied. It only checks that the WAs the
> kernel does know about are correctly applied (e.g. they didn't get
> lost on a GPU reset or a suspend/resume).

Can we pass in a wa_regs.txt (manual control) instead of using
i915_wa_regs (regression testing)?

I want to record the wa_regs from the end of the series and confirm that
the registers have the same values at the beginning of the series. We
can't check if regs have disappeared in between, but that way we can do
a quick check they haven't changed value.
-Chris
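
A rough sketch of what loading such a recorded list could look like, assuming the file reuses the `0xADDR: 0xVALUE, mask: 0xMASK` line format that the test already parses from debugfs. The `load_wa_list()` helper and `struct wa_reg` are hypothetical and not part of IGT, and how the file path would reach the test is left open.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the fields the test keeps per workaround. */
struct wa_reg {
	uint32_t addr;
	uint32_t value;
	uint32_t mask;
};

/*
 * Read up to max entries of the form "0xADDR: 0xVALUE, mask: 0xMASK"
 * from an already-open stream. Lines that do not match (headers,
 * blank lines) are skipped. Returns the number of entries stored.
 */
static int load_wa_list(FILE *file, struct wa_reg *regs, int max)
{
	char line[128];
	int n = 0;

	assert(file && regs);

	while (n < max && fgets(line, sizeof(line), file)) {
		unsigned int addr, value, mask;

		if (sscanf(line, "0x%X: 0x%X, mask: 0x%X",
			   &addr, &value, &mask) != 3)
			continue;

		regs[n].addr = addr;
		regs[n].value = value;
		regs[n].mask = mask;
		n++;
	}

	return n;
}
```

The resulting array could then be checked with the same read-back comparison the test applies to the list taken from debugfs.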
oscar.mateo@intel.com Oct. 13, 2017, 8:32 p.m. UTC | #3
On 10/12/2017 05:40 AM, Petri Latvala wrote:
> On Wed, Oct 11, 2017 at 11:15:40AM -0700, Oscar Mateo wrote:
>> Apart from context based workarounds, we can now also test for global
>> MMIO and whitelisting ones.
>>
>> Do take into account that this test does not guarantee that all known
>> WAs for a given platform are applied. It only checks that the WAs the
>> kernel does know about are correctly applied (e.g. they didn't get
>> lost on a GPU reset or a suspend/resume).
>>
>> v2:
>>    - Do not wait for the GPU unnecessarily (Chris)
>>    - Make a comment that this test only looks for regressions (Chris)
>>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>
> Subject prefix didn't contain i-g-t so CI did not pick this up.
>

Thanks for the heads up, Petri. I forgot to add the label, but it 
doesn't matter that much because this patch is going to fail CI until 
the companion patches make it to the KMD (is there a best known method 
for dealing with this situation?).

Thanks,
Oscar
oscar.mateo@intel.com Oct. 30, 2017, 8:02 p.m. UTC | #4
On 10/16/2017 02:05 AM, Petri Latvala wrote:
> On Fri, Oct 13, 2017 at 01:32:58PM -0700, Oscar Mateo wrote:
>>
>> On 10/12/2017 05:40 AM, Petri Latvala wrote:
>>> On Wed, Oct 11, 2017 at 11:15:40AM -0700, Oscar Mateo wrote:
>>>> Apart from context based workarounds, we can now also test for global
>>>> MMIO and whitelisting ones.
>>>>
>>>> Do take into account that this test does not guarantee that all known
>>>> WAs for a given platform are applied. It only checks that the WAs the
>>>> kernel does know about are correctly applied (e.g. they didn't get
>>>> lost on a GPU reset or a suspend/resume).
>>>>
>>>> v2:
>>>>     - Do not wait for the GPU unnecessarily (Chris)
>>>>     - Make a comment that this test only looks for regressions (Chris)
>>>>
>>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> Subject prefix didn't contain i-g-t so CI did not pick this up.
>>>
>> Thanks for the heads up, Petri. I forgot to add the label, but it doesn't
>> matter that much because this patch is going to fail CI until the companion
>> patches make it to the KMD (is there a BKM to deal with this situation?).
> That still tests if it builds in the CI environment, whether it
> affects other tests, and shows how the failure behaves.
>
> How is the test supposed to behave on older kernels, like stable
> releases (4.4.x, 4.12.x, etc)? Is "fail" proper for kernels without
> the KMD changes required, instead of producing a skip? If that is the
> case, the best story is to keep sending the patch towards CI, collect
> reviews, possibly resend for one more CI run after kernel changes have
> landed to see lights turn green, land tests. If the test is behaving
> well with "broken" kernels it can even be landed before the kernel
> changes, if the used interfaces already exist in kernel upstream.

Got it. Thank you, Petri. I can modify the test so that it keeps working 
with older kernels, which looks like the best possible approach.
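
One way the header parsing might tolerate both kernel generations is sketched below. It assumes older kernels emit a single `Workarounds applied: N` header (the format the pre-patch test parsed) and newer ones emit `<preamble> workarounds applied: N` (the format this patch parses). The `parse_wa_header()` helper is hypothetical, and the exact debugfs output depends on the companion kernel patches.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the workaround count from a header line.
 * Accepts the older "Workarounds applied: N" form as well as the newer
 * "<preamble> workarounds applied: N" form. Returns -1 when the line
 * matches neither, so the caller can skip instead of failing.
 */
static int parse_wa_header(const char *line, const char *preamble)
{
	size_t len;
	int count;

	assert(line && preamble);

	/* Older format: one list, no preamble. */
	if (sscanf(line, "Workarounds applied: %d", &count) == 1)
		return count;

	/* Newer format: the preamble must match before the count is read. */
	len = strlen(preamble);
	if (strncmp(line, preamble, len) == 0 &&
	    sscanf(line + len, " workarounds applied: %d", &count) == 1)
		return count;

	return -1;
}
```

Returning -1 rather than asserting leaves the skip-or-fail decision for unknown formats to the caller, for example via `igt_require()`.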

Patch

diff --git a/tests/gem_workarounds.c b/tests/gem_workarounds.c
index 7b99961..69174d9 100644
--- a/tests/gem_workarounds.c
+++ b/tests/gem_workarounds.c
@@ -28,6 +28,7 @@ 
 #include "igt.h"
 
 #include <fcntl.h>
+#include <ctype.h>
 
 #define PAGE_SIZE 4096
 #define PAGE_ALIGN(x) ALIGN(x, PAGE_SIZE)
@@ -62,8 +63,14 @@  static struct write_only_list {
 	 */
 };
 
-static struct intel_wa_reg *wa_regs;
-static int num_wa_regs;
+static struct intel_wa_reg *ctx_wa_regs;
+static int num_ctx_wa_regs;
+
+static struct intel_wa_reg *mmio_wa_regs;
+static int num_mmio_wa_regs;
+
+static struct intel_wa_reg *whitelist_wa_regs;
+static int num_whitelist_wa_regs;
 
 static bool write_only(const uint32_t addr)
 {
@@ -82,7 +89,7 @@  static bool write_only(const uint32_t addr)
 
 #define MI_STORE_REGISTER_MEM (0x24 << 23)
 
-static int workaround_fail_count(int fd, uint32_t ctx)
+static int ctx_workarounds_fail_count(int fd, uint32_t ctx)
 {
 	struct drm_i915_gem_exec_object2 obj[2];
 	struct drm_i915_gem_relocation_entry *reloc;
@@ -91,13 +98,16 @@  static int workaround_fail_count(int fd, uint32_t ctx)
 	uint32_t *base, *out;
 	int fail_count = 0;
 
-	reloc = calloc(num_wa_regs, sizeof(*reloc));
+	if (!num_ctx_wa_regs)
+		return 0;
+
+	reloc = calloc(num_ctx_wa_regs, sizeof(*reloc));
 	igt_assert(reloc);
 
-	result_sz = 4 * num_wa_regs;
+	result_sz = 4 * num_ctx_wa_regs;
 	result_sz = PAGE_ALIGN(result_sz);
 
-	batch_sz = 16 * num_wa_regs + 4;
+	batch_sz = 16 * num_ctx_wa_regs + 4;
 	batch_sz = PAGE_ALIGN(batch_sz);
 
 	memset(obj, 0, sizeof(obj));
@@ -105,12 +115,12 @@  static int workaround_fail_count(int fd, uint32_t ctx)
 	gem_set_caching(fd, obj[0].handle, I915_CACHING_CACHED);
 	obj[1].handle = gem_create(fd, batch_sz);
 	obj[1].relocs_ptr = to_user_pointer(reloc);
-	obj[1].relocation_count = num_wa_regs;
+	obj[1].relocation_count = num_ctx_wa_regs;
 
 	out = base = gem_mmap__cpu(fd, obj[1].handle, 0, batch_sz, PROT_WRITE);
-	for (int i = 0; i < num_wa_regs; i++) {
+	for (int i = 0; i < num_ctx_wa_regs; i++) {
 		*out++ = MI_STORE_REGISTER_MEM | ((gen >= 8 ? 4 : 2) - 2);
-		*out++ = wa_regs[i].addr;
+		*out++ = ctx_wa_regs[i].addr;
 		reloc[i].target_handle = obj[0].handle;
 		reloc[i].offset = (out - base) * sizeof(*out);
 		reloc[i].delta = i * sizeof(uint32_t);
@@ -134,20 +144,20 @@  static int workaround_fail_count(int fd, uint32_t ctx)
 	igt_debug("Address\tval\t\tmask\t\tread\t\tresult\n");
 
 	out = gem_mmap__cpu(fd, obj[0].handle, 0, result_sz, PROT_READ);
-	for (int i = 0; i < num_wa_regs; i++) {
+	for (int i = 0; i < num_ctx_wa_regs; i++) {
 		const bool ok =
-			(wa_regs[i].value & wa_regs[i].mask) ==
-			(out[i] & wa_regs[i].mask);
+			(ctx_wa_regs[i].value & ctx_wa_regs[i].mask) ==
+			(out[i] & ctx_wa_regs[i].mask);
 		char buf[80];
 
 		snprintf(buf, sizeof(buf),
 			 "0x%05X\t0x%08X\t0x%08X\t0x%08X",
-			 wa_regs[i].addr, wa_regs[i].value, wa_regs[i].mask,
+			 ctx_wa_regs[i].addr, ctx_wa_regs[i].value, ctx_wa_regs[i].mask,
 			 out[i]);
 
 		if (ok) {
 			igt_debug("%s\tOK\n", buf);
-		} else if (write_only(wa_regs[i].addr)) {
+		} else if (write_only(ctx_wa_regs[i].addr)) {
 			igt_debug("%s\tIGNORED (w/o)\n", buf);
 		} else {
 			igt_warn("%s\tFAIL\n", buf);
@@ -163,6 +173,49 @@  static int workaround_fail_count(int fd, uint32_t ctx)
 	return fail_count;
 }
 
+static int mmio_workarounds_fail_count(struct intel_wa_reg *wa_regs, int num_wa_regs)
+{
+	int i, fail_count = 0;
+
+	if (!num_wa_regs)
+		return 0;
+
+	igt_debug("Address\tval\t\tmask\t\tread\t\tresult\n");
+
+	for (i = 0; i < num_wa_regs; ++i) {
+		const uint32_t val = intel_register_read(wa_regs[i].addr);
+		const bool ok = (wa_regs[i].value & wa_regs[i].mask) ==
+			(val & wa_regs[i].mask);
+
+		igt_debug("0x%05X\t0x%08X\t0x%08X\t0x%08X\t%s\n",
+			  wa_regs[i].addr, wa_regs[i].value,
+			  wa_regs[i].mask, val, ok ? "OK" : "FAIL");
+
+		if (write_only(wa_regs[i].addr))
+			continue;
+
+		if (!ok) {
+			igt_warn("0x%05X\t0x%08X\t0x%08X\t0x%08X\t%s\n",
+				 wa_regs[i].addr, wa_regs[i].value,
+				 wa_regs[i].mask, val, ok ? "OK" : "FAIL");
+			fail_count++;
+		}
+	}
+
+	return fail_count;
+}
+
+static int workarounds_fail_count(int fd, uint32_t ctx)
+{
+	int fail_count = 0;
+
+	fail_count += ctx_workarounds_fail_count(fd, ctx);
+	fail_count += mmio_workarounds_fail_count(mmio_wa_regs, num_mmio_wa_regs);
+	fail_count += mmio_workarounds_fail_count(whitelist_wa_regs, num_whitelist_wa_regs);
+
+	return fail_count;
+}
+
 static int reopen(int fd)
 {
 	char path[256];
@@ -185,7 +238,7 @@  static void check_workarounds(int fd, enum operation op, unsigned int flags)
 	if (flags & CONTEXT)
 		ctx = gem_context_create(fd);
 
-	igt_assert_eq(workaround_fail_count(fd, ctx), 0);
+	igt_assert_eq(workarounds_fail_count(fd, ctx), 0);
 
 	switch (op) {
 	case GPU_RESET:
@@ -209,7 +262,7 @@  static void check_workarounds(int fd, enum operation op, unsigned int flags)
 		igt_assert(0);
 	}
 
-	igt_assert_eq(workaround_fail_count(fd, ctx), 0);
+	igt_assert_eq(workarounds_fail_count(fd, ctx), 0);
 
 	if (flags & CONTEXT)
 		gem_context_destroy(fd, ctx);
@@ -217,6 +270,62 @@  static void check_workarounds(int fd, enum operation op, unsigned int flags)
 		close(fd);
 }
 
+static bool is_empty(const char *s)
+{
+	while (*s != '\0') {
+		if (!isspace(*s))
+			return false;
+		s++;
+	}
+
+	return true;
+}
+
+static char *skip_preamble(char *s, const char *preamble)
+{
+	while (*preamble != '\0' && *s == *preamble) {
+		s++;
+		preamble++;
+	}
+
+	return s;
+}
+
+static void read_workarounds(FILE *file, const char *preamble, struct intel_wa_reg **regs, int *num)
+{
+	char *header, *line = NULL;
+	size_t line_size;
+	struct intel_wa_reg *wa_regs;
+	int num_wa_regs = 0;
+	int i = 0;
+
+	igt_assert(getline(&line, &line_size, file) > 0);
+	igt_debug("i915_wa_registers: %s", line);
+	header = skip_preamble(line, preamble);
+	sscanf(header, " workarounds applied: %d", &num_wa_regs);
+
+	wa_regs = malloc(num_wa_regs * sizeof(*wa_regs));
+
+	while (getline(&line, &line_size, file) > 0) {
+		if (is_empty(line))
+			break;
+
+		igt_debug("%s", line);
+		if (i < num_wa_regs &&
+		    sscanf(line, "0x%X: 0x%08X, mask: 0x%08X",
+			   &wa_regs[i].addr, &wa_regs[i].value,
+			   &wa_regs[i].mask) == 3)
+			i++;
+	}
+
+	igt_assert_lte(i, num_wa_regs);
+
+	*regs = wa_regs;
+	*num = num_wa_regs;
+
+	free(line);
+}
+
 igt_main
 {
 	int device = -1;
@@ -241,39 +350,34 @@  igt_main
 	}, *m;
 
 	igt_fixture {
+		struct pci_device *pci_dev;
 		FILE *file;
-		char *line = NULL;
-		size_t line_size;
-		int i, fd;
+		int fd;
 
 		device = drm_open_driver(DRIVER_INTEL);
 		igt_require_gem(device);
 
 		gen = intel_gen(intel_get_drm_devid(device));
 
+		pci_dev = intel_get_pci_device();
+		igt_require(pci_dev);
+
+		intel_register_access_init(pci_dev, 0, device);
+
 		fd = igt_debugfs_open(device, "i915_wa_registers", O_RDONLY);
 		file = fdopen(fd, "r");
-		igt_assert(getline(&line, &line_size, file) > 0);
-		igt_debug("i915_wa_registers: %s", line);
-		sscanf(line, "Workarounds applied: %d", &num_wa_regs);
-		igt_require(num_wa_regs > 0);
-
-		wa_regs = malloc(num_wa_regs * sizeof(*wa_regs));
-		igt_assert(wa_regs);
-
-		i = 0;
-		while (getline(&line, &line_size, file) > 0) {
-			igt_debug("%s", line);
-			if (sscanf(line, "0x%X: 0x%08X, mask: 0x%08X",
-				   &wa_regs[i].addr,
-				   &wa_regs[i].value,
-				   &wa_regs[i].mask) == 3)
-				i++;
-		}
 
-		igt_assert_lte(i, num_wa_regs);
+		/*
+		 * This test relies on the list of workarounds the kernel says
+		 * have been applied and it only checks that those are (indeed)
+		 * correctly applied. It does not report whether the system has
+		 * applied all known workarounds for a given platform.
+		 */
+		read_workarounds(file, "Context", &ctx_wa_regs, &num_ctx_wa_regs);
+		read_workarounds(file, "MMIO", &mmio_wa_regs, &num_mmio_wa_regs);
+		read_workarounds(file, "Whitelist", &whitelist_wa_regs, &num_whitelist_wa_regs);
+		igt_require(num_ctx_wa_regs + num_mmio_wa_regs + num_whitelist_wa_regs > 0);
 
-		free(line);
 		fclose(file);
 		close(fd);
 	}
@@ -284,4 +388,9 @@  igt_main
 				check_workarounds(device, op->op, m->flags);
 		}
 	}
+
+	igt_fixture {
+		free(ctx_wa_regs);
+		free(mmio_wa_regs);
+	}
 }