
drm/i915: add guard page to ggtt->error_capture

Message ID 20230201153118.2654177-1-andrzej.hajda@intel.com (mailing list archive)
State New, archived
Series drm/i915: add guard page to ggtt->error_capture

Commit Message

Andrzej Hajda Feb. 1, 2023, 3:31 p.m. UTC
Some platforms (ADL, RPL, DG2), during a CPU read of mapped GTT pages,
try to prefetch data beyond the page table boundary. If the next PTE
in the GTT points to an invalid address, this can cause DMAR errors.
Adding a guard PTE pointing to the scratch page solves the issue in
the case of ggtt->error_capture.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
---
This patch tries to diminish the plague of DMAR errors present in CI
for ADL*, RPL*, and DG2 platforms; see for example [1] (grep for DMAR).
This is an upstreamed internal patch, with a slightly modified commit
message.

[1]: http://gfx-ci.igk.intel.com/tree/drm-tip/CI_DRM_12678/bat-adln-1/dmesg0.txt
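
For reference, a rough sketch of the layout this change sets up. It is
illustrative only, using just the names the patch below introduces; the
real change lives in init_ggtt():

	/*
	 * The error_capture node now spans two GTT pages: the first is
	 * used for the actual capture mapping, every following page is
	 * backed by the scratch page, so a CPU prefetch running past
	 * the capture page still hits a valid PTE.
	 *
	 *   error_capture.start     +PAGE_SIZE           +2*PAGE_SIZE
	 *        |   capture page    |   guard -> scratch   |
	 */
	u64 start = ggtt->error_capture.start;
	u64 end = start + ggtt->error_capture.size; /* 2 * I915_GTT_PAGE_SIZE */
	u64 offset;

	for (offset = start + I915_GTT_PAGE_SIZE; offset < end;
	     offset += I915_GTT_PAGE_SIZE)
		ggtt_insert_scratch_page(ggtt, offset); /* PTE -> vm->scratch[0] */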

Regards
Andrzej
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 842e69c7b21e49..bdc1636375374a 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -503,6 +503,13 @@  static void cleanup_init_ggtt(struct i915_ggtt *ggtt)
 	mutex_destroy(&ggtt->error_mutex);
 }
 
+static void ggtt_insert_scratch_page(struct i915_ggtt *ggtt, u64 offset)
+{
+	struct i915_address_space *vm = &ggtt->vm;
+
+	vm->insert_page(vm, px_dma(vm->scratch[0]), offset, I915_CACHE_NONE, 0);
+}
+
 static int init_ggtt(struct i915_ggtt *ggtt)
 {
 	/*
@@ -552,7 +559,7 @@  static int init_ggtt(struct i915_ggtt *ggtt)
 		 * the only likely reason for failure to insert is a driver
 		 * bug, which we expect to cause other failures...
 		 */
-		ggtt->error_capture.size = I915_GTT_PAGE_SIZE;
+		ggtt->error_capture.size = 2 * I915_GTT_PAGE_SIZE;
 		ggtt->error_capture.color = I915_COLOR_UNEVICTABLE;
 		if (drm_mm_reserve_node(&ggtt->vm.mm, &ggtt->error_capture))
 			drm_mm_insert_node_in_range(&ggtt->vm.mm,
@@ -562,11 +569,21 @@  static int init_ggtt(struct i915_ggtt *ggtt)
 						    0, ggtt->mappable_end,
 						    DRM_MM_INSERT_LOW);
 	}
-	if (drm_mm_node_allocated(&ggtt->error_capture))
+	if (drm_mm_node_allocated(&ggtt->error_capture)) {
+		u64 start = ggtt->error_capture.start;
+		u64 end = ggtt->error_capture.start + ggtt->error_capture.size;
+		u64 i;
+
+		/*
+		 * During error capture, memcpying from the GGTT is triggering a
+		 * prefetch of the following PTE, so fill it with a guard page.
+		 */
+		for (i = start + I915_GTT_PAGE_SIZE; i < end; i += I915_GTT_PAGE_SIZE)
+			ggtt_insert_scratch_page(ggtt, i);
 		drm_dbg(&ggtt->vm.i915->drm,
 			"Reserved GGTT:[%llx, %llx] for use by error capture\n",
-			ggtt->error_capture.start,
-			ggtt->error_capture.start + ggtt->error_capture.size);
+			start, end);
+	}
 
 	/*
 	 * The upper portion of the GuC address space has a sizeable hole