diff mbox

[2/2] drm/i915/gtt: Mark tlbs dirty on clear

Message ID 1477927486-6549-2-git-send-email-mika.kuoppala@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Mika Kuoppala Oct. 31, 2016, 3:24 p.m. UTC
Now when clearing ptes can modify upper level pdp's,
we need to mark them dirty so that they will be flushed
correctly.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

Comments

Chris Wilson Oct. 31, 2016, 3:49 p.m. UTC | #1
On Mon, Oct 31, 2016 at 05:24:46PM +0200, Mika Kuoppala wrote:
> Now when clearing ptes can modify upper level pdp's,
> we need to mark them dirty so that they will be flushed
> correctly.

I suppose so. I think we could push this into the cleanup_px() (and
alloc) but that is probably a fair chunk of work for immeasurable gain.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Mika Kuoppala Oct. 31, 2016, 3:58 p.m. UTC | #2
Chris Wilson <chris@chris-wilson.co.uk> writes:

> On Mon, Oct 31, 2016 at 05:24:46PM +0200, Mika Kuoppala wrote:
>> Now when clearing ptes can modify upper level pdp's,
>> we need to mark them dirty so that they will be flushed
>> correctly.
>
> I suppose so.

It is a bit iffy if we really do, but this way we gain symmetry
with the alloc side.

I pondered why this is missing from 4l side, but in there the
topmost pointer is static.

-Mika

> I think we could push this into the cleanup_px() (and
> alloc) but that is probably a fair chunk of work for immeasurable gain.
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
Chris Wilson Oct. 31, 2016, 5:14 p.m. UTC | #3
On Mon, Oct 31, 2016 at 05:58:15PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > On Mon, Oct 31, 2016 at 05:24:46PM +0200, Mika Kuoppala wrote:
> >> Now when clearing ptes can modify upper level pdp's,
> >> we need to mark them dirty so that they will be flushed
> >> correctly.
> >
> > I suppose so.
> 
> It is a bit iffy if we really do, but this way we gain symmetry
> with the alloc side.

If we assume correct behaviour on the client, no. They shouldn't be
accessing any of the removed PTE, PDE, PDPE and so now follow the stale
TLB to the old pages (now reused by the system and containing garbage ->
GPU hang or carefully crafted redirection). However, on the same basis
we fill the client address space with scratch pages, we also should
prepare for random stray access which means we need to invalidate the
TLB -- however, given that this is undefined behaviour a ncurrently
executing batch may be accessing this page illegally and see the stale
value long before a newly submitted batch flushes the TLB.

So based on that we fix predictable TLB errors (such as the prefetcher
crossing the page boundary, if it still does!) for correctly behaving
batches.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index cda263c..cbca332 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -707,6 +707,16 @@  static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	return gen8_write_pdp(req, 0, px_dma(&ppgtt->pml4));
 }
 
+/* PDE TLBs are a pain to invalidate on GEN8+. When we modify
+ * the page table structures, we mark them dirty so that
+ * context switching/execlist queuing code takes extra steps
+ * to ensure that tlbs are flushed.
+ */
+static void mark_tlbs_dirty(struct i915_hw_ppgtt *ppgtt)
+{
+	ppgtt->pd_dirty_rings = INTEL_INFO(ppgtt->base.dev)->ring_mask;
+}
+
 /* Removes entries from a single page table, releasing it if it's empty.
  * Caller can use the return value to update higher-level entries.
  */
@@ -810,6 +820,8 @@  static bool gen8_ppgtt_clear_pdp(struct i915_address_space *vm,
 		}
 	}
 
+	mark_tlbs_dirty(ppgtt);
+
 	if (USES_FULL_48BIT_PPGTT(vm->dev) &&
 	    bitmap_empty(pdp->used_pdpes, I915_PDPES_PER_PDP(vm->dev))) {
 		free_pdp(vm->dev, pdp);
@@ -1284,16 +1296,6 @@  int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
 	return -ENOMEM;
 }
 
-/* PDE TLBs are a pain to invalidate on GEN8+. When we modify
- * the page table structures, we mark them dirty so that
- * context switching/execlist queuing code takes extra steps
- * to ensure that tlbs are flushed.
- */
-static void mark_tlbs_dirty(struct i915_hw_ppgtt *ppgtt)
-{
-	ppgtt->pd_dirty_rings = INTEL_INFO(ppgtt->base.dev)->ring_mask;
-}
-
 static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 				    struct i915_page_directory_pointer *pdp,
 				    uint64_t start,