drm/i915: Flush extra hard after writing relocations through the GTT

Message ID 20190718195650.20635-1-chris@chris-wilson.co.uk
State New

Commit Message

Chris Wilson July 18, 2019, 7:56 p.m. UTC
Commit bdae33b8b82b ("drm/i915: Use maximum write flush for pwrite_gtt")
recently showed that we need to use our full write barrier before
changing the GGTT PTE, to ensure that our indirect writes through the
GTT land before the PTE changes (otherwise the writes end up in a
different page). That also applies to our GGTT relocation path.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
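
The ordering hazard described in the commit message can be illustrated with
a small userspace toy model. This is only a sketch under stated assumptions:
every name below (toy_aperture, toy_write, toy_flush, toy_set_pte,
toy_relocate) is hypothetical and not part of i915, and the "posted write
queue" is a deliberately simplified stand-in for writes that have been
issued through the GTT but have not yet landed in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: none of these names exist in i915. */
enum { NPAGES = 2, PAGE_WORDS = 4, QUEUE_DEPTH = 8 };

struct posted_write {
	unsigned int word;
	uint32_t value;
};

struct toy_aperture {
	uint32_t backing[NPAGES][PAGE_WORDS];	/* pages the "PTE" can point at */
	unsigned int pte;			/* index of the currently mapped page */
	struct posted_write queue[QUEUE_DEPTH];	/* writes still in flight */
	unsigned int pending;
};

/* A write through the aperture is posted: queued, not yet in memory. */
static void toy_write(struct toy_aperture *ap, unsigned int word,
		      uint32_t value)
{
	assert(ap->pending < QUEUE_DEPTH && word < PAGE_WORDS);
	ap->queue[ap->pending++] = (struct posted_write){ word, value };
}

/* Drain the queue: each write lands in whatever page the PTE names now. */
static void toy_flush(struct toy_aperture *ap)
{
	for (unsigned int i = 0; i < ap->pending; i++)
		ap->backing[ap->pte][ap->queue[i].word] = ap->queue[i].value;
	ap->pending = 0;
}

/* Repoint the aperture at a different backing page. */
static void toy_set_pte(struct toy_aperture *ap, unsigned int page)
{
	assert(page < NPAGES);
	ap->pte = page;
}

/*
 * Write one "relocation" intended for page 0, then switch the PTE to
 * page 1. Returns the value that ends up in page 0, word 1.
 */
uint32_t toy_relocate(int flush_before_pte_change)
{
	struct toy_aperture ap;

	memset(&ap, 0, sizeof(ap));

	toy_write(&ap, 1, 0xdeadbeef);	/* relocation meant for page 0 */
	if (flush_before_pte_change)
		toy_flush(&ap);		/* the ordering this patch enforces */
	toy_set_pte(&ap, 1);
	toy_flush(&ap);			/* any late write now hits page 1 */

	return ap.backing[0][1];
}
```

In this model, toy_relocate(1) leaves the relocation value in page 0,
while toy_relocate(0) leaves page 0 untouched because the pending write
drains into page 1 after the PTE has changed. The real hardware behaviour
is more involved; the model only mirrors the ordering argument above.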

Comments

Sasha Levin July 19, 2019, 12:45 a.m. UTC | #1
Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.2.1, v5.1.18, v4.19.59, v4.14.133, v4.9.185, v4.4.185.

v5.2.1: Failed to apply! Possible dependencies:
    Unable to calculate

v5.1.18: Failed to apply! Possible dependencies:
    Unable to calculate

v4.19.59: Failed to apply! Possible dependencies:
    Unable to calculate

v4.14.133: Failed to apply! Possible dependencies:
    3bd4073524fa ("drm/i915: Consolidate get_fence with pin_fence")
    465c403cb508 ("drm/i915: introduce simple gemfs")
    66df1014efba ("drm/i915: Keep a small stash of preallocated WC pages")
    7393b7ee3a9c ("drm/i915/debugfs: include some gtt page size metrics")
    73ebd503034c ("drm/i915: make mappable struct resource centric")
    7789422665f5 ("drm/i915: make dsm struct resource centric")
    82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member")
    969b0950a188 ("drm/i915: Add interface to reserve fence registers for vGPU")
    a65adaf8a834 ("drm/i915: Track user GTT faulting per-vma")
    b1ace60107e6 ("drm/i915: give stolen_usable_size a more suitable home")
    b7128ef125b4 ("drm/i915: prefer resource_size_t for everything stolen")
    da1dd0dbe024 ("drm/i915: Make the report about a bogus stolen reserved area an error")
    db7fb60593e4 ("drm/i915: Check if the stolen memory "reserved" area is enabled or not")
    e91ef99b9543 ("drm/i915/selftests: Remember to create the fake preempt context")
    f773568b6ff8 ("drm/i915: nuke the duplicated stolen discovery")

v4.9.185: Failed to apply! Possible dependencies:
    04d348ae3f0a ("drm/i915/gvt: vGPU display virtualization")
    12d14cc43b34 ("drm/i915/gvt: Introduce a framework for tracking HW registers.")
    28a60dee2ce6 ("drm/i915/gvt: vGPU HW resource management")
    3f728236c516 ("drm/i915/gvt: trace stub")
    4c7d62c6b8a2 ("drm/i915: Markup GEM API with lockdep asserts")
    4d60c5fd3f87 ("drm/i915/gvt: vGPU PCI configuration space virtualization")
    579cea5f30f2 ("drm/i915/gvt: golden virtual HW state management")
    650bc63568e4 ("drm/i915: Amalgamate execbuffer parameter structures")
    718659a63054 ("drm/i915: Rename some warts in the VMA API")
    82d375d1b568 ("drm/i915/gvt: Introduce basic vGPU life cycle management")
    8453d674ae7e ("drm/i915/gvt: vGPU execlist virtualization")
    c8fe6a6811a7 ("drm/i915/gvt: vGPU interrupt virtualization.")
    e39c5add3221 ("drm/i915/gvt: vGPU MMIO virtualization")
    e473405783c0 ("drm/i915/gvt: vGPU workload scheduler")
    e95433c73a11 ("drm/i915: Rearrange i915_wait_request() accounting with callers")

v4.4.185: Failed to apply! Possible dependencies:
    033908aed5a5 ("drm/i915: mark GEM object pages dirty when mapped & written by the CPU")
    058d88c4330f ("drm/i915: Track pinned VMA")
    09cfcb456941 ("drm/i915: Split out load time HW initialization")
    0a9d2bed5557 ("drm/i915/skl: Making DC6 entry is the last call in suspend flow.")
    188c1ab7769d ("drm/i915: Add struct_mutex locking for debugs/i915_gem_framebuffer")
    1f814daca43a ("drm/i915: add support for checking if we hold an RPM reference")
    31a39207f04a ("drm/i915: Cache kmap between relocations")
    399bb5b6db02 ("drm/i915: Move allocation of various workqueues earlier during init")
    414b7999b8be ("drm/i915/gen9: Remove csr.state, csr_lock and related code.")
    506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer")
    62106b4f6b91 ("drm/i915: Rename dev_priv->gtt to dev_priv->ggtt")
    72e96d6450c0 ("drm/i915: Refer to GGTT {,VM} consistently")
    73dfc227ff5c ("drm/i915/skl: init/uninit display core as part of the HW power domain state")
    8da32727ac0e ("drm/i915: Remove i915_gem_obj_size")
    934acce3c069 ("drm/i915: Avoid writing relocs with addresses in non-canonical form")
    9e2793f6e4e2 ("drm/i915: compile-time consistency check on __EXEC_OBJECT flags")
    ad5c3d3ffbb2 ("drm/i915: Move MCHBAR setup earlier during init")
    bc87229f323e ("drm/i915/skl: enable PC9/10 power states during suspend-to-idle")
    be12a86b46e8 ("drm/i915: Show pin mapped status in describe_obj")
    d50415cc6c83 ("drm/i915: Refactor execbuffer relocation writing")
    f514c2d84285 ("drm/i915/gen9: flush DMC fw loading work during system suspend")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

--
Thanks,
Sasha

Patch

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 8a2047c4e7c3..01901dad33f7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1019,11 +1019,12 @@  static void reloc_cache_reset(struct reloc_cache *cache)
 		kunmap_atomic(vaddr);
 		i915_gem_object_finish_access((struct drm_i915_gem_object *)cache->node.mm);
 	} else {
-		wmb();
+		struct i915_ggtt *ggtt = cache_to_ggtt(cache);
+
+		intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 		io_mapping_unmap_atomic((void __iomem *)vaddr);
-		if (cache->node.allocated) {
-			struct i915_ggtt *ggtt = cache_to_ggtt(cache);
 
+		if (cache->node.allocated) {
 			ggtt->vm.clear_range(&ggtt->vm,
 					     cache->node.start,
 					     cache->node.size);
@@ -1078,6 +1079,7 @@  static void *reloc_iomap(struct drm_i915_gem_object *obj,
 	void *vaddr;
 
 	if (cache->vaddr) {
+		intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 		io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr));
 	} else {
 		struct i915_vma *vma;
@@ -1119,7 +1121,6 @@  static void *reloc_iomap(struct drm_i915_gem_object *obj,
 
 	offset = cache->node.start;
 	if (cache->node.allocated) {
-		wmb();
 		ggtt->vm.insert_page(&ggtt->vm,
 				     i915_gem_object_get_dma_address(obj, page),
 				     offset, I915_CACHE_NONE, 0);