[2/4] drm/i915: Use maximum write flush for pwrite_gtt
diff mbox series

Message ID 20190718145407.21352-2-chris@chris-wilson.co.uk
State New
Headers show
Series
  • [1/4] drm/i915: Drop wmb() inside pread_gtt
Related show

Commit Message

Chris Wilson July 18, 2019, 2:54 p.m. UTC
As recently disovered by forcing big-core (!llc) machines to use the GTT
paths, we need our full GTT write flush before manipulating the GTT PTE
or else the writes may be directed to the wrong page.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Ville Syrjälä July 18, 2019, 6:28 p.m. UTC | #1
On Thu, Jul 18, 2019 at 03:54:05PM +0100, Chris Wilson wrote:
> As recently disovered by forcing big-core (!llc) machines to use the GTT
> paths, we need our full GTT write flush before manipulating the GTT PTE
> or else the writes may be directed to the wrong page.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matthew Auld <matthew.william.auld@gmail.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fed0bc421a55..c6ba350e6e4f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -610,7 +610,8 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>  		unsigned int page_length = PAGE_SIZE - page_offset;
>  		page_length = remain < page_length ? remain : page_length;
>  		if (node.allocated) {
> -			wmb(); /* flush the write before we modify the GGTT */
> +			/* flush the write before we modify the GGTT */
> +			intel_gt_flush_ggtt_writes(ggtt->vm.gt);

Matches the story told by intel_gt_flush_ggtt_writes().

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

>  			ggtt->vm.insert_page(&ggtt->vm,
>  					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
>  					     node.start, I915_CACHE_NONE, 0);
> @@ -639,8 +640,8 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>  	i915_gem_object_unlock_fence(obj, fence);
>  out_unpin:
>  	mutex_lock(&i915->drm.struct_mutex);
> +	intel_gt_flush_ggtt_writes(ggtt->vm.gt);
>  	if (node.allocated) {
> -		wmb();
>  		ggtt->vm.clear_range(&ggtt->vm, node.start, node.size);
>  		remove_mappable_node(&node);
>  	} else {
> -- 
> 2.22.0
Chris Wilson July 18, 2019, 7:19 p.m. UTC | #2
Quoting Ville Syrjälä (2019-07-18 19:28:43)
> On Thu, Jul 18, 2019 at 03:54:05PM +0100, Chris Wilson wrote:
> > As recently disovered by forcing big-core (!llc) machines to use the GTT
> > paths, we need our full GTT write flush before manipulating the GTT PTE
> > or else the writes may be directed to the wrong page.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Matthew Auld <matthew.william.auld@gmail.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index fed0bc421a55..c6ba350e6e4f 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -610,7 +610,8 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
> >               unsigned int page_length = PAGE_SIZE - page_offset;
> >               page_length = remain < page_length ? remain : page_length;
> >               if (node.allocated) {
> > -                     wmb(); /* flush the write before we modify the GGTT */
> > +                     /* flush the write before we modify the GGTT */
> > +                     intel_gt_flush_ggtt_writes(ggtt->vm.gt);
> 
> Matches the story told by intel_gt_flush_ggtt_writes().
> 
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Ta, pushed to dinq. Hopefully, this may explain some mystery fails!
(Not that any sane userspace does for(;;) { gem_write(); gem_read(); })
-Chris
Sasha Levin July 19, 2019, 12:45 a.m. UTC | #3
Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.2.1, v5.1.18, v4.19.59, v4.14.133, v4.9.185, v4.4.185.

v5.2.1: Failed to apply! Possible dependencies:
    09407579abf5 ("drm/i915: Store the default sseu setup on the engine")
    10be98a77c55 ("drm/i915: Move more GEM objects under gem/")
    112ed2d31a46 ("drm/i915: Move GraphicsTechnology files under gt/")
    5e5d2e209e08 ("drm/i915: Split GEM object type definition to its own header")
    6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local")
    98932149aeb9 ("drm/i915: Move object->pages API to i915_gem_object.[ch]")

v5.1.18: Failed to apply! Possible dependencies:
    10be98a77c55 ("drm/i915: Move more GEM objects under gem/")
    112ed2d31a46 ("drm/i915: Move GraphicsTechnology files under gt/")
    13f1bfd3b332 ("drm/i915: Make object/vma allocation caches global")
    2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset")
    32eb6bcfdda9 ("drm/i915: Make request allocation caches global")
    39e2f501c1b4 ("drm/i915: Split struct intel_context definition to its own header")
    5e5d2e209e08 ("drm/i915: Split GEM object type definition to its own header")
    6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local")
    7ae1940014ef ("drm/i915: Defer removing fence register tracking to rpm wakeup")
    7e3d9a59410d ("drm/i915: Track active engines within a context")
    7f4127c4839b ("drm/i915: Use time based guilty context banning")
    98932149aeb9 ("drm/i915: Move object->pages API to i915_gem_object.[ch]")
    ba4fda620a5f ("drm/i915: Optionally disable automatic recovery after a GPU reset")
    c2400ec3b6d1 ("drm/i915: add Makefile magic for testing headers are self-contained")

v4.19.59: Failed to apply! Possible dependencies:
    0e39037b3165 ("drm/i915: Cache the error string")
    10be98a77c55 ("drm/i915: Move more GEM objects under gem/")
    112ed2d31a46 ("drm/i915: Move GraphicsTechnology files under gt/")
    16e4dd0342a8 ("drm/i915: Markup paired operations on wakerefs")
    39e2f501c1b4 ("drm/i915: Split struct intel_context definition to its own header")
    52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
    538ef96b9dae ("drm/i915/gem: Track the rpm wakerefs")
    5e5d2e209e08 ("drm/i915: Split GEM object type definition to its own header")
    6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local")
    6b048706f407 ("drm/i915: Forcibly flush unwanted requests in drop-caches")
    87f1ef225242 ("drm/i915: Record the sseu configuration per-context & engine")
    95fd94a645f7 ("drm/i915: avoid rebuilding i915_gpu_error.o on version string updates")
    98932149aeb9 ("drm/i915: Move object->pages API to i915_gem_object.[ch]")
    c0a6aa7ec2c3 ("drm/i915: Show actual alongside requested frequency in debugfs/i915_rps_boost_info")
    c2400ec3b6d1 ("drm/i915: add Makefile magic for testing headers are self-contained")
    c44301fce614 ("drm/i915: Allow control of PSR at runtime through debugfs, v6")
    cab870b7fdf3 ("drm/i915/ilk: Fix warning when reading emon_status with no output")
    e6154e4cb8b0 ("drm/i915: Skip the ERR_PTR error state")
    eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex")
    fb6f0b64e455 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture")

v4.14.133: Failed to apply! Possible dependencies:
    3bd4073524fa ("drm/i915: Consolidate get_fence with pin_fence")
    465c403cb508 ("drm/i915: introduce simple gemfs")
    66df1014efba ("drm/i915: Keep a small stash of preallocated WC pages")
    7393b7ee3a9c ("drm/i915/debugfs: include some gtt page size metrics")
    73ebd503034c ("drm/i915: make mappable struct resource centric")
    7789422665f5 ("drm/i915: make dsm struct resource centric")
    82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member")
    969b0950a188 ("drm/i915: Add interface to reserve fence registers for vGPU")
    a65adaf8a834 ("drm/i915: Track user GTT faulting per-vma")
    b1ace60107e6 ("drm/i915: give stolen_usable_size a more suitable home")
    b7128ef125b4 ("drm/i915: prefer resource_size_t for everything stolen")
    da1dd0dbe024 ("drm/i915: Make the report about a bogus stolen reserved area an error")
    db7fb60593e4 ("drm/i915: Check if the stolen memory "reserved" area is enabled or not")
    e91ef99b9543 ("drm/i915/selftests: Remember to create the fake preempt context")
    f773568b6ff8 ("drm/i915: nuke the duplicated stolen discovery")

v4.9.185: Failed to apply! Possible dependencies:
    0e70447605f4 ("drm/i915: Move common code out of i915_gpu_error.c")
    1b36595ffb35 ("drm/i915: Show RING registers through debugfs")
    28a60dee2ce6 ("drm/i915/gvt: vGPU HW resource management")
    3b3f1650b1ca ("drm/i915: Allocate intel_engine_cs structure only for the enabled engines")
    82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member")
    85fd4f58d7ef ("drm/i915: Mark all non-vma being inserted into the address spaces")
    9c870d03674f ("drm/i915: Use RPM as the barrier for controlling user mmap access")
    bb6dc8d96b68 ("drm/i915: Implement pread without struct-mutex")
    d636951ec01b ("drm/i915: Cleanup instdone collection")
    e007b19d7ba7 ("drm/i915: Use the MRU stack search after evicting")
    f9e613728090 ("drm/i915: Try to print INSTDONE bits for all slice/subslice")

v4.4.185: Failed to apply! Possible dependencies:
    09cfcb456941 ("drm/i915: Split out load time HW initialization")
    0a9d2bed5557 ("drm/i915/skl: Making DC6 entry is the last call in suspend flow.")
    1f814daca43a ("drm/i915: add support for checking if we hold an RPM reference")
    2f693e28b8df ("drm/i915: Make turning on/off PW1 and Misc I/O part of the init/fini sequences")
    399bb5b6db02 ("drm/i915: Move allocation of various workqueues earlier during init")
    414b7999b8be ("drm/i915/gen9: Remove csr.state, csr_lock and related code.")
    4f1959ee33c0 ("drm/i915: Use insert_page for pwrite_fast")
    62106b4f6b91 ("drm/i915: Rename dev_priv->gtt to dev_priv->ggtt")
    666a45379e2c ("drm/i915: Separate cherryview from valleyview")
    72e96d6450c0 ("drm/i915: Refer to GGTT {,VM} consistently")
    73dfc227ff5c ("drm/i915/skl: init/uninit display core as part of the HW power domain state")
    9c5308ea1cd4 ("drm/i915/skl: Refuse to load outdated dmc firmware")
    ad5c3d3ffbb2 ("drm/i915: Move MCHBAR setup earlier during init")
    b6e7d894c3d2 ("drm/i915/skl: Store and print the DMC firmware version we load")
    bc87229f323e ("drm/i915/skl: enable PC9/10 power states during suspend-to-idle")
    c73666f394fc ("drm/i915/skl: If needed sanitize bios programmed cdclk")
    ebae38d061df ("drm/i915/gen9: csr_init after runtime pm enable")
    f4448375467d ("drm/i915/gen9: Use dev_priv in csr functions")
    f514c2d84285 ("drm/i915/gen9: flush DMC fw loading work during system suspend")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

--
Thanks,
Sasha

Patch
diff mbox series

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fed0bc421a55..c6ba350e6e4f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -610,7 +610,8 @@  i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 		unsigned int page_length = PAGE_SIZE - page_offset;
 		page_length = remain < page_length ? remain : page_length;
 		if (node.allocated) {
-			wmb(); /* flush the write before we modify the GGTT */
+			/* flush the write before we modify the GGTT */
+			intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 			ggtt->vm.insert_page(&ggtt->vm,
 					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
 					     node.start, I915_CACHE_NONE, 0);
@@ -639,8 +640,8 @@  i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 	i915_gem_object_unlock_fence(obj, fence);
 out_unpin:
 	mutex_lock(&i915->drm.struct_mutex);
+	intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 	if (node.allocated) {
-		wmb();
 		ggtt->vm.clear_range(&ggtt->vm, node.start, node.size);
 		remove_mappable_node(&node);
 	} else {