
[1/5] drm/i915: Use fence_write() from rpm resume

Message ID 20161012111637.14784-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson Oct. 12, 2016, 11:16 a.m. UTC
During rpm resume we restore the fences, but we do not have the
protection of struct_mutex. This rules out updating the activity
tracking on the fences, and requires us to rely on the rpm as the
serialisation barrier instead.

[  350.298052] [drm:intel_runtime_resume [i915]] Resuming device
[  350.308606]
[  350.310520] ===============================
[  350.315560] [ INFO: suspicious RCU usage. ]
[  350.320554] 4.8.0-rc8-bsw-rapl+ #3133 Tainted: G     U  W
[  350.327208] -------------------------------
[  350.331977] ../drivers/gpu/drm/i915/i915_gem_request.h:371 suspicious rcu_dereference_protected() usage!
[  350.342619]
[  350.342619] other info that might help us debug this:
[  350.342619]
[  350.351593]
[  350.351593] rcu_scheduler_active = 1, debug_locks = 0
[  350.358952] 3 locks held by Xorg/320:
[  350.363077]  #0:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa030589c>] drm_modeset_lock_all+0x3c/0xd0 [drm]
[  350.375162]  #1:  (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffffa03058a6>] drm_modeset_lock_all+0x46/0xd0 [drm]
[  350.387022]  #2:  (crtc_ww_class_mutex){+.+.+.}, at: [<ffffffffa0305056>] drm_modeset_lock+0x36/0x110 [drm]
[  350.398236]
[  350.398236] stack backtrace:
[  350.403196] CPU: 1 PID: 320 Comm: Xorg Tainted: G     U  W       4.8.0-rc8-bsw-rapl+ #3133
[  350.412457] Hardware name: Intel Corporation CHERRYVIEW C0 PLATFORM/Braswell CRB, BIOS BRAS.X64.X088.R00.1510270350 10/27/2015
[  350.425212]  0000000000000000 ffff8801680a78c8 ffffffff81332187 ffff88016c5c5000
[  350.433611]  0000000000000001 ffff8801680a78f8 ffffffff810ca6da ffff88016cc8b0f0
[  350.442012]  ffff88016cc80000 ffff88016cc80000 ffff880177ad0000 ffff8801680a7948
[  350.450409] Call Trace:
[  350.453165]  [<ffffffff81332187>] dump_stack+0x67/0x90
[  350.458931]  [<ffffffff810ca6da>] lockdep_rcu_suspicious+0xea/0x120
[  350.466002]  [<ffffffffa039e8dd>] fence_update+0xbd/0x670 [i915]
[  350.472766]  [<ffffffffa039efe2>] i915_gem_restore_fences+0x52/0x70 [i915]
[  350.480496]  [<ffffffffa0368f42>] vlv_resume_prepare+0x72/0x570 [i915]
[  350.487839]  [<ffffffffa0369802>] intel_runtime_resume+0x102/0x210 [i915]
[  350.495442]  [<ffffffff8137f26f>] pci_pm_runtime_resume+0x7f/0xb0
[  350.502274]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
[  350.509883]  [<ffffffff814401c5>] __rpm_callback+0x35/0x70
[  350.516037]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
[  350.523646]  [<ffffffff81440224>] rpm_callback+0x24/0x80
[  350.529604]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
[  350.537212]  [<ffffffff814417bd>] rpm_resume+0x4ad/0x740
[  350.543161]  [<ffffffff81441aa1>] __pm_runtime_resume+0x51/0x80
[  350.549824]  [<ffffffffa03889c8>] intel_runtime_pm_get+0x28/0x90 [i915]
[  350.557265]  [<ffffffffa0388a53>] intel_display_power_get+0x23/0x50 [i915]
[  350.565001]  [<ffffffffa03ef23d>] intel_atomic_commit_tail+0xdfd/0x10b0 [i915]
[  350.573106]  [<ffffffffa034b2e9>] ? drm_atomic_helper_swap_state+0x159/0x300 [drm_kms_helper]
[  350.582659]  [<ffffffff81615091>] ? _raw_spin_unlock+0x31/0x50
[  350.589205]  [<ffffffffa034b2e9>] ? drm_atomic_helper_swap_state+0x159/0x300 [drm_kms_helper]
[  350.598787]  [<ffffffffa03ef8a5>] intel_atomic_commit+0x3b5/0x500 [i915]
[  350.606319]  [<ffffffffa03061dc>] ? drm_atomic_set_crtc_for_connector+0xcc/0x100 [drm]
[  350.615209]  [<ffffffffa0306b49>] drm_atomic_commit+0x49/0x50 [drm]
[  350.622242]  [<ffffffffa034dee8>] drm_atomic_helper_set_config+0x88/0xc0 [drm_kms_helper]
[  350.631419]  [<ffffffffa02f94ac>] drm_mode_set_config_internal+0x6c/0x120 [drm]
[  350.639623]  [<ffffffffa02fa94c>] drm_mode_setcrtc+0x22c/0x4d0 [drm]
[  350.646760]  [<ffffffffa02f0f19>] drm_ioctl+0x209/0x460 [drm]
[  350.653217]  [<ffffffffa02fa720>] ? drm_mode_getcrtc+0x150/0x150 [drm]
[  350.660536]  [<ffffffff810c984a>] ? __lock_is_held+0x4a/0x70
[  350.666885]  [<ffffffff81202303>] do_vfs_ioctl+0x93/0x6b0
[  350.672939]  [<ffffffff8120f843>] ? __fget+0x113/0x200
[  350.678797]  [<ffffffff8120f735>] ? __fget+0x5/0x200
[  350.684361]  [<ffffffff81202964>] SyS_ioctl+0x44/0x80
[  350.690030]  [<ffffffff81001deb>] do_syscall_64+0x5b/0x120
[  350.696184]  [<ffffffff81615ada>] entry_SYSCALL64_slow_path+0x25/0x25

Note we also have to remember the lesson from commit 4fc788f5ee3d
("drm/i915: Flush delayed fence releases after reset") where we have to
flush any changes to the fence on restore.

v2: Replace call to release user mmaps with an assertion that they have
already been zapped.

Fixes: 49ef5294cda2 ("drm/i915: Move fence tracking from object to vma")
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_fence.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)
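For readers skimming the diffstat, the core of the v2 change is that i915_gem_restore_fences() stops calling fence_update() (which touches activity tracking and therefore needs struct_mutex) and instead does a raw fence_write() plus manual bookkeeping. A minimal userspace model of that loop (all structs and helpers below are simplified stand-ins, not the real i915 types) behaves like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the i915 structures (hypothetical). */
struct vma {
	bool tiled;                 /* models i915_gem_object_is_tiled() */
	struct fence_reg *fence;    /* fence register assigned to this vma */
};

struct fence_reg {
	struct vma *vma;
	bool dirty;                 /* a delayed tiling change is pending */
	unsigned int hw_value;      /* what fence_write() programmed: 1 = fenced, 0 = clear */
};

/* Model of fence_write(): program the register only; no activity tracking. */
static void fence_write(struct fence_reg *reg, struct vma *vma)
{
	reg->hw_value = vma ? 1u : 0u;
}

/*
 * Model of the v2 i915_gem_restore_fences() loop: commit a delayed
 * tiling change by detaching the fence from a now-untiled object,
 * otherwise rewrite the register with the most recent information.
 */
static void restore_fences(struct fence_reg *regs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct fence_reg *reg = &regs[i];
		struct vma *vma = reg->vma;

		if (vma && !vma->tiled) {
			assert(reg->dirty);  /* GEM_BUG_ON(!reg->dirty) */
			vma->fence = NULL;   /* drop the stale link */
			vma = NULL;
		}

		fence_write(reg, vma);       /* not fence_update(): no tracking */
		reg->vma = vma;
	}
}
```

Unlike fence_update(), nothing in this path touches request or activity tracking, which is why it can run from rpm resume without struct_mutex, relying on the rpm wakeref as the only barrier.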

Comments

Joonas Lahtinen Oct. 12, 2016, 11:42 a.m. UTC | #1
This was already:

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
Daniel Vetter Oct. 13, 2016, 3:10 p.m. UTC | #2
On Wed, Oct 12, 2016 at 12:16:33PM +0100, Chris Wilson wrote:
> During rpm resume we restore the fences, but we do not have the
> protection of struct_mutex. This rules out updating the activity
> tracking on the fences, and requires us to rely on the rpm as the
> serialisation barrier instead.
> 
> [snip]
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c b/drivers/gpu/drm/i915/i915_gem_fence.c
> index 8df1fa7234e8..f7081f4b5d22 100644
> --- a/drivers/gpu/drm/i915/i915_gem_fence.c
> +++ b/drivers/gpu/drm/i915/i915_gem_fence.c
> @@ -290,6 +290,8 @@ i915_vma_put_fence(struct i915_vma *vma)
>  {
>  	struct drm_i915_fence_reg *fence = vma->fence;
>  
> +	assert_rpm_wakelock_held(to_i915(vma->vm->dev));
> +
>  	if (!fence)
>  		return 0;
>  
> @@ -341,6 +343,8 @@ i915_vma_get_fence(struct i915_vma *vma)
>  	struct drm_i915_fence_reg *fence;
>  	struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL;
>  
> +	assert_rpm_wakelock_held(to_i915(vma->vm->dev));
> +
>  	/* Just update our place in the LRU if our fence is getting reused. */
>  	if (vma->fence) {
>  		fence = vma->fence;
> @@ -371,6 +375,12 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = to_i915(dev);
>  	int i;
>  
> +	/* Note that this may be called outside of struct_mutex, by
> +	 * runtime suspend/resume. The barrier we require is enforced by
> +	 * rpm itself - all access to fences/GTT are only within an rpm
> +	 * wakeref, and to acquire that wakeref you must pass through here.
> +	 */
> +
>  	for (i = 0; i < dev_priv->num_fence_regs; i++) {
>  		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
>  		struct i915_vma *vma = reg->vma;
> @@ -379,10 +389,17 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  		 * Commit delayed tiling changes if we have an object still
>  		 * attached to the fence, otherwise just clear the fence.
>  		 */
> -		if (vma && !i915_gem_object_is_tiled(vma->obj))
> +		if (vma && !i915_gem_object_is_tiled(vma->obj)) {
> +			GEM_BUG_ON(!reg->dirty);
> +			GEM_BUG_ON(!list_empty(&vma->obj->userfault_link));
> +
> +			list_move(&reg->link, &dev_priv->mm.fence_list);
> +			vma->fence = NULL;
>  			vma = NULL;
> +		}
>  
> -		fence_update(reg, vma);
> +		fence_write(reg, vma);
> +		reg->vma = vma;

Same comments as with the userfault_list: Using rpm ordering to enforce
consistency causes mild panic attacks here with me ;-)

Is the above (delayed tiling change commit) even possible here, at least
for rpm resume? Same for system s/r (both s3 and s4) since the pagetables
won't survive anyway. Can't we simply make this an impossibility?
-Daniel

>  	}
>  }
>  
> -- 
> 2.9.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Chris Wilson Oct. 13, 2016, 3:25 p.m. UTC | #3
On Thu, Oct 13, 2016 at 05:10:21PM +0200, Daniel Vetter wrote:
> On Wed, Oct 12, 2016 at 12:16:33PM +0100, Chris Wilson wrote:
> > [snip]
> 
> Same comments as with the userfault_list: Using rpm ordering to enforce
> consistency causes mild panic attacks here with me ;-)
> 
> Is the above (delayed tiling change commit) even possible here, at least
> for rpm resume? Same for system s/r (both s3 and s4) since the pagetables
> won't survive anyway. Can't we simply make this an impossibility?

We also use this from reset to rewrite the old fences, and we know there
we can hit the delayed fence write [4fc788f5ee3d]. It would also be
possible to hit it on suspend as well.

I've been thinking about whether we should be bothering to write the
fence registers with the correct value or just cancel the fences. But we
have to restore anything that is pinned, and we have to write something
into the fences (just to be safe), and if we have to write something we
may as well use the most recent information we have as that has a good
chance of being used again.

Long story short, I don't have a better idea for restoring or avoiding
the restore of fences.
-Chris
Daniel Vetter Oct. 13, 2016, 3:28 p.m. UTC | #4
On Thu, Oct 13, 2016 at 04:25:18PM +0100, Chris Wilson wrote:
> On Thu, Oct 13, 2016 at 05:10:21PM +0200, Daniel Vetter wrote:
> > On Wed, Oct 12, 2016 at 12:16:33PM +0100, Chris Wilson wrote:
> > > [snip]
> > 
> > Same comments as with the userfault_list: Using rpm ordering to enforce
> > consistency causes mild panic attacks here with me ;-)
> > 
> > Is the above (delayed tiling change commit) even possible here, at least
> > for rpm resume? Same for system s/r (both s3 and s4) since the pagetables
> > won't survive anyway. Can't we simply make this an impossibility?
> 
> We also use this from reset to rewrite the old fences, and we know there
> we can hit the delayed fence write [4fc788f5ee3d]. It would also be
> possible to hit it on suspend as well.
> 
> I've been thinking about whether we should be bothering to write the
> fence registers with the correct value or just cancel the fences. But we
> have to restore anything that is pinned, and we have to write something
> into the fences (just to be safe), and if we have to write something we
> may as well use the most recent information we have as that has a good
> chance of being used again.
> 
> Long story short, I don't have a better idea for restoring or avoiding
> the restore of fences.

What about a rpm_resume only version that just does a blind fence_write?
It is something, and we can update the book-keeping once we do get to one
of the real synchronization points again.

With that we can leave the versions for reset and system s/r alone ... Or
is there trickery even with rpm going on?
-Daniel
Chris Wilson Oct. 13, 2016, 3:41 p.m. UTC | #5
On Thu, Oct 13, 2016 at 05:28:13PM +0200, Daniel Vetter wrote:
> On Thu, Oct 13, 2016 at 04:25:18PM +0100, Chris Wilson wrote:
> > On Thu, Oct 13, 2016 at 05:10:21PM +0200, Daniel Vetter wrote:
> > > On Wed, Oct 12, 2016 at 12:16:33PM +0100, Chris Wilson wrote:
> > > > [snip]
> > > 
> > > Same comments as with the userfault_list: Using rpm ordering to enforce
> > > consistency causes mild panic attacks here with me ;-)
> > > 
> > > Is the above (delayed tiling change commit) even possible here, at least
> > > for rpm resume? Same for system s/r (both s3 and s4) since the pagetables
> > > won't survive anyway. Can't we simply make this an impossibility?
> > 
> > We also use this from reset to rewrite the old fences, and we know there
> > we can hit the delayed fence write [4fc788f5ee3d]. It would also be
> > possible to hit it on suspend as well.
> > 
> > I've been thinking about whether we should be bothering to write the
> > fence registers with the correct value or just cancel the fences. But we
> > have to restore anything that is pinned, and we have to write something
> > into the fences (just to be safe), and if we have to write something we
> > may as well use the most recent information we have as that has a good
> > chance of being used again.
> > 
> > Long story short, I don't have a better idea for restoring or avoiding
> > the restore of fences.
> 
> What about a rpm_resume only version that just does a blind fence_write?
> It is something, and we can update the book-keeping once we do get to one
> of the real synchronization points again.
> 
> With that we can leave the versions for reset and system s/r alone ... Or
> is there trickery even with rpm going on?

For rpm suspend, we only zap the user's mmap and not mark the fence as
lost. I think that's the missing piece as to why this is not as simple as
it could be for rpm-resume. On rpm-resume we only need to restore pinned
fences, and fences should only be pinned for hw access, and so there
should never be any if we were rpm-suspended. (Assuming that all pinned
fences are active, which on the surface seems a reasonable assumption.)

If that holds true, we do not need this at all for runtime pm (we still
need it across full system suspend/reset) and just need to doctor the
existing scary i915_gem_release_all_mmaps() (aka
i915_gem_runtime_suspend()!) to both release the mmap and throw away the
fence tracking. At least then we only have one dragon nest.
-Chris
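Chris's suggestion above, teaching the runtime-suspend path to drop fence tracking at the same time as it revokes the user mmaps, can be modeled in miniature (hypothetical helper names and a toy struct, not the real i915_gem_runtime_suspend() code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of an object with a userspace GTT
 * mmap and an optional fence register assignment (-1 = no fence). */
struct obj {
	bool mmap_valid;
	int fence_id;
};

/*
 * Model of the proposed runtime-suspend behaviour: today the driver
 * only revokes the mmaps (i915_gem_release_all_mmaps()); the idea is
 * to also forget the fence so rpm resume has nothing to restore.
 */
static void runtime_suspend(struct obj *objs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		objs[i].mmap_valid = false;  /* existing mmap zap */
		objs[i].fence_id = -1;       /* proposed: mark fence lost too */
	}
}

/* After the above, this should always report false on rpm resume. */
static bool resume_needs_fence_restore(const struct obj *objs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (objs[i].fence_id >= 0)
			return true;
	return false;
}
```

With the fence forgotten at suspend time, rpm resume has nothing pinned left to restore; any later access faults back in and reacquires a fence under the normal locking.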
Daniel Vetter Oct. 17, 2016, 6:52 a.m. UTC | #6
On Thu, Oct 13, 2016 at 04:41:01PM +0100, Chris Wilson wrote:
> On Thu, Oct 13, 2016 at 05:28:13PM +0200, Daniel Vetter wrote:
> > On Thu, Oct 13, 2016 at 04:25:18PM +0100, Chris Wilson wrote:
> > > On Thu, Oct 13, 2016 at 05:10:21PM +0200, Daniel Vetter wrote:
> > > > On Wed, Oct 12, 2016 at 12:16:33PM +0100, Chris Wilson wrote:
> > > > > [snip]
> > > > 
> > > > Same comments as with the userfault_list: Using rpm ordering to enforce
> > > > consistency causes mild panic attacks here with me ;-)
> > > > 
> > > > Is the above (delayed tiling change commit) even possible here, at least
> > > > for rpm resume? Same for system s/r (both s3 and s4) since the pagetables
> > > > won't survive anyway. Can't we simply make this an impossibility?
> > > 
> > > We also use this from reset to rewrite the old fences, and we know there
> > > we can hit the delayed fence write [4fc788f5ee3d]. It would also be
> > > possible to hit it on suspend as well.
> > > 
> > > I've been thinking about whether we should be bothering to write the
> > > fence registers with the correct value or just cancel the fences. But we
> > > have to restore anything that is pinned, and we have to write something
> > > into the fences (just to be safe), and if we have to write something we
> > > may as well use the most recent information we have as that has a good
> > > chance of being used again.
> > > 
> > > Long story short, I don't have a better idea for restoring or avoiding
> > > the restore of fences.
> > 
> > What about a rpm_resume only version that just does a blind fence_write?
> > It is something, and we can update the book-keeping once we do get to one
> > of the real synchronization points again.
> > 
> > With that we can leave the versions for reset and system s/r alone ... Or
> > is there trickery even with rpm going on?
> 
> For rpm suspend, we only zap the user's mmap and not mark the fence as
> lost. I think that's the missing piece as to why this is not as simple as
> it could be for rpm-resume. On rpm-resume we only need to restore pinned
> fences, and fences should only be pinned for hw access, and so there
> should never be any if we were rpm-suspended. (Assuming that all pinned
> fences are active, which on the surface seems a reasonable assumption.)
> 
> If that holds true, we do not need this at all for runtime pm (we still
> need it across full system suspend/reset) and just need to doctor the
> existing scary i915_gem_release_all_mmaps() (aka
> i915_gem_runtime_suspend()!) to both release the mmap and throw away the
> fence tracking. At least then we only have one dragon nest.

Yeah, would be great to unify this stuff a bit ... Making fence semantics
the same for rpm suspend as for normal suspend would definitely be nice,
and the hw will forget about the registers anyway.
-Daniel

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c b/drivers/gpu/drm/i915/i915_gem_fence.c
index 8df1fa7234e8..f7081f4b5d22 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence.c
@@ -290,6 +290,8 @@  i915_vma_put_fence(struct i915_vma *vma)
 {
 	struct drm_i915_fence_reg *fence = vma->fence;
 
+	assert_rpm_wakelock_held(to_i915(vma->vm->dev));
+
 	if (!fence)
 		return 0;
 
@@ -341,6 +343,8 @@  i915_vma_get_fence(struct i915_vma *vma)
 	struct drm_i915_fence_reg *fence;
 	struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL;
 
+	assert_rpm_wakelock_held(to_i915(vma->vm->dev));
+
 	/* Just update our place in the LRU if our fence is getting reused. */
 	if (vma->fence) {
 		fence = vma->fence;
@@ -371,6 +375,12 @@  void i915_gem_restore_fences(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	int i;
 
+	/* Note that this may be called outside of struct_mutex, by
+	 * runtime suspend/resume. The barrier we require is enforced by
+	 * rpm itself - all access to fences/GTT are only within an rpm
+	 * wakeref, and to acquire that wakeref you must pass through here.
+	 */
+
 	for (i = 0; i < dev_priv->num_fence_regs; i++) {
 		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
 		struct i915_vma *vma = reg->vma;
@@ -379,10 +389,17 @@  void i915_gem_restore_fences(struct drm_device *dev)
 		 * Commit delayed tiling changes if we have an object still
 		 * attached to the fence, otherwise just clear the fence.
 		 */
-		if (vma && !i915_gem_object_is_tiled(vma->obj))
+		if (vma && !i915_gem_object_is_tiled(vma->obj)) {
+			GEM_BUG_ON(!reg->dirty);
+			GEM_BUG_ON(!list_empty(&vma->obj->userfault_link));
+
+			list_move(&reg->link, &dev_priv->mm.fence_list);
+			vma->fence = NULL;
 			vma = NULL;
+		}
 
-		fence_update(reg, vma);
+		fence_write(reg, vma);
+		reg->vma = vma;
 	}
 }