diff mbox series

[3/3] drm/vkms: Reduce critical section in vblank_simulate

Message ID 20190719152314.7706-3-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show
Series [1/3] drm/vblank: Document and fix vblank count barrier semantics | expand

Commit Message

Daniel Vetter July 19, 2019, 3:23 p.m. UTC
We can reduce the critical section in vkms_vblank_simulate under
output->lock quite a lot:

- hrtimer_forward_now just needs to be ordered correctly wrt
  drm_crtc_handle_vblank. We already access the hrtimer timestamp
  without locks. While auditing that I noticed that we don't correctly
  annotate the read there, so sprinkle a READ_ONCE to make sure the
  compiler doesn't do anything foolish.

- drm_crtc_handle_vblank must stay under the lock to avoid races with
  drm_crtc_arm_vblank_event.

- The access to vkms_ouptut->crc_state also must stay under the lock.

- next problem is making sure the output->state structure doesn't get
  freed too early. First we rely on a given hrtimer being serialized:
  If we call drm_crtc_handle_vblank, then we are guaranteed that the
  previous call to vkms_vblank_simulate has completed. The other side
  of the coin is that the atomic updates waits for the vblank to
  happen before it releases the old state. Both taken together means
  that by the time the atomic update releases the old state, the
  hrtimer won't access it anymore (it might be accessing the new state
  at the same time, but that's ok).

- state is invariant, except the few fields separate protected by
  state->crc_lock. So no need to hold the lock for that.

- finally the queue_work. We need to make sure there's no races with
  the flush_work, i.e. when we call flush_work we need to guarantee
  that the hrtimer can't requeue the work again. This is guaranteed by
  the same vblank/hrtimer ordering guarantees like the reasoning above
  why state won't be freed too early: flush_work on the old state is
  called after wait_for_flip_done in the atomic commit code.

Therefore we can also move everything after the output->crc_state out
of the critical section.

Motivated by suggestions from Rodrigo.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_crtc.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Rodrigo Siqueira Sept. 3, 2019, 12:50 p.m. UTC | #1
Thanks for this patch! It looks good for me.

Reviewed-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>

On 07/19, Daniel Vetter wrote:
> We can reduce the critical section in vkms_vblank_simulate under
> output->lock quite a lot:
> 
> - hrtimer_forward_now just needs to be ordered correctly wrt
>   drm_crtc_handle_vblank. We already access the hrtimer timestamp
>   without locks. While auditing that I noticed that we don't correctly
>   annotate the read there, so sprinkle a READ_ONCE to make sure the
>   compiler doesn't do anything foolish.
> 
> - drm_crtc_handle_vblank must stay under the lock to avoid races with
>   drm_crtc_arm_vblank_event.
> 
> - The access to vkms_ouptut->crc_state also must stay under the lock.
> 
> - next problem is making sure the output->state structure doesn't get
>   freed too early. First we rely on a given hrtimer being serialized:
>   If we call drm_crtc_handle_vblank, then we are guaranteed that the
>   previous call to vkms_vblank_simulate has completed. The other side
>   of the coin is that the atomic updates waits for the vblank to
>   happen before it releases the old state. Both taken together means
>   that by the time the atomic update releases the old state, the
>   hrtimer won't access it anymore (it might be accessing the new state
>   at the same time, but that's ok).
> 
> - state is invariant, except the few fields separate protected by
>   state->crc_lock. So no need to hold the lock for that.
> 
> - finally the queue_work. We need to make sure there's no races with
>   the flush_work, i.e. when we call flush_work we need to guarantee
>   that the hrtimer can't requeue the work again. This is guaranteed by
>   the same vblank/hrtimer ordering guarantees like the reasoning above
>   why state won't be freed too early: flush_work on the old state is
>   called after wait_for_flip_done in the atomic commit code.
> 
> Therefore we can also move everything after the output->crc_state out
> of the critical section.
> 
> Motivated by suggestions from Rodrigo.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crtc.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 927dafaebc76..74f703b8d22a 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -16,17 +16,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>  	u64 ret_overrun;
>  	bool ret;
>  
> -	spin_lock(&output->lock);
> -
>  	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
>  					  output->period_ns);
>  	WARN_ON(ret_overrun != 1);
>  
> +	spin_lock(&output->lock);
>  	ret = drm_crtc_handle_vblank(crtc);
>  	if (!ret)
>  		DRM_ERROR("vkms failure on handling vblank");
>  
>  	state = output->composer_state;
> +	spin_unlock(&output->lock);
> +
>  	if (state && output->composer_enabled) {
>  		u64 frame = drm_crtc_accurate_vblank_count(crtc);
>  
> @@ -48,8 +49,6 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>  			DRM_DEBUG_DRIVER("Composer worker already queued\n");
>  	}
>  
> -	spin_unlock(&output->lock);
> -
>  	return HRTIMER_RESTART;
>  }
>  
> @@ -85,7 +84,7 @@ bool vkms_get_vblank_timestamp(struct drm_device *dev, unsigned int pipe,
>  	struct vkms_output *output = &vkmsdev->output;
>  	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
>  
> -	*vblank_time = output->vblank_hrtimer.node.expires;
> +	*vblank_time = READ_ONCE(output->vblank_hrtimer.node.expires);
>  
>  	if (WARN_ON(*vblank_time == vblank->time))
>  		return true;
> -- 
> 2.22.0
>
Daniel Vetter Sept. 3, 2019, 3:16 p.m. UTC | #2
On Tue, Sep 03, 2019 at 08:50:29AM -0400, Rodrigo Siqueira wrote:
> Thanks for this patch! It looks good for me.
> 
> Reviewed-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>

Thanks for taking a look at all this, entire series merged. With the r-b
from Ville on patch 1, but I'm happy to further discuss your questions.
Plus I augmented the commit message a bit for patch 1 to explain the
missed vblank story better.
-Daniel

> 
> On 07/19, Daniel Vetter wrote:
> > We can reduce the critical section in vkms_vblank_simulate under
> > output->lock quite a lot:
> > 
> > - hrtimer_forward_now just needs to be ordered correctly wrt
> >   drm_crtc_handle_vblank. We already access the hrtimer timestamp
> >   without locks. While auditing that I noticed that we don't correctly
> >   annotate the read there, so sprinkle a READ_ONCE to make sure the
> >   compiler doesn't do anything foolish.
> > 
> > - drm_crtc_handle_vblank must stay under the lock to avoid races with
> >   drm_crtc_arm_vblank_event.
> > 
> > - The access to vkms_ouptut->crc_state also must stay under the lock.
> > 
> > - next problem is making sure the output->state structure doesn't get
> >   freed too early. First we rely on a given hrtimer being serialized:
> >   If we call drm_crtc_handle_vblank, then we are guaranteed that the
> >   previous call to vkms_vblank_simulate has completed. The other side
> >   of the coin is that the atomic updates waits for the vblank to
> >   happen before it releases the old state. Both taken together means
> >   that by the time the atomic update releases the old state, the
> >   hrtimer won't access it anymore (it might be accessing the new state
> >   at the same time, but that's ok).
> > 
> > - state is invariant, except the few fields separate protected by
> >   state->crc_lock. So no need to hold the lock for that.
> > 
> > - finally the queue_work. We need to make sure there's no races with
> >   the flush_work, i.e. when we call flush_work we need to guarantee
> >   that the hrtimer can't requeue the work again. This is guaranteed by
> >   the same vblank/hrtimer ordering guarantees like the reasoning above
> >   why state won't be freed too early: flush_work on the old state is
> >   called after wait_for_flip_done in the atomic commit code.
> > 
> > Therefore we can also move everything after the output->crc_state out
> > of the critical section.
> > 
> > Motivated by suggestions from Rodrigo.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/vkms/vkms_crtc.c | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index 927dafaebc76..74f703b8d22a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -16,17 +16,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >  	u64 ret_overrun;
> >  	bool ret;
> >  
> > -	spin_lock(&output->lock);
> > -
> >  	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
> >  					  output->period_ns);
> >  	WARN_ON(ret_overrun != 1);
> >  
> > +	spin_lock(&output->lock);
> >  	ret = drm_crtc_handle_vblank(crtc);
> >  	if (!ret)
> >  		DRM_ERROR("vkms failure on handling vblank");
> >  
> >  	state = output->composer_state;
> > +	spin_unlock(&output->lock);
> > +
> >  	if (state && output->composer_enabled) {
> >  		u64 frame = drm_crtc_accurate_vblank_count(crtc);
> >  
> > @@ -48,8 +49,6 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >  			DRM_DEBUG_DRIVER("Composer worker already queued\n");
> >  	}
> >  
> > -	spin_unlock(&output->lock);
> > -
> >  	return HRTIMER_RESTART;
> >  }
> >  
> > @@ -85,7 +84,7 @@ bool vkms_get_vblank_timestamp(struct drm_device *dev, unsigned int pipe,
> >  	struct vkms_output *output = &vkmsdev->output;
> >  	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
> >  
> > -	*vblank_time = output->vblank_hrtimer.node.expires;
> > +	*vblank_time = READ_ONCE(output->vblank_hrtimer.node.expires);
> >  
> >  	if (WARN_ON(*vblank_time == vblank->time))
> >  		return true;
> > -- 
> > 2.22.0
> > 
> 
> -- 
> Rodrigo Siqueira
> Software Engineer, Advanced Micro Devices (AMD)
> https://siqueira.tech
diff mbox series

Patch

diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index 927dafaebc76..74f703b8d22a 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -16,17 +16,18 @@  static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 	u64 ret_overrun;
 	bool ret;
 
-	spin_lock(&output->lock);
-
 	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
 					  output->period_ns);
 	WARN_ON(ret_overrun != 1);
 
+	spin_lock(&output->lock);
 	ret = drm_crtc_handle_vblank(crtc);
 	if (!ret)
 		DRM_ERROR("vkms failure on handling vblank");
 
 	state = output->composer_state;
+	spin_unlock(&output->lock);
+
 	if (state && output->composer_enabled) {
 		u64 frame = drm_crtc_accurate_vblank_count(crtc);
 
@@ -48,8 +49,6 @@  static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 			DRM_DEBUG_DRIVER("Composer worker already queued\n");
 	}
 
-	spin_unlock(&output->lock);
-
 	return HRTIMER_RESTART;
 }
 
@@ -85,7 +84,7 @@  bool vkms_get_vblank_timestamp(struct drm_device *dev, unsigned int pipe,
 	struct vkms_output *output = &vkmsdev->output;
 	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
 
-	*vblank_time = output->vblank_hrtimer.node.expires;
+	*vblank_time = READ_ONCE(output->vblank_hrtimer.node.expires);
 
 	if (WARN_ON(*vblank_time == vblank->time))
 		return true;