diff mbox

[3/6] drm: Fix drm_vblank_pre/post_modeset regression from Linux 4.4

Message ID 1454894009-15466-4-git-send-email-mario.kleiner.de@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Mario Kleiner Feb. 8, 2016, 1:13 a.m. UTC
Changes to drm_update_vblank_count() in Linux 4.4 broke the
behaviour of the pre/post modeset functions as the new update
code doesn't deal with hw vblank counter resets inbetween calls
to drm_vblank_pre_modeset an drm_vblank_post_modeset, as it
should.

This causes mistreatment of such hw counter resets as counter
wraparound, and thereby large forward jumps of the software
vblank counter which in turn cause vblank event dispatching
and vblank waits to fail/hang --> userspace clients hang.

This symptom was reported on radeon-kms to cause a infinite
hang of KDE Plasma 5 shell's login procedure, preventing users
from logging in.

Fix this by detecting when drm_update_vblank_count() is called
inside a pre->post modeset interval. If so, clamp valid vblank
increments to the safe values 0 and 1, pretty much restoring
the update behavior of the old update code of Linux 4.3 and
earlier. Also reset the last recorded hw vblank count at call
to drm_vblank_post_modeset() to be safe against hw that after
modesetting, dpms on etc. only fires its first vblank irq after
drm_vblank_post_modeset() was already called.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: <stable@vger.kernel.org> # 4.4+
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
---
 drivers/gpu/drm/drm_irq.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Daniel Vetter Feb. 9, 2016, 10 a.m. UTC | #1
On Mon, Feb 08, 2016 at 02:13:26AM +0100, Mario Kleiner wrote:
> Changes to drm_update_vblank_count() in Linux 4.4 broke the
> behaviour of the pre/post modeset functions as the new update
> code doesn't deal with hw vblank counter resets inbetween calls
> to drm_vblank_pre_modeset an drm_vblank_post_modeset, as it
> should.
> 
> This causes mistreatment of such hw counter resets as counter
> wraparound, and thereby large forward jumps of the software
> vblank counter which in turn cause vblank event dispatching
> and vblank waits to fail/hang --> userspace clients hang.
> 
> This symptom was reported on radeon-kms to cause a infinite
> hang of KDE Plasma 5 shell's login procedure, preventing users
> from logging in.
> 
> Fix this by detecting when drm_update_vblank_count() is called
> inside a pre->post modeset interval. If so, clamp valid vblank
> increments to the safe values 0 and 1, pretty much restoring
> the update behavior of the old update code of Linux 4.3 and
> earlier. Also reset the last recorded hw vblank count at call
> to drm_vblank_post_modeset() to be safe against hw that after
> modesetting, dpms on etc. only fires its first vblank irq after
> drm_vblank_post_modeset() was already called.
> 
> Reported-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> Cc: <stable@vger.kernel.org> # 4.4+
> Cc: michel@daenzer.net
> Cc: vbabka@suse.cz
> Cc: ville.syrjala@linux.intel.com
> Cc: daniel.vetter@ffwll.ch
> Cc: dri-devel@lists.freedesktop.org
> Cc: alexander.deucher@amd.com
> Cc: christian.koenig@amd.com

We need to untangle the new vblank stuff that assumes solide (atomic)
drivers and all the old stuff much more. But that can be done later on.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/drm_irq.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index aa2c74b..5c27ad3 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -222,6 +222,21 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
>  	}
>  
>  	/*
> +	 * Within a drm_vblank_pre_modeset - drm_vblank_post_modeset
> +	 * interval? If so then vblank irqs keep running and it will likely
> +	 * happen that the hardware vblank counter is not trustworthy as it
> +	 * might reset at some point in that interval and vblank timestamps
> +	 * are not trustworthy either in that interval. Iow. this can result
> +	 * in a bogus diff >> 1 which must be avoided as it would cause
> +	 * random large forward jumps of the software vblank counter.
> +	 */
> +	if (diff > 1 && (vblank->inmodeset & 0x2)) {
> +		DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u"
> +			      " due to pre-modeset.\n", pipe, diff);
> +		diff = 1;
> +	}
> +
> +	/*
>  	 * Restrict the bump of the software vblank counter to a safe maximum
>  	 * value of +1 whenever there is the possibility that concurrent readers
>  	 * of vblank timestamps could be active at the moment, as the current
> @@ -1573,6 +1588,7 @@ void drm_vblank_post_modeset(struct drm_device *dev, unsigned int pipe)
>  	if (vblank->inmodeset) {
>  		spin_lock_irqsave(&dev->vbl_lock, irqflags);
>  		dev->vblank_disable_allowed = true;
> +		drm_reset_vblank_timestamp(dev, pipe);
>  		spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
>  
>  		if (vblank->inmodeset & 0x2)
> -- 
> 1.9.1
>
Vlastimil Babka Feb. 11, 2016, 1:03 p.m. UTC | #2
On 02/08/2016 02:13 AM, Mario Kleiner wrote:
> Changes to drm_update_vblank_count() in Linux 4.4 broke the
> behaviour of the pre/post modeset functions as the new update
> code doesn't deal with hw vblank counter resets inbetween calls
> to drm_vblank_pre_modeset an drm_vblank_post_modeset, as it
> should.
>
> This causes mistreatment of such hw counter resets as counter
> wraparound, and thereby large forward jumps of the software
> vblank counter which in turn cause vblank event dispatching
> and vblank waits to fail/hang --> userspace clients hang.
>
> This symptom was reported on radeon-kms to cause a infinite
> hang of KDE Plasma 5 shell's login procedure, preventing users
> from logging in.
>
> Fix this by detecting when drm_update_vblank_count() is called
> inside a pre->post modeset interval. If so, clamp valid vblank
> increments to the safe values 0 and 1, pretty much restoring
> the update behavior of the old update code of Linux 4.3 and
> earlier. Also reset the last recorded hw vblank count at call
> to drm_vblank_post_modeset() to be safe against hw that after
> modesetting, dpms on etc. only fires its first vblank irq after
> drm_vblank_post_modeset() was already called.
>
> Reported-by: Vlastimil Babka <vbabka@suse.cz>

FWIW, I've applied the whole patchset to 4.4 and the kde5 login problem 
didn't occur. I can test the next version too.

Thanks,
Vlastimil
diff mbox

Patch

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index aa2c74b..5c27ad3 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -222,6 +222,21 @@  static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
 	}
 
 	/*
+	 * Within a drm_vblank_pre_modeset - drm_vblank_post_modeset
+	 * interval? If so then vblank irqs keep running and it will likely
+	 * happen that the hardware vblank counter is not trustworthy as it
+	 * might reset at some point in that interval and vblank timestamps
+	 * are not trustworthy either in that interval. Iow. this can result
+	 * in a bogus diff >> 1 which must be avoided as it would cause
+	 * random large forward jumps of the software vblank counter.
+	 */
+	if (diff > 1 && (vblank->inmodeset & 0x2)) {
+		DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u"
+			      " due to pre-modeset.\n", pipe, diff);
+		diff = 1;
+	}
+
+	/*
 	 * Restrict the bump of the software vblank counter to a safe maximum
 	 * value of +1 whenever there is the possibility that concurrent readers
 	 * of vblank timestamps could be active at the moment, as the current
@@ -1573,6 +1588,7 @@  void drm_vblank_post_modeset(struct drm_device *dev, unsigned int pipe)
 	if (vblank->inmodeset) {
 		spin_lock_irqsave(&dev->vbl_lock, irqflags);
 		dev->vblank_disable_allowed = true;
+		drm_reset_vblank_timestamp(dev, pipe);
 		spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
 
 		if (vblank->inmodeset & 0x2)