
[1/2] drm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2)

Message ID 1455843999-19179-2-git-send-email-mario.kleiner.de@gmail.com (mailing list archive)
State New, archived

Commit Message

Mario Kleiner Feb. 19, 2016, 1:06 a.m. UTC
This fixes a regression introduced in Linux 4.4.

Limit the amount of time radeon_flip_work_func can
delay programming a page flip, by both limiting the
maximum amount of time per wait cycle and the maximum
number of wait cycles. Continue the flip if the limit
is exceeded, even if that may result in a visual or
timing glitch.
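
The per-cycle limit can be illustrated with the delay expression used in the patch. The following is only a userspace sketch, not driver code: the helper names `flip_min_udelay` and `flip_delay_too_long` are made up for illustration, and `hpos` is assumed to be a non-positive distance (in lines) to the start of vblank.

```c
#include <stdbool.h>

/*
 * Illustrative helper (hypothetical name): microseconds to sleep,
 * mirroring the patch expression
 *   (-hpos + 1) * max(linedur_ns / 1000, 5)
 */
static int flip_min_udelay(int hpos, int linedur_ns)
{
	int linedur_us = linedur_ns / 1000;

	if (linedur_us < 5)
		linedur_us = 5;	/* lower bound of 5 us per line */

	return (-hpos + 1) * linedur_us;
}

/*
 * Illustrative helper (hypothetical name): true if the computed delay
 * exceeds half a frame duration, i.e. framedur_ns / 2000 microseconds.
 */
static bool flip_delay_too_long(int min_udelay, int framedur_ns)
{
	return min_udelay > framedur_ns / 2000;
}
```

As a worked example with assumed values for a 1920x1080 mode at 60 Hz (line duration about 14815 ns, frame duration about 16666667 ns): a distance of 3 lines gives (3 + 1) * 14 = 56 us, well below the half-frame cap of 8333 us, whereas a distance of 700 lines gives 701 * 14 = 9814 us, which exceeds the cap and would end the wait.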

This is to prevent a hang of page flips, as reported
in fdo bug #93746: Disconnecting a DisplayPort display
in parallel to a kms pageflip getting queued can cause
the following hang of page flips and thereby an unusable
desktop:

1. kms pageflip ioctl() queues pageflip -> queues execution
   of radeon_flip_work_func.

2. Hotunplug of display causes the driver to DPMS OFF
   the unplugged display. Display engine shuts down,
   scanout no longer moves, but stays at its resting
   position at start line of vblank.

3. radeon_flip_work_func executes while crtc is off, and
   due to the non-moving scanout position, the new flip
   delay code introduced into Linux 4.4 by
   commit 5b5561b3660d ("drm/radeon: Fixup hw vblank counter/ts..")
   enters an infinite wait loop.

4. After reconnecting the display, the pageflip remains
   stuck in step 3, and the display doesn't update its view
   of the desktop.
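
The scenario above can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions, not the driver's logic: `struct fake_crtc`, `fake_wait_cycle` and `wait_before_flip` are hypothetical names, and the real exit test of the wait loop is replaced by a plain counter. It only shows why a frozen scanout never leaves an unbounded loop, and how an enabled check plus a cycle budget bound the wait.

```c
#include <stdbool.h>

#define MAX_WAIT_CYCLES 4u	/* same budget as repcnt in the patch */

/* Hypothetical model of a CRTC; not the radeon_crtc structure. */
struct fake_crtc {
	bool enabled;		/* false once the display is switched off */
	bool scanout_moving;	/* false when the display engine is shut down */
	int lines_to_wait;	/* > 0 while the flip should still be delayed */
};

/* Stand-in for sleeping one wait cycle: only a running scanout advances. */
static void fake_wait_cycle(struct fake_crtc *crtc)
{
	if (crtc->scanout_moving && crtc->lines_to_wait > 0)
		crtc->lines_to_wait--;
}

/* Returns the number of wait cycles spent before the flip is programmed. */
static unsigned wait_before_flip(struct fake_crtc *crtc)
{
	unsigned waited = 0;

	/* Skip waiting entirely on a disabled crtc, and never exceed the budget. */
	while (crtc->enabled && waited < MAX_WAIT_CYCLES) {
		if (crtc->lines_to_wait <= 0)
			break;	/* safe to flip now */
		fake_wait_cycle(crtc);
		waited++;
	}

	return waited;	/* the caller programs the flip regardless */
}
```

In this model a running scanout that needs two cycles waits exactly twice, a frozen but still enabled crtc gives up after four cycles instead of looping forever, and a disabled crtc does not wait at all.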

This patch fixes the Linux 4.4 regression from fdo bug #93746:

<https://bugs.freedesktop.org/show_bug.cgi?id=93746>

v2: Skip wait immediately if !radeon_crtc->enabled, as
    suggested by Michel.

Reported-by: Bernd Steinhauser <linux@bernd-steinhauser.de>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Bernd Steinhauser <linux@bernd-steinhauser.de>

Cc: <stable@vger.kernel.org> # 4.4+
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/radeon/radeon_display.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Comments

Michel Dänzer Feb. 19, 2016, 1:16 a.m. UTC | #1
On 19.02.2016 10:06, Mario Kleiner wrote:
> [...]

Both patches are

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

Alex Deucher Feb. 19, 2016, 10:17 p.m. UTC | #2
On Thu, Feb 18, 2016 at 8:16 PM, Michel Dänzer <michel@daenzer.net> wrote:
> On 19.02.2016 10:06, Mario Kleiner wrote:
>> [...]
>
> Both patches are
>
> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>


Applied.  Thanks!

Alex

Patch

diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index b3bb923..1fab4b9 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -403,7 +403,8 @@  static void radeon_flip_work_func(struct work_struct *__work)
 	struct drm_crtc *crtc = &radeon_crtc->base;
 	unsigned long flags;
 	int r;
-	int vpos, hpos, stat, min_udelay;
+	int vpos, hpos, stat, min_udelay = 0;
+	unsigned repcnt = 4;
 	struct drm_vblank_crtc *vblank = &crtc->dev->vblank[work->crtc_id];
 
         down_read(&rdev->exclusive_lock);
@@ -454,7 +455,7 @@  static void radeon_flip_work_func(struct work_struct *__work)
 	 * In practice this won't execute very often unless on very fast
 	 * machines because the time window for this to happen is very small.
 	 */
-	for (;;) {
+	while (radeon_crtc->enabled && repcnt--) {
 		/* GET_DISTANCE_TO_VBLANKSTART returns distance to real vblank
 		 * start in hpos, and to the "fudged earlier" vblank start in
 		 * vpos.
@@ -472,10 +473,22 @@  static void radeon_flip_work_func(struct work_struct *__work)
 		/* Sleep at least until estimated real start of hw vblank */
 		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
 		min_udelay = (-hpos + 1) * max(vblank->linedur_ns / 1000, 5);
+		if (min_udelay > vblank->framedur_ns / 2000) {
+			/* Don't wait ridiculously long - something is wrong */
+			repcnt = 0;
+			break;
+		}
 		usleep_range(min_udelay, 2 * min_udelay);
 		spin_lock_irqsave(&crtc->dev->event_lock, flags);
 	};
 
+	if (!repcnt)
+		DRM_DEBUG_DRIVER("Delay problem on crtc %d: min_udelay %d, "
+				 "framedur %d, linedur %d, stat %d, vpos %d, "
+				 "hpos %d\n", work->crtc_id, min_udelay,
+				 vblank->framedur_ns / 1000,
+				 vblank->linedur_ns / 1000, stat, vpos, hpos);
+
 	/* do the flip (mmio) */
 	radeon_page_flip(rdev, radeon_crtc->crtc_id, work->base);