diff mbox

[1/2] Revert "drm/radeon: remove drm_vblank_get|put from pflip handling"

Message ID 53A94D84.1070305@daenzer.net (mailing list archive)
State New, archived
Headers show

Commit Message

Michel Dänzer June 24, 2014, 10:05 a.m. UTC
On 24.06.2014 05:32, Dieter Nützel wrote:
> Am 23.06.2014 21:46, schrieb Dieter Nützel:
>> Am 23.06.2014 11:34, schrieb Michel Dänzer:
>>> On 18.06.2014 18:14, Christian König wrote:
>>>> Am 18.06.2014 07:53, schrieb Michel Dänzer:
>>>>>
>>>>>   (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip completion
>>>>> event has impossible msc [x-1] < target_msc [x]
>>>>>
>>>>> messages in the X log file which have been popping up in bug reports
>>>>> lately.
>>>>> This also results in 0s being returned to the client for the MSC and
>>>>> timestamp of the swap completion, which could cause all kinds of bad
>>>>> behaviour.
>>>> First of all thanks for looking into it. Are you getting this on
>>>> 3.16 or
>>>> 3.15?
>>>
>>> I haven't actually run into this myself yet. I thought I'd seen it in
>>> several bug reports, but right now I can only find
>>> https://bugs.freedesktop.org/show_bug.cgi?id=80029#c17 , which seems to
>>> include the page flipping changes from 3.16.
>>
>> With 3.16-rc2 I get it now on my RV730 AGP as in the above bug report.
>> But only the lines in Xorg.0.log.
>> NO signs of any damage/error in use.
>>
>> Since 3.15 and 3.16 (rc2 only) my system is rock solid.
>>
>> I've tried 3.15-rc7 + Christian's pflip rework (did some little
>> handwork), too.
>> It was solid but I saw the reported flip/black distortion in the below
>> part during Kwin 4.13 cube screen effect (rotation). Your fix for
>> 3.16-rc1 fixed that.

That's good to hear.


> I can reliable generate such lines in Xorg.0.log with KWin cube desktop
> effect.
> 
> Rotate screens with mouse wheel or screen switcher => new entry in
> Xorg.0.log. If it happens I notice ('see') flip delay.

I was only able to reproduce it a couple of times even with that, but not
at all yet with the patch below. Does it help for you as well?


>>>> I don't think that the pflip irq is thrown earlier than the vblank, but
>>>> on 3.16 it might actually be that we program the flip so fast into the
>>>> hardware that we do it one frame earlier than planned.
>>>
>>> So userspace is notified of the previous vertical blank period and calls
>>> the page flip ioctl in response, which then manages to program the
>>> scanout address update into the hardware before the scanout address
>>> update is latched during the previous vertical blank period?

I think there's another possible scenario:

 1. Userspace submits page flip intended for MSC x
 2. The vertical blank interrupt is triggered for MSC x =>
    radeon_crtc_handle_vblank() => radeon_crtc_handle_flip()
 3. Userspace submits page flip intended for MSC (x + 1)
 4. The page flip interrupt is triggered for the previous flip =>
    radeon_crtc_handle_flip() => drm_send_vblank_event(). The second flip
    hasn't actually executed yet, and the event has MSC x instead of (x + 1)
    as expected by userspace.

If that is the case, only actually enabling and handling the page flip
interrupt when a flip is pending might also avoid it. I can hack that up
tomorrow, if Christian doesn't beat me to it.

Comments

Dieter Nützel June 24, 2014, 7:58 p.m. UTC | #1
Am 24.06.2014 12:05, schrieb Michel Dänzer:
> On 24.06.2014 05:32, Dieter Nützel wrote:
>> Am 23.06.2014 21:46, schrieb Dieter Nützel:
>>> Am 23.06.2014 11:34, schrieb Michel Dänzer:
>>>> On 18.06.2014 18:14, Christian König wrote:
>>>>> Am 18.06.2014 07:53, schrieb Michel Dänzer:
>>>>>> 
>>>>>>   (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip 
>>>>>> completion
>>>>>> event has impossible msc [x-1] < target_msc [x]
>>>>>> 
>>>>>> messages in the X log file which have been popping up in bug 
>>>>>> reports
>>>>>> lately.
>>>>>> This also results in 0s being returned to the client for the MSC 
>>>>>> and
>>>>>> timestamp of the swap completion, which could cause all kinds of 
>>>>>> bad
>>>>>> behaviour.
>>>>> First of all thanks for looking into it. Are you getting this on
>>>>> 3.16 or
>>>>> 3.15?
>>>> 
>>>> I haven't actually run into this myself yet. I thought I'd seen it 
>>>> in
>>>> several bug reports, but right now I can only find
>>>> https://bugs.freedesktop.org/show_bug.cgi?id=80029#c17 , which seems 
>>>> to
>>>> include the page flipping changes from 3.16.
>>> 
>>> With 3.16-rc2 I get it now on my RV730 AGP as in the above bug 
>>> report.
>>> But only the lines in Xorg.0.log.
>>> NO signs of any damage/error in use.
>>> 
>>> Since 3.15 and 3.16 (rc2 only) my system is rock solid.
>>> 
>>> I've tried 3.15-rc7 + Christian's pflip rework (did some little
>>> handwork), too.
>>> It was solid but I saw the reported flip/black distortion in the 
>>> below
>>> part during Kwin 4.13 cube screen effect (rotation). Your fix for
>>> 3.16-rc1 fixed that.
> 
> That's good to hear.
> 
> 
>> I can reliable generate such lines in Xorg.0.log with KWin cube 
>> desktop
>> effect.
>> 
>> Rotate screens with mouse wheel or screen switcher => new entry in
>> Xorg.0.log. If it happens I notice ('see') flip delay.
> 
> I was only able to reproduce it a couple of times even with that, but 
> not
> at all yet with the patch below. Does it help for you as well?

Will try in the next run.

My daughter generated kernel crash for us.;-)
See would open up a zoom image in Konqi of a new Waveboard for here girl 
friends...

But I could only take images with my mobile.
kernel BUG at drivers/gpu/drm/drm_irq.c:976!
Will send one, have two more.

Greetings,
   Dieter

2nd, try.
This time without image.
Let me know where I should add it, please.

> 
>>>>> I don't think that the pflip irq is thrown earlier than the vblank, 
>>>>> but
>>>>> on 3.16 it might actually be that we program the flip so fast into 
>>>>> the
>>>>> hardware that we do it one frame earlier than planned.
>>>> 
>>>> So userspace is notified of the previous vertical blank period and 
>>>> calls
>>>> the page flip ioctl in response, which then manages to program the
>>>> scanout address update into the hardware before the scanout address
>>>> update is latched during the previous vertical blank period?
> 
> I think there's another possible scenario:
> 
>  1. Userspace submits page flip intended for MSC x
>  2. The vertical blank interrupt is triggered for MSC x =>
>     radeon_crtc_handle_vblank() => radeon_crtc_handle_flip()
>  3. Userspace submits page flip intended for MSC (x + 1)
>  4. The page flip interrupt is triggered for the previous flip =>
>     radeon_crtc_handle_flip() => drm_send_vblank_event(). The second 
> flip
>     hasn't actually executed yet, and the event has MSC x instead of (x 
> + 1)
>     as expected by userspace.
> 
> If that is the case, only actually enabling and handling the page flip
> interrupt when a flip is pending might also avoid it. I can hack that 
> up
> tomorrow, if Christian doesn't beat me to it.
> 
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_display.c
> b/drivers/gpu/drm/radeon/radeon_display.c
> index 8b575a4..8350f8c 100644
> --- a/drivers/gpu/drm/radeon/radeon_display.c
> +++ b/drivers/gpu/drm/radeon/radeon_display.c
> @@ -336,14 +336,19 @@ void radeon_crtc_handle_flip(struct
> radeon_device *rdev, int crtc_id)
>         struct radeon_crtc *radeon_crtc = 
> rdev->mode_info.crtcs[crtc_id];
>         struct radeon_flip_work *work;
>         unsigned long flags;
> +       struct timeval vblank_time;
> +       u32 vblank_seq;
> 
>         /* this can happen at init */
>         if (radeon_crtc == NULL)
>                 return;
> 
> +       vblank_seq = drm_vblank_count_and_time(rdev->ddev, crtc_id,
> &vblank_time);
> +
>         spin_lock_irqsave(&rdev->ddev->event_lock, flags);
>         work = radeon_crtc->flip_work;
> -       if (work == NULL) {
> +       if (work == NULL ||
> +           (vblank_seq - work->event->event.sequence) > (1<<23)) {
>                 spin_unlock_irqrestore(&rdev->ddev->event_lock, flags);
>                 return;
>         }
> @@ -379,6 +384,7 @@ static void radeon_flip_work_func(struct
> work_struct *__work)
> 
>         struct drm_crtc *crtc = &radeon_crtc->base;
>         struct drm_framebuffer *fb = work->fb;
> +       struct timeval vblank_time;
> 
>         uint32_t tiling_flags, pitch_pixels;
>         uint64_t base;
> @@ -466,6 +472,10 @@ static void radeon_flip_work_func(struct
> work_struct *__work)
>                 goto pflip_cleanup;
>         }
> 
> +       work->event->event.sequence =
> +               drm_vblank_count_and_time(crtc->dev, 
> radeon_crtc->crtc_id,
> +                                         &vblank_time) + 1;
> +
>         /* We borrow the event spin lock for protecting flip_work */
>         spin_lock_irqsave(&crtc->dev->event_lock, flags);
Dieter Nützel June 24, 2014, 9:52 p.m. UTC | #2
Am 24.06.2014 12:05, schrieb Michel Dänzer:
> On 24.06.2014 05:32, Dieter Nützel wrote:
>> Am 23.06.2014 21:46, schrieb Dieter Nützel:
>>> Am 23.06.2014 11:34, schrieb Michel Dänzer:
>>>> On 18.06.2014 18:14, Christian König wrote:
>>>>> Am 18.06.2014 07:53, schrieb Michel Dänzer:
>>>>>> 
>>>>>>   (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip 
>>>>>> completion
>>>>>> event has impossible msc [x-1] < target_msc [x]
>>>>>> 
>>>>>> messages in the X log file which have been popping up in bug 
>>>>>> reports
>>>>>> lately.
>>>>>> This also results in 0s being returned to the client for the MSC 
>>>>>> and
>>>>>> timestamp of the swap completion, which could cause all kinds of 
>>>>>> bad
>>>>>> behaviour.
>>>>> First of all thanks for looking into it. Are you getting this on
>>>>> 3.16 or
>>>>> 3.15?
>>>> 
>>>> I haven't actually run into this myself yet. I thought I'd seen it 
>>>> in
>>>> several bug reports, but right now I can only find
>>>> https://bugs.freedesktop.org/show_bug.cgi?id=80029#c17 , which seems 
>>>> to
>>>> include the page flipping changes from 3.16.
>>> 
>>> With 3.16-rc2 I get it now on my RV730 AGP as in the above bug 
>>> report.
>>> But only the lines in Xorg.0.log.
>>> NO signs of any damage/error in use.
>>> 
>>> Since 3.15 and 3.16 (rc2 only) my system is rock solid.
>>> 
>>> I've tried 3.15-rc7 + Christian's pflip rework (did some little
>>> handwork), too.
>>> It was solid but I saw the reported flip/black distortion in the 
>>> below
>>> part during Kwin 4.13 cube screen effect (rotation). Your fix for
>>> 3.16-rc1 fixed that.
> 
> That's good to hear.
> 
> 
>> I can reliable generate such lines in Xorg.0.log with KWin cube 
>> desktop
>> effect.
>> 
>> Rotate screens with mouse wheel or screen switcher => new entry in
>> Xorg.0.log. If it happens I notice ('see') flip delay.
> 
> I was only able to reproduce it a couple of times even with that, but 
> not
> at all yet with the patch below. Does it help for you as well?

You have my Tested-by: for it.
Can't reproduce it any longer with your patch below.
Even that it didn't apply ontop of 3.16-rc2, but
most of the time I know what I'm doing...;-)

Now some little Fußball watching!

Cheers,
Dieter

> diff --git a/drivers/gpu/drm/radeon/radeon_display.c
> b/drivers/gpu/drm/radeon/radeon_display.c
> index 8b575a4..8350f8c 100644
> --- a/drivers/gpu/drm/radeon/radeon_display.c
> +++ b/drivers/gpu/drm/radeon/radeon_display.c
> @@ -336,14 +336,19 @@ void radeon_crtc_handle_flip(struct
> radeon_device *rdev, int crtc_id)
>         struct radeon_crtc *radeon_crtc = 
> rdev->mode_info.crtcs[crtc_id];
>         struct radeon_flip_work *work;
>         unsigned long flags;
> +       struct timeval vblank_time;
> +       u32 vblank_seq;
> 
>         /* this can happen at init */
>         if (radeon_crtc == NULL)
>                 return;
> 
> +       vblank_seq = drm_vblank_count_and_time(rdev->ddev, crtc_id,
> &vblank_time);
> +
>         spin_lock_irqsave(&rdev->ddev->event_lock, flags);
>         work = radeon_crtc->flip_work;
> -       if (work == NULL) {
> +       if (work == NULL ||
> +           (vblank_seq - work->event->event.sequence) > (1<<23)) {
>                 spin_unlock_irqrestore(&rdev->ddev->event_lock, flags);
>                 return;
>         }
> @@ -379,6 +384,7 @@ static void radeon_flip_work_func(struct
> work_struct *__work)
> 
>         struct drm_crtc *crtc = &radeon_crtc->base;
>         struct drm_framebuffer *fb = work->fb;
> +       struct timeval vblank_time;
> 
>         uint32_t tiling_flags, pitch_pixels;
>         uint64_t base;
> @@ -466,6 +472,10 @@ static void radeon_flip_work_func(struct
> work_struct *__work)
>                 goto pflip_cleanup;
>         }
> 
> +       work->event->event.sequence =
> +               drm_vblank_count_and_time(crtc->dev, 
> radeon_crtc->crtc_id,
> +                                         &vblank_time) + 1;
> +
>         /* We borrow the event spin lock for protecting flip_work */
>         spin_lock_irqsave(&crtc->dev->event_lock, flags);
diff mbox

Patch

diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 8b575a4..8350f8c 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -336,14 +336,19 @@  void radeon_crtc_handle_flip(struct radeon_device *rdev, int crtc_id)
        struct radeon_crtc *radeon_crtc = rdev->mode_info.crtcs[crtc_id];
        struct radeon_flip_work *work;
        unsigned long flags;
+       struct timeval vblank_time;
+       u32 vblank_seq;

        /* this can happen at init */
        if (radeon_crtc == NULL)
                return;

+       vblank_seq = drm_vblank_count_and_time(rdev->ddev, crtc_id, &vblank_time);
+
        spin_lock_irqsave(&rdev->ddev->event_lock, flags);
        work = radeon_crtc->flip_work;
-       if (work == NULL) {
+       if (work == NULL ||
+           (vblank_seq - work->event->event.sequence) > (1<<23)) {
                spin_unlock_irqrestore(&rdev->ddev->event_lock, flags);
                return;
        }
@@ -379,6 +384,7 @@  static void radeon_flip_work_func(struct work_struct *__work)

        struct drm_crtc *crtc = &radeon_crtc->base;
        struct drm_framebuffer *fb = work->fb;
+       struct timeval vblank_time;

        uint32_t tiling_flags, pitch_pixels;
        uint64_t base;
@@ -466,6 +472,10 @@  static void radeon_flip_work_func(struct work_struct *__work)
                goto pflip_cleanup;
        }

+       work->event->event.sequence =
+               drm_vblank_count_and_time(crtc->dev, radeon_crtc->crtc_id,
+                                         &vblank_time) + 1;
+
        /* We borrow the event spin lock for protecting flip_work */
        spin_lock_irqsave(&crtc->dev->event_lock, flags);