diff mbox series

drm/vkms: Don't warn hrtimer_forward_now failure.

Message ID 20200525143849.4964-1-penguin-kernel@I-love.SAKURA.ne.jp (mailing list archive)
State New, archived
Headers show
Series drm/vkms: Don't warn hrtimer_forward_now failure. | expand

Commit Message

Tetsuo Handa May 25, 2020, 2:38 p.m. UTC
Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
hrtimers") introduced ret_overrun variable. And that variable was an
unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).

Now, syzbot is hitting this WARN_ON() using a simple reproducer that
does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
and a debug printk() patch says that syzbot is getting

   output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
   output->period_ns=0
   ret_overrun=216994

. I can't understand what "verify the hrtimer_forward_now return" in
that commit wants to say. hrtimer_forward_now() must return, and the
return value of hrtimer_forward_now() is not a boolean. Why comparing
with 1 ? Anyway, this failure is not something that worth crashing the
system. Let's remove the ret_overrun variable and WARN_ON() test.

Link: https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot+0871b14ca2e2fb64f6e3@syzkaller.appspotmail.com
Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
---
 drivers/gpu/drm/vkms/vkms_crtc.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

Comments

Tetsuo Handa May 26, 2020, 4:18 a.m. UTC | #1
On 2020/05/26 2:00, Daniel Vetter wrote:
> Forgot to add: I did run this quickly with vkms as secondary output.
> Good fireworks show, but there's an entire army of additional splats
> after the first one. The WARN_ON you're removing is only the
> messenger, the real bug is probably one of the later backtraces. Or at
> least points more clearly at the real bug.
> -Daniel
> 
>>
>> Debug printk() patch shows that hrtimer_forward_now() can return larger than 1.
>> What is the reason you are expecting hrtimer_forward_now() to always return 1 ?
>>
>>>
>>> If the vkms driver is loaded, and there's nothing else going on, then what
>>> I expect to happen is that this virtual hw is entirely off. And in that
>>> case, the vblank ioctl should be rejected outright. So there's definitely
>>> something fishy going on to begin with.
>>>
>>> If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
>>> then the vblank wait shouldn't just blow up like this.

I see. Another debug printk() patch ( https://syzkaller.appspot.com/text?tag=Patch&x=135f254a100000 ) found that

  [   86.613767][ T7837] [drm:drm_stub_open] 
  [   86.621592][ T7837] [drm:drm_open] pid = 7837, minor = 1
  [   86.651632][ T7837] [drm:drm_ioctl] pid=7837, dev=0xe201, auth=1, DRM_IOCTL_WAIT_VBLANK
  [   86.663391][ T7837] [drm:drm_calc_timestamping_constants] *ERROR* crtc 32: Can't calculate constants, dotclock = 0!
  [   86.679151][ T7837] [drm:drm_calc_timestamping_constants] crtc 32: hwmode: htotal 0, vtotal 0, vdisplay 0
  [   86.688865][ T7837] [drm:drm_calc_timestamping_constants] crtc 32: clock 0 kHz framedur 0 linedur 0
  [   86.688899][ T7837] Starting out->vblank_hrtimer : out->period_ns=0 vblank->framedur_ns=0
  [   86.689348][ T7837] [drm:drm_vblank_enable] enabling vblank on crtc 0, ret: 0
  [   86.706883][ T7837] [drm:drm_update_vblank_count] updating vblank count on crtc 0: current=0, diff=0, hw=0 hw_last=0
  [   86.706962][    C1] now=86714983667 interval=0 ret_overrun=17693375

hrtimer_start(&out->vblank_hrtimer, 0, HRTIMER_MODE_REL) from vkms_enable_vblank() is
causing immediate vkms_vblank_simulate() callback. The reason of vblank->framedur_ns == 0
is that drm_calc_timestamping_constants() fails to calculate framedur_ns (which is 0 by default)
due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 might be able
to solve this problem.

By the way, if you have an environment that can reproduce this problem, you can
test with debug printk() patch above applied.
Daniel Vetter May 26, 2020, 11:11 a.m. UTC | #2
On Tue, May 26, 2020 at 6:39 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/05/26 13:18, Tetsuo Handa wrote:
> > due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 might be able
> > to solve this problem.
>
> Well, I came to think that vkms_enable_vblank() should return an error to the caller
> when drm_calc_timestamping_constants() failed...

If my memory is right we shouldn't even get there. crtc->mode being
all zeros sounds like the simulated output isn't on (no surprise,
syzbot doesn't enable it and fbcon is probably on card0), so something
higher up should reject this. I'll see whether I can figure out what
vkms isn't doing right (or whether there's a higher level bug in
drm_vblank.c code), yesterday my machine died and already evening
anyway.

Thanks for digging into this a bit more meanwhile.

Cheers, Daniel

>
> static int vkms_enable_vblank(struct drm_crtc *crtc)
> {
>         struct drm_device *dev = crtc->dev;
>         unsigned int pipe = drm_crtc_index(crtc);
>         struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
>         struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
>
>         drm_calc_timestamping_constants(crtc, &crtc->mode);
>
>         hrtimer_init(&out->vblank_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>         out->vblank_hrtimer.function = &vkms_vblank_simulate;
>         out->period_ns = ktime_set(0, vblank->framedur_ns);
>         hrtimer_start(&out->vblank_hrtimer, out->period_ns, HRTIMER_MODE_REL);
>
>         return 0;
> }
Daniel Vetter May 27, 2020, 8:27 a.m. UTC | #3
On Tue, May 26, 2020 at 01:11:28PM +0200, Daniel Vetter wrote:
> On Tue, May 26, 2020 at 6:39 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > On 2020/05/26 13:18, Tetsuo Handa wrote:
> > > due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 might be able
> > > to solve this problem.
> >
> > Well, I came to think that vkms_enable_vblank() should return an error to the caller
> > when drm_calc_timestamping_constants() failed...
> 
> If my memory is right we shouldn't even get there. crtc->mode being
> all zeros sounds like the simulated output isn't on (no surprise,
> syzbot doesn't enable it and fbcon is probably on card0), so something
> higher up should reject this. I'll see whether I can figure out what
> vkms isn't doing right (or whether there's a higher level bug in
> drm_vblank.c code), yesterday my machine died and already evening
> anyway.
> 
> Thanks for digging into this a bit more meanwhile.

Ok good news, I think I have a fix. Bad news it's code used by like 50
drivers, so this will take some time to get reviewed and merged (and I
think almost all these other drivers are buggy too, which is why I decided
to fix the shared code).

Patch below, can you pls confirm this fixes it?

Thanks, Daniel

commit b882894ac0fce412b67db79a5c92f2b599ec5069
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue May 26 14:29:00 2020 +0200

    drm/atomic-helper: reset vblank on crtc reset
    
    Only when vblanks are supported ofc.
    
    Some drivers do this already, but most unfortunately missed it. This
    opens up bugs after driver load, before the crtc is enabled for the
    first time. syzbot spotted this when loading vkms as a secondary
    output. Given how many drivers are buggy it's best to solve this once
    and for all in shared helper code.
    
    Aside from moving the few existing calls to drm_crtc_vblank_reset into
    helpers (i915 doesn't use helpers, so keeps its own) I think the
    regression risk is minimal: atomic helpers already rely on drivers
    calling drm_crtc_vblank_on/off correctly in their hooks when they
    support vblanks. And driver that's failing to handle vblanks after
    this is missing those calls already, and vblanks could only work by
    accident when enabling a CRTC for the first time right after boot.
    
    FIXME: Fix up drivers and audit them all.
    
    Big thanks to Tetsuo for helping track down what's going wrong here.
    
    Link: https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
    Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Reported-by: syzbot+0871b14ca2e2fb64f6e3@syzkaller.appspotmail.com
    Cc: stable@vger.kernel.org
    Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c
index 8fce6a115dfe..843964252239 100644
--- a/drivers/gpu/drm/drm_atomic_state_helper.c
+++ b/drivers/gpu/drm/drm_atomic_state_helper.c
@@ -32,6 +32,7 @@
 #include <drm/drm_device.h>
 #include <drm/drm_plane.h>
 #include <drm/drm_print.h>
+#include <drm/drm_vblank.h>
 #include <drm/drm_writeback.h>
 
 #include <linux/slab.h>
@@ -93,6 +94,9 @@ __drm_atomic_helper_crtc_reset(struct drm_crtc *crtc,
 	if (crtc_state)
 		__drm_atomic_helper_crtc_state_reset(crtc_state, crtc);
 
+	if (crtc->dev->num_crtcs)
+		drm_crtc_vblank_reset(crtc);
+
 	crtc->state = crtc_state;
 }
 EXPORT_SYMBOL(__drm_atomic_helper_crtc_reset);
diff mbox series

Patch

diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index ac85e17428f8..cc1811ce6092 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -13,12 +13,9 @@  static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 						  vblank_hrtimer);
 	struct drm_crtc *crtc = &output->crtc;
 	struct vkms_crtc_state *state;
-	u64 ret_overrun;
 	bool ret;
 
-	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
-					  output->period_ns);
-	WARN_ON(ret_overrun != 1);
+	hrtimer_forward_now(&output->vblank_hrtimer, output->period_ns);
 
 	spin_lock(&output->lock);
 	ret = drm_crtc_handle_vblank(crtc);