diff mbox series

drm/drm_vblank: use drm_warn_once() to warn undefined mode timing

Message ID 20201016071254.2681-1-shawn.guo@linaro.org
State New, archived
Headers show
Series drm/drm_vblank: use drm_warn_once() to warn undefined mode timing | expand

Commit Message

Shawn Guo Oct. 16, 2020, 7:12 a.m. UTC
Commit 5caa0feafcc6 ("drm/vblank: Lock down vblank->hwmode more") added
WARN_ON_ONCE() for atomic drivers to warn the case that vsync is enabled
before a mode has been set on CRTC.  This happens sometimes during the
initial mode setting of a CRTC.  It also happens on Android running HWC2
backed with drm_hwcomposer, where HWC2::SetVsyncEnabled could be called
before the atomic mode setting on CRTC happens.

In this case, there is nothing really bad to happen as kernel function
returns as no-op.  So using WARN() version might be overkilled,
considering some user space crash reporting services may treat kernel
WARNINGS as crashes.  Let's drop WARN_ON_ONCE() and change drm_dbg_core()
to drm_warn_once() for warning undefined mode timing.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/gpu/drm/drm_vblank.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Daniel Vetter Oct. 16, 2020, 7:58 a.m. UTC | #1
On Fri, Oct 16, 2020 at 9:13 AM Shawn Guo <shawn.guo@linaro.org> wrote:
>
> Commit 5caa0feafcc6 ("drm/vblank: Lock down vblank->hwmode more") added
> WARN_ON_ONCE() for atomic drivers to warn the case that vsync is enabled
> before a mode has been set on CRTC.  This happens sometimes during the
> initial mode setting of a CRTC.  It also happens on Android running HWC2
> backed with drm_hwcomposer, where HWC2::SetVsyncEnabled could be called
> before the atomic mode setting on CRTC happens.
>
> In this case, there is nothing really bad to happen as kernel function
> returns as no-op.  So using WARN() version might be overkilled,
> considering some user space crash reporting services may treat kernel
> WARNINGS as crashes.  Let's drop WARN_ON_ONCE() and change drm_dbg_core()
> to drm_warn_once() for warning undefined mode timing.

This indicates a bug in your driver. Please fix it there, not by
shutting up the core code complaining about that. Either you're
getting vblank timestamps when the vblank isn't set up yet
(drm_crtc_vblank_on/off) or there's some other race going on in your
driver code resulting in this.
-Daniel

> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---
>  drivers/gpu/drm/drm_vblank.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index b18e1efbbae1..5f7c4a923e8a 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -717,9 +717,9 @@ drm_crtc_vblank_helper_get_vblank_timestamp_internal(
>          * Happens during initial modesetting of a crtc.
>          */
>         if (mode->crtc_clock == 0) {
> -               drm_dbg_core(dev, "crtc %u: Noop due to uninitialized mode.\n",
> -                            pipe);
> -               drm_WARN_ON_ONCE(dev, drm_drv_uses_atomic_modeset(dev));
> +               drm_warn_once(dev, "crtc %u: Noop due to uninitialized mode.\n",
> +                             pipe);
> +
>                 return false;
>         }
>
> --
> 2.17.1
>
Shawn Guo Oct. 16, 2020, 8:54 a.m. UTC | #2
On Fri, Oct 16, 2020 at 09:58:46AM +0200, Daniel Vetter wrote:
> On Fri, Oct 16, 2020 at 9:13 AM Shawn Guo <shawn.guo@linaro.org> wrote:
> >
> > Commit 5caa0feafcc6 ("drm/vblank: Lock down vblank->hwmode more") added
> > WARN_ON_ONCE() for atomic drivers to warn the case that vsync is enabled
> > before a mode has been set on CRTC.  This happens sometimes during the
> > initial mode setting of a CRTC.  It also happens on Android running HWC2
> > backed with drm_hwcomposer, where HWC2::SetVsyncEnabled could be called
> > before the atomic mode setting on CRTC happens.
> >
> > In this case, there is nothing really bad to happen as kernel function
> > returns as no-op.  So using WARN() version might be overkilled,
> > considering some user space crash reporting services may treat kernel
> > WARNINGS as crashes.  Let's drop WARN_ON_ONCE() and change drm_dbg_core()
> > to drm_warn_once() for warning undefined mode timing.
> 
> This indicates a bug in your driver. Please fix it there, not by
> shutting up the core code complaining about that. Either you're
> getting vblank timestamps when the vblank isn't set up yet
> (drm_crtc_vblank_on/off) or there's some other race going on in your
> driver code resulting in this.

Thanks for the comment, Daniel.

I'm hitting this warning on an Android running drm_hwcomposer.  I'm
indeed getting vblank timestamps request before drm_crtc_vblank_on() is
called.  I'm not sure this is a bug or race condition in the driver
code, as both vblank timestamps and on/off requests are coming from user
space ioctl for my case.  @Sean, that means the problem is in Android
drm_hwcomposer code?

Shawn
Daniel Vetter Oct. 16, 2020, 9:30 a.m. UTC | #3
On Fri, Oct 16, 2020 at 10:54 AM Shawn Guo <shawn.guo@linaro.org> wrote:
>
> On Fri, Oct 16, 2020 at 09:58:46AM +0200, Daniel Vetter wrote:
> > On Fri, Oct 16, 2020 at 9:13 AM Shawn Guo <shawn.guo@linaro.org> wrote:
> > >
> > > Commit 5caa0feafcc6 ("drm/vblank: Lock down vblank->hwmode more") added
> > > WARN_ON_ONCE() for atomic drivers to warn the case that vsync is enabled
> > > before a mode has been set on CRTC.  This happens sometimes during the
> > > initial mode setting of a CRTC.  It also happens on Android running HWC2
> > > backed with drm_hwcomposer, where HWC2::SetVsyncEnabled could be called
> > > before the atomic mode setting on CRTC happens.
> > >
> > > In this case, there is nothing really bad to happen as kernel function
> > > returns as no-op.  So using WARN() version might be overkilled,
> > > considering some user space crash reporting services may treat kernel
> > > WARNINGS as crashes.  Let's drop WARN_ON_ONCE() and change drm_dbg_core()
> > > to drm_warn_once() for warning undefined mode timing.
> >
> > This indicates a bug in your driver. Please fix it there, not by
> > shutting up the core code complaining about that. Either you're
> > getting vblank timestamps when the vblank isn't set up yet
> > (drm_crtc_vblank_on/off) or there's some other race going on in your
> > driver code resulting in this.
>
> Thanks for the comment, Daniel.
>
> I'm hitting this warning on an Android running drm_hwcomposer.  I'm
> indeed getting vblank timestamps request before drm_crtc_vblank_on() is
> called.  I'm not sure this is a bug or race condition in the driver
> code, as both vblank timestamps and on/off requests are coming from user
> space ioctl for my case.  @Sean, that means the problem is in Android
> drm_hwcomposer code?

vblank request when the crtc is off should be rejected. Most drivers
got this wrong before I added the required drm_crtc_vblank_reset()
into atomic helpers in 51f644b40b4b ("drm/atomic-helper: reset vblank
on crtc reset")

Please make sure you have that, and that drm_crtc_vblank_reset is run
at driver load time. If the crtc is off, vblank ioctl should be
rejected. So this is definitely not a userspace bug, still a driver
bug. In general, userspace is not allowed to do anything that results
in dmesg spam at normal log levels. Anytime that happens it's a kernel
bug. And if it's a warning in core code, it's most likely a driver bug
since the core code tends to be better debugged about these things.
But there's ofc exceptions.

Cheers, Daniel
Shawn Guo Oct. 16, 2020, 11:46 a.m. UTC | #4
On Fri, Oct 16, 2020 at 11:30:04AM +0200, Daniel Vetter wrote:
> On Fri, Oct 16, 2020 at 10:54 AM Shawn Guo <shawn.guo@linaro.org> wrote:
> >
> > On Fri, Oct 16, 2020 at 09:58:46AM +0200, Daniel Vetter wrote:
> > > On Fri, Oct 16, 2020 at 9:13 AM Shawn Guo <shawn.guo@linaro.org> wrote:
> > > >
> > > > Commit 5caa0feafcc6 ("drm/vblank: Lock down vblank->hwmode more") added
> > > > WARN_ON_ONCE() for atomic drivers to warn the case that vsync is enabled
> > > > before a mode has been set on CRTC.  This happens sometimes during the
> > > > initial mode setting of a CRTC.  It also happens on Android running HWC2
> > > > backed with drm_hwcomposer, where HWC2::SetVsyncEnabled could be called
> > > > before the atomic mode setting on CRTC happens.
> > > >
> > > > In this case, there is nothing really bad to happen as kernel function
> > > > returns as no-op.  So using WARN() version might be overkilled,
> > > > considering some user space crash reporting services may treat kernel
> > > > WARNINGS as crashes.  Let's drop WARN_ON_ONCE() and change drm_dbg_core()
> > > > to drm_warn_once() for warning undefined mode timing.
> > >
> > > This indicates a bug in your driver. Please fix it there, not by
> > > shutting up the core code complaining about that. Either you're
> > > getting vblank timestamps when the vblank isn't set up yet
> > > (drm_crtc_vblank_on/off) or there's some other race going on in your
> > > driver code resulting in this.
> >
> > Thanks for the comment, Daniel.
> >
> > I'm hitting this warning on an Android running drm_hwcomposer.  I'm
> > indeed getting vblank timestamps request before drm_crtc_vblank_on() is
> > called.  I'm not sure this is a bug or race condition in the driver
> > code, as both vblank timestamps and on/off requests are coming from user
> > space ioctl for my case.  @Sean, that means the problem is in Android
> > drm_hwcomposer code?
> 
> vblank request when the crtc is off should be rejected. Most drivers
> got this wrong before I added the required drm_crtc_vblank_reset()
> into atomic helpers in 51f644b40b4b ("drm/atomic-helper: reset vblank
> on crtc reset")
> 
> Please make sure you have that, and that drm_crtc_vblank_reset is run
> at driver load time. If the crtc is off, vblank ioctl should be
> rejected. So this is definitely not a userspace bug, still a driver
> bug. In general, userspace is not allowed to do anything that results
> in dmesg spam at normal log levels. Anytime that happens it's a kernel
> bug. And if it's a warning in core code, it's most likely a driver bug
> since the core code tends to be better debugged about these things.
> But there's ofc exceptions.

Indeed!  Adding drm_crtc_vblank_reset() into driver crtc reset hook
removes the WARNING for me.  Really appreciate your comments, Daniel!

Shawn
Daniel Vetter Oct. 19, 2020, 3:48 p.m. UTC | #5
On Fri, Oct 16, 2020 at 07:46:41PM +0800, Shawn Guo wrote:
> On Fri, Oct 16, 2020 at 11:30:04AM +0200, Daniel Vetter wrote:
> > On Fri, Oct 16, 2020 at 10:54 AM Shawn Guo <shawn.guo@linaro.org> wrote:
> > >
> > > On Fri, Oct 16, 2020 at 09:58:46AM +0200, Daniel Vetter wrote:
> > > > On Fri, Oct 16, 2020 at 9:13 AM Shawn Guo <shawn.guo@linaro.org> wrote:
> > > > >
> > > > > Commit 5caa0feafcc6 ("drm/vblank: Lock down vblank->hwmode more") added
> > > > > WARN_ON_ONCE() for atomic drivers to warn the case that vsync is enabled
> > > > > before a mode has been set on CRTC.  This happens sometimes during the
> > > > > initial mode setting of a CRTC.  It also happens on Android running HWC2
> > > > > backed with drm_hwcomposer, where HWC2::SetVsyncEnabled could be called
> > > > > before the atomic mode setting on CRTC happens.
> > > > >
> > > > > In this case, there is nothing really bad to happen as kernel function
> > > > > returns as no-op.  So using WARN() version might be overkilled,
> > > > > considering some user space crash reporting services may treat kernel
> > > > > WARNINGS as crashes.  Let's drop WARN_ON_ONCE() and change drm_dbg_core()
> > > > > to drm_warn_once() for warning undefined mode timing.
> > > >
> > > > This indicates a bug in your driver. Please fix it there, not by
> > > > shutting up the core code complaining about that. Either you're
> > > > getting vblank timestamps when the vblank isn't set up yet
> > > > (drm_crtc_vblank_on/off) or there's some other race going on in your
> > > > driver code resulting in this.
> > >
> > > Thanks for the comment, Daniel.
> > >
> > > I'm hitting this warning on an Android running drm_hwcomposer.  I'm
> > > indeed getting vblank timestamps request before drm_crtc_vblank_on() is
> > > called.  I'm not sure this is a bug or race condition in the driver
> > > code, as both vblank timestamps and on/off requests are coming from user
> > > space ioctl for my case.  @Sean, that means the problem is in Android
> > > drm_hwcomposer code?
> > 
> > vblank request when the crtc is off should be rejected. Most drivers
> > got this wrong before I added the required drm_crtc_vblank_reset()
> > into atomic helpers in 51f644b40b4b ("drm/atomic-helper: reset vblank
> > on crtc reset")
> > 
> > Please make sure you have that, and that drm_crtc_vblank_reset is run
> > at driver load time. If the crtc is off, vblank ioctl should be
> > rejected. So this is definitely not a userspace bug, still a driver
> > bug. In general, userspace is not allowed to do anything that results
> > in dmesg spam at normal log levels. Anytime that happens it's a kernel
> > bug. And if it's a warning in core code, it's most likely a driver bug
> > since the core code tends to be better debugged about these things.
> > But there's ofc exceptions.
> 
> Indeed!  Adding drm_crtc_vblank_reset() into driver crtc reset hook
> removes the WARNING for me.  Really appreciate your comments, Daniel!

This should work automatically if you're using the atomic state helpers.
Please try to unify as much as possible, so you're not missing any future
bugfixes in this area. Although I thought I've completely reviewed all
drivers to fix as many as possible for this bug. Are you on some older
kernel that didn't have this? Maybe good idea to submit the bugfix in
upstream to stable, if that's the case.
-Daniel
Shawn Guo Oct. 21, 2020, 9:11 a.m. UTC | #6
On Mon, Oct 19, 2020 at 05:48:29PM +0200, Daniel Vetter wrote:
> On Fri, Oct 16, 2020 at 07:46:41PM +0800, Shawn Guo wrote:
> > Indeed!  Adding drm_crtc_vblank_reset() into driver crtc reset hook
> > removes the WARNING for me.  Really appreciate your comments, Daniel!
> 
> This should work automatically if you're using the atomic state helpers.
> Please try to unify as much as possible, so you're not missing any future
> bugfixes in this area. Although I thought I've completely reviewed all
> drivers to fix as many as possible for this bug. Are you on some older
> kernel that didn't have this? Maybe good idea to submit the bugfix in
> upstream to stable, if that's the case.

Yes, I'm running a 4.19 stable kernel with msm/mdp5 device.  The bug was
addressed (unintentionally?) by one patch that is part of the series
adding Nexus 5 display support.  And later on, you commit 51f644b40b4b
("drm/atomic-helper: reset vblank on crtc reset") switches the driver to
use atomic state helper.

Shawn

[1] https://lore.kernel.org/patchwork/patch/1082564/
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index b18e1efbbae1..5f7c4a923e8a 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -717,9 +717,9 @@  drm_crtc_vblank_helper_get_vblank_timestamp_internal(
 	 * Happens during initial modesetting of a crtc.
 	 */
 	if (mode->crtc_clock == 0) {
-		drm_dbg_core(dev, "crtc %u: Noop due to uninitialized mode.\n",
-			     pipe);
-		drm_WARN_ON_ONCE(dev, drm_drv_uses_atomic_modeset(dev));
+		drm_warn_once(dev, "crtc %u: Noop due to uninitialized mode.\n",
+			      pipe);
+
 		return false;
 	}