qxl: fix null-pointer crash during suspend

Message ID 20180904202747.14968-1-peter@lekensteyn.nl
State New

Commit Message

Peter Wu Sept. 4, 2018, 8:27 p.m. UTC
"crtc->helper_private" is not initialized by the QXL driver and thus the
"crtc_funcs->disable" call would crash (resulting in suspend failure).
Fix this by converting the suspend/resume functions to use the
drm_mode_config_helper_* helpers.

Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
During suspend the following message is visible from QEMU:

    spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
    spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0

This seems to be triggered by QXL_IO_NOTIFY_CMD after
QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
seem to work (tested with both the GTK and -spice options).

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
---
Hi,

I found this issue while trying to suspend a VM that uses QXL. In order to see
the stack trace over serial, boot with no_console_suspend. Searching for
"qxl_drm_freeze" showed one recent report from Alan:
https://lkml.kernel.org/r/891e334c-cf19-032c-b996-59ac166fcde1@gmail.com

Kind regards,
Peter
---
 drivers/gpu/drm/qxl/qxl_drv.c | 26 +++++---------------------
 1 file changed, 5 insertions(+), 21 deletions(-)

Comments

Fubo Chen Oct. 1, 2018, 8:13 p.m. UTC | #1
On Tue, Sep 4, 2018 at 2:10 PM Peter Wu <peter@lekensteyn.nl> wrote:
>
> "crtc->helper_private" is not initialized by the QXL driver and thus the
> "crtc_funcs->disable" call would crash (resulting in suspend failure).
> Fix this by converting the suspend/resume functions to use the
> drm_mode_config_helper_* helpers.
>
> Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
> During suspend the following message is visible from QEMU:
>
>     spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
>     spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0
>
> This seems to be triggered by QXL_IO_NOTIFY_CMD after
> QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
> seem to work (tested with both the GTK and -spice options).
>
> Signed-off-by: Peter Wu <peter@lekensteyn.nl>

Is this a new issue or something that was introduced a long time ago?
In the latter case, please consider adding a "Cc:
<stable@vger.kernel.org>" tag to this patch.

Thanks,

Fubo.
Peter Wu Oct. 1, 2018, 8:33 p.m. UTC | #2
On Mon, Oct 01, 2018 at 01:13:59PM -0700, Fubo Chen wrote:
> On Tue, Sep 4, 2018 at 2:10 PM Peter Wu <peter@lekensteyn.nl> wrote:
> >
> > "crtc->helper_private" is not initialized by the QXL driver and thus the
> > "crtc_funcs->disable" call would crash (resulting in suspend failure).
> > Fix this by converting the suspend/resume functions to use the
> > drm_mode_config_helper_* helpers.
> >
> > Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
> > During suspend the following message is visible from QEMU:
> >
> >     spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
> >     spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0
> >
> > This seems to be triggered by QXL_IO_NOTIFY_CMD after
> > QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
> > seem to work (tested with both the GTK and -spice options).
> >
> > Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> 
> Is this a new issue or something that was introduced a long time ago?
> In the latter case, please consider adding a "Cc:
> <stable@vger.kernel.org>" tag to this patch.

I am not sure exactly when the issue was introduced, but the original
code was added in v3.10-rc7-800-gd84300bf7934 while the new
drm_mode_config_helper_suspend API was added in 4.16.

The intended call chain to initialize the private object seems to be:
drm_crtc_helper_add
<- qdev_crtc_init
<- qxl_modeset_init
<- qxl_pci_probe

If any error occurs along the call chain, the helper_private pointer
will remain NULL. The same would happen if the crtc were obtained in a
different way (I am not sure how).

I am not sure it is worth backporting: suspend/resume does not seem to
be an important use case for VMs using QXL, and the fix was not
validated on older kernels.
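
To illustrate the call chain described above, here is a minimal userspace sketch, not the driver code. All `model_*` names are hypothetical stand-ins for the kernel types and functions; the only kernel behaviour assumed is that drm_crtc_helper_add stores the funcs pointer in crtc->helper_private. It shows how a failure before that call would leave the pointer NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct model_crtc_helper_funcs {
	void (*disable)(void *crtc);
};

struct model_crtc {
	const struct model_crtc_helper_funcs *helper_private;
};

/* Models drm_crtc_helper_add: it only stores the funcs pointer. */
static void model_crtc_helper_add(struct model_crtc *crtc,
				  const struct model_crtc_helper_funcs *funcs)
{
	assert(crtc != NULL);
	crtc->helper_private = funcs;
}

/* A helper table; its contents do not matter for this sketch. */
static const struct model_crtc_helper_funcs model_funcs = { .disable = NULL };

/*
 * Stand-in for the init chain (probe -> modeset init -> crtc init).
 * If an earlier step fails (fail_early != 0), the helper table is never
 * attached and helper_private stays NULL.
 */
static int model_crtc_init(struct model_crtc *crtc, int fail_early)
{
	crtc->helper_private = NULL;
	if (fail_early)
		return -1;
	model_crtc_helper_add(crtc, &model_funcs);
	return 0;
}
```

In this model, any later code that dereferences helper_private without checking it would only be safe on the success path.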
Daniel Vetter Oct. 2, 2018, 8:14 a.m. UTC | #3
On Tue, Sep 04, 2018 at 10:27:47PM +0200, Peter Wu wrote:
> "crtc->helper_private" is not initialized by the QXL driver and thus the

This is still initialized, it's the ->disable that goes boom. At least the
call to drm_crtc_helper_add is still there. The ->disable was removed in:

commit 64581714b58bc3e16ede8dc37a025c3aa0e0eef1
Author: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Date:   Fri Jun 30 12:36:45 2017 +0300

    drm: Convert atomic drivers from CRTC .disable() to .atomic_disable()

Fixes: 64581714b58b ("drm: Convert atomic drivers from CRTC .disable() to .atomic_disable()")
Cc: <stable@vger.kernel.org> # v4.14+
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

I'll let Gerd pick this one up, after some testing. Also adding Laurent.
-Daniel
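
The distinction made above (the helper table is set, but its ->disable member is not) can be sketched in plain userspace C. This is an illustration under assumptions, not the qxl or DRM code: every `model_*` name is hypothetical, and the guarded dispatch is shown only to make the failure mode visible. The patch itself avoids the problem differently, by calling the drm_mode_config_helper_* helpers.

```c
#include <assert.h>
#include <stddef.h>

struct model_crtc;

/* Hypothetical table with both the legacy and the atomic disable hook. */
struct model_crtc_helper_funcs {
	void (*disable)(struct model_crtc *crtc);        /* legacy hook */
	void (*atomic_disable)(struct model_crtc *crtc); /* atomic hook */
};

struct model_crtc {
	const struct model_crtc_helper_funcs *helper_private;
	int enabled;
};

static void model_atomic_disable(struct model_crtc *crtc)
{
	crtc->enabled = 0;
}

/* After a conversion to the atomic hook, .disable is left NULL. */
static const struct model_crtc_helper_funcs model_atomic_only_funcs = {
	.atomic_disable = model_atomic_disable,
};

/* Returns 1 if calling funcs->disable directly would be safe, else 0. */
static int model_legacy_disable_is_callable(const struct model_crtc *crtc)
{
	return crtc->helper_private != NULL &&
	       crtc->helper_private->disable != NULL;
}

/* Disable through whichever hook exists; 0 on success, -1 if none. */
static int model_disable_crtc(struct model_crtc *crtc)
{
	const struct model_crtc_helper_funcs *funcs;

	assert(crtc != NULL);
	funcs = crtc->helper_private;
	if (funcs == NULL)
		return -1;
	if (funcs->atomic_disable != NULL)
		funcs->atomic_disable(crtc);
	else if (funcs->disable != NULL)
		funcs->disable(crtc);
	else
		return -1;
	return 0;
}
```

With the atomic-only table, helper_private is non-NULL while the legacy hook is NULL, which is the case an unconditional `(*crtc_funcs->disable)(crtc)` call cannot survive.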

> "crtc_funcs->disable" call would crash (resulting in suspend failure).
> Fix this by converting the suspend/resume functions to use the
> drm_mode_config_helper_* helpers.
> 
> Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
> During suspend the following message is visible from QEMU:
> 
>     spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
>     spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0
> 
> This seems to be triggered by QXL_IO_NOTIFY_CMD after
> QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
> seem to work (tested with both the GTK and -spice options).
> 
> Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> ---
> Hi,
> 
> I found this issue while trying to suspend a VM that uses QXL. In order to see
> the stack trace over serial, boot with no_console_suspend. Searching for
> "qxl_drm_freeze" showed one recent report from Alan:
> https://lkml.kernel.org/r/891e334c-cf19-032c-b996-59ac166fcde1@gmail.com
> 
> Kind regards,
> Peter
> ---
>  drivers/gpu/drm/qxl/qxl_drv.c | 26 +++++---------------------
>  1 file changed, 5 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
> index 2445e75cf7ea..d00f45eed03c 100644
> --- a/drivers/gpu/drm/qxl/qxl_drv.c
> +++ b/drivers/gpu/drm/qxl/qxl_drv.c
> @@ -136,20 +136,11 @@ static int qxl_drm_freeze(struct drm_device *dev)
>  {
>  	struct pci_dev *pdev = dev->pdev;
>  	struct qxl_device *qdev = dev->dev_private;
> -	struct drm_crtc *crtc;
> -
> -	drm_kms_helper_poll_disable(dev);
> -
> -	console_lock();
> -	qxl_fbdev_set_suspend(qdev, 1);
> -	console_unlock();
> +	int ret;
>  
> -	/* unpin the front buffers */
> -	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> -		const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private;
> -		if (crtc->enabled)
> -			(*crtc_funcs->disable)(crtc);
> -	}
> +	ret = drm_mode_config_helper_suspend(dev);
> +	if (ret)
> +		return ret;
>  
>  	qxl_destroy_monitors_object(qdev);
>  	qxl_surf_evict(qdev);
> @@ -175,14 +166,7 @@ static int qxl_drm_resume(struct drm_device *dev, bool thaw)
>  	}
>  
>  	qxl_create_monitors_object(qdev);
> -	drm_helper_resume_force_mode(dev);
> -
> -	console_lock();
> -	qxl_fbdev_set_suspend(qdev, 0);
> -	console_unlock();
> -
> -	drm_kms_helper_poll_enable(dev);
> -	return 0;
> +	return drm_mode_config_helper_resume(dev);
>  }
>  
>  static int qxl_pm_suspend(struct device *dev)
> -- 
> 2.18.0
Laurent Pinchart Oct. 2, 2018, 10:05 a.m. UTC | #4
Hello,

On Tuesday, 2 October 2018 11:14:22 EEST Daniel Vetter wrote:
> On Tue, Sep 04, 2018 at 10:27:47PM +0200, Peter Wu wrote:
> > "crtc->helper_private" is not initialized by the QXL driver and thus the
> 
> This is still initialized, it's the ->disable that goes boom. At least the
> call to drm_crtc_helper_add is still there. The ->disable was removed in:
> 
> commit 64581714b58bc3e16ede8dc37a025c3aa0e0eef1
> Author: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> Date:   Fri Jun 30 12:36:45 2017 +0300
> 
>     drm: Convert atomic drivers from CRTC .disable() to .atomic_disable()
> 
> Fixes: 64581714b58b ("drm: Convert atomic drivers from CRTC .disable() to .atomic_disable()")
> Cc: <stable@vger.kernel.org> # v4.14+
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> I'll let Gerd pick this one up, after some testing. Also adding Laurent.

Sorry for breaking it :-( Please let me know if there's something I can do to 
help.

> > "crtc_funcs->disable" call would crash (resulting in suspend failure).
> > Fix this by converting the suspend/resume functions to use the
> > drm_mode_config_helper_* helpers.
> > 
> > Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
> > During suspend the following message is visible from QEMU:
> > 
> >     spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
> >     spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0
> > 
> > This seems to be triggered by QXL_IO_NOTIFY_CMD after
> > QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
> > seem to work (tested with both the GTK and -spice options).
> > 
> > Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> > ---
> > Hi,
> > 
> > I found this issue while trying to suspend a VM that uses QXL. In order to
> > see the stack trace over serial, boot with no_console_suspend. Searching
> > for "qxl_drm_freeze" showed one recent report from Alan:
> > https://lkml.kernel.org/r/891e334c-cf19-032c-b996-59ac166fcde1@gmail.com
> > 
> > Kind regards,
> > Peter
> > ---
> > 
> >  drivers/gpu/drm/qxl/qxl_drv.c | 26 +++++---------------------
> >  1 file changed, 5 insertions(+), 21 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
> > index 2445e75cf7ea..d00f45eed03c 100644
> > --- a/drivers/gpu/drm/qxl/qxl_drv.c
> > +++ b/drivers/gpu/drm/qxl/qxl_drv.c
> > @@ -136,20 +136,11 @@ static int qxl_drm_freeze(struct drm_device *dev)
> > 
> >  {
> >  
> >  	struct pci_dev *pdev = dev->pdev;
> >  	struct qxl_device *qdev = dev->dev_private;
> > 
> > -	struct drm_crtc *crtc;
> > -
> > -	drm_kms_helper_poll_disable(dev);
> > -
> > -	console_lock();
> > -	qxl_fbdev_set_suspend(qdev, 1);
> > -	console_unlock();
> > +	int ret;
> > 
> > -	/* unpin the front buffers */
> > -	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> > -		const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private;
> > -		if (crtc->enabled)
> > -			(*crtc_funcs->disable)(crtc);
> > -	}
> > +	ret = drm_mode_config_helper_suspend(dev);
> > +	if (ret)
> > +		return ret;
> > 
> >  	qxl_destroy_monitors_object(qdev);
> >  	qxl_surf_evict(qdev);
> > 
> > @@ -175,14 +166,7 @@ static int qxl_drm_resume(struct drm_device *dev, bool thaw)
> >  	}
> >  	
> >  	qxl_create_monitors_object(qdev);
> > 
> > -	drm_helper_resume_force_mode(dev);
> > -
> > -	console_lock();
> > -	qxl_fbdev_set_suspend(qdev, 0);
> > -	console_unlock();
> > -
> > -	drm_kms_helper_poll_enable(dev);
> > -	return 0;
> > +	return drm_mode_config_helper_resume(dev);
> > 
> >  }
> >  
> >  static int qxl_pm_suspend(struct device *dev)

Patch

diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index 2445e75cf7ea..d00f45eed03c 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -136,20 +136,11 @@  static int qxl_drm_freeze(struct drm_device *dev)
 {
 	struct pci_dev *pdev = dev->pdev;
 	struct qxl_device *qdev = dev->dev_private;
-	struct drm_crtc *crtc;
-
-	drm_kms_helper_poll_disable(dev);
-
-	console_lock();
-	qxl_fbdev_set_suspend(qdev, 1);
-	console_unlock();
+	int ret;
 
-	/* unpin the front buffers */
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private;
-		if (crtc->enabled)
-			(*crtc_funcs->disable)(crtc);
-	}
+	ret = drm_mode_config_helper_suspend(dev);
+	if (ret)
+		return ret;
 
 	qxl_destroy_monitors_object(qdev);
 	qxl_surf_evict(qdev);
@@ -175,14 +166,7 @@  static int qxl_drm_resume(struct drm_device *dev, bool thaw)
 	}
 
 	qxl_create_monitors_object(qdev);
-	drm_helper_resume_force_mode(dev);
-
-	console_lock();
-	qxl_fbdev_set_suspend(qdev, 0);
-	console_unlock();
-
-	drm_kms_helper_poll_enable(dev);
-	return 0;
+	return drm_mode_config_helper_resume(dev);
 }
 
 static int qxl_pm_suspend(struct device *dev)