diff mbox

drm: Don't race connector registration

Message ID 1484237756-2720-1-git-send-email-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show

Commit Message

Daniel Vetter Jan. 12, 2017, 4:15 p.m. UTC
I was under the misconception that the sysfs dev stuff can be fully
set up, and then registered all in one step with device_add. That's
true for properties and property groups, but not for parents and child
devices. Those must be fully registered before you can register a
child.

Add a bit of tracking to make sure that asynchronous mst connector
hotplugging gets this right. For consistency we rely upon the implicit
barriers of the connector->mutex, which is taken anyway, to ensure
that at least either the connector or device registration call will
work out.

Mildly tested since I can't reliably reproduce this on my mst box
here.

Reported-by: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 drivers/gpu/drm/drm_drv.c       | 4 ++++
 include/drm/drmP.h              | 1 +
 3 files changed, 8 insertions(+)

Comments

Daniel Vetter Jan. 12, 2017, 4:17 p.m. UTC | #1
On Thu, Jan 12, 2017 at 05:15:56PM +0100, Daniel Vetter wrote:
> I was under the misconception that the sysfs dev stuff can be fully
> set up, and then registered all in one step with device_add. That's
> true for properties and property groups, but not for parents and child
> devices. Those must be fully registered before you can register a
> child.
> 
> Add a bit of tracking to make sure that asynchronous mst connector
> hotplugging gets this right. For consistency we rely upon the implicit
> barriers of the connector->mutex, which is taken anyway, to ensure
> that at least either the connector or device registration call will
> work out.
> 
> Mildly tested since I can't reliably reproduce this on my mst box
> here.
> 
> Reported-by: Dave Hansen <dave.hansen@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

Cc: stable@vger.kernel.org

> ---
>  drivers/gpu/drm/drm_connector.c | 3 +++
>  drivers/gpu/drm/drm_drv.c       | 4 ++++
>  include/drm/drmP.h              | 1 +
>  3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index c75ab242f907..5999cb83d05b 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -378,6 +378,9 @@ int drm_connector_register(struct drm_connector *connector)
>  {
>  	int ret = 0;
>  
> +	if (!connector->dev->registered)
> +		return 0;
> +
>  	mutex_lock(&connector->mutex);
>  	if (connector->registered)
>  		goto unlock;
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 7e24103c47f1..cad6df626678 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -749,6 +749,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
>  	if (ret)
>  		goto err_minors;
>  
> +	dev->registered = true;
> +
>  	if (dev->driver->load) {
>  		ret = dev->driver->load(dev, flags);
>  		if (ret)
> @@ -796,6 +798,8 @@ void drm_dev_unregister(struct drm_device *dev)
>  
>  	drm_lastclose(dev);
>  
> +	dev->registered = false;
> +
>  	if (drm_core_check_feature(dev, DRIVER_MODESET))
>  		drm_modeset_unregister_all(dev);
>  
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index c537c278a4be..ec105c339347 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -518,6 +518,7 @@ struct drm_device {
>  	struct drm_minor *control;		/**< Control node */
>  	struct drm_minor *primary;		/**< Primary node */
>  	struct drm_minor *render;		/**< Render node */
> +	bool registered;
>  
>  	/* currently active master for this device. Protected by master_mutex */
>  	struct drm_master *master;
> -- 
> 2.5.5
>
Chris Wilson Jan. 12, 2017, 4:54 p.m. UTC | #2
On Thu, Jan 12, 2017 at 05:15:56PM +0100, Daniel Vetter wrote:
> I was under the misconception that the sysfs dev stuff can be fully
> set up, and then registered all in one step with device_add. That's
> true for properties and property groups, but not for parents and child
> devices. Those must be fully registered before you can register a
> child.
> 
> Add a bit of tracking to make sure that asynchronous mst connector
> hotplugging gets this right. For consistency we rely upon the implicit
> barriers of the connector->mutex, which is taken anyway, to ensure
> that at least either the connector or device registration call will
> work out.
> 
> Mildly tested since I can't reliably reproduce this on my mst box
> here.

Hmm, right it should prevent the oops on load. I think we need to
control the async dp-mst better and stop it running until we have the
device registered, and so if we use in the interim, plonk a WARN_ON on
top for -next and try and fix it for realz. (Probably a
drm_dp_mst_mgr_register() hook?)

Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Daniel Vetter Jan. 25, 2017, 6:21 a.m. UTC | #3
Hi Dave,

Still waiting for your testing results on this one here ...
-Daniel

On Thu, Jan 12, 2017 at 05:17:21PM +0100, Daniel Vetter wrote:
> On Thu, Jan 12, 2017 at 05:15:56PM +0100, Daniel Vetter wrote:
> > I was under the misconception that the sysfs dev stuff can be fully
> > set up, and then registered all in one step with device_add. That's
> > true for properties and property groups, but not for parents and child
> > devices. Those must be fully registered before you can register a
> > child.
> > 
> > Add a bit of tracking to make sure that asynchronous mst connector
> > hotplugging gets this right. For consistency we rely upon the implicit
> > barriers of the connector->mutex, which is taken anyway, to ensure
> > that at least either the connector or device registration call will
> > work out.
> > 
> > Mildly tested since I can't reliably reproduce this on my mst box
> > here.
> > 
> > Reported-by: Dave Hansen <dave.hansen@intel.com>
> > Cc: Dave Hansen <dave.hansen@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> 
> Cc: stable@vger.kernel.org
> 
> > ---
> >  drivers/gpu/drm/drm_connector.c | 3 +++
> >  drivers/gpu/drm/drm_drv.c       | 4 ++++
> >  include/drm/drmP.h              | 1 +
> >  3 files changed, 8 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> > index c75ab242f907..5999cb83d05b 100644
> > --- a/drivers/gpu/drm/drm_connector.c
> > +++ b/drivers/gpu/drm/drm_connector.c
> > @@ -378,6 +378,9 @@ int drm_connector_register(struct drm_connector *connector)
> >  {
> >  	int ret = 0;
> >  
> > +	if (!connector->dev->registered)
> > +		return 0;
> > +
> >  	mutex_lock(&connector->mutex);
> >  	if (connector->registered)
> >  		goto unlock;
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 7e24103c47f1..cad6df626678 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -749,6 +749,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
> >  	if (ret)
> >  		goto err_minors;
> >  
> > +	dev->registered = true;
> > +
> >  	if (dev->driver->load) {
> >  		ret = dev->driver->load(dev, flags);
> >  		if (ret)
> > @@ -796,6 +798,8 @@ void drm_dev_unregister(struct drm_device *dev)
> >  
> >  	drm_lastclose(dev);
> >  
> > +	dev->registered = false;
> > +
> >  	if (drm_core_check_feature(dev, DRIVER_MODESET))
> >  		drm_modeset_unregister_all(dev);
> >  
> > diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> > index c537c278a4be..ec105c339347 100644
> > --- a/include/drm/drmP.h
> > +++ b/include/drm/drmP.h
> > @@ -518,6 +518,7 @@ struct drm_device {
> >  	struct drm_minor *control;		/**< Control node */
> >  	struct drm_minor *primary;		/**< Primary node */
> >  	struct drm_minor *render;		/**< Render node */
> > +	bool registered;
> >  
> >  	/* currently active master for this device. Protected by master_mutex */
> >  	struct drm_master *master;
> > -- 
> > 2.5.5
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Dave Hansen Jan. 25, 2017, 3:20 p.m. UTC | #4
On 01/24/2017 10:21 PM, Daniel Vetter wrote:
> Hi Dave,
> 
> Still waiting for your testing results on this one here ...

It's definitely stable with that patch applied.  No more crashes.

But, it's also definitely having difficulty re-probing to find the
monitor that's attached to the dock in some cases.  Whatever is going on
isn't fixed by poking it with xrandr.
Daniel Vetter Jan. 25, 2017, 3:38 p.m. UTC | #5
On Wed, Jan 25, 2017 at 07:20:45AM -0800, Dave Hansen wrote:
> On 01/24/2017 10:21 PM, Daniel Vetter wrote:
> > Hi Dave,
> > 
> > Still waiting for your testing results on this one here ...
> 
> It's definitely stable with that patch applied.  No more crashes.
> 
> But, it's also definitely having difficulty re-probing to find the
> monitor that's attached to the dock in some cases.  Whatever is going on
> isn't fixed by poking it with xrandr.

Is this new? When exactly does this happen? Does the mst sink connector no
longer show up, or is the connected/disconnected status all wrong?
-Daniel
Dave Hansen Jan. 26, 2017, 8:34 p.m. UTC | #6
On 01/25/2017 07:38 AM, Daniel Vetter wrote:
> On Wed, Jan 25, 2017 at 07:20:45AM -0800, Dave Hansen wrote:
>> On 01/24/2017 10:21 PM, Daniel Vetter wrote:
>>> Hi Dave,
>>>
>>> Still waiting for your testing results on this one here ...
>>
>> It's definitely stable with that patch applied.  No more crashes.
>>
>> But, it's also definitely having difficulty re-probing to find the
>> monitor that's attached to the dock in some cases.  Whatever is going on
>> isn't fixed by poking it with xrandr.
> 
> Is this new? When exactly does this happen? Does the mst sink connector no
> longer show up, or is the connected/disconnected status all wrong?

It's hard to say whether it's new or not.  I *think* it worked better
before, but it also crashed pretty often, so it's hard to say.

And, yeah, I think it just gets the connected status wrong.  The
connector is still there.
diff mbox

Patch

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index c75ab242f907..5999cb83d05b 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -378,6 +378,9 @@  int drm_connector_register(struct drm_connector *connector)
 {
 	int ret = 0;
 
+	if (!connector->dev->registered)
+		return 0;
+
 	mutex_lock(&connector->mutex);
 	if (connector->registered)
 		goto unlock;
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 7e24103c47f1..cad6df626678 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -749,6 +749,8 @@  int drm_dev_register(struct drm_device *dev, unsigned long flags)
 	if (ret)
 		goto err_minors;
 
+	dev->registered = true;
+
 	if (dev->driver->load) {
 		ret = dev->driver->load(dev, flags);
 		if (ret)
@@ -796,6 +798,8 @@  void drm_dev_unregister(struct drm_device *dev)
 
 	drm_lastclose(dev);
 
+	dev->registered = false;
+
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		drm_modeset_unregister_all(dev);
 
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index c537c278a4be..ec105c339347 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -518,6 +518,7 @@  struct drm_device {
 	struct drm_minor *control;		/**< Control node */
 	struct drm_minor *primary;		/**< Primary node */
 	struct drm_minor *render;		/**< Render node */
+	bool registered;
 
 	/* currently active master for this device. Protected by master_mutex */
 	struct drm_master *master;