diff mbox

[RFC] drm/bridge: panel: Add module_get/but calls to attached panel driver

Message ID 20180221071842.GA4194@wunner.de (mailing list archive)
State New, archived
Headers show

Commit Message

Lukas Wunner Feb. 21, 2018, 7:18 a.m. UTC
[+cc Rafael]

On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
> On 20/02/18 14:03, Thierry Reding wrote:
> > On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
> >> On 20/02/18 12:34, Thierry Reding wrote:
> >>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
> >>>>> Currently there is no way for a master drm driver to protect against an
> >>>>> attached panel driver from being unloaded while it is in use. The
> >>>>> least we can do is to indicate the usage by incrementing the module
> >>>>> reference count.
[...]
> >>> I disagree. module_get() is only going to protect you from unloading a
> >>> module that's in use, but there are other ways to unbind a driver from
> >>> the device and cause subsequent mayhem.
> >>>
> >>> struct device_link was "recently" introduced to fix that issue.
> >>
> >> Unfortunately I do not have time to do full fix for the issue, but in
> >> any case I think this patch is a step to the right direction. The module
> >> reference count should anyway be increased at least to know that the
> >> driver code remains in memory, even if the device is not there any more.
> > 
> > I think device links are actually a lot easier to use and give you a
> > full solution for this. Essentially the only thing you need to do is add
> > a call to device_link_add() in the correct place.
> > 
> > That said, it seems to me like drm_panel_bridge_add() is not a good
> > place to add this, because the panel/bridge does not follow a typical
> > consumer/supplier model. It is rather a helper construct with a well-
> > defined lifetime.
> > 
> > What you do want to protect is the panel (the "parent" of the panel/
> > bridge) from going away unexpectedly. I think the best way to achieve
> > that is to make the link in drm_panel_attach() with something like
> > this:
> > 
> > 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
> > 	struct device *consumer = connector->dev->dev;
> > 
> > 	link = device_link_add(consumer, panel->dev, flags);
> > 	if (!link) {
> > 		dev_err(panel->dev, "failed to link panel %s to %s\n",
> > 			dev_name(panel->dev), dev_name(consumer));
> > 		return -EINVAL;
> > 	}
> > 
> > Ideally I think we'd link the panel to the connector rather than the
> > top-level DRM device, but we don't have a struct device * for that, so
> > this is probably as good as it gets for now.
> 
> Thanks for the hint. Adding the link indeed unbinds the master drm
> device when the panel is unloaded without any complaints.
> 
> The annoying thing is that the master drm device does not probe again
> and bind when the supplier device becomes available again. However,
> forcing a manual bind brings the whole device back without a problem.
> 
> Is there any trivial way to fix this?

How about the below, would this work for you?

Or is the device link added in the consumer's driver, hence is gone when
the consumer is unbound?  If so, it should be added somewhere else,
or it shouldn't be removed on consumer unbind.

@Thierry:  Thanks for shifting the discussion in the right direction,
it's good to see device links getting more adoption, instead of
proliferating yet more kludges.  However I don't understand why you say
we don't have a struct device for the connector, drm_sysfs_connector_add()
does create one.

-- >8 --
Subject: [PATCH] driver core: Ensure consumers are probed when supplier is
 bound

When a device link supplier is unbound (either manually or because one
of its own suppliers was unbound), its consumers are unbound as well.

However when that supplier is subsequently rebound, its consumers should
likewise be given a chance to rebind.  Achieve that by putting them on
the deferred probe list in device_links_driver_bound().  The sole caller
of that function, driver_bound(), triggers deferred probing afterwards.

Reported-by: Jyri Sarha <jsarha@ti.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/base/base.h | 1 +
 drivers/base/core.c | 4 +++-
 drivers/base/dd.c   | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

Comments

Jyri Sarha Feb. 21, 2018, 11:30 a.m. UTC | #1
On 21/02/18 09:18, Lukas Wunner wrote:
> [+cc Rafael]
> 
> On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
>> On 20/02/18 14:03, Thierry Reding wrote:
>>> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
>>>> On 20/02/18 12:34, Thierry Reding wrote:
>>>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
>>>>>>> Currently there is no way for a master drm driver to protect against an
>>>>>>> attached panel driver from being unloaded while it is in use. The
>>>>>>> least we can do is to indicate the usage by incrementing the module
>>>>>>> reference count.
> [...]
>>>>> I disagree. module_get() is only going to protect you from unloading a
>>>>> module that's in use, but there are other ways to unbind a driver from
>>>>> the device and cause subsequent mayhem.
>>>>>
>>>>> struct device_link was "recently" introduced to fix that issue.
>>>>
>>>> Unfortunately I do not have time to do full fix for the issue, but in
>>>> any case I think this patch is a step to the right direction. The module
>>>> reference count should anyway be increased at least to know that the
>>>> driver code remains in memory, even if the device is not there any more.
>>>
>>> I think device links are actually a lot easier to use and give you a
>>> full solution for this. Essentially the only thing you need to do is add
>>> a call to device_link_add() in the correct place.
>>>
>>> That said, it seems to me like drm_panel_bridge_add() is not a good
>>> place to add this, because the panel/bridge does not follow a typical
>>> consumer/supplier model. It is rather a helper construct with a well-
>>> defined lifetime.
>>>
>>> What you do want to protect is the panel (the "parent" of the panel/
>>> bridge) from going away unexpectedly. I think the best way to achieve
>>> that is to make the link in drm_panel_attach() with something like
>>> this:
>>>
>>> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
>>> 	struct device *consumer = connector->dev->dev;
>>>
>>> 	link = device_link_add(consumer, panel->dev, flags);
>>> 	if (!link) {
>>> 		dev_err(panel->dev, "failed to link panel %s to %s\n",
>>> 			dev_name(panel->dev), dev_name(consumer));
>>> 		return -EINVAL;
>>> 	}
>>>
>>> Ideally I think we'd link the panel to the connector rather than the
>>> top-level DRM device, but we don't have a struct device * for that, so
>>> this is probably as good as it gets for now.
>>
>> Thanks for the hint. Adding the link indeed unbinds the master drm
>> device when the panel is unloaded without any complaints.
>>
>> The annoying thing is that the master drm device does not probe again
>> and bind when the supplier device becomes available again. However,
>> forcing a manual bind brings the whole device back without a problem.
>>
>> Is there any trivial way to fix this?
> 
> How about the below, would this work for you?
> 
> Or is the device link added in the consumer's driver, hence is gone when
> the consumer is unbound?  If so, it should be added somewhere else,
> or it shouldn't be removed on consumer unbind.
> 
For some reason the patch bellow does not work. No even if I do not
remove the link ever, after it has been made in the consumers probe.

However, the device link works the way I want it to if I move the
driver_deferred_probe_add() to device_links_unbind_consumers() right
before the device_release_driver_internal(). This appears to works even
if I have device_link_del() in the remove call of the consumer driver,
but I have not yet checked how racy this solution is.

If I put the driver_deferred_probe_add() right after the
device_release_driver_internal() everything works if I do not try to
delete the link in the consumer driver remove. However, leaving the link
there forever after a successful probe does not feel right to me, even
if in practice it usually would work fine.

I wonder if either solution could evolve to an acceptable solution for
all device_link users.

Best regards,
Jyri

> @Thierry:  Thanks for shifting the discussion in the right direction,
> it's good to see device links getting more adoption, instead of
> proliferating yet more kludges.  However I don't understand why you say
> we don't have a struct device for the connector, drm_sysfs_connector_add()
> does create one.
> 
> -- >8 --
> Subject: [PATCH] driver core: Ensure consumers are probed when supplier is
>  bound
> 
> When a device link supplier is unbound (either manually or because one
> of its own suppliers was unbound), its consumers are unbound as well.
> 
> However when that supplier is subsequently rebound, its consumers should
> likewise be given a chance to rebind.  Achieve that by putting them on
> the deferred probe list in device_links_driver_bound().  The sole caller
> of that function, driver_bound(), triggers deferred probing afterwards.
> 
> Reported-by: Jyri Sarha <jsarha@ti.com>
> Cc: Thierry Reding <thierry.reding@gmail.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---
>  drivers/base/base.h | 1 +
>  drivers/base/core.c | 4 +++-
>  drivers/base/dd.c   | 2 +-
>  3 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index d800de6..39370eb 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -114,6 +114,7 @@ extern void device_release_driver_internal(struct device *dev,
>  
>  extern void driver_detach(struct device_driver *drv);
>  extern int driver_probe_device(struct device_driver *drv, struct device *dev);
> +extern void driver_deferred_probe_add(struct device *dev);
>  extern void driver_deferred_probe_del(struct device *dev);
>  static inline int driver_match_device(struct device_driver *drv,
>  				      struct device *dev)
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 920af22..2869d21 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -402,7 +402,8 @@ int device_links_check_suppliers(struct device *dev)
>   * @dev: Device to update the links for.
>   *
>   * The probe has been successful, so update links from this device to any
> - * consumers by changing their status to "available".
> + * consumers by changing their status to "available".  Mark the consumers
> + * for deferred probing in case the supplier was unbound and is now rebound.
>   *
>   * Also change the status of @dev's links to suppliers to "active".
>   *
> @@ -420,6 +421,7 @@ void device_links_driver_bound(struct device *dev)
>  
>  		WARN_ON(link->status != DL_STATE_DORMANT);
>  		WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
> +		driver_deferred_probe_add(link->consumer);
>  	}
>  
>  	list_for_each_entry(link, &dev->links.suppliers, c_node) {
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 2c964f5..ac5a21a 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -141,7 +141,7 @@ static void deferred_probe_work_func(struct work_struct *work)
>  }
>  static DECLARE_WORK(deferred_probe_work, deferred_probe_work_func);
>  
> -static void driver_deferred_probe_add(struct device *dev)
> +void driver_deferred_probe_add(struct device *dev)
>  {
>  	mutex_lock(&deferred_probe_mutex);
>  	if (list_empty(&dev->p->deferred_probe)) {
>
Lukas Wunner Feb. 21, 2018, 1:15 p.m. UTC | #2
On Wed, Feb 21, 2018 at 01:30:23PM +0200, Jyri Sarha wrote:
> On 21/02/18 09:18, Lukas Wunner wrote:
> > On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
> >> On 20/02/18 14:03, Thierry Reding wrote:
> >>> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
> >>>> On 20/02/18 12:34, Thierry Reding wrote:
> >>>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
> >>>>>>> Currently there is no way for a master drm driver to protect against an
> >>>>>>> attached panel driver from being unloaded while it is in use. The
> >>>>>>> least we can do is to indicate the usage by incrementing the module
> >>>>>>> reference count.
> > [...]
> >>>>> I disagree. module_get() is only going to protect you from unloading a
> >>>>> module that's in use, but there are other ways to unbind a driver from
> >>>>> the device and cause subsequent mayhem.
> >>>>>
> >>>>> struct device_link was "recently" introduced to fix that issue.
> >>>>
> >>>> Unfortunately I do not have time to do full fix for the issue, but in
> >>>> any case I think this patch is a step to the right direction. The module
> >>>> reference count should anyway be increased at least to know that the
> >>>> driver code remains in memory, even if the device is not there any more.
> >>>
> >>> I think device links are actually a lot easier to use and give you a
> >>> full solution for this. Essentially the only thing you need to do is add
> >>> a call to device_link_add() in the correct place.
> >>>
> >>> That said, it seems to me like drm_panel_bridge_add() is not a good
> >>> place to add this, because the panel/bridge does not follow a typical
> >>> consumer/supplier model. It is rather a helper construct with a well-
> >>> defined lifetime.
> >>>
> >>> What you do want to protect is the panel (the "parent" of the panel/
> >>> bridge) from going away unexpectedly. I think the best way to achieve
> >>> that is to make the link in drm_panel_attach() with something like
> >>> this:
> >>>
> >>> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
> >>> 	struct device *consumer = connector->dev->dev;
> >>>
> >>> 	link = device_link_add(consumer, panel->dev, flags);
> >>> 	if (!link) {
> >>> 		dev_err(panel->dev, "failed to link panel %s to %s\n",
> >>> 			dev_name(panel->dev), dev_name(consumer));
> >>> 		return -EINVAL;
> >>> 	}
> >>>
> >>> Ideally I think we'd link the panel to the connector rather than the
> >>> top-level DRM device, but we don't have a struct device * for that, so
> >>> this is probably as good as it gets for now.
> >>
> >> Thanks for the hint. Adding the link indeed unbinds the master drm
> >> device when the panel is unloaded without any complaints.
> >>
> >> The annoying thing is that the master drm device does not probe again
> >> and bind when the supplier device becomes available again. However,
> >> forcing a manual bind brings the whole device back without a problem.
> >>
> >> Is there any trivial way to fix this?
> > 
> > How about the below, would this work for you?
> > 
> > Or is the device link added in the consumer's driver, hence is gone when
> > the consumer is unbound?  If so, it should be added somewhere else,
> > or it shouldn't be removed on consumer unbind.
> > 
> For some reason the patch bellow does not work. No even if I do not
> remove the link ever, after it has been made in the consumers probe.

Yes.  The patch was only meant to work if the device link is *not*
added by the consumer, i.e. if it is added by the supplier, in a PCI
quirk, initcall or whatever.


> However, the device link works the way I want it to if I move the
> driver_deferred_probe_add() to device_links_unbind_consumers() right
> before the device_release_driver_internal(). This appears to works even
> if I have device_link_del() in the remove call of the consumer driver,
> but I have not yet checked how racy this solution is.
> 
> If I put the driver_deferred_probe_add() right after the
> device_release_driver_internal() everything works if I do not try to
> delete the link in the consumer driver remove. However, leaving the link
> there forever after a successful probe does not feel right to me, even
> if in practice it usually would work fine.

The first of the two above-mentioned solutions might also be
acceptable.

However a general problem of adding the link in the consumer is that
the consumer has to be probed at least once.  I'm not sure if that's
always guaranteed.

Leaving the device link around is fine if it represents a dependency
that exists permanently in the system.  But that begs the question if
the consumer is at all the right place to add it.  Perhaps the link
should rather be added when traversing the device tree and instantiating
the devices, and before ever probing them.  E.g. if a panel DT node
references a backlight DT node, add a device link.

Thanks,

Lukas
Thierry Reding Feb. 21, 2018, 2:26 p.m. UTC | #3
On Wed, Feb 21, 2018 at 08:18:42AM +0100, Lukas Wunner wrote:
> [+cc Rafael]
> 
> On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
> > On 20/02/18 14:03, Thierry Reding wrote:
> > > On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
> > >> On 20/02/18 12:34, Thierry Reding wrote:
> > >>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
> > >>>>> Currently there is no way for a master drm driver to protect against an
> > >>>>> attached panel driver from being unloaded while it is in use. The
> > >>>>> least we can do is to indicate the usage by incrementing the module
> > >>>>> reference count.
> [...]
> > >>> I disagree. module_get() is only going to protect you from unloading a
> > >>> module that's in use, but there are other ways to unbind a driver from
> > >>> the device and cause subsequent mayhem.
> > >>>
> > >>> struct device_link was "recently" introduced to fix that issue.
> > >>
> > >> Unfortunately I do not have time to do full fix for the issue, but in
> > >> any case I think this patch is a step to the right direction. The module
> > >> reference count should anyway be increased at least to know that the
> > >> driver code remains in memory, even if the device is not there any more.
> > > 
> > > I think device links are actually a lot easier to use and give you a
> > > full solution for this. Essentially the only thing you need to do is add
> > > a call to device_link_add() in the correct place.
> > > 
> > > That said, it seems to me like drm_panel_bridge_add() is not a good
> > > place to add this, because the panel/bridge does not follow a typical
> > > consumer/supplier model. It is rather a helper construct with a well-
> > > defined lifetime.
> > > 
> > > What you do want to protect is the panel (the "parent" of the panel/
> > > bridge) from going away unexpectedly. I think the best way to achieve
> > > that is to make the link in drm_panel_attach() with something like
> > > this:
> > > 
> > > 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
> > > 	struct device *consumer = connector->dev->dev;
> > > 
> > > 	link = device_link_add(consumer, panel->dev, flags);
> > > 	if (!link) {
> > > 		dev_err(panel->dev, "failed to link panel %s to %s\n",
> > > 			dev_name(panel->dev), dev_name(consumer));
> > > 		return -EINVAL;
> > > 	}
> > > 
> > > Ideally I think we'd link the panel to the connector rather than the
> > > top-level DRM device, but we don't have a struct device * for that, so
> > > this is probably as good as it gets for now.
> > 
> > Thanks for the hint. Adding the link indeed unbinds the master drm
> > device when the panel is unloaded without any complaints.
> > 
> > The annoying thing is that the master drm device does not probe again
> > and bind when the supplier device becomes available again. However,
> > forcing a manual bind brings the whole device back without a problem.
> > 
> > Is there any trivial way to fix this?
> 
> How about the below, would this work for you?
> 
> Or is the device link added in the consumer's driver, hence is gone when
> the consumer is unbound?  If so, it should be added somewhere else,
> or it shouldn't be removed on consumer unbind.
> 
> @Thierry:  Thanks for shifting the discussion in the right direction,
> it's good to see device links getting more adoption, instead of
> proliferating yet more kludges.  However I don't understand why you say
> we don't have a struct device for the connector, drm_sysfs_connector_add()
> does create one.

The problem is that the device created by drm_sysfs_connector_add() is
created purely to represent the device in sysfs. That means it won't
have a driver bound to it, so if you add the device link to that, then
the driver core will try to unbind the sysfs device which isn't bound
in the first place.

I think what we'd need is for drivers to be able to set that device to
whatever their parent is. For example, if there's an IP block on an SoC
that provides the connector, then the corresponding pdev->dev should be
that pointer. That way the device link would work as expected and tear
down the driver for that IP block (hopefully together with the rest of
the DRM device).

However, doing this via the DRM device's ->dev should also work because
a) not all connectors have a parent device and b) the dependency really
is from the DRM device to the panel.

Thierry
Thierry Reding Feb. 21, 2018, 2:30 p.m. UTC | #4
On Wed, Feb 21, 2018 at 02:15:13PM +0100, Lukas Wunner wrote:
> On Wed, Feb 21, 2018 at 01:30:23PM +0200, Jyri Sarha wrote:
> > On 21/02/18 09:18, Lukas Wunner wrote:
> > > On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
> > >> On 20/02/18 14:03, Thierry Reding wrote:
> > >>> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
> > >>>> On 20/02/18 12:34, Thierry Reding wrote:
> > >>>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
> > >>>>>>> Currently there is no way for a master drm driver to protect against an
> > >>>>>>> attached panel driver from being unloaded while it is in use. The
> > >>>>>>> least we can do is to indicate the usage by incrementing the module
> > >>>>>>> reference count.
> > > [...]
> > >>>>> I disagree. module_get() is only going to protect you from unloading a
> > >>>>> module that's in use, but there are other ways to unbind a driver from
> > >>>>> the device and cause subsequent mayhem.
> > >>>>>
> > >>>>> struct device_link was "recently" introduced to fix that issue.
> > >>>>
> > >>>> Unfortunately I do not have time to do full fix for the issue, but in
> > >>>> any case I think this patch is a step to the right direction. The module
> > >>>> reference count should anyway be increased at least to know that the
> > >>>> driver code remains in memory, even if the device is not there any more.
> > >>>
> > >>> I think device links are actually a lot easier to use and give you a
> > >>> full solution for this. Essentially the only thing you need to do is add
> > >>> a call to device_link_add() in the correct place.
> > >>>
> > >>> That said, it seems to me like drm_panel_bridge_add() is not a good
> > >>> place to add this, because the panel/bridge does not follow a typical
> > >>> consumer/supplier model. It is rather a helper construct with a well-
> > >>> defined lifetime.
> > >>>
> > >>> What you do want to protect is the panel (the "parent" of the panel/
> > >>> bridge) from going away unexpectedly. I think the best way to achieve
> > >>> that is to make the link in drm_panel_attach() with something like
> > >>> this:
> > >>>
> > >>> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
> > >>> 	struct device *consumer = connector->dev->dev;
> > >>>
> > >>> 	link = device_link_add(consumer, panel->dev, flags);
> > >>> 	if (!link) {
> > >>> 		dev_err(panel->dev, "failed to link panel %s to %s\n",
> > >>> 			dev_name(panel->dev), dev_name(consumer));
> > >>> 		return -EINVAL;
> > >>> 	}
> > >>>
> > >>> Ideally I think we'd link the panel to the connector rather than the
> > >>> top-level DRM device, but we don't have a struct device * for that, so
> > >>> this is probably as good as it gets for now.
> > >>
> > >> Thanks for the hint. Adding the link indeed unbinds the master drm
> > >> device when the panel is unloaded without any complaints.
> > >>
> > >> The annoying thing is that the master drm device does not probe again
> > >> and bind when the supplier device becomes available again. However,
> > >> forcing a manual bind brings the whole device back without a problem.
> > >>
> > >> Is there any trivial way to fix this?
> > > 
> > > How about the below, would this work for you?
> > > 
> > > Or is the device link added in the consumer's driver, hence is gone when
> > > the consumer is unbound?  If so, it should be added somewhere else,
> > > or it shouldn't be removed on consumer unbind.
> > > 
> > For some reason the patch bellow does not work. No even if I do not
> > remove the link ever, after it has been made in the consumers probe.
> 
> Yes.  The patch was only meant to work if the device link is *not*
> added by the consumer, i.e. if it is added by the supplier, in a PCI
> quirk, initcall or whatever.
> 
> 
> > However, the device link works the way I want it to if I move the
> > driver_deferred_probe_add() to device_links_unbind_consumers() right
> > before the device_release_driver_internal(). This appears to works even
> > if I have device_link_del() in the remove call of the consumer driver,
> > but I have not yet checked how racy this solution is.
> > 
> > If I put the driver_deferred_probe_add() right after the
> > device_release_driver_internal() everything works if I do not try to
> > delete the link in the consumer driver remove. However, leaving the link
> > there forever after a successful probe does not feel right to me, even
> > if in practice it usually would work fine.
> 
> The first of the two above-mentioned solutions might also be
> acceptable.
> 
> However a general problem of adding the link in the consumer is that
> the consumer has to be probed at least once.  I'm not sure if that's
> always guaranteed.
> 
> Leaving the device link around is fine if it represents a dependency
> that exists permanently in the system.  But that begs the question if
> the consumer is at all the right place to add it.  Perhaps the link
> should rather be added when traversing the device tree and instantiating
> the devices, and before ever probing them.  E.g. if a panel DT node
> references a backlight DT node, add a device link.

Good luck with that. This has been discussed numerous times in the past
and never solved. I don't recall any good keywords, and a quick search
through my mailbox doesn't show the threads I'm looking for. But this
was discussed in the context of automatically making deferred probe work
for these kind of dependencies.

The bottom line was, if I remember correctly, that we can't generically
because there are many shapes these dependencies can take. Any generic
DT code wouldn't know about them and only the binding implementers (i.e.
the drivers, the consumers) actually know what to look for and what the
type of dependency really is (some may want to defer probe, others may
succeed probe but have a mechanism to "hotplug" the resource later on).

Thierry
Thierry Reding Feb. 21, 2018, 2:40 p.m. UTC | #5
On Wed, Feb 21, 2018 at 03:30:41PM +0100, Thierry Reding wrote:
> On Wed, Feb 21, 2018 at 02:15:13PM +0100, Lukas Wunner wrote:
> > On Wed, Feb 21, 2018 at 01:30:23PM +0200, Jyri Sarha wrote:
> > > On 21/02/18 09:18, Lukas Wunner wrote:
> > > > On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
> > > >> On 20/02/18 14:03, Thierry Reding wrote:
> > > >>> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
> > > >>>> On 20/02/18 12:34, Thierry Reding wrote:
> > > >>>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
> > > >>>>>>> Currently there is no way for a master drm driver to protect against an
> > > >>>>>>> attached panel driver from being unloaded while it is in use. The
> > > >>>>>>> least we can do is to indicate the usage by incrementing the module
> > > >>>>>>> reference count.
> > > > [...]
> > > >>>>> I disagree. module_get() is only going to protect you from unloading a
> > > >>>>> module that's in use, but there are other ways to unbind a driver from
> > > >>>>> the device and cause subsequent mayhem.
> > > >>>>>
> > > >>>>> struct device_link was "recently" introduced to fix that issue.
> > > >>>>
> > > >>>> Unfortunately I do not have time to do full fix for the issue, but in
> > > >>>> any case I think this patch is a step to the right direction. The module
> > > >>>> reference count should anyway be increased at least to know that the
> > > >>>> driver code remains in memory, even if the device is not there any more.
> > > >>>
> > > >>> I think device links are actually a lot easier to use and give you a
> > > >>> full solution for this. Essentially the only thing you need to do is add
> > > >>> a call to device_link_add() in the correct place.
> > > >>>
> > > >>> That said, it seems to me like drm_panel_bridge_add() is not a good
> > > >>> place to add this, because the panel/bridge does not follow a typical
> > > >>> consumer/supplier model. It is rather a helper construct with a well-
> > > >>> defined lifetime.
> > > >>>
> > > >>> What you do want to protect is the panel (the "parent" of the panel/
> > > >>> bridge) from going away unexpectedly. I think the best way to achieve
> > > >>> that is to make the link in drm_panel_attach() with something like
> > > >>> this:
> > > >>>
> > > >>> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
> > > >>> 	struct device *consumer = connector->dev->dev;
> > > >>>
> > > >>> 	link = device_link_add(consumer, panel->dev, flags);
> > > >>> 	if (!link) {
> > > >>> 		dev_err(panel->dev, "failed to link panel %s to %s\n",
> > > >>> 			dev_name(panel->dev), dev_name(consumer));
> > > >>> 		return -EINVAL;
> > > >>> 	}
> > > >>>
> > > >>> Ideally I think we'd link the panel to the connector rather than the
> > > >>> top-level DRM device, but we don't have a struct device * for that, so
> > > >>> this is probably as good as it gets for now.
> > > >>
> > > >> Thanks for the hint. Adding the link indeed unbinds the master drm
> > > >> device when the panel is unloaded without any complaints.
> > > >>
> > > >> The annoying thing is that the master drm device does not probe again
> > > >> and bind when the supplier device becomes available again. However,
> > > >> forcing a manual bind brings the whole device back without a problem.
> > > >>
> > > >> Is there any trivial way to fix this?
> > > > 
> > > > How about the below, would this work for you?
> > > > 
> > > > Or is the device link added in the consumer's driver, hence is gone when
> > > > the consumer is unbound?  If so, it should be added somewhere else,
> > > > or it shouldn't be removed on consumer unbind.
> > > > 
> > > For some reason the patch bellow does not work. No even if I do not
> > > remove the link ever, after it has been made in the consumers probe.
> > 
> > Yes.  The patch was only meant to work if the device link is *not*
> > added by the consumer, i.e. if it is added by the supplier, in a PCI
> > quirk, initcall or whatever.
> > 
> > 
> > > However, the device link works the way I want it to if I move the
> > > driver_deferred_probe_add() to device_links_unbind_consumers() right
> > > before the device_release_driver_internal(). This appears to works even
> > > if I have device_link_del() in the remove call of the consumer driver,
> > > but I have not yet checked how racy this solution is.
> > > 
> > > If I put the driver_deferred_probe_add() right after the
> > > device_release_driver_internal() everything works if I do not try to
> > > delete the link in the consumer driver remove. However, leaving the link
> > > there forever after a successful probe does not feel right to me, even
> > > if in practice it usually would work fine.
> > 
> > The first of the two above-mentioned solutions might also be
> > acceptable.
> > 
> > However a general problem of adding the link in the consumer is that
> > the consumer has to be probed at least once.  I'm not sure if that's
> > always guaranteed.
> > 
> > Leaving the device link around is fine if it represents a dependency
> > that exists permanently in the system.  But that begs the question if
> > the consumer is at all the right place to add it.  Perhaps the link
> > should rather be added when traversing the device tree and instantiating
> > the devices, and before ever probing them.  E.g. if a panel DT node
> > references a backlight DT node, add a device link.
> 
> Good luck with that. This has been discussed numerous times in the past
> and never solved. I don't recall any good keywords, and a quick search
> through my mailbox doesn't show the threads I'm looking for. But this
> was discussed in the context of automatically making deferred probe work
> for these kind of dependencies.
> 
> The bottom line was, if I remember correctly, that we can't generically
> because there are many shapes these dependencies can take. Any generic
> DT code wouldn't know about them and only the binding implementers (i.e.
> the drivers, the consumers) actually know what to look for and what the
> type of dependency really is (some may want to defer probe, others may
> succeed probe but have a mechanism to "hotplug" the resource later on).

I found this:

	https://lwn.net/Articles/652668/

which has a couple of pointers to other threads that should serve as a
good starting point if you want to read up on what's been tried in the
past.

Note that the context here is primarily probe ordering, but the issues
underneath are the same.

Thierry
diff mbox

Patch

diff --git a/drivers/base/base.h b/drivers/base/base.h
index d800de6..39370eb 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -114,6 +114,7 @@  extern void device_release_driver_internal(struct device *dev,
 
 extern void driver_detach(struct device_driver *drv);
 extern int driver_probe_device(struct device_driver *drv, struct device *dev);
+extern void driver_deferred_probe_add(struct device *dev);
 extern void driver_deferred_probe_del(struct device *dev);
 static inline int driver_match_device(struct device_driver *drv,
 				      struct device *dev)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 920af22..2869d21 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -402,7 +402,8 @@  int device_links_check_suppliers(struct device *dev)
  * @dev: Device to update the links for.
  *
  * The probe has been successful, so update links from this device to any
- * consumers by changing their status to "available".
+ * consumers by changing their status to "available".  Mark the consumers
+ * for deferred probing in case the supplier was unbound and is now rebound.
  *
  * Also change the status of @dev's links to suppliers to "active".
  *
@@ -420,6 +421,7 @@  void device_links_driver_bound(struct device *dev)
 
 		WARN_ON(link->status != DL_STATE_DORMANT);
 		WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
+		driver_deferred_probe_add(link->consumer);
 	}
 
 	list_for_each_entry(link, &dev->links.suppliers, c_node) {
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 2c964f5..ac5a21a 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -141,7 +141,7 @@  static void deferred_probe_work_func(struct work_struct *work)
 }
 static DECLARE_WORK(deferred_probe_work, deferred_probe_work_func);
 
-static void driver_deferred_probe_add(struct device *dev)
+void driver_deferred_probe_add(struct device *dev)
 {
 	mutex_lock(&deferred_probe_mutex);
 	if (list_empty(&dev->p->deferred_probe)) {