diff mbox series

[2/2] drm/imx: make sure to cleanup DRM before unbinding components

Message ID e08eff2ed53b441c8a6c41758c00c1c91dc48d2c.1539120077.git.stefan@agner.ch (mailing list archive)
State New, archived
Headers show
Series [1/2] component: add optional cleanup function | expand

Commit Message

Stefan Agner Oct. 9, 2018, 9:30 p.m. UTC
In situations where a component fails to bind, a previously
successfully bound component might already registered itself
with the DRM framework (e.g. an encoder). When the master
component then calls drm_mode_config_cleanup, we end up in a
use after free sitution.

Use the cleanup callback to make sure all framework level
cleanup is done by the time we unbind components.

Signed-off-by: Stefan Agner <stefan@agner.ch>
---
 drivers/gpu/drm/imx/imx-drm-core.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Russell King (Oracle) Oct. 10, 2018, 10:38 a.m. UTC | #1
On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote:
> In situations where a component fails to bind, a previously
> successfully bound component might already registered itself
> with the DRM framework (e.g. an encoder). When the master
> component then calls drm_mode_config_cleanup, we end up in a
> use after free sitution.
> 
> Use the cleanup callback to make sure all framework level
> cleanup is done by the time we unbind components.

I'm not sure about this approach - the idea about the component bind
and unbind callbacks is that unbind undoes _everything_ that bind has
done, so everything is correctly ordered.  If bind registers something,
unbind should unregister it.

What seems to be going on is that imx is registering stuff in bind()
but not unregistering it in unbind().

Since imx was one of the drivers that the component helper was
created for, if it's now crashing, that's a regression in the imx
driver.  Looking at the commit log, I'd say:

commit 8e3b16e2117409625b89807de3912ff773aea354
Author: Lucas Stach <l.stach@pengutronix.de>
Date:   Thu Aug 11 11:18:49 2016 +0200

    drm/imx: don't destroy mode objects manually on driver unbind

    Instead let drm_mode_config_cleanup() do the work when taking down
    the master device. This requires all cleanup functions to be
    properly hooked up to the mode object .destroy callback.

    Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
    Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

is probably responsible for introducing this problem, since the
explicit calls were added by me when imx was stuck in staging due to
the problems that the component helper solved.

I think what we have here are different opinions on how cleanup
should be handled.
Stefan Agner Oct. 10, 2018, 11:02 a.m. UTC | #2
[adding Lucas]

On 10.10.2018 12:38, Russell King - ARM Linux wrote:
> On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote:
>> In situations where a component fails to bind, a previously
>> successfully bound component might already registered itself
>> with the DRM framework (e.g. an encoder). When the master
>> component then calls drm_mode_config_cleanup, we end up in a
>> use after free sitution.
>>
>> Use the cleanup callback to make sure all framework level
>> cleanup is done by the time we unbind components.
> 
> I'm not sure about this approach - the idea about the component bind
> and unbind callbacks is that unbind undoes _everything_ that bind has
> done, so everything is correctly ordered.  If bind registers something,
> unbind should unregister it.
> 
> What seems to be going on is that imx is registering stuff in bind()
> but not unregistering it in unbind().

Yes indeed, if that can be fixed this seems to be the better approach to
me.

> 
> Since imx was one of the drivers that the component helper was
> created for, if it's now crashing, that's a regression in the imx
> driver.  Looking at the commit log, I'd say:
> 
> commit 8e3b16e2117409625b89807de3912ff773aea354
> Author: Lucas Stach <l.stach@pengutronix.de>
> Date:   Thu Aug 11 11:18:49 2016 +0200
> 
>     drm/imx: don't destroy mode objects manually on driver unbind
> 
>     Instead let drm_mode_config_cleanup() do the work when taking down
>     the master device. This requires all cleanup functions to be
>     properly hooked up to the mode object .destroy callback.
> 
>     Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
>     Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
> 
> is probably responsible for introducing this problem, since the
> explicit calls were added by me when imx was stuck in staging due to
> the problems that the component helper solved.

The commit above does not revert cleanly today, but a quick fixing
seemed to resolve the problem I am seeing...

> 
> I think what we have here are different opinions on how cleanup
> should be handled.

In the regular case using the framework cleanup function before calling
component_unbind_all() works fine.

Its really only the case where a subcomponent fails to bind where unbind
happens before calling drm_mode_config_cleanup(drm). I guess Lucas was
not aware of that special case...?

I can send a patch which properly reverts the above commit.

--
Stefan
Russell King (Oracle) Oct. 10, 2018, 11:22 a.m. UTC | #3
On Wed, Oct 10, 2018 at 01:02:16PM +0200, Stefan Agner wrote:
> [adding Lucas]
> 
> On 10.10.2018 12:38, Russell King - ARM Linux wrote:
> > On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote:
> >> In situations where a component fails to bind, a previously
> >> successfully bound component might already registered itself
> >> with the DRM framework (e.g. an encoder). When the master
> >> component then calls drm_mode_config_cleanup, we end up in a
> >> use after free sitution.
> >>
> >> Use the cleanup callback to make sure all framework level
> >> cleanup is done by the time we unbind components.
> > 
> > I'm not sure about this approach - the idea about the component bind
> > and unbind callbacks is that unbind undoes _everything_ that bind has
> > done, so everything is correctly ordered.  If bind registers something,
> > unbind should unregister it.
> > 
> > What seems to be going on is that imx is registering stuff in bind()
> > but not unregistering it in unbind().
> 
> Yes indeed, if that can be fixed this seems to be the better approach to
> me.
> 
> > 
> > Since imx was one of the drivers that the component helper was
> > created for, if it's now crashing, that's a regression in the imx
> > driver.  Looking at the commit log, I'd say:
> > 
> > commit 8e3b16e2117409625b89807de3912ff773aea354
> > Author: Lucas Stach <l.stach@pengutronix.de>
> > Date:   Thu Aug 11 11:18:49 2016 +0200
> > 
> >     drm/imx: don't destroy mode objects manually on driver unbind
> > 
> >     Instead let drm_mode_config_cleanup() do the work when taking down
> >     the master device. This requires all cleanup functions to be
> >     properly hooked up to the mode object .destroy callback.
> > 
> >     Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> >     Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
> > 
> > is probably responsible for introducing this problem, since the
> > explicit calls were added by me when imx was stuck in staging due to
> > the problems that the component helper solved.
> 
> The commit above does not revert cleanly today, but a quick fixing
> seemed to resolve the problem I am seeing...
> 
> > 
> > I think what we have here are different opinions on how cleanup
> > should be handled.
> 
> In the regular case using the framework cleanup function before calling
> component_unbind_all() works fine.
> 
> Its really only the case where a subcomponent fails to bind where unbind
> happens before calling drm_mode_config_cleanup(drm). I guess Lucas was
> not aware of that special case...?

As long as ->unbind undoes everything that ->bind does, it's not a
problem.

WRT using devres groups in one of your previous mails on this topic,
consider what would happen if you use devm_* APIs in the bind callback
and the component helper did not use devres groups.

Any resources taken in the bind callback would not be freed until the
device was unbound from the driver by the driver model.  This means
each time the bind callback gets called, new sets of resources are
attempted to be acquired.  If some resources are exclusive, the first
bind attempt would succeed, but the second attempt would fail with
-EBUSY for example.  In the case of memory, the allocated memory
would not be freed, but new memory would be allocated each time a
bind was attempted.

The alternative would be that devres must not be used at all in the
bind/unbind callbacks - but then we'll be subject to the same error-
prone cleanup that we see with driver probe functions prior to devres.

Consider what the LDB bind/unbind callbacks would look like if devres
was not permitted, and how the memory for the imx_ldb structure would
be managed - you'd need some sort of ref-counting on the imx_ldb
structure, and management of that in the encoder and optional connector
destroy functions so you know when all embedded encoders and connectors
in that structure have been "destroyed" by drm_mode_config_cleanup(),
and hence it's safe to free the data structure.
diff mbox series

Patch

diff --git a/drivers/gpu/drm/imx/imx-drm-core.c b/drivers/gpu/drm/imx/imx-drm-core.c
index 5ea0c82f9957..b174a0ca9acb 100644
--- a/drivers/gpu/drm/imx/imx-drm-core.c
+++ b/drivers/gpu/drm/imx/imx-drm-core.c
@@ -288,8 +288,8 @@  static int imx_drm_bind(struct device *dev)
 err_unbind:
 #endif
 	component_unbind_all(drm->dev, drm);
-err_kms:
 	drm_mode_config_cleanup(drm);
+err_kms:
 	drm_dev_put(drm);
 
 	return ret;
@@ -313,9 +313,17 @@  static void imx_drm_unbind(struct device *dev)
 	drm_dev_put(drm);
 }
 
+static void imx_drm_cleanup(struct device *dev)
+{
+	struct drm_device *drm = dev_get_drvdata(dev);
+
+	drm_mode_config_cleanup(drm);
+}
+
 static const struct component_master_ops imx_drm_ops = {
 	.bind = imx_drm_bind,
 	.unbind = imx_drm_unbind,
+	.cleanup = imx_drm_cleanup,
 };
 
 static int imx_drm_platform_probe(struct platform_device *pdev)