Message ID | e08eff2ed53b441c8a6c41758c00c1c91dc48d2c.1539120077.git.stefan@agner.ch (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] component: add optional cleanup function | expand |
On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote: > In situations where a component fails to bind, a previously > successfully bound component might already registered itself > with the DRM framework (e.g. an encoder). When the master > component then calls drm_mode_config_cleanup, we end up in a > use after free sitution. > > Use the cleanup callback to make sure all framework level > cleanup is done by the time we unbind components. I'm not sure about this approach - the idea about the component bind and unbind callbacks is that unbind undoes _everything_ that bind has done, so everything is correctly ordered. If bind registers something, unbind should unregister it. What seems to be going on is that imx is registering stuff in bind() but not unregistering it in unbind(). Since imx was one of the drivers that the component helper was created for, if it's now crashing, that's a regression in the imx driver. Looking at the commit log, I'd say: commit 8e3b16e2117409625b89807de3912ff773aea354 Author: Lucas Stach <l.stach@pengutronix.de> Date: Thu Aug 11 11:18:49 2016 +0200 drm/imx: don't destroy mode objects manually on driver unbind Instead let drm_mode_config_cleanup() do the work when taking down the master device. This requires all cleanup functions to be properly hooked up to the mode object .destroy callback. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> is probably responsible for introducing this problem, since the explicit calls were added by me when imx was stuck in staging due to the problems that the component helper solved. I think what we have here are different opinions on how cleanup should be handled.
[adding Lucas] On 10.10.2018 12:38, Russell King - ARM Linux wrote: > On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote: >> In situations where a component fails to bind, a previously >> successfully bound component might already registered itself >> with the DRM framework (e.g. an encoder). When the master >> component then calls drm_mode_config_cleanup, we end up in a >> use after free sitution. >> >> Use the cleanup callback to make sure all framework level >> cleanup is done by the time we unbind components. > > I'm not sure about this approach - the idea about the component bind > and unbind callbacks is that unbind undoes _everything_ that bind has > done, so everything is correctly ordered. If bind registers something, > unbind should unregister it. > > What seems to be going on is that imx is registering stuff in bind() > but not unregistering it in unbind(). Yes indeed, if that can be fixed this seems to be the better approach to me. > > Since imx was one of the drivers that the component helper was > created for, if it's now crashing, that's a regression in the imx > driver. Looking at the commit log, I'd say: > > commit 8e3b16e2117409625b89807de3912ff773aea354 > Author: Lucas Stach <l.stach@pengutronix.de> > Date: Thu Aug 11 11:18:49 2016 +0200 > > drm/imx: don't destroy mode objects manually on driver unbind > > Instead let drm_mode_config_cleanup() do the work when taking down > the master device. This requires all cleanup functions to be > properly hooked up to the mode object .destroy callback. > > Signed-off-by: Lucas Stach <l.stach@pengutronix.de> > Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> > > is probably responsible for introducing this problem, since the > explicit calls were added by me when imx was stuck in staging due to > the problems that the component helper solved. The commit above does not revert cleanly today, but a quick fixing seemed to resolve the problem I am seeing... > > I think what we have here are different opinions on how cleanup > should be handled. In the regular case using the framework cleanup function before calling component_unbind_all() works fine. Its really only the case where a subcomponent fails to bind where unbind happens before calling drm_mode_config_cleanup(drm). I guess Lucas was not aware of that special case...? I can send a patch which properly reverts the above commit. -- Stefan
On Wed, Oct 10, 2018 at 01:02:16PM +0200, Stefan Agner wrote: > [adding Lucas] > > On 10.10.2018 12:38, Russell King - ARM Linux wrote: > > On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote: > >> In situations where a component fails to bind, a previously > >> successfully bound component might already registered itself > >> with the DRM framework (e.g. an encoder). When the master > >> component then calls drm_mode_config_cleanup, we end up in a > >> use after free sitution. > >> > >> Use the cleanup callback to make sure all framework level > >> cleanup is done by the time we unbind components. > > > > I'm not sure about this approach - the idea about the component bind > > and unbind callbacks is that unbind undoes _everything_ that bind has > > done, so everything is correctly ordered. If bind registers something, > > unbind should unregister it. > > > > What seems to be going on is that imx is registering stuff in bind() > > but not unregistering it in unbind(). > > Yes indeed, if that can be fixed this seems to be the better approach to > me. > > > > > Since imx was one of the drivers that the component helper was > > created for, if it's now crashing, that's a regression in the imx > > driver. Looking at the commit log, I'd say: > > > > commit 8e3b16e2117409625b89807de3912ff773aea354 > > Author: Lucas Stach <l.stach@pengutronix.de> > > Date: Thu Aug 11 11:18:49 2016 +0200 > > > > drm/imx: don't destroy mode objects manually on driver unbind > > > > Instead let drm_mode_config_cleanup() do the work when taking down > > the master device. This requires all cleanup functions to be > > properly hooked up to the mode object .destroy callback. > > > > Signed-off-by: Lucas Stach <l.stach@pengutronix.de> > > Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> > > > > is probably responsible for introducing this problem, since the > > explicit calls were added by me when imx was stuck in staging due to > > the problems that the component helper solved. > > The commit above does not revert cleanly today, but a quick fixing > seemed to resolve the problem I am seeing... > > > > > I think what we have here are different opinions on how cleanup > > should be handled. > > In the regular case using the framework cleanup function before calling > component_unbind_all() works fine. > > Its really only the case where a subcomponent fails to bind where unbind > happens before calling drm_mode_config_cleanup(drm). I guess Lucas was > not aware of that special case...? As long as ->unbind undoes everything that ->bind does, it's not a problem. WRT using devres groups in one of your previous mails on this topic, consider what would happen if you use devm_* APIs in the bind callback and the component helper did not use devres groups. Any resources taken in the bind callback would not be freed until the device was unbound from the driver by the driver model. This means each time the bind callback gets called, new sets of resources are attempted to be acquired. If some resources are exclusive, the first bind attempt would succeed, but the second attempt would fail with -EBUSY for example. In the case of memory, the allocated memory would not be freed, but new memory would be allocated each time a bind was attempted. The alternative would be that devres must not be used at all in the bind/unbind callbacks - but then we'll be subject to the same error- prone cleanup that we see with driver probe functions prior to devres. Consider what the LDB bind/unbind callbacks would look like if devres was not permitted, and how the memory for the imx_ldb structure would be managed - you'd need some sort of ref-counting on the imx_ldb structure, and management of that in the encoder and optional connector destroy functions so you know when all embedded encoders and connectors in that structure have been "destroyed" by drm_mode_config_cleanup(), and hence it's safe to free the data structure.
diff --git a/drivers/gpu/drm/imx/imx-drm-core.c b/drivers/gpu/drm/imx/imx-drm-core.c index 5ea0c82f9957..b174a0ca9acb 100644 --- a/drivers/gpu/drm/imx/imx-drm-core.c +++ b/drivers/gpu/drm/imx/imx-drm-core.c @@ -288,8 +288,8 @@ static int imx_drm_bind(struct device *dev) err_unbind: #endif component_unbind_all(drm->dev, drm); -err_kms: drm_mode_config_cleanup(drm); +err_kms: drm_dev_put(drm); return ret; @@ -313,9 +313,17 @@ static void imx_drm_unbind(struct device *dev) drm_dev_put(drm); } +static void imx_drm_cleanup(struct device *dev) +{ + struct drm_device *drm = dev_get_drvdata(dev); + + drm_mode_config_cleanup(drm); +} + static const struct component_master_ops imx_drm_ops = { .bind = imx_drm_bind, .unbind = imx_drm_unbind, + .cleanup = imx_drm_cleanup, }; static int imx_drm_platform_probe(struct platform_device *pdev)
In situations where a component fails to bind, a previously successfully bound component might already registered itself with the DRM framework (e.g. an encoder). When the master component then calls drm_mode_config_cleanup, we end up in a use after free sitution. Use the cleanup callback to make sure all framework level cleanup is done by the time we unbind components. Signed-off-by: Stefan Agner <stefan@agner.ch> --- drivers/gpu/drm/imx/imx-drm-core.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)