Message ID | 20231225133615.78993-1-eugen.hristev@collabora.com (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Headers | show |
Series | pmdomain: mediatek: fix race conditions with genpd | expand |
On Mon, 25 Dec 2023 at 14:36, Eugen Hristev <eugen.hristev@collabora.com> wrote: > > If the power domains are registered first with genpd and *after that* > the driver attempts to power them on in the probe sequence, then it is > possible that a race condition occurs if genpd tries to power them on > in the same time. > The same is valid for powering them off before unregistering them > from genpd. Right. When the PM domain has been registered with genpd, attempts to power-on/off the PM domain need to be synchronized with genpd. > Attempt to fix race conditions by first removing the domains from genpd > and *after that* powering down domains. > Also first power up the domains and *after that* register them > to genpd. This seems like a reasonable approach to me. > > Fixes: 59b644b01cf4 ("soc: mediatek: Add MediaTek SCPSYS power domains") > Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> Applied for fixes and by adding a stable tag, thanks! Although, please, see some more comments below. > --- > > This comes as another way to fix the problem as described in this thread: > https://lore.kernel.org/linux-arm-kernel/20231129113120.4907-1-eugen.hristev@collabora.com/ > > I have not been able to reproduce the problem with either fix anymore > (so far). > > I have a few doubts about this one though, if I really covered the > way it's supposed to work, and registering the pmdomains in the recursive > function in the reversed order has any side effect or if it does not > work correctly. > Tested on mt8186 where it appears to be fine. I had a quick look at the code in the driver and a few things caught my eyes. *) The error path in scpsys_probe() doesn't seem to handle removal of the link between parent/child-domains (subdomains). **) An option that might simplify the code and error path too, could be to convert into using of_genpd_add|remove_subdomain() in favor or pm_genpd_add|remove_subdomain(). Kind regards Uffe > > Eugen > > drivers/pmdomain/mediatek/mtk-pm-domains.c | 15 +++++++-------- > 1 file changed, 7 insertions(+), 8 deletions(-) > > diff --git a/drivers/pmdomain/mediatek/mtk-pm-domains.c b/drivers/pmdomain/mediatek/mtk-pm-domains.c > index e26dc17d07ad..e274e3315fe7 100644 > --- a/drivers/pmdomain/mediatek/mtk-pm-domains.c > +++ b/drivers/pmdomain/mediatek/mtk-pm-domains.c > @@ -561,6 +561,11 @@ static int scpsys_add_subdomain(struct scpsys *scpsys, struct device_node *paren > goto err_put_node; > } > > + /* recursive call to add all subdomains */ > + ret = scpsys_add_subdomain(scpsys, child); > + if (ret) > + goto err_put_node; > + > ret = pm_genpd_add_subdomain(parent_pd, child_pd); > if (ret) { > dev_err(scpsys->dev, "failed to add %s subdomain to parent %s\n", > @@ -570,11 +575,6 @@ static int scpsys_add_subdomain(struct scpsys *scpsys, struct device_node *paren > dev_dbg(scpsys->dev, "%s add subdomain: %s\n", parent_pd->name, > child_pd->name); > } > - > - /* recursive call to add all subdomains */ > - ret = scpsys_add_subdomain(scpsys, child); > - if (ret) > - goto err_put_node; > } > > return 0; > @@ -588,9 +588,6 @@ static void scpsys_remove_one_domain(struct scpsys_domain *pd) > { > int ret; > > - if (scpsys_domain_is_on(pd)) > - scpsys_power_off(&pd->genpd); > - > /* > * We're in the error cleanup already, so we only complain, > * but won't emit another error on top of the original one. > @@ -600,6 +597,8 @@ static void scpsys_remove_one_domain(struct scpsys_domain *pd) > dev_err(pd->scpsys->dev, > "failed to remove domain '%s' : %d - state may be inconsistent\n", > pd->genpd.name, ret); > + if (scpsys_domain_is_on(pd)) > + scpsys_power_off(&pd->genpd); > > clk_bulk_put(pd->num_clks, pd->clks); > clk_bulk_put(pd->num_subsys_clks, pd->subsys_clks); > -- > 2.34.1 > >
diff --git a/drivers/pmdomain/mediatek/mtk-pm-domains.c b/drivers/pmdomain/mediatek/mtk-pm-domains.c index e26dc17d07ad..e274e3315fe7 100644 --- a/drivers/pmdomain/mediatek/mtk-pm-domains.c +++ b/drivers/pmdomain/mediatek/mtk-pm-domains.c @@ -561,6 +561,11 @@ static int scpsys_add_subdomain(struct scpsys *scpsys, struct device_node *paren goto err_put_node; } + /* recursive call to add all subdomains */ + ret = scpsys_add_subdomain(scpsys, child); + if (ret) + goto err_put_node; + ret = pm_genpd_add_subdomain(parent_pd, child_pd); if (ret) { dev_err(scpsys->dev, "failed to add %s subdomain to parent %s\n", @@ -570,11 +575,6 @@ static int scpsys_add_subdomain(struct scpsys *scpsys, struct device_node *paren dev_dbg(scpsys->dev, "%s add subdomain: %s\n", parent_pd->name, child_pd->name); } - - /* recursive call to add all subdomains */ - ret = scpsys_add_subdomain(scpsys, child); - if (ret) - goto err_put_node; } return 0; @@ -588,9 +588,6 @@ static void scpsys_remove_one_domain(struct scpsys_domain *pd) { int ret; - if (scpsys_domain_is_on(pd)) - scpsys_power_off(&pd->genpd); - /* * We're in the error cleanup already, so we only complain, * but won't emit another error on top of the original one. @@ -600,6 +597,8 @@ static void scpsys_remove_one_domain(struct scpsys_domain *pd) dev_err(pd->scpsys->dev, "failed to remove domain '%s' : %d - state may be inconsistent\n", pd->genpd.name, ret); + if (scpsys_domain_is_on(pd)) + scpsys_power_off(&pd->genpd); clk_bulk_put(pd->num_clks, pd->clks); clk_bulk_put(pd->num_subsys_clks, pd->subsys_clks);
If the power domains are registered first with genpd and *after that* the driver attempts to power them on in the probe sequence, then it is possible that a race condition occurs if genpd tries to power them on in the same time. The same is valid for powering them off before unregistering them from genpd. Attempt to fix race conditions by first removing the domains from genpd and *after that* powering down domains. Also first power up the domains and *after that* register them to genpd. Fixes: 59b644b01cf4 ("soc: mediatek: Add MediaTek SCPSYS power domains") Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> --- This comes as another way to fix the problem as described in this thread: https://lore.kernel.org/linux-arm-kernel/20231129113120.4907-1-eugen.hristev@collabora.com/ I have not been able to reproduce the problem with either fix anymore (so far). I have a few doubts about this one though, if I really covered the way it's supposed to work, and registering the pmdomains in the recursive function in the reversed order has any side effect or if it does not work correctly. Tested on mt8186 where it appears to be fine. Eugen drivers/pmdomain/mediatek/mtk-pm-domains.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)