Message ID | 20201025221735.3062-47-digetx@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Introduce memory interconnect for NVIDIA Tegra SoCs |
On 26-10-20, 01:17, Dmitry Osipenko wrote:
> This patch fixes lockup which happens when OPP table is released if
> interconnect provider uses OPP in the icc_provider->set() callback
> and bandwidth of the ICC path is set to 0 by the ICC core when path
> is released. The icc_put() doesn't need the opp_table_lock protection,
> hence let's move it outside of the lock in order to resolve the problem.
>
> In particular this fixes tegra-devfreq driver lockup on trying to unload
> the driver module. The devfreq driver uses OPP-bandwidth API and its ICC
> provider also uses OPP for DVFS, hence they both take same opp_table_lock
> when OPP table of the devfreq is released.
>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/opp/core.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 2483e765318a..1134df360fe0 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -1187,12 +1187,6 @@ static void _opp_table_kref_release(struct kref *kref)
>  	if (!IS_ERR(opp_table->clk))
>  		clk_put(opp_table->clk);
>
> -	if (opp_table->paths) {
> -		for (i = 0; i < opp_table->path_count; i++)
> -			icc_put(opp_table->paths[i]);
> -		kfree(opp_table->paths);
> -	}
> -
>  	WARN_ON(!list_empty(&opp_table->opp_list));
>
>  	list_for_each_entry_safe(opp_dev, temp, &opp_table->dev_list, node) {
> @@ -1209,9 +1203,22 @@ static void _opp_table_kref_release(struct kref *kref)
>  	mutex_destroy(&opp_table->genpd_virt_dev_lock);
>  	mutex_destroy(&opp_table->lock);
>  	list_del(&opp_table->node);
> -	kfree(opp_table);
>
>  	mutex_unlock(&opp_table_lock);
> +
> +	/*
> +	 * Interconnect provider may use OPP too, hence icc_put() needs to be
> +	 * invoked outside of the opp_table_lock in order to prevent nested
> +	 * locking which happens when bandwidth of the ICC path is set to 0
> +	 * by ICC core on release of the path.
> +	 */
> +	if (opp_table->paths) {
> +		for (i = 0; i < opp_table->path_count; i++)
> +			icc_put(opp_table->paths[i]);
> +		kfree(opp_table->paths);
> +	}
> +
> +	kfree(opp_table);
>  }

Never make such _fixes_ part of such a big patchset. Always send them
separately.

Having said that, I already have a patch with me which shall fix it for you as
well:

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 4ac4e7ce6b8b..0e0a5269dc82 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1181,6 +1181,10 @@ static void _opp_table_kref_release(struct kref *kref)
 	struct opp_device *opp_dev, *temp;
 	int i;
 
+	/* Drop the lock as soon as we can */
+	list_del(&opp_table->node);
+	mutex_unlock(&opp_table_lock);
+
 	_of_clear_opp_table(opp_table);
 
 	/* Release clk */
@@ -1208,10 +1212,7 @@ static void _opp_table_kref_release(struct kref *kref)
 
 	mutex_destroy(&opp_table->genpd_virt_dev_lock);
 	mutex_destroy(&opp_table->lock);
-	list_del(&opp_table->node);
 	kfree(opp_table);
-
-	mutex_unlock(&opp_table_lock);
 }
 
 void dev_pm_opp_put_opp_table(struct opp_table *opp_table)

27.10.2020 08:10, Viresh Kumar wrote:
> On 26-10-20, 01:17, Dmitry Osipenko wrote:
>> This patch fixes lockup which happens when OPP table is released if
>> interconnect provider uses OPP in the icc_provider->set() callback
>> and bandwidth of the ICC path is set to 0 by the ICC core when path
>> is released. The icc_put() doesn't need the opp_table_lock protection,
>> hence let's move it outside of the lock in order to resolve the problem.
>>
>> In particular this fixes tegra-devfreq driver lockup on trying to unload
>> the driver module. The devfreq driver uses OPP-bandwidth API and its ICC
>> provider also uses OPP for DVFS, hence they both take same opp_table_lock
>> when OPP table of the devfreq is released.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
...
>
> Never make such _fixes_ part of such a big patchset. Always send them
> separately.

Perhaps it's not obvious from the commit description that this patch
doesn't fix any known problems of the current mainline kernel and it's
needed only for the new patches.

> Having said that, I already have a patch with me which shall fix it for you as
> well:

I see that your fix is already applied, thanks!

On 27-10-20, 23:26, Dmitry Osipenko wrote:
> 27.10.2020 08:10, Viresh Kumar wrote:
> > On 26-10-20, 01:17, Dmitry Osipenko wrote:
> >> This patch fixes lockup which happens when OPP table is released if
> >> interconnect provider uses OPP in the icc_provider->set() callback
> >> and bandwidth of the ICC path is set to 0 by the ICC core when path
> >> is released. The icc_put() doesn't need the opp_table_lock protection,
> >> hence let's move it outside of the lock in order to resolve the problem.
> >>
> >> In particular this fixes tegra-devfreq driver lockup on trying to unload
> >> the driver module. The devfreq driver uses OPP-bandwidth API and its ICC
> >> provider also uses OPP for DVFS, hence they both take same opp_table_lock
> >> when OPP table of the devfreq is released.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> ...
> >
> > Never make such _fixes_ part of such a big patchset. Always send them
> > separately.
>
> Perhaps it's not obvious from the commit description that this patch
> doesn't fix any known problems of the current mainline kernel and it's
> needed only for the new patches.

No, I understood that we started getting the warning now only after some other
patches of yours. Nevertheless, it should be considered as a fix only as that
generated lockdep because of locking placement.

And so sending such stuff separately is better as that allows people to apply
it fast.

> > Having said that, I already have a patch with me which shall fix it for you as
> > well:
>
> I see that your fix is already applied, thanks!

I hope it worked for you. Thanks.

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 2483e765318a..1134df360fe0 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1187,12 +1187,6 @@ static void _opp_table_kref_release(struct kref *kref)
 	if (!IS_ERR(opp_table->clk))
 		clk_put(opp_table->clk);
 
-	if (opp_table->paths) {
-		for (i = 0; i < opp_table->path_count; i++)
-			icc_put(opp_table->paths[i]);
-		kfree(opp_table->paths);
-	}
-
 	WARN_ON(!list_empty(&opp_table->opp_list));
 
 	list_for_each_entry_safe(opp_dev, temp, &opp_table->dev_list, node) {
@@ -1209,9 +1203,22 @@ static void _opp_table_kref_release(struct kref *kref)
 	mutex_destroy(&opp_table->genpd_virt_dev_lock);
 	mutex_destroy(&opp_table->lock);
 	list_del(&opp_table->node);
-	kfree(opp_table);
 
 	mutex_unlock(&opp_table_lock);
+
+	/*
+	 * Interconnect provider may use OPP too, hence icc_put() needs to be
+	 * invoked outside of the opp_table_lock in order to prevent nested
+	 * locking which happens when bandwidth of the ICC path is set to 0
+	 * by ICC core on release of the path.
+	 */
+	if (opp_table->paths) {
+		for (i = 0; i < opp_table->path_count; i++)
+			icc_put(opp_table->paths[i]);
+		kfree(opp_table->paths);
+	}
+
+	kfree(opp_table);
 }
 
 void dev_pm_opp_put_opp_table(struct opp_table *opp_table)

This patch fixes lockup which happens when OPP table is released if
interconnect provider uses OPP in the icc_provider->set() callback
and bandwidth of the ICC path is set to 0 by the ICC core when path
is released. The icc_put() doesn't need the opp_table_lock protection,
hence let's move it outside of the lock in order to resolve the problem.

In particular this fixes tegra-devfreq driver lockup on trying to unload
the driver module. The devfreq driver uses OPP-bandwidth API and its ICC
provider also uses OPP for DVFS, hence they both take same opp_table_lock
when OPP table of the devfreq is released.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/opp/core.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
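
The nested locking described in the commit message can be illustrated with a small user-space sketch. This is not kernel code and not the actual OPP or ICC implementation: `table_lock`, `path_put()`, `struct table` and both release functions are hypothetical stand-ins for `opp_table_lock`, `icc_put()` and `_opp_table_kref_release()`, and a pthread mutex is assumed in place of the kernel mutex. The callback uses `pthread_mutex_trylock()` so the problem is counted rather than turned into a real hang.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global opp_table_lock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Number of times a release callback found table_lock already held. */
static int nested_lock_attempts;

/*
 * Hypothetical stand-in for icc_put(): the provider behind the path also
 * needs table_lock. trylock is used so that the sketch records the nested
 * attempt instead of deadlocking the way a plain lock call would.
 */
void path_put(void)
{
    if (pthread_mutex_trylock(&table_lock) != 0) {
        nested_lock_attempts++;
        return;
    }
    /* ... the provider would touch its own table here ... */
    pthread_mutex_unlock(&table_lock);
}

/* Hypothetical, heavily reduced table object. */
struct table {
    int path_count;
    bool linked;    /* models membership in the global table list */
};

/* Old ordering: paths are dropped while table_lock is still held. */
void release_under_lock(struct table *t)
{
    pthread_mutex_lock(&table_lock);
    for (int i = 0; i < t->path_count; i++)
        path_put();
    t->linked = false;
    pthread_mutex_unlock(&table_lock);
}

/* New ordering: only the unlink happens under the lock. */
void release_outside_lock(struct table *t)
{
    pthread_mutex_lock(&table_lock);
    t->linked = false;
    pthread_mutex_unlock(&table_lock);

    for (int i = 0; i < t->path_count; i++)
        path_put();
}

/* Runs one release variant and reports how many nested attempts it caused. */
int count_nested_attempts(void (*release)(struct table *), int paths)
{
    struct table t = { .path_count = paths, .linked = true };

    nested_lock_attempts = 0;
    release(&t);
    assert(!t.linked);
    return nested_lock_attempts;
}
```

Calling `count_nested_attempts(release_under_lock, 2)` reports one nested attempt per path, while `count_nested_attempts(release_outside_lock, 2)` reports none. Both patches in this thread rely on the same idea: the table is unlinked from the global list while the lock is held, and anything that may call back into code taking the same lock runs only after the unlock.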