[v6,46/52] opp: Put interconnect paths outside of opp_table_lock

Message ID 20201025221735.3062-47-digetx@gmail.com (mailing list archive)
State New, archived
Series Introduce memory interconnect for NVIDIA Tegra SoCs

Commit Message

Dmitry Osipenko Oct. 25, 2020, 10:17 p.m. UTC
This patch fixes a lockup that happens when an OPP table is released if
the interconnect provider uses OPP in its icc_provider->set() callback
and the ICC core sets the bandwidth of the ICC path to 0 when the path
is released. icc_put() doesn't need the opp_table_lock protection,
hence let's move it outside of the lock in order to resolve the problem.

In particular, this fixes a tegra-devfreq driver lockup on trying to unload
the driver module. The devfreq driver uses the OPP-bandwidth API and its ICC
provider also uses OPP for DVFS, hence they both take the same opp_table_lock
when the OPP table of the devfreq is released.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/opp/core.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
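
The nested locking described in the commit message can be reproduced in a small userspace sketch. This is not kernel code: table_lock stands in for opp_table_lock, provider_set_zero_bandwidth() is a hypothetical stand-in for an icc_provider->set() callback that needs the same lock, and a default (non-recursive) pthread mutex is assumed. The callback uses pthread_mutex_trylock() so that the sketch counts the nested attempt instead of hanging.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for opp_table_lock: a plain, non-recursive mutex (assumption). */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Number of times the callback found table_lock already held. */
static int nested_lock_attempts;

static void table_lock_acquire(void)
{
	int rc = pthread_mutex_lock(&table_lock);

	assert(rc == 0);
	(void)rc;
}

/*
 * Hypothetical provider callback, standing in for icc_provider->set().
 * It needs table_lock itself. With a blocking lock this would never
 * return when the caller already holds table_lock; trylock lets the
 * sketch report the nesting instead.
 */
void provider_set_zero_bandwidth(void)
{
	if (pthread_mutex_trylock(&table_lock) != 0) {
		nested_lock_attempts++;
		return;
	}
	/* ... the provider would look up its own table here ... */
	pthread_mutex_unlock(&table_lock);
}

/* Old ordering: the path is released while table_lock is still held. */
int release_with_lock_held(void)
{
	nested_lock_attempts = 0;
	table_lock_acquire();
	provider_set_zero_bandwidth();	/* nested attempt on table_lock */
	pthread_mutex_unlock(&table_lock);
	return nested_lock_attempts;
}

/* Patched ordering: drop table_lock first, then release the path. */
int release_after_unlock(void)
{
	nested_lock_attempts = 0;
	table_lock_acquire();
	/* ... teardown that really needs the lock goes here ... */
	pthread_mutex_unlock(&table_lock);
	provider_set_zero_bandwidth();	/* lock is free, no nesting */
	return nested_lock_attempts;
}
```

release_with_lock_held() reports one nested attempt and release_after_unlock() reports none, which mirrors the reordering this patch makes in _opp_table_kref_release().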

Comments

Viresh Kumar Oct. 27, 2020, 5:10 a.m. UTC | #1
On 26-10-20, 01:17, Dmitry Osipenko wrote:
> This patch fixes a lockup that happens when an OPP table is released if
> the interconnect provider uses OPP in its icc_provider->set() callback
> and the ICC core sets the bandwidth of the ICC path to 0 when the path
> is released. icc_put() doesn't need the opp_table_lock protection,
> hence let's move it outside of the lock in order to resolve the problem.
> 
> In particular, this fixes a tegra-devfreq driver lockup on trying to unload
> the driver module. The devfreq driver uses the OPP-bandwidth API and its ICC
> provider also uses OPP for DVFS, hence they both take the same opp_table_lock
> when the OPP table of the devfreq is released.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/opp/core.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 2483e765318a..1134df360fe0 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -1187,12 +1187,6 @@ static void _opp_table_kref_release(struct kref *kref)
>  	if (!IS_ERR(opp_table->clk))
>  		clk_put(opp_table->clk);
>  
> -	if (opp_table->paths) {
> -		for (i = 0; i < opp_table->path_count; i++)
> -			icc_put(opp_table->paths[i]);
> -		kfree(opp_table->paths);
> -	}
> -
>  	WARN_ON(!list_empty(&opp_table->opp_list));
>  
>  	list_for_each_entry_safe(opp_dev, temp, &opp_table->dev_list, node) {
> @@ -1209,9 +1203,22 @@ static void _opp_table_kref_release(struct kref *kref)
>  	mutex_destroy(&opp_table->genpd_virt_dev_lock);
>  	mutex_destroy(&opp_table->lock);
>  	list_del(&opp_table->node);
> -	kfree(opp_table);
>  
>  	mutex_unlock(&opp_table_lock);
> +
> +	/*
> +	 * Interconnect provider may use OPP too, hence icc_put() needs to be
> +	 * invoked outside of the opp_table_lock in order to prevent nested
> +	 * locking which happens when bandwidth of the ICC path is set to 0
> +	 * by ICC core on release of the path.
> +	 */
> +	if (opp_table->paths) {
> +		for (i = 0; i < opp_table->path_count; i++)
> +			icc_put(opp_table->paths[i]);
> +		kfree(opp_table->paths);
> +	}
> +
> +	kfree(opp_table);
>  }

Never make such _fixes_ part of such a big patchset. Always send them
separately.

Having said that, I already have a patch with me which shall fix it for you as
well:

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 4ac4e7ce6b8b..0e0a5269dc82 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1181,6 +1181,10 @@ static void _opp_table_kref_release(struct kref *kref)
        struct opp_device *opp_dev, *temp;
        int i;
 
+       /* Drop the lock as soon as we can */
+       list_del(&opp_table->node);
+       mutex_unlock(&opp_table_lock);
+
        _of_clear_opp_table(opp_table);
 
        /* Release clk */
@@ -1208,10 +1212,7 @@ static void _opp_table_kref_release(struct kref *kref)
 
        mutex_destroy(&opp_table->genpd_virt_dev_lock);
        mutex_destroy(&opp_table->lock);
-       list_del(&opp_table->node);
        kfree(opp_table);
-
-       mutex_unlock(&opp_table_lock);
 }
 
 void dev_pm_opp_put_opp_table(struct opp_table *opp_table)
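
The ordering in the diff above (unlink the table, drop the lock, then tear down) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel implementation: struct table, registry, registry_lock and all function names are hypothetical, ids are assumed unique, and the caller of table_release() is assumed to hold the last reference, as the kref release path would.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an OPP table linked into a global list. */
struct table {
	struct table *next;
	int id;
};

/* Stand-ins for opp_table_lock and the global list of tables. */
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static struct table *registry;

/* Allocate a table and link it into the registry under the lock. */
struct table *table_add(int id)
{
	struct table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->id = id;
	pthread_mutex_lock(&registry_lock);
	t->next = registry;
	registry = t;
	pthread_mutex_unlock(&registry_lock);
	return t;
}

/* Lookup that takes registry_lock, like a provider using the same core. */
bool table_is_registered(int id)
{
	bool found = false;

	pthread_mutex_lock(&registry_lock);
	for (struct table *t = registry; t; t = t->next) {
		if (t->id == id) {
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&registry_lock);
	return found;
}

/*
 * Release a table: unlink it first and drop the lock as soon as possible,
 * then do the rest of the teardown unlocked. Returns whether the table
 * was still visible in the registry during teardown.
 */
bool table_release(struct table *victim)
{
	bool visible;

	assert(victim != NULL);

	pthread_mutex_lock(&registry_lock);
	for (struct table **pp = &registry; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			break;
		}
	}
	pthread_mutex_unlock(&registry_lock);

	/*
	 * This call takes registry_lock itself. It would block forever if
	 * the lock were still held here, which is the lockup being fixed.
	 */
	visible = table_is_registered(victim->id);
	free(victim);
	return visible;
}
```

Once the table is unlinked, walkers of the list can no longer find it, so in this sketch the remaining teardown needs no protection from the list lock and may safely call into code that takes that lock.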

Dmitry Osipenko Oct. 27, 2020, 8:26 p.m. UTC | #2
27.10.2020 08:10, Viresh Kumar wrote:
> On 26-10-20, 01:17, Dmitry Osipenko wrote:
>> This patch fixes a lockup that happens when an OPP table is released if
>> the interconnect provider uses OPP in its icc_provider->set() callback
>> and the ICC core sets the bandwidth of the ICC path to 0 when the path
>> is released. icc_put() doesn't need the opp_table_lock protection,
>> hence let's move it outside of the lock in order to resolve the problem.
>>
>> In particular, this fixes a tegra-devfreq driver lockup on trying to unload
>> the driver module. The devfreq driver uses the OPP-bandwidth API and its ICC
>> provider also uses OPP for DVFS, hence they both take the same opp_table_lock
>> when the OPP table of the devfreq is released.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
...
> 
> Never make such _fixes_ part of such a big patchset. Always send them
> separately.

Perhaps it's not obvious from the commit description that this patch
doesn't fix any known problems of the current mainline kernel and it's
needed only for the new patches.

> Having said that, I already have a patch with me which shall fix it for you as
> well:

I see that your fix is already applied, thanks!

Viresh Kumar Oct. 28, 2020, 4:03 a.m. UTC | #3
On 27-10-20, 23:26, Dmitry Osipenko wrote:
> 27.10.2020 08:10, Viresh Kumar wrote:
> > On 26-10-20, 01:17, Dmitry Osipenko wrote:
> >> This patch fixes a lockup that happens when an OPP table is released if
> >> the interconnect provider uses OPP in its icc_provider->set() callback
> >> and the ICC core sets the bandwidth of the ICC path to 0 when the path
> >> is released. icc_put() doesn't need the opp_table_lock protection,
> >> hence let's move it outside of the lock in order to resolve the problem.
> >>
> >> In particular, this fixes a tegra-devfreq driver lockup on trying to unload
> >> the driver module. The devfreq driver uses the OPP-bandwidth API and its ICC
> >> provider also uses OPP for DVFS, hence they both take the same opp_table_lock
> >> when the OPP table of the devfreq is released.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> ...
> > 
> > Never make such _fixes_ part of such a big patchset. Always send them
> > separately.
> 
> Perhaps it's not obvious from the commit description that this patch
> doesn't fix any known problems of the current mainline kernel and it's
> needed only for the new patches.

No, I understood that we only started getting the warning after
some other patches of yours. Nevertheless, it should still be considered
a fix, as the lockdep warning was caused by where the locking is placed.
Sending such changes separately is better, as that allows people to
apply them quickly.

> > Having said that, I already have a patch with me which shall fix it for you as
> > well:
> 
> I see that your fix is already applied, thanks!

I hope it worked for you. Thanks.

Patch

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 2483e765318a..1134df360fe0 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1187,12 +1187,6 @@  static void _opp_table_kref_release(struct kref *kref)
 	if (!IS_ERR(opp_table->clk))
 		clk_put(opp_table->clk);
 
-	if (opp_table->paths) {
-		for (i = 0; i < opp_table->path_count; i++)
-			icc_put(opp_table->paths[i]);
-		kfree(opp_table->paths);
-	}
-
 	WARN_ON(!list_empty(&opp_table->opp_list));
 
 	list_for_each_entry_safe(opp_dev, temp, &opp_table->dev_list, node) {
@@ -1209,9 +1203,22 @@  static void _opp_table_kref_release(struct kref *kref)
 	mutex_destroy(&opp_table->genpd_virt_dev_lock);
 	mutex_destroy(&opp_table->lock);
 	list_del(&opp_table->node);
-	kfree(opp_table);
 
 	mutex_unlock(&opp_table_lock);
+
+	/*
+	 * Interconnect provider may use OPP too, hence icc_put() needs to be
+	 * invoked outside of the opp_table_lock in order to prevent nested
+	 * locking which happens when bandwidth of the ICC path is set to 0
+	 * by ICC core on release of the path.
+	 */
+	if (opp_table->paths) {
+		for (i = 0; i < opp_table->path_count; i++)
+			icc_put(opp_table->paths[i]);
+		kfree(opp_table->paths);
+	}
+
+	kfree(opp_table);
 }
 
 void dev_pm_opp_put_opp_table(struct opp_table *opp_table)