diff mbox series

[v4,2/4] PM / devfreq: Fix kernel warning with cpufreq passive register fail

Message ID 20220614230950.426-3-ansuelsmth@gmail.com (mailing list archive)
State New, archived
Delegated to: Chanwoo Choi
Headers show
Series PM / devfreq: Various Fixes to cpufreq based passive governor | expand

Commit Message

Christian Marangi June 14, 2022, 11:09 p.m. UTC
When the cpufreq passive register path from the passive governor fails,
the cpufreq_passive_unregister is called and a kernel WARNING is always
reported.
This is caused by the fact that the devfreq driver already call the
governor unregister with the GOV_STOP, for this reason the second
cpufreq_passive_unregister always return error and a WARN is printed
from the WARN_ON function.
Remove the unregister call from the error handling of the cpufreq register
notifier as it's fundamentally wrong and already handled by the devfreq
core code.

Fixes: a03dacb0316f ("PM / devfreq: Add cpu based scaling support to passive governor")
Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@gmail.com>
---
 drivers/devfreq/governor_passive.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Chanwoo Choi June 15, 2022, 7:11 a.m. UTC | #1
On 22. 6. 15. 08:09, Christian 'Ansuel' Marangi wrote:
> When the cpufreq passive register path from the passive governor fails,
> the cpufreq_passive_unregister is called and a kernel WARNING is always
> reported.
> This is caused by the fact that the devfreq driver already call the
> governor unregister with the GOV_STOP, for this reason the second
> cpufreq_passive_unregister always return error and a WARN is printed
> from the WARN_ON function.
> Remove the unregister call from the error handling of the cpufreq register
> notifier as it's fundamentally wrong and already handled by the devfreq
> core code.
> 
> Fixes: a03dacb0316f ("PM / devfreq: Add cpu based scaling support to passive governor")
> Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@gmail.com>
> ---
>  drivers/devfreq/governor_passive.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index 95de336f20d5..dcc9dd518197 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -331,7 +331,6 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
>  err_put_policy:
>  	cpufreq_cpu_put(policy);
>  err:
> -	WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
>  
>  	return ret;
>  }

I think that it is necessary to free the resource when error happen.
Also, after merging the your patch1, I think that cpufreq_passive_unregister_notifier(devfreq)
will not return error. Instead, just 0 for success.

Instead, 'err_free_cpu_data' and 'err_put_policy' goto statement are wrong exception
handling. If fix the exception handling code in cpufreq_passive_register_notifier
as following and with your patch1, I'll handle the resource for free/un-registration
when error happen during cpufreq_passive_register_notifier.


diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index a35b39ac656c..0246e0731fc0 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -289,22 +289,23 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
                parent_cpu_data = kzalloc(sizeof(*parent_cpu_data),
                                                GFP_KERNEL);
                if (!parent_cpu_data) {
+                       cpufreq_cpu_put(policy);
                        ret = -ENOMEM;
-                       goto err_put_policy;
+                       goto err;
                }
 
                cpu_dev = get_cpu_device(cpu);
                if (!cpu_dev) {
                        dev_err(dev, "failed to get cpu device\n");
                        ret = -ENODEV;
-                       goto err_free_cpu_data;
+                       goto err;
                }
 
                opp_table = dev_pm_opp_get_opp_table(cpu_dev);
                if (IS_ERR(opp_table)) {
                        dev_err(dev, "failed to get opp_table of cpu%d\n", cpu);
                        ret = PTR_ERR(opp_table);
-                       goto err_free_cpu_data;
+                       goto err;
                }
 
                parent_cpu_data->dev = cpu_dev;
@@ -326,10 +327,6 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
 
        return ret;
 
-err_free_cpu_data:
-       kfree(parent_cpu_data);
-err_put_policy:
-       cpufreq_cpu_put(policy);
 err:
        WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
Christian Marangi June 15, 2022, 9:20 a.m. UTC | #2
On Wed, Jun 15, 2022 at 04:11:13PM +0900, Chanwoo Choi wrote:
> On 22. 6. 15. 08:09, Christian 'Ansuel' Marangi wrote:
> > When the cpufreq passive register path from the passive governor fails,
> > the cpufreq_passive_unregister is called and a kernel WARNING is always
> > reported.
> > This is caused by the fact that the devfreq driver already call the
> > governor unregister with the GOV_STOP, for this reason the second
> > cpufreq_passive_unregister always return error and a WARN is printed
> > from the WARN_ON function.
> > Remove the unregister call from the error handling of the cpufreq register
> > notifier as it's fundamentally wrong and already handled by the devfreq
> > core code.
> > 
> > Fixes: a03dacb0316f ("PM / devfreq: Add cpu based scaling support to passive governor")
> > Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@gmail.com>
> > ---
> >  drivers/devfreq/governor_passive.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> > index 95de336f20d5..dcc9dd518197 100644
> > --- a/drivers/devfreq/governor_passive.c
> > +++ b/drivers/devfreq/governor_passive.c
> > @@ -331,7 +331,6 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
> >  err_put_policy:
> >  	cpufreq_cpu_put(policy);
> >  err:
> > -	WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
> >  
> >  	return ret;
> >  }
> 
> I think that it is necessary to free the resource when error happen.

Thing is that it should not be done in the register. Following the flow
of the devfreq core code, if a gov fails to START, the gov STOP is
called and we correctly free our resources. In the current
implementation we call the free 2 times and the second time will always
print error as the notifier is already unregistered.

> Also, after merging the your patch1, I think that cpufreq_passive_unregister_notifier(devfreq)
> will not return error. Instead, just 0 for success.

With path1 we removed the error with the parent_cpu_data deletion but
the unregister error is still there.

> 
> Instead, 'err_free_cpu_data' and 'err_put_policy' goto statement are wrong exception
> handling. If fix the exception handling code in cpufreq_passive_register_notifier
> as following and with your patch1, I'll handle the resource for free/un-registration
> when error happen during cpufreq_passive_register_notifier.
> 

Don't know the main problem here is calling unregister 2 times.

> 
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index a35b39ac656c..0246e0731fc0 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -289,22 +289,23 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
>                 parent_cpu_data = kzalloc(sizeof(*parent_cpu_data),
>                                                 GFP_KERNEL);
>                 if (!parent_cpu_data) {
> +                       cpufreq_cpu_put(policy);
>                         ret = -ENOMEM;
> -                       goto err_put_policy;
> +                       goto err;
>                 }
>  
>                 cpu_dev = get_cpu_device(cpu);
>                 if (!cpu_dev) {
>                         dev_err(dev, "failed to get cpu device\n");
>                         ret = -ENODEV;
> -                       goto err_free_cpu_data;
> +                       goto err;
>                 }
>  
>                 opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>                 if (IS_ERR(opp_table)) {
>                         dev_err(dev, "failed to get opp_table of cpu%d\n", cpu);
>                         ret = PTR_ERR(opp_table);
> -                       goto err_free_cpu_data;
> +                       goto err;
>                 }
>  
>                 parent_cpu_data->dev = cpu_dev;
> @@ -326,10 +327,6 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
>  
>         return ret;
>  
> -err_free_cpu_data:
> -       kfree(parent_cpu_data);
> -err_put_policy:
> -       cpufreq_cpu_put(policy);
>  err:
>         WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
>  
> 
> 
> 
> -- 
> Best Regards,
> Samsung Electronics
> Chanwoo Choi
Chanwoo Choi June 17, 2022, 7:08 p.m. UTC | #3
On 22. 6. 15. 18:20, Ansuel Smith wrote:
> On Wed, Jun 15, 2022 at 04:11:13PM +0900, Chanwoo Choi wrote:
>> On 22. 6. 15. 08:09, Christian 'Ansuel' Marangi wrote:
>>> When the cpufreq passive register path from the passive governor fails,
>>> the cpufreq_passive_unregister is called and a kernel WARNING is always
>>> reported.

>>> This is caused by the fact that the devfreq driver already call the
>>> governor unregister with the GOV_STOP, for this reason the second
>>> cpufreq_passive_unregister always return error and a WARN is printed
>>> from the WARN_ON function.

>>> Remove the unregister call from the error handling of the cpufreq register
>>> notifier as it's fundamentally wrong and already handled by the devfreq
>>> core code.

If possible, could you make the patch description more simply?

>>>
>>> Fixes: a03dacb0316f ("PM / devfreq: Add cpu based scaling support to passive governor")
>>> Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@gmail.com>
>>> ---
>>>  drivers/devfreq/governor_passive.c | 1 -
>>>  1 file changed, 1 deletion(-)
>>>
>>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
>>> index 95de336f20d5..dcc9dd518197 100644
>>> --- a/drivers/devfreq/governor_passive.c
>>> +++ b/drivers/devfreq/governor_passive.c
>>> @@ -331,7 +331,6 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
>>>  err_put_policy:
>>>  	cpufreq_cpu_put(policy);
>>>  err:
>>> -	WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
>>>  
>>>  	return ret;
>>>  }
>>
>> I think that it is necessary to free the resource when error happen.
> 
> Thing is that it should not be done in the register. Following the flow
> of the devfreq core code, if a gov fails to START, the gov STOP is
> called and we correctly free our resources. In the current
> implementation we call the free 2 times and the second time will always
> print error as the notifier is already unregistered.
> 
>> Also, after merging the your patch1, I think that cpufreq_passive_unregister_notifier(devfreq)
>> will not return error. Instead, just 0 for success.
> 
> With path1 we removed the error with the parent_cpu_data deletion but
> the unregister error is still there.
> 
>>
>> Instead, 'err_free_cpu_data' and 'err_put_policy' goto statement are wrong exception
>> handling. If fix the exception handling code in cpufreq_passive_register_notifier
>> as following and with your patch1, I'll handle the resource for free/un-registration
>> when error happen during cpufreq_passive_register_notifier.
>>
> 
> Don't know the main problem here is calling unregister 2 times.

Ah. I understood. To fix the error path handling with unregister function
is called twice, I think that need to to fix it as following:


diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index a35b39ac656c..8f38a63beefc 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -289,22 +289,25 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
                parent_cpu_data = kzalloc(sizeof(*parent_cpu_data),
                                                GFP_KERNEL);
                if (!parent_cpu_data) {
+                       cpufreq_cpu_put(policy);
                        ret = -ENOMEM;
-                       goto err_put_policy;
+                       goto err;
                }
 
                cpu_dev = get_cpu_device(cpu);
                if (!cpu_dev) {
                        dev_err(dev, "failed to get cpu device\n");
+                       cpufreq_cpu_put(policy);
                        ret = -ENODEV;
-                       goto err_free_cpu_data;
+                       goto err;
                }
 
                opp_table = dev_pm_opp_get_opp_table(cpu_dev);
                if (IS_ERR(opp_table)) {
                        dev_err(dev, "failed to get opp_table of cpu%d\n", cpu);
+                       cpufreq_cpu_put(policy);
                        ret = PTR_ERR(opp_table);
-                       goto err_free_cpu_data;
+                       goto err;
                }
 
                parent_cpu_data->dev = cpu_dev;
@@ -324,15 +327,7 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
        if (ret)
                dev_err(dev, "failed to update the frequency\n");
 
-       return ret;
-
-err_free_cpu_data:
-       kfree(parent_cpu_data);
-err_put_policy:
-       cpufreq_cpu_put(policy);
 err:
-       WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
-
        return ret;
 }
Christian Marangi June 19, 2022, 10:19 p.m. UTC | #4
On Sat, Jun 18, 2022 at 04:08:35AM +0900, Chanwoo Choi wrote:
> On 22. 6. 15. 18:20, Ansuel Smith wrote:
> > On Wed, Jun 15, 2022 at 04:11:13PM +0900, Chanwoo Choi wrote:
> >> On 22. 6. 15. 08:09, Christian 'Ansuel' Marangi wrote:
> >>> When the cpufreq passive register path from the passive governor fails,
> >>> the cpufreq_passive_unregister is called and a kernel WARNING is always
> >>> reported.
> 
> >>> This is caused by the fact that the devfreq driver already call the
> >>> governor unregister with the GOV_STOP, for this reason the second
> >>> cpufreq_passive_unregister always return error and a WARN is printed
> >>> from the WARN_ON function.
> 
> >>> Remove the unregister call from the error handling of the cpufreq register
> >>> notifier as it's fundamentally wrong and already handled by the devfreq
> >>> core code.
> 
> If possible, could you make the patch description more simply?
> 
> >>>
> >>> Fixes: a03dacb0316f ("PM / devfreq: Add cpu based scaling support to passive governor")
> >>> Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@gmail.com>
> >>> ---
> >>>  drivers/devfreq/governor_passive.c | 1 -
> >>>  1 file changed, 1 deletion(-)
> >>>
> >>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> >>> index 95de336f20d5..dcc9dd518197 100644
> >>> --- a/drivers/devfreq/governor_passive.c
> >>> +++ b/drivers/devfreq/governor_passive.c
> >>> @@ -331,7 +331,6 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
> >>>  err_put_policy:
> >>>  	cpufreq_cpu_put(policy);
> >>>  err:
> >>> -	WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
> >>>  
> >>>  	return ret;
> >>>  }
> >>
> >> I think that it is necessary to free the resource when error happen.
> > 
> > Thing is that it should not be done in the register. Following the flow
> > of the devfreq core code, if a gov fails to START, the gov STOP is
> > called and we correctly free our resources. In the current
> > implementation we call the free 2 times and the second time will always
> > print error as the notifier is already unregistered.
> > 
> >> Also, after merging the your patch1, I think that cpufreq_passive_unregister_notifier(devfreq)
> >> will not return error. Instead, just 0 for success.
> > 
> > With path1 we removed the error with the parent_cpu_data deletion but
> > the unregister error is still there.
> > 
> >>
> >> Instead, 'err_free_cpu_data' and 'err_put_policy' goto statement are wrong exception
> >> handling. If fix the exception handling code in cpufreq_passive_register_notifier
> >> as following and with your patch1, I'll handle the resource for free/un-registration
> >> when error happen during cpufreq_passive_register_notifier.
> >>
> > 
> > Don't know the main problem here is calling unregister 2 times.
> 
> Ah. I understood. To fix the error path handling with unregister function
> is called twice, I think that need to to fix it as following:
> 
> 
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index a35b39ac656c..8f38a63beefc 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -289,22 +289,25 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
>                 parent_cpu_data = kzalloc(sizeof(*parent_cpu_data),
>                                                 GFP_KERNEL);
>                 if (!parent_cpu_data) {
> +                       cpufreq_cpu_put(policy);
>                         ret = -ENOMEM;
> -                       goto err_put_policy;
> +                       goto err;
>                 }
>  
>                 cpu_dev = get_cpu_device(cpu);
>                 if (!cpu_dev) {
>                         dev_err(dev, "failed to get cpu device\n");
> +                       cpufreq_cpu_put(policy);
>                         ret = -ENODEV;
> -                       goto err_free_cpu_data;
> +                       goto err;
>                 }
>  
>                 opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>                 if (IS_ERR(opp_table)) {
>                         dev_err(dev, "failed to get opp_table of cpu%d\n", cpu);
> +                       cpufreq_cpu_put(policy);
>                         ret = PTR_ERR(opp_table);
> -                       goto err_free_cpu_data;
> +                       goto err;
>                 }
>  
>                 parent_cpu_data->dev = cpu_dev;
> @@ -324,15 +327,7 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
>         if (ret)
>                 dev_err(dev, "failed to update the frequency\n");
>  
> -       return ret;
> -
> -err_free_cpu_data:
> -       kfree(parent_cpu_data);

Wait! This would leak the just allocated parent_cpu_data in case of
error since we return before it's added to the cpu_data_list and it
won't be freed on the unregister function.

I'm sending this patch (since the other got merged) with the description
reworked. Think there isn't another way to have the code linear and
remove the ""double return"".

> -err_put_policy:
> -       cpufreq_cpu_put(policy);
>  err:
> -       WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
> -
>         return ret;
>  }
> 
> 
> -- 
> Best Regards,
> Samsung Electronics
> Chanwoo Choi
diff mbox series

Patch

diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 95de336f20d5..dcc9dd518197 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -331,7 +331,6 @@  static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
 err_put_policy:
 	cpufreq_cpu_put(policy);
 err:
-	WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
 
 	return ret;
 }