diff mbox series

[02/10] PM / devfreq: Do not require devices to have OPPs

Message ID 20210929044254.38301-3-samuel@sholland.org (mailing list archive)
State New, archived
Headers show
Series DRAM devfreq support for Allwinner A64/H5 | expand

Commit Message

Samuel Holland Sept. 29, 2021, 4:42 a.m. UTC
Since commit ea572f816032 ("PM / devfreq: Change return type of
devfreq_set_freq_table()"), all devfreq devices are required to have a
valid freq_table. If freq_table is not provided by the driver, it will
be filled in by set_freq_table() from the OPPs; if that fails,
devfreq_add_device() will return an error.

However, since commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when
adding the devfreq device"), devfreq devices are _also_ required to have
an OPP table, even if they provide freq_table. devfreq_add_device()
requires dev_pm_opp_find_freq_ceil() and dev_pm_opp_find_freq_floor() to
return successfully, specifically to initialize scaling_min/max_freq.

Not all drivers need an OPP table. For example, a driver where all
frequencies are determined dynamically could work by filling out only
freq_table. But with the current code it must call dev_pm_opp_add() on
every freq_table entry to probe successfully.

The offending properties, scaling_min/max_freq, are only necessary if a
device has OPPs; if no OPPs exist at all, OPPs cannot be dynamically
enabled or disabled, so those values have no effect. Thus it is trivial
to restore support for devices with freq_table only and not OPPs -- move
those initializations behind the check for a valid OPP table.

Since get_freq_range() uses scaling_max_freq in a min() expression, it
must be initialized to the maximum possible value. scaling_min_freq is
initialized as well for consistency.

Fixes: ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the devfreq device")
Signed-off-by: Samuel Holland <samuel@sholland.org>
---
 drivers/devfreq/devfreq.c | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

Comments

Chanwoo Choi Sept. 30, 2021, 4:19 a.m. UTC | #1
Hi Samuel,


On 9/29/21 1:42 PM, Samuel Holland wrote:
> Since commit ea572f816032 ("PM / devfreq: Change return type of
> devfreq_set_freq_table()"), all devfreq devices are required to have a
> valid freq_table. If freq_table is not provided by the driver, it will
> be filled in by set_freq_table() from the OPPs; if that fails,
> devfreq_add_device() will return an error.
> 
> However, since commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when
> adding the devfreq device"), devfreq devices are _also_ required to have
> an OPP table, even if they provide freq_table. devfreq_add_device()
> requires dev_pm_opp_find_freq_ceil() and dev_pm_opp_find_freq_floor() to
> return successfully, specifically to initialize scaling_min/max_freq.
> 
> Not all drivers need an OPP table. For example, a driver where all
> frequencies are determined dynamically could work by filling out only
> freq_table. But with the current code it must call dev_pm_opp_add() on
> every freq_table entry to probe successfully.

As you commented, if device has no opp table, it should call dev_pm_opp_add().
The devfreq have to use OPP for controlling the frequency/regulator.

Actually, I want that all devfreq driver uses the OPP as default way.
Are there any reason why don't use the OPP table?

> 
> The offending properties, scaling_min/max_freq, are only necessary if a
> device has OPPs; if no OPPs exist at all, OPPs cannot be dynamically
> enabled or disabled, so those values have no effect. Thus it is trivial
> to restore support for devices with freq_table only and not OPPs -- move
> those initializations behind the check for a valid OPP table.
> 
> Since get_freq_range() uses scaling_max_freq in a min() expression, it
> must be initialized to the maximum possible value. scaling_min_freq is
> initialized as well for consistency.
> 
> Fixes: ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the devfreq device")
> Signed-off-by: Samuel Holland <samuel@sholland.org>
> ---
>  drivers/devfreq/devfreq.c | 36 ++++++++++++++++++++----------------
>  1 file changed, 20 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index 7a8b022ba456..426e31e6c448 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -835,24 +835,28 @@ struct devfreq *devfreq_add_device(struct device *dev,
>  		mutex_lock(&devfreq->lock);
>  	}
>  
> -	devfreq->scaling_min_freq = find_available_min_freq(devfreq);
> -	if (!devfreq->scaling_min_freq) {
> -		mutex_unlock(&devfreq->lock);
> -		err = -EINVAL;
> -		goto err_dev;
> -	}
> -
> -	devfreq->scaling_max_freq = find_available_max_freq(devfreq);
> -	if (!devfreq->scaling_max_freq) {
> -		mutex_unlock(&devfreq->lock);
> -		err = -EINVAL;
> -		goto err_dev;
> -	}
> -
> -	devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
>  	devfreq->opp_table = dev_pm_opp_get_opp_table(dev);
> -	if (IS_ERR(devfreq->opp_table))
> +	if (IS_ERR(devfreq->opp_table)) {
>  		devfreq->opp_table = NULL;
> +		devfreq->scaling_min_freq = 0;
> +		devfreq->scaling_max_freq = ULONG_MAX;
> +	} else {
> +		devfreq->scaling_min_freq = find_available_min_freq(devfreq);
> +		if (!devfreq->scaling_min_freq) {
> +			mutex_unlock(&devfreq->lock);
> +			err = -EINVAL;
> +			goto err_dev;
> +		}
> +
> +		devfreq->scaling_max_freq = find_available_max_freq(devfreq);
> +		if (!devfreq->scaling_max_freq) {
> +			mutex_unlock(&devfreq->lock);
> +			err = -EINVAL;
> +			goto err_dev;
> +		}
> +
> +		devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
> +	}
>  
>  	atomic_set(&devfreq->suspend_count, 0);
>  
>
Samuel Holland Sept. 30, 2021, 11:37 a.m. UTC | #2
On 9/29/21 11:19 PM, Chanwoo Choi wrote:
> Hi Samuel,
> 
> 
> On 9/29/21 1:42 PM, Samuel Holland wrote:
>> Since commit ea572f816032 ("PM / devfreq: Change return type of
>> devfreq_set_freq_table()"), all devfreq devices are required to have a
>> valid freq_table. If freq_table is not provided by the driver, it will
>> be filled in by set_freq_table() from the OPPs; if that fails,
>> devfreq_add_device() will return an error.
>>
>> However, since commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when
>> adding the devfreq device"), devfreq devices are _also_ required to have
>> an OPP table, even if they provide freq_table. devfreq_add_device()
>> requires dev_pm_opp_find_freq_ceil() and dev_pm_opp_find_freq_floor() to
>> return successfully, specifically to initialize scaling_min/max_freq.
>>
>> Not all drivers need an OPP table. For example, a driver where all
>> frequencies are determined dynamically could work by filling out only
>> freq_table. But with the current code it must call dev_pm_opp_add() on
>> every freq_table entry to probe successfully.
> 
> As you commented, if device has no opp table, it should call dev_pm_opp_add().
> The devfreq have to use OPP for controlling the frequency/regulator.
> 
> Actually, I want that all devfreq driver uses the OPP as default way.

The current code/documentation implies that an OPP table is intended to
be optional. For example:

 * struct devfreq - Device devfreq structure
...
 * @opp_table:  Reference to OPP table of dev.parent, if one exists.

So this should be updated if an OPP table is no longer optional.

> Are there any reason why don't use the OPP table?

dev_pm_opp_add() takes a voltage, and assumes the existence of some
voltage regulator, but there is none involved here. The only way to have
an OPP table without regulators is to use a static table in the
devicetree. But that also doesn't make much sense, because the OPPs
aren't actually customizable; they are integer dividers from a fixed
base clock. And adding a fixed OPP table to each board would be a lot of
work to replace a trivial loop in the driver. So it seems to be the
wrong abstraction.

Using an OPP table adds extra complexity (memory allocations, error
cases), just to duplicate the list of frequencies that already has to
exist in freq_table. And the driver works fine without any of that.

Regards,
Samuel
Samuel Holland Oct. 1, 2021, 1:45 a.m. UTC | #3
On 9/30/21 8:59 PM, Chanwoo Choi wrote:
> On 9/30/21 8:37 PM, Samuel Holland wrote:
>> On 9/29/21 11:19 PM, Chanwoo Choi wrote:
>>> Hi Samuel,
>>>
>>>
>>> On 9/29/21 1:42 PM, Samuel Holland wrote:
>>>> Since commit ea572f816032 ("PM / devfreq: Change return type of
>>>> devfreq_set_freq_table()"), all devfreq devices are required to have a
>>>> valid freq_table. If freq_table is not provided by the driver, it will
>>>> be filled in by set_freq_table() from the OPPs; if that fails,
>>>> devfreq_add_device() will return an error.
>>>>
>>>> However, since commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when
>>>> adding the devfreq device"), devfreq devices are _also_ required to have
>>>> an OPP table, even if they provide freq_table. devfreq_add_device()
>>>> requires dev_pm_opp_find_freq_ceil() and dev_pm_opp_find_freq_floor() to
>>>> return successfully, specifically to initialize scaling_min/max_freq.
>>>>
>>>> Not all drivers need an OPP table. For example, a driver where all
>>>> frequencies are determined dynamically could work by filling out only
>>>> freq_table. But with the current code it must call dev_pm_opp_add() on
>>>> every freq_table entry to probe successfully.
>>>
>>> As you commented, if device has no opp table, it should call dev_pm_opp_add().
>>> The devfreq have to use OPP for controlling the frequency/regulator.
>>>
>>> Actually, I want that all devfreq driver uses the OPP as default way.
>>
>> The current code/documentation implies that an OPP table is intended to
>> be optional. For example:
>>
>>  * struct devfreq - Device devfreq structure
>> ...
>>  * @opp_table:  Reference to OPP table of dev.parent, if one exists.
>>
>> So this should be updated if an OPP table is no longer optional.
> 
> Right. Need to update it.
> 
>>
>>> Are there any reason why don't use the OPP table?
>>
>> dev_pm_opp_add() takes a voltage, and assumes the existence of some
>> voltage regulator, but there is none involved here. The only way to have
>> an OPP table without regulators is to use a static table in the
>> devicetree. But that also doesn't make much sense, because the OPPs
>> aren't actually customizable; they are integer dividers from a fixed
>> base clock.
> 
> You can use OPP for only clock control without regulator. OPP already
> provides them. OPP already provides the helpful function which
> implement the functions to handle the clock/regulator or power doamin.
> It is useful framework to control clock/regulator. 
> 
> If the standard framework in Linux kernel, it is best to use
> this framework in order to remove the duplicate codes on multiple
> device drivers. It is one of advantage of Linux kernel. 
> 
> Also, if OPP doesn't support the some requirement of you,
> you can contribute and update the OPP.
> 
>  And adding a fixed OPP table to each board would be a lot of
>> work to replace a trivial loop in the driver. So it seems to be the
>> wrong abstraction.
> 
> I don't understand. As I commented for patch 10, you can add
> the OPP entry of the clock without the fixed OPP table in devicetree.
> 
>>
>> Using an OPP table adds extra complexity (memory allocations, error
>> cases), just to duplicate the list of frequencies that already has to
>> exist in freq_table. And the driver works fine without any of that.
> 
> 'freq_table' of devfreq was developed before of adding OPP interface to Linux kernel as I knew. Actually, I prefer to use the OPP interface
> instead of initializing the freq_table directly by device driver.
> I just keep the 'freq_table' for preventing the build/working issue
> for older device driver. I think OPP is enough to control frequency/voltage
> and it provides the various helper funcitons for user of OPP.

Thanks for the explanation. I will convert the driver to use
dev_pm_opp_add(), and I will drop patches 2 and 4. I think patch 3 is
still worth considering.

Regards,
Samuel
Chanwoo Choi Oct. 1, 2021, 1:59 a.m. UTC | #4
On 9/30/21 8:37 PM, Samuel Holland wrote:
> On 9/29/21 11:19 PM, Chanwoo Choi wrote:
>> Hi Samuel,
>>
>>
>> On 9/29/21 1:42 PM, Samuel Holland wrote:
>>> Since commit ea572f816032 ("PM / devfreq: Change return type of
>>> devfreq_set_freq_table()"), all devfreq devices are required to have a
>>> valid freq_table. If freq_table is not provided by the driver, it will
>>> be filled in by set_freq_table() from the OPPs; if that fails,
>>> devfreq_add_device() will return an error.
>>>
>>> However, since commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when
>>> adding the devfreq device"), devfreq devices are _also_ required to have
>>> an OPP table, even if they provide freq_table. devfreq_add_device()
>>> requires dev_pm_opp_find_freq_ceil() and dev_pm_opp_find_freq_floor() to
>>> return successfully, specifically to initialize scaling_min/max_freq.
>>>
>>> Not all drivers need an OPP table. For example, a driver where all
>>> frequencies are determined dynamically could work by filling out only
>>> freq_table. But with the current code it must call dev_pm_opp_add() on
>>> every freq_table entry to probe successfully.
>>
>> As you commented, if device has no opp table, it should call dev_pm_opp_add().
>> The devfreq have to use OPP for controlling the frequency/regulator.
>>
>> Actually, I want that all devfreq driver uses the OPP as default way.
> 
> The current code/documentation implies that an OPP table is intended to
> be optional. For example:
> 
>  * struct devfreq - Device devfreq structure
> ...
>  * @opp_table:  Reference to OPP table of dev.parent, if one exists.
> 
> So this should be updated if an OPP table is no longer optional.

Right. Need to update it.

> 
>> Are there any reason why don't use the OPP table?
> 
> dev_pm_opp_add() takes a voltage, and assumes the existence of some
> voltage regulator, but there is none involved here. The only way to have
> an OPP table without regulators is to use a static table in the
> devicetree. But that also doesn't make much sense, because the OPPs
> aren't actually customizable; they are integer dividers from a fixed
> base clock.

You can use OPP for only clock control without regulator. OPP already
provides them. OPP already provides the helpful function which
implement the functions to handle the clock/regulator or power doamin.
It is useful framework to control clock/regulator. 

If the standard framework in Linux kernel, it is best to use
this framework in order to remove the duplicate codes on multiple
device drivers. It is one of advantage of Linux kernel. 

Also, if OPP doesn't support the some requirement of you,
you can contribute and update the OPP.

 And adding a fixed OPP table to each board would be a lot of
> work to replace a trivial loop in the driver. So it seems to be the
> wrong abstraction.

I don't understand. As I commented for patch 10, you can add
the OPP entry of the clock without the fixed OPP table in devicetree.

> 
> Using an OPP table adds extra complexity (memory allocations, error
> cases), just to duplicate the list of frequencies that already has to
> exist in freq_table. And the driver works fine without any of that.

'freq_table' of devfreq was developed before of adding OPP interface to Linux kernel as I knew. Actually, I prefer to use the OPP interface
instead of initializing the freq_table directly by device driver.
I just keep the 'freq_table' for preventing the build/working issue
for older device driver. I think OPP is enough to control frequency/voltage
and it provides the various helper funcitons for user of OPP.
Chanwoo Choi Oct. 1, 2021, 2:14 a.m. UTC | #5
On 10/1/21 10:45 AM, Samuel Holland wrote:
> On 9/30/21 8:59 PM, Chanwoo Choi wrote:
>> On 9/30/21 8:37 PM, Samuel Holland wrote:
>>> On 9/29/21 11:19 PM, Chanwoo Choi wrote:
>>>> Hi Samuel,
>>>>
>>>>
>>>> On 9/29/21 1:42 PM, Samuel Holland wrote:
>>>>> Since commit ea572f816032 ("PM / devfreq: Change return type of
>>>>> devfreq_set_freq_table()"), all devfreq devices are required to have a
>>>>> valid freq_table. If freq_table is not provided by the driver, it will
>>>>> be filled in by set_freq_table() from the OPPs; if that fails,
>>>>> devfreq_add_device() will return an error.
>>>>>
>>>>> However, since commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when
>>>>> adding the devfreq device"), devfreq devices are _also_ required to have
>>>>> an OPP table, even if they provide freq_table. devfreq_add_device()
>>>>> requires dev_pm_opp_find_freq_ceil() and dev_pm_opp_find_freq_floor() to
>>>>> return successfully, specifically to initialize scaling_min/max_freq.
>>>>>
>>>>> Not all drivers need an OPP table. For example, a driver where all
>>>>> frequencies are determined dynamically could work by filling out only
>>>>> freq_table. But with the current code it must call dev_pm_opp_add() on
>>>>> every freq_table entry to probe successfully.
>>>>
>>>> As you commented, if device has no opp table, it should call dev_pm_opp_add().
>>>> The devfreq have to use OPP for controlling the frequency/regulator.
>>>>
>>>> Actually, I want that all devfreq driver uses the OPP as default way.
>>>
>>> The current code/documentation implies that an OPP table is intended to
>>> be optional. For example:
>>>
>>>  * struct devfreq - Device devfreq structure
>>> ...
>>>  * @opp_table:  Reference to OPP table of dev.parent, if one exists.
>>>
>>> So this should be updated if an OPP table is no longer optional.
>>
>> Right. Need to update it.
>>
>>>
>>>> Are there any reason why don't use the OPP table?
>>>
>>> dev_pm_opp_add() takes a voltage, and assumes the existence of some
>>> voltage regulator, but there is none involved here. The only way to have
>>> an OPP table without regulators is to use a static table in the
>>> devicetree. But that also doesn't make much sense, because the OPPs
>>> aren't actually customizable; they are integer dividers from a fixed
>>> base clock.
>>
>> You can use OPP for only clock control without regulator. OPP already
>> provides them. OPP already provides the helpful function which
>> implement the functions to handle the clock/regulator or power doamin.
>> It is useful framework to control clock/regulator. 
>>
>> If the standard framework in Linux kernel, it is best to use
>> this framework in order to remove the duplicate codes on multiple
>> device drivers. It is one of advantage of Linux kernel. 
>>
>> Also, if OPP doesn't support the some requirement of you,
>> you can contribute and update the OPP.
>>
>>  And adding a fixed OPP table to each board would be a lot of
>>> work to replace a trivial loop in the driver. So it seems to be the
>>> wrong abstraction.
>>
>> I don't understand. As I commented for patch 10, you can add
>> the OPP entry of the clock without the fixed OPP table in devicetree.
>>
>>>
>>> Using an OPP table adds extra complexity (memory allocations, error
>>> cases), just to duplicate the list of frequencies that already has to
>>> exist in freq_table. And the driver works fine without any of that.
>>
>> 'freq_table' of devfreq was developed before of adding OPP interface to Linux kernel as I knew. Actually, I prefer to use the OPP interface
>> instead of initializing the freq_table directly by device driver.
>> I just keep the 'freq_table' for preventing the build/working issue
>> for older device driver. I think OPP is enough to control frequency/voltage
>> and it provides the various helper funcitons for user of OPP.
> 
> Thanks for the explanation. I will convert the driver to use
> dev_pm_opp_add(), and I will drop patches 2 and 4. I think patch 3 is
> still worth considering.

Thanks. Previously, there was some devfreq driver to use freq_table
with descending order. I'll check the devfreq driver in linux kernel.
If there are no any use-case of freq_table with descending order,
I'll accept the patch3.
diff mbox series

Patch

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 7a8b022ba456..426e31e6c448 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -835,24 +835,28 @@  struct devfreq *devfreq_add_device(struct device *dev,
 		mutex_lock(&devfreq->lock);
 	}
 
-	devfreq->scaling_min_freq = find_available_min_freq(devfreq);
-	if (!devfreq->scaling_min_freq) {
-		mutex_unlock(&devfreq->lock);
-		err = -EINVAL;
-		goto err_dev;
-	}
-
-	devfreq->scaling_max_freq = find_available_max_freq(devfreq);
-	if (!devfreq->scaling_max_freq) {
-		mutex_unlock(&devfreq->lock);
-		err = -EINVAL;
-		goto err_dev;
-	}
-
-	devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
 	devfreq->opp_table = dev_pm_opp_get_opp_table(dev);
-	if (IS_ERR(devfreq->opp_table))
+	if (IS_ERR(devfreq->opp_table)) {
 		devfreq->opp_table = NULL;
+		devfreq->scaling_min_freq = 0;
+		devfreq->scaling_max_freq = ULONG_MAX;
+	} else {
+		devfreq->scaling_min_freq = find_available_min_freq(devfreq);
+		if (!devfreq->scaling_min_freq) {
+			mutex_unlock(&devfreq->lock);
+			err = -EINVAL;
+			goto err_dev;
+		}
+
+		devfreq->scaling_max_freq = find_available_max_freq(devfreq);
+		if (!devfreq->scaling_max_freq) {
+			mutex_unlock(&devfreq->lock);
+			err = -EINVAL;
+			goto err_dev;
+		}
+
+		devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
+	}
 
 	atomic_set(&devfreq->suspend_count, 0);