[v4,2/3] PM / EM: Expose perf domain struct
diff mbox series

Message ID 20190515082318.7993-3-quentin.perret@arm.com
State Changes Requested, archived
Headers show
Series
  • Make IPA use PM_EM
Related show

Commit Message

Quentin Perret May 15, 2019, 8:23 a.m. UTC
In the current state, the perf_domain struct is fully defined only when
CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
with or without that option in the thermal framework, make sure to
actually define the struct regardless of the config option. That allows
to avoid using stubbed accessor functions all the time in code paths
that use the EM.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 include/linux/energy_model.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Daniel Lezcano May 15, 2019, 9:06 a.m. UTC | #1
On 15/05/2019 10:23, Quentin Perret wrote:
> In the current state, the perf_domain struct is fully defined only when
> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> with or without that option in the thermal framework, make sure to
> actually define the struct regardless of the config option. That allows
> to avoid using stubbed accessor functions all the time in code paths
> that use the EM.
> 
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>

This patch implies the cpu cooling device can be set without the energy
model.

Isn't it possible to make a strong dependency for the cpu cooling device
on the energy model option, add the energy model as default on arm arch
and drop this patch?

After all, the cpu cooling is using the em framework.

> ---
>  include/linux/energy_model.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index aa027f7bcb3e..fb32b86a467d 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -9,7 +9,6 @@
>  #include <linux/sched/topology.h>
>  #include <linux/types.h>
>  
> -#ifdef CONFIG_ENERGY_MODEL
>  /**
>   * em_cap_state - Capacity state of a performance domain
>   * @frequency:	The CPU frequency in KHz, for consistency with CPUFreq
> @@ -40,6 +39,7 @@ struct em_perf_domain {
>  	unsigned long cpus[0];
>  };
>  
> +#ifdef CONFIG_ENERGY_MODEL
>  #define EM_CPU_MAX_POWER 0xFFFF
>  
>  struct em_data_callback {
> @@ -160,7 +160,6 @@ static inline int em_pd_nr_cap_states(struct em_perf_domain *pd)
>  }
>  
>  #else
> -struct em_perf_domain {};
>  struct em_data_callback {};
>  #define EM_DATA_CB(_active_power_cb) { }
>  
>
Quentin Perret May 15, 2019, 9:17 a.m. UTC | #2
Hi Daniel,

On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 10:23, Quentin Perret wrote:
> > In the current state, the perf_domain struct is fully defined only when
> > CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> > with or without that option in the thermal framework, make sure to
> > actually define the struct regardless of the config option. That allows
> > to avoid using stubbed accessor functions all the time in code paths
> > that use the EM.
> > 
> > Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> 
> This patch implies the cpu cooling device can be set without the energy
> model.
> 
> Isn't it possible to make a strong dependency for the cpu cooling device
> on the energy model option, add the energy model as default on arm arch
> and drop this patch?

Right, that should work too.

> After all, the cpu cooling is using the em framework.

The reason I did it that way is simply to keep things flexible. If you
don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
for CPU thermal. So I thought it would be good to not mandate compiling
in ENERGY_MODEL in this case -- that should save a bit of space.

But TBH I don't have a strong opinion on this one, so if everybody
agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
happy to drop this patch and fix patch 3/3. That would indeed simplify
things a bit.

Thanks,
Quentin
Daniel Lezcano May 15, 2019, 9:56 a.m. UTC | #3
On 15/05/2019 11:17, Quentin Perret wrote:
> Hi Daniel,
> 
> On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 10:23, Quentin Perret wrote:
>>> In the current state, the perf_domain struct is fully defined only when
>>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
>>> with or without that option in the thermal framework, make sure to
>>> actually define the struct regardless of the config option. That allows
>>> to avoid using stubbed accessor functions all the time in code paths
>>> that use the EM.
>>>
>>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
>>
>> This patch implies the cpu cooling device can be set without the energy
>> model.
>>
>> Isn't it possible to make a strong dependency for the cpu cooling device
>> on the energy model option, add the energy model as default on arm arch
>> and drop this patch?
> 
> Right, that should work too.
> 
>> After all, the cpu cooling is using the em framework.
> 
> The reason I did it that way is simply to keep things flexible. If you
> don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
> for CPU thermal. So I thought it would be good to not mandate compiling
> in ENERGY_MODEL in this case -- that should save a bit of space.
> 
> But TBH I don't have a strong opinion on this one, so if everybody
> agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
> happy to drop this patch and fix patch 3/3. That would indeed simplify
> things a bit.

Ok in this case it will be better to drop the 2/3 and add a small series
doing for the cpu_cooling.c

#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR

/* structure freq */

/* power2state */

/* state2power*/

/* getrequestedpower */

/* All functions needed for the above */

#endif

static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        .get_requested_power    = cpufreq_get_requested_power,
        .state2power            = cpufreq_state2power,
        .power2state            = cpufreq_power2state,
#endif
};

So you don't have to care about ENERGY_MODEL to be set as
THERMAL_GOV_POWER_ALLOCATOR depends on it.

I think the result for cpu_cooling.c will be even more cleaner with the
em change.
Quentin Perret May 15, 2019, 10:07 a.m. UTC | #4
On Wednesday 15 May 2019 at 11:56:30 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 11:17, Quentin Perret wrote:
> > Hi Daniel,
> > 
> > On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
> >> On 15/05/2019 10:23, Quentin Perret wrote:
> >>> In the current state, the perf_domain struct is fully defined only when
> >>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> >>> with or without that option in the thermal framework, make sure to
> >>> actually define the struct regardless of the config option. That allows
> >>> to avoid using stubbed accessor functions all the time in code paths
> >>> that use the EM.
> >>>
> >>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> >>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> >>
> >> This patch implies the cpu cooling device can be set without the energy
> >> model.
> >>
> >> Isn't it possible to make a strong dependency for the cpu cooling device
> >> on the energy model option, add the energy model as default on arm arch
> >> and drop this patch?
> > 
> > Right, that should work too.
> > 
> >> After all, the cpu cooling is using the em framework.
> > 
> > The reason I did it that way is simply to keep things flexible. If you
> > don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
> > for CPU thermal. So I thought it would be good to not mandate compiling
> > in ENERGY_MODEL in this case -- that should save a bit of space.
> > 
> > But TBH I don't have a strong opinion on this one, so if everybody
> > agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
> > happy to drop this patch and fix patch 3/3. That would indeed simplify
> > things a bit.
> 
> Ok in this case it will be better to drop the 2/3 and add a small series
> doing for the cpu_cooling.c
> 
> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> 
> /* structure freq */
> 
> /* power2state */
> 
> /* state2power*/
> 
> /* getrequestedpower */
> 
> /* All functions needed for the above */
> 
> #endif
> 
> static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         .get_requested_power    = cpufreq_get_requested_power,
>         .state2power            = cpufreq_state2power,
>         .power2state            = cpufreq_power2state,
> #endif
> };
> 
> So you don't have to care about ENERGY_MODEL to be set as
> THERMAL_GOV_POWER_ALLOCATOR depends on it.
> 
> I think the result for cpu_cooling.c will be even more cleaner with the
> em change.

OK, that works for me. I'll give it a go in v5.

Thanks !
Quentin
Daniel Lezcano May 15, 2019, 10:16 a.m. UTC | #5
On 15/05/2019 12:07, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 11:56:30 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 11:17, Quentin Perret wrote:
>>> Hi Daniel,
>>>
>>> On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
>>>> On 15/05/2019 10:23, Quentin Perret wrote:
>>>>> In the current state, the perf_domain struct is fully defined only when
>>>>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
>>>>> with or without that option in the thermal framework, make sure to
>>>>> actually define the struct regardless of the config option. That allows
>>>>> to avoid using stubbed accessor functions all the time in code paths
>>>>> that use the EM.
>>>>>
>>>>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>>>>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
>>>>
>>>> This patch implies the cpu cooling device can be set without the energy
>>>> model.
>>>>
>>>> Isn't it possible to make a strong dependency for the cpu cooling device
>>>> on the energy model option, add the energy model as default on arm arch
>>>> and drop this patch?
>>>
>>> Right, that should work too.
>>>
>>>> After all, the cpu cooling is using the em framework.
>>>
>>> The reason I did it that way is simply to keep things flexible. If you
>>> don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
>>> for CPU thermal. So I thought it would be good to not mandate compiling
>>> in ENERGY_MODEL in this case -- that should save a bit of space.
>>>
>>> But TBH I don't have a strong opinion on this one, so if everybody
>>> agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
>>> happy to drop this patch and fix patch 3/3. That would indeed simplify
>>> things a bit.
>>
>> Ok in this case it will be better to drop the 2/3 and add a small series
>> doing for the cpu_cooling.c
>>
>> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>
>> /* structure freq */
>>
>> /* power2state */
>>
>> /* state2power*/
>>
>> /* getrequestedpower */
>>
>> /* All functions needed for the above */
>>
>> #endif
>>
>> static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
>>         .get_max_state          = cpufreq_get_max_state,
>>         .get_cur_state          = cpufreq_get_cur_state,
>>         .set_cur_state          = cpufreq_set_cur_state,
>> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>         .get_requested_power    = cpufreq_get_requested_power,
>>         .state2power            = cpufreq_state2power,
>>         .power2state            = cpufreq_power2state,
>> #endif
>> };
>>
>> So you don't have to care about ENERGY_MODEL to be set as
>> THERMAL_GOV_POWER_ALLOCATOR depends on it.
>>
>> I think the result for cpu_cooling.c will be even more cleaner with the
>> em change.
> 
> OK, that works for me. I'll give it a go in v5.

Viresh what do you think ?
Viresh Kumar May 15, 2019, 10:22 a.m. UTC | #6
On 15-05-19, 12:16, Daniel Lezcano wrote:
> Viresh what do you think ?

I agree with your last suggestions. They do make sense.
Quentin Perret May 15, 2019, 10:40 a.m. UTC | #7
On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> On 15-05-19, 12:16, Daniel Lezcano wrote:
> > Viresh what do you think ?
> 
> I agree with your last suggestions. They do make sense.

Good :-)

So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
test it and clean it up some more and put it as patch 1 in the series if
that's OK.

Thanks,
Quentin


diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index f7c1f49ec87f..ee431848ef71 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -58,7 +58,9 @@
  */
 struct freq_table {
        u32 frequency;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        u32 power;
+#endif
 };
 
 /**
@@ -109,28 +111,6 @@ static DEFINE_IDA(cpufreq_ida);
 static DEFINE_MUTEX(cooling_list_lock);
 static LIST_HEAD(cpufreq_cdev_list);
 
-/* Below code defines functions to be used for cpufreq as cooling device */
-
-/**
- * get_level: Find the level for a particular frequency
- * @cpufreq_cdev: cpufreq_cdev for which the property is required
- * @freq: Frequency
- *
- * Return: level corresponding to the frequency.
- */
-static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
-                              unsigned int freq)
-{
-       struct freq_table *freq_table = cpufreq_cdev->freq_table;
-       unsigned long level;
-
-       for (level = 1; level <= cpufreq_cdev->max_level; level++)
-               if (freq > freq_table[level].frequency)
-                       break;
-
-       return level - 1;
-}
-
 /**
  * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
  * @nb:        struct notifier_block * with callback info.
@@ -184,6 +164,27 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
        return NOTIFY_OK;
 }
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
+/**
+ * get_level: Find the level for a particular frequency
+ * @cpufreq_cdev: cpufreq_cdev for which the property is required
+ * @freq: Frequency
+ *
+ * Return: level corresponding to the frequency.
+ */
+static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
+                              unsigned int freq)
+{
+       struct freq_table *freq_table = cpufreq_cdev->freq_table;
+       unsigned long level;
+
+       for (level = 1; level <= cpufreq_cdev->max_level; level++)
+               if (freq > freq_table[level].frequency)
+                       break;
+
+       return level - 1;
+}
+
 /**
  * update_freq_table() - Update the freq table with power numbers
  * @cpufreq_cdev:      the cpufreq cooling device in which to update the table
@@ -333,80 +334,6 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_cdev,
        return (raw_cpu_power * cpufreq_cdev->last_load) / 100;
 }
 
-/* cpufreq cooling device callback functions are defined below */
-
-/**
- * cpufreq_get_max_state - callback function to get the max cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the max cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * max cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
-                                unsigned long *state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
-       *state = cpufreq_cdev->max_level;
-       return 0;
-}
-
-/**
- * cpufreq_get_cur_state - callback function to get the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the current cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
-                                unsigned long *state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
-       *state = cpufreq_cdev->cpufreq_state;
-
-       return 0;
-}
-
-/**
- * cpufreq_set_cur_state - callback function to set the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: set this variable to the current cooling state.
- *
- * Callback for the thermal cooling device to change the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
-                                unsigned long state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-       unsigned int clip_freq;
-
-       /* Request state should be less than max_level */
-       if (WARN_ON(state > cpufreq_cdev->max_level))
-               return -EINVAL;
-
-       /* Check if the old cooling action is same as new cooling action */
-       if (cpufreq_cdev->cpufreq_state == state)
-               return 0;
-
-       clip_freq = cpufreq_cdev->freq_table[state].frequency;
-       cpufreq_cdev->cpufreq_state = state;
-       cpufreq_cdev->clipped_freq = clip_freq;
-
-       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
-
-       return 0;
-}
-
 /**
  * cpufreq_get_requested_power() - get the current power
  * @cdev:      &thermal_cooling_device pointer
@@ -551,22 +478,93 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
                                      power);
        return 0;
 }
+#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
+
+/* cpufreq cooling device callback functions are defined below */
+
+/**
+ * cpufreq_get_max_state - callback function to get the max cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the max cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * max cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
+                                unsigned long *state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+       *state = cpufreq_cdev->max_level;
+       return 0;
+}
+
+/**
+ * cpufreq_get_cur_state - callback function to get the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the current cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
+                                unsigned long *state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+       *state = cpufreq_cdev->cpufreq_state;
+
+       return 0;
+}
+
+/**
+ * cpufreq_set_cur_state - callback function to set the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: set this variable to the current cooling state.
+ *
+ * Callback for the thermal cooling device to change the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
+                                unsigned long state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+       unsigned int clip_freq;
+
+       /* Request state should be less than max_level */
+       if (WARN_ON(state > cpufreq_cdev->max_level))
+               return -EINVAL;
+
+       /* Check if the old cooling action is same as new cooling action */
+       if (cpufreq_cdev->cpufreq_state == state)
+               return 0;
+
+       clip_freq = cpufreq_cdev->freq_table[state].frequency;
+       cpufreq_cdev->cpufreq_state = state;
+       cpufreq_cdev->clipped_freq = clip_freq;
+
+       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
+
+       return 0;
+}
 
 /* Bind cpufreq callbacks to thermal cooling device ops */
 
 static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
-       .get_max_state = cpufreq_get_max_state,
-       .get_cur_state = cpufreq_get_cur_state,
-       .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        .get_requested_power    = cpufreq_get_requested_power,
        .state2power            = cpufreq_state2power,
        .power2state            = cpufreq_power2state,
+#endif
 };
 
 /* Notifier for cpufreq policy change */
@@ -674,17 +672,16 @@ __cpufreq_cooling_register(struct device_node *np,
                        pr_debug("%s: freq:%u KHz\n", __func__, freq);
        }
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        if (capacitance) {
                ret = update_freq_table(cpufreq_cdev, capacitance);
                if (ret) {
                        cdev = ERR_PTR(ret);
                        goto remove_ida;
                }
-
-               cooling_ops = &cpufreq_power_cooling_ops;
-       } else {
-               cooling_ops = &cpufreq_cooling_ops;
        }
+#endif
+       cooling_ops = &cpufreq_cooling_ops;
 
        cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
                                                  cooling_ops);
Quentin Perret May 15, 2019, 10:46 a.m. UTC | #8
On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
> On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> > On 15-05-19, 12:16, Daniel Lezcano wrote:
> > > Viresh what do you think ?
> > 
> > I agree with your last suggestions. They do make sense.
> 
> Good :-)
> 
> So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> test it and clean it up some more and put it as patch 1 in the series if
> that's OK.
> 
> Thanks,
> Quentin
> 
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index f7c1f49ec87f..ee431848ef71 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -58,7 +58,9 @@
>   */
>  struct freq_table {
>         u32 frequency;
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         u32 power;
> +#endif
>  };
>  
>  /**
> @@ -109,28 +111,6 @@ static DEFINE_IDA(cpufreq_ida);
>  static DEFINE_MUTEX(cooling_list_lock);
>  static LIST_HEAD(cpufreq_cdev_list);
>  
> -/* Below code defines functions to be used for cpufreq as cooling device */
> -
> -/**
> - * get_level: Find the level for a particular frequency
> - * @cpufreq_cdev: cpufreq_cdev for which the property is required
> - * @freq: Frequency
> - *
> - * Return: level corresponding to the frequency.
> - */
> -static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
> -                              unsigned int freq)
> -{
> -       struct freq_table *freq_table = cpufreq_cdev->freq_table;
> -       unsigned long level;
> -
> -       for (level = 1; level <= cpufreq_cdev->max_level; level++)
> -               if (freq > freq_table[level].frequency)
> -                       break;
> -
> -       return level - 1;
> -}
> -
>  /**
>   * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
>   * @nb:        struct notifier_block * with callback info.
> @@ -184,6 +164,27 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
>         return NOTIFY_OK;
>  }
>  
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> +/**
> + * get_level: Find the level for a particular frequency
> + * @cpufreq_cdev: cpufreq_cdev for which the property is required
> + * @freq: Frequency
> + *
> + * Return: level corresponding to the frequency.
> + */
> +static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
> +                              unsigned int freq)
> +{
> +       struct freq_table *freq_table = cpufreq_cdev->freq_table;
> +       unsigned long level;
> +
> +       for (level = 1; level <= cpufreq_cdev->max_level; level++)
> +               if (freq > freq_table[level].frequency)
> +                       break;
> +
> +       return level - 1;
> +}
> +
>  /**
>   * update_freq_table() - Update the freq table with power numbers
>   * @cpufreq_cdev:      the cpufreq cooling device in which to update the table
> @@ -333,80 +334,6 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_cdev,
>         return (raw_cpu_power * cpufreq_cdev->last_load) / 100;
>  }
>  
> -/* cpufreq cooling device callback functions are defined below */
> -
> -/**
> - * cpufreq_get_max_state - callback function to get the max cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: fill this variable with the max cooling state.
> - *
> - * Callback for the thermal cooling device to return the cpufreq
> - * max cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
> -                                unsigned long *state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -
> -       *state = cpufreq_cdev->max_level;
> -       return 0;
> -}
> -
> -/**
> - * cpufreq_get_cur_state - callback function to get the current cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: fill this variable with the current cooling state.
> - *
> - * Callback for the thermal cooling device to return the cpufreq
> - * current cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
> -                                unsigned long *state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -
> -       *state = cpufreq_cdev->cpufreq_state;
> -
> -       return 0;
> -}
> -
> -/**
> - * cpufreq_set_cur_state - callback function to set the current cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: set this variable to the current cooling state.
> - *
> - * Callback for the thermal cooling device to change the cpufreq
> - * current cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
> -                                unsigned long state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -       unsigned int clip_freq;
> -
> -       /* Request state should be less than max_level */
> -       if (WARN_ON(state > cpufreq_cdev->max_level))
> -               return -EINVAL;
> -
> -       /* Check if the old cooling action is same as new cooling action */
> -       if (cpufreq_cdev->cpufreq_state == state)
> -               return 0;
> -
> -       clip_freq = cpufreq_cdev->freq_table[state].frequency;
> -       cpufreq_cdev->cpufreq_state = state;
> -       cpufreq_cdev->clipped_freq = clip_freq;
> -
> -       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
> -
> -       return 0;
> -}
> -
>  /**
>   * cpufreq_get_requested_power() - get the current power
>   * @cdev:      &thermal_cooling_device pointer
> @@ -551,22 +478,93 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
>                                       power);
>         return 0;
>  }
> +#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
> +
> +/* cpufreq cooling device callback functions are defined below */
> +
> +/**
> + * cpufreq_get_max_state - callback function to get the max cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: fill this variable with the max cooling state.
> + *
> + * Callback for the thermal cooling device to return the cpufreq
> + * max cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
> +                                unsigned long *state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +
> +       *state = cpufreq_cdev->max_level;
> +       return 0;
> +}
> +
> +/**
> + * cpufreq_get_cur_state - callback function to get the current cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: fill this variable with the current cooling state.
> + *
> + * Callback for the thermal cooling device to return the cpufreq
> + * current cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
> +                                unsigned long *state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +
> +       *state = cpufreq_cdev->cpufreq_state;
> +
> +       return 0;
> +}
> +
> +/**
> + * cpufreq_set_cur_state - callback function to set the current cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: set this variable to the current cooling state.
> + *
> + * Callback for the thermal cooling device to change the cpufreq
> + * current cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
> +                                unsigned long state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +       unsigned int clip_freq;
> +
> +       /* Request state should be less than max_level */
> +       if (WARN_ON(state > cpufreq_cdev->max_level))
> +               return -EINVAL;
> +
> +       /* Check if the old cooling action is same as new cooling action */
> +       if (cpufreq_cdev->cpufreq_state == state)
> +               return 0;
> +
> +       clip_freq = cpufreq_cdev->freq_table[state].frequency;
> +       cpufreq_cdev->cpufreq_state = state;
> +       cpufreq_cdev->clipped_freq = clip_freq;
> +
> +       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
> +
> +       return 0;
> +}
>  
>  /* Bind cpufreq callbacks to thermal cooling device ops */
>  
>  static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
> -       .get_max_state = cpufreq_get_max_state,
> -       .get_cur_state = cpufreq_get_cur_state,
> -       .set_cur_state = cpufreq_set_cur_state,
> -};
> -
> -static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         .get_requested_power    = cpufreq_get_requested_power,
>         .state2power            = cpufreq_state2power,
>         .power2state            = cpufreq_power2state,
> +#endif
>  };
>  
>  /* Notifier for cpufreq policy change */
> @@ -674,17 +672,16 @@ __cpufreq_cooling_register(struct device_node *np,
>                         pr_debug("%s: freq:%u KHz\n", __func__, freq);
>         }
>  
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         if (capacitance) {
>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>                 if (ret) {
>                         cdev = ERR_PTR(ret);
>                         goto remove_ida;
>                 }
> -
> -               cooling_ops = &cpufreq_power_cooling_ops;
> -       } else {
> -               cooling_ops = &cpufreq_cooling_ops;
>         }
> +#endif
> +       cooling_ops = &cpufreq_cooling_ops;

Argh, that is actually broken with !capacitance and
THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
thermal_cooling_device_ops struct separated in the end.

>  
>         cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
>                                                   cooling_ops);
Daniel Lezcano May 15, 2019, 10:51 a.m. UTC | #9
On 15/05/2019 12:46, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:

[ ... ]

>> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>         if (capacitance) {
>>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>>                 if (ret) {
>>                         cdev = ERR_PTR(ret);
>>                         goto remove_ida;
>>                 }
>> -
>> -               cooling_ops = &cpufreq_power_cooling_ops;
>> -       } else {
>> -               cooling_ops = &cpufreq_cooling_ops;
>>         }
>> +#endif
>> +       cooling_ops = &cpufreq_cooling_ops;
> 
> Argh, that is actually broken with !capacitance and
> THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
> thermal_cooling_device_ops struct separated in the end.

Or alternatively you can keep one structure but instead of filling the
state2power,power2state and getrequestedpower fields in the declaration,
you fill them in the if (capacitance) block, no?
Daniel Lezcano May 15, 2019, 10:54 a.m. UTC | #10
On 15/05/2019 12:40, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
>> On 15-05-19, 12:16, Daniel Lezcano wrote:
>>> Viresh what do you think ?
>>
>> I agree with your last suggestions. They do make sense.
> 
> Good :-)
> 
> So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> test it and clean it up some more and put it as patch 1 in the series if
> that's OK.
> 
> Thanks,
> Quentin
> 
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index f7c1f49ec87f..ee431848ef71 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -58,7 +58,9 @@
>   */
>  struct freq_table {
>         u32 frequency;

I suspect we will have a problem here as cpufreq_set_cur_state uses the
frequency and when switching to EM, we will still need a freq table.
Quentin Perret May 15, 2019, 10:57 a.m. UTC | #11
On Wednesday 15 May 2019 at 12:54:10 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 12:40, Quentin Perret wrote:
> > On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> >> On 15-05-19, 12:16, Daniel Lezcano wrote:
> >>> Viresh what do you think ?
> >>
> >> I agree with your last suggestions. They do make sense.
> > 
> > Good :-)
> > 
> > So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> > test it and clean it up some more and put it as patch 1 in the series if
> > that's OK.
> > 
> > Thanks,
> > Quentin
> > 
> > 
> > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> > index f7c1f49ec87f..ee431848ef71 100644
> > --- a/drivers/thermal/cpu_cooling.c
> > +++ b/drivers/thermal/cpu_cooling.c
> > @@ -58,7 +58,9 @@
> >   */
> >  struct freq_table {
> >         u32 frequency;
> 
> I suspect we will have a problem here as cpufreq_set_cur_state uses the
> frequency and when switching to EM, we will still need a freq table.

Indeed, I'll need a bit of ifdefery in the get_state_freq() function
introduced in patch 3, but nothing too horrible I hope.

Thanks,
Quentin
Quentin Perret May 15, 2019, 11:01 a.m. UTC | #12
On Wednesday 15 May 2019 at 12:51:57 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 12:46, Quentin Perret wrote:
> > On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
> 
> [ ... ]
> 
> >> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> >>         if (capacitance) {
> >>                 ret = update_freq_table(cpufreq_cdev, capacitance);
> >>                 if (ret) {
> >>                         cdev = ERR_PTR(ret);
> >>                         goto remove_ida;
> >>                 }
> >> -
> >> -               cooling_ops = &cpufreq_power_cooling_ops;
> >> -       } else {
> >> -               cooling_ops = &cpufreq_cooling_ops;
> >>         }
> >> +#endif
> >> +       cooling_ops = &cpufreq_cooling_ops;
> > 
> > Argh, that is actually broken with !capacitance and
> > THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
> > thermal_cooling_device_ops struct separated in the end.
> 
> Or alternatively you can keep one structure but instead of filling the
> state2power,power2state and getrequestedpower fields in the declaration,
> you fill them in the if (capacitance) block, no?

Something like the below ? Yes, that works too. I'll write a proper
patch and send that next week or so.

Thanks,
Quentin

--->8---

 /* Bind cpufreq callbacks to thermal cooling device ops */

 static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
-       .get_max_state = cpufreq_get_max_state,
-       .get_cur_state = cpufreq_get_cur_state,
-       .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
-       .get_requested_power    = cpufreq_get_requested_power,
-       .state2power            = cpufreq_state2power,
-       .power2state            = cpufreq_power2state,
 };

 /* Notifier for cpufreq policy change */
@@ -674,18 +667,19 @@ __cpufreq_cooling_register(struct device_node *np,
                        pr_debug("%s: freq:%u KHz\n", __func__, freq);
        }

+       cooling_ops = &cpufreq_cooling_ops;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        if (capacitance) {
                ret = update_freq_table(cpufreq_cdev, capacitance);
                if (ret) {
                        cdev = ERR_PTR(ret);
                        goto remove_ida;
                }
-
-               cooling_ops = &cpufreq_power_cooling_ops;
-       } else {
-               cooling_ops = &cpufreq_cooling_ops;
+               cooling_ops->get_requested_power = cpufreq_get_requested_power;
+               cooling_ops->state2power = cpufreq_state2power;
+               cooling_ops->power2state = cpufreq_power2state;
        }
-
+#endif
        cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
                                                  cooling_ops);
        if (IS_ERR(cdev))
Daniel Lezcano May 15, 2019, 11:07 a.m. UTC | #13
On 15/05/2019 13:01, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 12:51:57 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 12:46, Quentin Perret wrote:
>>> On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
>>
>> [ ... ]
>>
>>>> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>>>         if (capacitance) {
>>>>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>>>>                 if (ret) {
>>>>                         cdev = ERR_PTR(ret);
>>>>                         goto remove_ida;
>>>>                 }
>>>> -
>>>> -               cooling_ops = &cpufreq_power_cooling_ops;
>>>> -       } else {
>>>> -               cooling_ops = &cpufreq_cooling_ops;
>>>>         }
>>>> +#endif
>>>> +       cooling_ops = &cpufreq_cooling_ops;
>>>
>>> Argh, that is actually broken with !capacitance and
>>> THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
>>> thermal_cooling_device_ops struct separated in the end.
>>
>> Or alternatively you can keep one structure but instead of filling the
>> state2power,power2state and getrequestedpower fields in the declaration,
>> you fill them in the if (capacitance) block, no?
> 
> Something like the below ? Yes, that works too. I'll write a proper
> patch and send that next week or so.

Yes, exactly. And IMHO, that helps for the understanding of code also.

> --->8---
> 
>  /* Bind cpufreq callbacks to thermal cooling device ops */
> 
>  static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
> -       .get_max_state = cpufreq_get_max_state,
> -       .get_cur_state = cpufreq_get_cur_state,
> -       .set_cur_state = cpufreq_set_cur_state,
> -};
> -
> -static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> -       .get_requested_power    = cpufreq_get_requested_power,
> -       .state2power            = cpufreq_state2power,
> -       .power2state            = cpufreq_power2state,
>  };
> 
>  /* Notifier for cpufreq policy change */
> @@ -674,18 +667,19 @@ __cpufreq_cooling_register(struct device_node *np,
>                         pr_debug("%s: freq:%u KHz\n", __func__, freq);
>         }
> 
> +       cooling_ops = &cpufreq_cooling_ops;
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         if (capacitance) {
>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>                 if (ret) {
>                         cdev = ERR_PTR(ret);
>                         goto remove_ida;
>                 }
> -
> -               cooling_ops = &cpufreq_power_cooling_ops;
> -       } else {
> -               cooling_ops = &cpufreq_cooling_ops;
> +               cooling_ops->get_requested_power = cpufreq_get_requested_power;
> +               cooling_ops->state2power = cpufreq_state2power;
> +               cooling_ops->power2state = cpufreq_power2state;
>         }
> -
> +#endif
>         cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
>                                                   cooling_ops);
>         if (IS_ERR(cdev))
>

Patch
diff mbox series

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index aa027f7bcb3e..fb32b86a467d 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -9,7 +9,6 @@ 
 #include <linux/sched/topology.h>
 #include <linux/types.h>
 
-#ifdef CONFIG_ENERGY_MODEL
 /**
  * em_cap_state - Capacity state of a performance domain
  * @frequency:	The CPU frequency in KHz, for consistency with CPUFreq
@@ -40,6 +39,7 @@  struct em_perf_domain {
 	unsigned long cpus[0];
 };
 
+#ifdef CONFIG_ENERGY_MODEL
 #define EM_CPU_MAX_POWER 0xFFFF
 
 struct em_data_callback {
@@ -160,7 +160,6 @@  static inline int em_pd_nr_cap_states(struct em_perf_domain *pd)
 }
 
 #else
-struct em_perf_domain {};
 struct em_data_callback {};
 #define EM_DATA_CB(_active_power_cb) { }