diff mbox

cpufreq: acpi_cpufreq: base frequency attribute support

Message ID 1443738317-4224-1-git-send-email-srinivas.pandruvada@linux.intel.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

srinivas pandruvada Oct. 1, 2015, 10:25 p.m. UTC
Currently scaling_available_frequencies displays list of available
frequencies which can be used to set max/min or current scaling frequency.

>cat scaling_available_frequencies
2301000 2300000 2200000 2000000 1900000 1800000 1700000 1500000 1400000
1300000 1100000 1000000 900000 800000 600000 500000

Here traditionally it is assumed that only 2301000 is a turbo frequency,
which is purely opportunistic, anything else user can request and may
get it.
But because of configurable thermal design power implementation in several
Intel CPUs, the opportunistic frequency start can be any frequency in this
range. For example it can be 2300000 or any lower value.
This change adds an optional new attribute called "base_frequency",
which displays the max non-turbo frequency (base frequency). For example:
>cat base_frequency
2200000
This will allow user to choose a certain frequency which is not
opportunistic.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
 drivers/cpufreq/acpi-cpufreq.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

Comments

Viresh Kumar Oct. 7, 2015, 5:23 p.m. UTC | #1
On 01-10-15, 15:25, Srinivas Pandruvada wrote:
> Currently scaling_available_frequencies displays list of available
> frequencies which can be used to set max/min or current scaling frequency.
> 
> >cat scaling_available_frequencies
> 2301000 2300000 2200000 2000000 1900000 1800000 1700000 1500000 1400000
> 1300000 1100000 1000000 900000 800000 600000 500000
> 
> Here traditionally it is assumed that only 2301000 is a turbo frequency,
> which is purely opportunistic, anything else user can request and may
> get it.
> But because of configurable thermal design power implementation in several
> Intel CPUs, the opportunistic frequency start can be any frequency in this
> range. For example it can be 2300000 or any lower value.
> This change adds an optional new attribute called "base_frequency",
> which displays the max non-turbo frequency (base frequency). For example:
> >cat base_frequency
> 2200000
> This will allow user to choose a certain frequency which is not
> opportunistic.
> 
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> ---
>  drivers/cpufreq/acpi-cpufreq.c | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)

Look for scaling_boost_frequencies_show() and see if you can reuse it.
srinivas pandruvada Oct. 9, 2015, 3:34 p.m. UTC | #2
On Wed, 2015-10-07 at 22:53 +0530, Viresh Kumar wrote:
> On 01-10-15, 15:25, Srinivas Pandruvada wrote:
> > Currently scaling_available_frequencies displays list of available
> > frequencies which can be used to set max/min or current scaling frequency.
> > 
> > >cat scaling_available_frequencies
> > 2301000 2300000 2200000 2000000 1900000 1800000 1700000 1500000 1400000
> > 1300000 1100000 1000000 900000 800000 600000 500000
> > 
> > Here traditionally it is assumed that only 2301000 is a turbo frequency,
> > which is purely opportunistic, anything else user can request and may
> > get it.
> > But because of configurable thermal design power implementation in several
> > Intel CPUs, the opportunistic frequency start can be any frequency in this
> > range. For example it can be 2300000 or any lower value.
> > This change adds an optional new attribute called "base_frequency",
> > which displays the max non-turbo frequency (base frequency). For example:
> > >cat base_frequency
> > 2200000
> > This will allow user to choose a certain frequency which is not
> > opportunistic.
> > 
> > Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> > ---
> >  drivers/cpufreq/acpi-cpufreq.c | 34 ++++++++++++++++++++++++++++++++++
> >  1 file changed, 34 insertions(+)
> 
> Look for scaling_boost_frequencies_show() and see if you can reuse it.
Initially I have planned to use this, but there is a issue in using it.
Checking internally again.

Thanks,
Srinivas
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Oct. 15, 2015, 10:45 p.m. UTC | #3
On Wednesday, October 07, 2015 10:53:43 PM Viresh Kumar wrote:
> On 01-10-15, 15:25, Srinivas Pandruvada wrote:
> > Currently scaling_available_frequencies displays list of available
> > frequencies which can be used to set max/min or current scaling frequency.
> > 
> > >cat scaling_available_frequencies
> > 2301000 2300000 2200000 2000000 1900000 1800000 1700000 1500000 1400000
> > 1300000 1100000 1000000 900000 800000 600000 500000
> > 
> > Here traditionally it is assumed that only 2301000 is a turbo frequency,
> > which is purely opportunistic, anything else user can request and may
> > get it.
> > But because of configurable thermal design power implementation in several
> > Intel CPUs, the opportunistic frequency start can be any frequency in this
> > range. For example it can be 2300000 or any lower value.
> > This change adds an optional new attribute called "base_frequency",
> > which displays the max non-turbo frequency (base frequency). For example:
> > >cat base_frequency
> > 2200000
> > This will allow user to choose a certain frequency which is not
> > opportunistic.
> > 
> > Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> > ---
> >  drivers/cpufreq/acpi-cpufreq.c | 34 ++++++++++++++++++++++++++++++++++
> >  1 file changed, 34 insertions(+)
> 
> Look for scaling_boost_frequencies_show() and see if you can reuse it.

Are you sure that would match the use case?

We're talking about a continuous range above the base frequency here.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar Oct. 16, 2015, 5:42 a.m. UTC | #4
On 16-10-15, 00:45, Rafael J. Wysocki wrote:
> We're talking about a continuous range above the base frequency here.

Ick. No it can't be reused for the continuous range stuff.
srinivas pandruvada Feb. 24, 2016, 8 p.m. UTC | #5
On Fri, 2015-10-16 at 11:12 +0530, Viresh Kumar wrote:
> On 16-10-15, 00:45, Rafael J. Wysocki wrote:
> > We're talking about a continuous range above the base frequency
> > here.
> 
> Ick. No it can't be reused for the continuous range stuff.
> 
I know this is bit old discussion. Is there any problem in applying
this patch?

Thanks,
Srinivas
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Feb. 24, 2016, 8:05 p.m. UTC | #6
On Wed, Feb 24, 2016 at 9:00 PM, Srinivas Pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
> On Fri, 2015-10-16 at 11:12 +0530, Viresh Kumar wrote:
>> On 16-10-15, 00:45, Rafael J. Wysocki wrote:
>> > We're talking about a continuous range above the base frequency
>> > here.
>>
>> Ick. No it can't be reused for the continuous range stuff.
>>
> I know this is bit old discussion. Is there any problem in applying
> this patch?

Not in principle, but can you please resubmit?

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
srinivas pandruvada Feb. 24, 2016, 11:37 p.m. UTC | #7
On Wed, 2016-02-24 at 21:05 +0100, Rafael J. Wysocki wrote:
> On Wed, Feb 24, 2016 at 9:00 PM, Srinivas Pandruvada
> <srinivas.pandruvada@linux.intel.com> wrote:
> > On Fri, 2015-10-16 at 11:12 +0530, Viresh Kumar wrote:
> > > On 16-10-15, 00:45, Rafael J. Wysocki wrote:
> > > > We're talking about a continuous range above the base frequency
> > > > here.
> > > 
> > > Ick. No it can't be reused for the continuous range stuff.
> > > 
> > I know this is bit old discussion. Is there any problem in applying
> > this patch?
> 
> Not in principle, but can you please resubmit?
> 
Good I tested again.

Hi Viresh,

Are we no longer allowed to call
	sysfs_create_file in cpufreq_driver.init() callback?

The following call crashes because of BUG_ON for !kobj->sd:
	sysfs_create_file(&policy->kobj,
                                  &(cpufreq_attr_base_frequency.attr));

Before I debug further, I want to check with you.
This used to work before.

Thanks,
Srinivas

> Thanks,
> Rafael
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar Feb. 25, 2016, 3:27 a.m. UTC | #8
On 24-02-16, 15:37, Srinivas Pandruvada wrote:
> Good I tested again.
> 
> Hi Viresh,
> 
> Are we no longer allowed to call
> 	sysfs_create_file in cpufreq_driver.init() callback?
> 
> The following call crashes because of BUG_ON for !kobj->sd:
> 	sysfs_create_file(&policy->kobj,
>                                   &(cpufreq_attr_base_frequency.attr));
> 
> Before I debug further, I want to check with you.
> This used to work before.

Haven't checked that yet, but you should be using cpufreq_driver.attr for this
kind of stuff, isn't it ?
srinivas pandruvada Feb. 25, 2016, 6:07 p.m. UTC | #9
On Thu, 2016-02-25 at 08:57 +0530, Viresh Kumar wrote:
> On 24-02-16, 15:37, Srinivas Pandruvada wrote:
> > Good I tested again.
> > 
> > Hi Viresh,
> > 
> > Are we no longer allowed to call
> > 	sysfs_create_file in cpufreq_driver.init() callback?
> > 
> > The following call crashes because of BUG_ON for !kobj->sd:
> > 	sysfs_create_file(&policy->kobj,
> >                                  
> > &(cpufreq_attr_base_frequency.attr));
> > 
> > Before I debug further, I want to check with you.
> > This used to work before.
> 
> Haven't checked that yet, but you should be using cpufreq_driver.attr
> for this
> kind of stuff, isn't it ?
If I use cpufreq_driver.attr, then it will create sysfs attribute for
every system using acpi-cpufreq, whether they can support it or not.
This change is only needed for the later generation of Intel CPUs
(IvyBridge and later). 
There is no standard ACPI way to know the base frequency if we add
attribute for all systems using acpi-cpufreq.

Thanks,
Srinivas

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pandruvada, Srinivas Feb. 26, 2016, 1:10 a.m. UTC | #10
On Thu, 2016-02-25 at 10:07 -0800, Srinivas Pandruvada wrote:
> On Thu, 2016-02-25 at 08:57 +0530, Viresh Kumar wrote:

> > On 24-02-16, 15:37, Srinivas Pandruvada wrote:

> > > Good I tested again.

> > > 

> > > Hi Viresh,

> > > 

> > > Are we no longer allowed to call

> > > 	sysfs_create_file in cpufreq_driver.init() callback?

> > > 

> > > The following call crashes because of BUG_ON for !kobj->sd:

> > > 	sysfs_create_file(&policy->kobj,

> > >                                  

> > > &(cpufreq_attr_base_frequency.attr));

> > > 

> > > Before I debug further, I want to check with you.

> > > This used to work before.

> > 

> > Haven't checked that yet, but you should be using

> > cpufreq_driver.attr

> > for this

> > kind of stuff, isn't it ?

> If I use cpufreq_driver.attr, then it will create sysfs attribute for

> every system using acpi-cpufreq, whether they can support it or not.

> This change is only needed for the later generation of Intel CPUs

> (IvyBridge and later). 

> There is no standard ACPI way to know the base frequency if we add

> attribute for all systems using acpi-cpufreq.

> 

There is used to be

cpufreq_policy_alloc() {
..
    ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq,
&dev>kobj, "cpufreq");
}

Now it is removed, so we don't have directory entry for policy->kobj to
add during init() callback. Now we have kobject_add after init()
callback returns for policy%u.

Thanks,
Srinivas
Viresh Kumar Feb. 26, 2016, 1:57 a.m. UTC | #11
On 25-02-16, 10:07, Srinivas Pandruvada wrote:
> If I use cpufreq_driver.attr, then it will create sysfs attribute for
> every system using acpi-cpufreq, whether they can support it or not.
> This change is only needed for the later generation of Intel CPUs
> (IvyBridge and later). 
> There is no standard ACPI way to know the base frequency if we add
> attribute for all systems using acpi-cpufreq.

Why can't you set cpufreq_driver.attr selectively from probe()?

See below for reference.

commit 21c36d35711d ("cpufreq-dt: make scaling_boost_freqs sysfs attr available
when boost is enabled")
srinivas pandruvada Feb. 26, 2016, 8:21 p.m. UTC | #12
On Fri, 2016-02-26 at 07:27 +0530, Viresh Kumar wrote:
> On 25-02-16, 10:07, Srinivas Pandruvada wrote:
> > If I use cpufreq_driver.attr, then it will create sysfs attribute
> > for
> > every system using acpi-cpufreq, whether they can support it or
> > not.
> > This change is only needed for the later generation of Intel CPUs
> > (IvyBridge and later). 
> > There is no standard ACPI way to know the base frequency if we add
> > attribute for all systems using acpi-cpufreq.
> 
> Why can't you set cpufreq_driver.attr selectively from probe()?
> 
We are setting a cpufreq global variable in cpufreq_driver->attr with
this for each cpu. This feature can be absent in certain cpus. So
unlike boost it is not system wide, so I have to reset the attr to NULL
for some cpus.
Can we assume that cpufreq_driver->init(policy) calls are always
serialized from cpu online/offline and subsys_interface callback path?

Thanks,
Srinivas
> See below for reference.
> 
> commit 21c36d35711d ("cpufreq-dt: make scaling_boost_freqs sysfs attr
> available
> when boost is enabled")
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar Feb. 29, 2016, 3:16 a.m. UTC | #13
On 26-02-16, 12:21, Srinivas Pandruvada wrote:
> We are setting a cpufreq global variable in cpufreq_driver->attr with
> this for each cpu. This feature can be absent in certain cpus. So
> unlike boost it is not system wide, so I have to reset the attr to NULL
> for some cpus.

Ahh, I see..

What about always creating this file in sysfs, but allowing read/write only if
it is applicable to a CPU.

> Can we assume that cpufreq_driver->init(policy) calls are always
> serialized from cpu online/offline and subsys_interface callback path?

It is called from these paths but only on poilcy creation. So, if you are going
to have a single CPU for each policy, then yes.
srinivas pandruvada Feb. 29, 2016, 5:11 p.m. UTC | #14
On Mon, 2016-02-29 at 08:46 +0530, Viresh Kumar wrote:
> On 26-02-16, 12:21, Srinivas Pandruvada wrote:
> > We are setting a cpufreq global variable in cpufreq_driver->attr
> > with
> > this for each cpu. This feature can be absent in certain cpus. So
> > unlike boost it is not system wide, so I have to reset the attr to
> > NULL
> > for some cpus.
> 
> Ahh, I see..
> 
> What about always creating this file in sysfs, but allowing
> read/write only if
> it is applicable to a CPU.
> 
> > Can we assume that cpufreq_driver->init(policy) calls are always
> > serialized from cpu online/offline and subsys_interface callback
> > path?
> 
> It is called from these paths but only on poilcy creation. So, if you
> are going
> to have a single CPU for each policy, then yes.
> 
This depends on ACPI configuration, so not guaranteed to have one cpu
per policy. Since this is not guaranteed, I can't change mode to Read
only on some as after I set the cpufreq_driver->attr, the other call to
init may change the mode before the actual attribute creation.
So safe bet is to implement like boost for all CPUs and fail on read
for some CPUs, where this is not present. If this becomes a problem,
then we can revisit.

Thanks,
Srinivas
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar March 1, 2016, 2:16 a.m. UTC | #15
On 29-02-16, 09:11, Srinivas Pandruvada wrote:
> This depends on ACPI configuration, so not guaranteed to have one cpu
> per policy. Since this is not guaranteed, I can't change mode to Read
> only on some as after I set the cpufreq_driver->attr, the other call to
> init may change the mode before the actual attribute creation.
> So safe bet is to implement like boost for all CPUs and fail on read
> for some CPUs, where this is not present. If this becomes a problem,
> then we can revisit.

Perhaps don't look at it on per-cpu basis, but rather per-policy basis
(as the files are going to be shared).

So, maybe just create the attribute always and return errors by
default. Once ->init() is called and you know if a CPU supports it or
not, toggle some internal flag to allow read/write to those sysfs
files.
diff mbox

Patch

diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index cec1ee2..4433f6b 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -70,6 +70,7 @@  struct acpi_cpufreq_data {
 	unsigned int cpu_feature;
 	unsigned int acpi_perf_cpu;
 	cpumask_var_t freqdomain_cpus;
+	bool base_freq_attr_present;
 };
 
 /* acpi_perf_data is a pointer to percpu data. */
@@ -651,6 +652,21 @@  static int acpi_cpufreq_blacklist(struct cpuinfo_x86 *c)
 }
 #endif
 
+static ssize_t base_frequency_show(struct cpufreq_policy *policy, char *buf)
+{
+	u64 tar;
+	int err;
+
+	err = rdmsrl_safe(MSR_TURBO_ACTIVATION_RATIO, &tar);
+	if (!err)
+		/* Refer to IA64, IA32 SDM table 35-20, unit = 100 MHz */
+		return sprintf(buf, "%llu\n", tar * 100000);
+
+	return err;
+}
+
+static struct freq_attr cpufreq_attr_base_frequency = __ATTR_RO(base_frequency);
+
 static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
 {
 	unsigned int i;
@@ -827,6 +843,21 @@  static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
 		break;
 	}
 
+	if (data->cpu_feature == SYSTEM_INTEL_MSR_CAPABLE) {
+		u64 tar;
+		int err;
+
+		err = rdmsrl_safe(MSR_TURBO_ACTIVATION_RATIO, &tar);
+		if (!err) {
+			result = sysfs_create_file(&policy->kobj,
+					&(cpufreq_attr_base_frequency.attr));
+			if (result)
+				goto err_freqfree;
+
+			data->base_freq_attr_present = true;
+		}
+	}
+
 	/* notify BIOS that we exist */
 	acpi_processor_notify_smm(THIS_MODULE);
 
@@ -866,6 +897,9 @@  static int acpi_cpufreq_cpu_exit(struct cpufreq_policy *policy)
 	pr_debug("acpi_cpufreq_cpu_exit\n");
 
 	if (data) {
+		if (data->base_freq_attr_present)
+			sysfs_remove_file(&policy->kobj,
+					  &(cpufreq_attr_base_frequency.attr));
 		policy->driver_data = NULL;
 		acpi_processor_unregister_performance(data->acpi_perf_cpu);
 		free_cpumask_var(data->freqdomain_cpus);