
[v6,6/7,Resend] cpufreq: Support for fast frequency switching

Message ID 25154681.B5BGJ94JlQ@vostro.rjw.lan (mailing list archive)
State Superseded, archived
Delegated to: Rafael Wysocki

Commit Message

Rafael J. Wysocki March 22, 2016, 1:53 a.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify the ACPI cpufreq driver to provide a method for switching
CPU frequencies from interrupt context and update the cpufreq core
to support that method if available.

Introduce a new cpufreq driver callback, ->fast_switch, to be
invoked for frequency switching from interrupt context by (future)
governors supporting that feature via (new) helper function
cpufreq_driver_fast_switch().

Add two new policy flags: fast_switch_possible, to be set by the
cpufreq driver if fast frequency switching can be used for the
given policy, and fast_switch_enabled, to be set by the governor
if it is going to use fast frequency switching for the given
policy.  Also add a helper for setting the latter.
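
For illustration, a governor is expected to call the helper while
setting up the policy (with policy->rwsem held) and, if that ends up
setting fast_switch_enabled, to switch frequencies through
cpufreq_driver_fast_switch() from its non-sleeping update path.  A
rough sketch (not part of this patch; the governor_* names are made
up):

	/* Policy setup, called with policy->rwsem held. */
	static int governor_start(struct cpufreq_policy *policy)
	{
		cpufreq_enable_fast_switch(policy);
		/* ... remaining governor setup ... */
		return 0;
	}

	/* Frequency update path, must not sleep. */
	static void governor_update_freq(struct cpufreq_policy *policy,
					 unsigned int target_freq)
	{
		if (policy->fast_switch_enabled)
			cpufreq_driver_fast_switch(policy, target_freq);
		/*
		 * Otherwise the governor has to go through the usual
		 * cpufreq_driver_target() route from process context.
		 */
	}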

Since fast frequency switching is inherently incompatible with
cpufreq transition notifiers, make it possible to set
fast_switch_enabled only if there are no transition notifiers
registered already, and make the registration of new transition
notifiers fail if fast_switch_enabled is set for at least one
policy.

Implement the ->fast_switch callback in the ACPI cpufreq driver
and make it set fast_switch_possible during policy initialization
as appropriate.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

Changes from v5:
- cpufreq_enable_fast_switch() fixed to avoid printing a confusing message
  if fast_switch_possible is not set for the policy.
- Fixed a typo in that message.
- Removed the WARN_ON() from the (cpufreq_fast_switch_count > 0) check in
  cpufreq_register_notifier(), because it triggered false-positive warnings
  from the cpufreq_stats module (cpufreq_stats does not work with fast
  switching, because it is based on notifiers).

Changes from v4:
- If cpufreq_enable_fast_switch() is about to fail, it will print the list
  of currently registered transition notifiers.
- Added lockdep_assert_held(&policy->rwsem) to cpufreq_enable_fast_switch().
- Added WARN_ON() to the (cpufreq_fast_switch_count > 0) check in
  cpufreq_register_notifier().
- Modified the kerneldoc comment of cpufreq_driver_fast_switch() to
  mention the RELATION_L expectation regarding the ->fast_switch callback.

Changes from v3:
- New fast_switch_enabled field in struct cpufreq_policy, so that setting
  the fast_switch_possible flag in the driver does not affect existing
  setups.
- __cpufreq_get() skips the policy->cur check if fast_switch_enabled is set.

Changes from v2:
- The driver ->fast_switch callback and cpufreq_driver_fast_switch()
  don't need the relation argument as they will always do RELATION_L now.
- New mechanism to make fast switch and cpufreq notifiers mutually
  exclusive.
- cpufreq_driver_fast_switch() doesn't do anything in addition to
  invoking the driver callback and returns its return value.

---
 drivers/cpufreq/acpi-cpufreq.c |   41 ++++++++++++
 drivers/cpufreq/cpufreq.c      |  130 ++++++++++++++++++++++++++++++++++++++---
 include/linux/cpufreq.h        |    9 ++
 3 files changed, 171 insertions(+), 9 deletions(-)



Comments

Steve Muckle March 26, 2016, 1:12 a.m. UTC | #1
Hi Rafael,

On 03/21/2016 06:53 PM, Rafael J. Wysocki wrote:
> Add two new policy flags, fast_switch_possible, to be set by the
> cpufreq driver if fast frequency switching can be used for the
> given policy and fast_switch_enabled, to be set by the governor
> if it is going to use fast frequency switching for the given
> policy.  Also add a helper for setting the latter.
...
> @@ -740,6 +777,9 @@ static int acpi_cpufreq_cpu_init(struct
>  		goto err_unreg;
>  	}
>  
> +	policy->fast_switch_possible = !acpi_pstate_strict &&
> +		!(policy_is_shared(policy) && policy->shared_type != CPUFREQ_SHARED_TYPE_ANY);

Could the policy->fast_switch_possible flag be avoided by just checking
whether a driver has registered the .fast_switch callback?

...
> @@ -1726,6 +1810,34 @@ EXPORT_SYMBOL(cpufreq_unregister_notifie
>   *                              GOVERNORS                            *
>   *********************************************************************/
>  
> +/**
> + * cpufreq_driver_fast_switch - Carry out a fast CPU frequency switch.
> + * @policy: cpufreq policy to switch the frequency for.
> + * @target_freq: New frequency to set (may be approximate).
> + *
> + * Carry out a fast frequency switch from interrupt context.

I think that should say atomic rather than interrupt as this might not
be called from interrupt context.

thanks,
Steve
Rafael J. Wysocki March 26, 2016, 1:46 a.m. UTC | #2
On Sat, Mar 26, 2016 at 2:12 AM, Steve Muckle <steve.muckle@linaro.org> wrote:
> Hi Rafael,
>
> On 03/21/2016 06:53 PM, Rafael J. Wysocki wrote:
>> Add two new policy flags, fast_switch_possible, to be set by the
>> cpufreq driver if fast frequency switching can be used for the
>> given policy and fast_switch_enabled, to be set by the governor
>> if it is going to use fast frequency switching for the given
>> policy.  Also add a helper for setting the latter.
> ...
>> @@ -740,6 +777,9 @@ static int acpi_cpufreq_cpu_init(struct
>>               goto err_unreg;
>>       }
>>
>> +     policy->fast_switch_possible = !acpi_pstate_strict &&
>> +             !(policy_is_shared(policy) && policy->shared_type != CPUFREQ_SHARED_TYPE_ANY);
>
> Could the policy->fast_switch_possible flag be avoided by just checking
> whether a driver has registered the .fast_switch callback?

No, it couldn't.

As in this case: the driver has the ->fast_switch callback, but it
can't be used for policies that don't satisfy the above condition,
while in principle it may still be usable for other policies on the
same system.

> ...
>> @@ -1726,6 +1810,34 @@ EXPORT_SYMBOL(cpufreq_unregister_notifie
>>   *                              GOVERNORS                            *
>>   *********************************************************************/
>>
>> +/**
>> + * cpufreq_driver_fast_switch - Carry out a fast CPU frequency switch.
>> + * @policy: cpufreq policy to switch the frequency for.
>> + * @target_freq: New frequency to set (may be approximate).
>> + *
>> + * Carry out a fast frequency switch from interrupt context.
>
> I think that should say atomic rather than interrupt as this might not
> be called from interrupt context.

"Interrupt context" here means something like "context that cannot
sleep" and it's sort of a traditional way of calling that.  I
considered saying "atomic context" here, but then decided that it
might suggest too much.

Maybe something like "Carry out a fast frequency switch without
sleeping" would be better?

Thanks,
Rafael
Rafael J. Wysocki March 27, 2016, 1:27 a.m. UTC | #3
On Sat, Mar 26, 2016 at 2:46 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Sat, Mar 26, 2016 at 2:12 AM, Steve Muckle <steve.muckle@linaro.org> wrote:
>> Hi Rafael,
>>
>> On 03/21/2016 06:53 PM, Rafael J. Wysocki wrote:
>>> Add two new policy flags, fast_switch_possible, to be set by the
>>> cpufreq driver if fast frequency switching can be used for the
>>> given policy and fast_switch_enabled, to be set by the governor
>>> if it is going to use fast frequency switching for the given
>>> policy.  Also add a helper for setting the latter.
>> ...
>>> @@ -740,6 +777,9 @@ static int acpi_cpufreq_cpu_init(struct
>>>               goto err_unreg;
>>>       }
>>>
>>> +     policy->fast_switch_possible = !acpi_pstate_strict &&
>>> +             !(policy_is_shared(policy) && policy->shared_type != CPUFREQ_SHARED_TYPE_ANY);
>>
>> Could the policy->fast_switch_possible flag be avoided by just checking
>> whether a driver has registered the .fast_switch callback?
>
> No, it couldn't.
>
> As in this case, the driver has the ->fast_switch callback, but it
> can't be used for policies that don't satisfy the above condition.  At
> the same time it may be possible to use it for other policies on the
> same system in principle.

In fact, for fast switching to be useful, the driver has to guarantee
that the frequency can be updated on any of the policy CPUs (and it
doesn't matter which of them updates the frequency) and that's what
the fast_switch_possible flag is really for.  I guess I should add a
comment to that effect to its definition.
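
Something along these lines, perhaps (only a sketch of the wording):

	/*
	 * Set by the driver if it is safe to update the frequency of
	 * this policy's CPUs from any one of them (and it does not
	 * matter which one actually does it), which is what fast
	 * switching relies on.
	 */
	bool			fast_switch_possible;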

Thanks,
Rafael
Viresh Kumar March 28, 2016, 6:27 a.m. UTC | #4
Sorry for jumping in late, was busy with other stuff and travel :(

On 22-03-16, 02:53, Rafael J. Wysocki wrote:
> Index: linux-pm/drivers/cpufreq/acpi-cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/acpi-cpufreq.c
> +++ linux-pm/drivers/cpufreq/acpi-cpufreq.c
> @@ -458,6 +458,43 @@ static int acpi_cpufreq_target(struct cp
>  	return result;
>  }
>  
> +unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
> +				      unsigned int target_freq)
> +{
> +	struct acpi_cpufreq_data *data = policy->driver_data;
> +	struct acpi_processor_performance *perf;
> +	struct cpufreq_frequency_table *entry;
> +	unsigned int next_perf_state, next_freq, freq;
> +
> +	/*
> +	 * Find the closest frequency above target_freq.
> +	 *
> +	 * The table is sorted in the reverse order with respect to the
> +	 * frequency and all of the entries are valid (see the initialization).
> +	 */
> +	entry = data->freq_table;
> +	do {
> +		entry++;
> +		freq = entry->frequency;
> +	} while (freq >= target_freq && freq != CPUFREQ_TABLE_END);
> +	entry--;
> +	next_freq = entry->frequency;
> +	next_perf_state = entry->driver_data;
> +
> +	perf = to_perf_data(data);
> +	if (perf->state == next_perf_state) {
> +		if (unlikely(data->resume))
> +			data->resume = 0;
> +		else
> +			return next_freq;
> +	}
> +
> +	data->cpu_freq_write(&perf->control_register,
> +			     perf->states[next_perf_state].control);
> +	perf->state = next_perf_state;
> +	return next_freq;
> +}
> +
>  static unsigned long
>  acpi_cpufreq_guess_freq(struct acpi_cpufreq_data *data, unsigned int cpu)
>  {
> @@ -740,6 +777,9 @@ static int acpi_cpufreq_cpu_init(struct
>  		goto err_unreg;
>  	}
>  
> +	policy->fast_switch_possible = !acpi_pstate_strict &&
> +		!(policy_is_shared(policy) && policy->shared_type != CPUFREQ_SHARED_TYPE_ANY);
> +
>  	data->freq_table = kzalloc(sizeof(*data->freq_table) *
>  		    (perf->state_count+1), GFP_KERNEL);
>  	if (!data->freq_table) {
> @@ -874,6 +914,7 @@ static struct freq_attr *acpi_cpufreq_at
>  static struct cpufreq_driver acpi_cpufreq_driver = {
>  	.verify		= cpufreq_generic_frequency_table_verify,
>  	.target_index	= acpi_cpufreq_target,
> +	.fast_switch	= acpi_cpufreq_fast_switch,
>  	.bios_limit	= acpi_processor_get_bios_limit,
>  	.init		= acpi_cpufreq_cpu_init,
>  	.exit		= acpi_cpufreq_cpu_exit,
> Index: linux-pm/include/linux/cpufreq.h
> ===================================================================
> --- linux-pm.orig/include/linux/cpufreq.h
> +++ linux-pm/include/linux/cpufreq.h
> @@ -102,6 +102,10 @@ struct cpufreq_policy {
>  	 */
>  	struct rw_semaphore	rwsem;
>  
> +	/* Fast switch flags */
> +	bool			fast_switch_possible;	/* Set by the driver. */
> +	bool			fast_switch_enabled;
> +
>  	/* Synchronization for frequency transitions */
>  	bool			transition_ongoing; /* Tracks transition status */
>  	spinlock_t		transition_lock;
> @@ -156,6 +160,7 @@ int cpufreq_get_policy(struct cpufreq_po
>  int cpufreq_update_policy(unsigned int cpu);
>  bool have_governor_per_policy(void);
>  struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
> +void cpufreq_enable_fast_switch(struct cpufreq_policy *policy);
>  #else
>  static inline unsigned int cpufreq_get(unsigned int cpu)
>  {
> @@ -236,6 +241,8 @@ struct cpufreq_driver {
>  				  unsigned int relation);	/* Deprecated */
>  	int		(*target_index)(struct cpufreq_policy *policy,
>  					unsigned int index);
> +	unsigned int	(*fast_switch)(struct cpufreq_policy *policy,
> +				       unsigned int target_freq);
>  	/*
>  	 * Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION
>  	 * unset.
> @@ -464,6 +471,8 @@ struct cpufreq_governor {
>  };
>  
>  /* Pass a target to the cpufreq driver */
> +unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
> +					unsigned int target_freq);
>  int cpufreq_driver_target(struct cpufreq_policy *policy,
>  				 unsigned int target_freq,
>  				 unsigned int relation);
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -428,6 +428,57 @@ void cpufreq_freq_transition_end(struct
>  }
>  EXPORT_SYMBOL_GPL(cpufreq_freq_transition_end);
>  
> +/*
> + * Fast frequency switching status count.  Positive means "enabled", negative
> + * means "disabled" and 0 means "not decided yet".
> + */
> +static int cpufreq_fast_switch_count;
> +static DEFINE_MUTEX(cpufreq_fast_switch_lock);
> +
> +static void cpufreq_list_transition_notifiers(void)
> +{
> +	struct notifier_block *nb;
> +
> +	pr_info("cpufreq: Registered transition notifiers:\n");
> +
> +	mutex_lock(&cpufreq_transition_notifier_list.mutex);
> +
> +	for (nb = cpufreq_transition_notifier_list.head; nb; nb = nb->next)
> +		pr_info("cpufreq: %pF\n", nb->notifier_call);
> +
> +	mutex_unlock(&cpufreq_transition_notifier_list.mutex);

This will get printed as:

cpufreq: cpufreq: Registered transition notifiers:
cpufreq: cpufreq: <func>+0x0/0x<address>
cpufreq: cpufreq: <func>+0x0/0x<address>
cpufreq: cpufreq: <func>+0x0/0x<address>

Maybe we want something like:
cpufreq: Registered transition notifiers:
  cpufreq: <func>+0x0/0x<address>
  cpufreq: <func>+0x0/0x<address>
  cpufreq: <func>+0x0/0x<address>

?

> +}
> +
> +/**
> + * cpufreq_enable_fast_switch - Enable fast frequency switching for policy.
> + * @policy: cpufreq policy to enable fast frequency switching for.
> + *
> + * Try to enable fast frequency switching for @policy.
> + *
> + * The attempt will fail if there is at least one transition notifier registered
> + * at this point, as fast frequency switching is quite fundamentally at odds
> + * with transition notifiers.  Thus if successful, it will make registration of
> + * transition notifiers fail going forward.
> + */
> +void cpufreq_enable_fast_switch(struct cpufreq_policy *policy)
> +{
> +	lockdep_assert_held(&policy->rwsem);
> +
> +	if (!policy->fast_switch_possible)
> +		return;
> +
> +	mutex_lock(&cpufreq_fast_switch_lock);
> +	if (cpufreq_fast_switch_count >= 0) {
> +		cpufreq_fast_switch_count++;
> +		policy->fast_switch_enabled = true;
> +	} else {
> +		pr_warn("cpufreq: CPU%u: Fast frequency switching not enabled\n",
> +			policy->cpu);
> +		cpufreq_list_transition_notifiers();
> +	}
> +	mutex_unlock(&cpufreq_fast_switch_lock);
> +}
> +EXPORT_SYMBOL_GPL(cpufreq_enable_fast_switch);

And, why don't we have support for disabling fast switching? What if we
switch to the schedutil governor (from userspace) and then back to
ondemand? We don't call the driver's ->exit() for that.

>  /*********************************************************************
>   *                          SYSFS INTERFACE                          *
> @@ -1083,6 +1134,24 @@ static void cpufreq_policy_free(struct c
>  	kfree(policy);
>  }
>  
> +static void cpufreq_driver_exit_policy(struct cpufreq_policy *policy)
> +{
> +	if (policy->fast_switch_enabled) {

Shouldn't this be accessed from within the lock as well?

> +		mutex_lock(&cpufreq_fast_switch_lock);
> +
> +		policy->fast_switch_enabled = false;
> +		if (!WARN_ON(cpufreq_fast_switch_count <= 0))
> +			cpufreq_fast_switch_count--;

Shouldn't we make it more efficient and write it as:

		WARN_ON(cpufreq_fast_switch_count <= 0);
		policy->fast_switch_enabled = false;
                cpufreq_fast_switch_count--;

The WARN check will hold true only for a major bug somewhere in the core and we
shall *never* hit it.

> +		mutex_unlock(&cpufreq_fast_switch_lock);
> +	}
> +
> +	if (cpufreq_driver->exit) {
> +		cpufreq_driver->exit(policy);
> +		policy->freq_table = NULL;
> +	}
> +}
> +
>  static int cpufreq_online(unsigned int cpu)
>  {
>  	struct cpufreq_policy *policy;
> @@ -1236,8 +1305,7 @@ static int cpufreq_online(unsigned int c
>  out_exit_policy:
>  	up_write(&policy->rwsem);
>  
> -	if (cpufreq_driver->exit)
> -		cpufreq_driver->exit(policy);
> +	cpufreq_driver_exit_policy(policy);
>  out_free_policy:
>  	cpufreq_policy_free(policy, !new_policy);
>  	return ret;
> @@ -1334,10 +1402,7 @@ static void cpufreq_offline(unsigned int
>  	 * since this is a core component, and is essential for the
>  	 * subsequent light-weight ->init() to succeed.
>  	 */
> -	if (cpufreq_driver->exit) {
> -		cpufreq_driver->exit(policy);
> -		policy->freq_table = NULL;
> -	}
> +	cpufreq_driver_exit_policy(policy);
>  
>  unlock:
>  	up_write(&policy->rwsem);
> @@ -1452,8 +1517,12 @@ static unsigned int __cpufreq_get(struct
>  
>  	ret_freq = cpufreq_driver->get(policy->cpu);
>  
> -	/* Updating inactive policies is invalid, so avoid doing that. */
> -	if (unlikely(policy_is_inactive(policy)))
> +	/*
> +	 * Updating inactive policies is invalid, so avoid doing that.  Also
> +	 * if fast frequency switching is used with the given policy, the check
> +	 * against policy->cur is pointless, so skip it in that case too.
> +	 */
> +	if (unlikely(policy_is_inactive(policy)) || policy->fast_switch_enabled)
>  		return ret_freq;
>  
>  	if (ret_freq && policy->cur &&
> @@ -1465,7 +1534,6 @@ static unsigned int __cpufreq_get(struct
>  			schedule_work(&policy->update);
>  		}
>  	}
> -

Unrelated change ? And to me it looks better with the blank line ..

>  	return ret_freq;
>  }
>  
> @@ -1672,8 +1740,18 @@ int cpufreq_register_notifier(struct not
>  
>  	switch (list) {
>  	case CPUFREQ_TRANSITION_NOTIFIER:
> +		mutex_lock(&cpufreq_fast_switch_lock);
> +
> +		if (cpufreq_fast_switch_count > 0) {
> +			mutex_unlock(&cpufreq_fast_switch_lock);
> +			return -EBUSY;
> +		}
>  		ret = srcu_notifier_chain_register(
>  				&cpufreq_transition_notifier_list, nb);
> +		if (!ret)
> +			cpufreq_fast_switch_count--;
> +
> +		mutex_unlock(&cpufreq_fast_switch_lock);
>  		break;
>  	case CPUFREQ_POLICY_NOTIFIER:
>  		ret = blocking_notifier_chain_register(
> @@ -1706,8 +1784,14 @@ int cpufreq_unregister_notifier(struct n
>  
>  	switch (list) {
>  	case CPUFREQ_TRANSITION_NOTIFIER:
> +		mutex_lock(&cpufreq_fast_switch_lock);
> +
>  		ret = srcu_notifier_chain_unregister(
>  				&cpufreq_transition_notifier_list, nb);
> +		if (!ret && !WARN_ON(cpufreq_fast_switch_count >= 0))
> +			cpufreq_fast_switch_count++;

Again here, why shouldn't we write it as:

		WARN_ON(cpufreq_fast_switch_count >= 0);

		if (!ret)
			cpufreq_fast_switch_count++;

> +
> +		mutex_unlock(&cpufreq_fast_switch_lock);
>  		break;
>  	case CPUFREQ_POLICY_NOTIFIER:
>  		ret = blocking_notifier_chain_unregister(
> @@ -1726,6 +1810,34 @@ EXPORT_SYMBOL(cpufreq_unregister_notifie
>   *                              GOVERNORS                            *
>   *********************************************************************/
>  
> +/**
> + * cpufreq_driver_fast_switch - Carry out a fast CPU frequency switch.
> + * @policy: cpufreq policy to switch the frequency for.
> + * @target_freq: New frequency to set (may be approximate).
> + *
> + * Carry out a fast frequency switch from interrupt context.
> + *
> + * The driver's ->fast_switch() callback invoked by this function is expected to
> + * select the minimum available frequency greater than or equal to @target_freq
> + * (CPUFREQ_RELATION_L).
> + *
> + * This function must not be called if policy->fast_switch_enabled is unset.
> + *
> + * Governors calling this function must guarantee that it will never be invoked
> + * twice in parallel for the same policy and that it will never be called in
> + * parallel with either ->target() or ->target_index() for the same policy.
> + *
> + * If CPUFREQ_ENTRY_INVALID is returned by the driver's ->fast_switch()
> + * callback to indicate an error condition, the hardware configuration must be
> + * preserved.
> + */
> +unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
> +					unsigned int target_freq)
> +{
> +	return cpufreq_driver->fast_switch(policy, target_freq);
> +}
> +EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch);
> +
>  /* Must set freqs->new to intermediate frequency */
>  static int __target_intermediate(struct cpufreq_policy *policy,
>  				 struct cpufreq_freqs *freqs, int index)
Viresh Kumar March 28, 2016, 7:03 a.m. UTC | #5
forgot to review acpi update earlier ..

On 22-03-16, 02:53, Rafael J. Wysocki wrote:
> Index: linux-pm/drivers/cpufreq/acpi-cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/acpi-cpufreq.c
> +++ linux-pm/drivers/cpufreq/acpi-cpufreq.c
> @@ -458,6 +458,43 @@ static int acpi_cpufreq_target(struct cp
>  	return result;
>  }
>  
> +unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
> +				      unsigned int target_freq)
> +{
> +	struct acpi_cpufreq_data *data = policy->driver_data;
> +	struct acpi_processor_performance *perf;
> +	struct cpufreq_frequency_table *entry;
> +	unsigned int next_perf_state, next_freq, freq;
> +
> +	/*
> +	 * Find the closest frequency above target_freq.
> +	 *
> +	 * The table is sorted in the reverse order with respect to the
> +	 * frequency and all of the entries are valid (see the initialization).
> +	 */
> +	entry = data->freq_table;
> +	do {
> +		entry++;
> +		freq = entry->frequency;
> +	} while (freq >= target_freq && freq != CPUFREQ_TABLE_END);

Consider this table:

11000
10000
9000

And a target-freq of 10000.

Wouldn't you end up selecting 11000 ? Or did I misread it ?

> +	entry--;
> +	next_freq = entry->frequency;
> +	next_perf_state = entry->driver_data;
> +
> +	perf = to_perf_data(data);
> +	if (perf->state == next_perf_state) {
> +		if (unlikely(data->resume))
> +			data->resume = 0;
> +		else
> +			return next_freq;
> +	}
> +
> +	data->cpu_freq_write(&perf->control_register,
> +			     perf->states[next_perf_state].control);
> +	perf->state = next_perf_state;
> +	return next_freq;
> +}
> +
>  static unsigned long
>  acpi_cpufreq_guess_freq(struct acpi_cpufreq_data *data, unsigned int cpu)
>  {
> @@ -740,6 +777,9 @@ static int acpi_cpufreq_cpu_init(struct
>  		goto err_unreg;
>  	}
>  
> +	policy->fast_switch_possible = !acpi_pstate_strict &&
> +		!(policy_is_shared(policy) && policy->shared_type != CPUFREQ_SHARED_TYPE_ANY);
> +
>  	data->freq_table = kzalloc(sizeof(*data->freq_table) *
>  		    (perf->state_count+1), GFP_KERNEL);
>  	if (!data->freq_table) {
> @@ -874,6 +914,7 @@ static struct freq_attr *acpi_cpufreq_at
>  static struct cpufreq_driver acpi_cpufreq_driver = {
>  	.verify		= cpufreq_generic_frequency_table_verify,
>  	.target_index	= acpi_cpufreq_target,
> +	.fast_switch	= acpi_cpufreq_fast_switch,
>  	.bios_limit	= acpi_processor_get_bios_limit,
>  	.init		= acpi_cpufreq_cpu_init,
>  	.exit		= acpi_cpufreq_cpu_exit,
Steve Muckle March 28, 2016, 4:47 p.m. UTC | #6
On 03/25/2016 06:46 PM, Rafael J. Wysocki wrote:
>>> @@ -1726,6 +1810,34 @@ EXPORT_SYMBOL(cpufreq_unregister_notifie
>>> >>   *                              GOVERNORS                            *
>>> >>   *********************************************************************/
>>> >>
>>> >> +/**
>>> >> + * cpufreq_driver_fast_switch - Carry out a fast CPU frequency switch.
>>> >> + * @policy: cpufreq policy to switch the frequency for.
>>> >> + * @target_freq: New frequency to set (may be approximate).
>>> >> + *
>>> >> + * Carry out a fast frequency switch from interrupt context.
>> >
>> > I think that should say atomic rather than interrupt as this might not
>> > be called from interrupt context.
>
> "Interrupt context" here means something like "context that cannot
> sleep" and it's sort of a traditional way of calling that.  I
> considered saying "atomic context" here, but then decided that it
> might suggest too much.
> 
> Maybe something like "Carry out a fast frequency switch without
> sleeping" would be better?

Yes I do think that's preferable. I also wonder if it makes sense to
state expectations of how long the operation should take - i.e. not only
will it not sleep, but it is expected to complete "quickly." However I
accept that it is not well defined what that means. Maybe a mention that
this may be called in scheduler hot paths.
Rafael J. Wysocki March 29, 2016, 12:10 p.m. UTC | #7
On Monday, March 28, 2016 12:33:41 PM Viresh Kumar wrote:
> forgot to review acpi update earlier ..
> 
> On 22-03-16, 02:53, Rafael J. Wysocki wrote:
> > Index: linux-pm/drivers/cpufreq/acpi-cpufreq.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpufreq/acpi-cpufreq.c
> > +++ linux-pm/drivers/cpufreq/acpi-cpufreq.c
> > @@ -458,6 +458,43 @@ static int acpi_cpufreq_target(struct cp
> >  	return result;
> >  }
> >  
> > +unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
> > +				      unsigned int target_freq)
> > +{
> > +	struct acpi_cpufreq_data *data = policy->driver_data;
> > +	struct acpi_processor_performance *perf;
> > +	struct cpufreq_frequency_table *entry;
> > +	unsigned int next_perf_state, next_freq, freq;
> > +
> > +	/*
> > +	 * Find the closest frequency above target_freq.
> > +	 *
> > +	 * The table is sorted in the reverse order with respect to the
> > +	 * frequency and all of the entries are valid (see the initialization).
> > +	 */
> > +	entry = data->freq_table;
> > +	do {
> > +		entry++;
> > +		freq = entry->frequency;
> > +	} while (freq >= target_freq && freq != CPUFREQ_TABLE_END);
> 
> Consider this table:
> 
> 11000
> 10000
> 9000
> 
> And a target-freq of 10000.
> 
> Wouldn't you end up selecting 11000 ? Or did I misread it ?

In that case the loop will break for freq = 9000 (as per the above
freq >= target_freq check), so it looks like you've misread it.
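
To illustrate with your table, here is a quick user-space mock-up of
the loop (the sentinel value is just a stand-in for the kernel's
CPUFREQ_TABLE_END):

	#include <stdio.h>

	#define TABLE_END	(~1U)	/* stand-in for CPUFREQ_TABLE_END */

	struct entry { unsigned int frequency; };

	int main(void)
	{
		/* Sorted in decreasing order, terminated like the real table. */
		struct entry table[] = {
			{ 11000 }, { 10000 }, { 9000 }, { TABLE_END },
		};
		unsigned int target_freq = 10000, freq;
		struct entry *entry = table;

		do {
			entry++;
			freq = entry->frequency;
		} while (freq >= target_freq && freq != TABLE_END);
		entry--;

		printf("selected %u\n", entry->frequency);	/* prints 10000 */
		return 0;
	}

For target_freq = 10500 it stops one entry earlier and selects 11000,
which is the CPUFREQ_RELATION_L behavior the kerneldoc asks for.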

Thanks,
Rafael

Rafael J. Wysocki March 29, 2016, 12:10 p.m. UTC | #8
On Monday, March 28, 2016 09:47:53 AM Steve Muckle wrote:
> On 03/25/2016 06:46 PM, Rafael J. Wysocki wrote:
> >>> @@ -1726,6 +1810,34 @@ EXPORT_SYMBOL(cpufreq_unregister_notifie
> >>> >>   *                              GOVERNORS                            *
> >>> >>   *********************************************************************/
> >>> >>
> >>> >> +/**
> >>> >> + * cpufreq_driver_fast_switch - Carry out a fast CPU frequency switch.
> >>> >> + * @policy: cpufreq policy to switch the frequency for.
> >>> >> + * @target_freq: New frequency to set (may be approximate).
> >>> >> + *
> >>> >> + * Carry out a fast frequency switch from interrupt context.
> >> >
> >> > I think that should say atomic rather than interrupt as this might not
> >> > be called from interrupt context.
> >
> > "Interrupt context" here means something like "context that cannot
> > sleep" and it's sort of a traditional way of calling that.  I
> > considered saying "atomic context" here, but then decided that it
> > might suggest too much.
> > 
> > Maybe something like "Carry out a fast frequency switch without
> > sleeping" would be better?
> 
> Yes I do think that's preferable. I also wonder if it makes sense to
> state expectations of how long the operation should take - i.e. not only
> will it not sleep, but it is expected to complete "quickly." However I
> accept that it is not well defined what that means. Maybe a mention that
> this may be called in scheduler hot paths.

OK

Rafael J. Wysocki March 29, 2016, 12:31 p.m. UTC | #9
On Monday, March 28, 2016 11:57:51 AM Viresh Kumar wrote:
> Sorry for jumping in late, was busy with other stuff and travel :(
> 

[cut]

> > +static void cpufreq_list_transition_notifiers(void)
> > +{
> > +	struct notifier_block *nb;
> > +
> > +	pr_info("cpufreq: Registered transition notifiers:\n");
> > +
> > +	mutex_lock(&cpufreq_transition_notifier_list.mutex);
> > +
> > +	for (nb = cpufreq_transition_notifier_list.head; nb; nb = nb->next)
> > +		pr_info("cpufreq: %pF\n", nb->notifier_call);
> > +
> > +	mutex_unlock(&cpufreq_transition_notifier_list.mutex);
> 
> This will get printed as:
> 
> cpufreq: cpufreq: Registered transition notifiers:
> cpufreq: cpufreq: <func>+0x0/0x<address>
> cpufreq: cpufreq: <func>+0x0/0x<address>
> cpufreq: cpufreq: <func>+0x0/0x<address>
> 
> Maybe we want something like:
> cpufreq: Registered transition notifiers:
>   cpufreq: <func>+0x0/0x<address>
>   cpufreq: <func>+0x0/0x<address>
>   cpufreq: <func>+0x0/0x<address>
> 
> ?

You seem to be saying that pr_fmt() already has "cpufreq: " in it.  Fair enough.
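
With the usual pattern at the top of the file, something like this
(just a sketch, the exact definition may differ):

	#define pr_fmt(fmt) "cpufreq: " fmt

every pr_info("cpufreq: ...") expands to
printk(KERN_INFO "cpufreq: " "cpufreq: ...", ...), which is where the
doubled prefix comes from, so the explicit "cpufreq: " in the new
messages is redundant and can be dropped.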

> > +}
> > +
> > +/**
> > + * cpufreq_enable_fast_switch - Enable fast frequency switching for policy.
> > + * @policy: cpufreq policy to enable fast frequency switching for.
> > + *
> > + * Try to enable fast frequency switching for @policy.
> > + *
> > + * The attempt will fail if there is at least one transition notifier registered
> > + * at this point, as fast frequency switching is quite fundamentally at odds
> > + * with transition notifiers.  Thus if successful, it will make registration of
> > + * transition notifiers fail going forward.
> > + */
> > +void cpufreq_enable_fast_switch(struct cpufreq_policy *policy)
> > +{
> > +	lockdep_assert_held(&policy->rwsem);
> > +
> > +	if (!policy->fast_switch_possible)
> > +		return;
> > +
> > +	mutex_lock(&cpufreq_fast_switch_lock);
> > +	if (cpufreq_fast_switch_count >= 0) {
> > +		cpufreq_fast_switch_count++;
> > +		policy->fast_switch_enabled = true;
> > +	} else {
> > +		pr_warn("cpufreq: CPU%u: Fast frequency switching not enabled\n",
> > +			policy->cpu);
> > +		cpufreq_list_transition_notifiers();
> > +	}
> > +	mutex_unlock(&cpufreq_fast_switch_lock);
> > +}
> > +EXPORT_SYMBOL_GPL(cpufreq_enable_fast_switch);
> 
> And, why don't we have support for disabling fast-switch support? What if we
> switch to schedutil governor (from userspace) and then back to ondemand? We
> don't call policy->exit for that.

Disabling fast switching can be automatic, depending on whether or not
fast_switch_enabled is set, but I clearly forgot about the manual
governor switch case.

It should be fine to do it before calling cpufreq_governor(_EXIT) then.
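
Roughly the mirror image of the enable helper, along the lines of what
the exit path below already does (just a sketch, the name and exact
placement are to be decided):

	void cpufreq_disable_fast_switch(struct cpufreq_policy *policy)
	{
		mutex_lock(&cpufreq_fast_switch_lock);
		if (policy->fast_switch_enabled) {
			policy->fast_switch_enabled = false;
			if (!WARN_ON(cpufreq_fast_switch_count <= 0))
				cpufreq_fast_switch_count--;
		}
		mutex_unlock(&cpufreq_fast_switch_lock);
	}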


> >  /*********************************************************************
> >   *                          SYSFS INTERFACE                          *
> > @@ -1083,6 +1134,24 @@ static void cpufreq_policy_free(struct c
> >  	kfree(policy);
> >  }
> >  
> > +static void cpufreq_driver_exit_policy(struct cpufreq_policy *policy)
> > +{
> > +	if (policy->fast_switch_enabled) {
> 
> Shouldn't this be accessed from within lock as well ?
> 
> > +		mutex_lock(&cpufreq_fast_switch_lock);
> > +
> > +		policy->fast_switch_enabled = false;
> > +		if (!WARN_ON(cpufreq_fast_switch_count <= 0))
> > +			cpufreq_fast_switch_count--;
> 
> Shouldn't we make it more efficient and write it as:
> 
> 		WARN_ON(cpufreq_fast_switch_count <= 0);
> 		policy->fast_switch_enabled = false;
>                 cpufreq_fast_switch_count--;
> 
> The WARN check will hold true only for a major bug somewhere in the core and we
> shall *never* hit it.

The point here is to also avoid the decrement if the WARN_ON() triggers.
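
WARN_ON(cond) evaluates to cond in addition to warning, so with

	if (!WARN_ON(cpufreq_fast_switch_count <= 0))
		cpufreq_fast_switch_count--;

the counter is only touched when it is still positive, whereas the
unconditional variant would push an already inconsistent counter even
further off.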

> > +		mutex_unlock(&cpufreq_fast_switch_lock);
> > +	}
> > +
> > +	if (cpufreq_driver->exit) {
> > +		cpufreq_driver->exit(policy);
> > +		policy->freq_table = NULL;
> > +	}
> > +}
> > +
> >  static int cpufreq_online(unsigned int cpu)
> >  {
> >  	struct cpufreq_policy *policy;
> > @@ -1236,8 +1305,7 @@ static int cpufreq_online(unsigned int c
> >  out_exit_policy:
> >  	up_write(&policy->rwsem);
> >  
> > -	if (cpufreq_driver->exit)
> > -		cpufreq_driver->exit(policy);
> > +	cpufreq_driver_exit_policy(policy);
> >  out_free_policy:
> >  	cpufreq_policy_free(policy, !new_policy);
> >  	return ret;
> > @@ -1334,10 +1402,7 @@ static void cpufreq_offline(unsigned int
> >  	 * since this is a core component, and is essential for the
> >  	 * subsequent light-weight ->init() to succeed.
> >  	 */
> > -	if (cpufreq_driver->exit) {
> > -		cpufreq_driver->exit(policy);
> > -		policy->freq_table = NULL;
> > -	}
> > +	cpufreq_driver_exit_policy(policy);
> >  
> >  unlock:
> >  	up_write(&policy->rwsem);
> > @@ -1452,8 +1517,12 @@ static unsigned int __cpufreq_get(struct
> >  
> >  	ret_freq = cpufreq_driver->get(policy->cpu);
> >  
> > -	/* Updating inactive policies is invalid, so avoid doing that. */
> > -	if (unlikely(policy_is_inactive(policy)))
> > +	/*
> > +	 * Updating inactive policies is invalid, so avoid doing that.  Also
> > +	 * if fast frequency switching is used with the given policy, the check
> > +	 * against policy->cur is pointless, so skip it in that case too.
> > +	 */
> > +	if (unlikely(policy_is_inactive(policy)) || policy->fast_switch_enabled)
> >  		return ret_freq;
> >  
> >  	if (ret_freq && policy->cur &&
> > @@ -1465,7 +1534,6 @@ static unsigned int __cpufreq_get(struct
> >  			schedule_work(&policy->update);
> >  		}
> >  	}
> > -
> 
> Unrelated change ? And to me it looks better with the blank line ..

Yes, it is unrelated.

> >  	return ret_freq;
> >  }
> >  
> > @@ -1672,8 +1740,18 @@ int cpufreq_register_notifier(struct not
> >  
> >  	switch (list) {
> >  	case CPUFREQ_TRANSITION_NOTIFIER:
> > +		mutex_lock(&cpufreq_fast_switch_lock);
> > +
> > +		if (cpufreq_fast_switch_count > 0) {
> > +			mutex_unlock(&cpufreq_fast_switch_lock);
> > +			return -EBUSY;
> > +		}
> >  		ret = srcu_notifier_chain_register(
> >  				&cpufreq_transition_notifier_list, nb);
> > +		if (!ret)
> > +			cpufreq_fast_switch_count--;
> > +
> > +		mutex_unlock(&cpufreq_fast_switch_lock);
> >  		break;
> >  	case CPUFREQ_POLICY_NOTIFIER:
> >  		ret = blocking_notifier_chain_register(
> > @@ -1706,8 +1784,14 @@ int cpufreq_unregister_notifier(struct n
> >  
> >  	switch (list) {
> >  	case CPUFREQ_TRANSITION_NOTIFIER:
> > +		mutex_lock(&cpufreq_fast_switch_lock);
> > +
> >  		ret = srcu_notifier_chain_unregister(
> >  				&cpufreq_transition_notifier_list, nb);
> > +		if (!ret && !WARN_ON(cpufreq_fast_switch_count >= 0))
> > +			cpufreq_fast_switch_count++;
> 
> Again here, why shouldn't we write it as:

And same here again, I don't want the increment to happen if the WARN_ON()
triggers.

Thanks,
Rafael

Viresh Kumar March 29, 2016, 2:20 p.m. UTC | #10
On 29-03-16, 14:10, Rafael J. Wysocki wrote:
> In that case the loop will break for freq = 9000 (as per the above
> freq >= freq_target check), so it looks like you've misread it.

My bad ..

Patch

Index: linux-pm/drivers/cpufreq/acpi-cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/acpi-cpufreq.c
+++ linux-pm/drivers/cpufreq/acpi-cpufreq.c
@@ -458,6 +458,43 @@  static int acpi_cpufreq_target(struct cp
 	return result;
 }
 
+unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
+				      unsigned int target_freq)
+{
+	struct acpi_cpufreq_data *data = policy->driver_data;
+	struct acpi_processor_performance *perf;
+	struct cpufreq_frequency_table *entry;
+	unsigned int next_perf_state, next_freq, freq;
+
+	/*
+	 * Find the closest frequency above target_freq.
+	 *
+	 * The table is sorted in the reverse order with respect to the
+	 * frequency and all of the entries are valid (see the initialization).
+	 */
+	entry = data->freq_table;
+	do {
+		entry++;
+		freq = entry->frequency;
+	} while (freq >= target_freq && freq != CPUFREQ_TABLE_END);
+	entry--;
+	next_freq = entry->frequency;
+	next_perf_state = entry->driver_data;
+
+	perf = to_perf_data(data);
+	if (perf->state == next_perf_state) {
+		if (unlikely(data->resume))
+			data->resume = 0;
+		else
+			return next_freq;
+	}
+
+	data->cpu_freq_write(&perf->control_register,
+			     perf->states[next_perf_state].control);
+	perf->state = next_perf_state;
+	return next_freq;
+}
+
 static unsigned long
 acpi_cpufreq_guess_freq(struct acpi_cpufreq_data *data, unsigned int cpu)
 {
@@ -740,6 +777,9 @@  static int acpi_cpufreq_cpu_init(struct
 		goto err_unreg;
 	}
 
+	policy->fast_switch_possible = !acpi_pstate_strict &&
+		!(policy_is_shared(policy) && policy->shared_type != CPUFREQ_SHARED_TYPE_ANY);
+
 	data->freq_table = kzalloc(sizeof(*data->freq_table) *
 		    (perf->state_count+1), GFP_KERNEL);
 	if (!data->freq_table) {
@@ -874,6 +914,7 @@  static struct freq_attr *acpi_cpufreq_at
 static struct cpufreq_driver acpi_cpufreq_driver = {
 	.verify		= cpufreq_generic_frequency_table_verify,
 	.target_index	= acpi_cpufreq_target,
+	.fast_switch	= acpi_cpufreq_fast_switch,
 	.bios_limit	= acpi_processor_get_bios_limit,
 	.init		= acpi_cpufreq_cpu_init,
 	.exit		= acpi_cpufreq_cpu_exit,
Index: linux-pm/include/linux/cpufreq.h
===================================================================
--- linux-pm.orig/include/linux/cpufreq.h
+++ linux-pm/include/linux/cpufreq.h
@@ -102,6 +102,10 @@  struct cpufreq_policy {
 	 */
 	struct rw_semaphore	rwsem;
 
+	/* Fast switch flags */
+	bool			fast_switch_possible;	/* Set by the driver. */
+	bool			fast_switch_enabled;
+
 	/* Synchronization for frequency transitions */
 	bool			transition_ongoing; /* Tracks transition status */
 	spinlock_t		transition_lock;
@@ -156,6 +160,7 @@  int cpufreq_get_policy(struct cpufreq_po
 int cpufreq_update_policy(unsigned int cpu);
 bool have_governor_per_policy(void);
 struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
+void cpufreq_enable_fast_switch(struct cpufreq_policy *policy);
 #else
 static inline unsigned int cpufreq_get(unsigned int cpu)
 {
@@ -236,6 +241,8 @@  struct cpufreq_driver {
 				  unsigned int relation);	/* Deprecated */
 	int		(*target_index)(struct cpufreq_policy *policy,
 					unsigned int index);
+	unsigned int	(*fast_switch)(struct cpufreq_policy *policy,
+				       unsigned int target_freq);
 	/*
 	 * Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION
 	 * unset.
@@ -464,6 +471,8 @@  struct cpufreq_governor {
 };
 
 /* Pass a target to the cpufreq driver */
+unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
+					unsigned int target_freq);
 int cpufreq_driver_target(struct cpufreq_policy *policy,
 				 unsigned int target_freq,
 				 unsigned int relation);
Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -428,6 +428,57 @@  void cpufreq_freq_transition_end(struct
 }
 EXPORT_SYMBOL_GPL(cpufreq_freq_transition_end);
 
+/*
+ * Fast frequency switching status count.  Positive means "enabled", negative
+ * means "disabled" and 0 means "not decided yet".
+ */
+static int cpufreq_fast_switch_count;
+static DEFINE_MUTEX(cpufreq_fast_switch_lock);
+
+static void cpufreq_list_transition_notifiers(void)
+{
+	struct notifier_block *nb;
+
+	pr_info("cpufreq: Registered transition notifiers:\n");
+
+	mutex_lock(&cpufreq_transition_notifier_list.mutex);
+
+	for (nb = cpufreq_transition_notifier_list.head; nb; nb = nb->next)
+		pr_info("cpufreq: %pF\n", nb->notifier_call);
+
+	mutex_unlock(&cpufreq_transition_notifier_list.mutex);
+}
+
+/**
+ * cpufreq_enable_fast_switch - Enable fast frequency switching for policy.
+ * @policy: cpufreq policy to enable fast frequency switching for.
+ *
+ * Try to enable fast frequency switching for @policy.
+ *
+ * The attempt will fail if there is at least one transition notifier registered
+ * at this point, as fast frequency switching is quite fundamentally at odds
+ * with transition notifiers.  Thus if successful, it will make registration of
+ * transition notifiers fail going forward.
+ */
+void cpufreq_enable_fast_switch(struct cpufreq_policy *policy)
+{
+	lockdep_assert_held(&policy->rwsem);
+
+	if (!policy->fast_switch_possible)
+		return;
+
+	mutex_lock(&cpufreq_fast_switch_lock);
+	if (cpufreq_fast_switch_count >= 0) {
+		cpufreq_fast_switch_count++;
+		policy->fast_switch_enabled = true;
+	} else {
+		pr_warn("cpufreq: CPU%u: Fast frequency switching not enabled\n",
+			policy->cpu);
+		cpufreq_list_transition_notifiers();
+	}
+	mutex_unlock(&cpufreq_fast_switch_lock);
+}
+EXPORT_SYMBOL_GPL(cpufreq_enable_fast_switch);
 
 /*********************************************************************
  *                          SYSFS INTERFACE                          *
@@ -1083,6 +1134,24 @@  static void cpufreq_policy_free(struct c
 	kfree(policy);
 }
 
+static void cpufreq_driver_exit_policy(struct cpufreq_policy *policy)
+{
+	if (policy->fast_switch_enabled) {
+		mutex_lock(&cpufreq_fast_switch_lock);
+
+		policy->fast_switch_enabled = false;
+		if (!WARN_ON(cpufreq_fast_switch_count <= 0))
+			cpufreq_fast_switch_count--;
+
+		mutex_unlock(&cpufreq_fast_switch_lock);
+	}
+
+	if (cpufreq_driver->exit) {
+		cpufreq_driver->exit(policy);
+		policy->freq_table = NULL;
+	}
+}
+
 static int cpufreq_online(unsigned int cpu)
 {
 	struct cpufreq_policy *policy;
@@ -1236,8 +1305,7 @@  static int cpufreq_online(unsigned int c
 out_exit_policy:
 	up_write(&policy->rwsem);
 
-	if (cpufreq_driver->exit)
-		cpufreq_driver->exit(policy);
+	cpufreq_driver_exit_policy(policy);
 out_free_policy:
 	cpufreq_policy_free(policy, !new_policy);
 	return ret;
@@ -1334,10 +1402,7 @@  static void cpufreq_offline(unsigned int
 	 * since this is a core component, and is essential for the
 	 * subsequent light-weight ->init() to succeed.
 	 */
-	if (cpufreq_driver->exit) {
-		cpufreq_driver->exit(policy);
-		policy->freq_table = NULL;
-	}
+	cpufreq_driver_exit_policy(policy);
 
 unlock:
 	up_write(&policy->rwsem);
@@ -1452,8 +1517,12 @@  static unsigned int __cpufreq_get(struct
 
 	ret_freq = cpufreq_driver->get(policy->cpu);
 
-	/* Updating inactive policies is invalid, so avoid doing that. */
-	if (unlikely(policy_is_inactive(policy)))
+	/*
+	 * Updating inactive policies is invalid, so avoid doing that.  Also
+	 * if fast frequency switching is used with the given policy, the check
+	 * against policy->cur is pointless, so skip it in that case too.
+	 */
+	if (unlikely(policy_is_inactive(policy)) || policy->fast_switch_enabled)
 		return ret_freq;
 
 	if (ret_freq && policy->cur &&
@@ -1465,7 +1534,6 @@  static unsigned int __cpufreq_get(struct
 			schedule_work(&policy->update);
 		}
 	}
-
 	return ret_freq;
 }
 
@@ -1672,8 +1740,18 @@  int cpufreq_register_notifier(struct not
 
 	switch (list) {
 	case CPUFREQ_TRANSITION_NOTIFIER:
+		mutex_lock(&cpufreq_fast_switch_lock);
+
+		if (cpufreq_fast_switch_count > 0) {
+			mutex_unlock(&cpufreq_fast_switch_lock);
+			return -EBUSY;
+		}
 		ret = srcu_notifier_chain_register(
 				&cpufreq_transition_notifier_list, nb);
+		if (!ret)
+			cpufreq_fast_switch_count--;
+
+		mutex_unlock(&cpufreq_fast_switch_lock);
 		break;
 	case CPUFREQ_POLICY_NOTIFIER:
 		ret = blocking_notifier_chain_register(
@@ -1706,8 +1784,14 @@  int cpufreq_unregister_notifier(struct n
 
 	switch (list) {
 	case CPUFREQ_TRANSITION_NOTIFIER:
+		mutex_lock(&cpufreq_fast_switch_lock);
+
 		ret = srcu_notifier_chain_unregister(
 				&cpufreq_transition_notifier_list, nb);
+		if (!ret && !WARN_ON(cpufreq_fast_switch_count >= 0))
+			cpufreq_fast_switch_count++;
+
+		mutex_unlock(&cpufreq_fast_switch_lock);
 		break;
 	case CPUFREQ_POLICY_NOTIFIER:
 		ret = blocking_notifier_chain_unregister(
@@ -1726,6 +1810,34 @@  EXPORT_SYMBOL(cpufreq_unregister_notifie
  *                              GOVERNORS                            *
  *********************************************************************/
 
+/**
+ * cpufreq_driver_fast_switch - Carry out a fast CPU frequency switch.
+ * @policy: cpufreq policy to switch the frequency for.
+ * @target_freq: New frequency to set (may be approximate).
+ *
+ * Carry out a fast frequency switch from interrupt context.
+ *
+ * The driver's ->fast_switch() callback invoked by this function is expected to
+ * select the minimum available frequency greater than or equal to @target_freq
+ * (CPUFREQ_RELATION_L).
+ *
+ * This function must not be called if policy->fast_switch_enabled is unset.
+ *
+ * Governors calling this function must guarantee that it will never be invoked
+ * twice in parallel for the same policy and that it will never be called in
+ * parallel with either ->target() or ->target_index() for the same policy.
+ *
+ * If CPUFREQ_ENTRY_INVALID is returned by the driver's ->fast_switch()
+ * callback to indicate an error condition, the hardware configuration must be
+ * preserved.
+ */
+unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
+					unsigned int target_freq)
+{
+	return cpufreq_driver->fast_switch(policy, target_freq);
+}
+EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch);
+
 /* Must set freqs->new to intermediate frequency */
 static int __target_intermediate(struct cpufreq_policy *policy,
 				 struct cpufreq_freqs *freqs, int index)