[V3,12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()

Message ID bb6f1b48f09cfe1f0322b1718bf797f8cf5e108c.1454931189.git.viresh.kumar@linaro.org (mailing list archive)
State Changes Requested, archived

Commit Message

Viresh Kumar Feb. 8, 2016, 11:39 a.m. UTC
Now that we maintain a list of all 'struct policy_dbs_info' for an
instance of 'struct dbs_data', we can traverse that list instead of
looping over all online CPUs.

This also resolves the circular locking dependency reported by Juri
(and verified by Shilpa) earlier:

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.4.0+ #445 Not tainted
 -------------------------------------------------------
 trace.sh/1723 is trying to acquire lock:
  (s_active#48){++++.+}, at: [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94

 but task is already holding lock:
  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

-> #2 (od_dbs_cdata.mutex){+.+.+.}:
        [<c075b040>] mutex_lock_nested+0x7c/0x434
        [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
        [<c0017c10>] return_to_handler+0x0/0x18

-> #1 (&policy->rwsem){+++++.}:
        [<c075ca8c>] down_read+0x58/0x94
        [<c057c244>] show+0x30/0x60
        [<c01f934c>] sysfs_kf_seq_show+0x90/0xfc
        [<c01f7ad8>] kernfs_seq_show+0x34/0x38
        [<c01a22ec>] seq_read+0x1e4/0x4e4
        [<c01f8694>] kernfs_fop_read+0x120/0x1a0
        [<c01794b4>] __vfs_read+0x3c/0xe0
        [<c017a378>] vfs_read+0x98/0x104
        [<c017a434>] SyS_read+0x50/0x90
        [<c000fd40>] ret_fast_syscall+0x0/0x1c

-> #0 (s_active#48){++++.+}:
        [<c008238c>] lock_acquire+0xd4/0x20c
        [<c01f6ae4>] __kernfs_remove+0x288/0x328
        [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
        [<c01fa024>] remove_files+0x44/0x88
        [<c01fa5a4>] sysfs_remove_group+0x50/0xa4
        [<c058285c>] cpufreq_governor_dbs+0x3f0/0x5d4
        [<c0017c10>] return_to_handler+0x0/0x18

 other info that might help us debug this:

 Chain exists of:
  s_active#48 --> &policy->rwsem --> od_dbs_cdata.mutex

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(od_dbs_cdata.mutex);
                                lock(&policy->rwsem);
                                lock(od_dbs_cdata.mutex);
   lock(s_active#48);

  *** DEADLOCK ***

 5 locks held by trace.sh/1723:
  #0:  (sb_writers#6){.+.+.+}, at: [<c017beb8>] __sb_start_write+0xb4/0xc0
  #1:  (&of->mutex){+.+.+.}, at: [<c01f8418>] kernfs_fop_write+0x6c/0x1c8
  #2:  (s_active#35){.+.+.+}, at: [<c01f8420>] kernfs_fop_write+0x74/0x1c8
  #3:  (cpu_hotplug.lock){++++++}, at: [<c0029e6c>] get_online_cpus+0x48/0xb8
  #4:  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4

 stack backtrace:
 CPU: 2 PID: 1723 Comm: trace.sh Not tainted 4.4.0+ #445
 Hardware name: ARM-Versatile Express
 [<c001883c>] (unwind_backtrace) from [<c0013f50>] (show_stack+0x20/0x24)
 [<c0013f50>] (show_stack) from [<c044ad90>] (dump_stack+0x80/0xb4)
 [<c044ad90>] (dump_stack) from [<c0128edc>] (print_circular_bug+0x29c/0x2f0)
 [<c0128edc>] (print_circular_bug) from [<c0081708>] (__lock_acquire+0x163c/0x1d74)
 [<c0081708>] (__lock_acquire) from [<c008238c>] (lock_acquire+0xd4/0x20c)
 [<c008238c>] (lock_acquire) from [<c01f6ae4>] (__kernfs_remove+0x288/0x328)
 [<c01f6ae4>] (__kernfs_remove) from [<c01f78c8>] (kernfs_remove_by_name_ns+0x4c/0x94)
 [<c01f78c8>] (kernfs_remove_by_name_ns) from [<c01fa024>] (remove_files+0x44/0x88)
 [<c01fa024>] (remove_files) from [<c01fa5a4>] (sysfs_remove_group+0x50/0xa4)
 [<c01fa5a4>] (sysfs_remove_group) from [<c058285c>] (cpufreq_governor_dbs+0x3f0/0x5d4)
 [<c058285c>] (cpufreq_governor_dbs) from [<c0017c10>] (return_to_handler+0x0/0x18)

Reported-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_ondemand.c | 89 ++++++++++----------------------------
 1 file changed, 22 insertions(+), 67 deletions(-)
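
For a compressed picture of what the patch does before the review thread below, here is a toy userspace sketch of the new shape of update_sampling_rate(). This is not kernel code: the struct layout, the hand-rolled 'next' pointer and the delay arithmetic are simplified stand-ins for the kernel's struct list_head, list_for_each_entry() and gov_update_sample_delay(); the authoritative version is the diff at the bottom of this page.

#include <stdio.h>

/* Toy userspace model only -- the kernel version uses struct list_head,
 * list_for_each_entry() and per-policy timer_mutex locking. */
struct policy_dbs_info {
        unsigned long long sample_delay_ns;
        struct policy_dbs_info *next;
};

struct dbs_data {
        unsigned int sampling_rate;             /* microseconds */
        struct policy_dbs_info *policy_dbs_list;
};

/* New shape: walk the governor's own list of policy_dbs entries ... */
static void update_sampling_rate(struct dbs_data *dbs_data)
{
        struct policy_dbs_info *policy_dbs;

        /* ... instead of iterating over all online CPUs and skipping the
         * ones whose policy is not governed by this dbs_data instance. */
        for (policy_dbs = dbs_data->policy_dbs_list; policy_dbs;
             policy_dbs = policy_dbs->next)
                policy_dbs->sample_delay_ns =
                        (unsigned long long)dbs_data->sampling_rate * 1000ULL;
}

int main(void)
{
        struct policy_dbs_info b = { 0, NULL };
        struct policy_dbs_info a = { 0, &b };
        struct dbs_data dd = { 10000, &a };     /* 10 ms sampling rate */

        update_sampling_rate(&dd);
        printf("sample delays: %llu ns, %llu ns\n",
               a.sample_delay_ns, b.sample_delay_ns);
        return 0;
}

The point is that dbs_data already knows exactly which policies it governs, so there is no longer any need to scan every online CPU and filter out the ones belonging to a different dbs_data.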

Comments

Rafael J. Wysocki Feb. 8, 2016, 1:32 p.m. UTC | #1
On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Now that we maintain a list of all 'struct policy_dbs_info' for an
> instance of 'struct dbs_data', we can traverse that list instead of
> looping over all online CPUs.
>
> This also resolves the circular locking dependency reported by Juri
> (and verified by Shilpa) earlier:
>
>  ======================================================
>  [ INFO: possible circular locking dependency detected ]
>  4.4.0+ #445 Not tainted
>  -------------------------------------------------------
>  trace.sh/1723 is trying to acquire lock:
>   (s_active#48){++++.+}, at: [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
>
>  but task is already holding lock:
>   (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>
>  which lock already depends on the new lock.
>
>  the existing dependency chain (in reverse order) is:
>
> -> #2 (od_dbs_cdata.mutex){+.+.+.}:
>         [<c075b040>] mutex_lock_nested+0x7c/0x434
>         [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>         [<c0017c10>] return_to_handler+0x0/0x18
>
> -> #1 (&policy->rwsem){+++++.}:
>         [<c075ca8c>] down_read+0x58/0x94
>         [<c057c244>] show+0x30/0x60
>         [<c01f934c>] sysfs_kf_seq_show+0x90/0xfc
>         [<c01f7ad8>] kernfs_seq_show+0x34/0x38
>         [<c01a22ec>] seq_read+0x1e4/0x4e4
>         [<c01f8694>] kernfs_fop_read+0x120/0x1a0
>         [<c01794b4>] __vfs_read+0x3c/0xe0
>         [<c017a378>] vfs_read+0x98/0x104
>         [<c017a434>] SyS_read+0x50/0x90
>         [<c000fd40>] ret_fast_syscall+0x0/0x1c
>
> -> #0 (s_active#48){++++.+}:
>         [<c008238c>] lock_acquire+0xd4/0x20c
>         [<c01f6ae4>] __kernfs_remove+0x288/0x328
>         [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
>         [<c01fa024>] remove_files+0x44/0x88
>         [<c01fa5a4>] sysfs_remove_group+0x50/0xa4
>         [<c058285c>] cpufreq_governor_dbs+0x3f0/0x5d4
>         [<c0017c10>] return_to_handler+0x0/0x18
>
>  other info that might help us debug this:
>
>  Chain exists of:
>   s_active#48 --> &policy->rwsem --> od_dbs_cdata.mutex
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(od_dbs_cdata.mutex);
>                                 lock(&policy->rwsem);
>                                 lock(od_dbs_cdata.mutex);
>    lock(s_active#48);
>
>   *** DEADLOCK ***
>
>  5 locks held by trace.sh/1723:
>   #0:  (sb_writers#6){.+.+.+}, at: [<c017beb8>] __sb_start_write+0xb4/0xc0
>   #1:  (&of->mutex){+.+.+.}, at: [<c01f8418>] kernfs_fop_write+0x6c/0x1c8
>   #2:  (s_active#35){.+.+.+}, at: [<c01f8420>] kernfs_fop_write+0x74/0x1c8
>   #3:  (cpu_hotplug.lock){++++++}, at: [<c0029e6c>] get_online_cpus+0x48/0xb8
>   #4:  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>
>  stack backtrace:
>  CPU: 2 PID: 1723 Comm: trace.sh Not tainted 4.4.0+ #445
>  Hardware name: ARM-Versatile Express
>  [<c001883c>] (unwind_backtrace) from [<c0013f50>] (show_stack+0x20/0x24)
>  [<c0013f50>] (show_stack) from [<c044ad90>] (dump_stack+0x80/0xb4)
>  [<c044ad90>] (dump_stack) from [<c0128edc>] (print_circular_bug+0x29c/0x2f0)
>  [<c0128edc>] (print_circular_bug) from [<c0081708>] (__lock_acquire+0x163c/0x1d74)
>  [<c0081708>] (__lock_acquire) from [<c008238c>] (lock_acquire+0xd4/0x20c)
>  [<c008238c>] (lock_acquire) from [<c01f6ae4>] (__kernfs_remove+0x288/0x328)
>  [<c01f6ae4>] (__kernfs_remove) from [<c01f78c8>] (kernfs_remove_by_name_ns+0x4c/0x94)
>  [<c01f78c8>] (kernfs_remove_by_name_ns) from [<c01fa024>] (remove_files+0x44/0x88)
>  [<c01fa024>] (remove_files) from [<c01fa5a4>] (sysfs_remove_group+0x50/0xa4)
>  [<c01fa5a4>] (sysfs_remove_group) from [<c058285c>] (cpufreq_governor_dbs+0x3f0/0x5d4)
>  [<c058285c>] (cpufreq_governor_dbs) from [<c0017c10>] (return_to_handler+0x0/0x18)
>
> Reported-by: Juri Lelli <juri.lelli@arm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq_ondemand.c | 89 ++++++++++----------------------------
>  1 file changed, 22 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index 745290d7f6a2..f72087bc8302 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -224,83 +224,38 @@ static struct dbs_governor od_dbs_gov;
>  /**
>   * update_sampling_rate - update sampling rate effective immediately if needed.
>   * @new_rate: new sampling rate
> - *
> - * If new rate is smaller than the old, simply updating
> - * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
> - * original sampling_rate was 1 second and the requested new sampling rate is 10
> - * ms because the user needs immediate reaction from ondemand governor, but not
> - * sure if higher frequency will be required or not, then, the governor may
> - * change the sampling rate too late; up to 1 second later. Thus, if we are
> - * reducing the sampling rate, we need to make the new value effective
> - * immediately.

The comment still applies.

Moreover, please extend it to say that this must be called with
dbs_data->mutex held (or it looks racy otherwise).

Thanks,
Rafael


Viresh Kumar Feb. 8, 2016, 1:34 p.m. UTC | #2
On 08-02-16, 14:32, Rafael J. Wysocki wrote:
> > - * If new rate is smaller than the old, simply updating
> > - * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
> > - * original sampling_rate was 1 second and the requested new sampling rate is 10
> > - * ms because the user needs immediate reaction from ondemand governor, but not
> > - * sure if higher frequency will be required or not, then, the governor may
> > - * change the sampling rate too late; up to 1 second later. Thus, if we are
> > - * reducing the sampling rate, we need to make the new value effective
> > - * immediately.
> 
> The comment still applies.

Why? It talks about the case where we have reduced the sampling rate, but
that's not the case anymore. We *always* update sample_delay_ns now.

> Moreover, please extend it to say that this must be called with
> dbs_data->mutex held (or it looks racy otherwise).

Yeah, that can be done.

--
viresh
Rafael J. Wysocki Feb. 8, 2016, 1:37 p.m. UTC | #3
On Mon, Feb 8, 2016 at 2:34 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-02-16, 14:32, Rafael J. Wysocki wrote:
>> > - * If new rate is smaller than the old, simply updating
>> > - * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
>> > - * original sampling_rate was 1 second and the requested new sampling rate is 10
>> > - * ms because the user needs immediate reaction from ondemand governor, but not
>> > - * sure if higher frequency will be required or not, then, the governor may
>> > - * change the sampling rate too late; up to 1 second later. Thus, if we are
>> > - * reducing the sampling rate, we need to make the new value effective
>> > - * immediately.
>>
>> The comment still applies.
>
> Why? It talks about the case where we have reduced sampling rate, but
> that's not the case anymore. We *always* update sample_delay_ns now.

But the comment explains *why* we do that, doesn't it?

If it doesn't apply, then why do we need this function at all?

>> Moreover, please extend it to say that this must be called with
>> dbs_data->mutex held (or it looks racy otherwise).
>
> Yeah, that can be done.


OK

Thanks,
Rafael
Viresh Kumar Feb. 8, 2016, 5:20 p.m. UTC | #4
On 08-02-16, 14:32, Rafael J. Wysocki wrote:
> The comment still applies.
> 
> Moreover, please extend it to say that this must be called with
> dbs_data->mutex held (or it looks racy otherwise).

Modified it as:

+ *
+ * Simply updating dbs_tuners_int.sampling_rate might not be appropriate here.
+ * For example, if the original sampling_rate was 1 second and the requested new
+ * sampling rate is 10 ms because the user needs immediate reaction from
+ * ondemand governor, otherwise the governor may change the sampling rate too
+ * late; up to 1 second later.
+ *
+ * Similar logic applies while increasing the sampling rate. And so we need to
+ * update it with immediate effect.
+ *
+ * This must be called with dbs_data->mutex held, otherwise traversing
+ * policy_dbs_list isn't safe.
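
For concreteness, the 1 second to 10 ms scenario in that comment works out roughly as below. This is a toy userspace model, not kernel code; in the kernel the 'eager' path is gov_update_sample_delay() and the delay is measured from the last sample time, so the figure shown is only an upper bound.

#include <stdio.h>

/* Toy timeline: a sample was armed with the old 1 s period, and at
 * t = 5 ms the user writes a 10 ms sampling rate. */
int main(void)
{
        const double old_rate_ms   = 1000.0;    /* original sampling_rate: 1 s */
        const double new_rate_ms   = 10.0;      /* requested rate: 10 ms */
        const double write_time_ms = 5.0;       /* when the new rate is written */

        /* Lazy: only dbs_data->sampling_rate is updated and the already-armed
         * delay is left alone, so the governor reacts a full old period late. */
        double lazy_ms = old_rate_ms;

        /* Eager: the armed delay is rewritten when the rate is written, so the
         * next sample is at most one new period after the write. */
        double eager_ms = write_time_ms + new_rate_ms;

        printf("lazy  update: first sample at the new rate by t = %6.1f ms\n",
               lazy_ms);
        printf("eager update: first sample at the new rate by t = %6.1f ms\n",
               eager_ms);
        return 0;
}

When the sampling rate is increased instead, the eager update can only make the next sample arrive earlier than strictly necessary, which (as noted in the follow-up below) is harmless.
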
Rafael J. Wysocki Feb. 8, 2016, 10:05 p.m. UTC | #5
On Mon, Feb 8, 2016 at 6:20 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-02-16, 14:32, Rafael J. Wysocki wrote:
>> The comment still applies.
>>
>> Moreover, please extend it to say that this must be called with
>> dbs_data->mutex held (or it looks racy otherwise).
>
> Modified it as:
>
> + *
> + * Simply updating dbs_tuners_int.sampling_rate might not be appropriate here.
> + * For example, if the original sampling_rate was 1 second and the requested new
> + * sampling rate is 10 ms because the user needs immediate reaction from
> + * ondemand governor, otherwise the governor may change the sampling rate too
> + * late; up to 1 second later.

The "otherwise" doesn't seem to be necessary here.

> + *
> + * Similar logic applies while increasing the sampling rate. And so we need to
> + * update it with immediate effect.

Actually, no, it doesn't apply.  If you increase the sampling rate,
the governor will never be late.  It may be early, but that's fine in
this case.

It just doesn't hurt to update immediately in this case too.

> + *
> + * This must be called with dbs_data->mutex held, otherwise traversing
> + * policy_dbs_list isn't safe.

Thanks,
Rafael
Rafael J. Wysocki Feb. 8, 2016, 10:08 p.m. UTC | #6
On Mon, Feb 8, 2016 at 11:05 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Mon, Feb 8, 2016 at 6:20 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> On 08-02-16, 14:32, Rafael J. Wysocki wrote:
>>> The comment still applies.
>>>
>>> Moreover, please extend it to say that this must be called with
>>> dbs_data->mutex held (or it looks racy otherwise).
>>
>> Modified it as:
>>
>> + *
>> + * Simply updating dbs_tuners_int.sampling_rate might not be appropriate here.
>> + * For example, if the original sampling_rate was 1 second and the requested new
>> + * sampling rate is 10 ms because the user needs immediate reaction from
>> + * ondemand governor, otherwise the governor may change the sampling rate too
>> + * late; up to 1 second later.
>
> The "otherwise" doesn't seem to be necessary here.
>
>> + *
>> + * Similar logic applies while increasing the sampling rate. And so we need to
>> + * update it with immediate effect.
>
> Actually, no, it doesn't apply.  If you increase the sampling rate,
> the governor will never be late.  It may be early, but that's fine in
> this case.
>
> It just doesn't hurt to update immediately in this case too.
>
>> + *
>> + * This must be called with dbs_data->mutex held, otherwise traversing
>> + * policy_dbs_list isn't safe.

I really don't know what's wrong with retaining the original paragraph
and adding the above sentence after it.

Thanks,
Rafael

Patch

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 745290d7f6a2..f72087bc8302 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -224,83 +224,38 @@  static struct dbs_governor od_dbs_gov;
 /**
  * update_sampling_rate - update sampling rate effective immediately if needed.
  * @new_rate: new sampling rate
- *
- * If new rate is smaller than the old, simply updating
- * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
- * original sampling_rate was 1 second and the requested new sampling rate is 10
- * ms because the user needs immediate reaction from ondemand governor, but not
- * sure if higher frequency will be required or not, then, the governor may
- * change the sampling rate too late; up to 1 second later. Thus, if we are
- * reducing the sampling rate, we need to make the new value effective
- * immediately.
  */
 static void update_sampling_rate(struct dbs_data *dbs_data)
 {
-	struct cpumask cpumask;
+	struct policy_dbs_info *policy_dbs;
 	unsigned int new_rate = dbs_data->sampling_rate;
-	int cpu;
 
 	/*
-	 * Lock governor so that governor start/stop can't execute in parallel.
+	 * We are operating under dbs_data->mutex and so the list and its
+	 * entries can't be freed concurrently.
 	 */
-	mutex_lock(&dbs_data_mutex);
-
-	cpumask_copy(&cpumask, cpu_online_mask);
-
-	for_each_cpu(cpu, &cpumask) {
-		struct cpufreq_policy *policy;
-		struct od_cpu_dbs_info_s *dbs_info;
-		struct cpu_dbs_info *cdbs;
-		struct policy_dbs_info *policy_dbs;
-
-		dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
-		cdbs = &dbs_info->cdbs;
-		policy_dbs = cdbs->policy_dbs;
-
+	list_for_each_entry(policy_dbs, &dbs_data->policy_dbs_list, list) {
+		mutex_lock(&policy_dbs->timer_mutex);
 		/*
-		 * A valid policy_dbs means governor hasn't stopped or exited
-		 * yet.
+		 * On 32-bit architectures this may race with the
+		 * sample_delay_ns read in dbs_update_util_handler(), but that
+		 * really doesn't matter.  If the read returns a value that's
+		 * too big, the sample will be skipped, but the next invocation
+		 * of dbs_update_util_handler() (when the update has been
+		 * completed) will take a sample.  If the returned value is too
+		 * small, the sample will be taken immediately, but that isn't a
+		 * problem, as we want the new rate to take effect immediately
+		 * anyway.
+		 *
+		 * If this runs in parallel with dbs_work_handler(), we may end
+		 * up overwriting the sample_delay_ns value that it has just
+		 * written, but the difference should not be too big and it will
+		 * be corrected next time a sample is taken, so it shouldn't be
+		 * significant.
 		 */
-		if (!policy_dbs)
-			continue;
-
-		policy = policy_dbs->policy;
-
-		/* clear all CPUs of this policy */
-		cpumask_andnot(&cpumask, &cpumask, policy->cpus);
-
-		/*
-		 * Update sampling rate for CPUs whose policy is governed by
-		 * dbs_data. In case of governor_per_policy, only a single
-		 * policy will be governed by dbs_data, otherwise there can be
-		 * multiple policies that are governed by the same dbs_data.
-		 */
-		if (dbs_data == policy_dbs->dbs_data) {
-			mutex_lock(&policy_dbs->timer_mutex);
-			/*
-			 * On 32-bit architectures this may race with the
-			 * sample_delay_ns read in dbs_update_util_handler(),
-			 * but that really doesn't matter.  If the read returns
-			 * a value that's too big, the sample will be skipped,
-			 * but the next invocation of dbs_update_util_handler()
-			 * (when the update has been completed) will take a
-			 * sample.  If the returned value is too small, the
-			 * sample will be taken immediately, but that isn't a
-			 * problem, as we want the new rate to take effect
-			 * immediately anyway.
-			 *
-			 * If this runs in parallel with dbs_work_handler(), we
-			 * may end up overwriting the sample_delay_ns value that
-			 * it has just written, but the difference should not be
-			 * too big and it will be corrected next time a sample
-			 * is taken, so it shouldn't be significant.
-			 */
-			gov_update_sample_delay(policy_dbs, new_rate);
-			mutex_unlock(&policy_dbs->timer_mutex);
-		}
+		gov_update_sample_delay(policy_dbs, new_rate);
+		mutex_unlock(&policy_dbs->timer_mutex);
 	}
-
-	mutex_unlock(&dbs_data_mutex);
 }
 
 static bool invalid_up_threshold(struct dbs_data *dbs_data,
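
As an aside on the comment block above about 32-bit architectures: whichever value of sample_delay_ns the concurrent reader happens to observe, the outcome is only "take a sample now" or "take it a bit later". Below is a minimal userspace sketch of that argument (a toy model; the real check lives in dbs_update_util_handler(), and the helper and constants here are made up for illustration).

#include <stdio.h>

/* Toy model of the benign race described in the comment: the sampling-path
 * check compares the time elapsed since the last sample against
 * sample_delay_ns, which may be observed as either the old or the new value
 * while update_sampling_rate() is rewriting it. */
static const char *decision(unsigned long long elapsed_ns,
                            unsigned long long delay_ns)
{
        return elapsed_ns >= delay_ns ? "take a sample now"
                                      : "skip, check again later";
}

int main(void)
{
        unsigned long long elapsed   = 20ULL * 1000 * 1000;   /* 20 ms since last sample */
        unsigned long long old_delay = 1000ULL * 1000 * 1000; /* old rate: 1 s */
        unsigned long long new_delay = 10ULL * 1000 * 1000;   /* new rate: 10 ms */

        /* Seeing the stale (too big) value merely skips this sample; the next
         * invocation, after the update has completed, sees the new value.
         * Seeing the new (possibly "too small") value samples immediately,
         * which is exactly what a rate reduction wants anyway. */
        printf("reader sees old value: %s\n", decision(elapsed, old_delay));
        printf("reader sees new value: %s\n", decision(elapsed, new_delay));
        return 0;
}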