| Message ID | 56084623b2c27372a4c2c598151dd47176c3e26f.1444723240.git.viresh.kumar@linaro.org (mailing list archive) |
|---|---|
| State | Accepted, archived |
| Delegated to: | Rafael Wysocki |
On Tuesday, October 13, 2015 01:39:01 PM Viresh Kumar wrote:
> 'timer_mutex' is required to sync work-handlers of policy->cpus.
> update_sampling_rate() is just canceling the works and queuing them
> again. This isn't protecting anything at all in update_sampling_rate()
> and is not gonna be of any use.
>
> Even if a work-handler is already running for a CPU,
> cancel_delayed_work_sync() will wait for it to finish.
>
> Drop these unnecessary locks.
>
> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

I'm queuing this up for 4.4, although I think that the changelog is not right.

While at it, what are the race conditions the lock is protecting against?

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 28-10-15, 05:05, Rafael J. Wysocki wrote:
> On Tuesday, October 13, 2015 01:39:01 PM Viresh Kumar wrote:
> > 'timer_mutex' is required to sync work-handlers of policy->cpus.
> > update_sampling_rate() is just canceling the works and queuing them
> > again. This isn't protecting anything at all in update_sampling_rate()
> > and is not gonna be of any use.
> >
> > Even if a work-handler is already running for a CPU,
> > cancel_delayed_work_sync() will wait for it to finish.
> >
> > Drop these unnecessary locks.
> >
> > Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
>
> I'm queuing this up for 4.4, although I think that the changelog is not right.
>
> While at it, what are the race conditions the lock is protecting against?

In cases where a single policy controls multiple CPUs, a timer is queued
for every cpu present in policy->cpus. When we reach the timer handler
(which can run on multiple CPUs at the same time) on any CPU, we evaluate
the CPU load for all policy->cpus and update the frequency accordingly.

The lock prevents multiple CPUs from doing this work at the same time, as
it only needs to be done by a single CPU. Once any CPU's handler has
completed, it updates the last update time and drops the mutex. At that
point the other blocked handlers (if any) check the last update time and
return early.

And then there are enough subtle things that can go wrong if multiple
CPUs do the load evaluation and freq-update at the same time, apart from
it being a waste of time.

And so I still think that the commit log isn't that bad. The timer_mutex
lock isn't required in other parts of the governor; it is just for
synchronizing the work-handlers of CPUs belonging to the same policy.
On Wednesday, October 28, 2015 10:14:51 AM Viresh Kumar wrote:
> On 28-10-15, 05:05, Rafael J. Wysocki wrote:
> > On Tuesday, October 13, 2015 01:39:01 PM Viresh Kumar wrote:
> > > 'timer_mutex' is required to sync work-handlers of policy->cpus.
> > > update_sampling_rate() is just canceling the works and queuing them
> > > again. This isn't protecting anything at all in update_sampling_rate()
> > > and is not gonna be of any use.
> > >
> > > Even if a work-handler is already running for a CPU,
> > > cancel_delayed_work_sync() will wait for it to finish.
> > >
> > > Drop these unnecessary locks.
> > >
> > > Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> >
> > I'm queuing this up for 4.4, although I think that the changelog is not right.
> >
> > While at it, what are the race conditions the lock is protecting against?
>
> In cases where a single policy controls multiple CPUs, a timer is queued
> for every cpu present in policy->cpus. When we reach the timer handler
> (which can run on multiple CPUs at the same time) on any CPU, we evaluate
> the CPU load for all policy->cpus and update the frequency accordingly.

That would be in dbs_timer(), right?

> The lock prevents multiple CPUs from doing this work at the same time, as
> it only needs to be done by a single CPU. Once any CPU's handler has
> completed, it updates the last update time and drops the mutex. At that
> point the other blocked handlers (if any) check the last update time and
> return early.

Well, that would mean we only needed to hold the lock around the
need_load_eval() evaluation in dbs_timer() if I'm not mistaken.

We should also acquire it around updates of the sampling rate, which
essentially is set_sampling_rate().

Is there any reason to acquire it in cpufreq_governor_limits(), then,
for example?

> And then there are enough subtle things that can go wrong if multiple
> CPUs do the load evaluation and freq-update at the same time, apart from
> it being a waste of time.
>
> And so I still think that the commit log isn't that bad. The timer_mutex
> lock isn't required in other parts of the governor; it is just for
> synchronizing the work-handlers of CPUs belonging to the same policy.

I agree that it doesn't serve any purpose in the piece of code you're
removing it from (which is why I agree with the patch), but the changelog
is incomplete and confusing.

Thanks,
Rafael
On 28-10-15, 06:54, Rafael J. Wysocki wrote:
> On Wednesday, October 28, 2015 10:14:51 AM Viresh Kumar wrote:
> > In cases where a single policy controls multiple CPUs, a timer is queued
> > for every cpu present in policy->cpus. When we reach the timer handler
> > (which can run on multiple CPUs at the same time) on any CPU, we evaluate
> > the CPU load for all policy->cpus and update the frequency accordingly.
>
> That would be in dbs_timer(), right?

Yeah, and we already do that work under the mutex there.

> > The lock prevents multiple CPUs from doing this work at the same time, as
> > it only needs to be done by a single CPU. Once any CPU's handler has
> > completed, it updates the last update time and drops the mutex. At that
> > point the other blocked handlers (if any) check the last update time and
> > return early.
>
> Well, that would mean we only needed to hold the lock around the
> need_load_eval() evaluation in dbs_timer() if I'm not mistaken.

Actually yeah, but the fourth patch of this series uses the timer_mutex
to fix a long-standing problem (which was earlier fixed by hacking around
the code), and so we need to take the lock for the entire dbs_timer()
routine.

> We should also acquire it around updates of the sampling rate, which
> essentially is set_sampling_rate().

Why? In the worst case we may schedule the next timer with the old
sampling rate. But do we care enough about that race to add locks here
as well?

> Is there any reason to acquire it in cpufreq_governor_limits(), then,
> for example?

Yeah, we are calling dbs_check_cpu(dbs_data, cpu) from that path, which
will reevaluate the load.
On Wednesday, October 28, 2015 12:13:17 PM Viresh Kumar wrote:
> On 28-10-15, 06:54, Rafael J. Wysocki wrote:
> > On Wednesday, October 28, 2015 10:14:51 AM Viresh Kumar wrote:
> > > In cases where a single policy controls multiple CPUs, a timer is
> > > queued for every cpu present in policy->cpus. When we reach the timer
> > > handler (which can run on multiple CPUs at the same time) on any CPU,
> > > we evaluate the CPU load for all policy->cpus and update the frequency
> > > accordingly.
> >
> > That would be in dbs_timer(), right?
>
> Yeah, and we already do that work under the mutex there.
>
> > > The lock prevents multiple CPUs from doing this work at the same time,
> > > as it only needs to be done by a single CPU. Once any CPU's handler has
> > > completed, it updates the last update time and drops the mutex. At that
> > > point the other blocked handlers (if any) check the last update time
> > > and return early.
> >
> > Well, that would mean we only needed to hold the lock around the
> > need_load_eval() evaluation in dbs_timer() if I'm not mistaken.
>
> Actually yeah, but the fourth patch of this series uses the timer_mutex
> to fix a long-standing problem (which was earlier fixed by hacking around
> the code), and so we need to take the lock for the entire dbs_timer()
> routine.

I don't actually think that that patch is correct, and even if it is,
we'll only need to do that *after* that patch, so at least it would be
fair to say a word about it in the changelog, wouldn't it?

> > We should also acquire it around updates of the sampling rate, which
> > essentially is set_sampling_rate().
>
> Why? In the worst case we may schedule the next timer with the old
> sampling rate. But do we care enough about that race to add locks here
> as well?

OK. That works because we actually hold the mutex around the whole
function, as otherwise we'd have seen races between delayed work items
on different CPUs sharing the policy.

> > Is there any reason to acquire it in cpufreq_governor_limits(), then,
> > for example?
>
> Yeah, we are calling dbs_check_cpu(dbs_data, cpu) from that path, which
> will reevaluate the load.

Which means that we should take the lock around dbs_check_cpu()
everywhere in a consistent way. Which in turn means that the lock
actually does more than you said.

My point is basically that we seem to have a vague idea about what the
lock is used for, while we need to know exactly why we need it.

Thanks,
Rafael
On 28-10-15, 08:46, Rafael J. Wysocki wrote:
> On Wednesday, October 28, 2015 12:13:17 PM Viresh Kumar wrote:
> > Actually yeah, but the fourth patch of this series uses the timer_mutex
> > to fix a long-standing problem (which was earlier fixed by hacking
> > around the code), and so we need to take the lock for the entire
> > dbs_timer() routine.

Well, there is another reason why the lock is taken for the complete
dbs_timer() routine. There are two parts of that routine:

- Checking whether load evaluation is required or not, plus updating the
  last-update time.
- The load evaluation plus frequency change itself.

The lock around the first part makes sure that the timer handlers of
other CPUs don't do the load evaluation in parallel, and that they don't
do it before the sampling period has elapsed.

The lock around the second part makes sure there is only one thread doing
the load evaluation plus frequency update. The other such thread is
cpufreq_governor_limits(), and so the same lock is taken across that part
as well.

> I don't actually think that that patch is correct, and even if it is,
> we'll only need to do that *after* that patch, so at least it would be
> fair to say a word about it in the changelog, wouldn't it?

Hmm. If you agree with the above reasoning, then we may not require an
update to the changelog; otherwise I will mention that in the changelog
of this patch.

> > Yeah, we are calling dbs_check_cpu(dbs_data, cpu) from that path,
> > which will reevaluate the load.
>
> Which means that we should take the lock around dbs_check_cpu()
> everywhere in a consistent way.

We already do this everywhere.

> Which in turn means that the lock actually does more than you said.

What I described towards the top is probably a better answer to the
earlier query.

> My point is basically that we seem to have a vague idea about what the
> lock is used for, while we need to know exactly why we need it.

I am totally with you on this; we have surely screwed up on locking in
cpufreq for a long time.
And we should know exactly why we want to change it now.
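The two-part structure described above would look roughly like this. This is kernel-style pseudocode reconstructed from the discussion, not the actual source; the label name and the elided arguments are placeholders.

```c
static void dbs_timer(struct work_struct *work)
{
	/* ... resolve dbs_info, shared, cpu from the work item ... */

	mutex_lock(&shared->timer_mutex);

	/*
	 * Part 1: the serialized check. Keeps the timer handlers of the
	 * other policy->cpus from evaluating the load in parallel, or
	 * before the sampling period has elapsed.
	 */
	if (!need_load_eval(shared, sampling_rate))
		goto unlock;

	/*
	 * Part 2: load evaluation + frequency update. The same lock also
	 * excludes cpufreq_governor_limits(), which calls dbs_check_cpu()
	 * from another thread.
	 */
	dbs_check_cpu(dbs_data, cpu);

unlock:
	mutex_unlock(&shared->timer_mutex);
}
```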
```diff
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 1fa9088c84a8..03ac6ce54042 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -267,27 +267,19 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 		dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
 		cpufreq_cpu_put(policy);
 
-		mutex_lock(&dbs_info->cdbs.shared->timer_mutex);
-
-		if (!delayed_work_pending(&dbs_info->cdbs.dwork)) {
-			mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
+		if (!delayed_work_pending(&dbs_info->cdbs.dwork))
 			continue;
-		}
 
 		next_sampling = jiffies + usecs_to_jiffies(new_rate);
 		appointed_at = dbs_info->cdbs.dwork.timer.expires;
 
 		if (time_before(next_sampling, appointed_at)) {
-
-			mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
 			cancel_delayed_work_sync(&dbs_info->cdbs.dwork);
-			mutex_lock(&dbs_info->cdbs.shared->timer_mutex);
 
 			gov_queue_work(dbs_data, policy,
 				       usecs_to_jiffies(new_rate), true);
 
 		}
-
-		mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
 	}
 }
```