Message ID | 2298639.F0Qv2QqRnZ@vostro.rjw.lan (mailing list archive) |
---|---|
State | Accepted, archived |
Wow!! Lot of stuff happened while I was asleep..

@Srivatsa: Thanks for answering what I would have answered to Rafael :)
And you should really get some sleep, I would suggest :)

On 2 August 2013 02:23, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: cpufreq: Do not hold driver module references for additional policy CPUs

I still have issues with this subject. Why don't we get rid of .owner
field completely? And stop using a mix of cpufreq_cpu_get() and
kobject_get()?

> The cpufreq core is a little inconsistent in the way it uses the
> driver module refcount.
>
> Namely, if __cpufreq_add_dev() is called for a CPU that doesn't
> share the policy object with any other CPUs, the driver module
> refcount it grabs to start with will be dropped by it before
> returning and will be equal to 0 afterward.

It wouldn't be zero but 1, this is what it is initialized with probably.
That's what I can see in my tests.

> However, if the given CPU does share the policy object with other
> CPUs, either cpufreq_add_policy_cpu() is called to link the new CPU
> to the existing policy, or cpufreq_add_dev_symlink() is used to link
> the other CPUs sharing the policy with it to the just created policy
> object. In that case, because both cpufreq_add_policy_cpu() and
> cpufreq_add_dev_symlink() call cpufreq_cpu_get() for the given
> policy (the latter possibly many times) without the balancing
> cpufreq_cpu_put() (unless there is an error), the driver module
> refcount will be left by __cpufreq_add_dev() with a nonzero value.
>
> To remove that inconsistency make cpufreq_add_policy_cpu() execute
> cpufreq_cpu_put() for the given policy before returning, which
> decrements the driver module refcount so that it will be 0 after
> __cpufreq_add_dev() returns. Moreover, remove the cpufreq_cpu_get()
> call from cpufreq_add_dev_symlink(), since both the policy refcount
> and the driver module refcount are nonzero when it is called and they
> don't need to be bumped up by it.
>
> Accordingly, drop the cpufreq_cpu_put() from __cpufreq_remove_dev(),
> since it is only necessary to balance the cpufreq_cpu_get() called
> by cpufreq_add_policy_cpu() or cpufreq_add_dev_symlink().
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 28 +++++++---------------------
>  1 file changed, 7 insertions(+), 21 deletions(-)

So, we can't rmmod the module as soon as it is inserted and so the
problem stays as is. :(

> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
>  			continue;
>
>  		pr_debug("Adding link for CPU: %u\n", j);
> -		cpufreq_cpu_get(policy->cpu);
>  		cpu_dev = get_cpu_device(j);
>  		ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
>  					"cpufreq");
> -		if (ret) {
> -			cpufreq_cpu_put(policy);
> -			return ret;
> -		}
> +		if (ret)
> +			break;
>  	}
>  	return ret;
>  }
> @@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
>  	unsigned long flags;
>
>  	policy = cpufreq_cpu_get(sibling);

This can be skipped completely at this place. Caller of
cpufreq_add_policy_cpu() has got the policy pointer with it and so
can be passed. I haven't done it earlier as the impression was we need
to call cpufreq_cpu_get()..

> -	WARN_ON(!policy);
> +	if (WARN_ON_ONCE(!policy))
> +		return -ENODATA;
>
>  	if (has_target)
>  		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> @@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
>  	}
>
>  	/* Don't touch sysfs links during light-weight init */
> -	if (frozen) {
> -		/* Drop the extra refcount that we took above */
> -		cpufreq_cpu_put(policy);
> -		return 0;
> -	}
> -
> -	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> -	if (ret)
> -		cpufreq_cpu_put(policy);
> +	if (!frozen)
> +		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>
> +	cpufreq_cpu_put(policy);

And so this will go away.

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/02/2013 10:07 AM, Viresh Kumar wrote:
> Wow!! Lot of stuff happened while I was asleep..
>
> @Srivatsa: Thanks for answering what I would have answered to Rafael :)
> And you should really get some sleep, I would suggest :)

No problem :-) And thank you for your concern :-)

>
> On 2 August 2013 02:23, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Subject: cpufreq: Do not hold driver module references for additional policy CPUs
>
> I still have issues with this subject. Why don't we get rid of .owner
> field completely? And stop using a mix of cpufreq_cpu_get() and
> kobject_get()?
>

I guess Rafael's intention is to do one thing at a time - fix the
inconsistency first, and then rework the synchronization on top of it.
And that makes sense to me, since it's the logical way of fixing all
these issues.

>> The cpufreq core is a little inconsistent in the way it uses the
>> driver module refcount.
>>
>> Namely, if __cpufreq_add_dev() is called for a CPU that doesn't
>> share the policy object with any other CPUs, the driver module
>> refcount it grabs to start with will be dropped by it before
>> returning and will be equal to 0 afterward.
>
> It wouldn't be zero but 1, this is what it is initialized with probably.
> That's what I can see in my tests.
>

But lsmod shows 0 for the cpufreq driver right? (Note, your related_cpus
should have only 1 CPU each, for you to see 0. Else, you'll see a non-zero
value due to the very bug/inconsistency that Rafael is fixing in this
patch).

>> However, if the given CPU does share the policy object with other
>> CPUs, either cpufreq_add_policy_cpu() is called to link the new CPU
>> to the existing policy, or cpufreq_add_dev_symlink() is used to link
>> the other CPUs sharing the policy with it to the just created policy
>> object. In that case, because both cpufreq_add_policy_cpu() and
>> cpufreq_add_dev_symlink() call cpufreq_cpu_get() for the given
>> policy (the latter possibly many times) without the balancing
>> cpufreq_cpu_put() (unless there is an error), the driver module
>> refcount will be left by __cpufreq_add_dev() with a nonzero value.
>>
>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
>> cpufreq_cpu_put() for the given policy before returning, which
>> decrements the driver module refcount so that it will be 0 after
>> __cpufreq_add_dev() returns. Moreover, remove the cpufreq_cpu_get()
>> call from cpufreq_add_dev_symlink(), since both the policy refcount
>> and the driver module refcount are nonzero when it is called and they
>> don't need to be bumped up by it.
>>
>> Accordingly, drop the cpufreq_cpu_put() from __cpufreq_remove_dev(),
>> since it is only necessary to balance the cpufreq_cpu_get() called
>> by cpufreq_add_policy_cpu() or cpufreq_add_dev_symlink().
>>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>> ---
>>  drivers/cpufreq/cpufreq.c | 28 +++++++---------------------
>>  1 file changed, 7 insertions(+), 21 deletions(-)
>
> So, we can't rmmod the module as soon as it is inserted and so the
> problem stays as is. :(
>

No, we get one step closer to the solution, since we fix the inconsistency
between refcounts. Next step would be to get rid of refcounts and use
locking like you suggested. Then we can rmmod it easily. I'm assuming
Rafael has the same plan.

(We could have done all this in one shot, but that would make it difficult
to track regressions etc. So good to have each improvement in a separate
patch).

>> Index: linux-pm/drivers/cpufreq/cpufreq.c
>> ===================================================================
>> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
>> +++ linux-pm/drivers/cpufreq/cpufreq.c
>> @@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
>>  			continue;
>>
>>  		pr_debug("Adding link for CPU: %u\n", j);
>> -		cpufreq_cpu_get(policy->cpu);
>>  		cpu_dev = get_cpu_device(j);
>>  		ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
>>  					"cpufreq");
>> -		if (ret) {
>> -			cpufreq_cpu_put(policy);
>> -			return ret;
>> -		}
>> +		if (ret)
>> +			break;
>>  	}
>>  	return ret;
>>  }
>> @@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
>>  	unsigned long flags;
>>
>>  	policy = cpufreq_cpu_get(sibling);
>
> This can be skipped completely at this place. Caller of
> cpufreq_add_policy_cpu() has got the policy pointer with it and so
> can be passed. I haven't done it earlier as the impression was we need
> to call cpufreq_cpu_get()..
>

Agreed, that would be a good cleanup.

>> -	WARN_ON(!policy);
>> +	if (WARN_ON_ONCE(!policy))
>> +		return -ENODATA;
>>
>>  	if (has_target)
>>  		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
>> @@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
>>  	}
>>
>>  	/* Don't touch sysfs links during light-weight init */
>> -	if (frozen) {
>> -		/* Drop the extra refcount that we took above */
>> -		cpufreq_cpu_put(policy);
>> -		return 0;
>> -	}
>> -
>> -	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>> -	if (ret)
>> -		cpufreq_cpu_put(policy);
>> +	if (!frozen)
>> +		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>>
>> +	cpufreq_cpu_put(policy);
>
> And so this will go away.
>

Regards,
Srivatsa S. Bhat
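The refcount imbalance described in the changelog can be illustrated with a small user-space model. This is only a sketch: `toy_module`, `toy_policy` and every `toy_*` function are hypothetical stand-ins, not the kernel's structures or API. The one behaviour taken from the thread is that a get bumps both the driver module refcount and the policy refcount, and a put drops both.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are the kernel's real structures. */
struct toy_module { int refcount; };
struct toy_policy { struct toy_module *owner; int kobj_refs; };

/* Models cpufreq_cpu_get(): pins the driver module and the policy. */
static struct toy_policy *toy_cpu_get(struct toy_policy *p)
{
	p->owner->refcount++;
	p->kobj_refs++;
	return p;
}

/* Models cpufreq_cpu_put(): drops both references again. */
static void toy_cpu_put(struct toy_policy *p)
{
	p->kobj_refs--;
	p->owner->refcount--;
}

/* Behaviour before the patch: one unbalanced get per linked CPU. */
static void toy_add_policy_cpu_old(struct toy_policy *p)
{
	toy_cpu_get(p);
	/* ... link the CPU to the existing policy ... */
}

/* Behaviour after the patch: the get is balanced before returning. */
static void toy_add_policy_cpu_new(struct toy_policy *p)
{
	toy_cpu_get(p);
	/* ... link the CPU to the existing policy ... */
	toy_cpu_put(p);
}

/* Module refcount left behind after linking `extra_cpus` sibling CPUs. */
int toy_refcount_after(int extra_cpus, bool patched)
{
	struct toy_module mod = { .refcount = 0 };
	struct toy_policy pol = { .owner = &mod, .kobj_refs = 1 };
	int i;

	for (i = 0; i < extra_cpus; i++) {
		if (patched)
			toy_add_policy_cpu_new(&pol);
		else
			toy_add_policy_cpu_old(&pol);
	}
	return mod.refcount;
}
```

In this model, linking three sibling CPUs leaves the module refcount at 3 with the old pairing and at 0 with the patched pairing, which is the inconsistency the changelog sets out to remove.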
On 2 August 2013 12:19, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> But lsmod shows 0 for the cpufreq driver right? (Note, your related_cpus
> should have only 1 CPU each, for you to see 0. Else, you'll see a non-zero
> value due to the very bug/inconsistency that Rafael is fixing in this
> patch).

I have hacked the driver this way:

@@ -2114,10 +2114,16 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	cpufreq_driver = driver_data;
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);

+	printk(KERN_INFO "%s: Module refcount: %lu\n", __func__,
+			module_refcount(cpufreq_driver->owner));
+
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
 		goto err_null_driver;

And this gave me 1..
On 08/02/2013 12:29 PM, Viresh Kumar wrote:
> On 2 August 2013 12:19, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> But lsmod shows 0 for the cpufreq driver right? (Note, your related_cpus
>> should have only 1 CPU each, for you to see 0. Else, you'll see a non-zero
>> value due to the very bug/inconsistency that Rafael is fixing in this
>> patch).
>
> I have hacked the driver this way:
>
> @@ -2114,10 +2114,16 @@ int cpufreq_register_driver(struct
> cpufreq_driver *driver_data)
>  	cpufreq_driver = driver_data;
>  	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>
> +	printk(KERN_INFO "%s: Module refcount: %lu\n", __func__,
> +			module_refcount(cpufreq_driver->owner));
> +
>  	ret = subsys_interface_register(&cpufreq_interface);
>  	if (ret)
>  		goto err_null_driver;
>
>
> And this gave me 1..
>

Well, on my system, lsmod shows:

acpi_cpufreq           13643  0

The last column is the refcount, as printed by:

kernel/module.c: print_unload_info()
913		seq_printf(m, " %lu ", module_refcount(mod));

I guess you are printing it at an odd time, when the module
is still running its init function. Perhaps the core kernel
module infrastructure increments the refcount around that
region temporarily?

Regards,
Srivatsa S. Bhat
On 2 August 2013 12:39, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> I guess you are printing it at an odd time, when the module
> is still running its init function. Perhaps the core kernel
> module infrastructure increments the refcount around that
> region temporarily?

If I think logically, that sounds correct. I haven't looked at the
implementation details though.

But yes, I understood why my refcount was incremented :)

Thanks.
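The hypothesis in this exchange, that the module loader holds a reference of its own while the init function runs, is not checked against the module code anywhere in the thread. The following is a toy user-space model of that hypothesis only; `toy_module`, `toy_load_module()` and the other `toy_*` names are invented for illustration and do not correspond to kernel functions.

```c
#include <assert.h>

/* Hypothetical module object; not the kernel's struct module. */
struct toy_module {
	int refcount;
	int seen_in_init;	/* refcount as observed from inside init */
	int (*init)(struct toy_module *self);
};

/* Stand-in for a driver init that prints the refcount, as in the hack. */
static int toy_driver_init(struct toy_module *self)
{
	self->seen_in_init = self->refcount;
	return 0;
}

/*
 * Hypothetical loader: assumed to hold one reference for the whole
 * duration of init and to drop it once init has returned.
 */
int toy_load_module(struct toy_module *mod)
{
	int ret;

	mod->refcount = 1;	/* held by the loader during init */
	ret = mod->init(mod);
	mod->refcount--;	/* dropped after init completes */
	return ret;
}

/* Refcount that code running inside init would observe. */
int toy_refcount_seen_in_init(void)
{
	struct toy_module m = { .seen_in_init = -1, .init = toy_driver_init };

	toy_load_module(&m);
	return m.seen_in_init;
}

/* Refcount that a later lsmod-style reader would observe. */
int toy_refcount_after_load(void)
{
	struct toy_module m = { .seen_in_init = -1, .init = toy_driver_init };

	toy_load_module(&m);
	return m.refcount;
}
```

Under that assumption, a print placed inside the registration path sees 1 while a reader after loading sees 0, which would reconcile the two observations reported above.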
On 2 August 2013 12:19, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> On 08/02/2013 10:07 AM, Viresh Kumar wrote:
>> So, we can't rmmod the module as soon as it is inserted and so the
>> problem stays as is. :(
>>
>
> No, we get one step closer to the solution, since we fix the inconsistency
> between refcounts. Next step would be to get rid of refcounts and use
> locking like you suggested. Then we can rmmod it easily. I'm assuming
> Rafael has the same plan.

Not really. We are putting the reference at the end of add_dev() and
so refcount would be zero when we aren't running any critical sections.
And so, we can rmmod the module now and that problem is gone.

@Rafael: I will try to do generic cleanups in cpufreq in coming time
and will take care to remove .owner field completely in that. Until that
point your patches look fine:

For both of your patches:
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
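The point that the module becomes removable once the reference is dropped at the end of add_dev() can be sketched as a toy unload gate. Everything here is a hypothetical user-space model (the `toy_*` names are invented); it only assumes that an unload request is refused while any reference is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical module object; not the kernel's struct module. */
struct toy_module { int refcount; bool loaded; };

/* Take a reference, refusing if the module is already gone. */
static bool toy_try_get(struct toy_module *m)
{
	if (!m->loaded)
		return false;
	m->refcount++;
	return true;
}

static void toy_put(struct toy_module *m)
{
	m->refcount--;
}

/* Models an rmmod request: refused while references are held. */
static bool toy_try_unload(struct toy_module *m)
{
	if (m->refcount != 0)
		return false;
	m->loaded = false;
	return true;
}

/* Models add_dev(): pins the module only for the critical section. */
static bool toy_add_dev(struct toy_module *m, bool *unloaded_midway)
{
	if (!toy_try_get(m))
		return false;
	/* An unload request arriving here must be refused. */
	*unloaded_midway = toy_try_unload(m);
	toy_put(m);
	return true;
}

/* Does an unload attempted inside the critical section succeed? */
bool toy_unload_midway_succeeds(void)
{
	struct toy_module m = { .refcount = 0, .loaded = true };
	bool midway = true;

	toy_add_dev(&m, &midway);
	return midway;
}

/* Does an unload attempted after add_dev() returned succeed? */
bool toy_unload_afterwards_succeeds(void)
{
	struct toy_module m = { .refcount = 0, .loaded = true };
	bool midway = false;

	toy_add_dev(&m, &midway);
	return toy_try_unload(&m);
}
```

In this model the unload is refused only while the reference is held and succeeds as soon as add_dev() has returned.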
On 08/02/2013 03:06 PM, Viresh Kumar wrote:
> On 2 August 2013 12:19, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> On 08/02/2013 10:07 AM, Viresh Kumar wrote:
>>> So, we can't rmmod the module as soon as it is inserted and so the
>>> problem stays as is. :(
>>>
>>
>> No, we get one step closer to the solution, since we fix the inconsistency
>> between refcounts. Next step would be to get rid of refcounts and use
>> locking like you suggested. Then we can rmmod it easily. I'm assuming
>> Rafael has the same plan.
>
> Not really. We are putting the reference at the end of add_dev() and
> so refcount would be zero when we aren't running any critical sections.
> And so, we can rmmod the module now and that problem is gone.
>

Ah, yes, you are right.

> @Rafael: I will try to do generic cleanups in cpufreq in coming time
> and will take care to remove .owner field completely in that. Until that
> point your patches look fine:
>
> For both of your patches:
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>

Regards,
Srivatsa S. Bhat
On 2 August 2013 02:23, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> To remove that inconsistency make cpufreq_add_policy_cpu() execute
> cpufreq_cpu_put() for the given policy before returning, which
> decrements the driver module refcount so that it will be 0 after
> __cpufreq_add_dev() returns. Moreover, remove the cpufreq_cpu_get()
> call from cpufreq_add_dev_symlink(), since both the policy refcount
> and the driver module refcount are nonzero when it is called and they
> don't need to be bumped up by it.

Sorry for creating so many problems but my concerns with this patch
aren't yet over :(

Should we increment policy refcount or kobj refcount for every cpu it
is used on? I think yes, that's probably the right way of doing it.

And so we simply can't remove calls to cpufreq_cpu_get() from
cpufreq_add_dev_symlink() routine and also from
cpufreq_add_policy_cpu()..
On 08/02/2013 04:00 PM, Viresh Kumar wrote:
> On 2 August 2013 02:23, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
>> cpufreq_cpu_put() for the given policy before returning, which
>> decrements the driver module refcount so that it will be 0 after
>> __cpufreq_add_dev() returns. Moreover, remove the cpufreq_cpu_get()
>> call from cpufreq_add_dev_symlink(), since both the policy refcount
>> and the driver module refcount are nonzero when it is called and they
>> don't need to be bumped up by it.
>
> Sorry for creating so many problems but my concerns with this patch
> aren't yet over :(
>
> Should we increment policy refcount or kobj refcount for every cpu it
> is used on? I think yes, that's probably the right way of doing it.
>

It depends on how you look at it. The number of CPUs in the policy
(cpumask_weight(policy)) itself serves as a refcount. We don't actually
need yet another refcount to manage things. Besides, not bumping up the
policy refcount for every CPU actually seems to simplify the code and
make it easier to understand, so why not do it? :-)

> And so we simply can't remove calls to cpufreq_cpu_get() from
> cpufreq_add_dev_symlink() routine and also from
> cpufreq_add_policy_cpu()..

Regards,
Srivatsa S. Bhat
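The argument that the number of CPUs in a policy already acts as a refcount can be sketched in user space. This is a minimal model under stated assumptions: `toy_policy` is a hypothetical structure with a 64-bit mask standing in for the policy's CPU mask, and `toy_weight()` stands in for a population count such as cpumask_weight(); none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy: bit n of `cpus` set means CPU n uses it. */
struct toy_policy {
	uint64_t cpus;
	bool freed;
};

/* Number of set bits in the mask, i.e. the number of CPUs using it. */
static int toy_weight(uint64_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

static void toy_policy_add_cpu(struct toy_policy *p, unsigned int cpu)
{
	p->cpus |= UINT64_C(1) << cpu;
}

/* Removing the last CPU frees the policy; no separate counter needed. */
static void toy_policy_remove_cpu(struct toy_policy *p, unsigned int cpu)
{
	p->cpus &= ~(UINT64_C(1) << cpu);
	if (toy_weight(p->cpus) == 0)
		p->freed = true;
}

/* Build a policy shared by CPUs 0..3, then remove the first `removals`. */
bool toy_policy_freed_after(unsigned int removals)
{
	struct toy_policy p = { .cpus = 0, .freed = false };
	unsigned int cpu;

	for (cpu = 0; cpu < 4; cpu++)
		toy_policy_add_cpu(&p, cpu);
	for (cpu = 0; cpu < removals && cpu < 4; cpu++)
		toy_policy_remove_cpu(&p, cpu);
	return p.freed;
}
```

With four CPUs sharing the policy, it survives three removals and is freed on the fourth, so the mask weight alone decides the object's lifetime in this model.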
Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
 			continue;
 
 		pr_debug("Adding link for CPU: %u\n", j);
-		cpufreq_cpu_get(policy->cpu);
 		cpu_dev = get_cpu_device(j);
 		ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
 					"cpufreq");
-		if (ret) {
-			cpufreq_cpu_put(policy);
-			return ret;
-		}
+		if (ret)
+			break;
 	}
 	return ret;
 }
@@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
 	unsigned long flags;
 
 	policy = cpufreq_cpu_get(sibling);
-	WARN_ON(!policy);
+	if (WARN_ON_ONCE(!policy))
+		return -ENODATA;
 
 	if (has_target)
 		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
@@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
 	}
 
 	/* Don't touch sysfs links during light-weight init */
-	if (frozen) {
-		/* Drop the extra refcount that we took above */
-		cpufreq_cpu_put(policy);
-		return 0;
-	}
-
-	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
-	if (ret)
-		cpufreq_cpu_put(policy);
+	if (!frozen)
+		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
 
+	cpufreq_cpu_put(policy);
 	return ret;
 }
 #endif
@@ -1298,12 +1290,6 @@ static int __cpufreq_remove_dev(struct d
 		if (!frozen)
 			cpufreq_policy_free(data);
 	} else {
-
-		if (!frozen) {
-			pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
-			cpufreq_cpu_put(data);
-		}
-
 		if (cpufreq_driver->target) {
 			__cpufreq_governor(data, CPUFREQ_GOV_START);
 			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);