Message ID | 1365770165-27096-7-git-send-email-daniel.lezcano@linaro.org (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
On 04/12/2013 02:35 PM, Daniel Lezcano wrote: > The usual scheme to initialize a cpuidle driver on a SMP is: > > cpuidle_register_driver(drv); > for_each_possible_cpu(cpu) { > device = &per_cpu(cpuidle_dev, cpu); > cpuidle_register_device(device); > } > > This code is duplicated in each cpuidle driver. > > On UP systems, it is done this way: > > cpuidle_register_driver(drv); > device = &per_cpu(cpuidle_dev, cpu); > cpuidle_register_device(device); > > On UP, the macro 'for_each_cpu' does one iteration: > > #define for_each_cpu(cpu, mask) \ > for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask) > > Hence, the initialization loop is the same for UP than SMP. > > Beside, we saw different bugs / mis-initialization / return code unchecked in > the different drivers, the code is duplicated including bugs. After fixing all > these ones, it appears the initialization pattern is the same for everyone. > > Please note, some drivers are doing dev->state_count = drv->state_count. This is > not necessary because it is done by the cpuidle_enable_device function in the > cpuidle framework. This is true, until you have the same states for all your > devices. Otherwise, the 'low level' API should be used instead with the specific > initialization for the driver. > > Let's add a wrapper function doing this initialization with a cpumask parameter > for the coupled idle states and use it for all the drivers. > > That will save a lot of LOC, consolidate the code, and the modifications in the > future could be done in a single place. Another benefit is the consolidation of > the cpuidle_device variable which is now in the cpuidle framework and no longer > spread accross the different arch specific drivers. > > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> > --- Hi Rob, Andrew, are you ok with this new version ? Thanks -- Daniel > Documentation/cpuidle/driver.txt | 6 ++++ > drivers/cpuidle/cpuidle.c | 72 ++++++++++++++++++++++++++++++++++++++ > include/linux/cpuidle.h | 9 +++-- > 3 files changed, 85 insertions(+), 2 deletions(-) > > diff --git a/Documentation/cpuidle/driver.txt b/Documentation/cpuidle/driver.txt > index 7a9e09e..1b0d81d 100644 > --- a/Documentation/cpuidle/driver.txt > +++ b/Documentation/cpuidle/driver.txt > @@ -15,11 +15,17 @@ has mechanisms in place to support actual entry-exit into CPU idle states. > cpuidle driver initializes the cpuidle_device structure for each CPU device > and registers with cpuidle using cpuidle_register_device. > > +If all the idle states are the same, the wrapper function cpuidle_register > +could be used instead. > + > It can also support the dynamic changes (like battery <-> AC), by using > cpuidle_pause_and_lock, cpuidle_disable_device and cpuidle_enable_device, > cpuidle_resume_and_unlock. > > Interfaces: > +extern int cpuidle_register(struct cpuidle_driver *drv, > + const struct cpumask *const coupled_cpus); > +extern int cpuidle_unregister(struct cpuidle_driver *drv); > extern int cpuidle_register_driver(struct cpuidle_driver *drv); > extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); > extern int cpuidle_register_device(struct cpuidle_device *dev); > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c > index 0da795b..49e8d30 100644 > --- a/drivers/cpuidle/cpuidle.c > +++ b/drivers/cpuidle/cpuidle.c > @@ -24,6 +24,7 @@ > #include "cpuidle.h" > > DEFINE_PER_CPU(struct cpuidle_device *, cpuidle_devices); > +DEFINE_PER_CPU(struct cpuidle_device, cpuidle_dev); > > DEFINE_MUTEX(cpuidle_lock); > LIST_HEAD(cpuidle_detected_devices); > @@ -453,6 +454,77 @@ void cpuidle_unregister_device(struct cpuidle_device *dev) > > EXPORT_SYMBOL_GPL(cpuidle_unregister_device); > > +/* > + * cpuidle_unregister: unregister a driver and the devices. This function > + * can be used only if the driver has been previously registered through > + * the cpuidle_register function. > + * > + * @drv: a valid pointer to a struct cpuidle_driver > + */ > +void cpuidle_unregister(struct cpuidle_driver *drv) > +{ > + int cpu; > + struct cpuidle_device *device; > + > + for_each_possible_cpu(cpu) { > + device = &per_cpu(cpuidle_dev, cpu); > + cpuidle_unregister_device(device); > + } > + > + cpuidle_unregister_driver(drv); > +} > +EXPORT_SYMBOL_GPL(cpuidle_unregister); > + > +/** > + * cpuidle_register: registers the driver and the cpu devices with the > + * coupled_cpus passed as parameter. This function is used for all common > + * initialization pattern there are in the arch specific drivers. The > + * devices is globally defined in this file. > + * > + * @drv : a valid pointer to a struct cpuidle_driver > + * @coupled_cpus: a cpumask for the coupled states > + * > + * Returns 0 on success, < 0 otherwise > + */ > +int cpuidle_register(struct cpuidle_driver *drv, > + const struct cpumask *const coupled_cpus) > +{ > + int ret, cpu; > + struct cpuidle_device *device; > + > + ret = cpuidle_register_driver(drv); > + if (ret) { > + pr_err("failed to register cpuidle driver\n"); > + return ret; > + } > + > + for_each_possible_cpu(cpu) { > + device = &per_cpu(cpuidle_dev, cpu); > + device->cpu = cpu; > + > +#ifdef CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED > + /* > + * On multiplatform for ARM, the coupled idle states could > + * enabled in the kernel even if the cpuidle driver does not > + * use it. Note, coupled_cpus is a struct copy. > + */ > + if (coupled_cpus) > + device->coupled_cpus = *coupled_cpus; > +#endif > + ret = cpuidle_register_device(device); > + if (!ret) > + continue; > + > + pr_err("Failed to register cpuidle device for cpu%d\n", cpu); > + > + cpuidle_unregister(drv); > + break; > + } > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(cpuidle_register); > + > #ifdef CONFIG_SMP > > static void smp_callback(void *v) > diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h > index 79e3811..3c86faa 100644 > --- a/include/linux/cpuidle.h > +++ b/include/linux/cpuidle.h > @@ -123,7 +123,9 @@ extern void cpuidle_driver_unref(void); > extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); > extern int cpuidle_register_device(struct cpuidle_device *dev); > extern void cpuidle_unregister_device(struct cpuidle_device *dev); > - > +extern int cpuidle_register(struct cpuidle_driver *drv, > + const struct cpumask *const coupled_cpus); > +extern void cpuidle_unregister(struct cpuidle_driver *drv); > extern void cpuidle_pause_and_lock(void); > extern void cpuidle_resume_and_unlock(void); > extern void cpuidle_pause(void); > @@ -148,7 +150,10 @@ static inline void cpuidle_unregister_driver(struct cpuidle_driver *drv) { } > static inline int cpuidle_register_device(struct cpuidle_device *dev) > {return -ENODEV; } > static inline void cpuidle_unregister_device(struct cpuidle_device *dev) { } > - > +static inline int cpuidle_register(struct cpuidle_driver *drv, > + const struct cpumask *const coupled_cpus) > +{return -ENODEV; } > +static inline void cpuidle_unregister(struct cpuidle_driver *drv) { } > static inline void cpuidle_pause_and_lock(void) { } > static inline void cpuidle_resume_and_unlock(void) { } > static inline void cpuidle_pause(void) { }
On Friday 12 April 2013 06:05 PM, Daniel Lezcano wrote: > The usual scheme to initialize a cpuidle driver on a SMP is: > > cpuidle_register_driver(drv); > for_each_possible_cpu(cpu) { > device = &per_cpu(cpuidle_dev, cpu); > cpuidle_register_device(device); > } > Not exactly related to $subject patch but the driver should be registered after all devices has been registered to avoid devices start using the idle state data as soon as it is registered. In multi CPU system, this race can easily happen. Current CPUIDLE core layer is also written with the assumption that driver will be registered first and then the devices which is not mandatory as per typical drive/device model. May be you can fix that part while you are creating this common wrapper. > This code is duplicated in each cpuidle driver. > > On UP systems, it is done this way: > > cpuidle_register_driver(drv); > device = &per_cpu(cpuidle_dev, cpu); > cpuidle_register_device(device); > > On UP, the macro 'for_each_cpu' does one iteration: > > #define for_each_cpu(cpu, mask) \ > for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask) > > Hence, the initialization loop is the same for UP than SMP. > > Beside, we saw different bugs / mis-initialization / return code unchecked in > the different drivers, the code is duplicated including bugs. After fixing all > these ones, it appears the initialization pattern is the same for everyone. > > Please note, some drivers are doing dev->state_count = drv->state_count. This is > not necessary because it is done by the cpuidle_enable_device function in the > cpuidle framework. This is true, until you have the same states for all your > devices. Otherwise, the 'low level' API should be used instead with the specific > initialization for the driver. > > Let's add a wrapper function doing this initialization with a cpumask parameter > for the coupled idle states and use it for all the drivers. > > That will save a lot of LOC, consolidate the code, and the modifications in the > future could be done in a single place. Another benefit is the consolidation of > the cpuidle_device variable which is now in the cpuidle framework and no longer > spread accross the different arch specific drivers. > s/accross/across > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> > --- > Documentation/cpuidle/driver.txt | 6 ++++ > drivers/cpuidle/cpuidle.c | 72 ++++++++++++++++++++++++++++++++++++++ > include/linux/cpuidle.h | 9 +++-- > 3 files changed, 85 insertions(+), 2 deletions(-) > > diff --git a/Documentation/cpuidle/driver.txt b/Documentation/cpuidle/driver.txt > index 7a9e09e..1b0d81d 100644 > --- a/Documentation/cpuidle/driver.txt > +++ b/Documentation/cpuidle/driver.txt > @@ -15,11 +15,17 @@ has mechanisms in place to support actual entry-exit into CPU idle states. > cpuidle driver initializes the cpuidle_device structure for each CPU device > and registers with cpuidle using cpuidle_register_device. > > +If all the idle states are the same, the wrapper function cpuidle_register > +could be used instead. > + > It can also support the dynamic changes (like battery <-> AC), by using > cpuidle_pause_and_lock, cpuidle_disable_device and cpuidle_enable_device, > cpuidle_resume_and_unlock. > > Interfaces: > +extern int cpuidle_register(struct cpuidle_driver *drv, > + const struct cpumask *const coupled_cpus); > +extern int cpuidle_unregister(struct cpuidle_driver *drv); > extern int cpuidle_register_driver(struct cpuidle_driver *drv); > extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); > extern int cpuidle_register_device(struct cpuidle_device *dev); > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c > index 0da795b..49e8d30 100644 > --- a/drivers/cpuidle/cpuidle.c > +++ b/drivers/cpuidle/cpuidle.c > @@ -24,6 +24,7 @@ > #include "cpuidle.h" > > DEFINE_PER_CPU(struct cpuidle_device *, cpuidle_devices); > +DEFINE_PER_CPU(struct cpuidle_device, cpuidle_dev); > > DEFINE_MUTEX(cpuidle_lock); > LIST_HEAD(cpuidle_detected_devices); > @@ -453,6 +454,77 @@ void cpuidle_unregister_device(struct cpuidle_device *dev) > > EXPORT_SYMBOL_GPL(cpuidle_unregister_device); > > +/* s/*/** > + * cpuidle_unregister: unregister a driver and the devices. This function > + * can be used only if the driver has been previously registered through > + * the cpuidle_register function. > + * > + * @drv: a valid pointer to a struct cpuidle_driver > + */ > +void cpuidle_unregister(struct cpuidle_driver *drv) > +{ > + int cpu; > + struct cpuidle_device *device; > + > + for_each_possible_cpu(cpu) { > + device = &per_cpu(cpuidle_dev, cpu); > + cpuidle_unregister_device(device); > + } > + > + cpuidle_unregister_driver(drv); > +} > +EXPORT_SYMBOL_GPL(cpuidle_unregister); > + > +/** > + * cpuidle_register: registers the driver and the cpu devices with the > + * coupled_cpus passed as parameter. This function is used for all common > + * initialization pattern there are in the arch specific drivers. The > + * devices is globally defined in this file. > + * > + * @drv : a valid pointer to a struct cpuidle_driver > + * @coupled_cpus: a cpumask for the coupled states > + * > + * Returns 0 on success, < 0 otherwise > + */ > +int cpuidle_register(struct cpuidle_driver *drv, > + const struct cpumask *const coupled_cpus) > +{ > + int ret, cpu; > + struct cpuidle_device *device; > + > + ret = cpuidle_register_driver(drv); > + if (ret) { > + pr_err("failed to register cpuidle driver\n"); > + return ret; > + } > + > + for_each_possible_cpu(cpu) { > + device = &per_cpu(cpuidle_dev, cpu); > + device->cpu = cpu; > + > +#ifdef CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED > + /* > + * On multiplatform for ARM, the coupled idle states could > + * enabled in the kernel even if the cpuidle driver does not > + * use it. Note, coupled_cpus is a struct copy. > + */ > + if (coupled_cpus) > + device->coupled_cpus = *coupled_cpus; > +#endif Can we get rid off the need of wrapping the code under CONFIG marco in middle of function ? > + ret = cpuidle_register_device(device); > + if (!ret) > + continue; > + > + pr_err("Failed to register cpuidle device for cpu%d\n", cpu); > + > + cpuidle_unregister(drv); > + break; > + } > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(cpuidle_register); > + > #ifdef CONFIG_SMP > > static void smp_callback(void *v) > diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h > index 79e3811..3c86faa 100644 > --- a/include/linux/cpuidle.h > +++ b/include/linux/cpuidle.h > @@ -123,7 +123,9 @@ extern void cpuidle_driver_unref(void); > extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); > extern int cpuidle_register_device(struct cpuidle_device *dev); > extern void cpuidle_unregister_device(struct cpuidle_device *dev); > - > +extern int cpuidle_register(struct cpuidle_driver *drv, > + const struct cpumask *const coupled_cpus); > +extern void cpuidle_unregister(struct cpuidle_driver *drv); > extern void cpuidle_pause_and_lock(void); > extern void cpuidle_resume_and_unlock(void); > extern void cpuidle_pause(void); > @@ -148,7 +150,10 @@ static inline void cpuidle_unregister_driver(struct cpuidle_driver *drv) { } > static inline int cpuidle_register_device(struct cpuidle_device *dev) > {return -ENODEV; } > static inline void cpuidle_unregister_device(struct cpuidle_device *dev) { } > - > +static inline int cpuidle_register(struct cpuidle_driver *drv, > + const struct cpumask *const coupled_cpus) > +{return -ENODEV; } > +static inline void cpuidle_unregister(struct cpuidle_driver *drv) { } > static inline void cpuidle_pause_and_lock(void) { } > static inline void cpuidle_resume_and_unlock(void) { } > static inline void cpuidle_pause(void) { } > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/18/2013 10:48 AM, Santosh Shilimkar wrote: > On Friday 12 April 2013 06:05 PM, Daniel Lezcano wrote: >> The usual scheme to initialize a cpuidle driver on a SMP is: >> >> cpuidle_register_driver(drv); >> for_each_possible_cpu(cpu) { >> device = &per_cpu(cpuidle_dev, cpu); >> cpuidle_register_device(device); >> } >> > Not exactly related to $subject patch but the driver should > be registered after all devices has been registered to avoid > devices start using the idle state data as soon as it is > registered. In multi CPU system, this race can easily happen. Could you elaborate what problems the system will be facing if a cpu starts using the idle state data as soon as it is registered ? Is there a bug related to this ? > Current CPUIDLE core layer is also written with the assumption > that driver will be registered first and then the devices which > is not mandatory as per typical drive/device model. Yes, that's true. The framework assumes cpuidle_register_driver is called before cpuidle_register_device. > May be you can fix that part while you are creating this common > wrapper. Personally, as that will modify the cpuidle core layer and the changes are not obvious (because of the design of the code) I prefer to do that in a separate patchset if it is worth to do it - if there is a bug related to it, then there is no discussion, we have to do it :) [ ... ]
On Tuesday 23 April 2013 07:13 PM, Daniel Lezcano wrote: > On 04/18/2013 10:48 AM, Santosh Shilimkar wrote: >> On Friday 12 April 2013 06:05 PM, Daniel Lezcano wrote: >>> The usual scheme to initialize a cpuidle driver on a SMP is: >>> >>> cpuidle_register_driver(drv); >>> for_each_possible_cpu(cpu) { >>> device = &per_cpu(cpuidle_dev, cpu); >>> cpuidle_register_device(device); >>> } >>> >> Not exactly related to $subject patch but the driver should >> be registered after all devices has been registered to avoid >> devices start using the idle state data as soon as it is >> registered. In multi CPU system, this race can easily happen. > > Could you elaborate what problems the system will be facing if a cpu > starts using the idle state data as soon as it is registered ? > > Is there a bug related to this ? > Ofcouse. In multi-CPU scenario, where CPU C-states needs co-ordination can just lead into unknown issues if all the CPUs are not already part registered. >> Current CPUIDLE core layer is also written with the assumption >> that driver will be registered first and then the devices which >> is not mandatory as per typical drive/device model. > > Yes, that's true. The framework assumes cpuidle_register_driver is > called before cpuidle_register_device. > >> May be you can fix that part while you are creating this common >> wrapper. > > Personally, as that will modify the cpuidle core layer and the changes > are not obvious (because of the design of the code) I prefer to do that > in a separate patchset if it is worth to do it - if there is a bug > related to it, then there is no discussion, we have to do it :) > Sure. It would have been nice if you would have clarified that before posting the next version. You still need to fix the kernel doc in your v4 though. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/23/2013 03:56 PM, Santosh Shilimkar wrote: > On Tuesday 23 April 2013 07:13 PM, Daniel Lezcano wrote: >> On 04/18/2013 10:48 AM, Santosh Shilimkar wrote: >>> On Friday 12 April 2013 06:05 PM, Daniel Lezcano wrote: >>>> The usual scheme to initialize a cpuidle driver on a SMP is: >>>> >>>> cpuidle_register_driver(drv); >>>> for_each_possible_cpu(cpu) { >>>> device = &per_cpu(cpuidle_dev, cpu); >>>> cpuidle_register_device(device); >>>> } >>>> >>> Not exactly related to $subject patch but the driver should >>> be registered after all devices has been registered to avoid >>> devices start using the idle state data as soon as it is >>> registered. In multi CPU system, this race can easily happen. >> >> Could you elaborate what problems the system will be facing if a cpu >> starts using the idle state data as soon as it is registered ? >> >> Is there a bug related to this ? >> > Ofcouse. In multi-CPU scenario, where CPU C-states needs co-ordination > can just lead into unknown issues if all the CPUs are not already part > registered. Hmm, ok. I don't see a scenario, with the current code, where that could occurs. The coupled idle state will wait for the other cpus to enter idle before initiating a shutdown sequence and, so far, the other sync algorithm (last man standing) are doing the same. There are some systems with 1024 cpus, and I did not heard problems like this. Do you know a system where this problem occurred ? Or is it something you suspect that can happen ? That would be interesting to have a system where this race occurs in order to check the modifications will solve the issue. >>> Current CPUIDLE core layer is also written with the assumption >>> that driver will be registered first and then the devices which >>> is not mandatory as per typical drive/device model. >> >> Yes, that's true. The framework assumes cpuidle_register_driver is >> called before cpuidle_register_device. >> >>> May be you can fix that part while you are creating this common >>> wrapper. >> >> Personally, as that will modify the cpuidle core layer and the changes >> are not obvious (because of the design of the code) I prefer to do that >> in a separate patchset if it is worth to do it - if there is a bug >> related to it, then there is no discussion, we have to do it :) >> > Sure. It would have been nice if you would have clarified that before > posting the next version. > > You still need to fix the kernel doc in your v4 though. Which one ? "s/accross/across" ? Thanks -- Daniel
On Tuesday 23 April 2013 07:52 PM, Daniel Lezcano wrote: > On 04/23/2013 03:56 PM, Santosh Shilimkar wrote: >> On Tuesday 23 April 2013 07:13 PM, Daniel Lezcano wrote: >>> On 04/18/2013 10:48 AM, Santosh Shilimkar wrote: >>>> On Friday 12 April 2013 06:05 PM, Daniel Lezcano wrote: >>>>> The usual scheme to initialize a cpuidle driver on a SMP is: >>>>> >>>>> cpuidle_register_driver(drv); >>>>> for_each_possible_cpu(cpu) { >>>>> device = &per_cpu(cpuidle_dev, cpu); >>>>> cpuidle_register_device(device); >>>>> } >>>>> >>>> Not exactly related to $subject patch but the driver should >>>> be registered after all devices has been registered to avoid >>>> devices start using the idle state data as soon as it is >>>> registered. In multi CPU system, this race can easily happen. >>> >>> Could you elaborate what problems the system will be facing if a cpu >>> starts using the idle state data as soon as it is registered ? >>> >>> Is there a bug related to this ? >>> >> Ofcouse. In multi-CPU scenario, where CPU C-states needs co-ordination >> can just lead into unknown issues if all the CPUs are not already part >> registered. > > Hmm, ok. I don't see a scenario, with the current code, where that could > occurs. The coupled idle state will wait for the other cpus to enter > idle before initiating a shutdown sequence and, so far, the other sync > algorithm (last man standing) are doing the same. > Its no just couple idle state usages. CPUs do share power domains, clock domains, clocks etc. One CPU going ahead and tampering/progarmming the low power states till the next one isn't registered yet can lead to issues. > There are some systems with 1024 cpus, and I did not heard problems like > this. > That is because todays CPUIDLe core code doesn't let that happen. Once you fix the ordering issue, there is window where the issue could happen. > Do you know a system where this problem occurred ? Or is it something > you suspect that can happen ? > See above. Its more prone to issues true for systems with higher number of CPUs. Not sure if that was the reason the core code, doesn't proceed without all the devices are registered ? > That would be interesting to have a system where this race occurs in > order to check the modifications will solve the issue. > I haven't see the issue myself but logically it could easily happen once the core code is fixed. >>>> Current CPUIDLE core layer is also written with the assumption >>>> that driver will be registered first and then the devices which >>>> is not mandatory as per typical drive/device model. >>> >>> Yes, that's true. The framework assumes cpuidle_register_driver is >>> called before cpuidle_register_device. >>> >>>> May be you can fix that part while you are creating this common >>>> wrapper. >>> >>> Personally, as that will modify the cpuidle core layer and the changes >>> are not obvious (because of the design of the code) I prefer to do that >>> in a separate patchset if it is worth to do it - if there is a bug >>> related to it, then there is no discussion, we have to do it :) >>> >> Sure. It would have been nice if you would have clarified that before >> posting the next version. >> >> You still need to fix the kernel doc in your v4 though. > > Which one ? "s/accross/across" ? > s/*/** below hunk in v4 +/* + * cpuidle_unregister: unregister a driver and the devices. This function + * can be used only if the driver has been previously registered through + * the cpuidle_register function. + * + * @drv: a valid pointer to a struct cpuidle_driver + */ +void cpuidle_unregister(struct cpuidle_driver *drv) +{ + int cpu; + struct cpuidle_device *device; -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/23/2013 05:07 PM, Santosh Shilimkar wrote: > On Tuesday 23 April 2013 07:52 PM, Daniel Lezcano wrote: >> On 04/23/2013 03:56 PM, Santosh Shilimkar wrote: >>> On Tuesday 23 April 2013 07:13 PM, Daniel Lezcano wrote: >>>> On 04/18/2013 10:48 AM, Santosh Shilimkar wrote: >>>>> On Friday 12 April 2013 06:05 PM, Daniel Lezcano wrote: >>>>>> The usual scheme to initialize a cpuidle driver on a SMP is: >>>>>> >>>>>> cpuidle_register_driver(drv); >>>>>> for_each_possible_cpu(cpu) { >>>>>> device = &per_cpu(cpuidle_dev, cpu); >>>>>> cpuidle_register_device(device); >>>>>> } >>>>>> >>>>> Not exactly related to $subject patch but the driver should >>>>> be registered after all devices has been registered to avoid >>>>> devices start using the idle state data as soon as it is >>>>> registered. In multi CPU system, this race can easily happen. >>>> >>>> Could you elaborate what problems the system will be facing if a cpu >>>> starts using the idle state data as soon as it is registered ? >>>> >>>> Is there a bug related to this ? >>>> >>> Ofcouse. In multi-CPU scenario, where CPU C-states needs co-ordination >>> can just lead into unknown issues if all the CPUs are not already part >>> registered. >> >> Hmm, ok. I don't see a scenario, with the current code, where that could >> occurs. The coupled idle state will wait for the other cpus to enter >> idle before initiating a shutdown sequence and, so far, the other sync >> algorithm (last man standing) are doing the same. >> > Its no just couple idle state usages. CPUs do share power domains, clock > domains, clocks etc. One CPU going ahead and tampering/progarmming > the low power states till the next one isn't registered yet > can lead to issues. > >> There are some systems with 1024 cpus, and I did not heard problems like >> this. >> > That is because todays CPUIDLe core code doesn't let that happen. Once > you fix the ordering issue, there is window where the issue could happen. > >> Do you know a system where this problem occurred ? Or is it something >> you suspect that can happen ? >> > See above. Its more prone to issues true for systems with higher > number of CPUs. Not sure if that was the reason the core code, doesn't > proceed without all the devices are registered ? > >> That would be interesting to have a system where this race occurs in >> order to check the modifications will solve the issue. >> > I haven't see the issue myself but logically it could easily happen > once the core code is fixed. > >>>>> Current CPUIDLE core layer is also written with the assumption >>>>> that driver will be registered first and then the devices which >>>>> is not mandatory as per typical drive/device model. >>>> >>>> Yes, that's true. The framework assumes cpuidle_register_driver is >>>> called before cpuidle_register_device. >>>> >>>>> May be you can fix that part while you are creating this common >>>>> wrapper. >>>> >>>> Personally, as that will modify the cpuidle core layer and the changes >>>> are not obvious (because of the design of the code) I prefer to do that >>>> in a separate patchset if it is worth to do it - if there is a bug >>>> related to it, then there is no discussion, we have to do it :) >>>> >>> Sure. It would have been nice if you would have clarified that before >>> posting the next version. >>> >>> You still need to fix the kernel doc in your v4 though. >> >> Which one ? "s/accross/across" ? >> > s/*/** below hunk in v4 > > +/* > + * cpuidle_unregister: unregister a driver and the devices. This function > + * can be used only if the driver has been previously registered through > + * the cpuidle_register function. > + * > + * @drv: a valid pointer to a struct cpuidle_driver > + */ > +void cpuidle_unregister(struct cpuidle_driver *drv) > +{ > + int cpu; > + struct cpuidle_device *device; > Ah, ok. I thought you were referring to something in 'Documentation'. I will fix this nit. Thanks ! -- Daniel
diff --git a/Documentation/cpuidle/driver.txt b/Documentation/cpuidle/driver.txt index 7a9e09e..1b0d81d 100644 --- a/Documentation/cpuidle/driver.txt +++ b/Documentation/cpuidle/driver.txt @@ -15,11 +15,17 @@ has mechanisms in place to support actual entry-exit into CPU idle states. cpuidle driver initializes the cpuidle_device structure for each CPU device and registers with cpuidle using cpuidle_register_device. +If all the idle states are the same, the wrapper function cpuidle_register +could be used instead. + It can also support the dynamic changes (like battery <-> AC), by using cpuidle_pause_and_lock, cpuidle_disable_device and cpuidle_enable_device, cpuidle_resume_and_unlock. Interfaces: +extern int cpuidle_register(struct cpuidle_driver *drv, + const struct cpumask *const coupled_cpus); +extern int cpuidle_unregister(struct cpuidle_driver *drv); extern int cpuidle_register_driver(struct cpuidle_driver *drv); extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); extern int cpuidle_register_device(struct cpuidle_device *dev); diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 0da795b..49e8d30 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -24,6 +24,7 @@ #include "cpuidle.h" DEFINE_PER_CPU(struct cpuidle_device *, cpuidle_devices); +DEFINE_PER_CPU(struct cpuidle_device, cpuidle_dev); DEFINE_MUTEX(cpuidle_lock); LIST_HEAD(cpuidle_detected_devices); @@ -453,6 +454,77 @@ void cpuidle_unregister_device(struct cpuidle_device *dev) EXPORT_SYMBOL_GPL(cpuidle_unregister_device); +/* + * cpuidle_unregister: unregister a driver and the devices. This function + * can be used only if the driver has been previously registered through + * the cpuidle_register function. + * + * @drv: a valid pointer to a struct cpuidle_driver + */ +void cpuidle_unregister(struct cpuidle_driver *drv) +{ + int cpu; + struct cpuidle_device *device; + + for_each_possible_cpu(cpu) { + device = &per_cpu(cpuidle_dev, cpu); + cpuidle_unregister_device(device); + } + + cpuidle_unregister_driver(drv); +} +EXPORT_SYMBOL_GPL(cpuidle_unregister); + +/** + * cpuidle_register: registers the driver and the cpu devices with the + * coupled_cpus passed as parameter. This function is used for all common + * initialization pattern there are in the arch specific drivers. The + * devices is globally defined in this file. + * + * @drv : a valid pointer to a struct cpuidle_driver + * @coupled_cpus: a cpumask for the coupled states + * + * Returns 0 on success, < 0 otherwise + */ +int cpuidle_register(struct cpuidle_driver *drv, + const struct cpumask *const coupled_cpus) +{ + int ret, cpu; + struct cpuidle_device *device; + + ret = cpuidle_register_driver(drv); + if (ret) { + pr_err("failed to register cpuidle driver\n"); + return ret; + } + + for_each_possible_cpu(cpu) { + device = &per_cpu(cpuidle_dev, cpu); + device->cpu = cpu; + +#ifdef CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED + /* + * On multiplatform for ARM, the coupled idle states could + * enabled in the kernel even if the cpuidle driver does not + * use it. Note, coupled_cpus is a struct copy. + */ + if (coupled_cpus) + device->coupled_cpus = *coupled_cpus; +#endif + ret = cpuidle_register_device(device); + if (!ret) + continue; + + pr_err("Failed to register cpuidle device for cpu%d\n", cpu); + + cpuidle_unregister(drv); + break; + } + + return ret; +} +EXPORT_SYMBOL_GPL(cpuidle_register); + #ifdef CONFIG_SMP static void smp_callback(void *v) diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 79e3811..3c86faa 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -123,7 +123,9 @@ extern void cpuidle_driver_unref(void); extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); extern int cpuidle_register_device(struct cpuidle_device *dev); extern void cpuidle_unregister_device(struct cpuidle_device *dev); - +extern int cpuidle_register(struct cpuidle_driver *drv, + const struct cpumask *const coupled_cpus); +extern void cpuidle_unregister(struct cpuidle_driver *drv); extern void cpuidle_pause_and_lock(void); extern void cpuidle_resume_and_unlock(void); extern void cpuidle_pause(void); @@ -148,7 +150,10 @@ static inline void cpuidle_unregister_driver(struct cpuidle_driver *drv) { } static inline int cpuidle_register_device(struct cpuidle_device *dev) {return -ENODEV; } static inline void cpuidle_unregister_device(struct cpuidle_device *dev) { } - +static inline int cpuidle_register(struct cpuidle_driver *drv, + const struct cpumask *const coupled_cpus) +{return -ENODEV; } +static inline void cpuidle_unregister(struct cpuidle_driver *drv) { } static inline void cpuidle_pause_and_lock(void) { } static inline void cpuidle_resume_and_unlock(void) { } static inline void cpuidle_pause(void) { }
The usual scheme to initialize a cpuidle driver on a SMP is: cpuidle_register_driver(drv); for_each_possible_cpu(cpu) { device = &per_cpu(cpuidle_dev, cpu); cpuidle_register_device(device); } This code is duplicated in each cpuidle driver. On UP systems, it is done this way: cpuidle_register_driver(drv); device = &per_cpu(cpuidle_dev, cpu); cpuidle_register_device(device); On UP, the macro 'for_each_cpu' does one iteration: #define for_each_cpu(cpu, mask) \ for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask) Hence, the initialization loop is the same for UP than SMP. Beside, we saw different bugs / mis-initialization / return code unchecked in the different drivers, the code is duplicated including bugs. After fixing all these ones, it appears the initialization pattern is the same for everyone. Please note, some drivers are doing dev->state_count = drv->state_count. This is not necessary because it is done by the cpuidle_enable_device function in the cpuidle framework. This is true, until you have the same states for all your devices. Otherwise, the 'low level' API should be used instead with the specific initialization for the driver. Let's add a wrapper function doing this initialization with a cpumask parameter for the coupled idle states and use it for all the drivers. That will save a lot of LOC, consolidate the code, and the modifications in the future could be done in a single place. Another benefit is the consolidation of the cpuidle_device variable which is now in the cpuidle framework and no longer spread accross the different arch specific drivers. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> --- Documentation/cpuidle/driver.txt | 6 ++++ drivers/cpuidle/cpuidle.c | 72 ++++++++++++++++++++++++++++++++++++++ include/linux/cpuidle.h | 9 +++-- 3 files changed, 85 insertions(+), 2 deletions(-)