Message ID | 002f01d0186b$2700b730$75022590$%brar@samsung.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Mon, Dec 15, 2014 at 06:58:41PM +0530, Yadwinder Singh Brar wrote: > Unfortunately, I didn’t get any such warning though I tested > patch enabling CONFIG_PROVE_LOCKING before posting. It seems > Russell is trying to load imx_thermal as module and parallely > Changing cpufreq governor from userspace, which was not my > test case. No. Yes, imx_thermal is a module, which is loaded by udev very early in the userspace boot. Later on, in the rc.local, I poke the governor from userspace. This is evidenced by the bluetooth modules being loaded after imx_thermal: Module Size Used by bnep 10574 2 rfcomm 33720 0 bluetooth 293598 10 bnep,rfcomm nfsd 90264 0 exportfs 3988 1 nfsd hid_cypress 1763 0 snd_soc_fsl_spdif 9753 2 imx_pcm_dma 1137 1 snd_soc_fsl_spdif imx_sdma 12885 2 imx2_wdt 3248 0 imx_thermal 5386 0 snd_soc_imx_spdif 1877 0 and the timestamp on the bluetooth message: [ 25.503739] Bluetooth: Core ver 2.19 vs the timestamp on the lockdep message: [ 29.499195] [ INFO: possible circular locking dependency detected ] My rc.local does this: # Configure cpufreq echo 1 > /sys/devices/system/cpu/cpufreq/ondemand/io_is_busy echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor > Anyways, after analyzing log and code,I think problem is not > in cpufreq_thermal_notifier which was modified in patch as > stated above. Actual problem is in __cpufreq_cooling_register > which is unnecessarily calling cpufreq_register_notifier() > inside section protected by cooling_cpufreq_lock. > Because cpufreq_policy_notifier_list).rwsem is already held > by store_scaling_governor when __cpufreq_cooling_register is > trying to cpufreq_policy_notifier_list while holding cooling_cpufreq_lock. > > So I think following can fix the problem: > > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c > index ad09e51..622ea40 100644 > --- a/drivers/thermal/cpu_cooling.c > +++ b/drivers/thermal/cpu_cooling.c > @@ -484,15 +484,15 @@ __cpufreq_cooling_register(struct device_node *np, > cpufreq_dev->cpufreq_state = 0; > mutex_lock(&cooling_cpufreq_lock); > > - /* Register the notifier for first cpufreq cooling device */ > - if (cpufreq_dev_count == 0) > - cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > - CPUFREQ_POLICY_NOTIFIER); > cpufreq_dev_count++; > list_add(&cpufreq_dev->node, &cpufreq_dev_list); > > mutex_unlock(&cooling_cpufreq_lock); > > + /* Register the notifier for first cpufreq cooling device */ > + if (cpufreq_dev_count == 0) > + cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > + CPUFREQ_POLICY_NOTIFIER); No, sorry, this patch is worse. 1. cpufreq_register_notifier() will never be called, even on the first caller to this code: at the point where it's tested for zero, it will be incremented to one by the previous code. 2. What happens if two threads come here? The first thread succeeds in grabbing cooling_cpufreq_lock. The second thread stalls waiting for cooling_cpufreq_lock to be released. The first thread increments cpufreq_dev_count, adds to the list, and then releases the lock. At this point, control may be passed to the second thread (since mutex_unlock() will wake it up.) The second thread gets into the critical region, and increments cpufreq_dev_count again. By the time the first thread runs, cpufreq_dev_count may be two at this point. Unfortunately, you do need some kind of synchronisation here. If it's not important when cpufreq_register_notifier() gets called, maybe this can work: bool register; mutex_lock(&cooling_cpufreq_lock); register = cpufreq_dev_count++ == 0; list_add(&cpufreq_dev->node, &cpufreq_dev_list); mutex_unlock(&cooling_cpufreq_lock); if (register) cpufreq_register_notifier(&thermal_cpufreq_notifier_block, CPUFREQ_POLICY_NOTIFIER); However, I suspect that may well be buggy if we have a cleanup path which unregisters the notifier when cpufreq_dev_count is decremented to zero... which we seem to have in cpufreq_cooling_unregister(). The answer may well be to have finer grained locking here: - one lock to protect cpufreq_dev_list, which is only ever taken when adding or removing entries from it - a second lock to protect cpufreq_dev_count and the calls to cpufreq_register_notifier() and cpufreq_unregister_notifier() and you would /never/ take either of those two locks inside each other. In other words: mutex_lock(&cooling_list_lock); list_add(&cpufreq_dev->node, &cpufreq_dev_list); mutex_unlock(&cooling_list_lock); mutex_lock(&cooling_cpufreq_lock); if (cpufreq_dev_count++ == 0) cpufreq_register_notifier(&thermal_cpufreq_notifier_block, CPUFREQ_POLICY_NOTIFIER); mutex_unlock(&cooling_cpufreq_lock); and similar in the cleanup path. The notifier itself would only ever take the cooling_list_lock.
On 15 December 2014 at 18:58, Yadwinder Singh Brar <yadi.brar@samsung.com> wrote: > Unfortunately, I didn’t get any such warning though I tested > patch enabling CONFIG_PROVE_LOCKING before posting. It seems Even I couldn't reproduce it :) > Anyways, after analyzing log and code,I think problem is not > in cpufreq_thermal_notifier which was modified in patch as > stated above. Actual problem is in __cpufreq_cooling_register > which is unnecessarily calling cpufreq_register_notifier() > inside section protected by cooling_cpufreq_lock. Agreed, and I was looking to reproduce first before writing some crazy code. > Because cpufreq_policy_notifier_list).rwsem is already held > by store_scaling_governor when __cpufreq_cooling_register is > trying to cpufreq_policy_notifier_list while holding cooling_cpufreq_lock. Correct. > So I think following can fix the problem: > > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c > index ad09e51..622ea40 100644 > --- a/drivers/thermal/cpu_cooling.c > +++ b/drivers/thermal/cpu_cooling.c > @@ -484,15 +484,15 @@ __cpufreq_cooling_register(struct device_node *np, > cpufreq_dev->cpufreq_state = 0; > mutex_lock(&cooling_cpufreq_lock); > > - /* Register the notifier for first cpufreq cooling device */ > - if (cpufreq_dev_count == 0) > - cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > - CPUFREQ_POLICY_NOTIFIER); > cpufreq_dev_count++; > list_add(&cpufreq_dev->node, &cpufreq_dev_list); > > mutex_unlock(&cooling_cpufreq_lock); > > + /* Register the notifier for first cpufreq cooling device */ > + if (cpufreq_dev_count == 0) This will always fail :), over that we need to access cpufreq_dev_count from within the lock. > + cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > + CPUFREQ_POLICY_NOTIFIER); > return cool_dev; > } -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk] > Sent: Monday, December 15, 2014 7:17 PM > To: Yadwinder Singh Brar > Cc: 'Viresh Kumar'; 'Rafael J. Wysocki'; linux-arm- > kernel@lists.infradead.org; linux-pm@vger.kernel.org; 'Eduardo > Valentin' > Subject: Re: 3.18: lockdep problems in cpufreq > > On Mon, Dec 15, 2014 at 06:58:41PM +0530, Yadwinder Singh Brar wrote: > > Unfortunately, I didn’t get any such warning though I tested patch > > enabling CONFIG_PROVE_LOCKING before posting. It seems Russell is > > trying to load imx_thermal as module and parallely Changing cpufreq > > governor from userspace, which was not my test case. > > No. Yes, imx_thermal is a module, which is loaded by udev very early > in the userspace boot. Later on, in the rc.local, I poke the governor > from userspace. > > This is evidenced by the bluetooth modules being loaded after > imx_thermal: > > Module Size Used by > bnep 10574 2 > rfcomm 33720 0 > bluetooth 293598 10 bnep,rfcomm > nfsd 90264 0 > exportfs 3988 1 nfsd > hid_cypress 1763 0 > snd_soc_fsl_spdif 9753 2 > imx_pcm_dma 1137 1 snd_soc_fsl_spdif > imx_sdma 12885 2 > imx2_wdt 3248 0 > imx_thermal 5386 0 > snd_soc_imx_spdif 1877 0 > > and the timestamp on the bluetooth message: > > [ 25.503739] Bluetooth: Core ver 2.19 > > vs the timestamp on the lockdep message: > > [ 29.499195] [ INFO: possible circular locking dependency detected ] > > My rc.local does this: > > # Configure cpufreq > echo 1 > /sys/devices/system/cpu/cpufreq/ondemand/io_is_busy > echo performance > > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor > > > Anyways, after analyzing log and code,I think problem is not in > > cpufreq_thermal_notifier which was modified in patch as stated above. > > Actual problem is in __cpufreq_cooling_register which is > unnecessarily > > calling cpufreq_register_notifier() inside section protected by > > cooling_cpufreq_lock. > > Because cpufreq_policy_notifier_list).rwsem is already held by > > store_scaling_governor when __cpufreq_cooling_register is trying to > > cpufreq_policy_notifier_list while holding cooling_cpufreq_lock. > > > > So I think following can fix the problem: > > > > diff --git a/drivers/thermal/cpu_cooling.c > > b/drivers/thermal/cpu_cooling.c index ad09e51..622ea40 100644 > > --- a/drivers/thermal/cpu_cooling.c > > +++ b/drivers/thermal/cpu_cooling.c > > @@ -484,15 +484,15 @@ __cpufreq_cooling_register(struct device_node > *np, > > cpufreq_dev->cpufreq_state = 0; > > mutex_lock(&cooling_cpufreq_lock); > > > > - /* Register the notifier for first cpufreq cooling device */ > > - if (cpufreq_dev_count == 0) > > - > cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > > - CPUFREQ_POLICY_NOTIFIER); > > cpufreq_dev_count++; > > list_add(&cpufreq_dev->node, &cpufreq_dev_list); > > > > mutex_unlock(&cooling_cpufreq_lock); > > > > + /* Register the notifier for first cpufreq cooling device */ > > + if (cpufreq_dev_count == 0) > > + > cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > > + CPUFREQ_POLICY_NOTIFIER); > > No, sorry, this patch is worse. > Actually it was not patch :P , just moved cpufreq_register_notifier() out of locking, as it is(C&P) to explain my point. > 1. cpufreq_register_notifier() will never be called, even on the first > caller to this code: at the point where it's tested for zero, it > will > be incremented to one by the previous code. > > 2. What happens if two threads come here? > > The first thread succeeds in grabbing cooling_cpufreq_lock. The > second > thread stalls waiting for cooling_cpufreq_lock to be released. > > The first thread increments cpufreq_dev_count, adds to the list, and > then > releases the lock. At this point, control may be passed to the > second > thread (since mutex_unlock() will wake it up.) The second thread > gets > into the critical region, and increments cpufreq_dev_count again. > > By the time the first thread runs, cpufreq_dev_count may be two at > this > point. Yes, may be. > > Unfortunately, you do need some kind of synchronisation here. If it's > not important when cpufreq_register_notifier() gets called, maybe this > can work: > > bool register; > > mutex_lock(&cooling_cpufreq_lock); > register = cpufreq_dev_count++ == 0; > list_add(&cpufreq_dev->node, &cpufreq_dev_list); > mutex_unlock(&cooling_cpufreq_lock); > > if (register) register may be 0 in scenario you stated above in second point. So this approach will not work. > cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > CPUFREQ_POLICY_NOTIFIER); > > However, I suspect that may well be buggy if we have a cleanup path > which unregisters the notifier when cpufreq_dev_count is decremented to > zero... > which we seem to have in cpufreq_cooling_unregister(). > > The answer may well be to have finer grained locking here: > > - one lock to protect cpufreq_dev_list, which is only ever taken when > adding or removing entries from it > > - a second lock to protect cpufreq_dev_count and the calls to > cpufreq_register_notifier() and cpufreq_unregister_notifier() > > and you would /never/ take either of those two locks inside each other. > In other words: > > mutex_lock(&cooling_list_lock); > list_add(&cpufreq_dev->node, &cpufreq_dev_list); > mutex_unlock(&cooling_list_lock); > > mutex_lock(&cooling_cpufreq_lock); > if (cpufreq_dev_count++ == 0) > cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > CPUFREQ_POLICY_NOTIFIER); > mutex_unlock(&cooling_cpufreq_lock); > > and similar in the cleanup path. The notifier itself would only ever > take the cooling_list_lock. > I agree with this approach, if its fine for others also, I can implement and post patch. Thanks, Yadwinder > -- > FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up > according to speedtest.net. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Dec 15, 2014 at 08:24:05PM +0530, Yadwinder Singh Brar wrote: > > The answer may well be to have finer grained locking here: > > > > - one lock to protect cpufreq_dev_list, which is only ever taken when > > adding or removing entries from it > > > > - a second lock to protect cpufreq_dev_count and the calls to > > cpufreq_register_notifier() and cpufreq_unregister_notifier() > > > > and you would /never/ take either of those two locks inside each other. > > In other words: > > > > mutex_lock(&cooling_list_lock); > > list_add(&cpufreq_dev->node, &cpufreq_dev_list); > > mutex_unlock(&cooling_list_lock); > > > > mutex_lock(&cooling_cpufreq_lock); > > if (cpufreq_dev_count++ == 0) > > cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > > CPUFREQ_POLICY_NOTIFIER); > > mutex_unlock(&cooling_cpufreq_lock); > > > > and similar in the cleanup path. The notifier itself would only ever > > take the cooling_list_lock. > > > > I agree with this approach, if its fine for others also, I can implement > and post patch. Yes, please do.
On Monday, December 15, 2014 05:43:36 PM Russell King - ARM Linux wrote: > On Mon, Dec 15, 2014 at 08:24:05PM +0530, Yadwinder Singh Brar wrote: > > > The answer may well be to have finer grained locking here: > > > > > > - one lock to protect cpufreq_dev_list, which is only ever taken when > > > adding or removing entries from it > > > > > > - a second lock to protect cpufreq_dev_count and the calls to > > > cpufreq_register_notifier() and cpufreq_unregister_notifier() > > > > > > and you would /never/ take either of those two locks inside each other. > > > In other words: > > > > > > mutex_lock(&cooling_list_lock); > > > list_add(&cpufreq_dev->node, &cpufreq_dev_list); > > > mutex_unlock(&cooling_list_lock); > > > > > > mutex_lock(&cooling_cpufreq_lock); > > > if (cpufreq_dev_count++ == 0) > > > cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > > > CPUFREQ_POLICY_NOTIFIER); > > > mutex_unlock(&cooling_cpufreq_lock); > > > > > > and similar in the cleanup path. The notifier itself would only ever > > > take the cooling_list_lock. > > > > > > > I agree with this approach, if its fine for others also, I can implement > > and post patch. > > Yes, please do. Indeed. We have a regression here to fix. Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 15 December 2014 at 20:24, Yadwinder Singh Brar <yadi.brar@samsung.com> wrote: >> Unfortunately, you do need some kind of synchronisation here. If it's >> not important when cpufreq_register_notifier() gets called, maybe this >> can work: >> >> bool register; >> >> mutex_lock(&cooling_cpufreq_lock); >> register = cpufreq_dev_count++ == 0; >> list_add(&cpufreq_dev->node, &cpufreq_dev_list); >> mutex_unlock(&cooling_cpufreq_lock); >> >> if (register) > > register may be 0 in scenario you stated above in second point. > So this approach will not work. I didn't understood what you meant here. register will be zero only for one of the threads. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index ad09e51..622ea40 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -484,15 +484,15 @@ __cpufreq_cooling_register(struct device_node *np, cpufreq_dev->cpufreq_state = 0; mutex_lock(&cooling_cpufreq_lock); - /* Register the notifier for first cpufreq cooling device */ - if (cpufreq_dev_count == 0) - cpufreq_register_notifier(&thermal_cpufreq_notifier_block, - CPUFREQ_POLICY_NOTIFIER); cpufreq_dev_count++; list_add(&cpufreq_dev->node, &cpufreq_dev_list); mutex_unlock(&cooling_cpufreq_lock); + /* Register the notifier for first cpufreq cooling device */ + if (cpufreq_dev_count == 0) + cpufreq_register_notifier(&thermal_cpufreq_notifier_block, + CPUFREQ_POLICY_NOTIFIER); return cool_dev; }