Message ID | 20170412201042.785920903@linutronix.de (mailing list archive)
---|---
State | Not Applicable, archived
On Wed, Apr 12, 2017 at 10:07:34PM +0200, Thomas Gleixner wrote:
> acpi_processor_get_throttling() requires invoking the getter function on
> the target CPU. This is achieved by temporarily setting the affinity of
> the calling user space thread to the requested CPU and resetting it to
> the original affinity afterwards.
>
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread, resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
>
> acpi_processor_get_throttling() is invoked in two ways:
>
> 1) The CPU online callback, which is already running on the target CPU
>    and is obviously protected against hotplug and not affected by
>    affinity settings.
>
> 2) The ACPI driver probe function, which is not protected against
>    hotplug during modprobe.
>
> Switch it over to work_on_cpu() and protect the probe function against
> CPU hotplug.
>
> +static int acpi_processor_get_throttling(struct acpi_processor *pr)
> +{
> 	if (!pr)
> 		return -EINVAL;
>
> 	if (!pr->flags.throttling)
> 		return -ENODEV;
>
> +	 * This is either called from the CPU hotplug callback of
> +	 * processor_driver or via the ACPI probe function. In the latter
> +	 * case the CPU is not guaranteed to be online. Both call sites are
> +	 * protected against CPU hotplug.
> 	 */
> +	if (!cpu_online(pr->id))
> 		return -ENODEV;
>
> +	return work_on_cpu(pr->id, __acpi_processor_get_throttling, pr);
> }

That makes my machine sad...

[    9.583030] =============================================
[    9.589053] [ INFO: possible recursive locking detected ]
[    9.595079] 4.11.0-rc6-00385-g5aee78a-dirty #678 Not tainted
[    9.601393] ---------------------------------------------
[    9.607418] kworker/0:0/3 is trying to acquire lock:
[    9.612954]  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110c172>] flush_work+0x12/0x2a0
[    9.621406]
[    9.621406] but task is already holding lock:
[    9.627915]  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.637044]
[    9.637044] other info that might help us debug this:
[    9.644330]  Possible unsafe locking scenario:
[    9.644330]
[    9.650934]        CPU0
[    9.653660]        ----
[    9.656386]   lock((&wfc.work));
[    9.659987]   lock((&wfc.work));
[    9.663586]
[    9.663586]  *** DEADLOCK ***
[    9.663586]
[    9.670189]  May be due to missing lock nesting notation
[    9.670189]
[    9.677765] 2 locks held by kworker/0:0/3:
[    9.682332]  #0:  ("events"){.+.+.+}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.691654]  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.701267]
[    9.701267] stack backtrace:
[    9.706127] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.11.0-rc6-00385-g5aee78a-dirty #678
[    9.715545] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[    9.726999] Workqueue: events work_for_cpu_fn
[    9.731860] Call Trace:
[    9.734591]  dump_stack+0x86/0xcf
[    9.738290]  __lock_acquire+0x790/0x1620
[    9.742667]  ? __lock_acquire+0x4a5/0x1620
[    9.747237]  lock_acquire+0x100/0x210
[    9.751319]  ? lock_acquire+0x100/0x210
[    9.755596]  ? flush_work+0x12/0x2a0
[    9.759583]  flush_work+0x47/0x2a0
[    9.763375]  ? flush_work+0x12/0x2a0
[    9.767362]  ? queue_work_on+0x47/0xa0
[    9.771545]  ? __this_cpu_preempt_check+0x13/0x20
[    9.776792]  ? trace_hardirqs_on_caller+0xfb/0x1d0
[    9.782139]  ? trace_hardirqs_on+0xd/0x10
[    9.786610]  work_on_cpu+0x82/0x90
[    9.790404]  ? __usermodehelper_disable+0x110/0x110
[    9.795846]  ? __acpi_processor_get_throttling+0x20/0x20
[    9.801773]  acpi_processor_set_throttling+0x199/0x220
[    9.807506]  ? trace_hardirqs_on_caller+0xfb/0x1d0
[    9.812851]  acpi_processor_get_throttling_ptc+0xec/0x180
[    9.818876]  __acpi_processor_get_throttling+0xf/0x20
[    9.824511]  work_for_cpu_fn+0x14/0x20
[    9.828692]  process_one_work+0x261/0x670
[    9.833165]  worker_thread+0x21b/0x3f0
[    9.837348]  kthread+0x108/0x140
[    9.840947]  ? process_one_work+0x670/0x670
[    9.845611]  ? kthread_create_on_node+0x40/0x40
[    9.850667]  ret_from_fork+0x31/0x40
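The recursion lockdep reports comes from nesting work_on_cpu() inside a work
item that was itself started via work_on_cpu(): per the trace,
__acpi_processor_get_throttling() runs in a kworker through work_for_cpu_fn()
and ends up in acpi_processor_set_throttling(), which calls work_on_cpu()
again. A minimal sketch of that shape (hypothetical helpers, not the ACPI
code) follows; all on-stack work_for_cpu items share a single lockdep class,
so the inner flush_work() looks like re-acquiring the (&wfc.work) "lock" the
outer worker already holds.

#include <linux/smp.h>
#include <linux/workqueue.h>

/*
 * Minimal sketch (hypothetical helpers, not the ACPI code) of the shape
 * lockdep flags above.  work_on_cpu() builds an on-stack work item, and
 * all of those share one lock_class_key, so flushing the inner one from
 * inside the outer one looks like recursive locking of (&wfc.work).
 */
static long inner_on_cpu(void *arg)
{
	return 0;
}

static long outer_on_cpu(void *arg)
{
	/*
	 * Runs in a kworker via work_for_cpu_fn().  Calling work_on_cpu()
	 * from here is what produces the "possible recursive locking"
	 * report, even though a second kworker usually keeps it from
	 * actually deadlocking.
	 */
	return work_on_cpu(raw_smp_processor_id(), inner_on_cpu, NULL);
}

static long reproduce_splat(int cpu)
{
	return work_on_cpu(cpu, outer_on_cpu, NULL);
}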
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -262,11 +262,16 @@ static int __acpi_processor_start(struct
 static int acpi_processor_start(struct device *dev)
 {
 	struct acpi_device *device = ACPI_COMPANION(dev);
+	int ret;
 
 	if (!device)
 		return -ENODEV;
 
-	return __acpi_processor_start(device);
+	/* Protect against concurrent CPU hotplug operations */
+	get_online_cpus();
+	ret = __acpi_processor_start(device);
+	put_online_cpus();
+	return ret;
 }
 
 static int acpi_processor_stop(struct device *dev)
--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -901,36 +901,31 @@ static int acpi_processor_get_throttling
 	return 0;
 }
 
-static int acpi_processor_get_throttling(struct acpi_processor *pr)
+static long __acpi_processor_get_throttling(void *data)
 {
-	cpumask_var_t saved_mask;
-	int ret;
+	struct acpi_processor *pr = data;
+
+	return pr->throttling.acpi_processor_get_throttling(pr);
+}
 
+static int acpi_processor_get_throttling(struct acpi_processor *pr)
+{
 	if (!pr)
 		return -EINVAL;
 
 	if (!pr->flags.throttling)
 		return -ENODEV;
 
-	if (!alloc_cpumask_var(&saved_mask, GFP_KERNEL))
-		return -ENOMEM;
-
 	/*
-	 * Migrate task to the cpu pointed by pr.
+	 * This is either called from the CPU hotplug callback of
+	 * processor_driver or via the ACPI probe function. In the latter
+	 * case the CPU is not guaranteed to be online. Both call sites are
+	 * protected against CPU hotplug.
 	 */
-	cpumask_copy(saved_mask, &current->cpus_allowed);
-	/* FIXME: use work_on_cpu() */
-	if (set_cpus_allowed_ptr(current, cpumask_of(pr->id))) {
-		/* Can't migrate to the target pr->id CPU. Exit */
-		free_cpumask_var(saved_mask);
+	if (!cpu_online(pr->id))
 		return -ENODEV;
-	}
-	ret = pr->throttling.acpi_processor_get_throttling(pr);
-	/* restore the previous state */
-	set_cpus_allowed_ptr(current, saved_mask);
-	free_cpumask_var(saved_mask);
-	return ret;
+
+	return work_on_cpu(pr->id, __acpi_processor_get_throttling, pr);
 }
 
 static int acpi_processor_get_fadt_info(struct acpi_processor *pr)
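For context on why flush_work() shows up underneath work_on_cpu() in the
trace above: work_on_cpu() queues an on-stack work item on the target CPU
and waits for it, roughly as in the sketch below (paraphrased from
kernel/workqueue.c of that era, not a verbatim copy; details may differ).

#include <linux/kernel.h>
#include <linux/workqueue.h>

/* Approximate shape of work_on_cpu() around v4.11 (simplified sketch). */
struct work_for_cpu {
	struct work_struct work;
	long (*fn)(void *);
	void *arg;
	long ret;
};

static void work_for_cpu_fn(struct work_struct *work)
{
	struct work_for_cpu *wfc = container_of(work, struct work_for_cpu, work);

	wfc->ret = wfc->fn(wfc->arg);
}

long work_on_cpu(int cpu, long (*fn)(void *), void *arg)
{
	struct work_for_cpu wfc = { .fn = fn, .arg = arg };

	/* The on-stack work item is what lockdep prints as (&wfc.work). */
	INIT_WORK_ONSTACK(&wfc.work, work_for_cpu_fn);
	schedule_work_on(cpu, &wfc.work);
	/* Waiting for completion is the flush_work() in the call trace. */
	flush_work(&wfc.work);
	destroy_work_on_stack(&wfc.work);
	return wfc.ret;
}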
acpi_processor_get_throttling() requires invoking the getter function on
the target CPU. This is achieved by temporarily setting the affinity of
the calling user space thread to the requested CPU and resetting it to
the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

acpi_processor_get_throttling() is invoked in two ways:

1) The CPU online callback, which is already running on the target CPU
   and is obviously protected against hotplug and not affected by
   affinity settings.

2) The ACPI driver probe function, which is not protected against
   hotplug during modprobe.

Switch it over to work_on_cpu() and protect the probe function against
CPU hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-acpi@vger.kernel.org
---
 drivers/acpi/processor_driver.c     |  7 ++++++-
 drivers/acpi/processor_throttling.c | 31 +++++++++++++------------------
 2 files changed, 19 insertions(+), 19 deletions(-)
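To illustrate the race the changelog describes, here is a sketch
(hypothetical helper, modeled on the code removed by this patch, not the
exact driver code) of the old affinity-juggling pattern with the two
problematic windows marked:

#include <linux/cpumask.h>
#include <linux/gfp.h>
#include <linux/sched.h>

/* Sketch of the old, racy "run fn() on a given CPU" pattern. */
static long run_on_cpu_racy(int cpu, long (*fn)(void *), void *arg)
{
	cpumask_var_t saved_mask;
	long ret;

	if (!alloc_cpumask_var(&saved_mask, GFP_KERNEL))
		return -ENOMEM;

	cpumask_copy(saved_mask, &current->cpus_allowed);

	/*
	 * Window 1: nothing prevents the target CPU from going offline
	 * here, and a concurrent sched_setaffinity() may already have
	 * changed the mask that was just saved.
	 */
	if (set_cpus_allowed_ptr(current, cpumask_of(cpu))) {
		free_cpumask_var(saved_mask);
		return -ENODEV;
	}

	ret = fn(arg);	/* not guaranteed to run on 'cpu' if it went away */

	/*
	 * Window 2: restoring the saved mask silently overwrites any
	 * affinity that was set for this thread in the meantime.
	 */
	set_cpus_allowed_ptr(current, saved_mask);
	free_cpumask_var(saved_mask);
	return ret;
}

The patch sidesteps both windows: the probe path pins hotplug with
get_online_cpus()/put_online_cpus(), and work_on_cpu() never touches the
caller's affinity at all.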