
[08/13] ACPI/processor: Replace racy task affinity logic.

Message ID 20170412201042.785920903@linutronix.de (mailing list archive)
State Not Applicable, archived

Commit Message

Thomas Gleixner April 12, 2017, 8:07 p.m. UTC
acpi_processor_get_throttling() must invoke the getter function on the
target CPU. It does so by temporarily setting the affinity of the calling
user space thread to the requested CPU and resetting it to the original
affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity changes for that
thread, which can result in code executing on the wrong CPU and in the
new affinity setting being overwritten.

acpi_processor_get_throttling() is invoked in two ways:

1) The CPU online callback, which is already running on the target CPU and
   obviously protected against hotplug and not affected by affinity
   settings.

2) The ACPI driver probe function, which is not protected against hotplug
   during modprobe.

Switch it over to work_on_cpu() and protect the probe function against CPU
hotplug.
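
For illustration only, the save/pin/restore pattern being replaced can be
sketched in user space. This is not the kernel code: it assumes Linux with
glibc, uses sched_setaffinity() in place of set_cpus_allowed_ptr(), and the
names run_pinned_racy() and sample_getter() are hypothetical. The comments
mark the two windows in which a concurrent affinity change breaks it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for pr->throttling.acpi_processor_get_throttling(). */
typedef long (*cpu_fn)(void *arg);

/* Hypothetical sample getter: returns the pointed-to value plus one. */
static long sample_getter(void *arg)
{
	return *(long *)arg + 1;
}

/*
 * User-space analogue of the removed logic: pin the calling thread to
 * 'cpu', run fn, then restore the saved mask. Returns -1 if the affinity
 * cannot be read or set. *ran_on reports where the thread was right
 * after fn returned.
 */
static long run_pinned_racy(int cpu, cpu_fn fn, void *arg, int *ran_on)
{
	cpu_set_t saved, target;
	long ret;

	assert(fn != NULL && ran_on != NULL);

	if (sched_getaffinity(0, sizeof(saved), &saved))
		return -1;

	CPU_ZERO(&target);
	CPU_SET(cpu, &target);
	if (sched_setaffinity(0, sizeof(target), &target))
		return -1;

	/*
	 * Window 1: if another thread changes our affinity here (or the
	 * CPU goes away), fn runs on the wrong CPU.
	 */
	ret = fn(arg);
	*ran_on = sched_getcpu();

	/*
	 * Window 2: restoring 'saved' silently overwrites any affinity
	 * that was set concurrently since we sampled it.
	 */
	sched_setaffinity(0, sizeof(saved), &saved);
	return ret;
}
```

A caller would pass a CPU from its current affinity mask, for example
run_pinned_racy(sched_getcpu(), sample_getter, &value, &ran_on). Nothing in
this sketch prevents either window, which is why the patch hands the call
to a per-CPU worker instead.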

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-acpi@vger.kernel.org
---
 drivers/acpi/processor_driver.c     |    7 ++++++-
 drivers/acpi/processor_throttling.c |   31 +++++++++++++------------------
 2 files changed, 19 insertions(+), 19 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Peter Zijlstra April 13, 2017, 11:39 a.m. UTC | #1
On Wed, Apr 12, 2017 at 10:07:34PM +0200, Thomas Gleixner wrote:
> acpi_processor_get_throttling() must invoke the getter function on the
> target CPU. It does so by temporarily setting the affinity of the calling
> user space thread to the requested CPU and resetting it to the original
> affinity afterwards.
> 
> That's racy vs. CPU hotplug and concurrent affinity changes for that
> thread, which can result in code executing on the wrong CPU and in the
> new affinity setting being overwritten.
> 
> acpi_processor_get_throttling() is invoked in two ways:
> 
> 1) The CPU online callback, which is already running on the target CPU and
>    obviously protected against hotplug and not affected by affinity
>    settings.
> 
> 2) The ACPI driver probe function, which is not protected against hotplug
>    during modprobe.
> 
> Switch it over to work_on_cpu() and protect the probe function against CPU
> hotplug.
> 

> +static int acpi_processor_get_throttling(struct acpi_processor *pr)
> +{
>  	if (!pr)
>  		return -EINVAL;
>  
>  	if (!pr->flags.throttling)
>  		return -ENODEV;
>  
>  	/*
> +	 * This is either called from the CPU hotplug callback of
> +	 * processor_driver or via the ACPI probe function. In the latter
> +	 * case the CPU is not guaranteed to be online. Both call sites are
> +	 * protected against CPU hotplug.
>  	 */
> +	if (!cpu_online(pr->id))
>  		return -ENODEV;
>  
> +	return work_on_cpu(pr->id, __acpi_processor_get_throttling, pr);
>  }

That makes my machine sad...


[    9.583030] =============================================
[    9.589053] [ INFO: possible recursive locking detected ]
[    9.595079] 4.11.0-rc6-00385-g5aee78a-dirty #678 Not tainted
[    9.601393] ---------------------------------------------
[    9.607418] kworker/0:0/3 is trying to acquire lock:
[    9.612954]  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110c172>] flush_work+0x12/0x2a0
[    9.621406] 
[    9.621406] but task is already holding lock:
[    9.627915]  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.637044] 
[    9.637044] other info that might help us debug this:
[    9.644330]  Possible unsafe locking scenario:
[    9.644330] 
[    9.650934]        CPU0
[    9.653660]        ----
[    9.656386]   lock((&wfc.work));
[    9.659987]   lock((&wfc.work));
[    9.663586] 
[    9.663586]  *** DEADLOCK ***
[    9.663586] 
[    9.670189]  May be due to missing lock nesting notation
[    9.670189] 
[    9.677765] 2 locks held by kworker/0:0/3:
[    9.682332]  #0:  ("events"){.+.+.+}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.691654]  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.701267] 
[    9.701267] stack backtrace:
[    9.706127] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.11.0-rc6-00385-g5aee78a-dirty #678
[    9.715545] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[    9.726999] Workqueue: events work_for_cpu_fn
[    9.731860] Call Trace:
[    9.734591]  dump_stack+0x86/0xcf
[    9.738290]  __lock_acquire+0x790/0x1620
[    9.742667]  ? __lock_acquire+0x4a5/0x1620
[    9.747237]  lock_acquire+0x100/0x210
[    9.751319]  ? lock_acquire+0x100/0x210
[    9.755596]  ? flush_work+0x12/0x2a0
[    9.759583]  flush_work+0x47/0x2a0
[    9.763375]  ? flush_work+0x12/0x2a0
[    9.767362]  ? queue_work_on+0x47/0xa0
[    9.771545]  ? __this_cpu_preempt_check+0x13/0x20
[    9.776792]  ? trace_hardirqs_on_caller+0xfb/0x1d0
[    9.782139]  ? trace_hardirqs_on+0xd/0x10
[    9.786610]  work_on_cpu+0x82/0x90
[    9.790404]  ? __usermodehelper_disable+0x110/0x110
[    9.795846]  ? __acpi_processor_get_throttling+0x20/0x20
[    9.801773]  acpi_processor_set_throttling+0x199/0x220
[    9.807506]  ? trace_hardirqs_on_caller+0xfb/0x1d0
[    9.812851]  acpi_processor_get_throttling_ptc+0xec/0x180
[    9.818876]  __acpi_processor_get_throttling+0xf/0x20
[    9.824511]  work_for_cpu_fn+0x14/0x20
[    9.828692]  process_one_work+0x261/0x670
[    9.833165]  worker_thread+0x21b/0x3f0
[    9.837348]  kthread+0x108/0x140
[    9.840947]  ? process_one_work+0x670/0x670
[    9.845611]  ? kthread_create_on_node+0x40/0x40
[    9.850667]  ret_from_fork+0x31/0x40
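
The splat above can be read as follows: the getter already runs inside a
work_on_cpu() work item, and the call trace shows it reaching
acpi_processor_set_throttling(), which calls work_on_cpu() again and
flushes a second work item of the same lock class. A toy user-space model
of that class check, purely as a sketch: the toy_* names are hypothetical,
this is not lockdep's real data structure, and the assumption that both
items share one class because they are initialized at the same place in
work_on_cpu() is inferred from the identical "(&wfc.work)" names in the
report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_HELD 8

/* Per-task stack of held lock classes (toy stand-in for lockdep state). */
struct task_locks {
	const char *held[TOY_MAX_HELD];
	size_t depth;
};

/*
 * Returns 0 on success, -1 if the class is already held by this task
 * (the "possible recursive locking" case) or the stack is full.
 */
static int toy_lock_acquire(struct task_locks *t, const char *class_name)
{
	for (size_t i = 0; i < t->depth; i++)
		if (strcmp(t->held[i], class_name) == 0)
			return -1;
	if (t->depth == TOY_MAX_HELD)
		return -1;
	t->held[t->depth++] = class_name;
	return 0;
}

/* Drops the most recently acquired class. */
static void toy_lock_release(struct task_locks *t)
{
	assert(t->depth > 0);
	t->depth--;
}

/* Models a work item whose callback issues a nested work_on_cpu(). */
static int toy_nested_work_on_cpu(struct task_locks *t)
{
	int ret;

	/* process_one_work() takes the item's class before running it. */
	if (toy_lock_acquire(t, "(&wfc.work)"))
		return -1;

	/* Callback -> work_on_cpu() -> flush_work(): same class again. */
	ret = toy_lock_acquire(t, "(&wfc.work)");
	if (ret == 0)
		toy_lock_release(t);

	toy_lock_release(t);
	return ret;
}
```

With "events" already held, toy_nested_work_on_cpu() returns -1 at the
second acquisition, which mirrors the two-lock "held by kworker/0:0" list
followed by the attempted third acquisition in the report.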

Patch

--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -262,11 +262,16 @@  static int __acpi_processor_start(struct
 static int acpi_processor_start(struct device *dev)
 {
 	struct acpi_device *device = ACPI_COMPANION(dev);
+	int ret;
 
 	if (!device)
 		return -ENODEV;
 
-	return __acpi_processor_start(device);
+	/* Protect against concurrent CPU hotplug operations */
+	get_online_cpus();
+	ret = __acpi_processor_start(device);
+	put_online_cpus();
+	return ret;
 }
 
 static int acpi_processor_stop(struct device *dev)
--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -901,36 +901,31 @@  static int acpi_processor_get_throttling
 	return 0;
 }
 
-static int acpi_processor_get_throttling(struct acpi_processor *pr)
+static long __acpi_processor_get_throttling(void *data)
 {
-	cpumask_var_t saved_mask;
-	int ret;
+	struct acpi_processor *pr = data;
+
+	return pr->throttling.acpi_processor_get_throttling(pr);
+}
 
+static int acpi_processor_get_throttling(struct acpi_processor *pr)
+{
 	if (!pr)
 		return -EINVAL;
 
 	if (!pr->flags.throttling)
 		return -ENODEV;
 
-	if (!alloc_cpumask_var(&saved_mask, GFP_KERNEL))
-		return -ENOMEM;
-
 	/*
-	 * Migrate task to the cpu pointed by pr.
+	 * This is either called from the CPU hotplug callback of
+	 * processor_driver or via the ACPI probe function. In the latter
+	 * case the CPU is not guaranteed to be online. Both call sites are
+	 * protected against CPU hotplug.
 	 */
-	cpumask_copy(saved_mask, &current->cpus_allowed);
-	/* FIXME: use work_on_cpu() */
-	if (set_cpus_allowed_ptr(current, cpumask_of(pr->id))) {
-		/* Can't migrate to the target pr->id CPU. Exit */
-		free_cpumask_var(saved_mask);
+	if (!cpu_online(pr->id))
 		return -ENODEV;
-	}
-	ret = pr->throttling.acpi_processor_get_throttling(pr);
-	/* restore the previous state */
-	set_cpus_allowed_ptr(current, saved_mask);
-	free_cpumask_var(saved_mask);
 
-	return ret;
+	return work_on_cpu(pr->id, __acpi_processor_get_throttling, pr);
 }
 
 static int acpi_processor_get_fadt_info(struct acpi_processor *pr)