diff mbox

driver core / ACPI: Avoid device removal locking problems

Message ID 521C6FA8.4070804@cn.fujitsu.com (mailing list archive)
State Rejected, archived
Headers show

Commit Message

Gu Zheng Aug. 27, 2013, 9:21 a.m. UTC
Hi Rafael,

On 08/26/2013 11:02 PM, Rafael J. Wysocki wrote:

> On Monday, August 26, 2013 04:43:26 PM Rafael J. Wysocki wrote:
>> On Monday, August 26, 2013 02:42:09 PM Rafael J. Wysocki wrote:
>>> On Monday, August 26, 2013 11:13:13 AM Gu Zheng wrote:
>>>> Hi Rafael,
> 
> [...]
> 
>>
>> OK, so the patch below is quick and dirty and overkill, but it should make the
>> splat go away at least.
> 
> And if this patch does make the splat go away for you, please also test the
> appended one (Tejun, thanks for the hint!).
> 
> I'll address the ACPI part differently later.

What about changing device_hotplug_lock and acpi_scan_lock to rwsem? like the
attached one(With a preliminary test, it also can make the splat go away).:)

Regards,
Gu

> 
[...]
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
From f1682ceaef4105f75f4d6a0bb8e77c8a5dde365b Mon Sep 17 00:00:00 2001
From: Gu Zheng <guz.fnst@cn.fujitsu.com>
Date: Tue, 27 Aug 2013 17:59:55 +0900
Subject: [PATCH] acpi: fix removal lock dep


Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
---
 drivers/acpi/scan.c    |   43 ++++++++++++++++++++++---------------------
 drivers/acpi/sysfs.c   |    7 +++++--
 drivers/base/core.c    |   45 ++++++++++++++++++++++++++++++++++++---------
 drivers/base/memory.c  |    5 +++--
 include/linux/device.h |    8 ++++++--
 5 files changed, 72 insertions(+), 36 deletions(-)

Comments

Tejun Heo Aug. 27, 2013, 6:36 p.m. UTC | #1
Hello,

On Tue, Aug 27, 2013 at 05:21:44PM +0800, Gu Zheng wrote:
> >> OK, so the patch below is quick and dirty and overkill, but it should make the
> >> splat go away at least.
> > 
> > And if this patch does make the splat go away for you, please also test the
> > appended one (Tejun, thanks for the hint!).
> > 
> > I'll address the ACPI part differently later.
> 
> What about changing device_hotplug_lock and acpi_scan_lock to rwsem? like the
> attached one(With a preliminary test, it also can make the splat go away).:)

Hmmm.. I don't get it.  How is introducing another rwlock whic may
cause the operation, even reading the status, to fail randomly a
better option?  It's harier and more brittle.  We probably want to
implement better solution in sysfs for files which interact with
device addition / removal paths but for now I think Rafael's patch is
the right direction.

Thanks.
Toshi Kani Aug. 27, 2013, 9:38 p.m. UTC | #2
On Tue, 2013-08-27 at 17:21 +0800, Gu Zheng wrote:
> Hi Rafael,
> 
> On 08/26/2013 11:02 PM, Rafael J. Wysocki wrote:
> 
> > On Monday, August 26, 2013 04:43:26 PM Rafael J. Wysocki wrote:
> >> On Monday, August 26, 2013 02:42:09 PM Rafael J. Wysocki wrote:
> >>> On Monday, August 26, 2013 11:13:13 AM Gu Zheng wrote:
> >>>> Hi Rafael,
> > 
> > [...]
> > 
> >>
> >> OK, so the patch below is quick and dirty and overkill, but it should make the
> >> splat go away at least.
> > 
> > And if this patch does make the splat go away for you, please also test the
> > appended one (Tejun, thanks for the hint!).
> > 
> > I'll address the ACPI part differently later.
> 
> What about changing device_hotplug_lock and acpi_scan_lock to rwsem? like the
> attached one(With a preliminary test, it also can make the splat go away).:)

I am curious how msleep(10) & restart_syscall() work in the change
below.  Doesn't the msleep() make s_active held longer time, which can
lead the thread holding device_hotplug_lock to wait it for deletion?
Also, does restart_syscall() release s_active and reopen this file
again?

@@ -408,9 +408,13 @@ static ssize_t show_online(struct device *dev,
struct device_attribute *attr,
 {
        bool val;

-       lock_device_hotplug();
+       if (!read_lock_device_hotplug()) {
+               msleep(10);
+               return restart_syscall();
+       }
+

Thanks,
-Toshi

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gu Zheng Aug. 28, 2013, 2:12 a.m. UTC | #3
Hi Toshi,

On 08/28/2013 05:38 AM, Toshi Kani wrote:

> On Tue, 2013-08-27 at 17:21 +0800, Gu Zheng wrote:
>> Hi Rafael,
>>
>> On 08/26/2013 11:02 PM, Rafael J. Wysocki wrote:
>>
>>> On Monday, August 26, 2013 04:43:26 PM Rafael J. Wysocki wrote:
>>>> On Monday, August 26, 2013 02:42:09 PM Rafael J. Wysocki wrote:
>>>>> On Monday, August 26, 2013 11:13:13 AM Gu Zheng wrote:
>>>>>> Hi Rafael,
>>>
>>> [...]
>>>
>>>>
>>>> OK, so the patch below is quick and dirty and overkill, but it should make the
>>>> splat go away at least.
>>>
>>> And if this patch does make the splat go away for you, please also test the
>>> appended one (Tejun, thanks for the hint!).
>>>
>>> I'll address the ACPI part differently later.
>>
>> What about changing device_hotplug_lock and acpi_scan_lock to rwsem? like the
>> attached one(With a preliminary test, it also can make the splat go away).:)
> 
> I am curious how msleep(10) & restart_syscall() work in the change
> below.  Doesn't the msleep() make s_active held longer time, which can
> lead the thread holding device_hotplug_lock to wait it for deletion?

Yes, but it can avoid busy waiting. 

> Also, does restart_syscall() release s_active and reopen this file
> again?

Sure, it just set a TIF_SIGPENDING flag and return an -ERESTARTNOINTR error, s_active/file
will be released/closed in the failed path. And when do_signal() catches the -ERESTARTNOINTR,
it will change the regs to restart the syscall.

Thanks,
Gu

> 
> @@ -408,9 +408,13 @@ static ssize_t show_online(struct device *dev,
> struct device_attribute *attr,
>  {
>         bool val;
> 
> -       lock_device_hotplug();
> +       if (!read_lock_device_hotplug()) {
> +               msleep(10);
> +               return restart_syscall();
> +       }
> +
> 
> Thanks,
> -Toshi
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Toshi Kani Aug. 28, 2013, 4:55 p.m. UTC | #4
On Wed, 2013-08-28 at 10:12 +0800, Gu Zheng wrote:
> On 08/28/2013 05:38 AM, Toshi Kani wrote:
 :
> >>
> >> What about changing device_hotplug_lock and acpi_scan_lock to rwsem? like the
> >> attached one(With a preliminary test, it also can make the splat go away).:)
> > 
> > I am curious how msleep(10) & restart_syscall() work in the change
> > below.  Doesn't the msleep() make s_active held longer time, which can
> > lead the thread holding device_hotplug_lock to wait it for deletion?
> 
> Yes, but it can avoid busy waiting. 

I know, but it's kinda unfortunate to sleep with s_active held in this
situation.  But I am fine with the 5ms Rafael used in this latest
patchset since this is a rare case anyway.

> > Also, does restart_syscall() release s_active and reopen this file
> > again?
> 
> Sure, it just set a TIF_SIGPENDING flag and return an -ERESTARTNOINTR error, s_active/file
> will be released/closed in the failed path. And when do_signal() catches the -ERESTARTNOINTR,
> it will change the regs to restart the syscall.

I see.  This is a clever functionality. 

Thanks,
-Toshi


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 8a46c92..bb41760 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -36,7 +36,7 @@  bool acpi_force_hot_remove;
 static const char *dummy_hid = "device";
 
 static LIST_HEAD(acpi_bus_id_list);
-static DEFINE_MUTEX(acpi_scan_lock);
+static DECLARE_RWSEM(acpi_scan_rwsem);
 static LIST_HEAD(acpi_scan_handlers_list);
 DEFINE_MUTEX(acpi_device_lock);
 LIST_HEAD(acpi_wakeup_device_list);
@@ -49,13 +49,13 @@  struct acpi_device_bus_id{
 
 void acpi_scan_lock_acquire(void)
 {
-	mutex_lock(&acpi_scan_lock);
+	down_write(&acpi_scan_rwsem);
 }
 EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
 
 void acpi_scan_lock_release(void)
 {
-	mutex_unlock(&acpi_scan_lock);
+	up_write(&acpi_scan_rwsem);
 }
 EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
 
@@ -207,7 +207,7 @@  static int acpi_scan_hot_remove(struct acpi_device *device)
 		return -EINVAL;
 	}
 
-	lock_device_hotplug();
+	device_hotplug_begin();
 
 	/*
 	 * Carry out two passes here and ignore errors in the first pass,
@@ -240,7 +240,7 @@  static int acpi_scan_hot_remove(struct acpi_device *device)
 					    acpi_bus_online_companions, NULL,
 					    NULL, NULL);
 
-			unlock_device_hotplug();
+			device_hotplug_end();
 
 			put_device(&device->dev);
 			return -EBUSY;
@@ -252,7 +252,7 @@  static int acpi_scan_hot_remove(struct acpi_device *device)
 
 	acpi_bus_trim(device);
 
-	unlock_device_hotplug();
+	device_hotplug_end();
 
 	/* Device node has been unregistered. */
 	put_device(&device->dev);
@@ -308,7 +308,7 @@  static void acpi_bus_device_eject(void *context)
 	struct acpi_scan_handler *handler;
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE;
 
-	mutex_lock(&acpi_scan_lock);
+	acpi_scan_lock_acquire();
 
 	acpi_bus_get_device(handle, &device);
 	if (!device)
@@ -334,7 +334,7 @@  static void acpi_bus_device_eject(void *context)
 	}
 
  out:
-	mutex_unlock(&acpi_scan_lock);
+	acpi_scan_lock_release();
 	return;
 
  err_out:
@@ -349,8 +349,8 @@  static void acpi_scan_bus_device_check(acpi_handle handle, u32 ost_source)
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE;
 	int error;
 
-	mutex_lock(&acpi_scan_lock);
-	lock_device_hotplug();
+	acpi_scan_lock_acquire();
+	device_hotplug_begin();
 
 	if (ost_source != ACPI_NOTIFY_BUS_CHECK) {
 		acpi_bus_get_device(handle, &device);
@@ -376,9 +376,9 @@  static void acpi_scan_bus_device_check(acpi_handle handle, u32 ost_source)
 		kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
 
  out:
-	unlock_device_hotplug();
+	device_hotplug_end();
 	acpi_evaluate_hotplug_ost(handle, ost_source, ost_code, NULL);
-	mutex_unlock(&acpi_scan_lock);
+	acpi_scan_lock_release();
 }
 
 static void acpi_scan_bus_check(void *context)
@@ -469,15 +469,14 @@  void acpi_bus_hot_remove_device(void *context)
 	acpi_handle handle = device->handle;
 	int error;
 
-	mutex_lock(&acpi_scan_lock);
+	acpi_scan_lock_acquire();
 
 	error = acpi_scan_hot_remove(device);
 	if (error && handle)
 		acpi_evaluate_hotplug_ost(handle, ej_event->event,
 					  ACPI_OST_SC_NON_SPECIFIC_FAILURE,
 					  NULL);
-
-	mutex_unlock(&acpi_scan_lock);
+	acpi_scan_lock_release();
 	kfree(context);
 }
 EXPORT_SYMBOL(acpi_bus_hot_remove_device);
@@ -530,7 +529,8 @@  acpi_eject_store(struct device *d, struct device_attribute *attr,
 	if (ACPI_FAILURE(status) || !acpi_device->flags.ejectable)
 		return -ENODEV;
 
-	mutex_lock(&acpi_scan_lock);
+	if (!down_write_trylock(&acpi_scan_rwsem))
+		return -EBUSY;
 
 	if (acpi_device->flags.eject_pending) {
 		/* ACPI eject notification event. */
@@ -560,7 +560,7 @@  acpi_eject_store(struct device *d, struct device_attribute *attr,
 	ret = count;
 
  out:
-	mutex_unlock(&acpi_scan_lock);
+	up_write(&acpi_scan_rwsem);
 	return ret;
 
  err_out:
@@ -1858,11 +1858,12 @@  void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val)
 	if (!!hotplug->enabled == !!val)
 		return;
 
-	mutex_lock(&acpi_scan_lock);
+	acpi_scan_lock_acquire();
 
 	hotplug->enabled = val;
 
-	mutex_unlock(&acpi_scan_lock);
+	acpi_scan_lock_release();
+
 }
 
 static void acpi_scan_init_hotplug(acpi_handle handle, int type)
@@ -2141,7 +2142,7 @@  int __init acpi_scan_init(void)
 	acpi_memory_hotplug_init();
 	acpi_dock_init();
 
-	mutex_lock(&acpi_scan_lock);
+	acpi_scan_lock_acquire();
 	/*
 	 * Enumerate devices in the ACPI namespace.
 	 */
@@ -2164,6 +2165,6 @@  int __init acpi_scan_init(void)
 	acpi_pci_root_hp_init();
 
  out:
-	mutex_unlock(&acpi_scan_lock);
+	acpi_scan_lock_release();
 	return result;
 }
diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c
index 05306a5..6d8b54f 100644
--- a/drivers/acpi/sysfs.c
+++ b/drivers/acpi/sysfs.c
@@ -796,9 +796,12 @@  static ssize_t force_remove_store(struct kobject *kobj,
 	if (ret < 0)
 		return ret;
 
-	lock_device_hotplug();
+	if (!write_lock_device_hotplug())
+		return -EBUSY;
+
 	acpi_force_hot_remove = val;
-	unlock_device_hotplug();
+
+	write_unlock_device_hotplug();
 	return size;
 }
 
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 8856d74..83c0f46 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -408,9 +408,13 @@  static ssize_t show_online(struct device *dev, struct device_attribute *attr,
 {
 	bool val;
 
-	lock_device_hotplug();
+	if (!read_lock_device_hotplug()) {
+		msleep(10);
+		return restart_syscall();
+	}
+
 	val = !dev->offline;
-	unlock_device_hotplug();
+	read_unlock_device_hotplug();
 	return sprintf(buf, "%u\n", val);
 }
 
@@ -424,9 +428,12 @@  static ssize_t store_online(struct device *dev, struct device_attribute *attr,
 	if (ret < 0)
 		return ret;
 
-	lock_device_hotplug();
+	if (!write_lock_device_hotplug()) {
+		msleep(10);
+		return restart_syscall();
+	}
 	ret = val ? device_online(dev) : device_offline(dev);
-	unlock_device_hotplug();
+	write_unlock_device_hotplug();
 	return ret < 0 ? ret : count;
 }
 
@@ -1479,16 +1486,36 @@  EXPORT_SYMBOL_GPL(put_device);
 EXPORT_SYMBOL_GPL(device_create_file);
 EXPORT_SYMBOL_GPL(device_remove_file);
 
-static DEFINE_MUTEX(device_hotplug_lock);
+static DECLARE_RWSEM(device_hotplug_rwsem);
+
+bool __must_check read_lock_device_hotplug(void)
+{
+	return down_read_trylock(&device_hotplug_rwsem);
+}
+
+void read_unlock_device_hotplug(void)
+{
+	up_read(&device_hotplug_rwsem);
+}
+
+bool __must_check write_lock_device_hotplug(void)
+{
+	return down_write_trylock(&device_hotplug_rwsem);
+}
+
+void write_unlock_device_hotplug(void)
+{
+	up_write(&device_hotplug_rwsem);
+}
 
-void lock_device_hotplug(void)
+void device_hotplug_begin(void)
 {
-	mutex_lock(&device_hotplug_lock);
+	down_write(&device_hotplug_rwsem);
 }
 
-void unlock_device_hotplug(void)
+void device_hotplug_end(void)
 {
-	mutex_unlock(&device_hotplug_lock);
+	up_write(&device_hotplug_rwsem);
 }
 
 static int device_check_offline(struct device *dev, void *not_used)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 2b7813e..71991b9 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -351,7 +351,8 @@  store_mem_state(struct device *dev,
 
 	mem = container_of(dev, struct memory_block, dev);
 
-	lock_device_hotplug();
+	if (!write_lock_device_hotplug())
+		return -EBUSY;
 
 	if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
 		offline = false;
@@ -373,7 +374,7 @@  store_mem_state(struct device *dev,
 	if (!ret)
 		dev->offline = offline;
 
-	unlock_device_hotplug();
+	write_unlock_device_hotplug();
 
 	if (ret)
 		return ret;
diff --git a/include/linux/device.h b/include/linux/device.h
index 22b546a..08581f4 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -893,8 +893,12 @@  static inline bool device_supports_offline(struct device *dev)
 	return dev->bus && dev->bus->offline && dev->bus->online;
 }
 
-extern void lock_device_hotplug(void);
-extern void unlock_device_hotplug(void);
+extern bool read_lock_device_hotplug(void);
+extern void read_unlock_device_hotplug(void);
+extern bool write_lock_device_hotplug(void);
+extern void write_unlock_device_hotplug(void);
+extern void device_hotplug_begin(void);
+extern void device_hotplug_end(void);
 extern int device_offline(struct device *dev);
 extern int device_online(struct device *dev);
 /*