[v4,1/7] acpi,memory-hotplug: introduce a mutex lock to protect the list in acpi_memory_device

Message ID 1352372693-32411-2-git-send-email-wency@cn.fujitsu.com (mailing list archive)
State Superseded, archived

Commit Message

Wen Congyang Nov. 8, 2012, 11:04 a.m. UTC
The memory device can be removed in two ways:
1. an eject request sent via SCI
2. echo 1 >/sys/bus/acpi/devices/PNP0C80:XX/eject

These two events may happen at the same time, so
acpi_memory_device.res_list may be accessed concurrently. This patch
introduces a mutex to protect the list.

CC: David Rientjes <rientjes@google.com>
CC: Jiang Liu <liuj97@gmail.com>
CC: Len Brown <len.brown@intel.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Christoph Lameter <cl@linux.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
CC: Rafael J. Wysocki <rjw@sisk.pl>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 The commit in pm tree is 85fcb375
 drivers/acpi/acpi_memhotplug.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)
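
For readers who want to see the locking pattern outside the kernel, the
following is a minimal userspace sketch, not code from this patch. It uses
POSIX threads in place of the kernel mutex API, and every name in it
(mem_info, mem_device, race_two_ejects and so on) is hypothetical. Two
threads stand in for the SCI and sysfs eject paths and both try to free the
same resource list; list_lock is what keeps each entry from being freed twice.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct acpi_memory_info. */
struct mem_info {
	unsigned long start_addr;
	unsigned long length;
	struct mem_info *next;
};

/* Hypothetical userspace stand-in for struct acpi_memory_device. */
struct mem_device {
	pthread_mutex_t list_lock;
	struct mem_info *res_list;	/* protected by list_lock */
};

struct eject_ctx {
	struct mem_device *dev;
	int freed;
};

/* Add one range to the list; returns 0 on success, -1 on failure. */
static int mem_device_add_range(struct mem_device *dev,
				unsigned long start, unsigned long length)
{
	struct mem_info *info = malloc(sizeof(*info));

	if (!info)
		return -1;
	info->start_addr = start;
	info->length = length;

	pthread_mutex_lock(&dev->list_lock);
	info->next = dev->res_list;
	dev->res_list = info;
	pthread_mutex_unlock(&dev->list_lock);
	return 0;
}

/* Free every entry under the lock; returns how many were freed. */
static int mem_device_release_all(struct mem_device *dev)
{
	struct mem_info *info, *next;
	int freed = 0;

	pthread_mutex_lock(&dev->list_lock);
	for (info = dev->res_list; info; info = next) {
		next = info->next;
		free(info);
		freed++;
	}
	dev->res_list = NULL;	/* a second caller now sees an empty list */
	pthread_mutex_unlock(&dev->list_lock);
	return freed;
}

/* One eject path (stands in for either the SCI or the sysfs route). */
static void *eject_thread(void *arg)
{
	struct eject_ctx *ctx = arg;

	ctx->freed = mem_device_release_all(ctx->dev);
	return NULL;
}

/* Race two eject paths; returns the total number of entries freed. */
int race_two_ejects(int nranges)
{
	struct mem_device dev;
	struct eject_ctx a = { &dev, 0 }, b = { &dev, 0 };
	pthread_t ta, tb;
	int i;

	assert(nranges >= 0);
	dev.res_list = NULL;
	pthread_mutex_init(&dev.list_lock, NULL);
	for (i = 0; i < nranges; i++)
		if (mem_device_add_range(&dev, i * 0x1000UL, 0x1000UL))
			return -1;

	if (pthread_create(&ta, NULL, eject_thread, &a))
		return -1;
	if (pthread_create(&tb, NULL, eject_thread, &b)) {
		pthread_join(ta, NULL);
		return -1;
	}
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	pthread_mutex_destroy(&dev.list_lock);
	return a.freed + b.freed;
}
```

Build with -pthread. Whichever thread takes the lock first frees all
entries and the other finds an empty list, so race_two_ejects(8) returns 8.
Unlike the acpi_memory_disable_device() hunk in this patch, the sketch also
resets the list head after freeing; that is a choice made for the sketch.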

Comments

Toshi Kani Nov. 12, 2012, 9 p.m. UTC | #1
On Thu, 2012-11-08 at 19:04 +0800, Wen Congyang wrote:
> The memory device can be removed by 2 ways:
> 1. send eject request by SCI
> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject
> 
> This 2 events may happen at the same time, so we may touch
> acpi_memory_device.res_list at the same time. This patch
> introduce a lock to protect this list.

Hi Wen,

This race condition is not unique to memory hot-remove, as the sysfs
eject interface is created for all objects with _EJ0.  For CPU
hot-remove, I addressed this race condition by making the notify handler
run the hot-remove operation on kacpi_hotplug_wq by calling
acpi_os_hotplug_execute().  This serializes the hot-remove operations
between the two events, since the sysfs eject also runs on
kacpi_hotplug_wq.  This approach is much simpler and easier to maintain,
although it does not allow both operations to run simultaneously (which
I do not think we need).  Can it be used for memory hot-remove as well?

Thanks,
-Toshi
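
The serialization idea described above can be illustrated with a small
userspace sketch. This is only an analogy under stated assumptions: it does
not call acpi_os_hotplug_execute() or any kernel API, it uses POSIX threads,
and all names in it (workqueue, fake_device, request_eject,
serialized_removals) are hypothetical. Both event sources only queue a
request; a single worker thread runs the requests one after another, so the
device state itself needs no lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct fake_device;

struct work {			/* one queued hot-remove request */
	struct fake_device *dev;
	struct work *next;
};

/* Hypothetical single-worker queue, loosely analogous to kacpi_hotplug_wq. */
struct workqueue {
	pthread_mutex_t lock;	/* protects the queue, not the device */
	pthread_cond_t more;
	struct work *head, *tail;
	int stopping;
};

struct fake_device {
	struct workqueue *wq;
	int present, removals;	/* touched only by the worker thread */
};

static void hot_remove(struct fake_device *dev)
{
	if (!dev->present)
		return;		/* the other request already removed it */
	dev->present = 0;
	dev->removals++;
}

static void *worker_main(void *arg)
{
	struct workqueue *wq = arg;
	struct work *w;

	for (;;) {
		pthread_mutex_lock(&wq->lock);
		while (!wq->head && !wq->stopping)
			pthread_cond_wait(&wq->more, &wq->lock);
		w = wq->head;
		if (w && !(wq->head = w->next))
			wq->tail = NULL;
		pthread_mutex_unlock(&wq->lock);
		if (!w)
			return NULL;	/* stopping and fully drained */
		hot_remove(w->dev);	/* requests run strictly one at a time */
		free(w);
	}
}

/* One event source (SCI notify or sysfs eject): it only queues a request. */
static void *request_eject(void *arg)
{
	struct fake_device *dev = arg;
	struct workqueue *wq = dev->wq;
	struct work *w = calloc(1, sizeof(*w));

	if (!w)
		return NULL;
	w->dev = dev;
	pthread_mutex_lock(&wq->lock);
	if (wq->tail)
		wq->tail->next = w;
	else
		wq->head = w;
	wq->tail = w;
	pthread_cond_signal(&wq->more);
	pthread_mutex_unlock(&wq->lock);
	return NULL;
}

/* Fire both event sources at once; returns how many removals actually ran. */
int serialized_removals(void)
{
	struct workqueue wq = { .head = NULL };
	struct fake_device dev = { &wq, 1, 0 };
	pthread_t worker, sci, sysfs;
	int rc;

	pthread_mutex_init(&wq.lock, NULL);
	pthread_cond_init(&wq.more, NULL);
	rc = pthread_create(&worker, NULL, worker_main, &wq);
	rc |= pthread_create(&sci, NULL, request_eject, &dev);
	rc |= pthread_create(&sysfs, NULL, request_eject, &dev);
	assert(rc == 0);
	pthread_join(sci, NULL);
	pthread_join(sysfs, NULL);

	pthread_mutex_lock(&wq.lock);
	wq.stopping = 1;	/* worker exits once the queue is empty */
	pthread_cond_signal(&wq.more);
	pthread_mutex_unlock(&wq.lock);
	pthread_join(worker, NULL);
	return dev.removals;
}
```

Build with -pthread. The trade-off is the one noted above: the two requests
can never run in parallel, but the second one simply finds the device already
gone, and no per-device lock has to be taken on every list access.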

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wen Congyang Nov. 13, 2012, 2:04 a.m. UTC | #2
At 11/13/2012 05:00 AM, Toshi Kani Wrote:
> On Thu, 2012-11-08 at 19:04 +0800, Wen Congyang wrote:
>> The memory device can be removed by 2 ways:
>> 1. send eject request by SCI
>> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject
>>
>> This 2 events may happen at the same time, so we may touch
>> acpi_memory_device.res_list at the same time. This patch
>> introduce a lock to protect this list.
> 
> Hi Wen,
> 
> This race condition is not unique in memory hot-remove as the sysfs
> eject interface is created for all objects with _EJ0.  For CPU
> hot-remove, I addressed this race condition by making the notify handler
> to run the hot-remove operation on kacpi_hotplug_wq by calling
> acpi_os_hotplug_execute().  This serializes the hot-remove operations
> among the two events since the sysfs eject also runs on
> kacpi_hotplug_wq.  This way is much simpler and is easy to maintain,
> although it does not allow both operations to run simultaneously (which
> I do not think we need).  Can it be used for memory hot-remove as well?

Good idea. I will update it.

Thanks
Wen Congyang

> 
> Thanks,
> -Toshi
> 
> 

Rafael Wysocki Nov. 14, 2012, 11:34 p.m. UTC | #3
On Tuesday, November 13, 2012 10:04:53 AM Wen Congyang wrote:
> At 11/13/2012 05:00 AM, Toshi Kani Wrote:
> > On Thu, 2012-11-08 at 19:04 +0800, Wen Congyang wrote:
> >> The memory device can be removed by 2 ways:
> >> 1. send eject request by SCI
> >> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject
> >>
> >> This 2 events may happen at the same time, so we may touch
> >> acpi_memory_device.res_list at the same time. This patch
> >> introduce a lock to protect this list.
> > 
> > Hi Wen,
> > 
> > This race condition is not unique in memory hot-remove as the sysfs
> > eject interface is created for all objects with _EJ0.  For CPU
> > hot-remove, I addressed this race condition by making the notify handler
> > to run the hot-remove operation on kacpi_hotplug_wq by calling
> > acpi_os_hotplug_execute().  This serializes the hot-remove operations
> > among the two events since the sysfs eject also runs on
> > kacpi_hotplug_wq.  This way is much simpler and is easy to maintain,
> > although it does not allow both operations to run simultaneously (which
> > I do not think we need).  Can it be used for memory hot-remove as well?
> 
> Good idea. I will update it.

Still waiting. :-)

But if you want that in v3.8, please repost ASAP.

Thanks,
Rafael
Wen Congyang Nov. 15, 2012, 1:24 a.m. UTC | #4
At 11/15/2012 07:34 AM, Rafael J. Wysocki Wrote:
> On Tuesday, November 13, 2012 10:04:53 AM Wen Congyang wrote:
>> At 11/13/2012 05:00 AM, Toshi Kani Wrote:
>>> On Thu, 2012-11-08 at 19:04 +0800, Wen Congyang wrote:
>>>> The memory device can be removed by 2 ways:
>>>> 1. send eject request by SCI
>>>> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject
>>>>
>>>> This 2 events may happen at the same time, so we may touch
>>>> acpi_memory_device.res_list at the same time. This patch
>>>> introduce a lock to protect this list.
>>>
>>> Hi Wen,
>>>
>>> This race condition is not unique in memory hot-remove as the sysfs
>>> eject interface is created for all objects with _EJ0.  For CPU
>>> hot-remove, I addressed this race condition by making the notify handler
>>> to run the hot-remove operation on kacpi_hotplug_wq by calling
>>> acpi_os_hotplug_execute().  This serializes the hot-remove operations
>>> among the two events since the sysfs eject also runs on
>>> kacpi_hotplug_wq.  This way is much simpler and is easy to maintain,
>>> although it does not allow both operations to run simultaneously (which
>>> I do not think we need).  Can it be used for memory hot-remove as well?
>>
>> Good idea. I will update it.
> 
> Still waiting. :-)
> 
> But if you want that in v3.8, please repost ASAP.

I expect to send it today. It is being tested now.

Thanks
Wen Congyang

> 
> Thanks,
> Rafael
> 
> 

Patch

diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 1e90e8f..4c18ee3 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -83,7 +83,8 @@  struct acpi_memory_info {
 struct acpi_memory_device {
 	struct acpi_device * device;
 	unsigned int state;	/* State of the memory device */
-	struct list_head res_list;
+	struct mutex list_lock;
+	struct list_head res_list;	/* protected by list_lock */
 };
 
 static int acpi_hotmem_initialized;
@@ -101,19 +102,23 @@  acpi_memory_get_resource(struct acpi_resource *resource, void *context)
 	    (address64.resource_type != ACPI_MEMORY_RANGE))
 		return AE_OK;
 
+	mutex_lock(&mem_device->list_lock);
 	list_for_each_entry(info, &mem_device->res_list, list) {
 		/* Can we combine the resource range information? */
 		if ((info->caching == address64.info.mem.caching) &&
 		    (info->write_protect == address64.info.mem.write_protect) &&
 		    (info->start_addr + info->length == address64.minimum)) {
 			info->length += address64.address_length;
+			mutex_unlock(&mem_device->list_lock);
 			return AE_OK;
 		}
 	}
 
 	new = kzalloc(sizeof(struct acpi_memory_info), GFP_KERNEL);
-	if (!new)
+	if (!new) {
+		mutex_unlock(&mem_device->list_lock);
 		return AE_ERROR;
+	}
 
 	INIT_LIST_HEAD(&new->list);
 	new->caching = address64.info.mem.caching;
@@ -121,6 +126,7 @@  acpi_memory_get_resource(struct acpi_resource *resource, void *context)
 	new->start_addr = address64.minimum;
 	new->length = address64.address_length;
 	list_add_tail(&new->list, &mem_device->res_list);
+	mutex_unlock(&mem_device->list_lock);
 
 	return AE_OK;
 }
@@ -138,9 +144,11 @@  acpi_memory_get_device_resources(struct acpi_memory_device *mem_device)
 	status = acpi_walk_resources(mem_device->device->handle, METHOD_NAME__CRS,
 				     acpi_memory_get_resource, mem_device);
 	if (ACPI_FAILURE(status)) {
+		mutex_lock(&mem_device->list_lock);
 		list_for_each_entry_safe(info, n, &mem_device->res_list, list)
 			kfree(info);
 		INIT_LIST_HEAD(&mem_device->res_list);
+		mutex_unlock(&mem_device->list_lock);
 		return -EINVAL;
 	}
 
@@ -236,6 +244,7 @@  static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 	 * We don't have memory-hot-add rollback function,now.
 	 * (i.e. memory-hot-remove function)
 	 */
+	mutex_lock(&mem_device->list_lock);
 	list_for_each_entry(info, &mem_device->res_list, list) {
 		if (info->enabled) { /* just sanity check...*/
 			num_enabled++;
@@ -256,6 +265,7 @@  static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 		info->enabled = 1;
 		num_enabled++;
 	}
+	mutex_unlock(&mem_device->list_lock);
 	if (!num_enabled) {
 		printk(KERN_ERR PREFIX "add_memory failed\n");
 		mem_device->state = MEMORY_INVALID_STATE;
@@ -316,14 +326,18 @@  static int acpi_memory_disable_device(struct acpi_memory_device *mem_device)
 	 * Ask the VM to offline this memory range.
 	 * Note: Assume that this function returns zero on success
 	 */
+	mutex_lock(&mem_device->list_lock);
 	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
 		if (info->enabled) {
 			result = remove_memory(info->start_addr, info->length);
-			if (result)
+			if (result) {
+				mutex_unlock(&mem_device->list_lock);
 				return result;
+			}
 		}
 		kfree(info);
 	}
+	mutex_unlock(&mem_device->list_lock);
 
 	/* Power-off and eject the device */
 	result = acpi_memory_powerdown_device(mem_device);
@@ -438,6 +452,7 @@  static int acpi_memory_device_add(struct acpi_device *device)
 	mem_device->device = device;
 	sprintf(acpi_device_name(device), "%s", ACPI_MEMORY_DEVICE_NAME);
 	sprintf(acpi_device_class(device), "%s", ACPI_MEMORY_DEVICE_CLASS);
+	mutex_init(&mem_device->list_lock);
 	device->driver_data = mem_device;
 
 	/* Get the range from the _CRS */