Message ID | 1353693037-21704-4-git-send-email-vasilis.liaskovitis@profitbricks.com (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
At 2012/11/24 1:50, Vasilis Liaskovitis Wrote: > Consider the following sequence of operations for a hotplugged memory device: > > 1. echo "PNP0C80:XX"> /sys/bus/acpi/drivers/acpi_memhotplug/unbind > 2. echo "PNP0C80:XX"> /sys/bus/acpi/drivers/acpi_memhotplug/bind > 3. echo 1>/sys/bus/pci/devices/PNP0C80:XX/eject > > The driver is successfully re-bound to the device in step 2. However step 3 will > not attempt to remove the memory. This is because the acpi_memory_info enabled > bit for the newly bound driver has not been set to 1. This bit needs to be set > in the case where the memory is already used by the kernel (add_memory returns > -EEXIST) Hmm, I think the reason is that we don't offline/remove memory when unbinding it from the driver. I have sent a patch to fix this problem, and this patch is in pm tree now. With this patch, we will offline/remove memory when unbinding it from the drriver. Consider the following sequence of operations for a hotplugged memory device: 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject If we don't offline/remove the memory, we have no chance to do it in step 2. After step2, the memory is used by the kernel, but we have powered off it. It is very dangerous. So this patch is unnecessary now. Thanks Wen Congyang > > Setting the enabled bit in this case (in acpi_memory_enable_device) makes the > driver function properly after a rebind of the driver i.e. eject operation > attempts to remove memory after a successful rebind. > > I am not sure if this breaks some other usage of the enabled bit (see commit > 65479472). When is it possible for the memory to be in use by the kernel but > not managed by the acpi driver, apart from a driver unbind scenario? > > Perhaps the patch is not needed, depending on expected semantics of re-binding. > Is the newly bound driver supposed to manage the device, if it was earlier > managed by the same driver? > > This patch is only specific to this scenario, and can be dropped from the patch > series if needed. > > Signed-off-by: Vasilis Liaskovitis<vasilis.liaskovitis@profitbricks.com> > --- > drivers/acpi/acpi_memhotplug.c | 3 +-- > 1 files changed, 1 insertions(+), 2 deletions(-) > > diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c > index d0cfbd9..0562cb4 100644 > --- a/drivers/acpi/acpi_memhotplug.c > +++ b/drivers/acpi/acpi_memhotplug.c > @@ -271,12 +271,11 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device) > continue; > } > > - if (!result) > - info->enabled = 1; > /* > * Add num_enable even if add_memory() returns -EEXIST, so the > * device is bound to this driver. > */ > + info->enabled = 1; > num_enabled++; > } > if (!num_enabled) { -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Nov 25, 2012 at 12:20:47AM +0800, Wen Congyang wrote: > At 2012/11/24 1:50, Vasilis Liaskovitis Wrote: > > Consider the following sequence of operations for a hotplugged memory device: > > > > 1. echo "PNP0C80:XX"> /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > 2. echo "PNP0C80:XX"> /sys/bus/acpi/drivers/acpi_memhotplug/bind > > 3. echo 1>/sys/bus/pci/devices/PNP0C80:XX/eject > > > > The driver is successfully re-bound to the device in step 2. However step 3 will > > not attempt to remove the memory. This is because the acpi_memory_info enabled > > bit for the newly bound driver has not been set to 1. This bit needs to be set > > in the case where the memory is already used by the kernel (add_memory returns > > -EEXIST) > > Hmm, I think the reason is that we don't offline/remove memory when > unbinding it > from the driver. I have sent a patch to fix this problem, and this patch > is in > pm tree now. With this patch, we will offline/remove memory when > unbinding it from > the drriver. ok. Which patch is this? Does it require driver-core changes? > > Consider the following sequence of operations for a hotplugged memory > device: > > 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject > > If we don't offline/remove the memory, we have no chance to do it in > step 2. After > step2, the memory is used by the kernel, but we have powered off it. It > is very > dangerous. How does power-off happen after unbind? acpi_eject_store checks for existing driver before taking any action: #ifndef FORCE_EJECT if (acpi_device->driver == NULL) { ret = -ENODEV; goto err; } #endif FORCE_EJECT is not defined afaict, so the function returns without scheduling acpi_bus_hot_remove_device. Is there another code path that calls power-off? thanks, - Vasilis -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
At 11/26/2012 04:36 PM, Vasilis Liaskovitis Wrote: > On Sun, Nov 25, 2012 at 12:20:47AM +0800, Wen Congyang wrote: >> At 2012/11/24 1:50, Vasilis Liaskovitis Wrote: >>> Consider the following sequence of operations for a hotplugged memory device: >>> >>> 1. echo "PNP0C80:XX"> /sys/bus/acpi/drivers/acpi_memhotplug/unbind >>> 2. echo "PNP0C80:XX"> /sys/bus/acpi/drivers/acpi_memhotplug/bind >>> 3. echo 1>/sys/bus/pci/devices/PNP0C80:XX/eject >>> >>> The driver is successfully re-bound to the device in step 2. However step 3 will >>> not attempt to remove the memory. This is because the acpi_memory_info enabled >>> bit for the newly bound driver has not been set to 1. This bit needs to be set >>> in the case where the memory is already used by the kernel (add_memory returns >>> -EEXIST) >> >> Hmm, I think the reason is that we don't offline/remove memory when >> unbinding it >> from the driver. I have sent a patch to fix this problem, and this patch >> is in >> pm tree now. With this patch, we will offline/remove memory when >> unbinding it from >> the drriver. > > ok. Which patch is this? Does it require driver-core changes? https://lkml.org/lkml/2012/11/15/21 Patch 1-6 is in pm tree now. > >> >> Consider the following sequence of operations for a hotplugged memory >> device: >> >> 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind >> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject >> >> If we don't offline/remove the memory, we have no chance to do it in >> step 2. After >> step2, the memory is used by the kernel, but we have powered off it. It >> is very >> dangerous. > > How does power-off happen after unbind? acpi_eject_store checks for existing > driver before taking any action: > > #ifndef FORCE_EJECT > if (acpi_device->driver == NULL) { > ret = -ENODEV; > goto err; > } > #endif > > FORCE_EJECT is not defined afaict, so the function returns without scheduling > acpi_bus_hot_remove_device. Is there another code path that calls power-off? Consider the following case: We hotremove the memory device by SCI and unbind it from the driver at the same time: CPUa CPUb acpi_memory_device_notify() unbind it from the driver acpi_bus_hot_remove_device() Thanks Wen Congyang > > thanks, > > - Vasilis > -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> >> Consider the following sequence of operations for a hotplugged memory > >> device: > >> > >> 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > >> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject > >> > >> If we don't offline/remove the memory, we have no chance to do it in > >> step 2. After > >> step2, the memory is used by the kernel, but we have powered off it. It > >> is very > >> dangerous. > > > > How does power-off happen after unbind? acpi_eject_store checks for existing > > driver before taking any action: > > > > #ifndef FORCE_EJECT > > if (acpi_device->driver == NULL) { > > ret = -ENODEV; > > goto err; > > } > > #endif > > > > FORCE_EJECT is not defined afaict, so the function returns without scheduling > > acpi_bus_hot_remove_device. Is there another code path that calls power-off? > > Consider the following case: > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > CPUa CPUb > acpi_memory_device_notify() > unbind it from the driver > acpi_bus_hot_remove_device() Can we make acpi_bus_remove() to fail if a given acpi_device is not bound with a driver? If so, can we make the unbind operation to perform unbind only? Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 26, 2012 at 05:19:01PM -0700, Toshi Kani wrote: > > >> Consider the following sequence of operations for a hotplugged memory > > >> device: > > >> > > >> 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > >> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject > > >> > > >> If we don't offline/remove the memory, we have no chance to do it in > > >> step 2. After > > >> step2, the memory is used by the kernel, but we have powered off it. It > > >> is very > > >> dangerous. > > > > > > How does power-off happen after unbind? acpi_eject_store checks for existing > > > driver before taking any action: > > > > > > #ifndef FORCE_EJECT > > > if (acpi_device->driver == NULL) { > > > ret = -ENODEV; > > > goto err; > > > } > > > #endif > > > > > > FORCE_EJECT is not defined afaict, so the function returns without scheduling > > > acpi_bus_hot_remove_device. Is there another code path that calls power-off? > > > > Consider the following case: > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > CPUa CPUb > > acpi_memory_device_notify() > > unbind it from the driver > > acpi_bus_hot_remove_device() > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > bound with a driver? If so, can we make the unbind operation to perform > unbind only? acpi_bus_remove_device could check if the driver is present, and return -ENODEV if it's not present (dev->driver == NULL). But there can still be a race between an eject and an unbind operation happening simultaneously. This seems like a general problem to me i.e. not specific to an acpi memory device. How do we ensure an eject does not race with a driver unbind for other acpi devices? Is there a per-device lock in acpi-core or device-core that can prevent this from happening? Driver core does a device_lock(dev) on all operations, but this is probably not grabbed on SCI-initiated acpi ejects. thanks, - Vasilis -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2012-11-27 at 19:32 +0100, Vasilis Liaskovitis wrote: > On Mon, Nov 26, 2012 at 05:19:01PM -0700, Toshi Kani wrote: > > > >> Consider the following sequence of operations for a hotplugged memory > > > >> device: > > > >> > > > >> 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > >> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject > > > >> > > > >> If we don't offline/remove the memory, we have no chance to do it in > > > >> step 2. After > > > >> step2, the memory is used by the kernel, but we have powered off it. It > > > >> is very > > > >> dangerous. > > > > > > > > How does power-off happen after unbind? acpi_eject_store checks for existing > > > > driver before taking any action: > > > > > > > > #ifndef FORCE_EJECT > > > > if (acpi_device->driver == NULL) { > > > > ret = -ENODEV; > > > > goto err; > > > > } > > > > #endif > > > > > > > > FORCE_EJECT is not defined afaict, so the function returns without scheduling > > > > acpi_bus_hot_remove_device. Is there another code path that calls power-off? > > > > > > Consider the following case: > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > CPUa CPUb > > > acpi_memory_device_notify() > > > unbind it from the driver > > > acpi_bus_hot_remove_device() > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > bound with a driver? If so, can we make the unbind operation to perform > > unbind only? > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > if it's not present (dev->driver == NULL). > > But there can still be a race between an eject and an unbind operation happening > simultaneously. This seems like a general problem to me i.e. not specific to an > acpi memory device. How do we ensure an eject does not race with a driver unbind > for other acpi devices? > > Is there a per-device lock in acpi-core or device-core that can prevent this from > happening? Driver core does a device_lock(dev) on all operations, but this is > probably not grabbed on SCI-initiated acpi ejects. Since driver_unbind() calls device_lock(dev->parent) before calling device_release_driver(), I am wondering if we can call device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() (i.e. before calling pre_remove) and fails if dev->driver is NULL. The parent lock is otherwise released after device_release_driver() is done. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday, November 27, 2012 03:03:47 PM Toshi Kani wrote: > On Tue, 2012-11-27 at 19:32 +0100, Vasilis Liaskovitis wrote: > > On Mon, Nov 26, 2012 at 05:19:01PM -0700, Toshi Kani wrote: > > > > >> Consider the following sequence of operations for a hotplugged memory > > > > >> device: > > > > >> > > > > >> 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > > >> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject > > > > >> > > > > >> If we don't offline/remove the memory, we have no chance to do it in > > > > >> step 2. After > > > > >> step2, the memory is used by the kernel, but we have powered off it. It > > > > >> is very > > > > >> dangerous. > > > > > > > > > > How does power-off happen after unbind? acpi_eject_store checks for existing > > > > > driver before taking any action: > > > > > > > > > > #ifndef FORCE_EJECT > > > > > if (acpi_device->driver == NULL) { > > > > > ret = -ENODEV; > > > > > goto err; > > > > > } > > > > > #endif > > > > > > > > > > FORCE_EJECT is not defined afaict, so the function returns without scheduling > > > > > acpi_bus_hot_remove_device. Is there another code path that calls power-off? > > > > > > > > Consider the following case: > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > CPUa CPUb > > > > acpi_memory_device_notify() > > > > unbind it from the driver > > > > acpi_bus_hot_remove_device() > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > bound with a driver? If so, can we make the unbind operation to perform > > > unbind only? > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > if it's not present (dev->driver == NULL). > > > > But there can still be a race between an eject and an unbind operation happening > > simultaneously. This seems like a general problem to me i.e. not specific to an > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > for other acpi devices? > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > happening? Driver core does a device_lock(dev) on all operations, but this is > > probably not grabbed on SCI-initiated acpi ejects. > > Since driver_unbind() calls device_lock(dev->parent) before calling > device_release_driver(), I am wondering if we can call > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > parent lock is otherwise released after device_release_driver() is done. I would be careful. You may introduce some subtle locking-related issues this way. Besides, there may be an alternative approach to all this. For example, what if we don't remove struct device objects on eject? The ACPI handles associated with them don't go away in that case after all, do they? Rafael
On Wed, 2012-11-28 at 00:41 +0100, Rafael J. Wysocki wrote: > On Tuesday, November 27, 2012 03:03:47 PM Toshi Kani wrote: > > On Tue, 2012-11-27 at 19:32 +0100, Vasilis Liaskovitis wrote: > > > On Mon, Nov 26, 2012 at 05:19:01PM -0700, Toshi Kani wrote: > > > > > >> Consider the following sequence of operations for a hotplugged memory > > > > > >> device: > > > > > >> > > > > > >> 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > > > >> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject > > > > > >> > > > > > >> If we don't offline/remove the memory, we have no chance to do it in > > > > > >> step 2. After > > > > > >> step2, the memory is used by the kernel, but we have powered off it. It > > > > > >> is very > > > > > >> dangerous. > > > > > > > > > > > > How does power-off happen after unbind? acpi_eject_store checks for existing > > > > > > driver before taking any action: > > > > > > > > > > > > #ifndef FORCE_EJECT > > > > > > if (acpi_device->driver == NULL) { > > > > > > ret = -ENODEV; > > > > > > goto err; > > > > > > } > > > > > > #endif > > > > > > > > > > > > FORCE_EJECT is not defined afaict, so the function returns without scheduling > > > > > > acpi_bus_hot_remove_device. Is there another code path that calls power-off? > > > > > > > > > > Consider the following case: > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > CPUa CPUb > > > > > acpi_memory_device_notify() > > > > > unbind it from the driver > > > > > acpi_bus_hot_remove_device() > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > unbind only? > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > if it's not present (dev->driver == NULL). > > > > > > But there can still be a race between an eject and an unbind operation happening > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > for other acpi devices? > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > probably not grabbed on SCI-initiated acpi ejects. > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > device_release_driver(), I am wondering if we can call > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > parent lock is otherwise released after device_release_driver() is done. > > I would be careful. You may introduce some subtle locking-related issues > this way. Right. This requires careful inspection and testing. As far as the locking is concerned, I am not keen on using fine grained locking for hot-plug. It is much simpler and solid if we serialize such operations. > Besides, there may be an alternative approach to all this. For example, > what if we don't remove struct device objects on eject? The ACPI handles > associated with them don't go away in that case after all, do they? Umm... Sorry, I am not getting your point. The issue is that we need to be able to fail a request when memory range cannot be off-lined. Otherwise, we end up ejecting online memory range. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, November 28, 2012 09:01:13 AM Toshi Kani wrote: > On Wed, 2012-11-28 at 00:41 +0100, Rafael J. Wysocki wrote: > > On Tuesday, November 27, 2012 03:03:47 PM Toshi Kani wrote: > > > On Tue, 2012-11-27 at 19:32 +0100, Vasilis Liaskovitis wrote: > > > > On Mon, Nov 26, 2012 at 05:19:01PM -0700, Toshi Kani wrote: > > > > > > >> Consider the following sequence of operations for a hotplugged memory > > > > > > >> device: > > > > > > >> > > > > > > >> 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > > > > >> 2. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject > > > > > > >> > > > > > > >> If we don't offline/remove the memory, we have no chance to do it in > > > > > > >> step 2. After > > > > > > >> step2, the memory is used by the kernel, but we have powered off it. It > > > > > > >> is very > > > > > > >> dangerous. > > > > > > > > > > > > > > How does power-off happen after unbind? acpi_eject_store checks for existing > > > > > > > driver before taking any action: > > > > > > > > > > > > > > #ifndef FORCE_EJECT > > > > > > > if (acpi_device->driver == NULL) { > > > > > > > ret = -ENODEV; > > > > > > > goto err; > > > > > > > } > > > > > > > #endif > > > > > > > > > > > > > > FORCE_EJECT is not defined afaict, so the function returns without scheduling > > > > > > > acpi_bus_hot_remove_device. Is there another code path that calls power-off? > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > CPUa CPUb > > > > > > acpi_memory_device_notify() > > > > > > unbind it from the driver > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > unbind only? > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > if it's not present (dev->driver == NULL). > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > for other acpi devices? > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > device_release_driver(), I am wondering if we can call > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > parent lock is otherwise released after device_release_driver() is done. > > > > I would be careful. You may introduce some subtle locking-related issues > > this way. > > Right. This requires careful inspection and testing. As far as the > locking is concerned, I am not keen on using fine grained locking for > hot-plug. It is much simpler and solid if we serialize such operations. > > > Besides, there may be an alternative approach to all this. For example, > > what if we don't remove struct device objects on eject? The ACPI handles > > associated with them don't go away in that case after all, do they? > > Umm... Sorry, I am not getting your point. The issue is that we need > to be able to fail a request when memory range cannot be off-lined. > Otherwise, we end up ejecting online memory range. Yes, this is the major one. The minor issue, however, is a race condition between unbinding a driver from a device and removing the device if I understand it correctly. Which will go away automatically if the device is not removed in the first place. Or so I would think. :-) Thanks, Rafael
> > > > > > > Consider the following case: > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > CPUa CPUb > > > > > > > acpi_memory_device_notify() > > > > > > > unbind it from the driver > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > unbind only? > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > for other acpi devices? > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > device_release_driver(), I am wondering if we can call > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > this way. > > > > Right. This requires careful inspection and testing. As far as the > > locking is concerned, I am not keen on using fine grained locking for > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > Besides, there may be an alternative approach to all this. For example, > > > what if we don't remove struct device objects on eject? The ACPI handles > > > associated with them don't go away in that case after all, do they? > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > to be able to fail a request when memory range cannot be off-lined. > > Otherwise, we end up ejecting online memory range. > > Yes, this is the major one. The minor issue, however, is a race condition > between unbinding a driver from a device and removing the device if I > understand it correctly. Which will go away automatically if the device is > not removed in the first place. Or so I would think. :-) I see. I do not think whether or not the device is removed on eject makes any difference here. The issue is that after driver_unbind() is done, acpi_bus_hot_remove_device() no longer calls the ACPI memory driver (hence, it cannot fail in prepare_remove), and goes ahead to call _EJ0. If driver_unbind() did off-line the memory, this is OK. However, it cannot off-line kernel memory ranges. So, we basically need to either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run during the operation. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2012-11-28 at 22:40 +0100, Rafael J. Wysocki wrote: > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > unbind it from the driver > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > > unbind only? > > > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > > for other acpi devices? > > > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > > device_release_driver(), I am wondering if we can call > > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > > this way. > > > > > > > > Right. This requires careful inspection and testing. As far as the > > > > locking is concerned, I am not keen on using fine grained locking for > > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > > associated with them don't go away in that case after all, do they? > > > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > > to be able to fail a request when memory range cannot be off-lined. > > > > Otherwise, we end up ejecting online memory range. > > > > > > Yes, this is the major one. The minor issue, however, is a race condition > > > between unbinding a driver from a device and removing the device if I > > > understand it correctly. Which will go away automatically if the device is > > > not removed in the first place. Or so I would think. :-) > > > > I see. I do not think whether or not the device is removed on eject > > makes any difference here. The issue is that after driver_unbind() is > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > > it cannot off-line kernel memory ranges. So, we basically need to > > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > during the operation. > > OK, I see the problem now. > > What exactly is triggering the driver_unbind() in this scenario? User can request driver_unbind() from sysfs as follows. I do not see much reason why user has to do for memory, though. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > acpi_memory_device_notify() > > > > > > > > unbind it from the driver > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > unbind only? > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > for other acpi devices? > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > device_release_driver(), I am wondering if we can call > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > this way. > > > > > > Right. This requires careful inspection and testing. As far as the > > > locking is concerned, I am not keen on using fine grained locking for > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > associated with them don't go away in that case after all, do they? > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > to be able to fail a request when memory range cannot be off-lined. > > > Otherwise, we end up ejecting online memory range. > > > > Yes, this is the major one. The minor issue, however, is a race condition > > between unbinding a driver from a device and removing the device if I > > understand it correctly. Which will go away automatically if the device is > > not removed in the first place. Or so I would think. :-) > > I see. I do not think whether or not the device is removed on eject > makes any difference here. The issue is that after driver_unbind() is > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > it cannot off-line kernel memory ranges. So, we basically need to > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > during the operation. OK, I see the problem now. What exactly is triggering the driver_unbind() in this scenario? Rafael
On Wednesday, November 28, 2012 02:40:09 PM Toshi Kani wrote: > On Wed, 2012-11-28 at 22:40 +0100, Rafael J. Wysocki wrote: > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > unbind it from the driver > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > > > unbind only? > > > > > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > > > for other acpi devices? > > > > > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > > > device_release_driver(), I am wondering if we can call > > > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > > > this way. > > > > > > > > > > Right. This requires careful inspection and testing. As far as the > > > > > locking is concerned, I am not keen on using fine grained locking for > > > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > > > associated with them don't go away in that case after all, do they? > > > > > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > > > to be able to fail a request when memory range cannot be off-lined. > > > > > Otherwise, we end up ejecting online memory range. > > > > > > > > Yes, this is the major one. The minor issue, however, is a race condition > > > > between unbinding a driver from a device and removing the device if I > > > > understand it correctly. Which will go away automatically if the device is > > > > not removed in the first place. Or so I would think. :-) > > > > > > I see. I do not think whether or not the device is removed on eject > > > makes any difference here. The issue is that after driver_unbind() is > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > > > it cannot off-line kernel memory ranges. So, we basically need to > > > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > > > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > during the operation. > > > > OK, I see the problem now. > > > > What exactly is triggering the driver_unbind() in this scenario? > > User can request driver_unbind() from sysfs as follows. I do not see > much reason why user has to do for memory, though. > > echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind This is wrong. Even if we want to permit user space to forcibly unbind drivers from anything like this, we should at least check for some situations in which it is plain dangerous. Like in this case. So I think the above should fail unless we know that the driver won't be necessary to handle hot-removal of memory. Alternatively, this may actually try to carry out the hot-removal and only call driver_unbind() if that succeeds. Whichever is preferable, I'd say. Thanks, Rafael
On Wed, 2012-11-28 at 23:01 +0100, Rafael J. Wysocki wrote: > On Wednesday, November 28, 2012 02:40:09 PM Toshi Kani wrote: > > On Wed, 2012-11-28 at 22:40 +0100, Rafael J. Wysocki wrote: > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > > > > unbind only? > > > > > > > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > > > > for other acpi devices? > > > > > > > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > > > > device_release_driver(), I am wondering if we can call > > > > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > > > > this way. > > > > > > > > > > > > Right. This requires careful inspection and testing. As far as the > > > > > > locking is concerned, I am not keen on using fine grained locking for > > > > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > > > > associated with them don't go away in that case after all, do they? > > > > > > > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > > > > to be able to fail a request when memory range cannot be off-lined. > > > > > > Otherwise, we end up ejecting online memory range. > > > > > > > > > > Yes, this is the major one. The minor issue, however, is a race condition > > > > > between unbinding a driver from a device and removing the device if I > > > > > understand it correctly. Which will go away automatically if the device is > > > > > not removed in the first place. Or so I would think. :-) > > > > > > > > I see. I do not think whether or not the device is removed on eject > > > > makes any difference here. The issue is that after driver_unbind() is > > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > > > > it cannot off-line kernel memory ranges. So, we basically need to > > > > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > > > > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > > during the operation. > > > > > > OK, I see the problem now. > > > > > > What exactly is triggering the driver_unbind() in this scenario? > > > > User can request driver_unbind() from sysfs as follows. I do not see > > much reason why user has to do for memory, though. > > > > echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > This is wrong. Even if we want to permit user space to forcibly unbind > drivers from anything like this, we should at least check for some > situations in which it is plain dangerous. Like in this case. So I think > the above should fail unless we know that the driver won't be necessary > to handle hot-removal of memory. Well, we tried twice already... :) https://lkml.org/lkml/2012/11/16/649 > Alternatively, this may actually try to carry out the hot-removal and only > call driver_unbind() if that succeeds. Whichever is preferable, I'd say. Greg clarified in the above link that this interface is "unbind", not remove. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > > I see. I do not think whether or not the device is removed on eject > > > > > > makes any difference here. The issue is that after driver_unbind() is > > > > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > > > > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > > > > > > it cannot off-line kernel memory ranges. So, we basically need to > > > > > > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > > > > > > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > > > > during the operation. > > > > > > > > > > OK, I see the problem now. > > > > > > > > > > What exactly is triggering the driver_unbind() in this scenario? > > > > > > > > User can request driver_unbind() from sysfs as follows. I do not see > > > > much reason why user has to do for memory, though. > > > > > > > > echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > > > > This is wrong. Even if we want to permit user space to forcibly unbind > > > drivers from anything like this, we should at least check for some > > > situations in which it is plain dangerous. Like in this case. So I think > > > the above should fail unless we know that the driver won't be necessary > > > to handle hot-removal of memory. > > > > Well, we tried twice already... :) > > https://lkml.org/lkml/2012/11/16/649 > > I didn't mean driver_unbind() should fail. The code path that executes > driver_unbind() eventually should fail _before_ executing it. driver_unbind() is the handler, so it is called directly from this unbind interface. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, November 28, 2012 03:04:52 PM Toshi Kani wrote: > On Wed, 2012-11-28 at 23:01 +0100, Rafael J. Wysocki wrote: > > On Wednesday, November 28, 2012 02:40:09 PM Toshi Kani wrote: > > > On Wed, 2012-11-28 at 22:40 +0100, Rafael J. Wysocki wrote: > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > > > > > unbind only? > > > > > > > > > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > > > > > for other acpi devices? > > > > > > > > > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > > > > > device_release_driver(), I am wondering if we can call > > > > > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > > > > > this way. > > > > > > > > > > > > > > Right. This requires careful inspection and testing. As far as the > > > > > > > locking is concerned, I am not keen on using fine grained locking for > > > > > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > > > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > > > > > associated with them don't go away in that case after all, do they? > > > > > > > > > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > > > > > to be able to fail a request when memory range cannot be off-lined. > > > > > > > Otherwise, we end up ejecting online memory range. > > > > > > > > > > > > Yes, this is the major one. The minor issue, however, is a race condition > > > > > > between unbinding a driver from a device and removing the device if I > > > > > > understand it correctly. Which will go away automatically if the device is > > > > > > not removed in the first place. Or so I would think. :-) > > > > > > > > > > I see. I do not think whether or not the device is removed on eject > > > > > makes any difference here. The issue is that after driver_unbind() is > > > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > > > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > > > > > it cannot off-line kernel memory ranges. So, we basically need to > > > > > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > > > > > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > > > during the operation. > > > > > > > > OK, I see the problem now. > > > > > > > > What exactly is triggering the driver_unbind() in this scenario? > > > > > > User can request driver_unbind() from sysfs as follows. I do not see > > > much reason why user has to do for memory, though. > > > > > > echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > > This is wrong. Even if we want to permit user space to forcibly unbind > > drivers from anything like this, we should at least check for some > > situations in which it is plain dangerous. Like in this case. So I think > > the above should fail unless we know that the driver won't be necessary > > to handle hot-removal of memory. > > Well, we tried twice already... :) > https://lkml.org/lkml/2012/11/16/649 I didn't mean driver_unbind() should fail. The code path that executes driver_unbind() eventually should fail _before_ executing it. > > Alternatively, this may actually try to carry out the hot-removal and only > > call driver_unbind() if that succeeds. Whichever is preferable, I'd say. > > Greg clarified in the above link that this interface is "unbind", not > remove. OK, so that's clear. Thanks, Rafael
On Wednesday, November 28, 2012 03:16:22 PM Toshi Kani wrote: > > > > > > > I see. I do not think whether or not the device is removed on eject > > > > > > > makes any difference here. The issue is that after driver_unbind() is > > > > > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > > > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > > > > > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > > > > > > > it cannot off-line kernel memory ranges. So, we basically need to > > > > > > > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > > > > > > > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > > > > > during the operation. > > > > > > > > > > > > OK, I see the problem now. > > > > > > > > > > > > What exactly is triggering the driver_unbind() in this scenario? > > > > > > > > > > User can request driver_unbind() from sysfs as follows. I do not see > > > > > much reason why user has to do for memory, though. > > > > > > > > > > echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > > > > > > This is wrong. Even if we want to permit user space to forcibly unbind > > > > drivers from anything like this, we should at least check for some > > > > situations in which it is plain dangerous. Like in this case. So I think > > > > the above should fail unless we know that the driver won't be necessary > > > > to handle hot-removal of memory. > > > > > > Well, we tried twice already... :) > > > https://lkml.org/lkml/2012/11/16/649 > > > > I didn't mean driver_unbind() should fail. The code path that executes > > driver_unbind() eventually should fail _before_ executing it. > > driver_unbind() is the handler, so it is called directly from this > unbind interface. Yes, sorry for the confusion. So, it looks like the driver core wants us to handle driver unbinding no matter what. This pretty much means that it is a bad idea to have a driver that is exposed as a "device driver" in sysfs for memory hotplugging. Thanks, Rafael
On Wed, Nov 28, 2012 at 11:39:22PM +0100, Rafael J. Wysocki wrote: > On Wednesday, November 28, 2012 03:16:22 PM Toshi Kani wrote: > > > > > > > > I see. I do not think whether or not the device is removed on eject > > > > > > > > makes any difference here. The issue is that after driver_unbind() is > > > > > > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > > > > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > > > > > > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > > > > > > > > it cannot off-line kernel memory ranges. So, we basically need to > > > > > > > > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > > > > > > > > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > > > > > > during the operation. > > > > > > > > > > > > > > OK, I see the problem now. > > > > > > > > > > > > > > What exactly is triggering the driver_unbind() in this scenario? > > > > > > > > > > > > User can request driver_unbind() from sysfs as follows. I do not see > > > > > > much reason why user has to do for memory, though. > > > > > > > > > > > > echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > > > > > > > > This is wrong. Even if we want to permit user space to forcibly unbind > > > > > drivers from anything like this, we should at least check for some > > > > > situations in which it is plain dangerous. Like in this case. So I think > > > > > the above should fail unless we know that the driver won't be necessary > > > > > to handle hot-removal of memory. > > > > > > > > Well, we tried twice already... :) > > > > https://lkml.org/lkml/2012/11/16/649 > > > > > > I didn't mean driver_unbind() should fail. The code path that executes > > > driver_unbind() eventually should fail _before_ executing it. > > > > driver_unbind() is the handler, so it is called directly from this > > unbind interface. > > Yes, sorry for the confusion. > > So, it looks like the driver core wants us to handle driver unbinding no > matter what. Yes. Well, the driver core does the unbinding no matter what, if it was told, by a user, to do so. Why is that a problem? The user then is responsible for any bad things (i.e. not able to control the device any more), if they do so. > This pretty much means that it is a bad idea to have a driver that is > exposed as a "device driver" in sysfs for memory hotplugging. Again, why? All this means is that the driver is now not connected to the device (memory in this case.) The memory is still there, still operates as before, only difference is, the driver can't touch it anymore. This is the same for any ACPI driver, and has been for years. Please don't confuse unbind with any "normal" system operation, it is not to be used for memory hotplug, or anything else like this. Also, if you really do not want to do this, turn off the ability to unbind/bind for these devices, that is under your control in your bus logic. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, November 28, 2012 02:46:33 PM Greg KH wrote: > On Wed, Nov 28, 2012 at 11:39:22PM +0100, Rafael J. Wysocki wrote: > > On Wednesday, November 28, 2012 03:16:22 PM Toshi Kani wrote: > > > > > > > > > I see. I do not think whether or not the device is removed on eject > > > > > > > > > makes any difference here. The issue is that after driver_unbind() is > > > > > > > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > > > > > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > > > > > > > _EJ0. If driver_unbind() did off-line the memory, this is OK. However, > > > > > > > > > it cannot off-line kernel memory ranges. So, we basically need to > > > > > > > > > either 1) serialize acpi_bus_hot_remove_device() and driver_unbind(), or > > > > > > > > > 2) make acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > > > > > > > during the operation. > > > > > > > > > > > > > > > > OK, I see the problem now. > > > > > > > > > > > > > > > > What exactly is triggering the driver_unbind() in this scenario? > > > > > > > > > > > > > > User can request driver_unbind() from sysfs as follows. I do not see > > > > > > > much reason why user has to do for memory, though. > > > > > > > > > > > > > > echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind > > > > > > > > > > > > This is wrong. Even if we want to permit user space to forcibly unbind > > > > > > drivers from anything like this, we should at least check for some > > > > > > situations in which it is plain dangerous. Like in this case. So I think > > > > > > the above should fail unless we know that the driver won't be necessary > > > > > > to handle hot-removal of memory. > > > > > > > > > > Well, we tried twice already... :) > > > > > https://lkml.org/lkml/2012/11/16/649 > > > > > > > > I didn't mean driver_unbind() should fail. The code path that executes > > > > driver_unbind() eventually should fail _before_ executing it. > > > > > > driver_unbind() is the handler, so it is called directly from this > > > unbind interface. > > > > Yes, sorry for the confusion. > > > > So, it looks like the driver core wants us to handle driver unbinding no > > matter what. > > Yes. Well, the driver core does the unbinding no matter what, if it was > told, by a user, to do so. Why is that a problem? The user then is > responsible for any bad things (i.e. not able to control the device any > more), if they do so. I don't really agree with that, because the user may simply not know what the consequences of that will be. In my not so humble opinion any interface allowing user space to crash the kernel is a bad one. And this is an example of that. > > This pretty much means that it is a bad idea to have a driver that is > > exposed as a "device driver" in sysfs for memory hotplugging. > > Again, why? All this means is that the driver is now not connected to > the device (memory in this case.) The memory is still there, still > operates as before, only difference is, the driver can't touch it > anymore. > > This is the same for any ACPI driver, and has been for years. Except that if this driver has been unbound and the removal is triggered by an SCI, the core will just go on and remove the memory, although it may be killing the kernel this way. Arguably, this may be considered as the core's fault, but the only way to fix that would be to move the code from that driver into the core and not to register it as a "driver" any more. Which was my point. :-) > Please don't confuse unbind with any "normal" system operation, it is > not to be used for memory hotplug, or anything else like this. > > Also, if you really do not want to do this, turn off the ability to > unbind/bind for these devices, that is under your control in your bus > logic. OK, but how? I'm looking at driver_unbind() and not seeing any way to do that actually. Thanks, Rafael
On Thu, Nov 29, 2012 at 12:05:20AM +0100, Rafael J. Wysocki wrote: > On Wednesday, November 28, 2012 02:46:33 PM Greg KH wrote: > > > So, it looks like the driver core wants us to handle driver unbinding no > > > matter what. > > > > Yes. Well, the driver core does the unbinding no matter what, if it was > > told, by a user, to do so. Why is that a problem? The user then is > > responsible for any bad things (i.e. not able to control the device any > > more), if they do so. > > I don't really agree with that, because the user may simply not know what > the consequences of that will be. In my not so humble opinion any interface > allowing user space to crash the kernel is a bad one. And this is an example > of that. This has been in place since 2005, over 7 years now, and I have never heard any problems with it being used to crash the kernel, despite the easy ability for people to unbind all of their devices from drivers and instantly cause a system hang. So really doubt this is a problem in real life :) > > > This pretty much means that it is a bad idea to have a driver that is > > > exposed as a "device driver" in sysfs for memory hotplugging. > > > > Again, why? All this means is that the driver is now not connected to > > the device (memory in this case.) The memory is still there, still > > operates as before, only difference is, the driver can't touch it > > anymore. > > > > This is the same for any ACPI driver, and has been for years. > > Except that if this driver has been unbound and the removal is triggered by > an SCI, the core will just go on and remove the memory, although it may > be killing the kernel this way. Why would memory go away if a driver is unbound from a device? The device didn't go away. It's the same if the driver was a module and it was unloaded, you should not turn memory off in that situation, right? Are you also going to prevent module unloading of this driver? > Arguably, this may be considered as the core's fault, but the only way to > fix that would be to move the code from that driver into the core and not to > register it as a "driver" any more. Which was my point. :-) No, I think people are totally overreacting to the unbind/bind files, which are there to aid in development, and in adding new device ids to drivers, as well as sometimes doing a hacky revoke() call. > > Please don't confuse unbind with any "normal" system operation, it is > > not to be used for memory hotplug, or anything else like this. > > > > Also, if you really do not want to do this, turn off the ability to > > unbind/bind for these devices, that is under your control in your bus > > logic. > > OK, but how? I'm looking at driver_unbind() and not seeing any way to do > that actually. See the suppress_bind_attrs field in struct device_driver. It's even documented in device.h, but sadly, no one reads documentation :) I recommend you set this field if you don't want the bind/unbind files to show up for your memory driver, although I would argue that the driver needs to be fixed up to not do foolish things like removing memory from a system unless it really does go away... hope this helps, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, November 28, 2012 03:10:46 PM Greg KH wrote: > On Thu, Nov 29, 2012 at 12:05:20AM +0100, Rafael J. Wysocki wrote: > > On Wednesday, November 28, 2012 02:46:33 PM Greg KH wrote: > > > > So, it looks like the driver core wants us to handle driver unbinding no > > > > matter what. > > > > > > Yes. Well, the driver core does the unbinding no matter what, if it was > > > told, by a user, to do so. Why is that a problem? The user then is > > > responsible for any bad things (i.e. not able to control the device any > > > more), if they do so. > > > > I don't really agree with that, because the user may simply not know what > > the consequences of that will be. In my not so humble opinion any interface > > allowing user space to crash the kernel is a bad one. And this is an example > > of that. > > This has been in place since 2005, over 7 years now, and I have never > heard any problems with it being used to crash the kernel, despite the > easy ability for people to unbind all of their devices from drivers and > instantly cause a system hang. So really doubt this is a problem in > real life :) > > > > > This pretty much means that it is a bad idea to have a driver that is > > > > exposed as a "device driver" in sysfs for memory hotplugging. > > > > > > Again, why? All this means is that the driver is now not connected to > > > the device (memory in this case.) The memory is still there, still > > > operates as before, only difference is, the driver can't touch it > > > anymore. > > > > > > This is the same for any ACPI driver, and has been for years. > > > > Except that if this driver has been unbound and the removal is triggered by > > an SCI, the core will just go on and remove the memory, although it may > > be killing the kernel this way. > > Why would memory go away if a driver is unbound from a device? The > device didn't go away. It's the same if the driver was a module and it > was unloaded, you should not turn memory off in that situation, right? Right. It looks like there's some confusion about the role of .remove() in the ACPI subsystem, but I need to investigate it a bit more. > Are you also going to prevent module unloading of this driver? I'm not sure what I'm going to do with that driver at the moment to be honest. :-) > > Arguably, this may be considered as the core's fault, but the only way to > > fix that would be to move the code from that driver into the core and not to > > register it as a "driver" any more. Which was my point. :-) > > No, I think people are totally overreacting to the unbind/bind files, > which are there to aid in development, and in adding new device ids to > drivers, as well as sometimes doing a hacky revoke() call. > > > > Please don't confuse unbind with any "normal" system operation, it is > > > not to be used for memory hotplug, or anything else like this. > > > > > > Also, if you really do not want to do this, turn off the ability to > > > unbind/bind for these devices, that is under your control in your bus > > > logic. > > > > OK, but how? I'm looking at driver_unbind() and not seeing any way to do > > that actually. > > See the suppress_bind_attrs field in struct device_driver. It's even > documented in device.h, but sadly, no one reads documentation :) That's good to know, thanks. :-) And if I knew I could find that information in device.h, I'd look in there. > I recommend you set this field if you don't want the bind/unbind files > to show up for your memory driver, although I would argue that the > driver needs to be fixed up to not do foolish things like removing > memory from a system unless it really does go away... Quite frankly, I need to look at that driver and how things are supposed to work more thoroughly, because I don't seem to see a reason to do various things the way they are done. Well, maybe it's just me. Thanks, Rafael
On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > acpi_memory_device_notify() > > > > > > > > unbind it from the driver > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > unbind only? > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > for other acpi devices? > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > device_release_driver(), I am wondering if we can call > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > this way. > > > > > > Right. This requires careful inspection and testing. As far as the > > > locking is concerned, I am not keen on using fine grained locking for > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > associated with them don't go away in that case after all, do they? > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > to be able to fail a request when memory range cannot be off-lined. > > > Otherwise, we end up ejecting online memory range. > > > > Yes, this is the major one. The minor issue, however, is a race condition > > between unbinding a driver from a device and removing the device if I > > understand it correctly. Which will go away automatically if the device is > > not removed in the first place. Or so I would think. :-) > > I see. I do not think whether or not the device is removed on eject > makes any difference here. The issue is that after driver_unbind() is > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > _EJ0. I see two reasons for calling acpi_bus_hot_remove_device() for memory (correct me if I'm wrong): (1) from the memhotplug driver's notify handler and (2) from acpi_eject_store() which is exposed through sysfs. If we disabled exposing acpi_eject_store() for memory devices, then the only way would be from the notify handler. So I wonder if driver_unbind() shouldn't just uninstall the notify handler for memory (so that memory eject events are simply dropped on the floor after unbinding the driver)? Rafael
On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > unbind it from the driver > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > > unbind only? > > > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > > for other acpi devices? > > > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > > device_release_driver(), I am wondering if we can call > > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > > this way. > > > > > > > > Right. This requires careful inspection and testing. As far as the > > > > locking is concerned, I am not keen on using fine grained locking for > > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > > associated with them don't go away in that case after all, do they? > > > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > > to be able to fail a request when memory range cannot be off-lined. > > > > Otherwise, we end up ejecting online memory range. > > > > > > Yes, this is the major one. The minor issue, however, is a race condition > > > between unbinding a driver from a device and removing the device if I > > > understand it correctly. Which will go away automatically if the device is > > > not removed in the first place. Or so I would think. :-) > > > > I see. I do not think whether or not the device is removed on eject > > makes any difference here. The issue is that after driver_unbind() is > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > _EJ0. > > I see two reasons for calling acpi_bus_hot_remove_device() for memory (correct > me if I'm wrong): (1) from the memhotplug driver's notify handler and (2) from > acpi_eject_store() which is exposed through sysfs. Yes, that is correct. > If we disabled exposing > acpi_eject_store() for memory devices, then the only way would be from the > notify handler. So I wonder if driver_unbind() shouldn't just uninstall the > notify handler for memory (so that memory eject events are simply dropped on > the floor after unbinding the driver)? If driver_unbind() happens before an eject request, we do not have a problem. acpi_eject_store() fails if a driver is not bound to the device. acpi_memory_device_notify() fails as well. The race condition Wen pointed out (see the top of this email) is that driver_unbind() may come in while eject operation is in-progress. This is why I mentioned the following in previous email. > So, we basically need to either 1) serialize > acpi_bus_hot_remove_device() and driver_unbind(), or 2) make > acpi_bus_hot_remove_device() to fail if driver_unbind() is run > during the operation. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > unbind it from the driver > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > > > unbind only? > > > > > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > > > for other acpi devices? > > > > > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > > > device_release_driver(), I am wondering if we can call > > > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > > > this way. > > > > > > > > > > Right. This requires careful inspection and testing. As far as the > > > > > locking is concerned, I am not keen on using fine grained locking for > > > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > > > associated with them don't go away in that case after all, do they? > > > > > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > > > to be able to fail a request when memory range cannot be off-lined. > > > > > Otherwise, we end up ejecting online memory range. > > > > > > > > Yes, this is the major one. The minor issue, however, is a race condition > > > > between unbinding a driver from a device and removing the device if I > > > > understand it correctly. Which will go away automatically if the device is > > > > not removed in the first place. Or so I would think. :-) > > > > > > I see. I do not think whether or not the device is removed on eject > > > makes any difference here. The issue is that after driver_unbind() is > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > _EJ0. > > > > I see two reasons for calling acpi_bus_hot_remove_device() for memory (correct > > me if I'm wrong): (1) from the memhotplug driver's notify handler and (2) from > > acpi_eject_store() which is exposed through sysfs. > > Yes, that is correct. > > > If we disabled exposing > > acpi_eject_store() for memory devices, then the only way would be from the > > notify handler. So I wonder if driver_unbind() shouldn't just uninstall the > > notify handler for memory (so that memory eject events are simply dropped on > > the floor after unbinding the driver)? > > If driver_unbind() happens before an eject request, we do not have a > problem. acpi_eject_store() fails if a driver is not bound to the > device. acpi_memory_device_notify() fails as well. > > The race condition Wen pointed out (see the top of this email) is that > driver_unbind() may come in while eject operation is in-progress. This > is why I mentioned the following in previous email. > > > So, we basically need to either 1) serialize > > acpi_bus_hot_remove_device() and driver_unbind(), or 2) make > > acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > during the operation. Forgot to mention. The 3rd option is what Greg said -- use the suppress_bind_attrs field. I think this is a good option to address this race condition for now. For a long term solution, we should have a better infrastructure in place to address such issue in general. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, November 28, 2012 06:15:42 PM Toshi Kani wrote: > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > > > > > Can we make acpi_bus_remove() to fail if a given acpi_device is not > > > > > > > > > > bound with a driver? If so, can we make the unbind operation to perform > > > > > > > > > > unbind only? > > > > > > > > > > > > > > > > > > acpi_bus_remove_device could check if the driver is present, and return -ENODEV > > > > > > > > > if it's not present (dev->driver == NULL). > > > > > > > > > > > > > > > > > > But there can still be a race between an eject and an unbind operation happening > > > > > > > > > simultaneously. This seems like a general problem to me i.e. not specific to an > > > > > > > > > acpi memory device. How do we ensure an eject does not race with a driver unbind > > > > > > > > > for other acpi devices? > > > > > > > > > > > > > > > > > > Is there a per-device lock in acpi-core or device-core that can prevent this from > > > > > > > > > happening? Driver core does a device_lock(dev) on all operations, but this is > > > > > > > > > probably not grabbed on SCI-initiated acpi ejects. > > > > > > > > > > > > > > > > Since driver_unbind() calls device_lock(dev->parent) before calling > > > > > > > > device_release_driver(), I am wondering if we can call > > > > > > > > device_lock(dev->dev->parent) at the beginning of acpi_bus_remove() > > > > > > > > (i.e. before calling pre_remove) and fails if dev->driver is NULL. The > > > > > > > > parent lock is otherwise released after device_release_driver() is done. > > > > > > > > > > > > > > I would be careful. You may introduce some subtle locking-related issues > > > > > > > this way. > > > > > > > > > > > > Right. This requires careful inspection and testing. As far as the > > > > > > locking is concerned, I am not keen on using fine grained locking for > > > > > > hot-plug. It is much simpler and solid if we serialize such operations. > > > > > > > > > > > > > Besides, there may be an alternative approach to all this. For example, > > > > > > > what if we don't remove struct device objects on eject? The ACPI handles > > > > > > > associated with them don't go away in that case after all, do they? > > > > > > > > > > > > Umm... Sorry, I am not getting your point. The issue is that we need > > > > > > to be able to fail a request when memory range cannot be off-lined. > > > > > > Otherwise, we end up ejecting online memory range. > > > > > > > > > > Yes, this is the major one. The minor issue, however, is a race condition > > > > > between unbinding a driver from a device and removing the device if I > > > > > understand it correctly. Which will go away automatically if the device is > > > > > not removed in the first place. Or so I would think. :-) > > > > > > > > I see. I do not think whether or not the device is removed on eject > > > > makes any difference here. The issue is that after driver_unbind() is > > > > done, acpi_bus_hot_remove_device() no longer calls the ACPI memory > > > > driver (hence, it cannot fail in prepare_remove), and goes ahead to call > > > > _EJ0. > > > > > > I see two reasons for calling acpi_bus_hot_remove_device() for memory (correct > > > me if I'm wrong): (1) from the memhotplug driver's notify handler and (2) from > > > acpi_eject_store() which is exposed through sysfs. > > > > Yes, that is correct. > > > > > If we disabled exposing > > > acpi_eject_store() for memory devices, then the only way would be from the > > > notify handler. So I wonder if driver_unbind() shouldn't just uninstall the > > > notify handler for memory (so that memory eject events are simply dropped on > > > the floor after unbinding the driver)? > > > > If driver_unbind() happens before an eject request, we do not have a > > problem. acpi_eject_store() fails if a driver is not bound to the > > device. acpi_memory_device_notify() fails as well. > > > > The race condition Wen pointed out (see the top of this email) is that > > driver_unbind() may come in while eject operation is in-progress. This > > is why I mentioned the following in previous email. > > > > > So, we basically need to either 1) serialize > > > acpi_bus_hot_remove_device() and driver_unbind(), or 2) make > > > acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > during the operation. > > Forgot to mention. The 3rd option is what Greg said -- use the > suppress_bind_attrs field. I think this is a good option to address > this race condition for now. For a long term solution, we should have a > better infrastructure in place to address such issue in general. Well, in the meantime I've had a look at acpi_bus_hot_remove_device() and friends and I think there's a way to address all of these problems without big redesign (for now). First, why don't we introduce an ACPI device flag (in the flags field of struct acpi_device) called eject_forbidden or something like this such that: (1) It will be clear by default. (2) It may only be set by a driver's .add() routine if necessary. (3) Once set, it may only be cleared by the driver's .remove() routine if it's safe to physically remove the device after the .remove(). Then, after the .remove() (which must be successful) has returned, and the flag is set, it will tell acpi_bus_remove() to return a specific error code (such as -EBUSY or -EAGAIN). It doesn't matter if .remove() was called earlier, because if it left the flag set, there's no way to clear it afterward and acpi_bus_remove() will see it set anyway. I think the struct acpi_device should be unregistered anyway if that error code is to be returned. [By the way, do you know where we free the memory allocated for struct acpi_device objects?] Now if acpi_bus_trim() gets that error code from acpi_bus_remove(), it should store it, but continue the trimming normally and finally it should return that error code to acpi_bus_hot_remove_device(). Now, if acpi_bus_hot_remove_device() gets that error code, it should just reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device we attempted to eject) and notify the firmware about the failure. If we have that, then the memory hotplug driver would only need to set flags.eject_forbidden in its .add() routine and make its .remove() routine only clear that flag if it is safe to actually remove the memory. Does this make sense to you? [BTW, using _PS3 in acpi_bus_hot_remove_device() directly to power off the device is a nonsense, because this method is not guaranteed to turn the power off in the first place (it may just put the device into D3hot). If anything, acpi_device_set_power() should be used for that, but even that is not guaranteed to actually remove the power (power resources may be shared with other devices, so in fact that operation should be done by acpi_bus_trim() for each of the trimmed devices.] Thanks, Rafael
Hi, On Wed, Nov 28, 2012 at 06:15:42PM -0700, Toshi Kani wrote: > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > acpi_bus_hot_remove_device() > > > I see two reasons for calling acpi_bus_hot_remove_device() for memory (correct > > > me if I'm wrong): (1) from the memhotplug driver's notify handler and (2) from > > > acpi_eject_store() which is exposed through sysfs. > > > > Yes, that is correct. > > > > > If we disabled exposing > > > acpi_eject_store() for memory devices, then the only way would be from the > > > notify handler. So I wonder if driver_unbind() shouldn't just uninstall the > > > notify handler for memory (so that memory eject events are simply dropped on > > > the floor after unbinding the driver)? > > > > If driver_unbind() happens before an eject request, we do not have a > > problem. acpi_eject_store() fails if a driver is not bound to the > > device. acpi_memory_device_notify() fails as well. > > > > The race condition Wen pointed out (see the top of this email) is that > > driver_unbind() may come in while eject operation is in-progress. This > > is why I mentioned the following in previous email. > > > > > So, we basically need to either 1) serialize > > > acpi_bus_hot_remove_device() and driver_unbind(), or 2) make > > > acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > during the operation. > > Forgot to mention. The 3rd option is what Greg said -- use the > suppress_bind_attrs field. I think this is a good option to address > this race condition for now. For a long term solution, we should have a > better infrastructure in place to address such issue in general. I like the suppress_bind_attrs idea, I 'll take a look. As I said for option 2), acpi_bus_remove could check for driver presence. But It's more a quick hack to abort the eject (the race with unbind can still happen, but acpi_bus_remove can now detect it later in the eject path). Something like: static int acpi_bus_remove(struct acpi_device *dev, int rmdevice) { + int ret; if (!dev) return -EINVAL; dev->removal_type = ACPI_BUS_REMOVAL_EJECT; + + if (dev->driver && dev->driver->ops.prepare_remove) { + ret = dev->driver->ops.prepare_remove(dev); + if (ret) + return ret; + } + else if (!dev->driver) + return -ENODEV; device_release_driver(&dev->dev); thanks, - Vasilis -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 29, 2012 at 11:03:05AM +0100, Rafael J. Wysocki wrote: > On Wednesday, November 28, 2012 06:15:42 PM Toshi Kani wrote: > > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > [...] > Well, in the meantime I've had a look at acpi_bus_hot_remove_device() and > friends and I think there's a way to address all of these problems > without big redesign (for now). > > First, why don't we introduce an ACPI device flag (in the flags field of > struct acpi_device) called eject_forbidden or something like this such that: > > (1) It will be clear by default. > (2) It may only be set by a driver's .add() routine if necessary. > (3) Once set, it may only be cleared by the driver's .remove() routine if > it's safe to physically remove the device after the .remove(). > > Then, after the .remove() (which must be successful) has returned, and the > flag is set, it will tell acpi_bus_remove() to return a specific error code > (such as -EBUSY or -EAGAIN). It doesn't matter if .remove() was called > earlier, because if it left the flag set, there's no way to clear it afterward > and acpi_bus_remove() will see it set anyway. I think the struct acpi_device > should be unregistered anyway if that error code is to be returned. > > [By the way, do you know where we free the memory allocated for struct > acpi_device objects?] > > Now if acpi_bus_trim() gets that error code from acpi_bus_remove(), it should > store it, but continue the trimming normally and finally it should return that > error code to acpi_bus_hot_remove_device(). Side-note: In the pre_remove patches, acpi_bus_trim actually returns on the first error from acpi_bus_remove (e.g. when memory offlining in pre_remove fails). Trimming is not continued. Normally, acpi_bus_trim keeps trimming as you say, and always returns the last error. Is this the desired behaviour that we want to keep for bus_trim? (This is more a general question, not specific to the eject_forbidden suggestion) > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > we attempted to eject) and notify the firmware about the failure. sounds like this rollback needs to be implemented in any solution we choose to implement, correct? > > If we have that, then the memory hotplug driver would only need to set > flags.eject_forbidden in its .add() routine and make its .remove() routine > only clear that flag if it is safe to actually remove the memory. > But when .remove op is called, we are already in the irreversible/error-free removal (final removal step). Maybe we need to reset eject_forbidden in a prepare_remove operation which handles the removal part that can fail ? thanks, - Vasilis -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2012-11-29 at 11:03 +0100, Rafael J. Wysocki wrote: > On Wednesday, November 28, 2012 06:15:42 PM Toshi Kani wrote: > > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > If we disabled exposing > > > > acpi_eject_store() for memory devices, then the only way would be from the > > > > notify handler. So I wonder if driver_unbind() shouldn't just uninstall the > > > > notify handler for memory (so that memory eject events are simply dropped on > > > > the floor after unbinding the driver)? > > > > > > If driver_unbind() happens before an eject request, we do not have a > > > problem. acpi_eject_store() fails if a driver is not bound to the > > > device. acpi_memory_device_notify() fails as well. > > > > > > The race condition Wen pointed out (see the top of this email) is that > > > driver_unbind() may come in while eject operation is in-progress. This > > > is why I mentioned the following in previous email. > > > > > > > So, we basically need to either 1) serialize > > > > acpi_bus_hot_remove_device() and driver_unbind(), or 2) make > > > > acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > > during the operation. > > > > Forgot to mention. The 3rd option is what Greg said -- use the > > suppress_bind_attrs field. I think this is a good option to address > > this race condition for now. For a long term solution, we should have a > > better infrastructure in place to address such issue in general. > > Well, in the meantime I've had a look at acpi_bus_hot_remove_device() and > friends and I think there's a way to address all of these problems > without big redesign (for now). > > First, why don't we introduce an ACPI device flag (in the flags field of > struct acpi_device) called eject_forbidden or something like this such that: > > (1) It will be clear by default. > (2) It may only be set by a driver's .add() routine if necessary. > (3) Once set, it may only be cleared by the driver's .remove() routine if > it's safe to physically remove the device after the .remove(). > > Then, after the .remove() (which must be successful) has returned, and the > flag is set, it will tell acpi_bus_remove() to return a specific error code > (such as -EBUSY or -EAGAIN). It doesn't matter if .remove() was called > earlier, because if it left the flag set, there's no way to clear it afterward > and acpi_bus_remove() will see it set anyway. I think the struct acpi_device > should be unregistered anyway if that error code is to be returned. I like the idea! It's a good intermediate solution if we need to keep the bind/unbind interface. That said, I still prefer to go with option 3) for now. I do not see much reason to keep the bind/unbind interface for ACPI hotplug drivers, and it seems that the semantics of .remove() is .remove_driver(), not .remove_device() for driver_unbind(). So, I think we should disable the bind/unbind interface until we settle this issue. > [By the way, do you know where we free the memory allocated for struct > acpi_device objects?] device_release() -> acpi_device_release(). > Now if acpi_bus_trim() gets that error code from acpi_bus_remove(), it should > store it, but continue the trimming normally and finally it should return that > error code to acpi_bus_hot_remove_device(). > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > we attempted to eject) and notify the firmware about the failure. > > If we have that, then the memory hotplug driver would only need to set > flags.eject_forbidden in its .add() routine and make its .remove() routine > only clear that flag if it is safe to actually remove the memory. > > Does this make sense to you? In high-level, yes. Rollback strategy, such as we should continue the trimming after an error, is something we need to think about along with the framework design. I think we need a good framework before implementing rollback. > [BTW, using _PS3 in acpi_bus_hot_remove_device() directly to power off the > device is a nonsense, because this method is not guaranteed to turn the power > off in the first place (it may just put the device into D3hot). If anything, > acpi_device_set_power() should be used for that, but even that is not > guaranteed to actually remove the power (power resources may be shared with > other devices, so in fact that operation should be done by acpi_bus_trim() > for each of the trimmed devices.] I agree. I cannot tell for other vendor's implementation, but I expect that _EJ0 takes care of the power state after it is ejected. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, November 29, 2012 12:30:30 PM Vasilis Liaskovitis wrote: > On Thu, Nov 29, 2012 at 11:03:05AM +0100, Rafael J. Wysocki wrote: > > On Wednesday, November 28, 2012 06:15:42 PM Toshi Kani wrote: > > > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > [...] > > Well, in the meantime I've had a look at acpi_bus_hot_remove_device() and > > friends and I think there's a way to address all of these problems > > without big redesign (for now). > > > > First, why don't we introduce an ACPI device flag (in the flags field of > > struct acpi_device) called eject_forbidden or something like this such that: > > > > (1) It will be clear by default. > > (2) It may only be set by a driver's .add() routine if necessary. > > (3) Once set, it may only be cleared by the driver's .remove() routine if > > it's safe to physically remove the device after the .remove(). > > > > Then, after the .remove() (which must be successful) has returned, and the > > flag is set, it will tell acpi_bus_remove() to return a specific error code > > (such as -EBUSY or -EAGAIN). It doesn't matter if .remove() was called > > earlier, because if it left the flag set, there's no way to clear it afterward > > and acpi_bus_remove() will see it set anyway. I think the struct acpi_device > > should be unregistered anyway if that error code is to be returned. > > > > [By the way, do you know where we free the memory allocated for struct > > acpi_device objects?] > > > > Now if acpi_bus_trim() gets that error code from acpi_bus_remove(), it should > > store it, but continue the trimming normally and finally it should return that > > error code to acpi_bus_hot_remove_device(). > > Side-note: In the pre_remove patches, acpi_bus_trim actually returns on the > first error from acpi_bus_remove (e.g. when memory offlining in pre_remove > fails). Trimming is not continued. > > Normally, acpi_bus_trim keeps trimming as you say, and always returns the last > error. Is this the desired behaviour that we want to keep for bus_trim? (This is > more a general question, not specific to the eject_forbidden suggestion) > > > > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > > we attempted to eject) and notify the firmware about the failure. > > sounds like this rollback needs to be implemented in any solution we choose > to implement, correct? > > > > > If we have that, then the memory hotplug driver would only need to set > > flags.eject_forbidden in its .add() routine and make its .remove() routine > > only clear that flag if it is safe to actually remove the memory. > > > > But when .remove op is called, we are already in the irreversible/error-free > removal (final removal step). Why so? What prevents us from doing a bus scan again and binding the driver again to the device? Is .remove() doing something to the firmware? Rafael
On Thu, 2012-11-29 at 12:04 +0100, Vasilis Liaskovitis wrote: > Hi, > > On Wed, Nov 28, 2012 at 06:15:42PM -0700, Toshi Kani wrote: > > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > I see two reasons for calling acpi_bus_hot_remove_device() for memory (correct > > > > me if I'm wrong): (1) from the memhotplug driver's notify handler and (2) from > > > > acpi_eject_store() which is exposed through sysfs. > > > > > > Yes, that is correct. > > > > > > > If we disabled exposing > > > > acpi_eject_store() for memory devices, then the only way would be from the > > > > notify handler. So I wonder if driver_unbind() shouldn't just uninstall the > > > > notify handler for memory (so that memory eject events are simply dropped on > > > > the floor after unbinding the driver)? > > > > > > If driver_unbind() happens before an eject request, we do not have a > > > problem. acpi_eject_store() fails if a driver is not bound to the > > > device. acpi_memory_device_notify() fails as well. > > > > > > The race condition Wen pointed out (see the top of this email) is that > > > driver_unbind() may come in while eject operation is in-progress. This > > > is why I mentioned the following in previous email. > > > > > > > So, we basically need to either 1) serialize > > > > acpi_bus_hot_remove_device() and driver_unbind(), or 2) make > > > > acpi_bus_hot_remove_device() to fail if driver_unbind() is run > > > > during the operation. > > > > Forgot to mention. The 3rd option is what Greg said -- use the > > suppress_bind_attrs field. I think this is a good option to address > > this race condition for now. For a long term solution, we should have a > > better infrastructure in place to address such issue in general. > > I like the suppress_bind_attrs idea, I 'll take a look. Great! > As I said for option 2), acpi_bus_remove could check for driver presence. > But It's more a quick hack to abort the eject (the race with unbind can still > happen, but acpi_bus_remove can now detect it later in the eject path). > Something like: > > static int acpi_bus_remove(struct acpi_device *dev, int rmdevice) > { > + int ret; > if (!dev) > return -EINVAL; > > dev->removal_type = ACPI_BUS_REMOVAL_EJECT; > + > + if (dev->driver && dev->driver->ops.prepare_remove) { > + ret = dev->driver->ops.prepare_remove(dev); > + if (ret) > + return ret; > + } > + else if (!dev->driver) > + return -ENODEV; > device_release_driver(&dev->dev); Yes, that's what I had in mind along with device_lock(). I think the lock is necessary to close the window. http://www.spinics.net/lists/linux-mm/msg46973.html But as I mentioned in other email, I prefer option 3 with suppress_bind_attrs. So, yes, please take a look to see how it works out. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2012-11-29 at 12:30 +0100, Vasilis Liaskovitis wrote: > On Thu, Nov 29, 2012 at 11:03:05AM +0100, Rafael J. Wysocki wrote: > > On Wednesday, November 28, 2012 06:15:42 PM Toshi Kani wrote: > > > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > [...] > > Well, in the meantime I've had a look at acpi_bus_hot_remove_device() and > > friends and I think there's a way to address all of these problems > > without big redesign (for now). > > > > First, why don't we introduce an ACPI device flag (in the flags field of > > struct acpi_device) called eject_forbidden or something like this such that: > > > > (1) It will be clear by default. > > (2) It may only be set by a driver's .add() routine if necessary. > > (3) Once set, it may only be cleared by the driver's .remove() routine if > > it's safe to physically remove the device after the .remove(). > > > > Then, after the .remove() (which must be successful) has returned, and the > > flag is set, it will tell acpi_bus_remove() to return a specific error code > > (such as -EBUSY or -EAGAIN). It doesn't matter if .remove() was called > > earlier, because if it left the flag set, there's no way to clear it afterward > > and acpi_bus_remove() will see it set anyway. I think the struct acpi_device > > should be unregistered anyway if that error code is to be returned. > > > > [By the way, do you know where we free the memory allocated for struct > > acpi_device objects?] > > > > Now if acpi_bus_trim() gets that error code from acpi_bus_remove(), it should > > store it, but continue the trimming normally and finally it should return that > > error code to acpi_bus_hot_remove_device(). > > Side-note: In the pre_remove patches, acpi_bus_trim actually returns on the > first error from acpi_bus_remove (e.g. when memory offlining in pre_remove > fails). Trimming is not continued. > > Normally, acpi_bus_trim keeps trimming as you say, and always returns the last > error. Is this the desired behaviour that we want to keep for bus_trim? (This is > more a general question, not specific to the eject_forbidden suggestion) Your change makes sense to me. At least until we have rollback code in place, we need to fail as soon as we hit an error. > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > > we attempted to eject) and notify the firmware about the failure. > > sounds like this rollback needs to be implemented in any solution we choose > to implement, correct? Yes, rollback is necessary. But I do not think we need to include it into your patch, though. Thanks, -Toshi > > If we have that, then the memory hotplug driver would only need to set > > flags.eject_forbidden in its .add() routine and make its .remove() routine > > only clear that flag if it is safe to actually remove the memory. > > > > But when .remove op is called, we are already in the irreversible/error-free > removal (final removal step). > Maybe we need to reset eject_forbidden in a prepare_remove operation which > handles the removal part that can fail ? > > thanks, > > - Vasilis -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, November 29, 2012 10:56:30 AM Toshi Kani wrote: > On Thu, 2012-11-29 at 12:30 +0100, Vasilis Liaskovitis wrote: > > On Thu, Nov 29, 2012 at 11:03:05AM +0100, Rafael J. Wysocki wrote: > > > On Wednesday, November 28, 2012 06:15:42 PM Toshi Kani wrote: > > > > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > > > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > [...] > > > Well, in the meantime I've had a look at acpi_bus_hot_remove_device() and > > > friends and I think there's a way to address all of these problems > > > without big redesign (for now). > > > > > > First, why don't we introduce an ACPI device flag (in the flags field of > > > struct acpi_device) called eject_forbidden or something like this such that: > > > > > > (1) It will be clear by default. > > > (2) It may only be set by a driver's .add() routine if necessary. > > > (3) Once set, it may only be cleared by the driver's .remove() routine if > > > it's safe to physically remove the device after the .remove(). > > > > > > Then, after the .remove() (which must be successful) has returned, and the > > > flag is set, it will tell acpi_bus_remove() to return a specific error code > > > (such as -EBUSY or -EAGAIN). It doesn't matter if .remove() was called > > > earlier, because if it left the flag set, there's no way to clear it afterward > > > and acpi_bus_remove() will see it set anyway. I think the struct acpi_device > > > should be unregistered anyway if that error code is to be returned. > > > > > > [By the way, do you know where we free the memory allocated for struct > > > acpi_device objects?] > > > > > > Now if acpi_bus_trim() gets that error code from acpi_bus_remove(), it should > > > store it, but continue the trimming normally and finally it should return that > > > error code to acpi_bus_hot_remove_device(). > > > > Side-note: In the pre_remove patches, acpi_bus_trim actually returns on the > > first error from acpi_bus_remove (e.g. when memory offlining in pre_remove > > fails). Trimming is not continued. > > > > Normally, acpi_bus_trim keeps trimming as you say, and always returns the last > > error. Is this the desired behaviour that we want to keep for bus_trim? (This is > > more a general question, not specific to the eject_forbidden suggestion) > > Your change makes sense to me. At least until we have rollback code in > place, we need to fail as soon as we hit an error. Are you sure this makes sense? What happens to the devices that we have trimmed already and then there's an error? Looks like they are just unusable going forward, aren't they? > > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > > > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > > > we attempted to eject) and notify the firmware about the failure. > > > > sounds like this rollback needs to be implemented in any solution we choose > > to implement, correct? > > Yes, rollback is necessary. But I do not think we need to include it > into your patch, though. As the first step, we should just trim everything and then return an error code in my opinion. Thanks, Rafael
On Thu, 2012-11-29 at 21:25 +0100, Rafael J. Wysocki wrote: > On Thursday, November 29, 2012 10:56:30 AM Toshi Kani wrote: > > On Thu, 2012-11-29 at 12:30 +0100, Vasilis Liaskovitis wrote: > > > On Thu, Nov 29, 2012 at 11:03:05AM +0100, Rafael J. Wysocki wrote: > > > > On Wednesday, November 28, 2012 06:15:42 PM Toshi Kani wrote: > > > > > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > > > > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > > [...] > > > > Well, in the meantime I've had a look at acpi_bus_hot_remove_device() and > > > > friends and I think there's a way to address all of these problems > > > > without big redesign (for now). > > > > > > > > First, why don't we introduce an ACPI device flag (in the flags field of > > > > struct acpi_device) called eject_forbidden or something like this such that: > > > > > > > > (1) It will be clear by default. > > > > (2) It may only be set by a driver's .add() routine if necessary. > > > > (3) Once set, it may only be cleared by the driver's .remove() routine if > > > > it's safe to physically remove the device after the .remove(). > > > > > > > > Then, after the .remove() (which must be successful) has returned, and the > > > > flag is set, it will tell acpi_bus_remove() to return a specific error code > > > > (such as -EBUSY or -EAGAIN). It doesn't matter if .remove() was called > > > > earlier, because if it left the flag set, there's no way to clear it afterward > > > > and acpi_bus_remove() will see it set anyway. I think the struct acpi_device > > > > should be unregistered anyway if that error code is to be returned. > > > > > > > > [By the way, do you know where we free the memory allocated for struct > > > > acpi_device objects?] > > > > > > > > Now if acpi_bus_trim() gets that error code from acpi_bus_remove(), it should > > > > store it, but continue the trimming normally and finally it should return that > > > > error code to acpi_bus_hot_remove_device(). > > > > > > Side-note: In the pre_remove patches, acpi_bus_trim actually returns on the > > > first error from acpi_bus_remove (e.g. when memory offlining in pre_remove > > > fails). Trimming is not continued. > > > > > > Normally, acpi_bus_trim keeps trimming as you say, and always returns the last > > > error. Is this the desired behaviour that we want to keep for bus_trim? (This is > > > more a general question, not specific to the eject_forbidden suggestion) > > > > Your change makes sense to me. At least until we have rollback code in > > place, we need to fail as soon as we hit an error. > > Are you sure this makes sense? What happens to the devices that we have > trimmed already and then there's an error? Looks like they are just unusable > going forward, aren't they? Yes, the devices trimmed already are released from the kernel, and their memory ranges become unusable. This is bad. But I do not think we should trim further to make more devices unusable after an error. > > > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > > > > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > > > > we attempted to eject) and notify the firmware about the failure. > > > > > > sounds like this rollback needs to be implemented in any solution we choose > > > to implement, correct? > > > > Yes, rollback is necessary. But I do not think we need to include it > > into your patch, though. > > As the first step, we should just trim everything and then return an error > code in my opinion. But we cannot trim devices with kernel memory. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, November 29, 2012 01:38:39 PM Toshi Kani wrote: > On Thu, 2012-11-29 at 21:25 +0100, Rafael J. Wysocki wrote: > > On Thursday, November 29, 2012 10:56:30 AM Toshi Kani wrote: > > > On Thu, 2012-11-29 at 12:30 +0100, Vasilis Liaskovitis wrote: > > > > On Thu, Nov 29, 2012 at 11:03:05AM +0100, Rafael J. Wysocki wrote: > > > > > On Wednesday, November 28, 2012 06:15:42 PM Toshi Kani wrote: > > > > > > On Wed, 2012-11-28 at 18:02 -0700, Toshi Kani wrote: > > > > > > > On Thu, 2012-11-29 at 00:49 +0100, Rafael J. Wysocki wrote: > > > > > > > > On Wednesday, November 28, 2012 02:02:48 PM Toshi Kani wrote: > > > > > > > > > > > > > > > > Consider the following case: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We hotremove the memory device by SCI and unbind it from the driver at the same time: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > CPUa CPUb > > > > > > > > > > > > > > > > acpi_memory_device_notify() > > > > > > > > > > > > > > > > unbind it from the driver > > > > > > > > > > > > > > > > acpi_bus_hot_remove_device() > > > > > > > > > > > > > > > > > > > [...] > > > > > Well, in the meantime I've had a look at acpi_bus_hot_remove_device() and > > > > > friends and I think there's a way to address all of these problems > > > > > without big redesign (for now). > > > > > > > > > > First, why don't we introduce an ACPI device flag (in the flags field of > > > > > struct acpi_device) called eject_forbidden or something like this such that: > > > > > > > > > > (1) It will be clear by default. > > > > > (2) It may only be set by a driver's .add() routine if necessary. > > > > > (3) Once set, it may only be cleared by the driver's .remove() routine if > > > > > it's safe to physically remove the device after the .remove(). > > > > > > > > > > Then, after the .remove() (which must be successful) has returned, and the > > > > > flag is set, it will tell acpi_bus_remove() to return a specific error code > > > > > (such as -EBUSY or -EAGAIN). It doesn't matter if .remove() was called > > > > > earlier, because if it left the flag set, there's no way to clear it afterward > > > > > and acpi_bus_remove() will see it set anyway. I think the struct acpi_device > > > > > should be unregistered anyway if that error code is to be returned. > > > > > > > > > > [By the way, do you know where we free the memory allocated for struct > > > > > acpi_device objects?] > > > > > > > > > > Now if acpi_bus_trim() gets that error code from acpi_bus_remove(), it should > > > > > store it, but continue the trimming normally and finally it should return that > > > > > error code to acpi_bus_hot_remove_device(). > > > > > > > > Side-note: In the pre_remove patches, acpi_bus_trim actually returns on the > > > > first error from acpi_bus_remove (e.g. when memory offlining in pre_remove > > > > fails). Trimming is not continued. > > > > > > > > Normally, acpi_bus_trim keeps trimming as you say, and always returns the last > > > > error. Is this the desired behaviour that we want to keep for bus_trim? (This is > > > > more a general question, not specific to the eject_forbidden suggestion) > > > > > > Your change makes sense to me. At least until we have rollback code in > > > place, we need to fail as soon as we hit an error. > > > > Are you sure this makes sense? What happens to the devices that we have > > trimmed already and then there's an error? Looks like they are just unusable > > going forward, aren't they? > > Yes, the devices trimmed already are released from the kernel, and their > memory ranges become unusable. This is bad. But I do not think we > should trim further to make more devices unusable after an error. > > > > > > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > > > > > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > > > > > we attempted to eject) and notify the firmware about the failure. > > > > > > > > sounds like this rollback needs to be implemented in any solution we choose > > > > to implement, correct? > > > > > > Yes, rollback is necessary. But I do not think we need to include it > > > into your patch, though. > > > > As the first step, we should just trim everything and then return an error > > code in my opinion. > > But we cannot trim devices with kernel memory. Well, let's put it this way: If we started a trim, we should just do it completely, in which case we know we can go for the eject, or we should roll it back completely. Now, if you just break the trim on first error, the complete rollback is kind of problematic. It should be doable, but it won't be easy. On the other hand, if you go for the full trim, doing a rollback is trivial, it's as though you have reinserted the whole stuff. Now, that need not harm functionality, and that's why I proposed the eject_forbidden flag, so that .remove() can say "I'm not done, please rollback", in which case the device can happily function going forward, even if we don't rebind the driver to it. Thanks, Rafael
On Thu, 2012-11-29 at 22:23 +0100, Rafael J. Wysocki wrote: > On Thursday, November 29, 2012 01:38:39 PM Toshi Kani wrote: > > On Thu, 2012-11-29 at 21:25 +0100, Rafael J. Wysocki wrote: > > > On Thursday, November 29, 2012 10:56:30 AM Toshi Kani wrote: > > > > On Thu, 2012-11-29 at 12:30 +0100, Vasilis Liaskovitis wrote: > > > > > Side-note: In the pre_remove patches, acpi_bus_trim actually returns on the > > > > > first error from acpi_bus_remove (e.g. when memory offlining in pre_remove > > > > > fails). Trimming is not continued. > > > > > > > > > > Normally, acpi_bus_trim keeps trimming as you say, and always returns the last > > > > > error. Is this the desired behaviour that we want to keep for bus_trim? (This is > > > > > more a general question, not specific to the eject_forbidden suggestion) > > > > > > > > Your change makes sense to me. At least until we have rollback code in > > > > place, we need to fail as soon as we hit an error. > > > > > > Are you sure this makes sense? What happens to the devices that we have > > > trimmed already and then there's an error? Looks like they are just unusable > > > going forward, aren't they? > > > > Yes, the devices trimmed already are released from the kernel, and their > > memory ranges become unusable. This is bad. But I do not think we > > should trim further to make more devices unusable after an error. > > > > > > > > > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > > > > > > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > > > > > > we attempted to eject) and notify the firmware about the failure. > > > > > > > > > > sounds like this rollback needs to be implemented in any solution we choose > > > > > to implement, correct? > > > > > > > > Yes, rollback is necessary. But I do not think we need to include it > > > > into your patch, though. > > > > > > As the first step, we should just trim everything and then return an error > > > code in my opinion. > > > > But we cannot trim devices with kernel memory. > > Well, let's put it this way: If we started a trim, we should just do it > completely, in which case we know we can go for the eject, or we should > roll it back completely. Now, if you just break the trim on first error, > the complete rollback is kind of problematic. It should be doable, but > it won't be easy. On the other hand, if you go for the full trim, > doing a rollback is trivial, it's as though you have reinserted the whole > stuff. acpi_bus_check_add() skips initialization when an ACPI device already has its associated acpi_device. So, I think it works either way. > Now, that need not harm functionality, and that's why I proposed the > eject_forbidden flag, so that .remove() can say "I'm not done, please > rollback", in which case the device can happily function going forward, > even if we don't rebind the driver to it. A partially trimmed acpi_device is hard to rollback. acpi_device should be either trimmed completely or intact. When a function failed to trim an acpi_device, it needs to rollback its operation for the device before returning an error. This is because only the failed function has enough context to rollback when an error occurred in the middle of its procedure. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, November 29, 2012 02:46:44 PM Toshi Kani wrote: > On Thu, 2012-11-29 at 22:23 +0100, Rafael J. Wysocki wrote: > > On Thursday, November 29, 2012 01:38:39 PM Toshi Kani wrote: > > > On Thu, 2012-11-29 at 21:25 +0100, Rafael J. Wysocki wrote: > > > > On Thursday, November 29, 2012 10:56:30 AM Toshi Kani wrote: > > > > > On Thu, 2012-11-29 at 12:30 +0100, Vasilis Liaskovitis wrote: > > > > > > Side-note: In the pre_remove patches, acpi_bus_trim actually returns on the > > > > > > first error from acpi_bus_remove (e.g. when memory offlining in pre_remove > > > > > > fails). Trimming is not continued. > > > > > > > > > > > > Normally, acpi_bus_trim keeps trimming as you say, and always returns the last > > > > > > error. Is this the desired behaviour that we want to keep for bus_trim? (This is > > > > > > more a general question, not specific to the eject_forbidden suggestion) > > > > > > > > > > Your change makes sense to me. At least until we have rollback code in > > > > > place, we need to fail as soon as we hit an error. > > > > > > > > Are you sure this makes sense? What happens to the devices that we have > > > > trimmed already and then there's an error? Looks like they are just unusable > > > > going forward, aren't they? > > > > > > Yes, the devices trimmed already are released from the kernel, and their > > > memory ranges become unusable. This is bad. But I do not think we > > > should trim further to make more devices unusable after an error. > > > > > > > > > > > > > Now, if acpi_bus_hot_remove_device() gets that error code, it should just > > > > > > > reverse the whole trimming (i.e. trigger acpi_bus_scan() from the device > > > > > > > we attempted to eject) and notify the firmware about the failure. > > > > > > > > > > > > sounds like this rollback needs to be implemented in any solution we choose > > > > > > to implement, correct? > > > > > > > > > > Yes, rollback is necessary. But I do not think we need to include it > > > > > into your patch, though. > > > > > > > > As the first step, we should just trim everything and then return an error > > > > code in my opinion. > > > > > > But we cannot trim devices with kernel memory. > > > > Well, let's put it this way: If we started a trim, we should just do it > > completely, in which case we know we can go for the eject, or we should > > roll it back completely. Now, if you just break the trim on first error, > > the complete rollback is kind of problematic. It should be doable, but > > it won't be easy. On the other hand, if you go for the full trim, > > doing a rollback is trivial, it's as though you have reinserted the whole > > stuff. > > acpi_bus_check_add() skips initialization when an ACPI device already > has its associated acpi_device. So, I think it works either way. OK > > Now, that need not harm functionality, and that's why I proposed the > > eject_forbidden flag, so that .remove() can say "I'm not done, please > > rollback", in which case the device can happily function going forward, > > even if we don't rebind the driver to it. > > A partially trimmed acpi_device is hard to rollback. acpi_device should > be either trimmed completely or intact. I may or may not agree, depending on what you mean by "trimmed". :-) > When a function failed to trim > an acpi_device, it needs to rollback its operation for the device before > returning an error. Unless it is .remove(), because .remove() is supposed to always succeed (ie. unbind the driver from the device). However, it may signal the caller that something's fishy, by setting a flag in the device object, for example. > This is because only the failed function has enough > context to rollback when an error occurred in the middle of its > procedure. Not really. If it actually removes the struct acpi_device then the caller may run acpi_bus_scan() on that device if necessary. There may be a problem if the device has an associated physical node (or more of them), but that requires special care anyway. Thanks, Rafael
On Thu, 2012-11-29 at 23:11 +0100, Rafael J. Wysocki wrote: > On Thursday, November 29, 2012 02:46:44 PM Toshi Kani wrote: > > On Thu, 2012-11-29 at 22:23 +0100, Rafael J. Wysocki wrote: > > > On Thursday, November 29, 2012 01:38:39 PM Toshi Kani wrote: > > > > > > Well, let's put it this way: If we started a trim, we should just do it > > > completely, in which case we know we can go for the eject, or we should > > > roll it back completely. Now, if you just break the trim on first error, > > > the complete rollback is kind of problematic. It should be doable, but > > > it won't be easy. On the other hand, if you go for the full trim, > > > doing a rollback is trivial, it's as though you have reinserted the whole > > > stuff. > > > > acpi_bus_check_add() skips initialization when an ACPI device already > > has its associated acpi_device. So, I think it works either way. > > OK > > > > Now, that need not harm functionality, and that's why I proposed the > > > eject_forbidden flag, so that .remove() can say "I'm not done, please > > > rollback", in which case the device can happily function going forward, > > > even if we don't rebind the driver to it. > > > > A partially trimmed acpi_device is hard to rollback. acpi_device should > > be either trimmed completely or intact. > > I may or may not agree, depending on what you mean by "trimmed". :-) > > > When a function failed to trim > > an acpi_device, it needs to rollback its operation for the device before > > returning an error. > > Unless it is .remove(), because .remove() is supposed to always succeed > (ie. unbind the driver from the device). However, it may signal the caller > that something's fishy, by setting a flag in the device object, for example. Right, .remove() cannot fail. We still need to check if we should continue to use .remove(), though. As for the flag, are you thinking that we call acpi_bus_trim() with rmdevice false first, so that it won't remove acpi_device? > > This is because only the failed function has enough > > context to rollback when an error occurred in the middle of its > > procedure. > > Not really. If it actually removes the struct acpi_device then the caller > may run acpi_bus_scan() on that device if necessary. There may be a problem > if the device has an associated physical node (or more of them), but that > requires special care anyway. Well, hot-remove to a device fails when there is a reason to fail. IOW, such reason prevented the device to be removed safely. So, I think we need to put it back to the original state in this case. Removing it by ignoring the cause of failure sounds unsafe to me. Some status/data may be left un-deleted as a result. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, November 29, 2012 04:17:19 PM Toshi Kani wrote: > On Thu, 2012-11-29 at 23:11 +0100, Rafael J. Wysocki wrote: > > On Thursday, November 29, 2012 02:46:44 PM Toshi Kani wrote: > > > On Thu, 2012-11-29 at 22:23 +0100, Rafael J. Wysocki wrote: > > > > On Thursday, November 29, 2012 01:38:39 PM Toshi Kani wrote: > > > > > > > > Well, let's put it this way: If we started a trim, we should just do it > > > > completely, in which case we know we can go for the eject, or we should > > > > roll it back completely. Now, if you just break the trim on first error, > > > > the complete rollback is kind of problematic. It should be doable, but > > > > it won't be easy. On the other hand, if you go for the full trim, > > > > doing a rollback is trivial, it's as though you have reinserted the whole > > > > stuff. > > > > > > acpi_bus_check_add() skips initialization when an ACPI device already > > > has its associated acpi_device. So, I think it works either way. > > > > OK > > > > > > Now, that need not harm functionality, and that's why I proposed the > > > > eject_forbidden flag, so that .remove() can say "I'm not done, please > > > > rollback", in which case the device can happily function going forward, > > > > even if we don't rebind the driver to it. > > > > > > A partially trimmed acpi_device is hard to rollback. acpi_device should > > > be either trimmed completely or intact. > > > > I may or may not agree, depending on what you mean by "trimmed". :-) > > > > > When a function failed to trim > > > an acpi_device, it needs to rollback its operation for the device before > > > returning an error. > > > > Unless it is .remove(), because .remove() is supposed to always succeed > > (ie. unbind the driver from the device). However, it may signal the caller > > that something's fishy, by setting a flag in the device object, for example. > > Right, .remove() cannot fail. We still need to check if we should > continue to use .remove(), though. > > As for the flag, are you thinking that we call acpi_bus_trim() with > rmdevice false first, so that it won't remove acpi_device? I'm not sure if that's going to help. Definitely, .remove() should just unbind the driver from the device. That's what it's supposed to do. Still, it may leave some information for the caller in the device structure itself. For example, "I have unbound from the device, but it is not safe to remove it physically". I'm now thinking that we may need to rework the trimming so that .remove() is called for all drivers first and the struct acpi_device objects are not removed at this stage. Then, if .remove() from one driver signals the situation like above, the routine will have to rebind the drivers that have been unbound and we're done. After that stage, when all drivers have been unbound, we should be able to go for full eject. First, we can drop all struct acpi_device objects in the relevant subtree and then we can run _EJ0. > > > This is because only the failed function has enough > > > context to rollback when an error occurred in the middle of its > > > procedure. > > > > Not really. If it actually removes the struct acpi_device then the caller > > may run acpi_bus_scan() on that device if necessary. There may be a problem > > if the device has an associated physical node (or more of them), but that > > requires special care anyway. > > Well, hot-remove to a device fails when there is a reason to fail. IOW, > such reason prevented the device to be removed safely. So, I think we > need to put it back to the original state in this case. Removing it by > ignoring the cause of failure sounds unsafe to me. Some status/data may > be left un-deleted as a result. Again, I may or may not agree with that, depending on whether you're talking about physical devices or about struct acpi_device objects. Anyway, I agree that removing struct acpi_device objects may not be worth the effort if we're going to re-create them in a while, because that may be costly. Thanks, Rafael
On Fri, 2012-11-30 at 01:13 +0100, Rafael J. Wysocki wrote: > On Thursday, November 29, 2012 04:17:19 PM Toshi Kani wrote: > > On Thu, 2012-11-29 at 23:11 +0100, Rafael J. Wysocki wrote: > > > On Thursday, November 29, 2012 02:46:44 PM Toshi Kani wrote: > > > > On Thu, 2012-11-29 at 22:23 +0100, Rafael J. Wysocki wrote: > > > > > On Thursday, November 29, 2012 01:38:39 PM Toshi Kani wrote: > > > > > Now, that need not harm functionality, and that's why I proposed the > > > > > eject_forbidden flag, so that .remove() can say "I'm not done, please > > > > > rollback", in which case the device can happily function going forward, > > > > > even if we don't rebind the driver to it. > > > > > > > > A partially trimmed acpi_device is hard to rollback. acpi_device should > > > > be either trimmed completely or intact. > > > > > > I may or may not agree, depending on what you mean by "trimmed". :-) > > > > > > > When a function failed to trim > > > > an acpi_device, it needs to rollback its operation for the device before > > > > returning an error. > > > > > > Unless it is .remove(), because .remove() is supposed to always succeed > > > (ie. unbind the driver from the device). However, it may signal the caller > > > that something's fishy, by setting a flag in the device object, for example. > > > > Right, .remove() cannot fail. We still need to check if we should > > continue to use .remove(), though. > > > > As for the flag, are you thinking that we call acpi_bus_trim() with > > rmdevice false first, so that it won't remove acpi_device? > > I'm not sure if that's going to help. > > Definitely, .remove() should just unbind the driver from the device. > That's what it's supposed to do. Still, it may leave some information for > the caller in the device structure itself. For example, "I have unbound > from the device, but it is not safe to remove it physically". Right. > I'm now thinking that we may need to rework the trimming so that > .remove() is called for all drivers first and the struct acpi_device > objects are not removed at this stage. Then, if .remove() from one > driver signals the situation like above, the routine will have to > rebind the drivers that have been unbound and we're done. > > After that stage, when all drivers have been unbound, we should be > able to go for full eject. First, we can drop all struct acpi_device > objects in the relevant subtree and then we can run _EJ0. I agree that such approach is worth pursuing. > > > > This is because only the failed function has enough > > > > context to rollback when an error occurred in the middle of its > > > > procedure. > > > > > > Not really. If it actually removes the struct acpi_device then the caller > > > may run acpi_bus_scan() on that device if necessary. There may be a problem > > > if the device has an associated physical node (or more of them), but that > > > requires special care anyway. > > > > Well, hot-remove to a device fails when there is a reason to fail. IOW, > > such reason prevented the device to be removed safely. So, I think we > > need to put it back to the original state in this case. Removing it by > > ignoring the cause of failure sounds unsafe to me. Some status/data may > > be left un-deleted as a result. > > Again, I may or may not agree with that, depending on whether you're talking > about physical devices or about struct acpi_device objects. Sorry, by "hot-remove a device", I was referring removing struct acpi_device and off-lining its resource. By "left un-delted", I was referring its resource left un-deleted, such as memory ranges. > Anyway, I agree that removing struct acpi_device objects may not be worth the > effort if we're going to re-create them in a while, because that may be costly. Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c index d0cfbd9..0562cb4 100644 --- a/drivers/acpi/acpi_memhotplug.c +++ b/drivers/acpi/acpi_memhotplug.c @@ -271,12 +271,11 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device) continue; } - if (!result) - info->enabled = 1; /* * Add num_enable even if add_memory() returns -EEXIST, so the * device is bound to this driver. */ + info->enabled = 1; num_enabled++; } if (!num_enabled) {
Consider the following sequence of operations for a hotplugged memory device: 1. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/unbind 2. echo "PNP0C80:XX" > /sys/bus/acpi/drivers/acpi_memhotplug/bind 3. echo 1 >/sys/bus/pci/devices/PNP0C80:XX/eject The driver is successfully re-bound to the device in step 2. However step 3 will not attempt to remove the memory. This is because the acpi_memory_info enabled bit for the newly bound driver has not been set to 1. This bit needs to be set in the case where the memory is already used by the kernel (add_memory returns -EEXIST) Setting the enabled bit in this case (in acpi_memory_enable_device) makes the driver function properly after a rebind of the driver i.e. eject operation attempts to remove memory after a successful rebind. I am not sure if this breaks some other usage of the enabled bit (see commit 65479472). When is it possible for the memory to be in use by the kernel but not managed by the acpi driver, apart from a driver unbind scenario? Perhaps the patch is not needed, depending on expected semantics of re-binding. Is the newly bound driver supposed to manage the device, if it was earlier managed by the same driver? This patch is only specific to this scenario, and can be dropped from the patch series if needed. Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> --- drivers/acpi/acpi_memhotplug.c | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-)