Message ID | 1647668.odH0HuOGZ9@vostro.rjw.lan (mailing list archive) |
---|---|
State | Accepted, archived |
On Thu, May 02, 2013 at 02:27:30PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> In some cases, graceful hot-removal of devices is not possible,
> although in principle the devices in question support hotplug.
> For example, that may happen for the last CPU in the system or
> for memory modules holding kernel memory.
>
> In those cases it is nice to be able to check if the given device
> can be gracefully hot-removed before triggering a removal procedure
> that cannot be aborted or reversed. Unfortunately, however, the
> kernel currently doesn't provide any support for that.
>
> To address that deficiency, introduce support for offline and
> online operations that can be performed on devices, respectively,
> before a hot-removal and in case when it is necessary (or convenient)
> to put a device back online after a successful offline (that has not
> been followed by removal). The idea is that the offline will fail
> whenever the given device cannot be gracefully removed from the
> system and it will not be allowed to use the device after a
> successful offline (until a subsequent online) in analogy with the
> existing CPU offline/online mechanism.
>
> For now, the offline and online operations are introduced at the
> bus type level, as that should be sufficient for the most urgent use
> cases (CPUs and memory modules). In the future, however, the
> approach may be extended to cover some more complicated device
> offline/online scenarios involving device drivers etc.
>
> The lock_device_hotplug() and unlock_device_hotplug() functions are
> introduced because subsequent patches need to put larger pieces of
> code under device_hotplug_lock to prevent race conditions between
> device offline and removal from happening.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Thu, 2013-05-02 at 14:27 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> In some cases, graceful hot-removal of devices is not possible,
> although in principle the devices in question support hotplug.
> For example, that may happen for the last CPU in the system or
> for memory modules holding kernel memory.
>
> In those cases it is nice to be able to check if the given device
> can be gracefully hot-removed before triggering a removal procedure
> that cannot be aborted or reversed. Unfortunately, however, the
> kernel currently doesn't provide any support for that.
>
> To address that deficiency, introduce support for offline and
> online operations that can be performed on devices, respectively,
> before a hot-removal and in case when it is necessary (or convenient)
> to put a device back online after a successful offline (that has not
> been followed by removal). The idea is that the offline will fail
> whenever the given device cannot be gracefully removed from the
> system and it will not be allowed to use the device after a
> successful offline (until a subsequent online) in analogy with the
> existing CPU offline/online mechanism.
>
> For now, the offline and online operations are introduced at the
> bus type level, as that should be sufficient for the most urgent use
> cases (CPUs and memory modules). In the future, however, the
> approach may be extended to cover some more complicated device
> offline/online scenarios involving device drivers etc.
>
> The lock_device_hotplug() and unlock_device_hotplug() functions are
> introduced because subsequent patches need to put larger pieces of
> code under device_hotplug_lock to prevent race conditions between
> device offline and removal from happening.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Looks good. For patch 1/4 to 3/4:

Reviewed-by: Toshi Kani <toshi.kani@hp.com>

I have one minor comment below.

> ---
>  Documentation/ABI/testing/sysfs-devices-online |   20 +++
>  drivers/base/core.c                            |  130 +++++++++++++++++++++++++
>  include/linux/device.h                         |   21 ++++
>  3 files changed, 171 insertions(+)
>
> Index: linux-pm/include/linux/device.h
> ===================================================================
> --- linux-pm.orig/include/linux/device.h
> +++ linux-pm/include/linux/device.h
> @@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
>   *		the specific driver's probe to initial the matched device.
>   * @remove:	Called when a device removed from this bus.
>   * @shutdown:	Called at shut-down time to quiesce the device.
> + *
> + * @online:	Called to put the device back online (after offlining it).
> + * @offline:	Called to put the device offline for hot-removal. May fail.
> + *
>   * @suspend:	Called when a device on this bus wants to go to sleep mode.
>   * @resume:	Called to bring a device on this bus out of sleep mode.
>   * @pm:		Power management operations of this bus, callback the specific
> @@ -103,6 +107,9 @@ struct bus_type {
>  	int (*remove)(struct device *dev);
>  	void (*shutdown)(struct device *dev);
>
> +	int (*online)(struct device *dev);
> +	int (*offline)(struct device *dev);
> +
>  	int (*suspend)(struct device *dev, pm_message_t state);
>  	int (*resume)(struct device *dev);
>
> @@ -646,6 +653,8 @@ struct acpi_dev_node {
>   * @release:	Callback to free the device after all references have
>   *		gone away. This should be set by the allocator of the
>   *		device (i.e. the bus driver that discovered the device).
> + * @offline_disabled: If set, the device is permanently online.
> + * @offline:	Set after successful invocation of bus type's .offline().
>   *
>   * At the lowest level, every device in a Linux system is represented by an
>   * instance of struct device. The device structure contains the information
> @@ -718,6 +727,9 @@ struct device {
>
>  	void	(*release)(struct device *dev);
>  	struct iommu_group	*iommu_group;
> +
> +	bool			offline_disabled:1;
> +	bool			offline:1;
>  };
>
>  static inline struct device *kobj_to_dev(struct kobject *kobj)
> @@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
>  extern void *dev_get_drvdata(const struct device *dev);
>  extern int dev_set_drvdata(struct device *dev, void *data);
>
> +static inline bool device_supports_offline(struct device *dev)

Since we renamed "offline" to "hotplug" for the lock interfaces, should
this function be renamed to device_supports_hotplug() as well?

Thanks,
-Toshi

> +{
> +	return dev->bus && dev->bus->offline && dev->bus->online;
> +}
> +

On Fri, 2013-05-03 at 01:36 +0200, Rafael J. Wysocki wrote:
> On Thursday, May 02, 2013 05:11:27 PM Toshi Kani wrote:
> > On Thu, 2013-05-02 at 14:27 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
 :
> > >
> > > +static inline bool device_supports_offline(struct device *dev)
> >
> > Since we renamed "offline" to "hotplug" for the lock interfaces, should
> > this function be renamed to device_supports_hotplug() as well?
>
> Well, "offline" is more specific, as there may be devices that don't
> support offline/online, but support hotplug otherwise. That's why I didn't
> change it.

I see. That makes sense.

Thanks,
-Toshi

On Thursday, May 02, 2013 05:11:27 PM Toshi Kani wrote:
> On Thu, 2013-05-02 at 14:27 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > In some cases, graceful hot-removal of devices is not possible,
> > although in principle the devices in question support hotplug.
> > For example, that may happen for the last CPU in the system or
> > for memory modules holding kernel memory.
> >
> > In those cases it is nice to be able to check if the given device
> > can be gracefully hot-removed before triggering a removal procedure
> > that cannot be aborted or reversed. Unfortunately, however, the
> > kernel currently doesn't provide any support for that.
> >
> > To address that deficiency, introduce support for offline and
> > online operations that can be performed on devices, respectively,
> > before a hot-removal and in case when it is necessary (or convenient)
> > to put a device back online after a successful offline (that has not
> > been followed by removal). The idea is that the offline will fail
> > whenever the given device cannot be gracefully removed from the
> > system and it will not be allowed to use the device after a
> > successful offline (until a subsequent online) in analogy with the
> > existing CPU offline/online mechanism.
> >
> > For now, the offline and online operations are introduced at the
> > bus type level, as that should be sufficient for the most urgent use
> > cases (CPUs and memory modules). In the future, however, the
> > approach may be extended to cover some more complicated device
> > offline/online scenarios involving device drivers etc.
> >
> > The lock_device_hotplug() and unlock_device_hotplug() functions are
> > introduced because subsequent patches need to put larger pieces of
> > code under device_hotplug_lock to prevent race conditions between
> > device offline and removal from happening.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Looks good. For patch 1/4 to 3/4:
>
> Reviewed-by: Toshi Kani <toshi.kani@hp.com>

Thanks!

> I have one minor comment below.
>
> > ---
> >  Documentation/ABI/testing/sysfs-devices-online |   20 +++
> >  drivers/base/core.c                            |  130 +++++++++++++++++++++++++
> >  include/linux/device.h                         |   21 ++++
> >  3 files changed, 171 insertions(+)
> >
> > Index: linux-pm/include/linux/device.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/device.h
> > +++ linux-pm/include/linux/device.h
> > @@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
> >   *		the specific driver's probe to initial the matched device.
> >   * @remove:	Called when a device removed from this bus.
> >   * @shutdown:	Called at shut-down time to quiesce the device.
> > + *
> > + * @online:	Called to put the device back online (after offlining it).
> > + * @offline:	Called to put the device offline for hot-removal. May fail.
> > + *
> >   * @suspend:	Called when a device on this bus wants to go to sleep mode.
> >   * @resume:	Called to bring a device on this bus out of sleep mode.
> >   * @pm:		Power management operations of this bus, callback the specific
> > @@ -103,6 +107,9 @@ struct bus_type {
> >  	int (*remove)(struct device *dev);
> >  	void (*shutdown)(struct device *dev);
> >
> > +	int (*online)(struct device *dev);
> > +	int (*offline)(struct device *dev);
> > +
> >  	int (*suspend)(struct device *dev, pm_message_t state);
> >  	int (*resume)(struct device *dev);
> >
> > @@ -646,6 +653,8 @@ struct acpi_dev_node {
> >   * @release:	Callback to free the device after all references have
> >   *		gone away. This should be set by the allocator of the
> >   *		device (i.e. the bus driver that discovered the device).
> > + * @offline_disabled: If set, the device is permanently online.
> > + * @offline:	Set after successful invocation of bus type's .offline().
> >   *
> >   * At the lowest level, every device in a Linux system is represented by an
> >   * instance of struct device. The device structure contains the information
> > @@ -718,6 +727,9 @@ struct device {
> >
> >  	void	(*release)(struct device *dev);
> >  	struct iommu_group	*iommu_group;
> > +
> > +	bool			offline_disabled:1;
> > +	bool			offline:1;
> >  };
> >
> >  static inline struct device *kobj_to_dev(struct kobject *kobj)
> > @@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
> >  extern void *dev_get_drvdata(const struct device *dev);
> >  extern int dev_set_drvdata(struct device *dev, void *data);
> >
> > +static inline bool device_supports_offline(struct device *dev)
>
> Since we renamed "offline" to "hotplug" for the lock interfaces, should
> this function be renamed to device_supports_hotplug() as well?

Well, "offline" is more specific, as there may be devices that don't
support offline/online, but support hotplug otherwise. That's why I didn't
change it.

Thanks,
Rafael

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
  *		the specific driver's probe to initial the matched device.
  * @remove:	Called when a device removed from this bus.
  * @shutdown:	Called at shut-down time to quiesce the device.
+ *
+ * @online:	Called to put the device back online (after offlining it).
+ * @offline:	Called to put the device offline for hot-removal. May fail.
+ *
  * @suspend:	Called when a device on this bus wants to go to sleep mode.
  * @resume:	Called to bring a device on this bus out of sleep mode.
  * @pm:		Power management operations of this bus, callback the specific
@@ -103,6 +107,9 @@ struct bus_type {
 	int (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
 
+	int (*online)(struct device *dev);
+	int (*offline)(struct device *dev);
+
 	int (*suspend)(struct device *dev, pm_message_t state);
 	int (*resume)(struct device *dev);
 
@@ -646,6 +653,8 @@ struct acpi_dev_node {
  * @release:	Callback to free the device after all references have
  *		gone away. This should be set by the allocator of the
  *		device (i.e. the bus driver that discovered the device).
+ * @offline_disabled: If set, the device is permanently online.
+ * @offline:	Set after successful invocation of bus type's .offline().
  *
  * At the lowest level, every device in a Linux system is represented by an
  * instance of struct device. The device structure contains the information
@@ -718,6 +727,9 @@ struct device {
 
 	void	(*release)(struct device *dev);
 	struct iommu_group	*iommu_group;
+
+	bool			offline_disabled:1;
+	bool			offline:1;
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
@@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
 extern void *dev_get_drvdata(const struct device *dev);
 extern int dev_set_drvdata(struct device *dev, void *data);
 
+static inline bool device_supports_offline(struct device *dev)
+{
+	return dev->bus && dev->bus->offline && dev->bus->online;
+}
+
+extern void lock_device_hotplug(void);
+extern void unlock_device_hotplug(void);
+extern int device_offline(struct device *dev);
+extern int device_online(struct device *dev);
 /*
  * Root device objects for grouping under /sys/devices
  */
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -397,6 +397,36 @@ static ssize_t store_uevent(struct devic
 static struct device_attribute uevent_attr =
 	__ATTR(uevent, S_IRUGO | S_IWUSR, show_uevent, store_uevent);
 
+static ssize_t show_online(struct device *dev, struct device_attribute *attr,
+			   char *buf)
+{
+	bool val;
+
+	lock_device_hotplug();
+	val = !dev->offline;
+	unlock_device_hotplug();
+	return sprintf(buf, "%u\n", val);
+}
+
+static ssize_t store_online(struct device *dev, struct device_attribute *attr,
+			    const char *buf, size_t count)
+{
+	bool val;
+	int ret;
+
+	ret = strtobool(buf, &val);
+	if (ret < 0)
+		return ret;
+
+	lock_device_hotplug();
+	ret = val ? device_online(dev) : device_offline(dev);
+	unlock_device_hotplug();
+	return ret < 0 ? ret : count;
+}
+
+static struct device_attribute online_attr =
+	__ATTR(online, S_IRUGO | S_IWUSR, show_online, store_online);
+
 static int device_add_attributes(struct device *dev,
 				 struct device_attribute *attrs)
 {
@@ -510,6 +540,12 @@ static int device_add_attrs(struct devic
 	if (error)
 		goto err_remove_type_groups;
 
+	if (device_supports_offline(dev) && !dev->offline_disabled) {
+		error = device_create_file(dev, &online_attr);
+		if (error)
+			goto err_remove_type_groups;
+	}
+
 	return 0;
 
  err_remove_type_groups:
@@ -530,6 +566,7 @@ static void device_remove_attrs(struct d
 	struct class *class = dev->class;
 	const struct device_type *type = dev->type;
 
+	device_remove_file(dev, &online_attr);
 	device_remove_groups(dev, dev->groups);
 
 	if (type)
@@ -1415,6 +1452,99 @@ EXPORT_SYMBOL_GPL(put_device);
 EXPORT_SYMBOL_GPL(device_create_file);
 EXPORT_SYMBOL_GPL(device_remove_file);
 
+static DEFINE_MUTEX(device_hotplug_lock);
+
+void lock_device_hotplug(void)
+{
+	mutex_lock(&device_hotplug_lock);
+}
+
+void unlock_device_hotplug(void)
+{
+	mutex_unlock(&device_hotplug_lock);
+}
+
+static int device_check_offline(struct device *dev, void *not_used)
+{
+	int ret;
+
+	ret = device_for_each_child(dev, NULL, device_check_offline);
+	if (ret)
+		return ret;
+
+	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
+}
+
+/**
+ * device_offline - Prepare the device for hot-removal.
+ * @dev: Device to be put offline.
+ *
+ * Execute the device bus type's .offline() callback, if present, to prepare
+ * the device for a subsequent hot-removal. If that succeeds, the device must
+ * not be used until either it is removed or its bus type's .online() callback
+ * is executed.
+ *
+ * Call under device_hotplug_lock.
+ */
+int device_offline(struct device *dev)
+{
+	int ret;
+
+	if (dev->offline_disabled)
+		return -EPERM;
+
+	ret = device_for_each_child(dev, NULL, device_check_offline);
+	if (ret)
+		return ret;
+
+	device_lock(dev);
+	if (device_supports_offline(dev)) {
+		if (dev->offline) {
+			ret = 1;
+		} else {
+			ret = dev->bus->offline(dev);
+			if (!ret) {
+				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+				dev->offline = true;
+			}
+		}
+	}
+	device_unlock(dev);
+
+	return ret;
+}
+
+/**
+ * device_online - Put the device back online after successful device_offline().
+ * @dev: Device to be put back online.
+ *
+ * If device_offline() has been successfully executed for @dev, but the device
+ * has not been removed subsequently, execute its bus type's .online() callback
+ * to indicate that the device can be used again.
+ *
+ * Call under device_hotplug_lock.
+ */
+int device_online(struct device *dev)
+{
+	int ret = 0;
+
+	device_lock(dev);
+	if (device_supports_offline(dev)) {
+		if (dev->offline) {
+			ret = dev->bus->online(dev);
+			if (!ret) {
+				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
+				dev->offline = false;
+			}
+		} else {
+			ret = 1;
+		}
+	}
+	device_unlock(dev);
+
+	return ret;
+}
+
 struct root_device {
 	struct device dev;
 	struct module *owner;
Index: linux-pm/Documentation/ABI/testing/sysfs-devices-online
===================================================================
--- /dev/null
+++ linux-pm/Documentation/ABI/testing/sysfs-devices-online
@@ -0,0 +1,20 @@
+What:		/sys/devices/.../online
+Date:		April 2013
+Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+Description:
+		The /sys/devices/.../online attribute is only present for
+		devices whose bus types provide .online() and .offline()
+		callbacks. The number read from it (0 or 1) reflects the value
+		of the device's 'offline' field. If that number is 1 and '0'
+		(or 'n', or 'N') is written to this file, the device bus type's
+		.offline() callback is executed for the device and (if
+		successful) its 'offline' field is updated accordingly. In
+		turn, if that number is 0 and '1' (or 'y', or 'Y') is written to
+		this file, the device bus type's .online() callback is executed
+		for the device and (if successful) its 'offline' field is
+		updated as appropriate.
+
+		After a successful execution of the bus type's .offline()
+		callback the device cannot be used for any purpose until either
+		it is removed (i.e. device_del() is called for it), or its bus
+		type's .online() is executed successfully.
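
For readers who want to exercise the attribute from user space, here is a minimal sketch (not part of the patch) of reading and writing an "online" file in C. The helper names get_online() and set_online() are hypothetical, and the attribute path is supplied by the caller because it depends on the device. Going by store_online() in the patch, a write should succeed even if the device is already in the requested state (the value 1 returned by device_online()/device_offline() is not treated as an error), while a negative error code from those functions is reported to the writer through errno.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read an 'online' attribute (hypothetical helper).
 * Returns 1 if the file reads "1", 0 if it reads "0",
 * or a negative errno value on failure.
 */
static int get_online(const char *attr_path)
{
	char c = '\0';
	ssize_t n;
	int fd, ret;

	assert(attr_path != NULL);

	fd = open(attr_path, O_RDONLY);
	if (fd < 0)
		return -errno;

	n = read(fd, &c, 1);
	if (n < 0)
		ret = -errno;
	else if (n == 1 && (c == '0' || c == '1'))
		ret = c - '0';
	else
		ret = -EINVAL;	/* empty file or unexpected contents */

	close(fd);
	return ret;
}

/*
 * Write '1' (online) or '0' (offline) to an 'online' attribute
 * (hypothetical helper).  Returns 0 on success or a negative errno
 * value, e.g. whatever error the kernel reported for the request.
 */
static int set_online(const char *attr_path, int online)
{
	const char val = online ? '1' : '0';
	ssize_t n;
	int fd, ret = 0;

	assert(attr_path != NULL);

	fd = open(attr_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, &val, 1);
	if (n < 0)
		ret = -errno;
	else if (n != 1)
		ret = -EIO;	/* nothing was written */

	close(fd);
	return ret;
}
```

A caller would typically check the result of set_online(path, 0) before starting a removal procedure, and treat any negative return value as "the device cannot be gracefully removed right now".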