Message ID | alpine.LRH.2.02.1401021733250.27775@file01.intranet.prod.int.rdu2.redhat.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
On 1/4/14, 1:06 PM, Mikulas Patocka wrote: > Hi > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > this interface is still unused. > > However, converting the drivers to use kobj_completion is not trivial > (note that all users of the original kobject interface are buggy - so all > of them need to be converted). > > I came up with a simpler patch to achieve the same purpose - this patch > makes fixing the drivers easy - the driver is fixed just by replacing > "kobject_put" with "kobject_put_wait" in the unload routine. > > I'd like to ask if you could revert > eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it > with this patch. I have no objections to reverting it. There were concerns from Al Viro that it'd be tough to get right by callers and I had assumed it got dropped after that. I had planned on using it in my btrfs sysfs exports patchset but came up with a better way. -Jeff > See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for > the bug that this patch fixes. > > Mikulas > > > > From: Mikulas Patocka <mpatocka@redhat.com> > > This patch introduces a new function kobject_put_wait. It decrements the > kobject reference count, waits until the count reaches zero. When this > function returns, it is guaranteed that the kobject was freed. > > A rationale for this function: > > The kobject is keeps a reference count. The driver unload routine > decrements the reference count, however, references to the kobject may > still be held by other kernel subsystems. The driver must not free the > memory that contains the kobject. Instead, the driver provides a "release" > method. The "release" method is called by the kernel when the last kobject > refernce is dropped. The "release" method should free the memory that > contains the kobject. > > However, this pattern is buggy with respect to modules. The release method > is placed in the driver's module. When the driver exits, the module > reference count is zero, thus the module may be freed. However, there may > still be references to the kobject. If the module is unloaded and then the > release method is called, a crash happens. > > Recently, CONFIG_DEBUG_KOBJECT_RELEASE was added. This option deliberately > provokes this race condition. > > This patch fixes the bug by providing new function kobject_put_wait. > kobject_put_wait works like kobject_put, but it also waits until all other > references were dropped and until the kobject was freed. When > kobject_put_wait returns, it is guaranteed that the kobject was released > with the release method. > > Thus, we can change kobject_put to kobject_put_wait in the unload routine > of various drivers to fix the above race condition. > > This patch fixes it for device mapper. Note that all kobject users in > modules should be fixed to use this function. > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > Cc: stable@kernel.org > > --- > drivers/md/dm-sysfs.c | 2 +- > include/linux/kobject.h | 3 +++ > lib/kobject.c | 34 +++++++++++++++++++++++++++++----- > 3 files changed, 33 insertions(+), 6 deletions(-) > > Index: linux-3.13-rc6/include/linux/kobject.h > =================================================================== > --- linux-3.13-rc6.orig/include/linux/kobject.h 2014-01-02 23:13:24.000000000 +0100 > +++ linux-3.13-rc6/include/linux/kobject.h 2014-01-02 23:14:02.000000000 +0100 > @@ -27,6 +27,7 @@ > #include <linux/wait.h> > #include <linux/atomic.h> > #include <linux/workqueue.h> > +#include <linux/completion.h> > > #define UEVENT_HELPER_PATH_LEN 256 > #define UEVENT_NUM_ENVP 32 /* number of env pointers */ > @@ -66,6 +67,7 @@ struct kobject { > struct kobj_type *ktype; > struct sysfs_dirent *sd; > struct kref kref; > + struct completion *free_completion; > #ifdef CONFIG_DEBUG_KOBJECT_RELEASE > struct delayed_work release; > #endif > @@ -106,6 +108,7 @@ extern int __must_check kobject_move(str > > extern struct kobject *kobject_get(struct kobject *kobj); > extern void kobject_put(struct kobject *kobj); > +extern void kobject_put_wait(struct kobject *kobj); > > extern const void *kobject_namespace(struct kobject *kobj); > extern char *kobject_get_path(struct kobject *kobj, gfp_t flag); > Index: linux-3.13-rc6/lib/kobject.c > =================================================================== > --- linux-3.13-rc6.orig/lib/kobject.c 2014-01-02 23:13:23.000000000 +0100 > +++ linux-3.13-rc6/lib/kobject.c 2014-01-02 23:17:01.000000000 +0100 > @@ -172,6 +172,7 @@ static void kobject_init_internal(struct > if (!kobj) > return; > kref_init(&kobj->kref); > + kobj->free_completion = NULL; > INIT_LIST_HEAD(&kobj->entry); > kobj->state_in_sysfs = 0; > kobj->state_add_uevent_sent = 0; > @@ -577,15 +578,11 @@ static void kobject_cleanup(struct kobje > { > struct kobj_type *t = get_ktype(kobj); > const char *name = kobj->name; > + struct completion *free_completion = kobj->free_completion; > > pr_debug("kobject: '%s' (%p): %s, parent %p\n", > kobject_name(kobj), kobj, __func__, kobj->parent); > > - if (t && !t->release) > - pr_debug("kobject: '%s' (%p): does not have a release() " > - "function, it is broken and must be fixed.\n", > - kobject_name(kobj), kobj); > - > /* send "remove" if the caller did not do it but sent "add" */ > if (kobj->state_add_uevent_sent && !kobj->state_remove_uevent_sent) { > pr_debug("kobject: '%s' (%p): auto cleanup 'remove' event\n", > @@ -611,6 +608,10 @@ static void kobject_cleanup(struct kobje > pr_debug("kobject: '%s': free name\n", name); > kfree(name); > } > + > + /* if someone is waiting for the kobject to be freed, wake him up */ > + if (free_completion) > + complete(free_completion); > } > > #ifdef CONFIG_DEBUG_KOBJECT_RELEASE > @@ -651,6 +652,28 @@ void kobject_put(struct kobject *kobj) > } > } > > +/** > + * kobject_put - decrement refcount for object and wait until it reaches zero. > + * @kobj: object. > + * > + * Decrement the refcount, and wait until the refcount reaches zero and the > + * kobject is freed. > + * > + * This function should be called from the driver unload routine. It must not > + * be called concurrently on the same kobject. When this function returns, it > + * is guaranteed that the kobject was freed. > + */ > +void kobject_put_wait(struct kobject *kobj) > +{ > + if (kobj) { > + DECLARE_COMPLETION_ONSTACK(completion); > + BUG_ON(kobj->free_completion); > + kobj->free_completion = &completion; > + kobject_put(kobj); > + wait_for_completion(&completion); > + } > +} > + > static void dynamic_kobj_release(struct kobject *kobj) > { > pr_debug("kobject: (%p): %s\n", kobj, __func__); > @@ -1076,6 +1099,7 @@ void kobj_ns_drop(enum kobj_ns_type type > > EXPORT_SYMBOL(kobject_get); > EXPORT_SYMBOL(kobject_put); > +EXPORT_SYMBOL(kobject_put_wait); > EXPORT_SYMBOL(kobject_del); > > EXPORT_SYMBOL(kset_register); > Index: linux-3.13-rc6/drivers/md/dm-sysfs.c > =================================================================== > --- linux-3.13-rc6.orig/drivers/md/dm-sysfs.c 2014-01-02 23:13:24.000000000 +0100 > +++ linux-3.13-rc6/drivers/md/dm-sysfs.c 2014-01-02 23:14:02.000000000 +0100 > @@ -104,5 +104,5 @@ int dm_sysfs_init(struct mapped_device * > */ > void dm_sysfs_exit(struct mapped_device *md) > { > - kobject_put(dm_kobject(md)); > + kobject_put_wait(dm_kobject(md)); > } >
On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote: > Hi > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > this interface is still unused. There are pending btrfs patches to use this interface. > However, converting the drivers to use kobj_completion is not trivial > (note that all users of the original kobject interface are buggy - so all > of them need to be converted). Wait, what? How are "all users" buggy? Please explain this in detail. > I came up with a simpler patch to achieve the same purpose - this patch > makes fixing the drivers easy - the driver is fixed just by replacing > "kobject_put" with "kobject_put_wait" in the unload routine. No, that's not ok at all. > I'd like to ask if you could revert > eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it > with this patch. > > See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for > the bug that this patch fixes. > > Mikulas > > > > From: Mikulas Patocka <mpatocka@redhat.com> > > This patch introduces a new function kobject_put_wait. It decrements the > kobject reference count, waits until the count reaches zero. When this > function returns, it is guaranteed that the kobject was freed. > > A rationale for this function: > > The kobject is keeps a reference count. The driver unload routine > decrements the reference count, however, references to the kobject may > still be held by other kernel subsystems. The driver must not free the > memory that contains the kobject. Instead, the driver provides a "release" > method. The "release" method is called by the kernel when the last kobject > refernce is dropped. The "release" method should free the memory that > contains the kobject. > > However, this pattern is buggy with respect to modules. The release method > is placed in the driver's module. When the driver exits, the module > reference count is zero, thus the module may be freed. However, there may > still be references to the kobject. If the module is unloaded and then the > release method is called, a crash happens. Yes, module unloading while a kobject is still "active" is not a good thing, what modules do you have that cause this problem? Why not just grab the module reference in your kobject if you need this type of protection? It's not the kobject's code fault that this issue is there, or that we now have a "delayed release" function to expose this type of thing, it's the user of the kobject. Please fix the broken users of the kobject first. thanks, greg k-h -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote: > > I came up with a simpler patch to achieve the same purpose - this patch > > makes fixing the drivers easy - the driver is fixed just by replacing > > "kobject_put" with "kobject_put_wait" in the unload routine. > > No, that's not ok at all. Agreed - all it takes is one cargo-culter who religoiusly does such conversion and drops a ref to parent before that to child. > > However, this pattern is buggy with respect to modules. The release method > > is placed in the driver's module. When the driver exits, the module > > reference count is zero, thus the module may be freed. However, there may > > still be references to the kobject. If the module is unloaded and then the > > release method is called, a crash happens. > > Yes, module unloading while a kobject is still "active" is not a good > thing, what modules do you have that cause this problem? Why not just > grab the module reference in your kobject if you need this type of > protection? It's not the kobject's code fault that this issue is there, > or that we now have a "delayed release" function to expose this type of > thing, it's the user of the kobject. > > Please fix the broken users of the kobject first. <snide> Are you saying that there is another kind? </snide> When would you grab that reference to module? More to the point, when would you *drop* it? Doing so from module_exit is not going to work, obviously... -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote: > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote: > > Hi > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > > this interface is still unused. > > There are pending btrfs patches to use this interface. > > > However, converting the drivers to use kobj_completion is not trivial > > (note that all users of the original kobject interface are buggy - so all > > of them need to be converted). > > Wait, what? How are "all users" buggy? Please explain this in detail. 1) some code takes a reference to a kobject 2) the user unloads the device 3) the device driver unload routine calls kobject_put (but there is still reference, so the kobject is not destroyed) 4) the device is unregistered, it drops the module reference count to zero 5) the user unloads the module 6) the piece of code that grabbed the kobject reference at step 1) drops it 7) the kobject code calls ->release method, it points into the unloaded module You could try to fix it by incrementing the module reference count between steps 2) and 3) and decrementing it in the kobject release routine, but it doesn't work - you'd have to run "module_put(THIS_MODULE);" from the kobject release routine - it is still buggy, because the module may be unloaded just after "module_put(THIS_MODULE);" and before the release routine returns. Another problem with incrementing the module reference count happens if the above sequence is executing as a consequence of module unloading - suppose that steps 2)-4) are being executed from the module unload routine itself (for example, network drivers have this property that the module may be removed while the device is in use and the module unload routine shuts the device down). In this case, there is no way to delay unloading of the module from the unload routine itself. > > I came up with a simpler patch to achieve the same purpose - this patch > > makes fixing the drivers easy - the driver is fixed just by replacing > > "kobject_put" with "kobject_put_wait" in the unload routine. > > No, that's not ok at all. What exactly do you think is wrong with this patch? > > I'd like to ask if you could revert > > eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it > > with this patch. > > > > See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for > > the bug that this patch fixes. > > > > Mikulas > > > > > > > > From: Mikulas Patocka <mpatocka@redhat.com> > > > > This patch introduces a new function kobject_put_wait. It decrements the > > kobject reference count, waits until the count reaches zero. When this > > function returns, it is guaranteed that the kobject was freed. > > > > A rationale for this function: > > > > The kobject is keeps a reference count. The driver unload routine > > decrements the reference count, however, references to the kobject may > > still be held by other kernel subsystems. The driver must not free the > > memory that contains the kobject. Instead, the driver provides a "release" > > method. The "release" method is called by the kernel when the last kobject > > refernce is dropped. The "release" method should free the memory that > > contains the kobject. > > > > However, this pattern is buggy with respect to modules. The release method > > is placed in the driver's module. When the driver exits, the module > > reference count is zero, thus the module may be freed. However, there may > > still be references to the kobject. If the module is unloaded and then the > > release method is called, a crash happens. > > Yes, module unloading while a kobject is still "active" is not a good > thing, what modules do you have that cause this problem? All modules - Do you have some code in the generic module unload routine that prevents unloading the module while someone holds a reference to a kobject created by the module? If not, then all modules have this bug. > Why not just > grab the module reference in your kobject if you need this type of > protection? See above - you'd have to drop the reference count from the module itself. And it doesn't work if you are destroying the kobject from the module unload routine itself. > It's not the kobject's code fault that this issue is there, > or that we now have a "delayed release" function to expose this type of > thing, it's the user of the kobject. So, show me any single driver that do this correctly. I haven't looked at the whole kernel, but all drivers I have looked on so far are buggy. Just a grep '\.release' drivers/*/*.c shows this: These are buggy because they don't manage module references in the kobject release routine: pkt_kobj_release, cpufreq_sysfs_release, cpuidle_sysfs_release, ktype_state_cpuidle, cpuidle_driver_sysfs_release, dmi_entry_free, dmi_sysfs_entry_release, release_firmware_map_entry, rdev_free, md_free, pps_cdev_release, iscsi_boot_kobj_release, map_release, portio_release These are decrementing the own module count from the release method, so they are less buggy than the above, but still buggy: edac_device_ctrl_instance_release, edac_device_ctrl_master_release, edac_device_ctrl_instance_release > Please fix the broken users of the kobject first. > > thanks, > > greg k-h Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote: > On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote: > > > > I came up with a simpler patch to achieve the same purpose - this patch > > > makes fixing the drivers easy - the driver is fixed just by replacing > > > "kobject_put" with "kobject_put_wait" in the unload routine. > > > > No, that's not ok at all. > > Agreed - all it takes is one cargo-culter who religoiusly does such > conversion and drops a ref to parent before that to child. > > > > However, this pattern is buggy with respect to modules. The release method > > > is placed in the driver's module. When the driver exits, the module > > > reference count is zero, thus the module may be freed. However, there may > > > still be references to the kobject. If the module is unloaded and then the > > > release method is called, a crash happens. > > > > Yes, module unloading while a kobject is still "active" is not a good > > thing, what modules do you have that cause this problem? Why not just > > grab the module reference in your kobject if you need this type of > > protection? It's not the kobject's code fault that this issue is there, > > or that we now have a "delayed release" function to expose this type of > > thing, it's the user of the kobject. > > > > Please fix the broken users of the kobject first. > > <snide> Are you saying that there is another kind? </snide> > > When would you grab that reference to module? More to the point, when > would you *drop* it? Doing so from module_exit is not going to work, > obviously... You normally have subsystem core module that does handle release of its objects and users of said objects so it is usually OK for objects to outlive the users, you just need to make sure the core stays around. In input we grab module reference to input core when we allocate input device and drop it when input device is freed. This way we can be sure that input core stays around until all input devices are gone. The same for serio. Thanks.
On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote: > > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote: > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote: > > > Hi > > > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > > > this interface is still unused. > > > > There are pending btrfs patches to use this interface. > > > > > However, converting the drivers to use kobj_completion is not trivial > > > (note that all users of the original kobject interface are buggy - so all > > > of them need to be converted). > > > > Wait, what? How are "all users" buggy? Please explain this in detail. > > 1) some code takes a reference to a kobject > 2) the user unloads the device > 3) the device driver unload routine calls kobject_put (but there is still > reference, so the kobject is not destroyed) A driver should never be messing around with "raw" kobjects, they should be using a 'struct device' which is created/managed by the subsystem they belong to. See Dmitry's example of input and serio as ways to do this, also USB and PCI do this properly. Perhaps your sybsystem isn't doing this properly? What code do you have that creates raw kobjects and has this problem? thanks, greg k-h -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sat, Jan 04, 2014 at 01:14:14PM -0500, Jeff Mahoney wrote: > On 1/4/14, 1:06 PM, Mikulas Patocka wrote: > > Hi > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > > this interface is still unused. > > > > However, converting the drivers to use kobj_completion is not trivial > > (note that all users of the original kobject interface are buggy - so all > > of them need to be converted). > > > > I came up with a simpler patch to achieve the same purpose - this patch > > makes fixing the drivers easy - the driver is fixed just by replacing > > "kobject_put" with "kobject_put_wait" in the unload routine. > > > > I'd like to ask if you could revert > > eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it > > with this patch. > > I have no objections to reverting it. There were concerns from Al Viro > that it'd be tough to get right by callers and I had assumed it got > dropped after that. I had planned on using it in my btrfs sysfs exports > patchset but came up with a better way. Ok, thanks, I'll revert it for 3.14. greg k-h -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sat, Jan 04, 2014 at 07:42:28PM -0800, Greg Kroah-Hartman wrote: > On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote: > > > > > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote: > > > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote: > > > > Hi > > > > > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > > > > this interface is still unused. > > > > > > There are pending btrfs patches to use this interface. > > > > > > > However, converting the drivers to use kobj_completion is not trivial > > > > (note that all users of the original kobject interface are buggy - so all > > > > of them need to be converted). > > > > > > Wait, what? How are "all users" buggy? Please explain this in detail. > > > > 1) some code takes a reference to a kobject > > 2) the user unloads the device > > 3) the device driver unload routine calls kobject_put (but there is still > > reference, so the kobject is not destroyed) > > A driver should never be messing around with "raw" kobjects, they should > be using a 'struct device' which is created/managed by the subsystem > they belong to. See Dmitry's example of input and serio as ways to do > this, also USB and PCI do this properly. Well, Mikulas is correct in the sense that there is still a race between release function invoking the final module_put() and getting preempted and module getting unloaded by another thread. Hitting this race is pretty hard though.
On 01/04/14 19:06, Mikulas Patocka wrote: > - if (t && !t->release) > - pr_debug("kobject: '%s' (%p): does not have a release() " > - "function, it is broken and must be fixed.\n", > - kobject_name(kobj), kobj); > - Has it been considered to issue a warning if no release function has been defined and free_completion == NULL instead of removing the above debug message entirely ? I think even with this patch applied it is still wrong to invoke kobject_put() on an object without defining a release function. Thanks, Bart. -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote: > On 01/04/14 19:06, Mikulas Patocka wrote: > > - if (t && !t->release) > > - pr_debug("kobject: '%s' (%p): does not have a release() " > > - "function, it is broken and must be fixed.\n", > > - kobject_name(kobj), kobj); > > - > > Has it been considered to issue a warning if no release function has > been defined and free_completion == NULL instead of removing the above > debug message entirely ? I think even with this patch applied it is > still wrong to invoke kobject_put() on an object without defining a > release function. This patch isn't going to be applied, and I've reverted the original commit, so there shouldn't be any issues anymore with this code. thanks, greg k-h -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sat, Jan 04, 2014 at 10:05:31PM -0800, Dmitry Torokhov wrote: > On Sat, Jan 04, 2014 at 07:42:28PM -0800, Greg Kroah-Hartman wrote: > > On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote: > > > > > > > > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote: > > > > > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote: > > > > > Hi > > > > > > > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > > > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > > > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > > > > > this interface is still unused. > > > > > > > > There are pending btrfs patches to use this interface. > > > > > > > > > However, converting the drivers to use kobj_completion is not trivial > > > > > (note that all users of the original kobject interface are buggy - so all > > > > > of them need to be converted). > > > > > > > > Wait, what? How are "all users" buggy? Please explain this in detail. > > > > > > 1) some code takes a reference to a kobject > > > 2) the user unloads the device > > > 3) the device driver unload routine calls kobject_put (but there is still > > > reference, so the kobject is not destroyed) > > > > A driver should never be messing around with "raw" kobjects, they should > > be using a 'struct device' which is created/managed by the subsystem > > they belong to. See Dmitry's example of input and serio as ways to do > > this, also USB and PCI do this properly. > > Well, Mikulas is correct in the sense that there is still a race between > release function invoking the final module_put() and getting preempted > and module getting unloaded by another thread. Hitting this race is > pretty hard though. Yes, that is true, especially as no one auto-unloads modules for reasons like this :) thanks, greg k-h -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote: > On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote: > > > > > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote: > > > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote: > > > > Hi > > > > > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > > > > this interface is still unused. > > > > > > There are pending btrfs patches to use this interface. > > > > > > > However, converting the drivers to use kobj_completion is not trivial > > > > (note that all users of the original kobject interface are buggy - so all > > > > of them need to be converted). > > > > > > Wait, what? How are "all users" buggy? Please explain this in detail. > > > > 1) some code takes a reference to a kobject > > 2) the user unloads the device > > 3) the device driver unload routine calls kobject_put (but there is still > > reference, so the kobject is not destroyed) > > A driver should never be messing around with "raw" kobjects, they should > be using a 'struct device' which is created/managed by the subsystem > they belong to. See Dmitry's example of input and serio as ways to do > this, also USB and PCI do this properly. > > Perhaps your sybsystem isn't doing this properly? What code do you have > that creates raw kobjects and has this problem? > > thanks, > > greg k-h So, are you saying that a module shouldn't ever be able to create a kobject type? Do "grep -rw kobj_type drivers/ fs/* net/bridge/" to see how much code uses kobjects. There are 77 line. Majority of them may be compiled as modules. What do you want to do with all those kobject users? Hide them behind another interface that doesn't exists yet? Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sun, 5 Jan 2014, Bart Van Assche wrote: > On 01/04/14 19:06, Mikulas Patocka wrote: > > - if (t && !t->release) > > - pr_debug("kobject: '%s' (%p): does not have a release() " > > - "function, it is broken and must be fixed.\n", > > - kobject_name(kobj), kobj); > > - > > Has it been considered to issue a warning if no release function has > been defined and free_completion == NULL instead of removing the above > debug message entirely ? I think even with this patch applied it is > still wrong to invoke kobject_put() on an object without defining a > release function. > > Thanks, > > Bart. With the above patch, you don't need the release method for correct behavior. Therefore you don't have to issue warning when it is missing. Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sat, 4 Jan 2014, Dmitry Torokhov wrote: > On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote: > > On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote: > > > > > > I came up with a simpler patch to achieve the same purpose - this patch > > > > makes fixing the drivers easy - the driver is fixed just by replacing > > > > "kobject_put" with "kobject_put_wait" in the unload routine. > > > > > > No, that's not ok at all. > > > > Agreed - all it takes is one cargo-culter who religoiusly does such > > conversion and drops a ref to parent before that to child. > > > > > > However, this pattern is buggy with respect to modules. The release method > > > > is placed in the driver's module. When the driver exits, the module > > > > reference count is zero, thus the module may be freed. However, there may > > > > still be references to the kobject. If the module is unloaded and then the > > > > release method is called, a crash happens. > > > > > > Yes, module unloading while a kobject is still "active" is not a good > > > thing, what modules do you have that cause this problem? Why not just > > > grab the module reference in your kobject if you need this type of > > > protection? It's not the kobject's code fault that this issue is there, > > > or that we now have a "delayed release" function to expose this type of > > > thing, it's the user of the kobject. > > > > > > Please fix the broken users of the kobject first. > > > > <snide> Are you saying that there is another kind? </snide> > > > > When would you grab that reference to module? More to the point, when > > would you *drop* it? Doing so from module_exit is not going to work, > > obviously... > > You normally have subsystem core module that does handle release of its > objects and users of said objects so it is usually OK for objects to > outlive the users, you just need to make sure the core stays around. > > In input we grab module reference to input core when we allocate input > device and drop it when input device is freed. This way we can be sure > that input core stays around until all input devices are gone. The same > for serio. > > Thanks. > > -- > Dmitry But sometimes, the driver itself needs to create nodes in the sysfs filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how would you push all driver-specific sysfs nodes into the generic non-module code. Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sun, Jan 05, 2014 at 05:04:31PM -0500, Mikulas Patocka wrote: > > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote: > > > On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote: > > > > > > > > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote: > > > > > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote: > > > > > Hi > > > > > > > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined > > > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch > > > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, > > > > > this interface is still unused. > > > > > > > > There are pending btrfs patches to use this interface. > > > > > > > > > However, converting the drivers to use kobj_completion is not trivial > > > > > (note that all users of the original kobject interface are buggy - so all > > > > > of them need to be converted). > > > > > > > > Wait, what? How are "all users" buggy? Please explain this in detail. > > > > > > 1) some code takes a reference to a kobject > > > 2) the user unloads the device > > > 3) the device driver unload routine calls kobject_put (but there is still > > > reference, so the kobject is not destroyed) > > > > A driver should never be messing around with "raw" kobjects, they should > > be using a 'struct device' which is created/managed by the subsystem > > they belong to. See Dmitry's example of input and serio as ways to do > > this, also USB and PCI do this properly. > > > > Perhaps your sybsystem isn't doing this properly? What code do you have > > that creates raw kobjects and has this problem? > > > > thanks, > > > > greg k-h > > So, are you saying that a module shouldn't ever be able to create a > kobject type? > > Do "grep -rw kobj_type drivers/ fs/* net/bridge/" to see how much code > uses kobjects. There are 77 line. Majority of them may be compiled as > modules. > > What do you want to do with all those kobject users? Hide them behind > another interface that doesn't exists yet? Most of them should be using the driver/device interface to sysfs (the drivers/* files, with the exception of the driver core code). I'll look at the others later. And note, as module unloading can only happen by the root user, and never happens "automatically", this is an issue, but a very minor one, and can usually be solved by having a central "place" that handles the kobject lifetimes. thanks, greg k-h -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sun, Jan 05, 2014 at 05:11:23PM -0500, Mikulas Patocka wrote: > > > On Sat, 4 Jan 2014, Dmitry Torokhov wrote: > > > On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote: > > > On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote: > > > > > > > > I came up with a simpler patch to achieve the same purpose - this patch > > > > > makes fixing the drivers easy - the driver is fixed just by replacing > > > > > "kobject_put" with "kobject_put_wait" in the unload routine. > > > > > > > > No, that's not ok at all. > > > > > > Agreed - all it takes is one cargo-culter who religoiusly does such > > > conversion and drops a ref to parent before that to child. > > > > > > > > However, this pattern is buggy with respect to modules. The release method > > > > > is placed in the driver's module. When the driver exits, the module > > > > > reference count is zero, thus the module may be freed. However, there may > > > > > still be references to the kobject. If the module is unloaded and then the > > > > > release method is called, a crash happens. > > > > > > > > Yes, module unloading while a kobject is still "active" is not a good > > > > thing, what modules do you have that cause this problem? Why not just > > > > grab the module reference in your kobject if you need this type of > > > > protection? It's not the kobject's code fault that this issue is there, > > > > or that we now have a "delayed release" function to expose this type of > > > > thing, it's the user of the kobject. > > > > > > > > Please fix the broken users of the kobject first. > > > > > > <snide> Are you saying that there is another kind? </snide> > > > > > > When would you grab that reference to module? More to the point, when > > > would you *drop* it? Doing so from module_exit is not going to work, > > > obviously... > > > > You normally have subsystem core module that does handle release of its > > objects and users of said objects so it is usually OK for objects to > > outlive the users, you just need to make sure the core stays around. > > > > In input we grab module reference to input core when we allocate input > > device and drop it when input device is freed. This way we can be sure > > that input core stays around until all input devices are gone. The same > > for serio. > > > > Thanks. > > > > -- > > Dmitry > > But sometimes, the driver itself needs to create nodes in the sysfs > filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how > would you push all driver-specific sysfs nodes into the generic non-module > code. Then you need to make sure your driver does not allow unloading while its objects are active. I.e. require that all your devices are gone (by increasing module count when you create a DM object and decreasing it when you release DM object) before you allow unloading the driver. Basically we should avoid kobject_put() in exit paths of the module. Then we are left with that tiny race with release being preempted and module getting unloaded. Thanks.
On Sun, 5 Jan 2014, Dmitry Torokhov wrote: > > But sometimes, the driver itself needs to create nodes in the sysfs > > filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how > > would you push all driver-specific sysfs nodes into the generic non-module > > code. > > Then you need to make sure your driver does not allow unloading while > its objects are active. I.e. require that all your devices are gone > (by increasing module count when you create a DM object and decreasing > it when you release DM object) before you allow unloading the driver. For drivers that register to various subsystems (for example with pci_register_driver and pci_unregister_driver) this can't be done correctly - pci_unregister_driver is called from the module unload path, it destroys all instances of the device and their appropriate sysfs nodes. The sysfs nodes exist even if the driver is unused and has zero module count. > Basically we should avoid kobject_put() in exit paths of the module. > Then we are left with that tiny race with release being preempted and > module getting unloaded. Majority of kobject users aren't managing module refcount in the release routine. They do not have a tiny race - they have a big race that is hapenning with CONFIG_DEBUG_KOBJECT_RELEASE. These use completion to wait for the released object (thus, they are correct): cpufreq_sysfs_release, cpuidle_sysfs_release, cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release, ext4_sb_release, ext4_feat_release, f2fs_sb_release These have no protection against module unload at all: pkt_kobj_release, map_release, portio_release, ib_port_release, cm_release_port_obj, mlx4_port_release, ttm_bo_global_kobj_release, ttm_pool_kobj_release, ttm_mem_zone_kobj_release, ttm_mem_global_kobj_release, rdev_free, md_free, efivar_release, dmi_entry_free, dmi_sysfs_entry_release, edd_release, iscsi_boot_kobj_release, lockspace_kobj_release, gfs2_sbd_release, release_nbp These have empty or non-existent release routine (thus having no protection): dm-sysfs.c, qib_port_release These use module refcount: edac_device_ctrl_master_release, edac_device_ctrl_instance_release, edac_device_ctrl_block_release > Thanks. > > -- > Dmitry Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote: > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote: > > On 01/04/14 19:06, Mikulas Patocka wrote: > > > - if (t && !t->release) > > > - pr_debug("kobject: '%s' (%p): does not have a release() " > > > - "function, it is broken and must be fixed.\n", > > > - kobject_name(kobj), kobj); > > > - > > > > Has it been considered to issue a warning if no release function has > > been defined and free_completion == NULL instead of removing the above > > debug message entirely ? I think even with this patch applied it is > > still wrong to invoke kobject_put() on an object without defining a > > release function. > > This patch isn't going to be applied, and I've reverted the original > commit, so there shouldn't be any issues anymore with this code. Why? This patch does the same thing as eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you accept eee031649707db3c9920d9498f8d03819b74fc23 and not this? The code to wait for kobject destruction using completion already exists in cpufreq_sysfs_release, cpuidle_sysfs_release, cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release, ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only kobject users that are correct w.r.t. module unloading), so if you accept this patch, you can simplify them to use kobject_put_wait. Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Jan 06, 2014 at 01:55:11PM -0500, Mikulas Patocka wrote: > > > On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote: > > > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote: > > > On 01/04/14 19:06, Mikulas Patocka wrote: > > > > - if (t && !t->release) > > > > - pr_debug("kobject: '%s' (%p): does not have a release() " > > > > - "function, it is broken and must be fixed.\n", > > > > - kobject_name(kobj), kobj); > > > > - > > > > > > Has it been considered to issue a warning if no release function has > > > been defined and free_completion == NULL instead of removing the above > > > debug message entirely ? I think even with this patch applied it is > > > still wrong to invoke kobject_put() on an object without defining a > > > release function. > > > > This patch isn't going to be applied, and I've reverted the original > > commit, so there shouldn't be any issues anymore with this code. > > Why? This patch does the same thing as > eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you > accept eee031649707db3c9920d9498f8d03819b74fc23 and not this? I have now reverted that commit, it will not be in 3.14, so consider it rejected as well :) thanks, greg k-h -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Jan 06 2014 at 1:55pm -0500, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote: > > > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote: > > > On 01/04/14 19:06, Mikulas Patocka wrote: > > > > - if (t && !t->release) > > > > - pr_debug("kobject: '%s' (%p): does not have a release() " > > > > - "function, it is broken and must be fixed.\n", > > > > - kobject_name(kobj), kobj); > > > > - > > > > > > Has it been considered to issue a warning if no release function has > > > been defined and free_completion == NULL instead of removing the above > > > debug message entirely ? I think even with this patch applied it is > > > still wrong to invoke kobject_put() on an object without defining a > > > release function. > > > > This patch isn't going to be applied, and I've reverted the original > > commit, so there shouldn't be any issues anymore with this code. > > Why? This patch does the same thing as > eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you > accept eee031649707db3c9920d9498f8d03819b74fc23 and not this? > > The code to wait for kobject destruction using completion already exists > in cpufreq_sysfs_release, cpuidle_sysfs_release, > cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release, > ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only > kobject users that are correct w.r.t. module unloading), so if you accept > this patch, you can simplify them to use kobject_put_wait. Hi Mikulas, Please just submit a DM-only patch that follows the same racey pattern of firing a completion from the kobj_type .release method in dm_mod. I'll get it queued up for 3.14. If/when we gets reports of a crash due to dm_mod unload racing with kobject_put we can revisit this. Thanks, Mike -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Index: linux-3.13-rc6/include/linux/kobject.h =================================================================== --- linux-3.13-rc6.orig/include/linux/kobject.h 2014-01-02 23:13:24.000000000 +0100 +++ linux-3.13-rc6/include/linux/kobject.h 2014-01-02 23:14:02.000000000 +0100 @@ -27,6 +27,7 @@ #include <linux/wait.h> #include <linux/atomic.h> #include <linux/workqueue.h> +#include <linux/completion.h> #define UEVENT_HELPER_PATH_LEN 256 #define UEVENT_NUM_ENVP 32 /* number of env pointers */ @@ -66,6 +67,7 @@ struct kobject { struct kobj_type *ktype; struct sysfs_dirent *sd; struct kref kref; + struct completion *free_completion; #ifdef CONFIG_DEBUG_KOBJECT_RELEASE struct delayed_work release; #endif @@ -106,6 +108,7 @@ extern int __must_check kobject_move(str extern struct kobject *kobject_get(struct kobject *kobj); extern void kobject_put(struct kobject *kobj); +extern void kobject_put_wait(struct kobject *kobj); extern const void *kobject_namespace(struct kobject *kobj); extern char *kobject_get_path(struct kobject *kobj, gfp_t flag); Index: linux-3.13-rc6/lib/kobject.c =================================================================== --- linux-3.13-rc6.orig/lib/kobject.c 2014-01-02 23:13:23.000000000 +0100 +++ linux-3.13-rc6/lib/kobject.c 2014-01-02 23:17:01.000000000 +0100 @@ -172,6 +172,7 @@ static void kobject_init_internal(struct if (!kobj) return; kref_init(&kobj->kref); + kobj->free_completion = NULL; INIT_LIST_HEAD(&kobj->entry); kobj->state_in_sysfs = 0; kobj->state_add_uevent_sent = 0; @@ -577,15 +578,11 @@ static void kobject_cleanup(struct kobje { struct kobj_type *t = get_ktype(kobj); const char *name = kobj->name; + struct completion *free_completion = kobj->free_completion; pr_debug("kobject: '%s' (%p): %s, parent %p\n", kobject_name(kobj), kobj, __func__, kobj->parent); - if (t && !t->release) - pr_debug("kobject: '%s' (%p): does not have a release() " - "function, it is broken and must be fixed.\n", - kobject_name(kobj), kobj); - /* send "remove" if the caller did not do it but sent "add" */ if (kobj->state_add_uevent_sent && !kobj->state_remove_uevent_sent) { pr_debug("kobject: '%s' (%p): auto cleanup 'remove' event\n", @@ -611,6 +608,10 @@ static void kobject_cleanup(struct kobje pr_debug("kobject: '%s': free name\n", name); kfree(name); } + + /* if someone is waiting for the kobject to be freed, wake him up */ + if (free_completion) + complete(free_completion); } #ifdef CONFIG_DEBUG_KOBJECT_RELEASE @@ -651,6 +652,28 @@ void kobject_put(struct kobject *kobj) } } +/** + * kobject_put - decrement refcount for object and wait until it reaches zero. + * @kobj: object. + * + * Decrement the refcount, and wait until the refcount reaches zero and the + * kobject is freed. + * + * This function should be called from the driver unload routine. It must not + * be called concurrently on the same kobject. When this function returns, it + * is guaranteed that the kobject was freed. + */ +void kobject_put_wait(struct kobject *kobj) +{ + if (kobj) { + DECLARE_COMPLETION_ONSTACK(completion); + BUG_ON(kobj->free_completion); + kobj->free_completion = &completion; + kobject_put(kobj); + wait_for_completion(&completion); + } +} + static void dynamic_kobj_release(struct kobject *kobj) { pr_debug("kobject: (%p): %s\n", kobj, __func__); @@ -1076,6 +1099,7 @@ void kobj_ns_drop(enum kobj_ns_type type EXPORT_SYMBOL(kobject_get); EXPORT_SYMBOL(kobject_put); +EXPORT_SYMBOL(kobject_put_wait); EXPORT_SYMBOL(kobject_del); EXPORT_SYMBOL(kset_register); Index: linux-3.13-rc6/drivers/md/dm-sysfs.c =================================================================== --- linux-3.13-rc6.orig/drivers/md/dm-sysfs.c 2014-01-02 23:13:24.000000000 +0100 +++ linux-3.13-rc6/drivers/md/dm-sysfs.c 2014-01-02 23:14:02.000000000 +0100 @@ -104,5 +104,5 @@ int dm_sysfs_init(struct mapped_device * */ void dm_sysfs_exit(struct mapped_device *md) { - kobject_put(dm_kobject(md)); + kobject_put_wait(dm_kobject(md)); }