diff mbox

kobject: provide kobject_put_wait to fix module unload race

Message ID alpine.LRH.2.02.1401021733250.27775@file01.intranet.prod.int.rdu2.redhat.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Mikulas Patocka Jan. 4, 2014, 6:06 p.m. UTC
Hi

I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
this interface is still unused.

However, converting the drivers to use kobj_completion is not trivial 
(note that all users of the original kobject interface are buggy - so all 
of them need to be converted).

I came up with a simpler patch to achieve the same purpose - this patch 
makes fixing the drivers easy - the driver is fixed just by replacing 
"kobject_put" with "kobject_put_wait" in the unload routine.

I'd like to ask if you could revert 
eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it 
with this patch.

See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for 
the bug that this patch fixes.

Mikulas



From: Mikulas Patocka <mpatocka@redhat.com>

This patch introduces a new function kobject_put_wait. It decrements the
kobject reference count, waits until the count reaches zero. When this
function returns, it is guaranteed that the kobject was freed.

A rationale for this function:

The kobject is keeps a reference count. The driver unload routine
decrements the reference count, however, references to the kobject may
still be held by other kernel subsystems. The driver must not free the
memory that contains the kobject. Instead, the driver provides a "release"
method. The "release" method is called by the kernel when the last kobject
refernce is dropped. The "release" method should free the memory that
contains the kobject.

However, this pattern is buggy with respect to modules. The release method
is placed in the driver's module. When the driver exits, the module
reference count is zero, thus the module may be freed. However, there may
still be references to the kobject. If the module is unloaded and then the
release method is called, a crash happens.

Recently, CONFIG_DEBUG_KOBJECT_RELEASE was added. This option deliberately
provokes this race condition.

This patch fixes the bug by providing new function kobject_put_wait.
kobject_put_wait works like kobject_put, but it also waits until all other
references were dropped and until the kobject was freed. When
kobject_put_wait returns, it is guaranteed that the kobject was released
with the release method.

Thus, we can change kobject_put to kobject_put_wait in the unload routine 
of various drivers to fix the above race condition.

This patch fixes it for device mapper. Note that all kobject users in 
modules should be fixed to use this function.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org

---
 drivers/md/dm-sysfs.c   |    2 +-
 include/linux/kobject.h |    3 +++
 lib/kobject.c           |   34 +++++++++++++++++++++++++++++-----
 3 files changed, 33 insertions(+), 6 deletions(-)


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Comments

Jeff Mahoney Jan. 4, 2014, 6:14 p.m. UTC | #1
On 1/4/14, 1:06 PM, Mikulas Patocka wrote:
> Hi
> 
> I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> this interface is still unused.
> 
> However, converting the drivers to use kobj_completion is not trivial 
> (note that all users of the original kobject interface are buggy - so all 
> of them need to be converted).
> 
> I came up with a simpler patch to achieve the same purpose - this patch 
> makes fixing the drivers easy - the driver is fixed just by replacing 
> "kobject_put" with "kobject_put_wait" in the unload routine.
> 
> I'd like to ask if you could revert 
> eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it 
> with this patch.

I have no objections to reverting it. There were concerns from Al Viro
that it'd be tough to get right by callers and I had assumed it got
dropped after that. I had planned on using it in my btrfs sysfs exports
patchset but came up with a better way.

-Jeff

> See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for 
> the bug that this patch fixes.
> 
> Mikulas
> 
> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> This patch introduces a new function kobject_put_wait. It decrements the
> kobject reference count, waits until the count reaches zero. When this
> function returns, it is guaranteed that the kobject was freed.
> 
> A rationale for this function:
> 
> The kobject is keeps a reference count. The driver unload routine
> decrements the reference count, however, references to the kobject may
> still be held by other kernel subsystems. The driver must not free the
> memory that contains the kobject. Instead, the driver provides a "release"
> method. The "release" method is called by the kernel when the last kobject
> refernce is dropped. The "release" method should free the memory that
> contains the kobject.
> 
> However, this pattern is buggy with respect to modules. The release method
> is placed in the driver's module. When the driver exits, the module
> reference count is zero, thus the module may be freed. However, there may
> still be references to the kobject. If the module is unloaded and then the
> release method is called, a crash happens.
> 
> Recently, CONFIG_DEBUG_KOBJECT_RELEASE was added. This option deliberately
> provokes this race condition.
> 
> This patch fixes the bug by providing new function kobject_put_wait.
> kobject_put_wait works like kobject_put, but it also waits until all other
> references were dropped and until the kobject was freed. When
> kobject_put_wait returns, it is guaranteed that the kobject was released
> with the release method.
> 
> Thus, we can change kobject_put to kobject_put_wait in the unload routine 
> of various drivers to fix the above race condition.
> 
> This patch fixes it for device mapper. Note that all kobject users in 
> modules should be fixed to use this function.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@kernel.org
> 
> ---
>  drivers/md/dm-sysfs.c   |    2 +-
>  include/linux/kobject.h |    3 +++
>  lib/kobject.c           |   34 +++++++++++++++++++++++++++++-----
>  3 files changed, 33 insertions(+), 6 deletions(-)
> 
> Index: linux-3.13-rc6/include/linux/kobject.h
> ===================================================================
> --- linux-3.13-rc6.orig/include/linux/kobject.h	2014-01-02 23:13:24.000000000 +0100
> +++ linux-3.13-rc6/include/linux/kobject.h	2014-01-02 23:14:02.000000000 +0100
> @@ -27,6 +27,7 @@
>  #include <linux/wait.h>
>  #include <linux/atomic.h>
>  #include <linux/workqueue.h>
> +#include <linux/completion.h>
>  
>  #define UEVENT_HELPER_PATH_LEN		256
>  #define UEVENT_NUM_ENVP			32	/* number of env pointers */
> @@ -66,6 +67,7 @@ struct kobject {
>  	struct kobj_type	*ktype;
>  	struct sysfs_dirent	*sd;
>  	struct kref		kref;
> +	struct completion	*free_completion;
>  #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
>  	struct delayed_work	release;
>  #endif
> @@ -106,6 +108,7 @@ extern int __must_check kobject_move(str
>  
>  extern struct kobject *kobject_get(struct kobject *kobj);
>  extern void kobject_put(struct kobject *kobj);
> +extern void kobject_put_wait(struct kobject *kobj);
>  
>  extern const void *kobject_namespace(struct kobject *kobj);
>  extern char *kobject_get_path(struct kobject *kobj, gfp_t flag);
> Index: linux-3.13-rc6/lib/kobject.c
> ===================================================================
> --- linux-3.13-rc6.orig/lib/kobject.c	2014-01-02 23:13:23.000000000 +0100
> +++ linux-3.13-rc6/lib/kobject.c	2014-01-02 23:17:01.000000000 +0100
> @@ -172,6 +172,7 @@ static void kobject_init_internal(struct
>  	if (!kobj)
>  		return;
>  	kref_init(&kobj->kref);
> +	kobj->free_completion = NULL;
>  	INIT_LIST_HEAD(&kobj->entry);
>  	kobj->state_in_sysfs = 0;
>  	kobj->state_add_uevent_sent = 0;
> @@ -577,15 +578,11 @@ static void kobject_cleanup(struct kobje
>  {
>  	struct kobj_type *t = get_ktype(kobj);
>  	const char *name = kobj->name;
> +	struct completion *free_completion = kobj->free_completion;
>  
>  	pr_debug("kobject: '%s' (%p): %s, parent %p\n",
>  		 kobject_name(kobj), kobj, __func__, kobj->parent);
>  
> -	if (t && !t->release)
> -		pr_debug("kobject: '%s' (%p): does not have a release() "
> -			 "function, it is broken and must be fixed.\n",
> -			 kobject_name(kobj), kobj);
> -
>  	/* send "remove" if the caller did not do it but sent "add" */
>  	if (kobj->state_add_uevent_sent && !kobj->state_remove_uevent_sent) {
>  		pr_debug("kobject: '%s' (%p): auto cleanup 'remove' event\n",
> @@ -611,6 +608,10 @@ static void kobject_cleanup(struct kobje
>  		pr_debug("kobject: '%s': free name\n", name);
>  		kfree(name);
>  	}
> +
> +	/* if someone is waiting for the kobject to be freed, wake him up */
> +	if (free_completion)
> +		complete(free_completion);
>  }
>  
>  #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
> @@ -651,6 +652,28 @@ void kobject_put(struct kobject *kobj)
>  	}
>  }
>  
> +/**
> + * kobject_put - decrement refcount for object and wait until it reaches zero.
> + * @kobj: object.
> + *
> + * Decrement the refcount, and wait until the refcount reaches zero and the
> + * kobject is freed.
> + *
> + * This function should be called from the driver unload routine. It must not
> + * be called concurrently on the same kobject. When this function returns, it
> + * is guaranteed that the kobject was freed.
> + */
> +void kobject_put_wait(struct kobject *kobj)
> +{
> +	if (kobj) {
> +		DECLARE_COMPLETION_ONSTACK(completion);
> +		BUG_ON(kobj->free_completion);
> +		kobj->free_completion = &completion;
> +		kobject_put(kobj);
> +		wait_for_completion(&completion);
> +	}
> +}
> +
>  static void dynamic_kobj_release(struct kobject *kobj)
>  {
>  	pr_debug("kobject: (%p): %s\n", kobj, __func__);
> @@ -1076,6 +1099,7 @@ void kobj_ns_drop(enum kobj_ns_type type
>  
>  EXPORT_SYMBOL(kobject_get);
>  EXPORT_SYMBOL(kobject_put);
> +EXPORT_SYMBOL(kobject_put_wait);
>  EXPORT_SYMBOL(kobject_del);
>  
>  EXPORT_SYMBOL(kset_register);
> Index: linux-3.13-rc6/drivers/md/dm-sysfs.c
> ===================================================================
> --- linux-3.13-rc6.orig/drivers/md/dm-sysfs.c	2014-01-02 23:13:24.000000000 +0100
> +++ linux-3.13-rc6/drivers/md/dm-sysfs.c	2014-01-02 23:14:02.000000000 +0100
> @@ -104,5 +104,5 @@ int dm_sysfs_init(struct mapped_device *
>   */
>  void dm_sysfs_exit(struct mapped_device *md)
>  {
> -	kobject_put(dm_kobject(md));
> +	kobject_put_wait(dm_kobject(md));
>  }
>
Greg KH Jan. 4, 2014, 6:16 p.m. UTC | #2
On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> Hi
> 
> I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> this interface is still unused.

There are pending btrfs patches to use this interface.

> However, converting the drivers to use kobj_completion is not trivial 
> (note that all users of the original kobject interface are buggy - so all 
> of them need to be converted).

Wait, what?  How are "all users" buggy?  Please explain this in detail.

> I came up with a simpler patch to achieve the same purpose - this patch 
> makes fixing the drivers easy - the driver is fixed just by replacing 
> "kobject_put" with "kobject_put_wait" in the unload routine.

No, that's not ok at all.

> I'd like to ask if you could revert 
> eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it 
> with this patch.
> 
> See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for 
> the bug that this patch fixes.
> 
> Mikulas
> 
> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> This patch introduces a new function kobject_put_wait. It decrements the
> kobject reference count, waits until the count reaches zero. When this
> function returns, it is guaranteed that the kobject was freed.
> 
> A rationale for this function:
> 
> The kobject is keeps a reference count. The driver unload routine
> decrements the reference count, however, references to the kobject may
> still be held by other kernel subsystems. The driver must not free the
> memory that contains the kobject. Instead, the driver provides a "release"
> method. The "release" method is called by the kernel when the last kobject
> refernce is dropped. The "release" method should free the memory that
> contains the kobject.
> 
> However, this pattern is buggy with respect to modules. The release method
> is placed in the driver's module. When the driver exits, the module
> reference count is zero, thus the module may be freed. However, there may
> still be references to the kobject. If the module is unloaded and then the
> release method is called, a crash happens.

Yes, module unloading while a kobject is still "active" is not a good
thing, what modules do you have that cause this problem?  Why not just
grab the module reference in your kobject if you need this type of
protection?  It's not the kobject's code fault that this issue is there,
or that we now have a "delayed release" function to expose this type of
thing, it's the user of the kobject.

Please fix the broken users of the kobject first.

thanks,

greg k-h

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Al Viro Jan. 4, 2014, 6:34 p.m. UTC | #3
On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote:

> > I came up with a simpler patch to achieve the same purpose - this patch 
> > makes fixing the drivers easy - the driver is fixed just by replacing 
> > "kobject_put" with "kobject_put_wait" in the unload routine.
> 
> No, that's not ok at all.

Agreed - all it takes is one cargo-culter who religoiusly does such
conversion and drops a ref to parent before that to child.

> > However, this pattern is buggy with respect to modules. The release method
> > is placed in the driver's module. When the driver exits, the module
> > reference count is zero, thus the module may be freed. However, there may
> > still be references to the kobject. If the module is unloaded and then the
> > release method is called, a crash happens.
> 
> Yes, module unloading while a kobject is still "active" is not a good
> thing, what modules do you have that cause this problem?  Why not just
> grab the module reference in your kobject if you need this type of
> protection?  It's not the kobject's code fault that this issue is there,
> or that we now have a "delayed release" function to expose this type of
> thing, it's the user of the kobject.
> 
> Please fix the broken users of the kobject first.

<snide> Are you saying that there is another kind? </snide>

When would you grab that reference to module?  More to the point, when
would you *drop* it?  Doing so from module_exit is not going to work,
obviously...

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mikulas Patocka Jan. 4, 2014, 8:35 p.m. UTC | #4
On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:

> On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > Hi
> > 
> > I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> > this interface is still unused.
> 
> There are pending btrfs patches to use this interface.
> 
> > However, converting the drivers to use kobj_completion is not trivial 
> > (note that all users of the original kobject interface are buggy - so all 
> > of them need to be converted).
> 
> Wait, what?  How are "all users" buggy?  Please explain this in detail.

1) some code takes a reference to a kobject
2) the user unloads the device
3) the device driver unload routine calls kobject_put (but there is still 
   reference, so the kobject is not destroyed)
4) the device is unregistered, it drops the module reference count to zero
5) the user unloads the module
6) the piece of code that grabbed the kobject reference at step 1) drops 
   it
7) the kobject code calls ->release method, it points into the unloaded 
   module


You could try to fix it by incrementing the module reference count between 
steps 2) and 3) and decrementing it in the kobject release routine, but it 
doesn't work - you'd have to run "module_put(THIS_MODULE);" from the 
kobject release routine - it is still buggy, because the module may be 
unloaded just after "module_put(THIS_MODULE);" and before the release 
routine returns.


Another problem with incrementing the module reference count happens if 
the above sequence is executing as a consequence of module unloading - 
suppose that steps 2)-4) are being executed from the module unload routine 
itself (for example, network drivers have this property that the module 
may be removed while the device is in use and the module unload routine 
shuts the device down). In this case, there is no way to delay unloading 
of the module from the unload routine itself.


> > I came up with a simpler patch to achieve the same purpose - this patch 
> > makes fixing the drivers easy - the driver is fixed just by replacing 
> > "kobject_put" with "kobject_put_wait" in the unload routine.
> 
> No, that's not ok at all.

What exactly do you think is wrong with this patch?

> > I'd like to ask if you could revert 
> > eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it 
> > with this patch.
> > 
> > See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for 
> > the bug that this patch fixes.
> > 
> > Mikulas
> > 
> > 
> > 
> > From: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > This patch introduces a new function kobject_put_wait. It decrements the
> > kobject reference count, waits until the count reaches zero. When this
> > function returns, it is guaranteed that the kobject was freed.
> > 
> > A rationale for this function:
> > 
> > The kobject is keeps a reference count. The driver unload routine
> > decrements the reference count, however, references to the kobject may
> > still be held by other kernel subsystems. The driver must not free the
> > memory that contains the kobject. Instead, the driver provides a "release"
> > method. The "release" method is called by the kernel when the last kobject
> > refernce is dropped. The "release" method should free the memory that
> > contains the kobject.
> > 
> > However, this pattern is buggy with respect to modules. The release method
> > is placed in the driver's module. When the driver exits, the module
> > reference count is zero, thus the module may be freed. However, there may
> > still be references to the kobject. If the module is unloaded and then the
> > release method is called, a crash happens.
> 
> Yes, module unloading while a kobject is still "active" is not a good
> thing, what modules do you have that cause this problem?

All modules - Do you have some code in the generic module unload routine 
that prevents unloading the module while someone holds a reference to a 
kobject created by the module? If not, then all modules have this bug.

> Why not just
> grab the module reference in your kobject if you need this type of
> protection?

See above - you'd have to drop the reference count from the module itself. 
And it doesn't work if you are destroying the kobject from the module 
unload routine itself.

> It's not the kobject's code fault that this issue is there,
> or that we now have a "delayed release" function to expose this type of
> thing, it's the user of the kobject.

So, show me any single driver that do this correctly. I haven't looked at 
the whole kernel, but all drivers I have looked on so far are buggy.


Just a grep '\.release' drivers/*/*.c shows this:

These are buggy because they don't manage module references in the kobject 
release routine: pkt_kobj_release, cpufreq_sysfs_release, 
cpuidle_sysfs_release, ktype_state_cpuidle, cpuidle_driver_sysfs_release, 
dmi_entry_free, dmi_sysfs_entry_release, release_firmware_map_entry, 
rdev_free, md_free, pps_cdev_release, iscsi_boot_kobj_release, 
map_release, portio_release

These are decrementing the own module count from the release method, so 
they are less buggy than the above, but still buggy: 
edac_device_ctrl_instance_release, edac_device_ctrl_master_release, 
edac_device_ctrl_instance_release

> Please fix the broken users of the kobject first.
> 
> thanks,
> 
> greg k-h

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Dmitry Torokhov Jan. 4, 2014, 10:42 p.m. UTC | #5
On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote:
> On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote:
> 
> > > I came up with a simpler patch to achieve the same purpose - this patch 
> > > makes fixing the drivers easy - the driver is fixed just by replacing 
> > > "kobject_put" with "kobject_put_wait" in the unload routine.
> > 
> > No, that's not ok at all.
> 
> Agreed - all it takes is one cargo-culter who religoiusly does such
> conversion and drops a ref to parent before that to child.
> 
> > > However, this pattern is buggy with respect to modules. The release method
> > > is placed in the driver's module. When the driver exits, the module
> > > reference count is zero, thus the module may be freed. However, there may
> > > still be references to the kobject. If the module is unloaded and then the
> > > release method is called, a crash happens.
> > 
> > Yes, module unloading while a kobject is still "active" is not a good
> > thing, what modules do you have that cause this problem?  Why not just
> > grab the module reference in your kobject if you need this type of
> > protection?  It's not the kobject's code fault that this issue is there,
> > or that we now have a "delayed release" function to expose this type of
> > thing, it's the user of the kobject.
> > 
> > Please fix the broken users of the kobject first.
> 
> <snide> Are you saying that there is another kind? </snide>
> 
> When would you grab that reference to module?  More to the point, when
> would you *drop* it?  Doing so from module_exit is not going to work,
> obviously...

You normally have subsystem core module that does handle release of its
objects and users of said objects so it is usually OK for objects to
outlive the users, you just need to make sure the core stays around.

In input we grab module reference to input core when we allocate input
device and drop it when input device is freed. This way we can be sure
that input core stays around until all input devices are gone. The same
for serio.

Thanks.
Greg KH Jan. 5, 2014, 3:42 a.m. UTC | #6
On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> 
> 
> On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> 
> > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > Hi
> > > 
> > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> > > this interface is still unused.
> > 
> > There are pending btrfs patches to use this interface.
> > 
> > > However, converting the drivers to use kobj_completion is not trivial 
> > > (note that all users of the original kobject interface are buggy - so all 
> > > of them need to be converted).
> > 
> > Wait, what?  How are "all users" buggy?  Please explain this in detail.
> 
> 1) some code takes a reference to a kobject
> 2) the user unloads the device
> 3) the device driver unload routine calls kobject_put (but there is still 
>    reference, so the kobject is not destroyed)

A driver should never be messing around with "raw" kobjects, they should
be using a 'struct device' which is created/managed by the subsystem
they belong to.  See Dmitry's example of input and serio as ways to do
this, also USB and PCI do this properly.

Perhaps your sybsystem isn't doing this properly?  What code do you have
that creates raw kobjects and has this problem?

thanks,

greg k-h

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Greg KH Jan. 5, 2014, 3:48 a.m. UTC | #7
On Sat, Jan 04, 2014 at 01:14:14PM -0500, Jeff Mahoney wrote:
> On 1/4/14, 1:06 PM, Mikulas Patocka wrote:
> > Hi
> > 
> > I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> > this interface is still unused.
> > 
> > However, converting the drivers to use kobj_completion is not trivial 
> > (note that all users of the original kobject interface are buggy - so all 
> > of them need to be converted).
> > 
> > I came up with a simpler patch to achieve the same purpose - this patch 
> > makes fixing the drivers easy - the driver is fixed just by replacing 
> > "kobject_put" with "kobject_put_wait" in the unload routine.
> > 
> > I'd like to ask if you could revert 
> > eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it 
> > with this patch.
> 
> I have no objections to reverting it. There were concerns from Al Viro
> that it'd be tough to get right by callers and I had assumed it got
> dropped after that. I had planned on using it in my btrfs sysfs exports
> patchset but came up with a better way.

Ok, thanks, I'll revert it for 3.14.

greg k-h

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Dmitry Torokhov Jan. 5, 2014, 6:05 a.m. UTC | #8
On Sat, Jan 04, 2014 at 07:42:28PM -0800, Greg Kroah-Hartman wrote:
> On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> > 
> > 
> > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> > 
> > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > > Hi
> > > > 
> > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> > > > this interface is still unused.
> > > 
> > > There are pending btrfs patches to use this interface.
> > > 
> > > > However, converting the drivers to use kobj_completion is not trivial 
> > > > (note that all users of the original kobject interface are buggy - so all 
> > > > of them need to be converted).
> > > 
> > > Wait, what?  How are "all users" buggy?  Please explain this in detail.
> > 
> > 1) some code takes a reference to a kobject
> > 2) the user unloads the device
> > 3) the device driver unload routine calls kobject_put (but there is still 
> >    reference, so the kobject is not destroyed)
> 
> A driver should never be messing around with "raw" kobjects, they should
> be using a 'struct device' which is created/managed by the subsystem
> they belong to.  See Dmitry's example of input and serio as ways to do
> this, also USB and PCI do this properly.

Well, Mikulas is correct in the sense that there is still a race between
release function invoking the final module_put() and getting preempted
and module getting unloaded by another thread. Hitting this race is
pretty hard though.
Bart Van Assche Jan. 5, 2014, 4:43 p.m. UTC | #9
On 01/04/14 19:06, Mikulas Patocka wrote:
> -	if (t && !t->release)
> -		pr_debug("kobject: '%s' (%p): does not have a release() "
> -			 "function, it is broken and must be fixed.\n",
> -			 kobject_name(kobj), kobj);
> -

Has it been considered to issue a warning if no release function has
been defined and free_completion == NULL instead of removing the above
debug message entirely ? I think even with this patch applied it is
still wrong to invoke kobject_put() on an object without defining a
release function.

Thanks,

Bart.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Greg KH Jan. 5, 2014, 6:26 p.m. UTC | #10
On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> On 01/04/14 19:06, Mikulas Patocka wrote:
> > -	if (t && !t->release)
> > -		pr_debug("kobject: '%s' (%p): does not have a release() "
> > -			 "function, it is broken and must be fixed.\n",
> > -			 kobject_name(kobj), kobj);
> > -
> 
> Has it been considered to issue a warning if no release function has
> been defined and free_completion == NULL instead of removing the above
> debug message entirely ? I think even with this patch applied it is
> still wrong to invoke kobject_put() on an object without defining a
> release function.

This patch isn't going to be applied, and I've reverted the original
commit, so there shouldn't be any issues anymore with this code.

thanks,

greg k-h

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Greg KH Jan. 5, 2014, 6:27 p.m. UTC | #11
On Sat, Jan 04, 2014 at 10:05:31PM -0800, Dmitry Torokhov wrote:
> On Sat, Jan 04, 2014 at 07:42:28PM -0800, Greg Kroah-Hartman wrote:
> > On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> > > 
> > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > > > Hi
> > > > > 
> > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> > > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> > > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> > > > > this interface is still unused.
> > > > 
> > > > There are pending btrfs patches to use this interface.
> > > > 
> > > > > However, converting the drivers to use kobj_completion is not trivial 
> > > > > (note that all users of the original kobject interface are buggy - so all 
> > > > > of them need to be converted).
> > > > 
> > > > Wait, what?  How are "all users" buggy?  Please explain this in detail.
> > > 
> > > 1) some code takes a reference to a kobject
> > > 2) the user unloads the device
> > > 3) the device driver unload routine calls kobject_put (but there is still 
> > >    reference, so the kobject is not destroyed)
> > 
> > A driver should never be messing around with "raw" kobjects, they should
> > be using a 'struct device' which is created/managed by the subsystem
> > they belong to.  See Dmitry's example of input and serio as ways to do
> > this, also USB and PCI do this properly.
> 
> Well, Mikulas is correct in the sense that there is still a race between
> release function invoking the final module_put() and getting preempted
> and module getting unloaded by another thread. Hitting this race is
> pretty hard though.

Yes, that is true, especially as no one auto-unloads modules for reasons
like this :)

thanks,

greg k-h

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mikulas Patocka Jan. 5, 2014, 10:04 p.m. UTC | #12
On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:

> On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> > 
> > 
> > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> > 
> > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > > Hi
> > > > 
> > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> > > > this interface is still unused.
> > > 
> > > There are pending btrfs patches to use this interface.
> > > 
> > > > However, converting the drivers to use kobj_completion is not trivial 
> > > > (note that all users of the original kobject interface are buggy - so all 
> > > > of them need to be converted).
> > > 
> > > Wait, what?  How are "all users" buggy?  Please explain this in detail.
> > 
> > 1) some code takes a reference to a kobject
> > 2) the user unloads the device
> > 3) the device driver unload routine calls kobject_put (but there is still 
> >    reference, so the kobject is not destroyed)
> 
> A driver should never be messing around with "raw" kobjects, they should
> be using a 'struct device' which is created/managed by the subsystem
> they belong to.  See Dmitry's example of input and serio as ways to do
> this, also USB and PCI do this properly.
> 
> Perhaps your sybsystem isn't doing this properly?  What code do you have
> that creates raw kobjects and has this problem?
> 
> thanks,
> 
> greg k-h

So, are you saying that a module shouldn't ever be able to create a 
kobject type?

Do "grep -rw kobj_type drivers/ fs/* net/bridge/" to see how much code 
uses kobjects. There are 77 line. Majority of them may be compiled as 
modules.

What do you want to do with all those kobject users? Hide them behind 
another interface that doesn't exists yet?

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mikulas Patocka Jan. 5, 2014, 10:04 p.m. UTC | #13
On Sun, 5 Jan 2014, Bart Van Assche wrote:

> On 01/04/14 19:06, Mikulas Patocka wrote:
> > -	if (t && !t->release)
> > -		pr_debug("kobject: '%s' (%p): does not have a release() "
> > -			 "function, it is broken and must be fixed.\n",
> > -			 kobject_name(kobj), kobj);
> > -
> 
> Has it been considered to issue a warning if no release function has
> been defined and free_completion == NULL instead of removing the above
> debug message entirely ? I think even with this patch applied it is
> still wrong to invoke kobject_put() on an object without defining a
> release function.
> 
> Thanks,
> 
> Bart.

With the above patch, you don't need the release method for correct 
behavior. Therefore you don't have to issue warning when it is missing.

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mikulas Patocka Jan. 5, 2014, 10:11 p.m. UTC | #14
On Sat, 4 Jan 2014, Dmitry Torokhov wrote:

> On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote:
> > On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote:
> > 
> > > > I came up with a simpler patch to achieve the same purpose - this patch 
> > > > makes fixing the drivers easy - the driver is fixed just by replacing 
> > > > "kobject_put" with "kobject_put_wait" in the unload routine.
> > > 
> > > No, that's not ok at all.
> > 
> > Agreed - all it takes is one cargo-culter who religoiusly does such
> > conversion and drops a ref to parent before that to child.
> > 
> > > > However, this pattern is buggy with respect to modules. The release method
> > > > is placed in the driver's module. When the driver exits, the module
> > > > reference count is zero, thus the module may be freed. However, there may
> > > > still be references to the kobject. If the module is unloaded and then the
> > > > release method is called, a crash happens.
> > > 
> > > Yes, module unloading while a kobject is still "active" is not a good
> > > thing, what modules do you have that cause this problem?  Why not just
> > > grab the module reference in your kobject if you need this type of
> > > protection?  It's not the kobject's code fault that this issue is there,
> > > or that we now have a "delayed release" function to expose this type of
> > > thing, it's the user of the kobject.
> > > 
> > > Please fix the broken users of the kobject first.
> > 
> > <snide> Are you saying that there is another kind? </snide>
> > 
> > When would you grab that reference to module?  More to the point, when
> > would you *drop* it?  Doing so from module_exit is not going to work,
> > obviously...
> 
> You normally have subsystem core module that does handle release of its
> objects and users of said objects so it is usually OK for objects to
> outlive the users, you just need to make sure the core stays around.
> 
> In input we grab module reference to input core when we allocate input
> device and drop it when input device is freed. This way we can be sure
> that input core stays around until all input devices are gone. The same
> for serio.
> 
> Thanks. 
> 
> -- 
> Dmitry

But sometimes, the driver itself needs to create nodes in the sysfs 
filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how 
would you push all driver-specific sysfs nodes into the generic non-module 
code.

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Greg KH Jan. 5, 2014, 10:23 p.m. UTC | #15
On Sun, Jan 05, 2014 at 05:04:31PM -0500, Mikulas Patocka wrote:
> 
> 
> On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> 
> > On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> > > 
> > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > > > Hi
> > > > > 
> > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined 
> > > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch 
> > > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel, 
> > > > > this interface is still unused.
> > > > 
> > > > There are pending btrfs patches to use this interface.
> > > > 
> > > > > However, converting the drivers to use kobj_completion is not trivial 
> > > > > (note that all users of the original kobject interface are buggy - so all 
> > > > > of them need to be converted).
> > > > 
> > > > Wait, what?  How are "all users" buggy?  Please explain this in detail.
> > > 
> > > 1) some code takes a reference to a kobject
> > > 2) the user unloads the device
> > > 3) the device driver unload routine calls kobject_put (but there is still 
> > >    reference, so the kobject is not destroyed)
> > 
> > A driver should never be messing around with "raw" kobjects, they should
> > be using a 'struct device' which is created/managed by the subsystem
> > they belong to.  See Dmitry's example of input and serio as ways to do
> > this, also USB and PCI do this properly.
> > 
> > Perhaps your sybsystem isn't doing this properly?  What code do you have
> > that creates raw kobjects and has this problem?
> > 
> > thanks,
> > 
> > greg k-h
> 
> So, are you saying that a module shouldn't ever be able to create a 
> kobject type?
> 
> Do "grep -rw kobj_type drivers/ fs/* net/bridge/" to see how much code 
> uses kobjects. There are 77 line. Majority of them may be compiled as 
> modules.
> 
> What do you want to do with all those kobject users? Hide them behind 
> another interface that doesn't exists yet?

Most of them should be using the driver/device interface to sysfs (the
drivers/* files, with the exception of the driver core code).  I'll look
at the others later.

And note, as module unloading can only happen by the root user, and
never happens "automatically", this is an issue, but a very minor one,
and can usually be solved by having a central "place" that handles the
kobject lifetimes.

thanks,

greg k-h

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Dmitry Torokhov Jan. 5, 2014, 10:39 p.m. UTC | #16
On Sun, Jan 05, 2014 at 05:11:23PM -0500, Mikulas Patocka wrote:
> 
> 
> On Sat, 4 Jan 2014, Dmitry Torokhov wrote:
> 
> > On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote:
> > > On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote:
> > > 
> > > > > I came up with a simpler patch to achieve the same purpose - this patch 
> > > > > makes fixing the drivers easy - the driver is fixed just by replacing 
> > > > > "kobject_put" with "kobject_put_wait" in the unload routine.
> > > > 
> > > > No, that's not ok at all.
> > > 
> > > Agreed - all it takes is one cargo-culter who religoiusly does such
> > > conversion and drops a ref to parent before that to child.
> > > 
> > > > > However, this pattern is buggy with respect to modules. The release method
> > > > > is placed in the driver's module. When the driver exits, the module
> > > > > reference count is zero, thus the module may be freed. However, there may
> > > > > still be references to the kobject. If the module is unloaded and then the
> > > > > release method is called, a crash happens.
> > > > 
> > > > Yes, module unloading while a kobject is still "active" is not a good
> > > > thing, what modules do you have that cause this problem?  Why not just
> > > > grab the module reference in your kobject if you need this type of
> > > > protection?  It's not the kobject's code fault that this issue is there,
> > > > or that we now have a "delayed release" function to expose this type of
> > > > thing, it's the user of the kobject.
> > > > 
> > > > Please fix the broken users of the kobject first.
> > > 
> > > <snide> Are you saying that there is another kind? </snide>
> > > 
> > > When would you grab that reference to module?  More to the point, when
> > > would you *drop* it?  Doing so from module_exit is not going to work,
> > > obviously...
> > 
> > You normally have subsystem core module that does handle release of its
> > objects and users of said objects so it is usually OK for objects to
> > outlive the users, you just need to make sure the core stays around.
> > 
> > In input we grab module reference to input core when we allocate input
> > device and drop it when input device is freed. This way we can be sure
> > that input core stays around until all input devices are gone. The same
> > for serio.
> > 
> > Thanks. 
> > 
> > -- 
> > Dmitry
> 
> But sometimes, the driver itself needs to create nodes in the sysfs 
> filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how 
> would you push all driver-specific sysfs nodes into the generic non-module 
> code.

Then you need to make sure your driver does not allow unloading while
its objects are active. I.e. require that all your devices are gone
(by increasing module count when you create a DM object and decreasing
it when you release DM object) before you allow unloading the driver.

Basically we should avoid kobject_put() in exit paths of the module.
Then we are left with that tiny race with release being preempted and
module getting unloaded.

Thanks.
Mikulas Patocka Jan. 6, 2014, 6:43 p.m. UTC | #17
On Sun, 5 Jan 2014, Dmitry Torokhov wrote:

> > But sometimes, the driver itself needs to create nodes in the sysfs 
> > filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how 
> > would you push all driver-specific sysfs nodes into the generic non-module 
> > code.
> 
> Then you need to make sure your driver does not allow unloading while
> its objects are active. I.e. require that all your devices are gone
> (by increasing module count when you create a DM object and decreasing
> it when you release DM object) before you allow unloading the driver.

For drivers that register to various subsystems (for example with 
pci_register_driver and pci_unregister_driver) this can't be done 
correctly - pci_unregister_driver is called from the module unload path, 
it destroys all instances of the device and their appropriate sysfs nodes.

The sysfs nodes exist even if the driver is unused and has zero module 
count.

> Basically we should avoid kobject_put() in exit paths of the module.



> Then we are left with that tiny race with release being preempted and
> module getting unloaded.

Majority of kobject users aren't managing module refcount in the 
release routine. They do not have a tiny race - they have a big race that 
is hapenning with CONFIG_DEBUG_KOBJECT_RELEASE.



These use completion to wait for the released object (thus, they are 
correct):
cpufreq_sysfs_release, cpuidle_sysfs_release, cpuidle_state_sysfs_release, 
cpuidle_driver_sysfs_release, ext4_sb_release, ext4_feat_release, 
f2fs_sb_release

These have no protection against module unload at all:
pkt_kobj_release, map_release, portio_release, ib_port_release, 
cm_release_port_obj, mlx4_port_release, ttm_bo_global_kobj_release, 
ttm_pool_kobj_release, ttm_mem_zone_kobj_release, 
ttm_mem_global_kobj_release, rdev_free, md_free, efivar_release, 
dmi_entry_free, dmi_sysfs_entry_release, edd_release, 
iscsi_boot_kobj_release, lockspace_kobj_release, gfs2_sbd_release, 
release_nbp

These have empty or non-existent release routine (thus having no 
protection): dm-sysfs.c, qib_port_release

These use module refcount: edac_device_ctrl_master_release, 
edac_device_ctrl_instance_release, edac_device_ctrl_block_release

> Thanks.
> 
> -- 
> Dmitry

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mikulas Patocka Jan. 6, 2014, 6:55 p.m. UTC | #18
On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:

> On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > -	if (t && !t->release)
> > > -		pr_debug("kobject: '%s' (%p): does not have a release() "
> > > -			 "function, it is broken and must be fixed.\n",
> > > -			 kobject_name(kobj), kobj);
> > > -
> > 
> > Has it been considered to issue a warning if no release function has
> > been defined and free_completion == NULL instead of removing the above
> > debug message entirely ? I think even with this patch applied it is
> > still wrong to invoke kobject_put() on an object without defining a
> > release function.
> 
> This patch isn't going to be applied, and I've reverted the original
> commit, so there shouldn't be any issues anymore with this code.

Why? This patch does the same thing as 
eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you 
accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?

The code to wait for kobject destruction using completion already exists 
in cpufreq_sysfs_release, cpuidle_sysfs_release, 
cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release, 
ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only 
kobject users that are correct w.r.t. module unloading), so if you accept 
this patch, you can simplify them to use kobject_put_wait.

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Greg KH Jan. 6, 2014, 7:23 p.m. UTC | #19
On Mon, Jan 06, 2014 at 01:55:11PM -0500, Mikulas Patocka wrote:
> 
> 
> On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
> 
> > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > > -	if (t && !t->release)
> > > > -		pr_debug("kobject: '%s' (%p): does not have a release() "
> > > > -			 "function, it is broken and must be fixed.\n",
> > > > -			 kobject_name(kobj), kobj);
> > > > -
> > > 
> > > Has it been considered to issue a warning if no release function has
> > > been defined and free_completion == NULL instead of removing the above
> > > debug message entirely ? I think even with this patch applied it is
> > > still wrong to invoke kobject_put() on an object without defining a
> > > release function.
> > 
> > This patch isn't going to be applied, and I've reverted the original
> > commit, so there shouldn't be any issues anymore with this code.
> 
> Why? This patch does the same thing as 
> eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you 
> accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?

I have now reverted that commit, it will not be in 3.14, so consider it
rejected as well :)

thanks,

greg k-h

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Jan. 6, 2014, 9:31 p.m. UTC | #20
On Mon, Jan 06 2014 at  1:55pm -0500,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> 
> 
> On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
> 
> > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > > -	if (t && !t->release)
> > > > -		pr_debug("kobject: '%s' (%p): does not have a release() "
> > > > -			 "function, it is broken and must be fixed.\n",
> > > > -			 kobject_name(kobj), kobj);
> > > > -
> > > 
> > > Has it been considered to issue a warning if no release function has
> > > been defined and free_completion == NULL instead of removing the above
> > > debug message entirely ? I think even with this patch applied it is
> > > still wrong to invoke kobject_put() on an object without defining a
> > > release function.
> > 
> > This patch isn't going to be applied, and I've reverted the original
> > commit, so there shouldn't be any issues anymore with this code.
> 
> Why? This patch does the same thing as 
> eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you 
> accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?
> 
> The code to wait for kobject destruction using completion already exists 
> in cpufreq_sysfs_release, cpuidle_sysfs_release, 
> cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release, 
> ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only 
> kobject users that are correct w.r.t. module unloading), so if you accept 
> this patch, you can simplify them to use kobject_put_wait.

Hi Mikulas,

Please just submit a DM-only patch that follows the same racey pattern
of firing a completion from the kobj_type .release method in dm_mod.
I'll get it queued up for 3.14.

If/when we gets reports of a crash due to dm_mod unload racing with
kobject_put we can revisit this.

Thanks,
Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

Index: linux-3.13-rc6/include/linux/kobject.h
===================================================================
--- linux-3.13-rc6.orig/include/linux/kobject.h	2014-01-02 23:13:24.000000000 +0100
+++ linux-3.13-rc6/include/linux/kobject.h	2014-01-02 23:14:02.000000000 +0100
@@ -27,6 +27,7 @@ 
 #include <linux/wait.h>
 #include <linux/atomic.h>
 #include <linux/workqueue.h>
+#include <linux/completion.h>
 
 #define UEVENT_HELPER_PATH_LEN		256
 #define UEVENT_NUM_ENVP			32	/* number of env pointers */
@@ -66,6 +67,7 @@  struct kobject {
 	struct kobj_type	*ktype;
 	struct sysfs_dirent	*sd;
 	struct kref		kref;
+	struct completion	*free_completion;
 #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
 	struct delayed_work	release;
 #endif
@@ -106,6 +108,7 @@  extern int __must_check kobject_move(str
 
 extern struct kobject *kobject_get(struct kobject *kobj);
 extern void kobject_put(struct kobject *kobj);
+extern void kobject_put_wait(struct kobject *kobj);
 
 extern const void *kobject_namespace(struct kobject *kobj);
 extern char *kobject_get_path(struct kobject *kobj, gfp_t flag);
Index: linux-3.13-rc6/lib/kobject.c
===================================================================
--- linux-3.13-rc6.orig/lib/kobject.c	2014-01-02 23:13:23.000000000 +0100
+++ linux-3.13-rc6/lib/kobject.c	2014-01-02 23:17:01.000000000 +0100
@@ -172,6 +172,7 @@  static void kobject_init_internal(struct
 	if (!kobj)
 		return;
 	kref_init(&kobj->kref);
+	kobj->free_completion = NULL;
 	INIT_LIST_HEAD(&kobj->entry);
 	kobj->state_in_sysfs = 0;
 	kobj->state_add_uevent_sent = 0;
@@ -577,15 +578,11 @@  static void kobject_cleanup(struct kobje
 {
 	struct kobj_type *t = get_ktype(kobj);
 	const char *name = kobj->name;
+	struct completion *free_completion = kobj->free_completion;
 
 	pr_debug("kobject: '%s' (%p): %s, parent %p\n",
 		 kobject_name(kobj), kobj, __func__, kobj->parent);
 
-	if (t && !t->release)
-		pr_debug("kobject: '%s' (%p): does not have a release() "
-			 "function, it is broken and must be fixed.\n",
-			 kobject_name(kobj), kobj);
-
 	/* send "remove" if the caller did not do it but sent "add" */
 	if (kobj->state_add_uevent_sent && !kobj->state_remove_uevent_sent) {
 		pr_debug("kobject: '%s' (%p): auto cleanup 'remove' event\n",
@@ -611,6 +608,10 @@  static void kobject_cleanup(struct kobje
 		pr_debug("kobject: '%s': free name\n", name);
 		kfree(name);
 	}
+
+	/* if someone is waiting for the kobject to be freed, wake him up */
+	if (free_completion)
+		complete(free_completion);
 }
 
 #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
@@ -651,6 +652,28 @@  void kobject_put(struct kobject *kobj)
 	}
 }
 
+/**
+ * kobject_put - decrement refcount for object and wait until it reaches zero.
+ * @kobj: object.
+ *
+ * Decrement the refcount, and wait until the refcount reaches zero and the
+ * kobject is freed.
+ *
+ * This function should be called from the driver unload routine. It must not
+ * be called concurrently on the same kobject. When this function returns, it
+ * is guaranteed that the kobject was freed.
+ */
+void kobject_put_wait(struct kobject *kobj)
+{
+	if (kobj) {
+		DECLARE_COMPLETION_ONSTACK(completion);
+		BUG_ON(kobj->free_completion);
+		kobj->free_completion = &completion;
+		kobject_put(kobj);
+		wait_for_completion(&completion);
+	}
+}
+
 static void dynamic_kobj_release(struct kobject *kobj)
 {
 	pr_debug("kobject: (%p): %s\n", kobj, __func__);
@@ -1076,6 +1099,7 @@  void kobj_ns_drop(enum kobj_ns_type type
 
 EXPORT_SYMBOL(kobject_get);
 EXPORT_SYMBOL(kobject_put);
+EXPORT_SYMBOL(kobject_put_wait);
 EXPORT_SYMBOL(kobject_del);
 
 EXPORT_SYMBOL(kset_register);
Index: linux-3.13-rc6/drivers/md/dm-sysfs.c
===================================================================
--- linux-3.13-rc6.orig/drivers/md/dm-sysfs.c	2014-01-02 23:13:24.000000000 +0100
+++ linux-3.13-rc6/drivers/md/dm-sysfs.c	2014-01-02 23:14:02.000000000 +0100
@@ -104,5 +104,5 @@  int dm_sysfs_init(struct mapped_device *
  */
 void dm_sysfs_exit(struct mapped_device *md)
 {
-	kobject_put(dm_kobject(md));
+	kobject_put_wait(dm_kobject(md));
 }