Message ID | 1486754370-3057-1-git-send-email-logang@deltatee.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hey, Also on the subject of very minor fixes: I noticed drivers/dax is not in the maintainers file. I just assumed the nvdimm list should have been included with those from get_maintainers. Thanks, Logan On 10/02/17 12:19 PM, Logan Gunthorpe wrote: > I copied this code and per feedback from Greg Kroah-Hartman [1] the > cdev's kobject's parent should not be set to the related device. > This should have minor consequences but isn't doing what anyone > expects it to. > > This patch then fixes device-dax so it doesn't make the same mistake. > > [1] https://lkml.org/lkml/2017/2/10/370 > > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> > --- > drivers/dax/dax.c | 6 +----- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c > index ed758b7..24e53b7 100644 > --- a/drivers/dax/dax.c > +++ b/drivers/dax/dax.c > @@ -699,13 +699,9 @@ struct dax_dev *devm_create_dax_dev(struct dax_region *dax_region, > goto err_inode; > } > > - /* device_initialize() so cdev can reference kobj parent */ > - device_initialize(dev); > - > cdev = &dax_dev->cdev; > cdev_init(cdev, &dax_fops); > cdev->owner = parent->driver->owner; > - cdev->kobj.parent = &dev->kobj; > rc = cdev_add(&dax_dev->cdev, dev_t, 1); > if (rc) > goto err_cdev; > @@ -722,7 +718,7 @@ struct dax_dev *devm_create_dax_dev(struct dax_region *dax_region, > dev->groups = dax_attribute_groups; > dev->release = dax_dev_release; > dev_set_name(dev, "dax%d.%d", dax_region->id, dax_dev->id); > - rc = device_add(dev); > + rc = device_register(dev); > if (rc) { > put_device(dev); > return ERR_PTR(rc); >
On Fri, Feb 10, 2017 at 11:19 AM, Logan Gunthorpe <logang@deltatee.com> wrote: > I copied this code and per feedback from Greg Kroah-Hartman [1] the > cdev's kobject's parent should not be set to the related device. > This should have minor consequences but isn't doing what anyone > expects it to. > > This patch then fixes device-dax so it doesn't make the same mistake. > > [1] https://lkml.org/lkml/2017/2/10/370 > > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Thanks for following up with this fix, but this causes a use-after-free regression: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC [..] Call Trace: vsnprintf+0x2d7/0x500 snprintf+0x49/0x60 dev_vprintk_emit+0x68/0x230 ? debug_lockdep_rcu_enabled+0x1d/0x20 ? trace_hardirqs_off+0xd/0x10 ? cmpxchg_double_slab.isra.70+0x15a/0x1c0 ? __slab_free+0x134/0x290 dev_printk_emit+0x4e/0x70 __dynamic_dev_dbg+0xc8/0x110 ? __lock_acquire+0x33d/0x1290 dax_dev_huge_fault+0xee/0x570 [dax] __handle_mm_fault+0x5aa/0x10a0 handle_mm_fault+0x154/0x350 ? handle_mm_fault+0x3c/0x350 __do_page_fault+0x26b/0x4c0 trace_do_page_fault+0x58/0x270 do_async_page_fault+0x1a/0xa0 async_page_fault+0x28/0x30 I added this reference explicitly so the parent struct device has the correct lifetime after this feedback from Al. https://lists.01.org/pipermail/linux-nvdimm/2016-August/006563.html ...so I'm wondering what the actual problem is with setting cdev->parent?
On Fri, Feb 10, 2017 at 12:19:30PM -0700, Logan Gunthorpe wrote: > I copied this code and per feedback from Greg Kroah-Hartman [1] the > cdev's kobject's parent should not be set to the related device. > This should have minor consequences but isn't doing what anyone > expects it to. > > This patch then fixes device-dax so it doesn't make the same mistake. > > [1] https://lkml.org/lkml/2017/2/10/370 > > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Fri, Feb 10, 2017 at 11:41:20AM -0800, Dan Williams wrote: > On Fri, Feb 10, 2017 at 11:19 AM, Logan Gunthorpe <logang@deltatee.com> wrote: > > I copied this code and per feedback from Greg Kroah-Hartman [1] the > > cdev's kobject's parent should not be set to the related device. > > This should have minor consequences but isn't doing what anyone > > expects it to. > > > > This patch then fixes device-dax so it doesn't make the same mistake. > > > > [1] https://lkml.org/lkml/2017/2/10/370 > > > > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> > > Thanks for following up with this fix, but this causes a > use-after-free regression: > > general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC > [..] > Call Trace: > vsnprintf+0x2d7/0x500 > snprintf+0x49/0x60 > dev_vprintk_emit+0x68/0x230 > ? debug_lockdep_rcu_enabled+0x1d/0x20 > ? trace_hardirqs_off+0xd/0x10 > ? cmpxchg_double_slab.isra.70+0x15a/0x1c0 > ? __slab_free+0x134/0x290 > dev_printk_emit+0x4e/0x70 > __dynamic_dev_dbg+0xc8/0x110 > ? __lock_acquire+0x33d/0x1290 > dax_dev_huge_fault+0xee/0x570 [dax] > __handle_mm_fault+0x5aa/0x10a0 > handle_mm_fault+0x154/0x350 > ? handle_mm_fault+0x3c/0x350 > __do_page_fault+0x26b/0x4c0 > trace_do_page_fault+0x58/0x270 > do_async_page_fault+0x1a/0xa0 > async_page_fault+0x28/0x30 > > I added this reference explicitly so the parent struct device has the > correct lifetime after this feedback from Al. > > https://lists.01.org/pipermail/linux-nvdimm/2016-August/006563.html > > ...so I'm wondering what the actual problem is with setting cdev->parent? It shouldn't do anything at all. The kobject in a cdev isn't a "normal" kobject, it doesn't show up in sysfs, or anywhere else. It's used for an internal representation to the cdev code (a kmap) to look up the object to call when userspace opens the device node in a quick manner. Now changing from initialize/add to just register, does do different things, perhaps that is the issue here. Just try removing the cdev->kobject parent stuff and see if that causes a problem or not. thanks, greg k-h
On Fri, Feb 10, 2017 at 12:17 PM, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > On Fri, Feb 10, 2017 at 11:41:20AM -0800, Dan Williams wrote: >> On Fri, Feb 10, 2017 at 11:19 AM, Logan Gunthorpe <logang@deltatee.com> wrote: >> > I copied this code and per feedback from Greg Kroah-Hartman [1] the >> > cdev's kobject's parent should not be set to the related device. >> > This should have minor consequences but isn't doing what anyone >> > expects it to. >> > >> > This patch then fixes device-dax so it doesn't make the same mistake. >> > >> > [1] https://lkml.org/lkml/2017/2/10/370 >> > >> > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> >> >> Thanks for following up with this fix, but this causes a >> use-after-free regression: >> >> general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC >> [..] >> Call Trace: >> vsnprintf+0x2d7/0x500 >> snprintf+0x49/0x60 >> dev_vprintk_emit+0x68/0x230 >> ? debug_lockdep_rcu_enabled+0x1d/0x20 >> ? trace_hardirqs_off+0xd/0x10 >> ? cmpxchg_double_slab.isra.70+0x15a/0x1c0 >> ? __slab_free+0x134/0x290 >> dev_printk_emit+0x4e/0x70 >> __dynamic_dev_dbg+0xc8/0x110 >> ? __lock_acquire+0x33d/0x1290 >> dax_dev_huge_fault+0xee/0x570 [dax] >> __handle_mm_fault+0x5aa/0x10a0 >> handle_mm_fault+0x154/0x350 >> ? handle_mm_fault+0x3c/0x350 >> __do_page_fault+0x26b/0x4c0 >> trace_do_page_fault+0x58/0x270 >> do_async_page_fault+0x1a/0xa0 >> async_page_fault+0x28/0x30 >> >> I added this reference explicitly so the parent struct device has the >> correct lifetime after this feedback from Al. >> >> https://lists.01.org/pipermail/linux-nvdimm/2016-August/006563.html >> >> ...so I'm wondering what the actual problem is with setting cdev->parent? > > It shouldn't do anything at all. The kobject in a cdev isn't a "normal" > kobject, it doesn't show up in sysfs, or anywhere else. It's used for > an internal representation to the cdev code (a kmap) to look up the > object to call when userspace opens the device node in a quick manner. > > Now changing from initialize/add to just register, does do different > things, perhaps that is the issue here. Just try removing the > cdev->kobject parent stuff and see if that causes a problem or not. > That doesn't help. I rely on the "kobject_get(p->kobj.parent);" in cdev_add() to pin my device and cdev_default_release() to free it.
On Fri, Feb 10, 2017 at 02:25:35PM -0800, Dan Williams wrote: > On Fri, Feb 10, 2017 at 12:17 PM, Greg Kroah-Hartman > <gregkh@linuxfoundation.org> wrote: > > On Fri, Feb 10, 2017 at 11:41:20AM -0800, Dan Williams wrote: > >> On Fri, Feb 10, 2017 at 11:19 AM, Logan Gunthorpe <logang@deltatee.com> wrote: > >> > I copied this code and per feedback from Greg Kroah-Hartman [1] the > >> > cdev's kobject's parent should not be set to the related device. > >> > This should have minor consequences but isn't doing what anyone > >> > expects it to. > >> > > >> > This patch then fixes device-dax so it doesn't make the same mistake. > >> > > >> > [1] https://lkml.org/lkml/2017/2/10/370 > >> > > >> > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> > >> > >> Thanks for following up with this fix, but this causes a > >> use-after-free regression: > >> > >> general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC > >> [..] > >> Call Trace: > >> vsnprintf+0x2d7/0x500 > >> snprintf+0x49/0x60 > >> dev_vprintk_emit+0x68/0x230 > >> ? debug_lockdep_rcu_enabled+0x1d/0x20 > >> ? trace_hardirqs_off+0xd/0x10 > >> ? cmpxchg_double_slab.isra.70+0x15a/0x1c0 > >> ? __slab_free+0x134/0x290 > >> dev_printk_emit+0x4e/0x70 > >> __dynamic_dev_dbg+0xc8/0x110 > >> ? __lock_acquire+0x33d/0x1290 > >> dax_dev_huge_fault+0xee/0x570 [dax] > >> __handle_mm_fault+0x5aa/0x10a0 > >> handle_mm_fault+0x154/0x350 > >> ? handle_mm_fault+0x3c/0x350 > >> __do_page_fault+0x26b/0x4c0 > >> trace_do_page_fault+0x58/0x270 > >> do_async_page_fault+0x1a/0xa0 > >> async_page_fault+0x28/0x30 > >> > >> I added this reference explicitly so the parent struct device has the > >> correct lifetime after this feedback from Al. > >> > >> https://lists.01.org/pipermail/linux-nvdimm/2016-August/006563.html > >> > >> ...so I'm wondering what the actual problem is with setting cdev->parent? > > > > It shouldn't do anything at all. The kobject in a cdev isn't a "normal" > > kobject, it doesn't show up in sysfs, or anywhere else. It's used for > > an internal representation to the cdev code (a kmap) to look up the > > object to call when userspace opens the device node in a quick manner. > > > > Now changing from initialize/add to just register, does do different > > things, perhaps that is the issue here. Just try removing the > > cdev->kobject parent stuff and see if that causes a problem or not. > > > > That doesn't help. I rely on the "kobject_get(p->kobj.parent);" in > cdev_add() to pin my device and cdev_default_release() to free it. "pin it" where? Why do you need this? That feels really "odd" to me...
On Fri, Feb 10, 2017 at 11:16 PM, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > On Fri, Feb 10, 2017 at 02:25:35PM -0800, Dan Williams wrote: >> On Fri, Feb 10, 2017 at 12:17 PM, Greg Kroah-Hartman >> <gregkh@linuxfoundation.org> wrote: >> > On Fri, Feb 10, 2017 at 11:41:20AM -0800, Dan Williams wrote: >> >> On Fri, Feb 10, 2017 at 11:19 AM, Logan Gunthorpe <logang@deltatee.com> wrote: >> >> > I copied this code and per feedback from Greg Kroah-Hartman [1] the >> >> > cdev's kobject's parent should not be set to the related device. >> >> > This should have minor consequences but isn't doing what anyone >> >> > expects it to. >> >> > >> >> > This patch then fixes device-dax so it doesn't make the same mistake. >> >> > >> >> > [1] https://lkml.org/lkml/2017/2/10/370 >> >> > >> >> > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> >> >> >> >> Thanks for following up with this fix, but this causes a >> >> use-after-free regression: >> >> >> >> general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC >> >> [..] >> >> Call Trace: >> >> vsnprintf+0x2d7/0x500 >> >> snprintf+0x49/0x60 >> >> dev_vprintk_emit+0x68/0x230 >> >> ? debug_lockdep_rcu_enabled+0x1d/0x20 >> >> ? trace_hardirqs_off+0xd/0x10 >> >> ? cmpxchg_double_slab.isra.70+0x15a/0x1c0 >> >> ? __slab_free+0x134/0x290 >> >> dev_printk_emit+0x4e/0x70 >> >> __dynamic_dev_dbg+0xc8/0x110 >> >> ? __lock_acquire+0x33d/0x1290 >> >> dax_dev_huge_fault+0xee/0x570 [dax] >> >> __handle_mm_fault+0x5aa/0x10a0 >> >> handle_mm_fault+0x154/0x350 >> >> ? handle_mm_fault+0x3c/0x350 >> >> __do_page_fault+0x26b/0x4c0 >> >> trace_do_page_fault+0x58/0x270 >> >> do_async_page_fault+0x1a/0xa0 >> >> async_page_fault+0x28/0x30 >> >> >> >> I added this reference explicitly so the parent struct device has the >> >> correct lifetime after this feedback from Al. >> >> >> >> https://lists.01.org/pipermail/linux-nvdimm/2016-August/006563.html >> >> >> >> ...so I'm wondering what the actual problem is with setting cdev->parent? >> > >> > It shouldn't do anything at all. The kobject in a cdev isn't a "normal" >> > kobject, it doesn't show up in sysfs, or anywhere else. It's used for >> > an internal representation to the cdev code (a kmap) to look up the >> > object to call when userspace opens the device node in a quick manner. >> > >> > Now changing from initialize/add to just register, does do different >> > things, perhaps that is the issue here. Just try removing the >> > cdev->kobject parent stuff and see if that causes a problem or not. >> > >> >> That doesn't help. I rely on the "kobject_get(p->kobj.parent);" in >> cdev_add() to pin my device and cdev_default_release() to free it. > > "pin it" where? Why do you need this? That feels really "odd" to me... If there's a more idiomatic way to achieve what drivers/dax/dax.c is doing I'm of course open to switching... The cdev is embedded in a struct dax_dev. /** * struct dax_dev - subdivision of a dax region * @region - parent region * @dev - device backing the character device * @cdev - core chardev data * @alive - !alive + rcu grace period == no new mappings can be established * @id - child id in the region * @num_resources - number of physical address extents in this device * @res - array of physical address ranges */ struct dax_dev { struct dax_region *region; struct inode *inode; struct device dev; struct cdev cdev; bool alive; int id; int num_resources; struct resource res[0]; }; The only I/O operation that can be performed on this device file is mmap. static const struct file_operations dax_fops = { .llseek = noop_llseek, .owner = THIS_MODULE, .open = dax_open, .release = dax_release, .get_unmapped_area = dax_get_unmapped_area, .mmap = dax_mmap, }; When the device is unregistered it invalidates all existing mappings, but the driver may continue to service vm fault requests until the final put of the cdev. Until that time the fault handler needs to be able to check dax_dev->alive. Since the final cdev put is handled by the vfs I use the cdev's kobject to keep the struct dax_dev instance alive.
On 11/02/17 01:56 AM, Dan Williams wrote: > When the device is unregistered it invalidates all existing mappings, > but the driver may continue to service vm fault requests until the > final put of the cdev. Until that time the fault handler needs to be > able to check dax_dev->alive. Since the final cdev put is handled by > the vfs I use the cdev's kobject to keep the struct dax_dev instance > alive. I'm just taking a wild stab at this, but would it not make sense to just take a reference to the dax_dev device in dax_open and put it back it in dax_release? (Or perhaps, in the open/close of the vm_ops.) That way the structure won't be free'd until there are no users and alive will always be accessible. It would also be a bit more clear as to what's going on because you are actually making a reference in filp->private_data. Logan
On Sat, Feb 11, 2017 at 9:59 AM, Logan Gunthorpe <logang@deltatee.com> wrote: > On 11/02/17 01:56 AM, Dan Williams wrote: >> >> When the device is unregistered it invalidates all existing mappings, >> but the driver may continue to service vm fault requests until the >> final put of the cdev. Until that time the fault handler needs to be >> able to check dax_dev->alive. Since the final cdev put is handled by >> the vfs I use the cdev's kobject to keep the struct dax_dev instance >> alive. > > > I'm just taking a wild stab at this, but would it not make sense to just > take a reference to the dax_dev device in dax_open and put it back it in > dax_release? (Or perhaps, in the open/close of the vm_ops.) That way the > structure won't be free'd until there are no users and alive will always be > accessible. > > It would also be a bit more clear as to what's going on because you are > actually making a reference in filp->private_data. > Why, when the lifetime of the cdev is already correct? See commit ba09c01d2fa8 "dax: convert to the cdev api". I used to take explicit references like you suggest, but cdev made it cleaner.
On 11/02/17 11:27 AM, Dan Williams wrote: > Why, when the lifetime of the cdev is already correct? Well, it's only correct if you use the kobj parent trick which Greg is arguing against. As someone reviewing/copying code that trick is unclear, undocumented and it looks rather odd messing with internal kobjects. Taking the explicit reference would be very clear, very standard and only net one additional line. > See commit ba09c01d2fa8 "dax: convert to the cdev api". I used to take > explicit references like you suggest, but cdev made it cleaner. I agree that, on the whole, that patch makes things a good deal cleaner. I'm not so sure that this one small aspect is an improvement. In any case, it's up to you. If you'd like I can certainly submit a v2 patch that adds the get/put. Thanks, Logan
On Sat, Feb 11, 2017 at 10:43 AM, Logan Gunthorpe <logang@deltatee.com> wrote: > > On 11/02/17 11:27 AM, Dan Williams wrote: >> Why, when the lifetime of the cdev is already correct? > > Well, it's only correct if you use the kobj parent trick which Greg is > arguing against. As someone reviewing/copying code that trick is > unclear, undocumented and it looks rather odd messing with internal > kobjects. Taking the explicit reference would be very clear, very > standard and only net one additional line. > >> See commit ba09c01d2fa8 "dax: convert to the cdev api". I used to take >> explicit references like you suggest, but cdev made it cleaner. > > I agree that, on the whole, that patch makes things a good deal cleaner. > I'm not so sure that this one small aspect is an improvement. > > In any case, it's up to you. If you'd like I can certainly submit a v2 > patch that adds the get/put. Can we meet in the middle and just add some comments about what is going on? It's a shame to add reference counts for something that is already properly reference counted.
On Sat, Feb 11, 2017 at 10:55 AM, Dan Williams <dan.j.williams@intel.com> wrote: > On Sat, Feb 11, 2017 at 10:43 AM, Logan Gunthorpe <logang@deltatee.com> wrote: >> >> On 11/02/17 11:27 AM, Dan Williams wrote: >>> Why, when the lifetime of the cdev is already correct? >> >> Well, it's only correct if you use the kobj parent trick which Greg is >> arguing against. As someone reviewing/copying code that trick is >> unclear, undocumented and it looks rather odd messing with internal >> kobjects. Taking the explicit reference would be very clear, very >> standard and only net one additional line. >> >>> See commit ba09c01d2fa8 "dax: convert to the cdev api". I used to take >>> explicit references like you suggest, but cdev made it cleaner. >> >> I agree that, on the whole, that patch makes things a good deal cleaner. >> I'm not so sure that this one small aspect is an improvement. >> >> In any case, it's up to you. If you'd like I can certainly submit a v2 >> patch that adds the get/put. > > Can we meet in the middle and just add some comments about what is going on? > > It's a shame to add reference counts for something that is already > properly reference counted. Also when using an embedded cdev how would you recommend avoiding this problem? https://lists.01.org/pipermail/linux-nvdimm/2016-August/006562.html
On 11/02/17 11:58 AM, Dan Williams wrote:
> Also when using an embedded cdev how would you recommend avoiding this problem?
I don't know. Hopefully, Greg has a good idea. But it sounds like a
general problem that a lot of cdev's actually suffer from without
realizing. Perhaps we need a more general solution. Some way for the
cdev to reference its containing structure in a way that it's designed
for such that anyone writing a driver will do the right thing without
needing to dive into the kobjects.
Logan
diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c index ed758b7..24e53b7 100644 --- a/drivers/dax/dax.c +++ b/drivers/dax/dax.c @@ -699,13 +699,9 @@ struct dax_dev *devm_create_dax_dev(struct dax_region *dax_region, goto err_inode; } - /* device_initialize() so cdev can reference kobj parent */ - device_initialize(dev); - cdev = &dax_dev->cdev; cdev_init(cdev, &dax_fops); cdev->owner = parent->driver->owner; - cdev->kobj.parent = &dev->kobj; rc = cdev_add(&dax_dev->cdev, dev_t, 1); if (rc) goto err_cdev; @@ -722,7 +718,7 @@ struct dax_dev *devm_create_dax_dev(struct dax_region *dax_region, dev->groups = dax_attribute_groups; dev->release = dax_dev_release; dev_set_name(dev, "dax%d.%d", dax_region->id, dax_dev->id); - rc = device_add(dev); + rc = device_register(dev); if (rc) { put_device(dev); return ERR_PTR(rc);
I copied this code and per feedback from Greg Kroah-Hartman [1] the cdev's kobject's parent should not be set to the related device. This should have minor consequences but isn't doing what anyone expects it to. This patch then fixes device-dax so it doesn't make the same mistake. [1] https://lkml.org/lkml/2017/2/10/370 Signed-off-by: Logan Gunthorpe <logang@deltatee.com> --- drivers/dax/dax.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)