Message ID | 1457343873-14869-1-git-send-email-devesh.sharma@broadcom.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On 3/7/2016 11:44 AM, Devesh Sharma wrote: > Fixes: 35d4a0b63dc0 ("IB/uverbs: Fix race between ib_uverbs_open and remove_one") It fixes 036b10635739 (IB/uverbs: Enable device removal when there are active user space applications) and not the commit that you pointed on. > > While testing ocrdma for disassociate_ucontext support following > kernel panic was seen: > > BUG: unable to handle kernel paging request at ffffffffa07ccd7a > [67139.981020] IP: [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a > [67139.987185] PGD 19c5067 PUD 19c6063 PMD 469d08067 PTE 0 > [67139.993370] Oops: 0010 [#1] SMP > [67140.257286] Call Trace: > [67140.260665] [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0 > [67140.268337] [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core] > [67140.276009] [<ffffffffa03ee5f0>] ? ib_uverbs_cleanup_ucontext+0x320/0x440 [ib_uverbs] > [67140.285550] [<ffffffffa03ee9e9>] ? ib_uverbs_close+0x59/0xb0 [ib_uverbs] > [67140.293807] [<ffffffff811ff744>] ? __fput+0xe4/0x210 > [67140.300132] [<ffffffff811ff8ae>] ? ____fput+0xe/0x10 > [67140.306457] [<ffffffff8109b697>] ? task_work_run+0x77/0x90 > [67140.313388] [<ffffffff81081af2>] ? do_exit+0x2d2/0xab0 > [67140.319910] [<ffffffff8108234f>] ? do_group_exit+0x3f/0xa0 > [67140.326821] [<ffffffff8108d54c>] ? get_signal+0x1cc/0x5e0 > [67140.333635] [<ffffffff81017387>] ? do_signal+0x37/0x660 > [67140.340257] [<ffffffffa021669a>] ? ucma_write+0x7a/0xc0 [rdma_ucm] > [67140.347949] [<ffffffff81079c37>] ? exit_to_usermode_loop+0x59/0xa2 > [67140.355651] [<ffffffff81003bad>] ? syscall_return_slowpath+0x8d/0xa0 > [67140.363554] [<ffffffff81680dcc>] ? int_ret_from_sys_call+0x25/0x8f > [67140.371259] Code: Bad RIP value. > [67140.375678] RIP [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a > [67140.382314] RSP <ffff880867727ad8> > [67140.386894] CR2: ffffffffa07ccd7a > [67140.393737] ---[ end trace 807b4472c30412d0 ]--- > [67141.682413] Kernel panic - not syncing: Fatal exception > [67141.682431] Kernel Offset: disabled > [67141.733934] ---[ end Kernel panic - not syncing: Fatal exception > > Root Cause: > > During rmmod <vendor-driver> "ib_uverbs_close()" context is > still running, while "ib_uverbs_remove_one()" context completes and > ends up freeing ib_dev pointer, thus causing a Kernel Panic. Kernel Panic -> kernel panic. > > This patch fixes the race. ib_uverbs_close validates dev->ib_dev against NULL > inside an srcu lock. If it is NULL, it waits for a completion and drops the srcu > else continues with the normal flow. Need to fix this description as of the expected code change, see below. Please also describe why/how it solves the problem and not what it does. > > CC: Yishai Hadas <yishaih@mellanox.com> > Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> > --- > drivers/infiniband/core/uverbs.h | 1 + > drivers/infiniband/core/uverbs_main.c | 18 ++++++++++++++++-- > 2 files changed, 17 insertions(+), 2 deletions(-) > > diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h > index 612ccfd..94a7339 100644 > --- a/drivers/infiniband/core/uverbs.h > +++ b/drivers/infiniband/core/uverbs.h > @@ -121,6 +121,7 @@ struct ib_uverbs_file { > struct ib_event_handler event_handler; > struct ib_uverbs_event_file *async_file; > struct list_head list; > + struct completion fcomp; > int is_closed; > }; > > diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c > index 39680ae..9531168 100644 > --- a/drivers/infiniband/core/uverbs_main.c > +++ b/drivers/infiniband/core/uverbs_main.c > @@ -928,6 +928,7 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp) > file->async_file = NULL; > kref_init(&file->ref); > mutex_init(&file->mutex); > + init_completion(&file->fcomp); > > filp->private_data = file; > kobject_get(&dev->kobj); > @@ -954,21 +955,33 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp) > struct ib_uverbs_file *file = filp->private_data; > struct ib_uverbs_device *dev = file->device; > struct ib_ucontext *ucontext = NULL; > + struct ib_device *ib_dev; > + int srcu_key; > > - mutex_lock(&file->device->lists_mutex); > + srcu_key = srcu_read_lock(&dev->disassociate_srcu); > + ib_dev = srcu_dereference(dev->ib_dev, > + &dev->disassociate_srcu); > + if (!ib_dev) You need to free the sruc lock, wait for completion then go to out. Doing in the opposite order as you did below, might end up with a dead-lock in case ib_uverbs_free_hw_resources is waiting for the synchronize_srcu(disassociate_srcu) and can't mark the file as completed. > + goto out; > + > + mutex_lock(&dev->lists_mutex); > ucontext = file->ucontext; > file->ucontext = NULL; > if (!file->is_closed) { > list_del(&file->list); > file->is_closed = 1; > } > - mutex_unlock(&file->device->lists_mutex); > + mutex_unlock(&dev->lists_mutex); > if (ucontext) > ib_uverbs_cleanup_ucontext(file, ucontext); > > if (file->async_file) > kref_put(&file->async_file->ref, ib_uverbs_release_event_file); > > + complete(&file->fcomp); No need to do that in that flow, only in above flow. > +out: > + wait_for_completion(&file->fcomp); > + srcu_read_unlock(&dev->disassociate_srcu, srcu_key); See above, might end-up with a dead-lock, should be done above in opposite order. > kref_put(&file->ref, ib_uverbs_release_file); > kobject_put(&dev->kobj); > > @@ -1199,6 +1212,7 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev, > } > > mutex_lock(&uverbs_dev->lists_mutex); > + complete(&file->fcomp); > kref_put(&file->ref, ib_uverbs_release_file); > } > > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Mar 7, 2016 at 4:44 PM, Yishai Hadas <yishaih@dev.mellanox.co.il> wrote: > On 3/7/2016 11:44 AM, Devesh Sharma wrote: >> >> Fixes: 35d4a0b63dc0 ("IB/uverbs: Fix race between ib_uverbs_open and >> remove_one") > > > It fixes 036b10635739 (IB/uverbs: Enable device removal when there are > active user space applications) and not the commit that you pointed on. > > > >> >> While testing ocrdma for disassociate_ucontext support following >> kernel panic was seen: >> >> BUG: unable to handle kernel paging request at ffffffffa07ccd7a >> [67139.981020] IP: [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a >> [67139.987185] PGD 19c5067 PUD 19c6063 PMD 469d08067 PTE 0 >> [67139.993370] Oops: 0010 [#1] SMP >> [67140.257286] Call Trace: >> [67140.260665] [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0 >> [67140.268337] [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core] >> [67140.276009] [<ffffffffa03ee5f0>] ? >> ib_uverbs_cleanup_ucontext+0x320/0x440 [ib_uverbs] >> [67140.285550] [<ffffffffa03ee9e9>] ? ib_uverbs_close+0x59/0xb0 >> [ib_uverbs] >> [67140.293807] [<ffffffff811ff744>] ? __fput+0xe4/0x210 >> [67140.300132] [<ffffffff811ff8ae>] ? ____fput+0xe/0x10 >> [67140.306457] [<ffffffff8109b697>] ? task_work_run+0x77/0x90 >> [67140.313388] [<ffffffff81081af2>] ? do_exit+0x2d2/0xab0 >> [67140.319910] [<ffffffff8108234f>] ? do_group_exit+0x3f/0xa0 >> [67140.326821] [<ffffffff8108d54c>] ? get_signal+0x1cc/0x5e0 >> [67140.333635] [<ffffffff81017387>] ? do_signal+0x37/0x660 >> [67140.340257] [<ffffffffa021669a>] ? ucma_write+0x7a/0xc0 [rdma_ucm] >> [67140.347949] [<ffffffff81079c37>] ? exit_to_usermode_loop+0x59/0xa2 >> [67140.355651] [<ffffffff81003bad>] ? syscall_return_slowpath+0x8d/0xa0 >> [67140.363554] [<ffffffff81680dcc>] ? int_ret_from_sys_call+0x25/0x8f >> [67140.371259] Code: Bad RIP value. >> [67140.375678] RIP [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a >> [67140.382314] RSP <ffff880867727ad8> >> [67140.386894] CR2: ffffffffa07ccd7a >> [67140.393737] ---[ end trace 807b4472c30412d0 ]--- >> [67141.682413] Kernel panic - not syncing: Fatal exception >> [67141.682431] Kernel Offset: disabled >> [67141.733934] ---[ end Kernel panic - not syncing: Fatal exception >> >> Root Cause: >> >> During rmmod <vendor-driver> "ib_uverbs_close()" context is >> still running, while "ib_uverbs_remove_one()" context completes and >> ends up freeing ib_dev pointer, thus causing a Kernel Panic. > > > Kernel Panic -> kernel panic. Will fix this in V3. >> >> >> This patch fixes the race. ib_uverbs_close validates dev->ib_dev against >> NULL >> inside an srcu lock. If it is NULL, it waits for a completion and drops >> the srcu >> else continues with the normal flow. > > > Need to fix this description as of the expected code change, see below. > Please also describe why/how it solves the problem and not what it does. Sure, will do that. > > >> >> CC: Yishai Hadas <yishaih@mellanox.com> >> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> >> --- >> drivers/infiniband/core/uverbs.h | 1 + >> drivers/infiniband/core/uverbs_main.c | 18 ++++++++++++++++-- >> 2 files changed, 17 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/infiniband/core/uverbs.h >> b/drivers/infiniband/core/uverbs.h >> index 612ccfd..94a7339 100644 >> --- a/drivers/infiniband/core/uverbs.h >> +++ b/drivers/infiniband/core/uverbs.h >> @@ -121,6 +121,7 @@ struct ib_uverbs_file { >> struct ib_event_handler event_handler; >> struct ib_uverbs_event_file *async_file; >> struct list_head list; >> + struct completion fcomp; >> int is_closed; >> }; >> >> diff --git a/drivers/infiniband/core/uverbs_main.c >> b/drivers/infiniband/core/uverbs_main.c >> index 39680ae..9531168 100644 >> --- a/drivers/infiniband/core/uverbs_main.c >> +++ b/drivers/infiniband/core/uverbs_main.c >> @@ -928,6 +928,7 @@ static int ib_uverbs_open(struct inode *inode, struct >> file *filp) >> file->async_file = NULL; >> kref_init(&file->ref); >> mutex_init(&file->mutex); >> + init_completion(&file->fcomp); >> >> filp->private_data = file; >> kobject_get(&dev->kobj); >> @@ -954,21 +955,33 @@ static int ib_uverbs_close(struct inode *inode, >> struct file *filp) >> struct ib_uverbs_file *file = filp->private_data; >> struct ib_uverbs_device *dev = file->device; >> struct ib_ucontext *ucontext = NULL; >> + struct ib_device *ib_dev; >> + int srcu_key; >> >> - mutex_lock(&file->device->lists_mutex); >> + srcu_key = srcu_read_lock(&dev->disassociate_srcu); >> + ib_dev = srcu_dereference(dev->ib_dev, >> + &dev->disassociate_srcu); >> + if (!ib_dev) > > > You need to free the sruc lock, wait for completion then go to out. > Doing in the opposite order as you did below, might end up with a dead-lock > in case ib_uverbs_free_hw_resources is waiting for the > synchronize_srcu(disassociate_srcu) and can't mark the file as completed. Agreed. Will fix it. > >> + goto out; >> + >> + mutex_lock(&dev->lists_mutex); >> ucontext = file->ucontext; >> file->ucontext = NULL; >> if (!file->is_closed) { >> list_del(&file->list); >> file->is_closed = 1; >> } >> - mutex_unlock(&file->device->lists_mutex); >> + mutex_unlock(&dev->lists_mutex); >> if (ucontext) >> ib_uverbs_cleanup_ucontext(file, ucontext); >> >> if (file->async_file) >> kref_put(&file->async_file->ref, >> ib_uverbs_release_event_file); >> >> + complete(&file->fcomp); > > > No need to do that in that flow, only in above flow. Agreed. > >> +out: >> + wait_for_completion(&file->fcomp); >> + srcu_read_unlock(&dev->disassociate_srcu, srcu_key); > > > See above, might end-up with a dead-lock, should be done above in opposite > order. Will fix in next version. > > >> kref_put(&file->ref, ib_uverbs_release_file); >> kobject_put(&dev->kobj); >> >> @@ -1199,6 +1212,7 @@ static void ib_uverbs_free_hw_resources(struct >> ib_uverbs_device *uverbs_dev, >> } >> >> mutex_lock(&uverbs_dev->lists_mutex); >> + complete(&file->fcomp); >> kref_put(&file->ref, ib_uverbs_release_file); >> } >> >> > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h index 612ccfd..94a7339 100644 --- a/drivers/infiniband/core/uverbs.h +++ b/drivers/infiniband/core/uverbs.h @@ -121,6 +121,7 @@ struct ib_uverbs_file { struct ib_event_handler event_handler; struct ib_uverbs_event_file *async_file; struct list_head list; + struct completion fcomp; int is_closed; }; diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c index 39680ae..9531168 100644 --- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -928,6 +928,7 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp) file->async_file = NULL; kref_init(&file->ref); mutex_init(&file->mutex); + init_completion(&file->fcomp); filp->private_data = file; kobject_get(&dev->kobj); @@ -954,21 +955,33 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp) struct ib_uverbs_file *file = filp->private_data; struct ib_uverbs_device *dev = file->device; struct ib_ucontext *ucontext = NULL; + struct ib_device *ib_dev; + int srcu_key; - mutex_lock(&file->device->lists_mutex); + srcu_key = srcu_read_lock(&dev->disassociate_srcu); + ib_dev = srcu_dereference(dev->ib_dev, + &dev->disassociate_srcu); + if (!ib_dev) + goto out; + + mutex_lock(&dev->lists_mutex); ucontext = file->ucontext; file->ucontext = NULL; if (!file->is_closed) { list_del(&file->list); file->is_closed = 1; } - mutex_unlock(&file->device->lists_mutex); + mutex_unlock(&dev->lists_mutex); if (ucontext) ib_uverbs_cleanup_ucontext(file, ucontext); if (file->async_file) kref_put(&file->async_file->ref, ib_uverbs_release_event_file); + complete(&file->fcomp); +out: + wait_for_completion(&file->fcomp); + srcu_read_unlock(&dev->disassociate_srcu, srcu_key); kref_put(&file->ref, ib_uverbs_release_file); kobject_put(&dev->kobj); @@ -1199,6 +1212,7 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev, } mutex_lock(&uverbs_dev->lists_mutex); + complete(&file->fcomp); kref_put(&file->ref, ib_uverbs_release_file); }
Fixes: 35d4a0b63dc0 ("IB/uverbs: Fix race between ib_uverbs_open and remove_one") While testing ocrdma for disassociate_ucontext support following kernel panic was seen: BUG: unable to handle kernel paging request at ffffffffa07ccd7a [67139.981020] IP: [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a [67139.987185] PGD 19c5067 PUD 19c6063 PMD 469d08067 PTE 0 [67139.993370] Oops: 0010 [#1] SMP [67140.257286] Call Trace: [67140.260665] [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0 [67140.268337] [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core] [67140.276009] [<ffffffffa03ee5f0>] ? ib_uverbs_cleanup_ucontext+0x320/0x440 [ib_uverbs] [67140.285550] [<ffffffffa03ee9e9>] ? ib_uverbs_close+0x59/0xb0 [ib_uverbs] [67140.293807] [<ffffffff811ff744>] ? __fput+0xe4/0x210 [67140.300132] [<ffffffff811ff8ae>] ? ____fput+0xe/0x10 [67140.306457] [<ffffffff8109b697>] ? task_work_run+0x77/0x90 [67140.313388] [<ffffffff81081af2>] ? do_exit+0x2d2/0xab0 [67140.319910] [<ffffffff8108234f>] ? do_group_exit+0x3f/0xa0 [67140.326821] [<ffffffff8108d54c>] ? get_signal+0x1cc/0x5e0 [67140.333635] [<ffffffff81017387>] ? do_signal+0x37/0x660 [67140.340257] [<ffffffffa021669a>] ? ucma_write+0x7a/0xc0 [rdma_ucm] [67140.347949] [<ffffffff81079c37>] ? exit_to_usermode_loop+0x59/0xa2 [67140.355651] [<ffffffff81003bad>] ? syscall_return_slowpath+0x8d/0xa0 [67140.363554] [<ffffffff81680dcc>] ? int_ret_from_sys_call+0x25/0x8f [67140.371259] Code: Bad RIP value. [67140.375678] RIP [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a [67140.382314] RSP <ffff880867727ad8> [67140.386894] CR2: ffffffffa07ccd7a [67140.393737] ---[ end trace 807b4472c30412d0 ]--- [67141.682413] Kernel panic - not syncing: Fatal exception [67141.682431] Kernel Offset: disabled [67141.733934] ---[ end Kernel panic - not syncing: Fatal exception Root Cause: During rmmod <vendor-driver> "ib_uverbs_close()" context is still running, while "ib_uverbs_remove_one()" context completes and ends up freeing ib_dev pointer, thus causing a Kernel Panic. This patch fixes the race. ib_uverbs_close validates dev->ib_dev against NULL inside an srcu lock. If it is NULL, it waits for a completion and drops the srcu else continues with the normal flow. CC: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> --- drivers/infiniband/core/uverbs.h | 1 + drivers/infiniband/core/uverbs_main.c | 18 ++++++++++++++++-- 2 files changed, 17 insertions(+), 2 deletions(-)