Message ID | 20160307190833.GA1886@obsidianresearch.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Tue, Mar 8, 2016 at 12:38 AM, Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote: > On Mon, Mar 07, 2016 at 04:44:33AM -0500, Devesh Sharma wrote: > >> [67140.260665] [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0 >> [67140.268337] [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core] > > So, ib_dereg_mr at this point: > > ret = mr->device->dereg_mr(mr); > > Is running when mr->device is already freed? Yes. > >> During rmmod <vendor-driver> "ib_uverbs_close()" context is >> still running, while "ib_uverbs_remove_one()" context completes and >> ends up freeing ib_dev pointer, thus causing a Kernel Panic. > > Hurm.. > > So ib_uverbs_close is busy running in ib_uverbs_cleanup_ucontext and > then ib_uverbs_free_hw_resources is called? Yes, and completed also to unblock ib_unregister_device() which actually free-up device pointer. > > At first blush it certainly looks like the locking around > ib_uverbs_cleanup_context is wrong. I agree, from both locations it is called without any synchronization. > >> This patch fixes the race. ib_uverbs_close validates dev->ib_dev against NULL >> inside an srcu lock. If it is NULL, it waits for a completion and drops the srcu >> else continues with the normal flow. > > Hum.. So this is really weird, this patch is bascially duplicating a > mutex with srcu and a completion?? Agreed. > > What is wrong with simply this: > > --- a/drivers/infiniband/core/uverbs_main.c > +++ b/drivers/infiniband/core/uverbs_main.c > @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp) > list_del(&file->list); > file->is_closed = 1; > } > - mutex_unlock(&file->device->lists_mutex); > if (ucontext) > ib_uverbs_cleanup_ucontext(file, ucontext); > + mutex_unlock(&file->device->lists_mutex); > > > ?? There is following comment about list_mutex in uverbs_main.c around line number 1200: /* We must release the mutex before going ahead and calling * disassociate_ucontext. disassociate_ucontext might end up * indirectly calling uverbs_close, for example due to freeing * the resources (e.g mmput). */ > > Noting that ib_uverbs_free_hw_resources holds lists_mutex while > calling ib_uverbs_cleanup_ucontext, so it should be safe, or we have > another bug? No, ib_uverbs_cleanup_ucontext is called outside mutex from this context. the code takes the reference of the file pointer from the list, then releases the mutex then calls ib_uverbs_cleanup_ucontext. After the call is done, mutext is acquired again. > > Certainly, the above is closer to the original intent of how this was > supposed to work... > > Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/8/2016 12:54 PM, Devesh Sharma wrote: >> What is wrong with simply this: >> >> --- a/drivers/infiniband/core/uverbs_main.c >> +++ b/drivers/infiniband/core/uverbs_main.c >> @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp) >> list_del(&file->list); >> file->is_closed = 1; >> } >> - mutex_unlock(&file->device->lists_mutex); >> if (ucontext) >> ib_uverbs_cleanup_ucontext(file, ucontext); >> + mutex_unlock(&file->device->lists_mutex); >> >> >> ?? > > There is following comment about list_mutex in uverbs_main.c around > line number 1200: > /* We must release the mutex before going ahead and calling > * disassociate_ucontext. disassociate_ucontext might end up > * indirectly calling uverbs_close, for example due to freeing > * the resources (e.g mmput). > */ > Correct. In addition it's very *bad/incorrect* to call ib_uverbs_cleanup_ucontext under the list mutex as it will prevent parallel cleanups of different contexts, this may end up with softlock ups as cleanup may involve FW commands and may take time. The correct solution is as I described in my comments to V2, need V3 with the fixes. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 08, 2016 at 04:24:51PM +0530, Devesh Sharma wrote: > > +++ b/drivers/infiniband/core/uverbs_main.c > > @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp) > > list_del(&file->list); > > file->is_closed = 1; > > } > > - mutex_unlock(&file->device->lists_mutex); > > if (ucontext) > > ib_uverbs_cleanup_ucontext(file, ucontext); > > + mutex_unlock(&file->device->lists_mutex); > > > > > > ?? > > There is following comment about list_mutex in uverbs_main.c around > line number 1200: > /* We must release the mutex before going ahead and calling > * disassociate_ucontext. disassociate_ucontext might end up > * indirectly calling uverbs_close, for example due to freeing > * the resources (e.g mmput). > */ Okay, now I remember this discussion, and being unhappy about this during review. However, this comment is talking about disassociate_ucontext, the bug is with ib_uverbs_cleanup_ucontext. We can't re-entre uverbs_close while we are already in uverbs_close, so that doesn't explain why it cannot be in the mutex. So, Yishai, what is the problem with the above lock placement? The only issue you raised was with multi-file close concurrency, and that is trivially solved with another mutex. I'd rather see another mutex added then this ugly add-hoc srcu/completion thing. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/8/2016 7:53 PM, Jason Gunthorpe wrote: > On Tue, Mar 08, 2016 at 04:24:51PM +0530, Devesh Sharma wrote: > >>> +++ b/drivers/infiniband/core/uverbs_main.c >>> @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp) >>> list_del(&file->list); >>> file->is_closed = 1; >>> } >>> - mutex_unlock(&file->device->lists_mutex); >>> if (ucontext) >>> ib_uverbs_cleanup_ucontext(file, ucontext); >>> + mutex_unlock(&file->device->lists_mutex); >>> >>> >>> ?? >> >> There is following comment about list_mutex in uverbs_main.c around >> line number 1200: >> /* We must release the mutex before going ahead and calling >> * disassociate_ucontext. disassociate_ucontext might end up >> * indirectly calling uverbs_close, for example due to freeing >> * the resources (e.g mmput). >> */ > > Okay, now I remember this discussion, and being unhappy about this > during review. > > However, this comment is talking about disassociate_ucontext, the bug > is with ib_uverbs_cleanup_ucontext. We can't re-entre uverbs_close > while we are already in uverbs_close, so that doesn't explain why it > cannot be in the mutex. > > So, Yishai, what is the problem with the above lock placement? > > The only issue you raised was with multi-file close concurrency, and > that is trivially solved with another mutex. > > I'd rather see another mutex added then this ugly add-hoc > srcu/completion thing. The srcu with NULL checking by itself can prevent the race, no need for the "completion" mechanism. ib_uverbs_free_hw_resources uses synchronize_srcu just after that ib_dev was set to NULL as part of ib_uverbs_remove_one. The reason for the extra "completion" that I suggested comes to make sure that when an application returns from its close API the underlying resources were really freed, this is open in current code even if the race *wasn't* hit. As we need to enable parallel closing it seems to be the preferred way to go. Devesh, can you send V3 with above suggestion to help people reviewing it ? if you have some other solution with mutex that addressed above points please come it to the list for a review. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 9, 2016 at 10:18 PM, Yishai Hadas <yishaih@dev.mellanox.co.il> wrote: > On 3/8/2016 7:53 PM, Jason Gunthorpe wrote: >> >> On Tue, Mar 08, 2016 at 04:24:51PM +0530, Devesh Sharma wrote: >> >>>> +++ b/drivers/infiniband/core/uverbs_main.c >>>> @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, >>>> struct file *filp) >>>> list_del(&file->list); >>>> file->is_closed = 1; >>>> } >>>> - mutex_unlock(&file->device->lists_mutex); >>>> if (ucontext) >>>> ib_uverbs_cleanup_ucontext(file, ucontext); >>>> + mutex_unlock(&file->device->lists_mutex); >>>> >>>> >>>> ?? >>> >>> >>> There is following comment about list_mutex in uverbs_main.c around >>> line number 1200: >>> /* We must release the mutex before going ahead and calling >>> * disassociate_ucontext. disassociate_ucontext might end up >>> * indirectly calling uverbs_close, for example due to freeing >>> * the resources (e.g mmput). >>> */ >> >> >> Okay, now I remember this discussion, and being unhappy about this >> during review. >> >> However, this comment is talking about disassociate_ucontext, the bug >> is with ib_uverbs_cleanup_ucontext. We can't re-entre uverbs_close >> while we are already in uverbs_close, so that doesn't explain why it >> cannot be in the mutex. >> >> So, Yishai, what is the problem with the above lock placement? >> >> The only issue you raised was with multi-file close concurrency, and >> that is trivially solved with another mutex. >> >> I'd rather see another mutex added then this ugly add-hoc >> srcu/completion thing. > > > The srcu with NULL checking by itself can prevent the race, no need for the > "completion" mechanism. ib_uverbs_free_hw_resources uses synchronize_srcu > just after that ib_dev was set to NULL as part of ib_uverbs_remove_one. > > The reason for the extra "completion" that I suggested comes to make sure > that when an application returns from its close API the underlying resources > were really freed, this is open in current code even if the race *wasn't* > hit. > > As we need to enable parallel closing it seems to be the preferred way to > go. > > Devesh, can you send V3 with above suggestion to help people reviewing it ? > if you have some other solution with mutex that addressed above points > please come it to the list for a review. Sure, I should be able to post it toady. > > > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp) list_del(&file->list); file->is_closed = 1; } - mutex_unlock(&file->device->lists_mutex); if (ucontext) ib_uverbs_cleanup_ucontext(file, ucontext); + mutex_unlock(&file->device->lists_mutex); ??