Message ID: 20201209164542.61387-3-jinpu.wang@cloud.ionos.com
State:      Superseded
Series:     Misc update for rtrs
On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
>
> If there are many establishments/teardowns, we need to make sure
> we do not consume too much system memory. Thus let on going
> session closing to finish before accepting new connection.
>
> Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

Please ignore this one, it could lead to a deadlock: cma_ib_req_handler
is holding mutex_lock(&listen_id->handler_mutex) when calling into
rtrs_rdma_connect, and from there we call close_work, which will call
rdma_destroy_id, which could try to take the same handler_mutex, so we
deadlock.

Sorry & thanks!
Jack

> ---
>  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> index ed4628f032bb..0a2202c28b54 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> @@ -1791,6 +1791,10 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
>  		err = -ENOMEM;
>  		goto reject_w_err;
>  	}
> +	if (!cid) {
> +		/* Let inflight session teardown complete */
> +		flush_workqueue(rtrs_wq);
> +	}
>  	mutex_lock(&srv->paths_mutex);
>  	sess = __find_sess(srv, &msg->sess_uuid);
>  	if (sess) {
> --
> 2.25.1
>
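For readers following the thread, the suspected cycle can be sketched as
below. This is only a reading of the concern raised here (the same chain
appears in the lockdep report posted further down), not a verified trace:

	/*
	 * Suspected cycle (sketch of the concern above):
	 *
	 *   CM request worker                        rtrs_server_wq worker
	 *   -----------------                        ---------------------
	 *   cma_ib_req_handler()                     rtrs_srv_close_work()
	 *     mutex_lock(&listen_id->handler_mutex)    rdma_destroy_id(...)
	 *     rtrs_rdma_connect()                        mutex_lock(&id_priv->handler_mutex)
	 *       flush_workqueue(rtrs_wq)
	 *         waits for rtrs_srv_close_work(), which may in turn be
	 *         blocked on a handler_mutex held on the CM request side
	 */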
On 12/10/20 15:56, Jinpu Wang wrote:
> On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
>> [...]
> Please ignore this one, it could lead to a deadlock: cma_ib_req_handler
> is holding mutex_lock(&listen_id->handler_mutex) when calling into
> rtrs_rdma_connect, and from there we call close_work, which will call
> rdma_destroy_id, which could try to take the same handler_mutex, so we
> deadlock.

I am wondering if nvmet-rdma has a similar issue or not; if so, maybe
introduce a locked version of rdma_destroy_id.

Thanks,
Guoqing
On Fri, Dec 11, 2020 at 3:35 AM Guoqing Jiang <guoqing.jiang@cloud.ionos.com> wrote:
>
> On 12/10/20 15:56, Jinpu Wang wrote:
> > Please ignore this one, it could lead to a deadlock: cma_ib_req_handler
> > is holding mutex_lock(&listen_id->handler_mutex) when calling into
> > rtrs_rdma_connect, and from there we call close_work, which will call
> > rdma_destroy_id, which could try to take the same handler_mutex, so we
> > deadlock.
>
> I am wondering if nvmet-rdma has a similar issue or not; if so, maybe
> introduce a locked version of rdma_destroy_id.
>
> Thanks,
> Guoqing

No, I was wrong. I rechecked the code, and it is not a valid deadlock: in
cma_ib_req_handler, the conn_id is newly created in
https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.

flush_workqueue will only flush close_work for any other cm_id, but not
for the newly created conn_id, which has not been associated with
anything yet.

The same applies to nvme-rdma, so it's a false alarm by lockdep.

Regards!
Jack
On Fri, Dec 11, 2020 at 07:50:13AM +0100, Jinpu Wang wrote:
> No, I was wrong. I rechecked the code, and it is not a valid deadlock: in
> cma_ib_req_handler, the conn_id is newly created in
> https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.
>
> flush_workqueue will only flush close_work for any other cm_id, but not
> for the newly created conn_id, which has not been associated with
> anything yet.
>
> The same applies to nvme-rdma, so it's a false alarm by lockdep.

Leaving this without a fix (proper lock annotation) is not the right
thing to do, because everyone who runs the rtrs code with LOCKDEP on will
see the same "false alarm".

So I recommend either not taking this patch, or writing it in a way that
does not trigger the LOCKDEP warning.

Thanks

> Regards!
> Jack
On Fri, Dec 11, 2020 at 8:26 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> Leaving this without a fix (proper lock annotation) is not the right
> thing to do, because everyone who runs the rtrs code with LOCKDEP on will
> see the same "false alarm".
>
> So I recommend either not taking this patch, or writing it in a way that
> does not trigger the LOCKDEP warning.

Hi Leon,

I'm thinking about the same. Do you have a suggestion on how to teach
LOCKDEP that this is not really a deadlock? I do not know LOCKDEP well.

Thanks
On Fri, Dec 11, 2020 at 8:53 AM Jinpu Wang <jinpu.wang@cloud.ionos.com> wrote:
>
> On Fri, Dec 11, 2020 at 8:26 AM Leon Romanovsky <leon@kernel.org> wrote:
> > So I recommend either not taking this patch, or writing it in a way that
> > does not trigger the LOCKDEP warning.
>
> Hi Leon,
>
> I'm thinking about the same. Do you have a suggestion on how to teach
> LOCKDEP that this is not really a deadlock? I do not know LOCKDEP well.

Found it myself, we can use lockdep_off:

https://elixir.bootlin.com/linux/latest/source/drivers/virtio/virtio_mem.c#L699

Thanks
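For reference, the virtio_mem code linked above simply brackets a single
call with lockdep_off()/lockdep_on(). Applied to the hunk in this patch it
would look roughly like the sketch below; this is only an illustration of
the suggestion, not something that was posted or tested, and as the
follow-ups point out it would hide real cycles as well as false ones:

	if (!cid) {
		/* Let inflight session teardown complete */
		lockdep_off();
		flush_workqueue(rtrs_wq);
		lockdep_on();
	}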
On Fri, Dec 11, 2020 at 08:58:09AM +0100, Jinpu Wang wrote:
> > > > No, I was wrong. I rechecked the code, and it is not a valid deadlock: in
> > > > cma_ib_req_handler, the conn_id is newly created in
> > > > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.
> > > >
> > > > The same applies to nvme-rdma, so it's a false alarm by lockdep.
>
> Found it myself, we can use lockdep_off:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/virtio/virtio_mem.c#L699

Gah, that is horrible. I'm not completely sure it is false either: at this
point two handler_mutex locks are held (the listening ID's and the new
ID's), and there may be locking cycles that end up back on the listening
ID. Certainly it is not so simple.

Jason
On Fri, Dec 11, 2020 at 05:17:36PM +0100, Jack Wang wrote:
> En, the lockdep was complaining about the new conn_id, I will
> post the full log if needed next week. let’s skip this patch for
> now, will double check!

That is even more worrisome, as the new conn_id already has a different
lock class.

Jason
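As background for the "different lock class" remark (an illustration, not
something from the thread): the handler_mutex of a freshly created conn_id
is taken with a nested subclass in the rdma_cm request path, which is
consistent with the &id_priv->handler_mutex/1 entry seen in the lockdep
chain posted below. A nested-subclass acquisition looks roughly like this:

	/* Illustration only: nested-subclass locking as lockdep sees it. */
	mutex_lock(&listen_id->handler_mutex);	/* class &id_priv->handler_mutex   */
	mutex_lock_nested(&conn_id->handler_mutex,
			  SINGLE_DEPTH_NESTING);	/* class &id_priv->handler_mutex/1 */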
On Fri, Dec 11, 2020 at 08:58:09AM +0100, Jinpu Wang wrote:
> > > > The same applies to nvme-rdma, so it's a false alarm by lockdep.
> > >
> > > Leaving this without a fix (proper lock annotation) is not the right
> > > thing to do, because everyone who runs the rtrs code with LOCKDEP on
> > > will see the same "false alarm".
> > >
> > > So I recommend either not taking this patch, or writing it in a way
> > > that does not trigger the LOCKDEP warning.
> >
> > I'm thinking about the same. Do you have a suggestion on how to teach
> > LOCKDEP that this is not really a deadlock? I do not know LOCKDEP well.
>
> Found it myself, we can use lockdep_off:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/virtio/virtio_mem.c#L699

My personal experience from internal/external reviews shows that claims
about false alarms in LOCKDEP warnings are almost always wrong.

Thanks
On Fri, Dec 11, 2020 at 5:29 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Fri, Dec 11, 2020 at 05:17:36PM +0100, Jack Wang wrote:
> > En, the lockdep was complaining about the new conn_id, I will
> > post the full log if needed next week. let’s skip this patch for
> > now, will double check!
>
> That is even more worrisome, as the new conn_id already has a different
> lock class.
>
> Jason

This is the dmesg of the LOCKDEP warning; it's on kernel 5.4.77, but the
latest 5.10 behaves the same.

[ 500.071552] ======================================================
[ 500.071648] WARNING: possible circular locking dependency detected
[ 500.071869] 5.4.77-storage+ #35 Tainted: G           O
[ 500.071959] ------------------------------------------------------
[ 500.072054] kworker/1:1/28 is trying to acquire lock:
[ 500.072200] ffff99653a624390 (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x55/0x230 [rdma_cm]
[ 500.072837] but task is already holding lock:
[ 500.072938] ffff9d18800f7e80 ((work_completion)(&sess->close_work)){+.+.}, at: process_one_work+0x223/0x600
[ 500.075642] which lock already depends on the new lock.
[ 500.075759] the existing dependency chain (in reverse order) is:
[ 500.075880] -> #3 ((work_completion)(&sess->close_work)){+.+.}:
[ 500.076062]        process_one_work+0x278/0x600
[ 500.076154]        worker_thread+0x2d/0x3d0
[ 500.076225]        kthread+0x111/0x130
[ 500.076290]        ret_from_fork+0x24/0x30
[ 500.076370] -> #2 ((wq_completion)rtrs_server_wq){+.+.}:
[ 500.076482]        flush_workqueue+0xab/0x4b0
[ 500.076565]        rtrs_srv_rdma_cm_handler+0x71d/0x1500 [rtrs_server]
[ 500.076674]        cma_ib_req_handler+0x8c4/0x14f0 [rdma_cm]
[ 500.076770]        cm_process_work+0x22/0x140 [ib_cm]
[ 500.076857]        cm_req_handler+0x900/0xde0 [ib_cm]
[ 500.076944]        cm_work_handler+0x136/0x1af2 [ib_cm]
[ 500.077025]        process_one_work+0x29f/0x600
[ 500.077097]        worker_thread+0x2d/0x3d0
[ 500.077164]        kthread+0x111/0x130
[ 500.077224]        ret_from_fork+0x24/0x30
[ 500.077294] -> #1 (&id_priv->handler_mutex/1){+.+.}:
[ 500.077409]        __mutex_lock+0x7e/0x950
[ 500.077488]        cma_ib_req_handler+0x787/0x14f0 [rdma_cm]
[ 500.077582]        cm_process_work+0x22/0x140 [ib_cm]
[ 500.077669]        cm_req_handler+0x900/0xde0 [ib_cm]
[ 500.077755]        cm_work_handler+0x136/0x1af2 [ib_cm]
[ 500.077835]        process_one_work+0x29f/0x600
[ 500.077907]        worker_thread+0x2d/0x3d0
[ 500.077973]        kthread+0x111/0x130
[ 500.078034]        ret_from_fork+0x24/0x30
[ 500.078095] -> #0 (&id_priv->handler_mutex){+.+.}:
[ 500.078196]        __lock_acquire+0x1166/0x1440
[ 500.078267]        lock_acquire+0x90/0x170
[ 500.078335]        __mutex_lock+0x7e/0x950
[ 500.078410]        rdma_destroy_id+0x55/0x230 [rdma_cm]
[ 500.078498]        rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
[ 500.078586]        process_one_work+0x29f/0x600
[ 500.078662]        worker_thread+0x2d/0x3d0
[ 500.078732]        kthread+0x111/0x130
[ 500.078793]        ret_from_fork+0x24/0x30
[ 500.078859] other info that might help us debug this:
[ 500.078984] Chain exists of:
                &id_priv->handler_mutex --> (wq_completion)rtrs_server_wq --> (work_completion)(&sess->close_work)
[ 500.079207] Possible unsafe locking scenario:
[ 500.079293]        CPU0                    CPU1
[ 500.079358]        ----                    ----
[ 500.079358]   lock((work_completion)(&sess->close_work));
[ 500.079358]                                lock((wq_completion)rtrs_server_wq);
[ 500.079358]                                lock((work_completion)(&sess->close_work));
[ 500.079358]   lock(&id_priv->handler_mutex);
[ 500.079358]  *** DEADLOCK ***
[ 500.079358] 2 locks held by kworker/1:1/28:
[ 500.079358]  #0: ffff99652d281f28 ((wq_completion)rtrs_server_wq){+.+.}, at: process_one_work+0x223/0x600
[ 500.079358]  #1: ffff9d18800f7e80 ((work_completion)(&sess->close_work)){+.+.}, at: process_one_work+0x223/0x600
[ 500.079358] stack backtrace:
[ 500.079358] CPU: 1 PID: 28 Comm: kworker/1:1 Tainted: G           O      5.4.77-storage+ #35
[ 500.079358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 500.079358] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
[ 500.079358] Call Trace:
[ 500.079358]  dump_stack+0x71/0x9b
[ 500.079358]  check_noncircular+0x17d/0x1a0
[ 500.079358]  ? __lock_acquire+0x1166/0x1440
[ 500.079358]  __lock_acquire+0x1166/0x1440
[ 500.079358]  lock_acquire+0x90/0x170
[ 500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
[ 500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
[ 500.079358]  __mutex_lock+0x7e/0x950
[ 500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
[ 500.079358]  ? find_held_lock+0x2d/0x90
[ 500.079358]  ? mark_held_locks+0x49/0x70
[ 500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
[ 500.079358]  rdma_destroy_id+0x55/0x230 [rdma_cm]
[ 500.079358]  rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
[ 500.079358]  process_one_work+0x29f/0x600
[ 500.079358]  worker_thread+0x2d/0x3d0
[ 500.079358]  ? process_one_work+0x600/0x600
[ 500.079358]  kthread+0x111/0x130
[ 500.079358]  ? kthread_park+0x90/0x90
[ 500.079358]  ret_from_fork+0x24/0x30

According to my understanding, in cma_ib_req_handler the conn_id is newly
created in
https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222,
and the rdma_cm_id associated with conn_id is passed to
rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.

In rtrs_rdma_connect, the flush_workqueue will only flush close_work for
any other cm_id, but not for the newly created conn_id, which has not been
associated with anything yet.

The same applies to nvme-rdma, so it's a false alarm by lockdep.

Regards!
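A compact way to restate the argument above is to annotate the proposed
hunk itself. This is only a sketch of the reasoning in this thread, not
code that was posted:

	if (!cid) {
		/*
		 * First connection of a new session: the cm_id passed into
		 * rtrs_rdma_connect() is the conn_id that cma_ib_req_handler()
		 * just created for this request, and no rtrs session
		 * references it yet.  Any close_work flushed here can
		 * therefore only call rdma_destroy_id() on other, older
		 * cm_ids.
		 */
		flush_workqueue(rtrs_wq);
	}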
On Wed, Dec 16, 2020 at 05:42:17PM +0100, Jinpu Wang wrote:
> This is the dmesg of the LOCKDEP warning; it's on kernel 5.4.77, but the
> latest 5.10 behaves the same.
>
> [...]
>
> According to my understanding, in cma_ib_req_handler the conn_id is newly
> created in
> https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222,
> and the rdma_cm_id associated with conn_id is passed to
> rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
>
> In rtrs_rdma_connect, the flush_workqueue will only flush close_work for
> any other cm_id, but not for the newly created conn_id, which has not been
> associated with anything yet.

How did you come to the conclusion that the rtrs handler was called before
cma_cm_event_handler()? I'm not so sure about that, and it would explain
the lockdep.

Thanks

> The same applies to nvme-rdma, so it's a false alarm by lockdep.
>
> Regards!
On Sun, Dec 27, 2020 at 10:01 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Wed, Dec 16, 2020 at 05:42:17PM +0100, Jinpu Wang wrote:
> > [...]
> >
> > According to my understanding, in cma_ib_req_handler the conn_id is newly
> > created in
> > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222,
> > and the rdma_cm_id associated with conn_id is passed to
> > rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
> >
> > In rtrs_rdma_connect, the flush_workqueue will only flush close_work for
> > any other cm_id, but not for the newly created conn_id, which has not been
> > associated with anything yet.
>
> How did you come to the conclusion that the rtrs handler was called before
> cma_cm_event_handler()? I'm not so sure about that, and it would explain
> the lockdep.
>
> Thanks

Hi Leon,

I never said that. The call chain here is:

  cma_ib_req_handler -> cma_cm_event_handler -> rtrs_srv_rdma_cm_handler -> rtrs_rdma_connect

To repeat myself from the last email: in cma_ib_req_handler, the conn_id
is newly created in
https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222,
and the rdma_cm_id associated with conn_id is passed to rtrs_rdma_connect.

In rtrs_rdma_connect, the flush_workqueue will only flush close_work for
any other cm_id, but not for the newly created conn_id; the rdma_cm_id
passed to rtrs_rdma_connect has not been associated with anything yet.

Hope this is now clear.

Happy New Year!
On Mon, Jan 04, 2021 at 09:06:13AM +0100, Jinpu Wang wrote:
> I never said that. The call chain here is:
>
>   cma_ib_req_handler -> cma_cm_event_handler -> rtrs_srv_rdma_cm_handler -> rtrs_rdma_connect
>
> To repeat myself from the last email: in cma_ib_req_handler, the conn_id
> is newly created in
> https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222,
> and the rdma_cm_id associated with conn_id is passed to rtrs_rdma_connect.
>
> In rtrs_rdma_connect, the flush_workqueue will only flush close_work for
> any other cm_id, but not for the newly created conn_id; the rdma_cm_id
> passed to rtrs_rdma_connect has not been associated with anything yet.

This is exactly why I'm not so sure: after the rdma_cm_id returns from
RDMA/core, it will be in that flush_workqueue queue.

> Hope this is now clear.
>
> Happy New Year!

Happy New Year too :)
On Mon, Jan 4, 2021 at 9:25 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Jan 04, 2021 at 09:06:13AM +0100, Jinpu Wang wrote:
> > In rtrs_rdma_connect, the flush_workqueue will only flush close_work for
> > any other cm_id, but not for the newly created conn_id; the rdma_cm_id
> > passed to rtrs_rdma_connect has not been associated with anything yet.
>
> This is exactly why I'm not so sure: after the rdma_cm_id returns from
> RDMA/core, it will be in that flush_workqueue queue.

In rtrs_rdma_connect, we do the flush_workqueue(rtrs_wq) at the very
beginning, before we associate the rdma_cm_id (conn_id) with an
rtrs_srv_con via rtrs_srv_rdma_cm_handler -> rtrs_rdma_connect ->
create_con -> con->c.cm_id = cm_id. And in rtrs_srv_close_work we do
rdma_destroy_id(con->c.cm_id), so the new rdma_cm_id is not reachable
from anything in the flush_workqueue queue yet.

Thanks!

> > Hope this is now clear.
> >
> > Happy New Year!
>
> Happy New Year too :)
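To illustrate the ordering being described, here is a rough sketch with
error handling and unrelated details dropped; the function bodies are
simplified from the rtrs server code under discussion and are not the
literal source:

	static int rtrs_rdma_connect(struct rdma_cm_id *cm_id, ...)
	{
		...
		if (!cid) {
			/*
			 * Runs before create_con() below, i.e. before this
			 * cm_id is stored anywhere a close_work could find
			 * it, so the flush only waits on teardown of older
			 * sessions.
			 */
			flush_workqueue(rtrs_wq);
		}
		...
		err = create_con(sess, cm_id, cid);	/* con->c.cm_id = cm_id happens in here */
		...
	}

	static void rtrs_srv_close_work(struct work_struct *work)
	{
		...
		rdma_destroy_id(con->c.cm_id);	/* only ever a cm_id already linked to a con */
		...
	}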
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index ed4628f032bb..0a2202c28b54 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1791,6 +1791,10 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
 		err = -ENOMEM;
 		goto reject_w_err;
 	}
+	if (!cid) {
+		/* Let inflight session teardown complete */
+		flush_workqueue(rtrs_wq);
+	}
 	mutex_lock(&srv->paths_mutex);
 	sess = __find_sess(srv, &msg->sess_uuid);
 	if (sess) {