Revert "IB/core: Fix use workqueue without WQ_MEM_RECLAIM"

Message ID 20230523155408.48594-1-mlombard@redhat.com (mailing list archive)
State Rejected
Series Revert "IB/core: Fix use workqueue without WQ_MEM_RECLAIM"

Commit Message

Maurizio Lombardi May 23, 2023, 3:54 p.m. UTC
When nvme_rdma_reconnect_ctrl_work() fails, it flushes
the ib_addr:process_one_req() work, but the latter is enqueued
on addr_wq, which has been marked as "!WQ_MEM_RECLAIM".

workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
[nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]

Call Trace:
__flush_work.isra.0+0xbf/0x230
__cancel_work_timer+0x103/0x190
? rdma_resolve_ip+0x257/0x2f0 [ib_core]
? __dev_printk+0x2d/0x69
rdma_addr_cancel+0x8e/0xc0 [ib_core]
_destroy_id+0x1b/0x270 [rdma_cm]
nvme_rdma_alloc_queue.cold+0x4b/0x5c [nvme_rdma]
nvme_rdma_configure_admin_queue+0x1e/0x2f0 [nvme_rdma]
nvme_rdma_setup_ctrl+0x1e/0x220 [nvme_rdma]
nvme_rdma_reconnect_ctrl_work+0x22/0x30 [nvme_rdma]

This was introduced by
commit 39baf10310e6 ("IB/core: Fix use workqueue without WQ_MEM_RECLAIM").

Since nvme_rdma_reconnect_ctrl_work() needs to flush process_one_req(), we
have to restore the WQ_MEM_RECLAIM flag on the addr_wq workqueue.
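
For context, the warning above is emitted by the workqueue core's
flush-dependency check: a work item running on a WQ_MEM_RECLAIM
workqueue may be executed by a rescuer thread under memory pressure, so
it must never wait on work queued to a workqueue that lacks
WQ_MEM_RECLAIM, because that work could itself be stuck behind a memory
allocation. A rough paraphrase of the check (condensed from
kernel/workqueue.c's check_flush_dependency(); not the exact source)
looks like this:

/*
 * Simplified paraphrase of check_flush_dependency() in
 * kernel/workqueue.c.  If the currently running work belongs to a
 * reclaim-capable workqueue and the workqueue being flushed is not
 * reclaim-capable, warn: the flushed work could be blocked on an
 * allocation and deadlock the reclaim path.
 */
static void check_flush_dependency(struct workqueue_struct *target_wq,
				   struct work_struct *target_work)
{
	struct worker *worker = current_wq_worker();

	if (target_wq->flags & WQ_MEM_RECLAIM)
		return;

	WARN_ONCE(worker && (worker->current_pwq->wq->flags & WQ_MEM_RECLAIM),
		  "workqueue: WQ_MEM_RECLAIM %s:%ps is flushing !WQ_MEM_RECLAIM %s:%ps",
		  worker->current_pwq->wq->name, worker->current_func,
		  target_wq->name, target_work->func);
}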

This reverts commit 39baf10310e6669564a485b55267fae70a4e44ae.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/infiniband/core/addr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Leon Romanovsky May 23, 2023, 6:28 p.m. UTC | #1
On Tue, May 23, 2023 at 05:54:08PM +0200, Maurizio Lombardi wrote:
> When nvme_rdma_reconnect_ctrl_work() fails, it flushes
> the ib_addr:process_one_req() work, but the latter is enqueued
> on addr_wq, which has been marked as "!WQ_MEM_RECLAIM".
> 
> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
> [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]

And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
needed.

Thanks
Maurizio Lombardi May 29, 2023, 3:12 p.m. UTC | #2
On Tue, May 23, 2023 at 8:28 PM Leon Romanovsky <leon@kernel.org> wrote:
> > workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
> > [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
>
> And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
> needed.

Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.

Maurizio
Sagi Grimberg June 5, 2023, 11:01 p.m. UTC | #3
>>>> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
>>>> [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
>>>
>>> And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
>>> needed.
>>
>> Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.
> 
> We already allocate so much memory on these paths it is pretty
> nonsense to claim they are a reclaim context. One allocation on the WQ
> is not going to be the problem.
> 
> Probably this nvme stuff should not be re-using a reclaim-marked WQ
> for memory-allocating work like this, it is kind of nonsensical.

A controller reset runs on this workqueue, which should succeed to allow
for pages to be flushed to the nvme disk. So I'd say it's kind of
essential that this sequence has a rescuer thread.
Jason Gunthorpe June 6, 2023, 5:45 p.m. UTC | #4
On Tue, Jun 06, 2023 at 02:01:04AM +0300, Sagi Grimberg wrote:
> 
> > > > > workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
> > > > > [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
> > > > 
> > > > And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
> > > > needed.
> > > 
> > > Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.
> > 
> > We already allocate so much memory on these paths it is pretty
> > nonsense to claim they are a reclaim context. One allocation on the WQ
> > is not going to be the problem.
> > 
> > Probably this nvme stuff should not be re-using a reclaim-marked WQ
> > for memory-allocating work like this, it is kind of nonsensical.
> 
> A controller reset runs on this workqueue, which should succeed to allow
> for pages to be flushed to the nvme disk. So I'd say it's kind of
> essential that this sequence has a rescuer thread.

So don't run the CM stuff on the same WQ, go to another one without
the reclaim mark?

Jason
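
A hedged sketch of what that suggestion could look like, with invented
names (illustrative only, not an actual patch): nvme-rdma would queue
its connection-management work on a separate workqueue without
WQ_MEM_RECLAIM, keeping the reclaim-marked nvme-wq only for work that
really has to make progress under memory pressure.

#include <linux/errno.h>
#include <linux/workqueue.h>

/*
 * Illustrative only: "nvme_rdma_cm_wq" and "nvme_rdma_cm_wq_init" are
 * invented names for this sketch.  Connection management allocates
 * memory anyway, so it gets a plain (non-reclaim) workqueue; flushing
 * ib_addr's !WQ_MEM_RECLAIM queue from here would no longer trip
 * check_flush_dependency().
 */
static struct workqueue_struct *nvme_rdma_cm_wq;

static int nvme_rdma_cm_wq_init(void)
{
	nvme_rdma_cm_wq = alloc_workqueue("nvme_rdma_cm_wq", WQ_UNBOUND, 0);
	if (!nvme_rdma_cm_wq)
		return -ENOMEM;
	return 0;
}

/* Reconnect work would then be queued here instead of on nvme-wq, e.g.: */
/* queue_delayed_work(nvme_rdma_cm_wq, &ctrl->reconnect_work, delay); */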
Sagi Grimberg June 7, 2023, 2:59 p.m. UTC | #5
>>>>>> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
>>>>>> [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
>>>>>
>>>>> And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
>>>>> needed.
>>>>
>>>> Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.
>>>
>>> We already allocate so much memory on these paths it is pretty
>>> nonsense to claim they are a reclaim context. One allocation on the WQ
>>> is not going to be the problem.
>>>
>>> Probably this nvme stuff should not be re-using a reclaim-marked WQ
>>> for memory-allocating work like this, it is kind of nonsensical.
>>
>> A controller reset runs on this workqueue, which should succeed to allow
>> for pages to be flushed to the nvme disk. So I'd say it's kind of
>> essential that this sequence has a rescuer thread.
> 
> So don't run the CM stuff on the same WQ, go to another one without
> the reclaim mark?

That is not trivial. Teardown works won't need a rescuer, but
they will need to fence async work elements that are part
of the bringup and do need a rescuer (for example, the scan work).

So it requires more thought on how to untangle
any dependency between work elements that make use of a rescuer
and others that don't, and vice versa.
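
To make the entanglement concrete, here is a hedged, simplified sketch
of the pattern Sagi describes, loosely following the nvme core's
teardown path (nvme_stop_ctrl() flushes bring-up work such as the
namespace scan); the snippet is condensed and not an exact copy of the
driver code:

/*
 * Simplified illustration: even if teardown work were moved off the
 * reclaim-marked queue, it still has to fence asynchronous bring-up
 * work, and that bring-up work (e.g. the namespace scan) allocates
 * memory and therefore benefits from a rescuer itself.
 */
static void example_teardown(struct nvme_ctrl *ctrl)
{
	flush_work(&ctrl->async_event_work);
	flush_work(&ctrl->scan_work);	/* scan work allocates namespaces */
	/* ... actual queue and controller teardown would follow ... */
}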

Patch

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index f253295795f0..5c36d01ebf0b 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -872,7 +872,7 @@  static struct notifier_block nb = {
 
 int addr_init(void)
 {
-	addr_wq = alloc_ordered_workqueue("ib_addr", 0);
+	addr_wq = alloc_ordered_workqueue("ib_addr", WQ_MEM_RECLAIM);
 	if (!addr_wq)
 		return -ENOMEM;