Message ID | 20230523155408.48594-1-mlombard@redhat.com |
---|---|
State | Rejected |
Series | Revert "IB/core: Fix use workqueue without WQ_MEM_RECLAIM" |
On Tue, May 23, 2023 at 05:54:08PM +0200, Maurizio Lombardi wrote:
> when nvme_rdma_reconnect_ctrl_work() fails, it flushes
> the ib_addr:process_one_req() work but the latter is enqueued
> on addr_wq which has been marked as "!WQ_MEM_RECLAIM".
>
> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
> [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]

And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
needed.

Thanks
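For readers following along, here is a minimal sketch of the pattern the splat above describes, with assumed module and function names (not the real nvme_rdma or ib_core code): a work item running on a WQ_MEM_RECLAIM workqueue flushes a work item queued on a workqueue created without that flag, which is exactly what check_flush_dependency() in kernel/workqueue.c warns about.

/*
 * Minimal sketch, assumed names only -- not the real nvme_rdma/ib_core
 * code. A work item running on a WQ_MEM_RECLAIM queue flushes a work
 * item that lives on a queue created without the flag, so
 * check_flush_dependency() prints the warning quoted above.
 */
#include <linux/delay.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *reclaim_wq;	/* stands in for nvme-wq */
static struct workqueue_struct *plain_wq;	/* stands in for ib_addr's addr_wq */

static void plain_fn(struct work_struct *w)
{
	/* Stay busy for a while so the flush below really has to wait. */
	msleep(100);
}
static DECLARE_WORK(plain_work, plain_fn);

static void reclaim_fn(struct work_struct *w)
{
	/* Flushing across the reclaim boundary triggers the warning. */
	flush_work(&plain_work);
}
static DECLARE_WORK(reclaim_work, reclaim_fn);

static int __init wq_flush_demo_init(void)
{
	reclaim_wq = alloc_workqueue("demo_reclaim", WQ_MEM_RECLAIM, 0);
	plain_wq = alloc_ordered_workqueue("demo_plain", 0);
	if (!reclaim_wq || !plain_wq)
		goto err;

	queue_work(plain_wq, &plain_work);
	queue_work(reclaim_wq, &reclaim_work);	/* splat fires in reclaim_fn() */
	return 0;

err:
	if (reclaim_wq)
		destroy_workqueue(reclaim_wq);
	if (plain_wq)
		destroy_workqueue(plain_wq);
	return -ENOMEM;
}

static void __exit wq_flush_demo_exit(void)
{
	destroy_workqueue(reclaim_wq);
	destroy_workqueue(plain_wq);
}

module_init(wq_flush_demo_init);
module_exit(wq_flush_demo_exit);
MODULE_LICENSE("GPL");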
On Tue, 23 May 2023 at 20:28, Leon Romanovsky <leon@kernel.org> wrote:
>
> > workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
> > [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
>
> And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
> needed.

Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.

Maurizio
>>>> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
>>>> [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
>>>
>>> And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
>>> needed.
>>
>> Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.
>
> We already allocate so much memory on these paths it is pretty
> nonsense to claim they are a reclaim context. One allocation on the WQ
> is not going to be the problem.
>
> Probably this nvme stuff should not be re-using a reclaim marked WQ
> for memory allocating work like this, it is kind of nonsensical.

A controller reset runs on this workqueue, which should succeed to allow
for pages to be flushed to the nvme disk. So I'd say it's kind of
essential that this sequence has a rescuer thread.
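The rescuer guarantee Sagi refers to comes from WQ_MEM_RECLAIM: such a workqueue keeps a dedicated rescuer thread, so its work items can still make forward progress when the system is too short on memory to create new kworkers. A rough sketch of allocating such a queue, loosely modelled on how nvme core sets up nvme-wq (the name and exact flags below are illustrative, not copied from the driver):

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/workqueue.h>

static struct workqueue_struct *reset_wq;	/* illustrative name */

static int __init reset_wq_init(void)
{
	/*
	 * WQ_MEM_RECLAIM reserves a rescuer thread, so a controller
	 * reset queued here can run even while memory reclaim is
	 * waiting for dirty pages to be written back to that same
	 * device.
	 */
	reset_wq = alloc_workqueue("demo-reset-wq",
				   WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
	if (!reset_wq)
		return -ENOMEM;
	return 0;
}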
On Tue, Jun 06, 2023 at 02:01:04AM +0300, Sagi Grimberg wrote:
> > > > > workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
> > > > > [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
> > > >
> > > > And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
> > > > needed.
> > >
> > > Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.
> >
> > We already allocate so much memory on these paths it is pretty
> > nonsense to claim they are a reclaim context. One allocation on the WQ
> > is not going to be the problem.
> >
> > Probably this nvme stuff should not be re-using a reclaim marked WQ
> > for memory allocating work like this, it is kind of nonsensical.
>
> A controller reset runs on this workqueue, which should succeed to allow
> for pages to be flushed to the nvme disk. So I'd say it's kind of
> essential that this sequence has a rescuer thread.

So don't run the CM stuff on the same WQ, go to another one without
the reclaim mark?

Jason
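One possible shape of Jason's suggestion, purely hypothetical (the workqueue name and the split below are not from nvme_rdma): keep reset/teardown work on the WQ_MEM_RECLAIM nvme-wq, but run the reconnect path, which calls into the RDMA CM and ends up flushing ib_core's addr_wq via rdma_addr_cancel(), on a separate queue without the reclaim mark.

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/workqueue.h>

/* Hypothetical second queue for connect/reconnect work only. */
static struct workqueue_struct *rdma_connect_wq;

static int __init rdma_connect_wq_init(void)
{
	/*
	 * No WQ_MEM_RECLAIM here: work running on this queue may flush
	 * ib_addr's addr_wq (through rdma_addr_cancel(), as in the
	 * call trace) without tripping check_flush_dependency().
	 */
	rdma_connect_wq = alloc_workqueue("demo-rdma-connect-wq",
					  WQ_UNBOUND, 0);
	if (!rdma_connect_wq)
		return -ENOMEM;
	return 0;
}

/*
 * The reconnect work would then be queued with something like
 *   queue_delayed_work(rdma_connect_wq, &ctrl->reconnect_work, delay);
 * shown here only to illustrate the split, not as the driver's code.
 */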
>>>>>> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
>>>>>> [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
>>>>>
>>>>> And why does nvme-wq need WQ_MEM_RECLAIM flag? I wonder if it is really
>>>>> needed.
>>>>
>>>> Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.
>>>
>>> We already allocate so much memory on these paths it is pretty
>>> nonsense to claim they are a reclaim context. One allocation on the WQ
>>> is not going to be the problem.
>>>
>>> Probably this nvme stuff should not be re-using a reclaim marked WQ
>>> for memory allocating work like this, it is kind of nonsensical.
>>
>> A controller reset runs on this workqueue, which should succeed to allow
>> for pages to be flushed to the nvme disk. So I'd say it's kind of
>> essential that this sequence has a rescuer thread.
>
> So don't run the CM stuff on the same WQ, go to another one without
> the reclaim mark?

That is not trivial. Teardown works won't need a rescuer, but they will
need to fence async work elements that are part of the bringup that do
need a rescuer (for example a scan work). So it requires more thought on
how it would be possible to untangle any dependency between work elements
that make use of a rescuer and others that don't, and vice-versa.
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index f253295795f0..5c36d01ebf0b 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -872,7 +872,7 @@ static struct notifier_block nb = {
 int addr_init(void)
 {
-	addr_wq = alloc_ordered_workqueue("ib_addr", 0);
+	addr_wq = alloc_ordered_workqueue("ib_addr", WQ_MEM_RECLAIM);
 	if (!addr_wq)
 		return -ENOMEM;
when nvme_rdma_reconnect_ctrl_work() fails, it flushes
the ib_addr:process_one_req() work but the latter is enqueued
on addr_wq which has been marked as "!WQ_MEM_RECLAIM".

workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
[nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]

Call Trace:
 __flush_work.isra.0+0xbf/0x230
 __cancel_work_timer+0x103/0x190
 ? rdma_resolve_ip+0x257/0x2f0 [ib_core]
 ? __dev_printk+0x2d/0x69
 rdma_addr_cancel+0x8e/0xc0 [ib_core]
 _destroy_id+0x1b/0x270 [rdma_cm]
 nvme_rdma_alloc_queue.cold+0x4b/0x5c [nvme_rdma]
 nvme_rdma_configure_admin_queue+0x1e/0x2f0 [nvme_rdma]
 nvme_rdma_setup_ctrl+0x1e/0x220 [nvme_rdma]
 nvme_rdma_reconnect_ctrl_work+0x22/0x30 [nvme_rdma]

This has been introduced by commit 39baf10310e6
("IB/core: Fix use workqueue without WQ_MEM_RECLAIM")

Since nvme_rdma_reconnect_work() needs to flush process_one_req(),
we have to restore the WQ_MEM_RECLAIM flag on the addr_wq workqueue.

This reverts commit 39baf10310e6669564a485b55267fae70a4e44ae.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/infiniband/core/addr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)