diff mbox

nvmet-rdma: Fix missing dma sync to nvme data structures

Message ID 1484261109-3316-1-git-send-email-parav@mellanox.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Parav Pandit Jan. 12, 2017, 10:45 p.m. UTC
This patch performs dma sync operations on nvme_commmand,
inline page(s) and nvme_completion.

nvme_command and write cmd inline data is synced
(a) on receiving of the recv queue completion for cpu access.
(b) before posting recv wqe back to rdma adapter for device access.

nvme_completion is synced
(a) on receiving send completion for nvme_completion for cpu access.
(b) before posting send wqe to rdma adapter for device access.

Pushing this patch through linux-rdma tree as its more relavant with
Bart's changes for dma_map_ops of[1].

[1] https://patchwork.kernel.org/patch/9514085/

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/target/rdma.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Comments

Christoph Hellwig Jan. 13, 2017, 7:44 a.m. UTC | #1
On Thu, Jan 12, 2017 at 04:45:09PM -0600, Parav Pandit wrote:
> This patch performs dma sync operations on nvme_commmand,
> inline page(s) and nvme_completion.
> 
> nvme_command and write cmd inline data is synced
> (a) on receiving of the recv queue completion for cpu access.
> (b) before posting recv wqe back to rdma adapter for device access.
> 
> nvme_completion is synced
> (a) on receiving send completion for nvme_completion for cpu access.
> (b) before posting send wqe to rdma adapter for device access.
> 
> Pushing this patch through linux-rdma tree as its more relavant with
> Bart's changes for dma_map_ops of[1].
> 
> [1] https://patchwork.kernel.org/patch/9514085/

This seems like a fix for 4.10-rc as the old code is buggy on
not DMA coherent architectures, so I suspect it should go in without
being based on Barts cleanup.

Btw, what architecture did you test this with?  One of the not coherent
ARM SOCs?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Parav Pandit Jan. 13, 2017, 3:08 p.m. UTC | #2
Hi Christoph,

> -----Original Message-----
> From: Christoph Hellwig [mailto:hch@lst.de]
> Sent: Friday, January 13, 2017 1:44 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: hch@lst.de; sagi@grimberg.me; linux-nvme@lists.infradead.org; linux-
> rdma@vger.kernel.org; dledford@redhat.com
> Subject: Re: [PATCH] nvmet-rdma: Fix missing dma sync to nvme data
> structures
> 
> On Thu, Jan 12, 2017 at 04:45:09PM -0600, Parav Pandit wrote:
> > This patch performs dma sync operations on nvme_commmand, inline
> > page(s) and nvme_completion.
> >
> > nvme_command and write cmd inline data is synced
> > (a) on receiving of the recv queue completion for cpu access.
> > (b) before posting recv wqe back to rdma adapter for device access.
> >
> > nvme_completion is synced
> > (a) on receiving send completion for nvme_completion for cpu access.
> > (b) before posting send wqe to rdma adapter for device access.
> >
> > Pushing this patch through linux-rdma tree as its more relavant with
> > Bart's changes for dma_map_ops of[1].
> >
> > [1] https://patchwork.kernel.org/patch/9514085/
> 
> This seems like a fix for 4.10-rc as the old code is buggy on not DMA coherent
> architectures, so I suspect it should go in without being based on Barts
> cleanup.
> 
I first found and tested on git://git.infradead.org/nvme-fabrics.git in nvmf-4.10 branch.
If it has to go to 4.10-rc in above git repo,

Doug needs to rebase to above git tree.
Followed by Bart's patches, followed by my changes in his tree.
Without that, Linux-next will get compile error because ib_dma_xx won't be defined.
Let me know, I have 4.10-rc patch also tested.

> Btw, what architecture did you test this with?  One of the not coherent ARM
> SOCs?
On x86_64.
Currently testing on ARM.
While doing some performance improvement using inline sge mode, I encounter this bug in review.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Jan. 13, 2017, 7:35 p.m. UTC | #3
> This seems like a fix for 4.10-rc as the old code is buggy on
> not DMA coherent architectures, so I suspect it should go in without
> being based on Barts cleanup.

It does,

We should also give it a stable 4.8+ tag.

I'll queue this for nvme rc4 pull-req.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Jan. 13, 2017, 7:42 p.m. UTC | #4
> Doug needs to rebase to above git tree.
> Followed by Bart's patches, followed by my changes in his tree.
> Without that, Linux-next will get compile error because ib_dma_xx won't be defined.
> Let me know, I have 4.10-rc patch also tested.

No need, we'll jest keep the patch as is, no compile error
will generate.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Parav Pandit Jan. 13, 2017, 7:50 p.m. UTC | #5
Hi Sagi,

On Fri, Jan 13, 2017 at 1:42 PM, Sagi Grimberg <sagi@grimberg.me> wrote:
>
>> Doug needs to rebase to above git tree.
>> Followed by Bart's patches, followed by my changes in his tree.
>> Without that, Linux-next will get compile error because ib_dma_xx won't be
>> defined.
>> Let me know, I have 4.10-rc patch also tested.
>
>
> No need, we'll jest keep the patch as is, no compile error
> will generate.
> --

If we keep the patch as it is in 4.10-rc, than I think IB/rxe will
break with plays with virtual addresses.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Jan. 13, 2017, 8:01 p.m. UTC | #6
> Hi Sagi,

> If we keep the patch as it is in 4.10-rc, than I think IB/rxe will
> break with plays with virtual addresses.

I've been seeing reports that nvme-rdma is not
useable with rxe as it crashes pretty consistently.

In any event, we can either add this with 'ib_' prefix
into 4.10-rc and add a patch to Bart's series or
we can send a dedicated patch for stable.

Christoph, whats your preference?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Parav Pandit Jan. 14, 2017, 4:12 p.m. UTC | #7
Hi Sagi, Christoph,

> -----Original Message-----

> From: Sagi Grimberg [mailto:sagi@grimberg.me]

> Sent: Friday, January 13, 2017 2:01 PM

> To: Parav Pandit <pandit.parav@gmail.com>

> Cc: Parav Pandit <parav@mellanox.com>; Christoph Hellwig <hch@lst.de>;

> linux-nvme@lists.infradead.org; linux-rdma@vger.kernel.org;

> dledford@redhat.com

> Subject: Re: [PATCH] nvmet-rdma: Fix missing dma sync to nvme data

> structures

> 

> > Hi Sagi,

> 

> > If we keep the patch as it is in 4.10-rc, than I think IB/rxe will

> > break with plays with virtual addresses.

> 

> I've been seeing reports that nvme-rdma is not useable with rxe as it crashes

> pretty consistently.

> 

> In any event, we can either add this with 'ib_' prefix into 4.10-rc and add a

> patch to Bart's series or we can send a dedicated patch for stable.

> 

> Christoph, whats your preference?


I see there is some discussion with Bart's patch with respect to SDMA.
So I guess it might take a while to merge his patchset.

How about doing this?
- add this with 'ib_' prefix into 4.10-rc.
- followed by cherry pick this commit in linux-rdma tree.

When Bart rebase it, changes of this fix will automatically get converted from ib_ to dma_.
So shall I send patch for 4.10-rc with 'ib_' prefix?
Christoph Hellwig Jan. 14, 2017, 4:18 p.m. UTC | #8
On Fri, Jan 13, 2017 at 10:01:23PM +0200, Sagi Grimberg wrote:
> I've been seeing reports that nvme-rdma is not
> useable with rxe as it crashes pretty consistently.
>
> In any event, we can either add this with 'ib_' prefix
> into 4.10-rc and add a patch to Bart's series or
> we can send a dedicated patch for stable.
>
> Christoph, whats your preference?

Get the ib_dma_* version of Parave's patch in for 4.10-rc.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Jan. 14, 2017, 9 p.m. UTC | #9
> I see there is some discussion with Bart's patch with respect to SDMA.
> So I guess it might take a while to merge his patchset.
>
> How about doing this?
> - add this with 'ib_' prefix into 4.10-rc.
> - followed by cherry pick this commit in linux-rdma tree.
>
> When Bart rebase it, changes of this fix will automatically get converted from ib_ to dma_.
> So shall I send patch for 4.10-rc with 'ib_' prefix?

Sounds good,

Thanks Parav.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Jan. 14, 2017, 9:04 p.m. UTC | #10
>> I've been seeing reports that nvme-rdma is not
>> useable with rxe as it crashes pretty consistently.
>>
>> In any event, we can either add this with 'ib_' prefix
>> into 4.10-rc and add a patch to Bart's series or
>> we can send a dedicated patch for stable.
>>
>> Christoph, whats your preference?
>
> Get the ib_dma_* version of Parave's patch in for 4.10-rc.

Fine with me.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 8c3760a..c6468b3 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -438,6 +438,14 @@  static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev,
 {
 	struct ib_recv_wr *bad_wr;
 
+	dma_sync_single_for_device(ndev->device->dma_device,
+			cmd->sge[0].addr, sizeof(*cmd->nvme_cmd),
+			DMA_FROM_DEVICE);
+
+	if (cmd->sge[1].addr)
+		dma_sync_single_for_device(ndev->device->dma_device,
+				cmd->sge[1].addr, NVMET_RDMA_INLINE_DATA_SIZE,
+				DMA_FROM_DEVICE);
 	if (ndev->srq)
 		return ib_post_srq_recv(ndev->srq, &cmd->wr, &bad_wr);
 	return ib_post_recv(cmd->queue->cm_id->qp, &cmd->wr, &bad_wr);
@@ -507,6 +515,10 @@  static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
 	struct nvmet_rdma_rsp *rsp =
 		container_of(wc->wr_cqe, struct nvmet_rdma_rsp, send_cqe);
 
+	dma_sync_single_for_cpu(rsp->queue->dev->device->dma_device,
+			rsp->send_sge.addr, sizeof(*rsp->req.rsp),
+			DMA_TO_DEVICE);
+
 	nvmet_rdma_release_rsp(rsp);
 
 	if (unlikely(wc->status != IB_WC_SUCCESS &&
@@ -538,6 +550,11 @@  static void nvmet_rdma_queue_response(struct nvmet_req *req)
 		first_wr = &rsp->send_wr;
 
 	nvmet_rdma_post_recv(rsp->queue->dev, rsp->cmd);
+
+	dma_sync_single_for_device(rsp->queue->dev->device->dma_device,
+			rsp->send_sge.addr, sizeof(*rsp->req.rsp),
+			DMA_TO_DEVICE);
+
 	if (ib_post_send(cm_id->qp, first_wr, &bad_wr)) {
 		pr_err("sending cmd response failed\n");
 		nvmet_rdma_release_rsp(rsp);
@@ -698,6 +715,16 @@  static void nvmet_rdma_handle_command(struct nvmet_rdma_queue *queue,
 	cmd->n_rdma = 0;
 	cmd->req.port = queue->port;
 
+	dma_sync_single_for_cpu(queue->dev->device->dma_device,
+			cmd->cmd->sge[0].addr, sizeof(*cmd->cmd->nvme_cmd),
+			DMA_FROM_DEVICE);
+
+	if (cmd->cmd->sge[1].addr)
+		dma_sync_single_for_cpu(queue->dev->device->dma_device,
+				cmd->cmd->sge[1].addr,
+				NVMET_RDMA_INLINE_DATA_SIZE,
+				DMA_FROM_DEVICE);
+
 	if (!nvmet_req_init(&cmd->req, &queue->nvme_cq,
 			&queue->nvme_sq, &nvmet_rdma_ops))
 		return;