[2/3] svcrdma: handle rdma read with a non-zero initial page offset

Message ID 20150921172428.9761.27838.stgit@build2.ogc.int (mailing list archive)
State New, archived
Commit Message

Steve Wise Sept. 21, 2015, 5:24 p.m. UTC
The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
were not taking into account the initial page_offset when determining
the rdma read length.  This resulted in a read whose starting address
and length exceeded the base/bounds of the frmr.

Most workloads apparently don't tickle this bug, but one test hit it
every time: building the Linux kernel on a 16-core node with
'make -j 16 O=/mnt/0', where /mnt/0 is a ramdisk mounted via NFSRDMA.

This bug seems to be tripped only by devices with small fastreg page
list depths.  I didn't see it with mlx4, for instance.
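
To make the arithmetic concrete, here is a small userspace sketch of the
old and new length calculations. It is not the kernel code: the helper
names (pages_for, region_capacity, read_len_old, read_len_new) are
hypothetical, a 4 KiB page is assumed, and the way pages_needed is
derived from page_offset and rs_length is an assumption about the
caller rather than something shown in this patch. max_pages stands in
for sc_max_sge_rd or sc_frmr_pg_list_len.

```c
#include <assert.h>

#define PAGE_SHIFT 12                 /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1 << PAGE_SHIFT)

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Pages touched by rs_length bytes starting page_offset bytes into a page
 * (assumed to mirror how the caller sizes pages_needed). */
static int pages_for(int page_offset, int rs_length)
{
	return (page_offset + rs_length + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Bytes actually available in `pages` pages when the data starts
 * page_offset bytes into the first page. */
static int region_capacity(int pages, int page_offset)
{
	assert(page_offset >= 0 && page_offset < PAGE_SIZE);
	return (pages << PAGE_SHIFT) - page_offset;
}

/* Length calculation before the patch: ignores page_offset. */
static int read_len_old(int page_offset, int rs_length, int max_pages)
{
	int pages_needed = min_int(pages_for(page_offset, rs_length), max_pages);

	return min_int(pages_needed << PAGE_SHIFT, rs_length);
}

/* Length calculation after the patch: subtracts the initial offset. */
static int read_len_new(int page_offset, int rs_length, int max_pages)
{
	int pages_needed = min_int(pages_for(page_offset, rs_length), max_pages);

	return min_int((pages_needed << PAGE_SHIFT) - page_offset, rs_length);
}
```

With page_offset = 512, rs_length = 65536 and a hypothetical depth of 4
pages, read_len_old() returns 16384 while region_capacity(4, 512) is
only 15872, so the old read overruns the region by 512 bytes;
read_len_new() returns 15872. With a depth of 64 pages the page clamp
never applies, both functions return 65536, and nothing overruns, which
would be consistent with the bug showing up only on devices with small
page list depths.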

Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Steve Wise Sept. 28, 2015, 2:31 p.m. UTC | #1
Hey Bruce, can this make 4.3-rc?  Also, what do you think about pushing 
it to stable?

Thanks,

Steve.
J. Bruce Fields Sept. 28, 2015, 9:04 p.m. UTC | #2
On Mon, Sep 28, 2015 at 09:31:25AM -0500, Steve Wise wrote:
> Hey Bruce, can this make 4.3-rc?  Also, what do you think about
> pushing it to stable?

It looks like a reasonable candidate for stable.  Apologies, somehow I
missed it when you posted it--would you mind resending?

--b.
Steve Wise Sept. 28, 2015, 9:49 p.m. UTC | #3
> -----Original Message-----
> From: J. Bruce Fields [mailto:bfields@fieldses.org]
> Sent: Monday, September 28, 2015 4:05 PM
> To: Steve Wise
> Cc: trond.myklebust@primarydata.com; linux-nfs@vger.kernel.org; linux-rdma@vger.kernel.org
> Subject: Re: [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
> 
> On Mon, Sep 28, 2015 at 09:31:25AM -0500, Steve Wise wrote:
> > Hey Bruce, can this make 4.3-rc?  Also, what do you think about
> > pushing it to stable?
> 
> It looks like a reasonable candidate for stable.  Apologies, somehow I
> missed it when you posted it--would you mind resending?
> 
> --b.

Resent this one patch.

What is your process for pushing to stable?

Thanks,

Steve.

J. Bruce Fields Sept. 29, 2015, 3:40 p.m. UTC | #4
On Mon, Sep 28, 2015 at 04:49:21PM -0500, Steve Wise wrote:
> 
> 
> > -----Original Message-----
> > From: J. Bruce Fields [mailto:bfields@fieldses.org]
> > Sent: Monday, September 28, 2015 4:05 PM
> > To: Steve Wise
> > Cc: trond.myklebust@primarydata.com; linux-nfs@vger.kernel.org; linux-rdma@vger.kernel.org
> > Subject: Re: [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
> > 
> > On Mon, Sep 28, 2015 at 09:31:25AM -0500, Steve Wise wrote:
> > > Hey Bruce, can this make 4.3-rc?  Also, what do you think about
> > > pushing it to stable?
> > 
> > It looks like a reasonable candidate for stable.  Apologies, somehow I
> > missed it when you posted it--would you mind resending?
> > 
> > --b.
> 
> resent this one patch.
> 
> What is your process for pushing to stable?

Currently my standard regression tests don't seem to be passing on
recent 4.3; once I figure that out, I'll push out a branch with your
patch (and any others required) to

	git://linux-nfs.org/~bfields for-4.3

and then send a pull request to Linus.  Should be by the end of the
week.

--b.

Patch

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index cb51742..5f6ca47 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -136,7 +136,8 @@  int rdma_read_chunk_lcl(struct svcxprt_rdma *xprt,
 	ctxt->direction = DMA_FROM_DEVICE;
 	ctxt->read_hdr = head;
 	pages_needed = min_t(int, pages_needed, xprt->sc_max_sge_rd);
-	read = min_t(int, pages_needed << PAGE_SHIFT, rs_length);
+	read = min_t(int, (pages_needed << PAGE_SHIFT) - *page_offset,
+		     rs_length);
 
 	for (pno = 0; pno < pages_needed; pno++) {
 		int len = min_t(int, rs_length, PAGE_SIZE - pg_off);
@@ -235,7 +236,8 @@  int rdma_read_chunk_frmr(struct svcxprt_rdma *xprt,
 	ctxt->direction = DMA_FROM_DEVICE;
 	ctxt->frmr = frmr;
 	pages_needed = min_t(int, pages_needed, xprt->sc_frmr_pg_list_len);
-	read = min_t(int, pages_needed << PAGE_SHIFT, rs_length);
+	read = min_t(int, (pages_needed << PAGE_SHIFT) - *page_offset,
+		     rs_length);
 
 	frmr->kva = page_address(rqstp->rq_arg.pages[pg_no]);
 	frmr->direction = DMA_FROM_DEVICE;