[v1,2/2] svcrdma: DMA-sync the receive buffer in svc_rdma_recvfrom()

Message ID 161126239239.8979.7995314438640511469.stgit@klimt.1015granger.net (mailing list archive)
State Not Applicable
Series Two small NFSD/RDMA scalability enhancements

Commit Message

Chuck Lever III Jan. 21, 2021, 8:53 p.m. UTC
The Receive completion handler doesn't look at the contents of the
Receive buffer. The DMA sync isn't terribly expensive but it's one
less thing that needs to be done by the Receive completion handler,
which is single-threaded (per svc_xprt). This helps scalability.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
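
The reordering described in the commit message can be modelled in user space. What follows is only a sketch: the names xprt_model, recv_ctxt, model_wc_receive(), model_recvfrom() and model_sync_for_cpu() are hypothetical stand-ins, not kernel APIs, and a boolean flag takes the place of the real ib_dma_sync_single_for_cpu() call. It shows the completion handler recording the length and queueing the context without syncing, and the consumer syncing just before it reads the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_DEPTH 8

/* Hypothetical stand-in for the per-Receive context. */
struct recv_ctxt {
	size_t byte_len;    /* recorded by the completion handler */
	bool   cpu_synced;  /* models "buffer is safe for the CPU to read" */
};

/* Hypothetical stand-in for the transport and its queue of Receives. */
struct xprt_model {
	struct recv_ctxt *queue[QUEUE_DEPTH];
	size_t head, tail;
	unsigned int consumer_syncs;
};

/* Models the DMA sync; the real call operates on a device and DMA address. */
static void model_sync_for_cpu(struct recv_ctxt *ctxt)
{
	ctxt->cpu_synced = true;
}

/* Completion handler: saves the length and queues, but does not sync. */
static void model_wc_receive(struct xprt_model *x, struct recv_ctxt *ctxt,
			     size_t byte_len)
{
	assert(x->tail - x->head < QUEUE_DEPTH);
	ctxt->byte_len = byte_len;
	ctxt->cpu_synced = false;
	x->queue[x->tail++ % QUEUE_DEPTH] = ctxt;
}

/* Consumer: dequeues, then syncs right before the contents are parsed. */
static struct recv_ctxt *model_recvfrom(struct xprt_model *x)
{
	struct recv_ctxt *ctxt;

	if (x->head == x->tail)
		return NULL;
	ctxt = x->queue[x->head++ % QUEUE_DEPTH];
	model_sync_for_cpu(ctxt);
	x->consumer_syncs++;
	return ctxt;
}
```

In this model every sync is charged to whichever thread calls model_recvfrom(), so the single-threaded handler does only the bookkeeping. Saving byte_len in the context is what lets the later sync cover the same length the handler saw.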

Comments

Christoph Hellwig Jan. 22, 2021, 7:58 a.m. UTC | #1
On Thu, Jan 21, 2021 at 03:53:12PM -0500, Chuck Lever wrote:
> The Receive completion handler doesn't look at the contents of the
> Receive buffer. The DMA sync isn't terribly expensive but it's one
> less thing that needs to be done by the Receive completion handler,
> which is single-threaded (per svc_xprt). This helps scalability.

On dma-noncoherent systems that have speculative execution (e.g. a lot
of ARM systems) it can be fairly expensive, so for those this is a very
good thing.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tom Talpey Jan. 22, 2021, 2:37 p.m. UTC | #2
Is there an asynchronous version of ib_dma_sync? Because it flushes
DMA pipelines, I'm wondering if kicking it off early might improve
latency, getting it done before svc_rdma_recvfrom() needs to dig
into the contents.

Tom.

On 1/21/2021 3:53 PM, Chuck Lever wrote:
> The Receive completion handler doesn't look at the contents of the
> Receive buffer. The DMA sync isn't terribly expensive but it's one
> less thing that needs to be done by the Receive completion handler,
> which is single-threaded (per svc_xprt). This helps scalability.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>   net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |    6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index ab0b7e9777bc..6d28f23ceb35 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -342,9 +342,6 @@ static void svc_rdma_wc_receive(struct ib_cq *cq, struct ib_wc *wc)
>   
>   	/* All wc fields are now known to be valid */
>   	ctxt->rc_byte_len = wc->byte_len;
> -	ib_dma_sync_single_for_cpu(rdma->sc_pd->device,
> -				   ctxt->rc_recv_sge.addr,
> -				   wc->byte_len, DMA_FROM_DEVICE);
>   
>   	spin_lock(&rdma->sc_rq_dto_lock);
>   	list_add_tail(&ctxt->rc_list, &rdma->sc_rq_dto_q);
> @@ -851,6 +848,9 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
>   	spin_unlock(&rdma_xprt->sc_rq_dto_lock);
>   	percpu_counter_inc(&svcrdma_stat_recv);
>   
> +	ib_dma_sync_single_for_cpu(rdma_xprt->sc_pd->device,
> +				   ctxt->rc_recv_sge.addr, ctxt->rc_byte_len,
> +				   DMA_FROM_DEVICE);
>   	svc_rdma_build_arg_xdr(rqstp, ctxt);
>   
>   	/* Prevent svc_xprt_release from releasing pages in rq_pages
> 
> 
>
Christoph Hellwig Jan. 22, 2021, 5:26 p.m. UTC | #3
On Fri, Jan 22, 2021 at 09:37:02AM -0500, Tom Talpey wrote:
> Is there an asynchronous version of ib_dma_sync?

No.  These routines basically compile down to cache writeback and/or
invalidate instructions without much logic around them.
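
As a rough mental model of the point above, a sync on a non-coherent system can be pictured as one cache maintenance operation per cache line covering the buffer, so the work grows with the synced length. The helper below is a hypothetical illustration, not a kernel function, and the 64-byte line size used in the examples is an assumption. It only counts the lines that a sync over [addr, addr + len) would touch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: number of cache lines that
 * overlap [addr, addr + len) when lines are line_size bytes long.
 * line_size must be a nonzero power of two.
 */
static size_t cache_lines_spanned(uintptr_t addr, size_t len, size_t line_size)
{
	uintptr_t mask, first, last;

	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
	if (len == 0)
		return 0;

	mask = ~(uintptr_t)(line_size - 1);
	first = addr & mask;              /* line holding the first byte */
	last = (addr + len - 1) & mask;   /* line holding the last byte */
	return (size_t)((last - first) / line_size) + 1;
}
```

Under these assumptions a 4096-byte buffer aligned to a line boundary spans 64 lines, while an 8-byte range starting at offset 60 straddles two. This is one way to see why the patch passes the received byte count to the sync, so that only the part of the buffer that was written is covered.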
Patch

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index ab0b7e9777bc..6d28f23ceb35 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -342,9 +342,6 @@  static void svc_rdma_wc_receive(struct ib_cq *cq, struct ib_wc *wc)
 
 	/* All wc fields are now known to be valid */
 	ctxt->rc_byte_len = wc->byte_len;
-	ib_dma_sync_single_for_cpu(rdma->sc_pd->device,
-				   ctxt->rc_recv_sge.addr,
-				   wc->byte_len, DMA_FROM_DEVICE);
 
 	spin_lock(&rdma->sc_rq_dto_lock);
 	list_add_tail(&ctxt->rc_list, &rdma->sc_rq_dto_q);
@@ -851,6 +848,9 @@  int svc_rdma_recvfrom(struct svc_rqst *rqstp)
 	spin_unlock(&rdma_xprt->sc_rq_dto_lock);
 	percpu_counter_inc(&svcrdma_stat_recv);
 
+	ib_dma_sync_single_for_cpu(rdma_xprt->sc_pd->device,
+				   ctxt->rc_recv_sge.addr, ctxt->rc_byte_len,
+				   DMA_FROM_DEVICE);
 	svc_rdma_build_arg_xdr(rqstp, ctxt);
 
 	/* Prevent svc_xprt_release from releasing pages in rq_pages