[v1,1/3] svcrdma: Fewer calls to wake_up() in Send completion handler

Message ID 162731081843.13580.15415936872318036839.stgit@klimt.1015granger.net
State New, archived
Series Optimize NFSD Send completion processing

Commit Message

Chuck Lever July 26, 2021, 2:46 p.m. UTC
Because wake_up() takes an IRQ-safe lock, it can be expensive,
especially when called inside a single-threaded completion handler.
What's more, the Send wait queue almost never has waiters, so
most of the time, this is an expensive no-op.

As always, the goal is to reduce the average overhead of each
completion, because a transport's completion handlers are single-
threaded on one CPU core. This change reduces CPU utilization of
the Send completion thread by 2-3% on my server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc_rdma.h       |    1 +
 net/sunrpc/xprtrdma/svc_rdma_rw.c     |    7 ++-----
 net/sunrpc/xprtrdma/svc_rdma_sendto.c |   18 +++++++++++++++---
 3 files changed, 18 insertions(+), 8 deletions(-)
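
The new helper at the center of this patch, shown outside the diff
with annotations explaining each step (the comments below are
editorial additions, not part of the patch as posted):

void svc_rdma_wake_send_waiters(struct svcxprt_rdma *rdma, int avail)
{
	/* Return the completed Send Queue entries to the available pool. */
	atomic_add(avail, &rdma->sc_sq_avail);

	/* Order the sc_sq_avail update before the waitqueue_active()
	 * test, so a waiter cannot read the stale counter and go to
	 * sleep just as the wake_up() is being skipped.
	 */
	smp_mb__after_atomic();

	/* wake_up() takes an IRQ-safe lock. Pay that cost only when a
	 * sender is actually sleeping, which is rarely the case.
	 */
	if (unlikely(waitqueue_active(&rdma->sc_send_wait)))
		wake_up(&rdma->sc_send_wait);
}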

Comments

Tom Talpey July 26, 2021, 4:50 p.m. UTC | #1
On 7/26/2021 10:46 AM, Chuck Lever wrote:
>   /**
>    * svc_rdma_wc_send - Invoked by RDMA provider for each polled Send WC
>    * @cq: Completion Queue context
> @@ -275,11 +289,9 @@ static void svc_rdma_wc_send(struct ib_cq *cq, struct ib_wc *wc)
>   
>   	trace_svcrdma_wc_send(wc, &ctxt->sc_cid);
>   
> +	svc_rdma_wake_send_waiters(rdma, 1);
>   	complete(&ctxt->sc_done);
>   
> -	atomic_inc(&rdma->sc_sq_avail);
> -	wake_up(&rdma->sc_send_wait);

This appears to change the order of wake_up() vs complete().
Is that intentional? Is there any possibility of a false
scheduler activation, later leading to a second wakeup or poll?

Tom.
Chuck Lever July 26, 2021, 5:53 p.m. UTC | #2
> On Jul 26, 2021, at 12:50 PM, Tom Talpey <tom@talpey.com> wrote:
> 
> On 7/26/2021 10:46 AM, Chuck Lever wrote:
>>  /**
>>   * svc_rdma_wc_send - Invoked by RDMA provider for each polled Send WC
>>   * @cq: Completion Queue context
>> @@ -275,11 +289,9 @@ static void svc_rdma_wc_send(struct ib_cq *cq, struct ib_wc *wc)
>>  
>>  	trace_svcrdma_wc_send(wc, &ctxt->sc_cid);
>>  
>> +	svc_rdma_wake_send_waiters(rdma, 1);
>>  	complete(&ctxt->sc_done);
>>  
>> -	atomic_inc(&rdma->sc_sq_avail);
>> -	wake_up(&rdma->sc_send_wait);
> 
> This appears to change the order of wake_up() vs complete().
> Is that intentional?

IIRC I reversed the order here to be consistent with the other
Send completion handlers.


> Is there any possibility of a false
> scheduler activation, later leading to a second wakeup or poll?

The two "wake-ups" here are not related to each other, and RPC
Replies are transmitted already so this shouldn't have a direct
impact on server latency. But I might not understand your
question.


--
Chuck Lever
Tom Talpey July 26, 2021, 8:23 p.m. UTC | #3
On 7/26/2021 1:53 PM, Chuck Lever III wrote:
> 
> 
>> On Jul 26, 2021, at 12:50 PM, Tom Talpey <tom@talpey.com> wrote:
>>
>> On 7/26/2021 10:46 AM, Chuck Lever wrote:
>>>   /**
>>>    * svc_rdma_wc_send - Invoked by RDMA provider for each polled Send WC
>>>    * @cq: Completion Queue context
>>> @@ -275,11 +289,9 @@ static void svc_rdma_wc_send(struct ib_cq *cq, struct ib_wc *wc)
>>>  
>>>  	trace_svcrdma_wc_send(wc, &ctxt->sc_cid);
>>>  
>>> +	svc_rdma_wake_send_waiters(rdma, 1);
>>>  	complete(&ctxt->sc_done);
>>>  
>>> -	atomic_inc(&rdma->sc_sq_avail);
>>> -	wake_up(&rdma->sc_send_wait);
>>
>> This appears to change the order of wake_up() vs complete().
>> Is that intentional?
> 
> IIRC I reversed the order here to be consistent with the other
> Send completion handlers.
> 
> 
>> Is there any possibility of a false
>> scheduler activation, later leading to a second wakeup or poll?
> 
> The two "wake-ups" here are not related to each other, and RPC
> Replies are transmitted already so this shouldn't have a direct
> impact on server latency. But I might not understand your
> question.

IIUC, you're saying that the thread which is awaiting the
completion of ctxt->sc_done is not also waiting to send
anything, therefore no thread is going to experience a
fire drill. Ok.

Feel free to add my
   Reviewed-By: Tom Talpey <tom@talpey.com>
to the series.

Tom.
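
To make the resolution above concrete, here is a simplified and
partly hypothetical sketch of the two unrelated sender-side waits
under discussion; the real logic lives in svc_rdma_send() and its
callers, and the details elided here are assumptions:

/* Wait #1: a sender that finds the Send Queue full sleeps on
 * sc_send_wait until svc_rdma_wake_send_waiters() returns SQ
 * entries to the pool.
 */
wait_event(rdma->sc_send_wait,
	   atomic_read(&rdma->sc_sq_avail) > 0);

/* Wait #2: once the Send WR has been posted, the sending thread
 * sleeps on the per-context completion until the Send WC arrives.
 */
wait_for_completion(&ctxt->sc_done);

A thread blocked in wait #2 has already posted its Send and is
therefore not sleeping in wait #1, which is why issuing the
wake_up() before the complete() cannot cause the spurious second
wakeup asked about above.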

Patch

diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h
index 3184465de3a0..57c60ffe76dd 100644
--- a/include/linux/sunrpc/svc_rdma.h
+++ b/include/linux/sunrpc/svc_rdma.h
@@ -207,6 +207,7 @@ extern void svc_rdma_send_error_msg(struct svcxprt_rdma *rdma,
 				    struct svc_rdma_send_ctxt *sctxt,
 				    struct svc_rdma_recv_ctxt *rctxt,
 				    int status);
+extern void svc_rdma_wake_send_waiters(struct svcxprt_rdma *rdma, int avail);
 extern int svc_rdma_sendto(struct svc_rqst *);
 extern int svc_rdma_result_payload(struct svc_rqst *rqstp, unsigned int offset,
 				   unsigned int length);
diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
index 1e651447dc4e..3d1b119f6e3e 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
@@ -248,8 +248,7 @@ static void svc_rdma_write_done(struct ib_cq *cq, struct ib_wc *wc)
 
 	trace_svcrdma_wc_write(wc, &cc->cc_cid);
 
-	atomic_add(cc->cc_sqecount, &rdma->sc_sq_avail);
-	wake_up(&rdma->sc_send_wait);
+	svc_rdma_wake_send_waiters(rdma, cc->cc_sqecount);
 
 	if (unlikely(wc->status != IB_WC_SUCCESS))
 		svc_xprt_deferred_close(&rdma->sc_xprt);
@@ -304,9 +303,7 @@ static void svc_rdma_wc_read_done(struct ib_cq *cq, struct ib_wc *wc)
 
 	trace_svcrdma_wc_read(wc, &cc->cc_cid);
 
-	atomic_add(cc->cc_sqecount, &rdma->sc_sq_avail);
-	wake_up(&rdma->sc_send_wait);
-
+	svc_rdma_wake_send_waiters(rdma, cc->cc_sqecount);
 	cc->cc_status = wc->status;
 	complete(&cc->cc_done);
 	return;
diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index d6bbafb773e1..fba2ee4eb607 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -258,6 +258,20 @@ void svc_rdma_send_ctxt_put(struct svcxprt_rdma *rdma,
 	spin_unlock(&rdma->sc_send_lock);
 }
 
+/**
+ * svc_rdma_wake_send_waiters - manage Send Queue accounting
+ * @rdma: controlling transport
+ * @avail: Number of additional SQEs that are now available
+ *
+ */
+void svc_rdma_wake_send_waiters(struct svcxprt_rdma *rdma, int avail)
+{
+	atomic_add(avail, &rdma->sc_sq_avail);
+	smp_mb__after_atomic();
+	if (unlikely(waitqueue_active(&rdma->sc_send_wait)))
+		wake_up(&rdma->sc_send_wait);
+}
+
 /**
  * svc_rdma_wc_send - Invoked by RDMA provider for each polled Send WC
  * @cq: Completion Queue context
@@ -275,11 +289,9 @@ static void svc_rdma_wc_send(struct ib_cq *cq, struct ib_wc *wc)
 
 	trace_svcrdma_wc_send(wc, &ctxt->sc_cid);
 
+	svc_rdma_wake_send_waiters(rdma, 1);
 	complete(&ctxt->sc_done);
 
-	atomic_inc(&rdma->sc_sq_avail);
-	wake_up(&rdma->sc_send_wait);
-
 	if (unlikely(wc->status != IB_WC_SUCCESS))
 		svc_xprt_deferred_close(&rdma->sc_xprt);
 }
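
A closing note on the smp_mb__after_atomic() in the new helper: the
kernel documents waitqueue_active() (in include/linux/wait.h) as racy
unless the waker orders its condition update before the emptiness
check, pairing with the barrier the waiter gets from the task-state
change in prepare_to_wait(). The canonical shape, with "cond" standing
in for the sc_sq_avail test:

/* Waker, cf. svc_rdma_wake_send_waiters():
 * publish the condition, full barrier, then the cheap check.
 */
cond = true;			/* here: atomic_add()           */
smp_mb();			/* here: smp_mb__after_atomic() */
if (waitqueue_active(&wq))
	wake_up(&wq);

/* Waiter, roughly what wait_event() expands to:
 * prepare_to_wait() implies a barrier that pairs with the
 * waker's, so either the waiter sees the condition or the
 * waker sees the waiter on the queue.
 */
for (;;) {
	prepare_to_wait(&wq, &wait, TASK_UNINTERRUPTIBLE);
	if (cond)
		break;
	schedule();
}
finish_wait(&wq, &wait);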