[v3,0/9] SUNRPC service thread scheduler optimizations

Message ID: 168900729243.7514.15141312295052254929.stgit@manet.1015granger.net

Message

Chuck Lever July 10, 2023, 4:41 p.m. UTC
Walking a linked list to find an idle thread is not CPU cache-
friendly, and in fact I've noted palpable per-request latency
impacts as the number of nfsd threads on the server increases.

After discussing some possible improvements with Jeff at LSF/MM,
I've been experimenting with the following series. I've measured an
order of magnitude latency improvement in the thread lookup time,
and have managed to keep the whole thing lockless.

This version of the series addresses Neil's earlier comments and
is robust under load. The first seven patches seem uncontroversial,
so I'll push them to the "topic-sunrpc-thread-scheduling" branch of:

https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git

The last two patches are still under discussion. They are posted here
for comparison with other proposals and are not yet included in the
topic branch, but they are tested and working.


Changes since v2: 
* Dropped the patch that converts sp_lock to a simple spinlock
* Replaced explicit memory barriers in svc_get_next_xprt()
* Selected thread victims from the other end of the bitmap
* Added a metric for wake-ups that find nothing on the transport queue

Changes since RFC:
* Add a counter for ingress RPC messages
* Add a few documenting comments
* Move the more controversial patches to the end of the series
* Clarify the refactoring of svc_wake_up() 
* Increase the value of RPCSVC_MAXPOOLTHREADS to 4096 (and tested with that many threads)
* Optimize the loop in svc_pool_wake_idle_thread()
* Optimize marking a thread "idle" in svc_get_next_xprt()

---

Chuck Lever (9):
      SUNRPC: Deduplicate thread wake-up code
      SUNRPC: Report when no service thread is available.
      SUNRPC: Split the svc_xprt_dequeue tracepoint
      SUNRPC: Count ingress RPC messages per svc_pool
      SUNRPC: Count pool threads that were awoken but found no work to do
      SUNRPC: Clean up svc_set_num_threads
      SUNRPC: Replace dprintk() call site in __svc_create()
      SUNRPC: Replace sp_threads_all with an xarray
      SUNRPC: Convert RQ_BUSY into a per-pool bitmap


 fs/nfsd/nfssvc.c              |   3 +-
 include/linux/sunrpc/svc.h    |  19 ++--
 include/trace/events/sunrpc.h | 159 ++++++++++++++++++++++++++----
 net/sunrpc/svc.c              | 177 ++++++++++++++++++++++------------
 net/sunrpc/svc_xprt.c         |  99 ++++++++++---------
 5 files changed, 323 insertions(+), 134 deletions(-)

--
Chuck Lever

Comments

Jeff Layton July 10, 2023, 6:29 p.m. UTC | #1
On Mon, 2023-07-10 at 12:41 -0400, Chuck Lever wrote:
> Walking a linked list to find an idle thread is not CPU cache-
> friendly, and in fact I've noted palpable per-request latency
> impacts as the number of nfsd threads on the server increases.

You can add my Reviewed-by to the first 7 patches. The last two are not
quite there yet, I think, but I like how they're looking so far.

Cheers,
NeilBrown July 10, 2023, 10:30 p.m. UTC | #2
On Tue, 11 Jul 2023, Jeff Layton wrote:
> 
> You can add my Reviewed-by to the first 7 patches. The last two are not
> quite there yet, I think, but I like how they're looking so far.

Similarly, here is my r-b for the first 7:

 Reviewed-by: NeilBrown <neilb@suse.de>

NeilBrown