mbox series

[00/14,RFC] support automatic changes to nfsd thread count

Message ID 20240715074657.18174-1-neilb@suse.de (mailing list archive)
Headers show
Series support automatic changes to nfsd thread count | expand

Message

NeilBrown July 15, 2024, 7:14 a.m. UTC
This patch set (against nfsd-next) enables automatic adjustment of the
number of nfsd threads.  The number can increase under high load, and
reduce after idle periods.

The first few patches (1-6) are cleanups that may not be entirely
relevant to the current series.  They could safely land any time and
only need minimal review.

Patch 9,10,11 remove some places were sv_nrthreads are used for things
other than counting threads.  It is use to adjust other limits.  At the
time this seemed like an easy and sensible solution.  I now have to
repent of that short-cut and find a better way to impose reasonable
limits.

These and the other sundry patches (7,8,12) can, I think safely land
whenever that get sufficient review.  I think they are sensible even if
we won't end up adjusting threads dynamically.

Patches 13 and 14 build on all this to provide the desired
functionality.  Patch 13 allows the maximum to be configured, and patch
14 starts or stops threads based on some simple triggers.

For 13 I decided that if the user/admin makes no explicit configuration,
then the currently request number of threads becomes a minimum, and a
maximum is determined based on the amount of memory.  This will make
the patch set immediately useful but shouldn't unduly impact existing
configurations.

For patch 14 I only implemented starting a thread when there is work to
do but no threads to do it, and stopping a thread when it has been idle
for 5 seconds.  The start-up is deliberately serialised so at least one
NFS request is serviced between the decision to start a thread and the
action of starting it.  This hopefully encourages a ramping up of thread
count rather than a sudden jump.

There is certain room for discussion around the wisdom of these
heuristics, and what other heuristics are needed - we probably want a
shrinker to impose memory pressure of the number of threads.  We
probably want a thread to exit rather than retry when a memory
allocation in svc_alloc_arg() fails.

I certainly wouldn't recommend patch 14 landing in any hurry at all.

I'd love to hear what y'all think, and what experiences you have when
testing it.

Thanks,
NeilBrown

Comments

Jeff Layton July 15, 2024, 5:29 p.m. UTC | #1
On Mon, 2024-07-15 at 17:14 +1000, NeilBrown wrote:
> This patch set (against nfsd-next) enables automatic adjustment of the
> number of nfsd threads.  The number can increase under high load, and
> reduce after idle periods.
> 
> The first few patches (1-6) are cleanups that may not be entirely
> relevant to the current series.  They could safely land any time and
> only need minimal review.
> 
> Patch 9,10,11 remove some places were sv_nrthreads are used for things
> other than counting threads.  It is use to adjust other limits.  At the
> time this seemed like an easy and sensible solution.  I now have to
> repent of that short-cut and find a better way to impose reasonable
> limits.
> 
> These and the other sundry patches (7,8,12) can, I think safely land
> whenever that get sufficient review.  I think they are sensible even if
> we won't end up adjusting threads dynamically.
> 
> Patches 13 and 14 build on all this to provide the desired
> functionality.  Patch 13 allows the maximum to be configured, and patch
> 14 starts or stops threads based on some simple triggers.
> 
> For 13 I decided that if the user/admin makes no explicit configuration,
> then the currently request number of threads becomes a minimum, and a
> maximum is determined based on the amount of memory.  This will make
> the patch set immediately useful but shouldn't unduly impact existing
> configurations.
> 
> For patch 14 I only implemented starting a thread when there is work to
> do but no threads to do it, and stopping a thread when it has been idle
> for 5 seconds.  The start-up is deliberately serialised so at least one
> NFS request is serviced between the decision to start a thread and the
> action of starting it.  This hopefully encourages a ramping up of thread
> count rather than a sudden jump.
> 
> There is certain room for discussion around the wisdom of these
> heuristics, and what other heuristics are needed - we probably want a
> shrinker to impose memory pressure of the number of threads.  We
> probably want a thread to exit rather than retry when a memory
> allocation in svc_alloc_arg() fails.
> 
> I certainly wouldn't recommend patch 14 landing in any hurry at all.
> 
> I'd love to hear what y'all think, and what experiences you have when
> testing it.
> 
> 

This looks mostly reasonable, modulo a few nits on the later patches.
You can add my Reviewed-by to 1-9. 10-12 others look tentatively OK
too, but I'm less familiar with the slot handling code, and it sounds
like you're going to rework that part anyway.

For 13 I have some ideas about how we should present this from a user
interface standpoint that I wrote in my reply. The heuristics you came
up with in 14 look like a fine place to start.

Cheers!
Chuck Lever July 24, 2024, 7:43 p.m. UTC | #2
> On Jul 15, 2024, at 3:14 AM, NeilBrown <neilb@suse.de> wrote:
> 
> This patch set (against nfsd-next) enables automatic adjustment of the
> number of nfsd threads.  The number can increase under high load, and
> reduce after idle periods.
> 
> The first few patches (1-6) are cleanups that may not be entirely
> relevant to the current series.  They could safely land any time and
> only need minimal review.

I'm trying to get moving on this series. So, I've reviewed 1-6,
with one minor comment (posted previously). If you plan to
repost 5/14, let me know, or you can send me a set of edits for
its patch description and I can apply what's already been posted
to nfsd-next now.

I stopped at 7/14 because we should resolve whether to continue
adding NOFAIL in new code. My impression, from attending the
various LSF sessions on this topic, was that community consensus
is "NOFAIL is NO BUENO". If we feel the community discussion is
ongoing rather than concluded, then we'll have to sort this out
ourselves.


--
Chuck Lever
NeilBrown July 24, 2024, 9:25 p.m. UTC | #3
On Thu, 25 Jul 2024, Chuck Lever III wrote:
> 
> 
> > On Jul 15, 2024, at 3:14 AM, NeilBrown <neilb@suse.de> wrote:
> > 
> > This patch set (against nfsd-next) enables automatic adjustment of the
> > number of nfsd threads.  The number can increase under high load, and
> > reduce after idle periods.
> > 
> > The first few patches (1-6) are cleanups that may not be entirely
> > relevant to the current series.  They could safely land any time and
> > only need minimal review.
> 
> I'm trying to get moving on this series. So, I've reviewed 1-6,
> with one minor comment (posted previously). If you plan to
> repost 5/14, let me know, or you can send me a set of edits for
> its patch description and I can apply what's already been posted
> to nfsd-next now.

I do have plans - for code comments as well as commit comments.  I'll
try to send something on Friday.

> 
> I stopped at 7/14 because we should resolve whether to continue
> adding NOFAIL in new code. My impression, from attending the
> various LSF sessions on this topic, was that community consensus
> is "NOFAIL is NO BUENO". If we feel the community discussion is
> ongoing rather than concluded, then we'll have to sort this out
> ourselves.

I will post that NOFAIL patch more broadly - including fs-devel and
linux-mm and see if any consensus emerges.

Thanks,
NeilBrown

> 
> 
> --
> Chuck Lever
> 
> 
>