[v2,0/7] xen: credit2: limit the number of CPUs per runqueue

Message ID 159070133878.12060.13318432301910522647.stgit@Palanthas (mailing list archive)

Message

Dario Faggioli May 28, 2020, 9:29 p.m. UTC
Hello!

Here's v2 of this series... a bit late, but technically still in time
for code-freeze, although I understand this is quite tight! :-P

Anyway, in Credit2, the CPUs are assigned to runqueues according to the
host topology. For instance, if we want per-socket runqueues (which is
the default), the CPUs that are in the same socket will end up in the
same runqueue.

This is generally good for scalability, at least as long as the number
of CPUs that end up in the same runqueue is not too high. In fact, all
these CPUs will compete for the same spinlock when making scheduling
decisions and manipulating the scheduler data structures. Therefore, if
there are too many of them, that lock can become a bottleneck.

This has not been an issue so far, but architectures with 128 CPUs per
socket are now available, and it is certainly far from ideal to have so
many CPUs in the same runqueue, competing for the same locks, etc.

Let's therefore set a cap to the total number of CPUs that can share a
runqueue. The value is set to 16, by default, but can be changed with
a boot command line parameter.

Note also that, if the host has hyperthreading (or equivalent) and we
are not using core-scheduling, additional care is taken to avoid
splitting CPUs that are hyperthread siblings among different runqueues.

In v2, in addition to trying to address the review comments, I have
added the logic for doing a full rebalance of the scheduler runqueues
while the system is running. This is something that itself came up
during review of v1, when we realized that we did not want only a cap:
we also wanted some balancing, and if you want real balancing, you have
to be able to rearrange the runqueue layout dynamically.

It took a while because, although I had something that looked a lot
like the final solution implemented in this patch, I could not see how
to cleanly and effectively deal with the vCPUs that are in the
runqueues while trying to rebalance them. It was while talking with
Juergen that we figured out that we can actually pause the domains,
which I had not thought of... So, once again, Juergen, thanks! :-)

I have done most of the stress testing with core-scheduling disabled,
and it has survived anything I threw at it. But of course the more
testing the better, and I will be able to do more of it in the coming
days.

In any case, I have also verified that at least a few core-scheduling
enabled configs work as well.

There are git branches here:
 git://xenbits.xen.org/people/dariof/xen.git  sched/credit2-max-cpus-runqueue-v2
 http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/sched/credit2-max-cpus-runqueue-v2

 https://github.com/dfaggioli/xen/tree/sched/credit2-max-cpus-runqueue-v2

While v1 is at the following link:
 https://lore.kernel.org/xen-devel/158818022727.24327.14309662489731832234.stgit@Palanthas/T/#m1e885a0f0a1feef83790ac179ab66512201cb770

Thanks and Regards
---
Dario Faggioli (7):
      xen: credit2: factor cpu to runqueue matching in a function
      xen: credit2: factor runqueue initialization in its own function.
      xen: cpupool: add a back-pointer from a scheduler to its pool
      xen: credit2: limit the max number of CPUs in a runqueue
      xen: credit2: compute cpus per-runqueue more dynamically.
      cpupool: create an the 'cpupool sync' infrastructure
      xen: credit2: rebalance the number of CPUs in the scheduler runqueues

 docs/misc/xen-command-line.pandoc |   14 +
 xen/common/sched/cpupool.c        |   53 ++++
 xen/common/sched/credit2.c        |  437 ++++++++++++++++++++++++++++++++++---
 xen/common/sched/private.h        |    7 +
 xen/include/asm-arm/cpufeature.h  |    5 
 xen/include/asm-x86/processor.h   |    5 
 xen/include/xen/sched.h           |    1 
 7 files changed, 491 insertions(+), 31 deletions(-)

--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)

Comments

Jan Beulich July 21, 2020, 12:08 p.m. UTC | #1
On 28.05.2020 23:29, Dario Faggioli wrote:
> Dario Faggioli (7):
>       xen: credit2: factor cpu to runqueue matching in a function
>       xen: credit2: factor runqueue initialization in its own function.
>       xen: cpupool: add a back-pointer from a scheduler to its pool
>       xen: credit2: limit the max number of CPUs in a runqueue
>       xen: credit2: compute cpus per-runqueue more dynamically.
>       cpupool: create an the 'cpupool sync' infrastructure
>       xen: credit2: rebalance the number of CPUs in the scheduler runqueues

I still have the last three patches here as well as "xen: credit2:
document that min_rqd is valid and ok to use" in my "waiting for
sufficient acks" folder. Would you mind indicating if they should
stay there (and you will take care of pinging or whatever is
needed), or whether I may drop them (and you'll eventually re-
submit)?

Thanks, Jan
Dario Faggioli July 22, 2020, 3:33 p.m. UTC | #2
On Tue, 2020-07-21 at 14:08 +0200, Jan Beulich wrote:
> On 28.05.2020 23:29, Dario Faggioli wrote:
> > Dario Faggioli (7):
> >       xen: credit2: factor cpu to runqueue matching in a function
> >       xen: credit2: factor runqueue initialization in its own
> > function.
> >       xen: cpupool: add a back-pointer from a scheduler to its pool
> >       xen: credit2: limit the max number of CPUs in a runqueue
> >       xen: credit2: compute cpus per-runqueue more dynamically.
> >       cpupool: create an the 'cpupool sync' infrastructure
> >       xen: credit2: rebalance the number of CPUs in the scheduler
> > runqueues
> 
> I still have the last three patches here as well as "xen: credit2:
> document that min_rqd is valid and ok to use" in my "waiting for
> sufficient acks" folder. 
>
Ok.

> Would you mind indicating if they should
> stay there (and you will take care of pinging or whatever is
> needed), or whether I may drop them (and you'll eventually re-
> submit)?
> 
So, regarding the last 3 patches of this series: their status is indeed
"waiting for sufficient acks", in the sense that they have not been
looked at, due to code freeze being imminent back then. It would still
be ok for people to look at them, but I am happy for you to drop them
from your queue.

I will take care of resending them in a new series. And thanks.

For the min_rqd one... That should be quite easy, in theory, so let me
ping the thread right now. Especially considering that I just looked
back at it, and noticed that I forgot to Cc George in the first place!
:-O

Regards