[v3] features: declare the Credit2 scheduler as Supported.

Message ID 147809910079.3182.8377673440389249817.stgit@Solace.fritz.box (mailing list archive)
State New, archived

Commit Message

Dario Faggioli Nov. 2, 2016, 3:05 p.m. UTC
Credit2 has been available in tree as an "Experimental" scheduler for
a few years. Recently, an effort started to make it production ready
and, eventually, Xen's new default scheduler. As a consequence, it
has undergone a great deal of development, testing and
benchmarking.

In fact, Credit2's much more modern (wrt Credit1) design and cleaner
code make it a lot easier to understand what the scheduler is doing,
fix scheduling issues that may come up, and implement new and more
advanced features in the future.

In more detail:

 - key features that were missing (pinning and context switching
   rate-limiting) have now been implemented, and more (soft affinity,
   caps and reservations) are about to come. The gap wrt Credit1 is
   therefore closing. In particular, with pinning and rate-limiting
   available, the scheduler can be considered usable.

 - Credit2 has been tested by OSSTest for a long time. Furthermore, as
   part of recent efforts, stress tests and benchmarks have been run
   and have shown no bugs or stability issues.

 - A number of different benchmarks have been run, most of them
   comparing Credit2 with Credit1. Some of the results were posted on
   xen-devel, others were presented during a talk at the 2016
   edition of the Xen Project Developer Summit. In general, performance
   looks promising, if not already better than Credit1 in some
   cases.

It therefore appears that we are ready to mark the Credit2 scheduler
as a 'Supported' feature, and ask users to look at it and try it, if
they think it suits their needs.
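
For anyone who wants to try it, here is a minimal sketch of selecting the
scheduler at boot. It is not part of this patch. `sched=` is the Xen
hypervisor command-line option; the file location and the GRUB variable
used to pass it are assumptions about the boot setup and vary between
distributions.

```conf
# /etc/default/grub (assumed location; distribution-specific)
# Appends the option to the Xen hypervisor command line.
GRUB_CMDLINE_XEN_DEFAULT="sched=credit2"
```

After regenerating the boot loader configuration and rebooting, `xl info`
should report the active scheduler in its `xen_scheduler` field.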

Of course, declaring something 'Supported' has security implications.
So here is how the situation looks from a security standpoint:

1) Is guest->host privilege escalation possible?

The only interfaces exposed to unprivileged guests are the SCHEDOP
hypercalls, and timers. None of those hypercalls contain any pointers,
and they do not appear to contain any privilege escalation path. Also,
they're not specific to Credit2, as they're "used" by all schedulers
(including the current default, Credit1), so any problem with these
interfaces would already be a security concern.


2) Is guest user->guest kernel escalation possible?

The guest kernel does not rely on anything from the scheduler
to protect itself or its data in any way.


3) Is there any information leakage?

The only information which the scheduler exposes to unprivileged
guests is timing information. This could be used for
side-channel attacks to probabilistically infer things about other
vcpus running on the same system; but this has not traditionally
been considered within the security boundary. And, again, this is
possible with all schedulers.

The control domain can issue DOMCTL_SCHEDOP and SYSCTL_SCHEDOP
hypercalls, but the involved data structures are handled in a
way that does not leak information (which would be leaked "only"
to Dom0 anyway).


4) Can a Denial-of-Service be triggered?

This is a risk with schedulers, and one that is hard to foresee.
For instance, it _did_ happen on Credit1 in the past (a vcpu
could "game the system" by sleeping at particular times to gain
BOOST priority and monopolize 95% of the cpu). In that case, it
was possible because of the probabilistic nature of accounting
in Credit1 (which was then fixed). Credit2, by contrast:
 - already does accurate, rather than probabilistic, accounting;
 - does not have any BOOST or, in general, any way for a vcpu to
   become 'more important' than the others: they're all subject
   to the same crediting algorithm.

Also note that the accounting and the crediting algorithm are a lot
simpler than in Credit1, and hence a lot easier to understand, debug
and audit.
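
To illustrate why sampled accounting can be gamed while accurate
accounting cannot, here is a toy model. It is not Xen code and not how
Credit1 or Credit2 are implemented: the tick period, the `toy_vcpu`
structure and all function names are hypothetical, and the model only
assumes that "probabilistic" accounting charges a full tick to whichever
vcpu happens to be running when the tick fires.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only, not Xen code. All times are in microseconds. */
#define TICK_US 10000u /* hypothetical accounting tick period */

/*
 * A vcpu that, in every tick period, runs during the offsets
 * [start_off, start_off + run_us) and sleeps for the rest.
 */
struct toy_vcpu {
    uint32_t start_off;
    uint32_t run_us;
};

/* Return 1 if the vcpu is on the cpu at absolute time t, 0 otherwise. */
int toy_is_running(const struct toy_vcpu *v, uint64_t t)
{
    uint64_t off = t % TICK_US;

    /* The model does not handle a run that wraps past the next tick. */
    assert(v->start_off + v->run_us <= TICK_US);
    return off >= v->start_off && off < (uint64_t)v->start_off + v->run_us;
}

/* Sampled accounting: whoever runs when a tick fires pays a full tick. */
uint64_t toy_sampled_charge(const struct toy_vcpu *v, unsigned int periods)
{
    uint64_t charged = 0;
    unsigned int k;

    for (k = 0; k < periods; k++)
        if (toy_is_running(v, (uint64_t)k * TICK_US))
            charged += TICK_US;
    return charged;
}

/* Accurate accounting: a vcpu pays for exactly the time it ran. */
uint64_t toy_accurate_charge(const struct toy_vcpu *v, unsigned int periods)
{
    return (uint64_t)periods * v->run_us;
}
```

In this model, a vcpu described by `{ 100, 9500 }` runs for 95% of every
period but is always asleep when the tick fires, so the sampled variant
never charges it, while the accurate variant charges it for all the
time it used.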

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Changes since v2:
 * 'EXPERIMENTAL' tag removed from Kconfig;
 * reworded paragraph on SCHEDOP DOMCTL & SYSCTL from Dom0.
---
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Lars Kurth <lars.kurth@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: security@xenproject.org
---
 docs/features/sched_credit2.pandoc |    2 +-
 xen/common/Kconfig                 |    2 +-
 xen/common/sched_credit2.c         |    2 --
 3 files changed, 2 insertions(+), 4 deletions(-)

Comments

Jan Beulich Nov. 2, 2016, 3:38 p.m. UTC | #1
>>> On 02.11.16 at 16:05, <dario.faggioli@citrix.com> wrote:
> [...]
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu Nov. 2, 2016, 3:39 p.m. UTC | #2
On Wed, Nov 02, 2016 at 04:05:03PM +0100, Dario Faggioli wrote:
> Credit2 is available in tree as an "Experimental" scheduler since
> a few years. Recently, effort started for making it production ready
> and, eventually, the new Xen's default scheduler. As a consequence of
> that, it has undergone a greatd deal of development, testing and

greatd -> great

> benchmarking.
> 
> In fact, Credit2's much more modern (wrt Credit1) design and cleaner

Credit2's -> Credit2 is

(I believe contraction is not applicable in this case, but maybe some
native speakers can check.)

> code makes it a lot easier to understand what the scheduler is doing,
> fix scheduling issues that may come up, and implement new and more
> advanced features, in future.
> 
> [...]
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>
George Dunlap Nov. 2, 2016, 3:45 p.m. UTC | #3
On 02/11/16 15:05, Dario Faggioli wrote:
> [...]
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli Nov. 2, 2016, 3:49 p.m. UTC | #4
On Wed, 2016-11-02 at 15:39 +0000, Wei Liu wrote:
> On Wed, Nov 02, 2016 at 04:05:03PM +0100, Dario Faggioli wrote:
> > 
> > Credit2 is available in tree as an "Experimental" scheduler since
> > a few years. Recently, effort started for making it production
> > ready
> > and, eventually, the new Xen's default scheduler. As a consequence
> > of
> > that, it has undergone a greatd deal of development, testing and
> 
> greatd -> great
> 
Sorry, and thanks.
> > 
> > benchmarking.
> > 
> > In fact, Credit2's much more modern (wrt Credit1) design and
> > cleaner
> 
> Credit2's -> Credit2 is
> 
> (I believe contraction is not applicable in this case, but maybe some
> native speakers can check.)
> 
Well, that was actually a Saxon genitive (or English possessive, or
genitive case, or whatever it is called). But even so, I'm not 100% sure
it is ok/best to use it. I think yes, but if it is not, I guess this
should become:

"In fact, the much more modern design of Credit2 (wrt Credit1) and its
cleaner..."

But I'm relying on native speakers too.

Thanks and Regards,
Dario
Konrad Rzeszutek Wilk Nov. 2, 2016, 4 p.m. UTC | #5
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Changes since v2:
>  * 'EXPERIMENTAL' tag removed from Kconfig;
>  * reworded paragraph on SCHEDOP DOMCTL & SYSCTL from Dom0.
> ---
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Lars Kurth <lars.kurth@citrix.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson Nov. 2, 2016, 4:11 p.m. UTC | #6
Dario Faggioli writes ("Re: [PATCH v3] features: declare the Credit2 scheduler as Supported."):
> On Wed, 2016-11-02 at 15:39 +0000, Wei Liu wrote:
> > On Wed, Nov 02, 2016 at 04:05:03PM +0100, Dario Faggioli wrote:
> > > In fact, Credit2's much more modern (wrt Credit1) design and
> > > cleaner
> > 
> > Credit2's -> Credit2 is
> > 
> > (I believe contraction is not applicable in this case, but maybe some
> > native speakers can check.)
> 
> Well, that was actually a Saxon genitive (or English possessive, or
> genitive case, or whatever it is called). But even so, I'm not 100% sure
> it is ok/best to use it. I think yes, but if it is not, I guess this
> should become:

Your original was right.

Anyway,

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Ian.
Wei Liu Nov. 2, 2016, 4:22 p.m. UTC | #7
On Wed, Nov 02, 2016 at 04:49:37PM +0100, Dario Faggioli wrote:
> On Wed, 2016-11-02 at 15:39 +0000, Wei Liu wrote:
> > On Wed, Nov 02, 2016 at 04:05:03PM +0100, Dario Faggioli wrote:
> > > 
> > > Credit2 is available in tree as an "Experimental" scheduler since
> > > a few years. Recently, effort started for making it production
> > > ready
> > > and, eventually, the new Xen's default scheduler. As a consequence
> > > of
> > > that, it has undergone a greatd deal of development, testing and
> > 
> > greatd -> great
> > 
> Sorry, and thanks.
> > > 
> > > benchmarking.
> > > 
> > > In fact, Credit2's much more modern (wrt Credit1) design and
> > > cleaner
> > 
> > Credit2's -> Credit2 is
> > 
> > (I believe contraction is not applicable in this case, but maybe some
> > native speakers can check.)
> > 
> Well, that was actually a Saxon genitive (or English possessive, or
> genitive case, or whatever it is called). But even so, I'm not 100% sure
> it is ok/best to use it. I think yes, but if it is not, I guess this
> should become:
> 

Ah, I skipped "design" and read that as "Credit2's ... cleaner". Sorry
for the noise.

Wei.
Wei Liu Nov. 3, 2016, 11:09 a.m. UTC | #8
Fixed one typo, removed period in subject line and pushed.
Patch

diff --git a/docs/features/sched_credit2.pandoc b/docs/features/sched_credit2.pandoc
index 8609d9c..9c8e15b 100644
--- a/docs/features/sched_credit2.pandoc
+++ b/docs/features/sched_credit2.pandoc
@@ -5,7 +5,7 @@ 
 
 # Basics
 ---------------- ----------------------------------------------------
-         Status: **Experimental**
+         Status: **Supported**
 
       Component: Hypervisor
 ---------------- ----------------------------------------------------
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index d4f10ca..f2ecbc4 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -166,7 +166,7 @@  config SCHED_CREDIT
 	  The traditional credit scheduler is a general purpose scheduler.
 
 config SCHED_CREDIT2
-	bool "Credit2 scheduler support (EXPERIMENTAL)"
+	bool "Credit2 scheduler support"
 	default y
 	---help---
 	  The credit2 scheduler is a general purpose scheduler that is
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index fe46e80..1f26553 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2954,8 +2954,6 @@  csched2_init(struct scheduler *ops)
     struct csched2_private *prv;
 
     printk("Initializing Credit2 scheduler\n");
-    printk(" WARNING: This is experimental software in development.\n" \
-           " Use at your own risk.\n");
 
     printk(XENLOG_INFO " load_precision_shift: %d\n"
            XENLOG_INFO " load_window_shift: %d\n"