[v3,15/15] docs: add MBA description in docs

Message ID 1504603957-5389-16-git-send-email-yi.y.sun@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Yi Sun Sept. 5, 2017, 9:32 a.m. UTC
This patch adds MBA description in related documents.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
v2:
    - state the value type shown by 'psr-mba-show'. For linear mode,
      it shows decimal value. For non-linear mode, it shows hexadecimal
      value.
      (suggested by Chao Peng)
---
 docs/man/xl.pod.1.in      | 34 +++++++++++++++++++++++++
 docs/misc/xl-psr.markdown | 63 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 97 insertions(+)

Comments

Roger Pau Monne Sept. 19, 2017, 11:37 a.m. UTC | #1
On Tue, Sep 05, 2017 at 05:32:37PM +0800, Yi Sun wrote:
> This patch adds MBA description in related documents.
> 
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
> ---
> v2:
>     - state the value type shown by 'psr-mba-show'. For linear mode,
>       it shows decimal value. For non-linear mode, it shows hexadecimal
>       value.
>       (suggested by Chao Peng)
> ---
>  docs/man/xl.pod.1.in      | 34 +++++++++++++++++++++++++
>  docs/misc/xl-psr.markdown | 63 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 97 insertions(+)
> 
> diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in
> index 16c8306..e644b19 100644
> --- a/docs/man/xl.pod.1.in
> +++ b/docs/man/xl.pod.1.in
> @@ -1798,6 +1798,40 @@ processed.
>  
>  =back
>  
> +=head2 Memory Bandwidth Allocation
> +
> +Intel Skylake and later server platforms offer capabilities to configure and
> +make use of the Memory Bandwidth Allocation (MBA) mechanisms, which provides
> +OS/VMMs the ability to slow misbehaving apps/VMs or create advanced closed-loop

I don't get the 'closed-loop' thing again, but that might just be me
since I'm not a native speaker.

> +control system via exposing control over a credit-based throttling mechanism.
> +In the Xen implementation, MBA is used to control memory bandwidth on VM basis.
> +To enforce bandwidth on a specific domain, just set throttling value (THRTL)
> +for the domain.
> +
> +=over 4
> +
> +=item B<psr-mba-set> [I<OPTIONS>] I<domain-id> I<thrtl>
> +
> +Set throttling value (THRTL) for a domain. For how to specify I<thrtl>
> +please refer to L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>.
> +
> +B<OPTIONS>
> +
> +=over 4
> +
> +=item B<-s SOCKET>, B<--socket=SOCKET>
> +
> +Specify the socket to process, otherwise all sockets are processed.
> +
> +=back
> +
> +=item B<psr-mba-show> [I<domain-id>]
> +
> +Show MBA settings for a certain domain or all domains. For linear mode, it
> +shows the decimal value. For non-linear mode, it shows hexadecimal value.
> +
> +=back
> +
>  =head1 IGNORED FOR COMPATIBILITY WITH XM
>  
>  xl is mostly command-line compatible with the old xm utility used with
> diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
> index 04dd957..39fc801 100644
> --- a/docs/misc/xl-psr.markdown
> +++ b/docs/misc/xl-psr.markdown
> @@ -186,6 +186,69 @@ Setting data CBM for a domain:
>  Setting the same code and data CBM for a domain:
>  `xl psr-cat-set <domid> <cbm>`
>  
> +## Memory Bandwidth Allocation (MBA)
> +
> +Memory Bandwidth Allocation (MBA) is a new feature available on Intel
> +Skylake and later server platforms that allows an OS or Hypervisor/VMM to
> +slow misbehaving apps/VMs or create advanced closed-loop control system via
> +exposing control over a credit-based throttling mechanism. To enforce bandwidth
> +on a specific domain, just set throttling value (THRTL) into Class of Service
> +(COS). MBA provides two THRTL mode. One is linear mode and the other is
> +non-linear mode.
> +
> +In the linear mode the input precision is defined as 100-(THRTL_MAX). Values
> +not an even multiple of the precision (e.g., 12%) will be rounded down (e.g.,
> +to 10% delay applied).
               ^ s/applied/by the hardware/

Thanks, Roger.
Yi Sun Sept. 20, 2017, 7:26 a.m. UTC | #2
On 17-09-19 12:37:24, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:37PM +0800, Yi Sun wrote:
> > +Intel Skylake and later server platforms offer capabilities to configure and
> > +make use of the Memory Bandwidth Allocation (MBA) mechanisms, which provides
> > +OS/VMMs the ability to slow misbehaving apps/VMs or create advanced closed-loop
> 
> I don't get the 'closed-loop' thing again, but that might just be me
> since I'm not a native speaker.
> 
Will modify this to match the feature doc.

[...]

> > +In the linear mode the input precision is defined as 100-(THRTL_MAX). Values
> > +not an even multiple of the precision (e.g., 12%) will be rounded down (e.g.,
> > +to 10% delay applied).
>                ^ s/applied/by the hardware/
> 
Thanks!

> Thanks, Roger.
Dario Faggioli Sept. 28, 2017, 4:56 p.m. UTC | #3
On Tue, 2017-09-19 at 12:37 +0100, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:37PM +0800, Yi Sun wrote:
> > 
> > --- a/docs/man/xl.pod.1.in
> > +++ b/docs/man/xl.pod.1.in
> > @@ -1798,6 +1798,40 @@ processed.
> >  
> >  =back
> >  
> > +=head2 Memory Bandwidth Allocation
> > +
> > +Intel Skylake and later server platforms offer capabilities to
> > configure and
> > +make use of the Memory Bandwidth Allocation (MBA) mechanisms,
> > which provides
> > +OS/VMMs the ability to slow misbehaving apps/VMs or create
> > advanced closed-loop
> 
> I don't get the 'closed-loop' thing again, but that might just be me
> since I'm not a native speaker.
> 
> > +control system via exposing control over a credit-based throttling
> > mechanism.
>
It goes together with 'control system'. In fact, 'closed-loop control
system' is a concept from control theory (or system automation, or
system theory... I've heard it called in all these ways).

It's when you want to control a system, or a process, and you do it by
enclosing it in a "loop" in such a way that the n+1-th input to the
process is influenced by the n-th output of the process itself. It's
also called 'feedback-loop' or 'feedback-based control system'.

Basically, you usually read/measure/sense the n-th output of the
process, you compare it with some 'desired' value, and you use --as the
process' n+1-th input-- some indication of how different the measured
value was from the desired value.

http://www.electronics-tutorials.ws/systems/closed-loop-system.html

Alternatively, you have 'open-loop control systems', where there is no
sensing of the output, and no feedback mechanism that would correct the
input according to how things are actually going (i.e., some would say
there is no control at all!).

http://www.electronics-tutorials.ws/systems/open-loop-system.html


*I guess* what this means, in this context, is that, with both MBA and
MBM, you can build a piece of software that, given a desired memory
bandwidth usage, for a certain domain, sets MBA accordingly, then
monitors what the domain is actually getting, and uses the difference
between that and the desired value to drive the new value to be set,
using MBA again. Like, if it's getting less, give it _some_ more, if
it's getting more, give it _some_ less (where both the _some_-s are
coefficients).

Ideally, after initial spikes and fluctuations (which depend on the
coefficients, and on which one can do math, still using control theory
concepts), happening, e.g., when the workload inside the VM changes,
the bandwidth utilization will settle at the desired point.
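
To make the loop above a bit more concrete, here is a minimal sketch of
such a controller. It is only an illustration, not Xen code: the names
(control_step, simulated_bandwidth, run_loop), the gain value and the
toy model of how bandwidth reacts to the throttling value are all
assumptions made up for this example. In a real setup the measurement
would come from MBM and the new value would be applied via MBA.

```python
def control_step(thrtl, measured_bw, target_bw, gain=0.05, thrtl_max=90):
    """One feedback iteration: return the next throttling value.

    'gain' is the coefficient deciding how much of the error is fed back.
    """
    error = measured_bw - target_bw      # positive: domain gets too much
    new_thrtl = thrtl + gain * error     # more delay if over, less if under
    # Keep the value inside the allowed range [0, thrtl_max].
    return max(0.0, min(float(thrtl_max), new_thrtl))


def simulated_bandwidth(thrtl, unthrottled_bw=1000.0):
    """Toy stand-in for the monitored domain (hypothetical model):
    bandwidth drops linearly with the delay percentage."""
    return unthrottled_bw * (1.0 - thrtl / 100.0)


def run_loop(target_bw, steps=50):
    """Closed loop: measure, compare with the target, adjust, repeat."""
    thrtl = 0.0
    for _ in range(steps):
        measured = simulated_bandwidth(thrtl)
        thrtl = control_step(thrtl, measured, target_bw)
    return thrtl, simulated_bandwidth(thrtl)


if __name__ == "__main__":
    thrtl, bw = run_loop(target_bw=600.0)
    print(f"settled at thrtl={thrtl:.2f}, bandwidth={bw:.2f}")
```

With this toy model the throttling value settles where the simulated
bandwidth matches the target. A larger gain reacts faster but can
overshoot or oscillate, which is where the control theory math comes in.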


All that being said, I'd say that either more details are given (or a
link is put here, pointing to a whitepaper or in general a place where
a full description of the solution can be found), or it's probably
better to drop the 'closed-loop' reference, and explain how MBA can be
useful in another way.

Regards,
Dario

Patch

diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in
index 16c8306..e644b19 100644
--- a/docs/man/xl.pod.1.in
+++ b/docs/man/xl.pod.1.in
@@ -1798,6 +1798,40 @@  processed.
 
 =back
 
+=head2 Memory Bandwidth Allocation
+
+Intel Skylake and later server platforms offer capabilities to configure and
+make use of the Memory Bandwidth Allocation (MBA) mechanisms, which provides
+OS/VMMs the ability to slow misbehaving apps/VMs or create advanced closed-loop
+control system via exposing control over a credit-based throttling mechanism.
+In the Xen implementation, MBA is used to control memory bandwidth on VM basis.
+To enforce bandwidth on a specific domain, just set throttling value (THRTL)
+for the domain.
+
+=over 4
+
+=item B<psr-mba-set> [I<OPTIONS>] I<domain-id> I<thrtl>
+
+Set throttling value (THRTL) for a domain. For how to specify I<thrtl>
+please refer to L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>.
+
+B<OPTIONS>
+
+=over 4
+
+=item B<-s SOCKET>, B<--socket=SOCKET>
+
+Specify the socket to process, otherwise all sockets are processed.
+
+=back
+
+=item B<psr-mba-show> [I<domain-id>]
+
+Show MBA settings for a certain domain or all domains. For linear mode, it
+shows the decimal value. For non-linear mode, it shows hexadecimal value.
+
+=back
+
 =head1 IGNORED FOR COMPATIBILITY WITH XM
 
 xl is mostly command-line compatible with the old xm utility used with
diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
index 04dd957..39fc801 100644
--- a/docs/misc/xl-psr.markdown
+++ b/docs/misc/xl-psr.markdown
@@ -186,6 +186,69 @@  Setting data CBM for a domain:
 Setting the same code and data CBM for a domain:
 `xl psr-cat-set <domid> <cbm>`
 
+## Memory Bandwidth Allocation (MBA)
+
+Memory Bandwidth Allocation (MBA) is a new feature available on Intel
+Skylake and later server platforms that allows an OS or Hypervisor/VMM to
+slow misbehaving apps/VMs or create advanced closed-loop control system via
+exposing control over a credit-based throttling mechanism. To enforce bandwidth
+on a specific domain, just set throttling value (THRTL) into Class of Service
+(COS). MBA provides two THRTL mode. One is linear mode and the other is
+non-linear mode.
+
+In the linear mode the input precision is defined as 100-(THRTL_MAX). Values
+not an even multiple of the precision (e.g., 12%) will be rounded down (e.g.,
+to 10% delay applied).
+
+If linear values are not supported then input delay values are powers-of-two
+from zero to the THRTL_MAX value from CPUID. In this case any values not a power
+of two will be rounded down the next nearest power of two.
+
+For example, assuming a system with 2 domains:
+
+ * A THRTL of 0x0 for every domain means each domain can access the whole cache
+   without any delay. This is the default.
+
+ * Linear mode: Giving one domain a THRTL of 0xC and the other domain's 0 means
+   that the first domain gets 10% delay to access the cache and the other one
+   without any delay.
+
+ * Non-linear mode: Giving one domain a THRTL of 0xC and the other domain's 0
+   means that the first domain gets 8% delay to access the cache and the other
+   one without any delay.
+
+For more detailed information please refer to Intel SDM chapter
+"Introduction to Memory Bandwidth Allocation".
+
+In Xen's implementation, THRTL can be configured with libxl/xl interfaces but
+COS is maintained in hypervisor only. The cache partition granularity is per
+domain, each domain has COS=0 assigned by default, the corresponding THRTL is
+0, which means all the cache resource can be accessed without delay.
+
+### xl interfaces
+
+System MBA information such as maximum COS and maximum THRTL can be obtained by:
+
+`xl psr-hwinfo --mba`
+
+The simplest way to change a domain's THRTL from its default is running:
+
+`xl psr-mba-set  [OPTIONS] <domid> <thrtl>`
+
+In a multi-socket system, the same thrtl will be set on each socket by default.
+Per socket thrtl can be specified with the `--socket SOCKET` option.
+
+Setting the THRTL may not be successful if insufficient COS is available. In
+such case unused COS(es) may be freed by setting THRTL of all related domains to
+its default value(0).
+
+Per domain THRTL settings can be shown by:
+
+`xl psr-mba-show [OPTIONS] <domid>`
+
+For linear mode, it shows the decimal value. For non-linear mode, it shows
+hexadecimal value.
+
 ## Reference
 
 [1] Intel SDM
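
The rounding rules described in the xl-psr.markdown hunk above (linear and
non-linear THRTL modes) can be sketched as follows. This is only an
illustration of the text in the patch, not the hypervisor's code: the
function names are hypothetical, and rejecting out-of-range input with
ValueError is an assumption, since the patch does not say how such values
are handled.

```python
def round_linear(thrtl, thrtl_max):
    """Linear mode: round down to a multiple of the precision,
    where precision = 100 - THRTL_MAX (as stated in the doc text)."""
    precision = 100 - thrtl_max
    if precision <= 0:
        raise ValueError("thrtl_max must be below 100")
    if not 0 <= thrtl <= thrtl_max:
        # Assumption: out-of-range values are rejected.
        raise ValueError("thrtl out of range")
    return thrtl - thrtl % precision


def round_nonlinear(thrtl, thrtl_max):
    """Non-linear mode: round down to the nearest power of two
    (zero stays zero)."""
    if not 0 <= thrtl <= thrtl_max:
        # Assumption: out-of-range values are rejected.
        raise ValueError("thrtl out of range")
    if thrtl == 0:
        return 0
    # Highest set bit gives the largest power of two <= thrtl.
    return 1 << (thrtl.bit_length() - 1)


if __name__ == "__main__":
    # The examples from the doc text: 12 (0xC) becomes 10 in linear mode
    # with THRTL_MAX = 90, and 8 in non-linear mode.
    print(round_linear(12, 90))       # 10
    print(round_nonlinear(0xC, 64))   # 8
```

The THRTL_MAX values used here (90 and 64) are example inputs only; the
actual maximum is reported by CPUID and shown by `xl psr-hwinfo --mba`.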