diff mbox series

[v7,01/14] xen/common: add cache coloring common code

Message ID 20240315105902.160047-2-carlo.nonato@minervasys.tech (mailing list archive)
State Superseded
Headers show
Series Arm cache coloring | expand

Commit Message

Carlo Nonato March 15, 2024, 10:58 a.m. UTC
Last Level Cache (LLC) coloring allows to partition the cache in smaller
chunks called cache colors. Since not all architectures can actually
implement it, add a HAS_LLC_COLORING Kconfig and put other options under
xen/arch.

LLC colors are a property of the domain, so the domain struct has to be
extended.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v7:
- SUPPORT.md changes added to this patch
- extended documentation to better address applicability of cache coloring
- "llc-nr-ways" and "llc-size" params introduced in favor of "llc-way-size"
- moved dump_llc_coloring_info() call in 'm' keyhandler (pagealloc_info())
v6:
- moved almost all code in common
- moved documentation in this patch
- reintroduced range for CONFIG_NR_LLC_COLORS
- reintroduced some stub functions to reduce the number of checks on
  llc_coloring_enabled
- moved domain_llc_coloring_free() in same patch where allocation happens
- turned "d->llc_colors" to pointer-to-const
- llc_coloring_init() now returns void and panics if errors are found
v5:
- used - instead of _ for filenames
- removed domain_create_llc_colored()
- removed stub functions
- coloring domain fields are now #ifdef protected
v4:
- Kconfig options moved to xen/arch
- removed range for CONFIG_NR_LLC_COLORS
- added "llc_coloring_enabled" global to later implement the boot-time
  switch
- added domain_create_llc_colored() to be able to pass colors
- added is_domain_llc_colored() macro
---
 SUPPORT.md                        |   7 ++
 docs/misc/cache-coloring.rst      | 125 ++++++++++++++++++++++++++++++
 docs/misc/xen-command-line.pandoc |  37 +++++++++
 xen/arch/Kconfig                  |  20 +++++
 xen/common/Kconfig                |   3 +
 xen/common/Makefile               |   1 +
 xen/common/keyhandler.c           |   3 +
 xen/common/llc-coloring.c         | 102 ++++++++++++++++++++++++
 xen/common/page_alloc.c           |   3 +
 xen/include/xen/llc-coloring.h    |  36 +++++++++
 xen/include/xen/sched.h           |   5 ++
 11 files changed, 342 insertions(+)
 create mode 100644 docs/misc/cache-coloring.rst
 create mode 100644 xen/common/llc-coloring.c
 create mode 100644 xen/include/xen/llc-coloring.h

Comments

Carlo Nonato March 15, 2024, 11:39 a.m. UTC | #1
Hi all,

unfortunately, this patch doesn't apply cleanly to the latest master.
The conflict is very small: just a reordering of two lines in
xen/common/Kconfig. Should I resend the whole series?

Thanks.

On Fri, Mar 15, 2024 at 11:59 AM Carlo Nonato
<carlo.nonato@minervasys.tech> wrote:
>
> Last Level Cache (LLC) coloring allows to partition the cache in smaller
> chunks called cache colors. Since not all architectures can actually
> implement it, add a HAS_LLC_COLORING Kconfig and put other options under
> xen/arch.
>
> LLC colors are a property of the domain, so the domain struct has to be
> extended.
>
> Based on original work from: Luca Miccio <lucmiccio@gmail.com>
>
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
> v7:
> - SUPPORT.md changes added to this patch
> - extended documentation to better address applicability of cache coloring
> - "llc-nr-ways" and "llc-size" params introduced in favor of "llc-way-size"
> - moved dump_llc_coloring_info() call in 'm' keyhandler (pagealloc_info())
> v6:
> - moved almost all code in common
> - moved documentation in this patch
> - reintroduced range for CONFIG_NR_LLC_COLORS
> - reintroduced some stub functions to reduce the number of checks on
>   llc_coloring_enabled
> - moved domain_llc_coloring_free() in same patch where allocation happens
> - turned "d->llc_colors" to pointer-to-const
> - llc_coloring_init() now returns void and panics if errors are found
> v5:
> - used - instead of _ for filenames
> - removed domain_create_llc_colored()
> - removed stub functions
> - coloring domain fields are now #ifdef protected
> v4:
> - Kconfig options moved to xen/arch
> - removed range for CONFIG_NR_LLC_COLORS
> - added "llc_coloring_enabled" global to later implement the boot-time
>   switch
> - added domain_create_llc_colored() to be able to pass colors
> - added is_domain_llc_colored() macro
> ---
>  SUPPORT.md                        |   7 ++
>  docs/misc/cache-coloring.rst      | 125 ++++++++++++++++++++++++++++++
>  docs/misc/xen-command-line.pandoc |  37 +++++++++
>  xen/arch/Kconfig                  |  20 +++++
>  xen/common/Kconfig                |   3 +
>  xen/common/Makefile               |   1 +
>  xen/common/keyhandler.c           |   3 +
>  xen/common/llc-coloring.c         | 102 ++++++++++++++++++++++++
>  xen/common/page_alloc.c           |   3 +
>  xen/include/xen/llc-coloring.h    |  36 +++++++++
>  xen/include/xen/sched.h           |   5 ++
>  11 files changed, 342 insertions(+)
>  create mode 100644 docs/misc/cache-coloring.rst
>  create mode 100644 xen/common/llc-coloring.c
>  create mode 100644 xen/include/xen/llc-coloring.h
>
> diff --git a/SUPPORT.md b/SUPPORT.md
> index 510bb02190..456abd42bf 100644
> --- a/SUPPORT.md
> +++ b/SUPPORT.md
> @@ -364,6 +364,13 @@ by maintaining multiple physical to machine (p2m) memory mappings.
>      Status, x86 HVM: Tech Preview
>      Status, ARM: Tech Preview
>
> +### Cache coloring
> +
> +Allows to reserve Last Level Cache (LLC) partitions for Dom0, DomUs and Xen
> +itself.
> +
> +    Status, Arm64: Experimental
> +
>  ## Resource Management
>
>  ### CPU Pools
> diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
> new file mode 100644
> index 0000000000..52ce52ffbd
> --- /dev/null
> +++ b/docs/misc/cache-coloring.rst
> @@ -0,0 +1,125 @@
> +Xen cache coloring user guide
> +=============================
> +
> +The cache coloring support in Xen allows to reserve Last Level Cache (LLC)
> +partitions for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
> +Cache coloring realizes per-set cache partitioning in software and is applicable
> +to shared LLCs as implemented in Cortex-A53, Cortex-A72 and similar CPUs.
> +
> +To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
> +
> +If needed, change the maximum number of colors with
> +``CONFIG_NR_LLC_COLORS=<n>``.
> +
> +Runtime configuration is done via `Command line parameters`_.
> +
> +Background
> +**********
> +
> +Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
> +to each core (hence using multiple cache units), while the last level is shared
> +among all of them. Such configuration implies that memory operations on one
> +core (e.g. running a DomU) are able to generate interference on another core
> +(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
> +in software and mitigates this, guaranteeing higher and more predictable
> +performances for memory accesses.
> +Software-based cache coloring is particularly useful in those situations where
> +no hardware mechanisms (e.g., DSU-based way partitioning) are available to
> +partition caches. This is the case for e.g., Cortex-A53, A57 and A72 CPUs that
> +feature a L2 LLC cache shared among all cores.
> +
> +The key concept underlying cache coloring is a fragmentation of the memory
> +space into a set of sub-spaces called colors that are mapped to disjoint cache
> +partitions. Technically, the whole memory space is first divided into a number
> +of subsequent regions. Then each region is in turn divided into a number of
> +subsequent sub-colors. The generic i-th color is then obtained by all the
> +i-th sub-colors in each region.
> +
> +::
> +
> +                            Region j            Region j+1
> +                .....................   ............
> +                .                     . .
> +                .                       .
> +            _ _ _______________ _ _____________________ _ _
> +                |     |     |     |     |     |     |
> +                | c_0 | c_1 |     | c_n | c_0 | c_1 |
> +           _ _ _|_____|_____|_ _ _|_____|_____|_____|_ _ _
> +                    :                       :
> +                    :                       :...         ... .
> +                    :                            color 0
> +                    :...........................         ... .
> +                                                :
> +          . . ..................................:
> +
> +How colors are actually defined depends on the function that maps memory to
> +cache lines. In case of physically-indexed, physically-tagged caches with linear
> +mapping, the set index is found by extracting some contiguous bits from the
> +physical address. This allows colors to be defined as shown in figure: they
> +appear in memory as subsequent blocks of equal size and repeats themselves after
> +``n`` different colors, where ``n`` is the total number of colors.
> +
> +If some kind of bit shuffling appears in the mapping function, then colors
> +assume a different layout in memory. Those kind of caches aren't supported by
> +the current implementation.
> +
> +**Note**: Finding the exact cache mapping function can be a really difficult
> +task since it's not always documented in the CPU manual. As said Cortex-A53, A57
> +and A72 are known to work with the current implementation.
> +
> +How to compute the number of colors
> +###################################
> +
> +Given the linear mapping from physical memory to cache lines for granted, the
> +number of available colors for a specific platform is computed using three
> +parameters:
> +
> +- the size of the LLC.
> +- the number of the LLC ways.
> +- the page size used by Xen.
> +
> +The first two parameters can be found in the processor manual, while the third
> +one is the minimum mapping granularity. Dividing the cache size by the number of
> +its ways we obtain the size of a way. Dividing this number by the page size,
> +the number of total cache colors is found. So for example an Arm Cortex-A53
> +with a 16-ways associative 1 MiB LLC can isolate up to 16 colors when pages are
> +4 KiB in size.
> +
> +LLC size and number of ways are probed automatically by default so there's
> +should be no need to compute the number of colors by yourself.
> +
> +Effective colors assignment
> +###########################
> +
> +When assigning colors:
> +
> +1. If one wants to avoid cache interference between two domains, different
> +   colors needs to be used for their memory.
> +
> +2. To improve spatial locality, color assignment should privilege continuity in
> +   the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
> +   domain J is better than assigning colors (0,2) to I and (1,3) to J.
> +
> +Command line parameters
> +***********************
> +
> +Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
> +
> ++----------------------+-------------------------------+
> +| **Parameter**        | **Description**               |
> ++----------------------+-------------------------------+
> +| ``llc-coloring``     | enable coloring at runtime    |
> ++----------------------+-------------------------------+
> +| ``llc-size``         | set the LLC size              |
> ++----------------------+-------------------------------+
> +| ``llc-nr-ways``      | set the LLC number of ways    |
> ++----------------------+-------------------------------+
> +
> +Auto-probing of LLC specs
> +#########################
> +
> +LLC size and number of ways are probed automatically by default.
> +
> +LLC specs can be manually set via the above command line parameters. This
> +bypasses any auto-probing and it's used to overcome failing situations or for
> +debugging/testing purposes.
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index 54edbc0fbc..2936abea2c 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
>  record with other registers.
>
> +### llc-coloring
> +> `= <boolean>`
> +
> +> Default: `false`
> +
> +Flag to enable or disable LLC coloring support at runtime. This option is
> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> +cache coloring documentation for more info.
> +
> +### llc-nr-ways
> +> `= <integer>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the number of ways of the Last Level Cache. This option is available
> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
> +to find the number of supported cache colors. By default the value is
> +automatically computed by probing the hardware, but in case of specific needs,
> +it can be manually set. Those include failing probing and debugging/testing
> +purposes so that it's possibile to emulate platforms with different number of
> +supported colors. If set, also "llc-size" must be set, otherwise the default
> +will be used.
> +
> +### llc-size
> +> `= <size>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the size of the Last Level Cache. This option is available only when
> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
> +the number of supported cache colors. By default the value is automatically
> +computed by probing the hardware, but in case of specific needs, it can be
> +manually set. Those include failing probing and debugging/testing purposes so
> +that it's possibile to emulate platforms with different number of supported
> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
> +used.
> +
>  ### lock-depth-size
>  > `= <integer>`
>
> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
> index 67ba38f32f..a65c38e53e 100644
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -31,3 +31,23 @@ config NR_NUMA_NODES
>           associated with multiple-nodes management. It is the upper bound of
>           the number of NUMA nodes that the scheduler, memory allocation and
>           other NUMA-aware components can handle.
> +
> +config LLC_COLORING
> +       bool "Last Level Cache (LLC) coloring" if EXPERT
> +       depends on HAS_LLC_COLORING
> +       depends on !NUMA
> +
> +config NR_LLC_COLORS
> +       int "Maximum number of LLC colors"
> +       range 2 1024
> +       default 128
> +       depends on LLC_COLORING
> +       help
> +         Controls the build-time size of various arrays associated with LLC
> +         coloring. Refer to cache coloring documentation for how to compute the
> +         number of colors supported by the platform. This is only an upper
> +         bound. The runtime value is autocomputed or manually set via cmdline.
> +         The default value corresponds to an 8 MiB 16-ways LLC, which should be
> +         more than what's needed in the general case. Use only power of 2 values.
> +         1024 is the number of colors that fit in a 4 KiB page when integers are 4
> +         bytes long.
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index a5c3d5a6bf..1e467178bd 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -71,6 +71,9 @@ config HAS_IOPORTS
>  config HAS_KEXEC
>         bool
>
> +config HAS_LLC_COLORING
> +       bool
> +
>  config HAS_PMAP
>         bool
>
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index e5eee19a85..3054254a7d 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -23,6 +23,7 @@ obj-y += keyhandler.o
>  obj-$(CONFIG_KEXEC) += kexec.o
>  obj-$(CONFIG_KEXEC) += kimage.o
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o livepatch_elf.o
> +obj-$(CONFIG_LLC_COLORING) += llc-coloring.o
>  obj-$(CONFIG_MEM_ACCESS) += mem_access.o
>  obj-y += memory.o
>  obj-y += multicall.o
> diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
> index 127ca50696..778f93e063 100644
> --- a/xen/common/keyhandler.c
> +++ b/xen/common/keyhandler.c
> @@ -5,6 +5,7 @@
>  #include <asm/regs.h>
>  #include <xen/delay.h>
>  #include <xen/keyhandler.h>
> +#include <xen/llc-coloring.h>
>  #include <xen/param.h>
>  #include <xen/shutdown.h>
>  #include <xen/event.h>
> @@ -303,6 +304,8 @@ static void cf_check dump_domains(unsigned char key)
>
>          arch_dump_domain_info(d);
>
> +        domain_dump_llc_colors(d);
> +
>          rangeset_domain_printk(d);
>
>          dump_pageframe_info(d);
> diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
> new file mode 100644
> index 0000000000..db96a83ddd
> --- /dev/null
> +++ b/xen/common/llc-coloring.c
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Last Level Cache (LLC) coloring common code
> + *
> + * Copyright (C) 2022 Xilinx Inc.
> + */
> +#include <xen/keyhandler.h>
> +#include <xen/llc-coloring.h>
> +#include <xen/param.h>
> +
> +static bool __ro_after_init llc_coloring_enabled;
> +boolean_param("llc-coloring", llc_coloring_enabled);
> +
> +static unsigned int __initdata llc_size;
> +size_param("llc-size", llc_size);
> +static unsigned int __initdata llc_nr_ways;
> +integer_param("llc-nr-ways", llc_nr_ways);
> +/* Number of colors available in the LLC */
> +static unsigned int __ro_after_init max_nr_colors;
> +
> +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> +{
> +    unsigned int i;
> +
> +    printk("{ ");
> +    for ( i = 0; i < num_colors; i++ )
> +    {
> +        unsigned int start = colors[i], end = start;
> +
> +        printk("%u", start);
> +
> +        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
> +            ;
> +
> +        if ( start != end )
> +            printk("-%u", end);
> +
> +        if ( i < num_colors - 1 )
> +            printk(", ");
> +    }
> +    printk(" }\n");
> +}
> +
> +void __init llc_coloring_init(void)
> +{
> +    unsigned int way_size;
> +
> +    if ( !llc_coloring_enabled )
> +        return;
> +
> +    if ( llc_size && llc_nr_ways )
> +        way_size = llc_size / llc_nr_ways;
> +    else
> +    {
> +        way_size = get_llc_way_size();
> +        if ( !way_size )
> +            panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
> +    }
> +
> +    /*
> +     * The maximum number of colors must be a power of 2 in order to correctly
> +     * map them to bits of an address.
> +     */
> +    max_nr_colors = way_size >> PAGE_SHIFT;
> +
> +    if ( max_nr_colors & (max_nr_colors - 1) )
> +        panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
> +
> +    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> +        panic("Number of LLC colors (%u) not in range [2, %u]\n",
> +              max_nr_colors, CONFIG_NR_LLC_COLORS);
> +
> +    arch_llc_coloring_init();
> +}
> +
> +void cf_check dump_llc_coloring_info(void)
> +{
> +    if ( !llc_coloring_enabled )
> +        return;
> +
> +    printk("LLC coloring info:\n");
> +    printk("    Number of LLC colors supported: %u\n", max_nr_colors);
> +}
> +
> +void cf_check domain_dump_llc_colors(const struct domain *d)
> +{
> +    if ( !llc_coloring_enabled )
> +        return;
> +
> +    printk("%u LLC colors: ", d->num_llc_colors);
> +    print_colors(d->llc_colors, d->num_llc_colors);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 2ec17df9b4..c38edb9a58 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -126,6 +126,7 @@
>  #include <xen/irq.h>
>  #include <xen/keyhandler.h>
>  #include <xen/lib.h>
> +#include <xen/llc-coloring.h>
>  #include <xen/mm.h>
>  #include <xen/nodemask.h>
>  #include <xen/numa.h>
> @@ -2623,6 +2624,8 @@ static void cf_check pagealloc_info(unsigned char key)
>      }
>
>      printk("    Dom heap: %lukB free\n", total << (PAGE_SHIFT-10));
> +
> +    dump_llc_coloring_info();
>  }
>
>  static __init int cf_check pagealloc_keyhandler_init(void)
> diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
> new file mode 100644
> index 0000000000..c60c8050c5
> --- /dev/null
> +++ b/xen/include/xen/llc-coloring.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Last Level Cache (LLC) coloring common header
> + *
> + * Copyright (C) 2022 Xilinx Inc.
> + */
> +#ifndef __COLORING_H__
> +#define __COLORING_H__
> +
> +#include <xen/sched.h>
> +#include <public/domctl.h>
> +
> +#ifdef CONFIG_LLC_COLORING
> +void llc_coloring_init(void);
> +void dump_llc_coloring_info(void);
> +void domain_dump_llc_colors(const struct domain *d);
> +#else
> +static inline void llc_coloring_init(void) {}
> +static inline void dump_llc_coloring_info(void) {}
> +static inline void domain_dump_llc_colors(const struct domain *d) {}
> +#endif
> +
> +unsigned int get_llc_way_size(void);
> +void arch_llc_coloring_init(void);
> +
> +#endif /* __COLORING_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 37f5922f32..96cc934fc3 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -627,6 +627,11 @@ struct domain
>
>      /* Holding CDF_* constant. Internal flags for domain creation. */
>      unsigned int cdf;
> +
> +#ifdef CONFIG_LLC_COLORING
> +    unsigned int num_llc_colors;
> +    const unsigned int *llc_colors;
> +#endif
>  };
>
>  static inline struct page_list_head *page_to_list(
> --
> 2.34.1
>
Jan Beulich March 19, 2024, 2:58 p.m. UTC | #2
On 15.03.2024 11:58, Carlo Nonato wrote:
> +Background
> +**********
> +
> +Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
> +to each core (hence using multiple cache units), while the last level is shared
> +among all of them. Such configuration implies that memory operations on one
> +core (e.g. running a DomU) are able to generate interference on another core
> +(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
> +in software and mitigates this, guaranteeing higher and more predictable
> +performances for memory accesses.

Are you sure about "higher"? On an otherwise idle system, a single domain (or
vCPU) may perform better when not partitioned, as more cache would be available
to it overall.

> +How to compute the number of colors
> +###################################
> +
> +Given the linear mapping from physical memory to cache lines for granted, the
> +number of available colors for a specific platform is computed using three
> +parameters:
> +
> +- the size of the LLC.
> +- the number of the LLC ways.
> +- the page size used by Xen.
> +
> +The first two parameters can be found in the processor manual, while the third
> +one is the minimum mapping granularity. Dividing the cache size by the number of
> +its ways we obtain the size of a way. Dividing this number by the page size,
> +the number of total cache colors is found. So for example an Arm Cortex-A53
> +with a 16-ways associative 1 MiB LLC can isolate up to 16 colors when pages are
> +4 KiB in size.
> +
> +LLC size and number of ways are probed automatically by default so there's
> +should be no need to compute the number of colors by yourself.

Is this a leftover from the earlier (single) command line option?

> +Effective colors assignment
> +###########################
> +
> +When assigning colors:
> +
> +1. If one wants to avoid cache interference between two domains, different
> +   colors needs to be used for their memory.
> +
> +2. To improve spatial locality, color assignment should privilege continuity in

s/privilege/prefer/ ?

> +   the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
> +   domain J is better than assigning colors (0,2) to I and (1,3) to J.

While I consider 1 obvious without further explanation, the same isn't
the case for 2: What's the benefit of spatial locality? If there was
support for allocating higher order pages, I could certainly see the
point, but iirc that isn't supported (yet).

> +Command line parameters
> +***********************
> +
> +Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
> +
> ++----------------------+-------------------------------+
> +| **Parameter**        | **Description**               |
> ++----------------------+-------------------------------+
> +| ``llc-coloring``     | enable coloring at runtime    |
> ++----------------------+-------------------------------+
> +| ``llc-size``         | set the LLC size              |
> ++----------------------+-------------------------------+
> +| ``llc-nr-ways``      | set the LLC number of ways    |
> ++----------------------+-------------------------------+
> +
> +Auto-probing of LLC specs
> +#########################
> +
> +LLC size and number of ways are probed automatically by default.
> +
> +LLC specs can be manually set via the above command line parameters. This
> +bypasses any auto-probing and it's used to overcome failing situations or for
> +debugging/testing purposes.

As well as perhaps for cases where the auto-probing logic is flawed?

> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
>  record with other registers.
>  
> +### llc-coloring
> +> `= <boolean>`
> +
> +> Default: `false`
> +
> +Flag to enable or disable LLC coloring support at runtime. This option is
> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> +cache coloring documentation for more info.
> +
> +### llc-nr-ways
> +> `= <integer>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the number of ways of the Last Level Cache. This option is available
> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
> +to find the number of supported cache colors. By default the value is
> +automatically computed by probing the hardware, but in case of specific needs,
> +it can be manually set. Those include failing probing and debugging/testing
> +purposes so that it's possibile to emulate platforms with different number of
> +supported colors. If set, also "llc-size" must be set, otherwise the default
> +will be used.
> +
> +### llc-size
> +> `= <size>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the size of the Last Level Cache. This option is available only when
> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
> +the number of supported cache colors. By default the value is automatically
> +computed by probing the hardware, but in case of specific needs, it can be
> +manually set. Those include failing probing and debugging/testing purposes so
> +that it's possibile to emulate platforms with different number of supported
> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
> +used.

Wouldn't it make sense to infer "llc-coloring" when both of the latter options
were supplied?

> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -31,3 +31,23 @@ config NR_NUMA_NODES
>  	  associated with multiple-nodes management. It is the upper bound of
>  	  the number of NUMA nodes that the scheduler, memory allocation and
>  	  other NUMA-aware components can handle.
> +
> +config LLC_COLORING
> +	bool "Last Level Cache (LLC) coloring" if EXPERT
> +	depends on HAS_LLC_COLORING
> +	depends on !NUMA
> +
> +config NR_LLC_COLORS
> +	int "Maximum number of LLC colors"
> +	range 2 1024
> +	default 128
> +	depends on LLC_COLORING
> +	help
> +	  Controls the build-time size of various arrays associated with LLC
> +	  coloring. Refer to cache coloring documentation for how to compute the
> +	  number of colors supported by the platform. This is only an upper
> +	  bound. The runtime value is autocomputed or manually set via cmdline.
> +	  The default value corresponds to an 8 MiB 16-ways LLC, which should be
> +	  more than what's needed in the general case. Use only power of 2 values.

I think I said so before: Rather than telling people to pick only power-of-2
values (and it remaining unclear what happens if they don't), why don't you
simply keep them from specifying anything bogus, by having them pass in the
value to use as a power of 2? I.e. "range 1 10" and "default 7" for what
you're currently putting in place.

> +	  1024 is the number of colors that fit in a 4 KiB page when integers are 4
> +	  bytes long.

How's this relevant here? As a justification it would make sense to have in
the description.

I'm btw also not convinced this is a good place to put these options. Imo ...

> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -71,6 +71,9 @@ config HAS_IOPORTS
>  config HAS_KEXEC
>  	bool
>  
> +config HAS_LLC_COLORING
> +	bool
> +
>  config HAS_PMAP
>  	bool

... they'd better live further down from here.

> --- /dev/null
> +++ b/xen/common/llc-coloring.c
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Last Level Cache (LLC) coloring common code
> + *
> + * Copyright (C) 2022 Xilinx Inc.
> + */
> +#include <xen/keyhandler.h>
> +#include <xen/llc-coloring.h>
> +#include <xen/param.h>
> +
> +static bool __ro_after_init llc_coloring_enabled;
> +boolean_param("llc-coloring", llc_coloring_enabled);
> +
> +static unsigned int __initdata llc_size;
> +size_param("llc-size", llc_size);
> +static unsigned int __initdata llc_nr_ways;
> +integer_param("llc-nr-ways", llc_nr_ways);
> +/* Number of colors available in the LLC */
> +static unsigned int __ro_after_init max_nr_colors;
> +
> +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> +{
> +    unsigned int i;
> +
> +    printk("{ ");
> +    for ( i = 0; i < num_colors; i++ )
> +    {
> +        unsigned int start = colors[i], end = start;
> +
> +        printk("%u", start);
> +
> +        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
> +            ;
> +
> +        if ( start != end )
> +            printk("-%u", end);
> +
> +        if ( i < num_colors - 1 )
> +            printk(", ");
> +    }
> +    printk(" }\n");
> +}
> +
> +void __init llc_coloring_init(void)
> +{
> +    unsigned int way_size;
> +
> +    if ( !llc_coloring_enabled )
> +        return;
> +
> +    if ( llc_size && llc_nr_ways )
> +        way_size = llc_size / llc_nr_ways;
> +    else
> +    {
> +        way_size = get_llc_way_size();
> +        if ( !way_size )
> +            panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
> +    }
> +
> +    /*
> +     * The maximum number of colors must be a power of 2 in order to correctly
> +     * map them to bits of an address.
> +     */
> +    max_nr_colors = way_size >> PAGE_SHIFT;
> +
> +    if ( max_nr_colors & (max_nr_colors - 1) )
> +        panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
> +
> +    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> +        panic("Number of LLC colors (%u) not in range [2, %u]\n",
> +              max_nr_colors, CONFIG_NR_LLC_COLORS);

Rather than crashing when max_nr_colors is too large, couldn't you simply
halve it a number of times? That would still satisfy the requirement on
isolation, wouldn't it?

> +    arch_llc_coloring_init();
> +}
> +
> +void cf_check dump_llc_coloring_info(void)

I don't think cf_check is needed here nor ...

> +void cf_check domain_dump_llc_colors(const struct domain *d)

... here anymore. You're using direct calls now.

Jan
Carlo Nonato March 21, 2024, 3:03 p.m. UTC | #3
Hi Jan,

(adding Andrea Bastoni in cc)

On Tue, Mar 19, 2024 at 3:58 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.03.2024 11:58, Carlo Nonato wrote:
> > +Background
> > +**********
> > +
> > +Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
> > +to each core (hence using multiple cache units), while the last level is shared
> > +among all of them. Such configuration implies that memory operations on one
> > +core (e.g. running a DomU) are able to generate interference on another core
> > +(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
> > +in software and mitigates this, guaranteeing higher and more predictable
> > +performances for memory accesses.
>
> Are you sure about "higher"? On an otherwise idle system, a single domain (or
> vCPU) may perform better when not partitioned, as more cache would be available
> to it overall.

I'll drop "higher" and leave the rest.

> > +How to compute the number of colors
> > +###################################
> > +
> > +Given the linear mapping from physical memory to cache lines for granted, the
> > +number of available colors for a specific platform is computed using three
> > +parameters:
> > +
> > +- the size of the LLC.
> > +- the number of the LLC ways.
> > +- the page size used by Xen.
> > +
> > +The first two parameters can be found in the processor manual, while the third
> > +one is the minimum mapping granularity. Dividing the cache size by the number of
> > +its ways we obtain the size of a way. Dividing this number by the page size,
> > +the number of total cache colors is found. So for example an Arm Cortex-A53
> > +with a 16-ways associative 1 MiB LLC can isolate up to 16 colors when pages are
> > +4 KiB in size.
> > +
> > +LLC size and number of ways are probed automatically by default so there's
> > +should be no need to compute the number of colors by yourself.
>
> Is this a leftover from the earlier (single) command line option?

Nope, but I can drop it since it's already stated below.

> > +Effective colors assignment
> > +###########################
> > +
> > +When assigning colors:
> > +
> > +1. If one wants to avoid cache interference between two domains, different
> > +   colors needs to be used for their memory.
> > +
> > +2. To improve spatial locality, color assignment should privilege continuity in
>
> s/privilege/prefer/ ?
>
> > +   the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
> > +   domain J is better than assigning colors (0,2) to I and (1,3) to J.
>
> While I consider 1 obvious without further explanation, the same isn't
> the case for 2: What's the benefit of spatial locality? If there was
> support for allocating higher order pages, I could certainly see the
> point, but iirc that isn't supported (yet).

I'll drop point 2.

> > +Command line parameters
> > +***********************
> > +
> > +Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
> > +
> > ++----------------------+-------------------------------+
> > +| **Parameter**        | **Description**               |
> > ++----------------------+-------------------------------+
> > +| ``llc-coloring``     | enable coloring at runtime    |
> > ++----------------------+-------------------------------+
> > +| ``llc-size``         | set the LLC size              |
> > ++----------------------+-------------------------------+
> > +| ``llc-nr-ways``      | set the LLC number of ways    |
> > ++----------------------+-------------------------------+
> > +
> > +Auto-probing of LLC specs
> > +#########################
> > +
> > +LLC size and number of ways are probed automatically by default.
> > +
> > +LLC specs can be manually set via the above command line parameters. This
> > +bypasses any auto-probing and it's used to overcome failing situations or for
> > +debugging/testing purposes.
>
> As well as perhaps for cases where the auto-probing logic is flawed?

This is what I meant with "overcome failing situations", but I'll be more
explicit.

> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
> >  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
> >  record with other registers.
> >
> > +### llc-coloring
> > +> `= <boolean>`
> > +
> > +> Default: `false`
> > +
> > +Flag to enable or disable LLC coloring support at runtime. This option is
> > +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> > +cache coloring documentation for more info.
> > +
> > +### llc-nr-ways
> > +> `= <integer>`
> > +
> > +> Default: `Obtained from hardware`
> > +
> > +Specify the number of ways of the Last Level Cache. This option is available
> > +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
> > +to find the number of supported cache colors. By default the value is
> > +automatically computed by probing the hardware, but in case of specific needs,
> > +it can be manually set. Those include failing probing and debugging/testing
> > +purposes so that it's possibile to emulate platforms with different number of
> > +supported colors. If set, also "llc-size" must be set, otherwise the default
> > +will be used.
> > +
> > +### llc-size
> > +> `= <size>`
> > +
> > +> Default: `Obtained from hardware`
> > +
> > +Specify the size of the Last Level Cache. This option is available only when
> > +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
> > +the number of supported cache colors. By default the value is automatically
> > +computed by probing the hardware, but in case of specific needs, it can be
> > +manually set. Those include failing probing and debugging/testing purposes so
> > +that it's possibile to emulate platforms with different number of supported
> > +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
> > +used.
>
> Wouldn't it make sense to infer "llc-coloring" when both of the latter options
> were supplied?

To me it looks a bit strange that specifying some attributes of the cache
automatically enables cache coloring. Also it would require some changes in
how to express the auto-probing for such attributes.

> > --- a/xen/arch/Kconfig
> > +++ b/xen/arch/Kconfig
> > @@ -31,3 +31,23 @@ config NR_NUMA_NODES
> >         associated with multiple-nodes management. It is the upper bound of
> >         the number of NUMA nodes that the scheduler, memory allocation and
> >         other NUMA-aware components can handle.
> > +
> > +config LLC_COLORING
> > +     bool "Last Level Cache (LLC) coloring" if EXPERT
> > +     depends on HAS_LLC_COLORING
> > +     depends on !NUMA
> > +
> > +config NR_LLC_COLORS
> > +     int "Maximum number of LLC colors"
> > +     range 2 1024
> > +     default 128
> > +     depends on LLC_COLORING
> > +     help
> > +       Controls the build-time size of various arrays associated with LLC
> > +       coloring. Refer to cache coloring documentation for how to compute the
> > +       number of colors supported by the platform. This is only an upper
> > +       bound. The runtime value is autocomputed or manually set via cmdline.
> > +       The default value corresponds to an 8 MiB 16-ways LLC, which should be
> > +       more than what's needed in the general case. Use only power of 2 values.
>
> I think I said so before: Rather than telling people to pick only power-of-2
> values (and it remaining unclear what happens if they don't), why don't you
> simply keep them from specifying anything bogus, by having them pass in the
> value to use as a power of 2? I.e. "range 1 10" and "default 7" for what
> you're currently putting in place.

I'll do that.

> > +       1024 is the number of colors that fit in a 4 KiB page when integers are 4
> > +       bytes long.
>
> How's this relevant here? As a justification it would make sense to have in
> the description.

I'll move it.

> I'm btw also not convinced this is a good place to put these options. Imo ...
>
> > --- a/xen/common/Kconfig
> > +++ b/xen/common/Kconfig
> > @@ -71,6 +71,9 @@ config HAS_IOPORTS
> >  config HAS_KEXEC
> >       bool
> >
> > +config HAS_LLC_COLORING
> > +     bool
> > +
> >  config HAS_PMAP
> >       bool
>
> ... they'd better live further down from here.

Ok.

> > --- /dev/null
> > +++ b/xen/common/llc-coloring.c
> > @@ -0,0 +1,102 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Last Level Cache (LLC) coloring common code
> > + *
> > + * Copyright (C) 2022 Xilinx Inc.
> > + */
> > +#include <xen/keyhandler.h>
> > +#include <xen/llc-coloring.h>
> > +#include <xen/param.h>
> > +
> > +static bool __ro_after_init llc_coloring_enabled;
> > +boolean_param("llc-coloring", llc_coloring_enabled);
> > +
> > +static unsigned int __initdata llc_size;
> > +size_param("llc-size", llc_size);
> > +static unsigned int __initdata llc_nr_ways;
> > +integer_param("llc-nr-ways", llc_nr_ways);
> > +/* Number of colors available in the LLC */
> > +static unsigned int __ro_after_init max_nr_colors;
> > +
> > +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> > +{
> > +    unsigned int i;
> > +
> > +    printk("{ ");
> > +    for ( i = 0; i < num_colors; i++ )
> > +    {
> > +        unsigned int start = colors[i], end = start;
> > +
> > +        printk("%u", start);
> > +
> > +        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
> > +            ;
> > +
> > +        if ( start != end )
> > +            printk("-%u", end);
> > +
> > +        if ( i < num_colors - 1 )
> > +            printk(", ");
> > +    }
> > +    printk(" }\n");
> > +}
> > +
> > +void __init llc_coloring_init(void)
> > +{
> > +    unsigned int way_size;
> > +
> > +    if ( !llc_coloring_enabled )
> > +        return;
> > +
> > +    if ( llc_size && llc_nr_ways )
> > +        way_size = llc_size / llc_nr_ways;
> > +    else
> > +    {
> > +        way_size = get_llc_way_size();
> > +        if ( !way_size )
> > +            panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
> > +    }
> > +
> > +    /*
> > +     * The maximum number of colors must be a power of 2 in order to correctly
> > +     * map them to bits of an address.
> > +     */
> > +    max_nr_colors = way_size >> PAGE_SHIFT;
> > +
> > +    if ( max_nr_colors & (max_nr_colors - 1) )
> > +        panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
> > +
> > +    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> > +        panic("Number of LLC colors (%u) not in range [2, %u]\n",
> > +              max_nr_colors, CONFIG_NR_LLC_COLORS);
>
> Rather than crashing when max_nr_colors is too large, couldn't you simply
> halve it a number of times? That would still satisfy the requirement on
> isolation, wouldn't it?

Well I could simply set it to CONFIG_NR_LLC_COLORS at this point.

> > +    arch_llc_coloring_init();
> > +}
> > +
> > +void cf_check dump_llc_coloring_info(void)
>
> I don't think cf_check is needed here nor ...
>
> > +void cf_check domain_dump_llc_colors(const struct domain *d)
>
> ... here anymore. You're using direct calls now.

Ok.

> Jan

Thanks.
Jan Beulich March 21, 2024, 3:53 p.m. UTC | #4
On 21.03.2024 16:03, Carlo Nonato wrote:
> On Tue, Mar 19, 2024 at 3:58 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
>>>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
>>>  record with other registers.
>>>
>>> +### llc-coloring
>>> +> `= <boolean>`
>>> +
>>> +> Default: `false`
>>> +
>>> +Flag to enable or disable LLC coloring support at runtime. This option is
>>> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
>>> +cache coloring documentation for more info.
>>> +
>>> +### llc-nr-ways
>>> +> `= <integer>`
>>> +
>>> +> Default: `Obtained from hardware`
>>> +
>>> +Specify the number of ways of the Last Level Cache. This option is available
>>> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
>>> +to find the number of supported cache colors. By default the value is
>>> +automatically computed by probing the hardware, but in case of specific needs,
>>> +it can be manually set. Those include failing probing and debugging/testing
>>> +purposes so that it's possibile to emulate platforms with different number of
>>> +supported colors. If set, also "llc-size" must be set, otherwise the default
>>> +will be used.
>>> +
>>> +### llc-size
>>> +> `= <size>`
>>> +
>>> +> Default: `Obtained from hardware`
>>> +
>>> +Specify the size of the Last Level Cache. This option is available only when
>>> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
>>> +the number of supported cache colors. By default the value is automatically
>>> +computed by probing the hardware, but in case of specific needs, it can be
>>> +manually set. Those include failing probing and debugging/testing purposes so
>>> +that it's possibile to emulate platforms with different number of supported
>>> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
>>> +used.
>>
>> Wouldn't it make sense to infer "llc-coloring" when both of the latter options
>> were supplied?
> 
> To me it looks a bit strange that specifying some attributes of the cache
> automatically enables cache coloring. Also it would require some changes in
> how to express the auto-probing for such attributes.

Whereas to me it looks strange that, when having llc-size and llc-nr-ways
provided, I'd need to add a 3rd option. What purpose other than enabling
coloring could there be when specifying those parameters?

Jan
Carlo Nonato March 21, 2024, 5:22 p.m. UTC | #5
Hi Jan,

On Thu, Mar 21, 2024 at 4:53 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.03.2024 16:03, Carlo Nonato wrote:
> > On Tue, Mar 19, 2024 at 3:58 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 15.03.2024 11:58, Carlo Nonato wrote:
> >>> --- a/docs/misc/xen-command-line.pandoc
> >>> +++ b/docs/misc/xen-command-line.pandoc
> >>> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
> >>>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
> >>>  record with other registers.
> >>>
> >>> +### llc-coloring
> >>> +> `= <boolean>`
> >>> +
> >>> +> Default: `false`
> >>> +
> >>> +Flag to enable or disable LLC coloring support at runtime. This option is
> >>> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> >>> +cache coloring documentation for more info.
> >>> +
> >>> +### llc-nr-ways
> >>> +> `= <integer>`
> >>> +
> >>> +> Default: `Obtained from hardware`
> >>> +
> >>> +Specify the number of ways of the Last Level Cache. This option is available
> >>> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
> >>> +to find the number of supported cache colors. By default the value is
> >>> +automatically computed by probing the hardware, but in case of specific needs,
> >>> +it can be manually set. Those include failing probing and debugging/testing
> >>> +purposes so that it's possibile to emulate platforms with different number of
> >>> +supported colors. If set, also "llc-size" must be set, otherwise the default
> >>> +will be used.
> >>> +
> >>> +### llc-size
> >>> +> `= <size>`
> >>> +
> >>> +> Default: `Obtained from hardware`
> >>> +
> >>> +Specify the size of the Last Level Cache. This option is available only when
> >>> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
> >>> +the number of supported cache colors. By default the value is automatically
> >>> +computed by probing the hardware, but in case of specific needs, it can be
> >>> +manually set. Those include failing probing and debugging/testing purposes so
> >>> +that it's possibile to emulate platforms with different number of supported
> >>> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
> >>> +used.
> >>
> >> Wouldn't it make sense to infer "llc-coloring" when both of the latter options
> >> were supplied?
> >
> > To me it looks a bit strange that specifying some attributes of the cache
> > automatically enables cache coloring. Also it would require some changes in
> > how to express the auto-probing for such attributes.
>
> Whereas to me it looks strange that, when having llc-size and llc-nr-ways
> provided, I'd need to add a 3rd option. What purpose other than enabling
> coloring could there be when specifying those parameters?

Ok, I probably misunderstood you. You mean just to assume llc-coloring=on
when both llc-size and llc-nr-ways are present and not to remove
llc-coloring completely, right? I'm ok with this.

> Jan

Thanks.
Jan Beulich March 22, 2024, 7:25 a.m. UTC | #6
On 21.03.2024 18:22, Carlo Nonato wrote:
> Hi Jan,
> 
> On Thu, Mar 21, 2024 at 4:53 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 21.03.2024 16:03, Carlo Nonato wrote:
>>> On Tue, Mar 19, 2024 at 3:58 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>>>> --- a/docs/misc/xen-command-line.pandoc
>>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>>> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
>>>>>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
>>>>>  record with other registers.
>>>>>
>>>>> +### llc-coloring
>>>>> +> `= <boolean>`
>>>>> +
>>>>> +> Default: `false`
>>>>> +
>>>>> +Flag to enable or disable LLC coloring support at runtime. This option is
>>>>> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
>>>>> +cache coloring documentation for more info.
>>>>> +
>>>>> +### llc-nr-ways
>>>>> +> `= <integer>`
>>>>> +
>>>>> +> Default: `Obtained from hardware`
>>>>> +
>>>>> +Specify the number of ways of the Last Level Cache. This option is available
>>>>> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
>>>>> +to find the number of supported cache colors. By default the value is
>>>>> +automatically computed by probing the hardware, but in case of specific needs,
>>>>> +it can be manually set. Those include failing probing and debugging/testing
>>>>> +purposes so that it's possibile to emulate platforms with different number of
>>>>> +supported colors. If set, also "llc-size" must be set, otherwise the default
>>>>> +will be used.
>>>>> +
>>>>> +### llc-size
>>>>> +> `= <size>`
>>>>> +
>>>>> +> Default: `Obtained from hardware`
>>>>> +
>>>>> +Specify the size of the Last Level Cache. This option is available only when
>>>>> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
>>>>> +the number of supported cache colors. By default the value is automatically
>>>>> +computed by probing the hardware, but in case of specific needs, it can be
>>>>> +manually set. Those include failing probing and debugging/testing purposes so
>>>>> +that it's possibile to emulate platforms with different number of supported
>>>>> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
>>>>> +used.
>>>>
>>>> Wouldn't it make sense to infer "llc-coloring" when both of the latter options
>>>> were supplied?
>>>
>>> To me it looks a bit strange that specifying some attributes of the cache
>>> automatically enables cache coloring. Also it would require some changes in
>>> how to express the auto-probing for such attributes.
>>
>> Whereas to me it looks strange that, when having llc-size and llc-nr-ways
>> provided, I'd need to add a 3rd option. What purpose other than enabling
>> coloring could there be when specifying those parameters?
> 
> Ok, I probably misunderstood you. You mean just to assume llc-coloring=on
> when both llc-size and llc-nr-ways are present and not to remove
> llc-coloring completely, right?

Yes. The common thing, after all, will be to just have llc-coloring on the
command line.

Jan
diff mbox series

Patch

diff --git a/SUPPORT.md b/SUPPORT.md
index 510bb02190..456abd42bf 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -364,6 +364,13 @@  by maintaining multiple physical to machine (p2m) memory mappings.
     Status, x86 HVM: Tech Preview
     Status, ARM: Tech Preview
 
+### Cache coloring
+
+Allows to reserve Last Level Cache (LLC) partitions for Dom0, DomUs and Xen
+itself.
+
+    Status, Arm64: Experimental
+
 ## Resource Management
 
 ### CPU Pools
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
new file mode 100644
index 0000000000..52ce52ffbd
--- /dev/null
+++ b/docs/misc/cache-coloring.rst
@@ -0,0 +1,125 @@ 
+Xen cache coloring user guide
+=============================
+
+The cache coloring support in Xen allows to reserve Last Level Cache (LLC)
+partitions for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
+Cache coloring realizes per-set cache partitioning in software and is applicable
+to shared LLCs as implemented in Cortex-A53, Cortex-A72 and similar CPUs.
+
+To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
+
+If needed, change the maximum number of colors with
+``CONFIG_NR_LLC_COLORS=<n>``.
+
+Runtime configuration is done via `Command line parameters`_.
+
+Background
+**********
+
+Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
+to each core (hence using multiple cache units), while the last level is shared
+among all of them. Such configuration implies that memory operations on one
+core (e.g. running a DomU) are able to generate interference on another core
+(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
+in software and mitigates this, guaranteeing higher and more predictable
+performances for memory accesses.
+Software-based cache coloring is particularly useful in those situations where
+no hardware mechanisms (e.g., DSU-based way partitioning) are available to
+partition caches. This is the case for e.g., Cortex-A53, A57 and A72 CPUs that
+feature a L2 LLC cache shared among all cores.
+
+The key concept underlying cache coloring is a fragmentation of the memory
+space into a set of sub-spaces called colors that are mapped to disjoint cache
+partitions. Technically, the whole memory space is first divided into a number
+of subsequent regions. Then each region is in turn divided into a number of
+subsequent sub-colors. The generic i-th color is then obtained by all the
+i-th sub-colors in each region.
+
+::
+
+                            Region j            Region j+1
+                .....................   ............
+                .                     . .
+                .                       .
+            _ _ _______________ _ _____________________ _ _
+                |     |     |     |     |     |     |
+                | c_0 | c_1 |     | c_n | c_0 | c_1 |
+           _ _ _|_____|_____|_ _ _|_____|_____|_____|_ _ _
+                    :                       :
+                    :                       :...         ... .
+                    :                            color 0
+                    :...........................         ... .
+                                                :
+          . . ..................................:
+
+How colors are actually defined depends on the function that maps memory to
+cache lines. In case of physically-indexed, physically-tagged caches with linear
+mapping, the set index is found by extracting some contiguous bits from the
+physical address. This allows colors to be defined as shown in figure: they
+appear in memory as subsequent blocks of equal size and repeats themselves after
+``n`` different colors, where ``n`` is the total number of colors.
+
+If some kind of bit shuffling appears in the mapping function, then colors
+assume a different layout in memory. Those kind of caches aren't supported by
+the current implementation.
+
+**Note**: Finding the exact cache mapping function can be a really difficult
+task since it's not always documented in the CPU manual. As said Cortex-A53, A57
+and A72 are known to work with the current implementation.
+
+How to compute the number of colors
+###################################
+
+Given the linear mapping from physical memory to cache lines for granted, the
+number of available colors for a specific platform is computed using three
+parameters:
+
+- the size of the LLC.
+- the number of the LLC ways.
+- the page size used by Xen.
+
+The first two parameters can be found in the processor manual, while the third
+one is the minimum mapping granularity. Dividing the cache size by the number of
+its ways we obtain the size of a way. Dividing this number by the page size,
+the number of total cache colors is found. So for example an Arm Cortex-A53
+with a 16-ways associative 1 MiB LLC can isolate up to 16 colors when pages are
+4 KiB in size.
+
+LLC size and number of ways are probed automatically by default so there's
+should be no need to compute the number of colors by yourself.
+
+Effective colors assignment
+###########################
+
+When assigning colors:
+
+1. If one wants to avoid cache interference between two domains, different
+   colors needs to be used for their memory.
+
+2. To improve spatial locality, color assignment should privilege continuity in
+   the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
+   domain J is better than assigning colors (0,2) to I and (1,3) to J.
+
+Command line parameters
+***********************
+
+Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
+
++----------------------+-------------------------------+
+| **Parameter**        | **Description**               |
++----------------------+-------------------------------+
+| ``llc-coloring``     | enable coloring at runtime    |
++----------------------+-------------------------------+
+| ``llc-size``         | set the LLC size              |
++----------------------+-------------------------------+
+| ``llc-nr-ways``      | set the LLC number of ways    |
++----------------------+-------------------------------+
+
+Auto-probing of LLC specs
+#########################
+
+LLC size and number of ways are probed automatically by default.
+
+LLC specs can be manually set via the above command line parameters. This
+bypasses any auto-probing and it's used to overcome failing situations or for
+debugging/testing purposes.
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 54edbc0fbc..2936abea2c 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1706,6 +1706,43 @@  This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
 in hypervisor context to be able to dump the Last Interrupt/Exception To/From
 record with other registers.
 
+### llc-coloring
+> `= <boolean>`
+
+> Default: `false`
+
+Flag to enable or disable LLC coloring support at runtime. This option is
+available only when `CONFIG_LLC_COLORING` is enabled. See the general
+cache coloring documentation for more info.
+
+### llc-nr-ways
+> `= <integer>`
+
+> Default: `Obtained from hardware`
+
+Specify the number of ways of the Last Level Cache. This option is available
+only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
+to find the number of supported cache colors. By default the value is
+automatically computed by probing the hardware, but in case of specific needs,
+it can be manually set. Those include failing probing and debugging/testing
+purposes so that it's possibile to emulate platforms with different number of
+supported colors. If set, also "llc-size" must be set, otherwise the default
+will be used.
+
+### llc-size
+> `= <size>`
+
+> Default: `Obtained from hardware`
+
+Specify the size of the Last Level Cache. This option is available only when
+`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
+the number of supported cache colors. By default the value is automatically
+computed by probing the hardware, but in case of specific needs, it can be
+manually set. Those include failing probing and debugging/testing purposes so
+that it's possibile to emulate platforms with different number of supported
+colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
+used.
+
 ### lock-depth-size
 > `= <integer>`
 
diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index 67ba38f32f..a65c38e53e 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -31,3 +31,23 @@  config NR_NUMA_NODES
 	  associated with multiple-nodes management. It is the upper bound of
 	  the number of NUMA nodes that the scheduler, memory allocation and
 	  other NUMA-aware components can handle.
+
+config LLC_COLORING
+	bool "Last Level Cache (LLC) coloring" if EXPERT
+	depends on HAS_LLC_COLORING
+	depends on !NUMA
+
+config NR_LLC_COLORS
+	int "Maximum number of LLC colors"
+	range 2 1024
+	default 128
+	depends on LLC_COLORING
+	help
+	  Controls the build-time size of various arrays associated with LLC
+	  coloring. Refer to cache coloring documentation for how to compute the
+	  number of colors supported by the platform. This is only an upper
+	  bound. The runtime value is autocomputed or manually set via cmdline.
+	  The default value corresponds to an 8 MiB 16-ways LLC, which should be
+	  more than what's needed in the general case. Use only power of 2 values.
+	  1024 is the number of colors that fit in a 4 KiB page when integers are 4
+	  bytes long.
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index a5c3d5a6bf..1e467178bd 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -71,6 +71,9 @@  config HAS_IOPORTS
 config HAS_KEXEC
 	bool
 
+config HAS_LLC_COLORING
+	bool
+
 config HAS_PMAP
 	bool
 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index e5eee19a85..3054254a7d 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -23,6 +23,7 @@  obj-y += keyhandler.o
 obj-$(CONFIG_KEXEC) += kexec.o
 obj-$(CONFIG_KEXEC) += kimage.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o livepatch_elf.o
+obj-$(CONFIG_LLC_COLORING) += llc-coloring.o
 obj-$(CONFIG_MEM_ACCESS) += mem_access.o
 obj-y += memory.o
 obj-y += multicall.o
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 127ca50696..778f93e063 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -5,6 +5,7 @@ 
 #include <asm/regs.h>
 #include <xen/delay.h>
 #include <xen/keyhandler.h>
+#include <xen/llc-coloring.h>
 #include <xen/param.h>
 #include <xen/shutdown.h>
 #include <xen/event.h>
@@ -303,6 +304,8 @@  static void cf_check dump_domains(unsigned char key)
 
         arch_dump_domain_info(d);
 
+        domain_dump_llc_colors(d);
+
         rangeset_domain_printk(d);
 
         dump_pageframe_info(d);
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
new file mode 100644
index 0000000000..db96a83ddd
--- /dev/null
+++ b/xen/common/llc-coloring.c
@@ -0,0 +1,102 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Last Level Cache (LLC) coloring common code
+ *
+ * Copyright (C) 2022 Xilinx Inc.
+ */
+#include <xen/keyhandler.h>
+#include <xen/llc-coloring.h>
+#include <xen/param.h>
+
+static bool __ro_after_init llc_coloring_enabled;
+boolean_param("llc-coloring", llc_coloring_enabled);
+
+static unsigned int __initdata llc_size;
+size_param("llc-size", llc_size);
+static unsigned int __initdata llc_nr_ways;
+integer_param("llc-nr-ways", llc_nr_ways);
+/* Number of colors available in the LLC */
+static unsigned int __ro_after_init max_nr_colors;
+
+static void print_colors(const unsigned int *colors, unsigned int num_colors)
+{
+    unsigned int i;
+
+    printk("{ ");
+    for ( i = 0; i < num_colors; i++ )
+    {
+        unsigned int start = colors[i], end = start;
+
+        printk("%u", start);
+
+        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
+            ;
+
+        if ( start != end )
+            printk("-%u", end);
+
+        if ( i < num_colors - 1 )
+            printk(", ");
+    }
+    printk(" }\n");
+}
+
+void __init llc_coloring_init(void)
+{
+    unsigned int way_size;
+
+    if ( !llc_coloring_enabled )
+        return;
+
+    if ( llc_size && llc_nr_ways )
+        way_size = llc_size / llc_nr_ways;
+    else
+    {
+        way_size = get_llc_way_size();
+        if ( !way_size )
+            panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
+    }
+
+    /*
+     * The maximum number of colors must be a power of 2 in order to correctly
+     * map them to bits of an address.
+     */
+    max_nr_colors = way_size >> PAGE_SHIFT;
+
+    if ( max_nr_colors & (max_nr_colors - 1) )
+        panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
+
+    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
+        panic("Number of LLC colors (%u) not in range [2, %u]\n",
+              max_nr_colors, CONFIG_NR_LLC_COLORS);
+
+    arch_llc_coloring_init();
+}
+
+void cf_check dump_llc_coloring_info(void)
+{
+    if ( !llc_coloring_enabled )
+        return;
+
+    printk("LLC coloring info:\n");
+    printk("    Number of LLC colors supported: %u\n", max_nr_colors);
+}
+
+void cf_check domain_dump_llc_colors(const struct domain *d)
+{
+    if ( !llc_coloring_enabled )
+        return;
+
+    printk("%u LLC colors: ", d->num_llc_colors);
+    print_colors(d->llc_colors, d->num_llc_colors);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 2ec17df9b4..c38edb9a58 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -126,6 +126,7 @@ 
 #include <xen/irq.h>
 #include <xen/keyhandler.h>
 #include <xen/lib.h>
+#include <xen/llc-coloring.h>
 #include <xen/mm.h>
 #include <xen/nodemask.h>
 #include <xen/numa.h>
@@ -2623,6 +2624,8 @@  static void cf_check pagealloc_info(unsigned char key)
     }
 
     printk("    Dom heap: %lukB free\n", total << (PAGE_SHIFT-10));
+
+    dump_llc_coloring_info();
 }
 
 static __init int cf_check pagealloc_keyhandler_init(void)
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
new file mode 100644
index 0000000000..c60c8050c5
--- /dev/null
+++ b/xen/include/xen/llc-coloring.h
@@ -0,0 +1,36 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Last Level Cache (LLC) coloring common header
+ *
+ * Copyright (C) 2022 Xilinx Inc.
+ */
+#ifndef __COLORING_H__
+#define __COLORING_H__
+
+#include <xen/sched.h>
+#include <public/domctl.h>
+
+#ifdef CONFIG_LLC_COLORING
+void llc_coloring_init(void);
+void dump_llc_coloring_info(void);
+void domain_dump_llc_colors(const struct domain *d);
+#else
+static inline void llc_coloring_init(void) {}
+static inline void dump_llc_coloring_info(void) {}
+static inline void domain_dump_llc_colors(const struct domain *d) {}
+#endif
+
+unsigned int get_llc_way_size(void);
+void arch_llc_coloring_init(void);
+
+#endif /* __COLORING_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 37f5922f32..96cc934fc3 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -627,6 +627,11 @@  struct domain
 
     /* Holding CDF_* constant. Internal flags for domain creation. */
     unsigned int cdf;
+
+#ifdef CONFIG_LLC_COLORING
+    unsigned int num_llc_colors;
+    const unsigned int *llc_colors;
+#endif
 };
 
 static inline struct page_list_head *page_to_list(