[v3,00/17] Support smp.clusters for x86

Message ID	20230801103527.397756-1-zhao1.liu@linux.intel.com (mailing list archive)
Headers	show Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org> From: Zhao Liu <zhao1.liu@linux.intel.com> To: Eduardo Habkost <eduardo@habkost.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, =?utf-8?q?Philippe_Mathieu-D?= =?utf-8?q?aud=C3=A9?= <philmd@linaro.org>, Yanan Wang <wangyanan55@huawei.com>, "Michael S . Tsirkin" <mst@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com> Cc: qemu-devel@nongnu.org, Zhenyu Wang <zhenyu.z.wang@intel.com>, Xiaoyao Li <xiaoyao.li@intel.com>, Babu Moger <babu.moger@amd.com>, Zhao Liu <zhao1.liu@intel.com> Subject: [PATCH v3 00/17] Support smp.clusters for x86 Date: Tue, 1 Aug 2023 18:35:10 +0800 Message-Id: <20230801103527.397756-1-zhao1.liu@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Received-SPF: none client-ip=192.55.52.88; envelope-from=zhao1.liu@linux.intel.com; helo=mgamail.intel.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_NONE=0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action Precedence: list Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Series	Support smp.clusters for x86 \| expand [v3,00/17] Support smp.clusters for x86 [v3,01/17] i386: Fix comment style in topology.h [v3,02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c [v3,03/17] softmmu: Fix CPUSTATE.nr_cores' calculation [v3,04/17] i386/cpu: Fix i/d-cache topology to core level for Intel CPU [v3,05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4] [v3,06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() [v3,07/17] i386: Introduce module-level cpu topology to CPUX86State [v3,08/17] i386: Support modules_per_die in X86CPUTopoInfo [v3,09/17] i386: Support module_id in X86CPUTopoIDs [v3,10/17] i386/cpu: Introduce cluster-id to X86CPU [v3,11/17] tests: Add test case of APIC ID for module level parsing [v3,12/17] hw/i386/pc: Support smp.clusters for x86 PC machine [v3,13/17] i386: Add cache topology info in CPUCacheInfo [v3,14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4] [v3,15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] [v3,16/17] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14] [v3,17/17] i386: Add new property to control L2 cache topo in CPUID.04H

Zhao Liu Aug. 1, 2023, 10:35 a.m. UTC

From: Zhao Liu <zhao1.liu@intel.com>

Hi list,

This is the our v3 patch series, rebased on the master branch at the
commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
//git.linaro.org/people/pmaydell/qemu-arm into staging").

Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
"ACKed-by" (for PC related patchies) tags and minor code changes (Pls
see changelog).


# Introduction

This series add the cluster support for x86 PC machine, which allows
x86 can use smp.clusters to configure x86 modlue level CPU topology.

And since the compatibility issue (see section: ## Why not share L2
cache in cluster directly), this series also introduce a new command
to adjust the x86 L2 cache topology.

Welcome your comments!


# Backgroud

The "clusters" parameter in "smp" is introduced by ARM [2], but x86
hasn't supported it.

At present, x86 defaults L2 cache is shared in one core, but this is
not enough. There're some platforms that multiple cores share the
same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
Atom cores [3], that is, every four Atom cores shares one L2 cache.
Therefore, we need the new CPU topology level (cluster/module).

Another reason is for hybrid architecture. cluster support not only
provides another level of topology definition in x86, but would aslo
provide required code change for future our hybrid topology support.


# Overview

## Introduction of module level for x86

"cluster" in smp is the CPU topology level which is between "core" and
die.

For x86, the "cluster" in smp is corresponding to the module level [4],
which is above the core level. So use the "module" other than "cluster"
in x86 code.

And please note that x86 already has a cpu topology level also named
"cluster" [4], this level is at the upper level of the package. Here,
the cluster in x86 cpu topology is completely different from the
"clusters" as the smp parameter. After the module level is introduced,
the cluster as the smp parameter will actually refer to the module level
of x86.


## Why not share L2 cache in cluster directly

Though "clusters" was introduced to help define L2 cache topology
[2], using cluster to define x86's L2 cache topology will cause the
compatibility problem:

Currently, x86 defaults that the L2 cache is shared in one core, which
actually implies a default setting "cores per L2 cache is 1" and
therefore implicitly defaults to having as many L2 caches as cores.

For example (i386 PC machine):
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)

Considering the topology of the L2 cache, this (*) implicitly means "1
core per L2 cache" and "2 L2 caches per die".

If we use cluster to configure L2 cache topology with the new default
setting "clusters per L2 cache is 1", the above semantics will change
to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
cores per L2 cache".

So the same command (*) will cause changes in the L2 cache topology,
further affecting the performance of the virtual machine.

Therefore, x86 should only treat cluster as a cpu topology level and
avoid using it to change L2 cache by default for compatibility.


## module level in CPUID

Currently, we don't expose module level in CPUID.1FH because currently
linux (v6.2-rc6) doesn't support module level. And exposing module and
die levels at the same time in CPUID.1FH will cause linux to calculate
wrong die_id. The module level should be exposed until the real machine
has the module level in CPUID.1FH.

We can configure CPUID.04H.02H (L2 cache topology) with module level by
a new command:

        "-cpu,x-l2-cache-topo=cluster"

More information about this command, please see the section: "## New
property: x-l2-cache-topo".


## New cache topology info in CPUCacheInfo

Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This default general setting has caused a misunderstanding, that is, the
cache topology is completely equated with a specific cpu topology, such
as the connection between L2 cache and core level, and the connection
between L3 cache and die level.

In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache [2].

Thus, in this patch set, we explicitly define the corresponding cache
topology for different cpu models and this has two benefits:
1. Easy to expand to new CPU models in the future, which has different
   cache topology.
2. It can easily support custom cache topology by some command (e.g.,
   x-l2-cache-topo).


## New property: x-l2-cache-topo

The property l2-cache-topo will be used to change the L2 cache topology
in CPUID.04H.

Now it allows user to set the L2 cache is shared in core level or
cluster level.

If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
topology will be overrided by the new topology setting.

Since CPUID.04H is used by intel cpus, this property is available on
intel cpus as for now.

When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.


# Patch description

patch 1-2 Cleanups about coding style and test name.

patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
             cache topology encoding.

patch 5-6 Cleanups about topology related CPUID encoding and QEMU
          topology variables.

patch 7-12 Add the module as the new CPU topology level in x86, and it
           is corresponding to the cluster level in generic code.

patch 13,14,16 Add cache topology infomation in cache models.

patch 17 Introduce a new command to configure L2 cache topology.


[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
[2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
[3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
[4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.

Best Regards,
Zhao

---
Changelog:

Changes since v2:
 * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
 * Use newly added wrapped helper to get cores per socket in
   qemu_init_vcpu().

Changes since v1:
 * Reordered patches. (Yanan)
 * Deprecated the patch to fix comment of machine_parse_smp_config().
   (Yanan)
 * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
 * Split the intel's l1 cache topology fix into a new separate patch.
   (Yanan)
 * Combined module_id and APIC ID for module level support into one
   patch. (Yanan)
 * Make cache_into_passthrough case of cpuid 0x04 leaf in
 * cpu_x86_cpuid() use max_processor_ids_for_cache() and
   max_core_ids_in_package() to encode CPUID[4]. (Yanan)
 * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
   (Yanan)
 * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)

---
Zhao Liu (10):
  i386: Fix comment style in topology.h
  tests: Rename test-x86-cpuid.c to test-x86-topo.c
  i386/cpu: Fix i/d-cache topology to core level for Intel CPU
  i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  i386: Add cache topology info in CPUCacheInfo
  i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  i386: Use CPUCacheInfo.share_level to encode
    CPUID[0x8000001D].EAX[bits 25:14]
  i386: Add new property to control L2 cache topo in CPUID.04H

Zhuocheng Ding (7):
  softmmu: Fix CPUSTATE.nr_cores' calculation
  i386: Introduce module-level cpu topology to CPUX86State
  i386: Support modules_per_die in X86CPUTopoInfo
  i386: Support module_id in X86CPUTopoIDs
  i386/cpu: Introduce cluster-id to X86CPU
  tests: Add test case of APIC ID for module level parsing
  hw/i386/pc: Support smp.clusters for x86 PC machine

 MAINTAINERS                                   |   2 +-
 hw/i386/pc.c                                  |   1 +
 hw/i386/x86.c                                 |  49 +++++-
 include/hw/core/cpu.h                         |   2 +-
 include/hw/i386/topology.h                    |  68 +++++---
 qemu-options.hx                               |  10 +-
 softmmu/cpus.c                                |   2 +-
 target/i386/cpu.c                             | 158 ++++++++++++++----
 target/i386/cpu.h                             |  25 +++
 tests/unit/meson.build                        |   4 +-
 .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
 11 files changed, 280 insertions(+), 99 deletions(-)
 rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)

Jonathan Cameron Aug. 1, 2023, 3:35 p.m. UTC | #1

On Tue,  1 Aug 2023 18:35:10 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Hi list,
> 
> This is the our v3 patch series, rebased on the master branch at the
> commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
> //git.linaro.org/people/pmaydell/qemu-arm into staging").
> 
> Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
> "ACKed-by" (for PC related patchies) tags and minor code changes (Pls
> see changelog).
> 
> 
> # Introduction
> 
> This series add the cluster support for x86 PC machine, which allows
> x86 can use smp.clusters to configure x86 modlue level CPU topology.
> 
> And since the compatibility issue (see section: ## Why not share L2
> cache in cluster directly), this series also introduce a new command
> to adjust the x86 L2 cache topology.
> 
> Welcome your comments!
> 
> 
> # Backgroud
> 
> The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> hasn't supported it.
> 
> At present, x86 defaults L2 cache is shared in one core, but this is
> not enough. There're some platforms that multiple cores share the
> same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> Atom cores [3], that is, every four Atom cores shares one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
> 
> Another reason is for hybrid architecture. cluster support not only
> provides another level of topology definition in x86, but would aslo
> provide required code change for future our hybrid topology support.
> 
> 
> # Overview
> 
> ## Introduction of module level for x86
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
> 
> And please note that x86 already has a cpu topology level also named
> "cluster" [4], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
> 
> 
> ## Why not share L2 cache in cluster directly
> 
> Though "clusters" was introduced to help define L2 cache topology
> [2], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> 
> ## module level in CPUID
> 
> Currently, we don't expose module level in CPUID.1FH because currently
> linux (v6.2-rc6) doesn't support module level. And exposing module and
> die levels at the same time in CPUID.1FH will cause linux to calculate
> wrong die_id. The module level should be exposed until the real machine
> has the module level in CPUID.1FH.
> 
> We can configure CPUID.04H.02H (L2 cache topology) with module level by
> a new command:
> 
>         "-cpu,x-l2-cache-topo=cluster"
> 
> More information about this command, please see the section: "## New
> property: x-l2-cache-topo".
> 
> 
> ## New cache topology info in CPUCacheInfo
> 
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
> 
> This default general setting has caused a misunderstanding, that is, the
> cache topology is completely equated with a specific cpu topology, such
> as the connection between L2 cache and core level, and the connection
> between L3 cache and die level.
> 
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache [2].
> 
> Thus, in this patch set, we explicitly define the corresponding cache
> topology for different cpu models and this has two benefits:
> 1. Easy to expand to new CPU models in the future, which has different
>    cache topology.
> 2. It can easily support custom cache topology by some command (e.g.,
>    x-l2-cache-topo).
> 
> 
> ## New property: x-l2-cache-topo
> 
> The property l2-cache-topo will be used to change the L2 cache topology
> in CPUID.04H.
> 
> Now it allows user to set the L2 cache is shared in core level or
> cluster level.
> 
> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> topology will be overrided by the new topology setting.
> 
> Since CPUID.04H is used by intel cpus, this property is available on
> intel cpus as for now.
> 
> When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.

Hi Zhao Liu,

As part of emulating arm's MPAM (cache partitioning controls) I needed
to add the missing cache description in the ACPI PPTT table. As such I ran
into a very similar problem to the one you are addressing.

I wonder if a more generic description is possible? We can rely on ordering
of the cache levels, so what I was planning to propose was the rather lengthy
but flexible (and with better names ;)

-smp 16,sockets=1,clusters=4,threads=2,cache-cluster-start-level=2,cache-node-start-level=3

Perhaps we can come up with a common scheme that covers both usecases?
It gets more fiddly to define if we have variable topology across different clusters
- and that was going to be an open question in the RFC proposing this - our current
definition of the more basic topology doesn't cover those cases anyway.

What I want:

1) No restriction on maximum cache levels - some systems have more than 3.
2) Easy ability to express everything from all caches are private to all caches are shared.
Is 3 levels enough? (private, shared at cluster level, shared at a level above that) I think
so, but if not any scheme should be extensible to cover another level.

Great if we can figure out a common scheme.

Jonathan

> 
> 
> # Patch description
> 
> patch 1-2 Cleanups about coding style and test name.
> 
> patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
>              cache topology encoding.
> 
> patch 5-6 Cleanups about topology related CPUID encoding and QEMU
>           topology variables.
> 
> patch 7-12 Add the module as the new CPU topology level in x86, and it
>            is corresponding to the cluster level in generic code.
> 
> patch 13,14,16 Add cache topology infomation in cache models.
> 
> patch 17 Introduce a new command to configure L2 cache topology.
> 
> 
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> 
> Best Regards,
> Zhao
> 
> ---
> Changelog:
> 
> Changes since v2:
>  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
>  * Use newly added wrapped helper to get cores per socket in
>    qemu_init_vcpu().
> 
> Changes since v1:
>  * Reordered patches. (Yanan)
>  * Deprecated the patch to fix comment of machine_parse_smp_config().
>    (Yanan)
>  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
>  * Split the intel's l1 cache topology fix into a new separate patch.
>    (Yanan)
>  * Combined module_id and APIC ID for module level support into one
>    patch. (Yanan)
>  * Make cache_into_passthrough case of cpuid 0x04 leaf in
>  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
>    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
>  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
>    (Yanan)
>  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> 
> ---
> Zhao Liu (10):
>   i386: Fix comment style in topology.h
>   tests: Rename test-x86-cpuid.c to test-x86-topo.c
>   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
>   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
>   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
>   i386: Add cache topology info in CPUCacheInfo
>   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
>   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Use CPUCacheInfo.share_level to encode
>     CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Add new property to control L2 cache topo in CPUID.04H
> 
> Zhuocheng Ding (7):
>   softmmu: Fix CPUSTATE.nr_cores' calculation
>   i386: Introduce module-level cpu topology to CPUX86State
>   i386: Support modules_per_die in X86CPUTopoInfo
>   i386: Support module_id in X86CPUTopoIDs
>   i386/cpu: Introduce cluster-id to X86CPU
>   tests: Add test case of APIC ID for module level parsing
>   hw/i386/pc: Support smp.clusters for x86 PC machine
> 
>  MAINTAINERS                                   |   2 +-
>  hw/i386/pc.c                                  |   1 +
>  hw/i386/x86.c                                 |  49 +++++-
>  include/hw/core/cpu.h                         |   2 +-
>  include/hw/i386/topology.h                    |  68 +++++---
>  qemu-options.hx                               |  10 +-
>  softmmu/cpus.c                                |   2 +-
>  target/i386/cpu.c                             | 158 ++++++++++++++----
>  target/i386/cpu.h                             |  25 +++
>  tests/unit/meson.build                        |   4 +-
>  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
>  11 files changed, 280 insertions(+), 99 deletions(-)
>  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
>

Babu Moger Aug. 1, 2023, 11:11 p.m. UTC | #2

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Hi list,
> 
> This is the our v3 patch series, rebased on the master branch at the
> commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
> //git.linaro.org/people/pmaydell/qemu-arm into staging").
> 
> Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
> "ACKed-by" (for PC related patchies) tags and minor code changes (Pls
> see changelog).
> 
> 
> # Introduction
> 
> This series add the cluster support for x86 PC machine, which allows
> x86 can use smp.clusters to configure x86 modlue level CPU topology.

/s/modlue/module
> 
> And since the compatibility issue (see section: ## Why not share L2
> cache in cluster directly), this series also introduce a new command
> to adjust the x86 L2 cache topology.
> 
> Welcome your comments!
> 
> 
> # Backgroud
> 
> The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> hasn't supported it.
> 
> At present, x86 defaults L2 cache is shared in one core, but this is
> not enough. There're some platforms that multiple cores share the
> same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> Atom cores [3], that is, every four Atom cores shares one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
> 
> Another reason is for hybrid architecture. cluster support not only
> provides another level of topology definition in x86, but would aslo
> provide required code change for future our hybrid topology support.
> 
> 
> # Overview
> 
> ## Introduction of module level for x86
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
> 
> And please note that x86 already has a cpu topology level also named
> "cluster" [4], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
> 
> 
> ## Why not share L2 cache in cluster directly
> 
> Though "clusters" was introduced to help define L2 cache topology
> [2], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> 
> ## module level in CPUID
> 
> Currently, we don't expose module level in CPUID.1FH because currently
> linux (v6.2-rc6) doesn't support module level. And exposing module and
> die levels at the same time in CPUID.1FH will cause linux to calculate
> wrong die_id. The module level should be exposed until the real machine
> has the module level in CPUID.1FH.
> 
> We can configure CPUID.04H.02H (L2 cache topology) with module level by
> a new command:
> 
>         "-cpu,x-l2-cache-topo=cluster"
> 
> More information about this command, please see the section: "## New
> property: x-l2-cache-topo".
> 
> 
> ## New cache topology info in CPUCacheInfo
> 
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
> 
> This default general setting has caused a misunderstanding, that is, the
> cache topology is completely equated with a specific cpu topology, such
> as the connection between L2 cache and core level, and the connection
> between L3 cache and die level.
> 
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache [2].
> 
> Thus, in this patch set, we explicitly define the corresponding cache
> topology for different cpu models and this has two benefits:
> 1. Easy to expand to new CPU models in the future, which has different
>    cache topology.
> 2. It can easily support custom cache topology by some command (e.g.,
>    x-l2-cache-topo).
> 
> 
> ## New property: x-l2-cache-topo
> 
> The property l2-cache-topo will be used to change the L2 cache topology

Should this be x-l2-cache-topo ?

> in CPUID.04H.
> 
> Now it allows user to set the L2 cache is shared in core level or
> cluster level.
> 
> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> topology will be overrided by the new topology setting.
> 
> Since CPUID.04H is used by intel cpus, this property is available on
> intel cpus as for now.

s/intel cpus/Intel CPUs/
I feel this looks  better

> 
> When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.

s/amd cpus/AMD CPUs/

> 
> 
> # Patch description
> 
> patch 1-2 Cleanups about coding style and test name.
> 
> patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
>              cache topology encoding.
> 
> patch 5-6 Cleanups about topology related CPUID encoding and QEMU
>           topology variables.
> 
> patch 7-12 Add the module as the new CPU topology level in x86, and it
>            is corresponding to the cluster level in generic code.
> 
> patch 13,14,16 Add cache topology infomation in cache models.
> 
> patch 17 Introduce a new command to configure L2 cache topology.
> 
> 
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> 
> Best Regards,
> Zhao
> 
> ---
> Changelog:
> 
> Changes since v2:
>  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
>  * Use newly added wrapped helper to get cores per socket in
>    qemu_init_vcpu().
> 
> Changes since v1:
>  * Reordered patches. (Yanan)
>  * Deprecated the patch to fix comment of machine_parse_smp_config().
>    (Yanan)
>  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
>  * Split the intel's l1 cache topology fix into a new separate patch.
>    (Yanan)
>  * Combined module_id and APIC ID for module level support into one
>    patch. (Yanan)
>  * Make cache_into_passthrough case of cpuid 0x04 leaf in
>  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
>    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
>  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
>    (Yanan)
>  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> 
> ---
> Zhao Liu (10):
>   i386: Fix comment style in topology.h
>   tests: Rename test-x86-cpuid.c to test-x86-topo.c
>   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
>   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
>   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
>   i386: Add cache topology info in CPUCacheInfo
>   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
>   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Use CPUCacheInfo.share_level to encode
>     CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Add new property to control L2 cache topo in CPUID.04H
> 
> Zhuocheng Ding (7):
>   softmmu: Fix CPUSTATE.nr_cores' calculation
>   i386: Introduce module-level cpu topology to CPUX86State
>   i386: Support modules_per_die in X86CPUTopoInfo
>   i386: Support module_id in X86CPUTopoIDs
>   i386/cpu: Introduce cluster-id to X86CPU
>   tests: Add test case of APIC ID for module level parsing
>   hw/i386/pc: Support smp.clusters for x86 PC machine
> 
>  MAINTAINERS                                   |   2 +-
>  hw/i386/pc.c                                  |   1 +
>  hw/i386/x86.c                                 |  49 +++++-
>  include/hw/core/cpu.h                         |   2 +-
>  include/hw/i386/topology.h                    |  68 +++++---
>  qemu-options.hx                               |  10 +-
>  softmmu/cpus.c                                |   2 +-
>  target/i386/cpu.c                             | 158 ++++++++++++++----
>  target/i386/cpu.h                             |  25 +++
>  tests/unit/meson.build                        |   4 +-
>  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
>  11 files changed, 280 insertions(+), 99 deletions(-)
>  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
>

Zhao Liu Aug. 4, 2023, 7:44 a.m. UTC | #3

On Tue, Aug 01, 2023 at 06:11:52PM -0500, Moger, Babu wrote:
> Date: Tue, 1 Aug 2023 18:11:52 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 00/17] Support smp.clusters for x86
> 

Hi Babu,

Many thanks for your review and help!

> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Hi list,
> > 
> > This is the our v3 patch series, rebased on the master branch at the
> > commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
> > //git.linaro.org/people/pmaydell/qemu-arm into staging").
> > 
> > Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
> > "ACKed-by" (for PC related patchies) tags and minor code changes (Pls
> > see changelog).
> > 
> > 
> > # Introduction
> > 
> > This series add the cluster support for x86 PC machine, which allows
> > x86 can use smp.clusters to configure x86 modlue level CPU topology.
> 
> /s/modlue/module

Thanks!

> > 
> > And since the compatibility issue (see section: ## Why not share L2
> > cache in cluster directly), this series also introduce a new command
> > to adjust the x86 L2 cache topology.
> > 
> > Welcome your comments!
> > 
> > 
> > # Backgroud
> > 
> > The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> > hasn't supported it.
> > 
> > At present, x86 defaults L2 cache is shared in one core, but this is
> > not enough. There're some platforms that multiple cores share the
> > same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> > Atom cores [3], that is, every four Atom cores shares one L2 cache.
> > Therefore, we need the new CPU topology level (cluster/module).
> > 
> > Another reason is for hybrid architecture. cluster support not only
> > provides another level of topology definition in x86, but would aslo
> > provide required code change for future our hybrid topology support.
> > 
> > 
> > # Overview
> > 
> > ## Introduction of module level for x86
> > 
> > "cluster" in smp is the CPU topology level which is between "core" and
> > die.
> > 
> > For x86, the "cluster" in smp is corresponding to the module level [4],
> > which is above the core level. So use the "module" other than "cluster"
> > in x86 code.
> > 
> > And please note that x86 already has a cpu topology level also named
> > "cluster" [4], this level is at the upper level of the package. Here,
> > the cluster in x86 cpu topology is completely different from the
> > "clusters" as the smp parameter. After the module level is introduced,
> > the cluster as the smp parameter will actually refer to the module level
> > of x86.
> > 
> > 
> > ## Why not share L2 cache in cluster directly
> > 
> > Though "clusters" was introduced to help define L2 cache topology
> > [2], using cluster to define x86's L2 cache topology will cause the
> > compatibility problem:
> > 
> > Currently, x86 defaults that the L2 cache is shared in one core, which
> > actually implies a default setting "cores per L2 cache is 1" and
> > therefore implicitly defaults to having as many L2 caches as cores.
> > 
> > For example (i386 PC machine):
> > -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> > 
> > Considering the topology of the L2 cache, this (*) implicitly means "1
> > core per L2 cache" and "2 L2 caches per die".
> > 
> > If we use cluster to configure L2 cache topology with the new default
> > setting "clusters per L2 cache is 1", the above semantics will change
> > to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> > cores per L2 cache".
> > 
> > So the same command (*) will cause changes in the L2 cache topology,
> > further affecting the performance of the virtual machine.
> > 
> > Therefore, x86 should only treat cluster as a cpu topology level and
> > avoid using it to change L2 cache by default for compatibility.
> > 
> > 
> > ## module level in CPUID
> > 
> > Currently, we don't expose module level in CPUID.1FH because currently
> > linux (v6.2-rc6) doesn't support module level. And exposing module and
> > die levels at the same time in CPUID.1FH will cause linux to calculate
> > wrong die_id. The module level should be exposed until the real machine
> > has the module level in CPUID.1FH.
> > 
> > We can configure CPUID.04H.02H (L2 cache topology) with module level by
> > a new command:
> > 
> >         "-cpu,x-l2-cache-topo=cluster"
> > 
> > More information about this command, please see the section: "## New
> > property: x-l2-cache-topo".
> > 
> > 
> > ## New cache topology info in CPUCacheInfo
> > 
> > Currently, by default, the cache topology is encoded as:
> > 1. i/d cache is shared in one core.
> > 2. L2 cache is shared in one core.
> > 3. L3 cache is shared in one die.
> > 
> > This default general setting has caused a misunderstanding, that is, the
> > cache topology is completely equated with a specific cpu topology, such
> > as the connection between L2 cache and core level, and the connection
> > between L3 cache and die level.
> > 
> > In fact, the settings of these topologies depend on the specific
> > platform and are not static. For example, on Alder Lake-P, every
> > four Atom cores share the same L2 cache [2].
> > 
> > Thus, in this patch set, we explicitly define the corresponding cache
> > topology for different cpu models and this has two benefits:
> > 1. Easy to expand to new CPU models in the future, which has different
> >    cache topology.
> > 2. It can easily support custom cache topology by some command (e.g.,
> >    x-l2-cache-topo).
> > 
> > 
> > ## New property: x-l2-cache-topo
> > 
> > The property l2-cache-topo will be used to change the L2 cache topology
> 
> Should this be x-l2-cache-topo ?

Yes.

> 
> > in CPUID.04H.
> > 
> > Now it allows user to set the L2 cache is shared in core level or
> > cluster level.
> > 
> > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > topology will be overrided by the new topology setting.
> > 
> > Since CPUID.04H is used by intel cpus, this property is available on
> > intel cpus as for now.
> 
> s/intel cpus/Intel CPUs/
> I feel this looks  better

OK.

> 
> > 
> > When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.
> 
> s/amd cpus/AMD CPUs/

Will fix.

Thanks,
Zhao

> 
> > 
> > 
> > # Patch description
> > 
> > patch 1-2 Cleanups about coding style and test name.
> > 
> > patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
> >              cache topology encoding.
> > 
> > patch 5-6 Cleanups about topology related CPUID encoding and QEMU
> >           topology variables.
> > 
> > patch 7-12 Add the module as the new CPU topology level in x86, and it
> >            is corresponding to the cluster level in generic code.
> > 
> > patch 13,14,16 Add cache topology infomation in cache models.
> > 
> > patch 17 Introduce a new command to configure L2 cache topology.
> > 
> > 
> > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> > [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> > [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> > [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > 
> > Best Regards,
> > Zhao
> > 
> > ---
> > Changelog:
> > 
> > Changes since v2:
> >  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
> >  * Use newly added wrapped helper to get cores per socket in
> >    qemu_init_vcpu().
> > 
> > Changes since v1:
> >  * Reordered patches. (Yanan)
> >  * Deprecated the patch to fix comment of machine_parse_smp_config().
> >    (Yanan)
> >  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
> >  * Split the intel's l1 cache topology fix into a new separate patch.
> >    (Yanan)
> >  * Combined module_id and APIC ID for module level support into one
> >    patch. (Yanan)
> >  * Make cache_into_passthrough case of cpuid 0x04 leaf in
> >  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
> >    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
> >  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
> >    (Yanan)
> >  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> > 
> > ---
> > Zhao Liu (10):
> >   i386: Fix comment style in topology.h
> >   tests: Rename test-x86-cpuid.c to test-x86-topo.c
> >   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
> >   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
> >   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
> >   i386: Add cache topology info in CPUCacheInfo
> >   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
> >   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
> >   i386: Use CPUCacheInfo.share_level to encode
> >     CPUID[0x8000001D].EAX[bits 25:14]
> >   i386: Add new property to control L2 cache topo in CPUID.04H
> > 
> > Zhuocheng Ding (7):
> >   softmmu: Fix CPUSTATE.nr_cores' calculation
> >   i386: Introduce module-level cpu topology to CPUX86State
> >   i386: Support modules_per_die in X86CPUTopoInfo
> >   i386: Support module_id in X86CPUTopoIDs
> >   i386/cpu: Introduce cluster-id to X86CPU
> >   tests: Add test case of APIC ID for module level parsing
> >   hw/i386/pc: Support smp.clusters for x86 PC machine
> > 
> >  MAINTAINERS                                   |   2 +-
> >  hw/i386/pc.c                                  |   1 +
> >  hw/i386/x86.c                                 |  49 +++++-
> >  include/hw/core/cpu.h                         |   2 +-
> >  include/hw/i386/topology.h                    |  68 +++++---
> >  qemu-options.hx                               |  10 +-
> >  softmmu/cpus.c                                |   2 +-
> >  target/i386/cpu.c                             | 158 ++++++++++++++----
> >  target/i386/cpu.h                             |  25 +++
> >  tests/unit/meson.build                        |   4 +-
> >  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
> >  11 files changed, 280 insertions(+), 99 deletions(-)
> >  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> > 
> 
> -- 
> Thanks
> Babu Moger

Zhao Liu Aug. 4, 2023, 1:17 p.m. UTC | #4

Hi Jonathan,

On Tue, Aug 01, 2023 at 04:35:27PM +0100, Jonathan Cameron via wrote:
> > 

[snip]

> > 
> > ## New property: x-l2-cache-topo
> > 
> > The property l2-cache-topo will be used to change the L2 cache topology
> > in CPUID.04H.
> > 
> > Now it allows user to set the L2 cache is shared in core level or
> > cluster level.
> > 
> > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > topology will be overrided by the new topology setting.
> > 
> > Since CPUID.04H is used by intel cpus, this property is available on
> > intel cpus as for now.
> > 
> > When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.
> 
> Hi Zhao Liu,
> 
> As part of emulating arm's MPAM (cache partitioning controls) I needed
> to add the missing cache description in the ACPI PPTT table. As such I ran
> into a very similar problem to the one you are addressing.

May I ask if the cache topology you need is symmetric or heterogeneous?

I had the discussion with Yanan [5] about heterogeneous cache. If you
need a "symmetric" cache topology, maybe we could consider trying make
this x-l2-cache-topo more generic.

But if you need a heterogeneous cache topo, e.g., some cores have its
own l2 cache, and other cores share the same l2 cache, only this command
is not enough.

Intel hybrid platforms have the above case I mentioned, we used "hybrid
CPU topology" [6] + "x-l2-cache-topo=cluster" to solve this:

For example, AdlerLake has 2 types of core, one type is P-core with L2 per
P-core, and another type is E-core that 4 E-cores share a L2.

So we set a CPU topology like this:

Set 2 kinds of clusters:
* 1 P-core is in a cluster.
* 4 E-cores in a cluster.

Then we use "x-l2-cache-topo" to make l2 is shared at cluster. In this
way, a P-core could own a L2 because its cluster only has 1 P-core, and
4 E-cores could could share a L2.

For more general way to set cache topology, Yanan and me discussed 2
ways ([7] [8]). [8] depends on the QOM CPU topology mechanism that I'm
working on.

[5]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg04795.html
[6]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03205.html
[7]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05139.html
[8]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05167.html

> 
> I wonder if a more generic description is possible? We can rely on ordering
> of the cache levels, so what I was planning to propose was the rather lengthy
> but flexible (and with better names ;)
> 
> -smp 16,sockets=1,clusters=4,threads=2,cache-cluster-start-level=2,cache-node-start-level=3

Could you explain more about this command?
I don't understand what "cache-cluster-start-level=2,cache-node-start-level=3" mean.

> 
> Perhaps we can come up with a common scheme that covers both usecases?
> It gets more fiddly to define if we have variable topology across different clusters
> - and that was going to be an open question in the RFC proposing this - our current
> definition of the more basic topology doesn't cover those cases anyway.
> 
> What I want:
> 
> 1) No restriction on maximum cache levels - ...

Hmmm, if there's no cache name, it would be difficult to define in cli.

> ... some systems have more than 3

What about L4? A name can simplify a lot of issues.

> 2) Easy ability to express everything from all caches are private to all caches are shared.
> Is 3 levels enough? (private, shared at cluster level, shared at a level above that) I think
> so, but if not any scheme should be extensible to cover another level.

It seems you may need the "heterogeneous" cache topology.

I think "private" and "shared" are not good definitions of cache, since
they are not technical terms? (Correct me if I'm wrong.) And i/d cache,
l1 cache, l2 cache are generic terms accepted by many architectures.

Though cache topology is different with CPU topology, it's true that the
cache topology is related to the CPU hierarchy, so I think using the CPU
topology hierarchy to define the heterogeneous topology looks like a more
appropriate way to do it.

> 
> Great if we can figure out a common scheme.

Yeah, it's worth discussing.

Thanks,
Zhao

> 
> Jonathan
> 
> > 
> > 
> > # Patch description
> > 
> > patch 1-2 Cleanups about coding style and test name.
> > 
> > patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
> >              cache topology encoding.
> > 
> > patch 5-6 Cleanups about topology related CPUID encoding and QEMU
> >           topology variables.
> > 
> > patch 7-12 Add the module as the new CPU topology level in x86, and it
> >            is corresponding to the cluster level in generic code.
> > 
> > patch 13,14,16 Add cache topology infomation in cache models.
> > 
> > patch 17 Introduce a new command to configure L2 cache topology.
> > 
> > 
> > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> > [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> > [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> > [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > 
> > Best Regards,
> > Zhao
> > 
> > ---
> > Changelog:
> > 
> > Changes since v2:
> >  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
> >  * Use newly added wrapped helper to get cores per socket in
> >    qemu_init_vcpu().
> > 
> > Changes since v1:
> >  * Reordered patches. (Yanan)
> >  * Deprecated the patch to fix comment of machine_parse_smp_config().
> >    (Yanan)
> >  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
> >  * Split the intel's l1 cache topology fix into a new separate patch.
> >    (Yanan)
> >  * Combined module_id and APIC ID for module level support into one
> >    patch. (Yanan)
> >  * Make cache_into_passthrough case of cpuid 0x04 leaf in
> >  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
> >    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
> >  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
> >    (Yanan)
> >  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> > 
> > ---
> > Zhao Liu (10):
> >   i386: Fix comment style in topology.h
> >   tests: Rename test-x86-cpuid.c to test-x86-topo.c
> >   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
> >   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
> >   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
> >   i386: Add cache topology info in CPUCacheInfo
> >   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
> >   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
> >   i386: Use CPUCacheInfo.share_level to encode
> >     CPUID[0x8000001D].EAX[bits 25:14]
> >   i386: Add new property to control L2 cache topo in CPUID.04H
> > 
> > Zhuocheng Ding (7):
> >   softmmu: Fix CPUSTATE.nr_cores' calculation
> >   i386: Introduce module-level cpu topology to CPUX86State
> >   i386: Support modules_per_die in X86CPUTopoInfo
> >   i386: Support module_id in X86CPUTopoIDs
> >   i386/cpu: Introduce cluster-id to X86CPU
> >   tests: Add test case of APIC ID for module level parsing
> >   hw/i386/pc: Support smp.clusters for x86 PC machine
> > 
> >  MAINTAINERS                                   |   2 +-
> >  hw/i386/pc.c                                  |   1 +
> >  hw/i386/x86.c                                 |  49 +++++-
> >  include/hw/core/cpu.h                         |   2 +-
> >  include/hw/i386/topology.h                    |  68 +++++---
> >  qemu-options.hx                               |  10 +-
> >  softmmu/cpus.c                                |   2 +-
> >  target/i386/cpu.c                             | 158 ++++++++++++++----
> >  target/i386/cpu.h                             |  25 +++
> >  tests/unit/meson.build                        |   4 +-
> >  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
> >  11 files changed, 280 insertions(+), 99 deletions(-)
> >  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> > 
> 
>

Jonathan Cameron Aug. 8, 2023, 11:52 a.m. UTC | #5

On Fri, 4 Aug 2023 21:17:59 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> Hi Jonathan,
> 
> On Tue, Aug 01, 2023 at 04:35:27PM +0100, Jonathan Cameron via wrote:
> > >   
> 
> [snip]
> 
> > > 
> > > ## New property: x-l2-cache-topo
> > > 
> > > The property l2-cache-topo will be used to change the L2 cache topology
> > > in CPUID.04H.
> > > 
> > > Now it allows user to set the L2 cache is shared in core level or
> > > cluster level.
> > > 
> > > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > > topology will be overrided by the new topology setting.
> > > 
> > > Since CPUID.04H is used by intel cpus, this property is available on
> > > intel cpus as for now.
> > > 
> > > When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.  
> > 
> > Hi Zhao Liu,
> > 
> > As part of emulating arm's MPAM (cache partitioning controls) I needed
> > to add the missing cache description in the ACPI PPTT table. As such I ran
> > into a very similar problem to the one you are addressing.  
> 
> May I ask if the cache topology you need is symmetric or heterogeneous?
> 
> I had the discussion with Yanan [5] about heterogeneous cache. If you
> need a "symmetric" cache topology, maybe we could consider trying make
> this x-l2-cache-topo more generic.

For now, I'm interested in symmetric, but heterogeneous is certain
to pop up at some point as people will build MPAM mobile systems.
Right now there needs to be a lot of other work to emulate those well
in QEMU given the cores are likely to be quite different.

> 
> But if you need a heterogeneous cache topo, e.g., some cores have its
> own l2 cache, and other cores share the same l2 cache, only this command
> is not enough.
> 
> Intel hybrid platforms have the above case I mentioned, we used "hybrid
> CPU topology" [6] + "x-l2-cache-topo=cluster" to solve this:
> 
> For example, AdlerLake has 2 types of core, one type is P-core with L2 per
> P-core, and another type is E-core that 4 E-cores share a L2.
> 
> So we set a CPU topology like this:
> 
> Set 2 kinds of clusters:
> * 1 P-core is in a cluster.
> * 4 E-cores in a cluster.
> 
> Then we use "x-l2-cache-topo" to make l2 is shared at cluster. In this
> way, a P-core could own a L2 because its cluster only has 1 P-core, and
> 4 E-cores could could share a L2.

That can work if you aren't using clusters to describe other elements of
the topology.  We originally added PPTT based cluster support to Linux
to enable sharing of l3 tags (but not l3 data) among a cluster of
CPUs. So we'd need it for some of our platforms independent of this
particular aspect.  Sometimes the cluster will have associated caches
sometimes it won't and they will be at a different level.  PPTT represents
that cache topology well but is complex.

> 
> For more general way to set cache topology, Yanan and me discussed 2
> ways ([7] [8]). [8] depends on the QOM CPU topology mechanism that I'm
> working on.
> 
> [5]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg04795.html
> [6]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03205.html
> [7]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05139.html
> [8]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05167.html

Great. I'll have a read!

> 
> > 
> > I wonder if a more generic description is possible? We can rely on ordering
> > of the cache levels, so what I was planning to propose was the rather lengthy
> > but flexible (and with better names ;)
> > 
> > -smp 16,sockets=1,clusters=4,threads=2,cache-cluster-start-level=2,cache-node-start-level=3  
> 
> Could you explain more about this command?

I'll cc you on the RFC patch set in a few mins.

> I don't understand what "cache-cluster-start-level=2,cache-node-start-level=3" mean.

Assume hierarchical max Nth level cache. 
Cache levels 1 to (cache-cluster-start-level - 1) are private to a core (shared by threads).
Cache levels cache-cluster-start-level to (cache-node-start-level - 1) are shared at cluster level.
Cache levels cache cach-node-start-level to N are at the Numa node level which may or may not
be the physical package.

It very much assumes non heterogeneous though which I don't like about this scheme.

> 
> > 
> > Perhaps we can come up with a common scheme that covers both usecases?
> > It gets more fiddly to define if we have variable topology across different clusters
> > - and that was going to be an open question in the RFC proposing this - our current
> > definition of the more basic topology doesn't cover those cases anyway.
> > 
> > What I want:
> > 
> > 1) No restriction on maximum cache levels - ...  
> 
> Hmmm, if there's no cache name, it would be difficult to define in cli.

Define by level number rather than name.

> 
> > ... some systems have more than 3  
> 
> What about L4? A name can simplify a lot of issues.

True but if we can make it take a number as a parameter it extends to
any level.

> 
> > 2) Easy ability to express everything from all caches are private to all caches are shared.
> > Is 3 levels enough? (private, shared at cluster level, shared at a level above that) I think
> > so, but if not any scheme should be extensible to cover another level.  
> 
> It seems you may need the "heterogeneous" cache topology.
So far, nope. I need to be able to define flexible.

> 
> I think "private" and "shared" are not good definitions of cache, since
> they are not technical terms? (Correct me if I'm wrong.) And i/d cache,
> l1 cache, l2 cache are generic terms accepted by many architectures.
Totally parallel concepts.

L1, l2 etc are just distances from the core and even that gets fiddly if
they aren't inclusive.

Private just means the cache is not shared by multiple cores.
The way you define it above is to put them in clusters, but the cluster concept
means a bunch of other things that don't necessarily have anything to do with
caches (many other resources may be shared).

> 
> Though cache topology is different with CPU topology, it's true that the
> cache topology is related to the CPU hierarchy, so I think using the CPU
> topology hierarchy to define the heterogeneous topology looks like a more
> appropriate way to do it.

Agreed this needs to be built off the CPU topology (PPTT in ACPI is a good
general way to describe things as I've not yet met a system it can't describe
to some degree), but there can be levels of that topology there for different
purposes than describing the sharing of cacehs.

> 
> > 
> > Great if we can figure out a common scheme.  
> 
> Yeah, it's worth discussing.

Let me catch up with the discussions you link above - perhaps the proposals
are generic enough for my cases as well.  The ARM world tends to have a lot
more varied topology than x86 so we often see corner cases.

Jonathan

> 
> Thanks,
> Zhao
> 
> > 
> > Jonathan
> >   
> > > 
> > > 
> > > # Patch description
> > > 
> > > patch 1-2 Cleanups about coding style and test name.
> > > 
> > > patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
> > >              cache topology encoding.
> > > 
> > > patch 5-6 Cleanups about topology related CPUID encoding and QEMU
> > >           topology variables.
> > > 
> > > patch 7-12 Add the module as the new CPU topology level in x86, and it
> > >            is corresponding to the cluster level in generic code.
> > > 
> > > patch 13,14,16 Add cache topology infomation in cache models.
> > > 
> > > patch 17 Introduce a new command to configure L2 cache topology.
> > > 
> > > 
> > > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> > > [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> > > [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> > > [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > > 
> > > Best Regards,
> > > Zhao
> > > 
> > > ---
> > > Changelog:
> > > 
> > > Changes since v2:
> > >  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
> > >  * Use newly added wrapped helper to get cores per socket in
> > >    qemu_init_vcpu().
> > > 
> > > Changes since v1:
> > >  * Reordered patches. (Yanan)
> > >  * Deprecated the patch to fix comment of machine_parse_smp_config().
> > >    (Yanan)
> > >  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
> > >  * Split the intel's l1 cache topology fix into a new separate patch.
> > >    (Yanan)
> > >  * Combined module_id and APIC ID for module level support into one
> > >    patch. (Yanan)
> > >  * Make cache_into_passthrough case of cpuid 0x04 leaf in
> > >  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
> > >    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
> > >  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
> > >    (Yanan)
> > >  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> > > 
> > > ---
> > > Zhao Liu (10):
> > >   i386: Fix comment style in topology.h
> > >   tests: Rename test-x86-cpuid.c to test-x86-topo.c
> > >   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
> > >   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
> > >   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
> > >   i386: Add cache topology info in CPUCacheInfo
> > >   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
> > >   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
> > >   i386: Use CPUCacheInfo.share_level to encode
> > >     CPUID[0x8000001D].EAX[bits 25:14]
> > >   i386: Add new property to control L2 cache topo in CPUID.04H
> > > 
> > > Zhuocheng Ding (7):
> > >   softmmu: Fix CPUSTATE.nr_cores' calculation
> > >   i386: Introduce module-level cpu topology to CPUX86State
> > >   i386: Support modules_per_die in X86CPUTopoInfo
> > >   i386: Support module_id in X86CPUTopoIDs
> > >   i386/cpu: Introduce cluster-id to X86CPU
> > >   tests: Add test case of APIC ID for module level parsing
> > >   hw/i386/pc: Support smp.clusters for x86 PC machine
> > > 
> > >  MAINTAINERS                                   |   2 +-
> > >  hw/i386/pc.c                                  |   1 +
> > >  hw/i386/x86.c                                 |  49 +++++-
> > >  include/hw/core/cpu.h                         |   2 +-
> > >  include/hw/i386/topology.h                    |  68 +++++---
> > >  qemu-options.hx                               |  10 +-
> > >  softmmu/cpus.c                                |   2 +-
> > >  target/i386/cpu.c                             | 158 ++++++++++++++----
> > >  target/i386/cpu.h                             |  25 +++
> > >  tests/unit/meson.build                        |   4 +-
> > >  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
> > >  11 files changed, 280 insertions(+), 99 deletions(-)
> > >  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> > >   
> > 
> >   
>

[v3,00/17] Support smp.clusters for x86

Message

Comments