diff mbox series

[v11,19/21] i386: Add cache topology info in CPUCacheInfo

Message ID 20240424154929.1487382-20-zhao1.liu@intel.com (mailing list archive)
State New, archived
Headers show
Series i386: Introduce smp.modules and clean up cache topology | expand

Commit Message

Zhao Liu April 24, 2024, 3:49 p.m. UTC
Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This default general setting has caused a misunderstanding, that is, the
cache topology is completely equated with a specific cpu topology, such
as the connection between L2 cache and core level, and the connection
between L3 cache and die level.

In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache.

Thus, we should explicitly define the corresponding cache topology for
different cache models to increase scalability.

Except legacy_l2_cache_cpuid2 (its default topo level is
CPU_TOPO_LEVEL_UNKNOW), explicitly set the corresponding topology level
for all other cache models. In order to be compatible with the existing
cache topology, set the CPU_TOPO_LEVEL_CORE level for the i/d cache, set
the CPU_TOPO_LEVEL_CORE level for L2 cache, and set the
CPU_TOPO_LEVEL_DIE level for L3 cache.

The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
25:14] will be set based on CPUCacheInfo.share_level.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v3:
 * Fixed cache topology uninitialization bugs for some AMD CPUs. (Babu)
 * Moved the CPUTopoLevel enumeration definition to the previous 0x1f
   rework patch.

Changes since v1:
 * Added the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
   (Yanan)
 * (Revert, pls refer "i386: Decouple CPUID[0x1F] subleaf with specific
   topology level") Renamed the "INVALID" level to CPU_TOPO_LEVEL_UNKNOW.
   (Yanan)
---
 target/i386/cpu.c | 36 ++++++++++++++++++++++++++++++++++++
 target/i386/cpu.h |  7 +++++++
 2 files changed, 43 insertions(+)

Comments

Tejus GK April 30, 2024, 6:14 a.m. UTC | #1
On 24 Apr 2024, at 9:19 PM, Zhao Liu <zhao1.liu@intel.com> wrote:

@@ -2140,6 +2164,7 @@ static const CPUCaches epyc_milan_cache_info = {
        .lines_per_tag = 1,
        .self_init = 1,
        .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
    },
    .l1i_cache = &(CPUCacheInfo) {
        .type = INSTRUCTION_CACHE,
@@ -2152,6 +2177,7 @@ static const CPUCaches epyc_milan_cache_info = {
        .lines_per_tag = 1,
        .self_init = 1,
        .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
    },
    .l2_cache = &(CPUCacheInfo) {
        .type = UNIFIED_CACHE,
@@ -2162,6 +2188,7 @@ static const CPUCaches epyc_milan_cache_info = {
        .partitions = 1,
        .sets = 1024,
        .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
    },
    .l3_cache = &(CPUCacheInfo) {
        .type = UNIFIED_CACHE,
@@ -2175,6 +2202,7 @@ static const CPUCaches epyc_milan_cache_info = {
        .self_init = true,
        .inclusive = true,
        .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
    },
};


Hi Zhao and Babu, thank you for this patch. I have a slightly off-topic question about this patch. Firstly, many AMD CPU models have pre-defined cache sizes for the various cache levels; how are these values decided? I couldn't figure that out from the patches that introduced those changes. Secondly, there isn't any pre-defined cache size for Intel, and the legacy cache values are used. This value can be vastly different from what actual available caches might be. Is there any reason why something like that for Intel has yet to be introduced?

regards,
Tejus
Zhao Liu May 6, 2024, 7:32 a.m. UTC | #2
Hi Tejus,

(Also +Paolo/Daniel)

On Tue, Apr 30, 2024 at 06:14:52AM +0000, Tejus GK wrote:
> Date: Tue, 30 Apr 2024 06:14:52 +0000
> From: Tejus GK <tejus.gk@nutanix.com>
> Subject: Re: [PATCH v11 19/21] i386: Add cache topology info in CPUCacheInfo
> 
> 
> 
> On 24 Apr 2024, at 9:19 PM, Zhao Liu <zhao1.liu@intel.com> wrote:
> 
> @@ -2140,6 +2164,7 @@ static const CPUCaches epyc_milan_cache_info = {
>         .lines_per_tag = 1,
>         .self_init = 1,
>         .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>     },
>     .l1i_cache = &(CPUCacheInfo) {
>         .type = INSTRUCTION_CACHE,
> @@ -2152,6 +2177,7 @@ static const CPUCaches epyc_milan_cache_info = {
>         .lines_per_tag = 1,
>         .self_init = 1,
>         .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>     },
>     .l2_cache = &(CPUCacheInfo) {
>         .type = UNIFIED_CACHE,
> @@ -2162,6 +2188,7 @@ static const CPUCaches epyc_milan_cache_info = {
>         .partitions = 1,
>         .sets = 1024,
>         .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>     },
>     .l3_cache = &(CPUCacheInfo) {
>         .type = UNIFIED_CACHE,
> @@ -2175,6 +2202,7 @@ static const CPUCaches epyc_milan_cache_info = {
>         .self_init = true,
>         .inclusive = true,
>         .complex_indexing = true,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>     },
> };
> 
> 
> Hi Zhao and Babu, thank you for this patch. I have a slightly
> off-topic question about this patch. Firstly, many AMD CPU models
> have pre-defined cache sizes for the various cache levels; how are
> these values decided? I couldn't figure that out from the patches that
> introduced those changes.

I understand the AMD pre-defined cache idea started from this
discussion:

https://lore.kernel.org/qemu-devel/20180320175427.GU3417@localhost.localdomain/

From the discussion, it appears that AMD's cache information is encoded
according to the spec/datasheet for each generation of EPYC.

> Secondly, there isn't any pre-defined cache size for Intel, and the
> legacy cache values are used. This value can be vastly different from
> what actual available caches might be. Is there any reason why
> something like that for Intel has yet to be introduced?

Previously, there should be a lack of reason to introduce on Intel side,
or haven't met the relevant need/issue before.

I understand that AMD's reason is to make the cache information in the
Guest with a specific CPU model look more correct and to be able to
better emulate the Host environment.

Hi @Paolo and @Daniel, do you think Intel should also add correct cache
info for each CPU model?

Thanks,
Zhao
diff mbox series

Patch

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 76085ec007fa..5ba44c57666d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -551,6 +551,7 @@  static CPUCacheInfo legacy_l1d_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -565,6 +566,7 @@  static CPUCacheInfo legacy_l1d_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* L1 instruction cache: */
@@ -578,6 +580,7 @@  static CPUCacheInfo legacy_l1i_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -592,6 +595,7 @@  static CPUCacheInfo legacy_l1i_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* Level 2 unified cache: */
@@ -605,6 +609,7 @@  static CPUCacheInfo legacy_l2_cache = {
     .sets = 4096,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */
@@ -614,6 +619,7 @@  static CPUCacheInfo legacy_l2_cache_cpuid2 = {
     .size = 2 * MiB,
     .line_size = 64,
     .associativity = 8,
+    .share_level = CPU_TOPO_LEVEL_INVALID,
 };
 
 
@@ -627,6 +633,7 @@  static CPUCacheInfo legacy_l2_cache_amd = {
     .associativity = 16,
     .sets = 512,
     .partitions = 1,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* Level 3 unified cache: */
@@ -642,6 +649,7 @@  static CPUCacheInfo legacy_l3_cache = {
     .self_init = true,
     .inclusive = true,
     .complex_indexing = true,
+    .share_level = CPU_TOPO_LEVEL_DIE,
 };
 
 /* TLB definitions: */
@@ -1940,6 +1948,7 @@  static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -1952,6 +1961,7 @@  static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1962,6 +1972,7 @@  static const CPUCaches epyc_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1975,6 +1986,7 @@  static const CPUCaches epyc_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -1990,6 +2002,7 @@  static CPUCaches epyc_v4_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2002,6 +2015,7 @@  static CPUCaches epyc_v4_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2012,6 +2026,7 @@  static CPUCaches epyc_v4_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2025,6 +2040,7 @@  static CPUCaches epyc_v4_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2040,6 +2056,7 @@  static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2052,6 +2069,7 @@  static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2062,6 +2080,7 @@  static const CPUCaches epyc_rome_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2075,6 +2094,7 @@  static const CPUCaches epyc_rome_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2090,6 +2110,7 @@  static const CPUCaches epyc_rome_v3_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2102,6 +2123,7 @@  static const CPUCaches epyc_rome_v3_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2112,6 +2134,7 @@  static const CPUCaches epyc_rome_v3_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2125,6 +2148,7 @@  static const CPUCaches epyc_rome_v3_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2140,6 +2164,7 @@  static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2152,6 +2177,7 @@  static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2162,6 +2188,7 @@  static const CPUCaches epyc_milan_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2175,6 +2202,7 @@  static const CPUCaches epyc_milan_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2190,6 +2218,7 @@  static const CPUCaches epyc_milan_v2_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2202,6 +2231,7 @@  static const CPUCaches epyc_milan_v2_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2212,6 +2242,7 @@  static const CPUCaches epyc_milan_v2_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2225,6 +2256,7 @@  static const CPUCaches epyc_milan_v2_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2240,6 +2272,7 @@  static const CPUCaches epyc_genoa_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2252,6 +2285,7 @@  static const CPUCaches epyc_genoa_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2262,6 +2296,7 @@  static const CPUCaches epyc_genoa_cache_info = {
         .partitions = 1,
         .sets = 2048,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2275,6 +2310,7 @@  static const CPUCaches epyc_genoa_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 042eec417d0f..a28ba33365b4 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1590,6 +1590,13 @@  typedef struct CPUCacheInfo {
      * address bits.  CPUID[4].EDX[bit 2].
      */
     bool complex_indexing;
+
+    /*
+     * Cache Topology. The level that cache is shared in.
+     * Used to encode CPUID[4].EAX[bits 25:14] or
+     * CPUID[0x8000001D].EAX[bits 25:14].
+     */
+    enum CPUTopoLevel share_level;
 } CPUCacheInfo;