Message ID | 20230723185909.441455-1-nospam@kota.moe (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | target/i386: Fix reporting of CPU dies when nr_cores=nr_threads=1 | expand |
On 7/24/2023 2:59 AM, 小太 wrote: > When QEMU is started with `-smp D,sockets=1,dies=D,cores=1,threads=1` (that > is, 1 socket with D dies but each die contains just a single thread), both > Linux and Windows guests incorrectly interprets the system as having D > sockets with 1 die each > > Ultimately this is caused by various CPUID leaves not being die-aware in > their "threads per socket" calculations, so this patch fixes that > > These changes are referenced to the AMD PPR for Family 19h Model 01h (Milan) > and Family 17h Model 01h (Naples) manuals: > - CPUID_Fn00000001_EBX[23:16]: Number of threads in the processor > (Core::X86::Cpuid::SizeId[NC] + 1) > - CPUID_Fn0000000B_EBX_x01[15:0]: Number of logical cores in processor > socket (not present until Rome) > - CPUID_Fn80000001_ECX[1]: Multi core product > (Core::X86::Cpuid::SizeId[NC] != 0) > - CPUID_Fn80000008_ECX[7:0]: The number of threads in the package - 1 > (Core::X86::Cpuid::SizeId[NC]) > > Note there are two remaining occurences that I didn't touch: > - CPUID_Fn8000001E_ECX[10:8]: Always 0 (1 node per processor) for Milan. > But for Naples, it can also be 2 or 4 nodes > where each node is defined as one or two > CCXes (CCD?). But Milan also has multiple > CCXes, so clearly the definition of a node is > different from model to model, so I've left > it untouched. (QEMU seems to use the Naples > definition) > - MSR_CORE_THREAD_COUNT: This MSR doesn't exist on Milan or Naples Is this patch specific to AMD CPU type? what's situation for Intel CPU?
On Thu, 27 Jul 2023 at 11:25, Xiaoyao Li <xiaoyao.li@intel.com> wrote: > > On 7/24/2023 2:59 AM, 小太 wrote: > > When QEMU is started with `-smp D,sockets=1,dies=D,cores=1,threads=1` (that > > is, 1 socket with D dies but each die contains just a single thread), both > > Linux and Windows guests incorrectly interprets the system as having D > > sockets with 1 die each > > > > Ultimately this is caused by various CPUID leaves not being die-aware in > > their "threads per socket" calculations, so this patch fixes that > > > > These changes are referenced to the AMD PPR for Family 19h Model 01h (Milan) > > and Family 17h Model 01h (Naples) manuals: > > - CPUID_Fn00000001_EBX[23:16]: Number of threads in the processor > > (Core::X86::Cpuid::SizeId[NC] + 1) > > - CPUID_Fn0000000B_EBX_x01[15:0]: Number of logical cores in processor > > socket (not present until Rome) > > - CPUID_Fn80000001_ECX[1]: Multi core product > > (Core::X86::Cpuid::SizeId[NC] != 0) > > - CPUID_Fn80000008_ECX[7:0]: The number of threads in the package - 1 > > (Core::X86::Cpuid::SizeId[NC]) > > > > Note there are two remaining occurences that I didn't touch: > > - CPUID_Fn8000001E_ECX[10:8]: Always 0 (1 node per processor) for Milan. > > But for Naples, it can also be 2 or 4 nodes > > where each node is defined as one or two > > CCXes (CCD?). But Milan also has multiple > > CCXes, so clearly the definition of a node is > > different from model to model, so I've left > > it untouched. (QEMU seems to use the Naples > > definition) > > - MSR_CORE_THREAD_COUNT: This MSR doesn't exist on Milan or Naples > > Is this patch specific to AMD CPU type? what's situation for Intel CPU? I don't have a MCM Intel CPU to confirm against, but according to "Intel Architectures Software Developer’s Manual": - 01h EBX[23:16]: Maximum number of addressable IDs for logical processors in this physical package - 0Bh EBX[15:0]: The number of logical processors across all instances of this domain within the next higher-scoped domain. Note: The 0Bh leaf caps at the "core" domain, so Intel prefers using the 1Fh leaf instead which supports modules/tiles/dies/sockets. But AMD doesn't support the 1Fh leaf yet - 80000001h ECX[1]: Reserved - 80000008h ECX: Reserved = 0 - 8000001Eh: Unsupported - MSR_CORE_THREAD_COUNT[31:16]: The number of processor cores that are currently enabled in the physical package So the changes seem compatible with Intel, though MSR_CORE_THREAD_COUNT should be updated to support dies as well. But in the absence of a MCM Intel CPU to test with, I don't feel comfortable making that change
On Thu, 27 Jul 2023 at 19:16, 小太 <nospam@kota.moe> wrote: > > On Thu, 27 Jul 2023 at 11:25, Xiaoyao Li <xiaoyao.li@intel.com> wrote: > > > > On 7/24/2023 2:59 AM, 小太 wrote: > > > When QEMU is started with `-smp D,sockets=1,dies=D,cores=1,threads=1` (that > > > is, 1 socket with D dies but each die contains just a single thread), both > > > Linux and Windows guests incorrectly interprets the system as having D > > > sockets with 1 die each > > > > > > Ultimately this is caused by various CPUID leaves not being die-aware in > > > their "threads per socket" calculations, so this patch fixes that > > > > > > These changes are referenced to the AMD PPR for Family 19h Model 01h (Milan) > > > and Family 17h Model 01h (Naples) manuals: > > > - CPUID_Fn00000001_EBX[23:16]: Number of threads in the processor > > > (Core::X86::Cpuid::SizeId[NC] + 1) > > > - CPUID_Fn0000000B_EBX_x01[15:0]: Number of logical cores in processor > > > socket (not present until Rome) > > > - CPUID_Fn80000001_ECX[1]: Multi core product > > > (Core::X86::Cpuid::SizeId[NC] != 0) > > > - CPUID_Fn80000008_ECX[7:0]: The number of threads in the package - 1 > > > (Core::X86::Cpuid::SizeId[NC]) > > > > > > Note there are two remaining occurences that I didn't touch: > > > - CPUID_Fn8000001E_ECX[10:8]: Always 0 (1 node per processor) for Milan. > > > But for Naples, it can also be 2 or 4 nodes > > > where each node is defined as one or two > > > CCXes (CCD?). But Milan also has multiple > > > CCXes, so clearly the definition of a node is > > > different from model to model, so I've left > > > it untouched. (QEMU seems to use the Naples > > > definition) > > > - MSR_CORE_THREAD_COUNT: This MSR doesn't exist on Milan or Naples > > > > Is this patch specific to AMD CPU type? what's situation for Intel CPU? > > I don't have a MCM Intel CPU to confirm against, but according to "Intel > Architectures Software Developer’s Manual": > - 01h EBX[23:16]: Maximum number of addressable IDs for logical processors in > this physical package > - 0Bh EBX[15:0]: The number of logical processors across all instances of this > domain within the next higher-scoped domain. Note: The 0Bh > leaf caps at the "core" domain, so Intel prefers using the 1Fh > leaf instead which supports modules/tiles/dies/sockets. But > AMD doesn't support the 1Fh leaf yet > - 80000001h ECX[1]: Reserved > - 80000008h ECX: Reserved = 0 > - 8000001Eh: Unsupported > - MSR_CORE_THREAD_COUNT[31:16]: The number of processor cores that are > currently enabled in the physical package > > So the changes seem compatible with Intel, though MSR_CORE_THREAD_COUNT should > be updated to support dies as well. But in the absence of a MCM Intel CPU to > test with, I don't feel comfortable making that change Friendly ping on this patch (patchew link: https://patchew.org/QEMU/20230723185909.441455-1-nospam@kota.moe/)
On 7/24/2023 2:59 AM, 小太 wrote: > When QEMU is started with `-smp D,sockets=1,dies=D,cores=1,threads=1` (that > is, 1 socket with D dies but each die contains just a single thread), both > Linux and Windows guests incorrectly interprets the system as having D > sockets with 1 die each > > Ultimately this is caused by various CPUID leaves not being die-aware in > their "threads per socket" calculations, so this patch fixes that > > These changes are referenced to the AMD PPR for Family 19h Model 01h (Milan) > and Family 17h Model 01h (Naples) manuals: > - CPUID_Fn00000001_EBX[23:16]: Number of threads in the processor > (Core::X86::Cpuid::SizeId[NC] + 1) > - CPUID_Fn0000000B_EBX_x01[15:0]: Number of logical cores in processor > socket (not present until Rome) > - CPUID_Fn80000001_ECX[1]: Multi core product > (Core::X86::Cpuid::SizeId[NC] != 0) > - CPUID_Fn80000008_ECX[7:0]: The number of threads in the package - 1 > (Core::X86::Cpuid::SizeId[NC]) > > Note there are two remaining occurences that I didn't touch: > - CPUID_Fn8000001E_ECX[10:8]: Always 0 (1 node per processor) for Milan. > But for Naples, it can also be 2 or 4 nodes > where each node is defined as one or two > CCXes (CCD?). But Milan also has multiple > CCXes, so clearly the definition of a node is > different from model to model, so I've left > it untouched. (QEMU seems to use the Naples > definition) > - MSR_CORE_THREAD_COUNT: This MSR doesn't exist on Milan or Naples > > Signed-off-by: 小太 <nospam@kota.moe> > --- > target/i386/cpu.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index 97ad229d8b..6ff23fa590 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -6049,8 +6049,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, > *ecx |= CPUID_EXT_OSXSAVE; > } > *edx = env->features[FEAT_1_EDX]; > - if (cs->nr_cores * cs->nr_threads > 1) { > - *ebx |= (cs->nr_cores * cs->nr_threads) << 16; > + if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { > + *ebx |= (env->nr_dies * cs->nr_cores * cs->nr_threads) << 16; The nr_cores in CPUState means "Number of cores within this CPU package", so it doesn't need "nr_dies" here. Zhao Liu is fixing the calculation of nr_cores [1]. [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07187.html Thanks, Qian > *edx |= CPUID_HT; > } > if (!cpu->enable_pmu) { > @@ -6230,7 +6230,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, > break; > case 1: > *eax = apicid_pkg_offset(&topo_info); > - *ebx = cs->nr_cores * cs->nr_threads; > + *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads; > *ecx |= CPUID_TOPOLOGY_LEVEL_CORE; > break; > default: > @@ -6496,7 +6496,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, > * discards multiple thread information if it is set. > * So don't set it here for Intel to make Linux guests happy. > */ > - if (cs->nr_cores * cs->nr_threads > 1) { > + if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { > if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 || > env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 || > env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) { > @@ -6562,7 +6562,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, > *eax |= (cpu_x86_virtual_addr_width(env) << 8); > } > *ebx = env->features[FEAT_8000_0008_EBX]; > - if (cs->nr_cores * cs->nr_threads > 1) { > + if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { > /* > * Bits 15:12 is "The number of bits in the initial > * Core::X86::Apic::ApicId[ApicId] value that indicate > @@ -6570,7 +6570,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, > * Bits 7:0 is "The number of threads in the package is NC+1" > */ > *ecx = (apicid_pkg_offset(&topo_info) << 12) | > - ((cs->nr_cores * cs->nr_threads) - 1); > + ((env->nr_dies * cs->nr_cores * cs->nr_threads) - 1); > } else { > *ecx = 0; > }
diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 97ad229d8b..6ff23fa590 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6049,8 +6049,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, *ecx |= CPUID_EXT_OSXSAVE; } *edx = env->features[FEAT_1_EDX]; - if (cs->nr_cores * cs->nr_threads > 1) { - *ebx |= (cs->nr_cores * cs->nr_threads) << 16; + if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { + *ebx |= (env->nr_dies * cs->nr_cores * cs->nr_threads) << 16; *edx |= CPUID_HT; } if (!cpu->enable_pmu) { @@ -6230,7 +6230,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, break; case 1: *eax = apicid_pkg_offset(&topo_info); - *ebx = cs->nr_cores * cs->nr_threads; + *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads; *ecx |= CPUID_TOPOLOGY_LEVEL_CORE; break; default: @@ -6496,7 +6496,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, * discards multiple thread information if it is set. * So don't set it here for Intel to make Linux guests happy. */ - if (cs->nr_cores * cs->nr_threads > 1) { + if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 || env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 || env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) { @@ -6562,7 +6562,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, *eax |= (cpu_x86_virtual_addr_width(env) << 8); } *ebx = env->features[FEAT_8000_0008_EBX]; - if (cs->nr_cores * cs->nr_threads > 1) { + if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { /* * Bits 15:12 is "The number of bits in the initial * Core::X86::Apic::ApicId[ApicId] value that indicate @@ -6570,7 +6570,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, * Bits 7:0 is "The number of threads in the package is NC+1" */ *ecx = (apicid_pkg_offset(&topo_info) << 12) | - ((cs->nr_cores * cs->nr_threads) - 1); + ((env->nr_dies * cs->nr_cores * cs->nr_threads) - 1); } else { *ecx = 0; }
When QEMU is started with `-smp D,sockets=1,dies=D,cores=1,threads=1` (that is, 1 socket with D dies but each die contains just a single thread), both Linux and Windows guests incorrectly interprets the system as having D sockets with 1 die each Ultimately this is caused by various CPUID leaves not being die-aware in their "threads per socket" calculations, so this patch fixes that These changes are referenced to the AMD PPR for Family 19h Model 01h (Milan) and Family 17h Model 01h (Naples) manuals: - CPUID_Fn00000001_EBX[23:16]: Number of threads in the processor (Core::X86::Cpuid::SizeId[NC] + 1) - CPUID_Fn0000000B_EBX_x01[15:0]: Number of logical cores in processor socket (not present until Rome) - CPUID_Fn80000001_ECX[1]: Multi core product (Core::X86::Cpuid::SizeId[NC] != 0) - CPUID_Fn80000008_ECX[7:0]: The number of threads in the package - 1 (Core::X86::Cpuid::SizeId[NC]) Note there are two remaining occurences that I didn't touch: - CPUID_Fn8000001E_ECX[10:8]: Always 0 (1 node per processor) for Milan. But for Naples, it can also be 2 or 4 nodes where each node is defined as one or two CCXes (CCD?). But Milan also has multiple CCXes, so clearly the definition of a node is different from model to model, so I've left it untouched. (QEMU seems to use the Naples definition) - MSR_CORE_THREAD_COUNT: This MSR doesn't exist on Milan or Naples Signed-off-by: 小太 <nospam@kota.moe> --- target/i386/cpu.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)