diff mbox series

[v3,03/17] softmmu: Fix CPUSTATE.nr_cores' calculation

Message ID 20230801103527.397756-4-zhao1.liu@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series Support smp.clusters for x86 | expand

Commit Message

Zhao Liu Aug. 1, 2023, 10:35 a.m. UTC
From: Zhuocheng Ding <zhuocheng.ding@intel.com>

From CPUState.nr_cores' comment, it represents "number of cores within
this CPU package".

After 003f230e37d7 ("machine: Tweak the order of topology members in
struct CpuTopology"), the meaning of smp.cores changed to "the number of
cores in one die", but this commit missed to change CPUState.nr_cores'
caculation, so that CPUState.nr_cores became wrong and now it
misses to consider numbers of clusters and dies.

At present, only i386 is using CPUState.nr_cores.

But as for i386, which supports die level, the uses of CPUState.nr_cores
are very confusing:

Early uses are based on the meaning of "cores per package" (before die
is introduced into i386), and later uses are based on "cores per die"
(after die's introduction).

This difference is due to that commit a94e1428991f ("target/i386: Add
CPUID.1F generation support for multi-dies PCMachine") misunderstood
that CPUState.nr_cores means "cores per die" when caculated
CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
wrong understanding.

With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
the result of CPUState.nr_cores is "cores per die", thus the original
uses of CPUState.cores based on the meaning of "cores per package" are
wrong when mutiple dies exist:
1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
   incorrect because it expects "cpus per package" but now the
   result is "cpus per die".
2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
   EAX[bits 31:26] is incorrect because they expect "cpus per package"
   but now the result is "cpus per die". The error not only impacts the
   EAX caculation in cache_info_passthrough case, but also impacts other
   cases of setting cache topology for Intel CPU according to cpu
   topology (specifically, the incoming parameter "num_cores" expects
   "cores per package" in encode_cache_cpuid4()).
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
   15:00] is incorrect because the EBX of 0BH.01H (core level) expects
   "cpus per package", which may be different with 1FH.01H (The reason
   is 1FH can support more levels. For QEMU, 1FH also supports die,
   1FH.01H:EBX[bits 15:00] expects "cpus per die").
4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
   caculated, here "cpus per package" is expected to be checked, but in
   fact, now it checks "cpus per die". Though "cpus per die" also works
   for this code logic, this isn't consistent with AMD's APM.
5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
   "cpus per package" but it obtains "cpus per die".
6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
   kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
   helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
   MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
   package", but in these functions, it obtains "cpus per die" and
   "cores per die".

On the other hand, these uses are correct now (they are added in/after
a94e1428991f):
1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
   meets the actual meaning of CPUState.nr_cores ("cores per die").
2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
   04H's caculation) considers number of dies, so it's correct.
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
   15:00] needs "cpus per die" and it gets the correct result, and
   CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".

When CPUState.nr_cores is correctly changed to "cores per package" again
, the above errors will be fixed without extra work, but the "currently"
correct cases will go wrong and need special handling to pass correct
"cpus/cores per die" they want.

Thus in this patch, we fix CPUState.nr_cores' caculation to fit the
original meaning "cores per package", as well as changing calculation of
topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.

In addition, in the nr_threads' comment, specify it represents the
number of threads in the "core" to avoid confusion, and also add comment
for nr_dies in CPUX86State.

Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v2:
 * Use wrapped helper to get cores per socket in qemu_init_vcpu().
Changes since v1:
 * Add comment for nr_dies in CPUX86State. (Yanan)
---
 include/hw/core/cpu.h | 2 +-
 softmmu/cpus.c        | 2 +-
 target/i386/cpu.c     | 9 ++++-----
 target/i386/cpu.h     | 1 +
 4 files changed, 7 insertions(+), 7 deletions(-)

Comments

Moger, Babu Aug. 2, 2023, 3:25 p.m. UTC | #1
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> 
> From CPUState.nr_cores' comment, it represents "number of cores within
> this CPU package".
> 
> After 003f230e37d7 ("machine: Tweak the order of topology members in
> struct CpuTopology"), the meaning of smp.cores changed to "the number of
> cores in one die", but this commit missed to change CPUState.nr_cores'
> caculation, so that CPUState.nr_cores became wrong and now it
> misses to consider numbers of clusters and dies.
> 
> At present, only i386 is using CPUState.nr_cores.
> 
> But as for i386, which supports die level, the uses of CPUState.nr_cores
> are very confusing:
> 
> Early uses are based on the meaning of "cores per package" (before die
> is introduced into i386), and later uses are based on "cores per die"
> (after die's introduction).
> 
> This difference is due to that commit a94e1428991f ("target/i386: Add
> CPUID.1F generation support for multi-dies PCMachine") misunderstood
> that CPUState.nr_cores means "cores per die" when caculated
> CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> wrong understanding.
> 
> With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> the result of CPUState.nr_cores is "cores per die", thus the original
> uses of CPUState.cores based on the meaning of "cores per package" are
> wrong when mutiple dies exist:
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
>    incorrect because it expects "cpus per package" but now the
>    result is "cpus per die".
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
>    EAX[bits 31:26] is incorrect because they expect "cpus per package"
>    but now the result is "cpus per die". The error not only impacts the
>    EAX caculation in cache_info_passthrough case, but also impacts other
>    cases of setting cache topology for Intel CPU according to cpu
>    topology (specifically, the incoming parameter "num_cores" expects
>    "cores per package" in encode_cache_cpuid4()).
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
>    15:00] is incorrect because the EBX of 0BH.01H (core level) expects
>    "cpus per package", which may be different with 1FH.01H (The reason
>    is 1FH can support more levels. For QEMU, 1FH also supports die,
>    1FH.01H:EBX[bits 15:00] expects "cpus per die").
> 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
>    caculated, here "cpus per package" is expected to be checked, but in
>    fact, now it checks "cpus per die". Though "cpus per die" also works
>    for this code logic, this isn't consistent with AMD's APM.
> 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
>    "cpus per package" but it obtains "cpus per die".
> 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
>    kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
>    helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
>    MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
>    package", but in these functions, it obtains "cpus per die" and
>    "cores per die".
> 
> On the other hand, these uses are correct now (they are added in/after
> a94e1428991f):
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
>    meets the actual meaning of CPUState.nr_cores ("cores per die").
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
>    04H's caculation) considers number of dies, so it's correct.
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
>    15:00] needs "cpus per die" and it gets the correct result, and
>    CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> 
> When CPUState.nr_cores is correctly changed to "cores per package" again
> , the above errors will be fixed without extra work, but the "currently"
> correct cases will go wrong and need special handling to pass correct
> "cpus/cores per die" they want.
> 
> Thus in this patch, we fix CPUState.nr_cores' caculation to fit the

s/Thus in this patch, we fix CPUState.nr_cores' caculation/Fix
CPUState.nr_cores' calculation/


Describe your changes in imperative mood also spell check.
Thanks
Babu


> original meaning "cores per package", as well as changing calculation of
> topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> 
> In addition, in the nr_threads' comment, specify it represents the
> number of threads in the "core" to avoid confusion, and also add comment
> for nr_dies in CPUX86State.
> 
> Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v2:
>  * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> Changes since v1:
>  * Add comment for nr_dies in CPUX86State. (Yanan)
> ---
>  include/hw/core/cpu.h | 2 +-
>  softmmu/cpus.c        | 2 +-
>  target/i386/cpu.c     | 9 ++++-----
>  target/i386/cpu.h     | 1 +
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index fdcbe8735258..57f4d50ace72 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -277,7 +277,7 @@ struct qemu_work_item;
>   *   See TranslationBlock::TCG CF_CLUSTER_MASK.
>   * @tcg_cflags: Pre-computed cflags for this cpu.
>   * @nr_cores: Number of cores within this CPU package.
> - * @nr_threads: Number of threads within this CPU.
> + * @nr_threads: Number of threads within this CPU core.
>   * @running: #true if CPU is currently running (lockless).
>   * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
>   * valid under cpu_list_lock.
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index fed20ffb5dd2..984558d7b245 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
>  {
>      MachineState *ms = MACHINE(qdev_get_machine());
>  
> -    cpu->nr_cores = ms->smp.cores;
> +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>      cpu->nr_threads =  ms->smp.threads;
>      cpu->stopped = true;
>      cpu->random_seed = qemu_guest_random_seed_thread_part1();
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 97ad229d8ba3..50613cd04612 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>      X86CPUTopoInfo topo_info;
>  
>      topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores;
> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>      topo_info.threads_per_core = cs->nr_threads;
>  
>      /* Calculate & apply limits for different index ranges */
> @@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               */
>              if (*eax & 31) {
>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> -                                       cs->nr_threads;
> +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>                  if (cs->nr_cores > 1) {
>                      *eax &= ~0xFC000000;
>                      *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> @@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>              break;
>          case 1:
>              *eax = apicid_die_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>              break;
>          case 2:
>              *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> +            *ebx = cs->nr_cores * cs->nr_threads;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>              break;
>          default:
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index e0771a10433b..7638128d59cc 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
>  
>      TPRAccess tpr_access_type;
>  
> +    /* Number of dies within this CPU package. */
>      unsigned nr_dies;
>  } CPUX86State;
>
Zhao Liu Aug. 4, 2023, 8:16 a.m. UTC | #2
Hi Babu,

On Wed, Aug 02, 2023 at 10:25:58AM -0500, Moger, Babu wrote:
> Date: Wed, 2 Aug 2023 10:25:58 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
> 
> Hi Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > From CPUState.nr_cores' comment, it represents "number of cores within
> > this CPU package".
> > 
> > After 003f230e37d7 ("machine: Tweak the order of topology members in
> > struct CpuTopology"), the meaning of smp.cores changed to "the number of
> > cores in one die", but this commit missed to change CPUState.nr_cores'
> > caculation, so that CPUState.nr_cores became wrong and now it
> > misses to consider numbers of clusters and dies.
> > 
> > At present, only i386 is using CPUState.nr_cores.
> > 
> > But as for i386, which supports die level, the uses of CPUState.nr_cores
> > are very confusing:
> > 
> > Early uses are based on the meaning of "cores per package" (before die
> > is introduced into i386), and later uses are based on "cores per die"
> > (after die's introduction).
> > 
> > This difference is due to that commit a94e1428991f ("target/i386: Add
> > CPUID.1F generation support for multi-dies PCMachine") misunderstood
> > that CPUState.nr_cores means "cores per die" when caculated
> > CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> > wrong understanding.
> > 
> > With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> > the result of CPUState.nr_cores is "cores per die", thus the original
> > uses of CPUState.cores based on the meaning of "cores per package" are
> > wrong when mutiple dies exist:
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
> >    incorrect because it expects "cpus per package" but now the
> >    result is "cpus per die".
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
> >    EAX[bits 31:26] is incorrect because they expect "cpus per package"
> >    but now the result is "cpus per die". The error not only impacts the
> >    EAX caculation in cache_info_passthrough case, but also impacts other
> >    cases of setting cache topology for Intel CPU according to cpu
> >    topology (specifically, the incoming parameter "num_cores" expects
> >    "cores per package" in encode_cache_cpuid4()).
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
> >    15:00] is incorrect because the EBX of 0BH.01H (core level) expects
> >    "cpus per package", which may be different with 1FH.01H (The reason
> >    is 1FH can support more levels. For QEMU, 1FH also supports die,
> >    1FH.01H:EBX[bits 15:00] expects "cpus per die").
> > 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
> >    caculated, here "cpus per package" is expected to be checked, but in
> >    fact, now it checks "cpus per die". Though "cpus per die" also works
> >    for this code logic, this isn't consistent with AMD's APM.
> > 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
> >    "cpus per package" but it obtains "cpus per die".
> > 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
> >    kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
> >    helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
> >    MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
> >    package", but in these functions, it obtains "cpus per die" and
> >    "cores per die".
> > 
> > On the other hand, these uses are correct now (they are added in/after
> > a94e1428991f):
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
> >    meets the actual meaning of CPUState.nr_cores ("cores per die").
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
> >    04H's caculation) considers number of dies, so it's correct.
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
> >    15:00] needs "cpus per die" and it gets the correct result, and
> >    CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> > 
> > When CPUState.nr_cores is correctly changed to "cores per package" again
> > , the above errors will be fixed without extra work, but the "currently"
> > correct cases will go wrong and need special handling to pass correct
> > "cpus/cores per die" they want.
> > 
> > Thus in this patch, we fix CPUState.nr_cores' caculation to fit the
> 
> s/Thus in this patch, we fix CPUState.nr_cores' caculation/Fix
> CPUState.nr_cores' calculation/

Thanks!

> 
> 
> Describe your changes in imperative mood also spell check.

Thanks for your suggestion!

-Zhao

> 
> 
> > original meaning "cores per package", as well as changing calculation of
> > topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> > 
> > In addition, in the nr_threads' comment, specify it represents the
> > number of threads in the "core" to avoid confusion, and also add comment
> > for nr_dies in CPUX86State.
> > 
> > Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> > Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v2:
> >  * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> > Changes since v1:
> >  * Add comment for nr_dies in CPUX86State. (Yanan)
> > ---
> >  include/hw/core/cpu.h | 2 +-
> >  softmmu/cpus.c        | 2 +-
> >  target/i386/cpu.c     | 9 ++++-----
> >  target/i386/cpu.h     | 1 +
> >  4 files changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index fdcbe8735258..57f4d50ace72 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -277,7 +277,7 @@ struct qemu_work_item;
> >   *   See TranslationBlock::TCG CF_CLUSTER_MASK.
> >   * @tcg_cflags: Pre-computed cflags for this cpu.
> >   * @nr_cores: Number of cores within this CPU package.
> > - * @nr_threads: Number of threads within this CPU.
> > + * @nr_threads: Number of threads within this CPU core.
> >   * @running: #true if CPU is currently running (lockless).
> >   * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
> >   * valid under cpu_list_lock.
> > diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> > index fed20ffb5dd2..984558d7b245 100644
> > --- a/softmmu/cpus.c
> > +++ b/softmmu/cpus.c
> > @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
> >  {
> >      MachineState *ms = MACHINE(qdev_get_machine());
> >  
> > -    cpu->nr_cores = ms->smp.cores;
> > +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> >      cpu->nr_threads =  ms->smp.threads;
> >      cpu->stopped = true;
> >      cpu->random_seed = qemu_guest_random_seed_thread_part1();
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 97ad229d8ba3..50613cd04612 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >      X86CPUTopoInfo topo_info;
> >  
> >      topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores;
> > +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> >      topo_info.threads_per_core = cs->nr_threads;
> >  
> >      /* Calculate & apply limits for different index ranges */
> > @@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               */
> >              if (*eax & 31) {
> >                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> > -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> > -                                       cs->nr_threads;
> > +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> >                  if (cs->nr_cores > 1) {
> >                      *eax &= ~0xFC000000;
> >                      *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > @@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >              break;
> >          case 1:
> >              *eax = apicid_die_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >              break;
> >          case 2:
> >              *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> > +            *ebx = cs->nr_cores * cs->nr_threads;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
> >              break;
> >          default:
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index e0771a10433b..7638128d59cc 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
> >  
> >      TPRAccess tpr_access_type;
> >  
> > +    /* Number of dies within this CPU package. */
> >      unsigned nr_dies;
> >  } CPUX86State;
> >  
> 
> -- 
> Thanks
> Babu Moger
Xiaoyao Li Aug. 7, 2023, 7:03 a.m. UTC | #3
On 8/1/2023 6:35 PM, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> 
>  From CPUState.nr_cores' comment, it represents "number of cores within
> this CPU package".
> 
> After 003f230e37d7 ("machine: Tweak the order of topology members in
> struct CpuTopology"), the meaning of smp.cores changed to "the number of
> cores in one die", but this commit missed to change CPUState.nr_cores'
> caculation, so that CPUState.nr_cores became wrong and now it
> misses to consider numbers of clusters and dies.
> 
> At present, only i386 is using CPUState.nr_cores.
> 
> But as for i386, which supports die level, the uses of CPUState.nr_cores
> are very confusing:
> 
> Early uses are based on the meaning of "cores per package" (before die
> is introduced into i386), and later uses are based on "cores per die"
> (after die's introduction).
> 
> This difference is due to that commit a94e1428991f ("target/i386: Add
> CPUID.1F generation support for multi-dies PCMachine") misunderstood
> that CPUState.nr_cores means "cores per die" when caculated
> CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> wrong understanding.
> 
> With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> the result of CPUState.nr_cores is "cores per die", thus the original
> uses of CPUState.cores based on the meaning of "cores per package" are
> wrong when mutiple dies exist:
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
>     incorrect because it expects "cpus per package" but now the
>     result is "cpus per die".
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
>     EAX[bits 31:26] is incorrect because they expect "cpus per package"
>     but now the result is "cpus per die". The error not only impacts the
>     EAX caculation in cache_info_passthrough case, but also impacts other
>     cases of setting cache topology for Intel CPU according to cpu
>     topology (specifically, the incoming parameter "num_cores" expects
>     "cores per package" in encode_cache_cpuid4()).
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
>     15:00] is incorrect because the EBX of 0BH.01H (core level) expects
>     "cpus per package", which may be different with 1FH.01H (The reason
>     is 1FH can support more levels. For QEMU, 1FH also supports die,
>     1FH.01H:EBX[bits 15:00] expects "cpus per die").
> 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
>     caculated, here "cpus per package" is expected to be checked, but in
>     fact, now it checks "cpus per die". Though "cpus per die" also works
>     for this code logic, this isn't consistent with AMD's APM.
> 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
>     "cpus per package" but it obtains "cpus per die".
> 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
>     kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
>     helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
>     MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
>     package", but in these functions, it obtains "cpus per die" and
>     "cores per die".
> 
> On the other hand, these uses are correct now (they are added in/after
> a94e1428991f):
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
>     meets the actual meaning of CPUState.nr_cores ("cores per die").
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
>     04H's caculation) considers number of dies, so it's correct.
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
>     15:00] needs "cpus per die" and it gets the correct result, and
>     CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> 
> When CPUState.nr_cores is correctly changed to "cores per package" again
> , the above errors will be fixed without extra work, but the "currently"
> correct cases will go wrong and need special handling to pass correct
> "cpus/cores per die" they want.
> 
> Thus in this patch, we fix CPUState.nr_cores' caculation to fit the
> original meaning "cores per package", as well as changing calculation of
> topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> 
> In addition, in the nr_threads' comment, specify it represents the
> number of threads in the "core" to avoid confusion, and also add comment
> for nr_dies in CPUX86State.
> 
> Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v2:
>   * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> Changes since v1:
>   * Add comment for nr_dies in CPUX86State. (Yanan)
> ---
>   include/hw/core/cpu.h | 2 +-
>   softmmu/cpus.c        | 2 +-
>   target/i386/cpu.c     | 9 ++++-----
>   target/i386/cpu.h     | 1 +
>   4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index fdcbe8735258..57f4d50ace72 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -277,7 +277,7 @@ struct qemu_work_item;
>    *   See TranslationBlock::TCG CF_CLUSTER_MASK.
>    * @tcg_cflags: Pre-computed cflags for this cpu.
>    * @nr_cores: Number of cores within this CPU package.
> - * @nr_threads: Number of threads within this CPU.
> + * @nr_threads: Number of threads within this CPU core.

This seems to be better as a separate patch.

Besides, if could, I think it's better to rename them to 
nr_threads_per_core and nr_cores_per_pkg.

Side topic. Ideally, it seems redundant to maintain the @nr_threads and 
@nr_cores in struct CPUState because we have ms->smp to carry all the 
topology info. However, various codes across different arches use 
@nr_threads and @nr_cores. It looks painful to get rid of it or use 
something else to replace it.

>    * @running: #true if CPU is currently running (lockless).
>    * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
>    * valid under cpu_list_lock.
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index fed20ffb5dd2..984558d7b245 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
>   {
>       MachineState *ms = MACHINE(qdev_get_machine());
>   
> -    cpu->nr_cores = ms->smp.cores;
> +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>       cpu->nr_threads =  ms->smp.threads;
>       cpu->stopped = true;
>       cpu->random_seed = qemu_guest_random_seed_thread_part1();
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 97ad229d8ba3..50613cd04612 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>       X86CPUTopoInfo topo_info;
>   
>       topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores;
> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;

This and below things make me think that, it looks ugly that @nr_dies is 
added separately in struct CPUArchState for i386 because CPUState only 
has @nr_cores and nr_threads. Further, for i386, it defines a specific 
struct X86CPUTopoInfo to contain topology info when setting up CPUID. To 
me, struct X86CPUTopoInfo is redundant as struct CpuTopology.

maybe we can carry a struct CpuTopology in CPUState, so that we can drop 
@nr_threads, @nr_cores in CPUState for all ARCHes, and @nr_dies in 
struct CPUArchState for i386. As well, topo_info can be dropped here.

>       topo_info.threads_per_core = cs->nr_threads;
>   
>       /* Calculate & apply limits for different index ranges */
> @@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                */
>               if (*eax & 31) {
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> -                                       cs->nr_threads;
> +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>                   if (cs->nr_cores > 1) {
>                       *eax &= ~0xFC000000;
>                       *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> @@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               break;
>           case 1:
>               *eax = apicid_die_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>               break;
>           case 2:
>               *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> +            *ebx = cs->nr_cores * cs->nr_threads;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>               break;
>           default:
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index e0771a10433b..7638128d59cc 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
>   
>       TPRAccess tpr_access_type;
>   
> +    /* Number of dies within this CPU package. */
>       unsigned nr_dies;
>   } CPUX86State;
>
Zhao Liu Aug. 7, 2023, 7:53 a.m. UTC | #4
Hi Xiaoyao,

On Mon, Aug 07, 2023 at 03:03:13PM +0800, Xiaoyao Li wrote:
> Date: Mon, 7 Aug 2023 15:03:13 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
> 
> On 8/1/2023 6:35 PM, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> >  From CPUState.nr_cores' comment, it represents "number of cores within
> > this CPU package".
> > 
> > After 003f230e37d7 ("machine: Tweak the order of topology members in
> > struct CpuTopology"), the meaning of smp.cores changed to "the number of
> > cores in one die", but this commit missed to change CPUState.nr_cores'
> > caculation, so that CPUState.nr_cores became wrong and now it
> > misses to consider numbers of clusters and dies.
> > 
> > At present, only i386 is using CPUState.nr_cores.
> > 
> > But as for i386, which supports die level, the uses of CPUState.nr_cores
> > are very confusing:
> > 
> > Early uses are based on the meaning of "cores per package" (before die
> > is introduced into i386), and later uses are based on "cores per die"
> > (after die's introduction).
> > 
> > This difference is due to that commit a94e1428991f ("target/i386: Add
> > CPUID.1F generation support for multi-dies PCMachine") misunderstood
> > that CPUState.nr_cores means "cores per die" when caculated
> > CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> > wrong understanding.
> > 
> > With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> > the result of CPUState.nr_cores is "cores per die", thus the original
> > uses of CPUState.cores based on the meaning of "cores per package" are
> > wrong when mutiple dies exist:
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
> >     incorrect because it expects "cpus per package" but now the
> >     result is "cpus per die".
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
> >     EAX[bits 31:26] is incorrect because they expect "cpus per package"
> >     but now the result is "cpus per die". The error not only impacts the
> >     EAX caculation in cache_info_passthrough case, but also impacts other
> >     cases of setting cache topology for Intel CPU according to cpu
> >     topology (specifically, the incoming parameter "num_cores" expects
> >     "cores per package" in encode_cache_cpuid4()).
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
> >     15:00] is incorrect because the EBX of 0BH.01H (core level) expects
> >     "cpus per package", which may be different with 1FH.01H (The reason
> >     is 1FH can support more levels. For QEMU, 1FH also supports die,
> >     1FH.01H:EBX[bits 15:00] expects "cpus per die").
> > 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
> >     caculated, here "cpus per package" is expected to be checked, but in
> >     fact, now it checks "cpus per die". Though "cpus per die" also works
> >     for this code logic, this isn't consistent with AMD's APM.
> > 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
> >     "cpus per package" but it obtains "cpus per die".
> > 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
> >     kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
> >     helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
> >     MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
> >     package", but in these functions, it obtains "cpus per die" and
> >     "cores per die".
> > 
> > On the other hand, these uses are correct now (they are added in/after
> > a94e1428991f):
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
> >     meets the actual meaning of CPUState.nr_cores ("cores per die").
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
> >     04H's caculation) considers number of dies, so it's correct.
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
> >     15:00] needs "cpus per die" and it gets the correct result, and
> >     CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> > 
> > When CPUState.nr_cores is correctly changed to "cores per package" again
> > , the above errors will be fixed without extra work, but the "currently"
> > correct cases will go wrong and need special handling to pass correct
> > "cpus/cores per die" they want.
> > 
> > Thus in this patch, we fix CPUState.nr_cores' caculation to fit the
> > original meaning "cores per package", as well as changing calculation of
> > topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> > 
> > In addition, in the nr_threads' comment, specify it represents the
> > number of threads in the "core" to avoid confusion, and also add comment
> > for nr_dies in CPUX86State.
> > 
> > Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> > Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v2:
> >   * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> > Changes since v1:
> >   * Add comment for nr_dies in CPUX86State. (Yanan)
> > ---
> >   include/hw/core/cpu.h | 2 +-
> >   softmmu/cpus.c        | 2 +-
> >   target/i386/cpu.c     | 9 ++++-----
> >   target/i386/cpu.h     | 1 +
> >   4 files changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index fdcbe8735258..57f4d50ace72 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -277,7 +277,7 @@ struct qemu_work_item;
> >    *   See TranslationBlock::TCG CF_CLUSTER_MASK.
> >    * @tcg_cflags: Pre-computed cflags for this cpu.
> >    * @nr_cores: Number of cores within this CPU package.
> > - * @nr_threads: Number of threads within this CPU.
> > + * @nr_threads: Number of threads within this CPU core.
> 
> This seems to be better as a separate patch.

Ok, I can spilt it into a new patch in v4.

> 
> Besides, if could, I think it's better to rename them to nr_threads_per_core
> and nr_cores_per_pkg.

Yeah, the names nr_threads and nr_cores are more historical, having been
around for a long time. With more CPU topology levels available today,
it's really worth reconsidering the names.

In the previous RFC of hybrid topology [1], I proposed the new structure
to replace nr_threads/nr_cores:

typedef struct TopologyState {
    int sockets;
    int cores_per_socket;
    int threads_per_socket;
    int dies_per_socket;
    int clusters_per_die;
    int cores_per_cluster;
    int threads_per_core;
} TopologyState;

I'm still working on hybrid topology, and I'll continue to follow up on
this cleanup! :-)

[1]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03235.html

> 
> Side topic. Ideally, it seems redundant to maintain the @nr_threads and
> @nr_cores in struct CPUState because we have ms->smp to carry all the
> topology info. However, various codes across different arches use
> @nr_threads and @nr_cores. It looks painful to get rid of it or use
> something else to replace it.

IIUC, user emulator needs the topology info in CPUState. System emulator
could use ms->smp directly.

> 
> >    * @running: #true if CPU is currently running (lockless).
> >    * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
> >    * valid under cpu_list_lock.
> > diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> > index fed20ffb5dd2..984558d7b245 100644
> > --- a/softmmu/cpus.c
> > +++ b/softmmu/cpus.c
> > @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
> >   {
> >       MachineState *ms = MACHINE(qdev_get_machine());
> > -    cpu->nr_cores = ms->smp.cores;
> > +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> >       cpu->nr_threads =  ms->smp.threads;
> >       cpu->stopped = true;
> >       cpu->random_seed = qemu_guest_random_seed_thread_part1();
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 97ad229d8ba3..50613cd04612 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >       X86CPUTopoInfo topo_info;
> >       topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores;
> > +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> 
> This and below things make me think that, it looks ugly that @nr_dies is
> added separately in struct CPUArchState for i386 because CPUState only has
> @nr_cores and nr_threads. Further, for i386, it defines a specific struct
> X86CPUTopoInfo to contain topology info when setting up CPUID. To me, struct
> X86CPUTopoInfo is redundant as struct CpuTopology.
> 
> maybe we can carry a struct CpuTopology in CPUState, so that we can drop
> @nr_threads, @nr_cores in CPUState for all ARCHes, and @nr_dies in struct
> CPUArchState for i386. As well, topo_info can be dropped here.

Yeah, I agree. We think the same way, as did in [1].

About X86CPUTopoInfo, it's still necessary to keep to help encode
APICID. For hybrid topology case, the APICID is likely discontinuous,
and the width of each CPU level in APICID depends on the maximum number
of elements in this level. So I also proposed to rename it to
X86ApicidTopoInfo [2] and count the maximum number of elements in each
CPU level.

[2]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03237.html

Thanks,
Zhao

> 
> >       topo_info.threads_per_core = cs->nr_threads;
> >       /* Calculate & apply limits for different index ranges */
> > @@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                */
> >               if (*eax & 31) {
> >                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> > -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> > -                                       cs->nr_threads;
> > +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> >                   if (cs->nr_cores > 1) {
> >                       *eax &= ~0xFC000000;
> >                       *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > @@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               break;
> >           case 1:
> >               *eax = apicid_die_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >               break;
> >           case 2:
> >               *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> > +            *ebx = cs->nr_cores * cs->nr_threads;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
> >               break;
> >           default:
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index e0771a10433b..7638128d59cc 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
> >       TPRAccess tpr_access_type;
> > +    /* Number of dies within this CPU package. */
> >       unsigned nr_dies;
> >   } CPUX86State;
>
Xiaoyao Li Aug. 7, 2023, 8:43 a.m. UTC | #5
On 8/7/2023 3:53 PM, Zhao Liu wrote:
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 97ad229d8ba3..50613cd04612 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>        X86CPUTopoInfo topo_info;
>>>        topo_info.dies_per_pkg = env->nr_dies;
>>> -    topo_info.cores_per_die = cs->nr_cores;
>>> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>> This and below things make me think that, it looks ugly that @nr_dies is
>> added separately in struct CPUArchState for i386 because CPUState only has
>> @nr_cores and nr_threads. Further, for i386, it defines a specific struct
>> X86CPUTopoInfo to contain topology info when setting up CPUID. To me, struct
>> X86CPUTopoInfo is redundant as struct CpuTopology.
>>
>> maybe we can carry a struct CpuTopology in CPUState, so that we can drop
>> @nr_threads, @nr_cores in CPUState for all ARCHes, and @nr_dies in struct
>> CPUArchState for i386. As well, topo_info can be dropped here.
> Yeah, I agree. We think the same way, as did in [1].
> 
> About X86CPUTopoInfo, it's still necessary to keep to help encode
> APICID. 

typedef struct X86CPUTopoInfo {
     unsigned dies_per_pkg;
     unsigned cores_per_die;
     unsigned threads_per_core;
} X86CPUTopoInfo;

/**
  * CpuTopology:
  * @cpus: the number of present logical processors on the machine
  * @sockets: the number of sockets on the machine
  * @dies: the number of dies in one socket
  * @clusters: the number of clusters in one die
  * @cores: the number of cores in one cluster
  * @threads: the number of threads in one core
  * @max_cpus: the maximum number of logical processors on the machine
  */
typedef struct CpuTopology {
     unsigned int cpus;
     unsigned int sockets;
     unsigned int dies;
     unsigned int clusters;
     unsigned int cores;
     unsigned int threads;
     unsigned int max_cpus;
} CpuTopology;

I think 'struct X86CPUTopoInfo' is just a subset of 'struct CpuTopology'

IIUC, currently the value of X86CPUTopoInfo::dies_per_pkg should equal 
with CpuTopology::dies, and the same for cores_per_die and threads_per_core.

So it's OK to keep an copy of 'struct CpuTopology' in CPUState and drop 
'struct X86CPUTopoInfo'

> For hybrid topology case, the APICID is likely discontinuous,
> and the width of each CPU level in APICID depends on the maximum number
> of elements in this level. So I also proposed to rename it to
> X86ApicidTopoInfo [2] and count the maximum number of elements in each
> CPU level.

Do you mean, for example, for hybrid topology, 
X86CPUTopoInfo::dies_per_pkg != CpuTopology::dies? Or after rename
X86CPUTopoInfo::max_dies != CpuTopology::dies?

> [2]:https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03237.html
> 
> Thanks,
> Zhao
>
Zhao Liu Aug. 7, 2023, 10 a.m. UTC | #6
Hi Xiaoyao,

On Mon, Aug 07, 2023 at 04:43:32PM +0800, Xiaoyao Li wrote:
> Date: Mon, 7 Aug 2023 16:43:32 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
> 
> On 8/7/2023 3:53 PM, Zhao Liu wrote:
> > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > index 97ad229d8ba3..50613cd04612 100644
> > > > --- a/target/i386/cpu.c
> > > > +++ b/target/i386/cpu.c
> > > > @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> > > >        X86CPUTopoInfo topo_info;
> > > >        topo_info.dies_per_pkg = env->nr_dies;
> > > > -    topo_info.cores_per_die = cs->nr_cores;
> > > > +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> > > This and below things make me think that, it looks ugly that @nr_dies is
> > > added separately in struct CPUArchState for i386 because CPUState only has
> > > @nr_cores and nr_threads. Further, for i386, it defines a specific struct
> > > X86CPUTopoInfo to contain topology info when setting up CPUID. To me, struct
> > > X86CPUTopoInfo is redundant as struct CpuTopology.
> > > 
> > > maybe we can carry a struct CpuTopology in CPUState, so that we can drop
> > > @nr_threads, @nr_cores in CPUState for all ARCHes, and @nr_dies in struct
> > > CPUArchState for i386. As well, topo_info can be dropped here.
> > Yeah, I agree. We think the same way, as did in [1].
> > 
> > About X86CPUTopoInfo, it's still necessary to keep to help encode
> > APICID.
> 
> typedef struct X86CPUTopoInfo {
>     unsigned dies_per_pkg;
>     unsigned cores_per_die;
>     unsigned threads_per_core;
> } X86CPUTopoInfo;
> 
> /**
>  * CpuTopology:
>  * @cpus: the number of present logical processors on the machine
>  * @sockets: the number of sockets on the machine
>  * @dies: the number of dies in one socket
>  * @clusters: the number of clusters in one die
>  * @cores: the number of cores in one cluster
>  * @threads: the number of threads in one core
>  * @max_cpus: the maximum number of logical processors on the machine
>  */
> typedef struct CpuTopology {
>     unsigned int cpus;
>     unsigned int sockets;
>     unsigned int dies;
>     unsigned int clusters;
>     unsigned int cores;
>     unsigned int threads;
>     unsigned int max_cpus;
> } CpuTopology;
> 
> I think 'struct X86CPUTopoInfo' is just a subset of 'struct CpuTopology'

For smp topology, it's correct.

> 
> IIUC, currently the value of X86CPUTopoInfo::dies_per_pkg should equal with
> CpuTopology::dies, and the same for cores_per_die and threads_per_core.
> 
> So it's OK to keep an copy of 'struct CpuTopology' in CPUState and drop
> 'struct X86CPUTopoInfo'
> 
> > For hybrid topology case, the APICID is likely discontinuous,
> > and the width of each CPU level in APICID depends on the maximum number
> > of elements in this level. So I also proposed to rename it to
> > X86ApicidTopoInfo [2] and count the maximum number of elements in each
> > CPU level.
> 
> Do you mean, for example, for hybrid topology, X86CPUTopoInfo::dies_per_pkg
> != CpuTopology::dies? Or after rename
> X86CPUTopoInfo::max_dies != CpuTopology::dies?

I mean the latter.

A more typical example nowadays is thread level.

X86CPUTopoInfo::max_threads may not euqal to CpuTopology::threads,

since P core has 2 threads per core and E core doesn't support SMT.

The CpuTopology in CPUState should reflect the topology information for
current CPU, so CpuTopology::threads is 2 for P core and
CpuTopology::threads = 1 for E core.

But the width of the SMT level in APICID must be fixed, so that SMT width
should be determined by X86CPUTopoInfo::max_threads. Current hybrid
platforms implement it the same way.

Thanks,
Zhao

> 
> > [2]:https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03237.html
> > 
> > Thanks,
> > Zhao
> > 
>
Xiaoyao Li Aug. 7, 2023, 2:20 p.m. UTC | #7
On 8/7/2023 6:00 PM, Zhao Liu wrote:
> Hi Xiaoyao,
> 
> On Mon, Aug 07, 2023 at 04:43:32PM +0800, Xiaoyao Li wrote:
>> Date: Mon, 7 Aug 2023 16:43:32 +0800
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>> Subject: Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
>>
>> On 8/7/2023 3:53 PM, Zhao Liu wrote:
>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>> index 97ad229d8ba3..50613cd04612 100644
>>>>> --- a/target/i386/cpu.c
>>>>> +++ b/target/i386/cpu.c
>>>>> @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>>>         X86CPUTopoInfo topo_info;
>>>>>         topo_info.dies_per_pkg = env->nr_dies;
>>>>> -    topo_info.cores_per_die = cs->nr_cores;
>>>>> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>>>> This and below things make me think that, it looks ugly that @nr_dies is
>>>> added separately in struct CPUArchState for i386 because CPUState only has
>>>> @nr_cores and nr_threads. Further, for i386, it defines a specific struct
>>>> X86CPUTopoInfo to contain topology info when setting up CPUID. To me, struct
>>>> X86CPUTopoInfo is redundant as struct CpuTopology.
>>>>
>>>> maybe we can carry a struct CpuTopology in CPUState, so that we can drop
>>>> @nr_threads, @nr_cores in CPUState for all ARCHes, and @nr_dies in struct
>>>> CPUArchState for i386. As well, topo_info can be dropped here.
>>> Yeah, I agree. We think the same way, as did in [1].
>>>
>>> About X86CPUTopoInfo, it's still necessary to keep to help encode
>>> APICID.
>>
>> typedef struct X86CPUTopoInfo {
>>      unsigned dies_per_pkg;
>>      unsigned cores_per_die;
>>      unsigned threads_per_core;
>> } X86CPUTopoInfo;
>>
>> /**
>>   * CpuTopology:
>>   * @cpus: the number of present logical processors on the machine
>>   * @sockets: the number of sockets on the machine
>>   * @dies: the number of dies in one socket
>>   * @clusters: the number of clusters in one die
>>   * @cores: the number of cores in one cluster
>>   * @threads: the number of threads in one core
>>   * @max_cpus: the maximum number of logical processors on the machine
>>   */
>> typedef struct CpuTopology {
>>      unsigned int cpus;
>>      unsigned int sockets;
>>      unsigned int dies;
>>      unsigned int clusters;
>>      unsigned int cores;
>>      unsigned int threads;
>>      unsigned int max_cpus;
>> } CpuTopology;
>>
>> I think 'struct X86CPUTopoInfo' is just a subset of 'struct CpuTopology'
> 
> For smp topology, it's correct.
> 
>>
>> IIUC, currently the value of X86CPUTopoInfo::dies_per_pkg should equal with
>> CpuTopology::dies, and the same for cores_per_die and threads_per_core.
>>
>> So it's OK to keep an copy of 'struct CpuTopology' in CPUState and drop
>> 'struct X86CPUTopoInfo'
>>
>>> For hybrid topology case, the APICID is likely discontinuous,
>>> and the width of each CPU level in APICID depends on the maximum number
>>> of elements in this level. So I also proposed to rename it to
>>> X86ApicidTopoInfo [2] and count the maximum number of elements in each
>>> CPU level.
>>
>> Do you mean, for example, for hybrid topology, X86CPUTopoInfo::dies_per_pkg
>> != CpuTopology::dies? Or after rename
>> X86CPUTopoInfo::max_dies != CpuTopology::dies?
> 
> I mean the latter.
> 
> A more typical example nowadays is thread level.
> 
> X86CPUTopoInfo::max_threads may not euqal to CpuTopology::threads,
> 
> since P core has 2 threads per core and E core doesn't support SMT.
> 
> The CpuTopology in CPUState should reflect the topology information for
> current CPU, so CpuTopology::threads is 2 for P core and
> CpuTopology::threads = 1 for E core.
> 
> But the width of the SMT level in APICID must be fixed, so that SMT width
> should be determined by X86CPUTopoInfo::max_threads. Current hybrid
> platforms implement it the same way.

I see.

Can we pull the patch into this series (define a common CPUTopoInfo in 
CPUState and drop env->nr_dies, cs->nr_cores and cs->nr_threads) and let 
the hybrid series later to rename it to X86ApicidTopoInfo?

> Thanks,
> Zhao
> 
>>
>>> [2]:https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03237.html
>>>
>>> Thanks,
>>> Zhao
>>>
>>
Zhao Liu Aug. 7, 2023, 2:42 p.m. UTC | #8
Hi Xiaoyao,

[snip]

> 
> I see.
> 
> Can we pull the patch into this series (define a common CPUTopoInfo in
> CPUState and drop env->nr_dies, cs->nr_cores and cs->nr_threads) and let the
> hybrid series later to rename it to X86ApicidTopoInfo?
>

Yes, we can spilt these from hybrid series.

But if I merge these into this series, it makes the v4 change a bit too
much for subsequent reviews.

I could cook another series to do these cleanups after this series.

Thanks,
Zhao
diff mbox series

Patch

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index fdcbe8735258..57f4d50ace72 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -277,7 +277,7 @@  struct qemu_work_item;
  *   See TranslationBlock::TCG CF_CLUSTER_MASK.
  * @tcg_cflags: Pre-computed cflags for this cpu.
  * @nr_cores: Number of cores within this CPU package.
- * @nr_threads: Number of threads within this CPU.
+ * @nr_threads: Number of threads within this CPU core.
  * @running: #true if CPU is currently running (lockless).
  * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
  * valid under cpu_list_lock.
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index fed20ffb5dd2..984558d7b245 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -630,7 +630,7 @@  void qemu_init_vcpu(CPUState *cpu)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
 
-    cpu->nr_cores = ms->smp.cores;
+    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
     cpu->nr_threads =  ms->smp.threads;
     cpu->stopped = true;
     cpu->random_seed = qemu_guest_random_seed_thread_part1();
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 97ad229d8ba3..50613cd04612 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6011,7 +6011,7 @@  void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     X86CPUTopoInfo topo_info;
 
     topo_info.dies_per_pkg = env->nr_dies;
-    topo_info.cores_per_die = cs->nr_cores;
+    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
     topo_info.threads_per_core = cs->nr_threads;
 
     /* Calculate & apply limits for different index ranges */
@@ -6087,8 +6087,7 @@  void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              */
             if (*eax & 31) {
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
-                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
-                                       cs->nr_threads;
+                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
                 if (cs->nr_cores > 1) {
                     *eax &= ~0xFC000000;
                     *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
@@ -6266,12 +6265,12 @@  void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 1:
             *eax = apicid_die_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
+            *ebx = cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
         default:
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e0771a10433b..7638128d59cc 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1878,6 +1878,7 @@  typedef struct CPUArchState {
 
     TPRAccess tpr_access_type;
 
+    /* Number of dies within this CPU package. */
     unsigned nr_dies;
 } CPUX86State;