diff mbox series

[v7,15/16] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]

Message ID 20240108082727.420817-16-zhao1.liu@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series Support smp.clusters for x86 in QEMU | expand

Commit Message

Zhao Liu Jan. 8, 2024, 8:27 a.m. UTC
From: Zhao Liu <zhao1.liu@intel.com>

The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
the number of sharing threads directly.

From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:

The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:

ShareId = LocalApicId >> log2(NumSharingCache+1)

Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.

From the description above, the calculation of this field should be same
as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
APIC ID to calculate this field.

[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
     Information

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v3:
 * Rewrite the subject. (Babu)
 * Delete the original "comment/help" expression, as this behavior is
   confirmed for AMD CPUs. (Babu)
 * Rename "num_apic_ids" (v3) to "num_sharing_cache" to match spec
   definition. (Babu)

Changes since v1:
 * Rename "l3_threads" to "num_apic_ids" in
   encode_cache_cpuid8000001d(). (Yanan)
 * Add the description of the original commit and add Cc.
---
 target/i386/cpu.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Comments

Xiaoyao Li Jan. 14, 2024, 2:42 p.m. UTC | #1
On 1/8/2024 4:27 PM, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> the number of sharing threads directly.
> 
>  From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> means [1]:
> 
> The number of logical processors sharing this cache is the value of
> this field incremented by 1. To determine which logical processors are
> sharing a cache, determine a Share Id for each processor as follows:
> 
> ShareId = LocalApicId >> log2(NumSharingCache+1)
> 
> Logical processors with the same ShareId then share a cache. If
> NumSharingCache+1 is not a power of two, round it up to the next power
> of two.
> 
>  From the description above, the calculation of this field should be same
> as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
> APIC ID to calculate this field.
> 
> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
>       Information

this patch can be dropped because we have next patch.

> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Reviewed-by: Babu Moger <babu.moger@amd.com>
> Tested-by: Babu Moger <babu.moger@amd.com>
> Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> Changes since v3:
>   * Rewrite the subject. (Babu)
>   * Delete the original "comment/help" expression, as this behavior is
>     confirmed for AMD CPUs. (Babu)
>   * Rename "num_apic_ids" (v3) to "num_sharing_cache" to match spec
>     definition. (Babu)
> 
> Changes since v1:
>   * Rename "l3_threads" to "num_apic_ids" in
>     encode_cache_cpuid8000001d(). (Yanan)
>   * Add the description of the original commit and add Cc.
> ---
>   target/i386/cpu.c | 10 ++++------
>   1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b23e8190dc68..8a4d72f6f760 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -483,7 +483,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>                                          uint32_t *eax, uint32_t *ebx,
>                                          uint32_t *ecx, uint32_t *edx)
>   {
> -    uint32_t l3_threads;
> +    uint32_t num_sharing_cache;
>       assert(cache->size == cache->line_size * cache->associativity *
>                             cache->partitions * cache->sets);
>   
> @@ -492,13 +492,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>   
>       /* L3 is shared among multiple cores */
>       if (cache->level == 3) {
> -        l3_threads = topo_info->modules_per_die *
> -                     topo_info->cores_per_module *
> -                     topo_info->threads_per_core;
> -        *eax |= (l3_threads - 1) << 14;
> +        num_sharing_cache = 1 << apicid_die_offset(topo_info);
>       } else {
> -        *eax |= ((topo_info->threads_per_core - 1) << 14);
> +        num_sharing_cache = 1 << apicid_core_offset(topo_info);
>       }
> +    *eax |= (num_sharing_cache - 1) << 14;
>   
>       assert(cache->line_size > 0);
>       assert(cache->partitions > 0);
Zhao Liu Jan. 15, 2024, 3:48 a.m. UTC | #2
Hi Xiaoyao,

On Sun, Jan 14, 2024 at 10:42:41PM +0800, Xiaoyao Li wrote:
> Date: Sun, 14 Jan 2024 22:42:41 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v7 15/16] i386: Use offsets get NumSharingCache for
>  CPUID[0x8000001D].EAX[bits 25:14]
> 
> On 1/8/2024 4:27 PM, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> > for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> > the number of sharing threads directly.
> > 
> >  From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> > means [1]:
> > 
> > The number of logical processors sharing this cache is the value of
> > this field incremented by 1. To determine which logical processors are
> > sharing a cache, determine a Share Id for each processor as follows:
> > 
> > ShareId = LocalApicId >> log2(NumSharingCache+1)
> > 
> > Logical processors with the same ShareId then share a cache. If
> > NumSharingCache+1 is not a power of two, round it up to the next power
> > of two.
> > 
> >  From the description above, the calculation of this field should be same
> > as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
> > APIC ID to calculate this field.
> > 
> > [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
> >       Information
> 
> this patch can be dropped because we have next patch.

This patch is mainly used to explicitly emphasize the change in encoding
way and compliance with AMD spec... I didn't tested on AMD machine, so
the more granular patch would make it easier for the community to review
and test.

Thanks,
Zhao

> 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Reviewed-by: Babu Moger <babu.moger@amd.com>
> > Tested-by: Babu Moger <babu.moger@amd.com>
> > Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > Changes since v3:
> >   * Rewrite the subject. (Babu)
> >   * Delete the original "comment/help" expression, as this behavior is
> >     confirmed for AMD CPUs. (Babu)
> >   * Rename "num_apic_ids" (v3) to "num_sharing_cache" to match spec
> >     definition. (Babu)
> > 
> > Changes since v1:
> >   * Rename "l3_threads" to "num_apic_ids" in
> >     encode_cache_cpuid8000001d(). (Yanan)
> >   * Add the description of the original commit and add Cc.
> > ---
> >   target/i386/cpu.c | 10 ++++------
> >   1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index b23e8190dc68..8a4d72f6f760 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -483,7 +483,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >                                          uint32_t *eax, uint32_t *ebx,
> >                                          uint32_t *ecx, uint32_t *edx)
> >   {
> > -    uint32_t l3_threads;
> > +    uint32_t num_sharing_cache;
> >       assert(cache->size == cache->line_size * cache->associativity *
> >                             cache->partitions * cache->sets);
> > @@ -492,13 +492,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >       /* L3 is shared among multiple cores */
> >       if (cache->level == 3) {
> > -        l3_threads = topo_info->modules_per_die *
> > -                     topo_info->cores_per_module *
> > -                     topo_info->threads_per_core;
> > -        *eax |= (l3_threads - 1) << 14;
> > +        num_sharing_cache = 1 << apicid_die_offset(topo_info);
> >       } else {
> > -        *eax |= ((topo_info->threads_per_core - 1) << 14);
> > +        num_sharing_cache = 1 << apicid_core_offset(topo_info);
> >       }
> > +    *eax |= (num_sharing_cache - 1) << 14;
> >       assert(cache->line_size > 0);
> >       assert(cache->partitions > 0);
>
Xiaoyao Li Jan. 15, 2024, 4:27 a.m. UTC | #3
On 1/15/2024 11:48 AM, Zhao Liu wrote:
> Hi Xiaoyao,
> 
> On Sun, Jan 14, 2024 at 10:42:41PM +0800, Xiaoyao Li wrote:
>> Date: Sun, 14 Jan 2024 22:42:41 +0800
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>> Subject: Re: [PATCH v7 15/16] i386: Use offsets get NumSharingCache for
>>   CPUID[0x8000001D].EAX[bits 25:14]
>>
>> On 1/8/2024 4:27 PM, Zhao Liu wrote:
>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>
>>> The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
>>> for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
>>> the number of sharing threads directly.
>>>
>>>   From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
>>> means [1]:
>>>
>>> The number of logical processors sharing this cache is the value of
>>> this field incremented by 1. To determine which logical processors are
>>> sharing a cache, determine a Share Id for each processor as follows:
>>>
>>> ShareId = LocalApicId >> log2(NumSharingCache+1)
>>>
>>> Logical processors with the same ShareId then share a cache. If
>>> NumSharingCache+1 is not a power of two, round it up to the next power
>>> of two.
>>>
>>>   From the description above, the calculation of this field should be same
>>> as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
>>> APIC ID to calculate this field.
>>>
>>> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
>>>        Information
>>
>> this patch can be dropped because we have next patch.
> 
> This patch is mainly used to explicitly emphasize the change in encoding
> way and compliance with AMD spec... I didn't tested on AMD machine, so
> the more granular patch would make it easier for the community to review
> and test.

then please move this patch ahead, e.g., after patch 2.

> Thanks,
> Zhao
> 
>>
>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>> Reviewed-by: Babu Moger <babu.moger@amd.com>
>>> Tested-by: Babu Moger <babu.moger@amd.com>
>>> Tested-by: Yongwei Ma <yongwei.ma@intel.com>
>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>> ---
>>> Changes since v3:
>>>    * Rewrite the subject. (Babu)
>>>    * Delete the original "comment/help" expression, as this behavior is
>>>      confirmed for AMD CPUs. (Babu)
>>>    * Rename "num_apic_ids" (v3) to "num_sharing_cache" to match spec
>>>      definition. (Babu)
>>>
>>> Changes since v1:
>>>    * Rename "l3_threads" to "num_apic_ids" in
>>>      encode_cache_cpuid8000001d(). (Yanan)
>>>    * Add the description of the original commit and add Cc.
>>> ---
>>>    target/i386/cpu.c | 10 ++++------
>>>    1 file changed, 4 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index b23e8190dc68..8a4d72f6f760 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -483,7 +483,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>>>                                           uint32_t *eax, uint32_t *ebx,
>>>                                           uint32_t *ecx, uint32_t *edx)
>>>    {
>>> -    uint32_t l3_threads;
>>> +    uint32_t num_sharing_cache;
>>>        assert(cache->size == cache->line_size * cache->associativity *
>>>                              cache->partitions * cache->sets);
>>> @@ -492,13 +492,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>>>        /* L3 is shared among multiple cores */
>>>        if (cache->level == 3) {
>>> -        l3_threads = topo_info->modules_per_die *
>>> -                     topo_info->cores_per_module *
>>> -                     topo_info->threads_per_core;
>>> -        *eax |= (l3_threads - 1) << 14;
>>> +        num_sharing_cache = 1 << apicid_die_offset(topo_info);
>>>        } else {
>>> -        *eax |= ((topo_info->threads_per_core - 1) << 14);
>>> +        num_sharing_cache = 1 << apicid_core_offset(topo_info);
>>>        }
>>> +    *eax |= (num_sharing_cache - 1) << 14;
>>>        assert(cache->line_size > 0);
>>>        assert(cache->partitions > 0);
>>
Zhao Liu Jan. 15, 2024, 2:54 p.m. UTC | #4
Hi Xiaoyao,

On Mon, Jan 15, 2024 at 12:27:43PM +0800, Xiaoyao Li wrote:
> Date: Mon, 15 Jan 2024 12:27:43 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v7 15/16] i386: Use offsets get NumSharingCache for
>  CPUID[0x8000001D].EAX[bits 25:14]
> 
> On 1/15/2024 11:48 AM, Zhao Liu wrote:
> > Hi Xiaoyao,
> > 
> > On Sun, Jan 14, 2024 at 10:42:41PM +0800, Xiaoyao Li wrote:
> > > Date: Sun, 14 Jan 2024 22:42:41 +0800
> > > From: Xiaoyao Li <xiaoyao.li@intel.com>
> > > Subject: Re: [PATCH v7 15/16] i386: Use offsets get NumSharingCache for
> > >   CPUID[0x8000001D].EAX[bits 25:14]
> > > 
> > > On 1/8/2024 4:27 PM, Zhao Liu wrote:
> > > > From: Zhao Liu <zhao1.liu@intel.com>
> > > > 
> > > > The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> > > > for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> > > > the number of sharing threads directly.
> > > > 
> > > >   From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> > > > means [1]:
> > > > 
> > > > The number of logical processors sharing this cache is the value of
> > > > this field incremented by 1. To determine which logical processors are
> > > > sharing a cache, determine a Share Id for each processor as follows:
> > > > 
> > > > ShareId = LocalApicId >> log2(NumSharingCache+1)
> > > > 
> > > > Logical processors with the same ShareId then share a cache. If
> > > > NumSharingCache+1 is not a power of two, round it up to the next power
> > > > of two.
> > > > 
> > > >   From the description above, the calculation of this field should be same
> > > > as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
> > > > APIC ID to calculate this field.
> > > > 
> > > > [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
> > > >        Information
> > > 
> > > this patch can be dropped because we have next patch.
> > 
> > This patch is mainly used to explicitly emphasize the change in encoding
> > way and compliance with AMD spec... I didn't tested on AMD machine, so
> > the more granular patch would make it easier for the community to review
> > and test.
> 
> then please move this patch ahead, e.g., after patch 2.
>

OK. Thanks!
-Zhao
diff mbox series

Patch

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b23e8190dc68..8a4d72f6f760 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -483,7 +483,7 @@  static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    uint32_t l3_threads;
+    uint32_t num_sharing_cache;
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
@@ -492,13 +492,11 @@  static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_threads = topo_info->modules_per_die *
-                     topo_info->cores_per_module *
-                     topo_info->threads_per_core;
-        *eax |= (l3_threads - 1) << 14;
+        num_sharing_cache = 1 << apicid_die_offset(topo_info);
     } else {
-        *eax |= ((topo_info->threads_per_core - 1) << 14);
+        num_sharing_cache = 1 << apicid_core_offset(topo_info);
     }
+    *eax |= (num_sharing_cache - 1) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);