diff mbox series

[v2,2/2] powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs

Message ID 20240730044917.4680-3-Dhananjay.Ugwekar@amd.com (mailing list archive)
State Mainlined, archived
Headers show
Series RAPL driver fixes for AMD CPUs | expand

Commit Message

Dhananjay Ugwekar July 30, 2024, 4:49 a.m. UTC
After commit ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"),
on AMD processors that support extended CPUID leaf 0x80000026, the
topology_logical_die_id() macros, no longer returns package id, instead it
returns the CCD (Core Complex Die) id. This leads to the energy-pkg
event scope to be modified to CCD instead of package.

For more historical context, please refer to commit 32fb480e0a2c
("powercap/intel_rapl: Support multi-die/package"), which initially changed
the RAPL scope from package to die for all systems, as Intel systems
with Die enumeration have RAPL scope as die, and those without die
enumeration are not affected. So, all systems(Intel, AMD, Hygon), worked
correctly with topology_logical_die_id() until recently, but this changed
after the "0x80000026 leaf" commit mentioned above.

Future multi-die Intel systems will have package scope RAPL counters,
but they will be using TPMI RAPL interface, which is not affected by
this change.

Replacing topology_logical_die_id() with topology_physical_package_id()
conditionally only for AMD and Hygon fixes the energy-pkg event.

On an AMD 2 socket 8 CCD Zen4 server:

Before:

linux$ ls /sys/class/powercap/
intel-rapl      intel-rapl:4    intel-rapl:8:0  intel-rapl:d
intel-rapl:0    intel-rapl:4:0  intel-rapl:9    intel-rapl:d:0
intel-rapl:0:0  intel-rapl:5    intel-rapl:9:0  intel-rapl:e
intel-rapl:1    intel-rapl:5:0  intel-rapl:a    intel-rapl:e:0
intel-rapl:1:0  intel-rapl:6    intel-rapl:a:0  intel-rapl:f
intel-rapl:2    intel-rapl:6:0  intel-rapl:b    intel-rapl:f:0
intel-rapl:2:0  intel-rapl:7    intel-rapl:b:0
intel-rapl:3    intel-rapl:7:0  intel-rapl:c
intel-rapl:3:0  intel-rapl:8    intel-rapl:c:0

After:

linux$ ls /sys/class/powercap/
intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-rapl:1:0

Only one sysfs entry per-event per-package is created after this change.

Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
Reported-by: Michael Larabel <michael@michaellarabel.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
---
Changes in v2:
* Updated scope description comment, commit log
* Rename rapl_pmu_is_pkg_scope() to rapl_msrs_are_pkg_scope()
* Check topology_logical_(die/package)_id return value
---
 drivers/powercap/intel_rapl_common.c | 34 ++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

Comments

Rafael J. Wysocki Aug. 2, 2024, 12:35 p.m. UTC | #1
On Tue, Jul 30, 2024 at 6:53 AM Dhananjay Ugwekar
<Dhananjay.Ugwekar@amd.com> wrote:
>
> After commit ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"),
> on AMD processors that support extended CPUID leaf 0x80000026, the
> topology_logical_die_id() macros, no longer returns package id, instead it
> returns the CCD (Core Complex Die) id. This leads to the energy-pkg
> event scope to be modified to CCD instead of package.
>
> For more historical context, please refer to commit 32fb480e0a2c
> ("powercap/intel_rapl: Support multi-die/package"), which initially changed
> the RAPL scope from package to die for all systems, as Intel systems
> with Die enumeration have RAPL scope as die, and those without die
> enumeration are not affected. So, all systems(Intel, AMD, Hygon), worked
> correctly with topology_logical_die_id() until recently, but this changed
> after the "0x80000026 leaf" commit mentioned above.
>
> Future multi-die Intel systems will have package scope RAPL counters,
> but they will be using TPMI RAPL interface, which is not affected by
> this change.
>
> Replacing topology_logical_die_id() with topology_physical_package_id()
> conditionally only for AMD and Hygon fixes the energy-pkg event.
>
> On an AMD 2 socket 8 CCD Zen4 server:
>
> Before:
>
> linux$ ls /sys/class/powercap/
> intel-rapl      intel-rapl:4    intel-rapl:8:0  intel-rapl:d
> intel-rapl:0    intel-rapl:4:0  intel-rapl:9    intel-rapl:d:0
> intel-rapl:0:0  intel-rapl:5    intel-rapl:9:0  intel-rapl:e
> intel-rapl:1    intel-rapl:5:0  intel-rapl:a    intel-rapl:e:0
> intel-rapl:1:0  intel-rapl:6    intel-rapl:a:0  intel-rapl:f
> intel-rapl:2    intel-rapl:6:0  intel-rapl:b    intel-rapl:f:0
> intel-rapl:2:0  intel-rapl:7    intel-rapl:b:0
> intel-rapl:3    intel-rapl:7:0  intel-rapl:c
> intel-rapl:3:0  intel-rapl:8    intel-rapl:c:0
>
> After:
>
> linux$ ls /sys/class/powercap/
> intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-rapl:1:0
>
> Only one sysfs entry per-event per-package is created after this change.
>
> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
> Reported-by: Michael Larabel <michael@michaellarabel.com>
> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
> Reviewed-by: Zhang Rui <rui.zhang@intel.com>
> ---
> Changes in v2:
> * Updated scope description comment, commit log
> * Rename rapl_pmu_is_pkg_scope() to rapl_msrs_are_pkg_scope()
> * Check topology_logical_(die/package)_id return value

This patch does not depend on the first one in the series if I'm not
mistaken, in which case I can pick it up separately if you want me to
do that, so please let me know.

Thanks!

> ---
>  drivers/powercap/intel_rapl_common.c | 34 ++++++++++++++++++++++++----
>  1 file changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
> index 3cffa6c79538..4bc56acb99d6 100644
> --- a/drivers/powercap/intel_rapl_common.c
> +++ b/drivers/powercap/intel_rapl_common.c
> @@ -2128,6 +2128,21 @@ void rapl_remove_package(struct rapl_package *rp)
>  }
>  EXPORT_SYMBOL_GPL(rapl_remove_package);
>
> +/*
> + * RAPL Package energy counter scope:
> + * 1. AMD/HYGON platforms use per-PKG package energy counter
> + * 2. For Intel platforms
> + *     2.1 CLX-AP platform has per-DIE package energy counter
> + *     2.2 Other platforms that uses MSR RAPL are single die systems so the
> + *          package energy counter can be considered as per-PKG/per-DIE,
> + *          here it is considered as per-DIE.
> + *     2.3 New platforms that use TPMI RAPL doesn't care about the
> + *         scope because they are not MSR/CPU based.
> + */
> +#define rapl_msrs_are_pkg_scope()                              \
> +       (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||  \
> +        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
> +
>  /* caller to ensure CPU hotplug lock is held */
>  struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
>                                                          bool id_is_cpu)
> @@ -2135,8 +2150,14 @@ struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
>         struct rapl_package *rp;
>         int uid;
>
> -       if (id_is_cpu)
> -               uid = topology_logical_die_id(id);
> +       if (id_is_cpu) {
> +               uid = rapl_msrs_are_pkg_scope() ?
> +                     topology_physical_package_id(id) : topology_logical_die_id(id);
> +               if (uid < 0) {
> +                       pr_err("topology_logical_(package/die)_id() returned a negative value");
> +                       return ERR_PTR(-EINVAL);
> +               }
> +       }
>         else
>                 uid = id;
>
> @@ -2168,9 +2189,14 @@ struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
>                 return ERR_PTR(-ENOMEM);
>
>         if (id_is_cpu) {
> -               rp->id = topology_logical_die_id(id);
> +               rp->id = rapl_msrs_are_pkg_scope() ?
> +                        topology_physical_package_id(id) : topology_logical_die_id(id);
> +               if ((int)(rp->id) < 0) {
> +                       pr_err("topology_logical_(package/die)_id() returned a negative value");
> +                       return ERR_PTR(-EINVAL);
> +               }
>                 rp->lead_cpu = id;
> -               if (topology_max_dies_per_package() > 1)
> +               if (!rapl_msrs_are_pkg_scope() && topology_max_dies_per_package() > 1)
>                         snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
>                                  topology_physical_package_id(id), topology_die_id(id));
>                 else
> --
> 2.43.0
>
>
Dhananjay Ugwekar Aug. 8, 2024, 11:17 a.m. UTC | #2
Hello Rafael,

On 8/2/2024 6:05 PM, Rafael J. Wysocki wrote:
> On Tue, Jul 30, 2024 at 6:53 AM Dhananjay Ugwekar
> <Dhananjay.Ugwekar@amd.com> wrote:
>>
>> After commit ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"),
>> on AMD processors that support extended CPUID leaf 0x80000026, the
>> topology_logical_die_id() macros, no longer returns package id, instead it
>> returns the CCD (Core Complex Die) id. This leads to the energy-pkg
>> event scope to be modified to CCD instead of package.
>>
>> For more historical context, please refer to commit 32fb480e0a2c
>> ("powercap/intel_rapl: Support multi-die/package"), which initially changed
>> the RAPL scope from package to die for all systems, as Intel systems
>> with Die enumeration have RAPL scope as die, and those without die
>> enumeration are not affected. So, all systems(Intel, AMD, Hygon), worked
>> correctly with topology_logical_die_id() until recently, but this changed
>> after the "0x80000026 leaf" commit mentioned above.
>>
>> Future multi-die Intel systems will have package scope RAPL counters,
>> but they will be using TPMI RAPL interface, which is not affected by
>> this change.
>>
>> Replacing topology_logical_die_id() with topology_physical_package_id()
>> conditionally only for AMD and Hygon fixes the energy-pkg event.
>>
>> On an AMD 2 socket 8 CCD Zen4 server:
>>
>> Before:
>>
>> linux$ ls /sys/class/powercap/
>> intel-rapl      intel-rapl:4    intel-rapl:8:0  intel-rapl:d
>> intel-rapl:0    intel-rapl:4:0  intel-rapl:9    intel-rapl:d:0
>> intel-rapl:0:0  intel-rapl:5    intel-rapl:9:0  intel-rapl:e
>> intel-rapl:1    intel-rapl:5:0  intel-rapl:a    intel-rapl:e:0
>> intel-rapl:1:0  intel-rapl:6    intel-rapl:a:0  intel-rapl:f
>> intel-rapl:2    intel-rapl:6:0  intel-rapl:b    intel-rapl:f:0
>> intel-rapl:2:0  intel-rapl:7    intel-rapl:b:0
>> intel-rapl:3    intel-rapl:7:0  intel-rapl:c
>> intel-rapl:3:0  intel-rapl:8    intel-rapl:c:0
>>
>> After:
>>
>> linux$ ls /sys/class/powercap/
>> intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-rapl:1:0
>>
>> Only one sysfs entry per-event per-package is created after this change.
>>
>> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
>> Reported-by: Michael Larabel <michael@michaellarabel.com>
>> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
>> Reviewed-by: Zhang Rui <rui.zhang@intel.com>
>> ---
>> Changes in v2:
>> * Updated scope description comment, commit log
>> * Rename rapl_pmu_is_pkg_scope() to rapl_msrs_are_pkg_scope()
>> * Check topology_logical_(die/package)_id return value
> 
> This patch does not depend on the first one in the series if I'm not
> mistaken, in which case I can pick it up separately if you want me to
> do that, so please let me know.

Sorry for the late reply, was out sick,

Yes, please pick this patch separately, it is independent from the first one.

Thanks,
Dhananjay

> 
> Thanks!
> 
>> ---
>>  drivers/powercap/intel_rapl_common.c | 34 ++++++++++++++++++++++++----
>>  1 file changed, 30 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
>> index 3cffa6c79538..4bc56acb99d6 100644
>> --- a/drivers/powercap/intel_rapl_common.c
>> +++ b/drivers/powercap/intel_rapl_common.c
>> @@ -2128,6 +2128,21 @@ void rapl_remove_package(struct rapl_package *rp)
>>  }
>>  EXPORT_SYMBOL_GPL(rapl_remove_package);
>>
>> +/*
>> + * RAPL Package energy counter scope:
>> + * 1. AMD/HYGON platforms use per-PKG package energy counter
>> + * 2. For Intel platforms
>> + *     2.1 CLX-AP platform has per-DIE package energy counter
>> + *     2.2 Other platforms that uses MSR RAPL are single die systems so the
>> + *          package energy counter can be considered as per-PKG/per-DIE,
>> + *          here it is considered as per-DIE.
>> + *     2.3 New platforms that use TPMI RAPL doesn't care about the
>> + *         scope because they are not MSR/CPU based.
>> + */
>> +#define rapl_msrs_are_pkg_scope()                              \
>> +       (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||  \
>> +        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
>> +
>>  /* caller to ensure CPU hotplug lock is held */
>>  struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
>>                                                          bool id_is_cpu)
>> @@ -2135,8 +2150,14 @@ struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
>>         struct rapl_package *rp;
>>         int uid;
>>
>> -       if (id_is_cpu)
>> -               uid = topology_logical_die_id(id);
>> +       if (id_is_cpu) {
>> +               uid = rapl_msrs_are_pkg_scope() ?
>> +                     topology_physical_package_id(id) : topology_logical_die_id(id);
>> +               if (uid < 0) {
>> +                       pr_err("topology_logical_(package/die)_id() returned a negative value");
>> +                       return ERR_PTR(-EINVAL);
>> +               }
>> +       }
>>         else
>>                 uid = id;
>>
>> @@ -2168,9 +2189,14 @@ struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
>>                 return ERR_PTR(-ENOMEM);
>>
>>         if (id_is_cpu) {
>> -               rp->id = topology_logical_die_id(id);
>> +               rp->id = rapl_msrs_are_pkg_scope() ?
>> +                        topology_physical_package_id(id) : topology_logical_die_id(id);
>> +               if ((int)(rp->id) < 0) {
>> +                       pr_err("topology_logical_(package/die)_id() returned a negative value");
>> +                       return ERR_PTR(-EINVAL);
>> +               }
>>                 rp->lead_cpu = id;
>> -               if (topology_max_dies_per_package() > 1)
>> +               if (!rapl_msrs_are_pkg_scope() && topology_max_dies_per_package() > 1)
>>                         snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
>>                                  topology_physical_package_id(id), topology_die_id(id));
>>                 else
>> --
>> 2.43.0
>>
>>
Rafael J. Wysocki Aug. 19, 2024, 1:34 p.m. UTC | #3
On Thu, Aug 8, 2024 at 1:18 PM Dhananjay Ugwekar
<Dhananjay.Ugwekar@amd.com> wrote:
>
> Hello Rafael,
>
> On 8/2/2024 6:05 PM, Rafael J. Wysocki wrote:
> > On Tue, Jul 30, 2024 at 6:53 AM Dhananjay Ugwekar
> > <Dhananjay.Ugwekar@amd.com> wrote:
> >>
> >> After commit ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"),
> >> on AMD processors that support extended CPUID leaf 0x80000026, the
> >> topology_logical_die_id() macros, no longer returns package id, instead it
> >> returns the CCD (Core Complex Die) id. This leads to the energy-pkg
> >> event scope to be modified to CCD instead of package.
> >>
> >> For more historical context, please refer to commit 32fb480e0a2c
> >> ("powercap/intel_rapl: Support multi-die/package"), which initially changed
> >> the RAPL scope from package to die for all systems, as Intel systems
> >> with Die enumeration have RAPL scope as die, and those without die
> >> enumeration are not affected. So, all systems(Intel, AMD, Hygon), worked
> >> correctly with topology_logical_die_id() until recently, but this changed
> >> after the "0x80000026 leaf" commit mentioned above.
> >>
> >> Future multi-die Intel systems will have package scope RAPL counters,
> >> but they will be using TPMI RAPL interface, which is not affected by
> >> this change.
> >>
> >> Replacing topology_logical_die_id() with topology_physical_package_id()
> >> conditionally only for AMD and Hygon fixes the energy-pkg event.
> >>
> >> On an AMD 2 socket 8 CCD Zen4 server:
> >>
> >> Before:
> >>
> >> linux$ ls /sys/class/powercap/
> >> intel-rapl      intel-rapl:4    intel-rapl:8:0  intel-rapl:d
> >> intel-rapl:0    intel-rapl:4:0  intel-rapl:9    intel-rapl:d:0
> >> intel-rapl:0:0  intel-rapl:5    intel-rapl:9:0  intel-rapl:e
> >> intel-rapl:1    intel-rapl:5:0  intel-rapl:a    intel-rapl:e:0
> >> intel-rapl:1:0  intel-rapl:6    intel-rapl:a:0  intel-rapl:f
> >> intel-rapl:2    intel-rapl:6:0  intel-rapl:b    intel-rapl:f:0
> >> intel-rapl:2:0  intel-rapl:7    intel-rapl:b:0
> >> intel-rapl:3    intel-rapl:7:0  intel-rapl:c
> >> intel-rapl:3:0  intel-rapl:8    intel-rapl:c:0
> >>
> >> After:
> >>
> >> linux$ ls /sys/class/powercap/
> >> intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-rapl:1:0
> >>
> >> Only one sysfs entry per-event per-package is created after this change.
> >>
> >> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
> >> Reported-by: Michael Larabel <michael@michaellarabel.com>
> >> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
> >> Reviewed-by: Zhang Rui <rui.zhang@intel.com>
> >> ---
> >> Changes in v2:
> >> * Updated scope description comment, commit log
> >> * Rename rapl_pmu_is_pkg_scope() to rapl_msrs_are_pkg_scope()
> >> * Check topology_logical_(die/package)_id return value
> >
> > This patch does not depend on the first one in the series if I'm not
> > mistaken, in which case I can pick it up separately if you want me to
> > do that, so please let me know.
>
> Sorry for the late reply, was out sick,
>
> Yes, please pick this patch separately, it is independent from the first one.

OK, applied as 6.12 material.

Thanks!
Gautham R. Shenoy Aug. 21, 2024, 4:57 a.m. UTC | #4
Hello Rafael,

On Mon, Aug 19, 2024 at 03:34:09PM +0200, Rafael J. Wysocki wrote:
> On Thu, Aug 8, 2024 at 1:18 PM Dhananjay Ugwekar
> <Dhananjay.Ugwekar@amd.com> wrote:
> >
> > Hello Rafael,
> >
> > On 8/2/2024 6:05 PM, Rafael J. Wysocki wrote:
> > > On Tue, Jul 30, 2024 at 6:53 AM Dhananjay Ugwekar
> > > <Dhananjay.Ugwekar@amd.com> wrote:
> > >>
> > >> After commit ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"),
> > >> on AMD processors that support extended CPUID leaf 0x80000026, the
> > >> topology_logical_die_id() macros, no longer returns package id, instead it
> > >> returns the CCD (Core Complex Die) id. This leads to the energy-pkg
> > >> event scope to be modified to CCD instead of package.
> > >>
> > >> For more historical context, please refer to commit 32fb480e0a2c
> > >> ("powercap/intel_rapl: Support multi-die/package"), which initially changed
> > >> the RAPL scope from package to die for all systems, as Intel systems
> > >> with Die enumeration have RAPL scope as die, and those without die
> > >> enumeration are not affected. So, all systems(Intel, AMD, Hygon), worked
> > >> correctly with topology_logical_die_id() until recently, but this changed
> > >> after the "0x80000026 leaf" commit mentioned above.
> > >>
> > >> Future multi-die Intel systems will have package scope RAPL counters,
> > >> but they will be using TPMI RAPL interface, which is not affected by
> > >> this change.
> > >>
> > >> Replacing topology_logical_die_id() with topology_physical_package_id()
> > >> conditionally only for AMD and Hygon fixes the energy-pkg event.
> > >>
> > >> On an AMD 2 socket 8 CCD Zen4 server:
> > >>
> > >> Before:
> > >>
> > >> linux$ ls /sys/class/powercap/
> > >> intel-rapl      intel-rapl:4    intel-rapl:8:0  intel-rapl:d
> > >> intel-rapl:0    intel-rapl:4:0  intel-rapl:9    intel-rapl:d:0
> > >> intel-rapl:0:0  intel-rapl:5    intel-rapl:9:0  intel-rapl:e
> > >> intel-rapl:1    intel-rapl:5:0  intel-rapl:a    intel-rapl:e:0
> > >> intel-rapl:1:0  intel-rapl:6    intel-rapl:a:0  intel-rapl:f
> > >> intel-rapl:2    intel-rapl:6:0  intel-rapl:b    intel-rapl:f:0
> > >> intel-rapl:2:0  intel-rapl:7    intel-rapl:b:0
> > >> intel-rapl:3    intel-rapl:7:0  intel-rapl:c
> > >> intel-rapl:3:0  intel-rapl:8    intel-rapl:c:0
> > >>
> > >> After:
> > >>
> > >> linux$ ls /sys/class/powercap/
> > >> intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-rapl:1:0
> > >>
> > >> Only one sysfs entry per-event per-package is created after this change.
> > >>
> > >> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
> > >> Reported-by: Michael Larabel <michael@michaellarabel.com>
> > >> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
> > >> Reviewed-by: Zhang Rui <rui.zhang@intel.com>
> > >> ---
> > >> Changes in v2:
> > >> * Updated scope description comment, commit log
> > >> * Rename rapl_pmu_is_pkg_scope() to rapl_msrs_are_pkg_scope()
> > >> * Check topology_logical_(die/package)_id return value
> > >
> > > This patch does not depend on the first one in the series if I'm not
> > > mistaken, in which case I can pick it up separately if you want me to
> > > do that, so please let me know.
> >
> > Sorry for the late reply, was out sick,
> >
> > Yes, please pick this patch separately, it is independent from the first one.
> 
> OK, applied as 6.12 material.

Can this go into 6.11 fixes?

It fixes the commit 63edbaa48a57 ("x86/cpu/topology: Add support for
the AMD 0x80000026 leaf") which got merged in 6.10.

--
Thanks and Regards
gautham.
diff mbox series

Patch

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 3cffa6c79538..4bc56acb99d6 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -2128,6 +2128,21 @@  void rapl_remove_package(struct rapl_package *rp)
 }
 EXPORT_SYMBOL_GPL(rapl_remove_package);
 
+/*
+ * RAPL Package energy counter scope:
+ * 1. AMD/HYGON platforms use per-PKG package energy counter
+ * 2. For Intel platforms
+ *	2.1 CLX-AP platform has per-DIE package energy counter
+ *	2.2 Other platforms that uses MSR RAPL are single die systems so the
+ *          package energy counter can be considered as per-PKG/per-DIE,
+ *          here it is considered as per-DIE.
+ *	2.3 New platforms that use TPMI RAPL doesn't care about the
+ *	    scope because they are not MSR/CPU based.
+ */
+#define rapl_msrs_are_pkg_scope()				\
+	(boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||	\
+	 boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
+
 /* caller to ensure CPU hotplug lock is held */
 struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
 							 bool id_is_cpu)
@@ -2135,8 +2150,14 @@  struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
 	struct rapl_package *rp;
 	int uid;
 
-	if (id_is_cpu)
-		uid = topology_logical_die_id(id);
+	if (id_is_cpu) {
+		uid = rapl_msrs_are_pkg_scope() ?
+		      topology_physical_package_id(id) : topology_logical_die_id(id);
+		if (uid < 0) {
+			pr_err("topology_logical_(package/die)_id() returned a negative value");
+			return ERR_PTR(-EINVAL);
+		}
+	}
 	else
 		uid = id;
 
@@ -2168,9 +2189,14 @@  struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
 		return ERR_PTR(-ENOMEM);
 
 	if (id_is_cpu) {
-		rp->id = topology_logical_die_id(id);
+		rp->id = rapl_msrs_are_pkg_scope() ?
+			 topology_physical_package_id(id) : topology_logical_die_id(id);
+		if ((int)(rp->id) < 0) {
+			pr_err("topology_logical_(package/die)_id() returned a negative value");
+			return ERR_PTR(-EINVAL);
+		}
 		rp->lead_cpu = id;
-		if (topology_max_dies_per_package() > 1)
+		if (!rapl_msrs_are_pkg_scope() && topology_max_dies_per_package() > 1)
 			snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
 				 topology_physical_package_id(id), topology_die_id(id));
 		else