diff mbox series

Thermal: intel: hfi: Give HFI instances package scope

Message ID 20240703055445.125362-1-rui.zhang@intel.com (mailing list archive)
State Mainlined, archived
Headers show
Series Thermal: intel: hfi: Give HFI instances package scope | expand

Commit Message

Zhang, Rui July 3, 2024, 5:54 a.m. UTC
The Intel Software Developer's Manual defines the scope of HFI (registers
and memory buffer) as package. Use package scope* in the software
representation of an HFI instance.

Using die scope in HFI instances has the effect of creating multiple,
conflicting, instances for the same package: each instance allocates its
own memory buffer and configures the same package-level registers.
Specifically, only one of the allocated memory buffers can be set in the
MSR_IA32_HW_FEEDBACK_PTR register. CPUs get incorrect HFI data from the
table.

The problem does not affect current HFI-capable platforms because they
all have single-die processors.

* We used die scope for HFI instances because there have been processors
in which packages where enumerated as dies. None of those systems support
HFI. If such a system emerged we would need to quirk it.

Co-developed-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/thermal/intel/intel_hfi.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

Comments

Rafael J. Wysocki July 3, 2024, 11:33 a.m. UTC | #1
On Wed, Jul 3, 2024 at 7:55 AM Zhang Rui <rui.zhang@intel.com> wrote:
>
> The Intel Software Developer's Manual defines the scope of HFI (registers
> and memory buffer) as package. Use package scope* in the software

"as a package"

> representation of an HFI instance.
>
> Using die scope in HFI instances has the effect of creating multiple,
> conflicting, instances for the same package: each instance allocates its
> own memory buffer and configures the same package-level registers.
> Specifically, only one of the allocated memory buffers can be set in the
> MSR_IA32_HW_FEEDBACK_PTR register. CPUs get incorrect HFI data from the
> table.
>
> The problem does not affect current HFI-capable platforms because they
> all have single-die processors.
>
> * We used die scope for HFI instances because there have been processors
> in which packages where enumerated as dies. None of those systems support

"were"

> HFI. If such a system emerged we would need to quirk it.
>
> Co-developed-by: Chen Yu <yu.c.chen@intel.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> Signed-off-by: Zhang Rui <rui.zhang@intel.com>

Ricardo, any concerns?

> ---
>  drivers/thermal/intel/intel_hfi.c | 30 +++++++++++++++++-------------
>  1 file changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
> index a180a98bb9f1..5b18a46a10b0 100644
> --- a/drivers/thermal/intel/intel_hfi.c
> +++ b/drivers/thermal/intel/intel_hfi.c
> @@ -401,10 +401,10 @@ static void hfi_disable(void)
>   * intel_hfi_online() - Enable HFI on @cpu
>   * @cpu:       CPU in which the HFI will be enabled
>   *
> - * Enable the HFI to be used in @cpu. The HFI is enabled at the die/package
> - * level. The first CPU in the die/package to come online does the full HFI
> + * Enable the HFI to be used in @cpu. The HFI is enabled at the package
> + * level. The first CPU in the package to come online does the full HFI
>   * initialization. Subsequent CPUs will just link themselves to the HFI
> - * instance of their die/package.
> + * instance of their package.
>   *
>   * This function is called before enabling the thermal vector in the local APIC
>   * in order to ensure that @cpu has an associated HFI instance when it receives
> @@ -414,31 +414,31 @@ void intel_hfi_online(unsigned int cpu)
>  {
>         struct hfi_instance *hfi_instance;
>         struct hfi_cpu_info *info;
> -       u16 die_id;
> +       u16 pkg_id;
>
>         /* Nothing to do if hfi_instances are missing. */
>         if (!hfi_instances)
>                 return;
>
>         /*
> -        * Link @cpu to the HFI instance of its package/die. It does not
> +        * Link @cpu to the HFI instance of its package. It does not
>          * matter whether the instance has been initialized.
>          */
>         info = &per_cpu(hfi_cpu_info, cpu);
> -       die_id = topology_logical_die_id(cpu);
> +       pkg_id = topology_logical_package_id(cpu);
>         hfi_instance = info->hfi_instance;
>         if (!hfi_instance) {
> -               if (die_id >= max_hfi_instances)
> +               if (pkg_id >= max_hfi_instances)
>                         return;
>
> -               hfi_instance = &hfi_instances[die_id];
> +               hfi_instance = &hfi_instances[pkg_id];
>                 info->hfi_instance = hfi_instance;
>         }
>
>         init_hfi_cpu_index(info);
>
>         /*
> -        * Now check if the HFI instance of the package/die of @cpu has been
> +        * Now check if the HFI instance of the package of @cpu has been
>          * initialized (by checking its header). In such case, all we have to
>          * do is to add @cpu to this instance's cpumask and enable the instance
>          * if needed.
> @@ -504,7 +504,7 @@ void intel_hfi_online(unsigned int cpu)
>   *
>   * On some processors, hardware remembers previous programming settings even
>   * after being reprogrammed. Thus, keep HFI enabled even if all CPUs in the
> - * die/package of @cpu are offline. See note in intel_hfi_online().
> + * package of @cpu are offline. See note in intel_hfi_online().
>   */
>  void intel_hfi_offline(unsigned int cpu)
>  {
> @@ -674,9 +674,13 @@ void __init intel_hfi_init(void)
>         if (hfi_parse_features())
>                 return;
>
> -       /* There is one HFI instance per die/package. */
> -       max_hfi_instances = topology_max_packages() *
> -                           topology_max_dies_per_package();
> +       /*
> +        * Note: HFI resources are managed at the physical package scope.
> +        * There could be platforms that enumerate packages as Linux dies.
> +        * Special handling would be needed if this happens on an HFI-capable
> +        * platform.
> +        */
> +       max_hfi_instances = topology_max_packages();
>
>         /*
>          * This allocation may fail. CPU hotplug callbacks must check
> --
> 2.34.1
>
>
Ricardo Neri July 9, 2024, 2:19 a.m. UTC | #2
On Wed, Jul 03, 2024 at 01:33:03PM +0200, Rafael J. Wysocki wrote:
> On Wed, Jul 3, 2024 at 7:55 AM Zhang Rui <rui.zhang@intel.com> wrote:
> >
> > The Intel Software Developer's Manual defines the scope of HFI (registers
> > and memory buffer) as package. Use package scope* in the software
> 
> "as a package"
> 
> > representation of an HFI instance.
> >
> > Using die scope in HFI instances has the effect of creating multiple,
> > conflicting, instances for the same package: each instance allocates its
> > own memory buffer and configures the same package-level registers.
> > Specifically, only one of the allocated memory buffers can be set in the
> > MSR_IA32_HW_FEEDBACK_PTR register. CPUs get incorrect HFI data from the
> > table.
> >
> > The problem does not affect current HFI-capable platforms because they
> > all have single-die processors.
> >
> > * We used die scope for HFI instances because there have been processors
> > in which packages where enumerated as dies. None of those systems support
> 
> "were"
> 
> > HFI. If such a system emerged we would need to quirk it.
> >
> > Co-developed-by: Chen Yu <yu.c.chen@intel.com>
> > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> > Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> 
> Ricardo, any concerns?

No concerns. IMO, it is important to document why we were using die scope
before. Rui has done it.

This patch looks good to me.

FWIW, Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Rafael J. Wysocki July 9, 2024, 4:30 p.m. UTC | #3
On Tue, Jul 9, 2024 at 4:13 AM Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:
>
> On Wed, Jul 03, 2024 at 01:33:03PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Jul 3, 2024 at 7:55 AM Zhang Rui <rui.zhang@intel.com> wrote:
> > >
> > > The Intel Software Developer's Manual defines the scope of HFI (registers
> > > and memory buffer) as package. Use package scope* in the software
> >
> > "as a package"
> >
> > > representation of an HFI instance.
> > >
> > > Using die scope in HFI instances has the effect of creating multiple,
> > > conflicting, instances for the same package: each instance allocates its
> > > own memory buffer and configures the same package-level registers.
> > > Specifically, only one of the allocated memory buffers can be set in the
> > > MSR_IA32_HW_FEEDBACK_PTR register. CPUs get incorrect HFI data from the
> > > table.
> > >
> > > The problem does not affect current HFI-capable platforms because they
> > > all have single-die processors.
> > >
> > > * We used die scope for HFI instances because there have been processors
> > > in which packages where enumerated as dies. None of those systems support
> >
> > "were"
> >
> > > HFI. If such a system emerged we would need to quirk it.
> > >
> > > Co-developed-by: Chen Yu <yu.c.chen@intel.com>
> > > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> > > Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> >
> > Ricardo, any concerns?
>
> No concerns. IMO, it is important to document why we were using die scope
> before. Rui has done it.
>
> This patch looks good to me.
>
> FWIW, Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>

Applied as 6.11 material, thanks!
diff mbox series

Patch

diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index a180a98bb9f1..5b18a46a10b0 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -401,10 +401,10 @@  static void hfi_disable(void)
  * intel_hfi_online() - Enable HFI on @cpu
  * @cpu:	CPU in which the HFI will be enabled
  *
- * Enable the HFI to be used in @cpu. The HFI is enabled at the die/package
- * level. The first CPU in the die/package to come online does the full HFI
+ * Enable the HFI to be used in @cpu. The HFI is enabled at the package
+ * level. The first CPU in the package to come online does the full HFI
  * initialization. Subsequent CPUs will just link themselves to the HFI
- * instance of their die/package.
+ * instance of their package.
  *
  * This function is called before enabling the thermal vector in the local APIC
  * in order to ensure that @cpu has an associated HFI instance when it receives
@@ -414,31 +414,31 @@  void intel_hfi_online(unsigned int cpu)
 {
 	struct hfi_instance *hfi_instance;
 	struct hfi_cpu_info *info;
-	u16 die_id;
+	u16 pkg_id;
 
 	/* Nothing to do if hfi_instances are missing. */
 	if (!hfi_instances)
 		return;
 
 	/*
-	 * Link @cpu to the HFI instance of its package/die. It does not
+	 * Link @cpu to the HFI instance of its package. It does not
 	 * matter whether the instance has been initialized.
 	 */
 	info = &per_cpu(hfi_cpu_info, cpu);
-	die_id = topology_logical_die_id(cpu);
+	pkg_id = topology_logical_package_id(cpu);
 	hfi_instance = info->hfi_instance;
 	if (!hfi_instance) {
-		if (die_id >= max_hfi_instances)
+		if (pkg_id >= max_hfi_instances)
 			return;
 
-		hfi_instance = &hfi_instances[die_id];
+		hfi_instance = &hfi_instances[pkg_id];
 		info->hfi_instance = hfi_instance;
 	}
 
 	init_hfi_cpu_index(info);
 
 	/*
-	 * Now check if the HFI instance of the package/die of @cpu has been
+	 * Now check if the HFI instance of the package of @cpu has been
 	 * initialized (by checking its header). In such case, all we have to
 	 * do is to add @cpu to this instance's cpumask and enable the instance
 	 * if needed.
@@ -504,7 +504,7 @@  void intel_hfi_online(unsigned int cpu)
  *
  * On some processors, hardware remembers previous programming settings even
  * after being reprogrammed. Thus, keep HFI enabled even if all CPUs in the
- * die/package of @cpu are offline. See note in intel_hfi_online().
+ * package of @cpu are offline. See note in intel_hfi_online().
  */
 void intel_hfi_offline(unsigned int cpu)
 {
@@ -674,9 +674,13 @@  void __init intel_hfi_init(void)
 	if (hfi_parse_features())
 		return;
 
-	/* There is one HFI instance per die/package. */
-	max_hfi_instances = topology_max_packages() *
-			    topology_max_dies_per_package();
+	/*
+	 * Note: HFI resources are managed at the physical package scope.
+	 * There could be platforms that enumerate packages as Linux dies.
+	 * Special handling would be needed if this happens on an HFI-capable
+	 * platform.
+	 */
+	max_hfi_instances = topology_max_packages();
 
 	/*
 	 * This allocation may fail. CPU hotplug callbacks must check