[v2,8/8] arch_topology: Add support to build llc_sibling on DT platforms

Message ID 20220518093325.2070336-9-sudeep.holla@arm.com (mailing list archive)
State New, archived
Series arch_topology: Updates to add socket support and fix cluster ids

Commit Message

Sudeep Holla May 18, 2022, 9:33 a.m. UTC
ACPI PPTT provides cache identifiers; in particular, the last level
cache identifier is used to obtain the last level cache siblings
amongst CPUs.

While the cacheinfo driver does build a cpumap of all the CPUs sharing
the last level cache, it is populated quite late in boot, whereas the
information is needed quite early to build the scheduler domains.

On DT platforms we can use the pointer to the last level cache node as
the firmware identifier for the last level cache and build the cpumap
of CPUs sharing the last level cache based on it.

Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c  | 18 ++++++++++++++----
 include/linux/arch_topology.h |  1 +
 2 files changed, 15 insertions(+), 4 deletions(-)
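
As a rough illustration of the idea (not part of the patch): on DT the
identifier is simply the deepest cache node reachable from the CPU node
via "next-level-cache". A minimal sketch, assuming only the existing
of_get_cpu_node()/of_find_next_cache_node() helpers from drivers/of/base.c
(of_find_last_level_cache_node() itself is added earlier in this series):

#include <linux/of.h>

/*
 * Illustrative sketch only: follow the "next-level-cache" chain from the
 * CPU's DT node and return the deepest cache node. The returned pointer
 * is only ever used as an opaque identifier and compared for equality.
 */
static struct device_node *sketch_llc_node(int cpu)
{
	struct device_node *node = of_get_cpu_node(cpu, NULL);

	while (node) {
		struct device_node *next = of_find_next_cache_node(node);

		if (!next)
			break;
		of_node_put(node);
		node = next;
	}
	return node;	/* deepest cache node (the CPU node if it has no caches) */
}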

Comments

Rob Herring May 19, 2022, 6:10 p.m. UTC | #1
On Wed, May 18, 2022 at 4:34 AM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> ACPI PPTT provides cache identifiers and especially the last level cache
> identifier is used in obtaining last level cache siblings amongst CPUs.
>
> While we have the cpu map representing all the CPUs sharing last level
> cache in the cacheinfo driver, it is populated quite late in the boot
> while the information is needed to build scheduler domains quite early.

Late is because it's a device_initcall() rather than late in the cpu
hotplug state machine, right? The late aspect is for sysfs presumably,
but I think we could decouple that. Do all the firmware cache parsing
early and then populate the sysfs parts later. It's not a unique
problem as the DT unflattening and init code has to do the same thing.
I'd assume the hotplug and cpu sysfs devices have to deal with the
same thing.

Rob
Dietmar Eggemann May 20, 2022, 12:33 p.m. UTC | #2
On 18/05/2022 11:33, Sudeep Holla wrote:
> ACPI PPTT provides cache identifiers and especially the last level cache
> identifier is used in obtaining last level cache siblings amongst CPUs.
> 
> While we have the cpu map representing all the CPUs sharing last level
> cache in the cacheinfo driver, it is populated quite late in the boot
> while the information is needed to build scheduler domains quite early.
> 
> On DT platforms we can use the pointer to the last level cache as the
> firmware identifier for the last level cache and build the cpumap sharing
> the last level cache based on the same.

[...]

> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 58cbe18d825c..d8a36b0e27c9 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -69,6 +69,7 @@ struct cpu_topology {
>  	int cluster_id;
>  	int package_id;
>  	int llc_id;
> +	void *llc_fw_node;

Would be nicer if you could set llc_id directly to avoid all this
additional `llc_id and llc_fw_node` sync code. ACPI PPTT has this
ACPI_PTR_DIFF() macro which IMHO lets it create distinct ids (rough
sketch after the quoted hunk below).

>  	cpumask_t thread_sibling;
>  	cpumask_t core_sibling;
>  	cpumask_t cluster_sibling;
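
For reference, a rough sketch of the kind of distinct id meant here
(illustrative only, not the in-tree pptt.c code): the cache node's byte
offset from the start of the static PPTT table is unique per node and
small enough to store in an int.

#include <linux/acpi.h>

/*
 * Illustrative sketch only: turn a PPTT cache node into a distinct
 * integer id by taking its offset from the start of the static table,
 * using the existing ACPI_PTR_DIFF() pointer-difference macro.
 */
static int sketch_pptt_cache_id(struct acpi_table_header *table_hdr,
				struct acpi_pptt_cache *cache)
{
	return ACPI_PTR_DIFF(cache, table_hdr);
}
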
Sudeep Holla May 20, 2022, 12:59 p.m. UTC | #3
On Thu, May 19, 2022 at 01:10:51PM -0500, Rob Herring wrote:
> On Wed, May 18, 2022 at 4:34 AM Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > ACPI PPTT provides cache identifiers and especially the last level cache
> > identifier is used in obtaining last level cache siblings amongst CPUs.
> >
> > While we have the cpu map representing all the CPUs sharing last level
> > cache in the cacheinfo driver, it is populated quite late in the boot
> > while the information is needed to build scheduler domains quite early.
>
> Late is because it's a device_initcall() rather than late in the cpu
> hotplug state machine, right?

Right. The expectation is to run it on each online CPU in the CPU hotplug
state machine for some architectures. We may not need that on arm64,
since we get all the info from DT or ACPI, but e.g. x86 uses cpuid, which
needs to be executed on that CPU.
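
For reference, a rough sketch of why the x86 path is inherently per-CPU
(illustrative only; leaf 4 shown here is the Intel deterministic cache
parameters leaf, AMD uses leaf 0x8000001d):

#include <asm/processor.h>

/*
 * Illustrative sketch only: CPUID describes the caches of the CPU it
 * executes on, so it has to run on the CPU being brought online (e.g.
 * from its hotplug callback), not from generic boot code.
 */
static unsigned int sketch_cache_level(unsigned int index)
{
	unsigned int eax, ebx, ecx, edx;

	cpuid_count(4, index, &eax, &ebx, &ecx, &edx);
	return (eax >> 5) & 0x7;	/* EAX bits 7:5: cache level */
}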

> The late aspect is for sysfs presumably,but I think we could decouple that.

OK, not sure when this sched_domain info is actually needed. I think it
could be decoupled if we can wait until all the cpus are online.

> Do all the firmware cache parsing early and then populate the sysfs parts
> later.

Yes, that may work on DT/ACPI based systems; as I said, x86 relies on cpuid.

> It's not a unique problem as the DT unflattening and init code has to
> do the same thing. I'd assume the hotplug and cpu sysfs devices have
> to deal with the same thing.
>

OK, I will take a look at how to do that.

--
Regards,
Sudeep
Rob Herring May 20, 2022, 2:36 p.m. UTC | #4
On Fri, May 20, 2022 at 01:59:59PM +0100, Sudeep Holla wrote:
> On Thu, May 19, 2022 at 01:10:51PM -0500, Rob Herring wrote:
> > On Wed, May 18, 2022 at 4:34 AM Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > ACPI PPTT provides cache identifiers and especially the last level cache
> > > identifier is used in obtaining last level cache siblings amongst CPUs.
> > >
> > > While we have the cpu map representing all the CPUs sharing last level
> > > cache in the cacheinfo driver, it is populated quite late in the boot
> > > while the information is needed to build scheduler domains quite early.
> >
> > Late is because it's a device_initcall() rather than late in the cpu
> > hotplug state machine, right?
> 
> Right. The expectation is to run in on each online CPU in CPU hotplug state
> machine for some architectures. We may not need that on arm64 especially
> since we get all info from DT or ACPI, but e.g. x86 uses cpuid which needs
> to be executed on that CPU.

That's a separate issue. I'm not suggesting changing that part (that 
would just be an optimization).

> > The late aspect is for sysfs presumably,but I think we could decouple that.
> 
> OK, not sure when this sched_domain info is actually needed. It think it
> could be decoupled if we can wait until all the cpus are online.

No need to wait for all cpus to be online. I think you can keep doing
it as part of cpu hotplug. The device_initcall() is used because you
cannot have struct device or sysfs calls before the driver core is
initialized. If we run the cacheinfo code earlier (I think the arch code
will have to call it) just like the topology code and skip the sysfs
parts, then you can use it.

> > Do all the firmware cache parsing early and then populate the sysfs parts
> > later.
> 
> Yes that may work on DT/ACPI based systems, as I said x86 relies on cpuid.

I'd assume using cpuid works at any time?

> > It's not a unique problem as the DT unflattening and init code has to
> > do the same thing. I'd assume the hotplug and cpu sysfs devices have
> > to deal with the same thing.
> >
> 
> OK, I will take a look at how to do that.
> 
> --
> Regards,
> Sudeep
Sudeep Holla May 20, 2022, 2:56 p.m. UTC | #5
On Fri, May 20, 2022 at 02:33:46PM +0200, Dietmar Eggemann wrote:
> On 18/05/2022 11:33, Sudeep Holla wrote:
> > ACPI PPTT provides cache identifiers and especially the last level cache
> > identifier is used in obtaining last level cache siblings amongst CPUs.
> > 
> > While we have the cpu map representing all the CPUs sharing last level
> > cache in the cacheinfo driver, it is populated quite late in the boot
> > while the information is needed to build scheduler domains quite early.
> > 
> > On DT platforms we can use the pointer to the last level cache as the
> > firmware identifier for the last level cache and build the cpumap sharing
> > the last level cache based on the same.
> 
> [...]
> 
> > diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> > index 58cbe18d825c..d8a36b0e27c9 100644
> > --- a/include/linux/arch_topology.h
> > +++ b/include/linux/arch_topology.h
> > @@ -69,6 +69,7 @@ struct cpu_topology {
> >  	int cluster_id;
> >  	int package_id;
> >  	int llc_id;
> > +	void *llc_fw_node;
> 
> Would be nicer if you could set llc_id directly to avoid all this
> addition sync `llc_id and llc_fw_node` code. ACPI PPTT has this
> ACPI_PTR_DIFF() macro which IMHO lets it create distinct ids.
> 

Indeed, I initially thought so, but these are 64-bit pointers and there
is no obvious common reference to diff them against, whereas in ACPI the
nodes sit in a static table and we can use the offset from the top of
the table as the id.
Sudeep Holla May 20, 2022, 3:06 p.m. UTC | #6
On Fri, May 20, 2022 at 09:36:20AM -0500, Rob Herring wrote:
> On Fri, May 20, 2022 at 01:59:59PM +0100, Sudeep Holla wrote:
> > On Thu, May 19, 2022 at 01:10:51PM -0500, Rob Herring wrote:
> > > On Wed, May 18, 2022 at 4:34 AM Sudeep Holla <sudeep.holla@arm.com> wrote:
> > > >
> > > > ACPI PPTT provides cache identifiers and especially the last level cache
> > > > identifier is used in obtaining last level cache siblings amongst CPUs.
> > > >
> > > > While we have the cpu map representing all the CPUs sharing last level
> > > > cache in the cacheinfo driver, it is populated quite late in the boot
> > > > while the information is needed to build scheduler domains quite early.
> > >
> > > Late is because it's a device_initcall() rather than late in the cpu
> > > hotplug state machine, right?
> > 
> > Right. The expectation is to run in on each online CPU in CPU hotplug state
> > machine for some architectures. We may not need that on arm64 especially
> > since we get all info from DT or ACPI, but e.g. x86 uses cpuid which needs
> > to be executed on that CPU.
> 
> That's a separate issue. I'm not suggesting changing that part (that 
> would just be an optimization).
> 
> > > The late aspect is for sysfs presumably,but I think we could decouple that.
> > 
> > OK, not sure when this sched_domain info is actually needed. It think it
> > could be decoupled if we can wait until all the cpus are online.
> 
> No need to wait for all cpus to be online. I think you keep doing 
> it as part of cpu hotplug. The device_initcall() is used because you 
> cannot have struct device or sysfs calls before the driver core is 
> initialized. If we run the cacheinfo code earlier (I think the arch code 
> will have to call it) just like the topology code and skip the sysfs 
> parts, then you can use it.
>

Yes, I was thinking along similar lines, though I hadn't thought of
pushing the code to the arch side. Let me check, it must be possible.

> > > Do all the firmware cache parsing early and then populate the sysfs parts
> > > later.
> > 
> > Yes that may work on DT/ACPI based systems, as I said x86 relies on cpuid.
> 
> I'd assume using cpuid works at any time?
>

I think so; I can't recall the details since I last looked at that about
5 years ago. I have to check again anyway to explore early initialisation.

Patch

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 6d3346efe74b..bc57f0f1862e 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -661,11 +661,14 @@  static int __init parse_dt_topology(void)
 	 * Check that all cores are in the topology; the SMP code will
 	 * only mark cores described in the DT as possible.
 	 */
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
 		if (cpu_topology[cpu].package_id < 0) {
 			ret = -EINVAL;
 			break;
 		}
+		cpu_topology[cpu].llc_fw_node =
+					of_find_last_level_cache_node(cpu);
+	}
 
 out_map:
 	of_node_put(map);
@@ -681,6 +684,12 @@  static int __init parse_dt_topology(void)
 struct cpu_topology cpu_topology[NR_CPUS];
 EXPORT_SYMBOL_GPL(cpu_topology);
 
+#define IS_VALID_LLC_ID(x)	\
+	((x)->llc_id >= 0 || (x)->llc_fw_node)
+#define IS_MATCH_LLC_ID(x, y)	\
+	(((x)->llc_id >= 0 && (x)->llc_id == (y)->llc_id) || \
+	 ((x)->llc_fw_node && (x)->llc_fw_node == (y)->llc_fw_node))
+
 const struct cpumask *cpu_coregroup_mask(int cpu)
 {
 	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
@@ -690,7 +699,8 @@  const struct cpumask *cpu_coregroup_mask(int cpu)
 		/* not numa in package, lets use the package siblings */
 		core_mask = &cpu_topology[cpu].core_sibling;
 	}
-	if (cpu_topology[cpu].llc_id >= 0) {
+
+	if (IS_VALID_LLC_ID(&cpu_topology[cpu])) {
 		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
 			core_mask = &cpu_topology[cpu].llc_sibling;
 	}
@@ -721,8 +731,7 @@  void update_siblings_masks(unsigned int cpuid)
 	for_each_online_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];
 
-		if (cpu_topo->llc_id >= 0 &&
-		    cpuid_topo->llc_id == cpu_topo->llc_id) {
+		if (IS_MATCH_LLC_ID(cpu_topo, cpuid_topo)) {
 			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
 		}
@@ -777,6 +786,7 @@  void __init reset_cpu_topology(void)
 		cpu_topo->cluster_id = -1;
 		cpu_topo->package_id = -1;
 		cpu_topo->llc_id = -1;
+		cpu_topo->llc_fw_node = NULL;
 
 		clear_cpu_topology(cpu);
 	}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18d825c..d8a36b0e27c9 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -69,6 +69,7 @@  struct cpu_topology {
 	int cluster_id;
 	int package_id;
 	int llc_id;
+	void *llc_fw_node;
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
 	cpumask_t cluster_sibling;