[2/2] arm64: topology: add MPIDR-based detection

Message ID	1398217214-12204-3-git-send-email-zlim@broadcom.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org> From: Zi Shen Lim <zlim@broadcom.com> To: Catalin Marinas <catalin.marinas@arm.com>, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>, Mark Brown <broonie@linaro.org>, Mark Rutland <mark.rutland@arm.com>, Will Deacon <will.deacon@arm.com> Subject: [PATCH 2/2] arm64: topology: add MPIDR-based detection Date: Tue, 22 Apr 2014 18:40:14 -0700 Message-ID: <1398217214-12204-3-git-send-email-zlim@broadcom.com> In-Reply-To: <1398217214-12204-1-git-send-email-zlim@broadcom.com> References: <1398217214-12204-1-git-send-email-zlim@broadcom.com> MIME-Version: 1.0 Cc: Zi Shen Lim <zlim@broadcom.com>, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Zi Shen Lim April 23, 2014, 1:40 a.m. UTC

Create cpu topology based on MPIDR. When hardware sets MPIDR to sane
values, this method will always work. Therefore it should also work well
as the fallback method. [1]

[1] http://www.spinics.net/lists/arm-kernel/msg317445.html

Signed-off-by: Zi Shen Lim <zlim@broadcom.com>
---
 arch/arm64/include/asm/cputype.h |  2 ++
 arch/arm64/kernel/topology.c     | 31 +++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

Mark Brown April 23, 2014, 10:36 a.m. UTC | #1

On Tue, Apr 22, 2014 at 06:40:14PM -0700, Zi Shen Lim wrote:

> +		/* Multiprocessor system */
> +		if (mpidr & MPIDR_MT_BITMASK) {
> +			/* Multi-threads per core */
> +			cpuid_topo->thread_id  = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +			cpuid_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +			cpuid_topo->cluster_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
> +		} else {
> +			/* Single-thread per core */
> +			cpuid_topo->thread_id  = -1;
> +			cpuid_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +			cpuid_topo->cluster_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +		}
> +	}

This means that we ignore affinity level 3 and on non-MT cores we ignore
affinity level 2.  That means that if it runs on some system where we do
have multiple levels of clustering (for example some future multi socket
server) or if for some reason the hardware engineers have decided to use
one of the higher affinity levels then we will incorrectly report cores
from several clusters as being part of a single cluster.

I had been intending to just combine all the bits from affinitly levels
above the CPU number into a single number until we know what to do with
them individually.  We shouldn't just ignore them.

> +	pr_info("CPU%u: cluster %d core %d thread %d mpidr %llx\n",
> +		cpuid, cpuid_topo->cluster_id, cpuid_topo->core_id,
> +		cpuid_topo->thread_id, mpidr);
> +

Catalin or Lorenzo asked for stuff like that to be taken out.

Zi Shen Lim April 23, 2014, 5:27 p.m. UTC | #2

On Wed, Apr 23, 2014 at 11:36:48AM +0100, Mark Brown wrote:
> On Tue, Apr 22, 2014 at 06:40:14PM -0700, Zi Shen Lim wrote:
> 
> > +		/* Multiprocessor system */
> > +		if (mpidr & MPIDR_MT_BITMASK) {
> > +			/* Multi-threads per core */
> > +			cpuid_topo->thread_id  = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> > +			cpuid_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> > +			cpuid_topo->cluster_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
> > +		} else {
> > +			/* Single-thread per core */
> > +			cpuid_topo->thread_id  = -1;
> > +			cpuid_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> > +			cpuid_topo->cluster_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> > +		}
> > +	}
> 
> This means that we ignore affinity level 3 and on non-MT cores we ignore
> affinity level 2.  That means that if it runs on some system where we do
> have multiple levels of clustering (for example some future multi socket
> server) or if for some reason the hardware engineers have decided to use
> one of the higher affinity levels then we will incorrectly report cores
> from several clusters as being part of a single cluster.
> 

Good points.

"Cluster" in my case is actually "Socket".

I guess single-socket in b.L configuration could have multiple clusters,
in which case, a new 'socket_id' would map to aff3 in MT case.

	thread	= aff0
	core	= aff1
	cluster	= aff2
	socket	= aff3

For non-MT, it could very well be aff2?

	thread	= -1
	core	= aff0
	cluster	= aff1
	socket	= aff2

Or is it likely that some folks may opt to skip aff2, and simply use aff3?
Mark, is there precedence for such usage of affinity levels?

Maybe ARM folks can point us to documentation or conventions w.r.t.
usage of those affinity levels.  Otherwise, this has potential to become
a creative playground :)

> I had been intending to just combine all the bits from affinitly levels
> above the CPU number into a single number until we know what to do with
> them individually.  We shouldn't just ignore them.
> 

I agree we shouldn't ignore aff3 if someone is already using it.

I'm not sure how combining them into a single number helps with topology.
We already started out with a cpuid, no?

Perhaps we should just add a new 'socket_id' and that will accommodate
all cases (up to aff3).

> > +	pr_info("CPU%u: cluster %d core %d thread %d mpidr %llx\n",
> > +		cpuid, cpuid_topo->cluster_id, cpuid_topo->core_id,
> > +		cpuid_topo->thread_id, mpidr);
> > +
> 
> Catalin or Lorenzo asked for stuff like that to be taken out.

Is it ok to change it pr_debug instead? Or drop it completely?

Mark Brown April 23, 2014, 6:26 p.m. UTC | #3

On Wed, Apr 23, 2014 at 10:27:20AM -0700, Zi Shen Lim wrote:

> Or is it likely that some folks may opt to skip aff2, and simply use aff3?
> Mark, is there precedence for such usage of affinity levels?

Not that I'm aware of at the minute myself but of course this code may
end up running on some enterprise distribution with an extended support
time with hardware that isn't even on the drawing board now.

> > I had been intending to just combine all the bits from affinitly levels
> > above the CPU number into a single number until we know what to do with
> > them individually.  We shouldn't just ignore them.

> I agree we shouldn't ignore aff3 if someone is already using it.

> I'm not sure how combining them into a single number helps with topology.
> We already started out with a cpuid, no?

It will at least ensure that all clusters get assigned a unique ID and
we don't end up discarding some of the information and coming out with
two identically numbered clusters which then have identically numbered
CPUs inside of them which doesn't seem clever.

When I was looking at this it wasn't sufficiently clear to me that the
cluster clustering would be well modelled by sockets as the scheduler
currently assumes them, nor what to do with additional levels of that
(the DT binding allows for infinite levels).  Punting and just putting
all clusters at the same level avoids active bugs and seems fairly
conservative.

> Perhaps we should just add a new 'socket_id' and that will accommodate
> all cases (up to aff3).

Not in the non-MT case where we've got two levels above the cluster ID
in affinity level 1 unless we just combine 2 and 3 (which would be
reasonable enough of course).

Zi Shen Lim April 23, 2014, 7:22 p.m. UTC | #4

On Wed, Apr 23, 2014 at 07:26:11PM +0100, Mark Brown wrote:
> On Wed, Apr 23, 2014 at 10:27:20AM -0700, Zi Shen Lim wrote:
> 
> It will at least ensure that all clusters get assigned a unique ID and
> we don't end up discarding some of the information and coming out with
> two identically numbered clusters which then have identically numbered
> CPUs inside of them which doesn't seem clever.

I agree with you. Simply ignoring aff3 is not acceptable, whether or not
someone is using it.

> 
> When I was looking at this it wasn't sufficiently clear to me that the
> cluster clustering would be well modelled by sockets as the scheduler
> currently assumes them, nor what to do with additional levels of that
> (the DT binding allows for infinite levels).  Punting and just putting
> all clusters at the same level avoids active bugs and seems fairly
> conservative.
> 

Sounds like you prefer "cluster of clusters" over "socket", correct?

In any case, with only 4 affinity levels defined in the arch, as long as
we also have 4 variables to capture that information, we should be good,
right?

Anything more exotic not expressable by these 4 affinity levels in MPIDR
will require additional information from other sources such as DT or ACPI.

> > Perhaps we should just add a new 'socket_id' and that will accommodate
> > all cases (up to aff3).
> 
> Not in the non-MT case where we've got two levels above the cluster ID
> in affinity level 1 unless we just combine 2 and 3 (which would be
> reasonable enough of course).

Is the following an accurate description of your proposal for non-MT?

	thread_id   = -1
	core_id     = aff0
	cluster_id  = aff1
	clusters_id = combine(aff2,aff3)

Mark Brown April 23, 2014, 10:38 p.m. UTC | #5

On Wed, Apr 23, 2014 at 12:22:28PM -0700, Zi Shen Lim wrote:
> On Wed, Apr 23, 2014 at 07:26:11PM +0100, Mark Brown wrote:

> > When I was looking at this it wasn't sufficiently clear to me that the
> > cluster clustering would be well modelled by sockets as the scheduler
> > currently assumes them, nor what to do with additional levels of that
> > (the DT binding allows for infinite levels).  Punting and just putting
> > all clusters at the same level avoids active bugs and seems fairly
> > conservative.

> Sounds like you prefer "cluster of clusters" over "socket", correct?

I don't actually care that much, I guess I was using the hardware
terminology because I'm not sure how well it will map onto the current
Linux software model of what a socket is but if we decide it maps well
onto a socket that's fine also.

> In any case, with only 4 affinity levels defined in the arch, as long as
> we also have 4 variables to capture that information, we should be good,
> right?

> Anything more exotic not expressable by these 4 affinity levels in MPIDR
> will require additional information from other sources such as DT or ACPI.

Yes.  I'd really only thought it through properly for the DT case since
at the time Catalin was saying that it would not be possible to use
MPIDR.

> > > Perhaps we should just add a new 'socket_id' and that will accommodate
> > > all cases (up to aff3).

> > Not in the non-MT case where we've got two levels above the cluster ID
> > in affinity level 1 unless we just combine 2 and 3 (which would be
> > reasonable enough of course).

> Is the following an accurate description of your proposal for non-MT?

> 	thread_id   = -1
> 	core_id     = aff0
> 	cluster_id  = aff1
> 	clusters_id = combine(aff2,aff3)

Yes, exactly in the combining case - probably just shift one of them
left and or them together.  Otherwise just only have the cluster_id and
combine everything else into one (that's what the DT code does
effectively).

[2/2] arm64: topology: add MPIDR-based detection

Commit Message

Comments

Patch