
[v6,1/7] Documentation: DT: arm: add support for sockets defining package boundaries

Message ID: 20190529211340.17087-2-atish.patra@wdc.com (mailing list archive)
State: New, archived
Series: Unify CPU topology across ARM & RISC-V

Commit Message

Atish Patra May 29, 2019, 9:13 p.m. UTC
From: Sudeep Holla <sudeep.holla@arm.com>

The current ARM DT topology description provides the operating system
with a topological view of the system based on leaf nodes representing
either cores or threads (in an SMT system) and a hierarchical set of
cluster nodes describing how those cores and threads are grouped.

However, this hierarchical representation of clusters does not allow
describing which topology level actually represents the physical package
or socket boundary, a key piece of information an operating system needs
to optimize resource allocation and scheduling.

Let's add a new "socket" node type in the cpu-map node to describe
physical socket boundaries.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 .../devicetree/bindings/arm/topology.txt      | 52 ++++++++++++++-----
 1 file changed, 39 insertions(+), 13 deletions(-)
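
For illustration, a minimal sketch (not part of the patch) of how a
dual-socket system with one cluster of two cores per socket might be
described under the proposed binding; the CPU0..CPU3 labels are assumed
to be phandles of cpu nodes defined elsewhere under the cpus node:

cpu-map {
	socket0 {
		cluster0 {
			core0 { cpu = <&CPU0>; };
			core1 { cpu = <&CPU1>; };
		};
	};
	socket1 {
		cluster0 {
			core0 { cpu = <&CPU2>; };
			core1 { cpu = <&CPU3>; };
		};
	};
};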

Comments

Andrew Davis May 29, 2019, 11:39 p.m. UTC | #1
On 5/29/19 5:13 PM, Atish Patra wrote:
> From: Sudeep Holla <sudeep.holla@arm.com>
> 
> The current ARM DT topology description provides the operating system
> with a topological view of the system based on leaf nodes representing
> either cores or threads (in an SMT system) and a hierarchical set of
> cluster nodes describing how those cores and threads are grouped.
> 
> However, this hierarchical representation of clusters does not allow
> describing which topology level actually represents the physical package
> or socket boundary, a key piece of information an operating system needs
> to optimize resource allocation and scheduling.
> 

Are physical package descriptions really needed? What does "socket"
imply that a higher-layer "cluster" node grouping does not? It doesn't
imply a different NUMA distance, and the definition of "socket" is
already not well defined: is a dual-chiplet processor not just a fancy
dual "socket"? Are dual "sockets" on a server board a "slotket" card?
Will we need new names for those too?

Andrew

Morten Rasmussen May 30, 2019, 11:51 a.m. UTC | #2
On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
> [...]
> 
> Are physical package descriptions really needed? What does "socket"
> imply that a higher-layer "cluster" node grouping does not? It doesn't
> imply a different NUMA distance, and the definition of "socket" is
> already not well defined: is a dual-chiplet processor not just a fancy
> dual "socket"? Are dual "sockets" on a server board a "slotket" card?
> Will we need new names for those too?

Socket (or package) just implies what you suggest, a grouping of CPUs
based on the physical socket (or package). Some resources might be
associated with packages, and, more importantly, socket information is
exposed to user-space. At the moment clusters are being exposed to
user-space as sockets, which is less than ideal for some topologies.

At the moment user-space is only told about hw threads, cores, and
sockets. In the very near future it is going to be told about dies too
(look for Len Brown's multi-die patch set).

I don't see how we can provide correct information to user-space based
on the current information in DT. I'm not convinced it was a good idea
to expose this information to user-space to begin with but that is
another discussion.

Morten
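
To make the above concrete: with the current binding, a single-package
part with two clusters can only be described as in the sketch below (the
CPUn phandles are made up). Nothing here says that the two clusters
share one physical package, so each top-level cluster ends up being
reported to user-space as a separate socket:

cpu-map {
	cluster0 {
		core0 { cpu = <&CPU0>; };
		core1 { cpu = <&CPU1>; };
	};
	cluster1 {
		core0 { cpu = <&CPU2>; };
		core1 { cpu = <&CPU3>; };
	};
};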
Andrew Davis May 30, 2019, 12:56 p.m. UTC | #3
On 5/30/19 7:51 AM, Morten Rasmussen wrote:
> [...]
> 
> Socket (or package) just implies what you suggest, a grouping of CPUs
> based on the physical socket (or package). Some resources might be
> associated with packages, and, more importantly, socket information is
> exposed to user-space. At the moment clusters are being exposed to
> user-space as sockets, which is less than ideal for some topologies.
> 

I see the benefit of reporting the physical layout and packaging
information to user-space for tracking reasons, but from a software
perspective this doesn't matter, and the resource partitioning should
be described elsewhere (NUMA nodes being the go-to example).

> At the moment user-space is only told about hw threads, cores, and
> sockets. In the very near future it is going to be told about dies too
> (look for Len Brown's multi-die patch set).
> 

Seems my hypothetical case is already in the works :(

> I don't see how we can provide correct information to user-space based
> on the current information in DT. I'm not convinced it was a good idea
> to expose this information to user-space to begin with but that is
> another discussion.
> 

Fair enough, it's a little late now to un-expose this info to userspace,
so we should at least present it correctly. My worry was this getting
out of hand with layering: for instance, what happens when we need to
add die nodes in between cluster and socket?

Andrew

> Morten
>
Morten Rasmussen May 30, 2019, 1:12 p.m. UTC | #4
On Thu, May 30, 2019 at 08:56:03AM -0400, Andrew F. Davis wrote:
> On 5/30/19 7:51 AM, Morten Rasmussen wrote:
> >On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
> >>On 5/29/19 5:13 PM, Atish Patra wrote:
> >>>From: Sudeep Holla <sudeep.holla@arm.com>
> >>>
> >>>The current ARM DT topology description provides the operating system
> >>>with a topological view of the system that is based on leaf nodes
> >>>representing either cores or threads (in an SMT system) and a
> >>>hierarchical set of cluster nodes that creates a hierarchical topology
> >>>view of how those cores and threads are grouped.
> >>>
> >>>However this hierarchical representation of clusters does not allow to
> >>>describe what topology level actually represents the physical package or
> >>>the socket boundary, which is a key piece of information to be used by
> >>>an operating system to optimize resource allocation and scheduling.
> >>>
> >>
> >>Are physical package descriptions really needed? What does "socket" imply
> >>that a higher layer "cluster" node grouping does not? It doesn't imply a
> >>different NUMA distance and the definition of "socket" is already not well
> >>defined, is a dual chiplet processor not just a fancy dual "socket" or are
> >>dual "sockets" on a server board "slotket" card, will we need new names for
> >>those too..
> >
> >Socket (or package) just implies what you suggest, a grouping of CPUs
> >based on the physical socket (or package). Some resources might be
> >associated with packages and more importantly socket information is
> >exposed to user-space. At the moment clusters are being exposed to
> >user-space as sockets which is less than ideal for some topologies.
> >
> 
> I see the benefit of reporting the physical layout and packaging
> information to user-space for tracking reasons, but from a software
> perspective this doesn't matter, and the resource partitioning should
> be described elsewhere (NUMA nodes being the go-to example).

That would make defining a NUMA node mandatory even for non-NUMA
systems?

> >At the moment user-space is only told about hw threads, cores, and
> >sockets. In the very near future it is going to be told about dies too
> >(look for Len Brown's multi-die patch set).
> >
> 
> Seems my hypothetical case is already in the works :(

Indeed. IIUC, the reasoning behind it is related to actual multi-die
x86 packages and some RAPL stuff being per-die or per-core.

> 
> >I don't see how we can provide correct information to user-space based
> >on the current information in DT. I'm not convinced it was a good idea
> >to expose this information to user-space to begin with but that is
> >another discussion.
> >
> 
> Fair enough, it's a little late now to un-expose this info to userspace,
> so we should at least present it correctly. My worry was this getting
> out of hand with layering: for instance, what happens when we need to
> add die nodes in between cluster and socket?

If we want the die mask to be correct for arm/arm64/riscv, we need die
information from somewhere. I'm not in favour of adding more topology
layers to the user-space-visible topology description, but others might
have a valid reason, and if it is exposed I would prefer that we try to
expose the right information.

Btw, for packages, we already have that information in ACPI/PPTT, so it
would be nice if we could have that for DT-based systems too.

Morten
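
For concreteness, if a die level were ever added, Andrew's scenario
would presumably nest between socket and cluster, along these lines
(purely hypothetical; no "die" node exists in this binding or in the
proposed patch):

cpu-map {
	socket0 {
		die0 {	/* hypothetical level, not defined by the binding */
			cluster0 {
				core0 { cpu = <&CPU0>; };
				core1 { cpu = <&CPU1>; };
			};
		};
	};
};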
Russell King (Oracle) May 30, 2019, 9:42 p.m. UTC | #5
On Thu, May 30, 2019 at 12:51:03PM +0100, Morten Rasmussen wrote:
> [...]
> 
> Socket (or package) just implies what you suggest, a grouping of CPUs
> based on the physical socket (or package). Some resources might be
> associated with packages, and, more importantly, socket information is
> exposed to user-space. At the moment clusters are being exposed to
> user-space as sockets, which is less than ideal for some topologies.

Please point out a 32-bit ARM system that has multiple "socket"s.

As far as I'm aware, no 32-bit system has socketed CPUs (modern ARM
CPUs are part of a larger SoC), and the CPUs are always in one package.

Even the test systems I've seen do not have socketed CPUs.
Sudeep Holla May 31, 2019, 9:37 a.m. UTC | #6
On Thu, May 30, 2019 at 10:42:54PM +0100, Russell King - ARM Linux admin wrote:
> On Thu, May 30, 2019 at 12:51:03PM +0100, Morten Rasmussen wrote:
> > [...]
> >
> > Socket (or package) just implies what you suggest, a grouping of CPUs
> > based on the physical socket (or package). Some resources might be
> > associated with packages, and, more importantly, socket information is
> > exposed to user-space. At the moment clusters are being exposed to
> > user-space as sockets, which is less than ideal for some topologies.
>
> Please point out a 32-bit ARM system that has multiple "socket"s.
>
> As far as I'm aware, no 32-bit system has socketed CPUs (modern ARM
> CPUs are part of a larger SoC), and the CPUs are always in one package.
>
> Even the test systems I've seen do not have socketed CPUs.
>

As far as we know, there's none. So we simply have to assume all
those systems are single-socket systems (IOW, all CPUs reside inside
a single SoC package).

--
Regards,
Sudeep
Sudeep Holla May 31, 2019, 9:41 a.m. UTC | #7
On Thu, May 30, 2019 at 08:56:03AM -0400, Andrew F. Davis wrote:
> On 5/30/19 7:51 AM, Morten Rasmussen wrote:
> > [...]
> > I don't see how we can provide correct information to user-space based
> > on the current information in DT. I'm not convinced it was a good idea
> > to expose this information to user-space to begin with but that is
> > another discussion.
> >
>
> Fair enough, it's a little late now to un-expose this info to userspace,
> so we should at least present it correctly. My worry was this getting
> out of hand with layering: for instance, what happens when we need to
> add die nodes in between cluster and socket?
>

We may have to, if there's a requirement on ARM64 similar to the one
addressed by Len Brown's multi-die patch set. But for now, no one has
asked for it.

--
Regards,
Sudeep
Morten Rasmussen May 31, 2019, 9:54 a.m. UTC | #8
On Fri, May 31, 2019 at 10:37:43AM +0100, Sudeep Holla wrote:
> On Thu, May 30, 2019 at 10:42:54PM +0100, Russell King - ARM Linux admin wrote:
> > On Thu, May 30, 2019 at 12:51:03PM +0100, Morten Rasmussen wrote:
> > > [...]
> > >
> > > Socket (or package) just implies what you suggest, a grouping of CPUs
> > > based on the physical socket (or package). Some resources might be
> > > associated with packages, and, more importantly, socket information is
> > > exposed to user-space. At the moment clusters are being exposed to
> > > user-space as sockets, which is less than ideal for some topologies.
> >
> > Please point out a 32-bit ARM system that has multiple "socket"s.
> >
> > As far as I'm aware, no 32-bit system has socketed CPUs (modern ARM
> > CPUs are part of a larger SoC), and the CPUs are always in one package.
> >
> > Even the test systems I've seen do not have socketed CPUs.
> >
> 
> As far as we know, there's none. So we simply have to assume all
> those systems are single-socket systems (IOW, all CPUs reside inside
> a single SoC package).

Right, but we don't make that assumption. Clusters are reported as
sockets/packages for arm, just like they are for arm64. My comment above
applied to what can be described using DT, not what systems actually
exist. We need to be able to describe packages for architectures where
we can't make assumptions.

arm example (ARM TC2):
root@morras01-tc2:~# lstopo
Machine (985MB)
  Package L#0
    Core L#0 + PU L#0 (P#0)
    Core L#1 + PU L#1 (P#1)
  Package L#1
    Core L#2 + PU L#2 (P#2)
    Core L#3 + PU L#3 (P#3)
    Core L#4 + PU L#4 (P#4)

Morten
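
For reference, under the proposed binding the TC2 topology above could
be described by wrapping its two clusters in a single socket node, so
that user-space would report one package instead of two (a sketch; the
A15_n and A7_n phandle names are assumptions, TC2 being a 2x Cortex-A15
plus 3x Cortex-A7 system):

cpu-map {
	socket0 {
		cluster0 {	/* Cortex-A15 cluster */
			core0 { cpu = <&A15_0>; };
			core1 { cpu = <&A15_1>; };
		};
		cluster1 {	/* Cortex-A7 cluster */
			core0 { cpu = <&A7_0>; };
			core1 { cpu = <&A7_1>; };
			core2 { cpu = <&A7_2>; };
		};
	};
};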

Patch

diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/arm/topology.txt
index b0d80c0fb265..3b8febb46dad 100644
--- a/Documentation/devicetree/bindings/arm/topology.txt
+++ b/Documentation/devicetree/bindings/arm/topology.txt
@@ -9,6 +9,7 @@  ARM topology binding description
-In an ARM system, the hierarchy of CPUs is defined through three entities that
+In an ARM system, the hierarchy of CPUs is defined through four entities that
 are used to describe the layout of physical CPUs in the system:
 
+- socket
 - cluster
 - core
 - thread
@@ -63,21 +64,23 @@  nodes are listed.
 
 	The cpu-map node's child nodes can be:
 
-	- one or more cluster nodes
+	- one or more cluster nodes or
+	- one or more socket nodes in a multi-socket system
 
 	Any other configuration is considered invalid.
 
-The cpu-map node can only contain three types of child nodes:
+The cpu-map node can only contain four types of child nodes:
 
+- socket node
 - cluster node
 - core node
 - thread node
 
 whose bindings are described in paragraph 3.
 
-The nodes describing the CPU topology (cluster/core/thread) can only
-be defined within the cpu-map node and every core/thread in the system
-must be defined within the topology.  Any other configuration is
+The nodes describing the CPU topology (socket/cluster/core/thread) can
+only be defined within the cpu-map node and every core/thread in the
+system must be defined within the topology.  Any other configuration is
 invalid and therefore must be ignored.
 
 ===========================================
@@ -85,26 +88,44 @@  invalid and therefore must be ignored.
 ===========================================
 
 cpu-map child nodes must follow a naming convention where the node name
-must be "clusterN", "coreN", "threadN" depending on the node type (ie
-cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes which
-are siblings within a single common parent node must be given a unique and
+must be "socketN", "clusterN", "coreN", "threadN" depending on the node type
+(ie socket/cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes
+which are siblings within a single common parent node must be given a unique and
 sequential N value, starting from 0).
 cpu-map child nodes which do not share a common parent node can have the same
 name (ie same number N as other cpu-map child nodes at different device tree
 levels) since name uniqueness will be guaranteed by the device tree hierarchy.
 
 ===========================================
-3 - cluster/core/thread node bindings
+3 - socket/cluster/core/thread node bindings
 ===========================================
 
-Bindings for cluster/cpu/thread nodes are defined as follows:
+Bindings for socket/cluster/core/thread nodes are defined as follows:
+
+- socket node
+
+	 Description: must be declared within a cpu-map node, one node
+		      per physical socket in the system. A system can
+		      contain a single physical socket or multiple sockets.
+		      The association of sockets and NUMA nodes is beyond
+		      the scope of these bindings; please refer to [2] for
+		      the NUMA bindings.
+
+	This node is optional for a single-socket system.
+
+	The socket node name must be "socketN" as described in 2.1 above.
+	A socket node can not be a leaf node.
+
+	A socket node's child nodes must be one or more cluster nodes.
+
+	Any other configuration is considered invalid.
 
 - cluster node
 
 	 Description: must be declared within a cpu-map node, one node
 		      per cluster. A system can contain several layers of
-		      clustering and cluster nodes can be contained in parent
-		      cluster nodes.
+		      clustering within a single physical socket and cluster
+		      nodes can be contained in parent cluster nodes.
 
 	The cluster node name must be "clusterN" as described in 2.1 above.
 	A cluster node can not be a leaf node.
@@ -164,13 +185,15 @@  Bindings for cluster/cpu/thread nodes are defined as follows:
 4 - Example dts
 ===========================================
 
-Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters):
+Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters in a single
+physical socket):
 
 cpus {
 	#size-cells = <0>;
 	#address-cells = <2>;
 
 	cpu-map {
+		socket0 {
 			cluster0 {
 				cluster0 {
 					core0 {
@@ -253,6 +276,7 @@  cpus {
 				};
 			};
 		};
+	};
 
 	CPU0: cpu@0 {
 		device_type = "cpu";
@@ -473,3 +497,5 @@  cpus {
 ===============================================================================
 [1] ARM Linux kernel documentation
     Documentation/devicetree/bindings/arm/cpus.yaml
+[2] Devicetree NUMA binding description
+    Documentation/devicetree/bindings/numa.txt