diff mbox series

[v3,03/17] xen/arm: implement node distance helpers for Arm

Message ID 20230420112521.3272732-4-Henry.Wang@arm.com (mailing list archive)
State Superseded
Headers show
Series Device tree based NUMA support for Arm - Part#3 | expand

Commit Message

Henry Wang April 20, 2023, 11:25 a.m. UTC
From: Wei Chen <wei.chen@arm.com>

We will parse NUMA nodes distances from device tree. So we need
a matrix to record the distances between any two nodes we parsed.
Accordingly, we provide this node_set_distance API for device tree
NUMA to set the distance for any two nodes in this patch. When
NUMA initialization failed, __node_distance will return
NUMA_REMOTE_DISTANCE, this will help us avoid doing rollback
for distance maxtrix when NUMA initialization failed.

As both x86 and Arm have implemented __node_distance, so we move
its definition from asm/numa.h to xen/numa.h. At same time, the
outdated u8 return value of x86 has been changed to unsigned char.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
---
v2 -> v3:
1. Use __ro_after_init for node_distance_map.
2. Correct format of if condition identation in numa_set_distance().
3. Drop the unnecessary change to the year of copyright.
4. Use ARRAY_SIZE() to determine node_distance_map's row, column size.
v1 -> v2:
1. Use unsigned int/char instead of uint32_t/u8.
2. Re-org the commit message.
---
 xen/arch/arm/Makefile           |  1 +
 xen/arch/arm/include/asm/numa.h | 13 +++++++++
 xen/arch/arm/numa.c             | 52 +++++++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/numa.h |  1 -
 xen/arch/x86/srat.c             |  2 +-
 xen/include/xen/numa.h          |  1 +
 6 files changed, 68 insertions(+), 2 deletions(-)

Comments

Jan Beulich April 20, 2023, 12:38 p.m. UTC | #1
On 20.04.2023 13:25, Henry Wang wrote:
> From: Wei Chen <wei.chen@arm.com>
> 
> We will parse NUMA nodes distances from device tree. So we need
> a matrix to record the distances between any two nodes we parsed.
> Accordingly, we provide this node_set_distance API for device tree
> NUMA to set the distance for any two nodes in this patch. When
> NUMA initialization failed, __node_distance will return
> NUMA_REMOTE_DISTANCE, this will help us avoid doing rollback
> for distance maxtrix when NUMA initialization failed.
> 
> As both x86 and Arm have implemented __node_distance, so we move
> its definition from asm/numa.h to xen/numa.h.

Nit: You mean "declaration", not "definition".

> At same time, the
> outdated u8 return value of x86 has been changed to unsigned char.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> Signed-off-by: Henry Wang <Henry.Wang@arm.com>

non-Arm parts; to more it's not applicable anyway:
Acked-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -28,6 +28,11 @@ enum dt_numa_status {
>  
>  static enum dt_numa_status __ro_after_init device_tree_numa = DT_NUMA_DEFAULT;
>  
> +static unsigned char __ro_after_init
> +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> +    { 0 }
> +};

Nit: There's no (incomplete or complete) initializer needed here, if
all you're after is having all slots set to zero.

However, looking at the code below, don't you mean to have the array
pre-set to all NUMA_NO_DISTANCE?

> @@ -48,3 +53,50 @@ int __init arch_numa_setup(const char *opt)
>  {
>      return -EINVAL;
>  }
> +
> +void __init numa_set_distance(nodeid_t from, nodeid_t to,
> +                              unsigned int distance)
> +{
> +    if ( from >= ARRAY_SIZE(node_distance_map) ||
> +         to >= ARRAY_SIZE(node_distance_map[0]) )
> +    {
> +        printk(KERN_WARNING
> +               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8" MAX=%"PRIu8"\n",
> +               from, to, MAX_NUMNODES);
> +        return;
> +    }
> +
> +    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined */
> +    if ( distance >= NUMA_NO_DISTANCE ||
> +         (distance >= NUMA_DISTANCE_UDF_MIN &&

Compilers may warn about comparison of "unsigned int" to be >= 0. I'm
not sure about the usefulness of the NUMA_DISTANCE_UDF_MIN define in
the first place, so maybe best drop it and its use here?

> +unsigned char __node_distance(nodeid_t from, nodeid_t to)
> +{
> +    /* When NUMA is off, any distance will be treated as remote. */
> +    if ( numa_disabled() )
> +        return NUMA_REMOTE_DISTANCE;

Wouldn't it make sense to have the "from == to" special case ahead of
this (rather than further down), thus yielding a sensible result for
from == to == 0? And else return NUMA_NO_DISTANCE, thus having a
sensible result also for any from/to != 0?

> +    /*
> +     * Check whether the nodes are in the matrix range.
> +     * When any node is out of range, except from and to nodes are the
> +     * same, we treat them as unreachable (return 0xFF)
> +     */
> +    if ( from >= ARRAY_SIZE(node_distance_map) ||
> +         to >= ARRAY_SIZE(node_distance_map[0]) )
> +        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
> +
> +    return node_distance_map[from][to];
> +}
> +
> +EXPORT_SYMBOL(__node_distance);

What is this needed for?

Jan
Henry Wang April 21, 2023, 9:23 a.m. UTC | #2
Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Subject: Re: [PATCH v3 03/17] xen/arm: implement node distance helpers for
> Arm
> > As both x86 and Arm have implemented __node_distance, so we move
> > its definition from asm/numa.h to xen/numa.h.
> 
> Nit: You mean "declaration", not "definition".

Correct, I overlooked this miswording in commit message while going
through your comments in v2. will correct in v4.

> 
> > At same time, the
> > outdated u8 return value of x86 has been changed to unsigned char.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > Signed-off-by: Henry Wang <Henry.Wang@arm.com>
> 
> non-Arm parts; to more it's not applicable anyway:
> Acked-by: Jan Beulich <jbeulich@suse.com>

I will add # non-Arm parts in the end of the tag and take the tag.

> 
> > --- a/xen/arch/arm/numa.c
> > +++ b/xen/arch/arm/numa.c
> > @@ -28,6 +28,11 @@ enum dt_numa_status {
> >
> >  static enum dt_numa_status __ro_after_init device_tree_numa =
> DT_NUMA_DEFAULT;
> >
> > +static unsigned char __ro_after_init
> > +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> > +    { 0 }
> > +};
> 
> Nit: There's no (incomplete or complete) initializer needed here, if
> all you're after is having all slots set to zero.

Yeah, I agree with you, so I will drop the initializer in v4, however...

> 
> However, looking at the code below, don't you mean to have the array
> pre-set to all NUMA_NO_DISTANCE?

...I am a bit puzzled about why pre-setting the array to all
NUMA_NO_DISTANCE matters here, as I think the node distance map will
be populated when parsing the device tree anyway no matter what their
initial values.

> 
> > +    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined */
> > +    if ( distance >= NUMA_NO_DISTANCE ||
> > +         (distance >= NUMA_DISTANCE_UDF_MIN &&
> 
> Compilers may warn about comparison of "unsigned int" to be >= 0. I'm
> not sure about the usefulness of the NUMA_DISTANCE_UDF_MIN define in
> the first place, so maybe best drop it and its use here?

Yeah, will do that in v4.

> 
> > +unsigned char __node_distance(nodeid_t from, nodeid_t to)
> > +{
> > +    /* When NUMA is off, any distance will be treated as remote. */
> > +    if ( numa_disabled() )
> > +        return NUMA_REMOTE_DISTANCE;
> 
> Wouldn't it make sense to have the "from == to" special case ahead of
> this (rather than further down), thus yielding a sensible result for
> from == to == 0? And else return NUMA_NO_DISTANCE, thus having a
> sensible result also for any from/to != 0?

Could you please elaborate a bit more about why 0 matters here? As from
my understanding,
(1) If from == to, then we set the distance to NUMA_LOCAL_DISTANCE
which represents the diagonal of the matrix.
(2) If from and to is in the matrix range, then we return
node_distance_map[from][to].
(3) Other cases we return NUMA_NO_DISTANCE.

So this diff is enough:
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
@@ -98,6 +98,9 @@ void numa_detect_cpu_node(unsigned int cpu)

 unsigned char __node_distance(nodeid_t from, nodeid_t to)
 {
+    if ( from == to )
+        return NUMA_LOCAL_DISTANCE;
+
     /* When NUMA is off, any distance will be treated as remote. */
     if ( numa_disabled() )
         return NUMA_REMOTE_DISTANCE;
@@ -109,7 +112,7 @@ unsigned char __node_distance(nodeid_t from, nodeid_t to)
      */
     if ( from >= ARRAY_SIZE(node_distance_map) ||
          to >= ARRAY_SIZE(node_distance_map[0]) )
-        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
+        return NUMA_NO_DISTANCE;

     return node_distance_map[from][to];
 }

Would you mind pointing out why my understanding is wrong? Thanks!

> 
> > +    /*
> > +     * Check whether the nodes are in the matrix range.
> > +     * When any node is out of range, except from and to nodes are the
> > +     * same, we treat them as unreachable (return 0xFF)
> > +     */
> > +    if ( from >= ARRAY_SIZE(node_distance_map) ||
> > +         to >= ARRAY_SIZE(node_distance_map[0]) )
> > +        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
> > +
> > +    return node_distance_map[from][to];
> > +}
> > +
> > +EXPORT_SYMBOL(__node_distance);
> 
> What is this needed for?

Will drop it.

Kind regards,
Henry

> 
> Jan
Jan Beulich April 21, 2023, 9:40 a.m. UTC | #3
On 21.04.2023 11:23, Henry Wang wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>>
>>> --- a/xen/arch/arm/numa.c
>>> +++ b/xen/arch/arm/numa.c
>>> @@ -28,6 +28,11 @@ enum dt_numa_status {
>>>
>>>  static enum dt_numa_status __ro_after_init device_tree_numa =
>> DT_NUMA_DEFAULT;
>>>
>>> +static unsigned char __ro_after_init
>>> +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
>>> +    { 0 }
>>> +};
>>
>> Nit: There's no (incomplete or complete) initializer needed here, if
>> all you're after is having all slots set to zero.
> 
> Yeah, I agree with you, so I will drop the initializer in v4, however...
> 
>>
>> However, looking at the code below, don't you mean to have the array
>> pre-set to all NUMA_NO_DISTANCE?
> 
> ...I am a bit puzzled about why pre-setting the array to all
> NUMA_NO_DISTANCE matters here, as I think the node distance map will
> be populated when parsing the device tree anyway no matter what their
> initial values.

From this patch alone it doesn't become clear whether indeed all array
slots (and not just ones for valid nodes) would be populated. I think
the code in the patch here would better not make itself dependent on
behavior of code added subsequently (which may change; recall that a
series may be committed in pieces).

>>> +unsigned char __node_distance(nodeid_t from, nodeid_t to)
>>> +{
>>> +    /* When NUMA is off, any distance will be treated as remote. */
>>> +    if ( numa_disabled() )
>>> +        return NUMA_REMOTE_DISTANCE;
>>
>> Wouldn't it make sense to have the "from == to" special case ahead of
>> this (rather than further down), thus yielding a sensible result for
>> from == to == 0? And else return NUMA_NO_DISTANCE, thus having a
>> sensible result also for any from/to != 0?
> 
> Could you please elaborate a bit more about why 0 matters here?

When NUMA is off, there's only one node - node 0. Hence 0 has special
meaning in that case.

> As from my understanding,
> (1) If from == to, then we set the distance to NUMA_LOCAL_DISTANCE
> which represents the diagonal of the matrix.
> (2) If from and to is in the matrix range, then we return
> node_distance_map[from][to].

Provided that's set correctly. IOW this interacts with the other comment
(which really I made only after the one here, just that that's of course
not visible from the reply that I sent).

> (3) Other cases we return NUMA_NO_DISTANCE.

And when NUMA is off, it should be NUMA_NO_DISTANCE in _all_ other cases,
i.e. ...

> So this diff is enough:
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> @@ -98,6 +98,9 @@ void numa_detect_cpu_node(unsigned int cpu)
> 
>  unsigned char __node_distance(nodeid_t from, nodeid_t to)
>  {
> +    if ( from == to )
> +        return NUMA_LOCAL_DISTANCE;
> +
>      /* When NUMA is off, any distance will be treated as remote. */
>      if ( numa_disabled() )
>          return NUMA_REMOTE_DISTANCE;

... this return is wrong in that case (even if in reality this likely
wouldn't matter much).

Jan

> @@ -109,7 +112,7 @@ unsigned char __node_distance(nodeid_t from, nodeid_t to)
>       */
>      if ( from >= ARRAY_SIZE(node_distance_map) ||
>           to >= ARRAY_SIZE(node_distance_map[0]) )
> -        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
> +        return NUMA_NO_DISTANCE;
> 
>      return node_distance_map[from][to];
>  }
> 
> Would you mind pointing out why my understanding is wrong? Thanks!
> 
> Kind regards,
> Henry
Henry Wang April 23, 2023, 5:36 a.m. UTC | #4
Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Subject: Re: [PATCH v3 03/17] xen/arm: implement node distance helpers for
> Arm
> >> However, looking at the code below, don't you mean to have the array
> >> pre-set to all NUMA_NO_DISTANCE?
> >
> > ...I am a bit puzzled about why pre-setting the array to all
> > NUMA_NO_DISTANCE matters here, as I think the node distance map will
> > be populated when parsing the device tree anyway no matter what their
> > initial values.
> 
> From this patch alone it doesn't become clear whether indeed all array
> slots (and not just ones for valid nodes) would be populated. I think
> the code in the patch here would better not make itself dependent on
> behavior of code added subsequently (which may change; recall that a
> series may be committed in pieces).

Correct, I agree. I added a numa_init_distance() function (in patch #12) to
set all values to NUMA_NO_DISTANCE. The numa_init_distance() will be
called in the beginning of numa_init().

> 
> >>> +unsigned char __node_distance(nodeid_t from, nodeid_t to)
> >>> +{
> >>> +    /* When NUMA is off, any distance will be treated as remote. */
> >>> +    if ( numa_disabled() )
> >>> +        return NUMA_REMOTE_DISTANCE;
> >>
> >> Wouldn't it make sense to have the "from == to" special case ahead of
> >> this (rather than further down), thus yielding a sensible result for
> >> from == to == 0? And else return NUMA_NO_DISTANCE, thus having a
> >> sensible result also for any from/to != 0?
> >
> > Could you please elaborate a bit more about why 0 matters here?
> 
> When NUMA is off, there's only one node - node 0. Hence 0 has special
> meaning in that case.
> 
> > As from my understanding,
> > (1) If from == to, then we set the distance to NUMA_LOCAL_DISTANCE
> > which represents the diagonal of the matrix.
> > (2) If from and to is in the matrix range, then we return
> > node_distance_map[from][to].
> 
> Provided that's set correctly. IOW this interacts with the other comment
> (which really I made only after the one here, just that that's of course
> not visible from the reply that I sent).
> 
> > (3) Other cases we return NUMA_NO_DISTANCE.
> 
> And when NUMA is off, it should be NUMA_NO_DISTANCE in _all_ other
> cases,
> i.e. ...
> 
> >      /* When NUMA is off, any distance will be treated as remote. */
> >      if ( numa_disabled() )
> >          return NUMA_REMOTE_DISTANCE;
> 
> ... this return is wrong in that case (even if in reality this likely
> wouldn't matter much).

Thanks for the explanation! I think I now understand :) Would this diff below
look good to you then? Appreciate your patience.

unsigned char __node_distance(nodeid_t from, nodeid_t to)
 {
-    /* When NUMA is off, any distance will be treated as remote. */
+    if ( from == to )
+        return NUMA_LOCAL_DISTANCE;
+
+    /* When NUMA is off, any distance will be treated as unreachable (0xFF). */
     if ( numa_disabled() )
-        return NUMA_REMOTE_DISTANCE;
+        return NUMA_NO_DISTANCE;

     /*
      * Check whether the nodes are in the matrix range.
      * When any node is out of range, except from and to nodes are the
-     * same, we treat them as unreachable (return 0xFF)
+     * same, we treat them as unreachable (0xFF)
      */
     if ( from >= ARRAY_SIZE(node_distance_map) ||
          to >= ARRAY_SIZE(node_distance_map[0]) )
-        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
+        return NUMA_NO_DISTANCE;

     return node_distance_map[from][to];
 }

Kind regards,
Henry

> 
> Jan
>
Jan Beulich April 24, 2023, 7:41 a.m. UTC | #5
On 23.04.2023 07:36, Henry Wang wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>>>> However, looking at the code below, don't you mean to have the array
>>>> pre-set to all NUMA_NO_DISTANCE?
>>>
>>> ...I am a bit puzzled about why pre-setting the array to all
>>> NUMA_NO_DISTANCE matters here, as I think the node distance map will
>>> be populated when parsing the device tree anyway no matter what their
>>> initial values.
>>
>> From this patch alone it doesn't become clear whether indeed all array
>> slots (and not just ones for valid nodes) would be populated. I think
>> the code in the patch here would better not make itself dependent on
>> behavior of code added subsequently (which may change; recall that a
>> series may be committed in pieces).
> 
> Correct, I agree. I added a numa_init_distance() function (in patch #12) to
> set all values to NUMA_NO_DISTANCE. The numa_init_distance() will be
> called in the beginning of numa_init().

Why do you need a function for this? As said, this array can be pre-set at
compile time (unless I'm overlooking something).

>>>>> +unsigned char __node_distance(nodeid_t from, nodeid_t to)
>>>>> +{
>>>>> +    /* When NUMA is off, any distance will be treated as remote. */
>>>>> +    if ( numa_disabled() )
>>>>> +        return NUMA_REMOTE_DISTANCE;
>>>>
>>>> Wouldn't it make sense to have the "from == to" special case ahead of
>>>> this (rather than further down), thus yielding a sensible result for
>>>> from == to == 0? And else return NUMA_NO_DISTANCE, thus having a
>>>> sensible result also for any from/to != 0?
>>>
>>> Could you please elaborate a bit more about why 0 matters here?
>>
>> When NUMA is off, there's only one node - node 0. Hence 0 has special
>> meaning in that case.
>>
>>> As from my understanding,
>>> (1) If from == to, then we set the distance to NUMA_LOCAL_DISTANCE
>>> which represents the diagonal of the matrix.
>>> (2) If from and to is in the matrix range, then we return
>>> node_distance_map[from][to].
>>
>> Provided that's set correctly. IOW this interacts with the other comment
>> (which really I made only after the one here, just that that's of course
>> not visible from the reply that I sent).
>>
>>> (3) Other cases we return NUMA_NO_DISTANCE.
>>
>> And when NUMA is off, it should be NUMA_NO_DISTANCE in _all_ other
>> cases,
>> i.e. ...
>>
>>>      /* When NUMA is off, any distance will be treated as remote. */
>>>      if ( numa_disabled() )
>>>          return NUMA_REMOTE_DISTANCE;
>>
>> ... this return is wrong in that case (even if in reality this likely
>> wouldn't matter much).
> 
> Thanks for the explanation! I think I now understand :) Would this diff below
> look good to you then? Appreciate your patience.

Looks largely okay, but possibly one part can now be omitted (see below).

> unsigned char __node_distance(nodeid_t from, nodeid_t to)
>  {
> -    /* When NUMA is off, any distance will be treated as remote. */
> +    if ( from == to )
> +        return NUMA_LOCAL_DISTANCE;
> +
> +    /* When NUMA is off, any distance will be treated as unreachable (0xFF). */

Please avoid mentioning the actual value of 0xFF: This serves no real
purpose (afaict) and is liable to go stale at some point.

>      if ( numa_disabled() )
> -        return NUMA_REMOTE_DISTANCE;
> +        return NUMA_NO_DISTANCE;

With the code below this is now only an optimization. Might be worth
saying so in the comment (assuming having this optimization is deemed
worth it).

Jan

>      /*
>       * Check whether the nodes are in the matrix range.
>       * When any node is out of range, except from and to nodes are the
> -     * same, we treat them as unreachable (return 0xFF)
> +     * same, we treat them as unreachable (0xFF)
>       */
>      if ( from >= ARRAY_SIZE(node_distance_map) ||
>           to >= ARRAY_SIZE(node_distance_map[0]) )
> -        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
> +        return NUMA_NO_DISTANCE;
> 
>      return node_distance_map[from][to];
>  }
> 
> Kind regards,
> Henry
> 
>>
>> Jan
>>
Henry Wang April 24, 2023, 7:50 a.m. UTC | #6
Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Subject: Re: [PATCH v3 03/17] xen/arm: implement node distance helpers for
> Arm
> 
> > Thanks for the explanation! I think I now understand :) Would this diff
> below
> > look good to you then? Appreciate your patience.
> 
> Looks largely okay, but possibly one part can now be omitted (see below).
> 
> > unsigned char __node_distance(nodeid_t from, nodeid_t to)
> >  {
> > -    /* When NUMA is off, any distance will be treated as remote. */
> > +    if ( from == to )
> > +        return NUMA_LOCAL_DISTANCE;
> > +
> > +    /* When NUMA is off, any distance will be treated as unreachable (0xFF).
> */
> 
> Please avoid mentioning the actual value of 0xFF: This serves no real
> purpose (afaict) and is liable to go stale at some point.

Good point, I will drop the 0xFF.

> 
> >      if ( numa_disabled() )
> > -        return NUMA_REMOTE_DISTANCE;
> > +        return NUMA_NO_DISTANCE;
> 
> With the code below this is now only an optimization. Might be worth
> saying so in the comment (assuming having this optimization is deemed
> worth it).

Sounds good, if you think below comment makes sense to you, I will add:
"When NUMA is disabled, the node distance should always be
NUMA_NO_DISTANCE, directly return here as an optimization."

Kind regards,
Henry

> 
> Jan
Henry Wang April 24, 2023, 11:02 a.m. UTC | #7
Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Subject: Re: [PATCH v3 03/17] xen/arm: implement node distance helpers for
> Arm
> 
> On 23.04.2023 07:36, Henry Wang wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >>>> However, looking at the code below, don't you mean to have the array
> >>>> pre-set to all NUMA_NO_DISTANCE?
> >>>
> >>> ...I am a bit puzzled about why pre-setting the array to all
> >>> NUMA_NO_DISTANCE matters here, as I think the node distance map will
> >>> be populated when parsing the device tree anyway no matter what their
> >>> initial values.
> >>
> >> From this patch alone it doesn't become clear whether indeed all array
> >> slots (and not just ones for valid nodes) would be populated. I think
> >> the code in the patch here would better not make itself dependent on
> >> behavior of code added subsequently (which may change; recall that a
> >> series may be committed in pieces).
> >
> > Correct, I agree. I added a numa_init_distance() function (in patch #12) to
> > set all values to NUMA_NO_DISTANCE. The numa_init_distance() will be
> > called in the beginning of numa_init().
> 
> Why do you need a function for this? As said, this array can be pre-set at
> compile time (unless I'm overlooking something).

Sorry I overlooked this comment, correct me if I am wrong, but IIUC we
can only pre-set the 2D array to 0 at the compile time. Could you please
elaborate a bit more about the code in your mind? Thanks!

Kind regards,
Henry
Jan Beulich April 24, 2023, 11:18 a.m. UTC | #8
On 24.04.2023 13:02, Henry Wang wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>>
>> On 23.04.2023 07:36, Henry Wang wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> However, looking at the code below, don't you mean to have the array
>>>>>> pre-set to all NUMA_NO_DISTANCE?
>>>>>
>>>>> ...I am a bit puzzled about why pre-setting the array to all
>>>>> NUMA_NO_DISTANCE matters here, as I think the node distance map will
>>>>> be populated when parsing the device tree anyway no matter what their
>>>>> initial values.
>>>>
>>>> From this patch alone it doesn't become clear whether indeed all array
>>>> slots (and not just ones for valid nodes) would be populated. I think
>>>> the code in the patch here would better not make itself dependent on
>>>> behavior of code added subsequently (which may change; recall that a
>>>> series may be committed in pieces).
>>>
>>> Correct, I agree. I added a numa_init_distance() function (in patch #12) to
>>> set all values to NUMA_NO_DISTANCE. The numa_init_distance() will be
>>> called in the beginning of numa_init().
>>
>> Why do you need a function for this? As said, this array can be pre-set at
>> compile time (unless I'm overlooking something).
> 
> Sorry I overlooked this comment, correct me if I am wrong, but IIUC we
> can only pre-set the 2D array to 0 at the compile time. Could you please
> elaborate a bit more about the code in your mind? Thanks!

static unsigned char __ro_after_init
node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
    [0 ... MAX_NUMNODES - 1] = { [0 ... MAX_NUMNODES - 1] = NUMA_NO_DISTANCE }
};

or even (limiting redundancy a little)

static unsigned char __ro_after_init
node_distance_map[][MAX_NUMNODES] = {
    [0 ... MAX_NUMNODES - 1] = { [0 ... MAX_NUMNODES - 1] = NUMA_NO_DISTANCE }
};

Jan
Henry Wang April 24, 2023, 11:27 a.m. UTC | #9
Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Subject: Re: [PATCH v3 03/17] xen/arm: implement node distance helpers for
> Arm
> >>> Correct, I agree. I added a numa_init_distance() function (in patch #12) to
> >>> set all values to NUMA_NO_DISTANCE. The numa_init_distance() will be
> >>> called in the beginning of numa_init().
> >>
> >> Why do you need a function for this? As said, this array can be pre-set at
> >> compile time (unless I'm overlooking something).
> >
> > Sorry I overlooked this comment, correct me if I am wrong, but IIUC we
> > can only pre-set the 2D array to 0 at the compile time. Could you please
> > elaborate a bit more about the code in your mind? Thanks!
> 
> static unsigned char __ro_after_init
> node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
>     [0 ... MAX_NUMNODES - 1] = { [0 ... MAX_NUMNODES - 1] =
> NUMA_NO_DISTANCE }
> };
> 
> or even (limiting redundancy a little)
> 
> static unsigned char __ro_after_init
> node_distance_map[][MAX_NUMNODES] = {
>     [0 ... MAX_NUMNODES - 1] = { [0 ... MAX_NUMNODES - 1] =
> NUMA_NO_DISTANCE }
> };

Yeah you are correct. I made a mistake that missing the
"[0 ... MAX_NUMNODES - 1]" in the left side of "=" hence my compiler
complained... Thanks for your patience.

Kind regards,
Henry

> 
> Jan
diff mbox series

Patch

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 4d076b278b..9073398d6e 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -38,6 +38,7 @@  obj-$(CONFIG_LIVEPATCH) += livepatch.o
 obj-y += mem_access.o
 obj-y += mm.o
 obj-y += monitor.o
+obj-$(CONFIG_NUMA) += numa.o
 obj-y += p2m.o
 obj-y += percpu.o
 obj-y += platform.o
diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 83f60ad05b..123a1a8dd0 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -22,7 +22,20 @@  typedef u8 nodeid_t;
  */
 #define NR_NODE_MEMBLKS NR_MEM_BANKS
 
+/*
+ * In ACPI spec, 0-9 are the reserved values for node distance,
+ * 10 indicates local node distance, 20 indicates remote node
+ * distance. Set node distance map in device tree will follow
+ * the ACPI's definition.
+ */
+#define NUMA_DISTANCE_UDF_MIN   0
+#define NUMA_DISTANCE_UDF_MAX   9
+#define NUMA_LOCAL_DISTANCE     10
+#define NUMA_REMOTE_DISTANCE    20
+
 extern bool numa_disabled(void);
+extern void numa_set_distance(nodeid_t from, nodeid_t to,
+                              unsigned int distance);
 
 #else
 
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index eb5d0632cb..c5ef81aad3 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -28,6 +28,11 @@  enum dt_numa_status {
 
 static enum dt_numa_status __ro_after_init device_tree_numa = DT_NUMA_DEFAULT;
 
+static unsigned char __ro_after_init
+node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
+    { 0 }
+};
+
 void __init numa_fw_bad(void)
 {
     printk(KERN_ERR "NUMA: device tree numa info table not used.\n");
@@ -48,3 +53,50 @@  int __init arch_numa_setup(const char *opt)
 {
     return -EINVAL;
 }
+
+void __init numa_set_distance(nodeid_t from, nodeid_t to,
+                              unsigned int distance)
+{
+    if ( from >= ARRAY_SIZE(node_distance_map) ||
+         to >= ARRAY_SIZE(node_distance_map[0]) )
+    {
+        printk(KERN_WARNING
+               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8" MAX=%"PRIu8"\n",
+               from, to, MAX_NUMNODES);
+        return;
+    }
+
+    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined */
+    if ( distance >= NUMA_NO_DISTANCE ||
+         (distance >= NUMA_DISTANCE_UDF_MIN &&
+          distance <= NUMA_DISTANCE_UDF_MAX) ||
+         (from == to && distance != NUMA_LOCAL_DISTANCE) )
+    {
+        printk(KERN_WARNING
+               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8" distance=%"PRIu32"\n",
+               from, to, distance);
+        return;
+    }
+
+    node_distance_map[from][to] = distance;
+}
+
+unsigned char __node_distance(nodeid_t from, nodeid_t to)
+{
+    /* When NUMA is off, any distance will be treated as remote. */
+    if ( numa_disabled() )
+        return NUMA_REMOTE_DISTANCE;
+
+    /*
+     * Check whether the nodes are in the matrix range.
+     * When any node is out of range, except from and to nodes are the
+     * same, we treat them as unreachable (return 0xFF)
+     */
+    if ( from >= ARRAY_SIZE(node_distance_map) ||
+         to >= ARRAY_SIZE(node_distance_map[0]) )
+        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
+
+    return node_distance_map[from][to];
+}
+
+EXPORT_SYMBOL(__node_distance);
diff --git a/xen/arch/x86/include/asm/numa.h b/xen/arch/x86/include/asm/numa.h
index 7866afa408..45456ac441 100644
--- a/xen/arch/x86/include/asm/numa.h
+++ b/xen/arch/x86/include/asm/numa.h
@@ -22,7 +22,6 @@  extern void init_cpu_to_node(void);
 #define arch_want_default_dmazone() (num_online_nodes() > 1)
 
 void srat_parse_regions(paddr_t addr);
-extern u8 __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
 
 #endif
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 56749ddca5..50faf5d352 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -328,7 +328,7 @@  unsigned int numa_node_to_arch_nid(nodeid_t n)
 	return 0;
 }
 
-u8 __node_distance(nodeid_t a, nodeid_t b)
+unsigned char __node_distance(nodeid_t a, nodeid_t b)
 {
 	unsigned index;
 	u8 slit_val;
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index b86d0851fc..8356e47b61 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -114,6 +114,7 @@  extern bool numa_memblks_available(void);
 extern bool numa_update_node_memblks(nodeid_t node, unsigned int arch_nid,
                                      paddr_t start, paddr_t size, bool hotplug);
 extern void numa_set_processor_nodes_parsed(nodeid_t node);
+extern unsigned char __node_distance(nodeid_t a, nodeid_t b);
 
 #else