arm64/numa: Add more vetting in numa_set_distance()
diff mbox series

Message ID 1540562267-101152-1-git-send-email-john.garry@huawei.com
State New
Headers show
Series
  • arm64/numa: Add more vetting in numa_set_distance()
Related show

Commit Message

John Garry Oct. 26, 2018, 1:57 p.m. UTC
Currently it is acceptable to set the distance between 2 separate nodes to
LOCAL_DISTANCE.

Reject this as it is invalid.

This change avoids a crash reported in [1].

[1] https://www.spinics.net/lists/arm-kernel/msg683304.html

Signed-off-by: John Garry <john.garry@huawei.com>

Comments

Will Deacon Oct. 29, 2018, 11:25 a.m. UTC | #1
Hi John,

On Fri, Oct 26, 2018 at 09:57:47PM +0800, John Garry wrote:
> Currently it is acceptable to set the distance between 2 separate nodes to
> LOCAL_DISTANCE.
> 
> Reject this as it is invalid.
> 
> This change avoids a crash reported in [1].
> 
> [1] https://www.spinics.net/lists/arm-kernel/msg683304.html
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> 
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> index 146c04c..6092e3d 100644
> --- a/arch/arm64/mm/numa.c
> +++ b/arch/arm64/mm/numa.c
> @@ -335,7 +335,8 @@ void __init numa_set_distance(int from, int to, int distance)
>  	}
>  
>  	if ((u8)distance != distance ||
> -	    (from == to && distance != LOCAL_DISTANCE)) {
> +	    (from == to && distance != LOCAL_DISTANCE) ||
> +	    (from != to && distance == LOCAL_DISTANCE)) {

The current code here is more-or-less lifted from the x86 implementation
of numa_set_distance(). I think we should either factor out the sanity check
into a core helper or make the core code robust to these funny configurations.

Will
John Garry Oct. 29, 2018, 12:14 p.m. UTC | #2
On 29/10/2018 11:25, Will Deacon wrote:
> Hi John,
>

Hi Will,

> On Fri, Oct 26, 2018 at 09:57:47PM +0800, John Garry wrote:
>> Currently it is acceptable to set the distance between 2 separate nodes to
>> LOCAL_DISTANCE.
>>
>> Reject this as it is invalid.
>>
>> This change avoids a crash reported in [1].
>>
>> [1] https://www.spinics.net/lists/arm-kernel/msg683304.html
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>>
>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>> index 146c04c..6092e3d 100644
>> --- a/arch/arm64/mm/numa.c
>> +++ b/arch/arm64/mm/numa.c
>> @@ -335,7 +335,8 @@ void __init numa_set_distance(int from, int to, int distance)
>>  	}
>>
>>  	if ((u8)distance != distance ||
>> -	    (from == to && distance != LOCAL_DISTANCE)) {
>> +	    (from == to && distance != LOCAL_DISTANCE) ||
>> +	    (from != to && distance == LOCAL_DISTANCE)) {
>
> The current code here is more-or-less lifted from the x86 implementation
> of numa_set_distance().

Right, I did notice this. I didn't think that x86 folks would be so 
concerned since they generally only use ACPI, and the ACPI code already 
validates these distances in drivers/acpi/numa.c: slit_valid() [unlike 
OF code].

  I think we should either factor out the sanity check
> into a core helper or make the core code robust to these funny configurations.

OK, so to me it would make sense to factor out a sanity check into a 
core helper.

Cheers,
John

>
> Will
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
>
Will Deacon Oct. 29, 2018, 12:16 p.m. UTC | #3
On Mon, Oct 29, 2018 at 12:14:09PM +0000, John Garry wrote:
> On 29/10/2018 11:25, Will Deacon wrote:
> >On Fri, Oct 26, 2018 at 09:57:47PM +0800, John Garry wrote:
> >>Currently it is acceptable to set the distance between 2 separate nodes to
> >>LOCAL_DISTANCE.
> >>
> >>Reject this as it is invalid.
> >>
> >>This change avoids a crash reported in [1].
> >>
> >>[1] https://www.spinics.net/lists/arm-kernel/msg683304.html
> >>
> >>Signed-off-by: John Garry <john.garry@huawei.com>
> >>
> >>diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> >>index 146c04c..6092e3d 100644
> >>--- a/arch/arm64/mm/numa.c
> >>+++ b/arch/arm64/mm/numa.c
> >>@@ -335,7 +335,8 @@ void __init numa_set_distance(int from, int to, int distance)
> >> 	}
> >>
> >> 	if ((u8)distance != distance ||
> >>-	    (from == to && distance != LOCAL_DISTANCE)) {
> >>+	    (from == to && distance != LOCAL_DISTANCE) ||
> >>+	    (from != to && distance == LOCAL_DISTANCE)) {
> >
> >The current code here is more-or-less lifted from the x86 implementation
> >of numa_set_distance().
> 
> Right, I did notice this. I didn't think that x86 folks would be so
> concerned since they generally only use ACPI, and the ACPI code already
> validates these distances in drivers/acpi/numa.c: slit_valid() [unlike OF
> code].
> 
>  I think we should either factor out the sanity check
> >into a core helper or make the core code robust to these funny configurations.
> 
> OK, so to me it would make sense to factor out a sanity check into a core
> helper.

That, or have the OF code perform the same validation that slit_valid() is
doing for ACPI. I'm just trying to avoid other architectures running into
this problem down the line.

Will
John Garry Oct. 29, 2018, 12:32 p.m. UTC | #4
On 29/10/2018 12:16, Will Deacon wrote:
> On Mon, Oct 29, 2018 at 12:14:09PM +0000, John Garry wrote:
>> On 29/10/2018 11:25, Will Deacon wrote:
>>> On Fri, Oct 26, 2018 at 09:57:47PM +0800, John Garry wrote:
>>>> Currently it is acceptable to set the distance between 2 separate nodes to
>>>> LOCAL_DISTANCE.
>>>>
>>>> Reject this as it is invalid.
>>>>
>>>> This change avoids a crash reported in [1].
>>>>
>>>> [1] https://www.spinics.net/lists/arm-kernel/msg683304.html
>>>>
>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>>
>>>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>>>> index 146c04c..6092e3d 100644
>>>> --- a/arch/arm64/mm/numa.c
>>>> +++ b/arch/arm64/mm/numa.c
>>>> @@ -335,7 +335,8 @@ void __init numa_set_distance(int from, int to, int distance)
>>>> 	}
>>>>
>>>> 	if ((u8)distance != distance ||
>>>> -	    (from == to && distance != LOCAL_DISTANCE)) {
>>>> +	    (from == to && distance != LOCAL_DISTANCE) ||
>>>> +	    (from != to && distance == LOCAL_DISTANCE)) {
>>>
>>> The current code here is more-or-less lifted from the x86 implementation
>>> of numa_set_distance().
>>
>> Right, I did notice this. I didn't think that x86 folks would be so
>> concerned since they generally only use ACPI, and the ACPI code already
>> validates these distances in drivers/acpi/numa.c: slit_valid() [unlike OF
>> code].
>>
>>  I think we should either factor out the sanity check
>>> into a core helper or make the core code robust to these funny configurations.
>>
>> OK, so to me it would make sense to factor out a sanity check into a core
>> helper.
>
> That, or have the OF code perform the same validation that slit_valid() is
> doing for ACPI. I'm just trying to avoid other architectures running into
> this problem down the line.
>

Right, OF code should do this validation job if ACPI is doing it 
(especially since the DT bindings actually specify the distance rules), 
and not rely on the arch NUMA code to accept/reject numa_set_distance() 
combinations.

And, in addition to this, I'd say OF should disable NUMA if given an 
invalid table (like ACPI does).

Cheers,
John

> Will
>
> .
>
Anshuman Khandual Oct. 29, 2018, 12:45 p.m. UTC | #5
On 10/29/2018 06:02 PM, John Garry wrote:
> On 29/10/2018 12:16, Will Deacon wrote:
>> On Mon, Oct 29, 2018 at 12:14:09PM +0000, John Garry wrote:
>>> On 29/10/2018 11:25, Will Deacon wrote:
>>>> On Fri, Oct 26, 2018 at 09:57:47PM +0800, John Garry wrote:
>>>>> Currently it is acceptable to set the distance between 2 separate nodes to
>>>>> LOCAL_DISTANCE.
>>>>>
>>>>> Reject this as it is invalid.
>>>>>
>>>>> This change avoids a crash reported in [1].
>>>>>
>>>>> [1] https://www.spinics.net/lists/arm-kernel/msg683304.html
>>>>>
>>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>>>
>>>>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>>>>> index 146c04c..6092e3d 100644
>>>>> --- a/arch/arm64/mm/numa.c
>>>>> +++ b/arch/arm64/mm/numa.c
>>>>> @@ -335,7 +335,8 @@ void __init numa_set_distance(int from, int to, int distance)
>>>>>     }
>>>>>
>>>>>     if ((u8)distance != distance ||
>>>>> -        (from == to && distance != LOCAL_DISTANCE)) {
>>>>> +        (from == to && distance != LOCAL_DISTANCE) ||
>>>>> +        (from != to && distance == LOCAL_DISTANCE)) {
>>>>
>>>> The current code here is more-or-less lifted from the x86 implementation
>>>> of numa_set_distance().
>>>
>>> Right, I did notice this. I didn't think that x86 folks would be so
>>> concerned since they generally only use ACPI, and the ACPI code already
>>> validates these distances in drivers/acpi/numa.c: slit_valid() [unlike OF
>>> code].
>>>
>>>  I think we should either factor out the sanity check
>>>> into a core helper or make the core code robust to these funny configurations.
>>>
>>> OK, so to me it would make sense to factor out a sanity check into a core
>>> helper.
>>
>> That, or have the OF code perform the same validation that slit_valid() is
>> doing for ACPI. I'm just trying to avoid other architectures running into
>> this problem down the line.
>>
> 
> Right, OF code should do this validation job if ACPI is doing it (especially since the DT bindings actually specify the distance rules), and not rely on the arch NUMA code to accept/reject numa_set_distance() combinations.

I would say this particular condition checking still falls under arch NUMA init
code sanity check like other basic tests what numa_set_distance() currently does
already but it should not be a necessity for the OF driver to check these. It can
choose to check but arch NUMA should check basic things like two different NUMA
nodes should not have LOCAL_DISTANCE as distance like in this case.

	(from == to && distance != LOCAL_DISTANCE) ||
		(from != to && distance == LOCAL_DISTANCE))


> 
> And, in addition to this, I'd say OF should disable NUMA if given an invalid table (like ACPI does).

Taking a decision to disable NUMA should be with kernel (arch NUMA) once kernel
starts booting. Platform should have sent right values, OF driver trying to
adjust stuff what platform has sent with FDT once the kernel starts booting is
not right. For example "Kernel NUMA wont like the distance factors lets clean
then up before passing on to MM". Disabling NUMA is one such major decision which
should be with arch NUMA code not with OF driver.
John Garry Oct. 29, 2018, 2:44 p.m. UTC | #6
>>>>
>>>>  I think we should either factor out the sanity check
>>>>> into a core helper or make the core code robust to these funny configurations.
>>>>
>>>> OK, so to me it would make sense to factor out a sanity check into a core
>>>> helper.
>>>
>>> That, or have the OF code perform the same validation that slit_valid() is
>>> doing for ACPI. I'm just trying to avoid other architectures running into
>>> this problem down the line.
>>>
>>
>> Right, OF code should do this validation job if ACPI is doing it (especially since the DT bindings actually specify the distance rules), and not rely on the arch NUMA code to accept/reject numa_set_distance() combinations.
>
> I would say this particular condition checking still falls under arch NUMA init
> code sanity check like other basic tests what numa_set_distance() currently does
> already but it should not be a necessity for the OF driver to check these.

The checks in the arch NUMA code mean that invalid inter-node distance 
combinations are ignored.

However, if any entries in the table are invalid, then the whole table 
can be discarded as none of it can be believed, i.e. it's better to 
validate the table.

It can
> choose to check but arch NUMA should check basic things like two different NUMA
> nodes should not have LOCAL_DISTANCE as distance like in this case.
>
> 	(from == to && distance != LOCAL_DISTANCE) ||
> 		(from != to && distance == LOCAL_DISTANCE))
>
>
>>
>> And, in addition to this, I'd say OF should disable NUMA if given an invalid table (like ACPI does).
>
> Taking a decision to disable NUMA should be with kernel (arch NUMA) once kernel
> starts booting. Platform should have sent right values, OF driver trying to
> adjust stuff what platform has sent with FDT once the kernel starts booting is
> not right. For example "Kernel NUMA wont like the distance factors lets clean
> then up before passing on to MM".

Sorry, but I don't know who was advocating this.

Disabling NUMA is one such major decision which
> should be with arch NUMA code not with OF driver.

I meant parsing the table would fail, so arch NUMA would fall back on 
dummy NUMA.

>

Thanks,
John
Will Deacon Oct. 29, 2018, 2:48 p.m. UTC | #7
On Mon, Oct 29, 2018 at 06:15:42PM +0530, Anshuman Khandual wrote:
> On 10/29/2018 06:02 PM, John Garry wrote:
> > On 29/10/2018 12:16, Will Deacon wrote:
> >> On Mon, Oct 29, 2018 at 12:14:09PM +0000, John Garry wrote:
> >>> On 29/10/2018 11:25, Will Deacon wrote:
> >>>> On Fri, Oct 26, 2018 at 09:57:47PM +0800, John Garry wrote:
> >>>>> Currently it is acceptable to set the distance between 2 separate nodes to
> >>>>> LOCAL_DISTANCE.
> >>>>>
> >>>>> Reject this as it is invalid.
> >>>>>
> >>>>> This change avoids a crash reported in [1].
> >>>>>
> >>>>> [1] https://www.spinics.net/lists/arm-kernel/msg683304.html
> >>>>>
> >>>>> Signed-off-by: John Garry <john.garry@huawei.com>
> >>>>>
> >>>>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> >>>>> index 146c04c..6092e3d 100644
> >>>>> --- a/arch/arm64/mm/numa.c
> >>>>> +++ b/arch/arm64/mm/numa.c
> >>>>> @@ -335,7 +335,8 @@ void __init numa_set_distance(int from, int to, int distance)
> >>>>>     }
> >>>>>
> >>>>>     if ((u8)distance != distance ||
> >>>>> -        (from == to && distance != LOCAL_DISTANCE)) {
> >>>>> +        (from == to && distance != LOCAL_DISTANCE) ||
> >>>>> +        (from != to && distance == LOCAL_DISTANCE)) {
> >>>>
> >>>> The current code here is more-or-less lifted from the x86 implementation
> >>>> of numa_set_distance().
> >>>
> >>> Right, I did notice this. I didn't think that x86 folks would be so
> >>> concerned since they generally only use ACPI, and the ACPI code already
> >>> validates these distances in drivers/acpi/numa.c: slit_valid() [unlike OF
> >>> code].
> >>>
> >>>  I think we should either factor out the sanity check
> >>>> into a core helper or make the core code robust to these funny configurations.
> >>>
> >>> OK, so to me it would make sense to factor out a sanity check into a core
> >>> helper.
> >>
> >> That, or have the OF code perform the same validation that slit_valid() is
> >> doing for ACPI. I'm just trying to avoid other architectures running into
> >> this problem down the line.
> >>
> > 
> > Right, OF code should do this validation job if ACPI is doing it
> > (especially since the DT bindings actually specify the distance rules),
> > and not rely on the arch NUMA code to accept/reject numa_set_distance()
> > combinations.
> 
> I would say this particular condition checking still falls under arch NUMA init
> code sanity check like other basic tests what numa_set_distance() currently does
> already but it should not be a necessity for the OF driver to check these. It can
> choose to check but arch NUMA should check basic things like two different NUMA
> nodes should not have LOCAL_DISTANCE as distance like in this case.
> 
> 	(from == to && distance != LOCAL_DISTANCE) ||
> 		(from != to && distance == LOCAL_DISTANCE))
> 
> 
> > 
> > And, in addition to this, I'd say OF should disable NUMA if given an
> > invalid table (like ACPI does).
> 
> Taking a decision to disable NUMA should be with kernel (arch NUMA) once kernel
> starts booting. Platform should have sent right values, OF driver trying to
> adjust stuff what platform has sent with FDT once the kernel starts booting is
> not right. For example "Kernel NUMA wont like the distance factors lets clean
> then up before passing on to MM". Disabling NUMA is one such major decision which
> should be with arch NUMA code not with OF driver.

I don't fully understand what you're getting at here, but why would the
check posted by John be arch-specific? It's already done in the core code
for ACPI, so there's a discrepancy between ACPI and FDT that should be
resolved. I'd also argue that the subtleties of this check are actually
based on what the core code is willing to accept in terms of the NUMA
description, so it's also the best place to enforce it.

Will
Anshuman Khandual Oct. 30, 2018, 2:46 a.m. UTC | #8
On 10/29/2018 08:14 PM, John Garry wrote:
>>>>>
>>>>>  I think we should either factor out the sanity check
>>>>>> into a core helper or make the core code robust to these funny configurations.
>>>>>
>>>>> OK, so to me it would make sense to factor out a sanity check into a core
>>>>> helper.
>>>>
>>>> That, or have the OF code perform the same validation that slit_valid() is
>>>> doing for ACPI. I'm just trying to avoid other architectures running into
>>>> this problem down the line.
>>>>
>>>
>>> Right, OF code should do this validation job if ACPI is doing it (especially since the DT bindings actually specify the distance rules), and not rely on the arch NUMA code to accept/reject numa_set_distance() combinations.
>>
>> I would say this particular condition checking still falls under arch NUMA init
>> code sanity check like other basic tests what numa_set_distance() currently does
>> already but it should not be a necessity for the OF driver to check these.
> 
> The checks in the arch NUMA code mean that invalid inter-node distance combinations are ignored.

Right and should not this new test (from != to && distance == LOCAL_DISTANCE) be
one of them as well ? numa_set_distance() updates the table or just throws some
warnings while skipping entries it deems invalid. It would be okay to have this
new check there in addition to others like this patch suggests.

> 
> However, if any entries in the table are invalid, then the whole table can be discarded as none of it can be believed, i.e. it's better to validate the table.
>

Agreed. slit_valid() on the ACPI parsing is currently enforcing that before
acpi_numa_slit_init() which would call into numa_set_distance(). Hence arch
NUMA code numa_set_distance() never had the opportunity to do the sanity
checks as ACPI slit_valid() has completely invalidated the table.

Unlike ACPI path, of_numa_parse_distance_map_v1() does not do any sanity
checks on the distance values parse from the "distance-matrix" property
and all the checks directly falls on numa_set_distance(). This needs to
be fixed in line with ACPI

* If (to == from) ---> distance = LOCAL_DISTANCE
* If (to != from) ---> distance > LOCAL_DISTANCE

At the same time its okay to just enhance numa_set_distance() test coverage
to include this new test. If we would have trusted firmware parsing all the
way, existing basic checks about node range, distance stuff should not have
been there in numa_set_distance(). Hence IMHO even if we fix the OF driver
part, we should include this new check there as well.

> It can
>> choose to check but arch NUMA should check basic things like two different NUMA
>> nodes should not have LOCAL_DISTANCE as distance like in this case.
>>
>>     (from == to && distance != LOCAL_DISTANCE) ||
>>         (from != to && distance == LOCAL_DISTANCE))
>>
>>
>>>
>>> And, in addition to this, I'd say OF should disable NUMA if given an invalid table (like ACPI does).
>>
>> Taking a decision to disable NUMA should be with kernel (arch NUMA) once kernel
>> starts booting. Platform should have sent right values, OF driver trying to
>> adjust stuff what platform has sent with FDT once the kernel starts booting is
>> not right. For example "Kernel NUMA wont like the distance factors lets clean
>> then up before passing on to MM".
> 
> Sorry, but I don't know who was advocating this.

I was just giving an example. Invalidating NUMA distance table during firmware
table (ACPI or FDT) parsing forces arm64_numa_init() to fall back on dummy NUMA
node which is like disabling NUMA. But that is the current semantics with ACPI
parsing which I overlooked. Fixing OF driver to do the same wont extend this
any further, hence my previous concern does not stand valid.

> 
> Disabling NUMA is one such major decision which
>> should be with arch NUMA code not with OF driver.
> 
> I meant parsing the table would fail, so arch NUMA would fall back on dummy NUMA.

Right and ACPI parsing does that and can force a fallback on a dummy NUMA node.
Anshuman Khandual Oct. 30, 2018, 3 a.m. UTC | #9
On 10/29/2018 08:18 PM, Will Deacon wrote:
> On Mon, Oct 29, 2018 at 06:15:42PM +0530, Anshuman Khandual wrote:
>> On 10/29/2018 06:02 PM, John Garry wrote:
>>> On 29/10/2018 12:16, Will Deacon wrote:
>>>> On Mon, Oct 29, 2018 at 12:14:09PM +0000, John Garry wrote:
>>>>> On 29/10/2018 11:25, Will Deacon wrote:
>>>>>> On Fri, Oct 26, 2018 at 09:57:47PM +0800, John Garry wrote:
>>>>>>> Currently it is acceptable to set the distance between 2 separate nodes to
>>>>>>> LOCAL_DISTANCE.
>>>>>>>
>>>>>>> Reject this as it is invalid.
>>>>>>>
>>>>>>> This change avoids a crash reported in [1].
>>>>>>>
>>>>>>> [1] https://www.spinics.net/lists/arm-kernel/msg683304.html
>>>>>>>
>>>>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>>>>>
>>>>>>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>>>>>>> index 146c04c..6092e3d 100644
>>>>>>> --- a/arch/arm64/mm/numa.c
>>>>>>> +++ b/arch/arm64/mm/numa.c
>>>>>>> @@ -335,7 +335,8 @@ void __init numa_set_distance(int from, int to, int distance)
>>>>>>>     }
>>>>>>>
>>>>>>>     if ((u8)distance != distance ||
>>>>>>> -        (from == to && distance != LOCAL_DISTANCE)) {
>>>>>>> +        (from == to && distance != LOCAL_DISTANCE) ||
>>>>>>> +        (from != to && distance == LOCAL_DISTANCE)) {
>>>>>>
>>>>>> The current code here is more-or-less lifted from the x86 implementation
>>>>>> of numa_set_distance().
>>>>>
>>>>> Right, I did notice this. I didn't think that x86 folks would be so
>>>>> concerned since they generally only use ACPI, and the ACPI code already
>>>>> validates these distances in drivers/acpi/numa.c: slit_valid() [unlike OF
>>>>> code].
>>>>>
>>>>>  I think we should either factor out the sanity check
>>>>>> into a core helper or make the core code robust to these funny configurations.
>>>>>
>>>>> OK, so to me it would make sense to factor out a sanity check into a core
>>>>> helper.
>>>>
>>>> That, or have the OF code perform the same validation that slit_valid() is
>>>> doing for ACPI. I'm just trying to avoid other architectures running into
>>>> this problem down the line.
>>>>
>>>
>>> Right, OF code should do this validation job if ACPI is doing it
>>> (especially since the DT bindings actually specify the distance rules),
>>> and not rely on the arch NUMA code to accept/reject numa_set_distance()
>>> combinations.
>>
>> I would say this particular condition checking still falls under arch NUMA init
>> code sanity check like other basic tests what numa_set_distance() currently does
>> already but it should not be a necessity for the OF driver to check these. It can
>> choose to check but arch NUMA should check basic things like two different NUMA
>> nodes should not have LOCAL_DISTANCE as distance like in this case.
>>
>> 	(from == to && distance != LOCAL_DISTANCE) ||
>> 		(from != to && distance == LOCAL_DISTANCE))
>>
>>
>>>
>>> And, in addition to this, I'd say OF should disable NUMA if given an
>>> invalid table (like ACPI does).
>>
>> Taking a decision to disable NUMA should be with kernel (arch NUMA) once kernel
>> starts booting. Platform should have sent right values, OF driver trying to
>> adjust stuff what platform has sent with FDT once the kernel starts booting is
>> not right. For example "Kernel NUMA wont like the distance factors lets clean
>> then up before passing on to MM". Disabling NUMA is one such major decision which
>> should be with arch NUMA code not with OF driver.
> 
> I don't fully understand what you're getting at here, but why would the
> check posted by John be arch-specific? It's already done in the core code
> for ACPI, so there's a discrepancy between ACPI and FDT that should be
> resolved. I'd also argue that the subtleties of this check are actually
> based on what the core code is willing to accept in terms of the NUMA
> description, so it's also the best place to enforce it.

Agreed. I had overlooked the existing semantics with respect to ACPI parsing.
Yes, there is a discrepancy with respect to FDT which should be fixed. But
IMHO its also worth to enhance numa_set_distance() checks with this proposed
new check as well.
Will Deacon Nov. 1, 2018, 11:27 a.m. UTC | #10
[ Nit: Please wrap your lines when replying -- I've done it for you here ]

On Tue, Oct 30, 2018 at 08:16:21AM +0530, Anshuman Khandual wrote:
> On 10/29/2018 08:14 PM, John Garry wrote:
> >>>>>  I think we should either factor out the sanity check
> >>>>>> into a core helper or make the core code robust to these funny configurations.
> >>>>>
> >>>>> OK, so to me it would make sense to factor out a sanity check into a core
> >>>>> helper.
> >>>>
> >>>> That, or have the OF code perform the same validation that slit_valid() is
> >>>> doing for ACPI. I'm just trying to avoid other architectures running into
> >>>> this problem down the line.
> >>>>
> >>>
> >>> Right, OF code should do this validation job if ACPI is doing it
> >>> (especially since the DT bindings actually specify the distance
> >>> rules), and not rely on the arch NUMA code to accept/reject
> >>> numa_set_distance() combinations.
> >>
> >> I would say this particular condition checking still falls under arch NUMA init
> >> code sanity check like other basic tests what numa_set_distance() currently does
> >> already but it should not be a necessity for the OF driver to check these.
> > 
> > The checks in the arch NUMA code mean that invalid inter-node distance
> > combinations are ignored.
> 
> Right and should not this new test (from != to && distance == LOCAL_DISTANCE) be
> one of them as well ? numa_set_distance() updates the table or just throws some
> warnings while skipping entries it deems invalid. It would be okay to have this
> new check there in addition to others like this patch suggests.

If we're changing numa_set_distance(), I'd actually be inclined to do the
opposite of what you're suggesting and move as much of this boilerplate
checking into the core code. Perhaps we could add something like
check_fw_numa_topology() and have both ACPI and OF call that so that the
arch doesn't need to worry about malformed tables at all.

> > However, if any entries in the table are invalid, then the whole table
> > can be discarded as none of it can be believed, i.e. it's better to
> > validate the table.
> >
> 
> Agreed. slit_valid() on the ACPI parsing is currently enforcing that before
> acpi_numa_slit_init() which would call into numa_set_distance(). Hence arch
> NUMA code numa_set_distance() never had the opportunity to do the sanity
> checks as ACPI slit_valid() has completely invalidated the table.
> 
> Unlike ACPI path, of_numa_parse_distance_map_v1() does not do any sanity
> checks on the distance values parse from the "distance-matrix" property
> and all the checks directly falls on numa_set_distance(). This needs to
> be fixed in line with ACPI
> 
> * If (to == from) ---> distance = LOCAL_DISTANCE
> * If (to != from) ---> distance > LOCAL_DISTANCE
> 
> At the same time its okay to just enhance numa_set_distance() test coverage
> to include this new test. If we would have trusted firmware parsing all the
> way, existing basic checks about node range, distance stuff should not have
> been there in numa_set_distance(). Hence IMHO even if we fix the OF driver
> part, we should include this new check there as well.

I don't see what we gain by duplicating the check. In fact, it has a few
downsides:

  (1) It confuses the semantics of the API, because it is no longer clear
      who "owns" the check

  (2) It duplicates code in each architecture

  (3) Some clever-cloggs will remove at least some of the duplication in
      future

I'm not willing to accept the check in the arm64 code if we update the
OF code.

I think the way forward here is for John to fix the crash he reported by
adding the check to the OF code. If somebody wants to follow up with
subsequent patches to move more of the checking out of the arch code, then
we can review that as a separate series.

Will
John Garry Nov. 1, 2018, 11:39 a.m. UTC | #11
>>
>> Agreed. slit_valid() on the ACPI parsing is currently enforcing that before
>> acpi_numa_slit_init() which would call into numa_set_distance(). Hence arch
>> NUMA code numa_set_distance() never had the opportunity to do the sanity
>> checks as ACPI slit_valid() has completely invalidated the table.
>>
>> Unlike ACPI path, of_numa_parse_distance_map_v1() does not do any sanity
>> checks on the distance values parse from the "distance-matrix" property
>> and all the checks directly falls on numa_set_distance(). This needs to
>> be fixed in line with ACPI
>>
>> * If (to == from) ---> distance = LOCAL_DISTANCE
>> * If (to != from) ---> distance > LOCAL_DISTANCE
>>
>> At the same time its okay to just enhance numa_set_distance() test coverage
>> to include this new test. If we would have trusted firmware parsing all the
>> way, existing basic checks about node range, distance stuff should not have
>> been there in numa_set_distance(). Hence IMHO even if we fix the OF driver
>> part, we should include this new check there as well.
>
> I don't see what we gain by duplicating the check. In fact, it has a few
> downsides:
>
>   (1) It confuses the semantics of the API, because it is no longer clear
>       who "owns" the check
>
>   (2) It duplicates code in each architecture
>
>   (3) Some clever-cloggs will remove at least some of the duplication in
>       future
>
> I'm not willing to accept the check in the arm64 code if we update the
> OF code.
>
> I think the way forward here is for John to fix the crash he reported by
> adding the check to the OF code.

I was planning on doing that.

If somebody wants to follow up with
> subsequent patches to move more of the checking out of the arch code, then
> we can review that as a separate series.

Cheers,
John

>
> Will
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> .
>
Anshuman Khandual Nov. 8, 2018, 2:20 p.m. UTC | #12
On 11/01/2018 04:57 PM, Will Deacon wrote:
> [ Nit: Please wrap your lines when replying -- I've done it for you here ]
> 
> On Tue, Oct 30, 2018 at 08:16:21AM +0530, Anshuman Khandual wrote:
>> On 10/29/2018 08:14 PM, John Garry wrote:
>>>>>>>  I think we should either factor out the sanity check
>>>>>>>> into a core helper or make the core code robust to these funny configurations.
>>>>>>>
>>>>>>> OK, so to me it would make sense to factor out a sanity check into a core
>>>>>>> helper.
>>>>>>
>>>>>> That, or have the OF code perform the same validation that slit_valid() is
>>>>>> doing for ACPI. I'm just trying to avoid other architectures running into
>>>>>> this problem down the line.
>>>>>>
>>>>>
>>>>> Right, OF code should do this validation job if ACPI is doing it
>>>>> (especially since the DT bindings actually specify the distance
>>>>> rules), and not rely on the arch NUMA code to accept/reject
>>>>> numa_set_distance() combinations.
>>>>
>>>> I would say this particular condition checking still falls under arch NUMA init
>>>> code sanity check like other basic tests what numa_set_distance() currently does
>>>> already but it should not be a necessity for the OF driver to check these.
>>>
>>> The checks in the arch NUMA code mean that invalid inter-node distance
>>> combinations are ignored.
>>
>> Right and should not this new test (from != to && distance == LOCAL_DISTANCE) be
>> one of them as well ? numa_set_distance() updates the table or just throws some
>> warnings while skipping entries it deems invalid. It would be okay to have this
>> new check there in addition to others like this patch suggests.

I missed this response due to some problem in my mail client and re-iterated
some of the same points again on the V2 (https://lkml.org/lkml/2018/11/8/734).
My apologies for the same.

> 
> If we're changing numa_set_distance(), I'd actually be inclined to do the
> opposite of what you're suggesting and move as much of this boilerplate
> checking into the core code. Perhaps we could add something like
> check_fw_numa_topology() and have both ACPI and OF call that so that the
> arch doesn't need to worry about malformed tables at all.

Right although I doubt that we could ever have a common check_fw_numa_topology()
check as the semantics for the table and individual entries there in for DT and
ACPI might be different. But I agree what you are saying.

> 
>>> However, if any entries in the table are invalid, then the whole table
>>> can be discarded as none of it can be believed, i.e. it's better to
>>> validate the table.
>>>
>>
>> Agreed. slit_valid() on the ACPI parsing is currently enforcing that before
>> acpi_numa_slit_init() which would call into numa_set_distance(). Hence arch
>> NUMA code numa_set_distance() never had the opportunity to do the sanity
>> checks as ACPI slit_valid() has completely invalidated the table.
>>
>> Unlike ACPI path, of_numa_parse_distance_map_v1() does not do any sanity
>> checks on the distance values parse from the "distance-matrix" property
>> and all the checks directly falls on numa_set_distance(). This needs to
>> be fixed in line with ACPI
>>
>> * If (to == from) ---> distance = LOCAL_DISTANCE
>> * If (to != from) ---> distance > LOCAL_DISTANCE
>>
>> At the same time its okay to just enhance numa_set_distance() test coverage
>> to include this new test. If we would have trusted firmware parsing all the
>> way, existing basic checks about node range, distance stuff should not have
>> been there in numa_set_distance(). Hence IMHO even if we fix the OF driver
>> part, we should include this new check there as well.
> 
> I don't see what we gain by duplicating the check. In fact, it has a few
> downsides:
> 
>   (1) It confuses the semantics of the API, because it is no longer clear
>       who "owns" the check
> 
>   (2) It duplicates code in each architecture
> 
>   (3) Some clever-cloggs will remove at least some of the duplication in
>       future

Agreed, it has down sides.

> 
> I'm not willing to accept the check in the arm64 code if we update the
> OF code.

Okay, we should remove them instead.

> 
> I think the way forward here is for John to fix the crash he reported by
> adding the check to the OF code. If somebody wants to follow up with
> subsequent patches to move more of the checking out of the arch code, then
> we can review that as a separate series.

Okay. Anyways I had some other comments related to semantics checking at
the OF driver level but I can probably send them out as a separate patch
later. Once again it was my bad to miss this response in the first place.

Patch
diff mbox series

diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index 146c04c..6092e3d 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -335,7 +335,8 @@  void __init numa_set_distance(int from, int to, int distance)
 	}
 
 	if ((u8)distance != distance ||
-	    (from == to && distance != LOCAL_DISTANCE)) {
+	    (from == to && distance != LOCAL_DISTANCE) ||
+	    (from != to && distance == LOCAL_DISTANCE)) {
 		pr_warn_once("Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
 			     from, to, distance);
 		return;