diff mbox series

[1/1] Fix undefined reference to 'node_reclaim_distance'.

Message ID 20191216103522.32215-2-gonsolo@gmail.com (mailing list archive)
State New, archived
Headers show
Series Fix SH config error. | expand

Commit Message

Gon Solo Dec. 16, 2019, 10:35 a.m. UTC
According to https://lkml.org/lkml/2019/12/16/101 and
http://kisskb.ellerman.id.au/kisskb/buildresult/14067948/ building on
sh4 is broken due to a

page_alloc.c:(.text+0x3148): undefined reference to `node_reclaim_distance'.

This only happens with CONFIG_NUMA=y (variable used with #ifdef
CONFIG_NUMA at mm/page_alloc.c:3529) and CONFIG_SMP=n (variable defined at
kernel/sched/topology.c:2291 but the whole file to be built depends on
CONFIG_SMP in kernel/sched/Makefile:23.

Follow the lead of arch/x86/Kconfig:1547 and depend on SMP.

This assumes that there are no NUMA systems without SMP which is
reasonable I guess.

Signed-off-by: Gon Solo <gonsolo@gmail.com>
---
 arch/sh/mm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Geert Uytterhoeven Dec. 16, 2019, 12:21 p.m. UTC | #1
Hi Gon,

Thanks for your patch!

On Mon, Dec 16, 2019 at 11:35 AM Gon Solo <gonsolo@gmail.com> wrote:
> According to https://lkml.org/lkml/2019/12/16/101 and
> http://kisskb.ellerman.id.au/kisskb/buildresult/14067948/ building on
> sh4 is broken due to a
>
> page_alloc.c:(.text+0x3148): undefined reference to `node_reclaim_distance'.
>
> This only happens with CONFIG_NUMA=y (variable used with #ifdef
> CONFIG_NUMA at mm/page_alloc.c:3529) and CONFIG_SMP=n (variable defined at
> kernel/sched/topology.c:2291 but the whole file to be built depends on
> CONFIG_SMP in kernel/sched/Makefile:23.
>
> Follow the lead of arch/x86/Kconfig:1547 and depend on SMP.
>
> This assumes that there are no NUMA systems without SMP which is
> reasonable I guess.

Unfortunately that may be an x86-centric assumption: on other platforms,
there do exist systems with multiple memory banks with different access
performance figures.

> Signed-off-by: Gon Solo <gonsolo@gmail.com>
> ---
>  arch/sh/mm/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/sh/mm/Kconfig b/arch/sh/mm/Kconfig
> index 5c8a2ebfc720..cf655d8e8758 100644
> --- a/arch/sh/mm/Kconfig
> +++ b/arch/sh/mm/Kconfig
> @@ -108,7 +108,7 @@ config VSYSCALL
>
>  config NUMA
>         bool "Non Uniform Memory Access (NUMA) Support"
> -       depends on MMU && SYS_SUPPORTS_NUMA
> +       depends on MMU && SMP && SYS_SUPPORTS_NUMA
>         select ARCH_WANT_NUMA_VARIABLE_LOCALITY
>         default n
>         help

Gr{oetje,eeting}s,

                        Geert
Gon Solo Dec. 16, 2019, 1:21 p.m. UTC | #2
Hi!

> Unfortunately that may be an x86-centric assumption: on other platforms,
> there do exist systems with multiple memory banks with different access
> performance figures.

Does that mean that the line

obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o topology.o stop_task.o pelt.o

in kernel/sched/topology/Makefile:23 is wrong?

Because in topology.c:1284 the variable node_reclaim_distance ist defined
which is used in mm/page_alloc.c:3529 without depending on SMP.
Geert Uytterhoeven Dec. 16, 2019, 1:31 p.m. UTC | #3
Hi Gon,

On Mon, Dec 16, 2019 at 2:21 PM Gonsolo <gonsolo@gmail.com> wrote:
> > Unfortunately that may be an x86-centric assumption: on other platforms,
> > there do exist systems with multiple memory banks with different access
> > performance figures.
>
> Does that mean that the line
>
> obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o topology.o stop_task.o pelt.o
>
> in kernel/sched/topology/Makefile:23 is wrong?
>
> Because in topology.c:1284 the variable node_reclaim_distance ist defined
> which is used in mm/page_alloc.c:3529 without depending on SMP.

The offending commit seems to be a55c7454a8c887b2 ("sched/topology:
Improve load balancing on AMD EPYC systems").

Probably the node_reclaim_distance variable should be moved from
an SMP-specific file to a NUMA-specific file.

Gr{oetje,eeting}s,

                        Geert
Gon Solo Dec. 16, 2019, 1:50 p.m. UTC | #4
> The offending commit seems to be a55c7454a8c887b2 ("sched/topology:
> Improve load balancing on AMD EPYC systems").
>
> Probably the node_reclaim_distance variable should be moved from
> an SMP-specific file to a NUMA-specific file.

There are two variables that are used elsewhere:

int                             sched_max_numa_distance;

Used in kernel/sched/fair.c and kernel/sched/topology.c. I would move
it to fair.c.

int __read_mostly               node_reclaim_distance = RECLAIM_DISTANCE;

Used in
arch/x86/kernel/cpu/amd.c, line 894
kernel/sched/topology.c
mm/khugepaged.c, line 725
mm/page_alloc.c, line 3529

I'm not sure where to move this one.
Matt Fleming Dec. 23, 2019, 4:42 p.m. UTC | #5
On Mon, 16 Dec, at 02:50:49PM, Gonsolo wrote:
> > The offending commit seems to be a55c7454a8c887b2 ("sched/topology:
> > Improve load balancing on AMD EPYC systems").
> >
> > Probably the node_reclaim_distance variable should be moved from
> > an SMP-specific file to a NUMA-specific file.
> 
> There are two variables that are used elsewhere:
> 
> int                             sched_max_numa_distance;
> 
> Used in kernel/sched/fair.c and kernel/sched/topology.c. I would move
> it to fair.c.
> 
> int __read_mostly               node_reclaim_distance = RECLAIM_DISTANCE;
> 
> Used in
> arch/x86/kernel/cpu/amd.c, line 894
> kernel/sched/topology.c
> mm/khugepaged.c, line 725
> mm/page_alloc.c, line 3529
> 
> I'm not sure where to move this one.

Can someone test out this patch on one of the failing architectures?
(sh, ppc64)

---->8----

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6ec1e595b1d4..bf20e5883026 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1281,7 +1281,6 @@ static int			sched_domains_curr_level;
 int				sched_max_numa_distance;
 static int			*sched_domains_numa_distance;
 static struct cpumask		***sched_domains_numa_masks;
-int __read_mostly		node_reclaim_distance = RECLAIM_DISTANCE;
 #endif
 
 /*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4785a8a2040e..733890d913ea 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3523,6 +3523,7 @@ bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
 }
 
 #ifdef CONFIG_NUMA
+int __read_mostly		node_reclaim_distance = RECLAIM_DISTANCE;
 static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
 {
 	return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <=
Randy Dunlap Dec. 25, 2019, 7:18 a.m. UTC | #6
On 12/23/19 8:42 AM, Matt Fleming wrote:
> On Mon, 16 Dec, at 02:50:49PM, Gonsolo wrote:
>>> The offending commit seems to be a55c7454a8c887b2 ("sched/topology:
>>> Improve load balancing on AMD EPYC systems").
>>>
>>> Probably the node_reclaim_distance variable should be moved from
>>> an SMP-specific file to a NUMA-specific file.
>>
>> There are two variables that are used elsewhere:
>>
>> int                             sched_max_numa_distance;
>>
>> Used in kernel/sched/fair.c and kernel/sched/topology.c. I would move
>> it to fair.c.
>>
>> int __read_mostly               node_reclaim_distance = RECLAIM_DISTANCE;
>>
>> Used in
>> arch/x86/kernel/cpu/amd.c, line 894
>> kernel/sched/topology.c
>> mm/khugepaged.c, line 725
>> mm/page_alloc.c, line 3529
>>
>> I'm not sure where to move this one.
> 
> Can someone test out this patch on one of the failing architectures?
> (sh, ppc64)

Yes, it fixes the arch/sh/ build errors in my testing.
I don't have a failing ppc64 .config file to test.

thanks.

> ---->8----
> 
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 6ec1e595b1d4..bf20e5883026 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1281,7 +1281,6 @@ static int			sched_domains_curr_level;
>  int				sched_max_numa_distance;
>  static int			*sched_domains_numa_distance;
>  static struct cpumask		***sched_domains_numa_masks;
> -int __read_mostly		node_reclaim_distance = RECLAIM_DISTANCE;
>  #endif
>  
>  /*
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4785a8a2040e..733890d913ea 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3523,6 +3523,7 @@ bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
>  }
>  
>  #ifdef CONFIG_NUMA
> +int __read_mostly		node_reclaim_distance = RECLAIM_DISTANCE;
>  static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>  {
>  	return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <=
>
Gon Solo May 4, 2020, 12:08 p.m. UTC | #7
Hi!

> Yes, it fixes the arch/sh/ build errors in my testing.
> I don't have a failing ppc64 .config file to test.

As of v5.7-rc4 this patch doesn't seem to have been applied and the
build is still failing:
http://kisskb.ellerman.id.au/kisskb/buildresult/14067948/

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6ec1e595b1d4..bf20e5883026 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1281,7 +1281,6 @@ static int                      sched_domains_curr_level;
int                          sched_max_numa_distance;
static int                   *sched_domains_numa_distance;
static struct cpumask                ***sched_domains_numa_masks;
-int __read_mostly            node_reclaim_distance = RECLAIM_DISTANCE;
#endif

/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4785a8a2040e..733890d913ea 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3523,6 +3523,7 @@ bool zone_watermark_ok_safe(struct zone *z,
unsigned int order,
}

#ifdef CONFIG_NUMA
+int __read_mostly            node_reclaim_distance = RECLAIM_DISTANCE;
static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
{
    return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <=
diff mbox series

Patch

diff --git a/arch/sh/mm/Kconfig b/arch/sh/mm/Kconfig
index 5c8a2ebfc720..cf655d8e8758 100644
--- a/arch/sh/mm/Kconfig
+++ b/arch/sh/mm/Kconfig
@@ -108,7 +108,7 @@  config VSYSCALL
 
 config NUMA
 	bool "Non Uniform Memory Access (NUMA) Support"
-	depends on MMU && SYS_SUPPORTS_NUMA
+	depends on MMU && SMP && SYS_SUPPORTS_NUMA
 	select ARCH_WANT_NUMA_VARIABLE_LOCALITY
 	default n
 	help