diff mbox series

[v4,4/6] x86/mm: Introduce CONFIG_KEEP_NUMA

Message ID 157966229575.2508551.1892426244277171485.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)
State Superseded
Headers show
Series Memory Hierarchy: Enable target node lookups for reserved memory | expand

Commit Message

Dan Williams Jan. 22, 2020, 3:04 a.m. UTC
Currently x86 numa_meminfo is marked __initdata in the
CONFIG_MEMORY_HOTPLUG=n case. In support of a new facility to allow
drivers to map reserved memory to a 'target_node'
(phys_to_target_node()), add support for removing the __initdata
designation for those users. Both memory hotplug and
phys_to_target_node() users select CONFIG_KEEP_NUMA to tell the arch to
maintain its physical address to numa mapping infrastructure post init.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <x86@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/mm/numa.c   |    6 +-----
 include/linux/numa.h |    6 ++++++
 mm/Kconfig           |    5 +++++
 3 files changed, 12 insertions(+), 5 deletions(-)

Comments

Ingo Molnar Feb. 13, 2020, 9:32 a.m. UTC | #1
* Dan Williams <dan.j.williams@intel.com> wrote:

> Currently x86 numa_meminfo is marked __initdata in the
> CONFIG_MEMORY_HOTPLUG=n case. In support of a new facility to allow
> drivers to map reserved memory to a 'target_node'
> (phys_to_target_node()), add support for removing the __initdata
> designation for those users. Both memory hotplug and
> phys_to_target_node() users select CONFIG_KEEP_NUMA to tell the arch to
> maintain its physical address to numa mapping infrastructure post init.
> 
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: <x86@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/x86/mm/numa.c   |    6 +-----
>  include/linux/numa.h |    6 ++++++
>  mm/Kconfig           |    5 +++++
>  3 files changed, 12 insertions(+), 5 deletions(-)

The concept and the x86 portions look sane, just a few minor nits:

> 
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index 99f7a68738f0..5289d9d6799a 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -25,11 +25,7 @@ nodemask_t numa_nodes_parsed __initdata;
>  struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
>  EXPORT_SYMBOL(node_data);
>  
> -static struct numa_meminfo numa_meminfo
> -#ifndef CONFIG_MEMORY_HOTPLUG
> -__initdata
> -#endif
> -;
> +static struct numa_meminfo numa_meminfo __initdata_numa;
>  
>  static int numa_distance_cnt;
>  static u8 *numa_distance;
> diff --git a/include/linux/numa.h b/include/linux/numa.h
> index 20f4e44b186c..c005ed6b807b 100644
> --- a/include/linux/numa.h
> +++ b/include/linux/numa.h
> @@ -13,6 +13,12 @@
>  
>  #define	NUMA_NO_NODE	(-1)
>  
> +#ifdef CONFIG_KEEP_NUMA
> +#define __initdata_numa
> +#else
> +#define __initdata_numa __initdata
> +#endif
> +
>  #ifdef CONFIG_NUMA
>  int numa_map_to_online_node(int node);
>  #else
> diff --git a/mm/Kconfig b/mm/Kconfig
> index ab80933be65f..001f1185eadf 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -139,6 +139,10 @@ config HAVE_FAST_GUP
>  config ARCH_KEEP_MEMBLOCK
>  	bool
>  
> +# Keep arch numa mapping infrastructure post-init.

s/numa/NUMA

Please also capitalize consistently in the rest of the series.

> +config KEEP_NUMA
> +	bool


So most of our recent new NUMA options followed the naming pattern of:

  CONFIG_NUMA_*

Such as CONFIG_NUMA_BALANCING or CONFIG_NUMA_EMU.

So I'd suggesting naming it to CONFIG_NUMA_KEEP, or, a bit more 
descriptively, such as CONFIG_NUMA_KEEP_MAPPING or such?

'Keeping NUMA' is kind of lame - of course we keep NUMA. ;-)

Thanks,

	Ingo
Thomas Gleixner Feb. 13, 2020, 11:22 a.m. UTC | #2
Dan Williams <dan.j.williams@intel.com> writes:
> +#ifdef CONFIG_KEEP_NUMA
> +#define __initdata_numa
> +#else
> +#define __initdata_numa __initdata
> +#endif

TBH, I find this conditional annotation mightingly confusing.

__initdata_numa still suggest that this is __initdata, just a different
section and some extra rules or whatever.

Something like __initdata_or_keepnuma (sorry I could not come up with
something prettier, but you get the idea.

Thanks,

        tglx
Dan Williams Feb. 13, 2020, 9:19 p.m. UTC | #3
On Thu, Feb 13, 2020 at 1:32 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Dan Williams <dan.j.williams@intel.com> wrote:
>
> > Currently x86 numa_meminfo is marked __initdata in the
> > CONFIG_MEMORY_HOTPLUG=n case. In support of a new facility to allow
> > drivers to map reserved memory to a 'target_node'
> > (phys_to_target_node()), add support for removing the __initdata
> > designation for those users. Both memory hotplug and
> > phys_to_target_node() users select CONFIG_KEEP_NUMA to tell the arch to
> > maintain its physical address to numa mapping infrastructure post init.
> >
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: <x86@kernel.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  arch/x86/mm/numa.c   |    6 +-----
> >  include/linux/numa.h |    6 ++++++
> >  mm/Kconfig           |    5 +++++
> >  3 files changed, 12 insertions(+), 5 deletions(-)
>
> The concept and the x86 portions look sane, just a few minor nits:
>
> >
> > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > index 99f7a68738f0..5289d9d6799a 100644
> > --- a/arch/x86/mm/numa.c
> > +++ b/arch/x86/mm/numa.c
> > @@ -25,11 +25,7 @@ nodemask_t numa_nodes_parsed __initdata;
> >  struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
> >  EXPORT_SYMBOL(node_data);
> >
> > -static struct numa_meminfo numa_meminfo
> > -#ifndef CONFIG_MEMORY_HOTPLUG
> > -__initdata
> > -#endif
> > -;
> > +static struct numa_meminfo numa_meminfo __initdata_numa;
> >
> >  static int numa_distance_cnt;
> >  static u8 *numa_distance;
> > diff --git a/include/linux/numa.h b/include/linux/numa.h
> > index 20f4e44b186c..c005ed6b807b 100644
> > --- a/include/linux/numa.h
> > +++ b/include/linux/numa.h
> > @@ -13,6 +13,12 @@
> >
> >  #define      NUMA_NO_NODE    (-1)
> >
> > +#ifdef CONFIG_KEEP_NUMA
> > +#define __initdata_numa
> > +#else
> > +#define __initdata_numa __initdata
> > +#endif
> > +
> >  #ifdef CONFIG_NUMA
> >  int numa_map_to_online_node(int node);
> >  #else
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index ab80933be65f..001f1185eadf 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -139,6 +139,10 @@ config HAVE_FAST_GUP
> >  config ARCH_KEEP_MEMBLOCK
> >       bool
> >
> > +# Keep arch numa mapping infrastructure post-init.
>
> s/numa/NUMA
>
> Please also capitalize consistently in the rest of the series.
>
> > +config KEEP_NUMA
> > +     bool
>
>
> So most of our recent new NUMA options followed the naming pattern of:
>
>   CONFIG_NUMA_*
>
> Such as CONFIG_NUMA_BALANCING or CONFIG_NUMA_EMU.
>
> So I'd suggesting naming it to CONFIG_NUMA_KEEP, or, a bit more
> descriptively, such as CONFIG_NUMA_KEEP_MAPPING or such?
>
> 'Keeping NUMA' is kind of lame - of course we keep NUMA. ;-)

Ok, I settled on CONFIG_NUMA_KEEP_MEMINFO, and will fix up the
lowercase "numa" instances in the set.
Dan Williams Feb. 13, 2020, 9:21 p.m. UTC | #4
On Thu, Feb 13, 2020 at 3:22 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Dan Williams <dan.j.williams@intel.com> writes:
> > +#ifdef CONFIG_KEEP_NUMA
> > +#define __initdata_numa
> > +#else
> > +#define __initdata_numa __initdata
> > +#endif
>
> TBH, I find this conditional annotation mightingly confusing.
>
> __initdata_numa still suggest that this is __initdata, just a different
> section and some extra rules or whatever.
>
> Something like __initdata_or_keepnuma (sorry I could not come up with
> something prettier, but you get the idea.

Yes, and to dovetail with Ingo's feedback I think
__initdata_or_meminfo conveys it's optionally init vs runtime data.
diff mbox series

Patch

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 99f7a68738f0..5289d9d6799a 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -25,11 +25,7 @@  nodemask_t numa_nodes_parsed __initdata;
 struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
 EXPORT_SYMBOL(node_data);
 
-static struct numa_meminfo numa_meminfo
-#ifndef CONFIG_MEMORY_HOTPLUG
-__initdata
-#endif
-;
+static struct numa_meminfo numa_meminfo __initdata_numa;
 
 static int numa_distance_cnt;
 static u8 *numa_distance;
diff --git a/include/linux/numa.h b/include/linux/numa.h
index 20f4e44b186c..c005ed6b807b 100644
--- a/include/linux/numa.h
+++ b/include/linux/numa.h
@@ -13,6 +13,12 @@ 
 
 #define	NUMA_NO_NODE	(-1)
 
+#ifdef CONFIG_KEEP_NUMA
+#define __initdata_numa
+#else
+#define __initdata_numa __initdata
+#endif
+
 #ifdef CONFIG_NUMA
 int numa_map_to_online_node(int node);
 #else
diff --git a/mm/Kconfig b/mm/Kconfig
index ab80933be65f..001f1185eadf 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -139,6 +139,10 @@  config HAVE_FAST_GUP
 config ARCH_KEEP_MEMBLOCK
 	bool
 
+# Keep arch numa mapping infrastructure post-init.
+config KEEP_NUMA
+	bool
+
 config MEMORY_ISOLATION
 	bool
 
@@ -154,6 +158,7 @@  config MEMORY_HOTPLUG
 	bool "Allow for memory hot-add"
 	depends on SPARSEMEM || X86_64_ACPI_NUMA
 	depends on ARCH_ENABLE_MEMORY_HOTPLUG
+	select KEEP_NUMA if NUMA
 
 config MEMORY_HOTPLUG_SPARSE
 	def_bool y