[1/1] mm/memory_hotplug: Adds option to hot-add memory in ZONE_MOVABLE

Message ID 20190718024133.3873-1-leonardo@linux.ibm.com (mailing list archive)
State New, archived

Commit Message

Leonardo Bras July 18, 2019, 2:41 a.m. UTC
Add a kernel config option that makes hot-added memory come online in
ZONE_MOVABLE by default.

This would be useful on systems with MEMORY_HOTPLUG_DEFAULT_ONLINE=y,
as it allows choosing the zone into which memory is auto-onlined.

Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
---
 drivers/base/memory.c |  3 +++
 mm/Kconfig            | 14 ++++++++++++++
 2 files changed, 17 insertions(+)

Comments

Oscar Salvador July 18, 2019, 6:12 a.m. UTC | #1
On Wed, 2019-07-17 at 23:41 -0300, Leonardo Bras wrote:
> Add a kernel config option that makes hot-added memory come online in
> ZONE_MOVABLE by default.
> 
> This would be useful on systems with MEMORY_HOTPLUG_DEFAULT_ONLINE=y,
> as it allows choosing the zone into which memory is auto-onlined.

We already have the "movable_node" boot option, which has exactly that
effect.
Any hotplugged range will be placed in ZONE_MOVABLE.

Why do we need yet another option to achieve the same? Was that not
enough for your case?

> 
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
> ---
>  drivers/base/memory.c |  3 +++
>  mm/Kconfig            | 14 ++++++++++++++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index f180427e48f4..378b585785c1 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -670,6 +670,9 @@ static int init_memory_block(struct memory_block **memory,
>  	mem->state = state;
>  	start_pfn = section_nr_to_pfn(mem->start_section_nr);
>  	mem->phys_device = arch_get_memory_phys_device(start_pfn);
> +#ifdef CONFIG_MEMORY_HOTPLUG_MOVABLE
> +	mem->online_type = MMOP_ONLINE_MOVABLE;
> +#endif
>  
>  	ret = register_memory(mem);
>  
> diff --git a/mm/Kconfig b/mm/Kconfig
> index f0c76ba47695..74e793720f43 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -180,6 +180,20 @@ config MEMORY_HOTREMOVE
>  	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
>  	depends on MIGRATION
>  
> +config MEMORY_HOTPLUG_MOVABLE
> +	bool "Enhance the likelihood of hot-remove"
> +	depends on MEMORY_HOTREMOVE
> +	help
> +	  This option sets the hot-added memory zone to MOVABLE which
> +	  drastically reduces the chance of a hot-remove to fail due to
> +	  unmovable memory segments. Kernel memory can't be allocated in
> +	  this zone.
> +
> +	  Say Y here if you want to have better chance to hot-remove memory
> +	  that have been previously hot-added.
> +	  Say N here if you want to make all hot-added memory available to
> +	  kernel space.
> +
>  # Heavily threaded applications may benefit from splitting the mm-wide
>  # page_table_lock, so that faults on different parts of the user address
>  # space can be handled with less contention: split it at this NR_CPUS.
Mike Rapoport July 18, 2019, 6:26 a.m. UTC | #2
On Wed, Jul 17, 2019 at 11:41:34PM -0300, Leonardo Bras wrote:
> Add a kernel config option that makes hot-added memory come online in
> ZONE_MOVABLE by default.
> 
> This would be useful on systems with MEMORY_HOTPLUG_DEFAULT_ONLINE=y,
> as it allows choosing the zone into which memory is auto-onlined.
 
Please add a more elaborate description of the problem you are solving and
an outline of the solution.


> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
> ---
>  drivers/base/memory.c |  3 +++
>  mm/Kconfig            | 14 ++++++++++++++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index f180427e48f4..378b585785c1 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -670,6 +670,9 @@ static int init_memory_block(struct memory_block **memory,
>  	mem->state = state;
>  	start_pfn = section_nr_to_pfn(mem->start_section_nr);
>  	mem->phys_device = arch_get_memory_phys_device(start_pfn);
> +#ifdef CONFIG_MEMORY_HOTPLUG_MOVABLE
> +	mem->online_type = MMOP_ONLINE_MOVABLE;
> +#endif

Does it have to be a compile-time option?
It seems this could be changed at run time, or at least at boot.
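
As a rough illustration of the run-time alternative raised here, the
compile-time #ifdef could be replaced by a variable that is set from a
string at boot or at run time. The sketch below is a userspace model, not
kernel code: default_online_type and set_default_online_type() are
hypothetical names, the accepted strings are borrowed from the values the
sysfs memory block "state" file takes, and the MMOP_* enum is assumed to
mirror linux/memory_hotplug.h at the time of this thread.

```c
#include <assert.h>
#include <string.h>

/* Assumed to mirror the MMOP_* values in linux/memory_hotplug.h. */
enum {
	MMOP_OFFLINE = -1,
	MMOP_ONLINE_KEEP,
	MMOP_ONLINE_KERNEL,
	MMOP_ONLINE_MOVABLE,
};

/* Hypothetical run-time default that replaces the compile-time #ifdef. */
static int default_online_type = MMOP_ONLINE_KEEP;

/*
 * Hypothetical parser for a boot parameter or sysfs write.
 * Returns 0 on success and -1 if the string is not recognized,
 * in which case the current default is left untouched.
 */
static int set_default_online_type(const char *arg)
{
	assert(arg != NULL);

	if (strcmp(arg, "online") == 0)
		default_online_type = MMOP_ONLINE_KEEP;
	else if (strcmp(arg, "online_kernel") == 0)
		default_online_type = MMOP_ONLINE_KERNEL;
	else if (strcmp(arg, "online_movable") == 0)
		default_online_type = MMOP_ONLINE_MOVABLE;
	else
		return -1;

	return 0;
}
```

With this shape, the same kernel binary can serve both setups, which is
the flexibility a Kconfig symbol cannot offer.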
  
>  	ret = register_memory(mem);
>  
> diff --git a/mm/Kconfig b/mm/Kconfig
> index f0c76ba47695..74e793720f43 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -180,6 +180,20 @@ config MEMORY_HOTREMOVE
>  	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
>  	depends on MIGRATION
>  
> +config MEMORY_HOTPLUG_MOVABLE
> +	bool "Enhance the likelihood of hot-remove"
> +	depends on MEMORY_HOTREMOVE
> +	help
> +	  This option sets the hot-added memory zone to MOVABLE which
> +	  drastically reduces the chance of a hot-remove to fail due to
> +	  unmovable memory segments. Kernel memory can't be allocated in
> +	  this zone.
> +
> +	  Say Y here if you want to have better chance to hot-remove memory
> +	  that have been previously hot-added.
> +	  Say N here if you want to make all hot-added memory available to
> +	  kernel space.
> +
>  # Heavily threaded applications may benefit from splitting the mm-wide
>  # page_table_lock, so that faults on different parts of the user address
>  # space can be handled with less contention: split it at this NR_CPUS.
> -- 
> 2.20.1
>
Pasha Tatashin July 18, 2019, 12:19 p.m. UTC | #3
On Wed, Jul 17, 2019 at 10:42 PM Leonardo Bras <leonardo@linux.ibm.com> wrote:
>
> Add a kernel config option that makes hot-added memory come online in
> ZONE_MOVABLE by default.
>
> This would be useful on systems with MEMORY_HOTPLUG_DEFAULT_ONLINE=y,
> as it allows choosing the zone into which memory is auto-onlined.

This is a desired feature. From reading the code it looks to me that
auto-selection of online method type should be done in
memory_subsys_online().

When it is called from device online, mem->online_type should be -1:

if (mem->online_type < 0)
     mem->online_type = MMOP_ONLINE_KEEP;

Change it to:
if (mem->online_type < 0)
     mem->online_type = MMOP_DEFAULT_ONLINE_TYPE;

And in "linux/memory_hotplug.h"
#ifdef CONFIG_MEMORY_HOTPLUG_MOVABLE
#define MMOP_DEFAULT_ONLINE_TYPE MMOP_ONLINE_MOVABLE
#else
#define MMOP_DEFAULT_ONLINE_TYPE MMOP_ONLINE_KEEP
#endif

Could be expanded to support MMOP_ONLINE_KERNEL as well.

Pasha
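
The suggestion above can be modeled in userspace roughly as follows. This
is only a sketch under assumptions: struct memory_block_model and
resolve_online_type() are hypothetical stand-ins for struct memory_block
and the fallback inside memory_subsys_online(), and the MMOP_* enum is
assumed to mirror linux/memory_hotplug.h. Building with
-DCONFIG_MEMORY_HOTPLUG_MOVABLE flips the default to MMOP_ONLINE_MOVABLE.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror the MMOP_* values in linux/memory_hotplug.h. */
enum {
	MMOP_OFFLINE = -1,
	MMOP_ONLINE_KEEP,
	MMOP_ONLINE_KERNEL,
	MMOP_ONLINE_MOVABLE,
};

/* The default proposed in the reply, selected at compile time. */
#ifdef CONFIG_MEMORY_HOTPLUG_MOVABLE
#define MMOP_DEFAULT_ONLINE_TYPE MMOP_ONLINE_MOVABLE
#else
#define MMOP_DEFAULT_ONLINE_TYPE MMOP_ONLINE_KEEP
#endif

/* Hypothetical stand-in for the one field of struct memory_block used here. */
struct memory_block_model {
	int online_type;
};

/*
 * Models the fallback: a negative online_type means the caller did not
 * request a specific zone, so the default is applied. An explicit
 * request is left unchanged.
 */
static int resolve_online_type(struct memory_block_model *mem)
{
	assert(mem != NULL);

	if (mem->online_type < 0)
		mem->online_type = MMOP_DEFAULT_ONLINE_TYPE;

	return mem->online_type;
}
```

An explicit request such as MMOP_ONLINE_KERNEL passes through untouched;
only the "no preference" case picks up the configured default.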

>
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
> ---
>  drivers/base/memory.c |  3 +++
>  mm/Kconfig            | 14 ++++++++++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index f180427e48f4..378b585785c1 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -670,6 +670,9 @@ static int init_memory_block(struct memory_block **memory,
>         mem->state = state;
>         start_pfn = section_nr_to_pfn(mem->start_section_nr);
>         mem->phys_device = arch_get_memory_phys_device(start_pfn);
> +#ifdef CONFIG_MEMORY_HOTPLUG_MOVABLE
> +       mem->online_type = MMOP_ONLINE_MOVABLE;
> +#endif
>
>         ret = register_memory(mem);
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index f0c76ba47695..74e793720f43 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -180,6 +180,20 @@ config MEMORY_HOTREMOVE
>         depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
>         depends on MIGRATION
>
> +config MEMORY_HOTPLUG_MOVABLE
> +       bool "Enhance the likelihood of hot-remove"
> +       depends on MEMORY_HOTREMOVE
> +       help
> +         This option sets the hot-added memory zone to MOVABLE which
> +         drastically reduces the chance of a hot-remove to fail due to
> +         unmovable memory segments. Kernel memory can't be allocated in
> +         this zone.
> +
> +         Say Y here if you want to have better chance to hot-remove memory
> +         that have been previously hot-added.
> +         Say N here if you want to make all hot-added memory available to
> +         kernel space.
> +
>  # Heavily threaded applications may benefit from splitting the mm-wide
>  # page_table_lock, so that faults on different parts of the user address
>  # space can be handled with less contention: split it at this NR_CPUS.
> --
> 2.20.1
>

Leonardo Bras July 18, 2019, 3:50 p.m. UTC | #4
On Thu, 2019-07-18 at 08:12 +0200, Oscar Salvador wrote:
> We already have the "movable_node" boot option, which has exactly that
> effect.
> Any hotplugged range will be placed in ZONE_MOVABLE.
Oh, I was not aware of it.

> Why do we need yet another option to achieve the same? Was that not
> enough for your case?
Well, another use of this config could be to make this boot option the
default on a given kernel.
But in that case I agree it would be wiser to add the code to
movable_node_is_enabled() directly, and not where I put it.

What do you think about it?
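
A userspace sketch of what that could look like, assuming the real flag
behaves like a boolean that the "movable_node" early parameter sets to
true. CONFIG_MOVABLE_NODE_DEFAULT and parse_cmdline_model() are
hypothetical names used only for illustration; they are not part of this
patch or of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical config symbol that would make movable_node the default. */
#ifdef CONFIG_MOVABLE_NODE_DEFAULT
#define MOVABLE_NODE_DEFAULT true
#else
#define MOVABLE_NODE_DEFAULT false
#endif

/* Model of the flag that movable_node_is_enabled() reports. */
static bool movable_node_enabled = MOVABLE_NODE_DEFAULT;

static bool movable_node_is_enabled(void)
{
	return movable_node_enabled;
}

/*
 * Hypothetical model of command line handling: walk the space-separated
 * tokens and set the flag if one of them is exactly "movable_node".
 */
static void parse_cmdline_model(const char *cmdline)
{
	static const char key[] = "movable_node";
	const size_t key_len = sizeof(key) - 1;

	assert(cmdline != NULL);

	while (*cmdline) {
		size_t len;

		cmdline += strspn(cmdline, " ");	/* skip separators */
		len = strcspn(cmdline, " ");		/* length of this token */
		if (len == key_len && strncmp(cmdline, key, key_len) == 0)
			movable_node_enabled = true;
		cmdline += len;
	}
}
```

In this model the config symbol only changes the initial value of the
flag, so the boot parameter keeps working as before.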

Thanks for the feedback,

Leonardo Brás
Michal Hocko July 18, 2019, 3:57 p.m. UTC | #5
On Thu 18-07-19 12:50:29, Leonardo Bras wrote:
> On Thu, 2019-07-18 at 08:12 +0200, Oscar Salvador wrote:
> > We already have the "movable_node" boot option, which has exactly that
> > effect.
> > Any hotplugged range will be placed in ZONE_MOVABLE.
> Oh, I was not aware of it.
> 
> > Why do we need yet another option to achieve the same? Was that not
> > enough for your case?
> Well, another use of this config could be to make this boot option the
> default on a given kernel.
> But in that case I agree it would be wiser to add the code to
> movable_node_is_enabled() directly, and not where I put it.
> 
> What do you think about it?

No further config options please. We already have a more flexible way
to achieve movable node onlining, so let's use it. Or could you be more
specific about cases which cannot use the command line option and really
need a config option to work around that?
Leonardo Bras July 18, 2019, 4:03 p.m. UTC | #6
On Thu, 2019-07-18 at 08:19 -0400, Pavel Tatashin wrote:
> On Wed, Jul 17, 2019 at 10:42 PM Leonardo Bras <leonardo@linux.ibm.com> wrote:
> > Add a kernel config option that makes hot-added memory come online in
> > ZONE_MOVABLE by default.
> > 
> > This would be useful on systems with MEMORY_HOTPLUG_DEFAULT_ONLINE=y,
> > as it allows choosing the zone into which memory is auto-onlined.
> 
> This is a desired feature. From reading the code it looks to me that
> auto-selection of online method type should be done in
> memory_subsys_online().
> 
> When it is called from device online, mem->online_type should be -1:
> 
> if (mem->online_type < 0)
>      mem->online_type = MMOP_ONLINE_KEEP;
> 
> Change it to:
> if (mem->online_type < 0)
>      mem->online_type = MMOP_DEFAULT_ONLINE_TYPE;
> 
> And in "linux/memory_hotplug.h"
> #ifdef CONFIG_MEMORY_HOTPLUG_MOVABLE
> #define MMOP_DEFAULT_ONLINE_TYPE MMOP_ONLINE_MOVABLE
> #else
> #define MMOP_DEFAULT_ONLINE_TYPE MMOP_ONLINE_KEEP
> #endif
> 
> Could be expanded to support MMOP_ONLINE_KERNEL as well.
> 
> Pasha

Thanks for the suggestions Pasha,

I was made aware there is a kernel boot option "movable_node" that
already creates the behavior I was trying to reproduce.

I was thinking of changing my patch in order to add a config option
that makes this behavior the default (i.e. no need to pass it as a boot
parameter).

Do you think that it would still be a desired feature?

Regards,

Leonardo Brás
Pasha Tatashin July 18, 2019, 4:11 p.m. UTC | #7
On Thu, Jul 18, 2019 at 11:57 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Thu 18-07-19 12:50:29, Leonardo Bras wrote:
> > On Thu, 2019-07-18 at 08:12 +0200, Oscar Salvador wrote:
> > > > We already have the "movable_node" boot option, which has exactly that
> > > > effect.
> > > > Any hotplugged range will be placed in ZONE_MOVABLE.
> > > Oh, I was not aware of it.
> > >
> > > > Why do we need yet another option to achieve the same? Was that not
> > > > enough for your case?
> > > Well, another use of this config could be to make this boot option the
> > > default on a given kernel.
> > > But in that case I agree it would be wiser to add the code to
> > > movable_node_is_enabled() directly, and not where I put it.
> > >
> > > What do you think about it?
> >
> > No further config options please. We already have a more flexible way
> > to achieve movable node onlining, so let's use it. Or could you be more
> > specific about cases which cannot use the command line option and really
> > need a config option to work around that?

Hi Michal,

Just trying to understand: if kernel parameters are the preferred
method, why do we even have

MEMORY_HOTPLUG_DEFAULT_ONLINE?

It is just strange that we have a config to online memory by default
without a kernel parameter, but no way to specify how to online it. It
just looks like an incomplete interface to me. Perhaps this config should
be removed as well?

Pasha
Michal Hocko July 18, 2019, 4:40 p.m. UTC | #8
On Thu 18-07-19 12:11:25, Pavel Tatashin wrote:
> On Thu, Jul 18, 2019 at 11:57 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Thu 18-07-19 12:50:29, Leonardo Bras wrote:
> > > On Thu, 2019-07-18 at 08:12 +0200, Oscar Salvador wrote:
> > > > > We already have the "movable_node" boot option, which has exactly that
> > > > > effect.
> > > > > Any hotplugged range will be placed in ZONE_MOVABLE.
> > > > Oh, I was not aware of it.
> > > >
> > > > > Why do we need yet another option to achieve the same? Was that not
> > > > > enough for your case?
> > > > Well, another use of this config could be to make this boot option the
> > > > default on a given kernel.
> > > > But in that case I agree it would be wiser to add the code to
> > > > movable_node_is_enabled() directly, and not where I put it.
> > > >
> > > > What do you think about it?
> > >
> > > No further config options please. We already have a more flexible way
> > > to achieve movable node onlining, so let's use it. Or could you be more
> > > specific about cases which cannot use the command line option and really
> > > need a config option to work around that?
> 
> Hi Michal,
> 
> Just trying to understand: if kernel parameters are the preferred
> method, why do we even have
> 
> MEMORY_HOTPLUG_DEFAULT_ONLINE?

I have some opinion on this one TBH. I have even tried to remove it. The
config option was added to work around hotplug issues for some
memory ballooning use cases where it was believed that the memory consumed
for the memory hotadd (struct pages) could drive the machine to OOM before
userspace manages to online it. So I would be more than happy to remove
it but there were some objections in the past. Maybe the work by Oscar
to allocate memmaps from the hotplugged memory can finally put an end to
this gross hack.

In any case, I do not think we want to repeat that pattern again.
Pasha Tatashin July 18, 2019, 4:43 p.m. UTC | #9
> > Just trying to understand: if kernel parameters are the preferred
> > method, why do we even have
> >
> > MEMORY_HOTPLUG_DEFAULT_ONLINE?
>
> I have some opinion on this one TBH. I have even tried to remove it. The
> config option was added to work around hotplug issues for some
> memory ballooning use cases where it was believed that the memory consumed
> for the memory hotadd (struct pages) could drive the machine to OOM before
> userspace manages to online it. So I would be more than happy to remove
> it but there were some objections in the past. Maybe the work by Oscar
> to allocate memmaps from the hotplugged memory can finally put an end to
> this gross hack.

Makes sense, thank you for the background info.

Pasha
Pasha Tatashin July 18, 2019, 4:45 p.m. UTC | #10
On Thu, Jul 18, 2019 at 12:04 PM Leonardo Bras <leonardo@linux.ibm.com> wrote:
>
> On Thu, 2019-07-18 at 08:19 -0400, Pavel Tatashin wrote:
> > On Wed, Jul 17, 2019 at 10:42 PM Leonardo Bras <leonardo@linux.ibm.com> wrote:
> > > Add a kernel config option that makes hot-added memory come online in
> > > ZONE_MOVABLE by default.
> > >
> > > This would be useful on systems with MEMORY_HOTPLUG_DEFAULT_ONLINE=y,
> > > as it allows choosing the zone into which memory is auto-onlined.
> >
> > This is a desired feature. From reading the code it looks to me that
> > auto-selection of online method type should be done in
> > memory_subsys_online().
> >
> > When it is called from device online, mem->online_type should be -1:
> >
> > if (mem->online_type < 0)
> >      mem->online_type = MMOP_ONLINE_KEEP;
> >
> > Change it to:
> > if (mem->online_type < 0)
> >      mem->online_type = MMOP_DEFAULT_ONLINE_TYPE;
> >
> > And in "linux/memory_hotplug.h"
> > #ifdef CONFIG_MEMORY_HOTPLUG_MOVABLE
> > #define MMOP_DEFAULT_ONLINE_TYPE MMOP_ONLINE_MOVABLE
> > #else
> > #define MMOP_DEFAULT_ONLINE_TYPE MMOP_ONLINE_KEEP
> > #endif
> >
> > Could be expanded to support MMOP_ONLINE_KERNEL as well.
> >
> > Pasha
>
> Thanks for the suggestions Pasha,
>
> I was made aware there is a kernel boot option "movable_node" that
> already creates the behavior I was trying to reproduce.

I agree with others, no need to duplicate this functionality in a
config, and Michal in a separate e-mail explained the reasons why we
have MEMORY_HOTPLUG_DEFAULT_ONLINE.

>
> I was thinking of changing my patch in order to add a config option
> that makes this behavior the default (i.e. no need to pass it as a boot
> parameter).
>
> Do you think that it would still be a desired feature?
>
> Regards,
>
> Leonardo Brás

Patch

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index f180427e48f4..378b585785c1 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -670,6 +670,9 @@  static int init_memory_block(struct memory_block **memory,
 	mem->state = state;
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	mem->phys_device = arch_get_memory_phys_device(start_pfn);
+#ifdef CONFIG_MEMORY_HOTPLUG_MOVABLE
+	mem->online_type = MMOP_ONLINE_MOVABLE;
+#endif
 
 	ret = register_memory(mem);
 
diff --git a/mm/Kconfig b/mm/Kconfig
index f0c76ba47695..74e793720f43 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -180,6 +180,20 @@  config MEMORY_HOTREMOVE
 	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
 	depends on MIGRATION
 
+config MEMORY_HOTPLUG_MOVABLE
+	bool "Enhance the likelihood of hot-remove"
+	depends on MEMORY_HOTREMOVE
+	help
+	  This option sets the hot-added memory zone to MOVABLE which
+	  drastically reduces the chance of a hot-remove to fail due to
+	  unmovable memory segments. Kernel memory can't be allocated in
+	  this zone.
+
+	  Say Y here if you want to have better chance to hot-remove memory
+	  that have been previously hot-added.
+	  Say N here if you want to make all hot-added memory available to
+	  kernel space.
+
 # Heavily threaded applications may benefit from splitting the mm-wide
 # page_table_lock, so that faults on different parts of the user address
 # space can be handled with less contention: split it at this NR_CPUS.