[for,4.19-stable,00/25] mm/memory_hotplug: backport of pending stable fixes

Message ID 20200115153339.36409-1-david@redhat.com (mailing list archive)

Message

David Hildenbrand Jan. 15, 2020, 3:33 p.m. UTC
This is the backport of the following fixes for 4.19-stable:

- a31b264c2b41 ("mm/memory_hotplug: make
  unregister_memory_block_under_nodes() never fail")
-- Turned out to not only be a cleanup but also a fix
- 2c91f8fc6c99 ("mm/memory_hotplug: fix try_offline_node()")
-- Automatic stable backport failed due to missing dependencies.
- feee6b298916 ("mm/memory_hotplug: shrink zones when offlining memory")
-- Was marked as stable 5.0+ due to the backport complexity, but it's also
   relevant for 4.19/4.14. As I have to backport quite some cleanups
   already ...

To minimize manual code changes, I decided to pull in quite some cleanups.
Still some manual code changes are necessary (indicated in the individual
patches). Especially missing arm64 hot(un)plug, missing sub-section hotadd
support, and missing unification of mm/hmm.c and kernel/memremap.c require
care.

Due to:
- 4e0d2e7ef14d ("mm, sparse: pass nid instead of pgdat to
  sparse_add_one_section()")
I need:
- afe9b36ca890 ("mm/memunmap: don't access uninitialized memmap in
  memunmap_pages()")

Please note that:
- 4c4b7f9ba948 ("mm/memory_hotplug: remove memory block devices
  before arch_remove_memory()")
Makes big (e.g., 32TB) machines boot up slower (e.g., 2h vs 10m). There is
a performance fix in linux-next, but it does not seem to classify as a
fix for current RC / stable.

I did quite some testing with hot(un)plug, onlining/offlining of memory
blocks and memory-less/CPU-less NUMA nodes under x86_64 - the same set of
tests I run against upstream on a fairly regular basis. I compile-tested
on PowerPC. I did not test any ZONE_DEVICE/HMM thingies.

Let's see what people think - it's a lot of patches. If we want this,
then I can try to prepare a similar set for 4.4-stable.

CCing only some people to minimize noise.

Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Baoquan He <bhe@redhat.com>

David Hildenbrand (25):
  mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section()
  mm, sparse: pass nid instead of pgdat to sparse_add_one_section()
  drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS
  mm, memory_hotplug: add nid parameter to arch_remove_memory
  mm/memory_hotplug: release memory resource after arch_remove_memory()
  drivers/base/memory.c: clean up relics in function parameters
  mm, memory_hotplug: update a comment in unregister_memory()
  mm/memory_hotplug: make unregister_memory_section() never fail
  mm/memory_hotplug: make __remove_section() never fail
  powerpc/mm: Fix section mismatch warning
  powerpc/mm: move warning from resize_hpt_for_hotplug()
  mm/memory_hotplug: make __remove_pages() and arch_remove_memory()
    never fail
  s390x/mm: implement arch_remove_memory()
  mm/memory_hotplug: allow arch_remove_memory() without
    CONFIG_MEMORY_HOTREMOVE
  drivers/base/memory: pass a block_id to init_memory_block()
  mm/memory_hotplug: create memory block devices after arch_add_memory()
  mm/memory_hotplug: remove memory block devices before
    arch_remove_memory()
  mm/memory_hotplug: make unregister_memory_block_under_nodes() never
    fail
  mm/memory_hotplug: remove "zone" parameter from
    sparse_remove_one_section
  mm/hotplug: kill is_dev_zone() usage in __remove_pages()
  drivers/base/node.c: simplify unregister_memory_block_under_nodes()
  mm/memunmap: don't access uninitialized memmap in memunmap_pages()
  mm/memory_hotplug: fix try_offline_node()
  mm/memory_hotplug: shrink zones when offlining memory

 arch/ia64/mm/init.c                           |  15 +-
 arch/powerpc/include/asm/sparsemem.h          |   4 +-
 arch/powerpc/mm/hash_utils_64.c               |  19 +-
 arch/powerpc/mm/mem.c                         |  28 +--
 arch/powerpc/platforms/powernv/memtrace.c     |   2 +-
 .../platforms/pseries/hotplug-memory.c        |   6 +-
 arch/powerpc/platforms/pseries/lpar.c         |   3 +-
 arch/s390/mm/init.c                           |  18 +-
 arch/sh/mm/init.c                             |  15 +-
 arch/x86/mm/init_32.c                         |   9 +-
 arch/x86/mm/init_64.c                         |  17 +-
 drivers/acpi/acpi_memhotplug.c                |   2 +-
 drivers/base/memory.c                         | 203 +++++++++++-------
 drivers/base/node.c                           |  52 ++---
 include/linux/memory.h                        |   8 +-
 include/linux/memory_hotplug.h                |  22 +-
 include/linux/mmzone.h                        |   3 +-
 include/linux/node.h                          |   7 +-
 kernel/memremap.c                             |  13 +-
 mm/hmm.c                                      |   8 +-
 mm/memory_hotplug.c                           | 166 +++++++-------
 mm/sparse.c                                   |  27 +--
 22 files changed, 318 insertions(+), 329 deletions(-)

Comments

Greg Kroah-Hartman Jan. 15, 2020, 3:39 p.m. UTC | #1
On Wed, Jan 15, 2020 at 04:33:14PM +0100, David Hildenbrand wrote:
> This is the backport of the following fixes for 4.19-stable:
> 
> - a31b264c2b41 ("mm/memory_hotplug: make
>   unregister_memory_block_under_nodes() never fail")
> -- Turned out to not only be a cleanup but also a fix
> - 2c91f8fc6c99 ("mm/memory_hotplug: fix try_offline_node()")
> -- Automatic stable backport failed due to missing dependencies.
> - feee6b298916 ("mm/memory_hotplug: shrink zones when offlining memory")
> -- Was marked as stable 5.0+ due to the backport complexity, but it's also
>    relevant for 4.19/4.14. As I have to backport quite some cleanups
>    already ...
> 
> To minimize manual code changes, I decided to pull in quite some cleanups.
> Still some manual code changes are necessary (indicated in the individual
> patches). Especially missing arm64 hot(un)plug, missing sub-section hotadd
> support, and missing unification of mm/hmm.c and kernel/memremap.c require
> care.
> 
> Due to:
> - 4e0d2e7ef14d ("mm, sparse: pass nid instead of pgdat to
>   sparse_add_one_section()")
> I need:
> - afe9b36ca890 ("mm/memunmap: don't access uninitialized memmap in
>   memunmap_pages()")
> 
> Please note that:
> - 4c4b7f9ba948 ("mm/memory_hotplug: remove memory block devices
>   before arch_remove_memory()")
> Makes big (e.g., 32TB) machines boot up slower (e.g., 2h vs 10m). There is
> a performance fix in linux-next, but it does not seem to classify as a
> fix for current RC / stable.
> 
> I did quite some testing with hot(un)plug, onlining/offlining of memory
> blocks and memory-less/CPU-less NUMA nodes under x86_64 - the same set of
> tests I run against upstream on a fairly regular basis. I compile-tested
> on PowerPC. I did not test any ZONE_DEVICE/HMM thingies.
> 
> Let's see what people think - it's a lot of patches. If we want this,
> then I can try to prepare a similar set for 4.4-stable.

What bug(s) are these trying to fix here?

And why would 4.9 and 4.4 care about them?

thanks,

greg k-h

David Hildenbrand Jan. 15, 2020, 3:54 p.m. UTC | #2
On 15.01.20 16:39, Greg Kroah-Hartman wrote:
> On Wed, Jan 15, 2020 at 04:33:14PM +0100, David Hildenbrand wrote:
>> This is the backport of the following fixes for 4.19-stable:
>>
>> - a31b264c2b41 ("mm/memory_hotplug: make
>>   unregister_memory_block_under_nodes() never fail")
>> -- Turned out to not only be a cleanup but also a fix

Took the wrong one. It's d84f2f5a7552 ("drivers/base/node.c: simplify
unregister_memory_block_under_nodes()")

>> - 2c91f8fc6c99 ("mm/memory_hotplug: fix try_offline_node()")
>> -- Automatic stable backport failed due to missing dependencies.
>> - feee6b298916 ("mm/memory_hotplug: shrink zones when offlining memory")
>> -- Was marked as stable 5.0+ due to the backport complexity, but it's also
>>    relevant for 4.19/4.14. As I have to backport quite some cleanups
>>    already ...
>>
>> To minimize manual code changes, I decided to pull in quite some cleanups.
>> Still some manual code changes are necessary (indicated in the individual
>> patches). Especially missing arm64 hot(un)plug, missing sub-section hotadd
>> support, and missing unification of mm/hmm.c and kernel/memremap.c require
>> care.
>>
>> Due to:
>> - 4e0d2e7ef14d ("mm, sparse: pass nid instead of pgdat to
>>   sparse_add_one_section()")
>> I need:
>> - afe9b36ca890 ("mm/memunmap: don't access uninitialized memmap in
>>   memunmap_pages()")
>>
>> Please note that:
>> - 4c4b7f9ba948 ("mm/memory_hotplug: remove memory block devices
>>   before arch_remove_memory()")
>> Makes big (e.g., 32TB) machines boot up slower (e.g., 2h vs 10m). There is
>> a performance fix in linux-next, but it does not seem to classify as a
>> fix for current RC / stable.
>>
>> I did quite some testing with hot(un)plug, onlining/offlining of memory
>> blocks and memory-less/CPU-less NUMA nodes under x86_64 - the same set of
>> tests I run against upstream on a fairly regular basis. I compile-tested
>> on PowerPC. I did not test any ZONE_DEVICE/HMM thingies.
>>
>> Let's see what people think - it's a lot of patches. If we want this,
>> then I can try to prepare a similar set for 4.4-stable.
> 
> What bug(s) are these trying to fix here?

All of them tackle crashes on memory unplug, especially when the memory
was never onlined (or onlining failed). Accessing the garbage memmaps of
such memory crashes the kernel (e.g., because the derived pgdat pointer
is broken).


d84f2f5a7552 ("drivers/base/node.c: simplify
unregister_memory_block_under_nodes()")

->
https://lore.kernel.org/linux-mm/b2e31976-b07d-11e6-f806-f13f4619be4d@redhat.com/

"If the memory we are removing was never onlined,
get_nid_for_pfn()->pfn_to_nid() will return garbage. Removing will
succeed but links will remain in place. [...] We will trigger the
BUG_ON(ret) in add_memory_resource(), because
link_mem_sections() will return with -EEXIST."
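
The failure mode quoted above can be illustrated with a small userspace
model. This is only a sketch, not kernel code: struct toy_block and all
toy_*() helpers are hypothetical names, TOY_GARBAGE stands in for an
uninitialized memmap entry, and using a node id stored at add time is an
assumption about how the lookup can be made independent of the memmap.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_NODES 4
#define TOY_EEXIST    17      /* same numeric value as EEXIST on Linux */
#define TOY_GARBAGE   (-1)    /* stands in for an uninitialized memmap entry */

/* One memory block in the toy model. */
struct toy_block {
    int  stored_nid;              /* recorded when the block is added */
    int  memmap_nid;              /* only meaningful after onlining */
    bool link[TOY_MAX_NODES];     /* models the node <-> block links */
};

/* Hot-add: create the link; the memmap stays uninitialized until onlining. */
int toy_add(struct toy_block *b, int nid)
{
    if (b->link[nid])
        return -TOY_EEXIST;       /* stale link left over from a removal */
    b->link[nid] = true;
    b->stored_nid = nid;
    b->memmap_nid = TOY_GARBAGE;
    return 0;
}

/* Onlining is the step that initializes the memmap in this model. */
void toy_online(struct toy_block *b)
{
    b->memmap_nid = b->stored_nid;
}

/* Problematic variant: derive the node from the (possibly garbage) memmap. */
void toy_remove_via_memmap(struct toy_block *b)
{
    int nid = b->memmap_nid;

    if (nid < 0 || nid >= TOY_MAX_NODES)
        return;                   /* garbage: nothing is unlinked */
    b->link[nid] = false;
}

/* Robust variant: never look at the memmap, use the stored node id. */
void toy_remove_via_stored_nid(struct toy_block *b)
{
    b->link[b->stored_nid] = false;
}
```

Adding a zero-initialized block to node 1, removing it with
toy_remove_via_memmap() without onlining it first, and adding it again
makes the second toy_add() return -TOY_EEXIST in this model, which mirrors
the stale-link situation described in the quote.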


2c91f8fc6c99 ("mm/memory_hotplug: fix try_offline_node()")

We might access garbage memmaps on memory unplug and trigger a crash on
memory unplug, when trying to offline the node.


feee6b298916 ("mm/memory_hotplug: shrink zones when offlining memory")

Memory unplug will access garbage memmaps (resulting in crashes) and the
zones might not get fixed up properly. Relevant when memory was never
onlined, when memory blocks of a DIMM were onlined to different zones,
or when memory blocks were re-onlined to different zones.
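
A rough userspace model of the idea behind this patch, under stated
assumptions: zone spans are adjusted when memory is onlined and offlined,
so removing memory never needs to consult a memmap. All names (struct
toy_mem, toy_online(), toy_offline(), ...) are hypothetical, spans are
counted in whole memory blocks, and the model ignores everything else the
real code has to handle.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_BLOCKS 8

enum { TOY_ZONE_NONE = -1, TOY_ZONE_NORMAL, TOY_ZONE_MOVABLE, TOY_NR_ZONES };

struct toy_zone {
    int start;      /* first block spanned by the zone */
    int spanned;    /* number of blocks spanned, 0 if the zone is empty */
};

struct toy_mem {
    bool present[TOY_NR_BLOCKS];   /* block was hot-added */
    int  zone_of[TOY_NR_BLOCKS];   /* zone while online, else TOY_ZONE_NONE */
    struct toy_zone zones[TOY_NR_ZONES];
};

void toy_init(struct toy_mem *m)
{
    for (int i = 0; i < TOY_NR_BLOCKS; i++) {
        m->present[i] = false;
        m->zone_of[i] = TOY_ZONE_NONE;
    }
    for (int z = 0; z < TOY_NR_ZONES; z++)
        m->zones[z] = (struct toy_zone){ 0, 0 };
}

/* Recompute a zone's span from the blocks currently online in it. */
static void toy_resize_zone(struct toy_mem *m, int zid)
{
    int first = -1, last = -1;

    for (int i = 0; i < TOY_NR_BLOCKS; i++) {
        if (m->zone_of[i] != zid)
            continue;
        if (first < 0)
            first = i;
        last = i;
    }
    m->zones[zid].start = first < 0 ? 0 : first;
    m->zones[zid].spanned = first < 0 ? 0 : last - first + 1;
}

/* Hot-add does not touch any zone. */
void toy_add(struct toy_mem *m, int blk)
{
    m->present[blk] = true;
}

void toy_online(struct toy_mem *m, int blk, int zid)
{
    m->zone_of[blk] = zid;
    toy_resize_zone(m, zid);
}

/* Offlining is where the zone shrinks. */
void toy_offline(struct toy_mem *m, int blk)
{
    int zid = m->zone_of[blk];

    if (zid == TOY_ZONE_NONE)
        return;
    m->zone_of[blk] = TOY_ZONE_NONE;
    toy_resize_zone(m, zid);
}

/* Removal only requires the block to be offline; no zone lookup needed. */
void toy_remove(struct toy_mem *m, int blk)
{
    assert(m->zone_of[blk] == TOY_ZONE_NONE);
    m->present[blk] = false;
}
```

In this model a block that was added but never onlined can be removed
without any zone being inspected or changed, and blocks of one DIMM that
were onlined to different zones each shrink their own zone on offline.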


This backports the remaining "don't access uninitialized memmaps"-like
fixes. The other ones were already backported.

> 
> And why would 4.9 and 4.4 care about them?

The crashes can be triggered under 4.9 and 4.4. If we decide that we do
not care, then this series can be dropped.

Greg Kroah-Hartman Jan. 16, 2020, 8:34 a.m. UTC | #3
On Wed, Jan 15, 2020 at 04:54:59PM +0100, David Hildenbrand wrote:
> > 
> > And why would 4.9 and 4.4 care about them?
> 
> The crashes can be triggered under 4.9 and 4.4. If we decide that we do
> not care, then this series can be dropped.

Do we have users of memory hotplug that are somehow stuck at those old
versions that can not upgrade?  Obviously this didn't work previously
for them, so moving to a modern kernel might be a good reason to get
this new feature :)

thanks,

greg k-h

David Hildenbrand Jan. 16, 2020, 8:42 a.m. UTC | #4
On 16.01.20 09:34, Greg Kroah-Hartman wrote:
> On Wed, Jan 15, 2020 at 04:54:59PM +0100, David Hildenbrand wrote:
>>>
>>> And why would 4.9 and 4.4 care about them?
>>
>> The crashes can be triggered under 4.9 and 4.4. If we decide that we do
>> not care, then this series can be dropped.
> 
> Do we have users of memory hotplug that are somehow stuck at those old
> versions that can not upgrade?  Obviously this didn't work previously
> for them, so moving to a modern kernel might be a good reason to get
> this new feature :)

That's a good point - but usually when you experience a crash it's too
late for you to realize that you have to move to a newer release :) It
used to work before 4.4 IIRC.

(one case I am concerned with is when memory onlining after memory
hotplug failed (e.g., because there was an OOM event happening
concurrently) - then memory hotunplug will crash your system.)

But yeah, I am not aware of a report where somebody actually hit any of
these issues on a stable kernel.

Greg Kroah-Hartman Jan. 16, 2020, 8:54 a.m. UTC | #5
On Thu, Jan 16, 2020 at 09:42:51AM +0100, David Hildenbrand wrote:
> On 16.01.20 09:34, Greg Kroah-Hartman wrote:
> > On Wed, Jan 15, 2020 at 04:54:59PM +0100, David Hildenbrand wrote:
> >>>
> >>> And why would 4.9 and 4.4 care about them?
> >>
> >> The crashes can be triggered under 4.9 and 4.4. If we decide that we do
> >> not care, then this series can be dropped.
> > 
> > Do we have users of memory hotplug that are somehow stuck at those old
> > versions that can not upgrade?  Obviously this didn't work previously
> > for them, so moving to a modern kernel might be a good reason to get
> > this new feature :)
> 
> That's a good point - but usually when you experience a crash it's too
> late for you to realize that you have to move to a newer release :) It
> used to work before 4.4 IIRC.
> 
> (one case I am concerned with is when memory onlining after memory
> hotplug failed (e.g., because there was an OOM event happening
> concurrently) - then memory hotunplug will crash your system.)
> 
> But yeah, I am not aware of a report where somebody actually hit any of
> these issues on a stable kernel.

Ok, let's start with 4.19 and 4.14 for these for now.  Should make
things easier, right?

thanks,

greg k-h

David Hildenbrand Jan. 16, 2020, 8:59 a.m. UTC | #6
On 16.01.20 09:54, Greg Kroah-Hartman wrote:
> On Thu, Jan 16, 2020 at 09:42:51AM +0100, David Hildenbrand wrote:
>> On 16.01.20 09:34, Greg Kroah-Hartman wrote:
>>> On Wed, Jan 15, 2020 at 04:54:59PM +0100, David Hildenbrand wrote:
>>>>>
>>>>> And why would 4.9 and 4.4 care about them?
>>>>
>>>> The crashes can be triggered under 4.9 and 4.4. If we decide that we do
>>>> not care, then this series can be dropped.
>>>
>>> Do we have users of memory hotplug that are somehow stuck at those old
>>> versions that can not upgrade?  Obviously this didn't work previously
>>> for them, so moving to a modern kernel might be a good reason to get
>>> this new feature :)
>>
>> That's a good point - but usually when you experience a crash it's too
>> late for you to realize that you have to move to a newer release :) It
>> used to work before 4.4 IIRC.
>>
>> (one case I am concerned with is when memory onlining after memory
>> hotplug failed (e.g., because there was an OOM event happening
>> concurrently) - then memory hotunplug will crash your system.)
>>
>> But yeah, I am not aware of a report where somebody actually hit any of
>> these issues on a stable kernel.

Just to clarify: I can reproduce them of course :)

> 
> Ok, let's start with 4.19 and 4.14 for these for now.  Should make
> things easier, right?

What do you mean with "start with"? Drop this series and not do the
backport, meaning people should switch to a stable kernel > 4.19 if they
don't want surprises on memory unplug?

Greg Kroah-Hartman Jan. 16, 2020, 9:26 a.m. UTC | #7
On Thu, Jan 16, 2020 at 09:59:44AM +0100, David Hildenbrand wrote:
> On 16.01.20 09:54, Greg Kroah-Hartman wrote:
> > On Thu, Jan 16, 2020 at 09:42:51AM +0100, David Hildenbrand wrote:
> >> On 16.01.20 09:34, Greg Kroah-Hartman wrote:
> >>> On Wed, Jan 15, 2020 at 04:54:59PM +0100, David Hildenbrand wrote:
> >>>>>
> >>>>> And why would 4.9 and 4.4 care about them?
> >>>>
> >>>> The crashes can be triggered under 4.9 and 4.4. If we decide that we do
> >>>> not care, then this series can be dropped.
> >>>
> >>> Do we have users of memory hotplug that are somehow stuck at those old
> >>> versions that can not upgrade?  Obviously this didn't work previously
> >>> for them, so moving to a modern kernel might be a good reason to get
> >>> this new feature :)
> >>
> >> That's a good point - but usually when you experience a crash it's too
> >> late for you to realize that you have to move to a newer release :) It
> >> used to work before 4.4 IIRC.
> >>
> >> (one case I am concerned with is when memory onlining after memory
> >> hotplug failed (e.g., because there was an OOM event happening
> >> concurrently) - then memory hotunplug will crash your system.)
> >>
> >> But yeah, I am not aware of a report where somebody actually hit any of
> >> these issues on a stable kernel.
> 
> Just to clarify: I can reproduce them of course :)
> 
> > 
> > Ok, let's start with 4.19 and 4.14 for these for now.  Should make
> > things easier, right?
> 
> What do you mean with "start with"? Drop this series and not do the
> backport, meaning people should switch to a stable kernel > 4.19 if they
> don't want surprises on memory unplug?

No, I'm saying I want to take this for 4.19, and 4.14 if you have it.

But your original series you sent needs to be fixed up, I can't take it
as-is for the authorship reasons.

thanks,

greg k-h

David Hildenbrand Jan. 16, 2020, 9:35 a.m. UTC | #8
On 16.01.20 10:26, Greg Kroah-Hartman wrote:
> On Thu, Jan 16, 2020 at 09:59:44AM +0100, David Hildenbrand wrote:
>> On 16.01.20 09:54, Greg Kroah-Hartman wrote:
>>> On Thu, Jan 16, 2020 at 09:42:51AM +0100, David Hildenbrand wrote:
>>>> On 16.01.20 09:34, Greg Kroah-Hartman wrote:
>>>>> On Wed, Jan 15, 2020 at 04:54:59PM +0100, David Hildenbrand wrote:
>>>>>>>
>>>>>>> And why would 4.9 and 4.4 care about them?
>>>>>>
>>>>>> The crashes can be triggered under 4.9 and 4.4. If we decide that we do
>>>>>> not care, then this series can be dropped.
>>>>>
>>>>> Do we have users of memory hotplug that are somehow stuck at those old
>>>>> versions that can not upgrade?  Obviously this didn't work previously
>>>>> for them, so moving to a modern kernel might be a good reason to get
>>>>> this new feature :)
>>>>
>>>> That's a good point - but usually when you experience a crash it's too
>>>> late for you to realize that you have to move to a newer release :) It
>>>> used to work before 4.4 IIRC.
>>>>
>>>> (one case I am concerned with is when memory onlining after memory
>>>> hotplug failed (e.g., because there was an OOM event happening
>>>> concurrently) - then memory hotunplug will crash your system.)
>>>>
>>>> But yeah, I am not aware of a report where somebody actually hit any of
>>>> these issues on a stable kernel.
>>
>> Just to clarify: I can reproduce them of course :)
>>
>>>
>>> Ok, let's start with 4.19 and 4.14 for these for now.  Should make
>>> things easier, right?
>>
>> What do you mean with "start with"? Drop this series and not do the
>> backport, meaning people should switch to a stable kernel > 4.19 if they
>> don't want surprises on memory unplug?
> 
> No, I'm saying I want to take this for 4.19, and 4.14 if you have it.
> 
> But your original series you sent needs to be fixed up, I can't take it
> as-is for the authorship reasons.

Got it, will fix that up and resend!

Cheers!

David Hildenbrand Jan. 16, 2020, 2:32 p.m. UTC | #9
On 16.01.20 10:26, Greg Kroah-Hartman wrote:
> On Thu, Jan 16, 2020 at 09:59:44AM +0100, David Hildenbrand wrote:
>> On 16.01.20 09:54, Greg Kroah-Hartman wrote:
>>> On Thu, Jan 16, 2020 at 09:42:51AM +0100, David Hildenbrand wrote:
>>>> On 16.01.20 09:34, Greg Kroah-Hartman wrote:
>>>>> On Wed, Jan 15, 2020 at 04:54:59PM +0100, David Hildenbrand wrote:
>>>>>>>
>>>>>>> And why would 4.9 and 4.4 care about them?
>>>>>>
>>>>>> The crashes can be triggered under 4.9 and 4.4. If we decide that we do
>>>>>> not care, then this series can be dropped.
>>>>>
>>>>> Do we have users of memory hotplug that are somehow stuck at those old
>>>>> versions that can not upgrade?  Obviously this didn't work previously
>>>>> for them, so moving to a modern kernel might be a good reason to get
>>>>> this new feature :)
>>>>
>>>> That's a good point - but usually when you experience a crash it's too
>>>> late for you to realize that you have to move to a newer release :) It
>>>> used to work before 4.4 IIRC.
>>>>
>>>> (one case I am concerned with is when memory onlining after memory
>>>> hotplug failed (e.g., because there was an OOM event happening
>>>> concurrently) - then memory hotunplug will crash your system.)
>>>>
>>>> But yeah, I am not aware of a report where somebody actually hit any of
>>>> these issues on a stable kernel.
>>
>> Just to clarify: I can reproduce them of course :)
>>
>>>
>>> Ok, let's start with 4.19 and 4.14 for these for now.  Should make
>>> things easier, right?
>>
>> What do you mean with "start with"? Drop this series and not do the
>> backport, meaning people should switch to a stable kernel > 4.19 if they
>> don't want surprises on memory unplug?
> 
> No, I'm saying I want to take this for 4.19, and 4.14 if you have it.

Minor correction: I meant 4.19 and 4.14, not 4.4 :/ Sorry for the
confusion. Will try to prepare the 4.14 backports as well.