[v1] s390: drop memory notifier for protecting kdump crash kernel area

Message ID 20200424081218.6919-1-david@redhat.com (mailing list archive)
State New, archived

Commit Message

David Hildenbrand April 24, 2020, 8:12 a.m. UTC
Assume we have a crashkernel area of 256MB reserved:

root@vm0:~# cat /proc/iomem
00000000-6fffffff : System RAM
  0f258000-0fcfffff : Kernel code
  0fd00000-101d10e3 : Kernel data
  105b3000-1068dfff : Kernel bss
70000000-7fffffff : Crash kernel

This exactly corresponds to memory block 7 (memory block size is 256MB).
Trying to offline that memory block results in:

root@vm0:~# echo "offline" > /sys/devices/system/memory/memory7/state
-bash: echo: write error: Device or resource busy

[  128.458762] page:000003d081c00000 refcount:1 mapcount:0 mapping:00000000d01cecd4 index:0x0
[  128.458773] flags: 0x1ffff00000001000(reserved)
[  128.458781] raw: 1ffff00000001000 000003d081c00008 000003d081c00008 0000000000000000
[  128.458781] raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000
[  128.458783] page dumped because: unmovable page
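
As a quick check of the "memory block 7" arithmetic above, here is a minimal
userspace sketch. It assumes the 256MB block size stated in the example; the
helper names are hypothetical and are not kernel APIs.

```c
#include <stdint.h>

/* Assumption taken from the example above: memory blocks are 256MB. */
#define MEMORY_BLOCK_SIZE (256ULL << 20)

/* Index of the memory block that contains a physical address. */
static inline uint64_t memory_block_id(uint64_t phys_addr)
{
	return phys_addr / MEMORY_BLOCK_SIZE;
}

/* Nonzero if the inclusive range [start, end] is exactly one whole block. */
static inline int is_exactly_one_block(uint64_t start, uint64_t end)
{
	return start % MEMORY_BLOCK_SIZE == 0 &&
	       end - start + 1 == MEMORY_BLOCK_SIZE;
}
```

For the "Crash kernel" resource 70000000-7fffffff, memory_block_id() yields 7
for both the first and the last byte, and is_exactly_one_block() holds, which
is why offlining memory7 targets precisely the crashkernel area.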

The crashkernel area is marked reserved in the bootmem allocator. This
results in the memmap getting initialized (refcount=1, PG_reserved), but
the pages are never freed to the page allocator.

So these pages look like allocated pages that are unmovable (esp.
PG_reserved), and therefore, memory offlining fails early, when trying to
isolate the page range.

We don't need a special memory notifier and can drop it. Repeating the
above test with this patch results in the same behavior.

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Philipp Rudo <prudo@linux.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/kernel/setup.c | 33 ---------------------------------
 1 file changed, 33 deletions(-)

Comments

David Hildenbrand April 24, 2020, 8:18 a.m. UTC | #1
On 24.04.20 10:12, David Hildenbrand wrote:
> Assume we have a crashkernel area of 256MB reserved:
> 
> root@vm0:~# cat /proc/iomem
> 00000000-6fffffff : System RAM
>   0f258000-0fcfffff : Kernel code
>   0fd00000-101d10e3 : Kernel data
>   105b3000-1068dfff : Kernel bss
> 70000000-7fffffff : Crash kernel
> 
> This exactly corresponds to memory block 7 (memory block size is 256MB).
> Trying to offline that memory block results in:
> 
> root@vm0:~# echo "offline" > /sys/devices/system/memory/memory7/state
> -bash: echo: write error: Device or resource busy
> 
> [  128.458762] page:000003d081c00000 refcount:1 mapcount:0 mapping:00000000d01cecd4 index:0x0
> [  128.458773] flags: 0x1ffff00000001000(reserved)
> [  128.458781] raw: 1ffff00000001000 000003d081c00008 000003d081c00008 0000000000000000
> [  128.458781] raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000
> [  128.458783] page dumped because: unmovable page
> 
> The crashkernel area is marked reserved in the bootmem allocator. This
> results in the memmap getting initialized (refcount=1, PG_reserved), but
> the pages are never freed to the page allocator.
> 
> So these pages look like allocated pages that are unmovable (esp.
> PG_reserved), and therefore, memory offlining fails early, when trying to
> isolate the page range.
> 
> We don't need a special memory notifier and can drop it. Repeating the
> above test with this patch results in the same behavior.
> 
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Philipp Rudo <prudo@linux.ibm.com>
> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/kernel/setup.c | 33 ---------------------------------
>  1 file changed, 33 deletions(-)
> 
> diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
> index 0f0b140b5558..95d4fba0d811 100644
> --- a/arch/s390/kernel/setup.c
> +++ b/arch/s390/kernel/setup.c
> @@ -39,7 +39,6 @@
>  #include <linux/kernel_stat.h>
>  #include <linux/dma-contiguous.h>
>  #include <linux/device.h>
> -#include <linux/notifier.h>
>  #include <linux/pfn.h>
>  #include <linux/ctype.h>
>  #include <linux/reboot.h>
> @@ -591,35 +590,6 @@ static void __init setup_memory_end(void)
>  	pr_notice("The maximum memory size is %luMB\n", memory_end >> 20);
>  }
>  
> -#ifdef CONFIG_CRASH_DUMP
> -
> -/*
> - * When kdump is enabled, we have to ensure that no memory from
> - * the area [0 - crashkernel memory size] and
> - * [crashk_res.start - crashk_res.end] is set offline.
> - */

Re-reading that comment, I missed the [0 - crashkernel memory size]
part (for relocation IIRC). So we might want to keep checking for [0 -
crashkernel memory size] - will double check.
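
To make the two protected ranges concrete, here is a userspace model of the
checks in kdump_mem_notifier(). It is only a sketch: PAGE_SHIFT of 12 (4KB
pages) is an assumption, and struct range and kdump_would_veto_offline() are
hypothetical stand-ins for struct resource and the notifier's return value.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12				/* assumed: 4KB pages */
#define PFN_DOWN(x) ((uint64_t)(x) >> PAGE_SHIFT)

/* Inclusive address range, standing in for struct resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Mirrors the notifier: true corresponds to NOTIFY_BAD, false to NOTIFY_OK. */
static bool kdump_would_veto_offline(const struct range *crashk,
				     uint64_t start_pfn, uint64_t nr_pages)
{
	uint64_t size = crashk->end - crashk->start + 1;

	/* Range starts inside [0 - crashkernel memory size]. */
	if (start_pfn < PFN_DOWN(size))
		return true;
	/* Range starts after the crashkernel area. */
	if (start_pfn > PFN_DOWN(crashk->end))
		return false;
	/* Range ends before the crashkernel area. */
	if (start_pfn + nr_pages - 1 < PFN_DOWN(crashk->start))
		return false;
	/* Otherwise the range overlaps the crashkernel area. */
	return true;
}
```

With the layout from the commit message (crashkernel at 70000000-7fffffff,
256MB blocks), this model vetoes offlining memory0 and memory7 and allows
memory1 and memory8. memory7 is also rejected without the notifier because its
pages are PG_reserved; memory0 is rejected only by the first check, which is
the case that needs double checking.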
Patch

diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 0f0b140b5558..95d4fba0d811 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -39,7 +39,6 @@ 
 #include <linux/kernel_stat.h>
 #include <linux/dma-contiguous.h>
 #include <linux/device.h>
-#include <linux/notifier.h>
 #include <linux/pfn.h>
 #include <linux/ctype.h>
 #include <linux/reboot.h>
@@ -591,35 +590,6 @@  static void __init setup_memory_end(void)
 	pr_notice("The maximum memory size is %luMB\n", memory_end >> 20);
 }
 
-#ifdef CONFIG_CRASH_DUMP
-
-/*
- * When kdump is enabled, we have to ensure that no memory from
- * the area [0 - crashkernel memory size] and
- * [crashk_res.start - crashk_res.end] is set offline.
- */
-static int kdump_mem_notifier(struct notifier_block *nb,
-			      unsigned long action, void *data)
-{
-	struct memory_notify *arg = data;
-
-	if (action != MEM_GOING_OFFLINE)
-		return NOTIFY_OK;
-	if (arg->start_pfn < PFN_DOWN(resource_size(&crashk_res)))
-		return NOTIFY_BAD;
-	if (arg->start_pfn > PFN_DOWN(crashk_res.end))
-		return NOTIFY_OK;
-	if (arg->start_pfn + arg->nr_pages - 1 < PFN_DOWN(crashk_res.start))
-		return NOTIFY_OK;
-	return NOTIFY_BAD;
-}
-
-static struct notifier_block kdump_mem_nb = {
-	.notifier_call = kdump_mem_notifier,
-};
-
-#endif
-
 /*
  * Make sure that the area behind memory_end is protected
  */
@@ -703,9 +673,6 @@  static void __init reserve_crashkernel(void)
 		return;
 	}
 
-	if (register_memory_notifier(&kdump_mem_nb))
-		return;
-
 	if (!OLDMEM_BASE && MACHINE_IS_VM)
 		diag10_range(PFN_DOWN(crash_base), PFN_DOWN(crash_size));
 	crashk_res.start = crash_base;