diff mbox series

[V2] arm64/hotplug: Improve memory offline event notifier

Message ID 1598241869-28416-1-git-send-email-anshuman.khandual@arm.com (mailing list archive)
State New, archived
Headers show
Series [V2] arm64/hotplug: Improve memory offline event notifier | expand

Commit Message

Anshuman Khandual Aug. 24, 2020, 4:04 a.m. UTC
This brings about three different changes to the sole memory event notifier
for arm64 platform and improves it's robustness while also enhancing debug
capabilities during potential memory offlining error conditions.

This moves the memory notifier registration bit earlier in the boot process
from device_initcall() to setup_arch() which will help in guarding against
potential early boot memory offline requests.

This enables MEM_OFFLINE memory event handling. It will help intercept any
possible error condition such as if boot memory some how still got offlined
even after an expilicit notifier failure, potentially by a future change in
generic hotplug framework. This would help detect such scenarious and help
debug further.

It also adds a validation function which scans entire boot memory and makes
sure that early memory sections are online. This check is essential for the
memory notifier to work properly as it cannot prevent boot memory offlining
if they are not online to begin with. But this additional sanity check is
enabled only with DEBUG_VM.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
This applies on 5.9-rc2

Changes in V2:

- Dropped all generic changes wrt MEM_CANCEL_OFFLINE reasons enumeration
- Dropped all related (processing MEM_CANCEL_OFFLINE reasons) changes on arm64
- Added validate_boot_mem_online_state() that gets called with early_initcall()
- Added CONFIG_MEMORY_HOTREMOVE check before registering memory notifier
- Moved notifier registration i.e memory_hotremove_notifier into setup_arch()

Changes in V1: (https://patchwork.kernel.org/project/linux-mm/list/?series=271237)

 arch/arm64/include/asm/mmu.h |   8 +++
 arch/arm64/kernel/setup.c    |   8 +++
 arch/arm64/mm/mmu.c          | 108 ++++++++++++++++++++++++++++++++---
 3 files changed, 116 insertions(+), 8 deletions(-)

Comments

Anshuman Khandual Aug. 24, 2020, 4:09 a.m. UTC | #1
On 08/24/2020 09:34 AM, Anshuman Khandual wrote:
> This brings about three different changes to the sole memory event notifier
> for arm64 platform and improves it's robustness while also enhancing debug
> capabilities during potential memory offlining error conditions.
> 
> This moves the memory notifier registration bit earlier in the boot process
> from device_initcall() to setup_arch() which will help in guarding against
> potential early boot memory offline requests.
> 
> This enables MEM_OFFLINE memory event handling. It will help intercept any
> possible error condition such as if boot memory some how still got offlined
> even after an expilicit notifier failure, potentially by a future change in
> generic hotplug framework. This would help detect such scenarious and help
> debug further.
> 
> It also adds a validation function which scans entire boot memory and makes
> sure that early memory sections are online. This check is essential for the
> memory notifier to work properly as it cannot prevent boot memory offlining
> if they are not online to begin with. But this additional sanity check is
> enabled only with DEBUG_VM.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.com>

Wrong email address here for Will.

+ Will Deacon <will@kernel.org>

s/will@kernel.com/will@kernel.org next time around.
Anshuman Khandual Sept. 8, 2020, 9:23 a.m. UTC | #2
On 08/24/2020 09:34 AM, Anshuman Khandual wrote:
> This brings about three different changes to the sole memory event notifier
> for arm64 platform and improves it's robustness while also enhancing debug
> capabilities during potential memory offlining error conditions.
> 
> This moves the memory notifier registration bit earlier in the boot process
> from device_initcall() to setup_arch() which will help in guarding against
> potential early boot memory offline requests.
> 
> This enables MEM_OFFLINE memory event handling. It will help intercept any
> possible error condition such as if boot memory some how still got offlined
> even after an expilicit notifier failure, potentially by a future change in
> generic hotplug framework. This would help detect such scenarious and help
> debug further.
> 
> It also adds a validation function which scans entire boot memory and makes
> sure that early memory sections are online. This check is essential for the
> memory notifier to work properly as it cannot prevent boot memory offlining
> if they are not online to begin with. But this additional sanity check is
> enabled only with DEBUG_VM.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> This applies on 5.9-rc2
> 
> Changes in V2:
> 
> - Dropped all generic changes wrt MEM_CANCEL_OFFLINE reasons enumeration
> - Dropped all related (processing MEM_CANCEL_OFFLINE reasons) changes on arm64
> - Added validate_boot_mem_online_state() that gets called with early_initcall()
> - Added CONFIG_MEMORY_HOTREMOVE check before registering memory notifier
> - Moved notifier registration i.e memory_hotremove_notifier into setup_arch()
Gentle ping, any updates on this ?
Catalin Marinas Sept. 11, 2020, 2:06 p.m. UTC | #3
Hi Anshuman,

On Mon, Aug 24, 2020 at 09:34:29AM +0530, Anshuman Khandual wrote:
> This brings about three different changes to the sole memory event notifier
> for arm64 platform and improves it's robustness while also enhancing debug
> capabilities during potential memory offlining error conditions.
> 
> This moves the memory notifier registration bit earlier in the boot process
> from device_initcall() to setup_arch() which will help in guarding against
> potential early boot memory offline requests.
> 
> This enables MEM_OFFLINE memory event handling. It will help intercept any
> possible error condition such as if boot memory some how still got offlined
> even after an expilicit notifier failure, potentially by a future change in
> generic hotplug framework. This would help detect such scenarious and help
> debug further.
> 
> It also adds a validation function which scans entire boot memory and makes
> sure that early memory sections are online. This check is essential for the
> memory notifier to work properly as it cannot prevent boot memory offlining
> if they are not online to begin with. But this additional sanity check is
> enabled only with DEBUG_VM.

Could you please split this in separate patches rather than having a
single one doing three somewhat related things?

> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -376,6 +376,14 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
>  			"This indicates a broken bootloader or old kernel\n",
>  			boot_args[1], boot_args[2], boot_args[3]);
>  	}
> +
> +	/*
> +	 * Register the memory notifier which will prevent boot
> +	 * memory offlining requests - early enough. But there
> +	 * should not be any actual offlinig request till memory
> +	 * block devices are initialized with memory_dev_init().
> +	 */
> +	memory_hotremove_notifier();

Why can this not be an early_initcall()? As you said, memory_dev_init()
is called much later, after the SMP was initialised.

You could even combine this with validate_bootmem_online_state() in a
single early_initcall() which, after checking, registers the notifier.
Anshuman Khandual Sept. 14, 2020, 4:05 a.m. UTC | #4
On 09/11/2020 07:36 PM, Catalin Marinas wrote:
> Hi Anshuman,
> 
> On Mon, Aug 24, 2020 at 09:34:29AM +0530, Anshuman Khandual wrote:
>> This brings about three different changes to the sole memory event notifier
>> for arm64 platform and improves it's robustness while also enhancing debug
>> capabilities during potential memory offlining error conditions.
>>
>> This moves the memory notifier registration bit earlier in the boot process
>> from device_initcall() to setup_arch() which will help in guarding against
>> potential early boot memory offline requests.
>>
>> This enables MEM_OFFLINE memory event handling. It will help intercept any
>> possible error condition such as if boot memory some how still got offlined
>> even after an expilicit notifier failure, potentially by a future change in
>> generic hotplug framework. This would help detect such scenarious and help
>> debug further.
>>
>> It also adds a validation function which scans entire boot memory and makes
>> sure that early memory sections are online. This check is essential for the
>> memory notifier to work properly as it cannot prevent boot memory offlining
>> if they are not online to begin with. But this additional sanity check is
>> enabled only with DEBUG_VM.
> 
> Could you please split this in separate patches rather than having a
> single one doing three somewhat related things?

Sure, will do.

> 
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -376,6 +376,14 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
>>  			"This indicates a broken bootloader or old kernel\n",
>>  			boot_args[1], boot_args[2], boot_args[3]);
>>  	}
>> +
>> +	/*
>> +	 * Register the memory notifier which will prevent boot
>> +	 * memory offlining requests - early enough. But there
>> +	 * should not be any actual offlinig request till memory
>> +	 * block devices are initialized with memory_dev_init().
>> +	 */
>> +	memory_hotremove_notifier();
> 
> Why can this not be an early_initcall()? As you said, memory_dev_init()
> is called much later, after the SMP was initialised.

This proposal moves memory_hotremove_notifier() to setup_arch() because it
could and there is no harm in calling this too early than required for now.
But in case generic MM sequence of events during memory init changes later,
this notifier will still work.

IIUC, the notifier chain registration can be called very early in the boot
process without much problem. There are some precedence on other platforms.

1. arch/s390/mm/init.c   		  - In device_initcall() via s390_cma_mem_init()
2. arch/s390/mm/setup.c  		  - In setup_arch() via reserve_crashkernel()
3. arch/powerpc/platforms/pseries/cmm.c   - In module_init() via cmm_init()
4. arch/powerpc/platforms/pseries/iommu.c - via iommu_init_early_pSeries()
					    via pSeries_init()
				            via pSeries_probe() aka ppc_md.porbe()
					    via probe_machine()
					    via setup_arch()

> 
> You could even combine this with validate_bootmem_online_state() in a
> single early_initcall() which, after checking, registers the notifier.
> 

Yes, that will be definitely simpler but there might be still some value
in having this registration in setup_arch() which guard against future
generic MM changes while keeping it separate from the sanity check i.e
validate_bootmem_online_state() which is enabled only with DEBUG_VM. But
will combine both in early_initcall() with some name changes if that is
preferred.
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index a7a5ecaa2e83..b7e99b528766 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -73,6 +73,14 @@  static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
 static inline void arm64_apply_bp_hardening(void)	{ }
 #endif	/* CONFIG_HARDEN_BRANCH_PREDICTOR */
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+extern void memory_hotremove_notifier(void);
+#else
+static inline void memory_hotremove_notifier(void)
+{
+}
+#endif /* CONFIG_MEMORY_HOTPLUG */
+
 extern void arm64_memblock_init(void);
 extern void paging_init(void);
 extern void bootmem_init(void);
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 77c4c9bad1b8..44406c9f8d83 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -376,6 +376,14 @@  void __init __no_sanitize_address setup_arch(char **cmdline_p)
 			"This indicates a broken bootloader or old kernel\n",
 			boot_args[1], boot_args[2], boot_args[3]);
 	}
+
+	/*
+	 * Register the memory notifier which will prevent boot
+	 * memory offlining requests - early enough. But there
+	 * should not be any actual offlinig request till memory
+	 * block devices are initialized with memory_dev_init().
+	 */
+	memory_hotremove_notifier();
 }
 
 static inline bool cpu_can_disable(unsigned int cpu)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 75df62fea1b6..8cdb0b02089f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1482,13 +1482,40 @@  static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
 	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
 	unsigned long pfn = arg->start_pfn;
 
-	if (action != MEM_GOING_OFFLINE)
+	if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
 		return NOTIFY_OK;
 
-	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-		if (early_section(ms))
-			return NOTIFY_BAD;
+	if (action == MEM_GOING_OFFLINE) {
+		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+			ms = __pfn_to_section(pfn);
+			if (early_section(ms)) {
+				pr_warn("Boot memory offlining attempted\n");
+				return NOTIFY_BAD;
+			}
+		}
+	} else if (action == MEM_OFFLINE) {
+		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+			ms = __pfn_to_section(pfn);
+			if (early_section(ms)) {
+
+				/*
+				 * This should have never happened. Boot memory
+				 * offlining should have been prevented by this
+				 * very notifier. Probably some memory removal
+				 * procedure might have changed which would then
+				 * require further debug.
+				 */
+				pr_err("Boot memory offlined\n");
+
+				/*
+				 * Core memory hotplug does not process a return
+				 * code from the notifier for MEM_OFFLINE event.
+				 * Error condition has been reported. Report as
+				 * ignored.
+				 */
+				return NOTIFY_DONE;
+			}
+		}
 	}
 	return NOTIFY_OK;
 }
@@ -1497,9 +1524,74 @@  static struct notifier_block prevent_bootmem_remove_nb = {
 	.notifier_call = prevent_bootmem_remove_notifier,
 };
 
-static int __init prevent_bootmem_remove_init(void)
+void memory_hotremove_notifier(void)
 {
-	return register_memory_notifier(&prevent_bootmem_remove_nb);
+	int rc;
+
+	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+		return;
+
+	rc = register_memory_notifier(&prevent_bootmem_remove_nb);
+	if (!rc)
+		return;
+
+	pr_err("Notifier registration failed - boot memory can be removed\n");
+}
+
+/*
+ * This ensures that boot memory sections on the plaltform are online
+ * during early boot. They could not be prevented from being offlined
+ * if for some reason they are not brought online to begin with. This
+ * help validate the basic assumption on which the above memory event
+ * notifier works to prevent boot memory offlining and it's possible
+ * removal.
+ */
+static int __init validate_bootmem_online_state(void)
+{
+	struct memblock_region *mblk;
+	struct mem_section *ms;
+	unsigned long pfn, end_pfn, start, end;
+
+	/*
+	 * Scanning across all memblock might be expensive
+	 * on some big memory systems. Hence enable this
+	 * validation only with DEBUG_VM.
+	 */
+	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE) ||
+			!IS_ENABLED(CONFIG_DEBUG_VM))
+		return 0;
+
+	for_each_memblock(memory, mblk) {
+		pfn = PHYS_PFN(mblk->base);
+		end_pfn = PHYS_PFN(mblk->base + mblk->size);
+
+		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+			ms = __pfn_to_section(pfn);
+
+			/*
+			 * All memory ranges in the system at this point
+			 * should have been marked early sections.
+			 */
+			WARN_ON(!early_section(ms));
+
+			/*
+			 * Memory notifier mechanism here to prevent boot
+			 * memory offlining depends on the fact that each
+			 * early section memory on the system is intially
+			 * online. Otherwise a given memory section which
+			 * is already offline will be overlooked and can
+			 * be removed completely. Call out such sections.
+			 */
+			if (!online_section(ms)) {
+				start = PFN_PHYS(pfn);
+				end = start + (1UL << PA_SECTION_SHIFT);
+
+				pr_err("Memory range [%lx %lx] is offline\n", start, end);
+				pr_err("Memory range [%lx %lx] can be removed\n", start, end);
+			}
+		}
+	}
+	return 0;
 }
-device_initcall(prevent_bootmem_remove_init);
+early_initcall(validate_bootmem_online_state);
 #endif