mm, debug_pagealloc: don't rely on static keys too early

Message ID 20191219130612.23171-1-vbabka@suse.cz (mailing list archive)
State New, archived
Series: mm, debug_pagealloc: don't rely on static keys too early

Commit Message

Vlastimil Babka Dec. 19, 2019, 1:06 p.m. UTC
Commit 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging")
introduced a static key to reduce overhead when debug_pagealloc is compiled
in but not enabled. It relied on the assumption that jump_label_init() is
called before parse_early_param(), as in start_kernel(), so that when the
"debug_pagealloc=on" option is parsed, it is safe to enable the static key.

However, it turns out that multiple architectures call parse_early_param()
earlier, from their setup_arch(). x86 also calls jump_label_init() even
earlier, so no issue was found while testing the commit, but the same is not
true for e.g. ppc64 and s390, where the kernel would not boot with
debug_pagealloc=on, as found by our QA.

To fix this without tricky changes to the init code of multiple
architectures, this patch partially reverts the static key conversion from
96a2b03f281d. Init-time and non-fastpath calls (such as in arch code) to
debug_pagealloc_enabled() will again test a simple bool variable. Fastpath
mm code is converted to a new debug_pagealloc_enabled_static() variant that
relies on the static key, which is enabled at a well-defined point in
mm_init(), where it's guaranteed that jump_label_init() has been called,
regardless of architecture.
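
As an example of the split (taken from the diff below), a fastpath caller in
free_pages_prepare() becomes:

	if (debug_pagealloc_enabled_static())
		kernel_map_pages(page, 1 << order, 0);

while init-time and arch callers keep using debug_pagealloc_enabled(), which
now always just reads the plain bool and is therefore safe at any point
during boot.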

Fixes: 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging")
Cc: <stable@vger.kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/mm.h | 18 +++++++++++++++---
 init/main.c        |  1 +
 mm/page_alloc.c    | 36 ++++++++++++------------------------
 mm/slab.c          |  4 ++--
 mm/slub.c          |  2 +-
 mm/vmalloc.c       |  4 ++--
 6 files changed, 33 insertions(+), 32 deletions(-)

Comments

Qian Cai Dec. 19, 2019, 1:16 p.m. UTC | #1
> On Dec 19, 2019, at 8:06 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> 
> Commit 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging")
> introduced a static key to reduce overhead when debug_pagealloc is compiled
> in but not enabled. It relied on the assumption that jump_label_init() is
> called before parse_early_param(), as in start_kernel(), so that when the
> "debug_pagealloc=on" option is parsed, it is safe to enable the static key.
> 
> However, it turns out that multiple architectures call parse_early_param()
> earlier, from their setup_arch(). x86 also calls jump_label_init() even
> earlier, so no issue was found while testing the commit, but the same is not
> true for e.g. ppc64 and s390, where the kernel would not boot with
> debug_pagealloc=on, as found by our QA.

This was tested daily on linux-next here for those arches, and I never saw an issue. Are you able to reproduce it on mainline or linux-next?
Vlastimil Babka Dec. 19, 2019, 1:34 p.m. UTC | #2
On 12/19/19 2:16 PM, Qian Cai wrote:
> 
> 
>> On Dec 19, 2019, at 8:06 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> Commit 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging")
>> introduced a static key to reduce overhead when debug_pagealloc is compiled
>> in but not enabled. It relied on the assumption that jump_label_init() is
>> called before parse_early_param(), as in start_kernel(), so that when the
>> "debug_pagealloc=on" option is parsed, it is safe to enable the static key.
>>
>> However, it turns out that multiple architectures call parse_early_param()
>> earlier, from their setup_arch(). x86 also calls jump_label_init() even
>> earlier, so no issue was found while testing the commit, but the same is not
>> true for e.g. ppc64 and s390, where the kernel would not boot with
>> debug_pagealloc=on, as found by our QA.
> 
> This was tested daily on linux-next here for those arches, and I never saw an issue.

Well, I assume nobody has booted the kernel specifically with
debug_pagealloc=on. In case a randconfig run enabled
CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT, the problem wouldn't manifest, as
the static key would be initialized with DEFINE_STATIC_KEY_TRUE by the
compiler itself, and the enabling-by-param would be a no-op.
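
I.e., with the pre-patch definition (the hunk this patch removes):

	#ifdef CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT
	DEFINE_STATIC_KEY_TRUE(_debug_pagealloc_enabled);
	#else
	DEFINE_STATIC_KEY_FALSE(_debug_pagealloc_enabled);
	#endif

the key is already true at compile time with ENABLE_DEFAULT=y, so the early
static_branch_enable() has nothing left to patch and the too-early jump
label write is never exercised.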

> Are you able to reproduce it on mainline or linux-next?

I didn't try, but our kernel is 5.3-based, so that's quite recent. The
offending commit was also only introduced in 5.3-rc1, so there's that.
Andrew Morton Jan. 3, 2020, 3:02 a.m. UTC | #3
On Thu, 19 Dec 2019 14:06:12 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:

> Commit 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging")
> introduced a static key to reduce overhead when debug_pagealloc is compiled
> in but not enabled. It relied on the assumption that jump_label_init() is
> called before parse_early_param(), as in start_kernel(), so that when the
> "debug_pagealloc=on" option is parsed, it is safe to enable the static key.
> 
> However, it turns out that multiple architectures call parse_early_param()
> earlier, from their setup_arch(). x86 also calls jump_label_init() even
> earlier, so no issue was found while testing the commit, but the same is not
> true for e.g. ppc64 and s390, where the kernel would not boot with
> debug_pagealloc=on, as found by our QA.
> 
> To fix this without tricky changes to the init code of multiple
> architectures, this patch partially reverts the static key conversion from
> 96a2b03f281d. Init-time and non-fastpath calls (such as in arch code) to
> debug_pagealloc_enabled() will again test a simple bool variable. Fastpath
> mm code is converted to a new debug_pagealloc_enabled_static() variant that
> relies on the static key, which is enabled at a well-defined point in
> mm_init(), where it's guaranteed that jump_label_init() has been called,
> regardless of architecture.

I'm seeing this with x86_64 allmodconfig:

ERROR: "_debug_pagealloc_enabled_early" [sound/drivers/pcsp/snd-pcsp.ko] undefined!

Not sure why.  It's there:

q:/usr/src/25> nm mm/page_alloc.o|grep _debug_pagealloc_enabled
...
00000000000028a0 B _debug_pagealloc_enabled
...

and exported:

0000000000000072 r __kstrtab__debug_pagealloc_enabled

Odd.  Does this happen to you as well?
Vlastimil Babka Jan. 6, 2020, 8:37 a.m. UTC | #4
On 1/3/20 4:02 AM, Andrew Morton wrote:
> On Thu, 19 Dec 2019 14:06:12 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:
> 
>> Commit 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging")
>> introduced a static key to reduce overhead when debug_pagealloc is compiled
>> in but not enabled. It relied on the assumption that jump_label_init() is
>> called before parse_early_param(), as in start_kernel(), so that when the
>> "debug_pagealloc=on" option is parsed, it is safe to enable the static key.
>>
>> However, it turns out that multiple architectures call parse_early_param()
>> earlier, from their setup_arch(). x86 also calls jump_label_init() even
>> earlier, so no issue was found while testing the commit, but the same is not
>> true for e.g. ppc64 and s390, where the kernel would not boot with
>> debug_pagealloc=on, as found by our QA.
>>
>> To fix this without tricky changes to the init code of multiple
>> architectures, this patch partially reverts the static key conversion from
>> 96a2b03f281d. Init-time and non-fastpath calls (such as in arch code) to
>> debug_pagealloc_enabled() will again test a simple bool variable. Fastpath
>> mm code is converted to a new debug_pagealloc_enabled_static() variant that
>> relies on the static key, which is enabled at a well-defined point in
>> mm_init(), where it's guaranteed that jump_label_init() has been called,
>> regardless of architecture.
> 
> I'm seeing this with x86_64 allmodconfig:
> 
> ERROR: "_debug_pagealloc_enabled_early" [sound/drivers/pcsp/snd-pcsp.ko] undefined!
> 
> Not sure why.  It's there:
> 
> q:/usr/src/25> nm mm/page_alloc.o|grep _debug_pagealloc_enabled
> ...
> 00000000000028a0 B _debug_pagealloc_enabled
> ...
> 
> and exported:
> 
> 0000000000000072 r __kstrtab__debug_pagealloc_enabled

Hm, that's not the _early version. A missing export indeed; can you amend
the commit with Stephen's patch? Thanks!

https://lore.kernel.org/linux-next/20200106164944.063ac07b@canb.auug.org.au/
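
(I'd expect that fix to boil down to exporting the new bool next to the
existing key export in mm/page_alloc.c, along the lines of

	EXPORT_SYMBOL(_debug_pagealloc_enabled_early);

but see the linked patch for the actual change.)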

> 
> Odd.  Does this happen to you as well?
>

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c97ea3b694e6..5cf260d5e248 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2655,13 +2655,25 @@  static inline bool want_init_on_free(void)
 	       !page_poisoning_enabled();
 }
 
-#ifdef CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT
-DECLARE_STATIC_KEY_TRUE(_debug_pagealloc_enabled);
+#ifdef CONFIG_DEBUG_PAGEALLOC
+extern void init_debug_pagealloc(void);
 #else
-DECLARE_STATIC_KEY_FALSE(_debug_pagealloc_enabled);
+static inline void init_debug_pagealloc(void) {}
 #endif
+extern bool _debug_pagealloc_enabled_early;
+DECLARE_STATIC_KEY_FALSE(_debug_pagealloc_enabled);
 
 static inline bool debug_pagealloc_enabled(void)
+{
+	return IS_ENABLED(CONFIG_DEBUG_PAGEALLOC) &&
+		_debug_pagealloc_enabled_early;
+}
+
+/*
+ * For use in fast paths after init_debug_pagealloc() has run, or when a
+ * false negative result is not harmful when called too early.
+ */
+static inline bool debug_pagealloc_enabled_static(void)
 {
 	if (!IS_ENABLED(CONFIG_DEBUG_PAGEALLOC))
 		return false;
diff --git a/init/main.c b/init/main.c
index ec3a1463ac69..c93b9cc201fa 100644
--- a/init/main.c
+++ b/init/main.c
@@ -554,6 +554,7 @@  static void __init mm_init(void)
 	 * bigger than MAX_ORDER unless SPARSEMEM.
 	 */
 	page_ext_init_flatmem();
+	init_debug_pagealloc();
 	report_meminit();
 	mem_init();
 	kmem_cache_init();
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4785a8a2040e..5e3fe156ffb4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -694,34 +694,26 @@  void prep_compound_page(struct page *page, unsigned int order)
 #ifdef CONFIG_DEBUG_PAGEALLOC
 unsigned int _debug_guardpage_minorder;
 
-#ifdef CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT
-DEFINE_STATIC_KEY_TRUE(_debug_pagealloc_enabled);
-#else
+bool _debug_pagealloc_enabled_early __read_mostly
+			= IS_ENABLED(CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT);
 DEFINE_STATIC_KEY_FALSE(_debug_pagealloc_enabled);
-#endif
 EXPORT_SYMBOL(_debug_pagealloc_enabled);
 
 DEFINE_STATIC_KEY_FALSE(_debug_guardpage_enabled);
 
 static int __init early_debug_pagealloc(char *buf)
 {
-	bool enable = false;
-
-	if (kstrtobool(buf, &enable))
-		return -EINVAL;
-
-	if (enable)
-		static_branch_enable(&_debug_pagealloc_enabled);
-
-	return 0;
+	return kstrtobool(buf, &_debug_pagealloc_enabled_early);
 }
 early_param("debug_pagealloc", early_debug_pagealloc);
 
-static void init_debug_guardpage(void)
+void init_debug_pagealloc(void)
 {
 	if (!debug_pagealloc_enabled())
 		return;
 
+	static_branch_enable(&_debug_pagealloc_enabled);
+
 	if (!debug_guardpage_minorder())
 		return;
 
@@ -1186,7 +1178,7 @@  static __always_inline bool free_pages_prepare(struct page *page,
 	 */
 	arch_free_page(page, order);
 
-	if (debug_pagealloc_enabled())
+	if (debug_pagealloc_enabled_static())
 		kernel_map_pages(page, 1 << order, 0);
 
 	kasan_free_nondeferred_pages(page, order);
@@ -1207,7 +1199,7 @@  static bool free_pcp_prepare(struct page *page)
 
 static bool bulkfree_pcp_prepare(struct page *page)
 {
-	if (debug_pagealloc_enabled())
+	if (debug_pagealloc_enabled_static())
 		return free_pages_check(page);
 	else
 		return false;
@@ -1221,7 +1213,7 @@  static bool bulkfree_pcp_prepare(struct page *page)
  */
 static bool free_pcp_prepare(struct page *page)
 {
-	if (debug_pagealloc_enabled())
+	if (debug_pagealloc_enabled_static())
 		return free_pages_prepare(page, 0, true);
 	else
 		return free_pages_prepare(page, 0, false);
@@ -1973,10 +1965,6 @@  void __init page_alloc_init_late(void)
 
 	for_each_populated_zone(zone)
 		set_zone_contiguous(zone);
-
-#ifdef CONFIG_DEBUG_PAGEALLOC
-	init_debug_guardpage();
-#endif
 }
 
 #ifdef CONFIG_CMA
@@ -2106,7 +2094,7 @@  static inline bool free_pages_prezeroed(void)
  */
 static inline bool check_pcp_refill(struct page *page)
 {
-	if (debug_pagealloc_enabled())
+	if (debug_pagealloc_enabled_static())
 		return check_new_page(page);
 	else
 		return false;
@@ -2128,7 +2116,7 @@  static inline bool check_pcp_refill(struct page *page)
 }
 static inline bool check_new_pcp(struct page *page)
 {
-	if (debug_pagealloc_enabled())
+	if (debug_pagealloc_enabled_static())
 		return check_new_page(page);
 	else
 		return false;
@@ -2155,7 +2143,7 @@  inline void post_alloc_hook(struct page *page, unsigned int order,
 	set_page_refcounted(page);
 
 	arch_alloc_page(page, order);
-	if (debug_pagealloc_enabled())
+	if (debug_pagealloc_enabled_static())
 		kernel_map_pages(page, 1 << order, 1);
 	kasan_alloc_pages(page, order);
 	kernel_poison_pages(page, 1 << order, 1);
diff --git a/mm/slab.c b/mm/slab.c
index f1e1840af533..a89633603b2d 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1416,7 +1416,7 @@  static void kmem_rcu_free(struct rcu_head *head)
 #if DEBUG
 static bool is_debug_pagealloc_cache(struct kmem_cache *cachep)
 {
-	if (debug_pagealloc_enabled() && OFF_SLAB(cachep) &&
+	if (debug_pagealloc_enabled_static() && OFF_SLAB(cachep) &&
 		(cachep->size % PAGE_SIZE) == 0)
 		return true;
 
@@ -2008,7 +2008,7 @@  int __kmem_cache_create(struct kmem_cache *cachep, slab_flags_t flags)
 	 * to check size >= 256. It guarantees that all necessary small
 	 * sized slab is initialized in current slab initialization sequence.
 	 */
-	if (debug_pagealloc_enabled() && (flags & SLAB_POISON) &&
+	if (debug_pagealloc_enabled_static() && (flags & SLAB_POISON) &&
 		size >= 256 && cachep->object_size > cache_line_size()) {
 		if (size < PAGE_SIZE || size % PAGE_SIZE == 0) {
 			size_t tmp_size = ALIGN(size, PAGE_SIZE);
diff --git a/mm/slub.c b/mm/slub.c
index d11389710b12..8eafccf75940 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -288,7 +288,7 @@  static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
 	unsigned long freepointer_addr;
 	void *p;
 
-	if (!debug_pagealloc_enabled())
+	if (!debug_pagealloc_enabled_static())
 		return get_freepointer(s, object);
 
 	freepointer_addr = (unsigned long)object + s->offset;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4d3b3d60d893..544cc9a725cf 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1375,7 +1375,7 @@  static void free_unmap_vmap_area(struct vmap_area *va)
 {
 	flush_cache_vunmap(va->va_start, va->va_end);
 	unmap_vmap_area(va);
-	if (debug_pagealloc_enabled())
+	if (debug_pagealloc_enabled_static())
 		flush_tlb_kernel_range(va->va_start, va->va_end);
 
 	free_vmap_area_noflush(va);
@@ -1673,7 +1673,7 @@  static void vb_free(const void *addr, unsigned long size)
 
 	vunmap_page_range((unsigned long)addr, (unsigned long)addr + size);
 
-	if (debug_pagealloc_enabled())
+	if (debug_pagealloc_enabled_static())
 		flush_tlb_kernel_range((unsigned long)addr,
 					(unsigned long)addr + size);