[v5,2/4] mm: Provide kernel parameter to allow disabling page init poisoning

Message ID 20180925201921.3576.84239.stgit@localhost.localdomain (mailing list archive)
State New, archived
Series Address issues slowing persistent memory initialization

Commit Message

Alexander Duyck Sept. 25, 2018, 8:20 p.m. UTC
On systems with a large amount of memory it can take a significant amount
of time to initialize all of the page structs with the PAGE_POISON_PATTERN
value. I have seen it take over 2 minutes to initialize a system with
over 12TB of RAM.
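
To give a sense of scale, here is a rough sketch of how much struct page metadata the poisoning has to write. The 4 KiB page size and the 64-byte struct page are assumptions (typical for x86_64), not figures taken from this patch, and memmap_bytes() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the patch: 4 KiB base pages and a
 * 64-byte struct page, which is typical for x86_64 builds. */
#define ASSUMED_PAGE_SIZE        4096ULL
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL

/* Bytes of struct page metadata that init-time poisoning has to memset. */
static uint64_t memmap_bytes(uint64_t ram_bytes)
{
	assert(ram_bytes % ASSUMED_PAGE_SIZE == 0);
	return ram_bytes / ASSUMED_PAGE_SIZE * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under those assumptions, memmap_bytes(12ULL << 40) comes to 192 GiB of poison pattern for a 12 TiB system, all of which is written again when the page structs are actually initialized.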

In order to work around the issue I had to disable CONFIG_DEBUG_VM and then
the boot time returned to something much more reasonable as the
arch_add_memory call completed in milliseconds versus seconds. However in
doing that I had to disable all of the other VM debugging on the system.

In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on
a system that has a large amount of memory I have added a new kernel
parameter named "vm_debug" that can be set to "-" in order to disable it.

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---

v3: Switched from kernel config option to parameter
v4: Added comment to parameter handler to record when option is disabled
    Updated parameter description based on feedback from Michal Hocko
    Fixed GB vs TB typo in patch description.
    Switch to vm_debug option similar to slub_debug
v5: Rebased on latest linux-next

 Documentation/admin-guide/kernel-parameters.txt |   12 ++++++
 include/linux/page-flags.h                      |    8 ++++
 mm/debug.c                                      |   46 +++++++++++++++++++++++
 mm/memblock.c                                   |    5 +--
 mm/sparse.c                                     |    4 +-
 5 files changed, 69 insertions(+), 6 deletions(-)

Comments

Dave Hansen Sept. 25, 2018, 8:26 p.m. UTC | #1
On 09/25/2018 01:20 PM, Alexander Duyck wrote:
> +	vm_debug[=options]	[KNL] Available with CONFIG_DEBUG_VM=y.
> +			May slow down system boot speed, especially when
> +			enabled on systems with a large amount of memory.
> +			All options are enabled by default, and this
> +			interface is meant to allow for selectively
> +			enabling or disabling specific virtual memory
> +			debugging features.
> +
> +			Available options are:
> +			  P	Enable page structure init time poisoning
> +			  -	Disable all of the above options

Can we have vm_debug=off for turning things off, please?  That seems to
be pretty standard.

Also, we need to document the defaults.  I think the default is "all
debug options are enabled", but it would be nice to document that.
Alexander Duyck Sept. 25, 2018, 8:38 p.m. UTC | #2
On 9/25/2018 1:26 PM, Dave Hansen wrote:
> On 09/25/2018 01:20 PM, Alexander Duyck wrote:
>> +	vm_debug[=options]	[KNL] Available with CONFIG_DEBUG_VM=y.
>> +			May slow down system boot speed, especially when
>> +			enabled on systems with a large amount of memory.
>> +			All options are enabled by default, and this
>> +			interface is meant to allow for selectively
>> +			enabling or disabling specific virtual memory
>> +			debugging features.
>> +
>> +			Available options are:
>> +			  P	Enable page structure init time poisoning
>> +			  -	Disable all of the above options
> 
> Can we have vm_debug=off for turning things off, please?  That seems to
> be pretty standard.

No. The simple reason for that is that you had requested this work like 
the slub_debug. If we are going to do that then each individual letter 
represents a feature. That is why the "-" represents off. We cannot have 
letters represent flags, and letters put together into words. For 
example slub_debug=OFF would turn on sanity checks and turn off 
debugging for caches that would have caused higher minimum slab orders.

Either I can do this as a single parameter that supports on/off 
semantics, or I can support it as a slub_debug type parameter that does 
flags based on the input options. I would rather not muddy things by 
trying to do both.

> Also, we need to document the defaults.  I think the default is "all
> debug options are enabled", but it would be nice to document that.

In the description I call out "All options are enabled by default, and 
this interface is meant to allow for selectively enabling or disabling".
Dave Hansen Sept. 25, 2018, 10:14 p.m. UTC | #3
On 09/25/2018 01:38 PM, Alexander Duyck wrote:
> On 9/25/2018 1:26 PM, Dave Hansen wrote:
>> On 09/25/2018 01:20 PM, Alexander Duyck wrote:
>>> +    vm_debug[=options]    [KNL] Available with CONFIG_DEBUG_VM=y.
>>> +            May slow down system boot speed, especially when
>>> +            enabled on systems with a large amount of memory.
>>> +            All options are enabled by default, and this
>>> +            interface is meant to allow for selectively
>>> +            enabling or disabling specific virtual memory
>>> +            debugging features.
>>> +
>>> +            Available options are:
>>> +              P    Enable page structure init time poisoning
>>> +              -    Disable all of the above options
>>
>> Can we have vm_debug=off for turning things off, please?  That seems to
>> be pretty standard.
> 
> No. The simple reason for that is that you had requested this work like
> the slub_debug. If we are going to do that then each individual letter
> represents a feature. That is why the "-" represents off. We cannot have
> letters represent flags, and letters put together into words. For
> example slub_debug=OFF would turn on sanity checks and turn off
> debugging for caches that would have caused higher minimum slab orders.

We don't have to have the same letters mean the same things for both
options.  We also can live without 'o' and 'f' being valid.  We can
*also* just say "don't do 'off'" if you want to enable things.

I'd much rather have vm_debug=off do the right thing than have
per-feature enable/disable.  I know I'll *never* remember vm_debug=- and
doing it this way will subject me to innumerable trips to Documentation/
during my few remaining years.

Surely you can make vm_debug=off happen. :)

>> we need to document the defaults.  I think the default is "all
>> debug options are enabled", but it would be nice to document that.
> 
> In the description I call out "All options are enabled by default, and
> this interface is meant to allow for selectively enabling or disabling".

I found "all options are enabled by default" really confusing.  Maybe:

"Control debug features which become available when CONFIG_DEBUG_VM=y.
When this option is not specified, all debug features are enabled.  Use
this option to enable a specific subset."

Then, let's actually say what the options do, and what their impact is:

	P	Enable 'struct page' poisoning at initialization.
		(Slows down boot time).
Alexander Duyck Sept. 25, 2018, 10:27 p.m. UTC | #4
On 9/25/2018 3:14 PM, Dave Hansen wrote:
> On 09/25/2018 01:38 PM, Alexander Duyck wrote:
>> On 9/25/2018 1:26 PM, Dave Hansen wrote:
>>> On 09/25/2018 01:20 PM, Alexander Duyck wrote:
>>>> +    vm_debug[=options]    [KNL] Available with CONFIG_DEBUG_VM=y.
>>>> +            May slow down system boot speed, especially when
>>>> +            enabled on systems with a large amount of memory.
>>>> +            All options are enabled by default, and this
>>>> +            interface is meant to allow for selectively
>>>> +            enabling or disabling specific virtual memory
>>>> +            debugging features.
>>>> +
>>>> +            Available options are:
>>>> +              P    Enable page structure init time poisoning
>>>> +              -    Disable all of the above options
>>>
>>> Can we have vm_debug=off for turning things off, please?  That seems to
>>> be pretty standard.
>>
>> No. The simple reason for that is that you had requested this work like
>> the slub_debug. If we are going to do that then each individual letter
>> represents a feature. That is why the "-" represents off. We cannot have
>> letters represent flags, and letters put together into words. For
>> example slub_debug=OFF would turn on sanity checks and turn off
>> debugging for caches that would have caused higher minimum slab orders.
> 
> We don't have to have the same letters mean the same things for both
> options.  We also can live without 'o' and 'f' being valid.  We can
> *also* just say "don't do 'off'" if you want to enable things.

I'm not saying we do either. I would prefer it if we stuck to similar 
behavior though. If we are going to do a slub_debug style parameter then 
we should stick with similar behavior where "-" is used to indicate all 
features off.

> I'd much rather have vm_debug=off do the right thing than have
> per-feature enable/disable.  I know I'll *never* remember vm_debug=- and
> doing it this way will subject me to innumerable trips to Documentation/
> during my few remaining years.
> 
> Surely you can make vm_debug=off happen. :)

I could, but then it is going to confuse people even more. I really feel 
that if we want to do a slub_debug style interface we should use the 
same switch for turning off all the features that they do for slub_debug.

>>> we need to document the defaults.  I think the default is "all
>>> debug options are enabled", but it would be nice to document that.
>>
>> In the description I call out "All options are enabled by default, and this interface is meant to allow for selectively enabling or disabling".
> 
> I found "all options are enabled by default" really confusing.  Maybe:
> 
> "Control debug features which become available when CONFIG_DEBUG_VM=y.
> When this option is not specified, all debug features are enabled.  Use
> this option to enable a specific subset."
> 
> Then, let's actually say what the options do, and what their impact is:
> 
> 	P	Enable 'struct page' poisoning at initialization.
> 		(Slows down boot time).
>

From my perspective I just don't see how this changes much since it 
conveys the same message I had conveyed in my description. Since it 
looks like Andrew applied the patch feel free to submit your suggestion 
here as a follow-up patch and I would be willing to review/ack it.
Michal Hocko Sept. 26, 2018, 7:38 a.m. UTC | #5
On Tue 25-09-18 13:20:12, Alexander Duyck wrote:
[...]
> +	vm_debug[=options]	[KNL] Available with CONFIG_DEBUG_VM=y.
> +			May slow down system boot speed, especially when
> +			enabled on systems with a large amount of memory.
> +			All options are enabled by default, and this
> +			interface is meant to allow for selectively
> +			enabling or disabling specific virtual memory
> +			debugging features.
> +
> +			Available options are:
> +			  P	Enable page structure init time poisoning
> +			  -	Disable all of the above options

I agree with Dave that this is confusing as hell. So what does vm_debug
(without any options) mean? I assume it's a NOP and all debugging is
enabled and that is the default. What if I want to disable _only_ the
page struct poisoning? The weird-looking `-' will disable all other
options that we might gather in the future.

Why cannot you simply go with [no]vm_page_poison[=on/off]?
Alexander Duyck Sept. 26, 2018, 3:24 p.m. UTC | #6
On 9/26/2018 12:38 AM, Michal Hocko wrote:
> On Tue 25-09-18 13:20:12, Alexander Duyck wrote:
> [...]
>> +	vm_debug[=options]	[KNL] Available with CONFIG_DEBUG_VM=y.
>> +			May slow down system boot speed, especially when
>> +			enabled on systems with a large amount of memory.
>> +			All options are enabled by default, and this
>> +			interface is meant to allow for selectively
>> +			enabling or disabling specific virtual memory
>> +			debugging features.
>> +
>> +			Available options are:
>> +			  P	Enable page structure init time poisoning
>> +			  -	Disable all of the above options
> 
> I agree with Dave that this is confusing as hell. So what does vm_debug
> (without any options) mean? I assume it's a NOP and all debugging is
> enabled and that is the default. What if I want to disable _only_ the
> page struct poisoning? The weird-looking `-' will disable all other
> options that we might gather in the future.

With no options it works just like slub_debug and enables all available 
options. So in our case it is a NOP since we wanted the debugging 
enabled by default.

> Why cannot you simply go with [no]vm_page_poison[=on/off]?

That is what I had to begin with, but Dave Hansen and Dan Williams 
suggested that I go with a slub_debug style interface so we could extend 
it in the future.

It would probably make more sense if we had additional options added, 
but we only have one option for now so the only values we really have 
are 'P' and '-' for now.
Dave Hansen Sept. 26, 2018, 3:36 p.m. UTC | #7
On 09/26/2018 12:38 AM, Michal Hocko wrote:
> Why cannot you simply go with [no]vm_page_poison[=on/off]?

I was trying to look to the future a bit, if we end up with five or six
more other options we want to allow folks to enable/disable.  I don't
want to end up in a situation where we have a bunch of different knobs
to turn all this stuff off at runtime.

I'd really like to have one stop shopping so that folks who have a
system that's behaving well and don't need any debugging can get some of
their performance back.

But, the *primary* thing we want here is a nice, quick way to turn as
much debugging off as we can.  A nice-to-have is a future-proof,
slub-style option that will centralize things.

Alex's patch fails at the primary goal, IMNHO because "vm_debug=-" is so
weird.  I'd much rather have "vm_debug=off" (the primary goal) and throw
away the nice-to-have (future-proof fine-grained on/off).

I think we can have both, but I guess the onus is on me to go and add a
strcmp(..., "off"). :)
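
A minimal userspace sketch of what that strcmp() addition might look like follows. It mirrors the control flow of setup_vm_debug() from the posted patch, but parse_vm_debug() is a hypothetical stand-in that returns the resulting poisoning state instead of setting a global, and it drops the pr_err()/pr_warn() logging.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical model: "str" is whatever follows "vm_debug" on the
 * command line. Returns true if page init poisoning stays enabled.
 */
static bool parse_vm_debug(const char *str)
{
	bool poison = false;

	assert(str != NULL);

	/* No '=' or an empty value: keep everything enabled. */
	if (*str++ != '=' || !*str)
		return true;

	/* Accept both the slub_debug style "-" and the word "off". */
	if (*str == '-' || !strcmp(str, "off"))
		return false;

	/* Otherwise each letter enables one feature. */
	for (; *str; str++) {
		if (tolower((unsigned char)*str) == 'p')
			poison = true;
	}
	return poison;
}
```

Because "off" is compared as a whole word before the per-letter loop runs, 'o' and 'f' could still be assigned to individual features later, at the cost of that one reserved combination.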
Michal Hocko Sept. 26, 2018, 3:39 p.m. UTC | #8
On Wed 26-09-18 08:24:56, Alexander Duyck wrote:
> On 9/26/2018 12:38 AM, Michal Hocko wrote:
> > On Tue 25-09-18 13:20:12, Alexander Duyck wrote:
> > [...]
> > > +	vm_debug[=options]	[KNL] Available with CONFIG_DEBUG_VM=y.
> > > +			May slow down system boot speed, especially when
> > > +			enabled on systems with a large amount of memory.
> > > +			All options are enabled by default, and this
> > > +			interface is meant to allow for selectively
> > > +			enabling or disabling specific virtual memory
> > > +			debugging features.
> > > +
> > > +			Available options are:
> > > +			  P	Enable page structure init time poisoning
> > > +			  -	Disable all of the above options
> > 
> > I agree with Dave that this is confusing as hell. So what does vm_debug
> > (without any options) mean? I assume it's a NOP and all debugging is
> > enabled and that is the default. What if I want to disable _only_ the
> > page struct poisoning? The weird-looking `-' will disable all other
> > options that we might gather in the future.
> 
> With no options it works just like slub_debug and enables all available
> options. So in our case it is a NOP since we wanted the debugging enabled by
> default.

But isn't slub_debug more about _adding_ debugging features? While you
want to effectively disable some debugging features here? So if you want
to follow that pattern then it would be something like
vm_debug_disable=page_poisoning,$OTHER_FUTURE_DEBUG_OPTIONS

why would you want to enable something when CONFIG_DEBUG_VM=y just
enables everything?

> > Why cannot you simply go with [no]vm_page_poison[=on/off]?
> 
> That is what I had to begin with, but Dave Hansen and Dan Williams suggested
> that I go with a slub_debug style interface so we could extend it in the
> future.

Please let's not over-engineer this. If you really need an umbrella
parameter then make a list of things to disable.
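
As a rough illustration of the "list of things to disable" idea, here is a userspace sketch that parses a comma-separated value such as the one in the vm_debug_disable= example above. The VM_DEBUG_PAGE_POISONING bit and parse_disable_list() are hypothetical names; nothing like this exists in the posted patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature bit; only page poisoning exists in the patch. */
#define VM_DEBUG_PAGE_POISONING (1u << 0)

/*
 * Return the mask of features named in a comma-separated list such as
 * "page_poisoning,other". Unknown names are ignored in this sketch.
 */
static unsigned int parse_disable_list(const char *list)
{
	unsigned int disabled = 0;

	assert(list != NULL);
	while (*list) {
		size_t len = strcspn(list, ",");

		if (len == strlen("page_poisoning") &&
		    !strncmp(list, "page_poisoning", len))
			disabled |= VM_DEBUG_PAGE_POISONING;

		/* Step over this name and the comma after it, if any. */
		list += len;
		if (*list == ',')
			list++;
	}
	return disabled;
}
```

An empty list leaves the mask at zero, so everything stays enabled by default, which matches what CONFIG_DEBUG_VM=y does today.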
Dave Hansen Sept. 26, 2018, 3:41 p.m. UTC | #9
On 09/26/2018 08:24 AM, Alexander Duyck wrote:
> With no options it works just like slub_debug and enables all
> available options. So in our case it is a NOP since we wanted the
> debugging enabled by default.

Yeah, but slub_debug is different.

First, nobody uses the slub_debug=- option because *that* is only used
when you have SLUB_DEBUG=y *and* CONFIG_SLUB_DEBUG_ON=y, which not even
Fedora does.

slub_debug is *primarily* for *adding* debug features.  For this, we
need to turn them off.

It sounds like following slub_debug was a bad idea, especially following
its semantics too closely when it doesn't make sense.
Alexander Duyck Sept. 26, 2018, 4:18 p.m. UTC | #10
On 9/26/2018 8:41 AM, Dave Hansen wrote:
> On 09/26/2018 08:24 AM, Alexander Duyck wrote:
>> With no options it works just like slub_debug and enables all
>> available options. So in our case it is a NOP since we wanted the
>> debugging enabled by default.
> 
> Yeah, but slub_debug is different.
> 
> First, nobody uses the slub_debug=- option because *that* is only used
> when you have SLUB_DEBUG=y *and* CONFIG_SLUB_DEBUG_ON=y, which not even
> Fedora does.
> 
> slub_debug is *primarily* for *adding* debug features.  For this, we
> need to turn them off.
> 
> It sounds like following slub_debug was a bad idea, especially following
> its semantics too closely when it doesn't make sense.

I actually like the idea of using slub_debug style semantics. It makes 
sense when you start thinking about future features being added. Then we 
might actually have scenarios where vm_debug=P will make sense, but for 
right now it is probably not going to be used. Basically this all makes 
room for future expansion. It is just ugly to read right now while we 
only have one feature controlled by this bit.
Andrew Morton Sept. 26, 2018, 10:36 p.m. UTC | #11
On Wed, 26 Sep 2018 08:36:47 -0700 Dave Hansen <dave.hansen@intel.com> wrote:

> On 09/26/2018 12:38 AM, Michal Hocko wrote:
> > Why cannot you simply go with [no]vm_page_poison[=on/off]?
> 
> I was trying to look to the future a bit, if we end up with five or six
> more other options we want to allow folks to enable/disable.  I don't
> want to end up in a situation where we have a bunch of different knobs
> to turn all this stuff off at runtime.
> 
> I'd really like to have one stop shopping so that folks who have a
> system that's behaving well and don't need any debugging can get some of
> their performance back.
> 
> But, the *primary* thing we want here is a nice, quick way to turn as
> much debugging off as we can.  A nice-to-have is a future-proof,
> slub-style option that will centralize things.

Yup.  DEBUG_VM just covers too much stuff nowadays.  A general way to
make these things more fine-grained and without requiring a rebuild
would be great.

And I expect that quite a few of the debug features could be
enabled/disabled after bootup as well, so a /proc knob is probably in
our future.  Any infrastructure which is added to support a new
kernel-command-line option should be designed with that in mind.
Patch

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 42d9150047f2..d9ad70ccbdc2 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4811,6 +4811,18 @@ 
 			This is actually a boot loader parameter; the value is
 			passed to the kernel using a special protocol.
 
+	vm_debug[=options]	[KNL] Available with CONFIG_DEBUG_VM=y.
+			May slow down system boot speed, especially when
+			enabled on systems with a large amount of memory.
+			All options are enabled by default, and this
+			interface is meant to allow for selectively
+			enabling or disabling specific virtual memory
+			debugging features.
+
+			Available options are:
+			  P	Enable page structure init time poisoning
+			  -	Disable all of the above options
+
 	vmalloc=nn[KMG]	[KNL,BOOT] Forces the vmalloc area to have an exact
 			size of <nn>. This can be used to increase the
 			minimum size (128MB on x86). It can also be used to
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 4d99504f6496..934f91ef3f54 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -163,6 +163,14 @@  static inline int PagePoisoned(const struct page *page)
 	return page->flags == PAGE_POISON_PATTERN;
 }
 
+#ifdef CONFIG_DEBUG_VM
+void page_init_poison(struct page *page, size_t size);
+#else
+static inline void page_init_poison(struct page *page, size_t size)
+{
+}
+#endif
+
 /*
  * Page flags policies wrt compound pages
  *
diff --git a/mm/debug.c b/mm/debug.c
index bd10aad8539a..cdacba12e09a 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -13,6 +13,7 @@ 
 #include <trace/events/mmflags.h>
 #include <linux/migrate.h>
 #include <linux/page_owner.h>
+#include <linux/ctype.h>
 
 #include "internal.h"
 
@@ -175,4 +176,49 @@  void dump_mm(const struct mm_struct *mm)
 	);
 }
 
+static bool page_init_poisoning __read_mostly = true;
+
+static int __init setup_vm_debug(char *str)
+{
+	bool __page_init_poisoning = true;
+
+	/*
+	 * Calling vm_debug with no arguments is equivalent to requesting
+	 * to enable all debugging options we can control.
+	 */
+	if (*str++ != '=' || !*str)
+		goto out;
+
+	__page_init_poisoning = false;
+	if (*str == '-')
+		goto out;
+
+	while (*str) {
+		switch (tolower(*str)) {
+		case 'p':
+			__page_init_poisoning = true;
+			break;
+		default:
+			pr_err("vm_debug option '%c' unknown. skipped\n",
+			       *str);
+		}
+
+		str++;
+	}
+out:
+	if (page_init_poisoning && !__page_init_poisoning)
+		pr_warn("Page struct poisoning disabled by kernel command line option 'vm_debug'\n");
+
+	page_init_poisoning = __page_init_poisoning;
+
+	return 1;
+}
+__setup("vm_debug", setup_vm_debug);
+
+void page_init_poison(struct page *page, size_t size)
+{
+	if (page_init_poisoning)
+		memset(page, PAGE_POISON_PATTERN, size);
+}
+EXPORT_SYMBOL_GPL(page_init_poison);
 #endif		/* CONFIG_DEBUG_VM */
diff --git a/mm/memblock.c b/mm/memblock.c
index 32e5c62ee142..b0ebca546ba1 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1503,10 +1503,9 @@  void * __init memblock_alloc_try_nid_raw(
 
 	ptr = memblock_alloc_internal(size, align,
 					   min_addr, max_addr, nid);
-#ifdef CONFIG_DEBUG_VM
 	if (ptr && size > 0)
-		memset(ptr, PAGE_POISON_PATTERN, size);
-#endif
+		page_init_poison(ptr, size);
+
 	return ptr;
 }
 
diff --git a/mm/sparse.c b/mm/sparse.c
index c0788e3d8513..ab2ac45e0440 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -696,13 +696,11 @@  int __meminit sparse_add_one_section(struct pglist_data *pgdat,
 		goto out;
 	}
 
-#ifdef CONFIG_DEBUG_VM
 	/*
 	 * Poison uninitialized struct pages in order to catch invalid flags
 	 * combinations.
 	 */
-	memset(memmap, PAGE_POISON_PATTERN, sizeof(struct page) * PAGES_PER_SECTION);
-#endif
+	page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION);
 
 	section_mark_present(ms);
 	sparse_init_one_section(ms, section_nr, memmap, usemap);