mbox series

[0/3] page table check default to warn instead of panic

Message ID 20220911095923.3614387-1-pasha.tatashin@soleen.com (mailing list archive)
Headers show
Series page table check default to warn instead of panic | expand

Message

Pasha Tatashin Sept. 11, 2022, 9:59 a.m. UTC
From: Pasha Tatashin <tatashin@google.com>

Page table check when detects errors panics the kernel. Let instead,
print a warning, and panic only when specifically requested via kernel
parameter:

	page_table_check=panic

The discussion about using panic vs. warn is here:
https://lore.kernel.org/linux-mm/20220902232732.12358-1-rick.p.edgecombe@intel.com

Pasha Tatashin (2):
  mm/page_table_check: Do WARN_ON instead of BUG_ON by default
  doc/vm: add information about page_table_check=panic

Rick Edgecombe (1):
  mm/page_table_check: Check writable zero page in page table check

 Documentation/mm/page_table_check.rst | 16 ++++----
 mm/page_table_check.c                 | 53 ++++++++++++++++++++-------
 2 files changed, 49 insertions(+), 20 deletions(-)

Comments

Andrew Morton Sept. 12, 2022, 8:23 p.m. UTC | #1
On Sun, 11 Sep 2022 09:59:20 +0000 Pasha Tatashin <pasha.tatashin@soleen.com> wrote:

> From: Pasha Tatashin <tatashin@google.com>
> 
> Page table check when detects errors panics the kernel. Let instead,
> print a warning, and panic only when specifically requested via kernel
> parameter:
> 
> 	page_table_check=panic
> 
> The discussion about using panic vs. warn is here:
> https://lore.kernel.org/linux-mm/20220902232732.12358-1-rick.p.edgecombe@intel.com

The changelog doesn't actually describe the reason for making this
change.  Somebody obviously wants pagetable check errors to no longer
panic the kernel, but why??  (The same can be said of the [2/3]
changelog).

Also, should we be changing the default?  People who like the panic
will get a big surprise when they find out that they should have added
a kernel parameter to get the old behaviour back.  It would be less
disruptive to default to panic unless page_table_check=warn was added.

If there's a solid reason for changing the default, it should be
changelogged.  And if that reason is generally agreed to, perhaps the
kernel should print a warning at boot if neither page_table_check=panic
nor page_table_check=warn were provided.  To tell people that the
default has been changed.
Pasha Tatashin Sept. 20, 2022, 6:11 p.m. UTC | #2
On Mon, Sep 12, 2022 at 4:23 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Sun, 11 Sep 2022 09:59:20 +0000 Pasha Tatashin <pasha.tatashin@soleen.com> wrote:
>
> > From: Pasha Tatashin <tatashin@google.com>
> >
> > Page table check when detects errors panics the kernel. Let instead,
> > print a warning, and panic only when specifically requested via kernel
> > parameter:
> >
> >       page_table_check=panic
> >
> > The discussion about using panic vs. warn is here:
> > https://lore.kernel.org/linux-mm/20220902232732.12358-1-rick.p.edgecombe@intel.com
>
> The changelog doesn't actually describe the reason for making this
> change.  Somebody obviously wants pagetable check errors to no longer
> panic the kernel, but why??  (The same can be said of the [2/3]
> changelog).

This came from the discussion listed above. There seems to be a
consensus that we should reduce the number of BUG_ON() in the kernel,
and replace them with WARN_ON_ONCE() when possible to recover. In the
case of page_table_check we can recover, but for some it may be unsafe
because of security implications. Therefore, I would like to keep  an
option of being able to panic only because of page table check errors,
but not keeping it enabled by default.

I will add more info to the commit message.

>
> Also, should we be changing the default?  People who like the panic
> will get a big surprise when they find out that they should have added
> a kernel parameter to get the old behaviour back.  It would be less
> disruptive to default to panic unless page_table_check=warn was added.

I was thinking about this as well. I decided to change the default:
the old users will still get a warning, but going forward we will be
inline with the rest of the kernel: warn on by default, and optionally
panic.

>
> If there's a solid reason for changing the default, it should be
> changelogged.  And if that reason is generally agreed to, perhaps the
> kernel should print a warning at boot if neither page_table_check=panic
> nor page_table_check=warn were provided.  To tell people that the
> default has been changed.

I am not sure that is needed, and when do we remove that extra boot
message? This is a relatively new feature, and existing users would
still get an ugly warning about incorrect page table mappings.

Thank you,
Pasha

>
>