diff mbox series

[RFC,v3,18/36] kmsan: disable LOCK_DEBUGGING_SUPPORT

Message ID 20191122112621.204798-19-glider@google.com (mailing list archive)
State New, archived
Headers show
Series Add KernelMemorySanitizer infrastructure | expand

Commit Message

Alexander Potapenko Nov. 22, 2019, 11:26 a.m. UTC
KMSAN doesn't currently support lock debugging, causing the kernel to
enter infinite recursion at boot time.
See https://github.com/google/kmsan/issues/57.

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: linux-mm@kvack.org

---
This patch is part of "kmsan: Kconfig changes to disable options
incompatible with KMSAN", which was split into smaller pieces.

Comments

Marco Elver Dec. 2, 2019, 1:33 p.m. UTC | #1
On Fri, 22 Nov 2019 at 12:27, <glider@google.com> wrote:
>
> KMSAN doesn't currently support lock debugging, causing the kernel to
> enter infinite recursion at boot time.
> See https://github.com/google/kmsan/issues/57.
>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> Cc: Eric Biggers <ebiggers@google.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Harry Wentland <harry.wentland@amd.com>
> Cc: Vegard Nossum <vegard.nossum@oracle.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: linux-mm@kvack.org
>
> ---
> This patch is part of "kmsan: Kconfig changes to disable options
> incompatible with KMSAN", which was split into smaller pieces.
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 75c36318943d..a3f6f5d68593 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1068,6 +1068,9 @@ menu "Lock Debugging (spinlocks, mutexes, etc...)"
>  config LOCK_DEBUGGING_SUPPORT
>         bool
>         depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
> +       # KMSAN is incompatible with lockdep,
> +       # see https://github.com/google/kmsan/issues/57.
> +       depends on !KMSAN
>         default y
>
>  config PROVE_LOCKING
>
> Change-Id: I4f97edc8a02d8ca208fc914e55e8f0c23d74eac8
> ---
>  lib/Kconfig.debug | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 75c36318943d..a3f6f5d68593 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1068,6 +1068,9 @@ menu "Lock Debugging (spinlocks, mutexes, etc...)"
>  config LOCK_DEBUGGING_SUPPORT
>         bool
>         depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
> +       # KMSAN is incompatible with lockdep,
> +       # see https://github.com/google/kmsan/issues/57.

Does it make sense to get this working? Or rather, is it feasible to
get working? If not, I would just remove the reference to the Github
issue, and declare that KMSAN is incompatible with lockdep.

> +       depends on !KMSAN
>         default y
>
>  config PROVE_LOCKING
> --
> 2.24.0.432.g9d3f5f5b63-goog
>
Alexander Potapenko Dec. 3, 2019, 2:34 p.m. UTC | #2
On Mon, Dec 2, 2019 at 2:33 PM Marco Elver <elver@google.com> wrote:
>
> On Fri, 22 Nov 2019 at 12:27, <glider@google.com> wrote:
> >
> > KMSAN doesn't currently support lock debugging, causing the kernel to
> > enter infinite recursion at boot time.
> > See https://github.com/google/kmsan/issues/57.
> >
> > Signed-off-by: Alexander Potapenko <glider@google.com>
> > Cc: Eric Biggers <ebiggers@google.com>
> > Cc: Qian Cai <cai@lca.pw>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > Cc: Harry Wentland <harry.wentland@amd.com>
> > Cc: Vegard Nossum <vegard.nossum@oracle.com>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: linux-mm@kvack.org
> >
> > ---
> > This patch is part of "kmsan: Kconfig changes to disable options
> > incompatible with KMSAN", which was split into smaller pieces.
> >
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 75c36318943d..a3f6f5d68593 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1068,6 +1068,9 @@ menu "Lock Debugging (spinlocks, mutexes, etc...)"
> >  config LOCK_DEBUGGING_SUPPORT
> >         bool
> >         depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
> > +       # KMSAN is incompatible with lockdep,
> > +       # see https://github.com/google/kmsan/issues/57.
> > +       depends on !KMSAN
> >         default y
> >
> >  config PROVE_LOCKING
> >
> > Change-Id: I4f97edc8a02d8ca208fc914e55e8f0c23d74eac8
> > ---
> >  lib/Kconfig.debug | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 75c36318943d..a3f6f5d68593 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1068,6 +1068,9 @@ menu "Lock Debugging (spinlocks, mutexes, etc...)"
> >  config LOCK_DEBUGGING_SUPPORT
> >         bool
> >         depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
> > +       # KMSAN is incompatible with lockdep,
> > +       # see https://github.com/google/kmsan/issues/57.
>
> Does it make sense to get this working? Or rather, is it feasible to
> get working? If not, I would just remove the reference to the Github
> issue, and declare that KMSAN is incompatible with lockdep.
Agreed.
At this point I don't really know why KMSAN and lockdep don't play
well together, but I'm not expecting anyone to use them together
either.
> > +       depends on !KMSAN
> >         default y
> >
> >  config PROVE_LOCKING
> > --
> > 2.24.0.432.g9d3f5f5b63-goog
> >
Qian Cai Dec. 3, 2019, 3 p.m. UTC | #3
> On Dec 3, 2019, at 9:35 AM, Alexander Potapenko <glider@google.com> wrote:
> 
> At this point I don't really know why KMSAN and lockdep don't play
> well together, but I'm not expecting anyone to use them together
> either.

Of course people will use those together. For example, distro debug kernel variants.
Alexander Potapenko Dec. 3, 2019, 3:14 p.m. UTC | #4
On Tue, Dec 3, 2019 at 4:00 PM Qian Cai <cai@lca.pw> wrote:
>
>
>
> > On Dec 3, 2019, at 9:35 AM, Alexander Potapenko <glider@google.com> wrote:
> >
> > At this point I don't really know why KMSAN and lockdep don't play
> > well together, but I'm not expecting anyone to use them together
> > either.
>
> Of course people will use those together. For example, distro debug kernel variants.
Some tools are just not designed to work together.
For example, you won't be able to compile the kernel with both KASAN
and KMSAN enabled at the same time.

Lockdep doesn't require any instrumentation to work, so it _might_ be
possible to make it work with KMSAN, but it will probably still slow
down the things to an unacceptable level.
I'm inclining towards disabling the two together for now, unless
anyone is willing to address that issue.

Please let me know if you think I need to keep the link to
https://github.com/google/kmsan/issues/57 in the Kconfig comment,
right now it looks like:

  # KMSAN is currently incompatible with lockdep.
Qian Cai Dec. 3, 2019, 6:02 p.m. UTC | #5
> On Dec 3, 2019, at 10:14 AM, Alexander Potapenko <glider@google.com> wrote:
> 
> Some tools are just not designed to work together.
> For example, you won't be able to compile the kernel with both KASAN
> and KMSAN enabled at the same time.
> 
> Lockdep doesn't require any instrumentation to work, so it _might_ be
> possible to make it work with KMSAN, but it will probably still slow
> down the things to an unacceptable level.
> I'm inclining towards disabling the two together for now, unless
> anyone is willing to address that issue.
> 
> Please let me know if you think I need to keep the link to
> https://github.com/google/kmsan/issues/57 in the Kconfig comment,
> right now it looks like:
> 
>  # KMSAN is currently incompatible with lockdep.

Then, I don’t see much value with KMSAN get merged. Although In theory, it does cover a bit more than the existing memory debugging infrastructure, but the question is does it worth it? Maybe it needs more data to show a list of bugs it found that existing debugging options could not find?
Steven Rostedt Dec. 3, 2019, 6:38 p.m. UTC | #6
On Tue, 3 Dec 2019 16:14:17 +0100
Alexander Potapenko <glider@google.com> wrote:

> On Tue, Dec 3, 2019 at 4:00 PM Qian Cai <cai@lca.pw> wrote:
> >
> >
> >  
> > > On Dec 3, 2019, at 9:35 AM, Alexander Potapenko <glider@google.com> wrote:
> > >
> > > At this point I don't really know why KMSAN and lockdep don't play
> > > well together, but I'm not expecting anyone to use them together
> > > either.  
> >
> > Of course people will use those together. For example, distro debug kernel variants.  
> Some tools are just not designed to work together.
> For example, you won't be able to compile the kernel with both KASAN
> and KMSAN enabled at the same time.
> 
> Lockdep doesn't require any instrumentation to work, so it _might_ be
> possible to make it work with KMSAN, but it will probably still slow
> down the things to an unacceptable level.
> I'm inclining towards disabling the two together for now, unless
> anyone is willing to address that issue.

Note, I'm much more interested in someone running with lockdep than
with KMSAN. Thus, you may not get as much use if you do not work with
lockdep. I enable lockdep first when testing out my code. If KMSAN is
not compatible, it wont get enabled.

> 
> Please let me know if you think I need to keep the link to
> https://github.com/google/kmsan/issues/57 in the Kconfig comment,
> right now it looks like:
> 
>   # KMSAN is currently incompatible with lockdep.
> 

Function tracing had lots of issues with lockdep, but I worked hard to
make sure they could be compatible. This usually required having the
lockdep code not be traced. Is it possible to have the same with KMSAN.
That is, have KMSAN not look at anything that lockdep does?

-- Steve
Alexander Potapenko Dec. 4, 2019, 8:41 a.m. UTC | #7
On Tue, Dec 3, 2019 at 7:38 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 3 Dec 2019 16:14:17 +0100
> Alexander Potapenko <glider@google.com> wrote:
>
> > On Tue, Dec 3, 2019 at 4:00 PM Qian Cai <cai@lca.pw> wrote:
> > >
> > >
> > >
> > > > On Dec 3, 2019, at 9:35 AM, Alexander Potapenko <glider@google.com> wrote:
> > > >
> > > > At this point I don't really know why KMSAN and lockdep don't play
> > > > well together, but I'm not expecting anyone to use them together
> > > > either.
> > >
> > > Of course people will use those together. For example, distro debug kernel variants.
> > Some tools are just not designed to work together.
> > For example, you won't be able to compile the kernel with both KASAN
> > and KMSAN enabled at the same time.
> >
> > Lockdep doesn't require any instrumentation to work, so it _might_ be
> > possible to make it work with KMSAN, but it will probably still slow
> > down the things to an unacceptable level.
> > I'm inclining towards disabling the two together for now, unless
> > anyone is willing to address that issue.
>
> Note, I'm much more interested in someone running with lockdep than
> with KMSAN. Thus, you may not get as much use if you do not work with
> lockdep. I enable lockdep first when testing out my code. If KMSAN is
> not compatible, it wont get enabled.
>
> >
> > Please let me know if you think I need to keep the link to
> > https://github.com/google/kmsan/issues/57 in the Kconfig comment,
> > right now it looks like:
> >
> >   # KMSAN is currently incompatible with lockdep.
> >
>
> Function tracing had lots of issues with lockdep, but I worked hard to
> make sure they could be compatible. This usually required having the
> lockdep code not be traced. Is it possible to have the same with KMSAN.
> That is, have KMSAN not look at anything that lockdep does?
>
> -- Steve
Qian, Steve, thanks for the data points.
I thought it might be more valuable to cut some edges to make KMSAN
available to early users, but in the case most developers use a single
build for testing it indeed makes more sense to fix lockdep
interoperability.
I'll look into that.
Petr Mladek Dec. 4, 2019, 12:22 p.m. UTC | #8
On Wed 2019-12-04 09:41:15, Alexander Potapenko wrote:
> On Tue, Dec 3, 2019 at 7:38 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Tue, 3 Dec 2019 16:14:17 +0100
> > Alexander Potapenko <glider@google.com> wrote:
> >
> > > On Tue, Dec 3, 2019 at 4:00 PM Qian Cai <cai@lca.pw> wrote:
> > > >
> > > >
> > > >
> > > > > On Dec 3, 2019, at 9:35 AM, Alexander Potapenko <glider@google.com> wrote:
> > > > >
> > > > > At this point I don't really know why KMSAN and lockdep don't play
> > > > > well together, but I'm not expecting anyone to use them together
> > > > > either.
> > > >
> > > > Of course people will use those together. For example, distro debug kernel variants.
> > > Some tools are just not designed to work together.
> > > For example, you won't be able to compile the kernel with both KASAN
> > > and KMSAN enabled at the same time.
> > >
> > > Lockdep doesn't require any instrumentation to work, so it _might_ be
> > > possible to make it work with KMSAN, but it will probably still slow
> > > down the things to an unacceptable level.
> > > I'm inclining towards disabling the two together for now, unless
> > > anyone is willing to address that issue.
> >
> > Note, I'm much more interested in someone running with lockdep than
> > with KMSAN. Thus, you may not get as much use if you do not work with
> > lockdep. I enable lockdep first when testing out my code. If KMSAN is
> > not compatible, it wont get enabled.
> >
> > >
> > > Please let me know if you think I need to keep the link to
> > > https://github.com/google/kmsan/issues/57 in the Kconfig comment,
> > > right now it looks like:
> > >
> > >   # KMSAN is currently incompatible with lockdep.
> > >
> >
> > Function tracing had lots of issues with lockdep, but I worked hard to
> > make sure they could be compatible. This usually required having the
> > lockdep code not be traced. Is it possible to have the same with KMSAN.
> > That is, have KMSAN not look at anything that lockdep does?
> >
> > -- Steve
> Qian, Steve, thanks for the data points.
> I thought it might be more valuable to cut some edges to make KMSAN
> available to early users,

Makes sense to me.

> but in the case most developers use a single
> build for testing it indeed makes more sense to fix lockdep
> interoperability.
> I'll look into that.

IMHO, a generic distro debug kernel could not enable debugging
options that would significantly slow down the kernel. Most users
would refuse using such kernels. Also they might slow down
the system to the point that many problems won't be visible.

For example, I do not see KASAN enabled in SUSE debug kernel.
I guess that it will be the same with KMSAN. These options
can be enabled in custom kernels when debugging particular
problems.

Best Regards,
Petr
Qian Cai Dec. 4, 2019, 1:12 p.m. UTC | #9
> On Dec 4, 2019, at 7:22 AM, Petr Mladek <pmladek@suse.com> wrote:
> 
> IMHO, a generic distro debug kernel could not enable debugging
> options that would significantly slow down the kernel. Most users
> would refuse using such kernels. Also they might slow down
> the system to the point that many problems won't be visible.
> 
> For example, I do not see KASAN enabled in SUSE debug kernel.
> I guess that it will be the same with KMSAN. These options
> can be enabled in custom kernels when debugging particular
> problems.

No, KASAN is one of most important debug options to find general issues. It was actually used in many debug kernels like those in Google. In contrast, it will find many issues compared to without even though the performance would suffer in some degrees.

I would even argue that since KASAN is so valuable that it is also desirable to get KMSAN to work together with KASAN by improving the compilers. Otherwise, it is a struggling to choose between KASAN and KMSAN for general debug kernels as KASAN could also cover a subset of KMSAN coverage.
Alexander Potapenko Dec. 4, 2019, 4:24 p.m. UTC | #10
On Wed, Dec 4, 2019 at 2:12 PM Qian Cai <cai@lca.pw> wrote:
>
>
>
> > On Dec 4, 2019, at 7:22 AM, Petr Mladek <pmladek@suse.com> wrote:
> >
> > IMHO, a generic distro debug kernel could not enable debugging
> > options that would significantly slow down the kernel. Most users
> > would refuse using such kernels. Also they might slow down
> > the system to the point that many problems won't be visible.
> >
> > For example, I do not see KASAN enabled in SUSE debug kernel.
> > I guess that it will be the same with KMSAN. These options
> > can be enabled in custom kernels when debugging particular
> > problems.
>
> No, KASAN is one of most important debug options to find general issues. It was actually used in many debug kernels like those in Google. In contrast, it will find many issues compared to without even though the performance would suffer in some degrees.
>
> I would even argue that since KASAN is so valuable that it is also desirable to get KMSAN to work together with KASAN by improving the compilers. Otherwise, it is a struggling to choose between KASAN and KMSAN for general debug kernels as KASAN could also cover a subset of KMSAN coverage.

Turns out it was fairly easy to make KMSAN work with lockdep, sorry
for the noise.
I'll send the updated patch that disables instrumentation of
kernel/locking/lockdep.c in v4.

While on this, I want to make a point on KMSAN features that seem to
be misunderstood.
KMSAN uses precise bit-to-bit shadow mapping and compiler
instrumentation for shadow tracking to tell whether a value is used in
a comparison, pointer dereference or is copied to hardware or
userspace.
As long as the use of such value doesn't cause an immediate crash or a
memory corruption, KASAN is simply unable to detect such bugs.
Neither are any other existing tools, although they may also report
errors later, after the use of uninitialized values.
There's a list of bugs found by syzbot running KMSAN
(https://syzkaller.appspot.com/upstream/fixed?manager=ci-upstream-kmsan-gce),
and I would say more than a hundred of those is unique to KMSAN.

One can think of KMSAN as a _fast_ replacement of kmemcheck.
Unfortunately, the latter bit-rotted at some point and was deleted
from the kernel (I don't think distro debug kernels ever used it
though).

From our experience building userspace tools, making KMSAN and KASAN
work together isn't worth it.
The resulting monster will be slower than any of the two tools and
will be using more memory than both of them together.
This means that anyone who's using at least two machines for fuzzing
or continuous integration will prefer running two different tools on
them instead of running KMSAN+KASAN together on every machine.

That said, I'll try my best to keep existing low-overhead debug
configs working with KMSAN, but it still won't solve the need of
running orthogonal kernel configs for testing.
Qian Cai Dec. 4, 2019, 6:03 p.m. UTC | #11
> On Dec 4, 2019, at 11:24 AM, Alexander Potapenko <glider@google.com> wrote:
> 
> One can think of KMSAN as a _fast_ replacement of kmemcheck.
> Unfortunately, the latter bit-rotted at some point and was deleted
> from the kernel (I don't think distro debug kernels ever used it
> though).
> 
> From our experience building userspace tools, making KMSAN and KASAN
> work together isn't worth it.
> The resulting monster will be slower than any of the two tools and
> will be using more memory than both of them together.
> This means that anyone who's using at least two machines for fuzzing
> or continuous integration will prefer running two different tools on
> them instead of running KMSAN+KASAN together on every machine.
> 
> That said, I'll try my best to keep existing low-overhead debug
> configs working with KMSAN, but it still won't solve the need of
> running orthogonal kernel configs for testing.

The problem is that we have too many memory-related debugging options in kernel. Those tend to break over time mainly because not many paid kernel developers would care about them, and most of Kernel developers nowadays are paid.  Not many employers would like Google to spend money on long-term health of the kernel because those debugging options does not generate revenues. For example, kmemleak was broken on NUMA for many releases and lockdep PROVE_LOCKING still has too many unfixed splats right now which would render it almost useless for general debugging because it will disable itself after found a splat.

Every of those does somewhat different and yet have many overlapping that yet nobody put any effort to consolidate them. It is even more cumbersome for the options to break without notice because it does not play well with popular ones like KASAN given code for those features are spanning all over the kernel code base, so they are sensitive to changes from other subsystems. It put a burden for bug hunters to enable them because it is going to be expensive increasing test runs time exponentially and maintenance cost of different debug kernels.
diff mbox series

Patch

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 75c36318943d..a3f6f5d68593 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1068,6 +1068,9 @@  menu "Lock Debugging (spinlocks, mutexes, etc...)"
 config LOCK_DEBUGGING_SUPPORT
 	bool
 	depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
+	# KMSAN is incompatible with lockdep,
+	# see https://github.com/google/kmsan/issues/57.
+	depends on !KMSAN
 	default y

 config PROVE_LOCKING

Change-Id: I4f97edc8a02d8ca208fc914e55e8f0c23d74eac8
---
 lib/Kconfig.debug | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 75c36318943d..a3f6f5d68593 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1068,6 +1068,9 @@  menu "Lock Debugging (spinlocks, mutexes, etc...)"
 config LOCK_DEBUGGING_SUPPORT
 	bool
 	depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
+	# KMSAN is incompatible with lockdep,
+	# see https://github.com/google/kmsan/issues/57.
+	depends on !KMSAN
 	default y
 
 config PROVE_LOCKING