diff mbox series

[v4,44/45] mm: fs: initialize fsdata passed to write_begin/write_end interface

Message ID 20220701142310.2188015-45-glider@google.com (mailing list archive)
State New
Headers show
Series Add KernelMemorySanitizer infrastructure | expand

Commit Message

Alexander Potapenko July 1, 2022, 2:23 p.m. UTC
Functions implementing the a_ops->write_end() interface accept the
`void *fsdata` parameter that is supposed to be initialized by the
corresponding a_ops->write_begin() (which accepts `void **fsdata`).

However not all a_ops->write_begin() implementations initialize `fsdata`
unconditionally, so it may get passed uninitialized to a_ops->write_end(),
resulting in undefined behavior.

Fix this by initializing fsdata with NULL before the call to
write_begin(), rather than doing so in all possible a_ops
implementations.

This patch covers only the following cases found by running x86 KMSAN
under syzkaller:

 - generic_perform_write()
 - cont_expand_zero() and generic_cont_expand_simple()
 - page_symlink()

Other cases of passing uninitialized fsdata may persist in the codebase.

Signed-off-by: Alexander Potapenko <glider@google.com>
---
Link: https://linux-review.googlesource.com/id/I414f0ee3a164c9c335d91d82ce4558f6f2841471
---
 fs/buffer.c  | 4 ++--
 fs/namei.c   | 2 +-
 mm/filemap.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

Comments

Matthew Wilcox (Oracle) July 4, 2022, 8:07 p.m. UTC | #1
On Fri, Jul 01, 2022 at 04:23:09PM +0200, Alexander Potapenko wrote:
> Functions implementing the a_ops->write_end() interface accept the
> `void *fsdata` parameter that is supposed to be initialized by the
> corresponding a_ops->write_begin() (which accepts `void **fsdata`).
> 
> However not all a_ops->write_begin() implementations initialize `fsdata`
> unconditionally, so it may get passed uninitialized to a_ops->write_end(),
> resulting in undefined behavior.

... wait, passing an uninitialised variable to a function *which doesn't
actually use it* is now UB?  What genius came up with that rule?  What
purpose does it serve?
Al Viro July 4, 2022, 8:30 p.m. UTC | #2
On Mon, Jul 04, 2022 at 09:07:43PM +0100, Matthew Wilcox wrote:
> On Fri, Jul 01, 2022 at 04:23:09PM +0200, Alexander Potapenko wrote:
> > Functions implementing the a_ops->write_end() interface accept the
> > `void *fsdata` parameter that is supposed to be initialized by the
> > corresponding a_ops->write_begin() (which accepts `void **fsdata`).
> > 
> > However not all a_ops->write_begin() implementations initialize `fsdata`
> > unconditionally, so it may get passed uninitialized to a_ops->write_end(),
> > resulting in undefined behavior.
> 
> ... wait, passing an uninitialised variable to a function *which doesn't
> actually use it* is now UB?  What genius came up with that rule?  What
> purpose does it serve?

"The value we are passing might be utter bollocks, but that way it's
obfuscated enough to confuse anyone, compiler included".

Defensive progamming, don'cha know?

I would suggest a different way to obfuscate it, though - pass const void **
and leave it for the callee to decide whether they want to dereferences it.
It is still 100% dependent upon the ->write_end() being correctly matched
with ->write_begin(), with zero assistance from the compiler, but it does
look, er, safer.  Or something.

	Of course, a clean way to handle that would be to have
->write_begin() return a partial application of foo_write_end to
whatever it wants for fsdata, to be evaluated where we would currently
call ->write_end().  _That_ could be usefully typechecked, but... we
don't have usable partial application.
Alexander Potapenko Aug. 25, 2022, 3:39 p.m. UTC | #3
On Mon, Jul 4, 2022 at 10:07 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, Jul 01, 2022 at 04:23:09PM +0200, Alexander Potapenko wrote:
> > Functions implementing the a_ops->write_end() interface accept the
> > `void *fsdata` parameter that is supposed to be initialized by the
> > corresponding a_ops->write_begin() (which accepts `void **fsdata`).
> >
> > However not all a_ops->write_begin() implementations initialize `fsdata`
> > unconditionally, so it may get passed uninitialized to a_ops->write_end(),
> > resulting in undefined behavior.
>
> ... wait, passing an uninitialised variable to a function *which doesn't
> actually use it* is now UB?  What genius came up with that rule?  What
> purpose does it serve?
>

Hi Matthew,

There is a discussion at [1], with Segher pointing out a reason for
this rule [2] and Linus requesting that we should be warning about the
cases where uninitialized variables are passed by value.

Right now there are only a handful cases in the kernel where such
passing is performed (we just need one more patch in addition to this
one for KMSAN to boot cleanly). So we are in a good position to start
enforcing this rule, unless there's a reason not to.

I am not sure standard compliance alone is a convincing argument, but
from KMSAN standpoint, performing parameter check at callsites
noticeably eases handling of values passed between instrumented and
non-instrumented code. This lets us avoid some low-level hacks around
instrumentation_begin()/instrumentation_end() (some context available
at [4]).

Let me know what you think,
Alex

[1] - https://lore.kernel.org/lkml/CAFKCwrjBjHMquj-adTf0_1QLYq3Et=gJ0rq6HS-qrAEmVA7Ujw@mail.gmail.com/T/
[2] - https://lore.kernel.org/lkml/20220615164655.GC25951@gate.crashing.org/
[3] - https://lore.kernel.org/lkml/CAHk-=whjz3wO8zD+itoerphWem+JZz4uS3myf6u1Wd6epGRgmQ@mail.gmail.com/
[4] https://lore.kernel.org/lkml/20220426164315.625149-29-glider@google.com/
Linus Torvalds Aug. 25, 2022, 4:33 p.m. UTC | #4
On Thu, Aug 25, 2022 at 8:40 AM Alexander Potapenko <glider@google.com> wrote:
>
> On Mon, Jul 4, 2022 at 10:07 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > ... wait, passing an uninitialised variable to a function *which doesn't
> > actually use it* is now UB?  What genius came up with that rule?  What
> > purpose does it serve?
> >
>
> There is a discussion at [1], with Segher pointing out a reason for
> this rule [2] and Linus requesting that we should be warning about the
> cases where uninitialized variables are passed by value.

I think Matthew was actually more wondering how that UB rule came to be.

Personally, I pretty much despise *all* cases of "undefined behavior",
but "uninitialized argument" across a function call is one of the more
understandable ones.

For one, it's a static sanity checking issue: if function call
arguments can be uninitialized random garbage on the assumption that
the callee doesn't necessarily _use_ them, then any static checker is
going to be unhappy because it means that it can never assume that
incoming arguments have been initialized either.

Of course, that's always true for any pointer passing, but hey, at
least then it's pretty much explicit. You're passing a pointer to some
memory to another function, it's always going to be a bit ambiguous
who is supposed to initialize it - the caller or the callee.

Because one very important "static checker" is the person reading the
code. When I read a function definition, I most definitely have the
expectation that the caller has initialized all the arguments.

So I actually think that "human static checker" is a really important
case. I do not think I'm the only one who expects incomping function
arguments to have values.

But I think the immediate cause of it on a compiler side was basically
things like poison bits. Which are a nice debugging feature, even
though (sadly) I don't think they are usually added the for debugging.
It's always for some other much more nefarious reason (eg ia64 and
speculative loads weren't for "hey, this will help people find bugs",
but for "hey, our architecture depends on static scheduling tricks
that aren't really valid, so we have to take faults late").

Now, imagine you're a compiler, and you see a random incoming integer
argument, and you can't even schedule simple arithmetic expressions on
it early because you don't know if the caller initialized it or not,
and it might cause some poison bit fault...

So you'd most certainly want to know that all incoming arguments are
actually valid, because otherwise you can't do even some really simple
and obvious optimziations.

Of course, on normal architectures, this only ever happens with FP
values, and it's often hard to trigger there too. But you most
definitely *could* see it.

I personally was actually surprised compilers didn't warn for "you are
using an uninitialized value" for a function call argument, because I
mentally consider function call arguments to *be* a use of a value.

Except when the function is inlined, and then it's all different - the
call itself goes away, and I *expect* the compiler to DTRT and not
"use" the argument except when it's used inside the inlined function.

Because hey, that's literally the whole point of inlining, and it
makes the "static checking" problem go away at least for a compiler.

                     Linus
Segher Boessenkool Aug. 25, 2022, 9:57 p.m. UTC | #5
On Thu, Aug 25, 2022 at 09:33:18AM -0700, Linus Torvalds wrote:
> On Thu, Aug 25, 2022 at 8:40 AM Alexander Potapenko <glider@google.com> wrote:
> >
> > On Mon, Jul 4, 2022 at 10:07 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > ... wait, passing an uninitialised variable to a function *which doesn't
> > > actually use it* is now UB?  What genius came up with that rule?  What
> > > purpose does it serve?
> > >
> >
> > There is a discussion at [1], with Segher pointing out a reason for
> > this rule [2] and Linus requesting that we should be warning about the
> > cases where uninitialized variables are passed by value.
> 
> I think Matthew was actually more wondering how that UB rule came to be.
> 
> Personally, I pretty much despise *all* cases of "undefined behavior",

Let me start by saying you're not alone.  But some UB *cannot* be worked
around by compilers (we cannot solve the halting problem), and some is
very expensive to work around (initialising huge structures is a
typical example).

Many (if not most) instances of undefined behaviour are unavoidable with
a language like C.  A very big part of this is separate compilation,
that is, compiling translation units separately from each other, so that
the compiler does not see all the ways that something is used when it is
compiling it.  There only is UB if something is *used*.

> but "uninitialized argument" across a function call is one of the more
> understandable ones.

Allowing this essentially never allows generating better machine code,
so there are no real arguments for ever allowing it, other than just
inertia: uninitialised everything else is allowed just fine, and only
actually *using* such data is UB.  Passing it around is not!  That is
how everything used to work (with static data, automatic data, function
parameters, the lot).

But it now is clarified that passing data to a function as function
argument is a use of that data by itself, even if the function will not
even look at it ever.

> I personally was actually surprised compilers didn't warn for "you are
> using an uninitialized value" for a function call argument, because I
> mentally consider function call arguments to *be* a use of a value.

The function call is a use of all passed arguments.

> Except when the function is inlined, and then it's all different - the
> call itself goes away, and I *expect* the compiler to DTRT and not
> "use" the argument except when it's used inside the inlined function.

> Because hey, that's literally the whole point of inlining, and it
> makes the "static checking" problem go away at least for a compiler.

But UB is defined in terms of the abstract machine (like *all* of C),
not in terms of the generated machine code.  Typically things will work
fine if they "become invisible" by inlining, but this does not make the
program a correct program ever.  Sorry :-(


Segher
Segher Boessenkool Aug. 25, 2022, 10:13 p.m. UTC | #6
On Thu, Aug 25, 2022 at 09:33:18AM -0700, Linus Torvalds wrote:
> So you'd most certainly want to know that all incoming arguments are
> actually valid, because otherwise you can't do even some really simple
> and obvious optimziations.

The C11 change was via DR 338, see
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm>
for more info.


Segher
Linus Torvalds Aug. 26, 2022, 7:41 p.m. UTC | #7
On Thu, Aug 25, 2022 at 3:10 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> But UB is defined in terms of the abstract machine (like *all* of C),
> not in terms of the generated machine code.  Typically things will work
> fine if they "become invisible" by inlining, but this does not make the
> program a correct program ever.  Sorry :-(

Yeah, and the abstract machine model based on "abstract syntax" is
just wrong, wrong, wrong.

I really wish the C standard people had the guts to just fix it.  At
some point, relying on tradition when the tradition is bad is not a
great thing.

It's the same problem that made all the memory ordering discussions
completely untenable. The language to allow the whole data dependency
was completely ridiculous, because it became about the C language
syntax and theory, not about the actual code generation and actual
*meaning* that the whole thing was *about*.

Java may be a horrible language that a lot of people hate, but it
avoided a lot of problems by just making things about an actual
virtual machine and describing things within a more concrete model of
a virtual machine.

Then you can just say "this code sequence generates this set of
operations, and the compiler can optimize it any which way it likes as
long as the end result is equivalent".

Oh well.

I will repeat: a paper standard that doesn't take reality into account
is less useful than toilet paper. It's scratchy and not very
absorbent.

And the kernel will continue to care more about reality than about a C
standard that does bad things.

Inlining makes the use of the argument go away at the call site and
moves the code of the function into the body. That's how things
*work*. That's literally the meaning of inlining.

And inlining in C is so important because macros are weak, and other
facilities like templates don't exist.

But in the kernel, we also often use it because the actual semantics
of "not a function call" in terms of code generation is also important
(ie we have literal cases where "not generating the 'call'
instruction" is a correctness issue).

If the C standard thinks "undefined argument even for inlining use is
UB", then it's a case of that paperwork that doesn't reflect reality,
and we'll treat it with the deference it deserves - is less than
toilet paper.

We have decades of history of doing that in the kernel. Sometimes the
standards are just wrong, sometimes they are just too far removed from
reality to be relevant, and then it's just not worth worrying about
them.

          Linus
Alexander Potapenko Aug. 31, 2022, 1:32 p.m. UTC | #8
On Fri, Aug 26, 2022 at 9:41 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, Aug 25, 2022 at 3:10 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > But UB is defined in terms of the abstract machine (like *all* of C),
> > not in terms of the generated machine code.  Typically things will work
> > fine if they "become invisible" by inlining, but this does not make the
> > program a correct program ever.  Sorry :-(
>
> Yeah, and the abstract machine model based on "abstract syntax" is
> just wrong, wrong, wrong.
>
> I really wish the C standard people had the guts to just fix it.  At
> some point, relying on tradition when the tradition is bad is not a
> great thing.
>
> It's the same problem that made all the memory ordering discussions
> completely untenable. The language to allow the whole data dependency
> was completely ridiculous, because it became about the C language
> syntax and theory, not about the actual code generation and actual
> *meaning* that the whole thing was *about*.
>
> Java may be a horrible language that a lot of people hate, but it
> avoided a lot of problems by just making things about an actual
> virtual machine and describing things within a more concrete model of
> a virtual machine.
>
> Then you can just say "this code sequence generates this set of
> operations, and the compiler can optimize it any which way it likes as
> long as the end result is equivalent".
>
> Oh well.
>
> I will repeat: a paper standard that doesn't take reality into account
> is less useful than toilet paper. It's scratchy and not very
> absorbent.
>
> And the kernel will continue to care more about reality than about a C
> standard that does bad things.
>
> Inlining makes the use of the argument go away at the call site and
> moves the code of the function into the body. That's how things
> *work*. That's literally the meaning of inlining.
>
> And inlining in C is so important because macros are weak, and other
> facilities like templates don't exist.
>
> But in the kernel, we also often use it because the actual semantics
> of "not a function call" in terms of code generation is also important
> (ie we have literal cases where "not generating the 'call'
> instruction" is a correctness issue).
>
> If the C standard thinks "undefined argument even for inlining use is
> UB", then it's a case of that paperwork that doesn't reflect reality,
> and we'll treat it with the deference it deserves - is less than
> toilet paper.

Just for posterity, in the case of KMSAN we are only dealing with
cases where the function call survived inlining and dead code
elimination.


> We have decades of history of doing that in the kernel. Sometimes the
> standards are just wrong, sometimes they are just too far removed from
> reality to be relevant, and then it's just not worth worrying about
> them.
>
>           Linus
diff mbox series

Patch

diff --git a/fs/buffer.c b/fs/buffer.c
index 898c7f301b1b9..d014009cff941 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2349,7 +2349,7 @@  int generic_cont_expand_simple(struct inode *inode, loff_t size)
 	struct address_space *mapping = inode->i_mapping;
 	const struct address_space_operations *aops = mapping->a_ops;
 	struct page *page;
-	void *fsdata;
+	void *fsdata = NULL;
 	int err;
 
 	err = inode_newsize_ok(inode, size);
@@ -2375,7 +2375,7 @@  static int cont_expand_zero(struct file *file, struct address_space *mapping,
 	const struct address_space_operations *aops = mapping->a_ops;
 	unsigned int blocksize = i_blocksize(inode);
 	struct page *page;
-	void *fsdata;
+	void *fsdata = NULL;
 	pgoff_t index, curidx;
 	loff_t curpos;
 	unsigned zerofrom, offset, len;
diff --git a/fs/namei.c b/fs/namei.c
index 6b39dfd3b41bc..5e3ff9d65f502 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -5051,7 +5051,7 @@  int page_symlink(struct inode *inode, const char *symname, int len)
 	const struct address_space_operations *aops = mapping->a_ops;
 	bool nofs = !mapping_gfp_constraint(mapping, __GFP_FS);
 	struct page *page;
-	void *fsdata;
+	void *fsdata = NULL;
 	int err;
 	unsigned int flags;
 
diff --git a/mm/filemap.c b/mm/filemap.c
index ffdfbc8b0e3ca..72467f00f1916 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3753,7 +3753,7 @@  ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
 		unsigned long offset;	/* Offset into pagecache page */
 		unsigned long bytes;	/* Bytes to write to page */
 		size_t copied;		/* Bytes copied from user */
-		void *fsdata;
+		void *fsdata = NULL;
 
 		offset = (pos & (PAGE_SIZE - 1));
 		bytes = min_t(unsigned long, PAGE_SIZE - offset,