
[1/3] 9p: Annotate data-racy writes to file::f_flags on fd mount

Message ID 20231023233704.1185154-2-asmadeus@codewreck.org (mailing list archive)
State New, archived
Series Small patches for 6.7

Commit Message

Dominique Martinet Oct. 23, 2023, 11:37 p.m. UTC
From: Marco Elver <elver@google.com>

syzbot reported:

 | BUG: KCSAN: data-race in p9_fd_create / p9_fd_create
 |
 | read-write to 0xffff888130fb3d48 of 4 bytes by task 15599 on cpu 0:
 |  p9_fd_open net/9p/trans_fd.c:842 [inline]
 |  p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092
 |  p9_client_create+0x595/0xa70 net/9p/client.c:1010
 |  v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410
 |  v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123
 |  legacy_get_tree+0x74/0xd0 fs/fs_context.c:611
 |  vfs_get_tree+0x51/0x190 fs/super.c:1519
 |  do_new_mount+0x203/0x660 fs/namespace.c:3335
 |  path_mount+0x496/0xb30 fs/namespace.c:3662
 |  do_mount fs/namespace.c:3675 [inline]
 |  __do_sys_mount fs/namespace.c:3884 [inline]
 |  [...]
 |
 | read-write to 0xffff888130fb3d48 of 4 bytes by task 15563 on cpu 1:
 |  p9_fd_open net/9p/trans_fd.c:842 [inline]
 |  p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092
 |  p9_client_create+0x595/0xa70 net/9p/client.c:1010
 |  v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410
 |  v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123
 |  legacy_get_tree+0x74/0xd0 fs/fs_context.c:611
 |  vfs_get_tree+0x51/0x190 fs/super.c:1519
 |  do_new_mount+0x203/0x660 fs/namespace.c:3335
 |  path_mount+0x496/0xb30 fs/namespace.c:3662
 |  do_mount fs/namespace.c:3675 [inline]
 |  __do_sys_mount fs/namespace.c:3884 [inline]
 |  [...]
 |
 | value changed: 0x00008002 -> 0x00008802

Within p9_fd_open(), O_NONBLOCK is added to f_flags of the read and
write files. This may happen concurrently if, e.g., the mounting process
modifies the fd in another thread.

Mark the plain read-modify-writes as intentional data races, on the
assumption that executing the accesses concurrently always yields the
same result even though the accesses themselves are not atomic.

Reported-by: syzbot+e441aeeb422763cc5511@syzkaller.appspotmail.com
Signed-off-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/ZO38mqkS0TYUlpFp@elver.google.com
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
---

Hi Marco, sorry for taking ages to process this patch. I've reworded the
commit message a bit and added a comment, so given this has your name on
it please have a look.
I'm planning to send this to Linus during the merge window next week.

 net/9p/trans_fd.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
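
The assumption stated in the commit message, that two concurrent `f_flags |= O_NONBLOCK` updates end in the same value regardless of ordering, can be checked with a small userspace model. This is only a sketch: `demo_replay`, `demo_all_interleavings_agree` and the `DEMO_*` constants are hypothetical names, the initial value is taken from the KCSAN report, and 0x800 is assumed to be the x86 value of O_NONBLOCK. The model treats each update as one whole load followed by one whole store and does not cover compiler effects such as torn or invented stores.

```c
#include <assert.h>

/* Assumed values: x86 O_NONBLOCK and the initial f_flags from the report. */
#define DEMO_O_NONBLOCK 0x00000800u
#define DEMO_INITIAL    0x00008002u

/*
 * Replay one interleaving of two non-atomic "flags |= DEMO_O_NONBLOCK"
 * updates. Each update is a load followed by a store. `order` lists which
 * task (0 or 1) takes each of the four steps; a task's first step is its
 * load and its second step is its store.
 */
static unsigned int demo_replay(const int order[4])
{
	unsigned int flags = DEMO_INITIAL;
	unsigned int loaded[2] = { 0, 0 };
	int steps_done[2] = { 0, 0 };

	for (int i = 0; i < 4; i++) {
		int t = order[i];

		if (steps_done[t] == 0)
			loaded[t] = flags;                   /* load */
		else
			flags = loaded[t] | DEMO_O_NONBLOCK; /* store */
		steps_done[t]++;
	}
	return flags;
}

/* Return 1 if every interleaving ends with the same final value. */
static int demo_all_interleavings_agree(void)
{
	/* All six ways to interleave two load/store pairs. */
	static const int orders[6][4] = {
		{ 0, 0, 1, 1 }, { 0, 1, 0, 1 }, { 0, 1, 1, 0 },
		{ 1, 0, 0, 1 }, { 1, 0, 1, 0 }, { 1, 1, 0, 0 },
	};

	for (int i = 0; i < 6; i++)
		if (demo_replay(orders[i]) != (DEMO_INITIAL | DEMO_O_NONBLOCK))
			return 0;
	return 1;
}
```

Calling `demo_all_interleavings_agree()` from a test harness returns 1 under these assumptions, matching the `0x00008002 -> 0x00008802` change in the report. The result holds only because both tasks OR in the same bit.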

Comments

Marco Elver Oct. 24, 2023, 7:12 a.m. UTC | #1
On Tue, 24 Oct 2023 at 01:37, Dominique Martinet <asmadeus@codewreck.org> wrote:
>
> From: Marco Elver <elver@google.com>
>
> syzbot reported:
>
>  | BUG: KCSAN: data-race in p9_fd_create / p9_fd_create
>  |
>  | read-write to 0xffff888130fb3d48 of 4 bytes by task 15599 on cpu 0:
>  |  p9_fd_open net/9p/trans_fd.c:842 [inline]
>  |  p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092
>  |  p9_client_create+0x595/0xa70 net/9p/client.c:1010
>  |  v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410
>  |  v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123
>  |  legacy_get_tree+0x74/0xd0 fs/fs_context.c:611
>  |  vfs_get_tree+0x51/0x190 fs/super.c:1519
>  |  do_new_mount+0x203/0x660 fs/namespace.c:3335
>  |  path_mount+0x496/0xb30 fs/namespace.c:3662
>  |  do_mount fs/namespace.c:3675 [inline]
>  |  __do_sys_mount fs/namespace.c:3884 [inline]
>  |  [...]
>  |
>  | read-write to 0xffff888130fb3d48 of 4 bytes by task 15563 on cpu 1:
>  |  p9_fd_open net/9p/trans_fd.c:842 [inline]
>  |  p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092
>  |  p9_client_create+0x595/0xa70 net/9p/client.c:1010
>  |  v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410
>  |  v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123
>  |  legacy_get_tree+0x74/0xd0 fs/fs_context.c:611
>  |  vfs_get_tree+0x51/0x190 fs/super.c:1519
>  |  do_new_mount+0x203/0x660 fs/namespace.c:3335
>  |  path_mount+0x496/0xb30 fs/namespace.c:3662
>  |  do_mount fs/namespace.c:3675 [inline]
>  |  __do_sys_mount fs/namespace.c:3884 [inline]
>  |  [...]
>  |
>  | value changed: 0x00008002 -> 0x00008802
>
> Within p9_fd_open(), O_NONBLOCK is added to f_flags of the read and
> write files. This may happen concurrently if, e.g., the mounting process
> modifies the fd in another thread.
>
> Mark the plain read-modify-writes as intentional data races, on the
> assumption that executing the accesses concurrently always yields the
> same result even though the accesses themselves are not atomic.
>
> Reported-by: syzbot+e441aeeb422763cc5511@syzkaller.appspotmail.com
> Signed-off-by: Marco Elver <elver@google.com>
> Link: https://lore.kernel.org/r/ZO38mqkS0TYUlpFp@elver.google.com
> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
> ---
>
> Hi Marco, sorry for taking ages to process this patch. I've reworded the
> commit message a bit and added a comment, so given this has your name on
> it please have a look.
> I'm planning to send this to Linus during the merge window next week

No worries, and thank you!

>  net/9p/trans_fd.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> index f226953577b2..d89c88986950 100644
> --- a/net/9p/trans_fd.c
> +++ b/net/9p/trans_fd.c
> @@ -836,14 +836,16 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd)
>                 goto out_free_ts;
>         if (!(ts->rd->f_mode & FMODE_READ))
>                 goto out_put_rd;
> -       /* prevent workers from hanging on IO when fd is a pipe */
> -       ts->rd->f_flags |= O_NONBLOCK;
> +       /* Prevent workers from hanging on IO when fd is a pipe

Add '.' at end of sentence(s)?

> +        * We don't support userspace messing with the fd after passing it
> +        * to mount, so flag possible data race for KCSAN */

The comment should explain why the data race is safe, independent of
KCSAN. I still don't quite get why it's safe.

The case that syzbot found was 2 concurrent mounts. Is that also disallowed?

Maybe something like: "We don't support userspace messing with the fd
after passing it to the first mount. While it's not officially
supported, concurrent modification of flags is unlikely to break this
code. So that tooling (like KCSAN) knows about it, mark them as
intentional data races."

> +       data_race(ts->rd->f_flags |= O_NONBLOCK);
>         ts->wr = fget(wfd);
>         if (!ts->wr)
>                 goto out_put_rd;
>         if (!(ts->wr->f_mode & FMODE_WRITE))
>                 goto out_put_wr;
> -       ts->wr->f_flags |= O_NONBLOCK;
> +       data_race(ts->wr->f_flags |= O_NONBLOCK);
>
>         client->trans = ts;
>         client->status = Connected;
> --
> 2.41.0
>
Dominique Martinet Oct. 24, 2023, 7:44 a.m. UTC | #2
Marco Elver wrote on Tue, Oct 24, 2023 at 09:12:56AM +0200:
> > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> > index f226953577b2..d89c88986950 100644
> > --- a/net/9p/trans_fd.c
> > +++ b/net/9p/trans_fd.c
> > @@ -836,14 +836,16 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd)
> >                 goto out_free_ts;
> >         if (!(ts->rd->f_mode & FMODE_READ))
> >                 goto out_put_rd;
> > -       /* prevent workers from hanging on IO when fd is a pipe */
> > -       ts->rd->f_flags |= O_NONBLOCK;
> > +       /* Prevent workers from hanging on IO when fd is a pipe
> 
> Add '.' at end of sentence(s)?

I don't think we have any rule about this in the 9p part of the tree,
looking around there seem to be more comments without '.' than with, but
it's never too late to start... I'll add some in a v2 after we've agreed
with the rest.

> 
> > +        * We don't support userspace messing with the fd after passing it
> > +        * to mount, so flag possible data race for KCSAN */
> 
> The comment should explain why the data race is safe, independent of
> KCSAN. I still don't quite get why it's safe.

I guess it depends on what we call 'safe' here: if we agree the worst
thing that can happen is weird flags being set when we didn't request
them and socket operations behaving oddly (on the level of blocking when
they shouldn't), we don't care because there's no way to make concurrent
usage of the fd work anyway.
If it's possible to get an invalid value there such that a socket
operation ends up executing user-controlled code somewhere, then we've
got a bigger problem and we should take some lock (presumably the same
lock fcntl(F_SETFL) is taking, as that's got more potential for breakage
than another mount in my opinion).

> The case that syzbot found was 2 concurrent mounts. Is that also disallowed?

Yes, there's no way you'll get a working filesystem out of two mounts
using the same fd as the protocol has no muxing

> Maybe something like: "We don't support userspace messing with the fd
> after passing it to the first mount. While it's not officially
> supported, concurrent modification of flags is unlikely to break this
> code. So that tooling (like KCSAN) knows about it, mark them as
> intentional data races."

I'd word this as much more likely to break; how about something like this?

	/* Prevent workers from hanging on IO when fd is a pipe.
	 * It's technically possible for userspace or concurrent mounts to
	 * modify this flag concurrently, which will likely result in a
	 * broken filesystem. However, just having bad flags here should
	 * not crash the kernel or cause any other sort of bug, so mark this
	 * particular data race as intentional so that tooling (like KCSAN)
	 * can allow it and detect further problems.
	 */

Thanks,
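
The "weird flags" outcome described above can be made concrete with a small userspace model. This is a sketch under assumptions, not kernel code: `demo_racy_update` and the `DEMO_*` constants are hypothetical, the flag values are the x86 ones, and the second writer stands in for any concurrent change to a different flag bit (for example, userspace toggling O_APPEND). Both tasks load the old value before either stores, so whichever store lands last overwrites the other's update.

```c
#include <assert.h>

#define DEMO_O_NONBLOCK 0x00000800u /* assumed: x86 value */
#define DEMO_O_APPEND   0x00000400u /* assumed: x86 value */
#define DEMO_INITIAL    0x00008002u /* initial value from the KCSAN report */

/*
 * Task A (the mount path) ORs in DEMO_O_NONBLOCK; task B (a hypothetical
 * concurrent flag change) ORs in DEMO_O_APPEND. Both load before either
 * stores; `a_stores_last` selects which store lands last.
 */
static unsigned int demo_racy_update(int a_stores_last)
{
	unsigned int flags = DEMO_INITIAL;
	unsigned int a_loaded = flags; /* A's load */
	unsigned int b_loaded = flags; /* B's load */

	if (a_stores_last) {
		flags = b_loaded | DEMO_O_APPEND;
		flags = a_loaded | DEMO_O_NONBLOCK; /* B's update is lost */
	} else {
		flags = a_loaded | DEMO_O_NONBLOCK;
		flags = b_loaded | DEMO_O_APPEND;   /* A's update is lost */
	}
	return flags;
}
```

With `a_stores_last == 0` the result lacks DEMO_O_NONBLOCK, which corresponds to the "block when they shouldn't" case: the value is still a valid combination of flag bits, just not the one the mount path asked for.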
Marco Elver Oct. 24, 2023, 7:49 a.m. UTC | #3
On Tue, 24 Oct 2023 at 09:44, Dominique Martinet <asmadeus@codewreck.org> wrote:
>
> Marco Elver wrote on Tue, Oct 24, 2023 at 09:12:56AM +0200:
> > > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> > > index f226953577b2..d89c88986950 100644
> > > --- a/net/9p/trans_fd.c
> > > +++ b/net/9p/trans_fd.c
> > > @@ -836,14 +836,16 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd)
> > >                 goto out_free_ts;
> > >         if (!(ts->rd->f_mode & FMODE_READ))
> > >                 goto out_put_rd;
> > > -       /* prevent workers from hanging on IO when fd is a pipe */
> > > -       ts->rd->f_flags |= O_NONBLOCK;
> > > +       /* Prevent workers from hanging on IO when fd is a pipe
> >
> > Add '.' at end of sentence(s)?
>
> I don't think we have any rule about this in the 9p part of the tree,
> looking around there seem to be more comments without '.' than with, but
> it's never too late to start... I'll add some in a v2 after we've agreed
> with the rest.

Sounds good.
I think if the comment is 1 short sentence (1 line), punctuation is more
or less optional. But I'd insist on punctuation as soon as there are 2 or
more sentences.

> >
> > > +        * We don't support userspace messing with the fd after passing it
> > > +        * to mount, so flag possible data race for KCSAN */
> >
> > The comment should explain why the data race is safe, independent of
> > KCSAN. I still don't quite get why it's safe.
>
> I guess it depends on what we call 'safe' here: if we agree the worst
> thing that can happen is weird flags being set when we didn't request
> them and socket operations behaving oddly (on the level of blocking when
> they shouldn't), we don't care because there's no way to make concurrent
> usage of the fd work anyway.

Yes, that's reasonable.

> If it's possible to get an invalid value there such that a socket
> operation ends up executing user-controlled code somewhere, then we've
> got a bigger problem and we should take some lock (presumably the same
> lock fcntl(F_SETFL) is taking, as that's got more potential for breakage
> than another mount in my opinion).
>
> > The case that syzbot found was 2 concurrent mounts. Is that also disallowed?
>
> Yes, there's no way you'll get a working filesystem out of two mounts
> using the same fd as the protocol has no muxing
>
> > Maybe something like: "We don't support userspace messing with the fd
> > after passing it to the first mount. While it's not officially
> > supported, concurrent modification of flags is unlikely to break this
> > code. So that tooling (like KCSAN) knows about it, mark them as
> > intentional data races."
>
> I'd word this as much more likely to break; how about something like this?
>
>         /* Prevent workers from hanging on IO when fd is a pipe.
>          * It's technically possible for userspace or concurrent mounts to
>          * modify this flag concurrently, which will likely result in a
>          * broken filesystem. However, just having bad flags here should
>          * not crash the kernel or cause any other sort of bug, so mark this
>          * particular data race as intentional so that tooling (like KCSAN)
>          * can allow it and detect further problems.
>          */

I think this sounds much clearer. Thanks!

Patch

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index f226953577b2..d89c88986950 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -836,14 +836,16 @@  static int p9_fd_open(struct p9_client *client, int rfd, int wfd)
 		goto out_free_ts;
 	if (!(ts->rd->f_mode & FMODE_READ))
 		goto out_put_rd;
-	/* prevent workers from hanging on IO when fd is a pipe */
-	ts->rd->f_flags |= O_NONBLOCK;
+	/* Prevent workers from hanging on IO when fd is a pipe
+	 * We don't support userspace messing with the fd after passing it
+	 * to mount, so flag possible data race for KCSAN */
+	data_race(ts->rd->f_flags |= O_NONBLOCK);
 	ts->wr = fget(wfd);
 	if (!ts->wr)
 		goto out_put_rd;
 	if (!(ts->wr->f_mode & FMODE_WRITE))
 		goto out_put_wr;
-	ts->wr->f_flags |= O_NONBLOCK;
+	data_race(ts->wr->f_flags |= O_NONBLOCK);
 
 	client->trans = ts;
 	client->status = Connected;
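
For readers unfamiliar with how a `data_race()`-style wrapper can enclose a compound assignment such as `f_flags |= O_NONBLOCK`, here is a loose userspace approximation. Everything in it is hypothetical: `demo_data_race`, the `demo_checker_*` hooks and the counters are illustrative stand-ins, and the real kernel macro is defined differently. The sketch relies on GNU C statement expressions and `__typeof__`, so it assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-ins for race-checker hooks; they only keep counts. */
static int demo_checker_depth;
static int demo_suppressed_accesses;

static void demo_checker_disable(void)
{
	demo_checker_depth++;
}

static void demo_checker_enable(void)
{
	demo_checker_depth--;
	demo_suppressed_accesses++;
}

/*
 * Userspace approximation of a data_race()-style wrapper: checking is
 * switched off around the expression, and the expression's value is
 * returned. __typeof__ does not evaluate its operand here, so `expr`
 * runs exactly once.
 */
#define demo_data_race(expr)                      \
	({                                        \
		demo_checker_disable();           \
		__typeof__(expr) demo_v = (expr); \
		demo_checker_enable();            \
		demo_v;                           \
	})

/* Mirrors the shape of the patched line: a marked read-modify-write. */
static unsigned int demo_mark_nonblock(unsigned int *flags)
{
	return demo_data_race(*flags |= 0x800u); /* 0x800: assumed x86 O_NONBLOCK */
}
```

The wrapper changes nothing about the generated read-modify-write itself; in this model it only brackets the access so a checker would skip it, which is why the thread asks for a comment explaining why skipping is acceptable.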