diff mbox series

vfs: move_mount: reject moving kernel internal mounts

Message ID 20190629202744.12396-1-ebiggers@kernel.org (mailing list archive)
State New, archived
Headers show
Series vfs: move_mount: reject moving kernel internal mounts | expand

Commit Message

Eric Biggers June 29, 2019, 8:27 p.m. UTC
From: Eric Biggers <ebiggers@google.com>

sys_move_mount() crashes by dereferencing the pointer MNT_NS_INTERNAL,
a.k.a. ERR_PTR(-EINVAL), if the old mount is specified by fd for a
kernel object with an internal mount, such as a pipe or memfd.

Fix it by checking for this case and returning -EINVAL.

Reproducer:

    #include <unistd.h>

    #define __NR_move_mount         429
    #define MOVE_MOUNT_F_EMPTY_PATH 0x00000004

    int main()
    {
    	int fds[2];

    	pipe(fds);
        syscall(__NR_move_mount, fds[0], "", -1, "/", MOVE_MOUNT_F_EMPTY_PATH);
    }

Reported-by: syzbot+6004acbaa1893ad013f0@syzkaller.appspotmail.com
Fixes: 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around")
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Al Viro June 29, 2019, 8:39 p.m. UTC | #1
On Sat, Jun 29, 2019 at 01:27:44PM -0700, Eric Biggers wrote:

> @@ -2600,7 +2600,7 @@ static int do_move_mount(struct path *old_path, struct path *new_path)
>  	if (attached && !check_mnt(old))
>  		goto out;
>  
> -	if (!attached && !(ns && is_anon_ns(ns)))
> +	if (!attached && !(ns && ns != MNT_NS_INTERNAL && is_anon_ns(ns)))
>  		goto out;
>  
>  	if (old->mnt.mnt_flags & MNT_LOCKED)

*UGH*

Applied, but that code is getting really ugly ;-/
Al Viro July 1, 2019, 1:08 a.m. UTC | #2
On Sat, Jun 29, 2019 at 09:39:16PM +0100, Al Viro wrote:
> On Sat, Jun 29, 2019 at 01:27:44PM -0700, Eric Biggers wrote:
> 
> > @@ -2600,7 +2600,7 @@ static int do_move_mount(struct path *old_path, struct path *new_path)
> >  	if (attached && !check_mnt(old))
> >  		goto out;
> >  
> > -	if (!attached && !(ns && is_anon_ns(ns)))
> > +	if (!attached && !(ns && ns != MNT_NS_INTERNAL && is_anon_ns(ns)))
> >  		goto out;
> >  
> >  	if (old->mnt.mnt_flags & MNT_LOCKED)
> 
> *UGH*
> 
> Applied, but that code is getting really ugly ;-/

FWIW, it's too ugly and confusing.  Look:
        /* The mountpoint must be in our namespace. */
        if (!check_mnt(p))
                goto out;

        /* The thing moved should be either ours or completely unattached. */
        if (attached && !check_mnt(old))
                goto out;

        if (!attached && !(ns && ns != MNT_NS_INTERNAL && is_anon_ns(ns)))
                goto out;

OK, the first check is sane and understandable.  But let's look at what's
coming after it.  We have two cases:
	1) attached.  IOW, old->mnt_parent != old.  In that case we
require old->mnt_ns == current mnt_ns.  Anything else is rejected.
	2) !attached.  old->mnt_parent == old.  In that case we
require old->mnt_ns to be an anon namespace.

Let's reorder that a bit:
        /* The mountpoint must be in our namespace. */
        if (!check_mnt(p))
                goto out;

	/* The thing moved must be mounted... */
	if (!is_mounted(old_path->mnt))
		goto out;

        /* ... and either ours or the root of anon namespace */
	if (!(attached ? check_mnt(old) : is_anon_ns(ns)))
		goto out;

IMO that looks saner and all it costs us is a redundant check
in attached case.  Objections?
David Howells July 1, 2019, 7:38 a.m. UTC | #3
Al Viro <viro@zeniv.linux.org.uk> wrote:

> 	/* The thing moved must be mounted... */
> 	if (!is_mounted(old_path->mnt))
> 		goto out;

Um...  Doesn't that stuff up fsmount()?

David
Al Viro July 1, 2019, 11:19 a.m. UTC | #4
On Mon, Jul 01, 2019 at 08:38:10AM +0100, David Howells wrote:
> Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
> > 	/* The thing moved must be mounted... */
> > 	if (!is_mounted(old_path->mnt))
> > 		goto out;
> 
> Um...  Doesn't that stuff up fsmount()?

Nope - check is_mounted() definition.  Stuff in anon namespace
*is* mounted there, so that's not a problem.

FWIW, is_mounted() would've been better off spelled as
ns != NULL && ns != MNT_NS_INTERNAL; the use of IS_ERR_OR_NULL
in there works, but is unidiomatic and I don't think it yields
better code...
Eric Biggers July 1, 2019, 3:43 p.m. UTC | #5
On Mon, Jul 01, 2019 at 02:08:48AM +0100, Al Viro wrote:
> 
> Let's reorder that a bit:
>         /* The mountpoint must be in our namespace. */
>         if (!check_mnt(p))
>                 goto out;
> 
> 	/* The thing moved must be mounted... */
> 	if (!is_mounted(old_path->mnt))
> 		goto out;
> 
>         /* ... and either ours or the root of anon namespace */
> 	if (!(attached ? check_mnt(old) : is_anon_ns(ns)))
> 		goto out;
> 
> IMO that looks saner and all it costs us is a redundant check
> in attached case.  Objections?

Looks good to me.

- Eric
Eric Biggers July 1, 2019, 4:45 p.m. UTC | #6
On Sat, Jun 29, 2019 at 01:27:44PM -0700, Eric Biggers wrote:
> 
> Reproducer:
> 
>     #include <unistd.h>
> 
>     #define __NR_move_mount         429
>     #define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
> 
>     int main()
>     {
>     	  int fds[2];
> 
>     	  pipe(fds);
>         syscall(__NR_move_mount, fds[0], "", -1, "/", MOVE_MOUNT_F_EMPTY_PATH);
>     }

David, I'd like to add this as a regression test somewhere.

Can you point me to the tests for the new mount syscalls?

I checked LTP, kselftests, and xfstests, but nothing to be found.

- Eric
Al Viro July 1, 2019, 6:22 p.m. UTC | #7
On Mon, Jul 01, 2019 at 09:45:37AM -0700, Eric Biggers wrote:
> On Sat, Jun 29, 2019 at 01:27:44PM -0700, Eric Biggers wrote:
> > 
> > Reproducer:
> > 
> >     #include <unistd.h>
> > 
> >     #define __NR_move_mount         429
> >     #define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
> > 
> >     int main()
> >     {
> >     	  int fds[2];
> > 
> >     	  pipe(fds);
> >         syscall(__NR_move_mount, fds[0], "", -1, "/", MOVE_MOUNT_F_EMPTY_PATH);
> >     }
> 
> David, I'd like to add this as a regression test somewhere.
> 
> Can you point me to the tests for the new mount syscalls?
> 
> I checked LTP, kselftests, and xfstests, but nothing to be found.

FWIW, it's not just move_mount(2) - I'd expect

	int fds[2];
	char s[80];

	pipe(fds);
	sprintf(s, "/dev/fd/%d", fds[0]);
	mount(s, "/dev/null", NULL, MS_MOVE, 0);

to step into exactly the same thing.  mount(2) does follow symlinks -
always had...
Al Viro July 1, 2019, 7:20 p.m. UTC | #8
On Mon, Jul 01, 2019 at 07:22:39PM +0100, Al Viro wrote:

> FWIW, it's not just move_mount(2) - I'd expect
> 
> 	int fds[2];
> 	char s[80];
> 
> 	pipe(fds);
> 	sprintf(s, "/dev/fd/%d", fds[0]);
> 	mount(s, "/dev/null", NULL, MS_MOVE, 0);
> 
> to step into exactly the same thing.  mount(2) does follow symlinks -
> always had...

The same goes for e.g.

#define _GNU_SOURCE
#include <sched.h>
#include <sys/mount.h>
#include <stdio.h>
#include <sys/epoll.h>

main()
{
	char s[80];
	unshare(CLONE_NEWNS);	// so nobody else gets confused
 	sprintf(s, "/dev/fd/%d", epoll_create1(0));
 	mount(s, "/dev/null", NULL, MS_MOVE, 0);	// see if it oopses
}

modulo error-checking, etc.
Eric Biggers July 2, 2019, 6:22 p.m. UTC | #9
On Mon, Jul 01, 2019 at 07:22:39PM +0100, Al Viro wrote:
> On Mon, Jul 01, 2019 at 09:45:37AM -0700, Eric Biggers wrote:
> > On Sat, Jun 29, 2019 at 01:27:44PM -0700, Eric Biggers wrote:
> > > 
> > > Reproducer:
> > > 
> > >     #include <unistd.h>
> > > 
> > >     #define __NR_move_mount         429
> > >     #define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
> > > 
> > >     int main()
> > >     {
> > >     	  int fds[2];
> > > 
> > >     	  pipe(fds);
> > >         syscall(__NR_move_mount, fds[0], "", -1, "/", MOVE_MOUNT_F_EMPTY_PATH);
> > >     }
> > 
> > David, I'd like to add this as a regression test somewhere.
> > 
> > Can you point me to the tests for the new mount syscalls?
> > 
> > I checked LTP, kselftests, and xfstests, but nothing to be found.
> 
> FWIW, it's not just move_mount(2) - I'd expect
> 
> 	int fds[2];
> 	char s[80];
> 
> 	pipe(fds);
> 	sprintf(s, "/dev/fd/%d", fds[0]);
> 	mount(s, "/dev/null", NULL, MS_MOVE, 0);
> 
> to step into exactly the same thing.  mount(2) does follow symlinks -
> always had...

Sure, but the new mount syscalls still need tests.  Where are the tests?

Also, since the case of a fd with an internal mount was overlooked, probably the
man page needs to be updated clarify that move_mount(2) fails with EINVAL in
this case.  Where is the man page?

- Eric
David Howells July 5, 2019, 9:01 a.m. UTC | #10
Eric Biggers <ebiggers@kernel.org> wrote:

> Also, since the case of a fd with an internal mount was overlooked, probably
> the man page needs to be updated clarify that move_mount(2) fails with
> EINVAL in this case.  Where is the man page?

See below.  I'm in the middle of updating the manpages I need to push.

David
---
'\" t
.\" Copyright (c) 2018 David Howells <dhowells@redhat.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH MOVE_MOUNT 2 2018-06-08 "Linux" "Linux Programmer's Manual"
.SH NAME
move_mount \- Move mount objects around the filesystem topology
.SH SYNOPSIS
.nf
.B #include <sys/types.h>
.br
.B #include <sys/mount.h>
.br
.B #include <unistd.h>
.br
.BR "#include <fcntl.h>           " "/* Definition of AT_* constants */"
.PP
.BI "int move_mount(int " from_dirfd ", const char *" from_pathname ","
.BI "               int " to_dirfd ", const char *" to_pathname ","
.BI "               unsigned int " flags );
.fi
.PP
.IR Note :
There are no glibc wrappers for these system calls.
.SH DESCRIPTION
The
.BR move_mount ()
call moves a mount from one place to another; it can also be used to attach an
unattached mount created by
.BR fsmount "() or " open_tree "() with " OPEN_TREE_CLONE .
.PP
If
.BR move_mount ()
is called repeatedly with a file descriptor that refers to a mount object,
then the object will be attached/moved the first time and then moved again and
again and again, detaching it from the previous mountpoint each time.
.PP
To access the source mount object or the destination mountpoint, no
permissions are required on the object itself, but if either pathname is
supplied, execute (search) permission is required on all of the directories
specified in
.IR from_pathname " or " to_pathname .
.PP
The caller does, however, require the appropriate capabilities or permission
to effect a mount.
.PP
.BR move_mount ()
uses
.IR from_pathname ", " from_dirfd " and some " flags
to locate the mount object to be moved and
.IR to_pathname ", " to_dirfd " and some other " flags
to locate the destination mountpoint.  Each lookup can be done in one of a
variety of ways:
.TP
[*] By absolute path.
The pathname points to an absolute path and the dirfd is ignored.  The file is
looked up by name, starting from the root of the filesystem as seen by the
calling process.
.TP
[*] By cwd-relative path.
The pathname points to a relative path and the dirfd is
.IR AT_FDCWD .
The file is looked up by name, starting from the current working directory.
.TP
[*] By dir-relative path.
The pathname points to relative path and the dirfd indicates a file descriptor
pointing to a directory.  The file is looked up by name, starting from the
directory specified by
.IR dirfd .
.TP
[*] By file descriptor.
The pathname points to "", the dirfd points directly to the mount object to
move or the destination mount point and the appropriate
.B *_EMPTY_PATH
flag is set.
.PP
.I flags
can be used to influence a path-based lookup.  A value for
.I flags
is constructed by OR'ing together zero or more of the following constants:
.TP
.BR MOVE_MOUNT_F_EMPTY_PATH
.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
If
.I from_pathname
is an empty string, operate on the file referred to by
.IR from_dirfd
(which may have been obtained using the
.BR open (2)
.B O_PATH
flag or
.BR open_tree ())
If
.I from_dirfd
is
.BR AT_FDCWD ,
the call operates on the current working directory.
In this case,
.I from_dirfd
can refer to any type of file, not just a directory.
This flag is Linux-specific; define
.B _GNU_SOURCE
.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
to obtain its definition.
.TP
.B MOVE_MOUNT_T_EMPTY_PATH
As above, but operating on
.IR to_pathname " and " to_dirfd .
.TP
.B MOVE_MOUNT_F_NO_AUTOMOUNT
Don't automount the terminal ("basename") component of
.I from_pathname
if it is a directory that is an automount point.  This allows a mount object
that has an automount point at its root to be moved and prevents unintended
triggering of an automount point.
The
.B MOVE_MOUNT_F_NO_AUTOMOUNT
flag has no effect if the automount point has already been mounted over.  This
flag is Linux-specific; define
.B _GNU_SOURCE
.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
to obtain its definition.
.TP
.B MOVE_MOUNT_T_NO_AUTOMOUNT
As above, but operating on
.IR to_pathname " and " to_dirfd .
This allows an automount point to be manually mounted over.
.TP
.B MOVE_MOUNT_F_SYMLINKS
If
.I from_pathname
is a symbolic link, then dereference it.  The default for
.BR move_mount ()
is to not follow symlinks.
.TP
.B MOVE_MOUNT_T_SYMLINKS
As above, but operating on
.IR to_pathname " and " to_dirfd .

.SH EXAMPLES
The
.BR move_mount ()
function can be used like the following:
.PP
.RS
.nf
move_mount(AT_FDCWD, "/a", AT_FDCWD, "/b", 0);
.fi
.RE
.PP
This would move the object mounted on "/a" to "/b".  It can also be used in
conjunction with
.BR open_tree "(2) or " open "(2) with " O_PATH :
.PP
.RS
.nf
fd = open_tree(AT_FDCWD, "/mnt", 0);
move_mount(fd, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
move_mount(fd, "", AT_FDCWD, "/mnt3", MOVE_MOUNT_F_EMPTY_PATH);
move_mount(fd, "", AT_FDCWD, "/mnt4", MOVE_MOUNT_F_EMPTY_PATH);
.fi
.RE
.PP
This would attach the path point for "/mnt" to fd, then it would move the
mount to "/mnt2", then move it to "/mnt3" and finally to "/mnt4".
.PP
It can also be used to attach new mounts:
.PP
.RS
.nf
sfd = fsopen("ext4", FSOPEN_CLOEXEC);
write(sfd, "s /dev/sda1");
write(sfd, "o user_xattr");
mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MS_NODEV);
move_mount(mfd, "", AT_FDCWD, "/home", MOVE_MOUNT_F_EMPTY_PATH);
.fi
.RE
.PP
Which would open the Ext4 filesystem mounted on "/dev/sda1", turn on user
extended attribute support and create a mount object for it.  Finally, the new
mount object would be attached with
.BR move_mount ()
to "/home".


.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH RETURN VALUE
On success, 0 is returned.  On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
.TP
.B EACCES
Search permission is denied for one of the directories
in the path prefix of
.IR pathname .
(See also
.BR path_resolution (7).)
.TP
.B EBADF
.IR from_dirfd " or " to_dirfd
is not a valid open file descriptor.
.TP
.B EFAULT
.IR from_pathname " or " to_pathname
is NULL or either one point to a location outside the process's accessible
address space.
.TP
.B EINVAL
Reserved flag specified in
.IR flags .
.TP
.B ELOOP
Too many symbolic links encountered while traversing the pathname.
.TP
.B ENAMETOOLONG
.IR from_pathname " or " to_pathname
is too long.
.TP
.B ENOENT
A component of
.IR from_pathname " or " to_pathname
does not exist, or one is an empty string and the appropriate
.B *_EMPTY_PATH
was not specified in
.IR flags .
.TP
.B ENOMEM
Out of memory (i.e., kernel memory).
.TP
.B ENOTDIR
A component of the path prefix of
.IR from_pathname " or " to_pathname
is not a directory or one or the other is relative and the appropriate
.I *_dirfd
is a file descriptor referring to a file other than a directory.
.SH VERSIONS
.BR move_mount ()
was added to Linux in kernel 4.18.
.SH CONFORMING TO
.BR move_mount ()
is Linux-specific.
.SH NOTES
Glibc does not (yet) provide a wrapper for the
.BR move_mount ()
system call; call it using
.BR syscall (2).
.SH SEE ALSO
.BR fsmount (2),
.BR fsopen (2),
.BR open_tree (2)
Eric Biggers July 9, 2019, 7:40 p.m. UTC | #11
On Tue, Jul 02, 2019 at 11:22:59AM -0700, Eric Biggers wrote:
> 
> Sure, but the new mount syscalls still need tests.  Where are the tests?
> 

Still waiting for an answer to this question.

Did we just release 6 new syscalls with no tests?

I don't understand how that is even remotely acceptable.

- Eric
Al Viro July 9, 2019, 8:54 p.m. UTC | #12
On Tue, Jul 09, 2019 at 12:40:01PM -0700, Eric Biggers wrote:
> On Tue, Jul 02, 2019 at 11:22:59AM -0700, Eric Biggers wrote:
> > 
> > Sure, but the new mount syscalls still need tests.  Where are the tests?
> > 
> 
> Still waiting for an answer to this question.

In samples/vfs/fsmount.c, IIRC, and that's not much of a test.
Eric Biggers July 10, 2019, 3:23 a.m. UTC | #13
On Tue, Jul 09, 2019 at 09:54:24PM +0100, Al Viro wrote:
> On Tue, Jul 09, 2019 at 12:40:01PM -0700, Eric Biggers wrote:
> > On Tue, Jul 02, 2019 at 11:22:59AM -0700, Eric Biggers wrote:
> > > 
> > > Sure, but the new mount syscalls still need tests.  Where are the tests?
> > > 
> > 
> > Still waiting for an answer to this question.
> 
> In samples/vfs/fsmount.c, IIRC, and that's not much of a test.

A sample program doesn't count.  There need to be tests that can be run
automatically as part of a well known test suite, such as LTP, kselftests, or
xfstests.  Why is this not mandatory for new syscalls to be accepted?  What if
they are broken and we don't know?  See what happened with copy_file_range():
https://lwn.net/Articles/774114/

- Eric
diff mbox series

Patch

diff --git a/fs/namespace.c b/fs/namespace.c
index 7660c2749c96..a7e5a44770a7 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2600,7 +2600,7 @@  static int do_move_mount(struct path *old_path, struct path *new_path)
 	if (attached && !check_mnt(old))
 		goto out;
 
-	if (!attached && !(ns && is_anon_ns(ns)))
+	if (!attached && !(ns && ns != MNT_NS_INTERNAL && is_anon_ns(ns)))
 		goto out;
 
 	if (old->mnt.mnt_flags & MNT_LOCKED)