diff mbox

[v3,2/7] userns: Simpilify MNT_NODEV handling.

Message ID 1442433764-80826-3-git-send-email-seth.forshee@canonical.com
State New
Headers show

Commit Message

Seth Forshee Sept. 16, 2015, 8:02 p.m. UTC
From: "Eric W. Biederman" <ebiederm@xmission.com>

- Consolidate the testing if a device node may be opened in a new
  function may_open_dev.

- Move the check for allowing access to device nodes on filesystems
  not mounted in the initial user namespace from mount time to open
  time and include it in may_open_dev.

This set of changes removes the implicit adding of MNT_NODEV which
simplifies the logic in fs/namespace.c and removes a potentially
problematic user visible difference in how normal and unprivileged
mount namespaces work.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/block_dev.c     |  2 +-
 fs/namei.c         |  9 ++++++++-
 fs/namespace.c     | 18 ++++--------------
 include/linux/fs.h |  1 +
 4 files changed, 14 insertions(+), 16 deletions(-)

Comments

Andy Lutomirski Sept. 17, 2015, 12:24 a.m. UTC | #1
On Wed, Sep 16, 2015 at 1:02 PM, Seth Forshee
<seth.forshee@canonical.com> wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
>
> - Consolidate the testing if a device node may be opened in a new
>   function may_open_dev.
>
> - Move the check for allowing access to device nodes on filesystems
>   not mounted in the initial user namespace from mount time to open
>   time and include it in may_open_dev.
>
> This set of changes removes the implicit adding of MNT_NODEV which
> simplifies the logic in fs/namespace.c and removes a potentially
> problematic user visible difference in how normal and unprivileged
> mount namespaces work.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

> -               /* Only in special cases allow devices from mounts
> -                * created outside the initial user namespace.
> -                */
> -               if (!(type->fs_flags & FS_USERNS_DEV_MOUNT)) {
> -                       flags |= MS_NODEV;
> -                       mnt_flags |= MNT_NODEV | MNT_LOCK_NODEV;
> -               }

This is an ABI change.  It's probably okay, but I think the commit
message should make it clear what's happening.


--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman Sept. 17, 2015, 12:54 a.m. UTC | #2
Andy Lutomirski <luto@amacapital.net> writes:

> On Wed, Sep 16, 2015 at 1:02 PM, Seth Forshee
> <seth.forshee@canonical.com> wrote:
>> From: "Eric W. Biederman" <ebiederm@xmission.com>
>>
>> - Consolidate the testing if a device node may be opened in a new
>>   function may_open_dev.
>>
>> - Move the check for allowing access to device nodes on filesystems
>>   not mounted in the initial user namespace from mount time to open
>>   time and include it in may_open_dev.
>>
>> This set of changes removes the implicit adding of MNT_NODEV which
>> simplifies the logic in fs/namespace.c and removes a potentially
>> problematic user visible difference in how normal and unprivileged
>> mount namespaces work.
>>
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
>> -               /* Only in special cases allow devices from mounts
>> -                * created outside the initial user namespace.
>> -                */
>> -               if (!(type->fs_flags & FS_USERNS_DEV_MOUNT)) {
>> -                       flags |= MS_NODEV;
>> -                       mnt_flags |= MNT_NODEV | MNT_LOCK_NODEV;
>> -               }
>
> This is an ABI change.  It's probably okay, but I think the commit
> message should make it clear what's happening.

You mean it should include in big flashing neon letters
            ***REGRESSION FIX***
?

It is longer in coming than I had hoped.  But that is part of the reason
I did not fix the security hole this way.  Getting the s_user_ns stuff
just so has been non-trivial.

I do agree that because this is a user visible change we do need to keep
our eyes peeled for pieces of userspace software that may depend on the
exact details of the current behavior.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Lutomirski Sept. 17, 2015, 10:15 p.m. UTC | #3
On Sep 16, 2015 6:01 PM, "Eric W. Biederman" <ebiederm@xmission.com> wrote:
>
> Andy Lutomirski <luto@amacapital.net> writes:
>
> > On Wed, Sep 16, 2015 at 1:02 PM, Seth Forshee
> > <seth.forshee@canonical.com> wrote:
> >> From: "Eric W. Biederman" <ebiederm@xmission.com>
> >>
> >> - Consolidate the testing if a device node may be opened in a new
> >>   function may_open_dev.
> >>
> >> - Move the check for allowing access to device nodes on filesystems
> >>   not mounted in the initial user namespace from mount time to open
> >>   time and include it in may_open_dev.
> >>
> >> This set of changes removes the implicit adding of MNT_NODEV which
> >> simplifies the logic in fs/namespace.c and removes a potentially
> >> problematic user visible difference in how normal and unprivileged
> >> mount namespaces work.
> >>
> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >
> >> -               /* Only in special cases allow devices from mounts
> >> -                * created outside the initial user namespace.
> >> -                */
> >> -               if (!(type->fs_flags & FS_USERNS_DEV_MOUNT)) {
> >> -                       flags |= MS_NODEV;
> >> -                       mnt_flags |= MNT_NODEV | MNT_LOCK_NODEV;
> >> -               }
> >
> > This is an ABI change.  It's probably okay, but I think the commit
> > message should make it clear what's happening.
>
> You mean it should include in big flashing neon letters
>             ***REGRESSION FIX***
> ?
>
> It is longer in coming than I had hoped.  But that is part of the reason
> I did not fix the security hole this way.  Getting the s_user_ns stuff
> just so has been non-trivial.
>
> I do agree that because this is a user visible change we do need to keep
> our eyes peeled for pieces of userspace software that may depend on the
> exact details of the current behavior.

Yeah.  "Removes a potentially problematic user visible difference"
sounds like the difference has been there forever.   If it needs to
get backported, people will appreciate it.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 22ea424ee741..26cee058dc02 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1730,7 +1730,7 @@  struct block_device *lookup_bdev(const char *pathname)
 	if (!S_ISBLK(inode->i_mode))
 		goto fail;
 	error = -EACCES;
-	if (path.mnt->mnt_flags & MNT_NODEV)
+	if (!may_open_dev(&path))
 		goto fail;
 	error = -ENOMEM;
 	bdev = bd_acquire(inode);
diff --git a/fs/namei.c b/fs/namei.c
index 726d211db484..fcc5751d6395 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2663,6 +2663,13 @@  int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 }
 EXPORT_SYMBOL(vfs_create);
 
+bool may_open_dev(const struct path *path)
+{
+	return !(path->mnt->mnt_flags & MNT_NODEV) &&
+		((path->mnt->mnt_sb->s_user_ns == &init_user_ns) ||
+		 (path->mnt->mnt_sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT));
+}
+
 static int may_open(struct path *path, int acc_mode, int flag)
 {
 	struct dentry *dentry = path->dentry;
@@ -2685,7 +2692,7 @@  static int may_open(struct path *path, int acc_mode, int flag)
 		break;
 	case S_IFBLK:
 	case S_IFCHR:
-		if (path->mnt->mnt_flags & MNT_NODEV)
+		if (!may_open_dev(path))
 			return -EACCES;
 		/*FALLTHRU*/
 	case S_IFIFO:
diff --git a/fs/namespace.c b/fs/namespace.c
index d023a353dc63..da70f7c4ece1 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2177,13 +2177,7 @@  static int do_remount(struct path *path, int flags, int mnt_flags,
 	}
 	if ((mnt->mnt.mnt_flags & MNT_LOCK_NODEV) &&
 	    !(mnt_flags & MNT_NODEV)) {
-		/* Was the nodev implicitly added in mount? */
-		if ((mnt->mnt_ns->user_ns != &init_user_ns) &&
-		    !(sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT)) {
-			mnt_flags |= MNT_NODEV;
-		} else {
-			return -EPERM;
-		}
+		return -EPERM;
 	}
 	if ((mnt->mnt.mnt_flags & MNT_LOCK_NOSUID) &&
 	    !(mnt_flags & MNT_NOSUID)) {
@@ -2396,13 +2390,6 @@  static int do_new_mount(struct path *path, const char *fstype, int flags,
 			put_filesystem(type);
 			return -EPERM;
 		}
-		/* Only in special cases allow devices from mounts
-		 * created outside the initial user namespace.
-		 */
-		if (!(type->fs_flags & FS_USERNS_DEV_MOUNT)) {
-			flags |= MS_NODEV;
-			mnt_flags |= MNT_NODEV | MNT_LOCK_NODEV;
-		}
 		if (type->fs_flags & FS_USERNS_VISIBLE) {
 			if (!fs_fully_visible(type, &mnt_flags))
 				return -EPERM;
@@ -3238,6 +3225,9 @@  static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags)
 		mnt_flags = mnt->mnt.mnt_flags;
 		if (mnt->mnt.mnt_sb->s_iflags & SB_I_NOEXEC)
 			mnt_flags &= ~(MNT_LOCK_NOSUID | MNT_LOCK_NOEXEC);
+		if (mnt->mnt.mnt_sb->s_user_ns != &init_user_ns &&
+		    !(mnt->mnt.mnt_sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT))
+			mnt_flags &= ~(MNT_LOCK_NODEV);
 
 		/* Verify the mount flags are equal to or more permissive
 		 * than the proposed new mount.
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 79c15ab2159d..5ec201e8308c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1537,6 +1537,7 @@  extern void dentry_unhash(struct dentry *dentry);
  */
 extern void inode_init_owner(struct inode *inode, const struct inode *dir,
 			umode_t mode);
+extern bool may_open_dev(const struct path *path);
 /*
  * VFS FS_IOC_FIEMAP helper definitions.
  */