diff mbox

vfs: Allow FS_USERNS_MOUNT filesystems to create inodes with unmapped uid/gids

Message ID 20161115133221.GA23813@fedora-21-dvm (mailing list archive)
State New, archived
Headers show

Commit Message

Johanna Abrahamsson Nov. 15, 2016, 1:32 p.m. UTC
Fixes: 036d523641c6 ("vfs: Don't create inodes with a uid or gid unknown
to the vfs")

This fixes the bugzilla bug #183461
(https://bugzilla.kernel.org/show_bug.cgi?id=183461) "mkdir(2) returns
EOVERFLOW on tmpfs in user namespace" that was introduced in the
aforementioned commit.

Since filesystems with the FS_USERNS_MOUNT flag (mainly tmpfs in this
case) are safe to use within user namespaces they should be allowed
to create inodes without checking whether their uid and gid has a
mapping or not.

I learned most of what I know about user namespaces while researching
this bug, so there's a fair chance that I've misunderstood something.
Feedback is greatly appreciated.

Signed-off-by: Johanna Abrahamsson <johanna@mjao.org>
---
 fs/namei.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Eric W. Biederman Nov. 22, 2016, 6:55 p.m. UTC | #1
Johanna Abrahamsson <johanna@mjao.org> writes:

> Fixes: 036d523641c6 ("vfs: Don't create inodes with a uid or gid unknown
> to the vfs")
>
> This fixes the bugzilla bug #183461
> (https://bugzilla.kernel.org/show_bug.cgi?id=183461) "mkdir(2) returns
> EOVERFLOW on tmpfs in user namespace" that was introduced in the
> aforementioned commit.
>
> Since filesystems with the FS_USERNS_MOUNT flag (mainly tmpfs in this
> case) are safe to use within user namespaces they should be allowed
> to create inodes without checking whether their uid and gid has a
> mapping or not.
>
> I learned most of what I know about user namespaces while researching
> this bug, so there's a fair chance that I've misunderstood something.
> Feedback is greatly appreciated.

So thank you for taking a look at this and brining it to my attention.

The assumption and what is generally true is that if an id does not
map into a filesystem's user namespace the filesystem will not know
how to represent this id on disk.

tmpfs because it is all in ram is a special exception to this.
FS_USERNS_MOUNT is not sufficient to test if the filesystem is all in
ram.  To fix this would require a different test.

FS_USERNS_MOUNT is a particularly bad thing to test because the only
filesystems where this is likely to happen will set FS_USERNS_MOUNT.
Making the test for FS_USERNS_MOUNT wrong.

If something more than testing system calls in weird scenarios hits this
it will be a regression which breaks userspace and I will fix it.
As it sits systems calls are not going to act fully normally without a
mapped uid and gid so it makes for an incorrect test of if the system call
is implemented correctly.

I would have to think for a bit to come up with a clean solution that
will succeed on tmpfs and ramfs and would fail on all other filesystems.
I believe it would require pushing the unmapped uid and gid test down
further in the code than this.

Thank you for bringing this to my attention.

Eric

> Signed-off-by: Johanna Abrahamsson <johanna@mjao.org>
> ---
>  fs/namei.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 1669c93d..9e45e01 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2816,10 +2816,12 @@ static inline int may_create(struct inode *dir, struct dentry *child)
>  		return -EEXIST;
>  	if (IS_DEADDIR(dir))
>  		return -ENOENT;
> -	s_user_ns = dir->i_sb->s_user_ns;
> -	if (!kuid_has_mapping(s_user_ns, current_fsuid()) ||
> -	    !kgid_has_mapping(s_user_ns, current_fsgid()))
> -		return -EOVERFLOW;
> +	if (!(dir->i_sb->s_type->fs_flags & FS_USERNS_MOUNT)) {
> +		s_user_ns = dir->i_sb->s_user_ns;
> +		if (!kuid_has_mapping(s_user_ns, current_fsuid()) ||
> +		    !kgid_has_mapping(s_user_ns, current_fsgid()))
> +			return -EOVERFLOW;
> +	}
>  	return inode_permission(dir, MAY_WRITE | MAY_EXEC);
>  }
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/namei.c b/fs/namei.c
index 1669c93d..9e45e01 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2816,10 +2816,12 @@  static inline int may_create(struct inode *dir, struct dentry *child)
 		return -EEXIST;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
-	s_user_ns = dir->i_sb->s_user_ns;
-	if (!kuid_has_mapping(s_user_ns, current_fsuid()) ||
-	    !kgid_has_mapping(s_user_ns, current_fsgid()))
-		return -EOVERFLOW;
+	if (!(dir->i_sb->s_type->fs_flags & FS_USERNS_MOUNT)) {
+		s_user_ns = dir->i_sb->s_user_ns;
+		if (!kuid_has_mapping(s_user_ns, current_fsuid()) ||
+		    !kgid_has_mapping(s_user_ns, current_fsgid()))
+			return -EOVERFLOW;
+	}
 	return inode_permission(dir, MAY_WRITE | MAY_EXEC);
 }