diff mbox series

[v3,18/21] btrfs/ioctl: allow idmapped BTRFS_IOC_INO_LOOKUP_USER ioctl

Message ID 20210726102816.612434-19-brauner@kernel.org (mailing list archive)
State New, archived
Headers show
Series btrfs: support idmapped mounts | expand

Commit Message

Christian Brauner July 26, 2021, 10:28 a.m. UTC
From: Christian Brauner <christian.brauner@ubuntu.com>

The BTRFS_IOC_INO_LOOKUP_USER is an unprivileged version of the
BTRFS_IOC_INO_LOOKUP ioctl and has the following restrictions. The main
difference between the two is that BTRFS_IOC_INO_LOOKUP is filesystem wide
operation wheres BTRFS_IOC_INO_LOOKUP_USER is scoped beneath the file
descriptor passed with the ioctl. Specifically, BTRFS_IOC_INO_LOOKUP_USER must
adhere to the following restrictions:
- The caller must be privileged over each inode of each path component for the
  path they are trying to lookup.
- The path for the subvolume the caller is trying to lookup must be reachable
  from the inode associated with the file descriptor passed with the ioctl.
The second condition makes it possible to scope the lookup of the path to the
mount identified by the file descriptor passed with the ioctl. This allows us
to enable this ioctl on idmapped mounts.

Specifically, this is possible because all child subvolumes of a parent
subvolume are reachable when the parent subvolume is mounted. So if the user
had access to open the parent subvolume or has been given the fd then they can
lookup the path if they had access to it provided they were privileged over
each path component.

Note, the BTRFS_IOC_INO_LOOKUP_USER ioctl allows a user to learn the path and
name of a subvolume even though they would otherwise be restricted from doing
so via regular vfs-based lookup.
So think about a parent subvolume with multiple child subvolumes. Someone could
mount he parent subvolume and restrict access to the child subvolumes by
overmounting them with empty directories. At this point the user can't traverse
the child subvolumes and they can't open files in the child subvolumes.
However, they can still learn the path of child subvolumes as long as they have
access to the parent subvolume by using the BTRFS_IOC_INO_LOOKUP_USER ioctl.

The underlying assumption here is that it's ok that the lookup ioctls can't
really take mounts into account other than the original mount the fd belongs to
during lookup. Since this assumption is baked into the original
BTRFS_IOC_INO_LOOKUP_USER ioctl we can extend it to idmapped mounts.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged

/* v3 */
unchanged
---
 fs/btrfs/ioctl.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff mbox series

Patch

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e4571522427a..159bff0c07c1 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2440,7 +2440,8 @@  static noinline int btrfs_search_path_in_tree(struct btrfs_fs_info *info,
 	return ret;
 }
 
-static int btrfs_search_path_in_tree_user(struct inode *inode,
+static int btrfs_search_path_in_tree_user(struct user_namespace *mnt_userns,
+				struct inode *inode,
 				struct btrfs_ioctl_ino_lookup_user_args *args)
 {
 	struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
@@ -2538,7 +2539,7 @@  static int btrfs_search_path_in_tree_user(struct inode *inode,
 				ret = PTR_ERR(temp_inode);
 				goto out_put;
 			}
-			ret = inode_permission(&init_user_ns, temp_inode,
+			ret = inode_permission(mnt_userns, temp_inode,
 					       MAY_READ | MAY_EXEC);
 			iput(temp_inode);
 			if (ret) {
@@ -2680,7 +2681,7 @@  static int btrfs_ioctl_ino_lookup_user(struct file *file, void __user *argp)
 		return -EACCES;
 	}
 
-	ret = btrfs_search_path_in_tree_user(inode, args);
+	ret = btrfs_search_path_in_tree_user(file_mnt_user_ns(file), inode, args);
 
 	if (ret == 0 && copy_to_user(argp, args, sizeof(*args)))
 		ret = -EFAULT;