From patchwork Thu May 14 17:31:29 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Eric W. Biederman"
X-Patchwork-Id: 6408681
From: ebiederm@xmission.com (Eric W. Biederman)
To: Linux Containers
Cc: Linux API, "Serge E. Hallyn", Andy Lutomirski, Richard Weinberger,
 Kenton Varda, Michael Kerrisk-manpages, Stéphane Graber, Eric Windisch,
 Greg Kroah-Hartman, Tejun Heo
References: <87pp63jcca.fsf@x220.int.ebiederm.org>
Date: Thu, 14 May 2015 12:31:29 -0500
In-Reply-To: <87pp63jcca.fsf@x220.int.ebiederm.org> (Eric W. Biederman's
 message of "Thu, 14 May 2015 12:30:45 -0500")
Message-ID: <87k2wbjcb2.fsf@x220.int.ebiederm.org>
Subject: [CFT][PATCH 01/10] mnt: Refactor the logic for mounting sysfs and
 proc in a user namespace
X-Mailing-List: linux-fsdevel@vger.kernel.org

Fresh mounts of proc and sysfs are a very special case that works very much like a bind mount.
Unfortunately the current structure can not preserve the MNT_LOCK...
mount flags.  Therefore refactor the logic into a form that can be
modified to preserve those lock bits.

Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount
of the filesystem be fully visible in the current mount namespace,
before the filesystem may be mounted.

Move the logic for calling fs_fully_visible from proc and sysfs into
fs/namespace.c where it has greater access to mount namespace state.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"
---
 fs/namespace.c     | 8 +++++++-
 fs/proc/root.c     | 5 +----
 fs/sysfs/mount.c   | 5 +----
 include/linux/fs.h | 2 +-
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 1b9e11167bae..8e7edaf60fe1 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2332,6 +2332,8 @@ unlock:
 	return err;
 }
 
+static bool fs_fully_visible(struct file_system_type *fs_type);
+
 /*
  * create a new mount for userspace and request it to be added into the
  * namespace's tree
@@ -2363,6 +2365,10 @@ static int do_new_mount(struct path *path, const char *fstype, int flags,
 			flags |= MS_NODEV;
 			mnt_flags |= MNT_NODEV | MNT_LOCK_NODEV;
 		}
+		if (type->fs_flags & FS_USERNS_VISIBLE) {
+			if (!fs_fully_visible(type))
+				return -EPERM;
+		}
 	}
 
 	mnt = vfs_kern_mount(type, flags, name, data);
@@ -3164,7 +3170,7 @@ bool current_chrooted(void)
 	return chrooted;
 }
 
-bool fs_fully_visible(struct file_system_type *type)
+static bool fs_fully_visible(struct file_system_type *type)
 {
 	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
 	struct mount *mnt;
diff --git a/fs/proc/root.c b/fs/proc/root.c
index b7fa4bfe896a..64e1ab64bde6 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -112,9 +112,6 @@ static struct dentry *proc_mount(struct file_system_type *fs_type,
 		ns = task_active_pid_ns(current);
 		options = data;
 
-		if (!capable(CAP_SYS_ADMIN) && !fs_fully_visible(fs_type))
-			return ERR_PTR(-EPERM);
-
 		/* Does the mounter have privilege over the pid namespace? */
 		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
 			return ERR_PTR(-EPERM);
@@ -159,7 +156,7 @@ static struct file_system_type proc_fs_type = {
 	.name		= "proc",
 	.mount		= proc_mount,
 	.kill_sb	= proc_kill_sb,
-	.fs_flags	= FS_USERNS_MOUNT,
+	.fs_flags	= FS_USERNS_VISIBLE | FS_USERNS_MOUNT,
 };
 
 void __init proc_root_init(void)
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 8a49486bf30c..1c6ac6fcee9f 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -31,9 +31,6 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type,
 	bool new_sb;
 
 	if (!(flags & MS_KERNMOUNT)) {
-		if (!capable(CAP_SYS_ADMIN) && !fs_fully_visible(fs_type))
-			return ERR_PTR(-EPERM);
-
 		if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
 			return ERR_PTR(-EPERM);
 	}
@@ -58,7 +55,7 @@ static struct file_system_type sysfs_fs_type = {
 	.name		= "sysfs",
 	.mount		= sysfs_mount,
 	.kill_sb	= sysfs_kill_sb,
-	.fs_flags	= FS_USERNS_MOUNT,
+	.fs_flags	= FS_USERNS_VISIBLE | FS_USERNS_MOUNT,
 };
 
 int __init sysfs_init(void)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 35ec87e490b1..2d24eeb8e59c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1897,6 +1897,7 @@ struct file_system_type {
 #define FS_HAS_SUBTYPE		4
 #define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
 #define FS_USERNS_DEV_MOUNT	16 /* A userns mount does not imply MNT_NODEV */
+#define FS_USERNS_VISIBLE	32	/* FS must already be visible */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
 	struct dentry *(*mount) (struct file_system_type *, int,
 		       const char *, void *);
@@ -1984,7 +1985,6 @@ extern int vfs_ustat(dev_t, struct kstatfs *);
 extern int freeze_super(struct super_block *super);
 extern int thaw_super(struct super_block *super);
 extern bool our_mnt(struct vfsmount *mnt);
-extern bool fs_fully_visible(struct file_system_type *);
 
 extern int current_umask(void);
 
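
For readers who want to poke at the new gate outside the kernel, here is a rough userspace sketch of the decision the do_new_mount() hunk appears to make. It is not kernel code: struct fs_type_model, userns_mount_check(), the fully_visible field and the in_init_user_ns parameter are all hypothetical names invented for illustration. The flag values are copied from the include/linux/fs.h hunk. The FS_USERNS_MOUNT test is assumed from the part of do_new_mount() that the hunk does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Flag values as they appear in the include/linux/fs.h hunk. */
#define FS_USERNS_MOUNT     8   /* Can be mounted by userns root */
#define FS_USERNS_DEV_MOUNT 16  /* A userns mount does not imply MNT_NODEV */
#define FS_USERNS_VISIBLE   32  /* FS must already be visible */

/* Hypothetical stand-in for struct file_system_type. */
struct fs_type_model {
	const char *name;
	int fs_flags;
	bool fully_visible;	/* stand-in for what fs_fully_visible() would return */
};

/*
 * Hypothetical model of the user namespace checks in do_new_mount().
 * Returns 0 if the mount may proceed, -EPERM otherwise.
 */
static int userns_mount_check(const struct fs_type_model *type,
			      bool in_init_user_ns)
{
	/* The patch places the new check inside the non-initial userns branch. */
	if (in_init_user_ns)
		return 0;

	/* Assumption: this test precedes the hunk in the real function. */
	if (!(type->fs_flags & FS_USERNS_MOUNT))
		return -EPERM;

	/* New in this patch: some existing mount must be fully visible. */
	if ((type->fs_flags & FS_USERNS_VISIBLE) && !type->fully_visible)
		return -EPERM;

	return 0;
}

/* A few cases that exercise the model; call this from a test harness. */
static void userns_mount_check_examples(void)
{
	struct fs_type_model hidden = {
		"proc", FS_USERNS_VISIBLE | FS_USERNS_MOUNT, false
	};
	struct fs_type_model visible = {
		"proc", FS_USERNS_VISIBLE | FS_USERNS_MOUNT, true
	};

	assert(userns_mount_check(&hidden, false) == -EPERM);
	assert(userns_mount_check(&visible, false) == 0);
	assert(userns_mount_check(&hidden, true) == 0);
}
```

In this model a filesystem without FS_USERNS_VISIBLE is never subject to the visibility test, which matches the patch setting the flag only on proc and sysfs.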