From patchwork Tue Feb 19 16:28:06 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820157
Organization: Red Hat UK Ltd.
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
Subject: [PATCH 01/43] fix cgroup_do_mount() handling of failure exits
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: stable@kernel.org, linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:28:06 +0000
Message-ID: <155059368597.12449.3144063971030696984.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
Sender: owner-linux-security-module@vger.kernel.org

From: Al Viro

Same story as with last May's fixes in sysfs (7b745a4e4051 "unfuck
sysfs_mount()"): new_sb is left uninitialized in case of early errors
in kernfs_mount_ns(), and papering over it by treating any error from
kernfs_mount_ns() as equivalent to !new_sb ends up conflating the cases
where objects had never been transferred to a superblock with the ones
where that has happened and the resulting new superblock had been
dropped.  Easily fixed (same way as in the sysfs case).

Additionally, there's a superblock leak on kernfs_node_dentry() failure
*and* a dentry leak inside kernfs_node_dentry() itself - the latter on
probably impossible errors, but the former not impossible to trigger
(as a matter of fact, injecting allocation failures at that point
*does* trigger it).
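
The ownership rule described above can be illustrated outside the kernel.
What follows is only a userspace sketch under stated assumptions: struct
root, root_put(), fake_mount() and do_mount() are hypothetical stand-ins
invented for this note, not kernel interfaces, and the sketch models just
the flag handling, not the superblock teardown.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted cgroup_root. */
struct root {
	int refs;
};

static void root_put(struct root *r)
{
	r->refs--;
}

/*
 * Hypothetical stand-in for the mount helper.  On an early failure it
 * returns without ever writing *new_sb, which is why the caller has to
 * initialize the flag itself.  When a new superblock is created, that
 * superblock is assumed to take over the caller's reference.
 */
static int fake_mount(struct root *r, bool fail_early, bool reuse_sb,
		      bool *new_sb)
{
	(void)r;
	if (fail_early)
		return -1;	/* *new_sb deliberately left untouched */
	*new_sb = !reuse_sb;
	return 0;
}

/* Exit path modelled on the fix: drop our reference unless it moved. */
static int do_mount(struct root *r, bool fail_early, bool reuse_sb)
{
	bool new_sb = false;	/* well-defined even on early errors */
	int err = fake_mount(r, fail_early, reuse_sb, &new_sb);

	if (!new_sb)
		root_put(r);
	return err;
}
```

With the flag initialized, a single "if (!new_sb)" test is enough: an early
failure and a reused superblock both drop the caller's reference, while a
freshly created superblock keeps it.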

Cc: stable@kernel.org
Signed-off-by: Al Viro
---
 fs/kernfs/mount.c      |    8 ++++++--
 kernel/cgroup/cgroup.c |    9 ++++++---
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index fdf527b6d79c..d71c9405874a 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -196,8 +196,10 @@ struct dentry *kernfs_node_dentry(struct kernfs_node *kn,
 		return dentry;
 
 	knparent = find_next_ancestor(kn, NULL);
-	if (WARN_ON(!knparent))
+	if (WARN_ON(!knparent)) {
+		dput(dentry);
 		return ERR_PTR(-EINVAL);
+	}
 
 	do {
 		struct dentry *dtmp;
@@ -206,8 +208,10 @@ struct dentry *kernfs_node_dentry(struct kernfs_node *kn,
 		if (kn == knparent)
 			return dentry;
 		kntmp = find_next_ancestor(kn, knparent);
-		if (WARN_ON(!kntmp))
+		if (WARN_ON(!kntmp)) {
+			dput(dentry);
 			return ERR_PTR(-EINVAL);
+		}
 		dtmp = lookup_one_len_unlocked(kntmp->name, dentry,
 					       strlen(kntmp->name));
 		dput(dentry);
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index f31bd61c9466..503bba3c4bae 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2033,7 +2033,7 @@ struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags,
 			       struct cgroup_namespace *ns)
 {
 	struct dentry *dentry;
-	bool new_sb;
+	bool new_sb = false;
 
 	dentry = kernfs_mount(fs_type, flags, root->kf_root, magic, &new_sb);
 
@@ -2043,6 +2043,7 @@ struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags,
 	 */
 	if (!IS_ERR(dentry) && ns != &init_cgroup_ns) {
 		struct dentry *nsdentry;
+		struct super_block *sb = dentry->d_sb;
 		struct cgroup *cgrp;
 
 		mutex_lock(&cgroup_mutex);
@@ -2053,12 +2054,14 @@ struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags,
 		spin_unlock_irq(&css_set_lock);
 		mutex_unlock(&cgroup_mutex);
 
-		nsdentry = kernfs_node_dentry(cgrp->kn, dentry->d_sb);
+		nsdentry = kernfs_node_dentry(cgrp->kn, sb);
 		dput(dentry);
+		if (IS_ERR(nsdentry))
+			deactivate_locked_super(sb);
 		dentry = nsdentry;
 	}
 
-	if (IS_ERR(dentry) || !new_sb)
+	if (!new_sb)
 		cgroup_put(&root->cgrp);
 
 	return dentry;

From patchwork Tue Feb 19 16:28:13 2019
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820161
Subject: [PATCH 02/43] cgroup: saner refcounting for cgroup_root
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:28:13 +0000
Message-ID: <155059369332.12449.14447911414340183902.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

From: Al Viro

* make the reference from superblock to cgroup_root counting - do
  cgroup_put() in cgroup_kill_sb() whether we'd done percpu_ref_kill()
  or not; matching grab is done when we allocate a new root.  That
  gives the same refcounting rules for all callers of
  cgroup_do_mount() - a reference to cgroup_root has been grabbed by
  caller and it either is transferred to new superblock or dropped.

* have cgroup_kill_sb() treat an already killed refcount as "just
  don't bother killing it, then".

* after successful cgroup_do_mount() have cgroup1_mount() recheck if
  we'd raced with mount/umount from somebody else and cgroup_root got
  killed.  In that case we drop the superblock and bugger off with
  -ERESTARTSYS, same as if we'd found it in the list already dying.

* don't bother with delayed initialization of refcount - it's
  unreliable and not needed.
  No need to prevent attempts to bump the refcount if we find
  cgroup_root of another mount in progress - sget will reuse an
  existing superblock just fine and if the other sb manages to die
  before we get there, we'll catch that immediately after
  cgroup_do_mount().

* don't bother with kernfs_pin_sb() - no need for doing that either.

Signed-off-by: Al Viro
---
 kernel/cgroup/cgroup-internal.h |    2 +
 kernel/cgroup/cgroup-v1.c       |   58 +++++++++------------------------------
 kernel/cgroup/cgroup.c          |   16 +++++------
 3 files changed, 21 insertions(+), 55 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index c950864016e2..c9a35f09e4b9 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -198,7 +198,7 @@ int cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,
 
 void cgroup_free_root(struct cgroup_root *root);
 void init_cgroup_root(struct cgroup_root *root, struct cgroup_sb_opts *opts);
-int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask, int ref_flags);
+int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask);
 int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask);
 struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags,
 			       struct cgroup_root *root, unsigned long magic,
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 583b969b0c0e..f94a7229974e 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -1116,13 +1116,11 @@ struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags,
 			     void *data, unsigned long magic,
 			     struct cgroup_namespace *ns)
 {
-	struct super_block *pinned_sb = NULL;
 	struct cgroup_sb_opts opts;
 	struct cgroup_root *root;
 	struct cgroup_subsys *ss;
 	struct dentry *dentry;
 	int i, ret;
-	bool new_root = false;
 
 	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);
 
@@ -1184,29 +1182,6 @@ struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags,
 		if (root->flags ^ opts.flags)
 			pr_warn("new mount options do not match the existing superblock, will be ignored\n");
 
-		/*
-		 * We want to reuse @root whose lifetime is governed by its
-		 * ->cgrp. Let's check whether @root is alive and keep it
-		 * that way. As cgroup_kill_sb() can happen anytime, we
-		 * want to block it by pinning the sb so that @root doesn't
-		 * get killed before mount is complete.
-		 *
-		 * With the sb pinned, tryget_live can reliably indicate
-		 * whether @root can be reused. If it's being killed,
-		 * drain it. We can use wait_queue for the wait but this
-		 * path is super cold. Let's just sleep a bit and retry.
-		 */
-		pinned_sb = kernfs_pin_sb(root->kf_root, NULL);
-		if (IS_ERR(pinned_sb) ||
-		    !percpu_ref_tryget_live(&root->cgrp.self.refcnt)) {
-			mutex_unlock(&cgroup_mutex);
-			if (!IS_ERR_OR_NULL(pinned_sb))
-				deactivate_super(pinned_sb);
-			msleep(10);
-			ret = restart_syscall();
-			goto out_free;
-		}
-
 		ret = 0;
 		goto out_unlock;
 	}
@@ -1232,15 +1207,20 @@ struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags,
 		ret = -ENOMEM;
 		goto out_unlock;
 	}
-	new_root = true;
 
 	init_cgroup_root(root, &opts);
 
-	ret = cgroup_setup_root(root, opts.subsys_mask, PERCPU_REF_INIT_DEAD);
+	ret = cgroup_setup_root(root, opts.subsys_mask);
 	if (ret)
 		cgroup_free_root(root);
 
 out_unlock:
+	if (!ret && !percpu_ref_tryget_live(&root->cgrp.self.refcnt)) {
+		mutex_unlock(&cgroup_mutex);
+		msleep(10);
+		ret = restart_syscall();
+		goto out_free;
+	}
 	mutex_unlock(&cgroup_mutex);
 out_free:
 	kfree(opts.release_agent);
@@ -1252,25 +1232,13 @@ struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags,
 	dentry = cgroup_do_mount(&cgroup_fs_type, flags, root,
 				 CGROUP_SUPER_MAGIC, ns);
 
-	/*
-	 * There's a race window after we release cgroup_mutex and before
-	 * allocating a superblock. Make sure a concurrent process won't
-	 * be able to re-use the root during this window by delaying the
-	 * initialization of root refcnt.
-	 */
-	if (new_root) {
-		mutex_lock(&cgroup_mutex);
-		percpu_ref_reinit(&root->cgrp.self.refcnt);
-		mutex_unlock(&cgroup_mutex);
+	if (!IS_ERR(dentry) && percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
+		struct super_block *sb = dentry->d_sb;
+		dput(dentry);
+		deactivate_locked_super(sb);
+		msleep(10);
+		dentry = ERR_PTR(restart_syscall());
 	}
-
-	/*
-	 * If @pinned_sb, we're reusing an existing root and holding an
-	 * extra ref on its sb. Mount is complete. Put the extra ref.
-	 */
-	if (pinned_sb)
-		deactivate_super(pinned_sb);
-
 	return dentry;
 }
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 503bba3c4bae..7fd9f22e406d 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1927,7 +1927,7 @@ void init_cgroup_root(struct cgroup_root *root, struct cgroup_sb_opts *opts)
 		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &root->cgrp.flags);
 }
 
-int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask, int ref_flags)
+int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
 {
 	LIST_HEAD(tmp_links);
 	struct cgroup *root_cgrp = &root->cgrp;
@@ -1944,7 +1944,7 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask, int ref_flags)
 	root_cgrp->ancestor_ids[0] = ret;
 
 	ret = percpu_ref_init(&root_cgrp->self.refcnt, css_release,
-			      ref_flags, GFP_KERNEL);
+			      0, GFP_KERNEL);
 	if (ret)
 		goto out;
 
@@ -2121,18 +2121,16 @@ static void cgroup_kill_sb(struct super_block *sb)
 	struct cgroup_root *root = cgroup_root_from_kf(kf_root);
 
 	/*
-	 * If @root doesn't have any mounts or children, start killing it.
+	 * If @root doesn't have any children, start killing it.
 	 * This prevents new mounts by disabling percpu_ref_tryget_live().
 	 * cgroup_mount() may wait for @root's release.
 	 *
 	 * And don't kill the default root.
 	 */
-	if (!list_empty(&root->cgrp.self.children) ||
-	    root == &cgrp_dfl_root)
-		cgroup_put(&root->cgrp);
-	else
+	if (list_empty(&root->cgrp.self.children) && root != &cgrp_dfl_root &&
+	    !percpu_ref_is_dying(&root->cgrp.self.refcnt))
 		percpu_ref_kill(&root->cgrp.self.refcnt);
-
+	cgroup_put(&root->cgrp);
 	kernfs_kill_sb(sb);
 }
 
@@ -5402,7 +5400,7 @@ int __init cgroup_init(void)
 	hash_add(css_set_table, &init_css_set.hlist,
 		 css_set_hash(init_css_set.subsys));
 
-	BUG_ON(cgroup_setup_root(&cgrp_dfl_root, 0, 0));
+	BUG_ON(cgroup_setup_root(&cgrp_dfl_root, 0));
 
 	mutex_unlock(&cgroup_mutex);
 

From patchwork Tue Feb 19 16:28:27 2019
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820165
Subject: [PATCH 03/43] kill kernfs_pin_sb()
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:28:27 +0000
Message-ID: <155059370708.12449.16300336178932343753.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

From: Al Viro

unused now and impossible to use safely anyway.

Signed-off-by: Al Viro
---
 fs/kernfs/mount.c      |   30 ------------------------------
 include/linux/kernfs.h |    1 -
 2 files changed, 31 deletions(-)

diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index d71c9405874a..4d303047a4f8 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -377,36 +377,6 @@ void kernfs_kill_sb(struct super_block *sb)
 	kfree(info);
 }
 
-/**
- * kernfs_pin_sb: try to pin the superblock associated with a kernfs_root
- * @kernfs_root: the kernfs_root in question
- * @ns: the namespace tag
- *
- * Pin the superblock so the superblock won't be destroyed in subsequent
- * operations. This can be used to block ->kill_sb() which may be useful
- * for kernfs users which dynamically manage superblocks.
- *
- * Returns NULL if there's no superblock associated to this kernfs_root, or
- * -EINVAL if the superblock is being freed.
- */
-struct super_block *kernfs_pin_sb(struct kernfs_root *root, const void *ns)
-{
-	struct kernfs_super_info *info;
-	struct super_block *sb = NULL;
-
-	mutex_lock(&kernfs_mutex);
-	list_for_each_entry(info, &root->supers, node) {
-		if (info->ns == ns) {
-			sb = info->sb;
-			if (!atomic_inc_not_zero(&info->sb->s_active))
-				sb = ERR_PTR(-EINVAL);
-			break;
-		}
-	}
-	mutex_unlock(&kernfs_mutex);
-	return sb;
-}
-
 void __init kernfs_init(void)
 {
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 5b36b1287a5a..44acb4c3659c 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -357,7 +357,6 @@ struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags,
 			       struct kernfs_root *root, unsigned long magic,
 			       bool *new_sb_created, const void *ns);
 void kernfs_kill_sb(struct super_block *sb);
-struct super_block *kernfs_pin_sb(struct kernfs_root *root, const void *ns);
 
 void kernfs_init(void);
 

From patchwork Tue Feb 19 16:28:37 2019
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820169
Subject: [PATCH 04/43] separate copying and locking mount tree on cross-userns copies
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:28:37 +0000
Message-ID: <155059371731.12449.5751025556744658291.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

From: Al Viro

Rather than having propagate_mnt() check whether it is doing
unprivileged copies, lock the copied mounts before commit_tree().
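
The per-mount flag computation that moves into the new locking pass can
be looked at in isolation.  The following is a userspace sketch only: the
F_* names and their bit values are placeholders made up for this note
(they are not the kernel's MNT_* definitions), and lock_flags() is a
hypothetical helper standing in for the loop body of the locking pass.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration; not the kernel's MNT_* flags. */
enum {
	F_READONLY      = 1 << 0,
	F_NODEV         = 1 << 1,
	F_NOSUID        = 1 << 2,
	F_NOEXEC        = 1 << 3,
	F_LOCK_ATIME    = 1 << 4,
	F_LOCK_READONLY = 1 << 5,
	F_LOCK_NODEV    = 1 << 6,
	F_LOCK_NOSUID   = 1 << 7,
	F_LOCK_NOEXEC   = 1 << 8,
	F_LOCKED        = 1 << 9,
};

/*
 * Derive the locked flag set for one mount: every restriction that is
 * already set gets pinned, atime settings are always pinned, and the
 * mount itself is locked in place unless it sits on an expiry list.
 */
static int lock_flags(int flags, bool on_expiry_list)
{
	flags |= F_LOCK_ATIME;

	if (flags & F_READONLY)
		flags |= F_LOCK_READONLY;
	if (flags & F_NODEV)
		flags |= F_LOCK_NODEV;
	if (flags & F_NOSUID)
		flags |= F_LOCK_NOSUID;
	if (flags & F_NOEXEC)
		flags |= F_LOCK_NOEXEC;

	if (!on_expiry_list)
		flags |= F_LOCKED;
	return flags;
}
```

Because the result depends only on the mount's own flags and its expiry
state, the computation can run as a separate walk over an already copied
tree instead of being threaded through the copy via a clone flag.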

Signed-off-by: Al Viro
---
 fs/namespace.c |   59 +++++++++++++++++++++++++++++++++++---------------------
 fs/pnode.c     |    5 -----
 fs/pnode.h     |    3 +--
 3 files changed, 38 insertions(+), 29 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index a677b59efd74..9ed2f2930dfd 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1013,27 +1013,6 @@ static struct mount *clone_mnt(struct mount *old, struct dentry *root,
 
 	mnt->mnt.mnt_flags = old->mnt.mnt_flags;
 	mnt->mnt.mnt_flags &= ~(MNT_WRITE_HOLD|MNT_MARKED|MNT_INTERNAL);
-	/* Don't allow unprivileged users to change mount flags */
-	if (flag & CL_UNPRIVILEGED) {
-		mnt->mnt.mnt_flags |= MNT_LOCK_ATIME;
-
-		if (mnt->mnt.mnt_flags & MNT_READONLY)
-			mnt->mnt.mnt_flags |= MNT_LOCK_READONLY;
-
-		if (mnt->mnt.mnt_flags & MNT_NODEV)
-			mnt->mnt.mnt_flags |= MNT_LOCK_NODEV;
-
-		if (mnt->mnt.mnt_flags & MNT_NOSUID)
-			mnt->mnt.mnt_flags |= MNT_LOCK_NOSUID;
-
-		if (mnt->mnt.mnt_flags & MNT_NOEXEC)
-			mnt->mnt.mnt_flags |= MNT_LOCK_NOEXEC;
-	}
-
-	/* Don't allow unprivileged users to reveal what is under a mount */
-	if ((flag & CL_UNPRIVILEGED) &&
-	    (!(flag & CL_EXPIRE) || list_empty(&old->mnt_expire)))
-		mnt->mnt.mnt_flags |= MNT_LOCKED;
 
 	atomic_inc(&sb->s_active);
 	mnt->mnt.mnt_sb = sb;
@@ -1837,6 +1816,33 @@ int iterate_mounts(int (*f)(struct vfsmount *, void *), void *arg,
 	return 0;
 }
 
+static void lock_mnt_tree(struct mount *mnt)
+{
+	struct mount *p;
+
+	for (p = mnt; p; p = next_mnt(p, mnt)) {
+		int flags = p->mnt.mnt_flags;
+		/* Don't allow unprivileged users to change mount flags */
+		flags |= MNT_LOCK_ATIME;
+
+		if (flags & MNT_READONLY)
+			flags |= MNT_LOCK_READONLY;
+
+		if (flags & MNT_NODEV)
+			flags |= MNT_LOCK_NODEV;
+
+		if (flags & MNT_NOSUID)
+			flags |= MNT_LOCK_NOSUID;
+
+		if (flags & MNT_NOEXEC)
+			flags |= MNT_LOCK_NOEXEC;
+		/* Don't allow unprivileged users to reveal what is under a mount */
+		if (list_empty(&p->mnt_expire))
+			flags |= MNT_LOCKED;
+		p->mnt.mnt_flags = flags;
+	}
+}
+
 static void cleanup_group_ids(struct mount *mnt, struct mount *end)
 {
 	struct mount *p;
@@ -1954,6 +1960,7 @@ static int attach_recursive_mnt(struct mount *source_mnt,
 			struct mountpoint *dest_mp,
 			struct path *parent_path)
 {
+	struct user_namespace *user_ns = current->nsproxy->mnt_ns->user_ns;
 	HLIST_HEAD(tree_list);
 	struct mnt_namespace *ns = dest_mnt->mnt_ns;
 	struct mountpoint *smp;
@@ -2004,6 +2011,9 @@ static int attach_recursive_mnt(struct mount *source_mnt,
 					   child->mnt_mountpoint);
 		if (q)
 			mnt_change_mountpoint(child, smp, q);
+		/* Notice when we are propagating across user namespaces */
+		if (child->mnt_parent->mnt_ns->user_ns != user_ns)
+			lock_mnt_tree(child);
 		commit_tree(child);
 	}
 	put_mountpoint(smp);
@@ -2941,13 +2951,18 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
 	/* First pass: copy the tree topology */
 	copy_flags = CL_COPY_UNBINDABLE | CL_EXPIRE;
 	if (user_ns != ns->user_ns)
-		copy_flags |= CL_SHARED_TO_SLAVE | CL_UNPRIVILEGED;
+		copy_flags |= CL_SHARED_TO_SLAVE;
 	new = copy_tree(old, old->mnt.mnt_root, copy_flags);
 	if (IS_ERR(new)) {
 		namespace_unlock();
 		free_mnt_ns(new_ns);
 		return ERR_CAST(new);
 	}
+	if (user_ns != ns->user_ns) {
+		lock_mount_hash();
+		lock_mnt_tree(new);
+		unlock_mount_hash();
+	}
 	new_ns->root = new;
 	list_add_tail(&new_ns->list, &new->mnt_list);
 
diff --git a/fs/pnode.c b/fs/pnode.c
index 1100e810d855..7ea6cfb65077 100644
--- a/fs/pnode.c
+++ b/fs/pnode.c
@@ -214,7 +214,6 @@ static struct mount *next_group(struct mount *m, struct mount *origin)
 }
 
 /* all accesses are serialized by namespace_sem */
-static struct user_namespace *user_ns;
 static struct mount *last_dest, *first_source, *last_source, *dest_master;
 static struct mountpoint *mp;
 static struct hlist_head *list;
@@ -260,9 +259,6 @@ static int propagate_one(struct mount *m)
 			type |= CL_MAKE_SHARED;
 	}
 
-	/* Notice when we are propagating across user namespaces */
-	if (m->mnt_ns->user_ns != user_ns)
-		type |= CL_UNPRIVILEGED;
 	child = copy_tree(last_source, last_source->mnt.mnt_root, type);
 	if (IS_ERR(child))
 		return PTR_ERR(child);
@@ -303,7 +299,6 @@ int propagate_mnt(struct mount *dest_mnt, struct mountpoint *dest_mp,
 	 * propagate_one(); everything is serialized by namespace_sem,
 	 * so globals will do just fine.
 	 */
-	user_ns = current->nsproxy->mnt_ns->user_ns;
 	last_dest = dest_mnt;
 	first_source = source_mnt;
 	last_source = source_mnt;
diff --git a/fs/pnode.h b/fs/pnode.h
index dc87e65becd2..3960a83666cf 100644
--- a/fs/pnode.h
+++ b/fs/pnode.h
@@ -27,8 +27,7 @@
 #define CL_MAKE_SHARED 		0x08
 #define CL_PRIVATE 		0x10
 #define CL_SHARED_TO_SLAVE	0x20
-#define CL_UNPRIVILEGED		0x40
-#define CL_COPY_MNT_NS_FILE	0x80
+#define CL_COPY_MNT_NS_FILE	0x40
 
 #define CL_COPY_ALL		(CL_COPY_UNBINDABLE | CL_COPY_MNT_NS_FILE)
 

From patchwork Tue Feb 19 16:29:03 2019
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820173
Subject: [PATCH 05/43] saner handling of temporary namespaces
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:29:03 +0000
Message-ID: <155059374364.12449.13897776439989704741.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

From: Al Viro

mount_subtree() creates (and soon destroys) a temporary namespace, so
that automounts could function normally.
These beasts should never become anyone's current namespaces; they
don't, but it would be better to make prevention of that more
straightforward.  And since they don't become anyone's current
namespace, we don't need to bother with reserving procfs inums for
those.

Teach alloc_mnt_ns() to skip inum allocation if told so, adjust
put_mnt_ns() accordingly, make mount_subtree() use a temporary (anon)
namespace.  is_anon_ns() checks whether a namespace is one of those.

Signed-off-by: Al Viro
---
 fs/mount.h     |    5 ++++
 fs/namespace.c |   74 ++++++++++++++++++++++++++------------------------------
 2 files changed, 40 insertions(+), 39 deletions(-)

diff --git a/fs/mount.h b/fs/mount.h
index f39bc9da4d73..6250de544760 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -146,3 +146,8 @@ static inline bool is_local_mountpoint(struct dentry *dentry)
 
 	return __is_local_mountpoint(dentry);
 }
+
+static inline bool is_anon_ns(struct mnt_namespace *ns)
+{
+	return ns->seq == 0;
+}
diff --git a/fs/namespace.c b/fs/namespace.c
index 9ed2f2930dfd..f0b8a8ca08df 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2873,7 +2873,8 @@ static void dec_mnt_namespaces(struct ucounts *ucounts)
 
 static void free_mnt_ns(struct mnt_namespace *ns)
 {
-	ns_free_inum(&ns->ns);
+	if (!is_anon_ns(ns))
+		ns_free_inum(&ns->ns);
 	dec_mnt_namespaces(ns->ucounts);
 	put_user_ns(ns->user_ns);
 	kfree(ns);
@@ -2888,7 +2889,7 @@ static void free_mnt_ns(struct mnt_namespace *ns)
  */
 static atomic64_t mnt_ns_seq = ATOMIC64_INIT(1);
 
-static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns)
+static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns, bool anon)
 {
 	struct mnt_namespace *new_ns;
 	struct ucounts *ucounts;
@@ -2898,28 +2899,27 @@ static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns)
 	if (!ucounts)
 		return ERR_PTR(-ENOSPC);
 
-	new_ns = kmalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
+	new_ns = kzalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
 	if (!new_ns) {
 		dec_mnt_namespaces(ucounts);
return ERR_PTR(-ENOMEM); } - ret = ns_alloc_inum(&new_ns->ns); - if (ret) { - kfree(new_ns); - dec_mnt_namespaces(ucounts); - return ERR_PTR(ret); + if (!anon) { + ret = ns_alloc_inum(&new_ns->ns); + if (ret) { + kfree(new_ns); + dec_mnt_namespaces(ucounts); + return ERR_PTR(ret); + } } new_ns->ns.ops = &mntns_operations; - new_ns->seq = atomic64_add_return(1, &mnt_ns_seq); + if (!anon) + new_ns->seq = atomic64_add_return(1, &mnt_ns_seq); atomic_set(&new_ns->count, 1); - new_ns->root = NULL; INIT_LIST_HEAD(&new_ns->list); init_waitqueue_head(&new_ns->poll); - new_ns->event = 0; new_ns->user_ns = get_user_ns(user_ns); new_ns->ucounts = ucounts; - new_ns->mounts = 0; - new_ns->pending_mounts = 0; return new_ns; } @@ -2943,7 +2943,7 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns, old = ns->root; - new_ns = alloc_mnt_ns(user_ns); + new_ns = alloc_mnt_ns(user_ns, false); if (IS_ERR(new_ns)) return new_ns; @@ -3003,37 +3003,25 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns, return new_ns; } -/** - * create_mnt_ns - creates a private namespace and adds a root filesystem - * @mnt: pointer to the new root filesystem mountpoint - */ -static struct mnt_namespace *create_mnt_ns(struct vfsmount *m) -{ - struct mnt_namespace *new_ns = alloc_mnt_ns(&init_user_ns); - if (!IS_ERR(new_ns)) { - struct mount *mnt = real_mount(m); - mnt->mnt_ns = new_ns; - new_ns->root = mnt; - new_ns->mounts++; - list_add(&mnt->mnt_list, &new_ns->list); - } else { - mntput(m); - } - return new_ns; -} - -struct dentry *mount_subtree(struct vfsmount *mnt, const char *name) +struct dentry *mount_subtree(struct vfsmount *m, const char *name) { + struct mount *mnt = real_mount(m); struct mnt_namespace *ns; struct super_block *s; struct path path; int err; - ns = create_mnt_ns(mnt); - if (IS_ERR(ns)) + ns = alloc_mnt_ns(&init_user_ns, true); + if (IS_ERR(ns)) { + mntput(m); return ERR_CAST(ns); + } + mnt->mnt_ns = ns; + ns->root = mnt; 
+ ns->mounts++; + list_add(&mnt->mnt_list, &ns->list); - err = vfs_path_lookup(mnt->mnt_root, mnt, + err = vfs_path_lookup(m->mnt_root, m, name, LOOKUP_FOLLOW|LOOKUP_AUTOMOUNT, &path); put_mnt_ns(ns); @@ -3243,6 +3231,7 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root, static void __init init_mount_tree(void) { struct vfsmount *mnt; + struct mount *m; struct mnt_namespace *ns; struct path root; struct file_system_type *type; @@ -3255,10 +3244,14 @@ static void __init init_mount_tree(void) if (IS_ERR(mnt)) panic("Can't create rootfs"); - ns = create_mnt_ns(mnt); + ns = alloc_mnt_ns(&init_user_ns, false); if (IS_ERR(ns)) panic("Can't allocate initial namespace"); - + m = real_mount(mnt); + m->mnt_ns = ns; + ns->root = m; + ns->mounts = 1; + list_add(&m->mnt_list, &ns->list); init_task.nsproxy->mnt_ns = ns; get_mnt_ns(ns); @@ -3499,6 +3492,9 @@ static int mntns_install(struct nsproxy *nsproxy, struct ns_common *ns) !ns_capable(current_user_ns(), CAP_SYS_ADMIN)) return -EPERM; + if (is_anon_ns(mnt_ns)) + return -EINVAL; + if (fs->users != 1) return -EINVAL; From patchwork Tue Feb 19 16:29:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820177 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0212F922 for ; Tue, 19 Feb 2019 16:29:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E1A252A923 for ; Tue, 19 Feb 2019 16:29:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D5F572B77C; Tue, 19 Feb 2019 16:29:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham 
version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BB2422AF06 for ; Tue, 19 Feb 2019 16:29:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728957AbfBSQ3T (ORCPT ); Tue, 19 Feb 2019 11:29:19 -0500 Received: from mx1.redhat.com ([209.132.183.28]:37306 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728312AbfBSQ3S (ORCPT ); Tue, 19 Feb 2019 11:29:18 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 29DF87266F; Tue, 19 Feb 2019 16:29:18 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 90DC117131; Tue, 19 Feb 2019 16:29:14 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 06/43] vfs: Introduce fs_context, switch vfs_kern_mount() to it. 
From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:29:14 +0000 Message-ID: <155059375435.12449.6188847556322464304.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 19 Feb 2019 16:29:18 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP

Introduce a filesystem context concept to be used during superblock creation for mount and superblock reconfiguration for remount. This is allocated at the beginning of the mount procedure and into it is placed:

 (1) Filesystem type.

 (2) Namespaces.

 (3) Source/Device names (there may be multiple).

 (4) Superblock flags (SB_*).

 (5) Security details.

 (6) Filesystem-specific data, as set by the mount options.

Accessor functions are then provided to set up a context, parameterise it from monolithic mount data (the data page passed to mount(2)) and tear it down again.

A legacy wrapper is provided that implements what will be the basic operations, wrapping access to filesystems that aren't yet aware of the fs_context.

Finally, vfs_kern_mount() is changed to make use of the fs_context and mount_fs() is replaced by vfs_get_tree(), called from vfs_kern_mount().
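The call sequence that vfs_kern_mount() follows after this change (create the context, copy the source name, parse the monolithic data, get the tree, drop the context) can be modelled in userspace. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the failure rule in toy_get_tree() is invented for illustration, and none of it is kernel code.

```c
/* Userspace model only; these are NOT the kernel's definitions. */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct toy_fs_context {
	char *source;		/* models fc->source (a private copy of the name) */
	const char *data;	/* models the legacy monolithic data page */
	int have_root;		/* models fc->root having been set */
};

/* Portable stand-in for kstrdup(): returns a heap copy or NULL. */
static char *toy_strdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Models fs_context_for_mount(): a zeroed context or NULL. */
static struct toy_fs_context *toy_context_for_mount(void)
{
	return calloc(1, sizeof(struct toy_fs_context));
}

/* Models put_fs_context(): safe on a partially set-up context. */
static void toy_put_context(struct toy_fs_context *fc)
{
	free(fc->source);
	free(fc);
}

/* Models parse_monolithic_mount_data(): just records the data. */
static int toy_parse_monolithic(struct toy_fs_context *fc, const char *data)
{
	fc->data = data;
	return 0;
}

/* Models vfs_get_tree(); the "fail" rule is invented for this sketch. */
static int toy_get_tree(struct toy_fs_context *fc)
{
	if (fc->data && strcmp(fc->data, "fail") == 0)
		return -EINVAL;
	fc->have_root = 1;
	return 0;
}

/* Returns 0 or a negative errno; each step runs only if the last succeeded. */
int toy_kern_mount(const char *name, const char *data)
{
	struct toy_fs_context *fc = toy_context_for_mount();
	int ret = 0;

	if (!fc)
		return -ENOMEM;
	if (name) {
		fc->source = toy_strdup(name);
		if (!fc->source)
			ret = -ENOMEM;
	}
	if (!ret)
		ret = toy_parse_monolithic(fc, data);
	if (!ret)
		ret = toy_get_tree(fc);
	toy_put_context(fc);	/* single teardown path for success and failure */
	return ret;
}
```

The point of the shape is that the context owns everything it was handed, so one put call covers every exit path.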
[AV -- add missing kstrdup()] [AV -- put_cred() can be unconditional - fc->cred can't be NULL] [AV -- take legacy_validate() contents into legacy_parse_monolithic()] [AV -- merge KERNEL_MOUNT and USER_MOUNT] [AV -- don't unlock superblock on success return from vfs_get_tree()] [AV -- kill 'reference' argument of init_fs_context()] Signed-off-by: David Howells Co-developed-by: Al Viro Signed-off-by: Al Viro --- fs/Makefile | 3 - fs/fs_context.c | 182 ++++++++++++++++++++++++++++++++++++++++++++ fs/internal.h | 9 ++ fs/namespace.c | 46 ++++++++--- fs/super.c | 50 ++++++------ include/linux/fs_context.h | 64 +++++++++++++++ 6 files changed, 310 insertions(+), 44 deletions(-) create mode 100644 fs/fs_context.c create mode 100644 include/linux/fs_context.h diff --git a/fs/Makefile b/fs/Makefile index 293733f61594..5563cf34f7c2 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -12,7 +12,8 @@ obj-y := open.o read_write.o file_table.o super.o \ attr.o bad_inode.o file.o filesystems.o namespace.o \ seq_file.o xattr.o libfs.o fs-writeback.o \ pnode.o splice.o sync.o utimes.o d_path.o \ - stack.o fs_struct.o statfs.o fs_pin.o nsfs.o + stack.o fs_struct.o statfs.o fs_pin.o nsfs.o \ + fs_context.o ifeq ($(CONFIG_BLOCK),y) obj-y += buffer.o block_dev.o direct-io.o mpage.o diff --git a/fs/fs_context.c b/fs/fs_context.c new file mode 100644 index 000000000000..4294091b689d --- /dev/null +++ b/fs/fs_context.c @@ -0,0 +1,182 @@ +/* Provide a way to create a superblock configuration context within the kernel + * that allows a superblock to be set up prior to mounting. + * + * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "mount.h" +#include "internal.h" + +struct legacy_fs_context { + char *legacy_data; /* Data page for legacy filesystems */ + size_t data_size; +}; + +static int legacy_init_fs_context(struct fs_context *fc); + +/** + * alloc_fs_context - Create a filesystem context. + * @fs_type: The filesystem type. + * @reference: The dentry from which this one derives (or NULL) + * @sb_flags: Filesystem/superblock flags (SB_*) + * @sb_flags_mask: Applicable members of @sb_flags + * @purpose: The purpose that this configuration shall be used for. + * + * Open a filesystem and create a mount context. The mount context is + * initialised with the supplied flags and, if a submount/automount from + * another superblock (referred to by @reference) is supplied, may have + * parameters such as namespaces copied across from that superblock. 
+ */ +static struct fs_context *alloc_fs_context(struct file_system_type *fs_type, + struct dentry *reference, + unsigned int sb_flags, + unsigned int sb_flags_mask, + enum fs_context_purpose purpose) +{ + struct fs_context *fc; + int ret = -ENOMEM; + + fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL); + if (!fc) + return ERR_PTR(-ENOMEM); + + fc->purpose = purpose; + fc->sb_flags = sb_flags; + fc->sb_flags_mask = sb_flags_mask; + fc->fs_type = get_filesystem(fs_type); + fc->cred = get_current_cred(); + fc->net_ns = get_net(current->nsproxy->net_ns); + + switch (purpose) { + case FS_CONTEXT_FOR_MOUNT: + fc->user_ns = get_user_ns(fc->cred->user_ns); + break; + } + + ret = legacy_init_fs_context(fc); + if (ret < 0) + goto err_fc; + fc->need_free = true; + return fc; + +err_fc: + put_fs_context(fc); + return ERR_PTR(ret); +} + +struct fs_context *fs_context_for_mount(struct file_system_type *fs_type, + unsigned int sb_flags) +{ + return alloc_fs_context(fs_type, NULL, sb_flags, 0, + FS_CONTEXT_FOR_MOUNT); +} +EXPORT_SYMBOL(fs_context_for_mount); + +static void legacy_fs_context_free(struct fs_context *fc); +/** + * put_fs_context - Dispose of a superblock configuration context. + * @fc: The context to dispose of. + */ +void put_fs_context(struct fs_context *fc) +{ + struct super_block *sb; + + if (fc->root) { + sb = fc->root->d_sb; + dput(fc->root); + fc->root = NULL; + deactivate_super(sb); + } + + if (fc->need_free) + legacy_fs_context_free(fc); + + security_free_mnt_opts(&fc->security); + if (fc->net_ns) + put_net(fc->net_ns); + put_user_ns(fc->user_ns); + put_cred(fc->cred); + kfree(fc->subtype); + put_filesystem(fc->fs_type); + kfree(fc->source); + kfree(fc); +} +EXPORT_SYMBOL(put_fs_context); + +/* + * Free the config for a filesystem that doesn't support fs_context. + */ +static void legacy_fs_context_free(struct fs_context *fc) +{ + kfree(fc->fs_private); +} + +/* + * Add monolithic mount data. 
+ */ +static int legacy_parse_monolithic(struct fs_context *fc, void *data) +{ + struct legacy_fs_context *ctx = fc->fs_private; + ctx->legacy_data = data; + if (!ctx->legacy_data) + return 0; + if (fc->fs_type->fs_flags & FS_BINARY_MOUNTDATA) + return 0; + return security_sb_eat_lsm_opts(ctx->legacy_data, &fc->security); +} + +/* + * Get a mountable root with the legacy mount command. + */ +int legacy_get_tree(struct fs_context *fc) +{ + struct legacy_fs_context *ctx = fc->fs_private; + struct super_block *sb; + struct dentry *root; + + root = fc->fs_type->mount(fc->fs_type, fc->sb_flags, + fc->source, ctx->legacy_data); + if (IS_ERR(root)) + return PTR_ERR(root); + + sb = root->d_sb; + BUG_ON(!sb); + + fc->root = root; + return 0; +} + +/* + * Initialise a legacy context for a filesystem that doesn't support + * fs_context. + */ +static int legacy_init_fs_context(struct fs_context *fc) +{ + fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL); + if (!fc->fs_private) + return -ENOMEM; + return 0; +} + +int parse_monolithic_mount_data(struct fs_context *fc, void *data) +{ + return legacy_parse_monolithic(fc, data); +} diff --git a/fs/internal.h b/fs/internal.h index d410186bc369..f85c3212d25d 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -17,6 +17,7 @@ struct linux_binprm; struct path; struct mount; struct shrink_control; +struct fs_context; /* * block_dev.c @@ -51,6 +52,12 @@ int __generic_write_end(struct inode *inode, loff_t pos, unsigned copied, */ extern void __init chrdev_init(void); +/* + * fs_context.c + */ +extern int legacy_get_tree(struct fs_context *fc); +extern int parse_monolithic_mount_data(struct fs_context *, void *); + /* * namei.c */ @@ -101,8 +108,6 @@ extern struct file *alloc_empty_file_noaccount(int, const struct cred *); */ extern int do_remount_sb(struct super_block *, int, void *, int); extern bool trylock_super(struct super_block *sb); -extern struct dentry *mount_fs(struct file_system_type *, - int, const char *, 
void *); extern struct super_block *user_get_super(dev_t); /* diff --git a/fs/namespace.c b/fs/namespace.c index f0b8a8ca08df..3f2fd7a34733 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -27,6 +27,7 @@ #include #include #include +#include #include "pnode.h" #include "internal.h" @@ -940,36 +941,53 @@ static struct mount *skip_mnt_tree(struct mount *p) return p; } -struct vfsmount * -vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void *data) +struct vfsmount *vfs_kern_mount(struct file_system_type *type, + int flags, const char *name, + void *data) { + struct fs_context *fc; struct mount *mnt; - struct dentry *root; + int ret = 0; if (!type) return ERR_PTR(-ENODEV); + fc = fs_context_for_mount(type, flags); + if (IS_ERR(fc)) + return ERR_CAST(fc); + + if (name) { + fc->source = kstrdup(name, GFP_KERNEL); + if (!fc->source) + ret = -ENOMEM; + } + if (!ret) + ret = parse_monolithic_mount_data(fc, data); + if (!ret) + ret = vfs_get_tree(fc); + if (ret) { + put_fs_context(fc); + return ERR_PTR(ret); + } + up_write(&fc->root->d_sb->s_umount); mnt = alloc_vfsmnt(name); - if (!mnt) + if (!mnt) { + put_fs_context(fc); return ERR_PTR(-ENOMEM); + } if (flags & SB_KERNMOUNT) mnt->mnt.mnt_flags = MNT_INTERNAL; - root = mount_fs(type, flags, name, data); - if (IS_ERR(root)) { - mnt_free_id(mnt); - free_vfsmnt(mnt); - return ERR_CAST(root); - } - - mnt->mnt.mnt_root = root; - mnt->mnt.mnt_sb = root->d_sb; + atomic_inc(&fc->root->d_sb->s_active); + mnt->mnt.mnt_root = dget(fc->root); + mnt->mnt.mnt_sb = fc->root->d_sb; mnt->mnt_mountpoint = mnt->mnt.mnt_root; mnt->mnt_parent = mnt; lock_mount_hash(); - list_add_tail(&mnt->mnt_instance, &root->d_sb->s_mounts); + list_add_tail(&mnt->mnt_instance, &fc->root->d_sb->s_mounts); unlock_mount_hash(); + put_fs_context(fc); return &mnt->mnt; } EXPORT_SYMBOL_GPL(vfs_kern_mount); diff --git a/fs/super.c b/fs/super.c index 48e25eba8465..fc3887277ad1 100644 --- a/fs/super.c +++ b/fs/super.c @@ -35,6 +35,7 @@ 
#include #include #include +#include #include #include "internal.h" @@ -1241,27 +1242,24 @@ struct dentry *mount_single(struct file_system_type *fs_type, } EXPORT_SYMBOL(mount_single); -struct dentry * -mount_fs(struct file_system_type *type, int flags, const char *name, void *data) +/** + * vfs_get_tree - Get the mountable root + * @fc: The superblock configuration context. + * + * The filesystem is invoked to get or create a superblock which can then later + * be used for mounting. The filesystem places a pointer to the root to be + * used for mounting in @fc->root. + */ +int vfs_get_tree(struct fs_context *fc) { - struct dentry *root; struct super_block *sb; - int error = -ENOMEM; - void *sec_opts = NULL; + int error; - if (data && !(type->fs_flags & FS_BINARY_MOUNTDATA)) { - error = security_sb_eat_lsm_opts(data, &sec_opts); - if (error) - return ERR_PTR(error); - } + error = legacy_get_tree(fc); + if (error < 0) + return error; - root = type->mount(type, flags, name, data); - if (IS_ERR(root)) { - error = PTR_ERR(root); - goto out_free_secdata; - } - sb = root->d_sb; - BUG_ON(!sb); + sb = fc->root->d_sb; WARN_ON(!sb->s_bdi); /* @@ -1273,11 +1271,11 @@ mount_fs(struct file_system_type *type, int flags, const char *name, void *data) smp_wmb(); sb->s_flags |= SB_BORN; - error = security_sb_set_mnt_opts(sb, sec_opts, 0, NULL); + error = security_sb_set_mnt_opts(sb, fc->security, 0, NULL); if (error) goto out_sb; - if (!(flags & (MS_KERNMOUNT|MS_SUBMOUNT))) { + if (!(fc->sb_flags & (MS_KERNMOUNT|MS_SUBMOUNT))) { error = security_sb_kern_mount(sb); if (error) goto out_sb; @@ -1290,18 +1288,16 @@ mount_fs(struct file_system_type *type, int flags, const char *name, void *data) * violate this rule. 
*/ WARN((sb->s_maxbytes < 0), "%s set sb->s_maxbytes to " - "negative value (%lld)\n", type->name, sb->s_maxbytes); + "negative value (%lld)\n", fc->fs_type->name, sb->s_maxbytes); - up_write(&sb->s_umount); - security_free_mnt_opts(&sec_opts); - return root; + return 0; out_sb: - dput(root); + dput(fc->root); + fc->root = NULL; deactivate_locked_super(sb); -out_free_secdata: - security_free_mnt_opts(&sec_opts); - return ERR_PTR(error); + return error; } +EXPORT_SYMBOL(vfs_get_tree); /* * Setup private BDI for given superblock. It gets automatically cleaned up diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h new file mode 100644 index 000000000000..9805514444c9 --- /dev/null +++ b/include/linux/fs_context.h @@ -0,0 +1,64 @@ +/* Filesystem superblock creation and reconfiguration context. + * + * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. + */ + +#ifndef _LINUX_FS_CONTEXT_H +#define _LINUX_FS_CONTEXT_H + +#include +#include +#include + +struct cred; +struct dentry; +struct file_operations; +struct file_system_type; +struct net; +struct user_namespace; + +enum fs_context_purpose { + FS_CONTEXT_FOR_MOUNT, /* New superblock for explicit mount */ +}; + +/* + * Filesystem context for holding the parameters used in the creation or + * reconfiguration of a superblock. + * + * Superblock creation fills in ->root whereas reconfiguration begins with this + * already set. 
+ * + * See Documentation/filesystems/mounting.txt + */ +struct fs_context { + struct file_system_type *fs_type; + void *fs_private; /* The filesystem's context */ + struct dentry *root; /* The root and superblock */ + struct user_namespace *user_ns; /* The user namespace for this mount */ + struct net *net_ns; /* The network namespace for this mount */ + const struct cred *cred; /* The mounter's credentials */ + const char *source; /* The source name (eg. dev path) */ + const char *subtype; /* The subtype to set on the superblock */ + void *security; /* Linux S&M options */ + unsigned int sb_flags; /* Proposed superblock flags (SB_*) */ + unsigned int sb_flags_mask; /* Superblock flags that were changed */ + enum fs_context_purpose purpose:8; + bool need_free:1; /* Need to call ops->free() */ +}; + +/* + * fs_context manipulation functions. + */ +extern struct fs_context *fs_context_for_mount(struct file_system_type *fs_type, + unsigned int sb_flags); + +extern int vfs_get_tree(struct fs_context *fc); +extern void put_fs_context(struct fs_context *fc); + +#endif /* _LINUX_FS_CONTEXT_H */ From patchwork Tue Feb 19 16:29:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820183 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 70B38922 for ; Tue, 19 Feb 2019 16:29:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 613462CBA7 for ; Tue, 19 Feb 2019 16:29:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 53D6B2CB56; Tue, 19 Feb 2019 16:29:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 
tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2BBAD2CB56 for ; Tue, 19 Feb 2019 16:29:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727566AbfBSQ30 (ORCPT ); Tue, 19 Feb 2019 11:29:26 -0500 Received: from mx1.redhat.com ([209.132.183.28]:47586 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726342AbfBSQ30 (ORCPT ); Tue, 19 Feb 2019 11:29:26 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9E09510F84; Tue, 19 Feb 2019 16:29:25 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 33D77648B9; Tue, 19 Feb 2019 16:29:24 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 07/43] new helpers: vfs_create_mount(), fc_mount() From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:29:23 +0000 Message-ID: <155059376340.12449.11080415395465474975.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Tue, 19 Feb 2019 16:29:25 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro

Create a new helper, vfs_create_mount(), that creates a detached vfsmount object from an fs_context that has a superblock attached to it. Almost all uses will be paired with an immediately preceding vfs_get_tree(); add a helper for that combination. Switch vfs_kern_mount() to use this.

NOTE: mild behaviour change; passing NULL as 'device name' to something like procfs will change /proc/*/mountstats - "device none" instead of "no device". That is consistent with /proc/mounts et al.
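The "device none" change comes from the fallback used when naming the new mount: in the diff, vfs_create_mount() passes fc->source ?: "none" to alloc_vfsmnt(). A portable userspace equivalent of that expression, with a hypothetical toy_devname() helper standing in for it, might look like:

```c
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel function: spells out the GNU
 * "a ?: b" shorthand from the patch as a standard conditional.
 */
const char *toy_devname(const char *source)
{
	return source ? source : "none";
}
```

A NULL source therefore always yields the literal name "none", while any non-NULL string is passed through untouched.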
[do'h - EXPORT_SYMBOL_GPL slipped in by mistake; removed] [AV -- remove confused comment from vfs_create_mount()] [AV -- removed the second argument] Reviewed-by: David Howells Signed-off-by: Al Viro --- fs/namespace.c | 76 ++++++++++++++++++++++++++++++++++--------------- include/linux/mount.h | 3 ++ 2 files changed, 55 insertions(+), 24 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 3f2fd7a34733..156771f5745a 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -941,12 +941,59 @@ static struct mount *skip_mnt_tree(struct mount *p) return p; } +/** + * vfs_create_mount - Create a mount for a configured superblock + * @fc: The configuration context with the superblock attached + * + * Create a mount to an already configured superblock. If necessary, the + * caller should invoke vfs_get_tree() before calling this. + * + * Note that this does not attach the mount to anything. + */ +struct vfsmount *vfs_create_mount(struct fs_context *fc) +{ + struct mount *mnt; + + if (!fc->root) + return ERR_PTR(-EINVAL); + + mnt = alloc_vfsmnt(fc->source ?: "none"); + if (!mnt) + return ERR_PTR(-ENOMEM); + + if (fc->sb_flags & SB_KERNMOUNT) + mnt->mnt.mnt_flags = MNT_INTERNAL; + + atomic_inc(&fc->root->d_sb->s_active); + mnt->mnt.mnt_sb = fc->root->d_sb; + mnt->mnt.mnt_root = dget(fc->root); + mnt->mnt_mountpoint = mnt->mnt.mnt_root; + mnt->mnt_parent = mnt; + + lock_mount_hash(); + list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts); + unlock_mount_hash(); + return &mnt->mnt; +} +EXPORT_SYMBOL(vfs_create_mount); + +struct vfsmount *fc_mount(struct fs_context *fc) +{ + int err = vfs_get_tree(fc); + if (!err) { + up_write(&fc->root->d_sb->s_umount); + return vfs_create_mount(fc); + } + return ERR_PTR(err); +} +EXPORT_SYMBOL(fc_mount); + struct vfsmount *vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void *data) { struct fs_context *fc; - struct mount *mnt; + struct vfsmount *mnt; int ret = 0; if (!type) @@ -964,31 +1011,12 @@ 
struct vfsmount *vfs_kern_mount(struct file_system_type *type, if (!ret) ret = parse_monolithic_mount_data(fc, data); if (!ret) - ret = vfs_get_tree(fc); - if (ret) { - put_fs_context(fc); - return ERR_PTR(ret); - } - up_write(&fc->root->d_sb->s_umount); - mnt = alloc_vfsmnt(name); - if (!mnt) { - put_fs_context(fc); - return ERR_PTR(-ENOMEM); - } - - if (flags & SB_KERNMOUNT) - mnt->mnt.mnt_flags = MNT_INTERNAL; + mnt = fc_mount(fc); + else + mnt = ERR_PTR(ret); - atomic_inc(&fc->root->d_sb->s_active); - mnt->mnt.mnt_root = dget(fc->root); - mnt->mnt.mnt_sb = fc->root->d_sb; - mnt->mnt_mountpoint = mnt->mnt.mnt_root; - mnt->mnt_parent = mnt; - lock_mount_hash(); - list_add_tail(&mnt->mnt_instance, &fc->root->d_sb->s_mounts); - unlock_mount_hash(); put_fs_context(fc); - return &mnt->mnt; + return mnt; } EXPORT_SYMBOL_GPL(vfs_kern_mount); diff --git a/include/linux/mount.h b/include/linux/mount.h index 037eed52164b..9197ddbf35fb 100644 --- a/include/linux/mount.h +++ b/include/linux/mount.h @@ -21,6 +21,7 @@ struct super_block; struct vfsmount; struct dentry; struct mnt_namespace; +struct fs_context; #define MNT_NOSUID 0x01 #define MNT_NODEV 0x02 @@ -88,6 +89,8 @@ struct path; extern struct vfsmount *clone_private_mount(const struct path *path); struct file_system_type; +extern struct vfsmount *fc_mount(struct fs_context *fc); +extern struct vfsmount *vfs_create_mount(struct fs_context *fc); extern struct vfsmount *vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void *data); From patchwork Tue Feb 19 16:29:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820185 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BD022922 for ; Tue, 19 Feb 2019 16:29:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost 
[127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A59152CBC8 for ; Tue, 19 Feb 2019 16:29:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 99D2B2CBCD; Tue, 19 Feb 2019 16:29:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 292562CBC8 for ; Tue, 19 Feb 2019 16:29:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726110AbfBSQ3e (ORCPT ); Tue, 19 Feb 2019 11:29:34 -0500 Received: from mx1.redhat.com ([209.132.183.28]:53474 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726342AbfBSQ3e (ORCPT ); Tue, 19 Feb 2019 11:29:34 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 08735599C2; Tue, 19 Feb 2019 16:29:34 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id A90BE60C1D; Tue, 19 Feb 2019 16:29:32 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 08/43] teach vfs_get_tree() to handle subtype, switch do_new_mount() to it From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:29:30 +0000 Message-ID: <155059377086.12449.48912929800325479.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Tue, 19 Feb 2019 16:29:34 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro Roll the handling of subtypes into do_new_mount() and vfs_get_tree(). The former determines any subtype string and hangs it off the fs_context; the latter applies it. Make do_new_mount() create, parameterise and commit an fs_context and create a mount for itself rather than calling vfs_kern_mount(). [AV -- missing kstrdup()] [AV -- ... 
and no kstrdup() if we get to setting ->s_subtype - we simply transfer it from fc, leaving NULL behind] [AV -- constify ->s_subtype, while we are at it] Reviewed-by: David Howells Signed-off-by: Al Viro --- fs/namespace.c | 77 +++++++++++++++++++++++++++++++--------------------- fs/super.c | 5 +++ include/linux/fs.h | 2 + 3 files changed, 52 insertions(+), 32 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 156771f5745a..0354cb6ac2d3 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2479,29 +2479,6 @@ static int do_move_mount(struct path *path, const char *old_name) return err; } -static struct vfsmount *fs_set_subtype(struct vfsmount *mnt, const char *fstype) -{ - int err; - const char *subtype = strchr(fstype, '.'); - if (subtype) { - subtype++; - err = -EINVAL; - if (!subtype[0]) - goto err; - } else - subtype = ""; - - mnt->mnt_sb->s_subtype = kstrdup(subtype, GFP_KERNEL); - err = -ENOMEM; - if (!mnt->mnt_sb->s_subtype) - goto err; - return mnt; - - err: - mntput(mnt); - return ERR_PTR(err); -} - /* * add a mount into a namespace's mount tree */ @@ -2557,7 +2534,9 @@ static int do_new_mount(struct path *path, const char *fstype, int sb_flags, { struct file_system_type *type; struct vfsmount *mnt; - int err; + struct fs_context *fc; + const char *subtype = NULL; + int err = 0; if (!fstype) return -EINVAL; @@ -2566,23 +2545,59 @@ static int do_new_mount(struct path *path, const char *fstype, int sb_flags, if (!type) return -ENODEV; - mnt = vfs_kern_mount(type, sb_flags, name, data); - if (!IS_ERR(mnt) && (type->fs_flags & FS_HAS_SUBTYPE) && - !mnt->mnt_sb->s_subtype) - mnt = fs_set_subtype(mnt, fstype); + if (type->fs_flags & FS_HAS_SUBTYPE) { + subtype = strchr(fstype, '.'); + if (subtype) { + subtype++; + if (!*subtype) { + put_filesystem(type); + return -EINVAL; + } + } else { + subtype = ""; + } + } + fc = fs_context_for_mount(type, sb_flags); put_filesystem(type); - if (IS_ERR(mnt)) - return PTR_ERR(mnt); + if (IS_ERR(fc)) + return
PTR_ERR(fc); + + if (subtype) { + fc->subtype = kstrdup(subtype, GFP_KERNEL); + if (!fc->subtype) + err = -ENOMEM; + } + if (!err && name) { + fc->source = kstrdup(name, GFP_KERNEL); + if (!fc->source) + err = -ENOMEM; + } + if (!err) + err = parse_monolithic_mount_data(fc, data); + if (!err) + err = vfs_get_tree(fc); + if (err) + goto out; + + up_write(&fc->root->d_sb->s_umount); + mnt = vfs_create_mount(fc); + if (IS_ERR(mnt)) { + err = PTR_ERR(mnt); + goto out; + } if (mount_too_revealing(mnt, &mnt_flags)) { mntput(mnt); - return -EPERM; + err = -EPERM; + goto out; } err = do_add_mount(real_mount(mnt), path, mnt_flags); if (err) mntput(mnt); +out: + put_fs_context(fc); return err; } diff --git a/fs/super.c b/fs/super.c index fc3887277ad1..b91b6df05b67 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1262,6 +1262,11 @@ int vfs_get_tree(struct fs_context *fc) sb = fc->root->d_sb; WARN_ON(!sb->s_bdi); + if (fc->subtype && !sb->s_subtype) { + sb->s_subtype = fc->subtype; + fc->subtype = NULL; + } + /* * Write barrier is for super_cache_count(). We place it before setting * SB_BORN as the data dependency between the two functions is the diff --git a/include/linux/fs.h b/include/linux/fs.h index 811c77743dad..36fff12ab890 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1447,7 +1447,7 @@ struct super_block { * Filesystem subtype. 
If non-empty the filesystem type field * in /proc/mounts will be "type.subtype" */ - char *s_subtype; + const char *s_subtype; const struct dentry_operations *s_d_op; /* default d_op for dentries */ From patchwork Tue Feb 19 16:29:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820189 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 65D33922 for ; Tue, 19 Feb 2019 16:29:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 51CF82CBCD for ; Tue, 19 Feb 2019 16:29:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 45ECA2CBD1; Tue, 19 Feb 2019 16:29:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C38E22CBC8 for ; Tue, 19 Feb 2019 16:29:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729068AbfBSQ3m (ORCPT ); Tue, 19 Feb 2019 11:29:42 -0500 Received: from mx1.redhat.com ([209.132.183.28]:60470 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729041AbfBSQ3m (ORCPT ); Tue, 19 Feb 2019 11:29:42 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5F97987638; Tue, 19 Feb 2019 16:29:41 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by 
smtp.corp.redhat.com (Postfix) with ESMTP id F241A5D6AA; Tue, 19 Feb 2019 16:29:39 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 09/43] new helper: do_new_mount_fc() From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:29:39 +0000 Message-ID: <155059377923.12449.15324572719021896571.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Tue, 19 Feb 2019 16:29:41 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Create an fs_context-aware version of do_new_mount(). This takes an fs_context with a superblock already attached to it. Make do_new_mount() use do_new_mount_fc() rather than do_new_mount(); this allows the consolidation of the mount creation, check and add steps. To make this work, mount_too_revealing() is changed to take a superblock rather than a mount (which the fs_context doesn't have available), allowing this check to be done before the mount object is created. 
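
To make the reordering concrete, here is a minimal userspace sketch in C. It is not kernel code: the `model_*` types and helpers are hypothetical stand-ins for the superblock, the fs_context and the policy check. The only point it makes is that a check which takes the superblock can reject the request before any mount object exists.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; not the real structures. */
struct model_sb {
	bool too_revealing;	/* what the policy check would decide */
	int active;		/* models the active reference held via fc->root */
};

struct model_fc {
	struct model_sb *root_sb;	/* models fc->root->d_sb; NULL once dropped */
};

static int model_mounts_created;	/* counts mount objects ever allocated */

/* The check needs only the superblock, so it can run first. */
static bool model_too_revealing(const struct model_sb *sb)
{
	return sb->too_revealing;
}

/* Models the consolidated "check, create, add" sequence. */
static int model_do_new_mount_fc(struct model_fc *fc)
{
	struct model_sb *sb = fc->root_sb;

	if (model_too_revealing(sb)) {
		/* Reject early: drop the superblock, never build a mount. */
		fc->root_sb = NULL;
		sb->active--;
		return -EPERM;
	}

	model_mounts_created++;	/* stands in for vfs_create_mount() */
	return 0;		/* stands in for a successful do_add_mount() */
}
```

In this model a rejected request leaves `model_mounts_created` untouched, which is the property the signature change is after. Locking and the real reference counting are deliberately left out.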
Signed-off-by: David Howells Co-developed-by: Al Viro Signed-off-by: Al Viro --- fs/namespace.c | 65 ++++++++++++++++++++++++++++++++++---------------------- 1 file changed, 39 insertions(+), 26 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 0354cb6ac2d3..f629e1c7f3cc 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2523,7 +2523,37 @@ static int do_add_mount(struct mount *newmnt, struct path *path, int mnt_flags) return err; } -static bool mount_too_revealing(struct vfsmount *mnt, int *new_mnt_flags); +static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags); + +/* + * Create a new mount using a superblock configuration and request it + * be added to the namespace tree. + */ +static int do_new_mount_fc(struct fs_context *fc, struct path *mountpoint, + unsigned int mnt_flags) +{ + struct vfsmount *mnt; + struct super_block *sb = fc->root->d_sb; + int error; + + if (mount_too_revealing(sb, &mnt_flags)) { + dput(fc->root); + fc->root = NULL; + deactivate_locked_super(sb); + return -EPERM; + } + + up_write(&sb->s_umount); + + mnt = vfs_create_mount(fc); + if (IS_ERR(mnt)) + return PTR_ERR(mnt); + + error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags); + if (error < 0) + mntput(mnt); + return error; +} /* * create a new mount for userspace and request it to be added into the @@ -2533,7 +2563,6 @@ static int do_new_mount(struct path *path, const char *fstype, int sb_flags, int mnt_flags, const char *name, void *data) { struct file_system_type *type; - struct vfsmount *mnt; struct fs_context *fc; const char *subtype = NULL; int err = 0; @@ -2577,26 +2606,9 @@ static int do_new_mount(struct path *path, const char *fstype, int sb_flags, err = parse_monolithic_mount_data(fc, data); if (!err) err = vfs_get_tree(fc); - if (err) - goto out; - - up_write(&fc->root->d_sb->s_umount); - mnt = vfs_create_mount(fc); - if (IS_ERR(mnt)) { - err = PTR_ERR(mnt); - goto out; - } - - if (mount_too_revealing(mnt, &mnt_flags)) { - 
mntput(mnt); - err = -EPERM; - goto out; - } + if (!err) + err = do_new_mount_fc(fc, path, mnt_flags); - err = do_add_mount(real_mount(mnt), path, mnt_flags); - if (err) - mntput(mnt); -out: put_fs_context(fc); return err; } @@ -3421,7 +3433,8 @@ bool current_chrooted(void) return chrooted; } -static bool mnt_already_visible(struct mnt_namespace *ns, struct vfsmount *new, +static bool mnt_already_visible(struct mnt_namespace *ns, + const struct super_block *sb, int *new_mnt_flags) { int new_flags = *new_mnt_flags; @@ -3433,7 +3446,7 @@ static bool mnt_already_visible(struct mnt_namespace *ns, struct vfsmount *new, struct mount *child; int mnt_flags; - if (mnt->mnt.mnt_sb->s_type != new->mnt_sb->s_type) + if (mnt->mnt.mnt_sb->s_type != sb->s_type) continue; /* This mount is not fully visible if it's root directory @@ -3484,7 +3497,7 @@ static bool mnt_already_visible(struct mnt_namespace *ns, struct vfsmount *new, return visible; } -static bool mount_too_revealing(struct vfsmount *mnt, int *new_mnt_flags) +static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags) { const unsigned long required_iflags = SB_I_NOEXEC | SB_I_NODEV; struct mnt_namespace *ns = current->nsproxy->mnt_ns; @@ -3494,7 +3507,7 @@ static bool mount_too_revealing(struct vfsmount *mnt, int *new_mnt_flags) return false; /* Can this filesystem be too revealing? 
*/ - s_iflags = mnt->mnt_sb->s_iflags; + s_iflags = sb->s_iflags; if (!(s_iflags & SB_I_USERNS_VISIBLE)) return false; @@ -3504,7 +3517,7 @@ static bool mount_too_revealing(struct vfsmount *mnt, int *new_mnt_flags) return true; } - return !mnt_already_visible(ns, mnt, new_mnt_flags); + return !mnt_already_visible(ns, sb, new_mnt_flags); } bool mnt_may_suid(struct vfsmount *mnt) From patchwork Tue Feb 19 16:29:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820193 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C52D31805 for ; Tue, 19 Feb 2019 16:29:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AFD6E2CBC8 for ; Tue, 19 Feb 2019 16:29:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A3A182CBD5; Tue, 19 Feb 2019 16:29:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2DAF82CBC8 for ; Tue, 19 Feb 2019 16:29:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727082AbfBSQ3t (ORCPT ); Tue, 19 Feb 2019 11:29:49 -0500 Received: from mx1.redhat.com ([209.132.183.28]:37762 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726342AbfBSQ3t (ORCPT ); Tue, 19 Feb 2019 11:29:49 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by 
mx1.redhat.com (Postfix) with ESMTPS id C7C3958E5C; Tue, 19 Feb 2019 16:29:48 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 637575D970; Tue, 19 Feb 2019 16:29:47 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 10/43] vfs_get_tree(): evict the call of security_sb_kern_mount() From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:29:46 +0000 Message-ID: <155059378660.12449.1465968920527134556.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 19 Feb 2019 16:29:48 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro Right now vfs_get_tree() calls security_sb_kern_mount() (i.e. mount MAC) unless it gets MS_KERNMOUNT or MS_SUBMOUNT in flags. Doing it that way is both clumsy and imprecise. 
Consider the callers' tree of vfs_get_tree():

vfs_get_tree()
	<- do_new_mount()
	<- vfs_kern_mount()
		<- simple_pin_fs()
		<- vfs_submount()
		<- kern_mount_data()
		<- init_mount_tree()
		<- btrfs_mount()
			<- vfs_get_tree()
		<- nfs_do_root_mount()
			<- nfs4_try_mount()
				<- nfs_fs_mount()
					<- vfs_get_tree()
			<- nfs4_referral_mount()

do_new_mount() always does need MAC (we are guaranteed that neither MS_KERNMOUNT nor MS_SUBMOUNT will be passed there).

simple_pin_fs(), vfs_submount() and kern_mount_data() pass explicit flags inhibiting that check. So does nfs4_referral_mount() (the flags there are ultimately coming from vfs_submount()).

init_mount_tree() is called too early for anything LSM-related; it doesn't matter whether we attempt those checks, they'll do nothing.

Finally, in case of btrfs_mount() and nfs_fs_mount(), doing MAC is pointless - either the caller will do it, or the flags are such that we wouldn't have done it either.

In other words, the one and only case when we want that check done is when we are called from do_new_mount(), and there we want it unconditionally. So let's simply move it there. The superblock is still locked, so nobody is going to get access to it (via ustat(2), etc.) until we get a chance to apply the checks - we are free to move them to any point up to where we drop ->s_umount (in do_new_mount_fc()).
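
The argument above can be illustrated with a small userspace sketch in C. It is an illustration under stated assumptions, not kernel code: the `SKETCH_*` constants mirror the bit positions of MS_KERNMOUNT and MS_SUBMOUNT, and the `sketch_*` functions and the caller enum are hypothetical names. It contrasts the old flag-based test inside vfs_get_tree() with the new rule that only the do_new_mount() path performs the check.

```c
#include <stdbool.h>

/* Bit positions mirror MS_KERNMOUNT and MS_SUBMOUNT; names are local to this sketch. */
#define SKETCH_KERNMOUNT (1u << 22)
#define SKETCH_SUBMOUNT  (1u << 26)

/* Hypothetical labels for the entry points discussed above. */
enum sketch_caller {
	SKETCH_DO_NEW_MOUNT,	/* userspace mount(2) path */
	SKETCH_KERN_MOUNT,	/* kern_mount_data(), simple_pin_fs() */
	SKETCH_SUBMOUNT_PATH,	/* vfs_submount() and its users */
};

/* Old rule: decided inside vfs_get_tree() by inspecting the flags. */
static bool sketch_old_wants_mac(unsigned int sb_flags)
{
	return !(sb_flags & (SKETCH_KERNMOUNT | SKETCH_SUBMOUNT));
}

/* New rule: decided by who is calling, with no flag inspection at all. */
static bool sketch_new_wants_mac(enum sketch_caller caller)
{
	return caller == SKETCH_DO_NEW_MOUNT;
}
```

Both rules agree for the callers listed in the tree, but the second one states the intent directly instead of inferring it from flags that happen to be set.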
Signed-off-by: Al Viro --- fs/fs_context.c | 8 ++++++++ fs/internal.h | 1 + fs/namespace.c | 12 +++++++----- fs/super.c | 15 +++------------ 4 files changed, 19 insertions(+), 17 deletions(-) diff --git a/fs/fs_context.c b/fs/fs_context.c index 4294091b689d..857cd46a687b 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -90,6 +90,14 @@ struct fs_context *fs_context_for_mount(struct file_system_type *fs_type, } EXPORT_SYMBOL(fs_context_for_mount); +void fc_drop_locked(struct fs_context *fc) +{ + struct super_block *sb = fc->root->d_sb; + dput(fc->root); + fc->root = NULL; + deactivate_locked_super(sb); +} + static void legacy_fs_context_free(struct fs_context *fc); /** * put_fs_context - Dispose of a superblock configuration context. diff --git a/fs/internal.h b/fs/internal.h index f85c3212d25d..6af26d897034 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -57,6 +57,7 @@ extern void __init chrdev_init(void); */ extern int legacy_get_tree(struct fs_context *fc); extern int parse_monolithic_mount_data(struct fs_context *, void *); +extern void fc_drop_locked(struct fs_context *); /* * namei.c diff --git a/fs/namespace.c b/fs/namespace.c index f629e1c7f3cc..750500c6c33d 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2536,11 +2536,13 @@ static int do_new_mount_fc(struct fs_context *fc, struct path *mountpoint, struct super_block *sb = fc->root->d_sb; int error; - if (mount_too_revealing(sb, &mnt_flags)) { - dput(fc->root); - fc->root = NULL; - deactivate_locked_super(sb); - return -EPERM; + error = security_sb_kern_mount(sb); + if (!error && mount_too_revealing(sb, &mnt_flags)) + error = -EPERM; + + if (unlikely(error)) { + fc_drop_locked(fc); + return error; } up_write(&sb->s_umount); diff --git a/fs/super.c b/fs/super.c index b91b6df05b67..11e2a6cb3baf 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1277,13 +1277,9 @@ int vfs_get_tree(struct fs_context *fc) sb->s_flags |= SB_BORN; error = security_sb_set_mnt_opts(sb, fc->security, 0, NULL); - if (error) - goto 
out_sb; - - if (!(fc->sb_flags & (MS_KERNMOUNT|MS_SUBMOUNT))) { - error = security_sb_kern_mount(sb); - if (error) - goto out_sb; + if (unlikely(error)) { + fc_drop_locked(fc); + return error; } /* @@ -1296,11 +1292,6 @@ int vfs_get_tree(struct fs_context *fc) "negative value (%lld)\n", fc->fs_type->name, sb->s_maxbytes); return 0; -out_sb: - dput(fc->root); - fc->root = NULL; - deactivate_locked_super(sb); - return error; } EXPORT_SYMBOL(vfs_get_tree); From patchwork Tue Feb 19 16:29:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820199 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5F9A4180E for ; Tue, 19 Feb 2019 16:30:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4EA742CBD5 for ; Tue, 19 Feb 2019 16:30:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 42D742CBE6; Tue, 19 Feb 2019 16:30:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5C0282CBE3 for ; Tue, 19 Feb 2019 16:29:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727527AbfBSQ35 (ORCPT ); Tue, 19 Feb 2019 11:29:57 -0500 Received: from mx1.redhat.com ([209.132.183.28]:37884 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729234AbfBSQ35 (ORCPT ); Tue, 19 Feb 2019 11:29:57 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using 
TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 501359FDD3; Tue, 19 Feb 2019 16:29:56 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id C830D17C78; Tue, 19 Feb 2019 16:29:54 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 11/43] convert do_remount_sb() to fs_context From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:29:54 +0000 Message-ID: <155059379404.12449.7151387243186063769.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 19 Feb 2019 16:29:56 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Replace do_remount_sb() with a function, reconfigure_super(), that's fs_context aware. The fs_context is expected to be parameterised already and have ->root pointing to the superblock to be reconfigured. A legacy wrapper is provided that is intended to be called from the fs_context ops when those appear, but for now is called directly from reconfigure_super(). This wrapper invokes the ->remount_fs() superblock op for the moment. It is intended that the remount_fs() op will be phased out. 
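
As a rough illustration of the reconfiguration step, the following userspace C sketch models two pieces that appear in the diff for this patch: a legacy wrapper that calls an optional remount callback, and the masked update that applies only the flag bits named in the mask. It is not kernel code and all `demo_*` names are hypothetical; the read-only bit uses the value 1, as SB_RDONLY does, and the other bit is arbitrary.

```c
#include <stddef.h>

#define DEMO_RDONLY 0x01u	/* same value as SB_RDONLY */
#define DEMO_OTHER  0x10u	/* arbitrary bit, for illustration only */

/* Hypothetical stand-in for a superblock with an optional remount op. */
struct demo_sb {
	unsigned int flags;
	int (*remount)(struct demo_sb *sb, unsigned int *flags);
};

/* Hypothetical stand-in for the reconfiguration context. */
struct demo_fc {
	struct demo_sb *sb;
	unsigned int sb_flags;		/* requested values */
	unsigned int sb_flags_mask;	/* which bits the request may change */
};

/* Legacy wrapper: a filesystem without a remount op accepts anything. */
static int demo_legacy_reconfigure(struct demo_fc *fc)
{
	if (!fc->sb->remount)
		return 0;
	return fc->sb->remount(fc->sb, &fc->sb_flags);
}

/* Apply only the masked bits; every other bit keeps its old value. */
static int demo_reconfigure(struct demo_fc *fc)
{
	int ret = demo_legacy_reconfigure(fc);

	if (ret)
		return ret;
	fc->sb->flags = (fc->sb->flags & ~fc->sb_flags_mask) |
			(fc->sb_flags & fc->sb_flags_mask);
	return 0;
}
```

The sketch omits the forced-remount path, locking and the read-only transition handling; it only shows why a caller that passes a mask of just the read-only bit cannot disturb unrelated flags.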
The fs_context->purpose is set to FS_CONTEXT_FOR_RECONFIGURE to indicate that the context is being used for reconfiguration. do_umount_root() is provided to consolidate remount-to-R/O for umount and emergency remount by creating a context and invoking reconfiguration. do_remount(), do_umount() and do_emergency_remount_callback() are switched to use the new process. [AV -- fold UMOUNT and EMERGENCY_REMOUNT in; fixes the umount / bug, gets rid of pointless complexity] [AV -- set ->net_ns in all cases; nfs remount will need that] [AV -- shift security_sb_remount() call into reconfigure_super(); the callers that didn't do security_sb_remount() have NULL fc->security anyway, so it's a no-op for them] Signed-off-by: David Howells Co-developed-by: Al Viro Signed-off-by: Al Viro --- fs/fs_context.c | 35 ++++++++++++++ fs/internal.h | 3 + fs/namespace.c | 61 ++++++++++++++++--------- fs/super.c | 107 ++++++++++++++++++++++++++++++-------------- include/linux/fs.h | 1 include/linux/fs_context.h | 4 ++ 6 files changed, 152 insertions(+), 59 deletions(-) diff --git a/fs/fs_context.c b/fs/fs_context.c index 857cd46a687b..5e2c3aba1dd8 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -69,6 +69,13 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type, case FS_CONTEXT_FOR_MOUNT: fc->user_ns = get_user_ns(fc->cred->user_ns); break; + case FS_CONTEXT_FOR_RECONFIGURE: + /* We don't pin any namespaces as the superblock's + * subscriptions cannot be changed at this point. 
+ */ + atomic_inc(&reference->d_sb->s_active); + fc->root = dget(reference); + break; } ret = legacy_init_fs_context(fc); @@ -90,6 +97,15 @@ struct fs_context *fs_context_for_mount(struct file_system_type *fs_type, } EXPORT_SYMBOL(fs_context_for_mount); +struct fs_context *fs_context_for_reconfigure(struct dentry *dentry, + unsigned int sb_flags, + unsigned int sb_flags_mask) +{ + return alloc_fs_context(dentry->d_sb->s_type, dentry, sb_flags, + sb_flags_mask, FS_CONTEXT_FOR_RECONFIGURE); +} +EXPORT_SYMBOL(fs_context_for_reconfigure); + void fc_drop_locked(struct fs_context *fc) { struct super_block *sb = fc->root->d_sb; @@ -99,6 +115,7 @@ void fc_drop_locked(struct fs_context *fc) } static void legacy_fs_context_free(struct fs_context *fc); + /** * put_fs_context - Dispose of a superblock configuration context. * @fc: The context to dispose of. @@ -118,8 +135,7 @@ void put_fs_context(struct fs_context *fc) legacy_fs_context_free(fc); security_free_mnt_opts(&fc->security); - if (fc->net_ns) - put_net(fc->net_ns); + put_net(fc->net_ns); put_user_ns(fc->user_ns); put_cred(fc->cred); kfree(fc->subtype); @@ -172,6 +188,21 @@ int legacy_get_tree(struct fs_context *fc) return 0; } +/* + * Handle remount. + */ +int legacy_reconfigure(struct fs_context *fc) +{ + struct legacy_fs_context *ctx = fc->fs_private; + struct super_block *sb = fc->root->d_sb; + + if (!sb->s_op->remount_fs) + return 0; + + return sb->s_op->remount_fs(sb, &fc->sb_flags, + ctx ? ctx->legacy_data : NULL); +} + /* * Initialise a legacy context for a filesystem that doesn't support * fs_context. 
diff --git a/fs/internal.h b/fs/internal.h index 6af26d897034..016a5b8dd305 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -56,6 +56,7 @@ extern void __init chrdev_init(void); * fs_context.c */ extern int legacy_get_tree(struct fs_context *fc); +extern int legacy_reconfigure(struct fs_context *fc); extern int parse_monolithic_mount_data(struct fs_context *, void *); extern void fc_drop_locked(struct fs_context *); @@ -107,7 +108,7 @@ extern struct file *alloc_empty_file_noaccount(int, const struct cred *); /* * super.c */ -extern int do_remount_sb(struct super_block *, int, void *, int); +extern int reconfigure_super(struct fs_context *); extern bool trylock_super(struct super_block *sb); extern struct super_block *user_get_super(dev_t); diff --git a/fs/namespace.c b/fs/namespace.c index 750500c6c33d..931228d8518a 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1489,6 +1489,29 @@ static void umount_tree(struct mount *mnt, enum umount_tree_flags how) static void shrink_submounts(struct mount *mnt); +static int do_umount_root(struct super_block *sb) +{ + int ret = 0; + + down_write(&sb->s_umount); + if (!sb_rdonly(sb)) { + struct fs_context *fc; + + fc = fs_context_for_reconfigure(sb->s_root, SB_RDONLY, + SB_RDONLY); + if (IS_ERR(fc)) { + ret = PTR_ERR(fc); + } else { + ret = parse_monolithic_mount_data(fc, NULL); + if (!ret) + ret = reconfigure_super(fc); + put_fs_context(fc); + } + } + up_write(&sb->s_umount); + return ret; +} + static int do_umount(struct mount *mnt, int flags) { struct super_block *sb = mnt->mnt.mnt_sb; @@ -1554,11 +1577,7 @@ static int do_umount(struct mount *mnt, int flags) */ if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN)) return -EPERM; - down_write(&sb->s_umount); - if (!sb_rdonly(sb)) - retval = do_remount_sb(sb, SB_RDONLY, NULL, 0); - up_write(&sb->s_umount); - return retval; + return do_umount_root(sb); } namespace_lock(); @@ -2367,7 +2386,7 @@ static int do_remount(struct path *path, int ms_flags, int sb_flags, int err; struct 
super_block *sb = path->mnt->mnt_sb; struct mount *mnt = real_mount(path->mnt); - void *sec_opts = NULL; + struct fs_context *fc; if (!check_mnt(mnt)) return -EINVAL; @@ -2378,24 +2397,22 @@ static int do_remount(struct path *path, int ms_flags, int sb_flags, if (!can_change_locked_flags(mnt, mnt_flags)) return -EPERM; - if (data && !(sb->s_type->fs_flags & FS_BINARY_MOUNTDATA)) { - err = security_sb_eat_lsm_opts(data, &sec_opts); - if (err) - return err; - } - err = security_sb_remount(sb, sec_opts); - security_free_mnt_opts(&sec_opts); - if (err) - return err; + fc = fs_context_for_reconfigure(path->dentry, sb_flags, MS_RMT_MASK); + if (IS_ERR(fc)) + return PTR_ERR(fc); - down_write(&sb->s_umount); - err = -EPERM; - if (ns_capable(sb->s_user_ns, CAP_SYS_ADMIN)) { - err = do_remount_sb(sb, sb_flags, data, 0); - if (!err) - set_mount_attributes(mnt, mnt_flags); + err = parse_monolithic_mount_data(fc, data); + if (!err) { + down_write(&sb->s_umount); + err = -EPERM; + if (ns_capable(sb->s_user_ns, CAP_SYS_ADMIN)) { + err = reconfigure_super(fc); + if (!err) + set_mount_attributes(mnt, mnt_flags); + } + up_write(&sb->s_umount); } - up_write(&sb->s_umount); + put_fs_context(fc); return err; } diff --git a/fs/super.c b/fs/super.c index 11e2a6cb3baf..50553233dd15 100644 --- a/fs/super.c +++ b/fs/super.c @@ -836,28 +836,35 @@ struct super_block *user_get_super(dev_t dev) } /** - * do_remount_sb - asks filesystem to change mount options. - * @sb: superblock in question - * @sb_flags: revised superblock flags - * @data: the rest of options - * @force: whether or not to force the change + * reconfigure_super - asks filesystem to change superblock parameters + * @fc: The superblock and configuration * - * Alters the mount options of a mounted file system. + * Alters the configuration parameters of a live superblock. 
*/ -int do_remount_sb(struct super_block *sb, int sb_flags, void *data, int force) +int reconfigure_super(struct fs_context *fc) { + struct super_block *sb = fc->root->d_sb; int retval; - int remount_ro; + bool remount_ro = false; + bool force = fc->sb_flags & SB_FORCE; + if (fc->sb_flags_mask & ~MS_RMT_MASK) + return -EINVAL; if (sb->s_writers.frozen != SB_UNFROZEN) return -EBUSY; + retval = security_sb_remount(sb, fc->security); + if (retval) + return retval; + + if (fc->sb_flags_mask & SB_RDONLY) { #ifdef CONFIG_BLOCK - if (!(sb_flags & SB_RDONLY) && bdev_read_only(sb->s_bdev)) - return -EACCES; + if (!(fc->sb_flags & SB_RDONLY) && bdev_read_only(sb->s_bdev)) + return -EACCES; #endif - remount_ro = (sb_flags & SB_RDONLY) && !sb_rdonly(sb); + remount_ro = (fc->sb_flags & SB_RDONLY) && !sb_rdonly(sb); + } if (remount_ro) { if (!hlist_empty(&sb->s_pins)) { @@ -868,13 +875,14 @@ int do_remount_sb(struct super_block *sb, int sb_flags, void *data, int force) return 0; if (sb->s_writers.frozen != SB_UNFROZEN) return -EBUSY; - remount_ro = (sb_flags & SB_RDONLY) && !sb_rdonly(sb); + remount_ro = !sb_rdonly(sb); } } shrink_dcache_sb(sb); - /* If we are remounting RDONLY and current sb is read/write, - make sure there are no rw files opened */ + /* If we are reconfiguring to RDONLY and current sb is read/write, + * make sure there are no files open for writing. 
+ */ if (remount_ro) { if (force) { sb->s_readonly_remount = 1; @@ -886,17 +894,17 @@ int do_remount_sb(struct super_block *sb, int sb_flags, void *data, int force) } } - if (sb->s_op->remount_fs) { - retval = sb->s_op->remount_fs(sb, &sb_flags, data); - if (retval) { - if (!force) - goto cancel_readonly; - /* If forced remount, go ahead despite any errors */ - WARN(1, "forced remount of a %s fs returned %i\n", - sb->s_type->name, retval); - } + retval = legacy_reconfigure(fc); + if (retval) { + if (!force) + goto cancel_readonly; + /* If forced remount, go ahead despite any errors */ + WARN(1, "forced remount of a %s fs returned %i\n", + sb->s_type->name, retval); } - sb->s_flags = (sb->s_flags & ~MS_RMT_MASK) | (sb_flags & MS_RMT_MASK); + + WRITE_ONCE(sb->s_flags, ((sb->s_flags & ~fc->sb_flags_mask) | + (fc->sb_flags & fc->sb_flags_mask))); /* Needs to be ordered wrt mnt_is_readonly() */ smp_wmb(); sb->s_readonly_remount = 0; @@ -923,10 +931,15 @@ static void do_emergency_remount_callback(struct super_block *sb) down_write(&sb->s_umount); if (sb->s_root && sb->s_bdev && (sb->s_flags & SB_BORN) && !sb_rdonly(sb)) { - /* - * What lock protects sb->s_flags?? - */ - do_remount_sb(sb, SB_RDONLY, NULL, 1); + struct fs_context *fc; + + fc = fs_context_for_reconfigure(sb->s_root, + SB_RDONLY | SB_FORCE, SB_RDONLY); + if (!IS_ERR(fc)) { + if (parse_monolithic_mount_data(fc, NULL) == 0) + (void)reconfigure_super(fc); + put_fs_context(fc); + } } up_write(&sb->s_umount); } @@ -1213,6 +1226,31 @@ struct dentry *mount_nodev(struct file_system_type *fs_type, } EXPORT_SYMBOL(mount_nodev); +static int reconfigure_single(struct super_block *s, + int flags, void *data) +{ + struct fs_context *fc; + int ret; + + /* The caller really need to be passing fc down into mount_single(), + * then a chunk of this can be removed. [Bollocks -- AV] + * Better yet, reconfiguration shouldn't happen, but rather the second + * mount should be rejected if the parameters are not compatible. 
+ */ + fc = fs_context_for_reconfigure(s->s_root, flags, MS_RMT_MASK); + if (IS_ERR(fc)) + return PTR_ERR(fc); + + ret = parse_monolithic_mount_data(fc, data); + if (ret < 0) + goto out; + + ret = reconfigure_super(fc); +out: + put_fs_context(fc); + return ret; +} + static int compare_single(struct super_block *s, void *p) { return 1; @@ -1230,13 +1268,14 @@ struct dentry *mount_single(struct file_system_type *fs_type, return ERR_CAST(s); if (!s->s_root) { error = fill_super(s, data, flags & SB_SILENT ? 1 : 0); - if (error) { - deactivate_locked_super(s); - return ERR_PTR(error); - } - s->s_flags |= SB_ACTIVE; + if (!error) + s->s_flags |= SB_ACTIVE; } else { - do_remount_sb(s, flags, data, 0); + error = reconfigure_single(s, flags, data); + } + if (unlikely(error)) { + deactivate_locked_super(s); + return ERR_PTR(error); } return dget(s->s_root); } diff --git a/include/linux/fs.h b/include/linux/fs.h index 36fff12ab890..c65d02c5c512 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1337,6 +1337,7 @@ extern int send_sigurg(struct fown_struct *fown); /* These sb flags are internal to the kernel */ #define SB_SUBMOUNT (1<<26) +#define SB_FORCE (1<<27) #define SB_NOSEC (1<<28) #define SB_BORN (1<<29) #define SB_ACTIVE (1<<30) diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index 9805514444c9..98772f882a3e 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -25,6 +25,7 @@ struct user_namespace; enum fs_context_purpose { FS_CONTEXT_FOR_MOUNT, /* New superblock for explicit mount */ + FS_CONTEXT_FOR_RECONFIGURE, /* Superblock reconfiguration (remount) */ }; /* @@ -57,6 +58,9 @@ struct fs_context { */ extern struct fs_context *fs_context_for_mount(struct file_system_type *fs_type, unsigned int sb_flags); +extern struct fs_context *fs_context_for_reconfigure(struct dentry *dentry, + unsigned int sb_flags, + unsigned int sb_flags_mask); extern int vfs_get_tree(struct fs_context *fc); extern void put_fs_context(struct 
fs_context *fc); From patchwork Tue Feb 19 16:30:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820203 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 532F414E1 for ; Tue, 19 Feb 2019 16:30:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3DD512CBD5 for ; Tue, 19 Feb 2019 16:30:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 312F42CBE6; Tue, 19 Feb 2019 16:30:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CA7082CBED for ; Tue, 19 Feb 2019 16:30:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729099AbfBSQaF (ORCPT ); Tue, 19 Feb 2019 11:30:05 -0500 Received: from mx1.redhat.com ([209.132.183.28]:35520 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728984AbfBSQaF (ORCPT ); Tue, 19 Feb 2019 11:30:05 -0500 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B6D407D0CB; Tue, 19 Feb 2019 16:30:04 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6975061080; Tue, 19 Feb 2019 16:30:02 +0000 (UTC) Organization: Red Hat UK Ltd. 
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 12/43] fs_context flavour for submounts From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:30:01 +0000 Message-ID: <155059380154.12449.5948045751612660590.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Tue, 19 Feb 2019 16:30:04 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro This is an eventual replacement for vfs_submount() uses. Unlike the "mount" and "remount" cases, the users of that thing are not in VFS - they are buried in various ->d_automount() instances and rather than converting them all at once we introduce the (thankfully small and simple) infrastructure here and deal with the prospective users in afs, nfs, etc. parts of the series. Here we just introduce a new constructor (fs_context_for_submount()) along with the corresponding enum constant to be put into fc->purpose for those. 
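The difference between the "mount", "submount" and "reconfigure" purposes can be modelled in userspace. The following is only a sketch: `struct user_ns`, `struct superblock`, `struct context`, `get_ns()` and `model_init_context()` are hypothetical stand-ins invented for illustration, not kernel types or functions. It mirrors the switch this patch extends in alloc_fs_context(): a submount takes its user namespace from the superblock of the reference dentry rather than from the caller's credentials.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
enum purpose { FOR_MOUNT, FOR_SUBMOUNT, FOR_RECONFIGURE };

struct user_ns { int refcount; };
struct superblock { struct user_ns *s_user_ns; };

struct context {
	enum purpose purpose;
	struct user_ns *user_ns;	/* NULL when no namespace is pinned */
};

/* Take a reference, loosely modelled on get_user_ns(). */
static struct user_ns *get_ns(struct user_ns *ns)
{
	ns->refcount++;
	return ns;
}

/* Pick the namespace to pin according to the purpose of the context. */
static void model_init_context(struct context *fc, enum purpose purpose,
			       struct user_ns *caller_ns,
			       const struct superblock *reference_sb)
{
	fc->purpose = purpose;
	fc->user_ns = NULL;

	switch (purpose) {
	case FOR_MOUNT:
		/* Explicit mount: namespace of the mounting process. */
		fc->user_ns = get_ns(caller_ns);
		break;
	case FOR_SUBMOUNT:
		/* Automount: inherit from the parent superblock. */
		assert(reference_sb != NULL);
		fc->user_ns = get_ns(reference_sb->s_user_ns);
		break;
	case FOR_RECONFIGURE:
		/* Remount: nothing is pinned. */
		break;
	}
}
```

With this model, a submount context created on behalf of an unrelated caller still ends up owned by the namespace of the filesystem it hangs off.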
Signed-off-by: Al Viro --- fs/fs_context.c | 10 ++++++++++ include/linux/fs_context.h | 3 +++ 2 files changed, 13 insertions(+) diff --git a/fs/fs_context.c b/fs/fs_context.c index 5e2c3aba1dd8..2bd652b6e848 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -69,6 +69,9 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type, case FS_CONTEXT_FOR_MOUNT: fc->user_ns = get_user_ns(fc->cred->user_ns); break; + case FS_CONTEXT_FOR_SUBMOUNT: + fc->user_ns = get_user_ns(reference->d_sb->s_user_ns); + break; case FS_CONTEXT_FOR_RECONFIGURE: /* We don't pin any namespaces as the superblock's * subscriptions cannot be changed at this point. @@ -106,6 +109,13 @@ struct fs_context *fs_context_for_reconfigure(struct dentry *dentry, } EXPORT_SYMBOL(fs_context_for_reconfigure); +struct fs_context *fs_context_for_submount(struct file_system_type *type, + struct dentry *reference) +{ + return alloc_fs_context(type, reference, 0, 0, FS_CONTEXT_FOR_SUBMOUNT); +} +EXPORT_SYMBOL(fs_context_for_submount); + void fc_drop_locked(struct fs_context *fc) { struct super_block *sb = fc->root->d_sb; diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index 98772f882a3e..7feb018c7a9e 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -25,6 +25,7 @@ struct user_namespace; enum fs_context_purpose { FS_CONTEXT_FOR_MOUNT, /* New superblock for explicit mount */ + FS_CONTEXT_FOR_SUBMOUNT, /* New superblock for automatic submount */ FS_CONTEXT_FOR_RECONFIGURE, /* Superblock reconfiguration (remount) */ }; @@ -61,6 +62,8 @@ extern struct fs_context *fs_context_for_mount(struct file_system_type *fs_type, extern struct fs_context *fs_context_for_reconfigure(struct dentry *dentry, unsigned int sb_flags, unsigned int sb_flags_mask); +extern struct fs_context *fs_context_for_submount(struct file_system_type *fs_type, + struct dentry *reference); extern int vfs_get_tree(struct fs_context *fc); extern void put_fs_context(struct fs_context 
*fc); From patchwork Tue Feb 19 16:30:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820207 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6F5861805 for ; Tue, 19 Feb 2019 16:30:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 579202CC1D for ; Tue, 19 Feb 2019 16:30:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4B9BA2CC8D; Tue, 19 Feb 2019 16:30:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9E36A2CC1C for ; Tue, 19 Feb 2019 16:30:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729166AbfBSQaN (ORCPT ); Tue, 19 Feb 2019 11:30:13 -0500 Received: from mx1.redhat.com ([209.132.183.28]:48222 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728984AbfBSQaN (ORCPT ); Tue, 19 Feb 2019 11:30:13 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3F3347518D; Tue, 19 Feb 2019 16:30:12 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id C39AC60C1D; Tue, 19 Feb 2019 16:30:10 +0000 (UTC) Organization: Red Hat UK Ltd. 
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 13/43] introduce fs_context methods From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:30:09 +0000 Message-ID: <155059380992.12449.10913805628701754409.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Tue, 19 Feb 2019 16:30:12 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro Signed-off-by: Al Viro --- fs/fs_context.c | 28 ++++++++++++++++++++++------ fs/internal.h | 2 -- fs/super.c | 36 ++++++++++++++++++++++++++++-------- include/linux/fs.h | 2 ++ include/linux/fs_context.h | 13 +++++++++++++ 5 files changed, 65 insertions(+), 16 deletions(-) diff --git a/fs/fs_context.c b/fs/fs_context.c index 2bd652b6e848..825d1b2c8807 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -51,6 +51,7 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type, unsigned int sb_flags_mask, enum fs_context_purpose purpose) { + int (*init_fs_context)(struct fs_context *); struct fs_context *fc; int ret = -ENOMEM; @@ -81,7 +82,12 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type, break; } - ret = legacy_init_fs_context(fc); + /* TODO: Make all filesystems support this unconditionally */ + init_fs_context = 
fc->fs_type->init_fs_context; + if (!init_fs_context) + init_fs_context = legacy_init_fs_context; + + ret = init_fs_context(fc); if (ret < 0) goto err_fc; fc->need_free = true; @@ -141,8 +147,8 @@ void put_fs_context(struct fs_context *fc) deactivate_super(sb); } - if (fc->need_free) - legacy_fs_context_free(fc); + if (fc->need_free && fc->ops && fc->ops->free) + fc->ops->free(fc); security_free_mnt_opts(&fc->security); put_net(fc->net_ns); @@ -180,7 +186,7 @@ static int legacy_parse_monolithic(struct fs_context *fc, void *data) /* * Get a mountable root with the legacy mount command. */ -int legacy_get_tree(struct fs_context *fc) +static int legacy_get_tree(struct fs_context *fc) { struct legacy_fs_context *ctx = fc->fs_private; struct super_block *sb; @@ -201,7 +207,7 @@ int legacy_get_tree(struct fs_context *fc) /* * Handle remount. */ -int legacy_reconfigure(struct fs_context *fc) +static int legacy_reconfigure(struct fs_context *fc) { struct legacy_fs_context *ctx = fc->fs_private; struct super_block *sb = fc->root->d_sb; @@ -213,6 +219,13 @@ int legacy_reconfigure(struct fs_context *fc) ctx ? ctx->legacy_data : NULL); } +const struct fs_context_operations legacy_fs_context_ops = { + .free = legacy_fs_context_free, + .parse_monolithic = legacy_parse_monolithic, + .get_tree = legacy_get_tree, + .reconfigure = legacy_reconfigure, +}; + /* * Initialise a legacy context for a filesystem that doesn't support * fs_context. 
@@ -222,10 +235,13 @@ static int legacy_init_fs_context(struct fs_context *fc) fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL); if (!fc->fs_private) return -ENOMEM; + fc->ops = &legacy_fs_context_ops; return 0; } int parse_monolithic_mount_data(struct fs_context *fc, void *data) { - return legacy_parse_monolithic(fc, data); + int (*monolithic_mount_data)(struct fs_context *, void *); + monolithic_mount_data = fc->ops->parse_monolithic; + return monolithic_mount_data(fc, data); } diff --git a/fs/internal.h b/fs/internal.h index 016a5b8dd305..8f8d07cc433f 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -55,8 +55,6 @@ extern void __init chrdev_init(void); /* * fs_context.c */ -extern int legacy_get_tree(struct fs_context *fc); -extern int legacy_reconfigure(struct fs_context *fc); extern int parse_monolithic_mount_data(struct fs_context *, void *); extern void fc_drop_locked(struct fs_context *); diff --git a/fs/super.c b/fs/super.c index 50553233dd15..76b3181c782d 100644 --- a/fs/super.c +++ b/fs/super.c @@ -894,13 +894,15 @@ int reconfigure_super(struct fs_context *fc) } } - retval = legacy_reconfigure(fc); - if (retval) { - if (!force) - goto cancel_readonly; - /* If forced remount, go ahead despite any errors */ - WARN(1, "forced remount of a %s fs returned %i\n", - sb->s_type->name, retval); + if (fc->ops->reconfigure) { + retval = fc->ops->reconfigure(fc); + if (retval) { + if (!force) + goto cancel_readonly; + /* If forced remount, go ahead despite any errors */ + WARN(1, "forced remount of a %s fs returned %i\n", + sb->s_type->name, retval); + } } WRITE_ONCE(sb->s_flags, ((sb->s_flags & ~fc->sb_flags_mask) | @@ -1294,10 +1296,28 @@ int vfs_get_tree(struct fs_context *fc) struct super_block *sb; int error; - error = legacy_get_tree(fc); + if (fc->fs_type->fs_flags & FS_REQUIRES_DEV && !fc->source) + return -ENOENT; + + if (fc->root) + return -EBUSY; + + /* Get the mountable root in fc->root, with a ref on the root and a ref + * on the 
superblock. + */ + error = fc->ops->get_tree(fc); if (error < 0) return error; + if (!fc->root) { + pr_err("Filesystem %s get_tree() didn't set fc->root\n", + fc->fs_type->name); + /* We don't know what the locking state of the superblock is - + * if there is a superblock. + */ + BUG(); + } + sb = fc->root->d_sb; WARN_ON(!sb->s_bdi); diff --git a/include/linux/fs.h b/include/linux/fs.h index c65d02c5c512..8d578a9e1e8c 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -61,6 +61,7 @@ struct workqueue_struct; struct iov_iter; struct fscrypt_info; struct fscrypt_operations; +struct fs_context; extern void __init inode_init(void); extern void __init inode_init_early(void); @@ -2173,6 +2174,7 @@ struct file_system_type { #define FS_HAS_SUBTYPE 4 #define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ + int (*init_fs_context)(struct fs_context *); struct dentry *(*mount) (struct file_system_type *, int, const char *, void *); void (*kill_sb) (struct super_block *); diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index 7feb018c7a9e..087c12954360 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -20,8 +20,13 @@ struct cred; struct dentry; struct file_operations; struct file_system_type; +struct mnt_namespace; struct net; +struct pid_namespace; +struct super_block; struct user_namespace; +struct vfsmount; +struct path; enum fs_context_purpose { FS_CONTEXT_FOR_MOUNT, /* New superblock for explicit mount */ @@ -39,6 +44,7 @@ enum fs_context_purpose { * See Documentation/filesystems/mounting.txt */ struct fs_context { + const struct fs_context_operations *ops; struct file_system_type *fs_type; void *fs_private; /* The filesystem's context */ struct dentry *root; /* The root and superblock */ @@ -54,6 +60,13 @@ struct fs_context { bool need_free:1; /* Need to call ops->free() */ }; +struct fs_context_operations { + void 
(*free)(struct fs_context *fc); + int (*parse_monolithic)(struct fs_context *fc, void *data); + int (*get_tree)(struct fs_context *fc); + int (*reconfigure)(struct fs_context *fc); +}; + /* * fs_context manipulation functions. */ From patchwork Tue Feb 19 16:30:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820213 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 04BB0180E for ; Tue, 19 Feb 2019 16:30:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EA1D12CC12 for ; Tue, 19 Feb 2019 16:30:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E86222CC67; Tue, 19 Feb 2019 16:30:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7C38B2CC12 for ; Tue, 19 Feb 2019 16:30:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729243AbfBSQaU (ORCPT ); Tue, 19 Feb 2019 11:30:20 -0500 Received: from mx1.redhat.com ([209.132.183.28]:38414 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728984AbfBSQaU (ORCPT ); Tue, 19 Feb 2019 11:30:20 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 70B6AA8C1A; Tue, 19 Feb 2019 16:30:19 +0000 (UTC) Received: from warthog.procyon.org.uk 
(ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2695061460; Tue, 19 Feb 2019 16:30:17 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 14/43] vfs: Introduce logging functions From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:30:17 +0000 Message-ID: <155059381736.12449.16194644068515549957.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 19 Feb 2019 16:30:19 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP

Introduce a set of logging functions through which informational messages,
warnings and error messages incurred by the mount procedure can be logged
and, in a future patch, passed to userspace instead by way of the filesystem
configuration context file descriptor.

There are four functions:

 (1) infof(struct fs_context *fc, const char *fmt, ...);

     Logs an informational message.

 (2) warnf(struct fs_context *fc, const char *fmt, ...);

     Logs a warning message.

 (3) errorf(struct fs_context *fc, const char *fmt, ...);

     Logs an error message.

 (4) invalf(struct fs_context *fc, const char *fmt, ...);

     As errorf(), but returns -EINVAL so that it can be used in a return
     statement.

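The "log and return -EINVAL" idiom of invalf() can be tried out in userspace. This is a minimal sketch under stated assumptions: `struct demo_context` and `demo_parse_size()` are hypothetical, fprintf() to stderr stands in for pr_notice(), and the comma operator replaces the GNU statement expression used by the kernel macros so that the example stays within standard C99.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct fs_context; only a name is kept. */
struct demo_context {
	const char *fs_name;
};

/*
 * As in the patch, the context argument is accepted but not yet used.
 * The format string is folded into the variadic part so that no
 * GNU-specific "## __VA_ARGS__" is needed.
 */
#define logfc(fc, ...)	((void)(fc), fprintf(stderr, __VA_ARGS__))
#define errorf(fc, ...)	logfc(fc, __VA_ARGS__)

/* Log the message, then yield -EINVAL as the value of the expression. */
#define invalf(fc, ...)	(errorf(fc, __VA_ARGS__), -EINVAL)

/* Hypothetical option handler: reject non-positive sizes. */
static int demo_parse_size(struct demo_context *fc, long size)
{
	assert(fc != NULL);

	if (size <= 0)
		return invalf(fc, "%s: Bad value for 'size'\n", fc->fs_name);
	return 0;
}
```

The point of the idiom is that an error path needs a single statement: the message is emitted and the error code is returned in one go.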
Signed-off-by: David Howells Signed-off-by: Al Viro --- include/linux/fs_context.h | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index 087c12954360..d208cc40b868 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -81,4 +81,46 @@ extern struct fs_context *fs_context_for_submount(struct file_system_type *fs_ty extern int vfs_get_tree(struct fs_context *fc); extern void put_fs_context(struct fs_context *fc); +#define logfc(FC, FMT, ...) pr_notice(FMT, ## __VA_ARGS__) + +/** + * infof - Store supplementary informational message + * @fc: The context in which to log the informational message + * @fmt: The format string + * + * Store the supplementary informational message for the process if the process + * has enabled the facility. + */ +#define infof(fc, fmt, ...) ({ logfc(fc, fmt, ## __VA_ARGS__); }) + +/** + * warnf - Store supplementary warning message + * @fc: The context in which to log the error message + * @fmt: The format string + * + * Store the supplementary warning message for the process if the process has + * enabled the facility. + */ +#define warnf(fc, fmt, ...) ({ logfc(fc, fmt, ## __VA_ARGS__); }) + +/** + * errorf - Store supplementary error message + * @fc: The context in which to log the error message + * @fmt: The format string + * + * Store the supplementary error message for the process if the process has + * enabled the facility. + */ +#define errorf(fc, fmt, ...) ({ logfc(fc, fmt, ## __VA_ARGS__); }) + +/** + * invalf - Store supplementary invalid argument error message + * @fc: The context in which to log the error message + * @fmt: The format string + * + * Store the supplementary error message for the process if the process has + * enabled the facility and return -EINVAL. + */ +#define invalf(fc, fmt, ...) 
({ errorf(fc, fmt, ## __VA_ARGS__); -EINVAL; }) + #endif /* _LINUX_FS_CONTEXT_H */ From patchwork Tue Feb 19 16:30:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820215 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8EDDF14E1 for ; Tue, 19 Feb 2019 16:30:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 727492CC0F for ; Tue, 19 Feb 2019 16:30:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 707722CC67; Tue, 19 Feb 2019 16:30:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8A2602CCAB for ; Tue, 19 Feb 2019 16:30:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729160AbfBSQaa (ORCPT ); Tue, 19 Feb 2019 11:30:30 -0500 Received: from mx1.redhat.com ([209.132.183.28]:38634 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728984AbfBSQa3 (ORCPT ); Tue, 19 Feb 2019 11:30:29 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id EF0EAA0915; Tue, 19 Feb 2019 16:30:28 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6BE38648B9; Tue, 19 Feb 2019 16:30:25 +0000 (UTC) Organization: Red Hat UK 
Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 15/43] vfs: Add configuration parser helpers From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:30:24 +0000 Message-ID: <155059382463.12449.9058984258656142511.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 19 Feb 2019 16:30:29 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP

Because the new API passes in key,value parameters, match_token() cannot be
used with it. Instead, provide three new helpers to aid with parsing:

 (1) fs_parse(). This takes a parameter and a simple static description of
     all the parameters and maps the key name to an ID. It returns the
     parameter ID on a match, -ENOPARAM if the key is not recognised and
     -EINVAL on a parse error.

     The parameter description includes a list of key names to IDs, desired
     parameter types and a list of enumeration name -> ID mappings.

     [!] Note that for the moment I've required that the key->ID mapping
     array be sorted and unterminated. The size of the array is noted in
     the fsconfig_parser struct. This allows me to use bsearch(), but I'm
     not sure any performance gain is worth the hassle of requiring people
     to keep the array sorted.

     The parameter type array is sized according to the number of parameter
     IDs and is indexed directly. The optional enum mapping array is an
     unterminated, unsorted list and the size goes into the fsconfig_parser
     struct.

     The function can do some additional things:

     (a) If it's not ambiguous and no value is given, the prefix "no" on a
         key name is permitted to indicate that the parameter should be
         treated as negated.

     (b) If the desired type is a single simple integer, it will perform an
         appropriate conversion and store the result in a union in the
         parse result.

     (c) If the desired type is an enumeration, {key ID, name} will be
         looked up in the enumeration list and the matching value will be
         stored in the parse result union.

     (d) Optionally generate an error if the key is unrecognised.

     This is called something like:

        enum rdt_param {
                Opt_cdp,
                Opt_cdpl2,
                Opt_mba_mpbs,
                nr__rdt_params
        };

        const struct fs_parameter_spec rdt_param_specs[nr__rdt_params] = {
                [Opt_cdp]       = { fs_param_is_bool },
                [Opt_cdpl2]     = { fs_param_is_bool },
                [Opt_mba_mpbs]  = { fs_param_is_bool },
        };

        const char *const rdt_param_keys[nr__rdt_params] = {
                [Opt_cdp]       = "cdp",
                [Opt_cdpl2]     = "cdpl2",
                [Opt_mba_mpbs]  = "mba_mbps",
        };

        const struct fs_parameter_description rdt_parser = {
                .name           = "rdt",
                .nr_params      = nr__rdt_params,
                .keys           = rdt_param_keys,
                .specs          = rdt_param_specs,
                .no_source      = true,
        };

        int rdt_parse_param(struct fs_context *fc,
                            struct fs_parameter *param)
        {
                struct fs_parse_result parse;
                struct rdt_fs_context *ctx = rdt_fc2context(fc);
                int ret;

                ret = fs_parse(fc, &rdt_parser, param, &parse);
                if (ret < 0)
                        return ret;

                switch (parse.key) {
                case Opt_cdp:
                        ctx->enable_cdpl3 = true;
                        return 0;
                case Opt_cdpl2:
                        ctx->enable_cdpl2 = true;
                        return 0;
                case Opt_mba_mpbs:
                        ctx->enable_mba_mbps = true;
                        return 0;
                }

                return -EINVAL;
        }

 (2) fs_lookup_param(). This takes a { dirfd, path, LOOKUP_EMPTY? } or
     string value and performs an appropriate path lookup to convert it
     into a path object, which it will then return.
     If the desired type was a blockdev, the type of the looked up inode
     will be checked to make sure it is one.

     This can be used like:

        enum foo_param {
                Opt_source,
                nr__foo_params
        };

        const struct fs_parameter_spec foo_param_specs[nr__foo_params] = {
                [Opt_source]    = { fs_param_is_blockdev },
        };

        const char *const foo_param_keys[nr__foo_params] = {
                [Opt_source]    = "source",
        };

        const struct constant_table foo_param_alt_keys[] = {
                { "device",     Opt_source },
        };

        const struct fs_parameter_description foo_parser = {
                .name           = "foo",
                .nr_params      = nr__foo_params,
                .nr_alt_keys    = ARRAY_SIZE(foo_param_alt_keys),
                .keys           = foo_param_keys,
                .alt_keys       = foo_param_alt_keys,
                .specs          = foo_param_specs,
        };

        int foo_parse_param(struct fs_context *fc,
                            struct fs_parameter *param)
        {
                struct fs_parse_result parse;
                struct foo_fs_context *ctx = foo_fc2context(fc);
                int ret;

                ret = fs_parse(fc, &foo_parser, param, &parse);
                if (ret < 0)
                        return ret;

                switch (parse.key) {
                case Opt_source:
                        return fs_lookup_param(fc, &foo_parser, param,
                                               &parse, &ctx->source);
                default:
                        return -EINVAL;
                }
        }

 (3) lookup_constant(). This takes a table of named constants and looks up
     the given name within it. The table is expected to be sorted so that
     bsearch() can be used upon it.

     Possibly I should require the table be terminated and just use a
     for-loop to scan it instead of using bsearch() to reduce hassle.

     Tables look something like:

        static const struct constant_table bool_names[] = {
                { "0",          false },
                { "1",          true },
                { "false",      false },
                { "no",         false },
                { "true",       true },
                { "yes",        true },
        };

     and a lookup is done with something like:

        b = lookup_constant(bool_names, param->string, -1);

Additionally, optional validation routines for the parameter description are
provided that can be enabled at compile time. A later patch will invoke
these when a filesystem is registered.

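The sorted-table lookup described under (3) can be sketched in userspace with the C library's bsearch(). This is an illustration only: `demo_lookup_constant()` and `cmp_constant()` are hypothetical names, and the `struct constant_table` layout is assumed from the example tables above. Note that __lookup_constant() as posted in this patch scans the table with a for-loop; the sketch shows the bsearch() variant that the description mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Layout assumed from the example tables: a name and an int value. */
struct constant_table {
	const char *name;
	int value;
};

/* Must stay sorted by name in strcmp() order for bsearch() to work. */
static const struct constant_table bool_names[] = {
	{ "0",		false },
	{ "1",		true },
	{ "false",	false },
	{ "no",		false },
	{ "true",	true },
	{ "yes",	true },
};

/* bsearch() comparator: the key is a C string, the entry a table row. */
static int cmp_constant(const void *key, const void *entry)
{
	const struct constant_table *e = entry;

	return strcmp(key, e->name);
}

/* Return the value stored under @name, or @not_found if it is absent. */
static int demo_lookup_constant(const struct constant_table *tbl,
				size_t tbl_size, const char *name,
				int not_found)
{
	const struct constant_table *e;

	assert(tbl != NULL && name != NULL);

	e = bsearch(name, tbl, tbl_size, sizeof(tbl[0]), cmp_constant);
	return e ? e->value : not_found;
}
```

The trade-off raised in the description is visible here: the binary search is only correct while the table stays sorted, which is what the optional validation routines check for; a linear scan has no such requirement.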
Signed-off-by: David Howells Signed-off-by: Al Viro --- fs/Kconfig | 7 + fs/Makefile | 2 fs/fs_parser.c | 447 ++++++++++++++++++++++++++++++++++++++++++++ fs/internal.h | 2 fs/namei.c | 4 include/linux/errno.h | 1 include/linux/fs_context.h | 29 +++ include/linux/fs_parser.h | 176 +++++++++++++++++ 8 files changed, 665 insertions(+), 3 deletions(-) create mode 100644 fs/fs_parser.c create mode 100644 include/linux/fs_parser.h diff --git a/fs/Kconfig b/fs/Kconfig index ac474a61be37..25700b152c75 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -8,6 +8,13 @@ menu "File systems" config DCACHE_WORD_ACCESS bool +config VALIDATE_FS_PARSER + bool "Validate filesystem parameter description" + default y + help + Enable this to perform validation of the parameter description for a + filesystem when it is registered. + if BLOCK config FS_IOMAP diff --git a/fs/Makefile b/fs/Makefile index 5563cf34f7c2..9a0b8003f069 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -13,7 +13,7 @@ obj-y := open.o read_write.o file_table.o super.o \ seq_file.o xattr.o libfs.o fs-writeback.o \ pnode.o splice.o sync.o utimes.o d_path.o \ stack.o fs_struct.o statfs.o fs_pin.o nsfs.o \ - fs_context.o + fs_context.o fs_parser.o ifeq ($(CONFIG_BLOCK),y) obj-y += buffer.o block_dev.o direct-io.o mpage.o diff --git a/fs/fs_parser.c b/fs/fs_parser.c new file mode 100644 index 000000000000..842e8f749db6 --- /dev/null +++ b/fs/fs_parser.c @@ -0,0 +1,447 @@ +/* Filesystem parameter parser. + * + * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include "internal.h" + +static const struct constant_table bool_names[] = { + { "0", false }, + { "1", true }, + { "false", false }, + { "no", false }, + { "true", true }, + { "yes", true }, +}; + +/** + * lookup_constant - Look up a constant by name in an ordered table + * @tbl: The table of constants to search. + * @tbl_size: The size of the table. + * @name: The name to look up. + * @not_found: The value to return if the name is not found. + */ +int __lookup_constant(const struct constant_table *tbl, size_t tbl_size, + const char *name, int not_found) +{ + unsigned int i; + + for (i = 0; i < tbl_size; i++) + if (strcmp(name, tbl[i].name) == 0) + return tbl[i].value; + + return not_found; +} +EXPORT_SYMBOL(__lookup_constant); + +static const struct fs_parameter_spec *fs_lookup_key( + const struct fs_parameter_description *desc, + const char *name) +{ + const struct fs_parameter_spec *p; + + if (!desc->specs) + return NULL; + + for (p = desc->specs; p->name; p++) + if (strcmp(p->name, name) == 0) + return p; + + return NULL; +} + +/* + * fs_parse - Parse a filesystem configuration parameter + * @fc: The filesystem context to log errors through. + * @desc: The parameter description to use. + * @param: The parameter. + * @result: Where to place the result of the parse + * + * Parse a filesystem configuration parameter and attempt a conversion for a + * simple parameter for which this is requested. If successful, the determined + * parameter ID is placed into @result->key, the desired type is indicated in + * @result->t and any converted value is placed into an appropriate member of + * the union in @result. + * + * The function returns the parameter number if the parameter was matched, + * -ENOPARAM if it wasn't matched and @desc->ignore_unknown indicated that + * unknown parameters are okay and -EINVAL if there was a conversion issue or + * the parameter wasn't recognised and unknowns aren't okay. 
+ */ +int fs_parse(struct fs_context *fc, + const struct fs_parameter_description *desc, + struct fs_parameter *param, + struct fs_parse_result *result) +{ + const struct fs_parameter_spec *p; + const struct fs_parameter_enum *e; + int ret = -ENOPARAM, b; + + result->has_value = !!param->string; + result->negated = false; + result->uint_64 = 0; + + p = fs_lookup_key(desc, param->key); + if (!p) { + /* If we didn't find something that looks like "noxxx", see if + * "xxx" takes the "no"-form negative - but only if there + * wasn't an value. + */ + if (result->has_value) + goto unknown_parameter; + if (param->key[0] != 'n' || param->key[1] != 'o' || !param->key[2]) + goto unknown_parameter; + + p = fs_lookup_key(desc, param->key + 2); + if (!p) + goto unknown_parameter; + if (!(p->flags & fs_param_neg_with_no)) + goto unknown_parameter; + result->boolean = false; + result->negated = true; + } + + if (p->flags & fs_param_deprecated) + warnf(fc, "%s: Deprecated parameter '%s'", + desc->name, param->key); + + if (result->negated) + goto okay; + + /* Certain parameter types only take a string and convert it. */ + switch (p->type) { + case __fs_param_wasnt_defined: + return -EINVAL; + case fs_param_is_u32: + case fs_param_is_u32_octal: + case fs_param_is_u32_hex: + case fs_param_is_s32: + case fs_param_is_u64: + case fs_param_is_enum: + case fs_param_is_string: + if (param->type != fs_value_is_string) + goto bad_value; + if (!result->has_value) { + if (p->flags & fs_param_v_optional) + goto okay; + goto bad_value; + } + /* Fall through */ + default: + break; + } + + /* Try to turn the type we were given into the type desired by the + * parameter and give an error if we can't. 
+ */ + switch (p->type) { + case fs_param_is_flag: + if (param->type != fs_value_is_flag && + (param->type != fs_value_is_string || result->has_value)) + return invalf(fc, "%s: Unexpected value for '%s'", + desc->name, param->key); + result->boolean = true; + goto okay; + + case fs_param_is_bool: + switch (param->type) { + case fs_value_is_flag: + result->boolean = true; + goto okay; + case fs_value_is_string: + if (param->size == 0) { + result->boolean = true; + goto okay; + } + b = lookup_constant(bool_names, param->string, -1); + if (b == -1) + goto bad_value; + result->boolean = b; + goto okay; + default: + goto bad_value; + } + + case fs_param_is_u32: + ret = kstrtouint(param->string, 0, &result->uint_32); + goto maybe_okay; + case fs_param_is_u32_octal: + ret = kstrtouint(param->string, 8, &result->uint_32); + goto maybe_okay; + case fs_param_is_u32_hex: + ret = kstrtouint(param->string, 16, &result->uint_32); + goto maybe_okay; + case fs_param_is_s32: + ret = kstrtoint(param->string, 0, &result->int_32); + goto maybe_okay; + case fs_param_is_u64: + ret = kstrtoull(param->string, 0, &result->uint_64); + goto maybe_okay; + + case fs_param_is_enum: + for (e = desc->enums; e->name[0]; e++) { + if (e->opt == p->opt && + strcmp(e->name, param->string) == 0) { + result->uint_32 = e->value; + goto okay; + } + } + goto bad_value; + + case fs_param_is_string: + goto okay; + case fs_param_is_blob: + if (param->type != fs_value_is_blob) + goto bad_value; + goto okay; + + case fs_param_is_fd: { + if (param->type != fs_value_is_file) + goto bad_value; + goto okay; + } + + case fs_param_is_blockdev: + case fs_param_is_path: + goto okay; + default: + BUG(); + } + +maybe_okay: + if (ret < 0) + goto bad_value; +okay: + return p->opt; + +bad_value: + return invalf(fc, "%s: Bad value for '%s'", desc->name, param->key); +unknown_parameter: + return -ENOPARAM; +} +EXPORT_SYMBOL(fs_parse); + +/** + * fs_lookup_param - Look up a path referred to by a parameter + * @fc: The 
filesystem context to log errors through. + * @param: The parameter. + * @want_bdev: T if want a blockdev + * @_path: The result of the lookup + */ +int fs_lookup_param(struct fs_context *fc, + struct fs_parameter *param, + bool want_bdev, + struct path *_path) +{ + struct filename *f; + unsigned int flags = 0; + bool put_f; + int ret; + + switch (param->type) { + case fs_value_is_string: + f = getname_kernel(param->string); + if (IS_ERR(f)) + return PTR_ERR(f); + put_f = true; + break; + case fs_value_is_filename_empty: + flags = LOOKUP_EMPTY; + /* Fall through */ + case fs_value_is_filename: + f = param->name; + put_f = false; + break; + default: + return invalf(fc, "%s: not usable as path", param->key); + } + + ret = filename_lookup(param->dirfd, f, flags, _path, NULL); + if (ret < 0) { + errorf(fc, "%s: Lookup failure for '%s'", param->key, f->name); + goto out; + } + + if (want_bdev && + !S_ISBLK(d_backing_inode(_path->dentry)->i_mode)) { + path_put(_path); + _path->dentry = NULL; + _path->mnt = NULL; + errorf(fc, "%s: Non-blockdev passed as '%s'", + param->key, f->name); + ret = -ENOTBLK; + } + +out: + if (put_f) + putname(f); + return ret; +} +EXPORT_SYMBOL(fs_lookup_param); + +#ifdef CONFIG_VALIDATE_FS_PARSER +/** + * validate_constant_table - Validate a constant table + * @name: Name to use in reporting + * @tbl: The constant table to validate. + * @tbl_size: The size of the table. + * @low: The lowest permissible value. + * @high: The highest permissible value. + * @special: One special permissible value outside of the range. 
+ */ +bool validate_constant_table(const struct constant_table *tbl, size_t tbl_size, + int low, int high, int special) +{ + size_t i; + bool good = true; + + if (tbl_size == 0) { + pr_warn("VALIDATE C-TBL: Empty\n"); + return true; + } + + for (i = 0; i < tbl_size; i++) { + if (!tbl[i].name) { + pr_err("VALIDATE C-TBL[%zu]: Null\n", i); + good = false; + } else if (i > 0 && tbl[i - 1].name) { + int c = strcmp(tbl[i-1].name, tbl[i].name); + + if (c == 0) { + pr_err("VALIDATE C-TBL[%zu]: Duplicate %s\n", + i, tbl[i].name); + good = false; + } + if (c > 0) { + pr_err("VALIDATE C-TBL[%zu]: Missorted %s>=%s\n", + i, tbl[i-1].name, tbl[i].name); + good = false; + } + } + + if (tbl[i].value != special && + (tbl[i].value < low || tbl[i].value > high)) { + pr_err("VALIDATE C-TBL[%zu]: %s->%d const out of range (%d-%d)\n", + i, tbl[i].name, tbl[i].value, low, high); + good = false; + } + } + + return good; +} + +/** + * fs_validate_description - Validate a parameter description + * @desc: The parameter description to validate. 
+ */ +bool fs_validate_description(const struct fs_parameter_description *desc) +{ + const struct fs_parameter_spec *param, *p2; + const struct fs_parameter_enum *e; + const char *name = desc->name; + unsigned int nr_params = 0; + bool good = true, enums = false; + + pr_notice("*** VALIDATE %s ***\n", name); + + if (!name[0]) { + pr_err("VALIDATE Parser: No name\n"); + name = "Unknown"; + good = false; + } + + if (desc->specs) { + for (param = desc->specs; param->name; param++) { + enum fs_parameter_type t = param->type; + + /* Check that the type is in range */ + if (t == __fs_param_wasnt_defined || + t >= nr__fs_parameter_type) { + pr_err("VALIDATE %s: PARAM[%s] Bad type %u\n", + name, param->name, t); + good = false; + } else if (t == fs_param_is_enum) { + enums = true; + } + + /* Check for duplicate parameter names */ + for (p2 = desc->specs; p2 < param; p2++) { + if (strcmp(param->name, p2->name) == 0) { + pr_err("VALIDATE %s: PARAM[%s]: Duplicate\n", + name, param->name); + good = false; + } + } + } + + nr_params = param - desc->specs; + } + + if (desc->enums) { + if (!nr_params) { + pr_err("VALIDATE %s: Enum table but no parameters\n", + name); + good = false; + goto no_enums; + } + if (!enums) { + pr_err("VALIDATE %s: Enum table but no enum-type values\n", + name); + good = false; + goto no_enums; + } + + for (e = desc->enums; e->name[0]; e++) { + /* Check that all entries in the enum table have at + * least one parameter that uses them. + */ + for (param = desc->specs; param->name; param++) { + if (param->opt == e->opt && + param->type != fs_param_is_enum) { + pr_err("VALIDATE %s: e[%lu] enum val for %s\n", + name, e - desc->enums, param->name); + good = false; + } + } + } + + /* Check that all enum-type parameters have at least one enum + * value in the enum table. 
+ */ + for (param = desc->specs; param->name; param++) { + if (param->type != fs_param_is_enum) + continue; + for (e = desc->enums; e->name[0]; e++) + if (e->opt == param->opt) + break; + if (!e->name[0]) { + pr_err("VALIDATE %s: PARAM[%s] enum with no values\n", + name, param->name); + good = false; + } + } + } else { + if (enums) { + pr_err("VALIDATE %s: enum-type values, but no enum table\n", + name); + good = false; + goto no_enums; + } + } + +no_enums: + return good; +} +#endif /* CONFIG_VALIDATE_FS_PARSER */ diff --git a/fs/internal.h b/fs/internal.h index 8f8d07cc433f..6a8b71643af4 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -61,6 +61,8 @@ extern void fc_drop_locked(struct fs_context *); /* * namei.c */ +extern int filename_lookup(int dfd, struct filename *name, unsigned flags, + struct path *path, struct path *root); extern int user_path_mountpoint_at(int, const char __user *, unsigned int, struct path *); extern int vfs_path_lookup(struct dentry *, struct vfsmount *, const char *, unsigned int, struct path *); diff --git a/fs/namei.c b/fs/namei.c index 914178cdbe94..a85deb55d0c9 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2333,8 +2333,8 @@ static int path_lookupat(struct nameidata *nd, unsigned flags, struct path *path return err; } -static int filename_lookup(int dfd, struct filename *name, unsigned flags, - struct path *path, struct path *root) +int filename_lookup(int dfd, struct filename *name, unsigned flags, + struct path *path, struct path *root) { int retval; struct nameidata nd; diff --git a/include/linux/errno.h b/include/linux/errno.h index 3cba627577d6..d73f597a2484 100644 --- a/include/linux/errno.h +++ b/include/linux/errno.h @@ -18,6 +18,7 @@ #define ERESTART_RESTARTBLOCK 516 /* restart by calling sys_restart_syscall */ #define EPROBE_DEFER 517 /* Driver requests probe retry */ #define EOPENSTALE 518 /* open found a stale dentry */ +#define ENOPARAM 519 /* Parameter not supported */ /* Defined for the NFSv3 protocol */ #define 
EBADHANDLE 521 /* Illegal NFS file handle */ diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index d208cc40b868..899027c94788 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -34,6 +34,35 @@ enum fs_context_purpose { FS_CONTEXT_FOR_RECONFIGURE, /* Superblock reconfiguration (remount) */ }; +/* + * Type of parameter value. + */ +enum fs_value_type { + fs_value_is_undefined, + fs_value_is_flag, /* Value not given a value */ + fs_value_is_string, /* Value is a string */ + fs_value_is_blob, /* Value is a binary blob */ + fs_value_is_filename, /* Value is a filename* + dirfd */ + fs_value_is_filename_empty, /* Value is a filename* + dirfd + AT_EMPTY_PATH */ + fs_value_is_file, /* Value is a file* */ +}; + +/* + * Configuration parameter. + */ +struct fs_parameter { + const char *key; /* Parameter name */ + enum fs_value_type type:8; /* The type of value here */ + union { + char *string; + void *blob; + struct filename *name; + struct file *file; + }; + size_t size; + int dirfd; +}; + /* * Filesystem context for holding the parameters used in the creation or * reconfiguration of a superblock. diff --git a/include/linux/fs_parser.h b/include/linux/fs_parser.h new file mode 100644 index 000000000000..620e89d0bb61 --- /dev/null +++ b/include/linux/fs_parser.h @@ -0,0 +1,176 @@ +/* Filesystem parameter description and parser + * + * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. + */ + +#ifndef _LINUX_FS_PARSER_H +#define _LINUX_FS_PARSER_H + +#include + +struct path; + +struct constant_table { + const char *name; + int value; +}; + +/* + * The type of parameter expected. 
+ */ +enum fs_parameter_type { + __fs_param_wasnt_defined, + fs_param_is_flag, + fs_param_is_bool, + fs_param_is_u32, + fs_param_is_u32_octal, + fs_param_is_u32_hex, + fs_param_is_s32, + fs_param_is_u64, + fs_param_is_enum, + fs_param_is_string, + fs_param_is_blob, + fs_param_is_blockdev, + fs_param_is_path, + fs_param_is_fd, + nr__fs_parameter_type, +}; + +/* + * Specification of the type of value a parameter wants. + * + * Note that the fsparam_flag(), fsparam_string(), fsparam_u32(), ... macros + * should be used to generate elements of this type. + */ +struct fs_parameter_spec { + const char *name; + u8 opt; /* Option number (returned by fs_parse()) */ + enum fs_parameter_type type:8; /* The desired parameter type */ + unsigned short flags; +#define fs_param_v_optional 0x0001 /* The value is optional */ +#define fs_param_neg_with_no 0x0002 /* "noxxx" is negative param */ +#define fs_param_neg_with_empty 0x0004 /* "xxx=" is negative param */ +#define fs_param_deprecated 0x0008 /* The param is deprecated */ +}; + +struct fs_parameter_enum { + u8 opt; /* Option number (as fs_parameter_spec::opt) */ + char name[14]; + u8 value; +}; + +struct fs_parameter_description { + const char name[16]; /* Name for logging purposes */ + const struct fs_parameter_spec *specs; /* List of param specifications */ + const struct fs_parameter_enum *enums; /* Enum values */ +}; + +/* + * Result of parse. 
+ */ +struct fs_parse_result { + bool negated; /* T if param was "noxxx" */ + bool has_value; /* T if value supplied to param */ + union { + bool boolean; /* For spec_bool */ + int int_32; /* For spec_s32/spec_enum */ + unsigned int uint_32; /* For spec_u32{,_octal,_hex}/spec_enum */ + u64 uint_64; /* For spec_u64 */ + }; +}; + +extern int fs_parse(struct fs_context *fc, + const struct fs_parameter_description *desc, + struct fs_parameter *value, + struct fs_parse_result *result); +extern int fs_lookup_param(struct fs_context *fc, + struct fs_parameter *param, + bool want_bdev, + struct path *_path); + +extern int __lookup_constant(const struct constant_table tbl[], size_t tbl_size, + const char *name, int not_found); +#define lookup_constant(t, n, nf) __lookup_constant(t, ARRAY_SIZE(t), (n), (nf)) + +#ifdef CONFIG_VALIDATE_FS_PARSER +extern bool validate_constant_table(const struct constant_table *tbl, size_t tbl_size, + int low, int high, int special); +extern bool fs_validate_description(const struct fs_parameter_description *desc); +#else +static inline bool validate_constant_table(const struct constant_table *tbl, size_t tbl_size, + int low, int high, int special) +{ return true; } +static inline bool fs_validate_description(const struct fs_parameter_description *desc) +{ return true; } +#endif + +/* + * Utility macro to allow varargs macros to be productive in themselves rather + * than merely being used to wrap a varargs function. + */ +#define __wrap19(Q,A, ...) A +#define __wrap18(Q,A, ...) A __wrap19(Q, _##Q##__VA_ARGS__) +#define __wrap17(Q,A, ...) A __wrap18(Q, _##Q##__VA_ARGS__) +#define __wrap16(Q,A, ...) A __wrap17(Q, _##Q##__VA_ARGS__) +#define __wrap15(Q,A, ...) A __wrap16(Q, _##Q##__VA_ARGS__) +#define __wrap14(Q,A, ...) A __wrap15(Q, _##Q##__VA_ARGS__) +#define __wrap13(Q,A, ...) A __wrap14(Q, _##Q##__VA_ARGS__) +#define __wrap12(Q,A, ...) A __wrap13(Q, _##Q##__VA_ARGS__) +#define __wrap11(Q,A, ...) 
A __wrap12(Q, _##Q##__VA_ARGS__) +#define __wrap10(Q,A, ...) A __wrap11(Q, _##Q##__VA_ARGS__) +#define __wrap09(Q,A, ...) A __wrap10(Q, _##Q##__VA_ARGS__) +#define __wrap08(Q,A, ...) A __wrap09(Q, _##Q##__VA_ARGS__) +#define __wrap07(Q,A, ...) A __wrap08(Q, _##Q##__VA_ARGS__) +#define __wrap06(Q,A, ...) A __wrap07(Q, _##Q##__VA_ARGS__) +#define __wrap05(Q,A, ...) A __wrap06(Q, _##Q##__VA_ARGS__) +#define __wrap04(Q,A, ...) A __wrap05(Q, _##Q##__VA_ARGS__) +#define __wrap03(Q,A, ...) A __wrap04(Q, _##Q##__VA_ARGS__) +#define __wrap02(Q,A, ...) A __wrap03(Q, _##Q##__VA_ARGS__) +#define __wrap01(Q,A, ...) A __wrap02(Q, _##Q##__VA_ARGS__) +#define __wrap00(Q,A, ...) A __wrap01(Q, _##Q##__VA_ARGS__) +#define __wrap(Q, ...) __wrap00(Q, _##Q##__VA_ARGS__) + +/* + * Hooks for __wrap() to OR together a list of parameter flags + */ +#define _fsp_flag_ +#define _fsp_flag_NEGATE_WITH_NO | fs_param_neg_with_no +#define _fsp_flag_NEGATE_WITH_EMPTY | fs_param_neg_with_empty +#define _fsp_flag_OPTIONAL | fs_param_v_optional +#define _fsp_flag_IS_DEPRECATED | fs_param_deprecated + +/* + * Parameter type, name, index and flags element constructors. Use as: + * + * fsparam_xxxx("foo", Opt_foo[,NEGATE_WITH_NO][,NEGATE_WITH_EMPTY][,OPTIONAL][,DEPRECATED]) + */ +#define __fsparam(TYPE, NAME, OPT, ...) \ + { \ + .name = NAME, \ + .opt = OPT, \ + .type = TYPE, \ + .flags = 0 __wrap(fsp_flag_, __VA_ARGS__) \ + } + +#define fsparam_flag(NAME, OPT, ...) __fsparam(fs_param_is_flag, NAME, OPT, ## __VA_ARGS__) +#define fsparam_bool(NAME, OPT, ...) __fsparam(fs_param_is_bool, NAME, OPT, ## __VA_ARGS__) +#define fsparam_u32(NAME, OPT, ...) __fsparam(fs_param_is_u32, NAME, OPT, ## __VA_ARGS__) +#define fsparam_u32oct(NAME, OPT, ...) __fsparam(fs_param_is_u32_octal, NAME, OPT, ## __VA_ARGS__) +#define fsparam_u32hex(NAME, OPT, ...) __fsparam(fs_param_is_u32_hex, NAME, OPT, ## __VA_ARGS__) +#define fsparam_s32(NAME, OPT, ...) 
__fsparam(fs_param_is_s32, NAME, OPT, ## __VA_ARGS__) +#define fsparam_u64(NAME, OPT, ...) __fsparam(fs_param_is_u64, NAME, OPT, ## __VA_ARGS__) +#define fsparam_enum(NAME, OPT, ...) __fsparam(fs_param_is_enum, NAME, OPT, ## __VA_ARGS__) +#define fsparam_string(NAME, OPT, ...) __fsparam(fs_param_is_string, NAME, OPT, ## __VA_ARGS__) +#define fsparam_blob(NAME, OPT, ...) __fsparam(fs_param_is_blob, NAME, OPT, ## __VA_ARGS__) +#define fsparam_bdev(NAME, OPT, ...) __fsparam(fs_param_is_blockdev, NAME, OPT, ## __VA_ARGS__) +#define fsparam_path(NAME, OPT, ...) __fsparam(fs_param_is_path, NAME, OPT, ## __VA_ARGS__) +#define fsparam_fd(NAME, OPT, ...) __fsparam(fs_param_is_fd, NAME, OPT, ## __VA_ARGS__) + + +#endif /* _LINUX_FS_PARSER_H */
From patchwork Tue Feb 19 16:30:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820219
Subject: [PATCH 16/43] vfs: Add LSM hooks for the new mount API
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-security-module@vger.kernel.org, linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:30:34 +0000
Message-ID: <155059383419.12449.6522023890948977248.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

Add LSM hooks for use by the new mount API and filesystem context code.
This includes:

 (1) Hooks to handle allocation, duplication and freeing of the security record attached to a filesystem context.

 (2) A hook to snoop source specifications. There may be multiple of these if the filesystem supports it. They will be local files/devices if fs_context::source_is_dev is true and will be something else, possibly remote server specifications, if false.

 (3) A hook to snoop superblock configuration options in key[=val] form. If the LSM decides it wants to handle it, it can suppress the option being passed to the filesystem. Note that 'val' may include commas and binary data with the fsopen patch.

 (4) A hook to perform validation and allocation after the configuration has been done but before the superblock is allocated and set up.

 (5) A hook to transfer the security from the context to a newly created superblock.

 (6) A hook to rule on whether a path point can be used as a mountpoint.

These are intended to replace:

	security_sb_copy_data
	security_sb_kern_mount
	security_sb_mount
	security_sb_set_mnt_opts
	security_sb_clone_mnt_opts
	security_sb_parse_opts_str

[AV -- some of the methods being replaced are already gone, some of the methods are not added for the lack of need]

Signed-off-by: David Howells
cc: linux-security-module@vger.kernel.org
Signed-off-by: Al Viro
---
 include/linux/lsm_hooks.h | 14 ++++++++++++++
 include/linux/security.h | 10 ++++++++++
 security/security.c | 5 +++++
 3 files changed, 29 insertions(+)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h index 9a0bdf91e646..47ba4db4d8fb 100644 --- a/include/linux/lsm_hooks.h +++ b/include/linux/lsm_hooks.h @@ -76,6 +76,17 @@ * changes on the process such as clearing out non-inheritable signal * state. This is called immediately after commit_creds(). * + * Security hooks for mount using fs_context. + * [See also Documentation/filesystems/mounting.txt] + * + * @fs_context_parse_param: + * Userspace provided a parameter to configure a superblock.
The LSM may + * reject it with an error and may use it for itself, in which case it + * should return 0; otherwise it should return -ENOPARAM to pass it on to + * the filesystem. + * @fc indicates the filesystem context. + * @param The parameter + * * Security hooks for filesystem operations. * * @sb_alloc_security: @@ -1459,6 +1470,8 @@ union security_list_options { void (*bprm_committing_creds)(struct linux_binprm *bprm); void (*bprm_committed_creds)(struct linux_binprm *bprm); + int (*fs_context_parse_param)(struct fs_context *fc, struct fs_parameter *param); + int (*sb_alloc_security)(struct super_block *sb); void (*sb_free_security)(struct super_block *sb); void (*sb_free_mnt_opts)(void *mnt_opts); @@ -1800,6 +1813,7 @@ struct security_hook_heads { struct hlist_head bprm_check_security; struct hlist_head bprm_committing_creds; struct hlist_head bprm_committed_creds; + struct hlist_head fs_context_parse_param; struct hlist_head sb_alloc_security; struct hlist_head sb_free_security; struct hlist_head sb_free_mnt_opts; diff --git a/include/linux/security.h b/include/linux/security.h index dbfb5a66babb..1cc4d7a3d6fa 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -53,6 +53,9 @@ struct msg_msg; struct xattr; struct xfrm_sec_ctx; struct mm_struct; +struct fs_context; +struct fs_parameter; +enum fs_value_type; /* If capable should audit the security request */ #define SECURITY_CAP_NOAUDIT 0 @@ -220,6 +223,7 @@ int security_bprm_set_creds(struct linux_binprm *bprm); int security_bprm_check(struct linux_binprm *bprm); void security_bprm_committing_creds(struct linux_binprm *bprm); void security_bprm_committed_creds(struct linux_binprm *bprm); +int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param); int security_sb_alloc(struct super_block *sb); void security_sb_free(struct super_block *sb); void security_free_mnt_opts(void **mnt_opts); @@ -517,6 +521,12 @@ static inline void security_bprm_committed_creds(struct 
linux_binprm *bprm) { } +static inline int security_fs_context_parse_param(struct fs_context *fc, + struct fs_parameter *param) +{ + return -ENOPARAM; +} + static inline int security_sb_alloc(struct super_block *sb) { return 0; diff --git a/security/security.c b/security/security.c index f1b8d2587639..e5519488327d 100644 --- a/security/security.c +++ b/security/security.c @@ -374,6 +374,11 @@ void security_bprm_committed_creds(struct linux_binprm *bprm) call_void_hook(bprm_committed_creds, bprm); } +int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param) +{ + return call_int_hook(fs_context_parse_param, -ENOPARAM, fc, param); +} + int security_sb_alloc(struct super_block *sb) { return call_int_hook(sb_alloc_security, 0, sb);
From patchwork Tue Feb 19 16:30:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820223
Subject: [PATCH 17/43] selinux: Implement the new mount API LSM hooks
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: Paul Moore, Stephen Smalley, selinux@tycho.nsa.gov, linux-security-module@vger.kernel.org, linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:30:41 +0000
Message-ID: <155059384161.12449.17357084420263811610.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

Implement the new mount API LSM hooks for SELinux. At some point the old hooks will need to be removed.
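
For readers unfamiliar with the parameter hook, here is a minimal userspace sketch (not the kernel code in this patch) of the key lookup the hook relies on: a key that SELinux recognises maps to its option number, and any other key yields -ENOPARAM so that it can be passed on to the filesystem. The struct sketch_spec type, the sketch_specs table and the sketch_lookup_key() helper are hypothetical names made up for illustration; the option names and numbers mirror the ones used in this patch, and the ENOPARAM value is the one added to include/linux/errno.h earlier in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value taken from the include/linux/errno.h hunk earlier in this series. */
#define ENOPARAM 519

/* Option numbers as renumbered by this patch. */
enum {
	Opt_context	= 0,
	Opt_defcontext	= 1,
	Opt_fscontext	= 2,
	Opt_rootcontext	= 3,
	Opt_seclabel	= 4,
};

/* Hypothetical, simplified stand-in for struct fs_parameter_spec. */
struct sketch_spec {
	const char *name;
	int opt;
};

/* Table terminated by an entry with a NULL name, like the {} in the patch. */
static const struct sketch_spec sketch_specs[] = {
	{ "context",	 Opt_context },
	{ "defcontext",	 Opt_defcontext },
	{ "fscontext",	 Opt_fscontext },
	{ "rootcontext", Opt_rootcontext },
	{ "seclabel",	 Opt_seclabel },
	{ NULL, 0 }
};

/*
 * Return the option number for a recognised key, or -ENOPARAM if the key
 * is not one of ours and should be handed on to the filesystem.
 */
static int sketch_lookup_key(const char *key)
{
	const struct sketch_spec *p;

	assert(key != NULL);
	for (p = sketch_specs; p->name; p++)
		if (strcmp(p->name, key) == 0)
			return p->opt;
	return -ENOPARAM;
}
```

Note that the keys no longer carry a trailing "=", which matches the change to the *_STR macros in security/selinux/include/security.h below.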
Signed-off-by: David Howells cc: Paul Moore cc: Stephen Smalley cc: selinux@tycho.nsa.gov cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro --- security/selinux/hooks.c | 49 +++++++++++++++++++++++++++++++---- security/selinux/include/security.h | 10 ++++--- 2 files changed, 49 insertions(+), 10 deletions(-) diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index f0e36c3492ba..f99381e97d73 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -48,6 +48,8 @@ #include #include #include +#include +#include #include #include #include @@ -454,11 +456,11 @@ static inline int inode_doinit(struct inode *inode) enum { Opt_error = -1, - Opt_context = 1, + Opt_context = 0, + Opt_defcontext = 1, Opt_fscontext = 2, - Opt_defcontext = 3, - Opt_rootcontext = 4, - Opt_seclabel = 5, + Opt_rootcontext = 3, + Opt_seclabel = 4, }; #define A(s, has_arg) {#s, sizeof(#s) - 1, Opt_##s, has_arg} @@ -1089,6 +1091,7 @@ static int show_sid(struct seq_file *m, u32 sid) if (!rc) { bool has_comma = context && strchr(context, ','); + seq_putc(m, '='); if (has_comma) seq_putc(m, '\"'); seq_escape(m, context, "\"\n\\"); @@ -1142,7 +1145,7 @@ static int selinux_sb_show_options(struct seq_file *m, struct super_block *sb) } if (sbsec->flags & SBLABEL_MNT) { seq_putc(m, ','); - seq_puts(m, LABELSUPP_STR); + seq_puts(m, SECLABEL_STR); } return 0; } @@ -2761,6 +2764,38 @@ static int selinux_umount(struct vfsmount *mnt, int flags) FILESYSTEM__UNMOUNT, NULL); } +static const struct fs_parameter_spec selinux_param_specs[] = { + fsparam_string(CONTEXT_STR, Opt_context), + fsparam_string(DEFCONTEXT_STR, Opt_defcontext), + fsparam_string(FSCONTEXT_STR, Opt_fscontext), + fsparam_string(ROOTCONTEXT_STR, Opt_rootcontext), + fsparam_flag (SECLABEL_STR, Opt_seclabel), + {} +}; + +static const struct fs_parameter_description selinux_fs_parameters = { + .name = "SELinux", + .specs = selinux_param_specs, +}; + +static int selinux_fs_context_parse_param(struct fs_context 
*fc, + struct fs_parameter *param) +{ + struct fs_parse_result result; + int opt, rc; + + opt = fs_parse(fc, &selinux_fs_parameters, param, &result); + if (opt < 0) + return opt; + + rc = selinux_add_opt(opt, param->string, &fc->security); + if (!rc) { + param->string = NULL; + rc = 1; + } + return rc; +} + /* inode security operations */ static int selinux_inode_alloc_security(struct inode *inode) @@ -6710,6 +6745,8 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = { LSM_HOOK_INIT(bprm_committing_creds, selinux_bprm_committing_creds), LSM_HOOK_INIT(bprm_committed_creds, selinux_bprm_committed_creds), + LSM_HOOK_INIT(fs_context_parse_param, selinux_fs_context_parse_param), + LSM_HOOK_INIT(sb_alloc_security, selinux_sb_alloc_security), LSM_HOOK_INIT(sb_free_security, selinux_sb_free_security), LSM_HOOK_INIT(sb_eat_lsm_opts, selinux_sb_eat_lsm_opts), @@ -6978,6 +7015,8 @@ static __init int selinux_init(void) else pr_debug("SELinux: Starting in permissive mode\n"); + fs_validate_description(&selinux_fs_parameters); + return 0; } diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h index ba8eedf42b90..529d8941c9c5 100644 --- a/security/selinux/include/security.h +++ b/security/selinux/include/security.h @@ -59,11 +59,11 @@ #define SE_SBPROC 0x0200 #define SE_SBGENFS 0x0400 -#define CONTEXT_STR "context=" -#define FSCONTEXT_STR "fscontext=" -#define ROOTCONTEXT_STR "rootcontext=" -#define DEFCONTEXT_STR "defcontext=" -#define LABELSUPP_STR "seclabel" +#define CONTEXT_STR "context" +#define FSCONTEXT_STR "fscontext" +#define ROOTCONTEXT_STR "rootcontext" +#define DEFCONTEXT_STR "defcontext" +#define SECLABEL_STR "seclabel" struct netlbl_lsm_secattr;
From patchwork Tue Feb 19 16:30:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820227
Subject: [PATCH 18/43] smack: Implement filesystem context security hooks
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: Casey Schaufler, linux-security-module@vger.kernel.org, linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:30:49 +0000
Message-ID: <155059384951.12449.6175507681481014891.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

Implement filesystem context security hooks for the smack LSM.
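
The return convention documented for the hook (0 when the LSM consumes the parameter, -ENOPARAM to pass it on, anything else an error) implies a two-stage dispatch in the caller. The following is a minimal userspace sketch of that dispatch, written under stated assumptions rather than taken from this series: all sketch_* functions and the "ro" filesystem option are hypothetical, and only the key names are those listed in smack_param_specs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace errno.h does not provide this; value as added in this series. */
#ifndef ENOPARAM
#define ENOPARAM 519
#endif

/*
 * Hypothetical stand-in for the LSM hook: 0 if the parameter was consumed,
 * -ENOPARAM if it is not one of ours, another negative errno on error.
 */
static int sketch_lsm_hook(const char *key, const char *value)
{
	/* Key names as listed in smack_param_specs in this patch. */
	static const char *const keys[] = {
		"fsdefault", "fsfloor", "fshat", "fsroot", "fstransmute",
	};
	size_t i;

	for (i = 0; i < sizeof(keys) / sizeof(keys[0]); i++) {
		if (strcmp(keys[i], key) == 0)
			/* String parameters must come with a value. */
			return value ? 0 : -EINVAL;
	}
	return -ENOPARAM;
}

/* Hypothetical filesystem handler that only knows the "ro" option. */
static int sketch_fs_handler(const char *key, const char *value)
{
	(void)value;
	return strcmp(key, "ro") == 0 ? 0 : -ENOPARAM;
}

/* Offer the parameter to the LSM first, then to the filesystem. */
static int sketch_parse_param(const char *key, const char *value)
{
	int ret;

	assert(key != NULL);
	ret = sketch_lsm_hook(key, value);
	if (ret != -ENOPARAM)
		return ret;	/* consumed (0) or rejected (< 0) by the LSM */
	return sketch_fs_handler(key, value);
}
```

With this shape, a key that neither the LSM nor the filesystem recognises still surfaces as -ENOPARAM, so the caller can report an unknown parameter once rather than in each handler.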
Signed-off-by: David Howells cc: Casey Schaufler cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro --- security/smack/smack.h | 19 +++++-------------- security/smack/smack_lsm.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 47 insertions(+), 15 deletions(-) diff --git a/security/smack/smack.h b/security/smack/smack.h index f7db791fb566..0380a9c89d3b 100644 --- a/security/smack/smack.h +++ b/security/smack/smack.h @@ -195,22 +195,13 @@ struct smack_known_list_elem { enum { Opt_error = -1, - Opt_fsdefault = 1, - Opt_fsfloor = 2, - Opt_fshat = 3, - Opt_fsroot = 4, - Opt_fstransmute = 5, + Opt_fsdefault = 0, + Opt_fsfloor = 1, + Opt_fshat = 2, + Opt_fsroot = 3, + Opt_fstransmute = 4, }; -/* - * Mount options - */ -#define SMK_FSDEFAULT "smackfsdef=" -#define SMK_FSFLOOR "smackfsfloor=" -#define SMK_FSHAT "smackfshat=" -#define SMK_FSROOT "smackfsroot=" -#define SMK_FSTRANS "smackfstransmute=" - #define SMACK_DELETE_OPTION "-DELETE" #define SMACK_CIPSO_OPTION "-CIPSO" diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 430d4f35e55c..5f93c4f84384 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -43,6 +43,8 @@ #include #include #include +#include +#include #include "smack.h" #define TRANS_TRUE "TRUE" @@ -541,7 +543,6 @@ static int smack_syslog(int typefrom_file) return rc; } - /* * Superblock Hooks. 
*/ @@ -646,6 +647,44 @@ static int smack_add_opt(int token, const char *s, void **mnt_opts) return -EINVAL; } +static const struct fs_parameter_spec smack_param_specs[] = { + fsparam_string("fsdefault", Opt_fsdefault), + fsparam_string("fsfloor", Opt_fsfloor), + fsparam_string("fshat", Opt_fshat), + fsparam_string("fsroot", Opt_fsroot), + fsparam_string("fstransmute", Opt_fstransmute), + {} +}; + +static const struct fs_parameter_description smack_fs_parameters = { + .name = "smack", + .specs = smack_param_specs, +}; + +/** + * smack_fs_context_parse_param - Parse a single mount parameter + * @fc: The new filesystem context being constructed. + * @param: The parameter. + * + * Returns 0 on success, -ENOPARAM to pass the parameter on or anything else on + * error. + */ +static int smack_fs_context_parse_param(struct fs_context *fc, + struct fs_parameter *param) +{ + struct fs_parse_result result; + int opt, rc; + + opt = fs_parse(fc, &smack_fs_parameters, param, &result); + if (opt < 0) + return opt; + + rc = smack_add_opt(opt, param->string, &fc->security); + if (!rc) + param->string = NULL; + return rc; +} + static int smack_sb_eat_lsm_opts(char *options, void **mnt_opts) { char *from = options, *to = options; @@ -4587,6 +4626,8 @@ static struct security_hook_list smack_hooks[] __lsm_ro_after_init = { LSM_HOOK_INIT(ptrace_traceme, smack_ptrace_traceme), LSM_HOOK_INIT(syslog, smack_syslog), + LSM_HOOK_INIT(fs_context_parse_param, smack_fs_context_parse_param), + LSM_HOOK_INIT(sb_alloc_security, smack_sb_alloc_security), LSM_HOOK_INIT(sb_free_security, smack_sb_free_security), LSM_HOOK_INIT(sb_free_mnt_opts, smack_free_mnt_opts), From patchwork Tue Feb 19 16:30:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820231 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org 
(Postfix) with ESMTP id 87A521805 for ; Tue, 19 Feb 2019 16:31:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 740B22CC8A for ; Tue, 19 Feb 2019 16:31:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 70CB22CCE6; Tue, 19 Feb 2019 16:31:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 155602CCEE for ; Tue, 19 Feb 2019 16:31:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729302AbfBSQa7 (ORCPT ); Tue, 19 Feb 2019 11:30:59 -0500 Received: from mx1.redhat.com ([209.132.183.28]:36808 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729120AbfBSQa7 (ORCPT ); Tue, 19 Feb 2019 11:30:59 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 047467F746; Tue, 19 Feb 2019 16:30:59 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id B51ED101962A; Tue, 19 Feb 2019 16:30:57 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 19/43] vfs: Put security flags into the fs_context struct From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:30:56 +0000 Message-ID: <155059385694.12449.4260302564821199185.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Tue, 19 Feb 2019 16:30:59 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Put security flags, such as SECURITY_LSM_NATIVE_LABELS, into the filesystem context so that the filesystem can communicate them to the LSM more easily. 
Signed-off-by: David Howells Signed-off-by: Al Viro --- include/linux/fs_context.h | 1 + include/linux/security.h | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index 899027c94788..d5ff3b0bc28d 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -85,6 +85,7 @@ struct fs_context { void *security; /* Linux S&M options */ unsigned int sb_flags; /* Proposed superblock flags (SB_*) */ unsigned int sb_flags_mask; /* Superblock flags that were changed */ + unsigned int lsm_flags; /* Information flags from the fs to the LSM */ enum fs_context_purpose purpose:8; bool need_free:1; /* Need to call ops->free() */ }; diff --git a/include/linux/security.h b/include/linux/security.h index 1cc4d7a3d6fa..2da9336a987e 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -61,7 +61,7 @@ enum fs_value_type; #define SECURITY_CAP_NOAUDIT 0 #define SECURITY_CAP_AUDIT 1 -/* LSM Agnostic defines for sb_set_mnt_opts */ +/* LSM Agnostic defines for fs_context::lsm_flags */ #define SECURITY_LSM_NATIVE_LABELS 1 struct ctl_table; From patchwork Tue Feb 19 16:31:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820235 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6F3EB14E1 for ; Tue, 19 Feb 2019 16:31:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 574CB2CD07 for ; Tue, 19 Feb 2019 16:31:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4BB0E2CCF2; Tue, 19 Feb 2019 16:31:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 
required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0C3652CD16 for ; Tue, 19 Feb 2019 16:31:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728984AbfBSQbQ (ORCPT ); Tue, 19 Feb 2019 11:31:16 -0500 Received: from mx1.redhat.com ([209.132.183.28]:37934 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728818AbfBSQbQ (ORCPT ); Tue, 19 Feb 2019 11:31:16 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 37267A7865; Tue, 19 Feb 2019 16:31:15 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id C546861460; Tue, 19 Feb 2019 16:31:04 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 20/43] vfs: Implement a filesystem superblock creation/configuration context From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:31:04 +0000 Message-ID: <155059386422.12449.6531190148715512242.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Tue, 19 Feb 2019 16:31:15 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP [AV - unfuck kern_mount_data(); we want non-NULL ->mnt_ns on long-living mounts] [AV - reordering fs/namespace.c is badly overdue, but let's keep it separate from that series] [AV - drop simple_pin_fs() change] [AV - clean vfs_kern_mount() failure exits up] Implement a filesystem context concept to be used during superblock creation for mount and superblock reconfiguration for remount. The mounting procedure then becomes: (1) Allocate new fs_context context. (2) Configure the context. (3) Create superblock. (4) Query the superblock. (5) Create a mount for the superblock. (6) Destroy the context. Rather than calling fs_type->mount(), an fs_context struct is created and fs_type->init_fs_context() is called to set it up. Pointers exist for the filesystem and LSM to hang their private data off. A set of operations has to be set by ->init_fs_context() to provide freeing, duplication, option parsing, binary data parsing, validation, mounting and superblock filling. 
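As a rough userspace illustration of the six steps above (not kernel code, and not part of this patch), the sketch below models a context that is allocated, configured from a key[=val] option string, used to build a stand-in "superblock", and then destroyed. Every demo_* name, the struct layouts and the -1 error convention are hypothetical; steps (3) to (5) are collapsed into a single function, and LSM hooks and superblock matching are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct fs_context. */
struct demo_fs_context {
	char *source;	/* loosely like fc->source */
	int read_only;	/* loosely like SB_RDONLY in fc->sb_flags */
	int nr_opts;	/* parameters left for the "filesystem" to handle */
};

/* Hypothetical stand-in for the superblock built from the context. */
struct demo_super {
	char *source;
	int read_only;
	int nr_opts;
};

static char *demo_strdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Step 1: allocate a new, zeroed context. */
struct demo_fs_context *demo_alloc_context(void)
{
	return calloc(1, sizeof(struct demo_fs_context));
}

/* Step 2: apply one parameter; returns 0 on success, -1 on error. */
int demo_parse_param(struct demo_fs_context *fc, const char *key,
		     const char *value)
{
	assert(fc && key);
	if (strcmp(key, "ro") == 0) {
		fc->read_only = 1;
		return 0;
	}
	if (strcmp(key, "rw") == 0) {
		fc->read_only = 0;
		return 0;
	}
	if (strcmp(key, "source") == 0) {
		if (!value || fc->source)
			return -1;	/* missing or duplicate source */
		fc->source = demo_strdup(value);
		return fc->source ? 0 : -1;
	}
	fc->nr_opts++;			/* anything else goes to the "filesystem" */
	return 0;
}

/* Step 2, monolithic form: split "key[=val][,key[=val]]*" in place. */
int demo_parse_monolithic(struct demo_fs_context *fc, char *options)
{
	char *key = options;

	while (key) {
		char *next = strchr(key, ',');
		char *value;

		if (next)
			*next++ = '\0';
		value = strchr(key, '=');
		/* Skip empty items and items that start with '='. */
		if (*key && value != key) {
			if (value)
				*value++ = '\0';
			if (demo_parse_param(fc, key, value) < 0)
				return -1;
		}
		key = next;
	}
	return 0;
}

/* Steps 3-5, collapsed: build the "superblock" from the context. */
int demo_get_tree(struct demo_fs_context *fc, struct demo_super *sb)
{
	if (!fc->source)
		return -1;
	sb->source = fc->source;	/* ownership moves to the superblock */
	fc->source = NULL;
	sb->read_only = fc->read_only;
	sb->nr_opts = fc->nr_opts;
	return 0;
}

/* Step 6: destroy the context and whatever it still owns. */
void demo_put_context(struct demo_fs_context *fc)
{
	if (!fc)
		return;
	free(fc->source);
	free(fc);
}
```

A caller would run these in order: demo_alloc_context(), demo_parse_monolithic() on a writable copy of the option string, demo_get_tree(), then demo_put_context(). The context is released whether or not the "superblock" was built, which is why demo_get_tree() moves the source string out of the context rather than sharing it.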
Legacy filesystems are supported by the provision of a set of legacy fs_context operations that build up a list of mount options and then invoke fs_type->mount() from within the fs_context ->get_tree() operation. This allows all filesystems to be accessed using fs_context. It should be noted that, whilst this patch adds a lot of lines of code, there is quite a bit of duplication with existing code that can be eliminated should all filesystems be converted over. Signed-off-by: David Howells Signed-off-by: Al Viro --- fs/filesystems.c | 4 + fs/fs_context.c | 300 ++++++++++++++++++++++++++++++++++++++++++++ fs/namespace.c | 25 +--- include/linux/fs.h | 2 include/linux/fs_context.h | 5 + 5 files changed, 319 insertions(+), 17 deletions(-) diff --git a/fs/filesystems.c b/fs/filesystems.c index b03f57b1105b..9135646e41ac 100644 --- a/fs/filesystems.c +++ b/fs/filesystems.c @@ -16,6 +16,7 @@ #include #include #include +#include /* * Handling of filesystem drivers list. @@ -73,6 +74,9 @@ int register_filesystem(struct file_system_type * fs) int res = 0; struct file_system_type ** p; + if (fs->parameters && !fs_validate_description(fs->parameters)) + return -EINVAL; + BUG_ON(strchr(fs->name, '.')); if (fs->next) return -EBUSY; diff --git a/fs/fs_context.c b/fs/fs_context.c index 825d1b2c8807..aa7e0ffb591a 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -12,6 +12,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include +#include #include #include #include @@ -25,13 +26,217 @@ #include "mount.h" #include "internal.h" +enum legacy_fs_param { + LEGACY_FS_UNSET_PARAMS, + LEGACY_FS_MONOLITHIC_PARAMS, + LEGACY_FS_INDIVIDUAL_PARAMS, +}; + struct legacy_fs_context { char *legacy_data; /* Data page for legacy filesystems */ size_t data_size; + enum legacy_fs_param param_type; }; static int legacy_init_fs_context(struct fs_context *fc); +static const struct constant_table common_set_sb_flag[] = { + { "dirsync", SB_DIRSYNC }, + { "lazytime", SB_LAZYTIME }, + { "mand", 
SB_MANDLOCK }, + { "posixacl", SB_POSIXACL }, + { "ro", SB_RDONLY }, + { "sync", SB_SYNCHRONOUS }, +}; + +static const struct constant_table common_clear_sb_flag[] = { + { "async", SB_SYNCHRONOUS }, + { "nolazytime", SB_LAZYTIME }, + { "nomand", SB_MANDLOCK }, + { "rw", SB_RDONLY }, + { "silent", SB_SILENT }, +}; + +static const char *const forbidden_sb_flag[] = { + "bind", + "dev", + "exec", + "move", + "noatime", + "nodev", + "nodiratime", + "noexec", + "norelatime", + "nostrictatime", + "nosuid", + "private", + "rec", + "relatime", + "remount", + "shared", + "slave", + "strictatime", + "suid", + "unbindable", +}; + +/* + * Check for a common mount option that manipulates s_flags. + */ +static int vfs_parse_sb_flag(struct fs_context *fc, const char *key) +{ + unsigned int token; + unsigned int i; + + for (i = 0; i < ARRAY_SIZE(forbidden_sb_flag); i++) + if (strcmp(key, forbidden_sb_flag[i]) == 0) + return -EINVAL; + + token = lookup_constant(common_set_sb_flag, key, 0); + if (token) { + fc->sb_flags |= token; + fc->sb_flags_mask |= token; + return 0; + } + + token = lookup_constant(common_clear_sb_flag, key, 0); + if (token) { + fc->sb_flags &= ~token; + fc->sb_flags_mask |= token; + return 0; + } + + return -ENOPARAM; +} + +/** + * vfs_parse_fs_param - Add a single parameter to a superblock config + * @fc: The filesystem context to modify + * @param: The parameter + * + * A single mount option in string form is applied to the filesystem context + * being set up. Certain standard options (for example "ro") are translated + * into flag bits without going to the filesystem. The active security module + * is allowed to observe and poach options. Any other options are passed over + * to the filesystem to parse. + * + * This may be called multiple times for a context. + * + * Returns 0 on success and a negative error code on failure. In the event of + * failure, supplementary error information may have been set. 
+ */ +int vfs_parse_fs_param(struct fs_context *fc, struct fs_parameter *param) +{ + int ret; + + if (!param->key) + return invalf(fc, "Unnamed parameter\n"); + + ret = vfs_parse_sb_flag(fc, param->key); + if (ret != -ENOPARAM) + return ret; + + ret = security_fs_context_parse_param(fc, param); + if (ret != -ENOPARAM) + /* Param belongs to the LSM or is disallowed by the LSM; so + * don't pass to the FS. + */ + return ret; + + if (fc->ops->parse_param) { + ret = fc->ops->parse_param(fc, param); + if (ret != -ENOPARAM) + return ret; + } + + /* If the filesystem doesn't take any arguments, give it the + * default handling of source. + */ + if (strcmp(param->key, "source") == 0) { + if (param->type != fs_value_is_string) + return invalf(fc, "VFS: Non-string source"); + if (fc->source) + return invalf(fc, "VFS: Multiple sources"); + fc->source = param->string; + param->string = NULL; + return 0; + } + + return invalf(fc, "%s: Unknown parameter '%s'", + fc->fs_type->name, param->key); +} +EXPORT_SYMBOL(vfs_parse_fs_param); + +/** + * vfs_parse_fs_string - Convenience function to just parse a string. + */ +int vfs_parse_fs_string(struct fs_context *fc, const char *key, + const char *value, size_t v_size) +{ + int ret; + + struct fs_parameter param = { + .key = key, + .type = fs_value_is_string, + .size = v_size, + }; + + if (v_size > 0) { + param.string = kmemdup_nul(value, v_size, GFP_KERNEL); + if (!param.string) + return -ENOMEM; + } + + ret = vfs_parse_fs_param(fc, &param); + kfree(param.string); + return ret; +} +EXPORT_SYMBOL(vfs_parse_fs_string); + +/** + * generic_parse_monolithic - Parse key[=val][,key[=val]]* mount data + * @ctx: The superblock configuration to fill in. + * @data: The data to parse + * + * Parse a blob of data that's in key[=val][,key[=val]]* form. This can be + * called from the ->monolithic_mount_data() fs_context operation. + * + * Returns 0 on success or the error returned by the ->parse_option() fs_context + * operation on failure. 
+ */ +int generic_parse_monolithic(struct fs_context *fc, void *data) +{ + char *options = data, *key; + int ret = 0; + + if (!options) + return 0; + + ret = security_sb_eat_lsm_opts(options, &fc->security); + if (ret) + return ret; + + while ((key = strsep(&options, ",")) != NULL) { + if (*key) { + size_t v_len = 0; + char *value = strchr(key, '='); + + if (value) { + if (value == key) + continue; + *value++ = 0; + v_len = strlen(value); + } + ret = vfs_parse_fs_string(fc, key, value, v_len); + if (ret < 0) + break; + } + } + + return ret; +} +EXPORT_SYMBOL(generic_parse_monolithic); + /** * alloc_fs_context - Create a filesystem context. * @fs_type: The filesystem type. @@ -166,7 +371,87 @@ EXPORT_SYMBOL(put_fs_context); */ static void legacy_fs_context_free(struct fs_context *fc) { - kfree(fc->fs_private); + struct legacy_fs_context *ctx = fc->fs_private; + + if (ctx) { + if (ctx->param_type == LEGACY_FS_INDIVIDUAL_PARAMS) + kfree(ctx->legacy_data); + kfree(ctx); + } +} + +/* + * Add a parameter to a legacy config. We build up a comma-separated list of + * options. 
+ */ +static int legacy_parse_param(struct fs_context *fc, struct fs_parameter *param) +{ + struct legacy_fs_context *ctx = fc->fs_private; + unsigned int size = ctx->data_size; + size_t len = 0; + + if (strcmp(param->key, "source") == 0) { + if (param->type != fs_value_is_string) + return invalf(fc, "VFS: Legacy: Non-string source"); + if (fc->source) + return invalf(fc, "VFS: Legacy: Multiple sources"); + fc->source = param->string; + param->string = NULL; + return 0; + } + + if ((fc->fs_type->fs_flags & FS_HAS_SUBTYPE) && + strcmp(param->key, "subtype") == 0) { + if (param->type != fs_value_is_string) + return invalf(fc, "VFS: Legacy: Non-string subtype"); + if (fc->subtype) + return invalf(fc, "VFS: Legacy: Multiple subtype"); + fc->subtype = param->string; + param->string = NULL; + return 0; + } + + if (ctx->param_type == LEGACY_FS_MONOLITHIC_PARAMS) + return invalf(fc, "VFS: Legacy: Can't mix monolithic and individual options"); + + switch (param->type) { + case fs_value_is_string: + len = 1 + param->size; + /* Fall through */ + case fs_value_is_flag: + len += strlen(param->key); + break; + default: + return invalf(fc, "VFS: Legacy: Parameter type for '%s' not supported", + param->key); + } + + if (len > PAGE_SIZE - 2 - size) + return invalf(fc, "VFS: Legacy: Cumulative options too large"); + if (strchr(param->key, ',') || + (param->type == fs_value_is_string && + memchr(param->string, ',', param->size))) + return invalf(fc, "VFS: Legacy: Option '%s' contained comma", + param->key); + if (!ctx->legacy_data) { + ctx->legacy_data = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (!ctx->legacy_data) + return -ENOMEM; + } + + ctx->legacy_data[size++] = ','; + len = strlen(param->key); + memcpy(ctx->legacy_data + size, param->key, len); + size += len; + if (param->type == fs_value_is_string) { + ctx->legacy_data[size++] = '='; + memcpy(ctx->legacy_data + size, param->string, param->size); + size += param->size; + } + ctx->legacy_data[size] = '\0'; + ctx->data_size = size; + 
ctx->param_type = LEGACY_FS_INDIVIDUAL_PARAMS; + return 0; } /* @@ -175,9 +460,17 @@ static void legacy_fs_context_free(struct fs_context *fc) static int legacy_parse_monolithic(struct fs_context *fc, void *data) { struct legacy_fs_context *ctx = fc->fs_private; + + if (ctx->param_type != LEGACY_FS_UNSET_PARAMS) { + pr_warn("VFS: Can't mix monolithic and individual options\n"); + return -EINVAL; + } + ctx->legacy_data = data; + ctx->param_type = LEGACY_FS_MONOLITHIC_PARAMS; if (!ctx->legacy_data) return 0; + if (fc->fs_type->fs_flags & FS_BINARY_MOUNTDATA) return 0; return security_sb_eat_lsm_opts(ctx->legacy_data, &fc->security); @@ -221,6 +514,7 @@ static int legacy_reconfigure(struct fs_context *fc) const struct fs_context_operations legacy_fs_context_ops = { .free = legacy_fs_context_free, + .parse_param = legacy_parse_param, .parse_monolithic = legacy_parse_monolithic, .get_tree = legacy_get_tree, .reconfigure = legacy_reconfigure, @@ -242,6 +536,10 @@ static int legacy_init_fs_context(struct fs_context *fc) int parse_monolithic_mount_data(struct fs_context *fc, void *data) { int (*monolithic_mount_data)(struct fs_context *, void *); + monolithic_mount_data = fc->ops->parse_monolithic; + if (!monolithic_mount_data) + monolithic_mount_data = generic_parse_monolithic; + return monolithic_mount_data(fc, data); } diff --git a/fs/namespace.c b/fs/namespace.c index 931228d8518a..1a1ed2528f47 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -997,17 +997,15 @@ struct vfsmount *vfs_kern_mount(struct file_system_type *type, int ret = 0; if (!type) - return ERR_PTR(-ENODEV); + return ERR_PTR(-EINVAL); fc = fs_context_for_mount(type, flags); if (IS_ERR(fc)) return ERR_CAST(fc); - if (name) { - fc->source = kstrdup(name, GFP_KERNEL); - if (!fc->source) - ret = -ENOMEM; - } + if (name) + ret = vfs_parse_fs_string(fc, "source", + name, strlen(name)); if (!ret) ret = parse_monolithic_mount_data(fc, data); if (!ret) @@ -2611,16 +2609,11 @@ static int do_new_mount(struct 
path *path, const char *fstype, int sb_flags, if (IS_ERR(fc)) return PTR_ERR(fc); - if (subtype) { - fc->subtype = kstrdup(subtype, GFP_KERNEL); - if (!fc->subtype) - err = -ENOMEM; - } - if (!err && name) { - fc->source = kstrdup(name, GFP_KERNEL); - if (!fc->source) - err = -ENOMEM; - } + if (subtype) + err = vfs_parse_fs_string(fc, "subtype", + subtype, strlen(subtype)); + if (!err && name) + err = vfs_parse_fs_string(fc, "source", name, strlen(name)); if (!err) err = parse_monolithic_mount_data(fc, data); if (!err) diff --git a/include/linux/fs.h b/include/linux/fs.h index 8d578a9e1e8c..cf6e9ea161eb 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -62,6 +62,7 @@ struct iov_iter; struct fscrypt_info; struct fscrypt_operations; struct fs_context; +struct fs_parameter_description; extern void __init inode_init(void); extern void __init inode_init_early(void); @@ -2175,6 +2176,7 @@ struct file_system_type { #define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. 
*/ int (*init_fs_context)(struct fs_context *); + const struct fs_parameter_description *parameters; struct dentry *(*mount) (struct file_system_type *, int, const char *, void *); void (*kill_sb) (struct super_block *); diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index d5ff3b0bc28d..d794b04e9fbb 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -92,6 +92,7 @@ struct fs_context { struct fs_context_operations { void (*free)(struct fs_context *fc); + int (*parse_param)(struct fs_context *fc, struct fs_parameter *param); int (*parse_monolithic)(struct fs_context *fc, void *data); int (*get_tree)(struct fs_context *fc); int (*reconfigure)(struct fs_context *fc); @@ -108,6 +109,10 @@ extern struct fs_context *fs_context_for_reconfigure(struct dentry *dentry, extern struct fs_context *fs_context_for_submount(struct file_system_type *fs_type, struct dentry *reference); +extern int vfs_parse_fs_param(struct fs_context *fc, struct fs_parameter *param); +extern int vfs_parse_fs_string(struct fs_context *fc, const char *key, + const char *value, size_t v_size); +extern int generic_parse_monolithic(struct fs_context *fc, void *data); extern int vfs_get_tree(struct fs_context *fc); extern void put_fs_context(struct fs_context *fc); From patchwork Tue Feb 19 16:31:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820239 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A58A1805 for ; Tue, 19 Feb 2019 16:31:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 02EE32CC81 for ; Tue, 19 Feb 2019 16:31:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0054F2CD09; Tue, 19 Feb 2019 16:31:24 
+0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1A2C02CD09 for ; Tue, 19 Feb 2019 16:31:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729211AbfBSQbX (ORCPT ); Tue, 19 Feb 2019 11:31:23 -0500 Received: from mx1.redhat.com ([209.132.183.28]:42582 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729182AbfBSQbX (ORCPT ); Tue, 19 Feb 2019 11:31:23 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 8404FD7819; Tue, 19 Feb 2019 16:31:22 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2A4A3101E5B1; Tue, 19 Feb 2019 16:31:21 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 21/43] convenience helpers: vfs_get_super() and sget_fc() From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:31:20 +0000 Message-ID: <155059388041.12449.16262861562909897857.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Tue, 19 Feb 2019 16:31:22 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro the former is an analogue of mount_{single,nodev} for use in ->get_tree() instances, the latter - analogue of sget() for the same. These are fairly similar to the originals, but the callback signature for sget_fc() is different from sget() ones, so getting bits and pieces shared would be too convoluted; we might get around to that later, but for now let's just remember to keep them in sync. They do live next to each other, and changes in either won't be hard to spot. Signed-off-by: Al Viro --- fs/super.c | 171 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/fs.h | 4 + include/linux/fs_context.h | 15 ++++ 3 files changed, 190 insertions(+) diff --git a/fs/super.c b/fs/super.c index 76b3181c782d..0ebb5c11fa56 100644 --- a/fs/super.c +++ b/fs/super.c @@ -476,6 +476,94 @@ void generic_shutdown_super(struct super_block *sb) EXPORT_SYMBOL(generic_shutdown_super); +/** + * sget_fc - Find or create a superblock + * @fc: Filesystem context. 
+ * @test: Comparison callback + * @set: Setup callback + * + * Find or create a superblock using the parameters stored in the filesystem + * context and the two callback functions. + * + * If an extant superblock is matched, then that will be returned with an + * elevated reference count that the caller must transfer or discard. + * + * If no match is made, a new superblock will be allocated and basic + * initialisation will be performed (s_type, s_fs_info and s_id will be set and + * the set() callback will be invoked), the superblock will be published and it + * will be returned in a partially constructed state with SB_BORN and SB_ACTIVE + * as yet unset. + */ +struct super_block *sget_fc(struct fs_context *fc, + int (*test)(struct super_block *, struct fs_context *), + int (*set)(struct super_block *, struct fs_context *)) +{ + struct super_block *s = NULL; + struct super_block *old; + struct user_namespace *user_ns = fc->global ? &init_user_ns : fc->user_ns; + int err; + + if (!(fc->sb_flags & SB_KERNMOUNT) && + fc->purpose != FS_CONTEXT_FOR_SUBMOUNT) { + /* Don't allow mounting unless the caller has CAP_SYS_ADMIN + * over the namespace. 
+ */ + if (!(fc->fs_type->fs_flags & FS_USERNS_MOUNT)) { + if (!capable(CAP_SYS_ADMIN)) + return ERR_PTR(-EPERM); + } else { + if (!ns_capable(fc->user_ns, CAP_SYS_ADMIN)) + return ERR_PTR(-EPERM); + } + } + +retry: + spin_lock(&sb_lock); + if (test) { + hlist_for_each_entry(old, &fc->fs_type->fs_supers, s_instances) { + if (test(old, fc)) + goto share_extant_sb; + } + } + if (!s) { + spin_unlock(&sb_lock); + s = alloc_super(fc->fs_type, fc->sb_flags, user_ns); + if (!s) + return ERR_PTR(-ENOMEM); + goto retry; + } + + s->s_fs_info = fc->s_fs_info; + err = set(s, fc); + if (err) { + s->s_fs_info = NULL; + spin_unlock(&sb_lock); + destroy_unused_super(s); + return ERR_PTR(err); + } + fc->s_fs_info = NULL; + s->s_type = fc->fs_type; + strlcpy(s->s_id, s->s_type->name, sizeof(s->s_id)); + list_add_tail(&s->s_list, &super_blocks); + hlist_add_head(&s->s_instances, &s->s_type->fs_supers); + spin_unlock(&sb_lock); + get_filesystem(s->s_type); + register_shrinker_prepared(&s->s_shrink); + return s; + +share_extant_sb: + if (user_ns != old->s_user_ns) { + spin_unlock(&sb_lock); + destroy_unused_super(s); + return ERR_PTR(-EBUSY); + } + if (!grab_super(old)) + goto retry; + destroy_unused_super(s); + return old; +} +EXPORT_SYMBOL(sget_fc); + /** * sget_userns - find or create a superblock * @type: filesystem type superblock should belong to @@ -1103,6 +1191,89 @@ struct dentry *mount_ns(struct file_system_type *fs_type, EXPORT_SYMBOL(mount_ns); +int set_anon_super_fc(struct super_block *sb, struct fs_context *fc) +{ + return set_anon_super(sb, NULL); +} +EXPORT_SYMBOL(set_anon_super_fc); + +static int test_keyed_super(struct super_block *sb, struct fs_context *fc) +{ + return sb->s_fs_info == fc->s_fs_info; +} + +static int test_single_super(struct super_block *s, struct fs_context *fc) +{ + return 1; +} + +/** + * vfs_get_super - Get a superblock with a search key set in s_fs_info. 
+ * @fc: The filesystem context holding the parameters + * @keying: How to distinguish superblocks + * @fill_super: Helper to initialise a new superblock + * + * Search for a superblock and create a new one if not found. The search + * criterion is controlled by @keying. If the search fails, a new superblock + * is created and @fill_super() is called to initialise it. + * + * @keying can take one of a number of values: + * + * (1) vfs_get_single_super - Only one superblock of this type may exist on the + * system. This is typically used for special system filesystems. + * + * (2) vfs_get_keyed_super - Multiple superblocks may exist, but they must have + * distinct keys (where the key is in s_fs_info). Searching for the same + * key again will turn up the superblock for that key. + * + * (3) vfs_get_independent_super - Multiple superblocks may exist and are + * unkeyed. Each call will get a new superblock. + * + * A permissions check is made by sget_fc() unless we're getting a superblock + * for a kernel-internal mount or a submount. 
+ */ +int vfs_get_super(struct fs_context *fc, + enum vfs_get_super_keying keying, + int (*fill_super)(struct super_block *sb, + struct fs_context *fc)) +{ + int (*test)(struct super_block *, struct fs_context *); + struct super_block *sb; + + switch (keying) { + case vfs_get_single_super: + test = test_single_super; + break; + case vfs_get_keyed_super: + test = test_keyed_super; + break; + case vfs_get_independent_super: + test = NULL; + break; + default: + BUG(); + } + + sb = sget_fc(fc, test, set_anon_super_fc); + if (IS_ERR(sb)) + return PTR_ERR(sb); + + if (!sb->s_root) { + int err = fill_super(sb, fc); + if (err) { + deactivate_locked_super(sb); + return err; + } + + sb->s_flags |= SB_ACTIVE; + } + + BUG_ON(fc->root); + fc->root = dget(sb->s_root); + return 0; +} +EXPORT_SYMBOL(vfs_get_super); + #ifdef CONFIG_BLOCK static int set_bdev_super(struct super_block *s, void *data) { diff --git a/include/linux/fs.h b/include/linux/fs.h index cf6e9ea161eb..9d05c128ccf6 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2232,8 +2232,12 @@ void kill_litter_super(struct super_block *sb); void deactivate_super(struct super_block *sb); void deactivate_locked_super(struct super_block *sb); int set_anon_super(struct super_block *s, void *data); +int set_anon_super_fc(struct super_block *s, struct fs_context *fc); int get_anon_bdev(dev_t *); void free_anon_bdev(dev_t); +struct super_block *sget_fc(struct fs_context *fc, + int (*test)(struct super_block *, struct fs_context *), + int (*set)(struct super_block *, struct fs_context *)); struct super_block *sget_userns(struct file_system_type *type, int (*test)(struct super_block *,void *), int (*set)(struct super_block *,void *), diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index d794b04e9fbb..b1a95db7a111 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -83,11 +83,13 @@ struct fs_context { const char *source; /* The source name (eg. 
dev path) */ const char *subtype; /* The subtype to set on the superblock */ void *security; /* Linux S&M options */ + void *s_fs_info; /* Proposed s_fs_info */ unsigned int sb_flags; /* Proposed superblock flags (SB_*) */ unsigned int sb_flags_mask; /* Superblock flags that were changed */ unsigned int lsm_flags; /* Information flags from the fs to the LSM */ enum fs_context_purpose purpose:8; bool need_free:1; /* Need to call ops->free() */ + bool global:1; /* Goes into &init_user_ns */ }; struct fs_context_operations { @@ -116,6 +118,19 @@ extern int generic_parse_monolithic(struct fs_context *fc, void *data); extern int vfs_get_tree(struct fs_context *fc); extern void put_fs_context(struct fs_context *fc); +/* + * sget() wrapper to be called from the ->get_tree() op. + */ +enum vfs_get_super_keying { + vfs_get_single_super, /* Only one such superblock may exist */ + vfs_get_keyed_super, /* Superblocks with different s_fs_info keys may exist */ + vfs_get_independent_super, /* Multiple independent superblocks may exist */ +}; +extern int vfs_get_super(struct fs_context *fc, + enum vfs_get_super_keying keying, + int (*fill_super)(struct super_block *sb, + struct fs_context *fc)); + #define logfc(FC, FMT, ...) 
pr_notice(FMT, ## __VA_ARGS__) /** From patchwork Tue Feb 19 16:31:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820243 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A93D914E1 for ; Tue, 19 Feb 2019 16:31:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8CFAC2CD0C for ; Tue, 19 Feb 2019 16:31:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8AC932CD21; Tue, 19 Feb 2019 16:31:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AF08D2CD2E for ; Tue, 19 Feb 2019 16:31:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729176AbfBSQbb (ORCPT ); Tue, 19 Feb 2019 11:31:31 -0500 Received: from mx1.redhat.com ([209.132.183.28]:38074 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729066AbfBSQbb (ORCPT ); Tue, 19 Feb 2019 11:31:31 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 2BE94A786B; Tue, 19 Feb 2019 16:31:30 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8A0CE19C57; Tue, 19 Feb 2019 16:31:28 +0000 (UTC) Organization: Red Hat UK Ltd. 
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 22/43] introduce cloning of fs_context From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:31:27 +0000 Message-ID: <155059388775.12449.6588682823855626275.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Tue, 19 Feb 2019 16:31:30 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro new primitive: vfs_dup_fs_context(). Comes with fs_context method (->dup()) for copying the filesystem-specific parts of fs_context, along with LSM one (->fs_context_dup()) for doing the same to LSM parts. 
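The shape of the new primitive can be sketched in user space as follows. This is only an illustration of the pattern described above (shallow-copy the context, clear the pointers the copy must not share, then let a per-filesystem ->dup() and a separate security step deep-copy their own parts). Every demo_* name is hypothetical and none of this is kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct demo_ctx;

/* Hypothetical stand-in for fs_context_operations. */
struct demo_ops {
	int (*dup)(struct demo_ctx *dst, const struct demo_ctx *src);
	void (*free)(struct demo_ctx *ctx);
};

/* Hypothetical stand-in for struct fs_context. */
struct demo_ctx {
	const struct demo_ops *ops;
	void *fs_private;	/* owned, filesystem-specific part */
	char *security;		/* owned, security-module part */
	unsigned int sb_flags;	/* plain value, safe to copy bitwise */
};

static void demo_put_ctx(struct demo_ctx *ctx)
{
	if (!ctx)
		return;
	if (ctx->ops && ctx->ops->free)
		ctx->ops->free(ctx);
	free(ctx->security);
	free(ctx);
}

/* Deep-copy the security part; nothing to do if the source has none. */
static int demo_security_dup(struct demo_ctx *dst, const struct demo_ctx *src)
{
	size_t len;

	if (!src->security)
		return 0;
	len = strlen(src->security) + 1;
	dst->security = malloc(len);
	if (!dst->security)
		return -ENOMEM;
	memcpy(dst->security, src->security, len);
	return 0;
}

static int demo_dup_ctx(const struct demo_ctx *src, struct demo_ctx **out)
{
	struct demo_ctx *dst;
	int ret;

	if (!src->ops->dup)
		return -EOPNOTSUPP;

	dst = malloc(sizeof(*dst));
	if (!dst)
		return -ENOMEM;
	memcpy(dst, src, sizeof(*dst));

	/* Clear owned pointers first so cleanup never frees the source's data. */
	dst->fs_private = NULL;
	dst->security = NULL;

	ret = src->ops->dup(dst, src);
	if (ret == 0)
		ret = demo_security_dup(dst, src);
	if (ret < 0) {
		demo_put_ctx(dst);
		return ret;
	}
	*out = dst;
	return 0;
}

/* Example filesystem-specific part and its dup/free methods. */
struct demo_private {
	int hidepid;
};

static int demo_private_dup(struct demo_ctx *dst, const struct demo_ctx *src)
{
	dst->fs_private = malloc(sizeof(struct demo_private));
	if (!dst->fs_private)
		return -ENOMEM;
	memcpy(dst->fs_private, src->fs_private, sizeof(struct demo_private));
	return 0;
}

static void demo_private_free(struct demo_ctx *ctx)
{
	free(ctx->fs_private);
}

static const struct demo_ops demo_ops = {
	.dup = demo_private_dup,
	.free = demo_private_free,
};

/* Duplicate a stack-built context and check the copy is independent. */
int demo_selftest(void)
{
	struct demo_private priv = { .hidepid = 2 };
	char label[] = "example-label";
	struct demo_ctx src = {
		.ops = &demo_ops, .fs_private = &priv,
		.security = label, .sb_flags = 1,
	};
	struct demo_ctx *copy = NULL;

	if (demo_dup_ctx(&src, &copy) < 0)
		return -1;
	assert(copy->fs_private != src.fs_private);
	assert(copy->security != src.security);
	assert(((struct demo_private *)copy->fs_private)->hidepid == 2);
	assert(strcmp(copy->security, "example-label") == 0);
	demo_put_ctx(copy);
	return 0;
}
```

Clearing the owned pointers before calling ->dup() is what makes the single error path safe in this sketch: whichever step fails, the put routine only ever frees memory that the copy itself allocated.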
[needs better commit message, and change of Author:, anyway] Signed-off-by: Al Viro --- fs/fs_context.c | 67 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/fs_context.h | 2 + include/linux/lsm_hooks.h | 7 +++++ include/linux/security.h | 6 ++++ security/security.c | 5 +++ security/selinux/hooks.c | 39 ++++++++++++++++++++++++++ security/smack/smack_lsm.c | 49 ++++++++++++++++++++++++++++++++ 7 files changed, 175 insertions(+) diff --git a/fs/fs_context.c b/fs/fs_context.c index aa7e0ffb591a..57f61833ac83 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -337,6 +337,47 @@ void fc_drop_locked(struct fs_context *fc) static void legacy_fs_context_free(struct fs_context *fc); +/** + * vfs_dup_fs_context - Duplicate a filesystem context. + * @src_fc: The context to copy. + */ +struct fs_context *vfs_dup_fs_context(struct fs_context *src_fc) +{ + struct fs_context *fc; + int ret; + + if (!src_fc->ops->dup) + return ERR_PTR(-EOPNOTSUPP); + + fc = kmemdup(src_fc, sizeof(struct fs_context), GFP_KERNEL); + if (!fc) + return ERR_PTR(-ENOMEM); + + fc->fs_private = NULL; + fc->s_fs_info = NULL; + fc->source = NULL; + fc->security = NULL; + get_filesystem(fc->fs_type); + get_net(fc->net_ns); + get_user_ns(fc->user_ns); + get_cred(fc->cred); + + /* Can't call put until we've called ->dup */ + ret = fc->ops->dup(fc, src_fc); + if (ret < 0) + goto err_fc; + + ret = security_fs_context_dup(fc, src_fc); + if (ret < 0) + goto err_fc; + return fc; + +err_fc: + put_fs_context(fc); + return ERR_PTR(ret); +} +EXPORT_SYMBOL(vfs_dup_fs_context); + /** * put_fs_context - Dispose of a superblock configuration context. * @fc: The context to dispose of. @@ -380,6 +421,31 @@ static void legacy_fs_context_free(struct fs_context *fc) } } +/* + * Duplicate a legacy config.
+ */ +static int legacy_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc) +{ + struct legacy_fs_context *ctx; + struct legacy_fs_context *src_ctx = src_fc->fs_private; + + ctx = kmemdup(src_ctx, sizeof(*src_ctx), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + if (ctx->param_type == LEGACY_FS_INDIVIDUAL_PARAMS) { + ctx->legacy_data = kmemdup(src_ctx->legacy_data, + src_ctx->data_size, GFP_KERNEL); + if (!ctx->legacy_data) { + kfree(ctx); + return -ENOMEM; + } + } + + fc->fs_private = ctx; + return 0; +} + /* * Add a parameter to a legacy config. We build up a comma-separated list of * options. @@ -514,6 +580,7 @@ static int legacy_reconfigure(struct fs_context *fc) const struct fs_context_operations legacy_fs_context_ops = { .free = legacy_fs_context_free, + .dup = legacy_fs_context_dup, .parse_param = legacy_parse_param, .parse_monolithic = legacy_parse_monolithic, .get_tree = legacy_get_tree, diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index b1a95db7a111..0db0b645c7b8 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -94,6 +94,7 @@ struct fs_context { struct fs_context_operations { void (*free)(struct fs_context *fc); + int (*dup)(struct fs_context *fc, struct fs_context *src_fc); int (*parse_param)(struct fs_context *fc, struct fs_parameter *param); int (*parse_monolithic)(struct fs_context *fc, void *data); int (*get_tree)(struct fs_context *fc); @@ -111,6 +112,7 @@ extern struct fs_context *fs_context_for_reconfigure(struct dentry *dentry, extern struct fs_context *fs_context_for_submount(struct file_system_type *fs_type, struct dentry *reference); +extern struct fs_context *vfs_dup_fs_context(struct fs_context *fc); extern int vfs_parse_fs_param(struct fs_context *fc, struct fs_parameter *param); extern int vfs_parse_fs_string(struct fs_context *fc, const char *key, const char *value, size_t v_size); diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h index 
47ba4db4d8fb..356e78fe90a8 100644 --- a/include/linux/lsm_hooks.h +++ b/include/linux/lsm_hooks.h @@ -79,6 +79,11 @@ * Security hooks for mount using fs_context. * [See also Documentation/filesystems/mounting.txt] * + * @fs_context_dup: + * Allocate and attach a security structure to fc->security. This pointer + * is initialised to NULL by the caller. + * @fc indicates the new filesystem context. + * @src_fc indicates the original filesystem context. * @fs_context_parse_param: * Userspace provided a parameter to configure a superblock. The LSM may * reject it with an error and may use it for itself, in which case it @@ -1470,6 +1475,7 @@ union security_list_options { void (*bprm_committing_creds)(struct linux_binprm *bprm); void (*bprm_committed_creds)(struct linux_binprm *bprm); + int (*fs_context_dup)(struct fs_context *fc, struct fs_context *src_fc); int (*fs_context_parse_param)(struct fs_context *fc, struct fs_parameter *param); int (*sb_alloc_security)(struct super_block *sb); @@ -1813,6 +1819,7 @@ struct security_hook_heads { struct hlist_head bprm_check_security; struct hlist_head bprm_committing_creds; struct hlist_head bprm_committed_creds; + struct hlist_head fs_context_dup; struct hlist_head fs_context_parse_param; struct hlist_head sb_alloc_security; struct hlist_head sb_free_security; diff --git a/include/linux/security.h b/include/linux/security.h index 2da9336a987e..f28a1ebfd78e 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -223,6 +223,7 @@ int security_bprm_set_creds(struct linux_binprm *bprm); int security_bprm_check(struct linux_binprm *bprm); void security_bprm_committing_creds(struct linux_binprm *bprm); void security_bprm_committed_creds(struct linux_binprm *bprm); +int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc); int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param); int security_sb_alloc(struct super_block *sb); void security_sb_free(struct super_block
*sb); @@ -521,6 +522,11 @@ static inline void security_bprm_committed_creds(struct linux_binprm *bprm) { } +static inline int security_fs_context_dup(struct fs_context *fc, + struct fs_context *src_fc) +{ + return 0; +} static inline int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param) { diff --git a/security/security.c b/security/security.c index e5519488327d..5759339319dc 100644 --- a/security/security.c +++ b/security/security.c @@ -374,6 +374,11 @@ void security_bprm_committed_creds(struct linux_binprm *bprm) call_void_hook(bprm_committed_creds, bprm); } +int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc) +{ + return call_int_hook(fs_context_dup, 0, fc, src_fc); +} + int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param) { return call_int_hook(fs_context_parse_param, -ENOPARAM, fc, param); diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index f99381e97d73..4ba83de5fa80 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -2764,6 +2764,44 @@ static int selinux_umount(struct vfsmount *mnt, int flags) FILESYSTEM__UNMOUNT, NULL); } +static int selinux_fs_context_dup(struct fs_context *fc, + struct fs_context *src_fc) +{ + const struct selinux_mnt_opts *src = src_fc->security; + struct selinux_mnt_opts *opts; + + if (!src) + return 0; + + fc->security = kzalloc(sizeof(struct selinux_mnt_opts), GFP_KERNEL); + if (!fc->security) + return -ENOMEM; + + opts = fc->security; + + if (src->fscontext) { + opts->fscontext = kstrdup(src->fscontext, GFP_KERNEL); + if (!opts->fscontext) + return -ENOMEM; + } + if (src->context) { + opts->context = kstrdup(src->context, GFP_KERNEL); + if (!opts->context) + return -ENOMEM; + } + if (src->rootcontext) { + opts->rootcontext = kstrdup(src->rootcontext, GFP_KERNEL); + if (!opts->rootcontext) + return -ENOMEM; + } + if (src->defcontext) { + opts->defcontext = kstrdup(src->defcontext, GFP_KERNEL); + if 
(!opts->defcontext) + return -ENOMEM; + } + return 0; +} + static const struct fs_parameter_spec selinux_param_specs[] = { fsparam_string(CONTEXT_STR, Opt_context), fsparam_string(DEFCONTEXT_STR, Opt_defcontext), @@ -6745,6 +6783,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = { LSM_HOOK_INIT(bprm_committing_creds, selinux_bprm_committing_creds), LSM_HOOK_INIT(bprm_committed_creds, selinux_bprm_committed_creds), + LSM_HOOK_INIT(fs_context_dup, selinux_fs_context_dup), LSM_HOOK_INIT(fs_context_parse_param, selinux_fs_context_parse_param), LSM_HOOK_INIT(sb_alloc_security, selinux_sb_alloc_security), diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 5f93c4f84384..03176f600a87 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -647,6 +647,54 @@ static int smack_add_opt(int token, const char *s, void **mnt_opts) return -EINVAL; } +/** + * smack_fs_context_dup - Duplicate the security data on fs_context duplication + * @fc: The new filesystem context. + * @src_fc: The source filesystem context being duplicated. + * + * Returns 0 on success or -ENOMEM on error. 
+ */ +static int smack_fs_context_dup(struct fs_context *fc, + struct fs_context *src_fc) +{ + struct smack_mnt_opts *dst, *src = src_fc->security; + + if (!src) + return 0; + + fc->security = kzalloc(sizeof(struct smack_mnt_opts), GFP_KERNEL); + if (!fc->security) + return -ENOMEM; + dst = fc->security; + + if (src->fsdefault) { + dst->fsdefault = kstrdup(src->fsdefault, GFP_KERNEL); + if (!dst->fsdefault) + return -ENOMEM; + } + if (src->fsfloor) { + dst->fsfloor = kstrdup(src->fsfloor, GFP_KERNEL); + if (!dst->fsfloor) + return -ENOMEM; + } + if (src->fshat) { + dst->fshat = kstrdup(src->fshat, GFP_KERNEL); + if (!dst->fshat) + return -ENOMEM; + } + if (src->fsroot) { + dst->fsroot = kstrdup(src->fsroot, GFP_KERNEL); + if (!dst->fsroot) + return -ENOMEM; + } + if (src->fstransmute) { + dst->fstransmute = kstrdup(src->fstransmute, GFP_KERNEL); + if (!dst->fstransmute) + return -ENOMEM; + } + return 0; +} + static const struct fs_parameter_spec smack_param_specs[] = { fsparam_string("fsdefault", Opt_fsdefault), fsparam_string("fsfloor", Opt_fsfloor), @@ -4626,6 +4674,7 @@ static struct security_hook_list smack_hooks[] __lsm_ro_after_init = { LSM_HOOK_INIT(ptrace_traceme, smack_ptrace_traceme), LSM_HOOK_INIT(syslog, smack_syslog), + LSM_HOOK_INIT(fs_context_dup, smack_fs_context_dup), LSM_HOOK_INIT(fs_context_parse_param, smack_fs_context_parse_param), LSM_HOOK_INIT(sb_alloc_security, smack_sb_alloc_security), From patchwork Tue Feb 19 16:31:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820247 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 43C1B14E1 for ; Tue, 19 Feb 2019 16:31:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2B53E2CD21 for ; Tue, 
19 Feb 2019 16:31:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 28F8A2CD11; Tue, 19 Feb 2019 16:31:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8DE0F2CD31 for ; Tue, 19 Feb 2019 16:31:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727957AbfBSQbi (ORCPT ); Tue, 19 Feb 2019 11:31:38 -0500 Received: from mx1.redhat.com ([209.132.183.28]:44536 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726357AbfBSQbh (ORCPT ); Tue, 19 Feb 2019 11:31:37 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B2033C049E24; Tue, 19 Feb 2019 16:31:37 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3180419C58; Tue, 19 Feb 2019 16:31:36 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 23/43] procfs: Move proc_fill_super() to fs/proc/root.c From: David Howells To: viro@zeniv.linux.org.uk Cc: Alexey Dobriyan , Alexey Dobriyan , linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:31:35 +0000 Message-ID: <155059389539.12449.17955800242100461744.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Tue, 19 Feb 2019 16:31:37 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Move proc_fill_super() to fs/proc/root.c as that's where the other superblock stuff is. 
Signed-off-by: David Howells Reviewed-by: Alexey Dobriyan cc: Alexey Dobriyan Signed-off-by: Al Viro --- fs/proc/inode.c | 51 +-------------------------------------------------- fs/proc/internal.h | 4 +--- fs/proc/root.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 52 insertions(+), 54 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index da649ccd6804..17b5261206dd 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -24,7 +24,6 @@ #include #include #include -#include #include @@ -122,7 +121,7 @@ static int proc_show_options(struct seq_file *seq, struct dentry *root) return 0; } -static const struct super_operations proc_sops = { +const struct super_operations proc_sops = { .alloc_inode = proc_alloc_inode, .destroy_inode = proc_destroy_inode, .drop_inode = generic_delete_inode, @@ -488,51 +487,3 @@ struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de) pde_put(de); return inode; } - -int proc_fill_super(struct super_block *s, void *data, int silent) -{ - struct pid_namespace *ns = get_pid_ns(s->s_fs_info); - struct inode *root_inode; - int ret; - - if (!proc_parse_options(data, ns)) - return -EINVAL; - - /* User space would break if executables or devices appear on proc */ - s->s_iflags |= SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV; - s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC; - s->s_blocksize = 1024; - s->s_blocksize_bits = 10; - s->s_magic = PROC_SUPER_MAGIC; - s->s_op = &proc_sops; - s->s_time_gran = 1; - - /* - * procfs isn't actually a stacking filesystem; however, there is - * too much magic going on inside it to permit stacking things on - * top of it - */ - s->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH; - - /* procfs dentries and inodes don't require IO to create */ - s->s_shrink.seeks = 0; - - pde_get(&proc_root); - root_inode = proc_get_inode(s, &proc_root); - if (!root_inode) { - pr_err("proc_fill_super: get root inode failed\n"); - return -ENOMEM; - } - - s->s_root = 
d_make_root(root_inode); - if (!s->s_root) { - pr_err("proc_fill_super: allocate dentry failed\n"); - return -ENOMEM; - } - - ret = proc_setup_self(s); - if (ret) { - return ret; - } - return proc_setup_thread_self(s); -} diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 5185d7f6a51e..97157c0410a2 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -205,13 +205,12 @@ struct pde_opener { struct completion *c; } __randomize_layout; extern const struct inode_operations proc_link_inode_operations; - extern const struct inode_operations proc_pid_link_inode_operations; +extern const struct super_operations proc_sops; void proc_init_kmemcache(void); void set_proc_pid_nlink(void); extern struct inode *proc_get_inode(struct super_block *, struct proc_dir_entry *); -extern int proc_fill_super(struct super_block *, void *data, int flags); extern void proc_entry_rundown(struct proc_dir_entry *); /* @@ -269,7 +268,6 @@ static inline void proc_tty_init(void) {} * root.c */ extern struct proc_dir_entry proc_root; -extern int proc_parse_options(char *options, struct pid_namespace *pid); extern void proc_self_init(void); extern int proc_remount(struct super_block *, int *, char *); diff --git a/fs/proc/root.c b/fs/proc/root.c index f4b1a9d2eca6..fe4f64b3250b 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -23,6 +23,7 @@ #include #include #include +#include #include "internal.h" @@ -36,7 +37,7 @@ static const match_table_t tokens = { {Opt_err, NULL}, }; -int proc_parse_options(char *options, struct pid_namespace *pid) +static int proc_parse_options(char *options, struct pid_namespace *pid) { char *p; substring_t args[MAX_OPT_ARGS]; @@ -78,6 +79,54 @@ int proc_parse_options(char *options, struct pid_namespace *pid) return 1; } +static int proc_fill_super(struct super_block *s, void *data, int silent) +{ + struct pid_namespace *ns = get_pid_ns(s->s_fs_info); + struct inode *root_inode; + int ret; + + if (!proc_parse_options(data, ns)) + return -EINVAL; + + /* User 
space would break if executables or devices appear on proc */ + s->s_iflags |= SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV; + s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC; + s->s_blocksize = 1024; + s->s_blocksize_bits = 10; + s->s_magic = PROC_SUPER_MAGIC; + s->s_op = &proc_sops; + s->s_time_gran = 1; + + /* + * procfs isn't actually a stacking filesystem; however, there is + * too much magic going on inside it to permit stacking things on + * top of it + */ + s->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH; + + /* procfs dentries and inodes don't require IO to create */ + s->s_shrink.seeks = 0; + + pde_get(&proc_root); + root_inode = proc_get_inode(s, &proc_root); + if (!root_inode) { + pr_err("proc_fill_super: get root inode failed\n"); + return -ENOMEM; + } + + s->s_root = d_make_root(root_inode); + if (!s->s_root) { + pr_err("proc_fill_super: allocate dentry failed\n"); + return -ENOMEM; + } + + ret = proc_setup_self(s); + if (ret) { + return ret; + } + return proc_setup_thread_self(s); +} + int proc_remount(struct super_block *sb, int *flags, char *data) { struct pid_namespace *pid = sb->s_fs_info; From patchwork Tue Feb 19 16:31:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820251 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 610E01805 for ; Tue, 19 Feb 2019 16:31:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4A7B22CCFA for ; Tue, 19 Feb 2019 16:31:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 484F42CD10; Tue, 19 Feb 2019 16:31:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 
tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 98F332CD15 for ; Tue, 19 Feb 2019 16:31:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726565AbfBSQbq (ORCPT ); Tue, 19 Feb 2019 11:31:46 -0500 Received: from mx1.redhat.com ([209.132.183.28]:41906 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726357AbfBSQbp (ORCPT ); Tue, 19 Feb 2019 11:31:45 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1236EC059B7A; Tue, 19 Feb 2019 16:31:45 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9E66853; Tue, 19 Feb 2019 16:31:43 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 24/43] proc: Add fs_context support to procfs From: David Howells To: viro@zeniv.linux.org.uk Cc: Alexey Dobriyan , linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:31:42 +0000 Message-ID: <155059390291.12449.10527077827963704179.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Tue, 19 Feb 2019 16:31:45 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Add fs_context support to procfs. Signed-off-by: David Howells cc: Alexey Dobriyan Signed-off-by: Al Viro --- fs/proc/inode.c | 1 fs/proc/internal.h | 1 fs/proc/root.c | 195 ++++++++++++++++++++++++++++++++++------------------ 3 files changed, 129 insertions(+), 68 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index 17b5261206dd..fc7e38def174 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -127,7 +127,6 @@ const struct super_operations proc_sops = { .drop_inode = generic_delete_inode, .evict_inode = proc_evict_inode, .statfs = simple_statfs, - .remount_fs = proc_remount, .show_options = proc_show_options, }; diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 97157c0410a2..40f905143d39 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -270,7 +270,6 @@ static inline void proc_tty_init(void) {} extern struct proc_dir_entry proc_root; extern void proc_self_init(void); -extern int proc_remount(struct super_block *, int *, char *); /* * task_[no]mmu.c diff --git a/fs/proc/root.c b/fs/proc/root.c index 
fe4f64b3250b..6927b29ece76 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -19,74 +19,89 @@ #include #include #include +#include #include #include -#include +#include #include #include +#include #include "internal.h" -enum { - Opt_gid, Opt_hidepid, Opt_err, +struct proc_fs_context { + struct pid_namespace *pid_ns; + unsigned int mask; + int hidepid; + int gid; }; -static const match_table_t tokens = { - {Opt_hidepid, "hidepid=%u"}, - {Opt_gid, "gid=%u"}, - {Opt_err, NULL}, +enum proc_param { + Opt_gid, + Opt_hidepid, }; -static int proc_parse_options(char *options, struct pid_namespace *pid) +static const struct fs_parameter_spec proc_param_specs[] = { + fsparam_u32("gid", Opt_gid), + fsparam_u32("hidepid", Opt_hidepid), + {} +}; + +static const struct fs_parameter_description proc_fs_parameters = { + .name = "proc", + .specs = proc_param_specs, +}; + +static int proc_parse_param(struct fs_context *fc, struct fs_parameter *param) { - char *p; - substring_t args[MAX_OPT_ARGS]; - int option; - - if (!options) - return 1; - - while ((p = strsep(&options, ",")) != NULL) { - int token; - if (!*p) - continue; - - args[0].to = args[0].from = NULL; - token = match_token(p, tokens, args); - switch (token) { - case Opt_gid: - if (match_int(&args[0], &option)) - return 0; - pid->pid_gid = make_kgid(current_user_ns(), option); - break; - case Opt_hidepid: - if (match_int(&args[0], &option)) - return 0; - if (option < HIDEPID_OFF || - option > HIDEPID_INVISIBLE) { - pr_err("proc: hidepid value must be between 0 and 2.\n"); - return 0; - } - pid->hide_pid = option; - break; - default: - pr_err("proc: unrecognized mount option \"%s\" " - "or missing value\n", p); - return 0; - } + struct proc_fs_context *ctx = fc->fs_private; + struct fs_parse_result result; + int opt; + + opt = fs_parse(fc, &proc_fs_parameters, param, &result); + if (opt < 0) + return opt; + + switch (opt) { + case Opt_gid: + ctx->gid = result.uint_32; + break; + + case Opt_hidepid: + ctx->hidepid = 
result.uint_32; + if (ctx->hidepid < HIDEPID_OFF || + ctx->hidepid > HIDEPID_INVISIBLE) + return invalf(fc, "proc: hidepid value must be between 0 and 2.\n"); + break; + + default: + return -EINVAL; } - return 1; + ctx->mask |= 1 << opt; + return 0; +} + +static void proc_apply_options(struct super_block *s, + struct fs_context *fc, + struct pid_namespace *pid_ns, + struct user_namespace *user_ns) +{ + struct proc_fs_context *ctx = fc->fs_private; + + if (ctx->mask & (1 << Opt_gid)) + pid_ns->pid_gid = make_kgid(user_ns, ctx->gid); + if (ctx->mask & (1 << Opt_hidepid)) + pid_ns->hide_pid = ctx->hidepid; } -static int proc_fill_super(struct super_block *s, void *data, int silent) +static int proc_fill_super(struct super_block *s, struct fs_context *fc) { - struct pid_namespace *ns = get_pid_ns(s->s_fs_info); + struct pid_namespace *pid_ns = get_pid_ns(s->s_fs_info); struct inode *root_inode; int ret; - if (!proc_parse_options(data, ns)) - return -EINVAL; + proc_apply_options(s, fc, pid_ns, current_user_ns()); /* User space would break if executables or devices appear on proc */ s->s_iflags |= SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV; @@ -127,27 +142,55 @@ static int proc_fill_super(struct super_block *s, void *data, int silent) return proc_setup_thread_self(s); } -int proc_remount(struct super_block *sb, int *flags, char *data) +static int proc_reconfigure(struct fs_context *fc) { + struct super_block *sb = fc->root->d_sb; struct pid_namespace *pid = sb->s_fs_info; sync_filesystem(sb); - return !proc_parse_options(data, pid); + + proc_apply_options(sb, fc, pid, current_user_ns()); + return 0; } -static struct dentry *proc_mount(struct file_system_type *fs_type, - int flags, const char *dev_name, void *data) +static int proc_get_tree(struct fs_context *fc) { - struct pid_namespace *ns; + struct proc_fs_context *ctx = fc->fs_private; - if (flags & SB_KERNMOUNT) { - ns = data; - data = NULL; - } else { - ns = task_active_pid_ns(current); - } + 
put_user_ns(fc->user_ns); + fc->user_ns = get_user_ns(ctx->pid_ns->user_ns); + fc->s_fs_info = ctx->pid_ns; + return vfs_get_super(fc, vfs_get_keyed_super, proc_fill_super); +} + +static void proc_fs_context_free(struct fs_context *fc) +{ + struct proc_fs_context *ctx = fc->fs_private; + + if (ctx->pid_ns) + put_pid_ns(ctx->pid_ns); + kfree(ctx); +} + +static const struct fs_context_operations proc_fs_context_ops = { + .free = proc_fs_context_free, + .parse_param = proc_parse_param, + .get_tree = proc_get_tree, + .reconfigure = proc_reconfigure, +}; + +static int proc_init_fs_context(struct fs_context *fc) +{ + struct proc_fs_context *ctx; + + ctx = kzalloc(sizeof(struct proc_fs_context), GFP_KERNEL); + if (!ctx) + return -ENOMEM; - return mount_ns(fs_type, flags, data, ns, ns->user_ns, proc_fill_super); + ctx->pid_ns = get_pid_ns(task_active_pid_ns(current)); + fc->fs_private = ctx; + fc->ops = &proc_fs_context_ops; + return 0; } static void proc_kill_sb(struct super_block *sb) @@ -164,10 +207,11 @@ static void proc_kill_sb(struct super_block *sb) } static struct file_system_type proc_fs_type = { - .name = "proc", - .mount = proc_mount, - .kill_sb = proc_kill_sb, - .fs_flags = FS_USERNS_MOUNT, + .name = "proc", + .init_fs_context = proc_init_fs_context, + .parameters = &proc_fs_parameters, + .kill_sb = proc_kill_sb, + .fs_flags = FS_USERNS_MOUNT, }; void __init proc_root_init(void) @@ -205,7 +249,7 @@ static struct dentry *proc_root_lookup(struct inode * dir, struct dentry * dentr { if (!proc_pid_lookup(dir, dentry, flags)) return NULL; - + return proc_lookup(dir, dentry, flags); } @@ -258,9 +302,28 @@ struct proc_dir_entry proc_root = { int pid_ns_prepare_proc(struct pid_namespace *ns) { + struct proc_fs_context *ctx; + struct fs_context *fc; struct vfsmount *mnt; - mnt = kern_mount_data(&proc_fs_type, ns); + fc = fs_context_for_mount(&proc_fs_type, SB_KERNMOUNT); + if (IS_ERR(fc)) + return PTR_ERR(fc); + + if (fc->user_ns != ns->user_ns) { + 
put_user_ns(fc->user_ns); + fc->user_ns = get_user_ns(ns->user_ns); + } + + ctx = fc->fs_private; + if (ctx->pid_ns != ns) { + put_pid_ns(ctx->pid_ns); + get_pid_ns(ns); + ctx->pid_ns = ns; + } + + mnt = fc_mount(fc); + put_fs_context(fc); if (IS_ERR(mnt)) return PTR_ERR(mnt); From patchwork Tue Feb 19 16:31:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820255 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 698C714E1 for ; Tue, 19 Feb 2019 16:31:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5051B2CD4B for ; Tue, 19 Feb 2019 16:31:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4C6062CD4F; Tue, 19 Feb 2019 16:31:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 924E22CD11 for ; Tue, 19 Feb 2019 16:31:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729020AbfBSQbx (ORCPT ); Tue, 19 Feb 2019 11:31:53 -0500 Received: from mx1.redhat.com ([209.132.183.28]:49678 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726357AbfBSQbx (ORCPT ); Tue, 19 Feb 2019 11:31:53 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5C34080467; Tue, 19 Feb 2019 16:31:52 +0000 (UTC) Received: 
from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0682719C58; Tue, 19 Feb 2019 16:31:50 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 25/43] ipc: Convert mqueue fs to fs_context From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:31:50 +0000 Message-ID: <155059391026.12449.668855607210599414.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Tue, 19 Feb 2019 16:31:52 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP

Convert the mqueue filesystem to use the filesystem context stuff.

Notes:

 (1) The relevant ipc namespace is selected when the context is initialised (and it defaults to the current task's ipc namespace). The caller can override this before calling vfs_get_tree().

 (2) Rather than simply calling kern_mount_data(), mq_init_ns() and mq_internal_mount() create a context, adjust it and then do the rest of the mount procedure.

 (3) The lazy mqueue mounting on creation of a new namespace is retained from a previous patch, but the avoidance of sget() if no superblock yet exists is reverted and the superblock is again keyed on the namespace pointer.
Yes, there was a performance gain in not searching the superblock hash, but it's only paid once per ipc namespace - and only if someone uses mqueue within that namespace, so I'm not sure it's worth it, especially as calling sget() allows avoidance of recursion. Signed-off-by: David Howells Signed-off-by: Al Viro --- ipc/mqueue.c | 94 ++++++++++++++++++++++++++++++++++++++++++------------- ipc/namespace.c | 2 + 2 files changed, 73 insertions(+), 23 deletions(-) diff --git a/ipc/mqueue.c b/ipc/mqueue.c index c595bed7bfcb..2a9a8be49f5b 100644 --- a/ipc/mqueue.c +++ b/ipc/mqueue.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -42,6 +43,10 @@ #include #include "util.h" +struct mqueue_fs_context { + struct ipc_namespace *ipc_ns; +}; + #define MQUEUE_MAGIC 0x19800202 #define DIRENT_SIZE 20 #define FILENT_SIZE 80 @@ -87,9 +92,11 @@ struct mqueue_inode_info { unsigned long qsize; /* size of queue in memory (sum of all msgs) */ }; +static struct file_system_type mqueue_fs_type; static const struct inode_operations mqueue_dir_inode_operations; static const struct file_operations mqueue_file_operations; static const struct super_operations mqueue_super_ops; +static const struct fs_context_operations mqueue_fs_context_ops; static void remove_notification(struct mqueue_inode_info *info); static struct kmem_cache *mqueue_inode_cachep; @@ -322,7 +329,7 @@ static struct inode *mqueue_get_inode(struct super_block *sb, return ERR_PTR(ret); } -static int mqueue_fill_super(struct super_block *sb, void *data, int silent) +static int mqueue_fill_super(struct super_block *sb, struct fs_context *fc) { struct inode *inode; struct ipc_namespace *ns = sb->s_fs_info; @@ -343,18 +350,56 @@ static int mqueue_fill_super(struct super_block *sb, void *data, int silent) return 0; } -static struct dentry *mqueue_mount(struct file_system_type *fs_type, - int flags, const char *dev_name, - void *data) +static int mqueue_get_tree(struct fs_context *fc) { - struct 
ipc_namespace *ns; - if (flags & SB_KERNMOUNT) { - ns = data; - data = NULL; - } else { - ns = current->nsproxy->ipc_ns; - } - return mount_ns(fs_type, flags, data, ns, ns->user_ns, mqueue_fill_super); + struct mqueue_fs_context *ctx = fc->fs_private; + + put_user_ns(fc->user_ns); + fc->user_ns = get_user_ns(ctx->ipc_ns->user_ns); + fc->s_fs_info = ctx->ipc_ns; + return vfs_get_super(fc, vfs_get_keyed_super, mqueue_fill_super); +} + +static void mqueue_fs_context_free(struct fs_context *fc) +{ + struct mqueue_fs_context *ctx = fc->fs_private; + + if (ctx->ipc_ns) + put_ipc_ns(ctx->ipc_ns); + kfree(ctx); +} + +static int mqueue_init_fs_context(struct fs_context *fc) +{ + struct mqueue_fs_context *ctx; + + ctx = kzalloc(sizeof(struct mqueue_fs_context), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + ctx->ipc_ns = get_ipc_ns(current->nsproxy->ipc_ns); + fc->fs_private = ctx; + fc->ops = &mqueue_fs_context_ops; + return 0; +} + +static struct vfsmount *mq_create_mount(struct ipc_namespace *ns) +{ + struct mqueue_fs_context *ctx; + struct fs_context *fc; + struct vfsmount *mnt; + + fc = fs_context_for_mount(&mqueue_fs_type, SB_KERNMOUNT); + if (IS_ERR(fc)) + return ERR_CAST(fc); + + ctx = fc->fs_private; + put_ipc_ns(ctx->ipc_ns); + ctx->ipc_ns = get_ipc_ns(ns); + + mnt = fc_mount(fc); + put_fs_context(fc); + return mnt; } static void init_once(void *foo) @@ -1522,15 +1567,22 @@ static const struct super_operations mqueue_super_ops = { .statfs = simple_statfs, }; +static const struct fs_context_operations mqueue_fs_context_ops = { + .free = mqueue_fs_context_free, + .get_tree = mqueue_get_tree, +}; + static struct file_system_type mqueue_fs_type = { - .name = "mqueue", - .mount = mqueue_mount, - .kill_sb = kill_litter_super, - .fs_flags = FS_USERNS_MOUNT, + .name = "mqueue", + .init_fs_context = mqueue_init_fs_context, + .kill_sb = kill_litter_super, + .fs_flags = FS_USERNS_MOUNT, }; int mq_init_ns(struct ipc_namespace *ns) { + struct vfsmount *m; + ns->mq_queues_count 
= 0; ns->mq_queues_max = DFLT_QUEUESMAX; ns->mq_msg_max = DFLT_MSGMAX; @@ -1538,12 +1590,10 @@ int mq_init_ns(struct ipc_namespace *ns) ns->mq_msg_default = DFLT_MSG; ns->mq_msgsize_default = DFLT_MSGSIZE; - ns->mq_mnt = kern_mount_data(&mqueue_fs_type, ns); - if (IS_ERR(ns->mq_mnt)) { - int err = PTR_ERR(ns->mq_mnt); - ns->mq_mnt = NULL; - return err; - } + m = mq_create_mount(ns); + if (IS_ERR(m)) + return PTR_ERR(m); + ns->mq_mnt = m; return 0; } diff --git a/ipc/namespace.c b/ipc/namespace.c index 21607791d62c..b3ca1476ca51 100644 --- a/ipc/namespace.c +++ b/ipc/namespace.c @@ -42,7 +42,7 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns, goto fail; err = -ENOMEM; - ns = kmalloc(sizeof(struct ipc_namespace), GFP_KERNEL); + ns = kzalloc(sizeof(struct ipc_namespace), GFP_KERNEL); if (ns == NULL) goto fail_dec; From patchwork Tue Feb 19 16:31:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820259 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1B2FB1805 for ; Tue, 19 Feb 2019 16:32:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F2B9B2CD54 for ; Tue, 19 Feb 2019 16:32:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F0B722CD4F; Tue, 19 Feb 2019 16:32:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2BF302CD5F for ; Tue, 19 Feb 2019 16:32:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S1729200AbfBSQcA (ORCPT ); Tue, 19 Feb 2019 11:32:00 -0500 Received: from mx1.redhat.com ([209.132.183.28]:43116 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726357AbfBSQcA (ORCPT ); Tue, 19 Feb 2019 11:32:00 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 945D08635; Tue, 19 Feb 2019 16:31:59 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4AB035D6AA; Tue, 19 Feb 2019 16:31:58 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 26/43] cgroup: start switching to fs_context From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:31:57 +0000 Message-ID: <155059391757.12449.10589286664392696344.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Tue, 19 Feb 2019 16:31:59 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro Unfortunately, cgroup is tangled into kernfs infrastructure. 
To avoid converting all kernfs-based filesystems at once, we need to untangle the remount part of things, instead of having it go through kernfs_sop_remount_fs(). Fortunately, it's not hard to do. This commit just gets cgroup/cgroup1 to use fs_context to deliver options on mount and remount paths. Parsing those is going to be done in the next commits; for now we do pretty much what legacy case does. Signed-off-by: Al Viro --- kernel/cgroup/cgroup-internal.h | 14 ++++ kernel/cgroup/cgroup-v1.c | 9 +-- kernel/cgroup/cgroup.c | 134 ++++++++++++++++++++++++++++----------- 3 files changed, 116 insertions(+), 41 deletions(-) diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h index c9a35f09e4b9..a89cb0ba7a68 100644 --- a/kernel/cgroup/cgroup-internal.h +++ b/kernel/cgroup/cgroup-internal.h @@ -7,6 +7,7 @@ #include #include #include +#include #define TRACE_CGROUP_PATH_LEN 1024 extern spinlock_t trace_cgroup_path_lock; @@ -36,6 +37,18 @@ extern void __init enable_debug_cgroup(void); } \ } while (0) +/* + * The cgroup filesystem superblock creation/mount context. + */ +struct cgroup_fs_context { + char *data; +}; + +static inline struct cgroup_fs_context *cgroup_fc2context(struct fs_context *fc) +{ + return fc->fs_private; +} + /* * A cgroup can be associated with multiple css_sets as different tasks may * belong to different cgroups on different hierarchies. 
In the other @@ -255,5 +268,6 @@ void cgroup1_check_for_release(struct cgroup *cgrp); struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags, void *data, unsigned long magic, struct cgroup_namespace *ns); +int cgroup1_reconfigure(struct fs_context *ctx); #endif /* __CGROUP_INTERNAL_H */ diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c index f94a7229974e..e377e19dd3e6 100644 --- a/kernel/cgroup/cgroup-v1.c +++ b/kernel/cgroup/cgroup-v1.c @@ -1046,17 +1046,19 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts) return 0; } -static int cgroup1_remount(struct kernfs_root *kf_root, int *flags, char *data) +int cgroup1_reconfigure(struct fs_context *fc) { - int ret = 0; + struct cgroup_fs_context *ctx = cgroup_fc2context(fc); + struct kernfs_root *kf_root = kernfs_root_from_sb(fc->root->d_sb); struct cgroup_root *root = cgroup_root_from_kf(kf_root); + int ret = 0; struct cgroup_sb_opts opts; u16 added_mask, removed_mask; cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp); /* See what subsystems are wanted */ - ret = parse_cgroupfs_options(data, &opts); + ret = parse_cgroupfs_options(ctx->data, &opts); if (ret) goto out_unlock; @@ -1106,7 +1108,6 @@ static int cgroup1_remount(struct kernfs_root *kf_root, int *flags, char *data) struct kernfs_syscall_ops cgroup1_kf_syscall_ops = { .rename = cgroup1_rename, .show_options = cgroup1_show_options, - .remount_fs = cgroup1_remount, .mkdir = cgroup_mkdir, .rmdir = cgroup_rmdir, .show_path = cgroup_show_path, diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 7fd9f22e406d..7f7db5f967e3 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -1811,12 +1811,13 @@ static int cgroup_show_options(struct seq_file *seq, struct kernfs_root *kf_root return 0; } -static int cgroup_remount(struct kernfs_root *kf_root, int *flags, char *data) +static int cgroup_reconfigure(struct fs_context *fc) { + struct cgroup_fs_context *ctx = cgroup_fc2context(fc); 
unsigned int root_flags; int ret; - ret = parse_cgroup_root_flags(data, &root_flags); + ret = parse_cgroup_root_flags(ctx->data, &root_flags); if (ret) return ret; @@ -2067,21 +2068,98 @@ struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags, return dentry; } -static struct dentry *cgroup_mount(struct file_system_type *fs_type, - int flags, const char *unused_dev_name, - void *data) +/* + * Destroy a cgroup filesystem context. + */ +static void cgroup_fs_context_free(struct fs_context *fc) +{ + struct cgroup_fs_context *ctx = cgroup_fc2context(fc); + + kfree(ctx); +} + +static int cgroup_parse_monolithic(struct fs_context *fc, void *data) +{ + struct cgroup_fs_context *ctx = cgroup_fc2context(fc); + + ctx->data = data; + if (ctx->data) + security_sb_eat_lsm_opts(ctx->data, &fc->security); + return 0; +} + +static int cgroup_get_tree(struct fs_context *fc) { struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; - struct dentry *dentry; + struct cgroup_fs_context *ctx = cgroup_fc2context(fc); + unsigned int root_flags; + struct dentry *root; int ret; - get_cgroup_ns(ns); + /* Check if the caller has permission to mount. */ + if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) + return -EPERM; + + ret = parse_cgroup_root_flags(ctx->data, &root_flags); + if (ret) + return ret; + + cgrp_dfl_visible = true; + cgroup_get_live(&cgrp_dfl_root.cgrp); + + root = cgroup_do_mount(&cgroup2_fs_type, fc->sb_flags, &cgrp_dfl_root, + CGROUP2_SUPER_MAGIC, ns); + if (IS_ERR(root)) + return PTR_ERR(root); + + apply_cgroup_root_flags(root_flags); + fc->root = root; + return 0; +} + +static int cgroup1_get_tree(struct fs_context *fc) +{ + struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; + struct cgroup_fs_context *ctx = cgroup_fc2context(fc); + struct dentry *root; /* Check if the caller has permission to mount. 
*/ - if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) { - put_cgroup_ns(ns); - return ERR_PTR(-EPERM); - } + if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) + return -EPERM; + + root = cgroup1_mount(&cgroup_fs_type, fc->sb_flags, ctx->data, + CGROUP_SUPER_MAGIC, ns); + if (IS_ERR(root)) + return PTR_ERR(root); + + fc->root = root; + return 0; +} + +static const struct fs_context_operations cgroup_fs_context_ops = { + .free = cgroup_fs_context_free, + .parse_monolithic = cgroup_parse_monolithic, + .get_tree = cgroup_get_tree, + .reconfigure = cgroup_reconfigure, +}; + +static const struct fs_context_operations cgroup1_fs_context_ops = { + .free = cgroup_fs_context_free, + .parse_monolithic = cgroup_parse_monolithic, + .get_tree = cgroup1_get_tree, + .reconfigure = cgroup1_reconfigure, +}; + +/* + * Initialise the cgroup filesystem creation/reconfiguration context. + */ +static int cgroup_init_fs_context(struct fs_context *fc) +{ + struct cgroup_fs_context *ctx; + + ctx = kzalloc(sizeof(struct cgroup_fs_context), GFP_KERNEL); + if (!ctx) + return -ENOMEM; /* * The first time anyone tries to mount a cgroup, enable the list @@ -2090,29 +2168,12 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type, if (!use_task_css_set_links) cgroup_enable_task_cg_lists(); - if (fs_type == &cgroup2_fs_type) { - unsigned int root_flags; - - ret = parse_cgroup_root_flags(data, &root_flags); - if (ret) { - put_cgroup_ns(ns); - return ERR_PTR(ret); - } - - cgrp_dfl_visible = true; - cgroup_get_live(&cgrp_dfl_root.cgrp); - - dentry = cgroup_do_mount(&cgroup2_fs_type, flags, &cgrp_dfl_root, - CGROUP2_SUPER_MAGIC, ns); - if (!IS_ERR(dentry)) - apply_cgroup_root_flags(root_flags); - } else { - dentry = cgroup1_mount(&cgroup_fs_type, flags, data, - CGROUP_SUPER_MAGIC, ns); - } - - put_cgroup_ns(ns); - return dentry; + fc->fs_private = ctx; + if (fc->fs_type == &cgroup2_fs_type) + fc->ops = &cgroup_fs_context_ops; + else + fc->ops = &cgroup1_fs_context_ops; + return 0; } static void 
cgroup_kill_sb(struct super_block *sb) @@ -2136,14 +2197,14 @@ static void cgroup_kill_sb(struct super_block *sb) struct file_system_type cgroup_fs_type = { .name = "cgroup", - .mount = cgroup_mount, + .init_fs_context = cgroup_init_fs_context, .kill_sb = cgroup_kill_sb, .fs_flags = FS_USERNS_MOUNT, }; static struct file_system_type cgroup2_fs_type = { .name = "cgroup2", - .mount = cgroup_mount, + .init_fs_context = cgroup_init_fs_context, .kill_sb = cgroup_kill_sb, .fs_flags = FS_USERNS_MOUNT, }; @@ -5268,7 +5329,6 @@ int cgroup_rmdir(struct kernfs_node *kn) static struct kernfs_syscall_ops cgroup_kf_syscall_ops = { .show_options = cgroup_show_options, - .remount_fs = cgroup_remount, .mkdir = cgroup_mkdir, .rmdir = cgroup_rmdir, .show_path = cgroup_show_path, From patchwork Tue Feb 19 16:32:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820263 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0C45E14E1 for ; Tue, 19 Feb 2019 16:32:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E88292CD52 for ; Tue, 19 Feb 2019 16:32:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E63D12CD6D; Tue, 19 Feb 2019 16:32:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 659F82CD6D for ; Tue, 19 Feb 2019 16:32:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726357AbfBSQcI (ORCPT ); Tue, 19 Feb 2019 
11:32:08 -0500 Received: from mx1.redhat.com ([209.132.183.28]:45166 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728216AbfBSQcH (ORCPT ); Tue, 19 Feb 2019 11:32:07 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 393C518B24A; Tue, 19 Feb 2019 16:32:07 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id EA5DF61B63; Tue, 19 Feb 2019 16:32:05 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 27/43] cgroup: fold cgroup1_mount() into cgroup1_get_tree() From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:32:04 +0000 Message-ID: <155059392478.12449.4711912735370818757.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Tue, 19 Feb 2019 16:32:07 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro Signed-off-by: Al Viro --- kernel/cgroup/cgroup-internal.h | 4 +--- kernel/cgroup/cgroup-v1.c | 26 +++++++++++++++++--------- kernel/cgroup/cgroup.c | 19 ------------------- 3 
files changed, 18 insertions(+), 31 deletions(-) diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h index a89cb0ba7a68..37836d598ff8 100644 --- a/kernel/cgroup/cgroup-internal.h +++ b/kernel/cgroup/cgroup-internal.h @@ -265,9 +265,7 @@ bool cgroup1_ssid_disabled(int ssid); void cgroup1_pidlist_destroy_all(struct cgroup *cgrp); void cgroup1_release_agent(struct work_struct *work); void cgroup1_check_for_release(struct cgroup *cgrp); -struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags, - void *data, unsigned long magic, - struct cgroup_namespace *ns); +int cgroup1_get_tree(struct fs_context *fc); int cgroup1_reconfigure(struct fs_context *ctx); #endif /* __CGROUP_INTERNAL_H */ diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c index e377e19dd3e6..7ae3810dcbdf 100644 --- a/kernel/cgroup/cgroup-v1.c +++ b/kernel/cgroup/cgroup-v1.c @@ -1113,20 +1113,24 @@ struct kernfs_syscall_ops cgroup1_kf_syscall_ops = { .show_path = cgroup_show_path, }; -struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags, - void *data, unsigned long magic, - struct cgroup_namespace *ns) +int cgroup1_get_tree(struct fs_context *fc) { + struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; + struct cgroup_fs_context *ctx = cgroup_fc2context(fc); struct cgroup_sb_opts opts; struct cgroup_root *root; struct cgroup_subsys *ss; struct dentry *dentry; int i, ret; + /* Check if the caller has permission to mount. 
*/ + if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) + return -EPERM; + cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp); /* First find the desired set of subsystems */ - ret = parse_cgroupfs_options(data, &opts); + ret = parse_cgroupfs_options(ctx->data, &opts); if (ret) goto out_unlock; @@ -1228,19 +1232,23 @@ struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags, kfree(opts.name); if (ret) - return ERR_PTR(ret); + return ret; - dentry = cgroup_do_mount(&cgroup_fs_type, flags, root, + dentry = cgroup_do_mount(&cgroup_fs_type, fc->sb_flags, root, CGROUP_SUPER_MAGIC, ns); + if (IS_ERR(dentry)) + return PTR_ERR(dentry); - if (!IS_ERR(dentry) && percpu_ref_is_dying(&root->cgrp.self.refcnt)) { + if (percpu_ref_is_dying(&root->cgrp.self.refcnt)) { struct super_block *sb = dentry->d_sb; dput(dentry); deactivate_locked_super(sb); msleep(10); - dentry = ERR_PTR(restart_syscall()); + return restart_syscall(); } - return dentry; + + fc->root = dentry; + return 0; } static int __init cgroup1_wq_init(void) diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 7f7db5f967e3..0652f74064a2 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -2117,25 +2117,6 @@ static int cgroup_get_tree(struct fs_context *fc) return 0; } -static int cgroup1_get_tree(struct fs_context *fc) -{ - struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; - struct cgroup_fs_context *ctx = cgroup_fc2context(fc); - struct dentry *root; - - /* Check if the caller has permission to mount. 
*/ - if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) - return -EPERM; - - root = cgroup1_mount(&cgroup_fs_type, fc->sb_flags, ctx->data, - CGROUP_SUPER_MAGIC, ns); - if (IS_ERR(root)) - return PTR_ERR(root); - - fc->root = root; - return 0; -} - static const struct fs_context_operations cgroup_fs_context_ops = { .free = cgroup_fs_context_free, .parse_monolithic = cgroup_parse_monolithic, From patchwork Tue Feb 19 16:32:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820267 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E2D53180E for ; Tue, 19 Feb 2019 16:32:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C800C2CD42 for ; Tue, 19 Feb 2019 16:32:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C5D0A2CD83; Tue, 19 Feb 2019 16:32:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 952D02CD42 for ; Tue, 19 Feb 2019 16:32:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728216AbfBSQcU (ORCPT ); Tue, 19 Feb 2019 11:32:20 -0500 Received: from mx1.redhat.com ([209.132.183.28]:50128 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725991AbfBSQcU (ORCPT ); Tue, 19 Feb 2019 11:32:20 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate 
requested) by mx1.redhat.com (Postfix) with ESMTPS id B3861AB40B; Tue, 19 Feb 2019 16:32:17 +0000 (UTC)
Subject: [PATCH 28/43] cgroup: take options parsing into ->parse_monolithic()
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:32:13 +0000
Message-ID: <155059393356.12449.12233243314907537336.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

From: Al Viro

Store the results in cgroup_fs_context.  There's a nasty twist caused
by the enabling/disabling subsystems - we can't do the checks sensitive
to that until cgroup_mutex gets grabbed.  Frankly, these checks are
complete bullshit (e.g. all,none combination is accepted if all subsystems
are disabled; so's cpusets,none and all,cpusets when cpusets is disabled,
etc.), but touching that would be a userland-visible behaviour change ;-/

So we do parsing in ->parse_monolithic() and have the consistency checks
done in check_cgroupfs_options(), with the latter called (on already parsed
options) from cgroup1_get_tree() and cgroup1_reconfigure().

Freeing the strdup'ed strings is done from fs_context destructor, which
somewhat simplifies the life for cgroup1_{get_tree,reconfigure}().

Signed-off-by: Al Viro
---
 kernel/cgroup/cgroup-internal.h |   23 +++---
 kernel/cgroup/cgroup-v1.c       |  143 ++++++++++++++++++---------------------
 kernel/cgroup/cgroup.c          |   54 +++++++--------
 3 files changed, 104 insertions(+), 116 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index 37836d598ff8..e627ff193dba 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -41,7 +41,15 @@ extern void __init enable_debug_cgroup(void);
  * The cgroup filesystem superblock creation/mount context.
  */
 struct cgroup_fs_context {
-	char *data;
+	unsigned int	flags;			/* CGRP_ROOT_* flags */
+
+	/* cgroup1 bits */
+	bool		cpuset_clone_children;
+	bool		none;			/* User explicitly requested empty subsystem */
+	bool		all_ss;			/* Seen 'all' option */
+	u16		subsys_mask;		/* Selected subsystems */
+	char		*name;			/* Hierarchy name */
+	char		*release_agent;		/* Path for release notifications */
 };

 static inline struct cgroup_fs_context *cgroup_fc2context(struct fs_context *fc)
@@ -130,16 +138,6 @@ struct cgroup_mgctx {
 #define DEFINE_CGROUP_MGCTX(name)						\
 	struct cgroup_mgctx name = CGROUP_MGCTX_INIT(name)

-struct cgroup_sb_opts {
-	u16 subsys_mask;
-	unsigned int flags;
-	char *release_agent;
-	bool cpuset_clone_children;
-	char *name;
-	/* User explicitly requested empty subsystem */
-	bool none;
-};
-
 extern struct mutex cgroup_mutex;
 extern spinlock_t css_set_lock;
 extern struct cgroup_subsys *cgroup_subsys[];
@@ -210,7 +208,7 @@ int cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,
 			  struct cgroup_namespace *ns);

 void cgroup_free_root(struct cgroup_root *root);
-void init_cgroup_root(struct cgroup_root *root, struct cgroup_sb_opts *opts);
+void init_cgroup_root(struct cgroup_root *root, struct cgroup_fs_context *ctx);
 int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask);
 int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask);
 struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags,
@@ -266,6 +264,7 @@ void cgroup1_pidlist_destroy_all(struct cgroup *cgrp);
 void cgroup1_release_agent(struct work_struct *work);
 void cgroup1_check_for_release(struct cgroup *cgrp);
 int cgroup1_get_tree(struct fs_context *fc);
+int parse_cgroup1_options(char *data, struct cgroup_fs_context *ctx);
 int cgroup1_reconfigure(struct fs_context *ctx);

 #endif /* __CGROUP_INTERNAL_H */
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 7ae3810dcbdf..5c93643090e9 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -906,61 +906,47 @@ static int cgroup1_show_options(struct seq_file *seq, struct kernfs_root *kf_roo
 	return 0;
 }

-static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
+int parse_cgroup1_options(char *data, struct cgroup_fs_context *ctx)
 {
 	char *token, *o = data;
-	bool all_ss = false, one_ss = false;
-	u16 mask = U16_MAX;
 	struct cgroup_subsys *ss;
-	int nr_opts = 0;
 	int i;

-#ifdef CONFIG_CPUSETS
-	mask = ~((u16)1 << cpuset_cgrp_id);
-#endif
-
-	memset(opts, 0, sizeof(*opts));
-
 	while ((token = strsep(&o, ",")) != NULL) {
-		nr_opts++;
-
 		if (!*token)
 			return -EINVAL;
 		if (!strcmp(token, "none")) {
 			/* Explicitly have no subsystems */
-			opts->none = true;
+			ctx->none = true;
 			continue;
 		}
 		if (!strcmp(token, "all")) {
-			/* Mutually exclusive option 'all' + subsystem name */
-			if (one_ss)
-				return -EINVAL;
-			all_ss = true;
+			ctx->all_ss = true;
 			continue;
 		}
 		if (!strcmp(token, "noprefix")) {
-			opts->flags |= CGRP_ROOT_NOPREFIX;
+			ctx->flags |= CGRP_ROOT_NOPREFIX;
 			continue;
 		}
 		if (!strcmp(token, "clone_children")) {
-			opts->cpuset_clone_children = true;
+			ctx->cpuset_clone_children = true;
 			continue;
 		}
 		if (!strcmp(token, "cpuset_v2_mode")) {
-			opts->flags |= CGRP_ROOT_CPUSET_V2_MODE;
+			ctx->flags |= CGRP_ROOT_CPUSET_V2_MODE;
 			continue;
 		}
 		if (!strcmp(token, "xattr")) {
-			opts->flags |= CGRP_ROOT_XATTR;
+			ctx->flags |= CGRP_ROOT_XATTR;
 			continue;
 		}
 		if (!strncmp(token, "release_agent=", 14)) {
 			/* Specifying two release agents is forbidden */
-			if (opts->release_agent)
+			if (ctx->release_agent)
 				return -EINVAL;
-			opts->release_agent =
+			ctx->release_agent =
 				kstrndup(token + 14, PATH_MAX - 1, GFP_KERNEL);
-			if (!opts->release_agent)
+			if (!ctx->release_agent)
 				return -ENOMEM;
 			continue;
 		}
@@ -983,12 +969,12 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
 				return -EINVAL;
 			}
 			/* Specifying two names is forbidden */
-			if (opts->name)
+			if (ctx->name)
 				return -EINVAL;
-			opts->name = kstrndup(name,
+			ctx->name = kstrndup(name,
 					      MAX_CGROUP_ROOT_NAMELEN - 1,
 					      GFP_KERNEL);
-			if (!opts->name)
+			if (!ctx->name)
 				return -ENOMEM;

 			continue;
@@ -997,38 +983,51 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
 		for_each_subsys(ss, i) {
 			if (strcmp(token, ss->legacy_name))
 				continue;
-			if (!cgroup_ssid_enabled(i))
-				continue;
-			if (cgroup1_ssid_disabled(i))
-				continue;
-
-			/* Mutually exclusive option 'all' + subsystem name */
-			if (all_ss)
-				return -EINVAL;
-			opts->subsys_mask |= (1 << i);
-			one_ss = true;
-
+			ctx->subsys_mask |= (1 << i);
 			break;
 		}
 		if (i == CGROUP_SUBSYS_COUNT)
 			return -ENOENT;
 	}
+	return 0;
+}
+
+static int check_cgroupfs_options(struct cgroup_fs_context *ctx)
+{
+	u16 mask = U16_MAX;
+	u16 enabled = 0;
+	struct cgroup_subsys *ss;
+	int i;
+
+#ifdef CONFIG_CPUSETS
+	mask = ~((u16)1 << cpuset_cgrp_id);
+#endif
+	for_each_subsys(ss, i)
+		if (cgroup_ssid_enabled(i) && !cgroup1_ssid_disabled(i))
+			enabled |= 1 << i;
+
+	ctx->subsys_mask &= enabled;

 	/*
-	 * If the 'all' option was specified select all the subsystems,
-	 * otherwise if 'none', 'name=' and a subsystem name options were
-	 * not specified, let's default to 'all'
+	 * In absense of 'none', 'name=' or subsystem name options,
+	 * let's default to 'all'.
 	 */
-	if (all_ss || (!one_ss && !opts->none && !opts->name))
-		for_each_subsys(ss, i)
-			if (cgroup_ssid_enabled(i) && !cgroup1_ssid_disabled(i))
-				opts->subsys_mask |= (1 << i);
+	if (!ctx->subsys_mask && !ctx->none && !ctx->name)
+		ctx->all_ss = true;
+
+	if (ctx->all_ss) {
+		/* Mutually exclusive option 'all' + subsystem name */
+		if (ctx->subsys_mask)
+			return -EINVAL;
+		/* 'all' => select all the subsystems */
+		ctx->subsys_mask = enabled;
+	}

 	/*
 	 * We either have to specify by name or by subsystems. (So all
 	 * empty hierarchies must have a name).
 	 */
-	if (!opts->subsys_mask && !opts->name)
+	if (!ctx->subsys_mask && !ctx->name)
 		return -EINVAL;

 	/*
@@ -1036,11 +1035,11 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
 	 * with the old cpuset, so we allow noprefix only if mounting just
 	 * the cpuset subsystem.
 	 */
-	if ((opts->flags & CGRP_ROOT_NOPREFIX) && (opts->subsys_mask & mask))
+	if ((ctx->flags & CGRP_ROOT_NOPREFIX) && (ctx->subsys_mask & mask))
 		return -EINVAL;

 	/* Can't specify "none" and some subsystems */
-	if (opts->subsys_mask && opts->none)
+	if (ctx->subsys_mask && ctx->none)
 		return -EINVAL;

 	return 0;
@@ -1052,28 +1051,27 @@ int cgroup1_reconfigure(struct fs_context *fc)
 	struct kernfs_root *kf_root = kernfs_root_from_sb(fc->root->d_sb);
 	struct cgroup_root *root = cgroup_root_from_kf(kf_root);
 	int ret = 0;
-	struct cgroup_sb_opts opts;
 	u16 added_mask, removed_mask;

 	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);

 	/* See what subsystems are wanted */
-	ret = parse_cgroupfs_options(ctx->data, &opts);
+	ret = check_cgroupfs_options(ctx);
 	if (ret)
 		goto out_unlock;

-	if (opts.subsys_mask != root->subsys_mask || opts.release_agent)
+	if (ctx->subsys_mask != root->subsys_mask || ctx->release_agent)
 		pr_warn("option changes via remount are deprecated (pid=%d comm=%s)\n",
 			task_tgid_nr(current), current->comm);

-	added_mask = opts.subsys_mask & ~root->subsys_mask;
-	removed_mask = root->subsys_mask & ~opts.subsys_mask;
+	added_mask = ctx->subsys_mask & ~root->subsys_mask;
+	removed_mask = root->subsys_mask & ~ctx->subsys_mask;

 	/* Don't allow flags or name to change at remount */
-	if ((opts.flags ^ root->flags) ||
-	    (opts.name && strcmp(opts.name, root->name))) {
+	if ((ctx->flags ^ root->flags) ||
+	    (ctx->name && strcmp(ctx->name, root->name))) {
 		pr_err("option or name mismatch, new: 0x%x \"%s\", old: 0x%x \"%s\"\n",
-		       opts.flags, opts.name ?: "", root->flags, root->name);
+		       ctx->flags, ctx->name ?: "", root->flags, root->name);
 		ret = -EINVAL;
 		goto out_unlock;
 	}
@@ -1090,17 +1088,15 @@ int cgroup1_reconfigure(struct fs_context *fc)

 	WARN_ON(rebind_subsystems(&cgrp_dfl_root, removed_mask));

-	if (opts.release_agent) {
+	if (ctx->release_agent) {
 		spin_lock(&release_agent_path_lock);
-		strcpy(root->release_agent_path, opts.release_agent);
+		strcpy(root->release_agent_path, ctx->release_agent);
 		spin_unlock(&release_agent_path_lock);
 	}

 	trace_cgroup_remount(root);

  out_unlock:
-	kfree(opts.release_agent);
-	kfree(opts.name);
 	mutex_unlock(&cgroup_mutex);
 	return ret;
 }
@@ -1117,7 +1113,6 @@ int cgroup1_get_tree(struct fs_context *fc)
 {
 	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
-	struct cgroup_sb_opts opts;
 	struct cgroup_root *root;
 	struct cgroup_subsys *ss;
 	struct dentry *dentry;
@@ -1130,7 +1125,7 @@ int cgroup1_get_tree(struct fs_context *fc)
 	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);

 	/* First find the desired set of subsystems */
-	ret = parse_cgroupfs_options(ctx->data, &opts);
+	ret = check_cgroupfs_options(ctx);
 	if (ret)
 		goto out_unlock;

@@ -1142,7 +1137,7 @@ int cgroup1_get_tree(struct fs_context *fc)
 	 * starting.  Testing ref liveliness is good enough.
 	 */
 	for_each_subsys(ss, i) {
-		if (!(opts.subsys_mask & (1 << i)) ||
+		if (!(ctx->subsys_mask & (1 << i)) ||
 		    ss->root == &cgrp_dfl_root)
 			continue;

@@ -1166,8 +1161,8 @@ int cgroup1_get_tree(struct fs_context *fc)
 		 * name matches but sybsys_mask doesn't, we should fail.
 		 * Remember whether name matched.
 		 */
-		if (opts.name) {
-			if (strcmp(opts.name, root->name))
+		if (ctx->name) {
+			if (strcmp(ctx->name, root->name))
 				continue;
 			name_match = true;
 		}
@@ -1176,15 +1171,15 @@ int cgroup1_get_tree(struct fs_context *fc)
 		 * If we asked for subsystems (or explicitly for no
 		 * subsystems) then they must match.
 		 */
-		if ((opts.subsys_mask || opts.none) &&
-		    (opts.subsys_mask != root->subsys_mask)) {
+		if ((ctx->subsys_mask || ctx->none) &&
+		    (ctx->subsys_mask != root->subsys_mask)) {
 			if (!name_match)
 				continue;
 			ret = -EBUSY;
 			goto out_unlock;
 		}

-		if (root->flags ^ opts.flags)
+		if (root->flags ^ ctx->flags)
 			pr_warn("new mount options do not match the existing superblock, will be ignored\n");

 		ret = 0;
@@ -1196,7 +1191,7 @@ int cgroup1_get_tree(struct fs_context *fc)
 	 * specification is allowed for already existing hierarchies but we
 	 * can't create new one without subsys specification.
 	 */
-	if (!opts.subsys_mask && !opts.none) {
+	if (!ctx->subsys_mask && !ctx->none) {
 		ret = -EINVAL;
 		goto out_unlock;
 	}
@@ -1213,9 +1208,9 @@ int cgroup1_get_tree(struct fs_context *fc)
 		goto out_unlock;
 	}

-	init_cgroup_root(root, &opts);
+	init_cgroup_root(root, ctx);

-	ret = cgroup_setup_root(root, opts.subsys_mask);
+	ret = cgroup_setup_root(root, ctx->subsys_mask);
 	if (ret)
 		cgroup_free_root(root);

@@ -1223,14 +1218,10 @@ int cgroup1_get_tree(struct fs_context *fc)
 	if (!ret && !percpu_ref_tryget_live(&root->cgrp.self.refcnt)) {
 		mutex_unlock(&cgroup_mutex);
 		msleep(10);
-		ret = restart_syscall();
-		goto out_free;
+		return restart_syscall();
 	}
 	mutex_unlock(&cgroup_mutex);
 out_free:
-	kfree(opts.release_agent);
-	kfree(opts.name);
-
 	if (ret)
 		return ret;

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0652f74064a2..33da9eef3ef4 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1814,14 +1814,8 @@ static int cgroup_show_options(struct seq_file *seq, struct kernfs_root *kf_root
 static int cgroup_reconfigure(struct fs_context *fc)
 {
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
-	unsigned int root_flags;
-	int ret;
-
-	ret = parse_cgroup_root_flags(ctx->data, &root_flags);
-	if (ret)
-		return ret;

-	apply_cgroup_root_flags(root_flags);
+	apply_cgroup_root_flags(ctx->flags);
 	return 0;
 }

@@ -1909,7 +1903,7 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 	INIT_WORK(&cgrp->release_agent_work, cgroup1_release_agent);
 }

-void init_cgroup_root(struct cgroup_root *root, struct cgroup_sb_opts *opts)
+void init_cgroup_root(struct cgroup_root *root, struct cgroup_fs_context *ctx)
 {
 	struct cgroup *cgrp = &root->cgrp;

@@ -1919,12 +1913,12 @@ void init_cgroup_root(struct cgroup_root *root, struct cgroup_sb_opts *opts)
 	init_cgroup_housekeeping(cgrp);
 	idr_init(&root->cgroup_idr);

-	root->flags = opts->flags;
-	if (opts->release_agent)
-		strscpy(root->release_agent_path, opts->release_agent, PATH_MAX);
-	if (opts->name)
-		strscpy(root->name, opts->name, MAX_CGROUP_ROOT_NAMELEN);
-	if (opts->cpuset_clone_children)
+	root->flags = ctx->flags;
+	if (ctx->release_agent)
+		strscpy(root->release_agent_path, ctx->release_agent, PATH_MAX);
+	if (ctx->name)
+		strscpy(root->name, ctx->name, MAX_CGROUP_ROOT_NAMELEN);
+	if (ctx->cpuset_clone_children)
 		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &root->cgrp.flags);
 }

@@ -2075,6 +2069,8 @@ static void cgroup_fs_context_free(struct fs_context *fc)
 {
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);

+	kfree(ctx->name);
+	kfree(ctx->release_agent);
 	kfree(ctx);
 }

@@ -2082,28 +2078,30 @@ static int cgroup_parse_monolithic(struct fs_context *fc, void *data)
 {
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);

-	ctx->data = data;
-	if (ctx->data)
-		security_sb_eat_lsm_opts(ctx->data, &fc->security);
-	return 0;
+	if (data)
+		security_sb_eat_lsm_opts(data, &fc->security);
+	return parse_cgroup_root_flags(data, &ctx->flags);
+}
+
+static int cgroup1_parse_monolithic(struct fs_context *fc, void *data)
+{
+	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
+
+	if (data)
+		security_sb_eat_lsm_opts(data, &fc->security);
+	return parse_cgroup1_options(data, ctx);
 }

 static int cgroup_get_tree(struct fs_context *fc)
 {
 	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
-	unsigned int root_flags;
 	struct dentry *root;
-	int ret;

 	/* Check if the caller has permission to mount. */
 	if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
 		return -EPERM;

-	ret = parse_cgroup_root_flags(ctx->data, &root_flags);
-	if (ret)
-		return ret;
-
 	cgrp_dfl_visible = true;
 	cgroup_get_live(&cgrp_dfl_root.cgrp);

@@ -2112,7 +2110,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 	if (IS_ERR(root))
 		return PTR_ERR(root);

-	apply_cgroup_root_flags(root_flags);
+	apply_cgroup_root_flags(ctx->flags);
 	fc->root = root;
 	return 0;
 }
@@ -2126,7 +2124,7 @@ static const struct fs_context_operations cgroup_fs_context_ops = {

 static const struct fs_context_operations cgroup1_fs_context_ops = {
 	.free		= cgroup_fs_context_free,
-	.parse_monolithic = cgroup_parse_monolithic,
+	.parse_monolithic = cgroup1_parse_monolithic,
 	.get_tree	= cgroup1_get_tree,
 	.reconfigure	= cgroup1_reconfigure,
 };
@@ -5376,11 +5374,11 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
  */
 int __init cgroup_init_early(void)
 {
-	static struct cgroup_sb_opts __initdata opts;
+	static struct cgroup_fs_context __initdata ctx;
 	struct cgroup_subsys *ss;
 	int i;

-	init_cgroup_root(&cgrp_dfl_root, &opts);
+	init_cgroup_root(&cgrp_dfl_root, &ctx);
 	cgrp_dfl_root.cgrp.self.flags |= CSS_NO_REF;

 	RCU_INIT_POINTER(init_task.cgroups, &init_css_set);

From patchwork Tue Feb 19 16:32:22 2019
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820273
Subject: [PATCH 29/43] cgroup1: switch to option-by-option parsing
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:32:22 +0000
Message-ID: <155059394294.12449.7694352641084869686.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

From: Al Viro

[dhowells should be the author - it's carved out of his patch]

Signed-off-by: Al Viro
---
 kernel/cgroup/cgroup-internal.h |    3 -
 kernel/cgroup/cgroup-v1.c       |  192 ++++++++++++++++++++++-----------------
 kernel/cgroup/cgroup.c          |   20 +---
 3 files changed, 117 insertions(+), 98 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index e627ff193dba..a7b5a41f170c 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -257,14 +257,15 @@ extern const struct proc_ns_operations cgroupns_operations;
  */
 extern struct cftype cgroup1_base_files[];
 extern struct kernfs_syscall_ops cgroup1_kf_syscall_ops;
+extern const struct fs_parameter_description cgroup1_fs_parameters;

 int proc_cgroupstats_show(struct seq_file *m, void *v);
 bool cgroup1_ssid_disabled(int ssid);
 void cgroup1_pidlist_destroy_all(struct cgroup *cgrp);
 void cgroup1_release_agent(struct work_struct *work);
 void cgroup1_check_for_release(struct cgroup *cgrp);
+int cgroup1_parse_param(struct fs_context *fc, struct fs_parameter *param);
 int cgroup1_get_tree(struct fs_context *fc);
-int parse_cgroup1_options(char *data, struct cgroup_fs_context *ctx);
 int cgroup1_reconfigure(struct fs_context *ctx);

 #endif /* __CGROUP_INTERNAL_H */
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 5c93643090e9..725e9f6fe80d 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -13,9 +13,12 @@
 #include
 #include
 #include
+#include

 #include

+#define cg_invalf(fc, fmt, ...) ({ pr_err(fmt, ## __VA_ARGS__); -EINVAL; })
+
 /*
  * pidlists linger the following amount before being destroyed.  The goal
  * is avoiding frequent destruction in the middle of consecutive read calls
@@ -906,94 +909,117 @@ static int cgroup1_show_options(struct seq_file *seq, struct kernfs_root *kf_roo
 	return 0;
 }

-int parse_cgroup1_options(char *data, struct cgroup_fs_context *ctx)
-{
-	char *token, *o = data;
-	struct cgroup_subsys *ss;
-	int i;
+enum cgroup1_param {
+	Opt_all,
+	Opt_clone_children,
+	Opt_cpuset_v2_mode,
+	Opt_name,
+	Opt_none,
+	Opt_noprefix,
+	Opt_release_agent,
+	Opt_xattr,
+};

-	while ((token = strsep(&o, ",")) != NULL) {
-		if (!*token)
-			return -EINVAL;
-		if (!strcmp(token, "none")) {
-			/* Explicitly have no subsystems */
-			ctx->none = true;
-			continue;
-		}
-		if (!strcmp(token, "all")) {
-			ctx->all_ss = true;
-			continue;
-		}
-		if (!strcmp(token, "noprefix")) {
-			ctx->flags |= CGRP_ROOT_NOPREFIX;
-			continue;
-		}
-		if (!strcmp(token, "clone_children")) {
-			ctx->cpuset_clone_children = true;
-			continue;
-		}
-		if (!strcmp(token, "cpuset_v2_mode")) {
-			ctx->flags |= CGRP_ROOT_CPUSET_V2_MODE;
-			continue;
-		}
-		if (!strcmp(token, "xattr")) {
-			ctx->flags |= CGRP_ROOT_XATTR;
-			continue;
-		}
-		if (!strncmp(token, "release_agent=", 14)) {
-			/* Specifying two release agents is forbidden */
-			if (ctx->release_agent)
-				return -EINVAL;
-			ctx->release_agent =
-				kstrndup(token + 14, PATH_MAX - 1, GFP_KERNEL);
-			if (!ctx->release_agent)
-				return -ENOMEM;
-			continue;
-		}
-		if (!strncmp(token, "name=", 5)) {
-			const char *name = token + 5;
-
-			/* blocked by boot param? */
-			if (cgroup_no_v1_named)
-				return -ENOENT;
-			/* Can't specify an empty name */
-			if (!strlen(name))
-				return -EINVAL;
-			/* Must match [\w.-]+ */
-			for (i = 0; i < strlen(name); i++) {
-				char c = name[i];
-				if (isalnum(c))
-					continue;
-				if ((c == '.') || (c == '-') || (c == '_'))
-					continue;
-				return -EINVAL;
-			}
-			/* Specifying two names is forbidden */
-			if (ctx->name)
-				return -EINVAL;
-			ctx->name = kstrndup(name,
-					     MAX_CGROUP_ROOT_NAMELEN - 1,
-					     GFP_KERNEL);
-			if (!ctx->name)
-				return -ENOMEM;
+static const struct fs_parameter_spec cgroup1_param_specs[] = {
+	fsparam_flag  ("all",		Opt_all),
+	fsparam_flag  ("clone_children", Opt_clone_children),
+	fsparam_flag  ("cpuset_v2_mode", Opt_cpuset_v2_mode),
+	fsparam_string("name",		Opt_name),
+	fsparam_flag  ("none",		Opt_none),
+	fsparam_flag  ("noprefix",	Opt_noprefix),
+	fsparam_string("release_agent",	Opt_release_agent),
+	fsparam_flag  ("xattr",		Opt_xattr),
+	{}
+};

-			continue;
-		}
+const struct fs_parameter_description cgroup1_fs_parameters = {
+	.name		= "cgroup1",
+	.specs		= cgroup1_param_specs,
+};

+int cgroup1_parse_param(struct fs_context *fc, struct fs_parameter *param)
+{
+	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
+	struct cgroup_subsys *ss;
+	struct fs_parse_result result;
+	int opt, i;
+
+	opt = fs_parse(fc, &cgroup1_fs_parameters, param, &result);
+	if (opt == -ENOPARAM) {
+		if (strcmp(param->key, "source") == 0) {
+			fc->source = param->string;
+			param->string = NULL;
+			return 0;
+		}
 		for_each_subsys(ss, i) {
-			if (strcmp(token, ss->legacy_name))
+			if (strcmp(param->key, ss->legacy_name))
 				continue;
 			ctx->subsys_mask |= (1 << i);
-			break;
+			return 0;
 		}
-		if (i == CGROUP_SUBSYS_COUNT)
+		return cg_invalf(fc, "cgroup1: Unknown subsys name '%s'", param->key);
+	}
+	if (opt < 0)
+		return opt;
+
+	switch (opt) {
+	case Opt_none:
+		/* Explicitly have no subsystems */
+		ctx->none = true;
+		break;
+	case Opt_all:
+		ctx->all_ss = true;
+		break;
+	case Opt_noprefix:
+		ctx->flags |= CGRP_ROOT_NOPREFIX;
+		break;
+	case Opt_clone_children:
+		ctx->cpuset_clone_children = true;
+		break;
+	case Opt_cpuset_v2_mode:
+		ctx->flags |= CGRP_ROOT_CPUSET_V2_MODE;
+		break;
+	case Opt_xattr:
+		ctx->flags |= CGRP_ROOT_XATTR;
+		break;
+	case Opt_release_agent:
+		/* Specifying two release agents is forbidden */
+		if (ctx->release_agent)
+			return cg_invalf(fc, "cgroup1: release_agent respecified");
+		ctx->release_agent = param->string;
+		param->string = NULL;
+		break;
+	case Opt_name:
+		/* blocked by boot param? */
+		if (cgroup_no_v1_named)
 			return -ENOENT;
+		/* Can't specify an empty name */
+		if (!param->size)
+			return cg_invalf(fc, "cgroup1: Empty name");
+		if (param->size > MAX_CGROUP_ROOT_NAMELEN - 1)
+			return cg_invalf(fc, "cgroup1: Name too long");
+		/* Must match [\w.-]+ */
+		for (i = 0; i < param->size; i++) {
+			char c = param->string[i];
+			if (isalnum(c))
+				continue;
+			if ((c == '.') || (c == '-') || (c == '_'))
+				continue;
+			return cg_invalf(fc, "cgroup1: Invalid name");
+		}
+		/* Specifying two names is forbidden */
+		if (ctx->name)
+			return cg_invalf(fc, "cgroup1: name respecified");
+		ctx->name = param->string;
+		param->string = NULL;
+		break;
 	}
 	return 0;
 }

-static int check_cgroupfs_options(struct cgroup_fs_context *ctx)
+static int check_cgroupfs_options(struct fs_context *fc)
 {
+	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	u16 mask = U16_MAX;
 	u16 enabled = 0;
 	struct cgroup_subsys *ss;
@@ -1018,7 +1044,7 @@ static int check_cgroupfs_options(struct cgroup_fs_context *ctx)
 	if (ctx->all_ss) {
 		/* Mutually exclusive option 'all' + subsystem name */
 		if (ctx->subsys_mask)
-			return -EINVAL;
+			return cg_invalf(fc, "cgroup1: subsys name conflicts with all");
 		/* 'all' => select all the subsystems */
 		ctx->subsys_mask = enabled;
 	}
@@ -1028,7 +1054,7 @@ static int check_cgroupfs_options(struct cgroup_fs_context *ctx)
 	 * empty hierarchies must have a name).
 	 */
 	if (!ctx->subsys_mask && !ctx->name)
-		return -EINVAL;
+		return cg_invalf(fc, "cgroup1: Need name or subsystem set");

 	/*
 	 * Option noprefix was introduced just for backward compatibility
@@ -1036,11 +1062,11 @@ static int check_cgroupfs_options(struct cgroup_fs_context *ctx)
 	 * the cpuset subsystem.
 	 */
 	if ((ctx->flags & CGRP_ROOT_NOPREFIX) && (ctx->subsys_mask & mask))
-		return -EINVAL;
+		return cg_invalf(fc, "cgroup1: noprefix used incorrectly");

 	/* Can't specify "none" and some subsystems */
 	if (ctx->subsys_mask && ctx->none)
-		return -EINVAL;
+		return cg_invalf(fc, "cgroup1: none used incorrectly");

 	return 0;
 }
@@ -1056,7 +1082,7 @@ int cgroup1_reconfigure(struct fs_context *fc)
 	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);

 	/* See what subsystems are wanted */
-	ret = check_cgroupfs_options(ctx);
+	ret = check_cgroupfs_options(fc);
 	if (ret)
 		goto out_unlock;

@@ -1070,7 +1096,7 @@ int cgroup1_reconfigure(struct fs_context *fc)
 	/* Don't allow flags or name to change at remount */
 	if ((ctx->flags ^ root->flags) ||
 	    (ctx->name && strcmp(ctx->name, root->name))) {
-		pr_err("option or name mismatch, new: 0x%x \"%s\", old: 0x%x \"%s\"\n",
+		cg_invalf(fc, "option or name mismatch, new: 0x%x \"%s\", old: 0x%x \"%s\"",
 		       ctx->flags, ctx->name ?: "", root->flags, root->name);
 		ret = -EINVAL;
 		goto out_unlock;
@@ -1125,7 +1151,7 @@ int cgroup1_get_tree(struct fs_context *fc)
 	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);

 	/* First find the desired set of subsystems */
-	ret = check_cgroupfs_options(ctx);
+	ret = check_cgroupfs_options(fc);
 	if (ret)
 		goto out_unlock;

@@ -1192,7 +1218,7 @@ int cgroup1_get_tree(struct fs_context *fc)
 	 * can't create new one without subsys specification.
 	 */
 	if (!ctx->subsys_mask && !ctx->none) {
-		ret = -EINVAL;
+		ret = cg_invalf(fc, "cgroup1: No subsys list or none specified");
 		goto out_unlock;
 	}

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 33da9eef3ef4..faba00caa197 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2083,15 +2083,6 @@ static int cgroup_parse_monolithic(struct fs_context *fc, void *data)
 	return parse_cgroup_root_flags(data, &ctx->flags);
 }

-static int cgroup1_parse_monolithic(struct fs_context *fc, void *data)
-{
-	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
-
-	if (data)
-		security_sb_eat_lsm_opts(data, &fc->security);
-	return parse_cgroup1_options(data, ctx);
-}
-
 static int cgroup_get_tree(struct fs_context *fc)
 {
 	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
@@ -2124,7 +2115,7 @@ static const struct fs_context_operations cgroup_fs_context_ops = {

 static const struct fs_context_operations cgroup1_fs_context_ops = {
 	.free		= cgroup_fs_context_free,
-	.parse_monolithic = cgroup1_parse_monolithic,
+	.parse_param	= cgroup1_parse_param,
 	.get_tree	= cgroup1_get_tree,
 	.reconfigure	= cgroup1_reconfigure,
 };
@@ -2175,10 +2166,11 @@ static void cgroup_kill_sb(struct super_block *sb)
 }

 struct file_system_type cgroup_fs_type = {
-	.name = "cgroup",
-	.init_fs_context = cgroup_init_fs_context,
-	.kill_sb = cgroup_kill_sb,
-	.fs_flags = FS_USERNS_MOUNT,
+	.name			= "cgroup",
+	.init_fs_context	= cgroup_init_fs_context,
+	.parameters		= &cgroup1_fs_parameters,
+	.kill_sb		= cgroup_kill_sb,
+	.fs_flags		= FS_USERNS_MOUNT,
 };

 static struct file_system_type cgroup2_fs_type = {

From patchwork Tue Feb 19 16:32:30 2019
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820277
Subject: [PATCH 30/43] cgroup2: switch to option-by-option parsing
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:32:30 +0000
Message-ID: <155059395041.12449.16641083373128419721.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

From: Al Viro

[again, carved out of patch by dhowells]
[NB: we probably want to handle "source" in parse_param here]

Signed-off-by: Al Viro
---
 kernel/cgroup/cgroup.c |   62 ++++++++++++++++++++++++++----------------------
 1 file changed, 33 insertions(+), 29 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index faba00caa197..d0cddfbdf5cf 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -54,6 +54,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1772,26 +1773,37 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 	return len;
 }

-static int parse_cgroup_root_flags(char *data, unsigned int *root_flags)
-{
-	char *token;
+enum cgroup2_param {
+	Opt_nsdelegate,
+	nr__cgroup2_params
+};

-	*root_flags = 0;
+static const struct fs_parameter_spec cgroup2_param_specs[] = {
+	fsparam_flag  ("nsdelegate",		Opt_nsdelegate),
+	{}
+};

-	if (!data || *data == '\0')
-		return 0;
+static const struct fs_parameter_description cgroup2_fs_parameters = {
+	.name		= "cgroup2",
+	.specs		= cgroup2_param_specs,
+};
 
-	while ((token = strsep(&data, ",")) != NULL) {
-		if (!strcmp(token, "nsdelegate")) {
-			*root_flags |= CGRP_ROOT_NS_DELEGATE;
-			continue;
-		}
+static int cgroup2_parse_param(struct fs_context *fc, struct fs_parameter *param)
+{
+	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
+	struct fs_parse_result result;
+	int opt;
 
-		pr_err("cgroup2: unknown option \"%s\"\n", token);
-		return -EINVAL;
-	}
+	opt = fs_parse(fc, &cgroup2_fs_parameters, param, &result);
+	if (opt < 0)
+		return opt;
 
-	return 0;
+	switch (opt) {
+	case Opt_nsdelegate:
+		ctx->flags |= CGRP_ROOT_NS_DELEGATE;
+		return 0;
+	}
+	return -EINVAL;
 }
 
 static void apply_cgroup_root_flags(unsigned int root_flags)
@@ -2074,15 +2086,6 @@ static void cgroup_fs_context_free(struct fs_context *fc)
 	kfree(ctx);
 }
 
-static int cgroup_parse_monolithic(struct fs_context *fc, void *data)
-{
-	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
-
-	if (data)
-		security_sb_eat_lsm_opts(data, &fc->security);
-	return parse_cgroup_root_flags(data, &ctx->flags);
-}
-
 static int cgroup_get_tree(struct fs_context *fc)
 {
 	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
@@ -2108,7 +2111,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 
 static const struct fs_context_operations cgroup_fs_context_ops = {
 	.free = cgroup_fs_context_free,
-	.parse_monolithic = cgroup_parse_monolithic,
+	.parse_param = cgroup2_parse_param,
 	.get_tree = cgroup_get_tree,
 	.reconfigure = cgroup_reconfigure,
 };
@@ -2174,10 +2177,11 @@ struct file_system_type cgroup_fs_type = {
 };
 
 static struct file_system_type cgroup2_fs_type = {
-	.name = "cgroup2",
-	.init_fs_context = cgroup_init_fs_context,
-	.kill_sb = cgroup_kill_sb,
-	.fs_flags = FS_USERNS_MOUNT,
+	.name = "cgroup2",
+	.init_fs_context = cgroup_init_fs_context,
+	.parameters = &cgroup2_fs_parameters,
+	.kill_sb = cgroup_kill_sb,
+	.fs_flags = FS_USERNS_MOUNT,
 };
 
 int cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,

From patchwork Tue Feb 19 16:32:37 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820281
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0EE1314E1 for ; Tue, 19 Feb 2019 16:32:41 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EE1272CCCA for ; Tue, 19 Feb 2019 16:32:40 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EC4362CD07; Tue, 19 Feb 2019 16:32:40 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 890D12CD12 for ; Tue, 19 Feb 2019 16:32:40 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729197AbfBSQck (ORCPT ); Tue, 19 Feb 2019 11:32:40 -0500
Received: from mx1.redhat.com ([209.132.183.28]:38244 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727519AbfBSQcj (ORCPT ); Tue, 19 Feb 2019 11:32:39 -0500
Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9C93B7D0D8; Tue, 19 Feb 2019 16:32:39 +0000 (UTC)
Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5A7FA101962A; Tue, 19 Feb 2019 16:32:38 +0000 (UTC)
Organization: Red Hat UK Ltd.
 Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
Subject: [PATCH 31/43] cgroup: stash cgroup_root reference into cgroup_fs_context
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:32:37 +0000
Message-ID: <155059395755.12449.3860088067534823614.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Tue, 19 Feb 2019 16:32:39 +0000 (UTC)
Sender: owner-linux-security-module@vger.kernel.org
Precedence: bulk
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

From: Al Viro

Note that this reference is *NOT* contributing to refcount of
cgroup_root in question and is valid only until cgroup_do_mount()
returns.

Signed-off-by: Al Viro
---
 kernel/cgroup/cgroup-internal.h | 3 ++-
 kernel/cgroup/cgroup-v1.c | 4 +++-
 kernel/cgroup/cgroup.c | 7 +++++--
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index a7b5a41f170c..3c1613a7648c 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -41,6 +41,7 @@ extern void __init enable_debug_cgroup(void);
  * The cgroup filesystem superblock creation/mount context.
  */
 struct cgroup_fs_context {
+	struct cgroup_root *root;
 	unsigned int flags; /* CGRP_ROOT_* flags */
 
 	/* cgroup1 bits */
@@ -208,7 +209,7 @@ int cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,
 			  struct cgroup_namespace *ns);
 
 void cgroup_free_root(struct cgroup_root *root);
-void init_cgroup_root(struct cgroup_root *root, struct cgroup_fs_context *ctx);
+void init_cgroup_root(struct cgroup_fs_context *ctx);
 int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask);
 int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask);
 struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags,
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 725e9f6fe80d..45a198c63d6e 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -1208,6 +1208,7 @@ int cgroup1_get_tree(struct fs_context *fc)
 		if (root->flags ^ ctx->flags)
 			pr_warn("new mount options do not match the existing superblock, will be ignored\n");
 
+		ctx->root = root;
 		ret = 0;
 		goto out_unlock;
 	}
@@ -1234,7 +1235,8 @@ int cgroup1_get_tree(struct fs_context *fc)
 		goto out_unlock;
 	}
 
-	init_cgroup_root(root, ctx);
+	ctx->root = root;
+	init_cgroup_root(ctx);
 
 	ret = cgroup_setup_root(root, ctx->subsys_mask);
 	if (ret)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index d0cddfbdf5cf..57f43f63363a 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1915,8 +1915,9 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 	INIT_WORK(&cgrp->release_agent_work, cgroup1_release_agent);
 }
 
-void init_cgroup_root(struct cgroup_root *root, struct cgroup_fs_context *ctx)
+void init_cgroup_root(struct cgroup_fs_context *ctx)
 {
+	struct cgroup_root *root = ctx->root;
 	struct cgroup *cgrp = &root->cgrp;
 
 	INIT_LIST_HEAD(&root->root_list);
@@ -2098,6 +2099,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 
 	cgrp_dfl_visible = true;
 	cgroup_get_live(&cgrp_dfl_root.cgrp);
+	ctx->root = &cgrp_dfl_root;
 
 	root = cgroup_do_mount(&cgroup2_fs_type, fc->sb_flags, &cgrp_dfl_root,
 			       CGROUP2_SUPER_MAGIC, ns);
@@ -5374,7 +5376,8 @@ int __init cgroup_init_early(void)
 	struct cgroup_subsys *ss;
 	int i;
 
-	init_cgroup_root(&cgrp_dfl_root, &ctx);
+	ctx.root = &cgrp_dfl_root;
+	init_cgroup_root(&ctx);
 	cgrp_dfl_root.cgrp.self.flags |= CSS_NO_REF;
 
 	RCU_INIT_POINTER(init_task.cgroups, &init_css_set);

From patchwork Tue Feb 19 16:32:44 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820285
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4B4C61805 for ; Tue, 19 Feb 2019 16:32:52 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 32F4E2CCD0 for ; Tue, 19 Feb 2019 16:32:52 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3175D2CD72; Tue, 19 Feb 2019 16:32:52 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AE2172CD6D for ; Tue, 19 Feb 2019 16:32:51 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726397AbfBSQcv (ORCPT ); Tue, 19 Feb 2019 11:32:51 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45446 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725991AbfBSQcv (ORCPT ); Tue, 19 Feb 2019 11:32:51 -0500
Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by
 mx1.redhat.com (Postfix) with ESMTPS id A0A59C04B956; Tue, 19 Feb 2019 16:32:50 +0000 (UTC)
Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 622131710E; Tue, 19 Feb 2019 16:32:46 +0000 (UTC)
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
Subject: [PATCH 32/43] cgroup_do_mount(): massage calling conventions
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:32:44 +0000
Message-ID: <155059396486.12449.1117812894043256835.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Tue, 19 Feb 2019 16:32:50 +0000 (UTC)
Sender: owner-linux-security-module@vger.kernel.org
Precedence: bulk
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

From: Al Viro

pass it fs_context instead of fs_type/flags/root triple, have
it return int instead of dentry and make it deal with setting
fc->root.

Signed-off-by: Al Viro
---
 kernel/cgroup/cgroup-internal.h | 3 +--
 kernel/cgroup/cgroup-v1.c | 17 ++++-----------
 kernel/cgroup/cgroup.c | 45 ++++++++++++++++++++-------------------
 3 files changed, 29 insertions(+), 36 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index 3c1613a7648c..f7fd54f2973f 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -212,8 +212,7 @@ void cgroup_free_root(struct cgroup_root *root);
 void init_cgroup_root(struct cgroup_fs_context *ctx);
 int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask);
 int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask);
-struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags,
-			       struct cgroup_root *root, unsigned long magic,
+int cgroup_do_mount(struct fs_context *fc, unsigned long magic,
 			       struct cgroup_namespace *ns);
 
 int cgroup_migrate_vet_dst(struct cgroup *dst_cgrp);
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 45a198c63d6e..05f05d773adf 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -1141,7 +1141,6 @@ int cgroup1_get_tree(struct fs_context *fc)
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	struct cgroup_root *root;
 	struct cgroup_subsys *ss;
-	struct dentry *dentry;
 	int i, ret;
 
 	/* Check if the caller has permission to mount. */
@@ -1253,21 +1252,15 @@ int cgroup1_get_tree(struct fs_context *fc)
 	if (ret)
 		return ret;
 
-	dentry = cgroup_do_mount(&cgroup_fs_type, fc->sb_flags, root,
-				 CGROUP_SUPER_MAGIC, ns);
-	if (IS_ERR(dentry))
-		return PTR_ERR(dentry);
-
-	if (percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
-		struct super_block *sb = dentry->d_sb;
-		dput(dentry);
+	ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
+	if (!ret && percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
+		struct super_block *sb = fc->root->d_sb;
+		dput(fc->root);
 		deactivate_locked_super(sb);
 		msleep(10);
 		return restart_syscall();
 	}
-
-	fc->root = dentry;
-	return 0;
+	return ret;
 }
 
 static int __init cgroup1_wq_init(void)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 57f43f63363a..64360a46d4df 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2036,43 +2036,48 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
 	return ret;
 }
 
-struct dentry *cgroup_do_mount(struct file_system_type *fs_type, int flags,
-			       struct cgroup_root *root, unsigned long magic,
-			       struct cgroup_namespace *ns)
+int cgroup_do_mount(struct fs_context *fc, unsigned long magic,
+		    struct cgroup_namespace *ns)
 {
-	struct dentry *dentry;
+	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	bool new_sb = false;
+	int ret = 0;
 
-	dentry = kernfs_mount(fs_type, flags, root->kf_root, magic, &new_sb);
+	fc->root = kernfs_mount(fc->fs_type, fc->sb_flags, ctx->root->kf_root,
+				magic, &new_sb);
+	if (IS_ERR(fc->root))
+		ret = PTR_ERR(fc->root);
 
 	/*
 	 * In non-init cgroup namespace, instead of root cgroup's dentry,
 	 * we return the dentry corresponding to the cgroupns->root_cgrp.
 	 */
-	if (!IS_ERR(dentry) && ns != &init_cgroup_ns) {
+	if (!ret && ns != &init_cgroup_ns) {
 		struct dentry *nsdentry;
-		struct super_block *sb = dentry->d_sb;
+		struct super_block *sb = fc->root->d_sb;
 		struct cgroup *cgrp;
 
 		mutex_lock(&cgroup_mutex);
 		spin_lock_irq(&css_set_lock);
 
-		cgrp = cset_cgroup_from_root(ns->root_cset, root);
+		cgrp = cset_cgroup_from_root(ns->root_cset, ctx->root);
 
 		spin_unlock_irq(&css_set_lock);
 		mutex_unlock(&cgroup_mutex);
 
 		nsdentry = kernfs_node_dentry(cgrp->kn, sb);
-		dput(dentry);
-		if (IS_ERR(nsdentry))
+		dput(fc->root);
+		fc->root = nsdentry;
+		if (IS_ERR(nsdentry)) {
+			ret = PTR_ERR(nsdentry);
 			deactivate_locked_super(sb);
-		dentry = nsdentry;
+		}
 	}
 
 	if (!new_sb)
-		cgroup_put(&root->cgrp);
+		cgroup_put(&ctx->root->cgrp);
 
-	return dentry;
+	return ret;
 }
 
 /*
@@ -2091,7 +2096,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 {
 	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
-	struct dentry *root;
+	int ret;
 
 	/* Check if the caller has permission to mount. */
 	if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
@@ -2101,14 +2106,10 @@ static int cgroup_get_tree(struct fs_context *fc)
 	cgroup_get_live(&cgrp_dfl_root.cgrp);
 	ctx->root = &cgrp_dfl_root;
 
-	root = cgroup_do_mount(&cgroup2_fs_type, fc->sb_flags, &cgrp_dfl_root,
-			       CGROUP2_SUPER_MAGIC, ns);
-	if (IS_ERR(root))
-		return PTR_ERR(root);
-
-	apply_cgroup_root_flags(ctx->flags);
-	fc->root = root;
-	return 0;
+	ret = cgroup_do_mount(fc, CGROUP2_SUPER_MAGIC, ns);
+	if (!ret)
+		apply_cgroup_root_flags(ctx->flags);
+	return ret;
 }
 
 static const struct fs_context_operations cgroup_fs_context_ops = {

From patchwork Tue Feb 19 16:32:55 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820289
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B70A814E1 for ; Tue, 19 Feb 2019 16:33:01 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9627A2CCFB for ; Tue, 19 Feb 2019 16:33:01 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 939BD2CD62; Tue, 19 Feb 2019 16:33:01 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 013722CCFB for ; Tue, 19 Feb 2019 16:33:00 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728099AbfBSQdA (ORCPT ); Tue, 19 Feb 2019 11:33:00 -0500
Received: from mx1.redhat.com ([209.132.183.28]:52348 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725991AbfBSQdA (ORCPT ); Tue, 19
 Feb 2019 11:33:00 -0500
Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E109E85363; Tue, 19 Feb 2019 16:32:59 +0000 (UTC)
Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8D665BA67; Tue, 19 Feb 2019 16:32:56 +0000 (UTC)
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
Subject: [PATCH 33/43] cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:32:55 +0000
Message-ID: <155059397586.12449.13952181528654229603.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Tue, 19 Feb 2019 16:33:00 +0000 (UTC)
Sender: owner-linux-security-module@vger.kernel.org
Precedence: bulk
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

From: Al Viro

Signed-off-by: Al Viro
---
 kernel/cgroup/cgroup-v1.c | 87 ++++++++++++++++++++++++---------------------
 1 file changed, 46 insertions(+), 41 deletions(-)

diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 05f05d773adf..0d71fc98e73d 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -1135,7 +1135,15 @@ struct kernfs_syscall_ops cgroup1_kf_syscall_ops = {
 	.show_path = cgroup_show_path,
 };
 
-int cgroup1_get_tree(struct fs_context *fc)
+/*
+ * The guts of cgroup1 mount - find or create cgroup_root to use.
+ * Called with cgroup_mutex held; returns 0 on success, -E... on
+ * error and positive - in case when the candidate is busy dying.
+ * On success it stashes a reference to cgroup_root into given
+ * cgroup_fs_context; that reference is *NOT* counting towards the
+ * cgroup_root refcount.
+ */
+static int cgroup1_root_to_use(struct fs_context *fc)
 {
 	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
@@ -1143,16 +1151,10 @@ int cgroup1_get_tree(struct fs_context *fc)
 	struct cgroup_subsys *ss;
 	int i, ret;
 
-	/* Check if the caller has permission to mount. */
-	if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
-		return -EPERM;
-
-	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);
-
 	/* First find the desired set of subsystems */
 	ret = check_cgroupfs_options(fc);
 	if (ret)
-		goto out_unlock;
+		return ret;
 
 	/*
 	 * Destruction of cgroup root is asynchronous, so subsystems may
@@ -1166,12 +1168,8 @@ int cgroup1_get_tree(struct fs_context *fc)
 		    ss->root == &cgrp_dfl_root)
 			continue;
 
-		if (!percpu_ref_tryget_live(&ss->root->cgrp.self.refcnt)) {
-			mutex_unlock(&cgroup_mutex);
-			msleep(10);
-			ret = restart_syscall();
-			goto out_free;
-		}
+		if (!percpu_ref_tryget_live(&ss->root->cgrp.self.refcnt))
+			return 1; /* restart */
 		cgroup_put(&ss->root->cgrp);
 	}
 
@@ -1200,16 +1198,14 @@ int cgroup1_get_tree(struct fs_context *fc)
 		    (ctx->subsys_mask != root->subsys_mask)) {
 			if (!name_match)
 				continue;
-			ret = -EBUSY;
-			goto out_unlock;
+			return -EBUSY;
 		}
 
 		if (root->flags ^ ctx->flags)
 			pr_warn("new mount options do not match the existing superblock, will be ignored\n");
 
 		ctx->root = root;
-		ret = 0;
-		goto out_unlock;
+		return 0;
 	}
 
 	/*
@@ -1217,22 +1213,16 @@ int cgroup1_get_tree(struct fs_context *fc)
 	 * specification is allowed for already existing hierarchies but we
 	 * can't create new one without subsys specification.
 	 */
-	if (!ctx->subsys_mask && !ctx->none) {
-		ret = cg_invalf(fc, "cgroup1: No subsys list or none specified");
-		goto out_unlock;
-	}
+	if (!ctx->subsys_mask && !ctx->none)
+		return cg_invalf(fc, "cgroup1: No subsys list or none specified");
 
 	/* Hierarchies may only be created in the initial cgroup namespace. */
-	if (ns != &init_cgroup_ns) {
-		ret = -EPERM;
-		goto out_unlock;
-	}
+	if (ns != &init_cgroup_ns)
+		return -EPERM;
 
 	root = kzalloc(sizeof(*root), GFP_KERNEL);
-	if (!root) {
-		ret = -ENOMEM;
-		goto out_unlock;
-	}
+	if (!root)
+		return -ENOMEM;
 
 	ctx->root = root;
 	init_cgroup_root(ctx);
@@ -1240,23 +1230,38 @@ int cgroup1_get_tree(struct fs_context *fc)
 	ret = cgroup_setup_root(root, ctx->subsys_mask);
 	if (ret)
 		cgroup_free_root(root);
+	return ret;
+}
+
+int cgroup1_get_tree(struct fs_context *fc)
+{
+	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
+	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
+	int ret;
+
+	/* Check if the caller has permission to mount. */
+	if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
+		return -EPERM;
+
+	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);
+
+	ret = cgroup1_root_to_use(fc);
+	if (!ret && !percpu_ref_tryget_live(&ctx->root->cgrp.self.refcnt))
+		ret = 1; /* restart */
 
-out_unlock:
-	if (!ret && !percpu_ref_tryget_live(&root->cgrp.self.refcnt)) {
-		mutex_unlock(&cgroup_mutex);
-		msleep(10);
-		return restart_syscall();
-	}
 	mutex_unlock(&cgroup_mutex);
-out_free:
-	if (ret)
-		return ret;
 
-	ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
-	if (!ret && percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
+	if (!ret)
+		ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
+
+	if (!ret && percpu_ref_is_dying(&ctx->root->cgrp.self.refcnt)) {
 		struct super_block *sb = fc->root->d_sb;
 		dput(fc->root);
 		deactivate_locked_super(sb);
+		ret = 1;
+	}
+
+	if (unlikely(ret > 0)) {
 		msleep(10);
 		return restart_syscall();
 	}

From patchwork Tue Feb 19 16:33:05 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820293
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 60DCF1805 for ; Tue, 19 Feb 2019 16:33:09 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4B95C2CCFB for ; Tue, 19 Feb 2019 16:33:09 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4A1802CD77; Tue, 19 Feb 2019 16:33:09 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C69D52CCFB for ; Tue, 19 Feb 2019 16:33:08 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726221AbfBSQdI (ORCPT ); Tue, 19 Feb 2019 11:33:08 -0500
Received: from mx1.redhat.com ([209.132.183.28]:52560 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725911AbfBSQdI (ORCPT ); Tue, 19 Feb 2019 11:33:08 -0500
Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6A1AF53F1; Tue, 19 Feb 2019 16:33:07 +0000 (UTC)
Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id EFD9D10694E6; Tue, 19 Feb 2019 16:33:05 +0000 (UTC)
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
Subject: [PATCH 34/43] cgroup: store a reference to cgroup_ns into cgroup_fs_context
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:33:05 +0000
Message-ID: <155059398512.12449.12782242262840297952.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Tue, 19 Feb 2019 16:33:07 +0000 (UTC)
Sender: owner-linux-security-module@vger.kernel.org
Precedence: bulk
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

From: Al Viro

...
and trim cgroup_do_mount() arguments (renaming it to cgroup_do_get_tree())

Signed-off-by: Al Viro
---
 kernel/cgroup/cgroup-internal.h | 4 ++--
 kernel/cgroup/cgroup-v1.c | 8 +++-----
 kernel/cgroup/cgroup.c | 17 ++++++++++++-----
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index f7fd54f2973f..37cf709b7a0e 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -42,6 +42,7 @@ extern void __init enable_debug_cgroup(void);
  */
 struct cgroup_fs_context {
 	struct cgroup_root *root;
+	struct cgroup_namespace *ns;
 	unsigned int flags; /* CGRP_ROOT_* flags */
 
 	/* cgroup1 bits */
@@ -212,8 +213,7 @@ void cgroup_free_root(struct cgroup_root *root);
 void init_cgroup_root(struct cgroup_fs_context *ctx);
 int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask);
 int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask);
-int cgroup_do_mount(struct fs_context *fc, unsigned long magic,
-			       struct cgroup_namespace *ns);
+int cgroup_do_get_tree(struct fs_context *fc);
 
 int cgroup_migrate_vet_dst(struct cgroup *dst_cgrp);
 void cgroup_migrate_finish(struct cgroup_mgctx *mgctx);
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 0d71fc98e73d..571ef3447426 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -1145,7 +1145,6 @@ struct kernfs_syscall_ops cgroup1_kf_syscall_ops = {
  */
 static int cgroup1_root_to_use(struct fs_context *fc)
 {
-	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	struct cgroup_root *root;
 	struct cgroup_subsys *ss;
@@ -1217,7 +1216,7 @@ static int cgroup1_root_to_use(struct fs_context *fc)
 		return cg_invalf(fc, "cgroup1: No subsys list or none specified");
 
 	/* Hierarchies may only be created in the initial cgroup namespace. */
-	if (ns != &init_cgroup_ns)
+	if (ctx->ns != &init_cgroup_ns)
 		return -EPERM;
 
 	root = kzalloc(sizeof(*root), GFP_KERNEL);
@@ -1235,12 +1234,11 @@ static int cgroup1_root_to_use(struct fs_context *fc)
 
 int cgroup1_get_tree(struct fs_context *fc)
 {
-	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	int ret;
 
 	/* Check if the caller has permission to mount. */
-	if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
+	if (!ns_capable(ctx->ns->user_ns, CAP_SYS_ADMIN))
 		return -EPERM;
 
 	cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);
@@ -1252,7 +1250,7 @@ int cgroup1_get_tree(struct fs_context *fc)
 	mutex_unlock(&cgroup_mutex);
 
 	if (!ret)
-		ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
+		ret = cgroup_do_get_tree(fc);
 
 	if (!ret && percpu_ref_is_dying(&ctx->root->cgrp.self.refcnt)) {
 		struct super_block *sb = fc->root->d_sb;
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 64360a46d4df..0c6bef234a7c 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2036,13 +2036,17 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
 	return ret;
 }
 
-int cgroup_do_mount(struct fs_context *fc, unsigned long magic,
-		    struct cgroup_namespace *ns)
+int cgroup_do_get_tree(struct fs_context *fc)
 {
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	bool new_sb = false;
+	unsigned long magic;
 	int ret = 0;
 
+	if (fc->fs_type == &cgroup2_fs_type)
+		magic = CGROUP2_SUPER_MAGIC;
+	else
+		magic = CGROUP_SUPER_MAGIC;
 	fc->root = kernfs_mount(fc->fs_type, fc->sb_flags, ctx->root->kf_root,
 				magic, &new_sb);
 	if (IS_ERR(fc->root))
@@ -2052,7 +2056,7 @@ int cgroup_do_mount(struct fs_context *fc, unsigned long magic,
 	 * In non-init cgroup namespace, instead of root cgroup's dentry,
 	 * we return the dentry corresponding to the cgroupns->root_cgrp.
 	 */
-	if (!ret && ns != &init_cgroup_ns) {
+	if (!ret && ctx->ns != &init_cgroup_ns) {
 		struct dentry *nsdentry;
 		struct super_block *sb = fc->root->d_sb;
 		struct cgroup *cgrp;
@@ -2060,7 +2064,7 @@ int cgroup_do_mount(struct fs_context *fc, unsigned long magic,
 		mutex_lock(&cgroup_mutex);
 		spin_lock_irq(&css_set_lock);
 
-		cgrp = cset_cgroup_from_root(ns->root_cset, ctx->root);
+		cgrp = cset_cgroup_from_root(ctx->ns->root_cset, ctx->root);
 
 		spin_unlock_irq(&css_set_lock);
 		mutex_unlock(&cgroup_mutex);
@@ -2089,6 +2093,7 @@ static void cgroup_fs_context_free(struct fs_context *fc)
 
 	kfree(ctx->name);
 	kfree(ctx->release_agent);
+	put_cgroup_ns(ctx->ns);
 	kfree(ctx);
 }
 
@@ -2106,7 +2111,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 	cgroup_get_live(&cgrp_dfl_root.cgrp);
 	ctx->root = &cgrp_dfl_root;
 
-	ret = cgroup_do_mount(fc, CGROUP2_SUPER_MAGIC, ns);
+	ret = cgroup_do_get_tree(fc);
 	if (!ret)
 		apply_cgroup_root_flags(ctx->flags);
 	return ret;
@@ -2144,6 +2149,8 @@ static int cgroup_init_fs_context(struct fs_context *fc)
 	if (!use_task_css_set_links)
 		cgroup_enable_task_cg_lists();
 
+	ctx->ns = current->nsproxy->cgroup_ns;
+	get_cgroup_ns(ctx->ns);
 	fc->fs_private = ctx;
 	if (fc->fs_type == &cgroup2_fs_type)
 		fc->ops = &cgroup_fs_context_ops;

From patchwork Tue Feb 19 16:33:12 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 10820297
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 93D8D14E1 for ; Tue, 19 Feb 2019 16:33:26 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7D9A82CC12 for ; Tue, 19 Feb 2019 16:33:26 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 70A672CCCA; Tue, 19 Feb 2019 16:33:26 +0000 (UTC)
X-Spam-Checker-Version:
SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CC4A72CC12 for ; Tue, 19 Feb 2019 16:33:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725911AbfBSQdY (ORCPT ); Tue, 19 Feb 2019 11:33:24 -0500 Received: from mx1.redhat.com ([209.132.183.28]:41954 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725820AbfBSQdY (ORCPT ); Tue, 19 Feb 2019 11:33:24 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D217C59456; Tue, 19 Feb 2019 16:33:22 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4BC146608B; Tue, 19 Feb 2019 16:33:14 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 35/43] kernfs, sysfs, cgroup, intel_rdt: Support fs_context From: David Howells To: viro@zeniv.linux.org.uk Cc: Greg Kroah-Hartman , Tejun Heo , Li Zefan , Johannes Weiner , cgroups@vger.kernel.org, fenghua.yu@intel.com, linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:33:12 +0000 Message-ID: <155059399265.12449.14521966377258462430.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 19 Feb 2019 16:33:23 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Make kernfs support superblock creation/mount/remount with fs_context. This requires that sysfs, cgroup and intel_rdt, which are built on kernfs, be made to support fs_context also. Notes: (1) A kernfs_fs_context struct is created to wrap fs_context and the kernfs mount parameters are moved in here (or are in fs_context). (2) kernfs_mount{,_ns}() are made into kernfs_get_tree(). The extra namespace tag parameter is passed in the context if desired. (3) kernfs_free_fs_context() is provided as a destructor for the kernfs_fs_context struct, but for the moment it does nothing except get called in the right places. (4) sysfs doesn't wrap kernfs_fs_context since it has no parameters to pass, but possibly this should be done anyway in case someone wants to add a parameter in future. (5) A cgroup_fs_context struct is created to wrap kernfs_fs_context and the cgroup v1 and v2 mount parameters are all moved there.
(6) cgroup1 parameter parsing error messages are now handled by invalf(), which allows userspace to collect them directly. (7) cgroup1 parameter cleanup is now done in the context destructor rather than in the mount/get_tree and remount functions. Weirdies: (*) cgroup_do_get_tree() calls cset_cgroup_from_root() with locks held, but then uses the resulting pointer after dropping the locks. I'm told this is okay and needs commenting. (*) The cgroup refcount web. This really needs documenting. (*) cgroup2 only has one root? Add a suggestion from Thomas Gleixner in which the RDT enablement code is placed into its own function. [folded a leak fix from Andrey Vagin] Signed-off-by: David Howells cc: Greg Kroah-Hartman cc: Tejun Heo cc: Li Zefan cc: Johannes Weiner cc: cgroups@vger.kernel.org cc: fenghua.yu@intel.com Signed-off-by: Al Viro --- arch/x86/kernel/cpu/resctrl/internal.h | 16 +++ arch/x86/kernel/cpu/resctrl/rdtgroup.c | 185 ++++++++++++++++++++------------ fs/kernfs/kernfs-internal.h | 1 fs/kernfs/mount.c | 89 ++++++--------- fs/sysfs/mount.c | 73 +++++++++---- include/linux/kernfs.h | 38 +++---- kernel/cgroup/cgroup-internal.h | 5 + kernel/cgroup/cgroup.c | 31 ++--- 8 files changed, 262 insertions(+), 176 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 822b7db634ee..e49b77283924 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -4,6 +4,7 @@ #include #include +#include #include #define MSR_IA32_L3_QOS_CFG 0xc81 @@ -40,6 +41,21 @@ #define RMID_VAL_ERROR BIT_ULL(63) #define RMID_VAL_UNAVAIL BIT_ULL(62) + +struct rdt_fs_context { + struct kernfs_fs_context kfc; + bool enable_cdpl2; + bool enable_cdpl3; + bool enable_mba_mbps; +}; + +static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc) +{ + struct kernfs_fs_context *kfc = fc->fs_private; + + return container_of(kfc, struct rdt_fs_context, kfc); +} + 
DECLARE_STATIC_KEY_FALSE(rdt_enable_key); /** diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 8388adf241b2..399601eda8e4 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -32,6 +33,7 @@ #include #include #include +#include #include @@ -1858,46 +1860,6 @@ static void cdp_disable_all(void) cdpl2_disable(); } -static int parse_rdtgroupfs_options(char *data) -{ - char *token, *o = data; - int ret = 0; - - while ((token = strsep(&o, ",")) != NULL) { - if (!*token) { - ret = -EINVAL; - goto out; - } - - if (!strcmp(token, "cdp")) { - ret = cdpl3_enable(); - if (ret) - goto out; - } else if (!strcmp(token, "cdpl2")) { - ret = cdpl2_enable(); - if (ret) - goto out; - } else if (!strcmp(token, "mba_MBps")) { - if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) - ret = set_mba_sc(true); - else - ret = -EINVAL; - if (ret) - goto out; - } else { - ret = -EINVAL; - goto out; - } - } - - return 0; - -out: - pr_err("Invalid mount option \"%s\"\n", token); - - return ret; -} - /* * We don't allow rdtgroup directories to be created anywhere * except the root directory. 
Thus when looking for the rdtgroup @@ -1969,13 +1931,27 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn, struct rdtgroup *prgrp, struct kernfs_node **mon_data_kn); -static struct dentry *rdt_mount(struct file_system_type *fs_type, - int flags, const char *unused_dev_name, - void *data) +static int rdt_enable_ctx(struct rdt_fs_context *ctx) +{ + int ret = 0; + + if (ctx->enable_cdpl2) + ret = cdpl2_enable(); + + if (!ret && ctx->enable_cdpl3) + ret = cdpl3_enable(); + + if (!ret && ctx->enable_mba_mbps) + ret = set_mba_sc(true); + + return ret; +} + +static int rdt_get_tree(struct fs_context *fc) { + struct rdt_fs_context *ctx = rdt_fc2context(fc); struct rdt_domain *dom; struct rdt_resource *r; - struct dentry *dentry; int ret; cpus_read_lock(); @@ -1984,53 +1960,42 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type, * resctrl file system can only be mounted once. */ if (static_branch_unlikely(&rdt_enable_key)) { - dentry = ERR_PTR(-EBUSY); + ret = -EBUSY; goto out; } - ret = parse_rdtgroupfs_options(data); - if (ret) { - dentry = ERR_PTR(ret); + ret = rdt_enable_ctx(ctx); + if (ret < 0) goto out_cdp; - } closid_init(); ret = rdtgroup_create_info_dir(rdtgroup_default.kn); - if (ret) { - dentry = ERR_PTR(ret); - goto out_cdp; - } + if (ret < 0) + goto out_mba; if (rdt_mon_capable) { ret = mongroup_create_dir(rdtgroup_default.kn, NULL, "mon_groups", &kn_mongrp); - if (ret) { - dentry = ERR_PTR(ret); + if (ret < 0) goto out_info; - } kernfs_get(kn_mongrp); ret = mkdir_mondata_all(rdtgroup_default.kn, &rdtgroup_default, &kn_mondata); - if (ret) { - dentry = ERR_PTR(ret); + if (ret < 0) goto out_mongrp; - } kernfs_get(kn_mondata); rdtgroup_default.mon.mon_data_kn = kn_mondata; } ret = rdt_pseudo_lock_init(); - if (ret) { - dentry = ERR_PTR(ret); + if (ret) goto out_mondata; - } - dentry = kernfs_mount(fs_type, flags, rdt_root, - RDTGROUP_SUPER_MAGIC, NULL); - if (IS_ERR(dentry)) + ret = kernfs_get_tree(fc); + if (ret < 0) goto out_psl; if 
(rdt_alloc_capable) @@ -2059,14 +2024,95 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type, kernfs_remove(kn_mongrp); out_info: kernfs_remove(kn_info); +out_mba: + if (ctx->enable_mba_mbps) + set_mba_sc(false); out_cdp: cdp_disable_all(); out: rdt_last_cmd_clear(); mutex_unlock(&rdtgroup_mutex); cpus_read_unlock(); + return ret; +} + +enum rdt_param { + Opt_cdp, + Opt_cdpl2, + Opt_mba_mbps, + nr__rdt_params +}; + +static const struct fs_parameter_spec rdt_param_specs[] = { + fsparam_flag("cdp", Opt_cdp), + fsparam_flag("cdpl2", Opt_cdpl2), + fsparam_flag("mba_MBps", Opt_mba_mbps), + {} +}; + +static const struct fs_parameter_description rdt_fs_parameters = { + .name = "rdt", + .specs = rdt_param_specs, +}; + +static int rdt_parse_param(struct fs_context *fc, struct fs_parameter *param) +{ + struct rdt_fs_context *ctx = rdt_fc2context(fc); + struct fs_parse_result result; + int opt; + + opt = fs_parse(fc, &rdt_fs_parameters, param, &result); + if (opt < 0) + return opt; - return dentry; + switch (opt) { + case Opt_cdp: + ctx->enable_cdpl3 = true; + return 0; + case Opt_cdpl2: + ctx->enable_cdpl2 = true; + return 0; + case Opt_mba_mbps: + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) + return -EINVAL; + ctx->enable_mba_mbps = true; + return 0; + } + + return -EINVAL; +} + +static void rdt_fs_context_free(struct fs_context *fc) +{ + struct rdt_fs_context *ctx = rdt_fc2context(fc); + + kernfs_free_fs_context(fc); + kfree(ctx); +} + +static const struct fs_context_operations rdt_fs_context_ops = { + .free = rdt_fs_context_free, + .parse_param = rdt_parse_param, + .get_tree = rdt_get_tree, +}; + +static int rdt_init_fs_context(struct fs_context *fc) +{ + struct rdt_fs_context *ctx; + + ctx = kzalloc(sizeof(struct rdt_fs_context), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + ctx->kfc.root = rdt_root; + ctx->kfc.magic = RDTGROUP_SUPER_MAGIC; + fc->fs_private = &ctx->kfc; + fc->ops = &rdt_fs_context_ops; + if (fc->user_ns) + put_user_ns(fc->user_ns);
+ fc->user_ns = get_user_ns(&init_user_ns); + fc->global = true; + return 0; } static int reset_all_ctrls(struct rdt_resource *r) @@ -2239,9 +2285,10 @@ static void rdt_kill_sb(struct super_block *sb) } static struct file_system_type rdt_fs_type = { - .name = "resctrl", - .mount = rdt_mount, - .kill_sb = rdt_kill_sb, + .name = "resctrl", + .init_fs_context = rdt_init_fs_context, + .parameters = &rdt_fs_parameters, + .kill_sb = rdt_kill_sb, }; static int mon_addfile(struct kernfs_node *parent_kn, const char *name, diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h index 3d83b114bb08..379e3a9eb1ec 100644 --- a/fs/kernfs/kernfs-internal.h +++ b/fs/kernfs/kernfs-internal.h @@ -17,6 +17,7 @@ #include #include +#include struct kernfs_iattrs { struct iattr ia_iattr; diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c index 4d303047a4f8..36376cc5c9c2 100644 --- a/fs/kernfs/mount.c +++ b/fs/kernfs/mount.c @@ -22,16 +22,6 @@ struct kmem_cache *kernfs_node_cache; -static int kernfs_sop_remount_fs(struct super_block *sb, int *flags, char *data) -{ - struct kernfs_root *root = kernfs_info(sb)->root; - struct kernfs_syscall_ops *scops = root->syscall_ops; - - if (scops && scops->remount_fs) - return scops->remount_fs(root, flags, data); - return 0; -} - static int kernfs_sop_show_options(struct seq_file *sf, struct dentry *dentry) { struct kernfs_root *root = kernfs_root(kernfs_dentry_node(dentry)); @@ -60,7 +50,6 @@ const struct super_operations kernfs_sops = { .drop_inode = generic_delete_inode, .evict_inode = kernfs_evict_inode, - .remount_fs = kernfs_sop_remount_fs, .show_options = kernfs_sop_show_options, .show_path = kernfs_sop_show_path, }; @@ -222,7 +211,7 @@ struct dentry *kernfs_node_dentry(struct kernfs_node *kn, } while (true); } -static int kernfs_fill_super(struct super_block *sb, unsigned long magic) +static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *kfc) { struct kernfs_super_info *info = kernfs_info(sb); struct 
inode *inode; @@ -233,7 +222,7 @@ static int kernfs_fill_super(struct super_block *sb, unsigned long magic) sb->s_iflags |= SB_I_NOEXEC | SB_I_NODEV; sb->s_blocksize = PAGE_SIZE; sb->s_blocksize_bits = PAGE_SHIFT; - sb->s_magic = magic; + sb->s_magic = kfc->magic; sb->s_op = &kernfs_sops; sb->s_xattr = kernfs_xattr_handlers; if (info->root->flags & KERNFS_ROOT_SUPPORT_EXPORTOP) @@ -263,21 +252,20 @@ static int kernfs_fill_super(struct super_block *sb, unsigned long magic) return 0; } -static int kernfs_test_super(struct super_block *sb, void *data) +static int kernfs_test_super(struct super_block *sb, struct fs_context *fc) { struct kernfs_super_info *sb_info = kernfs_info(sb); - struct kernfs_super_info *info = data; + struct kernfs_super_info *info = fc->s_fs_info; return sb_info->root == info->root && sb_info->ns == info->ns; } -static int kernfs_set_super(struct super_block *sb, void *data) +static int kernfs_set_super(struct super_block *sb, struct fs_context *fc) { - int error; - error = set_anon_super(sb, data); - if (!error) - sb->s_fs_info = data; - return error; + struct kernfs_fs_context *kfc = fc->fs_private; + + kfc->ns_tag = NULL; + return set_anon_super_fc(sb, fc); } /** @@ -294,63 +282,60 @@ const void *kernfs_super_ns(struct super_block *sb) } /** - * kernfs_mount_ns - kernfs mount helper - * @fs_type: file_system_type of the fs being mounted - * @flags: mount flags specified for the mount - * @root: kernfs_root of the hierarchy being mounted - * @magic: file system specific magic number - * @new_sb_created: tell the caller if we allocated a new superblock - * @ns: optional namespace tag of the mount + * kernfs_get_tree - kernfs filesystem access/retrieval helper + * @fc: The filesystem context. * - * This is to be called from each kernfs user's file_system_type->mount() - * implementation, which should pass through the specified @fs_type and - * @flags, and specify the hierarchy and namespace tag to mount via @root - * and @ns, respectively. 
- * - * The return value can be passed to the vfs layer verbatim. + * This is to be called from each kernfs user's fs_context->ops->get_tree() + * implementation, which should set the specified ->@fs_type and ->@flags, and + * specify the hierarchy and namespace tag to mount via ->@root and ->@ns, + * respectively. */ -struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags, - struct kernfs_root *root, unsigned long magic, - bool *new_sb_created, const void *ns) +int kernfs_get_tree(struct fs_context *fc) { + struct kernfs_fs_context *kfc = fc->fs_private; struct super_block *sb; struct kernfs_super_info *info; int error; info = kzalloc(sizeof(*info), GFP_KERNEL); if (!info) - return ERR_PTR(-ENOMEM); + return -ENOMEM; - info->root = root; - info->ns = ns; + info->root = kfc->root; + info->ns = kfc->ns_tag; INIT_LIST_HEAD(&info->node); - sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, flags, - &init_user_ns, info); - if (IS_ERR(sb) || sb->s_fs_info != info) - kfree(info); + fc->s_fs_info = info; + sb = sget_fc(fc, kernfs_test_super, kernfs_set_super); if (IS_ERR(sb)) - return ERR_CAST(sb); - - if (new_sb_created) - *new_sb_created = !sb->s_root; + return PTR_ERR(sb); if (!sb->s_root) { struct kernfs_super_info *info = kernfs_info(sb); - error = kernfs_fill_super(sb, magic); + kfc->new_sb_created = true; + + error = kernfs_fill_super(sb, kfc); if (error) { deactivate_locked_super(sb); - return ERR_PTR(error); + return error; } sb->s_flags |= SB_ACTIVE; mutex_lock(&kernfs_mutex); - list_add(&info->node, &root->supers); + list_add(&info->node, &info->root->supers); mutex_unlock(&kernfs_mutex); } - return dget(sb->s_root); + fc->root = dget(sb->s_root); + return 0; +} + +void kernfs_free_fs_context(struct fs_context *fc) +{ + /* Note that we don't deal with kfc->ns_tag here. 
*/ + kfree(fc->s_fs_info); + fc->s_fs_info = NULL; } /** diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c index 92682fcc41f6..4cb21b558a85 100644 --- a/fs/sysfs/mount.c +++ b/fs/sysfs/mount.c @@ -13,34 +13,69 @@ #include #include #include +#include #include +#include +#include #include "sysfs.h" static struct kernfs_root *sysfs_root; struct kernfs_node *sysfs_root_kn; -static struct dentry *sysfs_mount(struct file_system_type *fs_type, - int flags, const char *dev_name, void *data) +static int sysfs_get_tree(struct fs_context *fc) { - struct dentry *root; - void *ns; - bool new_sb = false; + struct kernfs_fs_context *kfc = fc->fs_private; + int ret; - if (!(flags & SB_KERNMOUNT)) { + ret = kernfs_get_tree(fc); + if (ret) + return ret; + + if (kfc->new_sb_created) + fc->root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; + return 0; +} + +static void sysfs_fs_context_free(struct fs_context *fc) +{ + struct kernfs_fs_context *kfc = fc->fs_private; + + if (kfc->ns_tag) + kobj_ns_drop(KOBJ_NS_TYPE_NET, kfc->ns_tag); + kernfs_free_fs_context(fc); + kfree(kfc); +} + +static const struct fs_context_operations sysfs_fs_context_ops = { + .free = sysfs_fs_context_free, + .get_tree = sysfs_get_tree, +}; + +static int sysfs_init_fs_context(struct fs_context *fc) +{ + struct kernfs_fs_context *kfc; + struct net *netns; + + if (!(fc->sb_flags & SB_KERNMOUNT)) { if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) - return ERR_PTR(-EPERM); + return -EPERM; } - ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); - root = kernfs_mount_ns(fs_type, flags, sysfs_root, - SYSFS_MAGIC, &new_sb, ns); - if (!new_sb) - kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); - else if (!IS_ERR(root)) - root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; + kfc = kzalloc(sizeof(struct kernfs_fs_context), GFP_KERNEL); + if (!kfc) + return -ENOMEM; - return root; + kfc->ns_tag = netns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); + kfc->root = sysfs_root; + kfc->magic = SYSFS_MAGIC; + fc->fs_private = kfc; + fc->ops = &sysfs_fs_context_ops; + 
if (fc->user_ns) + put_user_ns(fc->user_ns); + fc->user_ns = get_user_ns(netns->user_ns); + fc->global = true; + return 0; } static void sysfs_kill_sb(struct super_block *sb) @@ -52,10 +87,10 @@ static void sysfs_kill_sb(struct super_block *sb) } static struct file_system_type sysfs_fs_type = { - .name = "sysfs", - .mount = sysfs_mount, - .kill_sb = sysfs_kill_sb, - .fs_flags = FS_USERNS_MOUNT, + .name = "sysfs", + .init_fs_context = sysfs_init_fs_context, + .kill_sb = sysfs_kill_sb, + .fs_flags = FS_USERNS_MOUNT, }; int __init sysfs_init(void) diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index 44acb4c3659c..822a64e65b41 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -25,7 +25,9 @@ struct seq_file; struct vm_area_struct; struct super_block; struct file_system_type; +struct fs_context; +struct kernfs_fs_context; struct kernfs_open_node; struct kernfs_iattrs; @@ -167,7 +169,6 @@ struct kernfs_node { * kernfs_node parameter. */ struct kernfs_syscall_ops { - int (*remount_fs)(struct kernfs_root *root, int *flags, char *data); int (*show_options)(struct seq_file *sf, struct kernfs_root *root); int (*mkdir)(struct kernfs_node *parent, const char *name, @@ -268,6 +269,18 @@ struct kernfs_ops { #endif }; +/* + * The kernfs superblock creation/mount parameter context. 
+ */ +struct kernfs_fs_context { + struct kernfs_root *root; /* Root of the hierarchy being mounted */ + void *ns_tag; /* Namespace tag of the mount (or NULL) */ + unsigned long magic; /* File system specific magic number */ + + /* The following are set/used by kernfs_get_tree() */ + bool new_sb_created; /* Set to true if we allocated a new sb */ +}; + #ifdef CONFIG_KERNFS static inline enum kernfs_node_type kernfs_type(struct kernfs_node *kn) @@ -353,9 +366,8 @@ int kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr); void kernfs_notify(struct kernfs_node *kn); const void *kernfs_super_ns(struct super_block *sb); -struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags, - struct kernfs_root *root, unsigned long magic, - bool *new_sb_created, const void *ns); +int kernfs_get_tree(struct fs_context *fc); +void kernfs_free_fs_context(struct fs_context *fc); void kernfs_kill_sb(struct super_block *sb); void kernfs_init(void); @@ -458,11 +470,10 @@ static inline void kernfs_notify(struct kernfs_node *kn) { } static inline const void *kernfs_super_ns(struct super_block *sb) { return NULL; } -static inline struct dentry * -kernfs_mount_ns(struct file_system_type *fs_type, int flags, - struct kernfs_root *root, unsigned long magic, - bool *new_sb_created, const void *ns) -{ return ERR_PTR(-ENOSYS); } +static inline int kernfs_get_tree(struct fs_context *fc) +{ return -ENOSYS; } + +static inline void kernfs_free_fs_context(struct fs_context *fc) { } static inline void kernfs_kill_sb(struct super_block *sb) { } @@ -545,13 +556,4 @@ static inline int kernfs_rename(struct kernfs_node *kn, return kernfs_rename_ns(kn, new_parent, new_name, NULL); } -static inline struct dentry * -kernfs_mount(struct file_system_type *fs_type, int flags, - struct kernfs_root *root, unsigned long magic, - bool *new_sb_created) -{ - return kernfs_mount_ns(fs_type, flags, root, - magic, new_sb_created, NULL); -} - #endif /* __LINUX_KERNFS_H */ diff --git
a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h index 37cf709b7a0e..30e39f3932ad 100644 --- a/kernel/cgroup/cgroup-internal.h +++ b/kernel/cgroup/cgroup-internal.h @@ -41,6 +41,7 @@ extern void __init enable_debug_cgroup(void); * The cgroup filesystem superblock creation/mount context. */ struct cgroup_fs_context { + struct kernfs_fs_context kfc; struct cgroup_root *root; struct cgroup_namespace *ns; unsigned int flags; /* CGRP_ROOT_* flags */ @@ -56,7 +57,9 @@ struct cgroup_fs_context { static inline struct cgroup_fs_context *cgroup_fc2context(struct fs_context *fc) { - return fc->fs_private; + struct kernfs_fs_context *kfc = fc->fs_private; + + return container_of(kfc, struct cgroup_fs_context, kfc); } /* diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 0c6bef234a7c..747e5b17f9da 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -2039,18 +2039,14 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask) int cgroup_do_get_tree(struct fs_context *fc) { struct cgroup_fs_context *ctx = cgroup_fc2context(fc); - bool new_sb = false; - unsigned long magic; - int ret = 0; + int ret; + ctx->kfc.root = ctx->root->kf_root; if (fc->fs_type == &cgroup2_fs_type) - magic = CGROUP2_SUPER_MAGIC; + ctx->kfc.magic = CGROUP2_SUPER_MAGIC; else - magic = CGROUP_SUPER_MAGIC; - fc->root = kernfs_mount(fc->fs_type, fc->sb_flags, ctx->root->kf_root, - magic, &new_sb); - if (IS_ERR(fc->root)) - ret = PTR_ERR(fc->root); + ctx->kfc.magic = CGROUP_SUPER_MAGIC; + ret = kernfs_get_tree(fc); /* * In non-init cgroup namespace, instead of root cgroup's dentry, @@ -2078,7 +2074,7 @@ int cgroup_do_get_tree(struct fs_context *fc) } } - if (!new_sb) + if (!ctx->kfc.new_sb_created) cgroup_put(&ctx->root->cgrp); return ret; @@ -2094,19 +2090,15 @@ static void cgroup_fs_context_free(struct fs_context *fc) kfree(ctx->name); kfree(ctx->release_agent); put_cgroup_ns(ctx->ns); + kernfs_free_fs_context(fc); kfree(ctx); } static int 
cgroup_get_tree(struct fs_context *fc) { - struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; struct cgroup_fs_context *ctx = cgroup_fc2context(fc); int ret; - /* Check if the caller has permission to mount. */ - if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) - return -EPERM; - cgrp_dfl_visible = true; cgroup_get_live(&cgrp_dfl_root.cgrp); ctx->root = &cgrp_dfl_root; @@ -2132,7 +2124,8 @@ static const struct fs_context_operations cgroup1_fs_context_ops = { }; /* - * Initialise the cgroup filesystem creation/reconfiguration context. + * Initialise the cgroup filesystem creation/reconfiguration context. Notably, + * we select the namespace we're going to use. */ static int cgroup_init_fs_context(struct fs_context *fc) { @@ -2151,11 +2144,15 @@ static int cgroup_init_fs_context(struct fs_context *fc) ctx->ns = current->nsproxy->cgroup_ns; get_cgroup_ns(ctx->ns); - fc->fs_private = ctx; + fc->fs_private = &ctx->kfc; if (fc->fs_type == &cgroup2_fs_type) fc->ops = &cgroup_fs_context_ops; else fc->ops = &cgroup1_fs_context_ops; + if (fc->user_ns) + put_user_ns(fc->user_ns); + fc->user_ns = get_user_ns(ctx->ns->user_ns); + fc->global = true; return 0; } From patchwork Tue Feb 19 16:33:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820301 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 796771805 for ; Tue, 19 Feb 2019 16:33:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6519E2CCCA for ; Tue, 19 Feb 2019 16:33:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 588522CCCF; Tue, 19 Feb 2019 16:33:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: 
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ED2142CCCA for ; Tue, 19 Feb 2019 16:33:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727230AbfBSQdd (ORCPT ); Tue, 19 Feb 2019 11:33:33 -0500 Received: from mx1.redhat.com ([209.132.183.28]:53132 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725820AbfBSQdd (ORCPT ); Tue, 19 Feb 2019 11:33:33 -0500 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5267F81DE8; Tue, 19 Feb 2019 16:33:32 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id E49C06114E; Tue, 19 Feb 2019 16:33:29 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 36/43] cpuset: Use fs_context From: David Howells To: viro@zeniv.linux.org.uk Cc: Tejun Heo , linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:33:28 +0000 Message-ID: <155059400804.12449.6061293146947154243.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Tue, 19 Feb 2019 16:33:32 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Make the cpuset filesystem use the filesystem context. This is potentially tricky as the cpuset fs is almost an alias for the cgroup filesystem, but with some special parameters. This can, however, be handled by setting up an appropriate cgroup filesystem and returning the root directory of that as the root dir of this one. Signed-off-by: David Howells cc: Tejun Heo Signed-off-by: Al Viro --- kernel/cgroup/cpuset.c | 56 ++++++++++++++++++++++++++++++++++++------------ 1 file changed, 42 insertions(+), 14 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 479743db6c37..9758b03834ac 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -39,6 +39,7 @@ #include #include #include +#include #include #include #include @@ -372,25 +373,52 @@ static inline bool is_in_v2_mode(void) * users. 
If someone tries to mount the "cpuset" filesystem, we * silently switch it to mount "cgroup" instead */ -static struct dentry *cpuset_mount(struct file_system_type *fs_type, - int flags, const char *unused_dev_name, void *data) -{ - struct file_system_type *cgroup_fs = get_fs_type("cgroup"); - struct dentry *ret = ERR_PTR(-ENODEV); - if (cgroup_fs) { - char mountopts[] = - "cpuset,noprefix," - "release_agent=/sbin/cpuset_release_agent"; - ret = cgroup_fs->mount(cgroup_fs, flags, - unused_dev_name, mountopts); - put_filesystem(cgroup_fs); +static int cpuset_get_tree(struct fs_context *fc) +{ + struct file_system_type *cgroup_fs; + struct fs_context *new_fc; + int ret; + + cgroup_fs = get_fs_type("cgroup"); + if (!cgroup_fs) + return -ENODEV; + + new_fc = fs_context_for_mount(cgroup_fs, fc->sb_flags); + if (IS_ERR(new_fc)) { + ret = PTR_ERR(new_fc); + } else { + static const char agent_path[] = "/sbin/cpuset_release_agent"; + ret = vfs_parse_fs_string(new_fc, "cpuset", NULL, 0); + if (!ret) + ret = vfs_parse_fs_string(new_fc, "noprefix", NULL, 0); + if (!ret) + ret = vfs_parse_fs_string(new_fc, "release_agent", + agent_path, sizeof(agent_path) - 1); + if (!ret) + ret = vfs_get_tree(new_fc); + if (!ret) { /* steal the result */ + fc->root = new_fc->root; + new_fc->root = NULL; + } + put_fs_context(new_fc); } + put_filesystem(cgroup_fs); return ret; } +static const struct fs_context_operations cpuset_fs_context_ops = { + .get_tree = cpuset_get_tree, +}; + +static int cpuset_init_fs_context(struct fs_context *fc) +{ + fc->ops = &cpuset_fs_context_ops; + return 0; +} + static struct file_system_type cpuset_fs_type = { - .name = "cpuset", - .mount = cpuset_mount, + .name = "cpuset", + .init_fs_context = cpuset_init_fs_context, }; /* From patchwork Tue Feb 19 16:33:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820305 Return-Path: Received: from 
mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BB02A14E1 for ; Tue, 19 Feb 2019 16:33:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A02DD2CCED for ; Tue, 19 Feb 2019 16:33:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 920422CCEF; Tue, 19 Feb 2019 16:33:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5D75D2CCCF for ; Tue, 19 Feb 2019 16:33:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728922AbfBSQdl (ORCPT ); Tue, 19 Feb 2019 11:33:41 -0500 Received: from mx1.redhat.com ([209.132.183.28]:32306 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725820AbfBSQdk (ORCPT ); Tue, 19 Feb 2019 11:33:40 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C3D0010F94; Tue, 19 Feb 2019 16:33:39 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4267A1024912; Tue, 19 Feb 2019 16:33:38 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 37/43] hugetlbfs: Convert to fs_context From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:33:37 +0000 Message-ID: <155059401751.12449.12414022402183556207.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Tue, 19 Feb 2019 16:33:40 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Convert the hugetlbfs to use the fs_context during mount. Signed-off-by: David Howells Signed-off-by: Al Viro --- fs/hugetlbfs/inode.c | 358 ++++++++++++++++++++++++++++---------------------- 1 file changed, 200 insertions(+), 158 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 32920a10100e..239c7ca09b74 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -27,7 +27,7 @@ #include #include #include -#include +#include #include #include #include @@ -45,11 +45,17 @@ const struct file_operations hugetlbfs_file_operations; static const struct inode_operations hugetlbfs_dir_inode_operations; static const struct inode_operations hugetlbfs_inode_operations; -struct hugetlbfs_config { +enum hugetlbfs_size_type { NO_SIZE, SIZE_STD, SIZE_PERCENT }; + +struct hugetlbfs_fs_context { struct hstate *hstate; + unsigned long long max_size_opt; + unsigned long long min_size_opt; long max_hpages; long nr_inodes; long min_hpages; + enum hugetlbfs_size_type max_val_type; + enum hugetlbfs_size_type min_val_type; kuid_t uid; kgid_t gid; umode_t mode; @@ -57,22 
+63,30 @@ struct hugetlbfs_config { int sysctl_hugetlb_shm_group; -enum { - Opt_size, Opt_nr_inodes, - Opt_mode, Opt_uid, Opt_gid, - Opt_pagesize, Opt_min_size, - Opt_err, +enum hugetlb_param { + Opt_gid, + Opt_min_size, + Opt_mode, + Opt_nr_inodes, + Opt_pagesize, + Opt_size, + Opt_uid, }; -static const match_table_t tokens = { - {Opt_size, "size=%s"}, - {Opt_nr_inodes, "nr_inodes=%s"}, - {Opt_mode, "mode=%o"}, - {Opt_uid, "uid=%u"}, - {Opt_gid, "gid=%u"}, - {Opt_pagesize, "pagesize=%s"}, - {Opt_min_size, "min_size=%s"}, - {Opt_err, NULL}, +static const struct fs_parameter_spec hugetlb_param_specs[] = { + fsparam_u32 ("gid", Opt_gid), + fsparam_string("min_size", Opt_min_size), + fsparam_u32 ("mode", Opt_mode), + fsparam_string("nr_inodes", Opt_nr_inodes), + fsparam_string("pagesize", Opt_pagesize), + fsparam_string("size", Opt_size), + fsparam_u32 ("uid", Opt_uid), + {} +}; + +static const struct fs_parameter_description hugetlb_fs_parameters = { + .name = "hugetlbfs", + .specs = hugetlb_param_specs, }; #ifdef CONFIG_NUMA @@ -708,16 +722,16 @@ static int hugetlbfs_setattr(struct dentry *dentry, struct iattr *attr) } static struct inode *hugetlbfs_get_root(struct super_block *sb, - struct hugetlbfs_config *config) + struct hugetlbfs_fs_context *ctx) { struct inode *inode; inode = new_inode(sb); if (inode) { inode->i_ino = get_next_ino(); - inode->i_mode = S_IFDIR | config->mode; - inode->i_uid = config->uid; - inode->i_gid = config->gid; + inode->i_mode = S_IFDIR | ctx->mode; + inode->i_uid = ctx->uid; + inode->i_gid = ctx->gid; inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode); inode->i_op = &hugetlbfs_dir_inode_operations; inode->i_fop = &simple_dir_operations; @@ -1081,8 +1095,6 @@ static const struct super_operations hugetlbfs_ops = { .show_options = hugetlbfs_show_options, }; -enum hugetlbfs_size_type { NO_SIZE, SIZE_STD, SIZE_PERCENT }; - /* * Convert size option passed from command line to number of huge pages * in the pool specified by 
hstate. Size option could be in bytes @@ -1105,170 +1117,151 @@ hugetlbfs_size_to_hpages(struct hstate *h, unsigned long long size_opt, return size_opt; } -static int -hugetlbfs_parse_options(char *options, struct hugetlbfs_config *pconfig) +/* + * Parse one mount parameter. + */ +static int hugetlbfs_parse_param(struct fs_context *fc, struct fs_parameter *param) { - char *p, *rest; - substring_t args[MAX_OPT_ARGS]; - int option; - unsigned long long max_size_opt = 0, min_size_opt = 0; - enum hugetlbfs_size_type max_val_type = NO_SIZE, min_val_type = NO_SIZE; - - if (!options) + struct hugetlbfs_fs_context *ctx = fc->fs_private; + struct fs_parse_result result; + char *rest; + unsigned long ps; + int opt; + + opt = fs_parse(fc, &hugetlb_fs_parameters, param, &result); + if (opt < 0) + return opt; + + switch (opt) { + case Opt_uid: + ctx->uid = make_kuid(current_user_ns(), result.uint_32); + if (!uid_valid(ctx->uid)) + goto bad_val; return 0; - while ((p = strsep(&options, ",")) != NULL) { - int token; - if (!*p) - continue; + case Opt_gid: + ctx->gid = make_kgid(current_user_ns(), result.uint_32); + if (!gid_valid(ctx->gid)) + goto bad_val; + return 0; - token = match_token(p, tokens, args); - switch (token) { - case Opt_uid: - if (match_int(&args[0], &option)) - goto bad_val; - pconfig->uid = make_kuid(current_user_ns(), option); - if (!uid_valid(pconfig->uid)) - goto bad_val; - break; + case Opt_mode: + ctx->mode = result.uint_32 & 01777U; + return 0; - case Opt_gid: - if (match_int(&args[0], &option)) - goto bad_val; - pconfig->gid = make_kgid(current_user_ns(), option); - if (!gid_valid(pconfig->gid)) - goto bad_val; - break; + case Opt_size: + /* memparse() will accept a K/M/G without a digit */ + if (!isdigit(param->string[0])) + goto bad_val; + ctx->max_size_opt = memparse(param->string, &rest); + ctx->max_val_type = SIZE_STD; + if (*rest == '%') + ctx->max_val_type = SIZE_PERCENT; + return 0; - case Opt_mode: - if (match_octal(&args[0], &option)) - goto 
bad_val; - pconfig->mode = option & 01777U; - break; + case Opt_nr_inodes: + /* memparse() will accept a K/M/G without a digit */ + if (!isdigit(param->string[0])) + goto bad_val; + ctx->nr_inodes = memparse(param->string, &rest); + return 0; - case Opt_size: { - /* memparse() will accept a K/M/G without a digit */ - if (!isdigit(*args[0].from)) - goto bad_val; - max_size_opt = memparse(args[0].from, &rest); - max_val_type = SIZE_STD; - if (*rest == '%') - max_val_type = SIZE_PERCENT; - break; + case Opt_pagesize: + ps = memparse(param->string, &rest); + ctx->hstate = size_to_hstate(ps); + if (!ctx->hstate) { + pr_err("Unsupported page size %lu MB\n", ps >> 20); + return -EINVAL; } + return 0; - case Opt_nr_inodes: - /* memparse() will accept a K/M/G without a digit */ - if (!isdigit(*args[0].from)) - goto bad_val; - pconfig->nr_inodes = memparse(args[0].from, &rest); - break; + case Opt_min_size: + /* memparse() will accept a K/M/G without a digit */ + if (!isdigit(param->string[0])) + goto bad_val; + ctx->min_size_opt = memparse(param->string, &rest); + ctx->min_val_type = SIZE_STD; + if (*rest == '%') + ctx->min_val_type = SIZE_PERCENT; + return 0; - case Opt_pagesize: { - unsigned long ps; - ps = memparse(args[0].from, &rest); - pconfig->hstate = size_to_hstate(ps); - if (!pconfig->hstate) { - pr_err("Unsupported page size %lu MB\n", - ps >> 20); - return -EINVAL; - } - break; - } + default: + return -EINVAL; + } - case Opt_min_size: { - /* memparse() will accept a K/M/G without a digit */ - if (!isdigit(*args[0].from)) - goto bad_val; - min_size_opt = memparse(args[0].from, &rest); - min_val_type = SIZE_STD; - if (*rest == '%') - min_val_type = SIZE_PERCENT; - break; - } +bad_val: + return invalf(fc, "hugetlbfs: Bad value '%s' for mount option '%s'\n", + param->string, param->key); +} - default: - pr_err("Bad mount option: \"%s\"\n", p); - return -EINVAL; - break; - } - } +/* + * Validate the parsed options. 
+ */ +static int hugetlbfs_validate(struct fs_context *fc) +{ + struct hugetlbfs_fs_context *ctx = fc->fs_private; /* * Use huge page pool size (in hstate) to convert the size * options to number of huge pages. If NO_SIZE, -1 is returned. */ - pconfig->max_hpages = hugetlbfs_size_to_hpages(pconfig->hstate, - max_size_opt, max_val_type); - pconfig->min_hpages = hugetlbfs_size_to_hpages(pconfig->hstate, - min_size_opt, min_val_type); + ctx->max_hpages = hugetlbfs_size_to_hpages(ctx->hstate, + ctx->max_size_opt, + ctx->max_val_type); + ctx->min_hpages = hugetlbfs_size_to_hpages(ctx->hstate, + ctx->min_size_opt, + ctx->min_val_type); /* * If max_size was specified, then min_size must be smaller */ - if (max_val_type > NO_SIZE && - pconfig->min_hpages > pconfig->max_hpages) { - pr_err("minimum size can not be greater than maximum size\n"); + if (ctx->max_val_type > NO_SIZE && + ctx->min_hpages > ctx->max_hpages) { + pr_err("Minimum size can not be greater than maximum size\n"); return -EINVAL; } return 0; - -bad_val: - pr_err("Bad value '%s' for mount option '%s'\n", args[0].from, p); - return -EINVAL; } static int -hugetlbfs_fill_super(struct super_block *sb, void *data, int silent) +hugetlbfs_fill_super(struct super_block *sb, struct fs_context *fc) { - int ret; - struct hugetlbfs_config config; + struct hugetlbfs_fs_context *ctx = fc->fs_private; struct hugetlbfs_sb_info *sbinfo; - config.max_hpages = -1; /* No limit on size by default */ - config.nr_inodes = -1; /* No limit on number of inodes by default */ - config.uid = current_fsuid(); - config.gid = current_fsgid(); - config.mode = 0755; - config.hstate = &default_hstate; - config.min_hpages = -1; /* No default minimum size */ - ret = hugetlbfs_parse_options(data, &config); - if (ret) - return ret; - sbinfo = kmalloc(sizeof(struct hugetlbfs_sb_info), GFP_KERNEL); if (!sbinfo) return -ENOMEM; sb->s_fs_info = sbinfo; - sbinfo->hstate = config.hstate; spin_lock_init(&sbinfo->stat_lock); - sbinfo->max_inodes = 
config.nr_inodes; - sbinfo->free_inodes = config.nr_inodes; - sbinfo->spool = NULL; - sbinfo->uid = config.uid; - sbinfo->gid = config.gid; - sbinfo->mode = config.mode; + sbinfo->hstate = ctx->hstate; + sbinfo->max_inodes = ctx->nr_inodes; + sbinfo->free_inodes = ctx->nr_inodes; + sbinfo->spool = NULL; + sbinfo->uid = ctx->uid; + sbinfo->gid = ctx->gid; + sbinfo->mode = ctx->mode; /* * Allocate and initialize subpool if maximum or minimum size is * specified. Any needed reservations (for minimim size) are taken * taken when the subpool is created. */ - if (config.max_hpages != -1 || config.min_hpages != -1) { - sbinfo->spool = hugepage_new_subpool(config.hstate, - config.max_hpages, - config.min_hpages); + if (ctx->max_hpages != -1 || ctx->min_hpages != -1) { + sbinfo->spool = hugepage_new_subpool(ctx->hstate, + ctx->max_hpages, + ctx->min_hpages); if (!sbinfo->spool) goto out_free; } sb->s_maxbytes = MAX_LFS_FILESIZE; - sb->s_blocksize = huge_page_size(config.hstate); - sb->s_blocksize_bits = huge_page_shift(config.hstate); + sb->s_blocksize = huge_page_size(ctx->hstate); + sb->s_blocksize_bits = huge_page_shift(ctx->hstate); sb->s_magic = HUGETLBFS_MAGIC; sb->s_op = &hugetlbfs_ops; sb->s_time_gran = 1; - sb->s_root = d_make_root(hugetlbfs_get_root(sb, &config)); + sb->s_root = d_make_root(hugetlbfs_get_root(sb, ctx)); if (!sb->s_root) goto out_free; return 0; @@ -1278,16 +1271,52 @@ hugetlbfs_fill_super(struct super_block *sb, void *data, int silent) return -ENOMEM; } -static struct dentry *hugetlbfs_mount(struct file_system_type *fs_type, - int flags, const char *dev_name, void *data) +static int hugetlbfs_get_tree(struct fs_context *fc) +{ + int err = hugetlbfs_validate(fc); + if (err) + return err; + return vfs_get_super(fc, vfs_get_independent_super, hugetlbfs_fill_super); +} + +static void hugetlbfs_fs_context_free(struct fs_context *fc) +{ + kfree(fc->fs_private); +} + +static const struct fs_context_operations hugetlbfs_fs_context_ops = { + .free = 
hugetlbfs_fs_context_free, + .parse_param = hugetlbfs_parse_param, + .get_tree = hugetlbfs_get_tree, +}; + +static int hugetlbfs_init_fs_context(struct fs_context *fc) { - return mount_nodev(fs_type, flags, data, hugetlbfs_fill_super); + struct hugetlbfs_fs_context *ctx; + + ctx = kzalloc(sizeof(struct hugetlbfs_fs_context), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + ctx->max_hpages = -1; /* No limit on size by default */ + ctx->nr_inodes = -1; /* No limit on number of inodes by default */ + ctx->uid = current_fsuid(); + ctx->gid = current_fsgid(); + ctx->mode = 0755; + ctx->hstate = &default_hstate; + ctx->min_hpages = -1; /* No default minimum size */ + ctx->max_val_type = NO_SIZE; + ctx->min_val_type = NO_SIZE; + fc->fs_private = ctx; + fc->ops = &hugetlbfs_fs_context_ops; + return 0; } static struct file_system_type hugetlbfs_fs_type = { - .name = "hugetlbfs", - .mount = hugetlbfs_mount, - .kill_sb = kill_litter_super, + .name = "hugetlbfs", + .init_fs_context = hugetlbfs_init_fs_context, + .parameters = &hugetlb_fs_parameters, + .kill_sb = kill_litter_super, }; static struct vfsmount *hugetlbfs_vfsmount[HUGE_MAX_HSTATE]; @@ -1372,8 +1401,29 @@ struct file *hugetlb_file_setup(const char *name, size_t size, return file; } +static struct vfsmount *__init mount_one_hugetlbfs(struct hstate *h) +{ + struct fs_context *fc; + struct vfsmount *mnt; + + fc = fs_context_for_mount(&hugetlbfs_fs_type, SB_KERNMOUNT); + if (IS_ERR(fc)) { + mnt = ERR_CAST(fc); + } else { + struct hugetlbfs_fs_context *ctx = fc->fs_private; + ctx->hstate = h; + mnt = fc_mount(fc); + put_fs_context(fc); + } + if (IS_ERR(mnt)) + pr_err("Cannot mount internal hugetlbfs for page size %uK", + 1U << (h->order + PAGE_SHIFT - 10)); + return mnt; +} + static int __init init_hugetlbfs_fs(void) { + struct vfsmount *mnt; struct hstate *h; int error; int i; @@ -1396,24 +1446,16 @@ static int __init init_hugetlbfs_fs(void) i = 0; for_each_hstate(h) { - char buf[50]; - unsigned ps_kb = 1U << (h->order + 
PAGE_SHIFT - 10); - - snprintf(buf, sizeof(buf), "pagesize=%uK", ps_kb); - hugetlbfs_vfsmount[i] = kern_mount_data(&hugetlbfs_fs_type, - buf); - - if (IS_ERR(hugetlbfs_vfsmount[i])) { - pr_err("Cannot mount internal hugetlbfs for " - "page size %uK", ps_kb); - error = PTR_ERR(hugetlbfs_vfsmount[i]); - hugetlbfs_vfsmount[i] = NULL; + mnt = mount_one_hugetlbfs(h); + if (IS_ERR(mnt) && i == 0) { + error = PTR_ERR(mnt); + goto out; } + hugetlbfs_vfsmount[i] = mnt; i++; } - /* Non default hstates are optional */ - if (!IS_ERR_OR_NULL(hugetlbfs_vfsmount[default_hstate_idx])) - return 0; + + return 0; out: kmem_cache_destroy(hugetlbfs_inode_cachep); From patchwork Tue Feb 19 16:33:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820309 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC23714E1 for ; Tue, 19 Feb 2019 16:33:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B7DD32CCD0 for ; Tue, 19 Feb 2019 16:33:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AC2DA2CCEF; Tue, 19 Feb 2019 16:33:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 582D52CCD0 for ; Tue, 19 Feb 2019 16:33:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728994AbfBSQdr (ORCPT ); Tue, 19 Feb 2019 11:33:47 -0500 Received: from mx1.redhat.com ([209.132.183.28]:51460 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S1728872AbfBSQdr (ORCPT ); Tue, 19 Feb 2019 11:33:47 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 50004AC61A; Tue, 19 Feb 2019 16:33:47 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id D348F5D770; Tue, 19 Feb 2019 16:33:45 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 38/43] vfs: Remove kern_mount_data() From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:33:45 +0000 Message-ID: <155059402497.12449.9115734348268644271.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Tue, 19 Feb 2019 16:33:47 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP The kern_mount_data() isn't used any more so remove it. 
Signed-off-by: David Howells Signed-off-by: Al Viro --- fs/namespace.c | 6 +++--- include/linux/fs.h | 3 +-- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 1a1ed2528f47..bb9b7db1c66c 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3390,10 +3390,10 @@ void put_mnt_ns(struct mnt_namespace *ns) free_mnt_ns(ns); } -struct vfsmount *kern_mount_data(struct file_system_type *type, void *data) +struct vfsmount *kern_mount(struct file_system_type *type) { struct vfsmount *mnt; - mnt = vfs_kern_mount(type, SB_KERNMOUNT, type->name, data); + mnt = vfs_kern_mount(type, SB_KERNMOUNT, type->name, NULL); if (!IS_ERR(mnt)) { /* * it is a longterm mount, don't release mnt until @@ -3403,7 +3403,7 @@ struct vfsmount *kern_mount_data(struct file_system_type *type, void *data) } return mnt; } -EXPORT_SYMBOL_GPL(kern_mount_data); +EXPORT_SYMBOL_GPL(kern_mount); void kern_unmount(struct vfsmount *mnt) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 9d05c128ccf6..3e85cb8e8c20 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2280,8 +2280,7 @@ mount_pseudo(struct file_system_type *fs_type, char *name, extern int register_filesystem(struct file_system_type *); extern int unregister_filesystem(struct file_system_type *); -extern struct vfsmount *kern_mount_data(struct file_system_type *, void *data); -#define kern_mount(type) kern_mount_data(type, NULL) +extern struct vfsmount *kern_mount(struct file_system_type *); extern void kern_unmount(struct vfsmount *mnt); extern int may_umount_tree(struct vfsmount *); extern int may_umount(struct vfsmount *); From patchwork Tue Feb 19 16:33:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820313 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 
Subject: [PATCH 39/43] vfs: Provide documentation for new mount API
From: David Howells
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org
Date: Tue, 19 Feb 2019 16:33:52 +0000
Message-ID: <155059403238.12449.14892359870902718528.stgit@warthog.procyon.org.uk>
In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>
References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk>

Provide documentation for the new mount API.

Signed-off-by: David Howells
Signed-off-by: Al Viro
---
 Documentation/filesystems/mount_api.txt | 709 +++++++++++++++++++++++++++++++
 1 file changed, 709 insertions(+)
 create mode 100644 Documentation/filesystems/mount_api.txt

diff --git a/Documentation/filesystems/mount_api.txt b/Documentation/filesystems/mount_api.txt
new file mode 100644
index 000000000000..944d1965e917
--- /dev/null
+++ b/Documentation/filesystems/mount_api.txt
@@ -0,0 +1,709 @@
+                             ====================
+                             FILESYSTEM MOUNT API
+                             ====================
+
+CONTENTS
+
+ (1) Overview.
+
+ (2) The filesystem context.
+
+ (3) The filesystem context operations.
+
+ (4) Filesystem context security.
+
+ (5) VFS filesystem context operations.
+
+ (6) Parameter description.
+
+ (7) Parameter helper functions.
+
+
+========
+OVERVIEW
+========
+
+The creation of new mounts is now to be done in a multistep process:
+
+ (1) Create a filesystem context.
+
+ (2) Parse the parameters and attach them to the context. Parameters are
+     expected to be passed individually from userspace, though legacy binary
+     parameters can also be handled.
+
+ (3) Validate and pre-process the context.
+
+ (4) Get or create a superblock and mountable root.
+
+ (5) Perform the mount.
+
+ (6) Return an error message attached to the context.
+
+ (7) Destroy the context.
+
+To support this, the file_system_type struct gains a new field:
+
+        int (*init_fs_context)(struct fs_context *fc);
+
+which is invoked to set up the filesystem-specific parts of a filesystem
+context, including the additional space.
+
+Note that security initialisation is done *after* the filesystem is called so
+that the namespaces may be adjusted first.
+
+
+======================
+THE FILESYSTEM CONTEXT
+======================
+
+The creation and reconfiguration of a superblock is governed by a filesystem
+context. This is represented by the fs_context structure:
+
+        struct fs_context {
+                const struct fs_context_operations *ops;
+                struct file_system_type *fs_type;
+                void                    *fs_private;
+                struct dentry           *root;
+                struct user_namespace   *user_ns;
+                struct net              *net_ns;
+                const struct cred       *cred;
+                char                    *source;
+                char                    *subtype;
+                void                    *security;
+                void                    *s_fs_info;
+                unsigned int            sb_flags;
+                unsigned int            sb_flags_mask;
+                enum fs_context_purpose purpose:8;
+                bool                    sloppy:1;
+                bool                    silent:1;
+                ...
+        };
+
+The fs_context fields are as follows:
+
+ (*) const struct fs_context_operations *ops
+
+     These are operations that can be done on a filesystem context (see
+     below). This must be set by the ->init_fs_context() file_system_type
+     operation.
+
+ (*) struct file_system_type *fs_type
+
+     A pointer to the file_system_type of the filesystem that is being
+     constructed or reconfigured. This retains a reference on the type owner.
+
+ (*) void *fs_private
+
+     A pointer to the file system's private data. This is where the filesystem
+     will need to store any options it parses.
+
+ (*) struct dentry *root
+
+     A pointer to the root of the mountable tree (and indirectly, the
+     superblock thereof). This is filled in by the ->get_tree() op. If this
+     is set, an active reference on root->d_sb must also be held.
+
+ (*) struct user_namespace *user_ns
+ (*) struct net *net_ns
+
+     These are a subset of the namespaces in use by the invoking process. They
+     retain references on each namespace. The subscribed namespaces may be
+     replaced by the filesystem to reflect other sources, such as the parent
+     mount superblock on an automount.
+
+ (*) const struct cred *cred
+
+     The mounter's credentials. This retains a reference on the credentials.
+
+ (*) char *source
+
+     This specifies the source. It may be a block device (e.g. /dev/sda1) or
+     something more exotic, such as the "host:/path" that NFS desires.
+
+ (*) char *subtype
+
+     This is a string to be added to the type displayed in /proc/mounts to
+     qualify it (used by FUSE). This is available for the filesystem to set if
+     desired.
+
+ (*) void *security
+
+     A place for the LSMs to hang their security data for the superblock. The
+     relevant security operations are described below.
+
+ (*) void *s_fs_info
+
+     The proposed s_fs_info for a new superblock, set in the superblock by
+     sget_fc(). This can be used to distinguish superblocks.
+
+ (*) unsigned int sb_flags
+ (*) unsigned int sb_flags_mask
+
+     Which SB_* flag bits are to be set/cleared in super_block::s_flags.
+
+ (*) enum fs_context_purpose purpose
+
+     This indicates the purpose for which the context is intended. The
+     available values are:
+
+        FS_CONTEXT_FOR_MOUNT            -- New superblock for explicit mount
+        FS_CONTEXT_FOR_SUBMOUNT         -- New automatic submount of extant mount
+        FS_CONTEXT_FOR_RECONFIGURE      -- Change an existing mount
+
+ (*) bool sloppy
+ (*) bool silent
+
+     These are set if the sloppy or silent mount options are given.
+
+     [NOTE] sloppy is probably unnecessary when userspace passes over one
+     option at a time since the error can just be ignored if userspace deems it
+     to be unimportant.
+
+     [NOTE] silent is probably redundant with sb_flags & SB_SILENT.
+
+The mount context is created by calling vfs_new_fs_context() or
+vfs_dup_fs_context() and is destroyed with put_fs_context(). Note that the
+structure is not refcounted.
+
+VFS, security and filesystem mount options are set individually with
+vfs_parse_mount_option(). Options provided by the old mount(2) system call as
+a page of data can be parsed with generic_parse_monolithic().
+
+When mounting, the filesystem is allowed to take data from any of the pointers
+and attach it to the superblock (or whatever), provided it clears the pointer
+in the mount context.
+
+The filesystem is also allowed to allocate resources and pin them with the
+mount context. For instance, NFS might pin the appropriate protocol version
+module.
+
+
+=================================
+THE FILESYSTEM CONTEXT OPERATIONS
+=================================
+
+The filesystem context points to a table of operations:
+
+        struct fs_context_operations {
+                void (*free)(struct fs_context *fc);
+                int (*dup)(struct fs_context *fc, struct fs_context *src_fc);
+                int (*parse_param)(struct fs_context *fc,
+                                   struct fs_parameter *param);
+                int (*parse_monolithic)(struct fs_context *fc, void *data);
+                int (*get_tree)(struct fs_context *fc);
+                int (*reconfigure)(struct fs_context *fc);
+        };
+
+These operations are invoked by the various stages of the mount procedure to
+manage the filesystem context. They are as follows:
+
+ (*) void (*free)(struct fs_context *fc);
+
+     Called to clean up the filesystem-specific part of the filesystem context
+     when the context is destroyed. It should be aware that parts of the
+     context may have been removed and NULL'd out by ->get_tree().
+
+ (*) int (*dup)(struct fs_context *fc, struct fs_context *src_fc);
+
+     Called when a filesystem context has been duplicated to duplicate the
+     filesystem-private data. An error may be returned to indicate failure to
+     do this.
+
+     [!] Note that even if this fails, put_fs_context() will be called
+         immediately thereafter, so ->dup() *must* make the
+         filesystem-private data safe for ->free().
+
+ (*) int (*parse_param)(struct fs_context *fc,
+                        struct fs_parameter *param);
+
+     Called when a parameter is being added to the filesystem context. param
+     points to the key name and possibly a value object. VFS-specific options
+     will have been weeded out and fc->sb_flags updated in the context.
+     Security options will also have been weeded out and fc->security updated.
+
+     The parameter can be parsed with fs_parse() and fs_lookup_param(). Note
+     that the source(s) are presented as parameters named "source".
+
+     On success, 0 should be returned; otherwise a negative error code.
+
+ (*) int (*parse_monolithic)(struct fs_context *fc, void *data);
+
+     Called when the mount(2) system call is invoked to pass the entire data
+     page in one go. If this is expected to be just a list of "key[=val]"
+     items separated by commas, then this may be set to NULL.
+
+     The return value is as for ->parse_param().
+
+     If the filesystem (e.g. NFS) needs to examine the data first and then
+     finds it's the standard key-val list then it may pass it off to
+     generic_parse_monolithic().
+
+ (*) int (*get_tree)(struct fs_context *fc);
+
+     Called to get or create the mountable root and superblock, using the
+     information stored in the filesystem context (reconfiguration goes via a
+     different vector). It may detach any resources it desires from the
+     filesystem context and transfer them to the superblock it creates.
+
+     On success it should set fc->root to the mountable root and return 0. In
+     the case of an error, it should return a negative error code.
+ + The phase on a userspace-driven context will be set to only allow this to + be called once on any particular context. + + (*) int (*reconfigure)(struct fs_context *fc); + + Called to effect reconfiguration of a superblock using information stored + in the filesystem context. It may detach any resources it desires from + the filesystem context and transfer them to the superblock. The + superblock can be found from fc->root->d_sb. + + On success it should return 0. In the case of an error, it should return + a negative error code. + + [NOTE] reconfigure is intended as a replacement for remount_fs. + + +=========================== +FILESYSTEM CONTEXT SECURITY +=========================== + +The filesystem context contains a security pointer that the LSMs can use for +building up a security context for the superblock to be mounted. There are a +number of operations used by the new mount code for this purpose: + + (*) int security_fs_context_alloc(struct fs_context *fc, + struct dentry *reference); + + Called to initialise fc->security (which is preset to NULL) and allocate + any resources needed. It should return 0 on success or a negative error + code on failure. + + reference will be non-NULL if the context is being created for superblock + reconfiguration (FS_CONTEXT_FOR_RECONFIGURE) in which case it indicates + the root dentry of the superblock to be reconfigured. It will also be + non-NULL in the case of a submount (FS_CONTEXT_FOR_SUBMOUNT) in which case + it indicates the automount point. + + (*) int security_fs_context_dup(struct fs_context *fc, + struct fs_context *src_fc); + + Called to initialise fc->security (which is preset to NULL) and allocate + any resources needed. The original filesystem context is pointed to by + src_fc and may be used for reference. It should return 0 on success or a + negative error code on failure. + + (*) void security_fs_context_free(struct fs_context *fc); + + Called to clean up anything attached to fc->security. 
Note that the + contents may have been transferred to a superblock and the pointer cleared + during get_tree. + + (*) int security_fs_context_parse_param(struct fs_context *fc, + struct fs_parameter *param); + + Called for each mount parameter, including the source. The arguments are + as for the ->parse_param() method. It should return 0 to indicate that + the parameter should be passed on to the filesystem, 1 to indicate that + the parameter should be discarded or an error to indicate that the + parameter should be rejected. + + The value pointed to by param may be modified (if a string) or stolen + (provided the value pointer is NULL'd out). If it is stolen, 1 must be + returned to prevent it being passed to the filesystem. + + (*) int security_fs_context_validate(struct fs_context *fc); + + Called after all the options have been parsed to validate the collection + as a whole and to do any necessary allocation so that + security_sb_get_tree() and security_sb_reconfigure() are less likely to + fail. It should return 0 or a negative error code. + + In the case of reconfiguration, the target superblock will be accessible + via fc->root. + + (*) int security_sb_get_tree(struct fs_context *fc); + + Called during the mount procedure to verify that the specified superblock + is allowed to be mounted and to transfer the security data there. It + should return 0 or a negative error code. + + (*) void security_sb_reconfigure(struct fs_context *fc); + + Called to apply any reconfiguration to an LSM's context. It must not + fail. Error checking and resource allocation must be done in advance by + the parameter parsing and validation hooks. + + (*) int security_sb_mountpoint(struct fs_context *fc, struct path *mountpoint, + unsigned int mnt_flags); + + Called during the mount procedure to verify that the root dentry attached + to the context is permitted to be attached to the specified mountpoint. + It should return 0 on success or a negative error code on failure. 
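As a rough illustration of the return convention described above for security_fs_context_parse_param(), the following userspace C sketch mocks up how a caller might act on the hook's result. Everything in it (struct mock_param, mock_lsm_parse_param(), dispatch_param() and the "seclabel" key) is hypothetical and not taken from the kernel sources; only the 0 / 1 / negative-error convention and the rule that a stolen value must be NULL'd out come from the text.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct fs_parameter; not the kernel type. */
struct mock_param {
	const char *key;
	const char *string;
};

/* Hypothetical outcome codes for a parameter that was not rejected. */
enum mock_disposition {
	MOCK_PASSED_TO_FS,	/* hook returned 0 */
	MOCK_CONSUMED_BY_LSM,	/* hook returned 1 */
};

/*
 * Hypothetical LSM hook following the documented convention:
 *   0  - not a security option, pass it on to the filesystem
 *   1  - consumed (or stolen) by the LSM, discard it
 *   <0 - reject the parameter
 */
static int mock_lsm_parse_param(struct mock_param *param)
{
	if (strcmp(param->key, "seclabel") != 0)
		return 0;
	if (!param->string || !*param->string)
		return -EINVAL;
	/* Steal the value: NULL the pointer and report it as consumed. */
	param->string = NULL;
	return 1;
}

/*
 * Hypothetical caller: returns a negative error code on rejection,
 * otherwise a mock_disposition saying who ended up with the parameter.
 */
static int dispatch_param(struct mock_param *param)
{
	int ret = mock_lsm_parse_param(param);

	if (ret < 0)
		return ret;
	return ret == 1 ? MOCK_CONSUMED_BY_LSM : MOCK_PASSED_TO_FS;
}
```

In this sketch a parameter keyed "seclabel" with a non-empty value is stolen and reported as consumed, one with an empty value is rejected with -EINVAL, and any other key is passed through untouched for the filesystem to parse.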
+ + +================================= +VFS FILESYSTEM CONTEXT OPERATIONS +================================= + +There are two operations for creating a filesystem context and +one for destroying a context: + + (*) struct fs_context *vfs_new_fs_context(struct file_system_type *fs_type, + struct dentry *reference, + unsigned int sb_flags, + unsigned int sb_flags_mask, + enum fs_context_purpose purpose); + + Create a filesystem context for a given filesystem type and purpose. This + allocates the filesystem context, sets the superblock flags, initialises + the security and calls fs_type->init_fs_context() to initialise the + filesystem private data. + + reference can be NULL or it may indicate the root dentry of a superblock + that is going to be reconfigured (FS_CONTEXT_FOR_RECONFIGURE) or + the automount point that triggered a submount (FS_CONTEXT_FOR_SUBMOUNT). + This is provided as a source of namespace information. + + (*) struct fs_context *vfs_dup_fs_context(struct fs_context *src_fc); + + Duplicate a filesystem context, copying any options noted and duplicating + or additionally referencing any resources held therein. This is available + for use where a filesystem has to get a mount within a mount, such as NFS4 + does by internally mounting the root of the target server and then doing a + private pathwalk to the target directory. + + The purpose in the new context is inherited from the old one. + + (*) void put_fs_context(struct fs_context *fc); + + Destroy a filesystem context, releasing any resources it holds. This + calls the ->free() operation. This is intended to be called by anyone who + created a filesystem context. + + [!] filesystem contexts are not refcounted, so this causes unconditional + destruction. + +In all the above operations, apart from the put op, the return is a filesystem +context pointer or a negative error code. + +For the remaining operations, if an error occurs, a negative error code will be +returned.
+ + (*) int vfs_get_tree(struct fs_context *fc); + + Get or create the mountable root and superblock, using the parameters in + the filesystem context to select/configure the superblock. This invokes + the ->validate() op and then the ->get_tree() op. + + [NOTE] ->validate() could perhaps be rolled into ->get_tree() and + ->reconfigure(). + + (*) struct vfsmount *vfs_create_mount(struct fs_context *fc); + + Create a mount given the parameters in the specified filesystem context. + Note that this does not attach the mount to anything. + + (*) int vfs_parse_fs_param(struct fs_context *fc, + struct fs_parameter *param); + + Supply a single mount parameter to the filesystem context. This includes + the specification of the source/device which is specified as the "source" + parameter (which may be specified multiple times if the filesystem + supports that). + + param specifies the parameter key name and the value. The parameter is + first checked to see if it corresponds to a standard mount flag (in which + case it is used to set an SB_xxx flag and consumed) or a security option + (in which case the LSM consumes it) before it is passed on to the + filesystem. + + The parameter value is typed and can be one of: + + fs_value_is_flag, Parameter not given a value. + fs_value_is_string, Value is a string + fs_value_is_blob, Value is a binary blob + fs_value_is_filename, Value is a filename* + dirfd + fs_value_is_filename_empty, Value is a filename* + dirfd + AT_EMPTY_PATH + fs_value_is_file, Value is an open file (file*) + + If there is a value, that value is stored in a union in the struct in one + of param->{string,blob,name,file}. Note that the function may steal and + clear the pointer, but then becomes responsible for disposing of the + object. + + (*) int vfs_parse_fs_string(struct fs_context *fc, char *key, + const char *value, size_t v_size); + + A wrapper around vfs_parse_fs_param() that just passes a constant string.
+ + (*) int generic_parse_monolithic(struct fs_context *fc, void *data); + + Parse a sys_mount() data page, assuming the form to be a text list + consisting of key[=val] options separated by commas. Each item in the + list is passed to vfs_mount_option(). This is the default when the + ->parse_monolithic() operation is NULL. + + +===================== +PARAMETER DESCRIPTION +===================== + +Parameters are described using structures defined in linux/fs_parser.h. +There's a core description struct that links everything together: + + struct fs_parameter_description { + const char name[16]; + u8 nr_params; + u8 nr_alt_keys; + u8 nr_enums; + bool ignore_unknown; + bool no_source; + const char *const *keys; + const struct constant_table *alt_keys; + const struct fs_parameter_spec *specs; + const struct fs_parameter_enum *enums; + }; + +For example: + + enum afs_param { + Opt_autocell, + Opt_bar, + Opt_dyn, + Opt_foo, + Opt_source, + nr__afs_params + }; + + static const struct fs_parameter_description afs_fs_parameters = { + .name = "kAFS", + .nr_params = nr__afs_params, + .nr_alt_keys = ARRAY_SIZE(afs_param_alt_keys), + .nr_enums = ARRAY_SIZE(afs_param_enums), + .keys = afs_param_keys, + .alt_keys = afs_param_alt_keys, + .specs = afs_param_specs, + .enums = afs_param_enums, + }; + +The members are as follows: + + (1) const char name[16]; + + The name to be used in error messages generated by the parse helper + functions. + + (2) u8 nr_params; + + The number of discrete parameter identifiers. This indicates the number + of elements in the ->specs[] array and also limits the values that may be + used in the values that the ->keys[] array maps to. + + It is expected that, for example, two parameters that are related, say + "acl" and "noacl" will have the same ID, but will be flagged to indicate + that one is the inverse of the other. The value can then be picked out + from the parse result.
+ + (3) const struct fs_parameter_spec *specs; + + Table of parameter specifications, where the entries are of type: + + struct fs_parameter_spec { + enum fs_parameter_type type:8; + u8 flags; + }; + + and the parameter identifier is the index to the array. 'type' indicates + the desired value type and must be one of: + + TYPE NAME EXPECTED VALUE RESULT IN + ======================= ======================= ===================== + fs_param_is_flag No value n/a + fs_param_is_bool Boolean value result->boolean + fs_param_is_u32 32-bit unsigned int result->uint_32 + fs_param_is_u32_octal 32-bit octal int result->uint_32 + fs_param_is_u32_hex 32-bit hex int result->uint_32 + fs_param_is_s32 32-bit signed int result->int_32 + fs_param_is_enum Enum value name result->uint_32 + fs_param_is_string Arbitrary string param->string + fs_param_is_blob Binary blob param->blob + fs_param_is_blockdev Blockdev path * Needs lookup + fs_param_is_path Path * Needs lookup + fs_param_is_fd File descriptor param->file + + And each parameter can be qualified with 'flags': + + fs_param_v_optional The value is optional + fs_param_neg_with_no If key name is prefixed with "no", it is false + fs_param_neg_with_empty If value is "", it is false + fs_param_deprecated The parameter is deprecated. + + For example: + + static const struct fs_parameter_spec afs_param_specs[nr__afs_params] = { + [Opt_autocell] = { fs_param_is_flag }, + [Opt_bar] = { fs_param_is_enum }, + [Opt_dyn] = { fs_param_is_flag }, + [Opt_foo] = { fs_param_is_bool, fs_param_neg_with_no }, + [Opt_source] = { fs_param_is_string }, + }; + + Note that if the value is of fs_param_is_bool type, fs_parse() will try + to match any string value against "0", "1", "no", "yes", "false", "true". + + [!] NOTE that the table must be sorted according to primary key name so + that ->keys[] is also sorted. + + (4) const char *const *keys; + + Table of primary key names for the parameters. There must be one entry + per defined parameter.
The table is optional if ->nr_params is 0. The + table is just an array of names e.g.: + + static const char *const afs_param_keys[nr__afs_params] = { + [Opt_autocell] = "autocell", + [Opt_bar] = "bar", + [Opt_dyn] = "dyn", + [Opt_foo] = "foo", + [Opt_source] = "source", + }; + + [!] NOTE that the table must be sorted such that the table can be searched + with bsearch() using strcmp(). This means that the Opt_* values must + correspond to the entries in this table. + + (5) const struct constant_table *alt_keys; + u8 nr_alt_keys; + + Table of additional key names and their mappings to parameter ID plus the + number of elements in the table. This is optional. The table is just an + array of { name, integer } pairs, e.g.: + + static const struct constant_table afs_param_alt_keys[] = { + { "baz", Opt_bar }, + { "dynamic", Opt_dyn }, + }; + + [!] NOTE that the table must be sorted such that strcmp() can be used with + bsearch() to search the entries. + + The parameter ID can also be fs_param_key_removed to indicate that a + deprecated parameter has been removed and that an error will be given. + This differs from fs_param_deprecated where the parameter may still have + an effect. + + Further, the behaviour of the parameter may differ when an alternate name + is used (for instance with NFS, "v3", "v4.2", etc. are alternate names). + + (6) const struct fs_parameter_enum *enums; + u8 nr_enums; + + Table of enum value names to integer mappings and the number of elements + stored therein.
This is of type: + + struct fs_parameter_enum { + u8 param_id; + char name[14]; + u8 value; + }; + + Where the array is an unsorted list of { parameter ID, name }-keyed + elements that indicate the value to map to, e.g.: + + static const struct fs_parameter_enum afs_param_enums[] = { + { Opt_bar, "x", 1}, + { Opt_bar, "y", 23}, + { Opt_bar, "z", 42}, + }; + + If a parameter of type fs_param_is_enum is encountered, fs_parse() will + try to look the value up in the enum table and the result will be stored + in the parse result. + + (7) bool no_source; + + If this is set, fs_parse() will ignore any "source" parameter and not + pass it to the filesystem. + +The parser should be pointed to by the parser pointer in the file_system_type +struct as this will provide validation on registration (if +CONFIG_VALIDATE_FS_PARSER=y) and will allow the description to be queried from +userspace using the fsinfo() syscall. + + +========================== +PARAMETER HELPER FUNCTIONS +========================== + +A number of helper functions are provided to help a filesystem or an LSM +process the parameters it is given. + + (*) int lookup_constant(const struct constant_table tbl[], + const char *name, int not_found); + + Look up a constant by name in a table of name -> integer mappings. The + table is an array of elements of the following type: + + struct constant_table { + const char *name; + int value; + }; + + and it must be sorted such that it can be searched using bsearch() using + strcmp(). If a match is found, the corresponding value is returned. If a + match isn't found, the not_found value is returned instead. + + (*) bool validate_constant_table(const struct constant_table *tbl, + size_t tbl_size, + int low, int high, int special); + + Validate a constant table. 
Checks that all the elements are appropriately + ordered, that there are no duplicates and that the values are between low + and high inclusive, though provision is made for one allowable special + value outside of that range. If no special value is required, special + should just be set to lie inside the low-to-high range. + + If all is good, true is returned. If the table is invalid, errors are + logged to dmesg, the stack is dumped and false is returned. + + (*) int fs_parse(struct fs_context *fc, + const struct fs_param_parser *parser, + struct fs_parameter *param, + struct fs_param_parse_result *result); + + This is the main interpreter of parameters. It uses the parameter + description (parser) to look up the name of the parameter to use and to + convert that to a parameter ID (stored in result->key). + + If successful, and if the parameter type indicates the result is a + boolean, integer or enum type, the value is converted by this function and + the result stored in result->{boolean,int_32,uint_32}. + + If a match isn't initially made, the key is prefixed with "no" and no + value is present then an attempt will be made to look up the key with the + prefix removed. If this matches a parameter for which the type has flag + fs_param_neg_with_no set, then a match will be made and the value will be + set to false/0/NULL. + + If the parameter is successfully matched and, optionally, parsed + correctly, 1 is returned. If the parameter isn't matched and + parser->ignore_unknown is set, then 0 is returned. Otherwise -EINVAL is + returned. + + (*) bool fs_validate_description(const struct fs_parameter_description *desc); + + This validates the parameter description. It returns true if the + description is good and false if it is not. + + (*) int fs_lookup_param(struct fs_context *fc, + struct fs_parameter *value, + bool want_bdev, + struct path *_path); + + This takes a parameter that carries a string or filename type and attempts + to do a path lookup on it.
If the parameter expects a blockdev, a check + is made that the inode actually represents one. + + Returns 0 if successful and *_path will be set; returns a negative error + code if not. From patchwork Tue Feb 19 16:34:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820317 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DF0F514E1 for ; Tue, 19 Feb 2019 16:34:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C877C2CC22 for ; Tue, 19 Feb 2019 16:34:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BBD6A2CC32; Tue, 19 Feb 2019 16:34:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4775E2CC22 for ; Tue, 19 Feb 2019 16:34:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729138AbfBSQeF (ORCPT ); Tue, 19 Feb 2019 11:34:05 -0500 Received: from mx1.redhat.com ([209.132.183.28]:45126 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728688AbfBSQeF (ORCPT ); Tue, 19 Feb 2019 11:34:05 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7562AC0E4CE5; Tue, 19 Feb 2019 16:34:04 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by 
smtp.corp.redhat.com (Postfix) with ESMTP id 2E3494C4; Tue, 19 Feb 2019 16:34:02 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 40/43] vfs: Implement logging through fs_context From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:34:00 +0000 Message-ID: <155059404031.12449.2777877374858164989.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Tue, 19 Feb 2019 16:34:04 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Implement the ability for filesystems to log error, warning and informational messages through the fs_context. In the future, these will be extractable by userspace by reading from an fd created by the fsopen() syscall. Error messages are prefixed with "e ", warnings with "w " and informational messages with "i ". In the future, inside the kernel, formatted messages will be malloc'd but unformatted messages will not be copied if they're either in the core .rodata section or in the .rodata section of the filesystem module pinned by fs_context::fs_type. The messages will only be good till the fs_type is released. Note that the logging object will be shared between duplicated fs_context structures.
This is so that filesystems such as NFS, which do a mount within a mount, can get at least some of the errors from the inner mount. Five logging functions are provided for this: (1) void logfc(struct fs_context *fc, const char *fmt, ...); This logs a message into the context. If the buffer is full, the earliest message is discarded. (2) void errorf(fc, fmt, ...); This wraps logfc() to log an error. (3) int invalf(fc, fmt, ...); This wraps errorf() and returns -EINVAL for convenience. (4) void warnf(fc, fmt, ...); This wraps logfc() to log a warning. (5) void infof(fc, fmt, ...); This wraps logfc() to log an informational message. Signed-off-by: David Howells Signed-off-by: Al Viro --- fs/fs_context.c | 30 ++++++++++++++++++++++++++++++ include/linux/fs_context.h | 18 ++++++++++++++---- 2 files changed, 44 insertions(+), 4 deletions(-) diff --git a/fs/fs_context.c b/fs/fs_context.c index 57f61833ac83..87e3546b9a52 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -378,6 +378,36 @@ struct fs_context *vfs_dup_fs_context(struct fs_context *src_fc) } EXPORT_SYMBOL(vfs_dup_fs_context); +#ifdef CONFIG_PRINTK +/** + * logfc - Log a message to a filesystem context + * @fc: The filesystem context to log to. + * @fmt: The format of the buffer. + */ +void logfc(struct fs_context *fc, const char *fmt, ...) +{ + va_list va; + + va_start(va, fmt); + + switch (fmt[0]) { + case 'w': + vprintk_emit(0, LOGLEVEL_WARNING, NULL, 0, fmt, va); + break; + case 'e': + vprintk_emit(0, LOGLEVEL_ERR, NULL, 0, fmt, va); + break; + default: + vprintk_emit(0, LOGLEVEL_NOTICE, NULL, 0, fmt, va); + break; + } + + pr_cont("\n"); + va_end(va); +} +EXPORT_SYMBOL(logfc); +#endif + /** * put_fs_context - Dispose of a superblock configuration context. * @fc: The context to dispose of.
diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index 0db0b645c7b8..eaca452088fa 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -133,7 +133,17 @@ extern int vfs_get_super(struct fs_context *fc, int (*fill_super)(struct super_block *sb, struct fs_context *fc)); -#define logfc(FC, FMT, ...) pr_notice(FMT, ## __VA_ARGS__) +extern const struct file_operations fscontext_fops; + +#ifdef CONFIG_PRINTK +extern __attribute__((format(printf, 2, 3))) +void logfc(struct fs_context *fc, const char *fmt, ...); +#else +static inline __attribute__((format(printf, 2, 3))) +void logfc(struct fs_context *fc, const char *fmt, ...) +{ +} +#endif /** * infof - Store supplementary informational message @@ -143,7 +153,7 @@ extern int vfs_get_super(struct fs_context *fc, * Store the supplementary informational message for the process if the process * has enabled the facility. */ -#define infof(fc, fmt, ...) ({ logfc(fc, fmt, ## __VA_ARGS__); }) +#define infof(fc, fmt, ...) ({ logfc(fc, "i "fmt, ## __VA_ARGS__); }) /** * warnf - Store supplementary warning message @@ -153,7 +163,7 @@ extern int vfs_get_super(struct fs_context *fc, * Store the supplementary warning message for the process if the process has * enabled the facility. */ -#define warnf(fc, fmt, ...) ({ logfc(fc, fmt, ## __VA_ARGS__); }) +#define warnf(fc, fmt, ...) ({ logfc(fc, "w "fmt, ## __VA_ARGS__); }) /** * errorf - Store supplementary error message @@ -163,7 +173,7 @@ extern int vfs_get_super(struct fs_context *fc, * Store the supplementary error message for the process if the process has * enabled the facility. */ -#define errorf(fc, fmt, ...) ({ logfc(fc, fmt, ## __VA_ARGS__); }) +#define errorf(fc, fmt, ...) 
({ logfc(fc, "e "fmt, ## __VA_ARGS__); }) /** * invalf - Store supplementary invalid argument error message From patchwork Tue Feb 19 16:34:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820321 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AD2E41805 for ; Tue, 19 Feb 2019 16:34:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 98C4A2CC22 for ; Tue, 19 Feb 2019 16:34:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8A06D2CC2D; Tue, 19 Feb 2019 16:34:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 291D12CC22 for ; Tue, 19 Feb 2019 16:34:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729195AbfBSQeN (ORCPT ); Tue, 19 Feb 2019 11:34:13 -0500 Received: from mx1.redhat.com ([209.132.183.28]:43776 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728688AbfBSQeN (ORCPT ); Tue, 19 Feb 2019 11:34:13 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 459A2145FB3; Tue, 19 Feb 2019 16:34:13 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7FA9C379F; Tue, 19 Feb 2019 16:34:10 +0000 (UTC) 
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 41/43] vfs: Add some logging to the core users of the fs_context log From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:34:09 +0000 Message-ID: <155059404970.12449.2377788417776946395.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 19 Feb 2019 16:34:13 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Add some logging to the core users of the fs_context log so that information can be extracted from them as to the reason for failure. 
Signed-off-by: David Howells Signed-off-by: Al Viro --- fs/super.c | 4 +++- kernel/cgroup/cgroup-v1.c | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/super.c b/fs/super.c index 0ebb5c11fa56..583a0124bc39 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1467,8 +1467,10 @@ int vfs_get_tree(struct fs_context *fc) struct super_block *sb; int error; - if (fc->fs_type->fs_flags & FS_REQUIRES_DEV && !fc->source) + if (fc->fs_type->fs_flags & FS_REQUIRES_DEV && !fc->source) { + errorf(fc, "Filesystem requires source device"); return -ENOENT; + } if (fc->root) return -EBUSY; diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c index 571ef3447426..c126b34fd4ff 100644 --- a/kernel/cgroup/cgroup-v1.c +++ b/kernel/cgroup/cgroup-v1.c @@ -17,7 +17,7 @@ #include -#define cg_invalf(fc, fmt, ...) ({ pr_err(fmt, ## __VA_ARGS__); -EINVAL; }) +#define cg_invalf(fc, fmt, ...) invalf(fc, fmt, ## __VA_ARGS__) /* * pidlists linger the following amount before being destroyed. The goal From patchwork Tue Feb 19 16:34:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820325 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C16551805 for ; Tue, 19 Feb 2019 16:34:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AC22E2CC22 for ; Tue, 19 Feb 2019 16:34:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9FBCB2CC32; Tue, 19 Feb 2019 16:34:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) 
by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 538B72CC22 for ; Tue, 19 Feb 2019 16:34:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728688AbfBSQeV (ORCPT ); Tue, 19 Feb 2019 11:34:21 -0500 Received: from mx1.redhat.com ([209.132.183.28]:45058 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725811AbfBSQeV (ORCPT ); Tue, 19 Feb 2019 11:34:21 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E1EFE31B33E; Tue, 19 Feb 2019 16:34:20 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 42D6D53; Tue, 19 Feb 2019 16:34:19 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 42/43] afs: Add fs_context support From: David Howells To: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:34:18 +0000 Message-ID: <155059405847.12449.6597467845812805111.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Tue, 19 Feb 2019 16:34:21 +0000 (UTC) Sender: owner-linux-security-module@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Add fs_context support to the AFS filesystem, converting the parameter parsing to store options there. This will form the basis for namespace propagation over mountpoints within the AFS model, thereby allowing AFS to be used in containers more easily. 
Signed-off-by: David Howells Signed-off-by: Al Viro --- fs/afs/internal.h | 8 - fs/afs/mntpt.c | 1 fs/afs/super.c | 460 +++++++++++++++++++++++++++++------------------------ fs/afs/volume.c | 4 4 files changed, 259 insertions(+), 214 deletions(-) diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 8871b9e8645f..3ed0550a2e29 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -36,15 +36,15 @@ struct pagevec; struct afs_call; -struct afs_mount_params { +struct afs_fs_context { bool rwpath; /* T if the parent should be considered R/W */ bool force; /* T to force cell type */ bool autocell; /* T if set auto mount operation */ bool dyn_root; /* T if dynamic root */ + bool no_cell; /* T if the source is "none" (for dynroot) */ afs_voltype_t type; /* type of volume requested */ - int volnamesz; /* size of volume name */ + unsigned int volnamesz; /* size of volume name */ const char *volname; /* name of volume to mount */ - struct net *net_ns; /* Network namespace in effect */ struct afs_net *net; /* the AFS net namespace stuff */ struct afs_cell *cell; /* cell in which to find volume */ struct afs_volume *volume; /* volume record */ @@ -1274,7 +1274,7 @@ static inline struct afs_volume *__afs_get_volume(struct afs_volume *volume) return volume; } -extern struct afs_volume *afs_create_volume(struct afs_mount_params *); +extern struct afs_volume *afs_create_volume(struct afs_fs_context *); extern void afs_activate_volume(struct afs_volume *); extern void afs_deactivate_volume(struct afs_volume *); extern void afs_put_volume(struct afs_cell *, struct afs_volume *); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index 2e51c6994148..b3f41d27590b 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -17,6 +17,7 @@ #include #include #include +#include #include "internal.h" diff --git a/fs/afs/super.c b/fs/afs/super.c index dcd07fe99871..e1a7a8085262 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -1,6 +1,6 @@ /* AFS superblock handling * - * Copyright (c) 2002, 2007 
Red Hat, Inc. All rights reserved. + * Copyright (c) 2002, 2007, 2018 Red Hat, Inc. All rights reserved. * * This software may be freely redistributed under the terms of the * GNU General Public License. @@ -21,7 +21,7 @@ #include #include #include -#include +#include #include #include #include @@ -30,21 +30,22 @@ #include "internal.h" static void afs_i_init_once(void *foo); -static struct dentry *afs_mount(struct file_system_type *fs_type, - int flags, const char *dev_name, void *data); static void afs_kill_super(struct super_block *sb); static struct inode *afs_alloc_inode(struct super_block *sb); static void afs_destroy_inode(struct inode *inode); static int afs_statfs(struct dentry *dentry, struct kstatfs *buf); static int afs_show_devname(struct seq_file *m, struct dentry *root); static int afs_show_options(struct seq_file *m, struct dentry *root); +static int afs_init_fs_context(struct fs_context *fc); +static const struct fs_parameter_description afs_fs_parameters; struct file_system_type afs_fs_type = { - .owner = THIS_MODULE, - .name = "afs", - .mount = afs_mount, - .kill_sb = afs_kill_super, - .fs_flags = 0, + .owner = THIS_MODULE, + .name = "afs", + .init_fs_context = afs_init_fs_context, + .parameters = &afs_fs_parameters, + .kill_sb = afs_kill_super, + .fs_flags = 0, }; MODULE_ALIAS_FS("afs"); @@ -63,22 +64,28 @@ static const struct super_operations afs_super_ops = { static struct kmem_cache *afs_inode_cachep; static atomic_t afs_count_active_inodes; -enum { - afs_no_opt, - afs_opt_cell, - afs_opt_dyn, - afs_opt_rwpath, - afs_opt_vol, - afs_opt_autocell, +enum afs_param { + Opt_autocell, + Opt_cell, + Opt_dyn, + Opt_rwpath, + Opt_source, + Opt_vol, }; -static const match_table_t afs_options_list = { - { afs_opt_cell, "cell=%s" }, - { afs_opt_dyn, "dyn" }, - { afs_opt_rwpath, "rwpath" }, - { afs_opt_vol, "vol=%s" }, - { afs_opt_autocell, "autocell" }, - { afs_no_opt, NULL }, +static const struct fs_parameter_spec afs_param_specs[] = { + fsparam_flag 
("autocell", Opt_autocell), + fsparam_string("cell", Opt_cell), + fsparam_flag ("dyn", Opt_dyn), + fsparam_flag ("rwpath", Opt_rwpath), + fsparam_string("source", Opt_source), + fsparam_string("vol", Opt_vol), + {} +}; + +static const struct fs_parameter_description afs_fs_parameters = { + .name = "kAFS", + .specs = afs_param_specs, }; /* @@ -190,71 +197,10 @@ static int afs_show_options(struct seq_file *m, struct dentry *root) } /* - * parse the mount options - * - this function has been shamelessly adapted from the ext3 fs which - * shamelessly adapted it from the msdos fs - */ -static int afs_parse_options(struct afs_mount_params *params, - char *options, const char **devname) -{ - struct afs_cell *cell; - substring_t args[MAX_OPT_ARGS]; - char *p; - int token; - - _enter("%s", options); - - options[PAGE_SIZE - 1] = 0; - - while ((p = strsep(&options, ","))) { - if (!*p) - continue; - - token = match_token(p, afs_options_list, args); - switch (token) { - case afs_opt_cell: - rcu_read_lock(); - cell = afs_lookup_cell_rcu(params->net, - args[0].from, - args[0].to - args[0].from); - rcu_read_unlock(); - if (IS_ERR(cell)) - return PTR_ERR(cell); - afs_put_cell(params->net, params->cell); - params->cell = cell; - break; - - case afs_opt_rwpath: - params->rwpath = true; - break; - - case afs_opt_vol: - *devname = args[0].from; - break; - - case afs_opt_autocell: - params->autocell = true; - break; - - case afs_opt_dyn: - params->dyn_root = true; - break; - - default: - printk(KERN_ERR "kAFS:" - " Unknown or invalid mount option: '%s'\n", p); - return -EINVAL; - } - } - - _leave(" = 0"); - return 0; -} - -/* - * parse a device name to get cell name, volume name, volume type and R/W - * selector - * - this can be one of the following: + * Parse the source name to get cell name, volume name, volume type and R/W + * selector. 
+ * + * This can be one of the following: * "%[cell:]volume[.]" R/W volume * "#[cell:]volume[.]" R/O or R/W volume (rwpath=0), * or R/W (rwpath=1) volume @@ -263,11 +209,11 @@ static int afs_parse_options(struct afs_mount_params *params, * "%[cell:]volume.backup" Backup volume * "#[cell:]volume.backup" Backup volume */ -static int afs_parse_device_name(struct afs_mount_params *params, - const char *name) +static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) { + struct afs_fs_context *ctx = fc->fs_private; struct afs_cell *cell; - const char *cellname, *suffix; + const char *cellname, *suffix, *name = param->string; int cellnamesz; _enter(",%s", name); @@ -278,69 +224,174 @@ static int afs_parse_device_name(struct afs_mount_params *params, } if ((name[0] != '%' && name[0] != '#') || !name[1]) { + /* To use dynroot, we don't want to have to provide a source */ + if (strcmp(name, "none") == 0) { + ctx->no_cell = true; + return 0; + } printk(KERN_ERR "kAFS: unparsable volume name\n"); return -EINVAL; } /* determine the type of volume we're looking for */ - params->type = AFSVL_ROVOL; - params->force = false; - if (params->rwpath || name[0] == '%') { - params->type = AFSVL_RWVOL; - params->force = true; + ctx->type = AFSVL_ROVOL; + ctx->force = false; + if (ctx->rwpath || name[0] == '%') { + ctx->type = AFSVL_RWVOL; + ctx->force = true; } name++; /* split the cell name out if there is one */ - params->volname = strchr(name, ':'); - if (params->volname) { + ctx->volname = strchr(name, ':'); + if (ctx->volname) { cellname = name; - cellnamesz = params->volname - name; - params->volname++; + cellnamesz = ctx->volname - name; + ctx->volname++; } else { - params->volname = name; + ctx->volname = name; cellname = NULL; cellnamesz = 0; } /* the volume type is further affected by a possible suffix */ - suffix = strrchr(params->volname, '.'); + suffix = strrchr(ctx->volname, '.'); if (suffix) { if (strcmp(suffix, ".readonly") == 0) { - params->type = 
AFSVL_ROVOL; - params->force = true; + ctx->type = AFSVL_ROVOL; + ctx->force = true; } else if (strcmp(suffix, ".backup") == 0) { - params->type = AFSVL_BACKVOL; - params->force = true; + ctx->type = AFSVL_BACKVOL; + ctx->force = true; } else if (suffix[1] == 0) { } else { suffix = NULL; } } - params->volnamesz = suffix ? - suffix - params->volname : strlen(params->volname); + ctx->volnamesz = suffix ? + suffix - ctx->volname : strlen(ctx->volname); _debug("cell %*.*s [%p]", - cellnamesz, cellnamesz, cellname ?: "", params->cell); + cellnamesz, cellnamesz, cellname ?: "", ctx->cell); /* lookup the cell record */ - if (cellname || !params->cell) { - cell = afs_lookup_cell(params->net, cellname, cellnamesz, + if (cellname) { + cell = afs_lookup_cell(ctx->net, cellname, cellnamesz, NULL, false); if (IS_ERR(cell)) { - printk(KERN_ERR "kAFS: unable to lookup cell '%*.*s'\n", + pr_err("kAFS: unable to lookup cell '%*.*s'\n", cellnamesz, cellnamesz, cellname ?: ""); return PTR_ERR(cell); } - afs_put_cell(params->net, params->cell); - params->cell = cell; + afs_put_cell(ctx->net, ctx->cell); + ctx->cell = cell; } _debug("CELL:%s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s", - params->cell->name, params->cell, - params->volnamesz, params->volnamesz, params->volname, - suffix ?: "-", params->type, params->force ? " FORCE" : ""); + ctx->cell->name, ctx->cell, + ctx->volnamesz, ctx->volnamesz, ctx->volname, + suffix ?: "-", ctx->type, ctx->force ? " FORCE" : ""); + + fc->source = param->string; + param->string = NULL; + return 0; +} + +/* + * Parse a single mount parameter. 
+ */ +static int afs_parse_param(struct fs_context *fc, struct fs_parameter *param) +{ + struct fs_parse_result result; + struct afs_fs_context *ctx = fc->fs_private; + struct afs_cell *cell; + int opt; + + opt = fs_parse(fc, &afs_fs_parameters, param, &result); + if (opt < 0) + return opt; + + switch (opt) { + case Opt_cell: + if (param->size <= 0) + return -EINVAL; + if (param->size > AFS_MAXCELLNAME) + return -ENAMETOOLONG; + + rcu_read_lock(); + cell = afs_lookup_cell_rcu(ctx->net, param->string, param->size); + rcu_read_unlock(); + if (IS_ERR(cell)) + return PTR_ERR(cell); + afs_put_cell(ctx->net, ctx->cell); + ctx->cell = cell; + break; + + case Opt_source: + return afs_parse_source(fc, param); + + case Opt_autocell: + ctx->autocell = true; + break; + + case Opt_dyn: + ctx->dyn_root = true; + break; + + case Opt_rwpath: + ctx->rwpath = true; + break; + + case Opt_vol: + return invalf(fc, "'vol' param is obsolete"); + + default: + return -EINVAL; + } + + _leave(" = 0"); + return 0; +} + +/* + * Validate the options, get the cell key and look up the volume. + */ +static int afs_validate_fc(struct fs_context *fc) +{ + struct afs_fs_context *ctx = fc->fs_private; + struct afs_volume *volume; + struct key *key; + + if (!ctx->dyn_root) { + if (ctx->no_cell) { + pr_warn("kAFS: Can only specify source 'none' with -o dyn\n"); + return -EINVAL; + } + + if (!ctx->cell) { + pr_warn("kAFS: No cell specified\n"); + return -EDESTADDRREQ; + } + + /* We try to do the mount securely. 
*/ + key = afs_request_key(ctx->cell); + if (IS_ERR(key)) + return PTR_ERR(key); + + ctx->key = key; + + if (ctx->volume) { + afs_put_volume(ctx->cell, ctx->volume); + ctx->volume = NULL; + } + + volume = afs_create_volume(ctx); + if (IS_ERR(volume)) + return PTR_ERR(volume); + + ctx->volume = volume; + } return 0; } @@ -348,39 +399,34 @@ static int afs_parse_device_name(struct afs_mount_params *params, /* * check a superblock to see if it's the one we're looking for */ -static int afs_test_super(struct super_block *sb, void *data) +static int afs_test_super(struct super_block *sb, struct fs_context *fc) { - struct afs_super_info *as1 = data; + struct afs_fs_context *ctx = fc->fs_private; struct afs_super_info *as = AFS_FS_S(sb); - return (as->net_ns == as1->net_ns && + return (as->net_ns == fc->net_ns && as->volume && - as->volume->vid == as1->volume->vid && + as->volume->vid == ctx->volume->vid && !as->dyn_root); } -static int afs_dynroot_test_super(struct super_block *sb, void *data) +static int afs_dynroot_test_super(struct super_block *sb, struct fs_context *fc) { - struct afs_super_info *as1 = data; struct afs_super_info *as = AFS_FS_S(sb); - return (as->net_ns == as1->net_ns && + return (as->net_ns == fc->net_ns && as->dyn_root); } -static int afs_set_super(struct super_block *sb, void *data) +static int afs_set_super(struct super_block *sb, struct fs_context *fc) { - struct afs_super_info *as = data; - - sb->s_fs_info = as; return set_anon_super(sb, NULL); } /* * fill in the superblock */ -static int afs_fill_super(struct super_block *sb, - struct afs_mount_params *params) +static int afs_fill_super(struct super_block *sb, struct afs_fs_context *ctx) { struct afs_super_info *as = AFS_FS_S(sb); struct afs_fid fid; @@ -412,13 +458,13 @@ static int afs_fill_super(struct super_block *sb, fid.vnode = 1; fid.vnode_hi = 0; fid.unique = 1; - inode = afs_iget(sb, params->key, &fid, NULL, NULL, NULL); + inode = afs_iget(sb, ctx->key, &fid, NULL, NULL, NULL); } if 
(IS_ERR(inode)) return PTR_ERR(inode); - if (params->autocell || params->dyn_root) + if (ctx->autocell || as->dyn_root) set_bit(AFS_VNODE_AUTOCELL, &AFS_FS_I(inode)->flags); ret = -ENOMEM; @@ -443,17 +489,20 @@ static int afs_fill_super(struct super_block *sb, return ret; } -static struct afs_super_info *afs_alloc_sbi(struct afs_mount_params *params) +static struct afs_super_info *afs_alloc_sbi(struct fs_context *fc) { + struct afs_fs_context *ctx = fc->fs_private; struct afs_super_info *as; as = kzalloc(sizeof(struct afs_super_info), GFP_KERNEL); if (as) { - as->net_ns = get_net(params->net_ns); - if (params->dyn_root) + as->net_ns = get_net(fc->net_ns); + if (ctx->dyn_root) { as->dyn_root = true; - else - as->cell = afs_get_cell(params->cell); + } else { + as->cell = afs_get_cell(ctx->cell); + as->volume = __afs_get_volume(ctx->volume); + } } return as; } @@ -475,7 +524,7 @@ static void afs_kill_super(struct super_block *sb) if (as->dyn_root) afs_dynroot_depopulate(sb); - + /* Clear the callback interests (which will do ilookup5) before * deactivating the superblock. */ @@ -488,111 +537,106 @@ static void afs_kill_super(struct super_block *sb) } /* - * get an AFS superblock + * Get an AFS superblock and root directory. 
*/ -static struct dentry *afs_mount(struct file_system_type *fs_type, - int flags, const char *dev_name, void *options) +static int afs_get_tree(struct fs_context *fc) { - struct afs_mount_params params; + struct afs_fs_context *ctx = fc->fs_private; struct super_block *sb; - struct afs_volume *candidate; - struct key *key; struct afs_super_info *as; int ret; - _enter(",,%s,%p", dev_name, options); - - memset(&params, 0, sizeof(params)); - - ret = -EINVAL; - if (current->nsproxy->net_ns != &init_net) + ret = afs_validate_fc(fc); + if (ret) goto error; - params.net_ns = current->nsproxy->net_ns; - params.net = afs_net(params.net_ns); - - /* parse the options and device name */ - if (options) { - ret = afs_parse_options(&params, options, &dev_name); - if (ret < 0) - goto error; - } - - if (!params.dyn_root) { - ret = afs_parse_device_name(&params, dev_name); - if (ret < 0) - goto error; - /* try and do the mount securely */ - key = afs_request_key(params.cell); - if (IS_ERR(key)) { - _leave(" = %ld [key]", PTR_ERR(key)); - ret = PTR_ERR(key); - goto error; - } - params.key = key; - } + _enter(""); /* allocate a superblock info record */ ret = -ENOMEM; - as = afs_alloc_sbi(&params); + as = afs_alloc_sbi(fc); if (!as) - goto error_key; - - if (!params.dyn_root) { - /* Assume we're going to need a volume record; at the very - * least we can use it to update the volume record if we have - * one already. This checks that the volume exists within the - * cell. - */ - candidate = afs_create_volume(&params); - if (IS_ERR(candidate)) { - ret = PTR_ERR(candidate); - goto error_as; - } - - as->volume = candidate; - } + goto error; + fc->s_fs_info = as; /* allocate a deviceless superblock */ - sb = sget(fs_type, - as->dyn_root ? afs_dynroot_test_super : afs_test_super, - afs_set_super, flags, as); + sb = sget_fc(fc, + as->dyn_root ?
afs_dynroot_test_super : afs_test_super, + afs_set_super); if (IS_ERR(sb)) { ret = PTR_ERR(sb); - goto error_as; + goto error; } if (!sb->s_root) { /* initial superblock/root creation */ _debug("create"); - ret = afs_fill_super(sb, &params); + ret = afs_fill_super(sb, ctx); if (ret < 0) goto error_sb; - as = NULL; sb->s_flags |= SB_ACTIVE; } else { _debug("reuse"); ASSERTCMP(sb->s_flags, &, SB_ACTIVE); - afs_destroy_sbi(as); - as = NULL; } - afs_put_cell(params.net, params.cell); - key_put(params.key); + fc->root = dget(sb->s_root); _leave(" = 0 [%p]", sb); - return dget(sb->s_root); + return 0; error_sb: deactivate_locked_super(sb); - goto error_key; -error_as: - afs_destroy_sbi(as); -error_key: - key_put(params.key); error: - afs_put_cell(params.net, params.cell); _leave(" = %d", ret); - return ERR_PTR(ret); + return ret; +} + +static void afs_free_fc(struct fs_context *fc) +{ + struct afs_fs_context *ctx = fc->fs_private; + + afs_destroy_sbi(fc->s_fs_info); + afs_put_volume(ctx->cell, ctx->volume); + afs_put_cell(ctx->net, ctx->cell); + key_put(ctx->key); + kfree(ctx); +} + +static const struct fs_context_operations afs_context_ops = { + .free = afs_free_fc, + .parse_param = afs_parse_param, + .get_tree = afs_get_tree, +}; + +/* + * Set up the filesystem mount context. + */ +static int afs_init_fs_context(struct fs_context *fc) +{ + struct afs_fs_context *ctx; + struct afs_cell *cell; + + if (current->nsproxy->net_ns != &init_net) + return -EINVAL; + + ctx = kzalloc(sizeof(struct afs_fs_context), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + ctx->type = AFSVL_ROVOL; + ctx->net = afs_net(fc->net_ns); + + /* Default to the workstation cell.
*/ + rcu_read_lock(); + cell = afs_lookup_cell_rcu(ctx->net, NULL, 0); + rcu_read_unlock(); + if (IS_ERR(cell)) + cell = NULL; + ctx->cell = cell; + + fc->fs_private = ctx; + fc->ops = &afs_context_ops; + return 0; } /* diff --git a/fs/afs/volume.c b/fs/afs/volume.c index 00975ed3640f..f6eba2def0a1 100644 --- a/fs/afs/volume.c +++ b/fs/afs/volume.c @@ -21,7 +21,7 @@ static const char *const afs_voltypes[] = { "R/W", "R/O", "BAK" }; /* * Allocate a volume record and load it up from a vldb record. */ -static struct afs_volume *afs_alloc_volume(struct afs_mount_params *params, +static struct afs_volume *afs_alloc_volume(struct afs_fs_context *params, struct afs_vldb_entry *vldb, unsigned long type_mask) { @@ -113,7 +113,7 @@ static struct afs_vldb_entry *afs_vl_lookup_vldb(struct afs_cell *cell, * - Rule 3: If parent volume is R/W, then only mount R/W volume unless * explicitly told otherwise */ -struct afs_volume *afs_create_volume(struct afs_mount_params *params) +struct afs_volume *afs_create_volume(struct afs_fs_context *params) { struct afs_vldb_entry *vldb; struct afs_volume *volume; From patchwork Tue Feb 19 16:34:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10820329 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E2C2A1390 for ; Tue, 19 Feb 2019 16:34:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CA6742CC5A for ; Tue, 19 Feb 2019 16:34:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BC27C2CC79; Tue, 19 Feb 2019 16:34:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 
tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Subject: [PATCH 43/43] afs: Use fs_context to pass parameters over automount From: David Howells To: viro@zeniv.linux.org.uk Cc: "Eric W.
Biederman" , linux-fsdevel@vger.kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, ebiederm@xmission.com, linux-security-module@vger.kernel.org Date: Tue, 19 Feb 2019 16:34:26 +0000 Message-ID: <155059406610.12449.7679699218753072978.stgit@warthog.procyon.org.uk> In-Reply-To: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> References: <155059366914.12449.4669870128936536848.stgit@warthog.procyon.org.uk> Alter the AFS automounting code to create and modify an fs_context struct when parameterising a new mount triggered by an AFS mountpoint, rather than constructing device name and option strings. Also remove the cell=, vol= and rwpath options, as they are then redundant. They existed because the 'device name' may be derived literally from a mountpoint object in the filesystem, so default cell and parent-type information needed to be passed in by some other method from the automount routines. The vol= option didn't end up being used. Signed-off-by: David Howells cc: Eric W.
Biederman Signed-off-by: Al Viro --- fs/afs/internal.h | 1 fs/afs/mntpt.c | 148 ++++++++++++++++++++++++++++------------------------- fs/afs/super.c | 40 +------------- 3 files changed, 80 insertions(+), 109 deletions(-) diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 3ed0550a2e29..bb1f244b2b3a 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -37,7 +37,6 @@ struct pagevec; struct afs_call; struct afs_fs_context { - bool rwpath; /* T if the parent should be considered R/W */ bool force; /* T to force cell type */ bool autocell; /* T if set auto mount operation */ bool dyn_root; /* T if dynamic root */ diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index b3f41d27590b..eecd8b699186 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -48,6 +48,8 @@ static DECLARE_DELAYED_WORK(afs_mntpt_expiry_timer, afs_mntpt_expiry_timed_out); static unsigned long afs_mntpt_expiry_timeout = 10 * 60; +static const char afs_root_volume[] = "root.cell"; + /* * no valid lookup procedure on this sort of dir */ @@ -69,108 +71,112 @@ static int afs_mntpt_open(struct inode *inode, struct file *file) } /* - * create a vfsmount to be automounted + * Set the parameters for the proposed superblock. 
*/ -static struct vfsmount *afs_mntpt_do_automount(struct dentry *mntpt) +static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt) { - struct afs_super_info *as; - struct vfsmount *mnt; - struct afs_vnode *vnode; - struct page *page; - char *devname, *options; - bool rwpath = false; + struct afs_fs_context *ctx = fc->fs_private; + struct afs_super_info *src_as = AFS_FS_S(mntpt->d_sb); + struct afs_vnode *vnode = AFS_FS_I(d_inode(mntpt)); + struct afs_cell *cell; + const char *p; int ret; - _enter("{%pd}", mntpt); - - BUG_ON(!d_inode(mntpt)); - - ret = -ENOMEM; - devname = (char *) get_zeroed_page(GFP_KERNEL); - if (!devname) - goto error_no_devname; - - options = (char *) get_zeroed_page(GFP_KERNEL); - if (!options) - goto error_no_options; + if (fc->net_ns != src_as->net_ns) { + put_net(fc->net_ns); + fc->net_ns = get_net(src_as->net_ns); + } - vnode = AFS_FS_I(d_inode(mntpt)); + if (src_as->volume && src_as->volume->type == AFSVL_RWVOL) { + ctx->type = AFSVL_RWVOL; + ctx->force = true; + } + if (ctx->cell) { + afs_put_cell(ctx->net, ctx->cell); + ctx->cell = NULL; + } if (test_bit(AFS_VNODE_PSEUDODIR, &vnode->flags)) { /* if the directory is a pseudo directory, use the d_name */ - static const char afs_root_cell[] = ":root.cell."; unsigned size = mntpt->d_name.len; - ret = -ENOENT; - if (size < 2 || size > AFS_MAXCELLNAME) - goto error_no_page; + if (size < 2) + return -ENOENT; + p = mntpt->d_name.name; if (mntpt->d_name.name[0] == '.') { - devname[0] = '%'; - memcpy(devname + 1, mntpt->d_name.name + 1, size - 1); - memcpy(devname + size, afs_root_cell, - sizeof(afs_root_cell)); - rwpath = true; - } else { - devname[0] = '#'; - memcpy(devname + 1, mntpt->d_name.name, size); - memcpy(devname + size + 1, afs_root_cell, - sizeof(afs_root_cell)); + size--; + p++; + ctx->type = AFSVL_RWVOL; + ctx->force = true; } + if (size > AFS_MAXCELLNAME) + return -ENAMETOOLONG; + + cell = afs_lookup_cell(ctx->net, p, size, NULL, false); + if (IS_ERR(cell)) { + 
pr_err("kAFS: unable to lookup cell '%pd'\n", mntpt); + return PTR_ERR(cell); + } + ctx->cell = cell; + + ctx->volname = afs_root_volume; + ctx->volnamesz = sizeof(afs_root_volume) - 1; } else { /* read the contents of the AFS special symlink */ + struct page *page; loff_t size = i_size_read(d_inode(mntpt)); char *buf; - ret = -EINVAL; + if (src_as->cell) + ctx->cell = afs_get_cell(src_as->cell); + if (size > PAGE_SIZE - 1) - goto error_no_page; + return -EINVAL; page = read_mapping_page(d_inode(mntpt)->i_mapping, 0, NULL); - if (IS_ERR(page)) { - ret = PTR_ERR(page); - goto error_no_page; - } + if (IS_ERR(page)) + return PTR_ERR(page); if (PageError(page)) { ret = afs_bad(AFS_FS_I(d_inode(mntpt)), afs_file_error_mntpt); - goto error; + put_page(page); + return ret; } - buf = kmap_atomic(page); - memcpy(devname, buf, size); - kunmap_atomic(buf); + buf = kmap(page); + ret = vfs_parse_fs_string(fc, "source", buf, size); + kunmap(page); put_page(page); - page = NULL; + if (ret < 0) + return ret; } - /* work out what options we want */ - as = AFS_FS_S(mntpt->d_sb); - if (as->cell) { - memcpy(options, "cell=", 5); - strcpy(options + 5, as->cell->name); - if ((as->volume && as->volume->type == AFSVL_RWVOL) || rwpath) - strcat(options, ",rwpath"); - } + return 0; +} - /* try and do the mount */ - _debug("--- attempting mount %s -o %s ---", devname, options); - mnt = vfs_submount(mntpt, &afs_fs_type, devname, options); - _debug("--- mount result %p ---", mnt); +/* + * create a vfsmount to be automounted + */ +static struct vfsmount *afs_mntpt_do_automount(struct dentry *mntpt) +{ + struct fs_context *fc; + struct vfsmount *mnt; + int ret; - free_page((unsigned long) devname); - free_page((unsigned long) options); - _leave(" = %p", mnt); - return mnt; + BUG_ON(!d_inode(mntpt)); -error: - put_page(page); -error_no_page: - free_page((unsigned long) options); -error_no_options: - free_page((unsigned long) devname); -error_no_devname: - _leave(" = %d", ret); - return 
ERR_PTR(ret); + fc = fs_context_for_submount(&afs_fs_type, mntpt); + if (IS_ERR(fc)) + return ERR_CAST(fc); + + ret = afs_mntpt_set_params(fc, mntpt); + if (!ret) + mnt = fc_mount(fc); + else + mnt = ERR_PTR(ret); + + put_fs_context(fc); + return mnt; } /* diff --git a/fs/afs/super.c b/fs/afs/super.c index e1a7a8085262..a07af1ab488d 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -66,20 +66,14 @@ static atomic_t afs_count_active_inodes; enum afs_param { Opt_autocell, - Opt_cell, Opt_dyn, - Opt_rwpath, Opt_source, - Opt_vol, }; static const struct fs_parameter_spec afs_param_specs[] = { fsparam_flag ("autocell", Opt_autocell), - fsparam_string("cell", Opt_cell), fsparam_flag ("dyn", Opt_dyn), - fsparam_flag ("rwpath", Opt_rwpath), fsparam_string("source", Opt_source), - fsparam_string("vol", Opt_vol), {} }; @@ -202,8 +196,8 @@ static int afs_show_options(struct seq_file *m, struct dentry *root) * * This can be one of the following: * "%[cell:]volume[.]" R/W volume - * "#[cell:]volume[.]" R/O or R/W volume (rwpath=0), - * or R/W (rwpath=1) volume + * "#[cell:]volume[.]" R/O or R/W volume (R/O parent), + * or R/W (R/W parent) volume * "%[cell:]volume.readonly" R/O volume * "#[cell:]volume.readonly" R/O volume * "%[cell:]volume.backup" Backup volume @@ -234,9 +228,7 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) } /* determine the type of volume we're looking for */ - ctx->type = AFSVL_ROVOL; - ctx->force = false; - if (ctx->rwpath || name[0] == '%') { + if (name[0] == '%') { ctx->type = AFSVL_RWVOL; ctx->force = true; } @@ -305,7 +297,6 @@ static int afs_parse_param(struct fs_context *fc, struct fs_parameter *param) { struct fs_parse_result result; struct afs_fs_context *ctx = fc->fs_private; - struct afs_cell *cell; int opt; opt = fs_parse(fc, &afs_fs_parameters, param, &result); @@ -313,21 +304,6 @@ static int afs_parse_param(struct fs_context *fc, struct fs_parameter *param) return opt; switch (opt) { - case Opt_cell: - if 
(param->size <= 0) - return -EINVAL; - if (param->size > AFS_MAXCELLNAME) - return -ENAMETOOLONG; - - rcu_read_lock(); - cell = afs_lookup_cell_rcu(ctx->net, param->string, param->size); - rcu_read_unlock(); - if (IS_ERR(cell)) - return PTR_ERR(cell); - afs_put_cell(ctx->net, ctx->cell); - ctx->cell = cell; - break; - case Opt_source: return afs_parse_source(fc, param); @@ -339,13 +315,6 @@ static int afs_parse_param(struct fs_context *fc, struct fs_parameter *param) ctx->dyn_root = true; break; - case Opt_rwpath: - ctx->rwpath = true; - break; - - case Opt_vol: - return invalf(fc, "'vol' param is obsolete"); - default: return -EINVAL; } @@ -616,9 +585,6 @@ static int afs_init_fs_context(struct fs_context *fc) struct afs_fs_context *ctx; struct afs_cell *cell; - if (current->nsproxy->net_ns != &init_net) - return -EINVAL; - ctx = kzalloc(sizeof(struct afs_fs_context), GFP_KERNEL); if (!ctx) return -ENOMEM;