From patchwork Wed May 4 14:26:47 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Djalal Harouni
X-Patchwork-Id: 9015301
From: Djalal Harouni
To: Alexander Viro, Chris Mason, Serge Hallyn, Josh Triplett,
 "Eric W. Biederman", Andy Lutomirski, Seth Forshee,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-security-module@vger.kernel.org, Dongsu Park, David Herrmann,
 Miklos Szeredi, Alban Crequy
Cc: Djalal Harouni, Djalal Harouni
Subject: [RFC v2 PATCH 1/8] VFS: add CLONE_MNTNS_SHIFT_UIDGID flag to allow
 mounts to shift their UIDs/GIDs
Date: Wed, 4 May 2016 16:26:47 +0200
Message-Id: <1462372014-3786-2-git-send-email-tixxdz@gmail.com>
X-Mailer: git-send-email 2.5.5
In-Reply-To: <1462372014-3786-1-git-send-email-tixxdz@gmail.com>
References: <1462372014-3786-1-git-send-email-tixxdz@gmail.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Add the CLONE_MNTNS_SHIFT_UIDGID flag, a mount namespace flag. When it is
set, mount points on filesystems that support UID/GID shifts will have
their UIDs and GIDs shifted by the VFS. The UID and GID mapping rules are
per mount namespace; they follow the rules of the user namespace of the
containing mount namespace.

The UID/GID of inodes are supposed to always contain the on-disk values.
Hence the shift is done inside the VFS, and it is a read-side shift
applied when we access the inodes.

This is a preparation patch.

Goal:

/* (1) */
clone4(CLONE_NEWNS|CLONE_MNTNS_SHIFT_UIDGID, ...)

/*
 * Set up the container base mount namespace and rootfs, and mount all
 * necessary mount points and filesystems that can't be mounted in user
 * namespaces. Filesystems that support uid/gid shifts should set the
 * mount parameters.
 *
 *   mount(..., mount_options=[vfs_shift_uids, vfs_shift_gids])
 */

/* (2) */
/*
 * Set up new mount and user namespaces and inherit the
 * CLONE_MNTNS_SHIFT_UIDGID flag from (1) into the new mount
 * namespace (2).
 */
clone4(CLONE_NEWUSER|CLONE_NEWNS|CLONE_MNTNS_SHIFT_UIDGID, ...)

/*
 * inodes of mount points here that support UID/GID shifts will
 * automatically have their UID/GID shifted according to the user
 * namespace rules of the current mount namespace (2).
 */

We create the new user and mount namespaces where:

1) The mount namespace allows mounts inside it that support UID and GID
   shifting to perform the shifts if CLONE_MNTNS_SHIFT_UIDGID is set in
   the current mount namespace.

2) The UID and GID mapping is done according to the rules of the user
   namespace of the containing mount namespace.

CLONE_MNTNS_SHIFT_UIDGID follows the CLONE_NEWUSER|CLONE_NEWNS
combination. This ensures that only the creator of the mount namespace is
able to adjust the user namespace mapping rules.

The flag CLONE_MNTNS_SHIFT_UIDGID can be set on the mount namespace only
if at least one of the following holds:

1) The parent namespace already has CLONE_MNTNS_SHIFT_UIDGID set on its
   mount namespace.
2) The caller has CAP_SYS_ADMIN in init_user_ns. Since we start from that
   namespace and inherit some mount points, we have to protect files from
   a privileged user namespace doing:

   clone(CLONE_NEWUSER|CLONE_NEWNS|CLONE_MNTNS_SHIFT_UIDGID...)

   This is blocked.

If a filesystem was mounted with "vfs_shift_uids" and "vfs_shift_gids" and
shows up in a mount namespace that does not have CLONE_MNTNS_SHIFT_UIDGID
set, then no shift is done. UIDs and GIDs will not be changed at all, and
things will continue to work as they do now.

Signed-off-by: Dongsu Park
Signed-off-by: Djalal Harouni
---
 fs/mount.h                 |  1 +
 fs/namespace.c             | 20 ++++++++++++++++++++
 include/uapi/linux/sched.h |  1 +
 kernel/fork.c              |  4 ++++
 4 files changed, 26 insertions(+)

diff --git a/fs/mount.h b/fs/mount.h
index 14db05d..1e317eb 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -6,6 +6,7 @@
 
 struct mnt_namespace {
 	atomic_t		count;
+	int			flags;
 	struct ns_common	ns;
 	struct mount *	root;
 	struct list_head	list;
diff --git a/fs/namespace.c b/fs/namespace.c
index 4fb1691..940ecfc 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2774,6 +2774,7 @@ static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns)
 	INIT_LIST_HEAD(&new_ns->list);
 	init_waitqueue_head(&new_ns->poll);
 	new_ns->event = 0;
+	new_ns->flags = 0;
 	new_ns->user_ns = get_user_ns(user_ns);
 	return new_ns;
 }
@@ -2801,6 +2802,25 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
 	if (IS_ERR(new_ns))
 		return new_ns;
 
+	if (flags & CLONE_MNTNS_SHIFT_UIDGID) {
+		/*
+		 * If parent has the CLONE_MNTNS_SHIFT_UIDGID flag set
+		 * or current is capable in init_user_ns, then we set the
+		 * CLONE_MNTNS_SHIFT_UIDGID flag and allow mounts inside
+		 * this namespace to shift their UID and GID.
+		 *
+		 * We check the init_user_ns here since we always start from
+		 * that user namespace and mounts are by default available to all
+		 * users. In this regard, only CAP_SYS_ADMIN in init_user_ns is
+		 * allowed to start and propagate the CLONE_MNTNS_SHIFT_UIDGID
+		 * flag to new mount namespaces.
+		 */
+		if ((ns->flags & CLONE_MNTNS_SHIFT_UIDGID) || capable(CAP_SYS_ADMIN))
+			new_ns->flags |= CLONE_MNTNS_SHIFT_UIDGID;
+		else
+			return ERR_PTR(-EPERM);
+	}
+
 	namespace_lock();
 	/* First pass: copy the tree topology */
 	copy_flags = CL_COPY_UNBINDABLE | CL_EXPIRE;
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 5f0fe01..9ba2124 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -19,6 +19,7 @@
 #define CLONE_PARENT_SETTID	0x00100000	/* set the TID in the parent */
 #define CLONE_CHILD_CLEARTID	0x00200000	/* clear the TID in the child */
 #define CLONE_DETACHED		0x00400000	/* Unused, ignored */
+#define CLONE_MNTNS_SHIFT_UIDGID	0x00400000	/* If set allows to shift UID and GID for mounts that support it */
 #define CLONE_UNTRACED		0x00800000	/* set if the tracing process can't force CLONE_PTRACE on this clone */
 #define CLONE_CHILD_SETTID	0x01000000	/* set the TID in the child */
 #define CLONE_NEWCGROUP		0x02000000	/* New cgroup namespace */
diff --git a/kernel/fork.c b/kernel/fork.c
index d277e83..41223cd 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1264,6 +1264,10 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	if ((clone_flags & (CLONE_NEWNS|CLONE_FS)) == (CLONE_NEWNS|CLONE_FS))
 		return ERR_PTR(-EINVAL);
 
+	if ((clone_flags & CLONE_MNTNS_SHIFT_UIDGID) &&
+				!(clone_flags & CLONE_NEWNS))
+		return ERR_PTR(-EINVAL);
+
 	if ((clone_flags & (CLONE_NEWUSER|CLONE_FS)) == (CLONE_NEWUSER|CLONE_FS))
 		return ERR_PTR(-EINVAL);
 
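
For review purposes, the two checks this patch adds (the -EINVAL check in copy_process() and the -EPERM check in copy_mnt_ns()) can be modelled together in plain userspace C. This is only a sketch, not kernel code: the function names model_mntns_shift_check() and model_self_check() are hypothetical, the has_cap_sys_admin parameter is an assumed stand-in for capable(CAP_SYS_ADMIN), and the flag values are copied from the sched.h hunk above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Values as they appear in include/uapi/linux/sched.h with this patch. */
#define CLONE_NEWNS			0x00020000
#define CLONE_MNTNS_SHIFT_UIDGID	0x00400000

/*
 * Hypothetical userspace model of the checks added by the patch.
 *
 * clone_flags:       flags passed to clone()
 * parent_ns_flags:   'flags' field of the parent mount namespace
 * has_cap_sys_admin: assumption standing in for capable(CAP_SYS_ADMIN),
 *                    i.e. CAP_SYS_ADMIN in init_user_ns
 * new_ns_flags:      receives the 'flags' field of the new mount namespace
 *
 * Returns 0 on success, -EINVAL or -EPERM on failure.
 */
static int model_mntns_shift_check(unsigned long clone_flags,
				   int parent_ns_flags,
				   bool has_cap_sys_admin,
				   int *new_ns_flags)
{
	/* alloc_mnt_ns() starts every new namespace with flags = 0. */
	*new_ns_flags = 0;

	if (!(clone_flags & CLONE_MNTNS_SHIFT_UIDGID))
		return 0;

	/* copy_process(): the flag is only valid together with CLONE_NEWNS. */
	if (!(clone_flags & CLONE_NEWNS))
		return -EINVAL;

	/* copy_mnt_ns(): inherit from the parent, or require the capability. */
	if ((parent_ns_flags & CLONE_MNTNS_SHIFT_UIDGID) || has_cap_sys_admin) {
		*new_ns_flags |= CLONE_MNTNS_SHIFT_UIDGID;
		return 0;
	}

	return -EPERM;
}

/* Usage example: walks through the outcomes described in the changelog. */
void model_self_check(void)
{
	int ns_flags;

	/* Step (1) of the goal: privileged caller starts the flag. */
	assert(model_mntns_shift_check(CLONE_NEWNS | CLONE_MNTNS_SHIFT_UIDGID,
				       0, true, &ns_flags) == 0);
	assert(ns_flags == CLONE_MNTNS_SHIFT_UIDGID);

	/* Step (2): unprivileged child inherits it from the parent. */
	assert(model_mntns_shift_check(CLONE_NEWNS | CLONE_MNTNS_SHIFT_UIDGID,
				       CLONE_MNTNS_SHIFT_UIDGID, false,
				       &ns_flags) == 0);

	/* Unprivileged caller, parent without the flag: blocked. */
	assert(model_mntns_shift_check(CLONE_NEWNS | CLONE_MNTNS_SHIFT_UIDGID,
				       0, false, &ns_flags) == -EPERM);
}
```

The model keeps the two checks in the order a clone() call would hit them, so a request without CLONE_NEWNS fails with -EINVAL before any capability check is considered.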