From patchwork Sun Jan 6 21:36:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 10749665 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3D3E71399 for ; Sun, 6 Jan 2019 21:37:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2E289288BA for ; Sun, 6 Jan 2019 21:37:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1FD852891B; Sun, 6 Jan 2019 21:37:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 86A3D288BA for ; Sun, 6 Jan 2019 21:37:01 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id DBD2821FFAA; Sun, 6 Jan 2019 13:36:58 -0800 (PST) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 6CEF521FCD6 for ; Sun, 6 Jan 2019 13:36:51 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 8B18110050F6; Sun, 6 Jan 2019 16:36:49 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8442BBB; Sun, 6 Jan 2019 16:36:49 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sun, 6 Jan 2019 16:36:34 -0500 Message-Id: <1546810607-6348-2-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1546810607-6348-1-git-send-email-jsimmons@infradead.org> References: <1546810607-6348-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 01/14] lustre: llite: Add S_NOSEC support X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP From: Patrick Farrell We must use the i_mutex to protect permission changes, which means we need to take it when we write to a file with the setuid or setgid bit set (as this removes those bits). LU-8025 attempted to use IS_NOSEC to check for this case, but did not actually add support for the S_NOSEC flag to Lustre. S_NOSEC was added in upstream kernel commit: 69b4573296469fd3f70cf7044693074980517067 But a key change requiring parallel filesystems to opt in with a superblock flag was added in: 9e1f1de02c2275d7172e18dc4e7c2065777611bf This patch adds the required support. Specifically, Lustre should set S_NOSEC correctly when an inode is created (ll_iget), but only for new inodes. Setting it for existing inodes requires taking the i_mutex, creating an unacceptable metadata performance impact in the lookup code. Existing inodes get S_NOSEC set either by a setattr call (see below), or by the first writer to write to the node, in file_remove_privs/file_remove_suid. So it's OK not to set S_NOSEC on all inodes in ll_iget. Also, Lustre must clear S_NOSEC when it gets a metadata update because another node could have changed permissions. i_flags is already cleared in ll_update_inode, but we would prefer to have S_NOSEC set whenever possible, so we want to re-do the check after the update. This requires holding the i_mutex to avoid a check/set race with permissions changes. We cannot easily take the i_mutex in ll_update_inode (it is called from too many places, some of which already hold the i_mutex). It is acceptable to sometimes not have S_NOSEC set (since occasionally taking the i_mutex when not needed is OK), so we look at getting it set most of the time. It looks like the primary concern is ll_md_setattr. The caller (ll_setattr_raw) takes the i_mutex before returning, so we do the relevant call there. This opens a window during which S_NOSEC is not set, but again, this merely creates a situation where we take the i_mutex unnecessarily, and should be rare in practice. Testing with multiple writers shows that we only very rarely attempt to take the i_mutex, so the performance impact is minimal. Signed-off-by: Patrick Farrell WC-bug-id: https://jira.whamcloud.com/browse/LU-8656 Reviewed-on: https://review.whamcloud.com/22853 Reviewed-by: James Simmons Reviewed-by: Andreas Dilger Signed-off-by: James Simmons --- drivers/staging/lustre/lustre/include/obd_support.h | 2 ++ drivers/staging/lustre/lustre/llite/llite_lib.c | 19 +++++++++++++++++-- drivers/staging/lustre/lustre/llite/namei.c | 1 + drivers/staging/lustre/lustre/llite/vvp_io.c | 10 +++++++++- 4 files changed, 29 insertions(+), 3 deletions(-) diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h index 1832193..79875fa 100644 --- a/drivers/staging/lustre/lustre/include/obd_support.h +++ b/drivers/staging/lustre/lustre/include/obd_support.h @@ -442,6 +442,8 @@ #define OBD_FAIL_LLITE_LOST_LAYOUT 0x1407 #define OBD_FAIL_GETATTR_DELAY 0x1409 #define OBD_FAIL_LLITE_CREATE_NODE_PAUSE 0x140c +#define OBD_FAIL_LLITE_IMUTEX_SEC 0x140e +#define OBD_FAIL_LLITE_IMUTEX_NOSEC 0x140f #define OBD_FAIL_FID_INDIR 0x1501 #define OBD_FAIL_FID_INLMA 0x1502 diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c index cd169c1..203a1f7 100644 --- a/drivers/staging/lustre/lustre/llite/llite_lib.c +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c @@ -227,6 +227,11 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt) if (sbi->ll_flags & LL_SBI_USER_XATTR) data->ocd_connect_flags |= OBD_CONNECT_XATTR; + /* Setting this indicates we correctly support S_NOSEC (See kernel + * commit 9e1f1de02c2275d7172e18dc4e7c2065777611bf) + */ + sb->s_flags |= MS_NOSEC; + if (sbi->ll_flags & LL_SBI_FLOCK) sbi->ll_fop = &ll_file_operations_flock; else if (sbi->ll_flags & LL_SBI_LOCALFLOCK) @@ -1638,6 +1643,13 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, inode_lock(inode); if ((attr->ia_valid & ATTR_SIZE) && !hsm_import) inode_dio_wait(inode); + /* Once we've got the i_mutex, it's safe to set the S_NOSEC + * flag. ll_update_inode (called from ll_md_setattr), clears + * inode flags, so there is a gap where S_NOSEC is not set. + * This can cause a writer to take the i_mutex unnecessarily, + * but this is safe to do and should be rare. + */ + inode_has_no_xattr(inode); } ll_stats_ops_tally(ll_i2sbi(inode), (attr->ia_valid & ATTR_SIZE) ? @@ -1847,6 +1859,11 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md) inode->i_ctime.tv_sec = body->mbo_ctime; lli->lli_ctime = body->mbo_ctime; } + + /* Clear i_flags to remove S_NOSEC before permissions are updated */ + if (body->mbo_valid & OBD_MD_FLFLAGS) + ll_update_inode_flags(inode, body->mbo_flags); + if (body->mbo_valid & OBD_MD_FLMODE) inode->i_mode = (inode->i_mode & S_IFMT) | (body->mbo_mode & ~S_IFMT); @@ -1865,8 +1882,6 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md) inode->i_gid = make_kgid(&init_user_ns, body->mbo_gid); if (body->mbo_valid & OBD_MD_FLPROJID) lli->lli_projid = body->mbo_projid; - if (body->mbo_valid & OBD_MD_FLFLAGS) - ll_update_inode_flags(inode, body->mbo_flags); if (body->mbo_valid & OBD_MD_FLNLINK) set_nlink(inode, body->mbo_nlink); if (body->mbo_valid & OBD_MD_FLRDEV) diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c index f2bd57e..b5b46f7 100644 --- a/drivers/staging/lustre/lustre/llite/namei.c +++ b/drivers/staging/lustre/lustre/llite/namei.c @@ -130,6 +130,7 @@ struct inode *ll_iget(struct super_block *sb, ino_t hash, iput(inode); inode = ERR_PTR(rc); } else { + inode_has_no_xattr(inode); unlock_new_inode(inode); } } else if (!(inode->i_state & (I_FREEING | I_CLEAR))) { diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c index d6b27ba..b772e25 100644 --- a/drivers/staging/lustre/lustre/llite/vvp_io.c +++ b/drivers/staging/lustre/lustre/llite/vvp_io.c @@ -925,9 +925,10 @@ static int vvp_io_write_start(const struct lu_env *env, struct cl_object *obj = io->ci_obj; struct inode *inode = vvp_object_inode(obj); struct ll_inode_info *lli = ll_i2info(inode); - ssize_t result = 0; + bool lock_inode = !inode_is_locked(inode) && !IS_NOSEC(inode); loff_t pos = io->u.ci_wr.wr.crw_pos; size_t cnt = io->u.ci_wr.wr.crw_count; + ssize_t result = 0; down_read(&lli->lli_trunc_sem); @@ -963,6 +964,13 @@ static int vvp_io_write_start(const struct lu_env *env, return -EFBIG; } + /* Tests to verify we take the i_mutex correctly */ + if (OBD_FAIL_CHECK(OBD_FAIL_LLITE_IMUTEX_SEC) && !lock_inode) + return -EINVAL; + + if (OBD_FAIL_CHECK(OBD_FAIL_LLITE_IMUTEX_NOSEC) && lock_inode) + return -EINVAL; + if (!vio->vui_iter) { /* from a temp io in ll_cl_init(). */ result = 0;