From patchwork Tue Jul 27 22:37:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 12404505
Subject: [PATCH 01/11] VFS: show correct dev num in mountinfo
From: NeilBrown
To: Christoph Hellwig, Josef Bacik, "J. Bruce Fields", Chuck Lever,
 Chris Mason, David Sterba, Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-btrfs@vger.kernel.org
Date: Wed, 28 Jul 2021 08:37:45 +1000
Message-ID: <162742546548.32498.10889023150565429936.stgit@noble.brown>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

/proc/$PID/mountinfo contains a field for the device number of the
filesystem at each mount.
This is taken from the superblock ->s_dev field, which is correct for
every filesystem except btrfs.  A btrfs filesystem can contain multiple
subvols which each have a different device number.  If (a directory
within) one of these subvols is mounted, the device number reported in
mountinfo will be different from the device number reported by stat().

This confuses some libraries and tools such as, historically, findmnt.
Current findmnt seems to cope with the strangeness.

So instead of using ->s_dev, call vfs_getattr_nosec() and use the ->dev
provided.  As there is no STATX flag to ask for the device number, we
pass a request mask of zero, and also ask the filesystem to avoid
syncing with any remote service.

Signed-off-by: NeilBrown
---
 fs/proc_namespace.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index 392ef5162655..f342a0231e9e 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -138,10 +138,16 @@ static int show_mountinfo(struct seq_file *m, struct vfsmount *mnt)
 	struct mount *r = real_mount(mnt);
 	struct super_block *sb = mnt->mnt_sb;
 	struct path mnt_path = { .dentry = mnt->mnt_root, .mnt = mnt };
+	struct kstat stat;
 	int err;
 
+	/* We only want ->dev, and there is no STATX flag for that,
+	 * so ask for nothing and assume we get ->dev
+	 */
+	vfs_getattr_nosec(&mnt_path, &stat, 0, AT_STATX_DONT_SYNC);
+
 	seq_printf(m, "%i %i %u:%u ", r->mnt_id, r->mnt_parent->mnt_id,
-		   MAJOR(sb->s_dev), MINOR(sb->s_dev));
+		   MAJOR(stat.dev), MINOR(stat.dev));
 	if (sb->s_op->show_path) {
 		err = sb->s_op->show_path(m, mnt->mnt_root);
 		if (err)

From patchwork Tue Jul 27 22:37:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 12404507
Subject: [PATCH 02/11] VFS: allow d_automount to create in-place bind-mount.
From: NeilBrown
To: Christoph Hellwig, Josef Bacik, "J. Bruce Fields", Chuck Lever,
 Chris Mason, David Sterba, Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-btrfs@vger.kernel.org
Date: Wed, 28 Jul 2021 08:37:45 +1000
Message-ID: <162742546549.32498.76256513179684921.stgit@noble.brown>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

finish_automount() prevents a mount trap from mounting a dentry onto
itself, as this could cause a loop - repeatedly automounting.

There is nothing intrinsically wrong with this arrangement, and the
d_automount function can easily avoid the loop.  btrfs will use it to
expose subvols in the mount table.

It may well be a problem to mount a dentry onto itself when it is
already the root of the vfsmount, so narrow the test to only check that
case.

The test on mnt_sb is redundant and has been removed.
path->mnt and path->dentry must have the same sb, so if
m->mnt_root == dentry, then m->mnt_sb must be the same as
path->mnt->mnt_sb.

Signed-off-by: NeilBrown
---
 fs/namespace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index ab4174a3c802..81b0f2b2e701 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2928,7 +2928,7 @@ int finish_automount(struct vfsmount *m, struct path *path)
 	 */
 	BUG_ON(mnt_get_count(mnt) < 2);
 
-	if (m->mnt_sb == path->mnt->mnt_sb &&
+	if (m->mnt_root == path->mnt->mnt_root &&
 	    m->mnt_root == dentry) {
 		err = -ELOOP;
 		goto discard;

From patchwork Tue Jul 27 22:37:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 12404509
Subject: [PATCH 03/11] VFS: pass lookup_flags into follow_down()
From: NeilBrown
To: Christoph Hellwig, Josef Bacik, "J. Bruce Fields", Chuck Lever,
 Chris Mason, David Sterba, Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-btrfs@vger.kernel.org
Date: Wed, 28 Jul 2021 08:37:45 +1000
Message-ID: <162742546550.32498.10582545131617192944.stgit@noble.brown>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

A future patch will want to trigger automount (LOOKUP_AUTOMOUNT) on some
follow_down calls, so allow a flag to be passed.

Signed-off-by: NeilBrown
---
 fs/namei.c            |    6 +++---
 fs/nfsd/vfs.c         |    2 +-
 include/linux/namei.h |    2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index bf6d8a738c59..cea0e9b2f162 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1395,11 +1395,11 @@ EXPORT_SYMBOL(follow_down_one);
  * point, the filesystem owning that dentry may be queried as to whether the
  * caller is permitted to proceed or not.
  */
-int follow_down(struct path *path)
+int follow_down(struct path *path, unsigned int lookup_flags)
 {
 	struct vfsmount *mnt = path->mnt;
 	bool jumped;
-	int ret = traverse_mounts(path, &jumped, NULL, 0);
+	int ret = traverse_mounts(path, &jumped, NULL, lookup_flags);
 
 	if (path->mnt != mnt)
 		mntput(mnt);
@@ -2736,7 +2736,7 @@ int path_pts(struct path *path)
 
 	path->dentry = child;
 	dput(parent);
-	follow_down(path);
+	follow_down(path, 0);
 	return 0;
 }
 #endif
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index a224a5e23cc1..7c32edcfd2e9 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -65,7 +65,7 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
 			    .dentry = dget(dentry)};
 	int err = 0;
 
-	err = follow_down(&path);
+	err = follow_down(&path, 0);
 	if (err < 0)
 		goto out;
 	if (path.mnt == exp->ex_path.mnt && path.dentry == dentry &&
diff --git a/include/linux/namei.h b/include/linux/namei.h
index be9a2b349ca7..8d47433def3c 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -70,7 +70,7 @@ extern struct dentry *lookup_one_len_unlocked(const char *, struct dentry *, int
 extern struct dentry *lookup_positive_unlocked(const char *, struct dentry *, int);
 
 extern int follow_down_one(struct path *);
-extern int follow_down(struct path *);
+extern int follow_down(struct path *, unsigned int);
 extern int follow_up(struct path *);
 
 extern struct dentry *lock_rename(struct dentry *, struct dentry *);

From patchwork Tue Jul 27 22:37:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 12404511
Subject: [PATCH 04/11] VFS: export lookup_mnt()
From: NeilBrown
To: Christoph Hellwig, Josef Bacik, "J. Bruce Fields", Chuck Lever,
 Chris Mason, David Sterba, Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-btrfs@vger.kernel.org
Date: Wed, 28 Jul 2021 08:37:45 +1000
Message-ID: <162742546551.32498.5847026750506620683.stgit@noble.brown>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

In order to support filehandle lookup in filesystems with internal
mounts (multiple subvols in the one filesystem) reconnect_path() in
exportfs will need to find out if a given dentry is already mounted.
This can be done with the function lookup_mnt(), so export that to make
it available.
Signed-off-by: NeilBrown
---
 fs/internal.h         |    1 -
 fs/namespace.c        |    1 +
 include/linux/mount.h |    2 ++
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/internal.h b/fs/internal.h
index 3ce8edbaa3ca..0feb2722d2e5 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -81,7 +81,6 @@ int do_renameat2(int olddfd, struct filename *oldname, int newdfd,
 /*
  * namespace.c
  */
-extern struct vfsmount *lookup_mnt(const struct path *);
 extern int finish_automount(struct vfsmount *, struct path *);
 
 extern int sb_prepare_remount_readonly(struct super_block *);
diff --git a/fs/namespace.c b/fs/namespace.c
index 81b0f2b2e701..73bbdb921e24 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -662,6 +662,7 @@ struct vfsmount *lookup_mnt(const struct path *path)
 	rcu_read_unlock();
 	return m;
 }
+EXPORT_SYMBOL(lookup_mnt);
 
 static inline void lock_ns_list(struct mnt_namespace *ns)
 {
diff --git a/include/linux/mount.h b/include/linux/mount.h
index 5d92a7e1a742..1d3daed88f83 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -118,6 +118,8 @@ extern unsigned int sysctl_mount_max;
 
 extern bool path_is_mountpoint(const struct path *path);
 
+extern struct vfsmount *lookup_mnt(const struct path *);
+
 extern void kern_unmount_array(struct vfsmount *mnt[], unsigned int num);
 
 #endif /* _LINUX_MOUNT_H */

From patchwork Tue Jul 27 22:37:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 12404513
Subject: [PATCH 05/11] VFS: new function: mount_is_internal()
From: NeilBrown
To: Christoph Hellwig, Josef Bacik, "J. Bruce Fields", Chuck Lever,
 Chris Mason, David Sterba, Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-btrfs@vger.kernel.org
Date: Wed, 28 Jul 2021 08:37:45 +1000
Message-ID: <162742546552.32498.14429836898036234922.stgit@noble.brown>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

This patch introduces the concept of an "internal" mount, which is a
mount that a filesystem has created itself.

Both the mounted-on dentry and the mount's root dentry must refer to the
same superblock (they may be the same dentry), and the mounted-on dentry
must be an automount.
Signed-off-by: NeilBrown
---
 fs/namespace.c        |   29 +++++++++++++++++++++++++++++
 include/linux/mount.h |    2 ++
 2 files changed, 31 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index 73bbdb921e24..a14efbccfb03 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1273,6 +1273,35 @@ bool path_is_mountpoint(const struct path *path)
 }
 EXPORT_SYMBOL(path_is_mountpoint);
 
+/**
+ * mount_is_internal() - Check if path is a mount internal to a single filesystem
+ * @mnt: vfsmount to check
+ *
+ * Some filesystems present multiple file-sets using a single
+ * superblock, such as btrfs with multiple subvolumes.  Names within a
+ * parent filesystem which lead to a subordinate filesystem are
+ * implemented as automounts so that the structure is visible in the
+ * mount table.  nfsd needs visibility into this arrangement so that it
+ * can determine if a mountpoint requires a new export, or is completely
+ * covered by an existing mount.
+ *
+ * An "internal" mount is one where the parent and child have the same
+ * superblock, and the mounted-on dentry is "managed" as an automount.  A
+ * filehandle found for an inode in the child can be looked-up using either
+ * vfsmount.
+ */
+bool mount_is_internal(struct vfsmount *mnt)
+{
+	struct mount *m = real_mount(mnt);
+
+	if (!mnt_has_parent(m))
+		return false;
+	if (m->mnt_parent->mnt.mnt_sb != m->mnt.mnt_sb)
+		return false;
+	return m->mnt_mountpoint->d_flags & DCACHE_NEED_AUTOMOUNT;
+}
+EXPORT_SYMBOL(mount_is_internal);
+
 struct vfsmount *mnt_clone_internal(const struct path *path)
 {
 	struct mount *p;
diff --git a/include/linux/mount.h b/include/linux/mount.h
index 1d3daed88f83..ab58087728ba 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -118,6 +118,8 @@ extern unsigned int sysctl_mount_max;
 
 extern bool path_is_mountpoint(const struct path *path);
 
+extern bool mount_is_internal(struct vfsmount *mnt);
+
 extern struct vfsmount *lookup_mnt(const struct path *);
 
 extern void kern_unmount_array(struct vfsmount *mnt[], unsigned int num);

From patchwork Tue Jul 27 22:37:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 12404515
Subject: [PATCH 06/11] nfsd: include a vfsmount in struct svc_fh
From: NeilBrown
To: Christoph Hellwig, Josef Bacik, "J. Bruce Fields", Chuck Lever,
 Chris Mason, David Sterba, Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-btrfs@vger.kernel.org
Date: Wed, 28 Jul 2021 08:37:45 +1000
Message-ID: <162742546552.32498.8097200286954882080.stgit@noble.brown>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

A future patch will allow exportfs_decode_fh{,_raw} to return a
different vfsmount than the one passed.  This is specifically for btrfs,
but would be useful for any filesystem that presents as multiple volumes
(i.e. different st_dev, each with their own st_ino number-space).

For nfsd, this means that the mnt in the svc_export may not apply to all
filehandles reached from that export.  So svc_fh needs to store a
distinct vfsmount as well.

For now, fh->fh_mnt == fh->fh_export->ex_path.mnt, but that will change.

Changes include:
  fh_compose()
  nfsd_lookup_dentry()
    now take a *path instead of a *dentry
  nfsd4_encode_fattr()
  nfsd4_encode_fattr_to_buf()
    now take a *vfsmount as well as a *dentry
  nfsd_cross_mnt()
    now takes a *path instead of a **dentry to pass in, and get back,
    the mnt and dentry.
  nfsd_lookup_parent()
    used to take a *dentry and a **dentry.  Now it just takes a *path.
    This is the *path that was passed to nfsd_lookup_dentry().
Signed-off-by: NeilBrown --- fs/nfsd/export.c | 4 +- fs/nfsd/nfs3xdr.c | 22 +++++---- fs/nfsd/nfs4proc.c | 9 ++-- fs/nfsd/nfs4xdr.c | 55 +++++++++++----------- fs/nfsd/nfsfh.c | 30 +++++++----- fs/nfsd/nfsfh.h | 3 + fs/nfsd/nfsproc.c | 5 ++ fs/nfsd/vfs.c | 133 ++++++++++++++++++++++++++++------------------------ fs/nfsd/vfs.h | 10 ++-- fs/nfsd/xdr4.h | 2 - 10 files changed, 150 insertions(+), 123 deletions(-) diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c index 9421dae22737..e506cbe78b4f 100644 --- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -1003,7 +1003,7 @@ exp_rootfh(struct net *net, struct auth_domain *clp, char *name, * fh must be initialized before calling fh_compose */ fh_init(&fh, maxsize); - if (fh_compose(&fh, exp, path.dentry, NULL)) + if (fh_compose(&fh, exp, &path, NULL)) err = -EINVAL; else err = 0; @@ -1178,7 +1178,7 @@ exp_pseudoroot(struct svc_rqst *rqstp, struct svc_fh *fhp) exp = rqst_find_fsidzero_export(rqstp); if (IS_ERR(exp)) return nfserrno(PTR_ERR(exp)); - rv = fh_compose(fhp, exp, exp->ex_path.dentry, NULL); + rv = fh_compose(fhp, exp, &exp->ex_path, NULL); exp_put(exp); return rv; } diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c index 0a5ebc52e6a9..67af0c5c1543 100644 --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -1089,36 +1089,38 @@ compose_entry_fh(struct nfsd3_readdirres *cd, struct svc_fh *fhp, const char *name, int namlen, u64 ino) { struct svc_export *exp; - struct dentry *dparent, *dchild; + struct dentry *dparent; + struct path child; __be32 rv = nfserr_noent; dparent = cd->fh.fh_dentry; exp = cd->fh.fh_export; + child.mnt = cd->fh.fh_mnt; if (isdotent(name, namlen)) { if (namlen == 2) { - dchild = dget_parent(dparent); + child.dentry = dget_parent(dparent); /* * Don't return filehandle for ".." 
if we're at * the filesystem or export root: */ - if (dchild == dparent) + if (child.dentry == dparent) goto out; if (dparent == exp->ex_path.dentry) goto out; } else - dchild = dget(dparent); + child.dentry = dget(dparent); } else - dchild = lookup_positive_unlocked(name, dparent, namlen); - if (IS_ERR(dchild)) + child.dentry = lookup_positive_unlocked(name, dparent, namlen); + if (IS_ERR(child.dentry)) return rv; - if (d_mountpoint(dchild)) + if (d_mountpoint(child.dentry)) goto out; - if (dchild->d_inode->i_ino != ino) + if (child.dentry->d_inode->i_ino != ino) goto out; - rv = fh_compose(fhp, exp, dchild, &cd->fh); + rv = fh_compose(fhp, exp, &child, &cd->fh); out: - dput(dchild); + dput(child.dentry); return rv; } diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 486c5dba4b65..743b9315cd3e 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -902,7 +902,7 @@ nfsd4_secinfo(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, { struct nfsd4_secinfo *secinfo = &u->secinfo; struct svc_export *exp; - struct dentry *dentry; + struct path path; __be32 err; err = fh_verify(rqstp, &cstate->current_fh, S_IFDIR, NFSD_MAY_EXEC); @@ -910,16 +910,16 @@ nfsd4_secinfo(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, return err; err = nfsd_lookup_dentry(rqstp, &cstate->current_fh, secinfo->si_name, secinfo->si_namelen, - &exp, &dentry); + &exp, &path); if (err) return err; fh_unlock(&cstate->current_fh); - if (d_really_is_negative(dentry)) { + if (d_really_is_negative(path.dentry)) { exp_put(exp); err = nfserr_noent; } else secinfo->si_exp = exp; - dput(dentry); + path_put(&path); if (cstate->minorversion) /* See rfc 5661 section 2.6.3.1.1.8 */ fh_put(&cstate->current_fh); @@ -1930,6 +1930,7 @@ _nfsd4_verify(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, p = buf; status = nfsd4_encode_fattr_to_buf(&p, count, &cstate->current_fh, cstate->current_fh.fh_export, + cstate->current_fh.fh_mnt, cstate->current_fh.fh_dentry, 
verify->ve_bmval, rqstp, 0); diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index 7abeccb975b2..21c277fa28ae 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -2823,9 +2823,9 @@ nfsd4_encode_bitmap(struct xdr_stream *xdr, u32 bmval0, u32 bmval1, u32 bmval2) */ static __be32 nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, - struct svc_export *exp, - struct dentry *dentry, u32 *bmval, - struct svc_rqst *rqstp, int ignore_crossmnt) + struct svc_export *exp, + struct vfsmount *mnt, struct dentry *dentry, + u32 *bmval, struct svc_rqst *rqstp, int ignore_crossmnt) { u32 bmval0 = bmval[0]; u32 bmval1 = bmval[1]; @@ -2851,7 +2851,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct nfsd4_compoundres *resp = rqstp->rq_resp; u32 minorversion = resp->cstate.minorversion; struct path path = { - .mnt = exp->ex_path.mnt, + .mnt = mnt, .dentry = dentry, }; struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); @@ -2882,7 +2882,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, if (!tempfh) goto out; fh_init(tempfh, NFS4_FHSIZE); - status = fh_compose(tempfh, exp, dentry, NULL); + status = fh_compose(tempfh, exp, &path, NULL); if (status) goto out; fhp = tempfh; @@ -3274,13 +3274,12 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, p = xdr_reserve_space(xdr, 8); if (!p) - goto out_resource; + goto out_resource; /* * Get parent's attributes if not ignoring crossmount * and this is the root of a cross-mounted filesystem. 
*/ - if (ignore_crossmnt == 0 && - dentry == exp->ex_path.mnt->mnt_root) { + if (ignore_crossmnt == 0 && dentry == mnt->mnt_root) { err = get_parent_attributes(exp, &parent_stat); if (err) goto out_nfserr; @@ -3380,17 +3379,18 @@ static void svcxdr_init_encode_from_buffer(struct xdr_stream *xdr, } __be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words, - struct svc_fh *fhp, struct svc_export *exp, - struct dentry *dentry, u32 *bmval, - struct svc_rqst *rqstp, int ignore_crossmnt) + struct svc_fh *fhp, struct svc_export *exp, + struct vfsmount *mnt, struct dentry *dentry, + u32 *bmval, struct svc_rqst *rqstp, + int ignore_crossmnt) { struct xdr_buf dummy; struct xdr_stream xdr; __be32 ret; svcxdr_init_encode_from_buffer(&xdr, &dummy, *p, words << 2); - ret = nfsd4_encode_fattr(&xdr, fhp, exp, dentry, bmval, rqstp, - ignore_crossmnt); + ret = nfsd4_encode_fattr(&xdr, fhp, exp, mnt, dentry, bmval, rqstp, + ignore_crossmnt); *p = xdr.p; return ret; } @@ -3409,14 +3409,16 @@ nfsd4_encode_dirent_fattr(struct xdr_stream *xdr, struct nfsd4_readdir *cd, const char *name, int namlen) { struct svc_export *exp = cd->rd_fhp->fh_export; - struct dentry *dentry; + struct path path; __be32 nfserr; int ignore_crossmnt = 0; - dentry = lookup_positive_unlocked(name, cd->rd_fhp->fh_dentry, namlen); - if (IS_ERR(dentry)) - return nfserrno(PTR_ERR(dentry)); + path.dentry = lookup_positive_unlocked(name, cd->rd_fhp->fh_dentry, + namlen); + if (IS_ERR(path.dentry)) + return nfserrno(PTR_ERR(path.dentry)); + path.mnt = mntget(cd->rd_fhp->fh_mnt); exp_get(exp); /* * In the case of a mountpoint, the client may be asking for @@ -3425,7 +3427,7 @@ nfsd4_encode_dirent_fattr(struct xdr_stream *xdr, struct nfsd4_readdir *cd, * we will not follow the cross mount and will fill the attribtutes * directly from the mountpoint dentry. 
*/ - if (nfsd_mountpoint(dentry, exp)) { + if (nfsd_mountpoint(path.dentry, exp)) { int err; if (!(exp->ex_flags & NFSEXP_V4ROOT) @@ -3434,11 +3436,11 @@ nfsd4_encode_dirent_fattr(struct xdr_stream *xdr, struct nfsd4_readdir *cd, goto out_encode; } /* - * Why the heck aren't we just using nfsd_lookup?? + * Why the heck aren't we just using nfsd_lookup_dentry?? * Different "."/".." handling? Something else? * At least, add a comment here to explain.... */ - err = nfsd_cross_mnt(cd->rd_rqstp, &dentry, &exp); + err = nfsd_cross_mnt(cd->rd_rqstp, &path, &exp); if (err) { nfserr = nfserrno(err); goto out_put; @@ -3446,13 +3448,13 @@ nfsd4_encode_dirent_fattr(struct xdr_stream *xdr, struct nfsd4_readdir *cd, nfserr = check_nfsd_access(exp, cd->rd_rqstp); if (nfserr) goto out_put; - } out_encode: - nfserr = nfsd4_encode_fattr(xdr, NULL, exp, dentry, cd->rd_bmval, - cd->rd_rqstp, ignore_crossmnt); + nfserr = nfsd4_encode_fattr(xdr, NULL, exp, path.mnt, path.dentry, + cd->rd_bmval, cd->rd_rqstp, + ignore_crossmnt); out_put: - dput(dentry); + path_put(&path); exp_put(exp); return nfserr; } @@ -3651,8 +3653,9 @@ nfsd4_encode_getattr(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4 struct svc_fh *fhp = getattr->ga_fhp; struct xdr_stream *xdr = resp->xdr; - return nfsd4_encode_fattr(xdr, fhp, fhp->fh_export, fhp->fh_dentry, - getattr->ga_bmval, resp->rqstp, 0); + return nfsd4_encode_fattr(xdr, fhp, fhp->fh_export, + fhp->fh_mnt, fhp->fh_dentry, + getattr->ga_bmval, resp->rqstp, 0); } static __be32 diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index c475d2271f9c..0bf7ac13ae50 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -299,6 +299,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) } fhp->fh_dentry = dentry; + fhp->fh_mnt = mntget(exp->ex_path.mnt); fhp->fh_export = exp; switch (rqstp->rq_vers) { @@ -556,7 +557,7 @@ static void set_version_and_fsid_type(struct svc_fh *fhp, struct svc_export *exp } __be32 -fh_compose(struct 
svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, +fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct path *path, struct svc_fh *ref_fh) { /* ref_fh is a reference file handle. @@ -567,13 +568,13 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, * */ - struct inode * inode = d_inode(dentry); + struct inode * inode = d_inode(path->dentry); dev_t ex_dev = exp_sb(exp)->s_dev; dprintk("nfsd: fh_compose(exp %02x:%02x/%ld %pd2, ino=%ld)\n", MAJOR(ex_dev), MINOR(ex_dev), (long) d_inode(exp->ex_path.dentry)->i_ino, - dentry, + path->dentry, (inode ? inode->i_ino : 0)); /* Choose filehandle version and fsid type based on @@ -590,14 +591,15 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, if (fhp->fh_locked || fhp->fh_dentry) { printk(KERN_ERR "fh_compose: fh %pd2 not initialized!\n", - dentry); + path->dentry); } if (fhp->fh_maxsize < NFS_FHSIZE) printk(KERN_ERR "fh_compose: called with maxsize %d! %pd2\n", fhp->fh_maxsize, - dentry); + path->dentry); - fhp->fh_dentry = dget(dentry); /* our internal copy */ + fhp->fh_dentry = dget(path->dentry); /* our internal copy */ + fhp->fh_mnt = mntget(path->mnt); fhp->fh_export = exp_get(exp); if (fhp->fh_handle.fh_version == 0xca) { @@ -609,9 +611,9 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, fhp->fh_handle.ofh_xdev = fhp->fh_handle.ofh_dev; fhp->fh_handle.ofh_xino = ino_t_to_u32(d_inode(exp->ex_path.dentry)->i_ino); - fhp->fh_handle.ofh_dirino = ino_t_to_u32(parent_ino(dentry)); + fhp->fh_handle.ofh_dirino = ino_t_to_u32(parent_ino(path->dentry)); if (inode) - _fh_update_old(dentry, exp, &fhp->fh_handle); + _fh_update_old(path->dentry, exp, &fhp->fh_handle); } else { fhp->fh_handle.fh_size = key_len(fhp->fh_handle.fh_fsid_type) + 4; @@ -624,7 +626,7 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, exp->ex_fsid, exp->ex_uuid); if (inode) - _fh_update(fhp, exp, dentry); + 
_fh_update(fhp, exp, path->dentry); if (fhp->fh_handle.fh_fileid_type == FILEID_INVALID) { fh_put(fhp); return nfserr_opnotsupp; @@ -675,8 +677,10 @@ fh_update(struct svc_fh *fhp) void fh_put(struct svc_fh *fhp) { - struct dentry * dentry = fhp->fh_dentry; - struct svc_export * exp = fhp->fh_export; + struct dentry *dentry = fhp->fh_dentry; + struct svc_export *exp = fhp->fh_export; + struct vfsmount *mnt = fhp->fh_mnt; + if (dentry) { fh_unlock(fhp); fhp->fh_dentry = NULL; @@ -684,6 +688,10 @@ fh_put(struct svc_fh *fhp) fh_clear_wcc(fhp); } fh_drop_write(fhp); + if (mnt) { + mntput(mnt); + fhp->fh_mnt = NULL; + } if (exp) { exp_put(exp); fhp->fh_export = NULL; diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index 6106697adc04..26c02209babd 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -31,6 +31,7 @@ static inline ino_t u32_to_ino_t(__u32 uino) typedef struct svc_fh { struct knfsd_fh fh_handle; /* FH data */ int fh_maxsize; /* max size for fh_handle */ + struct vfsmount * fh_mnt; /* mnt, possibly of subvol */ struct dentry * fh_dentry; /* validated dentry */ struct svc_export * fh_export; /* export pointer */ @@ -171,7 +172,7 @@ extern char * SVCFH_fmt(struct svc_fh *fhp); * Function prototypes */ __be32 fh_verify(struct svc_rqst *, struct svc_fh *, umode_t, int); -__be32 fh_compose(struct svc_fh *, struct svc_export *, struct dentry *, struct svc_fh *); +__be32 fh_compose(struct svc_fh *, struct svc_export *, struct path *, struct svc_fh *); __be32 fh_update(struct svc_fh *); void fh_put(struct svc_fh *); diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c index 60d7c59e7935..245199b0e630 100644 --- a/fs/nfsd/nfsproc.c +++ b/fs/nfsd/nfsproc.c @@ -268,6 +268,7 @@ nfsd_proc_create(struct svc_rqst *rqstp) struct iattr *attr = &argp->attrs; struct inode *inode; struct dentry *dchild; + struct path path; int type, mode; int hosterr; dev_t rdev = 0, wanted = new_decode_dev(attr->ia_size); @@ -298,7 +299,9 @@ nfsd_proc_create(struct svc_rqst *rqstp) goto 
out_unlock; } fh_init(newfhp, NFS_FHSIZE); - resp->status = fh_compose(newfhp, dirfhp->fh_export, dchild, dirfhp); + path.mnt = dirfhp->fh_mnt; + path.dentry = dchild; + resp->status = fh_compose(newfhp, dirfhp->fh_export, &path, dirfhp); if (!resp->status && d_really_is_negative(dchild)) resp->status = nfserr_noent; dput(dchild); diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 7c32edcfd2e9..c0c6920f25a4 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -49,27 +49,26 @@ #define NFSDDBG_FACILITY NFSDDBG_FILEOP -/* - * Called from nfsd_lookup and encode_dirent. Check if we have crossed +/* + * Called from nfsd_lookup and encode_dirent. Check if we have crossed * a mount point. - * Returns -EAGAIN or -ETIMEDOUT leaving *dpp and *expp unchanged, - * or nfs_ok having possibly changed *dpp and *expp + * Returns -EAGAIN or -ETIMEDOUT leaving *path and *expp unchanged, + * or nfs_ok having possibly changed *path and *expp */ int -nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp, - struct svc_export **expp) +nfsd_cross_mnt(struct svc_rqst *rqstp, struct path *path_parent, + struct svc_export **expp) { struct svc_export *exp = *expp, *exp2 = NULL; - struct dentry *dentry = *dpp; - struct path path = {.mnt = mntget(exp->ex_path.mnt), - .dentry = dget(dentry)}; + struct path path = {.mnt = mntget(path_parent->mnt), + .dentry = dget(path_parent->dentry)}; int err = 0; err = follow_down(&path, 0); if (err < 0) goto out; - if (path.mnt == exp->ex_path.mnt && path.dentry == dentry && - nfsd_mountpoint(dentry, exp) == 2) { + if (path.mnt == path_parent->mnt && path.dentry == path_parent->dentry && + nfsd_mountpoint(path.dentry, exp) == 2) { /* This is only a mountpoint in some other namespace */ path_put(&path); goto out; @@ -93,19 +92,14 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp, if (nfsd_v4client(rqstp) || (exp->ex_flags & NFSEXP_CROSSMOUNT) || EX_NOHIDE(exp2)) { /* successfully crossed mount point */ - /* - * This is subtle: path.dentry is *not* on 
path.mnt - * at this point. The only reason we are safe is that - * original mnt is pinned down by exp, so we should - * put path *before* putting exp - */ - *dpp = path.dentry; - path.dentry = dentry; + path_put(path_parent); + *path_parent = path; + exp_put(exp); *expp = exp2; - exp2 = exp; + } else { + path_put(&path); + exp_put(exp2); } - path_put(&path); - exp_put(exp2); out: return err; } @@ -121,27 +115,30 @@ static void follow_to_parent(struct path *path) path->dentry = dp; } -static int nfsd_lookup_parent(struct svc_rqst *rqstp, struct dentry *dparent, struct svc_export **exp, struct dentry **dentryp) +static int nfsd_lookup_parent(struct svc_rqst *rqstp, struct svc_export **exp, + struct path *path) { + struct path path2; struct svc_export *exp2; - struct path path = {.mnt = mntget((*exp)->ex_path.mnt), - .dentry = dget(dparent)}; - follow_to_parent(&path); - - exp2 = rqst_exp_parent(rqstp, &path); + path2 = *path; + path_get(&path2); + follow_to_parent(&path2); + exp2 = rqst_exp_parent(rqstp, path); if (PTR_ERR(exp2) == -ENOENT) { - *dentryp = dget(dparent); + /* leave path unchanged */ + path_put(&path2); + return 0; } else if (IS_ERR(exp2)) { - path_put(&path); + path_put(&path2); return PTR_ERR(exp2); } else { - *dentryp = dget(path.dentry); + path_put(path); + *path = path2; exp_put(*exp); *exp = exp2; + return 0; } - path_put(&path); - return 0; } /* @@ -172,29 +169,32 @@ int nfsd_mountpoint(struct dentry *dentry, struct svc_export *exp) __be32 nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, unsigned int len, - struct svc_export **exp_ret, struct dentry **dentry_ret) + struct svc_export **exp_ret, struct path *ret) { struct svc_export *exp; struct dentry *dparent; - struct dentry *dentry; int host_err; dprintk("nfsd: nfsd_lookup(fh %s, %.*s)\n", SVCFH_fmt(fhp), len,name); dparent = fhp->fh_dentry; + ret->mnt = mntget(fhp->fh_mnt); exp = exp_get(fhp->fh_export); /* Lookup the name, but don't follow links */ if 
(isdotent(name, len)) { if (len==1) - dentry = dget(dparent); + ret->dentry = dget(dparent); else if (dparent != exp->ex_path.dentry) - dentry = dget_parent(dparent); + ret->dentry = dget_parent(dparent); else if (!EX_NOHIDE(exp) && !nfsd_v4client(rqstp)) - dentry = dget(dparent); /* .. == . just like at / */ + ret->dentry = dget(dparent); /* .. == . just like at / */ else { - /* checking mountpoint crossing is very different when stepping up */ - host_err = nfsd_lookup_parent(rqstp, dparent, &exp, &dentry); + /* checking mountpoint crossing is very different when + * stepping up + */ + ret->dentry = dget(dparent); + host_err = nfsd_lookup_parent(rqstp, &exp, ret); if (host_err) goto out_nfserr; } @@ -205,11 +205,13 @@ nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, * need to take the child's i_mutex: */ fh_lock_nested(fhp, I_MUTEX_PARENT); - dentry = lookup_one_len(name, dparent, len); - host_err = PTR_ERR(dentry); - if (IS_ERR(dentry)) + ret->dentry = lookup_one_len(name, dparent, len); + host_err = PTR_ERR(ret->dentry); + if (IS_ERR(ret->dentry)) { + ret->dentry = NULL; goto out_nfserr; - if (nfsd_mountpoint(dentry, exp)) { + } + if (nfsd_mountpoint(ret->dentry, exp)) { /* * We don't need the i_mutex after all. 
It's * still possible we could open this (regular @@ -219,18 +221,16 @@ nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, * and a mountpoint won't be renamed: */ fh_unlock(fhp); - if ((host_err = nfsd_cross_mnt(rqstp, &dentry, &exp))) { - dput(dentry); + if ((host_err = nfsd_cross_mnt(rqstp, ret, &exp))) goto out_nfserr; - } } } - *dentry_ret = dentry; *exp_ret = exp; return 0; out_nfserr: exp_put(exp); + path_put(ret); return nfserrno(host_err); } @@ -251,13 +251,13 @@ nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, unsigned int len, struct svc_fh *resfh) { struct svc_export *exp; - struct dentry *dentry; + struct path path; __be32 err; err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC); if (err) return err; - err = nfsd_lookup_dentry(rqstp, fhp, name, len, &exp, &dentry); + err = nfsd_lookup_dentry(rqstp, fhp, name, len, &exp, &path); if (err) return err; err = check_nfsd_access(exp, rqstp); @@ -267,11 +267,11 @@ nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, * Note: we compose the file handle now, but as the * dentry may be negative, it may need to be updated. 
*/ - err = fh_compose(resfh, exp, dentry, fhp); - if (!err && d_really_is_negative(dentry)) + err = fh_compose(resfh, exp, &path, fhp); + if (!err && d_really_is_negative(path.dentry)) err = nfserr_noent; out: - dput(dentry); + path_put(&path); exp_put(exp); return err; } @@ -740,7 +740,7 @@ __nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, __be32 err; int host_err = 0; - path.mnt = fhp->fh_export->ex_path.mnt; + path.mnt = fhp->fh_mnt; path.dentry = fhp->fh_dentry; inode = d_inode(path.dentry); @@ -1350,6 +1350,7 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, dev_t rdev, struct svc_fh *resfhp) { struct dentry *dentry, *dchild = NULL; + struct path path; __be32 err; int host_err; @@ -1371,7 +1372,9 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, host_err = PTR_ERR(dchild); if (IS_ERR(dchild)) return nfserrno(host_err); - err = fh_compose(resfhp, fhp->fh_export, dchild, fhp); + path.mnt = fhp->fh_mnt; + path.dentry = dchild; + err = fh_compose(resfhp, fhp->fh_export, &path, fhp); /* * We unconditionally drop our ref to dchild as fh_compose will have * already grabbed its own ref for it. 
@@ -1390,11 +1393,12 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, */ __be32 do_nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, - char *fname, int flen, struct iattr *iap, - struct svc_fh *resfhp, int createmode, u32 *verifier, - bool *truncp, bool *created) + char *fname, int flen, struct iattr *iap, + struct svc_fh *resfhp, int createmode, u32 *verifier, + bool *truncp, bool *created) { struct dentry *dentry, *dchild = NULL; + struct path path; struct inode *dirp; __be32 err; int host_err; @@ -1436,7 +1440,9 @@ do_nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, goto out; } - err = fh_compose(resfhp, fhp->fh_export, dchild, fhp); + path.mnt = fhp->fh_mnt; + path.dentry = dchild; + err = fh_compose(resfhp, fhp->fh_export, &path, fhp); if (err) goto out; @@ -1569,7 +1575,7 @@ nfsd_readlink(struct svc_rqst *rqstp, struct svc_fh *fhp, char *buf, int *lenp) if (unlikely(err)) return err; - path.mnt = fhp->fh_export->ex_path.mnt; + path.mnt = fhp->fh_mnt; path.dentry = fhp->fh_dentry; if (unlikely(!d_is_symlink(path.dentry))) @@ -1600,6 +1606,7 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, struct svc_fh *resfhp) { struct dentry *dentry, *dnew; + struct path pathnew; __be32 err, cerr; int host_err; @@ -1633,7 +1640,9 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, fh_drop_write(fhp); - cerr = fh_compose(resfhp, fhp->fh_export, dnew, fhp); + pathnew.mnt = fhp->fh_mnt; + pathnew.dentry = dnew; + cerr = fh_compose(resfhp, fhp->fh_export, &pathnew, fhp); dput(dnew); if (err==0) err = cerr; out: @@ -2107,7 +2116,7 @@ nfsd_statfs(struct svc_rqst *rqstp, struct svc_fh *fhp, struct kstatfs *stat, in err = fh_verify(rqstp, fhp, 0, NFSD_MAY_NOP | access); if (!err) { struct path path = { - .mnt = fhp->fh_export->ex_path.mnt, + .mnt = fhp->fh_mnt, .dentry = fhp->fh_dentry, }; if (vfs_statfs(&path, stat)) diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index b21b76e6b9a8..52f587716208 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ 
-42,13 +42,13 @@ struct nfsd_file; typedef int (*nfsd_filldir_t)(void *, const char *, int, loff_t, u64, unsigned); /* nfsd/vfs.c */ -int nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp, +int nfsd_cross_mnt(struct svc_rqst *rqstp, struct path *, struct svc_export **expp); __be32 nfsd_lookup(struct svc_rqst *, struct svc_fh *, const char *, unsigned int, struct svc_fh *); __be32 nfsd_lookup_dentry(struct svc_rqst *, struct svc_fh *, const char *, unsigned int, - struct svc_export **, struct dentry **); + struct svc_export **, struct path *); __be32 nfsd_setattr(struct svc_rqst *, struct svc_fh *, struct iattr *, int, time64_t); int nfsd_mountpoint(struct dentry *, struct svc_export *); @@ -138,7 +138,7 @@ static inline int fh_want_write(struct svc_fh *fh) if (fh->fh_want_write) return 0; - ret = mnt_want_write(fh->fh_export->ex_path.mnt); + ret = mnt_want_write(fh->fh_mnt); if (!ret) fh->fh_want_write = true; return ret; @@ -148,13 +148,13 @@ static inline void fh_drop_write(struct svc_fh *fh) { if (fh->fh_want_write) { fh->fh_want_write = false; - mnt_drop_write(fh->fh_export->ex_path.mnt); + mnt_drop_write(fh->fh_mnt); } } static inline __be32 fh_getattr(const struct svc_fh *fh, struct kstat *stat) { - struct path p = {.mnt = fh->fh_export->ex_path.mnt, + struct path p = {.mnt = fh->fh_mnt, .dentry = fh->fh_dentry}; return nfserrno(vfs_getattr(&p, stat, STATX_BASIC_STATS, AT_STATX_SYNC_AS_STAT)); diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h index 3e4052e3bd50..8934db5113ac 100644 --- a/fs/nfsd/xdr4.h +++ b/fs/nfsd/xdr4.h @@ -763,7 +763,7 @@ void nfsd4_encode_operation(struct nfsd4_compoundres *, struct nfsd4_op *); void nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op); __be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words, struct svc_fh *fhp, struct svc_export *exp, - struct dentry *dentry, + struct vfsmount *mnt, struct dentry *dentry, u32 *bmval, struct svc_rqst *, int ignore_crossmnt); extern __be32 nfsd4_setclientid(struct svc_rqst 
*rqstp, struct nfsd4_compound_state *, union nfsd4_op_u *u); From patchwork Tue Jul 27 22:37:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12404517 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9FF4FC4338F for ; Tue, 27 Jul 2021 22:43:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8653760184 for ; Tue, 27 Jul 2021 22:43:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233167AbhG0Wna (ORCPT ); Tue, 27 Jul 2021 18:43:30 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:54034 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231730AbhG0Wn3 (ORCPT ); Tue, 27 Jul 2021 18:43:29 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id D640D1FF31; Tue, 27 Jul 2021 22:43:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1627425807; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=nAzEDNq9szLfRXZIXQmeCEtmKHhwvl25ZMKYMdggNqU=; b=XYIcHHLt4Xij6KyF3wshuLCstJytZMDSrXw/669iyHCbSWdGoDX6VLrOjeZs81a89po2YW LCqyltfpXmZmUo46qdU1+7SA5Oe7rkZGu5AoNVfjpMzH2vcwd9M2+2eYdL75CEGgFev6Jo HTO5vY5sTG09mnFrLQq0vQaB8AYPUog= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1627425807; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nAzEDNq9szLfRXZIXQmeCEtmKHhwvl25ZMKYMdggNqU=; b=Drrt4JjHQmwX+10yFaeymiI4U3YVGPaLlkLifylJYX3ThkOZ6O0DuB2z5Bpx1NARsCG5/a GpgjO7deuQdCI7Cg== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id B361B13A5D; Tue, 27 Jul 2021 22:43:24 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 12VHHAyMAGHOVQAAMHmgww (envelope-from ); Tue, 27 Jul 2021 22:43:24 +0000 Subject: [PATCH 07/11] exportfs: Allow filehandle lookup to cross internal mount points. From: NeilBrown To: Christoph Hellwig , Josef Bacik , "J. 
Bruce Fields" , Chuck Lever , Chris Mason , David Sterba , Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-btrfs@vger.kernel.org
Date: Wed, 28 Jul 2021 08:37:45 +1000
Message-ID: <162742546554.32498.9309110546560807513.stgit@noble.brown>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>
User-Agent: StGit/0.23
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

When a filesystem has internal mounts, it controls the filehandles
across all those mounts (subvols) in the filesystem. So it is useful to
be able to look up a filehandle against one mount, and get a result
which is in a different mount (part of the same overall file system).

This patch makes that possible by changing exportfs_decode_fh() and
exportfs_decode_fh_raw() to take a vfsmount pointer by reference, and
possibly change the vfsmount pointed to before returning.

The core of the change is in reconnect_path() which now not only checks
that the dentry is fully connected, but also that the vfsmount reported
has the same 'dev' (reported by vfs_getattr) as the dentry.
If it doesn't, we walk up the dget_parent() chain to find the highest
place where the dev changes without there being a mount point, and
trigger an automount there.

As no filesystems yet provide local-mounts, this does not yet change any
behaviour.

In exportfs_decode_fh_raw() we previously tested for DCACHE_DISCONNECTED
before calling reconnect_path(). That test is dropped. It was only a
minor optimisation and is now inconvenient.

The change in overlayfs needs more careful thought than I have yet given
it.
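
The upward walk described above can be sketched in userspace C roughly as
follows. This is a simplified model and not the code in the patch:
struct node, its fields and find_automount_point() are hypothetical, the
'mounted' flag stands in for a successful lookup_mnt(), and only a
single pass is modelled, whereas reconnect_path() repeats the walk after
each automount until the dev numbers agree.

```c
#include <stddef.h>

/* Hypothetical model of one dentry in the chain up to the root. */
struct node {
	struct node *parent;  /* the root points to itself */
	unsigned int dev;     /* dev number reported for this node */
	int mounted;          /* non-zero if a mount already exists here */
};

/*
 * Walk from target towards the root. Each time the dev number changes
 * between a node and its parent, either stop (a mount already exists
 * at that boundary) or remember the node as a candidate.
 *
 * Returns the candidate closest to the root that was seen before the
 * walk stopped, i.e. the place where an automount would be triggered,
 * or NULL if no unmounted dev boundary was found.
 */
static struct node *find_automount_point(struct node *target)
{
	struct node *last_change = NULL;
	unsigned int last_dev = target->dev;
	struct node *n = target;

	while (n->parent != n) {
		struct node *parent = n->parent;

		if (parent->dev != last_dev) {
			if (n->mounted)
				break;  /* existing mount covers this boundary */
			last_change = n;
			last_dev = parent->dev;
		}
		n = parent;
	}
	return last_change;
}
```

For a chain root(dev 1) -> a(dev 1) -> sub(dev 2) -> f(dev 2) with no
mount on sub, the model picks sub as the automount point; if sub is
already mounted it returns NULL and nothing needs to be triggered.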
Signed-off-by: NeilBrown --- fs/exportfs/expfs.c | 100 +++++++++++++++++++++++++++++++++++++++------- fs/fhandle.c | 2 - fs/nfsd/nfsfh.c | 9 +++- fs/overlayfs/namei.c | 5 ++ fs/xfs/xfs_ioctl.c | 12 ++++-- include/linux/exportfs.h | 4 +- 6 files changed, 106 insertions(+), 26 deletions(-) diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c index 0106eba46d5a..2d7c42137b49 100644 --- a/fs/exportfs/expfs.c +++ b/fs/exportfs/expfs.c @@ -207,11 +207,18 @@ static struct dentry *reconnect_one(struct vfsmount *mnt, * that case reconnect_path may still succeed with target_dir fully * connected, but further operations using the filehandle will fail when * necessary (due to S_DEAD being set on the directory). + * + * If the filesystem supports multiple subvols, then *mntp may be updated + * to a subordinate mount point on the same filesystem. */ static int -reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf) +reconnect_path(struct vfsmount **mntp, struct dentry *target_dir, char *nbuf) { + struct vfsmount *mnt = *mntp; + struct path path; struct dentry *dentry, *parent; + struct kstat stat; + dev_t target_dev; dentry = dget(target_dir); @@ -232,6 +239,68 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf) } dput(dentry); clear_disconnected(target_dir); + + /* Need to find appropriate vfsmount, which might not exist yet. + * We may need to trigger automount points. + */ + path.mnt = mnt; + path.dentry = target_dir; + vfs_getattr_nosec(&path, &stat, 0, AT_STATX_DONT_SYNC); + target_dev = stat.dev; + + path.dentry = mnt->mnt_root; + vfs_getattr_nosec(&path, &stat, 0, AT_STATX_DONT_SYNC); + + while (stat.dev != target_dev) { + /* walk up the dcache tree from target_dir, recording the + * location of the most recent change in dev number, + * until we find a mountpoint. + * If there was no change in show_dev result before the + * mountpount, the vfsmount at the mountpoint is what we want. 
+ * If there was, we need to trigger an automount where the + * show_dev() result changed. + */ + struct dentry *last_change = NULL; + dev_t last_dev = target_dev; + + dentry = dget(target_dir); + while ((parent = dget_parent(dentry)) != dentry) { + path.dentry = parent; + vfs_getattr_nosec(&path, &stat, 0, AT_STATX_DONT_SYNC); + if (stat.dev != last_dev) { + path.dentry = dentry; + mnt = lookup_mnt(&path); + if (mnt) { + mntput(path.mnt); + path.mnt = mnt; + break; + } + dput(last_change); + last_change = dget(dentry); + last_dev = stat.dev; + } + dput(dentry); + dentry = parent; + } + dput(dentry); dput(parent); + + if (!last_change) + break; + + mnt = path.mnt; + path.dentry = last_change; + follow_down(&path, LOOKUP_AUTOMOUNT); + dput(path.dentry); + if (path.mnt == mnt) + /* There should have been a mount-trap there, + * but there wasn't. Just give up. + */ + break; + + path.dentry = mnt->mnt_root; + vfs_getattr_nosec(&path, &stat, 0, AT_STATX_DONT_SYNC); + } + *mntp = path.mnt; return 0; } @@ -418,12 +487,13 @@ int exportfs_encode_fh(struct dentry *dentry, struct fid *fid, int *max_len, EXPORT_SYMBOL_GPL(exportfs_encode_fh); struct dentry * -exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, +exportfs_decode_fh_raw(struct vfsmount **mntp, struct fid *fid, int fh_len, int fileid_type, int (*acceptable)(void *, struct dentry *), void *context) { - const struct export_operations *nop = mnt->mnt_sb->s_export_op; + struct super_block *sb = (*mntp)->mnt_sb; + const struct export_operations *nop = sb->s_export_op; struct dentry *result, *alias; char nbuf[NAME_MAX+1]; int err; @@ -433,7 +503,7 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, */ if (!nop || !nop->fh_to_dentry) return ERR_PTR(-ESTALE); - result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type); + result = nop->fh_to_dentry(sb, fid, fh_len, fileid_type); if (IS_ERR_OR_NULL(result)) return result; @@ -452,14 +522,12 @@ 
exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, * * On the positive side there is only one dentry for each * directory inode. On the negative side this implies that we - * to ensure our dentry is connected all the way up to the + * need to ensure our dentry is connected all the way up to the * filesystem root. */ - if (result->d_flags & DCACHE_DISCONNECTED) { - err = reconnect_path(mnt, result, nbuf); - if (err) - goto err_result; - } + err = reconnect_path(mntp, result, nbuf); + if (err) + goto err_result; if (!acceptable(context, result)) { err = -EACCES; @@ -494,7 +562,7 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, if (!nop->fh_to_parent) goto err_result; - target_dir = nop->fh_to_parent(mnt->mnt_sb, fid, + target_dir = nop->fh_to_parent(sb, fid, fh_len, fileid_type); if (!target_dir) goto err_result; @@ -507,7 +575,7 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, * connected to the filesystem root. The VFS really doesn't * like disconnected directories.. */ - err = reconnect_path(mnt, target_dir, nbuf); + err = reconnect_path(mntp, target_dir, nbuf); if (err) { dput(target_dir); goto err_result; @@ -518,7 +586,7 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, * dentry for the inode we're after, make sure that our * inode is actually connected to the parent. 
*/ - err = exportfs_get_name(mnt, target_dir, nbuf, result); + err = exportfs_get_name(*mntp, target_dir, nbuf, result); if (err) { dput(target_dir); goto err_result; @@ -556,7 +624,7 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, goto err_result; } - return alias; + return result; } err_result: @@ -565,14 +633,14 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, } EXPORT_SYMBOL_GPL(exportfs_decode_fh_raw); -struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid, +struct dentry *exportfs_decode_fh(struct vfsmount **mntp, struct fid *fid, int fh_len, int fileid_type, int (*acceptable)(void *, struct dentry *), void *context) { struct dentry *ret; - ret = exportfs_decode_fh_raw(mnt, fid, fh_len, fileid_type, + ret = exportfs_decode_fh_raw(mntp, fid, fh_len, fileid_type, acceptable, context); if (IS_ERR_OR_NULL(ret)) { if (ret == ERR_PTR(-ENOMEM)) diff --git a/fs/fhandle.c b/fs/fhandle.c index 6630c69c23a2..b47c7696469f 100644 --- a/fs/fhandle.c +++ b/fs/fhandle.c @@ -149,7 +149,7 @@ static int do_handle_to_path(int mountdirfd, struct file_handle *handle, } /* change the handle size to multiple of sizeof(u32) */ handle_dwords = handle->handle_bytes >> 2; - path->dentry = exportfs_decode_fh(path->mnt, + path->dentry = exportfs_decode_fh(&path->mnt, (struct fid *)handle->f_handle, handle_dwords, handle->handle_type, vfs_dentry_acceptable, NULL); diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index 0bf7ac13ae50..4023046f63e2 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -157,6 +157,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) struct fid *fid = NULL, sfid; struct svc_export *exp; struct dentry *dentry; + struct vfsmount *mnt = NULL; int fileid_type; int data_left = fh->fh_size/4; __be32 error; @@ -253,6 +254,8 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) if (rqstp->rq_vers > 2) error = nfserr_badhandle; + mnt = 
mntget(exp->ex_path.mnt); + if (fh->fh_version != 1) { sfid.i32.ino = fh->ofh_ino; sfid.i32.gen = fh->ofh_generation; @@ -269,7 +272,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) if (fileid_type == FILEID_ROOT) dentry = dget(exp->ex_path.dentry); else { - dentry = exportfs_decode_fh_raw(exp->ex_path.mnt, fid, + dentry = exportfs_decode_fh_raw(&mnt, fid, data_left, fileid_type, nfsd_acceptable, exp); if (IS_ERR_OR_NULL(dentry)) { @@ -299,7 +302,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) } fhp->fh_dentry = dentry; - fhp->fh_mnt = mntget(exp->ex_path.mnt); + fhp->fh_mnt = mnt; fhp->fh_export = exp; switch (rqstp->rq_vers) { @@ -317,6 +320,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) return 0; out: + mntput(mnt); exp_put(exp); return error; } @@ -428,7 +432,6 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, int access) return error; } - /* * Compose a file handle for an NFS reply. * diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c index 210cd6f66e28..0bca19f6df54 100644 --- a/fs/overlayfs/namei.c +++ b/fs/overlayfs/namei.c @@ -155,6 +155,7 @@ struct dentry *ovl_decode_real_fh(struct ovl_fs *ofs, struct ovl_fh *fh, { struct dentry *real; int bytes; + struct vfsmount *mnt2; if (!capable(CAP_DAC_READ_SEARCH)) return NULL; @@ -169,9 +170,11 @@ struct dentry *ovl_decode_real_fh(struct ovl_fs *ofs, struct ovl_fh *fh, return NULL; bytes = (fh->fb.len - offsetof(struct ovl_fb, fid)); - real = exportfs_decode_fh(mnt, (struct fid *)fh->fb.fid, + mnt2 = mntget(mnt); + real = exportfs_decode_fh(&mnt2, (struct fid *)fh->fb.fid, bytes >> 2, (int)fh->fb.type, connected ? ovl_acceptable : NULL, mnt); + mntput(mnt2); if (IS_ERR(real)) { /* * Treat stale file handle to lower file as "origin unknown". 
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 16039ea10ac9..76eb7d540811 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -149,6 +149,8 @@ xfs_handle_to_dentry( { xfs_handle_t handle; struct xfs_fid64 fid; + struct dentry *ret; + struct vfsmount *mnt; /* * Only allow handle opens under a directory. @@ -168,9 +170,13 @@ xfs_handle_to_dentry( fid.ino = handle.ha_fid.fid_ino; fid.gen = handle.ha_fid.fid_gen; - return exportfs_decode_fh(parfilp->f_path.mnt, (struct fid *)&fid, 3, - FILEID_INO32_GEN | XFS_FILEID_TYPE_64FLAG, - xfs_handle_acceptable, NULL); + mnt = mntget(parfilp->f_path.mnt); + ret = exportfs_decode_fh(&mnt, (struct fid *)&fid, 3, + FILEID_INO32_GEN | XFS_FILEID_TYPE_64FLAG, + xfs_handle_acceptable, NULL); + WARN_ON(mnt != parfilp->f_path.mnt); + mntput(mnt); + return ret; } STATIC struct dentry * diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h index fe848901fcc3..9a8c5434a5cf 100644 --- a/include/linux/exportfs.h +++ b/include/linux/exportfs.h @@ -228,12 +228,12 @@ extern int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid, int *max_len, struct inode *parent); extern int exportfs_encode_fh(struct dentry *dentry, struct fid *fid, int *max_len, int connectable); -extern struct dentry *exportfs_decode_fh_raw(struct vfsmount *mnt, +extern struct dentry *exportfs_decode_fh_raw(struct vfsmount **mntp, struct fid *fid, int fh_len, int fileid_type, int (*acceptable)(void *, struct dentry *), void *context); -extern struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid, +extern struct dentry *exportfs_decode_fh(struct vfsmount **mnt, struct fid *fid, int fh_len, int fileid_type, int (*acceptable)(void *, struct dentry *), void *context); From patchwork Tue Jul 27 22:37:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12404519 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E56EAC4338F for ; Tue, 27 Jul 2021 22:43:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C718860F6D for ; Tue, 27 Jul 2021 22:43:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232599AbhG0Wnh (ORCPT ); Tue, 27 Jul 2021 18:43:37 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:54054 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232449AbhG0Wnh (ORCPT ); Tue, 27 Jul 2021 18:43:37 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 767CC1FF31; Tue, 27 Jul 2021 22:43:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1627425815; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9Vs3Pylsgp5mwqAihPx4If4U0ooiHOgHgod8MkA49/Q=; b=1imyhpcQPAElE++MT1bROHAJ2lTK3+5vRY8kRdiflawAsfgMVAvxfhBm/z4RrWSLyZNyuM VRUeU0HW0l0asfs/VrDtVik6+V4UtClsgesqbCP8JstiBrzlllW8PWrdMtxDARVxcm6xRP LGIiA4zW10+jtIhYRkYGQ3nKYTkR9/w= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1627425815; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9Vs3Pylsgp5mwqAihPx4If4U0ooiHOgHgod8MkA49/Q=; b=/Zf8YFdWKv+6CsF5e9OCySNWlNamfX/vsFct8llyCkILATrZ4U9GB7XtT+udmdEcdRDUN/ WSxZriRMjkfbcZDQ== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 890E213A5D; Tue, 27 Jul 2021 22:43:32 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id ErgmEhSMAGHaVQAAMHmgww (envelope-from ); Tue, 27 Jul 2021 22:43:32 +0000 Subject: [PATCH 08/11] nfsd: change get_parent_attributes() to nfsd_get_mounted_on() From: NeilBrown To: Christoph Hellwig , Josef Bacik , "J. Bruce Fields" , Chuck Lever , Chris Mason , David Sterba , Alexander Viro Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-btrfs@vger.kernel.org Date: Wed, 28 Jul 2021 08:37:45 +1000 Message-ID: <162742546555.32498.13169105043649293291.stgit@noble.brown> In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown> References: <162742539595.32498.13687924366155737575.stgit@noble.brown> User-Agent: StGit/0.23 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org get_parent_attributes() is only used to get the inode number of the mounted-on directory. So change it to only do that and call it nfsd_get_mounted_on(). It will eventually be used by nfs3 as well as nfs4, so move it to vfs.c.
Signed-off-by: NeilBrown --- fs/nfsd/nfs4xdr.c | 29 +++++------------------------ fs/nfsd/vfs.c | 18 ++++++++++++++++++ fs/nfsd/vfs.h | 2 ++ 3 files changed, 25 insertions(+), 24 deletions(-) diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index 21c277fa28ae..d5683b6a74b2 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -2768,22 +2768,6 @@ static __be32 fattr_handle_absent_fs(u32 *bmval0, u32 *bmval1, u32 *bmval2, u32 return 0; } - -static int get_parent_attributes(struct svc_export *exp, struct kstat *stat) -{ - struct path path = exp->ex_path; - int err; - - path_get(&path); - while (follow_up(&path)) { - if (path.dentry != path.mnt->mnt_root) - break; - } - err = vfs_getattr(&path, stat, STATX_BASIC_STATS, AT_STATX_SYNC_AS_STAT); - path_put(&path); - return err; -} - static __be32 nfsd4_encode_bitmap(struct xdr_stream *xdr, u32 bmval0, u32 bmval1, u32 bmval2) { @@ -3269,8 +3253,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, *p++ = cpu_to_be32(stat.mtime.tv_nsec); } if (bmval1 & FATTR4_WORD1_MOUNTED_ON_FILEID) { - struct kstat parent_stat; - u64 ino = stat.ino; + u64 ino; p = xdr_reserve_space(xdr, 8); if (!p) @@ -3279,12 +3262,10 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, * Get parent's attributes if not ignoring crossmount * and this is the root of a cross-mounted filesystem. */ - if (ignore_crossmnt == 0 && dentry == mnt->mnt_root) { - err = get_parent_attributes(exp, &parent_stat); - if (err) - goto out_nfserr; - ino = parent_stat.ino; - } + if (ignore_crossmnt == 0 && dentry == mnt->mnt_root) + ino = nfsd_get_mounted_on(mnt); + if (!ino) + ino = stat.ino; p = xdr_encode_hyper(p, ino); } #ifdef CONFIG_NFSD_PNFS diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index c0c6920f25a4..baa12ac36ece 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -2445,3 +2445,21 @@ nfsd_permission(struct svc_rqst *rqstp, struct svc_export *exp, return err? 
nfserrno(err) : 0; } + +unsigned long nfsd_get_mounted_on(struct vfsmount *mnt) +{ + struct kstat stat; + struct path path = { .mnt = mnt, .dentry = mnt->mnt_root }; + int err; + + path_get(&path); + while (follow_up(&path)) { + if (path.dentry != path.mnt->mnt_root) + break; + } + err = vfs_getattr(&path, &stat, STATX_INO, AT_STATX_DONT_SYNC); + path_put(&path); + if (err) + return 0; + return stat.ino; +} diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index 52f587716208..11ac36b21b4c 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -132,6 +132,8 @@ __be32 nfsd_statfs(struct svc_rqst *, struct svc_fh *, __be32 nfsd_permission(struct svc_rqst *, struct svc_export *, struct dentry *, int); +unsigned long nfsd_get_mounted_on(struct vfsmount *mnt); + static inline int fh_want_write(struct svc_fh *fh) { int ret; From patchwork Tue Jul 27 22:37:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12404521 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C9E6C432BE for ; Tue, 27 Jul 2021 22:43:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6555F60F9B for ; Tue, 27 Jul 2021 22:43:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233491AbhG0Wns (ORCPT ); Tue, 27 Jul 2021 18:43:48 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57288 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S233162AbhG0Wnp (ORCPT ); Tue, 27 Jul 2021 18:43:45 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id C19DA22236; Tue, 27 Jul 2021 22:43:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1627425823; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9I86czI1fNP5NdLjZctZyiavaatPy8v72YtRXA3YdQI=; b=SXdOAfHSMW1DgAuiFY0EfuDmnvc1yrNE8dpub++9OMNIrzebAIBzEgnvStJ1keSSbCNDXE e8uPn7J1/AMU12E+KflRmCjo3S8ZtjgIntoUxkvbqdHgpX/OZSVvou2O+2bWQ1lTCUnCbz LOvpLOLFYLbaUtkGN4CbBAmwPRQQC7g= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1627425823; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9I86czI1fNP5NdLjZctZyiavaatPy8v72YtRXA3YdQI=; b=Q1GO7xH+bIjb6SBBTG0/CAWxdQzelw72uxiT4R/6ubW3+5sGZbd6uMHbLMMggYRa4CPPWA 0K2XIV8imdNAPDDw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id DC1C713A5D; Tue, 27 Jul 2021 22:43:40 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id bEZIJhyMAGHiVQAAMHmgww (envelope-from ); Tue, 27 Jul 2021 22:43:40 +0000 Subject: [PATCH 09/11] nfsd: Allow 
filehandle lookup to cross internal mount points. From: NeilBrown To: Christoph Hellwig , Josef Bacik , "J. Bruce Fields" , Chuck Lever , Chris Mason , David Sterba , Alexander Viro Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-btrfs@vger.kernel.org Date: Wed, 28 Jul 2021 08:37:45 +1000 Message-ID: <162742546556.32498.16708762469227881912.stgit@noble.brown> In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown> References: <162742539595.32498.13687924366155737575.stgit@noble.brown> User-Agent: StGit/0.23 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Enhance nfsd to detect internal mounts and to cross them without requiring a new export. Also ensure the fsid reported is different for different submounts. We do this by xoring in the ino of the mounted-on directory. This makes sense for btrfs at least. Signed-off-by: NeilBrown --- fs/nfsd/nfs3xdr.c | 28 +++++++++++++++++++++------- fs/nfsd/nfs4xdr.c | 34 +++++++++++++++++++++++----------- fs/nfsd/nfsfh.c | 7 ++++++- fs/nfsd/vfs.c | 11 +++++++++-- 4 files changed, 59 insertions(+), 21 deletions(-) diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c index 67af0c5c1543..80b1cc0334fa 100644 --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -370,6 +370,8 @@ svcxdr_encode_fattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr, case FSIDSOURCE_UUID: fsid = ((u64 *)fhp->fh_export->ex_uuid)[0]; fsid ^= ((u64 *)fhp->fh_export->ex_uuid)[1]; + if (fhp->fh_mnt != fhp->fh_export->ex_path.mnt) + fsid ^= nfsd_get_mounted_on(fhp->fh_mnt); break; default: fsid = (u64)huge_encode_dev(fhp->fh_dentry->d_sb->s_dev); @@ -1094,8 +1096,8 @@ compose_entry_fh(struct nfsd3_readdirres *cd, struct svc_fh *fhp, __be32 rv = nfserr_noent; dparent = cd->fh.fh_dentry; - exp = cd->fh.fh_export; - child.mnt = cd->fh.fh_mnt; + exp = exp_get(cd->fh.fh_export); + child.mnt = mntget(cd->fh.fh_mnt); if (isdotent(name, namlen)) { if (namlen == 2) { @@ -1112,15 +1114,27 @@ 
compose_entry_fh(struct nfsd3_readdirres *cd, struct svc_fh *fhp, child.dentry = dget(dparent); } else child.dentry = lookup_positive_unlocked(name, dparent, namlen); - if (IS_ERR(child.dentry)) + if (IS_ERR(child.dentry)) { + mntput(child.mnt); + exp_put(exp); return rv; - if (d_mountpoint(child.dentry)) - goto out; - if (child.dentry->d_inode->i_ino != ino) + } + /* If child is a mountpoint, then we want to expose the fact + * so client can create a mountpoint. If not, then a different + * ino number probably means a race with rename, so avoid providing + * too much detail. + */ + if (nfsd_mountpoint(child.dentry, exp)) { + int err; + err = nfsd_cross_mnt(cd->rqstp, &child, &exp); + if (err) + goto out; + } else if (child.dentry->d_inode->i_ino != ino) goto out; rv = fh_compose(fhp, exp, &child, &cd->fh); out: - dput(child.dentry); + path_put(&child); + exp_put(exp); return rv; } diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index d5683b6a74b2..4dbc99ed2c8b 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -2817,6 +2817,8 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, struct kstat stat; struct svc_fh *tempfh = NULL; struct kstatfs statfs; + u64 mounted_on_ino; + u64 sub_fsid; __be32 *p; int starting_len = xdr->buf->len; int attrlen_offset; @@ -2871,6 +2873,24 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, goto out; fhp = tempfh; } + if ((bmval0 & FATTR4_WORD0_FSID) || + (bmval1 & FATTR4_WORD1_MOUNTED_ON_FILEID)) { + mounted_on_ino = stat.ino; + sub_fsid = 0; + /* + * The inode number that the current mnt is mounted on is + * used for MOUNTED_ON_FILED if we are at the root, + * and for sub_fsid if mnt is not the export mnt. 
+ */ + if (ignore_crossmnt == 0) { + u64 moi = nfsd_get_mounted_on(mnt); + + if (dentry == mnt->mnt_root && moi) + mounted_on_ino = moi; + if (mnt != exp->ex_path.mnt) + sub_fsid = moi; + } + } if (bmval0 & FATTR4_WORD0_ACL) { err = nfsd4_get_nfs4_acl(rqstp, dentry, &acl); if (err == -EOPNOTSUPP) @@ -3008,6 +3028,8 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, case FSIDSOURCE_UUID: p = xdr_encode_opaque_fixed(p, exp->ex_uuid, EX_UUID_LEN); + if (mnt != exp->ex_path.mnt) + *(u64*)(p-2) ^= sub_fsid; break; } } @@ -3253,20 +3275,10 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, *p++ = cpu_to_be32(stat.mtime.tv_nsec); } if (bmval1 & FATTR4_WORD1_MOUNTED_ON_FILEID) { - u64 ino; - p = xdr_reserve_space(xdr, 8); if (!p) goto out_resource; - /* - * Get parent's attributes if not ignoring crossmount - * and this is the root of a cross-mounted filesystem. - */ - if (ignore_crossmnt == 0 && dentry == mnt->mnt_root) - ino = nfsd_get_mounted_on(mnt); - if (!ino) - ino = stat.ino; - p = xdr_encode_hyper(p, ino); + p = xdr_encode_hyper(p, mounted_on_ino); } #ifdef CONFIG_NFSD_PNFS if (bmval1 & FATTR4_WORD1_FS_LAYOUT_TYPES) { diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index 4023046f63e2..4b53838bca89 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -9,7 +9,7 @@ */ #include - +#include #include #include "nfsd.h" #include "vfs.h" @@ -285,6 +285,11 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) default: dentry = ERR_PTR(-ESTALE); } + } else if (nfsd_mountpoint(dentry, exp)) { + struct path path = { .mnt = mnt, .dentry = dentry }; + follow_down(&path, LOOKUP_AUTOMOUNT); + mnt = path.mnt; + dentry = path.dentry; } } if (dentry == NULL) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index baa12ac36ece..22523e1cd478 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -64,7 +64,7 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct path *path_parent, .dentry = dget(path_parent->dentry)}; int err = 0; - err = 
follow_down(&path, 0); + err = follow_down(&path, LOOKUP_AUTOMOUNT); if (err < 0) goto out; if (path.mnt == path_parent->mnt && path.dentry == path_parent->dentry && @@ -73,6 +73,13 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct path *path_parent, path_put(&path); goto out; } + if (mount_is_internal(path.mnt)) { + /* Use the new path, but don't look for a new export */ + /* FIXME should I check NOHIDE in this case?? */ + path_put(path_parent); + *path_parent = path; + goto out; + } exp2 = rqst_exp_get_by_name(rqstp, &path); if (IS_ERR(exp2)) { @@ -157,7 +164,7 @@ int nfsd_mountpoint(struct dentry *dentry, struct svc_export *exp) return 1; if (nfsd4_is_junction(dentry)) return 1; - if (d_mountpoint(dentry)) + if (d_managed(dentry)) /* * Might only be a mountpoint in a different namespace, * but we need to check. From patchwork Tue Jul 27 22:37:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12404523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 159AFC4338F for ; Tue, 27 Jul 2021 22:43:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F0BB060F6D for ; Tue, 27 Jul 2021 22:43:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233439AbhG0Wnx (ORCPT ); Tue, 27 Jul 2021 18:43:53 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57312 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S233117AbhG0Wnv (ORCPT ); Tue, 27 Jul 2021 18:43:51 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 1EA5922236; Tue, 27 Jul 2021 22:43:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1627425830; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Lm7XB6jR4TjnMiZxavOOo4KztvJwCpv6U5v30F4iOGM=; b=Ruy1UKS5MFYt+ikokL5kri3piAU9UYkgt7PEvOrIsLbHcZEhZllORnWezNvk6A7RB2slTd czUkBE/KdhHqx80je46krijH743QwxUO//XKIMczIEOYKqUVKvew5wyS/K6+UjgI/QVUR1 HBC/U3YHjy7EAMZ+CFb3fC4ovhW+teg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1627425830; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Lm7XB6jR4TjnMiZxavOOo4KztvJwCpv6U5v30F4iOGM=; b=OEAQvFW1Q330kWqehW6WoRxhd+EvDE7JGKS4U7ymQgJuDbfwBYnCZH1J+uTMJagralu77U MOpBbC8+su+CkhAw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 3ABFD13A5D; Tue, 27 Jul 2021 22:43:46 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id hGaHOiKMAGHnVQAAMHmgww (envelope-from ); Tue, 27 Jul 2021 22:43:46 +0000 Subject: [PATCH 10/11] btrfs: 
introduce mapping function from location to inum From: NeilBrown To: Christoph Hellwig , Josef Bacik , "J. Bruce Fields" , Chuck Lever , Chris Mason , David Sterba , Alexander Viro Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-btrfs@vger.kernel.org Date: Wed, 28 Jul 2021 08:37:45 +1000 Message-ID: <162742546557.32498.956193040064011710.stgit@noble.brown> In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown> References: <162742539595.32498.13687924366155737575.stgit@noble.brown> User-Agent: StGit/0.23 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org A btrfs directory entry can refer to two different sorts of objects BTRFS_INODE_ITEM_KEY - a regular fs object (file, dir, etc) BTRFS_ROOT_ITEM_KEY - a reference to a subvol. The 'objectid' numbers for these two are independent, so it is possible (and common) for an INODE and a ROOT to have the same objectid. As readdir reports the objectid as the inode number, if two such are in the same directory, a tool which examines the inode numbers in getdents results could think they are links. As the BTRFS_ROOT_ITEM_KEY objectid is not visible via stat() (only getdents), this is rarely if ever a problem. However a future patch will expose this number as the i_ino of an automount point. This will cause problems if the objectid is used as-is. So: create a simple mapping function to reduce (or eliminate?) the possibility of conflict. The objectid of BTRFS_ROOT_ITEM_KEY is subtracted from ULONG_MAX to make an inode number. 
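For illustration only, the mapping described above can be sketched in userspace C. This is not the kernel code: struct sketch_key, the SKETCH_* constants and sketch_location_to_ino() are hypothetical stand-ins for struct btrfs_key, the BTRFS_*_KEY types and the helper this patch adds, and the numeric key values are placeholders chosen for the sketch.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for the btrfs key types (placeholder values). */
#define SKETCH_INODE_ITEM_KEY 1
#define SKETCH_ROOT_ITEM_KEY  132

/* Hypothetical, cut-down version of a directory entry's location key. */
struct sketch_key {
	uint64_t objectid;
	uint8_t type;
};

/*
 * Map a location key to the inode number reported by readdir.
 * Regular objects keep their objectid; anything else (assumed to be a
 * subvol root reference) is counted down from ULONG_MAX so that it
 * does not share the low number range used by regular objects.
 */
static unsigned long sketch_location_to_ino(const struct sketch_key *location)
{
	if (location->type == SKETCH_INODE_ITEM_KEY)
		return (unsigned long)location->objectid;
	return ULONG_MAX - (unsigned long)location->objectid;
}
```

With this sketch, a file and a subvol reference that both carry objectid 257 yield 257 and ULONG_MAX - 257 respectively, so a tool comparing inode numbers from getdents would no longer take them for links. A clash remains possible in principle if a regular objectid ever reaches the top of the range, which is why the text above says "reduce (or eliminate?)".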
Signed-off-by: NeilBrown
---
 fs/btrfs/btrfs_inode.h | 10 ++++++++++
 fs/btrfs/inode.c       |  3 ++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index c652e19ad74e..a4b5f38196e6 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -328,6 +328,16 @@ static inline bool btrfs_inode_in_log(struct btrfs_inode *inode, u64 generation)
 	return ret;
 }
 
+static inline unsigned long btrfs_location_to_ino(struct btrfs_key *location)
+{
+	if (location->type == BTRFS_INODE_ITEM_KEY)
+		return location->objectid;
+	/* Probably BTRFS_ROOT_ITEM_KEY, try to keep the inode
+	 * numbers separate.
+	 */
+	return ULONG_MAX - location->objectid;
+}
+
 struct btrfs_dio_private {
 	struct inode *inode;
 	u64 logical_offset;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8f60314c36c5..02537c1a9763 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6136,7 +6136,8 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
 		put_unaligned(fs_ftype_to_dtype(btrfs_dir_type(leaf, di)),
 				&entry->type);
 		btrfs_dir_item_key_to_cpu(leaf, di, &location);
-		put_unaligned(location.objectid, &entry->ino);
+		put_unaligned(btrfs_location_to_ino(&location),
+			      &entry->ino);
 		put_unaligned(found_key.offset, &entry->offset);
 		entries++;
 		addr += sizeof(struct dir_entry) + name_len;

From patchwork Tue Jul 27 22:37:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 12404525
Subject: [PATCH 11/11] btrfs: use automount to bind-mount all subvol roots.
From: NeilBrown
To: Christoph Hellwig, Josef Bacik, "J. Bruce Fields", Chuck Lever, Chris Mason, David Sterba, Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-btrfs@vger.kernel.org
Date: Wed, 28 Jul 2021 08:37:45 +1000
Message-ID: <162742546558.32498.1901201501617899416.stgit@noble.brown>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

All subvol roots are now marked as automounts.  If the d_automount()
function determines that the dentry is not the root of the vfsmount, it
creates a simple loop-back mount of the dentry onto itself.  If it
determines that it IS the root of the vfsmount, it returns -EISDIR so
that no further automounting is attempted.

btrfs_getattr pays special attention to these automount dentries.
If the dentry is NOT the root of the vfsmount:
 - the ->dev is reported as that for the rest of the vfsmount
 - the ->ino is reported as the subvol objectid, suitably transformed
   to avoid collision.

This way the same inode appears to be different depending on which
mount it is in.

Automounted vfsmounts are kept on a list and time out 500 to 1000
seconds after last use.  This is configurable via a module parameter.
The tracking and timeout of automounts is copied from NFS.

Signed-off-by: NeilBrown
Reported-by: kernel test robot
---
 fs/btrfs/btrfs_inode.h |   2 +
 fs/btrfs/inode.c       | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/super.c       |   1
 3 files changed, 111 insertions(+)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index a4b5f38196e6..f03056cacc4a 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -387,4 +387,6 @@ static inline void btrfs_print_data_csum_error(struct btrfs_inode *inode,
 			mirror_num);
 }
 
+void btrfs_release_automount_timer(void);
+
 #endif
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 02537c1a9763..a5f46545fb38 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -31,6 +31,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "misc.h"
 #include "ctree.h"
@@ -5782,6 +5783,8 @@ static int btrfs_init_locked_inode(struct inode *inode, void *p)
 	struct btrfs_iget_args *args = p;
 
 	inode->i_ino = args->ino;
+	if (args->ino == BTRFS_FIRST_FREE_OBJECTID)
+		inode->i_flags |= S_AUTOMOUNT;
 	BTRFS_I(inode)->location.objectid = args->ino;
 	BTRFS_I(inode)->location.type = BTRFS_INODE_ITEM_KEY;
 	BTRFS_I(inode)->location.offset = 0;
@@ -5985,6 +5988,101 @@ static int btrfs_dentry_delete(const struct dentry *dentry)
 	return 0;
 }
 
+static void btrfs_expire_automounts(struct work_struct *work);
+static LIST_HEAD(btrfs_automount_list);
+static DECLARE_DELAYED_WORK(btrfs_automount_task, btrfs_expire_automounts);
+int btrfs_mountpoint_expiry_timeout = 500 * HZ;
+static void btrfs_expire_automounts(struct work_struct *work)
+{
+	struct list_head *list = &btrfs_automount_list;
+	int timeout = READ_ONCE(btrfs_mountpoint_expiry_timeout);
+
+	mark_mounts_for_expiry(list);
+	if (!list_empty(list) && timeout > 0)
+		schedule_delayed_work(&btrfs_automount_task, timeout);
+}
+
+void btrfs_release_automount_timer(void)
+{
+	if (list_empty(&btrfs_automount_list))
+		cancel_delayed_work(&btrfs_automount_task);
+}
+
+static struct vfsmount *btrfs_automount(struct path *path)
+{
+	struct fs_context fc;
+	struct vfsmount *mnt;
+	int timeout = READ_ONCE(btrfs_mountpoint_expiry_timeout);
+
+	if (path->dentry == path->mnt->mnt_root)
+		/* dentry is root of the vfsmount,
+		 * so skip automount processing
+		 */
+		return ERR_PTR(-EISDIR);
+	/* Create a bind-mount to expose the subvol in the mount table */
+	fc.root = path->dentry;
+	fc.sb_flags = 0;
+	fc.source = "btrfs-automount";
+	mnt = vfs_create_mount(&fc);
+	if (IS_ERR(mnt))
+		return mnt;
+	mntget(mnt);
+	mnt_set_expiry(mnt, &btrfs_automount_list);
+	if (timeout > 0)
+		schedule_delayed_work(&btrfs_automount_task, timeout);
+	return mnt;
+}
+
+static int param_set_btrfs_timeout(const char *val, const struct kernel_param *kp)
+{
+	long num;
+	int ret;
+
+	if (!val)
+		return -EINVAL;
+	ret = kstrtol(val, 0, &num);
+	if (ret)
+		return -EINVAL;
+	if (num > 0) {
+		if (num >= INT_MAX / HZ)
+			num = INT_MAX;
+		else
+			num *= HZ;
+		*((int *)kp->arg) = num;
+		if (!list_empty(&btrfs_automount_list))
+			mod_delayed_work(system_wq, &btrfs_automount_task, num);
+	} else {
+		*((int *)kp->arg) = -1*HZ;
+		cancel_delayed_work(&btrfs_automount_task);
+	}
+	return 0;
+}
+
+static int param_get_btrfs_timeout(char *buffer, const struct kernel_param *kp)
+{
+	long num = *((int *)kp->arg);
+
+	if (num > 0) {
+		if (num >= INT_MAX - (HZ - 1))
+			num = INT_MAX / HZ;
+		else
+			num = (num + (HZ - 1)) / HZ;
+	} else
+		num = -1;
+	return scnprintf(buffer, PAGE_SIZE, "%li\n", num);
+}
+
+static const struct kernel_param_ops param_ops_btrfs_timeout = {
+	.set = param_set_btrfs_timeout,
+	.get = param_get_btrfs_timeout,
+};
+#define param_check_btrfs_timeout(name, p) __param_check(name, p, int)
+
+module_param(btrfs_mountpoint_expiry_timeout, btrfs_timeout, 0644);
+MODULE_PARM_DESC(btrfs_mountpoint_expiry_timeout,
+		"Set the btrfs automounted mountpoint timeout value (seconds). "
+		"Values <= 0 turn expiration off.");
+
 static struct dentry *btrfs_lookup(struct inode *dir, struct dentry *dentry,
 				   unsigned int flags)
 {
@@ -9195,6 +9293,15 @@ static int btrfs_getattr(struct user_namespace *mnt_userns,
 
 	generic_fillattr(&init_user_ns, inode, stat);
 	stat->dev = BTRFS_I(inode)->root->anon_dev;
+	if ((inode->i_flags & S_AUTOMOUNT) &&
+	    path->dentry != path->mnt->mnt_root) {
+		/* This is the mounted-on side of the automount,
+		 * so we show the inode number from the ROOT_ITEM key
+		 * and the dev of the mountpoint.
+		 */
+		stat->ino = btrfs_location_to_ino(&BTRFS_I(inode)->root->root_key);
+		stat->dev = BTRFS_I(d_inode(path->mnt->mnt_root))->root->anon_dev;
+	}
 
 	spin_lock(&BTRFS_I(inode)->lock);
 	delalloc_bytes = BTRFS_I(inode)->new_delalloc_bytes;
@@ -10844,4 +10951,5 @@ static const struct inode_operations btrfs_symlink_inode_operations = {
 
 const struct dentry_operations btrfs_dentry_operations = {
 	.d_delete = btrfs_dentry_delete,
+	.d_automount = btrfs_automount,
 };
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d07b18b2b250..33008e432a15 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -338,6 +338,7 @@ void __btrfs_panic(struct btrfs_fs_info *fs_info, const char *function,
 static void btrfs_put_super(struct super_block *sb)
 {
 	close_ctree(btrfs_sb(sb));
+	btrfs_release_automount_timer();
 }
 
 enum {