From patchwork Mon Aug 19 18:17:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768770 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 94984D531; Mon, 19 Aug 2024 18:17:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091473; cv=none; b=f2yCC+T+045uPRZAvV3cVczX2v09bMWTI2Q10115dXTSGHM5ZTpEuxUAFjk2On4SU2e1sbc5hSMYdORiP9alj8huSPmupUDou58qRsQ1er7Op1EUHvQ57NBLDzls5FUdE1ZGuCX7R8GDyxK10TOGG/DNiB+g0B9HcN8dxPlXGUI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091473; c=relaxed/simple; bh=ccJiW4DuhKNnOdiRFGXGLE+maD/kYxEQO1hVWL2sDvs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=q9UyJBzwrJyvnbXJt+GSDoRjQhU814FzOzDf9vv3i/cHeMF7h/IE74DxDRJUNNHXgBQanRxmT1PR3ML70g1GQyI/DVtTw+MM+VVRyXuicPCdC0wLo1uGi4Dn5YI11u8rPCxjYbawfTvVAOJid6+2GlcSpEnWiLxeaeK4AO3MMuw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dpX2avIK; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dpX2avIK" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D8AC5C4AF0C; Mon, 19 Aug 2024 18:17:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091473; bh=ccJiW4DuhKNnOdiRFGXGLE+maD/kYxEQO1hVWL2sDvs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dpX2avIKS0MwSEF0bhdswopUNgYKoehn/3p9q3iQR9VZZoQMTthZIqJ1xkudOudTA 
kMBEb+A9h+cRgLRDe1riPjeAciBmnaIdI57xPa4g9Ld6oJmsfZm+KTYBXXlr432cUy 4bdDtK7FePHmCY3S2Z1NEOhsDcoF4a1AdfDXsHnHEw6m3JwsBPPo9k/X7bPANb8ObC Y39hdhtV9TTklW/AmkQdbEb93Q3CPGBzl7penUpJ6jGcv7PL4F707YfXX3Y16UudRG lN8TMdFUXo8J4F42eUiN+ASwn9iyqyB+BdjkMV0kUK7OmI2IS/0VObDS0ey+NqHYKz UYZCIQkjxUF6A== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 01/24] nfs_common: factor out nfs_errtbl and nfs_stat_to_errno Date: Mon, 19 Aug 2024 14:17:06 -0400 Message-ID: <20240819181750.70570-2-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Common nfs_stat_to_errno() is used by both fs/nfs/nfs2xdr.c and fs/nfs/nfs3xdr.c Will also be used by fs/nfsd/localio.c Signed-off-by: Mike Snitzer --- fs/nfs/Kconfig | 1 + fs/nfs/nfs2xdr.c | 70 +----------------------- fs/nfs/nfs3xdr.c | 108 +++++++------------------------------ fs/nfs/nfs4xdr.c | 4 +- fs/nfs_common/Makefile | 2 + fs/nfs_common/common.c | 67 +++++++++++++++++++++++ fs/nfsd/Kconfig | 1 + include/linux/nfs_common.h | 16 ++++++ 8 files changed, 109 insertions(+), 160 deletions(-) create mode 100644 fs/nfs_common/common.c create mode 100644 include/linux/nfs_common.h diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig index 57249f040dfc..0eb20012792f 100644 --- a/fs/nfs/Kconfig +++ b/fs/nfs/Kconfig @@ -4,6 +4,7 @@ config NFS_FS depends on INET && FILE_LOCKING && MULTIUSER select LOCKD select SUNRPC + select NFS_COMMON select NFS_ACL_SUPPORT if NFS_V3_ACL help Choose Y here if you want to access files residing on other diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c index c19093814296..6e75c6c2d234 100644 --- a/fs/nfs/nfs2xdr.c +++ b/fs/nfs/nfs2xdr.c @@ -22,14 +22,12 @@ #include #include 
#include +#include #include "nfstrace.h" #include "internal.h" #define NFSDBG_FACILITY NFSDBG_XDR -/* Mapping from NFS error code to "errno" error code. */ -#define errno_NFSERR_IO EIO - /* * Declare the space requirements for NFS arguments and replies as * number of 32bit-words @@ -64,8 +62,6 @@ #define NFS_readdirres_sz (1+NFS_pagepad_sz) #define NFS_statfsres_sz (1+NFS_info_sz) -static int nfs_stat_to_errno(enum nfs_stat); - /* * Encode/decode NFSv2 basic data types * @@ -1054,70 +1050,6 @@ static int nfs2_xdr_dec_statfsres(struct rpc_rqst *req, struct xdr_stream *xdr, return nfs_stat_to_errno(status); } - -/* - * We need to translate between nfs status return values and - * the local errno values which may not be the same. - */ -static const struct { - int stat; - int errno; -} nfs_errtbl[] = { - { NFS_OK, 0 }, - { NFSERR_PERM, -EPERM }, - { NFSERR_NOENT, -ENOENT }, - { NFSERR_IO, -errno_NFSERR_IO}, - { NFSERR_NXIO, -ENXIO }, -/* { NFSERR_EAGAIN, -EAGAIN }, */ - { NFSERR_ACCES, -EACCES }, - { NFSERR_EXIST, -EEXIST }, - { NFSERR_XDEV, -EXDEV }, - { NFSERR_NODEV, -ENODEV }, - { NFSERR_NOTDIR, -ENOTDIR }, - { NFSERR_ISDIR, -EISDIR }, - { NFSERR_INVAL, -EINVAL }, - { NFSERR_FBIG, -EFBIG }, - { NFSERR_NOSPC, -ENOSPC }, - { NFSERR_ROFS, -EROFS }, - { NFSERR_MLINK, -EMLINK }, - { NFSERR_NAMETOOLONG, -ENAMETOOLONG }, - { NFSERR_NOTEMPTY, -ENOTEMPTY }, - { NFSERR_DQUOT, -EDQUOT }, - { NFSERR_STALE, -ESTALE }, - { NFSERR_REMOTE, -EREMOTE }, -#ifdef EWFLUSH - { NFSERR_WFLUSH, -EWFLUSH }, -#endif - { NFSERR_BADHANDLE, -EBADHANDLE }, - { NFSERR_NOT_SYNC, -ENOTSYNC }, - { NFSERR_BAD_COOKIE, -EBADCOOKIE }, - { NFSERR_NOTSUPP, -ENOTSUPP }, - { NFSERR_TOOSMALL, -ETOOSMALL }, - { NFSERR_SERVERFAULT, -EREMOTEIO }, - { NFSERR_BADTYPE, -EBADTYPE }, - { NFSERR_JUKEBOX, -EJUKEBOX }, - { -1, -EIO } -}; - -/** - * nfs_stat_to_errno - convert an NFS status code to a local errno - * @status: NFS status code to convert - * - * Returns a local errno value, or -EIO if the NFS status code is 
- * not recognized. This function is used jointly by NFSv2 and NFSv3. - */ -static int nfs_stat_to_errno(enum nfs_stat status) -{ - int i; - - for (i = 0; nfs_errtbl[i].stat != -1; i++) { - if (nfs_errtbl[i].stat == (int)status) - return nfs_errtbl[i].errno; - } - dprintk("NFS: Unrecognized nfs status value: %u\n", status); - return nfs_errtbl[i].errno; -} - #define PROC(proc, argtype, restype, timer) \ [NFSPROC_##proc] = { \ .p_proc = NFSPROC_##proc, \ diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c index 60f032be805a..4ae01c10b7e2 100644 --- a/fs/nfs/nfs3xdr.c +++ b/fs/nfs/nfs3xdr.c @@ -21,14 +21,13 @@ #include #include #include +#include + #include "nfstrace.h" #include "internal.h" #define NFSDBG_FACILITY NFSDBG_XDR -/* Mapping from NFS error code to "errno" error code. */ -#define errno_NFSERR_IO EIO - /* * Declare the space requirements for NFS arguments and replies as * number of 32bit-words @@ -91,8 +90,6 @@ NFS3_pagepad_sz) #define ACL3_setaclres_sz (1+NFS3_post_op_attr_sz) -static int nfs3_stat_to_errno(enum nfs_stat); - /* * Map file type to S_IFMT bits */ @@ -1406,7 +1403,7 @@ static int nfs3_xdr_dec_getattr3res(struct rpc_rqst *req, out: return error; out_default: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1445,7 +1442,7 @@ static int nfs3_xdr_dec_setattr3res(struct rpc_rqst *req, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1495,7 +1492,7 @@ static int nfs3_xdr_dec_lookup3res(struct rpc_rqst *req, error = decode_post_op_attr(xdr, result->dir_attr, userns); if (unlikely(error)) goto out; - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1537,7 +1534,7 @@ static int nfs3_xdr_dec_access3res(struct rpc_rqst *req, out: return error; out_default: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1578,7 +1575,7 @@ static int nfs3_xdr_dec_readlink3res(struct rpc_rqst *req, out: return 
error; out_default: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1658,7 +1655,7 @@ static int nfs3_xdr_dec_read3res(struct rpc_rqst *req, struct xdr_stream *xdr, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1728,7 +1725,7 @@ static int nfs3_xdr_dec_write3res(struct rpc_rqst *req, struct xdr_stream *xdr, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1795,7 +1792,7 @@ static int nfs3_xdr_dec_create3res(struct rpc_rqst *req, error = decode_wcc_data(xdr, result->dir_attr, userns); if (unlikely(error)) goto out; - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1835,7 +1832,7 @@ static int nfs3_xdr_dec_remove3res(struct rpc_rqst *req, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1881,7 +1878,7 @@ static int nfs3_xdr_dec_rename3res(struct rpc_rqst *req, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -1926,7 +1923,7 @@ static int nfs3_xdr_dec_link3res(struct rpc_rqst *req, struct xdr_stream *xdr, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /** @@ -2101,7 +2098,7 @@ static int nfs3_xdr_dec_readdir3res(struct rpc_rqst *req, error = decode_post_op_attr(xdr, result->dir_attr, rpc_rqst_userns(req)); if (unlikely(error)) goto out; - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -2167,7 +2164,7 @@ static int nfs3_xdr_dec_fsstat3res(struct rpc_rqst *req, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -2243,7 +2240,7 @@ static int nfs3_xdr_dec_fsinfo3res(struct rpc_rqst *req, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -2304,7 
+2301,7 @@ static int nfs3_xdr_dec_pathconf3res(struct rpc_rqst *req, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } /* @@ -2350,7 +2347,7 @@ static int nfs3_xdr_dec_commit3res(struct rpc_rqst *req, out: return error; out_status: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } #ifdef CONFIG_NFS_V3_ACL @@ -2416,7 +2413,7 @@ static int nfs3_xdr_dec_getacl3res(struct rpc_rqst *req, out: return error; out_default: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } static int nfs3_xdr_dec_setacl3res(struct rpc_rqst *req, @@ -2435,76 +2432,11 @@ static int nfs3_xdr_dec_setacl3res(struct rpc_rqst *req, out: return error; out_default: - return nfs3_stat_to_errno(status); + return nfs_stat_to_errno(status); } #endif /* CONFIG_NFS_V3_ACL */ - -/* - * We need to translate between nfs status return values and - * the local errno values which may not be the same. - */ -static const struct { - int stat; - int errno; -} nfs_errtbl[] = { - { NFS_OK, 0 }, - { NFSERR_PERM, -EPERM }, - { NFSERR_NOENT, -ENOENT }, - { NFSERR_IO, -errno_NFSERR_IO}, - { NFSERR_NXIO, -ENXIO }, -/* { NFSERR_EAGAIN, -EAGAIN }, */ - { NFSERR_ACCES, -EACCES }, - { NFSERR_EXIST, -EEXIST }, - { NFSERR_XDEV, -EXDEV }, - { NFSERR_NODEV, -ENODEV }, - { NFSERR_NOTDIR, -ENOTDIR }, - { NFSERR_ISDIR, -EISDIR }, - { NFSERR_INVAL, -EINVAL }, - { NFSERR_FBIG, -EFBIG }, - { NFSERR_NOSPC, -ENOSPC }, - { NFSERR_ROFS, -EROFS }, - { NFSERR_MLINK, -EMLINK }, - { NFSERR_NAMETOOLONG, -ENAMETOOLONG }, - { NFSERR_NOTEMPTY, -ENOTEMPTY }, - { NFSERR_DQUOT, -EDQUOT }, - { NFSERR_STALE, -ESTALE }, - { NFSERR_REMOTE, -EREMOTE }, -#ifdef EWFLUSH - { NFSERR_WFLUSH, -EWFLUSH }, -#endif - { NFSERR_BADHANDLE, -EBADHANDLE }, - { NFSERR_NOT_SYNC, -ENOTSYNC }, - { NFSERR_BAD_COOKIE, -EBADCOOKIE }, - { NFSERR_NOTSUPP, -ENOTSUPP }, - { NFSERR_TOOSMALL, -ETOOSMALL }, - { NFSERR_SERVERFAULT, -EREMOTEIO }, - { NFSERR_BADTYPE, -EBADTYPE }, - 
{ NFSERR_JUKEBOX, -EJUKEBOX }, - { -1, -EIO } -}; - -/** - * nfs3_stat_to_errno - convert an NFS status code to a local errno - * @status: NFS status code to convert - * - * Returns a local errno value, or -EIO if the NFS status code is - * not recognized. This function is used jointly by NFSv2 and NFSv3. - */ -static int nfs3_stat_to_errno(enum nfs_stat status) -{ - int i; - - for (i = 0; nfs_errtbl[i].stat != -1; i++) { - if (nfs_errtbl[i].stat == (int)status) - return nfs_errtbl[i].errno; - } - dprintk("NFS: Unrecognized nfs status value: %u\n", status); - return nfs_errtbl[i].errno; -} - - #define PROC(proc, argtype, restype, timer) \ [NFS3PROC_##proc] = { \ .p_proc = NFS3PROC_##proc, \ diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index 7704a4509676..b4091af1a60d 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -52,6 +52,7 @@ #include #include #include +#include #include "nfs4_fs.h" #include "nfs4trace.h" @@ -63,9 +64,6 @@ #define NFSDBG_FACILITY NFSDBG_XDR -/* Mapping from NFS error code to "errno" error code. */ -#define errno_NFSERR_IO EIO - struct compound_hdr; static int nfs4_stat_to_errno(int); static void encode_layoutget(struct xdr_stream *xdr, diff --git a/fs/nfs_common/Makefile b/fs/nfs_common/Makefile index 119c75ab9fd0..e58b01bb8dda 100644 --- a/fs/nfs_common/Makefile +++ b/fs/nfs_common/Makefile @@ -8,3 +8,5 @@ nfs_acl-objs := nfsacl.o obj-$(CONFIG_GRACE_PERIOD) += grace.o obj-$(CONFIG_NFS_V4_2_SSC_HELPER) += nfs_ssc.o + +obj-$(CONFIG_NFS_COMMON) += common.o diff --git a/fs/nfs_common/common.c b/fs/nfs_common/common.c new file mode 100644 index 000000000000..a4ee95da2174 --- /dev/null +++ b/fs/nfs_common/common.c @@ -0,0 +1,67 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include + +/* + * We need to translate between nfs status return values and + * the local errno values which may not be the same. 
+ */ +static const struct { + int stat; + int errno; +} nfs_errtbl[] = { + { NFS_OK, 0 }, + { NFSERR_PERM, -EPERM }, + { NFSERR_NOENT, -ENOENT }, + { NFSERR_IO, -errno_NFSERR_IO}, + { NFSERR_NXIO, -ENXIO }, +/* { NFSERR_EAGAIN, -EAGAIN }, */ + { NFSERR_ACCES, -EACCES }, + { NFSERR_EXIST, -EEXIST }, + { NFSERR_XDEV, -EXDEV }, + { NFSERR_NODEV, -ENODEV }, + { NFSERR_NOTDIR, -ENOTDIR }, + { NFSERR_ISDIR, -EISDIR }, + { NFSERR_INVAL, -EINVAL }, + { NFSERR_FBIG, -EFBIG }, + { NFSERR_NOSPC, -ENOSPC }, + { NFSERR_ROFS, -EROFS }, + { NFSERR_MLINK, -EMLINK }, + { NFSERR_NAMETOOLONG, -ENAMETOOLONG }, + { NFSERR_NOTEMPTY, -ENOTEMPTY }, + { NFSERR_DQUOT, -EDQUOT }, + { NFSERR_STALE, -ESTALE }, + { NFSERR_REMOTE, -EREMOTE }, +#ifdef EWFLUSH + { NFSERR_WFLUSH, -EWFLUSH }, +#endif + { NFSERR_BADHANDLE, -EBADHANDLE }, + { NFSERR_NOT_SYNC, -ENOTSYNC }, + { NFSERR_BAD_COOKIE, -EBADCOOKIE }, + { NFSERR_NOTSUPP, -ENOTSUPP }, + { NFSERR_TOOSMALL, -ETOOSMALL }, + { NFSERR_SERVERFAULT, -EREMOTEIO }, + { NFSERR_BADTYPE, -EBADTYPE }, + { NFSERR_JUKEBOX, -EJUKEBOX }, + { -1, -EIO } +}; + +/** + * nfs_stat_to_errno - convert an NFS status code to a local errno + * @status: NFS status code to convert + * + * Returns a local errno value, or -EIO if the NFS status code is + * not recognized. This function is used jointly by NFSv2 and NFSv3. 
+ */ +int nfs_stat_to_errno(enum nfs_stat status) +{ + int i; + + for (i = 0; nfs_errtbl[i].stat != -1; i++) { + if (nfs_errtbl[i].stat == (int)status) + return nfs_errtbl[i].errno; + } + return nfs_errtbl[i].errno; +} +EXPORT_SYMBOL_GPL(nfs_stat_to_errno); diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig index ec2ab6429e00..c0bd1509ccd4 100644 --- a/fs/nfsd/Kconfig +++ b/fs/nfsd/Kconfig @@ -7,6 +7,7 @@ config NFSD select LOCKD select SUNRPC select EXPORTFS + select NFS_COMMON select NFS_ACL_SUPPORT if NFSD_V2_ACL select NFS_ACL_SUPPORT if NFSD_V3_ACL depends on MULTIUSER diff --git a/include/linux/nfs_common.h b/include/linux/nfs_common.h new file mode 100644 index 000000000000..3395c4a4d372 --- /dev/null +++ b/include/linux/nfs_common.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * This file contains constants and methods used by both NFS client and server. + */ +#ifndef _LINUX_NFS_COMMON_H +#define _LINUX_NFS_COMMON_H + +#include +#include + +/* Mapping from NFS error code to "errno" error code. 
*/ +#define errno_NFSERR_IO EIO + +int nfs_stat_to_errno(enum nfs_stat status); + +#endif /* _LINUX_NFS_COMMON_H */ From patchwork Mon Aug 19 18:17:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768771 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9341C189517; Mon, 19 Aug 2024 18:17:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091474; cv=none; b=kEmXgkQpc+Kdn+8+eaLbbXv5Dm3S9eJAdnGf1dwrgdT5WtxLYXKbELcH7CldzIm88N4xdfz8+eLcqtz80s8PRKi9JKRVIoReLZrJbxgP9QHjos/6oZ45JfanTfP1uwHsdEI3ZHOhyrGZtRZzAZAFLXGdrfT6pedAWIiqUIJ5Mhs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091474; c=relaxed/simple; bh=naa0+vU2hdij5+Oa6Z4mA9sWp5gwQRiL2MzywjCf7fk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XX72ERiUmY1D0FO+gYyY+fWtI4MFR/XgQuhaWoDjNvrjG/h2hc8E6ELOUxanc4gFC9T9jYCtcvKNgGnXXnR5ZXhnZuAHsHLjFTM2SqKeJoSCBFchDfIiOF5fdFqUbNBMhUeUW87mHi5nVtQ+SRdZvd4mnNAwhuHc54H/O94SNYI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WXQ/Io0T; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WXQ/Io0T" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3CF09C32782; Mon, 19 Aug 2024 18:17:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091474; bh=naa0+vU2hdij5+Oa6Z4mA9sWp5gwQRiL2MzywjCf7fk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=WXQ/Io0TOJG3OTdFhVPiL7MchPvGiVpohEjGbSDjtIiuwk5WJW58rLVPfTMAsO+4G GFb7kr8PPiovhAhvKeK2FQUpbkjgyVluHVRUVNKnvTt04XsryuskUMS4o591pBFRTw Wvfky/vt9dSm+P94IryWfmPovGT6DrvgJLlEJ+p2rAvm2wlPu/RlsH0ryZZvrhtkzl zgEUn6gh3PDn4/GNVyIv+Eo2n8QWlPZVtUROmGEZoyqr7x+25WDh9qpSzKKU3FjdGw YjSBXKL4k+vgYkH/0vNnwjtizz7Xej/ouxB6kvadpYJ1C0d2SoiBjaRC9yM4lp2Qeo TXWi+1urt9/hQ== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 02/24] nfs_common: factor out nfs4_errtbl and nfs4_stat_to_errno Date: Mon, 19 Aug 2024 14:17:07 -0400 Message-ID: <20240819181750.70570-3-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Common nfs4_stat_to_errno() is used by fs/nfs/nfs4xdr.c and will be used by fs/nfs/localio.c Signed-off-by: Mike Snitzer --- fs/nfs/nfs4xdr.c | 67 -------------------------------------- fs/nfs_common/common.c | 67 ++++++++++++++++++++++++++++++++++++++ include/linux/nfs_common.h | 1 + 3 files changed, 68 insertions(+), 67 deletions(-) diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index b4091af1a60d..971305bdaecb 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -65,7 +65,6 @@ #define NFSDBG_FACILITY NFSDBG_XDR struct compound_hdr; -static int nfs4_stat_to_errno(int); static void encode_layoutget(struct xdr_stream *xdr, const struct nfs4_layoutget_args *args, struct compound_hdr *hdr); @@ -7619,72 +7618,6 @@ int nfs4_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry, return 0; } -/* - * We need to translate between nfs status return values and - * the local errno values which may not be the same. 
- */ -static struct { - int stat; - int errno; -} nfs_errtbl[] = { - { NFS4_OK, 0 }, - { NFS4ERR_PERM, -EPERM }, - { NFS4ERR_NOENT, -ENOENT }, - { NFS4ERR_IO, -errno_NFSERR_IO}, - { NFS4ERR_NXIO, -ENXIO }, - { NFS4ERR_ACCESS, -EACCES }, - { NFS4ERR_EXIST, -EEXIST }, - { NFS4ERR_XDEV, -EXDEV }, - { NFS4ERR_NOTDIR, -ENOTDIR }, - { NFS4ERR_ISDIR, -EISDIR }, - { NFS4ERR_INVAL, -EINVAL }, - { NFS4ERR_FBIG, -EFBIG }, - { NFS4ERR_NOSPC, -ENOSPC }, - { NFS4ERR_ROFS, -EROFS }, - { NFS4ERR_MLINK, -EMLINK }, - { NFS4ERR_NAMETOOLONG, -ENAMETOOLONG }, - { NFS4ERR_NOTEMPTY, -ENOTEMPTY }, - { NFS4ERR_DQUOT, -EDQUOT }, - { NFS4ERR_STALE, -ESTALE }, - { NFS4ERR_BADHANDLE, -EBADHANDLE }, - { NFS4ERR_BAD_COOKIE, -EBADCOOKIE }, - { NFS4ERR_NOTSUPP, -ENOTSUPP }, - { NFS4ERR_TOOSMALL, -ETOOSMALL }, - { NFS4ERR_SERVERFAULT, -EREMOTEIO }, - { NFS4ERR_BADTYPE, -EBADTYPE }, - { NFS4ERR_LOCKED, -EAGAIN }, - { NFS4ERR_SYMLINK, -ELOOP }, - { NFS4ERR_OP_ILLEGAL, -EOPNOTSUPP }, - { NFS4ERR_DEADLOCK, -EDEADLK }, - { NFS4ERR_NOXATTR, -ENODATA }, - { NFS4ERR_XATTR2BIG, -E2BIG }, - { -1, -EIO } -}; - -/* - * Convert an NFS error code to a local one. - * This one is used jointly by NFSv2 and NFSv3. - */ -static int -nfs4_stat_to_errno(int stat) -{ - int i; - for (i = 0; nfs_errtbl[i].stat != -1; i++) { - if (nfs_errtbl[i].stat == stat) - return nfs_errtbl[i].errno; - } - if (stat <= 10000 || stat > 10100) { - /* The server is looney tunes. */ - return -EREMOTEIO; - } - /* If we cannot translate the error, the recovery routines should - * handle it. - * Note: remaining NFSv4 error codes have values > 10000, so should - * not conflict with native Linux error codes. 
- */ - return -stat; -} - #ifdef CONFIG_NFS_V4_2 #include "nfs42xdr.c" #endif /* CONFIG_NFS_V4_2 */ diff --git a/fs/nfs_common/common.c b/fs/nfs_common/common.c index a4ee95da2174..34a115176f97 100644 --- a/fs/nfs_common/common.c +++ b/fs/nfs_common/common.c @@ -2,6 +2,7 @@ #include #include +#include /* * We need to translate between nfs status return values and @@ -65,3 +66,69 @@ int nfs_stat_to_errno(enum nfs_stat status) return nfs_errtbl[i].errno; } EXPORT_SYMBOL_GPL(nfs_stat_to_errno); + +/* + * We need to translate between nfs v4 status return values and + * the local errno values which may not be the same. + */ +static const struct { + int stat; + int errno; +} nfs4_errtbl[] = { + { NFS4_OK, 0 }, + { NFS4ERR_PERM, -EPERM }, + { NFS4ERR_NOENT, -ENOENT }, + { NFS4ERR_IO, -errno_NFSERR_IO}, + { NFS4ERR_NXIO, -ENXIO }, + { NFS4ERR_ACCESS, -EACCES }, + { NFS4ERR_EXIST, -EEXIST }, + { NFS4ERR_XDEV, -EXDEV }, + { NFS4ERR_NOTDIR, -ENOTDIR }, + { NFS4ERR_ISDIR, -EISDIR }, + { NFS4ERR_INVAL, -EINVAL }, + { NFS4ERR_FBIG, -EFBIG }, + { NFS4ERR_NOSPC, -ENOSPC }, + { NFS4ERR_ROFS, -EROFS }, + { NFS4ERR_MLINK, -EMLINK }, + { NFS4ERR_NAMETOOLONG, -ENAMETOOLONG }, + { NFS4ERR_NOTEMPTY, -ENOTEMPTY }, + { NFS4ERR_DQUOT, -EDQUOT }, + { NFS4ERR_STALE, -ESTALE }, + { NFS4ERR_BADHANDLE, -EBADHANDLE }, + { NFS4ERR_BAD_COOKIE, -EBADCOOKIE }, + { NFS4ERR_NOTSUPP, -ENOTSUPP }, + { NFS4ERR_TOOSMALL, -ETOOSMALL }, + { NFS4ERR_SERVERFAULT, -EREMOTEIO }, + { NFS4ERR_BADTYPE, -EBADTYPE }, + { NFS4ERR_LOCKED, -EAGAIN }, + { NFS4ERR_SYMLINK, -ELOOP }, + { NFS4ERR_OP_ILLEGAL, -EOPNOTSUPP }, + { NFS4ERR_DEADLOCK, -EDEADLK }, + { NFS4ERR_NOXATTR, -ENODATA }, + { NFS4ERR_XATTR2BIG, -E2BIG }, + { -1, -EIO } +}; + +/* + * Convert an NFS error code to a local one. + * This one is used by NFSv4. 
+ */ +int nfs4_stat_to_errno(int stat) +{ + int i; + for (i = 0; nfs4_errtbl[i].stat != -1; i++) { + if (nfs4_errtbl[i].stat == stat) + return nfs4_errtbl[i].errno; + } + if (stat <= 10000 || stat > 10100) { + /* The server is looney tunes. */ + return -EREMOTEIO; + } + /* If we cannot translate the error, the recovery routines should + * handle it. + * Note: remaining NFSv4 error codes have values > 10000, so should + * not conflict with native Linux error codes. + */ + return -stat; +} +EXPORT_SYMBOL_GPL(nfs4_stat_to_errno); diff --git a/include/linux/nfs_common.h b/include/linux/nfs_common.h index 3395c4a4d372..5fc02df88252 100644 --- a/include/linux/nfs_common.h +++ b/include/linux/nfs_common.h @@ -12,5 +12,6 @@ #define errno_NFSERR_IO EIO int nfs_stat_to_errno(enum nfs_stat status); +int nfs4_stat_to_errno(int stat); #endif /* _LINUX_NFS_COMMON_H */ From patchwork Mon Aug 19 18:17:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768772 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 46E3A189517; Mon, 19 Aug 2024 18:17:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091476; cv=none; b=N+uRLAbhDEu2/DQGwp7wCPL7nK8cnhvKVqEUMVHt0Om7Y0GrzGeppSaAMPS0+P1/xCwgoX9MYq/aM+wdGrPGNKFm6qQwr0rsdfgj9j063qafrQ+G0V3r4bb0C7GvxpPRjZxaNA5aDkQWVWblKHrCFADt/DSmdBoxEz2BPtDXDNI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091476; c=relaxed/simple; bh=UKCjVhygcTf0F6+tr/Rm6lVEtvk5t+MoIGPRMLHRf0A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=L9xude6Btn5cFu+43dBuQRKYm97JRV3qDGwQwrxPiDMh5gqn+5jZMaZDE7OKwtBdcwTOcnr+pX+MtCdu0Lmg89aSQlvXZXMWYUxhC3nhxcBO8IEr66S4+5Y8IzXWC0Ud5WAgSTU0J8Kb5dC0iP4XvXOI1hg8nDJ6HOm/PIsUMQI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Kegi7j+J; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Kegi7j+J" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 94B25C4AF11; Mon, 19 Aug 2024 18:17:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091475; bh=UKCjVhygcTf0F6+tr/Rm6lVEtvk5t+MoIGPRMLHRf0A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Kegi7j+J+yVtai8IFiJa2JFICr5hdhIObLFGBOK2IPKxlZupRllmAwGmYVVAMhnSf BM+0V8N2M57vCesxzW7HVIbP5vInRwFOhiZ1vtOJAOvZWLEpv48fxd1tg42q9nHwGk SUFeCPs3JUySlLfc/nKg9A7KIcRlKa6/i5ouI8SZwCrap77DtEuluL3uEKpqrnr8YW 7CKoGqydVLDiHdFO4tVcKwkFDS3HM0lOGFtt8QY5Y9d3j7f2kvz+MPb/EhazS1X0dV snbFgEFCVN3BlfSxpTPUIQ3DCb+8xZEa65nv5ZxlQUdSVtz5sFojS76EWQyj5tx2FJ Ed6I/Ij2Ak7xg== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 03/24] nfs: factor out {encode,decode}_opaque_fixed to nfs_xdr.h Date: Mon, 19 Aug 2024 14:17:08 -0400 Message-ID: <20240819181750.70570-4-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Eliminates duplicate functions in various files to allow for additional callers. 
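The consolidated helpers can be pictured with a small userspace sketch. Everything below is hypothetical and only illustrative: `struct sketch_stream` and the `sketch_*` functions are stand-ins invented here, not the kernel's `struct xdr_stream` API. The sketch assumes a flat buffer, shows the four-byte XDR padding of a fixed-length opaque, and shows a single shared decode wrapper that collapses any low-level failure to -EIO, as decode_opaque_fixed() does in this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct xdr_stream: a cursor over a flat buffer. */
struct sketch_stream {
	unsigned char *p;	/* next byte to read or write */
	unsigned char *end;	/* one past the last usable byte */
};

/* XDR pads every opaque item to a multiple of four bytes. */
static size_t sketch_pad4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Low-level encode: 0 on success, -EMSGSIZE if the buffer is too small. */
static int sketch_stream_encode(struct sketch_stream *s,
				const void *buf, size_t len)
{
	size_t padded = sketch_pad4(len);

	if ((size_t)(s->end - s->p) < padded)
		return -EMSGSIZE;
	memcpy(s->p, buf, len);
	memset(s->p + len, 0, padded - len);	/* zero the pad bytes */
	s->p += padded;
	return 0;
}

/* Low-level decode: 0 on success, -EBADMSG if too few bytes remain. */
static int sketch_stream_decode(struct sketch_stream *s,
				void *buf, size_t len)
{
	size_t padded = sketch_pad4(len);

	if ((size_t)(s->end - s->p) < padded)
		return -EBADMSG;
	memcpy(buf, s->p, len);
	s->p += padded;		/* skip the payload and its padding */
	return 0;
}

/* Shared wrapper in the style of the patch: callers see only 0 or -EIO. */
static inline int sketch_decode_opaque_fixed(struct sketch_stream *s,
					     void *buf, size_t len)
{
	if (sketch_stream_decode(s, buf, len) < 0)
		return -EIO;
	return 0;
}
```

Keeping the wrapper as a static inline in one header, as the patch does in nfs_xdr.h, lets every caller share one definition without adding an exported symbol.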
Reviewed-by: NeilBrown Signed-off-by: Mike Snitzer --- fs/nfs/flexfilelayout/flexfilelayout.c | 6 ------ fs/nfs/nfs4xdr.c | 13 ------------- include/linux/nfs_xdr.h | 20 +++++++++++++++++++- 3 files changed, 19 insertions(+), 20 deletions(-) diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index 39ba9f4208aa..d4d551ffea7b 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -2086,12 +2086,6 @@ static int ff_layout_encode_ioerr(struct xdr_stream *xdr, return ff_layout_encode_ds_ioerr(xdr, &ff_args->errors); } -static void -encode_opaque_fixed(struct xdr_stream *xdr, const void *buf, size_t len) -{ - WARN_ON_ONCE(xdr_stream_encode_opaque_fixed(xdr, buf, len) < 0); -} - static void ff_layout_encode_ff_iostat_head(struct xdr_stream *xdr, const nfs4_stateid *stateid, diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index 971305bdaecb..6bf2d44e5d4e 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -972,11 +972,6 @@ static __be32 *reserve_space(struct xdr_stream *xdr, size_t nbytes) return p; } -static void encode_opaque_fixed(struct xdr_stream *xdr, const void *buf, size_t len) -{ - WARN_ON_ONCE(xdr_stream_encode_opaque_fixed(xdr, buf, len) < 0); -} - static void encode_string(struct xdr_stream *xdr, unsigned int len, const char *str) { WARN_ON_ONCE(xdr_stream_encode_opaque(xdr, str, len) < 0); @@ -4406,14 +4401,6 @@ static int decode_access(struct xdr_stream *xdr, u32 *supported, u32 *access) return 0; } -static int decode_opaque_fixed(struct xdr_stream *xdr, void *buf, size_t len) -{ - ssize_t ret = xdr_stream_decode_opaque_fixed(xdr, buf, len); - if (unlikely(ret < 0)) - return -EIO; - return 0; -} - static int decode_stateid(struct xdr_stream *xdr, nfs4_stateid *stateid) { return decode_opaque_fixed(xdr, stateid, NFS4_STATEID_SIZE); diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index 45623af3e7b8..5e93fbfb785a 100644 --- a/include/linux/nfs_xdr.h +++ 
b/include/linux/nfs_xdr.h @@ -1853,6 +1853,24 @@ struct nfs_rpc_ops { void (*disable_swap)(struct inode *inode); }; +/* + * Helper functions used by NFS client and/or server + */ +static inline void encode_opaque_fixed(struct xdr_stream *xdr, + const void *buf, size_t len) +{ + WARN_ON_ONCE(xdr_stream_encode_opaque_fixed(xdr, buf, len) < 0); +} + +static inline int decode_opaque_fixed(struct xdr_stream *xdr, + void *buf, size_t len) +{ + ssize_t ret = xdr_stream_decode_opaque_fixed(xdr, buf, len); + if (unlikely(ret < 0)) + return -EIO; + return 0; +} + /* * Function vectors etc. for the NFS client */ @@ -1866,4 +1884,4 @@ extern const struct rpc_version nfs_version4; extern const struct rpc_version nfsacl_version3; extern const struct rpc_program nfsacl_program; -#endif +#endif /* _LINUX_NFS_XDR_H */ From patchwork Mon Aug 19 18:17:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768773 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 83B1F13BAC6; Mon, 19 Aug 2024 18:17:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091477; cv=none; b=SkRr2fyoVplNaP37WSA+rTFX0lfuhCxSjsWf0EuSFSzPwE7b08w+oNXJ1ypVMD1iro0M330AVzwPq2qy91TMIwz0zFbuyfKvsPN4BQIhRuLVuDwFvVt9f5lHi6Pk0umgsrwULuLeF9HaIlzPC9XRWbAji1kG+4ovS5RUzbAzORE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091477; c=relaxed/simple; bh=aYdATA38A+TwkIIsetF8MEPAu7A+MWMvdfXrRs73iOw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=E+/WYtAFn9VzC3M5H/bYSsdzZjwrYkNnssIgmikIMvhN4vYQx9LpW01NiijWRTe+yjWxfdhE03dzh0IhfBvm5J8U8b5j+k6Eywp+0EGXJ+go2VLxPNWLOT/2Msj01yoz8ksOZlfMC0h9PqrXK+S6IDZ1SlphIYc6hoc0wmLUnCU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=nNRfk7Iy; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="nNRfk7Iy" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1CF58C4AF0E; Mon, 19 Aug 2024 18:17:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091477; bh=aYdATA38A+TwkIIsetF8MEPAu7A+MWMvdfXrRs73iOw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nNRfk7IyeEa+u8pes0UCXsRMd/20R6dyFgEGkrHbb8PZUIVhWSoidC3++YdME5MPy nZaOJjxaXiGiuV0SPq6muuFSZ8u3kMcCFC6zZ+NbHrFAi1dmyMEBPd9ADBA62WyS1K K83VNFeSRJHx7UbGrLAum/WozhL2MFf1NrA32t8LbRMnA/6scTUM5qVicGrcEZAHqP CMm+npdmvYq4FZFpJBpKf65xYbHWOi8MFbkNQze95zpqF8Er3Lvf7WAFYt6g+MM9TP /quq5IQ/QzyMicLYtzaaWwNM0j9S2aonhVEwHX66KA60ZZAltwI6qHxlkiA7Oe6xJM w0ZMSzmZ1QlmQ== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 04/24] nfsd: factor out __fh_verify to allow NULL rqstp to be passed Date: Mon, 19 Aug 2024 14:17:09 -0400 Message-ID: <20240819181750.70570-5-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: NeilBrown __fh_verify() offers an interface like fh_verify() but doesn't require a struct svc_rqst *, instead it also takes the specific parts as explicit required arguments. 
So it is safe to call __fh_verify() with a NULL rqstp, but the net, cred, and client args must not be NULL.

__fh_verify() does not use SVC_NET(), nor do the functions it calls.

Rather than depending on rqstp->rq_vers to determine the NFS version, pass it in explicitly. This removes another dependency on rqstp and ensures the correct version is checked. The rqstp can be for an NLM request, and while some code tests for that, other code does not.

Rather than using rqstp->rq_client, pass the client and gssclient explicitly to __fh_verify() and then on to nfsd_set_fh_dentry().

The final places where __fh_verify() unconditionally dereferences rqstp involve checking whether the connection is suitably secure. They look at rqstp->rq_xprt, which is not meaningful in the target use case of "localio" NFS, in which the client talks directly to the local server. So have these checks always succeed when rqstp is NULL.

Lastly, 4 associated tracepoints are only used if rqstp is not NULL (this is a stop-gap that will be properly fixed in the next commit).
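The two rules described above (derive the NFS version explicitly, and let the transport checks pass when there is no rqstp) can be sketched roughly as follows. This is a standalone illustration with mock types, not the kernel code: struct mock_rqst, mock_nfs_vers() and mock_port_check_ok() are hypothetical names, and the GSS exception of the real port check is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* RPC program numbers: NFS is 100003, NLM is 100021. */
#define MOCK_NFS_PROGRAM 100003u
#define MOCK_NLM_PROGRAM 100021u

/* Hypothetical stand-in for the few svc_rqst fields this sketch needs. */
struct mock_rqst {
	unsigned int prog;   /* RPC program of the request */
	unsigned int vers;   /* RPC version of the request */
	bool secure_port;    /* request arrived from a privileged port */
};

/*
 * Derive the NFS version the way the fh_verify() wrapper in this patch
 * does: NFS requests use the RPC version directly; anything else is
 * taken to be NLM, where NLMv4 pairs with NFSv3 and older NLM versions
 * pair with NFSv2.
 */
static int mock_nfs_vers(const struct mock_rqst *rqstp)
{
	assert(rqstp != NULL); /* only the rqstp-based path derives a version */
	if (rqstp->prog == MOCK_NFS_PROGRAM)
		return (int)rqstp->vers;
	return rqstp->vers == 4 ? 3 : 2;
}

/*
 * Transport-level check: with no rqstp (the LOCALIO case) there is no
 * connection to inspect, so the check always passes.
 */
static bool mock_port_check_ok(const struct mock_rqst *rqstp,
			       bool export_allows_insecure)
{
	if (rqstp == NULL)
		return true;
	return export_allows_insecure || rqstp->secure_port;
}
```

Keeping the NULL test at the top of the transport check means callers never have to special-case LOCALIO themselves, which appears to be the intent of the "if (!rqstp)" early return added to check_nfsd_access().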
Signed-off-by: NeilBrown Co-developed-by: Mike Snitzer Signed-off-by: Mike Snitzer --- fs/nfsd/export.c | 8 ++- fs/nfsd/nfsfh.c | 124 ++++++++++++++++++++++++++++------------------- 2 files changed, 82 insertions(+), 50 deletions(-) diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c index 7bb4f2075ac5..fe36f441d1d9 100644 --- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -1077,7 +1077,13 @@ static struct svc_export *exp_find(struct cache_detail *cd, __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp) { struct exp_flavor_info *f, *end = exp->ex_flavors + exp->ex_nflavors; - struct svc_xprt *xprt = rqstp->rq_xprt; + struct svc_xprt *xprt; + + if (!rqstp) + /* Always allow LOCALIO */ + return 0; + + xprt = rqstp->rq_xprt; if (exp->ex_xprtsec_modes & NFSEXP_XPRTSEC_NONE) { if (!test_bit(XPT_TLS_SESSION, &xprt->xpt_flags)) diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index 50d23d56f403..19e173187ab9 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -87,23 +87,24 @@ nfsd_mode_check(struct dentry *dentry, umode_t requested) return nfserr_wrong_type; } -static bool nfsd_originating_port_ok(struct svc_rqst *rqstp, int flags) +static bool nfsd_originating_port_ok(struct svc_rqst *rqstp, + struct svc_cred *cred, + struct svc_export *exp) { - if (flags & NFSEXP_INSECURE_PORT) + if (nfsexp_flags(cred, exp) & NFSEXP_INSECURE_PORT) return true; /* We don't require gss requests to use low ports: */ - if (rqstp->rq_cred.cr_flavor >= RPC_AUTH_GSS) + if (cred->cr_flavor >= RPC_AUTH_GSS) return true; return test_bit(RQ_SECURE, &rqstp->rq_flags); } static __be32 nfsd_setuser_and_check_port(struct svc_rqst *rqstp, + struct svc_cred *cred, struct svc_export *exp) { - int flags = nfsexp_flags(&rqstp->rq_cred, exp); - /* Check if the request originated from a secure port. 
*/ - if (!nfsd_originating_port_ok(rqstp, flags)) { + if (rqstp && !nfsd_originating_port_ok(rqstp, cred, exp)) { RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]); dprintk("nfsd: request from insecure port %s!\n", svc_print_addr(rqstp, buf, sizeof(buf))); @@ -111,7 +112,7 @@ static __be32 nfsd_setuser_and_check_port(struct svc_rqst *rqstp, } /* Set user creds for this exportpoint */ - return nfserrno(nfsd_setuser(&rqstp->rq_cred, exp)); + return nfserrno(nfsd_setuser(cred, exp)); } static inline __be32 check_pseudo_root(struct dentry *dentry, @@ -141,7 +142,11 @@ static inline __be32 check_pseudo_root(struct dentry *dentry, * dentry. On success, the results are used to set fh_export and * fh_dentry. */ -static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) +static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct net *net, + struct svc_cred *cred, int nfs_vers, + struct auth_domain *client, + struct auth_domain *gssclient, + struct svc_fh *fhp) { struct knfsd_fh *fh = &fhp->fh_handle; struct fid *fid = NULL; @@ -183,14 +188,15 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) data_left -= len; if (data_left < 0) return error; - exp = rqst_exp_find(&rqstp->rq_chandle, SVC_NET(rqstp), - rqstp->rq_client, rqstp->rq_gssclient, + exp = rqst_exp_find(rqstp ? 
&rqstp->rq_chandle : NULL, + net, client, gssclient, fh->fh_fsid_type, fh->fh_fsid); fid = (struct fid *)(fh->fh_fsid + len); error = nfserr_stale; if (IS_ERR(exp)) { - trace_nfsd_set_fh_dentry_badexport(rqstp, fhp, PTR_ERR(exp)); + if (rqstp) + trace_nfsd_set_fh_dentry_badexport(rqstp, fhp, PTR_ERR(exp)); if (PTR_ERR(exp) == -ENOENT) return error; @@ -219,7 +225,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) put_cred(override_creds(new)); put_cred(new); } else { - error = nfsd_setuser_and_check_port(rqstp, exp); + error = nfsd_setuser_and_check_port(rqstp, cred, exp); if (error) goto out; } @@ -238,7 +244,8 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) data_left, fileid_type, 0, nfsd_acceptable, exp); if (IS_ERR_OR_NULL(dentry)) { - trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp, + if (rqstp) + trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp, dentry ? PTR_ERR(dentry) : -ESTALE); switch (PTR_ERR(dentry)) { case -ENOMEM: @@ -266,7 +273,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) fhp->fh_dentry = dentry; fhp->fh_export = exp; - switch (rqstp->rq_vers) { + switch (nfs_vers) { case 4: if (dentry->d_sb->s_export_op->flags & EXPORT_OP_NOATOMIC_ATTR) fhp->fh_no_atomic_attr = true; @@ -293,50 +300,29 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) return error; } -/** - * fh_verify - filehandle lookup and access checking - * @rqstp: pointer to current rpc request - * @fhp: filehandle to be verified - * @type: expected type of object pointed to by filehandle - * @access: type of access needed to object - * - * Look up a dentry from the on-the-wire filehandle, check the client's - * access to the export, and set the current task's credentials. - * - * Regardless of success or failure of fh_verify(), fh_put() should be - * called on @fhp when the caller is finished with the filehandle. 
- * - * fh_verify() may be called multiple times on a given filehandle, for - * example, when processing an NFSv4 compound. The first call will look - * up a dentry using the on-the-wire filehandle. Subsequent calls will - * skip the lookup and just perform the other checks and possibly change - * the current task's credentials. - * - * @type specifies the type of object expected using one of the S_IF* - * constants defined in include/linux/stat.h. The caller may use zero - * to indicate that it doesn't care, or a negative integer to indicate - * that it expects something not of the given type. - * - * @access is formed from the NFSD_MAY_* constants defined in - * fs/nfsd/vfs.h. - */ -__be32 -fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, int access) +static __be32 +__fh_verify(struct svc_rqst *rqstp, + struct net *net, struct svc_cred *cred, + int nfs_vers, struct auth_domain *client, + struct auth_domain *gssclient, + struct svc_fh *fhp, umode_t type, int access) { - struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); + struct nfsd_net *nn = net_generic(net, nfsd_net_id); struct svc_export *exp = NULL; struct dentry *dentry; __be32 error; if (!fhp->fh_dentry) { - error = nfsd_set_fh_dentry(rqstp, fhp); + error = nfsd_set_fh_dentry(rqstp, net, cred, nfs_vers, + client, gssclient, fhp); if (error) goto out; } dentry = fhp->fh_dentry; exp = fhp->fh_export; - trace_nfsd_fh_verify(rqstp, fhp, type, access); + if (rqstp) + trace_nfsd_fh_verify(rqstp, fhp, type, access); /* * We still have to do all these permission checks, even when @@ -358,7 +344,7 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, int access) if (error) goto out; - error = nfsd_setuser_and_check_port(rqstp, exp); + error = nfsd_setuser_and_check_port(rqstp, cred, exp); if (error) goto out; @@ -388,14 +374,54 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, int access) skip_pseudoflavor_check: /* Finally, check access 
permissions. */ - error = nfsd_permission(&rqstp->rq_cred, exp, dentry, access); + error = nfsd_permission(cred, exp, dentry, access); out: - trace_nfsd_fh_verify_err(rqstp, fhp, type, access, error); + if (rqstp) + trace_nfsd_fh_verify_err(rqstp, fhp, type, access, error); if (error == nfserr_stale) nfsd_stats_fh_stale_inc(nn, exp); return error; } +/** + * fh_verify - filehandle lookup and access checking + * @rqstp: pointer to current rpc request + * @fhp: filehandle to be verified + * @type: expected type of object pointed to by filehandle + * @access: type of access needed to object + * + * Look up a dentry from the on-the-wire filehandle, check the client's + * access to the export, and set the current task's credentials. + * + * Regardless of success or failure of fh_verify(), fh_put() should be + * called on @fhp when the caller is finished with the filehandle. + * + * fh_verify() may be called multiple times on a given filehandle, for + * example, when processing an NFSv4 compound. The first call will look + * up a dentry using the on-the-wire filehandle. Subsequent calls will + * skip the lookup and just perform the other checks and possibly change + * the current task's credentials. + * + * @type specifies the type of object expected using one of the S_IF* + * constants defined in include/linux/stat.h. The caller may use zero + * to indicate that it doesn't care, or a negative integer to indicate + * that it expects something not of the given type. + * + * @access is formed from the NFSD_MAY_* constants defined in + * fs/nfsd/vfs.h. + */ +__be32 +fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, int access) +{ + int nfs_vers; + if (rqstp->rq_prog == NFS_PROGRAM) + nfs_vers = rqstp->rq_vers; + else /* must be NLM */ + nfs_vers = rqstp->rq_vers == 4 ? 3 : 2; + return __fh_verify(rqstp, SVC_NET(rqstp), &rqstp->rq_cred, nfs_vers, + rqstp->rq_client, rqstp->rq_gssclient, + fhp, type, access); +} /* * Compose a file handle for an NFS reply. 
From patchwork Mon Aug 19 18:17:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768774 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E0C7613BAC6; Mon, 19 Aug 2024 18:17:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091479; cv=none; b=GfVy6pP4Qc3ahkoEOE5vpIbaxshoeGxZsoIfRBKGiYr8cez8i6xxLwuABRjM+3gcQcjqafwfOypdSy1HIcBO6ncC7uHceqb22qgEw259YfFecNHT40B9Ku7zQ2nHcL9PZ98sU4ndBqq1ukhcuCEmY7o/955tkScqda8ITsvutvI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091479; c=relaxed/simple; bh=xp46w2XPQ1+9n25BOW0sy7euMG8U+rFu4BOmLblYQ2s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ImfMXXYzhTkz1UakrY0elmfXWSHlwJh6oOTdOkVckbz6VuzJnyzan4EOfMC6OXVfMxNxaLO/M0FiIAEkEsueWsZ+kYE+5nxySj7NYmIVecbyPqpLryjBg9jVD69LuG3R6+plGHZD1ZndQu0MNJS6dXFagtKKYzU8sCibqORmBJI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=H/GDtgyg; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="H/GDtgyg" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 87676C4AF0C; Mon, 19 Aug 2024 18:17:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091478; bh=xp46w2XPQ1+9n25BOW0sy7euMG8U+rFu4BOmLblYQ2s=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=H/GDtgygkysrDNBKJIVtXWKsGTL8NW2iIs33FGLtrlXePZgQwwTsTVYc4szhI5GRO 
pIxU8K2O1c4+eCeC9QaavfHPAlNeLzckU58wZFJd36c2otT7GNN2CbEvB49l0QXy8T sxpnaZvK2KcnsJmAaTwnphVGrcucrGWTdDR+fd8oPZDHjOBAHFf00Wlspmp2RExZAX dFVRgI8qvkVa1J1bG/1EqZqD2spKetBIDwoi2KquJfZ0rjfqZagbMH7D6N1ukABU1R 4E12GUjFKx2l35QQPu+dlaWOQ6M2KsIyLfmebrSvm/CgmMf56bj+1oA12OjoEPJhH4 iPv+yN80nVSKA== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 05/24] nfsd: fix nfsfh tracepoints to properly handle NULL rqstp Date: Mon, 19 Aug 2024 14:17:10 -0400 Message-ID: <20240819181750.70570-6-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Fixes stop-gap used in previous commit where caller avoided using tracepoint if rqstp is NULL. Instead, have each tracepoint avoid dereferencing NULL rqstp. Signed-off-by: Mike Snitzer --- fs/nfsd/nfsfh.c | 12 ++++-------- fs/nfsd/trace.h | 36 +++++++++++++++++++++--------------- 2 files changed, 25 insertions(+), 23 deletions(-) diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index 19e173187ab9..bae727e65214 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -195,8 +195,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct net *net, error = nfserr_stale; if (IS_ERR(exp)) { - if (rqstp) - trace_nfsd_set_fh_dentry_badexport(rqstp, fhp, PTR_ERR(exp)); + trace_nfsd_set_fh_dentry_badexport(rqstp, fhp, PTR_ERR(exp)); if (PTR_ERR(exp) == -ENOENT) return error; @@ -244,8 +243,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct net *net, data_left, fileid_type, 0, nfsd_acceptable, exp); if (IS_ERR_OR_NULL(dentry)) { - if (rqstp) - trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp, + trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp, dentry ? 
PTR_ERR(dentry) : -ESTALE); switch (PTR_ERR(dentry)) { case -ENOMEM: @@ -321,8 +319,7 @@ __fh_verify(struct svc_rqst *rqstp, dentry = fhp->fh_dentry; exp = fhp->fh_export; - if (rqstp) - trace_nfsd_fh_verify(rqstp, fhp, type, access); + trace_nfsd_fh_verify(net, rqstp, fhp, type, access); /* * We still have to do all these permission checks, even when @@ -376,8 +373,7 @@ __fh_verify(struct svc_rqst *rqstp, /* Finally, check access permissions. */ error = nfsd_permission(cred, exp, dentry, access); out: - if (rqstp) - trace_nfsd_fh_verify_err(rqstp, fhp, type, access, error); + trace_nfsd_fh_verify_err(net, rqstp, fhp, type, access, error); if (error == nfserr_stale) nfsd_stats_fh_stale_inc(nn, exp); return error; diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h index 77bbd23aa150..d49b3c1e3ba9 100644 --- a/fs/nfsd/trace.h +++ b/fs/nfsd/trace.h @@ -195,12 +195,13 @@ TRACE_EVENT(nfsd_compound_encode_err, TRACE_EVENT(nfsd_fh_verify, TP_PROTO( + const struct net *net, const struct svc_rqst *rqstp, const struct svc_fh *fhp, umode_t type, int access ), - TP_ARGS(rqstp, fhp, type, access), + TP_ARGS(net, rqstp, fhp, type, access), TP_STRUCT__entry( __field(unsigned int, netns_ino) __sockaddr(server, rqstp->rq_xprt->xpt_remotelen) @@ -212,12 +213,14 @@ TRACE_EVENT(nfsd_fh_verify, __field(unsigned long, access) ), TP_fast_assign( - __entry->netns_ino = SVC_NET(rqstp)->ns.inum; - __assign_sockaddr(server, &rqstp->rq_xprt->xpt_local, - rqstp->rq_xprt->xpt_locallen); - __assign_sockaddr(client, &rqstp->rq_xprt->xpt_remote, - rqstp->rq_xprt->xpt_remotelen); - __entry->xid = be32_to_cpu(rqstp->rq_xid); + __entry->netns_ino = net->ns.inum; + if (rqstp) { + __assign_sockaddr(server, &rqstp->rq_xprt->xpt_local, + rqstp->rq_xprt->xpt_locallen); + __assign_sockaddr(client, &rqstp->rq_xprt->xpt_remote, + rqstp->rq_xprt->xpt_remotelen); + } + __entry->xid = rqstp ? 
be32_to_cpu(rqstp->rq_xid) : 0; __entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle); __entry->inode = d_inode(fhp->fh_dentry); __entry->type = type; @@ -232,13 +235,14 @@ TRACE_EVENT(nfsd_fh_verify, TRACE_EVENT_CONDITION(nfsd_fh_verify_err, TP_PROTO( + const struct net *net, const struct svc_rqst *rqstp, const struct svc_fh *fhp, umode_t type, int access, __be32 error ), - TP_ARGS(rqstp, fhp, type, access, error), + TP_ARGS(net, rqstp, fhp, type, access, error), TP_CONDITION(error), TP_STRUCT__entry( __field(unsigned int, netns_ino) @@ -252,12 +256,14 @@ TRACE_EVENT_CONDITION(nfsd_fh_verify_err, __field(int, error) ), TP_fast_assign( - __entry->netns_ino = SVC_NET(rqstp)->ns.inum; - __assign_sockaddr(server, &rqstp->rq_xprt->xpt_local, - rqstp->rq_xprt->xpt_locallen); - __assign_sockaddr(client, &rqstp->rq_xprt->xpt_remote, - rqstp->rq_xprt->xpt_remotelen); - __entry->xid = be32_to_cpu(rqstp->rq_xid); + __entry->netns_ino = net->ns.inum; + if (rqstp) { + __assign_sockaddr(server, &rqstp->rq_xprt->xpt_local, + rqstp->rq_xprt->xpt_locallen); + __assign_sockaddr(client, &rqstp->rq_xprt->xpt_remote, + rqstp->rq_xprt->xpt_remotelen); + } + __entry->xid = rqstp ? be32_to_cpu(rqstp->rq_xid) : 0; __entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle); if (fhp->fh_dentry) __entry->inode = d_inode(fhp->fh_dentry); @@ -286,7 +292,7 @@ DECLARE_EVENT_CLASS(nfsd_fh_err_class, __field(int, status) ), TP_fast_assign( - __entry->xid = be32_to_cpu(rqstp->rq_xid); + __entry->xid = rqstp ? 
be32_to_cpu(rqstp->rq_xid) : 0; __entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle); __entry->status = status; ), From patchwork Mon Aug 19 18:17:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768775 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 574F2189B80; Mon, 19 Aug 2024 18:18:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091480; cv=none; b=orbyYMwaf4DdSkUVKQAH/UySaB3y10qt55vRxrDkeP5hA3FNHQUWtPtiY6IxZg3MPK544Uz9mioJLPohOgzVKWM746MArwxj5GgHt6B0bzhJs9GbN5Z5z/gGkEyAfel7oxmZR0cGqtz56c45MhOogAA5DUWyOuYGMhtNevxQp5M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091480; c=relaxed/simple; bh=m6sAKvRq/9Y6O5BOXO3ROXNSE7wChQv9chUGkACkda4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Ukw3P2jO1k1Pcx/LAOTkD/roCGobGHLNw3JH+PqfrfGKSEnDscS3yKtOh6Av9OEc95e+jrD33k96Qlkhn1fV+JMYk11BX69jgsxzJlCO0es65LPM5YPnZhCEO8XcO9AgFtD9B3BYck1PgpIZNPWvU3En3cWJhYJSHjsSxfoIIbk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=k/shJQsA; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="k/shJQsA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id F0275C4AF11; Mon, 19 Aug 2024 18:17:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091480; bh=m6sAKvRq/9Y6O5BOXO3ROXNSE7wChQv9chUGkACkda4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=k/shJQsA0t7oATMShlnb56o3Lral0zUXTE9cIwdkBWlHEULRO9ytUKuuuOw6Aijo/ s6ali16CanR5x1xjxT06s52PAEFHXYxhe/HXpfGKSMSXspbOzRTcbcM9/xDFwW4hTi msOdYT76lnBt5UP3bepvScJEboGpIU/G0szUH5CyUyhuGckPSTiN82d5jN1RiN0vgy U76R7YNKPW0Lp94TvCPEIVfPOnjxIHARXATbRipJkRh/8BVm4dkLqqvlWoV7hUVOnV cCUg8TWyoXf2v1qOOn16EnYfrq5yad0dfk3CsTqsMYk2k+jrikS8V6atS2Pd6QmE9s RoIJmmfeKpC+w== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 06/24] nfsd: add nfsd_file_acquire_local() Date: Mon, 19 Aug 2024 14:17:11 -0400 Message-ID: <20240819181750.70570-7-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: NeilBrown nfsd_file_acquire_local() can be used to look up a file by filehandle without having a struct svc_rqst. This can be used by NFS LOCALIO to allow the NFS client to bypass the NFS protocol to directly access a file provided by the NFS server which is running in the same kernel. In nfsd_file_do_acquire() care is taken to always use fh_verify() if rqstp is not NULL (as is the case for non-LOCALIO callers). Otherwise the non-LOCALIO callers will not supply the correct and required arguments to __fh_verify (e.g. nfs_vers is 0, gssclient isn't passed). 
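The dispatch rule described above can be sketched in isolation as below. This is a minimal mock, not the kernel implementation: struct mock_rqst, struct mock_verify_args and mock_pick_verify_args() are hypothetical, and the assertion that LOCALIO callers pass a non-zero version is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel's structures are far richer. */
struct mock_rqst {
	int net_id;    /* stands in for SVC_NET(rqstp) */
	int nfs_vers;  /* stands in for the version derived from rqstp */
};

struct mock_verify_args {
	int net_id;
	int nfs_vers;
	int from_rqstp; /* 1 if the values were derived from the request */
};

/*
 * Mirror the choice made at the top of nfsd_file_do_acquire():
 * a non-NULL rqstp always wins and the explicit arguments are ignored
 * (non-LOCALIO callers pass placeholders such as nfs_vers == 0);
 * only the NULL-rqstp path trusts the explicit arguments.
 */
static struct mock_verify_args
mock_pick_verify_args(const struct mock_rqst *rqstp, int net_id, int nfs_vers)
{
	struct mock_verify_args args;

	if (rqstp != NULL) {
		args.net_id = rqstp->net_id;
		args.nfs_vers = rqstp->nfs_vers;
		args.from_rqstp = 1;
	} else {
		/* Assumption: a LOCALIO caller supplies a real version. */
		assert(nfs_vers != 0);
		args.net_id = net_id;
		args.nfs_vers = nfs_vers;
		args.from_rqstp = 0;
	}
	return args;
}
```

With this shape, the existing callers can keep passing NULL and 0 for the new parameters without changing behaviour, because those values are never consulted when rqstp is present.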
Signed-off-by: NeilBrown Signed-off-by: Mike Snitzer --- fs/nfsd/filecache.c | 62 ++++++++++++++++++++++++++++++++++++++++----- fs/nfsd/filecache.h | 4 +++ fs/nfsd/nfsfh.c | 2 +- fs/nfsd/nfsfh.h | 5 ++++ 4 files changed, 65 insertions(+), 8 deletions(-) diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 9e9d246f993c..2cc838bbeb89 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -982,12 +982,14 @@ nfsd_file_is_cached(struct inode *inode) } static __be32 -nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, +nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net, + struct svc_cred *cred, int nfs_vers, + struct auth_domain *client, + struct svc_fh *fhp, unsigned int may_flags, struct file *file, struct nfsd_file **pnf, bool want_gc) { unsigned char need = may_flags & NFSD_FILE_MAY_MASK; - struct net *net = SVC_NET(rqstp); struct nfsd_file *new, *nf; bool stale_retry = true; bool open_retry = true; @@ -996,8 +998,13 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, int ret; retry: - status = fh_verify(rqstp, fhp, S_IFREG, - may_flags|NFSD_MAY_OWNER_OVERRIDE); + if (rqstp) { + status = fh_verify(rqstp, fhp, S_IFREG, + may_flags|NFSD_MAY_OWNER_OVERRIDE); + } else { + status = __fh_verify(NULL, net, cred, nfs_vers, client, NULL, fhp, + S_IFREG, may_flags|NFSD_MAY_OWNER_OVERRIDE); + } if (status != nfs_ok) return status; inode = d_inode(fhp->fh_dentry); @@ -1143,7 +1150,8 @@ __be32 nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp, unsigned int may_flags, struct nfsd_file **pnf) { - return nfsd_file_do_acquire(rqstp, fhp, may_flags, NULL, pnf, true); + return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, 0, NULL, + fhp, may_flags, NULL, pnf, true); } /** @@ -1167,7 +1175,46 @@ __be32 nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, unsigned int may_flags, struct nfsd_file **pnf) { - return nfsd_file_do_acquire(rqstp, fhp, may_flags, NULL, pnf, false); + return 
nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, 0, NULL, + fhp, may_flags, NULL, pnf, false); +} + +/** + * nfsd_file_acquire_local - Get a struct nfsd_file with an open file for localio + * @net: The network namespace in which to perform a lookup + * @cred: the user credential with which to validate access + * @nfs_vers: NFS version number to assume for request + * @client: the auth_domain for LOCALIO lookup + * @fhp: the NFS filehandle of the file to be opened + * @may_flags: NFSD_MAY_ settings for the file + * @pnf: OUT: new or found "struct nfsd_file" object + * + * This file lookup interface provides access to a file given the + * filehandle and credential. No connection-based authorisation + * is performed and in that way it is quite different to other + * file access mediated by nfsd. It allows a kernel module such as the NFS + * client to reach across network and filesystem namespaces to access + * a file. The security implications of this should be carefully + * considered before use. + * + * The nfsd_file object returned by this API is reference-counted + * but not garbage-collected. The object is unhashed after the + * final nfsd_file_put(). + * + * Return values: + * %nfs_ok - @pnf points to an nfsd_file with its reference + * count boosted. + * + * On error, an nfsstat value in network byte order is returned. 
+ */ +__be32 +nfsd_file_acquire_local(struct net *net, struct svc_cred *cred, + int nfs_vers, struct auth_domain *client, + struct svc_fh *fhp, + unsigned int may_flags, struct nfsd_file **pnf) +{ + return nfsd_file_do_acquire(NULL, net, cred, nfs_vers, client, + fhp, may_flags, NULL, pnf, false); } /** @@ -1193,7 +1240,8 @@ nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp, unsigned int may_flags, struct file *file, struct nfsd_file **pnf) { - return nfsd_file_do_acquire(rqstp, fhp, may_flags, file, pnf, false); + return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, 0, NULL, + fhp, may_flags, file, pnf, false); } /* diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h index 3fbec24eea6c..6dab41f8541e 100644 --- a/fs/nfsd/filecache.h +++ b/fs/nfsd/filecache.h @@ -66,5 +66,9 @@ __be32 nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, __be32 nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp, unsigned int may_flags, struct file *file, struct nfsd_file **nfp); +__be32 nfsd_file_acquire_local(struct net *net, struct svc_cred *cred, + int nfs_vers, struct auth_domain *client, + struct svc_fh *fhp, + unsigned int may_flags, struct nfsd_file **pnf); int nfsd_file_cache_stats_show(struct seq_file *m, void *v); #endif /* _FS_NFSD_FILECACHE_H */ diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index bae727e65214..6253505c4555 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -298,7 +298,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct net *net, return error; } -static __be32 +__be32 __fh_verify(struct svc_rqst *rqstp, struct net *net, struct svc_cred *cred, int nfs_vers, struct auth_domain *client, diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index 8d46e203d139..1429bee0ac1c 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -217,6 +217,11 @@ extern char * SVCFH_fmt(struct svc_fh *fhp); * Function prototypes */ __be32 fh_verify(struct svc_rqst *, struct svc_fh *, umode_t, int); +__be32 
__fh_verify(struct svc_rqst *rqstp, + struct net *net, struct svc_cred *cred, + int nfs_vers, struct auth_domain *client, + struct auth_domain *gssclient, + struct svc_fh *fhp, umode_t type, int access); __be32 fh_compose(struct svc_fh *, struct svc_export *, struct dentry *, struct svc_fh *); __be32 fh_update(struct svc_fh *); void fh_put(struct svc_fh *); From patchwork Mon Aug 19 18:17:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768776 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19796189B80; Mon, 19 Aug 2024 18:18:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091482; cv=none; b=WXOg7g68I00BzmS1wTJi5YKJzdWDBAdeLu4G5shPRW9GEbrBXxy1jmoM9i2vORALTMMjjtgWVXwqePOQwLRVvohYclFBUkI8H9/khaOWvZ4+4Qd6hH0qEGXsB+Hi1mUZ9XZ4N6T8TvhRp9c6p0PsN8jb3VrRxFmCxWWqz93ox24= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091482; c=relaxed/simple; bh=sr03EmomsHBukis3MHoLZ8JDIsGx0uE2tUIIuCFaF7Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FgwwUL9/vh45GqZ1BK0+bqgshiLQDuJEinw5L1bOv8PanuidF0CG/yJqoSpWHJHl2E7qS5V+4zUIhRna8axfwPQF0ARGmYN/xvGM14OR3jWVKQez6+YGtkvRV26f9/8zndXYtyJBCZq/Dc4oQBVld2lxgYz2BAliUnay0Lj/AVM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JKCGzhlC; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JKCGzhlC" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 
5CBBEC4AF0F; Mon, 19 Aug 2024 18:18:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091481; bh=sr03EmomsHBukis3MHoLZ8JDIsGx0uE2tUIIuCFaF7Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JKCGzhlCmYElPMoWh+YgAQfHS+vouq/5gc6ZhUy6rDkMw3zrkuoaRnOGQDYnoP6Qw GJ/BBzmtsYZqSVwysqWvlHxR535uPEjfvR3C9eDBX0CucaCvBHltyJ00ZBgCoBL6dc lt9rKS2qP5OvlozmuXL7tt9sKm9vy7j4HKFoIGGa8J0CUQ0A/FMB3BJ4095wi47f9L RMlBIZ7JCqp6NXZ3dqxOTmIbQt0DLTWgSrnMUvHhgWnTKMgOCSCBrN4jivGthckHIX YlqG9SIuwuP3gD2Jh8Sa0Iapl367UChJJbRxqxWjxWA7xq/mtpUe/9bK/QeKaEpeNa xlcqERxvCpX+A== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 07/24] SUNRPC: remove call_allocate() BUG_ONs Date: Mon, 19 Aug 2024 14:17:12 -0400 Message-ID: <20240819181750.70570-8-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Remove BUG_ON if p_arglen=0 to allow RPC with void arg. Remove BUG_ON if p_replen=0 to allow RPC with void return. The former was needed for the first revision of the LOCALIO protocol which had an RPC that took a void arg: /* raw RFC 9562 UUID */ typedef u8 uuid_t; program NFS_LOCALIO_PROGRAM { version LOCALIO_V1 { void NULL(void) = 0; uuid_t GETUUID(void) = 1; } = 1; } = 400122; The latter is needed for the final revision of the LOCALIO protocol which has a UUID_IS_LOCAL RPC which returns a void: /* raw RFC 9562 UUID */ typedef u8 uuid_t; program NFS_LOCALIO_PROGRAM { version LOCALIO_V1 { void NULL(void) = 0; void UUID_IS_LOCAL(uuid_t) = 1; } = 1; } = 400122; There is really no value in triggering a BUG_ON in response to either of these previously unsupported conditions. 
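To make the effect of the removed checks concrete, the sketch below models them as a predicate over a trimmed-down procedure descriptor. It is illustrative only: struct mock_procinfo and mock_old_check_would_bug() are hypothetical, the field names merely echo p_proc, p_arglen and p_replen from the diff, and the descriptor values used with it (for example 4 XDR quads for a 16-byte UUID, and a decoder being registered for a void reply) are assumptions rather than values taken from the LOCALIO code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down view of an RPC procedure descriptor. */
struct mock_procinfo {
	unsigned int p_proc;   /* procedure number */
	unsigned int p_arglen; /* argument length in XDR quads */
	unsigned int p_replen; /* reply length in XDR quads */
	bool has_decode;       /* true if a reply decoder is set */
};

/* Returns true where the removed checks would have hit BUG_ON(). */
static bool mock_old_check_would_bug(const struct mock_procinfo *proc)
{
	assert(proc != NULL);
	if (proc->p_proc == 0)
		return false; /* the NULL procedure was always exempt */
	if (proc->p_arglen == 0)
		return true;  /* void argument, e.g. GETUUID(void) */
	if (proc->has_decode && proc->p_replen == 0)
		return true;  /* void result with a decoder set */
	return false;
}
```

Under these assumptions, both the first-revision GETUUID procedure and the final UUID_IS_LOCAL procedure would have tripped the old checks, while the NULL procedure and any procedure with non-empty argument and reply would not.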
NeilBrown would like the entire 'if (proc->p_proc != 0)' branch removed (not just the one BUG_ON that must be removed for LOCALIO's immediate needs of returning void). Reviewed-by: NeilBrown Signed-off-by: Mike Snitzer --- net/sunrpc/clnt.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 09f29a95f2bc..00fe6df11ab7 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1893,12 +1893,6 @@ call_allocate(struct rpc_task *task) if (req->rq_buffer) return; - if (proc->p_proc != 0) { - BUG_ON(proc->p_arglen == 0); - if (proc->p_decode != NULL) - BUG_ON(proc->p_replen == 0); - } - /* * Calculate the size (in quads) of the RPC call * and reply headers, and convert both values From patchwork Mon Aug 19 18:17:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768777 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 322D8189BAF; Mon, 19 Aug 2024 18:18:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091483; cv=none; b=R5dxMMDKVRmT/enlCyiOe8JybckBWL9wqq35CSCa9eEOpkga1aTq8c1Op9p3I+i0WTKeg9KkpMfe8IO4GGx6Okfg4A/g+xoeuBjbLkamXFjrZyvzRNC6+1Gz4bP1qEGZsOwQ9UMPBHzjaxrnqszTRd0Q09wzZrCNAX3c5/aH4+0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091483; c=relaxed/simple; bh=eKSyo4M/PwL7v/TFiclRIphm4rgipy9E/7hryKERvTk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sGk/zJUu5TyPKp1vH91LQnlOUjLdCJkaGU/X1EJRvgVuuvXO0X51U6WgYnPj0jR6qYoE7Z8pirbFIR+PIdFePJ9XcjnuB6UKjjyh1ezuD5dGSV/0IgKHaqYk0jTDKIEQc0CkfVDbpUpA1rIocrC1jlFSFnmWB+4pN1aWKIlT9Bs= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=grB86qAr; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="grB86qAr" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C2EA4C4AF0C; Mon, 19 Aug 2024 18:18:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091483; bh=eKSyo4M/PwL7v/TFiclRIphm4rgipy9E/7hryKERvTk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=grB86qArsvXye1TvUsFsxOSJerPr4NqyGMXSeXt1hcd7aAjxVK8KLI+RFWSi2Fkxj h/JEz1V6uj43CmhHx0kQyaCvQ7v161RvOwUAgOxpjnrvp+I0EogcNt6LDd/rbEZOJn 0nScu9dKiF7IVhg2V1tFQjQ/wVW73QImgZp5E7Mw6wyZ2xsstVf/vA3qvg6BaejR8R pmAE7oQzjRptd1GRD+K/BOSWKasK3JGZ1z9MiVOmvSbp8073kkE5wsvf/A+GbN9hfD VHyBMsWrlW/8KmEpK6bpmd9GC0ZzIwL6iCH3wBNelEl/lPV1B9YrW6vtzMHU3ei7Cj nLup4ml8sEYZQ== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 08/24] SUNRPC: add rpcauth_map_clnt_to_svc_cred_local Date: Mon, 19 Aug 2024 14:17:13 -0400 Message-ID: <20240819181750.70570-9-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Weston Andros Adamson Add new function rpcauth_map_clnt_to_svc_cred_local which maps a generic cred to a svc_cred suitable for use in nfsd. This is needed by the localio code to map nfs client creds to nfs server credentials. Following from net/sunrpc/auth_unix.c:unx_marshal() it is clear that ->fsuid and ->fsgid must be used (rather than ->uid and ->gid).
In addition, these uid and gid must be translated with from_kuid_munged() so local client uses correct uid and gid when acting as local server. Suggested-by: NeilBrown # to approximate unx_marshal() Signed-off-by: Weston Andros Adamson Signed-off-by: Trond Myklebust Co-developed-by: Mike Snitzer Signed-off-by: Mike Snitzer --- include/linux/sunrpc/auth.h | 4 ++++ net/sunrpc/auth.c | 22 ++++++++++++++++++++++ 2 files changed, 26 insertions(+) diff --git a/include/linux/sunrpc/auth.h b/include/linux/sunrpc/auth.h index 61e58327b1aa..4cfb68f511db 100644 --- a/include/linux/sunrpc/auth.h +++ b/include/linux/sunrpc/auth.h @@ -11,6 +11,7 @@ #define _LINUX_SUNRPC_AUTH_H #include +#include #include #include @@ -184,6 +185,9 @@ int rpcauth_uptodatecred(struct rpc_task *); int rpcauth_init_credcache(struct rpc_auth *); void rpcauth_destroy_credcache(struct rpc_auth *); void rpcauth_clear_credcache(struct rpc_cred_cache *); +void rpcauth_map_clnt_to_svc_cred_local(struct rpc_clnt *clnt, + const struct cred *, + struct svc_cred *); char * rpcauth_stringify_acceptor(struct rpc_cred *); static inline diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c index 04534ea537c8..3b6d91b36589 100644 --- a/net/sunrpc/auth.c +++ b/net/sunrpc/auth.c @@ -17,6 +17,7 @@ #include #include #include +#include #include @@ -308,6 +309,27 @@ rpcauth_init_credcache(struct rpc_auth *auth) } EXPORT_SYMBOL_GPL(rpcauth_init_credcache); +void +rpcauth_map_clnt_to_svc_cred_local(struct rpc_clnt *clnt, + const struct cred *cred, + struct svc_cred *svc) +{ + struct user_namespace *userns = clnt->cl_cred ? 
+ clnt->cl_cred->user_ns : &init_user_ns; + + memset(svc, 0, sizeof(struct svc_cred)); + + svc->cr_uid = KUIDT_INIT(from_kuid_munged(userns, cred->fsuid)); + svc->cr_gid = KGIDT_INIT(from_kgid_munged(userns, cred->fsgid)); + svc->cr_flavor = clnt->cl_auth->au_flavor; + if (cred->group_info) + svc->cr_group_info = get_group_info(cred->group_info); + /* These aren't relevant for local (network is bypassed) */ + svc->cr_principal = NULL; + svc->cr_gss_mech = NULL; +} +EXPORT_SYMBOL_GPL(rpcauth_map_clnt_to_svc_cred_local); + char * rpcauth_stringify_acceptor(struct rpc_cred *cred) { From patchwork Mon Aug 19 18:17:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768778 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D6DE2189F33; Mon, 19 Aug 2024 18:18:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091484; cv=none; b=SMRQvqEsPNP0rWAQfpDb2SDu8Ggs/09eQzLFKxFYS8ATeZjsauL5WvtIV94l+wRpHINXSng88P+9bGlkKRNoitsVWe1C3H5MPrvy9Ayl22A/uliNkdbQ/EhKf0PC1dnHgcTI1fZLRfWljzIIxAeio/oOmd+oP0KEN+Jz7iUBpFc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091484; c=relaxed/simple; bh=PNkd3vaN135BkXm9X90tBGpk2HmP8V5YpvH4nnGEmm0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uIruOQEru1OhTT6HWHDs94XGKVd4qYqebzcla1DWHSLCZPavYaG5zJ/Amy8YVf5bW6XbrbRfy7tc+GrVsRShuG/79wTkIieHXixFX/Iei++EsEn7u86GECxXxtH+UXDhtVM5T0yjEgcnoBtUmHifprGEVi86Lwyvcv3IbrX8nE4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=TjzBduyC; 
arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="TjzBduyC" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 302C1C4AF11; Mon, 19 Aug 2024 18:18:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091484; bh=PNkd3vaN135BkXm9X90tBGpk2HmP8V5YpvH4nnGEmm0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TjzBduyCdrTRJwlqHCvhJqWCjEBYUYG/4HRZTVLnjkGxlwi/PnXwH0EZ3cPDYavAQ VmGpH062xrwzxdrg9fFgg/OzNxx2zBBg36m44vaKBaL07S+wmq6KNgNMXg1zmzDu5w AJ2V0wFQE53bwec61mXGSI8xODBcH7FySD7hdaG4m8Mokm5IoOnKFVSDUeMDY/jTRr XVE1jQrQz5/oJ7+Ie6zX3VOoQblc1F9qL2d2IVE44ZxD5FKdWwHzVGqXA5hGwfmbO3 Zw8cUSY3QT3dX5Kz1Z126lMpUs+dQfh3eWudRgl740WOP8fErYEb3nA4gExymx+Uid PMbiGVKFBxRYw== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 09/24] nfs_common: add NFS LOCALIO auxiliary protocol enablement Date: Mon, 19 Aug 2024 14:17:14 -0400 Message-ID: <20240819181750.70570-10-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 fs/nfs_common/nfslocalio.c provides interfaces that enable an NFS client to generate a nonce (single-use UUID) and associated short-lived nfs_uuid_t struct, register it with nfs_common for subsequent lookup and verification by the NFS server and if matched the NFS server populates members in the nfs_uuid_t struct. nfs_common's nfs_uuids list is the basis for localio enablement, as such it has members that point to nfsd memory for direct use by the client (e.g. 
'net' is the server's network namespace, through it the client can access nn->nfsd_serv with proper rcu read access). This commit adds all the nfs_client members required to implement the entire localio feature (which depends on the LOCALIO protocol). Signed-off-by: Mike Snitzer --- fs/nfs/client.c | 9 ++++ fs/nfs_common/Makefile | 3 ++ fs/nfs_common/nfslocalio.c | 97 ++++++++++++++++++++++++++++++++++++++ include/linux/nfs_fs_sb.h | 10 ++++ include/linux/nfslocalio.h | 37 +++++++++++++++ 5 files changed, 156 insertions(+) create mode 100644 fs/nfs_common/nfslocalio.c create mode 100644 include/linux/nfslocalio.h diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 8286edd6062d..1b65a5d7af49 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -178,6 +178,15 @@ struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_init) clp->cl_max_connect = cl_init->max_connect ? cl_init->max_connect : 1; clp->cl_net = get_net(cl_init->net); +#if IS_ENABLED(CONFIG_NFS_LOCALIO) + seqlock_init(&clp->cl_boot_lock); + ktime_get_real_ts64(&clp->cl_nfssvc_boot); + clp->cl_rpcclient_localio = ERR_PTR(-EINVAL); + clp->nfsd_open_local_fh = NULL; + clp->cl_nfssvc_net = NULL; + clp->cl_nfssvc_dom = NULL; +#endif /* CONFIG_NFS_LOCALIO */ + clp->cl_principal = "*"; clp->cl_xprtsec = cl_init->xprtsec; return clp; diff --git a/fs/nfs_common/Makefile b/fs/nfs_common/Makefile index e58b01bb8dda..a5e54809701e 100644 --- a/fs/nfs_common/Makefile +++ b/fs/nfs_common/Makefile @@ -6,6 +6,9 @@ obj-$(CONFIG_NFS_ACL_SUPPORT) += nfs_acl.o nfs_acl-objs := nfsacl.o +obj-$(CONFIG_NFS_COMMON_LOCALIO_SUPPORT) += nfs_localio.o +nfs_localio-objs := nfslocalio.o + obj-$(CONFIG_GRACE_PERIOD) += grace.o obj-$(CONFIG_NFS_V4_2_SSC_HELPER) += nfs_ssc.o diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c new file mode 100644 index 000000000000..a20ff7607707 --- /dev/null +++ b/fs/nfs_common/nfslocalio.c @@ -0,0 +1,97 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 
2024 Mike Snitzer + */ + +#include +#include +#include + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("NFS localio protocol bypass support"); + +DEFINE_MUTEX(nfs_uuid_mutex); + +/* + * Global list of nfs_uuid_t instances, add/remove + * is protected by nfs_uuid_mutex. + * Reads are protected by RCU read lock (see below). + */ +LIST_HEAD(nfs_uuids); + +void nfs_uuid_begin(nfs_uuid_t *nfs_uuid) +{ + nfs_uuid->net = NULL; + nfs_uuid->dom = NULL; + uuid_gen(&nfs_uuid->uuid); + + mutex_lock(&nfs_uuid_mutex); + list_add_tail_rcu(&nfs_uuid->list, &nfs_uuids); + mutex_unlock(&nfs_uuid_mutex); +} +EXPORT_SYMBOL_GPL(nfs_uuid_begin); + +void nfs_uuid_end(nfs_uuid_t *nfs_uuid) +{ + mutex_lock(&nfs_uuid_mutex); + list_del_rcu(&nfs_uuid->list); + mutex_unlock(&nfs_uuid_mutex); +} +EXPORT_SYMBOL_GPL(nfs_uuid_end); + +/* Must be called with RCU read lock held. */ +static nfs_uuid_t * nfs_uuid_lookup(const uuid_t *uuid) +{ + nfs_uuid_t *nfs_uuid; + + list_for_each_entry_rcu(nfs_uuid, &nfs_uuids, list) + if (uuid_equal(&nfs_uuid->uuid, uuid)) + return nfs_uuid; + + return NULL; +} + +bool nfs_uuid_is_local(const uuid_t *uuid, struct net *net, struct auth_domain *dom) +{ + bool is_local = false; + nfs_uuid_t *nfs_uuid; + + rcu_read_lock(); + nfs_uuid = nfs_uuid_lookup(uuid); + if (nfs_uuid) { + is_local = true; + nfs_uuid->net = net; + kref_get(&dom->ref); + nfs_uuid->dom = dom; + } + rcu_read_unlock(); + + return is_local; +} +EXPORT_SYMBOL_GPL(nfs_uuid_is_local); + +/* + * The nfs localio code needs to call into nfsd to do the filehandle -> struct path + * mapping, but cannot be statically linked, because that will make the nfs module + * depend on the nfsd module. + * + * Instead, do dynamic linking to the nfsd module (via nfs_common module). The + * nfs_common module will only hold a reference on nfsd when localio is in use. + * This allows some sanity checking, like giving up on localio if nfsd isn't loaded. 
+ */ + +extern int nfsd_open_local_fh(struct net *, struct auth_domain *, struct rpc_clnt *, + const struct cred *, const struct nfs_fh *, + const fmode_t, struct file **); + +nfs_to_nfsd_open_t get_nfsd_open_local_fh(void) +{ + return symbol_request(nfsd_open_local_fh); +} +EXPORT_SYMBOL_GPL(get_nfsd_open_local_fh); + +void put_nfsd_open_local_fh(void) +{ + symbol_put(nfsd_open_local_fh); +} +EXPORT_SYMBOL_GPL(put_nfsd_open_local_fh); diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index 1df86ab98c77..3849cc2832f0 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -125,6 +126,15 @@ struct nfs_client { struct net *cl_net; struct list_head pending_cb_stateids; struct rcu_head rcu; + +#if IS_ENABLED(CONFIG_NFS_LOCALIO) + struct timespec64 cl_nfssvc_boot; + seqlock_t cl_boot_lock; + struct rpc_clnt * cl_rpcclient_localio; + struct net * cl_nfssvc_net; + struct auth_domain * cl_nfssvc_dom; + nfs_to_nfsd_open_t nfsd_open_local_fh; +#endif /* CONFIG_NFS_LOCALIO */ }; /* diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h new file mode 100644 index 000000000000..109cb8534e3f --- /dev/null +++ b/include/linux/nfslocalio.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Mike Snitzer + */ +#ifndef __LINUX_NFSLOCALIO_H +#define __LINUX_NFSLOCALIO_H + +#include +#include +#include +#include +#include +#include + +/* + * Useful to allow a client to negotiate if localio + * possible with its server. 
+ */ +typedef struct { + uuid_t uuid; + struct list_head list; + struct net *net; /* nfsd's network namespace */ + struct auth_domain *dom; /* auth_domain for localio */ +} nfs_uuid_t; + +void nfs_uuid_begin(nfs_uuid_t *); +void nfs_uuid_end(nfs_uuid_t *); +bool nfs_uuid_is_local(const uuid_t *, struct net *, struct auth_domain *); + +typedef int (*nfs_to_nfsd_open_t)(struct net *, struct auth_domain *, struct rpc_clnt *, + const struct cred *, const struct nfs_fh *, + const fmode_t, struct file **); + +nfs_to_nfsd_open_t get_nfsd_open_local_fh(void); +void put_nfsd_open_local_fh(void); + +#endif /* __LINUX_NFSLOCALIO_H */ From patchwork Mon Aug 19 18:17:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768779 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56C1F189F33; Mon, 19 Aug 2024 18:18:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091486; cv=none; b=jRI8ZBpI/JGhNZYlskiQniMDHpHaztRGFBS1eLvPjfP2sCzYdYZ801gd67FGrldBTmtEh6ERfXNFRsObxfjDUipHqBMkGWjhQyYom1DbMllaNfeVfAUsXCeCsg6lmcoMnBsZqQTCoDr/aSXTOMlshFCRXya6UN1FNIA7y+jB5YI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091486; c=relaxed/simple; bh=5WD74DXR7BeYq0WFWnmeaPOEpv9MT5wb/IV+WpkpIiU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bxZ7SQmWszPzETTkzelK68+qsJ9pNfDKZljR/g6FfWwF+iSb+8TJRxgvGJwBSAxtrcncNMhmU7j3ZJzypYa5WTTQpLf6DZdnUws186Lm3SpSUfjgGf3QswAle+5Ox9GB6U5rLL2a8cF2KK1/n5UFytmBwwZwgHvgl762cPLV4no= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org 
header.i=@kernel.org header.b=hBzxDkiB; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hBzxDkiB" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9644AC32782; Mon, 19 Aug 2024 18:18:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091485; bh=5WD74DXR7BeYq0WFWnmeaPOEpv9MT5wb/IV+WpkpIiU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hBzxDkiBJQzquRShQ26Rmth3VHr8OEvsO0biEYDNPAn1a5ZZFWchacgBtT0OXG/8l j5+rN9erQqnmfcznz095w/MRpkHVEVd1G5YvEvm1Lzp7ZT8f38h6G0UOuMR3qX74Xq 070IpGkXaMgSFOkxwXd6aHBT6m2AzoLbnP0PFnlm244HCBQkfQeLTvQUzuGwlT04E3 Z6wnPnnGFWJ9GygcWLgQqGf/Ip39uLrhVrzTAo/2UWxgzoKvKk8l144++fZdcyiH+p O+2XvL1GgM5j6qfaL0WiYQkUxokYJzYsXGQJuXPQ8nRfV8ZnWEnAmmzRjWHtNN+krr FOK3gT8ABs1kw== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 10/24] nfsd: add localio support Date: Mon, 19 Aug 2024 14:17:15 -0400 Message-ID: <20240819181750.70570-11-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Weston Andros Adamson Add server support for bypassing NFS for localhost reads, writes, and commits. This is only useful when both the client and server are running on the same host. If nfsd_open_local_fh() fails then the NFS client will both retry and fall back to normal network-based read, write and commit operations if localio is no longer supported. Care is taken to ensure the same NFS security mechanisms are used (authentication, etc) regardless of whether localio or regular NFS access is used.
The auth_domain established as part of the traditional NFS client access to the NFS server is also used for localio. Store auth_domain for localio in nfsd_uuid_t and transfer it to the client if it is local to the server. Relative to containers, localio gives the client access to the network namespace the server has. This is required to allow the client to access the server's per-namespace nfsd_net struct. CONFIG_NFSD_LOCALIO controls the server enablement for localio. A later commit will add CONFIG_NFS_LOCALIO to allow the client enablement. This commit also introduces the use of a percpu_ref to interlock nfsd_destroy_serv and nfsd_open_local_fh, and warrants a more detailed explanation: Introduce nfsd_serv_try_get and nfsd_serv_put and update the nfsd code to prevent nfsd_destroy_serv from destroying nn->nfsd_serv until any client-initiated localio calls to nfsd (that are _not_ in the context of nfsd) are complete. nfsd_open_local_fh is updated to nfsd_serv_try_get before opening its file handle and then drop the reference using nfsd_serv_put at the end of nfsd_open_local_fh. This "interlock" relies heavily on nfsd_open_local_fh()'s maybe_get_net() safely dealing with the possibility that the struct net (and nfsd_net by association) may have been destroyed by nfsd_destroy_serv() via nfsd_shutdown_net(). Verified to fix an easy-to-hit crash that would occur if an nfsd instance running in a container, with a localio client mounted, is shut down. Upon restart of the container and associated nfsd the client would go on to crash due to a NULL pointer dereference that occurred due to the nfs client's localio attempting to nfsd_open_local_fh(), using nn->nfsd_serv, without having a proper reference on nn->nfsd_serv.
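For readers unfamiliar with this style of interlock, the try-get/put/kill pattern described above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: it uses plain C11 atomics rather than the kernel's percpu_ref, and struct serv_ref and the serv_ref_* helpers are hypothetical names invented for illustration, not anything defined by this series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for nn->nfsd_serv_ref. */
struct serv_ref {
	atomic_long count;    /* starts at 1: the reference owned by the server */
	atomic_bool dying;    /* set once teardown has started */
	atomic_bool released; /* set when the last reference has been dropped */
};

static void serv_ref_init(struct serv_ref *r)
{
	atomic_init(&r->count, 1);
	atomic_init(&r->dying, false);
	atomic_init(&r->released, false);
}

/* Models nfsd_serv_try_get(): refuse new users once teardown has begun. */
static bool serv_ref_try_get(struct serv_ref *r)
{
	long old;

	if (atomic_load(&r->dying))
		return false;

	/* Only take a reference while at least one is still held. */
	old = atomic_load(&r->count);
	while (old > 0) {
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Models nfsd_serv_put(): the final put marks the object as released. */
static void serv_ref_put(struct serv_ref *r)
{
	long old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);
	if (old == 1)
		atomic_store(&r->released, true);
}

/*
 * Models the kill step in nfsd_destroy_serv(): block new users, then
 * drop the server's own reference. A real caller would wait for
 * "released" before tearing anything down.
 */
static void serv_ref_kill(struct serv_ref *r)
{
	atomic_store(&r->dying, true);
	serv_ref_put(r);
}
```

In this model a localio caller that obtained a reference before the kill keeps the object alive until its own put, while any caller arriving after the kill is turned away, which corresponds to the -ENXIO path in nfsd_open_local_fh().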
Signed-off-by: Weston Andros Adamson Signed-off-by: Trond Myklebust Co-developed-by: Mike Snitzer Signed-off-by: Mike Snitzer --- fs/Kconfig | 3 ++ fs/nfsd/Kconfig | 14 ++++++ fs/nfsd/Makefile | 1 + fs/nfsd/filecache.c | 2 +- fs/nfsd/localio.c | 111 ++++++++++++++++++++++++++++++++++++++++++++ fs/nfsd/netns.h | 8 +++- fs/nfsd/nfssvc.c | 39 ++++++++++++++++ fs/nfsd/trace.h | 3 +- fs/nfsd/vfs.h | 10 ++++ 9 files changed, 188 insertions(+), 3 deletions(-) create mode 100644 fs/nfsd/localio.c diff --git a/fs/Kconfig b/fs/Kconfig index a46b0cbc4d8f..1b8a5edbddff 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -377,6 +377,9 @@ config NFS_ACL_SUPPORT tristate select FS_POSIX_ACL +config NFS_COMMON_LOCALIO_SUPPORT + bool + config NFS_COMMON bool depends on NFSD || NFS_FS || LOCKD diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig index c0bd1509ccd4..1fca57c79c60 100644 --- a/fs/nfsd/Kconfig +++ b/fs/nfsd/Kconfig @@ -90,6 +90,20 @@ config NFSD_V4 If unsure, say N. +config NFSD_LOCALIO + bool "NFS server support for the LOCALIO auxiliary protocol" + depends on NFSD + select NFS_COMMON_LOCALIO_SUPPORT + help + Some NFS servers support an auxiliary NFS LOCALIO protocol + that is not an official part of the NFS protocol. + + This option enables support for the LOCALIO protocol in the + kernel's NFS server. Enable this to bypass using the NFS + protocol when issuing reads, writes and commits to the server. + + If unsure, say N. 
+ config NFSD_PNFS bool diff --git a/fs/nfsd/Makefile b/fs/nfsd/Makefile index b8736a82e57c..78b421778a79 100644 --- a/fs/nfsd/Makefile +++ b/fs/nfsd/Makefile @@ -23,3 +23,4 @@ nfsd-$(CONFIG_NFSD_PNFS) += nfs4layouts.o nfsd-$(CONFIG_NFSD_BLOCKLAYOUT) += blocklayout.o blocklayoutxdr.o nfsd-$(CONFIG_NFSD_SCSILAYOUT) += blocklayout.o blocklayoutxdr.o nfsd-$(CONFIG_NFSD_FLEXFILELAYOUT) += flexfilelayout.o flexfilelayoutxdr.o +nfsd-$(CONFIG_NFSD_LOCALIO) += localio.o diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 2cc838bbeb89..56be99a3667a 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -52,7 +52,7 @@ #define NFSD_FILE_CACHE_UP (0) /* We only care about NFSD_MAY_READ/WRITE for this cache */ -#define NFSD_FILE_MAY_MASK (NFSD_MAY_READ|NFSD_MAY_WRITE) +#define NFSD_FILE_MAY_MASK (NFSD_MAY_READ|NFSD_MAY_WRITE|NFSD_MAY_LOCALIO) static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits); static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions); diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c new file mode 100644 index 000000000000..ed528524b368 --- /dev/null +++ b/fs/nfsd/localio.c @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * NFS server support for local clients to bypass network stack + * + * Copyright (C) 2014 Weston Andros Adamson + * Copyright (C) 2019 Trond Myklebust + * Copyright (C) 2024 Mike Snitzer + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "nfsd.h" +#include "vfs.h" +#include "netns.h" +#include "filecache.h" + +/** + * nfsd_open_local_fh - lookup a local filehandle @nfs_fh and map to @file + * + * @cl_nfssvc_net: the 'struct net' to use to get the proper nfsd_net + * @cl_nfssvc_dom: the 'struct auth_domain' required for localio access + * @rpc_clnt: rpc_clnt that the client established, used for sockaddr and cred + * @cred: cred that the client established + * @nfs_fh: filehandle to lookup + * @fmode: fmode_t to use for open + * @pfilp: returned file pointer that 
maps to @nfs_fh + * + * This function maps a local fh to a path on a local filesystem. + * This is useful when the nfs client has the local server mounted - it can + * avoid all the NFS overhead with reads, writes and commits. + * + * On successful return, caller is responsible for calling path_put. Also + * note that this is called from nfs.ko via find_symbol() to avoid an explicit + * dependency on knfsd. So, there is no forward declaration in a header file + * for it that is shared with the client. + */ +int nfsd_open_local_fh(struct net *cl_nfssvc_net, + struct auth_domain *cl_nfssvc_dom, + struct rpc_clnt *rpc_clnt, + const struct cred *cred, + const struct nfs_fh *nfs_fh, + const fmode_t fmode, + struct file **pfilp) +{ + int mayflags = NFSD_MAY_LOCALIO; + int status = 0; + struct nfsd_net *nn; + const struct cred *save_cred; + struct svc_cred rq_cred; + struct svc_fh fh; + struct nfsd_file *nf; + __be32 beres; + + if (nfs_fh->size > NFS4_FHSIZE) + return -EINVAL; + + /* Not running in nfsd context, must safely get reference on nfsd_serv */ + cl_nfssvc_net = maybe_get_net(cl_nfssvc_net); + if (!cl_nfssvc_net) + return -ENXIO; + nn = net_generic(cl_nfssvc_net, nfsd_net_id); + + /* The server may already be shutting down, disallow new localio */ + if (unlikely(!nfsd_serv_try_get(nn))) { + status = -ENXIO; + goto out_net; + } + + /* Save client creds before calling nfsd_file_acquire_local which calls nfsd_setuser */ + save_cred = get_current_cred(); + + /* nfs_fh -> svc_fh */ + fh_init(&fh, NFS4_FHSIZE); + fh.fh_handle.fh_size = nfs_fh->size; + memcpy(fh.fh_handle.fh_raw, nfs_fh->data, nfs_fh->size); + + if (fmode & FMODE_READ) + mayflags |= NFSD_MAY_READ; + if (fmode & FMODE_WRITE) + mayflags |= NFSD_MAY_WRITE; + + rpcauth_map_clnt_to_svc_cred_local(rpc_clnt, cred, &rq_cred); + + beres = nfsd_file_acquire_local(cl_nfssvc_net, &rq_cred, rpc_clnt->cl_vers, + cl_nfssvc_dom, &fh, mayflags, &nf); + if (beres) { + status = nfs_stat_to_errno(be32_to_cpu(beres)); + 
goto out_fh_put; + } + *pfilp = get_file(nf->nf_file); + nfsd_file_put(nf); +out_fh_put: + fh_put(&fh); + if (rq_cred.cr_group_info) + put_group_info(rq_cred.cr_group_info); + revert_creds(save_cred); + nfsd_serv_put(nn); +out_net: + put_net(cl_nfssvc_net); + return status; +} +EXPORT_SYMBOL_GPL(nfsd_open_local_fh); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_open_t __maybe_unused nfsd_open_local_fh_typecheck = nfsd_open_local_fh; diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index 238fc4e56e53..e2d953f21dde 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -13,6 +13,7 @@ #include #include #include +#include #include #include @@ -139,7 +140,9 @@ struct nfsd_net { struct svc_info nfsd_info; #define nfsd_serv nfsd_info.serv - + struct percpu_ref nfsd_serv_ref; + struct completion nfsd_serv_confirm_done; + struct completion nfsd_serv_free_done; /* * clientid and stateid data for construction of net unique COPY @@ -221,6 +224,9 @@ struct nfsd_net { extern bool nfsd_support_version(int vers); extern unsigned int nfsd_net_id; +bool nfsd_serv_try_get(struct nfsd_net *nn); +void nfsd_serv_put(struct nfsd_net *nn); + void nfsd_copy_write_verifier(__be32 verf[2], struct nfsd_net *nn); void nfsd_reset_write_verifier(struct nfsd_net *nn); #endif /* __NFSD_NETNS_H__ */ diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index defc430f912f..e43d440f9f0a 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -193,6 +193,30 @@ int nfsd_minorversion(struct nfsd_net *nn, u32 minorversion, enum vers_op change return 0; } +bool nfsd_serv_try_get(struct nfsd_net *nn) +{ + return percpu_ref_tryget_live(&nn->nfsd_serv_ref); +} + +void nfsd_serv_put(struct nfsd_net *nn) +{ + percpu_ref_put(&nn->nfsd_serv_ref); +} + +static void nfsd_serv_done(struct percpu_ref *ref) +{ + struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_serv_ref); + + complete(&nn->nfsd_serv_confirm_done); +} + +static void nfsd_serv_free(struct percpu_ref *ref) +{ + struct 
nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_serv_ref); + + complete(&nn->nfsd_serv_free_done); +} + /* * Maximum number of nfsd processes */ @@ -392,6 +416,7 @@ static void nfsd_shutdown_net(struct net *net) lockd_down(net); nn->lockd_up = false; } + percpu_ref_exit(&nn->nfsd_serv_ref); nn->nfsd_net_up = false; nfsd_shutdown_generic(); } @@ -471,6 +496,13 @@ void nfsd_destroy_serv(struct net *net) struct nfsd_net *nn = net_generic(net, nfsd_net_id); struct svc_serv *serv = nn->nfsd_serv; + lockdep_assert_held(&nfsd_mutex); + + percpu_ref_kill_and_confirm(&nn->nfsd_serv_ref, nfsd_serv_done); + wait_for_completion(&nn->nfsd_serv_confirm_done); + wait_for_completion(&nn->nfsd_serv_free_done); + /* percpu_ref_exit is called in nfsd_shutdown_net */ + spin_lock(&nfsd_notifier_lock); nn->nfsd_serv = NULL; spin_unlock(&nfsd_notifier_lock); @@ -595,6 +627,13 @@ int nfsd_create_serv(struct net *net) if (nn->nfsd_serv) return 0; + error = percpu_ref_init(&nn->nfsd_serv_ref, nfsd_serv_free, + 0, GFP_KERNEL); + if (error) + return error; + init_completion(&nn->nfsd_serv_free_done); + init_completion(&nn->nfsd_serv_confirm_done); + if (nfsd_max_blksize == 0) nfsd_max_blksize = nfsd_get_default_max_blksize(); nfsd_reset_versions(nn); diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h index d49b3c1e3ba9..20e170f19b0b 100644 --- a/fs/nfsd/trace.h +++ b/fs/nfsd/trace.h @@ -86,7 +86,8 @@ DEFINE_NFSD_XDR_ERR_EVENT(cant_encode); { NFSD_MAY_NOT_BREAK_LEASE, "NOT_BREAK_LEASE" }, \ { NFSD_MAY_BYPASS_GSS, "BYPASS_GSS" }, \ { NFSD_MAY_READ_IF_EXEC, "READ_IF_EXEC" }, \ - { NFSD_MAY_64BIT_COOKIE, "64BIT_COOKIE" }) + { NFSD_MAY_64BIT_COOKIE, "64BIT_COOKIE" }, \ + { NFSD_MAY_LOCALIO, "LOCALIO" }) TRACE_EVENT(nfsd_compound, TP_PROTO( diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index 01947561d375..9720951c49a0 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -33,6 +33,8 @@ #define NFSD_MAY_64BIT_COOKIE 0x1000 /* 64 bit readdir cookies for >= NFSv3 */ +#define NFSD_MAY_LOCALIO 0x2000 /* for 
tracing, reflects when localio used */ + #define NFSD_MAY_CREATE (NFSD_MAY_EXEC|NFSD_MAY_WRITE) #define NFSD_MAY_REMOVE (NFSD_MAY_EXEC|NFSD_MAY_WRITE|NFSD_MAY_TRUNC) @@ -158,6 +160,14 @@ __be32 nfsd_permission(struct svc_cred *cred, struct svc_export *exp, void nfsd_filp_close(struct file *fp); +int nfsd_open_local_fh(struct net *net, + struct auth_domain *dom, + struct rpc_clnt *rpc_clnt, + const struct cred *cred, + const struct nfs_fh *nfs_fh, + const fmode_t fmode, + struct file **pfilp); + static inline int fh_want_write(struct svc_fh *fh) { int ret; From patchwork Mon Aug 19 18:17:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768780 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4107B189F5E; Mon, 19 Aug 2024 18:18:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091487; cv=none; b=cv4AbP+VxytuQUrzNa7paprKwcUqsN7Xnim1fOv+8O/1lYP2TUupiNKwlkravtv4x7zG2kT4TLYD5VtAV7Sc1zmEtngiu7exq4zQ6fo6/WwljT6tDGywvtKfLIWZYG3COzz3r+jEGH+mnTjFC1DrrQxP4h3lJvB6ZjboIHQA2OA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091487; c=relaxed/simple; bh=E44GAle6IAMk68oMehZ2d6ur13Psl7yi3BUaL4zl+70=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FaDSgLGwFONOIms30DJDyD3OW3iB5461VEvnx3PYw2QhPRUC2AteU3vqG6dcsUE+qxcv6rFVPKI74+/aNdE8DIUPnT3Dpx+xakcbsKnHdboyo2WnBUIdvoU4w1DVtFTCMggPuxl/S5XbQEaToyyDjBxcHDp+BHHQckb8Iw47Xzs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=k6Or3Ose; arc=none 
smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="k6Or3Ose" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DF05CC4AF0F; Mon, 19 Aug 2024 18:18:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091487; bh=E44GAle6IAMk68oMehZ2d6ur13Psl7yi3BUaL4zl+70=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=k6Or3Osec4qnFvixekTrbr+sDaIkDCOHLTQqrJhHfnU7R6PtzO5WRTwdRbPOBzFaO OkZivbuJcZPk3z3dZpWcca67b8JOrw9rQsGqTN3uUQuRERnKTr6v2X7SG81bTLHKPC uIkYT8EdVrXM5xq380TrttuI4i3TfPkpMd44jJ6abmQ8QCRnlC1Y+DPFDL58dKsUoL 1QqxGG3mBFzsEzcf/UawP30FjrSTasNG6I3KdhCtttzCkwXdoQ85fEHazVmsVpmQFe 4IRayDD58RgQ9Ty0M2hiI5YSoXaHN0G6bT4cFs2Mg2KhyAfsimkoJn8VMJ1xIlR5s4 ZnyayYbItbROw== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 11/24] nfsd: implement server support for NFS_LOCALIO_PROGRAM Date: Mon, 19 Aug 2024 14:17:16 -0400 Message-ID: <20240819181750.70570-12-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL" RPC method that allows the Linux NFS client to verify the local Linux NFS server can see the nonce (single-use UUID) the client generated and made available in nfs_common. This protocol isn't part of an IETF standard, nor does it need to be considering it is a Linux-to-Linux auxiliary RPC protocol that amounts to an implementation detail. The UUID_IS_LOCAL method encodes the client-generated uuid_t in terms of the fixed UUID_SIZE (16 bytes).
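To illustrate what encoding a UUID as a fixed-size XDR opaque amounts to on the wire, here is a small userspace sketch. It is not the kernel's XDR code: SKETCH_UUID_SIZE, xdr_quadlen() and the two *_sketch helpers are hypothetical names used only for this example, and the layout assumed is the usual XDR one (data padded to a 4-byte multiple, with a 4-byte big-endian length prefix only for variable-length opaques).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_UUID_SIZE 16 /* mirrors the 16-byte UUID_SIZE named above */

/* Number of 4-byte XDR words needed to hold len bytes (cf. XDR_QUADLEN). */
static size_t xdr_quadlen(size_t len)
{
	return (len + 3) / 4;
}

/*
 * Fixed-length opaque: raw bytes, zero-padded to a 4-byte multiple,
 * with no length prefix. Returns bytes written, or 0 if buf is too small.
 */
static size_t encode_opaque_fixed_sketch(uint8_t *buf, size_t buflen,
					 const uint8_t *data, size_t len)
{
	size_t padded = xdr_quadlen(len) * 4;

	if (padded > buflen)
		return 0;
	memcpy(buf, data, len);
	memset(buf + len, 0, padded - len);
	return padded;
}

/*
 * Variable-length opaque: 4-byte big-endian length, then the padded data.
 * Returns bytes written, or 0 if buf is too small.
 */
static size_t encode_opaque_var_sketch(uint8_t *buf, size_t buflen,
				       const uint8_t *data, size_t len)
{
	size_t padded = xdr_quadlen(len) * 4;
	uint32_t n = (uint32_t)len;

	if (buflen < 4 || padded > buflen - 4)
		return 0;
	buf[0] = (uint8_t)(n >> 24);
	buf[1] = (uint8_t)(n >> 16);
	buf[2] = (uint8_t)(n >> 8);
	buf[3] = (uint8_t)n;
	memcpy(buf + 4, data, len);
	memset(buf + 4 + len, 0, padded - len);
	return 4 + padded;
}
```

Under these assumptions a 16-byte UUID occupies exactly four XDR words in the fixed form and needs no padding, whereas the variable form spends one extra word on a length that both sides already know.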
The fixed size opaque encode and decode XDR methods are used instead of the less efficient variable sized methods. The RPC program number for the NFS_LOCALIO_PROGRAM is 400122 (as assigned by IANA, see https://www.iana.org/assignments/rpc-program-numbers/ ): Linux Kernel Organization 400122 nfslocalio Signed-off-by: Mike Snitzer [neilb: factored out and simplified single localio protocol] Co-developed-by: NeilBrown Signed-off-by: NeilBrown --- fs/nfsd/localio.c | 75 +++++++++++++++++++++++++++++++++++++++++++++ fs/nfsd/nfsd.h | 4 +++ fs/nfsd/nfssvc.c | 28 ++++++++++++++++- include/linux/nfs.h | 7 +++++ 4 files changed, 113 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c index ed528524b368..9cdea1d1c28a 100644 --- a/fs/nfsd/localio.c +++ b/fs/nfsd/localio.c @@ -13,12 +13,15 @@ #include #include #include +#include +#include #include #include "nfsd.h" #include "vfs.h" #include "netns.h" #include "filecache.h" +#include "cache.h" /** * nfsd_open_local_fh - lookup a local filehandle @nfs_fh and map to @file @@ -109,3 +112,75 @@ EXPORT_SYMBOL_GPL(nfsd_open_local_fh); /* Compile time type checking, not used by anything */ static nfs_to_nfsd_open_t __maybe_unused nfsd_open_local_fh_typecheck = nfsd_open_local_fh; + +/* + * UUID_IS_LOCAL XDR functions + */ + +static __be32 localio_proc_null(struct svc_rqst *rqstp) +{ + return rpc_success; +} + +struct localio_uuidarg { + uuid_t uuid; +}; + +static __be32 localio_proc_uuid_is_local(struct svc_rqst *rqstp) +{ + struct localio_uuidarg *argp = rqstp->rq_argp; + + (void) nfs_uuid_is_local(&argp->uuid, SVC_NET(rqstp), + rqstp->rq_client); + + return rpc_success; +} + +static bool localio_decode_uuidarg(struct svc_rqst *rqstp, + struct xdr_stream *xdr) +{ + struct localio_uuidarg *argp = rqstp->rq_argp; + u8 uuid[UUID_SIZE]; + + if (decode_opaque_fixed(xdr, uuid, UUID_SIZE)) + return false; + import_uuid(&argp->uuid, uuid); + + return true; +} + +static const struct svc_procedure 
localio_procedures1[] = { + [LOCALIOPROC_NULL] = { + .pc_func = localio_proc_null, + .pc_decode = nfssvc_decode_voidarg, + .pc_encode = nfssvc_encode_voidres, + .pc_argsize = sizeof(struct nfsd_voidargs), + .pc_ressize = sizeof(struct nfsd_voidres), + .pc_cachetype = RC_NOCACHE, + .pc_xdrressize = 0, + .pc_name = "NULL", + }, + [LOCALIOPROC_UUID_IS_LOCAL] = { + .pc_func = localio_proc_uuid_is_local, + .pc_decode = localio_decode_uuidarg, + .pc_encode = nfssvc_encode_voidres, + .pc_argsize = sizeof(struct localio_uuidarg), + .pc_argzero = sizeof(struct localio_uuidarg), + .pc_ressize = sizeof(struct nfsd_voidres), + .pc_cachetype = RC_NOCACHE, + .pc_name = "UUID_IS_LOCAL", + }, +}; + +#define LOCALIO_NR_PROCEDURES ARRAY_SIZE(localio_procedures1) +static DEFINE_PER_CPU_ALIGNED(unsigned long, + localio_count[LOCALIO_NR_PROCEDURES]); +const struct svc_version localio_version1 = { + .vs_vers = 1, + .vs_nproc = LOCALIO_NR_PROCEDURES, + .vs_proc = localio_procedures1, + .vs_dispatch = nfsd_dispatch, + .vs_count = localio_count, + .vs_xdrsize = XDR_QUADLEN(UUID_SIZE), + .vs_hidden = true, +}; diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h index 4ccbf014a2c7..f87a359d968f 100644 --- a/fs/nfsd/nfsd.h +++ b/fs/nfsd/nfsd.h @@ -146,6 +146,10 @@ extern const struct svc_version nfsd_acl_version3; #endif #endif +#if IS_ENABLED(CONFIG_NFSD_LOCALIO) +extern const struct svc_version localio_version1; +#endif + struct nfsd_net; enum vers_op {NFSD_SET, NFSD_CLEAR, NFSD_TEST, NFSD_AVAIL }; diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index e43d440f9f0a..1bec3a53e35f 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -80,6 +80,25 @@ DEFINE_SPINLOCK(nfsd_drc_lock); unsigned long nfsd_drc_max_mem; unsigned long nfsd_drc_mem_used; +#if IS_ENABLED(CONFIG_NFSD_LOCALIO) +static const struct svc_version *localio_versions[] = { + [1] = &localio_version1, +}; + +#define NFSD_LOCALIO_NRVERS ARRAY_SIZE(localio_versions) + +static struct svc_program nfsd_localio_program = { + .pg_prog = 
NFS_LOCALIO_PROGRAM, + .pg_nvers = NFSD_LOCALIO_NRVERS, + .pg_vers = localio_versions, + .pg_name = "nfslocalio", + .pg_class = "nfsd", + .pg_authenticate = &svc_set_client, + .pg_init_request = svc_generic_init_request, + .pg_rpcbind_set = svc_generic_rpcbind_set, +}; +#endif /* CONFIG_NFSD_LOCALIO */ + #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) static const struct svc_version *nfsd_acl_version[] = { # if defined(CONFIG_NFSD_V2_ACL) @@ -94,6 +113,9 @@ static const struct svc_version *nfsd_acl_version[] = { #define NFSD_ACL_NRVERS ARRAY_SIZE(nfsd_acl_version) static struct svc_program nfsd_acl_program = { +#if IS_ENABLED(CONFIG_NFSD_LOCALIO) + .pg_next = &nfsd_localio_program, +#endif /* CONFIG_NFSD_LOCALIO */ .pg_prog = NFS_ACL_PROGRAM, .pg_nvers = NFSD_ACL_NRVERS, .pg_vers = nfsd_acl_version, @@ -119,6 +141,10 @@ static const struct svc_version *nfsd_version[NFSD_MAXVERS+1] = { struct svc_program nfsd_program = { #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) .pg_next = &nfsd_acl_program, +#else +#if IS_ENABLED(CONFIG_NFSD_LOCALIO) + .pg_next = &nfsd_localio_program, +#endif /* CONFIG_NFSD_LOCALIO */ #endif .pg_prog = NFS_PROGRAM, /* program number */ .pg_nvers = NFSD_MAXVERS+1, /* nr of entries in nfsd_version */ @@ -944,7 +970,7 @@ nfsd(void *vrqstp) } /** - * nfsd_dispatch - Process an NFS or NFSACL Request + * nfsd_dispatch - Process an NFS or NFSACL or LOCALIO Request * @rqstp: incoming request * * This RPC dispatcher integrates the NFS server's duplicate reply cache. diff --git a/include/linux/nfs.h b/include/linux/nfs.h index ceb70a926b95..5ff1a5b3b00c 100644 --- a/include/linux/nfs.h +++ b/include/linux/nfs.h @@ -13,6 +13,13 @@ #include #include +/* The localio program is entirely private to Linux and is + * NOT part of the uapi. 
+ */ +#define NFS_LOCALIO_PROGRAM 400122 +#define LOCALIOPROC_NULL 0 +#define LOCALIOPROC_UUID_IS_LOCAL 1 + /* * This is the kernel NFS client file handle representation */ From patchwork Mon Aug 19 18:17:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768781 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 986B018A6B1; Mon, 19 Aug 2024 18:18:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091488; cv=none; b=FM4eIriLrEof13O2cnKXQ7CzB+GpHGiPwxf2K1tY4YQz6s3a171GR9NcQ2YaiWImcbtDgeen9fuUfjcweKsr2lkU5pYe+8SJDuJRsx80dhyB/RnbQIoTyluI1BWn91v0BsigLG9u9ZvH0lxqSGCk/i5sOc/dJWAVllob1nZvC00= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091488; c=relaxed/simple; bh=fPNnu7rXiFKKBDYztnr9NYqJxRzJ8ba5+z98D9s97sA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sarDuejDYqTKI7oYZ4ZIieXJ4MUcBHiwFEMxy3wqE/KqAHHfM4em/gqecIISSnkwIFphvsurw+PlsnYC7MA+RErs37PUjoVEJeqqnwvxuLPy+9+UAE0Fd0IBoKevHUkj3JEbmv/2eYYbLHHMoyLus8cPU9POozdGoC9M5rUsNpo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=O/KxWCMV; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="O/KxWCMV" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 42227C4AF0F; Mon, 19 Aug 2024 18:18:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091488; 
bh=fPNnu7rXiFKKBDYztnr9NYqJxRzJ8ba5+z98D9s97sA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=O/KxWCMVnbYyXzZLNBXtyHxz0UpsL3zjirSX3krSxtf0LS3jAGEnm3ejG2TZG9kfk Z2/8sow+UcvtKZqY8IKALEo/CAclxAISHxzorIhx9kpCBTNdasQBEZV74XTdp0r/lL oo7H5xrGIJc7PAuzs0TOwDxvAO9O1Csx8ABPSJ/e8XTa1zu0hyJaedqCF99NzMYjbg G5g/ihssw4XZziLrh2BHu+7b9MBUtQLyG20Gyrk4sW4tmhqnlUFljhqEbZp9ddOgQu kupHSdu69QmcpaqXdnIdcZFywS9bDuPVId3QEphaYXuuydLji4mrPys61vMnxM0heo ytJWRy2laPJRA== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 12/24] SUNRPC: replace program list with program array Date: Mon, 19 Aug 2024 14:17:17 -0400 Message-ID: <20240819181750.70570-13-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: NeilBrown A service created with svc_create_pooled() can be given a linked list of programs and all of these will be served. Using a linked list makes it cumbersome when there are several programs that can be optionally selected with CONFIG settings. After this patch is applied, API consumers must use only svc_create_pooled() when creating an RPC service that listens for more than one RPC program. 
Signed-off-by: NeilBrown Signed-off-by: Mike Snitzer --- fs/nfsd/nfsctl.c | 2 +- fs/nfsd/nfsd.h | 2 +- fs/nfsd/nfssvc.c | 67 +++++++++++++++++-------------------- include/linux/sunrpc/svc.h | 7 ++-- net/sunrpc/svc.c | 68 ++++++++++++++++++++++---------------- net/sunrpc/svc_xprt.c | 2 +- net/sunrpc/svcauth_unix.c | 3 +- 7 files changed, 79 insertions(+), 72 deletions(-) diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 1c9e5b4bcb0a..64c1b4d649bc 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -2246,7 +2246,7 @@ static __net_init int nfsd_net_init(struct net *net) if (retval) goto out_repcache_error; memset(&nn->nfsd_svcstats, 0, sizeof(nn->nfsd_svcstats)); - nn->nfsd_svcstats.program = &nfsd_program; + nn->nfsd_svcstats.program = &nfsd_programs[0]; for (i = 0; i < sizeof(nn->nfsd_versions); i++) nn->nfsd_versions[i] = nfsd_support_version(i); for (i = 0; i < sizeof(nn->nfsd4_minorversions); i++) diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h index f87a359d968f..232a873dc53a 100644 --- a/fs/nfsd/nfsd.h +++ b/fs/nfsd/nfsd.h @@ -85,7 +85,7 @@ struct nfsd_genl_rqstp { u32 rq_opnum[NFSD_MAX_OPS_PER_COMPOUND]; }; -extern struct svc_program nfsd_program; +extern struct svc_program nfsd_programs[]; extern const struct svc_version nfsd_version2, nfsd_version3, nfsd_version4; extern struct mutex nfsd_mutex; extern spinlock_t nfsd_drc_lock; diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index 1bec3a53e35f..5f8680ab1013 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -35,7 +35,6 @@ #define NFSDDBG_FACILITY NFSDDBG_SVC atomic_t nfsd_th_cnt = ATOMIC_INIT(0); -extern struct svc_program nfsd_program; static int nfsd(void *vrqstp); #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) static int nfsd_acl_rpcbind_set(struct net *, @@ -87,16 +86,6 @@ static const struct svc_version *localio_versions[] = { #define NFSD_LOCALIO_NRVERS ARRAY_SIZE(localio_versions) -static struct svc_program nfsd_localio_program = { - .pg_prog = NFS_LOCALIO_PROGRAM, - 
.pg_nvers = NFSD_LOCALIO_NRVERS, - .pg_vers = localio_versions, - .pg_name = "nfslocalio", - .pg_class = "nfsd", - .pg_authenticate = &svc_set_client, - .pg_init_request = svc_generic_init_request, - .pg_rpcbind_set = svc_generic_rpcbind_set, -}; #endif /* CONFIG_NFSD_LOCALIO */ #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) @@ -109,23 +98,9 @@ static const struct svc_version *nfsd_acl_version[] = { # endif }; -#define NFSD_ACL_MINVERS 2 +#define NFSD_ACL_MINVERS 2 #define NFSD_ACL_NRVERS ARRAY_SIZE(nfsd_acl_version) -static struct svc_program nfsd_acl_program = { -#if IS_ENABLED(CONFIG_NFSD_LOCALIO) - .pg_next = &nfsd_localio_program, -#endif /* CONFIG_NFSD_LOCALIO */ - .pg_prog = NFS_ACL_PROGRAM, - .pg_nvers = NFSD_ACL_NRVERS, - .pg_vers = nfsd_acl_version, - .pg_name = "nfsacl", - .pg_class = "nfsd", - .pg_authenticate = &svc_set_client, - .pg_init_request = nfsd_acl_init_request, - .pg_rpcbind_set = nfsd_acl_rpcbind_set, -}; - #endif /* defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) */ static const struct svc_version *nfsd_version[NFSD_MAXVERS+1] = { @@ -138,22 +113,41 @@ static const struct svc_version *nfsd_version[NFSD_MAXVERS+1] = { #endif }; -struct svc_program nfsd_program = { -#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) - .pg_next = &nfsd_acl_program, -#else -#if IS_ENABLED(CONFIG_NFSD_LOCALIO) - .pg_next = &nfsd_localio_program, -#endif /* CONFIG_NFSD_LOCALIO */ -#endif +struct svc_program nfsd_programs[] = { + { .pg_prog = NFS_PROGRAM, /* program number */ .pg_nvers = NFSD_MAXVERS+1, /* nr of entries in nfsd_version */ .pg_vers = nfsd_version, /* version table */ .pg_name = "nfsd", /* program name */ .pg_class = "nfsd", /* authentication class */ - .pg_authenticate = &svc_set_client, /* export authentication */ + .pg_authenticate = svc_set_client, /* export authentication */ .pg_init_request = nfsd_init_request, .pg_rpcbind_set = nfsd_rpcbind_set, + }, +#if defined(CONFIG_NFSD_V2_ACL) || 
defined(CONFIG_NFSD_V3_ACL) + { + .pg_prog = NFS_ACL_PROGRAM, + .pg_nvers = NFSD_ACL_NRVERS, + .pg_vers = nfsd_acl_version, + .pg_name = "nfsacl", + .pg_class = "nfsd", + .pg_authenticate = svc_set_client, + .pg_init_request = nfsd_acl_init_request, + .pg_rpcbind_set = nfsd_acl_rpcbind_set, + }, +#endif /* defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) */ +#if IS_ENABLED(CONFIG_NFSD_LOCALIO) + { + .pg_prog = NFS_LOCALIO_PROGRAM, + .pg_nvers = NFSD_LOCALIO_NRVERS, + .pg_vers = localio_versions, + .pg_name = "nfslocalio", + .pg_class = "nfsd", + .pg_authenticate = svc_set_client, + .pg_init_request = svc_generic_init_request, + .pg_rpcbind_set = svc_generic_rpcbind_set, + } +#endif /* IS_ENABLED(CONFIG_NFSD_LOCALIO) */ }; bool nfsd_support_version(int vers) @@ -663,7 +657,8 @@ int nfsd_create_serv(struct net *net) if (nfsd_max_blksize == 0) nfsd_max_blksize = nfsd_get_default_max_blksize(); nfsd_reset_versions(nn); - serv = svc_create_pooled(&nfsd_program, &nn->nfsd_svcstats, + serv = svc_create_pooled(nfsd_programs, ARRAY_SIZE(nfsd_programs), + &nn->nfsd_svcstats, nfsd_max_blksize, nfsd); if (serv == NULL) return -ENOMEM; diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h index 437672bcaa22..c7ad2fb2a155 100644 --- a/include/linux/sunrpc/svc.h +++ b/include/linux/sunrpc/svc.h @@ -67,9 +67,10 @@ enum { * We currently do not support more than one RPC program per daemon. 
*/ struct svc_serv { - struct svc_program * sv_program; /* RPC program */ + struct svc_program * sv_programs; /* RPC programs */ struct svc_stat * sv_stats; /* RPC statistics */ spinlock_t sv_lock; + unsigned int sv_nprogs; /* Number of sv_programs */ unsigned int sv_nrthreads; /* # of server threads */ unsigned int sv_maxconn; /* max connections allowed or * '0' causing max to be based @@ -357,10 +358,9 @@ struct svc_process_info { }; /* - * List of RPC programs on the same transport endpoint + * RPC program - an array of these can use the same transport endpoint */ struct svc_program { - struct svc_program * pg_next; /* other programs (same xprt) */ u32 pg_prog; /* program number */ unsigned int pg_lovers; /* lowest version */ unsigned int pg_hivers; /* highest version */ @@ -438,6 +438,7 @@ bool svc_rqst_replace_page(struct svc_rqst *rqstp, void svc_rqst_release_pages(struct svc_rqst *rqstp); void svc_exit_thread(struct svc_rqst *); struct svc_serv * svc_create_pooled(struct svc_program *prog, + unsigned int nprog, struct svc_stat *stats, unsigned int bufsize, int (*threadfn)(void *data)); diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c index ff6f3e35b36d..b33386d249c2 100644 --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c @@ -440,10 +440,11 @@ EXPORT_SYMBOL_GPL(svc_rpcb_cleanup); static int svc_uses_rpcbind(struct svc_serv *serv) { - struct svc_program *progp; - unsigned int i; + unsigned int p, i; + + for (p = 0; p < serv->sv_nprogs; p++) { + struct svc_program *progp = &serv->sv_programs[p]; - for (progp = serv->sv_program; progp; progp = progp->pg_next) { for (i = 0; i < progp->pg_nvers; i++) { if (progp->pg_vers[i] == NULL) continue; @@ -480,7 +481,7 @@ __svc_init_bc(struct svc_serv *serv) * Create an RPC service */ static struct svc_serv * -__svc_create(struct svc_program *prog, struct svc_stat *stats, +__svc_create(struct svc_program *prog, int nprogs, struct svc_stat *stats, unsigned int bufsize, int npools, int (*threadfn)(void *data)) { struct svc_serv 
*serv; @@ -491,7 +492,8 @@ __svc_create(struct svc_program *prog, struct svc_stat *stats, if (!(serv = kzalloc(sizeof(*serv), GFP_KERNEL))) return NULL; serv->sv_name = prog->pg_name; - serv->sv_program = prog; + serv->sv_programs = prog; + serv->sv_nprogs = nprogs; serv->sv_stats = stats; if (bufsize > RPCSVC_MAXPAYLOAD) bufsize = RPCSVC_MAXPAYLOAD; @@ -499,17 +501,18 @@ __svc_create(struct svc_program *prog, struct svc_stat *stats, serv->sv_max_mesg = roundup(serv->sv_max_payload + PAGE_SIZE, PAGE_SIZE); serv->sv_threadfn = threadfn; xdrsize = 0; - while (prog) { - prog->pg_lovers = prog->pg_nvers-1; - for (vers=0; vers<prog->pg_nvers ; vers++) - if (prog->pg_vers[vers]) { - prog->pg_hivers = vers; - if (prog->pg_lovers > vers) - prog->pg_lovers = vers; - if (prog->pg_vers[vers]->vs_xdrsize > xdrsize) - xdrsize = prog->pg_vers[vers]->vs_xdrsize; + for (i = 0; i < nprogs; i++) { + struct svc_program *progp = &prog[i]; + + progp->pg_lovers = progp->pg_nvers-1; + for (vers = 0; vers < progp->pg_nvers ; vers++) + if (progp->pg_vers[vers]) { + progp->pg_hivers = vers; + if (progp->pg_lovers > vers) + progp->pg_lovers = vers; + if (progp->pg_vers[vers]->vs_xdrsize > xdrsize) + xdrsize = progp->pg_vers[vers]->vs_xdrsize; } - prog = prog->pg_next; } serv->sv_xdrsize = xdrsize; INIT_LIST_HEAD(&serv->sv_tempsocks); @@ -558,13 +561,14 @@ __svc_create(struct svc_program *prog, struct svc_stat *stats, struct svc_serv *svc_create(struct svc_program *prog, unsigned int bufsize, int (*threadfn)(void *data)) { - return __svc_create(prog, NULL, bufsize, 1, threadfn); + return __svc_create(prog, 1, NULL, bufsize, 1, threadfn); } EXPORT_SYMBOL_GPL(svc_create); /** * svc_create_pooled - Create an RPC service with pooled threads - * @prog: the RPC program the new service will handle + * @prog: Array of RPC programs the new service will handle + * @nprogs: Number of programs in the array * @stats: the stats struct if desired * @bufsize: maximum message size for @prog * @threadfn: a function
to service RPC requests for @prog @@ -572,6 +576,7 @@ EXPORT_SYMBOL_GPL(svc_create); * Returns an instantiated struct svc_serv object or NULL. */ struct svc_serv *svc_create_pooled(struct svc_program *prog, + unsigned int nprogs, struct svc_stat *stats, unsigned int bufsize, int (*threadfn)(void *data)) @@ -579,7 +584,7 @@ struct svc_serv *svc_create_pooled(struct svc_program *prog, struct svc_serv *serv; unsigned int npools = svc_pool_map_get(); - serv = __svc_create(prog, stats, bufsize, npools, threadfn); + serv = __svc_create(prog, nprogs, stats, bufsize, npools, threadfn); if (!serv) goto out_err; serv->sv_is_pooled = true; @@ -602,16 +607,16 @@ svc_destroy(struct svc_serv **servp) *servp = NULL; - dprintk("svc: svc_destroy(%s)\n", serv->sv_program->pg_name); + dprintk("svc: svc_destroy(%s)\n", serv->sv_programs->pg_name); timer_shutdown_sync(&serv->sv_temptimer); /* * Remaining transports at this point are not expected. */ WARN_ONCE(!list_empty(&serv->sv_permsocks), - "SVC: permsocks remain for %s\n", serv->sv_program->pg_name); + "SVC: permsocks remain for %s\n", serv->sv_programs->pg_name); WARN_ONCE(!list_empty(&serv->sv_tempsocks), - "SVC: tempsocks remain for %s\n", serv->sv_program->pg_name); + "SVC: tempsocks remain for %s\n", serv->sv_programs->pg_name); cache_clean_deferred(serv); @@ -1149,15 +1154,16 @@ int svc_register(const struct svc_serv *serv, struct net *net, const int family, const unsigned short proto, const unsigned short port) { - struct svc_program *progp; - unsigned int i; + unsigned int p, i; int error = 0; WARN_ON_ONCE(proto == 0 && port == 0); if (proto == 0 && port == 0) return -EINVAL; - for (progp = serv->sv_program; progp; progp = progp->pg_next) { + for (p = 0; p < serv->sv_nprogs; p++) { + struct svc_program *progp = &serv->sv_programs[p]; + for (i = 0; i < progp->pg_nvers; i++) { error = progp->pg_rpcbind_set(net, progp, i, @@ -1209,13 +1215,14 @@ static void __svc_unregister(struct net *net, const u32 program, const u32 versi 
static void svc_unregister(const struct svc_serv *serv, struct net *net) { struct sighand_struct *sighand; - struct svc_program *progp; unsigned long flags; - unsigned int i; + unsigned int p, i; clear_thread_flag(TIF_SIGPENDING); - for (progp = serv->sv_program; progp; progp = progp->pg_next) { + for (p = 0; p < serv->sv_nprogs; p++) { + struct svc_program *progp = &serv->sv_programs[p]; + for (i = 0; i < progp->pg_nvers; i++) { if (progp->pg_vers[i] == NULL) continue; @@ -1321,7 +1328,7 @@ svc_process_common(struct svc_rqst *rqstp) struct svc_process_info process; enum svc_auth_status auth_res; unsigned int aoffset; - int rc; + int pr, rc; __be32 *p; /* Will be turned off only when NFSv4 Sessions are used */ @@ -1345,9 +1352,12 @@ svc_process_common(struct svc_rqst *rqstp) rqstp->rq_vers = be32_to_cpup(p++); rqstp->rq_proc = be32_to_cpup(p); - for (progp = serv->sv_program; progp; progp = progp->pg_next) + for (pr = 0; pr < serv->sv_nprogs; pr++) { + progp = &serv->sv_programs[pr]; + if (rqstp->rq_prog == progp->pg_prog) break; + } /* * Decode auth data, and add verifier to reply buffer. 
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c index 53ebc719ff5a..43c57124de52 100644 --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -268,7 +268,7 @@ static int _svc_xprt_create(struct svc_serv *serv, const char *xprt_name, spin_unlock(&svc_xprt_class_lock); newxprt = xcl->xcl_ops->xpo_create(serv, net, sap, len, flags); if (IS_ERR(newxprt)) { - trace_svc_xprt_create_err(serv->sv_program->pg_name, + trace_svc_xprt_create_err(serv->sv_programs->pg_name, xcl->xcl_name, sap, len, newxprt); module_put(xcl->xcl_owner); diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c index 04b45588ae6f..8ca98b146ec8 100644 --- a/net/sunrpc/svcauth_unix.c +++ b/net/sunrpc/svcauth_unix.c @@ -697,7 +697,8 @@ svcauth_unix_set_client(struct svc_rqst *rqstp) rqstp->rq_auth_stat = rpc_autherr_badcred; ipm = ip_map_cached_get(xprt); if (ipm == NULL) - ipm = __ip_map_lookup(sn->ip_map_cache, rqstp->rq_server->sv_program->pg_class, + ipm = __ip_map_lookup(sn->ip_map_cache, + rqstp->rq_server->sv_programs->pg_class, &sin6->sin6_addr); if (ipm == NULL) From patchwork Mon Aug 19 18:17:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768782 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0706B18A6C3; Mon, 19 Aug 2024 18:18:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091490; cv=none; b=nWFM0M26lg7G0/XfHmwB9a1SwdFzR+4iICHsHfeQtJmz4j9BJwSMa031/GeVxIyXVzz0BsKeQDapeiISDcuPzdO8LkBOtzDVXDAzq+JjaH9I1wqBL2rVSRvihltXbgYd44da6/oJejREgyjYX/D2SC0o/yCcqzLDRaupa6i9VlU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1724091490; c=relaxed/simple; bh=7SNqsGdX8gScztGirOxQVfPK+ldTdUGdAOM8sFL17nY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QGBM/eWOK4vi7sfMh0vpDtfM2q/bm8mvHFjMad8SgHTXVf0aj4VEoeXvBy7df9xKIRUMO3o7zeRVz+VdIKLiSXw/eJ+IYMdmqRxWMFGogeZg0JjsvRsCJQEMdSDAXQmTwtXT375zBqLbZQo6aUaK2WvJvMl54hBqGSlHsULA6KQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SqgQCRp6; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SqgQCRp6" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A1FF9C4AF12; Mon, 19 Aug 2024 18:18:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091489; bh=7SNqsGdX8gScztGirOxQVfPK+ldTdUGdAOM8sFL17nY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SqgQCRp6JfAkaStXl3XsaYrzrpYhgzwHpRlckcIzgkAwA85QDGVe7u/IY/FgxsD9m bf4DzQLmNQ8fzVLR19oNGQAzRojWYQk6Mt6W2IUtTk7lcfAbdFg0xTXoQw1SuFZIge 2sGZT4g9RHdsredkPgU3l/0p6s6mnHr+IVkfP6qlXNX91KmXGY45Nbua14wku20na9 yU4gEWciiUM6VSHL4SA47bqqPOk4qSEKcEMl/DrZiT4cZk5nzkY3/1RyzdzcywJxmn 64BJPe1GC7WMq4yHtgPtId8AHh/hHsE5TtY5D6VjNSDGzRul9OICEovU+NpopUNC9s ZdL3+EGEJ2OPQ== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 13/24] nfs: pass struct file to nfs_init_pgio and nfs_init_commit Date: Mon, 19 Aug 2024 14:17:18 -0400 Message-ID: <20240819181750.70570-14-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Weston Andros Adamson The localio file will be passed, in future 
commits, by callers that enable localio support (for both regular NFS and pNFS IO). Signed-off-by: Weston Andros Adamson Signed-off-by: Trond Myklebust Signed-off-by: Mike Snitzer --- fs/nfs/filelayout/filelayout.c | 6 +++--- fs/nfs/flexfilelayout/flexfilelayout.c | 6 +++--- fs/nfs/internal.h | 6 ++++-- fs/nfs/pagelist.c | 6 ++++-- fs/nfs/pnfs_nfs.c | 2 +- fs/nfs/write.c | 5 +++-- 6 files changed, 18 insertions(+), 13 deletions(-) diff --git a/fs/nfs/filelayout/filelayout.c b/fs/nfs/filelayout/filelayout.c index b6e9aeaf4ce2..d39a1f58e18d 100644 --- a/fs/nfs/filelayout/filelayout.c +++ b/fs/nfs/filelayout/filelayout.c @@ -488,7 +488,7 @@ filelayout_read_pagelist(struct nfs_pgio_header *hdr) /* Perform an asynchronous read to ds */ nfs_initiate_pgio(ds_clnt, hdr, hdr->cred, NFS_PROTO(hdr->inode), &filelayout_read_call_ops, - 0, RPC_TASK_SOFTCONN); + 0, RPC_TASK_SOFTCONN, NULL); return PNFS_ATTEMPTED; } @@ -530,7 +530,7 @@ filelayout_write_pagelist(struct nfs_pgio_header *hdr, int sync) /* Perform an asynchronous write */ nfs_initiate_pgio(ds_clnt, hdr, hdr->cred, NFS_PROTO(hdr->inode), &filelayout_write_call_ops, - sync, RPC_TASK_SOFTCONN); + sync, RPC_TASK_SOFTCONN, NULL); return PNFS_ATTEMPTED; } @@ -1011,7 +1011,7 @@ static int filelayout_initiate_commit(struct nfs_commit_data *data, int how) data->args.fh = fh; return nfs_initiate_commit(ds_clnt, data, NFS_PROTO(data->inode), &filelayout_commit_call_ops, how, - RPC_TASK_SOFTCONN); + RPC_TASK_SOFTCONN, NULL); out_err: pnfs_generic_prepare_to_resend_writes(data); pnfs_generic_commit_release(data); diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index d4d551ffea7b..01ee52551a63 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -1806,7 +1806,7 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr) nfs_initiate_pgio(ds_clnt, hdr, ds_cred, ds->ds_clp->rpc_ops, vers == 3 ? 
&ff_layout_read_call_ops_v3 : &ff_layout_read_call_ops_v4, - 0, RPC_TASK_SOFTCONN); + 0, RPC_TASK_SOFTCONN, NULL); put_cred(ds_cred); return PNFS_ATTEMPTED; @@ -1874,7 +1874,7 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync) nfs_initiate_pgio(ds_clnt, hdr, ds_cred, ds->ds_clp->rpc_ops, vers == 3 ? &ff_layout_write_call_ops_v3 : &ff_layout_write_call_ops_v4, - sync, RPC_TASK_SOFTCONN); + sync, RPC_TASK_SOFTCONN, NULL); put_cred(ds_cred); return PNFS_ATTEMPTED; @@ -1949,7 +1949,7 @@ static int ff_layout_initiate_commit(struct nfs_commit_data *data, int how) ret = nfs_initiate_commit(ds_clnt, data, ds->ds_clp->rpc_ops, vers == 3 ? &ff_layout_commit_call_ops_v3 : &ff_layout_commit_call_ops_v4, - how, RPC_TASK_SOFTCONN); + how, RPC_TASK_SOFTCONN, NULL); put_cred(ds_cred); return ret; out_err: diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index 5902a9beca1f..9fc6c1a41ee4 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -308,7 +308,8 @@ void nfs_pgio_header_free(struct nfs_pgio_header *); int nfs_generic_pgio(struct nfs_pageio_descriptor *, struct nfs_pgio_header *); int nfs_initiate_pgio(struct rpc_clnt *clnt, struct nfs_pgio_header *hdr, const struct cred *cred, const struct nfs_rpc_ops *rpc_ops, - const struct rpc_call_ops *call_ops, int how, int flags); + const struct rpc_call_ops *call_ops, int how, int flags, + struct file *localio); void nfs_free_request(struct nfs_page *req); struct nfs_pgio_mirror * nfs_pgio_current_mirror(struct nfs_pageio_descriptor *desc); @@ -528,7 +529,8 @@ extern int nfs_initiate_commit(struct rpc_clnt *clnt, struct nfs_commit_data *data, const struct nfs_rpc_ops *nfs_ops, const struct rpc_call_ops *call_ops, - int how, int flags); + int how, int flags, + struct file *localio); extern void nfs_init_commit(struct nfs_commit_data *data, struct list_head *head, struct pnfs_layout_segment *lseg, diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index 04124f226665..532cfaf79813 100644 --- a/fs/nfs/pagelist.c +++ 
b/fs/nfs/pagelist.c @@ -731,7 +731,8 @@ static void nfs_pgio_prepare(struct rpc_task *task, void *calldata) int nfs_initiate_pgio(struct rpc_clnt *clnt, struct nfs_pgio_header *hdr, const struct cred *cred, const struct nfs_rpc_ops *rpc_ops, - const struct rpc_call_ops *call_ops, int how, int flags) + const struct rpc_call_ops *call_ops, int how, int flags, + struct file *localio) { struct rpc_task *task; struct rpc_message msg = { @@ -961,7 +962,8 @@ static int nfs_generic_pg_pgios(struct nfs_pageio_descriptor *desc) NFS_PROTO(hdr->inode), desc->pg_rpc_callops, desc->pg_ioflags, - RPC_TASK_CRED_NOREF | task_flags); + RPC_TASK_CRED_NOREF | task_flags, + NULL); } return ret; } diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c index a74ee69a2fa6..dbef837e871a 100644 --- a/fs/nfs/pnfs_nfs.c +++ b/fs/nfs/pnfs_nfs.c @@ -490,7 +490,7 @@ pnfs_generic_commit_pagelist(struct inode *inode, struct list_head *mds_pages, nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(data->inode), data->mds_ops, how, - RPC_TASK_CRED_NOREF); + RPC_TASK_CRED_NOREF, NULL); } else { nfs_init_commit(data, NULL, data->lseg, cinfo); initiate_commit(data, how); diff --git a/fs/nfs/write.c b/fs/nfs/write.c index d074d0ceb4f0..ad9e98e46a0d 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1663,7 +1663,8 @@ EXPORT_SYMBOL_GPL(nfs_commitdata_release); int nfs_initiate_commit(struct rpc_clnt *clnt, struct nfs_commit_data *data, const struct nfs_rpc_ops *nfs_ops, const struct rpc_call_ops *call_ops, - int how, int flags) + int how, int flags, + struct file *localio) { struct rpc_task *task; int priority = flush_task_priority(how); @@ -1809,7 +1810,7 @@ nfs_commit_list(struct inode *inode, struct list_head *head, int how, task_flags = RPC_TASK_MOVEABLE; return nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(inode), data->mds_ops, how, - RPC_TASK_CRED_NOREF | task_flags); + RPC_TASK_CRED_NOREF | task_flags, NULL); } /* From patchwork Mon Aug 19 18:17:19 2024 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768783 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6ACBA18A6C3; Mon, 19 Aug 2024 18:18:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091491; cv=none; b=kZ02wpII2V3vjU7RiLc3iHkNCoupL+42rR9/fcqfhmiutD3lk6um8gGCiTqtzJZ390qYVFFOYWhgRJ859L5nwCVv29WPvGHVGIrwTm6pPL8hXcqoMOAzYMRHwRKVg7XwYvtgy6/T1Ie/yxLRciKoF929y0vh8/DG8j/fRa7h2RA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091491; c=relaxed/simple; bh=XCdOSeXB+NQJQDD4of1vkcJGYX2LMI3CHpPNXJC55l4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rnvYkNoEAitudrYe8pFYyD7fXeI6FwgwN+AwnzjVbKoygxNvKh5QVcoXvD38rlp5yrCtH61pQvuEz/2SCxNBbhgmZ36f6UM797gr/i2Oao0XhyLiCBl3Tns3K5YK2wmJHQimWPs535npn+0VLFMjeCjJdEenc9oR/CU8TchpYUE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=siCcRT9h; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="siCcRT9h" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0510AC4AF14; Mon, 19 Aug 2024 18:18:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091491; bh=XCdOSeXB+NQJQDD4of1vkcJGYX2LMI3CHpPNXJC55l4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=siCcRT9hxKgfKsuYCacA6n2yJRF+ttyLW8V0KY6sUXz5N7wNbvTc4fIZS8mqXOzbx pnf2pk0fkcJELmvlL6QQP8Clubwn9wxZx4EzX9+fC03F1g+66/tTrXcnmvp7bpzvUw 
6rba6IZSfWWbOpCVwolozDRD+W+SUVB2CH3hnoZNL9P7IZgqn9XFmf2lDEvpYi3MQi ppQJ/rLiYJryCeIikY7hJIMHhYAuADO4ycmSNezGILV11Byop9R9iBM609fLLrbGOg FCQecLsLKsLnpPSN5MiDnYi0zQZMtdWIIJYOIN5VIKVPkbVo7rcarhXQ0EiYbkivLa wzApgr38BSugQ== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 14/24] nfs: add localio support Date: Mon, 19 Aug 2024 14:17:19 -0400 Message-ID: <20240819181750.70570-15-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Weston Andros Adamson Add client support for bypassing NFS for localhost reads, writes, and commits. This is only useful when the client and the server are running on the same host. nfs_local_probe() is stubbed out; later commits will enable the client and server handshake via a Linux-only LOCALIO auxiliary RPC protocol. This has dynamic binding with the nfsd module (via nfs_localio module which is part of nfs_common). Localio will only work if nfsd is already loaded. The "localio_enabled" nfs kernel module parameter can be used to disable and enable the ability to use localio support. CONFIG_NFS_LOCALIO controls the client enablement.
Signed-off-by: Weston Andros Adamson Signed-off-by: Trond Myklebust Co-developed-by: Mike Snitzer Signed-off-by: Mike Snitzer --- fs/nfs/Kconfig | 14 + fs/nfs/Makefile | 1 + fs/nfs/client.c | 3 + fs/nfs/internal.h | 51 ++++ fs/nfs/localio.c | 613 ++++++++++++++++++++++++++++++++++++++ fs/nfs/nfstrace.h | 61 ++++ fs/nfs/pagelist.c | 4 + fs/nfs/write.c | 3 + include/linux/nfs.h | 2 + include/linux/nfs_fs_sb.h | 1 + 10 files changed, 753 insertions(+) create mode 100644 fs/nfs/localio.c diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig index 0eb20012792f..d52a1df28f69 100644 --- a/fs/nfs/Kconfig +++ b/fs/nfs/Kconfig @@ -87,6 +87,20 @@ config NFS_V4 If unsure, say Y. +config NFS_LOCALIO + bool "NFS client support for the LOCALIO auxiliary protocol" + depends on NFS_FS + select NFS_COMMON_LOCALIO_SUPPORT + help + Some NFS servers support an auxiliary NFS LOCALIO protocol + that is not an official part of the NFS protocol. + + This option enables support for the LOCALIO protocol in the + kernel's NFS client. Enable this to bypass using the NFS + protocol when issuing reads, writes and commits to the server. + + If unsure, say N. 
+ config NFS_SWAP bool "Provide swap over NFS support" default n diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile index 5f6db37f461e..9fb2f2cac87e 100644 --- a/fs/nfs/Makefile +++ b/fs/nfs/Makefile @@ -13,6 +13,7 @@ nfs-y := client.o dir.o file.o getroot.o inode.o super.o \ nfs-$(CONFIG_ROOT_NFS) += nfsroot.o nfs-$(CONFIG_SYSCTL) += sysctl.o nfs-$(CONFIG_NFS_FSCACHE) += fscache.o +nfs-$(CONFIG_NFS_LOCALIO) += localio.o obj-$(CONFIG_NFS_V2) += nfsv2.o nfsv2-y := nfs2super.o proc.o nfs2xdr.o diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 1b65a5d7af49..bf327ddbdd25 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -242,6 +242,8 @@ static void pnfs_init_server(struct nfs_server *server) */ void nfs_free_client(struct nfs_client *clp) { + nfs_local_disable(clp); + /* -EIO all pending I/O */ if (!IS_ERR(clp->cl_rpcclient)) rpc_shutdown_client(clp->cl_rpcclient); @@ -433,6 +435,7 @@ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init) list_add_tail(&new->cl_share_link, &nn->nfs_client_list); spin_unlock(&nn->nfs_client_lock); + nfs_local_probe(new); return rpc_ops->init_client(new, cl_init); } diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index 9fc6c1a41ee4..acb9d8bb4076 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -450,6 +450,57 @@ extern void nfs_set_cache_invalid(struct inode *inode, unsigned long flags); extern bool nfs_check_cache_invalid(struct inode *, unsigned long); extern int nfs_wait_bit_killable(struct wait_bit_key *key, int mode); +#if IS_ENABLED(CONFIG_NFS_LOCALIO) +/* localio.c */ +extern void nfs_local_disable(struct nfs_client *); +extern void nfs_local_probe(struct nfs_client *); +extern struct file *nfs_local_open_fh(struct nfs_client *, const struct cred *, + struct nfs_fh *, const fmode_t); +extern struct file *nfs_local_file_open(struct nfs_client *clp, + const struct cred *cred, + struct nfs_fh *fh, + struct nfs_open_context *ctx); +extern int nfs_local_doio(struct nfs_client *, struct file *, + 
struct nfs_pgio_header *, + const struct rpc_call_ops *); +extern int nfs_local_commit(struct file *, struct nfs_commit_data *, + const struct rpc_call_ops *, int); +extern bool nfs_server_is_local(const struct nfs_client *clp); + +#else +static inline void nfs_local_disable(struct nfs_client *clp) {} +static inline void nfs_local_probe(struct nfs_client *clp) {} +static inline struct file *nfs_local_open_fh(struct nfs_client *clp, + const struct cred *cred, + struct nfs_fh *fh, + const fmode_t mode) +{ + return ERR_PTR(-EINVAL); +} +static inline struct file *nfs_local_file_open(struct nfs_client *clp, + const struct cred *cred, + struct nfs_fh *fh, + struct nfs_open_context *ctx) +{ + return NULL; +} +static inline int nfs_local_doio(struct nfs_client *clp, struct file *filep, + struct nfs_pgio_header *hdr, + const struct rpc_call_ops *call_ops) +{ + return -EINVAL; +} +static inline int nfs_local_commit(struct file *filep, struct nfs_commit_data *data, + const struct rpc_call_ops *call_ops, int how) +{ + return -EINVAL; +} +static inline bool nfs_server_is_local(const struct nfs_client *clp) +{ + return false; +} +#endif /* CONFIG_NFS_LOCALIO */ + /* super.c */ extern const struct super_operations nfs_sops; bool nfs_auth_info_match(const struct nfs_auth_info *, rpc_authflavor_t); diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c new file mode 100644 index 000000000000..d6ec425bf6f0 --- /dev/null +++ b/fs/nfs/localio.c @@ -0,0 +1,613 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * NFS client support for local clients to bypass network stack + * + * Copyright (C) 2014 Weston Andros Adamson + * Copyright (C) 2019 Trond Myklebust + * Copyright (C) 2024 Mike Snitzer + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "internal.h" +#include "pnfs.h" +#include "nfstrace.h" + +#define NFSDBG_FACILITY NFSDBG_VFS + +struct nfs_local_kiocb { + struct 
kiocb kiocb; + struct bio_vec *bvec; + struct nfs_pgio_header *hdr; + struct work_struct work; +}; + +struct nfs_local_fsync_ctx { + struct file *filp; + struct nfs_commit_data *data; + struct work_struct work; + struct kref kref; + struct completion *done; +}; +static void nfs_local_fsync_work(struct work_struct *work); + +static bool localio_enabled __read_mostly = true; +module_param(localio_enabled, bool, 0644); + +bool nfs_server_is_local(const struct nfs_client *clp) +{ + return test_bit(NFS_CS_LOCAL_IO, &clp->cl_flags) != 0 && + localio_enabled; +} +EXPORT_SYMBOL_GPL(nfs_server_is_local); + +/* + * nfs_local_enable - enable local i/o for an nfs_client + */ +static __maybe_unused void nfs_local_enable(struct nfs_client *clp, + nfs_uuid_t *nfs_uuid) +{ + if (READ_ONCE(clp->nfsd_open_local_fh)) { + set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags); + clp->cl_nfssvc_net = nfs_uuid->net; + clp->cl_nfssvc_dom = nfs_uuid->dom; + trace_nfs_local_enable(clp); + } +} + +/* + * nfs_local_disable - disable local i/o for an nfs_client + */ +void nfs_local_disable(struct nfs_client *clp) +{ + if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) { + trace_nfs_local_disable(clp); + clp->cl_nfssvc_net = NULL; + if (clp->cl_nfssvc_dom) { + auth_domain_put(clp->cl_nfssvc_dom); + clp->cl_nfssvc_dom = NULL; + } + } +} + +/* + * nfs_local_probe - probe local i/o support for an nfs_server and nfs_client + */ +void nfs_local_probe(struct nfs_client *clp) +{ +} +EXPORT_SYMBOL_GPL(nfs_local_probe); + +/* + * nfs_local_open_fh - open a local filehandle + * + * Returns a pointer to a struct file or an ERR_PTR + */ +struct file * +nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, + struct nfs_fh *fh, const fmode_t mode) +{ + struct file *filp; + int status; + + if (mode & ~(FMODE_READ | FMODE_WRITE)) + return ERR_PTR(-EINVAL); + + status = clp->nfsd_open_local_fh(clp->cl_nfssvc_net, clp->cl_nfssvc_dom, + clp->cl_rpcclient, cred, fh, mode, &filp); + if (status < 0) { + 
trace_nfs_local_open_fh(fh, mode, status); + switch (status) { + case -ENXIO: + case -ENOENT: + nfs_local_disable(clp); + fallthrough; + case -ETIMEDOUT: + status = -EAGAIN; + } + filp = ERR_PTR(status); + } + return filp; +} +EXPORT_SYMBOL_GPL(nfs_local_open_fh); + +struct file * +nfs_local_file_open(struct nfs_client *clp, const struct cred *cred, + struct nfs_fh *fh, struct nfs_open_context *ctx) +{ + struct file *filp; + + if (!nfs_server_is_local(clp)) + return NULL; + + filp = nfs_local_open_fh(clp, cred, fh, ctx->mode); + if (IS_ERR(filp)) + return NULL; + + return filp; +} + +static struct bio_vec * +nfs_bvec_alloc_and_import_pagevec(struct page **pagevec, + unsigned int npages, gfp_t flags) +{ + struct bio_vec *bvec, *p; + + bvec = kmalloc_array(npages, sizeof(*bvec), flags); + if (bvec != NULL) { + for (p = bvec; npages > 0; p++, pagevec++, npages--) { + p->bv_page = *pagevec; + p->bv_len = PAGE_SIZE; + p->bv_offset = 0; + } + } + return bvec; +} + +static void +nfs_local_iocb_free(struct nfs_local_kiocb *iocb) +{ + kfree(iocb->bvec); + kfree(iocb); +} + +static struct nfs_local_kiocb * +nfs_local_iocb_alloc(struct nfs_pgio_header *hdr, struct file *filp, + gfp_t flags) +{ + struct nfs_local_kiocb *iocb; + + iocb = kmalloc(sizeof(*iocb), flags); + if (iocb == NULL) + return NULL; + iocb->bvec = nfs_bvec_alloc_and_import_pagevec(hdr->page_array.pagevec, + hdr->page_array.npages, flags); + if (iocb->bvec == NULL) { + kfree(iocb); + return NULL; + } + init_sync_kiocb(&iocb->kiocb, filp); + iocb->kiocb.ki_pos = hdr->args.offset; + iocb->hdr = hdr; + iocb->kiocb.ki_flags &= ~IOCB_APPEND; + return iocb; +} + +static void +nfs_local_iter_init(struct iov_iter *i, struct nfs_local_kiocb *iocb, int dir) +{ + struct nfs_pgio_header *hdr = iocb->hdr; + + iov_iter_bvec(i, dir, iocb->bvec, hdr->page_array.npages, + hdr->args.count + hdr->args.pgbase); + if (hdr->args.pgbase != 0) + iov_iter_advance(i, hdr->args.pgbase); +} + +static void +nfs_local_hdr_release(struct 
nfs_pgio_header *hdr, + const struct rpc_call_ops *call_ops) +{ + call_ops->rpc_call_done(&hdr->task, hdr); + call_ops->rpc_release(hdr); +} + +static void +nfs_local_pgio_init(struct nfs_pgio_header *hdr, + const struct rpc_call_ops *call_ops) +{ + hdr->task.tk_ops = call_ops; + if (!hdr->task.tk_start) + hdr->task.tk_start = ktime_get(); +} + +static void +nfs_local_pgio_done(struct nfs_pgio_header *hdr, long status) +{ + if (status >= 0) { + hdr->res.count = status; + hdr->res.op_status = NFS4_OK; + hdr->task.tk_status = 0; + } else { + hdr->res.op_status = nfs4_stat_to_errno(status); + hdr->task.tk_status = status; + } +} + +static void +nfs_local_pgio_release(struct nfs_local_kiocb *iocb) +{ + struct nfs_pgio_header *hdr = iocb->hdr; + + fput(iocb->kiocb.ki_filp); + nfs_local_iocb_free(iocb); + nfs_local_hdr_release(hdr, hdr->task.tk_ops); +} + +static void +nfs_local_read_done(struct nfs_local_kiocb *iocb, long status) +{ + struct nfs_pgio_header *hdr = iocb->hdr; + struct file *filp = iocb->kiocb.ki_filp; + + nfs_local_pgio_done(hdr, status); + + if (hdr->res.count != hdr->args.count || + hdr->args.offset + hdr->res.count >= i_size_read(file_inode(filp))) + hdr->res.eof = true; + + dprintk("%s: read %ld bytes eof %d.\n", __func__, + status > 0 ? 
status : 0, hdr->res.eof); +} + +static int +nfs_do_local_read(struct nfs_pgio_header *hdr, struct file *filp, + const struct rpc_call_ops *call_ops) +{ + struct nfs_local_kiocb *iocb; + struct iov_iter iter; + ssize_t status; + + dprintk("%s: vfs_read count=%u pos=%llu\n", + __func__, hdr->args.count, hdr->args.offset); + + iocb = nfs_local_iocb_alloc(hdr, filp, GFP_KERNEL); + if (iocb == NULL) + return -ENOMEM; + nfs_local_iter_init(&iter, iocb, READ); + + nfs_local_pgio_init(hdr, call_ops); + hdr->res.eof = false; + + status = filp->f_op->read_iter(&iocb->kiocb, &iter); + WARN_ON_ONCE(status == -EIOCBQUEUED); + + nfs_local_read_done(iocb, status); + nfs_local_pgio_release(iocb); + + return 0; +} + +static void +nfs_copy_boot_verifier(struct nfs_write_verifier *verifier, struct inode *inode) +{ + struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; + u32 *verf = (u32 *)verifier->data; + int seq = 0; + + do { + read_seqbegin_or_lock(&clp->cl_boot_lock, &seq); + verf[0] = (u32)clp->cl_nfssvc_boot.tv_sec; + verf[1] = (u32)clp->cl_nfssvc_boot.tv_nsec; + } while (need_seqretry(&clp->cl_boot_lock, seq)); + done_seqretry(&clp->cl_boot_lock, seq); +} + +static void +nfs_reset_boot_verifier(struct inode *inode) +{ + struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; + + write_seqlock(&clp->cl_boot_lock); + ktime_get_real_ts64(&clp->cl_nfssvc_boot); + write_sequnlock(&clp->cl_boot_lock); +} + +static void +nfs_set_local_verifier(struct inode *inode, + struct nfs_writeverf *verf, + enum nfs3_stable_how how) +{ + + nfs_copy_boot_verifier(&verf->verifier, inode); + verf->committed = how; +} + +/* Factored out from fs/nfsd/vfs.h:fh_getattr() */ +static int __vfs_getattr(struct path *p, struct kstat *stat, int version) +{ + u32 request_mask = STATX_BASIC_STATS; + + if (version == 4) + request_mask |= (STATX_BTIME | STATX_CHANGE_COOKIE); + return vfs_getattr(p, stat, request_mask, AT_STATX_SYNC_AS_STAT); +} + +/* Copied from fs/nfsd/nfsfh.c:nfsd4_change_attribute() */ 
+static u64 __nfsd4_change_attribute(const struct kstat *stat, + const struct inode *inode) +{ + u64 chattr; + + if (stat->result_mask & STATX_CHANGE_COOKIE) { + chattr = stat->change_cookie; + if (S_ISREG(inode->i_mode) && + !(stat->attributes & STATX_ATTR_CHANGE_MONOTONIC)) { + chattr += (u64)stat->ctime.tv_sec << 30; + chattr += stat->ctime.tv_nsec; + } + } else { + chattr = time_to_chattr(&stat->ctime); + } + return chattr; +} + +static void nfs_local_vfs_getattr(struct nfs_local_kiocb *iocb) +{ + struct kstat stat; + struct file *filp = iocb->kiocb.ki_filp; + struct nfs_pgio_header *hdr = iocb->hdr; + struct nfs_fattr *fattr = hdr->res.fattr; + int version = NFS_PROTO(hdr->inode)->version; + + if (unlikely(!fattr) || __vfs_getattr(&filp->f_path, &stat, version)) + return; + + fattr->valid = (NFS_ATTR_FATTR_FILEID | + NFS_ATTR_FATTR_CHANGE | + NFS_ATTR_FATTR_SIZE | + NFS_ATTR_FATTR_ATIME | + NFS_ATTR_FATTR_MTIME | + NFS_ATTR_FATTR_CTIME | + NFS_ATTR_FATTR_SPACE_USED); + + fattr->fileid = stat.ino; + fattr->size = stat.size; + fattr->atime = stat.atime; + fattr->mtime = stat.mtime; + fattr->ctime = stat.ctime; + if (version == 4) { + fattr->change_attr = + __nfsd4_change_attribute(&stat, file_inode(filp)); + } else + fattr->change_attr = nfs_timespec_to_change_attr(&fattr->ctime); + fattr->du.nfs3.used = stat.blocks << 9; +} + +static void +nfs_local_write_done(struct nfs_local_kiocb *iocb, long status) +{ + struct nfs_pgio_header *hdr = iocb->hdr; + struct inode *inode = hdr->inode; + + dprintk("%s: wrote %ld bytes.\n", __func__, status > 0 ? 
status : 0); + + /* Handle short writes as if they are ENOSPC */ + if (status > 0 && status < hdr->args.count) { + hdr->mds_offset += status; + hdr->args.offset += status; + hdr->args.pgbase += status; + hdr->args.count -= status; + nfs_set_pgio_error(hdr, -ENOSPC, hdr->args.offset); + status = -ENOSPC; + } + if (status < 0) + nfs_reset_boot_verifier(inode); + else if (nfs_should_remove_suid(inode)) { + /* Deal with the suid/sgid bit corner case */ + spin_lock(&inode->i_lock); + nfs_set_cache_invalid(inode, NFS_INO_INVALID_MODE); + spin_unlock(&inode->i_lock); + } + nfs_local_pgio_done(hdr, status); +} + +static int +nfs_do_local_write(struct nfs_pgio_header *hdr, struct file *filp, + const struct rpc_call_ops *call_ops) +{ + struct nfs_local_kiocb *iocb; + struct iov_iter iter; + ssize_t status; + + dprintk("%s: vfs_write count=%u pos=%llu %s\n", + __func__, hdr->args.count, hdr->args.offset, + (hdr->args.stable == NFS_UNSTABLE) ? "unstable" : "stable"); + + iocb = nfs_local_iocb_alloc(hdr, filp, GFP_NOIO); + if (iocb == NULL) + return -ENOMEM; + nfs_local_iter_init(&iter, iocb, WRITE); + + switch (hdr->args.stable) { + default: + break; + case NFS_DATA_SYNC: + iocb->kiocb.ki_flags |= IOCB_DSYNC; + break; + case NFS_FILE_SYNC: + iocb->kiocb.ki_flags |= IOCB_DSYNC|IOCB_SYNC; + } + nfs_local_pgio_init(hdr, call_ops); + + nfs_set_local_verifier(hdr->inode, hdr->res.verf, hdr->args.stable); + + file_start_write(filp); + status = filp->f_op->write_iter(&iocb->kiocb, &iter); + file_end_write(filp); + WARN_ON_ONCE(status == -EIOCBQUEUED); + + nfs_local_write_done(iocb, status); + nfs_local_vfs_getattr(iocb); + nfs_local_pgio_release(iocb); + + return 0; +} + +int +nfs_local_doio(struct nfs_client *clp, struct file *filp, + struct nfs_pgio_header *hdr, + const struct rpc_call_ops *call_ops) +{ + int status = 0; + + if (!hdr->args.count) + return 0; + /* Don't support filesystems without read_iter/write_iter */ + if (!filp->f_op->read_iter || !filp->f_op->write_iter) { + 
nfs_local_disable(clp); + status = -EAGAIN; + goto out_fput; + } + + switch (hdr->rw_mode) { + case FMODE_READ: + status = nfs_do_local_read(hdr, filp, call_ops); + break; + case FMODE_WRITE: + status = nfs_do_local_write(hdr, filp, call_ops); + break; + default: + dprintk("%s: invalid mode: %d\n", __func__, + hdr->rw_mode); + status = -EINVAL; + } +out_fput: + if (status != 0) { + fput(filp); + hdr->task.tk_status = status; + nfs_local_hdr_release(hdr, call_ops); + } + return status; +} + +static void +nfs_local_init_commit(struct nfs_commit_data *data, + const struct rpc_call_ops *call_ops) +{ + data->task.tk_ops = call_ops; +} + +static int +nfs_local_run_commit(struct file *filp, struct nfs_commit_data *data) +{ + loff_t start = data->args.offset; + loff_t end = LLONG_MAX; + + if (data->args.count > 0) { + end = start + data->args.count - 1; + if (end < start) + end = LLONG_MAX; + } + + dprintk("%s: commit %llu - %llu\n", __func__, start, end); + return vfs_fsync_range(filp, start, end, 0); +} + +static void +nfs_local_commit_done(struct nfs_commit_data *data, int status) +{ + if (status >= 0) { + nfs_set_local_verifier(data->inode, + data->res.verf, + NFS_FILE_SYNC); + data->res.op_status = NFS4_OK; + data->task.tk_status = 0; + } else { + nfs_reset_boot_verifier(data->inode); + data->res.op_status = nfs4_stat_to_errno(status); + data->task.tk_status = status; + } +} + +static void +nfs_local_release_commit_data(struct file *filp, + struct nfs_commit_data *data, + const struct rpc_call_ops *call_ops) +{ + fput(filp); + call_ops->rpc_call_done(&data->task, data); + call_ops->rpc_release(data); +} + +static struct nfs_local_fsync_ctx * +nfs_local_fsync_ctx_alloc(struct nfs_commit_data *data, struct file *filp, + gfp_t flags) +{ + struct nfs_local_fsync_ctx *ctx = kmalloc(sizeof(*ctx), flags); + + if (ctx != NULL) { + ctx->filp = filp; + ctx->data = data; + INIT_WORK(&ctx->work, nfs_local_fsync_work); + kref_init(&ctx->kref); + ctx->done = NULL; + } + return ctx; 
+} + +static void +nfs_local_fsync_ctx_kref_free(struct kref *kref) +{ + kfree(container_of(kref, struct nfs_local_fsync_ctx, kref)); +} + +static void +nfs_local_fsync_ctx_put(struct nfs_local_fsync_ctx *ctx) +{ + kref_put(&ctx->kref, nfs_local_fsync_ctx_kref_free); +} + +static void +nfs_local_fsync_ctx_free(struct nfs_local_fsync_ctx *ctx) +{ + nfs_local_release_commit_data(ctx->filp, ctx->data, + ctx->data->task.tk_ops); + nfs_local_fsync_ctx_put(ctx); +} + +static void +nfs_local_fsync_work(struct work_struct *work) +{ + struct nfs_local_fsync_ctx *ctx; + int status; + + ctx = container_of(work, struct nfs_local_fsync_ctx, work); + + status = nfs_local_run_commit(ctx->filp, ctx->data); + nfs_local_commit_done(ctx->data, status); + if (ctx->done != NULL) + complete(ctx->done); + nfs_local_fsync_ctx_free(ctx); +} + +int +nfs_local_commit(struct file *filp, struct nfs_commit_data *data, + const struct rpc_call_ops *call_ops, int how) +{ + struct nfs_local_fsync_ctx *ctx; + + ctx = nfs_local_fsync_ctx_alloc(data, filp, GFP_KERNEL); + if (!ctx) { + nfs_local_commit_done(data, -ENOMEM); + nfs_local_release_commit_data(filp, data, call_ops); + return -ENOMEM; + } + + nfs_local_init_commit(data, call_ops); + kref_get(&ctx->kref); + if (how & FLUSH_SYNC) { + DECLARE_COMPLETION_ONSTACK(done); + ctx->done = &done; + queue_work(nfsiod_workqueue, &ctx->work); + wait_for_completion(&done); + } else + queue_work(nfsiod_workqueue, &ctx->work); + nfs_local_fsync_ctx_put(ctx); + return 0; +} diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h index 352fdaed4075..1eab98c277fa 100644 --- a/fs/nfs/nfstrace.h +++ b/fs/nfs/nfstrace.h @@ -1685,6 +1685,67 @@ TRACE_EVENT(nfs_mount_path, TP_printk("path='%s'", __get_str(path)) ); +TRACE_EVENT(nfs_local_open_fh, + TP_PROTO( + const struct nfs_fh *fh, + fmode_t fmode, + int error + ), + + TP_ARGS(fh, fmode, error), + + TP_STRUCT__entry( + __field(int, error) + __field(u32, fhandle) + __field(unsigned int, fmode) + ), + + TP_fast_assign( + 
__entry->error = error; + __entry->fhandle = nfs_fhandle_hash(fh); + __entry->fmode = (__force unsigned int)fmode; + ), + + TP_printk( + "error=%d fhandle=0x%08x mode=%s", + __entry->error, + __entry->fhandle, + show_fs_fmode_flags(__entry->fmode) + ) +); + +DECLARE_EVENT_CLASS(nfs_local_client_event, + TP_PROTO( + const struct nfs_client *clp + ), + + TP_ARGS(clp), + + TP_STRUCT__entry( + __field(unsigned int, protocol) + __string(server, clp->cl_hostname) + ), + + TP_fast_assign( + __entry->protocol = clp->rpc_ops->version; + __assign_str(server); + ), + + TP_printk( + "server=%s NFSv%u", __get_str(server), __entry->protocol + ) +); + +#define DEFINE_NFS_LOCAL_CLIENT_EVENT(name) \ + DEFINE_EVENT(nfs_local_client_event, name, \ + TP_PROTO( \ + const struct nfs_client *clp \ + ), \ + TP_ARGS(clp)) + +DEFINE_NFS_LOCAL_CLIENT_EVENT(nfs_local_enable); +DEFINE_NFS_LOCAL_CLIENT_EVENT(nfs_local_disable); + DECLARE_EVENT_CLASS(nfs_xdr_event, TP_PROTO( const struct xdr_stream *xdr, diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index 532cfaf79813..c4160edd377e 100644 --- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -762,6 +762,10 @@ int nfs_initiate_pgio(struct rpc_clnt *clnt, struct nfs_pgio_header *hdr, hdr->args.count, (unsigned long long)hdr->args.offset); + if (localio) + return nfs_local_doio(NFS_SERVER(hdr->inode)->nfs_client, + localio, hdr, call_ops); + task = rpc_run_task(&task_setup_data); if (IS_ERR(task)) return PTR_ERR(task); diff --git a/fs/nfs/write.c b/fs/nfs/write.c index ad9e98e46a0d..8bc807a3e041 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1693,6 +1693,9 @@ int nfs_initiate_commit(struct rpc_clnt *clnt, struct nfs_commit_data *data, dprintk("NFS: initiated commit call\n"); + if (localio) + return nfs_local_commit(localio, data, call_ops, how); + task = rpc_run_task(&task_setup_data); if (IS_ERR(task)) return PTR_ERR(task); diff --git a/include/linux/nfs.h b/include/linux/nfs.h index 5ff1a5b3b00c..89ef8c5e98db 100644 --- 
a/include/linux/nfs.h +++ b/include/linux/nfs.h @@ -8,6 +8,8 @@ #ifndef _LINUX_NFS_H #define _LINUX_NFS_H +#include +#include #include #include #include diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index 3849cc2832f0..5edc57657985 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -50,6 +50,7 @@ struct nfs_client { #define NFS_CS_DS 7 /* - Server is a DS */ #define NFS_CS_REUSEPORT 8 /* - reuse src port on reconnect */ #define NFS_CS_PNFS 9 /* - Server used for pnfs */ +#define NFS_CS_LOCAL_IO 10 /* - client is local */ struct sockaddr_storage cl_addr; /* server identifier */ size_t cl_addrlen; char * cl_hostname; /* hostname of server */ From patchwork Mon Aug 19 18:17:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768784 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CB63018A926; Mon, 19 Aug 2024 18:18:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091492; cv=none; b=i4Juhm6ua9wViyccDrAaBms++CmtF3UW4n1cu30gF/oVm/QL1uH2oHj6yCnc4Cb8bLl5HhLt+gghUDc5PX8+VGy73iMdS4Tr98hwMZ1//MEfgc86T8n6ET/hkEO5mpAwUkMYK2IJgoGRoRdVQzhjaUtqlIt7ae13/z4WxioZ7Ro= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091492; c=relaxed/simple; bh=qmXxHYkAdwf8qjcIR3nGO4xhNDso/NdmI/3A86EOP3E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=krDLFMU8n5Zaw+PMTKlSJ8V0EDcRGY/2iE/wbyGNKfip0XajK0mH2p3NEnyElum44PzKgdJm+6cBd0/TiRnaFT1qmYdaEAmjjA9k4KZDvEXp/VFuqvXgdG5BpV9IshILrf/TomZfWGXJZne5bCqYAlbITI2CwDImxto7kVS/oJ8= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HI3D85ER; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HI3D85ER" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 74A4BC4AF0F; Mon, 19 Aug 2024 18:18:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091492; bh=qmXxHYkAdwf8qjcIR3nGO4xhNDso/NdmI/3A86EOP3E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HI3D85ER8l+Bm+luANxaur27jDlRXGWIMv248iFBY9NTlOenyvkGTqvwvy+e1hcr0 0NUsx8v3DRR7IRf3NEBpgETB8BkM84f1qChs4mT+67IZBbFtQwd16B9Bp+DElkHzn3 8GaeIXs0y3GEQm6ZZSRP/cpGJbQCwqDO+yvwqRtWU+pITzOBkL2LuBesSFGItRqcte W6/U0bDQCB22dPaZf8mKv3iroAttsqdQgIFHLmRA274sXkXRrL8xeo0mYWIQi3y1OF dCSG/SM2yMi/GZS7n4FQTEkwp4PulvBAAmJBVpt42KUafg0O7ZcgNGYZ+Pi1FsIQSu SDjmcHD6HCTIA== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 15/24] nfs: enable localio for non-pNFS IO Date: Mon, 19 Aug 2024 14:17:20 -0400 Message-ID: <20240819181750.70570-16-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Trond Myklebust Try a local open of the file being written to, and if it succeeds, then use localio to issue IO. 
Signed-off-by: Trond Myklebust Signed-off-by: Mike Snitzer --- fs/nfs/pagelist.c | 8 +++++++- fs/nfs/write.c | 6 +++++- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index c4160edd377e..1bd0224f7ee8 100644 --- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -958,6 +958,12 @@ static int nfs_generic_pg_pgios(struct nfs_pageio_descriptor *desc) nfs_pgheader_init(desc, hdr, nfs_pgio_header_free); ret = nfs_generic_pgio(desc, hdr); if (ret == 0) { + struct nfs_client *clp = NFS_SERVER(hdr->inode)->nfs_client; + + struct file *filp = nfs_local_file_open(clp, hdr->cred, + hdr->args.fh, + hdr->args.context); + if (NFS_SERVER(hdr->inode)->nfs_client->cl_minorversion) task_flags = RPC_TASK_MOVEABLE; ret = nfs_initiate_pgio(NFS_CLIENT(hdr->inode), @@ -967,7 +973,7 @@ static int nfs_generic_pg_pgios(struct nfs_pageio_descriptor *desc) desc->pg_rpc_callops, desc->pg_ioflags, RPC_TASK_CRED_NOREF | task_flags, - NULL); + filp); } return ret; } diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 8bc807a3e041..6436db54b2fc 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1795,6 +1795,7 @@ nfs_commit_list(struct inode *inode, struct list_head *head, int how, struct nfs_commit_info *cinfo) { struct nfs_commit_data *data; + struct file *filp; unsigned short task_flags = 0; /* another commit raced with us */ @@ -1811,9 +1812,12 @@ nfs_commit_list(struct inode *inode, struct list_head *head, int how, nfs_init_commit(data, head, NULL, cinfo); if (NFS_SERVER(inode)->nfs_client->cl_minorversion) task_flags = RPC_TASK_MOVEABLE; + + filp = nfs_local_file_open(NFS_SERVER(inode)->nfs_client, data->cred, + data->args.fh, data->context); return nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(inode), data->mds_ops, how, - RPC_TASK_CRED_NOREF | task_flags, NULL); + RPC_TASK_CRED_NOREF | task_flags, filp); } /* From patchwork Mon Aug 19 18:17:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 
7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768785 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 278CE18A926; Mon, 19 Aug 2024 18:18:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091494; cv=none; b=tphCGTTRpnd7yB82FJDFQQAGPknGnHWNkrMpeRRT8uj6k8d6VdG49oHLd10fgDznRPx/rQBH8blW10c5uBeugWxmr4USnMeJhmMDbum0loFxi3B5CAB9HjHmyjaeHPof8yQ65xWY+BMk3RzQgFAkA/dGWIsRAQvWl3bwQ7MLrP4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091494; c=relaxed/simple; bh=6hQCTME1Ngg9DTJKjvhZ4aBk99NpOU/Y3Y2csHZGKJM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Q0AG9++UL4UglsCfyLma3gAaKPFz713a8rbTkWpYmQDGVXmNA3FhJiQStklMsXrZ5Pl3S0xPYzv8LrLrEyMvymzCSODcNnXbfQTrwBwvWgGBLDJmBuWeaUVagxvHABSST6YRacHJh+t/MYsDHTOJ8VZSAuhGoKeHO70K5kC7uZI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=kF2xu+o1; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="kF2xu+o1" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D5FE3C4AF0F; Mon, 19 Aug 2024 18:18:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091494; bh=6hQCTME1Ngg9DTJKjvhZ4aBk99NpOU/Y3Y2csHZGKJM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kF2xu+o1ZAGxNzf9gbQcvg7ydvjK8vNyjyXIwWYQfGYTqljH8wi8m+/y/epmVoHMQ BFU4EHd62IglcWH2pnIlRWPn+Abjv32qAXWeZLLR+IypSpvwbDilRs7NuH+5rWKBXf b7Gj2ZaOYBOsKDCGXZ1tPcMUjf+7XdEM0REsV3EMyHHzF116AyuYPx0fHA3X8EHlsG 
zVsenkdcM9znLL0GInCvDbMv7uaYM5qMtxcraXZHE18TomItjTKuuz1iJ/1jPEELqh J8BD7Le/J36Xi9fsH4kAHSfWV1tQLEIeAWzAkptopo2MW8lDvcLcFRMtMnIznSeUeR vT3vGhHezaiXQ== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 16/24] pnfs/flexfiles: enable localio support Date: Mon, 19 Aug 2024 14:17:21 -0400 Message-ID: <20240819181750.70570-17-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Trond Myklebust If the DS is local to this client use localio to write the data. Signed-off-by: Trond Myklebust Signed-off-by: Mike Snitzer --- fs/nfs/flexfilelayout/flexfilelayout.c | 109 ++++++++++++++++++++-- fs/nfs/flexfilelayout/flexfilelayout.h | 2 + fs/nfs/flexfilelayout/flexfilelayoutdev.c | 6 ++ 3 files changed, 110 insertions(+), 7 deletions(-) diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index 01ee52551a63..206b4b524e43 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -162,6 +163,52 @@ decode_name(struct xdr_stream *xdr, u32 *id) return 0; } +static struct file * +ff_local_open_fh(struct pnfs_layout_segment *lseg, + u32 ds_idx, + struct nfs_client *clp, + const struct cred *cred, + struct nfs_fh *fh, + fmode_t mode) +{ + struct nfs4_ff_layout_mirror *mirror = FF_LAYOUT_COMP(lseg, ds_idx); + struct file *filp, *new, __rcu **pfile; + + if (!nfs_server_is_local(clp)) + return NULL; + if (mode & FMODE_WRITE) { + /* + * Always request read and write access since this corresponds + * to a rw layout. 
+ */ + mode |= FMODE_READ; + pfile = &mirror->rw_file; + } else + pfile = &mirror->ro_file; + + new = NULL; + rcu_read_lock(); + filp = rcu_dereference(*pfile); + if (!filp) { + rcu_read_unlock(); + new = nfs_local_open_fh(clp, cred, fh, mode); + if (IS_ERR(new)) + return NULL; + rcu_read_lock(); + /* try to swap in the pointer */ + filp = cmpxchg(pfile, NULL, new); + if (!filp) { + filp = new; + new = NULL; + } + } + filp = get_file_rcu(&filp); + rcu_read_unlock(); + if (new) + fput(new); + return filp; +} + static bool ff_mirror_match_fh(const struct nfs4_ff_layout_mirror *m1, const struct nfs4_ff_layout_mirror *m2) { @@ -237,8 +284,15 @@ static struct nfs4_ff_layout_mirror *ff_layout_alloc_mirror(gfp_t gfp_flags) static void ff_layout_free_mirror(struct nfs4_ff_layout_mirror *mirror) { + struct file *filp; const struct cred *cred; + filp = rcu_access_pointer(mirror->ro_file); + if (filp) + fput(filp); + filp = rcu_access_pointer(mirror->rw_file); + if (filp) + fput(filp); ff_layout_remove_mirror(mirror); kfree(mirror->fh_versions); cred = rcu_access_pointer(mirror->ro_cred); @@ -414,6 +468,7 @@ ff_layout_alloc_lseg(struct pnfs_layout_hdr *lh, struct nfs4_ff_layout_mirror *mirror; struct cred *kcred; const struct cred __rcu *cred; + const struct cred __rcu *old; kuid_t uid; kgid_t gid; u32 ds_count, fh_count, id; @@ -513,13 +568,26 @@ ff_layout_alloc_lseg(struct pnfs_layout_hdr *lh, mirror = ff_layout_add_mirror(lh, fls->mirror_array[i]); if (mirror != fls->mirror_array[i]) { + struct file *filp; + /* swap cred ptrs so free_mirror will clean up old */ if (lgr->range.iomode == IOMODE_READ) { - cred = xchg(&mirror->ro_cred, cred); - rcu_assign_pointer(fls->mirror_array[i]->ro_cred, cred); + old = xchg(&mirror->ro_cred, cred); + rcu_assign_pointer(fls->mirror_array[i]->ro_cred, old); + /* drop file if creds changed */ + if (old != cred) { + filp = rcu_dereference_protected(xchg(&mirror->ro_file, NULL), 1); + if (filp) + fput(filp); + } } else { - cred = 
xchg(&mirror->rw_cred, cred); - rcu_assign_pointer(fls->mirror_array[i]->rw_cred, cred); + old = xchg(&mirror->rw_cred, cred); + rcu_assign_pointer(fls->mirror_array[i]->rw_cred, old); + if (old != cred) { + filp = rcu_dereference_protected(xchg(&mirror->rw_file, NULL), 1); + if (filp) + fput(filp); + } } ff_layout_free_mirror(fls->mirror_array[i]); fls->mirror_array[i] = mirror; @@ -1756,6 +1824,7 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr) struct pnfs_layout_segment *lseg = hdr->lseg; struct nfs4_pnfs_ds *ds; struct rpc_clnt *ds_clnt; + struct file *filp; struct nfs4_ff_layout_mirror *mirror; const struct cred *ds_cred; loff_t offset = hdr->args.offset; @@ -1802,11 +1871,19 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr) hdr->args.offset = offset; hdr->mds_offset = offset; + /* Start IO accounting for local read */ + filp = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, + FMODE_READ); + if (filp) { + hdr->task.tk_start = ktime_get(); + ff_layout_read_record_layoutstats_start(&hdr->task, hdr); + } + /* Perform an asynchronous read to ds */ nfs_initiate_pgio(ds_clnt, hdr, ds_cred, ds->ds_clp->rpc_ops, vers == 3 ? 
&ff_layout_read_call_ops_v3 : &ff_layout_read_call_ops_v4, - 0, RPC_TASK_SOFTCONN, NULL); + 0, RPC_TASK_SOFTCONN, filp); put_cred(ds_cred); return PNFS_ATTEMPTED; @@ -1826,6 +1903,7 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync) struct pnfs_layout_segment *lseg = hdr->lseg; struct nfs4_pnfs_ds *ds; struct rpc_clnt *ds_clnt; + struct file *filp; struct nfs4_ff_layout_mirror *mirror; const struct cred *ds_cred; loff_t offset = hdr->args.offset; @@ -1870,11 +1948,19 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync) */ hdr->args.offset = offset; + /* Start IO accounting for local write */ + filp = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, + FMODE_READ|FMODE_WRITE); + if (filp) { + hdr->task.tk_start = ktime_get(); + ff_layout_write_record_layoutstats_start(&hdr->task, hdr); + } + /* Perform an asynchronous write */ nfs_initiate_pgio(ds_clnt, hdr, ds_cred, ds->ds_clp->rpc_ops, vers == 3 ? &ff_layout_write_call_ops_v3 : &ff_layout_write_call_ops_v4, - sync, RPC_TASK_SOFTCONN, NULL); + sync, RPC_TASK_SOFTCONN, filp); put_cred(ds_cred); return PNFS_ATTEMPTED; @@ -1908,6 +1994,7 @@ static int ff_layout_initiate_commit(struct nfs_commit_data *data, int how) struct pnfs_layout_segment *lseg = data->lseg; struct nfs4_pnfs_ds *ds; struct rpc_clnt *ds_clnt; + struct file *filp; struct nfs4_ff_layout_mirror *mirror; const struct cred *ds_cred; u32 idx; @@ -1946,10 +2033,18 @@ static int ff_layout_initiate_commit(struct nfs_commit_data *data, int how) if (fh) data->args.fh = fh; + /* Start IO accounting for local commit */ + filp = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, + FMODE_READ|FMODE_WRITE); + if (filp) { + data->task.tk_start = ktime_get(); + ff_layout_commit_record_layoutstats_start(&data->task, data); + } + ret = nfs_initiate_commit(ds_clnt, data, ds->ds_clp->rpc_ops, vers == 3 ? 
&ff_layout_commit_call_ops_v3 : &ff_layout_commit_call_ops_v4, - how, RPC_TASK_SOFTCONN, NULL); + how, RPC_TASK_SOFTCONN, filp); put_cred(ds_cred); return ret; out_err: diff --git a/fs/nfs/flexfilelayout/flexfilelayout.h b/fs/nfs/flexfilelayout/flexfilelayout.h index f84b3fb0dddd..8e042df5a2c9 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.h +++ b/fs/nfs/flexfilelayout/flexfilelayout.h @@ -82,7 +82,9 @@ struct nfs4_ff_layout_mirror { struct nfs_fh *fh_versions; nfs4_stateid stateid; const struct cred __rcu *ro_cred; + struct file __rcu *ro_file; const struct cred __rcu *rw_cred; + struct file __rcu *rw_file; refcount_t ref; spinlock_t lock; unsigned long flags; diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c index e028f5a0ef5f..e58bedfb1dcc 100644 --- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c +++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c @@ -395,6 +395,12 @@ nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg, /* connect success, check rsize/wsize limit */ if (!status) { + /* + * ds_clp is put in destroy_ds(). + * keep ds_clp even if DS is local, so that if local IO cannot + * proceed somehow, we can fall back to NFS whenever we want. 
+ */ + nfs_local_probe(ds->ds_clp); max_payload = nfs_block_size(rpc_max_payload(ds->ds_clp->cl_rpcclient), NULL); From patchwork Mon Aug 19 18:17:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768786 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C4EC318A926; Mon, 19 Aug 2024 18:18:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091495; cv=none; b=KliS2Ex6DXxRtYu/gUuhkwNmKG+1PqvM/wZ3T5RBl92V6h3JWqm7xNXR2Tk7S7KAubw3vRlmd+oWGi5vUxbn/xWi+e7xfYU3nXhylqkYLCuecEl1OKw5uqbpiIZWY4POy0giGwL2ueb00npXYaR/iC7LVphLRUrfAZjTICOKVgQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091495; c=relaxed/simple; bh=SEQjq+EMrLZdi+P7R1Hxi2v99pg71dg3Qkmbh6w8g5M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Txx7wHP4qpXV08NPbG/BL/SkeHJ9fZXcEoR9VPbOl9MuXd371m3NjPZR5k3KEKbA/iNWMmJFPn+KDuZ+++eG7TQzu8Q9NiJ7MKA3JAYJdd3CxxD7Af+tH2HhhISHbkPSy+a10/90ZOhHIVd1d2OKZLUkitk7KqlW1Pz8aElcQIY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JgQe2op2; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JgQe2op2" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 25C69C4AF0C; Mon, 19 Aug 2024 18:18:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091495; bh=SEQjq+EMrLZdi+P7R1Hxi2v99pg71dg3Qkmbh6w8g5M=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=JgQe2op2T6K7BgSdngLnoRPVaphjg/AIoKC191r7oFw6SSRj6quEM5FC2azqjL4Ca nh2XTaPMGkHJFGVPnY7Xv3s/Ua2Eb6b32zCkoDpnDrIWI7v4vxadUjsCeejBpqNMd0 WQwH2tAqc0ijjCPeHZ02FzzRlHgrwkVCgk4UsZI/x7kYJaYUxckL/J1m1oAIuygj7z GA3mZ2WpYHZsBuzVg/fAC54AmB4S0Ebez9xfjIQw+kRmz2pSxllNCu1erw+dh+NzOu dxmPvUEU0RKwnaheR70b+Alzi+nzHtszP/QzuxGmHjV+6ylITcMZoLjs5E0aWMfjM8 CqDgk7WU3AF4A== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 17/24] nfs/localio: use dedicated workqueues for filesystem read and write Date: Mon, 19 Aug 2024 14:17:22 -0400 Message-ID: <20240819181750.70570-18-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Trond Myklebust For localio access, don't call filesystem read() and write() routines directly. This solves two problems: 1) localio writes need to use a normal (non-memreclaim) unbound workqueue. This avoids imposing new requirements on how underlying filesystems process frontend IO, which would cause a large amount of work to update all filesystems. 
Without this change, when XFS starts getting low on space, XFS flushes work on a non-memreclaim work queue, which causes a priority inversion problem: 00573 workqueue: WQ_MEM_RECLAIM writeback:wb_workfn is flushing !WQ_MEM_RECLAIM xfs-sync/vdc:xfs_flush_inodes_worker 00573 WARNING: CPU: 6 PID: 8525 at kernel/workqueue.c:3706 check_flush_dependency+0x2a4/0x328 00573 Modules linked in: 00573 CPU: 6 PID: 8525 Comm: kworker/u71:5 Not tainted 6.10.0-rc3-ktest-00032-g2b0a133403ab #18502 00573 Hardware name: linux,dummy-virt (DT) 00573 Workqueue: writeback wb_workfn (flush-0:33) 00573 pstate: 400010c5 (nZcv daIF -PAN -UAO -TCO -DIT +SSBS BTYPE=--) 00573 pc : check_flush_dependency+0x2a4/0x328 00573 lr : check_flush_dependency+0x2a4/0x328 00573 sp : ffff0000c5f06bb0 00573 x29: ffff0000c5f06bb0 x28: ffff0000c998a908 x27: 1fffe00019331521 00573 x26: ffff0000d0620900 x25: ffff0000c5f06ca0 x24: ffff8000828848c0 00573 x23: 1fffe00018be0d8e x22: ffff0000c1210000 x21: ffff0000c75fde00 00573 x20: ffff800080bfd258 x19: ffff0000cad63400 x18: ffff0000cd3a4810 00573 x17: 0000000000000000 x16: 0000000000000000 x15: ffff800080508d98 00573 x14: 0000000000000000 x13: 204d49414c434552 x12: 1fffe0001b6eeab2 00573 x11: ffff60001b6eeab2 x10: dfff800000000000 x9 : ffff60001b6eeab3 00573 x8 : 0000000000000001 x7 : 00009fffe491154e x6 : ffff0000db775593 00573 x5 : ffff0000db775590 x4 : ffff0000db775590 x3 : 0000000000000000 00573 x2 : 0000000000000027 x1 : ffff600018be0d62 x0 : dfff800000000000 00573 Call trace: 00573 check_flush_dependency+0x2a4/0x328 00573 __flush_work+0x184/0x5c8 00573 flush_work+0x18/0x28 00573 xfs_flush_inodes+0x68/0x88 00573 xfs_file_buffered_write+0x128/0x6f0 00573 xfs_file_write_iter+0x358/0x448 00573 nfs_local_doio+0x854/0x1568 00573 nfs_initiate_pgio+0x214/0x418 00573 nfs_generic_pg_pgios+0x304/0x480 00573 nfs_pageio_doio+0xe8/0x240 00573 nfs_pageio_complete+0x160/0x480 00573 nfs_writepages+0x300/0x4f0 00573 do_writepages+0x12c/0x4a0 00573 
__writeback_single_inode+0xd4/0xa68 00573 writeback_sb_inodes+0x470/0xcb0 00573 __writeback_inodes_wb+0xb0/0x1d0 00573 wb_writeback+0x594/0x808 00573 wb_workfn+0x5e8/0x9e0 00573 process_scheduled_works+0x53c/0xd90 00573 worker_thread+0x370/0x8c8 00573 kthread+0x258/0x2e8 00573 ret_from_fork+0x10/0x20 2) Some filesystem writeback routines can end up taking up a lot of stack space (particularly XFS). Instead of risking a stack overrun due to the extra overhead from the NFS stack, we should just call these routines from a workqueue job. Since we need to do this to address 1) above we're able to avoid possibly blowing the stack "for free". Use of dedicated workqueues improves performance over using the system_unbound_wq. Also, the creds used to open the file are used to override_creds() in both nfs_local_call_read() and nfs_local_call_write() -- otherwise the workqueue could have elevated capabilities (which the caller may not have). Lastly, care is taken to set PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO in nfs_local_call_write() to avoid writeback deadlocks. The PF_LOCAL_THROTTLE flag prevents deadlocks in balance_dirty_pages() by causing writes to only be throttled against other writes to the same bdi (it keeps the throttling local). Normally all writes to bdi(s) are throttled equally (after throughput factors are allowed for). The PF_MEMALLOC_NOIO flag prevents the lower filesystem IO from causing memory reclaim to re-enter filesystems or IO devices and so prevents deadlocks from occurring where IO that cleans pages is waiting on IO to complete.
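The handoff described above (the submitter only queues a work item; the handler later recovers its enclosing iocb with container_of() and performs the IO on the worker's stack) can be sketched in plain userspace C. This is a single-threaded model under stated assumptions, not the kernel code: every sketch_* name is hypothetical and merely stands in for work_struct, INIT_WORK(), queue_work() and struct nfs_local_kiocb, and the stored status is a placeholder for the read_iter() result.

```c
/* Userspace model only: single-threaded stand-ins for the kernel's
 * work_struct, INIT_WORK() and queue_work(). All names are hypothetical. */
#include <assert.h>
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to a member. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_work {
	void (*func)(struct sketch_work *work);
};

/* Stand-in for struct nfs_local_kiocb: the work item is embedded in it. */
struct sketch_iocb {
	long status;
	struct sketch_work work;
};

#define SKETCH_QUEUE_MAX 8
struct sketch_queue {
	struct sketch_work *items[SKETCH_QUEUE_MAX];
	size_t count;
};

/* Submission side: record the work and return; no IO happens here. */
static int sketch_queue_work(struct sketch_queue *q, struct sketch_work *w)
{
	assert(w->func != NULL);
	if (q->count == SKETCH_QUEUE_MAX)
		return -1;
	q->items[q->count++] = w;
	return 0;
}

/* Worker side: runs the queued handlers later, in submission order. */
static void sketch_run_pending(struct sketch_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->items[i]->func(q->items[i]);
	q->count = 0;
}

/* Handler: find the iocb that embeds this work item, then "do the IO". */
static void sketch_call_read(struct sketch_work *work)
{
	struct sketch_iocb *iocb =
		sketch_container_of(work, struct sketch_iocb, work);

	iocb->status = 42;	/* placeholder for the read_iter() result */
}
```

In this model, calling sketch_queue_work() leaves iocb.status untouched, and only a later sketch_run_pending() invokes the handler; that separation is the property the patch relies on to keep filesystem IO off the submitter's stack and workqueue.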
Signed-off-by: Trond Myklebust Co-developed-by: Mike Snitzer Signed-off-by: Mike Snitzer Co-developed-by: NeilBrown Signed-off-by: NeilBrown # eliminated wait_for_completion --- fs/nfs/inode.c | 57 ++++++++++++++++++++++------------ fs/nfs/internal.h | 1 + fs/nfs/localio.c | 79 +++++++++++++++++++++++++++++++++-------------- 3 files changed, 95 insertions(+), 42 deletions(-) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index b4914a11c3c2..542c7d97b235 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -2461,35 +2461,54 @@ static void nfs_destroy_inodecache(void) kmem_cache_destroy(nfs_inode_cachep); } +struct workqueue_struct *nfslocaliod_workqueue; struct workqueue_struct *nfsiod_workqueue; EXPORT_SYMBOL_GPL(nfsiod_workqueue); /* - * start up the nfsiod workqueue - */ -static int nfsiod_start(void) -{ - struct workqueue_struct *wq; - dprintk("RPC: creating workqueue nfsiod\n"); - wq = alloc_workqueue("nfsiod", WQ_MEM_RECLAIM | WQ_UNBOUND, 0); - if (wq == NULL) - return -ENOMEM; - nfsiod_workqueue = wq; - return 0; -} - -/* - * Destroy the nfsiod workqueue + * Destroy the nfsiod workqueues */ static void nfsiod_stop(void) { struct workqueue_struct *wq; wq = nfsiod_workqueue; - if (wq == NULL) - return; - nfsiod_workqueue = NULL; - destroy_workqueue(wq); + if (wq != NULL) { + nfsiod_workqueue = NULL; + destroy_workqueue(wq); + } +#if IS_ENABLED(CONFIG_NFS_LOCALIO) + wq = nfslocaliod_workqueue; + if (wq != NULL) { + nfslocaliod_workqueue = NULL; + destroy_workqueue(wq); + } +#endif /* CONFIG_NFS_LOCALIO */ +} + +/* + * Start the nfsiod workqueues + */ +static int nfsiod_start(void) +{ + dprintk("RPC: creating workqueue nfsiod\n"); + nfsiod_workqueue = alloc_workqueue("nfsiod", WQ_MEM_RECLAIM | WQ_UNBOUND, 0); + if (nfsiod_workqueue == NULL) + return -ENOMEM; +#if IS_ENABLED(CONFIG_NFS_LOCALIO) + /* + * localio writes need to use a normal (non-memreclaim) workqueue. 
+ * When we start getting low on space, XFS goes and calls flush_work() on + * a non-memreclaim work queue, which causes a priority inversion problem. + */ + dprintk("RPC: creating workqueue nfslocaliod\n"); + nfslocaliod_workqueue = alloc_workqueue("nfslocaliod", WQ_UNBOUND, 0); + if (unlikely(nfslocaliod_workqueue == NULL)) { + nfsiod_stop(); + return -ENOMEM; + } +#endif /* CONFIG_NFS_LOCALIO */ + return 0; } unsigned int nfs_net_id; diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index acb9d8bb4076..23f0d180fd19 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -439,6 +439,7 @@ int nfs_check_flags(int); /* inode.c */ extern struct workqueue_struct *nfsiod_workqueue; +extern struct workqueue_struct *nfslocaliod_workqueue; extern struct inode *nfs_alloc_inode(struct super_block *sb); extern void nfs_free_inode(struct inode *); extern int nfs_write_inode(struct inode *, struct writeback_control *); diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index d6ec425bf6f0..2d3118005afc 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -255,30 +255,45 @@ nfs_local_read_done(struct nfs_local_kiocb *iocb, long status) status > 0 ? 
status : 0, hdr->res.eof); } -static int -nfs_do_local_read(struct nfs_pgio_header *hdr, struct file *filp, - const struct rpc_call_ops *call_ops) +static void nfs_local_call_read(struct work_struct *work) { - struct nfs_local_kiocb *iocb; + struct nfs_local_kiocb *iocb = + container_of(work, struct nfs_local_kiocb, work); + struct file *filp = iocb->kiocb.ki_filp; + const struct cred *save_cred; struct iov_iter iter; ssize_t status; + save_cred = override_creds(filp->f_cred); + + nfs_local_iter_init(&iter, iocb, READ); + + status = filp->f_op->read_iter(&iocb->kiocb, &iter); + WARN_ON_ONCE(status == -EIOCBQUEUED); + + nfs_local_read_done(iocb, status); + nfs_local_pgio_release(iocb); + + revert_creds(save_cred); +} + +static int nfs_do_local_read(struct nfs_pgio_header *hdr, struct file *filp, + const struct rpc_call_ops *call_ops) +{ + struct nfs_local_kiocb *iocb; + dprintk("%s: vfs_read count=%u pos=%llu\n", __func__, hdr->args.count, hdr->args.offset); iocb = nfs_local_iocb_alloc(hdr, filp, GFP_KERNEL); if (iocb == NULL) return -ENOMEM; - nfs_local_iter_init(&iter, iocb, READ); nfs_local_pgio_init(hdr, call_ops); hdr->res.eof = false; - status = filp->f_op->read_iter(&iocb->kiocb, &iter); - WARN_ON_ONCE(status == -EIOCBQUEUED); - - nfs_local_read_done(iocb, status); - nfs_local_pgio_release(iocb); + INIT_WORK(&iocb->work, nfs_local_call_read); + queue_work(nfslocaliod_workqueue, &iocb->work); return 0; } @@ -407,14 +422,39 @@ nfs_local_write_done(struct nfs_local_kiocb *iocb, long status) nfs_local_pgio_done(hdr, status); } -static int -nfs_do_local_write(struct nfs_pgio_header *hdr, struct file *filp, - const struct rpc_call_ops *call_ops) +static void nfs_local_call_write(struct work_struct *work) { - struct nfs_local_kiocb *iocb; + struct nfs_local_kiocb *iocb = + container_of(work, struct nfs_local_kiocb, work); + struct file *filp = iocb->kiocb.ki_filp; + unsigned long old_flags = current->flags; + const struct cred *save_cred; struct iov_iter iter; 
ssize_t status; + current->flags |= PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO; + save_cred = override_creds(filp->f_cred); + + nfs_local_iter_init(&iter, iocb, WRITE); + + file_start_write(filp); + status = filp->f_op->write_iter(&iocb->kiocb, &iter); + file_end_write(filp); + WARN_ON_ONCE(status == -EIOCBQUEUED); + + nfs_local_write_done(iocb, status); + nfs_local_vfs_getattr(iocb); + nfs_local_pgio_release(iocb); + + revert_creds(save_cred); + current->flags = old_flags; +} + +static int nfs_do_local_write(struct nfs_pgio_header *hdr, struct file *filp, + const struct rpc_call_ops *call_ops) +{ + struct nfs_local_kiocb *iocb; + dprintk("%s: vfs_write count=%u pos=%llu %s\n", __func__, hdr->args.count, hdr->args.offset, (hdr->args.stable == NFS_UNSTABLE) ? "unstable" : "stable"); @@ -422,7 +462,6 @@ nfs_do_local_write(struct nfs_pgio_header *hdr, struct file *filp, iocb = nfs_local_iocb_alloc(hdr, filp, GFP_NOIO); if (iocb == NULL) return -ENOMEM; - nfs_local_iter_init(&iter, iocb, WRITE); switch (hdr->args.stable) { default: @@ -437,14 +476,8 @@ nfs_do_local_write(struct nfs_pgio_header *hdr, struct file *filp, nfs_set_local_verifier(hdr->inode, hdr->res.verf, hdr->args.stable); - file_start_write(filp); - status = filp->f_op->write_iter(&iocb->kiocb, &iter); - file_end_write(filp); - WARN_ON_ONCE(status == -EIOCBQUEUED); - - nfs_local_write_done(iocb, status); - nfs_local_vfs_getattr(iocb); - nfs_local_pgio_release(iocb); + INIT_WORK(&iocb->work, nfs_local_call_write); + queue_work(nfslocaliod_workqueue, &iocb->work); return 0; } From patchwork Mon Aug 19 18:17:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768787 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 
CB24A1891AA; Mon, 19 Aug 2024 18:18:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091496; cv=none; b=O6W7hOpXyWDxsltNjGLprxsFfx/xKA6Sl1RhRs4MVMSd0cvV/ilrh3z7FjNMq0r/NDArKgD0fx9dHacqJlrWpfSNnYUVlrIGAsMh+9jKLcqgqjxSMEeGJsXNq/zhyhMYV/0XgQCrDfw0qAcq3/aH6gknYyedpDgfPmUdhpGBUHM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091496; c=relaxed/simple; bh=/LG+TpnVRcF6fC5u5Yfvpk8uvSQHIdBNkEEBnydrU2s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HAOK8IUcw0jsQSa4TuRcXL1xaTB+NsrrOiArN5kZcH0lRjhX+pfqQ7oyh8zxnzXpF2TbQvfj720B6NmbVdOiZOL+/SUDkBWuY/i247s9ydi22g2ENAZh8A2KNd1/xjAgAFL37WGSHt22ZyEKbZ19T6ERnHLyrqg+qc1PFQ+ky3Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=RdIuloAj; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="RdIuloAj" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 759A4C4AF0F; Mon, 19 Aug 2024 18:18:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091496; bh=/LG+TpnVRcF6fC5u5Yfvpk8uvSQHIdBNkEEBnydrU2s=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RdIuloAjjom0bkC7BS7RoaTf6sKmgybMTRVjYjRAclAJJoRjV5vOftKM4hVR2Ijcn KtewRMwZZHXutj4puh72q1OtfHaT+atTgR6JbDlihJkUCdvSnsb2cotNpsDx7ZcoRK tvVxo4rhp+GwLMMC/B1myd1LX/5ehS23ZSOY1x5dJefb7UIZE0MQL/52EjigURIgLq 1ja5GN96jk1BIJ/4gmHI6GyPGA37WMKb4n10VUM+E2LFXKDw58lk6Uv7tHyjzYFp3C RZJpM8oHGe0dftxK3Tznp/kAHWEK9kdowuZpaP0uItvhbwbvNMc9pWQIRN8SrZglqG CtrtcAJYH5vsw== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 18/24] nfs: implement client 
support for NFS_LOCALIO_PROGRAM Date: Mon, 19 Aug 2024 14:17:23 -0400 Message-ID: <20240819181750.70570-19-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL" RPC method that allows the Linux NFS client to verify the local Linux NFS server can see the nonce (single-use UUID) the client generated and made available in nfs_common for subsequent lookup and verification by the NFS server. If matched, the NFS server populates members in the nfs_uuid_t struct. The NFS client then transfers these nfs_uuid_t struct member pointers to the nfs_client struct and cleans up the nfs_uuid_t struct. See: fs/nfs/localio.c:nfs_local_probe() This protocol isn't part of an IETF standard, nor does it need to be, considering it is a Linux-to-Linux auxiliary RPC protocol that amounts to an implementation detail. Localio is only supported when UNIX-style authentication (AUTH_UNIX, aka AUTH_SYS) is used (enforced by fs/nfs/localio.c:nfs_local_probe()). The UUID_IS_LOCAL method encodes the client-generated uuid_t in terms of the fixed UUID_SIZE (16 bytes). The fixed-size opaque encode and decode XDR methods are used instead of the less efficient variable-sized methods. Having a nonce (single-use uuid) is better than using the same uuid for the life of the server, and sending it proactively by the client rather than reactively by the server is also safer.
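To make the fixed-size versus variable-sized distinction above concrete, here is a minimal userspace sketch of the two XDR opaque layouts defined by RFC 4506. It is an illustration only and does not use the kernel's xdr_stream API: the sketch_* function names are hypothetical, and XDR_QUADLEN is redefined locally with the usual round-up-to-4-bytes meaning.

```c
/* Userspace sketch only: models XDR opaque encoding (RFC 4506), not the
 * kernel's xdr_stream API. All sketch_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_SIZE 16
/* Number of 4-byte XDR words needed to hold n bytes (rounds up). */
#define XDR_QUADLEN(n) (((n) + 3) >> 2)

/* Fixed-length opaque: the bytes only, zero-padded to a 4-byte boundary. */
static size_t sketch_encode_opaque_fixed(uint8_t *out, const uint8_t *data,
					 size_t len)
{
	size_t padded = (size_t)XDR_QUADLEN(len) << 2;

	memcpy(out, data, len);
	memset(out + len, 0, padded - len);
	return padded;
}

/* Variable-length opaque: 4-byte big-endian length, then the padded bytes. */
static size_t sketch_encode_opaque_var(uint8_t *out, const uint8_t *data,
				       size_t len)
{
	assert(len <= UINT32_MAX);	/* XDR lengths are 32-bit */
	out[0] = (uint8_t)(len >> 24);
	out[1] = (uint8_t)(len >> 16);
	out[2] = (uint8_t)(len >> 8);
	out[3] = (uint8_t)len;
	return 4 + sketch_encode_opaque_fixed(out + 4, data, len);
}
```

For a 16-byte UUID the fixed form occupies XDR_QUADLEN(UUID_SIZE) = 4 words (16 bytes), which matches the .p_arglen in the patch; the variable form spends one additional word on a length that both sides already know.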
[NeilBrown factored out and simplified a single localio protocol and proposed making the uuid short-lived] Signed-off-by: Mike Snitzer Co-developed-by: NeilBrown Signed-off-by: NeilBrown --- fs/nfs/client.c | 6 +- fs/nfs/localio.c | 148 +++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 147 insertions(+), 7 deletions(-) diff --git a/fs/nfs/client.c b/fs/nfs/client.c index bf327ddbdd25..23fe1611cc9f 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -435,8 +435,10 @@ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init) list_add_tail(&new->cl_share_link, &nn->nfs_client_list); spin_unlock(&nn->nfs_client_lock); - nfs_local_probe(new); - return rpc_ops->init_client(new, cl_init); + new = rpc_ops->init_client(new, cl_init); + if (!IS_ERR(new)) + nfs_local_probe(new); + return new; } spin_unlock(&nn->nfs_client_lock); diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 2d3118005afc..f5ecbd9fefb6 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -49,18 +49,77 @@ static void nfs_local_fsync_work(struct work_struct *work); static bool localio_enabled __read_mostly = true; module_param(localio_enabled, bool, 0644); +static inline bool nfs_client_is_local(const struct nfs_client *clp) +{ + return !!test_bit(NFS_CS_LOCAL_IO, &clp->cl_flags); +} + bool nfs_server_is_local(const struct nfs_client *clp) { - return test_bit(NFS_CS_LOCAL_IO, &clp->cl_flags) != 0 && - localio_enabled; + return nfs_client_is_local(clp) && localio_enabled; } EXPORT_SYMBOL_GPL(nfs_server_is_local); +/* + * UUID_IS_LOCAL XDR functions + */ + +static void localio_xdr_enc_uuidargs(struct rpc_rqst *req, + struct xdr_stream *xdr, + const void *data) +{ + const u8 *uuid = data; + + encode_opaque_fixed(xdr, uuid, UUID_SIZE); +} + +static int localio_xdr_dec_uuidres(struct rpc_rqst *req, + struct xdr_stream *xdr, + void *result) +{ + /* void return */ + return 0; +} + +static const struct rpc_procinfo nfs_localio_procedures[] = { + [LOCALIOPROC_UUID_IS_LOCAL] 
= { + .p_proc = LOCALIOPROC_UUID_IS_LOCAL, + .p_encode = localio_xdr_enc_uuidargs, + .p_decode = localio_xdr_dec_uuidres, + .p_arglen = XDR_QUADLEN(UUID_SIZE), + .p_replen = 0, + .p_statidx = LOCALIOPROC_UUID_IS_LOCAL, + .p_name = "UUID_IS_LOCAL", + }, +}; + +static unsigned int nfs_localio_counts[ARRAY_SIZE(nfs_localio_procedures)]; +static const struct rpc_version nfslocalio_version1 = { + .number = 1, + .nrprocs = ARRAY_SIZE(nfs_localio_procedures), + .procs = nfs_localio_procedures, + .counts = nfs_localio_counts, +}; + +static const struct rpc_version *nfslocalio_version[] = { + [1] = &nfslocalio_version1, +}; + +extern const struct rpc_program nfslocalio_program; +static struct rpc_stat nfslocalio_rpcstat = { &nfslocalio_program }; + +const struct rpc_program nfslocalio_program = { + .name = "nfslocalio", + .number = NFS_LOCALIO_PROGRAM, + .nrvers = ARRAY_SIZE(nfslocalio_version), + .version = nfslocalio_version, + .stats = &nfslocalio_rpcstat, +}; + /* * nfs_local_enable - enable local i/o for an nfs_client */ -static __maybe_unused void nfs_local_enable(struct nfs_client *clp, - nfs_uuid_t *nfs_uuid) +static void nfs_local_enable(struct nfs_client *clp, nfs_uuid_t *nfs_uuid) { if (READ_ONCE(clp->nfsd_open_local_fh)) { set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags); @@ -77,6 +136,12 @@ void nfs_local_disable(struct nfs_client *clp) { if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) { trace_nfs_local_disable(clp); + put_nfsd_open_local_fh(); + clp->nfsd_open_local_fh = NULL; + if (!IS_ERR(clp->cl_rpcclient_localio)) { + rpc_shutdown_client(clp->cl_rpcclient_localio); + clp->cl_rpcclient_localio = ERR_PTR(-EINVAL); + } clp->cl_nfssvc_net = NULL; if (clp->cl_nfssvc_dom) { auth_domain_put(clp->cl_nfssvc_dom); @@ -85,11 +150,83 @@ void nfs_local_disable(struct nfs_client *clp) } } +/* + * nfs_init_localioclient - Initialise an NFS localio client connection + */ +static void nfs_init_localioclient(struct nfs_client *clp) +{ + if 
(unlikely(!IS_ERR(clp->cl_rpcclient_localio))) + goto out; + clp->cl_rpcclient_localio = rpc_bind_new_program(clp->cl_rpcclient, + &nfslocalio_program, 1); + if (IS_ERR(clp->cl_rpcclient_localio)) + goto out; + /* No errors! Assume that localio is supported */ + clp->nfsd_open_local_fh = get_nfsd_open_local_fh(); + if (!clp->nfsd_open_local_fh) { + rpc_shutdown_client(clp->cl_rpcclient_localio); + clp->cl_rpcclient_localio = ERR_PTR(-EINVAL); + } +out: + dprintk_rcu("%s: server (%s) %s NFS LOCALIO, nfsd_open_local_fh is %s.\n", + __func__, rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR), + (IS_ERR(clp->cl_rpcclient_localio) ? "does not support" : "supports"), + (clp->nfsd_open_local_fh ? "set" : "not set")); +} + +static bool nfs_server_uuid_is_local(struct nfs_client *clp, nfs_uuid_t *nfs_uuid) +{ + u8 uuid[UUID_SIZE]; + struct rpc_message msg = { + .rpc_argp = &uuid, + }; + int status; + + nfs_init_localioclient(clp); + if (IS_ERR(clp->cl_rpcclient_localio)) + return false; + + export_uuid(uuid, &nfs_uuid->uuid); + + msg.rpc_proc = &nfs_localio_procedures[LOCALIOPROC_UUID_IS_LOCAL]; + status = rpc_call_sync(clp->cl_rpcclient_localio, &msg, 0); + dprintk("%s: NFS reply UUID_IS_LOCAL: status=%d\n", + __func__, status); + if (status) + return false; + + /* Server is only local if it initialized required struct members */ + if (!nfs_uuid->net || !nfs_uuid->dom) + return false; + + return true; +} + /* * nfs_local_probe - probe local i/o support for an nfs_server and nfs_client + * - called after alloc_client and init_client (so cl_rpcclient exists) + * - this function is idempotent, it can be called for old or new clients */ void nfs_local_probe(struct nfs_client *clp) { + nfs_uuid_t nfs_uuid; + + /* Disallow localio if disabled via sysfs or AUTH_SYS isn't used */ + if (!localio_enabled || + clp->cl_rpcclient->cl_auth->au_flavor != RPC_AUTH_UNIX) { + nfs_local_disable(clp); + return; + } + + if (nfs_client_is_local(clp)) { + /* If already enabled, disable and 
re-enable */ + nfs_local_disable(clp); + } + + nfs_uuid_begin(&nfs_uuid); + if (nfs_server_uuid_is_local(clp, &nfs_uuid)) + nfs_local_enable(clp, &nfs_uuid); + nfs_uuid_end(&nfs_uuid); } EXPORT_SYMBOL_GPL(nfs_local_probe); @@ -115,7 +252,8 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, switch (status) { case -ENXIO: case -ENOENT: - nfs_local_disable(clp); + /* Revalidate localio, will disable if unsupported */ + nfs_local_probe(clp); fallthrough; case -ETIMEDOUT: status = -EAGAIN; From patchwork Mon Aug 19 18:17:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768788 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 774D4189531; Mon, 19 Aug 2024 18:18:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091498; cv=none; b=eXFNfnX+oOO8E4A7j+4UYBiL9PhlH26ccoSjkSm4R7Hn66yvyw62ugiePidSAWJ3mqkmlcF/bunc+C/r0ZjFcM/chaeq4c8Ars89wZdDi3ePxyzCdX+XMLjdEGLNMIOfORcsFgncPbdxGz5vSXYZJM1zRILNa3xxtzhQnjCnGHs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091498; c=relaxed/simple; bh=mjyQRMCe58a9B+YYnhVr9bUVZwZykriX2VNmuqTCmfY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lCMTeCbD1wmPGGpn3mxWmDCmO3CUprneXMG+d4lEzTuLT1GUvLz6GLxTUxLTWaaeqaNwUd+S52RxXgUOpRbtMoa8OjTYKOYEnMpceDB0NqYjGT1fbpk7wvASFwI45oeftezycAfYGX4VOoGsc01sygpmkBHZPfHbHKSBg4ouysA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=kIS3y8zY; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="kIS3y8zY" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C8E41C32782; Mon, 19 Aug 2024 18:18:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091498; bh=mjyQRMCe58a9B+YYnhVr9bUVZwZykriX2VNmuqTCmfY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kIS3y8zYf2MBOTUbQc4rRimybHQv6mGgs6aG/+4GVKG9NquB772UYrbjeuRFsZNBL 0DCLQZKuwKr0LbrBSyN+narIClWUhbG3jXshSIW+6uaQXaFspMc95NbXn2lwPWux2B aKMvUXX8YpMDBvb0MnhZjQjZvGz798PXlibk8MrL24+896/r1NX8rlSFUSg5v9QjCz vSxeCTzAbbTLs2CZJnEuYKRX/s0RgVG1OAvQHyslngDqsizlZIILn9XcPUDvyYtpJS bvDNVhnP38im09BC5BDZXseHBnoQdurzIJ2Yc5vQ5rCiZ0R7Dtf6CmauDBKcVOp2eW 2Z3q/52PCLvyw== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 19/24] nfs: add Documentation/filesystems/nfs/localio.rst Date: Mon, 19 Aug 2024 14:17:24 -0400 Message-ID: <20240819181750.70570-20-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This document gives an overview of the LOCALIO auxiliary RPC protocol added to the Linux NFS client and server to allow them to reliably handshake to determine if they are on the same host. Once an NFS client and server handshake as "local", the client will bypass the network RPC protocol for read, write and commit operations. Due to this XDR and RPC bypass, these operations will operate faster. 
Signed-off-by: Mike Snitzer --- Documentation/filesystems/nfs/localio.rst | 178 ++++++++++++++++++++++ include/linux/nfslocalio.h | 2 + 2 files changed, 180 insertions(+) create mode 100644 Documentation/filesystems/nfs/localio.rst diff --git a/Documentation/filesystems/nfs/localio.rst b/Documentation/filesystems/nfs/localio.rst new file mode 100644 index 000000000000..d8bdab88f1db --- /dev/null +++ b/Documentation/filesystems/nfs/localio.rst @@ -0,0 +1,178 @@ +=========== +NFS LOCALIO +=========== + +Overview +======== + +The LOCALIO auxiliary RPC protocol allows the Linux NFS client and +server to reliably handshake to determine if they are on the same host. + +Once an NFS client and server handshake as "local", the client will +bypass the network RPC protocol for read, write and commit operations. +Due to this XDR and RPC bypass, these operations will operate faster. + +The LOCALIO auxiliary protocol's implementation, which uses the same +connection as NFS traffic, follows the pattern established by the NFS +ACL protocol extension. + +The LOCALIO auxiliary protocol is needed to allow robust discovery of +clients local to their servers. In a private implementation that +preceded use of this LOCALIO protocol, a fragile sockaddr network +address based match against all local network interfaces was attempted. +But unlike the LOCALIO protocol, the sockaddr-based matching didn't +handle use of iptables or containers. + +The robust handshake between local client and server is just the +beginning; the ultimate use case this locality makes possible is for the +client to open files and issue reads, writes and commits +directly to the server without having to go over the network. The +requirement is to perform these loopback NFS operations as efficiently +as possible; this is particularly useful for container use cases +(e.g. Kubernetes) where it is possible to run an IO job local to the +server.
+ +The performance advantage realized from LOCALIO's ability to bypass +using XDR and RPC for reads, writes and commits can be extreme, e.g.: +fio for 20 secs with 24 libaio threads, 128k directio reads, qd of 8: +- With LOCALIO: + read: IOPS=311k, BW=38.0GiB/s (40.8GB/s)(760GiB/20001msec) +- Without LOCALIO: + read: IOPS=12.0k, BW=1495MiB/s (1568MB/s)(29.2GiB/20015msec) + +RPC +=== + +The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL" +RPC method that allows the Linux NFS client to verify the local Linux +NFS server can see the nonce (single-use UUID) the client generated and +made available in nfs_common. This protocol isn't part of an IETF +standard, nor does it need to be, considering it is a Linux-to-Linux +auxiliary RPC protocol that amounts to an implementation detail. + +The UUID_IS_LOCAL method encodes the client-generated uuid_t in terms of +the fixed UUID_SIZE (16 bytes). The fixed-size opaque encode and decode +XDR methods are used instead of the less efficient variable-sized +methods. + +The RPC program number for the NFS_LOCALIO_PROGRAM is 400122 (as assigned +by IANA, see https://www.iana.org/assignments/rpc-program-numbers/ ): +Linux Kernel Organization 400122 nfslocalio + +The LOCALIO protocol spec in rpcgen syntax is: + +/* raw RFC 9562 UUID */ +#define UUID_SIZE 16 +typedef u8 uuid_t; + +program NFS_LOCALIO_PROGRAM { + version LOCALIO_V1 { + void + NULL(void) = 0; + + void + UUID_IS_LOCAL(uuid_t) = 1; + } = 1; +} = 400122; + +LOCALIO uses the same transport connection as NFS traffic. As such, +LOCALIO is not registered with rpcbind.
+ +NFS Common and Client/Server Handshake +====================================== + +fs/nfs_common/nfslocalio.c provides interfaces that enable an NFS client +to generate a nonce (single-use UUID) and associated short-lived +nfs_uuid_t struct, and register it with nfs_common for subsequent lookup +and verification by the NFS server. On a match, the NFS server populates +members in the nfs_uuid_t struct. The NFS client then transfers these +nfs_uuid_t struct member pointers to the nfs_client struct and cleans up +the nfs_uuid_t struct. See: fs/nfs/localio.c:nfs_local_probe() + +nfs_common's nfs_uuids list is the basis for LOCALIO enablement. As such, +it has members that point to nfsd memory for direct use by the client +(e.g. 'net' is the server's network namespace; through it the client can +access nn->nfsd_serv with proper RCU read access). It is this client +and server synchronization that enables advanced usage and lifetime of +objects to span from the host kernel's nfsd to per-container knfsd +instances that are connected to NFS clients running on the same local +host. + +NFS Client issues IO instead of Server +====================================== + +Because LOCALIO is focused on protocol bypass to achieve improved IO +performance, alternatives to the traditional NFS wire protocol (SUNRPC +with XDR) to access the backing filesystem must be provided. + +See fs/nfs/localio.c:nfs_local_open_fh() and +fs/nfsd/localio.c:nfsd_open_local_fh() for the interface that makes +focused use of select nfs server objects to allow a client local to a +server to open a file pointer without needing to go over the network. + +The client's fs/nfs/localio.c:nfs_local_open_fh() will call into the +server's fs/nfsd/localio.c:nfsd_open_local_fh() and carefully access +both the nfsd network namespace and the associated nn->nfsd_serv in +terms of RCU.
If nfsd_open_local_fh() finds that the client no longer sees +valid nfsd objects (be it struct net or nn->nfsd_serv), it returns ENXIO +to nfs_local_open_fh() and the client will try to reestablish the +LOCALIO resources needed by calling nfs_local_probe() again. This +recovery is needed if an nfsd instance running in a container reboots +while a LOCALIO client is connected to it. + +Once the client has an open file pointer it will issue reads, writes and +commits directly to the underlying local filesystem (normally done by +the NFS server). As such, for these operations, the NFS client is +issuing IO to the underlying local filesystem that it is sharing with +the NFS server. See: fs/nfs/localio.c:nfs_local_doio() and +fs/nfs/localio.c:nfs_local_commit(). + +Security +======== + +LOCALIO is only supported when UNIX-style authentication (AUTH_UNIX, aka +AUTH_SYS) is used. + +Care is taken to ensure the same NFS security mechanisms are used +(authentication, etc.) regardless of whether LOCALIO or regular NFS +access is used. The auth_domain established as part of the traditional +NFS client access to the NFS server is also used for LOCALIO. + +Relative to containers, LOCALIO gives the client access to the network +namespace the server has. This is required to allow the client to access +the server's per-namespace nfsd_net struct. With traditional NFS, the +client is afforded this same level of access (albeit in terms of the NFS +protocol via SUNRPC). No other namespaces (user, mount, etc.) have been +altered or purposely extended from the server to the client. + +Testing +======= + +The LOCALIO auxiliary protocol and associated NFS LOCALIO read, write +and commit access have proven stable against various test scenarios: + +- Client and server both on the same host. + +- All permutations of client and server support enablement for both + local and remote client and server.
+ +- Testing against NFS storage products that don't support the LOCALIO + protocol was also performed. + +- Client on host, server within a container (for both v3 and v4.2). + The container testing was in terms of podman-managed containers and + included a successful container stop/restart scenario. + +- Formalizing these test scenarios in terms of existing test + infrastructure is ongoing. Initial regular coverage is provided in + terms of ktest running xfstests against a LOCALIO-enabled NFS loopback + mount configuration, and includes lockdep and KASAN coverage, see: + https://evilpiepirate.org/~testdashboard/ci?user=snitzer&branch=snitm-nfs-next + https://github.com/koverstreet/ktest + +- Various kdevops testing (in terms of "Chuck's BuildBot") has been + performed to regularly verify the LOCALIO changes haven't caused any + regressions to non-LOCALIO NFS use cases. + +- All of Hammerspace's various sanity tests pass with LOCALIO enabled + (this includes numerous pNFS and flexfiles tests). diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h index 109cb8534e3f..c6edcb7c0dcd 100644 --- a/include/linux/nfslocalio.h +++ b/include/linux/nfslocalio.h @@ -15,6 +15,8 @@ /* * Useful to allow a client to negotiate if localio * possible with its server. + * + * See Documentation/filesystems/nfs/localio.rst for more detail.
*/ typedef struct { uuid_t uuid; From patchwork Mon Aug 19 18:17:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768789 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7695E18A926; Mon, 19 Aug 2024 18:18:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091499; cv=none; b=U3DTec2rVNywJ1+P672a8bw2IxxvfNaeW9w+4fC5s33e7UyX8NYY0U4cPV6h6EanT/XBgJOeOxurFg7gNcYwKYamTXwzgQd7I2hXs0qNLvWFB84zgLjTMv5LmxpcmBkoweAnULcxy7PiEp6aJDKySctWpCSvf7oIcbiimNur/Cw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091499; c=relaxed/simple; bh=4Id9vBjAEGLNLirVxTLvWyzW2NSnZzaTE5XtB31TylM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KRk2Tn2O6R9cp2jYl0pbCJ+X0/WXfvCH6cG6XbgKmUoyysmLusVx83xCyEBDFBy5nxp7c57VycA/9eVWZtuTIfJ5mPgfXjGJ2yMQkiJjklSyKEFISTWaxOkr1iWnfcVAF5zyQbGfxkiq4bJ/ZuYqjlP3GYORQzjINvgg5COEcXI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=C5gdrmNC; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="C5gdrmNC" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 23307C4AF0C; Mon, 19 Aug 2024 18:18:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091499; bh=4Id9vBjAEGLNLirVxTLvWyzW2NSnZzaTE5XtB31TylM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=C5gdrmNCvQV8weoBr6BEb2zUTY1Q7FFIH7XtTwbW1ZGQGXnjYZfv+zaHYbBb0hLYw 
foCKKS92qTIfNcvlYTDvoVz9J+HVG00GM0TP/JDSgT8WtVCPgHqMsR2hDGar2q4wrP 0ax1LQ37saH/VKB8fPpBsmE+DZac+1eo6o9Kw4LdM6+kGwkogAeqycVLMf2lRGFvvA lVWGypkUaH8EaAEPutgKPAOA5IDu114dDwvbgD8SCy5r0SZz4C5Zz6Q0w/Oa03feON eDYD96QUbEKFNPz0xCCQGVqJNJ75OwgVv48O+zwyPfT6lTjOsbUON3817e/rN33+ZN be+xNkVzej/Ww== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 20/24] nfsd: use GC for nfsd_file returned by nfsd_file_acquire_local Date: Mon, 19 Aug 2024 14:17:25 -0400 Message-ID: <20240819181750.70570-21-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Offers performance improvements if/when a file is reopened before launderette cleans it from the filecache's LRU. Suggested-by: Jeff Layton Signed-off-by: Mike Snitzer --- fs/nfsd/filecache.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 56be99a3667a..447faa194166 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -1197,9 +1197,10 @@ nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, * a file. The security implications of this should be carefully * considered before use. * - * The nfsd_file_object returned by this API is reference-counted - * but not garbage-collected. The object is unhashed after the - * final nfsd_file_put(). + * The nfsd_file object returned by this API is reference-counted + * and garbage-collected. The object is retained for a few + * seconds after the final nfsd_file_put() in case the caller + * wants to re-use it. 
* * Return values: * %nfs_ok - @pnf points to an nfsd_file with its reference @@ -1214,7 +1215,7 @@ nfsd_file_acquire_local(struct net *net, struct svc_cred *cred, unsigned int may_flags, struct nfsd_file **pnf) { return nfsd_file_do_acquire(NULL, net, cred, nfs_vers, client, - fhp, may_flags, NULL, pnf, false); + fhp, may_flags, NULL, pnf, true); } /** From patchwork Mon Aug 19 18:17:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768790 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C35AC189B99; Mon, 19 Aug 2024 18:18:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091500; cv=none; b=aLL2NZLsT5x3ieVp1zP33fokfqx6SnxSobCsomIjVkwZxfQuZ6MSLxhk/tnN7MZAHeGNwzucpEEPHJaayuwj5xhGI6zWhpSzDxOlInHPpS0eH+44C9bxDrDh2GMumsfHG/h0Mk0ma2xjwsAZsU1xhMj/jHZYS7lqTi9EvFgXiCw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091500; c=relaxed/simple; bh=aFPdln082tJMXxadR13hHMY+SiAhF/nnmGGNZcmP2h4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Zn+t7r/zaBaZoWsguIxyJBY3GGxKcW5xgFrpJGFIId41+p0AqdHYOul+MRy5olO3tylqLIqnez6YbklY+2QYb5BbygXYJ1t01sQv54wUtV5DtAv4oNTWzIxAZtd7Oi8Z22nUn440PZ8WTCQQY0qeRyvC8TkRq7hjyLic9gftyE4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=GIWQP9MW; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="GIWQP9MW" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 
70689C4AF0F; Mon, 19 Aug 2024 18:18:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091500; bh=aFPdln082tJMXxadR13hHMY+SiAhF/nnmGGNZcmP2h4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GIWQP9MWtiwxX1RCJT0EpAXPjMlI35BFEvCdQ72TyMlDtmDqLBV/WPSmaanq8wTTY hmYNBr646esWWDRNv9E8ujqjMbFVP1Z9tcQpk6rWvf+dvVgTz5Kga6CmqUQ7mmpbj3 Fud0VCW/2R2S254UcADTsRoUNxLZL9cqYUTpFYgUBzySn/6U+HsDATW6KtvoxA96/8 KirkB8GGqNawHsZL7Q2ss/LLnSApsj6qIPmpNWXcnKXyVARDgDYWk5qop/SYEvhLCo ea2GKpeh6DkV1kFRnQln3dihK7XMgb+g28YsfagSEth4ZAO0Zt2Ij7JCtezeOjrf+M uRaF2j6BMhLYg== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 21/24] nfs_common: expose localio's required nfsd symbols to nfs client Date: Mon, 19 Aug 2024 14:17:26 -0400 Message-ID: <20240819181750.70570-22-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Mike Snitzer Switch the caching of nfsd_open_local_fh symbol from being within each nfs_client struct to being globally maintained within nfs_common (accessible via the global 'nfs_to' nfs_to_nfsd_t struct). Introduce nfsd_file_file() wrapper that provides access to nfsd_file's backing file. Keeps nfsd_file structure opaque to NFS client (as suggested by Jeff Layton). The addition of nfsd_file_get, nfsd_file_put and nfsd_file_file symbols prepares for switching the NFS client to using nfsd_file for localio. Despite the use of indirect function calls, caching these nfsd symbols for use by the client offers a ~10% performance win (compared to always doing get+call+put) for high IOPS workloads. 
Suggested-by: Jeff Layton # nfsd_file_file Signed-off-by: Mike Snitzer --- fs/nfs/client.c | 1 - fs/nfs/localio.c | 17 +++--- fs/nfs_common/nfslocalio.c | 117 +++++++++++++++++++++++++++++++++---- fs/nfsd/filecache.c | 25 ++++++++ fs/nfsd/filecache.h | 1 + fs/nfsd/localio.c | 2 +- include/linux/nfs_fs_sb.h | 1 - include/linux/nfslocalio.h | 27 +++++++-- 8 files changed, 163 insertions(+), 28 deletions(-) diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 23fe1611cc9f..fe60a82f06d8 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -182,7 +182,6 @@ struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_init) seqlock_init(&clp->cl_boot_lock); ktime_get_real_ts64(&clp->cl_nfssvc_boot); clp->cl_rpcclient_localio = ERR_PTR(-EINVAL); - clp->nfsd_open_local_fh = NULL; clp->cl_nfssvc_net = NULL; clp->cl_nfssvc_dom = NULL; #endif /* CONFIG_NFS_LOCALIO */ diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index f5ecbd9fefb6..38303427e0b3 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -121,7 +121,7 @@ const struct rpc_program nfslocalio_program = { */ static void nfs_local_enable(struct nfs_client *clp, nfs_uuid_t *nfs_uuid) { - if (READ_ONCE(clp->nfsd_open_local_fh)) { + if (!IS_ERR(clp->cl_rpcclient_localio)) { set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags); clp->cl_nfssvc_net = nfs_uuid->net; clp->cl_nfssvc_dom = nfs_uuid->dom; @@ -136,8 +136,7 @@ void nfs_local_disable(struct nfs_client *clp) { if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) { trace_nfs_local_disable(clp); - put_nfsd_open_local_fh(); - clp->nfsd_open_local_fh = NULL; + put_nfs_to_nfsd_symbols(); if (!IS_ERR(clp->cl_rpcclient_localio)) { rpc_shutdown_client(clp->cl_rpcclient_localio); clp->cl_rpcclient_localio = ERR_PTR(-EINVAL); @@ -162,16 +161,14 @@ static void nfs_init_localioclient(struct nfs_client *clp) if (IS_ERR(clp->cl_rpcclient_localio)) goto out; /* No errors! 
Assume that localio is supported */ - clp->nfsd_open_local_fh = get_nfsd_open_local_fh(); - if (!clp->nfsd_open_local_fh) { + if (!get_nfs_to_nfsd_symbols()) { rpc_shutdown_client(clp->cl_rpcclient_localio); clp->cl_rpcclient_localio = ERR_PTR(-EINVAL); } out: - dprintk_rcu("%s: server (%s) %s NFS LOCALIO, nfsd_open_local_fh is %s.\n", + dprintk_rcu("%s: server (%s) %s NFS LOCALIO.\n", __func__, rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR), - (IS_ERR(clp->cl_rpcclient_localio) ? "does not support" : "supports"), - (clp->nfsd_open_local_fh ? "set" : "not set")); + (IS_ERR(clp->cl_rpcclient_localio) ? "does not support" : "supports")); } static bool nfs_server_uuid_is_local(struct nfs_client *clp, nfs_uuid_t *nfs_uuid) @@ -245,8 +242,8 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, if (mode & ~(FMODE_READ | FMODE_WRITE)) return ERR_PTR(-EINVAL); - status = clp->nfsd_open_local_fh(clp->cl_nfssvc_net, clp->cl_nfssvc_dom, - clp->cl_rpcclient, cred, fh, mode, &filp); + status = nfs_to.nfsd_open_local_fh(clp->cl_nfssvc_net, clp->cl_nfssvc_dom, + clp->cl_rpcclient, cred, fh, mode, &filp); if (status < 0) { trace_nfs_local_open_fh(fh, mode, status); switch (status) { diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c index a20ff7607707..087649911b52 100644 --- a/fs/nfs_common/nfslocalio.c +++ b/fs/nfs_common/nfslocalio.c @@ -71,27 +71,124 @@ bool nfs_uuid_is_local(const uuid_t *uuid, struct net *net, struct auth_domain * EXPORT_SYMBOL_GPL(nfs_uuid_is_local); /* - * The nfs localio code needs to call into nfsd to do the filehandle -> struct path - * mapping, but cannot be statically linked, because that will make the nfs module + * The nfs localio code needs to call into nfsd using various symbols (below), + * but cannot be statically linked, because that will make the nfs module * depend on the nfsd module. * * Instead, do dynamic linking to the nfsd module (via nfs_common module). 
The * nfs_common module will only hold a reference on nfsd when localio is in use. * This allows some sanity checking, like giving up on localio if nfsd isn't loaded. */ +DEFINE_MUTEX(nfs_to_nfsd_mutex); +nfs_to_nfsd_t nfs_to; +EXPORT_SYMBOL_GPL(nfs_to); +/* Macro to define nfs_to get and put methods, avoids copy-n-paste bugs */ +#define DEFINE_NFS_TO_NFSD_SYMBOL(NFSD_SYMBOL) \ +static nfs_to_##NFSD_SYMBOL##_t get_##NFSD_SYMBOL(void) \ +{ \ + return symbol_request(NFSD_SYMBOL); \ +} \ +static void put_##NFSD_SYMBOL(void) \ +{ \ + symbol_put(NFSD_SYMBOL); \ + nfs_to.NFSD_SYMBOL = NULL; \ +} + +/* The nfs localio code needs to call into nfsd to map filehandle -> struct nfsd_file */ extern int nfsd_open_local_fh(struct net *, struct auth_domain *, struct rpc_clnt *, - const struct cred *, const struct nfs_fh *, - const fmode_t, struct file **); + const struct cred *, const struct nfs_fh *, + const fmode_t, struct file **); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_open_local_fh); -nfs_to_nfsd_open_t get_nfsd_open_local_fh(void) +/* The nfs localio code needs to call into nfsd to acquire the nfsd_file */ +extern struct nfsd_file *nfsd_file_get(struct nfsd_file *nf); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_file_get); + +/* The nfs localio code needs to call into nfsd to release the nfsd_file */ +extern void nfsd_file_put(struct nfsd_file *nf); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_file_put); + +/* The nfs localio code needs to call into nfsd to access the nf->nf_file */ +extern struct file * nfsd_file_file(struct nfsd_file *nf); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_file_file); +#undef DEFINE_NFS_TO_NFSD_SYMBOL + +bool get_nfs_to_nfsd_symbols(void) { - return symbol_request(nfsd_open_local_fh); + mutex_lock(&nfs_to_nfsd_mutex); + + /* Only get symbols on first reference */ + if (refcount_read(&nfs_to.ref) == 0) + refcount_set(&nfs_to.ref, 1); + else { + refcount_inc(&nfs_to.ref); + mutex_unlock(&nfs_to_nfsd_mutex); + return true; + } + + nfs_to.nfsd_open_local_fh = get_nfsd_open_local_fh(); + if 
(!nfs_to.nfsd_open_local_fh) + goto out_nfsd_open_local_fh; + + nfs_to.nfsd_file_get = get_nfsd_file_get(); + if (!nfs_to.nfsd_file_get) + goto out_nfsd_file_get; + + nfs_to.nfsd_file_put = get_nfsd_file_put(); + if (!nfs_to.nfsd_file_put) + goto out_nfsd_file_put; + + nfs_to.nfsd_file_file = get_nfsd_file_file(); + if (!nfs_to.nfsd_file_file) + goto out_nfsd_file_file; + + mutex_unlock(&nfs_to_nfsd_mutex); + return true; + +out_nfsd_file_file: + put_nfsd_file_put(); +out_nfsd_file_put: + put_nfsd_file_get(); +out_nfsd_file_get: + put_nfsd_open_local_fh(); +out_nfsd_open_local_fh: + mutex_unlock(&nfs_to_nfsd_mutex); + return false; } -EXPORT_SYMBOL_GPL(get_nfsd_open_local_fh); +EXPORT_SYMBOL_GPL(get_nfs_to_nfsd_symbols); -void put_nfsd_open_local_fh(void) +void put_nfs_to_nfsd_symbols(void) { - symbol_put(nfsd_open_local_fh); + mutex_lock(&nfs_to_nfsd_mutex); + + if (!refcount_dec_and_test(&nfs_to.ref)) + goto out; + + put_nfsd_open_local_fh(); + put_nfsd_file_get(); + put_nfsd_file_put(); + put_nfsd_file_file(); +out: + mutex_unlock(&nfs_to_nfsd_mutex); +} +EXPORT_SYMBOL_GPL(put_nfs_to_nfsd_symbols); + +static int __init nfslocalio_init(void) +{ + refcount_set(&nfs_to.ref, 0); + + nfs_to.nfsd_open_local_fh = NULL; + nfs_to.nfsd_file_get = NULL; + nfs_to.nfsd_file_put = NULL; + nfs_to.nfsd_file_file = NULL; + + return 0; } -EXPORT_SYMBOL_GPL(put_nfsd_open_local_fh); + +static void __exit nfslocalio_exit(void) +{ +} + +module_init(nfslocalio_init); +module_exit(nfslocalio_exit); diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 447faa194166..d7c6122231f4 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -39,6 +39,7 @@ #include #include #include +#include #include "vfs.h" #include "nfsd.h" @@ -345,6 +346,10 @@ nfsd_file_get(struct nfsd_file *nf) return nf; return NULL; } +EXPORT_SYMBOL_GPL(nfsd_file_get); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_file_get_t __maybe_unused nfsd_file_get_typecheck = 
nfsd_file_get; /** * nfsd_file_put - put the reference to a nfsd_file @@ -389,6 +394,26 @@ nfsd_file_put(struct nfsd_file *nf) if (refcount_dec_and_test(&nf->nf_ref)) nfsd_file_free(nf); } +EXPORT_SYMBOL_GPL(nfsd_file_put); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_file_put_t __maybe_unused nfsd_file_put_typecheck = nfsd_file_put; + +/** + * nfsd_file_file - get the backing file of an nfsd_file + * @nf: nfsd_file of which to access the backing file. + * + * Return backing file for @nf. + */ +struct file * +nfsd_file_file(struct nfsd_file *nf) +{ + return nf->nf_file; +} +EXPORT_SYMBOL_GPL(nfsd_file_file); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_file_file_t __maybe_unused nfsd_file_file_typecheck = nfsd_file_file; static void nfsd_file_dispose_list(struct list_head *dispose) diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h index 6dab41f8541e..ab8a4423edd9 100644 --- a/fs/nfsd/filecache.h +++ b/fs/nfsd/filecache.h @@ -56,6 +56,7 @@ int nfsd_file_cache_start_net(struct net *net); void nfsd_file_cache_shutdown_net(struct net *net); void nfsd_file_put(struct nfsd_file *nf); struct nfsd_file *nfsd_file_get(struct nfsd_file *nf); +struct file *nfsd_file_file(struct nfsd_file *nf); void nfsd_file_close_inode_sync(struct inode *inode); void nfsd_file_net_dispose(struct nfsd_net *nn); bool nfsd_file_is_cached(struct inode *inode); diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c index 9cdea1d1c28a..008b935a3a6c 100644 --- a/fs/nfsd/localio.c +++ b/fs/nfsd/localio.c @@ -111,7 +111,7 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net, EXPORT_SYMBOL_GPL(nfsd_open_local_fh); /* Compile time type checking, not used by anything */ -static nfs_to_nfsd_open_t __maybe_unused nfsd_open_local_fh_typecheck = nfsd_open_local_fh; +static nfs_to_nfsd_open_local_fh_t __maybe_unused nfsd_open_local_fh_typecheck = nfsd_open_local_fh; /* * UUID_IS_LOCAL XDR functions diff --git a/include/linux/nfs_fs_sb.h 
b/include/linux/nfs_fs_sb.h index 5edc57657985..10453c6f8ca8 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -134,7 +134,6 @@ struct nfs_client { struct rpc_clnt * cl_rpcclient_localio; struct net * cl_nfssvc_net; struct auth_domain * cl_nfssvc_dom; - nfs_to_nfsd_open_t nfsd_open_local_fh; #endif /* CONFIG_NFS_LOCALIO */ }; diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h index c6edcb7c0dcd..6302d36f9112 100644 --- a/include/linux/nfslocalio.h +++ b/include/linux/nfslocalio.h @@ -7,6 +7,7 @@ #include #include +#include #include #include #include @@ -29,11 +30,27 @@ void nfs_uuid_begin(nfs_uuid_t *); void nfs_uuid_end(nfs_uuid_t *); bool nfs_uuid_is_local(const uuid_t *, struct net *, struct auth_domain *); -typedef int (*nfs_to_nfsd_open_t)(struct net *, struct auth_domain *, struct rpc_clnt *, - const struct cred *, const struct nfs_fh *, - const fmode_t, struct file **); +struct nfsd_file; -nfs_to_nfsd_open_t get_nfsd_open_local_fh(void); -void put_nfsd_open_local_fh(void); +typedef int (*nfs_to_nfsd_open_local_fh_t)(struct net *, struct auth_domain *, + struct rpc_clnt *, const struct cred *, + const struct nfs_fh *, const fmode_t, + struct file **); +typedef struct nfsd_file * (*nfs_to_nfsd_file_get_t)(struct nfsd_file *); +typedef void (*nfs_to_nfsd_file_put_t)(struct nfsd_file *); +typedef struct file * (*nfs_to_nfsd_file_file_t)(struct nfsd_file *); + +typedef struct { + refcount_t ref; + nfs_to_nfsd_open_local_fh_t nfsd_open_local_fh; + nfs_to_nfsd_file_get_t nfsd_file_get; + nfs_to_nfsd_file_put_t nfsd_file_put; + nfs_to_nfsd_file_file_t nfsd_file_file; +} nfs_to_nfsd_t; + +extern nfs_to_nfsd_t nfs_to; + +bool get_nfs_to_nfsd_symbols(void); +void put_nfs_to_nfsd_symbols(void); #endif /* __LINUX_NFSLOCALIO_H */ From patchwork Mon Aug 19 18:17:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768791 Received: 
from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3958818B46F; Mon, 19 Aug 2024 18:18:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091502; cv=none; b=BsXdEyORxsw5Zwxq26LjdhxjS1O0Csm83SyqGNR2zRcAF1u2uNtaF+PuZLwjAm+aBDG//FCaNy5/WiidICbPK4LiE5gNi4kzqea9K8iN3KQyA1YoJzvh2fi6eUoKG1WMVwkJbpdLzq2o/bnK77dGfiOvHUcKzGjnNSTYNRkKS5o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091502; c=relaxed/simple; bh=ftw0vMv3HaFnOmejPIHrAE9siQzKsk6CGD9W9QIRQwE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pOLdcJR6cnCViUPbnuJcqvj1G3VK1IF/PAzsmgMUaKDmLT+3YsfIWZdCjPuJECG3BGpEPNv+aBNMWe0rIUAlMxp0zwJHi6HhKuFktbGPkaQNlJpUrTLLtlFTPzKNL6rl8MSE/fN5/qLk9jaNqhknEYQshAS1Sf/FCJYZrxnhH+c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tIZNykQf; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tIZNykQf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D4E29C32782; Mon, 19 Aug 2024 18:18:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091502; bh=ftw0vMv3HaFnOmejPIHrAE9siQzKsk6CGD9W9QIRQwE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tIZNykQfiFdhu17zjqpGC4nycz+WMgZMI9N497qTfQxvlpYUTtrha+qa00s+vOv51 NeXyzh89cKrmVfCEg2Es0roQW8LGNaJMTdpVv+AAhuVgxkGeNJm4fMC9DufCxuM0vp 9iN4D++rkYQFs+9qq+htN6BgRtQEJf8fyfNpMuekgPxCgEQwqBfFfeUPFBl7RzB3eP w1/2qWE5IpoyDzqZbQbUSEwZZTD6K7kA51wlqAFX4w8TypKFTuFy9XpUxf7DETVFXz 
dqddJJNp3WPY+m4bx8/FrQS6fiA4cU7fZKqzaBHk6gWWQp4o7Oij5/8MKs5cPFFD4v Vju+pupcy2q2Q== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 22/24] nfs: push localio nfsd_file_put call out to client Date: Mon, 19 Aug 2024 14:17:27 -0400 Message-ID: <20240819181750.70570-23-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Change nfsd_open_local_fh() to return an nfsd_file, instead of file, and move associated reference counting to the nfs client's nfs_local_open_fh(). This is the first step in allowing proper use of nfsd_file for localio (such that the benefits of nfsd's filecache can be realized). Signed-off-by: Mike Snitzer --- fs/nfs/localio.c | 7 +++++-- fs/nfs_common/nfslocalio.c | 2 +- fs/nfsd/localio.c | 11 ++++------- fs/nfsd/vfs.h | 2 +- include/linux/nfslocalio.h | 2 +- 5 files changed, 12 insertions(+), 12 deletions(-) diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 38303427e0b3..7d63d7e34643 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -237,13 +237,14 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, struct nfs_fh *fh, const fmode_t mode) { struct file *filp; + struct nfsd_file *nf; int status; if (mode & ~(FMODE_READ | FMODE_WRITE)) return ERR_PTR(-EINVAL); status = nfs_to.nfsd_open_local_fh(clp->cl_nfssvc_net, clp->cl_nfssvc_dom, - clp->cl_rpcclient, cred, fh, mode, &filp); + clp->cl_rpcclient, cred, fh, mode, &nf); if (status < 0) { trace_nfs_local_open_fh(fh, mode, status); switch (status) { @@ -255,8 +256,10 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, case -ETIMEDOUT: status = -EAGAIN; } - filp = ERR_PTR(status); + return 
ERR_PTR(status); } + filp = get_file(nfs_to.nfsd_file_file(nf)); + nfs_to.nfsd_file_put(nf); return filp; } EXPORT_SYMBOL_GPL(nfs_local_open_fh); diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c index 087649911b52..f59167e596d3 100644 --- a/fs/nfs_common/nfslocalio.c +++ b/fs/nfs_common/nfslocalio.c @@ -98,7 +98,7 @@ static void put_##NFSD_SYMBOL(void) \ /* The nfs localio code needs to call into nfsd to map filehandle -> struct nfsd_file */ extern int nfsd_open_local_fh(struct net *, struct auth_domain *, struct rpc_clnt *, const struct cred *, const struct nfs_fh *, - const fmode_t, struct file **); + const fmode_t, struct nfsd_file **); DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_open_local_fh); /* The nfs localio code needs to call into nfsd to acquire the nfsd_file */ diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c index 008b935a3a6c..2ceab49f3cb6 100644 --- a/fs/nfsd/localio.c +++ b/fs/nfsd/localio.c @@ -32,13 +32,13 @@ * @cred: cred that the client established * @nfs_fh: filehandle to lookup * @fmode: fmode_t to use for open - * @pfilp: returned file pointer that maps to @nfs_fh + * @pnf: returned nfsd_file pointer that maps to @nfs_fh * * This function maps a local fh to a path on a local filesystem. * This is useful when the nfs client has the local server mounted - it can * avoid all the NFS overhead with reads, writes and commits. * - * On successful return, caller is responsible for calling path_put. Also + * On successful return, caller is responsible for calling nfsd_file_put. Also * note that this is called from nfs.ko via find_symbol() to avoid an explicit * dependency on knfsd. So, there is no forward declaration in a header file * for it that is shared with the client. 
@@ -49,7 +49,7 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net, const struct cred *cred, const struct nfs_fh *nfs_fh, const fmode_t fmode, - struct file **pfilp) + struct nfsd_file **pnf) { int mayflags = NFSD_MAY_LOCALIO; int status = 0; @@ -57,7 +57,6 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net, const struct cred *save_cred; struct svc_cred rq_cred; struct svc_fh fh; - struct nfsd_file *nf; __be32 beres; if (nfs_fh->size > NFS4_FHSIZE) @@ -91,13 +90,11 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net, rpcauth_map_clnt_to_svc_cred_local(rpc_clnt, cred, &rq_cred); beres = nfsd_file_acquire_local(cl_nfssvc_net, &rq_cred, rpc_clnt->cl_vers, - cl_nfssvc_dom, &fh, mayflags, &nf); + cl_nfssvc_dom, &fh, mayflags, pnf); if (beres) { status = nfs_stat_to_errno(be32_to_cpu(beres)); goto out_fh_put; } - *pfilp = get_file(nf->nf_file); - nfsd_file_put(nf); out_fh_put: fh_put(&fh); if (rq_cred.cr_group_info) diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index 9720951c49a0..ec8a8aae540b 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -166,7 +166,7 @@ int nfsd_open_local_fh(struct net *net, const struct cred *cred, const struct nfs_fh *nfs_fh, const fmode_t fmode, - struct file **pfilp); + struct nfsd_file **pnf); static inline int fh_want_write(struct svc_fh *fh) { diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h index 6302d36f9112..7e09ff621d93 100644 --- a/include/linux/nfslocalio.h +++ b/include/linux/nfslocalio.h @@ -35,7 +35,7 @@ struct nfsd_file; typedef int (*nfs_to_nfsd_open_local_fh_t)(struct net *, struct auth_domain *, struct rpc_clnt *, const struct cred *, const struct nfs_fh *, const fmode_t, - struct file **); + struct nfsd_file **); typedef struct nfsd_file * (*nfs_to_nfsd_file_get_t)(struct nfsd_file *); typedef void (*nfs_to_nfsd_file_put_t)(struct nfsd_file *); typedef struct file * (*nfs_to_nfsd_file_file_t)(struct nfsd_file *); From patchwork Mon Aug 19 18:17:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 
1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768792 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D1ECB18B46F; Mon, 19 Aug 2024 18:18:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091503; cv=none; b=KkUjSb3SzmM/85Jl/uMRNRiP5B8vEcaUGRAcQnqhlbf+lIH4i8YmEAcHM7xs7rvJK0cWEfDLmHhDGuB15FCD/png4S5DPKSH4YiodA09eODroEGEoDu3gV6Z97hi38sLKYWmVlP+OP2ZiL/xMGqmm+LutBExVkV33GLH/Q0QC7w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091503; c=relaxed/simple; bh=sHwFqtE2NbxQxRXSVumyXwo/c68mVLc2REYswVL/EK4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nbHtNd+non0shwlf444/kjCDZfSHJE/TLztgqRnQ357dhzgTzyVpyWWEq/HNpBnf1pG7YBv1BIiEiuqNgNBN0+iOIzjeZmcrtoGHgjDr8yZRdpyVVyIK+7ca5spBjkbUkDquMGeF/rQXiYMrdE6rGn/uKK5BU1U1kTNK7JQzOGg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dX+H8xYf; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dX+H8xYf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 37749C4AF0E; Mon, 19 Aug 2024 18:18:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091503; bh=sHwFqtE2NbxQxRXSVumyXwo/c68mVLc2REYswVL/EK4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dX+H8xYfqz6RC1lN/nnbQPG5IXKNqot6pvb0jdCqX8KTcYvR1YGOHM1pjR5qfo389 frs9xBCHT5hPg14OYJC+6IFBG93T4x6SVFFBfr8XTpji1aTMyO03WvooE7mlVJOv+e exT1XJY7bZ3ibYkM+T8mOHQSc3SrQxXNyd9n7G1lV/FNvFG+RrY0+oDMfB5POVwfx5 
RrOU/KV0Dv9VGTXOfQXTbf99VGLq0risFK93+C0gxwBV9AjFFIZrLNMqp5IeA2Xpq9 oYqqvo1ITBpmUeChDFbAWl99lSVS9dBnFfCIzajsgoXu6z7paMs+56wKMda6ctBRw8 BrMEQxmPBzKPg== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 23/24] nfs: switch client to use nfsd_file for localio Date: Mon, 19 Aug 2024 14:17:28 -0400 Message-ID: <20240819181750.70570-24-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The most interesting changes in this commit are those dealing with the shift to make proper use of nfsd_file so that its lifetime (before nfsd_file_put is called) is extended until after commit, read and write operations. So rather than immediately call nfsd_file_put() in nfs_local_open_fh(), nfsd_file_put() isn't called until nfs_local_pgio_release() for read/write and not until nfs_local_release_commit_data() for commit. Aside from that, the bulk of the changes are just various mechanical changes to switch localio code from passing a file pointer to passing an nfsd_file pointer. But the flexfilelayout.c changes also dealt with the unfortunate business of back-filling conditional compilation for the non-LOCALIO case because otherwise the 'nfs_to' symbol would be unavailable (only localio code is dependent on 'nfs_to'). 
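As an illustration only, the reference handoff described above can be sketched in plain userspace C. Everything named toy_* below is hypothetical and stands in for the kernel objects; in particular the initial refcount of 1 is an assumption that models nfsd's filecache holding its own reference. The sketch shows the open path handing a counted reference to the caller, the I/O context carrying it, and the release path (or the error path) dropping it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nfsd_file; not the kernel definition. */
struct toy_nfsd_file {
	int  refcount;   /* references currently held */
	bool released;   /* set once the last reference is dropped */
};

/* Stand-in for nfs_to.nfsd_file_get(): take one more reference. */
static struct toy_nfsd_file *toy_file_get(struct toy_nfsd_file *nf)
{
	nf->refcount++;
	return nf;
}

/* Stand-in for nfs_to.nfsd_file_put(): drop one reference. */
static void toy_file_put(struct toy_nfsd_file *nf)
{
	if (--nf->refcount == 0)
		nf->released = true;
}

/* Stand-in for struct nfs_local_kiocb: the I/O context keeps the nf pointer. */
struct toy_iocb {
	struct toy_nfsd_file *nf;
	bool io_done;
};

/* Like nfs_local_open_fh() after this patch: the caller owns the reference. */
static struct toy_nfsd_file *toy_open_fh(struct toy_nfsd_file *cached)
{
	return toy_file_get(cached);
}

/*
 * Start the I/O. On failure the reference is dropped right away, which
 * mirrors the "out:" path in nfs_local_doio(). On success the reference
 * travels with the iocb until release.
 */
static int toy_doio(struct toy_iocb *iocb, struct toy_nfsd_file *nf, bool fail)
{
	if (fail) {
		toy_file_put(nf);
		return -1;
	}
	iocb->nf = nf;
	iocb->io_done = true;
	return 0;
}

/* Like nfs_local_pgio_release(): the put happens only after the I/O. */
static void toy_pgio_release(struct toy_iocb *iocb)
{
	toy_file_put(iocb->nf);
	iocb->nf = NULL;
}
```

A caller would pair toy_open_fh() with exactly one of the two outcomes: a failed toy_doio(), which drops the reference itself, or a successful toy_doio() followed by toy_pgio_release().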
With an fio test that issues 128K directio reads with 24 threads to the
same file, for 20 secs, there are noticeably fewer nfsd_file allocations
and a slight performance improvement:

Before this commit:
  read: IOPS=260k, BW=31.7GiB/s (34.0GB/s)(634GiB/20001msec)

  # cat /proc/fs/nfsd/filecache
  total inodes:  0
  hash buckets:  256
  lru entries:   0
  cache hits:    4628466
  acquisitions:  5191234
  allocations:   827632
  releases:      827632
  evictions:     0
  mean age (ms): 0

After this commit:
  read: IOPS=311k, BW=38.0GiB/s (40.8GB/s)(760GiB/20001msec)

  # cat /proc/fs/nfsd/filecache
  total inodes:  0
  hash buckets:  256
  lru entries:   0
  cache hits:    6224711
  acquisitions:  6224712
  allocations:   1
  releases:      1
  evictions:     1
  mean age (ms): 23786
  mean age (ms): 23198

It should be noted that while making proper use of nfsd_file and nfsd's
filecache offers a clear performance win, it still comes up short
compared to simply caching the open file in the client:
  read: IOPS=375k, BW=45.7GiB/s (49.1GB/s)(915GiB/20002msec)

  # cat /proc/fs/nfsd/filecache
  total inodes:  0
  hash buckets:  256
  lru entries:   0
  cache hits:    11
  acquisitions:  24
  allocations:   14
  releases:      14
  evictions:     0
  mean age (ms): 0

But caching the open file in the client has object lifetime problems and
is avoided because otherwise the client would disallow proper
refcounting and release of each nfsd export's backing filesystem.
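For readers who want to put the quoted figures side by side, the following is a small sketch of the arithmetic and not part of the patch. The helper names are hypothetical. Fed with the counters above, the hit ratio works out to roughly 89% of acquisitions before this commit and effectively 100% after it, and the IOPS change from 260k to 311k is roughly 20%.

```c
/* Percentage of acquisitions served from the filecache (hypothetical helper). */
static double hit_ratio_percent(unsigned long hits, unsigned long acquisitions)
{
	if (acquisitions == 0)
		return 0.0;
	return 100.0 * (double)hits / (double)acquisitions;
}

/* Relative change, in percent, from one IOPS figure to another. */
static double relative_gain_percent(double before, double after)
{
	return 100.0 * (after - before) / before;
}
```

The same relative_gain_percent() helper can be applied to the 311k and 375k figures to size the remaining gap to client-side caching of the open file.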
Signed-off-by: Mike Snitzer --- fs/nfs/flexfilelayout/flexfilelayout.c | 123 ++++++++++++++++--------- fs/nfs/flexfilelayout/flexfilelayout.h | 4 +- fs/nfs/internal.h | 31 ++++--- fs/nfs/localio.c | 74 +++++++-------- fs/nfs/pagelist.c | 10 +- fs/nfs/write.c | 10 +- 6 files changed, 144 insertions(+), 108 deletions(-) diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index 206b4b524e43..d91b640f6c05 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -163,7 +163,17 @@ decode_name(struct xdr_stream *xdr, u32 *id) return 0; } -static struct file * +/* + * A dummy definition to make RCU (and non-LOCALIO compilation) happy. + * struct nfsd_file should never be dereferenced in this file. + */ +struct nfsd_file { + int undefined__; +}; + +#if IS_ENABLED(CONFIG_NFS_LOCALIO) + +static struct nfsd_file * ff_local_open_fh(struct pnfs_layout_segment *lseg, u32 ds_idx, struct nfs_client *clp, @@ -172,7 +182,7 @@ ff_local_open_fh(struct pnfs_layout_segment *lseg, fmode_t mode) { struct nfs4_ff_layout_mirror *mirror = FF_LAYOUT_COMP(lseg, ds_idx); - struct file *filp, *new, __rcu **pfile; + struct nfsd_file *nf, *new, __rcu **pnf; if (!nfs_server_is_local(clp)) return NULL; @@ -182,33 +192,43 @@ ff_local_open_fh(struct pnfs_layout_segment *lseg, * to a rw layout. 
*/ mode |= FMODE_READ; - pfile = &mirror->rw_file; + pnf = &mirror->rw_file; } else - pfile = &mirror->ro_file; + pnf = &mirror->ro_file; new = NULL; rcu_read_lock(); - filp = rcu_dereference(*pfile); - if (!filp) { + nf = rcu_dereference(*pnf); + if (!nf) { rcu_read_unlock(); new = nfs_local_open_fh(clp, cred, fh, mode); if (IS_ERR(new)) return NULL; rcu_read_lock(); /* try to swap in the pointer */ - filp = cmpxchg(pfile, NULL, new); - if (!filp) { - filp = new; + nf = cmpxchg(pnf, NULL, new); + if (!nf) { + nf = new; new = NULL; } } - filp = get_file_rcu(&filp); + nf = nfs_to.nfsd_file_get(nf); rcu_read_unlock(); if (new) - fput(new); - return filp; + nfs_to.nfsd_file_put(new); + return nf; } +#else +static struct nfsd_file * +ff_local_open_fh(struct pnfs_layout_segment *lseg, u32 ds_idx, + struct nfs_client *clp, const struct cred *cred, + struct nfs_fh *fh, fmode_t mode) +{ + return NULL; +} +#endif /* IS_ENABLED(CONFIG_NFS_LOCALIO) */ + static bool ff_mirror_match_fh(const struct nfs4_ff_layout_mirror *m1, const struct nfs4_ff_layout_mirror *m2) { @@ -284,15 +304,17 @@ static struct nfs4_ff_layout_mirror *ff_layout_alloc_mirror(gfp_t gfp_flags) static void ff_layout_free_mirror(struct nfs4_ff_layout_mirror *mirror) { - struct file *filp; - const struct cred *cred; + struct nfsd_file * __maybe_unused nf; + const struct cred *cred; - filp = rcu_access_pointer(mirror->ro_file); - if (filp) - fput(filp); - filp = rcu_access_pointer(mirror->rw_file); - if (filp) - fput(filp); +#if IS_ENABLED(CONFIG_NFS_LOCALIO) + nf = rcu_access_pointer(mirror->ro_file); + if (nf) + nfs_to.nfsd_file_put(nf); + nf = rcu_access_pointer(mirror->rw_file); + if (nf) + nfs_to.nfsd_file_put(nf); +#endif ff_layout_remove_mirror(mirror); kfree(mirror->fh_versions); cred = rcu_access_pointer(mirror->ro_cred); @@ -468,7 +490,6 @@ ff_layout_alloc_lseg(struct pnfs_layout_hdr *lh, struct nfs4_ff_layout_mirror *mirror; struct cred *kcred; const struct cred __rcu *cred; - const struct cred __rcu 
*old; kuid_t uid; kgid_t gid; u32 ds_count, fh_count, id; @@ -568,27 +589,39 @@ ff_layout_alloc_lseg(struct pnfs_layout_hdr *lh, mirror = ff_layout_add_mirror(lh, fls->mirror_array[i]); if (mirror != fls->mirror_array[i]) { - struct file *filp; - /* swap cred ptrs so free_mirror will clean up old */ +#if IS_ENABLED(CONFIG_NFS_LOCALIO) if (lgr->range.iomode == IOMODE_READ) { - old = xchg(&mirror->ro_cred, cred); + const struct cred __rcu *old = + xchg(&mirror->ro_cred, cred); rcu_assign_pointer(fls->mirror_array[i]->ro_cred, old); /* drop file if creds changed */ if (old != cred) { - filp = rcu_dereference_protected(xchg(&mirror->ro_file, NULL), 1); - if (filp) - fput(filp); + struct nfsd_file *nf = + rcu_dereference_protected(xchg(&mirror->ro_file, NULL), 1); + if (nf) + nfs_to.nfsd_file_put(nf); } } else { - old = xchg(&mirror->rw_cred, cred); + const struct cred __rcu *old = + xchg(&mirror->rw_cred, cred); rcu_assign_pointer(fls->mirror_array[i]->rw_cred, old); if (old != cred) { - filp = rcu_dereference_protected(xchg(&mirror->rw_file, NULL), 1); - if (filp) - fput(filp); + struct nfsd_file *nf = + rcu_dereference_protected(xchg(&mirror->rw_file, NULL), 1); + if (nf) + nfs_to.nfsd_file_put(nf); } } +#else + if (lgr->range.iomode == IOMODE_READ) { + cred = xchg(&mirror->ro_cred, cred); + rcu_assign_pointer(fls->mirror_array[i]->ro_cred, cred); + } else { + cred = xchg(&mirror->rw_cred, cred); + rcu_assign_pointer(fls->mirror_array[i]->rw_cred, cred); + } +#endif /* IS_ENABLED(CONFIG_NFS_LOCALIO) */ ff_layout_free_mirror(fls->mirror_array[i]); fls->mirror_array[i] = mirror; } @@ -1824,7 +1857,7 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr) struct pnfs_layout_segment *lseg = hdr->lseg; struct nfs4_pnfs_ds *ds; struct rpc_clnt *ds_clnt; - struct file *filp; + struct nfsd_file *nf; struct nfs4_ff_layout_mirror *mirror; const struct cred *ds_cred; loff_t offset = hdr->args.offset; @@ -1872,9 +1905,9 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr) 
hdr->mds_offset = offset; /* Start IO accounting for local read */ - filp = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, - FMODE_READ); - if (filp) { + nf = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, + FMODE_READ); + if (nf) { hdr->task.tk_start = ktime_get(); ff_layout_read_record_layoutstats_start(&hdr->task, hdr); } @@ -1883,7 +1916,7 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr) nfs_initiate_pgio(ds_clnt, hdr, ds_cred, ds->ds_clp->rpc_ops, vers == 3 ? &ff_layout_read_call_ops_v3 : &ff_layout_read_call_ops_v4, - 0, RPC_TASK_SOFTCONN, filp); + 0, RPC_TASK_SOFTCONN, nf); put_cred(ds_cred); return PNFS_ATTEMPTED; @@ -1903,7 +1936,7 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync) struct pnfs_layout_segment *lseg = hdr->lseg; struct nfs4_pnfs_ds *ds; struct rpc_clnt *ds_clnt; - struct file *filp; + struct nfsd_file *nf; struct nfs4_ff_layout_mirror *mirror; const struct cred *ds_cred; loff_t offset = hdr->args.offset; @@ -1949,9 +1982,9 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync) hdr->args.offset = offset; /* Start IO accounting for local write */ - filp = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, + nf = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, FMODE_READ|FMODE_WRITE); - if (filp) { + if (nf) { hdr->task.tk_start = ktime_get(); ff_layout_write_record_layoutstats_start(&hdr->task, hdr); } @@ -1960,7 +1993,7 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync) nfs_initiate_pgio(ds_clnt, hdr, ds_cred, ds->ds_clp->rpc_ops, vers == 3 ? 
&ff_layout_write_call_ops_v3 : &ff_layout_write_call_ops_v4, - sync, RPC_TASK_SOFTCONN, filp); + sync, RPC_TASK_SOFTCONN, nf); put_cred(ds_cred); return PNFS_ATTEMPTED; @@ -1994,7 +2027,7 @@ static int ff_layout_initiate_commit(struct nfs_commit_data *data, int how) struct pnfs_layout_segment *lseg = data->lseg; struct nfs4_pnfs_ds *ds; struct rpc_clnt *ds_clnt; - struct file *filp; + struct nfsd_file *nf; struct nfs4_ff_layout_mirror *mirror; const struct cred *ds_cred; u32 idx; @@ -2034,9 +2067,9 @@ static int ff_layout_initiate_commit(struct nfs_commit_data *data, int how) data->args.fh = fh; /* Start IO accounting for local commit */ - filp = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, - FMODE_READ|FMODE_WRITE); - if (filp) { + nf = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, + FMODE_READ|FMODE_WRITE); + if (nf) { data->task.tk_start = ktime_get(); ff_layout_commit_record_layoutstats_start(&data->task, data); } @@ -2044,7 +2077,7 @@ static int ff_layout_initiate_commit(struct nfs_commit_data *data, int how) ret = nfs_initiate_commit(ds_clnt, data, ds->ds_clp->rpc_ops, vers == 3 ? 
&ff_layout_commit_call_ops_v3 : &ff_layout_commit_call_ops_v4, - how, RPC_TASK_SOFTCONN, filp); + how, RPC_TASK_SOFTCONN, nf); put_cred(ds_cred); return ret; out_err: diff --git a/fs/nfs/flexfilelayout/flexfilelayout.h b/fs/nfs/flexfilelayout/flexfilelayout.h index 8e042df5a2c9..562e7e27a8b5 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.h +++ b/fs/nfs/flexfilelayout/flexfilelayout.h @@ -82,9 +82,9 @@ struct nfs4_ff_layout_mirror { struct nfs_fh *fh_versions; nfs4_stateid stateid; const struct cred __rcu *ro_cred; - struct file __rcu *ro_file; + struct nfsd_file __rcu *ro_file; const struct cred __rcu *rw_cred; - struct file __rcu *rw_file; + struct nfsd_file __rcu *rw_file; refcount_t ref; spinlock_t lock; unsigned long flags; diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index 23f0d180fd19..a7677a16e929 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -309,7 +309,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *, struct nfs_pgio_header *); int nfs_initiate_pgio(struct rpc_clnt *clnt, struct nfs_pgio_header *hdr, const struct cred *cred, const struct nfs_rpc_ops *rpc_ops, const struct rpc_call_ops *call_ops, int how, int flags, - struct file *localio); + struct nfsd_file *localio); void nfs_free_request(struct nfs_page *req); struct nfs_pgio_mirror * nfs_pgio_current_mirror(struct nfs_pageio_descriptor *desc); @@ -455,43 +455,46 @@ extern int nfs_wait_bit_killable(struct wait_bit_key *key, int mode); /* localio.c */ extern void nfs_local_disable(struct nfs_client *); extern void nfs_local_probe(struct nfs_client *); -extern struct file *nfs_local_open_fh(struct nfs_client *, const struct cred *, - struct nfs_fh *, const fmode_t); -extern struct file *nfs_local_file_open(struct nfs_client *clp, - const struct cred *cred, - struct nfs_fh *fh, - struct nfs_open_context *ctx); -extern int nfs_local_doio(struct nfs_client *, struct file *, +extern struct nfsd_file *nfs_local_open_fh(struct nfs_client *, + const struct cred *, + struct nfs_fh *, 
+ const fmode_t); +extern struct nfsd_file *nfs_local_file_open(struct nfs_client *clp, + const struct cred *cred, + struct nfs_fh *fh, + struct nfs_open_context *ctx); +extern int nfs_local_doio(struct nfs_client *, struct nfsd_file *, struct nfs_pgio_header *, const struct rpc_call_ops *); -extern int nfs_local_commit(struct file *, struct nfs_commit_data *, +extern int nfs_local_commit(struct nfsd_file *, struct nfs_commit_data *, const struct rpc_call_ops *, int); extern bool nfs_server_is_local(const struct nfs_client *clp); #else static inline void nfs_local_disable(struct nfs_client *clp) {} static inline void nfs_local_probe(struct nfs_client *clp) {} -static inline struct file *nfs_local_open_fh(struct nfs_client *clp, +static inline struct nfsd_file *nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, struct nfs_fh *fh, const fmode_t mode) { return ERR_PTR(-EINVAL); } -static inline struct file *nfs_local_file_open(struct nfs_client *clp, +static inline struct nfsd_file *nfs_local_file_open(struct nfs_client *clp, const struct cred *cred, struct nfs_fh *fh, struct nfs_open_context *ctx) { return NULL; } -static inline int nfs_local_doio(struct nfs_client *clp, struct file *filep, +static inline int nfs_local_doio(struct nfs_client *clp, struct nfsd_file *nf, struct nfs_pgio_header *hdr, const struct rpc_call_ops *call_ops) { return -EINVAL; } -static inline int nfs_local_commit(struct file *filep, struct nfs_commit_data *data, +static inline int nfs_local_commit(struct nfsd_file *nf, + struct nfs_commit_data *data, const struct rpc_call_ops *call_ops, int how) { return -EINVAL; @@ -582,7 +585,7 @@ extern int nfs_initiate_commit(struct rpc_clnt *clnt, const struct nfs_rpc_ops *nfs_ops, const struct rpc_call_ops *call_ops, int how, int flags, - struct file *localio); + struct nfsd_file *localio); extern void nfs_init_commit(struct nfs_commit_data *data, struct list_head *head, struct pnfs_layout_segment *lseg, diff --git a/fs/nfs/localio.c 
b/fs/nfs/localio.c index 7d63d7e34643..718114e52da4 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -35,10 +35,11 @@ struct nfs_local_kiocb { struct bio_vec *bvec; struct nfs_pgio_header *hdr; struct work_struct work; + struct nfsd_file *nf; }; struct nfs_local_fsync_ctx { - struct file *filp; + struct nfsd_file *nf; struct nfs_commit_data *data; struct work_struct work; struct kref kref; @@ -228,15 +229,14 @@ void nfs_local_probe(struct nfs_client *clp) EXPORT_SYMBOL_GPL(nfs_local_probe); /* - * nfs_local_open_fh - open a local filehandle + * nfs_local_open_fh - open a local filehandle in terms of nfsd_file * - * Returns a pointer to a struct file or an ERR_PTR + * Returns a pointer to a struct nfsd_file or an ERR_PTR */ -struct file * +struct nfsd_file * nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, struct nfs_fh *fh, const fmode_t mode) { - struct file *filp; struct nfsd_file *nf; int status; @@ -258,26 +258,24 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, } return ERR_PTR(status); } - filp = get_file(nfs_to.nfsd_file_file(nf)); - nfs_to.nfsd_file_put(nf); - return filp; + return nf; } EXPORT_SYMBOL_GPL(nfs_local_open_fh); -struct file * +struct nfsd_file * nfs_local_file_open(struct nfs_client *clp, const struct cred *cred, struct nfs_fh *fh, struct nfs_open_context *ctx) { - struct file *filp; + struct nfsd_file *nf; if (!nfs_server_is_local(clp)) return NULL; - filp = nfs_local_open_fh(clp, cred, fh, ctx->mode); - if (IS_ERR(filp)) + nf = nfs_local_open_fh(clp, cred, fh, ctx->mode); + if (IS_ERR(nf)) return NULL; - return filp; + return nf; } static struct bio_vec * @@ -305,7 +303,7 @@ nfs_local_iocb_free(struct nfs_local_kiocb *iocb) } static struct nfs_local_kiocb * -nfs_local_iocb_alloc(struct nfs_pgio_header *hdr, struct file *filp, +nfs_local_iocb_alloc(struct nfs_pgio_header *hdr, struct nfsd_file *nf, gfp_t flags) { struct nfs_local_kiocb *iocb; @@ -319,8 +317,9 @@ nfs_local_iocb_alloc(struct 
nfs_pgio_header *hdr, struct file *filp, kfree(iocb); return NULL; } - init_sync_kiocb(&iocb->kiocb, filp); + init_sync_kiocb(&iocb->kiocb, nfs_to.nfsd_file_file(nf)); iocb->kiocb.ki_pos = hdr->args.offset; + iocb->nf = nf; iocb->hdr = hdr; iocb->kiocb.ki_flags &= ~IOCB_APPEND; return iocb; @@ -372,7 +371,7 @@ nfs_local_pgio_release(struct nfs_local_kiocb *iocb) { struct nfs_pgio_header *hdr = iocb->hdr; - fput(iocb->kiocb.ki_filp); + nfs_to.nfsd_file_put(iocb->nf); nfs_local_iocb_free(iocb); nfs_local_hdr_release(hdr, hdr->task.tk_ops); } @@ -415,7 +414,7 @@ static void nfs_local_call_read(struct work_struct *work) revert_creds(save_cred); } -static int nfs_do_local_read(struct nfs_pgio_header *hdr, struct file *filp, +static int nfs_do_local_read(struct nfs_pgio_header *hdr, struct nfsd_file *nf, const struct rpc_call_ops *call_ops) { struct nfs_local_kiocb *iocb; @@ -423,7 +422,7 @@ static int nfs_do_local_read(struct nfs_pgio_header *hdr, struct file *filp, dprintk("%s: vfs_read count=%u pos=%llu\n", __func__, hdr->args.count, hdr->args.offset); - iocb = nfs_local_iocb_alloc(hdr, filp, GFP_KERNEL); + iocb = nfs_local_iocb_alloc(hdr, nf, GFP_KERNEL); if (iocb == NULL) return -ENOMEM; @@ -466,7 +465,6 @@ nfs_set_local_verifier(struct inode *inode, struct nfs_writeverf *verf, enum nfs3_stable_how how) { - nfs_copy_boot_verifier(&verf->verifier, inode); verf->committed = how; } @@ -588,7 +586,7 @@ static void nfs_local_call_write(struct work_struct *work) current->flags = old_flags; } -static int nfs_do_local_write(struct nfs_pgio_header *hdr, struct file *filp, +static int nfs_do_local_write(struct nfs_pgio_header *hdr, struct nfsd_file *nf, const struct rpc_call_ops *call_ops) { struct nfs_local_kiocb *iocb; @@ -597,7 +595,7 @@ static int nfs_do_local_write(struct nfs_pgio_header *hdr, struct file *filp, __func__, hdr->args.count, hdr->args.offset, (hdr->args.stable == NFS_UNSTABLE) ? 
"unstable" : "stable"); - iocb = nfs_local_iocb_alloc(hdr, filp, GFP_NOIO); + iocb = nfs_local_iocb_alloc(hdr, nf, GFP_NOIO); if (iocb == NULL) return -ENOMEM; @@ -621,11 +619,12 @@ static int nfs_do_local_write(struct nfs_pgio_header *hdr, struct file *filp, } int -nfs_local_doio(struct nfs_client *clp, struct file *filp, +nfs_local_doio(struct nfs_client *clp, struct nfsd_file *nf, struct nfs_pgio_header *hdr, const struct rpc_call_ops *call_ops) { int status = 0; + struct file *filp = nfs_to.nfsd_file_file(nf); if (!hdr->args.count) return 0; @@ -633,24 +632,24 @@ nfs_local_doio(struct nfs_client *clp, struct file *filp, if (!filp->f_op->read_iter || !filp->f_op->write_iter) { nfs_local_disable(clp); status = -EAGAIN; - goto out_fput; + goto out; } switch (hdr->rw_mode) { case FMODE_READ: - status = nfs_do_local_read(hdr, filp, call_ops); + status = nfs_do_local_read(hdr, nf, call_ops); break; case FMODE_WRITE: - status = nfs_do_local_write(hdr, filp, call_ops); + status = nfs_do_local_write(hdr, nf, call_ops); break; default: dprintk("%s: invalid mode: %d\n", __func__, hdr->rw_mode); status = -EINVAL; } -out_fput: +out: if (status != 0) { - fput(filp); + nfs_to.nfsd_file_put(nf); hdr->task.tk_status = status; nfs_local_hdr_release(hdr, call_ops); } @@ -697,23 +696,23 @@ nfs_local_commit_done(struct nfs_commit_data *data, int status) } static void -nfs_local_release_commit_data(struct file *filp, +nfs_local_release_commit_data(struct nfsd_file *nf, struct nfs_commit_data *data, const struct rpc_call_ops *call_ops) { - fput(filp); + nfs_to.nfsd_file_put(nf); call_ops->rpc_call_done(&data->task, data); call_ops->rpc_release(data); } static struct nfs_local_fsync_ctx * -nfs_local_fsync_ctx_alloc(struct nfs_commit_data *data, struct file *filp, - gfp_t flags) +nfs_local_fsync_ctx_alloc(struct nfs_commit_data *data, + struct nfsd_file *nf, gfp_t flags) { struct nfs_local_fsync_ctx *ctx = kmalloc(sizeof(*ctx), flags); if (ctx != NULL) { - ctx->filp = filp; + ctx->nf = 
nf; ctx->data = data; INIT_WORK(&ctx->work, nfs_local_fsync_work); kref_init(&ctx->kref); @@ -737,7 +736,7 @@ nfs_local_fsync_ctx_put(struct nfs_local_fsync_ctx *ctx) static void nfs_local_fsync_ctx_free(struct nfs_local_fsync_ctx *ctx) { - nfs_local_release_commit_data(ctx->filp, ctx->data, + nfs_local_release_commit_data(ctx->nf, ctx->data, ctx->data->task.tk_ops); nfs_local_fsync_ctx_put(ctx); } @@ -750,7 +749,8 @@ nfs_local_fsync_work(struct work_struct *work) ctx = container_of(work, struct nfs_local_fsync_ctx, work); - status = nfs_local_run_commit(ctx->filp, ctx->data); + status = nfs_local_run_commit(nfs_to.nfsd_file_file(ctx->nf), + ctx->data); nfs_local_commit_done(ctx->data, status); if (ctx->done != NULL) complete(ctx->done); @@ -758,15 +758,15 @@ nfs_local_fsync_work(struct work_struct *work) } int -nfs_local_commit(struct file *filp, struct nfs_commit_data *data, +nfs_local_commit(struct nfsd_file *nf, struct nfs_commit_data *data, const struct rpc_call_ops *call_ops, int how) { struct nfs_local_fsync_ctx *ctx; - ctx = nfs_local_fsync_ctx_alloc(data, filp, GFP_KERNEL); + ctx = nfs_local_fsync_ctx_alloc(data, nf, GFP_KERNEL); if (!ctx) { nfs_local_commit_done(data, -ENOMEM); - nfs_local_release_commit_data(filp, data, call_ops); + nfs_local_release_commit_data(nf, data, call_ops); return -ENOMEM; } diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index 1bd0224f7ee8..6f836b66ef79 100644 --- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -732,7 +732,7 @@ static void nfs_pgio_prepare(struct rpc_task *task, void *calldata) int nfs_initiate_pgio(struct rpc_clnt *clnt, struct nfs_pgio_header *hdr, const struct cred *cred, const struct nfs_rpc_ops *rpc_ops, const struct rpc_call_ops *call_ops, int how, int flags, - struct file *localio) + struct nfsd_file *localio) { struct rpc_task *task; struct rpc_message msg = { @@ -960,9 +960,9 @@ static int nfs_generic_pg_pgios(struct nfs_pageio_descriptor *desc) if (ret == 0) { struct nfs_client *clp = 
NFS_SERVER(hdr->inode)->nfs_client; - struct file *filp = nfs_local_file_open(clp, hdr->cred, - hdr->args.fh, - hdr->args.context); + struct nfsd_file *nf = nfs_local_file_open(clp, hdr->cred, + hdr->args.fh, + hdr->args.context); if (NFS_SERVER(hdr->inode)->nfs_client->cl_minorversion) task_flags = RPC_TASK_MOVEABLE; @@ -973,7 +973,7 @@ static int nfs_generic_pg_pgios(struct nfs_pageio_descriptor *desc) desc->pg_rpc_callops, desc->pg_ioflags, RPC_TASK_CRED_NOREF | task_flags, - filp); + nf); } return ret; } diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 6436db54b2fc..89a49a08bc90 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1664,7 +1664,7 @@ int nfs_initiate_commit(struct rpc_clnt *clnt, struct nfs_commit_data *data, const struct nfs_rpc_ops *nfs_ops, const struct rpc_call_ops *call_ops, int how, int flags, - struct file *localio) + struct nfsd_file *localio) { struct rpc_task *task; int priority = flush_task_priority(how); @@ -1795,7 +1795,7 @@ nfs_commit_list(struct inode *inode, struct list_head *head, int how, struct nfs_commit_info *cinfo) { struct nfs_commit_data *data; - struct file *filp; + struct nfsd_file *nf; unsigned short task_flags = 0; /* another commit raced with us */ @@ -1813,11 +1813,11 @@ nfs_commit_list(struct inode *inode, struct list_head *head, int how, if (NFS_SERVER(inode)->nfs_client->cl_minorversion) task_flags = RPC_TASK_MOVEABLE; - filp = nfs_local_file_open(NFS_SERVER(inode)->nfs_client, data->cred, - data->args.fh, data->context); + nf = nfs_local_file_open(NFS_SERVER(inode)->nfs_client, data->cred, + data->args.fh, data->context); return nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(inode), data->mds_ops, how, - RPC_TASK_CRED_NOREF | task_flags, filp); + RPC_TASK_CRED_NOREF | task_flags, nf); } /* From patchwork Mon Aug 19 18:17:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13768793 Received: from 
smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E669A18B46F; Mon, 19 Aug 2024 18:18:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091505; cv=none; b=JgPxeaz1+DHVrZIFemZKdnrqXO4SBWK5h96uByMKLbacOp7lmUhHFonOia/8HJv0WifPG1gzuoCtSylOS1cJ9HhwmZXOPCOPowtN9PoIbT5ObxF+ruDaqbNM18lHHqqF4BVU8N+0ltLuTccAsxWSWpfOsSjdDOMW6aNDDMB2/Hk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724091505; c=relaxed/simple; bh=aG9IkFvVW3wIWK9Qnp3warZ9GiIWNMtEp7QTQUr9qZI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=JMvxIFP/XUNL8YhptOOCJ3DOypO0n6sEEzpkCtA8yEey6/UQdUZHHG+1pDHI28LwBPsxIW0vXvKYJmLaUAM3GzBGLpt+0cPDZlCBwDtLrHuJB3Iftg78klwHOodLUTRafU163vv64sgF2SFstQ97YfBBZRMjydBSf+gYVMUfT7E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WDpBRHwF; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WDpBRHwF" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8D4BBC32782; Mon, 19 Aug 2024 18:18:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724091504; bh=aG9IkFvVW3wIWK9Qnp3warZ9GiIWNMtEp7QTQUr9qZI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WDpBRHwFbFp35p3Vz11Qx1anmLA4P3Hnh8cgYNW1icI6Tejqx1cyIcbMPZTk3SuMh uba5v5cTxOi8Nz8EaH1PdXfezlzNF78MyJnLVFg4WJpmfEzoeckwwIQ5MVRh9/gy1K Xc19mc0b9UPIkVhs+23/PciFItUiiDZ1n65k2ZqdM/eOAdFpeOSHxV4LXJ3HdH8d32 dk5h65ow1Z1DONjMi2nt1QudCeqnMu4YycvUfDdJAVmWCQrrq3SCp18t0Mf+rAiJ77 
T8pdASvoN6V3MsRqO01RdGrdN+fz7445dN6G59B5SqGgmRehYCTY1XZUmjMENUsgCN kX/31aS+tioqw== From: Mike Snitzer To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Chuck Lever , Anna Schumaker , Trond Myklebust , NeilBrown , linux-fsdevel@vger.kernel.org Subject: [PATCH v12 24/24] nfs: add FAQ section to Documentation/filesystems/nfs/localio.rst Date: Mon, 19 Aug 2024 14:17:29 -0400 Message-ID: <20240819181750.70570-25-snitzer@kernel.org> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240819181750.70570-1-snitzer@kernel.org> References: <20240819181750.70570-1-snitzer@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Trond Myklebust Add a FAQ section to give answers to questions that have been raised during review of the localio feature. Signed-off-by: Trond Myklebust Co-developed-by: Mike Snitzer Signed-off-by: Mike Snitzer --- Documentation/filesystems/nfs/localio.rst | 77 +++++++++++++++++++++++ 1 file changed, 77 insertions(+) diff --git a/Documentation/filesystems/nfs/localio.rst b/Documentation/filesystems/nfs/localio.rst index d8bdab88f1db..acd8f3e5d87a 100644 --- a/Documentation/filesystems/nfs/localio.rst +++ b/Documentation/filesystems/nfs/localio.rst @@ -40,6 +40,83 @@ fio for 20 secs with 24 libaio threads, 128k directio reads, qd of 8, - Without LOCALIO: read: IOPS=12.0k, BW=1495MiB/s (1568MB/s)(29.2GiB/20015msec) +FAQ +=== + +1. What are the use cases for LOCALIO? + + a. Workloads where the NFS client and server are on the same host + realize improved IO performance. In particular, it is common when + running containerised workloads for jobs to find themselves + running on the same host as the knfsd server being used for + storage. + +2. What are the requirements for LOCALIO? + + a. Bypass use of the network RPC protocol as much as possible. This + includes bypassing XDR and RPC for open, read, write and commit + operations. + b. 
Allow client and server to autonomously discover if they are + running local to each other without making any assumptions about + the local network topology. + c. Support the use of containers by being compatible with relevant + namespaces (e.g. network, user, mount). + d. Support all versions of NFS. NFSv3 is of particular importance + because it has wide enterprise usage and pNFS flexfiles makes use + of it for the data path. + +3. Why doesn’t LOCALIO just compare IP addresses or hostnames when + deciding if the NFS client and server are co-located on the same + host? + + Since one of the main use cases is containerised workloads, we cannot + assume that IP addresses will be shared between the client and + server. This sets up a requirement for a handshake protocol that + needs to go over the same connection as the NFS traffic in order to + identify that the client and the server really are running on the + same host. The handshake uses a secret that is sent over the wire, + and can be verified by both parties by comparing with a value stored + in shared kernel memory if they are truly co-located. + +4. Does LOCALIO improve pNFS flexfiles? + + Yes, LOCALIO complements pNFS flexfiles by allowing it to take + advantage of NFS client and server locality. Policy that initiates + client IO as closely as possible to the server where the data is stored + naturally benefits from the data path optimization LOCALIO provides. + +5. Why not develop a new pNFS layout to enable LOCALIO? + + A new pNFS layout could be developed, but doing so would put the + onus on the server to somehow discover that the client is co-located + when deciding to hand out the layout. + There is value in a simpler approach (as provided by LOCALIO) that + allows the NFS client to negotiate and leverage locality without + requiring more elaborate modeling and discovery of such locality in a + more centralized manner. + +6. Why is having the client perform a server-side file OPEN, without + using RPC, beneficial?
Is the benefit pNFS-specific? + + Avoiding the use of XDR and RPC for file opens is beneficial to + performance regardless of whether pNFS is used. However, adding a + requirement to go over the wire to do an open and/or close ends up + negating any benefit of avoiding the wire for doing the I/O itself + when we’re dealing with small files. There is no benefit to replacing + the READ or WRITE with a new open and/or close operation that still + needs to go over the wire. + +7. Why is LOCALIO only supported with UNIX Authentication (AUTH_UNIX)? + + Strong authentication is usually tied to the connection itself. It + works by establishing a context that is cached by the server, and + that acts as the key for discovering the authorisation token, which + can then be passed to rpc.mountd to complete the authentication + process. On the other hand, in the case of AUTH_UNIX, the credential + that was passed over the wire is used directly as the key in the + upcall to rpc.mountd. This simplifies the authentication process, and + so makes AUTH_UNIX easier to support. + RPC ===