From patchwork Fri Aug 30 02:20:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13784130 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 420B02C18C for ; Fri, 30 Aug 2024 02:39:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985546; cv=none; b=F/P7hQqkIpI7HF4PoD8RlUSrMadpeJm2mlgfRr64PziN9ym10ZEkK7XVbsVbyBE2FWxWgnyYnHmmZWITSOnxAhUeWZQpizD+3neyGgB6159JEyOC3waWJyQq77CV2e6TZfHH24Wu5V9pcM672rcUCQo//heqzCpSd+zX8VWiaGI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985546; c=relaxed/simple; bh=7B48hc0EycFcZF4I3bj8DltkfzXZBBgNth+o7Qr7AOI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=MN6IzS0wveSUFjObzEXEAyFUX3Eg7cJSL/r/x1DWPRP2vTppbV/WdcR2Co1rnk3OJ4mp1ixZkOvdtShRGBLhWwJ7WB2R7N4BNvpMaJdYjsxn/TZeljipc9Y5WgOYxe7fNGOWJM2yhxJXGRcusotv9OMY2mMA4PBANQ2ors0AWug= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=gz5+juHE; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=A8UtUZDo; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=gz5+juHE; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=A8UtUZDo; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="gz5+juHE"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="A8UtUZDo"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="gz5+juHE"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="A8UtUZDo" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 53AEA219F0; Fri, 30 Aug 2024 02:39:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985542; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=frSftmE8tk4SPvBHtWtYJJIy7qxU+Rs5xCVx8ks19yg=; b=gz5+juHE3OiyJlmg7vUzVQj9pN2eVUP9R2zwrMY5Imp2CqMsMY1SauAcSeJFJWMOiIFvlu VroYavumeSp6w0N1b4DaYFa0FDAgQIJwEaHGhmPWrlu96o5GLlq3+dRKpxbbjClKK7OCHX 8qMArhLkMq7/5mihAFnjszjt9JD3XCg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985542; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=frSftmE8tk4SPvBHtWtYJJIy7qxU+Rs5xCVx8ks19yg=; b=A8UtUZDoan9Bida++PPkGPh6gnu88g1UI2QoxUSYkwpJvqaQLrK5FMZGxezCa80G/wS+t0 6TxVGyzmT3uz84Dw== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985542; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=frSftmE8tk4SPvBHtWtYJJIy7qxU+Rs5xCVx8ks19yg=; b=gz5+juHE3OiyJlmg7vUzVQj9pN2eVUP9R2zwrMY5Imp2CqMsMY1SauAcSeJFJWMOiIFvlu VroYavumeSp6w0N1b4DaYFa0FDAgQIJwEaHGhmPWrlu96o5GLlq3+dRKpxbbjClKK7OCHX 8qMArhLkMq7/5mihAFnjszjt9JD3XCg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985542; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=frSftmE8tk4SPvBHtWtYJJIy7qxU+Rs5xCVx8ks19yg=; b=A8UtUZDoan9Bida++PPkGPh6gnu88g1UI2QoxUSYkwpJvqaQLrK5FMZGxezCa80G/wS+t0 6TxVGyzmT3uz84Dw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id BDD1E136A4; Fri, 30 Aug 2024 02:39:00 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 36W3HMQw0WbRFwAAD6G6ig (envelope-from ); Fri, 30 Aug 2024 02:39:00 +0000 From: NeilBrown To: Mike Snitzer , Chuck Lever , Jeff Layton Cc: linux-nfs@vger.kernel.org Subject: [PATCH 14/25] nfs_common: add NFS LOCALIO auxiliary protocol enablement Date: Fri, 30 Aug 2024 12:20:27 +1000 Message-ID: <20240830023531.29421-15-neilb@suse.de> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240830023531.29421-1-neilb@suse.de> References: <20240830023531.29421-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Score: -2.80 X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid]; RCVD_TLS_ALL(0.00)[] X-Spam-Flag: NO X-Spam-Level: From: Mike Snitzer fs/nfs_common/nfslocalio.c provides interfaces that enable an NFS client to generate a nonce (single-use UUID) and associated nfs_uuid_t struct, register it with nfs_common for subsequent lookup and verification by the NFS server and if matched the NFS server populates members in the nfs_uuid_t struct. When the server populates the structure, it is moved onto a private list allowing the server to invalidate the ->net pointer when the name space is shut down. A maybe_get_net() call under rcu_read_lock() can be used to stablish the net pointer active use. nfs_common's nfs_uuids list is the basis for localio enablement, as such it has members that point to nfsd memory for direct use by the client (e.g. 'net' is the server's network namespace, through it the client can access nn->nfsd_serv with proper rcu read access). Signed-off-by: Mike Snitzer --- fs/nfs_common/Makefile | 3 ++ fs/nfs_common/nfslocalio.c | 108 +++++++++++++++++++++++++++++++++++++ include/linux/nfslocalio.h | 34 ++++++++++++ 3 files changed, 145 insertions(+) create mode 100644 fs/nfs_common/nfslocalio.c create mode 100644 include/linux/nfslocalio.h diff --git a/fs/nfs_common/Makefile b/fs/nfs_common/Makefile index e58b01bb8dda..a5e54809701e 100644 --- a/fs/nfs_common/Makefile +++ b/fs/nfs_common/Makefile @@ -6,6 +6,9 @@ obj-$(CONFIG_NFS_ACL_SUPPORT) += nfs_acl.o nfs_acl-objs := nfsacl.o +obj-$(CONFIG_NFS_COMMON_LOCALIO_SUPPORT) += nfs_localio.o +nfs_localio-objs := nfslocalio.o + obj-$(CONFIG_GRACE_PERIOD) += grace.o obj-$(CONFIG_NFS_V4_2_SSC_HELPER) += nfs_ssc.o diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c new file mode 100644 index 000000000000..f78cf99e2547 --- /dev/null +++ b/fs/nfs_common/nfslocalio.c @@ -0,0 +1,108 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2024 Mike Snitzer + */ + +#include +#include +#include +#include + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("NFS localio protocol bypass support"); + +static DEFINE_SPINLOCK(nfs_uuid_lock); + +/* + * Global list of nfs_uuid_t instances, add/remove + * is protected by nfs_uuid_lock. + * Reads are protected by RCU read lock (see below). + */ +LIST_HEAD(nfs_uuids); + +void nfs_uuid_begin(nfs_uuid_t *nfs_uuid) +{ + nfs_uuid->net = NULL; + nfs_uuid->dom = NULL; + uuid_gen(&nfs_uuid->uuid); + + spin_lock(&nfs_uuid_lock); + list_add_tail_rcu(&nfs_uuid->list, &nfs_uuids); + spin_unlock(&nfs_uuid_lock); +} +EXPORT_SYMBOL_GPL(nfs_uuid_begin); + +void nfs_uuid_end(nfs_uuid_t *nfs_uuid) +{ + if (nfs_uuid->net == NULL) { + spin_lock(&nfs_uuid_lock); + list_del_init(&nfs_uuid->list); + spin_unlock(&nfs_uuid_lock); + } +} +EXPORT_SYMBOL_GPL(nfs_uuid_end); + +/* Must be called with RCU read lock held. */ +static nfs_uuid_t * nfs_uuid_lookup(const uuid_t *uuid) +{ + nfs_uuid_t *nfs_uuid; + + list_for_each_entry(nfs_uuid, &nfs_uuids, list) + if (uuid_equal(&nfs_uuid->uuid, uuid)) + return nfs_uuid; + + return NULL; +} + +void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list, + struct net *net, struct auth_domain *dom) +{ + nfs_uuid_t *nfs_uuid; + + spin_lock(&nfs_uuid_lock); + nfs_uuid = nfs_uuid_lookup(uuid); + if (nfs_uuid) { + kref_get(&dom->ref); + nfs_uuid->dom = dom; + /* We don't hold a ref on the net, but instead put + * ourselves on a list so the net pointer can be + * invalidated. + */ + list_move(&nfs_uuid->list, list); + nfs_uuid->net = net; + } + spin_unlock(&nfs_uuid_lock); +} +EXPORT_SYMBOL_GPL(nfs_uuid_is_local); + +static void nfs_uuid_put_locked(nfs_uuid_t *nfs_uuid) +{ + if (nfs_uuid->net) + put_net(nfs_uuid->net); + nfs_uuid->net = NULL; + if (nfs_uuid->dom) + auth_domain_put(nfs_uuid->dom); + nfs_uuid->dom = NULL; + list_del_init(&nfs_uuid->list); +} + +void nfs_uuid_invalidate_clients(struct list_head *list) +{ + nfs_uuid_t *nfs_uuid, *tmp; + + spin_lock(&nfs_uuid_lock); + list_for_each_entry_safe(nfs_uuid, tmp, list, list) + nfs_uuid_put_locked(nfs_uuid); + spin_unlock(&nfs_uuid_lock); +} +EXPORT_SYMBOL_GPL(nfs_uuid_invalidate_clients); + +void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid) +{ + if (nfs_uuid->net) { + spin_lock(&nfs_uuid_lock); + nfs_uuid_put_locked(nfs_uuid); + spin_unlock(&nfs_uuid_lock); + } +} +EXPORT_SYMBOL_GPL(nfs_uuid_invalidate_one_client); diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h new file mode 100644 index 000000000000..a8216a777b40 --- /dev/null +++ b/include/linux/nfslocalio.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Mike Snitzer + */ +#ifndef __LINUX_NFSLOCALIO_H +#define __LINUX_NFSLOCALIO_H + +#include +#include +#include +#include +#include + +/* + * Useful to allow a client to negotiate if localio + * possible with its server. + * + * See Documentation/filesystems/nfs/localio.rst for more detail. + */ +typedef struct { + uuid_t uuid; + struct list_head list; + struct net *net; /* nfsd's network namespace */ + struct auth_domain *dom; /* auth_domain for localio */ +} nfs_uuid_t; + +void nfs_uuid_begin(nfs_uuid_t *); +void nfs_uuid_end(nfs_uuid_t *); +void nfs_uuid_is_local(const uuid_t *, struct list_head *, + struct net *, struct auth_domain *); +void nfs_uuid_invalidate_clients(struct list_head *list); +void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid); + +#endif /* __LINUX_NFSLOCALIO_H */ From patchwork Fri Aug 30 02:20:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13784131 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E921D2C18C for ; Fri, 30 Aug 2024 02:39:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985556; cv=none; b=YhxNUDk1wr0aVGtaU8nSfYMwqNIBrSWm6uz8Y5LxCq01OpOvymqaTHVOWfznmQWAhiVzomGLisaRZXeNutGzW9+cunsbCxyFYAVtb7VnUhYbmXhQfbQpc5aSwoWw9lvyVstgct6f/vIvGLvxeUQmKY68FQLPCNxEQmJwFbsEvQg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985556; c=relaxed/simple; bh=+UaMHfcUEn8ypL1Taw7TejrYKPBlAnfPhQA7vFQKPhI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XFrDPn3C66j2yHOVKQG1hPRYB3QWv5AZzYmOHZ5hvD3dPu9bn86Wb9aOttfQimbGO9MRlln3nh7mbUzQr7UiR3/VcFAm+HdR34lBU1YgJCZwvweKX3dC9s4d0O7TCCjJHbnjEm2DzN0tQScnDNcr1xleL/zgquE6gjsqTkU7cj8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=vt5e6E08; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=oZo6dmiV; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=vt5e6E08; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=oZo6dmiV; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="vt5e6E08"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="oZo6dmiV"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="vt5e6E08"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="oZo6dmiV" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 46B421F791; Fri, 30 Aug 2024 02:39:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985551; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v6BMAzZPFarrZAC3bf1YLFQqK3ED6bthBWsIjEySO6o=; b=vt5e6E0822PrDZJ0bCy6k6TiyMXBj5lKrfY2a84VOn0GECXR8bziX6VFk4TW3h+CoTlUBM oxYW7bD57SMeZlNR0LhkbwmJ54dh49TtQhdgGtDHvJXxWWYu+f5sAIVCT1XXj7pj+BdJ2V k8tFIBvsFfoh+S9HuePb7nM3wmreEz4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985551; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v6BMAzZPFarrZAC3bf1YLFQqK3ED6bthBWsIjEySO6o=; b=oZo6dmiVA3rcupFQ+HH4tnZ4Nv2IYrAa+b3zLvsRwJuB8ERewAqwtDRoWxhgyYBebNCT+l 9958z4NtBGKhqNBw== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985551; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v6BMAzZPFarrZAC3bf1YLFQqK3ED6bthBWsIjEySO6o=; b=vt5e6E0822PrDZJ0bCy6k6TiyMXBj5lKrfY2a84VOn0GECXR8bziX6VFk4TW3h+CoTlUBM oxYW7bD57SMeZlNR0LhkbwmJ54dh49TtQhdgGtDHvJXxWWYu+f5sAIVCT1XXj7pj+BdJ2V k8tFIBvsFfoh+S9HuePb7nM3wmreEz4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985551; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v6BMAzZPFarrZAC3bf1YLFQqK3ED6bthBWsIjEySO6o=; b=oZo6dmiVA3rcupFQ+HH4tnZ4Nv2IYrAa+b3zLvsRwJuB8ERewAqwtDRoWxhgyYBebNCT+l 9958z4NtBGKhqNBw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id B5664136A4; Fri, 30 Aug 2024 02:39:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id PtW8Gs0w0WbZFwAAD6G6ig (envelope-from ); Fri, 30 Aug 2024 02:39:09 +0000 From: NeilBrown To: Mike Snitzer , Chuck Lever , Jeff Layton Cc: linux-nfs@vger.kernel.org Subject: [PATCH 15/25] nfs_common: introduce nfs_localio_ctx struct and interfaces Date: Fri, 30 Aug 2024 12:20:28 +1000 Message-ID: <20240830023531.29421-16-neilb@suse.de> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240830023531.29421-1-neilb@suse.de> References: <20240830023531.29421-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Score: -2.80 X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid]; RCVD_TLS_ALL(0.00)[] X-Spam-Flag: NO X-Spam-Level: From: Mike Snitzer Introduce struct nfs_localio_ctx and the interfaces nfs_localio_ctx_alloc() and nfs_localio_ctx_free(). The next commit will introduce nfsd_open_local_fh() which returns a nfs_localio_ctx structure. Also, expose localio's required NFSD symbols to NFS client: - Cache nfsd_open_local_fh symbol and other required NFSD symbols in a globally accessible 'nfs_to' nfs_to_nfsd_t struct. Add interfaces get_nfs_to_nfsd_symbols() and put_nfs_to_nfsd_symbols() to allow each NFS client to take a reference on NFSD symbols. - Apologies for the DEFINE_NFS_TO_NFSD_SYMBOL macro that makes defining get_##NFSD_SYMBOL() and put_##NFSD_SYMBOL() functions far simpler (and avoids cut-n-paste bugs, which is what motivated the development and use of a macro for this). But as C macros go it is a very simple one and there are many like it all over the kernel. - Given the unique nature of NFS LOCALIO being an optional feature that when used requires NFS share access to NFSD memory: a unique bridging of NFSD resources to NFS (via nfs_common) is needed. But that bridge must be dynamic, hence the use of symbol_request() and symbol_put(). Proposed ideas to accomolish the same without using symbol_{request,put} would be far more tedious to implement and very likely no easier to review. Anyway: sorry NeilBrown... - Despite the use of indirect function calls, caching these nfsd symbols for use by the client offers a ~10% performance win (compared to always doing get+call+put) for high IOPS workloads. - Introduce nfsd_file_file() wrapper that provides access to nfsd_file's backing file. Keeps nfsd_file structure opaque to NFS client (as suggested by Jeff Layton). - The addition of nfsd_file_get, nfsd_file_put and nfsd_file_file symbols prepares for the NFS client to use nfsd_file for localio. Suggested-by: Trond Myklebust # nfs_to Suggested-by: Jeff Layton # nfsd_file_file Signed-off-by: Mike Snitzer --- fs/nfs_common/nfslocalio.c | 159 +++++++++++++++++++++++++++++++++++++ fs/nfsd/filecache.c | 25 ++++++ fs/nfsd/filecache.h | 1 + fs/nfsd/nfssvc.c | 5 ++ include/linux/nfslocalio.h | 38 +++++++++ 5 files changed, 228 insertions(+) diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c index f78cf99e2547..8545ee75f756 100644 --- a/fs/nfs_common/nfslocalio.c +++ b/fs/nfs_common/nfslocalio.c @@ -106,3 +106,162 @@ void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid) } } EXPORT_SYMBOL_GPL(nfs_uuid_invalidate_one_client); + +/* + * The nfs localio code needs to call into nfsd using various symbols (below), + * but cannot be statically linked, because that will make the nfs module + * depend on the nfsd module. + * + * Instead, do dynamic linking to the nfsd module (via nfs_common module). The + * nfs_common module will only hold a reference on nfsd when localio is in use. + * This allows some sanity checking, like giving up on localio if nfsd isn't loaded. + */ +static DEFINE_SPINLOCK(nfs_to_nfsd_lock); +nfs_to_nfsd_t nfs_to; +EXPORT_SYMBOL_GPL(nfs_to); + +/* Macro to define nfs_to get and put methods, avoids copy-n-paste bugs */ +#define DEFINE_NFS_TO_NFSD_SYMBOL(NFSD_SYMBOL) \ +static nfs_to_##NFSD_SYMBOL##_t get_##NFSD_SYMBOL(void) \ +{ \ + return symbol_request(NFSD_SYMBOL); \ +} \ +static void put_##NFSD_SYMBOL(void) \ +{ \ + symbol_put(NFSD_SYMBOL); \ + nfs_to.NFSD_SYMBOL = NULL; \ +} + +/* The nfs localio code needs to call into nfsd to map filehandle -> struct nfsd_file */ +extern struct nfs_localio_ctx * +nfsd_open_local_fh(nfs_uuid_t *, struct rpc_clnt *, + const struct cred *, const struct nfs_fh *, const fmode_t); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_open_local_fh); + +/* The nfs localio code needs to call into nfsd to acquire the nfsd_file */ +extern struct nfsd_file *nfsd_file_get(struct nfsd_file *nf); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_file_get); + +/* The nfs localio code needs to call into nfsd to release the nfsd_file */ +extern void nfsd_file_put(struct nfsd_file *nf); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_file_put); + +/* The nfs localio code needs to call into nfsd to access the nf->nf_file */ +extern struct file * nfsd_file_file(struct nfsd_file *nf); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_file_file); + +/* The nfs localio code needs to call into nfsd to release nn->nfsd_serv */ +extern void nfsd_serv_put(struct nfsd_net *nn); +DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_serv_put); +#undef DEFINE_NFS_TO_NFSD_SYMBOL + +static struct kmem_cache *nfs_localio_ctx_cache; + +struct nfs_localio_ctx *nfs_localio_ctx_alloc(void) +{ + return kmem_cache_alloc(nfs_localio_ctx_cache, + GFP_KERNEL | __GFP_ZERO); +} +EXPORT_SYMBOL_GPL(nfs_localio_ctx_alloc); + +void nfs_localio_ctx_free(struct nfs_localio_ctx *localio) +{ + if (localio->nf) + nfs_to.nfsd_file_put(localio->nf); + if (localio->nn) + nfs_to.nfsd_serv_put(localio->nn); + kmem_cache_free(nfs_localio_ctx_cache, localio); +} +EXPORT_SYMBOL_GPL(nfs_localio_ctx_free); + +bool get_nfs_to_nfsd_symbols(void) +{ + spin_lock(&nfs_to_nfsd_lock); + + /* Only get symbols on first reference */ + if (refcount_read(&nfs_to.ref) == 0) + refcount_set(&nfs_to.ref, 1); + else { + refcount_inc(&nfs_to.ref); + spin_unlock(&nfs_to_nfsd_lock); + return true; + } + + nfs_to.nfsd_open_local_fh = get_nfsd_open_local_fh(); + if (!nfs_to.nfsd_open_local_fh) + goto out_nfsd_open_local_fh; + + nfs_to.nfsd_file_get = get_nfsd_file_get(); + if (!nfs_to.nfsd_file_get) + goto out_nfsd_file_get; + + nfs_to.nfsd_file_put = get_nfsd_file_put(); + if (!nfs_to.nfsd_file_put) + goto out_nfsd_file_put; + + nfs_to.nfsd_file_file = get_nfsd_file_file(); + if (!nfs_to.nfsd_file_file) + goto out_nfsd_file_file; + + nfs_to.nfsd_serv_put = get_nfsd_serv_put(); + if (!nfs_to.nfsd_serv_put) + goto out_nfsd_serv_put; + + spin_unlock(&nfs_to_nfsd_lock); + return true; + +out_nfsd_serv_put: + put_nfsd_file_file(); +out_nfsd_file_file: + put_nfsd_file_put(); +out_nfsd_file_put: + put_nfsd_file_get(); +out_nfsd_file_get: + put_nfsd_open_local_fh(); +out_nfsd_open_local_fh: + spin_unlock(&nfs_to_nfsd_lock); + return false; +} +EXPORT_SYMBOL_GPL(get_nfs_to_nfsd_symbols); + +void put_nfs_to_nfsd_symbols(void) +{ + spin_lock(&nfs_to_nfsd_lock); + + if (!refcount_dec_and_test(&nfs_to.ref)) + goto out; + + put_nfsd_open_local_fh(); + put_nfsd_file_get(); + put_nfsd_file_put(); + put_nfsd_file_file(); + put_nfsd_serv_put(); +out: + spin_unlock(&nfs_to_nfsd_lock); +} +EXPORT_SYMBOL_GPL(put_nfs_to_nfsd_symbols); + +static int __init nfslocalio_init(void) +{ + refcount_set(&nfs_to.ref, 0); + + nfs_to.nfsd_open_local_fh = NULL; + nfs_to.nfsd_file_get = NULL; + nfs_to.nfsd_file_put = NULL; + nfs_to.nfsd_file_file = NULL; + nfs_to.nfsd_serv_put = NULL; + + nfs_localio_ctx_cache = KMEM_CACHE(nfs_localio_ctx, 0); + if (!nfs_localio_ctx_cache) + return -ENOMEM; + + return 0; +} + +static void __exit nfslocalio_exit(void) +{ + kmem_cache_destroy(nfs_localio_ctx_cache); +} + +module_init(nfslocalio_init); +module_exit(nfslocalio_exit); diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 2dc72de31f61..a83d469bca6b 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -39,6 +39,7 @@ #include #include #include +#include #include "vfs.h" #include "nfsd.h" @@ -345,6 +346,10 @@ nfsd_file_get(struct nfsd_file *nf) return nf; return NULL; } +EXPORT_SYMBOL_GPL(nfsd_file_get); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_file_get_t __maybe_unused nfsd_file_get_typecheck = nfsd_file_get; /** * nfsd_file_put - put the reference to a nfsd_file @@ -389,6 +394,26 @@ nfsd_file_put(struct nfsd_file *nf) if (refcount_dec_and_test(&nf->nf_ref)) nfsd_file_free(nf); } +EXPORT_SYMBOL_GPL(nfsd_file_put); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_file_put_t __maybe_unused nfsd_file_put_typecheck = nfsd_file_put; + +/** + * nfsd_file_file - get the backing file of an nfsd_file + * @nf: nfsd_file of which to access the backing file. + * + * Return backing file for @nf. + */ +struct file * +nfsd_file_file(struct nfsd_file *nf) +{ + return nf->nf_file; +} +EXPORT_SYMBOL_GPL(nfsd_file_file); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_file_file_t __maybe_unused nfsd_file_file_typecheck = nfsd_file_file; static void nfsd_file_dispose_list(struct list_head *dispose) diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h index 26ada78b8c1e..6fbbb2e32e95 100644 --- a/fs/nfsd/filecache.h +++ b/fs/nfsd/filecache.h @@ -56,6 +56,7 @@ int nfsd_file_cache_start_net(struct net *net); void nfsd_file_cache_shutdown_net(struct net *net); void nfsd_file_put(struct nfsd_file *nf); struct nfsd_file *nfsd_file_get(struct nfsd_file *nf); +struct file *nfsd_file_file(struct nfsd_file *nf); void nfsd_file_close_inode_sync(struct inode *inode); void nfsd_file_net_dispose(struct nfsd_net *nn); bool nfsd_file_is_cached(struct inode *inode); diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index c639fbe4d8c2..13c69aa40d1c 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -201,6 +202,10 @@ void nfsd_serv_put(struct nfsd_net *nn) { percpu_ref_put(&nn->nfsd_serv_ref); } +EXPORT_SYMBOL_GPL(nfsd_serv_put); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_serv_put_t __maybe_unused nfsd_serv_put_typecheck = nfsd_serv_put; static void nfsd_serv_done(struct percpu_ref *ref) { diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h index a8216a777b40..e196f716a2f5 100644 --- a/include/linux/nfslocalio.h +++ b/include/linux/nfslocalio.h @@ -7,6 +7,8 @@ #include #include +#include +#include #include #include #include @@ -31,4 +33,40 @@ void nfs_uuid_is_local(const uuid_t *, struct list_head *, void nfs_uuid_invalidate_clients(struct list_head *list); void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid); +struct nfsd_file; +struct nfsd_net; + +struct nfs_localio_ctx { + struct nfsd_file *nf; + struct nfsd_net *nn; +}; + +typedef struct nfs_localio_ctx * +(*nfs_to_nfsd_open_local_fh_t)(nfs_uuid_t *, + struct rpc_clnt *, const struct cred *, + const struct nfs_fh *, const fmode_t); +typedef struct nfsd_file * (*nfs_to_nfsd_file_get_t)(struct nfsd_file *); +typedef void (*nfs_to_nfsd_file_put_t)(struct nfsd_file *); +typedef struct file * (*nfs_to_nfsd_file_file_t)(struct nfsd_file *); +typedef unsigned int (*nfs_to_nfsd_net_id_value_t)(void); +typedef void (*nfs_to_nfsd_serv_put_t)(struct nfsd_net *); + +typedef struct { + refcount_t ref; + nfs_to_nfsd_open_local_fh_t nfsd_open_local_fh; + nfs_to_nfsd_file_get_t nfsd_file_get; + nfs_to_nfsd_file_put_t nfsd_file_put; + nfs_to_nfsd_file_file_t nfsd_file_file; + nfs_to_nfsd_net_id_value_t nfsd_net_id_value; + nfs_to_nfsd_serv_put_t nfsd_serv_put; +} nfs_to_nfsd_t; + +extern nfs_to_nfsd_t nfs_to; + +bool get_nfs_to_nfsd_symbols(void); +void put_nfs_to_nfsd_symbols(void); + +struct nfs_localio_ctx *nfs_localio_ctx_alloc(void); +void nfs_localio_ctx_free(struct nfs_localio_ctx *); + #endif /* __LINUX_NFSLOCALIO_H */ From patchwork Fri Aug 30 02:20:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13784132 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 78D80380 for ; Fri, 30 Aug 2024 02:39:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985564; cv=none; b=f1l2/tNFzJxytgLN179H497zmulszfHm5imK2t/VpAwCTQG/46oTzD4DAuZyYj5vBD/RBrXQcMrK78wvdYml8H1rqsNC/TZ+NC2tZ9D2K+c9UdbyktBa41UuV7zA5Lljvq1mAypHctmirrgSiRpam0Qb7Dq7XFwcm5eM8fZrTL0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985564; c=relaxed/simple; bh=SDvzwGzCfhJkpAXLsrRd0l3EsfEnfxbz3hpE2Q8HgH4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=a2b8MUM5PZueZUXcykTPSaaVFui9sZJnNG2kxBtH35X3mjO7m5Q5qA+r6Yf1OVBMcwlk/o/Ar2FoxfMRraB16xZO/rDCXEVoBuJVzOessV7C8IwKpKJ+ZAVPSN8og/cLAKgt76DhCgstdHp3F+Ltu3eNmWhbx2A5GwmNQ6h/Rf0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=pszK6c7s; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=4TKTpaJH; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=pszK6c7s; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=4TKTpaJH; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="pszK6c7s"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="4TKTpaJH"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="pszK6c7s"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="4TKTpaJH" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 853031F791; Fri, 30 Aug 2024 02:39:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985560; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0hbqfkB6NmBxaH4zP97vUx+kPECMEqpAI3++Ag86qmw=; b=pszK6c7sNcLEHTheiC89sylRE20uvr8xle0WrS5+jz/EB4F23iMHKU5jvlgg7uZZyR2WhO PBIDAte/pb2Mlit4E1K+KrN4ATaydTrUyLpAW7TBEcQJI4h2en9xaxvKZkq5g2sjRID7dF SDTj4hMvfEY10bElcNlg8B2t0sreLkY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985560; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0hbqfkB6NmBxaH4zP97vUx+kPECMEqpAI3++Ag86qmw=; b=4TKTpaJHCC89FXAIzHVvje0jMt6cedjsXm+bnHPJSlUk8CDOFNidNFI7P3Bs3YI3NCW47s DkttuVwb9gqFMPAQ== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=pszK6c7s; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=4TKTpaJH DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985560; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0hbqfkB6NmBxaH4zP97vUx+kPECMEqpAI3++Ag86qmw=; b=pszK6c7sNcLEHTheiC89sylRE20uvr8xle0WrS5+jz/EB4F23iMHKU5jvlgg7uZZyR2WhO PBIDAte/pb2Mlit4E1K+KrN4ATaydTrUyLpAW7TBEcQJI4h2en9xaxvKZkq5g2sjRID7dF SDTj4hMvfEY10bElcNlg8B2t0sreLkY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985560; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0hbqfkB6NmBxaH4zP97vUx+kPECMEqpAI3++Ag86qmw=; b=4TKTpaJHCC89FXAIzHVvje0jMt6cedjsXm+bnHPJSlUk8CDOFNidNFI7P3Bs3YI3NCW47s DkttuVwb9gqFMPAQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id AF258136A4; Fri, 30 Aug 2024 02:39:18 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id Z14vGdYw0WbfFwAAD6G6ig (envelope-from ); Fri, 30 Aug 2024 02:39:18 +0000 From: NeilBrown To: Mike Snitzer , Chuck Lever , Jeff Layton Cc: linux-nfs@vger.kernel.org Subject: [PATCH 16/25] nfsd: add localio support Date: Fri, 30 Aug 2024 12:20:29 +1000 Message-ID: <20240830023531.29421-17-neilb@suse.de> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240830023531.29421-1-neilb@suse.de> References: <20240830023531.29421-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 853031F791 X-Spam-Score: -3.01 X-Rspamd-Action: no action X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; DNSWL_BLOCKED(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RCPT_COUNT_THREE(0.00)[4]; RCVD_TLS_ALL(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; DWL_DNSWL_BLOCKED(0.00)[suse.de:dkim]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid,suse.de:dkim]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; DKIM_TRACE(0.00)[suse.de:+] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spam-Flag: NO X-Spam-Level: From: Weston Andros Adamson Add server support for bypassing NFS for localhost reads, writes, and commits. This is only useful when both the client and server are running on the same host. If nfsd_open_local_fh() fails then the NFS client will both retry and fallback to normal network-based read, write and commit operations if localio is no longer supported. Care is taken to ensure the same NFS security mechanisms are used (authentication, etc) regardless of whether localio or regular NFS access is used. The auth_domain established as part of the traditional NFS client access to the NFS server is also used for localio. Store auth_domain for localio in nfsd_uuid_t and transfer it to the client if it is local to the server. Relative to containers, localio gives the client access to the network namespace the server has. This is required to allow the client to access the server's per-namespace nfsd_net struct. CONFIG_NFSD_LOCALIO controls the server enablement for localio. A later commit will add CONFIG_NFS_LOCALIO to allow the client enablement. This commit also introduces the use of nfsd's percpu_ref to interlock nfsd_destroy_serv and nfsd_open_local_fh, to ensure nn->nfsd_serv is not destroyed while in use by nfsd_open_local_fh, and warrants a more detailed explanation: nfsd_open_local_fh uses nfsd_serv_try_get before opening its file handle and then the reference must be dropped by the caller using nfsd_serv_put (via nfs_localio_ctx_free). Verified to fix an easy to hit crash that would occur if an nfsd instance running in a container, with a localio client mounted, is shutdown. Upon restart of the container and associated nfsd the client would go on to crash due to NULL pointer dereference that occurred due to the nfs client's localio attempting to nfsd_open_local_fh(), using nn->nfsd_serv, without having a proper reference on nn->nfsd_serv. Signed-off-by: Weston Andros Adamson Signed-off-by: Trond Myklebust Co-developed-by: Mike Snitzer Signed-off-by: Mike Snitzer Acked-by: Chuck Lever Reviewed-by: Jeff Layton --- fs/Kconfig | 3 ++ fs/nfsd/Kconfig | 16 +++++++ fs/nfsd/Makefile | 1 + fs/nfsd/filecache.c | 2 +- fs/nfsd/localio.c | 114 ++++++++++++++++++++++++++++++++++++++++++++ fs/nfsd/trace.h | 3 +- fs/nfsd/vfs.h | 8 ++++ 7 files changed, 145 insertions(+), 2 deletions(-) create mode 100644 fs/nfsd/localio.c diff --git a/fs/Kconfig b/fs/Kconfig index a46b0cbc4d8f..1b8a5edbddff 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -377,6 +377,9 @@ config NFS_ACL_SUPPORT tristate select FS_POSIX_ACL +config NFS_COMMON_LOCALIO_SUPPORT + bool + config NFS_COMMON bool depends on NFSD || NFS_FS || LOCKD diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig index c0bd1509ccd4..e6fa7eaa1db0 100644 --- a/fs/nfsd/Kconfig +++ b/fs/nfsd/Kconfig @@ -90,6 +90,22 @@ config NFSD_V4 If unsure, say N. +config NFSD_LOCALIO + bool "NFS server support for the LOCALIO auxiliary protocol" + depends on NFSD + select NFS_COMMON_LOCALIO_SUPPORT + default n + help + Some NFS servers support an auxiliary NFS LOCALIO protocol + that is not an official part of the NFS protocol. + + This option enables support for the LOCALIO protocol in the + kernel's NFS server. Enable this to permit local NFS clients + to bypass the network when issuing reads and writes to the + local NFS server. + + If unsure, say N. + config NFSD_PNFS bool diff --git a/fs/nfsd/Makefile b/fs/nfsd/Makefile index b8736a82e57c..78b421778a79 100644 --- a/fs/nfsd/Makefile +++ b/fs/nfsd/Makefile @@ -23,3 +23,4 @@ nfsd-$(CONFIG_NFSD_PNFS) += nfs4layouts.o nfsd-$(CONFIG_NFSD_BLOCKLAYOUT) += blocklayout.o blocklayoutxdr.o nfsd-$(CONFIG_NFSD_SCSILAYOUT) += blocklayout.o blocklayoutxdr.o nfsd-$(CONFIG_NFSD_FLEXFILELAYOUT) += flexfilelayout.o flexfilelayoutxdr.o +nfsd-$(CONFIG_NFSD_LOCALIO) += localio.o diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index a83d469bca6b..49f4aab3208a 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -53,7 +53,7 @@ #define NFSD_FILE_CACHE_UP (0) /* We only care about NFSD_MAY_READ/WRITE for this cache */ -#define NFSD_FILE_MAY_MASK (NFSD_MAY_READ|NFSD_MAY_WRITE) +#define NFSD_FILE_MAY_MASK (NFSD_MAY_READ|NFSD_MAY_WRITE|NFSD_MAY_LOCALIO) static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits); static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions); diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c new file mode 100644 index 000000000000..637402eecb61 --- /dev/null +++ b/fs/nfsd/localio.c @@ -0,0 +1,114 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * NFS server support for local clients to bypass network stack + * + * Copyright (C) 2014 Weston Andros Adamson + * Copyright (C) 2019 Trond Myklebust + * Copyright (C) 2024 Mike Snitzer + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "nfsd.h" +#include "vfs.h" +#include "netns.h" +#include "filecache.h" + +/** + * nfsd_open_local_fh - lookup a local filehandle @nfs_fh and map to nfsd_file + * + * @cl_nfssvc_net: the 'struct net' to use to get the proper nfsd_net + * @cl_nfssvc_dom: the 'struct auth_domain' required for localio access + * @rpc_clnt: rpc_clnt that the client established, used for sockaddr and cred + * @cred: cred that the client established + * @nfs_fh: filehandle to lookup + * @fmode: fmode_t to use for open + * + * This function maps a local fh to a path on a local filesystem. + * This is useful when the nfs client has the local server mounted - it can + * avoid all the NFS overhead with reads, writes and commits. + * + * On successful return, returned nfs_localio_ctx will have its nfsd_file and + * nfsd_net members set. Caller is responsible for calling nfsd_file_put and + * nfsd_serv_put (via nfs_localio_ctx_free). + */ +struct nfs_localio_ctx * +nfsd_open_local_fh(nfs_uuid_t *uuid, + struct rpc_clnt *rpc_clnt, const struct cred *cred, + const struct nfs_fh *nfs_fh, const fmode_t fmode) +{ + int mayflags = NFSD_MAY_LOCALIO; + int status = 0; + struct nfsd_net *nn = NULL; + struct net *net; + struct svc_cred rq_cred; + struct svc_fh fh; + struct nfs_localio_ctx *localio; + __be32 beres; + + if (nfs_fh->size > NFS4_FHSIZE) + return ERR_PTR(-EINVAL); + + localio = nfs_localio_ctx_alloc(); + if (!localio) + return ERR_PTR(-ENOMEM); + + /* + * Not running in nfsd context, so must safely get reference on nfsd_serv. + * But the server may already be shutting down, if so disallow new localio. + * uuid->net is NOT a counted reference, but rcu_read_lock() ensures that if + * uuid->net is not NULL, then nfsd_serv_try_get() is safe and if that succeeds + * we will have an implied reference to the net. + */ + rcu_read_lock(); + net = READ_ONCE(uuid->net); + if (net) + nn = net_generic(net, nfsd_net_id); + if (unlikely(!nn || !nfsd_serv_try_get(nn))) { + rcu_read_unlock(); + status = -ENXIO; + goto out_nfsd_serv; + } + rcu_read_unlock(); + + /* nfs_fh -> svc_fh */ + fh_init(&fh, NFS4_FHSIZE); + fh.fh_handle.fh_size = nfs_fh->size; + memcpy(fh.fh_handle.fh_raw, nfs_fh->data, nfs_fh->size); + + if (fmode & FMODE_READ) + mayflags |= NFSD_MAY_READ; + if (fmode & FMODE_WRITE) + mayflags |= NFSD_MAY_WRITE; + + svcauth_map_clnt_to_svc_cred_local(rpc_clnt, cred, &rq_cred); + + beres = nfsd_file_acquire_local(uuid->net, &rq_cred, uuid->dom, + &fh, mayflags, &localio->nf); + if (beres) { + status = nfs_stat_to_errno(be32_to_cpu(beres)); + goto out_fh_put; + } + localio->nn = nn; + +out_fh_put: + fh_put(&fh); + if (rq_cred.cr_group_info) + put_group_info(rq_cred.cr_group_info); +out_nfsd_serv: + if (status) { + nfs_localio_ctx_free(localio); + return ERR_PTR(status); + } + return localio; +} +EXPORT_SYMBOL_GPL(nfsd_open_local_fh); + +/* Compile time type checking, not used by anything */ +static nfs_to_nfsd_open_local_fh_t __maybe_unused nfsd_open_local_fh_typecheck = nfsd_open_local_fh; diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h index d22027e23761..82bcefcd1f21 100644 --- a/fs/nfsd/trace.h +++ b/fs/nfsd/trace.h @@ -86,7 +86,8 @@ DEFINE_NFSD_XDR_ERR_EVENT(cant_encode); { NFSD_MAY_NOT_BREAK_LEASE, "NOT_BREAK_LEASE" }, \ { NFSD_MAY_BYPASS_GSS, "BYPASS_GSS" }, \ { NFSD_MAY_READ_IF_EXEC, "READ_IF_EXEC" }, \ - { NFSD_MAY_64BIT_COOKIE, "64BIT_COOKIE" }) + { NFSD_MAY_64BIT_COOKIE, "64BIT_COOKIE" }, \ + { NFSD_MAY_LOCALIO, "LOCALIO" }) TRACE_EVENT(nfsd_compound, TP_PROTO( diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index 01947561d375..2ecceb8b9d3d 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -8,6 +8,7 @@ #include #include +#include #include "nfsfh.h" #include "nfsd.h" @@ -33,6 +34,8 @@ #define NFSD_MAY_64BIT_COOKIE 0x1000 /* 64 bit readdir cookies for >= NFSv3 */ +#define NFSD_MAY_LOCALIO 0x2000 /* for tracing, reflects when localio used */ + #define NFSD_MAY_CREATE (NFSD_MAY_EXEC|NFSD_MAY_WRITE) #define NFSD_MAY_REMOVE (NFSD_MAY_EXEC|NFSD_MAY_WRITE|NFSD_MAY_TRUNC) @@ -158,6 +161,11 @@ __be32 nfsd_permission(struct svc_cred *cred, struct svc_export *exp, void nfsd_filp_close(struct file *fp); +struct nfs_localio_ctx * +nfsd_open_local_fh(nfs_uuid_t *, + struct rpc_clnt *, const struct cred *, + const struct nfs_fh *, const fmode_t); + static inline int fh_want_write(struct svc_fh *fh) { int ret; From patchwork Fri Aug 30 02:20:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13784133 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F7C64D8CB for ; Fri, 30 Aug 2024 02:39:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985584; cv=none; b=fVgm25de6EmVzRy68GIp3dAL3WFSfqLDakowHUmxUqhCfHBQ7n9gv+++ctqNO++urU8cbr1Zegjy4MKlRuhdNbgYVUz+rboRMY4bJd2P4jV2fBMXmiCOAYqNNZUN3/O9vELI2mMyadaVdck2zjafAJrYOn+HG4h8PmulqFeoqXA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985584; c=relaxed/simple; bh=ZkB9Je2W03YxRPtD5kV6VCvi1IJ4zrWAq6oFboSeQ5M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=M5UQKLMEzQdkRpTtxvpBekECxUYqejG9+hC/JDv7mCnEhECTlkaoEAShntC7XyOiUws/WS6oxAq/2W45oRQ6GVo5Mb921kSaMR9Mk9YaHM0Kk1T23BfksaXllsibicvhRoxMwTwac2JSw8zPTI2O0WjjuJN4hx4OpLPFGgeBjYQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=oVhB6u1Z; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=IJGR9Uyq; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=oVhB6u1Z; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=IJGR9Uyq; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="oVhB6u1Z"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="IJGR9Uyq"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="oVhB6u1Z"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="IJGR9Uyq" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 7CFB31F791; Fri, 30 Aug 2024 02:39:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985580; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Vj0UU+McX2u5MOjAxzpLgEYERfLGW4+J5tFgLGXNUQA=; b=oVhB6u1ZVPsc5bnZzpOJxfokChni4csZCclqOZGHuiLFxLja+YCBXVD4g/MwJOd1POua0d SUudSYuKYgjYGufdnf5VEfE59JTnNI7Nk8pUnzGMJZRm/MRmXUQ/AEthSeCyJf3nWaxGWv xQXXo0cUZepi4HLyxr3eiOJK8XD1zFQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985580; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Vj0UU+McX2u5MOjAxzpLgEYERfLGW4+J5tFgLGXNUQA=; b=IJGR9Uyq4nBgGR4gA5NxCdf0/q/251tZvTnUPYu5tFRrNx1h4EZU/uktThC+0aO0lCzSVv 2M6FbfE/kS5PYJCA== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985580; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Vj0UU+McX2u5MOjAxzpLgEYERfLGW4+J5tFgLGXNUQA=; b=oVhB6u1ZVPsc5bnZzpOJxfokChni4csZCclqOZGHuiLFxLja+YCBXVD4g/MwJOd1POua0d SUudSYuKYgjYGufdnf5VEfE59JTnNI7Nk8pUnzGMJZRm/MRmXUQ/AEthSeCyJf3nWaxGWv xQXXo0cUZepi4HLyxr3eiOJK8XD1zFQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985580; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Vj0UU+McX2u5MOjAxzpLgEYERfLGW4+J5tFgLGXNUQA=; b=IJGR9Uyq4nBgGR4gA5NxCdf0/q/251tZvTnUPYu5tFRrNx1h4EZU/uktThC+0aO0lCzSVv 2M6FbfE/kS5PYJCA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id EF163136A4; Fri, 30 Aug 2024 02:39:38 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id V3LJKOow0Wb4FwAAD6G6ig (envelope-from ); Fri, 30 Aug 2024 02:39:38 +0000 From: NeilBrown To: Mike Snitzer , Chuck Lever , Jeff Layton Cc: linux-nfs@vger.kernel.org Subject: [PATCH 17/25] nfsd: implement server support for NFS_LOCALIO_PROGRAM Date: Fri, 30 Aug 2024 12:20:30 +1000 Message-ID: <20240830023531.29421-18-neilb@suse.de> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240830023531.29421-1-neilb@suse.de> References: <20240830023531.29421-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Score: -2.80 X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid]; RCVD_TLS_ALL(0.00)[] X-Spam-Flag: NO X-Spam-Level: From: Mike Snitzer The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL" RPC method that allows the Linux NFS client to verify the local Linux NFS server can see the nonce (single-use UUID) the client generated and made available in nfs_common. The server expects this protocol to use the same transport as NFS and NFSACL for its RPCs. This protocol isn't part of an IETF standard, nor does it need to be considering it is Linux-to-Linux auxiliary RPC protocol that amounts to an implementation detail. The UUID_IS_LOCAL method encodes the client generated uuid_t in terms of the fixed UUID_SIZE (16 bytes). The fixed size opaque encode and decode XDR methods are used instead of the less efficient variable sized methods. The RPC program number for the NFS_LOCALIO_PROGRAM is 400122 (as assigned by IANA, see https://www.iana.org/assignments/rpc-program-numbers/ ): Linux Kernel Organization 400122 nfslocalio After a successful handshake the client will hold a non-counted reference to the server's network namespace. On namespace shutdown these non-counted references will be invalidated. Signed-off-by: Mike Snitzer [neilb: factored out and simplified single localio protocol] Co-developed-by: NeilBrown Signed-off-by: NeilBrown Acked-by: Chuck Lever Reviewed-by: Jeff Layton --- fs/nfsd/localio.c | 77 +++++++++++++++++++++++++++++++++++++++++++++ fs/nfsd/netns.h | 2 ++ fs/nfsd/nfsctl.c | 16 ++++++++++ fs/nfsd/nfsd.h | 4 +++ fs/nfsd/nfssvc.c | 23 +++++++++++++- include/linux/nfs.h | 7 +++++ 6 files changed, 128 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c index 637402eecb61..491bf5017d34 100644 --- a/fs/nfsd/localio.c +++ b/fs/nfsd/localio.c @@ -13,12 +13,15 @@ #include #include #include +#include +#include #include #include "nfsd.h" #include "vfs.h" #include "netns.h" #include "filecache.h" +#include "cache.h" /** * nfsd_open_local_fh - lookup a local filehandle @nfs_fh and map to nfsd_file @@ -112,3 +115,77 @@ EXPORT_SYMBOL_GPL(nfsd_open_local_fh); /* Compile time type checking, not used by anything */ static nfs_to_nfsd_open_local_fh_t __maybe_unused nfsd_open_local_fh_typecheck = nfsd_open_local_fh; + +/* + * UUID_IS_LOCAL XDR functions + */ + +static __be32 localio_proc_null(struct svc_rqst *rqstp) +{ + return rpc_success; +} + +struct localio_uuidarg { + uuid_t uuid; +}; + +static __be32 localio_proc_uuid_is_local(struct svc_rqst *rqstp) +{ + struct localio_uuidarg *argp = rqstp->rq_argp; + struct net *net = SVC_NET(rqstp); + struct nfsd_net *nn = net_generic(net, nfsd_net_id); + + nfs_uuid_is_local(&argp->uuid, &nn->local_clients, + net, rqstp->rq_client); + + return rpc_success; +} + +static bool localio_decode_uuidarg(struct svc_rqst *rqstp, + struct xdr_stream *xdr) +{ + struct localio_uuidarg *argp = rqstp->rq_argp; + u8 uuid[UUID_SIZE]; + + if (decode_opaque_fixed(xdr, uuid, UUID_SIZE)) + return false; + import_uuid(&argp->uuid, uuid); + + return true; +} + +static const struct svc_procedure localio_procedures1[] = { + [LOCALIOPROC_NULL] = { + .pc_func = localio_proc_null, + .pc_decode = nfssvc_decode_voidarg, + .pc_encode = nfssvc_encode_voidres, + .pc_argsize = sizeof(struct nfsd_voidargs), + .pc_ressize = sizeof(struct nfsd_voidres), + .pc_cachetype = RC_NOCACHE, + .pc_xdrressize = 0, + .pc_name = "NULL", + }, + [LOCALIOPROC_UUID_IS_LOCAL] = { + .pc_func = localio_proc_uuid_is_local, + .pc_decode = localio_decode_uuidarg, + .pc_encode = nfssvc_encode_voidres, + .pc_argsize = sizeof(struct localio_uuidarg), + .pc_argzero = sizeof(struct localio_uuidarg), + .pc_ressize = sizeof(struct nfsd_voidres), + .pc_cachetype = RC_NOCACHE, + .pc_name = "UUID_IS_LOCAL", + }, +}; + +#define LOCALIO_NR_PROCEDURES ARRAY_SIZE(localio_procedures1) +static DEFINE_PER_CPU_ALIGNED(unsigned long, + localio_count[LOCALIO_NR_PROCEDURES]); +const struct svc_version localio_version1 = { + .vs_vers = 1, + .vs_nproc = LOCALIO_NR_PROCEDURES, + .vs_proc = localio_procedures1, + .vs_dispatch = nfsd_dispatch, + .vs_count = localio_count, + .vs_xdrsize = XDR_QUADLEN(UUID_SIZE), + .vs_hidden = true, +}; diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index e2d953f21dde..9c65db8d3f44 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -216,6 +216,8 @@ struct nfsd_net { /* last time an admin-revoke happened for NFSv4.0 */ time64_t nfs40_last_revoke; + /* Local clients to be invalidated when net is shut down */ + struct list_head local_clients; }; /* Simple check to find out if a given net was properly initialized */ diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 64c1b4d649bc..01e383d692ab 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -18,6 +18,7 @@ #include #include #include +#include #include "idmap.h" #include "nfsd.h" @@ -2257,6 +2258,7 @@ static __net_init int nfsd_net_init(struct net *net) get_random_bytes(&nn->siphash_key, sizeof(nn->siphash_key)); seqlock_init(&nn->writeverf_lock); nfsd_proc_stat_init(net); + INIT_LIST_HEAD(&nn->local_clients); return 0; @@ -2268,6 +2270,19 @@ static __net_init int nfsd_net_init(struct net *net) return retval; } +/** + * nfsd_net_pre_exit - Disconnect localio clients from net namespace + * @net: a network namespace that is about to be destroyed + * + * This invalidated ->net pointers held by localio clients + * while they can still safely access nn->counter. + */ +static __net_exit void nfsd_net_pre_exit(struct net *net) +{ + struct nfsd_net *nn = net_generic(net, nfsd_net_id); + + nfs_uuid_invalidate_clients(&nn->local_clients); +} /** * nfsd_net_exit - Release the nfsd_net portion of a net namespace * @net: a network namespace that is about to be destroyed @@ -2285,6 +2300,7 @@ static __net_exit void nfsd_net_exit(struct net *net) static struct pernet_operations nfsd_net_ops = { .init = nfsd_net_init, + .pre_exit = nfsd_net_pre_exit, .exit = nfsd_net_exit, .id = &nfsd_net_id, .size = sizeof(struct nfsd_net), diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h index b0d3e82d6dcd..232a873dc53a 100644 --- a/fs/nfsd/nfsd.h +++ b/fs/nfsd/nfsd.h @@ -146,6 +146,10 @@ extern const struct svc_version nfsd_acl_version3; #endif #endif +#if IS_ENABLED(CONFIG_NFSD_LOCALIO) +extern const struct svc_version localio_version1; +#endif + struct nfsd_net; enum vers_op {NFSD_SET, NFSD_CLEAR, NFSD_TEST, NFSD_AVAIL }; diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index 13c69aa40d1c..eec4a9803c4a 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -80,6 +80,15 @@ DEFINE_SPINLOCK(nfsd_drc_lock); unsigned long nfsd_drc_max_mem; unsigned long nfsd_drc_mem_used; +#if IS_ENABLED(CONFIG_NFSD_LOCALIO) +static const struct svc_version *localio_versions[] = { + [1] = &localio_version1, +}; + +#define NFSD_LOCALIO_NRVERS ARRAY_SIZE(localio_versions) + +#endif /* CONFIG_NFSD_LOCALIO */ + #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) static const struct svc_version *nfsd_acl_version[] = { # if defined(CONFIG_NFSD_V2_ACL) @@ -128,6 +137,18 @@ struct svc_program nfsd_programs[] = { .pg_rpcbind_set = nfsd_acl_rpcbind_set, }, #endif /* defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) */ +#if IS_ENABLED(CONFIG_NFSD_LOCALIO) + { + .pg_prog = NFS_LOCALIO_PROGRAM, + .pg_nvers = NFSD_LOCALIO_NRVERS, + .pg_vers = localio_versions, + .pg_name = "nfslocalio", + .pg_class = "nfsd", + .pg_authenticate = svc_set_client, + .pg_init_request = svc_generic_init_request, + .pg_rpcbind_set = svc_generic_rpcbind_set, + } +#endif /* IS_ENABLED(CONFIG_NFSD_LOCALIO) */ }; bool nfsd_support_version(int vers) @@ -949,7 +970,7 @@ nfsd(void *vrqstp) } /** - * nfsd_dispatch - Process an NFS or NFSACL Request + * nfsd_dispatch - Process an NFS or NFSACL or LOCALIO Request * @rqstp: incoming request * * This RPC dispatcher integrates the NFS server's duplicate reply cache. diff --git a/include/linux/nfs.h b/include/linux/nfs.h index ceb70a926b95..5ff1a5b3b00c 100644 --- a/include/linux/nfs.h +++ b/include/linux/nfs.h @@ -13,6 +13,13 @@ #include #include +/* The localio program is entirely private to Linux and is + * NOT part of the uapi. + */ +#define NFS_LOCALIO_PROGRAM 400122 +#define LOCALIOPROC_NULL 0 +#define LOCALIOPROC_UUID_IS_LOCAL 1 + /* * This is the kernel NFS client file handle representation */ From patchwork Fri Aug 30 02:20:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13784134 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E4DB4D8CB for ; Fri, 30 Aug 2024 02:39:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985590; cv=none; b=IeGESK8T2jlv1O39Uzn5yjwvdmcfR5KC1tGvMza3yYVzvveHTNk5S+Yt5fkgClqugIMOkWwqpb/OPKYJfdwyS/4VZAj6fQzrvZVFEovfulwVwxnakuUy2nmb1fL1kSYrkT2GRAaSCpaMgLQpVNDyObNqQh5+MM86bBMfnF9gfCg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985590; c=relaxed/simple; bh=SGLwXmyoT5fLeADz5D68SnvTvv76eJZycgQy8QOmdJw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nRLVAIIpny4bnaBLOOu1U3ZcI01lxXVd3IUwgX07vdo/KgimksMhqFMcCPBk7heIsN0MY7yMs7Xg6ui3DDEY81qd2uSurvo6irZysT8hTVowNewIalbJ2w/tXbA24RwgyHJna6SDYoJe4mPa+dZyOJsTn4+bGKTC6sdfBeGNCbc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=XrJOieYh; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=qC9PbZEn; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=tXwxxVZP; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=ImsHkmer; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="XrJOieYh"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="qC9PbZEn"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="tXwxxVZP"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="ImsHkmer" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id D060E219F0; Fri, 30 Aug 2024 02:39:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985586; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BX3zZToWvuqnb8wskXEPFC6DLHM6WkTYTLIaR7BQGA0=; b=XrJOieYhhk/J2dpmRknPsKP9f+Kr6YlYAqDQG8GR8kHGClZHr0kWV6jz8VcZBRh2bwmU/s oRpTjIpjf9AHP2yns802POI0ijjM8YMBvrXPRX6bczf6/uuQzvq1np7lyknKJz4oWZA7z9 JLRcXqO4Cnv7lLCPHWclL+RL5fLBZpA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985586; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BX3zZToWvuqnb8wskXEPFC6DLHM6WkTYTLIaR7BQGA0=; b=qC9PbZEngKRuGvnF4IWKzbcHlysfxTraIwQxkmNP+DhT42koJBkldzSgg3LQ/Ql9a2qCzR i8+j52Jx1K8HhNCA== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985585; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BX3zZToWvuqnb8wskXEPFC6DLHM6WkTYTLIaR7BQGA0=; b=tXwxxVZPGKWlYSUH740Xd1A9znSmCNZTG53T8qqnIe6GW5eJ8dyjO3VgT4j2L5n8aqY1y9 UWtrpxhRs2PyoFYscjf5U3dFwf1VfhJnANm/pNsUMCimiXb5VCXs3/W2WXcSJn8VNphjiv PCBHaWE7n6iLKB6Za64QoF9ozgMRujA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985585; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BX3zZToWvuqnb8wskXEPFC6DLHM6WkTYTLIaR7BQGA0=; b=ImsHkmernnQLV8i+h2PlDdg8adn9SpG37f1hZfumphGAKHzRT5eo1U47/hI5Khlb/x7CxL 8YCB4FwVEcNXL5Bw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 0B0EE136A4; Fri, 30 Aug 2024 02:39:43 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id OGCYLO8w0WYAGAAAD6G6ig (envelope-from ); Fri, 30 Aug 2024 02:39:43 +0000 From: NeilBrown To: Mike Snitzer , Chuck Lever , Jeff Layton Cc: linux-nfs@vger.kernel.org Subject: [PATCH 19/25] nfs: add localio support Date: Fri, 30 Aug 2024 12:20:32 +1000 Message-ID: <20240830023531.29421-20-neilb@suse.de> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240830023531.29421-1-neilb@suse.de> References: <20240830023531.29421-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Score: -2.80 X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid]; RCVD_TLS_ALL(0.00)[] X-Spam-Flag: NO X-Spam-Level: From: Weston Andros Adamson Add client support for bypassing NFS for localhost reads, writes, and commits. This is only useful when the client and the server are running on the same host. nfs_local_probe() is stubbed out, later commits will enable client and server handshake via a Linux-only LOCALIO auxiliary RPC protocol. This has dynamic binding with the nfsd module (via nfs_localio module which is part of nfs_common). Localio will only work if nfsd is already loaded. The "localio_enabled" nfs kernel module parameter can be used to disable and enable the ability to use localio support. CONFIG_NFS_LOCALIO controls the client enablement. Lastly, localio uses an nfsd_file to initiate all IO. To make proper use of nfsd_file (and nfsd's filecache) its lifetime (duration before nfsd_file_put is called) must extend until after commit, read and write operations. So rather than immediately call nfsd_file_put() in nfs_local_open_fh(), nfsd_file_put() isn't called until nfs_local_pgio_release() for read/write and not until nfs_local_release_commit_data() for commit. The same applies to the reference held on nfsd's nn->nfsd_serv. Both object lifetimes and associated references are managed through calls to nfs_localio_ctx_free(). Signed-off-by: Weston Andros Adamson Signed-off-by: Trond Myklebust Co-developed-by: Mike Snitzer Signed-off-by: Mike Snitzer --- fs/nfs/Kconfig | 16 + fs/nfs/Makefile | 1 + fs/nfs/client.c | 10 + fs/nfs/internal.h | 45 +++ fs/nfs/localio.c | 605 ++++++++++++++++++++++++++++++++++++++ fs/nfs/nfstrace.h | 61 ++++ fs/nfs/pagelist.c | 4 + fs/nfs/write.c | 3 + include/linux/nfs.h | 2 + include/linux/nfs_fs_sb.h | 9 + 10 files changed, 756 insertions(+) create mode 100644 fs/nfs/localio.c diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig index 0eb20012792f..9fe3b2709666 100644 --- a/fs/nfs/Kconfig +++ b/fs/nfs/Kconfig @@ -87,6 +87,22 @@ config NFS_V4 If unsure, say Y. +config NFS_LOCALIO + bool "NFS client support for the LOCALIO auxiliary protocol" + depends on NFS_FS + select NFS_COMMON_LOCALIO_SUPPORT + default n + help + Some NFS servers support an auxiliary NFS LOCALIO protocol + that is not an official part of the NFS protocol. + + This option enables support for the LOCALIO protocol in the + kernel's NFS client. Enable this to permit local NFS clients + to bypass the network when issuing reads and writes to the + local NFS server. + + If unsure, say N. + config NFS_SWAP bool "Provide swap over NFS support" default n diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile index 5f6db37f461e..9fb2f2cac87e 100644 --- a/fs/nfs/Makefile +++ b/fs/nfs/Makefile @@ -13,6 +13,7 @@ nfs-y := client.o dir.o file.o getroot.o inode.o super.o \ nfs-$(CONFIG_ROOT_NFS) += nfsroot.o nfs-$(CONFIG_SYSCTL) += sysctl.o nfs-$(CONFIG_NFS_FSCACHE) += fscache.o +nfs-$(CONFIG_NFS_LOCALIO) += localio.o obj-$(CONFIG_NFS_V2) += nfsv2.o nfsv2-y := nfs2super.o proc.o nfs2xdr.o diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 8286edd6062d..a890319d4fab 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -178,6 +178,13 @@ struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_init) clp->cl_max_connect = cl_init->max_connect ? cl_init->max_connect : 1; clp->cl_net = get_net(cl_init->net); +#if IS_ENABLED(CONFIG_NFS_LOCALIO) + seqlock_init(&clp->cl_boot_lock); + ktime_get_real_ts64(&clp->cl_nfssvc_boot); + clp->cl_uuid.net = NULL; + spin_lock_init(&clp->cl_localio_lock); +#endif /* CONFIG_NFS_LOCALIO */ + clp->cl_principal = "*"; clp->cl_xprtsec = cl_init->xprtsec; return clp; @@ -233,6 +240,8 @@ static void pnfs_init_server(struct nfs_server *server) */ void nfs_free_client(struct nfs_client *clp) { + nfs_local_disable(clp); + /* -EIO all pending I/O */ if (!IS_ERR(clp->cl_rpcclient)) rpc_shutdown_client(clp->cl_rpcclient); @@ -424,6 +433,7 @@ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init) list_add_tail(&new->cl_share_link, &nn->nfs_client_list); spin_unlock(&nn->nfs_client_lock); + nfs_local_probe(new); return rpc_ops->init_client(new, cl_init); } diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index d4ab74a61668..0716c90eaf9c 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -451,6 +451,51 @@ extern void nfs_set_cache_invalid(struct inode *inode, unsigned long flags); extern bool nfs_check_cache_invalid(struct inode *, unsigned long); extern int nfs_wait_bit_killable(struct wait_bit_key *key, int mode); +#if IS_ENABLED(CONFIG_NFS_LOCALIO) +/* localio.c */ +extern void nfs_local_disable(struct nfs_client *); +extern void nfs_local_probe(struct nfs_client *); +extern struct nfs_localio_ctx *nfs_local_open_fh(struct nfs_client *, + const struct cred *, + struct nfs_fh *, + const fmode_t); +extern int nfs_local_doio(struct nfs_client *, + struct nfs_localio_ctx *, + struct nfs_pgio_header *, + const struct rpc_call_ops *); +extern int nfs_local_commit(struct nfs_localio_ctx *, + struct nfs_commit_data *, + const struct rpc_call_ops *, int); +extern bool nfs_server_is_local(const struct nfs_client *clp); + +#else +static inline void nfs_local_disable(struct nfs_client *clp) {} +static inline void nfs_local_probe(struct nfs_client *clp) {} +static inline struct nfs_localio_ctx * +nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, + struct nfs_fh *fh, const fmode_t mode) +{ + return NULL; +} +static inline int nfs_local_doio(struct nfs_client *clp, + struct nfs_localio_ctx *localio, + struct nfs_pgio_header *hdr, + const struct rpc_call_ops *call_ops) +{ + return -EINVAL; +} +static inline int nfs_local_commit(struct nfs_localio_ctx *localio, + struct nfs_commit_data *data, + const struct rpc_call_ops *call_ops, int how) +{ + return -EINVAL; +} +static inline bool nfs_server_is_local(const struct nfs_client *clp) +{ + return false; +} +#endif /* CONFIG_NFS_LOCALIO */ + /* super.c */ extern const struct super_operations nfs_sops; bool nfs_auth_info_match(const struct nfs_auth_info *, rpc_authflavor_t); diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c new file mode 100644 index 000000000000..f391badfb999 --- /dev/null +++ b/fs/nfs/localio.c @@ -0,0 +1,605 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * NFS client support for local clients to bypass network stack + * + * Copyright (C) 2014 Weston Andros Adamson + * Copyright (C) 2019 Trond Myklebust + * Copyright (C) 2024 Mike Snitzer + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "internal.h" +#include "pnfs.h" +#include "nfstrace.h" + +#define NFSDBG_FACILITY NFSDBG_VFS + +struct nfs_local_kiocb { + struct kiocb kiocb; + struct bio_vec *bvec; + struct nfs_pgio_header *hdr; + struct work_struct work; + struct nfs_localio_ctx *localio; +}; + +struct nfs_local_fsync_ctx { + struct nfs_localio_ctx *localio; + struct nfs_commit_data *data; + struct work_struct work; + struct kref kref; + struct completion *done; +}; +static void nfs_local_fsync_work(struct work_struct *work); + +static bool localio_enabled __read_mostly = true; +module_param(localio_enabled, bool, 0644); + +bool nfs_server_is_local(const struct nfs_client *clp) +{ + return test_bit(NFS_CS_LOCAL_IO, &clp->cl_flags) != 0 && + localio_enabled; +} +EXPORT_SYMBOL_GPL(nfs_server_is_local); + +/* + * nfs_local_enable - enable local i/o for an nfs_client + */ +static __maybe_unused void nfs_local_enable(struct nfs_client *clp) +{ + spin_lock(&clp->cl_localio_lock); + + if (unlikely(!get_nfs_to_nfsd_symbols())) + goto out; + set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags); + trace_nfs_local_enable(clp); +out: + spin_unlock(&clp->cl_localio_lock); +} + +/* + * nfs_local_disable - disable local i/o for an nfs_client + */ +void nfs_local_disable(struct nfs_client *clp) +{ + spin_lock(&clp->cl_localio_lock); + if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) { + trace_nfs_local_disable(clp); + put_nfs_to_nfsd_symbols(); + + nfs_uuid_invalidate_one_client(&clp->cl_uuid); + } + spin_unlock(&clp->cl_localio_lock); +} + +/* + * nfs_local_probe - probe local i/o support for an nfs_server and nfs_client + */ +void nfs_local_probe(struct nfs_client *clp) +{ +} +EXPORT_SYMBOL_GPL(nfs_local_probe); + +/* + * nfs_local_open_fh - open a local filehandle in terms of nfsd_file + * + * Returns a pointer to a struct nfs_localio_ctx or NULL + */ +struct nfs_localio_ctx * +nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, + struct nfs_fh *fh, const fmode_t mode) +{ + struct nfs_localio_ctx *localio; + int status; + + if (!nfs_server_is_local(clp)) + return NULL; + if (mode & ~(FMODE_READ | FMODE_WRITE)) + return NULL; + + localio = nfs_to.nfsd_open_local_fh(&clp->cl_uuid, + clp->cl_rpcclient, cred, fh, mode); + if (IS_ERR(localio)) { + status = PTR_ERR(localio); + trace_nfs_local_open_fh(fh, mode, status); + switch (status) { + case -ENOMEM: + case -ENXIO: + case -ENOENT: + nfs_local_disable(clp); + } + return NULL; + } + return localio; +} +EXPORT_SYMBOL_GPL(nfs_local_open_fh); + +static struct bio_vec * +nfs_bvec_alloc_and_import_pagevec(struct page **pagevec, + unsigned int npages, gfp_t flags) +{ + struct bio_vec *bvec, *p; + + bvec = kmalloc_array(npages, sizeof(*bvec), flags); + if (bvec != NULL) { + for (p = bvec; npages > 0; p++, pagevec++, npages--) { + p->bv_page = *pagevec; + p->bv_len = PAGE_SIZE; + p->bv_offset = 0; + } + } + return bvec; +} + +static void +nfs_local_iocb_free(struct nfs_local_kiocb *iocb) +{ + kfree(iocb->bvec); + kfree(iocb); +} + +static struct nfs_local_kiocb * +nfs_local_iocb_alloc(struct nfs_pgio_header *hdr, + struct nfs_localio_ctx *localio, gfp_t flags) +{ + struct nfs_local_kiocb *iocb; + + iocb = kmalloc(sizeof(*iocb), flags); + if (iocb == NULL) + return NULL; + iocb->bvec = nfs_bvec_alloc_and_import_pagevec(hdr->page_array.pagevec, + hdr->page_array.npages, flags); + if (iocb->bvec == NULL) { + kfree(iocb); + return NULL; + } + init_sync_kiocb(&iocb->kiocb, nfs_to.nfsd_file_file(localio->nf)); + iocb->kiocb.ki_pos = hdr->args.offset; + iocb->localio = localio; + iocb->hdr = hdr; + iocb->kiocb.ki_flags &= ~IOCB_APPEND; + return iocb; +} + +static void +nfs_local_iter_init(struct iov_iter *i, struct nfs_local_kiocb *iocb, int dir) +{ + struct nfs_pgio_header *hdr = iocb->hdr; + + iov_iter_bvec(i, dir, iocb->bvec, hdr->page_array.npages, + hdr->args.count + hdr->args.pgbase); + if (hdr->args.pgbase != 0) + iov_iter_advance(i, hdr->args.pgbase); +} + +static void +nfs_local_hdr_release(struct nfs_pgio_header *hdr, + const struct rpc_call_ops *call_ops) +{ + call_ops->rpc_call_done(&hdr->task, hdr); + call_ops->rpc_release(hdr); +} + +static void +nfs_local_pgio_init(struct nfs_pgio_header *hdr, + const struct rpc_call_ops *call_ops) +{ + hdr->task.tk_ops = call_ops; + if (!hdr->task.tk_start) + hdr->task.tk_start = ktime_get(); +} + +static void +nfs_local_pgio_done(struct nfs_pgio_header *hdr, long status) +{ + if (status >= 0) { + hdr->res.count = status; + hdr->res.op_status = NFS4_OK; + hdr->task.tk_status = 0; + } else { + hdr->res.op_status = nfs4_stat_to_errno(status); + hdr->task.tk_status = status; + } +} + +static void +nfs_local_pgio_release(struct nfs_local_kiocb *iocb) +{ + struct nfs_pgio_header *hdr = iocb->hdr; + + nfs_localio_ctx_free(iocb->localio); + nfs_local_iocb_free(iocb); + nfs_local_hdr_release(hdr, hdr->task.tk_ops); +} + +static void +nfs_local_read_done(struct nfs_local_kiocb *iocb, long status) +{ + struct nfs_pgio_header *hdr = iocb->hdr; + struct file *filp = iocb->kiocb.ki_filp; + + nfs_local_pgio_done(hdr, status); + + if (hdr->res.count != hdr->args.count || + hdr->args.offset + hdr->res.count >= i_size_read(file_inode(filp))) + hdr->res.eof = true; + + dprintk("%s: read %ld bytes eof %d.\n", __func__, + status > 0 ? status : 0, hdr->res.eof); +} + +static int +nfs_do_local_read(struct nfs_pgio_header *hdr, + struct nfs_localio_ctx *localio, + const struct rpc_call_ops *call_ops) +{ + struct file *filp = nfs_to.nfsd_file_file(localio->nf); + struct nfs_local_kiocb *iocb; + struct iov_iter iter; + ssize_t status; + + dprintk("%s: vfs_read count=%u pos=%llu\n", + __func__, hdr->args.count, hdr->args.offset); + + iocb = nfs_local_iocb_alloc(hdr, localio, GFP_KERNEL); + if (iocb == NULL) + return -ENOMEM; + nfs_local_iter_init(&iter, iocb, READ); + + nfs_local_pgio_init(hdr, call_ops); + hdr->res.eof = false; + + status = filp->f_op->read_iter(&iocb->kiocb, &iter); + WARN_ON_ONCE(status == -EIOCBQUEUED); + + nfs_local_read_done(iocb, status); + nfs_local_pgio_release(iocb); + + return 0; +} + +static void +nfs_copy_boot_verifier(struct nfs_write_verifier *verifier, struct inode *inode) +{ + struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; + u32 *verf = (u32 *)verifier->data; + int seq = 0; + + do { + read_seqbegin_or_lock(&clp->cl_boot_lock, &seq); + verf[0] = (u32)clp->cl_nfssvc_boot.tv_sec; + verf[1] = (u32)clp->cl_nfssvc_boot.tv_nsec; + } while (need_seqretry(&clp->cl_boot_lock, seq)); + done_seqretry(&clp->cl_boot_lock, seq); +} + +static void +nfs_reset_boot_verifier(struct inode *inode) +{ + struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; + + write_seqlock(&clp->cl_boot_lock); + ktime_get_real_ts64(&clp->cl_nfssvc_boot); + write_sequnlock(&clp->cl_boot_lock); +} + +static void +nfs_set_local_verifier(struct inode *inode, + struct nfs_writeverf *verf, + enum nfs3_stable_how how) +{ + nfs_copy_boot_verifier(&verf->verifier, inode); + verf->committed = how; +} + +/* Factored out from fs/nfsd/vfs.h:fh_getattr() */ +static int __vfs_getattr(struct path *p, struct kstat *stat, int version) +{ + u32 request_mask = STATX_BASIC_STATS; + + if (version == 4) + request_mask |= (STATX_BTIME | STATX_CHANGE_COOKIE); + return vfs_getattr(p, stat, request_mask, AT_STATX_SYNC_AS_STAT); +} + +/* Copied from fs/nfsd/nfsfh.c:nfsd4_change_attribute() */ +static u64 __nfsd4_change_attribute(const struct kstat *stat, + const struct inode *inode) +{ + u64 chattr; + + if (stat->result_mask & STATX_CHANGE_COOKIE) { + chattr = stat->change_cookie; + if (S_ISREG(inode->i_mode) && + !(stat->attributes & STATX_ATTR_CHANGE_MONOTONIC)) { + chattr += (u64)stat->ctime.tv_sec << 30; + chattr += stat->ctime.tv_nsec; + } + } else { + chattr = time_to_chattr(&stat->ctime); + } + return chattr; +} + +static void nfs_local_vfs_getattr(struct nfs_local_kiocb *iocb) +{ + struct kstat stat; + struct file *filp = iocb->kiocb.ki_filp; + struct nfs_pgio_header *hdr = iocb->hdr; + struct nfs_fattr *fattr = hdr->res.fattr; + int version = NFS_PROTO(hdr->inode)->version; + + if (unlikely(!fattr) || __vfs_getattr(&filp->f_path, &stat, version)) + return; + + fattr->valid = (NFS_ATTR_FATTR_FILEID | + NFS_ATTR_FATTR_CHANGE | + NFS_ATTR_FATTR_SIZE | + NFS_ATTR_FATTR_ATIME | + NFS_ATTR_FATTR_MTIME | + NFS_ATTR_FATTR_CTIME | + NFS_ATTR_FATTR_SPACE_USED); + + fattr->fileid = stat.ino; + fattr->size = stat.size; + fattr->atime = stat.atime; + fattr->mtime = stat.mtime; + fattr->ctime = stat.ctime; + if (version == 4) { + fattr->change_attr = + __nfsd4_change_attribute(&stat, file_inode(filp)); + } else + fattr->change_attr = nfs_timespec_to_change_attr(&fattr->ctime); + fattr->du.nfs3.used = stat.blocks << 9; +} + +static void +nfs_local_write_done(struct nfs_local_kiocb *iocb, long status) +{ + struct nfs_pgio_header *hdr = iocb->hdr; + struct inode *inode = hdr->inode; + + dprintk("%s: wrote %ld bytes.\n", __func__, status > 0 ? status : 0); + + /* Handle short writes as if they are ENOSPC */ + if (status > 0 && status < hdr->args.count) { + hdr->mds_offset += status; + hdr->args.offset += status; + hdr->args.pgbase += status; + hdr->args.count -= status; + nfs_set_pgio_error(hdr, -ENOSPC, hdr->args.offset); + status = -ENOSPC; + } + if (status < 0) + nfs_reset_boot_verifier(inode); + else if (nfs_should_remove_suid(inode)) { + /* Deal with the suid/sgid bit corner case */ + spin_lock(&inode->i_lock); + nfs_set_cache_invalid(inode, NFS_INO_INVALID_MODE); + spin_unlock(&inode->i_lock); + } + nfs_local_pgio_done(hdr, status); +} + +static int +nfs_do_local_write(struct nfs_pgio_header *hdr, + struct nfs_localio_ctx *localio, + const struct rpc_call_ops *call_ops) +{ + struct file *filp = nfs_to.nfsd_file_file(localio->nf); + struct nfs_local_kiocb *iocb; + struct iov_iter iter; + ssize_t status; + + dprintk("%s: vfs_write count=%u pos=%llu %s\n", + __func__, hdr->args.count, hdr->args.offset, + (hdr->args.stable == NFS_UNSTABLE) ? "unstable" : "stable"); + + iocb = nfs_local_iocb_alloc(hdr, localio, GFP_NOIO); + if (iocb == NULL) + return -ENOMEM; + nfs_local_iter_init(&iter, iocb, WRITE); + + switch (hdr->args.stable) { + default: + break; + case NFS_DATA_SYNC: + iocb->kiocb.ki_flags |= IOCB_DSYNC; + break; + case NFS_FILE_SYNC: + iocb->kiocb.ki_flags |= IOCB_DSYNC|IOCB_SYNC; + } + nfs_local_pgio_init(hdr, call_ops); + + nfs_set_local_verifier(hdr->inode, hdr->res.verf, hdr->args.stable); + + file_start_write(filp); + status = filp->f_op->write_iter(&iocb->kiocb, &iter); + file_end_write(filp); + WARN_ON_ONCE(status == -EIOCBQUEUED); + + nfs_local_write_done(iocb, status); + nfs_local_vfs_getattr(iocb); + nfs_local_pgio_release(iocb); + + return 0; +} + +int nfs_local_doio(struct nfs_client *clp, struct nfs_localio_ctx *localio, + struct nfs_pgio_header *hdr, + const struct rpc_call_ops *call_ops) +{ + int status = 0; + struct file *filp = nfs_to.nfsd_file_file(localio->nf); + + if (!hdr->args.count) + return 0; + /* Don't support filesystems without read_iter/write_iter */ + if (!filp->f_op->read_iter || !filp->f_op->write_iter) { + nfs_local_disable(clp); + status = -EAGAIN; + goto out; + } + + switch (hdr->rw_mode) { + case FMODE_READ: + status = nfs_do_local_read(hdr, localio, call_ops); + break; + case FMODE_WRITE: + status = nfs_do_local_write(hdr, localio, call_ops); + break; + default: + dprintk("%s: invalid mode: %d\n", __func__, + hdr->rw_mode); + status = -EINVAL; + } +out: + if (status != 0) { + nfs_localio_ctx_free(localio); + hdr->task.tk_status = status; + nfs_local_hdr_release(hdr, call_ops); + } + return status; +} + +static void +nfs_local_init_commit(struct nfs_commit_data *data, + const struct rpc_call_ops *call_ops) +{ + data->task.tk_ops = call_ops; +} + +static int +nfs_local_run_commit(struct file *filp, struct nfs_commit_data *data) +{ + loff_t start = data->args.offset; + loff_t end = LLONG_MAX; + + if (data->args.count > 0) { + end = start + data->args.count - 1; + if (end < start) + end = LLONG_MAX; + } + + dprintk("%s: commit %llu - %llu\n", __func__, start, end); + return vfs_fsync_range(filp, start, end, 0); +} + +static void +nfs_local_commit_done(struct nfs_commit_data *data, int status) +{ + if (status >= 0) { + nfs_set_local_verifier(data->inode, + data->res.verf, + NFS_FILE_SYNC); + data->res.op_status = NFS4_OK; + data->task.tk_status = 0; + } else { + nfs_reset_boot_verifier(data->inode); + data->res.op_status = nfs4_stat_to_errno(status); + data->task.tk_status = status; + } +} + +static void +nfs_local_release_commit_data(struct nfs_localio_ctx *localio, + struct nfs_commit_data *data, + const struct rpc_call_ops *call_ops) +{ + nfs_localio_ctx_free(localio); + call_ops->rpc_call_done(&data->task, data); + call_ops->rpc_release(data); +} + +static struct nfs_local_fsync_ctx * +nfs_local_fsync_ctx_alloc(struct nfs_commit_data *data, + struct nfs_localio_ctx *localio, gfp_t flags) +{ + struct nfs_local_fsync_ctx *ctx = kmalloc(sizeof(*ctx), flags); + + if (ctx != NULL) { + ctx->localio = localio; + ctx->data = data; + INIT_WORK(&ctx->work, nfs_local_fsync_work); + kref_init(&ctx->kref); + ctx->done = NULL; + } + return ctx; +} + +static void +nfs_local_fsync_ctx_kref_free(struct kref *kref) +{ + kfree(container_of(kref, struct nfs_local_fsync_ctx, kref)); +} + +static void +nfs_local_fsync_ctx_put(struct nfs_local_fsync_ctx *ctx) +{ + kref_put(&ctx->kref, nfs_local_fsync_ctx_kref_free); +} + +static void +nfs_local_fsync_ctx_free(struct nfs_local_fsync_ctx *ctx) +{ + nfs_local_release_commit_data(ctx->localio, ctx->data, + ctx->data->task.tk_ops); + nfs_local_fsync_ctx_put(ctx); +} + +static void +nfs_local_fsync_work(struct work_struct *work) +{ + struct nfs_local_fsync_ctx *ctx; + int status; + + ctx = container_of(work, struct nfs_local_fsync_ctx, work); + + status = nfs_local_run_commit(nfs_to.nfsd_file_file(ctx->localio->nf), + ctx->data); + nfs_local_commit_done(ctx->data, status); + if (ctx->done != NULL) + complete(ctx->done); + nfs_local_fsync_ctx_free(ctx); +} + +int nfs_local_commit(struct nfs_localio_ctx *localio, + struct nfs_commit_data *data, + const struct rpc_call_ops *call_ops, int how) +{ + struct nfs_local_fsync_ctx *ctx; + + ctx = nfs_local_fsync_ctx_alloc(data, localio, GFP_KERNEL); + if (!ctx) { + nfs_local_commit_done(data, -ENOMEM); + nfs_local_release_commit_data(localio, data, call_ops); + return -ENOMEM; + } + + nfs_local_init_commit(data, call_ops); + kref_get(&ctx->kref); + if (how & FLUSH_SYNC) { + DECLARE_COMPLETION_ONSTACK(done); + ctx->done = &done; + queue_work(nfsiod_workqueue, &ctx->work); + wait_for_completion(&done); + } else + queue_work(nfsiod_workqueue, &ctx->work); + nfs_local_fsync_ctx_put(ctx); + return 0; +} diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h index 352fdaed4075..1eab98c277fa 100644 --- a/fs/nfs/nfstrace.h +++ b/fs/nfs/nfstrace.h @@ -1685,6 +1685,67 @@ TRACE_EVENT(nfs_mount_path, TP_printk("path='%s'", __get_str(path)) ); +TRACE_EVENT(nfs_local_open_fh, + TP_PROTO( + const struct nfs_fh *fh, + fmode_t fmode, + int error + ), + + TP_ARGS(fh, fmode, error), + + TP_STRUCT__entry( + __field(int, error) + __field(u32, fhandle) + __field(unsigned int, fmode) + ), + + TP_fast_assign( + __entry->error = error; + __entry->fhandle = nfs_fhandle_hash(fh); + __entry->fmode = (__force unsigned int)fmode; + ), + + TP_printk( + "error=%d fhandle=0x%08x mode=%s", + __entry->error, + __entry->fhandle, + show_fs_fmode_flags(__entry->fmode) + ) +); + +DECLARE_EVENT_CLASS(nfs_local_client_event, + TP_PROTO( + const struct nfs_client *clp + ), + + TP_ARGS(clp), + + TP_STRUCT__entry( + __field(unsigned int, protocol) + __string(server, clp->cl_hostname) + ), + + TP_fast_assign( + __entry->protocol = clp->rpc_ops->version; + __assign_str(server); + ), + + TP_printk( + "server=%s NFSv%u", __get_str(server), __entry->protocol + ) +); + +#define DEFINE_NFS_LOCAL_CLIENT_EVENT(name) \ + DEFINE_EVENT(nfs_local_client_event, name, \ + TP_PROTO( \ + const struct nfs_client *clp \ + ), \ + TP_ARGS(clp)) + +DEFINE_NFS_LOCAL_CLIENT_EVENT(nfs_local_enable); +DEFINE_NFS_LOCAL_CLIENT_EVENT(nfs_local_disable); + DECLARE_EVENT_CLASS(nfs_xdr_event, TP_PROTO( const struct xdr_stream *xdr, diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index 849db19451ff..cf68c0a61b7d 100644 --- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -762,6 +762,10 @@ int nfs_initiate_pgio(struct rpc_clnt *clnt, struct nfs_pgio_header *hdr, hdr->args.count, (unsigned long long)hdr->args.offset); + if (localio) + return nfs_local_doio(NFS_SERVER(hdr->inode)->nfs_client, + localio, hdr, call_ops); + task = rpc_run_task(&task_setup_data); if (IS_ERR(task)) return PTR_ERR(task); diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 4bd16473a953..8bbbe8dace3b 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1693,6 +1693,9 @@ int nfs_initiate_commit(struct rpc_clnt *clnt, struct nfs_commit_data *data, dprintk("NFS: initiated commit call\n"); + if (localio) + return nfs_local_commit(localio, data, call_ops, how); + task = rpc_run_task(&task_setup_data); if (IS_ERR(task)) return PTR_ERR(task); diff --git a/include/linux/nfs.h b/include/linux/nfs.h index 5ff1a5b3b00c..89ef8c5e98db 100644 --- a/include/linux/nfs.h +++ b/include/linux/nfs.h @@ -8,6 +8,8 @@ #ifndef _LINUX_NFS_H #define _LINUX_NFS_H +#include +#include #include #include #include diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index 1df86ab98c77..b43e3e067e44 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -49,6 +50,7 @@ struct nfs_client { #define NFS_CS_DS 7 /* - Server is a DS */ #define NFS_CS_REUSEPORT 8 /* - reuse src port on reconnect */ #define NFS_CS_PNFS 9 /* - Server used for pnfs */ +#define NFS_CS_LOCAL_IO 10 /* - client is local */ struct sockaddr_storage cl_addr; /* server identifier */ size_t cl_addrlen; char * cl_hostname; /* hostname of server */ @@ -125,6 +127,13 @@ struct nfs_client { struct net *cl_net; struct list_head pending_cb_stateids; struct rcu_head rcu; + +#if IS_ENABLED(CONFIG_NFS_LOCALIO) + struct timespec64 cl_nfssvc_boot; + seqlock_t cl_boot_lock; + nfs_uuid_t cl_uuid; + spinlock_t cl_localio_lock; +#endif /* CONFIG_NFS_LOCALIO */ }; /* From patchwork Fri Aug 30 02:20:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13784135 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 69376380 for ; Fri, 30 Aug 2024 02:39:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985594; cv=none; b=OutuDNMcPwYAvoKr7LRLT00MAdNQE4/QfRA/CxiZZiuRW2ZMU+mQ31FYG9dlN2QAk3jrDw5F9vc3nK76W2RRdTybM3uu/gXfkLWdiyphKGm2QLfccaMcVJNmzY4zBo9Og3K/lDdT/MbSk1XACNbNQpqNoumm+H5L8+Y0T3XVx2Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724985594; c=relaxed/simple; bh=6YE9GUlzErZqMUq/cDudZ5Rid9wFgbt27/XbDz+aGPU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=S5AwJCQEnf8zlsFtM1RLm/64yck27a0+RDWN933m1au3KyOP7NCmEdzVvDT8cLG9ZLPYPHP34C1IouLSQuYK8/o9TxcV96gD0HAWj2+sLdvIgiPoi6EvbvWVkpoR16t5+T/jY7UnN+yooB1PmmKn5ymoCJvpnJdX62bNJ2ZnJsg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=Ki6c752f; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=q1a2BArn; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=Ki6c752f; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=q1a2BArn; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="Ki6c752f"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="q1a2BArn"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="Ki6c752f"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="q1a2BArn" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id D72131F791; Fri, 30 Aug 2024 02:39:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985590; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XWkiBtKA/cWamN/d+vca5lupntPHjjLAPN3z8BFO7jU=; b=Ki6c752fmNiaY+fU84AtzMLFaCXvvjaI0nOzd7oHHb+9BVzdoGZBKu4cNIAziVDvw6v2iA Xx/gAaGnuLqr9wXJ9fo2eZZtQR8bHVcJCv/3xzJeVD2/28crsyrp3AcNm6PyytVk5Sd3OF ZMHxx/UaCRsQAiKOKfogOTd6aV/s8FI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985590; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XWkiBtKA/cWamN/d+vca5lupntPHjjLAPN3z8BFO7jU=; b=q1a2BArnKS8nj6Pzc4wMd5yLpDc/23ZmKInkeOwmxSg2gzgwTLqxnVGzTd6fzzthaIhZug w6lugtD+Or5o6UBA== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=Ki6c752f; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=q1a2BArn DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1724985590; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XWkiBtKA/cWamN/d+vca5lupntPHjjLAPN3z8BFO7jU=; b=Ki6c752fmNiaY+fU84AtzMLFaCXvvjaI0nOzd7oHHb+9BVzdoGZBKu4cNIAziVDvw6v2iA Xx/gAaGnuLqr9wXJ9fo2eZZtQR8bHVcJCv/3xzJeVD2/28crsyrp3AcNm6PyytVk5Sd3OF ZMHxx/UaCRsQAiKOKfogOTd6aV/s8FI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1724985590; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XWkiBtKA/cWamN/d+vca5lupntPHjjLAPN3z8BFO7jU=; b=q1a2BArnKS8nj6Pzc4wMd5yLpDc/23ZmKInkeOwmxSg2gzgwTLqxnVGzTd6fzzthaIhZug w6lugtD+Or5o6UBA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 51CC0136A4; Fri, 30 Aug 2024 02:39:49 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id bl1aAvUw0WYHGAAAD6G6ig (envelope-from ); Fri, 30 Aug 2024 02:39:49 +0000 From: NeilBrown To: Mike Snitzer , Chuck Lever , Jeff Layton Cc: linux-nfs@vger.kernel.org Subject: [PATCH 23/25] nfs: implement client support for NFS_LOCALIO_PROGRAM Date: Fri, 30 Aug 2024 12:20:36 +1000 Message-ID: <20240830023531.29421-24-neilb@suse.de> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240830023531.29421-1-neilb@suse.de> References: <20240830023531.29421-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: D72131F791 X-Spam-Score: -3.01 X-Rspamd-Action: no action X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; DNSWL_BLOCKED(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RCPT_COUNT_THREE(0.00)[4]; RCVD_TLS_ALL(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; DWL_DNSWL_BLOCKED(0.00)[suse.de:dkim]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:dkim,suse.de:mid,suse.de:email]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; DKIM_TRACE(0.00)[suse.de:+] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spam-Flag: NO X-Spam-Level: From: Mike Snitzer The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL" RPC method that allows the Linux NFS client to verify the local Linux NFS server can see the nonce (single-use UUID) the client generated and made available in nfs_common for subsequent lookup and verification by the NFS server. If matched, the NFS server populates members in the nfs_uuid_t struct. The NFS client then transfers these nfs_uuid_t struct member pointers to the nfs_client struct and cleans up the nfs_uuid_t struct. See: fs/nfs/localio.c:nfs_local_probe() This protocol isn't part of an IETF standard, nor does it need to be considering it is Linux-to-Linux auxiliary RPC protocol that amounts to an implementation detail. Localio is only supported when UNIX-style authentication (AUTH_UNIX, aka AUTH_SYS) is used (enforced by fs/nfs/localio.c:nfs_local_probe()). The UUID_IS_LOCAL method encodes the client generated uuid_t in terms of the fixed UUID_SIZE (16 bytes). The fixed size opaque encode and decode XDR methods are used instead of the less efficient variable sized methods. Having a nonce (single-use uuid) is better than using the same uuid for the life of the server, and sending it proactively by client rather than reactively by the server is also safer. [NeilBrown factored out and simplified a single localio protocol and proposed making the uuid short-lived] Signed-off-by: Mike Snitzer Co-developed-by: NeilBrown Signed-off-by: NeilBrown --- fs/nfs/client.c | 6 ++- fs/nfs/localio.c | 132 +++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 132 insertions(+), 6 deletions(-) diff --git a/fs/nfs/client.c b/fs/nfs/client.c index a890319d4fab..4903f5ff5758 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -433,8 +433,10 @@ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init) list_add_tail(&new->cl_share_link, &nn->nfs_client_list); spin_unlock(&nn->nfs_client_lock); - nfs_local_probe(new); - return rpc_ops->init_client(new, cl_init); + new = rpc_ops->init_client(new, cl_init); + if (!IS_ERR(new)) + nfs_local_probe(new); + return new; } spin_unlock(&nn->nfs_client_lock); diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 9d8a1436f3c6..55622084d5c2 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -50,17 +50,77 @@ static void nfs_local_fsync_work(struct work_struct *work); static bool localio_enabled __read_mostly = true; module_param(localio_enabled, bool, 0644); +static inline bool nfs_client_is_local(const struct nfs_client *clp) +{ + return !!test_bit(NFS_CS_LOCAL_IO, &clp->cl_flags); +} + bool nfs_server_is_local(const struct nfs_client *clp) { - return test_bit(NFS_CS_LOCAL_IO, &clp->cl_flags) != 0 && - localio_enabled; + return nfs_client_is_local(clp) && localio_enabled; } EXPORT_SYMBOL_GPL(nfs_server_is_local); +/* + * UUID_IS_LOCAL XDR functions + */ + +static void localio_xdr_enc_uuidargs(struct rpc_rqst *req, + struct xdr_stream *xdr, + const void *data) +{ + const u8 *uuid = data; + + encode_opaque_fixed(xdr, uuid, UUID_SIZE); +} + +static int localio_xdr_dec_uuidres(struct rpc_rqst *req, + struct xdr_stream *xdr, + void *result) +{ + /* void return */ + return 0; +} + +static const struct rpc_procinfo nfs_localio_procedures[] = { + [LOCALIOPROC_UUID_IS_LOCAL] = { + .p_proc = LOCALIOPROC_UUID_IS_LOCAL, + .p_encode = localio_xdr_enc_uuidargs, + .p_decode = localio_xdr_dec_uuidres, + .p_arglen = XDR_QUADLEN(UUID_SIZE), + .p_replen = 0, + .p_statidx = LOCALIOPROC_UUID_IS_LOCAL, + .p_name = "UUID_IS_LOCAL", + }, +}; + +static unsigned int nfs_localio_counts[ARRAY_SIZE(nfs_localio_procedures)]; +static const struct rpc_version nfslocalio_version1 = { + .number = 1, + .nrprocs = ARRAY_SIZE(nfs_localio_procedures), + .procs = nfs_localio_procedures, + .counts = nfs_localio_counts, +}; + +static const struct rpc_version *nfslocalio_version[] = { + [1] = &nfslocalio_version1, +}; + +extern const struct rpc_program nfslocalio_program; +static struct rpc_stat nfslocalio_rpcstat = { &nfslocalio_program }; + +const struct rpc_program nfslocalio_program = { + .name = "nfslocalio", + .number = NFS_LOCALIO_PROGRAM, + .nrvers = ARRAY_SIZE(nfslocalio_version), + .version = nfslocalio_version, + .stats = &nfslocalio_rpcstat, +}; + /* * nfs_local_enable - enable local i/o for an nfs_client */ -static __maybe_unused void nfs_local_enable(struct nfs_client *clp) +static void nfs_local_enable(struct nfs_client *clp) { spin_lock(&clp->cl_localio_lock); @@ -87,11 +147,74 @@ void nfs_local_disable(struct nfs_client *clp) spin_unlock(&clp->cl_localio_lock); } +/* + * nfs_init_localioclient - Initialise an NFS localio client connection + */ +static struct rpc_clnt *nfs_init_localioclient(struct nfs_client *clp) +{ + struct rpc_clnt *rpcclient_localio; + + rpcclient_localio = rpc_bind_new_program(clp->cl_rpcclient, + &nfslocalio_program, 1); + + dprintk_rcu("%s: server (%s) %s NFS LOCALIO.\n", + __func__, rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR), + (IS_ERR(rpcclient_localio) ? "does not support" : "supports")); + + return rpcclient_localio; +} + +static bool nfs_server_uuid_is_local(struct nfs_client *clp) +{ + u8 uuid[UUID_SIZE]; + struct rpc_message msg = { + .rpc_argp = &uuid, + }; + struct rpc_clnt *rpcclient_localio; + int status; + + rpcclient_localio = nfs_init_localioclient(clp); + if (IS_ERR(rpcclient_localio)) + return false; + + export_uuid(uuid, &clp->cl_uuid.uuid); + + msg.rpc_proc = &nfs_localio_procedures[LOCALIOPROC_UUID_IS_LOCAL]; + status = rpc_call_sync(rpcclient_localio, &msg, 0); + dprintk("%s: NFS reply UUID_IS_LOCAL: status=%d\n", + __func__, status); + rpc_shutdown_client(rpcclient_localio); + + /* Server is only local if it initialized required struct members */ + if (status || !clp->cl_uuid.net || !clp->cl_uuid.dom) + return false; + + return true; +} + /* * nfs_local_probe - probe local i/o support for an nfs_server and nfs_client + * - called after alloc_client and init_client (so cl_rpcclient exists) + * - this function is idempotent, it can be called for old or new clients */ void nfs_local_probe(struct nfs_client *clp) { + /* Disallow localio if disabled via sysfs or AUTH_SYS isn't used */ + if (!localio_enabled || + clp->cl_rpcclient->cl_auth->au_flavor != RPC_AUTH_UNIX) { + nfs_local_disable(clp); + return; + } + + if (nfs_client_is_local(clp)) { + /* If already enabled, disable and re-enable */ + nfs_local_disable(clp); + } + + nfs_uuid_begin(&clp->cl_uuid); + if (nfs_server_uuid_is_local(clp)) + nfs_local_enable(clp); + nfs_uuid_end(&clp->cl_uuid); } EXPORT_SYMBOL_GPL(nfs_local_probe); @@ -121,7 +244,8 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, case -ENOMEM: case -ENXIO: case -ENOENT: - nfs_local_disable(clp); + /* Revalidate localio, will disable if unsupported */ + nfs_local_probe(clp); } return NULL; }