From patchwork Fri Apr 2 23:30:30 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dai Ngo
X-Patchwork-Id: 12181745
From: Dai Ngo
To: olga.kornievskaia@gmail.com
Cc: linux-nfs@vger.kernel.org, trondmy@hammerspace.com, bfields@fieldses.org, chuck.lever@oracle.com
Subject: [PATCH v2 1/2] NFSD: delay unmount source's export after inter-server copy completed.
Date: Fri, 2 Apr 2021 19:30:30 -0400
Message-Id: <20210402233031.36731-2-dai.ngo@oracle.com>
X-Mailer: git-send-email 2.20.1.1226.g1595ea5.dirty
In-Reply-To: <20210402233031.36731-1-dai.ngo@oracle.com>
References: <20210402233031.36731-1-dai.ngo@oracle.com>
X-Mailing-List: linux-nfs@vger.kernel.org

Currently the source's export is mounted and unmounted on every
inter-server copy operation. This patch is an enhancement to delay
the unmount of the source export for a certain period of time to
eliminate the mount and unmount overhead on subsequent copy operations.

After a copy operation completes, a delayed task is scheduled to
unmount the export after a configurable idle time. Each time the
export is used again, its expiry time is extended to allow the export
to remain mounted.

The unmount task and the mount operation of the copy request are
synchronized to make sure the export is not unmounted while it's
being used.

Signed-off-by: Dai Ngo
---
 fs/nfsd/nfs4proc.c      | 136 ++++++++++++++++++++++++++++++++++++++++++++++--
 fs/nfsd/nfsd.h          |   4 ++
 fs/nfsd/nfssvc.c        |   3 ++
 include/linux/nfs_ssc.h |  17 ++++++
 4 files changed, 157 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index dd9f38d072dd..6a810f4a4988 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -55,6 +55,74 @@ module_param(inter_copy_offload_enable, bool, 0644);
 MODULE_PARM_DESC(inter_copy_offload_enable,
 		 "Enable inter server to server copy offload. Default: false");
 
+#ifdef CONFIG_NFSD_V4_2_INTER_SSC
+static int nfsd4_ssc_umount_timeout = 900000;	/* default to 15 mins */
+module_param(nfsd4_ssc_umount_timeout, int, 0644);
+MODULE_PARM_DESC(nfsd4_ssc_umount_timeout,
+		"idle msecs before unmount export from source server");
+
+static void nfsd4_ssc_expire_umount(struct work_struct *work);
+static struct nfsd4_ssc_umount nfsd4_ssc_umount;
+
+/* nfsd4_ssc_umount.nsu_lock must be held */
+static void nfsd4_scc_update_umnt_timo(void)
+{
+	struct nfsd4_ssc_umount_item *ni = 0;
+
+	if (!list_empty(&nfsd4_ssc_umount.nsu_list)) {
+		ni = list_first_entry(&nfsd4_ssc_umount.nsu_list,
+			struct nfsd4_ssc_umount_item, nsui_list);
+		nfsd4_ssc_umount.nsu_expire = ni->nsui_expire;
+		schedule_delayed_work(&nfsd4_ssc_umount.nsu_umount_work,
+			ni->nsui_expire - jiffies);
+	} else
+		nfsd4_ssc_umount.nsu_expire = 0;
+}
+
+static void nfsd4_ssc_expire_umount(struct work_struct *work)
+{
+	struct nfsd4_ssc_umount_item *ni = 0;
+	struct nfsd4_ssc_umount_item *tmp;
+
+	down_write(&nfsd4_ssc_umount.nsu_sem);
+	spin_lock(&nfsd4_ssc_umount.nsu_lock);
+	list_for_each_entry_safe(ni, tmp, &nfsd4_ssc_umount.nsu_list, nsui_list) {
+		if (time_after(jiffies, ni->nsui_expire)) {
+			list_del(&ni->nsui_list);
+			cancel_delayed_work(&nfsd4_ssc_umount.nsu_umount_work);
+			spin_unlock(&nfsd4_ssc_umount.nsu_lock);
+			up_write(&nfsd4_ssc_umount.nsu_sem);
+
+			mntput(ni->nsui_vfsmount);
+			kfree(ni);
+
+			down_write(&nfsd4_ssc_umount.nsu_sem);
+			spin_lock(&nfsd4_ssc_umount.nsu_lock);
+			continue;
+		}
+		break;
+	}
+	nfsd4_scc_update_umnt_timo();
+	spin_unlock(&nfsd4_ssc_umount.nsu_lock);
+	up_write(&nfsd4_ssc_umount.nsu_sem);
+}
+
+static DECLARE_DELAYED_WORK(nfsd4, nfsd4_ssc_expire_umount);
+
+void nfsd4_ssc_init_umount_work(void)
+{
+	if (nfsd4_ssc_umount.nsu_inited)
+		return;
+	INIT_DELAYED_WORK(&nfsd4_ssc_umount.nsu_umount_work,
+		nfsd4_ssc_expire_umount);
+	INIT_LIST_HEAD(&nfsd4_ssc_umount.nsu_list);
+	spin_lock_init(&nfsd4_ssc_umount.nsu_lock);
+	init_rwsem(&nfsd4_ssc_umount.nsu_sem);
+	nfsd4_ssc_umount.nsu_inited = true;
+}
+EXPORT_SYMBOL_GPL(nfsd4_ssc_init_umount_work);
+#endif
+
 #ifdef CONFIG_NFSD_V4_SECURITY_LABEL
 #include
 
@@ -1181,6 +1249,8 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
 	char *ipaddr, *dev_name, *raw_data;
 	int len, raw_len;
 	__be32 status = nfserr_inval;
+	struct nfsd4_ssc_umount_item *ni = 0;
+	struct nfsd4_ssc_umount_item *tmp;
 
 	naddr = &nss->u.nl4_addr;
 	tmp_addrlen = rpc_uaddr2sockaddr(SVC_NET(rqstp), naddr->addr,
@@ -1229,11 +1299,33 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
 		goto out_free_rawdata;
 	snprintf(dev_name, len + 5, "%s%s%s:/", startsep, ipaddr, endsep);
 
+	/* wait for ssc unmount task */
+	down_read(&nfsd4_ssc_umount.nsu_sem);
+
 	/* Use an 'internal' mount: SB_KERNMOUNT -> MNT_INTERNAL */
 	ss_mnt = vfs_kern_mount(type, SB_KERNMOUNT, dev_name, raw_data);
 	module_put(type->owner);
-	if (IS_ERR(ss_mnt))
+	if (IS_ERR(ss_mnt)) {
+		up_read(&nfsd4_ssc_umount.nsu_sem);
 		goto out_free_devname;
+	}
+
+	/* delete work entry if it exists */
+	spin_lock(&nfsd4_ssc_umount.nsu_lock);
+	list_for_each_entry_safe(ni, tmp, &nfsd4_ssc_umount.nsu_list, nsui_list) {
+		if (ni->nsui_vfsmount->mnt_sb != ss_mnt->mnt_sb)
+			continue;
+		list_del(&ni->nsui_list);
+		cancel_delayed_work(&nfsd4_ssc_umount.nsu_umount_work);
+		nfsd4_scc_update_umnt_timo();
+		spin_unlock(&nfsd4_ssc_umount.nsu_lock);
+		mntput(ni->nsui_vfsmount);
+		kfree(ni);
+		goto out_done;
+	}
+	spin_unlock(&nfsd4_ssc_umount.nsu_lock);
+out_done:
+	up_read(&nfsd4_ssc_umount.nsu_sem);
 
 	status = 0;
 	*mount = ss_mnt;
@@ -1301,10 +1393,48 @@ static void
 nfsd4_cleanup_inter_ssc(struct vfsmount *ss_mnt, struct nfsd_file *src,
 			struct nfsd_file *dst)
 {
+	long timeout;
+	struct nfsd4_ssc_umount_item *work, *tmp;
+	struct nfsd4_ssc_umount_item *ni = 0;
+
 	nfs42_ssc_close(src->nf_file);
-	fput(src->nf_file);
 	nfsd_file_put(dst);
-	mntput(ss_mnt);
+	fput(src->nf_file);
+
+	work = kzalloc(sizeof(*work), GFP_KERNEL);
+	if (!work) {
+		mntput(ss_mnt);
+		return;
+	}
+	timeout = msecs_to_jiffies(nfsd4_ssc_umount_timeout);
+	work->nsui_vfsmount = ss_mnt;
+	work->nsui_expire = jiffies + timeout;
+
+	spin_lock(&nfsd4_ssc_umount.nsu_lock);
+	/*
+	 * check if entry for vfsmount->mnt_sb exists, if it does
+	 * then remove it, update expire time and re-insert at tail,
+	 * do the mntput for this call and return. Otherwise create
+	 * new work entry.
+	 */
+	list_for_each_entry_safe(ni, tmp, &nfsd4_ssc_umount.nsu_list,
+		nsui_list) {
+		if (ni->nsui_vfsmount->mnt_sb == ss_mnt->mnt_sb) {
+			list_del(&ni->nsui_list);
+			mntput(ss_mnt);
+			kfree(work);
+			ni->nsui_expire = jiffies + timeout;
+			work = ni;
+			break;
+		}
+	}
+	list_add_tail(&work->nsui_list, &nfsd4_ssc_umount.nsu_list);
+	if (!nfsd4_ssc_umount.nsu_expire) {
+		nfsd4_ssc_umount.nsu_expire = work->nsui_expire;
+		schedule_delayed_work(&nfsd4_ssc_umount.nsu_umount_work,
+			timeout);
+	}
+	spin_unlock(&nfsd4_ssc_umount.nsu_lock);
 }
 
 #else /* CONFIG_NFSD_V4_2_INTER_SSC */
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 8bdc37aa2c2e..b3bf8a5f4472 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -483,6 +483,10 @@ static inline bool nfsd_attrs_supported(u32 minorversion, const u32 *bmval)
 extern int nfsd4_is_junction(struct dentry *dentry);
 extern int register_cld_notifier(void);
 extern void unregister_cld_notifier(void);
+#ifdef CONFIG_NFSD_V4_2_INTER_SSC
+extern void nfsd4_ssc_init_umount_work(void);
+#endif
+
 #else /* CONFIG_NFSD_V4 */
 static inline int nfsd4_is_junction(struct dentry *dentry)
 {
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 6de406322106..2558db55b88b 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -322,6 +322,9 @@ static int nfsd_startup_generic(int nrservs)
 	ret = nfs4_state_start();
 	if (ret)
 		goto out_file_cache;
+#ifdef CONFIG_NFSD_V4_2_INTER_SSC
+	nfsd4_ssc_init_umount_work();
+#endif
 	return 0;
 
 out_file_cache:
diff --git a/include/linux/nfs_ssc.h b/include/linux/nfs_ssc.h
index f5ba0fbff72f..337d740dad17 100644
--- a/include/linux/nfs_ssc.h
+++ b/include/linux/nfs_ssc.h
@@ -8,6 +8,7 @@
  */
 
 #include
+#include
 
 extern struct nfs_ssc_client_ops_tbl nfs_ssc_client_tbl;
 
@@ -52,6 +53,22 @@ static inline void nfs42_ssc_close(struct file *filep)
 	if (nfs_ssc_client_tbl.ssc_nfs4_ops)
 		(*nfs_ssc_client_tbl.ssc_nfs4_ops->sco_close)(filep);
 }
+
+struct nfsd4_ssc_umount_item {
+	struct list_head nsui_list;
+	unsigned long nsui_expire;
+	struct vfsmount *nsui_vfsmount;
+};
+
+struct nfsd4_ssc_umount {
+	struct list_head nsu_list;
+	struct delayed_work nsu_umount_work;
+	spinlock_t nsu_lock;
+	struct rw_semaphore nsu_sem;
+	unsigned long nsu_expire;
+	bool nsu_inited;
+};
+
 #endif
 
 /*

From patchwork Fri Apr 2 23:30:31 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dai Ngo
X-Patchwork-Id: 12181741
From: Dai Ngo
To: olga.kornievskaia@gmail.com
Cc: linux-nfs@vger.kernel.org, trondmy@hammerspace.com, bfields@fieldses.org, chuck.lever@oracle.com
Subject: [PATCH v2 2/2] NFSv4.2: mount overhead should not be used as threshold for inter-server copy
Date: Fri, 2 Apr 2021 19:30:31 -0400
Message-Id: <20210402233031.36731-3-dai.ngo@oracle.com>
X-Mailer: git-send-email 2.20.1.1226.g1595ea5.dirty
In-Reply-To: <20210402233031.36731-1-dai.ngo@oracle.com>
References: <20210402233031.36731-1-dai.ngo@oracle.com>
X-Mailing-List: linux-nfs@vger.kernel.org

Since mount and unmount are not done on each copy request, their
overhead should not be considered as the threshold for doing
inter-server copy. The threshold used to determine sync or async copy
is also used to decide whether copy is done with inter-server copy or
generic copy.

Signed-off-by: Dai Ngo
---
 fs/nfs/nfs4file.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 441a2fa073c8..67ca798a1a79 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -158,13 +158,11 @@ static ssize_t __nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
 		sync = true;
 retry:
 	if (!nfs42_files_from_same_server(file_in, file_out)) {
-		/* for inter copy, if copy size if smaller than 12 RPC
-		 * payloads, fallback to traditional copy. There are
-		 * 14 RPCs during an NFSv4.x mount between source/dest
-		 * servers.
+		/*
+		 * for inter copy, if copy size is small enough
+		 * for sync copy then fallback to traditional copy.
 		 */
-		if (sync ||
-			count <= 14 * NFS_SERVER(file_inode(file_in))->rsize)
+		if (sync)
 			return -EOPNOTSUPP;
 		cn_resp = kzalloc(sizeof(struct nfs42_copy_notify_res),
 				GFP_NOFS);
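
For readers following the series, the bookkeeping that patch 1/2 describes (refresh an export's deadline whenever a copy completes, drop the export once it has sat idle past the timeout) can be modelled outside the kernel. The sketch below is a simplified user-space illustration, not the patch's implementation. The names struct mount_cache, cache_release() and cache_expire() are hypothetical, a fixed array stands in for nsu_list, plain integer ticks stand in for jiffies, and the spinlock, rw_semaphore and delayed work item are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 8

/* Hypothetical stand-in for one cached source-export mount. */
struct mount_entry {
	int sb_id;	/* plays the role of vfsmount->mnt_sb */
	long expire;	/* absolute deadline, in arbitrary ticks */
	bool in_use;	/* slot holds a live entry */
};

/* Hypothetical stand-in for the global unmount bookkeeping. */
struct mount_cache {
	struct mount_entry slot[MAX_ENTRIES];
	long idle_timeout;	/* ticks an export may stay idle */
};

/*
 * Called when a copy finishes. If the export is already cached, only
 * its deadline is pushed out; otherwise a new entry is created.
 * Returns false when no slot is free, in which case a caller would
 * unmount immediately (similar to the allocation-failure path).
 */
bool cache_release(struct mount_cache *c, int sb_id, long now)
{
	struct mount_entry *free_slot = NULL;
	size_t i;

	assert(c != NULL);
	for (i = 0; i < MAX_ENTRIES; i++) {
		struct mount_entry *e = &c->slot[i];

		if (e->in_use && e->sb_id == sb_id) {
			e->expire = now + c->idle_timeout;
			return true;
		}
		if (!e->in_use && !free_slot)
			free_slot = e;
	}
	if (!free_slot)
		return false;
	free_slot->sb_id = sb_id;
	free_slot->expire = now + c->idle_timeout;
	free_slot->in_use = true;
	return true;
}

/*
 * Expiry pass: drop every entry whose deadline has passed and return
 * how many were dropped. Kernel code would unmount at this point.
 */
int cache_expire(struct mount_cache *c, long now)
{
	int dropped = 0;
	size_t i;

	assert(c != NULL);
	for (i = 0; i < MAX_ENTRIES; i++) {
		struct mount_entry *e = &c->slot[i];

		if (e->in_use && now > e->expire) {
			e->in_use = false;
			dropped++;
		}
	}
	return dropped;
}
```

With idle_timeout set to 100, releasing export 1 at tick 0 and again at tick 90 moves its deadline from 100 to 190, so an expiry pass at tick 120 drops nothing. The patch keeps its entries in a list ordered by expiry time so that only the head needs checking; the array scan here trades that for brevity.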