From patchwork Thu Dec 7 13:21:58 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gang He
X-Patchwork-Id: 10098959
From: Gang He
To: mfasheh@versity.com, jlbec@evilplan.org
Cc: ocfs2-devel@oss.oracle.com
Date: Thu, 7 Dec 2017 21:21:58 +0800
Message-Id: <1512652918-11777-1-git-send-email-ghe@suse.com>
X-Mailer: git-send-email 1.8.5.6
Subject: [Ocfs2-devel] [PATCH] ocfs2: add trimfs dlm lock

ocfs2 supports trimming the underlying disk via the fstrim
command. However, ocfs2 is a shared-storage cluster file system: if the
user configures a scheduled fstrim job on every node, multiple nodes
will trim the same shared disk simultaneously, which wastes CPU and I/O.

This patch introduces a trimfs DLM lock so that only one fstrim command
runs on the shared disk across the cluster at a time. Any other fstrim
command issued while the lock is held returns -EBUSY.

Signed-off-by: Gang He
---
 fs/ocfs2/alloc.c        | 18 +++++++++++++++++-
 fs/ocfs2/dlmglue.c      | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/dlmglue.h      |  2 ++
 fs/ocfs2/ocfs2.h        |  1 +
 fs/ocfs2/ocfs2_lockid.h |  5 +++++
 5 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index ab5105f..89d16ad 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -7401,10 +7401,24 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
 
 	inode_lock(main_bm_inode);
 
+	ret = ocfs2_trim_fs_lock(osb);
+	if (ret < 0) {
+		if (ret != -EAGAIN)
+			mlog_errno(ret);
+		else {
+			ret = -EBUSY;
+			mlog(ML_NOTICE,
+			     "Cannot trim disk %s since a trim operation is "
+			     "running on it from another node.\n",
+			     sb->s_id);
+		}
+		goto out_mutex;
+	}
+
 	ret = ocfs2_inode_lock(main_bm_inode, &main_bm_bh, 0);
 	if (ret < 0) {
 		mlog_errno(ret);
-		goto out_mutex;
+		goto out_fsunlock;
 	}
 	main_bm = (struct ocfs2_dinode *)main_bm_bh->b_data;
 
@@ -7466,6 +7480,8 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
 out_unlock:
 	ocfs2_inode_unlock(main_bm_inode, 0);
 	brelse(main_bm_bh);
+out_fsunlock:
+	ocfs2_trim_fs_unlock(osb);
 out_mutex:
 	inode_unlock(main_bm_inode);
 	iput(main_bm_inode);
diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 4689940..b28fdf4 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -259,6 +259,10 @@ struct ocfs2_lock_res_ops {
 	.flags = 0,
 };
 
+static struct ocfs2_lock_res_ops ocfs2_trim_fs_lops = {
+	.flags = 0,
+};
+
 static struct ocfs2_lock_res_ops ocfs2_orphan_scan_lops = {
 	.flags = LOCK_TYPE_REQUIRES_REFRESH|LOCK_TYPE_USES_LVB,
 };
@@ -676,6 +680,15 @@ static void ocfs2_nfs_sync_lock_res_init(struct ocfs2_lock_res *res,
 				   &ocfs2_nfs_sync_lops, osb);
 }
 
+static void ocfs2_trim_fs_lock_res_init(struct ocfs2_lock_res *res,
+					struct ocfs2_super *osb)
+{
+	ocfs2_lock_res_init_once(res);
+	ocfs2_build_lock_name(OCFS2_LOCK_TYPE_TRIM_FS, 0, 0, res->l_name);
+	ocfs2_lock_res_init_common(osb, res, OCFS2_LOCK_TYPE_TRIM_FS,
+				   &ocfs2_trim_fs_lops, osb);
+}
+
 static void ocfs2_orphan_scan_lock_res_init(struct ocfs2_lock_res *res,
 					    struct ocfs2_super *osb)
 {
@@ -2745,6 +2758,41 @@ void ocfs2_nfs_sync_unlock(struct ocfs2_super *osb, int ex)
 			     ex ? LKM_EXMODE : LKM_PRMODE);
 }
 
+int ocfs2_trim_fs_lock(struct ocfs2_super *osb)
+{
+	int status;
+	struct ocfs2_lock_res *lockres = &osb->osb_trim_fs_lockres;
+
+	if (ocfs2_is_hard_readonly(osb))
+		return -EROFS;
+
+	if (ocfs2_mount_local(osb))
+		return 0;
+
+	ocfs2_trim_fs_lock_res_init(lockres, osb);
+	status = ocfs2_cluster_lock(osb, lockres, LKM_EXMODE,
+				    DLM_LKF_NOQUEUE, 0);
+	if (status < 0) {
+		if (status != -EAGAIN)
+			mlog_errno(status);
+		ocfs2_simple_drop_lockres(osb, lockres);
+		ocfs2_lock_res_free(lockres);
+	}
+
+	return status;
+}
+
+void ocfs2_trim_fs_unlock(struct ocfs2_super *osb)
+{
+	struct ocfs2_lock_res *lockres = &osb->osb_trim_fs_lockres;
+
+	if (!ocfs2_mount_local(osb)) {
+		ocfs2_cluster_unlock(osb, lockres, LKM_EXMODE);
+		ocfs2_simple_drop_lockres(osb, lockres);
+		ocfs2_lock_res_free(lockres);
+	}
+}
+
 int ocfs2_dentry_lock(struct dentry *dentry, int ex)
 {
 	int ret;
diff --git a/fs/ocfs2/dlmglue.h b/fs/ocfs2/dlmglue.h
index a7fc18b..361e8a5 100644
--- a/fs/ocfs2/dlmglue.h
+++ b/fs/ocfs2/dlmglue.h
@@ -153,6 +153,8 @@ void ocfs2_super_unlock(struct ocfs2_super *osb,
 void ocfs2_rename_unlock(struct ocfs2_super *osb);
 int ocfs2_nfs_sync_lock(struct ocfs2_super *osb, int ex);
 void ocfs2_nfs_sync_unlock(struct ocfs2_super *osb, int ex);
+int ocfs2_trim_fs_lock(struct ocfs2_super *osb);
+void ocfs2_trim_fs_unlock(struct ocfs2_super *osb);
 int ocfs2_dentry_lock(struct dentry *dentry, int ex);
 void ocfs2_dentry_unlock(struct dentry *dentry, int ex);
 int ocfs2_file_lock(struct file *file, int ex, int trylock);
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 9a50f22..6867eef 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -404,6 +404,7 @@ struct ocfs2_super
 	struct ocfs2_lock_res osb_super_lockres;
 	struct ocfs2_lock_res osb_rename_lockres;
 	struct ocfs2_lock_res osb_nfs_sync_lockres;
+	struct ocfs2_lock_res osb_trim_fs_lockres;
 	struct ocfs2_dlm_debug *osb_dlm_debug;
 
 	struct dentry *osb_debug_root;
diff --git a/fs/ocfs2/ocfs2_lockid.h b/fs/ocfs2/ocfs2_lockid.h
index d277aab..7051b99 100644
--- a/fs/ocfs2/ocfs2_lockid.h
+++ b/fs/ocfs2/ocfs2_lockid.h
@@ -50,6 +50,7 @@ enum ocfs2_lock_type {
 	OCFS2_LOCK_TYPE_NFS_SYNC,
 	OCFS2_LOCK_TYPE_ORPHAN_SCAN,
 	OCFS2_LOCK_TYPE_REFCOUNT,
+	OCFS2_LOCK_TYPE_TRIM_FS,
 	OCFS2_NUM_LOCK_TYPES
 };
 
@@ -93,6 +94,9 @@ static inline char ocfs2_lock_type_char(enum ocfs2_lock_type type)
 		case OCFS2_LOCK_TYPE_REFCOUNT:
 			c = 'T';
 			break;
+		case OCFS2_LOCK_TYPE_TRIM_FS:
+			c = 'I';
+			break;
 		default:
 			c = '\0';
 	}
@@ -115,6 +119,7 @@ static inline char ocfs2_lock_type_char(enum ocfs2_lock_type type)
 	[OCFS2_LOCK_TYPE_NFS_SYNC] = "NFSSync",
 	[OCFS2_LOCK_TYPE_ORPHAN_SCAN] = "OrphanScan",
 	[OCFS2_LOCK_TYPE_REFCOUNT] = "Refcount",
+	[OCFS2_LOCK_TYPE_TRIM_FS] = "TrimFs",
 };
 
 static inline const char *ocfs2_lock_type_string(enum ocfs2_lock_type type)
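
Illustration (not part of the patch): the locking pattern used above is
a non-blocking exclusive lock whose "would block" result is translated
into -EBUSY for the caller. The sketch below shows the same pattern in
userspace, assuming flock(2) with LOCK_NB as a stand-in for
ocfs2_cluster_lock() with DLM_LKF_NOQUEUE. The names trim_trylock() and
trim_unlock() are hypothetical and do not exist in ocfs2.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical userspace analogy, not ocfs2 code: take an exclusive,
 * non-blocking lock on `path`. Returns a file descriptor (>= 0) that
 * holds the lock, -EBUSY if another holder already has it, or another
 * negative errno value on failure.
 */
static int trim_trylock(const char *path)
{
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
		err = errno;	/* save errno before close() can change it */
		close(fd);
		/* "Would block" means the lock is held elsewhere. */
		return err == EWOULDBLOCK ? -EBUSY : -err;
	}

	return fd;
}

/* Release the lock taken by trim_trylock() and close the descriptor. */
static void trim_unlock(int fd)
{
	assert(fd >= 0);
	flock(fd, LOCK_UN);
	close(fd);
}
```

A second trim_trylock() on the same file while the first descriptor is
still open returns -EBUSY immediately instead of waiting, which mirrors
the behaviour this patch gives a second fstrim on another node.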