From patchwork Tue Jun 11 01:54:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gang He X-Patchwork-Id: 10985763 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CADBF76 for ; Tue, 11 Jun 2019 01:56:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BB1B628671 for ; Tue, 11 Jun 2019 01:56:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AD22128685; Tue, 11 Jun 2019 01:56:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from userp2130.oracle.com (userp2130.oracle.com [156.151.31.86]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 25DD428671 for ; Tue, 11 Jun 2019 01:56:01 +0000 (UTC) Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x5B1rlOX042793; Tue, 11 Jun 2019 01:55:51 GMT Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2t04etj59x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 11 Jun 2019 01:55:51 +0000 Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x5B1tnnY015304 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 11 Jun 2019 01:55:49 GMT Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1haW0r-0000JB-Oz; Mon, 10 Jun 2019 18:55:49 -0700 Received: from userp3020.oracle.com ([156.151.31.79]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1haVzr-00007f-CO for ocfs2-devel@oss.oracle.com; Mon, 10 Jun 2019 18:54:47 -0700 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x5B1slJL121007 for ; Tue, 11 Jun 2019 01:54:47 GMT Authentication-Results: aserp3010.oracle.com; spf=fail smtp.mailfrom=ghe@suse.com; dmarc=none Received: from userp2040.oracle.com (userp2040.oracle.com [156.151.31.90]) by userp3020.oracle.com with ESMTP id 2t1jph66pp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 11 Jun 2019 01:54:46 +0000 Received: from pps.filterd (userp2040.oracle.com [127.0.0.1]) by userp2040.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x5B1s7wV030474 for ; Tue, 11 Jun 2019 01:54:45 GMT Received: from smtp2.provo.novell.com (smtp2.provo.novell.com [137.65.250.81]) by userp2040.oracle.com with ESMTP id 2t20k52vdm-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Tue, 11 Jun 2019 01:54:44 +0000 Received: from ghe-pc.suse.asia (prva10-snat226-1.provo.novell.com [137.65.226.35]) by smtp2.provo.novell.com with ESMTP (TLS encrypted); Mon, 10 Jun 2019 19:54:33 -0600 From: Gang He To: mark@fasheh.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com Date: Tue, 11 Jun 2019 09:54:14 +0800 Message-Id: <20190611015414.27754-3-ghe@suse.com> X-Mailer: git-send-email 2.12.3 In-Reply-To: <20190611015414.27754-1-ghe@suse.com> References: <20190611015414.27754-1-ghe@suse.com> X-PDR: PASS X-Source-IP: 137.65.250.81 X-ServerName: smtp2.provo.novell.com X-Proofpoint-SPF-Result: pass X-Proofpoint-SPF-Record: v=spf1 include:novell.com -all X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9284 signatures=668687 X-Proofpoint-Spam-Details: rule=tap_spam_policies_notspam policy=tap_spam_policies score=0 priorityscore=200 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=181 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906110010 X-Spam: Clean Cc: linux-kernel@vger.kernel.org, ocfs2-devel@oss.oracle.com Subject: [Ocfs2-devel] [PATCH V4 3/3] ocfs2: add first lock wait time in locking_state X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9284 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906110010 X-Virus-Scanned: ClamAV using ClamSMTP ocfs2 file system uses locking_state file under debugfs to dump each ocfs2 file system's dlm lock resources, but the users ever encountered some hang(deadlock) problems in ocfs2 file system. I'd like to add first lock wait time in locking_state file, which can help the upper scripts detect these deadlock problems via comparing the first lock wait time with the current time. Signed-off-by: Gang He --- fs/ocfs2/dlmglue.c | 32 +++++++++++++++++++++++++++++--- fs/ocfs2/ocfs2.h | 1 + 2 files changed, 30 insertions(+), 3 deletions(-) diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c index d4caa6d117c6..8ce4b76f81ee 100644 --- a/fs/ocfs2/dlmglue.c +++ b/fs/ocfs2/dlmglue.c @@ -440,6 +440,7 @@ static void ocfs2_remove_lockres_tracking(struct ocfs2_lock_res *res) static void ocfs2_init_lock_stats(struct ocfs2_lock_res *res) { res->l_lock_refresh = 0; + res->l_lock_wait = 0; memset(&res->l_lock_prmode, 0, sizeof(struct ocfs2_lock_stats)); memset(&res->l_lock_exmode, 0, sizeof(struct ocfs2_lock_stats)); } @@ -483,6 +484,21 @@ static inline void ocfs2_track_lock_refresh(struct ocfs2_lock_res *lockres) lockres->l_lock_refresh++; } +static inline void ocfs2_track_lock_wait(struct ocfs2_lock_res *lockres) +{ + struct ocfs2_mask_waiter *mw; + + if (list_empty(&lockres->l_mask_waiters)) { + lockres->l_lock_wait = 0; + return; + } + + mw = list_first_entry(&lockres->l_mask_waiters, + struct ocfs2_mask_waiter, mw_item); + lockres->l_lock_wait = + ktime_to_us(ktime_mono_to_real(mw->mw_lock_start)); +} + static inline void ocfs2_init_start_time(struct ocfs2_mask_waiter *mw) { mw->mw_lock_start = ktime_get(); @@ -498,6 +514,9 @@ static inline void ocfs2_update_lock_stats(struct ocfs2_lock_res *res, static inline void ocfs2_track_lock_refresh(struct ocfs2_lock_res *lockres) { } +static inline void ocfs2_track_lock_wait(struct ocfs2_lock_res *lockres) +{ +} static inline void ocfs2_init_start_time(struct ocfs2_mask_waiter *mw) { } @@ -891,6 +910,7 @@ static void lockres_set_flags(struct ocfs2_lock_res *lockres, list_del_init(&mw->mw_item); mw->mw_status = 0; complete(&mw->mw_complete); + ocfs2_track_lock_wait(lockres); } } static void lockres_or_flags(struct ocfs2_lock_res *lockres, unsigned long or) @@ -1402,6 +1422,7 @@ static void lockres_add_mask_waiter(struct ocfs2_lock_res *lockres, list_add_tail(&mw->mw_item, &lockres->l_mask_waiters); mw->mw_mask = mask; mw->mw_goal = goal; + ocfs2_track_lock_wait(lockres); } /* returns 0 if the mw that was removed was already satisfied, -EBUSY @@ -1418,6 +1439,7 @@ static int __lockres_remove_mask_waiter(struct ocfs2_lock_res *lockres, list_del_init(&mw->mw_item); init_completion(&mw->mw_complete); + ocfs2_track_lock_wait(lockres); } return ret; @@ -3098,7 +3120,7 @@ static void *ocfs2_dlm_seq_next(struct seq_file *m, void *v, loff_t *pos) * New in version 3 * - Max time in lock stats is in usecs (instead of nsecs) * New in version 4 - * - Add last pr/ex unlock times in usecs + * - Add last pr/ex unlock times and first lock wait time in usecs */ #define OCFS2_DLM_DEBUG_STR_VERSION 4 static int ocfs2_dlm_seq_show(struct seq_file *m, void *v) @@ -3116,7 +3138,7 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v) return -EINVAL; #ifdef CONFIG_OCFS2_FS_STATS - if (dlm_debug->d_filter_secs) { + if (!lockres->l_lock_wait && dlm_debug->d_filter_secs) { now = ktime_to_us(ktime_get_real()); if (lockres->l_lock_prmode.ls_last > lockres->l_lock_exmode.ls_last) @@ -3177,6 +3199,7 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v) # define lock_refresh(_l) ((_l)->l_lock_refresh) # define lock_last_prmode(_l) ((_l)->l_lock_prmode.ls_last) # define lock_last_exmode(_l) ((_l)->l_lock_exmode.ls_last) +# define lock_wait(_l) ((_l)->l_lock_wait) #else # define lock_num_prmode(_l) (0) # define lock_num_exmode(_l) (0) @@ -3189,6 +3212,7 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v) # define lock_refresh(_l) (0) # define lock_last_prmode(_l) (0ULL) # define lock_last_exmode(_l) (0ULL) +# define lock_wait(_l) (0ULL) #endif /* The following seq_print was added in version 2 of this output */ seq_printf(m, "%u\t" @@ -3201,6 +3225,7 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v) "%u\t" "%u\t" "%llu\t" + "%llu\t" "%llu\t", lock_num_prmode(lockres), lock_num_exmode(lockres), @@ -3212,7 +3237,8 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v) lock_max_exmode(lockres), lock_refresh(lockres), lock_last_prmode(lockres), - lock_last_exmode(lockres)); + lock_last_exmode(lockres), + lock_wait(lockres)); /* End the line */ seq_printf(m, "\n"); diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h index 6d0a77703d0e..99ce40063da6 100644 --- a/fs/ocfs2/ocfs2.h +++ b/fs/ocfs2/ocfs2.h @@ -206,6 +206,7 @@ struct ocfs2_lock_res { #ifdef CONFIG_OCFS2_FS_STATS struct ocfs2_lock_stats l_lock_prmode; /* PR mode stats */ u32 l_lock_refresh; /* Disk refreshes */ + u64 l_lock_wait; /* First lock wait time */ struct ocfs2_lock_stats l_lock_exmode; /* EX mode stats */ #endif #ifdef CONFIG_DEBUG_LOCK_ALLOC