From patchwork Thu Sep 24 03:19:15 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joseph Qi X-Patchwork-Id: 7253551 Return-Path: X-Original-To: patchwork-ocfs2-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id F1A539F40A for ; Thu, 24 Sep 2015 03:20:18 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 13DBF20968 for ; Thu, 24 Sep 2015 03:20:18 +0000 (UTC) Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6C1D920963 for ; Thu, 24 Sep 2015 03:20:16 +0000 (UTC) Received: from userv0022.oracle.com (userv0022.oracle.com [156.151.31.74]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id t8O3K2Vs020930 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 24 Sep 2015 03:20:05 GMT Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by userv0022.oracle.com (8.13.8/8.13.8) with ESMTP id t8O3K1FE003314 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 24 Sep 2015 03:20:01 GMT Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1Zex4i-0007Zo-VD; Wed, 23 Sep 2015 20:20:00 -0700 Received: from aserv0022.oracle.com ([141.146.126.234]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1Zex4L-0007Ye-0e for ocfs2-devel@oss.oracle.com; Wed, 23 Sep 2015 20:19:37 -0700 Received: from aserp1020.oracle.com (aserp1020.oracle.com [141.146.126.67]) by aserv0022.oracle.com (8.13.8/8.13.8) with ESMTP id t8O3JaDI011851 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 24 Sep 2015 03:19:36 GMT Received: from userp2040.oracle.com (userp2040.oracle.com [156.151.31.90]) by aserp1020.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id t8O3JaHk026016 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Thu, 24 Sep 2015 03:19:36 GMT Received: from pps.filterd (userp2040.oracle.com [127.0.0.1]) by userp2040.oracle.com (8.15.0.59/8.15.0.59) with SMTP id t8O3JEb4026141 for ; Thu, 24 Sep 2015 03:19:36 GMT Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [58.251.152.64]) by userp2040.oracle.com with ESMTP id 1x4952r0mm-1 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=NOT) for ; Thu, 24 Sep 2015 03:19:35 +0000 Received: from 172.24.1.50 (EHLO szxeml428-hub.china.huawei.com) ([172.24.1.50]) by szxrg01-dlp.huawei.com (MOS 4.3.7-GA FastPath queued) with ESMTP id CVS24087; Thu, 24 Sep 2015 11:19:22 +0800 (CST) Received: from [127.0.0.1] (10.177.22.101) by szxeml428-hub.china.huawei.com (10.82.67.183) with Microsoft SMTP Server id 14.3.235.1; Thu, 24 Sep 2015 11:19:16 +0800 Message-ID: <56036BB3.9020302@huawei.com> Date: Thu, 24 Sep 2015 11:19:15 +0800 From: Joseph Qi User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: Andrew Morton X-Originating-IP: [10.177.22.101] X-CFilter-Loop: Reflected X-Proofpoint-SPF-Result: pass X-Proofpoint-SPF-Record: v=spf1 ip4:119.145.14.64/30 ip4:58.251.152.64/30 ip4:119.145.14.93 ip4:58.251.152.93 ip4:206.16.17.74 ip4:194.213.3.16 ip4:194.213.3.17 ip4:206.16.17.72 ~all X-ServerName: szxga01-in.huawei.com X-Proofpoint-Virus-Version: vendor=nai engine=5700 definitions=7933 signatures=670636 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 suspectscore=2 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1507310000 definitions=main-1509240059 Cc: Mark Fasheh , "ocfs2-devel@oss.oracle.com" Subject: [Ocfs2-devel] [PATCH 2/2] ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Source-IP: userv0022.oracle.com [156.151.31.74] X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When master handles convert request, it queues ast first and then returns status. This may happen that the ast is sent before the request status because the above two messages are sent by two threads. And right after the ast is sent, if master down, it may trigger BUG in dlm_move_lockres_to_recovery_list in the requested node because ast handler moves it to grant list without clear lock->convert_pending. So remove BUG_ON statement and check if the ast is processed in dlmconvert_remote. Signed-off-by: Joseph Qi Cc: --- fs/ocfs2/dlm/dlmconvert.c | 13 +++++++++++++ fs/ocfs2/dlm/dlmrecovery.c | 1 - 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ocfs2/dlm/dlmconvert.c b/fs/ocfs2/dlm/dlmconvert.c index 9e6116e..a338715 100644 --- a/fs/ocfs2/dlm/dlmconvert.c +++ b/fs/ocfs2/dlm/dlmconvert.c @@ -288,6 +288,19 @@ enum dlm_status dlmconvert_remote(struct dlm_ctxt *dlm, status = DLM_DENIED; goto bail; } + + if (lock->ml.type == type && lock->ml.convert_type == LKM_IVMODE) { + mlog(0, "last convert request returned DLM_RECOVERING, but " + "owner has already queued and sent ast to me. res %.*s, " + "(cookie=%u:%llu, type=%d, conv=%d)\n", + res->lockname.len, res->lockname.name, + dlm_get_lock_cookie_node(be64_to_cpu(lock->ml.cookie)), + dlm_get_lock_cookie_seq(be64_to_cpu(lock->ml.cookie)), + lock->ml.type, lock->ml.convert_type); + status = DLM_NORMAL; + goto bail; + } + res->state |= DLM_LOCK_RES_IN_PROGRESS; /* move lock to local convert queue */ /* do not alter lock refcount. switching lists. */ diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c index 3d90ad7..5c0f761 100644 --- a/fs/ocfs2/dlm/dlmrecovery.c +++ b/fs/ocfs2/dlm/dlmrecovery.c @@ -2064,7 +2064,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm, dlm_lock_get(lock); if (lock->convert_pending) { /* move converting lock back to granted */ - BUG_ON(i != DLM_CONVERTING_LIST); mlog(0, "node died with convert pending " "on %.*s. move back to granted list.\n", res->lockname.len, res->lockname.name);