From patchwork Tue Aug 27 06:17:17 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joseph Qi
X-Patchwork-Id: 2850006
Message-ID: <521C446D.8000500@huawei.com>
Date: Tue, 27 Aug 2013 14:17:17 +0800
From: Joseph Qi
To: Andrew Morton
Cc: Mark Fasheh, "ocfs2-devel@oss.oracle.com"
Subject: [Ocfs2-devel] [PATCH] ocfs2: fix a tiny race case when fire callbacks
X-BeenThere: ocfs2-devel@oss.oracle.com
Precedence: list
Sender: ocfs2-devel-bounces@oss.oracle.com

In o2hb_shutdown_slot() and o2hb_check_slot(), the event is a local
variable, so it is valid only while the function that declared it is
still on the call stack. The following tiny race can therefore happen
when multiple volumes are mounted:

        o2hb-vol1                  o2hb-vol2
1) o2hb_shutdown_slot
allocate local event1
2) queue_node_event
add event1 to global o2hb_node_events
                              3) o2hb_shutdown_slot
                              allocate local event2
                              4) queue_node_event
                              add event2 to global o2hb_node_events
                              5) o2hb_run_event_list
                              delete event1 from o2hb_node_events
6) o2hb_run_event_list
event1 empty, return
7) o2hb_shutdown_slot
event1 lifecycle ends
                              8) o2hb_fire_callbacks
                              event1 is already *invalid*

This patch makes the caller wait on o2hb_callback_sem while another
thread is firing callbacks, so the local event stays valid until it has
been processed. For performance, o2hb_run_event_list() is now called
only when an event has actually been queued.
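For readers who want to trace the fixed flow outside the kernel, here is a
minimal userspace sketch. It is an illustration under assumptions, not the
code in heartbeat.c: struct event, queue_event(), run_event_list(),
replay_interleaving(), live_lock and callback_lock are hypothetical stand-ins
for the o2hb_* symbols, pthread mutexes stand in for o2hb_live_lock and
o2hb_callback_sem, and the two heartbeat threads are replayed on a single
thread so the outcome is deterministic.

```c
/* Userspace sketch of the patched flow; NOT the kernel code.
 * Every name below is a hypothetical stand-in for an o2hb_* symbol. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct event {
	struct event *next;
	int linked;	/* 1 while the event sits on the global list */
	int node_num;
};

static struct event *head, *tail;	/* stand-in for o2hb_node_events */
static pthread_mutex_t live_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t callback_lock = PTHREAD_MUTEX_INITIALIZER;
static int fired[8];	/* per-node count of fired callbacks */

/* Append an event to the global list. Caller holds live_lock. */
static void queue_event(struct event *ev, int node_num)
{
	ev->next = NULL;
	ev->linked = 1;
	ev->node_num = node_num;
	if (tail)
		tail->next = ev;
	else
		head = ev;
	tail = ev;
}

/* Run the list in order until the passed event has been processed. */
static void run_event_list(struct event *queued)
{
	/* Taken unconditionally: if another thread already dequeued our
	 * event, we block here until that thread has finished firing it. */
	pthread_mutex_lock(&callback_lock);
	pthread_mutex_lock(&live_lock);
	while (head && queued->linked) {
		struct event *ev = head;

		head = ev->next;
		if (!head)
			tail = NULL;
		ev->linked = 0;
		pthread_mutex_unlock(&live_lock);

		fired[ev->node_num]++;	/* stands in for firing callbacks */

		pthread_mutex_lock(&live_lock);
	}
	pthread_mutex_unlock(&live_lock);
	pthread_mutex_unlock(&callback_lock);
}

/* Replays the interleaving from the table above on one thread. */
static int replay_interleaving(void)
{
	struct event event1, event2;
	int queued1 = 0, queued2 = 0;

	pthread_mutex_lock(&live_lock);
	queue_event(&event1, 1);	/* vol1 queues its local event */
	queued1 = 1;
	queue_event(&event2, 2);	/* vol2 queues its local event */
	queued2 = 1;
	pthread_mutex_unlock(&live_lock);

	if (queued2)
		run_event_list(&event2);	/* vol2 fires event1, then event2 */
	if (queued1)
		run_event_list(&event1);	/* vol1 finds nothing left to fire */

	assert(!event1.linked && !event2.linked);
	return fired[1] + fired[2];
}
```

In the sketch, the second run_event_list() call still takes callback_lock
before it looks at its own event, which is the point of dropping the early
list_empty() return: with real threads, the owner of event1 cannot leave its
stack frame while another thread is still firing callbacks for it.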
Signed-off-by: Joyce
Signed-off-by: Joseph Qi
---
 fs/ocfs2/cluster/heartbeat.c |   18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

--
1.7.9.7

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 42252bf..af5cd3b 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -641,16 +641,9 @@ static void o2hb_fire_callbacks(struct o2hb_callback *hbcall,
 /* Will run the list in order until we process the passed event */
 static void o2hb_run_event_list(struct o2hb_node_event *queued_event)
 {
-	int empty;
 	struct o2hb_callback *hbcall;
 	struct o2hb_node_event *event;
 
-	spin_lock(&o2hb_live_lock);
-	empty = list_empty(&queued_event->hn_item);
-	spin_unlock(&o2hb_live_lock);
-	if (empty)
-		return;
-
 	/* Holding callback sem assures we don't alter the callback
 	 * lists when doing this, and serializes ourselves with other
 	 * processes wanting callbacks. */
@@ -709,6 +702,7 @@ static void o2hb_shutdown_slot(struct o2hb_disk_slot *slot)
 	struct o2hb_node_event event =
 		{ .hn_item = LIST_HEAD_INIT(event.hn_item), };
 	struct o2nm_node *node;
+	int queued = 0;
 
 	node = o2nm_get_node_by_num(slot->ds_node_num);
 	if (!node)
@@ -726,11 +720,13 @@ static void o2hb_shutdown_slot(struct o2hb_disk_slot *slot)
 
 			o2hb_queue_node_event(&event, O2HB_NODE_DOWN_CB, node,
 					      slot->ds_node_num);
+			queued = 1;
 		}
 	}
 	spin_unlock(&o2hb_live_lock);
 
-	o2hb_run_event_list(&event);
+	if (queued)
+		o2hb_run_event_list(&event);
 
 	o2nm_node_put(node);
 }
@@ -790,6 +786,7 @@ static int o2hb_check_slot(struct o2hb_region *reg,
 	unsigned int dead_ms = o2hb_dead_threshold * O2HB_REGION_TIMEOUT_MS;
 	unsigned int slot_dead_ms;
 	int tmp;
+	int queued = 0;
 
 	memcpy(hb_block, slot->ds_raw_block, reg->hr_block_bytes);
 
@@ -883,6 +880,7 @@ fire_callbacks:
 					      slot->ds_node_num);
 
 			changed = 1;
+			queued = 1;
 		}
 
 		list_add_tail(&slot->ds_live_item,
@@ -934,6 +932,7 @@ fire_callbacks:
 					      node, slot->ds_node_num);
 
 			changed = 1;
+			queued = 1;
 		}
 
 		/* We don't clear this because the node is still
@@ -949,7 +948,8 @@ fire_callbacks:
 out:
 	spin_unlock(&o2hb_live_lock);
 
-	o2hb_run_event_list(&event);
+	if (queued)
+		o2hb_run_event_list(&event);
 
 	if (node)
 		o2nm_node_put(node);