From patchwork Tue Dec 12 00:26:24 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 10106057
X-Patchwork-Delegate: snitzer@redhat.com
From: Bart Van Assche
To: Mike Snitzer
Cc: Bart Van Assche, dm-devel@redhat.com, Hannes Reinecke, stable@vger.kernel.org
Date: Mon, 11 Dec 2017 16:26:24 -0800
Message-Id: <20171212002624.7747-1-bart.vanassche@wdc.com>
Subject: [dm-devel] [PATCH] dm-mpath: Fix a race condition
List-Id: device-mapper development

Hold multipath.lock around all code that iterates over the
priority_groups list.
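
For readers less familiar with the pattern, the locking rule above can be
illustrated with a small userspace analogue. This is only a sketch and not
kernel code: struct group, struct owner and the owner_*() helpers are
hypothetical stand-ins for struct priority_group and struct multipath, and
a pthread mutex stands in for the spinlock. It shows the "detach one entry
under the lock, free it after unlocking" loop that the free_multipath()
change in this patch uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct priority_group and struct multipath. */
struct group {
	struct group *next;
	int id;
};

struct owner {
	pthread_mutex_t lock;	/* protects head and every ->next link */
	struct group *head;
};

static void owner_init(struct owner *o)
{
	assert(o != NULL);
	pthread_mutex_init(&o->lock, NULL);
	o->head = NULL;
}

/* Push a new group at the head of the list; returns 0, or -1 on OOM. */
static int owner_add(struct owner *o, int id)
{
	struct group *g = malloc(sizeof(*g));

	if (!g)
		return -1;
	g->id = id;
	pthread_mutex_lock(&o->lock);
	g->next = o->head;
	o->head = g;
	pthread_mutex_unlock(&o->lock);
	return 0;
}

/*
 * Detach one entry at a time while holding the lock, then free it after
 * unlocking. Any walker that takes the same lock sees a consistent list.
 * Returns the number of entries freed.
 */
static int owner_free_all(struct owner *o)
{
	int freed = 0;

	for (;;) {
		struct group *g;

		pthread_mutex_lock(&o->lock);
		g = o->head;
		if (g)
			o->head = g->next;
		pthread_mutex_unlock(&o->lock);

		if (!g)
			break;
		free(g);	/* teardown runs without the lock held */
		freed++;
	}
	return freed;
}
```

The sketch only stays correct if every other walker of the list also takes
the same lock and does not keep a pointer to an entry after dropping it.
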
This patch fixes the following crash:

general protection fault: 0000 [#1] PREEMPT SMP
RIP: 0010:multipath_busy+0x77/0xd0 [dm_multipath]
Call Trace:
 dm_mq_queue_rq+0x44/0x110 [dm_mod]
 blk_mq_dispatch_rq_list+0x73/0x440
 blk_mq_do_dispatch_sched+0x60/0xe0
 blk_mq_sched_dispatch_requests+0x11a/0x1a0
 __blk_mq_run_hw_queue+0x11f/0x1c0
 __blk_mq_delay_run_hw_queue+0x95/0xe0
 blk_mq_run_hw_queue+0x25/0x80
 blk_mq_flush_plug_list+0x197/0x420
 blk_flush_plug_list+0xe4/0x270
 blk_finish_plug+0x27/0x40
 __do_page_cache_readahead+0x2b4/0x370
 force_page_cache_readahead+0xb4/0x110
 generic_file_read_iter+0x755/0x970
 __vfs_read+0xd2/0x140
 vfs_read+0x9b/0x140
 SyS_read+0x45/0xa0
 do_syscall_64+0x56/0x1a0
 entry_SYSCALL64_slow_path+0x25/0x25

From the disassembly of multipath_busy (0x77 = 119):

./include/linux/blkdev.h:
992		return bdev->bd_disk->queue;	/* this is never NULL */
   0x00000000000006b4 <+116>:	mov    (%rax),%rax
   0x00000000000006b7 <+119>:	mov    0xe0(%rax),%rax

Signed-off-by: Bart Van Assche
Cc: Hannes Reinecke
Cc: stable@vger.kernel.org
---
 drivers/md/dm-mpath.c | 40 ++++++++++++++++++++++++++++------------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index c8faa2b85842..61def92f306a 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -237,10 +237,19 @@ static int alloc_multipath_stage2(struct dm_target *ti, struct multipath *m)

 static void free_multipath(struct multipath *m)
 {
-	struct priority_group *pg, *tmp;
+	struct priority_group *pg;
+	unsigned long flags;
+
+	while (true) {
+		spin_lock_irqsave(&m->lock, flags);
+		pg = list_first_entry_or_null(&m->priority_groups, typeof(*pg),
+					      list);
+		if (pg)
+			list_del(&pg->list);
+		spin_unlock_irqrestore(&m->lock, flags);

-	list_for_each_entry_safe(pg, tmp, &m->priority_groups, list) {
-		list_del(&pg->list);
+		if (!pg)
+			break;
 		free_priority_group(pg, m->ti);
 	}

@@ -337,6 +346,7 @@ static int pg_init_all_paths(struct multipath *m)
 }

 static void __switch_pg(struct multipath *m, struct priority_group *pg)
+	__must_hold(&m->lock)
 {
 	m->current_pg = pg;

@@ -355,8 +365,8 @@ static void __switch_pg(struct multipath *m, struct priority_group *pg)
 static struct pgpath *choose_path_in_pg(struct multipath *m,
 					struct priority_group *pg,
 					size_t nr_bytes)
+	__must_hold(&m->lock)
 {
-	unsigned long flags;
 	struct dm_path *path;
 	struct pgpath *pgpath;

@@ -368,10 +378,8 @@ static struct pgpath *choose_path_in_pg(struct multipath *m,

 	if (unlikely(READ_ONCE(m->current_pg) != pg)) {
 		/* Only update current_pgpath if pg changed */
-		spin_lock_irqsave(&m->lock, flags);
 		m->current_pgpath = pgpath;
 		__switch_pg(m, pg);
-		spin_unlock_irqrestore(&m->lock, flags);
 	}

 	return pgpath;
@@ -381,7 +389,7 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes)
 {
 	unsigned long flags;
 	struct priority_group *pg;
-	struct pgpath *pgpath;
+	struct pgpath *pgpath, *res = NULL;
 	unsigned bypassed = 1;

 	if (!atomic_read(&m->nr_valid_paths)) {
@@ -419,6 +427,7 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes)
 	 * Second time we only try the ones we skipped, but set
 	 * pg_init_delay_retry so we do not hammer controllers.
 	 */
+	spin_lock_irqsave(&m->lock, flags);
 	do {
 		list_for_each_entry(pg, &m->priority_groups, list) {
 			if (pg->bypassed == !!bypassed)
@@ -427,18 +436,22 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes)
 			if (!IS_ERR_OR_NULL(pgpath)) {
 				if (!bypassed)
 					set_bit(MPATHF_PG_INIT_DELAY_RETRY, &m->flags);
-				return pgpath;
+				res = pgpath;
+				break;
 			}
 		}
-	} while (bypassed--);
+	} while (!res && bypassed--);
+	spin_unlock_irqrestore(&m->lock, flags);

 failed:
 	spin_lock_irqsave(&m->lock, flags);
-	m->current_pgpath = NULL;
-	m->current_pg = NULL;
+	if (!res) {
+		m->current_pgpath = NULL;
+		m->current_pg = NULL;
+	}
 	spin_unlock_irqrestore(&m->lock, flags);

-	return NULL;
+	return res;
 }

 /*
@@ -1875,6 +1888,7 @@ static int multipath_busy(struct dm_target *ti)
 	struct multipath *m = ti->private;
 	struct priority_group *pg, *next_pg;
 	struct pgpath *pgpath;
+	unsigned long flags;

 	/* pg_init in progress */
 	if (atomic_read(&m->pg_init_in_progress))
@@ -1906,6 +1920,7 @@ static int multipath_busy(struct dm_target *ti)
 	 * will be able to select it. So we consider such a pg as not busy.
 	 */
 	busy = true;
+	spin_lock_irqsave(&m->lock, flags);
 	list_for_each_entry(pgpath, &pg->pgpaths, list) {
 		if (pgpath->is_active) {
 			has_active = true;
@@ -1915,6 +1930,7 @@ static int multipath_busy(struct dm_target *ti)
 			}
 		}
 	}
+	spin_unlock_irqrestore(&m->lock, flags);

 	if (!has_active) {
 		/*