From patchwork Fri Nov 24 17:43:50 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "monty_pavel@sina.com" X-Patchwork-Id: 10074531 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 9AEC960375 for ; Fri, 24 Nov 2017 17:45:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 834AA2A283 for ; Fri, 24 Nov 2017 17:45:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 756702A297; Fri, 24 Nov 2017 17:45:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.9 required=2.0 tests=BAYES_00,FREEMAIL_FROM, MSGID_MULTIPLE_AT,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1FB762A283 for ; Fri, 24 Nov 2017 17:45:03 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 2188E356C6; Fri, 24 Nov 2017 17:45:02 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id A05272AE59; Fri, 24 Nov 2017 17:44:59 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 0181D4A467; Fri, 24 Nov 2017 17:44:57 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id vAOHi34T023883 for ; Fri, 24 Nov 2017 12:44:03 -0500 Received: by smtp.corp.redhat.com (Postfix) id 6CD2B5D6AE; Fri, 24 Nov 2017 17:44:03 +0000 (UTC) Delivered-To: dm-devel@redhat.com Received: from mx1.redhat.com (ext-mx01.extmail.prod.ext.phx2.redhat.com [10.5.110.25]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 62AE15D6A5 for ; Fri, 24 Nov 2017 17:44:01 +0000 (UTC) Received: from mail78-58.sinamail.sina.com.cn (mail78-58.sinamail.sina.com.cn [219.142.78.58]) by mx1.redhat.com (Postfix) with SMTP id 9114081DFB for ; Fri, 24 Nov 2017 17:43:54 +0000 (UTC) Received: from unknown (HELO softwareyyp@sina.com)([220.113.8.82]) by sina.com with ESMTP id 5A185A560000024C; Fri, 25 Nov 2017 01:43:51 +0800 (CST) X-Sender: monty_pavel@sina.com X-Auth-ID: monty_pavel@sina.com X-SMAIL-MID: 950368399473 Date: Sat, 25 Nov 2017 01:43:50 +0800 From: monty To: Alasdair G Kergon Message-ID: <20171124174350.GB14660@softwareyyp@sina.com> References: <201711241659129513254@sina.com> <20171124141258.GH5952@agk-dp.fab.redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20171124141258.GH5952@agk-dp.fab.redhat.com> User-Agent: Mutt/1.5.24 (2015-08-30) X-Greylist: Sender passed SPF test, Sender IP whitelisted by DNSRBL, ACL 207 matched, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 24 Nov 2017 17:43:57 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 24 Nov 2017 17:43:57 +0000 (UTC) for IP:'219.142.78.58' DOMAIN:'mail78-58.sinamail.sina.com.cn' HELO:'mail78-58.sinamail.sina.com.cn' FROM:'monty_pavel@sina.com' RCPT:'' X-RedHat-Spam-Score: 0.99 (FREEMAIL_FROM, MSGID_MULTIPLE_AT, RCVD_IN_DNSWL_NONE, SPF_PASS) 219.142.78.58 mail78-58.sinamail.sina.com.cn 219.142.78.58 mail78-58.sinamail.sina.com.cn X-Scanned-By: MIMEDefang 2.78 on 10.5.110.25 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-loop: dm-devel@redhat.com Cc: dm-devel@redhat.com Subject: Re: [dm-devel] [PATCH] dm thin: fix NULL pointer exception caused by two concurrent "vgchange -ay -K " processes. X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Fri, 24 Nov 2017 17:45:03 +0000 (UTC) X-Virus-Scanned: ClamAV using ClamSMTP Yes, it is. New patch is as follows: We got a NULL pointer exception when testing the two concurrent "vgchange -ay -K ". panic call trace: PID: 25992 TASK: ffff883cd7d23500 CPU: 4 COMMAND: "vgchange" #0 [ffff883cd743d600] machine_kexec at ffffffff81038fa9 0000001 [ffff883cd743d660] crash_kexec at ffffffff810c5992 0000002 [ffff883cd743d730] oops_end at ffffffff81515c90 0000003 [ffff883cd743d760] no_context at ffffffff81049f1b 0000004 [ffff883cd743d7b0] __bad_area_nosemaphore at ffffffff8104a1a5 0000005 [ffff883cd743d800] bad_area at ffffffff8104a2ce 0000006 [ffff883cd743d830] __do_page_fault at ffffffff8104aa6f 0000007 [ffff883cd743d950] do_page_fault at ffffffff81517bae 0000008 [ffff883cd743d980] page_fault at ffffffff81514f95 [exception RIP: kmem_cache_alloc+108] RIP: ffffffff8116ef3c RSP: ffff883cd743da38 RFLAGS: 00010046 RAX: 0000000000000004 RBX: ffffffff81121b90 RCX: ffff881bf1e78cc0 RDX: 0000000000000000 RSI: 00000000000000d0 RDI: 0000000000000000 RBP: ffff883cd743da68 R8: ffff881bf1a4eb00 R9: 0000000080042000 R10: 0000000000002000 R11: 0000000000000000 R12: 00000000000000d0 R13: 0000000000000000 R14: 00000000000000d0 R15: 0000000000000246 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 0000009 [ffff883cd743da70] mempool_alloc_slab at ffffffff81121ba5 0000010 [ffff883cd743da80] mempool_create_node at ffffffff81122083 0000011 [ffff883cd743dad0] mempool_create at ffffffff811220f4 0000012 [ffff883cd743dae0] pool_ctr at ffffffffa08de049 [dm_thin_pool] 0000013 [ffff883cd743dbd0] dm_table_add_target at ffffffffa0005f2f [dm_mod] 0000014 [ffff883cd743dc30] table_load at ffffffffa0008ba9 [dm_mod] 0000015 [ffff883cd743dc90] ctl_ioctl at ffffffffa0009dc4 [dm_mod] this bug's scence is as follows: process A(vgchange -ay -K): a. send DM_LIST_VERSIONS_CMD ioctl; b. pool_target not registered; c. modprobe dm_thin_pool and wait until end. process B(vgchange -ay -K): a. send DM_LIST_VERSIONS_CMD ioctl; b. pool_target registered; c. table_load->dm_table_add_target->pool_ctr; d. _new_mapping_cache is NULL and panic. note: 1. process A and process B are two concurrent processes. 2. pool_target can be detected by process B but _new_mapping_cache initialization has not ended. All that we need do is to ensure pool_target registering ops is the last ops in dm_thin_init. So does the dm-cache, dm-mpath, and dm-snap. Thanks for Alasdair G Kergon reminding me. Signed-off-by: monty Signed-off-by: Alasdair G Kergon --- drivers/md/dm-cache-target.c | 12 +++++----- drivers/md/dm-mpath.c | 18 +++++++------- drivers/md/dm-snap.c | 48 +++++++++++++++++++++--------------------- drivers/md/dm-thin.c | 22 ++++++++---------- 4 files changed, 49 insertions(+), 51 deletions(-) >From c031cdcd0e88f36e295c60506322279bbaba999b Mon Sep 17 00:00:00 2001 From: monty Date: Fri, 24 Nov 2017 10:54:05 -0500 Subject: [PATCH] We got a NULL pointer exception when testing the two concurrent "vgchange -ay -K ". panic call trace: PID: 25992 TASK: ffff883cd7d23500 CPU: 4 COMMAND: "vgchange" #0 [ffff883cd743d600] machine_kexec at ffffffff81038fa9 0000001 [ffff883cd743d660] crash_kexec at ffffffff810c5992 0000002 [ffff883cd743d730] oops_end at ffffffff81515c90 0000003 [ffff883cd743d760] no_context at ffffffff81049f1b 0000004 [ffff883cd743d7b0] __bad_area_nosemaphore at ffffffff8104a1a5 0000005 [ffff883cd743d800] bad_area at ffffffff8104a2ce 0000006 [ffff883cd743d830] __do_page_fault at ffffffff8104aa6f 0000007 [ffff883cd743d950] do_page_fault at ffffffff81517bae 0000008 [ffff883cd743d980] page_fault at ffffffff81514f95 [exception RIP: kmem_cache_alloc+108] RIP: ffffffff8116ef3c RSP: ffff883cd743da38 RFLAGS: 00010046 RAX: 0000000000000004 RBX: ffffffff81121b90 RCX: ffff881bf1e78cc0 RDX: 0000000000000000 RSI: 00000000000000d0 RDI: 0000000000000000 RBP: ffff883cd743da68 R8: ffff881bf1a4eb00 R9: 0000000080042000 R10: 0000000000002000 R11: 0000000000000000 R12: 00000000000000d0 R13: 0000000000000000 R14: 00000000000000d0 R15: 0000000000000246 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 0000009 [ffff883cd743da70] mempool_alloc_slab at ffffffff81121ba5 0000010 [ffff883cd743da80] mempool_create_node at ffffffff81122083 0000011 [ffff883cd743dad0] mempool_create at ffffffff811220f4 0000012 [ffff883cd743dae0] pool_ctr at ffffffffa08de049 [dm_thin_pool] 0000013 [ffff883cd743dbd0] dm_table_add_target at ffffffffa0005f2f [dm_mod] 0000014 [ffff883cd743dc30] table_load at ffffffffa0008ba9 [dm_mod] 0000015 [ffff883cd743dc90] ctl_ioctl at ffffffffa0009dc4 [dm_mod] this bug's scence is as follows: process A(vgchange -ay -K): a. send DM_LIST_VERSIONS_CMD ioctl; b. pool_target not registered; c. modprobe dm_thin_pool and wait until end. process B(vgchange -ay -K): a. send DM_LIST_VERSIONS_CMD ioctl; b. pool_target registered; c. table_load->dm_table_add_target->pool_ctr; d. _new_mapping_cache is NULL and panic. note: 1. process A and process B are two concurrent processes. 2. pool_target can be detected by process B but _new_mapping_cache initialization has not ended. All that we need do is to ensure pool_target registering ops is the last ops in dm_thin_init. So does the dm-cache, dm-mpath, and dm-snap. Thanks for Alasdair G Kergon reminding me. Signed-off-by: monty Signed-off-by: Alasdair G Kergon --- drivers/md/dm-cache-target.c | 12 +++++----- drivers/md/dm-mpath.c | 18 +++++++------- drivers/md/dm-snap.c | 48 +++++++++++++++++++++--------------------- drivers/md/dm-thin.c | 22 ++++++++---------- 4 files changed, 49 insertions(+), 51 deletions(-) diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c index cf23a14..47407e4 100644 --- a/drivers/md/dm-cache-target.c +++ b/drivers/md/dm-cache-target.c @@ -3472,18 +3472,18 @@ static int __init dm_cache_init(void) { int r; - r = dm_register_target(&cache_target); - if (r) { - DMERR("cache target registration failed: %d", r); - return r; - } - migration_cache = KMEM_CACHE(dm_cache_migration, 0); if (!migration_cache) { dm_unregister_target(&cache_target); return -ENOMEM; } + r = dm_register_target(&cache_target); + if (r) { + DMERR("cache target registration failed: %d", r); + return r; + } + return 0; } diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index c8faa2b..35a2a2f 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -1957,13 +1957,6 @@ static int __init dm_multipath_init(void) { int r; - r = dm_register_target(&multipath_target); - if (r < 0) { - DMERR("request-based register failed %d", r); - r = -EINVAL; - goto bad_register_target; - } - kmultipathd = alloc_workqueue("kmpathd", WQ_MEM_RECLAIM, 0); if (!kmultipathd) { DMERR("failed to create workqueue kmpathd"); @@ -1985,13 +1978,20 @@ static int __init dm_multipath_init(void) goto bad_alloc_kmpath_handlerd; } + r = dm_register_target(&multipath_target); + if (r < 0) { + DMERR("request-based register failed %d", r); + r = -EINVAL; + goto bad_register_target; + } + return 0; +bad_register_target: + destroy_workqueue(kmpath_handlerd); bad_alloc_kmpath_handlerd: destroy_workqueue(kmultipathd); bad_alloc_kmultipathd: - dm_unregister_target(&multipath_target); -bad_register_target: return r; } diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index 1113b42..a0613bd 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -2411,24 +2411,6 @@ static int __init dm_snapshot_init(void) return r; } - r = dm_register_target(&snapshot_target); - if (r < 0) { - DMERR("snapshot target register failed %d", r); - goto bad_register_snapshot_target; - } - - r = dm_register_target(&origin_target); - if (r < 0) { - DMERR("Origin target register failed %d", r); - goto bad_register_origin_target; - } - - r = dm_register_target(&merge_target); - if (r < 0) { - DMERR("Merge target register failed %d", r); - goto bad_register_merge_target; - } - r = init_origin_hash(); if (r) { DMERR("init_origin_hash failed."); @@ -2449,19 +2431,37 @@ static int __init dm_snapshot_init(void) goto bad_pending_cache; } + r = dm_register_target(&snapshot_target); + if (r < 0) { + DMERR("snapshot target register failed %d", r); + goto bad_register_snapshot_target; + } + + r = dm_register_target(&origin_target); + if (r < 0) { + DMERR("Origin target register failed %d", r); + goto bad_register_origin_target; + } + + r = dm_register_target(&merge_target); + if (r < 0) { + DMERR("Merge target register failed %d", r); + goto bad_register_merge_target; + } + return 0; -bad_pending_cache: - kmem_cache_destroy(exception_cache); -bad_exception_cache: - exit_origin_hash(); -bad_origin_hash: - dm_unregister_target(&merge_target); bad_register_merge_target: dm_unregister_target(&origin_target); bad_register_origin_target: dm_unregister_target(&snapshot_target); bad_register_snapshot_target: + kmem_cache_destroy(pending_cache); +bad_pending_cache: + kmem_cache_destroy(exception_cache); +bad_exception_cache: + exit_origin_hash(); +bad_origin_hash: dm_exception_store_exit(); return r; diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 89e5dff..f91d771 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -4355,30 +4355,28 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits) static int __init dm_thin_init(void) { - int r; + int r = -ENOMEM; pool_table_init(); + _new_mapping_cache = KMEM_CACHE(dm_thin_new_mapping, 0); + if (!_new_mapping_cache) + return r; + r = dm_register_target(&thin_target); if (r) - return r; + goto bad_new_mapping_cache; r = dm_register_target(&pool_target); if (r) - goto bad_pool_target; - - r = -ENOMEM; - - _new_mapping_cache = KMEM_CACHE(dm_thin_new_mapping, 0); - if (!_new_mapping_cache) - goto bad_new_mapping_cache; + goto bad_thin_target; return 0; -bad_new_mapping_cache: - dm_unregister_target(&pool_target); -bad_pool_target: +bad_thin_target: dm_unregister_target(&thin_target); +bad_new_mapping_cache: + kmem_cache_destroy(_new_mapping_cache); return r; } -- 1.7.1