From patchwork Mon Feb 25 09:49:14 2013
X-Patchwork-Submitter: Junichi Nomura
X-Patchwork-Id: 2180511
Message-ID: <512B339A.7010606@ce.jp.nec.com>
Date: Mon, 25 Feb 2013 18:49:14 +0900
From: "Jun'ichi Nomura"
To: Bart Van Assche
References: <51274C2F.6070500@acm.org> <51274CC3.9070204@acm.org>
In-Reply-To: <51274CC3.9070204@acm.org>
Cc: Jens Axboe, linux-scsi, Mike Snitzer, James Bottomley,
 device-mapper development, Tejun Heo, Alasdair G Kergon
Subject: Re: [dm-devel] [PATCH 2/2] dm: Avoid use-after-free of a mapped device

Hello Bart,

On 02/22/13 19:47, Bart Van Assche wrote:
> As the comment above rq_completed() explains, md members must
> not be touched after the dm_put() at the end of that function
> has been invoked. Avoid that the md->queue can be run
> asynchronously after the last md reference has been dropped by
> running that queue synchronously. This patch fixes the
> following kernel oops:

Calling blk_run_queue_async() there should be ok.
After dm_put(), the dm device may be removed.
But free_dev() in dm.c calls blk_cleanup_queue() and
it should solve the race vs. delayed work.

And I could reproduce a very similar oops without removing the dm device
by the following procedure:
(please replace "mpathX" with your dm-multipath map name)
  # t=`dmsetup table mpathX`
  # while sleep 1; do \
      echo "$t" | dmsetup load mpathX; dmsetup resume mpathX; done

Looking at the following back trace:
> general protection fault: 0000 [#1] SMP
> RIP: 0010:[] [] mempool_free+0x24/0xb0
> Call Trace:
>
> [] bio_put+0x97/0xc0
> [] end_clone_bio+0x35/0x90 [dm_mod]
> [] bio_endio+0x1d/0x30
> [] req_bio_endio.isra.51+0xa3/0xe0
> [] blk_update_request+0x118/0x520
> [] blk_update_bidi_request+0x27/0xa0
> [] blk_end_bidi_request+0x2c/0x80
> [] blk_end_request+0x10/0x20
> [] scsi_io_completion+0xfb/0x6c0 [scsi_mod]
> [] scsi_finish_command+0xbd/0x120 [scsi_mod]
> [] scsi_softirq_done+0x13f/0x160 [scsi_mod]
> [] blk_done_softirq+0x80/0xa0
> [] __do_softirq+0xf1/0x250
> [] call_softirq+0x1c/0x30
> [] do_softirq+0x8d/0xc0
> [] irq_exit+0xd5/0xe0
> [] do_IRQ+0x63/0xe0
> [] common_interrupt+0x6f/0x6f
>
> [] srp_queuecommand+0x8c/0xcb0 [ib_srp]
> [] scsi_dispatch_cmd+0x148/0x310 [scsi_mod]
> [] scsi_request_fn+0x31e/0x520 [scsi_mod]
> [] __blk_run_queue+0x37/0x50
> [] blk_delay_work+0x29/0x40
> [] process_one_work+0x1c3/0x5c0
> [] worker_thread+0x15e/0x440
> [] kthread+0xdb/0xe0
> [] ret_from_fork+0x7c/0xb0

it seems that the bioset was removed while being referenced.

c0820cf5 "dm: introduce per_bio_data" started to replace the dm bioset
during table replacement because the size of the bioset front_pad might
change for bio-based dm.
However, for request-based dm, this is not necessary because the size
of front_pad is static. Also, we can't simply replace the bioset because
prep-ed requests in the queue hold references to the old bioset.

The patch below changes it not to replace the bioset for request-based dm.
(This brings back the same behavior as v3.7.)
With this patch, I could not reproduce the problem.

Could you try this?

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 314a0e2..51fefb5 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1973,15 +1973,27 @@ static void __bind_mempools(struct mapped_device *md, struct dm_table *t)
 {
 	struct dm_md_mempools *p = dm_table_get_md_mempools(t);
 
-	if (md->io_pool && (md->tio_pool || dm_table_get_type(t) == DM_TYPE_BIO_BASED) && md->bs) {
-		/*
-		 * The md already has necessary mempools. Reload just the
-		 * bioset because front_pad may have changed because
-		 * a different table was loaded.
-		 */
-		bioset_free(md->bs);
-		md->bs = p->bs;
-		p->bs = NULL;
+	if (md->io_pool && md->bs) {
+		/* The md already has necessary mempools. */
+		if (dm_table_get_type(t) == DM_TYPE_BIO_BASED) {
+			/*
+			 * Reload bioset because front_pad may have changed
+			 * because a different table was loaded.
+			 */
+			bioset_free(md->bs);
+			md->bs = p->bs;
+			p->bs = NULL;
+		} else if (dm_table_get_type(t) == DM_TYPE_REQUEST_BASED) {
+			BUG_ON(!md->tio_pool);
+			/*
+			 * No need to reload in case of request-based dm
+			 * because of fixed size front_pad.
+			 * Note for future: if you are to reload bioset,
+			 * prep-ed requests in queue may have reference
+			 * to bio from the old bioset.
+			 * So you must walk through the queue to unprep.
+			 */
+		}
 		goto out;
 	}
 
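
For illustration only, the effect of the change can be sketched as a small
userspace model. This is not kernel code: every type and function name in
it (model_bioset, model_md, model_request, bind_always_replace,
bind_patched, reload_leaves_stale_request) is hypothetical, and "freeing"
is modelled by setting a flag so that the check itself stays safe. It only
mirrors the idea that a prep-ed request keeps pointing at the bioset that
existed before the table reload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
enum model_type { MODEL_BIO_BASED, MODEL_REQUEST_BASED };

struct model_bioset {
	bool freed;	/* set instead of really freeing, so later checks stay safe */
};

struct model_md {
	struct model_bioset *bs;
};

/* A prep-ed request remembers the bioset its clone bios were allocated from. */
struct model_request {
	struct model_bioset *origin;
};

/* Before the patch: the bioset is swapped on every table reload. */
static void bind_always_replace(struct model_md *md, struct model_bioset *fresh)
{
	assert(md->bs != NULL);
	md->bs->freed = true;
	md->bs = fresh;
}

/* After the patch: swap only for bio-based tables. */
static void bind_patched(struct model_md *md, struct model_bioset *fresh,
			 enum model_type type)
{
	assert(md->bs != NULL);
	if (type == MODEL_BIO_BASED) {
		md->bs->freed = true;
		md->bs = fresh;
	}
	/* Request-based: front_pad is fixed, so the old bioset stays. */
}

/*
 * Prep one request, reload the table, then report whether completing
 * the request would touch a bioset that has been marked freed.
 */
static bool reload_leaves_stale_request(bool patched, enum model_type type)
{
	struct model_bioset old_bs = { false }, new_bs = { false };
	struct model_md md = { &old_bs };
	struct model_request rq = { md.bs };	/* prep-ed before the reload */

	if (patched)
		bind_patched(&md, &new_bs, type);
	else
		bind_always_replace(&md, &new_bs);

	return rq.origin->freed;
}
```

In this model, reload_leaves_stale_request(false, MODEL_REQUEST_BASED)
reports a stale reference, while the same call with patched set to true
does not. The bio-based case is left out of that comparison on purpose,
since the concern described above is about prep-ed requests sitting in a
request-based queue.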