From patchwork Sat Jul 30 15:52:23 2016
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 9253197
References: <622794958.9574724.1469674652262.JavaMail.zimbra@redhat.com>
 <1762637089.9575520.1469676013321.JavaMail.zimbra@redhat.com>
From: Dan Williams
Date: Sat, 30 Jul 2016 08:52:23 -0700
Subject: Re: [BUG] kernel NULL pointer dereference observed during pmem btt
 switch test
To: Yi Zhang
Cc: linux-nvdimm, linux-block@vger.kernel.org

On Thu, Jul 28, 2016 at 8:50 AM, Dan Williams wrote:
> [ adding linux-block ]
>
> On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang wrote:
>> Hello everyone
>>
>> Could you help check this issue, thanks.
>>
>> Steps I used:
>> 1. Reserve 4*8G of memory for pmem by add kernel parameter "memmap=8G!4G memmap=8G!12G memmap=8G!20G memmap=8G!28G"
>> 2. Execute below script
>> #!/bin/bash
>> pmem_btt_switch() {
>>     sector_size_list="512 520 528 4096 4104 4160 4224"
>>     for sector_size in $sector_size_list; do
>>         ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size
>>         ndctl create-namespace -f -e namespace${1}.0 --mode=raw
>>     done
>> }
>>
>> for i in 0 1 2 3; do
>>     pmem_btt_switch $i &
>> done
>
> Thanks for the report. This looks like del_gendisk() frees the
> previous usage of the devt before the bdi is unregistered. This
> appears to be a general problem with all block drivers, not just
> libnvdimm, since blk_cleanup_queue() is typically called after
> del_gendisk(). I.e. it will always be the case that the bdi
> registered with the devt allocated at add_disk() will still be alive
> when del_gendisk()->disk_release() frees the previous devt number.
>
> I *think* the path forward is to allow the bdi to hold a reference
> against the blk_alloc_devt() allocation until it is done with it. Any
> other ideas on fixing this object lifetime problem?

Does the attached patch solve this for you?

From 44bcbf8c531e9249d09e6bf502d3696668f3d22c Mon Sep 17 00:00:00 2001
From: Dan Williams
Date: Sat, 30 Jul 2016 08:23:06 -0700
Subject: [PATCH] block: fix bdi vs gendisk lifetime mismatch

The bdi for a gendisk is named after the gendisk's devt (major:minor).
However, since the gendisk is destroyed before the bdi, it leaves a
window where a new gendisk could dynamically reuse the same devt while
a bdi with the same name is still live. Arrange for the bdi to hold a
reference against its "owner" disk device while it is registered.
Otherwise we can hit sysfs duplicate name collisions like the
following:

 WARNING: CPU: 10 PID: 2078 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80
 sysfs: cannot create duplicate filename '/devices/virtual/bdi/259:1'

 Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
  0000000000000286 0000000002c04ad5 ffff88006f24f970 ffffffff8134caec
  ffff88006f24f9c0 0000000000000000 ffff88006f24f9b0 ffffffff8108c351
  0000001f0000000c ffff88105d236000 ffff88105d1031e0 ffff8800357427f8
 Call Trace:
  [] dump_stack+0x63/0x87
  [] __warn+0xd1/0xf0
  [] warn_slowpath_fmt+0x5f/0x80
  [] sysfs_warn_dup+0x64/0x80
  [] sysfs_create_dir_ns+0x7e/0x90
  [] kobject_add_internal+0xaa/0x320
  [] ? vsnprintf+0x34e/0x4d0
  [] kobject_add+0x75/0xd0
  [] ? mutex_lock+0x12/0x2f
  [] device_add+0x125/0x610
  [] device_create_groups_vargs+0xd8/0x100
  [] device_create_vargs+0x1c/0x20
  [] bdi_register+0x8c/0x180
  [] bdi_register_dev+0x27/0x30
  [] add_disk+0x175/0x4a0

Reported-by: Yi Zhang
Signed-off-by: Dan Williams
---
 block/genhd.c                    |  2 +-
 include/linux/backing-dev-defs.h |  1 +
 include/linux/backing-dev.h      |  1 +
 mm/backing-dev.c                 | 19 +++++++++++++++++++
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/block/genhd.c b/block/genhd.c
index 3c9dede4e04f..f6f7ffcd4eab 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -614,7 +614,7 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 
 	/* Register BDI before referencing it from bdev */
 	bdi = &disk->queue->backing_dev_info;
-	bdi_register_dev(bdi, disk_devt(disk));
+	bdi_register_owner(bdi, disk_to_dev(disk));
 
 	blk_register_region(disk_devt(disk), disk->minors, NULL,
 			    exact_match, exact_lock, disk);
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 3f103076d0bf..c357f27d5483 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -163,6 +163,7 @@ struct backing_dev_info {
 	wait_queue_head_t wb_waitq;
 
 	struct device *dev;
+	struct device *owner;
 
 	struct timer_list laptop_mode_wb_timer;
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 491a91717788..43b93a947e61 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -24,6 +24,7 @@ __printf(3, 4)
 int bdi_register(struct backing_dev_info *bdi, struct device *parent,
 		const char *fmt, ...);
 int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev);
+int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner);
 void bdi_unregister(struct backing_dev_info *bdi);
 
 int __must_check bdi_setup_and_register(struct backing_dev_info *, char *);
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index efe237742074..7b51cb7905be 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -825,6 +825,20 @@ int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev)
 }
 EXPORT_SYMBOL(bdi_register_dev);
 
+int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner)
+{
+	int rc;
+
+	rc = bdi_register(bdi, NULL, "%u:%u", MAJOR(owner->devt),
+			MINOR(owner->devt));
+	if (rc)
+		return rc;
+	bdi->owner = owner;
+	get_device(owner);
+	return 0;
+}
+EXPORT_SYMBOL(bdi_register_owner);
+
 /*
  * Remove bdi from bdi_list, and ensure that it is no longer visible
  */
@@ -849,6 +863,11 @@ void bdi_unregister(struct backing_dev_info *bdi)
 		device_unregister(bdi->dev);
 		bdi->dev = NULL;
 	}
+
+	if (bdi->owner) {
+		put_device(bdi->owner);
+		bdi->owner = NULL;
+	}
 }
 
 void bdi_exit(struct backing_dev_info *bdi)
-- 
2.5.5
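
For anyone following the lifetime argument without a test box, here is a
rough userspace model of the race and of what pinning the owner changes.
It is only a sketch under stated assumptions: the model_disk / model_bdi
types, the two bool tables and simulate_reuse() are hypothetical stand-ins
invented for illustration, not kernel APIs, and the "lowest free minor"
allocator is a simplification of how devt numbers get handed out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_MINORS 4

/* Hypothetical stand-ins for kernel state (illustration only). */
static bool minor_in_use[MAX_MINORS];  /* models the devt allocator */
static bool bdi_name_live[MAX_MINORS]; /* models the sysfs bdi/<major>:<minor> names */

struct model_disk {
	int minor;
	int refs;                 /* models the disk device refcount */
};

struct model_bdi {
	int name;                 /* minor the bdi was named after */
	struct model_disk *owner; /* pinned disk, or NULL when not pinned */
};

/* Hand out the lowest free minor, or -1 if none is left. */
static int alloc_minor(void)
{
	for (int i = 0; i < MAX_MINORS; i++) {
		if (!minor_in_use[i]) {
			minor_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Drop one reference; the last put releases the minor for reuse. */
static void disk_put(struct model_disk *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		minor_in_use[d->minor] = false;
}

/* Returns 0 on success, -1 if a bdi with the same name is still live. */
static int bdi_register(struct model_bdi *bdi, struct model_disk *d,
			bool pin_owner)
{
	if (bdi_name_live[d->minor])
		return -1;
	bdi_name_live[d->minor] = true;
	bdi->name = d->minor;
	bdi->owner = NULL;
	if (pin_owner) {
		bdi->owner = d;
		d->refs++;        /* plays the role of get_device(owner) */
	}
	return 0;
}

static void bdi_unregister(struct model_bdi *bdi)
{
	bdi_name_live[bdi->name] = false;
	if (bdi->owner) {
		disk_put(bdi->owner); /* plays the role of put_device(owner) */
		bdi->owner = NULL;
	}
}

/*
 * Tear down disk A before its bdi, then bring up disk B.
 * Returns 1 if B's bdi registration hits a duplicate name, else 0.
 */
int simulate_reuse(bool pin_owner)
{
	struct model_disk a, b;
	struct model_bdi bdi_a, bdi_b;
	int collided;

	memset(minor_in_use, 0, sizeof(minor_in_use));
	memset(bdi_name_live, 0, sizeof(bdi_name_live));

	a.minor = alloc_minor();
	a.refs = 1;
	bdi_register(&bdi_a, &a, pin_owner);

	disk_put(&a);             /* disk goes away first, bdi still registered */

	b.minor = alloc_minor();  /* may or may not get A's old minor back */
	b.refs = 1;
	collided = bdi_register(&bdi_b, &b, pin_owner) != 0;

	bdi_unregister(&bdi_a);   /* queue cleanup happens late */
	if (!collided)
		bdi_unregister(&bdi_b);
	return collided;
}
```

In this model simulate_reuse(false) returns 1, because disk B is handed
A's minor while A's bdi name is still live, and simulate_reuse(true)
returns 0, because the pinned owner keeps the minor allocated until
bdi_unregister() drops the reference.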