From patchwork Wed Apr 29 07:46:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11516325 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9726F92C for ; Wed, 29 Apr 2020 07:46:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 536812082E for ; Wed, 29 Apr 2020 07:46:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 536812082E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 843AC8E0006; Wed, 29 Apr 2020 03:46:33 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 81A438E0001; Wed, 29 Apr 2020 03:46:33 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 708A68E0006; Wed, 29 Apr 2020 03:46:33 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0037.hostedemail.com [216.40.44.37]) by kanga.kvack.org (Postfix) with ESMTP id 592088E0001 for ; Wed, 29 Apr 2020 03:46:33 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 205BC824805A for ; Wed, 29 Apr 2020 07:46:33 +0000 (UTC) X-FDA: 76760110266.16.roll90_3928969020c35 X-Spam-Summary: 10,1,0,af03b606133a0636,d41d8cd98f00b204,mcgrof@gmail.com,,RULES_HIT:1:41:69:355:379:541:800:960:966:968:973:982:988:989:1260:1311:1314:1345:1359:1437:1515:1605:1730:1747:1777:1792:2194:2196:2198:2199:2200:2201:2393:2553:2559:2562:2637:2689:2693:2892:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:4031:4321:4385:4605:5007:6261:7875:7903:8604:8660:9108:9389:9393:9592:10007:11026:11232:11473:11658:11914:12043:12048:12291:12296:12297:12438:12517:12519:12555:12679:12683:12895:12986:13148:13149:13161:13229:13230:13868:13894:13972:14394:21063:21080:21212:21324:21433:21444:21451:21627:21740:21972:21987:21990:30012:30054:30064:30075:30090:30091,0,RBL:209.85.214.196:@gmail.com:.lbl8.mailshell.net-62.50.0.100 66.100.201.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:1:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: roll90_3928969020c35 X-Filterd-Recvd-Size: 14420 Received: from mail-pl1-f196.google.com (mail-pl1-f196.google.com [209.85.214.196]) by imf08.hostedemail.com (Postfix) with ESMTP for ; Wed, 29 Apr 2020 07:46:32 +0000 (UTC) Received: by mail-pl1-f196.google.com with SMTP id u22so534312plq.12 for ; Wed, 29 Apr 2020 00:46:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=A5vpTVrx10Rf7A/Z6Yy5oBCSjI4ZVpUsuug2LyGGOvQ=; b=kL894ofyJkhVLKDzqhKSPlkUnHjHhXh0xwWL+/yzSRM2Z3kfP4oQeqULd89YqUrnWi BkDOlJvdNA3Kr+GVbMZZ9sveE75adw75H331Nj9EBgP0uuD9BgXXQMXMapDcL8OSVFpL QsoWwBqINlUWzydWu60hqDVzB2ACx1aR4gVK08tJtxB1ylPOKcDyh8QkuM/+x1pbYaxo bpDWrUspQM5F9bivVPorL8PzadhoSCGomHtCzcYAN4PuZjhnnUN6TrFDBs5bjIKKgq+S 2/Mml4TzMVgsXbP+XOUX0g1KBijTEEAG7y9U7AiZX5VuMiEPNDCtmF2Y1qmHkFfqeiYE aNfQ== X-Gm-Message-State: AGi0PuaDEGHJgbzITF4bAfq6u0QszH44Uz2Y+xX4JbCUnfXuRin6tJju 4I5KP0SMLtHJX7dl+jDAXdg= X-Google-Smtp-Source: APiQypLH/HANiOD/dQwddQ11EuhweU5P2arbITUF+MzuZpr33tqxR0UJYIvjyFj1xf+88z3YPmnS8g== X-Received: by 2002:a17:90a:fa81:: with SMTP id cu1mr1649338pjb.25.1588146391677; Wed, 29 Apr 2020 00:46:31 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. [157.230.128.187]) by smtp.gmail.com with ESMTPSA id u188sm451446pfu.33.2020.04.29.00.46.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2020 00:46:30 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id C52CF40858; Wed, 29 Apr 2020 07:46:29 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, bvanassche@acm.org, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de, akpm@linux-foundation.org Cc: mhocko@suse.com, yukuai3@huawei.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Luis Chamberlain , Omar Sandoval , Hannes Reinecke , Michal Hocko Subject: [PATCH v3 1/6] block: revert back to synchronous request_queue removal Date: Wed, 29 Apr 2020 07:46:22 +0000 Message-Id: <20200429074627.5955-2-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200429074627.5955-1-mcgrof@kernel.org> References: <20200429074627.5955-1-mcgrof@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Commit dc9edc44de6c ("block: Fix a blk_exit_rl() regression") merged on v4.12 moved the work behind blk_release_queue() into a workqueue after a splat floated around which indicated some work on blk_release_queue() could sleep in blk_exit_rl(). This splat would be possible when a driver called blk_put_queue() or blk_cleanup_queue() (which calls blk_put_queue() as its final call) from an atomic context. blk_put_queue() decrements the refcount for the request_queue kobject, and upon reaching 0 blk_release_queue() is called. Although blk_exit_rl() is now removed through commit db6d9952356 ("block: remove request_list code") on v5.0, we reserve the right to be able to sleep within blk_release_queue() context. The last reference for the request_queue must not be called from atomic conext. *When* the last reference to the request_queue reaches 0 varies, and so let's take the opportunity to document when that is expected to happen and also document the context of the related calls as best as possible so we can avoid future issues, and with the hopes that the synchronous request_queue removal sticks. We revert back to synchronous request_queue removal because asynchronous removal creates a regression with expected userspace interaction with several drivers. An example is when removing the loopback driver, one uses ioctls from userspace to do so, but upon return and if successful, one expects the device to be removed. Likewise if one races to add another device the new one may not be added as it is still being removed. This was expected behaviour before and it now fails as the device is still present and busy still. Moving to asynchronous request_queue removal could have broken many scripts which relied on the removal to have been completed if there was no error. Document this expectation as well so that this doesn't regress userspace again. Using asynchronous request_queue removal however has helped us find other bugs. In the future we can test what could break with this arrangement by enabling CONFIG_DEBUG_KOBJECT_RELEASE. Cc: Bart Van Assche Cc: Omar Sandoval Cc: Hannes Reinecke Cc: Nicolai Stange Cc: Greg Kroah-Hartman Cc: Michal Hocko Cc: yu kuai Suggested-by: Nicolai Stange Fixes: dc9edc44de6c ("block: Fix a blk_exit_rl() regression") Signed-off-by: Luis Chamberlain Reviewed-by: Christoph Hellwig --- block/blk-core.c | 23 +++++++++++++ block/blk-sysfs.c | 43 +++++++++++++------------ block/genhd.c | 73 +++++++++++++++++++++++++++++++++++++++++- include/linux/blkdev.h | 2 -- 4 files changed, 117 insertions(+), 24 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 0641c2916d7e..8a27c772982e 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -306,6 +306,16 @@ void blk_clear_pm_only(struct request_queue *q) } EXPORT_SYMBOL_GPL(blk_clear_pm_only); +/** + * blk_put_queue - decrement the request_queue refcount + * @q: the request_queue structure to decrement the refcount for + * + * Decrements the refcount to the request_queue kobject. When this reaches 0 + * we'll have blk_release_queue() called. + * + * Context: Any context, but the last reference must not be dropped from + * atomic context. + */ void blk_put_queue(struct request_queue *q) { kobject_put(&q->kobj); @@ -337,9 +347,14 @@ EXPORT_SYMBOL_GPL(blk_set_queue_dying); * * Mark @q DYING, drain all pending requests, mark @q DEAD, destroy and * put it. All future requests will be failed immediately with -ENODEV. + * + * Context: can sleep */ void blk_cleanup_queue(struct request_queue *q) { + /* cannot be called from atomic context */ + might_sleep(); + WARN_ON_ONCE(blk_queue_registered(q)); /* mark @q DYING, no new request or merges will be allowed afterwards */ @@ -567,6 +582,14 @@ struct request_queue *blk_alloc_queue(make_request_fn make_request, int node_id) } EXPORT_SYMBOL(blk_alloc_queue); +/** + * blk_get_queue - increment the request_queue refcount + * @q: the request_queue structure to incremenet the refcount for + * + * Increment the refcount to the request_queue kobject. + * + * Context: Any context. + */ bool blk_get_queue(struct request_queue *q) { if (likely(!blk_queue_dying(q))) { diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index fca9b158f4a0..eda8c4985511 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -860,22 +860,32 @@ static void blk_exit_queue(struct request_queue *q) bdi_put(q->backing_dev_info); } - /** - * __blk_release_queue - release a request queue - * @work: pointer to the release_work member of the request queue to be released + * blk_release_queue - releases all allocated resources of the request_queue + * @kobj: pointer to a kobject, who's container is a request_queue + * + * This function releases all allocated resources of the request queue. + * + * The struct request_queue refcount is incremented with blk_get_queue() and + * decremented with blk_put_queue(). Once the refcount reaches 0 this function + * is called. + * + * For drivers that have a request_queue on a gendisk and added with + * __device_add_disk() the refcount to request_queue will reach 0 with + * the last put_disk() called by the driver. For drivers which don't use + * __device_add_disk() this happens with blk_cleanup_queue(). * - * Description: - * This function is called when a block device is being unregistered. The - * process of releasing a request queue starts with blk_cleanup_queue, which - * set the appropriate flags and then calls blk_put_queue, that decrements - * the reference counter of the request queue. Once the reference counter - * of the request queue reaches zero, blk_release_queue is called to release - * all allocated resources of the request queue. + * Drivers exist which depend on the release of the request_queue to be + * synchronous, it should not be deferred. + * + * Context: can sleep */ -static void __blk_release_queue(struct work_struct *work) +static void blk_release_queue(struct kobject *kobj) { - struct request_queue *q = container_of(work, typeof(*q), release_work); + struct request_queue *q = + container_of(kobj, struct request_queue, kobj); + + might_sleep(); if (test_bit(QUEUE_FLAG_POLL_STATS, &q->queue_flags)) blk_stat_remove_callback(q, q->poll_cb); @@ -904,15 +914,6 @@ static void __blk_release_queue(struct work_struct *work) call_rcu(&q->rcu_head, blk_free_queue_rcu); } -static void blk_release_queue(struct kobject *kobj) -{ - struct request_queue *q = - container_of(kobj, struct request_queue, kobj); - - INIT_WORK(&q->release_work, __blk_release_queue); - schedule_work(&q->release_work); -} - static const struct sysfs_ops queue_sysfs_ops = { .show = queue_attr_show, .store = queue_attr_store, diff --git a/block/genhd.c b/block/genhd.c index c05d509877fa..a933cffbee2e 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -897,11 +897,32 @@ static void invalidate_partition(struct gendisk *disk, int partno) bdput(bdev); } +/** + * del_gendisk - remove the gendisk + * @disk: the struct gendisk to remove + * + * Removes the gendisk and all its associated resources. This deletes the + * partitions associated with the gendisk, and unregisters the associated + * request_queue. + * + * This is the counter to the respective __device_add_disk() call. + * + * The final removal of the struct gendisk happens when its refcount reaches 0 + * with put_disk(), which should be called after del_gendisk(), if + * __device_add_disk() was used. + * + * Drivers exist which depend on the release of the gendisk to be synchronous, + * it should not be deferred. + * + * Context: can sleep + */ void del_gendisk(struct gendisk *disk) { struct disk_part_iter piter; struct hd_struct *part; + might_sleep(); + blk_integrity_del(disk); disk_del_events(disk); @@ -992,11 +1013,15 @@ static ssize_t disk_badblocks_store(struct device *dev, * * This function gets the structure containing partitioning * information for the given device @devt. + * + * Context: can sleep */ struct gendisk *get_gendisk(dev_t devt, int *partno) { struct gendisk *disk = NULL; + might_sleep(); + if (MAJOR(devt) != BLOCK_EXT_MAJOR) { struct kobject *kobj; @@ -1528,10 +1553,31 @@ int disk_expand_part_tbl(struct gendisk *disk, int partno) return 0; } +/** + * disk_release - releases all allocated resources of the gendisk + * @dev: the device representing this disk + * + * This function releases all allocated resources of the gendisk. + * + * The struct gendisk refcounted is incremeneted with get_gendisk() or + * get_disk_and_module(), and its refcount is decremented with + * put_disk_and_module() or put_disk(). Once the refcount reaches 0 this + * function is called. + * + * Drivers which used __device_add_disk() have a gendisk with a request_queue + * assigned. Since the request_queue sits on top of the gendisk for these + * drivers we also call blk_put_queue() for them, and we expect the + * request_queue refcount to reach 0 at this point, and so the request_queue + * will also be freed prior to the disk. + * + * Context: can sleep + */ static void disk_release(struct device *dev) { struct gendisk *disk = dev_to_disk(dev); + might_sleep(); + blk_free_devt(dev->devt); disk_release_events(disk); kfree(disk->random); @@ -1737,6 +1783,15 @@ struct gendisk *__alloc_disk_node(int minors, int node_id) } EXPORT_SYMBOL(__alloc_disk_node); +/** + * get_disk_and_module - increments the gendisk and gendisk fops module refcount + * @disk: the struct gendisk to to increment the refcount for + * + * This increments the refcount for the struct gendisk, and the gendisk's + * fops module owner. + * + * Context: Any context. + */ struct kobject *get_disk_and_module(struct gendisk *disk) { struct module *owner; @@ -1757,6 +1812,16 @@ struct kobject *get_disk_and_module(struct gendisk *disk) } EXPORT_SYMBOL(get_disk_and_module); +/** + * put_disk - decrements the gendisk refcount + * @disk: the struct gendisk to to decrement the refcount for + * + * This decrements the refcount for the struct gendisk. When this reaches 0 + * we'll have disk_release() called. + * + * Context: Any context, but the last reference must not be dropped from + * atomic context. + */ void put_disk(struct gendisk *disk) { if (disk) @@ -1764,9 +1829,15 @@ void put_disk(struct gendisk *disk) } EXPORT_SYMBOL(put_disk); -/* +/** + * put_disk_and_module - decrements the module and gendisk refcount + * @disk: the struct gendisk to to decrement the refcount for + * * This is a counterpart of get_disk_and_module() and thus also of * get_gendisk(). + * + * Context: Any context, but the last reference must not be dropped from + * atomic context. */ void put_disk_and_module(struct gendisk *disk) { diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index f00bd4042295..3122a93c7277 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -571,8 +571,6 @@ struct request_queue { size_t cmd_size; - struct work_struct release_work; - #define BLK_MAX_WRITE_HINTS 5 u64 write_hints[BLK_MAX_WRITE_HINTS]; }; From patchwork Wed Apr 29 07:46:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11516331 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8696392C for ; Wed, 29 Apr 2020 07:46:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 53FC92076B for ; Wed, 29 Apr 2020 07:46:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 53FC92076B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 955728E0008; Wed, 29 Apr 2020 03:46:35 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 92EE88E0001; Wed, 29 Apr 2020 03:46:35 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A67D8E0008; Wed, 29 Apr 2020 03:46:35 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0211.hostedemail.com [216.40.44.211]) by kanga.kvack.org (Postfix) with ESMTP id 6365F8E0001 for ; Wed, 29 Apr 2020 03:46:35 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 28F3B180AD80F for ; Wed, 29 Apr 2020 07:46:35 +0000 (UTC) X-FDA: 76760110350.19.pen77_3972622ce6f41 X-Spam-Summary: 2,0,0,4f4823b881da7b24,d41d8cd98f00b204,mcgrof@gmail.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1311:1314:1345:1359:1437:1515:1535:1543:1711:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3165:3354:3867:3868:3872:4321:4605:5007:6261:9592:10004:11026:11232:11473:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12895:12986:13894:13972:14093:14096:14181:14394:14721:21080:21222:21444:21451:21627:21990:30012:30054:30064,0,RBL:209.85.216.66:@gmail.com:.lbl8.mailshell.net-62.50.0.100 66.100.201.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: pen77_3972622ce6f41 X-Filterd-Recvd-Size: 5990 Received: from mail-pj1-f66.google.com (mail-pj1-f66.google.com [209.85.216.66]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 29 Apr 2020 07:46:34 +0000 (UTC) Received: by mail-pj1-f66.google.com with SMTP id a7so433921pju.2 for ; Wed, 29 Apr 2020 00:46:34 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=m8dSq6ES2jPXw5+HnmMcnSKKoOvoW5PINzIJYL/yYY4=; b=pu/2dnF37VXWMjvVVvy+xYb7R3qOBVsCyWOmGbkM0VsOqSm8VWEcpaiKkSw1A8vtsP 0yb9Esd/cwpmOtOP9Tvop1Bz29j4vSB0OaTRjH+b8e8LKcFRxpPnFJG+qOJsbjTVUXtA Oh5JjEN2R7281Iqil5EkiuezF61loc8ri4ymx6RoW8/LZ+UchcyLYmWdFiE6ICdztAKc pk/7L+40Pl+aWPwd85Llos/btBdNFGBorIxNcdp7lZ8Db8YI1Hf0F+DZSlaKZV26fb5Y ETQVgiI16vC2RirRWbiYGtOeHT/FijY8IGwzVx+7njjFX/CJEx1HTcV27V1q40Urm5Rr Ve6g== X-Gm-Message-State: AGi0PubgF32xuGUAP5WpZ3P1OUDQCE5uPx4R5cnTqjaP0Rmpj3s7zRH/ cLHR/ZssMHNN49Ch7e31pPY= X-Google-Smtp-Source: APiQypKnDao+pl7grXlaGqzpC6Us6sQ/cc5Bx3+Kf4IjSfI/lLruTTPQF6zXoAlX0W6PyZPP+Le1Kg== X-Received: by 2002:a17:902:b711:: with SMTP id d17mr32928427pls.333.1588146393824; Wed, 29 Apr 2020 00:46:33 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. [157.230.128.187]) by smtp.gmail.com with ESMTPSA id c84sm427219pfb.153.2020.04.29.00.46.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2020 00:46:30 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id D74F941381; Wed, 29 Apr 2020 07:46:29 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, bvanassche@acm.org, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de, akpm@linux-foundation.org Cc: mhocko@suse.com, yukuai3@huawei.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Luis Chamberlain , Omar Sandoval , Hannes Reinecke , Michal Hocko Subject: [PATCH v3 2/6] block: move main block debugfs initialization to its own file Date: Wed, 29 Apr 2020 07:46:23 +0000 Message-Id: <20200429074627.5955-3-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200429074627.5955-1-mcgrof@kernel.org> References: <20200429074627.5955-1-mcgrof@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: make_request-based drivers and and request-based drivers share some debugfs code. By moving this into its own file it makes it easier to expand and audit this shared code. This patch contains no functional changes. Cc: Bart Van Assche Cc: Omar Sandoval Cc: Hannes Reinecke Cc: Nicolai Stange Cc: Greg Kroah-Hartman Cc: Michal Hocko Cc: yu kuai Reviewed-by: Greg Kroah-Hartman Reviewed-by: Bart Van Assche Signed-off-by: Luis Chamberlain Reviewed-by: Christoph Hellwig --- block/Makefile | 1 + block/blk-core.c | 9 +-------- block/blk-debugfs.c | 15 +++++++++++++++ block/blk.h | 7 +++++++ 4 files changed, 24 insertions(+), 8 deletions(-) create mode 100644 block/blk-debugfs.c diff --git a/block/Makefile b/block/Makefile index 206b96e9387f..1d3ab20505d8 100644 --- a/block/Makefile +++ b/block/Makefile @@ -10,6 +10,7 @@ obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-sysfs.o \ blk-mq-sysfs.o blk-mq-cpumap.o blk-mq-sched.o ioctl.o \ genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o +obj-$(CONFIG_DEBUG_FS) += blk-debugfs.o obj-$(CONFIG_BOUNCE) += bounce.o obj-$(CONFIG_BLK_SCSI_REQUEST) += scsi_ioctl.o obj-$(CONFIG_BLK_DEV_BSG) += bsg.o diff --git a/block/blk-core.c b/block/blk-core.c index 8a27c772982e..4b26f686e249 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -49,10 +49,6 @@ #include "blk-pm.h" #include "blk-rq-qos.h" -#ifdef CONFIG_DEBUG_FS -struct dentry *blk_debugfs_root; -#endif - EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap); EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap); EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete); @@ -1828,10 +1824,7 @@ int __init blk_dev_init(void) blk_requestq_cachep = kmem_cache_create("request_queue", sizeof(struct request_queue), 0, SLAB_PANIC, NULL); - -#ifdef CONFIG_DEBUG_FS - blk_debugfs_root = debugfs_create_dir("block", NULL); -#endif + blk_debugfs_register(); return 0; } diff --git a/block/blk-debugfs.c b/block/blk-debugfs.c new file mode 100644 index 000000000000..19091e1effc0 --- /dev/null +++ b/block/blk-debugfs.c @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Shared request-based / make_request-based functionality + */ +#include +#include +#include + +struct dentry *blk_debugfs_root; + +void blk_debugfs_register(void) +{ + blk_debugfs_root = debugfs_create_dir("block", NULL); +} diff --git a/block/blk.h b/block/blk.h index 73bd3b1c6938..ec16e8a6049e 100644 --- a/block/blk.h +++ b/block/blk.h @@ -456,5 +456,12 @@ struct request_queue *__blk_alloc_queue(int node_id); int __bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page *page, unsigned int len, unsigned int offset, bool *same_page); +#ifdef CONFIG_DEBUG_FS +void blk_debugfs_register(void); +#else +static inline void blk_debugfs_register(void) +{ +} +#endif /* CONFIG_DEBUG_FS */ #endif /* BLK_INTERNAL_H */ From patchwork Wed Apr 29 07:46:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11516333 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D107092C for ; Wed, 29 Apr 2020 07:46:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9E4B32076B for ; Wed, 29 Apr 2020 07:46:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9E4B32076B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 4E90E8E0009; Wed, 29 Apr 2020 03:46:36 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 498538E0001; Wed, 29 Apr 2020 03:46:36 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 361C28E0009; Wed, 29 Apr 2020 03:46:36 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0170.hostedemail.com [216.40.44.170]) by kanga.kvack.org (Postfix) with ESMTP id 1F8CA8E0001 for ; Wed, 29 Apr 2020 03:46:36 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id DD9FA40CD for ; Wed, 29 Apr 2020 07:46:35 +0000 (UTC) X-FDA: 76760110350.29.wax99_399431b460918 X-Spam-Summary: 2,0,0,9e5cf36d6b93dd77,d41d8cd98f00b204,mcgrof@gmail.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1311:1314:1345:1359:1437:1515:1534:1541:1711:1730:1747:1777:1792:2393:2559:2562:2693:3138:3139:3140:3141:3142:3352:3865:3867:3872:5007:6261:10004:11026:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12895:13069:13141:13230:13311:13357:13894:14096:14181:14384:14394:14721:21080:21444:21627:21740:21990:30029:30054,0,RBL:209.85.216.67:@gmail.com:.lbl8.mailshell.net-62.50.0.100 66.100.201.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: wax99_399431b460918 X-Filterd-Recvd-Size: 3982 Received: from mail-pj1-f67.google.com (mail-pj1-f67.google.com [209.85.216.67]) by imf27.hostedemail.com (Postfix) with ESMTP for ; Wed, 29 Apr 2020 07:46:35 +0000 (UTC) Received: by mail-pj1-f67.google.com with SMTP id a7so433937pju.2 for ; Wed, 29 Apr 2020 00:46:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=M3p9eZkDL4rEHA+mf+DziMPxePr/ZoUmDl5szec5+p8=; b=C1VFyBR7EAx0CubyHzOPOFEcIFcPP0JZ4UjXUliI+9Zr3/WCqWb6w97tucjEVFtn8s ozJ/1NiPe65nKL6OpkoMDAu59UbEdk0Qf0NidXvlI3AmJjZKwd84wprYsAoI9rSoAeXG yhLp/R33M3lxz9pOWinzQZPkeF55HevCqyPOlJQCAhR+9N1Mp9FbH8hiJEi0CWHypNuG TOdpU49ZUBdCthUvEySrVhuromXUzM2fIjr9xUiIP43YUCsEVs8uIfoM76ZrpCNXdOrH 54o0t6BIidu7BoMi8scE1toqiiOoIbLCu6CouWrKzsv09bmfsuM8DVVGjIH1JZV7M1kv 9qdw== X-Gm-Message-State: AGi0PuZgVGr+ILK/7+GWA3qI36dCI4Dicj4Lljwk4PZz9KNC6jHigIOh hcPxy5mUzpUTirGzQgmIx8M= X-Google-Smtp-Source: APiQypIepM7UkpDZsnPZzlZJGdqb5bC+BXdvy5fR4OB7A7AnaI0/jrmsFgVoflR1BJYu333zB+xnmQ== X-Received: by 2002:a17:90a:fa8d:: with SMTP id cu13mr1631966pjb.27.1588146394821; Wed, 29 Apr 2020 00:46:34 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. [157.230.128.187]) by smtp.gmail.com with ESMTPSA id r31sm387473pgl.86.2020.04.29.00.46.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2020 00:46:30 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id E693841C23; Wed, 29 Apr 2020 07:46:29 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, bvanassche@acm.org, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de, akpm@linux-foundation.org Cc: mhocko@suse.com, yukuai3@huawei.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Luis Chamberlain Subject: [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function Date: Wed, 29 Apr 2020 07:46:24 +0000 Message-Id: <20200429074627.5955-4-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200429074627.5955-1-mcgrof@kernel.org> References: <20200429074627.5955-1-mcgrof@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Move the work to create the debugfs directory used into a helper. It will make further checks easier to read. This commit introduces no functional changes. Signed-off-by: Luis Chamberlain Reviewed-by: Christoph Hellwig Reviewed-by: Bart Van Assche --- kernel/trace/blktrace.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index ca39dc3230cb..2c6e6c386ace 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -468,6 +468,18 @@ static void blk_trace_setup_lba(struct blk_trace *bt, } } +static struct dentry *blk_trace_debugfs_dir(struct blk_user_trace_setup *buts, + struct blk_trace *bt) +{ + struct dentry *dir = NULL; + + dir = debugfs_lookup(buts->name, blk_debugfs_root); + if (!dir) + bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root); + + return dir; +} + /* * Setup everything required to start tracing */ @@ -509,9 +521,7 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, ret = -ENOENT; - dir = debugfs_lookup(buts->name, blk_debugfs_root); - if (!dir) - bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root); + dir = blk_trace_debugfs_dir(buts, bt); bt->dev = dev; atomic_set(&bt->dropped, 0); From patchwork Wed Apr 29 07:46:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11516337 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AC6AA15AB for ; Wed, 29 Apr 2020 07:46:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5D0D520787 for ; Wed, 29 Apr 2020 07:46:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5D0D520787 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E84CC8E000A; Wed, 29 Apr 2020 03:46:37 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E61488E0001; Wed, 29 Apr 2020 03:46:37 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C3BB58E000A; Wed, 29 Apr 2020 03:46:37 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0046.hostedemail.com [216.40.44.46]) by kanga.kvack.org (Postfix) with ESMTP id A1B048E0001 for ; Wed, 29 Apr 2020 03:46:37 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5A47A181AC217 for ; Wed, 29 Apr 2020 07:46:37 +0000 (UTC) X-FDA: 76760110434.18.vase43_39c1aa14ebd0b X-Spam-Summary: 2,0,0,641aab0bbc59c6ac,d41d8cd98f00b204,mcgrof@gmail.com,,RULES_HIT:41:69:327:355:379:541:800:960:965:966:968:973:982:988:989:1260:1311:1314:1345:1359:1431:1437:1515:1605:1730:1747:1777:1792:1981:2194:2196:2198:2199:2200:2201:2393:2538:2559:2562:2693:2901:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3873:3874:4250:4321:4385:4390:4395:4605:5007:6119:6238:6261:6742:7208:7903:7904:8660:9040:9121:9163:9165:9389:9592:10004:10226:10954:11026:11233:11473:11657:11658:11914:12043:12048:12291:12296:12297:12438:12517:12519:12555:12679:12683:12740:12895:12986:13148:13161:13221:13229:13230:13870:13894:13972:14096:14394:21063:21080:21444:21451:21611:21622:21740:21987:21990:30003:30012:30029:30054:30056:30064:30067:30070,0,RBL:209.85.215.194:@gmail.com:.lbl8.mailshell.net-66.100.201.100 62.18.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: vase43_39c1aa14ebd0b X-Filterd-Recvd-Size: 20992 Received: from mail-pg1-f194.google.com (mail-pg1-f194.google.com [209.85.215.194]) by imf45.hostedemail.com (Postfix) with ESMTP for ; Wed, 29 Apr 2020 07:46:36 +0000 (UTC) Received: by mail-pg1-f194.google.com with SMTP id t11so653996pgg.2 for ; Wed, 29 Apr 2020 00:46:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Y6alevxVbq4lT5G/F/xLcZsrGrJm8e/CjPPT54yI/XQ=; b=BOn0poR4R8yycOMRXl2Wgz6LaBcTiGURZFDdbcPcss0AKZniupwPfzESWtBPOLnbEW A5SKju6s9R6tYT2WdC3+PKaHA6ccPAcAQukS6X6iiRQeYvh6vc4/68TchUhbbMHNAYvC wT9zagc25DutWgfBr3+uwjUKEK1RlmPT5/fN0PqyJuAgzzM+xPW+1u1uyxnrajHDp1Ta Otl3Ry6k8vVfqIgkKUOMArlbmQNX/NXZjK9iKSQAeOb4ho3QJ8kViV33HSJ7LuUaoqGn n+pyKi150vjusWh+IXWZ2GNhBIpDjEMzmkEdY1ThhQwdo82JDzgtZwXS/I479LiFj5TZ xk4Q== X-Gm-Message-State: AGi0PubW4K+69hX91hf+C2Ym4pRf0AERmB4NaWQrGBCo1C1FbiQRaI5I FL49zIFHNUkI0/6mvqlRI20= X-Google-Smtp-Source: APiQypKr6xJtfpSJI2VnSdsU2hz5ISEG+u5kClhHNJjepktCchGSt7BCa4KWo/4CCywSd17NbopH0A== X-Received: by 2002:aa7:92cc:: with SMTP id k12mr30378793pfa.184.1588146395696; Wed, 29 Apr 2020 00:46:35 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. [157.230.128.187]) by smtp.gmail.com with ESMTPSA id a200sm390400pfa.201.2020.04.29.00.46.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2020 00:46:30 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id 050EF41C6A; Wed, 29 Apr 2020 07:46:30 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, bvanassche@acm.org, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de, akpm@linux-foundation.org Cc: mhocko@suse.com, yukuai3@huawei.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Luis Chamberlain , Omar Sandoval , Hannes Reinecke , Michal Hocko , syzbot+603294af2d01acfdd6da@syzkaller.appspotmail.com Subject: [PATCH v3 4/6] blktrace: fix debugfs use after free Date: Wed, 29 Apr 2020 07:46:25 +0000 Message-Id: <20200429074627.5955-5-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200429074627.5955-1-mcgrof@kernel.org> References: <20200429074627.5955-1-mcgrof@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On commit 6ac93117ab00 ("blktrace: use existing disk debugfs directory") merged on v4.12 Omar fixed the original blktrace code for request-based drivers (multiqueue). This however left in place a possible crash, if you happen to abuse blktrace while racing to remove / add a device. We used to use asynchronous removal of the request_queue, and with that the issue was easier to reproduce. Now that we have reverted to synchronous removal of the request_queue, the issue is still possible to reproduce, its however just a bit more difficult. We essentially run two instances of break-blktrace which add/remove a loop device, and setup a blktrace and just never tear the blktrace down. We do this twice in parallel. This is easily reproduced with the break-blktrace run_0004.sh script. We can end up with two types of panics each reflecting where we race, one a failed blktrace setup: [ 252.426751] debugfs: Directory 'loop0' with parent 'block' already present! [ 252.432265] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 252.436592] #PF: supervisor write access in kernel mode [ 252.439822] #PF: error_code(0x0002) - not-present page [ 252.442967] PGD 0 P4D 0 [ 252.444656] Oops: 0002 [#1] SMP NOPTI [ 252.446972] CPU: 10 PID: 1153 Comm: break-blktrace Tainted: G E 5.7.0-rc2-next-20200420+ #164 [ 252.452673] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 [ 252.456343] RIP: 0010:down_write+0x15/0x40 [ 252.458146] Code: eb ca e8 ae 22 8d ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00 00 00 48 0f b1 55 00 75 0f 48 8b 04 25 c0 8b 01 00 48 89 45 08 5d [ 252.463638] RSP: 0018:ffffa626415abcc8 EFLAGS: 00010246 [ 252.464950] RAX: 0000000000000000 RBX: ffff958c25f0f5c0 RCX: ffffff8100000000 [ 252.466727] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0 [ 252.468482] RBP: 00000000000000a0 R08: 0000000000000000 R09: 0000000000000001 [ 252.470014] R10: 0000000000000000 R11: ffff958d1f9227ff R12: 0000000000000000 [ 252.471473] R13: ffff958c25ea5380 R14: ffffffff8cce15f1 R15: 00000000000000a0 [ 252.473346] FS: 00007f2e69dee540(0000) GS:ffff958c2fc80000(0000) knlGS:0000000000000000 [ 252.475225] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 252.476267] CR2: 00000000000000a0 CR3: 0000000427d10004 CR4: 0000000000360ee0 [ 252.477526] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 252.478776] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 252.479866] Call Trace: [ 252.480322] simple_recursive_removal+0x4e/0x2e0 [ 252.481078] ? debugfs_remove+0x60/0x60 [ 252.481725] ? relay_destroy_buf+0x77/0xb0 [ 252.482662] debugfs_remove+0x40/0x60 [ 252.483518] blk_remove_buf_file_callback+0x5/0x10 [ 252.484328] relay_close_buf+0x2e/0x60 [ 252.484930] relay_open+0x1ce/0x2c0 [ 252.485520] do_blk_trace_setup+0x14f/0x2b0 [ 252.486187] __blk_trace_setup+0x54/0xb0 [ 252.486803] blk_trace_ioctl+0x90/0x140 [ 252.487423] ? do_sys_openat2+0x1ab/0x2d0 [ 252.488053] blkdev_ioctl+0x4d/0x260 [ 252.488636] block_ioctl+0x39/0x40 [ 252.489139] ksys_ioctl+0x87/0xc0 [ 252.489675] __x64_sys_ioctl+0x16/0x20 [ 252.490380] do_syscall_64+0x52/0x180 [ 252.491032] entry_SYSCALL_64_after_hwframe+0x44/0xa9 And the other on the device removal: [ 128.528940] debugfs: Directory 'loop0' with parent 'block' already present! [ 128.615325] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 128.619537] #PF: supervisor write access in kernel mode [ 128.622700] #PF: error_code(0x0002) - not-present page [ 128.625842] PGD 0 P4D 0 [ 128.627585] Oops: 0002 [#1] SMP NOPTI [ 128.629871] CPU: 12 PID: 544 Comm: break-blktrace Tainted: G E 5.7.0-rc2-next-20200420+ #164 [ 128.635595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 [ 128.640471] RIP: 0010:down_write+0x15/0x40 [ 128.643041] Code: eb ca e8 ae 22 8d ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00 00 00 48 0f b1 55 00 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 45 08 5d [ 128.650180] RSP: 0018:ffffa9c3c05ebd78 EFLAGS: 00010246 [ 128.651820] RAX: 0000000000000000 RBX: ffff8ae9a6370240 RCX: ffffff8100000000 [ 128.653942] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0 [ 128.655720] RBP: 00000000000000a0 R08: 0000000000000002 R09: ffff8ae9afd2d3d0 [ 128.657400] R10: 0000000000000056 R11: 0000000000000000 R12: 0000000000000000 [ 128.659099] R13: 0000000000000000 R14: 0000000000000003 R15: 00000000000000a0 [ 128.660500] FS: 00007febfd995540(0000) GS:ffff8ae9afd00000(0000) knlGS:0000000000000000 [ 128.662204] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 128.663426] CR2: 00000000000000a0 CR3: 0000000420042003 CR4: 0000000000360ee0 [ 128.664776] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 128.666022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 128.667282] Call Trace: [ 128.667801] simple_recursive_removal+0x4e/0x2e0 [ 128.668663] ? debugfs_remove+0x60/0x60 [ 128.669368] debugfs_remove+0x40/0x60 [ 128.669985] blk_trace_free+0xd/0x50 [ 128.670593] __blk_trace_remove+0x27/0x40 [ 128.671274] blk_trace_shutdown+0x30/0x40 [ 128.671935] blk_release_queue+0x95/0xf0 [ 128.672589] kobject_put+0xa5/0x1b0 [ 128.673188] disk_release+0xa2/0xc0 [ 128.673786] device_release+0x28/0x80 [ 128.674376] kobject_put+0xa5/0x1b0 [ 128.674915] loop_remove+0x39/0x50 [loop] [ 128.675511] loop_control_ioctl+0x113/0x130 [loop] [ 128.676199] ksys_ioctl+0x87/0xc0 [ 128.676708] __x64_sys_ioctl+0x16/0x20 [ 128.677274] do_syscall_64+0x52/0x180 [ 128.677823] entry_SYSCALL_64_after_hwframe+0x44/0xa9 The common theme here is: debugfs: Directory 'loop0' with parent 'block' already present This crash happens because of how blktrace uses the debugfs directory where it places its files. Upon init we always create the same directory which would be needed by blktrace but we only do this for make_request drivers (multiqueue) block drivers, but never for request-based block drivers. Furthermore, that directory is only created on init for the entire disk. This means that if you use blktrace on a parition, we'll always be creating a new directory regardless of whether or not you are doing blktrace on a make_request driver (multiqueue) or a request-based block drivers. These directory creations are only associated with a path, and so when a debugfs_remove() is called it removes everything in its way. A device removal will remove all blktrace files, and so if a blktrace is still present a cleanup of blktrace files later will end up trying to remove dentries pointing to NULL. We can fix the UAF by using a debugfs directory which moving forward will always be accessible if debugfs is enabled for both make_request drivers (multiqueue) and request-based block drivers, *and* for all partitions upon creation. This ensures that removal of the directories only happens on device removal and removes the race of the files underneath an active blktrace. This also simplifies the code considerably, with the only penalty now being that we're always creating the request queue debugfs directory for the request-based block device drivers, and the partition debugfs directories upon initialization for both types of drivers. This patch is part of the work which disputes the severity of CVE-2019-19770 which shows this issue is not a core debugfs issue, but a misuse of debugfs within blktace. Cc: Bart Van Assche Cc: Omar Sandoval Cc: Hannes Reinecke Cc: Nicolai Stange Cc: Greg Kroah-Hartman Cc: Michal Hocko Cc: yu kuai Reported-by: syzbot+603294af2d01acfdd6da@syzkaller.appspotmail.com Fixes: 6ac93117ab00 ("blktrace: use existing disk debugfs directory") Signed-off-by: Luis Chamberlain Reported-by: kernel test robot --- block/blk-debugfs.c | 29 +++++++++++++++++++++++++++++ block/blk-mq-debugfs.c | 5 ----- block/blk-sysfs.c | 4 ++++ block/blk.h | 11 +++++++++++ block/partitions/core.c | 3 +++ drivers/scsi/sg.c | 2 ++ include/linux/blkdev.h | 5 ++++- include/linux/blktrace_api.h | 1 - include/linux/genhd.h | 18 ++++++++++++++++++ kernel/trace/blktrace.c | 32 +++++++++++++++++++++----------- 10 files changed, 92 insertions(+), 18 deletions(-) diff --git a/block/blk-debugfs.c b/block/blk-debugfs.c index 19091e1effc0..a0f4077d6959 100644 --- a/block/blk-debugfs.c +++ b/block/blk-debugfs.c @@ -13,3 +13,32 @@ void blk_debugfs_register(void) { blk_debugfs_root = debugfs_create_dir("block", NULL); } + +static struct dentry *blk_debugfs_dir_register(const char *name) +{ + return debugfs_create_dir(name, blk_debugfs_root); +} + +void blk_queue_debugfs_register(struct request_queue *q, const char *name) +{ + q->debugfs_dir = blk_debugfs_dir_register(name); +} +EXPORT_SYMBOL_GPL(blk_queue_debugfs_register); + +void blk_queue_debugfs_unregister(struct request_queue *q) +{ + debugfs_remove_recursive(q->debugfs_dir); + q->debugfs_dir = NULL; +} +EXPORT_SYMBOL_GPL(blk_queue_debugfs_unregister); + +void blk_part_debugfs_register(struct hd_struct *p, const char *name) +{ + p->debugfs_dir = blk_debugfs_dir_register(name); +} + +void blk_part_debugfs_unregister(struct hd_struct *p) +{ + debugfs_remove_recursive(p->debugfs_dir); + p->debugfs_dir = NULL; +} diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index 96b7a35c898a..08edc3a54114 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -822,9 +822,6 @@ void blk_mq_debugfs_register(struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i; - q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent), - blk_debugfs_root); - debugfs_create_files(q->debugfs_dir, q, blk_mq_debugfs_queue_attrs); /* @@ -855,9 +852,7 @@ void blk_mq_debugfs_register(struct request_queue *q) void blk_mq_debugfs_unregister(struct request_queue *q) { - debugfs_remove_recursive(q->debugfs_dir); q->sched_debugfs_dir = NULL; - q->debugfs_dir = NULL; } static void blk_mq_debugfs_register_ctx(struct blk_mq_hw_ctx *hctx, diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index eda8c4985511..f758a7e06671 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -905,6 +905,7 @@ static void blk_release_queue(struct kobject *kobj) blk_trace_shutdown(q); + blk_queue_debugfs_unregister(q); if (queue_is_mq(q)) blk_mq_debugfs_unregister(q); @@ -976,6 +977,8 @@ int blk_register_queue(struct gendisk *disk) goto unlock; } + blk_queue_debugfs_register(q, kobject_name(q->kobj.parent)); + if (queue_is_mq(q)) { __blk_mq_register_dev(dev, q); blk_mq_debugfs_register(q); @@ -986,6 +989,7 @@ int blk_register_queue(struct gendisk *disk) ret = elv_register_queue(q, false); if (ret) { mutex_unlock(&q->sysfs_lock); + blk_queue_debugfs_unregister(q); mutex_unlock(&q->sysfs_dir_lock); kobject_del(&q->kobj); blk_trace_remove_sysfs(dev); diff --git a/block/blk.h b/block/blk.h index ec16e8a6049e..46d867a7f5bc 100644 --- a/block/blk.h +++ b/block/blk.h @@ -458,10 +458,21 @@ int __bio_add_pc_page(struct request_queue *q, struct bio *bio, bool *same_page); #ifdef CONFIG_DEBUG_FS void blk_debugfs_register(void); +void blk_part_debugfs_register(struct hd_struct *p, const char *name); +void blk_part_debugfs_unregister(struct hd_struct *p); #else static inline void blk_debugfs_register(void) { } + +static inline void blk_part_debugfs_register(struct hd_struct *p, + const char *name) +{ +} + +static inline void blk_part_debugfs_unregister(struct hd_struct *p) +{ +} #endif /* CONFIG_DEBUG_FS */ #endif /* BLK_INTERNAL_H */ diff --git a/block/partitions/core.c b/block/partitions/core.c index c085bf85509b..ae395b3ec9cc 100644 --- a/block/partitions/core.c +++ b/block/partitions/core.c @@ -312,6 +312,7 @@ void delete_partition(struct gendisk *disk, struct hd_struct *part) rcu_assign_pointer(ptbl->part[part->partno], NULL); rcu_assign_pointer(ptbl->last_lookup, NULL); kobject_put(part->holder_dir); + blk_part_debugfs_unregister(part); device_del(part_to_dev(part)); /* @@ -433,6 +434,8 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno, if (!p->holder_dir) goto out_del; + blk_part_debugfs_register(p, dev_name(pdev)); + dev_set_uevent_suppress(pdev, 0); if (flags & ADDPART_FLAG_WHOLEDISK) { err = device_create_file(pdev, &dev_attr_whole_disk); diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 20472aaaf630..f21787611918 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1548,6 +1548,7 @@ sg_add_device(struct device *cl_dev, struct class_interface *cl_intf) goto out; } + blk_queue_debugfs_register(sdp->device->request_queue, disk->disk_name); error = cdev_add(cdev, MKDEV(SCSI_GENERIC_MAJOR, sdp->index), 1); if (error) goto cdev_add_err; @@ -1644,6 +1645,7 @@ sg_remove_device(struct device *cl_dev, struct class_interface *cl_intf) sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic"); device_destroy(sg_sysfs_class, MKDEV(SCSI_GENERIC_MAJOR, sdp->index)); + blk_queue_debugfs_unregister(sdp->device->request_queue); cdev_del(sdp->cdev); sdp->cdev = NULL; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 3122a93c7277..e7edd31bdf9a 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -561,8 +561,11 @@ struct request_queue { struct list_head tag_set_list; struct bio_set bio_split; -#ifdef CONFIG_BLK_DEBUG_FS +#ifdef CONFIG_DEBUG_FS + /* Used by block/blk-*debugfs.c and kernel/trace/blktrace.c */ struct dentry *debugfs_dir; +#endif +#ifdef CONFIG_BLK_DEBUG_FS struct dentry *sched_debugfs_dir; struct dentry *rqos_debugfs_dir; #endif diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h index 3b6ff5902edc..eb6db276e293 100644 --- a/include/linux/blktrace_api.h +++ b/include/linux/blktrace_api.h @@ -22,7 +22,6 @@ struct blk_trace { u64 end_lba; u32 pid; u32 dev; - struct dentry *dir; struct dentry *dropped_file; struct dentry *msg_file; struct list_head running_list; diff --git a/include/linux/genhd.h b/include/linux/genhd.h index 058d895544c7..899760cf8c37 100644 --- a/include/linux/genhd.h +++ b/include/linux/genhd.h @@ -86,6 +86,10 @@ struct hd_struct { #endif struct percpu_ref ref; struct rcu_work rcu_work; +#ifdef CONFIG_DEBUG_FS + /* Currently only used by kernel/trace/blktrace.c */ + struct dentry *debugfs_dir; +#endif }; /** @@ -382,4 +386,18 @@ static inline dev_t blk_lookup_devt(const char *name, int partno) } #endif /* CONFIG_BLOCK */ +#ifdef CONFIG_DEBUG_FS +void blk_queue_debugfs_register(struct request_queue *q, const char *name); +void blk_queue_debugfs_unregister(struct request_queue *q); +#else +static inline void blk_queue_debugfs_register(struct request_queue *q, + const char *name) +{ +} + +static inline void blk_queue_debugfs_unregister(struct request_queue *q) +{ +} +#endif /* CONFIG_DEBUG_FS */ + #endif /* _LINUX_GENHD_H */ diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index 2c6e6c386ace..5c52976bd762 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -3,6 +3,7 @@ * Copyright (C) 2006 Jens Axboe * */ + #include #include #include @@ -311,7 +312,6 @@ static void blk_trace_free(struct blk_trace *bt) debugfs_remove(bt->msg_file); debugfs_remove(bt->dropped_file); relay_close(bt->rchan); - debugfs_remove(bt->dir); free_percpu(bt->sequence); free_percpu(bt->msg_data); kfree(bt); @@ -468,16 +468,25 @@ static void blk_trace_setup_lba(struct blk_trace *bt, } } -static struct dentry *blk_trace_debugfs_dir(struct blk_user_trace_setup *buts, - struct blk_trace *bt) +static struct dentry *blk_trace_debugfs_dir(struct block_device *bdev, + struct request_queue *q) { - struct dentry *dir = NULL; + struct hd_struct *p = NULL; - dir = debugfs_lookup(buts->name, blk_debugfs_root); - if (!dir) - bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root); + /* + * Some drivers like scsi-generic use a NULL block device. For + * other drivers when bdev != bdev->bd_contain we are doing a blktrace + * on a parition, otherwise we know we are working on the whole + * disk, and for that the request_queue already has its own debugfs_dir. + * which we have been using for other things other than blktrace. + */ + if (bdev && bdev != bdev->bd_contains) + p = bdev->bd_part; - return dir; + if (p) + return p->debugfs_dir; + + return q->debugfs_dir; } /* @@ -491,6 +500,7 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, struct dentry *dir = NULL; int ret; + if (!buts->buf_size || !buts->buf_nr) return -EINVAL; @@ -521,7 +531,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, ret = -ENOENT; - dir = blk_trace_debugfs_dir(buts, bt); + dir = blk_trace_debugfs_dir(bdev, q); + if (WARN_ON(!dir)) + goto err; bt->dev = dev; atomic_set(&bt->dropped, 0); @@ -561,8 +573,6 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, ret = 0; err: - if (dir && !bt->dir) - dput(dir); if (ret) blk_trace_free(bt); return ret; From patchwork Wed Apr 29 07:46:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11516339 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 37B3392C for ; Wed, 29 Apr 2020 07:46:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 04BD72076B for ; Wed, 29 Apr 2020 07:46:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 04BD72076B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7A8518E000B; Wed, 29 Apr 2020 03:46:38 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 7315F8E0001; Wed, 29 Apr 2020 03:46:38 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 647338E000B; Wed, 29 Apr 2020 03:46:38 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0121.hostedemail.com [216.40.44.121]) by kanga.kvack.org (Postfix) with ESMTP id 4D26C8E0001 for ; Wed, 29 Apr 2020 03:46:38 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 01F23824805A for ; Wed, 29 Apr 2020 07:46:38 +0000 (UTC) X-FDA: 76760110476.07.fuel77_39dec42cfd537 X-Spam-Summary: 2,0,0,5aafa2bcd43ac47b,d41d8cd98f00b204,mcgrof@gmail.com,,RULES_HIT:41:355:379:541:800:960:967:973:988:989:1260:1311:1314:1345:1359:1437:1515:1534:1542:1711:1730:1747:1777:1792:2393:2525:2559:2563:2682:2685:2693:2859:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3354:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:5007:6119:6261:7903:8660:9025:9040:10004:11026:11658:11914:12043:12048:12294:12296:12297:12517:12519:12555:12679:12895:12986:13095:13148:13161:13229:13230:13894:14181:14394:14721:14824:21063:21080:21212:21324:21433:21444:21451:21627:21939:21990:30029:30054:30070,0,RBL:209.85.216.67:@gmail.com:.lbl8.mailshell.net-62.50.0.100 66.100.201.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: fuel77_39dec42cfd537 X-Filterd-Recvd-Size: 4998 Received: from mail-pj1-f67.google.com (mail-pj1-f67.google.com [209.85.216.67]) by imf17.hostedemail.com (Postfix) with ESMTP for ; Wed, 29 Apr 2020 07:46:37 +0000 (UTC) Received: by mail-pj1-f67.google.com with SMTP id y6so440133pjc.4 for ; Wed, 29 Apr 2020 00:46:37 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lbuhBnmZzFbDhZxSJ1ni5AOA61z3VyXXGZL4QHqGGx0=; b=cS3OKUcRWAQVo9am5pVghpTVNnIASbjYjS4VCoLJw0BrsVkmVg1ndz4EQ+SmkpqRYo gc9zHGo+bX/lhVbICjvvMZ+FiLSlwGVbaarpTwjz6Bz3SmMNSa30t7i4DXDOFWgHmp1J Hbq8aXevAWo1Jr26ChKyUMOAjWjqDvQSIAewo83XqjTbUMhMUJxzsge2FD1ncfd/7DsT bqwF5WhGU4YKmKJeDa44CVPQGmm/cChHxlyZ6pndz57n3W9VO6Q25DdbvWFALO9Npib0 bMkXDHgbxR4HE93/v1AEGzp5Osz9SBnl9GQPfgHyKQTAztosdxfgunplrLPRyT3vLj3e y2WA== X-Gm-Message-State: AGi0PubzNQy4a8px/JuoPqcNoLGDuJaf4wWRZ3D36AYVwx0sz6VbS9Fl SgKdV1qgaXTxKkapKarYLJ8= X-Google-Smtp-Source: APiQypKrREry4NDgPPBJilLfXP5Wql3EjomxOYfYxqGwUMX/NYY+ZmpZXVnkCpQ7jDJW/k/bn9/ubQ== X-Received: by 2002:a17:90a:e382:: with SMTP id b2mr1640417pjz.110.1588146396679; Wed, 29 Apr 2020 00:46:36 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. [157.230.128.187]) by smtp.gmail.com with ESMTPSA id r31sm387543pgl.86.2020.04.29.00.46.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2020 00:46:35 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id 1548241DCA; Wed, 29 Apr 2020 07:46:30 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, bvanassche@acm.org, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de, akpm@linux-foundation.org Cc: mhocko@suse.com, yukuai3@huawei.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Luis Chamberlain Subject: [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls Date: Wed, 29 Apr 2020 07:46:26 +0000 Message-Id: <20200429074627.5955-6-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200429074627.5955-1-mcgrof@kernel.org> References: <20200429074627.5955-1-mcgrof@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We use one blktrace per request_queue, that means one per the entire disk. So we cannot run one blktrace on say /dev/vda and then /dev/vda1, or just two calls on /dev/vda. We check for concurrent setup only at the very end of the blktrace setup though. If we try to run two concurrent blktraces on the same block device the second one will fail, and the first one seems to go on. However when one tries to kill the first one one will see things like this: The kernel will show these: ``` debugfs: File 'dropped' in directory 'nvme1n1' already present! debugfs: File 'msg' in directory 'nvme1n1' already present! debugfs: File 'trace0' in directory 'nvme1n1' already present! `` And userspace just sees this error message for the second call: ``` blktrace /dev/nvme1n1 BLKTRACESETUP(2) /dev/nvme1n1 failed: 5/Input/output error ``` The first userspace process #1 will also claim that the files were taken underneath their nose as well. The files are taken away form the first process given that when the second blktrace fails, it will follow up with a BLKTRACESTOP and BLKTRACETEARDOWN. This means that even if go-happy process #1 is waiting for blktrace data, we *have* been asked to take teardown the blktrace. This can easily be reproduced with break-blktrace [0] run_0005.sh test. Just break out early if we know we're already going to fail, this will prevent trying to create the files all over again, which we know still exist. [0] https://github.com/mcgrof/break-blktrace Signed-off-by: Luis Chamberlain --- kernel/trace/blktrace.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index 5c52976bd762..383045f67cb8 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -4,6 +4,8 @@ * */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include @@ -516,6 +518,11 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, */ strreplace(buts->name, '/', '_'); + if (q->blk_trace) { + pr_warn("Concurrent blktraces are not allowed\n"); + return -EBUSY; + } + bt = kzalloc(sizeof(*bt), GFP_KERNEL); if (!bt) return -ENOMEM; From patchwork Wed Apr 29 07:46:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11516343 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 77F4215AB for ; Wed, 29 Apr 2020 07:46:48 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 458B82076B for ; Wed, 29 Apr 2020 07:46:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 458B82076B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0CA6E8E000C; Wed, 29 Apr 2020 03:46:40 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 054528E0001; Wed, 29 Apr 2020 03:46:40 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EAAED8E000C; Wed, 29 Apr 2020 03:46:39 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0236.hostedemail.com [216.40.44.236]) by kanga.kvack.org (Postfix) with ESMTP id CEBCA8E0001 for ; Wed, 29 Apr 2020 03:46:39 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 8D6282C04 for ; Wed, 29 Apr 2020 07:46:39 +0000 (UTC) X-FDA: 76760110518.14.water67_3a1b9e9bab217 X-Spam-Summary: 2,0,0,4c5f70896f75b135,d41d8cd98f00b204,mcgrof@gmail.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1311:1314:1345:1359:1437:1515:1534:1540:1711:1714:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3350:3865:3866:3868:4321:5007:6261:9389:10004:11026:11658:11914:12043:12048:12297:12517:12519:12555:12895:13069:13311:13357:13894:13972:14096:14181:14384:14394:14721:21080:21444:21627:30054,0,RBL:209.85.214.195:@gmail.com:.lbl8.mailshell.net-62.50.0.100 66.100.201.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: water67_3a1b9e9bab217 X-Filterd-Recvd-Size: 3533 Received: from mail-pl1-f195.google.com (mail-pl1-f195.google.com [209.85.214.195]) by imf28.hostedemail.com (Postfix) with ESMTP for ; Wed, 29 Apr 2020 07:46:39 +0000 (UTC) Received: by mail-pl1-f195.google.com with SMTP id f15so553296plr.3 for ; Wed, 29 Apr 2020 00:46:39 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=7SGf6Ly+O4Ndriq4xyCuLphKEBBCLbvmraZ/f8/ln1o=; b=psRsMo5ToQWzGHqzdWuzZz1Z/aBDckQeeAAOZeqyeAYJxuD3ldsk+OCqgOFDr7WZ0C Jb3xfX34dwU3TxE+371vrroPbc+mMIIvkm5wMcaCKFlKEy3ElIfn+MsSKclgM/XZ4zDt XlMVNvhj0bx6DjkM3aTEHR/YhfH97qurrvLc2PXVX8mHWzXkAYCrEPNc+W0KKT2AXmE0 Cqog8m3dPW0udsjwGqg3b1mKW2rhJQqjUwImbnh6tMvlyFihj6a38gItDhnRWmJHh3zK Rf441nsDKzvFZo2g7eCHGPDIexP529Qfw8UVlqtq8IPOJXjoeXHpVlzDAARpNtiPZYTl X7eA== X-Gm-Message-State: AGi0PuY/djlGqZR4cxo+iKaj/ivr9ZjO+N1rHC1UvQ7FUFRImjKz6eX8 yMdWDJ/dWtuMeko5qVjfPZg= X-Google-Smtp-Source: APiQypKvHe5sxZ7BY9kIc9F1nfVnrNrN57rzYZfJEMB8GR3JOh+gOanykqeVDjgeRVbFaOR+yoZOIQ== X-Received: by 2002:a17:902:8687:: with SMTP id g7mr30866549plo.59.1588146398422; Wed, 29 Apr 2020 00:46:38 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. [157.230.128.187]) by smtp.gmail.com with ESMTPSA id y3sm4096498pjb.41.2020.04.29.00.46.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2020 00:46:35 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id 1F17E42000; Wed, 29 Apr 2020 07:46:30 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, bvanassche@acm.org, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de, akpm@linux-foundation.org Cc: mhocko@suse.com, yukuai3@huawei.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Luis Chamberlain Subject: [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals Date: Wed, 29 Apr 2020 07:46:27 +0000 Message-Id: <20200429074627.5955-7-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200429074627.5955-1-mcgrof@kernel.org> References: <20200429074627.5955-1-mcgrof@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Be pedantic on removal as well and hold the mutex. This should prevent uses of addition while we exit. Signed-off-by: Luis Chamberlain Reviewed-by: Ming Lei --- drivers/block/loop.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index da693e6a834e..6dccba22c9b5 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -2333,6 +2333,8 @@ static void __exit loop_exit(void) range = max_loop ? max_loop << part_shift : 1UL << MINORBITS; + mutex_lock(&loop_ctl_mutex); + idr_for_each(&loop_index_idr, &loop_exit_cb, NULL); idr_destroy(&loop_index_idr); @@ -2340,6 +2342,8 @@ static void __exit loop_exit(void) unregister_blkdev(LOOP_MAJOR, "loop"); misc_deregister(&loop_misc); + + mutex_unlock(&loop_ctl_mutex); } module_init(loop_init);