From patchwork Thu Apr 2 00:00:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11469717 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A49271667 for ; Thu, 2 Apr 2020 00:00:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8348A2082F for ; Thu, 2 Apr 2020 00:00:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1585785628; bh=AQiHX8FkqiBny7pBzb7w1oGoc/AmBPy22JWbqE+ElkA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=dwHr7+4gnrdIY44CUcb25B4GR2camHLJJxJdW2a87eMLHyVZfea82Fn5Cu3dTdChN Dav5LC+LjFgqo/pUFxLJ9S8O32y3l5evbXZNVDLgfyjlk+uHbVQpIcMGL8fOnmmy2o DQuaRST7ljDgzndwQljuannaPu45PgfUrwVvf8F8= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733239AbgDBAAO (ORCPT ); Wed, 1 Apr 2020 20:00:14 -0400 Received: from mail-pf1-f195.google.com ([209.85.210.195]:44805 "EHLO mail-pf1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732560AbgDBAAN (ORCPT ); Wed, 1 Apr 2020 20:00:13 -0400 Received: by mail-pf1-f195.google.com with SMTP id b72so825671pfb.11; Wed, 01 Apr 2020 17:00:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=IFwcC/w73MX4o2yQ8rfJAUpy9x+uzW31JyxuFegsf5o=; b=M8tdxnMMrHpie5+x79htSm+WQsCttcDD3joPI9Yunrl74BPFoKg8XBH61EK0dgVgUB Og8ssomY+8D0goIgxxP3Utw9dqC5pG0tlg4cRDQx6LbvNbOdnpkhiYOniuWfLMX6ThGZ w2yiZsMLco988eX1jgtyzBEIg64wWFoRV+sFOm3VXaqa1DqEJo46L3T8lFiTd6tRd4j5 FrHHm4fjVUlbYaPJhiX/cQC5/u61jh8MYVkt+Ck7YR7zGfIJbNzh4gYu68Pr8f8QcwvH 
CWpVQLkWuRcfVlUlhOk4SAEZoZ3shOtraShEmrdZuvGYolGS8aFkYBoJTtZISPKxS/Mh w9ng== X-Gm-Message-State: AGi0PubcEu/hPz/cts6mY95+nVwv5fjRtqxuEl6y1gY0ce4gL8/dLyYd yt91ir/RjP1LolplbiE2JGU= X-Google-Smtp-Source: APiQypKeCcDQDiyCN2rdAtJfDymVmvOrZ70v+7dz07bMma08br8CelIedf6QzSZmqp2A/+DOlwQxTQ== X-Received: by 2002:a63:78e:: with SMTP id 136mr691233pgh.181.1585785612494; Wed, 01 Apr 2020 17:00:12 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. [157.230.128.187]) by smtp.gmail.com with ESMTPSA id y13sm2388379pfp.88.2020.04.01.17.00.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Apr 2020 17:00:11 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id 8FA2C4018C; Thu, 2 Apr 2020 00:00:10 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de Cc: mhocko@suse.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Chamberlain , Bart Van Assche , Omar Sandoval , Hannes Reinecke , Michal Hocko Subject: [RFC 1/3] block: move main block debugfs initialization to its own file Date: Thu, 2 Apr 2020 00:00:00 +0000 Message-Id: <20200402000002.7442-2-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200402000002.7442-1-mcgrof@kernel.org> References: <20200402000002.7442-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org

Single and multiqueue block devices share some debugfs code. Moving this code into its own file makes it easier to expand and audit. This patch contains no functional changes.

Cc: Bart Van Assche Cc: Omar Sandoval Cc: Hannes Reinecke Cc: Nicolai Stange Cc: Greg Kroah-Hartman Cc: Michal Hocko Signed-off-by: Luis Chamberlain --- block/Makefile | 1 + block/blk-core.c | 9 +-------- block/blk-debugfs.c | 15 +++++++++++++++ block/blk.h | 7 +++++++ 4 files changed, 24 insertions(+), 8 deletions(-) create mode 100644 block/blk-debugfs.c diff --git a/block/Makefile b/block/Makefile index 206b96e9387f..1d3ab20505d8 100644 --- a/block/Makefile +++ b/block/Makefile @@ -10,6 +10,7 @@ obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-sysfs.o \ blk-mq-sysfs.o blk-mq-cpumap.o blk-mq-sched.o ioctl.o \ genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o +obj-$(CONFIG_DEBUG_FS) += blk-debugfs.o obj-$(CONFIG_BOUNCE) += bounce.o obj-$(CONFIG_BLK_SCSI_REQUEST) += scsi_ioctl.o obj-$(CONFIG_BLK_DEV_BSG) += bsg.o diff --git a/block/blk-core.c b/block/blk-core.c index 7e4a1da0715e..5aaae7a1b338 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -48,10 +48,6 @@ #include "blk-pm.h" #include "blk-rq-qos.h" -#ifdef CONFIG_DEBUG_FS -struct dentry *blk_debugfs_root; -#endif - EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap); EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap); EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete); @@ -1796,10 +1792,7 @@ int __init blk_dev_init(void) blk_requestq_cachep = kmem_cache_create("request_queue", sizeof(struct request_queue), 0, SLAB_PANIC, NULL); - -#ifdef CONFIG_DEBUG_FS - blk_debugfs_root = debugfs_create_dir("block", NULL); -#endif + blk_debugfs_register(); return 0; } diff --git a/block/blk-debugfs.c b/block/blk-debugfs.c new file mode 100644 index 000000000000..634dea4b1507 --- /dev/null +++ b/block/blk-debugfs.c @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Shared debugfs mq / non-mq functionality + */ +#include +#include +#include + +struct dentry *blk_debugfs_root; + +void blk_debugfs_register(void) +{ + blk_debugfs_root = debugfs_create_dir("block", NULL); +} diff --git a/block/blk.h b/block/blk.h 
index 0a94ec68af32..86a66b614f08 100644 --- a/block/blk.h +++ b/block/blk.h @@ -487,5 +487,12 @@ struct request_queue *__blk_alloc_queue(int node_id); int __bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page *page, unsigned int len, unsigned int offset, bool *same_page); +#ifdef CONFIG_DEBUG_FS +void blk_debugfs_register(void); +#else +static inline void blk_debugfs_register(void) +{ +} +#endif /* CONFIG_DEBUG_FS */ #endif /* BLK_INTERNAL_H */ From patchwork Thu Apr 2 00:00:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11469711 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 21ADF17EA for ; Thu, 2 Apr 2020 00:00:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id DE1D12077D for ; Thu, 2 Apr 2020 00:00:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1585785626; bh=8SG2lxNlUhFihJncs4atxSh1fxcwhyatjGzSsMkpvHY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=wXxLAdEBYgIbQnmsatDImV1BST7InVoLiujp6SP6ohYpBH4qNXJ5HzfcFNLWJy/y0 P22qXCT3kD8T4Rc4SMqokGeLSCm6+AY4wBICF1pDKbmKqHMZiNtwUPydZLF9A0V7a5 teNTRviMU3p7cPqxUcCXDl3KYkr5dXNCB8H6OqiE= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387461AbgDBAAU (ORCPT ); Wed, 1 Apr 2020 20:00:20 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:44681 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733287AbgDBAAR (ORCPT ); Wed, 1 Apr 2020 20:00:17 -0400 Received: by mail-pl1-f193.google.com with SMTP id h11so621545plr.11; Wed, 01 Apr 2020 17:00:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QnLJUtXOSia7PQE5OyDEi1WF/+/PGHAetD+0mMY8uXs=; b=LmktueFt+ykwGooK4RXZpbEEXpKBy0T2C5udtWS5Grj6anmIXZVY2JZinRH9SyuS1w Y6agJmJ85vubcUJm7TF3X1i760C/ZgaFtcBimuYsgnGiRLzIgiLBPmbffKML0mYp/Dnv xzaCWxUywluZBvyLqMxq8tabmO1tSCyKu+lZwmseM6ddRQ/ef6rHNKl6X+VysSsF0VFU qArFLmOhpJ/UZjOy0y7gWDVIs/Y3IvquIOWYllaIzURnu487xlGFXSvogrvQn8oBnhFa o6/mfx/6kC2CjWch2Z+jnnFypDgGoZtw/RoIlj+fQyUrVP36KBpq+5jLVaHZzfrv/Bmt CXbg== X-Gm-Message-State: AGi0PuY74lyZlVVQ+ZBFfrxM7LVMvYKz2pDYFMvsmmWdAvkEqGLKwRNL 5XgfgpHsambwB5q64FY40X0= X-Google-Smtp-Source: APiQypIZbqiOw+Z25iDmBXUboufCfW1IaEAppS0y8nlBkDk1CH1yqy/pe9p4O36P93+CKhcfM9hY4A== X-Received: by 2002:a17:90a:77cc:: with SMTP id e12mr660680pjs.134.1585785615909; Wed, 01 Apr 2020 17:00:15 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. [157.230.128.187]) by smtp.gmail.com with ESMTPSA id ci18sm2504005pjb.23.2020.04.01.17.00.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Apr 2020 17:00:11 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id A2C49418C0; Thu, 2 Apr 2020 00:00:10 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de Cc: mhocko@suse.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Chamberlain , Bart Van Assche , Omar Sandoval , Hannes Reinecke , Michal Hocko , syzbot+603294af2d01acfdd6da@syzkaller.appspotmail.com Subject: [RFC 2/3] blktrace: fix debugfs use after free Date: Thu, 2 Apr 2020 00:00:01 +0000 Message-Id: <20200402000002.7442-3-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200402000002.7442-1-mcgrof@kernel.org> References: <20200402000002.7442-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org 
Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org

In commit 6ac93117ab00 ("blktrace: use existing disk debugfs directory") Omar fixed the original blktrace code for multiqueue use. This however left in place a possible crash if you happen to abuse blktrace in a way it was not intended. Namely, if you loop over adding a device, setting up blktrace with BLKTRACESETUP, forgetting to BLKTRACETEARDOWN, and then just removing the device, you end up with a panic:

  [ 107.193134] debugfs: Directory 'loop0' with parent 'block' already present!
  [ 107.254615] BUG: kernel NULL pointer dereference, address: 00000000000000a0
  [ 107.258785] #PF: supervisor write access in kernel mode
  [ 107.262035] #PF: error_code(0x0002) - not-present page
  [ 107.264106] PGD 0 P4D 0
  [ 107.264404] Oops: 0002 [#1] SMP NOPTI
  [ 107.264803] CPU: 8 PID: 674 Comm: kworker/8:2 Tainted: G E 5.6.0-rc7-next-20200327 #1
  [ 107.265712] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
  [ 107.266553] Workqueue: events __blk_release_queue
  [ 107.267051] RIP: 0010:down_write+0x15/0x40
  [ 107.267488] Code: eb ca e8 ee a5 8d ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00 00 00 48 0f b1 55 00 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 45 08 5d
  [ 107.269300] RSP: 0018:ffff9927c06efda8 EFLAGS: 00010246
  [ 107.269841] RAX: 0000000000000000 RBX: ffff8be7e73b0600 RCX: ffffff8100000000
  [ 107.270559] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0
  [ 107.271281] RBP: 00000000000000a0 R08: ffff8be7ebc80fa8 R09: ffff8be7ebc80fa8
  [ 107.272001] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  [ 107.272722] R13: ffff8be7efc30400 R14: ffff8be7e0571200 R15: 00000000000000a0
  [ 107.273475] FS: 0000000000000000(0000) GS:ffff8be7efc00000(0000) knlGS:0000000000000000
  [ 107.274346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 107.274968] CR2: 00000000000000a0 CR3: 000000042abee003 CR4: 0000000000360ee0
  [ 107.275710] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [ 107.276465] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [ 107.277214] Call Trace:
  [ 107.277532] simple_recursive_removal+0x4e/0x2e0
  [ 107.278049] ? debugfs_remove+0x60/0x60
  [ 107.278493] debugfs_remove+0x40/0x60
  [ 107.278922] blk_trace_free+0xd/0x50
  [ 107.279339] __blk_trace_remove+0x27/0x40
  [ 107.279797] blk_trace_shutdown+0x30/0x40
  [ 107.280256] __blk_release_queue+0xab/0x110
  [ 107.280734] process_one_work+0x1b4/0x380
  [ 107.281194] worker_thread+0x50/0x3c0
  [ 107.281622] kthread+0xf9/0x130
  [ 107.281994] ? process_one_work+0x380/0x380
  [ 107.282467] ? kthread_park+0x90/0x90
  [ 107.282895] ret_from_fork+0x1f/0x40
  [ 107.283316] Modules linked in: loop(E)
  [ 107.288562] CR2: 00000000000000a0
  [ 107.288957] ---[ end trace b885d243d441bbce ]---

This splat happens to be very similar to the one reported via kernel.org korg#205713 [0], except that korg#205713 was for v4.19.83 and the above now includes simple_recursive_removal(), introduced via commit a3d1e7eb5abe ("simple_recursive_removal(): kernel-side rm -rf for ramfs-style filesystems") and merged in v5.6.

korg#205713 was then used to create CVE-2019-19770 [1], which claims that the bug is a use-after-free in the debugfs core code. The implications of this being a generic UAF in debugfs would be much more severe, as it would imply that parent dentries can sometimes not be positive, which we claim is not possible.

It turns out that the issue is actually a misuse of debugfs for the multiqueue case, combined with the fragile way in which we free the directory used to keep track of blktrace debugfs files. Omar's commit assumed the parent directory would be kept around with debugfs_lookup(), but this is not the case: only the dentry is kept around. We also special-cased multiqueue, given that the multiqueue code always instantiates the debugfs directory for the request queue. On single-queue block devices we left it to chance: the respective debugfs directory was created only if someone happened to use blktrace.

We can fix the UAF by simply using a debugfs directory which is always created for both single-queue and multiqueue block devices. This simplifies the code considerably, with the only penalty being that we now always create the request queue debugfs directory for single-queue block devices as well.

The UAF then is not a core debugfs issue but a misuse of debugfs, and it can only be triggered if you are root and misuse blktrace.

This issue can be reproduced with break-blktrace [2] using:

  break-blktrace -c 10 -d

This patch fixes this issue. Note that there is also another related UAF, reached from the ioctl path [3]; this patch should fix that issue as well.

This patch also contests the severity of CVE-2019-19770, as the issue is only reachable by root shooting itself in the foot by also misusing blktrace.

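The ownership rule described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: every toy_* name is hypothetical, and a plain file counter stands in for a debugfs directory. The queue owns the directory for its whole lifetime, tracing only borrows it, and release shuts tracing down before dropping the directory.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a debugfs directory; not the kernel's struct dentry. */
struct toy_dir {
	int nfiles;
};

/* Hypothetical model of a request queue that owns its debugfs directory. */
struct toy_queue {
	struct toy_dir *debugfs_dir;
};

/* Hypothetical model of a blktrace instance; it only borrows the directory. */
struct toy_trace {
	struct toy_dir *parent;
	bool active;
};

/* Models blk_q_debugfs_register(): the directory exists once this succeeds. */
int toy_queue_register(struct toy_queue *q)
{
	q->debugfs_dir = calloc(1, sizeof(*q->debugfs_dir));
	return q->debugfs_dir ? 0 : -ENOMEM;
}

/* Models the new check in do_blk_trace_setup(): no directory, no tracing. */
int toy_trace_setup(struct toy_queue *q, struct toy_trace *bt)
{
	if (!q->debugfs_dir)
		return -ENOENT;
	bt->parent = q->debugfs_dir;
	bt->parent->nfiles += 3;	/* "dropped", "msg" and the "trace" relay */
	bt->active = true;
	return 0;
}

/* Models blk_trace_free(): remove only the trace files, never the directory. */
void toy_trace_free(struct toy_trace *bt)
{
	if (!bt->active)
		return;
	bt->parent->nfiles -= 3;
	bt->parent = NULL;
	bt->active = false;
}

/* Models queue release: shut tracing down first, then drop the directory. */
void toy_queue_release(struct toy_queue *q, struct toy_trace *bt)
{
	toy_trace_free(bt);
	free(q->debugfs_dir);
	q->debugfs_dir = NULL;
}
```

In this model a trace can never outlive, or free, the directory it writes into, which is the property the patch relies on when it drops the dir member from struct blk_trace.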
[0] https://bugzilla.kernel.org/show_bug.cgi?id=205713 [1] https://nvd.nist.gov/vuln/detail/CVE-2019-19770 [2] https://github.com/mcgrof/break-blktrace [3] https://lore.kernel.org/lkml/000000000000ec635b059f752700@google.com/ Cc: Bart Van Assche Cc: Omar Sandoval Cc: Hannes Reinecke Cc: Nicolai Stange Cc: Greg Kroah-Hartman Cc: Michal Hocko Reported-by: syzbot+603294af2d01acfdd6da@syzkaller.appspotmail.com Signed-off-by: Luis Chamberlain --- block/blk-debugfs.c | 12 ++++++++++++ block/blk-mq-debugfs.c | 5 ----- block/blk-sysfs.c | 3 +++ block/blk.h | 10 ++++++++++ include/linux/blktrace_api.h | 1 - kernel/trace/blktrace.c | 19 ++++++++----------- 6 files changed, 33 insertions(+), 17 deletions(-) diff --git a/block/blk-debugfs.c b/block/blk-debugfs.c index 634dea4b1507..a8b343e758e4 100644 --- a/block/blk-debugfs.c +++ b/block/blk-debugfs.c @@ -13,3 +13,15 @@ void blk_debugfs_register(void) { blk_debugfs_root = debugfs_create_dir("block", NULL); } + +void blk_q_debugfs_register(struct request_queue *q) +{ + q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent), + blk_debugfs_root); +} + +void blk_q_debugfs_unregister(struct request_queue *q) +{ + debugfs_remove_recursive(q->debugfs_dir); + q->debugfs_dir = NULL; +} diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index b3f2ba483992..bda9378eab90 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -823,9 +823,6 @@ void blk_mq_debugfs_register(struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i; - q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent), - blk_debugfs_root); - debugfs_create_files(q->debugfs_dir, q, blk_mq_debugfs_queue_attrs); /* @@ -856,9 +853,7 @@ void blk_mq_debugfs_register(struct request_queue *q) void blk_mq_debugfs_unregister(struct request_queue *q) { - debugfs_remove_recursive(q->debugfs_dir); q->sched_debugfs_dir = NULL; - q->debugfs_dir = NULL; } static void blk_mq_debugfs_register_ctx(struct blk_mq_hw_ctx *hctx, diff --git 
a/block/blk-sysfs.c b/block/blk-sysfs.c index fca9b158f4a0..20f20b0fa0b9 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -895,6 +895,7 @@ static void __blk_release_queue(struct work_struct *work) blk_trace_shutdown(q); + blk_q_debugfs_unregister(q); if (queue_is_mq(q)) blk_mq_debugfs_unregister(q); @@ -975,6 +976,8 @@ int blk_register_queue(struct gendisk *disk) goto unlock; } + blk_q_debugfs_register(q); + if (queue_is_mq(q)) { __blk_mq_register_dev(dev, q); blk_mq_debugfs_register(q); diff --git a/block/blk.h b/block/blk.h index 86a66b614f08..b86123a2d74f 100644 --- a/block/blk.h +++ b/block/blk.h @@ -489,10 +489,20 @@ int __bio_add_pc_page(struct request_queue *q, struct bio *bio, bool *same_page); #ifdef CONFIG_DEBUG_FS void blk_debugfs_register(void); +void blk_q_debugfs_register(struct request_queue *q); +void blk_q_debugfs_unregister(struct request_queue *q); #else static inline void blk_debugfs_register(void) { } + +static inline void blk_q_debugfs_register(struct request_queue *q) +{ +} + +static inline void blk_q_debugfs_unregister(struct request_queue *q) +{ +} #endif /* CONFIG_DEBUG_FS */ #endif /* BLK_INTERNAL_H */ diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h index 3b6ff5902edc..eb6db276e293 100644 --- a/include/linux/blktrace_api.h +++ b/include/linux/blktrace_api.h @@ -22,7 +22,6 @@ struct blk_trace { u64 end_lba; u32 pid; u32 dev; - struct dentry *dir; struct dentry *dropped_file; struct dentry *msg_file; struct list_head running_list; diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index ca39dc3230cb..15086227592f 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -311,7 +311,6 @@ static void blk_trace_free(struct blk_trace *bt) debugfs_remove(bt->msg_file); debugfs_remove(bt->dropped_file); relay_close(bt->rchan); - debugfs_remove(bt->dir); free_percpu(bt->sequence); free_percpu(bt->msg_data); kfree(bt); @@ -476,7 +475,6 @@ static int do_blk_trace_setup(struct request_queue 
*q, char *name, dev_t dev, struct blk_user_trace_setup *buts) { struct blk_trace *bt = NULL; - struct dentry *dir = NULL; int ret; if (!buts->buf_size || !buts->buf_nr) @@ -485,6 +483,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, if (!blk_debugfs_root) return -ENOENT; + if (!q->debugfs_dir) + return -ENOENT; + strncpy(buts->name, name, BLKTRACE_BDEV_SIZE); buts->name[BLKTRACE_BDEV_SIZE - 1] = '\0'; @@ -509,21 +510,19 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, ret = -ENOENT; - dir = debugfs_lookup(buts->name, blk_debugfs_root); - if (!dir) - bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root); - bt->dev = dev; atomic_set(&bt->dropped, 0); INIT_LIST_HEAD(&bt->running_list); ret = -EIO; - bt->dropped_file = debugfs_create_file("dropped", 0444, dir, bt, + bt->dropped_file = debugfs_create_file("dropped", 0444, + q->debugfs_dir, bt, &blk_dropped_fops); - bt->msg_file = debugfs_create_file("msg", 0222, dir, bt, &blk_msg_fops); + bt->msg_file = debugfs_create_file("msg", 0222, q->debugfs_dir, + bt, &blk_msg_fops); - bt->rchan = relay_open("trace", dir, buts->buf_size, + bt->rchan = relay_open("trace", q->debugfs_dir, buts->buf_size, buts->buf_nr, &blk_relay_callbacks, bt); if (!bt->rchan) goto err; @@ -551,8 +550,6 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, ret = 0; err: - if (dir && !bt->dir) - dput(dir); if (ret) blk_trace_free(bt); return ret; From patchwork Thu Apr 2 00:00:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 11469715 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7208417EA for ; Thu, 2 Apr 2020 00:00:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) 
with ESMTP id 502EE2077D for ; Thu, 2 Apr 2020 00:00:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1585785627; bh=8fR77l9xfbt6qWg1vV/ZZYimPLZEkMBq+8PO1jeeOiI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=WFfXcdao1b8Pngl17rKpcEKFxxgXuuRnZN4IycnOtKVg6fLcSE56ueWgWp0lD2QY5 68iCEg1o9yN0vVfQ0DCCsai79Hf94GY8va9cRUqefi2PDSqCY13wDI0RSFLHickRQe 8hSuoP6NwhOLm77fgaAty53v78J4EnwQ43YmQc7I= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387452AbgDBAAU (ORCPT ); Wed, 1 Apr 2020 20:00:20 -0400 Received: from mail-pl1-f194.google.com ([209.85.214.194]:43419 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732560AbgDBAAP (ORCPT ); Wed, 1 Apr 2020 20:00:15 -0400 Received: by mail-pl1-f194.google.com with SMTP id v23so624225ply.10; Wed, 01 Apr 2020 17:00:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Vm+IPYywOxBkwyr7TKQH9BbZsRaS9vSqfzACF8qBlbo=; b=e1rLc8B8aJKzxPYn8uOzZ6D0p0nnn5Ii+IEh8aonCLNkHZcvLtidcer+LxxMjduCQw RXNLolG3HXSOcPtRMtGo/QzvI5GhtBSm7uHVWCXBmygZ4YNYL2ImK++lApPu6VPP+QgQ 9t9FZOVhvQJPGhp3IdiY2nq5EqWG9l0gSSsfxo2Z00fsqCfkyBz5vZK99rnJt8ZIga8t J2Xzl96bMejZnnr3HCbzhjBEf1W3HLQS3Jb2ufoS5saU89cOLRTjRb2Q/JgoY/Uv6Hs2 3If6d63MfxEqNdUpZ6U2QT3kvmnVGR/Fucw8U+FzK5KabzA1KBQE1+pw7KV8OsgC4Xy/ uZbQ== X-Gm-Message-State: AGi0PuYl08aelrDL+y1pFZ7KOjHywDznQo0/efCrQXy4k5FLEXfyzA+Y HdABba5Q11Mp3q/YSFeuinM= X-Google-Smtp-Source: APiQypKV6AYf89cCEMicViLa7eFHdETMC/+pi997inC3D16Iti5incZo5Y11kbr1GFTvBYkvHkXSDg== X-Received: by 2002:a17:902:788e:: with SMTP id q14mr409569pll.72.1585785614675; Wed, 01 Apr 2020 17:00:14 -0700 (PDT) Received: from 42.do-not-panic.com (42.do-not-panic.com. 
[157.230.128.187]) by smtp.gmail.com with ESMTPSA id np4sm2615858pjb.48.2020.04.01.17.00.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Apr 2020 17:00:11 -0700 (PDT) Received: by 42.do-not-panic.com (Postfix, from userid 1000) id B29F041DCA; Thu, 2 Apr 2020 00:00:10 +0000 (UTC) From: Luis Chamberlain To: axboe@kernel.dk, viro@zeniv.linux.org.uk, gregkh@linuxfoundation.org, rostedt@goodmis.org, mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com, nstange@suse.de Cc: mhocko@suse.com, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Chamberlain , Bart Van Assche , Omar Sandoval , Hannes Reinecke , Michal Hocko Subject: [RFC 3/3] block: avoid deferral of blk_release_queue() work Date: Thu, 2 Apr 2020 00:00:02 +0000 Message-Id: <20200402000002.7442-4-mcgrof@kernel.org> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20200402000002.7442-1-mcgrof@kernel.org> References: <20200402000002.7442-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org

Commit dc9edc44de6c ("block: Fix a blk_exit_rl() regression") moved the blk_release_queue() work into a workqueue after a splat was reported showing that some of this work could sleep in blk_exit_rl(). However, in commit db6d9952356 ("block: remove request_list code"), merged in v5.0, Jens Axboe removed that code. We no longer have to defer this work.

By doing this we also avoid failing to detach / attach a block device with a BLKTRACESETUP. This issue can be reproduced with break-blktrace [0] using:

  break-blktrace -c 10 -d -s

Without this commit the kernel does not crash; it just fails to create the block device because the deferred work from the prior block device removal is still pending. After this commit, the flaky use of blktrace above works without an issue.

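The ordering problem described above can be illustrated with a toy userspace model. This is a sketch under stated assumptions rather than kernel code: the toy_* names are hypothetical, and the -EEXIST return value is an assumption chosen for illustration. With deferred release, the old device's name stays taken until the pending work runs, so an immediate re-add fails; with synchronous release it succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one device name, for example "loop0". */
struct toy_slot {
	bool name_in_use;	/* resources of the previous device still exist */
	bool release_pending;	/* release was queued but has not run yet */
};

/* Adding a device fails while the previous instance is not fully released. */
int toy_add_device(struct toy_slot *s)
{
	if (s->name_in_use)
		return -EEXIST;	/* assumed error code, for illustration only */
	s->name_in_use = true;
	return 0;
}

/* Old behaviour: removal only queues the release work. */
void toy_remove_deferred(struct toy_slot *s)
{
	s->release_pending = true;
}

/* Stand-in for the workqueue eventually running the queued release. */
void toy_run_pending_work(struct toy_slot *s)
{
	if (!s->release_pending)
		return;
	s->release_pending = false;
	s->name_in_use = false;
}

/* New behaviour: removal releases everything before returning. */
void toy_remove_sync(struct toy_slot *s)
{
	s->name_in_use = false;
}
```

The model leaves out everything that made deferral necessary in the first place (sleeping in blk_exit_rl()); it only shows why a remove followed immediately by an add behaves differently in the two schemes.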
[0] https://github.com/mcgrof/break-blktrace Cc: Bart Van Assche Cc: Omar Sandoval Cc: Hannes Reinecke Cc: Nicolai Stange Cc: Greg Kroah-Hartman Cc: Michal Hocko Suggested-by: Nicolai Stange Signed-off-by: Luis Chamberlain --- block/blk-sysfs.c | 18 +++++------------- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 20f20b0fa0b9..f159b40899ee 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -862,8 +862,8 @@ static void blk_exit_queue(struct request_queue *q) /** - * __blk_release_queue - release a request queue - * @work: pointer to the release_work member of the request queue to be released + * blk_release_queue - release a request queue + * @kobj: pointer to the kobj representing the request queue * * Description: * This function is called when a block device is being unregistered. The @@ -873,9 +873,10 @@ static void blk_exit_queue(struct request_queue *q) * of the request queue reaches zero, blk_release_queue is called to release * all allocated resources of the request queue. */ -static void __blk_release_queue(struct work_struct *work) +static void blk_release_queue(struct kobject *kobj) { - struct request_queue *q = container_of(work, typeof(*q), release_work); + struct request_queue *q = + container_of(kobj, struct request_queue, kobj); if (test_bit(QUEUE_FLAG_POLL_STATS, &q->queue_flags)) blk_stat_remove_callback(q, q->poll_cb); @@ -905,15 +906,6 @@ static void __blk_release_queue(struct work_struct *work) call_rcu(&q->rcu_head, blk_free_queue_rcu); } -static void blk_release_queue(struct kobject *kobj) -{ - struct request_queue *q = - container_of(kobj, struct request_queue, kobj); - - INIT_WORK(&q->release_work, __blk_release_queue); - schedule_work(&q->release_work); -} - static const struct sysfs_ops queue_sysfs_ops = { .show = queue_attr_show, .store = queue_attr_store,