From patchwork Mon Jan 15 23:13:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 10165425 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 7622D602C2 for ; Mon, 15 Jan 2018 23:13:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 66A1227BA5 for ; Mon, 15 Jan 2018 23:13:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5990127C05; Mon, 15 Jan 2018 23:13:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C12EE27BA5 for ; Mon, 15 Jan 2018 23:13:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750776AbeAOXNg (ORCPT ); Mon, 15 Jan 2018 18:13:36 -0500 Received: from mx1.redhat.com ([209.132.183.28]:50566 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750733AbeAOXNg (ORCPT ); Mon, 15 Jan 2018 18:13:36 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 618B3C04B925; Mon, 15 Jan 2018 23:13:36 +0000 (UTC) Received: from localhost (ovpn-112-18.rdu2.redhat.com [10.10.112.18]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 6C27F600D2; Mon, 15 Jan 2018 23:13:33 +0000 (UTC) Date: Mon, 15 Jan 2018 18:13:32 -0500 From: Mike Snitzer To: Bart Van Assche Cc: "dm-devel@redhat.com" , "linux-block@vger.kernel.org" , "hare@suse.de" , "tom.leiming@gmail.com" , "djeffery@redhat.com" , "axboe@kernel.dk" Subject: Re: [for-4.16 PATCH v5 0/4] block/dm: allow DM to defer blk_register_queue() until ready Message-ID: <20180115231331.GA18564@redhat.com> References: <20180112150606.6037-1-snitzer@redhat.com> <1516036612.10386.2.camel@wdc.com> <20180115221511.GA25858@redhat.com> <1516056697.3951.30.camel@wdc.com> <20180115231010.GA26447@redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20180115231010.GA26447@redhat.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Mon, 15 Jan 2018 23:13:36 +0000 (UTC) Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Mon, Jan 15 2018 at 6:10P -0500, Mike Snitzer wrote: > On Mon, Jan 15 2018 at 5:51pm -0500, > Bart Van Assche wrote: > > > On Mon, 2018-01-15 at 17:15 -0500, Mike Snitzer wrote: > > > sysfs write op calls kernfs_fop_write which takes: > > > of->mutex then kn->count#213 (no idea what that is) > > > then q->sysfs_lock (via queue_attr_store) > > > > > > vs > > > > > > blk_unregister_queue takes: > > > q->sysfs_lock then > > > kernfs_mutex (via kernfs_remove) > > > seems lockdep thinks "kernfs_mutex" is "kn->count#213"? > > > > > > Feels like lockdep code in fs/kernfs/file.c and fs/kernfs/dir.c is > > > triggering false positives.. because these seem like different kernfs > > > locks yet they are reported as "kn->count#213". > > > > > > Certainly feeling out of my depth with kernfs's locking though. > > > > Hello Mike, > > > > I don't think that this is a false positive but rather the following traditional > > pattern of a potential deadlock involving sysfs attributes: > > * One context obtains a mutex from inside a sysfs attribute method: > > queue_attr_store() obtains q->sysfs_lock. > > * Another context removes a sysfs attribute while holding a mutex: > > blk_unregister_queue() removes the queue sysfs attributes while holding > > q->sysfs_lock. > > > > This can result in a real deadlock because the code that removes sysfs objects > > waits until all ongoing attribute callbacks have finished. > > > > Since commit 667257e8b298 ("block: properly protect the 'queue' kobj in > > blk_unregister_queue") modified blk_unregister_queue() such that q->sysfs_lock > > is held around the kobject_del(&q->kobj) call I think this is a regression > > introduced by that commit. > > Sure, of course it is a regression. > > Aside from moving the mutex_unlock(&q->sysfs_lock) above the > kobject_del(&q->kobj) I don't know how to fix it. > > Though, realistically that'd be an adequate fix (given the way the code > was before). Any chance you apply this and re-run your srp_test that triggered the lockdep splat? diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 4a6a40ffd78e..c50e08e9bf17 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -952,10 +952,10 @@ void blk_unregister_queue(struct gendisk *disk) if (q->request_fn || (q->mq_ops && q->elevator)) elv_unregister_queue(q); + mutex_unlock(&q->sysfs_lock); + kobject_uevent(&q->kobj, KOBJ_REMOVE); kobject_del(&q->kobj); blk_trace_remove_sysfs(disk_to_dev(disk)); kobject_put(&disk_to_dev(disk)->kobj); - - mutex_unlock(&q->sysfs_lock); }