From patchwork Wed Feb  8 17:17:13 2017
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 9562849
Date: Wed, 8 Feb 2017 09:17:13 -0800
From: Omar Sandoval
To: Paolo Valente
Cc: Jens Axboe, Tejun Heo, linux-block@vger.kernel.org,
    Linux-Kernal, Ulf Hansson, Linus Walleij, broonie@kernel.org
Subject: Re: [PATCH] bfq-mq: cause deadlock by executing exit_icq body
    immediately
Message-ID:
 <20170208171713.GA7811@vader>
References: <20170207173346.4789-1-paolo.valente@linaro.org>
 <20170207214516.GA14269@vader.DHCP.thefacebook.com>
 <20170208103342.GA4647@vader>
 <49CC2920-EF8D-4B9B-B6BE-1722156AAB70@linaro.org>
In-Reply-To: <49CC2920-EF8D-4B9B-B6BE-1722156AAB70@linaro.org>
X-Mailing-List: linux-block@vger.kernel.org

On Wed, Feb 08, 2017 at 11:39:24AM +0100, Paolo Valente wrote:
> 
> > On 8 Feb 2017, at 11:33, Omar Sandoval wrote:
> > 
> > On Wed, Feb 08, 2017 at 11:03:01AM +0100, Paolo Valente wrote:
> >> 
> >>> On 7 Feb 2017, at 22:45, Omar Sandoval wrote:
> >>> 
> >>> On Tue, Feb 07, 2017 at 06:33:46PM +0100, Paolo Valente wrote:
> >>>> Hi,
> >>>> this patch is meant to show that, if the body of the hook exit_icq
> >>>> is executed from inside that hook, and not as deferred work, then a
> >>>> circular deadlock occurs.
> >>>> 
> >>>> It happens if, on one CPU,
> >>>> - the body of icq_exit takes the scheduler lock,
> >>>> - it does so from inside the exit_icq hook, which is invoked with
> >>>>   the queue lock held,
> >>>> 
> >>>> while, on another CPU,
> >>>> - bfq_bio_merge, after taking the scheduler lock, invokes
> >>>>   bfq_bic_lookup, which, in its turn, takes the queue lock.
> >>>>   bfq_bic_lookup needs to take that lock, because it invokes
> >>>>   ioc_lookup_icq.
> >>>> 
> >>>> For more details, here is a lockdep report, taken right before the
> >>>> deadlock occurred.
> >>>> 
> >>>> [   44.059877] ======================================================
> >>>> [   44.124922] [ INFO: possible circular locking dependency detected ]
> >>>> [   44.125795] 4.10.0-rc5-bfq-mq+ #38 Not tainted
> >>>> [   44.126414] -------------------------------------------------------
> >>>> [   44.127291] sync/2043 is trying to acquire lock:
> >>>> [   44.128918]  (&(&bfqd->lock)->rlock){-.-...}, at: [] bfq_exit_icq_bfqq+0x55/0x140
> >>>> [   44.134052] 
> >>>> [   44.134052] but task is already holding lock:
> >>>> [   44.134868]  (&(&q->__queue_lock)->rlock){-.....}, at: [] put_io_context_active+0x6e/0xc0
> >>> 
> >>> Hey, Paolo,
> >>> 
> >>> I only briefly skimmed the code, but what are you using the queue_lock
> >>> for? You should just use your scheduler lock everywhere. blk-mq doesn't
> >>> use the queue lock, so the scheduler is the only thing you need mutual
> >>> exclusion against.
> >> 
> >> Hi Omar,
> >> the cause of the problem is that the hook functions bfq_request_merge
> >> and bfq_allow_bio_merge invoke, directly or through other functions,
> >> the function bfq_bic_lookup, which, in its turn, invokes
> >> ioc_lookup_icq. The latter must be invoked with the queue lock held.
> >> In particular, the offending lines in bfq_bic_lookup are:
> >> 
> >>         spin_lock_irqsave(q->queue_lock, flags);
> >>         icq = icq_to_bic(ioc_lookup_icq(ioc, q));
> >>         spin_unlock_irqrestore(q->queue_lock, flags);
> >> 
> >> Maybe I'm missing something and we can avoid taking this lock?
> > 
> > Ah, I didn't realize we still used the q->queue_lock for the icq stuff.
> > You're right, you still need that lock for ioc_lookup_icq(). Unless
> > there's something else I'm forgetting, that should be the only thing you
> > need it for in the core code, and you should use your scheduler lock for
> > everything else. What else are you using q->queue_lock for?
> 
> Nothing.
> The deadlock follows from the fact that bfq_request_merge gets called
> with the scheduler lock already held. Problematic paths start from
> bfq_bio_merge and bfq_insert_request.
> 
> I'm trying to understand whether I/we can reorder operations in some
> way that avoids the nested locking, but to no avail so far.
> 
> Thanks,
> Paolo

Okay, I understand what you're saying now. It was all in the first
email, but I didn't see it right away, sorry about that.

I don't think it makes sense for ->exit_icq() to be invoked while
holding q->queue_lock for blk-mq -- we don't hold that lock for any of
the other hooks. Could you try the patch below? I haven't convinced
myself that there isn't a circular locking dependency between
bfqd->lock and ioc->lock now, but it's probably easiest for you to just
try it.

diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index fe186a9eade9..b12f9c87b4c3 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -35,7 +35,10 @@ static void icq_free_icq_rcu(struct rcu_head *head)
 	kmem_cache_free(icq->__rcu_icq_cache, icq);
 }
 
-/* Exit an icq. Called with both ioc and q locked. */
+/*
+ * Exit an icq. Called with both ioc and q locked for sq, only ioc locked for
+ * mq.
+ */
 static void ioc_exit_icq(struct io_cq *icq)
 {
 	struct elevator_type *et = icq->q->elevator->type;
@@ -166,6 +169,7 @@ EXPORT_SYMBOL(put_io_context);
  */
 void put_io_context_active(struct io_context *ioc)
 {
+	struct elevator_type *et;
 	unsigned long flags;
 	struct io_cq *icq;
 
@@ -184,13 +188,19 @@ void put_io_context_active(struct io_context *ioc)
 	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
 		if (icq->flags & ICQ_EXITED)
 			continue;
-		if (spin_trylock(icq->q->queue_lock)) {
+
+		et = icq->q->elevator->type;
+		if (et->uses_mq) {
 			ioc_exit_icq(icq);
-			spin_unlock(icq->q->queue_lock);
 		} else {
-			spin_unlock_irqrestore(&ioc->lock, flags);
-			cpu_relax();
-			goto retry;
+			if (spin_trylock(icq->q->queue_lock)) {
+				ioc_exit_icq(icq);
+				spin_unlock(icq->q->queue_lock);
+			} else {
+				spin_unlock_irqrestore(&ioc->lock, flags);
+				cpu_relax();
+				goto retry;
+			}
 		}
 	}
 	spin_unlock_irqrestore(&ioc->lock, flags);
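
[Editor's note: to make the inversion in the lockdep report concrete, here is a
minimal userspace sketch of the AB-BA pattern described above, with pthread
mutexes standing in for the kernel spinlocks. The function names only mirror
the kernel call paths; this is an illustration of the lock ordering, not
kernel code.]

	/*
	 * AB-BA deadlock sketch: "queue_lock" stands in for q->queue_lock,
	 * "sched_lock" for bfqd->lock. Build with: cc demo.c -lpthread
	 */
	#include <pthread.h>
	#include <stdio.h>

	static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
	static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;

	/* CPU 0: exit path -- queue lock held, then the scheduler lock. */
	static void *exit_icq_path(void *arg)
	{
		pthread_mutex_lock(&queue_lock);  /* put_io_context_active() */
		pthread_mutex_lock(&sched_lock);  /* bfq_exit_icq_bfqq() */
		puts("exit path done");
		pthread_mutex_unlock(&sched_lock);
		pthread_mutex_unlock(&queue_lock);
		return NULL;
	}

	/* CPU 1: merge path -- scheduler lock held, then the queue lock. */
	static void *bio_merge_path(void *arg)
	{
		pthread_mutex_lock(&sched_lock);  /* bfq_bio_merge() */
		pthread_mutex_lock(&queue_lock);  /* bfq_bic_lookup() -> ioc_lookup_icq() */
		puts("merge path done");
		pthread_mutex_unlock(&queue_lock);
		pthread_mutex_unlock(&sched_lock);
		return NULL;
	}

	int main(void)
	{
		pthread_t t0, t1;

		pthread_create(&t0, NULL, exit_icq_path, NULL);
		pthread_create(&t1, NULL, bio_merge_path, NULL);
		pthread_join(t0, NULL);  /* with unlucky timing, never returns */
		pthread_join(t1, NULL);
		return 0;
	}

[With unlucky timing each thread ends up holding one lock while waiting on the
other, which is exactly the cycle lockdep flags. The patch above breaks the
cycle by never taking q->queue_lock on the blk-mq exit path; whether a new
bfqd->lock/ioc->lock dependency appears instead is the open question raised
in the reply.]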