From patchwork Fri Jan 9 21:11:24 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 5602911
X-Patchwork-Delegate: snitzer@redhat.com
Return-Path:
X-Original-To: patchwork-dm-devel@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id E64B7C058D
	for ; Fri, 9 Jan 2015 21:15:31 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id E65B7205CD
	for ; Fri, 9 Jan 2015 21:15:30 +0000 (UTC)
Received: from mx6-phx2.redhat.com (mx6-phx2.redhat.com [209.132.183.39])
	(using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 87B96205B8
	for ; Fri, 9 Jan 2015 21:15:29 +0000 (UTC)
Received: from lists01.pubmisc.prod.ext.phx2.redhat.com
	(lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33])
	by mx6-phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t09LBY72009862;
	Fri, 9 Jan 2015 16:11:35 -0500
Received: from int-mx09.intmail.prod.int.phx2.redhat.com
	(int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22])
	by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP
	id t09LBXDm015662 for ; Fri, 9 Jan 2015 16:11:33 -0500
Received: from mx1.redhat.com (ext-mx12.extmail.prod.ext.phx2.redhat.com
	[10.5.110.17])
	by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP
	id t09LBXK6019105 for ; Fri, 9 Jan 2015 16:11:33 -0500
Received: from mail-ie0-f178.google.com (mail-ie0-f178.google.com
	[209.85.223.178])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t09LBVGw010252
	(version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=FAIL)
	for ; Fri, 9 Jan 2015 16:11:32 -0500
Received: by mail-ie0-f178.google.com with SMTP id vy18so17173743iec.9
	for ; Fri, 09 Jan 2015 13:11:31 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20130820;
	h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to
	:cc:subject:references:in-reply-to:content-type;
	bh=SmXB/z7t/zxv1dtKmY5O3kpIss9Nrx0tn68U+e8Gujs=;
	b=jGDxTaS47ncRPCnOh2dX7a9JiHWGOzyFpoS8/l7fZsA6Mn4NI+Mp0peC4zhu3CdVmH
	3lgfZHZkk98LWQpjZ0G8A1VjIFUj0bPYsIEds2XWmtfBJKTZYzc9Lzh5IS9djvH0BPiq
	0T4Z2bIJPW4Z9nVSzGbWl3ZjjY2a0iasmsTqzBNWJbKqvn+hY//+oObUm999o4spEMj3
	yahtENcBJa49MlPKBZPRafCBVERcvUyxvQ/Z2bvgb/9ngw2R+/1ti3E5hSaLjRZExcuj
	+dPo8oL0XnduUQb6yx27fODxGVN0kBuOajuI+t4IwuCwaJLQNqbTI6pSMm+Jwoe+8YGd
	jobg==
X-Gm-Message-State: ALoCoQlX4eVzEW2SJToXyWIwx4hECZIPAcSnzeCkcgz6YDNSo5ODdqTlzHUSMl48z4ErmaIRB4q3
X-Received: by 10.42.88.212 with SMTP id d20mr11205804icm.32.1420837891665;
	Fri, 09 Jan 2015 13:11:31 -0800 (PST)
Received: from [192.168.0.34] ([67.40.118.70])
	by mx.google.com with ESMTPSA id la3sm16251igb.0.2015.01.09.13.11.25
	(version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Fri, 09 Jan 2015 13:11:25 -0800 (PST)
Message-ID: <54B043FC.8000902@kernel.dk>
Date: Fri, 09 Jan 2015 14:11:24 -0700
From: Jens Axboe
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
	rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: Mike Snitzer , Keith Busch , Bart Van Assche
References: <20141224182143.GA12922@redhat.com>
	<20141224185529.GA13246@redhat.com>
	<20141224192643.GA30461@redhat.com> <54A6DB1D.4030201@acm.org>
	<20150105213557.GA5030@redhat.com> <54ABAB80.70006@acm.org>
	<20150106160553.GB10224@redhat.com> <54AC0A39.90801@kernel.dk>
	<54AD0B63.3010505@acm.org> <20150109194955.GA32641@redhat.com>
	<54B042FE.2000205@kernel.dk>
In-Reply-To: <54B042FE.2000205@kernel.dk>
X-RedHat-Spam-Score: -1.934 (BAYES_00, RCVD_IN_DNSWL_LOW, SPF_SOFTFAIL,
	URIBL_BLOCKED) 209.85.223.178 mail-ie0-f178.google.com 209.85.223.178
	mail-ie0-f178.google.com
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22
X-Scanned-By: MIMEDefang 2.68 on 10.5.110.17
X-loop: dm-devel@redhat.com
Cc: Christoph Hellwig
	, "Jun'ichi Nomura" , device-mapper development
Subject: Re: [dm-devel] blk-mq request allocation stalls [was: Re: [PATCH
	v3 0/8] dm: add request-based blk-mq support]
X-BeenThere: dm-devel@redhat.com
X-Mailman-Version: 2.1.12
Precedence: junk
Reply-To: device-mapper development
List-Id: device-mapper development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: dm-devel-bounces@redhat.com
Errors-To: dm-devel-bounces@redhat.com
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
	T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On 01/09/2015 02:07 PM, Jens Axboe wrote:
> On 01/09/2015 12:49 PM, Mike Snitzer wrote:
>> On Wed, Jan 07 2015 at 3:40pm -0500,
>> Keith Busch wrote:
>>
>>> On Wed, 7 Jan 2015, Bart Van Assche wrote:
>>>> On 01/06/15 17:15, Jens Axboe wrote:
>>>>> blk-mq request allocation is pretty much as optimized/fast as it
>>>>> can be. The slowdown must be due to one of two reasons:
>>>>>
>>>>> - A bug related to running out of requests, perhaps a missing
>>>>>   queue run or something like that.
>>>>> - A smaller number of available requests, due to the requested
>>>>>   queue depth.
>>>>>
>>>>> Looking at Barts results, it looks like it's usually fast, but
>>>>> sometimes very slow. That would seem to indicate it's option #1
>>>>> above that is the issue. Bart, since this seems to wait for quite
>>>>> a bit, would it be possible to cat the 'tags' file for that queue
>>>>> when it is stuck like that?
>>>>
>>>> Hello Jens,
>>>>
>>>> Thanks for the assistance. Is this the output you were looking for
>>>
>>> I'm a little confused by the later comments given the below data. It
>>> says multipath_clone_and_map() is stuck at bt_get, but that doesn't
>>> block unless there are no tags available. The tags should be coming
>>> from one of dm-1's path queues, and I'm assuming these queues are
>>> provided by sdc and sdd. All their tags are free, so that looks like
>>> a missing wake_up when the queue idles.
>>
>> Like I said in an earlier email, I cannot reproduce Bart's hangs running
>> mkfs.xfs against a multipath device that is built ontop of a virtio
>> device in a KVM guest.
>>
>> But I can hit __bt_get() failures on the virtio-blk device that I'm
>> using for the root device on this guest. Bart I'd be interested to see
>> what you get when running the attached debug patch (likely will just
>> echo the same type of info you've already provided).
>>
>> There does appear to be something weird going on with bt_get(). With
>> the debug patch I'm seeing the following when I simply run "make install"
>> of the kernel (it'll run dracut to build the initramfs, etc):
>>
>> You'll note that in all instances where __bt_get() returns -1 nr_free
>> isn't 0.
>
> Yeah, that doesn't look good. Can you try with this patch? The second
> hunk is the interesting bit, the first is more of a cleanup.

Actually, try this one instead, it should be a bit more precise than the
first.

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 60c9d4a93fe4..2e38cd118c1d 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -143,7 +143,6 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 static int __bt_get_word(struct blk_align_bitmap *bm, unsigned int last_tag)
 {
 	int tag, org_last_tag, end;
-	bool wrap = last_tag != 0;
 
 	org_last_tag = last_tag;
 	end = bm->depth;
@@ -155,15 +154,16 @@ restart:
 			 * We started with an offset, start from 0 to
 			 * exhaust the map.
 			 */
-			if (wrap) {
-				wrap = false;
+			if (org_last_tag) {
 				end = org_last_tag;
-				last_tag = 0;
+				last_tag = org_last_tag = 0;
 				goto restart;
 			}
 			return -1;
 		}
 		last_tag = tag + 1;
+		if (last_tag >= bm->depth - 1)
+			last_tag = 0;
 	} while (test_and_set_bit(tag, &bm->word));
 
 	return tag;
@@ -199,9 +199,13 @@ static int __bt_get(struct blk_mq_hw_ctx *hctx, struct blk_mq_bitmap_tags *bt,
 			goto done;
 		}
 
-		last_tag = 0;
-		if (++index >= bt->map_nr)
+		index++;
+		last_tag = (index << bt->bits_per_word);
+
+		if (index >= bt->map_nr) {
 			index = 0;
+			last_tag = 0;
+		}
 	}
 
 	*tag_cache = 0;
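
For anyone who wants to poke at the search order outside the kernel, below
is a rough userspace sketch of what the __bt_get_word() hunk is aiming for:
scan from the cached offset to the end of the word, then wrap exactly once
and scan the part before the offset. It is only a model under a few
assumptions. struct toy_bitmap, toy_next_zero() and toy_get_word() are
made-up names, the helpers are plain non-atomic stand-ins for
find_next_zero_bit() and test_and_set_bit(), and because it is
single-threaded there is no retry when the bit is lost to a racing
allocator (so the last_tag clamp from the patch has nothing to do here).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace stand-in for struct blk_align_bitmap. */
struct toy_bitmap {
	unsigned long word;	/* one bit per tag, 1 = in use */
	unsigned int depth;	/* number of valid bits in word */
};

#define TOY_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/*
 * Non-atomic stand-in for find_next_zero_bit(): first clear bit in
 * [start, end), or end if there is none.
 */
static unsigned int toy_next_zero(unsigned long word, unsigned int end,
				  unsigned int start)
{
	unsigned int bit;

	for (bit = start; bit < end; bit++)
		if (!(word & (1UL << bit)))
			return bit;
	return end;
}

/*
 * Model of the search order: scan [last_tag, depth) first, then wrap
 * once and scan [0, last_tag). Returns the claimed bit, or -1 if every
 * bit below depth is already set.
 */
static int toy_get_word(struct toy_bitmap *bm, unsigned int last_tag)
{
	unsigned int org_last_tag = last_tag;
	unsigned int end = bm->depth;
	unsigned int tag;

	assert(bm->depth <= TOY_BITS_PER_WORD && last_tag <= bm->depth);

	for (;;) {
		tag = toy_next_zero(bm->word, end, last_tag);
		if (tag < end)
			break;
		if (!org_last_tag)
			return -1;	/* whole word already covered */
		/* Started at an offset: cover [0, offset) exactly once. */
		end = org_last_tag;
		last_tag = org_last_tag = 0;
	}

	bm->word |= 1UL << tag;	/* the kernel uses test_and_set_bit() here */
	return (int)tag;
}
```

With depth = 4 and only bit 3 set, toy_get_word(&bm, 3) finds nothing in
[3, 4), wraps, and hands back bit 0 rather than -1. It only returns -1 once
all four bits are taken.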