From patchwork Mon Sep 11 21:27:57 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 9948173 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 02B796035D for ; Mon, 11 Sep 2017 21:39:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F274328D78 for ; Mon, 11 Sep 2017 21:39:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E593628D7E; Mon, 11 Sep 2017 21:39:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4A59C28D78 for ; Mon, 11 Sep 2017 21:39:20 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B5D0F36DAD4; Mon, 11 Sep 2017 21:39:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com B5D0F36DAD4 Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=dm-devel-bounces@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com B5D0F36DAD4 Authentication-Results: mx1.redhat.com; dkim=fail reason="signature verification failed" (2048-bit key) header.d=kernel-dk.20150623.gappssmtp.com header.i=@kernel-dk.20150623.gappssmtp.com header.b="mWR/fpcD" Received: from colo-mx.corp.redhat.com (colo-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 93E0D6F99F; Mon, 11 Sep 2017 21:39:15 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 292A33FAD0; Mon, 11 Sep 2017 21:39:11 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id v8BLSCAp019824 for ; Mon, 11 Sep 2017 17:28:13 -0400 Received: by smtp.corp.redhat.com (Postfix) id F218A60317; Mon, 11 Sep 2017 21:28:12 +0000 (UTC) Delivered-To: dm-devel@redhat.com Received: from mx1.redhat.com (ext-mx07.extmail.prod.ext.phx2.redhat.com [10.5.110.31]) by smtp.corp.redhat.com (Postfix) with ESMTPS id EC9DE600C4 for ; Mon, 11 Sep 2017 21:28:11 +0000 (UTC) Received: from mail-pg0-f46.google.com (mail-pg0-f46.google.com [74.125.83.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4C3E5C0D8D77 for ; Mon, 11 Sep 2017 21:28:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 4C3E5C0D8D77 Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=axboe@kernel.dk DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 4C3E5C0D8D77 Received: by mail-pg0-f46.google.com with SMTP id j16so7288017pga.1 for ; Mon, 11 Sep 2017 14:28:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=zHhUi4kD0WT5g06Ble+UJpV5p/Je3c4msvrsw4CevL0=; b=mWR/fpcDQO51G9bNNEsvl0KPRM+MuCqWqer3os4oTDFeQI+8EC28x2J4Pbl7UilFhw URs1JFy8TcrzJ/yaDma0srvEti3owIuiEuMsoplEiFx+yqNbphjVyLx0QyZ5d6tNF6VA erGzOMImcmuzgkaJofH50H4uBuGt3vj9b44sLj3ZmS7j2gBno8qiu+DeuaJfP/rYKFie U88ee/gcAY/IrVmOUpr3CVcHJjBrLZBTt/wTI1NaTNug7MM1Wexg0Ri/9pCVzcAvaekS AW1u9n9SQ1MEsk7d6OEhX4EutRL0tFoweuD/MaIWMPaXhU5nUREROzzXiof+Lgz1zaxR +IZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=zHhUi4kD0WT5g06Ble+UJpV5p/Je3c4msvrsw4CevL0=; b=sdXWqxIyDnND4149yKFLIvsjInG+ZhLnornRMfqWx2/zRUgR0cVzyH3gkhJ0UvvnYJ lgd5ACZTb8FrlbCqTz8+yADLzE0MoU5kD90Nk1PXlmkTAwhB0GA+IItb9+LQkiFzGEPQ iCUa+II7iq/tw0azKM7V8tnipIowOrJ195qc9RmB1gquHz2zReO5Tr63xXhwoaP2Uyqc KvB2FAFcMJsArdm/C6JYaRbpcQQGT2uO0WHOUUzs7fFlCsZDakFHJLEfLLFf9XRqIXB+ ArV/jmtr2P1mIMDyA0QzfCQGiS8/JC5ezuJAlcJlO/kjhLd/blbNBJwFjVZ0DkD79wbZ H34Q== X-Gm-Message-State: AHPjjUi5LfQGxDVxzw+Y4rDaEPMGIa80iDPwrV+qQSUxKc25U22xJPsF cUdd41Rqej2buq27f0Fvtg== X-Google-Smtp-Source: ADKCNb4qrhRbY/SRQxz9TW5+pbw1bxrXzsXl8wlpmYEHvKBrneN1L9CQT9CqQ9yJeGQYK1cTwWAlEQ== X-Received: by 10.84.210.73 with SMTP id z67mr15142121plh.306.1505165281407; Mon, 11 Sep 2017 14:28:01 -0700 (PDT) Received: from [192.168.1.176] (66.29.164.166.static.utbb.net. [66.29.164.166]) by smtp.gmail.com with ESMTPSA id 80sm16472149pfq.38.2017.09.11.14.28.00 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 11 Sep 2017 14:28:00 -0700 (PDT) To: Mike Snitzer References: <20170908164129.GA48286@redhat.com> <113c15ac-75f5-ff5b-695d-920dc6d5f708@kernel.dk> <20170908170727.GA29318@redhat.com> <20170908195805.GA30517@redhat.com> <57b4bb39-f50c-ad5e-d47d-01f0f84ec3ca@kernel.dk> <20170908214236.GA49918@redhat.com> <49cc6184-48da-2a45-e50e-13430e14e197@kernel.dk> <20170908220318.GA31120@redhat.com> <20170911161647.GA55061@redhat.com> <79287d3a-62e0-912e-dcad-e7ab5297ae34@kernel.dk> <20170911211327.GA17344@redhat.com> From: Jens Axboe Message-ID: Date: Mon, 11 Sep 2017 15:27:57 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170911211327.GA17344@redhat.com> Content-Language: en-US X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Mon, 11 Sep 2017 21:28:03 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Mon, 11 Sep 2017 21:28:03 +0000 (UTC) for IP:'74.125.83.46' DOMAIN:'mail-pg0-f46.google.com' HELO:'mail-pg0-f46.google.com' FROM:'axboe@kernel.dk' RCPT:'' X-RedHat-Spam-Score: -0.011 (DKIM_SIGNED, DKIM_VALID, RCVD_IN_DNSWL_NONE, SPF_PASS) 74.125.83.46 mail-pg0-f46.google.com 74.125.83.46 mail-pg0-f46.google.com X-Scanned-By: MIMEDefang 2.78 on 10.5.110.31 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-loop: dm-devel@redhat.com Cc: Bart Van Assche , "linux-block@vger.kernel.org" , dm-devel@redhat.com, Paolo Valente Subject: Re: [dm-devel] [PATCH v2] block: directly insert blk-mq request from blk_insert_cloned_request() X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Mon, 11 Sep 2017 21:39:20 +0000 (UTC) X-Virus-Scanned: ClamAV using ClamSMTP On 09/11/2017 03:13 PM, Mike Snitzer wrote: > On Mon, Sep 11 2017 at 4:51pm -0400, > Jens Axboe wrote: > >> On 09/11/2017 10:16 AM, Mike Snitzer wrote: >>> Here is v2 that should obviate the need to rename blk_mq_insert_request >>> (by using bools to control run_queue and async). >>> >>> As for inserting directly into dispatch, if that can be done that is >>> great but I'd prefer to have that be a follow-up optimization. This >>> fixes the regression in question, and does so in well-known terms. >>> >>> What do you think? >> >> I think it looks reasonable. My only concern is the use of the software >> queues. Depending on the scheduler, they may or may not be used. I'd >> need to review the code, but my first thought is that this would break >> if you use blk_mq_insert_request() on a device that is managed by >> mq-deadline or bfq, for instance. Schedulers are free to use the >> software queues, but they are also free to ignore them and use internal >> queuing. >> >> Looking at the code, looks like this was changed slightly at some point, >> we always flush the software queues, if any of them contain requests. So >> it's probably fine. > > OK good, but is that too brittle to rely on? Something that might change > in the future? I'm actually surprised we do flush software queues for that case, since we don't always have to. So it is a bit of a wart. If we don't have a scheduler, software queues is where IO goes. If we have a scheduler, the scheduler has complete control of where to queue IO. Generally, the scheduler will either utilize the software queues or it won't, there's nothing in between. I know realize I'm an idiot and didn't read it right. So here's the code in question: const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request; [...] } else if (!has_sched_dispatch) { blk_mq_flush_busy_ctxs(hctx, &rq_list); blk_mq_dispatch_rq_list(q, &rq_list); } so we do only enter sw queue flushing, if we don't have a scheduler with a dispatch_request hook. So now I am really wondering how your patch could work if the bottom device has bfq or mq-deadline attached? >> My earlier suggestion to use just hctx->dispatch for the IO and bypass >> the software queues completely. The use case for the dispatch list is >> the same, regardless of whether the device has a scheduler attached or >> not. > > I'm missing how these details relate to the goal of bypassing any > scheduler that might be attached. Are you saying the attached elevator > would still get in the way? See above. > Looking at blk_mq_sched_insert_request(), submission when an elevator > isn't attached is exactly what I made blk_mq_insert_request() do > (which is exactly what it did in the past). Right, but that path is only used if we don't have a scheduler attached. So while the code will use that path IFF a scheduler isn't attached to that device, your use case will use it for both cases. > In the case of DM multipath, nothing else should be submitting IO to > the device so elevator shouldn't be used -- only interface for > submitting IO would be blk_mq_insert_request(). So even if a > scheduler is attached it should be bypassed right? The problem is the usage of the sw queue. Does the below work for you? Tested-by: Mike Snitzer diff --git a/block/blk-core.c b/block/blk-core.c index d709c0e3a2ac..aebe676225e6 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -2342,7 +2342,12 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * if (q->mq_ops) { if (blk_queue_io_stat(q)) blk_account_io_start(rq, true); - blk_mq_sched_insert_request(rq, false, true, false, false); + /* + * Since we have a scheduler attached on the top device, + * bypass a potential scheduler on the bottom device for + * insert. + */ + blk_mq_request_bypass_insert(rq); return BLK_STS_OK; } diff --git a/block/blk-mq.c b/block/blk-mq.c index 3f18cff80050..98a18609755e 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1401,6 +1401,22 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, blk_mq_hctx_mark_pending(hctx, ctx); } +/* + * Should only be used carefully, when the caller knows we want to + * bypass a potential IO scheduler on the target device. + */ +void blk_mq_request_bypass_insert(struct request *rq) +{ + struct blk_mq_ctx *ctx = rq->mq_ctx; + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu); + + spin_lock(&hctx->lock); + list_add_tail(&rq->queuelist, &hctx->dispatch); + spin_unlock(&hctx->lock); + + blk_mq_run_hw_queue(hctx, false); +} + void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list) diff --git a/block/blk-mq.h b/block/blk-mq.h index 98252b79b80b..ef15b3414da5 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -54,6 +54,7 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, */ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, bool at_head); +void blk_mq_request_bypass_insert(struct request *rq); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list);