From patchwork Tue Jan 26 10:50:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 12046259 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 231A7C433E0 for ; Tue, 26 Jan 2021 12:25:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EE01723122 for ; Tue, 26 Jan 2021 12:25:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404484AbhAZKwU (ORCPT ); Tue, 26 Jan 2021 05:52:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404297AbhAZKwD (ORCPT ); Tue, 26 Jan 2021 05:52:03 -0500 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 073E0C061786 for ; Tue, 26 Jan 2021 02:51:20 -0800 (PST) Received: by mail-wr1-x42d.google.com with SMTP id a1so15937329wrq.6 for ; Tue, 26 Jan 2021 02:51:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=kPe9/bC+fapDM9TyJPKrxL534jOiWcGHIwwJRVky7EU=; b=OqVjkA2ciaTvaPcheuyRt1SetUn8Cn9eiDfOks1lNQ0KhavOCQ5vHCAbaHip0qlUV9 C0MnMHOVDo5egF9h14gwy6ggaES2ZqNF/fjajaERGJDfFZFJe0MBxCYnsF4DAPN/tJJO fECQBlcPUz/1nq05YendvbJoMIDJLveSjIsh9vlXZSzUhpNXEeEVP38GiPIze65MEXER iF4dypnmf8rRmuGKYVmhtEA8y6oDfcm9IaVyldRYXqIVgAULa9jkNkuxfhvkEmPdHwEj c1vbc81yOwT1V7Ffg0qAHofokiUgxgQhZ68UmaSEFkj7i27dZq7IQiWsx1FBg7h9ydez G5Nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=kPe9/bC+fapDM9TyJPKrxL534jOiWcGHIwwJRVky7EU=; b=kc7R7RaiAYFHb1XEbNqVsryZX6qHftxIhe+UwBKxz1ojAQPZZhElzj8pWZlKJrDtLH l24IQXdxgrVrPz1CCrlYEzaxw+VE7PNEPuBfkRxI15e9N0tNkdPxitHbBNYKUpljmr/O x5SKA1WuGOTpLs91VmAFCIJ9hrG/rpl83S3kDre7yzUJhJR0K1oLwY7wwTta0p685fds KYcUqq1a0pALQKdV6PTf2H6xGIqIuWqdmy0gTihbaj52OyG28UIsD0YRk1+NfsnTBAUO 2q0gqjvHV7aVCIkkVPcHR9mhUZPKYD8FzpWLvesPg7JFf1cG9VmWmvuz+rYED+3cE7if uLaw== X-Gm-Message-State: AOAM531HrPFHnnbuK96nJdbc+0IEj7Yn4Q+7W/of7OoMjNPKmAzLU7bG HUNTcoa5JpyG+sBYUWdnmgwUYw== X-Google-Smtp-Source: ABdhPJwTmI9AK+E7QAqlT2/FajyUAKTEwOhi4zQFqWexzunKCays3a2knrkCYOYxMAweTUNfyRPD/Q== X-Received: by 2002:adf:fe46:: with SMTP id m6mr5420092wrs.48.1611658278637; Tue, 26 Jan 2021 02:51:18 -0800 (PST) Received: from localhost.localdomain ([83.216.184.132]) by smtp.gmail.com with ESMTPSA id h18sm7177879wru.65.2021.01.26.02.51.17 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 26 Jan 2021 02:51:18 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Valente , Jan Kara Subject: [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers Date: Tue, 26 Jan 2021 11:50:57 +0100 Message-Id: <20210126105102.53102-2-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210126105102.53102-1-paolo.valente@linaro.org> References: <20210126105102.53102-1-paolo.valente@linaro.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Suppose that I/O dispatch is plugged, to wait for new I/O for the in-service bfq-queue, say bfqq. Suppose then that there is a further bfq_queue woken by bfqq, and that this woken queue has pending I/O. A woken queue does not steal bandwidth from bfqq, because it remains soon without I/O if bfqq is not served. So there is virtually no risk of loss of bandwidth for bfqq if this woken queue has I/O dispatched while bfqq is waiting for new I/O. In contrast, this extra I/O injection boosts throughput. This commit performs this extra injection. Tested-by: Jan Kara Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 32 +++++++++++++++++++++++++++----- block/bfq-wf2q.c | 8 ++++++++ 2 files changed, 35 insertions(+), 5 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 445cef9c0bb9..a83149407336 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -4487,9 +4487,15 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd) bfq_bfqq_busy(bfqq->bic->bfqq[0]) && bfqq->bic->bfqq[0]->next_rq ? bfqq->bic->bfqq[0] : NULL; + struct bfq_queue *blocked_bfqq = + !hlist_empty(&bfqq->woken_list) ? + container_of(bfqq->woken_list.first, + struct bfq_queue, + woken_list_node) + : NULL; /* - * The next three mutually-exclusive ifs decide + * The next four mutually-exclusive ifs decide * whether to try injection, and choose the queue to * pick an I/O request from. * @@ -4522,7 +4528,15 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd) * next bfqq's I/O is brought forward dramatically, * for it is not blocked for milliseconds. * - * The third if checks whether bfqq is a queue for + * The third if checks whether there is a queue woken + * by bfqq, and currently with pending I/O. Such a + * woken queue does not steal bandwidth from bfqq, + * because it remains soon without I/O if bfqq is not + * served. So there is virtually no risk of loss of + * bandwidth for bfqq if this woken queue has I/O + * dispatched while bfqq is waiting for new I/O. + * + * The fourth if checks whether bfqq is a queue for * which it is better to avoid injection. It is so if * bfqq delivers more throughput when served without * any further I/O from other queues in the middle, or @@ -4542,11 +4556,11 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd) * bfq_update_has_short_ttime(), it is rather likely * that, if I/O is being plugged for bfqq and the * waker queue has pending I/O requests that are - * blocking bfqq's I/O, then the third alternative + * blocking bfqq's I/O, then the fourth alternative * above lets the waker queue get served before the * I/O-plugging timeout fires. So one may deem the * second alternative superfluous. It is not, because - * the third alternative may be way less effective in + * the fourth alternative may be way less effective in * case of a synchronization. For two main * reasons. First, throughput may be low because the * inject limit may be too low to guarantee the same @@ -4555,7 +4569,7 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd) * guarantees (the second alternative unconditionally * injects a pending I/O request of the waker queue * for each bfq_dispatch_request()). Second, with the - * third alternative, the duration of the plugging, + * fourth alternative, the duration of the plugging, * i.e., the time before bfqq finally receives new I/O, * may not be minimized, because the waker queue may * happen to be served only after other queues. @@ -4573,6 +4587,14 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd) bfq_bfqq_budget_left(bfqq->waker_bfqq) ) bfqq = bfqq->waker_bfqq; + else if (blocked_bfqq && + bfq_bfqq_busy(blocked_bfqq) && + blocked_bfqq->next_rq && + bfq_serv_to_charge(blocked_bfqq->next_rq, + blocked_bfqq) <= + bfq_bfqq_budget_left(blocked_bfqq) + ) + bfqq = blocked_bfqq; else if (!idling_boosts_thr_without_issues(bfqd, bfqq) && (bfqq->wr_coeff == 1 || bfqd->wr_busy_queues > 1 || !bfq_bfqq_has_short_ttime(bfqq))) diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c index 26776bdbdf36..02e59931d897 100644 --- a/block/bfq-wf2q.c +++ b/block/bfq-wf2q.c @@ -1709,4 +1709,12 @@ void bfq_add_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq) if (bfqq->wr_coeff > 1) bfqd->wr_busy_queues++; + + /* Move bfqq to the head of the woken list of its waker */ + if (!hlist_unhashed(&bfqq->woken_list_node) && + &bfqq->woken_list_node != bfqq->waker_bfqq->woken_list.first) { + hlist_del_init(&bfqq->woken_list_node); + hlist_add_head(&bfqq->woken_list_node, + &bfqq->waker_bfqq->woken_list); + } } From patchwork Tue Jan 26 10:50:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 12046097 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7ECB2C433DB for ; Tue, 26 Jan 2021 10:52:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 27749230FD for ; Tue, 26 Jan 2021 10:52:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732217AbhAZKwe (ORCPT ); Tue, 26 Jan 2021 05:52:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404468AbhAZKwc (ORCPT ); Tue, 26 Jan 2021 05:52:32 -0500 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6489BC06178A for ; Tue, 26 Jan 2021 02:51:21 -0800 (PST) Received: by mail-wm1-x32d.google.com with SMTP id j18so2037106wmi.3 for ; Tue, 26 Jan 2021 02:51:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ldZICOD4+ui/kUfiZ0vMoBvq1PCCAphXE29xFBVIlys=; b=G7ICBg/HX4w9a1GqJB27JAqz8cc0X+kgoIkcDQuZenp2gsTZ6mTqoSdCRx2eU+IwKP f5wj1S20RZAybyxk4/MFclkHANhWhRWHb/fcC4gAASOO2hG/iGRZh2GMekfcgoHqVe2H ZTqQPDHc32sLwRfFBznLHU/4kV7jbB5EXWGbJM/81ppffPe6nEeV9ZCATOM9/BNrjTaO stBVUNGL+6fVd5urSrOyGPEIsWOU81GNcrsGKozsA63vc2nkttojVn8rcOJM1SitPMIP mu/yrZvXFY0L5bTAml5HRVLfAWeX1zPB0dapun8rAPq3RsVhHhEHXmvzzUH3MmLz/vm/ t2WA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ldZICOD4+ui/kUfiZ0vMoBvq1PCCAphXE29xFBVIlys=; b=mn8qzA294rdD4pgY9M6UoY2EfzVHqkP11oJQ8NkaEe5EFSkOO6ccAUxIi33u+Ds8VL buGg4PgmEgnQqs91jWvQrUkWGIhjRRRyMA9Wvcioe5grIceTgdjmqBSF2A8USuNo6TnU 1IvPileLMNXJsJ9lFGKOhFIstpE7jqEJtZsglX+DpQD60sb7Nuckz/w6o1weYxFwkz85 cesuRo+XP+C0YtFCMNHSb5gSXkLK3G2V+QHDkiAEBoLGtyyick0seoz5wfpHwY+Lzbje eMAebrZEjmei9Keh+TCPebVYmlq9/HiAnhiA+cl9GF4ZKcqjAxJXEyFZp8MF5KwtpPTv k7Bg== X-Gm-Message-State: AOAM532e6OxF5u4D7Gznsahud5FSgd7YfYlRZ5Sski/3RSvQER1id1/9 GhI0uoyMwoAkmc+X6jpYBbIZww== X-Google-Smtp-Source: ABdhPJwJW9GfocdFPmrjj+lUfufIOZMzP5Wi4MiM4O4hzEq21ri6ywTRe8vvpEuZThiA0f+A54Zg1w== X-Received: by 2002:a1c:5456:: with SMTP id p22mr4111219wmi.81.1611658280142; Tue, 26 Jan 2021 02:51:20 -0800 (PST) Received: from localhost.localdomain ([83.216.184.132]) by smtp.gmail.com with ESMTPSA id h18sm7177879wru.65.2021.01.26.02.51.18 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 26 Jan 2021 02:51:19 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Valente , Jan Kara Subject: [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list Date: Tue, 26 Jan 2021 11:50:58 +0100 Message-Id: <20210126105102.53102-3-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210126105102.53102-1-paolo.valente@linaro.org> References: <20210126105102.53102-1-paolo.valente@linaro.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Consider a new I/O request that arrives for a bfq_queue bfqq. If, when this happens, the only active bfq_queues are bfqq and either its waker bfq_queue or one of its woken bfq_queues, then there is no point in queueing this new I/O request in bfqq for service. In fact, the in-service queue and bfqq agree on serving this new I/O request as soon as possible. So this commit puts this new I/O request directly into the dispatch list. Tested-by: Jan Kara Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index a83149407336..e5b83910fbe0 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -5640,7 +5640,22 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, spin_lock_irq(&bfqd->lock); bfqq = bfq_init_rq(rq); - if (!bfqq || at_head || blk_rq_is_passthrough(rq)) { + + /* + * Additional case for putting rq directly into the dispatch + * queue: the only active bfq_queues are bfqq and either its + * waker bfq_queue or one of its woken bfq_queues. In this + * case, there is no point in queueing rq in bfqq for + * service. In fact, the in-service queue and bfqq agree on + * serving this new I/O request as soon as possible. + */ + if (!bfqq || + (bfqq != bfqd->in_service_queue && + bfqd->in_service_queue != NULL && + bfq_tot_busy_queues(bfqd) == 1 + bfq_bfqq_busy(bfqq) && + (bfqq->waker_bfqq == bfqd->in_service_queue || + bfqd->in_service_queue->waker_bfqq == bfqq)) || + at_head || blk_rq_is_passthrough(rq)) { if (at_head) list_add(&rq->queuelist, &bfqd->dispatch); else From patchwork Tue Jan 26 10:50:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 12046099 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FD30C433E0 for ; Tue, 26 Jan 2021 10:52:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4ED9E23109 for ; Tue, 26 Jan 2021 10:52:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404468AbhAZKwe (ORCPT ); Tue, 26 Jan 2021 05:52:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404476AbhAZKwc (ORCPT ); Tue, 26 Jan 2021 05:52:32 -0500 Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A4569C06178B for ; Tue, 26 Jan 2021 02:51:22 -0800 (PST) Received: by mail-wr1-x431.google.com with SMTP id h9so5687038wrr.9 for ; Tue, 26 Jan 2021 02:51:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=t86iMQKIqsS0m2aCrWu55vo8IS053fSBSrJitc0MorY=; b=o7mD/KjydYdHiZ0EaOi+To5IHQ2Tr3u1GFOjlqb57h46hhwlQI9CbkB+zdYmqK0C2A Ndn6MZHrdqMvjJA6yFwjCnRuyvQwx9uNrzKIlYIDbSayT6cahYF7zn/w3ixc9SmG7gMl O3kiZStPme9n68UyflaMzQ4Xtl4Qm4+h/cYBUnT1AjEHGaodBv1V18nFA8qmsTCuYICd us2Y1L2Mk9j9mCuN8LCYzBCBfL4jGL1Gp0hNs/7S9LFU08Pjz4MJhVT/D6FtmDgTPiMR c96mheNENcTGjtOYLmjONbjVreusfNQ/oeFotn42jd7TwDAWKkSVEqLGyJW/raAvT4Ab a1qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=t86iMQKIqsS0m2aCrWu55vo8IS053fSBSrJitc0MorY=; b=BPWmSA7lptIZkyYjviqNrZmH1LCjzlJVN2ZBB4LiKFtJYqHNXJ/bcH6Picn83rR3L9 9AmGYKqtA3TNETj3R39moFr2YzJz0FC8vIK22HUN+5zjJ44F0BimpwnELt0NGQyWfmpn lqSwHKUvtrlMrf4R08977k9+expdH25wE6DeEeKAj09UP/UAGsvMPcDiSteOKbBmXSFz 6yh7AwV2kkB1NhLmXt0Vo7MUCqKoRK5k4197m1z/O0yBAvXozcfWR1j0dH2EnlWjpFA6 RN+hK+xw1jF/3f474yA48zKD71jTim6SUzRB2Im74/krkf8AGFreYbGz11inL4tyw0cK 5CtQ== X-Gm-Message-State: AOAM533lY3emY60L5APnyNE1pXzyYN4nf9eg3ZgE71vDzH7krrPY53w3 UpgZPOQhwYeowtHDqIvEIe5sng== X-Google-Smtp-Source: ABdhPJyTqVADEw9M3IHNU0FlSb0fH6q1RWNU4RKVai0zsh9bNEHZ4TfAJv6yPLjuOy5tjfaM/4Uofw== X-Received: by 2002:adf:80c3:: with SMTP id 61mr4963946wrl.100.1611658281280; Tue, 26 Jan 2021 02:51:21 -0800 (PST) Received: from localhost.localdomain ([83.216.184.132]) by smtp.gmail.com with ESMTPSA id h18sm7177879wru.65.2021.01.26.02.51.20 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 26 Jan 2021 02:51:20 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Valente , Jan Kara Subject: [PATCH BUGFIX/IMPROVEMENT 3/6] block, bfq: make shared queues inherit wakers Date: Tue, 26 Jan 2021 11:50:59 +0100 Message-Id: <20210126105102.53102-4-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210126105102.53102-1-paolo.valente@linaro.org> References: <20210126105102.53102-1-paolo.valente@linaro.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Consider a bfq_queue bfqq that is about to be merged with another bfq_queue new_bfqq. The processes associated with bfqq are cooperators of the processes associated with new_bfqq. So, if bfqq has a waker, then it is reasonable (and beneficial for throughput) to assume that all these processes will be happy to let bfqq's waker freely inject I/O when they have no I/O. So this commit makes new_bfqq inherit bfqq's waker. Tested-by: Jan Kara Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 42 +++++++++++++++++++++++++++++++++++++++--- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index e5b83910fbe0..c5bda33c0923 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -2819,6 +2819,29 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic, bfq_mark_bfqq_IO_bound(new_bfqq); bfq_clear_bfqq_IO_bound(bfqq); + /* + * The processes associated with bfqq are cooperators of the + * processes associated with new_bfqq. So, if bfqq has a + * waker, then assume that all these processes will be happy + * to let bfqq's waker freely inject I/O when they have no + * I/O. + */ + if (bfqq->waker_bfqq && !new_bfqq->waker_bfqq && + bfqq->waker_bfqq != new_bfqq) { + new_bfqq->waker_bfqq = bfqq->waker_bfqq; + new_bfqq->tentative_waker_bfqq = NULL; + + /* + * If the waker queue disappears, then + * new_bfqq->waker_bfqq must be reset. So insert + * new_bfqq into the woken_list of the waker. See + * bfq_check_waker for details. + */ + hlist_add_head(&new_bfqq->woken_list_node, + &new_bfqq->waker_bfqq->woken_list); + + } + /* * If bfqq is weight-raised, then let new_bfqq inherit * weight-raising. To reduce false positives, neglect the case @@ -6276,7 +6299,7 @@ static struct bfq_queue *bfq_init_rq(struct request *rq) if (likely(!new_queue)) { /* If the queue was seeky for too long, break it apart. */ if (bfq_bfqq_coop(bfqq) && bfq_bfqq_split_coop(bfqq)) { - bfq_log_bfqq(bfqd, bfqq, "breaking apart bfqq"); + struct bfq_queue *old_bfqq = bfqq; /* Update bic before losing reference to bfqq */ if (bfq_bfqq_in_large_burst(bfqq)) @@ -6285,11 +6308,24 @@ static struct bfq_queue *bfq_init_rq(struct request *rq) bfqq = bfq_split_bfqq(bic, bfqq); split = true; - if (!bfqq) + if (!bfqq) { bfqq = bfq_get_bfqq_handle_split(bfqd, bic, bio, true, is_sync, NULL); - else + bfqq->waker_bfqq = old_bfqq->waker_bfqq; + bfqq->tentative_waker_bfqq = NULL; + + /* + * If the waker queue disappears, then + * new_bfqq->waker_bfqq must be + * reset. So insert new_bfqq into the + * woken_list of the waker. See + * bfq_check_waker for details. + */ + if (bfqq->waker_bfqq) + hlist_add_head(&bfqq->woken_list_node, + &bfqq->waker_bfqq->woken_list); + } else bfqq_already_existing = true; } } From patchwork Tue Jan 26 10:51:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 12046101 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 011E1C433E0 for ; Tue, 26 Jan 2021 10:53:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B6FBE23109 for ; Tue, 26 Jan 2021 10:53:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404532AbhAZKwz (ORCPT ); Tue, 26 Jan 2021 05:52:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54510 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404523AbhAZKwr (ORCPT ); Tue, 26 Jan 2021 05:52:47 -0500 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8B65C061793 for ; Tue, 26 Jan 2021 02:51:23 -0800 (PST) Received: by mail-wm1-x331.google.com with SMTP id j18so2037187wmi.3 for ; Tue, 26 Jan 2021 02:51:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=g0MXCokkbnaIDoDqUI9Wq5P/6nCekdb4/ZayF5jaU44=; b=vj1b6NidgsO3AAvLOcErxLumqmJkTgHblZ2YHQzhzjKBenORxnL05KJV+vzV+/H4M1 qD6TVraUtAGqfx+BhiM51UH0O0bGE8/UfsSJZnn5kKjpc5/sATZAj7MKszmX3BfeSGaL 39KVXY5Igy9YDb4tUpgwnvpqoBk/6P7U1l5wAsmey3oYSlYUM6wELOk9KKELCJhtF4Yc hIQU5oavrd7La/YTBNrrkM61cRzuzJvruQS8Po/Zl4YyakHcj19xuU4V79xaFl2bf1I+ 9Xo4QuJu7tg4OS/Rco/9Y/OQQS3ntIOFkCISszXN+onU2Mb7mnabTUXTUvfUZ3Sqhm52 hD6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=g0MXCokkbnaIDoDqUI9Wq5P/6nCekdb4/ZayF5jaU44=; b=fRT9wAE+BOfbcizLHhqkFWTtfLBh0RKhtQgZBxj2phJOJbE+5AdYAP6Nwg3SGCB4cE jt/C/5Xw8g6eYsHaY5aN/+yjk888i0v9f6zuDXgHAZB5Q68a/t4uo2gMYrC1w3TohaNR 6tVCN+3UdgBzPAlqZmaMm1vPTjgQTiUQBlPVlEVO6YP233Ja1sagw3CZJ5TnsED1k346 JaBkV1zXLNYXS7jhX27dtuKeL1xv+VHiHOeTIobiFa2nURhnuUvBC28RfAVtsdNNXu7P Oqi/G8QSXdui0JZMrU320hoMBJjg07ozpkvCE9U6vGbqd2qd4D4EiMlOgM9eVC3mkUyR 7Cuw== X-Gm-Message-State: AOAM530IY/BsbvzE3lPHziXf1X41o/AqJGQpWOzAf2pP8+q/ALB3vLQp Q7Nlcgl98F6ef9pDWgYLE12rAvlowkJcoQ== X-Google-Smtp-Source: ABdhPJzK8Zs6frbsR2txsRD/xENvh79/znuNQXhGsBjUQ+py4BQ16PPJQWOtSdFqV0GvmIo8IjaV+g== X-Received: by 2002:a1c:e0d4:: with SMTP id x203mr4071206wmg.160.1611658282468; Tue, 26 Jan 2021 02:51:22 -0800 (PST) Received: from localhost.localdomain ([83.216.184.132]) by smtp.gmail.com with ESMTPSA id h18sm7177879wru.65.2021.01.26.02.51.21 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 26 Jan 2021 02:51:22 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Valente , Jan Kara Subject: [PATCH BUGFIX/IMPROVEMENT 4/6] block, bfq: fix weight-raising resume with !low_latency Date: Tue, 26 Jan 2021 11:51:00 +0100 Message-Id: <20210126105102.53102-5-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210126105102.53102-1-paolo.valente@linaro.org> References: <20210126105102.53102-1-paolo.valente@linaro.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org When the io_latency heuristic is off, bfq_queues must not start to be weight-raised. Unfortunately, by mistake, this may happen when the state of a previously weight-raised bfq_queue is resumed after a queue split. This commit fixes this error. Tested-by: Jan Kara Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index c5bda33c0923..0c7e203085f1 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -1010,7 +1010,7 @@ static void bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd, struct bfq_io_cq *bic, bool bfq_already_existing) { - unsigned int old_wr_coeff = bfqq->wr_coeff; + unsigned int old_wr_coeff = 1; bool busy = bfq_already_existing && bfq_bfqq_busy(bfqq); if (bic->saved_has_short_ttime) @@ -1031,7 +1031,13 @@ bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd, bfqq->ttime = bic->saved_ttime; bfqq->io_start_time = bic->saved_io_start_time; bfqq->tot_idle_time = bic->saved_tot_idle_time; - bfqq->wr_coeff = bic->saved_wr_coeff; + /* + * Restore weight coefficient only if low_latency is on + */ + if (bfqd->low_latency) { + old_wr_coeff = bfqq->wr_coeff; + bfqq->wr_coeff = bic->saved_wr_coeff; + } bfqq->service_from_wr = bic->saved_service_from_wr; bfqq->wr_start_at_switch_to_srt = bic->saved_wr_start_at_switch_to_srt; bfqq->last_wr_start_finish = bic->saved_last_wr_start_finish; From patchwork Tue Jan 26 10:51:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 12046257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F20BC433E0 for ; Tue, 26 Jan 2021 12:25:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2550D23109 for ; Tue, 26 Jan 2021 12:25:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391617AbhAZMYn (ORCPT ); Tue, 26 Jan 2021 07:24:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54512 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404524AbhAZKwr (ORCPT ); Tue, 26 Jan 2021 05:52:47 -0500 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CF0AC061794 for ; Tue, 26 Jan 2021 02:51:25 -0800 (PST) Received: by mail-wm1-x333.google.com with SMTP id u14so2031511wml.4 for ; Tue, 26 Jan 2021 02:51:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=eDB2VePP2gLo3jktXjvGnSm3jDWHWVOh6rC8z0CbRvQ=; b=D7x3P3OFR2vff+Yisjrvazi8eadRJNZbJnawBZbQpfFxy4dMa/orEnq0vHCEEDJL5p kLf5J9b4uznLltOw5/AuQA7c4JtVJS5tWd3XV2xZCo/QMYerYkgcfBVBJ8m/gt3+9pII fVhlxp2QrMCtwfNZoLRG6TGUEXup8VrC6YaIUNfjVr8LmWlbg2NbS7jjmegOFbe7KmRK 8bTtTWco0Xh45LldthuzzW8S4sL4fQ/ZdfVmAwzTguy8WBcROiEkYlEmabKewjrhpJeE S5iKogUo4fst39p5sS8skCcFs1ScgkFe4iJncmXTbaBjfclKz/4Cgitov9ctk+kgVtjS Aa5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=eDB2VePP2gLo3jktXjvGnSm3jDWHWVOh6rC8z0CbRvQ=; b=T1yYS2k7Mh12DumlgycbIFls3NtkAENIm4+dY+g4dH1lB7YtF41Ytusy9z61+LUYKg 2026LbMgbUYagTV6d381byGuljdCnQgZsnzabuHBDpIeLJWTQDWBey0wR29sWbjVq97w ERqROY/bmQREfCWzt/VoYoNDrwbm9eFx84lgCBi73NV/5qf6bI7By8Acyxve809Ltqb+ uoOivpurWT+Od5j415XtFN31xF8JFMVz5sgaxQzL3+SF4LK5wqNGBlHdjEQ64Z4baKyr 3HmvKf7Vk5LBriBVxoF9qwNiwm30NMINOkby4Sw1H0gDMrXKOGB8XMAcGL0198pYW7rR 13Fg== X-Gm-Message-State: AOAM532B65Ry5L2EbQRXyKdQXjo9+XY3is06Hatf7tR/yLWGYvCU+m5n V5ObU/AgdTHagUkdDEqXNeUJZQ== X-Google-Smtp-Source: ABdhPJxt0t0zfhum4rHmKwRmfDFrxAJaGG/0X91xEqYzJx6/dtV9+zRIwoFomMvoiUeOP+B2K1NtCQ== X-Received: by 2002:a7b:ce96:: with SMTP id q22mr4150055wmj.165.1611658283791; Tue, 26 Jan 2021 02:51:23 -0800 (PST) Received: from localhost.localdomain ([83.216.184.132]) by smtp.gmail.com with ESMTPSA id h18sm7177879wru.65.2021.01.26.02.51.22 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 26 Jan 2021 02:51:23 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Valente , Jan Kara Subject: [PATCH BUGFIX/IMPROVEMENT 5/6] block, bfq: keep shared queues out of the waker mechanism Date: Tue, 26 Jan 2021 11:51:01 +0100 Message-Id: <20210126105102.53102-6-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210126105102.53102-1-paolo.valente@linaro.org> References: <20210126105102.53102-1-paolo.valente@linaro.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Shared queues are likely to receive I/O at a high rate. This may deceptively let them be considered as wakers of other queues. But a false waker will unjustly steal bandwidth to its supposedly woken queue. So considering also shared queues in the waking mechanism may cause more control troubles than throughput benefits. This commit keeps shared queues out of the waker-detection mechanism. Tested-by: Jan Kara Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 0c7e203085f1..23d0dd7bd90f 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -5825,7 +5825,17 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd) 1UL<<(BFQ_RATE_SHIFT - 10)) bfq_update_rate_reset(bfqd, NULL); bfqd->last_completion = now_ns; - bfqd->last_completed_rq_bfqq = bfqq; + /* + * Shared queues are likely to receive I/O at a high + * rate. This may deceptively let them be considered as wakers + * of other queues. But a false waker will unjustly steal + * bandwidth to its supposedly woken queue. So considering + * also shared queues in the waking mechanism may cause more + * control troubles than throughput benefits. Then do not set + * last_completed_rq_bfqq to bfqq if bfqq is a shared queue. + */ + if (!bfq_bfqq_coop(bfqq)) + bfqd->last_completed_rq_bfqq = bfqq; /* * If we are waiting to discover whether the request pattern From patchwork Tue Jan 26 10:51:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 12046103 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A84C9C433DB for ; Tue, 26 Jan 2021 10:53:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 67EE323109 for ; Tue, 26 Jan 2021 10:53:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404523AbhAZKw4 (ORCPT ); Tue, 26 Jan 2021 05:52:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54520 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404527AbhAZKwt (ORCPT ); Tue, 26 Jan 2021 05:52:49 -0500 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6A809C0617A7 for ; Tue, 26 Jan 2021 02:51:26 -0800 (PST) Received: by mail-wm1-x336.google.com with SMTP id c128so2320451wme.2 for ; Tue, 26 Jan 2021 02:51:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=75nba2FFMX/u1RP+BEl5+Iv9hNTPDrpQPQBT7H2LRuU=; b=G0NLYJo/1v7j+1Kc/UjJPklZNzEI25x2nqccIddRZaaTpX8JyryKm5Y2CCHUzBxZNs M71QS/leJFZmFIS35QS3pmu2wl1tISCl4VXvxV1dFI96AqCzIMXi0u2FcZcYGRDFJXN/ mEgN7qhUSyPP10Qz9l8Y41sFWSVN11nHdwlRoIJ5Sf/IUFlHKwk/92FpGp2SsPLTxoYw QTqdA4pwprFxx4daQKa+Y/rymLwsgq979Xvnv0rZuhUetBn4tEEPJ9/555XgsVZj4v+1 i1mMOPHva/W0n1p7SABoNu2lOpvzcwm43TSHLFaZrGDsHufPsMVOfWAsK+rhSV+fDeeE k9Mg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=75nba2FFMX/u1RP+BEl5+Iv9hNTPDrpQPQBT7H2LRuU=; b=hPYNnms45OZzQOvgVS7CUeacKDxFQjVBOCuTAsmcO50OdxJhkVQELVl/bA+mLmxyE0 ARAmvlYosP7UiFlZ4ytsCWbBbI5E4408rbG52otgqCTyePrIvEGwQNZ3pasMmnerT0As iKHne4yBPq9K67oFiJ9kjycBjdcbKVNbAQEi1TtB2jy/FTXFWGXMj8aERE4yTwQvbfQq Lcv61TKzxo1GB7+OV3w65leX+bVUUAXUSeu5YwVY/wbDcYzh2/ix8ui/pALtOgYHMG/o ACMvXoyj9Gi9FdsjgC2y/8IaLDV9EZfPe4LlfeOH5NdWmeT4d3Bh3+C37GrFBLObgnlD 3GeA== X-Gm-Message-State: AOAM533DCbPRQMYRT5r1fx+FGj2MDHS9h6ZgqD9QcP/3J0j+S/Q5blWC 3c/wX7Gcetd2QHnPNQW8xcJNSQ== X-Google-Smtp-Source: ABdhPJwTVt/bP7QD4sX0zJAQwtBguyWVN5KO1uLPne+MV6kG7LswoXWS75QT6WgciljHs/LTvm7GRg== X-Received: by 2002:a05:600c:27d1:: with SMTP id l17mr4072045wmb.18.1611658284987; Tue, 26 Jan 2021 02:51:24 -0800 (PST) Received: from localhost.localdomain ([83.216.184.132]) by smtp.gmail.com with ESMTPSA id h18sm7177879wru.65.2021.01.26.02.51.23 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 26 Jan 2021 02:51:24 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Valente , Jan Kara Subject: [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues Date: Tue, 26 Jan 2021 11:51:02 +0100 Message-Id: <20210126105102.53102-7-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210126105102.53102-1-paolo.valente@linaro.org> References: <20210126105102.53102-1-paolo.valente@linaro.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Many throughput-sensitive workloads are made of several parallel I/O flows, with all flows generated by the same application, or more generically by the same task (e.g., system boot). The most counterproductive action with these workloads is plugging I/O dispatch when one of the bfq_queues associated with these flows remains temporarily empty. To avoid this plugging, BFQ has been using a burst-handling mechanism for years now. This mechanism has proven effective for throughput, and not detrimental for service guarantees. This commit pushes this mechanism a little bit further, basing on the following two facts. First, all the I/O flows of a the same application or task contribute to the execution/completion of that common application or task. So the performance figures that matter are total throughput of the flows and task-wide I/O latency. In particular, these flows do not need to be protected from each other, in terms of individual bandwidth or latency. Second, the above fact holds regardless of the number of flows. Putting these two facts together, this commits merges stably the bfq_queues associated with these I/O flows, i.e., with the processes that generate these IO/ flows, regardless of how many the involved processes are. To decide whether a set of bfq_queues is actually associated with the I/O flows of a common application or task, and to merge these queues stably, this commit operates as follows: given a bfq_queue, say Q2, currently being created, and the last bfq_queue, say Q1, created before Q2, Q2 is merged stably with Q1 if - very little time has elapsed since when Q1 was created - Q2 has the same ioprio as Q1 - Q2 belongs to the same group as Q1 Merging bfq_queues also reduces scheduling overhead. A fio test with ten random readers on /dev/nullb shows a throughput boost of 40%, with a quadcore. Since BFQ's execution time amounts to ~50% of the total per-request processing time, the above throughput boost implies that BFQ's overhead is reduced by more than 50%. Tested-by: Jan Kara Signed-off-by: Paolo Valente Reported-by: kernel test robot Reported-by: kernel test robot --- block/bfq-cgroup.c | 2 + block/bfq-iosched.c | 249 ++++++++++++++++++++++++++++++++++++++++++-- block/bfq-iosched.h | 15 +++ 3 files changed, 256 insertions(+), 10 deletions(-) diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index b791e2041e49..e2f14508f2d6 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -547,6 +547,8 @@ static void bfq_pd_init(struct blkg_policy_data *pd) entity->orig_weight = entity->weight = entity->new_weight = d->weight; entity->my_sched_data = &bfqg->sched_data; + entity->last_bfqq_created = NULL; + bfqg->my_entity = entity; /* * the root_group's will be set to NULL * in bfq_init_queue() diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 23d0dd7bd90f..f4e23e0ced74 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -1073,7 +1073,7 @@ bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd, static int bfqq_process_refs(struct bfq_queue *bfqq) { return bfqq->ref - bfqq->allocated - bfqq->entity.on_st_or_in_serv - - (bfqq->weight_counter != NULL); + (bfqq->weight_counter != NULL) - bfqq->stable_ref; } /* Empty burst list and add just bfqq (see comments on bfq_handle_burst) */ @@ -2625,6 +2625,11 @@ static bool bfq_may_be_close_cooperator(struct bfq_queue *bfqq, return true; } +static bool idling_boosts_thr_without_issues(struct bfq_data *bfqd, + struct bfq_queue *bfqq); + +static void bfq_put_stable_ref(struct bfq_queue *bfqq); + /* * Attempt to schedule a merge of bfqq with the currently in-service * queue or with a close queue among the scheduled queues. Return @@ -2647,10 +2652,49 @@ static bool bfq_may_be_close_cooperator(struct bfq_queue *bfqq, */ static struct bfq_queue * bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq, - void *io_struct, bool request) + void *io_struct, bool request, struct bfq_io_cq *bic) { struct bfq_queue *in_service_bfqq, *new_bfqq; + /* + * Check delayed stable merge for rotational or non-queueing + * devs. For this branch to be executed, bfqq must not be + * currently merged with some other queue (i.e., bfqq->bic + * must be non null). If we considered also merged queues, + * then we should also check whether bfqq has already been + * merged with bic->stable_merge_bfqq. But this would be + * costly and complicated. + */ + if (unlikely(!bfqd->nonrot_with_queueing)) { + if (bic->stable_merge_bfqq && + !bfq_bfqq_just_created(bfqq) && + time_is_after_jiffies(bfqq->split_time + + msecs_to_jiffies(200))) { + struct bfq_queue *stable_merge_bfqq = + bic->stable_merge_bfqq; + int proc_ref = min(bfqq_process_refs(bfqq), + bfqq_process_refs(stable_merge_bfqq)); + + /* deschedule stable merge, because done or aborted here */ + bfq_put_stable_ref(stable_merge_bfqq); + + bic->stable_merge_bfqq = NULL; + + if (!idling_boosts_thr_without_issues(bfqd, bfqq) && + proc_ref > 0) { + /* next function will take at least one ref */ + struct bfq_queue *new_bfqq = + bfq_setup_merge(bfqq, stable_merge_bfqq); + + bic->stably_merged = true; + if (new_bfqq && new_bfqq->bic) + new_bfqq->bic->stably_merged = true; + return new_bfqq; + } else + return NULL; + } + } + /* * Do not perform queue merging if the device is non * rotational and performs internal queueing. In fact, such a @@ -2809,6 +2853,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq) bfqq != bfqd->in_service_queue) bfq_del_bfqq_busy(bfqd, bfqq, false); + if (bfqq->entity.parent && + bfqq->entity.parent->last_bfqq_created == bfqq) + bfqq->entity.parent->last_bfqq_created = NULL; + else if (bfqq->bfqd && bfqq->bfqd->last_bfqq_created == bfqq) + bfqq->bfqd->last_bfqq_created = NULL; + bfq_put_queue(bfqq); } @@ -2905,6 +2955,13 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic, */ new_bfqq->pid = -1; bfqq->bic = NULL; + + if (bfqq->entity.parent && + bfqq->entity.parent->last_bfqq_created == bfqq) + bfqq->entity.parent->last_bfqq_created = new_bfqq; + else if (bfqq->bfqd && bfqq->bfqd->last_bfqq_created == bfqq) + bfqq->bfqd->last_bfqq_created = new_bfqq; + bfq_release_process_ref(bfqd, bfqq); } @@ -2932,7 +2989,7 @@ static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq, * We take advantage of this function to perform an early merge * of the queues of possible cooperating processes. */ - new_bfqq = bfq_setup_cooperator(bfqd, bfqq, bio, false); + new_bfqq = bfq_setup_cooperator(bfqd, bfqq, bio, false, bfqd->bio_bic); if (new_bfqq) { /* * bic still points to bfqq, then it has not yet been @@ -5033,6 +5090,12 @@ void bfq_put_queue(struct bfq_queue *bfqq) bfqg_and_blkg_put(bfqg); } +static void bfq_put_stable_ref(struct bfq_queue *bfqq) +{ + bfqq->stable_ref--; + bfq_put_queue(bfqq); +} + static void bfq_put_cooperator(struct bfq_queue *bfqq) { struct bfq_queue *__bfqq, *next; @@ -5089,6 +5152,17 @@ static void bfq_exit_icq(struct io_cq *icq) { struct bfq_io_cq *bic = icq_to_bic(icq); + if (bic->stable_merge_bfqq) { + unsigned long flags; + struct bfq_data *bfqd = bic->stable_merge_bfqq->bfqd; + + if (bfqd) + spin_lock_irqsave(&bfqd->lock, flags); + bfq_put_stable_ref(bic->stable_merge_bfqq); + if (bfqd) + spin_unlock_irqrestore(&bfqd->lock, flags); + } + bfq_exit_icq_bfqq(bic, true); bfq_exit_icq_bfqq(bic, false); } @@ -5149,7 +5223,8 @@ bfq_set_next_ioprio_data(struct bfq_queue *bfqq, struct bfq_io_cq *bic) static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd, struct bio *bio, bool is_sync, - struct bfq_io_cq *bic); + struct bfq_io_cq *bic, + bool respawn); static void bfq_check_ioprio_change(struct bfq_io_cq *bic, struct bio *bio) { @@ -5169,7 +5244,7 @@ static void bfq_check_ioprio_change(struct bfq_io_cq *bic, struct bio *bio) bfqq = bic_to_bfqq(bic, false); if (bfqq) { bfq_release_process_ref(bfqd, bfqq); - bfqq = bfq_get_queue(bfqd, bio, BLK_RW_ASYNC, bic); + bfqq = bfq_get_queue(bfqd, bio, BLK_RW_ASYNC, bic, true); bic_set_bfqq(bic, bfqq, false); } @@ -5212,6 +5287,8 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq, /* set end request to minus infinity from now */ bfqq->ttime.last_end_request = now_ns + 1; + bfqq->creation_time = jiffies; + bfqq->io_start_time = now_ns; bfq_mark_bfqq_IO_bound(bfqq); @@ -5261,9 +5338,156 @@ static struct bfq_queue **bfq_async_queue_prio(struct bfq_data *bfqd, } } +struct bfq_queue * +bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq, + struct bfq_io_cq *bic, + struct bfq_queue *last_bfqq_created) +{ + struct bfq_queue *new_bfqq = + bfq_setup_merge(bfqq, last_bfqq_created); + + if (!new_bfqq) + return bfqq; + + if (new_bfqq->bic) + new_bfqq->bic->stably_merged = true; + bic->stably_merged = true; + + /* + * Reusing merge functions. This implies that + * bfqq->bic must be set too, for + * bfq_merge_bfqqs to correctly save bfqq's + * state before killing it. + */ + bfqq->bic = bic; + bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq); + + return new_bfqq; +} + +/* + * Many throughput-sensitive workloads are made of several parallel + * I/O flows, with all flows generated by the same application, or + * more generically by the same task (e.g., system boot). The most + * counterproductive action with these workloads is plugging I/O + * dispatch when one of the bfq_queues associated with these flows + * remains temporarily empty. + * + * To avoid this plugging, BFQ has been using a burst-handling + * mechanism for years now. This mechanism has proven effective for + * throughput, and not detrimental for service guarantees. The + * following function pushes this mechanism a little bit further, + * basing on the following two facts. + * + * First, all the I/O flows of a the same application or task + * contribute to the execution/completion of that common application + * or task. So the performance figures that matter are total + * throughput of the flows and task-wide I/O latency. In particular, + * these flows do not need to be protected from each other, in terms + * of individual bandwidth or latency. + * + * Second, the above fact holds regardless of the number of flows. + * + * Putting these two facts together, this commits merges stably the + * bfq_queues associated with these I/O flows, i.e., with the + * processes that generate these IO/ flows, regardless of how many the + * involved processes are. + * + * To decide whether a set of bfq_queues is actually associated with + * the I/O flows of a common application or task, and to merge these + * queues stably, this function operates as follows: given a bfq_queue, + * say Q2, currently being created, and the last bfq_queue, say Q1, + * created before Q2, Q2 is merged stably with Q1 if + * - very little time has elapsed since when Q1 was created + * - Q2 has the same ioprio as Q1 + * - Q2 belongs to the same group as Q1 + * + * Merging bfq_queues also reduces scheduling overhead. A fio test + * with ten random readers on /dev/nullb shows a throughput boost of + * 40%, with a quadcore. Since BFQ's execution time amounts to ~50% of + * the total per-request processing time, the above throughput boost + * implies that BFQ's overhead is reduced by more than 50%. + * + * This new mechanism most certainly obsoletes the current + * burst-handling heuristics. We keep those heuristics for the moment. + */ +static struct bfq_queue *bfq_do_or_sched_stable_merge(struct bfq_data *bfqd, + struct bfq_queue *bfqq, + struct bfq_io_cq *bic) +{ + struct bfq_queue **source_bfqq = bfqq->entity.parent ? + &bfqq->entity.parent->last_bfqq_created : + &bfqd->last_bfqq_created; + + struct bfq_queue *last_bfqq_created = *source_bfqq; + + /* + * If last_bfqq_created has not been set yet, then init it. If + * it has been set already, but too long ago, then move it + * forward to bfqq. Finally, move also if bfqq belongs to a + * different group than last_bfqq_created, or if bfqq has a + * different ioprio or ioprio_class. If none of these + * conditions holds true, then try an early stable merge or + * schedule a delayed stable merge. + * + * A delayed merge is scheduled (instead of performing an + * early merge), in case bfqq might soon prove to be more + * throughput-beneficial if not merged. Currently this is + * possible only if bfqd is rotational with no queueing. For + * such a drive, not merging bfqq is better for throughput if + * bfqq happens to contain sequential I/O. So, we wait a + * little bit for enough I/O to flow through bfqq. After that, + * if such an I/O is sequential, then the merge is + * canceled. Otherwise the merge is finally performed. + */ + if (!last_bfqq_created || + time_before(last_bfqq_created->creation_time + + bfqd->bfq_burst_interval, + bfqq->creation_time) || + bfqq->entity.parent != last_bfqq_created->entity.parent || + bfqq->ioprio != last_bfqq_created->ioprio || + bfqq->ioprio_class != last_bfqq_created->ioprio_class) + *source_bfqq = bfqq; + else if (time_after_eq(last_bfqq_created->creation_time + + bfqd->bfq_burst_interval, + bfqq->creation_time)) { + if (likely(bfqd->nonrot_with_queueing)) + /* + * With this type of drive, leaving + * bfqq alone may provide no + * throughput benefits compared with + * merging bfqq. So merge bfqq now. + */ + bfqq = bfq_do_early_stable_merge(bfqd, bfqq, + bic, + last_bfqq_created); + else { /* schedule tentative stable merge */ + /* + * get reference on last_bfqq_created, + * to prevent it from being freed, + * until we decide whether to merge + */ + last_bfqq_created->ref++; + /* + * need to keep track of stable refs, to + * compute process refs correctly + */ + last_bfqq_created->stable_ref++; + /* + * Record the bfqq to merge to. + */ + bic->stable_merge_bfqq = last_bfqq_created; + } + } + + return bfqq; +} + + static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd, struct bio *bio, bool is_sync, - struct bfq_io_cq *bic) + struct bfq_io_cq *bic, + bool respawn) { const int ioprio = IOPRIO_PRIO_DATA(bic->ioprio); const int ioprio_class = IOPRIO_PRIO_CLASS(bic->ioprio); @@ -5321,7 +5545,10 @@ static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd, out: bfqq->ref++; /* get a process reference to this queue */ - bfq_log_bfqq(bfqd, bfqq, "get_queue, at end: %p, %d", bfqq, bfqq->ref); + + if (bfqq != &bfqd->oom_bfqq && is_sync && !respawn) + bfqq = bfq_do_or_sched_stable_merge(bfqd, bfqq, bic); + rcu_read_unlock(); return bfqq; } @@ -5563,7 +5790,8 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq, static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq) { struct bfq_queue *bfqq = RQ_BFQQ(rq), - *new_bfqq = bfq_setup_cooperator(bfqd, bfqq, rq, true); + *new_bfqq = bfq_setup_cooperator(bfqd, bfqq, rq, true, + RQ_BIC(rq)); bool waiting, idle_timer_disabled = false; if (new_bfqq) { @@ -6193,7 +6421,7 @@ static struct bfq_queue *bfq_get_bfqq_handle_split(struct bfq_data *bfqd, if (bfqq) bfq_put_queue(bfqq); - bfqq = bfq_get_queue(bfqd, bio, is_sync, bic); + bfqq = bfq_get_queue(bfqd, bio, is_sync, bic, split); bic_set_bfqq(bic, bfqq, is_sync); if (split && is_sync) { @@ -6314,7 +6542,8 @@ static struct bfq_queue *bfq_init_rq(struct request *rq) if (likely(!new_queue)) { /* If the queue was seeky for too long, break it apart. */ - if (bfq_bfqq_coop(bfqq) && bfq_bfqq_split_coop(bfqq)) { + if (bfq_bfqq_coop(bfqq) && bfq_bfqq_split_coop(bfqq) && + !bic->stably_merged) { struct bfq_queue *old_bfqq = bfqq; /* Update bic before losing reference to bfqq */ diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h index b8e793c34ff1..99c2a3cb081e 100644 --- a/block/bfq-iosched.h +++ b/block/bfq-iosched.h @@ -197,6 +197,9 @@ struct bfq_entity { /* flag, set if the entity is counted in groups_with_pending_reqs */ bool in_groups_with_pending_reqs; + + /* last child queue of entity created (for non-leaf entities) */ + struct bfq_queue *last_bfqq_created; }; struct bfq_group; @@ -230,6 +233,8 @@ struct bfq_ttime { struct bfq_queue { /* reference counter */ int ref; + /* counter of references from other queues for delayed stable merge */ + int stable_ref; /* parent bfq_data */ struct bfq_data *bfqd; @@ -365,6 +370,8 @@ struct bfq_queue { unsigned long first_IO_time; /* time of first I/O for this queue */ + unsigned long creation_time; /* when this queue is created */ + /* max service rate measured so far */ u32 max_service_rate; @@ -454,6 +461,11 @@ struct bfq_io_cq { u64 saved_last_serv_time_ns; unsigned int saved_inject_limit; unsigned long saved_decrease_time_jif; + + /* candidate queue for a stable merge (due to close creation time) */ + struct bfq_queue *stable_merge_bfqq; + + bool stably_merged; /* non splittable if true */ }; /** @@ -578,6 +590,9 @@ struct bfq_data { /* bfqq owning the last completed rq */ struct bfq_queue *last_completed_rq_bfqq; + /* last bfqq created, among those in the root group */ + struct bfq_queue *last_bfqq_created; + /* time of last transition from empty to non-empty (ns) */ u64 last_empty_occupied_ns;