From patchwork Tue Jan 29 11:06:25 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 10786031
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org,
 bfq-iosched@googlegroups.com, oleksandr@natalenko.name,
 mancha@tower-research.com, Paolo Valente
Subject: [PATCH BUGFIX IMPROVEMENT 01/14] block, bfq: do not consider interactive queues in srt filtering
Date: Tue, 29 Jan 2019 12:06:25 +0100
Message-Id: <20190129110638.12652-2-paolo.valente@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org>
References: <20190129110638.12652-1-paolo.valente@linaro.org>
X-Mailing-List: linux-block@vger.kernel.org

The speed at which a bfq_queue receives I/O is one of the parameters by
which bfq decides whether the queue is soft real-time (i.e., whether
the queue contains the I/O of a soft real-time application). In
particular, when a bfq_queue remains without outstanding I/O requests,
bfq computes the minimum time instant, named soft_rt_next_start, at
which the next request of the queue may arrive for the queue to be
deemed as soft real time.

Unfortunately this filtering may cause problems with a queue in
interactive weight raising. In fact, such a queue may be conveying the
I/O needed to load a soft real-time application. The latter will
actually exhibit a soft real-time I/O pattern after it finally starts
doing its job. But, if soft_rt_next_start is updated for an
interactive bfq_queue, and the queue has received a lot of service
before remaining with no outstanding request (likely to happen on a
fast device), then soft_rt_next_start is assigned such a high value
that, for a very long time, the queue is prevented from being possibly
considered as soft real time.

This commit removes the updating of soft_rt_next_start for bfq_queues
in interactive weight raising.

Signed-off-by: Paolo Valente
---
 block/bfq-iosched.c | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index cd307767a134..c7a4a15c7c19 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -3274,16 +3274,32 @@ void bfq_bfqq_expire(struct bfq_data *bfqd,
 		 * requests, then the request pattern is isochronous
 		 * (see the comments on the function
 		 * bfq_bfqq_softrt_next_start()). Thus we can compute
-		 * soft_rt_next_start. If, instead, the queue still
-		 * has outstanding requests, then we have to wait for
-		 * the completion of all the outstanding requests to
-		 * discover whether the request pattern is actually
-		 * isochronous.
+		 * soft_rt_next_start. And we do it, unless bfqq is in
+		 * interactive weight raising. We do not do it in the
+		 * latter subcase, for the following reason. bfqq may
+		 * be conveying the I/O needed to load a soft
+		 * real-time application. Such an application will
+		 * actually exhibit a soft real-time I/O pattern after
+		 * it finally starts doing its job. But, if
+		 * soft_rt_next_start is computed here for an
+		 * interactive bfqq, and bfqq had received a lot of
+		 * service before remaining with no outstanding
+		 * request (likely to happen on a fast device), then
+		 * soft_rt_next_start would be assigned such a high
+		 * value that, for a very long time, bfqq would be
+		 * prevented from being possibly considered as soft
+		 * real time.
+		 *
+		 * If, instead, the queue still has outstanding
+		 * requests, then we have to wait for the completion
+		 * of all the outstanding requests to discover whether
+		 * the request pattern is actually isochronous.
 		 */
-		if (bfqq->dispatched == 0)
+		if (bfqq->dispatched == 0 &&
+		    bfqq->wr_coeff != bfqd->bfq_wr_coeff)
 			bfqq->soft_rt_next_start =
 				bfq_bfqq_softrt_next_start(bfqd, bfqq);
-		else {
+		else if (bfqq->dispatched > 0) {
 			/*
 			 * Schedule an update of soft_rt_next_start to when
 			 * the task may be discovered to be isochronous.
@@ -4834,11 +4850,14 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
 	 * isochronous, and both requisites for this condition to hold
 	 * are now satisfied, then compute soft_rt_next_start (see the
 	 * comments on the function bfq_bfqq_softrt_next_start()). We
-	 * schedule this delayed check when bfqq expires, if it still
-	 * has in-flight requests.
+	 * do not compute soft_rt_next_start if bfqq is in interactive
+	 * weight raising (see the comments in bfq_bfqq_expire() for
+	 * an explanation). We schedule this delayed update when bfqq
+	 * expires, if it still has in-flight requests.
 	 */
 	if (bfq_bfqq_softrt_update(bfqq) && bfqq->dispatched == 0 &&
-	    RB_EMPTY_ROOT(&bfqq->sort_list))
+	    RB_EMPTY_ROOT(&bfqq->sort_list) &&
+	    bfqq->wr_coeff != bfqd->bfq_wr_coeff)
 		bfqq->soft_rt_next_start =
 			bfq_bfqq_softrt_next_start(bfqd, bfqq);
 

From patchwork Tue Jan 29 11:06:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 10786029
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org,
 bfq-iosched@googlegroups.com, oleksandr@natalenko.name,
 mancha@tower-research.com, Paolo Valente
Subject: [PATCH BUGFIX IMPROVEMENT 02/14] block, bfq: avoid selecting a queue w/o budget
Date: Tue, 29 Jan 2019 12:06:26 +0100
Message-Id: <20190129110638.12652-3-paolo.valente@linaro.org>
In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org>
References: <20190129110638.12652-1-paolo.valente@linaro.org>

To boost throughput on devices with internal queueing and in scenarios
where device idling is not strictly needed, bfq immediately starts
serving a new bfq_queue if the in-service bfq_queue remains without
pending I/O, even if new I/O may arrive soon for the latter queue. Then,
if such I/O actually arrives soon, bfq preempts the new in-service
bfq_queue so as to give the previous queue a chance to go on being
served (in case the previous queue should actually be the one to be
served, according to its timestamps).

However, the in-service bfq_queue, say Q, may also be without further
budget when it remains without pending I/O. Since bfq changes budgets
dynamically to fit the needs of bfq_queues, this happens more often
than one may expect. If this happens, then there is no point in trying
to go on serving Q when new I/O arrives for it soon: Q would be
expired immediately after being selected for service. This would only
cause useless overhead.

This commit avoids such a useless selection.

Signed-off-by: Paolo Valente
---
 block/bfq-iosched.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index c7a4a15c7c19..9ea2c4f42501 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -1380,7 +1380,15 @@ static bool bfq_bfqq_update_budg_for_activation(struct bfq_data *bfqd,
 {
 	struct bfq_entity *entity = &bfqq->entity;
 
-	if (bfq_bfqq_non_blocking_wait_rq(bfqq) && arrived_in_time) {
+	/*
+	 * In the next compound condition, we check also whether there
+	 * is some budget left, because otherwise there is no point in
+	 * trying to go on serving bfqq with this same budget: bfqq
+	 * would be expired immediately after being selected for
+	 * service. This would only cause useless overhead.
+	 */
+	if (bfq_bfqq_non_blocking_wait_rq(bfqq) && arrived_in_time &&
+	    bfq_bfqq_budget_left(bfqq) > 0) {
 		/*
 		 * We do not clear the flag non_blocking_wait_rq here, as
 		 * the latter is used in bfq_activate_bfqq to signal

From patchwork Tue Jan 29 11:06:27 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 10786035
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org,
 bfq-iosched@googlegroups.com, oleksandr@natalenko.name,
 mancha@tower-research.com, Paolo Valente
Subject: [PATCH BUGFIX IMPROVEMENT 03/14] block, bfq: make sure queue budgets are not below service received
Date: Tue, 29 Jan 2019 12:06:27 +0100
Message-Id: <20190129110638.12652-4-paolo.valente@linaro.org>
In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org>
References: <20190129110638.12652-1-paolo.valente@linaro.org>

With some unlucky sequences of events, the function
bfq_updated_next_req updates the current budget of a bfq_queue to a
lower value than the service received by the queue using such a
budget. Unfortunately, if this happens, then the return value of the
function bfq_bfqq_budget_left becomes inconsistent.

This commit solves this problem by lower-bounding the budget computed
in bfq_updated_next_req to the service currently charged to the queue.

Signed-off-by: Paolo Valente
---
 block/bfq-iosched.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 9ea2c4f42501..b0e8006475be 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -907,8 +907,10 @@ static void bfq_updated_next_req(struct bfq_data *bfqd,
 		 */
 		return;
 
-	new_budget = max_t(unsigned long, bfqq->max_budget,
-			   bfq_serv_to_charge(next_rq, bfqq));
+	new_budget = max_t(unsigned long,
+			   max_t(unsigned long, bfqq->max_budget,
+				 bfq_serv_to_charge(next_rq, bfqq)),
+			   entity->service);
 	if (entity->budget != new_budget) {
 		entity->budget = new_budget;
 		bfq_log_bfqq(bfqd, bfqq, "updated next rq: new budget %lu",

From patchwork Tue Jan 29 11:06:28 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 10786033
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org,
 bfq-iosched@googlegroups.com, oleksandr@natalenko.name,
 mancha@tower-research.com, Paolo Valente
Subject: [PATCH BUGFIX IMPROVEMENT 04/14] block, bfq: remove case of redirected bic from insert_request
Date: Tue, 29 Jan 2019 12:06:28 +0100
Message-Id: <20190129110638.12652-5-paolo.valente@linaro.org>
In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org>
References: <20190129110638.12652-1-paolo.valente@linaro.org>

Before commit 18e5a57d7987 ("block, bfq: postpone rq preparation to
insert or merge"), the destination queue for a request was chosen by a
different hook than the one that then inserted the request. So,
between the execution of the two hooks, the bic of the process
generating the request could happen to be redirected to a different
bfq_queue. As a consequence, the destination bfq_queue stored in the
request could be wrong. Such an event does not need to be handled any
longer.

Signed-off-by: Paolo Valente
---
 block/bfq-iosched.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index b0e8006475be..a9275ed57726 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4633,8 +4633,6 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
 	bool waiting, idle_timer_disabled = false;
 
 	if (new_bfqq) {
-		if (bic_to_bfqq(RQ_BIC(rq), 1) != bfqq)
-			new_bfqq = bic_to_bfqq(RQ_BIC(rq), 1);
 		/*
 		 * Release the request's reference to the old bfqq
 		 * and make sure one is taken to the shared queue.

From patchwork Tue Jan 29 11:06:29 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 10786027
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org,
 bfq-iosched@googlegroups.com, oleksandr@natalenko.name,
 mancha@tower-research.com, Paolo Valente
Subject: [PATCH BUGFIX IMPROVEMENT 05/14] block, bfq: consider also ioprio classes in symmetry detection
Date: Tue, 29 Jan 2019 12:06:29 +0100
Message-Id: <20190129110638.12652-6-paolo.valente@linaro.org>
In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org>
References: <20190129110638.12652-1-paolo.valente@linaro.org>

In asymmetric scenarios, i.e., when some bfq_queue or bfq_group needs to
be guaranteed a different bandwidth than other bfq_queues or
bfq_groups, these service guarantees can be provided only by plugging
I/O dispatch, completely or partially, when the queue in service
remains temporarily empty. A case where asymmetry is particularly
strong is when some active bfq_queues belong to a higher-priority
class than some other active bfq_queues. Unfortunately, this important
case is not considered at all in the code for detecting asymmetric
scenarios. This commit adds the missing logic.

Signed-off-by: Paolo Valente
---
 block/bfq-iosched.c | 86 ++++++++++++++++++++++++---------------------
 block/bfq-iosched.h |  8 +++--
 block/bfq-wf2q.c    | 12 +++++--
 3 files changed, 59 insertions(+), 47 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index a9275ed57726..6bfbfa65610b 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -623,26 +623,6 @@ void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 		bfqq->pos_root = NULL;
 }
 
-/*
- * Tell whether there are active queues with different weights or
- * active groups.
- */
-static bool bfq_varied_queue_weights_or_active_groups(struct bfq_data *bfqd)
-{
-	/*
-	 * For queue weights to differ, queue_weights_tree must contain
-	 * at least two nodes.
-	 */
-	return (!RB_EMPTY_ROOT(&bfqd->queue_weights_tree) &&
-		(bfqd->queue_weights_tree.rb_node->rb_left ||
-		 bfqd->queue_weights_tree.rb_node->rb_right)
-#ifdef CONFIG_BFQ_GROUP_IOSCHED
-	       ) ||
-		(bfqd->num_groups_with_pending_reqs > 0
-#endif
-	       );
-}
-
 /*
  * The following function returns true if every queue must receive the
  * same share of the throughput (this condition is used when deciding
@@ -651,25 +631,48 @@ static bool bfq_varied_queue_weights_or_active_groups(struct bfq_data *bfqd)
  *
  * Such a scenario occurs when:
  * 1) all active queues have the same weight,
- * 2) all active groups at the same level in the groups tree have the same
- *    weight,
+ * 2) all active queues belong to the same I/O-priority class,
  * 3) all active groups at the same level in the groups tree have the same
+ *    weight,
+ * 4) all active groups at the same level in the groups tree have the same
  *    number of children.
  *
  * Unfortunately, keeping the necessary state for evaluating exactly
  * the last two symmetry sub-conditions above would be quite complex
- * and time consuming. Therefore this function evaluates, instead,
- * only the following stronger two sub-conditions, for which it is
+ * and time consuming. Therefore this function evaluates, instead,
+ * only the following stronger three sub-conditions, for which it is
  * much easier to maintain the needed state:
  * 1) all active queues have the same weight,
- * 2) there are no active groups.
+ * 2) all active queues belong to the same I/O-priority class,
+ * 3) there are no active groups.
  * In particular, the last condition is always true if hierarchical
  * support or the cgroups interface are not enabled, thus no state
  * needs to be maintained in this case.
  */
 static bool bfq_symmetric_scenario(struct bfq_data *bfqd)
 {
-	return !bfq_varied_queue_weights_or_active_groups(bfqd);
+	/*
+	 * For queue weights to differ, queue_weights_tree must contain
+	 * at least two nodes.
+	 */
+	bool varied_queue_weights = !RB_EMPTY_ROOT(&bfqd->queue_weights_tree) &&
+		(bfqd->queue_weights_tree.rb_node->rb_left ||
+		 bfqd->queue_weights_tree.rb_node->rb_right);
+
+	bool multiple_classes_busy =
+		(bfqd->busy_queues[0] && bfqd->busy_queues[1]) ||
+		(bfqd->busy_queues[0] && bfqd->busy_queues[2]) ||
+		(bfqd->busy_queues[1] && bfqd->busy_queues[2]);
+
+	/*
+	 * For queue weights to differ, queue_weights_tree must contain
+	 * at least two nodes.
+	 */
+	return !(varied_queue_weights || multiple_classes_busy
+#ifdef CONFIG_BFQ_GROUP_IOSCHED
+	       || bfqd->num_groups_with_pending_reqs > 0
+#endif
+		);
 }
 
 /*
@@ -728,15 +731,14 @@ void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	/*
 	 * In the unlucky event of an allocation failure, we just
 	 * exit. This will cause the weight of queue to not be
-	 * considered in bfq_varied_queue_weights_or_active_groups,
-	 * which, in its turn, causes the scenario to be deemed
-	 * wrongly symmetric in case bfqq's weight would have been
-	 * the only weight making the scenario asymmetric. On the
-	 * bright side, no unbalance will however occur when bfqq
-	 * becomes inactive again (the invocation of this function
-	 * is triggered by an activation of queue). In fact,
-	 * bfq_weights_tree_remove does nothing if
-	 * !bfqq->weight_counter.
+	 * considered in bfq_symmetric_scenario, which, in its turn,
+	 * causes the scenario to be deemed wrongly symmetric in case
+	 * bfqq's weight would have been the only weight making the
+	 * scenario asymmetric. On the bright side, no unbalance will
+	 * however occur when bfqq becomes inactive again (the
+	 * invocation of this function is triggered by an activation
+	 * of queue). In fact, bfq_weights_tree_remove does nothing
+	 * if !bfqq->weight_counter.
 	 */
 	if (unlikely(!bfqq->weight_counter))
 		return;
@@ -2227,7 +2229,7 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		return NULL;
 
 	/* If there is only one backlogged queue, don't search. */
-	if (bfqd->busy_queues == 1)
+	if (bfq_tot_busy_queues(bfqd) == 1)
 		return NULL;
 
 	in_service_bfqq = bfqd->in_service_queue;
@@ -3681,7 +3683,8 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq)
 	 * the requests already queued in the device have been served.
 	 */
 	asymmetric_scenario = (bfqq->wr_coeff > 1 &&
-			       bfqd->wr_busy_queues < bfqd->busy_queues) ||
+			       bfqd->wr_busy_queues <
+			       bfq_tot_busy_queues(bfqd)) ||
 		!bfq_symmetric_scenario(bfqd);
 
 	/*
@@ -3960,7 +3963,7 @@ static struct request *bfq_dispatch_rq_from_bfqq(struct bfq_data *bfqd,
 	 * belongs to CLASS_IDLE and other queues are waiting for
 	 * service.
 	 */
-	if (!(bfqd->busy_queues > 1 && bfq_class_idle(bfqq)))
+	if (!(bfq_tot_busy_queues(bfqd) > 1 && bfq_class_idle(bfqq)))
 		goto return_rq;
 
 	bfq_bfqq_expire(bfqd, bfqq, false, BFQQE_BUDGET_EXHAUSTED);
@@ -3978,7 +3981,7 @@ static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
 	 * most a call to dispatch for nothing
 	 */
 	return !list_empty_careful(&bfqd->dispatch) ||
-		bfqd->busy_queues > 0;
+		bfq_tot_busy_queues(bfqd) > 0;
 }
 
 static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
@@ -4032,9 +4035,10 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 		goto start_rq;
 	}
 
-	bfq_log(bfqd, "dispatch requests: %d busy queues", bfqd->busy_queues);
+	bfq_log(bfqd, "dispatch requests: %d busy queues",
+		bfq_tot_busy_queues(bfqd));
 
-	if (bfqd->busy_queues == 0)
+	if (bfq_tot_busy_queues(bfqd) == 0)
 		goto exit;
 
 	/*
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 0b02bf302de0..30be669be465 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -501,10 +501,11 @@ struct bfq_data {
 	unsigned int num_groups_with_pending_reqs;
 
 	/*
-	 * Number of bfq_queues containing requests (including the
-	 * queue in service, even if it is idling).
+	 * Per-class (RT, BE, IDLE) number of bfq_queues containing
+	 * requests (including the queue in service, even if it is
+	 * idling).
*/ - int busy_queues; + unsigned int busy_queues[3]; /* number of weight-raised busy @bfq_queues */ int wr_busy_queues; /* number of queued requests */ @@ -974,6 +975,7 @@ extern struct blkcg_policy blkcg_policy_bfq; struct bfq_group *bfq_bfqq_to_bfqg(struct bfq_queue *bfqq); struct bfq_queue *bfq_entity_to_bfqq(struct bfq_entity *entity); +unsigned int bfq_tot_busy_queues(struct bfq_data *bfqd); struct bfq_service_tree *bfq_entity_service_tree(struct bfq_entity *entity); struct bfq_entity *bfq_entity_of(struct rb_node *node); unsigned short bfq_ioprio_to_weight(int ioprio); diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c index 72adbbe975d5..ce37d709a34f 100644 --- a/block/bfq-wf2q.c +++ b/block/bfq-wf2q.c @@ -44,6 +44,12 @@ static unsigned int bfq_class_idx(struct bfq_entity *entity) BFQ_DEFAULT_GRP_CLASS - 1; } +unsigned int bfq_tot_busy_queues(struct bfq_data *bfqd) +{ + return bfqd->busy_queues[0] + bfqd->busy_queues[1] + + bfqd->busy_queues[2]; +} + static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd, bool expiration); @@ -1513,7 +1519,7 @@ struct bfq_queue *bfq_get_next_queue(struct bfq_data *bfqd) struct bfq_sched_data *sd; struct bfq_queue *bfqq; - if (bfqd->busy_queues == 0) + if (bfq_tot_busy_queues(bfqd) == 0) return NULL; /* @@ -1665,7 +1671,7 @@ void bfq_del_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq, bfq_clear_bfqq_busy(bfqq); - bfqd->busy_queues--; + bfqd->busy_queues[bfqq->ioprio_class - 1]--; if (!bfqq->dispatched) bfq_weights_tree_remove(bfqd, bfqq); @@ -1688,7 +1694,7 @@ void bfq_add_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq) bfq_activate_bfqq(bfqd, bfqq); bfq_mark_bfqq_busy(bfqq); - bfqd->busy_queues++; + bfqd->busy_queues[bfqq->ioprio_class - 1]++; if (!bfqq->dispatched) if (bfqq->wr_coeff == 1) From patchwork Tue Jan 29 11:06:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10786023 
From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 06/14] block, bfq: split function bfq_better_to_idle Date: Tue, 29 Jan 2019 12:06:30 +0100 Message-Id: <20190129110638.12652-7-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References: <20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is a preparatory commit for commits that need to check only one of the two main reasons for idling.
This change should also improve the quality of the code a little bit, by splitting a function that contains very long, non-trivial and loosely related comments. Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 155 +++++++++++++++++++++++--------------------- 1 file changed, 82 insertions(+), 73 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 6bfbfa65610b..2756f4b1432b 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -3404,53 +3404,13 @@ static bool bfq_may_expire_for_budg_timeout(struct bfq_queue *bfqq) bfq_bfqq_budget_timeout(bfqq); } -/* - * For a queue that becomes empty, device idling is allowed only if - * this function returns true for the queue. As a consequence, since - * device idling plays a critical role in both throughput boosting and - * service guarantees, the return value of this function plays a - * critical role in both these aspects as well. - * - * In a nutshell, this function returns true only if idling is - * beneficial for throughput or, even if detrimental for throughput, - * idling is however necessary to preserve service guarantees (low - * latency, desired throughput distribution, ...). In particular, on - * NCQ-capable devices, this function tries to return false, so as to - * help keep the drives' internal queues full, whenever this helps the - * device boost the throughput without causing any service-guarantee - * issue. - * - * In more detail, the return value of this function is obtained by, - * first, computing a number of boolean variables that take into - * account throughput and service-guarantee issues, and, then, - * combining these variables in a logical expression. Most of the - * issues taken into account are not trivial. We discuss these issues - * individually while introducing the variables.
- */ -static bool bfq_better_to_idle(struct bfq_queue *bfqq) +static bool idling_boosts_thr_without_issues(struct bfq_data *bfqd, + struct bfq_queue *bfqq) { - struct bfq_data *bfqd = bfqq->bfqd; bool rot_without_queueing = !blk_queue_nonrot(bfqd->queue) && !bfqd->hw_tag, bfqq_sequential_and_IO_bound, - idling_boosts_thr, idling_boosts_thr_without_issues, - idling_needed_for_service_guarantees, - asymmetric_scenario; - - if (bfqd->strict_guarantees) - return true; - - /* - * Idling is performed only if slice_idle > 0. In addition, we - * do not idle if - * (a) bfqq is async - * (b) bfqq is in the idle io prio class: in this case we do - * not idle because we want to minimize the bandwidth that - * queues in this class can steal to higher-priority queues - */ - if (bfqd->bfq_slice_idle == 0 || !bfq_bfqq_sync(bfqq) || - bfq_class_idle(bfqq)) - return false; + idling_boosts_thr; bfqq_sequential_and_IO_bound = !BFQQ_SEEKY(bfqq) && bfq_bfqq_IO_bound(bfqq) && bfq_bfqq_has_short_ttime(bfqq); @@ -3482,8 +3442,7 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq) bfqq_sequential_and_IO_bound); /* - * The value of the next variable, - * idling_boosts_thr_without_issues, is equal to that of + * The return value of this function is equal to that of * idling_boosts_thr, unless a special case holds. In this * special case, described below, idling may cause problems to * weight-raised queues. @@ -3500,32 +3459,35 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq) * which enqueue several requests in advance, and further * reorder internally-queued requests. * - * For this reason, we force to false the value of - * idling_boosts_thr_without_issues if there are weight-raised - * busy queues. In this case, and if bfqq is not weight-raised, - * this guarantees that the device is not idled for bfqq (if, - * instead, bfqq is weight-raised, then idling will be - * guaranteed by another variable, see below). 
Combined with - * the timestamping rules of BFQ (see [1] for details), this - * behavior causes bfqq, and hence any sync non-weight-raised - * queue, to get a lower number of requests served, and thus - * to ask for a lower number of requests from the request - * pool, before the busy weight-raised queues get served - * again. This often mitigates starvation problems in the - * presence of heavy write workloads and NCQ, thereby - * guaranteeing a higher application and system responsiveness - * in these hostile scenarios. + * For this reason, we force to false the return value if + * there are weight-raised busy queues. In this case, and if + * bfqq is not weight-raised, this guarantees that the device + * is not idled for bfqq (if, instead, bfqq is weight-raised, + * then idling will be guaranteed by another variable, see + * below). Combined with the timestamping rules of BFQ (see + * [1] for details), this behavior causes bfqq, and hence any + * sync non-weight-raised queue, to get a lower number of + * requests served, and thus to ask for a lower number of + * requests from the request pool, before the busy + * weight-raised queues get served again. This often mitigates + * starvation problems in the presence of heavy write + * workloads and NCQ, thereby guaranteeing a higher + * application and system responsiveness in these hostile + * scenarios. */ - idling_boosts_thr_without_issues = idling_boosts_thr && + return idling_boosts_thr && bfqd->wr_busy_queues == 0; +} +static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd, + struct bfq_queue *bfqq) +{ /* - * There is then a case where idling must be performed not - * for throughput concerns, but to preserve service - * guarantees. + * There is a case where idling must be performed not for + * throughput concerns, but to preserve service guarantees. 
* * To introduce this case, we can note that allowing the drive - * to enqueue more than one request at a time, and hence + * to enqueue more than one request at a time, and thereby * delegating de facto final scheduling decisions to the * drive's internal scheduler, entails loss of control on the * actual request service order. In particular, the critical @@ -3682,9 +3644,9 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq) * to let requests be served in the desired order until all * the requests already queued in the device have been served. */ - asymmetric_scenario = (bfqq->wr_coeff > 1 && - bfqd->wr_busy_queues < - bfq_tot_busy_queues(bfqd)) || + bool asymmetric_scenario = (bfqq->wr_coeff > 1 && + bfqd->wr_busy_queues < + bfq_tot_busy_queues(bfqd)) || !bfq_symmetric_scenario(bfqd); /* @@ -3701,17 +3663,64 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq) * now establish when idling is actually needed to preserve * service guarantees. */ - idling_needed_for_service_guarantees = - asymmetric_scenario && !bfq_bfqq_in_large_burst(bfqq); + return asymmetric_scenario && !bfq_bfqq_in_large_burst(bfqq); +} + +/* + * For a queue that becomes empty, device idling is allowed only if + * this function returns true for that queue. As a consequence, since + * device idling plays a critical role for both throughput boosting + * and service guarantees, the return value of this function plays a + * critical role as well. + * + * In a nutshell, this function returns true only if idling is + * beneficial for throughput or, even if detrimental for throughput, + * idling is however necessary to preserve service guarantees (low + * latency, desired throughput distribution, ...). In particular, on + * NCQ-capable devices, this function tries to return false, so as to + * help keep the drives' internal queues full, whenever this helps the + * device boost the throughput without causing any service-guarantee + * issue. 
+ * + * Most of the issues taken into account to get the return value of + * this function are not trivial. We discuss these issues in the two + * functions providing the main pieces of information needed by this + * function. + */ +static bool bfq_better_to_idle(struct bfq_queue *bfqq) +{ + struct bfq_data *bfqd = bfqq->bfqd; + bool idling_boosts_thr_with_no_issue, idling_needed_for_service_guar; + + if (unlikely(bfqd->strict_guarantees)) + return true; + + /* + * Idling is performed only if slice_idle > 0. In addition, we + * do not idle if + * (a) bfqq is async + * (b) bfqq is in the idle io prio class: in this case we do + * not idle because we want to minimize the bandwidth that + * queues in this class can steal to higher-priority queues + */ + if (bfqd->bfq_slice_idle == 0 || !bfq_bfqq_sync(bfqq) || + bfq_class_idle(bfqq)) + return false; + + idling_boosts_thr_with_no_issue = + idling_boosts_thr_without_issues(bfqd, bfqq); + + idling_needed_for_service_guar = + idling_needed_for_service_guarantees(bfqd, bfqq); /* - * We have now all the components we need to compute the + * We have now the two components we need to compute the * return value of the function, which is true only if idling * either boosts the throughput (without issues), or is * necessary to preserve service guarantees. 
*/ - return idling_boosts_thr_without_issues || - idling_needed_for_service_guarantees; + return idling_boosts_thr_with_no_issue || + idling_needed_for_service_guar; } /* From patchwork Tue Jan 29 11:06:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10786025 From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 07/14] block, bfq: do not plug I/O of in-service queue when harmful Date: Tue, 29 Jan 2019 12:06:31 +0100 Message-Id: <20190129110638.12652-8-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References:
<20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If the in-service bfq_queue is sync and remains temporarily idle, then I/O dispatching (from other queues) may be plugged. It may be done for two reasons: either to boost throughput, or to preserve the bandwidth share of the in-service queue. In the first case, if the I/O of the in-service queue, when it finally arrives, consists only of one small I/O request, then it makes sense to plug even the I/O of the in-service queue. In fact, serving such a small request immediately is likely to lower throughput instead of boosting it, whereas waiting a little bit is likely to let that request grow, thanks to request merging, and become more profitable in terms of throughput (this is likely to happen exactly because the I/O of the queue has been detected to boost throughput). On the opposite end, if I/O dispatching is being plugged only to preserve the bandwidth of the in-service queue, then it would be better not to plug also the I/O of the in-service queue, because such a plugging is likely to cause only loss of bandwidth for the queue. Unfortunately, no distinction is made between the two cases, and the I/O of the in-service queue is always plugged in case just a small I/O request arrives. This commit draws this missing distinction and does not perform harmful plugging.
Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 2756f4b1432b..a6fe60114ade 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -4599,28 +4599,31 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq, bool budget_timeout = bfq_bfqq_budget_timeout(bfqq); /* - * There is just this request queued: if the request - * is small and the queue is not to be expired, then - * just exit. + * There is just this request queued: if + * - the request is small, and + * - we are idling to boost throughput, and + * - the queue is not to be expired, + * then just exit. * * In this way, if the device is being idled to wait * for a new request from the in-service queue, we * avoid unplugging the device and committing the - * device to serve just a small request. On the - * contrary, we wait for the block layer to decide - * when to unplug the device: hopefully, new requests - * will be merged to this one quickly, then the device - * will be unplugged and larger requests will be - * dispatched. + * device to serve just a small request. In contrast + * we wait for the block layer to decide when to + * unplug the device: hopefully, new requests will be + * merged to this one quickly, then the device will be + * unplugged and larger requests will be dispatched. */ - if (small_req && !budget_timeout) + if (small_req && idling_boosts_thr_without_issues(bfqd, bfqq) && + !budget_timeout) return; /* - * A large enough request arrived, or the queue is to - * be expired: in both cases disk idling is to be - * stopped, so clear wait_request flag and reset - * timer. + * A large enough request arrived, or idling is being + * performed to preserve service guarantees, or + * finally the queue is to be expired: in all these + * cases disk idling is to be stopped, so clear + * wait_request flag and reset timer. 
*/ bfq_clear_bfqq_wait_request(bfqq); hrtimer_try_to_cancel(&bfqd->idle_slice_timer); From patchwork Tue Jan 29 11:06:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10786017 From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 08/14] block, bfq: unconditionally plug I/O in asymmetric scenarios Date: Tue, 29 Jan 2019 12:06:32 +0100 Message-Id: <20190129110638.12652-9-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References: <20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID:
X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP bfq detects the creation of multiple bfq_queues shortly after each other, namely a burst of queue creations in the terminology used in the code. If the burst is large, then no queue in the burst is granted - either I/O-dispatch plugging when the queue remains temporarily idle while in service; - or weight raising, because it causes even longer plugging. In fact, such a plugging tends to lower throughput, while these bursts are typically due to applications or services that spawn multiple processes, to reach a common goal as soon as possible. Examples are a "git grep" or the booting of a system. Unfortunately, disabling plugging may cause a loss of service guarantees in asymmetric scenarios, i.e., if queue weights are differentiated or if more than one group is active. This commit addresses this issue by no longer disabling I/O-dispatch plugging for queues in large bursts. Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 346 +++++++++++++++++++++----------------------- 1 file changed, 165 insertions(+), 181 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index a6fe60114ade..c1bb5e5fcdc4 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -3479,191 +3479,175 @@ static bool idling_boosts_thr_without_issues(struct bfq_data *bfqd, bfqd->wr_busy_queues == 0; } +/* + * There is a case where idling must be performed not for + * throughput concerns, but to preserve service guarantees. + * + * To introduce this case, we can note that allowing the drive + * to enqueue more than one request at a time, and hence + * delegating de facto final scheduling decisions to the + * drive's internal scheduler, entails loss of control on the + * actual request service order. In particular, the critical + * situation is when requests from different processes happen + * to be present, at the same time, in the internal queue(s) + * of the drive. 
In such a situation, the drive, by deciding + * the service order of the internally-queued requests, does + * determine also the actual throughput distribution among + * these processes. But the drive typically has no notion or + * concern about per-process throughput distribution, and + * makes its decisions only on a per-request basis. Therefore, + * the service distribution enforced by the drive's internal + * scheduler is likely to coincide with the desired + * device-throughput distribution only in a completely + * symmetric scenario where: + * (i) each of these processes must get the same throughput as + * the others; + * (ii) the I/O of each process has the same properties, in + * terms of locality (sequential or random), direction + * (reads or writes), request sizes, greediness + * (from I/O-bound to sporadic), and so on. + * In fact, in such a scenario, the drive tends to treat + * the requests of each of these processes in about the same + * way as the requests of the others, and thus to provide + * each of these processes with about the same throughput + * (which is exactly the desired throughput distribution). In + * contrast, in any asymmetric scenario, device idling is + * certainly needed to guarantee that bfqq receives its + * assigned fraction of the device throughput (see [1] for + * details). + * The problem is that idling may significantly reduce + * throughput with certain combinations of types of I/O and + * devices. An important example is sync random I/O, on flash + * storage with command queueing. So, unless bfqq falls in the + * above cases where idling also boosts throughput, it would + * be important to check conditions (i) and (ii) accurately, + * so as to avoid idling when not strictly needed for service + * guarantees. + * + * Unfortunately, it is extremely difficult to thoroughly + * check condition (ii). And, in case there are active groups, + * it becomes very difficult to check condition (i) too. 
In + * fact, if there are active groups, then, for condition (i) + * to become false, it is enough that an active group contains + * more active processes or sub-groups than some other active + * group. More precisely, for condition (i) to hold because of + * such a group, it is not even necessary that the group is + * (still) active: it is sufficient that, even if the group + * has become inactive, some of its descendant processes still + * have some request already dispatched but still waiting for + * completion. In fact, requests have still to be guaranteed + * their share of the throughput even after being + * dispatched. In this respect, it is easy to show that, if a + * group frequently becomes inactive while still having + * in-flight requests, and if, when this happens, the group is + * not considered in the calculation of whether the scenario + * is asymmetric, then the group may fail to be guaranteed its + * fair share of the throughput (basically because idling may + * not be performed for the descendant processes of the group, + * but it had to be). We address this issue with the + * following bi-modal behavior, implemented in the function + * bfq_symmetric_scenario(). + * + * If there are groups with requests waiting for completion + * (as commented above, some of these groups may even be + * already inactive), then the scenario is tagged as + * asymmetric, conservatively, without checking any of the + * conditions (i) and (ii). So the device is idled for bfqq. + * This behavior matches also the fact that groups are created + * exactly if controlling I/O is a primary concern (to + * preserve bandwidth and latency guarantees). + * + * On the opposite end, if there are no groups with requests + * waiting for completion, then only condition (i) is actually + * controlled, i.e., provided that condition (i) holds, idling + * is not performed, regardless of whether condition (ii) + * holds. 
In other words, only if condition (i) does not hold, + * then idling is allowed, and the device tends to be + * prevented from queueing many requests, possibly of several + * processes. Since there are no groups with requests waiting + * for completion, then, to control condition (i) it is enough + * to check just whether all the queues with requests waiting + * for completion also have the same weight. + * + * Not checking condition (ii) evidently exposes bfqq to the + * risk of getting less throughput than its fair share. + * However, for queues with the same weight, a further + * mechanism, preemption, mitigates or even eliminates this + * problem. And it does so without consequences on overall + * throughput. This mechanism and its benefits are explained + * in the next three paragraphs. + * + * Even if a queue, say Q, is expired when it remains idle, Q + * can still preempt the new in-service queue if the next + * request of Q arrives soon (see the comments on + * bfq_bfqq_update_budg_for_activation). If all queues and + * groups have the same weight, this form of preemption, + * combined with the hole-recovery heuristic described in the + * comments on function bfq_bfqq_update_budg_for_activation, + * are enough to preserve a correct bandwidth distribution in + * the mid term, even without idling. In fact, even if not + * idling allows the internal queues of the device to contain + * many requests, and thus to reorder requests, we can rather + * safely assume that the internal scheduler still preserves a + * minimum of mid-term fairness. + * + * More precisely, this preemption-based, idleless approach + * provides fairness in terms of IOPS, and not sectors per + * second. This can be seen with a simple example. Suppose + * that there are two queues with the same weight, but that + * the first queue receives requests of 8 sectors, while the + * second queue receives requests of 1024 sectors. 
In + * addition, suppose that each of the two queues contains at + * most one request at a time, which implies that each queue + * always remains idle after it is served. Finally, after + * remaining idle, each queue receives very quickly a new + * request. It follows that the two queues are served + * alternatively, preempting each other if needed. This + * implies that, although both queues have the same weight, + * the queue with large requests receives a service that is + * 1024/8 times as high as the service received by the other + * queue. + * + * The motivation for using preemption instead of idling (for + * queues with the same weight) is that, by not idling, + * service guarantees are preserved (completely or at least in + * part) without minimally sacrificing throughput. And, if + * there is no active group, then the primary expectation for + * this device is probably a high throughput. + * + * We are now left only with explaining the additional + * compound condition that is checked below for deciding + * whether the scenario is asymmetric. To explain this + * compound condition, we need to add that the function + * bfq_symmetric_scenario checks the weights of only + * non-weight-raised queues, for efficiency reasons (see + * comments on bfq_weights_tree_add()). Then the fact that + * bfqq is weight-raised is checked explicitly here. More + * precisely, the compound condition below takes into account + * also the fact that, even if bfqq is being weight-raised, + * the scenario is still symmetric if all queues with requests + * waiting for completion happen to be + * weight-raised. Actually, we should be even more precise + * here, and differentiate between interactive weight raising + * and soft real-time weight raising. 
+ * + * As a side note, it is worth considering that the above + * device-idling countermeasures may however fail in the + * following unlucky scenario: if idling is (correctly) + * disabled in a time period during which all symmetry + * sub-conditions hold, and hence the device is allowed to + * enqueue many requests, but at some later point in time some + * sub-condition stops to hold, then it may become impossible + * to let requests be served in the desired order until all + * the requests already queued in the device have been served. + */ static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd, struct bfq_queue *bfqq) { - /* - * There is a case where idling must be performed not for - * throughput concerns, but to preserve service guarantees. - * - * To introduce this case, we can note that allowing the drive - * to enqueue more than one request at a time, and thereby - * delegating de facto final scheduling decisions to the - * drive's internal scheduler, entails loss of control on the - * actual request service order. In particular, the critical - * situation is when requests from different processes happen - * to be present, at the same time, in the internal queue(s) - * of the drive. In such a situation, the drive, by deciding - * the service order of the internally-queued requests, does - * determine also the actual throughput distribution among - * these processes. But the drive typically has no notion or - * concern about per-process throughput distribution, and - * makes its decisions only on a per-request basis. 
Therefore, - * the service distribution enforced by the drive's internal - * scheduler is likely to coincide with the desired - * device-throughput distribution only in a completely - * symmetric scenario where: - * (i) each of these processes must get the same throughput as - * the others; - * (ii) the I/O of each process has the same properties, in - * terms of locality (sequential or random), direction - * (reads or writes), request sizes, greediness - * (from I/O-bound to sporadic), and so on. - * In fact, in such a scenario, the drive tends to treat - * the requests of each of these processes in about the same - * way as the requests of the others, and thus to provide - * each of these processes with about the same throughput - * (which is exactly the desired throughput distribution). In - * contrast, in any asymmetric scenario, device idling is - * certainly needed to guarantee that bfqq receives its - * assigned fraction of the device throughput (see [1] for - * details). - * The problem is that idling may significantly reduce - * throughput with certain combinations of types of I/O and - * devices. An important example is sync random I/O, on flash - * storage with command queueing. So, unless bfqq falls in the - * above cases where idling also boosts throughput, it would - * be important to check conditions (i) and (ii) accurately, - * so as to avoid idling when not strictly needed for service - * guarantees. - * - * Unfortunately, it is extremely difficult to thoroughly - * check condition (ii). And, in case there are active groups, - * it becomes very difficult to check condition (i) too. In - * fact, if there are active groups, then, for condition (i) - * to become false, it is enough that an active group contains - * more active processes or sub-groups than some other active - * group. 
More precisely, for condition (i) to hold because of - * such a group, it is not even necessary that the group is - * (still) active: it is sufficient that, even if the group - * has become inactive, some of its descendant processes still - * have some request already dispatched but still waiting for - * completion. In fact, requests have still to be guaranteed - * their share of the throughput even after being - * dispatched. In this respect, it is easy to show that, if a - * group frequently becomes inactive while still having - * in-flight requests, and if, when this happens, the group is - * not considered in the calculation of whether the scenario - * is asymmetric, then the group may fail to be guaranteed its - * fair share of the throughput (basically because idling may - * not be performed for the descendant processes of the group, - * but it had to be). We address this issue with the - * following bi-modal behavior, implemented in the function - * bfq_symmetric_scenario(). - * - * If there are groups with requests waiting for completion - * (as commented above, some of these groups may even be - * already inactive), then the scenario is tagged as - * asymmetric, conservatively, without checking any of the - * conditions (i) and (ii). So the device is idled for bfqq. - * This behavior matches also the fact that groups are created - * exactly if controlling I/O is a primary concern (to - * preserve bandwidth and latency guarantees). - * - * On the opposite end, if there are no groups with requests - * waiting for completion, then only condition (i) is actually - * controlled, i.e., provided that condition (i) holds, idling - * is not performed, regardless of whether condition (ii) - * holds. In other words, only if condition (i) does not hold, - * then idling is allowed, and the device tends to be - * prevented from queueing many requests, possibly of several - * processes. 
Since there are no groups with requests waiting - * for completion, then, to control condition (i) it is enough - * to check just whether all the queues with requests waiting - * for completion also have the same weight. - * - * Not checking condition (ii) evidently exposes bfqq to the - * risk of getting less throughput than its fair share. - * However, for queues with the same weight, a further - * mechanism, preemption, mitigates or even eliminates this - * problem. And it does so without consequences on overall - * throughput. This mechanism and its benefits are explained - * in the next three paragraphs. - * - * Even if a queue, say Q, is expired when it remains idle, Q - * can still preempt the new in-service queue if the next - * request of Q arrives soon (see the comments on - * bfq_bfqq_update_budg_for_activation). If all queues and - * groups have the same weight, this form of preemption, - * combined with the hole-recovery heuristic described in the - * comments on function bfq_bfqq_update_budg_for_activation, - * are enough to preserve a correct bandwidth distribution in - * the mid term, even without idling. In fact, even if not - * idling allows the internal queues of the device to contain - * many requests, and thus to reorder requests, we can rather - * safely assume that the internal scheduler still preserves a - * minimum of mid-term fairness. - * - * More precisely, this preemption-based, idleless approach - * provides fairness in terms of IOPS, and not sectors per - * second. This can be seen with a simple example. Suppose - * that there are two queues with the same weight, but that - * the first queue receives requests of 8 sectors, while the - * second queue receives requests of 1024 sectors. In - * addition, suppose that each of the two queues contains at - * most one request at a time, which implies that each queue - * always remains idle after it is served. 
Finally, after - * remaining idle, each queue receives very quickly a new - * request. It follows that the two queues are served - * alternatively, preempting each other if needed. This - * implies that, although both queues have the same weight, - * the queue with large requests receives a service that is - * 1024/8 times as high as the service received by the other - * queue. - * - * The motivation for using preemption instead of idling (for - * queues with the same weight) is that, by not idling, - * service guarantees are preserved (completely or at least in - * part) without minimally sacrificing throughput. And, if - * there is no active group, then the primary expectation for - * this device is probably a high throughput. - * - * We are now left only with explaining the additional - * compound condition that is checked below for deciding - * whether the scenario is asymmetric. To explain this - * compound condition, we need to add that the function - * bfq_symmetric_scenario checks the weights of only - * non-weight-raised queues, for efficiency reasons (see - * comments on bfq_weights_tree_add()). Then the fact that - * bfqq is weight-raised is checked explicitly here. More - * precisely, the compound condition below takes into account - * also the fact that, even if bfqq is being weight-raised, - * the scenario is still symmetric if all queues with requests - * waiting for completion happen to be - * weight-raised. Actually, we should be even more precise - * here, and differentiate between interactive weight raising - * and soft real-time weight raising. 
- * - * As a side note, it is worth considering that the above - * device-idling countermeasures may however fail in the - * following unlucky scenario: if idling is (correctly) - * disabled in a time period during which all symmetry - * sub-conditions hold, and hence the device is allowed to - * enqueue many requests, but at some later point in time some - * sub-condition stops to hold, then it may become impossible - * to let requests be served in the desired order until all - * the requests already queued in the device have been served. - */ - bool asymmetric_scenario = (bfqq->wr_coeff > 1 && - bfqd->wr_busy_queues < - bfq_tot_busy_queues(bfqd)) || + return (bfqq->wr_coeff > 1 && + bfqd->wr_busy_queues < + bfq_tot_busy_queues(bfqd)) || !bfq_symmetric_scenario(bfqd); - - /* - * Finally, there is a case where maximizing throughput is the - * best choice even if it may cause unfairness toward - * bfqq. Such a case is when bfqq became active in a burst of - * queue activations. Queues that became active during a large - * burst benefit only from throughput, as discussed in the - * comments on bfq_handle_burst. Thus, if bfqq became active - * in a burst and not idling the device maximizes throughput, - * then the device must no be idled, because not idling the - * device provides bfqq and all other queues in the burst with - * maximum benefit. Combining this and the above case, we can - * now establish when idling is actually needed to preserve - * service guarantees. 
- */ - return asymmetric_scenario && !bfq_bfqq_in_large_burst(bfqq); } /* From patchwork Tue Jan 29 11:06:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10786021 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 26F8D14E1 for ; Tue, 29 Jan 2019 11:07:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 17B822B106 for ; Tue, 29 Jan 2019 11:07:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0B3D829730; Tue, 29 Jan 2019 11:07:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A724129730 for ; Tue, 29 Jan 2019 11:07:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726952AbfA2LHr (ORCPT ); Tue, 29 Jan 2019 06:07:47 -0500 Received: from mail-wr1-f68.google.com ([209.85.221.68]:36352 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728511AbfA2LHL (ORCPT ); Tue, 29 Jan 2019 06:07:11 -0500 Received: by mail-wr1-f68.google.com with SMTP id u4so21555849wrp.3 for ; Tue, 29 Jan 2019 03:07:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=VwIs+zonmjtiCZ1USOo/5sS6ODQShYjtQFUwoinOfyA=; b=dzNNNTGG5PdbsAoIUTXnKMvPIzXU2GiE2VkYxsWTf2UiyCY0wriVi0q9foVNjfoQol 
P/FaGtCg2bXNR6bff01uCTbsm3htu93+aSrkN86YiYT7JF4IpxTl1xTYqfozHHkmz1rD xO5QtpvJKdv15+MIghqrBcKbEoaxfmx0cjXzM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=VwIs+zonmjtiCZ1USOo/5sS6ODQShYjtQFUwoinOfyA=; b=pRbfom/3gN+ngELBi/P9gj0uDqpt848HIK2t35mZyJOb2pmtVXpz/lxH5LSSvOexmZ xyNaOiT7dpjKYFted8EAdQtQcEh2tXLqfQ7Di0qNg5/pG296b0a3gVRBydJbi2+q3cQT Cbrl+qZUOmvs4gbi7FxON4zSMkU8O8Fax1x32dkJ/fsn/8x8qrTTdwn+rh0cqUz0PMuc Vdez+6PGcNI33HAi5Vwl7ymjyYBc15HERf5QicGxVzjqf7uFPI5ifd08BAfIwAQGqdmL nT0C/vdrMcgRlDyEnCtT7hj4+Zuu1VIW04eP/SgCCAbNJxNUPBTF96kz+72CWW7oVYau Cz/Q== X-Gm-Message-State: AJcUukdSXNCBt9LpZmHMb3+qq4MdRbEsJPaGjuS4mUY84AEI+Cu9167+ 3REbvKPpFVsMQ4U/pqKNitv02g== X-Google-Smtp-Source: AHgI3IaoYkVg1V+aaV+ipWtTYVvEmycMI3EljtyJaeHGv0fpAC5NBzizUMkuaLwJoNdy0BOxILKnMQ== X-Received: by 2002:adf:f3c6:: with SMTP id g6mr21053279wrp.111.1548760028824; Tue, 29 Jan 2019 03:07:08 -0800 (PST) Received: from localhost.localdomain ([88.147.67.218]) by smtp.gmail.com with ESMTPSA id s132sm2066112wmf.28.2019.01.29.03.07.07 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 29 Jan 2019 03:07:08 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 09/14] block, bfq: fix sequential rq detection in rate estimation Date: Tue, 29 Jan 2019 12:06:33 +0100 Message-Id: <20190129110638.12652-10-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References: <20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: 
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

In bfq_update_peak_rate, to check whether an I/O request rq is
sequential, only the seek distance of rq w.r.t. the last request
dispatched is checked. This is not sufficient for non-rotational
storage, where the size of rq is at least as relevant. This commit
adds the missing check.

Signed-off-by: Paolo Valente
---
 block/bfq-iosched.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index c1bb5e5fcdc4..12228af16198 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -235,6 +235,11 @@ static struct kmem_cache *bfq_pool;

 #define BFQQ_SEEK_THR		(sector_t)(8 * 100)
 #define BFQQ_SECT_THR_NONROT	(sector_t)(2 * 32)
+#define BFQ_RQ_SEEKY(bfqd, last_pos, rq) \
+	(get_sdist(last_pos, rq) > \
+	 BFQQ_SEEK_THR && \
+	 (!blk_queue_nonrot(bfqd->queue) || \
+	  blk_rq_sectors(rq) < BFQQ_SECT_THR_NONROT))
 #define BFQQ_CLOSE_THR		(sector_t)(8 * 1024)
 #define BFQQ_SEEKY(bfqq)	(hweight32(bfqq->seek_history) > 19)

@@ -2754,7 +2759,7 @@ static void bfq_update_peak_rate(struct bfq_data *bfqd, struct request *rq)

 	if ((bfqd->rq_in_driver > 0 ||
 		now_ns - bfqd->last_completion < BFQ_MIN_TT)
-	    && get_sdist(bfqd->last_position, rq) < BFQQ_SEEK_THR)
+	    && !BFQ_RQ_SEEKY(bfqd, bfqd->last_position, rq))
 		bfqd->sequential_samples++;

 	bfqd->tot_sectors_dispatched += blk_rq_sectors(rq);
@@ -4511,10 +4516,7 @@ bfq_update_io_seektime(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		       struct request *rq)
 {
 	bfqq->seek_history <<= 1;
-	bfqq->seek_history |=
-		get_sdist(bfqq->last_request_pos, rq) > BFQQ_SEEK_THR &&
-		(!blk_queue_nonrot(bfqd->queue) ||
-		 blk_rq_sectors(rq) < BFQQ_SECT_THR_NONROT);
+	bfqq->seek_history |= BFQ_RQ_SEEKY(bfqd, bfqq->last_request_pos, rq);
 }

 static void bfq_update_has_short_ttime(struct bfq_data *bfqd,

From patchwork Tue Jan 29 11:06:34 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10786019 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AF8331390 for ; Tue, 29 Jan 2019 11:07:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A0EF729730 for ; Tue, 29 Jan 2019 11:07:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9520F2B616; Tue, 29 Jan 2019 11:07:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2D9DD29730 for ; Tue, 29 Jan 2019 11:07:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728671AbfA2LHl (ORCPT ); Tue, 29 Jan 2019 06:07:41 -0500 Received: from mail-wr1-f65.google.com ([209.85.221.65]:43333 "EHLO mail-wr1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728367AbfA2LHL (ORCPT ); Tue, 29 Jan 2019 06:07:11 -0500 Received: by mail-wr1-f65.google.com with SMTP id r10so21516360wrs.10 for ; Tue, 29 Jan 2019 03:07:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ui0RozgVj782eZenovCwcdQYstEPpPQMeO8MT0fjT5o=; b=L/8rgA1a0X/Tzr6jFJ7LPCtro61Yhi/kU9AXGI8zscrdDCQQoDzxOAgtOlxQEJmkqQ Im3eeKa7fN2xw9GQceKjWdHYXTC11GBPY67zB3gdzRXMhhAjAD6CISk+xyKiBEO5cAtj amn9Kk1fJqkf9621GzFX8GpON9gGM72ruLoC0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; 
s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ui0RozgVj782eZenovCwcdQYstEPpPQMeO8MT0fjT5o=; b=jgOVnyw8tOPntzYh+6AHtwuPvW3Cb7yoZwOV7Q0HDSdVAQ8CjZewoWNfILwpfgLUeN L3ZpWRUmpFbS0SFg7q1Y9vKrKy1vlW+Tqk3PXyAcMyZd9tnMi33uwvvO3G22nj3HylRN i2/00DxvJlyhQ7VkmNNypi+CBZ1ty5chG2DziW4cQj1XK9CCoeWZV+B/IjMaBLG3e9Fh qGrE+pSMRvQBhFHnYUvfVpgUwQ9/RDj6lG45+jaBH/+rIKsfFoglVSUGR+vQN1eSf/sv lXOapLboEQudHBRUfZVxLcWFUYexhu4g/hhYxv1aJzRhdDQurU4XxBdPhprxZih0IV5t 9VkA== X-Gm-Message-State: AJcUukccPOpF/YjgYE0PIi8gFFP56kp8hY1PurL5m4J3s3f7jxqv2zl1 uYWS+tuoS2Ly5djcTvQ9PD/J61Cjo2s= X-Google-Smtp-Source: ALg8bN5t0xmEieYTRN0u2FYyM7D646ckc7BUkQGUXiRkGgk2ntuTucSJK2cHBiDVXduWjX/bG7M3vw== X-Received: by 2002:adf:8068:: with SMTP id 95mr26094476wrk.181.1548760030187; Tue, 29 Jan 2019 03:07:10 -0800 (PST) Received: from localhost.localdomain ([88.147.67.218]) by smtp.gmail.com with ESMTPSA id s132sm2066112wmf.28.2019.01.29.03.07.08 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 29 Jan 2019 03:07:09 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 10/14] block, bfq: fix queue removal from weights tree Date: Tue, 29 Jan 2019 12:06:34 +0100 Message-Id: <20190129110638.12652-11-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References: <20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP bfq maintains an ordered list, through a red-black tree, of unique weights of active bfq_queues. 
This list is used to detect whether there are active queues with
differentiated weights. The weight of a queue is removed from the list
when both of the following conditions become true:
(1) the bfq_queue is flagged as inactive;
(2) the bfq_queue no longer has any in-flight request.

Unfortunately, in the rare cases where condition (2) becomes true
before condition (1), the removal fails, because the function to
remove the weight of the queue (bfq_weights_tree_remove) is rightly
invoked in the path that deactivates the bfq_queue, but mistakenly
invoked *before* the function that actually performs the deactivation
(bfq_deactivate_bfqq).

This commit moves the invocation of bfq_weights_tree_remove for
condition (1) to after bfq_deactivate_bfqq. As a consequence of this
move, it is necessary to add a further reference to the queue when the
weight of a queue is added, because the queue might otherwise be freed
before bfq_weights_tree_remove is invoked. This commit adds this
reference and makes all related modifications.
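
The need for the further reference can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not
the kernel code: struct toy_queue and every toy_* function are
hypothetical stand-ins, the queue is assumed to have no process
references left, and deactivation is assumed to drop the reference held
for being on a service tree. It shows that, once removal from the
weights tree runs after deactivation, the queue survives deactivation
only if the weights tree holds its own reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for struct bfq_queue. */
struct toy_queue {
	int ref;		/* total references; the queue is freed at 0 */
	int allocated;		/* references held by allocated requests */
	bool on_st;		/* on a service tree: holds one reference */
	bool in_weights_tree;	/* models weight_counter != NULL */
	bool freed;		/* set instead of really freeing memory */
};

static void toy_put_queue(struct toy_queue *q)
{
	assert(!q->freed && q->ref > 0);
	if (--q->ref == 0)
		q->freed = true;
}

static void toy_activate(struct toy_queue *q)
{
	if (!q->on_st) {
		q->on_st = true;
		q->ref++;
	}
}

/* Assumption: deactivation drops the service-tree reference. */
static void toy_deactivate(struct toy_queue *q)
{
	if (q->on_st) {
		q->on_st = false;
		toy_put_queue(q);
	}
}

static void toy_weights_tree_add(struct toy_queue *q, bool take_ref)
{
	q->in_weights_tree = true;
	if (take_ref)
		q->ref++;	/* mirrors the added bfqq->ref++ */
}

static void toy_weights_tree_remove(struct toy_queue *q, bool drop_ref)
{
	q->in_weights_tree = false;
	if (drop_ref)
		toy_put_queue(q);	/* may free q: do not use q afterwards */
}

/* Mirrors the updated formula: the tree reference is not a process reference. */
static int toy_process_refs(const struct toy_queue *q)
{
	return q->ref - q->allocated - (q->on_st ? 1 : 0) -
		(q->in_weights_tree ? 1 : 0);
}

/* Is the queue still alive when tree removal runs after deactivation? */
bool toy_alive_after_deactivation(bool tree_holds_ref)
{
	struct toy_queue q = { 0 };

	toy_activate(&q);
	toy_weights_tree_add(&q, tree_holds_ref);
	toy_deactivate(&q);
	return !q.freed;
}

/* Process references seen while the queue is active and in the tree. */
int toy_process_refs_while_in_tree(void)
{
	struct toy_queue q = { 0 };

	toy_activate(&q);
	toy_weights_tree_add(&q, true);
	return toy_process_refs(&q);
}

/* With the tree reference, the queue is freed only by the final removal. */
bool toy_freed_after_full_teardown(void)
{
	struct toy_queue q = { 0 };

	toy_activate(&q);
	toy_weights_tree_add(&q, true);
	toy_deactivate(&q);
	toy_weights_tree_remove(&q, true);
	return q.freed;
}
```

In this model, toy_alive_after_deactivation(false) reports that the
queue is already gone by the time removal would run, which is the
situation the extra reference is meant to rule out.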

Signed-off-by: Paolo Valente
---
 block/bfq-iosched.c | 17 +++++++++++++----
 block/bfq-wf2q.c    |  6 +++---
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 12228af16198..bf585ad29bb5 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -754,6 +754,7 @@ void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_queue *bfqq,

 inc_counter:
 	bfqq->weight_counter->num_active++;
+	bfqq->ref++;
 }

 /*
@@ -778,6 +779,7 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,

 reset_entity_pointer:
 	bfqq->weight_counter = NULL;
+	bfq_put_queue(bfqq);
 }

 /*
@@ -789,9 +791,6 @@ void bfq_weights_tree_remove(struct bfq_data *bfqd,
 {
 	struct bfq_entity *entity = bfqq->entity.parent;

-	__bfq_weights_tree_remove(bfqd, bfqq,
-				  &bfqd->queue_weights_tree);
-
 	for_each_entity(entity) {
 		struct bfq_sched_data *sd = entity->my_sched_data;

@@ -825,6 +824,15 @@ void bfq_weights_tree_remove(struct bfq_data *bfqd,
 			bfqd->num_groups_with_pending_reqs--;
 		}
 	}
+
+	/*
+	 * Next function is invoked last, because it causes bfqq to be
+	 * freed if the following holds: bfqq is not in service and
+	 * has no dispatched request. DO NOT use bfqq after the next
+	 * function invocation.
+	 */
+	__bfq_weights_tree_remove(bfqd, bfqq,
+				  &bfqd->queue_weights_tree);
 }

 /*
@@ -1020,7 +1028,8 @@ bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd,

 static int bfqq_process_refs(struct bfq_queue *bfqq)
 {
-	return bfqq->ref - bfqq->allocated - bfqq->entity.on_st;
+	return bfqq->ref - bfqq->allocated - bfqq->entity.on_st -
+		(bfqq->weight_counter != NULL);
 }

 /* Empty burst list and add just bfqq (see comments on bfq_handle_burst) */
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index ce37d709a34f..63311d1ff1ed 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -1673,15 +1673,15 @@ void bfq_del_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq,

 	bfqd->busy_queues[bfqq->ioprio_class - 1]--;

-	if (!bfqq->dispatched)
-		bfq_weights_tree_remove(bfqd, bfqq);
-
 	if (bfqq->wr_coeff > 1)
 		bfqd->wr_busy_queues--;

 	bfqg_stats_update_dequeue(bfqq_group(bfqq));

 	bfq_deactivate_bfqq(bfqd, bfqq, true, expiration);
+
+	if (!bfqq->dispatched)
+		bfq_weights_tree_remove(bfqd, bfqq);
 }

 /*

From patchwork Tue Jan 29 11:06:35 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 10786015
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 643BC184E for ; Tue, 29 Jan 2019 11:07:35 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5401B29730 for ; Tue, 29 Jan 2019 11:07:35 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 481D72B616; Tue, 29 Jan 2019 11:07:35 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EC21D29730 for ; Tue, 29 Jan 2019 11:07:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728602AbfA2LHO (ORCPT ); Tue, 29 Jan 2019 06:07:14 -0500 Received: from mail-wm1-f65.google.com ([209.85.128.65]:53609 "EHLO mail-wm1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728581AbfA2LHO (ORCPT ); Tue, 29 Jan 2019 06:07:14 -0500 Received: by mail-wm1-f65.google.com with SMTP id d15so17345068wmb.3 for ; Tue, 29 Jan 2019 03:07:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/L0sIFrF4VuBSatFaX/8mVU9i1rVnBBzRyuN7LWEkY0=; b=T8SwCMBxHn6msupWEQMC7s4HLPYVlg1FbZACnt/EPRze8AUAKrACng21WGluOoVdo2 L7cL+ds9pHdQQ2VaaL4qtXOdX7QcYXDB4xr7IgGJ4/SanZvahBsefcbqgUZLDb0WY7fX j5Vk+mwVIseQWtHLXWsOlAwYh9i8eN/GHSiBQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/L0sIFrF4VuBSatFaX/8mVU9i1rVnBBzRyuN7LWEkY0=; b=OTLFxHL07c3jHlEGSewM5/GUzmejEl9/lZRlauTVo7+eD+aPLiW/ttuoxPbj5f+uWf P5fqhfunL5dDeY9aRsyV/jAvTMazMA7uVTUOA9okrgu3GOBhE7RtaM8rjM9VhxUJgF7/ b4Dgt2cW/uX41eTEwpvZ7E5dfSE4bwPN3ekuTjgUayjXxzY+a4n//TXK4dGPalvIdYtk KTaTlJ2WJ1462nxXMv2S6aM3EibD7pjif4r0+xOaoaLb/7HB6kdsq8N5yRJH4TC8H7iJ W/i54ODF8XkjOpcwXu9OT8uxnf7tcFPsa8005MC40dlc/X8iy7G1FN03a/4qW8TPlvnG Npmw== X-Gm-Message-State: AJcUukdmF6qd0VzUzNC9k7g/hzVWYNV24J6Q4tU/EBBySC8egId00c17 I6pwMQt1VourS9yfTiHCz6yQ2A== X-Google-Smtp-Source: ALg8bN6gizgOIgUL0g87ugnhrkKHTPSLL8sSU9ok3wSDOxDILpGfEIUV2yu20GRFSfV5y2M1u4idFQ== X-Received: by 2002:a1c:7d06:: with SMTP id y6mr20407313wmc.7.1548760031541; Tue, 29 Jan 2019 03:07:11 -0800 (PST) Received: from 
localhost.localdomain ([88.147.67.218]) by smtp.gmail.com with ESMTPSA id s132sm2066112wmf.28.2019.01.29.03.07.10 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 29 Jan 2019 03:07:10 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 11/14] block, bfq: reduce threshold for detecting command queueing Date: Tue, 29 Jan 2019 12:06:35 +0100 Message-Id: <20190129110638.12652-12-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References: <20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP bfq borrowed from cfq a simple heuristic for detecting whether the drive performs command queueing: check whether the average number of in-flight requests is above a given threshold. Unfortunately this heuristic does fail to detect queueing (on drives with queueing) if processes doing I/O are few and issue I/O with a low depth. To reduce false negatives, this commit lowers the threshold. Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index bf585ad29bb5..48b579032d14 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -230,7 +230,7 @@ static struct kmem_cache *bfq_pool; #define BFQ_MIN_TT (2 * NSEC_PER_MSEC) /* hw_tag detection: parallel requests threshold and min samples needed. 
*/ -#define BFQ_HW_QUEUE_THRESHOLD 4 +#define BFQ_HW_QUEUE_THRESHOLD 3 #define BFQ_HW_QUEUE_SAMPLES 32 #define BFQQ_SEEK_THR (sector_t)(8 * 100) @@ -4798,7 +4798,7 @@ static void bfq_update_hw_tag(struct bfq_data *bfqd) * sum is not exact, as it's not taking into account deactivated * requests. */ - if (bfqd->rq_in_driver + bfqd->queued < BFQ_HW_QUEUE_THRESHOLD) + if (bfqd->rq_in_driver + bfqd->queued <= BFQ_HW_QUEUE_THRESHOLD) return; if (bfqd->hw_tag_samples++ < BFQ_HW_QUEUE_SAMPLES) From patchwork Tue Jan 29 11:06:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10786013 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1CEE71390 for ; Tue, 29 Jan 2019 11:07:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0B4A629730 for ; Tue, 29 Jan 2019 11:07:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F3BB52B616; Tue, 29 Jan 2019 11:07:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9F2B429730 for ; Tue, 29 Jan 2019 11:07:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728613AbfA2LH2 (ORCPT ); Tue, 29 Jan 2019 06:07:28 -0500 Received: from mail-wr1-f66.google.com ([209.85.221.66]:41229 "EHLO mail-wr1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728597AbfA2LHO (ORCPT ); Tue, 29 Jan 2019 06:07:14 -0500 
Received: by mail-wr1-f66.google.com with SMTP id x10so21518088wrs.8 for ; Tue, 29 Jan 2019 03:07:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=xDzJdQaJUVbOMOgsobVLGeLVo1CynjuO1LWZuQEuQ0s=; b=YlNrCME02AvhfJTfFzDA7cXLW6GkN4P602kH9FqZdMo8dfqIjrdqLc0SZVieJXCoRk EjdcaZrCdjJRqKoBG9NcdFH+dOguGYF0kGl4fWzV9VPBr3jVZNgBhaVc8Z87DiCt5APe jwOLIrtW+c8AGnIapiBil70XxXFnbCiHytrkY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xDzJdQaJUVbOMOgsobVLGeLVo1CynjuO1LWZuQEuQ0s=; b=WPBxQi2sT9KROdH2pTT5vwMG3Unt/4FNUj7uQcXVOnvpvoPFdtdXBi5TYS88cTK+tP MT9mcXL9MZ2+aPO1MlKq+HrT339A4OgpujXX9W+u/bdIzB6mw9TTFSiy+JfiucUObEOf xsspztPBun5DXqXx+b4e9QCaEltVbCpbo086Lzz3qqvpoIWfpBtNqF6J+UBw/xeTfak5 Scnx/0DTDyKJiTZapy0JWms6ArWwRe23jDDMoAPadjUKL1BKJBO2dRF8tftfthqBtZB6 VstD8bNvhw6OMVYMMg9b/4p7cA1d32HJEcyggbvLv384eZcByBZ2Wy27ULiPoBOE4kfE aRmw== X-Gm-Message-State: AJcUukeRPINHYj/MdlL62CMAvf+mY73nWh3kDt7+C0JkahbDTlulwsiz 7d09OYkszf2aOKZLrpxNVY4/Ng== X-Google-Smtp-Source: ALg8bN4i2QB/ggjoyUvCO+iRRW6VcZ2yv+tDBmGHgCCBvaPAOEFc754Zt/AEpp3mU/3ngl4Pq/tjNw== X-Received: by 2002:adf:c711:: with SMTP id k17mr24732765wrg.197.1548760032900; Tue, 29 Jan 2019 03:07:12 -0800 (PST) Received: from localhost.localdomain ([88.147.67.218]) by smtp.gmail.com with ESMTPSA id s132sm2066112wmf.28.2019.01.29.03.07.11 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 29 Jan 2019 03:07:12 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 12/14] 
block, bfq: port commit "cfq-iosched: improve hw_tag detection" Date: Tue, 29 Jan 2019 12:06:36 +0100 Message-Id: <20190129110638.12652-13-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References: <20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The original commit is commit 1a1238a7dd48 ("cfq-iosched: improve hw_tag detection") and has the following commit message. If active queue hasn't enough requests and idle window opens, cfq will not dispatch sufficient requests to hardware. In such situation, current code will zero hw_tag. But this is because cfq doesn't dispatch enough requests instead of hardware queue doesn't work. Don't zero hw_tag in such case. Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 48b579032d14..2ab53d93ba12 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -4786,6 +4786,8 @@ static void bfq_insert_requests(struct blk_mq_hw_ctx *hctx, static void bfq_update_hw_tag(struct bfq_data *bfqd) { + struct bfq_queue *bfqq = bfqd->in_service_queue; + bfqd->max_rq_in_driver = max_t(int, bfqd->max_rq_in_driver, bfqd->rq_in_driver); @@ -4801,6 +4803,17 @@ static void bfq_update_hw_tag(struct bfq_data *bfqd) if (bfqd->rq_in_driver + bfqd->queued <= BFQ_HW_QUEUE_THRESHOLD) return; + /* + * If active queue hasn't enough requests and can idle, bfq might not + * dispatch sufficient requests to hardware. 
Don't zero hw_tag in this + * case + */ + if (bfqq && bfq_bfqq_has_short_ttime(bfqq) && + bfqq->dispatched + bfqq->queued[0] + bfqq->queued[1] < + BFQ_HW_QUEUE_THRESHOLD && + bfqd->rq_in_driver < BFQ_HW_QUEUE_THRESHOLD) + return; + if (bfqd->hw_tag_samples++ < BFQ_HW_QUEUE_SAMPLES) return; From patchwork Tue Jan 29 11:06:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10786009 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BA51D14E1 for ; Tue, 29 Jan 2019 11:07:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ABB3A29730 for ; Tue, 29 Jan 2019 11:07:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9E9252B616; Tue, 29 Jan 2019 11:07:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4A96D29730 for ; Tue, 29 Jan 2019 11:07:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728614AbfA2LHR (ORCPT ); Tue, 29 Jan 2019 06:07:17 -0500 Received: from mail-wr1-f66.google.com ([209.85.221.66]:43343 "EHLO mail-wr1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728605AbfA2LHQ (ORCPT ); Tue, 29 Jan 2019 06:07:16 -0500 Received: by mail-wr1-f66.google.com with SMTP id r10so21516603wrs.10 for ; Tue, 29 Jan 2019 03:07:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8JaSsc/YYZ/kdnvfBrrFjWTAlWsvvc6dlhkWZwzIj+Q=; b=jDOzgYh+7JYBxH0+MkncQ3vaM6G1Wwz/zf4/BEGiViuG5e2T+PP/awx4hfjQQJzq4o YZf+PzUJlNnIkncTTrYZrkQRjIGafwLxTEYhosPBqcbspQ3tc3VqYsXYT9acbSshwXsm o+krKWN7u2PKp1ijOnxKTWPCL6r8kjlcvyqS8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8JaSsc/YYZ/kdnvfBrrFjWTAlWsvvc6dlhkWZwzIj+Q=; b=AOgQ4a+2oK1E4EoDS7raIMJ5SIKzpPp8oX8Qk1yn1rCJadNsZ7gQ3yF2vAYQWn8gaP ludKmf+qUXWTEqAsm6Bh/X99qI1Kr7qFBr5m5JLVIt6d6IUVISH8LtrzZzgxZ9zOq7at 1H3pvzgsXDzHGUavjQdcVk9ZDSZr0kpZqIJwd/Lss+2LkNsQ59QTP85OUNDU7LtsJfJM 9sFpRnmZ87a6JyzullhUFChpZiLvWN7pvdMSbMpN4ig1kzN5+2nxgOyeLuJFzhOCY5EI SaKAiSc4rON+MNB81hgYU65Qp3l8Elev7Ymo/3p53utwwZYCXopSoX/QNQ5WpIxBktOs RGKw== X-Gm-Message-State: AHQUAub8mymlUbM1j5ztIzmfboSnWjuKbiUfz8dRnvvnGcVJ2CmA5jSX Wtw/3gNQ0lo5E20tPwpMqOj4ZQ== X-Google-Smtp-Source: AHgI3IZp+jZT9UKS7sUYJA6gDD8eKqjxbmrelTmX1Wv1yhd73BdRwGh/3wlWrdZEzIYaNpZVUfNRrQ== X-Received: by 2002:adf:dcd0:: with SMTP id x16mr4708238wrm.143.1548760034065; Tue, 29 Jan 2019 03:07:14 -0800 (PST) Received: from localhost.localdomain ([88.147.67.218]) by smtp.gmail.com with ESMTPSA id s132sm2066112wmf.28.2019.01.29.03.07.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 29 Jan 2019 03:07:13 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 13/14] block, bfq: do not overcharge writes in asymmetric scenarios Date: Tue, 29 Jan 2019 12:06:37 +0100 Message-Id: <20190129110638.12652-14-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 
In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References: <20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Writes tend to starve reads. bfq counters this problem by overcharging writes with an inflated service w.r.t. the actual service (number of sector written) they receive. Yet his overcharging is useless, and actually causes unfairness in the opposite direction, when bfq happens to be enforcing strong I/O control. bfq does this enforcing when the scenario is asymmetric, i.e., when some bfq_queue or group of bfq_queues is to be granted a different bandwidth than some other bfq_queue or group of bfq_queues. So, in such a scenario, this commit disables write overcharging. Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 2ab53d93ba12..06268449d2ca 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -888,7 +888,8 @@ static struct request *bfq_find_next_rq(struct bfq_data *bfqd, static unsigned long bfq_serv_to_charge(struct request *rq, struct bfq_queue *bfqq) { - if (bfq_bfqq_sync(bfqq) || bfqq->wr_coeff > 1) + if (bfq_bfqq_sync(bfqq) || bfqq->wr_coeff > 1 || + !bfq_symmetric_scenario(bfqq->bfqd)) return blk_rq_sectors(rq); return blk_rq_sectors(rq) * bfq_async_charge_factor; From patchwork Tue Jan 29 11:06:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10786011 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7D54D1390 for ; Tue, 29 Jan 2019 11:07:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) 
by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6E5F029730 for ; Tue, 29 Jan 2019 11:07:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6249B2B616; Tue, 29 Jan 2019 11:07:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 025CA29730 for ; Tue, 29 Jan 2019 11:07:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728474AbfA2LH0 (ORCPT ); Tue, 29 Jan 2019 06:07:26 -0500 Received: from mail-wm1-f65.google.com ([209.85.128.65]:35173 "EHLO mail-wm1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728613AbfA2LHR (ORCPT ); Tue, 29 Jan 2019 06:07:17 -0500 Received: by mail-wm1-f65.google.com with SMTP id t200so17406294wmt.0 for ; Tue, 29 Jan 2019 03:07:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=C6mbkxYARjOVWPicsF+8F1IWlVh58inCJzHv80eeZ00=; b=WaAV8QPFZ1BYtgx4VTqu89ErSb86GwumiYIkwW57mgQfl7Nl8icwCI+CpZ6V/iKbv0 99j6c8j7vc8VEpAnOX39pOSRzg09Sv85zsy1n2cOa9URAETvg/4YZ621WnoNUPCzDTCp ATa00BVE+iqnjy+Teq0mqD9PtsCE6+Kx6VCCw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=C6mbkxYARjOVWPicsF+8F1IWlVh58inCJzHv80eeZ00=; b=js07Pxm7G0MV4FurD2hM3cXmiPAerYOAQM/BNuwFvaT7rsCAnozJ51qMeAtF/D1OZu 4dlbcPhJUI2543NjTRbdYcDVdNqdr4qhg15bquOocFTBkqsIDeQn62ZYQQIwdyLAMQ+W rRuOLPFfd7Pd/ztNvwYE7DkSrJHdPseJeXjYSjy60O31w1lb7VKV+emUvBCyKtHbMh6z 
+ZnKlV6FOc5QwYwjRNMYDnYhvgV07CpUfHc5XmjJVW2v0cUI1hCSHhx7hvwNS9TWKQai AnrcOh8TodEdp37AgH/n39JnECylJ8UdvRikjkplZiOv27hqZ1YfqsKDJ9uQCVY7KNmN DIRg== X-Gm-Message-State: AJcUukcHJgJ7YFMsLtxHhso0ZPZw4suXg3mqMA52REb+7PZwax1xoOFQ wnMiVgQxlvSnGwpSG8AZuk/7WQ== X-Google-Smtp-Source: ALg8bN4WpmkB3JOAr6mfsrbaLi4/0RCbgabv9YQ6/z5XVQzxXWYhMQNk9CKSBxw001ahO5aDFGbDsQ== X-Received: by 2002:a1c:ef11:: with SMTP id n17mr20436112wmh.112.1548760035222; Tue, 29 Jan 2019 03:07:15 -0800 (PST) Received: from localhost.localdomain ([88.147.67.218]) by smtp.gmail.com with ESMTPSA id s132sm2066112wmf.28.2019.01.29.03.07.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 29 Jan 2019 03:07:14 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, linus.walleij@linaro.org, broonie@kernel.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, mancha@tower-research.com, Paolo Valente Subject: [PATCH BUGFIX IMPROVEMENT 14/14] block, bfq: fix in-service-queue check for queue merging Date: Tue, 29 Jan 2019 12:06:38 +0100 Message-Id: <20190129110638.12652-15-paolo.valente@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190129110638.12652-1-paolo.valente@linaro.org> References: <20190129110638.12652-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When a new I/O request arrives for a bfq_queue, say Q, bfq checks whether that request is close to (a) the head request of some other queue waiting to be served, or (b) the last request dispatched for the in-service queue (in case Q itself is not the in-service queue) If a queue, say Q2, is found for which the above condition holds, then bfq merges Q and Q2, to hopefully get a more sequential I/O in the resulting merged queue, and thus a possibly higher throughput. 
Case (b) is checked by comparing the new request for Q with the last request dispatched, assuming that the latter necessarily belonged to the in-service queue. Unfortunately, this assumption is no longer always correct, since commit d0edc2473be9 ("block, bfq: inject other-queue I/O into seeky idle queues on NCQ flash"). When the assumption does not hold, queues that must not be merged may be merged, causing unexpected loss of control on per-queue service guarantees. This commit solves this problem by adding an extra field, which stores the actual last request dispatched for the in-service queue, and by using this new field to correctly check case (b). Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 5 ++++- block/bfq-iosched.h | 3 +++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 06268449d2ca..4c592496a16a 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -2251,7 +2251,8 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq, if (in_service_bfqq && in_service_bfqq != bfqq && likely(in_service_bfqq != &bfqd->oom_bfqq) && - bfq_rq_close_to_sector(io_struct, request, bfqd->last_position) && + bfq_rq_close_to_sector(io_struct, request, + bfqd->in_serv_last_pos) && bfqq->entity.parent == in_service_bfqq->entity.parent && bfq_may_be_close_cooperator(bfqq, in_service_bfqq)) { new_bfqq = bfq_setup_merge(bfqq, in_service_bfqq); @@ -2791,6 +2792,8 @@ static void bfq_update_peak_rate(struct bfq_data *bfqd, struct request *rq) bfq_update_rate_reset(bfqd, rq); update_last_values: bfqd->last_position = blk_rq_pos(rq) + blk_rq_sectors(rq); + if (RQ_BFQQ(rq) == bfqd->in_service_queue) + bfqd->in_serv_last_pos = bfqd->last_position; bfqd->last_dispatch = now_ns; } diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h index 30be669be465..062e1c4787f4 100644 --- a/block/bfq-iosched.h +++ b/block/bfq-iosched.h @@ -538,6 +538,9 @@ struct bfq_data { /* on-disk position of the last served 
request */ sector_t last_position; + /* position of the last served request for the in-service queue */ + sector_t in_serv_last_pos; + /* time of last request completion (ns) */ u64 last_completion;
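
For readers who want to see why case (b) needs its own field, here is a minimal userspace sketch of the bookkeeping that patch 14/14 describes. It is not kernel code: the sim_* types and helpers, and the SIM_CLOSE_THR value, are hypothetical stand-ins chosen for illustration, and only the two-field update rule is taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
typedef uint64_t sim_sector_t;

struct sim_queue {
	int id;
};

struct sim_request {
	struct sim_queue *owner;	/* queue the request belongs to */
	sim_sector_t pos;		/* first sector of the request */
	sim_sector_t nr_sectors;	/* request length in sectors */
};

struct sim_sched {
	struct sim_queue *in_service_queue;
	/* end position of the last dispatched request, from any queue */
	sim_sector_t last_position;
	/* end position of the last request dispatched for the in-service queue */
	sim_sector_t in_serv_last_pos;
};

/* Arbitrary illustrative closeness threshold (an assumption of this sketch). */
#define SIM_CLOSE_THR ((sim_sector_t)1024)

/*
 * Mirrors the update_last_values step in the diff: last_position is
 * always updated, in_serv_last_pos only when the dispatched request
 * belongs to the in-service queue.
 */
static inline void sim_record_dispatch(struct sim_sched *s,
				       const struct sim_request *rq)
{
	assert(s != NULL && rq != NULL);

	s->last_position = rq->pos + rq->nr_sectors;
	if (rq->owner == s->in_service_queue)
		s->in_serv_last_pos = s->last_position;
}

/* True if two sector positions are within SIM_CLOSE_THR of each other. */
static inline bool sim_close_to(sim_sector_t a, sim_sector_t b)
{
	sim_sector_t dist = a > b ? a - b : b - a;

	return dist <= SIM_CLOSE_THR;
}
```

With queue A in service, dispatching a request for A and then an injected request for another queue B far away on the device leaves last_position pointing at B's I/O while in_serv_last_pos still points at A's. A new request that lands near B's I/O then looks "close" under the old check against last_position, but not under the check against in_serv_last_pos, which is the false merge with the in-service queue that the patch is meant to avoid.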