From patchwork Thu Nov 7 19:17:59 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 11233657
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7F1471599 for ; Thu, 7 Nov 2019 19:18:36 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5CDFB21D7B for ; Thu, 7 Nov 2019 19:18:36 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573154316; bh=3Bz9VvvLHSDC0nF6OZnxtcFm0XjlWZt0GHsDH4fIBRo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=uFSkyUVCHLjBL+Mna48wr3VDg2mImNaejXOJK3d76if2LP7e4+n/nzS9Tjyu1G/Xt eWafq2imtfdUhRnC5DR7btTHMKdIR6kYZV4pXugPxQHak1pgGrHySfJS1AYdqAsfbK iZQSTLSGDdEbNAT8e1AqCENHtowaWdo8yqK4MGMw=
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726295AbfKGTSM (ORCPT ); Thu, 7 Nov 2019 14:18:12 -0500
Received: from mail-qt1-f196.google.com ([209.85.160.196]:33834 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725497AbfKGTSL (ORCPT ); Thu, 7 Nov 2019 14:18:11 -0500
Received: by mail-qt1-f196.google.com with SMTP id c25so2975518qtq.1; Thu, 07 Nov 2019 11:18:10 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=znlRTNHoKykO2PNyEazvN0YZ+bFmJkjSeLacwzzLw1A=; b=tjG1KjgaTrTGNnprsl2NxFJreLf/OjzVAt0A6QrnOxpwd0AHDEMI0Y26C72AY7F2MU zKhDevIrcosd7yDgfQEfu6gx1ezxW3hiCHBjwwp9Hg0BRg9qmf3F7twiyESf28ieVBfz M6EI3Yf4APy4CwFNOVkRfDMUUl876WTZNb9ueC3SyuC9KGpuY7sQJhS63dcc/tdTJa/q vBv7EwAjxDKFuZtzStKbDkhBcy2q7W9qsFpmBNRh+coxxeF6GMqItIdziVMN/1NLCRjC 62tsHMUPgDU/e7x/ylQf7TzBAIcsOzKWB51wQp9jtbQjsS/lrgAT5PZYce/5/6+wojid Y5Gw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=znlRTNHoKykO2PNyEazvN0YZ+bFmJkjSeLacwzzLw1A=; b=slo9XBJxkfqSzM5BphDcd4mj/iADqaupOMxpQ9VhQVxRe45A+JOAnJvITJ65JN8V30 bsGPHESccnNfM2IjaMb30yfTePj9Lkyle3wWVijIldQh06ok6sqq7Vd/yx1OtwujvuSa X/q4GPEq/UgYaXQkD6QCPUCGcjaTdfox1Hp8jKeuZA0RbK7sAg8+UHVwu4aCAMCFVi9s PKARqpYtmaR7oZvwsGc4LNTbo0/nPUPeW2DDHvKAb6BkoH2w6zMiF+MeovohJ3ZzZSxB wfAl/flpE02uKOQ7sDhJ5rluCymVVLBlwS8dVprAThvNaFIhO4ufPU4viKrIYyZbX6wf VCFQ==
X-Gm-Message-State: APjAAAUGqfPLYJxDrO4ZV2J/1ll479ZsvCiDpdNfrzOUfjYtqlW1aY7D Gv6KhxfzO48sHDCntrRQ0vk=
X-Google-Smtp-Source: APXvYqxr5bVBbF246UZewTgmCA6ffw+w2tBv0vJNSbEB5Y+O60muyD2EofKG0zvGXSxg1TGWfudM7A==
X-Received: by 2002:ac8:7216:: with SMTP id a22mr5692998qtp.187.1573154290213; Thu, 07 Nov 2019 11:18:10 -0800 (PST)
Received: from localhost ([2620:10d:c091:500::2:3f13]) by smtp.gmail.com with ESMTPSA id w20sm1395361qkj.87.2019.11.07.11.18.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 Nov 2019 11:18:09 -0800 (PST)
From: Tejun Heo
To: axboe@kernel.dk
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, lizefan@huawei.com, hannes@cmpxchg.org, kernel-team@fb.com, Tejun Heo
Subject: [PATCH 1/6] bfq-iosched: relocate bfqg_*rwstat*() helpers
Date: Thu, 7 Nov 2019 11:17:59 -0800
Message-Id: <20191107191804.3735303-2-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191107191804.3735303-1-tj@kernel.org>
References: <20191107191804.3735303-1-tj@kernel.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

Collect them right under #ifdef CONFIG_BFQ_CGROUP_DEBUG. The next
patch will use them from !DEBUG path and this makes it easy to move
them out of the ifdef block.

This is pure code reorganization.

Signed-off-by: Tejun Heo
---
 block/bfq-cgroup.c | 46 +++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 86a607cf19a1..d4755d4ad009 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -1058,17 +1058,34 @@ static ssize_t bfq_io_set_weight(struct kernfs_open_file *of,
 }
 
 #ifdef CONFIG_BFQ_CGROUP_DEBUG
-static int bfqg_print_stat(struct seq_file *sf, void *v)
+static int bfqg_print_rwstat(struct seq_file *sf, void *v)
 {
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat,
-			  &blkcg_policy_bfq, seq_cft(sf)->private, false);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_rwstat,
+			  &blkcg_policy_bfq, seq_cft(sf)->private, true);
 	return 0;
 }
 
-static int bfqg_print_rwstat(struct seq_file *sf, void *v)
+static u64 bfqg_prfill_rwstat_recursive(struct seq_file *sf,
+					struct blkg_policy_data *pd, int off)
 {
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_rwstat,
-			  &blkcg_policy_bfq, seq_cft(sf)->private, true);
+	struct blkg_rwstat_sample sum;
+
+	blkg_rwstat_recursive_sum(pd_to_blkg(pd), &blkcg_policy_bfq, off, &sum);
+	return __blkg_prfill_rwstat(sf, pd, &sum);
+}
+
+static int bfqg_print_rwstat_recursive(struct seq_file *sf, void *v)
+{
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
+			  bfqg_prfill_rwstat_recursive, &blkcg_policy_bfq,
+			  seq_cft(sf)->private, true);
+	return 0;
+}
+
+static int bfqg_print_stat(struct seq_file *sf, void *v)
+{
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat,
+			  &blkcg_policy_bfq, seq_cft(sf)->private, false);
 	return 0;
 }
 
@@ -1097,15 +1114,6 @@ static u64 bfqg_prfill_stat_recursive(struct seq_file *sf,
 	return __blkg_prfill_u64(sf, pd, sum);
 }
 
-static u64 bfqg_prfill_rwstat_recursive(struct seq_file *sf,
-					struct blkg_policy_data *pd, int off)
-{
-	struct blkg_rwstat_sample sum;
-
-	blkg_rwstat_recursive_sum(pd_to_blkg(pd), &blkcg_policy_bfq, off, &sum);
-	return __blkg_prfill_rwstat(sf, pd, &sum);
-}
-
 static int bfqg_print_stat_recursive(struct seq_file *sf, void *v)
 {
 	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
@@ -1114,14 +1122,6 @@ static int bfqg_print_stat_recursive(struct seq_file *sf, void *v)
 	return 0;
 }
 
-static int bfqg_print_rwstat_recursive(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  bfqg_prfill_rwstat_recursive, &blkcg_policy_bfq,
-			  seq_cft(sf)->private, true);
-	return 0;
-}
-
 static u64 bfqg_prfill_sectors(struct seq_file *sf,
 			       struct blkg_policy_data *pd, int off)
 {

From patchwork Thu Nov 7 19:18:00 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 11233655
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9934D14E5 for ; Thu, 7 Nov 2019 19:18:35 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6CBFB21D82 for ; Thu, 7 Nov 2019 19:18:35 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573154315; bh=TLa8vH08d/tbkPvU1HX9Ho90zA2TEhiirf0wKdsBy8E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=ksMtLSV/CpTjIwDHPi/vJ1A9HLQz8ha5fqvTeV59dT4fNHhbmnmAmQKR7f1GKzsvA FGntMfuOCXML8OkUqZJBgj6nucju1O3HMfT61CwvIXQkuvIX0oh+Rlu8IP+ohzsKBN PQMHaNnkVV7aVsD+dZGkcoCSv7ufk5xZHelM8z60=
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726520AbfKGTSO (ORCPT ); Thu, 7 Nov 2019 14:18:14 -0500
Received: from mail-qt1-f194.google.com ([209.85.160.194]:33485 "EHLO mail-qt1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725497AbfKGTSO (ORCPT ); Thu, 7 Nov 2019 14:18:14 -0500
Received: by mail-qt1-f194.google.com with SMTP id y39so3651121qty.0; Thu, 07 Nov 2019 11:18:13 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=xtRJy0h7C1YGD2EnSPEUZFAgaH1uL5OYTuzgSNt/iVM=; b=pUB/azFAGpBRWFY58zdbH4OPq1uOjGqFXzihxWn4zu/4YeW/WZvJbE0anEgs1rKM7W w5xuKjVjbn4yRzrZU8F4cFGMV5cKi/sxi5QdZhRhCcPBxyWgyf69B42xwceNgczzDmPs bDhmlsYDKIGJASTMulgG5Z4D7ywSOz5h53ri3guJIyNhDRHzb7Ay93Eabv48ivCXbKl7 vdr8C9pSFfKMLW2q+9egxVRQUVC3ZBr6L2hSVkg4891f8OJkqVomgRB6VjmBoSogF6S1 W2yx1pYDTZ59sPZnlQBOeBJ1L7SYLjRvdUGRzwy7mr8E2TzTWwQ4JCUvj80Gsf7a+NhI EYuA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=xtRJy0h7C1YGD2EnSPEUZFAgaH1uL5OYTuzgSNt/iVM=; b=HlvGrbcmuHBZC3nyUVMVEppGWGYgH2HhaYn5aB8/j1bNpwQObAtRE9GmjuEPLX9uTy Nini4LZHtdTMWjphD0Qu6MKTD7il7bBLz13XuPPruxNIMZQdYOu7ernb9R4fRjHwQp6I HYg7emXW+Ur6DdDlBkSPkYdowhNUH+n89WcqZcLKc3Yt53KoXbOJnNL1UqhXpdx3LYEZ RzvdMUFOXQAsGj0jPdx5W/fC4aGXPRH4QCgfsnGelbJRySTwRyA2MT3kEDJBppxCDu7W aDNcL1dNecr8OnldxG04Om+eY2MjgIyuEHr98uAtTyJq8CNO9BUbIUvzlTugHl878owI dheA==
X-Gm-Message-State: APjAAAVUBaZnJDo/DsIDrubiKX74XcpxIE8vW7XyXtILqmATpDYv8aqd MKdkKLfm9o57G3bjfy5S7wA=
X-Google-Smtp-Source: APXvYqyV9AOwdQ7tA/bVLpKPwb0kbWhz6+1mCZ96J4QCEzLPRw8asWn4ybQHTf91PhtsIFZ92Euz1Q==
X-Received: by 2002:ac8:f52:: with SMTP id l18mr5653549qtk.126.1573154292940; Thu, 07 Nov 2019 11:18:12 -0800 (PST)
Received: from localhost ([2620:10d:c091:500::2:3f13]) by smtp.gmail.com with ESMTPSA id u22sm985460qtb.59.2019.11.07.11.18.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 Nov 2019 11:18:12 -0800 (PST)
From: Tejun Heo
To: axboe@kernel.dk
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, lizefan@huawei.com, hannes@cmpxchg.org, kernel-team@fb.com, Tejun Heo , Paolo Valente
Subject: [PATCH 2/6] bfq-iosched: stop using blkg->stat_bytes and ->stat_ios
Date: Thu, 7 Nov 2019 11:18:00 -0800
Message-Id: <20191107191804.3735303-3-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191107191804.3735303-1-tj@kernel.org>
References: <20191107191804.3735303-1-tj@kernel.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

When used on cgroup1, bfq uses the blkg->stat_bytes and ->stat_ios
from blk-cgroup core to populate six stat knobs. blk-cgroup core is
moving away from blkg_rwstat to improve scalability and won't be able
to support this usage.

It isn't like the sharing gains all that much. Let's break it out to
dedicated rwstat counters which are updated when on cgroup1. This
makes use of bfqg_*rwstat*() helpers outside of
CONFIG_BFQ_CGROUP_DEBUG. Move them out.

v2: Compile fix when !CONFIG_BFQ_CGROUP_DEBUG.

Signed-off-by: Tejun Heo
Cc: Paolo Valente
---
 block/bfq-cgroup.c  | 39 +++++++++++++++++++++++++++------------
 block/bfq-iosched.c |  4 ++++
 block/bfq-iosched.h |  4 ++++
 3 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index d4755d4ad009..cea0ae12f937 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -347,6 +347,14 @@ void bfqg_and_blkg_put(struct bfq_group *bfqg)
 	bfqg_put(bfqg);
 }
 
+void bfqg_stats_update_legacy_io(struct request_queue *q, struct request *rq)
+{
+	struct bfq_group *bfqg = blkg_to_bfqg(rq->bio->bi_blkg);
+
+	blkg_rwstat_add(&bfqg->stats.bytes, rq->cmd_flags, blk_rq_bytes(rq));
+	blkg_rwstat_add(&bfqg->stats.ios, rq->cmd_flags, 1);
+}
+
 /* @stats = 0 */
 static void bfqg_stats_reset(struct bfqg_stats *stats)
 {
@@ -431,6 +439,8 @@ void bfq_init_entity(struct bfq_entity *entity, struct bfq_group *bfqg)
 
 static void bfqg_stats_exit(struct bfqg_stats *stats)
 {
+	blkg_rwstat_exit(&stats->bytes);
+	blkg_rwstat_exit(&stats->ios);
 #ifdef CONFIG_BFQ_CGROUP_DEBUG
 	blkg_rwstat_exit(&stats->merged);
 	blkg_rwstat_exit(&stats->service_time);
@@ -448,6 +458,10 @@ static void bfqg_stats_exit(struct bfqg_stats *stats)
 
 static int bfqg_stats_init(struct bfqg_stats *stats, gfp_t gfp)
 {
+	if (blkg_rwstat_init(&stats->bytes, gfp) ||
+	    blkg_rwstat_init(&stats->ios, gfp))
+		return -ENOMEM;
+
 #ifdef CONFIG_BFQ_CGROUP_DEBUG
 	if (blkg_rwstat_init(&stats->merged, gfp) ||
 	    blkg_rwstat_init(&stats->service_time, gfp) ||
@@ -1057,7 +1071,6 @@ static ssize_t bfq_io_set_weight(struct kernfs_open_file *of,
 	return bfq_io_set_device_weight(of, buf, nbytes, off);
 }
 
-#ifdef CONFIG_BFQ_CGROUP_DEBUG
 static int bfqg_print_rwstat(struct seq_file *sf, void *v)
 {
 	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_rwstat,
@@ -1082,6 +1095,7 @@ static int bfqg_print_rwstat_recursive(struct seq_file *sf, void *v)
 	return 0;
 }
 
+#ifdef CONFIG_BFQ_CGROUP_DEBUG
 static int bfqg_print_stat(struct seq_file *sf, void *v)
 {
 	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat,
@@ -1125,7 +1139,8 @@ static int bfqg_print_stat_recursive(struct seq_file *sf, void *v)
 static u64 bfqg_prfill_sectors(struct seq_file *sf,
 			       struct blkg_policy_data *pd, int off)
 {
-	u64 sum = blkg_rwstat_total(&pd->blkg->stat_bytes);
+	struct bfq_group *bfqg = blkg_to_bfqg(pd->blkg);
+	u64 sum = blkg_rwstat_total(&bfqg->stats.bytes);
 
 	return __blkg_prfill_u64(sf, pd, sum >> 9);
 }
@@ -1142,8 +1157,8 @@ static u64 bfqg_prfill_sectors_recursive(struct seq_file *sf,
 {
 	struct blkg_rwstat_sample tmp;
 
-	blkg_rwstat_recursive_sum(pd->blkg, NULL,
-			offsetof(struct blkcg_gq, stat_bytes), &tmp);
+	blkg_rwstat_recursive_sum(pd->blkg, &blkcg_policy_bfq,
+			offsetof(struct bfq_group, stats.bytes), &tmp);
 
 	return __blkg_prfill_u64(sf, pd,
 		(tmp.cnt[BLKG_RWSTAT_READ] + tmp.cnt[BLKG_RWSTAT_WRITE]) >> 9);
@@ -1226,13 +1241,13 @@ struct cftype bfq_blkcg_legacy_files[] = {
 	/* statistics, covers only the tasks in the bfqg */
 	{
 		.name = "bfq.io_service_bytes",
-		.private = (unsigned long)&blkcg_policy_bfq,
-		.seq_show = blkg_print_stat_bytes,
+		.private = offsetof(struct bfq_group, stats.bytes),
+		.seq_show = bfqg_print_rwstat,
 	},
 	{
 		.name = "bfq.io_serviced",
-		.private = (unsigned long)&blkcg_policy_bfq,
-		.seq_show = blkg_print_stat_ios,
+		.private = offsetof(struct bfq_group, stats.ios),
+		.seq_show = bfqg_print_rwstat,
 	},
 #ifdef CONFIG_BFQ_CGROUP_DEBUG
 	{
@@ -1269,13 +1284,13 @@ struct cftype bfq_blkcg_legacy_files[] = {
 	/* the same statistics which cover the bfqg and its descendants */
 	{
 		.name = "bfq.io_service_bytes_recursive",
-		.private = (unsigned long)&blkcg_policy_bfq,
-		.seq_show = blkg_print_stat_bytes_recursive,
+		.private = offsetof(struct bfq_group, stats.bytes),
+		.seq_show = bfqg_print_rwstat_recursive,
 	},
 	{
 		.name = "bfq.io_serviced_recursive",
-		.private = (unsigned long)&blkcg_policy_bfq,
-		.seq_show = blkg_print_stat_ios_recursive,
+		.private = offsetof(struct bfq_group, stats.ios),
+		.seq_show = bfqg_print_rwstat_recursive,
 	},
 #ifdef CONFIG_BFQ_CGROUP_DEBUG
 	{
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0319d6339822..41d2d83c919b 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5464,6 +5464,10 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	bool idle_timer_disabled = false;
 	unsigned int cmd_flags;
 
+#ifdef CONFIG_BFQ_GROUP_IOSCHED
+	if (!cgroup_subsys_on_dfl(io_cgrp_subsys) && rq->bio)
+		bfqg_stats_update_legacy_io(q, rq);
+#endif
 	spin_lock_irq(&bfqd->lock);
 	if (blk_mq_sched_try_insert_merge(q, rq)) {
 		spin_unlock_irq(&bfqd->lock);
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 5d1a519640f6..2676c06218f1 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -809,6 +809,9 @@ struct bfq_stat {
 };
 
 struct bfqg_stats {
+	/* basic stats */
+	struct blkg_rwstat bytes;
+	struct blkg_rwstat ios;
 #ifdef CONFIG_BFQ_CGROUP_DEBUG
 	/* number of ios merged */
 	struct blkg_rwstat merged;
@@ -956,6 +959,7 @@ void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg);
 
 /* ---------------- cgroups-support interface ---------------- */
 
+void bfqg_stats_update_legacy_io(struct request_queue *q, struct request *rq);
 void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq,
 			      unsigned int op);
 void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op);

From patchwork Thu Nov 7 19:18:01 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 11233653
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03A161599 for ; Thu, 7 Nov 2019 19:18:35 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D5E4D222C6 for ; Thu, 7 Nov 2019 19:18:34 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573154314; bh=lZoNVutLlozqAtic0zSso+60DgTk77sR+ruJsqsfE9U=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=ZSZoVY1wcDC7mkzfiVrIoz9VAa2BJySp5evfSQx/Ku+Jtb/dx2G1I+9EcGTa1b7jU EA3vPQF4Y1OvU57FQd8X1boW7dSa9NTPUVdZRoZzbJGulfsh6NESH6PImdWG9XMuYY mAIPUBzUSfiISPYgFgtHoZd6GDtcvsg6oXB5JSNA=
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726726AbfKGTSS (ORCPT ); Thu, 7 Nov 2019 14:18:18 -0500
Received: from mail-qk1-f196.google.com ([209.85.222.196]:35757 "EHLO mail-qk1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725497AbfKGTSR (ORCPT ); Thu, 7 Nov 2019 14:18:17 -0500
Received: by mail-qk1-f196.google.com with SMTP id i19so3047767qki.2; Thu, 07 Nov 2019 11:18:16 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=vMqhWfpdfuPCtzz4h4xpyfgHaJP+ivcOVe0R+amrObc=; b=HdZZU5I5vTMc9Ex/54W/e4TkSETgqYkiFfEYhMgNaJhQdd+onKikiaU2ZwcyX78T9f Oa5tYEbglcQ7BNDDmKYsg/Cddj4nl1MvSwmLTVYa42IO8g21JU45R62X7j4feTUwd/zr tr5/FsGA/y5AtFuz/eKR/ftmqQYxPlAtdqhorB/H7Px+w5ma+bMxZ5jMwQUIRnEmiSPB t/A585WVXN8hr39fuKUNgmJ6HaC9CkgBbGu2mU2xpF+rsF/pUxKQ9P3EUPFYdJyr1X34 0ZR5335OuQ1I1sXKrBq99pb+4E3nds0D8vMjbXXIm/0D9bLv16oZBtAzgPHGpC0l2pJo /tAQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=vMqhWfpdfuPCtzz4h4xpyfgHaJP+ivcOVe0R+amrObc=; b=HobgxKEy3cszSoaVko0FcZVSzyl5xz4/oQ96gl7T0naL34fFVML+PqG/vPXaYbLfd2 AC/iEQGCHL0FhyUex9aJn8jQYDa4083gDuJJ/uzzZWR1RBDxriP1El1ZdRXzBAt5UHeb 2IgUV4nHidiLVD7nNqxlHi5grT4lSHNNVM0N/0oaccFrPYg1k9WU8YcPnvJ/xMorKeTy xMTjjUZBoGIWPnQinrn8dwbQWTohLV74y/FkdqzLHNnFSmenZewvcFyfEOp3lipz9yJ3 dklpq4bKKrs+W7QxKRPahc95idAUvkRewXFwdWzDKfOa8VopRQuMOZjrnFGcD6OwW5Ju FevQ==
X-Gm-Message-State: APjAAAXWnOZ9ap2rm2kBjgbM/9dmrEWTtKllPodTyhVpusxYgXR/XAQ/ J5w5c0xU8JhFhr6EC4MrssU=
X-Google-Smtp-Source: APXvYqyT8RfabQyfXaqXK3nxD/nlxlEXGgLKxtX6I8z10xCWd3gY3I2Nj7MqXa7u8++c1LH12VeHGg==
X-Received: by 2002:ae9:c203:: with SMTP id j3mr4661629qkg.12.1573154295635; Thu, 07 Nov 2019 11:18:15 -0800 (PST)
Received: from localhost ([2620:10d:c091:500::2:3f13]) by smtp.gmail.com with ESMTPSA id 23sm1607427qkj.52.2019.11.07.11.18.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 Nov 2019 11:18:14 -0800 (PST)
From: Tejun Heo
To: axboe@kernel.dk
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, lizefan@huawei.com, hannes@cmpxchg.org, kernel-team@fb.com, Tejun Heo
Subject: [PATCH 3/6] blk-throtl: stop using blkg->stat_bytes and ->stat_ios
Date: Thu, 7 Nov 2019 11:18:01 -0800
Message-Id: <20191107191804.3735303-4-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191107191804.3735303-1-tj@kernel.org>
References: <20191107191804.3735303-1-tj@kernel.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

When used on cgroup1, blk-throtl uses the blkg->stat_bytes and
->stat_ios from blk-cgroup core to populate four stat knobs.
blk-cgroup core is moving away from blkg_rwstat to improve scalability
and won't be able to support this usage.

It isn't like the sharing gains all that much. Let's break them out
to dedicated rwstat counters which are updated when on cgroup1.

Signed-off-by: Tejun Heo
---
 block/blk-throttle.c | 70 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 61 insertions(+), 9 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 18f773e52dfb..2d0fc73d9781 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -176,6 +176,9 @@ struct throtl_grp {
 	unsigned int bio_cnt; /* total bios */
 	unsigned int bad_bio_cnt; /* bios exceeding latency threshold */
 	unsigned long bio_cnt_reset_time;
+
+	struct blkg_rwstat stat_bytes;
+	struct blkg_rwstat stat_ios;
 };
 
 /* We measure latency for request size from <= 4k to >= 1M */
@@ -489,6 +492,12 @@ static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp,
 	if (!tg)
 		return NULL;
 
+	if (blkg_rwstat_init(&tg->stat_bytes, gfp))
+		goto err_free_tg;
+
+	if (blkg_rwstat_init(&tg->stat_ios, gfp))
+		goto err_exit_stat_bytes;
+
 	throtl_service_queue_init(&tg->service_queue);
 
 	for (rw = READ; rw <= WRITE; rw++) {
@@ -513,6 +522,12 @@ static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp,
 	tg->idletime_threshold_conf = DFL_IDLE_THRESHOLD;
 
 	return &tg->pd;
+
+err_exit_stat_bytes:
+	blkg_rwstat_exit(&tg->stat_bytes);
+err_free_tg:
+	kfree(tg);
+	return NULL;
 }
 
 static void throtl_pd_init(struct blkg_policy_data *pd)
@@ -611,6 +626,8 @@ static void throtl_pd_free(struct blkg_policy_data *pd)
 	struct throtl_grp *tg = pd_to_tg(pd);
 
 	del_timer_sync(&tg->service_queue.pending_timer);
+	blkg_rwstat_exit(&tg->stat_bytes);
+	blkg_rwstat_exit(&tg->stat_ios);
 	kfree(tg);
 }
 
@@ -1464,6 +1481,32 @@ static ssize_t tg_set_conf_uint(struct kernfs_open_file *of,
 	return tg_set_conf(of, buf, nbytes, off, false);
 }
 
+static int tg_print_rwstat(struct seq_file *sf, void *v)
+{
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
+			  blkg_prfill_rwstat, &blkcg_policy_throtl,
+			  seq_cft(sf)->private, true);
+	return 0;
+}
+
+static u64 tg_prfill_rwstat_recursive(struct seq_file *sf,
+				      struct blkg_policy_data *pd, int off)
+{
+	struct blkg_rwstat_sample sum;
+
+	blkg_rwstat_recursive_sum(pd_to_blkg(pd), &blkcg_policy_throtl, off,
+				  &sum);
+	return __blkg_prfill_rwstat(sf, pd, &sum);
+}
+
+static int tg_print_rwstat_recursive(struct seq_file *sf, void *v)
+{
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
+			  tg_prfill_rwstat_recursive, &blkcg_policy_throtl,
+			  seq_cft(sf)->private, true);
+	return 0;
+}
+
 static struct cftype throtl_legacy_files[] = {
 	{
 		.name = "throttle.read_bps_device",
@@ -1491,23 +1534,23 @@ static struct cftype throtl_legacy_files[] = {
 	},
 	{
 		.name = "throttle.io_service_bytes",
-		.private = (unsigned long)&blkcg_policy_throtl,
-		.seq_show = blkg_print_stat_bytes,
+		.private = offsetof(struct throtl_grp, stat_bytes),
+		.seq_show = tg_print_rwstat,
 	},
 	{
 		.name = "throttle.io_service_bytes_recursive",
-		.private = (unsigned long)&blkcg_policy_throtl,
-		.seq_show = blkg_print_stat_bytes_recursive,
+		.private = offsetof(struct throtl_grp, stat_bytes),
+		.seq_show = tg_print_rwstat_recursive,
 	},
 	{
 		.name = "throttle.io_serviced",
-		.private = (unsigned long)&blkcg_policy_throtl,
-		.seq_show = blkg_print_stat_ios,
+		.private = offsetof(struct throtl_grp, stat_ios),
+		.seq_show = tg_print_rwstat,
 	},
 	{
 		.name = "throttle.io_serviced_recursive",
-		.private = (unsigned long)&blkcg_policy_throtl,
-		.seq_show = blkg_print_stat_ios_recursive,
+		.private = offsetof(struct throtl_grp, stat_ios),
+		.seq_show = tg_print_rwstat_recursive,
 	},
 	{ } /* terminate */
 };
@@ -2127,7 +2170,16 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
 	WARN_ON_ONCE(!rcu_read_lock_held());
 
 	/* see throtl_charge_bio() */
-	if (bio_flagged(bio, BIO_THROTTLED) || !tg->has_rules[rw])
+	if (bio_flagged(bio, BIO_THROTTLED))
+		goto out;
+
+	if (!cgroup_subsys_on_dfl(io_cgrp_subsys)) {
+		blkg_rwstat_add(&tg->stat_bytes, bio->bi_opf,
+				bio->bi_iter.bi_size);
+		blkg_rwstat_add(&tg->stat_ios, bio->bi_opf, 1);
+	}
+
+	if (!tg->has_rules[rw])
 		goto out;
 
 	spin_lock_irq(&q->queue_lock);

From patchwork Thu Nov 7 19:18:02 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 11233649
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CE8F515AB for ; Thu, 7 Nov 2019 19:18:29 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AC70A21D7F for ; Thu, 7 Nov 2019 19:18:29 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573154309; bh=q8juDoPYv3ypamRHCksga266RnS5ycqVvz2sQy3MBGg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=ZFsRewfj8RbOSXqek8IjUf46YFuTFCyNGWM9C6cOIJZm9BqJB3kfoq0rpTEj9yXBE Pl4H5hoAMy1IDniBYLSxJRirMW6mmrH08oEdFbMwJJAR94petAJwCJ0UzfSm9ZerIt LP31mxcwN/NTM8sVqKEao435GnHMNxcwdo82PUsI=
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725497AbfKGTSU (ORCPT ); Thu, 7 Nov 2019 14:18:20 -0500
Received: from mail-qk1-f196.google.com ([209.85.222.196]:41436 "EHLO mail-qk1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726799AbfKGTST (ORCPT ); Thu, 7 Nov 2019 14:18:19 -0500
Received: by mail-qk1-f196.google.com with SMTP id m125so3013657qkd.8; Thu, 07 Nov 2019 11:18:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=2Y+pHXtP5O8Okj4I0mI2qlT4EC3+7maC5emZuz1NNRA=; b=dSIVX8HNakf4UCSDoL9GG1/oa+JSiZjL1/w344VGzrazd/6fAm080EaF9aM6QD3MFA vq5u1kxJxiBZLyIF2H2xhW+la8wFC8tcKKkGNbQez6kyomQU1uijmoEMK9HT1AqS+CV3 vIYYkZKpcIM/GAewRf3OCKq81YJZ1gw7Dw2T86U5a3lNuiy0/4ScCO2uJeX+RsV7PgAq UP1MEMD/D78B7sO0L+tyC6BPtLKx1QOf+C4p7t7ppanFT0Hp1KTAWvSpEOx9rjllz0T9 bsRnu/XNTsEeQckp1QUWnXntos/XMOMa9VPkmJTOUiue0zs7Fx6LmS7jqs8dQQuFi1y3 dFKA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=2Y+pHXtP5O8Okj4I0mI2qlT4EC3+7maC5emZuz1NNRA=; b=VZEQ3ZHRyOXdAg97P8pc1UwGplkxM01Dnu0b5pUWnh3X+dh1ucsx+a4yiMgZGsfvEn kmmZaSyh603uMpIUaJRTXkg8JSWa5GZQQ1Gdm76DxTRsifWofQR/Woarfl9/UIpFtWDb G7+DFY6c/XkZ8qlhsIrTQeEMjF46AfIs9GCRDCjI5Wh7UWYs0fiMCKnrlJMzOD+uAy3c N+/4HQZZj5lJkjS3vLmljP7pkTnvo7TBcOoIEYVvo7vjdRM5nIPwKMZWYwE3Z2W/B1EB 3v2nAhy0f1Ic8olUCsdRedtOMiu3hhZV40NtNRTyEXd7qyqvBQZL1I9knwfLe+mPpIpv UXtQ==
X-Gm-Message-State: APjAAAWW9Ui0PvicP8+wiqAEGfccfyn4+mfqRFVG74yWwWfYO/kchDvx XI7ZaH6ddlkFRX7mPqfY4VJZ+PkC
X-Google-Smtp-Source: APXvYqxEzXjT/Yjh2YaX2GmIVA2JV8RHUOMpZm34G+LXXpps6UG6QGzE2IzX5MEqSD4vNWt/wk/rmw==
X-Received: by 2002:a37:5bc4:: with SMTP id p187mr4430131qkb.420.1573154298093; Thu, 07 Nov 2019 11:18:18 -0800 (PST)
Received: from localhost ([2620:10d:c091:500::2:3f13]) by smtp.gmail.com with ESMTPSA id h12sm1541234qkh.123.2019.11.07.11.18.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 Nov 2019 11:18:17 -0800 (PST)
From: Tejun Heo
To: axboe@kernel.dk
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, lizefan@huawei.com, hannes@cmpxchg.org, kernel-team@fb.com, Tejun Heo
Subject: [PATCH 4/6] blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive()
Date: Thu, 7 Nov 2019 11:18:02 -0800
Message-Id: <20191107191804.3735303-5-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191107191804.3735303-1-tj@kernel.org>
References: <20191107191804.3735303-1-tj@kernel.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

These don't have users anymore. Remove them.

Signed-off-by: Tejun Heo
---
 block/blk-cgroup.c         | 83 --------------------------------------
 include/linux/blk-cgroup.h |  5 ---
 2 files changed, 88 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 1eb8895be4c6..e7e93377e320 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -615,89 +615,6 @@ u64 blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd,
 }
 EXPORT_SYMBOL_GPL(blkg_prfill_rwstat);
 
-static u64 blkg_prfill_rwstat_field(struct seq_file *sf,
-				    struct blkg_policy_data *pd, int off)
-{
-	struct blkg_rwstat_sample rwstat = { };
-
-	blkg_rwstat_read((void *)pd->blkg + off, &rwstat);
-	return __blkg_prfill_rwstat(sf, pd, &rwstat);
-}
-
-/**
- * blkg_print_stat_bytes - seq_show callback for blkg->stat_bytes
- * @sf: seq_file to print to
- * @v: unused
- *
- * To be used as cftype->seq_show to print blkg->stat_bytes.
- * cftype->private must be set to the blkcg_policy.
- */
-int blkg_print_stat_bytes(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  blkg_prfill_rwstat_field, (void *)seq_cft(sf)->private,
-			  offsetof(struct blkcg_gq, stat_bytes), true);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(blkg_print_stat_bytes);
-
-/**
- * blkg_print_stat_bytes - seq_show callback for blkg->stat_ios
- * @sf: seq_file to print to
- * @v: unused
- *
- * To be used as cftype->seq_show to print blkg->stat_ios. cftype->private
- * must be set to the blkcg_policy.
- */
-int blkg_print_stat_ios(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  blkg_prfill_rwstat_field, (void *)seq_cft(sf)->private,
-			  offsetof(struct blkcg_gq, stat_ios), true);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(blkg_print_stat_ios);
-
-static u64 blkg_prfill_rwstat_field_recursive(struct seq_file *sf,
-					      struct blkg_policy_data *pd,
-					      int off)
-{
-	struct blkg_rwstat_sample rwstat;
-
-	blkg_rwstat_recursive_sum(pd->blkg, NULL, off, &rwstat);
-	return __blkg_prfill_rwstat(sf, pd, &rwstat);
-}
-
-/**
- * blkg_print_stat_bytes_recursive - recursive version of blkg_print_stat_bytes
- * @sf: seq_file to print to
- * @v: unused
- */
-int blkg_print_stat_bytes_recursive(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  blkg_prfill_rwstat_field_recursive,
-			  (void *)seq_cft(sf)->private,
-			  offsetof(struct blkcg_gq, stat_bytes), true);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(blkg_print_stat_bytes_recursive);
-
-/**
- * blkg_print_stat_ios_recursive - recursive version of blkg_print_stat_ios
- * @sf: seq_file to print to
- * @v: unused
- */
-int blkg_print_stat_ios_recursive(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  blkg_prfill_rwstat_field_recursive,
-			  (void *)seq_cft(sf)->private,
-			  offsetof(struct blkcg_gq, stat_ios), true);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(blkg_print_stat_ios_recursive);
-
 /**
  * blkg_rwstat_recursive_sum - collect hierarchical blkg_rwstat
  * @blkg: blkg of interest
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index bed9e43f9426..914ce55fa8c2 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -220,11 +220,6 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd,
 			 const struct blkg_rwstat_sample *rwstat);
 u64 blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd,
 		       int off);
-int blkg_print_stat_bytes(struct seq_file *sf, void *v);
-int blkg_print_stat_ios(struct seq_file *sf, void *v);
-int blkg_print_stat_bytes_recursive(struct seq_file *sf, void *v);
-int blkg_print_stat_ios_recursive(struct seq_file *sf, void *v);
-
 void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol,
 			       int off, struct blkg_rwstat_sample *sum);
 

From patchwork Thu Nov 7 19:18:03 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 11233647
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6A57914E5 for ; Thu, 7 Nov 2019 19:18:29 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 360D621D7F for ; Thu, 7 Nov 2019 19:18:29 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573154309; bh=SOpXDVv+lTTmwtU6iBvVwwYjqRwbci4fDrej1qYpQWQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=NpS6OtEYqTaYdGAVZ8UNi2ac2w5s/ivapsIdMcTZ463vUikV2PTlBBpgDRqmiDGVK +cc3uuwiK4Zd/eghT//xYuSdJgSpTbDoHjDJrFXdIvxi3xSm1KSRO94vo1tMHE9817 M7n3e+82WXuUSwiCoq+OgWkll0LyHWIsK0lXXU7U=
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726983AbfKGTSZ (ORCPT ); Thu, 7 Nov 2019 14:18:25 -0500
Received: from mail-qk1-f195.google.com ([209.85.222.195]:38061 "EHLO mail-qk1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726906AbfKGTSW (ORCPT ); Thu, 7 Nov 2019 14:18:22 -0500
Received: by mail-qk1-f195.google.com with SMTP id e2so3030499qkn.5; Thu, 07 Nov 2019 11:18:21 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=Pg7gxuQae0aIJicehpPMrFA6vzoQZTt2JWYNWZeBtSg=; b=Bwc+QgJC7mKraYygjU0cU1mZAE1Gtn1J1mzkLrPHJqqB/pGIQMYCjbO4h6dc17E+OR a7cW9r5ODRioQBFDL4zrDQHOeKsLDnRP5pTlUCSaVw+iW9vPE2NgGby1oBPvNlc+cqTe M0vmXUMte9KrMC4os5/Jr6tn2LNS4+8gcLwntNhJasaQwuND17Lqnh6Hq65x3amEBVpJ H2HlLpSmpN5mSm5qO3cWhfG9QcpiQq6FdwNiELHQK0jnaBLIKrqZwM+vaJTJTXh8uUx5 etpdwxIHWWrYig2cxT8D/nSohIS0SsIDNLS7TpNVwdzSS1jFQW8TIkfzi/5QzLohhrvQ 9hDw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=Pg7gxuQae0aIJicehpPMrFA6vzoQZTt2JWYNWZeBtSg=; b=l0BwYIZK0lLYo3ZyCUZu/TkzWl0jMLStdkyd1dQdiPFF3eDlVFGwkXytLp0uuygeJb ZD+OYn60lLF2HtDIIdZmzq5g/KJuGtle+PFWnbvsZ16w/TBLKCLG5OtQvaAZB1XCLyNW +5nCWfS7U0grJFYUfxTXHQosqAOJ3e9fx366+lwiihs0eQN60Q5uwMNa0BhMTB2UoaB4 S1wvxedtM/C5y6/xUzGFyvBBYoCX3teiazT9wpOY3fOKYKWerHWM1NGSqdKp7iYivH6G 8srTaMO3ONrcXNMsGa8JF1vwJV1tXr/xCZ5vf8jragGTPCpwfmOzQJwRzNU2ILxiIvej Xkfw==
X-Gm-Message-State: APjAAAX0boim9GI9y3RQrQtjM1/aRt6p/K04FoGFiMEMLla4LMMUCjbl kYnonieqNSsjRDuOBGQkOaY=
X-Google-Smtp-Source: APXvYqwMLQqocWXbaLBhWGSAYgSQyCwGu6F7NP7q5kJM58GyLx2MoiN8AVrPNcF2IOwVFClQ+GRZpQ==
X-Received: by 2002:a37:d8e:: with SMTP id 136mr4538301qkn.249.1573154300386; Thu, 07 Nov 2019 11:18:20 -0800 (PST)
Received: from localhost ([2620:10d:c091:500::2:3f13]) by smtp.gmail.com with ESMTPSA id x1sm1591321qtf.81.2019.11.07.11.18.19 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 Nov 2019 11:18:19 -0800 (PST)
From: Tejun Heo
To: axboe@kernel.dk
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, lizefan@huawei.com, hannes@cmpxchg.org, kernel-team@fb.com, Tejun Heo , Dan Schatzberg , Daniel Xu
Subject: [PATCH 5/6] blk-cgroup: reimplement basic IO stats using cgroup rstat
Date: Thu, 7 Nov 2019 11:18:03 -0800
Message-Id: <20191107191804.3735303-6-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191107191804.3735303-1-tj@kernel.org>
References: <20191107191804.3735303-1-tj@kernel.org>
Sender: linux-block-owner@vger.kernel.org
Precedence:
bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org

blk-cgroup has been using blkg_rwstat to track basic IO stats.
Unfortunately, reading recursive stats scales badly as it involves
walking all descendants. On systems with a huge number of cgroups (dead
or alive), this can lead to substantial CPU cost when reading IO stats.

This patch reimplements basic IO stats using cgroup rstat, which uses
more memory but makes recursive stat reading O(# descendants which have
been active since last reading) instead of O(# descendants).

* blk-cgroup core no longer uses sync/async stats. Introduce new stat
  enums - BLKG_IOSTAT_{READ|WRITE|DISCARD}.

* Add blkg_iostat[_set] which encapsulates byte and io stats, last
  values for propagation delta calculation and u64_stats_sync for
  correctness on 32bit archs.

* Update the new percpu stat counters directly and add
  blkcg_rstat_flush() to propagate them.

* blkcg_print_stat() can now bring the stats up to date by calling
  cgroup_rstat_flush() and print them instead of directly summing up
  all descendants.

* Each blkg now allocates 96 bytes per cpu. It used to be 40 bytes.
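
For illustration only, the "last values for propagation delta
calculation" idea can be modelled in plain userspace C. This is a rough
sketch, not the kernel code: struct group, SKETCH_NR_CPUS,
group_account(), group_flush_cpu() and flush_hierarchy() are
hypothetical names, there is no u64_stats_sync or locking, and every
group is flushed unconditionally, whereas cgroup rstat only visits
groups that were marked updated since the last flush.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NR_CPUS 4	/* hypothetical fixed CPU count */

enum { IOSTAT_READ, IOSTAT_WRITE, IOSTAT_DISCARD, IOSTAT_NR };

struct iostat {
	uint64_t bytes[IOSTAT_NR];
	uint64_t ios[IOSTAT_NR];
};

struct iostat_set {
	struct iostat cur;	/* running totals */
	struct iostat last;	/* part of cur already propagated upward */
};

struct group {
	struct group *parent;
	struct iostat_set cpu[SKETCH_NR_CPUS];	/* per-cpu counters */
	struct iostat_set total;		/* this group + descendants */
};

static void iostat_add(struct iostat *dst, const struct iostat *src)
{
	for (int i = 0; i < IOSTAT_NR; i++) {
		dst->bytes[i] += src->bytes[i];
		dst->ios[i] += src->ios[i];
	}
}

static void iostat_sub(struct iostat *dst, const struct iostat *src)
{
	for (int i = 0; i < IOSTAT_NR; i++) {
		dst->bytes[i] -= src->bytes[i];
		dst->ios[i] -= src->ios[i];
	}
}

static void group_init(struct group *g, struct group *parent)
{
	memset(g, 0, sizeof(*g));
	g->parent = parent;
}

/* Hot path: only the local per-cpu counters are touched. */
static void group_account(struct group *g, int cpu, int type, uint64_t bytes)
{
	g->cpu[cpu].cur.bytes[type] += bytes;
	g->cpu[cpu].cur.ios[type]++;
}

/* Fold one cpu's unpropagated delta into @g, then @g's delta into its parent. */
static void group_flush_cpu(struct group *g, int cpu)
{
	struct iostat delta = g->cpu[cpu].cur;

	/* per-cpu delta -> group total */
	iostat_sub(&delta, &g->cpu[cpu].last);
	iostat_add(&g->total.cur, &delta);
	iostat_add(&g->cpu[cpu].last, &delta);

	/* group total delta -> parent total */
	if (g->parent) {
		delta = g->total.cur;
		iostat_sub(&delta, &g->total.last);
		iostat_add(&g->parent->total.cur, &delta);
		iostat_add(&g->total.last, &delta);
	}
}

/* Flush every cpu for every group; @leaf_first must list children before parents. */
static void flush_hierarchy(struct group **leaf_first, size_t n)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		for (size_t i = 0; i < n; i++)
			group_flush_cpu(leaf_first[i], cpu);
}
```

Because each level remembers what it has already handed upward in
"last", flushing twice in a row adds nothing the second time, and a
reader only needs the target group's total.cur after a flush instead of
summing over all descendants.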
Signed-off-by: Tejun Heo Cc: Dan Schatzberg Cc: Daniel Xu --- block/blk-cgroup.c | 124 +++++++++++++++++++++++++++++-------- include/linux/blk-cgroup.h | 48 ++++++++++++-- 2 files changed, 142 insertions(+), 30 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index e7e93377e320..b3429be62057 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -80,8 +80,7 @@ static void blkg_free(struct blkcg_gq *blkg) if (blkg->pd[i]) blkcg_policy[i]->pd_free_fn(blkg->pd[i]); - blkg_rwstat_exit(&blkg->stat_ios); - blkg_rwstat_exit(&blkg->stat_bytes); + free_percpu(blkg->iostat_cpu); percpu_ref_exit(&blkg->refcnt); kfree(blkg); } @@ -146,7 +145,7 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q, gfp_t gfp_mask) { struct blkcg_gq *blkg; - int i; + int i, cpu; /* alloc and init base part */ blkg = kzalloc_node(sizeof(*blkg), gfp_mask, q->node); @@ -156,8 +155,8 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q, if (percpu_ref_init(&blkg->refcnt, blkg_release, 0, gfp_mask)) goto err_free; - if (blkg_rwstat_init(&blkg->stat_bytes, gfp_mask) || - blkg_rwstat_init(&blkg->stat_ios, gfp_mask)) + blkg->iostat_cpu = alloc_percpu_gfp(struct blkg_iostat_set, gfp_mask); + if (!blkg->iostat_cpu) goto err_free; blkg->q = q; @@ -167,6 +166,10 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q, INIT_WORK(&blkg->async_bio_work, blkg_async_bio_workfn); blkg->blkcg = blkcg; + u64_stats_init(&blkg->iostat.sync); + for_each_possible_cpu(cpu) + u64_stats_init(&per_cpu_ptr(blkg->iostat_cpu, cpu)->sync); + for (i = 0; i < BLKCG_MAX_POLS; i++) { struct blkcg_policy *pol = blkcg_policy[i]; struct blkg_policy_data *pd; @@ -393,7 +396,6 @@ struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg, static void blkg_destroy(struct blkcg_gq *blkg) { struct blkcg *blkcg = blkg->blkcg; - struct blkcg_gq *parent = blkg->parent; int i; lockdep_assert_held(&blkg->q->queue_lock); @@ -410,11 +412,6 @@ 
static void blkg_destroy(struct blkcg_gq *blkg) pol->pd_offline_fn(blkg->pd[i]); } - if (parent) { - blkg_rwstat_add_aux(&parent->stat_bytes, &blkg->stat_bytes); - blkg_rwstat_add_aux(&parent->stat_ios, &blkg->stat_ios); - } - blkg->online = false; radix_tree_delete(&blkcg->blkg_tree, blkg->q->id); @@ -464,7 +461,7 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css, { struct blkcg *blkcg = css_to_blkcg(css); struct blkcg_gq *blkg; - int i; + int i, cpu; mutex_lock(&blkcg_pol_mutex); spin_lock_irq(&blkcg->lock); @@ -475,8 +472,12 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css, * anyway. If you get hit by a race, retry. */ hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { - blkg_rwstat_reset(&blkg->stat_bytes); - blkg_rwstat_reset(&blkg->stat_ios); + for_each_possible_cpu(cpu) { + struct blkg_iostat_set *bis = + per_cpu_ptr(blkg->iostat_cpu, cpu); + memset(bis, 0, sizeof(*bis)); + } + memset(&blkg->iostat, 0, sizeof(blkg->iostat)); for (i = 0; i < BLKCG_MAX_POLS; i++) { struct blkcg_policy *pol = blkcg_policy[i]; @@ -840,16 +841,18 @@ static int blkcg_print_stat(struct seq_file *sf, void *v) struct blkcg *blkcg = css_to_blkcg(seq_css(sf)); struct blkcg_gq *blkg; + cgroup_rstat_flush(blkcg->css.cgroup); rcu_read_lock(); hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) { + struct blkg_iostat_set *bis = &blkg->iostat; const char *dname; char *buf; - struct blkg_rwstat_sample rwstat; u64 rbytes, wbytes, rios, wios, dbytes, dios; size_t size = seq_get_buf(sf, &buf), off = 0; int i; bool has_stats = false; + unsigned seq; spin_lock_irq(&blkg->q->queue_lock); @@ -868,17 +871,16 @@ static int blkcg_print_stat(struct seq_file *sf, void *v) */ off += scnprintf(buf+off, size-off, "%s ", dname); - blkg_rwstat_recursive_sum(blkg, NULL, - offsetof(struct blkcg_gq, stat_bytes), &rwstat); - rbytes = rwstat.cnt[BLKG_RWSTAT_READ]; - wbytes = rwstat.cnt[BLKG_RWSTAT_WRITE]; - dbytes = rwstat.cnt[BLKG_RWSTAT_DISCARD]; + do { + seq = 
u64_stats_fetch_begin(&bis->sync); - blkg_rwstat_recursive_sum(blkg, NULL, - offsetof(struct blkcg_gq, stat_ios), &rwstat); - rios = rwstat.cnt[BLKG_RWSTAT_READ]; - wios = rwstat.cnt[BLKG_RWSTAT_WRITE]; - dios = rwstat.cnt[BLKG_RWSTAT_DISCARD]; + rbytes = bis->cur.bytes[BLKG_IOSTAT_READ]; + wbytes = bis->cur.bytes[BLKG_IOSTAT_WRITE]; + dbytes = bis->cur.bytes[BLKG_IOSTAT_DISCARD]; + rios = bis->cur.ios[BLKG_IOSTAT_READ]; + wios = bis->cur.ios[BLKG_IOSTAT_WRITE]; + dios = bis->cur.ios[BLKG_IOSTAT_DISCARD]; + } while (u64_stats_fetch_retry(&bis->sync, seq)); if (rbytes || wbytes || rios || wios) { has_stats = true; @@ -1214,6 +1216,77 @@ static int blkcg_can_attach(struct cgroup_taskset *tset) return ret; } +static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *src) +{ + int i; + + for (i = 0; i < BLKG_IOSTAT_NR; i++) { + dst->bytes[i] = src->bytes[i]; + dst->ios[i] = src->ios[i]; + } +} + +static void blkg_iostat_add(struct blkg_iostat *dst, struct blkg_iostat *src) +{ + int i; + + for (i = 0; i < BLKG_IOSTAT_NR; i++) { + dst->bytes[i] += src->bytes[i]; + dst->ios[i] += src->ios[i]; + } +} + +static void blkg_iostat_sub(struct blkg_iostat *dst, struct blkg_iostat *src) +{ + int i; + + for (i = 0; i < BLKG_IOSTAT_NR; i++) { + dst->bytes[i] -= src->bytes[i]; + dst->ios[i] -= src->ios[i]; + } +} + +static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu) +{ + struct blkcg *blkcg = css_to_blkcg(css); + struct blkcg_gq *blkg; + + rcu_read_lock(); + + hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) { + struct blkcg_gq *parent = blkg->parent; + struct blkg_iostat_set *bisc = per_cpu_ptr(blkg->iostat_cpu, cpu); + struct blkg_iostat cur, delta; + unsigned seq; + + /* fetch the current per-cpu values */ + do { + seq = u64_stats_fetch_begin(&bisc->sync); + blkg_iostat_set(&cur, &bisc->cur); + } while (u64_stats_fetch_retry(&bisc->sync, seq)); + + /* propagate percpu delta to global */ + u64_stats_update_begin(&blkg->iostat.sync); 
+ blkg_iostat_set(&delta, &cur); + blkg_iostat_sub(&delta, &bisc->last); + blkg_iostat_add(&blkg->iostat.cur, &delta); + blkg_iostat_add(&bisc->last, &delta); + u64_stats_update_end(&blkg->iostat.sync); + + /* propagate global delta to parent */ + if (parent) { + u64_stats_update_begin(&parent->iostat.sync); + blkg_iostat_set(&delta, &blkg->iostat.cur); + blkg_iostat_sub(&delta, &blkg->iostat.last); + blkg_iostat_add(&parent->iostat.cur, &delta); + blkg_iostat_add(&blkg->iostat.last, &delta); + u64_stats_update_end(&parent->iostat.sync); + } + } + + rcu_read_unlock(); +} + static void blkcg_bind(struct cgroup_subsys_state *root_css) { int i; @@ -1246,6 +1319,7 @@ struct cgroup_subsys io_cgrp_subsys = { .css_offline = blkcg_css_offline, .css_free = blkcg_css_free, .can_attach = blkcg_can_attach, + .css_rstat_flush = blkcg_rstat_flush, .bind = blkcg_bind, .dfl_cftypes = blkcg_files, .legacy_cftypes = blkcg_legacy_files, diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h index 914ce55fa8c2..867ab391e409 100644 --- a/include/linux/blk-cgroup.h +++ b/include/linux/blk-cgroup.h @@ -15,7 +15,9 @@ */ #include +#include #include +#include #include #include #include @@ -31,6 +33,14 @@ #ifdef CONFIG_BLK_CGROUP +enum blkg_iostat_type { + BLKG_IOSTAT_READ, + BLKG_IOSTAT_WRITE, + BLKG_IOSTAT_DISCARD, + + BLKG_IOSTAT_NR, +}; + enum blkg_rwstat_type { BLKG_RWSTAT_READ, BLKG_RWSTAT_WRITE, @@ -61,6 +71,17 @@ struct blkcg { #endif }; +struct blkg_iostat { + u64 bytes[BLKG_IOSTAT_NR]; + u64 ios[BLKG_IOSTAT_NR]; +}; + +struct blkg_iostat_set { + struct u64_stats_sync sync; + struct blkg_iostat cur; + struct blkg_iostat last; +}; + /* * blkg_[rw]stat->aux_cnt is excluded for local stats but included for * recursive. Used to carry stats of dead children. @@ -127,8 +148,8 @@ struct blkcg_gq { /* is this blkg online? 
protected by both blkcg and q locks */ bool online; - struct blkg_rwstat stat_bytes; - struct blkg_rwstat stat_ios; + struct blkg_iostat_set __percpu *iostat_cpu; + struct blkg_iostat_set iostat; struct blkg_policy_data *pd[BLKCG_MAX_POLS]; @@ -740,15 +761,32 @@ static inline bool blkcg_bio_issue_check(struct request_queue *q, throtl = blk_throtl_bio(q, blkg, bio); if (!throtl) { + struct blkg_iostat_set *bis; + int rwd, cpu; + + if (op_is_discard(bio->bi_opf)) + rwd = BLKG_IOSTAT_DISCARD; + else if (op_is_write(bio->bi_opf)) + rwd = BLKG_IOSTAT_WRITE; + else + rwd = BLKG_IOSTAT_READ; + + cpu = get_cpu(); + bis = per_cpu_ptr(blkg->iostat_cpu, cpu); + u64_stats_update_begin(&bis->sync); + /* * If the bio is flagged with BIO_QUEUE_ENTERED it means this * is a split bio and we would have already accounted for the * size of the bio. */ if (!bio_flagged(bio, BIO_QUEUE_ENTERED)) - blkg_rwstat_add(&blkg->stat_bytes, bio->bi_opf, - bio->bi_iter.bi_size); - blkg_rwstat_add(&blkg->stat_ios, bio->bi_opf, 1); + bis->cur.bytes[rwd] += bio->bi_iter.bi_size; + bis->cur.ios[rwd]++; + + u64_stats_update_end(&bis->sync); + cgroup_rstat_updated(blkg->blkcg->css.cgroup, cpu); + put_cpu(); } blkcg_bio_issue_init(bio); From patchwork Thu Nov 7 19:18:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejun Heo X-Patchwork-Id: 11233651 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 518CE1599 for ; Thu, 7 Nov 2019 19:18:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 13A0D21D79 for ; Thu, 7 Nov 2019 19:18:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573154314; bh=o+0TNd8WZXHBDr5cEdU5UFXA4JWooEyOxM2BtsMQLmo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; 
b=T9tgnSt5mcHFKWJtKZhwZMD1RSxtMqswfAzFrgMfE03uKiSR6+1vvZuRNOArj60Lw H/PzjmrMNVln/TSeRhVKMmKQ0B81s9iHhpxUbHwQARM01hedDOj5ypI9kv0xDlrf/Y o9lePplvNKwJ6Jno6eEgOw317Fk06cl38k0KDm2w= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726561AbfKGTS3 (ORCPT ); Thu, 7 Nov 2019 14:18:29 -0500 Received: from mail-qk1-f196.google.com ([209.85.222.196]:33759 "EHLO mail-qk1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726019AbfKGTSY (ORCPT ); Thu, 7 Nov 2019 14:18:24 -0500 Received: by mail-qk1-f196.google.com with SMTP id 71so3066530qkl.0; Thu, 07 Nov 2019 11:18:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=QE+dG2A0JV3/hEHPfyxjosOSTNZMxpMgY5UN3q7iv60=; b=cnzrPUBTfJgcbYioR1pTTzMZRSIMIoRQVwEW4nhb4r0KNYMPcKbm0jIHLkO0HmSvin F8WTQM/0IxynTfR9BlL75+C+bvQAoajeNAkn8gih5Q5WNRo7lnJUjUI/BCI/FWOziuqc OI69O20GZvqwJR0x7KO9p0ebhYa8cf/PSBRv53WHafqqlr1hEX6rJB0bRp1jRyEtNHl5 VUJ8l7PPDmP70+nccoffZ8ETzCoVQOeBJVXMI+x02PQ6JXrEg4XTdHmsQtsLBAIibU9Q e7/Bsp20bqfDuKNYgqw6pnyZNsUziAQmQAIB6h2EVcYyqBCdG0afG8SYo3u5qKcItZTy akOA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=QE+dG2A0JV3/hEHPfyxjosOSTNZMxpMgY5UN3q7iv60=; b=UPS29DJOSa+peasyjxhuXJVA48X2zuFdhFsFFA8H1zAqxbVOjZKeWGge3ZigseRFYR 9kNFfKiymAc/ZLlXhJcGHLlOmwt+0axg6OriwwrzSzVrqQ/DYLkNmYMHzsXqXuGOmzql nCXDsNtzQOJmmfUz7g4l8oSAW1dH7O/wbaReNrJqo90cRcDdCmEirjaFKppU5hAh2k67 GUukRh9+zH3ogTfptbW90cEW1QoGOj7qDW4kZ4N7PjCcVxxNBJXUvU7Q1oQLL3J9TiZu LIeTZIGryPb6Srx5NcevOH65d9grrSPOEgYvts3NYTR40JwTYeTs7Rt2eBg//zdQAPCB Utnw== X-Gm-Message-State: APjAAAUwtEAjFT2/36BflMQA0y8t/ZGXxsG0cIdKAPl5cEaCsL10GXcj MMc8OV92dfX6PK5j/yFjJrw= X-Google-Smtp-Source: APXvYqxC11KnWSeiN5eMtkmkYbqRFY/l3kagiCasX3ICTl83gaDBeIVp0iYKh5Yvy+30IQ4kJid4RQ== X-Received: by 2002:a37:ae05:: with SMTP 
id x5mr4569104qke.243.1573154302813; Thu, 07 Nov 2019 11:18:22 -0800 (PST)
Received: from localhost ([2620:10d:c091:500::2:3f13]) by smtp.gmail.com with ESMTPSA id z17sm1355578qtq.69.2019.11.07.11.18.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 Nov 2019 11:18:22 -0800 (PST)
From: Tejun Heo
To: axboe@kernel.dk
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, lizefan@huawei.com, hannes@cmpxchg.org, kernel-team@fb.com, Tejun Heo
Subject: [PATCH 6/6] blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT
Date: Thu, 7 Nov 2019 11:18:04 -0800
Message-Id: <20191107191804.3735303-7-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191107191804.3735303-1-tj@kernel.org>
References: <20191107191804.3735303-1-tj@kernel.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

blkg_rwstat is now only used by bfq-iosched and blk-throtl when on
cgroup1. Let's move it into its own files and gate it behind a config
option.

Signed-off-by: Tejun Heo --- block/Kconfig | 4 + block/Kconfig.iosched | 1 + block/Makefile | 1 + block/bfq-iosched.h | 2 + block/blk-cgroup-rwstat.c | 129 ++++++++++++++++++++++++++++++ block/blk-cgroup-rwstat.h | 149 ++++++++++++++++++++++++++++++++++ block/blk-cgroup.c | 97 ---------------------- block/blk-throttle.c | 1 + include/linux/blk-cgroup.h | 159 ------------------------------------- 9 files changed, 287 insertions(+), 256 deletions(-) create mode 100644 block/blk-cgroup-rwstat.c create mode 100644 block/blk-cgroup-rwstat.h diff --git a/block/Kconfig b/block/Kconfig index 41c0917ce622..c23094a14a2b 100644 --- a/block/Kconfig +++ b/block/Kconfig @@ -32,6 +32,9 @@ config BLK_RQ_ALLOC_TIME config BLK_SCSI_REQUEST bool +config BLK_CGROUP_RWSTAT + bool + config BLK_DEV_BSG bool "Block layer SG support v4" default y @@ -86,6 +89,7 @@ config BLK_DEV_ZONED config BLK_DEV_THROTTLING bool "Block layer bio throttling support" depends on BLK_CGROUP=y + select BLK_CGROUP_RWSTAT ---help--- Block layer bio throttling support. It can be used to limit the IO rate to a device. 
IO rate policies are per cgroup and diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched index b89310a022ad..7df14133adc8 100644 --- a/block/Kconfig.iosched +++ b/block/Kconfig.iosched @@ -31,6 +31,7 @@ config IOSCHED_BFQ config BFQ_GROUP_IOSCHED bool "BFQ hierarchical scheduling support" depends on IOSCHED_BFQ && BLK_CGROUP + select BLK_CGROUP_RWSTAT ---help--- Enable hierarchical scheduling in BFQ, using the blkio diff --git a/block/Makefile b/block/Makefile index 9ef57ace90d4..205a5f2fef17 100644 --- a/block/Makefile +++ b/block/Makefile @@ -16,6 +16,7 @@ obj-$(CONFIG_BLK_SCSI_REQUEST) += scsi_ioctl.o obj-$(CONFIG_BLK_DEV_BSG) += bsg.o obj-$(CONFIG_BLK_DEV_BSGLIB) += bsg-lib.o obj-$(CONFIG_BLK_CGROUP) += blk-cgroup.o +obj-$(CONFIG_BLK_CGROUP_RWSTAT) += blk-cgroup-rwstat.o obj-$(CONFIG_BLK_DEV_THROTTLING) += blk-throttle.o obj-$(CONFIG_BLK_CGROUP_IOLATENCY) += blk-iolatency.o obj-$(CONFIG_BLK_CGROUP_IOCOST) += blk-iocost.o diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h index 2676c06218f1..9c82c1f35716 100644 --- a/block/bfq-iosched.h +++ b/block/bfq-iosched.h @@ -10,6 +10,8 @@ #include #include +#include "blk-cgroup-rwstat.h" + #define BFQ_IOPRIO_CLASSES 3 #define BFQ_CL_IDLE_TIMEOUT (HZ/5) diff --git a/block/blk-cgroup-rwstat.c b/block/blk-cgroup-rwstat.c new file mode 100644 index 000000000000..85d5790ac49b --- /dev/null +++ b/block/blk-cgroup-rwstat.c @@ -0,0 +1,129 @@ +/* SPDX-License-Identifier: GPL-2.0 + * + * Legacy blkg rwstat helpers enabled by CONFIG_BLK_CGROUP_RWSTAT. + * Do not use in new code. 
+ */ +#include "blk-cgroup-rwstat.h" + +int blkg_rwstat_init(struct blkg_rwstat *rwstat, gfp_t gfp) +{ + int i, ret; + + for (i = 0; i < BLKG_RWSTAT_NR; i++) { + ret = percpu_counter_init(&rwstat->cpu_cnt[i], 0, gfp); + if (ret) { + while (--i >= 0) + percpu_counter_destroy(&rwstat->cpu_cnt[i]); + return ret; + } + atomic64_set(&rwstat->aux_cnt[i], 0); + } + return 0; +} +EXPORT_SYMBOL_GPL(blkg_rwstat_init); + +void blkg_rwstat_exit(struct blkg_rwstat *rwstat) +{ + int i; + + for (i = 0; i < BLKG_RWSTAT_NR; i++) + percpu_counter_destroy(&rwstat->cpu_cnt[i]); +} +EXPORT_SYMBOL_GPL(blkg_rwstat_exit); + +/** + * __blkg_prfill_rwstat - prfill helper for a blkg_rwstat + * @sf: seq_file to print to + * @pd: policy private data of interest + * @rwstat: rwstat to print + * + * Print @rwstat to @sf for the device assocaited with @pd. + */ +u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, + const struct blkg_rwstat_sample *rwstat) +{ + static const char *rwstr[] = { + [BLKG_RWSTAT_READ] = "Read", + [BLKG_RWSTAT_WRITE] = "Write", + [BLKG_RWSTAT_SYNC] = "Sync", + [BLKG_RWSTAT_ASYNC] = "Async", + [BLKG_RWSTAT_DISCARD] = "Discard", + }; + const char *dname = blkg_dev_name(pd->blkg); + u64 v; + int i; + + if (!dname) + return 0; + + for (i = 0; i < BLKG_RWSTAT_NR; i++) + seq_printf(sf, "%s %s %llu\n", dname, rwstr[i], + rwstat->cnt[i]); + + v = rwstat->cnt[BLKG_RWSTAT_READ] + + rwstat->cnt[BLKG_RWSTAT_WRITE] + + rwstat->cnt[BLKG_RWSTAT_DISCARD]; + seq_printf(sf, "%s Total %llu\n", dname, v); + return v; +} +EXPORT_SYMBOL_GPL(__blkg_prfill_rwstat); + +/** + * blkg_prfill_rwstat - prfill callback for blkg_rwstat + * @sf: seq_file to print to + * @pd: policy private data of interest + * @off: offset to the blkg_rwstat in @pd + * + * prfill callback for printing a blkg_rwstat. 
+ */ +u64 blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, + int off) +{ + struct blkg_rwstat_sample rwstat = { }; + + blkg_rwstat_read((void *)pd + off, &rwstat); + return __blkg_prfill_rwstat(sf, pd, &rwstat); +} +EXPORT_SYMBOL_GPL(blkg_prfill_rwstat); + +/** + * blkg_rwstat_recursive_sum - collect hierarchical blkg_rwstat + * @blkg: blkg of interest + * @pol: blkcg_policy which contains the blkg_rwstat + * @off: offset to the blkg_rwstat in blkg_policy_data or @blkg + * @sum: blkg_rwstat_sample structure containing the results + * + * Collect the blkg_rwstat specified by @blkg, @pol and @off and all its + * online descendants and their aux counts. The caller must be holding the + * queue lock for online tests. + * + * If @pol is NULL, blkg_rwstat is at @off bytes into @blkg; otherwise, it + * is at @off bytes into @blkg's blkg_policy_data of the policy. + */ +void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol, + int off, struct blkg_rwstat_sample *sum) +{ + struct blkcg_gq *pos_blkg; + struct cgroup_subsys_state *pos_css; + unsigned int i; + + lockdep_assert_held(&blkg->q->queue_lock); + + rcu_read_lock(); + blkg_for_each_descendant_pre(pos_blkg, pos_css, blkg) { + struct blkg_rwstat *rwstat; + + if (!pos_blkg->online) + continue; + + if (pol) + rwstat = (void *)blkg_to_pd(pos_blkg, pol) + off; + else + rwstat = (void *)pos_blkg + off; + + for (i = 0; i < BLKG_RWSTAT_NR; i++) + sum->cnt[i] = blkg_rwstat_read_counter(rwstat, i); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(blkg_rwstat_recursive_sum); diff --git a/block/blk-cgroup-rwstat.h b/block/blk-cgroup-rwstat.h new file mode 100644 index 000000000000..ee746919c41f --- /dev/null +++ b/block/blk-cgroup-rwstat.h @@ -0,0 +1,149 @@ +/* SPDX-License-Identifier: GPL-2.0 + * + * Legacy blkg rwstat helpers enabled by CONFIG_BLK_CGROUP_RWSTAT. + * Do not use in new code. 
+ */ +#ifndef _BLK_CGROUP_RWSTAT_H +#define _BLK_CGROUP_RWSTAT_H + +#include + +enum blkg_rwstat_type { + BLKG_RWSTAT_READ, + BLKG_RWSTAT_WRITE, + BLKG_RWSTAT_SYNC, + BLKG_RWSTAT_ASYNC, + BLKG_RWSTAT_DISCARD, + + BLKG_RWSTAT_NR, + BLKG_RWSTAT_TOTAL = BLKG_RWSTAT_NR, +}; + +/* + * blkg_[rw]stat->aux_cnt is excluded for local stats but included for + * recursive. Used to carry stats of dead children. + */ +struct blkg_rwstat { + struct percpu_counter cpu_cnt[BLKG_RWSTAT_NR]; + atomic64_t aux_cnt[BLKG_RWSTAT_NR]; +}; + +struct blkg_rwstat_sample { + u64 cnt[BLKG_RWSTAT_NR]; +}; + +static inline u64 blkg_rwstat_read_counter(struct blkg_rwstat *rwstat, + unsigned int idx) +{ + return atomic64_read(&rwstat->aux_cnt[idx]) + + percpu_counter_sum_positive(&rwstat->cpu_cnt[idx]); +} + +int blkg_rwstat_init(struct blkg_rwstat *rwstat, gfp_t gfp); +void blkg_rwstat_exit(struct blkg_rwstat *rwstat); +u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, + const struct blkg_rwstat_sample *rwstat); +u64 blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, + int off); +void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol, + int off, struct blkg_rwstat_sample *sum); + + +/** + * blkg_rwstat_add - add a value to a blkg_rwstat + * @rwstat: target blkg_rwstat + * @op: REQ_OP and flags + * @val: value to add + * + * Add @val to @rwstat. The counters are chosen according to @rw. The + * caller is responsible for synchronizing calls to this function. 
+ */ +static inline void blkg_rwstat_add(struct blkg_rwstat *rwstat, + unsigned int op, uint64_t val) +{ + struct percpu_counter *cnt; + + if (op_is_discard(op)) + cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_DISCARD]; + else if (op_is_write(op)) + cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_WRITE]; + else + cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_READ]; + + percpu_counter_add_batch(cnt, val, BLKG_STAT_CPU_BATCH); + + if (op_is_sync(op)) + cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_SYNC]; + else + cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_ASYNC]; + + percpu_counter_add_batch(cnt, val, BLKG_STAT_CPU_BATCH); +} + +/** + * blkg_rwstat_read - read the current values of a blkg_rwstat + * @rwstat: blkg_rwstat to read + * + * Read the current snapshot of @rwstat and return it in the aux counts. + */ +static inline void blkg_rwstat_read(struct blkg_rwstat *rwstat, + struct blkg_rwstat_sample *result) +{ + int i; + + for (i = 0; i < BLKG_RWSTAT_NR; i++) + result->cnt[i] = + percpu_counter_sum_positive(&rwstat->cpu_cnt[i]); +} + +/** + * blkg_rwstat_total - read the total count of a blkg_rwstat + * @rwstat: blkg_rwstat to read + * + * Return the total count of @rwstat regardless of the IO direction. This + * function can be called without synchronization and takes care of u64 + * atomicity. + */ +static inline uint64_t blkg_rwstat_total(struct blkg_rwstat *rwstat) +{ + struct blkg_rwstat_sample tmp = { }; + + blkg_rwstat_read(rwstat, &tmp); + return tmp.cnt[BLKG_RWSTAT_READ] + tmp.cnt[BLKG_RWSTAT_WRITE]; +} + +/** + * blkg_rwstat_reset - reset a blkg_rwstat + * @rwstat: blkg_rwstat to reset + */ +static inline void blkg_rwstat_reset(struct blkg_rwstat *rwstat) +{ + int i; + + for (i = 0; i < BLKG_RWSTAT_NR; i++) { + percpu_counter_set(&rwstat->cpu_cnt[i], 0); + atomic64_set(&rwstat->aux_cnt[i], 0); + } +} + +/** + * blkg_rwstat_add_aux - add a blkg_rwstat into another's aux count + * @to: the destination blkg_rwstat + * @from: the source + * + * Add @from's count including the aux one to @to's aux count. 
+ */ +static inline void blkg_rwstat_add_aux(struct blkg_rwstat *to, + struct blkg_rwstat *from) +{ + u64 sum[BLKG_RWSTAT_NR]; + int i; + + for (i = 0; i < BLKG_RWSTAT_NR; i++) + sum[i] = percpu_counter_sum_positive(&from->cpu_cnt[i]); + + for (i = 0; i < BLKG_RWSTAT_NR; i++) + atomic64_add(sum[i] + atomic64_read(&from->aux_cnt[i]), + &to->aux_cnt[i]); +} +#endif /* _BLK_CGROUP_RWSTAT_H */ diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index b3429be62057..708dea92dac8 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -561,103 +561,6 @@ u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v) } EXPORT_SYMBOL_GPL(__blkg_prfill_u64); -/** - * __blkg_prfill_rwstat - prfill helper for a blkg_rwstat - * @sf: seq_file to print to - * @pd: policy private data of interest - * @rwstat: rwstat to print - * - * Print @rwstat to @sf for the device assocaited with @pd. - */ -u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, - const struct blkg_rwstat_sample *rwstat) -{ - static const char *rwstr[] = { - [BLKG_RWSTAT_READ] = "Read", - [BLKG_RWSTAT_WRITE] = "Write", - [BLKG_RWSTAT_SYNC] = "Sync", - [BLKG_RWSTAT_ASYNC] = "Async", - [BLKG_RWSTAT_DISCARD] = "Discard", - }; - const char *dname = blkg_dev_name(pd->blkg); - u64 v; - int i; - - if (!dname) - return 0; - - for (i = 0; i < BLKG_RWSTAT_NR; i++) - seq_printf(sf, "%s %s %llu\n", dname, rwstr[i], - rwstat->cnt[i]); - - v = rwstat->cnt[BLKG_RWSTAT_READ] + - rwstat->cnt[BLKG_RWSTAT_WRITE] + - rwstat->cnt[BLKG_RWSTAT_DISCARD]; - seq_printf(sf, "%s Total %llu\n", dname, v); - return v; -} -EXPORT_SYMBOL_GPL(__blkg_prfill_rwstat); - -/** - * blkg_prfill_rwstat - prfill callback for blkg_rwstat - * @sf: seq_file to print to - * @pd: policy private data of interest - * @off: offset to the blkg_rwstat in @pd - * - * prfill callback for printing a blkg_rwstat. 
- */ -u64 blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, - int off) -{ - struct blkg_rwstat_sample rwstat = { }; - - blkg_rwstat_read((void *)pd + off, &rwstat); - return __blkg_prfill_rwstat(sf, pd, &rwstat); -} -EXPORT_SYMBOL_GPL(blkg_prfill_rwstat); - -/** - * blkg_rwstat_recursive_sum - collect hierarchical blkg_rwstat - * @blkg: blkg of interest - * @pol: blkcg_policy which contains the blkg_rwstat - * @off: offset to the blkg_rwstat in blkg_policy_data or @blkg - * @sum: blkg_rwstat_sample structure containing the results - * - * Collect the blkg_rwstat specified by @blkg, @pol and @off and all its - * online descendants and their aux counts. The caller must be holding the - * queue lock for online tests. - * - * If @pol is NULL, blkg_rwstat is at @off bytes into @blkg; otherwise, it - * is at @off bytes into @blkg's blkg_policy_data of the policy. - */ -void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol, - int off, struct blkg_rwstat_sample *sum) -{ - struct blkcg_gq *pos_blkg; - struct cgroup_subsys_state *pos_css; - unsigned int i; - - lockdep_assert_held(&blkg->q->queue_lock); - - rcu_read_lock(); - blkg_for_each_descendant_pre(pos_blkg, pos_css, blkg) { - struct blkg_rwstat *rwstat; - - if (!pos_blkg->online) - continue; - - if (pol) - rwstat = (void *)blkg_to_pd(pos_blkg, pol) + off; - else - rwstat = (void *)pos_blkg + off; - - for (i = 0; i < BLKG_RWSTAT_NR; i++) - sum->cnt[i] = blkg_rwstat_read_counter(rwstat, i); - } - rcu_read_unlock(); -} -EXPORT_SYMBOL_GPL(blkg_rwstat_recursive_sum); - /* Performs queue bypass and policy enabled checks then looks up blkg. 
*/ static struct blkcg_gq *blkg_lookup_check(struct blkcg *blkcg, const struct blkcg_policy *pol, diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 2d0fc73d9781..98233c9c65a8 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -12,6 +12,7 @@ #include #include #include "blk.h" +#include "blk-cgroup-rwstat.h" /* Max dispatch from a group in 1 round */ static int throtl_grp_quantum = 8; diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h index 867ab391e409..48a66738143d 100644 --- a/include/linux/blk-cgroup.h +++ b/include/linux/blk-cgroup.h @@ -41,17 +41,6 @@ enum blkg_iostat_type { BLKG_IOSTAT_NR, }; -enum blkg_rwstat_type { - BLKG_RWSTAT_READ, - BLKG_RWSTAT_WRITE, - BLKG_RWSTAT_SYNC, - BLKG_RWSTAT_ASYNC, - BLKG_RWSTAT_DISCARD, - - BLKG_RWSTAT_NR, - BLKG_RWSTAT_TOTAL = BLKG_RWSTAT_NR, -}; - struct blkcg_gq; struct blkcg { @@ -82,19 +71,6 @@ struct blkg_iostat_set { struct blkg_iostat last; }; -/* - * blkg_[rw]stat->aux_cnt is excluded for local stats but included for - * recursive. Used to carry stats of dead children. - */ -struct blkg_rwstat { - struct percpu_counter cpu_cnt[BLKG_RWSTAT_NR]; - atomic64_t aux_cnt[BLKG_RWSTAT_NR]; -}; - -struct blkg_rwstat_sample { - u64 cnt[BLKG_RWSTAT_NR]; -}; - /* * A blkcg_gq (blkg) is association between a block cgroup (blkcg) and a * request_queue (q). 
This is used by blkcg policies which need to track
@@ -223,13 +199,6 @@ int blkcg_activate_policy(struct request_queue *q,
 void blkcg_deactivate_policy(struct request_queue *q,
 			     const struct blkcg_policy *pol);
 
-static inline u64 blkg_rwstat_read_counter(struct blkg_rwstat *rwstat,
-		unsigned int idx)
-{
-	return atomic64_read(&rwstat->aux_cnt[idx]) +
-		percpu_counter_sum_positive(&rwstat->cpu_cnt[idx]);
-}
-
 const char *blkg_dev_name(struct blkcg_gq *blkg);
 void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg,
 		       u64 (*prfill)(struct seq_file *,
@@ -237,12 +206,6 @@ void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg,
 		       const struct blkcg_policy *pol, int data,
 		       bool show_total);
 u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v);
-u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd,
-			 const struct blkg_rwstat_sample *rwstat);
-u64 blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd,
-		       int off);
-void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol,
-		int off, struct blkg_rwstat_sample *sum);
 
 struct blkg_conf_ctx {
 	struct gendisk			*disk;
@@ -594,128 +557,6 @@ static inline void blkg_put(struct blkcg_gq *blkg)
 		if (((d_blkg) = __blkg_lookup(css_to_blkcg(pos_css),	\
 					      (p_blkg)->q, false)))
 
-static inline int blkg_rwstat_init(struct blkg_rwstat *rwstat, gfp_t gfp)
-{
-	int i, ret;
-
-	for (i = 0; i < BLKG_RWSTAT_NR; i++) {
-		ret = percpu_counter_init(&rwstat->cpu_cnt[i], 0, gfp);
-		if (ret) {
-			while (--i >= 0)
-				percpu_counter_destroy(&rwstat->cpu_cnt[i]);
-			return ret;
-		}
-		atomic64_set(&rwstat->aux_cnt[i], 0);
-	}
-	return 0;
-}
-
-static inline void blkg_rwstat_exit(struct blkg_rwstat *rwstat)
-{
-	int i;
-
-	for (i = 0; i < BLKG_RWSTAT_NR; i++)
-		percpu_counter_destroy(&rwstat->cpu_cnt[i]);
-}
-
-/**
- * blkg_rwstat_add - add a value to a blkg_rwstat
- * @rwstat: target blkg_rwstat
- * @op: REQ_OP and flags
- * @val: value to add
- *
- * Add @val to @rwstat. The counters are chosen according to @rw. The
- * caller is responsible for synchronizing calls to this function.
- */
-static inline void blkg_rwstat_add(struct blkg_rwstat *rwstat,
-				   unsigned int op, uint64_t val)
-{
-	struct percpu_counter *cnt;
-
-	if (op_is_discard(op))
-		cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_DISCARD];
-	else if (op_is_write(op))
-		cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_WRITE];
-	else
-		cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_READ];
-
-	percpu_counter_add_batch(cnt, val, BLKG_STAT_CPU_BATCH);
-
-	if (op_is_sync(op))
-		cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_SYNC];
-	else
-		cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_ASYNC];
-
-	percpu_counter_add_batch(cnt, val, BLKG_STAT_CPU_BATCH);
-}
-
-/**
- * blkg_rwstat_read - read the current values of a blkg_rwstat
- * @rwstat: blkg_rwstat to read
- *
- * Read the current snapshot of @rwstat and return it in the aux counts.
- */
-static inline void blkg_rwstat_read(struct blkg_rwstat *rwstat,
-		struct blkg_rwstat_sample *result)
-{
-	int i;
-
-	for (i = 0; i < BLKG_RWSTAT_NR; i++)
-		result->cnt[i] =
-			percpu_counter_sum_positive(&rwstat->cpu_cnt[i]);
-}
-
-/**
- * blkg_rwstat_total - read the total count of a blkg_rwstat
- * @rwstat: blkg_rwstat to read
- *
- * Return the total count of @rwstat regardless of the IO direction. This
- * function can be called without synchronization and takes care of u64
- * atomicity.
- */
-static inline uint64_t blkg_rwstat_total(struct blkg_rwstat *rwstat)
-{
-	struct blkg_rwstat_sample tmp = { };
-
-	blkg_rwstat_read(rwstat, &tmp);
-	return tmp.cnt[BLKG_RWSTAT_READ] + tmp.cnt[BLKG_RWSTAT_WRITE];
-}
-
-/**
- * blkg_rwstat_reset - reset a blkg_rwstat
- * @rwstat: blkg_rwstat to reset
- */
-static inline void blkg_rwstat_reset(struct blkg_rwstat *rwstat)
-{
-	int i;
-
-	for (i = 0; i < BLKG_RWSTAT_NR; i++) {
-		percpu_counter_set(&rwstat->cpu_cnt[i], 0);
-		atomic64_set(&rwstat->aux_cnt[i], 0);
-	}
-}
-
-/**
- * blkg_rwstat_add_aux - add a blkg_rwstat into another's aux count
- * @to: the destination blkg_rwstat
- * @from: the source
- *
- * Add @from's count including the aux one to @to's aux count.
- */
-static inline void blkg_rwstat_add_aux(struct blkg_rwstat *to,
-				       struct blkg_rwstat *from)
-{
-	u64 sum[BLKG_RWSTAT_NR];
-	int i;
-
-	for (i = 0; i < BLKG_RWSTAT_NR; i++)
-		sum[i] = percpu_counter_sum_positive(&from->cpu_cnt[i]);
-
-	for (i = 0; i < BLKG_RWSTAT_NR; i++)
-		atomic64_add(sum[i] + atomic64_read(&from->aux_cnt[i]),
-			     &to->aux_cnt[i]);
-}
-
 #ifdef CONFIG_BLK_DEV_THROTTLING
 extern bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
 			   struct bio *bio);