From patchwork Wed Jul 24 17:37:22 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 11057413
Date: Wed, 24 Jul 2019 10:37:22 -0700
From: Tejun Heo
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH RESEND 1/2] blkcg: rename blkcg->cgwb_refcnt to ->online_pin and always use it
Message-ID: <20190724173722.GA569612@devbig004.ftw2.facebook.com>
References: <20190724173517.GA559934@devbig004.ftw2.facebook.com>
In-Reply-To: <20190724173517.GA559934@devbig004.ftw2.facebook.com>
X-Mailing-List: linux-block@vger.kernel.org

blkcg->cgwb_refcnt is used to delay blkcg offlining so that blkgs
don't get offlined while there are active cgwbs on them.  However, it
ends up making offlining unordered, sometimes causing parents to be
offlined before children.

To fix it, we want child blkcgs to pin the parents' online states,
turning the refcnt into a more generic online pinning mechanism.

In preparation,

* blkcg->cgwb_refcnt -> blkcg->online_pin
* blkcg_cgwb_get/put() -> blkcg_pin/unpin_online()
* Take them out of CONFIG_CGROUP_WRITEBACK

Signed-off-by: Tejun Heo
---
(Resending cuz somehow I cleared To: line before sending)

Hello,

The asynchronous blkcg offlining can break offline ordering.  This
doesn't affect any in-kernel users but it broke an assumption that
the pending io.cost controller was making and is generally nasty.
These two patches fix the offlining ordering by making children pin
parents.

Thanks.

 block/blk-cgroup.c         |  6 +++---
 include/linux/blk-cgroup.h | 39 +++++++++++++--------------------------
 mm/backing-dev.c           |  6 +++---
 3 files changed, 19 insertions(+), 32 deletions(-)

--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1035,8 +1035,8 @@ static void blkcg_css_offline(struct cgr
 	/* this prevents anyone from attaching or migrating to this blkcg */
 	wb_blkcg_offline(blkcg);
 
-	/* put the base cgwb reference allowing step 2 to be triggered */
-	blkcg_cgwb_put(blkcg);
+	/* put the base online pin allowing step 2 to be triggered */
+	blkcg_unpin_online(blkcg);
 }
 
 /**
@@ -1135,11 +1135,11 @@ blkcg_css_alloc(struct cgroup_subsys_sta
 	}
 
 	spin_lock_init(&blkcg->lock);
+	refcount_set(&blkcg->online_pin, 1);
 	INIT_RADIX_TREE(&blkcg->blkg_tree, GFP_NOWAIT | __GFP_NOWARN);
 	INIT_HLIST_HEAD(&blkcg->blkg_list);
 #ifdef CONFIG_CGROUP_WRITEBACK
 	INIT_LIST_HEAD(&blkcg->cgwb_list);
-	refcount_set(&blkcg->cgwb_refcnt, 1);
 #endif
 	list_add_tail(&blkcg->all_blkcgs_node, &all_blkcgs);
 
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -47,6 +47,7 @@ struct blkcg_gq;
 struct blkcg {
 	struct cgroup_subsys_state	css;
 	spinlock_t			lock;
+	refcount_t			online_pin;
 
 	struct radix_tree_root		blkg_tree;
 	struct blkcg_gq	__rcu		*blkg_hint;
@@ -57,7 +58,6 @@ struct blkcg {
 	struct list_head		all_blkcgs_node;
 #ifdef CONFIG_CGROUP_WRITEBACK
 	struct list_head		cgwb_list;
-	refcount_t			cgwb_refcnt;
 #endif
 };
 
@@ -431,47 +431,34 @@ static inline struct blkcg *cpd_to_blkcg
 
 extern void blkcg_destroy_blkgs(struct blkcg *blkcg);
 
-#ifdef CONFIG_CGROUP_WRITEBACK
-
 /**
- * blkcg_cgwb_get - get a reference for blkcg->cgwb_list
+ * blkcg_pin_online - pin online state
  * @blkcg: blkcg of interest
  *
- * This is used to track the number of active wb's related to a blkcg.
+ * While pinned, a blkcg is kept online. This is primarily used to
+ * impedance-match blkg and cgwb lifetimes so that blkg doesn't go offline
+ * while an associated cgwb is still active.
  */
-static inline void blkcg_cgwb_get(struct blkcg *blkcg)
+static inline void blkcg_pin_online(struct blkcg *blkcg)
 {
-	refcount_inc(&blkcg->cgwb_refcnt);
+	refcount_inc(&blkcg->online_pin);
 }
 
 /**
- * blkcg_cgwb_put - put a reference for @blkcg->cgwb_list
+ * blkcg_unpin_online - unpin online state
  * @blkcg: blkcg of interest
  *
- * This is used to track the number of active wb's related to a blkcg.
- * When this count goes to zero, all active wb has finished so the
+ * This is primarily used to impedance-match blkg and cgwb lifetimes so
+ * that blkg doesn't go offline while an associated cgwb is still active.
+ * When this count goes to zero, all active cgwbs have finished so the
  * blkcg can continue destruction by calling blkcg_destroy_blkgs().
- * This work may occur in cgwb_release_workfn() on the cgwb_release
- * workqueue.
  */
-static inline void blkcg_cgwb_put(struct blkcg *blkcg)
+static inline void blkcg_unpin_online(struct blkcg *blkcg)
 {
-	if (refcount_dec_and_test(&blkcg->cgwb_refcnt))
+	if (refcount_dec_and_test(&blkcg->online_pin))
 		blkcg_destroy_blkgs(blkcg);
 }
 
-#else
-
-static inline void blkcg_cgwb_get(struct blkcg *blkcg) { }
-
-static inline void blkcg_cgwb_put(struct blkcg *blkcg)
-{
-	/* wb isn't being accounted, so trigger destruction right away */
-	blkcg_destroy_blkgs(blkcg);
-}
-
-#endif
-
 /**
  * blkg_path - format cgroup path of blkg
  * @blkg: blkg of interest
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -487,8 +487,8 @@ static void cgwb_release_workfn(struct w
 	css_put(wb->blkcg_css);
 	mutex_unlock(&wb->bdi->cgwb_release_mutex);
 
-	/* triggers blkg destruction if cgwb_refcnt becomes zero */
-	blkcg_cgwb_put(blkcg);
+	/* triggers blkg destruction if no online users left */
+	blkcg_unpin_online(blkcg);
 
 	fprop_local_destroy_percpu(&wb->memcg_completions);
 	percpu_ref_exit(&wb->refcnt);
@@ -588,7 +588,7 @@ static int cgwb_create(struct backing_de
 			list_add_tail_rcu(&wb->bdi_node, &bdi->wb_list);
 			list_add(&wb->memcg_node, memcg_cgwb_list);
 			list_add(&wb->blkcg_node, blkcg_cgwb_list);
-			blkcg_cgwb_get(blkcg);
+			blkcg_pin_online(blkcg);
 			css_get(memcg_css);
 			css_get(blkcg_css);
 		}

From patchwork Wed Jul 24 17:37:55 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 11057415
Date: Wed, 24 Jul 2019 10:37:55 -0700
From: Tejun Heo
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 2/2] blkcg: don't offline parent blkcg first
Message-ID: <20190724173755.GB569612@devbig004.ftw2.facebook.com>
References: <20190724173517.GA559934@devbig004.ftw2.facebook.com>
 <20190724173722.GA569612@devbig004.ftw2.facebook.com>
In-Reply-To: <20190724173722.GA569612@devbig004.ftw2.facebook.com>
X-Mailing-List: linux-block@vger.kernel.org

blkcg->cgwb_refcnt is used to delay blkcg offlining so that blkgs
don't get offlined while there are active cgwbs on them.  However, it
ends up making offlining unordered, sometimes causing parents to be
offlined before children.

Let's fix this by making child blkcgs pin the parents' online states.

Note that pin/unpin names are chosen over get/put intentionally
because css uses get/put online for something different.

Signed-off-by: Tejun Heo
---
 block/blk-cgroup.c         | 16 ++++++++++++++++
 include/linux/blk-cgroup.h |  6 +++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1158,6 +1158,21 @@ unlock:
 	return ret;
 }
 
+static int blkcg_css_online(struct cgroup_subsys_state *css)
+{
+	struct blkcg *blkcg = css_to_blkcg(css);
+	struct blkcg *parent = blkcg_parent(blkcg);
+
+	/*
+	 * blkcg_pin_online() is used to delay blkcg offline so that blkgs
+	 * don't go offline while cgwbs are still active on them. Pin the
+	 * parent so that offline always happens towards the root.
+	 */
+	if (parent)
+		blkcg_pin_online(parent);
+	return 0;
+}
+
 /**
  * blkcg_init_queue - initialize blkcg part of request queue
  * @q: request_queue to initialize
@@ -1300,6 +1315,7 @@ static void blkcg_exit(struct task_struc
 
 struct cgroup_subsys io_cgrp_subsys = {
 	.css_alloc = blkcg_css_alloc,
+	.css_online = blkcg_css_online,
 	.css_offline = blkcg_css_offline,
 	.css_free = blkcg_css_free,
 	.can_attach = blkcg_can_attach,
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -455,8 +455,12 @@ static inline void blkcg_pin_online(stru
  */
 static inline void blkcg_unpin_online(struct blkcg *blkcg)
 {
-	if (refcount_dec_and_test(&blkcg->online_pin))
+	do {
+		if (!refcount_dec_and_test(&blkcg->online_pin))
+			break;
 		blkcg_destroy_blkgs(blkcg);
+		blkcg = blkcg_parent(blkcg);
+	} while (blkcg);
 }
 
 /**
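
For illustration only, the ordering the two patches aim for can be
modelled in userspace.  The sketch below is not kernel code: struct
model_blkcg, the model_*() helpers and the plain int counter are
hypothetical stand-ins for struct blkcg, the blkcg_*() helpers and
refcount_t, and locking/RCU is ignored entirely.  It only shows that,
with children pinning parents, the parent's blkgs are destroyed after
the child's even when the parent is offlined first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct blkcg (assumption). */
struct model_blkcg {
	struct model_blkcg *parent;
	int online_pin;		/* plain int instead of refcount_t */
	int destroy_seq;	/* 0 = not destroyed, else destruction order */
};

static int model_destroy_counter;

/* Stand-in for blkcg_destroy_blkgs(): just records the order. */
static void model_destroy_blkgs(struct model_blkcg *blkcg)
{
	blkcg->destroy_seq = ++model_destroy_counter;
}

/* Mirrors css_alloc: every blkcg starts with one base pin. */
static void model_init(struct model_blkcg *blkcg, struct model_blkcg *parent)
{
	blkcg->parent = parent;
	blkcg->online_pin = 1;
	blkcg->destroy_seq = 0;
}

static void model_pin_online(struct model_blkcg *blkcg)
{
	blkcg->online_pin++;
}

/* Mirrors the loop from patch 2: walk towards the root while pins hit 0. */
static void model_unpin_online(struct model_blkcg *blkcg)
{
	do {
		if (--blkcg->online_pin != 0)
			break;
		model_destroy_blkgs(blkcg);
		blkcg = blkcg->parent;
	} while (blkcg);
}

/* Mirrors css_online from patch 2: a child pins its parent. */
static void model_css_online(struct model_blkcg *blkcg)
{
	if (blkcg->parent)
		model_pin_online(blkcg->parent);
}

/* Mirrors css_offline: drop the base pin. */
static void model_css_offline(struct model_blkcg *blkcg)
{
	model_unpin_online(blkcg);
}

/* Scenario: parent is offlined first while a cgwb still pins the child. */
static void model_demo(struct model_blkcg *root, struct model_blkcg *child)
{
	model_destroy_counter = 0;
	model_init(root, NULL);
	model_init(child, root);
	model_css_online(root);
	model_css_online(child);

	model_pin_online(child);	/* an active cgwb pins the child */

	model_css_offline(root);	/* parent offlined first ... */
	assert(root->destroy_seq == 0);	/* ... but held by the child's pin */

	model_css_offline(child);
	assert(child->destroy_seq == 0);	/* still held by the cgwb */

	model_unpin_online(child);	/* cgwb released: child, then root */
}
```

Calling model_demo() on two caller-provided structs leaves the child
with the earlier destruction sequence number and the root with the
later one, which is the child-before-parent order described above.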