From patchwork Sat Jun 15 18:24:45 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 10997209
From: Tejun Heo
To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, jack@suse.cz
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com, Tejun Heo
Subject: [PATCH 1/9] cgroup, blkcg: Prepare some symbols for module and !CONFIG_CGROUP usages
Date: Sat, 15 Jun 2019 11:24:45 -0700
Message-Id: <20190615182453.843275-2-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190615182453.843275-1-tj@kernel.org>
References: <20190615182453.843275-1-tj@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org

btrfs is going to use css_put() and wbc helpers to improve cgroup
writeback support. Add dummy css_get() definition and export wbc
helpers to prepare for module and !CONFIG_CGROUP builds.

Signed-off-by: Tejun Heo
Reported-by: kbuild test robot
Reviewed-by: Jan Kara
---
 block/blk-cgroup.c     | 1 +
 fs/fs-writeback.c      | 3 +++
 include/linux/cgroup.h | 1 +
 3 files changed, 5 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 617a2b3f7582..07600d3c9520 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -46,6 +46,7 @@ struct blkcg blkcg_root;
 EXPORT_SYMBOL_GPL(blkcg_root);
 
 struct cgroup_subsys_state * const blkcg_root_css = &blkcg_root.css;
+EXPORT_SYMBOL_GPL(blkcg_root_css);
 
 static struct blkcg_policy *blkcg_policy[BLKCG_MAX_POLS];
 
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 36855c1f8daf..c29cff345b1f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -269,6 +269,7 @@ void __inode_attach_wb(struct inode *inode, struct page *page)
 	if (unlikely(cmpxchg(&inode->i_wb, NULL, wb)))
 		wb_put(wb);
 }
+EXPORT_SYMBOL_GPL(__inode_attach_wb);
 
 /**
  * locked_inode_to_wb_and_lock_list - determine a locked inode's wb and lock it
@@ -580,6 +581,7 @@ void wbc_attach_and_unlock_inode(struct writeback_control *wbc,
 	if (unlikely(wb_dying(wbc->wb)))
 		inode_switch_wbs(inode, wbc->wb_id);
 }
+EXPORT_SYMBOL_GPL(wbc_attach_and_unlock_inode);
 
 /**
  * wbc_detach_inode - disassociate wbc from inode and perform foreign detection
@@ -699,6 +701,7 @@ void wbc_detach_inode(struct writeback_control *wbc)
 	wb_put(wbc->wb);
 	wbc->wb = NULL;
 }
+EXPORT_SYMBOL_GPL(wbc_detach_inode);
 
 /**
  * wbc_account_io - account IO issued during writeback
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 81f58b4a5418..4cb5d5646986 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -687,6 +687,7 @@ void cgroup_path_from_kernfs_id(const union kernfs_node_id *id,
 struct cgroup_subsys_state;
 struct cgroup;
 
+static inline void css_get(struct cgroup_subsys_state *css) {}
 static inline void css_put(struct cgroup_subsys_state *css) {}
 static inline int cgroup_attach_task_all(struct task_struct *from,
 					 struct task_struct *t) { return 0; }

From patchwork Sat Jun 15 18:24:46 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 10997243
From: Tejun Heo
To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, jack@suse.cz
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com, Tejun Heo
Subject: [PATCH 2/9] blkcg, writeback: Add wbc->no_wbc_acct
Date: Sat, 15 Jun 2019 11:24:46 -0700
Message-Id: <20190615182453.843275-3-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190615182453.843275-1-tj@kernel.org>
References: <20190615182453.843275-1-tj@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org

When writeback IOs are bounced through async layers, the IOs should
only be accounted against the wbc from the original bdi writeback to
avoid confusing cgroup inode ownership arbitration. Add
wbc->no_wbc_acct to allow disabling wbc accounting. This will be used
to make btrfs compression work well with cgroup IO control.

Signed-off-by: Tejun Heo
Reviewed-by: Josef Bacik
Reviewed-by: Jan Kara
---
 fs/fs-writeback.c         | 2 +-
 include/linux/writeback.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index c29cff345b1f..667ba07fffcd 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -724,7 +724,7 @@ void wbc_account_io(struct writeback_control *wbc, struct page *page,
 	 * behind a slow cgroup. Ultimately, we want pageout() to kick off
 	 * regular writeback instead of writing things out itself.
 	 */
-	if (!wbc->wb)
+	if (!wbc->wb || wbc->no_wbc_acct)
 		return;
 
 	id = mem_cgroup_css_from_page(page)->id;
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 738a0c24874f..b8f5f000cde4 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -68,6 +68,7 @@ struct writeback_control {
 	unsigned for_reclaim:1;		/* Invoked from the page allocator */
 	unsigned range_cyclic:1;	/* range_start is cyclic */
 	unsigned for_sync:1;		/* sync(2) WB_SYNC_ALL writeback */
+	unsigned no_wbc_acct:1;		/* skip wbc IO accounting */
 #ifdef CONFIG_CGROUP_WRITEBACK
 	struct bdi_writeback *wb;	/* wb this writeback is issued under */
 	struct inode *inode;		/* inode being written out */

From patchwork Sat Jun 15 18:24:47 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 10997215
From: Tejun Heo
To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, jack@suse.cz
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com, Tejun Heo
Subject: [PATCH 3/9] blkcg, writeback: Implement wbc_blkcg_css()
Date: Sat, 15 Jun 2019 11:24:47 -0700
Message-Id: <20190615182453.843275-4-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190615182453.843275-1-tj@kernel.org>
References: <20190615182453.843275-1-tj@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org

Add a helper to determine the target blkcg from wbc.

Signed-off-by: Tejun Heo
Reviewed-by: Josef Bacik
Reviewed-by: Jan Kara
---
 include/linux/writeback.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index b8f5f000cde4..800ee031e88a 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 
 struct bio;
 
@@ -93,6 +94,16 @@ static inline int wbc_to_write_flags(struct writeback_control *wbc)
 	return 0;
 }
 
+static inline struct cgroup_subsys_state *
+wbc_blkcg_css(struct writeback_control *wbc)
+{
+#ifdef CONFIG_CGROUP_WRITEBACK
+	if (wbc->wb)
+		return wbc->wb->blkcg_css;
+#endif
+	return blkcg_root_css;
+}
+
 /*
  * A wb_domain represents a domain that wb's (bdi_writeback's) belong to
  * and are measured against each other in. There always is one global

From patchwork Sat Jun 15 18:24:48 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 10997219
From: Tejun Heo
To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, jack@suse.cz
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com, Tejun Heo
Subject: [PATCH 4/9] blkcg: implement REQ_CGROUP_PUNT
Date: Sat, 15 Jun 2019 11:24:48 -0700
Message-Id: <20190615182453.843275-5-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190615182453.843275-1-tj@kernel.org>
References: <20190615182453.843275-1-tj@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org

When a shared kthread needs to issue a bio for a cgroup, doing so
synchronously can lead to priority inversions as the kthread can be
trapped waiting for that cgroup. This patch implements
REQ_CGROUP_PUNT flag which makes submit_bio() punt the actual issuing
to a dedicated per-blkcg work item to avoid such priority inversions.

This will be used to fix priority inversions in btrfs compression and
should be generally useful as we grow filesystem support for
comprehensive IO control.

Signed-off-by: Tejun Heo
Reviewed-by: Josef Bacik
Cc: Chris Mason
Reviewed-by: Jan Kara
---
 block/blk-cgroup.c          | 53 +++++++++++++++++++++++++++++++++++++
 block/blk-core.c            |  3 +++
 include/linux/backing-dev.h |  1 +
 include/linux/blk-cgroup.h  | 16 ++++++++++-
 include/linux/blk_types.h   | 10 +++++++
 include/linux/writeback.h   | 12 ++++++---
 6 files changed, 91 insertions(+), 4 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 07600d3c9520..48239bb93fbe 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -53,6 +53,7 @@ static struct blkcg_policy *blkcg_policy[BLKCG_MAX_POLS];
 static LIST_HEAD(all_blkcgs);		/* protected by blkcg_pol_mutex */
 
 static bool blkcg_debug_stats = false;
+static struct workqueue_struct *blkcg_punt_bio_wq;
 
 static bool blkcg_policy_enabled(struct request_queue *q,
 				 const struct blkcg_policy *pol)
@@ -88,6 +89,8 @@ static void __blkg_release(struct rcu_head *rcu)
 
 	percpu_ref_exit(&blkg->refcnt);
 
+	WARN_ON(!bio_list_empty(&blkg->async_bios));
+
 	/* release the blkcg and parent blkg refs this blkg has been holding */
 	css_put(&blkg->blkcg->css);
 	if (blkg->parent)
@@ -113,6 +116,23 @@ static void blkg_release(struct percpu_ref *ref)
 	call_rcu(&blkg->rcu_head, __blkg_release);
 }
 
+static void blkg_async_bio_workfn(struct work_struct *work)
+{
+	struct blkcg_gq *blkg = container_of(work, struct blkcg_gq,
+					     async_bio_work);
+	struct bio_list bios = BIO_EMPTY_LIST;
+	struct bio *bio;
+
+	/* as long as there are pending bios, @blkg can't go away */
+	spin_lock_bh(&blkg->async_bio_lock);
+	bio_list_merge(&bios, &blkg->async_bios);
+	bio_list_init(&blkg->async_bios);
+	spin_unlock_bh(&blkg->async_bio_lock);
+
+	while ((bio = bio_list_pop(&bios)))
+		submit_bio(bio);
+}
+
 /**
  * blkg_alloc - allocate a blkg
  * @blkcg: block cgroup the new blkg is associated with
@@ -138,6 +158,9 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q,
 
 	blkg->q = q;
 	INIT_LIST_HEAD(&blkg->q_node);
+	spin_lock_init(&blkg->async_bio_lock);
+	bio_list_init(&blkg->async_bios);
+	INIT_WORK(&blkg->async_bio_work, blkg_async_bio_workfn);
 	blkg->blkcg = blkcg;
 
 	for (i = 0; i < BLKCG_MAX_POLS; i++) {
@@ -1583,6 +1606,25 @@ void blkcg_policy_unregister(struct blkcg_policy *pol)
 }
 EXPORT_SYMBOL_GPL(blkcg_policy_unregister);
 
+bool __blkcg_punt_bio_submit(struct bio *bio)
+{
+	struct blkcg_gq *blkg = bio->bi_blkg;
+
+	/* consume the flag first */
+	bio->bi_opf &= ~REQ_CGROUP_PUNT;
+
+	/* never bounce for the root cgroup */
+	if (!blkg->parent)
+		return false;
+
+	spin_lock_bh(&blkg->async_bio_lock);
+	bio_list_add(&blkg->async_bios, bio);
+	spin_unlock_bh(&blkg->async_bio_lock);
+
+	queue_work(blkcg_punt_bio_wq, &blkg->async_bio_work);
+	return true;
+}
+
 /*
  * Scale the accumulated delay based on how long it has been since we updated
  * the delay. We only call this when we are adding delay, in case it's been a
@@ -1783,5 +1825,16 @@ void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta)
 	atomic64_add(delta, &blkg->delay_nsec);
 }
 
+static int __init blkcg_init(void)
+{
+	blkcg_punt_bio_wq = alloc_workqueue("blkcg_punt_bio",
+					    WQ_MEM_RECLAIM | WQ_FREEZABLE |
+					    WQ_UNBOUND | WQ_SYSFS, 0);
+	if (!blkcg_punt_bio_wq)
+		return -ENOMEM;
+	return 0;
+}
+subsys_initcall(blkcg_init);
+
 module_param(blkcg_debug_stats, bool, 0644);
 MODULE_PARM_DESC(blkcg_debug_stats, "True if you want debug stats, false if not");
diff --git a/block/blk-core.c b/block/blk-core.c
index a55389ba8779..5879c1ec044d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1165,6 +1165,9 @@ EXPORT_SYMBOL_GPL(direct_make_request);
  */
 blk_qc_t submit_bio(struct bio *bio)
 {
+	if (blkcg_punt_bio_submit(bio))
+		return BLK_QC_T_NONE;
+
 	/*
 	 * If it's a regular read/write or a barrier with data attached,
 	 * go through the normal accounting stuff before submission.
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index f9b029180241..35b31d176f74 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -48,6 +48,7 @@ extern spinlock_t bdi_lock;
 extern struct list_head bdi_list;
 
 extern struct workqueue_struct *bdi_wq;
+extern struct workqueue_struct *bdi_async_bio_wq;
 
 static inline bool wb_has_dirty_io(struct bdi_writeback *wb)
 {
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 76c61318fda5..ffb2f88e87c6 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -134,13 +134,17 @@ struct blkcg_gq {
 
 	struct blkg_policy_data		*pd[BLKCG_MAX_POLS];
 
-	struct rcu_head			rcu_head;
+	spinlock_t			async_bio_lock;
+	struct bio_list			async_bios;
+	struct work_struct		async_bio_work;
 
 	atomic_t			use_delay;
 	atomic64_t			delay_nsec;
 	atomic64_t			delay_start;
 	u64				last_delay;
 	int				last_use;
+
+	struct rcu_head			rcu_head;
 };
 
 typedef struct blkcg_policy_data *(blkcg_pol_alloc_cpd_fn)(gfp_t gfp);
@@ -763,6 +767,15 @@ static inline bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg
 				  struct bio *bio) { return false; }
 #endif
 
+bool __blkcg_punt_bio_submit(struct bio *bio);
+
+static inline bool blkcg_punt_bio_submit(struct bio *bio)
+{
+	if (bio->bi_opf & REQ_CGROUP_PUNT)
+		return __blkcg_punt_bio_submit(bio);
+	else
+		return false;
+}
 
 static inline void blkcg_bio_issue_init(struct bio *bio)
 {
@@ -910,6 +923,7 @@ static inline char *blkg_path(struct blkcg_gq *blkg) { return NULL; }
 static inline void blkg_get(struct blkcg_gq *blkg) { }
 static inline void blkg_put(struct blkcg_gq *blkg) { }
 
+static inline bool blkcg_punt_bio_submit(struct bio *bio) { return false; }
 static inline void blkcg_bio_issue_init(struct bio *bio) { }
 static inline bool blkcg_bio_issue_check(struct request_queue *q,
 					 struct bio *bio) { return true; }
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 791fee35df88..e8b42a786315 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -321,6 +321,14 @@ enum req_flag_bits {
 	__REQ_RAHEAD,		/* read ahead, can fail anytime */
 	__REQ_BACKGROUND,	/* background IO */
 	__REQ_NOWAIT,		/* Don't wait if request will block */
+	/*
+	 * When a shared kthread needs to issue a bio for a cgroup, doing
+	 * so synchronously can lead to priority inversions as the kthread
+	 * can be trapped waiting for that cgroup. CGROUP_PUNT flag makes
+	 * submit_bio() punt the actual issuing to a dedicated per-blkcg
+	 * work item to avoid such priority inversions.
+	 */
+	__REQ_CGROUP_PUNT,
 
 	/* command specific flags for REQ_OP_WRITE_ZEROES: */
 	__REQ_NOUNMAP,		/* do not free blocks when zeroing */
@@ -347,6 +355,8 @@ enum req_flag_bits {
 #define REQ_RAHEAD		(1ULL << __REQ_RAHEAD)
 #define REQ_BACKGROUND		(1ULL << __REQ_BACKGROUND)
 #define REQ_NOWAIT		(1ULL << __REQ_NOWAIT)
+#define REQ_CGROUP_PUNT		(1ULL << __REQ_CGROUP_PUNT)
+
 #define REQ_NOUNMAP		(1ULL << __REQ_NOUNMAP)
 #define REQ_HIPRI		(1ULL << __REQ_HIPRI)
 
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 800ee031e88a..be602c42aab8 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -70,6 +70,7 @@ struct writeback_control {
 	unsigned range_cyclic:1;	/* range_start is cyclic */
 	unsigned for_sync:1;		/* sync(2) WB_SYNC_ALL writeback */
 	unsigned no_wbc_acct:1;		/* skip wbc IO accounting */
+	unsigned punt_to_cgroup:1;	/* cgrp punting, see __REQ_CGROUP_PUNT */
 #ifdef CONFIG_CGROUP_WRITEBACK
 	struct bdi_writeback *wb;	/* wb this writeback is issued under */
 	struct inode *inode;		/* inode being written out */
@@ -86,12 +87,17 @@ struct writeback_control {
 
 static inline int wbc_to_write_flags(struct writeback_control *wbc)
 {
+	int flags = 0;
+
+	if (wbc->punt_to_cgroup)
+		flags = REQ_CGROUP_PUNT;
+
 	if (wbc->sync_mode == WB_SYNC_ALL)
-		return REQ_SYNC;
+		flags |= REQ_SYNC;
 	else if (wbc->for_kupdate || wbc->for_background)
-		return REQ_BACKGROUND;
+		flags |= REQ_BACKGROUND;
 
-	return 0;
+	return flags;
 }
 
 static inline struct cgroup_subsys_state *

From patchwork Sat Jun 15 18:24:49 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 10997241
From: Tejun Heo
To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, jack@suse.cz
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 5/9] Btrfs: stop using btrfs_schedule_bio()
Date: Sat, 15 Jun 2019 11:24:49 -0700
Message-Id: <20190615182453.843275-6-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190615182453.843275-1-tj@kernel.org>
References: <20190615182453.843275-1-tj@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org

From: Chris Mason

btrfs_schedule_bio() hands IO off to a helper thread to do the actual
submit_bio() call. This has been used to make sure async crc and
compression helpers don't get stuck on IO submission. To maintain good
performance, over time the IO submission threads duplicated some IO
scheduler characteristics such as high and low priority IOs and they
also made some ugly assumptions about request allocation batch sizes.

All of this cost at least one extra context switch during IO submission, and doesn't fit well with the modern blkmq IO stack. So, this commit stops using btrfs_schedule_bio(). We may need to adjust the number of async helper threads for crcs and compression, but long term it's a better path. Signed-off-by: Chris Mason Reviewed-by: Josef Bacik --- fs/btrfs/compression.c | 8 +++--- fs/btrfs/disk-io.c | 6 ++--- fs/btrfs/inode.c | 6 ++--- fs/btrfs/volumes.c | 55 +++--------------------------------------- fs/btrfs/volumes.h | 2 +- 5 files changed, 15 insertions(+), 62 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 4ec1df369e47..873261b932b8 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -355,7 +355,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start, BUG_ON(ret); /* -ENOMEM */ } - ret = btrfs_map_bio(fs_info, bio, 0, 1); + ret = btrfs_map_bio(fs_info, bio, 0); if (ret) { bio->bi_status = ret; bio_endio(bio); @@ -385,7 +385,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start, BUG_ON(ret); /* -ENOMEM */ } - ret = btrfs_map_bio(fs_info, bio, 0, 1); + ret = btrfs_map_bio(fs_info, bio, 0); if (ret) { bio->bi_status = ret; bio_endio(bio); @@ -638,7 +638,7 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, sums += DIV_ROUND_UP(comp_bio->bi_iter.bi_size, fs_info->sectorsize); - ret = btrfs_map_bio(fs_info, comp_bio, mirror_num, 0); + ret = btrfs_map_bio(fs_info, comp_bio, mirror_num); if (ret) { comp_bio->bi_status = ret; bio_endio(comp_bio); @@ -662,7 +662,7 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, BUG_ON(ret); /* -ENOMEM */ } - ret = btrfs_map_bio(fs_info, comp_bio, mirror_num, 0); + ret = btrfs_map_bio(fs_info, comp_bio, mirror_num); if (ret) { comp_bio->bi_status = ret; bio_endio(comp_bio); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 663efce22d98..b34240406f36 100644 --- 
a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -800,7 +800,7 @@ static void run_one_async_done(struct btrfs_work *work) } ret = btrfs_map_bio(btrfs_sb(inode->i_sb), async->bio, - async->mirror_num, 1); + async->mirror_num); if (ret) { async->bio->bi_status = ret; bio_endio(async->bio); @@ -901,12 +901,12 @@ static blk_status_t btree_submit_bio_hook(struct inode *inode, struct bio *bio, BTRFS_WQ_ENDIO_METADATA); if (ret) goto out_w_error; - ret = btrfs_map_bio(fs_info, bio, mirror_num, 0); + ret = btrfs_map_bio(fs_info, bio, mirror_num); } else if (!async) { ret = btree_csum_one_bio(bio); if (ret) goto out_w_error; - ret = btrfs_map_bio(fs_info, bio, mirror_num, 0); + ret = btrfs_map_bio(fs_info, bio, mirror_num); } else { /* * kthread helpers are used to submit writes so that diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d519c3520e87..91b161fb1521 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2032,7 +2032,7 @@ static blk_status_t btrfs_submit_bio_hook(struct inode *inode, struct bio *bio, } mapit: - ret = btrfs_map_bio(fs_info, bio, mirror_num, 0); + ret = btrfs_map_bio(fs_info, bio, mirror_num); out: if (ret) { @@ -7764,7 +7764,7 @@ static inline blk_status_t submit_dio_repair_bio(struct inode *inode, if (ret) return ret; - ret = btrfs_map_bio(fs_info, bio, mirror_num, 0); + ret = btrfs_map_bio(fs_info, bio, mirror_num); return ret; } @@ -8295,7 +8295,7 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, goto err; } map: - ret = btrfs_map_bio(fs_info, bio, 0, 0); + ret = btrfs_map_bio(fs_info, bio, 0); err: return ret; } diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 1c2a6e4b39da..72326cc23985 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6386,52 +6386,8 @@ static void btrfs_end_bio(struct bio *bio) } } -/* - * see run_scheduled_bios for a description of why bios are collected for - * async submit. 
- * - * This will add one bio to the pending list for a device and make sure - * the work struct is scheduled. - */ -static noinline void btrfs_schedule_bio(struct btrfs_device *device, - struct bio *bio) -{ - struct btrfs_fs_info *fs_info = device->fs_info; - int should_queue = 1; - struct btrfs_pending_bios *pending_bios; - - /* don't bother with additional async steps for reads, right now */ - if (bio_op(bio) == REQ_OP_READ) { - btrfsic_submit_bio(bio); - return; - } - - WARN_ON(bio->bi_next); - bio->bi_next = NULL; - - spin_lock(&device->io_lock); - if (op_is_sync(bio->bi_opf)) - pending_bios = &device->pending_sync_bios; - else - pending_bios = &device->pending_bios; - - if (pending_bios->tail) - pending_bios->tail->bi_next = bio; - - pending_bios->tail = bio; - if (!pending_bios->head) - pending_bios->head = bio; - if (device->running_pending) - should_queue = 0; - - spin_unlock(&device->io_lock); - - if (should_queue) - btrfs_queue_work(fs_info->submit_workers, &device->work); -} - static void submit_stripe_bio(struct btrfs_bio *bbio, struct bio *bio, - u64 physical, int dev_nr, int async) + u64 physical, int dev_nr) { struct btrfs_device *dev = bbio->stripes[dev_nr].dev; struct btrfs_fs_info *fs_info = bbio->fs_info; @@ -6449,10 +6405,7 @@ static void submit_stripe_bio(struct btrfs_bio *bbio, struct bio *bio, btrfs_bio_counter_inc_noblocked(fs_info); - if (async) - btrfs_schedule_bio(dev, bio); - else - btrfsic_submit_bio(bio); + btrfsic_submit_bio(bio); } static void bbio_error(struct btrfs_bio *bbio, struct bio *bio, u64 logical) @@ -6473,7 +6426,7 @@ static void bbio_error(struct btrfs_bio *bbio, struct bio *bio, u64 logical) } blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio, - int mirror_num, int async_submit) + int mirror_num) { struct btrfs_device *dev; struct bio *first_bio = bio; @@ -6542,7 +6495,7 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio, bio = first_bio; submit_stripe_bio(bbio, bio, 
bbio->stripes[dev_nr].physical, - dev_nr, async_submit); + dev_nr); } btrfs_bio_counter_dec(fs_info); return BLK_STS_OK; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index b8a0e8d0672d..8c7bd79b234a 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -415,7 +415,7 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans, u64 type); void btrfs_mapping_init(struct btrfs_mapping_tree *tree); void btrfs_mapping_tree_free(struct btrfs_mapping_tree *tree); blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio, - int mirror_num, int async_submit); + int mirror_num); int btrfs_open_devices(struct btrfs_fs_devices *fs_devices, fmode_t flags, void *holder); struct btrfs_device *btrfs_scan_one_device(const char *path, From patchwork Sat Jun 15 18:24:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejun Heo X-Patchwork-Id: 10997237 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 727A414B6 for ; Sat, 15 Jun 2019 18:25:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6691F284D4 for ; Sat, 15 Jun 2019 18:25:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5B037285E3; Sat, 15 Jun 2019 18:25:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A5539284D4 for ; Sat, 15 Jun 2019 18:25:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727156AbfFOSZV 
(ORCPT ); Sat, 15 Jun 2019 14:25:21 -0400 Received: from mail-qt1-f193.google.com ([209.85.160.193]:33632 "EHLO mail-qt1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727141AbfFOSZU (ORCPT ); Sat, 15 Jun 2019 14:25:20 -0400 Received: by mail-qt1-f193.google.com with SMTP id x2so6389082qtr.0; Sat, 15 Jun 2019 11:25:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=1X3FWS80K1JJNMQCA2qoi8Ha3uMJZ3PHL5FlErhUkgg=; b=W5jXqn75isWClTobNCs6iOsCmrWOO2OVjTjjKtX/6Di8BIiuclxLmeP11JNsJVWfGj 8/Eem3yHPwJOv8rOjpTPTAiN0Y8aKMCVD3/IiRVhR9TkoujW2WHiMM/DdeODhQVXham7 cewl5BtiFF/GTRKx44D6E0apIHg5dA+D9CwCONOkNxTSA11zblml+QCf13U2nf94iSNp AtEmHzvfuNzs98Hw0wLaypGDA54YOBmqjTbSJNbiOhUgVkhSudCct2ar5/uIU09R0JyN C8/PZ3DA2VZ2WtlO+Fssl5VN5ETXu26367Qh6WuxLjwtwp33mk8TXQiQctGztqwrwpGK p6JA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=1X3FWS80K1JJNMQCA2qoi8Ha3uMJZ3PHL5FlErhUkgg=; b=K9/83Mq+5HH1TornZTtB3Mc6iAMMwZSiqrI7RRtOE/AFYp79fVtYvdhn/PdKJLZ9qV 2JoL1DtT6oiS7nANr5Xvu0NkqsLktmoVNQX2o7Ri8sLCiv3pJI1Xj+qpU3dbe3cyuy/0 yc+mvfzZYYJ8L7bO0GQb+JQWf0/VFGoGXQctyErIqmCaRRufTxD7kxjkcf1RIb/3fskv qs76jyLcaoLfNG3Whs1yuqlAfI3DDD+clz9zapY+dl+M80n+1eY4DafwVcTyz+MVz6sn 6F1PLQBD2GIxkvmfo3FrYSgKR6tlaQomBAFtewaoh8dzYydf3sGP4eyjWwK6wrjwmck8 9DrQ== X-Gm-Message-State: APjAAAV4AOeaY6s3H6YFHGOv3qp29cw68MxY9H8q+hjRdxF2iADBvW/6 HJQy2ILBVvKft2jSyfOrq9k= X-Google-Smtp-Source: APXvYqwbk1Oa31Z5DL6a2mtq8MwGt/kbQeQGXY5aOtFWYWv/DxcGr0acqLs3Xu9/ppWEO5pAs+lH5w== X-Received: by 2002:a0c:b0e4:: with SMTP id p33mr3955600qvc.208.1560623118761; Sat, 15 Jun 2019 11:25:18 -0700 (PDT) Received: from localhost ([2620:10d:c091:480::4883]) by smtp.gmail.com with ESMTPSA id o185sm3280557qkd.64.2019.06.15.11.25.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 
bits=128/128); Sat, 15 Jun 2019 11:25:18 -0700 (PDT) From: Tejun Heo To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, jack@suse.cz Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 6/9] Btrfs: delete the entire async bio submission framework Date: Sat, 15 Jun 2019 11:24:50 -0700 Message-Id: <20190615182453.843275-7-tj@kernel.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190615182453.843275-1-tj@kernel.org> References: <20190615182453.843275-1-tj@kernel.org> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Chris Mason Now that we're not using btrfs_schedule_bio() anymore, delete all the code that supported it. Signed-off-by: Chris Mason Reviewed-by: Josef Bacik --- fs/btrfs/ctree.h | 1 - fs/btrfs/disk-io.c | 13 +-- fs/btrfs/super.c | 1 - fs/btrfs/volumes.c | 209 --------------------------------------------- fs/btrfs/volumes.h | 8 -- 5 files changed, 1 insertion(+), 231 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index b81c331b28fa..2a5ba0f85ed3 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -989,7 +989,6 @@ struct btrfs_fs_info { struct btrfs_workqueue *endio_meta_write_workers; struct btrfs_workqueue *endio_write_workers; struct btrfs_workqueue *endio_freespace_worker; - struct btrfs_workqueue *submit_workers; struct btrfs_workqueue *caching_workers; struct btrfs_workqueue *readahead_workers; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b34240406f36..9dbe4ba3995d 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2028,7 +2028,6 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info) btrfs_destroy_workqueue(fs_info->rmw_workers); btrfs_destroy_workqueue(fs_info->endio_write_workers); btrfs_destroy_workqueue(fs_info->endio_freespace_worker); - btrfs_destroy_workqueue(fs_info->submit_workers); 
btrfs_destroy_workqueue(fs_info->delayed_workers); btrfs_destroy_workqueue(fs_info->caching_workers); btrfs_destroy_workqueue(fs_info->readahead_workers); @@ -2194,16 +2193,6 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info, fs_info->caching_workers = btrfs_alloc_workqueue(fs_info, "cache", flags, max_active, 0); - /* - * a higher idle thresh on the submit workers makes it much more - * likely that bios will be send down in a sane order to the - * devices - */ - fs_info->submit_workers = - btrfs_alloc_workqueue(fs_info, "submit", flags, - min_t(u64, fs_devices->num_devices, - max_active), 64); - fs_info->fixup_workers = btrfs_alloc_workqueue(fs_info, "fixup", flags, 1, 0); @@ -2246,7 +2235,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info, max_active), 8); if (!(fs_info->workers && fs_info->delalloc_workers && - fs_info->submit_workers && fs_info->flush_workers && + fs_info->flush_workers && fs_info->endio_workers && fs_info->endio_meta_workers && fs_info->endio_meta_write_workers && fs_info->endio_repair_workers && diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 2c66d9ea6a3b..3fb86a7bfdf7 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -1668,7 +1668,6 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info, btrfs_workqueue_set_max(fs_info->workers, new_pool_size); btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size); - btrfs_workqueue_set_max(fs_info->submit_workers, new_pool_size); btrfs_workqueue_set_max(fs_info->caching_workers, new_pool_size); btrfs_workqueue_set_max(fs_info->endio_workers, new_pool_size); btrfs_workqueue_set_max(fs_info->endio_meta_workers, new_pool_size); diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 72326cc23985..fc3a16d87869 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -509,212 +509,6 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder, return ret; } -static void requeue_list(struct btrfs_pending_bios 
*pending_bios, - struct bio *head, struct bio *tail) -{ - - struct bio *old_head; - - old_head = pending_bios->head; - pending_bios->head = head; - if (pending_bios->tail) - tail->bi_next = old_head; - else - pending_bios->tail = tail; -} - -/* - * we try to collect pending bios for a device so we don't get a large - * number of procs sending bios down to the same device. This greatly - * improves the schedulers ability to collect and merge the bios. - * - * But, it also turns into a long list of bios to process and that is sure - * to eventually make the worker thread block. The solution here is to - * make some progress and then put this work struct back at the end of - * the list if the block device is congested. This way, multiple devices - * can make progress from a single worker thread. - */ -static noinline void run_scheduled_bios(struct btrfs_device *device) -{ - struct btrfs_fs_info *fs_info = device->fs_info; - struct bio *pending; - struct backing_dev_info *bdi; - struct btrfs_pending_bios *pending_bios; - struct bio *tail; - struct bio *cur; - int again = 0; - unsigned long num_run; - unsigned long batch_run = 0; - unsigned long last_waited = 0; - int force_reg = 0; - int sync_pending = 0; - struct blk_plug plug; - - /* - * this function runs all the bios we've collected for - * a particular device. We don't want to wander off to - * another device without first sending all of these down. - * So, setup a plug here and finish it off before we return - */ - blk_start_plug(&plug); - - bdi = device->bdev->bd_bdi; - -loop: - spin_lock(&device->io_lock); - -loop_lock: - num_run = 0; - - /* take all the bios off the list at once and process them - * later on (without the lock held). 
But, remember the - * tail and other pointers so the bios can be properly reinserted - * into the list if we hit congestion - */ - if (!force_reg && device->pending_sync_bios.head) { - pending_bios = &device->pending_sync_bios; - force_reg = 1; - } else { - pending_bios = &device->pending_bios; - force_reg = 0; - } - - pending = pending_bios->head; - tail = pending_bios->tail; - WARN_ON(pending && !tail); - - /* - * if pending was null this time around, no bios need processing - * at all and we can stop. Otherwise it'll loop back up again - * and do an additional check so no bios are missed. - * - * device->running_pending is used to synchronize with the - * schedule_bio code. - */ - if (device->pending_sync_bios.head == NULL && - device->pending_bios.head == NULL) { - again = 0; - device->running_pending = 0; - } else { - again = 1; - device->running_pending = 1; - } - - pending_bios->head = NULL; - pending_bios->tail = NULL; - - spin_unlock(&device->io_lock); - - while (pending) { - - rmb(); - /* we want to work on both lists, but do more bios on the - * sync list than the regular list - */ - if ((num_run > 32 && - pending_bios != &device->pending_sync_bios && - device->pending_sync_bios.head) || - (num_run > 64 && pending_bios == &device->pending_sync_bios && - device->pending_bios.head)) { - spin_lock(&device->io_lock); - requeue_list(pending_bios, pending, tail); - goto loop_lock; - } - - cur = pending; - pending = pending->bi_next; - cur->bi_next = NULL; - - BUG_ON(atomic_read(&cur->__bi_cnt) == 0); - - /* - * if we're doing the sync list, record that our - * plug has some sync requests on it - * - * If we're doing the regular list and there are - * sync requests sitting around, unplug before - * we add more - */ - if (pending_bios == &device->pending_sync_bios) { - sync_pending = 1; - } else if (sync_pending) { - blk_finish_plug(&plug); - blk_start_plug(&plug); - sync_pending = 0; - } - - btrfsic_submit_bio(cur); - num_run++; - batch_run++; - - 
cond_resched(); - - /* - * we made progress, there is more work to do and the bdi - * is now congested. Back off and let other work structs - * run instead - */ - if (pending && bdi_write_congested(bdi) && batch_run > 8 && - fs_info->fs_devices->open_devices > 1) { - struct io_context *ioc; - - ioc = current->io_context; - - /* - * the main goal here is that we don't want to - * block if we're going to be able to submit - * more requests without blocking. - * - * This code does two great things, it pokes into - * the elevator code from a filesystem _and_ - * it makes assumptions about how batching works. - */ - if (ioc && ioc->nr_batch_requests > 0 && - time_before(jiffies, ioc->last_waited + HZ/50UL) && - (last_waited == 0 || - ioc->last_waited == last_waited)) { - /* - * we want to go through our batch of - * requests and stop. So, we copy out - * the ioc->last_waited time and test - * against it before looping - */ - last_waited = ioc->last_waited; - cond_resched(); - continue; - } - spin_lock(&device->io_lock); - requeue_list(pending_bios, pending, tail); - device->running_pending = 1; - - spin_unlock(&device->io_lock); - btrfs_queue_work(fs_info->submit_workers, - &device->work); - goto done; - } - } - - cond_resched(); - if (again) - goto loop; - - spin_lock(&device->io_lock); - if (device->pending_bios.head || device->pending_sync_bios.head) - goto loop_lock; - spin_unlock(&device->io_lock); - -done: - blk_finish_plug(&plug); -} - -static void pending_bios_fn(struct btrfs_work *work) -{ - struct btrfs_device *device; - - device = container_of(work, struct btrfs_device, work); - run_scheduled_bios(device); -} - static bool device_path_matched(const char *path, struct btrfs_device *device) { int found; @@ -6599,9 +6393,6 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info, else generate_random_uuid(dev->uuid); - btrfs_init_work(&dev->work, btrfs_submit_helper, - pending_bios_fn, NULL, NULL); - return dev; } diff --git a/fs/btrfs/volumes.h 
b/fs/btrfs/volumes.h index 8c7bd79b234a..231f50dd107d 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -18,10 +18,6 @@ extern struct mutex uuid_mutex; #define BTRFS_STRIPE_LEN SZ_64K struct buffer_head; -struct btrfs_pending_bios { - struct bio *head; - struct bio *tail; -}; /* * Use sequence counter to get consistent device stat data on @@ -55,10 +51,6 @@ struct btrfs_device { spinlock_t io_lock ____cacheline_aligned; int running_pending; - /* regular prio bios */ - struct btrfs_pending_bios pending_bios; - /* sync bios */ - struct btrfs_pending_bios pending_sync_bios; struct block_device *bdev; From patchwork Sat Jun 15 18:24:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejun Heo X-Patchwork-Id: 10997223 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6E4821398 for ; Sat, 15 Jun 2019 18:25:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5FA0E284D4 for ; Sat, 15 Jun 2019 18:25:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5273128571; Sat, 15 Jun 2019 18:25:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AA645284D4 for ; Sat, 15 Jun 2019 18:25:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727141AbfFOSZ0 (ORCPT ); Sat, 15 Jun 2019 14:25:26 -0400 Received: from mail-qt1-f193.google.com ([209.85.160.193]:39803 "EHLO mail-qt1-f193.google.com" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727107AbfFOSZX (ORCPT ); Sat, 15 Jun 2019 14:25:23 -0400 Received: by mail-qt1-f193.google.com with SMTP id i34so1063147qta.6; Sat, 15 Jun 2019 11:25:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=ZC/QplMmZKlhTk34cptDJfM3LixD1OQPShT0TI5blRI=; b=dy6vM8sacEsVKHgqm9o3Hf74CXTw/JmtIsx4iaOAz8VtSZ1xNbtnTy/BLuii1FZFsI C0zWUuZPAPiS2zYNJ5wE67P2kSN/ybgfBDkBGU+Ehv4fohyNIkKPkVh2SZKGTYVX4jfb B3QOZdDqXJ7SaraKWigt6+YLpiXVe+3ZyiN/mn8w4LuMekWnB6vcFM+8SO/puG1e5OV2 Wn2oj3Sp38dYEqtBRdNol7RdZq27ruqQw6Rp8lBO8Xe0BzkUhH2weBzU64le6YvQgnbA ZXLpV2eEe/8PL2M7+DVmcnK5xrTLHCpYuzJFsQtQkVPggEKWI9d5ORrR5DkhBeSGzUPs 1Iww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=ZC/QplMmZKlhTk34cptDJfM3LixD1OQPShT0TI5blRI=; b=oRmHzjIdjWoHejjm+5+bhGzxRb0Fkn9hDEb47DPQZUDLUWTGlIV3xWBPszjSwtbeDA L+aDLqqMoVvDqGZMoyvzJYYSHmetGWR+QGRlO4gIcu30cVgMRlypxWYW96FacZ2odcO/ a/ywwl5n+JYH1Eo3ix0HzLML8FWp4UB+SvaOfKQ46bdkyNGUhYm2aM9fzFzOMb2ZijND kQFGlsi+PwmQv9aB1TP5EoP3sbgSUhNOwWd9HSJa05bRdt2HaWwruGo91W6KOUNztgEw dIXs1Q3mQeUr7HdMAO4actyRTrZGw4cqkLexa6DuYg3psK4OTchzrZKA5EDeMEyiN0ZC lqjg== X-Gm-Message-State: APjAAAXWkzf5TS50DLA0iwHj95yJILB8RkS7MW4KOAf11lXHvcgXE+Gl MbNczAe8bqlGNYcbz04vERo= X-Google-Smtp-Source: APXvYqwYrF+koTSSYf6Q66+JJU6yLxXQHYfjvr4NDkr/5pnPDuMf2jisO0lATDcqDm0Cjl+tjXDewg== X-Received: by 2002:ac8:1a3c:: with SMTP id v57mr86229634qtj.339.1560623122003; Sat, 15 Jun 2019 11:25:22 -0700 (PDT) Received: from localhost ([2620:10d:c091:480::4883]) by smtp.gmail.com with ESMTPSA id f189sm3770142qkj.13.2019.06.15.11.25.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 15 Jun 2019 11:25:21 -0700 (PDT) From: Tejun Heo To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, 
jack@suse.cz Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 7/9] Btrfs: only associate the locked page with one async_cow struct Date: Sat, 15 Jun 2019 11:24:51 -0700 Message-Id: <20190615182453.843275-8-tj@kernel.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190615182453.843275-1-tj@kernel.org> References: <20190615182453.843275-1-tj@kernel.org> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Chris Mason The btrfs writepages function collects a large range of pages flagged for delayed allocation, and then sends them down through the COW code for processing. When compression is on, we allocate one async_cow structure for every 512K, and then run those pages through the compression code for IO submission. writepages starts all of this off with a single page, locked by the original call to extent_write_cache_pages(), and it's important to keep track of this page because it has already been through clear_page_dirty_for_io(). The btrfs async_cow struct has a pointer to the locked_page, and when we're redirtying the page because compression had to fall back to uncompressed IO, we use page->index to decide if a given async_cow struct really owns that page. But, this is racy. If a given delalloc range is broken up into two async_cows (cowA and cowB), we can end up with something like this: compress_file_range(cowA) submit_compressed_extents(cowA) submit compressed bios(cowA) put_page(locked_page) compress_file_range(cowB) ... The end result is that cowA is completed and cleaned up before cowB even starts processing. This means we can free locked_page and reuse it elsewhere. If we get really lucky, it'll have the same page->index in its new home as it did before.
While we're processing cowB, we might decide we need to fall back to uncompressed IO, and so compress_file_range() will call __set_page_dirty_nobuffers() on cowB->locked_page. Without cgroups in use, this creates a phantom dirty page, which isn't great but isn't the end of the world. With cgroups in use, we might crash in the accounting code because page->mapping->i_wb isn't set. [ 8308.523110] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0 [ 8308.531084] IP: percpu_counter_add_batch+0x11/0x70 [ 8308.538371] PGD 66534e067 P4D 66534e067 PUD 66534f067 PMD 0 [ 8308.541750] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 8308.551948] CPU: 16 PID: 2172 Comm: rm Not tainted [ 8308.566883] RIP: 0010:percpu_counter_add_batch+0x11/0x70 [ 8308.567891] RSP: 0018:ffffc9000a97bbe0 EFLAGS: 00010286 [ 8308.568986] RAX: 0000000000000005 RBX: 0000000000000090 RCX: 0000000000026115 [ 8308.570734] RDX: 0000000000000030 RSI: ffffffffffffffff RDI: 0000000000000090 [ 8308.572543] RBP: 0000000000000000 R08: fffffffffffffff5 R09: 0000000000000000 [ 8308.573856] R10: 00000000000260c0 R11: ffff881037fc26c0 R12: ffffffffffffffff [ 8308.580099] R13: ffff880fe4111548 R14: ffffc9000a97bc90 R15: 0000000000000001 [ 8308.582520] FS: 00007f5503ced480(0000) GS:ffff880ff7200000(0000) knlGS:0000000000000000 [ 8308.585440] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8308.587951] CR2: 00000000000000d0 CR3: 00000001e0459005 CR4: 0000000000360ee0 [ 8308.590707] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8308.592865] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8308.594469] Call Trace: [ 8308.595149] account_page_cleaned+0x15b/0x1f0 [ 8308.596340] __cancel_dirty_page+0x146/0x200 [ 8308.599395] truncate_cleanup_page+0x92/0xb0 [ 8308.600480] truncate_inode_pages_range+0x202/0x7d0 [ 8308.617392] btrfs_evict_inode+0x92/0x5a0 [ 8308.619108] evict+0xc1/0x190 [ 8308.620023] do_unlinkat+0x176/0x280 [ 8308.621202] do_syscall_64+0x63/0x1a0 [
8308.623451] entry_SYSCALL_64_after_hwframe+0x42/0xb7 The fix here is to make async_cow->locked_page NULL everywhere but in the one async_cow struct that's allowed to do things to the locked page. Signed-off-by: Chris Mason Fixes: 771ed689d2cd ("Btrfs: Optimize compressed writeback and reads") Reviewed-by: Josef Bacik --- fs/btrfs/extent_io.c | 2 +- fs/btrfs/inode.c | 25 +++++++++++++++++++++---- 2 files changed, 22 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 13fca7bfc1f2..9f223d7d78c0 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1838,7 +1838,7 @@ static int __process_pages_contig(struct address_space *mapping, if (page_ops & PAGE_SET_PRIVATE2) SetPagePrivate2(pages[i]); - if (pages[i] == locked_page) { + if (locked_page && pages[i] == locked_page) { put_page(pages[i]); pages_locked++; continue; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 91b161fb1521..df5527cc07b9 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -666,10 +666,12 @@ static noinline void compress_file_range(struct async_chunk *async_chunk, * to our extent and set things up for the async work queue to run * cow_file_range to do the normal delalloc dance.
*/ - if (page_offset(async_chunk->locked_page) >= start && - page_offset(async_chunk->locked_page) <= end) + if (async_chunk->locked_page && + (page_offset(async_chunk->locked_page) >= start && + page_offset(async_chunk->locked_page) <= end)) { __set_page_dirty_nobuffers(async_chunk->locked_page); /* unlocked later on in the async handlers */ + } if (redirty) extent_range_redirty_for_io(inode, start, end); @@ -759,7 +761,7 @@ static noinline void submit_compressed_extents(struct async_chunk *async_chunk) async_extent->start + async_extent->ram_size - 1, WB_SYNC_ALL); - else if (ret) + else if (ret && async_chunk->locked_page) unlock_page(async_chunk->locked_page); kfree(async_extent); cond_resched(); @@ -1236,10 +1238,25 @@ static int cow_file_range_async(struct inode *inode, struct page *locked_page, async_chunk[i].inode = inode; async_chunk[i].start = start; async_chunk[i].end = cur_end; - async_chunk[i].locked_page = locked_page; async_chunk[i].write_flags = write_flags; INIT_LIST_HEAD(&async_chunk[i].extents); + /* + * The locked_page comes all the way from writepage and it's + * the original page we were actually given. As we spread + * this large delalloc region across multiple async_cow + * structs, only the first struct needs a pointer to locked_page. + * + * This way we don't need racy decisions about who is supposed + * to unlock it.
+ */ + if (locked_page) { + async_chunk[i].locked_page = locked_page; + locked_page = NULL; + } else { + async_chunk[i].locked_page = NULL; + } + btrfs_init_work(&async_chunk[i].work, btrfs_delalloc_helper, async_cow_start, async_cow_submit, From patchwork Sat Jun 15 18:24:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejun Heo X-Patchwork-Id: 10997231 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3CE6B14B6 for ; Sat, 15 Jun 2019 18:25:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 311C9284D4 for ; Sat, 15 Jun 2019 18:25:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 256B9285E3; Sat, 15 Jun 2019 18:25:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8BCEE284D4 for ; Sat, 15 Jun 2019 18:25:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727213AbfFOSZ2 (ORCPT ); Sat, 15 Jun 2019 14:25:28 -0400 Received: from mail-qt1-f194.google.com ([209.85.160.194]:45763 "EHLO mail-qt1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727195AbfFOSZ1 (ORCPT ); Sat, 15 Jun 2019 14:25:27 -0400 Received: by mail-qt1-f194.google.com with SMTP id j19so6302553qtr.12; Sat, 15 Jun 2019 11:25:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; 
	bh=pLgqiVc8WgttgL57a6zlFC+HBkK0RO6jy2y58MUbUT4=;
	b=UZBAdIST64h9/L93v+Uv8msKsSVnKhFMSQHg/22JuYK5k3DLY/0ykkOzXlAYktzoyA
	 v6yMG4HP/Cg7E6aVGsWeRShi/BT7SoUaeb7NE1z+nfXwf6iY5Kd3fhuRKn+4W9Dx/CJr
	 ywfu61mF/aCeW0pTFl6k1hnRd+n/kvPPIioJ0R+45ggMDYfN4ykJFUbS5LGrRaIEbp8s
	 97d+QR7U3waeG2YNS8sAwUHXRoZ92E3X8Iwh2J6FUKA9SVOquDLHHow7yeT0gWKps6+h
	 1Rc17No2J6q6G9Q/iXZgyoGNeJoEGHpNYvBda3xjLKF6252KROvKW7ksXAwZbA4Qp8UT
	 51Cg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
	h=x-gm-message-state:sender:from:to:cc:subject:date:message-id
	 :in-reply-to:references;
	bh=pLgqiVc8WgttgL57a6zlFC+HBkK0RO6jy2y58MUbUT4=;
	b=Mx9S4t1i+3N/jjMbYWjrxTqwNC58MLeOyZNFJNzdUkJIWcPdgL0/sKg7Gb97cUln6c
	 2tJKgSwviYK/tCYoVisfGLS671fqd+Tz0E7cLc5w9L7NWN78ey8OXcyfsXoRUKLUXYQS
	 sHea6uapU/+XL+NTfQKp90HmBcswWjrvAWkIFglWLcL8XNnA2P44Uxvu8q/HFNYdfye6
	 Q7sPdmHzhmb+mmCJ/7ehjSp057829y/NA+PMKJGMVrHcfKi+/sj+1B+NOr62yJVtQmt9
	 jcpwL4LK5v3pouFebMnkRQ+qI98x1mO2QD3WEiZLeGXaY+6OdoREkC/qSdoUuJPozfsk
	 QYnw==
X-Gm-Message-State: APjAAAWNGvg9XiFZY9TWgfx2Gpe5a5/z8wjJbBM+MEASbiW5Qzagf58C
	HVwbOG3AtX/cREtPdtdg9L4=
X-Google-Smtp-Source: APXvYqwNb+FCgzHWpIHkiWYcW19s6mZ0HR17PRI49sbK0dH0AnCBjMzU7HFOWMMcPTe0ecfsAFSaxQ==
X-Received: by 2002:a0c:fb07:: with SMTP id c7mr14426266qvp.229.1560623125887;
	Sat, 15 Jun 2019 11:25:25 -0700 (PDT)
Received: from localhost ([2620:10d:c091:480::4883])
	by smtp.gmail.com with ESMTPSA id j66sm3749897qkf.86.2019.06.15.11.25.24
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Sat, 15 Jun 2019 11:25:24 -0700 (PDT)
From: Tejun Heo
To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, jack@suse.cz
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 8/9] Btrfs: use REQ_CGROUP_PUNT for worker thread submitted bios
Date: Sat, 15 Jun 2019 11:24:52 -0700
Message-Id: <20190615182453.843275-9-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190615182453.843275-1-tj@kernel.org>
References: <20190615182453.843275-1-tj@kernel.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Chris Mason

Async CRCs and compression submit IO through helper threads, which means
they have IO priority inversions when cgroup IO controllers are in use.

This flags all of the writes submitted by btrfs helper threads as
REQ_CGROUP_PUNT. submit_bio() will punt these to dedicated per-blkcg
work items to avoid the priority inversion.

For the compression code, we take a reference on the wbc's blkg css and
pass it down to the async workers.

For the async CRCs, the bio already has the correct css; we just need to
tell the block layer to use REQ_CGROUP_PUNT.

Signed-off-by: Chris Mason
Modified-and-reviewed-by: Tejun Heo
Reviewed-by: Josef Bacik
---
 fs/btrfs/compression.c |  8 +++++++-
 fs/btrfs/compression.h |  3 ++-
 fs/btrfs/disk-io.c     |  6 ++++++
 fs/btrfs/extent_io.c   |  3 +++
 fs/btrfs/inode.c       | 30 +++++++++++++++++++++++++++---
 5 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 873261b932b8..138479a9576c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -289,7 +289,8 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 				 unsigned long compressed_len,
 				 struct page **compressed_pages,
 				 unsigned long nr_pages,
-				 unsigned int write_flags)
+				 unsigned int write_flags,
+				 struct cgroup_subsys_state *blkcg_css)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct bio *bio = NULL;
@@ -323,6 +324,11 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 	bio->bi_opf = REQ_OP_WRITE | write_flags;
 	bio->bi_private = cb;
 	bio->bi_end_io = end_compressed_bio_write;
+
+	if (blkcg_css) {
+		bio->bi_opf |= REQ_CGROUP_PUNT;
+		bio_associate_blkg_from_css(bio, blkcg_css);
+	}
 	refcount_set(&cb->pending_bios, 1);
 
 	/* create and submit bios for the compressed pages */
diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 9976fe0f7526..7cbefab96ecf 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -93,7 +93,8 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 				  unsigned long compressed_len,
 				  struct page **compressed_pages,
 				  unsigned long nr_pages,
-				  unsigned int write_flags);
+				  unsigned int write_flags,
+				  struct cgroup_subsys_state *blkcg_css);
 blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 				 int mirror_num, unsigned long bio_flags);
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9dbe4ba3995d..a5ebbf3d0833 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -799,6 +799,12 @@ static void run_one_async_done(struct btrfs_work *work)
 		return;
 	}
 
+	/*
+	 * All of the bios that pass through here are from async helpers.
+	 * Use REQ_CGROUP_PUNT to issue them from the owning cgroup's
+	 * context. This changes nothing when cgroups aren't in use.
+	 */
+	async->bio->bi_opf |= REQ_CGROUP_PUNT;
 	ret = btrfs_map_bio(btrfs_sb(inode->i_sb), async->bio,
 			    async->mirror_num);
 	if (ret) {
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 9f223d7d78c0..d7b57341ff1a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4175,6 +4175,9 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 		.nr_to_write = nr_pages * 2,
 		.range_start = start,
 		.range_end = end + 1,
+		/* we're called from an async helper function */
+		.punt_to_cgroup = 1,
+		.no_wbc_acct = 1,
 	};
 
 	while (start <= end) {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index df5527cc07b9..3f9b35bc0455 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -357,6 +357,7 @@ struct async_extent {
 };
 
 struct async_chunk {
+	struct cgroup_subsys_state *blkcg_css;
 	struct inode *inode;
 	struct page *locked_page;
 	u64 start;
@@ -846,7 +847,8 @@ static noinline void submit_compressed_extents(struct async_chunk *async_chunk)
 				    ins.objectid,
 				    ins.offset, async_extent->pages,
 				    async_extent->nr_pages,
-				    async_chunk->write_flags)) {
+				    async_chunk->write_flags,
+				    async_chunk->blkcg_css)) {
 			struct page *p = async_extent->pages[0];
 			const u64 start = async_extent->start;
 			const u64 end = start + async_extent->ram_size - 1;
@@ -1170,6 +1172,8 @@ static noinline void async_cow_free(struct btrfs_work *work)
 	async_chunk = container_of(work, struct async_chunk, work);
 	if (async_chunk->inode)
 		btrfs_add_delayed_iput(async_chunk->inode);
+	if (async_chunk->blkcg_css)
+		css_put(async_chunk->blkcg_css);
 	/*
 	 * Since the pointer to 'pending' is at the beginning of the array of
 	 * async_chunk's, freeing it ensures the whole array has been freed.
@@ -1178,12 +1182,15 @@ static noinline void async_cow_free(struct btrfs_work *work)
 		kvfree(async_chunk->pending);
 }
 
-static int cow_file_range_async(struct inode *inode, struct page *locked_page,
+static int cow_file_range_async(struct inode *inode,
+				struct writeback_control *wbc,
+				struct page *locked_page,
 				u64 start, u64 end, int *page_started,
 				unsigned long *nr_written,
 				unsigned int write_flags)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	struct cgroup_subsys_state *blkcg_css = wbc_blkcg_css(wbc);
 	struct async_cow *ctx;
 	struct async_chunk *async_chunk;
 	unsigned long nr_pages;
@@ -1251,12 +1258,29 @@ static int cow_file_range_async(struct inode *inode, struct page *locked_page,
 		 * to unlock it.
 		 */
 		if (locked_page) {
+			/*
+			 * Depending on the compressibility, the pages
+			 * might or might not go through async. We want
+			 * all of them to be accounted against @wbc once.
+			 * Let's do it here before the paths diverge. wbc
+			 * accounting is used only for foreign writeback
+			 * detection and doesn't need full accuracy. Just
+			 * account the whole thing against the first page.
+			 */
+			wbc_account_io(wbc, locked_page, cur_end - start);
 			async_chunk[i].locked_page = locked_page;
 			locked_page = NULL;
 		} else {
 			async_chunk[i].locked_page = NULL;
 		}
 
+		if (blkcg_css != blkcg_root_css) {
+			css_get(blkcg_css);
+			async_chunk[i].blkcg_css = blkcg_css;
+		} else {
+			async_chunk[i].blkcg_css = NULL;
+		}
+
 		btrfs_init_work(&async_chunk[i].work,
 				btrfs_delalloc_helper,
 				async_cow_start, async_cow_submit,
@@ -1653,7 +1677,7 @@ int btrfs_run_delalloc_range(struct inode *inode, struct page *locked_page,
 	} else {
 		set_bit(BTRFS_INODE_HAS_ASYNC_EXTENT,
 			&BTRFS_I(inode)->runtime_flags);
-		ret = cow_file_range_async(inode, locked_page, start, end,
+		ret = cow_file_range_async(inode, wbc, locked_page, start, end,
 					   page_started, nr_written,
 					   write_flags);
 	}

From patchwork Sat Jun 15 18:24:53 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 10997227
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C8AD01398
	for ; Sat, 15 Jun 2019 18:25:39 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BC416284D4
	for ; Sat, 15 Jun 2019 18:25:39 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id AF01528571; Sat, 15 Jun 2019 18:25:39 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID,
	DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable
	version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 63DCC284D4
	for ; Sat, 15 Jun 2019 18:25:39 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id
	S1727195AbfFOSZd (ORCPT ); Sat, 15 Jun 2019 14:25:33 -0400
Received: from mail-qk1-f196.google.com ([209.85.222.196]:36373 "EHLO mail-qk1-f196.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1727224AbfFOSZa (ORCPT );
	Sat, 15 Jun 2019 14:25:30 -0400
Received: by mail-qk1-f196.google.com with SMTP id g18so3827338qkl.3;
	Sat, 15 Jun 2019 11:25:29 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=sender:from:to:cc:subject:date:message-id:in-reply-to:references;
	bh=LIH/ZSr79Tn55qxPBQsZoWtTcM5IaydfNkQSRyHlb9Q=;
	b=CurpXwQSCE0pNhvm+GLfAqWmICgukKsaE/IpRpfQXwEsw160H63N5DXWHoUsoljImq
	 su0WKV6unNlBZ5LtyEhB0oDAMkxIiXExCCGVbw3UQOXZSnO5Y2h3sKKg0lT26lesjYvH
	 EXcAnnwWa21/To7Jzgz5zetr/GDjvVEjGFWD4jRzK0J9jJI8m17nmxvOXJSa4LcGBAus
	 sYE0om7+0urC7gOqTtfhLvzSJ1xpUJu2R0tdoAKSZT5Gx/D7o0ovnkmDl2ZMx6czrrHJ
	 V6isIKBAfeL4FSui+fw3Yv0Eo6MLXIY/GIFY7MK0Sea9+Iv+9vkcR0gg1xy44IUYqwqe
	 pZOg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
	h=x-gm-message-state:sender:from:to:cc:subject:date:message-id
	 :in-reply-to:references;
	bh=LIH/ZSr79Tn55qxPBQsZoWtTcM5IaydfNkQSRyHlb9Q=;
	b=Ln8LIcqbSXe4SS3vhSr8vd8Gv6fAUf347j2tfjggRU+68IL3rh35wnZUcBs9/dWogN
	 miMLPK0wGqtYjnQaJ2eGK3iJs1BusOTU9x7O/DOCUM7UAmTo7DYO/gNYx0THoZVQ3moq
	 4rGQuwE0dx1EEXxr5CK24Qv1lIW89CKzXDn/n8ldqUrOLGyZx/Rt5mY47c2ZwCW6g9LJ
	 Or1vdCIm8bP0NaUizUwwZ//FP/9u8OzQEkfygVfG7Ira52NiKNG2BoIYLjkQigq0QNXC
	 nABMMEgFPvsicyjcugCgxCVchFp5G3JLvxwie/DjHWKcK52UjKe4U1IhMPK5Z3wJmoEt
	 AvxA==
X-Gm-Message-State: APjAAAVNkHyyScobBW/NM1THY6FNcxLoj+Di3trmx5+F1abcczfev4+g
	vTAN23ldfCLkql49TcVgKPg=
X-Google-Smtp-Source: APXvYqyR8G50+n7mA4iZ9L0MB6ZIAZ2h2QTc2eTMh3NaZzH0rWhmETS5t5glFSzSuFEM2/UgoEmwBA==
X-Received: by 2002:a37:4bc5:: with SMTP id y188mr12989076qka.13.1560623128770;
	Sat, 15 Jun 2019 11:25:28 -0700 (PDT)
Received: from localhost ([2620:10d:c091:480::4883])
	by smtp.gmail.com with ESMTPSA id f68sm3715819qtb.83.2019.06.15.11.25.28
	(version=TLS1_2
	cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Sat, 15 Jun 2019 11:25:28 -0700 (PDT)
From: Tejun Heo
To: dsterba@suse.com, clm@fb.com, josef@toxicpanda.com, axboe@kernel.dk, jack@suse.cz
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 9/9] Btrfs: extent_write_locked_range() should attach inode->i_wb
Date: Sat, 15 Jun 2019 11:24:53 -0700
Message-Id: <20190615182453.843275-10-tj@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190615182453.843275-1-tj@kernel.org>
References: <20190615182453.843275-1-tj@kernel.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Chris Mason

extent_write_locked_range() is used when we're falling back to buffered
IO from inside of compression. It allocates its own wbc and should
associate it with the inode's i_wb to make sure the IO goes down from
the correct cgroup.

Signed-off-by: Chris Mason
Reviewed-by: Josef Bacik
---
 fs/btrfs/extent_io.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index d7b57341ff1a..afb916a69c30 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4180,6 +4180,7 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 		.no_wbc_acct = 1,
 	};
 
+	wbc_attach_fdatawrite_inode(&wbc_writepages, inode);
 	while (start <= end) {
 		page = find_get_page(mapping, start >> PAGE_SHIFT);
 		if (clear_page_dirty_for_io(page))
@@ -4194,11 +4195,12 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 	}
 
 	ASSERT(ret <= 0);
-	if (ret < 0) {
+	if (ret == 0)
+		ret = flush_write_bio(&epd);
+	else
 		end_write_bio(&epd, ret);
-		return ret;
-	}
-	ret = flush_write_bio(&epd);
+
+	wbc_detach_inode(&wbc_writepages);
 	return ret;
 }
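
The attach/detach pairing that the last hunk introduces can be modelled
in userspace. What follows is only a minimal sketch, not kernel code:
struct mock_wbc, mock_attach(), mock_detach(), mock_write_page() and
mock_write_locked_range() are hypothetical stand-ins invented for this
illustration, and unlike the real loop the mock stops at the first
failing page. It is meant to show why routing both the success and the
error outcome through one exit keeps the detach from being skipped,
which the old early "return ret" would have allowed.

```c
#include <assert.h>

/* Hypothetical stand-in for struct writeback_control; not the kernel type. */
struct mock_wbc {
	int attached;		/* 1 while associated with a writeback context */
	int attach_calls;	/* how many times mock_attach() ran */
	int detach_calls;	/* how many times mock_detach() ran */
};

/* Assumed analogue of wbc_attach_fdatawrite_inode(): mark the wbc attached. */
static void mock_attach(struct mock_wbc *wbc)
{
	wbc->attached = 1;
	wbc->attach_calls++;
}

/* Assumed analogue of wbc_detach_inode(): drop the association. */
static void mock_detach(struct mock_wbc *wbc)
{
	wbc->attached = 0;
	wbc->detach_calls++;
}

/* Hypothetical per-page write: fails with -5 when page == fail_at. */
static int mock_write_page(int page, int fail_at)
{
	return page == fail_at ? -5 : 0;
}

/*
 * Attach once, write the range, then fall through to a single detach
 * regardless of whether the writes succeeded. Pass fail_at < start to
 * model a fully successful range.
 */
static int mock_write_locked_range(struct mock_wbc *wbc, int start, int end,
				   int fail_at)
{
	int ret = 0;

	mock_attach(wbc);
	while (start <= end) {
		ret = mock_write_page(start, fail_at);
		if (ret < 0)
			break;	/* simplification: the kernel loop keeps going */
		start++;
	}

	/* The kernel code flushes or ends the pending bio at this point. */
	mock_detach(wbc);
	return ret;
}
```

Calling mock_write_locked_range() on a zero-initialised struct mock_wbc
leaves attach_calls and detach_calls equal on both the success and the
failure path, which is the property the single-exit layout is after.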