From patchwork Sat Sep 10 06:50:54 2022
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 12972439
From: Christoph Hellwig
To: Jens Axboe, Matthew Wilcox, Johannes Weiner, Suren Baghdasaryan, Andrew Morton
Cc: Chris Mason, Josef Bacik, David Sterba, Gao Xiang, Chao Yu, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-mm@kvack.org
Subject: [PATCH 1/5] mm: add PSI accounting around ->read_folio and ->readahead calls
Date: Sat, 10 Sep 2022 08:50:54 +0200
Message-Id: <20220910065058.3303831-2-hch@lst.de>
In-Reply-To: <20220910065058.3303831-1-hch@lst.de>
References: <20220910065058.3303831-1-hch@lst.de>

PSI tries to account for the cost of bringing back in pages discarded by the MM LRU management. Currently the prime hook for that sits in the bio submission path, which is a rather bad place:

 - it does not actually account I/O for non-block file systems, of which we have many
 - it adds overhead and a layering violation to the block layer

Add the accounting to the two places in the core MM code that read pages into an address space by calling into ->read_folio and ->readahead, so that all file system read paths are covered. This broadens the coverage and allows removing the accounting from the block layer going forward.

As psi_memstall_enter can deal with nested calls, this will not lead to double accounting even while the bio annotations are still present.
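
For readers unfamiliar with the PSI internals, the "nested calls" guarantee above comes from psi_memstall_enter() recording whether the task is already inside a memstall section and turning any inner enter/leave pair into a no-op. A simplified sketch of that behaviour, paraphrased from kernel/sched/psi.c (the psi_disabled check and the actual accounting are elided):

	void psi_memstall_enter(unsigned long *flags)
	{
		/* remember whether the task is already in a memstall section */
		*flags = current->in_memstall;
		if (*flags)
			return;		/* nested: the outer section keeps accounting */

		current->in_memstall = 1;
		/* ... start the PSI memory-stall clock for this task ... */
	}

	void psi_memstall_leave(unsigned long *flags)
	{
		if (*flags)
			return;		/* nested: nothing to undo */

		current->in_memstall = 0;
		/* ... stop the PSI memory-stall clock ... */
	}

So a read that is annotated both here and in the bio submission path is only charged once.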
Signed-off-by: Christoph Hellwig
Acked-by: Johannes Weiner
---
 include/linux/pagemap.h |  2 ++
 mm/filemap.c            |  7 +++++++
 mm/readahead.c          | 22 ++++++++++++++++++----
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 0178b2040ea38..201dc7281640b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1173,6 +1173,8 @@ struct readahead_control {
 	pgoff_t _index;
 	unsigned int _nr_pages;
 	unsigned int _batch_count;
+	bool _workingset;
+	unsigned long _pflags;
 };
 
 #define DEFINE_READAHEAD(ractl, f, r, m, i) \
diff --git a/mm/filemap.c b/mm/filemap.c
index 15800334147b3..c943d1b90cc26 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2382,6 +2382,8 @@ static void filemap_get_read_batch(struct address_space *mapping,
 static int filemap_read_folio(struct file *file, filler_t filler,
 		struct folio *folio)
 {
+	bool workingset = folio_test_workingset(folio);
+	unsigned long pflags;
 	int error;
 
 	/*
@@ -2390,8 +2392,13 @@ static int filemap_read_folio(struct file *file, filler_t filler,
 	 * fails.
 	 */
 	folio_clear_error(folio);
+
 	/* Start the actual read. The read will unlock the page. */
+	if (unlikely(workingset))
+		psi_memstall_enter(&pflags);
 	error = filler(file, folio);
+	if (unlikely(workingset))
+		psi_memstall_leave(&pflags);
 	if (error)
 		return error;
 
diff --git a/mm/readahead.c b/mm/readahead.c
index fdcd28cbd92de..43631c146d6dc 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -122,6 +122,7 @@
 #include <...>
 #include <...>
 #include <...>
+#include <linux/psi.h>
 #include <...>
 #include <...>
 #include <...>
@@ -152,6 +153,8 @@ static void read_pages(struct readahead_control *rac)
 	if (!readahead_count(rac))
 		return;
 
+	if (unlikely(rac->_workingset))
+		psi_memstall_enter(&rac->_pflags);
 	blk_start_plug(&plug);
 
 	if (aops->readahead) {
@@ -179,6 +182,9 @@ static void read_pages(struct readahead_control *rac)
 	}
 
 	blk_finish_plug(&plug);
+	if (unlikely(rac->_workingset))
+		psi_memstall_leave(&rac->_pflags);
+	rac->_workingset = false;
 
 	BUG_ON(readahead_count(rac));
 }
@@ -252,6 +258,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
+		ractl->_workingset |= folio_test_workingset(folio);
 		ractl->_nr_pages++;
 	}
 
@@ -480,11 +487,14 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index,
 	if (index == mark)
 		folio_set_readahead(folio);
 	err = filemap_add_folio(ractl->mapping, folio, index, gfp);
-	if (err)
+	if (err) {
 		folio_put(folio);
-	else
-		ractl->_nr_pages += 1UL << order;
-	return err;
+		return err;
+	}
+
+	ractl->_nr_pages += 1UL << order;
+	ractl->_workingset = folio_test_workingset(folio);
+	return 0;
 }
 
 void page_cache_ra_order(struct readahead_control *ractl,
@@ -826,6 +836,10 @@ void readahead_expand(struct readahead_control *ractl,
 			put_page(page);
 			return;
 		}
+		if (unlikely(PageWorkingset(page)) && !ractl->_workingset) {
+			ractl->_workingset = true;
+			psi_memstall_enter(&ractl->_pflags);
+		}
 		ractl->_nr_pages++;
 		if (ra) {
 			ra->size++;

From patchwork Sat Sep 10 06:50:55 2022
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 12972438
From: Christoph Hellwig
To: Jens Axboe, Matthew Wilcox, Johannes Weiner, Suren Baghdasaryan, Andrew Morton
Cc: Chris Mason, Josef Bacik, David Sterba, Gao Xiang, Chao Yu, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-mm@kvack.org
Subject: [PATCH 2/5] sched/psi: export psi_memstall_{enter,leave}
Date: Sat, 10 Sep 2022 08:50:55 +0200
Message-Id: <20220910065058.3303831-3-hch@lst.de>
In-Reply-To: <20220910065058.3303831-1-hch@lst.de>
References: <20220910065058.3303831-1-hch@lst.de>

To properly account for all refaults from file system logic, file systems need to call psi_memstall_enter and psi_memstall_leave directly, so export the pair.
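
The practical consequence is that a file system built as a module (as btrfs and erofs commonly are) can now wrap its own out-of-band reads in the pair, roughly along the following lines; myfs_read_extent() and myfs_submit_extent_io() are made-up names for illustration only, not part of this patch:

	#include <linux/psi.h>

	int myfs_submit_extent_io(struct page *page);	/* hypothetical I/O helper */

	static int myfs_read_extent(struct page *page)
	{
		unsigned long pflags;
		bool stall = PageWorkingset(page);
		int err;

		if (stall)
			psi_memstall_enter(&pflags);
		err = myfs_submit_extent_io(page);
		if (stall)
			psi_memstall_leave(&pflags);
		return err;
	}

The next two patches add this kind of annotation to btrfs and erofs.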
Signed-off-by: Christoph Hellwig
Acked-by: Johannes Weiner
---
 kernel/sched/psi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index ecb4b4ff4ce0a..7f6030091aeee 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -917,6 +917,7 @@ void psi_memstall_enter(unsigned long *flags)
 
 	rq_unlock_irq(rq, &rf);
 }
+EXPORT_SYMBOL_GPL(psi_memstall_enter);
 
 /**
  * psi_memstall_leave - mark the end of an memory stall section
@@ -946,6 +947,7 @@ void psi_memstall_leave(unsigned long *flags)
 
 	rq_unlock_irq(rq, &rf);
 }
+EXPORT_SYMBOL_GPL(psi_memstall_leave);
 
 #ifdef CONFIG_CGROUPS
 int psi_cgroup_alloc(struct cgroup *cgroup)

From patchwork Sat Sep 10 06:50:56 2022
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 12972440
From: Christoph Hellwig
To: Jens Axboe, Matthew Wilcox, Johannes Weiner, Suren Baghdasaryan, Andrew Morton
Cc: Chris Mason, Josef Bacik, David Sterba, Gao Xiang, Chao Yu, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-mm@kvack.org
Subject: [PATCH 3/5] btrfs: add manual PSI accounting for compressed reads
Date: Sat, 10 Sep 2022 08:50:56 +0200
Message-Id: <20220910065058.3303831-4-hch@lst.de>
In-Reply-To: <20220910065058.3303831-1-hch@lst.de>
References: <20220910065058.3303831-1-hch@lst.de>

btrfs compressed reads try to always read the entire compressed chunk, even if only a subset is requested.
Currently this is covered by the magic PSI accounting underneath submit_bio, but that is about to go away. Instead, add manual psi_memstall_{enter,leave} annotations.

Note that for readahead this really should be using readahead_expand, but the additional reads are also done for plain ->read_folio, where readahead_expand can't work, so this overall logic is left as-is for now.

Signed-off-by: Christoph Hellwig
Acked-by: David Sterba
Acked-by: Johannes Weiner
---
 fs/btrfs/compression.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index e84d22c5c6a83..f7889a00e0055 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -15,6 +15,7 @@
 #include <...>
 #include <...>
 #include <...>
+#include <linux/psi.h>
 #include <...>
 #include <...>
 #include <...>
@@ -519,7 +520,8 @@ static u64 bio_end_offset(struct bio *bio)
  */
 static noinline int add_ra_bio_pages(struct inode *inode,
 				     u64 compressed_end,
-				     struct compressed_bio *cb)
+				     struct compressed_bio *cb,
+				     unsigned long *pflags)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	unsigned long end_index;
@@ -588,6 +590,9 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 			continue;
 		}
 
+		if (unlikely(PageWorkingset(page)))
+			psi_memstall_enter(pflags);
+
 		ret = set_page_extent_mapped(page);
 		if (ret < 0) {
 			unlock_page(page);
@@ -674,6 +679,8 @@ void btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 	u64 em_len;
 	u64 em_start;
 	struct extent_map *em;
+	/* initialize to 1 to make skip psi_memstall_leave unless needed */
+	unsigned long pflags = 1;
 	blk_status_t ret;
 	int ret2;
 	int i;
@@ -729,7 +736,7 @@ void btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 		goto fail;
 	}
 
-	add_ra_bio_pages(inode, em_start + em_len, cb);
+	add_ra_bio_pages(inode, em_start + em_len, cb, &pflags);
 
 	/* include any pages we added in add_ra-bio_pages */
 	cb->len = bio->bi_iter.bi_size;
@@ -810,6 +817,9 @@ void btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 		}
 	}
 
+	if (!pflags)
+		psi_memstall_leave(&pflags);
+
 	if (refcount_dec_and_test(&cb->pending_ios))
 		finish_compressed_bio_read(cb);
 	return;

From patchwork Sat Sep 10 06:50:57 2022
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 12972441
From: Christoph Hellwig
To: Jens Axboe, Matthew Wilcox, Johannes Weiner, Suren Baghdasaryan, Andrew Morton
Cc: Chris Mason, Josef Bacik, David Sterba, Gao Xiang, Chao Yu, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-mm@kvack.org
Subject: [PATCH 4/5] erofs: add manual PSI accounting for the compressed address space
Date: Sat, 10 Sep 2022 08:50:57 +0200
Message-Id: <20220910065058.3303831-5-hch@lst.de>
In-Reply-To: <20220910065058.3303831-1-hch@lst.de>
References: <20220910065058.3303831-1-hch@lst.de>

erofs uses an additional address space for compressed data read from disk, in addition to the one directly associated with the inode. Reading into that lower address space is open-coded using add_to_page_cache_lru instead of the filemap.c helpers (for page allocation micro-optimizations), which means it is not covered by the MM PSI annotations for ->read_folio and ->readahead, so add manual ones instead.

Signed-off-by: Christoph Hellwig
Acked-by: Johannes Weiner
Acked-by: Gao Xiang
---
 fs/erofs/zdata.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 5792ca9e0d5ef..143a101a36887 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -7,6 +7,7 @@
 #include "zdata.h"
 #include "compress.h"
 #include <...>
+#include <linux/psi.h>
 
 #include <...>
 
@@ -1365,6 +1366,8 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f,
 	struct block_device *last_bdev;
 	unsigned int nr_bios = 0;
 	struct bio *bio = NULL;
+	/* initialize to 1 to make skip psi_memstall_leave unless needed */
+	unsigned long pflags = 1;
 
 	bi_private = jobqueueset_init(sb, q, fgq, force_fg);
 	qtail[JQ_BYPASS] = &q[JQ_BYPASS]->head;
@@ -1414,10 +1417,15 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f,
 			if (bio && (cur != last_index + 1 ||
 				    last_bdev != mdev.m_bdev)) {
 submit_bio_retry:
+				if (!pflags)
+					psi_memstall_leave(&pflags);
 				submit_bio(bio);
 				bio = NULL;
 			}
 
+			if (unlikely(PageWorkingset(page)))
+				psi_memstall_enter(&pflags);
+
 			if (!bio) {
 				bio = bio_alloc(mdev.m_bdev, BIO_MAX_VECS,
 						REQ_OP_READ, GFP_NOIO);
@@ -1445,8 +1453,11 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f,
 		move_to_bypass_jobqueue(pcl, qtail, owned_head);
 	} while (owned_head != Z_EROFS_PCLUSTER_TAIL);
 
-	if (bio)
+	if (bio) {
+		if (!pflags)
+			psi_memstall_leave(&pflags);
 		submit_bio(bio);
+	}
 
 	/*
 	 * although background is preferred, no one is pending for submission.
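
Both the btrfs patch and this erofs patch build on the same idiom behind the terse "initialize to 1" comment: when PSI is enabled and the task is not already stalled, psi_memstall_enter() stores 0 in *pflags and opens a stall section; in every other case *pflags stays non-zero. Pre-setting pflags to 1 therefore lets a single "if (!pflags)" test decide whether psi_memstall_leave() has anything to close, even when no workingset page was ever seen. A condensed sketch of the control flow (hypothetical helper names, not the actual btrfs or erofs code):

	static void submit_compressed_read_sketch(struct page **pages, int nr)
	{
		/* non-zero means: no stall section opened yet */
		unsigned long pflags = 1;
		int i;

		for (i = 0; i < nr; i++) {
			if (unlikely(PageWorkingset(pages[i])))
				psi_memstall_enter(&pflags);	/* clears pflags on first real entry */
			/* ... add pages[i] to the compressed read bio ... */
		}

		/* ... submit the bio ... */

		if (!pflags)	/* a stall section was actually opened */
			psi_memstall_leave(&pflags);
	}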

From patchwork Sat Sep 10 06:50:58 2022
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 12972442
From: Christoph Hellwig
To: Jens Axboe, Matthew Wilcox, Johannes Weiner, Suren Baghdasaryan, Andrew Morton
Cc: Chris Mason, Josef Bacik, David Sterba, Gao Xiang, Chao Yu, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-mm@kvack.org
Subject: [PATCH 5/5] block: remove PSI accounting from the bio layer
Date: Sat, 10 Sep 2022 08:50:58 +0200
Message-Id: <20220910065058.3303831-6-hch@lst.de>
In-Reply-To: <20220910065058.3303831-1-hch@lst.de>
References: <20220910065058.3303831-1-hch@lst.de>

PSI accounting is now done by the VM code, where it should have been since the beginning.
Signed-off-by: Christoph Hellwig
Acked-by: Johannes Weiner
---
 block/bio.c               |  8 --------
 block/blk-core.c          | 17 -----------------
 fs/direct-io.c            |  2 --
 include/linux/blk_types.h |  1 -
 4 files changed, 28 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 3d3a2678fea25..d10c4e888cdcf 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1065,9 +1065,6 @@ void __bio_add_page(struct bio *bio, struct page *page,
 
 	bio->bi_iter.bi_size += len;
 	bio->bi_vcnt++;
-
-	if (!bio_flagged(bio, BIO_WORKINGSET) && unlikely(PageWorkingset(page)))
-		bio_set_flag(bio, BIO_WORKINGSET);
 }
 EXPORT_SYMBOL_GPL(__bio_add_page);
 
@@ -1276,9 +1273,6 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
  * fit into the bio, or are requested in @iter, whatever is smaller. If
  * MM encounters an error pinning the requested pages, it stops. Error
  * is returned only if 0 pages could be pinned.
- *
- * It's intended for direct IO, so doesn't do PSI tracking, the caller is
- * responsible for setting BIO_WORKINGSET if necessary.
  */
 int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 {
@@ -1294,8 +1288,6 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		ret = __bio_iov_iter_get_pages(bio, iter);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
 
-	/* don't account direct I/O as memory stall */
-	bio_clear_flag(bio, BIO_WORKINGSET);
 	return bio->bi_vcnt ? 0 : ret;
 }
 EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
diff --git a/block/blk-core.c b/block/blk-core.c
index a0d1104c5590c..9e19195af6f5b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -37,7 +37,6 @@
 #include <...>
 #include <...>
 #include <...>
-#include <linux/psi.h>
 #include <...>
 #include <...>
 #include <...>
@@ -829,22 +828,6 @@ void submit_bio(struct bio *bio)
 		count_vm_events(PGPGOUT, bio_sectors(bio));
 	}
 
-	/*
-	 * If we're reading data that is part of the userspace workingset, count
-	 * submission time as memory stall. When the device is congested, or
-	 * the submitting cgroup IO-throttled, submission can be a significant
-	 * part of overall IO time.
-	 */
-	if (unlikely(bio_op(bio) == REQ_OP_READ &&
-		     bio_flagged(bio, BIO_WORKINGSET))) {
-		unsigned long pflags;
-
-		psi_memstall_enter(&pflags);
-		submit_bio_noacct(bio);
-		psi_memstall_leave(&pflags);
-		return;
-	}
-
 	submit_bio_noacct(bio);
 }
 EXPORT_SYMBOL(submit_bio);
diff --git a/fs/direct-io.c b/fs/direct-io.c
index f669163d5860f..03d381377ae10 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -421,8 +421,6 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
 	unsigned long flags;
 
 	bio->bi_private = dio;
-	/* don't account direct I/O as memory stall */
-	bio_clear_flag(bio, BIO_WORKINGSET);
 
 	spin_lock_irqsave(&dio->bio_lock, flags);
 	dio->refcount++;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 1ef99790f6ed3..8b1858df21752 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -321,7 +321,6 @@ enum {
 	BIO_NO_PAGE_REF,	/* don't put release vec pages */
 	BIO_CLONED,		/* doesn't own data */
 	BIO_BOUNCED,		/* bio is a bounce bio */
-	BIO_WORKINGSET,		/* contains userspace workingset pages */
 	BIO_QUIET,		/* Make BIO Quiet */
 	BIO_CHAIN,		/* chained bio, ->bi_remaining in effect */
 	BIO_REFFED,		/* bio has elevated ->bi_cnt */