[v6,50/51] mm: Add THP readahead


Message ID

20200610201345.13273-51-willy@infradead.org (mailing list archive)

State

New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DF2A6207ED
From: Matthew Wilcox <willy@infradead.org>
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>,
	linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH v6 50/51] mm: Add THP readahead
Date: Wed, 10 Jun 2020 13:13:44 -0700
Message-Id: <20200610201345.13273-51-willy@infradead.org>
In-Reply-To: <20200610201345.13273-1-willy@infradead.org>
References: <20200610201345.13273-1-willy@infradead.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Series

Large pages in the page cache

Commit Message

Matthew Wilcox June 10, 2020, 8:13 p.m. UTC

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

If the filesystem supports THPs, allocate larger pages in the
readahead code when it seems worth doing.  The heuristic for choosing
larger page sizes will surely need some tuning, but this aggressive
ramp-up seems good for testing.
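
As a rough illustration of the ramp-up described above, the order selection
in page_cache_readahead_order() can be modelled in userspace as below.  This
is only a sketch: sketch_next_order() and SKETCH_HPAGE_PMD_ORDER are
hypothetical names, the value 9 assumes x86-64 with 4KiB base pages, and the
"order > 0" guard is an addition that the patch itself does not have.

```c
#include <stddef.h>

/* Assumption: x86-64 with 4KiB base pages, where a PMD-sized page is order 9. */
#define SKETCH_HPAGE_PMD_ORDER 9u

/*
 * Hypothetical userspace model of the "Grow page size up to PMD size" step.
 * Starting from the order of the page that carried the readahead marker,
 * bump the order by 2, cap it at the PMD order, then shrink it until a
 * single page of that order fits inside the readahead window (ra_size is
 * counted in base pages).
 */
static unsigned int sketch_next_order(unsigned int order, unsigned long ra_size)
{
	if (order < SKETCH_HPAGE_PMD_ORDER) {
		order += 2;
		if (order > SKETCH_HPAGE_PMD_ORDER)
			order = SKETCH_HPAGE_PMD_ORDER;
		/* The "order > 0" guard is an addition for this sketch only. */
		while (order > 0 && (1UL << order) > ra_size)
			order--;
	}
	return order;
}
```

With a 32-page window, successive marker hits under this model would move
from order 0 to 2, then 4, then 5, at which point one page covers the whole
window.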

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/readahead.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 87 insertions(+), 6 deletions(-)
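
For readers following the three allocation loops in the diff below, here is
a minimal userspace sketch of how the window gets carved up.  It models only
the index arithmetic; struct sketch_chunk and sketch_plan() are hypothetical
names, and no allocation, page cache insertion or error handling is
represented.

```c
#include <stddef.h>

/* One planned allocation: starting page index and page order. */
struct sketch_chunk {
	unsigned long index;
	unsigned int order;
};

/*
 * Hypothetical model of the three fill loops:
 *   1. order-0 pages until index is aligned to the old order,
 *   2. old-order pages until index is aligned to the new order,
 *   3. new-order pages until index passes limit.
 * Writes at most max chunks to out, stores the final index in *end and
 * returns the number of chunks planned.
 */
static size_t sketch_plan(unsigned long index, unsigned long limit,
		unsigned int old_order, unsigned int order,
		struct sketch_chunk *out, size_t max, unsigned long *end)
{
	size_t n = 0;

	while (n < max && (index & ((1UL << old_order) - 1))) {
		out[n++] = (struct sketch_chunk){ index, 0 };
		index++;
	}
	while (n < max && (index & ((1UL << order) - 1))) {
		out[n++] = (struct sketch_chunk){ index, old_order };
		index += 1UL << old_order;
	}
	while (n < max && index <= limit) {
		out[n++] = (struct sketch_chunk){ index, order };
		index += 1UL << order;
	}
	*end = index;
	return n;
}
```

Starting at index 6 with limit 31, old order 2 and new order 4, this plans
two order-0 pages (6, 7), two order-2 pages (8, 12) and one order-4 page
(16).  When the last page overshoots limit, the patch grows ra->size and
ra->async_size by "index - limit - 1" to account for the extra pages.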

diff --git a/mm/readahead.c b/mm/readahead.c
index 74c7e1eff540..98bbcc986b39 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -149,7 +149,7 @@  static void read_pages(struct readahead_control *rac, struct list_head *pages,
 
 	blk_finish_plug(&plug);
 
-	BUG_ON(!list_empty(pages));
+	BUG_ON(pages && !list_empty(pages));
 	BUG_ON(readahead_count(rac));
 
 out:
@@ -428,13 +428,92 @@  static int try_context_readahead(struct address_space *mapping,
 	return 1;
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static inline int ra_alloc_page(struct readahead_control *rac, pgoff_t index,
+		pgoff_t mark, unsigned int order, gfp_t gfp)
+{
+	int err;
+	struct page *page = __page_cache_alloc_order(gfp, order);
+
+	if (!page)
+		return -ENOMEM;
+	if (mark - index < (1UL << order))
+		SetPageReadahead(page);
+	err = add_to_page_cache_lru(page, rac->mapping, index, gfp);
+	if (err)
+		put_page(page);
+	else
+		rac->_nr_pages += 1UL << order;
+	return err;
+}
+
+static bool page_cache_readahead_order(struct readahead_control *rac,
+		struct file_ra_state *ra, unsigned int order)
+{
+	struct address_space *mapping = rac->mapping;
+	unsigned int old_order = order;
+	pgoff_t index = readahead_index(rac);
+	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
+	pgoff_t mark = index + ra->size - ra->async_size;
+	int err = 0;
+	gfp_t gfp = readahead_gfp_mask(mapping);
+
+	if (!mapping_thp_support(mapping))
+		return false;
+
+	limit = min(limit, index + ra->size - 1);
+
+	/* Grow page size up to PMD size */
+	if (order < HPAGE_PMD_ORDER) {
+		order += 2;
+		if (order > HPAGE_PMD_ORDER)
+			order = HPAGE_PMD_ORDER;
+		while ((1 << order) > ra->size)
+			order--;
+	}
+
+	/* If size is somehow misaligned, fill with order-0 pages */
+	while (!err && index & ((1UL << old_order) - 1))
+		err = ra_alloc_page(rac, index++, mark, 0, gfp);
+
+	while (!err && index & ((1UL << order) - 1)) {
+		err = ra_alloc_page(rac, index, mark, old_order, gfp);
+		index += 1UL << old_order;
+	}
+
+	while (!err && index <= limit) {
+		err = ra_alloc_page(rac, index, mark, order, gfp);
+		index += 1UL << order;
+	}
+
+	if (index > limit) {
+		ra->size += index - limit - 1;
+		ra->async_size += index - limit - 1;
+	}
+
+	read_pages(rac, NULL, false);
+
+	/*
+	 * If there were already pages in the page cache, then we may have
+	 * left some gaps.  Let the regular readahead code take care of this
+	 * situation.
+	 */
+	return !err;
+}
+#else
+static bool page_cache_readahead_order(struct readahead_control *rac,
+		struct file_ra_state *ra, unsigned int order)
+{
+	return false;
+}
+#endif
+
 /*
  * A minimal readahead algorithm for trivial sequential/random reads.
  */
 static void ondemand_readahead(struct address_space *mapping,
 		struct file_ra_state *ra, struct file *file,
-		bool hit_readahead_marker, pgoff_t index,
-		unsigned long req_size)
+		struct page *page, pgoff_t index, unsigned long req_size)
 {
 	DEFINE_READAHEAD(rac, file, mapping, index);
 	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
@@ -473,7 +552,7 @@  static void ondemand_readahead(struct address_space *mapping,
 	 * Query the pagecache for async_size, which normally equals to
 	 * readahead size. Ramp it up and use it as the new readahead size.
 	 */
-	if (hit_readahead_marker) {
+	if (page) {
 		pgoff_t start;
 
 		rcu_read_lock();
@@ -544,6 +623,8 @@  static void ondemand_readahead(struct address_space *mapping,
 	}
 
 	rac._index = ra->start;
+	if (page && page_cache_readahead_order(&rac, ra, thp_order(page)))
+		return;
 	__do_page_cache_readahead(&rac, ra->size, ra->async_size);
 }
 
@@ -578,7 +659,7 @@  void page_cache_sync_readahead(struct address_space *mapping,
 	}
 
 	/* do read-ahead */
-	ondemand_readahead(mapping, ra, filp, false, index, req_count);
+	ondemand_readahead(mapping, ra, filp, NULL, index, req_count);
 }
 EXPORT_SYMBOL_GPL(page_cache_sync_readahead);
 
@@ -624,7 +705,7 @@  page_cache_async_readahead(struct address_space *mapping,
 		return;
 
 	/* do read-ahead */
-	ondemand_readahead(mapping, ra, filp, true, index, req_count);
+	ondemand_readahead(mapping, ra, filp, page, index, req_count);
 }
 EXPORT_SYMBOL_GPL(page_cache_async_readahead);
