diff mbox series

[3/3] mm: make PageReadahead more strict

Message ID 20200212221614.215302-3-minchan@kernel.org (mailing list archive)
State New, archived
Headers show
Series [1/3] mm: Don't bother dropping mmap_sem for zero size readahead | expand

Commit Message

Minchan Kim Feb. 12, 2020, 10:16 p.m. UTC
PG_readahead flag is shared with PG_reclaim but PG_reclaim is only
used in write context while PG_readahead is used for read context.

To make it clear, let's introduce PageReadahead wrapper with
!PageWriteback so it could make code clear and we could drop
PageWriteback check in page_cache_async_readahead, which removes
pointless dropping mmap_sem.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 include/linux/page-flags.h | 28 ++++++++++++++++++++++++++--
 mm/readahead.c             |  6 ------
 2 files changed, 26 insertions(+), 8 deletions(-)

Comments

Jan Kara Feb. 17, 2020, 9:31 a.m. UTC | #1
On Wed 12-02-20 14:16:14, Minchan Kim wrote:
> PG_readahead flag is shared with PG_reclaim but PG_reclaim is only
> used in write context while PG_readahead is used for read context.
> 
> To make it clear, let's introduce PageReadahead wrapper with
> !PageWriteback so it could make code clear and we could drop
> PageWriteback check in page_cache_async_readahead, which removes
> pointless dropping mmap_sem.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>

...

> +/* Clear PG_readahead only if it's PG_readahead, not PG_reclaim */
> +static inline int TestClearPageReadahead(struct page *page)
> +{
> +	VM_BUG_ON_PGFLAGS(PageCompound(page), page);
> +
> +	return !PageWriteback(page) ||
> +			test_and_clear_bit(PG_reclaim, &page->flags);
> +}

I think this is still wrong - if PageWriteback is not set, it will never
clear PG_reclaim bit so effectively the page will stay in PageReadahead
state!

The logic you really want to implement is:

	if (PageReadahead(page)) { <- this is your new PageReadahead
				    implementation
		clear_bit(PG_reclaim, &page->flags);
		return 1;
	}
	return 0;

Now this has the problem that it is not atomic. The only way I see to make
this fully atomic is using cmpxchg(). If we wanted to make this kinda-sorta
OK, the proper condition would look like:

	return !PageWriteback(page) **&&**
			test_and_clear_bit(PG_reclaim, &page->flags);

Which is similar to what you originally had but different because in C '&&'
operator is not commutative due to side-effects committed at sequence points.

BTW: I share Andrew's view that we are piling hacks to fix problems caused
by older hacks. But I don't see any good option how to unalias
PG_readahead and PG_reclaim.

								Honza
Minchan Kim Feb. 18, 2020, 10:28 p.m. UTC | #2
On Mon, Feb 17, 2020 at 10:31:28AM +0100, Jan Kara wrote:
> On Wed 12-02-20 14:16:14, Minchan Kim wrote:
> > PG_readahead flag is shared with PG_reclaim but PG_reclaim is only
> > used in write context while PG_readahead is used for read context.
> > 
> > To make it clear, let's introduce PageReadahead wrapper with
> > !PageWriteback so it could make code clear and we could drop
> > PageWriteback check in page_cache_async_readahead, which removes
> > pointless dropping mmap_sem.
> > 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> 
> ...
> 
> > +/* Clear PG_readahead only if it's PG_readahead, not PG_reclaim */
> > +static inline int TestClearPageReadahead(struct page *page)
> > +{
> > +	VM_BUG_ON_PGFLAGS(PageCompound(page), page);
> > +
> > +	return !PageWriteback(page) ||
> > +			test_and_clear_bit(PG_reclaim, &page->flags);
> > +}
> 
> I think this is still wrong - if PageWriteback is not set, it will never
> clear PG_reclaim bit so effectively the page will stay in PageReadahead
> state!
> 
> The logic you really want to implement is:
> 
> 	if (PageReadahead(page)) { <- this is your new PageReadahead
> 				    implementation
> 		clear_bit(PG_reclaim, &page->flags);
> 		return 1;
> 	}
> 	return 0;
> 
> Now this has the problem that it is not atomic. The only way I see to make
> this fully atomic is using cmpxchg(). If we wanted to make this kinda-sorta
> OK, the proper condition would look like:
> 
> 	return !PageWriteback(page) **&&**
> 			test_and_clear_bit(PG_reclaim, &page->flags);
> 
> Which is similar to what you originally had but different because in C '&&'
> operator is not commutative due to side-effects committed at sequence points.

It's accurate. Thanks, Jan.

> 
> BTW: I share Andrew's view that we are piling hacks to fix problems caused
> by older hacks. But I don't see any good option how to unalias
> PG_readahead and PG_reclaim.

I believe it's okay to remove PG_writeback check in page_cache_async_readahead. 
It's merely optmization to accelerate page reclaim when the LRU victim page's
writeback is done by VM. If we removes the condition, we just lose few pages at
the moment for fast reclaim but finally they could be reclaimed in inactive LRU
and it would be *rare* in that that kinds of writeback from reclaim context
should be not common and doesn't affect system behavior a lot.


> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
diff mbox series

Patch

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 1bf83c8fcaa7..f91a9b2a49bd 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -363,8 +363,32 @@  PAGEFLAG(MappedToDisk, mappedtodisk, PF_NO_TAIL)
 /* PG_readahead is only used for reads; PG_reclaim is only for writes */
 PAGEFLAG(Reclaim, reclaim, PF_NO_TAIL)
 	TESTCLEARFLAG(Reclaim, reclaim, PF_NO_TAIL)
-PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
-	TESTCLEARFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+
+SETPAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+CLEARPAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+
+/*
+ * Since PG_readahead is shared with PG_reclaim of the page flags,
+ * PageReadahead should double check whether it's readahead marker
+ * or PG_reclaim. It could be done by PageWriteback check because
+ * PG_reclaim is always with PG_writeback.
+ */
+static inline int PageReadahead(struct page *page)
+{
+	VM_BUG_ON_PGFLAGS(PageCompound(page), page);
+
+	return (page->flags & (1UL << PG_reclaim | 1UL << PG_writeback)) ==
+		(1UL << PG_reclaim);
+}
+
+/* Clear PG_readahead only if it's PG_readahead, not PG_reclaim */
+static inline int TestClearPageReadahead(struct page *page)
+{
+	VM_BUG_ON_PGFLAGS(PageCompound(page), page);
+
+	return !PageWriteback(page) ||
+			test_and_clear_bit(PG_reclaim, &page->flags);
+}
 
 #ifdef CONFIG_HIGHMEM
 /*
diff --git a/mm/readahead.c b/mm/readahead.c
index 2fe72cd29b47..85b15e5a1d7b 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -553,12 +553,6 @@  page_cache_async_readahead(struct address_space *mapping,
 	if (!ra->ra_pages)
 		return;
 
-	/*
-	 * Same bit is used for PG_readahead and PG_reclaim.
-	 */
-	if (PageWriteback(page))
-		return;
-
 	ClearPageReadahead(page);
 
 	/*