[v2] mm: Optimise put_pages_list()

Message ID 20211007192138.561673-1-willy@infradead.org (mailing list archive)
State New
Series [v2] mm: Optimise put_pages_list()

Commit Message

Matthew Wilcox Oct. 7, 2021, 7:21 p.m. UTC
Instead of calling put_page() one page at a time, pop pages off
the list if their refcount was too high and pass the remainder to
free_unref_page_list().  This should be a speed improvement, but I have
no measurements to support that.  Current callers do not care about
performance, but I hope to add some which do.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
v2:
 - Handle compound pages (Mel)
 - Comment why we don't need to handle PageLRU
 - Added call to __ClearPageWaiters(), matching that in release_pages()

 mm/swap.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)
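
For context, a minimal caller sketch (hypothetical snippet, not from the
patch; today's in-tree users are read_cache_pages() and related error
recovery code).  The contract is that each page is threaded on page->lru
and carries one reference that the caller owns:

	LIST_HEAD(pages);
	struct page *page = alloc_page(GFP_KERNEL);

	if (page)
		list_add(&page->lru, &pages);
	/* ... collect more pages on page->lru the same way ... */

	put_pages_list(&pages);	/* consumes one reference per page */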

Comments

Andrew Morton Oct. 7, 2021, 7:31 p.m. UTC | #1
On Thu,  7 Oct 2021 20:21:37 +0100 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:

> Instead of calling put_page() one page at a time, pop pages off
> the list if their refcount was too high and pass the remainder to
> free_unref_page_list().  This should be a speed improvement, but I have
> no measurements to support that.  Current callers do not care about
> performance, but I hope to add some which do.

Don't you think it would actually be slower to take an additional pass
across the list?  If the list is long enough to cause cache thrashing. 
Maybe it's faster for small lists.
Matthew Wilcox Oct. 7, 2021, 8:55 p.m. UTC | #2
On Thu, Oct 07, 2021 at 12:31:09PM -0700, Andrew Morton wrote:
> On Thu,  7 Oct 2021 20:21:37 +0100 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:
> 
> > Instead of calling put_page() one page at a time, pop pages off
> > the list if their refcount was too high and pass the remainder to
> > free_unref_page_list().  This should be a speed improvement, but I have
> > no measurements to support that.  Current callers do not care about
> > performance, but I hope to add some which do.
> 
> Don't you think it would actually be slower to take an additional pass
> across the list?  If the list is long enough to cause cache thrashing. 
> Maybe it's faster for small lists.

My first response is an appeal to authority -- release_pages() does
this same thing.  Only it takes an array, constructs a list and passes
that to free_unref_page_list().  So if that's slower (and lists _are_
slower than arrays), we should have a free_unref_page_array().
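
As an aside on why lists are slower than arrays, a self-contained
userspace sketch (illustrative only, not kernel code): the list walk is
a chain of dependent loads, the array walk a prefetchable stream.

#include <stddef.h>

struct node {
	struct node *next;
	long payload;
};

/* Each ->next must arrive before the following load can be issued,
 * so the walk runs at memory latency per element. */
static long sum_list(const struct node *n)
{
	long sum = 0;

	for (; n; n = n->next)
		sum += n->payload;
	return sum;
}

/* Independent loads at predictable addresses let the hardware
 * prefetcher stream them, so the walk runs closer to memory
 * bandwidth. */
static long sum_array(const long *payload, size_t nr)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		sum += payload[i];
	return sum;
}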

Second, we can follow through the code paths and reason about it.

Before:

while (!list_empty(pages)) {
	put_page(victim);
		page = compound_head(page);
		if (put_page_testzero(page))
			__put_page(page);
				__put_single_page(page)
					__page_cache_release(page);
					mem_cgroup_uncharge(page);
					<---
free_unref_page(page, 0);
	free_unref_page_prepare()
        local_lock_irqsave(&pagesets.lock, flags);
        free_unref_page_commit(page, pfn, migratetype, order);
        local_unlock_irqrestore(&pagesets.lock, flags);

After:

free_unref_page_list(pages);
        list_for_each_entry_safe(page, next, list, lru) {
                if (!free_unref_page_prepare(page, pfn, 0)) {
        }

        local_lock_irqsave(&pagesets.lock, flags);
        list_for_each_entry_safe(page, next, list, lru) {
		free_unref_page_commit()
	}
        local_unlock_irqrestore(&pagesets.lock, flags);

So the major win here is that we disable/enable interrupts once per
batch rather than once per page.
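
Schematically (illustrative, reusing the names from the traces above
and assuming the obvious locals), the contrast is:

	/* per-page: one irqsave/irqrestore pair for every page */
	list_for_each_entry_safe(page, next, pages, lru) {
		local_lock_irqsave(&pagesets.lock, flags);
		free_unref_page_commit(page, page_to_pfn(page), migratetype, 0);
		local_unlock_irqrestore(&pagesets.lock, flags);
	}

	/* batched: one pair around the whole list */
	local_lock_irqsave(&pagesets.lock, flags);
	list_for_each_entry_safe(page, next, pages, lru)
		free_unref_page_commit(page, page_to_pfn(page), migratetype, 0);
	local_unlock_irqrestore(&pagesets.lock, flags);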
Andrew Morton Oct. 7, 2021, 11:35 p.m. UTC | #3
On Thu, 7 Oct 2021 21:55:21 +0100 Matthew Wilcox <willy@infradead.org> wrote:

> On Thu, Oct 07, 2021 at 12:31:09PM -0700, Andrew Morton wrote:
> > On Thu,  7 Oct 2021 20:21:37 +0100 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:
> > 
> > > Instead of calling put_page() one page at a time, pop pages off
> > > the list if their refcount was too high and pass the remainder to
> > > free_unref_page_list().  This should be a speed improvement, but I have
> > > no measurements to support that.  Current callers do not care about
> > > performance, but I hope to add some which do.
> > 
> > Don't you think it would actually be slower to take an additional pass
> > across the list?  If the list is long enough to cause cache thrashing. 
> > Maybe it's faster for small lists.
> 
> My first response is an appeal to authority -- release_pages() does
> this same thing.  Only it takes an array, constructs a list and passes
> that to free_unref_page_list().  So if that's slower (and lists _are_
> slower than arrays), we should have a free_unref_page_array().

And free_unref_page_list() does two passes across the list!

<quietly sobs>

Here is my beautiful release_pages(), as distributed in linux-2.5.33:

void release_pages(struct page **pages, int nr)
{
	int i;
	struct pagevec pages_to_free;
	struct zone *zone = NULL;

	pagevec_init(&pages_to_free);
	for (i = 0; i < nr; i++) {
		struct page *page = pages[i];
		struct zone *pagezone;

		if (PageReserved(page) || !put_page_testzero(page))
			continue;

		pagezone = page_zone(page);
		if (pagezone != zone) {
			if (zone)
				spin_unlock_irq(&zone->lru_lock);
			zone = pagezone;
			spin_lock_irq(&zone->lru_lock);
		}
		if (TestClearPageLRU(page))
			del_page_from_lru(zone, page);
		if (page_count(page) == 0) {
			if (!pagevec_add(&pages_to_free, page)) {
				spin_unlock_irq(&zone->lru_lock);
				pagevec_free(&pages_to_free);
				pagevec_init(&pages_to_free);
				spin_lock_irq(&zone->lru_lock);
			}
		}
	}
	if (zone)
		spin_unlock_irq(&zone->lru_lock);

	pagevec_free(&pages_to_free);
}

I guess the current version is some commentary on the aging process?


> Second, we can follow through the code paths and reason about it.
> 
> Before:
> 
> while (!list_empty(pages)) {
> 	put_page(victim);
> 		page = compound_head(page);
> 		if (put_page_testzero(page))
> 			__put_page(page);
> 				__put_single_page(page)
> 					__page_cache_release(page);
> 					mem_cgroup_uncharge(page);
> 					<---
> free_unref_page(page, 0);
> 	free_unref_page_prepare()
>         local_lock_irqsave(&pagesets.lock, flags);
>         free_unref_page_commit(page, pfn, migratetype, order);
>         local_unlock_irqrestore(&pagesets.lock, flags);
> 
> After:
> 
> free_unref_page_list(pages);
>         list_for_each_entry_safe(page, next, list, lru) {
>                 if (!free_unref_page_prepare(page, pfn, 0)) {
>         }
> 
>         local_lock_irqsave(&pagesets.lock, flags);
>         list_for_each_entry_safe(page, next, list, lru) {
> 		free_unref_page_commit()
> 	}
>         local_unlock_irqrestore(&pagesets.lock, flags);
> 
> So the major win here is that we disable/enable interrupts once per
> batch rather than once per page.

Perhaps that's faster if the list is fully cached.

Any feelings for how often release_pages() will be passed a huge enough
list for this to occur?
Matthew Wilcox Oct. 8, 2021, 2:17 p.m. UTC | #4
On Thu, Oct 07, 2021 at 04:35:54PM -0700, Andrew Morton wrote:
> On Thu, 7 Oct 2021 21:55:21 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> > My first response is an appeal to authority -- release_pages() does
> > this same thing.  Only it takes an array, constructs a list and passes
> > that to free_unref_page_list().  So if that's slower (and lists _are_
> > slower than arrays), we should have a free_unref_page_array().
> 
> And free_unref_page_list() does two passes across the list!
> 
> <quietly sobs>
> 
> Here is my beautiful release_pages(), as distributed in linux-2.5.33:

I think that looks much better!

> I guess the current version is some commentary on the aging process?

I blame the guy who sent cc59850ef940 to Linus back in 2011 ...

I think Anthony was going to look into this and perhaps revert us to
a pagevec_free() interface.
Anthony Yznaga Oct. 20, 2021, 10:07 p.m. UTC | #5
On 10/8/21 7:17 AM, Matthew Wilcox wrote:
> On Thu, Oct 07, 2021 at 04:35:54PM -0700, Andrew Morton wrote:
>> On Thu, 7 Oct 2021 21:55:21 +0100 Matthew Wilcox <willy@infradead.org> wrote:
>>> My first response is an appeal to authority -- release_pages() does
>>> this same thing.  Only it takes an array, constructs a list and passes
>>> that to free_unref_page_list().  So if that's slower (and lists _are_
>>> slower than arrays), we should have a free_unref_page_array().
>> And free_unref_page_list() does two passes across the list!
>>
>> <quietly sobs>
>>
>> Here is my beautiful release_pages(), as distributed in linux-2.5.33:
> I think that looks much better!
>
>> I guess the current version is some commentary on the aging process?
> I blame the guy who sent cc59850ef940 to Linus back in 2011 ...
>
> I think Anthony was going to look into this and perhaps revert us to
> a pagevec_free() interface.

Sorry for not responding sooner.  I have some cycles so I'll take a look
at this now.

Anthony
Joao Martins Oct. 21, 2021, 2:41 p.m. UTC | #6
On 10/8/21 15:17, Matthew Wilcox wrote:
> On Thu, Oct 07, 2021 at 04:35:54PM -0700, Andrew Morton wrote:
>> On Thu, 7 Oct 2021 21:55:21 +0100 Matthew Wilcox <willy@infradead.org> wrote:
>>> My first response is an appeal to authority -- release_pages() does
>>> this same thing.  Only it takes an array, constructs a list and passes
>>> that to free_unref_page_list().  So if that's slower (and lists _are_
>>> slower than arrays), we should have a free_unref_page_array().
>>
>> And free_unref_page_list() does two passes across the list!
>>
>> <quietly sobs>
>>
>> Here is my beautiful release_pages(), as distributed in linux-2.5.33:
> 
> I think that looks much better!
> 

For what it's worth, something Jason suggested during the unpin_user_pages*
improvements series was to apply the same trick to how release_pages()
iterates its page array.  The purpose would be to batch the page refcount
update when you pass N contiguous struct pages.

	Joao
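
A rough sketch of that idea (hypothetical function, not the actual code
from the unpin_user_pages* series): find runs of array entries that
share a compound head, then drop the whole run's references with a
single atomic update.

static void release_pages_coalesced(struct page **pages, int nr)
{
	int i = 0;

	while (i < nr) {
		struct page *head = compound_head(pages[i]);
		int refs = 1;

		/* extend the run while entries share the same head */
		while (i + refs < nr &&
		       compound_head(pages[i + refs]) == head)
			refs++;

		/* one atomic sub instead of @refs put_page() calls */
		if (page_ref_sub_and_test(head, refs))
			__put_page(head);

		i += refs;
	}
}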
Anthony Yznaga Oct. 22, 2021, 11:26 p.m. UTC | #7
On 10/7/21 12:21 PM, Matthew Wilcox (Oracle) wrote:
> Instead of calling put_page() one page at a time, pop pages off
> the list if their refcount was too high and pass the remainder to
> free_unref_page_list().  This should be a speed improvement, but I have
> no measurements to support that.  Current callers do not care about
> performance, but I hope to add some which do.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> v2:
>   - Handle compound pages (Mel)
>   - Comment why we don't need to handle PageLRU
>   - Added call to __ClearPageWaiters(), matching that in release_pages()
>
>   mm/swap.c | 23 ++++++++++++++++-------
>   1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index af3cad4e5378..9f334d503fd2 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -134,18 +134,27 @@ EXPORT_SYMBOL(__put_page);
>    * put_pages_list() - release a list of pages
>    * @pages: list of pages threaded on page->lru
>    *
> - * Release a list of pages which are strung together on page.lru.  Currently
> - * used by read_cache_pages() and related error recovery code.
> + * Release a list of pages which are strung together on page.lru.
>    */
>   void put_pages_list(struct list_head *pages)
>   {
> -	while (!list_empty(pages)) {
> -		struct page *victim;
> +	struct page *page, *next;
>   
> -		victim = lru_to_page(pages);
> -		list_del(&victim->lru);
> -		put_page(victim);
> +	list_for_each_entry_safe(page, next, pages, lru) {
> +		if (!put_page_testzero(page)) {
> +			list_del(&page->lru);
> +			continue;
> +		}

I know that compound pages are not currently passed to put_pages_list(),
but I assume the put_page_testzero() should only be done on the head
page similar to release_pages()?

Anthony

> +		if (PageHead(page)) {
> +			list_del(&page->lru);
> +			__put_compound_page(page);
> +			continue;
> +		}
> +		/* Cannot be PageLRU because it's passed to us using the lru */
> +		__ClearPageWaiters(page);
>   	}
> +
> +	free_unref_page_list(pages);
>   }
>   EXPORT_SYMBOL(put_pages_list);
>
Matthew Wilcox Oct. 23, 2021, 1:13 a.m. UTC | #8
On Fri, Oct 22, 2021 at 04:26:59PM -0700, Anthony Yznaga wrote:
> 
> On 10/7/21 12:21 PM, Matthew Wilcox (Oracle) wrote:
> > Instead of calling put_page() one page at a time, pop pages off
> > the list if their refcount was too high and pass the remainder to
> > free_unref_page_list().  This should be a speed improvement, but I have
> > no measurements to support that.  Current callers do not care about
> > performance, but I hope to add some which do.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> > v2:
> >   - Handle compound pages (Mel)
> >   - Comment why we don't need to handle PageLRU
> >   - Added call to __ClearPageWaiters(), matching that in release_pages()
> > 
> >   mm/swap.c | 23 ++++++++++++++++-------
> >   1 file changed, 16 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/swap.c b/mm/swap.c
> > index af3cad4e5378..9f334d503fd2 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -134,18 +134,27 @@ EXPORT_SYMBOL(__put_page);
> >    * put_pages_list() - release a list of pages
> >    * @pages: list of pages threaded on page->lru
> >    *
> > - * Release a list of pages which are strung together on page.lru.  Currently
> > - * used by read_cache_pages() and related error recovery code.
> > + * Release a list of pages which are strung together on page.lru.
> >    */
> >   void put_pages_list(struct list_head *pages)
> >   {
> > -	while (!list_empty(pages)) {
> > -		struct page *victim;
> > +	struct page *page, *next;
> > -		victim = lru_to_page(pages);
> > -		list_del(&victim->lru);
> > -		put_page(victim);
> > +	list_for_each_entry_safe(page, next, pages, lru) {
> > +		if (!put_page_testzero(page)) {
> > +			list_del(&page->lru);
> > +			continue;
> > +		}
> 
> 
> I know that compound pages are not currently passed to put_pages_list(),
> but I assume the put_page_testzero() should only be done on the head
> page similar to release_pages()?

Fun fact about pages: You can't put a tail page on an LRU list.  Why?

struct page {
...
        union {
                struct {        /* Page cache and anonymous pages */
                        struct list_head lru;
...
                struct {        /* Tail pages of compound page */
                        unsigned long compound_head;    /* Bit zero is set */

so if you try to put a tail page on the LRU list, it becomes no longer
a tail page.
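
Concretely (illustrative fragment, assuming a tail page "tail" and a
list head "list"): lru.next occupies the same word as compound_head,
and PageTail() keys off bit zero of that word.

	list_add(&tail->lru, &list);
	/*
	 * lru.next now holds a pointer to a list_head.  Pointers are
	 * aligned, so bit zero of the word that doubles as
	 * compound_head is clear, and PageTail(tail) is now false.
	 */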
Anthony Yznaga Oct. 23, 2021, 6:11 a.m. UTC | #9
On 10/22/21 6:13 PM, Matthew Wilcox wrote:
> On Fri, Oct 22, 2021 at 04:26:59PM -0700, Anthony Yznaga wrote:
>> On 10/7/21 12:21 PM, Matthew Wilcox (Oracle) wrote:
>>> Instead of calling put_page() one page at a time, pop pages off
>>> the list if their refcount was too high and pass the remainder to
>>> free_unref_page_list().  This should be a speed improvement, but I have
>>> no measurements to support that.  Current callers do not care about
>>> performance, but I hope to add some which do.
>>>
>>> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>>> ---
>>> v2:
>>>    - Handle compound pages (Mel)
>>>    - Comment why we don't need to handle PageLRU
>>>    - Added call to __ClearPageWaiters(), matching that in release_pages()
>>>
>>>    mm/swap.c | 23 ++++++++++++++++-------
>>>    1 file changed, 16 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/mm/swap.c b/mm/swap.c
>>> index af3cad4e5378..9f334d503fd2 100644
>>> --- a/mm/swap.c
>>> +++ b/mm/swap.c
>>> @@ -134,18 +134,27 @@ EXPORT_SYMBOL(__put_page);
>>>     * put_pages_list() - release a list of pages
>>>     * @pages: list of pages threaded on page->lru
>>>     *
>>> - * Release a list of pages which are strung together on page.lru.  Currently
>>> - * used by read_cache_pages() and related error recovery code.
>>> + * Release a list of pages which are strung together on page.lru.
>>>     */
>>>    void put_pages_list(struct list_head *pages)
>>>    {
>>> -	while (!list_empty(pages)) {
>>> -		struct page *victim;
>>> +	struct page *page, *next;
>>> -		victim = lru_to_page(pages);
>>> -		list_del(&victim->lru);
>>> -		put_page(victim);
>>> +	list_for_each_entry_safe(page, next, pages, lru) {
>>> +		if (!put_page_testzero(page)) {
>>> +			list_del(&page->lru);
>>> +			continue;
>>> +		}
>>
>> I know that compound pages are not currently passed to put_pages_list(),
>> but I assume the put_page_testzero() should only be done on the head
>> page similar to release_pages()?
> Fun fact about pages: You can't put a tail page on an LRU list.  Why?
>
> struct page {
> ...
>          union {
>                  struct {        /* Page cache and anonymous pages */
>                          struct list_head lru;
> ...
>                  struct {        /* Tail pages of compound page */
>                          unsigned long compound_head;    /* Bit zero is set */
>
> so if you try to put a tail page on the LRU list, it becomes no longer
> a tail page.

Ah, makes sense.  Thanks!

In that case here's my r-b:

Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>

Anthony

Patch

diff --git a/mm/swap.c b/mm/swap.c
index af3cad4e5378..9f334d503fd2 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -134,18 +134,27 @@  EXPORT_SYMBOL(__put_page);
  * put_pages_list() - release a list of pages
  * @pages: list of pages threaded on page->lru
  *
- * Release a list of pages which are strung together on page.lru.  Currently
- * used by read_cache_pages() and related error recovery code.
+ * Release a list of pages which are strung together on page.lru.
  */
 void put_pages_list(struct list_head *pages)
 {
-	while (!list_empty(pages)) {
-		struct page *victim;
+	struct page *page, *next;
 
-		victim = lru_to_page(pages);
-		list_del(&victim->lru);
-		put_page(victim);
+	list_for_each_entry_safe(page, next, pages, lru) {
+		if (!put_page_testzero(page)) {
+			list_del(&page->lru);
+			continue;
+		}
+		if (PageHead(page)) {
+			list_del(&page->lru);
+			__put_compound_page(page);
+			continue;
+		}
+		/* Cannot be PageLRU because it's passed to us using the lru */
+		__ClearPageWaiters(page);
 	}
+
+	free_unref_page_list(pages);
 }
 EXPORT_SYMBOL(put_pages_list);