diff mbox series

[3/4] mm/thp: allow drop THP from page cache

Message ID 20191016073731.4076725-4-songliubraving@fb.com (mailing list archive)
State New, archived
Headers show
Series Fixes for THP in page cache | expand

Commit Message

Song Liu Oct. 16, 2019, 7:37 a.m. UTC
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Once a THP is added to the page cache, it cannot be dropped via
/proc/sys/vm/drop_caches. Fix this issue with proper handling in
invalidate_mapping_pages() and __remove_mapping().

Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 mm/truncate.c | 12 ++++++++++++
 mm/vmscan.c   |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)

Comments

Matthew Wilcox Oct. 17, 2019, 4:12 p.m. UTC | #1
On Wed, Oct 16, 2019 at 12:37:30AM -0700, Song Liu wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Once a THP is added to the page cache, it cannot be dropped via
> /proc/sys/vm/drop_caches. Fix this issue with proper handling in
> invalidate_mapping_pages() and __remove_mapping().
> 
> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Tested-by: Song Liu <songliubraving@fb.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  mm/truncate.c | 12 ++++++++++++
>  mm/vmscan.c   |  3 ++-
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c6659bb758a4..1d80a188ad4a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -932,7 +932,8 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
>  	 * Note that if SetPageDirty is always performed via set_page_dirty,
>  	 * and thus under the i_pages lock, then this ordering is not required.
>  	 */
> -	if (unlikely(PageTransHuge(page)) && PageSwapCache(page))
> +	if (unlikely(PageTransHuge(page)) &&
> +			(PageSwapCache(page) || !PageSwapBacked(page)))
>  		refcount = 1 + HPAGE_PMD_NR;
>  	else
>  		refcount = 2;

Kirill suggests that this patch would be better (for this part of the patch;
the part in truncate.c should remain as it is)

commit ddcee327f96d57cb9d5310486d21e43892b7a368
Author: William Kucharski <william.kucharski@oracle.com>
Date:   Fri Sep 20 16:14:51 2019 -0400

    mm: Support removing arbitrary sized pages from mapping
    
    __remove_mapping() assumes that pages can only be either base pages
    or HPAGE_PMD_SIZE.  Further, it assumes that large pages are
    swap-backed.  Support all kinds of pages by unconditionally asking how
    many pages this page references.
    
    Signed-off-by: William Kucharski <william.kucharski@oracle.com>
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c6659bb758a4..f870da1f4bb7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -932,10 +932,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 	 * Note that if SetPageDirty is always performed via set_page_dirty,
 	 * and thus under the i_pages lock, then this ordering is not required.
 	 */
-	if (unlikely(PageTransHuge(page)) && PageSwapCache(page))
-		refcount = 1 + HPAGE_PMD_NR;
-	else
-		refcount = 2;
+	refcount = 1 + compound_nr(page);
 	if (!page_ref_freeze(page, refcount))
 		goto cannot_free;
 	/* note: atomic_cmpxchg in page_ref_freeze provides the smp_rmb */
Song Liu Oct. 17, 2019, 4:36 p.m. UTC | #2
> On Oct 17, 2019, at 9:12 AM, Matthew Wilcox <willy@infradead.org> wrote:
> 
> On Wed, Oct 16, 2019 at 12:37:30AM -0700, Song Liu wrote:
>> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>> 
>> Once a THP is added to the page cache, it cannot be dropped via
>> /proc/sys/vm/drop_caches. Fix this issue with proper handling in
>> invalidate_mapping_pages() and __remove_mapping().
>> 
>> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Tested-by: Song Liu <songliubraving@fb.com>
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> mm/truncate.c | 12 ++++++++++++
>> mm/vmscan.c   |  3 ++-
>> 2 files changed, 14 insertions(+), 1 deletion(-)
>> 
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index c6659bb758a4..1d80a188ad4a 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -932,7 +932,8 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
>> 	 * Note that if SetPageDirty is always performed via set_page_dirty,
>> 	 * and thus under the i_pages lock, then this ordering is not required.
>> 	 */
>> -	if (unlikely(PageTransHuge(page)) && PageSwapCache(page))
>> +	if (unlikely(PageTransHuge(page)) &&
>> +			(PageSwapCache(page) || !PageSwapBacked(page)))
>> 		refcount = 1 + HPAGE_PMD_NR;
>> 	else
>> 		refcount = 2;
> 
> Kirill suggests that this patch would be better (for this part of the patch;
> the part in truncate.c should remain as it is)
> 
> commit ddcee327f96d57cb9d5310486d21e43892b7a368
> Author: William Kucharski <william.kucharski@oracle.com>
> Date:   Fri Sep 20 16:14:51 2019 -0400
> 
>    mm: Support removing arbitrary sized pages from mapping
> 
>    __remove_mapping() assumes that pages can only be either base pages
>    or HPAGE_PMD_SIZE.  Further, it assumes that large pages are
>    swap-backed.  Support all kinds of pages by unconditionally asking how
>    many pages this page references.
> 
>    Signed-off-by: William Kucharski <william.kucharski@oracle.com>
>    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c6659bb758a4..f870da1f4bb7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -932,10 +932,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
> 	 * Note that if SetPageDirty is always performed via set_page_dirty,
> 	 * and thus under the i_pages lock, then this ordering is not required.
> 	 */
> -	if (unlikely(PageTransHuge(page)) && PageSwapCache(page))
> -		refcount = 1 + HPAGE_PMD_NR;
> -	else
> -		refcount = 2;
> +	refcount = 1 + compound_nr(page);
> 	if (!page_ref_freeze(page, refcount))
> 		goto cannot_free;
> 	/* note: atomic_cmpxchg in page_ref_freeze provides the smp_rmb */

This does look cleaner, and works fine in my tests.

Let me include it in v2 set. 

Thanks,
Song
diff mbox series

Patch

diff --git a/mm/truncate.c b/mm/truncate.c
index 8563339041f6..dd9ebc1da356 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -592,6 +592,16 @@  unsigned long invalidate_mapping_pages(struct address_space *mapping,
 					unlock_page(page);
 					continue;
 				}
+
+				/* Take a pin outside pagevec */
+				get_page(page);
+
+				/*
+				 * Drop extra pins before trying to invalidate
+				 * the huge page.
+				 */
+				pagevec_remove_exceptionals(&pvec);
+				pagevec_release(&pvec);
 			}
 
 			ret = invalidate_inode_page(page);
@@ -602,6 +612,8 @@  unsigned long invalidate_mapping_pages(struct address_space *mapping,
 			 */
 			if (!ret)
 				deactivate_file_page(page);
+			if (PageTransHuge(page))
+				put_page(page);
 			count += ret;
 		}
 		pagevec_remove_exceptionals(&pvec);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c6659bb758a4..1d80a188ad4a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -932,7 +932,8 @@  static int __remove_mapping(struct address_space *mapping, struct page *page,
 	 * Note that if SetPageDirty is always performed via set_page_dirty,
 	 * and thus under the i_pages lock, then this ordering is not required.
 	 */
-	if (unlikely(PageTransHuge(page)) && PageSwapCache(page))
+	if (unlikely(PageTransHuge(page)) &&
+			(PageSwapCache(page) || !PageSwapBacked(page)))
 		refcount = 1 + HPAGE_PMD_NR;
 	else
 		refcount = 2;