
mm, swap: Fix swapoff with KSM pages

Message ID 20181226051522.28442-1-ying.huang@intel.com (mailing list archive)
State New, archived
Series mm, swap: Fix swapoff with KSM pages

Commit Message

Huang, Ying Dec. 26, 2018, 5:15 a.m. UTC
KSM pages may be mapped to multiple VMAs that cannot all be reached
from one anon_vma.  So during swapin, a new copy of the page needs to
be generated if a different anon_vma is needed; please refer to the
comments of ksm_might_need_to_copy() for details.
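
For reference, the swapin side handles this roughly as below.  This is
a simplified sketch of the do_swap_page() logic with error handling
abbreviated, not the verbatim upstream code; only
ksm_might_need_to_copy() and its signature are taken from the kernel
source:

	/*
	 * If the KSM page read from swap cannot be mapped through the
	 * faulting VMA's anon_vma, ksm_might_need_to_copy() hands back
	 * a fresh anonymous copy to map instead.
	 */
	page = ksm_might_need_to_copy(page, vma, address);
	if (unlikely(!page))
		return VM_FAULT_OOM;	/* copy allocation failed */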

During swapoff, unuse_vma() uses the anon_vma (if available) to locate
the VMA and virtual address mapped to the page, so not all mappings of
a swapped-out KSM page can be found.  Therefore in try_to_unuse(),
even if the swap count of a swap entry isn't zero, the page needs to
be deleted from the swap cache, so that in the next round a new page
can be allocated and swapped in for the other mappings of the
swapped-out KSM page.
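
Before THP swap support, try_to_unuse() did exactly that.  A
simplified sketch of the pre-e07098294adf logic, reconstructed from
the condition the patch below modifies:

	/*
	 * Delete from the swap cache even if a swap count remains, so
	 * that the next round can allocate and swap in a new copy for
	 * the KSM mappings that could not be reached from this
	 * anon_vma.
	 */
	if (PageSwapCache(page) &&
	    likely(page_private(page) == entry.val))
		delete_from_swap_cache(page);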

But this conflicts with THP swap support, where a THP can be deleted
from the swap cache only after the swap count of every swap entry in
the huge swap cluster backing the THP has reached 0.  So
try_to_unuse() was changed in commit e07098294adf ("mm, THP, swap:
support to reclaim swap space for THP swapped out") to check that
before deleting a page from the swap cache, but this broke KSM swapoff
too.
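
The breakage comes from the page_swapped() check that commit added to
the condition: it treats any remaining swap count as "still swapped",
so a KSM page whose other mappings cannot be found is never deleted
from the swap cache and try_to_unuse() cannot make progress.  Roughly
(a simplified sketch, not the verbatim source):

	static bool page_swapped(struct page *page)
	{
		swp_entry_t entry;
		struct swap_info_struct *si;

		if (!IS_ENABLED(CONFIG_THP_SWAP) ||
		    likely(!PageTransCompound(page)))
			/* normal page: any swap count blocks deletion */
			return page_swapcount(page) != 0;

		page = compound_head(page);
		entry.val = page_private(page);
		si = _swap_info_get(entry);
		if (si)	/* THP: swapped if any entry in the cluster is used */
			return swap_page_trans_huge_swapped(si, entry);
		return false;
	}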

Fortunately, KSM applies to normal pages only, so the original
behavior for KSM pages can be restored easily by checking
PageTransCompound().  That is how this patch works.

Fixes: e07098294adf ("mm, THP, swap: support to reclaim swap space for THP swapped out")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reported-and-Tested-and-Acked-by: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
---
 mm/swapfile.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Huang, Ying Dec. 26, 2018, 5:37 a.m. UTC | #1
Hi, Andrew,

This patch is based on Linus' tree instead of the head of the mmotm
tree because it fixes a bug there.

The bug was introduced by commit e07098294adf ("mm, THP, swap: support
to reclaim swap space for THP swapped out"), which was merged in
v4.14-rc1.  So I think we should backport the fix to stable kernels
from 4.14 on.  But Hugh thinks it may be rare for KSM pages to be in
the swap device at swapoff time, which is why nobody has reported the
bug so far.

Best Regards,
Huang, Ying

Huang Ying <ying.huang@intel.com> writes:

> KSM pages may be mapped to multiple VMAs that cannot all be reached
> from one anon_vma.  So during swapin, a new copy of the page needs to
> be generated if a different anon_vma is needed; please refer to the
> comments of ksm_might_need_to_copy() for details.
>
> During swapoff, unuse_vma() uses the anon_vma (if available) to locate
> the VMA and virtual address mapped to the page, so not all mappings of
> a swapped-out KSM page can be found.  Therefore in try_to_unuse(),
> even if the swap count of a swap entry isn't zero, the page needs to
> be deleted from the swap cache, so that in the next round a new page
> can be allocated and swapped in for the other mappings of the
> swapped-out KSM page.
>
> But this conflicts with THP swap support, where a THP can be deleted
> from the swap cache only after the swap count of every swap entry in
> the huge swap cluster backing the THP has reached 0.  So
> try_to_unuse() was changed in commit e07098294adf ("mm, THP, swap:
> support to reclaim swap space for THP swapped out") to check that
> before deleting a page from the swap cache, but this broke KSM swapoff
> too.
>
> Fortunately, KSM applies to normal pages only, so the original
> behavior for KSM pages can be restored easily by checking
> PageTransCompound().  That is how this patch works.
>
> Fixes: e07098294adf ("mm, THP, swap: support to reclaim swap space for THP swapped out")
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reported-and-Tested-and-Acked-by: Hugh Dickins <hughd@google.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Shaohua Li <shli@kernel.org>
> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
> ---
>  mm/swapfile.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 8688ae65ef58..20d3c0f47a5f 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -2197,7 +2197,8 @@ int try_to_unuse(unsigned int type, bool frontswap,
>  		 */
>  		if (PageSwapCache(page) &&
>  		    likely(page_private(page) == entry.val) &&
> -		    !page_swapped(page))
> +		    (!PageTransCompound(page) ||
> +		     !swap_page_trans_huge_swapped(si, entry)))
>  			delete_from_swap_cache(compound_head(page));
>  
>  		/*
Andrew Morton Dec. 28, 2018, 2:55 a.m. UTC | #2
On Wed, 26 Dec 2018 13:15:22 +0800 Huang Ying <ying.huang@intel.com> wrote:

> KSM pages may be mapped to multiple VMAs that cannot all be reached
> from one anon_vma.  So during swapin, a new copy of the page needs to
> be generated if a different anon_vma is needed; please refer to the
> comments of ksm_might_need_to_copy() for details.
>
> During swapoff, unuse_vma() uses the anon_vma (if available) to locate
> the VMA and virtual address mapped to the page, so not all mappings of
> a swapped-out KSM page can be found.  Therefore in try_to_unuse(),
> even if the swap count of a swap entry isn't zero, the page needs to
> be deleted from the swap cache, so that in the next round a new page
> can be allocated and swapped in for the other mappings of the
> swapped-out KSM page.
>
> But this conflicts with THP swap support, where a THP can be deleted
> from the swap cache only after the swap count of every swap entry in
> the huge swap cluster backing the THP has reached 0.  So
> try_to_unuse() was changed in commit e07098294adf ("mm, THP, swap:
> support to reclaim swap space for THP swapped out") to check that
> before deleting a page from the swap cache, but this broke KSM swapoff
> too.
>
> Fortunately, KSM applies to normal pages only, so the original
> behavior for KSM pages can be restored easily by checking
> PageTransCompound().  That is how this patch works.
> 
> ...
>
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -2197,7 +2197,8 @@ int try_to_unuse(unsigned int type, bool frontswap,
>  		 */
>  		if (PageSwapCache(page) &&
>  		    likely(page_private(page) == entry.val) &&
> -		    !page_swapped(page))
> +		    (!PageTransCompound(page) ||
> +		     !swap_page_trans_huge_swapped(si, entry)))
>  			delete_from_swap_cache(compound_head(page));
>  

The patch "mm, swap: rid swapoff of quadratic complexity" changes this
code significantly.  There are a few issues with that patch so I'll
drop it for now.

Vineeth, please ensure that future versions retain the above fix,
thanks.
Vineeth Remanan Pillai Dec. 28, 2018, 4:56 p.m. UTC | #3
Thanks for letting me know, Andrew! I shall include all the fixes in
the next iteration.

Thanks,
Vineeth

On Thu, Dec 27, 2018 at 9:55 PM Andrew Morton <akpm@linux-foundation.org>
wrote:

> On Wed, 26 Dec 2018 13:15:22 +0800 Huang Ying <ying.huang@intel.com>
> wrote:
>
> > KSM pages may be mapped to multiple VMAs that cannot all be reached
> > from one anon_vma.  So during swapin, a new copy of the page needs to
> > be generated if a different anon_vma is needed; please refer to the
> > comments of ksm_might_need_to_copy() for details.
> >
> > During swapoff, unuse_vma() uses the anon_vma (if available) to locate
> > the VMA and virtual address mapped to the page, so not all mappings of
> > a swapped-out KSM page can be found.  Therefore in try_to_unuse(),
> > even if the swap count of a swap entry isn't zero, the page needs to
> > be deleted from the swap cache, so that in the next round a new page
> > can be allocated and swapped in for the other mappings of the
> > swapped-out KSM page.
> >
> > But this conflicts with THP swap support, where a THP can be deleted
> > from the swap cache only after the swap count of every swap entry in
> > the huge swap cluster backing the THP has reached 0.  So
> > try_to_unuse() was changed in commit e07098294adf ("mm, THP, swap:
> > support to reclaim swap space for THP swapped out") to check that
> > before deleting a page from the swap cache, but this broke KSM swapoff
> > too.
> >
> > Fortunately, KSM applies to normal pages only, so the original
> > behavior for KSM pages can be restored easily by checking
> > PageTransCompound().  That is how this patch works.
> >
> > ...
> >
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -2197,7 +2197,8 @@ int try_to_unuse(unsigned int type, bool frontswap,
> >                */
> >               if (PageSwapCache(page) &&
> >                   likely(page_private(page) == entry.val) &&
> > -                 !page_swapped(page))
> > +                 (!PageTransCompound(page) ||
> > +                  !swap_page_trans_huge_swapped(si, entry)))
> >                       delete_from_swap_cache(compound_head(page));
> >
>
> The patch "mm, swap: rid swapoff of quadratic complexity" changes this
> code significantly.  There are a few issues with that patch so I'll
> drop it for now.
>
> Vineeth, please ensure that future versions retain the above fix,
> thanks.

Patch

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 8688ae65ef58..20d3c0f47a5f 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2197,7 +2197,8 @@ int try_to_unuse(unsigned int type, bool frontswap,
 		 */
 		if (PageSwapCache(page) &&
 		    likely(page_private(page) == entry.val) &&
-		    !page_swapped(page))
+		    (!PageTransCompound(page) ||
+		     !swap_page_trans_huge_swapped(si, entry)))
 			delete_from_swap_cache(compound_head(page));
 
 		/*