
[v1] hugetlb, userfaultfd: Fix reservation restore on userfaultfd error

Message ID 20211116235733.3774702-1-almasrymina@google.com (mailing list archive)
State New

Commit Message

Mina Almasry Nov. 16, 2021, 11:57 p.m. UTC
Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we
bail out using "goto out_release_unlock;" in the cases where idx >=
size, or !huge_pte_none(), the code will detect that new_pagecache_page
== false, and so call restore_reserve_on_error().
In this case I see restore_reserve_on_error() delete the reservation,
and the following call to remove_inode_hugepages() will increment
h->resv_hugepages causing a 100% reproducible leak.

We should treat the is_continue case similar to adding a page into the
pagecache and set new_pagecache_page to true, to indicate that there is
no reservation to restore on the error path, and we need not call
restore_reserve_on_error().

Cc: Wei Xu <weixugc@google.com>

Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error")
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reported-by: James Houghton <jthoughton@google.com>
---
 mm/hugetlb.c | 8 ++++++++
 1 file changed, 8 insertions(+)

--
2.34.0.rc1.387.gb447b232ab-goog
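The accounting described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct toy_hugetlb` and every `toy_*` helper are hypothetical names, the reserve map is reduced to a boolean per index, and file removal is assumed to release "reserve map entries minus freed pages" reservations, which is a simplified reading of the unreserve step behind remove_inode_hugepages().

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_IDX 4	/* huge pages in the modeled file (arbitrary) */

/* Toy stand-in for the hstate counter plus the inode's reserve map. */
struct toy_hugetlb {
	long resv_hugepages;		/* models h->resv_hugepages */
	bool resv_map[TOY_NR_IDX];	/* reserve map has an entry for idx */
	bool in_pagecache[TOY_NR_IDX];	/* a page sits in the page cache at idx */
};

/* Shared mmap(): one reserve map entry and one reserved page per index. */
static void toy_reserve(struct toy_hugetlb *t)
{
	for (int i = 0; i < TOY_NR_IDX; i++) {
		t->resv_map[i] = true;
		t->in_pagecache[i] = false;
	}
	t->resv_hugepages = TOY_NR_IDX;
}

/* First population of idx: the page consumes a reservation, the entry stays. */
static void toy_populate(struct toy_hugetlb *t, int idx)
{
	assert(t->resv_map[idx] && !t->in_pagecache[idx]);
	t->in_pagecache[idx] = true;
	t->resv_hugepages--;
}

/* Error path of a modeled UFFDIO_CONTINUE on idx. */
static void toy_continue_error(struct toy_hugetlb *t, int idx,
			       bool new_pagecache_page)
{
	/* Models restore_reserve_on_error() deleting the reserve map entry. */
	if (!new_pagecache_page)
		t->resv_map[idx] = false;
}

/* File removal: release reservations for entries not backed by a freed page. */
static void toy_remove_inode(struct toy_hugetlb *t)
{
	long chg = 0, freed = 0;

	for (int i = 0; i < TOY_NR_IDX; i++) {
		if (t->resv_map[i])
			chg++;
		if (t->in_pagecache[i])
			freed++;
		t->resv_map[i] = false;
		t->in_pagecache[i] = false;
	}
	/* A deleted entry makes (chg - freed) too small, leaving a surplus. */
	t->resv_hugepages -= chg - freed;
}

/* Runs the whole sequence and returns the reservations left at the end. */
static long toy_scenario(bool new_pagecache_page)
{
	struct toy_hugetlb t;

	toy_reserve(&t);
	toy_populate(&t, 0);
	toy_continue_error(&t, 0, new_pagecache_page);
	toy_remove_inode(&t);
	return t.resv_hugepages;
}
```

Under these assumptions, toy_scenario(false) ends with one reservation still counted after the file is gone, while toy_scenario(true) returns to zero, which is the behaviour the patch aims for.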

Comments

Mina Almasry Nov. 17, 2021, midnight UTC | #1
Hi Mike,

On Tue, Nov 16, 2021 at 3:57 PM Mina Almasry <almasrymina@google.com> wrote:
>
> Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we
> bail out using "goto out_release_unlock;" in the cases where idx >=
> size, or !huge_pte_none(), the code will detect that new_pagecache_page
> == false, and so call restore_reserve_on_error().
> In this case I see restore_reserve_on_error() delete the reservation,
> and the following call to remove_inode_hugepages() will increment
> h->resv_hugepages causing a 100% reproducible leak.
>

Attached is the .c file with the 100% repro.

> We should treat the is_continue case similar to adding a page into the
> pagecache and set new_pagecache_page to true, to indicate that there is
> no reservation to restore on the error path, and we need not call
> restore_reserve_on_error().
>
> Cc: Wei Xu <weixugc@google.com>
>
> Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error")
> Signed-off-by: Mina Almasry <almasrymina@google.com>
> Reported-by: James Houghton <jthoughton@google.com>

Not sure if this is a Cc: stable issue. If it is, I can add in v2.

> ---
>  mm/hugetlb.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index e09159c957e3..25a7a3d84607 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5741,6 +5741,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>                 page = find_lock_page(mapping, idx);
>                 if (!page)
>                         goto out;
> +               /*
> +                * Set new_pagecache_page to true, as we've added a page to the
> +                * pagecache, but userfaultfd hasn't set up a mapping for this
> +                * page yet. If we bail out before setting up the mapping, we
> +                * want to indicate to restore_reserve_on_error() that we've
> +                * added the page to the page cache.
> +                */
> +               new_pagecache_page = true;
>         } else if (!*pagep) {
>                 /* If a page already exists, then it's UFFDIO_COPY for
>                  * a non-missing case. Return -EEXIST.
> --
> 2.34.0.rc1.387.gb447b232ab-goog

Andrew Morton Nov. 17, 2021, 12:32 a.m. UTC | #2
On Tue, 16 Nov 2021 15:57:32 -0800 Mina Almasry <almasrymina@google.com> wrote:

> Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we
> bail out using "goto out_release_unlock;" in the cases where idx >=
> size, or !huge_pte_none(), the code will detect that new_pagecache_page
> == false, and so call restore_reserve_on_error().
> In this case I see restore_reserve_on_error() delete the reservation,
> and the following call to remove_inode_hugepages() will increment
> h->resv_hugepages causing a 100% reproducible leak.
> 
> We should treat the is_continue case similar to adding a page into the
> pagecache and set new_pagecache_page to true, to indicate that there is
> no reservation to restore on the error path, and we need not call
> restore_reserve_on_error().
> 
> Cc: Wei Xu <weixugc@google.com>
> 
> Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error")

I added cc:stable to this.

Mike Kravetz Nov. 17, 2021, 12:58 a.m. UTC | #3
On 11/16/21 3:57 PM, Mina Almasry wrote:
> Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we
> bail out using "goto out_release_unlock;" in the cases where idx >=
> size, or !huge_pte_none(), the code will detect that new_pagecache_page
> == false, and so call restore_reserve_on_error().
> In this case I see restore_reserve_on_error() delete the reservation,
> and the following call to remove_inode_hugepages() will increment
> h->resv_hugepages causing a 100% reproducible leak.
> 
> We should treat the is_continue case similar to adding a page into the
> pagecache and set new_pagecache_page to true, to indicate that there is
> no reservation to restore on the error path, and we need not call
> restore_reserve_on_error().
> 
> Cc: Wei Xu <weixugc@google.com>
> 
> Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error")
> Signed-off-by: Mina Almasry <almasrymina@google.com>
> Reported-by: James Houghton <jthoughton@google.com>

Thanks Mina and James!

Technically, the issue was introduced by commit 846be08578ed.  See the
'Note on Fixes tag' in c7b1850dfb41.  It is true that commit c7b1850dfb41
should have taken the 'is_continue' case into account when deciding whether
or not to call restore_reserve_on_error.  However, this issue first
showed up with 846be08578ed.  But, this patch depends on c7b1850dfb41 so
I think c7b1850dfb41 is best for the Fixes tag.

> ---
>  mm/hugetlb.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index e09159c957e3..25a7a3d84607 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5741,6 +5741,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  		page = find_lock_page(mapping, idx);
>  		if (!page)
>  			goto out;
> +		/*
> +		 * Set new_pagecache_page to true, as we've added a page to the
> +		 * pagecache, but userfaultfd hasn't set up a mapping for this

We did not add the page to the pagecache.  Rather, this is the case
where the page already exists in the cache.  Right?

> +		 * page yet. If we bail out before setting up the mapping, we
> +		 * want to indicate to restore_reserve_on_error() that we've
> +		 * added the page to the page cache.
> +		 */
> +		new_pagecache_page = true;

How about changing the variable name new_pagecache_page to page_in_pagecache?
Then it makes sense both here and below when actually adding to the
cache.  I think we could then drop the above comment.
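
The rename suggested above can be pictured with a short userspace sketch. It is not the kernel function: `enum toy_page_source` and both `toy_*` helpers are hypothetical names, and the three cases are an assumed summary of how hugetlb_mcopy_atomic_pte() obtains its page. The point is only that one flag named page_in_pagecache would cover both the page found in the cache and the page newly added to it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for how the modeled function obtained its page. */
enum toy_page_source {
	TOY_PRIVATE_COPY,	/* UFFDIO_COPY, private mapping: never in the cache */
	TOY_ADDED_TO_CACHE,	/* UFFDIO_COPY, shared mapping: added to the cache */
	TOY_FOUND_IN_CACHE,	/* UFFDIO_CONTINUE: already present in the cache */
};

/* What a flag named page_in_pagecache would mean for each source. */
static bool toy_page_in_pagecache(enum toy_page_source src)
{
	return src == TOY_ADDED_TO_CACHE || src == TOY_FOUND_IN_CACHE;
}

/* The error path restores the reservation only for pages outside the cache. */
static bool toy_calls_restore_reserve(enum toy_page_source src)
{
	return !toy_page_in_pagecache(src);
}
```

With this reading, only the private-copy case reaches restore_reserve_on_error(), and the flag's name no longer implies that the page was newly added.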

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e09159c957e3..25a7a3d84607 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5741,6 +5741,14 @@  int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		page = find_lock_page(mapping, idx);
 		if (!page)
 			goto out;
+		/*
+		 * Set new_pagecache_page to true, as we've added a page to the
+		 * pagecache, but userfaultfd hasn't set up a mapping for this
+		 * page yet. If we bail out before setting up the mapping, we
+		 * want to indicate to restore_reserve_on_error() that we've
+		 * added the page to the page cache.
+		 */
+		new_pagecache_page = true;
 	} else if (!*pagep) {
 		/* If a page already exists, then it's UFFDIO_COPY for
 		 * a non-missing case. Return -EEXIST.