| Message ID | 1725373521-451395-3-git-send-email-steven.sistare@oracle.com (mailing list archive) |
|---|---|
| State | New |
| Series | memfd-pin huge page fixes |
Hi Steve,

> Subject: [PATCH V1 2/5] mm/hugetlb: fix memfd_pin_folios free_huge_pages
> leak
>
> memfd_pin_folios followed by unpin_folios fails to restore
> free_huge_pages if the pages were not already faulted in, because the
> folio refcount for pages created by memfd_alloc_folio never goes to 0.
> memfd_pin_folios needs another folio_put to undo the folio_try_get
> below:
>
> memfd_alloc_folio()
>   alloc_hugetlb_folio_nodemask()
>     dequeue_hugetlb_folio_nodemask()
>       dequeue_hugetlb_folio_node_exact()
>         folio_ref_unfreeze(folio, 1);  ; adds 1 refcount
>   folio_try_get()                      ; adds 1 refcount
I wonder if it is more optimal to skip the folio_try_get() above.
I think it (folio_try_get) was probably added because I didn't realize that
alloc_hugetlb_folio_nodemask() already adds a reference.

>   hugetlb_add_to_page_cache()          ; adds 512 refcount (on x86)
>
> With the fix, after memfd_pin_folios + unpin_folios, the refcount for
> the (unfaulted) page is 512, which is correct, as the refcount for a
> faulted unpinned page is 513.
>
> Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning
> memfd folios")
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  mm/gup.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 54d0dc3..5b92f1d 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -3618,7 +3618,7 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
>  	pgoff_t start_idx, end_idx, next_idx;
>  	struct folio *folio = NULL;
>  	struct folio_batch fbatch;
> -	struct hstate *h;
> +	struct hstate *h = NULL;
>  	long ret = -EINVAL;
>
>  	if (start < 0 || start > end || !max_folios)
> @@ -3662,6 +3662,8 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
>  						  &fbatch);
>  		if (folio) {
>  			folio_put(folio);
> +			if (h)
> +				folio_put(folio);
If we stick with this change, I guess there needs to be an additional
folio_put() in memfd_alloc_folio() as well if adding the folio to the
page cache fails.

Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>

Thanks,
Vivek

>  			folio = NULL;
>  		}
>
> --
> 1.8.3.1
On 9/3/2024 8:45 PM, Kasireddy, Vivek wrote:
> Hi Steve,
>
>> Subject: [PATCH V1 2/5] mm/hugetlb: fix memfd_pin_folios free_huge_pages
>> leak
>>
>> memfd_pin_folios followed by unpin_folios fails to restore
>> free_huge_pages if the pages were not already faulted in, because the
>> folio refcount for pages created by memfd_alloc_folio never goes to 0.
>> memfd_pin_folios needs another folio_put to undo the folio_try_get
>> below:
>>
>> memfd_alloc_folio()
>>   alloc_hugetlb_folio_nodemask()
>>     dequeue_hugetlb_folio_nodemask()
>>       dequeue_hugetlb_folio_node_exact()
>>         folio_ref_unfreeze(folio, 1);  ; adds 1 refcount
>>   folio_try_get()                      ; adds 1 refcount
> I wonder if it is more optimal to skip the folio_try_get() above.
> I think it (folio_try_get) was probably added because I didn't realize that
> alloc_hugetlb_folio_nodemask() already adds a reference.
Agreed, that is a simpler fix. I tried it and my tests pass.

>>   hugetlb_add_to_page_cache()          ; adds 512 refcount (on x86)
>>
>> With the fix, after memfd_pin_folios + unpin_folios, the refcount for
>> the (unfaulted) page is 512, which is correct, as the refcount for a
>> faulted unpinned page is 513.
>>
>> Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning
>> memfd folios")
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  mm/gup.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/gup.c b/mm/gup.c
>> index 54d0dc3..5b92f1d 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -3618,7 +3618,7 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
>>  	pgoff_t start_idx, end_idx, next_idx;
>>  	struct folio *folio = NULL;
>>  	struct folio_batch fbatch;
>> -	struct hstate *h;
>> +	struct hstate *h = NULL;
>>  	long ret = -EINVAL;
>>
>>  	if (start < 0 || start > end || !max_folios)
>> @@ -3662,6 +3662,8 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
>>  						  &fbatch);
>>  		if (folio) {
>>  			folio_put(folio);
>> +			if (h)
>> +				folio_put(folio);
> If we stick with this change, I guess there needs to be an additional
> folio_put() in memfd_alloc_folio() as well if adding the folio to the
> page cache fails.
Indeed, another reason why just deleting the folio_try_get is better.
Thanks! I will submit a V2 of this patch.

- Steve
diff --git a/mm/gup.c b/mm/gup.c
index 54d0dc3..5b92f1d 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3618,7 +3618,7 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
 	pgoff_t start_idx, end_idx, next_idx;
 	struct folio *folio = NULL;
 	struct folio_batch fbatch;
-	struct hstate *h;
+	struct hstate *h = NULL;
 	long ret = -EINVAL;

 	if (start < 0 || start > end || !max_folios)
@@ -3662,6 +3662,8 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
 					  &fbatch);
 		if (folio) {
 			folio_put(folio);
+			if (h)
+				folio_put(folio);
 			folio = NULL;
 		}
memfd_pin_folios followed by unpin_folios fails to restore free_huge_pages
if the pages were not already faulted in, because the folio refcount for
pages created by memfd_alloc_folio never goes to 0.  memfd_pin_folios
needs another folio_put to undo the folio_try_get below:

memfd_alloc_folio()
  alloc_hugetlb_folio_nodemask()
    dequeue_hugetlb_folio_nodemask()
      dequeue_hugetlb_folio_node_exact()
        folio_ref_unfreeze(folio, 1);  ; adds 1 refcount
  folio_try_get()                      ; adds 1 refcount
  hugetlb_add_to_page_cache()          ; adds 512 refcount (on x86)

With the fix, after memfd_pin_folios + unpin_folios, the refcount for the
(unfaulted) page is 512, which is correct, as the refcount for a faulted
unpinned page is 513.

Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 mm/gup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)