Message ID | 20240225080008.1019653-2-vivek.kasireddy@intel.com (mailing list archive)
---|---
State | New, archived
Series | mm/gup: Introduce memfd_pin_folios() for pinning memfd folios
On 25.02.24 08:56, Vivek Kasireddy wrote:
> These helpers are the folio versions of unpin_user_page/unpin_user_pages.
> They are currently only useful for unpinning folios pinned by
> memfd_pin_folios() or other associated routines. However, they could
> find new uses in the future, when more and more folio-only helpers
> are added to GUP.
>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Peter Xu <peterx@redhat.com>
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> ---
>   include/linux/mm.h |  2 ++
>   mm/gup.c           | 81 ++++++++++++++++++++++++++++++++++++++++------
>   2 files changed, 74 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 6f4825d82965..36e4c2b22600 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1601,11 +1601,13 @@ static inline void put_page(struct page *page)
>   #define GUP_PIN_COUNTING_BIAS (1U << 10)
>
>   void unpin_user_page(struct page *page);
> +void unpin_folio(struct folio *folio);
>   void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
>   				 bool make_dirty);
>   void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
>   				      bool make_dirty);
>   void unpin_user_pages(struct page **pages, unsigned long npages);
> +void unpin_folios(struct folio **folios, unsigned long nfolios);
>
>   static inline bool is_cow_mapping(vm_flags_t flags)
>   {
> diff --git a/mm/gup.c b/mm/gup.c
> index df83182ec72d..0a45eda6aaeb 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -30,6 +30,23 @@ struct follow_page_context {
>   	unsigned int page_mask;
>   };
>
> +static inline void sanity_check_pinned_folios(struct folio **folios,
> +					      unsigned long nfolios)
> +{
> +	if (!IS_ENABLED(CONFIG_DEBUG_VM))
> +		return;
> +
> +	for (; nfolios; nfolios--, folios++) {
> +		struct folio *folio = *folios;
> +
> +		if (is_zero_folio(folio) ||
> +		    !folio_test_anon(folio))
> +			continue;
> +
> +		VM_BUG_ON_FOLIO(!PageAnonExclusive(&folio->page), folio);

That change is wrong (and the split makes the check confusing).

It could be that the first subpage is no longer exclusive, but the given
(sanity_check_pinned_pages()) subpage is exclusive for large folios.

I suggest dropping that change, and instead, in
unpin_folio()/unpin_folios(), reject any anon folios for now.

So, replace the sanity_check_pinned_folios() in unpin_folio() /
unpin_folios() by a VM_WARN_ON(folio_test_anon(folio));

It will all be better once we have a single anon-exclusive flag per
folio (which I am working on), but in the meantime, we really don't
expect code that called pin_user_pages() to call unpin_folios().

[...]

>
> +/**
> + * unpin_folio() - release a dma-pinned folio
> + * @folio: pointer to folio to be released
> + *
> + * Folios that were pinned via memfd_pin_folios() or other similar routines
> + * must be released either using unpin_folio() or unpin_folios(). This is so
> + * that such folios can be separately tracked and uniquely handled.

I'd drop the last sentence; no need for apologies/explanations, this is
simply how pinning works :)

> + */
> +void unpin_folio(struct folio *folio)
> +{
> +	sanity_check_pinned_folios(&folio, 1);
> +	gup_put_folio(folio, 1, FOLL_PIN);
> +}
> +EXPORT_SYMBOL(unpin_folio);

Can we restrict that to EXPORT_SYMBOL_GPL for now? memfd_pin_folios()
uses EXPORT_SYMBOL_GPL...

> +
>   /**
>    * folio_add_pin - Try to get an additional pin on a pinned folio
>    * @folio: The folio to be pinned
> @@ -488,6 +516,41 @@ void unpin_user_pages(struct page **pages, unsigned long npages)
>   }
>   EXPORT_SYMBOL(unpin_user_pages);
>
> +/**
> + * unpin_folios() - release an array of gup-pinned folios.
> + * @folios: array of folios to be marked dirty and released.
> + * @nfolios: number of folios in the @folios array.
> + *
> + * For each folio in the @folios array, release the folio using unpin_folio().
> + *
> + * Please see the unpin_folio() documentation for details.
> + */
> +void unpin_folios(struct folio **folios, unsigned long nfolios)
> +{
> +	unsigned long i = 0, j;
> +
> +	/*
> +	 * If this WARN_ON() fires, then the system *might* be leaking folios
> +	 * (by leaving them pinned), but probably not. More likely, gup/pup
> +	 * returned a hard -ERRNO error to the caller, who erroneously passed
> +	 * it here.
> +	 */
> +	if (WARN_ON(IS_ERR_VALUE(nfolios)))
> +		return;
> +
> +	sanity_check_pinned_folios(folios, nfolios);
> +	while (i < nfolios) {
> +		for (j = i + 1; j < nfolios; j++)
> +			if (folios[i] != folios[j])
> +				break;
> +
> +		if (folios[i])
> +			gup_put_folio(folios[i], j - i, FOLL_PIN);
> +		i = j;
> +	}
> +}
> +EXPORT_SYMBOL(unpin_folios);

Same thought here.
On 02.04.24 15:52, David Hildenbrand wrote:
> On 25.02.24 08:56, Vivek Kasireddy wrote:
>> These helpers are the folio versions of unpin_user_page/unpin_user_pages.
>> They are currently only useful for unpinning folios pinned by
>> memfd_pin_folios() or other associated routines. However, they could
>> find new uses in the future, when more and more folio-only helpers
>> are added to GUP.
>>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Christoph Hellwig <hch@infradead.org>
>> Cc: Jason Gunthorpe <jgg@nvidia.com>
>> Cc: Peter Xu <peterx@redhat.com>
>> Suggested-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
>> ---
>>   include/linux/mm.h |  2 ++
>>   mm/gup.c           | 81 ++++++++++++++++++++++++++++++++++++++++------
>>   2 files changed, 74 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 6f4825d82965..36e4c2b22600 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1601,11 +1601,13 @@ static inline void put_page(struct page *page)
>>   #define GUP_PIN_COUNTING_BIAS (1U << 10)
>>
>>   void unpin_user_page(struct page *page);
>> +void unpin_folio(struct folio *folio);
>>   void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
>>   				 bool make_dirty);
>>   void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
>>   				      bool make_dirty);
>>   void unpin_user_pages(struct page **pages, unsigned long npages);
>> +void unpin_folios(struct folio **folios, unsigned long nfolios);
>>
>>   static inline bool is_cow_mapping(vm_flags_t flags)
>>   {
>> diff --git a/mm/gup.c b/mm/gup.c
>> index df83182ec72d..0a45eda6aaeb 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -30,6 +30,23 @@ struct follow_page_context {
>>   	unsigned int page_mask;
>>   };
>>
>> +static inline void sanity_check_pinned_folios(struct folio **folios,
>> +					      unsigned long nfolios)
>> +{
>> +	if (!IS_ENABLED(CONFIG_DEBUG_VM))
>> +		return;
>> +
>> +	for (; nfolios; nfolios--, folios++) {
>> +		struct folio *folio = *folios;
>> +
>> +		if (is_zero_folio(folio) ||
>> +		    !folio_test_anon(folio))
>> +			continue;
>> +
>> +		VM_BUG_ON_FOLIO(!PageAnonExclusive(&folio->page), folio);
>
> That change is wrong (and the split makes the check confusing).
>
> It could be that the first subpage is no longer exclusive, but the given
> (sanity_check_pinned_pages()) subpage is exclusive for large folios.
>
> I suggest dropping that change, and instead, in
> unpin_folio()/unpin_folios(), reject any anon folios for now.
>
> So, replace the sanity_check_pinned_folios() in unpin_folio() /
> unpin_folios() by a VM_WARN_ON(folio_test_anon(folio));

After reading patch #2: drop both the sanity check and VM_WARN_ON() from
unpin_folio()/unpin_folios(), and add a comment to the patch description
that we cannot do the sanity checking without the subpage, and that we
can reintroduce it once we have a single per-folio AnonExclusive bit.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6f4825d82965..36e4c2b22600 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1601,11 +1601,13 @@ static inline void put_page(struct page *page)
 #define GUP_PIN_COUNTING_BIAS (1U << 10)
 
 void unpin_user_page(struct page *page);
+void unpin_folio(struct folio *folio);
 void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
 				 bool make_dirty);
 void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
 				      bool make_dirty);
 void unpin_user_pages(struct page **pages, unsigned long npages);
+void unpin_folios(struct folio **folios, unsigned long nfolios);
 
 static inline bool is_cow_mapping(vm_flags_t flags)
 {
diff --git a/mm/gup.c b/mm/gup.c
index df83182ec72d..0a45eda6aaeb 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -30,6 +30,23 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
+static inline void sanity_check_pinned_folios(struct folio **folios,
+					      unsigned long nfolios)
+{
+	if (!IS_ENABLED(CONFIG_DEBUG_VM))
+		return;
+
+	for (; nfolios; nfolios--, folios++) {
+		struct folio *folio = *folios;
+
+		if (is_zero_folio(folio) ||
+		    !folio_test_anon(folio))
+			continue;
+
+		VM_BUG_ON_FOLIO(!PageAnonExclusive(&folio->page), folio);
+	}
+}
+
 static inline void sanity_check_pinned_pages(struct page **pages,
 					     unsigned long npages)
 {
@@ -52,15 +69,11 @@ static inline void sanity_check_pinned_pages(struct page **pages,
 		struct page *page = *pages;
 		struct folio *folio = page_folio(page);
 
-		if (is_zero_page(page) ||
-		    !folio_test_anon(folio))
-			continue;
-
-		if (!folio_test_large(folio) || folio_test_hugetlb(folio))
-			VM_BUG_ON_PAGE(!PageAnonExclusive(&folio->page), page);
-		else
-			/* Either a PTE-mapped or a PMD-mapped THP. */
-			VM_BUG_ON_PAGE(!PageAnonExclusive(&folio->page) &&
-				       !PageAnonExclusive(page), page);
+		sanity_check_pinned_folios(&folio, 1);
+
+		/* Either a PTE-mapped or a PMD-mapped THP. */
+		if (folio_test_large(folio) && !folio_test_hugetlb(folio))
+			VM_BUG_ON_PAGE(!PageAnonExclusive(page), page);
 	}
 }
 
@@ -276,6 +289,21 @@ void unpin_user_page(struct page *page)
 }
 EXPORT_SYMBOL(unpin_user_page);
 
+/**
+ * unpin_folio() - release a dma-pinned folio
+ * @folio: pointer to folio to be released
+ *
+ * Folios that were pinned via memfd_pin_folios() or other similar routines
+ * must be released either using unpin_folio() or unpin_folios(). This is so
+ * that such folios can be separately tracked and uniquely handled.
+ */
+void unpin_folio(struct folio *folio)
+{
+	sanity_check_pinned_folios(&folio, 1);
+	gup_put_folio(folio, 1, FOLL_PIN);
+}
+EXPORT_SYMBOL(unpin_folio);
+
 /**
  * folio_add_pin - Try to get an additional pin on a pinned folio
  * @folio: The folio to be pinned
@@ -488,6 +516,41 @@ void unpin_user_pages(struct page **pages, unsigned long npages)
 }
 EXPORT_SYMBOL(unpin_user_pages);
 
+/**
+ * unpin_folios() - release an array of gup-pinned folios.
+ * @folios: array of folios to be marked dirty and released.
+ * @nfolios: number of folios in the @folios array.
+ *
+ * For each folio in the @folios array, release the folio using unpin_folio().
+ *
+ * Please see the unpin_folio() documentation for details.
+ */
+void unpin_folios(struct folio **folios, unsigned long nfolios)
+{
+	unsigned long i = 0, j;
+
+	/*
+	 * If this WARN_ON() fires, then the system *might* be leaking folios
+	 * (by leaving them pinned), but probably not. More likely, gup/pup
+	 * returned a hard -ERRNO error to the caller, who erroneously passed
+	 * it here.
+	 */
+	if (WARN_ON(IS_ERR_VALUE(nfolios)))
+		return;
+
+	sanity_check_pinned_folios(folios, nfolios);
+	while (i < nfolios) {
+		for (j = i + 1; j < nfolios; j++)
+			if (folios[i] != folios[j])
+				break;
+
+		if (folios[i])
+			gup_put_folio(folios[i], j - i, FOLL_PIN);
+		i = j;
+	}
+}
+EXPORT_SYMBOL(unpin_folios);
+
 /*
  * Set the MMF_HAS_PINNED if not set yet; after set it'll be there for the mm's
  * lifecycle. Avoid setting the bit unless necessary, or it might cause write
These helpers are the folio versions of unpin_user_page/unpin_user_pages.
They are currently only useful for unpinning folios pinned by
memfd_pin_folios() or other associated routines. However, they could
find new uses in the future, when more and more folio-only helpers
are added to GUP.

Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 include/linux/mm.h |  2 ++
 mm/gup.c           | 81 ++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 74 insertions(+), 9 deletions(-)