diff mbox series

[v2,1/3] mm/gup: introduce pin_user_page()

Message ID 20200829080853.20337-2-jhubbard@nvidia.com
State New
Headers show
Series bio: Direct IO: convert to pin_user_pages_fast() | expand

Commit Message

John Hubbard Aug. 29, 2020, 8:08 a.m. UTC
pin_user_page() is the FOLL_PIN equivalent of get_page().

This was always a missing piece of the pin/unpin API calls (early
reviewers of pin_user_pages() asked about it, in fact), but until now,
it just wasn't needed. Finally though, now that the Direct IO pieces in
block/bio are about to be converted to use FOLL_PIN, it turns out that
there are some cases in which get_page() and get_user_pages_fast() were
both used. Converting those sites requires a drop-in replacement for
get_page(), which this patch supplies.

[1] and [2] provide some background about pin_user_pages() in general.

[1] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

[2] Documentation/core-api/pin_user_pages.rst

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/mm.h |  2 ++
 mm/gup.c           | 30 ++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

Comments

Christoph Hellwig Aug. 29, 2020, 2:54 p.m. UTC | #1
On Sat, Aug 29, 2020 at 01:08:51AM -0700, John Hubbard wrote:
> pin_user_page() is the FOLL_PIN equivalent of get_page().
> 
> This was always a missing piece of the pin/unpin API calls (early
> reviewers of pin_user_pages() asked about it, in fact), but until now,
> it just wasn't needed. Finally though, now that the Direct IO pieces in
> block/bio are about to be converted to use FOLL_PIN, it turns out that
> there are some cases in which get_page() and get_user_pages_fast() were
> both used. Converting those sites requires a drop-in replacement for
> get_page(), which this patch supplies.

I find the name really confusing vs pin_user_pages*, as it suggests as
single version of the same.  It also seems partially wrong, at least
in the direct I/O case as the only thing pinned here is the zero page.

So maybe pin_kernel_page is a better name together with an explanation?
Or just pin_page?
John Hubbard Aug. 29, 2020, 9:57 p.m. UTC | #2
On 8/29/20 7:54 AM, Christoph Hellwig wrote:
> On Sat, Aug 29, 2020 at 01:08:51AM -0700, John Hubbard wrote:
>> pin_user_page() is the FOLL_PIN equivalent of get_page().
>>
>> This was always a missing piece of the pin/unpin API calls (early
>> reviewers of pin_user_pages() asked about it, in fact), but until now,
>> it just wasn't needed. Finally though, now that the Direct IO pieces in
>> block/bio are about to be converted to use FOLL_PIN, it turns out that
>> there are some cases in which get_page() and get_user_pages_fast() were
>> both used. Converting those sites requires a drop-in replacement for
>> get_page(), which this patch supplies.
> 
> I find the name really confusing vs pin_user_pages*, as it suggests as
> single version of the same.  It also seems partially wrong, at least
> in the direct I/O case as the only thing pinned here is the zero page.
> 
> So maybe pin_kernel_page is a better name together with an explanation?
> Or just pin_page?
> 

Yes. Both its behavior and use are closer to get_page() than it is to
get_user_pages(). So pin_page() is just right.


thanks,
diff mbox series

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 97c83773b6f0..51c6ae4b63f7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1152,6 +1152,8 @@  static inline void get_page(struct page *page)
 	page_ref_inc(page);
 }
 
+void pin_user_page(struct page *page);
+
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
 
 static inline __must_check bool try_get_page(struct page *page)
diff --git a/mm/gup.c b/mm/gup.c
index ae096ea7583f..2cae5bbbc862 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -123,6 +123,36 @@  static __maybe_unused struct page *try_grab_compound_head(struct page *page,
 	return NULL;
 }
 
+/*
+ * pin_user_page() - elevate the page refcount, and mark as FOLL_PIN
+ *
+ * This the FOLL_PIN equivalent of get_page(). It is intended for use when the
+ * page will be released via unpin_user_page().
+ */
+void pin_user_page(struct page *page)
+{
+	int refs = 1;
+
+	page = compound_head(page);
+
+	VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page);
+
+	if (hpage_pincount_available(page))
+		hpage_pincount_add(page, 1);
+	else
+		refs = GUP_PIN_COUNTING_BIAS;
+
+	/*
+	 * Similar to try_grab_compound_head(): even if using the
+	 * hpage_pincount_add/_sub() routines, be sure to
+	 * *also* increment the normal page refcount field at least
+	 * once, so that the page really is pinned.
+	 */
+	page_ref_add(page, refs);
+
+	mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
+}
+
 /**
  * try_grab_page() - elevate a page's refcount by a flag-dependent amount
  *