diff mbox series

[1/3] mm/gup: introduce __put_user_pages()

Message ID 20190722223415.13269-2-jhubbard@nvidia.com (mailing list archive)
State New, archived
Headers show
Series introduce __put_user_pages(), convert a few call sites | expand

Commit Message

john.hubbard@gmail.com July 22, 2019, 10:34 p.m. UTC
From: John Hubbard <jhubbard@nvidia.com>

Add a more capable variation of put_user_pages() to the
API set, and call it from the simple ones.

The new __put_user_pages() takes an enum that handles the various
combinations of needing to call set_page_dirty() or
set_page_dirty_lock(), before calling put_user_page().

Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/mm.h |  58 ++++++++++++++++++-
 mm/gup.c           | 137 ++++++++++++++++++++++-----------------------
 2 files changed, 124 insertions(+), 71 deletions(-)

Comments

Christoph Hellwig July 23, 2019, 5:53 a.m. UTC | #1
On Mon, Jul 22, 2019 at 03:34:13PM -0700, john.hubbard@gmail.com wrote:
> +enum pup_flags_t {
> +	PUP_FLAGS_CLEAN		= 0,
> +	PUP_FLAGS_DIRTY		= 1,
> +	PUP_FLAGS_LOCK		= 2,
> +	PUP_FLAGS_DIRTY_LOCK	= 3,
> +};

Well, the enum defeats the ease of just being able to pass a boolean
expression to the function, which would simplify a lot of the caller,
so if we need to support the !locked version I'd rather see that as
a separate helper.

But do we actually have callers where not using the _lock version is
not a bug?  set_page_dirty makes sense in the context of a file systems
that have a reference to the inode the page hangs off, but that is
(almost?) never the case for get_user_pages.
John Hubbard July 23, 2019, 6:33 a.m. UTC | #2
On 7/22/19 10:53 PM, Christoph Hellwig wrote:
> On Mon, Jul 22, 2019 at 03:34:13PM -0700, john.hubbard@gmail.com wrote:
>> +enum pup_flags_t {
>> +	PUP_FLAGS_CLEAN		= 0,
>> +	PUP_FLAGS_DIRTY		= 1,
>> +	PUP_FLAGS_LOCK		= 2,
>> +	PUP_FLAGS_DIRTY_LOCK	= 3,
>> +};
> 
> Well, the enum defeats the ease of just being able to pass a boolean
> expression to the function, which would simplify a lot of the caller,
> so if we need to support the !locked version I'd rather see that as
> a separate helper.
> 
> But do we actually have callers where not using the _lock version is
> not a bug?  set_page_dirty makes sense in the context of a file systems
> that have a reference to the inode the page hangs off, but that is
> (almost?) never the case for get_user_pages.
> 

I'm seeing about 18 places where set_page_dirty() is used, in the call site
conversions so far, and about 20 places where set_page_dirty_lock() is
used. So without knowing how many of the former (if any) represent bugs,
you can see why the proposal here supports both DIRTY and DIRTY_LOCK.

Anyway, yes, I could change it, based on your estimation that most of the 
set_page_dirty() calls really should be set_page_dirty_lock().
In that case, we would end up with approximately the following:

/* Here, "dirty" really means, "call set_page_dirty_lock()": */
void __put_user_pages(struct page **pages, unsigned long npages,
		      bool dirty);

/* Here, "dirty" really means, "call set_page_dirty()": */
void __put_user_pages_unlocked(struct page **pages, unsigned long npages,
			       bool dirty);

?


thanks,
Christoph Hellwig July 23, 2019, 3:36 p.m. UTC | #3
On Mon, Jul 22, 2019 at 11:33:32PM -0700, John Hubbard wrote:
> I'm seeing about 18 places where set_page_dirty() is used, in the call site
> conversions so far, and about 20 places where set_page_dirty_lock() is
> used. So without knowing how many of the former (if any) represent bugs,
> you can see why the proposal here supports both DIRTY and DIRTY_LOCK.

Well, it should be fairly easy to audit.  set_page_dirty() is only
safe if we are dealing with a file backed page where we have reference
on the inode it hangs off.  Which should basically be never or almost
never.
diff mbox series

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0334ca97c584..7218585681b2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1057,8 +1057,62 @@  static inline void put_user_page(struct page *page)
 	put_page(page);
 }
 
-void put_user_pages_dirty(struct page **pages, unsigned long npages);
-void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
+enum pup_flags_t {
+	PUP_FLAGS_CLEAN		= 0,
+	PUP_FLAGS_DIRTY		= 1,
+	PUP_FLAGS_LOCK		= 2,
+	PUP_FLAGS_DIRTY_LOCK	= 3,
+};
+
+void __put_user_pages(struct page **pages, unsigned long npages,
+		      enum pup_flags_t flags);
+
+/**
+ * put_user_pages_dirty() - release and dirty an array of gup-pinned pages
+ * @pages:  array of pages to be marked dirty and released.
+ * @npages: number of pages in the @pages array.
+ *
+ * "gup-pinned page" refers to a page that has had one of the get_user_pages()
+ * variants called on that page.
+ *
+ * For each page in the @pages array, make that page (or its head page, if a
+ * compound page) dirty, if it was previously listed as clean. Then, release
+ * the page using put_user_page().
+ *
+ * Please see the put_user_page() documentation for details.
+ *
+ * set_page_dirty(), which does not lock the page, is used here.
+ * Therefore, it is the caller's responsibility to ensure that this is
+ * safe. If not, then put_user_pages_dirty_lock() should be called instead.
+ *
+ */
+static inline void put_user_pages_dirty(struct page **pages,
+					unsigned long npages)
+{
+	__put_user_pages(pages, npages, PUP_FLAGS_DIRTY);
+}
+
+/**
+ * put_user_pages_dirty_lock() - release and dirty an array of gup-pinned pages
+ * @pages:  array of pages to be marked dirty and released.
+ * @npages: number of pages in the @pages array.
+ *
+ * For each page in the @pages array, make that page (or its head page, if a
+ * compound page) dirty, if it was previously listed as clean. Then, release
+ * the page using put_user_page().
+ *
+ * Please see the put_user_page() documentation for details.
+ *
+ * This is just like put_user_pages_dirty(), except that it invokes
+ * set_page_dirty_lock(), instead of set_page_dirty().
+ *
+ */
+static inline void put_user_pages_dirty_lock(struct page **pages,
+					     unsigned long npages)
+{
+	__put_user_pages(pages, npages, PUP_FLAGS_DIRTY_LOCK);
+}
+
 void put_user_pages(struct page **pages, unsigned long npages);
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
diff --git a/mm/gup.c b/mm/gup.c
index 98f13ab37bac..6831ef064d76 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -29,87 +29,86 @@  struct follow_page_context {
 	unsigned int page_mask;
 };
 
-typedef int (*set_dirty_func_t)(struct page *page);
-
-static void __put_user_pages_dirty(struct page **pages,
-				   unsigned long npages,
-				   set_dirty_func_t sdf)
-{
-	unsigned long index;
-
-	for (index = 0; index < npages; index++) {
-		struct page *page = compound_head(pages[index]);
-
-		/*
-		 * Checking PageDirty at this point may race with
-		 * clear_page_dirty_for_io(), but that's OK. Two key cases:
-		 *
-		 * 1) This code sees the page as already dirty, so it skips
-		 * the call to sdf(). That could happen because
-		 * clear_page_dirty_for_io() called page_mkclean(),
-		 * followed by set_page_dirty(). However, now the page is
-		 * going to get written back, which meets the original
-		 * intention of setting it dirty, so all is well:
-		 * clear_page_dirty_for_io() goes on to call
-		 * TestClearPageDirty(), and write the page back.
-		 *
-		 * 2) This code sees the page as clean, so it calls sdf().
-		 * The page stays dirty, despite being written back, so it
-		 * gets written back again in the next writeback cycle.
-		 * This is harmless.
-		 */
-		if (!PageDirty(page))
-			sdf(page);
-
-		put_user_page(page);
-	}
-}
-
 /**
- * put_user_pages_dirty() - release and dirty an array of gup-pinned pages
+ * __put_user_pages() - release an array of gup-pinned pages.
  * @pages:  array of pages to be marked dirty and released.
  * @npages: number of pages in the @pages array.
+ * @flags: additional hints, to be applied to each page:
  *
- * "gup-pinned page" refers to a page that has had one of the get_user_pages()
- * variants called on that page.
+ *	PUP_FLAGS_CLEAN: no additional steps required. (Consider calling
+ *			 put_user_pages() directly, instead.)
  *
- * For each page in the @pages array, make that page (or its head page, if a
- * compound page) dirty, if it was previously listed as clean. Then, release
- * the page using put_user_page().
+ *	PUP_FLAGS_DIRTY: Call set_page_dirty() on the page (if not already
+ *			 dirty).
  *
- * Please see the put_user_page() documentation for details.
+ *      PUP_FLAGS_LOCK: meaningless by itself, but included in order to show
+ *			the numeric relationship between the flags.
  *
- * set_page_dirty(), which does not lock the page, is used here.
- * Therefore, it is the caller's responsibility to ensure that this is
- * safe. If not, then put_user_pages_dirty_lock() should be called instead.
+ *      PUP_FLAGS_DIRTY_LOCK: Call set_page_dirty_lock() on the page (if not
+ *			      already dirty).
  *
+ * For each page in the @pages array, release the page using put_user_page().
  */
-void put_user_pages_dirty(struct page **pages, unsigned long npages)
+void __put_user_pages(struct page **pages, unsigned long npages,
+		      enum pup_flags_t flags)
 {
-	__put_user_pages_dirty(pages, npages, set_page_dirty);
-}
-EXPORT_SYMBOL(put_user_pages_dirty);
+	unsigned long index;
 
-/**
- * put_user_pages_dirty_lock() - release and dirty an array of gup-pinned pages
- * @pages:  array of pages to be marked dirty and released.
- * @npages: number of pages in the @pages array.
- *
- * For each page in the @pages array, make that page (or its head page, if a
- * compound page) dirty, if it was previously listed as clean. Then, release
- * the page using put_user_page().
- *
- * Please see the put_user_page() documentation for details.
- *
- * This is just like put_user_pages_dirty(), except that it invokes
- * set_page_dirty_lock(), instead of set_page_dirty().
- *
- */
-void put_user_pages_dirty_lock(struct page **pages, unsigned long npages)
-{
-	__put_user_pages_dirty(pages, npages, set_page_dirty_lock);
+	/*
+	 * TODO: this can be optimized for huge pages: if a series of pages is
+	 * physically contiguous and part of the same compound page, then a
+	 * single operation to the head page should suffice.
+	 */
+
+	for (index = 0; index < npages; index++) {
+		struct page *page = compound_head(pages[index]);
+
+		switch (flags) {
+		case PUP_FLAGS_CLEAN:
+			break;
+
+		case PUP_FLAGS_DIRTY:
+			/*
+			 * Checking PageDirty at this point may race with
+			 * clear_page_dirty_for_io(), but that's OK. Two key
+			 * cases:
+			 *
+			 * 1) This code sees the page as already dirty, so it
+			 * skips the call to set_page_dirty(). That could happen
+			 * because clear_page_dirty_for_io() called
+			 * page_mkclean(), followed by set_page_dirty().
+			 * However, now the page is going to get written back,
+			 * which meets the original intention of setting it
+			 * dirty, so all is well: clear_page_dirty_for_io() goes
+			 * on to call TestClearPageDirty(), and write the page
+			 * back.
+			 *
+			 * 2) This code sees the page as clean, so it calls
+			 * set_page_dirty(). The page stays dirty, despite being
+			 * written back, so it gets written back again in the
+			 * next writeback cycle. This is harmless.
+			 */
+			if (!PageDirty(page))
+				set_page_dirty(page);
+			break;
+
+		case PUP_FLAGS_LOCK:
+			VM_WARN_ON_ONCE(flags == PUP_FLAGS_LOCK);
+			/*
+			 * Shouldn't happen, but treat it as _DIRTY_LOCK if
+			 * it does: fall through.
+			 */
+
+		case PUP_FLAGS_DIRTY_LOCK:
+			/* Same comments as for PUP_FLAGS_DIRTY apply here. */
+			if (!PageDirty(page))
+				set_page_dirty_lock(page);
+			break;
+		};
+		put_user_page(page);
+	}
 }
-EXPORT_SYMBOL(put_user_pages_dirty_lock);
+EXPORT_SYMBOL(__put_user_pages);
 
 /**
  * put_user_pages() - release an array of gup-pinned pages.