diff mbox series

[v2,2/3] mm: introduce put_user_page[s](), placeholder versions

Message ID 20181005040225.14292-3-jhubbard@nvidia.com (mailing list archive)
State New, archived
Headers show
Series get_user_pages*() and RDMA: first steps | expand

Commit Message

john.hubbard@gmail.com Oct. 5, 2018, 4:02 a.m. UTC
From: John Hubbard <jhubbard@nvidia.com>

Introduces put_user_page(), which simply calls put_page().
This provides a way to update all get_user_pages*() callers,
so that they call put_user_page(), instead of put_page().

Also introduces put_user_pages(), and a few dirty/locked variations,
as a replacement for release_pages(), for the same reasons.
These may be used for subsequent performance improvements,
via batching of pages to be released.

This prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2], [3], [4].

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
    Proposed steps for fixing get_user_pages() + DMA problems.

[3]https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
    Bounce buffers (otherwise [2] is not really viable).

[4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
    Follow-up discussions.

CC: Matthew Wilcox <willy@infradead.org>
CC: Michal Hocko <mhocko@kernel.org>
CC: Christopher Lameter <cl@linux.com>
CC: Jason Gunthorpe <jgg@ziepe.ca>
CC: Dan Williams <dan.j.williams@intel.com>
CC: Jan Kara <jack@suse.cz>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Jerome Glisse <jglisse@redhat.com>
CC: Christoph Hellwig <hch@infradead.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/mm.h | 42 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

Comments

Jason Gunthorpe Oct. 5, 2018, 3:17 p.m. UTC | #1
On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Introduces put_user_page(), which simply calls put_page().
> This provides a way to update all get_user_pages*() callers,
> so that they call put_user_page(), instead of put_page().
> 
> Also introduces put_user_pages(), and a few dirty/locked variations,
> as a replacement for release_pages(), for the same reasons.
> These may be used for subsequent performance improvements,
> via batching of pages to be released.
> 
> This prepares for eventually fixing the problem described
> in [1], and is following a plan listed in [2], [3], [4].
> 
> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
> 
> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>     Proposed steps for fixing get_user_pages() + DMA problems.
> 
> [3]https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>     Bounce buffers (otherwise [2] is not really viable).
> 
> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>     Follow-up discussions.
> 
> CC: Matthew Wilcox <willy@infradead.org>
> CC: Michal Hocko <mhocko@kernel.org>
> CC: Christopher Lameter <cl@linux.com>
> CC: Jason Gunthorpe <jgg@ziepe.ca>
> CC: Dan Williams <dan.j.williams@intel.com>
> CC: Jan Kara <jack@suse.cz>
> CC: Al Viro <viro@zeniv.linux.org.uk>
> CC: Jerome Glisse <jglisse@redhat.com>
> CC: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>  include/linux/mm.h | 42 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a61ebe8ad4ca..1a9aae7c659f 100644
> +++ b/include/linux/mm.h
> @@ -137,6 +137,8 @@ extern int overcommit_ratio_handler(struct ctl_table *, int, void __user *,
>  				    size_t *, loff_t *);
>  extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
>  				    size_t *, loff_t *);
> +int set_page_dirty(struct page *page);
> +int set_page_dirty_lock(struct page *page);
>  
>  #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
>  
> @@ -943,6 +945,44 @@ static inline void put_page(struct page *page)
>  		__put_page(page);
>  }
>  
> +/* Placeholder version, until all get_user_pages*() callers are updated. */
> +static inline void put_user_page(struct page *page)
> +{
> +	put_page(page);
> +}
> +
> +/* For get_user_pages*()-pinned pages, use these variants instead of
> + * release_pages():
> + */
> +static inline void put_user_pages_dirty(struct page **pages,
> +					unsigned long npages)
> +{
> +	while (npages) {
> +		set_page_dirty(pages[npages]);
> +		put_user_page(pages[npages]);
> +		--npages;
> +	}
> +}

Shouldn't these do the !PageDirty(page) thing?

Jason
John Hubbard Oct. 5, 2018, 7:49 p.m. UTC | #2
On 10/5/18 8:17 AM, Jason Gunthorpe wrote:
> On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>>
>> Introduces put_user_page(), which simply calls put_page().
>> This provides a way to update all get_user_pages*() callers,
>> so that they call put_user_page(), instead of put_page().
>>
>> Also introduces put_user_pages(), and a few dirty/locked variations,
>> as a replacement for release_pages(), for the same reasons.
>> These may be used for subsequent performance improvements,
>> via batching of pages to be released.
>>
>> This prepares for eventually fixing the problem described
>> in [1], and is following a plan listed in [2], [3], [4].
>>
>> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>>
>> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>>     Proposed steps for fixing get_user_pages() + DMA problems.
>>
>> [3]https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>>     Bounce buffers (otherwise [2] is not really viable).
>>
>> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>>     Follow-up discussions.
>>
[...]
>>  
>> +/* Placeholder version, until all get_user_pages*() callers are updated. */
>> +static inline void put_user_page(struct page *page)
>> +{
>> +	put_page(page);
>> +}
>> +
>> +/* For get_user_pages*()-pinned pages, use these variants instead of
>> + * release_pages():
>> + */
>> +static inline void put_user_pages_dirty(struct page **pages,
>> +					unsigned long npages)
>> +{
>> +	while (npages) {
>> +		set_page_dirty(pages[npages]);
>> +		put_user_page(pages[npages]);
>> +		--npages;
>> +	}
>> +}
> 
> Shouldn't these do the !PageDirty(page) thing?
> 

Well, not yet. This is the "placeholder" patch, in which I planned to keep
the behavior the same, while I go to all the get_user_pages call sites and change 
put_page() and release_pages() over to use these new routines.

After the call sites are changed, then these routines will be updated to do more.
[2], above has slightly more detail about that.


thanks,
John Hubbard Oct. 5, 2018, 8:51 p.m. UTC | #3
On 10/5/18 12:49 PM, John Hubbard wrote:
> On 10/5/18 8:17 AM, Jason Gunthorpe wrote:
>> On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
>>> From: John Hubbard <jhubbard@nvidia.com>
>>>
>>> Introduces put_user_page(), which simply calls put_page().
>>> This provides a way to update all get_user_pages*() callers,
>>> so that they call put_user_page(), instead of put_page().
>>>
>>> Also introduces put_user_pages(), and a few dirty/locked variations,
>>> as a replacement for release_pages(), for the same reasons.
>>> These may be used for subsequent performance improvements,
>>> via batching of pages to be released.
>>>
>>> This prepares for eventually fixing the problem described
>>> in [1], and is following a plan listed in [2], [3], [4].
>>>
>>> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>>>
>>> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>>>     Proposed steps for fixing get_user_pages() + DMA problems.
>>>
>>> [3]https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>>>     Bounce buffers (otherwise [2] is not really viable).
>>>
>>> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>>>     Follow-up discussions.
>>>
> [...]
>>>  
>>> +/* Placeholder version, until all get_user_pages*() callers are updated. */
>>> +static inline void put_user_page(struct page *page)
>>> +{
>>> +	put_page(page);
>>> +}
>>> +
>>> +/* For get_user_pages*()-pinned pages, use these variants instead of
>>> + * release_pages():
>>> + */
>>> +static inline void put_user_pages_dirty(struct page **pages,
>>> +					unsigned long npages)
>>> +{
>>> +	while (npages) {
>>> +		set_page_dirty(pages[npages]);
>>> +		put_user_page(pages[npages]);
>>> +		--npages;
>>> +	}
>>> +}
>>
>> Shouldn't these do the !PageDirty(page) thing?
>>
> 
> Well, not yet. This is the "placeholder" patch, in which I planned to keep
> the behavior the same, while I go to all the get_user_pages call sites and change 
> put_page() and release_pages() over to use these new routines.
> 
> After the call sites are changed, then these routines will be updated to do more.
> [2], above has slightly more detail about that.
> 
> 

Also, I plan to respin again pretty soon, because someone politely pointed out offline
that even in this small patchset, I've botched the handling of the --npages loop, sigh. 
(Thanks, Ralph!)

The original form:

    while(--npages)

was correct, but now it's not so much.

thanks,
Jason Gunthorpe Oct. 5, 2018, 9:48 p.m. UTC | #4
On Fri, Oct 05, 2018 at 12:49:06PM -0700, John Hubbard wrote:
> On 10/5/18 8:17 AM, Jason Gunthorpe wrote:
> > On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
> >> From: John Hubbard <jhubbard@nvidia.com>
> >>
> >> Introduces put_user_page(), which simply calls put_page().
> >> This provides a way to update all get_user_pages*() callers,
> >> so that they call put_user_page(), instead of put_page().
> >>
> >> Also introduces put_user_pages(), and a few dirty/locked variations,
> >> as a replacement for release_pages(), for the same reasons.
> >> These may be used for subsequent performance improvements,
> >> via batching of pages to be released.
> >>
> >> This prepares for eventually fixing the problem described
> >> in [1], and is following a plan listed in [2], [3], [4].
> >>
> >> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
> >>
> >> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
> >>     Proposed steps for fixing get_user_pages() + DMA problems.
> >>
> >> [3]https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
> >>     Bounce buffers (otherwise [2] is not really viable).
> >>
> >> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
> >>     Follow-up discussions.
> >>
> [...]
> >>  
> >> +/* Placeholder version, until all get_user_pages*() callers are updated. */
> >> +static inline void put_user_page(struct page *page)
> >> +{
> >> +	put_page(page);
> >> +}
> >> +
> >> +/* For get_user_pages*()-pinned pages, use these variants instead of
> >> + * release_pages():
> >> + */
> >> +static inline void put_user_pages_dirty(struct page **pages,
> >> +					unsigned long npages)
> >> +{
> >> +	while (npages) {
> >> +		set_page_dirty(pages[npages]);
> >> +		put_user_page(pages[npages]);
> >> +		--npages;
> >> +	}
> >> +}
> > 
> > Shouldn't these do the !PageDirty(page) thing?
> > 
> 
> Well, not yet. This is the "placeholder" patch, in which I planned to keep
> the behavior the same, while I go to all the get_user_pages call sites and change 
> put_page() and release_pages() over to use these new routines.

Hmm.. Well, if it is the right thing to do here, why not include it and
take it out of callers when doing the conversion?

If it is the wrong thing, then let us still take it out of callers
when doing the conversion :)

Just seems like things will be in a better place to make future
changes if all the call sights are de-duplicated and correct.

Jason
John Hubbard Oct. 6, 2018, 12:03 a.m. UTC | #5
On 10/5/18 2:48 PM, Jason Gunthorpe wrote:
> On Fri, Oct 05, 2018 at 12:49:06PM -0700, John Hubbard wrote:
>> On 10/5/18 8:17 AM, Jason Gunthorpe wrote:
>>> On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
>>>> From: John Hubbard <jhubbard@nvidia.com>
>>>>
>>>> Introduces put_user_page(), which simply calls put_page().
>>>> This provides a way to update all get_user_pages*() callers,
>>>> so that they call put_user_page(), instead of put_page().
>>>>
>>>> Also introduces put_user_pages(), and a few dirty/locked variations,
>>>> as a replacement for release_pages(), for the same reasons.
>>>> These may be used for subsequent performance improvements,
>>>> via batching of pages to be released.
>>>>
>>>> This prepares for eventually fixing the problem described
>>>> in [1], and is following a plan listed in [2], [3], [4].
>>>>
>>>> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>>>>
>>>> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>>>>     Proposed steps for fixing get_user_pages() + DMA problems.
>>>>
>>>> [3]https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>>>>     Bounce buffers (otherwise [2] is not really viable).
>>>>
>>>> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>>>>     Follow-up discussions.
>>>>
>> [...]
>>>>  
>>>> +/* Placeholder version, until all get_user_pages*() callers are updated. */
>>>> +static inline void put_user_page(struct page *page)
>>>> +{
>>>> +	put_page(page);
>>>> +}
>>>> +
>>>> +/* For get_user_pages*()-pinned pages, use these variants instead of
>>>> + * release_pages():
>>>> + */
>>>> +static inline void put_user_pages_dirty(struct page **pages,
>>>> +					unsigned long npages)
>>>> +{
>>>> +	while (npages) {
>>>> +		set_page_dirty(pages[npages]);
>>>> +		put_user_page(pages[npages]);
>>>> +		--npages;
>>>> +	}
>>>> +}
>>>
>>> Shouldn't these do the !PageDirty(page) thing?
>>>
>>
>> Well, not yet. This is the "placeholder" patch, in which I planned to keep
>> the behavior the same, while I go to all the get_user_pages call sites and change 
>> put_page() and release_pages() over to use these new routines.
> 
> Hmm.. Well, if it is the right thing to do here, why not include it and
> take it out of callers when doing the conversion?
> 
> If it is the wrong thing, then let us still take it out of callers
> when doing the conversion :)
> 
> Just seems like things will be in a better place to make future
> changes if all the call sights are de-duplicated and correct.
> 

OK, yes. Let me send out a v3 with that included, then.

thanks,
diff mbox series

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a61ebe8ad4ca..1a9aae7c659f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -137,6 +137,8 @@  extern int overcommit_ratio_handler(struct ctl_table *, int, void __user *,
 				    size_t *, loff_t *);
 extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
 				    size_t *, loff_t *);
+int set_page_dirty(struct page *page);
+int set_page_dirty_lock(struct page *page);
 
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 
@@ -943,6 +945,44 @@  static inline void put_page(struct page *page)
 		__put_page(page);
 }
 
+/* Placeholder version, until all get_user_pages*() callers are updated. */
+static inline void put_user_page(struct page *page)
+{
+	put_page(page);
+}
+
+/* For get_user_pages*()-pinned pages, use these variants instead of
+ * release_pages():
+ */
+static inline void put_user_pages_dirty(struct page **pages,
+					unsigned long npages)
+{
+	while (npages) {
+		set_page_dirty(pages[npages]);
+		put_user_page(pages[npages]);
+		--npages;
+	}
+}
+
+static inline void put_user_pages_dirty_lock(struct page **pages,
+					     unsigned long npages)
+{
+	while (npages) {
+		set_page_dirty_lock(pages[npages]);
+		put_user_page(pages[npages]);
+		--npages;
+	}
+}
+
+static inline void put_user_pages(struct page **pages,
+				  unsigned long npages)
+{
+	while (npages) {
+		put_user_page(pages[npages]);
+		--npages;
+	}
+}
+
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define SECTION_IN_PAGE_FLAGS
 #endif
@@ -1534,8 +1574,6 @@  int redirty_page_for_writepage(struct writeback_control *wbc,
 void account_page_dirtied(struct page *page, struct address_space *mapping);
 void account_page_cleaned(struct page *page, struct address_space *mapping,
 			  struct bdi_writeback *wb);
-int set_page_dirty(struct page *page);
-int set_page_dirty_lock(struct page *page);
 void __cancel_dirty_page(struct page *page);
 static inline void cancel_dirty_page(struct page *page)
 {