diff mbox series

[v2,1/3] mm: document zone device struct page field usage

Message ID 20190719192955.30462-2-rcampbell@nvidia.com (mailing list archive)
State New, archived
Headers show
Series mm/hmm: fixes for device private page migration | expand

Commit Message

Ralph Campbell July 19, 2019, 7:29 p.m. UTC
Struct page for ZONE_DEVICE private pages uses the page->mapping and
and page->index fields while the source anonymous pages are migrated to
device private memory. This is so rmap_walk() can find the page when
migrating the ZONE_DEVICE private page back to system memory.
ZONE_DEVICE pmem backed fsdax pages also use the page->mapping and
page->index fields when files are mapped into a process address space.

Restructure struct page and add comments to make this more clear.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/mm_types.h | 42 +++++++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 13 deletions(-)

Comments

John Hubbard July 19, 2019, 9:43 p.m. UTC | #1
On 7/19/19 12:29 PM, Ralph Campbell wrote:
> Struct page for ZONE_DEVICE private pages uses the page->mapping and
> and page->index fields while the source anonymous pages are migrated to
> device private memory. This is so rmap_walk() can find the page when
> migrating the ZONE_DEVICE private page back to system memory.
> ZONE_DEVICE pmem backed fsdax pages also use the page->mapping and
> page->index fields when files are mapped into a process address space.
> 
> Restructure struct page and add comments to make this more clear.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> ---
>  include/linux/mm_types.h | 42 +++++++++++++++++++++++++++-------------
>  1 file changed, 29 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3a37a89eb7a7..f6c52e44d40c 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -76,13 +76,35 @@ struct page {
>  	 * avoid collision and false-positive PageTail().
>  	 */
>  	union {
> -		struct {	/* Page cache and anonymous pages */
> -			/**
> -			 * @lru: Pageout list, eg. active_list protected by
> -			 * pgdat->lru_lock.  Sometimes used as a generic list
> -			 * by the page owner.
> -			 */
> -			struct list_head lru;
> +		struct {	/* Page cache, anonymous, ZONE_DEVICE pages */
> +			union {
> +				/**
> +				 * @lru: Pageout list, e.g., active_list
> +				 * protected by pgdat->lru_lock. Sometimes
> +				 * used as a generic list by the page owner.
> +				 */
> +				struct list_head lru;
> +				/**
> +				 * ZONE_DEVICE pages are never on the lru
> +				 * list so they reuse the list space.
> +				 * ZONE_DEVICE private pages are counted as
> +				 * being mapped so the @mapping and @index
> +				 * fields are used while the page is migrated
> +				 * to device private memory.
> +				 * ZONE_DEVICE MEMORY_DEVICE_FS_DAX pages also
> +				 * use the @mapping and @index fields when pmem
> +				 * backed DAX files are mapped.
> +				 */
> +				struct {
> +					/**
> +					 * @pgmap: Points to the hosting
> +					 * device page map.
> +					 */
> +					struct dev_pagemap *pgmap;
> +					/** @zone_device_data: opaque data. */

This is nice, and I think it's a solid step forward in documentation via
clearer code. I recall a number of email, and even face to face discussions,
in which the statement kept coming up: "remember, the ZONE_DEVICE pages use
mapping and index, too. Actually, that reminds me: page->private is often
in use in that case, too, so ZONE_DEVICE couldn't use that, either. I don't
think we need to explicitly say that, though, with this new layout.

nit: the above comment can be deleted, because it merely echoes the actual
code that directly follows it.

Either way, you can add:

	Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
Randy Dunlap July 21, 2019, 3:05 p.m. UTC | #2
Hi,

On 7/19/19 12:29 PM, Ralph Campbell wrote:
> Struct page for ZONE_DEVICE private pages uses the page->mapping and
> and page->index fields while the source anonymous pages are migrated to
> device private memory. This is so rmap_walk() can find the page when
> migrating the ZONE_DEVICE private page back to system memory.
> ZONE_DEVICE pmem backed fsdax pages also use the page->mapping and
> page->index fields when files are mapped into a process address space.
> 
> Restructure struct page and add comments to make this more clear.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> ---
>  include/linux/mm_types.h | 42 +++++++++++++++++++++++++++-------------
>  1 file changed, 29 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3a37a89eb7a7..f6c52e44d40c 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -76,13 +76,35 @@ struct page {
>  	 * avoid collision and false-positive PageTail().
>  	 */
>  	union {
> -		struct {	/* Page cache and anonymous pages */
> -			/**
> -			 * @lru: Pageout list, eg. active_list protected by
> -			 * pgdat->lru_lock.  Sometimes used as a generic list
> -			 * by the page owner.
> -			 */
> -			struct list_head lru;
> +		struct {	/* Page cache, anonymous, ZONE_DEVICE pages */
> +			union {
> +				/**
> +				 * @lru: Pageout list, e.g., active_list
> +				 * protected by pgdat->lru_lock. Sometimes
> +				 * used as a generic list by the page owner.
> +				 */
> +				struct list_head lru;

Did you run this through 'make htmldocs' or anything similar?
The reason I ask is that the "/**" comment below is not in kernel-doc format AFAICT.
I would expect an error or warning, but I haven't tested it.

Thanks.

> +				/**
> +				 * ZONE_DEVICE pages are never on the lru
> +				 * list so they reuse the list space.
> +				 * ZONE_DEVICE private pages are counted as
> +				 * being mapped so the @mapping and @index
> +				 * fields are used while the page is migrated
> +				 * to device private memory.
> +				 * ZONE_DEVICE MEMORY_DEVICE_FS_DAX pages also
> +				 * use the @mapping and @index fields when pmem
> +				 * backed DAX files are mapped.
> +				 */
> +				struct {
> +					/**
> +					 * @pgmap: Points to the hosting
> +					 * device page map.
> +					 */
> +					struct dev_pagemap *pgmap;
> +					/** @zone_device_data: opaque data. */
> +					void *zone_device_data;
> +				};
> +			};
>  			/* See page-flags.h for PAGE_MAPPING_FLAGS */
>  			struct address_space *mapping;
>  			pgoff_t index;		/* Our offset within mapping. */
> @@ -155,12 +177,6 @@ struct page {
>  			spinlock_t ptl;
>  #endif
>  		};
> -		struct {	/* ZONE_DEVICE pages */
> -			/** @pgmap: Points to the hosting device page map. */
> -			struct dev_pagemap *pgmap;
> -			void *zone_device_data;
> -			unsigned long _zd_pad_1;	/* uses mapping */
> -		};
>  
>  		/** @rcu_head: You can use this to free a page by RCU. */
>  		struct rcu_head rcu_head;
>
Matthew Wilcox July 21, 2019, 4:02 p.m. UTC | #3
On Fri, Jul 19, 2019 at 12:29:53PM -0700, Ralph Campbell wrote:
> Struct page for ZONE_DEVICE private pages uses the page->mapping and
> and page->index fields while the source anonymous pages are migrated to
> device private memory. This is so rmap_walk() can find the page when
> migrating the ZONE_DEVICE private page back to system memory.
> ZONE_DEVICE pmem backed fsdax pages also use the page->mapping and
> page->index fields when files are mapped into a process address space.
> 
> Restructure struct page and add comments to make this more clear.

NAK.  I just got rid of this kind of foolishness from struct page,
and you're making it harder to understand, not easier.  The comments
could be improved, but don't lay it out like this again.

> @@ -76,13 +76,35 @@ struct page {
>  	 * avoid collision and false-positive PageTail().
>  	 */
>  	union {
> -		struct {	/* Page cache and anonymous pages */
> -			/**
> -			 * @lru: Pageout list, eg. active_list protected by
> -			 * pgdat->lru_lock.  Sometimes used as a generic list
> -			 * by the page owner.
> -			 */
> -			struct list_head lru;
> +		struct {	/* Page cache, anonymous, ZONE_DEVICE pages */
> +			union {
> +				/**
> +				 * @lru: Pageout list, e.g., active_list
> +				 * protected by pgdat->lru_lock. Sometimes
> +				 * used as a generic list by the page owner.
> +				 */
> +				struct list_head lru;
> +				/**
> +				 * ZONE_DEVICE pages are never on the lru
> +				 * list so they reuse the list space.
> +				 * ZONE_DEVICE private pages are counted as
> +				 * being mapped so the @mapping and @index
> +				 * fields are used while the page is migrated
> +				 * to device private memory.
> +				 * ZONE_DEVICE MEMORY_DEVICE_FS_DAX pages also
> +				 * use the @mapping and @index fields when pmem
> +				 * backed DAX files are mapped.
> +				 */
> +				struct {
> +					/**
> +					 * @pgmap: Points to the hosting
> +					 * device page map.
> +					 */
> +					struct dev_pagemap *pgmap;
> +					/** @zone_device_data: opaque data. */
> +					void *zone_device_data;
> +				};
> +			};
>  			/* See page-flags.h for PAGE_MAPPING_FLAGS */
>  			struct address_space *mapping;
>  			pgoff_t index;		/* Our offset within mapping. */
> @@ -155,12 +177,6 @@ struct page {
>  			spinlock_t ptl;
>  #endif
>  		};
> -		struct {	/* ZONE_DEVICE pages */
> -			/** @pgmap: Points to the hosting device page map. */
> -			struct dev_pagemap *pgmap;
> -			void *zone_device_data;
> -			unsigned long _zd_pad_1;	/* uses mapping */
> -		};
>  
>  		/** @rcu_head: You can use this to free a page by RCU. */
>  		struct rcu_head rcu_head;
> -- 
> 2.20.1
>
Ira Weiny July 22, 2019, 5:13 a.m. UTC | #4
On Sun, Jul 21, 2019 at 09:02:04AM -0700, Matthew Wilcox wrote:
> On Fri, Jul 19, 2019 at 12:29:53PM -0700, Ralph Campbell wrote:
> > Struct page for ZONE_DEVICE private pages uses the page->mapping and
> > and page->index fields while the source anonymous pages are migrated to
> > device private memory. This is so rmap_walk() can find the page when
> > migrating the ZONE_DEVICE private page back to system memory.
> > ZONE_DEVICE pmem backed fsdax pages also use the page->mapping and
> > page->index fields when files are mapped into a process address space.
> > 
> > Restructure struct page and add comments to make this more clear.
> 
> NAK.  I just got rid of this kind of foolishness from struct page,
> and you're making it harder to understand, not easier.  The comments
> could be improved, but don't lay it out like this again.

Was V1 of Ralphs patch ok?  It seemed ok to me.

Ira

> 
> > @@ -76,13 +76,35 @@ struct page {
> >  	 * avoid collision and false-positive PageTail().
> >  	 */
> >  	union {
> > -		struct {	/* Page cache and anonymous pages */
> > -			/**
> > -			 * @lru: Pageout list, eg. active_list protected by
> > -			 * pgdat->lru_lock.  Sometimes used as a generic list
> > -			 * by the page owner.
> > -			 */
> > -			struct list_head lru;
> > +		struct {	/* Page cache, anonymous, ZONE_DEVICE pages */
> > +			union {
> > +				/**
> > +				 * @lru: Pageout list, e.g., active_list
> > +				 * protected by pgdat->lru_lock. Sometimes
> > +				 * used as a generic list by the page owner.
> > +				 */
> > +				struct list_head lru;
> > +				/**
> > +				 * ZONE_DEVICE pages are never on the lru
> > +				 * list so they reuse the list space.
> > +				 * ZONE_DEVICE private pages are counted as
> > +				 * being mapped so the @mapping and @index
> > +				 * fields are used while the page is migrated
> > +				 * to device private memory.
> > +				 * ZONE_DEVICE MEMORY_DEVICE_FS_DAX pages also
> > +				 * use the @mapping and @index fields when pmem
> > +				 * backed DAX files are mapped.
> > +				 */
> > +				struct {
> > +					/**
> > +					 * @pgmap: Points to the hosting
> > +					 * device page map.
> > +					 */
> > +					struct dev_pagemap *pgmap;
> > +					/** @zone_device_data: opaque data. */
> > +					void *zone_device_data;
> > +				};
> > +			};
> >  			/* See page-flags.h for PAGE_MAPPING_FLAGS */
> >  			struct address_space *mapping;
> >  			pgoff_t index;		/* Our offset within mapping. */
> > @@ -155,12 +177,6 @@ struct page {
> >  			spinlock_t ptl;
> >  #endif
> >  		};
> > -		struct {	/* ZONE_DEVICE pages */
> > -			/** @pgmap: Points to the hosting device page map. */
> > -			struct dev_pagemap *pgmap;
> > -			void *zone_device_data;
> > -			unsigned long _zd_pad_1;	/* uses mapping */
> > -		};
> >  
> >  		/** @rcu_head: You can use this to free a page by RCU. */
> >  		struct rcu_head rcu_head;
> > -- 
> > 2.20.1
> > 
>
Christoph Hellwig July 22, 2019, 9:36 a.m. UTC | #5
On Sun, Jul 21, 2019 at 09:02:04AM -0700, Matthew Wilcox wrote:
> On Fri, Jul 19, 2019 at 12:29:53PM -0700, Ralph Campbell wrote:
> > Struct page for ZONE_DEVICE private pages uses the page->mapping and
> > and page->index fields while the source anonymous pages are migrated to
> > device private memory. This is so rmap_walk() can find the page when
> > migrating the ZONE_DEVICE private page back to system memory.
> > ZONE_DEVICE pmem backed fsdax pages also use the page->mapping and
> > page->index fields when files are mapped into a process address space.
> > 
> > Restructure struct page and add comments to make this more clear.
> 
> NAK.  I just got rid of this kind of foolishness from struct page,
> and you're making it harder to understand, not easier.  The comments
> could be improved, but don't lay it out like this again.

This comes over pretty agressive.  Please explain how making the
layout match how the code actually is used vs the previous separation
that is actively misleading and confused multiple people is "foolishness".
Christoph Hellwig July 22, 2019, 9:37 a.m. UTC | #6
Looks good modulo any potential kerneldoc issues for which I'm not
the expert:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Matthew Wilcox July 22, 2019, 11:08 a.m. UTC | #7
On Sun, Jul 21, 2019 at 10:13:45PM -0700, Ira Weiny wrote:
> On Sun, Jul 21, 2019 at 09:02:04AM -0700, Matthew Wilcox wrote:
> > On Fri, Jul 19, 2019 at 12:29:53PM -0700, Ralph Campbell wrote:
> > > Struct page for ZONE_DEVICE private pages uses the page->mapping and
> > > and page->index fields while the source anonymous pages are migrated to
> > > device private memory. This is so rmap_walk() can find the page when
> > > migrating the ZONE_DEVICE private page back to system memory.
> > > ZONE_DEVICE pmem backed fsdax pages also use the page->mapping and
> > > page->index fields when files are mapped into a process address space.
> > > 
> > > Restructure struct page and add comments to make this more clear.
> > 
> > NAK.  I just got rid of this kind of foolishness from struct page,
> > and you're making it harder to understand, not easier.  The comments
> > could be improved, but don't lay it out like this again.
> 
> Was V1 of Ralphs patch ok?  It seemed ok to me.

Yes, v1 was fine.  This seems like a regression.
Ralph Campbell July 23, 2019, 9:25 p.m. UTC | #8
On 7/22/19 4:08 AM, Matthew Wilcox wrote:
> On Sun, Jul 21, 2019 at 10:13:45PM -0700, Ira Weiny wrote:
>> On Sun, Jul 21, 2019 at 09:02:04AM -0700, Matthew Wilcox wrote:
>>> On Fri, Jul 19, 2019 at 12:29:53PM -0700, Ralph Campbell wrote:
>>>> Struct page for ZONE_DEVICE private pages uses the page->mapping and
>>>> and page->index fields while the source anonymous pages are migrated to
>>>> device private memory. This is so rmap_walk() can find the page when
>>>> migrating the ZONE_DEVICE private page back to system memory.
>>>> ZONE_DEVICE pmem backed fsdax pages also use the page->mapping and
>>>> page->index fields when files are mapped into a process address space.
>>>>
>>>> Restructure struct page and add comments to make this more clear.
>>>
>>> NAK.  I just got rid of this kind of foolishness from struct page,
>>> and you're making it harder to understand, not easier.  The comments
>>> could be improved, but don't lay it out like this again.
>>
>> Was V1 of Ralphs patch ok?  It seemed ok to me.
> 
> Yes, v1 was fine.  This seems like a regression.
> 

This is about what people find "easiest to understand" and so
I'm not surprised that opinions differ.
What if I post a v3 based on v1 but remove the _zd_pad_* variables
that Christoph found misleading and add some more comments
about how the different ZONE_DEVICE types use the 3 remaining
words (basically the comment from v2)?
Ira Weiny July 23, 2019, 9:39 p.m. UTC | #9
> 
> On 7/22/19 4:08 AM, Matthew Wilcox wrote:
> > On Sun, Jul 21, 2019 at 10:13:45PM -0700, Ira Weiny wrote:
> >> On Sun, Jul 21, 2019 at 09:02:04AM -0700, Matthew Wilcox wrote:
> >>> On Fri, Jul 19, 2019 at 12:29:53PM -0700, Ralph Campbell wrote:
> >>>> Struct page for ZONE_DEVICE private pages uses the page->mapping
> >>>> and and page->index fields while the source anonymous pages are
> >>>> migrated to device private memory. This is so rmap_walk() can find
> >>>> the page when migrating the ZONE_DEVICE private page back to system
> memory.
> >>>> ZONE_DEVICE pmem backed fsdax pages also use the page->mapping
> and
> >>>> page->index fields when files are mapped into a process address space.
> >>>>
> >>>> Restructure struct page and add comments to make this more clear.
> >>>
> >>> NAK.  I just got rid of this kind of foolishness from struct page,
> >>> and you're making it harder to understand, not easier.  The comments
> >>> could be improved, but don't lay it out like this again.
> >>
> >> Was V1 of Ralphs patch ok?  It seemed ok to me.
> >
> > Yes, v1 was fine.  This seems like a regression.
> >
> 
> This is about what people find "easiest to understand" and so I'm not
> surprised that opinions differ.
> What if I post a v3 based on v1 but remove the _zd_pad_* variables that
> Christoph found misleading and add some more comments about how the
> different ZONE_DEVICE types use the 3 remaining words (basically the
> comment from v2)?

I'm ok with that...

Ira
diff mbox series

Patch

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3a37a89eb7a7..f6c52e44d40c 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -76,13 +76,35 @@  struct page {
 	 * avoid collision and false-positive PageTail().
 	 */
 	union {
-		struct {	/* Page cache and anonymous pages */
-			/**
-			 * @lru: Pageout list, eg. active_list protected by
-			 * pgdat->lru_lock.  Sometimes used as a generic list
-			 * by the page owner.
-			 */
-			struct list_head lru;
+		struct {	/* Page cache, anonymous, ZONE_DEVICE pages */
+			union {
+				/**
+				 * @lru: Pageout list, e.g., active_list
+				 * protected by pgdat->lru_lock. Sometimes
+				 * used as a generic list by the page owner.
+				 */
+				struct list_head lru;
+				/**
+				 * ZONE_DEVICE pages are never on the lru
+				 * list so they reuse the list space.
+				 * ZONE_DEVICE private pages are counted as
+				 * being mapped so the @mapping and @index
+				 * fields are used while the page is migrated
+				 * to device private memory.
+				 * ZONE_DEVICE MEMORY_DEVICE_FS_DAX pages also
+				 * use the @mapping and @index fields when pmem
+				 * backed DAX files are mapped.
+				 */
+				struct {
+					/**
+					 * @pgmap: Points to the hosting
+					 * device page map.
+					 */
+					struct dev_pagemap *pgmap;
+					/** @zone_device_data: opaque data. */
+					void *zone_device_data;
+				};
+			};
 			/* See page-flags.h for PAGE_MAPPING_FLAGS */
 			struct address_space *mapping;
 			pgoff_t index;		/* Our offset within mapping. */
@@ -155,12 +177,6 @@  struct page {
 			spinlock_t ptl;
 #endif
 		};
-		struct {	/* ZONE_DEVICE pages */
-			/** @pgmap: Points to the hosting device page map. */
-			struct dev_pagemap *pgmap;
-			void *zone_device_data;
-			unsigned long _zd_pad_1;	/* uses mapping */
-		};
 
 		/** @rcu_head: You can use this to free a page by RCU. */
 		struct rcu_head rcu_head;