Message ID | b8b39849e171c120442963d3fd81c49a8f005bf0.1736488799.git-series.apopple@nvidia.com |
---|---|
State | New |
Series | fs/dax: Fix ZONE_DEVICE page reference counts |
Alistair Popple wrote:
> PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON.

I think a bit more detail is warranted, how about?

The page ->mapping pointer can have magic values like
PAGE_MAPPING_DAX_SHARED and PAGE_MAPPING_ANON for page owner specific
usage. In fact, PAGE_MAPPING_DAX_SHARED and PAGE_MAPPING_ANON alias the
same value.

> This isn't currently a problem because FS DAX pages are treated
> specially.

s/are treated specially/are never seen by the anonymous mapping code and
vice versa/

> However a future change will make FS DAX pages more like
> normal pages, so folio_test_anon() must not return true for a FS DAX
> page.
>
> We could explicitly test for a FS DAX page in folio_test_anon(),
> etc. however the PAGE_MAPPING_DAX_SHARED flag isn't actually
> needed. Instead we can use the page->mapping field to implicitly track
> the first mapping of a page. If page->mapping is non-NULL it implies
> the page is associated with a single mapping at page->index. If the
> page is associated with a second mapping clear page->mapping and set
> page->share to 1.
>
> This is possible because a shared mapping implies the file-system
> implements dax_holder_operations which makes the ->mapping and
> ->index, which is a union with ->share, unused.
>
> The page is considered shared when page->mapping == NULL and
> page->share > 0 or page->mapping != NULL, implying it is present in at
> least one address space. This also makes it easier for a future change
> to detect when a page is first mapped into an address space which
> requires special handling.
>
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> ---
>  fs/dax.c                   | 45 +++++++++++++++++++++++++--------------
>  include/linux/page-flags.h |  6 +-----
>  2 files changed, 29 insertions(+), 22 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 4e49cc4..d35dbe1 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -351,38 +351,41 @@ static unsigned long dax_end_pfn(void *entry)
>  	for (pfn = dax_to_pfn(entry); \
>  			pfn < dax_end_pfn(entry); pfn++)
>  
> +/*
> + * A DAX page is considered shared if it has no mapping set and ->share (which
> + * shares the ->index field) is non-zero. Note this may return false even if the
> + * page is shared between multiple files but has not yet actually been mapped
> + * into multiple address spaces.
> + */
>  static inline bool dax_page_is_shared(struct page *page)
>  {
> -	return page->mapping == PAGE_MAPPING_DAX_SHARED;
> +	return !page->mapping && page->share;
>  }
>  
>  /*
> - * Set the page->mapping with PAGE_MAPPING_DAX_SHARED flag, increase the
> - * refcount.
> + * Increase the page share refcount, warning if the page is not marked as shared.
>   */
>  static inline void dax_page_share_get(struct page *page)
>  {
> -	if (page->mapping != PAGE_MAPPING_DAX_SHARED) {
> -		/*
> -		 * Reset the index if the page was already mapped
> -		 * regularly before.
> -		 */
> -		if (page->mapping)
> -			page->share = 1;
> -		page->mapping = PAGE_MAPPING_DAX_SHARED;
> -	}
> +	WARN_ON_ONCE(!page->share);
> +	WARN_ON_ONCE(page->mapping);

Given the only caller of this function is dax_associate_entry() it seems
like overkill to check that a function only a few lines away manipulated
->mapping correctly.

I don't see much reason for dax_page_share_get() to exist after your
changes.

Perhaps all that is needed is a dax_make_shared() helper that does the
initial fiddling of '->mapping = NULL' and '->share = 1'?

>  	page->share++;
>  }
>  
>  static inline unsigned long dax_page_share_put(struct page *page)
>  {
> +	WARN_ON_ONCE(!page->share);
>  	return --page->share;
>  }
>  
>  /*
> - * When it is called in dax_insert_entry(), the shared flag will indicate that
> - * whether this entry is shared by multiple files. If so, set the page->mapping
> - * PAGE_MAPPING_DAX_SHARED, and use page->share as refcount.
> + * When it is called in dax_insert_entry(), the shared flag will indicate
> + * whether this entry is shared by multiple files. If the page has not
> + * previously been associated with any mappings the ->mapping and ->index
> + * fields will be set. If it has already been associated with a mapping
> + * the mapping will be cleared and the share count set. It's then up to the
> + * file-system to track which mappings contain which pages, ie. by implementing
> + * dax_holder_operations.

This feels like a good comment for a new dax_make_shared() not
dax_associate_entry().

I would also:

s/up to the file-system to track which mappings contain which pages, ie. by implementing
dax_holder_operations/up to reverse map users like memory_failure() to
call back into the filesystem to recover ->mapping and ->index
information/

>   */
>  static void dax_associate_entry(void *entry, struct address_space *mapping,
>  		struct vm_area_struct *vma, unsigned long address, bool shared)
> @@ -397,7 +400,17 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
>  	for_each_mapped_pfn(entry, pfn) {
>  		struct page *page = pfn_to_page(pfn);
>  
> -		if (shared) {
> +		if (shared && page->mapping && page->share) {

How does this case happen? I don't think any page would ever enter with
both ->mapping and ->share set, right?

If the file was mapped then reflinked then ->share should be zero at the
first mapping attempt. It might not be zero because it is aliased with
index until it is converted to a shared page.
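The aliasing concern raised at the end of this reply can be sketched in userspace C. This is only a model: struct model_page, model_share_is_valid() and model_read_share() are hypothetical names invented here, and the layout merely imitates the ->index / ->share union described in the commit message. It shows why a non-zero ->share says nothing while ->mapping is still set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct page fields discussed in the thread. */
struct model_page {
	void *mapping;
	union {
		unsigned long index;	/* file offset while ->mapping is set */
		unsigned long share;	/* share count once ->mapping is NULL */
	};
};

/* ->share is only meaningful after the page was converted to shared. */
static bool model_share_is_valid(const struct model_page *page)
{
	return page->mapping == NULL;
}

/* Read the share count, insisting that the conversion already happened. */
static unsigned long model_read_share(const struct model_page *page)
{
	assert(model_share_is_valid(page));
	return page->share;
}
```

With this layout, a page recorded at index 7 in a single mapping also reads back 7 through ->share, so any test of ->share has to be combined with a test of ->mapping.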
On 10.01.25 07:00, Alistair Popple wrote:
> PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON. This isn't
> currently a problem because FS DAX pages are treated
> specially. However a future change will make FS DAX pages more like
> normal pages, so folio_test_anon() must not return true for a FS DAX
> page.

Yes, very nice to see PAGE_MAPPING_DAX_SHARED go!
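To illustrate why the alias matters, here is a small userspace sketch. All MODEL_* names, struct model_folio and both functions are hypothetical; the only facts taken from the patch are that PAGE_MAPPING_DAX_SHARED was defined as ((void *)0x1) and that it equals PAGE_MAPPING_ANON. The anon test is modelled as a low-bit check on the mapping pointer, which is an assumption about how folio_test_anon() behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAPPING_ANON		0x1UL		/* models PAGE_MAPPING_ANON */
#define MODEL_MAPPING_DAX_SHARED	((void *)0x1)	/* value removed by the patch */

struct model_folio {
	void *mapping;
};

/* Models an anon test: checks the low bit of the mapping pointer. */
static bool model_test_anon(const struct model_folio *folio)
{
	return ((uintptr_t)folio->mapping & MODEL_MAPPING_ANON) != 0;
}

/* Real mapping pointers are aligned, so their low bit is always clear. */
static void model_set_file_mapping(struct model_folio *folio, void *mapping)
{
	assert(((uintptr_t)mapping & MODEL_MAPPING_ANON) == 0);
	folio->mapping = mapping;
}
```

In this model a folio whose mapping holds the old DAX shared marker passes the anon test, which is the misclassification the patch avoids by dropping the marker.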
On Mon, Jan 13, 2025 at 04:52:34PM -0800, Dan Williams wrote:
> Alistair Popple wrote:
> > PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON.
>
> I think a bit more detail is warranted, how about?
>
> The page ->mapping pointer can have magic values like
> PAGE_MAPPING_DAX_SHARED and PAGE_MAPPING_ANON for page owner specific
> usage. In fact, PAGE_MAPPING_DAX_SHARED and PAGE_MAPPING_ANON alias the
> same value.

Massaged it slightly but sounds good.

> > This isn't currently a problem because FS DAX pages are treated
> > specially.
>
> s/are treated specially/are never seen by the anonymous mapping code and
> vice versa/
>
> > However a future change will make FS DAX pages more like
> > normal pages, so folio_test_anon() must not return true for a FS DAX
> > page.
> >
> > We could explicitly test for a FS DAX page in folio_test_anon(),
> > etc. however the PAGE_MAPPING_DAX_SHARED flag isn't actually
> > needed. Instead we can use the page->mapping field to implicitly track
> > the first mapping of a page. If page->mapping is non-NULL it implies
> > the page is associated with a single mapping at page->index. If the
> > page is associated with a second mapping clear page->mapping and set
> > page->share to 1.
> >
> > This is possible because a shared mapping implies the file-system
> > implements dax_holder_operations which makes the ->mapping and
> > ->index, which is a union with ->share, unused.
> >
> > The page is considered shared when page->mapping == NULL and
> > page->share > 0 or page->mapping != NULL, implying it is present in at
> > least one address space. This also makes it easier for a future change
> > to detect when a page is first mapped into an address space which
> > requires special handling.
> >
> > Signed-off-by: Alistair Popple <apopple@nvidia.com>
> > ---
> >  fs/dax.c                   | 45 +++++++++++++++++++++++++--------------
> >  include/linux/page-flags.h |  6 +-----
> >  2 files changed, 29 insertions(+), 22 deletions(-)
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 4e49cc4..d35dbe1 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -351,38 +351,41 @@ static unsigned long dax_end_pfn(void *entry)
> >  	for (pfn = dax_to_pfn(entry); \
> >  			pfn < dax_end_pfn(entry); pfn++)
> >  
> > +/*
> > + * A DAX page is considered shared if it has no mapping set and ->share (which
> > + * shares the ->index field) is non-zero. Note this may return false even if the
> > + * page is shared between multiple files but has not yet actually been mapped
> > + * into multiple address spaces.
> > + */
> >  static inline bool dax_page_is_shared(struct page *page)
> >  {
> > -	return page->mapping == PAGE_MAPPING_DAX_SHARED;
> > +	return !page->mapping && page->share;
> >  }
> >  
> >  /*
> > - * Set the page->mapping with PAGE_MAPPING_DAX_SHARED flag, increase the
> > - * refcount.
> > + * Increase the page share refcount, warning if the page is not marked as shared.
> >   */
> >  static inline void dax_page_share_get(struct page *page)
> >  {
> > -	if (page->mapping != PAGE_MAPPING_DAX_SHARED) {
> > -		/*
> > -		 * Reset the index if the page was already mapped
> > -		 * regularly before.
> > -		 */
> > -		if (page->mapping)
> > -			page->share = 1;
> > -		page->mapping = PAGE_MAPPING_DAX_SHARED;
> > -	}
> > +	WARN_ON_ONCE(!page->share);
> > +	WARN_ON_ONCE(page->mapping);
>
> Given the only caller of this function is dax_associate_entry() it seems
> like overkill to check that a function only a few lines away manipulated
> ->mapping correctly.

Good call.

> I don't see much reason for dax_page_share_get() to exist after your
> changes.
>
> Perhaps all that is needed is a dax_make_shared() helper that does the
> initial fiddling of '->mapping = NULL' and '->share = 1'?

Ok. I was going to make the argument that dax_make_shared() was overkill as
well, but as noted below it's a good place to put the comment describing how
this all works so have done that.

> >  	page->share++;
> >  }
> >  
> >  static inline unsigned long dax_page_share_put(struct page *page)
> >  {
> > +	WARN_ON_ONCE(!page->share);
> >  	return --page->share;
> >  }
> >  
> >  /*
> > - * When it is called in dax_insert_entry(), the shared flag will indicate that
> > - * whether this entry is shared by multiple files. If so, set the page->mapping
> > - * PAGE_MAPPING_DAX_SHARED, and use page->share as refcount.
> > + * When it is called in dax_insert_entry(), the shared flag will indicate
> > + * whether this entry is shared by multiple files. If the page has not
> > + * previously been associated with any mappings the ->mapping and ->index
> > + * fields will be set. If it has already been associated with a mapping
> > + * the mapping will be cleared and the share count set. It's then up to the
> > + * file-system to track which mappings contain which pages, ie. by implementing
> > + * dax_holder_operations.
>
> This feels like a good comment for a new dax_make_shared() not
> dax_associate_entry().
>
> I would also:
>
> s/up to the file-system to track which mappings contain which pages, ie. by implementing
> dax_holder_operations/up to reverse map users like memory_failure() to
> call back into the filesystem to recover ->mapping and ->index
> information/

Sounds good, although I left a reference to dax_holder_operations in the comment
because it's not immediately obvious how file-systems do this currently and I
had to relearn that more times than I'd care to admit :-)

> >   */
> >  static void dax_associate_entry(void *entry, struct address_space *mapping,
> >  		struct vm_area_struct *vma, unsigned long address, bool shared)
> > @@ -397,7 +400,17 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
> >  	for_each_mapped_pfn(entry, pfn) {
> >  		struct page *page = pfn_to_page(pfn);
> >  
> > -		if (shared) {
> > +		if (shared && page->mapping && page->share) {
>
> How does this case happen? I don't think any page would ever enter with
> both ->mapping and ->share set, right?

Sigh. You're right - it can't. This patch series is getting a little bit large
and unwieldy with all the prerequisite bugfixes and cleanups. Obviously I fixed
this when developing the main fs dax count fixup but forgot to rebase the fix
further back in the series.

Anyway I have fixed that now, thanks.

> If the file was mapped then reflinked then ->share should be zero at the
> first mapping attempt. It might not be zero because it is aliased with
> index until it is converted to a shared page.
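For readers following the dax_make_shared() suggestion, a minimal userspace sketch of such a helper might look like the following. It is not the code that ended up in the series: struct model_page, struct model_mapping and model_dax_make_shared() are hypothetical names, and the assert() precondition is an assumption added here for illustration. The body only does the two assignments named in the review ('->mapping = NULL' and '->share = 1').

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types discussed in the thread. */
struct model_mapping {
	int id;
};

struct model_page {
	struct model_mapping *mapping;
	union {
		unsigned long index;	/* valid while ->mapping is set */
		unsigned long share;	/* valid once ->mapping is NULL */
	};
};

/*
 * Convert a page that is associated with exactly one mapping into a
 * shared page. The single-mapping record (->mapping and ->index) is
 * dropped, and the existing association becomes the first share
 * reference. Reverse-map users would then have to ask the file-system
 * for the ->mapping and ->index information.
 */
static void model_dax_make_shared(struct model_page *page)
{
	/* Assumed precondition: the page is not already shared. */
	assert(page->mapping != NULL);

	page->mapping = NULL;
	page->share = 1;
}
```

A caller associating a second mapping would invoke the helper once and then increment ->share, giving a count of 2 for two associations.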
Alistair Popple wrote:
[..]
> > How does this case happen? I don't think any page would ever enter with
> > both ->mapping and ->share set, right?
>
> Sigh. You're right - it can't. This patch series is getting a little bit large
> and unwieldy with all the prerequisite bugfixes and cleanups. Obviously I fixed
> this when developing the main fs dax count fixup but forgot to rebase the fix
> further back in the series.

I assumed as much when I got to that patch.

> Anyway I have fixed that now, thanks.

You deserve a large helping of grace for waking and then slaying this
old dragon.
On Tue, Jan 14, 2025 at 09:44:38PM -0800, Dan Williams wrote:
> Alistair Popple wrote:
> [..]
> > > How does this case happen? I don't think any page would ever enter with
> > > both ->mapping and ->share set, right?
> >
> > Sigh. You're right - it can't. This patch series is getting a little bit large
> > and unwieldy with all the prerequisite bugfixes and cleanups. Obviously I fixed
> > this when developing the main fs dax count fixup but forgot to rebase the fix
> > further back in the series.
>
> I assumed as much when I got to that patch.
>
> > Anyway I have fixed that now, thanks.
>
> You deserve a large helping of grace for waking and then slaying this
> old dragon.

Heh, thanks. Let's hope this dragon doesn't have too many more heads :-)
diff --git a/fs/dax.c b/fs/dax.c
index 4e49cc4..d35dbe1 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -351,38 +351,41 @@ static unsigned long dax_end_pfn(void *entry)
 	for (pfn = dax_to_pfn(entry); \
 			pfn < dax_end_pfn(entry); pfn++)
 
+/*
+ * A DAX page is considered shared if it has no mapping set and ->share (which
+ * shares the ->index field) is non-zero. Note this may return false even if the
+ * page is shared between multiple files but has not yet actually been mapped
+ * into multiple address spaces.
+ */
 static inline bool dax_page_is_shared(struct page *page)
 {
-	return page->mapping == PAGE_MAPPING_DAX_SHARED;
+	return !page->mapping && page->share;
 }
 
 /*
- * Set the page->mapping with PAGE_MAPPING_DAX_SHARED flag, increase the
- * refcount.
+ * Increase the page share refcount, warning if the page is not marked as shared.
  */
 static inline void dax_page_share_get(struct page *page)
 {
-	if (page->mapping != PAGE_MAPPING_DAX_SHARED) {
-		/*
-		 * Reset the index if the page was already mapped
-		 * regularly before.
-		 */
-		if (page->mapping)
-			page->share = 1;
-		page->mapping = PAGE_MAPPING_DAX_SHARED;
-	}
+	WARN_ON_ONCE(!page->share);
+	WARN_ON_ONCE(page->mapping);
 	page->share++;
 }
 
 static inline unsigned long dax_page_share_put(struct page *page)
 {
+	WARN_ON_ONCE(!page->share);
 	return --page->share;
 }
 
 /*
- * When it is called in dax_insert_entry(), the shared flag will indicate that
- * whether this entry is shared by multiple files. If so, set the page->mapping
- * PAGE_MAPPING_DAX_SHARED, and use page->share as refcount.
+ * When it is called in dax_insert_entry(), the shared flag will indicate
+ * whether this entry is shared by multiple files. If the page has not
+ * previously been associated with any mappings the ->mapping and ->index
+ * fields will be set. If it has already been associated with a mapping
+ * the mapping will be cleared and the share count set. It's then up to the
+ * file-system to track which mappings contain which pages, ie. by implementing
+ * dax_holder_operations.
  */
 static void dax_associate_entry(void *entry, struct address_space *mapping,
 		struct vm_area_struct *vma, unsigned long address, bool shared)
@@ -397,7 +400,17 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);
 
-		if (shared) {
+		if (shared && page->mapping && page->share) {
+			if (page->mapping) {
+				page->mapping = NULL;
+
+				/*
+				 * Page has already been mapped into one address
+				 * space so set the share count.
+				 */
+				page->share = 1;
+			}
+
 			dax_page_share_get(page);
 		} else {
 			WARN_ON_ONCE(page->mapping);
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 691506b..598334e 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -668,12 +668,6 @@ PAGEFLAG_FALSE(VmemmapSelfHosted, vmemmap_self_hosted)
 #define PAGE_MAPPING_KSM	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)
 #define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)
 
-/*
- * Different with flags above, this flag is used only for fsdax mode. It
- * indicates that this page->mapping is now under reflink case.
- */
-#define PAGE_MAPPING_DAX_SHARED	((void *)0x1)
-
 static __always_inline bool folio_mapping_flags(const struct folio *folio)
 {
 	return ((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) != 0;
PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON. This isn't
currently a problem because FS DAX pages are treated
specially. However a future change will make FS DAX pages more like
normal pages, so folio_test_anon() must not return true for a FS DAX
page.

We could explicitly test for a FS DAX page in folio_test_anon(),
etc. however the PAGE_MAPPING_DAX_SHARED flag isn't actually
needed. Instead we can use the page->mapping field to implicitly track
the first mapping of a page. If page->mapping is non-NULL it implies
the page is associated with a single mapping at page->index. If the
page is associated with a second mapping clear page->mapping and set
page->share to 1.

This is possible because a shared mapping implies the file-system
implements dax_holder_operations which makes the ->mapping and
->index, which is a union with ->share, unused.

The page is considered shared when page->mapping == NULL and
page->share > 0 or page->mapping != NULL, implying it is present in at
least one address space. This also makes it easier for a future change
to detect when a page is first mapped into an address space which
requires special handling.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
 fs/dax.c                   | 45 +++++++++++++++++++++++++--------------
 include/linux/page-flags.h |  6 +-----
 2 files changed, 29 insertions(+), 22 deletions(-)
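The tracking scheme described in the commit message can be walked through with a small userspace model. This is a sketch under stated assumptions, not the kernel code: every model_* name is hypothetical, and the branch condition "shared && (mapping || share)" is an assumption about what the corrected check looks like, since the posted hunk tests "shared && page->mapping && page->share" and the review thread notes that case cannot occur.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct model_mapping {
	int id;
};

struct model_page {
	struct model_mapping *mapping;
	union {
		unsigned long index;
		unsigned long share;
	};
};

/* Mirrors dax_page_is_shared() from the patch. */
static bool model_page_is_shared(const struct model_page *page)
{
	return !page->mapping && page->share;
}

/* Models one association of a page with a mapping at a given index. */
static void model_associate(struct model_page *page,
			    struct model_mapping *mapping,
			    unsigned long index, bool shared)
{
	/* ASSUMED condition; the posted patch differs (see lead-in). */
	if (shared && (page->mapping || page->share)) {
		if (page->mapping) {
			/* Second association: drop the single-mapping record. */
			page->mapping = NULL;
			page->share = 1;
		}
		page->share++;
	} else {
		/* First association: record it in ->mapping and ->index. */
		page->mapping = mapping;
		page->index = index;
	}
}

/* Mirrors dax_page_share_put(); assert() stands in for WARN_ON_ONCE(). */
static unsigned long model_share_put(struct model_page *page)
{
	assert(page->share > 0);
	return --page->share;
}
```

In this model the first association leaves the page unshared with ->mapping and ->index set, the second clears ->mapping and leaves ->share at 2, and two puts bring the count back to zero.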