[v7,16/21] block: add check when merging zone device pages

Message ID 20220615161233.17527-17-logang@deltatee.com (mailing list archive)
State New, archived
Series Userspace P2PDMA with O_DIRECT NVMe devices

Commit Message

Logan Gunthorpe June 15, 2022, 4:12 p.m. UTC
Consecutive zone device pages should not be merged into the same sgl
or bvec segment with other types of pages or if they belong to different
pgmaps. Otherwise getting the pgmap of a given segment is not possible
without scanning the entire segment. This helper returns true either if
both pages are not zone device pages or both pages are zone device
pages with the same pgmap.

Add a helper to determine if zone device pages are mergeable and use
this helper in page_is_mergeable().

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/bio.c        |  2 ++
 include/linux/mm.h | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+)

Comments

Christoph Hellwig June 29, 2022, 6:46 a.m. UTC | #1
On Wed, Jun 15, 2022 at 10:12:28AM -0600, Logan Gunthorpe wrote:
> Consecutive zone device pages should not be merged into the same sgl
> or bvec segment with other types of pages or if they belong to different
> pgmaps. Otherwise getting the pgmap of a given segment is not possible
> without scanning the entire segment. This helper returns true either if
> both pages are not zone device pages or both pages are zone device
> pages with the same pgmap.
> 
> Add a helper to determine if zone device pages are mergeable and use
> this helper in page_is_mergeable().

Any reason not to simply set REQ_NOMERGE for these requests?  We
can't merge for passthrough requests anyway, and generally don't merge
for direct I/O either, so adding all this overhead seems a bit pointless.

Logan Gunthorpe June 29, 2022, 4:06 p.m. UTC | #2
On 2022-06-29 00:46, Christoph Hellwig wrote:
> On Wed, Jun 15, 2022 at 10:12:28AM -0600, Logan Gunthorpe wrote:
>> Consecutive zone device pages should not be merged into the same sgl
>> or bvec segment with other types of pages or if they belong to different
>> pgmaps. Otherwise getting the pgmap of a given segment is not possible
>> without scanning the entire segment. This helper returns true either if
>> both pages are not zone device pages or both pages are zone device
>> pages with the same pgmap.
>>
>> Add a helper to determine if zone device pages are mergeable and use
>> this helper in page_is_mergeable().
> 
> Any reason not to simply set REQ_NOMERGE for these requests?  We
> can't merge for passthrough requests anyway, and generally don't merge
> for direct I/O either, so adding all this overhead seems a bit pointless.

Hmm, I suppose we could also ensure that REQ_NOMERGE is set in a bio
before setting FOLL_PCI_P2PDMA in bio_map_user_iov() and
__bio_iov_iter_get_pages(). Assuming it's always set for any direct I/O.

I'll look into it.

Logan

Logan Gunthorpe June 30, 2022, 9:50 p.m. UTC | #3
On 2022-06-29 10:06, Logan Gunthorpe wrote:
>
> On 2022-06-29 00:46, Christoph Hellwig wrote:
>> On Wed, Jun 15, 2022 at 10:12:28AM -0600, Logan Gunthorpe wrote:
>>> Consecutive zone device pages should not be merged into the same sgl
>>> or bvec segment with other types of pages or if they belong to different
>>> pgmaps. Otherwise getting the pgmap of a given segment is not possible
>>> without scanning the entire segment. This helper returns true either if
>>> both pages are not zone device pages or both pages are zone device
>>> pages with the same pgmap.
>>>
>>> Add a helper to determine if zone device pages are mergeable and use
>>> this helper in page_is_mergeable().
>>
>> Any reason not to simply set REQ_NOMERGE for these requests?  We
>> can't merge for passthrough requests anyway, and generally don't merge
>> for direct I/O either, so adding all this overhead seems a bit pointless.
> 
> Hmm, I suppose we could also ensure that REQ_NOMERGE is set in a bio
> before setting FOLL_PCI_P2PDMA in bio_map_user_iov() and
> __bio_iov_iter_get_pages(). Assuming it's always set for any direct I/O.
> 

Oh, it turns out this code has nothing to do with REQ_NOMERGE. It's used
indirectly in bio_map_user_iov() and __bio_iov_iter_get_pages() when
adding pages to the bio via page_is_mergeable(). So it's not about
requests being merged; it's about pages being merged.

So I'm not sure how we can avoid this. It only happens when two
adjacent pages are added to the same bio in a row, so I don't think it's
that common, and the check can probably be moved down so it happens
after the same_page check to make it a little less common.

Logan

Christoph Hellwig July 4, 2022, 6:07 a.m. UTC | #4
On Thu, Jun 30, 2022 at 03:50:10PM -0600, Logan Gunthorpe wrote:
> Oh, it turns out this code has nothing to do with REQ_NOMERGE. It's used
> indirectly in bio_map_user_iov() and __bio_iov_iter_get_pages() when
> adding pages to the bio via page_is_mergeable(). So it's not about
> requests being merged it's about pages being merged.

Oh, true.

> So I'm not sure how we can avoid this, but it only happens when two
> adjacent pages are added to the same bio in a row, so I don't think it's
> that common, but the check can probably be moved down so it happens
> after the same_page check to make it a little less common.

Yes, looks like we have to keep it.
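
For readers following the thread, the merge rule described in the commit
message can be modelled in userspace. The sketch below is illustrative
only: struct page and struct dev_pagemap here are simplified, hypothetical
stand-ins rather than the kernel definitions, and the zone_device flag
replaces the kernel's page_zonenum() comparison. Only the body of
zone_device_pages_have_same_pgmap() follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct dev_pagemap {
	int id;
};

struct page {
	bool zone_device;          /* models the result of is_zone_device_page() */
	struct dev_pagemap *pgmap; /* only meaningful for zone device pages */
};

static bool is_zone_device_page(const struct page *page)
{
	return page->zone_device;
}

/* Same decision logic as the helper added by the patch. */
static bool zone_device_pages_have_same_pgmap(const struct page *a,
					      const struct page *b)
{
	/* One zone device page and one ordinary page: never mergeable. */
	if (is_zone_device_page(a) != is_zone_device_page(b))
		return false;
	/* Neither page is a zone device page: this check does not object. */
	if (!is_zone_device_page(a))
		return true;
	/* Both are zone device pages: mergeable only within one pgmap. */
	return a->pgmap == b->pgmap;
}

/* Walks the four cases named in the commit message. */
static void check_merge_rules(void)
{
	struct dev_pagemap map1 = { 1 }, map2 = { 2 };
	struct page ram_a = { false, NULL }, ram_b = { false, NULL };
	struct page dev_a = { true, &map1 }, dev_b = { true, &map1 };
	struct page dev_c = { true, &map2 };

	assert(zone_device_pages_have_same_pgmap(&ram_a, &ram_b));
	assert(!zone_device_pages_have_same_pgmap(&ram_a, &dev_a));
	assert(zone_device_pages_have_same_pgmap(&dev_a, &dev_b));
	assert(!zone_device_pages_have_same_pgmap(&dev_a, &dev_c));
}
```

Calling check_merge_rules() from a test program exercises all four cases.
Because the helper only compares the previous page with the new one, a
segment built under this rule holds pages of a single kind and a single
pgmap, which is what lets a caller read the pgmap from any one page of
the segment without scanning the rest.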

Patch

diff --git a/block/bio.c b/block/bio.c
index f92d0223247b..a402a4760457 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -865,6 +865,8 @@  static inline bool page_is_mergeable(const struct bio_vec *bv,
 		return false;
 	if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
 		return false;
+	if (!zone_device_pages_have_same_pgmap(bv->bv_page, page))
+		return false;
 
 	*same_page = ((vec_end_addr & PAGE_MASK) == page_addr);
 	if (*same_page)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0bcb54ea503c..33b2f4d9fd0a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1108,6 +1108,24 @@  static inline bool is_zone_device_page(const struct page *page)
 {
 	return page_zonenum(page) == ZONE_DEVICE;
 }
+
+/*
+ * Consecutive zone device pages should not be merged into the same sgl
+ * or bvec segment with other types of pages or if they belong to different
+ * pgmaps. Otherwise getting the pgmap of a given segment is not possible
+ * without scanning the entire segment. This helper returns true either if
+ * both pages are not zone device pages or both pages are zone device pages
+ * with the same pgmap.
+ */
+static inline bool zone_device_pages_have_same_pgmap(const struct page *a,
+						     const struct page *b)
+{
+	if (is_zone_device_page(a) != is_zone_device_page(b))
+		return false;
+	if (!is_zone_device_page(a))
+		return true;
+	return a->pgmap == b->pgmap;
+}
 extern void memmap_init_zone_device(struct zone *, unsigned long,
 				    unsigned long, struct dev_pagemap *);
 #else
@@ -1115,6 +1133,11 @@  static inline bool is_zone_device_page(const struct page *page)
 {
 	return false;
 }
+static inline bool zone_device_pages_have_same_pgmap(const struct page *a,
+						     const struct page *b)
+{
+	return true;
+}
 #endif
 
 static inline bool folio_is_zone_device(const struct folio *folio)