diff mbox series

[v20,29/32] block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted logic

Message ID 20230519074047.1739879-30-dhowells@redhat.com (mailing list archive)
State New
Series splice, block: Use page pinning and kill ITER_PIPE

Commit Message

David Howells May 19, 2023, 7:40 a.m. UTC
From: Christoph Hellwig <hch@lst.de>

Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted
meaning and is only set when a page reference has been acquired that needs
to be released by bio_release_pages().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Jan Kara <jack@suse.cz>
cc: Matthew Wilcox <willy@infradead.org>
cc: Logan Gunthorpe <logang@deltatee.com>
cc: linux-block@vger.kernel.org
---

Notes:
    ver #8)
     - Split out from another patch [hch].
     - Don't default to BIO_PAGE_REFFED [hch].
    
    ver #5)
     - Split from patch that uses iov_iter_extract_pages().

 block/bio.c               | 2 +-
 block/blk-map.c           | 1 +
 fs/direct-io.c            | 2 ++
 fs/iomap/direct-io.c      | 1 -
 include/linux/bio.h       | 2 +-
 include/linux/blk_types.h | 2 +-
 6 files changed, 6 insertions(+), 4 deletions(-)

Comments

Kent Overstreet May 20, 2023, 1:26 a.m. UTC | #1
On Fri, May 19, 2023 at 08:40:44AM +0100, David Howells wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted
> meaning and is only set when a page reference has been acquired that needs
> to be released by bio_release_pages().

What was the motivation for this patch?

Christoph Hellwig May 20, 2023, 3:56 a.m. UTC | #2
On Fri, May 19, 2023 at 09:26:07PM -0400, Kent Overstreet wrote:
> On Fri, May 19, 2023 at 08:40:44AM +0100, David Howells wrote:
> > From: Christoph Hellwig <hch@lst.de>
> > 
> > Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted
> > meaning and is only set when a page reference has been acquired that needs
> > to be released by bio_release_pages().
> 
> What was the motivation for this patch?

So that it is only set when we need to release a page, instead of telling
code to not release it when it otherwise would, where "otherwise would"
is implicit and undocumented and changes in this series.
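
To make the polarity change concrete, here is a minimal user-space sketch
(not kernel code).  The struct and the SKETCH_* names are hypothetical
stand-ins for struct bio and the two flags; the two helpers only mirror the
before and after conditions in bio_release_pages() from this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flag bits; not the real values. */
enum { SKETCH_NO_PAGE_REF = 1 };	/* old: "do not put the pages" */
enum { SKETCH_PAGE_REFFED = 1 };	/* new: "put the pages" */

/* Hypothetical stand-in for struct bio; only the flags word matters here. */
struct sketch_bio {
	unsigned int flags;
};

/* Old polarity: release unless the bio has been told not to. */
static bool old_should_release(const struct sketch_bio *bio)
{
	return !(bio->flags & SKETCH_NO_PAGE_REF);
}

/* New polarity: release only if a reference was explicitly recorded. */
static bool new_should_release(const struct sketch_bio *bio)
{
	return (bio->flags & SKETCH_PAGE_REFFED) != 0;
}
```

With no flags set, the old check releases the pages while the new one leaves
them alone, which is why the callers that do take references have to set the
flag explicitly in this patch.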

Kent Overstreet May 20, 2023, 4:13 a.m. UTC | #3
On Fri, May 19, 2023 at 08:56:56PM -0700, Christoph Hellwig wrote:
> On Fri, May 19, 2023 at 09:26:07PM -0400, Kent Overstreet wrote:
> > On Fri, May 19, 2023 at 08:40:44AM +0100, David Howells wrote:
> > > From: Christoph Hellwig <hch@lst.de>
> > > 
> > > Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted
> > > meaning and is only set when a page reference has been acquired that needs
> > > to be released by bio_release_pages().
> > 
> > What was the motivation for this patch?
> 
> So that it is only set when we need to release a page, instead of telling
> code to not release it when it otherwise would, where "otherwise would"
> is implicit and undocumented and changes in this series.

I suppose this way setting it can be done in bio_iov_iter_get_pages() -
ok yeah, that makes sense.

But it seems like it should be set in bio_iov_iter_get_pages() though,
and I'm not seeing that?

Christoph Hellwig May 20, 2023, 4:17 a.m. UTC | #4
On Sat, May 20, 2023 at 12:13:49AM -0400, Kent Overstreet wrote:
> I suppose this way setting it can be done in bio_iov_iter_get_pages() -
> ok yeah, that makes sense.
> 
> But it seems like it should be set in bio_iov_iter_get_pages() though,
> and I'm not seeing that?

It is set in bio_iov_iter_get_pages in this patch.  It later gets
replaced with the pinned flag when bio_iov_iter_get_pages is
changed to pin pages instead.

Kent Overstreet May 20, 2023, 5:52 a.m. UTC | #5
On Fri, May 19, 2023 at 09:17:43PM -0700, Christoph Hellwig wrote:
> On Sat, May 20, 2023 at 12:13:49AM -0400, Kent Overstreet wrote:
> > I suppose this way setting it can be done in bio_iov_iter_get_pages() -
> > ok yeah, that makes sense.
> > 
> > But it seems like it should be set in bio_iov_iter_get_pages() though,
> > and I'm not seeing that?
> 
> It is set in bio_iov_iter_get_pages in this patch.  It later gets
> replaced with the pinned flag when bio_iov_iter_get_pages is
> changed to pin pages instead.

Whoops, missed it.

Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>

David Howells May 20, 2023, 8:40 a.m. UTC | #6
Kent Overstreet <kent.overstreet@linux.dev> wrote:

> > Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted
> > meaning and is only set when a page reference has been acquired that needs
> > to be released by bio_release_pages().
> 
> What was the motivation for this patch?

We need to move to using FOLL_PIN for buffers derived from direct I/O to avoid
the fork vs async-DIO race.  Further, we shouldn't be taking a ref or a pin on
pages derived from internal kernel iterators such as KVEC or BVEC as the page
refcount might not be a valid way to control the lifetime of the data/buffers
in those pages (slab, for instance).  Rather, for internal kernel I/O, we need
to rely on the caller to hold onto the memory until we tell them we've
finished.

So we flip the polarity of the page-is-ref'd flag and then add a
page-is-pinned flag.  The intention is to ultimately drop the page-is-ref'd
flag - but we still need to keep the page-is-pinned flag.  This makes it
easier to take a stepwise approach - and having both flags working the same
way makes the logic easier to follow.

See iov_iter_extract_pages() and iov_iter_extract_will_pin().

David
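
The two-flag arrangement described above can be sketched in user space as
follows.  This is an illustration under stated assumptions, not the code from
the series: every sketch_* and SKETCH_* name is hypothetical, and the "pin
only user-backed buffers" rule is a simplified reading of what
iov_iter_extract_will_pin() is said to decide.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical buffer origins: user memory vs. kernel-internal iterators. */
enum sketch_source { SKETCH_SRC_USER, SKETCH_SRC_KVEC, SKETCH_SRC_BVEC };

/* Hypothetical flag bits; both use "set means there is something to undo". */
enum {
	SKETCH_PAGE_REFFED = 1,	/* a page reference was taken */
	SKETCH_PAGE_PINNED = 2,	/* a FOLL_PIN-style pin was taken */
};

/* What the completion path has to do for each page. */
enum sketch_release { SKETCH_NOTHING, SKETCH_PUT_REF, SKETCH_UNPIN };

/*
 * Assumption: only user-backed buffers get pinned.  KVEC/BVEC pages get
 * neither a ref nor a pin; the caller keeps the memory alive instead.
 */
static unsigned int sketch_flags_for(enum sketch_source src)
{
	return src == SKETCH_SRC_USER ? SKETCH_PAGE_PINNED : 0;
}

/* Because both flags share one polarity, no flag means no cleanup. */
static enum sketch_release sketch_release_action(unsigned int flags)
{
	if (flags & SKETCH_PAGE_PINNED)
		return SKETCH_UNPIN;
	if (flags & SKETCH_PAGE_REFFED)
		return SKETCH_PUT_REF;
	return SKETCH_NOTHING;
}
```

In this sketch a bio built from a KVEC or BVEC iterator carries no flags and
so is left alone at completion, while a legacy caller that still takes plain
references can keep setting the ref'd flag until it is converted.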

Patch

diff --git a/block/bio.c b/block/bio.c
index 043944fd46eb..8516adeaea26 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1191,7 +1191,6 @@  void bio_iov_bvec_set(struct bio *bio, struct iov_iter *iter)
 	bio->bi_io_vec = (struct bio_vec *)iter->bvec;
 	bio->bi_iter.bi_bvec_done = iter->iov_offset;
 	bio->bi_iter.bi_size = size;
-	bio_set_flag(bio, BIO_NO_PAGE_REF);
 	bio_set_flag(bio, BIO_CLONED);
 }
 
@@ -1336,6 +1335,7 @@  int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		return 0;
 	}
 
+	bio_set_flag(bio, BIO_PAGE_REFFED);
 	do {
 		ret = __bio_iov_iter_get_pages(bio, iter);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
diff --git a/block/blk-map.c b/block/blk-map.c
index 04c55f1c492e..33d9f6e89ba6 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -282,6 +282,7 @@  static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 	if (blk_queue_pci_p2pdma(rq->q))
 		extraction_flags |= ITER_ALLOW_P2PDMA;
 
+	bio_set_flag(bio, BIO_PAGE_REFFED);
 	while (iov_iter_count(iter)) {
 		struct page **pages, *stack_pages[UIO_FASTIOV];
 		ssize_t bytes;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 0b380bb8a81e..ad20f3428bab 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -402,6 +402,8 @@  dio_bio_alloc(struct dio *dio, struct dio_submit *sdio,
 		bio->bi_end_io = dio_bio_end_aio;
 	else
 		bio->bi_end_io = dio_bio_end_io;
+	/* for now require references for all pages */
+	bio_set_flag(bio, BIO_PAGE_REFFED);
 	sdio->bio = bio;
 	sdio->logical_offset_in_bio = sdio->cur_page_fs_offset;
 }
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 66a9f10e3207..08873f0627dd 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -203,7 +203,6 @@  static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	bio_set_flag(bio, BIO_NO_PAGE_REF);
 	__bio_add_page(bio, page, len, 0);
 	iomap_dio_submit_bio(iter, dio, bio, pos);
 }
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 7f53be035cf0..0922729acd26 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -488,7 +488,7 @@  void zero_fill_bio(struct bio *bio);
 
 static inline void bio_release_pages(struct bio *bio, bool mark_dirty)
 {
-	if (!bio_flagged(bio, BIO_NO_PAGE_REF))
+	if (bio_flagged(bio, BIO_PAGE_REFFED))
 		__bio_release_pages(bio, mark_dirty);
 }
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 740afe80f297..dfd2c2cb909d 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -323,7 +323,7 @@  struct bio {
  * bio flags
  */
 enum {
-	BIO_NO_PAGE_REF,	/* don't put release vec pages */
+	BIO_PAGE_REFFED,	/* put pages in bio_release_pages() */
 	BIO_CLONED,		/* doesn't own data */
 	BIO_BOUNCED,		/* bio is a bounce bio */
 	BIO_QUIET,		/* Make BIO Quiet */