Message ID | 20190909182722.16783-7-hch@lst.de (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | [01/19] iomap: better document the IOMAP_F_* flags | expand |
On Mon, Sep 09, 2019 at 08:27:09PM +0200, Christoph Hellwig wrote: > Use the existing iomap write_begin code to read the pages unshared > by iomap_file_unshare. That avoids the extra ->readpage call and > extent tree lookup currently done by read_mapping_page. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/iomap/buffered-io.c | 49 ++++++++++++++---------------------------- > 1 file changed, 16 insertions(+), 33 deletions(-) > > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c > index fe099faf540f..a421977a9496 100644 > --- a/fs/iomap/buffered-io.c > +++ b/fs/iomap/buffered-io.c > @@ -533,6 +533,10 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage, > EXPORT_SYMBOL_GPL(iomap_migrate_page); > #endif /* CONFIG_MIGRATION */ > > +enum { > + IOMAP_WRITE_F_UNSHARE = (1 << 0), > +}; > + > static void > iomap_write_failed(struct inode *inode, loff_t pos, unsigned len) > { > @@ -562,7 +566,7 @@ iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff, > } > > static int > -__iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, > +__iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags, > struct page *page, struct iomap *iomap) > { > struct iomap_page *iop = iomap_page_create(inode, page); > @@ -581,11 +585,14 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, > if (plen == 0) > break; > > - if ((from <= poff || from >= poff + plen) && > + if (!(flags & IOMAP_WRITE_F_UNSHARE) && Mmm, archeology of code that I wrote originally and have forgotten already... :) I think the purpose of F_UNSHARE is to mimic the behavior of the code that's being removed, and the old behavior is that if a user asks to unshare a page backed by shared extents we'll read in all the blocks backing the page, even if that means reading in blocks that weren't part of the original unshare request, right? 
The only reason I can think of (or remember) for doing it that way is that read_mapping_page does its IO one page at a time, i.e. sheer laziness on my part. So do we actually need F_UNSHARE to read in the whole page or can we get by with reading only the blocks that are included in the range that's being unshared, just like how write only reads in the blocks that are included in the range being written to? --D > + (from <= poff || from >= poff + plen) && > (to <= poff || to >= poff + plen)) > continue; > > if (iomap_block_needs_zeroing(inode, iomap, block_start)) { > + if (WARN_ON_ONCE(flags & IOMAP_WRITE_F_UNSHARE)) > + return -EIO; > zero_user_segments(page, poff, from, to, poff + plen); > iomap_set_range_uptodate(page, poff, plen); > continue; > @@ -631,7 +638,8 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags, > else if (iomap->flags & IOMAP_F_BUFFER_HEAD) > status = __block_write_begin_int(page, pos, len, NULL, iomap); > else > - status = __iomap_write_begin(inode, pos, len, page, iomap); > + status = __iomap_write_begin(inode, pos, len, flags, page, > + iomap); > > if (unlikely(status)) > goto out_unlock; > @@ -854,22 +862,6 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *iter, > } > EXPORT_SYMBOL_GPL(iomap_file_buffered_write); > > -static struct page * > -__iomap_read_page(struct inode *inode, loff_t offset) > -{ > - struct address_space *mapping = inode->i_mapping; > - struct page *page; > - > - page = read_mapping_page(mapping, offset >> PAGE_SHIFT, NULL); > - if (IS_ERR(page)) > - return page; > - if (!PageUptodate(page)) { > - put_page(page); > - return ERR_PTR(-EIO); > - } > - return page; > -} > - > static loff_t > iomap_unshare_actor(struct inode *inode, loff_t pos, loff_t length, void *data, > struct iomap *iomap) > @@ -885,24 +877,15 @@ iomap_unshare_actor(struct inode *inode, loff_t pos, loff_t length, void *data, > return length; > > do { > - struct page *page, *rpage; > - unsigned long offset; 
/* Offset into pagecache page */ > - unsigned long bytes; /* Bytes to write to page */ > - > - offset = offset_in_page(pos); > - bytes = min_t(loff_t, PAGE_SIZE - offset, length); > - > - rpage = __iomap_read_page(inode, pos); > - if (IS_ERR(rpage)) > - return PTR_ERR(rpage); > + unsigned long offset = offset_in_page(pos); > + unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length); > + struct page *page; > > - status = iomap_write_begin(inode, pos, bytes, 0, &page, iomap); > - put_page(rpage); > + status = iomap_write_begin(inode, pos, bytes, > + IOMAP_WRITE_F_UNSHARE, &page, iomap); > if (unlikely(status)) > return status; > > - WARN_ON_ONCE(!PageUptodate(page)); > - > status = iomap_write_end(inode, pos, bytes, bytes, page, iomap); > if (unlikely(status <= 0)) { > if (WARN_ON_ONCE(status == 0)) > -- > 2.20.1 >
On Mon, Sep 16, 2019 at 11:34:28AM -0700, Darrick J. Wong wrote: > > - if ((from <= poff || from >= poff + plen) && > > + if (!(flags & IOMAP_WRITE_F_UNSHARE) && > > Mmm, archeology of code that I wrote originally and have forgotten > already... :) > > I think the purpose of F_UNSHARE is to mimic the behavior of the code > that's being removed, and the old behavior is that if a user asks to > unshare a page backed by shared extents we'll read in all the blocks > backing the page, even if that means reading in blocks that weren't part > of the original unshare request, right? No. The flag causes the code to always read the page, even if the iomap range covers the whole block. For normal writes that means we don't need to read the block in at all, but for unshare we absolutely must do so.
On Mon, Sep 30, 2019 at 01:07:31PM +0200, Christoph Hellwig wrote: > On Mon, Sep 16, 2019 at 11:34:28AM -0700, Darrick J. Wong wrote: > > > - if ((from <= poff || from >= poff + plen) && > > > + if (!(flags & IOMAP_WRITE_F_UNSHARE) && > > > > Mmm, archeology of code that I wrote originally and have forgotten > > already... :) > > > > I think the purpose of F_UNSHARE is to mimic the behavior of the code > > that's being removed, and the old behavior is that if a user asks to > > unshare a page backed by shared extents we'll read in all the blocks > > backing the page, even if that means reading in blocks that weren't part > > of the original unshare request, right? > > No. The flag causes the code to always read the page, even if the iomap > range covers the whole block. For normal writes that means we don't need to > read the block in at all, but for unshare we absolutely must do so. Ahhh, right. Missed that. I'll go RVB the v2 patch now. --D
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index fe099faf540f..a421977a9496 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -533,6 +533,10 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage, EXPORT_SYMBOL_GPL(iomap_migrate_page); #endif /* CONFIG_MIGRATION */ +enum { + IOMAP_WRITE_F_UNSHARE = (1 << 0), +}; + static void iomap_write_failed(struct inode *inode, loff_t pos, unsigned len) { @@ -562,7 +566,7 @@ iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff, } static int -__iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, +__iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags, struct page *page, struct iomap *iomap) { struct iomap_page *iop = iomap_page_create(inode, page); @@ -581,11 +585,14 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, if (plen == 0) break; - if ((from <= poff || from >= poff + plen) && + if (!(flags & IOMAP_WRITE_F_UNSHARE) && + (from <= poff || from >= poff + plen) && (to <= poff || to >= poff + plen)) continue; if (iomap_block_needs_zeroing(inode, iomap, block_start)) { + if (WARN_ON_ONCE(flags & IOMAP_WRITE_F_UNSHARE)) + return -EIO; zero_user_segments(page, poff, from, to, poff + plen); iomap_set_range_uptodate(page, poff, plen); continue; @@ -631,7 +638,8 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags, else if (iomap->flags & IOMAP_F_BUFFER_HEAD) status = __block_write_begin_int(page, pos, len, NULL, iomap); else - status = __iomap_write_begin(inode, pos, len, page, iomap); + status = __iomap_write_begin(inode, pos, len, flags, page, + iomap); if (unlikely(status)) goto out_unlock; @@ -854,22 +862,6 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *iter, } EXPORT_SYMBOL_GPL(iomap_file_buffered_write); -static struct page * -__iomap_read_page(struct inode *inode, loff_t offset) -{ - struct address_space *mapping = inode->i_mapping; - struct page 
*page; - - page = read_mapping_page(mapping, offset >> PAGE_SHIFT, NULL); - if (IS_ERR(page)) - return page; - if (!PageUptodate(page)) { - put_page(page); - return ERR_PTR(-EIO); - } - return page; -} - static loff_t iomap_unshare_actor(struct inode *inode, loff_t pos, loff_t length, void *data, struct iomap *iomap) @@ -885,24 +877,15 @@ iomap_unshare_actor(struct inode *inode, loff_t pos, loff_t length, void *data, return length; do { - struct page *page, *rpage; - unsigned long offset; /* Offset into pagecache page */ - unsigned long bytes; /* Bytes to write to page */ - - offset = offset_in_page(pos); - bytes = min_t(loff_t, PAGE_SIZE - offset, length); - - rpage = __iomap_read_page(inode, pos); - if (IS_ERR(rpage)) - return PTR_ERR(rpage); + unsigned long offset = offset_in_page(pos); + unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length); + struct page *page; - status = iomap_write_begin(inode, pos, bytes, 0, &page, iomap); - put_page(rpage); + status = iomap_write_begin(inode, pos, bytes, + IOMAP_WRITE_F_UNSHARE, &page, iomap); if (unlikely(status)) return status; - WARN_ON_ONCE(!PageUptodate(page)); - status = iomap_write_end(inode, pos, bytes, bytes, page, iomap); if (unlikely(status <= 0)) { if (WARN_ON_ONCE(status == 0))
Use the existing iomap write_begin code to read the pages unshared by iomap_file_unshare. That avoids the extra ->readpage call and extent tree lookup currently done by read_mapping_page. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/iomap/buffered-io.c | 49 ++++++++++++++---------------------------- 1 file changed, 16 insertions(+), 33 deletions(-)