diff mbox

[06/11] ext4: Avoid split extents for DAX writes

Message ID 1478603297-11793-7-git-send-email-jack@suse.cz (mailing list archive)
State New, archived
Headers show

Commit Message

Jan Kara Nov. 8, 2016, 11:08 a.m. UTC
Currently mapping of blocks for DAX writes happen with
EXT4_GET_BLOCKS_PRE_IO flag set. That has a result that each
ext4_map_blocks() call creates a separate written extent, although it
could be merged to the neighboring extents in the extent tree.  The
reason for using this flag is that in case the extent is unwritten, we
need to convert it to written one and zero it out. However this "convert
mapped range to written" operation is already implemented by
ext4_map_blocks() for the case of data writes into unwritten extent. So
just use flags for that mode of operation, simplify the code, and avoid
unnecessary split extents.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 17 -----------------
 1 file changed, 17 deletions(-)

Comments

Ross Zwisler Nov. 11, 2016, 8:25 p.m. UTC | #1
On Tue, Nov 08, 2016 at 12:08:12PM +0100, Jan Kara wrote:
> Currently mapping of blocks for DAX writes happen with
> EXT4_GET_BLOCKS_PRE_IO flag set. That has a result that each
> ext4_map_blocks() call creates a separate written extent, although it
> could be merged to the neighboring extents in the extent tree.  The
> reason for using this flag is that in case the extent is unwritten, we
> need to convert it to written one and zero it out. However this "convert
> mapped range to written" operation is already implemented by
> ext4_map_blocks() for the case of data writes into unwritten extent. So
> just use flags for that mode of operation, simplify the code, and avoid
> unnecessary split extents.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

The code that this patch modifies was just introduced in the previous patch.
Is there a reason to keep both patches, or would it be cleaner just to squash
them and have one patch that introduces the code as you intend for it to end
up?

> ---
>  fs/ext4/inode.c | 17 -----------------
>  1 file changed, 17 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index a7079cab645a..3192ec0768d4 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3351,7 +3351,6 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
>  			return PTR_ERR(handle);
>  
>  		ret = ext4_map_blocks(handle, inode, &map,
> -				      EXT4_GET_BLOCKS_PRE_IO |
>  				      EXT4_GET_BLOCKS_CREATE_ZERO);
>  		if (ret < 0) {
>  			ext4_journal_stop(handle);
> @@ -3360,22 +3359,6 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
>  				goto retry;
>  			return ret;
>  		}
> -		/* For DAX writes we need to zero out unwritten extents */
> -		if (map.m_flags & EXT4_MAP_UNWRITTEN) {
> -			/*
> -			 * We are protected by i_mmap_sem or i_rwsem so we know
> -			 * block cannot go away from under us even though we
> -			 * dropped i_data_sem. Convert extent to written and
> -			 * write zeros there.
> -			 */
> -			ret = ext4_map_blocks(handle, inode, &map,
> -					      EXT4_GET_BLOCKS_CONVERT |
> -					      EXT4_GET_BLOCKS_CREATE_ZERO);
> -			if (ret < 0) {
> -				ext4_journal_stop(handle);
> -				return ret;
> -			}
> -		}
>  
>  		/*
>  		 * If we added blocks beyond i_size we need to make sure they
> -- 
> 2.6.6
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara Nov. 14, 2016, 8:54 a.m. UTC | #2
On Fri 11-11-16 13:25:47, Ross Zwisler wrote:
> On Tue, Nov 08, 2016 at 12:08:12PM +0100, Jan Kara wrote:
> > Currently mapping of blocks for DAX writes happen with
> > EXT4_GET_BLOCKS_PRE_IO flag set. That has a result that each
> > ext4_map_blocks() call creates a separate written extent, although it
> > could be merged to the neighboring extents in the extent tree.  The
> > reason for using this flag is that in case the extent is unwritten, we
> > need to convert it to written one and zero it out. However this "convert
> > mapped range to written" operation is already implemented by
> > ext4_map_blocks() for the case of data writes into unwritten extent. So
> > just use flags for that mode of operation, simplify the code, and avoid
> > unnecessary split extents.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> The code that this patch modifies was just introduced in the previous patch.
> Is there a reason to keep both patches, or would it be cleaner just to squash
> them and have one patch that introduces the code as you intend for it to end
> up?

Well, the previous patch just mostly moved the block allocation logic from
dedicated DAX get_block function into iomap_begin(). This patch changes the
block allocation logic so I prefer to keep them separate - mostly for the
sake of documenting the change how allocation of blocks for DAX works in the
changelog.

								Honza
diff mbox

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a7079cab645a..3192ec0768d4 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3351,7 +3351,6 @@  static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 			return PTR_ERR(handle);
 
 		ret = ext4_map_blocks(handle, inode, &map,
-				      EXT4_GET_BLOCKS_PRE_IO |
 				      EXT4_GET_BLOCKS_CREATE_ZERO);
 		if (ret < 0) {
 			ext4_journal_stop(handle);
@@ -3360,22 +3359,6 @@  static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 				goto retry;
 			return ret;
 		}
-		/* For DAX writes we need to zero out unwritten extents */
-		if (map.m_flags & EXT4_MAP_UNWRITTEN) {
-			/*
-			 * We are protected by i_mmap_sem or i_rwsem so we know
-			 * block cannot go away from under us even though we
-			 * dropped i_data_sem. Convert extent to written and
-			 * write zeros there.
-			 */
-			ret = ext4_map_blocks(handle, inode, &map,
-					      EXT4_GET_BLOCKS_CONVERT |
-					      EXT4_GET_BLOCKS_CREATE_ZERO);
-			if (ret < 0) {
-				ext4_journal_stop(handle);
-				return ret;
-			}
-		}
 
 		/*
 		 * If we added blocks beyond i_size we need to make sure they