Message ID | 995387be9841bde2151c85880555c18bec68a641.1571647179.git.mbobrowski@mbobrowski.org (mailing list archive) |
---|---|
State | New, archived |
Series | ext4: port direct I/O to iomap infrastructure |

On Mon 21-10-19 20:17:46, Matthew Bobrowski wrote:
> This patch effectively addresses what Dave Chinner had found and
> fixed within this commit: 8a23414ee345. Justification for needing this
> modification has been provided below:
>
> When doing a direct IO that spans the current EOF, and there are
> written blocks beyond EOF that extend beyond the current write, the
> only metadata update that needs to be done is a file size extension.
>
> However, we don't mark such iomaps as IOMAP_F_DIRTY to indicate that
> there are IO completion metadata updates required, and hence we may
> fail to correctly sync file size extensions made in IO completion when
> O_DSYNC writes are being used and the hardware supports FUA.
>
> Hence when setting IOMAP_F_DIRTY, we need to also take into account
> whether the iomap spans the current EOF. If it does, then we need to
> mark it dirty so that IO completion will call generic_write_sync() to
> flush the inode size update to stable storage correctly.
>
> Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>

Looks good to me. You could possibly also comment in the changelog that
currently, this change doesn't have user-visible impact for ext4 as none of
the current users of ext4_iomap_begin() that extend files depend on
IOMAP_F_DIRTY.

Also this patch would make slightly more sense to be before 1/12 so that
you don't have there those two strange unused arguments. But these are just
small nits.

Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

Honza

> ---
>  fs/ext4/inode.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 158eea9a1944..0dd29ae5cc8c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3412,8 +3412,14 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
>  {
>  	u8 blkbits = inode->i_blkbits;
>
> +	/*
> +	 * Writes that span EOF might trigger an I/O size update on completion,
> +	 * so consider them to be dirty for the purposes of O_DSYNC, even if
> +	 * there is no other metadata changes being made or are pending here.
> +	 */
>  	iomap->flags = 0;
> -	if (ext4_inode_datasync_dirty(inode))
> +	if (ext4_inode_datasync_dirty(inode) ||
> +	    offset + length > i_size_read(inode))
>  		iomap->flags |= IOMAP_F_DIRTY;
>
>  	if (map->m_flags & EXT4_MAP_NEW)
> --
> 2.20.1
>
> --<M>--

On Mon, Oct 21, 2019 at 03:28:18PM +0200, Jan Kara wrote:
> On Mon 21-10-19 20:17:46, Matthew Bobrowski wrote:
> > This patch effectively addresses what Dave Chinner had found and
> > fixed within this commit: 8a23414ee345. Justification for needing this
> > modification has been provided below:
> >
> > When doing a direct IO that spans the current EOF, and there are
> > written blocks beyond EOF that extend beyond the current write, the
> > only metadata update that needs to be done is a file size extension.
> >
> > However, we don't mark such iomaps as IOMAP_F_DIRTY to indicate that
> > there are IO completion metadata updates required, and hence we may
> > fail to correctly sync file size extensions made in IO completion when
> > O_DSYNC writes are being used and the hardware supports FUA.
> >
> > Hence when setting IOMAP_F_DIRTY, we need to also take into account
> > whether the iomap spans the current EOF. If it does, then we need to
> > mark it dirty so that IO completion will call generic_write_sync() to
> > flush the inode size update to stable storage correctly.
> >
> > Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
>
> Looks good to me. You could possibly also comment in the changelog that
> currently, this change doesn't have user-visible impact for ext4 as none of
> the current users of ext4_iomap_begin() that extend files depend on
> IOMAP_F_DIRTY.

Sure, I will add this.

> Also this patch would make slightly more sense to be before 1/12 so that
> you don't have there those two strange unused arguments. But these are just
> small nits.

You're right. I will rearrange it in v6 so that this patch comes first.

> Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>

Thanks Jan!

--<M>--

On 10/21/19 2:47 PM, Matthew Bobrowski wrote:
> This patch effectively addresses what Dave Chinner had found and
> fixed within this commit: 8a23414ee345. Justification for needing this
> modification has been provided below:

Not sure if this is a valid commit id. I couldn't find it.

>
> When doing a direct IO that spans the current EOF, and there are
> written blocks beyond EOF that extend beyond the current write, the
> only metadata update that needs to be done is a file size extension.
>
> However, we don't mark such iomaps as IOMAP_F_DIRTY to indicate that
> there are IO completion metadata updates required, and hence we may
> fail to correctly sync file size extensions made in IO completion when
> O_DSYNC writes are being used and the hardware supports FUA.
>
> Hence when setting IOMAP_F_DIRTY, we need to also take into account
> whether the iomap spans the current EOF. If it does, then we need to
> mark it dirty so that IO completion will call generic_write_sync() to
> flush the inode size update to stable storage correctly.
>
> Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>

Otherwise, this patch looks good to me.

You may add:

Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>

> ---
>  fs/ext4/inode.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 158eea9a1944..0dd29ae5cc8c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3412,8 +3412,14 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
>  {
>  	u8 blkbits = inode->i_blkbits;
>
> +	/*
> +	 * Writes that span EOF might trigger an I/O size update on completion,
> +	 * so consider them to be dirty for the purposes of O_DSYNC, even if
> +	 * there is no other metadata changes being made or are pending here.
> +	 */
>  	iomap->flags = 0;
> -	if (ext4_inode_datasync_dirty(inode))
> +	if (ext4_inode_datasync_dirty(inode) ||
> +	    offset + length > i_size_read(inode))
>  		iomap->flags |= IOMAP_F_DIRTY;
>
>  	if (map->m_flags & EXT4_MAP_NEW)
>

On Wed, Oct 23, 2019 at 12:05:55PM +0530, Ritesh Harjani wrote:
> On 10/21/19 2:47 PM, Matthew Bobrowski wrote:
> > This patch effectively addresses what Dave Chinner had found and
> > fixed within this commit: 8a23414ee345. Justification for needing this
> > modification has been provided below:
>
> Not sure if this is a valid commit id. I couldn't find it.

Ah, oops! I plucked that from somewhere, but where (some thread)? Hm,
anyway, this is what I was referring to:

https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git/commit/?h=xfs-5.5-merge&id=7684e2c4384d5d1f884b01ab8bff2369e4db0bff

This is queued for 5.5, so I will add this commit hash to my changelog in
v6.

--<M>--

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 158eea9a1944..0dd29ae5cc8c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3412,8 +3412,14 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
 {
 	u8 blkbits = inode->i_blkbits;
 
+	/*
+	 * Writes that span EOF might trigger an I/O size update on completion,
+	 * so consider them to be dirty for the purposes of O_DSYNC, even if
+	 * there is no other metadata changes being made or are pending here.
+	 */
 	iomap->flags = 0;
-	if (ext4_inode_datasync_dirty(inode))
+	if (ext4_inode_datasync_dirty(inode) ||
+	    offset + length > i_size_read(inode))
 		iomap->flags |= IOMAP_F_DIRTY;
 
 	if (map->m_flags & EXT4_MAP_NEW)

This patch effectively addresses what Dave Chinner had found and
fixed within this commit: 8a23414ee345. Justification for needing this
modification has been provided below:

When doing a direct IO that spans the current EOF, and there are
written blocks beyond EOF that extend beyond the current write, the
only metadata update that needs to be done is a file size extension.

However, we don't mark such iomaps as IOMAP_F_DIRTY to indicate that
there are IO completion metadata updates required, and hence we may
fail to correctly sync file size extensions made in IO completion when
O_DSYNC writes are being used and the hardware supports FUA.

Hence when setting IOMAP_F_DIRTY, we need to also take into account
whether the iomap spans the current EOF. If it does, then we need to
mark it dirty so that IO completion will call generic_write_sync() to
flush the inode size update to stable storage correctly.

Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
---
 fs/ext4/inode.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
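
For readers who want to poke at the condition outside the kernel, here is a
minimal userspace sketch of the check described in the changelog above. It is
not the ext4 code: sketch_iomap_flags(), sketch_demo() and
SKETCH_IOMAP_F_DIRTY are hypothetical names, the datasync_dirty parameter
stands in for the result of ext4_inode_datasync_dirty(inode), and i_size
stands in for i_size_read(inode). Offset overflow is assumed not to occur.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's IOMAP_F_DIRTY flag bit. */
#define SKETCH_IOMAP_F_DIRTY 0x02u

/*
 * Simplified model of the flag decision made in ext4_set_iomap() after
 * this patch. The mapping is reported dirty when the inode already has
 * datasync-relevant metadata pending, or when the range [offset,
 * offset + length) ends beyond the current file size, because I/O
 * completion may then have to update the size.
 */
static unsigned int sketch_iomap_flags(bool datasync_dirty, int64_t i_size,
				       int64_t offset, int64_t length)
{
	unsigned int flags = 0;

	if (datasync_dirty || offset + length > i_size)
		flags |= SKETCH_IOMAP_F_DIRTY;

	return flags;
}

/* Exercises the three cases discussed in the changelog. */
static void sketch_demo(void)
{
	/* Write entirely within EOF, inode clean: not dirty. */
	assert(sketch_iomap_flags(false, 4096, 0, 4096) == 0);

	/* Write spanning EOF, inode otherwise clean: dirty (new case). */
	assert(sketch_iomap_flags(false, 4096, 2048, 4096) ==
	       SKETCH_IOMAP_F_DIRTY);

	/* Inode already datasync-dirty: dirty regardless of the range. */
	assert(sketch_iomap_flags(true, 4096, 0, 1024) ==
	       SKETCH_IOMAP_F_DIRTY);
}
```

A write that ends exactly at EOF (offset + length == i_size) does not extend
the file, so the sketch, like the patch, leaves the flag clear in that case
unless the inode is already dirty.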