Message ID | 20240320110548.2200662-7-yi.zhang@huaweicloud.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof | expand |
On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote: > From: Zhang Yi <yi.zhang@huawei.com> > > Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not > needed, the caller should handle it. Especially, when truncate partial > block, we should not increase i_size beyond the new EOF here. It doesn't > affect xfs and gfs2 now because they set the new file size after zero > out, it doesn't matter that a transient increase in i_size, but it will > affect ext4 because it set file size before truncate. So move the i_size > updating logic to iomap_write_iter(). > This patch causes generic/522 to consistently fail when using the following fstest configuration, TEST_DEV=/dev/loop16 TEST_LOGDEV=/dev/loop13 SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12" MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g' MOUNT_OPTIONS='-o usrquota,grpquota,prjquota' TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota" TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13" SCRATCH_LOGDEV=/dev/loop15 USE_EXTERNAL=yes LOGWRITES_DEV=/dev/loop15
On 2024/4/19 14:07, Chandan Babu R wrote: > On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote: >> From: Zhang Yi <yi.zhang@huawei.com> >> >> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not >> needed, the caller should handle it. Especially, when truncate partial >> block, we should not increase i_size beyond the new EOF here. It doesn't >> affect xfs and gfs2 now because they set the new file size after zero >> out, it doesn't matter that a transient increase in i_size, but it will >> affect ext4 because it set file size before truncate. So move the i_size >> updating logic to iomap_write_iter(). >> > > This patch causes generic/522 to consistently fail when using the following > fstest configuration, > > TEST_DEV=/dev/loop16 > TEST_LOGDEV=/dev/loop13 > SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12" > MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g' > MOUNT_OPTIONS='-o usrquota,grpquota,prjquota' > TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota" > TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13" > SCRATCH_LOGDEV=/dev/loop15 > USE_EXTERNAL=yes > LOGWRITES_DEV=/dev/loop15 > Sorry for the regression, I didn't notice this issue last time I ran this test. It looks like the 004 patch doesn't work for some cases. I can reproduce it on my machine now, and I'll take a look at this and fix it soon. Thanks, Yi.
On 2024/4/19 14:07, Chandan Babu R wrote: > On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote: >> From: Zhang Yi <yi.zhang@huawei.com> >> >> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not >> needed, the caller should handle it. Especially, when truncate partial >> block, we should not increase i_size beyond the new EOF here. It doesn't >> affect xfs and gfs2 now because they set the new file size after zero >> out, it doesn't matter that a transient increase in i_size, but it will >> affect ext4 because it set file size before truncate. So move the i_size >> updating logic to iomap_write_iter(). >> > > This patch causes generic/522 to consistently fail when using the following > fstest configuration, > > TEST_DEV=/dev/loop16 > TEST_LOGDEV=/dev/loop13 > SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12" > MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g' > MOUNT_OPTIONS='-o usrquota,grpquota,prjquota' > TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota" > TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13" > SCRATCH_LOGDEV=/dev/loop15 > USE_EXTERNAL=yes > LOGWRITES_DEV=/dev/loop15 > Hello! The root cause of the problem is caused by patch 4/9, this patch is fine, I've revised the patch 4/9 and send it out separately in reply to v4. I've tested that on my machine for over 100 rounds on generic/522 and it hasn't failed again. Please take a look at the v5 patch for details and test it on your machine. https://lore.kernel.org/linux-xfs/20240423111735.1298851-1-yi.zhang@huaweicloud.com/T/#m2da33e643b642071aa20077321e2c431f5a64e38 Thanks, Yi.
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 7e32a204650b..e9112dc78d15 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -837,32 +837,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len, size_t copied, struct folio *folio) { const struct iomap *srcmap = iomap_iter_srcmap(iter); - loff_t old_size = iter->inode->i_size; - size_t ret; - - if (srcmap->type == IOMAP_INLINE) { - ret = iomap_write_end_inline(iter, folio, pos, copied); - } else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) { - ret = block_write_end(NULL, iter->inode->i_mapping, pos, len, - copied, &folio->page, NULL); - } else { - ret = __iomap_write_end(iter->inode, pos, len, copied, folio); - } - - /* - * Update the in-memory inode size after copying the data into the page - * cache. It's up to the file system to write the updated size to disk, - * preferably after I/O completion so that no stale data is exposed. - */ - if (pos + ret > old_size) { - i_size_write(iter->inode, pos + ret); - iter->iomap.flags |= IOMAP_F_SIZE_CHANGED; - } - __iomap_put_folio(iter, pos, ret, folio); - if (old_size < pos) - pagecache_isize_extended(iter->inode, old_size, pos); - return ret; + if (srcmap->type == IOMAP_INLINE) + return iomap_write_end_inline(iter, folio, pos, copied); + if (srcmap->flags & IOMAP_F_BUFFER_HEAD) + return block_write_end(NULL, iter->inode->i_mapping, pos, len, + copied, &folio->page, NULL); + return __iomap_write_end(iter->inode, pos, len, copied, folio); } static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) @@ -877,6 +858,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) do { struct folio *folio; + loff_t old_size; size_t offset; /* Offset into folio */ size_t bytes; /* Bytes to write to folio */ size_t copied; /* Bytes copied from user */ @@ -926,6 +908,22 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) copied = copy_folio_from_iter_atomic(folio, offset, bytes, i); status = iomap_write_end(iter, pos, bytes, copied, folio); + /* + * Update the in-memory inode size after copying the data into + * the page cache. It's up to the file system to write the + * updated size to disk, preferably after I/O completion so that + * no stale data is exposed. Only once that's done can we + * unlock and release the folio. + */ + old_size = iter->inode->i_size; + if (pos + status > old_size) { + i_size_write(iter->inode, pos + status); + iter->iomap.flags |= IOMAP_F_SIZE_CHANGED; + } + __iomap_put_folio(iter, pos, status, folio); + + if (old_size < pos) + pagecache_isize_extended(iter->inode, old_size, pos); if (status < bytes) iomap_write_failed(iter->inode, pos + status, bytes - status); @@ -1298,6 +1296,7 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter) bytes = folio_size(folio) - offset; bytes = iomap_write_end(iter, pos, bytes, bytes, folio); + __iomap_put_folio(iter, pos, bytes, folio); if (WARN_ON_ONCE(bytes == 0)) return -EIO; @@ -1362,6 +1361,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero) folio_mark_accessed(folio); bytes = iomap_write_end(iter, pos, bytes, bytes, folio); + __iomap_put_folio(iter, pos, bytes, folio); if (WARN_ON_ONCE(bytes == 0)) return -EIO;