Message ID | 974b00afc2e703d5e0085fefbb46a1e495956ae1.1738778225.git.fdmanana@suse.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | btrfs: fix hole expansion when writing at an offset beyond eof | expand |
在 2025/2/6 04:30, fdmanana@kernel.org 写道: > From: Filipe Manana <fdmanana@suse.com> > > At btrfs_write_check() if our file's i_size is not sector size aligned and > we have a write that starts at an offset larger than the i_size that falls > within the same page of the i_size, then we end up not zeroing the file > range [i_size, write_offset). > > The code is this: > > start_pos = round_down(pos, fs_info->sectorsize); > oldsize = i_size_read(inode); > if (start_pos > oldsize) { > /* Expand hole size to cover write data, preventing empty gap */ > loff_t end_pos = round_up(pos + count, fs_info->sectorsize); > > ret = btrfs_cont_expand(BTRFS_I(inode), oldsize, end_pos); > if (ret) > return ret; > } > > So if our file's i_size is 90269 bytes and a write at offset 90365 bytes > comes in, we get 'start_pos' set to 90112 bytes, which is less than the > i_size and therefore we don't zero out the range [90269, 90365) by > calling btrfs_cont_expand(). > > This is an old bug introduced in commit 9036c10208e1 ("Btrfs: update hole > handling v2"), from 2008, and the buggy code got moved around over the > years. > > Fix this by discarding 'start_pos' and comparing against the write offset > ('pos') without any alignment. > > This bug was recently exposed by test case generic/363 which tests this > scenario by polluting ranges beyond eof with a mmap write and than verify > that after a file increases we get zeroes for the range which is supposed > to be a hole and not what we wrote with the previous mmaped write. > > We're only seeing this exposed now because generic/363 used to run only > on xfs until last Sunday's fstests update. > > The test was failing like this: > > $ ./check generic/363 > FSTYP -- btrfs > PLATFORM -- Linux/x86_64 debian0 6.13.0-rc7-btrfs-next-185+ #17 SMP PREEMPT_DYNAMIC Mon Feb 3 12:28:46 WET 2025 > MKFS_OPTIONS -- /dev/sdc > MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1 > > generic/363 0s ... [failed, exit status 1]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//generic/363.out.bad) > --- tests/generic/363.out 2025-02-05 15:31:14.013646509 +0000 > +++ /home/fdmanana/git/hub/xfstests/results//generic/363.out.bad 2025-02-05 17:25:33.112630781 +0000 > @@ -1 +1,46 @@ > QA output created by 363 > +READ BAD DATA: offset = 0xdcad, size = 0xd921, fname = /home/fdmanana/btrfs-tests/dev/junk > +OFFSET GOOD BAD RANGE > +0x1609d 0x0000 0x3104 0x0 > +operation# (mod 256) for the bad data may be 4 > +0x1609e 0x0000 0x0472 0x1 > +operation# (mod 256) for the bad data may be 4 > ... > (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/generic/363.out /home/fdmanana/git/hub/xfstests/results//generic/363.out.bad' to see the entire diff) > Ran: generic/363 > Failures: generic/363 > Failed 1 of 1 tests > > Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks, Qu > --- > fs/btrfs/file.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > index 36f51c311bb1..ed3c0d6546c5 100644 > --- a/fs/btrfs/file.c > +++ b/fs/btrfs/file.c > @@ -1039,7 +1039,6 @@ int btrfs_write_check(struct kiocb *iocb, size_t count) > loff_t pos = iocb->ki_pos; > int ret; > loff_t oldsize; > - loff_t start_pos; > > /* > * Quickly bail out on NOWAIT writes if we don't have the nodatacow or > @@ -1066,9 +1065,8 @@ int btrfs_write_check(struct kiocb *iocb, size_t count) > inode_inc_iversion(inode); > } > > - start_pos = round_down(pos, fs_info->sectorsize); > oldsize = i_size_read(inode); > - if (start_pos > oldsize) { > + if (pos > oldsize) { > /* Expand hole size to cover write data, preventing empty gap */ > loff_t end_pos = round_up(pos + count, fs_info->sectorsize); >
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 36f51c311bb1..ed3c0d6546c5 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1039,7 +1039,6 @@ int btrfs_write_check(struct kiocb *iocb, size_t count) loff_t pos = iocb->ki_pos; int ret; loff_t oldsize; - loff_t start_pos; /* * Quickly bail out on NOWAIT writes if we don't have the nodatacow or @@ -1066,9 +1065,8 @@ int btrfs_write_check(struct kiocb *iocb, size_t count) inode_inc_iversion(inode); } - start_pos = round_down(pos, fs_info->sectorsize); oldsize = i_size_read(inode); - if (start_pos > oldsize) { + if (pos > oldsize) { /* Expand hole size to cover write data, preventing empty gap */ loff_t end_pos = round_up(pos + count, fs_info->sectorsize);