Message ID | 6d62fb2a-a4e6-3094-c1bf-0ca5569b244c@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xfs_alloc_file_space() rounds len independently of offset | expand |
On Thu, Sep 26, 2019 at 12:57:49PM +0200, Max Reitz wrote: > Hi, > > I’ve noticed that fallocating some range on XFS sometimes does not > include the last block covered by the range, when the start offset is > unaligned. > > (Tested on 5.3.0-gf41def397.) > > This happens whenever ceil((offset + len) / block_size) - floor(offset / > block_size) > ceil(len / block_size), for example: > > Let block_size be 4096. Then (on XFS): > > $ fallocate -o 2048 -l 4096 foo # Range [2048, 6144) > $ xfs_bmap foo > foo: > 0: [0..7]: 80..87 > 1: [8..15]: hole > > There should not be a hole there. Both of the first two blocks should > be allocated. XFS will do that if I just let the range start one byte > sooner and increase the length by one byte: > > $ rm -f foo > $ fallocate -o 2047 -l 4097 foo # Range [2047, 6144) > $ xfs_bmap foo > foo: > 0: [0..15]: 88..103 > > > (See [1] for a more extensive reasoning why this is a bug.) > > > The problem is (as far as I can see) that xfs_alloc_file_space() rounds > count (which equals len) independently of the offset. So in the > examples above, 4096 is rounded to one block and 4097 is rounded to two; > even though the first example actually touches two blocks because of the > misaligned offset. > > Therefore, this should fix the problem (and does fix it for me): > > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > index 0910cb75b..4f4437030 100644 > --- a/fs/xfs/xfs_bmap_util.c > +++ b/fs/xfs/xfs_bmap_util.c > @@ -864,6 +864,7 @@ xfs_alloc_file_space( > xfs_filblks_t allocatesize_fsb; > xfs_extlen_t extsz, temp; > xfs_fileoff_t startoffset_fsb; > + xfs_fileoff_t endoffset_fsb; > int nimaps; > int quota_flag; > int rt; > @@ -891,7 +892,8 @@ xfs_alloc_file_space( > imapp = &imaps[0]; > nimaps = 1; > startoffset_fsb = XFS_B_TO_FSBT(mp, offset); > - allocatesize_fsb = XFS_B_TO_FSB(mp, count); > + endoffset_fsb = XFS_B_TO_FSB(mp, offset + count); > + allocatesize_fsb = endoffset_fsb - startoffset_fsb; > > /* > * Allocate file space until done or until there is an error > That looks like a reasonable fix to me and it's in the spirit of how xfs_free_file_space() works as well (outside of the obvious difference in how unaligned boundary blocks are handled). Care to send a proper patch? Brian > > Thanks and kind regards, > > Max > > > [1] That this is a bug can be proven as follows: > > 1. The fallocate(2) man page states "subsequent writes into the range > specified by offset and len are guaranteed not to fail because of lack > of disk space." > > 2. Run this test (anywhere, e.g. tmpfs): > > $ truncate -s $((4096 * 4096)) test_fs > $ mkfs.xfs -b size=4096 test_fs > [Success-indicating output, I hope] > > $ mkdir mount_point > $ sudo mount -o loop test_fs mount_point > $ sudo chmod go+rwx mount_point > $ cd mount_point > > $ free_blocks=$(df -B4k . | tail -n 1 \ > | awk '{ split($0, f); print f[4] }') > > $ falloc_length=$((free_blocks * 4096)) > > $ while true; do \ > fallocate -o 2048 -l $falloc_length test_file && break; \ > falloc_length=$((falloc_length - 4096)); \ > done > fallocate: fallocate failed: No space left on device > fallocate: fallocate failed: No space left on device > fallocate: fallocate failed: No space left on device > fallocate: fallocate failed: No space left on device > > # Now we have a test_file with an fallocated range of > # [2048, 2048 + $falloc_length) > # So we should be able to write anywhere in that area without > # encountering ENOSPC; but that is what happens when we write > # to the last block covered by the range: > > $ dd if=/dev/zero of=test_file bs=1 conv=notrunc \ > seek=$falloc_length count=2048 > dd: error writing 'test_file': No space left on device > 1+0 records in > 0+0 records out > 0 bytes copied, 0.000164691 s, 0.0 kB/s > > > When I apply the diff shown above, I get one more “No space left on > device” line (indicating that fallocate consistently takes one > additional block), and then: > > $ uname -sr > Linux 5.3.0-gf41def397-dirty > > $ dd if=/dev/zero of=test_file bs=1 conv=notrunc \ > seek=$falloc_length count=2048 > 2048+0 records in > 2048+0 records out > 2048 bytes (2.0 kB, 2.0 KiB) copied, 0.0121903 s, 168 kB/s > > (i.e., what I’d expect)
On 26.09.19 14:59, Brian Foster wrote: > On Thu, Sep 26, 2019 at 12:57:49PM +0200, Max Reitz wrote: >> Hi, >> >> I’ve noticed that fallocating some range on XFS sometimes does not >> include the last block covered by the range, when the start offset is >> unaligned. >> >> (Tested on 5.3.0-gf41def397.) >> >> This happens whenever ceil((offset + len) / block_size) - floor(offset / >> block_size) > ceil(len / block_size), for example: >> >> Let block_size be 4096. Then (on XFS): >> >> $ fallocate -o 2048 -l 4096 foo # Range [2048, 6144) >> $ xfs_bmap foo >> foo: >> 0: [0..7]: 80..87 >> 1: [8..15]: hole >> >> There should not be a hole there. Both of the first two blocks should >> be allocated. XFS will do that if I just let the range start one byte >> sooner and increase the length by one byte: >> >> $ rm -f foo >> $ fallocate -o 2047 -l 4097 foo # Range [2047, 6144) >> $ xfs_bmap foo >> foo: >> 0: [0..15]: 88..103 >> >> >> (See [1] for a more extensive reasoning why this is a bug.) >> >> >> The problem is (as far as I can see) that xfs_alloc_file_space() rounds >> count (which equals len) independently of the offset. So in the >> examples above, 4096 is rounded to one block and 4097 is rounded to two; >> even though the first example actually touches two blocks because of the >> misaligned offset. >> >> Therefore, this should fix the problem (and does fix it for me): >> >> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c >> index 0910cb75b..4f4437030 100644 >> --- a/fs/xfs/xfs_bmap_util.c >> +++ b/fs/xfs/xfs_bmap_util.c >> @@ -864,6 +864,7 @@ xfs_alloc_file_space( >> xfs_filblks_t allocatesize_fsb; >> xfs_extlen_t extsz, temp; >> xfs_fileoff_t startoffset_fsb; >> + xfs_fileoff_t endoffset_fsb; >> int nimaps; >> int quota_flag; >> int rt; >> @@ -891,7 +892,8 @@ xfs_alloc_file_space( >> imapp = &imaps[0]; >> nimaps = 1; >> startoffset_fsb = XFS_B_TO_FSBT(mp, offset); >> - allocatesize_fsb = XFS_B_TO_FSB(mp, count); >> + endoffset_fsb = XFS_B_TO_FSB(mp, offset + count); >> + allocatesize_fsb = endoffset_fsb - startoffset_fsb; >> >> /* >> * Allocate file space until done or until there is an error >> > > That looks like a reasonable fix to me and it's in the spirit of how > xfs_free_file_space() works as well (outside of the obvious difference > in how unaligned boundary blocks are handled). Care to send a proper > patch? I’ve never sent a kernel patch before, but I’ll give it a go. Max
On Thu, Sep 26, 2019 at 03:16:47PM +0200, Max Reitz wrote: > On 26.09.19 14:59, Brian Foster wrote: > > On Thu, Sep 26, 2019 at 12:57:49PM +0200, Max Reitz wrote: > >> Hi, > >> > >> I’ve noticed that fallocating some range on XFS sometimes does not > >> include the last block covered by the range, when the start offset is > >> unaligned. > >> > >> (Tested on 5.3.0-gf41def397.) > >> > >> This happens whenever ceil((offset + len) / block_size) - floor(offset / > >> block_size) > ceil(len / block_size), for example: > >> > >> Let block_size be 4096. Then (on XFS): > >> > >> $ fallocate -o 2048 -l 4096 foo # Range [2048, 6144) > >> $ xfs_bmap foo > >> foo: > >> 0: [0..7]: 80..87 > >> 1: [8..15]: hole > >> > >> There should not be a hole there. Both of the first two blocks should > >> be allocated. XFS will do that if I just let the range start one byte > >> sooner and increase the length by one byte: > >> > >> $ rm -f foo > >> $ fallocate -o 2047 -l 4097 foo # Range [2047, 6144) > >> $ xfs_bmap foo > >> foo: > >> 0: [0..15]: 88..103 > >> > >> > >> (See [1] for a more extensive reasoning why this is a bug.) > >> > >> > >> The problem is (as far as I can see) that xfs_alloc_file_space() rounds > >> count (which equals len) independently of the offset. So in the > >> examples above, 4096 is rounded to one block and 4097 is rounded to two; > >> even though the first example actually touches two blocks because of the > >> misaligned offset. > >> > >> Therefore, this should fix the problem (and does fix it for me): > >> > >> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > >> index 0910cb75b..4f4437030 100644 > >> --- a/fs/xfs/xfs_bmap_util.c > >> +++ b/fs/xfs/xfs_bmap_util.c > >> @@ -864,6 +864,7 @@ xfs_alloc_file_space( > >> xfs_filblks_t allocatesize_fsb; > >> xfs_extlen_t extsz, temp; > >> xfs_fileoff_t startoffset_fsb; > >> + xfs_fileoff_t endoffset_fsb; > >> int nimaps; > >> int quota_flag; > >> int rt; > >> @@ -891,7 +892,8 @@ xfs_alloc_file_space( > >> imapp = &imaps[0]; > >> nimaps = 1; > >> startoffset_fsb = XFS_B_TO_FSBT(mp, offset); > >> - allocatesize_fsb = XFS_B_TO_FSB(mp, count); > >> + endoffset_fsb = XFS_B_TO_FSB(mp, offset + count); > >> + allocatesize_fsb = endoffset_fsb - startoffset_fsb; > >> > >> /* > >> * Allocate file space until done or until there is an error > >> > > > > That looks like a reasonable fix to me and it's in the spirit of how > > xfs_free_file_space() works as well (outside of the obvious difference > > in how unaligned boundary blocks are handled). Care to send a proper > > patch? > > I’ve never sent a kernel patch before, but I’ll give it a go. > Heh.. well you've already done the hard part. ;) If you already have a local commit in a git tree, just write up a proper commit log based on what you've described above (no need to include the "proof," I think), make sure there's a Signed-off-by: tag and 'git format-patch ..' / 'git send-email' to the list. If you want to be sure everything is formatted correctly, it can be useful to mail the patch to yourself and verify that you can save it and apply it locally to a clean branch before sending to the list. Brian > Max >
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 0910cb75b..4f4437030 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -864,6 +864,7 @@ xfs_alloc_file_space( xfs_filblks_t allocatesize_fsb; xfs_extlen_t extsz, temp; xfs_fileoff_t startoffset_fsb; + xfs_fileoff_t endoffset_fsb; int nimaps; int quota_flag; int rt; @@ -891,7 +892,8 @@ xfs_alloc_file_space( imapp = &imaps[0]; nimaps = 1; startoffset_fsb = XFS_B_TO_FSBT(mp, offset); - allocatesize_fsb = XFS_B_TO_FSB(mp, count); + endoffset_fsb = XFS_B_TO_FSB(mp, offset + count); + allocatesize_fsb = endoffset_fsb - startoffset_fsb; /* * Allocate file space until done or until there is an error