Message ID | c7715d09a67f212e0ecb5fea2d598513912092f4.1619443900.git.anand.jain@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fix unmountable seed device after fstrim | expand |
On Fri, Apr 30, 2021 at 10:14 AM Anand Jain <anand.jain@oracle.com> wrote: > > The following test case reproduces an issue of wrongly freeing the in-use > block on the readonly seed device when fstrim is called on the rw sprout > device. As shown below. > > create a seed device and add a sprout device to it > mkfs.btrfs -fq -dsingle -msingle /dev/loop0 > btrfstune -S 1 /dev/loop0 > mount /dev/loop0 /btrfs > btrfs dev add -f /dev/loop1 /btrfs > > BTRFS info (device loop0): relocating block group 290455552 flags system > BTRFS info (device loop0): relocating block group 1048576 flags system > BTRFS info (device loop0): disk added /dev/loop1 > > umount /btrfs > mount the sprout device and run fstrim > mount /dev/loop1 /btrfs > fstrim /btrfs > umount /btrfs > > now try to mount the seed device, and it fails... > mount /dev/loop0 /btrfs > > mount: /btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. > > BTRFS error (device loop0): bad tree block start, want 5292032 have 0 > BTRFS warning (device loop0): couldn't read-tree root > BTRFS error (device loop0): open_ctree failed > > Block 5292032 is missing on the readonly seed device. > > From the dump-tree of the seed device taken before the fstrim. Block > 5292032 belonged to the block group starting at 5242880 > > <snip> > item 3 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16169 itemsize 24 > block group used 114688 chunk_objectid 256 flags METADATA > <snip> > > From the dump-tree of the sprout device (taken before the fstrim). > fstrim used the block-group 5242880 to find the free space to free. > > <snip> > > item 1 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16226 itemsize 24 > block group used 32768 chunk_objectid 256 flags METADATA > <snip> > > > bpf tracing the fstrim command finds the missing block 5292032 within the > range of the discarded blocks... > > kprobe:btrfs_discard_extent { > printf("free start %llu end %llu num_bytes %llu\n", arg1, arg1+arg2, arg2); > } > > btrfs_discard_extent(..., start, num_bytes, ...): > > free start 5259264 end 5406720 num_bytes 147456 > <snip> > > Fix this by avoiding the discard command to the readonly seed device. Quite hard to read the changelog. It could be better organized, indentation is screwed up in many places, shell command lines should have some prefix like $ or # to make it clear what is a command and what is the output of a command, missing full collons at the end of sentences, sentences not starting with a capital letter, etc. > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > Reported-by: Chris Murphy <lists@colorremedies.com> > --- > > A xfstests case to follow. > > fs/btrfs/extent-tree.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > index 7a28314189b4..0d19bd213715 100644 > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -1340,12 +1340,19 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr, > stripe = bbio->stripes; > for (i = 0; i < bbio->num_stripes; i++, stripe++) { > u64 bytes; > + struct btrfs_device *device = stripe->dev; > > - if (!stripe->dev->bdev) { > + if (!device->bdev) { > ASSERT(btrfs_test_opt(fs_info, DEGRADED)); > continue; > } > > + /* > + * Skip sending discard command to a non-writeable device. > + */ > + if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) > + continue; With such an obvious and easy to read conditional, the comment is completely unnecessary, it doesn't add any value. Other than that, the fix itself looks good. Thanks. > + > ret = do_discard_extent(stripe, &bytes); > if (!ret) { > discarded_bytes += bytes; > -- > 2.29.2 >
On 30/4/21 6:11 pm, Filipe Manana wrote: > On Fri, Apr 30, 2021 at 10:14 AM Anand Jain <anand.jain@oracle.com> wrote: >> >> The following test case reproduces an issue of wrongly freeing the in-use >> block on the readonly seed device when fstrim is called on the rw sprout >> device. As shown below. >> >> create a seed device and add a sprout device to it >> mkfs.btrfs -fq -dsingle -msingle /dev/loop0 >> btrfstune -S 1 /dev/loop0 >> mount /dev/loop0 /btrfs >> btrfs dev add -f /dev/loop1 /btrfs >> >> BTRFS info (device loop0): relocating block group 290455552 flags system >> BTRFS info (device loop0): relocating block group 1048576 flags system >> BTRFS info (device loop0): disk added /dev/loop1 >> >> umount /btrfs >> mount the sprout device and run fstrim >> mount /dev/loop1 /btrfs >> fstrim /btrfs >> umount /btrfs >> >> now try to mount the seed device, and it fails... >> mount /dev/loop0 /btrfs >> >> mount: /btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. >> >> BTRFS error (device loop0): bad tree block start, want 5292032 have 0 >> BTRFS warning (device loop0): couldn't read-tree root >> BTRFS error (device loop0): open_ctree failed >> >> Block 5292032 is missing on the readonly seed device. >> >> From the dump-tree of the seed device taken before the fstrim. Block >> 5292032 belonged to the block group starting at 5242880 >> >> <snip> >> item 3 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16169 itemsize 24 >> block group used 114688 chunk_objectid 256 flags METADATA >> <snip> >> >> From the dump-tree of the sprout device (taken before the fstrim). >> fstrim used the block-group 5242880 to find the free space to free. >> >> <snip> >> >> item 1 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16226 itemsize 24 >> block group used 32768 chunk_objectid 256 flags METADATA >> <snip> >> >> >> bpf tracing the fstrim command finds the missing block 5292032 within the >> range of the discarded blocks... >> >> kprobe:btrfs_discard_extent { >> printf("free start %llu end %llu num_bytes %llu\n", arg1, arg1+arg2, arg2); >> } >> >> btrfs_discard_extent(..., start, num_bytes, ...): >> >> free start 5259264 end 5406720 num_bytes 147456 >> <snip> >> >> Fix this by avoiding the discard command to the readonly seed device. > > Quite hard to read the changelog. > > It could be better organized, indentation is screwed up in many > places, shell command lines should have some prefix like $ or # to > make it clear what is a command and what is the output of a command, > missing full collons at the end of sentences, sentences not starting > with a capital letter, etc. OK. Let me try to fix it. > >> >> Signed-off-by: Anand Jain <anand.jain@oracle.com> >> Reported-by: Chris Murphy <lists@colorremedies.com> >> --- >> >> A xfstests case to follow. >> >> fs/btrfs/extent-tree.c | 9 ++++++++- >> 1 file changed, 8 insertions(+), 1 deletion(-) >> >> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c >> index 7a28314189b4..0d19bd213715 100644 >> --- a/fs/btrfs/extent-tree.c >> +++ b/fs/btrfs/extent-tree.c >> @@ -1340,12 +1340,19 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr, >> stripe = bbio->stripes; >> for (i = 0; i < bbio->num_stripes; i++, stripe++) { >> u64 bytes; >> + struct btrfs_device *device = stripe->dev; >> >> - if (!stripe->dev->bdev) { >> + if (!device->bdev) { >> ASSERT(btrfs_test_opt(fs_info, DEGRADED)); >> continue; >> } >> >> + /* >> + * Skip sending discard command to a non-writeable device. >> + */ >> + if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) >> + continue; > > With such an obvious and easy to read conditional, the comment is > completely unnecessary, it doesn't add any value. > Will remove. > Other than that, the fix itself looks good. > Thanks > Thanks. > >> + >> ret = do_discard_extent(stripe, &bytes); >> if (!ret) { >> discarded_bytes += bytes; >> -- >> 2.29.2 >> > >
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 7a28314189b4..0d19bd213715 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1340,12 +1340,19 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr, stripe = bbio->stripes; for (i = 0; i < bbio->num_stripes; i++, stripe++) { u64 bytes; + struct btrfs_device *device = stripe->dev; - if (!stripe->dev->bdev) { + if (!device->bdev) { ASSERT(btrfs_test_opt(fs_info, DEGRADED)); continue; } + /* + * Skip sending discard command to a non-writeable device. + */ + if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) + continue; + ret = do_discard_extent(stripe, &bytes); if (!ret) { discarded_bytes += bytes;
The following test case reproduces an issue of wrongly freeing the in-use block on the readonly seed device when fstrim is called on the rw sprout device. As shown below. create a seed device and add a sprout device to it mkfs.btrfs -fq -dsingle -msingle /dev/loop0 btrfstune -S 1 /dev/loop0 mount /dev/loop0 /btrfs btrfs dev add -f /dev/loop1 /btrfs BTRFS info (device loop0): relocating block group 290455552 flags system BTRFS info (device loop0): relocating block group 1048576 flags system BTRFS info (device loop0): disk added /dev/loop1 umount /btrfs mount the sprout device and run fstrim mount /dev/loop1 /btrfs fstrim /btrfs umount /btrfs now try to mount the seed device, and it fails... mount /dev/loop0 /btrfs mount: /btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. BTRFS error (device loop0): bad tree block start, want 5292032 have 0 BTRFS warning (device loop0): couldn't read-tree root BTRFS error (device loop0): open_ctree failed Block 5292032 is missing on the readonly seed device. From the dump-tree of the seed device taken before the fstrim. Block 5292032 belonged to the block group starting at 5242880 <snip> item 3 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16169 itemsize 24 block group used 114688 chunk_objectid 256 flags METADATA <snip> From the dump-tree of the sprout device (taken before the fstrim). fstrim used the block-group 5242880 to find the free space to free. <snip> item 1 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16226 itemsize 24 block group used 32768 chunk_objectid 256 flags METADATA <snip> bpf tracing the fstrim command finds the missing block 5292032 within the range of the discarded blocks... kprobe:btrfs_discard_extent { printf("free start %llu end %llu num_bytes %llu\n", arg1, arg1+arg2, arg2); } btrfs_discard_extent(..., start, num_bytes, ...): free start 5259264 end 5406720 num_bytes 147456 <snip> Fix this by avoiding the discard command to the readonly seed device. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reported-by: Chris Murphy <lists@colorremedies.com> --- A xfstests case to follow. fs/btrfs/extent-tree.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)