Message ID | 20200720125109.93970-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] btrfs-progs: convert: Prevent bit overflow for cctx->total_bytes | expand |
On 20.07.20 г. 15:51 ч., Qu Wenruo wrote: > [BUG] > When convert is called on a 64GiB ext4 fs, it fails like this: > > $ btrfs-convert /dev/loop0p1 > create btrfs filesystem: > blocksize: 4096 > nodesize: 16384 > features: extref, skinny-metadata (default) > checksum: crc32c > creating ext2 image file > ERROR: missing data block for bytenr 1048576 > ERROR: failed to create ext2_saved/image: -2 > WARNING: an error occurred during conversion, filesystem is partially created but not finalized and not mountable > > Btrfs-convert also corrupts the source fs: > $ LANG=C e2fsck /dev/loop0p1 -f > e2fsck 1.45.6 (20-Mar-2020) > Resize inode not valid. Recreate<y>? yes > Pass 1: Checking inodes, blocks, and sizes > Deleted inode 3681 has zero dtime. Fix<y>? yes > Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes > Inode 3744 was part of the orphaned inode list. FIXED. > Deleted inode 3745 has zero dtime. Fix<y>? yes > Inode 3747 has INLINE_DATA_FL flag on filesystem without inline data support. > Clear<y>? yes > ... > > [CAUSE] > After some debugging, the first strange behavior is, the value of > cctx->total_bytes is 0 in ext2_open_fs(). > > It turns out that, the value assign for cctx->total_bytes could lead to > bit overflow for the unsigned int value. > > And that 0 cctx->total_bytes leads to vairous problems for later free > space calculation. > For example, in calculate_available_space(), we use cctx->total_bytes to > ensure we won't create a data chunk beyond device end: > > cue_len = min(cctx->total_bytes - cur_off, cur_len); > > If that cur_offset is also 0, we will create a cache_extent with 0 size, > which could cause a lot of problems for cache tree search. > > [FIX] > Do manual casting for the multiply operation, so we could got a real u64 > result. > The fix will be applied to all supported fses (ext* and reiserfs). > > Reported-by: Christian Zangl <coralllama@gmail.com> > Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
On Mon, Jul 20, 2020 at 08:51:08PM +0800, Qu Wenruo wrote: > --- a/convert/source-ext2.c > +++ b/convert/source-ext2.c > @@ -87,7 +87,8 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, const char *name) > cctx->fs_data = ext2_fs; > cctx->blocksize = ext2_fs->blocksize; > cctx->block_count = ext2_fs->super->s_blocks_count; > - cctx->total_bytes = ext2_fs->blocksize * ext2_fs->super->s_blocks_count; > + cctx->total_bytes = (u64)ext2_fs->blocksize * > + (u64)ext2_fs->super->s_blocks_count; Do you need to cast both? Once one of the types is wide enough for the result, there should be no loss.
On 2020/7/21 上午12:09, David Sterba wrote: > On Mon, Jul 20, 2020 at 08:51:08PM +0800, Qu Wenruo wrote: >> --- a/convert/source-ext2.c >> +++ b/convert/source-ext2.c >> @@ -87,7 +87,8 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, const char *name) >> cctx->fs_data = ext2_fs; >> cctx->blocksize = ext2_fs->blocksize; >> cctx->block_count = ext2_fs->super->s_blocks_count; >> - cctx->total_bytes = ext2_fs->blocksize * ext2_fs->super->s_blocks_count; >> + cctx->total_bytes = (u64)ext2_fs->blocksize * >> + (u64)ext2_fs->super->s_blocks_count; > > Do you need to cast both? Once one of the types is wide enough for the > result, there should be no loss. > I just want to be extra safe. Feel free to reduce one. Thanks, Qu
On Tue, Jul 21, 2020 at 07:51:00AM +0800, Qu Wenruo wrote: > > > On 2020/7/21 上午12:09, David Sterba wrote: > > On Mon, Jul 20, 2020 at 08:51:08PM +0800, Qu Wenruo wrote: > >> --- a/convert/source-ext2.c > >> +++ b/convert/source-ext2.c > >> @@ -87,7 +87,8 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, const char *name) > >> cctx->fs_data = ext2_fs; > >> cctx->blocksize = ext2_fs->blocksize; > >> cctx->block_count = ext2_fs->super->s_blocks_count; > >> - cctx->total_bytes = ext2_fs->blocksize * ext2_fs->super->s_blocks_count; > >> + cctx->total_bytes = (u64)ext2_fs->blocksize * > >> + (u64)ext2_fs->super->s_blocks_count; > > > > Do you need to cast both? Once one of the types is wide enough for the > > result, there should be no loss. > > > I just want to be extra safe. Typecasts in code raise questions why are they needed, 'to be extra' safe is not a good reason. One typecast in multiplication/shifts is a common pattern to widen the result but two look more like lack of understanding of the integer promotion rules.
On 2020/7/21 下午5:58, David Sterba wrote: > On Tue, Jul 21, 2020 at 07:51:00AM +0800, Qu Wenruo wrote: >> >> >> On 2020/7/21 上午12:09, David Sterba wrote: >>> On Mon, Jul 20, 2020 at 08:51:08PM +0800, Qu Wenruo wrote: >>>> --- a/convert/source-ext2.c >>>> +++ b/convert/source-ext2.c >>>> @@ -87,7 +87,8 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, const char *name) >>>> cctx->fs_data = ext2_fs; >>>> cctx->blocksize = ext2_fs->blocksize; >>>> cctx->block_count = ext2_fs->super->s_blocks_count; >>>> - cctx->total_bytes = ext2_fs->blocksize * ext2_fs->super->s_blocks_count; >>>> + cctx->total_bytes = (u64)ext2_fs->blocksize * >>>> + (u64)ext2_fs->super->s_blocks_count; >>> >>> Do you need to cast both? Once one of the types is wide enough for the >>> result, there should be no loss. >>> >> I just want to be extra safe. > > Typecasts in code raise questions why are they needed, 'to be extra' > safe is not a good reason. One typecast in multiplication/shifts is a > common pattern to widen the result but two look more like lack of > understanding of the integer promotion rules. > My point here is, I don't want the reviewers or new contributors to bother about the promotion rules at all. They only need to know that using blocksize and blocks_count directly to do multiply would lead to overflow. Other details like whether the multiply follows the highest factor or the left operator or the right operator, shouldn't be the point and we don't really need to bother. Thus casting both would definitely be right, without the need to refer to the complex rule book, thus save the reviewer several minutes. Thanks, Qu
On Tue, Jul 21, 2020 at 06:29:31PM +0800, Qu Wenruo wrote: > On 2020/7/21 下午5:58, David Sterba wrote: > > On Tue, Jul 21, 2020 at 07:51:00AM +0800, Qu Wenruo wrote: > >> > >> > >> On 2020/7/21 上午12:09, David Sterba wrote: > >>> On Mon, Jul 20, 2020 at 08:51:08PM +0800, Qu Wenruo wrote: > >>>> --- a/convert/source-ext2.c > >>>> +++ b/convert/source-ext2.c > >>>> @@ -87,7 +87,8 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, const char *name) > >>>> cctx->fs_data = ext2_fs; > >>>> cctx->blocksize = ext2_fs->blocksize; > >>>> cctx->block_count = ext2_fs->super->s_blocks_count; > >>>> - cctx->total_bytes = ext2_fs->blocksize * ext2_fs->super->s_blocks_count; > >>>> + cctx->total_bytes = (u64)ext2_fs->blocksize * > >>>> + (u64)ext2_fs->super->s_blocks_count; > >>> > >>> Do you need to cast both? Once one of the types is wide enough for the > >>> result, there should be no loss. > >>> > >> I just want to be extra safe. > > > > Typecasts in code raise questions why are they needed, 'to be extra' > > safe is not a good reason. One typecast in multiplication/shifts is a > > common pattern to widen the result but two look more like lack of > > understanding of the integer promotion rules. > > My point here is, I don't want the reviewers or new contributors to > bother about the promotion rules at all. Ouch, I hope you don't mean that contributors should ignore the trickier parts of C language. Especially reviewers _have_ to bother about all sorts of subtle behaviour. > They only need to know that using blocksize and blocks_count directly to > do multiply would lead to overflow. > > Other details like whether the multiply follows the highest factor or > the left operator or the right operator, shouldn't be the point and we > don't really need to bother. ... and introduce bugs? > Thus casting both would definitely be right, without the need to refer > to the complex rule book, thus save the reviewer several minutes. The opposite, if you send me code that's not following known schemes or idiomatic schemes I'll be highly suspicious and looking for the reasons why it's that way and making sure it's correct costs way more time.
On Tue, Jul 21, 2020 at 06:29:31PM +0800, Qu Wenruo wrote: > On 2020/7/21 下午5:58, David Sterba wrote: > > On Tue, Jul 21, 2020 at 07:51:00AM +0800, Qu Wenruo wrote: > >> > >> > >> On 2020/7/21 上午12:09, David Sterba wrote: > >>> On Mon, Jul 20, 2020 at 08:51:08PM +0800, Qu Wenruo wrote: > >>>> --- a/convert/source-ext2.c > >>>> +++ b/convert/source-ext2.c > >>>> @@ -87,7 +87,8 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, const char *name) > >>>> cctx->fs_data = ext2_fs; > >>>> cctx->blocksize = ext2_fs->blocksize; > >>>> cctx->block_count = ext2_fs->super->s_blocks_count; > >>>> - cctx->total_bytes = ext2_fs->blocksize * ext2_fs->super->s_blocks_count; > >>>> + cctx->total_bytes = (u64)ext2_fs->blocksize * > >>>> + (u64)ext2_fs->super->s_blocks_count; > >>> > >>> Do you need to cast both? Once one of the types is wide enough for the > >>> result, there should be no loss. > >>> > >> I just want to be extra safe. > > > > Typecasts in code raise questions why are they needed, 'to be extra' > > safe is not a good reason. One typecast in multiplication/shifts is a > > common pattern to widen the result but two look more like lack of > > understanding of the integer promotion rules. > > > > My point here is, I don't want the reviewers or new contributors to > bother about the promotion rules at all. Reviewers and contributors need to understand the rules of C. > They only need to know that using blocksize and blocks_count directly to > do multiply would lead to overflow. > > Other details like whether the multiply follows the highest factor or > the left operator or the right operator, shouldn't be the point and we > don't really need to bother. This is like suggesting to use brackets on every int-expression like ((5*3)+7) because the precedere rules are too complex. Do you think that writing ((5*3)+7) is "extra safe"? > Thus casting both would definitely be right, without the need to refer > to the complex rule book, thus save the reviewer several minutes. I don't think so.
On 2020/7/21 下午9:55, David Sterba wrote: > On Tue, Jul 21, 2020 at 06:29:31PM +0800, Qu Wenruo wrote: >> On 2020/7/21 下午5:58, David Sterba wrote: >>> On Tue, Jul 21, 2020 at 07:51:00AM +0800, Qu Wenruo wrote: >>>> >>>> >>>> On 2020/7/21 上午12:09, David Sterba wrote: >>>>> On Mon, Jul 20, 2020 at 08:51:08PM +0800, Qu Wenruo wrote: >>>>>> --- a/convert/source-ext2.c >>>>>> +++ b/convert/source-ext2.c >>>>>> @@ -87,7 +87,8 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, const char *name) >>>>>> cctx->fs_data = ext2_fs; >>>>>> cctx->blocksize = ext2_fs->blocksize; >>>>>> cctx->block_count = ext2_fs->super->s_blocks_count; >>>>>> - cctx->total_bytes = ext2_fs->blocksize * ext2_fs->super->s_blocks_count; >>>>>> + cctx->total_bytes = (u64)ext2_fs->blocksize * >>>>>> + (u64)ext2_fs->super->s_blocks_count; >>>>> >>>>> Do you need to cast both? Once one of the types is wide enough for the >>>>> result, there should be no loss. >>>>> >>>> I just want to be extra safe. >>> >>> Typecasts in code raise questions why are they needed, 'to be extra' >>> safe is not a good reason. One typecast in multiplication/shifts is a >>> common pattern to widen the result but two look more like lack of >>> understanding of the integer promotion rules. >> >> My point here is, I don't want the reviewers or new contributors to >> bother about the promotion rules at all. > > Ouch, I hope you don't mean that contributors should ignore the trickier > parts of C language. Especially reviewers _have_ to bother about all > sorts of subtle behaviour. > >> They only need to know that using blocksize and blocks_count directly to >> do multiply would lead to overflow. >> >> Other details like whether the multiply follows the highest factor or >> the left operator or the right operator, shouldn't be the point and we >> don't really need to bother. > > ... and introduce bugs? > >> Thus casting both would definitely be right, without the need to refer >> to the complex rule book, thus save the reviewer several minutes. > > The opposite, if you send me code that's not following known schemes or > idiomatic schemes I'll be highly suspicious and looking for the reasons > why it's that way and making sure it's correct costs way more time. > OK, then would you please remove one casting at merge time, or do I need to resend? Thanks, Qu
On Wed, Jul 22, 2020 at 06:58:39AM +0800, Qu Wenruo wrote: > >> Thus casting both would definitely be right, without the need to refer > >> to the complex rule book, thus save the reviewer several minutes. > > > > The opposite, if you send me code that's not following known schemes or > > idiomatic schemes I'll be highly suspicious and looking for the reasons > > why it's that way and making sure it's correct costs way more time. > > > OK, then would you please remove one casting at merge time, or do I need > to resend? Yeah, I fix such things routinely no need to resend.
On Wed, Jul 22, 2020 at 7:33 AM David Sterba <dsterba@suse.cz> wrote: > > On Wed, Jul 22, 2020 at 06:58:39AM +0800, Qu Wenruo wrote: > > >> Thus casting both would definitely be right, without the need to refer > > >> to the complex rule book, thus save the reviewer several minutes. > > > > > > The opposite, if you send me code that's not following known schemes or > > > idiomatic schemes I'll be highly suspicious and looking for the reasons > > > why it's that way and making sure it's correct costs way more time. > > > > > OK, then would you please remove one casting at merge time, or do I need > > to resend? > > Yeah, I fix such things routinely no need to resend. I have a report[1] that seems to look like this patch solves it, is that correct? [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1851674#c7
On 2020/7/23 下午9:31, Neal Gompa wrote: > On Wed, Jul 22, 2020 at 7:33 AM David Sterba <dsterba@suse.cz> wrote: >> >> On Wed, Jul 22, 2020 at 06:58:39AM +0800, Qu Wenruo wrote: >>>>> Thus casting both would definitely be right, without the need to refer >>>>> to the complex rule book, thus save the reviewer several minutes. >>>> >>>> The opposite, if you send me code that's not following known schemes or >>>> idiomatic schemes I'll be highly suspicious and looking for the reasons >>>> why it's that way and making sure it's correct costs way more time. >>>> >>> OK, then would you please remove one casting at merge time, or do I need >>> to resend? >> >> Yeah, I fix such things routinely no need to resend. > > I have a report[1] that seems to look like this patch solves it, is > that correct? > > [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1851674#c7 > Yep, looks like the same bug. Thanks, Qu
On Thu, Jul 23, 2020 at 8:02 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > On 2020/7/23 下午9:31, Neal Gompa wrote: > > On Wed, Jul 22, 2020 at 7:33 AM David Sterba <dsterba@suse.cz> wrote: > >> > >> On Wed, Jul 22, 2020 at 06:58:39AM +0800, Qu Wenruo wrote: > >>>>> Thus casting both would definitely be right, without the need to refer > >>>>> to the complex rule book, thus save the reviewer several minutes. > >>>> > >>>> The opposite, if you send me code that's not following known schemes or > >>>> idiomatic schemes I'll be highly suspicious and looking for the reasons > >>>> why it's that way and making sure it's correct costs way more time. > >>>> > >>> OK, then would you please remove one casting at merge time, or do I need > >>> to resend? > >> > >> Yeah, I fix such things routinely no need to resend. > > > > I have a report[1] that seems to look like this patch solves it, is > > that correct? > > > > [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1851674#c7 > > > Yep, looks like the same bug. > So I backported this fix into btrfs-progs-5.7-4.fc32[1], and the reporter is still seeing issues[2]. Pasting from the bug comment[2]: [liveuser@localhost-live ~]$ sudo btrfs-convert /dev/sda2 create btrfs filesystem: blocksize: 4096 nodesize: 16384 features: extref, skinny-metadata (default) checksum: crc32c creating ext2 image file creating btrfs metadata convert/source-ext2.c:845: ext2_copy_inodes: BUG_ON `ret` triggered, value -28 btrfs-convert(+0x13589)[0x558cfe4a6589] btrfs-convert(+0x14549)[0x558cfe4a7549] btrfs-convert(main+0xfc0)[0x558cfe4a22b0] /lib64/libc.so.6(__libc_start_main+0xf2)[0x7fd2c6666042] btrfs-convert(_start+0x2e)[0x558cfe4a35ce] Aborted Any idea what's going on here? [1]: https://src.fedoraproject.org/rpms/btrfs-progs/c/e4a3d39e87313954cb3e8214e28ee0ba50bee874 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1851674#c13
On 2020/7/28 下午9:14, Neal Gompa wrote: > On Thu, Jul 23, 2020 at 8:02 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: >> >> >> >> On 2020/7/23 下午9:31, Neal Gompa wrote: >>> On Wed, Jul 22, 2020 at 7:33 AM David Sterba <dsterba@suse.cz> wrote: >>>> >>>> On Wed, Jul 22, 2020 at 06:58:39AM +0800, Qu Wenruo wrote: >>>>>>> Thus casting both would definitely be right, without the need to refer >>>>>>> to the complex rule book, thus save the reviewer several minutes. >>>>>> >>>>>> The opposite, if you send me code that's not following known schemes or >>>>>> idiomatic schemes I'll be highly suspicious and looking for the reasons >>>>>> why it's that way and making sure it's correct costs way more time. >>>>>> >>>>> OK, then would you please remove one casting at merge time, or do I need >>>>> to resend? >>>> >>>> Yeah, I fix such things routinely no need to resend. >>> >>> I have a report[1] that seems to look like this patch solves it, is >>> that correct? >>> >>> [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1851674#c7 >>> >> Yep, looks like the same bug. >> > > So I backported this fix into btrfs-progs-5.7-4.fc32[1], and the > reporter is still seeing issues[2]. > > Pasting from the bug comment[2]: > > [liveuser@localhost-live ~]$ sudo btrfs-convert /dev/sda2 > create btrfs filesystem: > blocksize: 4096 > nodesize: 16384 > features: extref, skinny-metadata (default) > checksum: crc32c > creating ext2 image file > creating btrfs metadata > convert/source-ext2.c:845: ext2_copy_inodes: BUG_ON `ret` triggered, value -28 This means we have no space left. We don't know if it's the fs already exhausted (little space left for EXT*), or it's btrfs' extent allocator not working. Would it possible to update the image? BTW, even btrfs-convert crashed, the fs should be completely fine, just as nothing happened to it (from the point of view of ext*) Thanks, Qu > btrfs-convert(+0x13589)[0x558cfe4a6589] > btrfs-convert(+0x14549)[0x558cfe4a7549] > btrfs-convert(main+0xfc0)[0x558cfe4a22b0] > /lib64/libc.so.6(__libc_start_main+0xf2)[0x7fd2c6666042] > btrfs-convert(_start+0x2e)[0x558cfe4a35ce] > Aborted > > Any idea what's going on here? > > > [1]: https://src.fedoraproject.org/rpms/btrfs-progs/c/e4a3d39e87313954cb3e8214e28ee0ba50bee874 > [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1851674#c13 >
On Tue, Jul 28, 2020 at 9:20 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > On 2020/7/28 下午9:14, Neal Gompa wrote: > > On Thu, Jul 23, 2020 at 8:02 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > >> > >> > >> > >> On 2020/7/23 下午9:31, Neal Gompa wrote: > >>> On Wed, Jul 22, 2020 at 7:33 AM David Sterba <dsterba@suse.cz> wrote: > >>>> > >>>> On Wed, Jul 22, 2020 at 06:58:39AM +0800, Qu Wenruo wrote: > >>>>>>> Thus casting both would definitely be right, without the need to refer > >>>>>>> to the complex rule book, thus save the reviewer several minutes. > >>>>>> > >>>>>> The opposite, if you send me code that's not following known schemes or > >>>>>> idiomatic schemes I'll be highly suspicious and looking for the reasons > >>>>>> why it's that way and making sure it's correct costs way more time. > >>>>>> > >>>>> OK, then would you please remove one casting at merge time, or do I need > >>>>> to resend? > >>>> > >>>> Yeah, I fix such things routinely no need to resend. > >>> > >>> I have a report[1] that seems to look like this patch solves it, is > >>> that correct? > >>> > >>> [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1851674#c7 > >>> > >> Yep, looks like the same bug. > >> > > > > So I backported this fix into btrfs-progs-5.7-4.fc32[1], and the > > reporter is still seeing issues[2]. > > > > Pasting from the bug comment[2]: > > > > [liveuser@localhost-live ~]$ sudo btrfs-convert /dev/sda2 > > create btrfs filesystem: > > blocksize: 4096 > > nodesize: 16384 > > features: extref, skinny-metadata (default) > > checksum: crc32c > > creating ext2 image file > > creating btrfs metadata > > convert/source-ext2.c:845: ext2_copy_inodes: BUG_ON `ret` triggered, value -28 > > This means we have no space left. > > We don't know if it's the fs already exhausted (little space left for > EXT*), or it's btrfs' extent allocator not working. > > Would it possible to update the image? > I'm not sure what you're asking here? Do you mean update the live environment? You can boot a Fedora 32 live environment and update the btrfs-progs package before using it like the bug reporter did... > BTW, even btrfs-convert crashed, the fs should be completely fine, just > as nothing happened to it (from the point of view of ext*) > I figured as much, but we shouldn't have a case where btrfs-convert crashes like this... -- 真実はいつも一つ!/ Always, there's only one truth!
On 2020/7/29 上午9:56, Neal Gompa wrote: > On Tue, Jul 28, 2020 at 9:20 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: >> >> >> >> On 2020/7/28 下午9:14, Neal Gompa wrote: >>> On Thu, Jul 23, 2020 at 8:02 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: >>>> >>>> >>>> >>>> On 2020/7/23 下午9:31, Neal Gompa wrote: >>>>> On Wed, Jul 22, 2020 at 7:33 AM David Sterba <dsterba@suse.cz> wrote: >>>>>> >>>>>> On Wed, Jul 22, 2020 at 06:58:39AM +0800, Qu Wenruo wrote: >>>>>>>>> Thus casting both would definitely be right, without the need to refer >>>>>>>>> to the complex rule book, thus save the reviewer several minutes. >>>>>>>> >>>>>>>> The opposite, if you send me code that's not following known schemes or >>>>>>>> idiomatic schemes I'll be highly suspicious and looking for the reasons >>>>>>>> why it's that way and making sure it's correct costs way more time. >>>>>>>> >>>>>>> OK, then would you please remove one casting at merge time, or do I need >>>>>>> to resend? >>>>>> >>>>>> Yeah, I fix such things routinely no need to resend. >>>>> >>>>> I have a report[1] that seems to look like this patch solves it, is >>>>> that correct? >>>>> >>>>> [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1851674#c7 >>>>> >>>> Yep, looks like the same bug. >>>> >>> >>> So I backported this fix into btrfs-progs-5.7-4.fc32[1], and the >>> reporter is still seeing issues[2]. >>> >>> Pasting from the bug comment[2]: >>> >>> [liveuser@localhost-live ~]$ sudo btrfs-convert /dev/sda2 >>> create btrfs filesystem: >>> blocksize: 4096 >>> nodesize: 16384 >>> features: extref, skinny-metadata (default) >>> checksum: crc32c >>> creating ext2 image file >>> creating btrfs metadata >>> convert/source-ext2.c:845: ext2_copy_inodes: BUG_ON `ret` triggered, value -28 >> >> This means we have no space left. >> >> We don't know if it's the fs already exhausted (little space left for >> EXT*), or it's btrfs' extent allocator not working. >> >> Would it possible to update the image? >> > > I'm not sure what you're asking here? Do you mean update the live > environment? You can boot a Fedora 32 live environment and update the > btrfs-progs package before using it like the bug reporter did... I mean, upload the binary dump or e2image dump of /dev/sda2. e2image -Q would be much smaller and without any data stored in it. Sometimes that's the easiest way to debug. Or we need to add tons of probes to btrfs-convert and let the reporter to try-and-report to pin down the bug. > >> BTW, even btrfs-convert crashed, the fs should be completely fine, just >> as nothing happened to it (from the point of view of ext*) >> > > I figured as much, but we shouldn't have a case where btrfs-convert > crashes like this... Yep, the BUG_ON() should be removed and replaced with a more graceful exit. I'll do that soon. Thanks, Qu
diff --git a/convert/main.c b/convert/main.c index 7709e9a6c085..df6a2ae32722 100644 --- a/convert/main.c +++ b/convert/main.c @@ -1136,6 +1136,7 @@ static int do_convert(const char *devname, u32 convert_flags, u32 nodesize, if (ret) goto fail; + ASSERT(cctx.total_bytes); blocksize = cctx.blocksize; total_bytes = (u64)blocksize * (u64)cctx.block_count; if (blocksize < 4096) { diff --git a/convert/source-ext2.c b/convert/source-ext2.c index f11ef651245a..a5e3a726863f 100644 --- a/convert/source-ext2.c +++ b/convert/source-ext2.c @@ -87,7 +87,8 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, const char *name) cctx->fs_data = ext2_fs; cctx->blocksize = ext2_fs->blocksize; cctx->block_count = ext2_fs->super->s_blocks_count; - cctx->total_bytes = ext2_fs->blocksize * ext2_fs->super->s_blocks_count; + cctx->total_bytes = (u64)ext2_fs->blocksize * + (u64)ext2_fs->super->s_blocks_count; cctx->volume_name = strndup((char *)ext2_fs->super->s_volume_name, 16); cctx->first_data_block = ext2_fs->super->s_first_data_block; cctx->inodes_count = ext2_fs->super->s_inodes_count; diff --git a/convert/source-reiserfs.c b/convert/source-reiserfs.c index 9fd6b9abb9b4..8d9752f06ca9 100644 --- a/convert/source-reiserfs.c +++ b/convert/source-reiserfs.c @@ -82,7 +82,7 @@ static int reiserfs_open_fs(struct btrfs_convert_context *cxt, const char *name) cxt->fs_data = fs; cxt->blocksize = fs->fs_blocksize; cxt->block_count = get_sb_block_count(fs->fs_ondisk_sb); - cxt->total_bytes = cxt->blocksize * cxt->block_count; + cxt->total_bytes = (u64)cxt->blocksize * (u64)cxt->block_count; cxt->volume_name = strndup(fs->fs_ondisk_sb->s_label, 16); cxt->first_data_block = 0; cxt->inodes_count = reiserfs_count_objectids(fs);
[BUG] When convert is called on a 64GiB ext4 fs, it fails like this: $ btrfs-convert /dev/loop0p1 create btrfs filesystem: blocksize: 4096 nodesize: 16384 features: extref, skinny-metadata (default) checksum: crc32c creating ext2 image file ERROR: missing data block for bytenr 1048576 ERROR: failed to create ext2_saved/image: -2 WARNING: an error occurred during conversion, filesystem is partially created but not finalized and not mountable Btrfs-convert also corrupts the source fs: $ LANG=C e2fsck /dev/loop0p1 -f e2fsck 1.45.6 (20-Mar-2020) Resize inode not valid. Recreate<y>? yes Pass 1: Checking inodes, blocks, and sizes Deleted inode 3681 has zero dtime. Fix<y>? yes Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes Inode 3744 was part of the orphaned inode list. FIXED. Deleted inode 3745 has zero dtime. Fix<y>? yes Inode 3747 has INLINE_DATA_FL flag on filesystem without inline data support. Clear<y>? yes ... [CAUSE] After some debugging, the first strange behavior is, the value of cctx->total_bytes is 0 in ext2_open_fs(). It turns out that, the value assign for cctx->total_bytes could lead to bit overflow for the unsigned int value. And that 0 cctx->total_bytes leads to vairous problems for later free space calculation. For example, in calculate_available_space(), we use cctx->total_bytes to ensure we won't create a data chunk beyond device end: cue_len = min(cctx->total_bytes - cur_off, cur_len); If that cur_offset is also 0, we will create a cache_extent with 0 size, which could cause a lot of problems for cache tree search. [FIX] Do manual casting for the multiply operation, so we could got a real u64 result. The fix will be applied to all supported fses (ext* and reiserfs). Reported-by: Christian Zangl <coralllama@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> --- convert/main.c | 1 + convert/source-ext2.c | 3 ++- convert/source-reiserfs.c | 2 +- 3 files changed, 4 insertions(+), 2 deletions(-)