Message ID | 14f4089a9d26606c7a15e398536ca75f9c484c9c.1682505780.git.fdmanana@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: improve sibling keys check failure report | expand |
On 2023/4/26 18:51, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > When trying to move keys from one node/leaf to another sibling node/leaf, > if the sibling keys check fails we just print an error message with the > last key of the left sibling and the first key of the right sibling. > However it's also useful to print all the keys of each sibling, as it > may provide some clues to what went wrong, which code path may be > inserting keys in an incorrect order. So just do that, print the siblings > with btrfs_print_tree(), as it works for both leaves and nodes. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> However my concern is, printing two tree blocks may be a little too large. Definitely not something can be capture by one screen. Although dumping single tree block is already too large for a single screen, I don't have any better way... Thanks, Qu > --- > fs/btrfs/ctree.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index a0b97a6d075a..a061d7092369 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -2708,6 +2708,10 @@ static bool check_sibling_keys(struct extent_buffer *left, > } > > if (btrfs_comp_cpu_keys(&left_last, &right_first) >= 0) { > + btrfs_crit(left->fs_info, "left extent buffer:"); > + btrfs_print_tree(left, false); > + btrfs_crit(left->fs_info, "right extent buffer:"); > + btrfs_print_tree(right, false); > btrfs_crit(left->fs_info, > "bad key order, sibling blocks, left last (%llu %u %llu) right first (%llu %u %llu)", > left_last.objectid, left_last.type,
On Wed, Apr 26, 2023 at 12:21 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > On 2023/4/26 18:51, fdmanana@kernel.org wrote: > > From: Filipe Manana <fdmanana@suse.com> > > > > When trying to move keys from one node/leaf to another sibling node/leaf, > > if the sibling keys check fails we just print an error message with the > > last key of the left sibling and the first key of the right sibling. > > However it's also useful to print all the keys of each sibling, as it > > may provide some clues to what went wrong, which code path may be > > inserting keys in an incorrect order. So just do that, print the siblings > > with btrfs_print_tree(), as it works for both leaves and nodes. > > > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > > Reviewed-by: Qu Wenruo <wqu@suse.com> > > However my concern is, printing two tree blocks may be a little too large. > Definitely not something can be capture by one screen. What? If I check syslog/dmesg, I can certainly access hundreds, thousands of lines... I suppose by "capture by one screen" you mean a screenshot? > > Although dumping single tree block is already too large for a single > screen, I don't have any better way... > > Thanks, > Qu > > --- > > fs/btrfs/ctree.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > > index a0b97a6d075a..a061d7092369 100644 > > --- a/fs/btrfs/ctree.c > > +++ b/fs/btrfs/ctree.c > > @@ -2708,6 +2708,10 @@ static bool check_sibling_keys(struct extent_buffer *left, > > } > > > > if (btrfs_comp_cpu_keys(&left_last, &right_first) >= 0) { > > + btrfs_crit(left->fs_info, "left extent buffer:"); > > + btrfs_print_tree(left, false); > > + btrfs_crit(left->fs_info, "right extent buffer:"); > > + btrfs_print_tree(right, false); > > btrfs_crit(left->fs_info, > > "bad key order, sibling blocks, left last (%llu %u %llu) right first (%llu %u %llu)", > > left_last.objectid, left_last.type,
On 2023/4/26 19:31, Filipe Manana wrote: > On Wed, Apr 26, 2023 at 12:21 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: >> >> >> >> On 2023/4/26 18:51, fdmanana@kernel.org wrote: >>> From: Filipe Manana <fdmanana@suse.com> >>> >>> When trying to move keys from one node/leaf to another sibling node/leaf, >>> if the sibling keys check fails we just print an error message with the >>> last key of the left sibling and the first key of the right sibling. >>> However it's also useful to print all the keys of each sibling, as it >>> may provide some clues to what went wrong, which code path may be >>> inserting keys in an incorrect order. So just do that, print the siblings >>> with btrfs_print_tree(), as it works for both leaves and nodes. >>> >>> Signed-off-by: Filipe Manana <fdmanana@suse.com> >> >> Reviewed-by: Qu Wenruo <wqu@suse.com> >> >> However my concern is, printing two tree blocks may be a little too large. >> Definitely not something can be capture by one screen. > > What? > If I check syslog/dmesg, I can certainly access hundreds, thousands of lines... > > I suppose by "capture by one screen" you mean a screenshot? Yep, mostly a phone shot of a physical monitor, which a lot of end users choose to use to report their initial trans abort. E.g. https://lore.kernel.org/linux-btrfs/f057bdd1-bdd9-459f-b25f-6a2faa652499@betaapp.fastmail.com/ I guess the reason is, if the trans abort happens on the root fs, there will be no persistent log files anyway. (Although it's still possible to pass the dmesg to another machine or go netconsole, but not everyone has such setup ready). Thanks, Qu > >> >> Although dumping single tree block is already too large for a single >> screen, I don't have any better way... >> >> Thanks, >> Qu >>> --- >>> fs/btrfs/ctree.c | 4 ++++ >>> 1 file changed, 4 insertions(+) >>> >>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c >>> index a0b97a6d075a..a061d7092369 100644 >>> --- a/fs/btrfs/ctree.c >>> +++ b/fs/btrfs/ctree.c >>> @@ -2708,6 +2708,10 @@ static bool check_sibling_keys(struct extent_buffer *left, >>> } >>> >>> if (btrfs_comp_cpu_keys(&left_last, &right_first) >= 0) { >>> + btrfs_crit(left->fs_info, "left extent buffer:"); >>> + btrfs_print_tree(left, false); >>> + btrfs_crit(left->fs_info, "right extent buffer:"); >>> + btrfs_print_tree(right, false); >>> btrfs_crit(left->fs_info, >>> "bad key order, sibling blocks, left last (%llu %u %llu) right first (%llu %u %llu)", >>> left_last.objectid, left_last.type,
On Wed, Apr 26, 2023 at 12:40 PM Qu Wenruo <wqu@suse.com> wrote: > > > > On 2023/4/26 19:31, Filipe Manana wrote: > > On Wed, Apr 26, 2023 at 12:21 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > >> > >> > >> > >> On 2023/4/26 18:51, fdmanana@kernel.org wrote: > >>> From: Filipe Manana <fdmanana@suse.com> > >>> > >>> When trying to move keys from one node/leaf to another sibling node/leaf, > >>> if the sibling keys check fails we just print an error message with the > >>> last key of the left sibling and the first key of the right sibling. > >>> However it's also useful to print all the keys of each sibling, as it > >>> may provide some clues to what went wrong, which code path may be > >>> inserting keys in an incorrect order. So just do that, print the siblings > >>> with btrfs_print_tree(), as it works for both leaves and nodes. > >>> > >>> Signed-off-by: Filipe Manana <fdmanana@suse.com> > >> > >> Reviewed-by: Qu Wenruo <wqu@suse.com> > >> > >> However my concern is, printing two tree blocks may be a little too large. > >> Definitely not something can be capture by one screen. > > > > What? > > If I check syslog/dmesg, I can certainly access hundreds, thousands of lines... > > > > I suppose by "capture by one screen" you mean a screenshot? > > Yep, mostly a phone shot of a physical monitor, which a lot of end users > choose to use to report their initial trans abort. > > E.g. > https://lore.kernel.org/linux-btrfs/f057bdd1-bdd9-459f-b25f-6a2faa652499@betaapp.fastmail.com/ > > I guess the reason is, if the trans abort happens on the root fs, there > will be no persistent log files anyway. > (Although it's still possible to pass the dmesg to another machine or go > netconsole, but not everyone has such setup ready). So what? I don't see how that invalidates printing extent buffers... Many users are able to access dmesg/syslog and paste what they get there... It's also useful for development where we can certainly access everything... > > Thanks, > Qu > > > >> > >> Although dumping single tree block is already too large for a single > >> screen, I don't have any better way... > >> > >> Thanks, > >> Qu > >>> --- > >>> fs/btrfs/ctree.c | 4 ++++ > >>> 1 file changed, 4 insertions(+) > >>> > >>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > >>> index a0b97a6d075a..a061d7092369 100644 > >>> --- a/fs/btrfs/ctree.c > >>> +++ b/fs/btrfs/ctree.c > >>> @@ -2708,6 +2708,10 @@ static bool check_sibling_keys(struct extent_buffer *left, > >>> } > >>> > >>> if (btrfs_comp_cpu_keys(&left_last, &right_first) >= 0) { > >>> + btrfs_crit(left->fs_info, "left extent buffer:"); > >>> + btrfs_print_tree(left, false); > >>> + btrfs_crit(left->fs_info, "right extent buffer:"); > >>> + btrfs_print_tree(right, false); > >>> btrfs_crit(left->fs_info, > >>> "bad key order, sibling blocks, left last (%llu %u %llu) right first (%llu %u %llu)", > >>> left_last.objectid, left_last.type,
On 2023/4/26 19:44, Filipe Manana wrote: > On Wed, Apr 26, 2023 at 12:40 PM Qu Wenruo <wqu@suse.com> wrote: >> >> >> >> On 2023/4/26 19:31, Filipe Manana wrote: >>> On Wed, Apr 26, 2023 at 12:21 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: >>>> >>>> >>>> >>>> On 2023/4/26 18:51, fdmanana@kernel.org wrote: >>>>> From: Filipe Manana <fdmanana@suse.com> >>>>> >>>>> When trying to move keys from one node/leaf to another sibling node/leaf, >>>>> if the sibling keys check fails we just print an error message with the >>>>> last key of the left sibling and the first key of the right sibling. >>>>> However it's also useful to print all the keys of each sibling, as it >>>>> may provide some clues to what went wrong, which code path may be >>>>> inserting keys in an incorrect order. So just do that, print the siblings >>>>> with btrfs_print_tree(), as it works for both leaves and nodes. >>>>> >>>>> Signed-off-by: Filipe Manana <fdmanana@suse.com> >>>> >>>> Reviewed-by: Qu Wenruo <wqu@suse.com> >>>> >>>> However my concern is, printing two tree blocks may be a little too large. >>>> Definitely not something can be capture by one screen. >>> >>> What? >>> If I check syslog/dmesg, I can certainly access hundreds, thousands of lines... >>> >>> I suppose by "capture by one screen" you mean a screenshot? >> >> Yep, mostly a phone shot of a physical monitor, which a lot of end users >> choose to use to report their initial trans abort. >> >> E.g. >> https://lore.kernel.org/linux-btrfs/f057bdd1-bdd9-459f-b25f-6a2faa652499@betaapp.fastmail.com/ >> >> I guess the reason is, if the trans abort happens on the root fs, there >> will be no persistent log files anyway. >> (Although it's still possible to pass the dmesg to another machine or go >> netconsole, but not everyone has such setup ready). > > So what? > I don't see how that invalidates printing extent buffers... Many users > are able to access dmesg/syslog > and paste what they get there... It's also useful for development > where we can certainly access everything... That's why I gave the reviewed-by tag. Maybe we can shrink the output to the first several and last several items in the future. But for now since it's just a trans abort, it should be mostly fine for the end uesrs to access the full dump. Thanks, Qu > > >> >> Thanks, >> Qu >>> >>>> >>>> Although dumping single tree block is already too large for a single >>>> screen, I don't have any better way... >>>> >>>> Thanks, >>>> Qu >>>>> --- >>>>> fs/btrfs/ctree.c | 4 ++++ >>>>> 1 file changed, 4 insertions(+) >>>>> >>>>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c >>>>> index a0b97a6d075a..a061d7092369 100644 >>>>> --- a/fs/btrfs/ctree.c >>>>> +++ b/fs/btrfs/ctree.c >>>>> @@ -2708,6 +2708,10 @@ static bool check_sibling_keys(struct extent_buffer *left, >>>>> } >>>>> >>>>> if (btrfs_comp_cpu_keys(&left_last, &right_first) >= 0) { >>>>> + btrfs_crit(left->fs_info, "left extent buffer:"); >>>>> + btrfs_print_tree(left, false); >>>>> + btrfs_crit(left->fs_info, "right extent buffer:"); >>>>> + btrfs_print_tree(right, false); >>>>> btrfs_crit(left->fs_info, >>>>> "bad key order, sibling blocks, left last (%llu %u %llu) right first (%llu %u %llu)", >>>>> left_last.objectid, left_last.type,
On Wed, Apr 26, 2023 at 12:50 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > On 2023/4/26 19:44, Filipe Manana wrote: > > On Wed, Apr 26, 2023 at 12:40 PM Qu Wenruo <wqu@suse.com> wrote: > >> > >> > >> > >> On 2023/4/26 19:31, Filipe Manana wrote: > >>> On Wed, Apr 26, 2023 at 12:21 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > >>>> > >>>> > >>>> > >>>> On 2023/4/26 18:51, fdmanana@kernel.org wrote: > >>>>> From: Filipe Manana <fdmanana@suse.com> > >>>>> > >>>>> When trying to move keys from one node/leaf to another sibling node/leaf, > >>>>> if the sibling keys check fails we just print an error message with the > >>>>> last key of the left sibling and the first key of the right sibling. > >>>>> However it's also useful to print all the keys of each sibling, as it > >>>>> may provide some clues to what went wrong, which code path may be > >>>>> inserting keys in an incorrect order. So just do that, print the siblings > >>>>> with btrfs_print_tree(), as it works for both leaves and nodes. > >>>>> > >>>>> Signed-off-by: Filipe Manana <fdmanana@suse.com> > >>>> > >>>> Reviewed-by: Qu Wenruo <wqu@suse.com> > >>>> > >>>> However my concern is, printing two tree blocks may be a little too large. > >>>> Definitely not something can be capture by one screen. > >>> > >>> What? > >>> If I check syslog/dmesg, I can certainly access hundreds, thousands of lines... > >>> > >>> I suppose by "capture by one screen" you mean a screenshot? > >> > >> Yep, mostly a phone shot of a physical monitor, which a lot of end users > >> choose to use to report their initial trans abort. > >> > >> E.g. > >> https://lore.kernel.org/linux-btrfs/f057bdd1-bdd9-459f-b25f-6a2faa652499@betaapp.fastmail.com/ > >> > >> I guess the reason is, if the trans abort happens on the root fs, there > >> will be no persistent log files anyway. > >> (Although it's still possible to pass the dmesg to another machine or go > >> netconsole, but not everyone has such setup ready). > > > > So what? > > I don't see how that invalidates printing extent buffers... Many users > > are able to access dmesg/syslog > > and paste what they get there... It's also useful for development > > where we can certainly access everything... > > That's why I gave the reviewed-by tag. > > Maybe we can shrink the output to the first several and last several > items in the future. That's not really useful for most cases, and I've certainly run into key ordering bugs in the past where having an entire leaf printed made the problem easier to debug, sometimes it made it immediately obvious where the bug happened and why. We are avoiding printing useful stuff just because we assume people will take a photo from their phone... For those with access to dmesg/syslog, we lose useful information. > > But for now since it's just a trans abort, it should be mostly fine for > the end uesrs to access the full dump. > > Thanks, > Qu > > > > > >> > >> Thanks, > >> Qu > >>> > >>>> > >>>> Although dumping single tree block is already too large for a single > >>>> screen, I don't have any better way... > >>>> > >>>> Thanks, > >>>> Qu > >>>>> --- > >>>>> fs/btrfs/ctree.c | 4 ++++ > >>>>> 1 file changed, 4 insertions(+) > >>>>> > >>>>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > >>>>> index a0b97a6d075a..a061d7092369 100644 > >>>>> --- a/fs/btrfs/ctree.c > >>>>> +++ b/fs/btrfs/ctree.c > >>>>> @@ -2708,6 +2708,10 @@ static bool check_sibling_keys(struct extent_buffer *left, > >>>>> } > >>>>> > >>>>> if (btrfs_comp_cpu_keys(&left_last, &right_first) >= 0) { > >>>>> + btrfs_crit(left->fs_info, "left extent buffer:"); > >>>>> + btrfs_print_tree(left, false); > >>>>> + btrfs_crit(left->fs_info, "right extent buffer:"); > >>>>> + btrfs_print_tree(right, false); > >>>>> btrfs_crit(left->fs_info, > >>>>> "bad key order, sibling blocks, left last (%llu %u %llu) right first (%llu %u %llu)", > >>>>> left_last.objectid, left_last.type,
On Wed, Apr 26, 2023 at 07:20:57PM +0800, Qu Wenruo wrote: > On 2023/4/26 18:51, fdmanana@kernel.org wrote: > > From: Filipe Manana <fdmanana@suse.com> > > > > When trying to move keys from one node/leaf to another sibling node/leaf, > > if the sibling keys check fails we just print an error message with the > > last key of the left sibling and the first key of the right sibling. > > However it's also useful to print all the keys of each sibling, as it > > may provide some clues to what went wrong, which code path may be > > inserting keys in an incorrect order. So just do that, print the siblings > > with btrfs_print_tree(), as it works for both leaves and nodes. > > > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > > Reviewed-by: Qu Wenruo <wqu@suse.com> > > However my concern is, printing two tree blocks may be a little too large. > Definitely not something can be capture by one screen. > > Although dumping single tree block is already too large for a single > screen, I don't have any better way... For debugging with full logging we can add what we think would be useful but for the screenshot debugging we can only choose the last piece of information which I think is the stack trace.
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index a0b97a6d075a..a061d7092369 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -2708,6 +2708,10 @@ static bool check_sibling_keys(struct extent_buffer *left, } if (btrfs_comp_cpu_keys(&left_last, &right_first) >= 0) { + btrfs_crit(left->fs_info, "left extent buffer:"); + btrfs_print_tree(left, false); + btrfs_crit(left->fs_info, "right extent buffer:"); + btrfs_print_tree(right, false); btrfs_crit(left->fs_info, "bad key order, sibling blocks, left last (%llu %u %llu) right first (%llu %u %llu)", left_last.objectid, left_last.type,