Message ID | 1459390774-12424-1-git-send-email-quwenruo@cn.fujitsu.com (mailing list archive) |
---|---|
State | Accepted |
On Thu, Mar 31, 2016 at 10:19:34AM +0800, Qu Wenruo wrote:
> At least 2 user from mail list reported btrfsck reported false alert of
> "bad metadata [XXXX,YYYY) crossing stripe boundary".
>
> While the reported number are all inside the same 64K boundary.
> After some check, all the false alert have the same bytenr feature,
> which can be divided by stripe size (64K).
>
> The result seems to be initial 'max_size' can be 0, causing 'start' +
> 'max_size' - 1, to cross the stripe boundary.
>
> Fix it by always update extent_record->cross_stripe when the
> extent_record is updated, to avoid temporary false alert to be reported.
>
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

Applied, thanks.

Do you have a test image for that?
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

David Sterba wrote on 2016/03/31 18:30 +0200:
> On Thu, Mar 31, 2016 at 10:19:34AM +0800, Qu Wenruo wrote:
>> At least 2 user from mail list reported btrfsck reported false alert of
>> "bad metadata [XXXX,YYYY) crossing stripe boundary".
>>
>> While the reported number are all inside the same 64K boundary.
>> After some check, all the false alert have the same bytenr feature,
>> which can be divided by stripe size (64K).
>>
>> The result seems to be initial 'max_size' can be 0, causing 'start' +
>> 'max_size' - 1, to cross the stripe boundary.
>>
>> Fix it by always update extent_record->cross_stripe when the
>> extent_record is updated, to avoid temporary false alert to be reported.
>>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>
> Applied, thanks.
>
> Do you have a test image for that?

Unfortunately, no.

Although I figured out the cause of the false alert, I still didn't
find an image/method to reproduce it, except the images of reporters.

I can dig a little further trying to make an image.

Thanks,
Qu

On Fri, Apr 01, 2016 at 08:28:18AM +0800, Qu Wenruo wrote:
> David Sterba wrote on 2016/03/31 18:30 +0200:
> > On Thu, Mar 31, 2016 at 10:19:34AM +0800, Qu Wenruo wrote:
> >> At least 2 user from mail list reported btrfsck reported false alert of
> >> "bad metadata [XXXX,YYYY) crossing stripe boundary".
> >>
> >> While the reported number are all inside the same 64K boundary.
> >> After some check, all the false alert have the same bytenr feature,
> >> which can be divided by stripe size (64K).
> >>
> >> The result seems to be initial 'max_size' can be 0, causing 'start' +
> >> 'max_size' - 1, to cross the stripe boundary.
> >>
> >> Fix it by always update extent_record->cross_stripe when the
> >> extent_record is updated, to avoid temporary false alert to be reported.
> >>
> >> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> >
> > Applied, thanks.
> >
> > Do you have a test image for that?
>
> Unfortunately, no.
>
> Although I figured out the cause of the false alert, I still didn't
> find an image/method to reproduce it, except the images of reporters.
>
> I can dig a little further trying to make an image.

After another look, why don't we use nodesize directly? Or stripesize
where applies. With max_size == 0 the test does not make sense, we ought
to know the alignment.

David Sterba wrote on 2016/04/01 10:44 +0200:
> On Fri, Apr 01, 2016 at 08:28:18AM +0800, Qu Wenruo wrote:
>> David Sterba wrote on 2016/03/31 18:30 +0200:
>>> On Thu, Mar 31, 2016 at 10:19:34AM +0800, Qu Wenruo wrote:
>>>> At least 2 user from mail list reported btrfsck reported false alert of
>>>> "bad metadata [XXXX,YYYY) crossing stripe boundary".
>>>>
>>>> While the reported number are all inside the same 64K boundary.
>>>> After some check, all the false alert have the same bytenr feature,
>>>> which can be divided by stripe size (64K).
>>>>
>>>> The result seems to be initial 'max_size' can be 0, causing 'start' +
>>>> 'max_size' - 1, to cross the stripe boundary.
>>>>
>>>> Fix it by always update extent_record->cross_stripe when the
>>>> extent_record is updated, to avoid temporary false alert to be reported.
>>>>
>>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>
>>> Applied, thanks.
>>>
>>> Do you have a test image for that?
>>
>> Unfortunately, no.
>>
>> Although I figured out the cause of the false alert, I still didn't
>> find an image/method to reproduce it, except the images of reporters.
>>
>> I can dig a little further trying to make an image.
>
> After another look, why don't we use nodesize directly? Or stripesize
> where applies. With max_size == 0 the test does not make sense, we ought
> to know the alignment.

Yes, my first thought is also to use nodesize directly, which should be
always correct.

But the problem is, the related function call stack doesn't have any
member to reach btrfs_root or btrfs_fs_info.

In the very beginning version of such crossing stripe check, I used to
add a btrfs_root/btrfs_fs_info parameter to the function.

But the code change are too many, so I use 'max_size'.

I can try to re-do such modification, but IIRC it didn't cause a good
result.

Thanks,
Qu

On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote:
> > After another look, why don't we use nodesize directly? Or stripesize
> > where applies. With max_size == 0 the test does not make sense, we ought
> > to know the alignment.
>
> Yes, my first thought is also to use nodesize directly, which should be
> always correct.
>
> But the problem is, the related function call stack doesn't have any
> member to reach btrfs_root or btrfs_fs_info.
>
> In the very beginning version of such crossing stripe check, I used to
> add a btrfs_root/btrfs_fs_info parameter to the function.
>
> But the code change are too many, so I use 'max_size'.
>
> I can try to re-do such modification, but IIRC it didn't cause a good
> result.

Yes it would require refactoring, which would be good on itself, because
add_extent_rec takes 12(!) parameters. Some of its callers would need to
be updated, but it seems doable.

On 04/01/2016 07:39 PM, David Sterba wrote:
> On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote:
>>> After another look, why don't we use nodesize directly? Or stripesize
>>> where applies. With max_size == 0 the test does not make sense, we ought
>>> to know the alignment.
>>
>> Yes, my first thought is also to use nodesize directly, which should be
>> always correct.
>>
>> But the problem is, the related function call stack doesn't have any
>> member to reach btrfs_root or btrfs_fs_info.
>>
>> In the very beginning version of such crossing stripe check, I used to
>> add a btrfs_root/btrfs_fs_info parameter to the function.
>>
>> But the code change are too many, so I use 'max_size'.
>>
>> I can try to re-do such modification, but IIRC it didn't cause a good
>> result.
>
> Yes it would require refactoring, which would be good on itself, because
> add_extent_rec takes 12(!) parameters. Some of its callers would need to
> be updated, but it seems doable.

I'll try to refactor.

I thought current extent-tree rework would change all these mess, but
considering the case of btrfs-convert, I'd better refactor current code
rather than waiting other reviewers to appear.

Thanks,
Qu

On Fri, Apr 01, 2016 at 08:09:56PM +0800, Qu Wenruo wrote:
> On 04/01/2016 07:39 PM, David Sterba wrote:
> > On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote:
> >>> After another look, why don't we use nodesize directly? Or stripesize
> >>> where applies. With max_size == 0 the test does not make sense, we ought
> >>> to know the alignment.
> >>
> >> Yes, my first thought is also to use nodesize directly, which should be
> >> always correct.
> >>
> >> But the problem is, the related function call stack doesn't have any
> >> member to reach btrfs_root or btrfs_fs_info.
> >>
> >> In the very beginning version of such crossing stripe check, I used to
> >> add a btrfs_root/btrfs_fs_info parameter to the function.
> >>
> >> But the code change are too many, so I use 'max_size'.
> >>
> >> I can try to re-do such modification, but IIRC it didn't cause a good
> >> result.
> >
> > Yes it would require refactoring, which would be good on itself, because
> > add_extent_rec takes 12(!) parameters. Some of its callers would need to
> > be updated, but it seems doable.
>
> I'll try to refactor.

I'm working on it.

On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote:
> > After another look, why don't we use nodesize directly? Or stripesize
> > where applies. With max_size == 0 the test does not make sense, we ought
> > to know the alignment.
>
> Yes, my first thought is also to use nodesize directly, which should be
> always correct.
>
> But the problem is, the related function call stack doesn't have any
> member to reach btrfs_root or btrfs_fs_info.

JFYI, there's global_info available, so it's not necessary to pass
fs_info down the callstacks.

David Sterba wrote on 2016/04/04 13:18 +0200:
> On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote:
>>> After another look, why don't we use nodesize directly? Or stripesize
>>> where applies. With max_size == 0 the test does not make sense, we ought
>>> to know the alignment.
>>
>> Yes, my first thought is also to use nodesize directly, which should be
>> always correct.
>>
>> But the problem is, the related function call stack doesn't have any
>> member to reach btrfs_root or btrfs_fs_info.
>
> JFYI, there's global_info available, so it's not necessary to pass
> fs_info down the callstacks.

Oh, that's a good news.

Do I need to re-submit the patch to use fs_info->tree_root->nodesize to
avoid false alert?
Or wait for your refactor?

Thanks,
Qu

On Tue, Apr 05, 2016 at 09:28:31AM +0800, Qu Wenruo wrote:
> David Sterba wrote on 2016/04/04 13:18 +0200:
> > On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote:
> >>> After another look, why don't we use nodesize directly? Or stripesize
> >>> where applies. With max_size == 0 the test does not make sense, we ought
> >>> to know the alignment.
> >>
> >> Yes, my first thought is also to use nodesize directly, which should be
> >> always correct.
> >>
> >> But the problem is, the related function call stack doesn't have any
> >> member to reach btrfs_root or btrfs_fs_info.
> >
> > JFYI, there's global_info available, so it's not necessary to pass
> > fs_info down the callstacks.
>
> Oh, that's a good news.
>
> Do I need to re-submit the patch to use fs_info->tree_root->nodesize to
> avoid false alert?
> Or wait for your refactor?

No need to resend, the refactored code is now in devel.

diff --git a/cmds-check.c b/cmds-check.c
index d157075..ef23ddb 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4579,9 +4579,9 @@ static int add_extent_rec(struct cache_tree *extent_cache,
 		 * As now stripe_len is fixed to BTRFS_STRIPE_LEN, just check
 		 * it.
 		 */
-		if (metadata && check_crossing_stripes(rec->start,
-						       rec->max_size))
-			rec->crossing_stripes = 1;
+		if (metadata)
+			rec->crossing_stripes = check_crossing_stripes(
+					rec->start, rec->max_size);
 		check_extent_type(rec);
 		maybe_free_extent_rec(extent_cache, rec);
 		return ret;
@@ -4641,8 +4641,8 @@ static int add_extent_rec(struct cache_tree *extent_cache,
 	}
 
 	if (metadata)
-		if (check_crossing_stripes(rec->start, rec->max_size))
-			rec->crossing_stripes = 1;
+		rec->crossing_stripes = check_crossing_stripes(rec->start,
+				rec->max_size);
 	check_extent_type(rec);
 	return ret;
 }

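The behavioural difference in the diff above can be sketched in isolation. The following is only an illustration, not code from btrfs-progs: `struct demo_rec`, `demo_check_crossing()`, `demo_update_sticky()` and `demo_update_recompute()` are hypothetical names, the record is cut down to three fields, the 64 KiB stripe length is taken from the patch description, and the shape of the check is assumed from the 'start' + 'max_size' - 1 expression mentioned there.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_STRIPE_LEN (64 * 1024ULL)	/* assumed 64 KiB stripe length */

/* Hypothetical, cut-down stand-in for the extent record in cmds-check.c. */
struct demo_rec {
	uint64_t start;
	uint32_t max_size;
	int crossing_stripes;
};

/* Assumed shape of the check: do first and last byte share a stripe? */
static int demo_check_crossing(uint64_t start, uint32_t len)
{
	return (start / DEMO_STRIPE_LEN) !=
	       ((start + len - 1) / DEMO_STRIPE_LEN);
}

/* Pre-patch pattern: the flag can be set but is never cleared. */
static void demo_update_sticky(struct demo_rec *rec, uint32_t max_size)
{
	rec->max_size = max_size;
	if (demo_check_crossing(rec->start, rec->max_size))
		rec->crossing_stripes = 1;
}

/* Post-patch pattern: the flag is recomputed on every update. */
static void demo_update_recompute(struct demo_rec *rec, uint32_t max_size)
{
	rec->max_size = max_size;
	rec->crossing_stripes = demo_check_crossing(rec->start, rec->max_size);
}

/* Feeds both patterns the same two updates and compares the outcome. */
static void demo_compare(void)
{
	struct demo_rec old_rec = { 65536, 0, 0 };
	struct demo_rec new_rec = { 65536, 0, 0 };

	demo_update_sticky(&old_rec, 0);	/* size not known yet */
	demo_update_sticky(&old_rec, 16384);	/* real size arrives */
	assert(old_rec.crossing_stripes == 1);	/* stale false alert */

	demo_update_recompute(&new_rec, 0);
	demo_update_recompute(&new_rec, 16384);
	assert(new_rec.crossing_stripes == 0);	/* alert cleared */
}
```

Calling `demo_compare()` from a test program exercises both patterns. The sketch assigns `max_size` directly on each update, which is a simplification of whatever bookkeeping the real `add_extent_rec` does.
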
At least 2 user from mail list reported btrfsck reported false alert of
"bad metadata [XXXX,YYYY) crossing stripe boundary".

While the reported number are all inside the same 64K boundary.
After some check, all the false alert have the same bytenr feature,
which can be divided by stripe size (64K).

The result seems to be initial 'max_size' can be 0, causing 'start' +
'max_size' - 1, to cross the stripe boundary.

Fix it by always update extent_record->cross_stripe when the
extent_record is updated, to avoid temporary false alert to be reported.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 cmds-check.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
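
The arithmetic behind the false alert can be worked through with a small sketch. It assumes the check compares the stripe index of the first byte with that of 'start' + 'max_size' - 1, as the description implies. `crosses_stripe()` and `STRIPE_LEN_64K` are illustrative names, not identifiers from btrfs-progs, and the 16 KiB size is just an example value.

```c
#include <assert.h>
#include <stdint.h>

#define STRIPE_LEN_64K (64 * 1024ULL)	/* stripe size quoted in the report */

/*
 * Illustrative re-creation of the described check: an extent crosses a
 * stripe boundary when its first and last byte land in different stripes.
 */
static int crosses_stripe(uint64_t start, uint32_t len)
{
	return (start / STRIPE_LEN_64K) !=
	       ((start + len - 1) / STRIPE_LEN_64K);
}

/* Walks through the cases discussed in the patch description. */
static void demo_false_alert(void)
{
	/*
	 * start = 65536, len = 0: the "last byte" is 65535, which lies in
	 * stripe 0 while start lies in stripe 1, so the check fires.
	 */
	assert(crosses_stripe(65536, 0));

	/* Same start with a 16 KiB size: last byte 81919, still stripe 1. */
	assert(!crosses_stripe(65536, 16384));

	/* An unaligned start with len = 0 stays within one stripe. */
	assert(!crosses_stripe(65536 + 4096, 0));

	/* A genuine crossing: 61440 + 16384 - 1 = 77823 is in stripe 1. */
	assert(crosses_stripe(61440, 16384));
}
```

Only a start that is an exact multiple of 64K misfires with a zero length, which matches the observation that every falsely reported bytenr was divisible by the stripe size.
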