Message ID | 20191128075437.10621-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: relocation: Output current relocation stage at btrfs_relocate_block_group() | expand |
On 2019/11/28 3:54 PM, Qu Wenruo wrote: > There are several reports of hanging relocation, populating the dmesg > with things like: > BTRFS info (device dm-5): found 1 extents > > The investigation is still on going, but will never hurt to output a > little more info. > > This patch will also output the current relocation stage, making that > output something like: > > BTRFS info (device dm-5): balance: start -d -m -s > BTRFS info (device dm-5): relocating block group 30408704 flags metadata|dup > BTRFS info (device dm-5): found 2 extents at MOVE_DATA_EXTENT stage > BTRFS info (device dm-5): relocating block group 22020096 flags system|dup > BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage > BTRFS info (device dm-5): relocating block group 13631488 flags data > BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage > BTRFS info (device dm-5): found 1 extents at UPDATE_DATA_PTRS stage > BTRFS info (device dm-5): balance: ended with status: 0 > > The string "MOVE_DATA_EXTENT" and "UPDATE_DATA_PTRS" is mostly from the > macro MOVE_DATA_EXTENTS and UPDATE_DATA_PTRS, but the 'S' from > MOVE_DATA_EXTENTS is removed in the output string to make the alignment > better. > > This patch will not increase the number of lines, but with extra info > for us to debug the reported problem. > (Although it's very likely the bug is sticking at UPDATE_DATA_PTRS > stage, even without the patch) > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/relocation.c | 16 ++++++++++++++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c > index d897a8e5e430..88fd9182852d 100644 > --- a/fs/btrfs/relocation.c > +++ b/fs/btrfs/relocation.c > @@ -4291,6 +4291,15 @@ static void describe_relocation(struct btrfs_fs_info *fs_info, > block_group->start, buf); > } > > +static const char *stage_to_string(int stage) > +{ > + if (stage == MOVE_DATA_EXTENTS) > + return "MOVE_DATA_EXTENT"; > + if (stage == UPDATE_DATA_PTRS) > + return "UPDATE_DATA_PTRS"; > + return "UNKNOWN"; > +} > + > /* > * function to relocate all extents in a block group. > */ > @@ -4365,12 +4374,15 @@ int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start) > rc->block_group->length); > > while (1) { > + int finishes_stage; > + NIT: the rc::stage is an unsigned integer. > mutex_lock(&fs_info->cleaner_mutex); > ret = relocate_block_group(rc); > mutex_unlock(&fs_info->cleaner_mutex); > if (ret < 0) > err = ret; > > + finishes_stage = rc->stage; > /* > * We may have gotten ENOSPC after we already dirtied some > * extents. If writeout happens while we're relocating a > @@ -4396,8 +4408,8 @@ int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start) > if (rc->extents_found == 0) > break; > > - btrfs_info(fs_info, "found %llu extents", rc->extents_found); > - > + btrfs_info(fs_info, "found %llu extents at %s stage", > + rc->extents_found, stage_to_string(finishes_stage)); > } > > WARN_ON(rc->block_group->pinned > 0); >
On 2019/11/28 下午6:50, Su Yue wrote: > > > On 2019/11/28 3:54 PM, Qu Wenruo wrote: >> There are several reports of hanging relocation, populating the dmesg >> with things like: >> BTRFS info (device dm-5): found 1 extents >> >> The investigation is still on going, but will never hurt to output a >> little more info. >> >> This patch will also output the current relocation stage, making that >> output something like: >> >> BTRFS info (device dm-5): balance: start -d -m -s >> BTRFS info (device dm-5): relocating block group 30408704 flags >> metadata|dup >> BTRFS info (device dm-5): found 2 extents at MOVE_DATA_EXTENT stage >> BTRFS info (device dm-5): relocating block group 22020096 flags >> system|dup >> BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage >> BTRFS info (device dm-5): relocating block group 13631488 flags data >> BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage >> BTRFS info (device dm-5): found 1 extents at UPDATE_DATA_PTRS stage >> BTRFS info (device dm-5): balance: ended with status: 0 >> >> The string "MOVE_DATA_EXTENT" and "UPDATE_DATA_PTRS" is mostly from the >> macro MOVE_DATA_EXTENTS and UPDATE_DATA_PTRS, but the 'S' from >> MOVE_DATA_EXTENTS is removed in the output string to make the alignment >> better. >> >> This patch will not increase the number of lines, but with extra info >> for us to debug the reported problem. >> (Although it's very likely the bug is sticking at UPDATE_DATA_PTRS >> stage, even without the patch) >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> fs/btrfs/relocation.c | 16 ++++++++++++++-- >> 1 file changed, 14 insertions(+), 2 deletions(-) >> >> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c >> index d897a8e5e430..88fd9182852d 100644 >> --- a/fs/btrfs/relocation.c >> +++ b/fs/btrfs/relocation.c >> @@ -4291,6 +4291,15 @@ static void describe_relocation(struct >> btrfs_fs_info *fs_info, >> block_group->start, buf); >> } >> >> +static const char *stage_to_string(int stage) >> +{ >> + if (stage == MOVE_DATA_EXTENTS) >> + return "MOVE_DATA_EXTENT"; >> + if (stage == UPDATE_DATA_PTRS) >> + return "UPDATE_DATA_PTRS"; >> + return "UNKNOWN"; >> +} >> + >> /* >> * function to relocate all extents in a block group. >> */ >> @@ -4365,12 +4374,15 @@ int btrfs_relocate_block_group(struct >> btrfs_fs_info *fs_info, u64 group_start) >> rc->block_group->length); >> >> while (1) { >> + int finishes_stage; >> + > > NIT: the rc::stage is an unsigned integer. It doesn't matter. rc::stage is only a 8bit value. Both unsigned int/int can handle it without any problem. And the strage_to_string() function can handle any value, so no problem at all. Thanks, Qu > > >> mutex_lock(&fs_info->cleaner_mutex); >> ret = relocate_block_group(rc); >> mutex_unlock(&fs_info->cleaner_mutex); >> if (ret < 0) >> err = ret; >> >> + finishes_stage = rc->stage; >> /* >> * We may have gotten ENOSPC after we already dirtied some >> * extents. If writeout happens while we're relocating a >> @@ -4396,8 +4408,8 @@ int btrfs_relocate_block_group(struct >> btrfs_fs_info *fs_info, u64 group_start) >> if (rc->extents_found == 0) >> break; >> >> - btrfs_info(fs_info, "found %llu extents", rc->extents_found); >> - >> + btrfs_info(fs_info, "found %llu extents at %s stage", >> + rc->extents_found, stage_to_string(finishes_stage)); >> } >> >> WARN_ON(rc->block_group->pinned > 0); >>
On Thu, Nov 28, 2019 at 03:54:36PM +0800, Qu Wenruo wrote: > There are several reports of hanging relocation, populating the dmesg > with things like: > BTRFS info (device dm-5): found 1 extents > > The investigation is still on going, but will never hurt to output a > little more info. > > This patch will also output the current relocation stage, making that > output something like: > > BTRFS info (device dm-5): balance: start -d -m -s > BTRFS info (device dm-5): relocating block group 30408704 flags metadata|dup > BTRFS info (device dm-5): found 2 extents at MOVE_DATA_EXTENT stage > BTRFS info (device dm-5): relocating block group 22020096 flags system|dup > BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage > BTRFS info (device dm-5): relocating block group 13631488 flags data > BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage > BTRFS info (device dm-5): found 1 extents at UPDATE_DATA_PTRS stage > BTRFS info (device dm-5): balance: ended with status: 0 > > The string "MOVE_DATA_EXTENT" and "UPDATE_DATA_PTRS" is mostly from the > macro MOVE_DATA_EXTENTS and UPDATE_DATA_PTRS, but the 'S' from > MOVE_DATA_EXTENTS is removed in the output string to make the alignment > better. > > This patch will not increase the number of lines, but with extra info > for us to debug the reported problem. Nice. I'd suggest to make it more user friendly relocation: found 111 extents, stage: move data blocks relocation: found 111 extents, stage: update data pointers The identifier can be understood what it means but it's IMHO not important to copy it to the message verbatim.
On 2019/11/28 下午11:40, David Sterba wrote: > On Thu, Nov 28, 2019 at 03:54:36PM +0800, Qu Wenruo wrote: >> There are several reports of hanging relocation, populating the dmesg >> with things like: >> BTRFS info (device dm-5): found 1 extents >> >> The investigation is still on going, but will never hurt to output a >> little more info. >> >> This patch will also output the current relocation stage, making that >> output something like: >> >> BTRFS info (device dm-5): balance: start -d -m -s >> BTRFS info (device dm-5): relocating block group 30408704 flags metadata|dup >> BTRFS info (device dm-5): found 2 extents at MOVE_DATA_EXTENT stage >> BTRFS info (device dm-5): relocating block group 22020096 flags system|dup >> BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage >> BTRFS info (device dm-5): relocating block group 13631488 flags data >> BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage >> BTRFS info (device dm-5): found 1 extents at UPDATE_DATA_PTRS stage >> BTRFS info (device dm-5): balance: ended with status: 0 >> >> The string "MOVE_DATA_EXTENT" and "UPDATE_DATA_PTRS" is mostly from the >> macro MOVE_DATA_EXTENTS and UPDATE_DATA_PTRS, but the 'S' from >> MOVE_DATA_EXTENTS is removed in the output string to make the alignment >> better. >> >> This patch will not increase the number of lines, but with extra info >> for us to debug the reported problem. > > Nice. I'd suggest to make it more user friendly > > relocation: found 111 extents, stage: move data blocks > relocation: found 111 extents, stage: update data pointers This is much better, keeps the indent while provide better readability. I'll go this way in the next version. Thanks, Qu > > The identifier can be understood what it means but it's IMHO not > important to copy it to the message verbatim. >
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index d897a8e5e430..88fd9182852d 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -4291,6 +4291,15 @@ static void describe_relocation(struct btrfs_fs_info *fs_info, block_group->start, buf); } +static const char *stage_to_string(int stage) +{ + if (stage == MOVE_DATA_EXTENTS) + return "MOVE_DATA_EXTENT"; + if (stage == UPDATE_DATA_PTRS) + return "UPDATE_DATA_PTRS"; + return "UNKNOWN"; +} + /* * function to relocate all extents in a block group. */ @@ -4365,12 +4374,15 @@ int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start) rc->block_group->length); while (1) { + int finishes_stage; + mutex_lock(&fs_info->cleaner_mutex); ret = relocate_block_group(rc); mutex_unlock(&fs_info->cleaner_mutex); if (ret < 0) err = ret; + finishes_stage = rc->stage; /* * We may have gotten ENOSPC after we already dirtied some * extents. If writeout happens while we're relocating a @@ -4396,8 +4408,8 @@ int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start) if (rc->extents_found == 0) break; - btrfs_info(fs_info, "found %llu extents", rc->extents_found); - + btrfs_info(fs_info, "found %llu extents at %s stage", + rc->extents_found, stage_to_string(finishes_stage)); } WARN_ON(rc->block_group->pinned > 0);
There are several reports of hanging relocation, populating the dmesg with things like: BTRFS info (device dm-5): found 1 extents The investigation is still on going, but will never hurt to output a little more info. This patch will also output the current relocation stage, making that output something like: BTRFS info (device dm-5): balance: start -d -m -s BTRFS info (device dm-5): relocating block group 30408704 flags metadata|dup BTRFS info (device dm-5): found 2 extents at MOVE_DATA_EXTENT stage BTRFS info (device dm-5): relocating block group 22020096 flags system|dup BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage BTRFS info (device dm-5): relocating block group 13631488 flags data BTRFS info (device dm-5): found 1 extents at MOVE_DATA_EXTENT stage BTRFS info (device dm-5): found 1 extents at UPDATE_DATA_PTRS stage BTRFS info (device dm-5): balance: ended with status: 0 The string "MOVE_DATA_EXTENT" and "UPDATE_DATA_PTRS" is mostly from the macro MOVE_DATA_EXTENTS and UPDATE_DATA_PTRS, but the 'S' from MOVE_DATA_EXTENTS is removed in the output string to make the alignment better. This patch will not increase the number of lines, but with extra info for us to debug the reported problem. (Although it's very likely the bug is sticking at UPDATE_DATA_PTRS stage, even without the patch) Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/relocation.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)