Message ID | 20170325075933.21072-1-jaegeuk@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Hi Jaegeuk,

On 2017/3/25 15:59, Jaegeuk Kim wrote:
> - has_not_enough_free_secs
> node_secs: 0 dent_secs: 0 freed:0 free_segments:103 reserved:104
>
> - f2fs_gc
> - get_victim_by_default
> alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1
>
> - do_garbage_collect
> start_segno 3976, end_segno 3977 type 0
>
> - is_alive
> nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0
>
> - gc_data_segment 766, segno 3976, block 512/426 not alive
>
> So, this patch fixes a subtle corruption case where the node version does not
> match the summary version, which results in an infinite loop in gc.
>
> Reported-by: Yunlei He <heyunlei@huawei.com>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/gc.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 939be88a8833..bbeee41aaf73 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>  	get_node_info(sbi, nid, dni);
>
>  	if (sum->version != dni->version) {

If the node has been truncated, we will increase its version number. Since it
has been truncated, it will never be written back to storage, so the version
in the summary will not be updated.

So this case can happen, shouldn't we just set SBI_NEED_FSCK for the case:
sum->version != dni->version - 1

Thanks,

> -		f2fs_put_page(node_page, 1);
> -		return false;
> +		f2fs_msg(sbi->sb, KERN_WARNING,
> +			"%s: valid data with mismatched node version.",
> +			__func__);
> +		set_sbi_flag(sbi, SBI_NEED_FSCK);
>  	}
>
>  	*nofs = ofs_of_node(node_page);
>
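The check suggested above can be sketched in isolation. This is a minimal standalone model under stated assumptions, not f2fs code: `classify_version`, `enum version_state`, and `version_demo` are hypothetical names, and both versions are assumed to be 8-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification for illustration; not an f2fs type. */
enum version_state {
	VERSION_MATCH,        /* summary and node versions agree */
	VERSION_ONE_BEHIND,   /* summary lags the node by exactly one */
	VERSION_INCONSISTENT  /* any other mismatch */
};

/*
 * Mirrors the expression from the mail: sum->version != dni->version - 1.
 * The subtraction is done in int, so node_version == 0 yields -1 and can
 * never equal an 8-bit summary version.
 */
static enum version_state classify_version(uint8_t sum_version,
					   uint8_t node_version)
{
	if (sum_version == node_version)
		return VERSION_MATCH;
	if ((int)sum_version == (int)node_version - 1)
		return VERSION_ONE_BEHIND;
	return VERSION_INCONSISTENT;
}

/* Usage: the cases discussed in the thread. */
static void version_demo(void)
{
	assert(classify_version(8, 8) == VERSION_MATCH);
	assert(classify_version(7, 8) == VERSION_ONE_BEHIND);
	/* 0x8 and 0x0 from the trace differ by more than one in either order. */
	assert(classify_version(0x0, 0x8) == VERSION_INCONSISTENT);
	assert(classify_version(0x8, 0x0) == VERSION_INCONSISTENT);
}
```

Under this model only `VERSION_INCONSISTENT` would be a candidate for SBI_NEED_FSCK, while `VERSION_ONE_BEHIND` would be treated as the truncation case described in the mail.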
On 03/25, Chao Yu wrote:
> Hi Jaegeuk,
>
> On 2017/3/25 15:59, Jaegeuk Kim wrote:
> > - has_not_enough_free_secs
> > node_secs: 0 dent_secs: 0 freed:0 free_segments:103 reserved:104
> >
> > - f2fs_gc
> > - get_victim_by_default
> > alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1
> >
> > - do_garbage_collect
> > start_segno 3976, end_segno 3977 type 0
> >
> > - is_alive
> > nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0
> >
> > - gc_data_segment 766, segno 3976, block 512/426 not alive
> >
> > So, this patch fixes a subtle corruption case where the node version does not
> > match the summary version, which results in an infinite loop in gc.
> >
> > Reported-by: Yunlei He <heyunlei@huawei.com>
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/gc.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index 939be88a8833..bbeee41aaf73 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
> >  	get_node_info(sbi, nid, dni);
> >
> >  	if (sum->version != dni->version) {
>
> If the node has been truncated, we will increase its version number. Since it
> has been truncated, it will never be written back to storage, so the version
> in the summary will not be updated.

That's covered by the node page lock, so we shouldn't reach this point.
Let's think more about this.

Thanks,

> So this case can happen, shouldn't we just set SBI_NEED_FSCK for the case:
> sum->version != dni->version - 1
>
> Thanks,
>
> > -		f2fs_put_page(node_page, 1);
> > -		return false;
> > +		f2fs_msg(sbi->sb, KERN_WARNING,
> > +			"%s: valid data with mismatched node version.",
> > +			__func__);
> > +		set_sbi_flag(sbi, SBI_NEED_FSCK);
> >  	}
> >
> >  	*nofs = ofs_of_node(node_page);
> >
On 2017/3/26 5:27, Jaegeuk Kim wrote:
> On 03/25, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2017/3/25 15:59, Jaegeuk Kim wrote:
>>> - has_not_enough_free_secs
>>> node_secs: 0 dent_secs: 0 freed:0 free_segments:103 reserved:104
>>>
>>> - f2fs_gc
>>> - get_victim_by_default
>>> alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1
>>>
>>> - do_garbage_collect
>>> start_segno 3976, end_segno 3977 type 0
>>>
>>> - is_alive
>>> nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0
>>>
>>> - gc_data_segment 766, segno 3976, block 512/426 not alive
>>>
>>> So, this patch fixes a subtle corruption case where the node version does not
>>> match the summary version, which results in an infinite loop in gc.
>>>
>>> Reported-by: Yunlei He <heyunlei@huawei.com>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/gc.c | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index 939be88a8833..bbeee41aaf73 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>>>  	get_node_info(sbi, nid, dni);
>>>
>>>  	if (sum->version != dni->version) {
>>
>> If the node has been truncated, we will increase its version number. Since it
>> has been truncated, it will never be written back to storage, so the version
>> in the summary will not be updated.
>
> That's covered by the node page lock, so we shouldn't reach this point.
> Let's think more about this.

Yes, agreed. ;)

Thanks,

>
> Thanks,
>
>> So this case can happen, shouldn't we just set SBI_NEED_FSCK for the case:
>> sum->version != dni->version - 1
>>
>> Thanks,
>>
>>> -		f2fs_put_page(node_page, 1);
>>> -		return false;
>>> +		f2fs_msg(sbi->sb, KERN_WARNING,
>>> +			"%s: valid data with mismatched node version.",
>>> +			__func__);
>>> +		set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>  	}
>>>
>>>  	*nofs = ofs_of_node(node_page);
>>>
>
> .
>
Hi all,

On 2017/3/26 5:27, Jaegeuk Kim wrote:
> On 03/25, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2017/3/25 15:59, Jaegeuk Kim wrote:
>>> - has_not_enough_free_secs
>>> node_secs: 0 dent_secs: 0 freed:0 free_segments:103 reserved:104
>>>
>>> - f2fs_gc
>>> - get_victim_by_default
>>> alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1
>>>
>>> - do_garbage_collect
>>> start_segno 3976, end_segno 3977 type 0
>>>
>>> - is_alive
>>> nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0
>>>
>>> - gc_data_segment 766, segno 3976, block 512/426 not alive
>>>
>>> So, this patch fixes a subtle corruption case where the node version does not
>>> match the summary version, which results in an infinite loop in gc.
>>>
>>> Reported-by: Yunlei He <heyunlei@huawei.com>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/gc.c | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index 939be88a8833..bbeee41aaf73 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>>>  	get_node_info(sbi, nid, dni);
>>>
>>>  	if (sum->version != dni->version) {
>>
>> If the node has been truncated, we will increase its version number. Since it
>> has been truncated, it will never be written back to storage, so the version
>> in the summary will not be updated.
>

I came across the same problem with a node segment:

	get_node_info(sbi, nid, &ni);
	if (ni.blk_addr != start_addr + off) {
		f2fs_put_page(node_page, 1);
		continue;
	}

Here, the get-victim method always selected segno 5169 for garbage collection,
but gc of this section failed because of the condition above:

gc_node_segment 494, blk_addr 1697572,start_addr 2668544,off 200

I think it is the same problem as in the is_alive function.

Thanks.

> That's covered by the node page lock, so we shouldn't reach this point.
> Let's think more about this.
>
> Thanks,
>
>> So this case can happen, shouldn't we just set SBI_NEED_FSCK for the case:
>> sum->version != dni->version - 1
>>
>> Thanks,
>>
>>> -		f2fs_put_page(node_page, 1);
>>> -		return false;
>>> +		f2fs_msg(sbi->sb, KERN_WARNING,
>>> +			"%s: valid data with mismatched node version.",
>>> +			__func__);
>>> +		set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>  	}
>>>
>>>  	*nofs = ofs_of_node(node_page);
>>>
>
> .
>
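The node-segment condition from this report can be checked against the logged numbers. The following is a minimal standalone sketch, not f2fs code: `node_block_is_current`, `count_migratable`, and `node_check_demo` are hypothetical names, and block addresses are assumed to fit in 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when the NAT entry still points at the block being examined. */
static bool node_block_is_current(uint32_t nat_blk_addr, uint32_t start_addr,
				  uint32_t off)
{
	return nat_blk_addr == start_addr + off;
}

/* Counts the blocks of one segment that pass the check and would be moved. */
static size_t count_migratable(const uint32_t *nat_blk_addrs, size_t nblocks,
			       uint32_t start_addr)
{
	size_t n = 0;

	for (size_t off = 0; off < nblocks; off++)
		if (node_block_is_current(nat_blk_addrs[off], start_addr,
					  (uint32_t)off))
			n++;
	return n;
}

/* Usage with the values from the report. */
static void node_check_demo(void)
{
	/* blk_addr 1697572, start_addr 2668544, off 200: the block is skipped. */
	assert(!node_block_is_current(1697572u, 2668544u, 200u));
	/* The check passes only for address 2668544 + 200 = 2668744. */
	assert(node_block_is_current(2668744u, 2668544u, 200u));
}
```

If the segment's valid-block count still includes a block that this check skips, nothing is moved and the same segment can stay the preferred victim. That would match the repeated selection of segno 5169, but the link is an inference from the report rather than something the thread confirms.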
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 939be88a8833..bbeee41aaf73 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 	get_node_info(sbi, nid, dni);

 	if (sum->version != dni->version) {
-		f2fs_put_page(node_page, 1);
-		return false;
+		f2fs_msg(sbi->sb, KERN_WARNING,
+			"%s: valid data with mismatched node version.",
+			__func__);
+		set_sbi_flag(sbi, SBI_NEED_FSCK);
 	}

 	*nofs = ofs_of_node(node_page);
- has_not_enough_free_secs
node_secs: 0 dent_secs: 0 freed:0 free_segments:103 reserved:104

- f2fs_gc
- get_victim_by_default
alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1

- do_garbage_collect
start_segno 3976, end_segno 3977 type 0

- is_alive
nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0

- gc_data_segment 766, segno 3976, block 512/426 not alive

So, this patch fixes a subtle corruption case where the node version does not
match the summary version, which results in an infinite loop in gc.

Reported-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/gc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
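The loop described in the commit message can be reproduced with a toy model. This is a sketch under stated assumptions, not the f2fs implementation: every `toy_*` name is hypothetical, only the version branch of is_alive is modelled (any checks outside that branch are omitted), and the assignment of 0x0 to the summary and 0x8 to the node is arbitrary, since the trace does not say which is which.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are f2fs structures. */
struct toy_block {
	bool marked_valid;     /* segment bitmap says the block is in use */
	uint8_t sum_version;   /* version recorded in the summary entry */
	uint8_t node_version;  /* version reported for the owning node */
};

struct toy_result {
	int passes;      /* GC passes spent on the segment */
	int valid_left;  /* blocks still marked valid afterwards */
	bool need_fsck;  /* models set_sbi_flag(sbi, SBI_NEED_FSCK) */
};

/* Only the version branch touched by the patch is modelled here. */
static bool toy_is_alive(const struct toy_block *b, bool patched,
			 bool *need_fsck)
{
	if (b->sum_version != b->node_version) {
		if (!patched)
			return false;  /* old behaviour: treat block as dead */
		*need_fsck = true;     /* new behaviour: flag and keep going */
	}
	return true;
}

/* Re-collects one victim segment until it is empty or max_passes is hit. */
static struct toy_result toy_gc(struct toy_block *blocks, int n, bool patched,
				int max_passes)
{
	struct toy_result r = { 0, 0, false };

	while (r.passes < max_passes) {
		r.passes++;
		r.valid_left = 0;
		for (int i = 0; i < n; i++) {
			if (!blocks[i].marked_valid)
				continue;
			if (toy_is_alive(&blocks[i], patched, &r.need_fsck))
				blocks[i].marked_valid = false; /* migrated */
			else
				r.valid_left++; /* skipped, still counted */
		}
		if (r.valid_left == 0)
			break;
	}
	return r;
}

/* Usage: one block with mismatched versions, before and after the patch. */
static void toy_gc_demo(void)
{
	struct toy_block before[] = { { true, 0x0, 0x8 } };
	struct toy_block after[] = { { true, 0x0, 0x8 } };

	struct toy_result old = toy_gc(before, 1, false, 5);
	assert(old.passes == 5 && old.valid_left == 1 && !old.need_fsck);

	struct toy_result fixed = toy_gc(after, 1, true, 5);
	assert(fixed.passes == 1 && fixed.valid_left == 0 && fixed.need_fsck);
}
```

In the unpatched model the segment never empties, so the pass cap is the only thing that stops the loop. In the patched model the block is moved on the first pass and the fsck flag records that an inconsistency was seen.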