[1/5] f2fs: relax node version check for victim data in gc

Message ID 20170325075933.21072-1-jaegeuk@kernel.org (mailing list archive)
State New, archived

Commit Message

Jaegeuk Kim March 25, 2017, 7:59 a.m. UTC
- has_not_enough_free_secs
node_secs: 0  dent_secs: 0  freed:0  free_segments:103  reserved:104

          - f2fs_gc
             - get_victim_by_default
alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1

                - do_garbage_collect
start_segno 3976, end_segno 3977   type 0

                  - is_alive
nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0

                   - gc_data_segment 766, segno 3976, block 512/426 not alive

So, this patch fixes a subtle corruption case where the node version does not
match the summary version, which results in an infinite loop in gc.

Reported-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/gc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
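
The trace above can be read as a gc loop that never makes progress. What follows is a minimal userspace sketch of that reading, not kernel code: `summary`, `node_info`, `is_alive_model`, `gc_pass` and `run_gc` are hypothetical stand-ins, the loop condition is a simplification of has_not_enough_free_secs, and the assignment of 0x8 and 0x0 to node and summary is arbitrary because the trace does not say which is which.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the f2fs structures. */
struct summary { unsigned char version; };
struct node_info { unsigned char version; };

bool need_fsck; /* models the SBI_NEED_FSCK flag */

/*
 * relaxed == false: behaviour before the patch (mismatch means "not alive").
 * relaxed == true:  behaviour after the patch (flag fsck, keep going).
 */
bool is_alive_model(const struct summary *sum, const struct node_info *dni,
		    bool relaxed)
{
	if (sum->version != dni->version) {
		if (!relaxed)
			return false;	/* valid block is never migrated */
		need_fsck = true;
	}
	return true;
}

/* One pass over a victim holding `valid` valid blocks; returns segments freed. */
int gc_pass(int valid, const struct summary *sum, const struct node_info *dni,
	    bool relaxed)
{
	int migrated = 0;

	for (int i = 0; i < valid; i++)
		if (is_alive_model(sum, dni, relaxed))
			migrated++;
	/* The victim is only reclaimed once every valid block has moved. */
	return migrated == valid ? 1 : 0;
}

/* Runs gc until enough segments are free or max_passes is hit; returns passes. */
int run_gc(int free_segs, int reserved, bool relaxed, int max_passes)
{
	struct summary sum = { .version = 0x0 };	/* assumed assignment */
	struct node_info dni = { .version = 0x8 };	/* assumed assignment */
	int passes = 0;

	assert(max_passes > 0);
	while (free_segs < reserved && passes < max_passes) {
		free_segs += gc_pass(1, &sum, &dni, relaxed);
		passes++;
	}
	return passes;
}
```

Calling run_gc(103, 104, false, 5) uses up all five passes without freeing anything, which is the bounded equivalent of the infinite loop; run_gc(103, 104, true, 5) finishes after one pass and leaves need_fsck set.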

Comments

Chao Yu March 25, 2017, 9:05 a.m. UTC | #1
Hi Jaegeuk,

On 2017/3/25 15:59, Jaegeuk Kim wrote:
> - has_not_enough_free_secs
> node_secs: 0  dent_secs: 0  freed:0  free_segments:103  reserved:104
> 
>           - f2fs_gc
>              - get_victim_by_default
> alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1
> 
>                 - do_garbage_collect
> start_segno 3976, end_segno 3977   type 0
> 
>                   - is_alive
> nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0
> 
>                    - gc_data_segment 766, segno 3976, block 512/426 not alive
> 
> So, this patch fixes subtle corrupted case where node version does not match
> to summary version which results in infinite loop by gc.
> 
> Reported-by: Yunlei He <heyunlei@huawei.com>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/gc.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 939be88a8833..bbeee41aaf73 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>  	get_node_info(sbi, nid, dni);
>  
>  	if (sum->version != dni->version) {

If the node has been truncated, we increase its version number. Since it has
been truncated, it will never be written back to storage, so the version in the
summary will not be updated.

So this case can happen. Shouldn't we just set SBI_NEED_FSCK for the case:
sum->version != dni->version - 1

Thanks,

> -		f2fs_put_page(node_page, 1);
> -		return false;
> +		f2fs_msg(sbi->sb, KERN_WARNING,
> +				"%s: valid data with mismatched node version.",
> +				__func__);
> +		set_sbi_flag(sbi, SBI_NEED_FSCK);
>  	}
>  
>  	*nofs = ofs_of_node(node_page);
>
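
For reference, the condition proposed in the comment above could be sketched as the helper below. This is only an illustration: `version_needs_fsck` is a hypothetical name, and the cast models the version as an 8-bit counter that wraps around, which is an assumption of this sketch rather than something stated in the thread.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when a summary/node version mismatch
 * cannot be explained by a single version bump on the node side.
 */
bool version_needs_fsck(unsigned char sum_version, unsigned char node_version)
{
	if (sum_version == node_version)
		return false;	/* versions agree, nothing to report */

	/* Assumption: the counter is 8 bits wide and wraps from 255 to 0. */
	return sum_version != (unsigned char)(node_version - 1);
}
```

With the values from the trace (0x8 and 0x0) the helper reports a mismatch that needs fsck, while a pair such as 7 and 8 is treated as a plain truncation.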
Jaegeuk Kim March 25, 2017, 9:27 p.m. UTC | #2
On 03/25, Chao Yu wrote:
> Hi Jaegeuk,
> 
> On 2017/3/25 15:59, Jaegeuk Kim wrote:
> > - has_not_enough_free_secs
> > node_secs: 0  dent_secs: 0  freed:0  free_segments:103  reserved:104
> > 
> >           - f2fs_gc
> >              - get_victim_by_default
> > alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1
> > 
> >                 - do_garbage_collect
> > start_segno 3976, end_segno 3977   type 0
> > 
> >                   - is_alive
> > nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0
> > 
> >                    - gc_data_segment 766, segno 3976, block 512/426 not alive
> > 
> > So, this patch fixes subtle corrupted case where node version does not match
> > to summary version which results in infinite loop by gc.
> > 
> > Reported-by: Yunlei He <heyunlei@huawei.com>
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/gc.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index 939be88a8833..bbeee41aaf73 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
> >  	get_node_info(sbi, nid, dni);
> >  
> >  	if (sum->version != dni->version) {
> 
> If the node was been truncated, we will increase its version number, since it
> was been truncated, so it will never be writebacked to storage, so the version
> in summary will not be updated.

That's covered by the node page lock, so we shouldn't reach this point.
Let's think more about this.

Thanks,

> So this case can happen, shouldn't we just set SBI_NEED_FSCK for the case:
> sum->version != dni->version - 1
> 
> Thanks,
> 
> > -		f2fs_put_page(node_page, 1);
> > -		return false;
> > +		f2fs_msg(sbi->sb, KERN_WARNING,
> > +				"%s: valid data with mismatched node version.",
> > +				__func__);
> > +		set_sbi_flag(sbi, SBI_NEED_FSCK);
> >  	}
> >  
> >  	*nofs = ofs_of_node(node_page);
> >
Chao Yu March 27, 2017, 8:18 a.m. UTC | #3
On 2017/3/26 5:27, Jaegeuk Kim wrote:
> On 03/25, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2017/3/25 15:59, Jaegeuk Kim wrote:
>>> - has_not_enough_free_secs
>>> node_secs: 0  dent_secs: 0  freed:0  free_segments:103  reserved:104
>>>
>>>           - f2fs_gc
>>>              - get_victim_by_default
>>> alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1
>>>
>>>                 - do_garbage_collect
>>> start_segno 3976, end_segno 3977   type 0
>>>
>>>                   - is_alive
>>> nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0
>>>
>>>                    - gc_data_segment 766, segno 3976, block 512/426 not alive
>>>
>>> So, this patch fixes subtle corrupted case where node version does not match
>>> to summary version which results in infinite loop by gc.
>>>
>>> Reported-by: Yunlei He <heyunlei@huawei.com>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/gc.c | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index 939be88a8833..bbeee41aaf73 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>>>  	get_node_info(sbi, nid, dni);
>>>  
>>>  	if (sum->version != dni->version) {
>>
>> If the node was been truncated, we will increase its version number, since it
>> was been truncated, so it will never be writebacked to storage, so the version
>> in summary will not be updated.
> 
> That's covered by node page lock, so we shouldn't be reached out to this point.
> Let's think more about this.

Yes, agreed. ;)

Thanks,

> 
> Thanks,
> 
>> So this case can happen, shouldn't we just set SBI_NEED_FSCK for the case:
>> sum->version != dni->version - 1
>>
>> Thanks,
>>
>>> -		f2fs_put_page(node_page, 1);
>>> -		return false;
>>> +		f2fs_msg(sbi->sb, KERN_WARNING,
>>> +				"%s: valid data with mismatched node version.",
>>> +				__func__);
>>> +		set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>  	}
>>>  
>>>  	*nofs = ofs_of_node(node_page);
>>>
He YunLei March 29, 2017, 7:04 a.m. UTC | #4
Hi all,

On 2017/3/26 5:27, Jaegeuk Kim wrote:
> On 03/25, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2017/3/25 15:59, Jaegeuk Kim wrote:
>>> - has_not_enough_free_secs
>>> node_secs: 0  dent_secs: 0  freed:0  free_segments:103  reserved:104
>>>
>>>           - f2fs_gc
>>>              - get_victim_by_default
>>> alloc_mode 0, gc_mode 1, max_search 2672, offset 4654, ofs_unit 1
>>>
>>>                 - do_garbage_collect
>>> start_segno 3976, end_segno 3977   type 0
>>>
>>>                   - is_alive
>>> nid 22797, blkaddr 2131882, ofs_in_node 0, version 0x8/0x0
>>>
>>>                    - gc_data_segment 766, segno 3976, block 512/426 not alive
>>>
>>> So, this patch fixes subtle corrupted case where node version does not match
>>> to summary version which results in infinite loop by gc.
>>>
>>> Reported-by: Yunlei He <heyunlei@huawei.com>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/gc.c | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index 939be88a8833..bbeee41aaf73 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -551,8 +551,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>>>  	get_node_info(sbi, nid, dni);
>>>
>>>  	if (sum->version != dni->version) {
>>
>> If the node was been truncated, we will increase its version number, since it
>> was been truncated, so it will never be writebacked to storage, so the version
>> in summary will not be updated.
>

I came across the same problem with a node segment:

	get_node_info(sbi, nid, &ni);
	if (ni.blk_addr != start_addr + off) {
		f2fs_put_page(node_page, 1);
		continue;
	}

Here, victim selection always picked segno 5169 for garbage collection, but gc
of this section failed because of the condition above:

	gc_node_segment 494, blk_addr 1697572,start_addr 2668544,off 200

I think this is the same problem as in the is_alive function.

Thanks.
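
Plugging the logged values into the quoted condition: start_addr + off is 2668544 + 200 = 2668744, which differs from blk_addr 1697572, so the node block is skipped. A tiny sketch of that check, with `node_block_skipped` as a hypothetical name and 32-bit block addresses as an assumption:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models the quoted condition: a node block is skipped when the address
 * recorded for the node differs from the slot currently being scanned.
 */
bool node_block_skipped(uint32_t blk_addr, uint32_t start_addr, uint32_t off)
{
	return blk_addr != start_addr + off;
}
```

node_block_skipped(1697572, 2668544, 200) is true on every pass, so nothing in the segment is migrated and the same victim can be selected again.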


> That's covered by node page lock, so we shouldn't be reached out to this point.
> Let's think more about this.
>
> Thanks,
>
>> So this case can happen, shouldn't we just set SBI_NEED_FSCK for the case:
>> sum->version != dni->version - 1
>>
>> Thanks,
>>
>>> -		f2fs_put_page(node_page, 1);
>>> -		return false;
>>> +		f2fs_msg(sbi->sb, KERN_WARNING,
>>> +				"%s: valid data with mismatched node version.",
>>> +				__func__);
>>> +		set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>  	}
>>>
>>>  	*nofs = ofs_of_node(node_page);
>>>

Patch

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 939be88a8833..bbeee41aaf73 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -551,8 +551,10 @@  static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 	get_node_info(sbi, nid, dni);
 
 	if (sum->version != dni->version) {
-		f2fs_put_page(node_page, 1);
-		return false;
+		f2fs_msg(sbi->sb, KERN_WARNING,
+				"%s: valid data with mismatched node version.",
+				__func__);
+		set_sbi_flag(sbi, SBI_NEED_FSCK);
 	}
 
 	*nofs = ofs_of_node(node_page);