
f2fs: let fill_super handle roll-forward errors

Message ID 20170811004204.24078-1-jaegeuk@kernel.org (mailing list archive)
State New, archived

Commit Message

Jaegeuk Kim Aug. 11, 2017, 12:42 a.m. UTC
If we set CP_ERROR_FLAG on a roll-forward error, f2fs can no longer proceed
with any IO because of f2fs_cp_error(). But if, for example, some stale data is
involved in the roll-forward process, we can get -ENOENT, which leaves the fs
stuck. If we get any error, let fill_super set SBI_NEED_FSCK and try to recover
back to a stable point.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/recovery.c | 2 --
 1 file changed, 2 deletions(-)
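
To illustrate why latching the flag leaves the filesystem stuck, here is a
minimal userspace sketch, not the f2fs code itself. Every toy_ name is
hypothetical and only stands in for the role of CP_ERROR_FLAG and
f2fs_cp_error() described above; it assumes IO paths check the flag first and
bail out with -EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CP_ERROR_FLAG; not the kernel's value. */
#define TOY_CP_ERROR_FLAG 0x1u

struct toy_sbi {
	unsigned int ckpt_flags;
};

/* Plays the role of f2fs_cp_error(): was a checkpoint error recorded? */
static bool toy_cp_error(const struct toy_sbi *sbi)
{
	return (sbi->ckpt_flags & TOY_CP_ERROR_FLAG) != 0;
}

/* Assumed IO path: it checks the flag first and refuses to proceed. */
static int toy_submit_io(const struct toy_sbi *sbi)
{
	assert(sbi != NULL);
	if (toy_cp_error(sbi))
		return -EIO;
	return 0;
}

/* Behaviour the patch removes: any recovery error latches the flag. */
static void toy_recover_old(struct toy_sbi *sbi, int err)
{
	if (err)
		sbi->ckpt_flags |= TOY_CP_ERROR_FLAG;
}
```

In this model, once toy_recover_old() sees -ENOENT, every later
toy_submit_io() call fails, which is the stuck state the commit message
describes.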

Comments

Chao Yu Aug. 15, 2017, 1:44 a.m. UTC | #1
Hi Jaegeuk,

On 2017/8/11 8:42, Jaegeuk Kim wrote:
> If we set CP_ERROR_FLAG in roll-forward error, f2fs is no longer to proceed
> any IOs due to f2fs_cp_error(). But, for example, if some stale data is involved
> on roll-forward process, we're able to get -ENOENT, getting fs stuck.
> If we get any error, let fill_super set SBI_NEED_FSCK and try to recover back
> to stable point.

Before that, we have already cleaned up all of the node/meta page cache, so we
will go back to the last checkpoint state, which means losing fsynced data
forever.

Would it be better to just leave a message reminding the user to mount with
disable_roll_forward or to run fsck offline?

Thanks,

> 
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/recovery.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index a3d02613934a..f707d810c87d 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -649,8 +649,6 @@ int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>  	}
>  
>  	clear_sbi_flag(sbi, SBI_POR_DOING);
> -	if (err)
> -		set_ckpt_flags(sbi, CP_ERROR_FLAG);
>  	mutex_unlock(&sbi->cp_mutex);
>  
>  	/* let's drop all the directory inodes for clean checkpoint */
>
Jaegeuk Kim Aug. 15, 2017, 3:22 a.m. UTC | #2
On 08/15, Chao Yu wrote:
> Hi Jaegeuk,
> 
> On 2017/8/11 8:42, Jaegeuk Kim wrote:
> > If we set CP_ERROR_FLAG in roll-forward error, f2fs is no longer to proceed
> > any IOs due to f2fs_cp_error(). But, for example, if some stale data is involved
> > on roll-forward process, we're able to get -ENOENT, getting fs stuck.
> > If we get any error, let fill_super set SBI_NEED_FSCK and try to recover back
> > to stable point.
> 
> Before that, we have cleaned up all node/meta page cache, so we will get back to
> last checkpoint status, means losing fsynced datas for ever.
> 
> Would it be better to just leave message reminding user to mount with
> disable_roll_forward or run fsck offline.

We can't rely on the user for this, since fsck cannot recover this, resulting
in infinite mount failure. The only way is to disable roll-forward recovery,
which is the same as returning an error here.

Thanks,

> 
> Thanks,
> 
> > 
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/recovery.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> > index a3d02613934a..f707d810c87d 100644
> > --- a/fs/f2fs/recovery.c
> > +++ b/fs/f2fs/recovery.c
> > @@ -649,8 +649,6 @@ int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
> >  	}
> >  
> >  	clear_sbi_flag(sbi, SBI_POR_DOING);
> > -	if (err)
> > -		set_ckpt_flags(sbi, CP_ERROR_FLAG);
> >  	mutex_unlock(&sbi->cp_mutex);
> >  
> >  	/* let's drop all the directory inodes for clean checkpoint */
> >
Chao Yu Aug. 15, 2017, 11:45 a.m. UTC | #3
On 2017/8/15 11:22, Jaegeuk Kim wrote:
> On 08/15, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2017/8/11 8:42, Jaegeuk Kim wrote:
>>> If we set CP_ERROR_FLAG in roll-forward error, f2fs is no longer to proceed
>>> any IOs due to f2fs_cp_error(). But, for example, if some stale data is involved
>>> on roll-forward process, we're able to get -ENOENT, getting fs stuck.
>>> If we get any error, let fill_super set SBI_NEED_FSCK and try to recover back
>>> to stable point.
>>
>> Before that, we have cleaned up all node/meta page cache, so we will get back to
>> last checkpoint status, means losing fsynced datas for ever.
>>
>> Would it be better to just leave message reminding user to mount with
>> disable_roll_forward or run fsck offline.
> 
> We can't rely on user for this, since fsck cannot recover this, resulting in

If fsck is not able to recover this, it could tag the superblock somewhere, and
then the kernel could skip recovery. Compared to failing recovery directly, that
gives the user another chance to rescue his data.

Thanks,

> infinite mount failure. The only way is to disable roll-forward recovery, which
> is same as returning error here.
> 
> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>> Cc: <stable@vger.kernel.org>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/recovery.c | 2 --
>>>  1 file changed, 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>> index a3d02613934a..f707d810c87d 100644
>>> --- a/fs/f2fs/recovery.c
>>> +++ b/fs/f2fs/recovery.c
>>> @@ -649,8 +649,6 @@ int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>>>  	}
>>>  
>>>  	clear_sbi_flag(sbi, SBI_POR_DOING);
>>> -	if (err)
>>> -		set_ckpt_flags(sbi, CP_ERROR_FLAG);
>>>  	mutex_unlock(&sbi->cp_mutex);
>>>  
>>>  	/* let's drop all the directory inodes for clean checkpoint */
>>>
> 
> .
>
Jaegeuk Kim Aug. 15, 2017, 4:42 p.m. UTC | #4
On 08/15, Chao Yu wrote:
> On 2017/8/15 11:22, Jaegeuk Kim wrote:
> > On 08/15, Chao Yu wrote:
> >> Hi Jaegeuk,
> >>
> >> On 2017/8/11 8:42, Jaegeuk Kim wrote:
> >>> If we set CP_ERROR_FLAG in roll-forward error, f2fs is no longer to proceed
> >>> any IOs due to f2fs_cp_error(). But, for example, if some stale data is involved
> >>> on roll-forward process, we're able to get -ENOENT, getting fs stuck.
> >>> If we get any error, let fill_super set SBI_NEED_FSCK and try to recover back
> >>> to stable point.
> >>
> >> Before that, we have cleaned up all node/meta page cache, so we will get back to
> >> last checkpoint status, means losing fsynced datas for ever.
> >>
> >> Would it be better to just leave message reminding user to mount with
> >> disable_roll_forward or run fsck offline.
> > 
> > We can't rely on user for this, since fsck cannot recover this, resulting in
> 
> If fsck has no ability to recover this, it could tag superblock in somewhere,
> then kernel could skip recovery. Comparing to fail recovery directly, it give
> user another chance to rescuer his datas.

Huh, what do you mean? This patch lets f2fs_fill_super set SBI_NEED_FSCK in
the superblock and skip roll-forward in the second round.

> 
> Thanks,
> 
> > infinite mount failure. The only way is to disable roll-forward recovery, which
> > is same as returning error here.
> > 
> > Thanks,
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Cc: <stable@vger.kernel.org>
> >>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> ---
> >>>  fs/f2fs/recovery.c | 2 --
> >>>  1 file changed, 2 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>> index a3d02613934a..f707d810c87d 100644
> >>> --- a/fs/f2fs/recovery.c
> >>> +++ b/fs/f2fs/recovery.c
> >>> @@ -649,8 +649,6 @@ int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
> >>>  	}
> >>>  
> >>>  	clear_sbi_flag(sbi, SBI_POR_DOING);
> >>> -	if (err)
> >>> -		set_ckpt_flags(sbi, CP_ERROR_FLAG);
> >>>  	mutex_unlock(&sbi->cp_mutex);
> >>>  
> >>>  	/* let's drop all the directory inodes for clean checkpoint */
> >>>
> > 
> > .
> >
Chao Yu Aug. 16, 2017, 1:17 a.m. UTC | #5
On 2017/8/16 0:42, Jaegeuk Kim wrote:
> On 08/15, Chao Yu wrote:
>> On 2017/8/15 11:22, Jaegeuk Kim wrote:
>>> On 08/15, Chao Yu wrote:
>>>> Hi Jaegeuk,
>>>>
>>>> On 2017/8/11 8:42, Jaegeuk Kim wrote:
>>>>> If we set CP_ERROR_FLAG in roll-forward error, f2fs is no longer to proceed
>>>>> any IOs due to f2fs_cp_error(). But, for example, if some stale data is involved
>>>>> on roll-forward process, we're able to get -ENOENT, getting fs stuck.
>>>>> If we get any error, let fill_super set SBI_NEED_FSCK and try to recover back
>>>>> to stable point.
>>>>
>>>> Before that, we have cleaned up all node/meta page cache, so we will get back to
>>>> last checkpoint status, means losing fsynced datas for ever.
>>>>
>>>> Would it be better to just leave message reminding user to mount with
>>>> disable_roll_forward or run fsck offline.
>>>
>>> We can't rely on user for this, since fsck cannot recover this, resulting in
>>
>> If fsck has no ability to recover this, it could tag superblock in somewhere,
>> then kernel could skip recovery. Comparing to fail recovery directly, it give
>> user another chance to rescuer his datas.
> 
> Huh, what do you mean? This patch let f2fs_fill_super set SBI_NEED_FSCK in
> superblock and skip roll-forward in second round.

Oh, I just noticed that the additional checkpoint which sets SBI_NEED_FSCK won't
destroy the warm node chain, so we have a chance to run fsck and recover after
another mount. Sorry.

Reviewed-by: Chao Yu <yuchao0@huawei.com>

> 
>>
>> Thanks,
>>
>>> infinite mount failure. The only way is to disable roll-forward recovery, which
>>> is same as returning error here.
>>>
>>> Thanks,
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Cc: <stable@vger.kernel.org>
>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>> ---
>>>>>  fs/f2fs/recovery.c | 2 --
>>>>>  1 file changed, 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>> index a3d02613934a..f707d810c87d 100644
>>>>> --- a/fs/f2fs/recovery.c
>>>>> +++ b/fs/f2fs/recovery.c
>>>>> @@ -649,8 +649,6 @@ int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>>>>>  	}
>>>>>  
>>>>>  	clear_sbi_flag(sbi, SBI_POR_DOING);
>>>>> -	if (err)
>>>>> -		set_ckpt_flags(sbi, CP_ERROR_FLAG);
>>>>>  	mutex_unlock(&sbi->cp_mutex);
>>>>>  
>>>>>  	/* let's drop all the directory inodes for clean checkpoint */
>>>>>
>>>
>>> .
>>>
> 
> .
>

Patch

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index a3d02613934a..f707d810c87d 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -649,8 +649,6 @@  int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
 	}
 
 	clear_sbi_flag(sbi, SBI_POR_DOING);
-	if (err)
-		set_ckpt_flags(sbi, CP_ERROR_FLAG);
 	mutex_unlock(&sbi->cp_mutex);
 
 	/* let's drop all the directory inodes for clean checkpoint */
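
The retry behaviour discussed in the thread (fill_super sets SBI_NEED_FSCK and
skips roll-forward in the second round) can be sketched roughly as below. This
is a simplified userspace model under stated assumptions, not the actual
f2fs_fill_super(): all toy_ names, the struct fields, and the single-retry
control flow are illustrative only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_sbi {
	bool need_fsck;        /* stands in for SBI_NEED_FSCK */
	bool stale_data;       /* pretend the fsync chain hits stale data */
	int recovery_attempts; /* how many times roll-forward was tried */
};

/* Assumed recovery step: fails with -ENOENT when stale data is involved. */
static int toy_recover_fsync_data(struct toy_sbi *sbi)
{
	sbi->recovery_attempts++;
	return sbi->stale_data ? -ENOENT : 0;
}

/* Mount path: try roll-forward once; on error, retry without it. */
static int toy_fill_super(struct toy_sbi *sbi)
{
	bool retry = true;
	bool need_fsck = false;
	int err;

	assert(sbi != NULL);
try_onemore:
	err = 0;
	if (need_fsck)
		sbi->need_fsck = true; /* remember that fsck should run */
	if (retry) {
		err = toy_recover_fsync_data(sbi);
		if (err < 0)
			need_fsck = true;
	}
	if (err && retry) {
		retry = false;         /* give only one more chance */
		goto try_onemore;      /* second round skips roll-forward */
	}
	return err;
}
```

With stale_data set, the first round fails, the second round skips recovery,
and the mount succeeds with need_fsck recorded. Because no error flag is
latched in the recovery step, the second round is free to proceed.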