mbox series

[f2fs-dev,v2,0/4] f2fs: fix panic issue in small capacity device

Message ID 1707271264-5551-1-git-send-email-zhiguo.niu@unisoc.com (mailing list archive)
Headers show
Series f2fs: fix panic issue in small capacity device | expand

Message

Zhiguo Niu Feb. 7, 2024, 2:01 a.m. UTC
A panic issue happened in a reboot test in small capacity device
as following:
1.The device size is 64MB, and main area has 24 segments, and
CONFIG_F2FS_CHECK_FS is not enabled.
2.There is no any free segments left shown in free_segmap_info,
then another write request cause get_new_segment get a out-of-bound
segment with segno 24.
3.panic happen in update_sit_entry because access invalid bitmap
pointer.

More detail shown in following patch sets.
The three patches are splited here because the modifications are
relatively independent and more readable.

---
Changes of v2: stop checkpoint when get a out-of-bound segment
---

Zhiguo Niu (4):
  f2fs: correct counting methods of free_segments in __set_inuse
  f2fs: fix panic issue in update_sit_entry
  f2fs: enhance judgment conditions of GET_SEGNO
  f2fs: stop checkpoint when get a out-of-bounds segment

 fs/f2fs/file.c          |  7 ++++++-
 fs/f2fs/segment.c       | 21 ++++++++++++++++-----
 fs/f2fs/segment.h       |  7 ++++---
 include/linux/f2fs_fs.h |  1 +
 4 files changed, 27 insertions(+), 9 deletions(-)

Comments

patchwork-bot+f2fs@kernel.org Feb. 21, 2024, 6:10 p.m. UTC | #1
Hello:

This series was applied to jaegeuk/f2fs.git (dev)
by Jaegeuk Kim <jaegeuk@kernel.org>:

On Wed, 7 Feb 2024 10:01:00 +0800 you wrote:
> A panic issue happened in a reboot test in small capacity device
> as following:
> 1.The device size is 64MB, and main area has 24 segments, and
> CONFIG_F2FS_CHECK_FS is not enabled.
> 2.There is no any free segments left shown in free_segmap_info,
> then another write request cause get_new_segment get a out-of-bound
> segment with segno 24.
> 3.panic happen in update_sit_entry because access invalid bitmap
> pointer.
> 
> [...]

Here is the summary with links:
  - [f2fs-dev,v2,1/4] f2fs: correct counting methods of free_segments in __set_inuse
    https://git.kernel.org/jaegeuk/f2fs/c/8bac4167fd14
  - [f2fs-dev,v2,2/4] f2fs: fix panic issue in update_sit_entry
    https://git.kernel.org/jaegeuk/f2fs/c/4acac2bf18d6
  - [f2fs-dev,v2,3/4] f2fs: enhance judgment conditions of GET_SEGNO
    (no matching commit)
  - [f2fs-dev,v2,4/4] f2fs: stop checkpoint when get a out-of-bounds segment
    (no matching commit)

You are awesome, thank you!
Chao Yu Feb. 22, 2024, 12:30 p.m. UTC | #2
On 2024/2/7 10:01, Zhiguo Niu wrote:
> A panic issue happened in a reboot test in small capacity device
> as following:
> 1.The device size is 64MB, and main area has 24 segments, and
> CONFIG_F2FS_CHECK_FS is not enabled.
> 2.There is no any free segments left shown in free_segmap_info,
> then another write request cause get_new_segment get a out-of-bound
> segment with segno 24.
> 3.panic happen in update_sit_entry because access invalid bitmap
> pointer.

Zhiguo,

Can you please try below patch to see whether it can fix your problem?

https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org

Thanks,

> 
> More detail shown in following patch sets.
> The three patches are splited here because the modifications are
> relatively independent and more readable.
> 
> ---
> Changes of v2: stop checkpoint when get a out-of-bound segment
> ---
> 
> Zhiguo Niu (4):
>    f2fs: correct counting methods of free_segments in __set_inuse
>    f2fs: fix panic issue in update_sit_entry
>    f2fs: enhance judgment conditions of GET_SEGNO
>    f2fs: stop checkpoint when get a out-of-bounds segment
> 
>   fs/f2fs/file.c          |  7 ++++++-
>   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
>   fs/f2fs/segment.h       |  7 ++++---
>   include/linux/f2fs_fs.h |  1 +
>   4 files changed, 27 insertions(+), 9 deletions(-)
>
Chao Yu Feb. 23, 2024, 2:38 a.m. UTC | #3
On 2024/2/23 10:01, Zhiguo Niu wrote:
> 
> 
> On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <chao@kernel.org <mailto:chao@kernel.org>> wrote:
> 
>     On 2024/2/7 10:01, Zhiguo Niu wrote:
>      > A panic issue happened in a reboot test in small capacity device
>      > as following:
>      > 1.The device size is 64MB, and main area has 24 segments, and
>      > CONFIG_F2FS_CHECK_FS is not enabled.
>      > 2.There is no any free segments left shown in free_segmap_info,
>      > then another write request cause get_new_segment get a out-of-bound
>      > segment with segno 24.
>      > 3.panic happen in update_sit_entry because access invalid bitmap
>      > pointer.
> 
>     Zhiguo,
> 
>     Can you please try below patch to see whether it can fix your problem?
> 
>     https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org <https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org>
> 
>     Thanks,
> 
> 
> Dear Chao,
> I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.

Zhiguo, thank you!

BTW, I've tested this patch for a while, and it looks there is no issue w/
FAULT_NO_SEGMENT fault injection is on.

> btw, Why can’t I see this patch on your branch^^?
> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>

Too lazy to push patches in time, will do it in this weekend. :P

> thanks!
> 
> 
>      >
>      > More detail shown in following patch sets.
>      > The three patches are splited here because the modifications are
>      > relatively independent and more readable.
>      >
>      > ---
>      > Changes of v2: stop checkpoint when get a out-of-bound segment
>      > ---
>      >
>      > Zhiguo Niu (4):
>      >    f2fs: correct counting methods of free_segments in __set_inuse
>      >    f2fs: fix panic issue in update_sit_entry
>      >    f2fs: enhance judgment conditions of GET_SEGNO
>      >    f2fs: stop checkpoint when get a out-of-bounds segment
>      >
>      >   fs/f2fs/file.c          |  7 ++++++-
>      >   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
>      >   fs/f2fs/segment.h       |  7 ++++---
>      >   include/linux/f2fs_fs.h |  1 +
>      >   4 files changed, 27 insertions(+), 9 deletions(-)
>      >
>
Zhiguo Niu Feb. 26, 2024, 3:25 a.m. UTC | #4
Dear Chao,

On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <chao@kernel.org> wrote:
>
> On 2024/2/23 10:01, Zhiguo Niu wrote:
> >
> >
> > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <chao@kernel.org <mailto:chao@kernel.org>> wrote:
> >
> >     On 2024/2/7 10:01, Zhiguo Niu wrote:
> >      > A panic issue happened in a reboot test in small capacity device
> >      > as following:
> >      > 1.The device size is 64MB, and main area has 24 segments, and
> >      > CONFIG_F2FS_CHECK_FS is not enabled.
> >      > 2.There is no any free segments left shown in free_segmap_info,
> >      > then another write request cause get_new_segment get a out-of-bound
> >      > segment with segno 24.
> >      > 3.panic happen in update_sit_entry because access invalid bitmap
> >      > pointer.
> >
> >     Zhiguo,
> >
> >     Can you please try below patch to see whether it can fix your problem?
> >
> >     https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org <https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org>
> >
> >     Thanks,
> >
> >
> > Dear Chao,
> > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
>
> Zhiguo, thank you!

We tested this patch  this weekend on the previous version with
problem, and it can not reproduce panic issues,
so this patch should fix the original issue.
thanks!

>
> BTW, I've tested this patch for a while, and it looks there is no issue w/
> FAULT_NO_SEGMENT fault injection is on.
>
> > btw, Why can’t I see this patch on your branch^^?
> > https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>
>
> Too lazy to push patches in time, will do it in this weekend. :P
>
> > thanks!
> >
> >
> >      >
> >      > More detail shown in following patch sets.
> >      > The three patches are splited here because the modifications are
> >      > relatively independent and more readable.
> >      >
> >      > ---
> >      > Changes of v2: stop checkpoint when get a out-of-bound segment
> >      > ---
> >      >
> >      > Zhiguo Niu (4):
> >      >    f2fs: correct counting methods of free_segments in __set_inuse
> >      >    f2fs: fix panic issue in update_sit_entry
> >      >    f2fs: enhance judgment conditions of GET_SEGNO
> >      >    f2fs: stop checkpoint when get a out-of-bounds segment
> >      >
> >      >   fs/f2fs/file.c          |  7 ++++++-
> >      >   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
> >      >   fs/f2fs/segment.h       |  7 ++++---
> >      >   include/linux/f2fs_fs.h |  1 +
> >      >   4 files changed, 27 insertions(+), 9 deletions(-)
> >      >
> >
Chao Yu Feb. 26, 2024, 6:48 a.m. UTC | #5
On 2024/2/26 11:25, Zhiguo Niu wrote:
> Dear Chao,
> 
> On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <chao@kernel.org> wrote:
>>
>> On 2024/2/23 10:01, Zhiguo Niu wrote:
>>>
>>>
>>> On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <chao@kernel.org <mailto:chao@kernel.org>> wrote:
>>>
>>>      On 2024/2/7 10:01, Zhiguo Niu wrote:
>>>       > A panic issue happened in a reboot test in small capacity device
>>>       > as following:
>>>       > 1.The device size is 64MB, and main area has 24 segments, and
>>>       > CONFIG_F2FS_CHECK_FS is not enabled.
>>>       > 2.There is no any free segments left shown in free_segmap_info,
>>>       > then another write request cause get_new_segment get a out-of-bound
>>>       > segment with segno 24.
>>>       > 3.panic happen in update_sit_entry because access invalid bitmap
>>>       > pointer.
>>>
>>>      Zhiguo,
>>>
>>>      Can you please try below patch to see whether it can fix your problem?
>>>
>>>      https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org <https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org>
>>>
>>>      Thanks,
>>>
>>>
>>> Dear Chao,
>>> I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
>>
>> Zhiguo, thank you!
> 
> We tested this patch  this weekend on the previous version with
> problem, and it can not reproduce panic issues,
> so this patch should fix the original issue.

Zhiguo,

Thanks a lot for the test!

Do you mind replying to original patch below tag?

Tested-by: Zhiguo Niu <zhiguo.niu@unisoc.com>

Thanks,

> thanks!
> 
>>
>> BTW, I've tested this patch for a while, and it looks there is no issue w/
>> FAULT_NO_SEGMENT fault injection is on.
>>
>>> btw, Why can’t I see this patch on your branch^^?
>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>
>>
>> Too lazy to push patches in time, will do it in this weekend. :P
>>
>>> thanks!
>>>
>>>
>>>       >
>>>       > More detail shown in following patch sets.
>>>       > The three patches are splited here because the modifications are
>>>       > relatively independent and more readable.
>>>       >
>>>       > ---
>>>       > Changes of v2: stop checkpoint when get a out-of-bound segment
>>>       > ---
>>>       >
>>>       > Zhiguo Niu (4):
>>>       >    f2fs: correct counting methods of free_segments in __set_inuse
>>>       >    f2fs: fix panic issue in update_sit_entry
>>>       >    f2fs: enhance judgment conditions of GET_SEGNO
>>>       >    f2fs: stop checkpoint when get a out-of-bounds segment
>>>       >
>>>       >   fs/f2fs/file.c          |  7 ++++++-
>>>       >   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
>>>       >   fs/f2fs/segment.h       |  7 ++++---
>>>       >   include/linux/f2fs_fs.h |  1 +
>>>       >   4 files changed, 27 insertions(+), 9 deletions(-)
>>>       >
>>>
Jaegeuk Kim Feb. 27, 2024, 1:13 a.m. UTC | #6
On 02/26, Zhiguo Niu wrote:
> Dear Chao,
> 
> On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <chao@kernel.org> wrote:
> >
> > On 2024/2/23 10:01, Zhiguo Niu wrote:
> > >
> > >
> > > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <chao@kernel.org <mailto:chao@kernel.org>> wrote:
> > >
> > >     On 2024/2/7 10:01, Zhiguo Niu wrote:
> > >      > A panic issue happened in a reboot test in small capacity device
> > >      > as following:
> > >      > 1.The device size is 64MB, and main area has 24 segments, and
> > >      > CONFIG_F2FS_CHECK_FS is not enabled.
> > >      > 2.There is no any free segments left shown in free_segmap_info,
> > >      > then another write request cause get_new_segment get a out-of-bound
> > >      > segment with segno 24.
> > >      > 3.panic happen in update_sit_entry because access invalid bitmap
> > >      > pointer.
> > >
> > >     Zhiguo,
> > >
> > >     Can you please try below patch to see whether it can fix your problem?
> > >
> > >     https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org <https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org>
> > >
> > >     Thanks,
> > >
> > >
> > > Dear Chao,
> > > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
> >
> > Zhiguo, thank you!
> 
> We tested this patch  this weekend on the previous version with
> problem, and it can not reproduce panic issues,
> so this patch should fix the original issue.
> thanks!

Hey, do you guys please point out which patches were tested without what?
IOWs, which patches should I remove and keep Chao's patch?

> 
> >
> > BTW, I've tested this patch for a while, and it looks there is no issue w/
> > FAULT_NO_SEGMENT fault injection is on.
> >
> > > btw, Why can’t I see this patch on your branch^^?
> > > https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>
> >
> > Too lazy to push patches in time, will do it in this weekend. :P
> >
> > > thanks!
> > >
> > >
> > >      >
> > >      > More detail shown in following patch sets.
> > >      > The three patches are splited here because the modifications are
> > >      > relatively independent and more readable.
> > >      >
> > >      > ---
> > >      > Changes of v2: stop checkpoint when get a out-of-bound segment
> > >      > ---
> > >      >
> > >      > Zhiguo Niu (4):
> > >      >    f2fs: correct counting methods of free_segments in __set_inuse
> > >      >    f2fs: fix panic issue in update_sit_entry
> > >      >    f2fs: enhance judgment conditions of GET_SEGNO
> > >      >    f2fs: stop checkpoint when get a out-of-bounds segment
> > >      >
> > >      >   fs/f2fs/file.c          |  7 ++++++-
> > >      >   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
> > >      >   fs/f2fs/segment.h       |  7 ++++---
> > >      >   include/linux/f2fs_fs.h |  1 +
> > >      >   4 files changed, 27 insertions(+), 9 deletions(-)
> > >      >
> > >
Zhiguo Niu Feb. 27, 2024, 2:34 a.m. UTC | #7
On Tue, Feb 27, 2024 at 9:13 AM Jaegeuk Kim <jaegeuk@kernel.org> wrote:
>
> On 02/26, Zhiguo Niu wrote:
> > Dear Chao,
> >
> > On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <chao@kernel.org> wrote:
> > >
> > > On 2024/2/23 10:01, Zhiguo Niu wrote:
> > > >
> > > >
> > > > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <chao@kernel.org <mailto:chao@kernel.org>> wrote:
> > > >
> > > >     On 2024/2/7 10:01, Zhiguo Niu wrote:
> > > >      > A panic issue happened in a reboot test in small capacity device
> > > >      > as following:
> > > >      > 1.The device size is 64MB, and main area has 24 segments, and
> > > >      > CONFIG_F2FS_CHECK_FS is not enabled.
> > > >      > 2.There is no any free segments left shown in free_segmap_info,
> > > >      > then another write request cause get_new_segment get a out-of-bound
> > > >      > segment with segno 24.
> > > >      > 3.panic happen in update_sit_entry because access invalid bitmap
> > > >      > pointer.
> > > >
> > > >     Zhiguo,
> > > >
> > > >     Can you please try below patch to see whether it can fix your problem?
> > > >
> > > >     https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org <https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org>
> > > >
> > > >     Thanks,
> > > >
> > > >
> > > > Dear Chao,
> > > > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
> > >
> > > Zhiguo, thank you!
> >
> > We tested this patch  this weekend on the previous version with
> > problem, and it can not reproduce panic issues,
> > so this patch should fix the original issue.
> > thanks!
>
Dear Jaegeuk,
> Hey, do you guys please point out which patches were tested without what?
This problem occurred during our platform stability testing.
it can be fixed by my  this patch set, mainly be fixed by:
f2fs: fix panic issue in update_sit_entry & f2fs: enhance judgment
conditions of GET_SEGNO
and Chao's patch can also fix this problems testing without my patch
> IOWs, which patches should I remove and keep Chao's patch?
I think chao's patch is more reasonable, it does error handling more complete.
but my patch just do some sanity check for return value of GET_SEGNO
Same as other codes(update_segment_mtime)
and i think it also needed except this part:

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 3bf2ce46fa0907..bb22feeae1cfcb 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -96,7 +96,8 @@ static inline void sanity_check_seg_type(struct
f2fs_sb_info *sbi,
(GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & (BLKS_PER_SEG(sbi) - 1))
#define GET_SEGNO(sbi, blk_addr) \
- ((!__is_valid_data_blkaddr(blk_addr)) ? \
+ ((!__is_valid_data_blkaddr(blk_addr) || \
+ !f2fs_is_valid_blkaddr(sbi, blk_addr, DATA_GENERIC)) ? \
NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi), \
GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
#define CAP_BLKS_PER_SEC(sbi)
because Chao's patch let new_addr=null_addr when  get_new_segment
returns NOSPACE,
so I think this can be reverted and it also saves code running time.
How about Chao's opinions?
thanks!
>
> >
> > >
> > > BTW, I've tested this patch for a while, and it looks there is no issue w/
> > > FAULT_NO_SEGMENT fault injection is on.
> > >
> > > > btw, Why can’t I see this patch on your branch^^?
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>
> > >
> > > Too lazy to push patches in time, will do it in this weekend. :P
> > >
> > > > thanks!
> > > >
> > > >
> > > >      >
> > > >      > More detail shown in following patch sets.
> > > >      > The three patches are splited here because the modifications are
> > > >      > relatively independent and more readable.
> > > >      >
> > > >      > ---
> > > >      > Changes of v2: stop checkpoint when get a out-of-bound segment
> > > >      > ---
> > > >      >
> > > >      > Zhiguo Niu (4):
> > > >      >    f2fs: correct counting methods of free_segments in __set_inuse
> > > >      >    f2fs: fix panic issue in update_sit_entry
> > > >      >    f2fs: enhance judgment conditions of GET_SEGNO
> > > >      >    f2fs: stop checkpoint when get a out-of-bounds segment
> > > >      >
> > > >      >   fs/f2fs/file.c          |  7 ++++++-
> > > >      >   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
> > > >      >   fs/f2fs/segment.h       |  7 ++++---
> > > >      >   include/linux/f2fs_fs.h |  1 +
> > > >      >   4 files changed, 27 insertions(+), 9 deletions(-)
> > > >      >
> > > >
Jaegeuk Kim Feb. 27, 2024, 5:18 p.m. UTC | #8
On 02/27, Zhiguo Niu wrote:
> On Tue, Feb 27, 2024 at 9:13 AM Jaegeuk Kim <jaegeuk@kernel.org> wrote:
> >
> > On 02/26, Zhiguo Niu wrote:
> > > Dear Chao,
> > >
> > > On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <chao@kernel.org> wrote:
> > > >
> > > > On 2024/2/23 10:01, Zhiguo Niu wrote:
> > > > >
> > > > >
> > > > > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <chao@kernel.org <mailto:chao@kernel.org>> wrote:
> > > > >
> > > > >     On 2024/2/7 10:01, Zhiguo Niu wrote:
> > > > >      > A panic issue happened in a reboot test in small capacity device
> > > > >      > as following:
> > > > >      > 1.The device size is 64MB, and main area has 24 segments, and
> > > > >      > CONFIG_F2FS_CHECK_FS is not enabled.
> > > > >      > 2.There is no any free segments left shown in free_segmap_info,
> > > > >      > then another write request cause get_new_segment get a out-of-bound
> > > > >      > segment with segno 24.
> > > > >      > 3.panic happen in update_sit_entry because access invalid bitmap
> > > > >      > pointer.
> > > > >
> > > > >     Zhiguo,
> > > > >
> > > > >     Can you please try below patch to see whether it can fix your problem?
> > > > >
> > > > >     https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org <https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org>
> > > > >
> > > > >     Thanks,
> > > > >
> > > > >
> > > > > Dear Chao,
> > > > > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
> > > >
> > > > Zhiguo, thank you!
> > >
> > > We tested this patch  this weekend on the previous version with
> > > problem, and it can not reproduce panic issues,
> > > so this patch should fix the original issue.
> > > thanks!
> >
> Dear Jaegeuk,
> > Hey, do you guys please point out which patches were tested without what?
> This problem occurred during our platform stability testing.
> it can be fixed by my  this patch set, mainly be fixed by:
> f2fs: fix panic issue in update_sit_entry & f2fs: enhance judgment
> conditions of GET_SEGNO
> and Chao's patch can also fix this problems testing without my patch
> > IOWs, which patches should I remove and keep Chao's patch?
> I think chao's patch is more reasonable, it does error handling more complete.
> but my patch just do some sanity check for return value of GET_SEGNO
> Same as other codes(update_segment_mtime)
> and i think it also needed except this part:

Thanks for confirmation. It seems it'd be better to revert yours and apply
Chao's patch first. If you think there's something to improve on top of it,
could you please send another patch afterwards?

> 
> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> index 3bf2ce46fa0907..bb22feeae1cfcb 100644
> --- a/fs/f2fs/segment.h
> +++ b/fs/f2fs/segment.h
> @@ -96,7 +96,8 @@ static inline void sanity_check_seg_type(struct
> f2fs_sb_info *sbi,
> (GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & (BLKS_PER_SEG(sbi) - 1))
> #define GET_SEGNO(sbi, blk_addr) \
> - ((!__is_valid_data_blkaddr(blk_addr)) ? \
> + ((!__is_valid_data_blkaddr(blk_addr) || \
> + !f2fs_is_valid_blkaddr(sbi, blk_addr, DATA_GENERIC)) ? \
> NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi), \
> GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
> #define CAP_BLKS_PER_SEC(sbi)
> because Chao's patch let new_addr=null_addr when  get_new_segment
> returns NOSPACE,
> so I think this can be reverted and it also saves code running time.
> How about Chao's opinions?
> thanks!
> >
> > >
> > > >
> > > > BTW, I've tested this patch for a while, and it looks there is no issue w/
> > > > FAULT_NO_SEGMENT fault injection is on.
> > > >
> > > > > btw, Why can’t I see this patch on your branch^^?
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>
> > > >
> > > > Too lazy to push patches in time, will do it in this weekend. :P
> > > >
> > > > > thanks!
> > > > >
> > > > >
> > > > >      >
> > > > >      > More detail shown in following patch sets.
> > > > >      > The three patches are splited here because the modifications are
> > > > >      > relatively independent and more readable.
> > > > >      >
> > > > >      > ---
> > > > >      > Changes of v2: stop checkpoint when get a out-of-bound segment
> > > > >      > ---
> > > > >      >
> > > > >      > Zhiguo Niu (4):
> > > > >      >    f2fs: correct counting methods of free_segments in __set_inuse
> > > > >      >    f2fs: fix panic issue in update_sit_entry
> > > > >      >    f2fs: enhance judgment conditions of GET_SEGNO
> > > > >      >    f2fs: stop checkpoint when get a out-of-bounds segment
> > > > >      >
> > > > >      >   fs/f2fs/file.c          |  7 ++++++-
> > > > >      >   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
> > > > >      >   fs/f2fs/segment.h       |  7 ++++---
> > > > >      >   include/linux/f2fs_fs.h |  1 +
> > > > >      >   4 files changed, 27 insertions(+), 9 deletions(-)
> > > > >      >
> > > > >
Zhiguo Niu Feb. 28, 2024, 1 a.m. UTC | #9
On Wed, Feb 28, 2024 at 1:18 AM Jaegeuk Kim <jaegeuk@kernel.org> wrote:
>
> On 02/27, Zhiguo Niu wrote:
> > On Tue, Feb 27, 2024 at 9:13 AM Jaegeuk Kim <jaegeuk@kernel.org> wrote:
> > >
> > > On 02/26, Zhiguo Niu wrote:
> > > > Dear Chao,
> > > >
> > > > On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <chao@kernel.org> wrote:
> > > > >
> > > > > On 2024/2/23 10:01, Zhiguo Niu wrote:
> > > > > >
> > > > > >
> > > > > > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <chao@kernel.org <mailto:chao@kernel.org>> wrote:
> > > > > >
> > > > > >     On 2024/2/7 10:01, Zhiguo Niu wrote:
> > > > > >      > A panic issue happened in a reboot test in small capacity device
> > > > > >      > as following:
> > > > > >      > 1.The device size is 64MB, and main area has 24 segments, and
> > > > > >      > CONFIG_F2FS_CHECK_FS is not enabled.
> > > > > >      > 2.There is no any free segments left shown in free_segmap_info,
> > > > > >      > then another write request cause get_new_segment get a out-of-bound
> > > > > >      > segment with segno 24.
> > > > > >      > 3.panic happen in update_sit_entry because access invalid bitmap
> > > > > >      > pointer.
> > > > > >
> > > > > >     Zhiguo,
> > > > > >
> > > > > >     Can you please try below patch to see whether it can fix your problem?
> > > > > >
> > > > > >     https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org <https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org>
> > > > > >
> > > > > >     Thanks,
> > > > > >
> > > > > >
> > > > > > Dear Chao,
> > > > > > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
> > > > >
> > > > > Zhiguo, thank you!
> > > >
> > > > We tested this patch  this weekend on the previous version with
> > > > problem, and it can not reproduce panic issues,
> > > > so this patch should fix the original issue.
> > > > thanks!
> > >
> > Dear Jaegeuk,
> > > Hey, do you guys please point out which patches were tested without what?
> > This problem occurred during our platform stability testing.
> > it can be fixed by my  this patch set, mainly be fixed by:
> > f2fs: fix panic issue in update_sit_entry & f2fs: enhance judgment
> > conditions of GET_SEGNO
> > and Chao's patch can also fix this problems testing without my patch
> > > IOWs, which patches should I remove and keep Chao's patch?
> > I think chao's patch is more reasonable, it does error handling more complete.
> > but my patch just do some sanity check for return value of GET_SEGNO
> > Same as other codes(update_segment_mtime)
> > and i think it also needed except this part:
>
> Thanks for confirmation. It seems it'd be better to revert yours and apply
> Chao's patch first. If you think there's something to improve on top of it,
> could you please send another patch afterwards?

OK, I think this two patches still needed
  f2fs: correct counting methods of free_segments in __set_inuse
  f2fs: fix panic issue in update_sit_entry
and I'll reorganize it
thanks
>
> >
> > diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> > index 3bf2ce46fa0907..bb22feeae1cfcb 100644
> > --- a/fs/f2fs/segment.h
> > +++ b/fs/f2fs/segment.h
> > @@ -96,7 +96,8 @@ static inline void sanity_check_seg_type(struct
> > f2fs_sb_info *sbi,
> > (GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & (BLKS_PER_SEG(sbi) - 1))
> > #define GET_SEGNO(sbi, blk_addr) \
> > - ((!__is_valid_data_blkaddr(blk_addr)) ? \
> > + ((!__is_valid_data_blkaddr(blk_addr) || \
> > + !f2fs_is_valid_blkaddr(sbi, blk_addr, DATA_GENERIC)) ? \
> > NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi), \
> > GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
> > #define CAP_BLKS_PER_SEC(sbi)
> > because Chao's patch let new_addr=null_addr when  get_new_segment
> > returns NOSPACE,
> > so I think this can be reverted and it also saves code running time.
> > How about Chao's opinions?
> > thanks!
> > >
> > > >
> > > > >
> > > > > BTW, I've tested this patch for a while, and it looks there is no issue w/
> > > > > FAULT_NO_SEGMENT fault injection is on.
> > > > >
> > > > > > btw, Why can’t I see this patch on your branch^^?
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>
> > > > >
> > > > > Too lazy to push patches in time, will do it in this weekend. :P
> > > > >
> > > > > > thanks!
> > > > > >
> > > > > >
> > > > > >      >
> > > > > >      > More detail shown in following patch sets.
> > > > > >      > The three patches are splited here because the modifications are
> > > > > >      > relatively independent and more readable.
> > > > > >      >
> > > > > >      > ---
> > > > > >      > Changes of v2: stop checkpoint when get a out-of-bound segment
> > > > > >      > ---
> > > > > >      >
> > > > > >      > Zhiguo Niu (4):
> > > > > >      >    f2fs: correct counting methods of free_segments in __set_inuse
> > > > > >      >    f2fs: fix panic issue in update_sit_entry
> > > > > >      >    f2fs: enhance judgment conditions of GET_SEGNO
> > > > > >      >    f2fs: stop checkpoint when get a out-of-bounds segment
> > > > > >      >
> > > > > >      >   fs/f2fs/file.c          |  7 ++++++-
> > > > > >      >   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
> > > > > >      >   fs/f2fs/segment.h       |  7 ++++---
> > > > > >      >   include/linux/f2fs_fs.h |  1 +
> > > > > >      >   4 files changed, 27 insertions(+), 9 deletions(-)
> > > > > >      >
> > > > > >
Chao Yu Feb. 28, 2024, 1:21 a.m. UTC | #10
On 2024/2/28 1:18, Jaegeuk Kim wrote:
> On 02/27, Zhiguo Niu wrote:
>> On Tue, Feb 27, 2024 at 9:13 AM Jaegeuk Kim <jaegeuk@kernel.org> wrote:
>>>
>>> On 02/26, Zhiguo Niu wrote:
>>>> Dear Chao,
>>>>
>>>> On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <chao@kernel.org> wrote:
>>>>>
>>>>> On 2024/2/23 10:01, Zhiguo Niu wrote:
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <chao@kernel.org <mailto:chao@kernel.org>> wrote:
>>>>>>
>>>>>>      On 2024/2/7 10:01, Zhiguo Niu wrote:
>>>>>>       > A panic issue happened in a reboot test in small capacity device
>>>>>>       > as following:
>>>>>>       > 1.The device size is 64MB, and main area has 24 segments, and
>>>>>>       > CONFIG_F2FS_CHECK_FS is not enabled.
>>>>>>       > 2.There is no any free segments left shown in free_segmap_info,
>>>>>>       > then another write request cause get_new_segment get a out-of-bound
>>>>>>       > segment with segno 24.
>>>>>>       > 3.panic happen in update_sit_entry because access invalid bitmap
>>>>>>       > pointer.
>>>>>>
>>>>>>      Zhiguo,
>>>>>>
>>>>>>      Can you please try below patch to see whether it can fix your problem?
>>>>>>
>>>>>>      https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org <https://lore.kernel.org/linux-f2fs-devel/20240222121851.883141-3-chao@kernel.org>
>>>>>>
>>>>>>      Thanks,
>>>>>>
>>>>>>
>>>>>> Dear Chao,
>>>>>> I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
>>>>>
>>>>> Zhiguo, thank you!
>>>>
>>>> We tested this patch  this weekend on the previous version with
>>>> problem, and it can not reproduce panic issues,
>>>> so this patch should fix the original issue.
>>>> thanks!
>>>
>> Dear Jaegeuk,
>>> Hey, do you guys please point out which patches were tested without what?
>> This problem occurred during our platform stability testing.
>> it can be fixed by my  this patch set, mainly be fixed by:
>> f2fs: fix panic issue in update_sit_entry & f2fs: enhance judgment
>> conditions of GET_SEGNO
>> and Chao's patch can also fix this problems testing without my patch
>>> IOWs, which patches should I remove and keep Chao's patch?
>> I think chao's patch is more reasonable, it does error handling more complete.
>> but my patch just do some sanity check for return value of GET_SEGNO
>> Same as other codes(update_segment_mtime)
>> and i think it also needed except this part:
> 
> Thanks for confirmation. It seems it'd be better to revert yours and apply
> Chao's patch first. If you think there's something to improve on top of it,
> could you please send another patch afterwards?

Agreed.

Thanks,

> 
>>
>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
>> index 3bf2ce46fa0907..bb22feeae1cfcb 100644
>> --- a/fs/f2fs/segment.h
>> +++ b/fs/f2fs/segment.h
>> @@ -96,7 +96,8 @@ static inline void sanity_check_seg_type(struct
>> f2fs_sb_info *sbi,
>> (GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & (BLKS_PER_SEG(sbi) - 1))
>> #define GET_SEGNO(sbi, blk_addr) \
>> - ((!__is_valid_data_blkaddr(blk_addr)) ? \
>> + ((!__is_valid_data_blkaddr(blk_addr) || \
>> + !f2fs_is_valid_blkaddr(sbi, blk_addr, DATA_GENERIC)) ? \
>> NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi), \
>> GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
>> #define CAP_BLKS_PER_SEC(sbi)
>> because Chao's patch let new_addr=null_addr when  get_new_segment
>> returns NOSPACE,
>> so I think this can be reverted and it also saves code running time.
>> How about Chao's opinions?
>> thanks!
>>>
>>>>
>>>>>
>>>>> BTW, I've tested this patch for a while, and it looks there is no issue w/
>>>>> FAULT_NO_SEGMENT fault injection is on.
>>>>>
>>>>>> btw, Why can’t I see this patch on your branch^^?
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>
>>>>>
>>>>> Too lazy to push patches in time, will do it in this weekend. :P
>>>>>
>>>>>> thanks!
>>>>>>
>>>>>>
>>>>>>       >
>>>>>>       > More detail shown in following patch sets.
>>>>>>       > The three patches are splited here because the modifications are
>>>>>>       > relatively independent and more readable.
>>>>>>       >
>>>>>>       > ---
>>>>>>       > Changes of v2: stop checkpoint when get a out-of-bound segment
>>>>>>       > ---
>>>>>>       >
>>>>>>       > Zhiguo Niu (4):
>>>>>>       >    f2fs: correct counting methods of free_segments in __set_inuse
>>>>>>       >    f2fs: fix panic issue in update_sit_entry
>>>>>>       >    f2fs: enhance judgment conditions of GET_SEGNO
>>>>>>       >    f2fs: stop checkpoint when get a out-of-bounds segment
>>>>>>       >
>>>>>>       >   fs/f2fs/file.c          |  7 ++++++-
>>>>>>       >   fs/f2fs/segment.c       | 21 ++++++++++++++++-----
>>>>>>       >   fs/f2fs/segment.h       |  7 ++++---
>>>>>>       >   include/linux/f2fs_fs.h |  1 +
>>>>>>       >   4 files changed, 27 insertions(+), 9 deletions(-)
>>>>>>       >
>>>>>>