Message ID | 1584364028-122886-2-git-send-email-zhengbin13@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | xfs: always init fdblocks in mount and avoid f_bfree overflow | expand |
On Mon, Mar 16, 2020 at 09:07:07PM +0800, Zheng Bin wrote: > Use fuzz(hydra) to test XFS and automatically generate > tmp.img(XFS v5 format, but some metadata is wrong) > > xfs_repair information(just one AG): > agf_freeblks 0, counted 3224 in ag 0 > agf_longest 0, counted 3224 in ag 0 > sb_fdblocks 3228, counted 3224 > > Test as follows: > mount tmp.img tmpdir > cp file1M tmpdir > sync > > In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. > The reason is same to commit d0c7feaf8767 > ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest > is 0, we can not block this in xfs_agf_verify. Uh.... what are you saying here? That the allocator misbehaves and loops forever if sb_fdblocks > sum(agf_freeblks) after mount? Also, uh, what do you mean by "sync not stuck"? Writeback will fail on allocation error, right...? So I think the problem with incorrect AGF contents (on upstream) is that writeback will fail due to ENOSPC, which should never happen under normal circumstance? > > Make sure fdblocks is always inited in mount(also init ifree, icount). > > xfs_mountfs > xfs_check_summary_counts > xfs_initialize_perag_data > > Signed-off-by: Zheng Bin <zhengbin13@huawei.com> > --- > fs/xfs/xfs_mount.c | 33 --------------------------------- > 1 file changed, 33 deletions(-) > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > index c5513e5..dc41801 100644 > --- a/fs/xfs/xfs_mount.c > +++ b/fs/xfs/xfs_mount.c > @@ -594,39 +594,6 @@ xfs_check_summary_counts( > return -EFSCORRUPTED; > } > > - /* > - * Now the log is mounted, we know if it was an unclean shutdown or > - * not. If it was, with the first phase of recovery has completed, we > - * have consistent AG blocks on disk. We have not recovered EFIs yet, > - * but they are recovered transactionally in the second recovery phase > - * later. > - * > - * If the log was clean when we mounted, we can check the summary > - * counters. If any of them are obviously incorrect, we can recompute > - * them from the AGF headers in the next step. > - */ > - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && > - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || > - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || > - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) > - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); > - > - /* > - * We can safely re-initialise incore superblock counters from the > - * per-ag data. These may not be correct if the filesystem was not > - * cleanly unmounted, so we waited for recovery to finish before doing > - * this. > - * > - * If the filesystem was cleanly unmounted or the previous check did > - * not flag anything weird, then we can trust the values in the > - * superblock to be correct and we don't need to do anything here. > - * Otherwise, recalculate the summary counters. > - */ > - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || > - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && > - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) > - return 0; > - > return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); The downside of this is that now we /always/ have to make two trips around all of the AGs at mount time. If you're proposing to require a fresh fdblocks recomputation at mount, could you please refactor all of the mount-time AG walks into a single loop? And perhaps use xfs_pwork so that we don't have to do it serially? --D > } > > -- > 2.7.4 >
On 3/16/20 8:07 AM, Zheng Bin wrote: > Use fuzz(hydra) to test XFS and automatically generate > tmp.img(XFS v5 format, but some metadata is wrong) > > xfs_repair information(just one AG): > agf_freeblks 0, counted 3224 in ag 0 > agf_longest 0, counted 3224 in ag 0 > sb_fdblocks 3228, counted 3224 > > Test as follows: > mount tmp.img tmpdir > cp file1M tmpdir > sync > > In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. Is there any observable problem on linux-next? I can't tell. Can you provide an image that demonstrates this problem? -Eric > The reason is same to commit d0c7feaf8767 > ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest > is 0, we can not block this in xfs_agf_verify. > > Make sure fdblocks is always inited in mount(also init ifree, icount). > > xfs_mountfs > xfs_check_summary_counts > xfs_initialize_perag_data > > Signed-off-by: Zheng Bin <zhengbin13@huawei.com> > --- > fs/xfs/xfs_mount.c | 33 --------------------------------- > 1 file changed, 33 deletions(-) > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > index c5513e5..dc41801 100644 > --- a/fs/xfs/xfs_mount.c > +++ b/fs/xfs/xfs_mount.c > @@ -594,39 +594,6 @@ xfs_check_summary_counts( > return -EFSCORRUPTED; > } > > - /* > - * Now the log is mounted, we know if it was an unclean shutdown or > - * not. If it was, with the first phase of recovery has completed, we > - * have consistent AG blocks on disk. We have not recovered EFIs yet, > - * but they are recovered transactionally in the second recovery phase > - * later. > - * > - * If the log was clean when we mounted, we can check the summary > - * counters. If any of them are obviously incorrect, we can recompute > - * them from the AGF headers in the next step. > - */ > - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && > - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || > - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || > - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) > - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); > - > - /* > - * We can safely re-initialise incore superblock counters from the > - * per-ag data. These may not be correct if the filesystem was not > - * cleanly unmounted, so we waited for recovery to finish before doing > - * this. > - * > - * If the filesystem was cleanly unmounted or the previous check did > - * not flag anything weird, then we can trust the values in the > - * superblock to be correct and we don't need to do anything here. > - * Otherwise, recalculate the summary counters. > - */ > - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || > - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && > - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) > - return 0; > - > return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); > } > > -- > 2.7.4 >
On 2020/3/16 23:13, Darrick J. Wong wrote: > On Mon, Mar 16, 2020 at 09:07:07PM +0800, Zheng Bin wrote: >> Use fuzz(hydra) to test XFS and automatically generate >> tmp.img(XFS v5 format, but some metadata is wrong) >> >> xfs_repair information(just one AG): >> agf_freeblks 0, counted 3224 in ag 0 >> agf_longest 0, counted 3224 in ag 0 >> sb_fdblocks 3228, counted 3224 >> >> Test as follows: >> mount tmp.img tmpdir >> cp file1M tmpdir >> sync >> >> In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. >> The reason is same to commit d0c7feaf8767 >> ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest >> is 0, we can not block this in xfs_agf_verify. > Uh.... what are you saying here? That the allocator misbehaves and > loops forever if sb_fdblocks > sum(agf_freeblks) after mount? > > Also, uh, what do you mean by "sync not stuck"? Writeback will fail on > allocation error, right...? So I think the problem with incorrect AGF > contents (on upstream) is that writeback will fail due to ENOSPC, which > should never happen under normal circumstance? Yes, see commit d0c7feaf8767 ("xfs: add agf freeblocks verify in xfs_agf_verify") 4.19-stable loops forerver, while linux-next will fail due to ENOSPC, cause commit c2b3164320b5 ("xfs: use the latest extent at writeback delalloc conversion time") remove the above while, dmesg is as follows: [ 55.250114] XFS (loop0): page discard on page ffffea0008bc7380, inode 0x1b0c, offset 0. @sandeen, we can construct an img like this: dd if=/dev/zero of=xfs.img bs=1M count=20 mkfs.xfs -d agcount=1 xfs.img xfs_db -x xfs.img agf 0 write freeblks 0 write longest 0 quit >> Make sure fdblocks is always inited in mount(also init ifree, icount). >> >> xfs_mountfs >> xfs_check_summary_counts >> xfs_initialize_perag_data >> >> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> >> --- >> fs/xfs/xfs_mount.c | 33 --------------------------------- >> 1 file changed, 33 deletions(-) >> >> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c >> index c5513e5..dc41801 100644 >> --- a/fs/xfs/xfs_mount.c >> +++ b/fs/xfs/xfs_mount.c >> @@ -594,39 +594,6 @@ xfs_check_summary_counts( >> return -EFSCORRUPTED; >> } >> >> - /* >> - * Now the log is mounted, we know if it was an unclean shutdown or >> - * not. If it was, with the first phase of recovery has completed, we >> - * have consistent AG blocks on disk. We have not recovered EFIs yet, >> - * but they are recovered transactionally in the second recovery phase >> - * later. >> - * >> - * If the log was clean when we mounted, we can check the summary >> - * counters. If any of them are obviously incorrect, we can recompute >> - * them from the AGF headers in the next step. >> - */ >> - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && >> - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || >> - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || >> - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) >> - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); >> - >> - /* >> - * We can safely re-initialise incore superblock counters from the >> - * per-ag data. These may not be correct if the filesystem was not >> - * cleanly unmounted, so we waited for recovery to finish before doing >> - * this. >> - * >> - * If the filesystem was cleanly unmounted or the previous check did >> - * not flag anything weird, then we can trust the values in the >> - * superblock to be correct and we don't need to do anything here. >> - * Otherwise, recalculate the summary counters. >> - */ >> - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || >> - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && >> - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) >> - return 0; >> - >> return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); > The downside of this is that now we /always/ have to make two trips > around all of the AGs at mount time. If you're proposing to require a > fresh fdblocks recomputation at mount, could you please refactor all of > the mount-time AG walks into a single loop? And perhaps use xfs_pwork > so that we don't have to do it serially? > > --D > >> } >> >> -- >> 2.7.4 >> > . >
On 2020/3/16 23:13, Darrick J. Wong wrote: > On Mon, Mar 16, 2020 at 09:07:07PM +0800, Zheng Bin wrote: >> Use fuzz(hydra) to test XFS and automatically generate >> tmp.img(XFS v5 format, but some metadata is wrong) >> >> xfs_repair information(just one AG): >> agf_freeblks 0, counted 3224 in ag 0 >> agf_longest 0, counted 3224 in ag 0 >> sb_fdblocks 3228, counted 3224 >> >> Test as follows: >> mount tmp.img tmpdir >> cp file1M tmpdir >> sync >> >> In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. >> The reason is same to commit d0c7feaf8767 >> ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest >> is 0, we can not block this in xfs_agf_verify. > Uh.... what are you saying here? That the allocator misbehaves and > loops forever if sb_fdblocks > sum(agf_freeblks) after mount? > > Also, uh, what do you mean by "sync not stuck"? Writeback will fail on > allocation error, right...? So I think the problem with incorrect AGF > contents (on upstream) is that writeback will fail due to ENOSPC, which > should never happen under normal circumstance? > >> Make sure fdblocks is always inited in mount(also init ifree, icount). >> >> xfs_mountfs >> xfs_check_summary_counts >> xfs_initialize_perag_data >> >> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> >> --- >> fs/xfs/xfs_mount.c | 33 --------------------------------- >> 1 file changed, 33 deletions(-) >> >> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c >> index c5513e5..dc41801 100644 >> --- a/fs/xfs/xfs_mount.c >> +++ b/fs/xfs/xfs_mount.c >> @@ -594,39 +594,6 @@ xfs_check_summary_counts( >> return -EFSCORRUPTED; >> } >> >> - /* >> - * Now the log is mounted, we know if it was an unclean shutdown or >> - * not. If it was, with the first phase of recovery has completed, we >> - * have consistent AG blocks on disk. We have not recovered EFIs yet, >> - * but they are recovered transactionally in the second recovery phase >> - * later. >> - * >> - * If the log was clean when we mounted, we can check the summary >> - * counters. If any of them are obviously incorrect, we can recompute >> - * them from the AGF headers in the next step. >> - */ >> - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && >> - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || >> - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || >> - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) >> - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); >> - >> - /* >> - * We can safely re-initialise incore superblock counters from the >> - * per-ag data. These may not be correct if the filesystem was not >> - * cleanly unmounted, so we waited for recovery to finish before doing >> - * this. >> - * >> - * If the filesystem was cleanly unmounted or the previous check did >> - * not flag anything weird, then we can trust the values in the >> - * superblock to be correct and we don't need to do anything here. >> - * Otherwise, recalculate the summary counters. >> - */ >> - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || >> - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && >> - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) >> - return 0; >> - >> return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); > The downside of this is that now we /always/ have to make two trips > around all of the AGs at mount time. If you're proposing to require a > fresh fdblocks recomputation at mount, could you please refactor all of > the mount-time AG walks into a single loop? And perhaps use xfs_pwork > so that we don't have to do it serially? xfs_mountfs xfs_initialize_perag -->just alloc memory xfs_check_summary_counts xfs_initialize_perag_data -->read agf,agi from disk for (index = 0; index < agcount; index++) error = xfs_alloc_pagf_init(mp, NULL, index, 0) error = xfs_ialloc_pagi_init(mp, NULL, index) xfs_fs_reserve_ag_blocks for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) xfs_ag_resv_init #ifdef DEBUG -->read agf error = xfs_alloc_pagf_init(pag->pag_mount, tp, pag->pag_agno, 0) #endif Is this? if enable XFS_DEBUG, xfs_initialize_perag_data read agf, xfs_ag_resv_init also read agf? > > --D > >> } >> >> -- >> 2.7.4 >> > . >
On Tue, Mar 17, 2020 at 10:22:33AM +0800, zhengbin (A) wrote: > > On 2020/3/16 23:13, Darrick J. Wong wrote: > > On Mon, Mar 16, 2020 at 09:07:07PM +0800, Zheng Bin wrote: > >> Use fuzz(hydra) to test XFS and automatically generate > >> tmp.img(XFS v5 format, but some metadata is wrong) > >> > >> xfs_repair information(just one AG): > >> agf_freeblks 0, counted 3224 in ag 0 > >> agf_longest 0, counted 3224 in ag 0 > >> sb_fdblocks 3228, counted 3224 > >> > >> Test as follows: > >> mount tmp.img tmpdir > >> cp file1M tmpdir > >> sync > >> > >> In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. > >> The reason is same to commit d0c7feaf8767 > >> ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest > >> is 0, we can not block this in xfs_agf_verify. > > Uh.... what are you saying here? That the allocator misbehaves and > > loops forever if sb_fdblocks > sum(agf_freeblks) after mount? > > > > Also, uh, what do you mean by "sync not stuck"? Writeback will fail on > > allocation error, right...? So I think the problem with incorrect AGF > > contents (on upstream) is that writeback will fail due to ENOSPC, which > > should never happen under normal circumstance? > > > >> Make sure fdblocks is always inited in mount(also init ifree, icount). > >> > >> xfs_mountfs > >> xfs_check_summary_counts > >> xfs_initialize_perag_data > >> > >> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> > >> --- > >> fs/xfs/xfs_mount.c | 33 --------------------------------- > >> 1 file changed, 33 deletions(-) > >> > >> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > >> index c5513e5..dc41801 100644 > >> --- a/fs/xfs/xfs_mount.c > >> +++ b/fs/xfs/xfs_mount.c > >> @@ -594,39 +594,6 @@ xfs_check_summary_counts( > >> return -EFSCORRUPTED; > >> } > >> > >> - /* > >> - * Now the log is mounted, we know if it was an unclean shutdown or > >> - * not. If it was, with the first phase of recovery has completed, we > >> - * have consistent AG blocks on disk. We have not recovered EFIs yet, > >> - * but they are recovered transactionally in the second recovery phase > >> - * later. > >> - * > >> - * If the log was clean when we mounted, we can check the summary > >> - * counters. If any of them are obviously incorrect, we can recompute > >> - * them from the AGF headers in the next step. > >> - */ > >> - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && > >> - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || > >> - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || > >> - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) > >> - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); > >> - > >> - /* > >> - * We can safely re-initialise incore superblock counters from the > >> - * per-ag data. These may not be correct if the filesystem was not > >> - * cleanly unmounted, so we waited for recovery to finish before doing > >> - * this. > >> - * > >> - * If the filesystem was cleanly unmounted or the previous check did > >> - * not flag anything weird, then we can trust the values in the > >> - * superblock to be correct and we don't need to do anything here. > >> - * Otherwise, recalculate the summary counters. > >> - */ > >> - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || > >> - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && > >> - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) > >> - return 0; > >> - > >> return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); > > The downside of this is that now we /always/ have to make two trips > > around all of the AGs at mount time. If you're proposing to require a > > fresh fdblocks recomputation at mount, could you please refactor all of > > the mount-time AG walks into a single loop? And perhaps use xfs_pwork > > so that we don't have to do it serially? > xfs_mountfs > xfs_initialize_perag -->just alloc memory > xfs_check_summary_counts > xfs_initialize_perag_data -->read agf,agi from disk > for (index = 0; index < agcount; index++) > error = xfs_alloc_pagf_init(mp, NULL, index, 0) > error = xfs_ialloc_pagi_init(mp, NULL, index) > xfs_fs_reserve_ag_blocks > for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) > xfs_ag_resv_init > #ifdef DEBUG -->read agf > error = xfs_alloc_pagf_init(pag->pag_mount, tp, pag->pag_agno, 0) > #endif > > Is this? if enable XFS_DEBUG, xfs_initialize_perag_data read agf, xfs_ag_resv_init > > also read agf? Yes, that's the other AG-walker that I was referring to. --D > > > > > --D > > > >> } > >> > >> -- > >> 2.7.4 > >> > > . > > >
On 2020/3/17 11:55, Darrick J. Wong wrote: > On Tue, Mar 17, 2020 at 10:22:33AM +0800, zhengbin (A) wrote: >> On 2020/3/16 23:13, Darrick J. Wong wrote: >>> On Mon, Mar 16, 2020 at 09:07:07PM +0800, Zheng Bin wrote: >>>> Use fuzz(hydra) to test XFS and automatically generate >>>> tmp.img(XFS v5 format, but some metadata is wrong) >>>> >>>> xfs_repair information(just one AG): >>>> agf_freeblks 0, counted 3224 in ag 0 >>>> agf_longest 0, counted 3224 in ag 0 >>>> sb_fdblocks 3228, counted 3224 >>>> >>>> Test as follows: >>>> mount tmp.img tmpdir >>>> cp file1M tmpdir >>>> sync >>>> >>>> In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. >>>> The reason is same to commit d0c7feaf8767 >>>> ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest >>>> is 0, we can not block this in xfs_agf_verify. >>> Uh.... what are you saying here? That the allocator misbehaves and >>> loops forever if sb_fdblocks > sum(agf_freeblks) after mount? >>> >>> Also, uh, what do you mean by "sync not stuck"? Writeback will fail on >>> allocation error, right...? So I think the problem with incorrect AGF >>> contents (on upstream) is that writeback will fail due to ENOSPC, which >>> should never happen under normal circumstance? >>> >>>> Make sure fdblocks is always inited in mount(also init ifree, icount). >>>> >>>> xfs_mountfs >>>> xfs_check_summary_counts >>>> xfs_initialize_perag_data >>>> >>>> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> >>>> --- >>>> fs/xfs/xfs_mount.c | 33 --------------------------------- >>>> 1 file changed, 33 deletions(-) >>>> >>>> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c >>>> index c5513e5..dc41801 100644 >>>> --- a/fs/xfs/xfs_mount.c >>>> +++ b/fs/xfs/xfs_mount.c >>>> @@ -594,39 +594,6 @@ xfs_check_summary_counts( >>>> return -EFSCORRUPTED; >>>> } >>>> >>>> - /* >>>> - * Now the log is mounted, we know if it was an unclean shutdown or >>>> - * not. If it was, with the first phase of recovery has completed, we >>>> - * have consistent AG blocks on disk. We have not recovered EFIs yet, >>>> - * but they are recovered transactionally in the second recovery phase >>>> - * later. >>>> - * >>>> - * If the log was clean when we mounted, we can check the summary >>>> - * counters. If any of them are obviously incorrect, we can recompute >>>> - * them from the AGF headers in the next step. >>>> - */ >>>> - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && >>>> - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || >>>> - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || >>>> - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) >>>> - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); >>>> - >>>> - /* >>>> - * We can safely re-initialise incore superblock counters from the >>>> - * per-ag data. These may not be correct if the filesystem was not >>>> - * cleanly unmounted, so we waited for recovery to finish before doing >>>> - * this. >>>> - * >>>> - * If the filesystem was cleanly unmounted or the previous check did >>>> - * not flag anything weird, then we can trust the values in the >>>> - * superblock to be correct and we don't need to do anything here. >>>> - * Otherwise, recalculate the summary counters. >>>> - */ >>>> - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || >>>> - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && >>>> - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) >>>> - return 0; >>>> - >>>> return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); >>> The downside of this is that now we /always/ have to make two trips >>> around all of the AGs at mount time. If you're proposing to require a >>> fresh fdblocks recomputation at mount, could you please refactor all of >>> the mount-time AG walks into a single loop? And perhaps use xfs_pwork >>> so that we don't have to do it serially? >> xfs_mountfs >> xfs_initialize_perag -->just alloc memory >> xfs_check_summary_counts >> xfs_initialize_perag_data -->read agf,agi from disk >> for (index = 0; index < agcount; index++) >> error = xfs_alloc_pagf_init(mp, NULL, index, 0) >> error = xfs_ialloc_pagi_init(mp, NULL, index) >> xfs_fs_reserve_ag_blocks >> for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) >> xfs_ag_resv_init >> #ifdef DEBUG -->read agf >> error = xfs_alloc_pagf_init(pag->pag_mount, tp, pag->pag_agno, 0) >> #endif >> >> Is this? if enable XFS_DEBUG, xfs_initialize_perag_data read agf, xfs_ag_resv_init >> >> also read agf? > Yes, that's the other AG-walker that I was referring to. Maybe we can judge if (!pag->pagf_init) in xfs_ag_resv_init? xfs_fs_reserve_ag_blocks for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) xfs_ag_resv_init #ifdef DEBUG -->read agf if (!pag->pagf_init) { error = xfs_alloc_pagf_init(pag->pag_mount, tp, pag->pag_agno, 0) } > > --D > >>> --D >>> >>>> } >>>> >>>> -- >>>> 2.7.4 >>>> >>> . >>> > . >
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index c5513e5..dc41801 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -594,39 +594,6 @@ xfs_check_summary_counts( return -EFSCORRUPTED; } - /* - * Now the log is mounted, we know if it was an unclean shutdown or - * not. If it was, with the first phase of recovery has completed, we - * have consistent AG blocks on disk. We have not recovered EFIs yet, - * but they are recovered transactionally in the second recovery phase - * later. - * - * If the log was clean when we mounted, we can check the summary - * counters. If any of them are obviously incorrect, we can recompute - * them from the AGF headers in the next step. - */ - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); - - /* - * We can safely re-initialise incore superblock counters from the - * per-ag data. These may not be correct if the filesystem was not - * cleanly unmounted, so we waited for recovery to finish before doing - * this. - * - * If the filesystem was cleanly unmounted or the previous check did - * not flag anything weird, then we can trust the values in the - * superblock to be correct and we don't need to do anything here. - * Otherwise, recalculate the summary counters. - */ - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) - return 0; - return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); }
Use fuzz(hydra) to test XFS and automatically generate tmp.img(XFS v5 format, but some metadata is wrong) xfs_repair information(just one AG): agf_freeblks 0, counted 3224 in ag 0 agf_longest 0, counted 3224 in ag 0 sb_fdblocks 3228, counted 3224 Test as follows: mount tmp.img tmpdir cp file1M tmpdir sync In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. The reason is same to commit d0c7feaf8767 ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest is 0, we can not block this in xfs_agf_verify. Make sure fdblocks is always inited in mount(also init ifree, icount). xfs_mountfs xfs_check_summary_counts xfs_initialize_perag_data Signed-off-by: Zheng Bin <zhengbin13@huawei.com> --- fs/xfs/xfs_mount.c | 33 --------------------------------- 1 file changed, 33 deletions(-) -- 2.7.4