Message ID | 161076029319.3386490.2011901341184065451.stgit@magnolia (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | xfs: add the ability to flag a fs for repair | expand |
On Fri, Jan 15, 2021 at 05:24:53PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > Clear the needsrepair flag, since it's used to prevent mounting of an > inconsistent filesystem. > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > --- Code/errors look much cleaner. Though looking at the repair code again, I wonder... if we clear the needsrepair bit and dirty/write the sb in phase 2 and then xfs_repair happens to crash, do we risk clearing the bit and thus allowing a potential mount before whatever requisite metadata updates have been made? Brian > repair/agheader.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) > > > diff --git a/repair/agheader.c b/repair/agheader.c > index 8bb99489..d9b72d3a 100644 > --- a/repair/agheader.c > +++ b/repair/agheader.c > @@ -452,6 +452,21 @@ secondary_sb_whack( > rval |= XR_AG_SB_SEC; > } > > + if (xfs_sb_version_needsrepair(sb)) { > + if (!no_modify) > + sb->sb_features_incompat &= > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > + if (i == 0) { > + if (!no_modify) > + do_warn( > + _("clearing needsrepair flag and regenerating metadata\n")); > + else > + do_warn( > + _("would clear needsrepair flag and regenerate metadata\n")); > + } > + rval |= XR_AG_SB_SEC; > + } > + > return(rval); > } > >
On Tue, Jan 19, 2021 at 09:37:54AM -0500, Brian Foster wrote: > On Fri, Jan 15, 2021 at 05:24:53PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@kernel.org> > > > > Clear the needsrepair flag, since it's used to prevent mounting of an > > inconsistent filesystem. > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > --- > > Code/errors look much cleaner. Though looking at the repair code again, > I wonder... if we clear the needsrepair bit and dirty/write the sb in > phase 2 and then xfs_repair happens to crash, do we risk clearing the > bit and thus allowing a potential mount before whatever requisite > metadata updates have been made? [Oh good, now mail.kernel.org is having problems...] Yes, though I think that falls into the realm of "sysadmins should be sufficiently self-aware not to expect mount to work after repair fails/system crashes during an upgrade". I've thought about how to solve the general problem of preventing people from mounting filesystems if repair doesn't run to completion. I think xfs_repair could be modified so that once it finds the primary super, it writes it back out with NEEDSREPAIR set (V5) or inprogress set (V4). Once we've finished the buffer cache flush at the end of repair, we clear needsrepair/inprogress and write the primary super again. An optimization on that would be to find a way to avoid that first super write until we flush the first dirty buffer. Another way to make repair more "transactional" would be to do it would be to fiddle with the buffer manager so that writes are sent to a metadump file which could be mdrestore'd if repair completes successfully. But that's a short-circuit around the even bigger project of porting the kernel logging code to userspace and use that in repair. --D > Brian > > > repair/agheader.c | 15 +++++++++++++++ > > 1 file changed, 15 insertions(+) > > > > > > diff --git a/repair/agheader.c b/repair/agheader.c > > index 8bb99489..d9b72d3a 100644 > > --- a/repair/agheader.c > > +++ b/repair/agheader.c > > @@ -452,6 +452,21 @@ secondary_sb_whack( > > rval |= XR_AG_SB_SEC; > > } > > > > + if (xfs_sb_version_needsrepair(sb)) { > > + if (!no_modify) > > + sb->sb_features_incompat &= > > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > > + if (i == 0) { > > + if (!no_modify) > > + do_warn( > > + _("clearing needsrepair flag and regenerating metadata\n")); > > + else > > + do_warn( > > + _("would clear needsrepair flag and regenerate metadata\n")); > > + } > > + rval |= XR_AG_SB_SEC; > > + } > > + > > return(rval); > > } > > > > >
On Tue, Jan 19, 2021 at 10:15:49AM -0800, Darrick J. Wong wrote: > On Tue, Jan 19, 2021 at 09:37:54AM -0500, Brian Foster wrote: > > On Fri, Jan 15, 2021 at 05:24:53PM -0800, Darrick J. Wong wrote: > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > Clear the needsrepair flag, since it's used to prevent mounting of an > > > inconsistent filesystem. > > > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > > --- > > > > Code/errors look much cleaner. Though looking at the repair code again, > > I wonder... if we clear the needsrepair bit and dirty/write the sb in > > phase 2 and then xfs_repair happens to crash, do we risk clearing the > > bit and thus allowing a potential mount before whatever requisite > > metadata updates have been made? > > [Oh good, now mail.kernel.org is having problems...] > > Yes, though I think that falls into the realm of "sysadmins should be > sufficiently self-aware not to expect mount to work after repair > fails/system crashes during an upgrade". > > I've thought about how to solve the general problem of preventing people > from mounting filesystems if repair doesn't run to completion. I think > xfs_repair could be modified so that once it finds the primary super, it > writes it back out with NEEDSREPAIR set (V5) or inprogress set (V4). > Once we've finished the buffer cache flush at the end of repair, we > clear needsrepair/inprogress and write the primary super again. > That's kind of what I was thinking.. set a global flag if/when we come across the bit set on disk and clear it as a final step. > An optimization on that would be to find a way to avoid that first super > write until we flush the first dirty buffer. > > Another way to make repair more "transactional" would be to do it would > be to fiddle with the buffer manager so that writes are sent to a > metadump file which could be mdrestore'd if repair completes > successfully. But that's a short-circuit around the even bigger project > of porting the kernel logging code to userspace and use that in repair. > Yeah, I wouldn't want to have clearing a feature bit depend on such a significant rework of core functionality. The bit is not going to be set in most cases, so I'd suspect an extra superblock write at the end of repair wouldn't be much of a problem. In fact, it looks like main() already has an unconditional sb write before the final libxfs_unmount() call. Perhaps we could just hitch onto that for the primary super update (and continue to clear the secondaries earlier as the current patch does)..? Brian > --D > > > Brian > > > > > repair/agheader.c | 15 +++++++++++++++ > > > 1 file changed, 15 insertions(+) > > > > > > > > > diff --git a/repair/agheader.c b/repair/agheader.c > > > index 8bb99489..d9b72d3a 100644 > > > --- a/repair/agheader.c > > > +++ b/repair/agheader.c > > > @@ -452,6 +452,21 @@ secondary_sb_whack( > > > rval |= XR_AG_SB_SEC; > > > } > > > > > > + if (xfs_sb_version_needsrepair(sb)) { > > > + if (!no_modify) > > > + sb->sb_features_incompat &= > > > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > > > + if (i == 0) { > > > + if (!no_modify) > > > + do_warn( > > > + _("clearing needsrepair flag and regenerating metadata\n")); > > > + else > > > + do_warn( > > > + _("would clear needsrepair flag and regenerate metadata\n")); > > > + } > > > + rval |= XR_AG_SB_SEC; > > > + } > > > + > > > return(rval); > > > } > > > > > > > > >
On Tue, Jan 19, 2021 at 02:44:06PM -0500, Brian Foster wrote: > On Tue, Jan 19, 2021 at 10:15:49AM -0800, Darrick J. Wong wrote: > > On Tue, Jan 19, 2021 at 09:37:54AM -0500, Brian Foster wrote: > > > On Fri, Jan 15, 2021 at 05:24:53PM -0800, Darrick J. Wong wrote: > > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > > > Clear the needsrepair flag, since it's used to prevent mounting of an > > > > inconsistent filesystem. > > > > > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > > > --- > > > > > > Code/errors look much cleaner. Though looking at the repair code again, > > > I wonder... if we clear the needsrepair bit and dirty/write the sb in > > > phase 2 and then xfs_repair happens to crash, do we risk clearing the > > > bit and thus allowing a potential mount before whatever requisite > > > metadata updates have been made? > > > > [Oh good, now mail.kernel.org is having problems...] > > > > Yes, though I think that falls into the realm of "sysadmins should be > > sufficiently self-aware not to expect mount to work after repair > > fails/system crashes during an upgrade". > > > > I've thought about how to solve the general problem of preventing people > > from mounting filesystems if repair doesn't run to completion. I think > > xfs_repair could be modified so that once it finds the primary super, it > > writes it back out with NEEDSREPAIR set (V5) or inprogress set (V4). > > Once we've finished the buffer cache flush at the end of repair, we > > clear needsrepair/inprogress and write the primary super again. > > > > That's kind of what I was thinking.. set a global flag if/when we come > across the bit set on disk and clear it as a final step. I think if I found it set on the primary sb I would leave it alone, and if the bit wasn't set then I'd set it and bwrite the buffer immediately. Probably we're talking about more or less the same thing... > > An optimization on that would be to find a way to avoid that first super > > write until we flush the first dirty buffer. > > > > Another way to make repair more "transactional" would be to do it would > > be to fiddle with the buffer manager so that writes are sent to a > > metadump file which could be mdrestore'd if repair completes > > successfully. But that's a short-circuit around the even bigger project > > of porting the kernel logging code to userspace and use that in repair. > > > > Yeah, I wouldn't want to have clearing a feature bit depend on such a > significant rework of core functionality. The bit is not going to be set > in most cases, so I'd suspect an extra superblock write at the end of > repair wouldn't be much of a problem. In fact, it looks like main() > already has an unconditional sb write before the final libxfs_unmount() > call. Perhaps we could just hitch onto that for the primary super > update (and continue to clear the secondaries earlier as the current > patch does)..? Clearing the bit has to happen after the bcache purge and disk flush, because we need to make sure that everything else we wrote actually made it to stable storage. Hm, I guess we could export libxfs_flush_mount so that repair could call that, clear the fields in the sb, and then go ahead with the libxfs_umount. --D > Brian > > > --D > > > > > Brian > > > > > > > repair/agheader.c | 15 +++++++++++++++ > > > > 1 file changed, 15 insertions(+) > > > > > > > > > > > > diff --git a/repair/agheader.c b/repair/agheader.c > > > > index 8bb99489..d9b72d3a 100644 > > > > --- a/repair/agheader.c > > > > +++ b/repair/agheader.c > > > > @@ -452,6 +452,21 @@ secondary_sb_whack( > > > > rval |= XR_AG_SB_SEC; > > > > } > > > > > > > > + if (xfs_sb_version_needsrepair(sb)) { > > > > + if (!no_modify) > > > > + sb->sb_features_incompat &= > > > > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > > > > + if (i == 0) { > > > > + if (!no_modify) > > > > + do_warn( > > > > + _("clearing needsrepair flag and regenerating metadata\n")); > > > > + else > > > > + do_warn( > > > > + _("would clear needsrepair flag and regenerate metadata\n")); > > > > + } > > > > + rval |= XR_AG_SB_SEC; > > > > + } > > > > + > > > > return(rval); > > > > } > > > > > > > > > > > > > >
On Tue, Jan 19, 2021 at 12:31:10PM -0800, Darrick J. Wong wrote: > On Tue, Jan 19, 2021 at 02:44:06PM -0500, Brian Foster wrote: > > On Tue, Jan 19, 2021 at 10:15:49AM -0800, Darrick J. Wong wrote: > > > On Tue, Jan 19, 2021 at 09:37:54AM -0500, Brian Foster wrote: > > > > On Fri, Jan 15, 2021 at 05:24:53PM -0800, Darrick J. Wong wrote: > > > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > > > > > Clear the needsrepair flag, since it's used to prevent mounting of an > > > > > inconsistent filesystem. > > > > > > > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > > > > --- > > > > > > > > Code/errors look much cleaner. Though looking at the repair code again, > > > > I wonder... if we clear the needsrepair bit and dirty/write the sb in > > > > phase 2 and then xfs_repair happens to crash, do we risk clearing the > > > > bit and thus allowing a potential mount before whatever requisite > > > > metadata updates have been made? > > > > > > [Oh good, now mail.kernel.org is having problems...] > > > > > > Yes, though I think that falls into the realm of "sysadmins should be > > > sufficiently self-aware not to expect mount to work after repair > > > fails/system crashes during an upgrade". > > > > > > I've thought about how to solve the general problem of preventing people > > > from mounting filesystems if repair doesn't run to completion. I think > > > xfs_repair could be modified so that once it finds the primary super, it > > > writes it back out with NEEDSREPAIR set (V5) or inprogress set (V4). > > > Once we've finished the buffer cache flush at the end of repair, we > > > clear needsrepair/inprogress and write the primary super again. > > > > > > > That's kind of what I was thinking.. set a global flag if/when we come > > across the bit set on disk and clear it as a final step. > > I think if I found it set on the primary sb I would leave it alone, and > if the bit wasn't set then I'd set it and bwrite the buffer immediately. > Probably we're talking about more or less the same thing... ...though after talking to Eric some more, I think I should change this patch to clear needsrepair at the end of repair, like you said. --D > > > An optimization on that would be to find a way to avoid that first super > > > write until we flush the first dirty buffer. > > > > > > Another way to make repair more "transactional" would be to do it would > > > be to fiddle with the buffer manager so that writes are sent to a > > > metadump file which could be mdrestore'd if repair completes > > > successfully. But that's a short-circuit around the even bigger project > > > of porting the kernel logging code to userspace and use that in repair. > > > > > > > Yeah, I wouldn't want to have clearing a feature bit depend on such a > > significant rework of core functionality. The bit is not going to be set > > in most cases, so I'd suspect an extra superblock write at the end of > > repair wouldn't be much of a problem. In fact, it looks like main() > > already has an unconditional sb write before the final libxfs_unmount() > > call. Perhaps we could just hitch onto that for the primary super > > update (and continue to clear the secondaries earlier as the current > > patch does)..? > > Clearing the bit has to happen after the bcache purge and disk flush, > because we need to make sure that everything else we wrote actually made > it to stable storage. > > Hm, I guess we could export libxfs_flush_mount so that repair could call > that, clear the fields in the sb, and then go ahead with the > libxfs_umount. > > --D > > > Brian > > > > > --D > > > > > > > Brian > > > > > > > > > repair/agheader.c | 15 +++++++++++++++ > > > > > 1 file changed, 15 insertions(+) > > > > > > > > > > > > > > > diff --git a/repair/agheader.c b/repair/agheader.c > > > > > index 8bb99489..d9b72d3a 100644 > > > > > --- a/repair/agheader.c > > > > > +++ b/repair/agheader.c > > > > > @@ -452,6 +452,21 @@ secondary_sb_whack( > > > > > rval |= XR_AG_SB_SEC; > > > > > } > > > > > > > > > > + if (xfs_sb_version_needsrepair(sb)) { > > > > > + if (!no_modify) > > > > > + sb->sb_features_incompat &= > > > > > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > > > > > + if (i == 0) { > > > > > + if (!no_modify) > > > > > + do_warn( > > > > > + _("clearing needsrepair flag and regenerating metadata\n")); > > > > > + else > > > > > + do_warn( > > > > > + _("would clear needsrepair flag and regenerate metadata\n")); > > > > > + } > > > > > + rval |= XR_AG_SB_SEC; > > > > > + } > > > > > + > > > > > return(rval); > > > > > } > > > > > > > > > > > > > > > > > > >
From: Darrick J. Wong <djwong@kernel.org> Clear the needsrepair flag, since it's used to prevent mounting of an inconsistent filesystem. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- v2.1: only remove needsrepair at the end of the xfs_repair run --- include/xfs_mount.h | 1 + libxfs/init.c | 2 +- repair/agheader.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++ repair/agheader.h | 2 ++ repair/xfs_repair.c | 4 +++- 5 files changed, 62 insertions(+), 2 deletions(-) diff --git a/include/xfs_mount.h b/include/xfs_mount.h index 36594643..75230ca5 100644 --- a/include/xfs_mount.h +++ b/include/xfs_mount.h @@ -181,6 +181,7 @@ xfs_perag_resv( extern xfs_mount_t *libxfs_mount (xfs_mount_t *, xfs_sb_t *, dev_t, dev_t, dev_t, int); +int libxfs_flush_mount(struct xfs_mount *mp); int libxfs_umount(struct xfs_mount *mp); extern void libxfs_rtmount_destroy (xfs_mount_t *); diff --git a/libxfs/init.c b/libxfs/init.c index 9fe13b8d..083060bf 100644 --- a/libxfs/init.c +++ b/libxfs/init.c @@ -870,7 +870,7 @@ _("%s: Flushing the %s failed, err=%d!\n"), * Flush all dirty buffers to stable storage and report on writes that didn't * make it to stable storage. */ -static int +int libxfs_flush_mount( struct xfs_mount *mp) { diff --git a/repair/agheader.c b/repair/agheader.c index 8bb99489..56a7f45c 100644 --- a/repair/agheader.c +++ b/repair/agheader.c @@ -220,6 +220,40 @@ compare_sb(xfs_mount_t *mp, xfs_sb_t *sb) return(XR_OK); } +/* Clear needsrepair after a successful repair run. */ +int +clear_needsrepair( + struct xfs_mount *mp) +{ + struct xfs_buf *bp; + int error; + + if (!xfs_sb_version_needsrepair(&mp->m_sb) || no_modify) + return 0; + + /* We must succeed at flushing all dirty buffers to disk. */ + error = -libxfs_flush_mount(mp); + if (error) + return error; + + /* Clear needsrepair from the superblock. */ + bp = libxfs_getsb(mp); + if (!bp) + return ENOMEM; + if (bp->b_error) { + error = bp->b_error; + libxfs_buf_relse(bp); + return -error; + } + + mp->m_sb.sb_features_incompat &= ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; + + libxfs_sb_to_disk(bp->b_addr, &mp->m_sb); + libxfs_buf_mark_dirty(bp); + libxfs_buf_relse(bp); + return 0; +} + /* * Possible fields that may have been set at mkfs time, * sb_inoalignmt, sb_unit, sb_width and sb_dirblklog. @@ -452,6 +486,27 @@ secondary_sb_whack( rval |= XR_AG_SB_SEC; } + if (xfs_sb_version_needsrepair(sb)) { + if (!no_modify) + sb->sb_features_incompat &= + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; + if (i == 0) { + if (!no_modify) + do_warn( + _("clearing needsrepair flag and regenerating metadata\n")); + else + do_warn( + _("would clear needsrepair flag and regenerate metadata\n")); + } else { + /* + * Quietly clear needsrepair on the secondary supers as + * part of ensuring them. If needsrepair is set on the + * primary, it will be done at the very end of repair. + */ + rval |= XR_AG_SB_SEC; + } + } + return(rval); } diff --git a/repair/agheader.h b/repair/agheader.h index a63827c8..552c1f70 100644 --- a/repair/agheader.h +++ b/repair/agheader.h @@ -82,3 +82,5 @@ typedef struct fs_geo_list { #define XR_AG_AGF 0x2 #define XR_AG_AGI 0x4 #define XR_AG_SB_SEC 0x8 + +int clear_needsrepair(struct xfs_mount *mp); diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c index 9409f0d8..d36c5a21 100644 --- a/repair/xfs_repair.c +++ b/repair/xfs_repair.c @@ -1133,7 +1133,9 @@ _("Note - stripe unit (%d) and width (%d) were copied from a backup superblock.\ format_log_max_lsn(mp); /* Report failure if anything failed to get written to our fs. */ - error = -libxfs_umount(mp); + error = clear_needsrepair(mp); + if (!error) + error = -libxfs_umount(mp); if (error) do_error( _("File system metadata writeout failed, err=%d. Re-run xfs_repair.\n"),
On Tue, Jan 19, 2021 at 08:31:28PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > Clear the needsrepair flag, since it's used to prevent mounting of an > inconsistent filesystem. > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > --- > v2.1: only remove needsrepair at the end of the xfs_repair run > --- > include/xfs_mount.h | 1 + > libxfs/init.c | 2 +- > repair/agheader.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++ > repair/agheader.h | 2 ++ > repair/xfs_repair.c | 4 +++- > 5 files changed, 62 insertions(+), 2 deletions(-) > ... > diff --git a/repair/agheader.c b/repair/agheader.c > index 8bb99489..56a7f45c 100644 > --- a/repair/agheader.c > +++ b/repair/agheader.c > @@ -220,6 +220,40 @@ compare_sb(xfs_mount_t *mp, xfs_sb_t *sb) > return(XR_OK); > } > > +/* Clear needsrepair after a successful repair run. */ > +int > +clear_needsrepair( > + struct xfs_mount *mp) > +{ > + struct xfs_buf *bp; > + int error; > + > + if (!xfs_sb_version_needsrepair(&mp->m_sb) || no_modify) > + return 0; > + > + /* We must succeed at flushing all dirty buffers to disk. */ > + error = -libxfs_flush_mount(mp); > + if (error) > + return error; > + Do we really need a new helper and buf get/relse cycle just to incorporate the above flush? ISTM we could either lift the libxfs_bcache_flush() call above the superblock update in the caller, insert the libxfs_flush_mount() right after that, and just do: dsb->sb_features_incompat &= ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; ... in the hunk that also updates the quota flags. Though perhaps cleaner would be to keep the helper, but genericize it a bit to something like final_sb_update() and fold in the qflags update and whatever flush/ordering is required for the feature bit. > + /* Clear needsrepair from the superblock. */ > + bp = libxfs_getsb(mp); > + if (!bp) > + return ENOMEM; > + if (bp->b_error) { > + error = bp->b_error; > + libxfs_buf_relse(bp); > + return -error; > + } > + > + mp->m_sb.sb_features_incompat &= ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > + > + libxfs_sb_to_disk(bp->b_addr, &mp->m_sb); > + libxfs_buf_mark_dirty(bp); > + libxfs_buf_relse(bp); > + return 0; > +} > + > /* > * Possible fields that may have been set at mkfs time, > * sb_inoalignmt, sb_unit, sb_width and sb_dirblklog. > @@ -452,6 +486,27 @@ secondary_sb_whack( > rval |= XR_AG_SB_SEC; > } > > + if (xfs_sb_version_needsrepair(sb)) { > + if (!no_modify) > + sb->sb_features_incompat &= > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; I suspect this should be folded into the check below so we don't modify the primary sb by accident (should some other check dirty it). Brian > + if (i == 0) { > + if (!no_modify) > + do_warn( > + _("clearing needsrepair flag and regenerating metadata\n")); > + else > + do_warn( > + _("would clear needsrepair flag and regenerate metadata\n")); > + } else { > + /* > + * Quietly clear needsrepair on the secondary supers as > + * part of ensuring them. If needsrepair is set on the > + * primary, it will be done at the very end of repair. > + */ > + rval |= XR_AG_SB_SEC; > + } > + } > + > return(rval); > } > > diff --git a/repair/agheader.h b/repair/agheader.h > index a63827c8..552c1f70 100644 > --- a/repair/agheader.h > +++ b/repair/agheader.h > @@ -82,3 +82,5 @@ typedef struct fs_geo_list { > #define XR_AG_AGF 0x2 > #define XR_AG_AGI 0x4 > #define XR_AG_SB_SEC 0x8 > + > +int clear_needsrepair(struct xfs_mount *mp); > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c > index 9409f0d8..d36c5a21 100644 > --- a/repair/xfs_repair.c > +++ b/repair/xfs_repair.c > @@ -1133,7 +1133,9 @@ _("Note - stripe unit (%d) and width (%d) were copied from a backup superblock.\ > format_log_max_lsn(mp); > > /* Report failure if anything failed to get written to our fs. */ > - error = -libxfs_umount(mp); > + error = clear_needsrepair(mp); > + if (!error) > + error = -libxfs_umount(mp); > if (error) > do_error( > _("File system metadata writeout failed, err=%d. Re-run xfs_repair.\n"), >
On Wed, Jan 20, 2021 at 12:38:20PM -0500, Brian Foster wrote: > On Tue, Jan 19, 2021 at 08:31:28PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@kernel.org> > > > > Clear the needsrepair flag, since it's used to prevent mounting of an > > inconsistent filesystem. > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > --- > > v2.1: only remove needsrepair at the end of the xfs_repair run > > --- > > include/xfs_mount.h | 1 + > > libxfs/init.c | 2 +- > > repair/agheader.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++ > > repair/agheader.h | 2 ++ > > repair/xfs_repair.c | 4 +++- > > 5 files changed, 62 insertions(+), 2 deletions(-) > > > ... > > diff --git a/repair/agheader.c b/repair/agheader.c > > index 8bb99489..56a7f45c 100644 > > --- a/repair/agheader.c > > +++ b/repair/agheader.c > > @@ -220,6 +220,40 @@ compare_sb(xfs_mount_t *mp, xfs_sb_t *sb) > > return(XR_OK); > > } > > > > +/* Clear needsrepair after a successful repair run. */ > > +int > > +clear_needsrepair( > > + struct xfs_mount *mp) > > +{ > > + struct xfs_buf *bp; > > + int error; > > + > > + if (!xfs_sb_version_needsrepair(&mp->m_sb) || no_modify) > > + return 0; > > + > > + /* We must succeed at flushing all dirty buffers to disk. */ > > + error = -libxfs_flush_mount(mp); > > + if (error) > > + return error; > > + > > Do we really need a new helper and buf get/relse cycle just to > incorporate the above flush? ISTM we could either lift the Yes. If quotacheck thinks the dquots are bad, we always want to try to clear the CHKD flags in the primary superblock, even if other disk writes fail due to IO errors or write verifiers failing, etc. Note that libxfs_bcache_flush only pwrite()s the dirty buffers to disk, it doesn't force the disks to persist to stable media. However, if NEEDSREPAIR was set and /any/ part of persisting the dirty buffers fails (write verifier trips, media errors, etc.) then we don't even want to try to clear NEEDSREPAIR. Because of that requirement, the two writes have to be separate steps, separated by a big flush. --D > libxfs_bcache_flush() call above the superblock update in the caller, > insert the libxfs_flush_mount() right after that, and just do: > > dsb->sb_features_incompat &= ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > > ... in the hunk that also updates the quota flags. > > Though perhaps cleaner would be to keep the helper, but genericize it a > bit to something like final_sb_update() and fold in the qflags update > and whatever flush/ordering is required for the feature bit. > > > + /* Clear needsrepair from the superblock. */ > > + bp = libxfs_getsb(mp); > > + if (!bp) > > + return ENOMEM; > > + if (bp->b_error) { > > + error = bp->b_error; > > + libxfs_buf_relse(bp); > > + return -error; > > + } > > + > > + mp->m_sb.sb_features_incompat &= ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > > + > > + libxfs_sb_to_disk(bp->b_addr, &mp->m_sb); > > + libxfs_buf_mark_dirty(bp); > > + libxfs_buf_relse(bp); > > + return 0; > > +} > > + > > /* > > * Possible fields that may have been set at mkfs time, > > * sb_inoalignmt, sb_unit, sb_width and sb_dirblklog. > > @@ -452,6 +486,27 @@ secondary_sb_whack( > > rval |= XR_AG_SB_SEC; > > } > > > > + if (xfs_sb_version_needsrepair(sb)) { > > + if (!no_modify) > > + sb->sb_features_incompat &= > > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > > I suspect this should be folded into the check below so we don't modify > the primary sb by accident (should some other check dirty it). Yup. Fixed. --D > > Brian > > > + if (i == 0) { > > + if (!no_modify) > > + do_warn( > > + _("clearing needsrepair flag and regenerating metadata\n")); > > + else > > + do_warn( > > + _("would clear needsrepair flag and regenerate metadata\n")); > > + } else { > > + /* > > + * Quietly clear needsrepair on the secondary supers as > > + * part of ensuring them. If needsrepair is set on the > > + * primary, it will be done at the very end of repair. > > + */ > > + rval |= XR_AG_SB_SEC; > > + } > > + } > > + > > return(rval); > > } > > > > diff --git a/repair/agheader.h b/repair/agheader.h > > index a63827c8..552c1f70 100644 > > --- a/repair/agheader.h > > +++ b/repair/agheader.h > > @@ -82,3 +82,5 @@ typedef struct fs_geo_list { > > #define XR_AG_AGF 0x2 > > #define XR_AG_AGI 0x4 > > #define XR_AG_SB_SEC 0x8 > > + > > +int clear_needsrepair(struct xfs_mount *mp); > > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c > > index 9409f0d8..d36c5a21 100644 > > --- a/repair/xfs_repair.c > > +++ b/repair/xfs_repair.c > > @@ -1133,7 +1133,9 @@ _("Note - stripe unit (%d) and width (%d) were copied from a backup superblock.\ > > format_log_max_lsn(mp); > > > > /* Report failure if anything failed to get written to our fs. */ > > - error = -libxfs_umount(mp); > > + error = clear_needsrepair(mp); > > + if (!error) > > + error = -libxfs_umount(mp); > > if (error) > > do_error( > > _("File system metadata writeout failed, err=%d. Re-run xfs_repair.\n"), > > >
diff --git a/repair/agheader.c b/repair/agheader.c index 8bb99489..d9b72d3a 100644 --- a/repair/agheader.c +++ b/repair/agheader.c @@ -452,6 +452,21 @@ secondary_sb_whack( rval |= XR_AG_SB_SEC; } + if (xfs_sb_version_needsrepair(sb)) { + if (!no_modify) + sb->sb_features_incompat &= + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; + if (i == 0) { + if (!no_modify) + do_warn( + _("clearing needsrepair flag and regenerating metadata\n")); + else + do_warn( + _("would clear needsrepair flag and regenerate metadata\n")); + } + rval |= XR_AG_SB_SEC; + } + return(rval); }