diff mbox series

[06/10] xfs_repair: clear the needsrepair flag

Message ID 161284383828.3057868.1762356472271947821.stgit@magnolia (mailing list archive)
State Superseded, archived
Headers show
Series xfs: add the ability to flag a fs for repair | expand

Commit Message

Darrick J. Wong Feb. 9, 2021, 4:10 a.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

Clear the needsrepair flag, since it's used to prevent mounting of an
inconsistent filesystem.  We only do this if we make it to the end of
repair with a non-zero error code, and all the rebuilt indices and
corrected metadata are persisted correctly.

Note that we cannot combine clearing needsrepair with clearing the quota
checked flags because we need to clear the quota flags even if
reformatting the log fails, whereas we can't clear needsrepair if the
log reformat fails.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 include/xfs_mount.h |    1 +
 libxfs/init.c       |   25 +++++++++++++------------
 repair/agheader.c   |   21 +++++++++++++++++++++
 repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 80 insertions(+), 12 deletions(-)

Comments

Christoph Hellwig Feb. 9, 2021, 9:12 a.m. UTC | #1
On Mon, Feb 08, 2021 at 08:10:38PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Clear the needsrepair flag, since it's used to prevent mounting of an
> inconsistent filesystem.  We only do this if we make it to the end of
> repair with a non-zero error code, and all the rebuilt indices and
> corrected metadata are persisted correctly.
> 
> Note that we cannot combine clearing needsrepair with clearing the quota
> checked flags because we need to clear the quota flags even if
> reformatting the log fails, whereas we can't clear needsrepair if the
> log reformat fails.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Brian Foster Feb. 9, 2021, 5:20 p.m. UTC | #2
On Mon, Feb 08, 2021 at 08:10:38PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Clear the needsrepair flag, since it's used to prevent mounting of an
> inconsistent filesystem.  We only do this if we make it to the end of
> repair with a non-zero error code, and all the rebuilt indices and
> corrected metadata are persisted correctly.
> 
> Note that we cannot combine clearing needsrepair with clearing the quota
> checked flags because we need to clear the quota flags even if
> reformatting the log fails, whereas we can't clear needsrepair if the
> log reformat fails.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  include/xfs_mount.h |    1 +
>  libxfs/init.c       |   25 +++++++++++++------------
>  repair/agheader.c   |   21 +++++++++++++++++++++
>  repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 80 insertions(+), 12 deletions(-)
> 
> 
...
> diff --git a/libxfs/init.c b/libxfs/init.c
> index 9fe13b8d..98057b78 100644
> --- a/libxfs/init.c
> +++ b/libxfs/init.c
> @@ -867,25 +867,17 @@ _("%s: Flushing the %s failed, err=%d!\n"),
>  }
>  
>  /*
> - * Flush all dirty buffers to stable storage and report on writes that didn't
> - * make it to stable storage.
> + * Persist all disk write caches and report on writes that didn't make it to
> + * stable storage.  Callers should flush (or purge) the libxfs buffer caches
> + * before calling this function.
>   */
> -static int
> +int
>  libxfs_flush_mount(
>  	struct xfs_mount	*mp)
>  {
>  	int			error = 0;
>  	int			err2;
>  
> -	/*
> -	 * Purge the buffer cache to write all dirty buffers to disk and free
> -	 * all incore buffers.  Buffers that fail write verification will cause
> -	 * the CORRUPT_WRITE flag to be set in the buftarg.  Buffers that
> -	 * cannot be written will cause the LOST_WRITE flag to be set in the
> -	 * buftarg.
> -	 */
> -	libxfs_bcache_purge();
> -

FWIW, my comment on the previous version was that I think it's
reasonable to call libxfs_bcache_flush() here instead of the purge so
callers don't necessarily have to do anything special. The one caller
that does the purge is free to do so before calling
libxfs_flush_mount(), as that essentially supercedes the flush that
would otherwise occur here. Either way, the patch looks fine to me:

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  	/* Flush all kernel and disk write caches, and report failures. */
>  	if (mp->m_ddev_targp) {
>  		err2 = libxfs_flush_buftarg(mp->m_ddev_targp, _("data device"));
> @@ -923,6 +915,15 @@ libxfs_umount(
>  
>  	libxfs_rtmount_destroy(mp);
>  
> +	/*
> +	 * Purge the buffer cache to write all dirty buffers to disk and free
> +	 * all incore buffers.  Buffers that fail write verification will cause
> +	 * the CORRUPT_WRITE flag to be set in the buftarg.  Buffers that
> +	 * cannot be written will cause the LOST_WRITE flag to be set in the
> +	 * buftarg.  Once that's done, instruct the disks to persist their
> +	 * write caches.
> +	 */
> +	libxfs_bcache_purge();
>  	error = libxfs_flush_mount(mp);
>  
>  	for (agno = 0; agno < mp->m_maxagi; agno++) {
> diff --git a/repair/agheader.c b/repair/agheader.c
> index 8bb99489..2af24106 100644
> --- a/repair/agheader.c
> +++ b/repair/agheader.c
> @@ -452,6 +452,27 @@ secondary_sb_whack(
>  			rval |= XR_AG_SB_SEC;
>  	}
>  
> +	if (xfs_sb_version_needsrepair(sb)) {
> +		if (i == 0) {
> +			if (!no_modify)
> +				do_warn(
> +	_("clearing needsrepair flag and regenerating metadata\n"));
> +			else
> +				do_warn(
> +	_("would clear needsrepair flag and regenerate metadata\n"));
> +		} else {
> +			/*
> +			 * Quietly clear needsrepair on the secondary supers as
> +			 * part of ensuring them.  If needsrepair is set on the
> +			 * primary, it will be cleared at the end of repair
> +			 * once we've flushed all other dirty blocks to disk.
> +			 */
> +			sb->sb_features_incompat &=
> +					~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
> +			rval |= XR_AG_SB_SEC;
> +		}
> +	}
> +
>  	return(rval);
>  }
>  
> diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> index 32755821..f607afcb 100644
> --- a/repair/xfs_repair.c
> +++ b/repair/xfs_repair.c
> @@ -712,6 +712,48 @@ check_fs_vs_host_sectsize(
>  	}
>  }
>  
> +/* Clear needsrepair after a successful repair run. */
> +void
> +clear_needsrepair(
> +	struct xfs_mount	*mp)
> +{
> +	struct xfs_buf		*bp;
> +	int			error;
> +
> +	/*
> +	 * If we're going to clear NEEDSREPAIR, we need to make absolutely sure
> +	 * that everything is ok with the ondisk filesystem.  At this point
> +	 * we've flushed the filesystem metadata out of the buffer cache and
> +	 * possibly rewrote the log, but we haven't forced the disks to persist
> +	 * the writes to stable storage.  Do that now, and if anything goes
> +	 * wrong, leave NEEDSREPAIR in place.  Don't purge the buffer cache
> +	 * here since we're not done yet.
> +	 */
> +	libxfs_bcache_flush();
> +	error = -libxfs_flush_mount(mp);
> +	if (error) {
> +		do_warn(
> +	_("Cannot clear needsrepair due to flush failure, err=%d.\n"),
> +			error);
> +		return;
> +	}
> +
> +	/* Clear needsrepair from the superblock. */
> +	bp = libxfs_getsb(mp);
> +	if (!bp || bp->b_error) {
> +		do_warn(
> +	_("Cannot clear needsrepair from primary super, err=%d.\n"),
> +			bp ? bp->b_error : ENOMEM);
> +	} else {
> +		mp->m_sb.sb_features_incompat &=
> +				~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
> +		libxfs_sb_to_disk(bp->b_addr, &mp->m_sb);
> +		libxfs_buf_mark_dirty(bp);
> +	}
> +	if (bp)
> +		libxfs_buf_relse(bp);
> +}
> +
>  int
>  main(int argc, char **argv)
>  {
> @@ -1131,6 +1173,9 @@ _("Note - stripe unit (%d) and width (%d) were copied from a backup superblock.\
>  	libxfs_bcache_flush();
>  	format_log_max_lsn(mp);
>  
> +	if (xfs_sb_version_needsrepair(&mp->m_sb))
> +		clear_needsrepair(mp);
> +
>  	/* Report failure if anything failed to get written to our fs. */
>  	error = -libxfs_umount(mp);
>  	if (error)
>
Darrick J. Wong Feb. 9, 2021, 6:01 p.m. UTC | #3
On Tue, Feb 09, 2021 at 12:20:29PM -0500, Brian Foster wrote:
> On Mon, Feb 08, 2021 at 08:10:38PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Clear the needsrepair flag, since it's used to prevent mounting of an
> > inconsistent filesystem.  We only do this if we make it to the end of
> > repair with a non-zero error code, and all the rebuilt indices and
> > corrected metadata are persisted correctly.
> > 
> > Note that we cannot combine clearing needsrepair with clearing the quota
> > checked flags because we need to clear the quota flags even if
> > reformatting the log fails, whereas we can't clear needsrepair if the
> > log reformat fails.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  include/xfs_mount.h |    1 +
> >  libxfs/init.c       |   25 +++++++++++++------------
> >  repair/agheader.c   |   21 +++++++++++++++++++++
> >  repair/xfs_repair.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 80 insertions(+), 12 deletions(-)
> > 
> > 
> ...
> > diff --git a/libxfs/init.c b/libxfs/init.c
> > index 9fe13b8d..98057b78 100644
> > --- a/libxfs/init.c
> > +++ b/libxfs/init.c
> > @@ -867,25 +867,17 @@ _("%s: Flushing the %s failed, err=%d!\n"),
> >  }
> >  
> >  /*
> > - * Flush all dirty buffers to stable storage and report on writes that didn't
> > - * make it to stable storage.
> > + * Persist all disk write caches and report on writes that didn't make it to
> > + * stable storage.  Callers should flush (or purge) the libxfs buffer caches
> > + * before calling this function.
> >   */
> > -static int
> > +int
> >  libxfs_flush_mount(
> >  	struct xfs_mount	*mp)
> >  {
> >  	int			error = 0;
> >  	int			err2;
> >  
> > -	/*
> > -	 * Purge the buffer cache to write all dirty buffers to disk and free
> > -	 * all incore buffers.  Buffers that fail write verification will cause
> > -	 * the CORRUPT_WRITE flag to be set in the buftarg.  Buffers that
> > -	 * cannot be written will cause the LOST_WRITE flag to be set in the
> > -	 * buftarg.
> > -	 */
> > -	libxfs_bcache_purge();
> > -
> 
> FWIW, my comment on the previous version was that I think it's
> reasonable to call libxfs_bcache_flush() here instead of the purge so
> callers don't necessarily have to do anything special. The one caller
> that does the purge is free to do so before calling
> libxfs_flush_mount(), as that essentially supercedes the flush that
> would otherwise occur here. Either way, the patch looks fine to me:

Ah, yeah.  That's a pretty simple change to make.  Thanks for the
review.

--D

> Reviewed-by: Brian Foster <bfoster@redhat.com>
> 
> >  	/* Flush all kernel and disk write caches, and report failures. */
> >  	if (mp->m_ddev_targp) {
> >  		err2 = libxfs_flush_buftarg(mp->m_ddev_targp, _("data device"));
> > @@ -923,6 +915,15 @@ libxfs_umount(
> >  
> >  	libxfs_rtmount_destroy(mp);
> >  
> > +	/*
> > +	 * Purge the buffer cache to write all dirty buffers to disk and free
> > +	 * all incore buffers.  Buffers that fail write verification will cause
> > +	 * the CORRUPT_WRITE flag to be set in the buftarg.  Buffers that
> > +	 * cannot be written will cause the LOST_WRITE flag to be set in the
> > +	 * buftarg.  Once that's done, instruct the disks to persist their
> > +	 * write caches.
> > +	 */
> > +	libxfs_bcache_purge();
> >  	error = libxfs_flush_mount(mp);
> >  
> >  	for (agno = 0; agno < mp->m_maxagi; agno++) {
> > diff --git a/repair/agheader.c b/repair/agheader.c
> > index 8bb99489..2af24106 100644
> > --- a/repair/agheader.c
> > +++ b/repair/agheader.c
> > @@ -452,6 +452,27 @@ secondary_sb_whack(
> >  			rval |= XR_AG_SB_SEC;
> >  	}
> >  
> > +	if (xfs_sb_version_needsrepair(sb)) {
> > +		if (i == 0) {
> > +			if (!no_modify)
> > +				do_warn(
> > +	_("clearing needsrepair flag and regenerating metadata\n"));
> > +			else
> > +				do_warn(
> > +	_("would clear needsrepair flag and regenerate metadata\n"));
> > +		} else {
> > +			/*
> > +			 * Quietly clear needsrepair on the secondary supers as
> > +			 * part of ensuring them.  If needsrepair is set on the
> > +			 * primary, it will be cleared at the end of repair
> > +			 * once we've flushed all other dirty blocks to disk.
> > +			 */
> > +			sb->sb_features_incompat &=
> > +					~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
> > +			rval |= XR_AG_SB_SEC;
> > +		}
> > +	}
> > +
> >  	return(rval);
> >  }
> >  
> > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> > index 32755821..f607afcb 100644
> > --- a/repair/xfs_repair.c
> > +++ b/repair/xfs_repair.c
> > @@ -712,6 +712,48 @@ check_fs_vs_host_sectsize(
> >  	}
> >  }
> >  
> > +/* Clear needsrepair after a successful repair run. */
> > +void
> > +clear_needsrepair(
> > +	struct xfs_mount	*mp)
> > +{
> > +	struct xfs_buf		*bp;
> > +	int			error;
> > +
> > +	/*
> > +	 * If we're going to clear NEEDSREPAIR, we need to make absolutely sure
> > +	 * that everything is ok with the ondisk filesystem.  At this point
> > +	 * we've flushed the filesystem metadata out of the buffer cache and
> > +	 * possibly rewrote the log, but we haven't forced the disks to persist
> > +	 * the writes to stable storage.  Do that now, and if anything goes
> > +	 * wrong, leave NEEDSREPAIR in place.  Don't purge the buffer cache
> > +	 * here since we're not done yet.
> > +	 */
> > +	libxfs_bcache_flush();
> > +	error = -libxfs_flush_mount(mp);
> > +	if (error) {
> > +		do_warn(
> > +	_("Cannot clear needsrepair due to flush failure, err=%d.\n"),
> > +			error);
> > +		return;
> > +	}
> > +
> > +	/* Clear needsrepair from the superblock. */
> > +	bp = libxfs_getsb(mp);
> > +	if (!bp || bp->b_error) {
> > +		do_warn(
> > +	_("Cannot clear needsrepair from primary super, err=%d.\n"),
> > +			bp ? bp->b_error : ENOMEM);
> > +	} else {
> > +		mp->m_sb.sb_features_incompat &=
> > +				~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
> > +		libxfs_sb_to_disk(bp->b_addr, &mp->m_sb);
> > +		libxfs_buf_mark_dirty(bp);
> > +	}
> > +	if (bp)
> > +		libxfs_buf_relse(bp);
> > +}
> > +
> >  int
> >  main(int argc, char **argv)
> >  {
> > @@ -1131,6 +1173,9 @@ _("Note - stripe unit (%d) and width (%d) were copied from a backup superblock.\
> >  	libxfs_bcache_flush();
> >  	format_log_max_lsn(mp);
> >  
> > +	if (xfs_sb_version_needsrepair(&mp->m_sb))
> > +		clear_needsrepair(mp);
> > +
> >  	/* Report failure if anything failed to get written to our fs. */
> >  	error = -libxfs_umount(mp);
> >  	if (error)
> > 
>
diff mbox series

Patch

diff --git a/include/xfs_mount.h b/include/xfs_mount.h
index 36594643..75230ca5 100644
--- a/include/xfs_mount.h
+++ b/include/xfs_mount.h
@@ -181,6 +181,7 @@  xfs_perag_resv(
 
 extern xfs_mount_t	*libxfs_mount (xfs_mount_t *, xfs_sb_t *,
 				dev_t, dev_t, dev_t, int);
+int libxfs_flush_mount(struct xfs_mount *mp);
 int		libxfs_umount(struct xfs_mount *mp);
 extern void	libxfs_rtmount_destroy (xfs_mount_t *);
 
diff --git a/libxfs/init.c b/libxfs/init.c
index 9fe13b8d..98057b78 100644
--- a/libxfs/init.c
+++ b/libxfs/init.c
@@ -867,25 +867,17 @@  _("%s: Flushing the %s failed, err=%d!\n"),
 }
 
 /*
- * Flush all dirty buffers to stable storage and report on writes that didn't
- * make it to stable storage.
+ * Persist all disk write caches and report on writes that didn't make it to
+ * stable storage.  Callers should flush (or purge) the libxfs buffer caches
+ * before calling this function.
  */
-static int
+int
 libxfs_flush_mount(
 	struct xfs_mount	*mp)
 {
 	int			error = 0;
 	int			err2;
 
-	/*
-	 * Purge the buffer cache to write all dirty buffers to disk and free
-	 * all incore buffers.  Buffers that fail write verification will cause
-	 * the CORRUPT_WRITE flag to be set in the buftarg.  Buffers that
-	 * cannot be written will cause the LOST_WRITE flag to be set in the
-	 * buftarg.
-	 */
-	libxfs_bcache_purge();
-
 	/* Flush all kernel and disk write caches, and report failures. */
 	if (mp->m_ddev_targp) {
 		err2 = libxfs_flush_buftarg(mp->m_ddev_targp, _("data device"));
@@ -923,6 +915,15 @@  libxfs_umount(
 
 	libxfs_rtmount_destroy(mp);
 
+	/*
+	 * Purge the buffer cache to write all dirty buffers to disk and free
+	 * all incore buffers.  Buffers that fail write verification will cause
+	 * the CORRUPT_WRITE flag to be set in the buftarg.  Buffers that
+	 * cannot be written will cause the LOST_WRITE flag to be set in the
+	 * buftarg.  Once that's done, instruct the disks to persist their
+	 * write caches.
+	 */
+	libxfs_bcache_purge();
 	error = libxfs_flush_mount(mp);
 
 	for (agno = 0; agno < mp->m_maxagi; agno++) {
diff --git a/repair/agheader.c b/repair/agheader.c
index 8bb99489..2af24106 100644
--- a/repair/agheader.c
+++ b/repair/agheader.c
@@ -452,6 +452,27 @@  secondary_sb_whack(
 			rval |= XR_AG_SB_SEC;
 	}
 
+	if (xfs_sb_version_needsrepair(sb)) {
+		if (i == 0) {
+			if (!no_modify)
+				do_warn(
+	_("clearing needsrepair flag and regenerating metadata\n"));
+			else
+				do_warn(
+	_("would clear needsrepair flag and regenerate metadata\n"));
+		} else {
+			/*
+			 * Quietly clear needsrepair on the secondary supers as
+			 * part of ensuring them.  If needsrepair is set on the
+			 * primary, it will be cleared at the end of repair
+			 * once we've flushed all other dirty blocks to disk.
+			 */
+			sb->sb_features_incompat &=
+					~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
+			rval |= XR_AG_SB_SEC;
+		}
+	}
+
 	return(rval);
 }
 
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index 32755821..f607afcb 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -712,6 +712,48 @@  check_fs_vs_host_sectsize(
 	}
 }
 
+/* Clear needsrepair after a successful repair run. */
+void
+clear_needsrepair(
+	struct xfs_mount	*mp)
+{
+	struct xfs_buf		*bp;
+	int			error;
+
+	/*
+	 * If we're going to clear NEEDSREPAIR, we need to make absolutely sure
+	 * that everything is ok with the ondisk filesystem.  At this point
+	 * we've flushed the filesystem metadata out of the buffer cache and
+	 * possibly rewrote the log, but we haven't forced the disks to persist
+	 * the writes to stable storage.  Do that now, and if anything goes
+	 * wrong, leave NEEDSREPAIR in place.  Don't purge the buffer cache
+	 * here since we're not done yet.
+	 */
+	libxfs_bcache_flush();
+	error = -libxfs_flush_mount(mp);
+	if (error) {
+		do_warn(
+	_("Cannot clear needsrepair due to flush failure, err=%d.\n"),
+			error);
+		return;
+	}
+
+	/* Clear needsrepair from the superblock. */
+	bp = libxfs_getsb(mp);
+	if (!bp || bp->b_error) {
+		do_warn(
+	_("Cannot clear needsrepair from primary super, err=%d.\n"),
+			bp ? bp->b_error : ENOMEM);
+	} else {
+		mp->m_sb.sb_features_incompat &=
+				~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
+		libxfs_sb_to_disk(bp->b_addr, &mp->m_sb);
+		libxfs_buf_mark_dirty(bp);
+	}
+	if (bp)
+		libxfs_buf_relse(bp);
+}
+
 int
 main(int argc, char **argv)
 {
@@ -1131,6 +1173,9 @@  _("Note - stripe unit (%d) and width (%d) were copied from a backup superblock.\
 	libxfs_bcache_flush();
 	format_log_max_lsn(mp);
 
+	if (xfs_sb_version_needsrepair(&mp->m_sb))
+		clear_needsrepair(mp);
+
 	/* Report failure if anything failed to get written to our fs. */
 	error = -libxfs_umount(mp);
 	if (error)