diff mbox series

[5/9] xfs: don't reset log idle state on covering checkpoints

Message ID 20210106174127.805660-6-bfoster@redhat.com (mailing list archive)
State Superseded
Headers show
Series xfs: rework log quiesce to cover the log | expand

Commit Message

Brian Foster Jan. 6, 2021, 5:41 p.m. UTC
Now that log covering occurs on quiesce, we'd like to reuse the
underlying superblock sync for final superblock updates. This
includes things like lazy superblock counter updates, log feature
incompat bits in the future, etc. One quirk to this approach is that
once the log is in the IDLE (i.e. already covered) state, any
subsequent log write resets the state back to NEED. This means that
a final superblock sync to an already covered log requires two more
sb syncs to return the log back to IDLE again.

For example, if a lazy superblock enabled filesystem is mount cycled
without any modifications, the unmount path syncs the superblock
once and writes an unmount record. With the desired log quiesce
covering behavior, we sync the superblock three times at unmount
time: once for the lazy superblock counter update and twice more to
cover the log. By contrast, if the log is active or only partially
covered at unmount time, a final superblock sync would doubly serve
as the one or two remaining syncs required to cover the log.

This duplicate covering sequence is unnecessary because the
filesystem remains consistent if a crash occurs at any point. The
superblock will either be recovered in the event of a crash or
written back before the log is quiesced and potentially cleaned with
an unmount record.

Update the log covering state machine to remain in the IDLE state if
additional covering checkpoints pass through the log. This
facilitates final superblock updates (such as lazy superblock
counters) via a single sb sync without losing covered status. This
provides some consistency with the active and partially covered
cases and also avoids harmless, but spurious checkpoints when
quiescing the log.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Darrick J. Wong Jan. 7, 2021, 7:30 p.m. UTC | #1
On Wed, Jan 06, 2021 at 12:41:23PM -0500, Brian Foster wrote:
> Now that log covering occurs on quiesce, we'd like to reuse the
> underlying superblock sync for final superblock updates. This
> includes things like lazy superblock counter updates, log feature
> incompat bits in the future, etc. One quirk to this approach is that
> once the log is in the IDLE (i.e. already covered) state, any
> subsequent log write resets the state back to NEED. This means that
> a final superblock sync to an already covered log requires two more
> sb syncs to return the log back to IDLE again.
> 
> For example, if a lazy superblock enabled filesystem is mount cycled
> without any modifications, the unmount path syncs the superblock
> once and writes an unmount record. With the desired log quiesce
> covering behavior, we sync the superblock three times at unmount
> time: once for the lazy superblock counter update and twice more to
> cover the log. By contrast, if the log is active or only partially
> covered at unmount time, a final superblock sync would doubly serve
> as the one or two remaining syncs required to cover the log.
> 
> This duplicate covering sequence is unnecessary because the
> filesystem remains consistent if a crash occurs at any point. The
> superblock will either be recovered in the event of a crash or
> written back before the log is quiesced and potentially cleaned with
> an unmount record.
> 
> Update the log covering state machine to remain in the IDLE state if
> additional covering checkpoints pass through the log. This
> facilitates final superblock updates (such as lazy superblock
> counters) via a single sb sync without losing covered status. This
> provides some consistency with the active and partially covered
> cases and also avoids harmless, but spurious checkpoints when
> quiescing the log.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

I /think/ the premise of this is ok.

I found myself wondering if xlog_state_activate_iclog could mistake an
iclog containing 5 log ops from a real update for an iclog containing
just the dummy transaction, since a synchronous inode mtime update
transaction can also produce an iclog with 5 ops.  I /think/ that
doesn't matter because xlog_covered_state only cares about the value of
iclogs_changed if the log worker previously set the log state to DONE or
DONE2, and iclogs_changed won't be 1 here if there were multiple dirty
iclogs or if the sole dirty iclog contains more than just the dummy.

If I got that right,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_log.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index f7b23044723d..9b8564f280b7 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -2597,12 +2597,15 @@ xlog_covered_state(
>  	int			iclogs_changed)
>  {
>  	/*
> -	 * We usually go to NEED. But we go to NEED2 if the changed indicates we
> -	 * are done writing the dummy record.  If we are done with the second
> -	 * dummy recored (DONE2), then we go to IDLE.
> +	 * We go to NEED for any non-covering writes. We go to NEED2 if we just
> +	 * wrote the first covering record (DONE). We go to IDLE if we just
> +	 * wrote the second covering record (DONE2) and remain in IDLE until a
> +	 * non-covering write occurs.
>  	 */
>  	switch (prev_state) {
>  	case XLOG_STATE_COVER_IDLE:
> +		if (iclogs_changed == 1)
> +			return XLOG_STATE_COVER_IDLE;
>  	case XLOG_STATE_COVER_NEED:
>  	case XLOG_STATE_COVER_NEED2:
>  		break;
> -- 
> 2.26.2
>
Brian Foster Jan. 7, 2021, 8:01 p.m. UTC | #2
On Thu, Jan 07, 2021 at 11:30:50AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 06, 2021 at 12:41:23PM -0500, Brian Foster wrote:
> > Now that log covering occurs on quiesce, we'd like to reuse the
> > underlying superblock sync for final superblock updates. This
> > includes things like lazy superblock counter updates, log feature
> > incompat bits in the future, etc. One quirk to this approach is that
> > once the log is in the IDLE (i.e. already covered) state, any
> > subsequent log write resets the state back to NEED. This means that
> > a final superblock sync to an already covered log requires two more
> > sb syncs to return the log back to IDLE again.
> > 
> > For example, if a lazy superblock enabled filesystem is mount cycled
> > without any modifications, the unmount path syncs the superblock
> > once and writes an unmount record. With the desired log quiesce
> > covering behavior, we sync the superblock three times at unmount
> > time: once for the lazy superblock counter update and twice more to
> > cover the log. By contrast, if the log is active or only partially
> > covered at unmount time, a final superblock sync would doubly serve
> > as the one or two remaining syncs required to cover the log.
> > 
> > This duplicate covering sequence is unnecessary because the
> > filesystem remains consistent if a crash occurs at any point. The
> > superblock will either be recovered in the event of a crash or
> > written back before the log is quiesced and potentially cleaned with
> > an unmount record.
> > 
> > Update the log covering state machine to remain in the IDLE state if
> > additional covering checkpoints pass through the log. This
> > facilitates final superblock updates (such as lazy superblock
> > counters) via a single sb sync without losing covered status. This
> > provides some consistency with the active and partially covered
> > cases and also avoids harmless, but spurious checkpoints when
> > quiescing the log.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> 
> I /think/ the premise of this is ok.
> 
> I found myself wondering if xlog_state_activate_iclog could mistake an
> iclog containing 5 log ops from a real update for an iclog containing
> just the dummy transaction, since a synchronous inode mtime update
> transaction can also produce an iclog with 5 ops.  I /think/ that
> doesn't matter because xlog_covered_state only cares about the value of
> iclogs_changed if the log worker previously set the log state to DONE or
> DONE2, and iclogs_changed won't be 1 here if there were multiple dirty
> iclogs or if the sole dirty iclog contains more than just the dummy.
> 

That is my understanding. I had similar questions when first passing
through the covering code and realizing the detection of the covering
commit was sort of implicit in and of itself (hence the additional state
checking and iclogs_changed logic). I think the bigger picture with
regard to this series was that it doesn't really matter because nothing
else is happening during quiesce, but I might revisit that from the
background covering perspective now that I have a better handle on how
the covering mechanism works...

Brian

> If I got that right,
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> --D
> 
> > ---
> >  fs/xfs/xfs_log.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index f7b23044723d..9b8564f280b7 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -2597,12 +2597,15 @@ xlog_covered_state(
> >  	int			iclogs_changed)
> >  {
> >  	/*
> > -	 * We usually go to NEED. But we go to NEED2 if the changed indicates we
> > -	 * are done writing the dummy record.  If we are done with the second
> > -	 * dummy recored (DONE2), then we go to IDLE.
> > +	 * We go to NEED for any non-covering writes. We go to NEED2 if we just
> > +	 * wrote the first covering record (DONE). We go to IDLE if we just
> > +	 * wrote the second covering record (DONE2) and remain in IDLE until a
> > +	 * non-covering write occurs.
> >  	 */
> >  	switch (prev_state) {
> >  	case XLOG_STATE_COVER_IDLE:
> > +		if (iclogs_changed == 1)
> > +			return XLOG_STATE_COVER_IDLE;
> >  	case XLOG_STATE_COVER_NEED:
> >  	case XLOG_STATE_COVER_NEED2:
> >  		break;
> > -- 
> > 2.26.2
> > 
>
diff mbox series

Patch

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index f7b23044723d..9b8564f280b7 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -2597,12 +2597,15 @@  xlog_covered_state(
 	int			iclogs_changed)
 {
 	/*
-	 * We usually go to NEED. But we go to NEED2 if the changed indicates we
-	 * are done writing the dummy record.  If we are done with the second
-	 * dummy recored (DONE2), then we go to IDLE.
+	 * We go to NEED for any non-covering writes. We go to NEED2 if we just
+	 * wrote the first covering record (DONE). We go to IDLE if we just
+	 * wrote the second covering record (DONE2) and remain in IDLE until a
+	 * non-covering write occurs.
 	 */
 	switch (prev_state) {
 	case XLOG_STATE_COVER_IDLE:
+		if (iclogs_changed == 1)
+			return XLOG_STATE_COVER_IDLE;
 	case XLOG_STATE_COVER_NEED:
 	case XLOG_STATE_COVER_NEED2:
 		break;