diff mbox series

[4/4] xfs: only relog deferred intent items if free space in the log gets low

Message ID 160140147498.831337.5344692693382121188.stgit@magnolia
State Superseded
Headers show
Series xfs: fix some log stalling problems in defer ops | expand

Commit Message

Darrick J. Wong Sept. 29, 2020, 5:44 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

Now that we have the ability to ask the log how far the tail needs to be
pushed to maintain its free space targets, augment the decision to relog
an intent item so that we only do it if the log has hit the 75% full
threshold.  There's no point in relogging an intent into the same
checkpoint, and there's no need to relog if there's plenty of free space
in the log.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_defer.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Brian Foster Oct. 1, 2020, 4:02 p.m. UTC | #1
On Tue, Sep 29, 2020 at 10:44:35AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Now that we have the ability to ask the log how far the tail needs to be
> pushed to maintain its free space targets, augment the decision to relog
> an intent item so that we only do it if the log has hit the 75% full
> threshold.  There's no point in relogging an intent into the same
> checkpoint, and there's no need to relog if there's plenty of free space
> in the log.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_defer.c |   16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> index 554777d1069c..2ba02f2e59a1 100644
> --- a/fs/xfs/libxfs/xfs_defer.c
> +++ b/fs/xfs/libxfs/xfs_defer.c
> @@ -356,7 +356,10 @@ xfs_defer_relog(
>  	struct xfs_trans		**tpp,
>  	struct list_head		*dfops)
>  {
> +	struct xlog			*log = (*tpp)->t_mountp->m_log;
>  	struct xfs_defer_pending	*dfp;
> +	xfs_lsn_t			threshold_lsn = NULLCOMMITLSN;
> +
>  
>  	ASSERT((*tpp)->t_flags & XFS_TRANS_PERM_LOG_RES);
>  
> @@ -372,6 +375,19 @@ xfs_defer_relog(
>  		    xfs_log_item_in_current_chkpt(dfp->dfp_intent))
>  			continue;
>  
> +		/*
> +		 * Figure out where we need the tail to be in order to maintain
> +		 * the minimum required free space in the log.  Only sample
> +		 * the log threshold once per call.
> +		 */
> +		if (threshold_lsn == NULLCOMMITLSN) {
> +			threshold_lsn = xlog_grant_push_threshold(log, 0);
> +			if (threshold_lsn == NULLCOMMITLSN)
> +				break;
> +		}

FWIW, this looks slightly different from the referenced repo again. :P
It might be good practice to create a -v# branch for patches sent to the
list and to keep that one stable for the associated review cycle.

That aside, I'm not quite clear where we stand with this patch. My
preference was to keep it unless there was some fundamental correctness
issue that I'm not aware of. I think your and Dave's preference was to
drop it. So either way, for posterity:

Reviewed-by: Brian Foster <bfoster@redhat.com>

> +		if (XFS_LSN_CMP(dfp->dfp_intent->li_lsn, threshold_lsn) >= 0)
> +			continue;
> +
>  		trace_xfs_defer_relog_intent((*tpp)->t_mountp, dfp);
>  		XFS_STATS_INC((*tpp)->t_mountp, defer_relog);
>  		dfp->dfp_intent = xfs_trans_item_relog(dfp->dfp_intent, *tpp);
>
Darrick J. Wong Oct. 1, 2020, 5:33 p.m. UTC | #2
On Thu, Oct 01, 2020 at 12:02:56PM -0400, Brian Foster wrote:
> On Tue, Sep 29, 2020 at 10:44:35AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Now that we have the ability to ask the log how far the tail needs to be
> > pushed to maintain its free space targets, augment the decision to relog
> > an intent item so that we only do it if the log has hit the 75% full
> > threshold.  There's no point in relogging an intent into the same
> > checkpoint, and there's no need to relog if there's plenty of free space
> > in the log.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_defer.c |   16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> > index 554777d1069c..2ba02f2e59a1 100644
> > --- a/fs/xfs/libxfs/xfs_defer.c
> > +++ b/fs/xfs/libxfs/xfs_defer.c
> > @@ -356,7 +356,10 @@ xfs_defer_relog(
> >  	struct xfs_trans		**tpp,
> >  	struct list_head		*dfops)
> >  {
> > +	struct xlog			*log = (*tpp)->t_mountp->m_log;
> >  	struct xfs_defer_pending	*dfp;
> > +	xfs_lsn_t			threshold_lsn = NULLCOMMITLSN;
> > +
> >  
> >  	ASSERT((*tpp)->t_flags & XFS_TRANS_PERM_LOG_RES);
> >  
> > @@ -372,6 +375,19 @@ xfs_defer_relog(
> >  		    xfs_log_item_in_current_chkpt(dfp->dfp_intent))
> >  			continue;
> >  
> > +		/*
> > +		 * Figure out where we need the tail to be in order to maintain
> > +		 * the minimum required free space in the log.  Only sample
> > +		 * the log threshold once per call.
> > +		 */
> > +		if (threshold_lsn == NULLCOMMITLSN) {
> > +			threshold_lsn = xlog_grant_push_threshold(log, 0);
> > +			if (threshold_lsn == NULLCOMMITLSN)
> > +				break;
> > +		}
> 
> FWIW, this looks slightly different from the referenced repo again. :P
> It might be good practice to create a -v# branch for patches sent to the
> list and to keep that one stable for the associated review cycle.

I did, as we talked about on irc.  For anyone following along at home:

https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/tag/?h=defer-ops-stalls-5.10_2020-09-29

> That aside, I'm not quite clear where we stand with this patch. My
> preference was to keep it unless there was some fundamental correctness
> issue that I'm not aware of. I think your and Dave's preference was to
> drop it. So either way, for posterity:

As the author who keeps spraying this patch onto the list, I'm still of
a mind to include it. :)  Since we don't really need to relog the
intents unless we need to move the log forward.

That said, even my attempts to force worst case scenarios (make a few
hundred of the big-extent-partially-shared files, truncate them all at
the same time while repeatedly fsyncing (or freezing the fs) still shows
that the relog counts are a tiny percentage of the total transactions
run.  I guess I'll try a higher AG count fs next...

--D

> 
> Reviewed-by: Brian Foster <bfoster@redhat.com>
> 
> > +		if (XFS_LSN_CMP(dfp->dfp_intent->li_lsn, threshold_lsn) >= 0)
> > +			continue;
> > +
> >  		trace_xfs_defer_relog_intent((*tpp)->t_mountp, dfp);
> >  		XFS_STATS_INC((*tpp)->t_mountp, defer_relog);
> >  		dfp->dfp_intent = xfs_trans_item_relog(dfp->dfp_intent, *tpp);
> > 
>
diff mbox series

Patch

diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index 554777d1069c..2ba02f2e59a1 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -356,7 +356,10 @@  xfs_defer_relog(
 	struct xfs_trans		**tpp,
 	struct list_head		*dfops)
 {
+	struct xlog			*log = (*tpp)->t_mountp->m_log;
 	struct xfs_defer_pending	*dfp;
+	xfs_lsn_t			threshold_lsn = NULLCOMMITLSN;
+
 
 	ASSERT((*tpp)->t_flags & XFS_TRANS_PERM_LOG_RES);
 
@@ -372,6 +375,19 @@  xfs_defer_relog(
 		    xfs_log_item_in_current_chkpt(dfp->dfp_intent))
 			continue;
 
+		/*
+		 * Figure out where we need the tail to be in order to maintain
+		 * the minimum required free space in the log.  Only sample
+		 * the log threshold once per call.
+		 */
+		if (threshold_lsn == NULLCOMMITLSN) {
+			threshold_lsn = xlog_grant_push_threshold(log, 0);
+			if (threshold_lsn == NULLCOMMITLSN)
+				break;
+		}
+		if (XFS_LSN_CMP(dfp->dfp_intent->li_lsn, threshold_lsn) >= 0)
+			continue;
+
 		trace_xfs_defer_relog_intent((*tpp)->t_mountp, dfp);
 		XFS_STATS_INC((*tpp)->t_mountp, defer_relog);
 		dfp->dfp_intent = xfs_trans_item_relog(dfp->dfp_intent, *tpp);