diff mbox series

[3/5] xfs: add XFS_ITEM_UNSAFE for log item push return result

Message ID 20240823110439.1585041-4-leo.lilong@huawei.com (mailing list archive)
State Deferred, archived
Headers show
Series xfs: fix and cleanups for log item push | expand

Commit Message

Long Li Aug. 23, 2024, 11:04 a.m. UTC
After pushing log items, the log item may have been freed, making it
unsafe to access in tracepoints. This commit introduces XFS_ITEM_UNSAFE
to indicate when an item might be freed during the item push operation.

Signed-off-by: Long Li <leo.lilong@huawei.com>
---
 fs/xfs/xfs_stats.h     | 1 +
 fs/xfs/xfs_trans.h     | 1 +
 fs/xfs/xfs_trans_ail.c | 7 +++++++
 3 files changed, 9 insertions(+)

Comments

Darrick J. Wong Aug. 23, 2024, 5:17 p.m. UTC | #1
On Fri, Aug 23, 2024 at 07:04:37PM +0800, Long Li wrote:
> After pushing log items, the log item may have been freed, making it
> unsafe to access in tracepoints. This commit introduces XFS_ITEM_UNSAFE
> to indicate when an item might be freed during the item push operation.
> 
> Signed-off-by: Long Li <leo.lilong@huawei.com>
> ---
>  fs/xfs/xfs_stats.h     | 1 +
>  fs/xfs/xfs_trans.h     | 1 +
>  fs/xfs/xfs_trans_ail.c | 7 +++++++
>  3 files changed, 9 insertions(+)
> 
> diff --git a/fs/xfs/xfs_stats.h b/fs/xfs/xfs_stats.h
> index a61fb56ed2e6..9a7a020587cf 100644
> --- a/fs/xfs/xfs_stats.h
> +++ b/fs/xfs/xfs_stats.h
> @@ -86,6 +86,7 @@ struct __xfsstats {
>  	uint32_t		xs_push_ail_pushbuf;
>  	uint32_t		xs_push_ail_pinned;
>  	uint32_t		xs_push_ail_locked;
> +	uint32_t		xs_push_ail_unsafe;
>  	uint32_t		xs_push_ail_flushing;
>  	uint32_t		xs_push_ail_restarts;
>  	uint32_t		xs_push_ail_flush;
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index f06cc0f41665..fd4f04853fe2 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -117,6 +117,7 @@ void	xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item,
>  #define XFS_ITEM_PINNED		1
>  #define XFS_ITEM_LOCKED		2
>  #define XFS_ITEM_FLUSHING	3
> +#define XFS_ITEM_UNSAFE		4
>  
>  /*
>   * This is the structure maintained for every active transaction.
> diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> index 8ede9d099d1f..a5ab1ffb8937 100644
> --- a/fs/xfs/xfs_trans_ail.c
> +++ b/fs/xfs/xfs_trans_ail.c
> @@ -561,6 +561,13 @@ xfsaild_push(
>  
>  			stuck++;
>  			break;
> +		case XFS_ITEM_UNSAFE:
> +			/*
> +			 * The item may have been freed, so we can't access the
> +			 * log item here.
> +			 */
> +			XFS_STATS_INC(mp, xs_push_ail_unsafe);

Aha, so this is in reaction to Dave's comment "So, yeah, these failure
cases need to return something different to xfsaild_push() so it knows
that it is unsafe to reference the log item, even for tracing purposes."

What we're trying to communicate through xfsaild_push_item is that we've
cycled the AIL lock and possibly done something (e.g. deleting the log
item from the AIL) such that the lip reference might be stale.

Can we call this XFS_ITEM_STALEREF?  "Unsafe" is vague.

--D

> +			break;
>  		default:
>  			ASSERT(0);
>  			break;
> -- 
> 2.39.2
> 
>
Long Li Aug. 24, 2024, 3:30 a.m. UTC | #2
On Fri, Aug 23, 2024 at 10:17:09AM -0700, Darrick J. Wong wrote:
> On Fri, Aug 23, 2024 at 07:04:37PM +0800, Long Li wrote:
> > After pushing log items, the log item may have been freed, making it
> > unsafe to access in tracepoints. This commit introduces XFS_ITEM_UNSAFE
> > to indicate when an item might be freed during the item push operation.
> > 
> > Signed-off-by: Long Li <leo.lilong@huawei.com>
> > ---
> >  fs/xfs/xfs_stats.h     | 1 +
> >  fs/xfs/xfs_trans.h     | 1 +
> >  fs/xfs/xfs_trans_ail.c | 7 +++++++
> >  3 files changed, 9 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_stats.h b/fs/xfs/xfs_stats.h
> > index a61fb56ed2e6..9a7a020587cf 100644
> > --- a/fs/xfs/xfs_stats.h
> > +++ b/fs/xfs/xfs_stats.h
> > @@ -86,6 +86,7 @@ struct __xfsstats {
> >  	uint32_t		xs_push_ail_pushbuf;
> >  	uint32_t		xs_push_ail_pinned;
> >  	uint32_t		xs_push_ail_locked;
> > +	uint32_t		xs_push_ail_unsafe;
> >  	uint32_t		xs_push_ail_flushing;
> >  	uint32_t		xs_push_ail_restarts;
> >  	uint32_t		xs_push_ail_flush;
> > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> > index f06cc0f41665..fd4f04853fe2 100644
> > --- a/fs/xfs/xfs_trans.h
> > +++ b/fs/xfs/xfs_trans.h
> > @@ -117,6 +117,7 @@ void	xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item,
> >  #define XFS_ITEM_PINNED		1
> >  #define XFS_ITEM_LOCKED		2
> >  #define XFS_ITEM_FLUSHING	3
> > +#define XFS_ITEM_UNSAFE		4
> >  
> >  /*
> >   * This is the structure maintained for every active transaction.
> > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> > index 8ede9d099d1f..a5ab1ffb8937 100644
> > --- a/fs/xfs/xfs_trans_ail.c
> > +++ b/fs/xfs/xfs_trans_ail.c
> > @@ -561,6 +561,13 @@ xfsaild_push(
> >  
> >  			stuck++;
> >  			break;
> > +		case XFS_ITEM_UNSAFE:
> > +			/*
> > +			 * The item may have been freed, so we can't access the
> > +			 * log item here.
> > +			 */
> > +			XFS_STATS_INC(mp, xs_push_ail_unsafe);
> 
> Aha, so this is in reaction to Dave's comment "So, yeah, these failure
> cases need to return something different to xfsaild_push() so it knows
> that it is unsafe to reference the log item, even for tracing purposes."
> 

Yse ...

> What we're trying to communicate through xfsaild_push_item is that we've
> cycled the AIL lock and possibly done something (e.g. deleting the log
> item from the AIL) such that the lip reference might be stale.
> 
> Can we call this XFS_ITEM_STALEREF?  "Unsafe" is vague.
> 

Yes, it's seems reasonable, but doesn't XFS_ITEM_STALED looks more consistent??
 
Thanks,
Long Li
Christoph Hellwig Aug. 24, 2024, 3:34 a.m. UTC | #3
On Fri, Aug 23, 2024 at 07:04:37PM +0800, Long Li wrote:
> After pushing log items, the log item may have been freed, making it
> unsafe to access in tracepoints. This commit introduces XFS_ITEM_UNSAFE
> to indicate when an item might be freed during the item push operation.

So instead of this magic unsafe operation I think declaring a rule that
the lip must never be accessed after the return is the much saner
choice.

We'll then need to figure out how we can still keep useful tracing
without accessing the lip.  The only information the trace points need
from the lip itself are the type, flags, and lsn and those seem small
enough to save on the stack before calling into ->iop_push.
Long Li Aug. 27, 2024, 9:41 a.m. UTC | #4
On Fri, Aug 23, 2024 at 08:34:43PM -0700, Christoph Hellwig wrote:
> On Fri, Aug 23, 2024 at 07:04:37PM +0800, Long Li wrote:
> > After pushing log items, the log item may have been freed, making it
> > unsafe to access in tracepoints. This commit introduces XFS_ITEM_UNSAFE
> > to indicate when an item might be freed during the item push operation.
> 
> So instead of this magic unsafe operation I think declaring a rule that
> the lip must never be accessed after the return is the much saner
> choice.
> 
> We'll then need to figure out how we can still keep useful tracing
> without accessing the lip.  The only information the trace points need
> from the lip itself are the type, flags, and lsn and those seem small
> enough to save on the stack before calling into ->iop_push.
> 
> 

Hi Christoph,

Thank you for pointing out the issues with the current approach. Establishing
a rule to not access 'lip' after the item has been pushed would indeed make
the logic clearer.

However, saving the log item information that needs to be traced on the stack
also looks unappealing:

	1. We would need to add new log item trace code, instead of using the
	generic DEFINE_LOG_ITEM_EVENT macro definition. This increases code
	redundancy.

	2. We would need to use CONFIG_TRACEPOINTS to manage whether we need
	to save type, flags, lsn, and other information on the stack.

	3. If we need to extend tracing to other fields of the log item in
	the future, we would need to add new variables.

If we save log item on the stack before each push, I think it would affect
performance, although this impact would be minimal. I wonder what other 
people's opinions are?

Thanks,
Long Li
Dave Chinner Aug. 27, 2024, 9:44 a.m. UTC | #5
On Fri, Aug 23, 2024 at 10:17:09AM -0700, Darrick J. Wong wrote:
> On Fri, Aug 23, 2024 at 07:04:37PM +0800, Long Li wrote:
> > After pushing log items, the log item may have been freed, making it
> > unsafe to access in tracepoints. This commit introduces XFS_ITEM_UNSAFE
> > to indicate when an item might be freed during the item push operation.
> > 
> > Signed-off-by: Long Li <leo.lilong@huawei.com>
> > ---
> >  fs/xfs/xfs_stats.h     | 1 +
> >  fs/xfs/xfs_trans.h     | 1 +
> >  fs/xfs/xfs_trans_ail.c | 7 +++++++
> >  3 files changed, 9 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_stats.h b/fs/xfs/xfs_stats.h
> > index a61fb56ed2e6..9a7a020587cf 100644
> > --- a/fs/xfs/xfs_stats.h
> > +++ b/fs/xfs/xfs_stats.h
> > @@ -86,6 +86,7 @@ struct __xfsstats {
> >  	uint32_t		xs_push_ail_pushbuf;
> >  	uint32_t		xs_push_ail_pinned;
> >  	uint32_t		xs_push_ail_locked;
> > +	uint32_t		xs_push_ail_unsafe;
> >  	uint32_t		xs_push_ail_flushing;
> >  	uint32_t		xs_push_ail_restarts;
> >  	uint32_t		xs_push_ail_flush;
> > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> > index f06cc0f41665..fd4f04853fe2 100644
> > --- a/fs/xfs/xfs_trans.h
> > +++ b/fs/xfs/xfs_trans.h
> > @@ -117,6 +117,7 @@ void	xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item,
> >  #define XFS_ITEM_PINNED		1
> >  #define XFS_ITEM_LOCKED		2
> >  #define XFS_ITEM_FLUSHING	3
> > +#define XFS_ITEM_UNSAFE		4
> >  
> >  /*
> >   * This is the structure maintained for every active transaction.
> > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> > index 8ede9d099d1f..a5ab1ffb8937 100644
> > --- a/fs/xfs/xfs_trans_ail.c
> > +++ b/fs/xfs/xfs_trans_ail.c
> > @@ -561,6 +561,13 @@ xfsaild_push(
> >  
> >  			stuck++;
> >  			break;
> > +		case XFS_ITEM_UNSAFE:
> > +			/*
> > +			 * The item may have been freed, so we can't access the
> > +			 * log item here.
> > +			 */
> > +			XFS_STATS_INC(mp, xs_push_ail_unsafe);
> 
> Aha, so this is in reaction to Dave's comment "So, yeah, these failure
> cases need to return something different to xfsaild_push() so it knows
> that it is unsafe to reference the log item, even for tracing purposes."

I wish I knew when I said that and what I said it about. There's no
pointer to previous versions of this fix in the series description -
it's not even marked as a "v2" patchset.

> What we're trying to communicate through xfsaild_push_item is that we've
> cycled the AIL lock and possibly done something (e.g. deleting the log
> item from the AIL) such that the lip reference might be stale.
> 
> Can we call this XFS_ITEM_STALEREF?  "Unsafe" is vague.

In this case, the returned value tells the aild what the result of
the push was. i.e. the return value is XFS_ITEM_FREED, telling the
aild that the push has freed the item (and so the aild should not
access it again).

-Dave.
Dave Chinner Aug. 27, 2024, 10 a.m. UTC | #6
On Fri, Aug 23, 2024 at 08:34:43PM -0700, Christoph Hellwig wrote:
> On Fri, Aug 23, 2024 at 07:04:37PM +0800, Long Li wrote:
> > After pushing log items, the log item may have been freed, making it
> > unsafe to access in tracepoints. This commit introduces XFS_ITEM_UNSAFE
> > to indicate when an item might be freed during the item push operation.
> 
> So instead of this magic unsafe operation I think declaring a rule that
> the lip must never be accessed after the return is the much saner
> choice.

That may well be the case, but in the normal case the only way to
remove the item from the AIL is to run IO completion. We don't
actually submit IO for anything we've pushed until after we've
dropped out of the item push loop, so IO completion and potential
item AIL removal and freeing can't occur for anything we need to do
IO on.

Hence the only cases where the item might have been already removed
from the AIL by the ->iop_push() are those where the push itself
removes the item from the AIL. This only occurs in shutdown
situations, so it's not the common case.

In which case, returning XFS_ITEM_FREED to tell the push code that
it was freed and should not reference it at all is fine. We don't
really even need tracing for this case because if the items can't be
removed from the AIL, they will leave some other AIL trace when
pushe (i.e.  they will be stuck locked, pinned or flushing and those
will leave traces...)

-Dave.
Christoph Hellwig Aug. 27, 2024, 12:30 p.m. UTC | #7
On Tue, Aug 27, 2024 at 08:00:05PM +1000, Dave Chinner wrote:
> Hence the only cases where the item might have been already removed
> from the AIL by the ->iop_push() are those where the push itself
> removes the item from the AIL. This only occurs in shutdown
> situations, so it's not the common case.
> 
> In which case, returning XFS_ITEM_FREED to tell the push code that
> it was freed and should not reference it at all is fine. We don't
> really even need tracing for this case because if the items can't be
> removed from the AIL, they will leave some other AIL trace when
> pushe (i.e.  they will be stuck locked, pinned or flushing and those
> will leave traces...)

So XFS_ITEM_FREED is definitively a better name, but it still feels
a bit fragile that any of these shutdown paths need special handling
inside ->iop_push.
Dave Chinner Aug. 27, 2024, 9:52 p.m. UTC | #8
On Tue, Aug 27, 2024 at 05:30:49AM -0700, Christoph Hellwig wrote:
> On Tue, Aug 27, 2024 at 08:00:05PM +1000, Dave Chinner wrote:
> > Hence the only cases where the item might have been already removed
> > from the AIL by the ->iop_push() are those where the push itself
> > removes the item from the AIL. This only occurs in shutdown
> > situations, so it's not the common case.
> > 
> > In which case, returning XFS_ITEM_FREED to tell the push code that
> > it was freed and should not reference it at all is fine. We don't
> > really even need tracing for this case because if the items can't be
> > removed from the AIL, they will leave some other AIL trace when
> > pushe (i.e.  they will be stuck locked, pinned or flushing and those
> > will leave traces...)
> 
> So XFS_ITEM_FREED is definitively a better name, but it still feels
> a bit fragile that any of these shutdown paths need special handling
> inside ->iop_push.

Agreed, but I don't see an easy way to fix that right now because
the shutdown behaviour is both item type and item state specific.

I suspect that we'd do better to have explicit shutdown processing
of log items in the AIL (i.e. a ->iop_shutdown method) that is
called instead of ->iop_push when the AIL detects that the
filesystem has shut down. We can then define the exact behaviour we
want in this case and processing does not have to be non-blocking
for performance and latency reasons.

If we go down that route, I think we'd want to add a
XFS_ITEM_SHUTDOWN return value after the push code calls
xfs_force_shutdown(). The push code does not error out the item or
remove it from the AIL, just shuts down the fs and returns
XFS_ITEM_SHUTDOWN.

The AIL then breaks out of the push loop and submits the delwri
buffers which will error them all out and remove them from the AIL
because the fs is shut down. It then starts a new walk from the tail
calling ->iop_shutdown on everything remaining in the AIL until the
AIL is empty....

-Dave.
Christoph Hellwig Aug. 28, 2024, 4:23 a.m. UTC | #9
On Wed, Aug 28, 2024 at 07:52:48AM +1000, Dave Chinner wrote:
> I suspect that we'd do better to have explicit shutdown processing
> of log items in the AIL (i.e. a ->iop_shutdown method) that is
> called instead of ->iop_push when the AIL detects that the
> filesystem has shut down. We can then define the exact behaviour we
> want in this case and processing does not have to be non-blocking
> for performance and latency reasons.
> 
> If we go down that route, I think we'd want to add a
> XFS_ITEM_SHUTDOWN return value after the push code calls
> xfs_force_shutdown(). The push code does not error out the item or
> remove it from the AIL, just shuts down the fs and returns
> XFS_ITEM_SHUTDOWN.

Yes, that seems even better.  But it would probably be a fair amount
of work.
Dave Chinner Aug. 29, 2024, 10:16 a.m. UTC | #10
On Tue, Aug 27, 2024 at 09:23:33PM -0700, Christoph Hellwig wrote:
> On Wed, Aug 28, 2024 at 07:52:48AM +1000, Dave Chinner wrote:
> > I suspect that we'd do better to have explicit shutdown processing
> > of log items in the AIL (i.e. a ->iop_shutdown method) that is
> > called instead of ->iop_push when the AIL detects that the
> > filesystem has shut down. We can then define the exact behaviour we
> > want in this case and processing does not have to be non-blocking
> > for performance and latency reasons.
> > 
> > If we go down that route, I think we'd want to add a
> > XFS_ITEM_SHUTDOWN return value after the push code calls
> > xfs_force_shutdown(). The push code does not error out the item or
> > remove it from the AIL, just shuts down the fs and returns
> > XFS_ITEM_SHUTDOWN.
> 
> Yes, that seems even better.  But it would probably be a fair amount
> of work.

I don't think so. Only the log items that implement ->iop_push would
need to implement ->iop_shutdown, and there are only 3 items that
implement iop_push. i.e. inode, dquot and buffer items.

The buffer item shutdown method would simply be:

{
	xfs_buf_lock(bp);
	bp->b_flags |= XBF_ASYNC;
	xfs_buf_ioend_fail(bp);
	return XFS_ITEM_FREED;
}

The inode item would use the same setup as xfs_inode_item_push()
to lock the inode cluster buffer, then run the same code as above
for the buffer item shutdown.

I think that dquots end up the same - do the same setup work to get
the locked dquot buffer, then fail it directly without doing IO.

It doesn't seem like it's all that complex....

-Dave.
diff mbox series

Patch

diff --git a/fs/xfs/xfs_stats.h b/fs/xfs/xfs_stats.h
index a61fb56ed2e6..9a7a020587cf 100644
--- a/fs/xfs/xfs_stats.h
+++ b/fs/xfs/xfs_stats.h
@@ -86,6 +86,7 @@  struct __xfsstats {
 	uint32_t		xs_push_ail_pushbuf;
 	uint32_t		xs_push_ail_pinned;
 	uint32_t		xs_push_ail_locked;
+	uint32_t		xs_push_ail_unsafe;
 	uint32_t		xs_push_ail_flushing;
 	uint32_t		xs_push_ail_restarts;
 	uint32_t		xs_push_ail_flush;
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index f06cc0f41665..fd4f04853fe2 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -117,6 +117,7 @@  void	xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item,
 #define XFS_ITEM_PINNED		1
 #define XFS_ITEM_LOCKED		2
 #define XFS_ITEM_FLUSHING	3
+#define XFS_ITEM_UNSAFE		4
 
 /*
  * This is the structure maintained for every active transaction.
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 8ede9d099d1f..a5ab1ffb8937 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -561,6 +561,13 @@  xfsaild_push(
 
 			stuck++;
 			break;
+		case XFS_ITEM_UNSAFE:
+			/*
+			 * The item may have been freed, so we can't access the
+			 * log item here.
+			 */
+			XFS_STATS_INC(mp, xs_push_ail_unsafe);
+			break;
 		default:
 			ASSERT(0);
 			break;