diff mbox series

[2/4] xfs: remove callback dequeue loop from xlog_state_do_iclog_callbacks

Message ID 20210622040604.1290539-3-david@fromorbit.com (mailing list archive)
State Accepted
Headers show
Series xfs: fix CIL shutdown UAF and shutdown hang | expand

Commit Message

Dave Chinner June 22, 2021, 4:06 a.m. UTC
From: Dave Chinner <dchinner@redhat.com>

If we are processing callbacks on an iclog, nothing can be
concurrently adding callbacks to the loop. We only add callbacks to
the iclog when they are in ACTIVE or WANT_SYNC state, and we
explicitly do not add callbacks if the iclog is already in IOERROR
state.

The only way to have a dequeue racing with an enqueue is to be
processing a shutdown without a direct reference to an iclog in
ACTIVE or WANT_SYNC state. As the enqueue avoids this race
condition, we only ever need a single dequeue operation in
xlog_state_do_iclog_callbacks(). Hence we can remove the loop.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_log.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

Comments

Brian Foster June 22, 2021, 12:39 p.m. UTC | #1
On Tue, Jun 22, 2021 at 02:06:02PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> If we are processing callbacks on an iclog, nothing can be
> concurrently adding callbacks to the loop. We only add callbacks to
> the iclog when they are in ACTIVE or WANT_SYNC state, and we
> explicitly do not add callbacks if the iclog is already in IOERROR
> state.
> 
> The only way to have a dequeue racing with an enqueue is to be
> processing a shutdown without a direct reference to an iclog in
> ACTIVE or WANT_SYNC state. As the enqueue avoids this race
> condition, we only ever need a single dequeue operation in
> xlog_state_do_iclog_callbacks(). Hence we can remove the loop.
> 

This sort of relates to my question on the previous patch..

> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_log.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index bb4390942275..05b00fa4d661 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -2774,19 +2774,15 @@ xlog_state_do_iclog_callbacks(
>  	struct xlog		*log,
>  	struct xlog_in_core	*iclog)
>  {
> -	trace_xlog_iclog_callbacks_start(iclog, _RET_IP_);
> -	spin_lock(&iclog->ic_callback_lock);
> -	while (!list_empty(&iclog->ic_callbacks)) {
> -		LIST_HEAD(tmp);
> +	LIST_HEAD(tmp);
>  
> -		list_splice_init(&iclog->ic_callbacks, &tmp);
> -
> -		spin_unlock(&iclog->ic_callback_lock);
> -		xlog_cil_process_committed(&tmp);
> -		spin_lock(&iclog->ic_callback_lock);
> -	}
> +	trace_xlog_iclog_callbacks_start(iclog, _RET_IP_);
>  
> +	spin_lock(&iclog->ic_callback_lock);
> +	list_splice_init(&iclog->ic_callbacks, &tmp);
>  	spin_unlock(&iclog->ic_callback_lock);
> +
> +	xlog_cil_process_committed(&tmp);
>  	trace_xlog_iclog_callbacks_done(iclog, _RET_IP_);
>  }
>  
> -- 
> 2.31.1
>
Dave Chinner June 22, 2021, 10:56 p.m. UTC | #2
On Tue, Jun 22, 2021 at 08:39:07AM -0400, Brian Foster wrote:
> On Tue, Jun 22, 2021 at 02:06:02PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > If we are processing callbacks on an iclog, nothing can be
> > concurrently adding callbacks to the loop. We only add callbacks to
> > the iclog when they are in ACTIVE or WANT_SYNC state, and we
> > explicitly do not add callbacks if the iclog is already in IOERROR
> > state.
> > 
> > The only way to have a dequeue racing with an enqueue is to be
> > processing a shutdown without a direct reference to an iclog in
> > ACTIVE or WANT_SYNC state. As the enqueue avoids this race
> > condition, we only ever need a single dequeue operation in
> > xlog_state_do_iclog_callbacks(). Hence we can remove the loop.
> > 
> 
> This sort of relates to my question on the previous patch..

Been that way since 1995:

commit fdae46676ab5d359d02d955c989b20b18e2a97f8
Author: Adam Sweeney <ajs@sgi.com>
Date:   Thu May 4 20:54:43 1995 +0000

    275579 - - Fix timing bug in the log callback code.  Callbacks
    must be queued until the incore log buffer goes to the dirty
    state.

......
               /*
+                * Keep processing entries in the callback list
+                * until we come around and it is empty.  We need
+                * to atomically see that the list is empty and change
+                * the state to DIRTY so that we don't miss any more
+                * callbacks being added.
+                */
                spl = LOG_LOCK(log);
+               cb = iclog->ic_callback;
+               while (cb != 0) {
+                       iclog->ic_callback_tail = &(iclog->ic_callback);
+                       iclog->ic_callback = 0;
+                       LOG_UNLOCK(log, spl);
+
+                       /* perform callbacks in the order given */
+                       for (; cb != 0; cb = cb_next) {
+                               cb_next = cb->cb_next;
+                               cb->cb_func(cb->cb_arg);
+                       }
+                       spl = LOG_LOCK(log);
+                       cb = iclog->ic_callback;
+               }
+
+               ASSERT(iclog->ic_callback == 0);

THat's likely also what the locking I removed in the previous patch
was attempting to retain - atomic transition to DIRTY state -
without really understanding if it is necessary or not.

The only way I can see this happening now is racing with shutdown
state being set and callbacks being run at the same time a commit
record is being processed. But now we check for shutdown before we
add callbacks to the iclog, and hence we can't be adding callbacks
while shutdown state callbacks are being run. And that's made even
more impossible by this patch set that is serialising all the
shutdown state changes and callback add/remove under the same
lock....

So, yeah, this is largely behaviour that is historic and the
situation that it avoided is unknown and almost certainly doesn't
exist anymore...

Cheers,

Dave.
Darrick J. Wong June 25, 2021, 8:57 p.m. UTC | #3
On Tue, Jun 22, 2021 at 02:06:02PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> If we are processing callbacks on an iclog, nothing can be
> concurrently adding callbacks to the loop. We only add callbacks to
> the iclog when they are in ACTIVE or WANT_SYNC state, and we
> explicitly do not add callbacks if the iclog is already in IOERROR
> state.
> 
> The only way to have a dequeue racing with an enqueue is to be
> processing a shutdown without a direct reference to an iclog in
> ACTIVE or WANT_SYNC state. As the enqueue avoids this race
> condition, we only ever need a single dequeue operation in
> xlog_state_do_iclog_callbacks(). Hence we can remove the loop.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

Makes sense...
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_log.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index bb4390942275..05b00fa4d661 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -2774,19 +2774,15 @@ xlog_state_do_iclog_callbacks(
>  	struct xlog		*log,
>  	struct xlog_in_core	*iclog)
>  {
> -	trace_xlog_iclog_callbacks_start(iclog, _RET_IP_);
> -	spin_lock(&iclog->ic_callback_lock);
> -	while (!list_empty(&iclog->ic_callbacks)) {
> -		LIST_HEAD(tmp);
> +	LIST_HEAD(tmp);
>  
> -		list_splice_init(&iclog->ic_callbacks, &tmp);
> -
> -		spin_unlock(&iclog->ic_callback_lock);
> -		xlog_cil_process_committed(&tmp);
> -		spin_lock(&iclog->ic_callback_lock);
> -	}
> +	trace_xlog_iclog_callbacks_start(iclog, _RET_IP_);
>  
> +	spin_lock(&iclog->ic_callback_lock);
> +	list_splice_init(&iclog->ic_callbacks, &tmp);
>  	spin_unlock(&iclog->ic_callback_lock);
> +
> +	xlog_cil_process_committed(&tmp);
>  	trace_xlog_iclog_callbacks_done(iclog, _RET_IP_);
>  }
>  
> -- 
> 2.31.1
>
diff mbox series

Patch

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index bb4390942275..05b00fa4d661 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -2774,19 +2774,15 @@  xlog_state_do_iclog_callbacks(
 	struct xlog		*log,
 	struct xlog_in_core	*iclog)
 {
-	trace_xlog_iclog_callbacks_start(iclog, _RET_IP_);
-	spin_lock(&iclog->ic_callback_lock);
-	while (!list_empty(&iclog->ic_callbacks)) {
-		LIST_HEAD(tmp);
+	LIST_HEAD(tmp);
 
-		list_splice_init(&iclog->ic_callbacks, &tmp);
-
-		spin_unlock(&iclog->ic_callback_lock);
-		xlog_cil_process_committed(&tmp);
-		spin_lock(&iclog->ic_callback_lock);
-	}
+	trace_xlog_iclog_callbacks_start(iclog, _RET_IP_);
 
+	spin_lock(&iclog->ic_callback_lock);
+	list_splice_init(&iclog->ic_callbacks, &tmp);
 	spin_unlock(&iclog->ic_callback_lock);
+
+	xlog_cil_process_committed(&tmp);
 	trace_xlog_iclog_callbacks_done(iclog, _RET_IP_);
 }