Message ID | 20200417150859.14734-11-bfoster@redhat.com (mailing list archive) |
---|---|
State | Superseded, archived |
Series | xfs: flush related error handling cleanups |
On Fri, Apr 17, 2020 at 11:08:57AM -0400, Brian Foster wrote:
> We have two AIL removal functions with slightly different semantics.
> xfs_trans_ail_delete() expects the caller to have the AIL lock and
> for the associated item to be AIL resident. If not, the filesystem
> is shut down. xfs_trans_ail_remove() acquires the AIL lock, checks
> that the item is AIL resident and calls the former if so.
>
> These semantics lead to confused usage between the two. For example,
> the _remove() variant takes a shutdown parameter to pass to the
> _delete() variant, but immediately returns if the AIL bit is not
> set. This means that _remove() would never shut down if an item is
> not AIL resident, even though it appears that many callers would
> expect it to.
>
> Make the following changes to clean up both of these functions:
>
> - Most callers of xfs_trans_ail_delete() acquire the AIL lock just
>   before the call. Update _delete() to acquire the lock and open
>   code the couple of callers that make additional checks under AIL
>   lock.
> - Drop the unnecessary ailp parameter from _delete().
> - Drop the unused shutdown parameter from _remove() and open code
>   the implementation.
>
> In summary, this leaves a _delete() variant that expects an AIL
> resident item and a _remove() helper that checks the AIL bit. Audit
> the existing callsites for use of the appropriate function and
> update as necessary.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
....

Good start, but...

> @@ -1032,10 +1033,11 @@ xfs_qm_dqflush_done(
>  		goto out;
>
>  	spin_lock(&ailp->ail_lock);
> -	if (lip->li_lsn == qip->qli_flush_lsn)
> -		/* xfs_trans_ail_delete() drops the AIL lock */
> -		xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
> -	else
> +	if (lip->li_lsn == qip->qli_flush_lsn) {
> +		/* xfs_ail_update_finish() drops the AIL lock */
> +		tail_lsn = xfs_ail_delete_one(ailp, lip);
> +		xfs_ail_update_finish(ailp, tail_lsn);
> +	} else
>  		spin_unlock(&ailp->ail_lock);

This drops the shutdown if the dquot is not in the AIL. It should be
in the AIL, so if it isn't we should be shutting down...

> @@ -872,13 +872,14 @@ xfs_ail_delete_one(
>   */
>  void
>  xfs_trans_ail_delete(
> -	struct xfs_ail		*ailp,
>  	struct xfs_log_item	*lip,
>  	int			shutdown_type)
>  {
> +	struct xfs_ail		*ailp = lip->li_ailp;
>  	struct xfs_mount	*mp = ailp->ail_mount;
>  	xfs_lsn_t		tail_lsn;
>
> +	spin_lock(&ailp->ail_lock);
>  	if (!test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
>  		spin_unlock(&ailp->ail_lock);
>  		if (!XFS_FORCED_SHUTDOWN(mp)) {
> diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
> index 9135afdcee9d..7563c78e2997 100644
> --- a/fs/xfs/xfs_trans_priv.h
> +++ b/fs/xfs/xfs_trans_priv.h
> @@ -94,22 +94,23 @@ xfs_trans_ail_update(
>  xfs_lsn_t xfs_ail_delete_one(struct xfs_ail *ailp, struct xfs_log_item *lip);
>  void xfs_ail_update_finish(struct xfs_ail *ailp, xfs_lsn_t old_lsn)
>  			__releases(ailp->ail_lock);
> -void xfs_trans_ail_delete(struct xfs_ail *ailp, struct xfs_log_item *lip,
> -				int shutdown_type);
> +void xfs_trans_ail_delete(struct xfs_log_item *lip, int shutdown_type);
>
>  static inline void
>  xfs_trans_ail_remove(
> -	struct xfs_log_item	*lip,
> -	int			shutdown_type)
> +	struct xfs_log_item	*lip)
>  {
>  	struct xfs_ail		*ailp = lip->li_ailp;
> +	xfs_lsn_t		tail_lsn;
>
>  	spin_lock(&ailp->ail_lock);
> -	/* xfs_trans_ail_delete() drops the AIL lock */
> -	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags))
> -		xfs_trans_ail_delete(ailp, lip, shutdown_type);
> -	else
> +	/* xfs_ail_update_finish() drops the AIL lock */
> +	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
> +		tail_lsn = xfs_ail_delete_one(ailp, lip);
> +		xfs_ail_update_finish(ailp, tail_lsn);
> +	} else {
>  		spin_unlock(&ailp->ail_lock);
> +	}
>  }

This makes xfs_trans_ail_delete() and xfs_trans_ail_remove() almost
identical, except one will shutdown if the item is not in the AIL
and the other won't. Wouldn't it be better to get it down to just
one function that does everything, and remove the confusion of which
to use altogether?

void
xfs_trans_ail_delete(
	struct xfs_log_item	*lip,
	int			shutdown)
{
	struct xfs_ail		*ailp = lip->li_ailp;

	spin_lock(&ailp->ail_lock);
	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
		xfs_lsn_t tail_lsn = xfs_ail_delete_one(ailp, lip);
		xfs_ail_update_finish(ailp, tail_lsn);
		return;
	}
	spin_unlock(&ailp->ail_lock);
	if (!shutdown)
		return;

	/* do shutdown stuff */
}

-Dave.
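
For readers who want to poke at the control flow of the single-function proposal outside the kernel, here is a minimal userspace sketch. Everything in it is an assumption made for illustration: `struct model_ail`, `struct model_item`, `model_lock()`, `model_unlock()` and `model_ail_delete()` are hypothetical stand-ins, a boolean replaces the AIL spinlock, and setting `shut_down` stands in for the elided "do shutdown stuff". It is not the XFS implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are the kernel types. */
struct model_ail {
	bool	locked;		/* models ailp->ail_lock being held */
	bool	shut_down;	/* models a forced filesystem shutdown */
	int	nr_items;	/* number of items currently tracked */
};

struct model_item {
	struct model_ail	*ailp;
	bool			in_ail;	/* models the XFS_LI_IN_AIL bit */
};

static void model_lock(struct model_ail *ailp)
{
	assert(!ailp->locked);
	ailp->locked = true;
}

static void model_unlock(struct model_ail *ailp)
{
	assert(ailp->locked);
	ailp->locked = false;
}

/*
 * Single removal entry point in the shape proposed above: take the lock,
 * remove the item if it is resident, otherwise drop the lock and shut down
 * only when the caller passed a non-zero shutdown value.
 */
static void model_ail_delete(struct model_item *lip, int shutdown)
{
	struct model_ail *ailp = lip->ailp;

	model_lock(ailp);
	if (lip->in_ail) {
		lip->in_ail = false;
		ailp->nr_items--;
		/* In the kernel, xfs_ail_update_finish() drops the lock. */
		model_unlock(ailp);
		return;
	}
	model_unlock(ailp);
	if (!shutdown)
		return;

	/* Placeholder for the elided "do shutdown stuff". */
	ailp->shut_down = true;
}
```

In this model, a caller that tolerates a non-resident item passes 0 and a caller that expects residency passes a non-zero shutdown value, so the choice between two function names becomes a choice of argument.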
On Mon, Apr 20, 2020 at 02:32:33PM +1000, Dave Chinner wrote:
> On Fri, Apr 17, 2020 at 11:08:57AM -0400, Brian Foster wrote:
> > We have two AIL removal functions with slightly different semantics.
> > xfs_trans_ail_delete() expects the caller to have the AIL lock and
> > for the associated item to be AIL resident. If not, the filesystem
> > is shut down. xfs_trans_ail_remove() acquires the AIL lock, checks
> > that the item is AIL resident and calls the former if so.
> >
> > These semantics lead to confused usage between the two. For example,
> > the _remove() variant takes a shutdown parameter to pass to the
> > _delete() variant, but immediately returns if the AIL bit is not
> > set. This means that _remove() would never shut down if an item is
> > not AIL resident, even though it appears that many callers would
> > expect it to.
> >
> > Make the following changes to clean up both of these functions:
> >
> > - Most callers of xfs_trans_ail_delete() acquire the AIL lock just
> >   before the call. Update _delete() to acquire the lock and open
> >   code the couple of callers that make additional checks under AIL
> >   lock.
> > - Drop the unnecessary ailp parameter from _delete().
> > - Drop the unused shutdown parameter from _remove() and open code
> >   the implementation.
> >
> > In summary, this leaves a _delete() variant that expects an AIL
> > resident item and a _remove() helper that checks the AIL bit. Audit
> > the existing callsites for use of the appropriate function and
> > update as necessary.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> ....
>
> Good start, but...
>
> > @@ -1032,10 +1033,11 @@ xfs_qm_dqflush_done(
> >  		goto out;
> >
> >  	spin_lock(&ailp->ail_lock);
> > -	if (lip->li_lsn == qip->qli_flush_lsn)
> > -		/* xfs_trans_ail_delete() drops the AIL lock */
> > -		xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
> > -	else
> > +	if (lip->li_lsn == qip->qli_flush_lsn) {
> > +		/* xfs_ail_update_finish() drops the AIL lock */
> > +		tail_lsn = xfs_ail_delete_one(ailp, lip);
> > +		xfs_ail_update_finish(ailp, tail_lsn);
> > +	} else
> >  		spin_unlock(&ailp->ail_lock);
>
> This drops the shutdown if the dquot is not in the AIL. It should be
> in the AIL, so if it isn't we should be shutting down...
>

It might not be in the AIL if we're in quotacheck because it does
everything in-core.

> > @@ -872,13 +872,14 @@ xfs_ail_delete_one(
> >   */
> >  void
> >  xfs_trans_ail_delete(
> > -	struct xfs_ail		*ailp,
> >  	struct xfs_log_item	*lip,
> >  	int			shutdown_type)
> >  {
> > +	struct xfs_ail		*ailp = lip->li_ailp;
> >  	struct xfs_mount	*mp = ailp->ail_mount;
> >  	xfs_lsn_t		tail_lsn;
> >
> > +	spin_lock(&ailp->ail_lock);
> >  	if (!test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
> >  		spin_unlock(&ailp->ail_lock);
> >  		if (!XFS_FORCED_SHUTDOWN(mp)) {
> > diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
> > index 9135afdcee9d..7563c78e2997 100644
> > --- a/fs/xfs/xfs_trans_priv.h
> > +++ b/fs/xfs/xfs_trans_priv.h
> > @@ -94,22 +94,23 @@ xfs_trans_ail_update(
> >  xfs_lsn_t xfs_ail_delete_one(struct xfs_ail *ailp, struct xfs_log_item *lip);
> >  void xfs_ail_update_finish(struct xfs_ail *ailp, xfs_lsn_t old_lsn)
> >  			__releases(ailp->ail_lock);
> > -void xfs_trans_ail_delete(struct xfs_ail *ailp, struct xfs_log_item *lip,
> > -				int shutdown_type);
> > +void xfs_trans_ail_delete(struct xfs_log_item *lip, int shutdown_type);
> >
> >  static inline void
> >  xfs_trans_ail_remove(
> > -	struct xfs_log_item	*lip,
> > -	int			shutdown_type)
> > +	struct xfs_log_item	*lip)
> >  {
> >  	struct xfs_ail		*ailp = lip->li_ailp;
> > +	xfs_lsn_t		tail_lsn;
> >
> >  	spin_lock(&ailp->ail_lock);
> > -	/* xfs_trans_ail_delete() drops the AIL lock */
> > -	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags))
> > -		xfs_trans_ail_delete(ailp, lip, shutdown_type);
> > -	else
> > +	/* xfs_ail_update_finish() drops the AIL lock */
> > +	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
> > +		tail_lsn = xfs_ail_delete_one(ailp, lip);
> > +		xfs_ail_update_finish(ailp, tail_lsn);
> > +	} else {
> >  		spin_unlock(&ailp->ail_lock);
> > +	}
> >  }
>
> This makes xfs_trans_ail_delete() and xfs_trans_ail_remove() almost
> identical, except one will shutdown if the item is not in the AIL
> and the other won't. Wouldn't it be better to get it down to just
> one function that does everything, and remove the confusion of which
> to use altogether?
>

Yes, I was thinking about doing this when working on this patch but
determined it was easier to fix up both functions first and then
consider combining them in a separate step, but then never got back to
it. That might have been before I ended up open-coding some of the other
sites too so the end result wasn't really clear to me. Regardless, I'll
take another look and fold that change in if it makes sense..

Brian

> void
> xfs_trans_ail_delete(
> 	struct xfs_log_item	*lip,
> 	int			shutdown)
> {
> 	struct xfs_ail		*ailp = lip->li_ailp;
>
> 	spin_lock(&ailp->ail_lock);
> 	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
> 		xfs_lsn_t tail_lsn = xfs_ail_delete_one(ailp, lip);
> 		xfs_ail_update_finish(ailp, tail_lsn);
> 		return;
> 	}
> 	spin_unlock(&ailp->ail_lock);
> 	if (!shutdown)
> 		return;
>
> 	/* do shutdown stuff */
> }
>
> -Dave.
>
> --
> Dave Chinner
> david@fromorbit.com
>
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index ee6f4229cebc..909221a4a8ab 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -51,7 +51,7 @@ xfs_bui_release(
 {
 	ASSERT(atomic_read(&buip->bui_refcount) > 0);
 	if (atomic_dec_and_test(&buip->bui_refcount)) {
-		xfs_trans_ail_remove(&buip->bui_item, SHUTDOWN_LOG_IO_ERROR);
+		xfs_trans_ail_delete(&buip->bui_item, SHUTDOWN_LOG_IO_ERROR);
 		xfs_bui_item_free(buip);
 	}
 }
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 6b000f855e13..1bf1c14d4ebb 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -410,7 +410,6 @@ xfs_buf_item_unpin(
 {
 	struct xfs_buf_log_item	*bip = BUF_ITEM(lip);
 	xfs_buf_t		*bp = bip->bli_buf;
-	struct xfs_ail		*ailp = lip->li_ailp;
 	int			stale = bip->bli_flags & XFS_BLI_STALE;
 	int			freed;
 
@@ -463,8 +462,7 @@ xfs_buf_item_unpin(
 			list_del_init(&bp->b_li_list);
 			bp->b_iodone = NULL;
 		} else {
-			spin_lock(&ailp->ail_lock);
-			xfs_trans_ail_delete(ailp, lip, SHUTDOWN_LOG_IO_ERROR);
+			xfs_trans_ail_delete(lip, SHUTDOWN_LOG_IO_ERROR);
 			xfs_buf_item_relse(bp);
 			ASSERT(bp->b_log_item == NULL);
 		}
@@ -568,7 +566,7 @@ xfs_buf_item_put(
 	 * state.
 	 */
 	if (aborted)
-		xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
+		xfs_trans_ail_delete(lip, SHUTDOWN_LOG_IO_ERROR);
 	xfs_buf_item_relse(bip->bli_buf);
 	return true;
 }
@@ -1209,22 +1207,19 @@ xfs_buf_iodone(
 	struct xfs_buf		*bp,
 	struct xfs_log_item	*lip)
 {
-	struct xfs_ail		*ailp = lip->li_ailp;
-
 	ASSERT(BUF_ITEM(lip)->bli_buf == bp);
 
 	xfs_buf_rele(bp);
 
 	/*
-	 * If we are forcibly shutting down, this may well be
-	 * off the AIL already. That's because we simulate the
-	 * log-committed callbacks to unpin these buffers. Or we may never
-	 * have put this item on AIL because of the transaction was
-	 * aborted forcibly. xfs_trans_ail_delete() takes care of these.
+	 * If we are forcibly shutting down, this may well be off the AIL
+	 * already. That's because we simulate the log-committed callbacks to
+	 * unpin these buffers. Or we may never have put this item on AIL
+	 * because of the transaction was aborted forcibly.
+	 * xfs_trans_ail_delete() takes care of these.
 	 *
 	 * Either way, AIL is useless if we're forcing a shutdown.
 	 */
-	spin_lock(&ailp->ail_lock);
-	xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
+	xfs_trans_ail_delete(lip, SHUTDOWN_CORRUPT_INCORE);
 	xfs_buf_item_free(BUF_ITEM(lip));
 }
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 953059235130..996751dd6302 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1021,6 +1021,7 @@ xfs_qm_dqflush_done(
 	struct xfs_dq_logitem	*qip = (struct xfs_dq_logitem *)lip;
 	struct xfs_dquot	*dqp = qip->qli_dquot;
 	struct xfs_ail		*ailp = lip->li_ailp;
+	xfs_lsn_t		tail_lsn;
 
 	/*
 	 * Only pull the item from the AIL if its location in the log has not
@@ -1032,10 +1033,11 @@ xfs_qm_dqflush_done(
 		goto out;
 
 	spin_lock(&ailp->ail_lock);
-	if (lip->li_lsn == qip->qli_flush_lsn)
-		/* xfs_trans_ail_delete() drops the AIL lock */
-		xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
-	else
+	if (lip->li_lsn == qip->qli_flush_lsn) {
+		/* xfs_ail_update_finish() drops the AIL lock */
+		tail_lsn = xfs_ail_delete_one(ailp, lip);
+		xfs_ail_update_finish(ailp, tail_lsn);
+	} else
 		spin_unlock(&ailp->ail_lock);
 
 out:
@@ -1137,7 +1139,7 @@ xfs_qm_dqflush(
 
 out_abort:
 	dqp->dq_flags &= ~XFS_DQ_DIRTY;
-	xfs_trans_ail_remove(lip, SHUTDOWN_CORRUPT_INCORE);
+	xfs_trans_ail_remove(lip);
 	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
 out_unlock:
 	xfs_dqfunlock(dqp);
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index 582b3796a0c9..f129cfcc36be 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -329,7 +329,7 @@ xfs_qm_qoff_logitem_relse(
 	ASSERT(test_bit(XFS_LI_IN_AIL, &lip->li_flags) ||
 	       test_bit(XFS_LI_ABORTED, &lip->li_flags) ||
 	       XFS_FORCED_SHUTDOWN(lip->li_mountp));
-	xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
+	xfs_trans_ail_remove(lip);
 	kmem_free(lip->li_lv_shadow);
 	kmem_free(qoff);
 }
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 6ea847f6e298..cd98eba24884 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -55,7 +55,7 @@ xfs_efi_release(
 {
 	ASSERT(atomic_read(&efip->efi_refcount) > 0);
 	if (atomic_dec_and_test(&efip->efi_refcount)) {
-		xfs_trans_ail_remove(&efip->efi_item, SHUTDOWN_LOG_IO_ERROR);
+		xfs_trans_ail_delete(&efip->efi_item, SHUTDOWN_LOG_IO_ERROR);
 		xfs_efi_item_free(efip);
 	}
 }
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 0ae61844b224..f8dd9bb8c851 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -763,11 +763,7 @@ xfs_iflush_abort(
 	xfs_inode_log_item_t	*iip = ip->i_itemp;
 
 	if (iip) {
-		if (test_bit(XFS_LI_IN_AIL, &iip->ili_item.li_flags)) {
-			xfs_trans_ail_remove(&iip->ili_item,
-					     stale ? SHUTDOWN_LOG_IO_ERROR :
-						     SHUTDOWN_CORRUPT_INCORE);
-		}
+		xfs_trans_ail_remove(&iip->ili_item);
 		iip->ili_logged = 0;
 		/*
 		 * Clear the ili_last_fields bits now that we know that the
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 8eeed73928cd..712939a015a9 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -50,7 +50,7 @@ xfs_cui_release(
 {
 	ASSERT(atomic_read(&cuip->cui_refcount) > 0);
 	if (atomic_dec_and_test(&cuip->cui_refcount)) {
-		xfs_trans_ail_remove(&cuip->cui_item, SHUTDOWN_LOG_IO_ERROR);
+		xfs_trans_ail_delete(&cuip->cui_item, SHUTDOWN_LOG_IO_ERROR);
 		xfs_cui_item_free(cuip);
 	}
 }
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 4911b68f95dd..ff949b32c051 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -50,7 +50,7 @@ xfs_rui_release(
 {
 	ASSERT(atomic_read(&ruip->rui_refcount) > 0);
 	if (atomic_dec_and_test(&ruip->rui_refcount)) {
-		xfs_trans_ail_remove(&ruip->rui_item, SHUTDOWN_LOG_IO_ERROR);
+		xfs_trans_ail_delete(&ruip->rui_item, SHUTDOWN_LOG_IO_ERROR);
 		xfs_rui_item_free(ruip);
 	}
 }
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 6af609070143..80acdb89bd6e 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -872,13 +872,14 @@ xfs_ail_delete_one(
  */
 void
 xfs_trans_ail_delete(
-	struct xfs_ail		*ailp,
 	struct xfs_log_item	*lip,
 	int			shutdown_type)
 {
+	struct xfs_ail		*ailp = lip->li_ailp;
 	struct xfs_mount	*mp = ailp->ail_mount;
 	xfs_lsn_t		tail_lsn;
 
+	spin_lock(&ailp->ail_lock);
 	if (!test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
 		spin_unlock(&ailp->ail_lock);
 		if (!XFS_FORCED_SHUTDOWN(mp)) {
diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
index 9135afdcee9d..7563c78e2997 100644
--- a/fs/xfs/xfs_trans_priv.h
+++ b/fs/xfs/xfs_trans_priv.h
@@ -94,22 +94,23 @@ xfs_trans_ail_update(
 xfs_lsn_t xfs_ail_delete_one(struct xfs_ail *ailp, struct xfs_log_item *lip);
 void xfs_ail_update_finish(struct xfs_ail *ailp, xfs_lsn_t old_lsn)
 			__releases(ailp->ail_lock);
-void xfs_trans_ail_delete(struct xfs_ail *ailp, struct xfs_log_item *lip,
-				int shutdown_type);
+void xfs_trans_ail_delete(struct xfs_log_item *lip, int shutdown_type);
 
 static inline void
 xfs_trans_ail_remove(
-	struct xfs_log_item	*lip,
-	int			shutdown_type)
+	struct xfs_log_item	*lip)
 {
 	struct xfs_ail		*ailp = lip->li_ailp;
+	xfs_lsn_t		tail_lsn;
 
 	spin_lock(&ailp->ail_lock);
-	/* xfs_trans_ail_delete() drops the AIL lock */
-	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags))
-		xfs_trans_ail_delete(ailp, lip, shutdown_type);
-	else
+	/* xfs_ail_update_finish() drops the AIL lock */
+	if (test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
+		tail_lsn = xfs_ail_delete_one(ailp, lip);
+		xfs_ail_update_finish(ailp, tail_lsn);
+	} else {
 		spin_unlock(&ailp->ail_lock);
+	}
 }
 
 void xfs_ail_push(struct xfs_ail *, xfs_lsn_t);
We have two AIL removal functions with slightly different semantics.
xfs_trans_ail_delete() expects the caller to have the AIL lock and
for the associated item to be AIL resident. If not, the filesystem
is shut down. xfs_trans_ail_remove() acquires the AIL lock, checks
that the item is AIL resident and calls the former if so.

These semantics lead to confused usage between the two. For example,
the _remove() variant takes a shutdown parameter to pass to the
_delete() variant, but immediately returns if the AIL bit is not
set. This means that _remove() would never shut down if an item is
not AIL resident, even though it appears that many callers would
expect it to.

Make the following changes to clean up both of these functions:

- Most callers of xfs_trans_ail_delete() acquire the AIL lock just
  before the call. Update _delete() to acquire the lock and open
  code the couple of callers that make additional checks under AIL
  lock.
- Drop the unnecessary ailp parameter from _delete().
- Drop the unused shutdown parameter from _remove() and open code
  the implementation.

In summary, this leaves a _delete() variant that expects an AIL
resident item and a _remove() helper that checks the AIL bit. Audit
the existing callsites for use of the appropriate function and
update as necessary.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_bmap_item.c     |  2 +-
 fs/xfs/xfs_buf_item.c      | 21 ++++++++-------------
 fs/xfs/xfs_dquot.c         | 12 +++++++-----
 fs/xfs/xfs_dquot_item.c    |  2 +-
 fs/xfs/xfs_extfree_item.c  |  2 +-
 fs/xfs/xfs_inode_item.c    |  6 +-----
 fs/xfs/xfs_refcount_item.c |  2 +-
 fs/xfs/xfs_rmap_item.c     |  2 +-
 fs/xfs/xfs_trans_ail.c     |  3 ++-
 fs/xfs/xfs_trans_priv.h    | 17 +++++++++--------
 10 files changed, 32 insertions(+), 37 deletions(-)
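
The pre-patch behaviour described in the commit message, where _remove() accepts a shutdown argument it can never act on, can be reproduced with a small userspace model. This is only a sketch under stated assumptions: `struct toy_ail`, `struct toy_item`, `toy_delete_old()` and `toy_remove_old()` are hypothetical names, a boolean stands in for the AIL spinlock, and recording `shutdown_type` stands in for the real shutdown path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the AIL and a log item. */
struct toy_ail {
	bool	locked;		/* models ailp->ail_lock being held */
	int	shutdown_type;	/* 0 means "not shut down" */
};

struct toy_item {
	struct toy_ail	*ailp;
	bool		in_ail;	/* models the XFS_LI_IN_AIL bit */
};

/* Pre-patch _delete(): the caller holds the lock; it is dropped in here. */
static void toy_delete_old(struct toy_ail *ailp, struct toy_item *lip,
			   int shutdown_type)
{
	assert(ailp->locked);
	if (!lip->in_ail) {
		ailp->locked = false;
		ailp->shutdown_type = shutdown_type;	/* shut down */
		return;
	}
	lip->in_ail = false;
	ailp->locked = false;
}

/* Pre-patch _remove(): shutdown_type is forwarded only for resident items. */
static void toy_remove_old(struct toy_item *lip, int shutdown_type)
{
	struct toy_ail *ailp = lip->ailp;

	ailp->locked = true;
	if (lip->in_ail)
		toy_delete_old(ailp, lip, shutdown_type);
	else
		ailp->locked = false;	/* returns without shutting down */
}
```

Calling `toy_remove_old()` on a non-resident item leaves `shutdown_type` at 0 whatever the caller passed, while `toy_delete_old()` on the same item records the shutdown. That asymmetry is the confusion the patch sets out to remove.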