Message ID | 163961697749.3129691.10072192517670663885.stgit@magnolia (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | xfs: random fixes for 5.17 | expand |
On Wed, Dec 15, 2021 at 05:09:37PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > While I was running with KASAN and lockdep enabled, I stumbled upon an > KASAN report about a UAF to a freed CIL checkpoint. Looking at the > comment for xfs_log_item_in_current_chkpt, it seems pretty obvious to me > that the original patch to xfs_defer_finish_noroll should have done > something to lock the CIL to prevent it from switching the CIL contexts > while the predicate runs. > > For upper level code that needs to know if a given log item is new > enough not to need relogging, add a new wrapper that takes the CIL > context lock long enough to sample the current CIL context. This is > kind of racy in that the CIL can switch the contexts immediately after > sampling, but that's ok because the consequence is that the defer ops > code is a little slow to relog items. > I see the problem, but I don't think this is the right way to fix it. The CIL context lock is already a major contention point in the transaction commit code when it is only taken once per xfs_trans_commit() call. If we now potentially take it once per intent item per xfs_trans_commit() call, we're going to make the contention even worse than it already is. The current sequence is always available from the CIL itself via cil->xc_current_sequence, and we can read that without needing any locking to provide existence guarantees of the CIL structure. So.... bool xfs_log_item_in_current_chkpt( struct xfs_log_item *lip) { struct xfs_cil *cil = lip->li_mountp->m_log->l_cilp; if (list_empty(&lip->li_cil)) return false; /* * li_seq is written on the first commit of a log item to record the * first checkpoint it is written to. Hence if it is different to the * current sequence, we're in a new checkpoint. */ return lip->li_seq == READ_ONCE(cil->xc_current_sequence); } Cheers, Dave.
On Thu, Dec 16, 2021 at 03:36:07PM +1100, Dave Chinner wrote: > On Wed, Dec 15, 2021 at 05:09:37PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@kernel.org> > > > > While I was running with KASAN and lockdep enabled, I stumbled upon an > > KASAN report about a UAF to a freed CIL checkpoint. Looking at the > > comment for xfs_log_item_in_current_chkpt, it seems pretty obvious to me > > that the original patch to xfs_defer_finish_noroll should have done > > something to lock the CIL to prevent it from switching the CIL contexts > > while the predicate runs. > > > > For upper level code that needs to know if a given log item is new > > enough not to need relogging, add a new wrapper that takes the CIL > > context lock long enough to sample the current CIL context. This is > > kind of racy in that the CIL can switch the contexts immediately after > > sampling, but that's ok because the consequence is that the defer ops > > code is a little slow to relog items. > > > > I see the problem, but I don't think this is the right way to fix > it. The CIL context lock is already a major contention point in the > transaction commit code when it is only taken once per > xfs_trans_commit() call. If we now potentially take it once per > intent item per xfs_trans_commit() call, we're going to make the > contention even worse than it already is. > > The current sequence is always available from the CIL itself via > cil->xc_current_sequence, and we can read that without needing any > locking to provide existence guarantees of the CIL structure. > > So.... > > bool > xfs_log_item_in_current_chkpt( > struct xfs_log_item *lip) > { > struct xfs_cil *cil = lip->li_mountp->m_log->l_cilp; > > if (list_empty(&lip->li_cil)) > return false; > > /* > * li_seq is written on the first commit of a log item to record the > * first checkpoint it is written to. Hence if it is different to the > * current sequence, we're in a new checkpoint. > */ > return lip->li_seq == READ_ONCE(cil->xc_current_sequence); Ooh, much better! --D > } > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index a7a8e4528881..41a77f4e42ab 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -430,7 +430,7 @@ xfs_buf_item_format( if (bip->bli_flags & XFS_BLI_INODE_BUF) { if (xfs_has_v3inodes(lip->li_mountp) || !((bip->bli_flags & XFS_BLI_INODE_ALLOC_BUF) && - xfs_log_item_in_current_chkpt(lip))) + __xfs_log_item_in_current_chkpt(lip))) bip->__bli_format.blf_flags |= XFS_BLF_INODE_BUF; bip->bli_flags &= ~XFS_BLI_INODE_BUF; } diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h index dc1b77b92fc1..6c7b6981c443 100644 --- a/fs/xfs/xfs_log.h +++ b/fs/xfs/xfs_log.h @@ -132,6 +132,7 @@ struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket); void xfs_log_ticket_put(struct xlog_ticket *ticket); void xlog_cil_process_committed(struct list_head *list); +bool __xfs_log_item_in_current_chkpt(struct xfs_log_item *lip); bool xfs_log_item_in_current_chkpt(struct xfs_log_item *lip); void xfs_log_work_queue(struct xfs_mount *mp); diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c index 6c93c8ada6f3..62e84e040642 100644 --- a/fs/xfs/xfs_log_cil.c +++ b/fs/xfs/xfs_log_cil.c @@ -1441,10 +1441,10 @@ xlog_cil_force_seq( * transaction commit process when deciding what to format into the item. */ bool -xfs_log_item_in_current_chkpt( - struct xfs_log_item *lip) +__xfs_log_item_in_current_chkpt( + struct xfs_log_item *lip) { - struct xfs_cil_ctx *ctx = lip->li_mountp->m_log->l_cilp->xc_ctx; + struct xfs_cil_ctx *ctx = lip->li_mountp->m_log->l_cilp->xc_ctx; if (list_empty(&lip->li_cil)) return false; @@ -1457,6 +1457,25 @@ xfs_log_item_in_current_chkpt( return lip->li_seq == ctx->sequence; } +/* + * Check if the current log item was first committed in this sequence. + * This version locks out CIL flushes, but can only be used to gather + * a rough estimate of the answer. + */ +bool +xfs_log_item_in_current_chkpt( + struct xfs_log_item *lip) +{ + struct xfs_cil *cil = lip->li_mountp->m_log->l_cilp; + bool ret; + + down_read(&cil->xc_ctx_lock); + ret = __xfs_log_item_in_current_chkpt(lip); + up_read(&cil->xc_ctx_lock); + + return ret; +} + /* * Perform initial CIL structure initialisation. */