Message ID | 20250106095613.847700-15-hch@lst.de (mailing list archive) |
---|---|
State | New |
Series | [01/15] xfs: fix a double completion for buffers on in-memory targets |
On Mon, Jan 06, 2025 at 10:54:51AM +0100, Christoph Hellwig wrote:
> The dquot and inode version are very similar, which is expected given the
> overall b_li_list logic. The differences are that the inode version also
> clears the XFS_LI_FLUSHING which is defined in common but only ever set
> by the inode item, and that the dquot version takes the ail_lock over
> the list iteration. While this seems sensible given that additions and
> removals from b_li_list are protected by the ail_lock, log items are
> only added before buffer submission, and are only removed when completing
> the buffer, so nothing can change the list when retrying a buffer.

Heh, I think that's not quite true -- I think xfs_dquot_detach_buf
actually has a bug where it needs to take the buffer lock before
detaching the dquot from the b_li_list. And I think kfence just whacked
me for that on tonight's fstests run.

But that's neither here nor there. Moving along...

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_buf.c | 12 ++++++------
> fs/xfs/xfs_buf_item.h | 5 -----
> fs/xfs/xfs_dquot.c | 12 ------------
> fs/xfs/xfs_inode_item.c | 12 ------------
> 4 files changed, 6 insertions(+), 35 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 0ad3cacfdba1..1cf5d14d0d06 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1288,6 +1288,7 @@ xfs_buf_ioend_handle_error(
>  {
>  	struct xfs_mount *mp = bp->b_mount;
>  	struct xfs_error_cfg *cfg;
> +	struct xfs_log_item *lip;
>
>  	/*
>  	 * If we've already shutdown the journal because of I/O errors, there's
> @@ -1335,12 +1336,11 @@ xfs_buf_ioend_handle_error(
>  	}
>
>  	/* Still considered a transient error. Caller will schedule retries. */
> -	if (bp->b_flags & _XBF_INODES)
> -		xfs_buf_inode_io_fail(bp);
> -	else if (bp->b_flags & _XBF_DQUOTS)
> -		xfs_buf_dquot_io_fail(bp);
> -	else
> -		ASSERT(list_empty(&bp->b_li_list));
> +	list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
> +		set_bit(XFS_LI_FAILED, &lip->li_flags);
> +		clear_bit(XFS_LI_FLUSHING, &lip->li_flags);

Should dquot log items be setting XFS_LI_FLUSHING?

--D

> +	}
> +
>  	xfs_buf_ioerror(bp, 0);
>  	xfs_buf_relse(bp);
>  	return true;
> diff --git a/fs/xfs/xfs_buf_item.h b/fs/xfs/xfs_buf_item.h
> index 4d8a6aece995..8cde85259a58 100644
> --- a/fs/xfs/xfs_buf_item.h
> +++ b/fs/xfs/xfs_buf_item.h
> @@ -54,17 +54,12 @@ bool xfs_buf_item_put(struct xfs_buf_log_item *);
>  void xfs_buf_item_log(struct xfs_buf_log_item *, uint, uint);
>  bool xfs_buf_item_dirty_format(struct xfs_buf_log_item *);
>  void xfs_buf_inode_iodone(struct xfs_buf *);
> -void xfs_buf_inode_io_fail(struct xfs_buf *bp);
>  #ifdef CONFIG_XFS_QUOTA
>  void xfs_buf_dquot_iodone(struct xfs_buf *);
> -void xfs_buf_dquot_io_fail(struct xfs_buf *bp);
>  #else
>  static inline void xfs_buf_dquot_iodone(struct xfs_buf *bp)
>  {
>  }
> -static inline void xfs_buf_dquot_io_fail(struct xfs_buf *bp)
> -{
> -}
>  #endif /* CONFIG_XFS_QUOTA */
>  void xfs_buf_iodone(struct xfs_buf *);
>  bool xfs_buf_log_check_iovec(struct xfs_log_iovec *iovec);
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index f11d475898f2..78dde811ab16 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
> @@ -1229,18 +1229,6 @@ xfs_buf_dquot_iodone(
>  	}
>  }
>
> -void
> -xfs_buf_dquot_io_fail(
> -	struct xfs_buf *bp)
> -{
> -	struct xfs_log_item *lip;
> -
> -	spin_lock(&bp->b_mount->m_ail->ail_lock);
> -	list_for_each_entry(lip, &bp->b_li_list, li_bio_list)
> -		set_bit(XFS_LI_FAILED, &lip->li_flags);
> -	spin_unlock(&bp->b_mount->m_ail->ail_lock);
> -}
> -
>  /* Check incore dquot for errors before we flush. */
>  static xfs_failaddr_t
>  xfs_qm_dqflush_check(
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index 912f0b1bc3cb..4fb2e1a6ad26 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -1023,18 +1023,6 @@ xfs_buf_inode_iodone(
>  	list_splice_tail(&flushed_inodes, &bp->b_li_list);
>  }
>
> -void
> -xfs_buf_inode_io_fail(
> -	struct xfs_buf *bp)
> -{
> -	struct xfs_log_item *lip;
> -
> -	list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
> -		set_bit(XFS_LI_FAILED, &lip->li_flags);
> -		clear_bit(XFS_LI_FLUSHING, &lip->li_flags);
> -	}
> -}
> -
>  /*
>   * Clear the inode logging fields so no more flushes are attempted. If we are
>   * on a buffer list, it is now safe to remove it because the buffer is
> --
> 2.45.2
>
>
On Mon, Jan 06, 2025 at 10:55:47PM -0800, Darrick J. Wong wrote:
> On Mon, Jan 06, 2025 at 10:54:51AM +0100, Christoph Hellwig wrote:
> > The dquot and inode version are very similar, which is expected given the
> > overall b_li_list logic. The differences are that the inode version also
> > clears the XFS_LI_FLUSHING which is defined in common but only ever set
> > by the inode item, and that the dquot version takes the ail_lock over
> > the list iteration. While this seems sensible given that additions and
> > removals from b_li_list are protected by the ail_lock, log items are
> > only added before buffer submission, and are only removed when completing
> > the buffer, so nothing can change the list when retrying a buffer.
>
> Heh, I think that's not quite true -- I think xfs_dquot_detach_buf
> actually has a bug where it needs to take the buffer lock before
> detaching the dquot from the b_li_list. And I think kfence just whacked
> me for that on tonight's fstests run.

Ooops :)

> > +	list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
> > +		set_bit(XFS_LI_FAILED, &lip->li_flags);
> > +		clear_bit(XFS_LI_FLUSHING, &lip->li_flags);
>
> Should dquot log items be setting XFS_LI_FLUSHING?

That would help to avoid roundtrips into ->iop_push and thus a dqlock
(try)lock roundtrip for them. So it would be nice to have, but it's not
functionally needed.
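To make the point about ->iop_push roundtrips concrete, here is a minimal userspace sketch, not kernel code. It assumes a generic pusher that tests a flushing flag before calling the per-item push callback. `struct item`, `push_one`, `item_push`, `LI_FLUSHING` and the `push_result` values are all hypothetical stand-ins, and the `locked` field only models a trylock on the object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LI_FLUSHING	(1u << 0)	/* hypothetical flag value */

enum push_result { PUSH_SUCCESS, PUSH_FLUSHING, PUSH_LOCKED };

struct item {
	unsigned int	flags;
	bool		locked;		/* stands in for the per-object lock */
	unsigned int	trylocks;	/* counts lock attempts made by pushes */
};

/* Per-item push callback: it must try the object lock to learn anything. */
static enum push_result item_push(struct item *it)
{
	it->trylocks++;
	if (it->locked)
		return PUSH_LOCKED;
	it->flags |= LI_FLUSHING;	/* writeback is now considered in flight */
	return PUSH_SUCCESS;
}

/* Generic pusher: if the flag is already set, skip the callback entirely. */
static enum push_result push_one(struct item *it)
{
	assert(it != NULL);
	if (it->flags & LI_FLUSHING)
		return PUSH_FLUSHING;
	return item_push(it);
}
```

In this model a second push of an item that is already flushing returns early without another lock attempt, which is the saving described above. Correctness does not depend on it, since the callback would reach a safe answer on its own.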
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 0ad3cacfdba1..1cf5d14d0d06 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1288,6 +1288,7 @@ xfs_buf_ioend_handle_error(
 {
 	struct xfs_mount *mp = bp->b_mount;
 	struct xfs_error_cfg *cfg;
+	struct xfs_log_item *lip;

 	/*
 	 * If we've already shutdown the journal because of I/O errors, there's
@@ -1335,12 +1336,11 @@ xfs_buf_ioend_handle_error(
 	}

 	/* Still considered a transient error. Caller will schedule retries. */
-	if (bp->b_flags & _XBF_INODES)
-		xfs_buf_inode_io_fail(bp);
-	else if (bp->b_flags & _XBF_DQUOTS)
-		xfs_buf_dquot_io_fail(bp);
-	else
-		ASSERT(list_empty(&bp->b_li_list));
+	list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
+		set_bit(XFS_LI_FAILED, &lip->li_flags);
+		clear_bit(XFS_LI_FLUSHING, &lip->li_flags);
+	}
+
 	xfs_buf_ioerror(bp, 0);
 	xfs_buf_relse(bp);
 	return true;
diff --git a/fs/xfs/xfs_buf_item.h b/fs/xfs/xfs_buf_item.h
index 4d8a6aece995..8cde85259a58 100644
--- a/fs/xfs/xfs_buf_item.h
+++ b/fs/xfs/xfs_buf_item.h
@@ -54,17 +54,12 @@ bool xfs_buf_item_put(struct xfs_buf_log_item *);
 void xfs_buf_item_log(struct xfs_buf_log_item *, uint, uint);
 bool xfs_buf_item_dirty_format(struct xfs_buf_log_item *);
 void xfs_buf_inode_iodone(struct xfs_buf *);
-void xfs_buf_inode_io_fail(struct xfs_buf *bp);
 #ifdef CONFIG_XFS_QUOTA
 void xfs_buf_dquot_iodone(struct xfs_buf *);
-void xfs_buf_dquot_io_fail(struct xfs_buf *bp);
 #else
 static inline void xfs_buf_dquot_iodone(struct xfs_buf *bp)
 {
 }
-static inline void xfs_buf_dquot_io_fail(struct xfs_buf *bp)
-{
-}
 #endif /* CONFIG_XFS_QUOTA */
 void xfs_buf_iodone(struct xfs_buf *);
 bool xfs_buf_log_check_iovec(struct xfs_log_iovec *iovec);
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index f11d475898f2..78dde811ab16 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1229,18 +1229,6 @@ xfs_buf_dquot_iodone(
 	}
 }

-void
-xfs_buf_dquot_io_fail(
-	struct xfs_buf *bp)
-{
-	struct xfs_log_item *lip;
-
-	spin_lock(&bp->b_mount->m_ail->ail_lock);
-	list_for_each_entry(lip, &bp->b_li_list, li_bio_list)
-		set_bit(XFS_LI_FAILED, &lip->li_flags);
-	spin_unlock(&bp->b_mount->m_ail->ail_lock);
-}
-
 /* Check incore dquot for errors before we flush. */
 static xfs_failaddr_t
 xfs_qm_dqflush_check(
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 912f0b1bc3cb..4fb2e1a6ad26 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -1023,18 +1023,6 @@ xfs_buf_inode_iodone(
 	list_splice_tail(&flushed_inodes, &bp->b_li_list);
 }

-void
-xfs_buf_inode_io_fail(
-	struct xfs_buf *bp)
-{
-	struct xfs_log_item *lip;
-
-	list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
-		set_bit(XFS_LI_FAILED, &lip->li_flags);
-		clear_bit(XFS_LI_FLUSHING, &lip->li_flags);
-	}
-}
-
 /*
  * Clear the inode logging fields so no more flushes are attempted. If we are
  * on a buffer list, it is now safe to remove it because the buffer is
The dquot and inode version are very similar, which is expected given the
overall b_li_list logic. The differences are that the inode version also
clears the XFS_LI_FLUSHING which is defined in common but only ever set
by the inode item, and that the dquot version takes the ail_lock over
the list iteration. While this seems sensible given that additions and
removals from b_li_list are protected by the ail_lock, log items are
only added before buffer submission, and are only removed when completing
the buffer, so nothing can change the list when retrying a buffer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_buf.c | 12 ++++++------
fs/xfs/xfs_buf_item.h | 5 -----
fs/xfs/xfs_dquot.c | 12 ------------
fs/xfs/xfs_inode_item.c | 12 ------------
4 files changed, 6 insertions(+), 35 deletions(-)
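
The reasoning in the commit message can be sketched in userspace C as follows. This is a simplified model under stated assumptions, not the kernel implementation: `struct log_item`, `struct buf`, `buf_mark_items_failed` and the `LI_*` values are hypothetical stand-ins, a plain singly linked list replaces b_li_list, and ordinary bit operations replace the atomic set_bit/clear_bit. It shows only why one loop can serve both item types: clearing a flag that was never set changes nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the XFS_LI_* flags; the real values differ. */
#define LI_FAILED	(1u << 0)
#define LI_FLUSHING	(1u << 1)

/* Simplified log item: flags plus a link on the owning buffer's item list. */
struct log_item {
	unsigned int	flags;
	struct log_item	*next;
};

/* Simplified buffer: only the list of attached log items is modelled. */
struct buf {
	struct log_item	*li_list;
};

/*
 * Mark every item attached to the buffer as failed and no longer flushing.
 * For an item that never had LI_FLUSHING set, the clear is a no-op.
 * Returns the number of items visited.
 */
static size_t buf_mark_items_failed(struct buf *bp)
{
	size_t n = 0;

	assert(bp != NULL);
	for (struct log_item *lip = bp->li_list; lip; lip = lip->next) {
		lip->flags |= LI_FAILED;
		lip->flags &= ~LI_FLUSHING;
		n++;
	}
	return n;
}
```

The model takes no lock around the walk, matching the argument that nothing adds or removes items while a buffer is being retried. Whether that assumption holds for every caller in the real code is exactly what the review questions for xfs_dquot_detach_buf.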