Message ID | 20210218201458.718889-1-bfoster@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | xfs: don't call into blockgc scan with freeze protection |
On Thu, Feb 18, 2021 at 03:14:58PM -0500, Brian Foster wrote:
> fstest xfs/167 produced a lockdep splat that complained about a
> nested transaction acquiring freeze protection during an eofblocks
> scan. Drop freeze protection around the block reclaim scan in the
> transaction allocation code to avoid this problem.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>

I think it seems reasonable, though I really wish that other patchset to
clean up all the "modify thread state when we start/end transactions"
had landed.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_trans.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index 44f72c09c203..c32c62d3b77a 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -261,6 +261,7 @@ xfs_trans_alloc(
>  {
>  	struct xfs_trans	*tp;
>  	int			error;
> +	bool			retried = false;
>
>  	/*
>  	 * Allocate the handle before we do our freeze accounting and setting up
> @@ -288,19 +289,27 @@ xfs_trans_alloc(
>  	INIT_LIST_HEAD(&tp->t_dfops);
>  	tp->t_firstblock = NULLFSBLOCK;
>
> +retry:
>  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> -	if (error == -ENOSPC) {
> +	if (error == -ENOSPC && !retried) {
>  		/*
>  		 * We weren't able to reserve enough space for the transaction.
>  		 * Flush the other speculative space allocations to free space.
>  		 * Do not perform a synchronous scan because callers can hold
>  		 * other locks.
>  		 */
> +		retried = true;
> +		if (!(flags & XFS_TRANS_NO_WRITECOUNT))
> +			sb_end_intwrite(mp->m_super);
>  		error = xfs_blockgc_free_space(mp, NULL);
> -		if (!error)
> -			error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> -	}
> -	if (error) {
> +		if (error) {
> +			kmem_cache_free(xfs_trans_zone, tp);
> +			return error;
> +		}
> +		if (!(flags & XFS_TRANS_NO_WRITECOUNT))
> +			sb_start_intwrite(mp->m_super);
> +		goto retry;
> +	} else if (error) {
>  		xfs_trans_cancel(tp);
>  		return error;
>  	}
> --
> 2.26.2
>

On Thu, Feb 18, 2021 at 07:23:09PM -0800, Darrick J. Wong wrote:
> On Thu, Feb 18, 2021 at 03:14:58PM -0500, Brian Foster wrote:
> > fstest xfs/167 produced a lockdep splat that complained about a
> > nested transaction acquiring freeze protection during an eofblocks
> > scan. Drop freeze protection around the block reclaim scan in the
> > transaction allocation code to avoid this problem.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
>
> I think it seems reasonable, though I really wish that other patchset to
> clean up all the "modify thread state when we start/end transactions"
> had landed.
>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
>
> --D
>
> > ---
> >  fs/xfs/xfs_trans.c | 19 ++++++++++++++-----
> >  1 file changed, 14 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > index 44f72c09c203..c32c62d3b77a 100644
> > --- a/fs/xfs/xfs_trans.c
> > +++ b/fs/xfs/xfs_trans.c
> > @@ -261,6 +261,7 @@ xfs_trans_alloc(
> >  {
> >  	struct xfs_trans	*tp;
> >  	int			error;
> > +	bool			retried = false;
> >
> >  	/*
> >  	 * Allocate the handle before we do our freeze accounting and setting up
> > @@ -288,19 +289,27 @@ xfs_trans_alloc(
> >  	INIT_LIST_HEAD(&tp->t_dfops);
> >  	tp->t_firstblock = NULLFSBLOCK;
> >
> > +retry:
> >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > -	if (error == -ENOSPC) {
> > +	if (error == -ENOSPC && !retried) {
> >  		/*
> >  		 * We weren't able to reserve enough space for the transaction.
> >  		 * Flush the other speculative space allocations to free space.
> >  		 * Do not perform a synchronous scan because callers can hold
> >  		 * other locks.
> >  		 */
> > +		retried = true;
> > +		if (!(flags & XFS_TRANS_NO_WRITECOUNT))
> > +			sb_end_intwrite(mp->m_super);
> >  		error = xfs_blockgc_free_space(mp, NULL);
> > -		if (!error)
> > -			error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > -	}
> > -	if (error) {
> > +		if (error) {
> > +			kmem_cache_free(xfs_trans_zone, tp);
> > +			return error;
> > +		}

This seems dangerous to me. If xfs_trans_reserve() adds anything to
the transaction even if it fails, this will fail to free it. e.g.
xfs_log_reserve() call allocate a ticket and attach it to the
transaction and *then fail*. This code will now leak that ticket.

Cheers,

Dave.

On Fri, Feb 19, 2021 at 03:56:58PM +1100, Dave Chinner wrote:
> On Thu, Feb 18, 2021 at 07:23:09PM -0800, Darrick J. Wong wrote:
> > On Thu, Feb 18, 2021 at 03:14:58PM -0500, Brian Foster wrote:
> > > fstest xfs/167 produced a lockdep splat that complained about a
> > > nested transaction acquiring freeze protection during an eofblocks
> > > scan. Drop freeze protection around the block reclaim scan in the
> > > transaction allocation code to avoid this problem.
> > >
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
...
> > >  fs/xfs/xfs_trans.c | 19 ++++++++++++++-----
> > >  1 file changed, 14 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > > index 44f72c09c203..c32c62d3b77a 100644
> > > --- a/fs/xfs/xfs_trans.c
> > > +++ b/fs/xfs/xfs_trans.c
...
> > > @@ -288,19 +289,27 @@ xfs_trans_alloc(
> > >  	INIT_LIST_HEAD(&tp->t_dfops);
> > >  	tp->t_firstblock = NULLFSBLOCK;
> > >
> > > +retry:
> > >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > -	if (error == -ENOSPC) {
> > > +	if (error == -ENOSPC && !retried) {
> > >  		/*
> > >  		 * We weren't able to reserve enough space for the transaction.
> > >  		 * Flush the other speculative space allocations to free space.
> > >  		 * Do not perform a synchronous scan because callers can hold
> > >  		 * other locks.
> > >  		 */
> > > +		retried = true;
> > > +		if (!(flags & XFS_TRANS_NO_WRITECOUNT))
> > > +			sb_end_intwrite(mp->m_super);
> > >  		error = xfs_blockgc_free_space(mp, NULL);
> > > -		if (!error)
> > > -			error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > -	}
> > > -	if (error) {
> > > +		if (error) {
> > > +			kmem_cache_free(xfs_trans_zone, tp);
> > > +			return error;
> > > +		}
>
> This seems dangerous to me. If xfs_trans_reserve() adds anything to
> the transaction even if it fails, this will fail to free it. e.g.
> xfs_log_reserve() call allocate a ticket and attach it to the
> transaction and *then fail*. This code will now leak that ticket.
>

xfs_trans_reserve() ungrants the log ticket (which frees it, at least in
the allocation case) and disassociates from the transaction on error, so
I don't see how this causes any problems.

Brian

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>

On Fri, Feb 19, 2021 at 08:09:32AM -0500, Brian Foster wrote:
> On Fri, Feb 19, 2021 at 03:56:58PM +1100, Dave Chinner wrote:
> > On Thu, Feb 18, 2021 at 07:23:09PM -0800, Darrick J. Wong wrote:
> > > On Thu, Feb 18, 2021 at 03:14:58PM -0500, Brian Foster wrote:
> > > > fstest xfs/167 produced a lockdep splat that complained about a
> > > > nested transaction acquiring freeze protection during an eofblocks
> > > > scan. Drop freeze protection around the block reclaim scan in the
> > > > transaction allocation code to avoid this problem.
> > > >
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> ...
> > > >  fs/xfs/xfs_trans.c | 19 ++++++++++++++-----
> > > >  1 file changed, 14 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > > > index 44f72c09c203..c32c62d3b77a 100644
> > > > --- a/fs/xfs/xfs_trans.c
> > > > +++ b/fs/xfs/xfs_trans.c
> ...
> > > > @@ -288,19 +289,27 @@ xfs_trans_alloc(
> > > >  	INIT_LIST_HEAD(&tp->t_dfops);
> > > >  	tp->t_firstblock = NULLFSBLOCK;
> > > >
> > > > +retry:
> > > >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > > -	if (error == -ENOSPC) {
> > > > +	if (error == -ENOSPC && !retried) {
> > > >  		/*
> > > >  		 * We weren't able to reserve enough space for the transaction.
> > > >  		 * Flush the other speculative space allocations to free space.
> > > >  		 * Do not perform a synchronous scan because callers can hold
> > > >  		 * other locks.
> > > >  		 */
> > > > +		retried = true;
> > > > +		if (!(flags & XFS_TRANS_NO_WRITECOUNT))
> > > > +			sb_end_intwrite(mp->m_super);
> > > >  		error = xfs_blockgc_free_space(mp, NULL);
> > > > -		if (!error)
> > > > -			error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > > -	}
> > > > -	if (error) {
> > > > +		if (error) {
> > > > +			kmem_cache_free(xfs_trans_zone, tp);
> > > > +			return error;
> > > > +		}
> >
> > This seems dangerous to me. If xfs_trans_reserve() adds anything to
> > the transaction even if it fails, this will fail to free it. e.g.
> > xfs_log_reserve() call allocate a ticket and attach it to the
> > transaction and *then fail*. This code will now leak that ticket.
> >
>
> xfs_trans_reserve() ungrants the log ticket (which frees it, at least in
> the allocation case) and disassociates from the transaction on error, so
> I don't see how this causes any problems.

It ungrants the log ticket when it jumps to "undo_log" on error.
When xfs_log_reserve() fails, it jumps to "undo_blocks" and doesn't
ungrant the ticket. Hence potentially leaving an allocated ticket
attached to the transaction on error. xfs_trans_cancel() handles
this just fine, just freeing the transaction doesn't.

Cheers,

Dave.

On Sat, Feb 20, 2021 at 07:42:48AM +1100, Dave Chinner wrote:
> On Fri, Feb 19, 2021 at 08:09:32AM -0500, Brian Foster wrote:
> > On Fri, Feb 19, 2021 at 03:56:58PM +1100, Dave Chinner wrote:
> > > On Thu, Feb 18, 2021 at 07:23:09PM -0800, Darrick J. Wong wrote:
> > > > On Thu, Feb 18, 2021 at 03:14:58PM -0500, Brian Foster wrote:
> > > > > fstest xfs/167 produced a lockdep splat that complained about a
> > > > > nested transaction acquiring freeze protection during an eofblocks
> > > > > scan. Drop freeze protection around the block reclaim scan in the
> > > > > transaction allocation code to avoid this problem.
> > > > >
> > > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ...
> > > > >  fs/xfs/xfs_trans.c | 19 ++++++++++++++-----
> > > > >  1 file changed, 14 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > > > > index 44f72c09c203..c32c62d3b77a 100644
> > > > > --- a/fs/xfs/xfs_trans.c
> > > > > +++ b/fs/xfs/xfs_trans.c
> > ...
> > > > > @@ -288,19 +289,27 @@ xfs_trans_alloc(
> > > > >  	INIT_LIST_HEAD(&tp->t_dfops);
> > > > >  	tp->t_firstblock = NULLFSBLOCK;
> > > > >
> > > > > +retry:
> > > > >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > > > -	if (error == -ENOSPC) {
> > > > > +	if (error == -ENOSPC && !retried) {
> > > > >  		/*
> > > > >  		 * We weren't able to reserve enough space for the transaction.
> > > > >  		 * Flush the other speculative space allocations to free space.
> > > > >  		 * Do not perform a synchronous scan because callers can hold
> > > > >  		 * other locks.
> > > > >  		 */
> > > > > +		retried = true;
> > > > > +		if (!(flags & XFS_TRANS_NO_WRITECOUNT))
> > > > > +			sb_end_intwrite(mp->m_super);
> > > > >  		error = xfs_blockgc_free_space(mp, NULL);
> > > > > -		if (!error)
> > > > > -			error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > > > -	}
> > > > > -	if (error) {
> > > > > +		if (error) {
> > > > > +			kmem_cache_free(xfs_trans_zone, tp);
> > > > > +			return error;
> > > > > +		}
> > >
> > > This seems dangerous to me. If xfs_trans_reserve() adds anything to
> > > the transaction even if it fails, this will fail to free it. e.g.
> > > xfs_log_reserve() call allocate a ticket and attach it to the
> > > transaction and *then fail*. This code will now leak that ticket.
> > >
>
> > xfs_trans_reserve() ungrants the log ticket (which frees it, at least in
> > the allocation case) and disassociates from the transaction on error, so
> > I don't see how this causes any problems.
>
> It ungrants the log ticket when it jumps to "undo_log" on error.
> When xfs_log_reserve() fails, it jumps to "undo_blocks" and doesn't
> ungrant the ticket. Hence potentially leaving an allocated ticket
> attached to the transaction on error. xfs_trans_cancel() handles
> this just fine, just freeing the transaction doesn't.
>

Ah, my mistake. Must have misread the label...

Brian

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>

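The leak discussed in this exchange can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name below (`demo_ticket`, `demo_trans`, `ticket_demo_reserve()`, `ticket_demo_cancel()`, `ticket_demo_bare_free()`, `tickets_left_after_failure()`) is a hypothetical stand-in invented for illustration, and none of it is the kernel's real code. It models a reserve step that attaches a ticket and then fails, and compares a cancel-style teardown with freeing only the handle.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; not the kernel's real structures. */
struct demo_ticket { int unused; };
struct demo_trans  { struct demo_ticket *ticket; };

static int live_tickets;	/* tickets allocated and not yet freed */

/* Models a reserve step that attaches a ticket and may then fail. */
static int ticket_demo_reserve(struct demo_trans *tp, bool fail_after_attach)
{
	tp->ticket = malloc(sizeof(*tp->ticket));
	if (!tp->ticket)
		return -ENOMEM;
	live_tickets++;
	return fail_after_attach ? -ENOSPC : 0;
}

/* Models a cancel-style teardown: release attachments, then the handle. */
static void ticket_demo_cancel(struct demo_trans *tp)
{
	if (tp->ticket) {
		free(tp->ticket);
		live_tickets--;
	}
	free(tp);
}

/* Models freeing only the handle; anything attached is forgotten. */
static void ticket_demo_bare_free(struct demo_trans *tp)
{
	free(tp);
}

/* Returns how many tickets are orphaned after a failed reserve. */
int tickets_left_after_failure(bool use_cancel)
{
	struct demo_trans *tp = calloc(1, sizeof(*tp));
	struct demo_ticket *orphan;
	int before = live_tickets;
	int leaked;

	assert(tp != NULL);
	(void)ticket_demo_reserve(tp, true);	/* fails with ticket attached */
	orphan = tp->ticket;

	if (use_cancel)
		ticket_demo_cancel(tp);
	else
		ticket_demo_bare_free(tp);

	leaked = live_tickets - before;
	if (leaked) {		/* tidy up so the demo itself does not leak */
		free(orphan);
		live_tickets--;
	}
	return leaked;
}
```

In this model, `tickets_left_after_failure(true)` reports no orphaned ticket, while `tickets_left_after_failure(false)` reports one, which mirrors the difference between the two teardown paths described above.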
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 44f72c09c203..c32c62d3b77a 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -261,6 +261,7 @@ xfs_trans_alloc(
 {
 	struct xfs_trans	*tp;
 	int			error;
+	bool			retried = false;
 
 	/*
 	 * Allocate the handle before we do our freeze accounting and setting up
@@ -288,19 +289,27 @@ xfs_trans_alloc(
 	INIT_LIST_HEAD(&tp->t_dfops);
 	tp->t_firstblock = NULLFSBLOCK;
 
+retry:
 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
-	if (error == -ENOSPC) {
+	if (error == -ENOSPC && !retried) {
 		/*
 		 * We weren't able to reserve enough space for the transaction.
 		 * Flush the other speculative space allocations to free space.
 		 * Do not perform a synchronous scan because callers can hold
 		 * other locks.
 		 */
+		retried = true;
+		if (!(flags & XFS_TRANS_NO_WRITECOUNT))
+			sb_end_intwrite(mp->m_super);
 		error = xfs_blockgc_free_space(mp, NULL);
-		if (!error)
-			error = xfs_trans_reserve(tp, resp, blocks, rtextents);
-	}
-	if (error) {
+		if (error) {
+			kmem_cache_free(xfs_trans_zone, tp);
+			return error;
+		}
+		if (!(flags & XFS_TRANS_NO_WRITECOUNT))
+			sb_start_intwrite(mp->m_super);
+		goto retry;
+	} else if (error) {
 		xfs_trans_cancel(tp);
 		return error;
 	}

fstest xfs/167 produced a lockdep splat that complained about a
nested transaction acquiring freeze protection during an eofblocks
scan. Drop freeze protection around the block reclaim scan in the
transaction allocation code to avoid this problem.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_trans.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)
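
The control flow described in the commit message (drop protection, scan, retake protection, retry once) can be sketched in userspace as follows. This is a simplified model, not the kernel implementation: `struct model_fs`, `model_reserve()`, `model_scan()` and `model_alloc()` are hypothetical names, and a plain counter stands in for freeze protection. The counter records the deepest nesting observed inside the scan, so a value of 1 means the scan never ran while the caller still held protection.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model state; a counter stands in for freeze protection. */
struct model_fs {
	int protection_depth;	/* protection currently held by this thread */
	int free_blocks;	/* blocks available to reserve */
	int reclaimable;	/* blocks a scan could give back */
	int max_depth_in_scan;	/* deepest nesting observed inside the scan */
};

/* Models a space reservation that fails with -ENOSPC when short. */
static int model_reserve(struct model_fs *fs, int blocks)
{
	if (fs->free_blocks < blocks)
		return -ENOSPC;
	fs->free_blocks -= blocks;
	return 0;
}

/* Models a reclaim scan that takes its own protection while it works. */
static int model_scan(struct model_fs *fs)
{
	fs->protection_depth++;
	if (fs->protection_depth > fs->max_depth_in_scan)
		fs->max_depth_in_scan = fs->protection_depth;
	fs->free_blocks += fs->reclaimable;
	fs->reclaimable = 0;
	fs->protection_depth--;
	return 0;
}

/* Models the allocation path: on -ENOSPC, scan once unprotected, retry. */
int model_alloc(struct model_fs *fs, int blocks)
{
	bool retried = false;
	int error;

	fs->protection_depth++;			/* take protection */
retry:
	error = model_reserve(fs, blocks);
	if (error == -ENOSPC && !retried) {
		retried = true;
		fs->protection_depth--;		/* drop before scanning */
		error = model_scan(fs);
		if (error)
			return error;		/* nothing held at this point */
		fs->protection_depth++;		/* retake, then retry once */
		goto retry;
	} else if (error) {
		fs->protection_depth--;		/* cancel-style cleanup */
		return error;
	}
	return 0;	/* success: caller keeps protection until it finishes */
}
```

The `retried` flag bounds the loop to a single extra attempt, so a filesystem that is genuinely full still returns `-ENOSPC` instead of scanning forever.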