Message ID | 170069441966.1865809.4282467818590298794.stgit@frogsfrogsfrogs (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | xfsprogs: minor fixes for 6.6 | expand |
On Wed, Nov 22, 2023 at 03:06:59PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > In the kernel, commit 8ebbf262d4684 ("xfs: don't block in busy flushing > when freeing extents") changed the allocator behavior such that AGFL > fixing can return -EAGAIN in response to detection of a deadlock with > the transaction busy extent list. If this happens, we're supposed to > requeue the EFI so that we can roll the transaction and try the item > again. > > If a requeue happens, we should not free the xefi pointer in > xfs_extent_free_finish_item or else the retry will walk off a dangling > pointer. There is no extent busy list in userspace so this should > never happen, but let's fix the logic bomb anyway. > > We should have ported kernel commit 0853b5de42b47 ("xfs: allow extent > free intents to be retried") to userspace, but neither Carlos nor I > noticed this fine detail. :( It might be time to move this code into shared files? Btw, I still keep getting annoyed a bit about minor difference in some of the libxfs file due to header inclusion. Maybe we also need to be able to automate this more and require libxfs to only include libxfs headers and xfs.h? Otherwise looks good: Reviewed-by: Christoph Hellwig <hch@lst.de>
On Wed, Nov 22, 2023 at 03:06:59 PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > In the kernel, commit 8ebbf262d4684 ("xfs: don't block in busy flushing > when freeing extents") changed the allocator behavior such that AGFL > fixing can return -EAGAIN in response to detection of a deadlock with > the transaction busy extent list. If this happens, we're supposed to > requeue the EFI so that we can roll the transaction and try the item > again. > > If a requeue happens, we should not free the xefi pointer in > xfs_extent_free_finish_item or else the retry will walk off a dangling > pointer. There is no extent busy list in userspace so this should > never happen, but let's fix the logic bomb anyway. > > We should have ported kernel commit 0853b5de42b47 ("xfs: allow extent > free intents to be retried") to userspace, but neither Carlos nor I > noticed this fine detail. :( > Looks good to me. Reviewed-by: Chandan Babu R <chandanbabu@kernel.org> > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > --- > libxfs/defer_item.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > > diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c > index 3f519252046..8731d1834be 100644 > --- a/libxfs/defer_item.c > +++ b/libxfs/defer_item.c > @@ -115,6 +115,13 @@ xfs_extent_free_finish_item( > error = xfs_free_extent(tp, xefi->xefi_pag, agbno, > xefi->xefi_blockcount, &oinfo, XFS_AG_RESV_NONE); > > + /* > + * Don't free the XEFI if we need a new transaction to complete > + * processing of it. > + */ > + if (error == -EAGAIN) > + return error; > + > xfs_extent_free_put_group(xefi); > kmem_cache_free(xfs_extfree_item_cache, xefi); > return error;
On Wed, Nov 22, 2023 at 10:36:57PM -0800, Christoph Hellwig wrote: > On Wed, Nov 22, 2023 at 03:06:59PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@kernel.org> > > > > In the kernel, commit 8ebbf262d4684 ("xfs: don't block in busy flushing > > when freeing extents") changed the allocator behavior such that AGFL > > fixing can return -EAGAIN in response to detection of a deadlock with > > the transaction busy extent list. If this happens, we're supposed to > > requeue the EFI so that we can roll the transaction and try the item > > again. > > > > If a requeue happens, we should not free the xefi pointer in > > xfs_extent_free_finish_item or else the retry will walk off a dangling > > pointer. There is no extent busy list in userspace so this should > > never happen, but let's fix the logic bomb anyway. > > > > We should have ported kernel commit 0853b5de42b47 ("xfs: allow extent > > free intents to be retried") to userspace, but neither Carlos nor I > > noticed this fine detail. :( > > It might be time to move this code into shared files? I think Chandan started looking into what it would take to port the log code from the kernel into userspace. Then xfs_trans_commit in userspace could actually write transactions to the log and write them back atomically; and xfs_repair could finally lose the -L switch. > Btw, I still keep getting annoyed a bit about minor difference > in some of the libxfs file due to header inclusion. Maybe we also > need to be able to automate this more and require libxfs to only > include libxfs headers and xfs.h? <shrug> Ages ago Dave had a sketch to make xfs_{log,mount,inode}.h pull in the relevant headers so that all the other source files only had to #include at most three files. I wish the kernel had precompiled headers, then it would make sense to have one xfs.h file that pulled in the world and took advantage of caching. --D > Otherwise looks good: > > Reviewed-by: Christoph Hellwig <hch@lst.de> >
On Mon, Nov 27, 2023 at 10:10:24AM -0800, Darrick J. Wong wrote: > > It might be time to move this code into shared files? > > I think Chandan started looking into what it would take to port the log > code from the kernel into userspace. Then xfs_trans_commit in userspace > could actually write transactions to the log and write them back > atomically; and xfs_repair could finally lose the -L switch. While that does sound like a really good idea, it's now what I meant here. I think if we moved the actual defer ops instances out of the _item.c files into libxfs, I think we could reuse them for the current way of operating in userspace quite easily with strategic stubs for some functionality.
On Mon, Nov 27, 2023 at 09:37:15PM -0800, Christoph Hellwig wrote: > On Mon, Nov 27, 2023 at 10:10:24AM -0800, Darrick J. Wong wrote: > > > It might be time to move this code into shared files? > > > > I think Chandan started looking into what it would take to port the log > > code from the kernel into userspace. Then xfs_trans_commit in userspace > > could actually write transactions to the log and write them back > > atomically; and xfs_repair could finally lose the -L switch. > > While that does sound like a really good idea, it's now what I meant > here. I think if we moved the actual defer ops instances out of the > _item.c files into libxfs, I think we could reuse them for the current > way of operating in userspace quite easily with strategic stubs for > some functionality. Oh! You're talking about moving xfs_rmap_update_defer_type and the functions it points to into xfs_rmap.c, then? Hmm. I just moved ->iop_recover into xfs_defer_op_type, let me send an RFC for that. (You and I might have hit critical mass for log item cleanups... ;)) --D
On Tue, Nov 28, 2023 at 09:01:21AM -0800, Darrick J. Wong wrote: > Oh! You're talking about moving xfs_rmap_update_defer_type and the > functions it points to into xfs_rmap.c, then? Yes. > Hmm. I just moved > ->iop_recover into xfs_defer_op_type, let me send an RFC for that. > > (You and I might have hit critical mass for log item cleanups... ;)) Yeah..
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c index 3f519252046..8731d1834be 100644 --- a/libxfs/defer_item.c +++ b/libxfs/defer_item.c @@ -115,6 +115,13 @@ xfs_extent_free_finish_item( error = xfs_free_extent(tp, xefi->xefi_pag, agbno, xefi->xefi_blockcount, &oinfo, XFS_AG_RESV_NONE); + /* + * Don't free the XEFI if we need a new transaction to complete + * processing of it. + */ + if (error == -EAGAIN) + return error; + xfs_extent_free_put_group(xefi); kmem_cache_free(xfs_extfree_item_cache, xefi); return error;