[0/8,v3] xfs: various fixes for 6.5

Message ID 20230627224412.2242198-1-david@fromorbit.com (mailing list archive)

Dave Chinner June 27, 2023, 10:44 p.m. UTC
Hi folks,

This is an update of the fixes patchset I sent here:

https://lore.kernel.org/linux-xfs/20230620002021.1038067-1-david@fromorbit.com/

There are new patches in the series. Patch 2 is a new patch that
fixes a potential issue in the non-blocking busy extent flush code
that Chandan noticed, where btree block freeing could trip over busy
extents and return an -EAGAIN that isn't handled. Patches 7 and 8
are also new to this series; they are outstanding standalone bug
fixes that need review, so I've included them here, too.

Original cover letter (with patch numbers updated) follows.

-Dave.

--

This is a wrap up of version 3 of all the fixes I have recently
pushed out for review.

The first patch fixes an AIL ordering problem identified when testing
patches 3-5 in this series. This patch only addresses the AIL ordering
problem that was found; it does not address any other corner cases
in intent recovery that may be exposed by patches 3-5.

Patches 3-5 allow partial handling of multi-extent EFIs during
runtime operation and during journal recovery. This is needed as
we attempt to process all the extents in the EFI in a single
transaction and we can deadlock during AGFL fixup if the transaction
context holds the only free space in the AG in a busy extent.
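
To illustrate the idea of partial handling, here is a toy userspace
model of it, not the kernel code. Every name in it (toy_extent,
toy_free_extent_nowait, toy_process_efi) is hypothetical, and the
"busy" flag is only a stand-in for an extent that could not be freed
without blocking. It assumes the processing loop stops at the first
such extent and reports how far it got, so that the remainder can be
retried in a later transaction:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one extent recorded in an EFI. */
struct toy_extent {
	unsigned long	start;
	unsigned long	len;
	int		busy;	/* nonzero: freeing it now would have to block */
};

/* Hypothetical non-blocking free: fails with -EAGAIN rather than waiting. */
static int toy_free_extent_nowait(const struct toy_extent *ext)
{
	return ext->busy ? -EAGAIN : 0;
}

/*
 * Free extents in order within one pretend transaction. Stop at the first
 * extent that cannot be freed without blocking and report how many were
 * completed, so the caller can retry the rest in a later transaction.
 */
static int toy_process_efi(const struct toy_extent *exts, size_t count,
			   size_t *done)
{
	size_t	i;

	for (i = 0; i < count; i++) {
		if (toy_free_extent_nowait(&exts[i]) == -EAGAIN)
			break;
	}
	*done = i;
	return i == count ? 0 : -EAGAIN;
}
```

A caller of this model would commit the work covered by *done and
requeue the extents from exts[*done] onwards, rather than waiting
inside the transaction that may hold the only free space in the AG.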

This patchset uncovered a problem where log recovery would run off
the end of the list of intents to recover and into intents that
require deferred processing. This was caused by the ordering issue
fixed in patch 1.

This patchset does not attempt to address the end-of-intent-recovery
detection issues, as doing so raises other issues with intent recovery
beyond loop termination. Solving those issues requires more thought,
and the problem can largely be avoided by the first patch in the
series. As it is, CUI recovery has been vulnerable to these intent
recovery loop termination issues for several years, but we don't have
any evidence that systems have tripped over this, so the urgency to
fix the loop termination fully isn't as high as fixing the AIL bulk
insertion ordering problem that exposed it.

Finally, patch 6 addresses journal geometry validation issues. It
makes all geometry mismatches hard failures for V4 and V5
filesystems, except for the log size being too small on V4
filesystems. This addresses the problem on V4 filesystems where we'd
fail to perform the remaining validation on the geometry once we'd
detected that the log was too small or too large.
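
The policy described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not the
code in patch 6: the struct, its field names, the use of -EINVAL and
the single other_checks_ok flag standing in for "the remaining
validation" are all hypothetical. The point it shows is that a
too-small log on V4 is noted but does not short-circuit the checks
that follow it:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of the journal geometry being validated. */
struct toy_log_geom {
	bool		is_v5;
	unsigned int	log_blocks;
	unsigned int	min_log_blocks;
	unsigned int	max_log_blocks;
	bool		other_checks_ok; /* stand-in for remaining checks */
};

/*
 * Returns 0 if the geometry is acceptable, -EINVAL on a hard failure.
 * *warned is set when a V4 log is too small but otherwise tolerated.
 */
static int toy_validate_log_geometry(const struct toy_log_geom *g,
				     bool *warned)
{
	*warned = false;

	if (g->log_blocks < g->min_log_blocks) {
		if (g->is_v5)
			return -EINVAL;
		/* V4: tolerate it, but keep validating everything else. */
		*warned = true;
	}
	if (g->log_blocks > g->max_log_blocks)
		return -EINVAL;
	if (!g->other_checks_ok)
		return -EINVAL;
	return 0;
}
```

In this model a V4 filesystem with a too-small log still fails if any
later check fails, which is the behaviour the paragraph above says
was previously missing.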

This all passed fstests on v4 and v5 filesystems without
regressions.

---
Version 3:
- patch 2
  - new patch to defer block freeing for inobt and refcountbt
    blocks. This is to close a problem Chandan found during review
    of "xfs: don't block in busy flushing when freeing extents" in
    the V2 series.
- patch 7
  - pulled in AGF length bounds checking fixes patch.
  - rearranged slightly for better error discrimination
  - comments added
  - minor syntax and comment fixes
- patch 8
  - new bug fix for a memory leak regression discovered by Coverity
    during xfsprogs scan.

Version 2:
- patch 1
  - rewritten commit message
- patch 2
  - uint32_t alloc_flag conversion pushed all the way down into
    xfs_extent_busy_flush
- patch 3
  - Updated comment for xfs_efd_from_efi()
  - fixed whitespace damage
  - check error return from xfs_free_extent_later()
- patch 5
  - update error message for max LSU size check
  - fix whitespace damage
  - clean up error handling in xfs_log_mount() changes