
[GIT,PULL,11/11] xfs_repair: support more than 4 billion records

Message ID 171339162291.1911630.9932999805644506997.stg-ugh@frogsfrogsfrogs (mailing list archive)
State New
Series [GIT,PULL,01/11] xfsprogs: packaging fixes for 6.7

Pull-request

https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git tags/repair-support-4bn-records-6.8_2024-04-17

Message

Darrick J. Wong April 17, 2024, 10:10 p.m. UTC
Hi Carlos,

Please pull this branch with changes for xfsprogs for 6.8.

As usual, I did a test-merge with the main upstream branch as of a few
minutes ago, and didn't see any conflicts.  Please let me know if you
encounter any problems.

The following changes since commit b3bcb8f0a8b5763defc09bc6d9a04da275ad780a:

xfs_repair: rebuild block mappings from rmapbt data (2024-04-17 14:06:28 -0700)

are available in the Git repository at:

https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git tags/repair-support-4bn-records-6.8_2024-04-17

for you to fetch changes up to 90ee2c3a94511da87929989a06199fd537c94db4:

xfs_repair: support more than INT_MAX block maps (2024-04-17 14:06:28 -0700)

----------------------------------------------------------------
xfs_repair: support more than 4 billion records [11/20]

I started looking through all the places where XFS has to deal with the
rc_refcount attribute of refcount records, and noticed that offline
repair doesn't handle the situation where there are more than 2^32
reverse mappings in an AG, or where there are more than 2^32 owners of
a particular piece of AG space.  I've estimated that it would take
several months to produce a filesystem with this many records, but we
really ought to do better at handling them than crashing or (worse) not
crashing and writing out corrupt btrees due to integer truncation.
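
To make the truncation hazard concrete, here is a minimal C sketch.  It
is an illustration only and is not code from this series; the helper
names truncate_count() and count_fits_u32() are hypothetical.  It shows
how a 64-bit record count silently loses its high bits when stored in a
32-bit field, and the kind of range check that catches the problem.

```c
#include <stdint.h>

/*
 * Hypothetical helper: store a 64-bit record count in a 32-bit field
 * the way an unchecked assignment would.  The high 32 bits are dropped.
 */
static inline uint32_t truncate_count(uint64_t nr_records)
{
	return (uint32_t)nr_records;
}

/* Hypothetical helper: returns 1 if the count fits in 32 bits, else 0. */
static inline int count_fits_u32(uint64_t nr_records)
{
	return nr_records <= UINT32_MAX;
}
```

With 2^32 + 5 records, truncate_count() yields 5, so a consumer of the
32-bit value would see a plausible but wrong count instead of an error.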

Once I started using the bmap_inflate debugger command to create extreme
reflink scenarios, I noticed that the memory usage of xfs_repair was
astronomical.  This turned out to be because it allocates a single huge
block mapping array for all files on the system, even though it only
uses that array for data and attr forks that map metadata blocks (e.g.
directories, xattrs, symlinks) and does not use it for regular data
files.

So I got rid of the 2^31-1 limits on the block map array and turned off
the block mapping for regular data files.  This doesn't answer the
question of what to do if there are a lot of extents, but it kicks the
can down the road until someone creates a maximally sized xattr tree,
which so far nobody's ever stuck to long enough to complain about.
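
As a rough sketch of what lifting an int-sized limit involves, the C
fragment below sizes an array from a 64-bit element count and refuses
any request whose byte size would overflow size_t.  This is an assumed
illustration, not the approach taken in repair/bmap.c; array_bytes() is
a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the bytes needed for nr elements of
 * elem_size bytes each.  Returns 0 and fills *bytes on success, or -1
 * if the product would not fit in size_t.
 */
static inline int array_bytes(uint64_t nr, size_t elem_size, size_t *bytes)
{
	if (elem_size != 0 && nr > SIZE_MAX / elem_size)
		return -1;
	*bytes = (size_t)nr * elem_size;
	return 0;
}
```

A caller would pass the resulting size to its allocator only when the
function returns 0, so an oversized count fails cleanly instead of
wrapping around to a small allocation.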

This has been running on the djcloud for months with no problems.  Enjoy!

Signed-off-by: Darrick J. Wong <djwong@kernel.org>

----------------------------------------------------------------
Darrick J. Wong (8):
      xfs_db: add a bmbt inflation command
      xfs_repair: slab and bag structs need to track more than 2^32 items
      xfs_repair: support more than 2^32 rmapbt records per AG
      xfs_repair: support more than 2^32 owners per physical block
      xfs_repair: clean up lock resources
      xfs_repair: constrain attr fork extent count
      xfs_repair: don't create block maps for data files
      xfs_repair: support more than INT_MAX block maps

db/Makefile       |  65 ++++++-
db/bmap_inflate.c | 551 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
db/command.c      |   1 +
db/command.h      |   1 +
man/man8/xfs_db.8 |  23 +++
repair/bmap.c     |  23 +--
repair/bmap.h     |   7 +-
repair/dinode.c   |  18 +-
repair/dir2.c     |   2 +-
repair/incore.c   |   9 +
repair/rmap.c     |  25 ++-
repair/rmap.h     |   4 +-
repair/slab.c     |  36 ++--
repair/slab.h     |  36 ++--
14 files changed, 725 insertions(+), 76 deletions(-)
create mode 100644 db/bmap_inflate.c