mbox series

[PATCHSET,v26.0,0/6] xfs: online repair of inodes and forks

Message ID 169049626432.922543.2560381879385116722.stgit@frogsfrogsfrogs (mailing list archive)
Headers show
Series xfs: online repair of inodes and forks | expand

Message

Darrick J. Wong July 27, 2023, 10:21 p.m. UTC
Hi all,

In this series, online repair gains the ability to repair inode records.
To do this, we must repair the ondisk inode and fork information enough
to pass the iget verifiers and hence make the inode igettable again.
Once that's done, we can perform higher level repairs on the incore
inode.  The fstests counterpart of this patchset implements stress
testing of repair.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-inodes

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=repair-inodes

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=repair-inodes
---
 fs/xfs/Makefile                    |    1 
 fs/xfs/libxfs/xfs_attr_leaf.c      |   32 -
 fs/xfs/libxfs/xfs_attr_leaf.h      |    2 
 fs/xfs/libxfs/xfs_bmap.c           |   22 
 fs/xfs/libxfs/xfs_bmap.h           |    2 
 fs/xfs/libxfs/xfs_dir2_priv.h      |    2 
 fs/xfs/libxfs/xfs_dir2_sf.c        |   29 -
 fs/xfs/libxfs/xfs_format.h         |    3 
 fs/xfs/libxfs/xfs_shared.h         |    1 
 fs/xfs/libxfs/xfs_symlink_remote.c |   21 
 fs/xfs/scrub/alloc.c               |    2 
 fs/xfs/scrub/bmap.c                |    4 
 fs/xfs/scrub/common.c              |   26 +
 fs/xfs/scrub/common.h              |    8 
 fs/xfs/scrub/dir.c                 |   21 
 fs/xfs/scrub/inode.c               |   14 
 fs/xfs/scrub/inode_repair.c        | 1607 ++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/parent.c              |   10 
 fs/xfs/scrub/repair.c              |   47 +
 fs/xfs/scrub/repair.h              |   28 +
 fs/xfs/scrub/rtbitmap.c            |    4 
 fs/xfs/scrub/rtsummary.c           |    4 
 fs/xfs/scrub/scrub.c               |    2 
 fs/xfs/scrub/trace.h               |  174 ++++
 24 files changed, 2023 insertions(+), 43 deletions(-)
 create mode 100644 fs/xfs/scrub/inode_repair.c

Comments

Dave Chinner Aug. 9, 2023, 9:44 a.m. UTC | #1
On Thu, Jul 27, 2023 at 03:21:08PM -0700, Darrick J. Wong wrote:
> Hi all,
> 
> In this series, online repair gains the ability to repair inode records.
> To do this, we must repair the ondisk inode and fork information enough
> to pass the iget verifiers and hence make the inode igettable again.
> Once that's done, we can perform higher level repairs on the incore
> inode.  The fstests counterpart of this patchset implements stress
> testing of repair.
> 
> If you're going to start using this mess, you probably ought to just
> pull from my git trees, which are linked below.

Most of this makes sense. I think the main thing I'd suggest is
documenting the repair decisions being made and how things that get
zapped are then rebuilt - it seems like there is a lot of dependency
on running other parts of repair after zapping for things to be
rebuilt, but it's not immediately clear how the bits are supposed to
go together so a little bit of documentation for that would go a
long way....

-Dave.
Darrick J. Wong Aug. 10, 2023, 12:45 a.m. UTC | #2
On Wed, Aug 09, 2023 at 07:44:02PM +1000, Dave Chinner wrote:
> On Thu, Jul 27, 2023 at 03:21:08PM -0700, Darrick J. Wong wrote:
> > Hi all,
> > 
> > In this series, online repair gains the ability to repair inode records.
> > To do this, we must repair the ondisk inode and fork information enough
> > to pass the iget verifiers and hence make the inode igettable again.
> > Once that's done, we can perform higher level repairs on the incore
> > inode.  The fstests counterpart of this patchset implements stress
> > testing of repair.
> > 
> > If you're going to start using this mess, you probably ought to just
> > pull from my git trees, which are linked below.
> 
> Most of this makes sense. I think the main thing I'd suggest is
> documenting the repair decisions being made and how things that get
> zapped are then rebuilt - it seems like there is a lot of dependency
> on running other parts of repair after zapping for things to be
> rebuilt, but it's not immediately clear how the bits are supposed to
> go together so a little bit of documentation for that would go a
> long way....

Ok.  The comment for inode_repair.c now reads:

/*
 * Inode Record Repair
 * ===================
 *
 * Roughly speaking, inode problems can be classified based on whether
 * or not they trip the dinode verifiers.  If those trip, then we won't
 * be able to xfs_iget ourselves the inode.
 *
 * Therefore, the xrep_dinode_* functions fix anything that will cause
 * the inode buffer verifier or the dinode verifier.  The xrep_inode_*
 * functions fix things on live incore inodes.  The inode repair
 * functions make decisions with security and usability implications
 * when reviving a file:
 *
 * - Files with zero di_mode or a garbage di_mode are converted to a
 * file that only root can read.  If the immediate data fork area or
 * block 0 of the data fork look like a directory, the file type will be
 * set to a directory.  If the immediate data fork area has no nulls, it
 * will be turned into a symbolic link.  Otherwise, it is turned into a
 * regular file.  This file may not actually contain user data, if the
 * file was not previously a regular file.  Setuid and setgid bits are
 * cleared.
 *
 * - Zero-size directories can be truncated to look empty.  It is
 * necessary to run the bmapbtd and directory repair functions to fully
 * rebuild the directory.
 *
 * - Zero-size symbolic link targets can be truncated to '.'.  It is
 * necessary to run the bmapbtd and symlink repair functions to salvage
 * the symlink.
 *
 * - Invalid extent size hints will be removed.
 *
 * - Quotacheck will be scheduled if we repaired an inode that was so
 * badly damaged that the ondisk inode had to be rebuilt.
 *
 * - Invalid user, group, or project IDs (aka -1U) will be reset to
 * zero.  Setuid and setgid bits are cleared.
 *
 * - Data and attr forks are reset to extents format with zero extents
 * if the fork data is inconsistent.  It is necessary to run the bmapbtd
 * or bmapbta repair functions to recover the space mapping.
 *
 * - ACLs will not be recovered if the attr fork is zapped or the
 * extended attribute structure itself requires salvaging.
 *
 * - If the attr fork is zapped, the user and group ids are reset to
 * root and the setuid and setgid bits are removed.
 */

--D

> -Dave.
> 
> -- 
> Dave Chinner
> david@fromorbit.com