Message ID | 151520349393.2027.11445111828418979100.stgit@magnolia (mailing list archive) |
---|---|
State | Superseded |
On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>

<man page nitpicking>

> diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
> new file mode 100644
> index 0000000..95f4fea
> --- /dev/null
> +++ b/man/man8/xfs_scrub.8
> @@ -0,0 +1,117 @@
> +.TH xfs_scrub 8
> +.SH NAME
> +xfs_scrub \- scrub the contents of an XFS filesystem
> +.SH SYNOPSIS
> +.B xfs_scrub
> +[
> +.B \-abemnTvVxy
               ^
> +]
> +.I mount-point

or block device?

> +.br
> +.B xfs_scrub \-V
                  ^

If V is special it probably shouldn't be in the first arg string?

Do you mean to hide the "-d" option?

> +.SH DESCRIPTION
> +.B xfs_scrub
> +attempts to check and repair all metadata in a mounted XFS filesystem.
> +.PP
> +.B xfs_scrub
> +asks the kernel to scrub all metadata objects in the filesystem.
> +Metadata records are scanned for obviously bad values and then
> +cross-referenced against other metadata.
> +The goal is to establish a threasonable confidence about the consistency

"reasonable"

> +of the overall filesystem by examining the consistency of individual
> +metadata records against the other metadata in the filesystem across the
> +entire filesystem.

Redundant, "examining the consistency of individual metadata records against
the other medtadata in the filesystem." would suffice.

> +Damaged metadata can be rebuilt from other metadata if there is
> +sufficient redundancy (and no other corruption) in the metadata.

Again redundant, maybe just "if there is sufficient redundancy within
other intact metadata?"

> +.PP
> +This utility does not know how to correct all errors.
> +If the tool cannot fix the detected errors, you must unmount the
> +filesystem and run
> +.B xfs_repair
> +to fix the problems.
> +If this tool is not run with either of the
> +.B \-n
> +or
> +.B \-y
> +options, then it will optimize the filesystem when possible,
> +but it will not try to fix errors.

I think the manpage needs to describe what this optimization might
involve, at least at a high level. Will it fsr all my files? Will
it trim my free space? Will it compact my directories? Will it ...?
What exactly am I agreeing to here? :)

> +.SH OPTIONS
> +.TP
> +.BI \-a " errors"
> +Abort if more than this many errors are found on the filesystem.
> +.TP
> +.B \-b
> +Run in background mode.
> +If the option is specified once, only run a single scrubbing thread at a
> +time.
> +If given more than once, an artificial delay of 100us is added to each
> +scrub call to reduce CPU overhead even further.

I wonder, should it take a value instead of -bbbbbbbbb?

> +.TP
> +.B \-e
> +Specifies what happens when errors are detected.
> +If
> +.IR shutdown
> +is given, the filesystem will be taken offline if errors are found.
> +Not all backends can shut down a filesystem.

<user> what's a backend? </user>

> +If
> +.IR continue
> +is given, no action taken if errors are found.
> +This is the default.

<user> so how do I know what errors were found? </user>

> +.TP
> +.BI \-m " file"
> +Search this file for mounted filesystems instead of /etc/mtab.
> +.TP
> +.B \-n
> +Dry run, do not modify anything in the filesystem.
> +This disables all preening and optimization behaviors, and disables
> +calling FITRIM on the free space after a successful run.

what if I only want to disable FITRIM? (-k?)

Oh, and it runs FITRIM? Can you mention that more prominently
in the behavior description?

(and should it, given that we have a tool for that purpose?)

> +.TP
> +.BI \-T
> +Print timing and memory usage information for each phase.
> +.TP
> +.B \-v
> +Enable verbose mode, which prints periodic status updates.
> +.TP
> +.B \-V
> +Prints the version number and exits.
> +.TP
> +.B \-x
> +Scrub all file data too.

colloquial? maybe s/too/as well/

> +The block list will be sorted in disk order for better performance.

Cool, so when I'm done, my filesystem will have better performance if I use -x?
and none of my files will be corrupted! ;)

The read order is probably an implementation detail that doesn't need to be in
the manpage. It may be worth changing the description a bit to make it
clearer that the purpose is to determine readability of every file block?
I mean, that should probably be obvious, but ...

> +.B xfs_scrub
> +will issue O_DIRECT reads to the block device directly.
> +If the block device is a SCSI disk, it will issue READ VERIFY commands
> +directly to the disk.

+ These actions will confirm that all file data blocks can be read from storage.

or something?

> +.TP
> +.B \-y
> +Try to repair all filesystem errors.
> +If the errors cannot be fixed online, then the filesystem must be taken
> +offline for repair.
> +.SH EXIT CODE
> +The exit code returned by
> +.B xfs_scrub
> +is the sum of the following conditions:
> +.br
> +\ 0\ \-\ No errors
> +.br
> +\ 1\ \-\ File system errors left uncorrected
> +.br
> +\ 2\ \-\ File system optimizations possible
> +.br
> +\ 4\ \-\ Operational error
> +.br
> +\ 8\ \-\ Usage or syntax error
> +.br
> +.SH CAVEATS
> +.B xfs_scrub
> +is an immature utility!

Might it damage my filesystem? ;)

> +This program takes advantage of in-kernel scrubbing to verify a given
> +data structure with locks held.
> +The kernel must support the BULKSTAT, FSGEOMETRY, FSCOUNTS, GET_RESBLKS,
> +GETBMAPX, GETFSMAP, INUMBERS, and SCRUB_METADATA ioctls.

Some of those ioctls are ancient and probably don't need to be specified...
Can you do anything at all without SCRUB_METADATA? If not,
is SCRUB_METADATA sufficient to determine that the kernel has the rest
of what it needs?

> +This can tie up the system for a while.

Maybe that's a statement to go right after "locks held"

> +.PP
> +If errors are found and cannot be repaired, the filesystem must be taken
> +offline and repaired.

"unmounted and repaired" might be more specific? *shrug*

> +.SH SEE ALSO
> +.BR xfs_repair (8).
-- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Create the foundations of a filesystem scrubbing tool that asks the
> kernel to inspect all metadata in the filesystem and (ultimately) to
> repair anything that's broken. Also create the man page for the
> utility.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

...

> +/*
> + * XFS Online Metadata Scrub (and Repair)
> + *
> + * The XFS scrubber uses custom XFS ioctls to probe more deeply into the
> + * internals of the filesystem. It takes advantage of scrubbing ioctls
> + * to check all the records stored in a metadata object and to
> + * cross-reference those records against the other filesystem metadata.
> + *
> + * After the program gathers command line arguments to figure out
> + * exactly what the user wants the program is going to do, scrub

 * exactly what the user wants the program to do

or -

 * exactly what the program is going to do

or -

 * exactly what the user wants to do

:)

> + * execution is split up into several separate phases:
> + *
> + * The "find geometry" phase queries XFS for the filesystem geometry.
> + * The block devices for the data, realtime, and log devices are opened.
> + * Kernel ioctls are test-queried to see if they actually work (the scrub
> + * ioctl in particular), and any other filesystem-specific information
> + * is gathered.
> + *
> + * In the "check internal metadata" phase, we call the metadata scrub
> + * ioctl to check the filesystem's internal per-AG btrees. This
> + * includes the AG superblock, AGF, AGFL, and AGI headers, freespace
> + * btrees, the regular and free inode btrees, the reverse mapping
> + * btrees, and the reference counting btrees. If the realtime device is
> + * enabled, the realtime bitmap and reverse mapping btrees are enabled.

checked?
-Eric
On Thu, Jan 11, 2018 at 06:16:02PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
>
> <man page nitpicking>
>
> > diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
> > new file mode 100644
> > index 0000000..95f4fea
> > --- /dev/null
> > +++ b/man/man8/xfs_scrub.8
> > @@ -0,0 +1,117 @@
> > +.TH xfs_scrub 8
> > +.SH NAME
> > +xfs_scrub \- scrub the contents of an XFS filesystem
> > +.SH SYNOPSIS
> > +.B xfs_scrub
> > +[
> > +.B \-abemnTvVxy
>                ^
> > +]
> > +.I mount-point
>
> or block device?
>
> > +.br
> > +.B xfs_scrub \-V
>                   ^
>
> If V is special it probably shouldn't be in the first arg string?

Yes, fixed.

> Do you mean to hide the "-d" option?

-d turn on debug mode; I was going to keep that hidden from users.

>
> > +.SH DESCRIPTION
> > +.B xfs_scrub
> > +attempts to check and repair all metadata in a mounted XFS filesystem.
> > +.PP
> > +.B xfs_scrub
> > +asks the kernel to scrub all metadata objects in the filesystem.
> > +Metadata records are scanned for obviously bad values and then
> > +cross-referenced against other metadata.
> > +The goal is to establish a threasonable confidence about the consistency
>
> "reasonable"

Fixed.

> > +of the overall filesystem by examining the consistency of individual
> > +metadata records against the other metadata in the filesystem across the
> > +entire filesystem.
>
> Redundant, "examining the consistency of individual metadata records against
> the other medtadata in the filesystem." would suffice.

Fixed.

> > +Damaged metadata can be rebuilt from other metadata if there is
> > +sufficient redundancy (and no other corruption) in the metadata.
>
> Again redundant, maybe just "if there is sufficient redundancy within
> other intact metadata?"

"Damaged metadata can be rebuilt from other metadata if there exists
redundant data structures which are intact." ?

> > +.PP
> > +This utility does not know how to correct all errors.
> > +If the tool cannot fix the detected errors, you must unmount the
> > +filesystem and run
> > +.B xfs_repair
> > +to fix the problems.
> > +If this tool is not run with either of the
> > +.B \-n
> > +or
> > +.B \-y
> > +options, then it will optimize the filesystem when possible,
> > +but it will not try to fix errors.
>
> I think the manpage needs to describe what this optimization might
> involve, at least at a high level. Will it fsr all my files? Will
> it trim my free space? Will it compact my directories? Will it ...?
> What exactly am I agreeing to here? :)

"Optimizations may include, but are not limited to, activities such as
compacting metadata or bypassing shared block write checks for files
that no longer share blocks."

> > +.SH OPTIONS
> > +.TP
> > +.BI \-a " errors"
> > +Abort if more than this many errors are found on the filesystem.
> > +.TP
> > +.B \-b
> > +Run in background mode.
> > +If the option is specified once, only run a single scrubbing thread at a
> > +time.
> > +If given more than once, an artificial delay of 100us is added to each
> > +scrub call to reduce CPU overhead even further.
>
> I wonder, should it take a value instead of -bbbbbbbbb?

More than ten -b and this program gets reallllly slow. There are
currently six global fs checks, ten per-AG checks, and seven per-file
checks. On my /home filesystem with 4M inodes and 32 AGs that adds up
to...

6 + (32 * 10) + (4M * 7) == ~28M scrub calls, or 324 days to perform a
scan.

> > +.TP
> > +.B \-e
> > +Specifies what happens when errors are detected.
> > +If
> > +.IR shutdown
> > +is given, the filesystem will be taken offline if errors are found.
> > +Not all backends can shut down a filesystem.
>
> <user> what's a backend? </user>

Leftover remnant from the days when this was a frankentool that could be
used to walk filesystems via the standard interfaces. I removed this
sentence.

> > +If
> > +.IR continue
> > +is given, no action taken if errors are found.
> > +This is the default.
>
> <user> so how do I know what errors were found? </user>

"Filesystem corruption and optimization opportunities will be logged to
the standard error stream."

I'll put that at the top.

> > +.TP
> > +.BI \-m " file"
> > +Search this file for mounted filesystems instead of /etc/mtab.
> > +.TP
> > +.B \-n
> > +Dry run, do not modify anything in the filesystem.
> > +This disables all preening and optimization behaviors, and disables
> > +calling FITRIM on the free space after a successful run.
>
> what if I only want to disable FITRIM? (-k?)

Oh all right. :)

> Oh, and it runs FITRIM? Can you mention that more prominently
> in the behavior description?

I'll put it in the list of optimizations.

> (and should it, given that we have a tool for that purpose?)

Yes we have fstrim but I consider it too scary to run out of the blue
without checking the health of the free space info first.

> > +.TP
> > +.BI \-T
> > +Print timing and memory usage information for each phase.
> > +.TP
> > +.B \-v
> > +Enable verbose mode, which prints periodic status updates.
> > +.TP
> > +.B \-V
> > +Prints the version number and exits.
> > +.TP
> > +.B \-x
> > +Scrub all file data too.
>
> colloquial? maybe s/too/as well/

"Read all file data extents to look for disk errors."

> > +The block list will be sorted in disk order for better performance.
>
> Cool, so when I'm done, my filesystem will have better performance if I use -x?
> and none of my files will be corrupted! ;)
>
> The read order is probably an implementation detail that doesn't need to be in
> the manpage. It may be worth changing the description a bit to make it
> clearer that the purpose is to determine readability of every file block?
> I mean, that should probably be obvious, but ...

Eh, I'll just remove it.

> > +.B xfs_scrub
> > +will issue O_DIRECT reads to the block device directly.
> > +If the block device is a SCSI disk, it will issue READ VERIFY commands
> > +directly to the disk.
>
> + These actions will confirm that all file data blocks can be read from storage.
>
> or something?

Ok, added that verbatim.

> > +.TP
> > +.B \-y
> > +Try to repair all filesystem errors.
> > +If the errors cannot be fixed online, then the filesystem must be taken
> > +offline for repair.
> > +.SH EXIT CODE
> > +The exit code returned by
> > +.B xfs_scrub
> > +is the sum of the following conditions:
> > +.br
> > +\ 0\ \-\ No errors
> > +.br
> > +\ 1\ \-\ File system errors left uncorrected
> > +.br
> > +\ 2\ \-\ File system optimizations possible
> > +.br
> > +\ 4\ \-\ Operational error
> > +.br
> > +\ 8\ \-\ Usage or syntax error
> > +.br
> > +.SH CAVEATS
> > +.B xfs_scrub
> > +is an immature utility!
>
> Might it damage my filesystem? ;)

It glides as softly as a piston!

...oh, are we not doing the monorail song?

> > +This program takes advantage of in-kernel scrubbing to verify a given
> > +data structure with locks held.

"This program takes advantage of in-kernel scrubbing to verify a given
data structure with locks held and can keep the filesystem busy for a
long time."

> > +The kernel must support the BULKSTAT, FSGEOMETRY, FSCOUNTS, GET_RESBLKS,
> > +GETBMAPX, GETFSMAP, INUMBERS, and SCRUB_METADATA ioctls.
>
> Some of those ioctls are ancient and probably don't need to be specified...
> Can you do anything at all without SCRUB_METADATA? If not,
> is SCRUB_METADATA sufficient to determine that the kernel has the rest
> of what it needs?

SCRUB_METADATA is enough, provided we don't get kernel-tinyfication'd.

> > +This can tie up the system for a while.
>
> Maybe that's a statement to go right after "locks held"

Ok.

> > +.PP
> > +If errors are found and cannot be repaired, the filesystem must be taken
> > +offline and repaired.
>
> "unmounted and repaired" might be more specific? *shrug*

Ok.

--D

> > +.SH SEE ALSO
> > +.BR xfs_repair (8).
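The scrub-call estimate in the -b reply can be reproduced with a few lines of C. This is a sketch for checking the arithmetic only: the function names are hypothetical, the check counts (six global, ten per AG, seven per file) are the ones quoted in the reply, and "4M inodes" is read as 4,000,000.

```c
#include <assert.h>
#include <stdint.h>

/* Check counts as quoted in the reply. */
#define GLOBAL_CHECKS	6	/* whole-filesystem checks */
#define PER_AG_CHECKS	10	/* checks per allocation group */
#define PER_FILE_CHECKS	7	/* checks per inode */

/* Total scrub calls for the given AG and inode counts. */
static uint64_t
scrub_call_count(uint64_t ag_count, uint64_t inode_count)
{
	return GLOBAL_CHECKS + ag_count * PER_AG_CHECKS +
	       inode_count * PER_FILE_CHECKS;
}

/* Whole seconds spent sleeping if each call is delayed by delay_us. */
static uint64_t
total_delay_seconds(uint64_t calls, uint64_t delay_us)
{
	/* Guard the multiplication against 64-bit overflow. */
	assert(delay_us == 0 || calls <= UINT64_MAX / delay_us);
	return calls * delay_us / 1000000;
}
```

For 32 AGs and 4,000,000 inodes this gives 28,000,326 calls, matching the ~28M in the reply. At the documented 100us per call the added delay is about 2,800 seconds (roughly 47 minutes); the 324-day figure corresponds to about one second per call, so it presumably assumes a much larger per-call delay than the 100us in the man page text.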
On Thu, Jan 11, 2018 at 07:07:43PM -0600, Eric Sandeen wrote:
>
>
> On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Create the foundations of a filesystem scrubbing tool that asks the
> > kernel to inspect all metadata in the filesystem and (ultimately) to
> > repair anything that's broken. Also create the man page for the
> > utility.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>
> ...
>
> > +/*
> > + * XFS Online Metadata Scrub (and Repair)
> > + *
> > + * The XFS scrubber uses custom XFS ioctls to probe more deeply into the
> > + * internals of the filesystem. It takes advantage of scrubbing ioctls
> > + * to check all the records stored in a metadata object and to
> > + * cross-reference those records against the other filesystem metadata.
> > + *
> > + * After the program gathers command line arguments to figure out
> > + * exactly what the user wants the program is going to do, scrub
>
>  * exactly what the user wants the program to do
>
> or -
>
>  * exactly what the program is going to do
>
> or -
>
>  * exactly what the user wants to do
>
> :)

The second. The program can figure out what the program is going to do;
it has no idea what the user wants.

> > + * execution is split up into several separate phases:
> > + *
> > + * The "find geometry" phase queries XFS for the filesystem geometry.
> > + * The block devices for the data, realtime, and log devices are opened.
> > + * Kernel ioctls are test-queried to see if they actually work (the scrub
> > + * ioctl in particular), and any other filesystem-specific information
> > + * is gathered.
> > + *
> > + * In the "check internal metadata" phase, we call the metadata scrub
> > + * ioctl to check the filesystem's internal per-AG btrees. This
> > + * includes the AG superblock, AGF, AGFL, and AGI headers, freespace
> > + * btrees, the regular and free inode btrees, the reverse mapping
> > + * btrees, and the reference counting btrees. If the realtime device is
> > + * enabled, the realtime bitmap and reverse mapping btrees are enabled.
>
> checked?

Fixed.

--D

> -Eric
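The phase ordering described in the block comment under review can be pictured as a table of phase functions run in sequence, stopping at the first phase that reports failure. The sketch below is illustrative only and is not part of the patch: struct scrub_ctx, the phase functions, and run_phases() are hypothetical names, and every phase except the repair phase is a placeholder. The phase names and their order follow the comment in scrub/xfs_scrub.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-run state; a real tool would track far more. */
struct scrub_ctx {
	unsigned long errors_found;	/* corruptions seen so far */
	unsigned long errors_fixed;	/* corruptions repaired so far */
};

/* A phase returns false when the remaining phases must not run. */
typedef bool (*phase_fn)(struct scrub_ctx *ctx);

struct phase {
	const char	*descr;
	phase_fn	fn;
};

/* Placeholder phase: does nothing and always succeeds. */
static bool
phase_noop(struct scrub_ctx *ctx)
{
	(void)ctx;
	return true;
}

/* "Repair filesystem": stop here if uncorrectable errors remain. */
static bool
phase_repair(struct scrub_ctx *ctx)
{
	return ctx->errors_fixed >= ctx->errors_found;
}

/* Phases in the order given by the comment in xfs_scrub.c. */
static const struct phase phases[] = {
	{ "Find geometry",		phase_noop },
	{ "Check internal metadata",	phase_noop },
	{ "Scan all inodes",		phase_noop },
	{ "Repair filesystem",		phase_repair },
	{ "Check directory tree",	phase_noop },
	{ "Verify data file integrity",	phase_noop },
	{ "Check summary counters",	phase_noop },
};

#define NR_PHASES	(sizeof(phases) / sizeof(phases[0]))

/* Run the phases in order; return how many completed successfully. */
static size_t
run_phases(struct scrub_ctx *ctx)
{
	size_t i;

	assert(ctx != NULL);
	for (i = 0; i < NR_PHASES; i++) {
		if (!phases[i].fn(ctx))
			break;
	}
	return i;
}
```

A table like this keeps the ordering in one place, which makes it straightforward to time each phase (as the -T option describes) or to skip a phase based on command-line options.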
diff --git a/.gitignore b/.gitignore index e839e2a..a3db640 100644 --- a/.gitignore +++ b/.gitignore @@ -68,6 +68,7 @@ cscope.* /repair/xfs_repair /rtcp/xfs_rtcp /spaceman/xfs_spaceman +/scrub/xfs_scrub # generated crc files /libxfs/crc32selftest diff --git a/Makefile b/Makefile index 0dce80a..3bd0796 100644 --- a/Makefile +++ b/Makefile @@ -48,7 +48,7 @@ LIBFROG_SUBDIR = libfrog DLIB_SUBDIRS = libxlog libxcmd libhandle LIB_SUBDIRS = libxfs $(DLIB_SUBDIRS) TOOL_SUBDIRS = copy db estimate fsck growfs io logprint mkfs quota \ - mdrestore repair rtcp m4 man doc debian spaceman + mdrestore repair rtcp m4 man doc debian spaceman scrub ifneq ("$(PKG_PLATFORM)","darwin") TOOL_SUBDIRS += fsr @@ -91,6 +91,7 @@ repair: libxlog libxcmd copy: libxlog mkfs: libxcmd spaceman: libxcmd +scrub: libhandle libxcmd ifeq ($(HAVE_BUILDDEFS), yes) include $(BUILDRULES) diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8 new file mode 100644 index 0000000..95f4fea --- /dev/null +++ b/man/man8/xfs_scrub.8 @@ -0,0 +1,117 @@ +.TH xfs_scrub 8 +.SH NAME +xfs_scrub \- scrub the contents of an XFS filesystem +.SH SYNOPSIS +.B xfs_scrub +[ +.B \-abemnTvVxy +] +.I mount-point +.br +.B xfs_scrub \-V +.SH DESCRIPTION +.B xfs_scrub +attempts to check and repair all metadata in a mounted XFS filesystem. +.PP +.B xfs_scrub +asks the kernel to scrub all metadata objects in the filesystem. +Metadata records are scanned for obviously bad values and then +cross-referenced against other metadata. +The goal is to establish a threasonable confidence about the consistency +of the overall filesystem by examining the consistency of individual +metadata records against the other metadata in the filesystem across the +entire filesystem. +Damaged metadata can be rebuilt from other metadata if there is +sufficient redundancy (and no other corruption) in the metadata. +.PP +This utility does not know how to correct all errors. 
+If the tool cannot fix the detected errors, you must unmount the +filesystem and run +.B xfs_repair +to fix the problems. +If this tool is not run with either of the +.B \-n +or +.B \-y +options, then it will optimize the filesystem when possible, +but it will not try to fix errors. +.SH OPTIONS +.TP +.BI \-a " errors" +Abort if more than this many errors are found on the filesystem. +.TP +.B \-b +Run in background mode. +If the option is specified once, only run a single scrubbing thread at a +time. +If given more than once, an artificial delay of 100us is added to each +scrub call to reduce CPU overhead even further. +.TP +.B \-e +Specifies what happens when errors are detected. +If +.IR shutdown +is given, the filesystem will be taken offline if errors are found. +Not all backends can shut down a filesystem. +If +.IR continue +is given, no action taken if errors are found. +This is the default. +.TP +.BI \-m " file" +Search this file for mounted filesystems instead of /etc/mtab. +.TP +.B \-n +Dry run, do not modify anything in the filesystem. +This disables all preening and optimization behaviors, and disables +calling FITRIM on the free space after a successful run. +.TP +.BI \-T +Print timing and memory usage information for each phase. +.TP +.B \-v +Enable verbose mode, which prints periodic status updates. +.TP +.B \-V +Prints the version number and exits. +.TP +.B \-x +Scrub all file data too. +The block list will be sorted in disk order for better performance. +.B xfs_scrub +will issue O_DIRECT reads to the block device directly. +If the block device is a SCSI disk, it will issue READ VERIFY commands +directly to the disk. +.TP +.B \-y +Try to repair all filesystem errors. +If the errors cannot be fixed online, then the filesystem must be taken +offline for repair. 
+.SH EXIT CODE +The exit code returned by +.B xfs_scrub +is the sum of the following conditions: +.br +\ 0\ \-\ No errors +.br +\ 1\ \-\ File system errors left uncorrected +.br +\ 2\ \-\ File system optimizations possible +.br +\ 4\ \-\ Operational error +.br +\ 8\ \-\ Usage or syntax error +.br +.SH CAVEATS +.B xfs_scrub +is an immature utility! +This program takes advantage of in-kernel scrubbing to verify a given +data structure with locks held. +The kernel must support the BULKSTAT, FSGEOMETRY, FSCOUNTS, GET_RESBLKS, +GETBMAPX, GETFSMAP, INUMBERS, and SCRUB_METADATA ioctls. +This can tie up the system for a while. +.PP +If errors are found and cannot be repaired, the filesystem must be taken +offline and repaired. +.SH SEE ALSO +.BR xfs_repair (8). diff --git a/scrub/Makefile b/scrub/Makefile new file mode 100644 index 0000000..62cca3b --- /dev/null +++ b/scrub/Makefile @@ -0,0 +1,42 @@ +# +# Copyright (C) 2018 Oracle. All Rights Reserved. +# + +TOPDIR = .. +include $(TOPDIR)/include/builddefs + +# On linux we get fsmap from the system or define it ourselves +# so include this based on platform type. If this reverts to only +# the autoconf check w/o local definition, change to testing HAVE_GETFSMAP +SCRUB_PREREQS=$(PKG_PLATFORM) + +ifeq ($(SCRUB_PREREQS),linux) +LTCOMMAND = xfs_scrub +INSTALL_SCRUB = install-scrub +endif # scrub_prereqs + +HFILES = \ +common.h \ +xfs_scrub.h + +CFILES = \ +common.c \ +xfs_scrub.c + +LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD) +LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG) +LLDFLAGS = -static + +default: depend $(LTCOMMAND) + +include $(BUILDRULES) + +install: default $(INSTALL_SCRUB) + +install-scrub: + $(INSTALL) -m 755 -d $(PKG_ROOT_SBIN_DIR) + $(LTINSTALL) -m 755 $(LTCOMMAND) $(PKG_ROOT_SBIN_DIR) + +install-dev: + +-include .dep diff --git a/scrub/common.c b/scrub/common.c new file mode 100644 index 0000000..0a58c16 --- /dev/null +++ b/scrub/common.c @@ -0,0 +1,20 @@ +/* + * Copyright (C) 2018 Oracle. All Rights Reserved. 
+ * + * Author: Darrick J. Wong <darrick.wong@oracle.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. + */ +#include "common.h" diff --git a/scrub/common.h b/scrub/common.h new file mode 100644 index 0000000..1082296 --- /dev/null +++ b/scrub/common.h @@ -0,0 +1,23 @@ +/* + * Copyright (C) 2018 Oracle. All Rights Reserved. + * + * Author: Darrick J. Wong <darrick.wong@oracle.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. 
+ */ +#ifndef XFS_SCRUB_COMMON_H_ +#define XFS_SCRUB_COMMON_H_ + +#endif /* XFS_SCRUB_COMMON_H_ */ diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c new file mode 100644 index 0000000..4f26855 --- /dev/null +++ b/scrub/xfs_scrub.c @@ -0,0 +1,109 @@ +/* + * Copyright (C) 2018 Oracle. All Rights Reserved. + * + * Author: Darrick J. Wong <darrick.wong@oracle.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. + */ +#include <stdio.h> +#include "xfs_scrub.h" + +/* + * XFS Online Metadata Scrub (and Repair) + * + * The XFS scrubber uses custom XFS ioctls to probe more deeply into the + * internals of the filesystem. It takes advantage of scrubbing ioctls + * to check all the records stored in a metadata object and to + * cross-reference those records against the other filesystem metadata. + * + * After the program gathers command line arguments to figure out + * exactly what the user wants the program is going to do, scrub + * execution is split up into several separate phases: + * + * The "find geometry" phase queries XFS for the filesystem geometry. + * The block devices for the data, realtime, and log devices are opened. + * Kernel ioctls are test-queried to see if they actually work (the scrub + * ioctl in particular), and any other filesystem-specific information + * is gathered. 
+ * + * In the "check internal metadata" phase, we call the metadata scrub + * ioctl to check the filesystem's internal per-AG btrees. This + * includes the AG superblock, AGF, AGFL, and AGI headers, freespace + * btrees, the regular and free inode btrees, the reverse mapping + * btrees, and the reference counting btrees. If the realtime device is + * enabled, the realtime bitmap and reverse mapping btrees are enabled. + * Quotas, if enabled, are also checked in this phase. + * + * Each AG (and the realtime device) has its metadata checked in a + * separate thread for better performance. Errors in the internal + * metadata can be fixed here prior to the inode scan; refer to the + * section about the "repair filesystem" phase for more information. + * + * The "scan all inodes" phase uses BULKSTAT to scan all the inodes in + * an AG in disk order. The BULKSTAT information provides enough + * information to construct a file handle that is used to check the + * following parts of every file: + * + * - The inode record + * - All three block forks (data, attr, CoW) + * - If it's a symlink, the symlink target. + * - If it's a directory, the directory entries. + * - All extended attributes + * - The parent pointer + * + * Multiple threads are started to check each the inodes of each AG in + * parallel. Errors in file metadata can be fixed here; see the section + * about the "repair filesystem" phase for more information. + * + * Next comes the (configurable) "repair filesystem" phase. The user + * can instruct this program to fix all problems encountered; to fix + * only optimality problems and leave the corruptions; or not to touch + * the filesystem at all. Any metadata repairs that did not succeed in + * the previous two phases are retried here; if there are uncorrectable + * errors, xfs_scrub stops here. + * + * The next phase is the "check directory tree" phase. 
In this phase, + * every directory is opened (via file handle) to confirm that each + * directory is connected to the root. Directory entries are checked + * for ambiguous Unicode normalization mappings, which is to say that we + * look for pairs of entries whose utf-8 strings normalize to the same + * code point sequence and map to different inodes, because that could + * be used to trick a user into opening the wrong file. The names of + * extended attributes are checked for Unicode normalization collisions. + * + * In the "verify data file integrity" phase, we employ GETFSMAP to read + * the reverse-mappings of all AGs and issue direct-reads of the + * underlying disk blocks. We rely on the underlying storage to have + * checksummed the data blocks appropriately. Multiple threads are + * started to check each AG in parallel; a separate thread pool is used + * to handle the direct reads. + * + * In the "check summary counters" phase, use GETFSMAP to tally up the + * blocks and BULKSTAT to tally up the inodes we saw and compare that to + * the statfs output. This gives the user a rough estimate of how + * thorough the scrub was. + */ + +/* Program name; needed for libxcmd error reports. */ +char *progname = "xfs_scrub"; + +int +main( + int argc, + char **argv) +{ + fprintf(stderr, "XXX: This program is not complete!\n"); + return 4; +} diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h new file mode 100644 index 0000000..ff9c24d --- /dev/null +++ b/scrub/xfs_scrub.h @@ -0,0 +1,23 @@ +/* + * Copyright (C) 2018 Oracle. All Rights Reserved. + * + * Author: Darrick J. Wong <darrick.wong@oracle.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. 
+ * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. + */ +#ifndef XFS_SCRUB_XFS_SCRUB_H_ +#define XFS_SCRUB_XFS_SCRUB_H_ + +#endif /* XFS_SCRUB_XFS_SCRUB_H_ */ diff --git a/tools/find-api-violations.sh b/tools/find-api-violations.sh index 3b976d3..cb075ba 100755 --- a/tools/find-api-violations.sh +++ b/tools/find-api-violations.sh @@ -6,7 +6,7 @@ # NOTE: This script doesn't look for API violations in function parameters. -tool_dirs="copy db estimate fsck fsr growfs io logprint mdrestore mkfs quota repair rtcp" +tool_dirs="copy db estimate fsck fsr growfs io logprint mdrestore mkfs quota repair rtcp scrub" # Calls to xfs_* functions in libxfs/*.c without the libxfs_ prefix find_possible_api_calls() {
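The EXIT CODE section of the man page says the exit status is the sum of the listed conditions. Because the values are distinct powers of two, that sum can be decoded with bit tests as long as each condition is counted at most once. The following is a minimal sketch of such a decoder: the SCRUB_RET_* macro names and describe_exit_code() are hypothetical, and only the numeric values and their descriptions come from the man page in this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values from the EXIT CODE section; the macro names are made up here. */
#define SCRUB_RET_CORRUPT	1	/* File system errors left uncorrected */
#define SCRUB_RET_UNOPTIMIZED	2	/* File system optimizations possible */
#define SCRUB_RET_OPERROR	4	/* Operational error */
#define SCRUB_RET_SYNTAX	8	/* Usage or syntax error */

/*
 * Write a "; "-separated description of every condition set in code
 * into buf and return buf. Output is cut short if buf is too small.
 */
static char *
describe_exit_code(int code, char *buf, size_t buflen)
{
	static const struct {
		int		bit;
		const char	*text;
	} conds[] = {
		{ SCRUB_RET_CORRUPT,	 "File system errors left uncorrected" },
		{ SCRUB_RET_UNOPTIMIZED, "File system optimizations possible" },
		{ SCRUB_RET_OPERROR,	 "Operational error" },
		{ SCRUB_RET_SYNTAX,	 "Usage or syntax error" },
	};
	size_t i;
	size_t used = 0;

	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';

	if (code == 0) {
		snprintf(buf, buflen, "No errors");
		return buf;
	}

	for (i = 0; i < sizeof(conds) / sizeof(conds[0]); i++) {
		int n;

		if (!(code & conds[i].bit))
			continue;
		n = snprintf(buf + used, buflen - used, "%s%s",
			     used ? "; " : "", conds[i].text);
		if (n < 0 || (size_t)n >= buflen - used)
			break;	/* out of room: stop appending */
		used += (size_t)n;
	}
	return buf;
}
```

For example, the placeholder main() in this patch returns 4, which decodes to "Operational error", and a status of 3 would mean both uncorrected errors and possible optimizations.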