diff mbox

xfs_repair: add '-F' option to ignore writable mount checking

Message ID 5A914B45.8080200@xtaotech.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Yang Joseph Feb. 24, 2018, 11:23 a.m. UTC
hello,

Before the repair process, xfs_repair will check if user specified 
device already
has a writable mountpoint. And it will stat all the mountpoints of the 
system. If there
is a dead mountpoint, this checking will be blocked and xfs_repair will 
enter 'D' state.

So, if user can make sure there is no writable mountpoint, xfs_repair 
can skip
this checking. This PR add a new option '-F' to do this job.

Your suggestions are appreciated!

Yang Honggang

---------------- patch begin --------------------

---------------- patch end ----------------
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Sandeen Feb. 24, 2018, 5:56 p.m. UTC | #1
On 2/24/18 5:23 AM, Yang Joseph wrote:
> hello,
> 
> Before the repair process, xfs_repair will check if user specified device already
> has a writable mountpoint. And it will stat all the mountpoints of the system. If there
> is a dead mountpoint, this checking will be blocked and xfs_repair will enter 'D' state.
> 
> So, if user can make sure there is no writable mountpoint, xfs_repair can skip
> this checking. This PR add a new option '-F' to do this job.
> 
> Your suggestions are appreciated!

For starters, this patch seems whitespace mangled and does not apply for me.

However, repairing a filesystem that is actually mounted read-write
could cause serious damage.  Giving the admin tools to do this carries
significant risk; we already call it "dangerous" to repair even an
ro-mounted filesystem.

Do you have a testcase to produce a "dead" mountpoint which causes xfs_repair
to go into "D" state?  Where was xfs_repair stuck?

That sounds like a bug worth fixing, but I am much
less excited about adding options which could do serious damage
to a filesystem.

Thanks,
-Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner Feb. 24, 2018, 10:04 p.m. UTC | #2
On Sat, Feb 24, 2018 at 11:56:44AM -0600, Eric Sandeen wrote:
> On 2/24/18 5:23 AM, Yang Joseph wrote:
> > hello,
> > 
> > Before the repair process, xfs_repair will check if user specified device already
> > has a writable mountpoint. And it will stat all the mountpoints of the system. If there
> > is a dead mountpoint, this checking will be blocked and xfs_repair will enter 'D' state.

So why is the mount point dead?

That kinda means that the filesystem is still mounted, but something
has hung somewhere and the filesystem may still have active
references to the underlying device and be doing stuff that is
modifying the filesystem....

And if the device is still busy, then you aren't going to be able to
mount the repaired device, anyway, because the block device is still
busy...

> That sounds like a bug worth fixing, but I am much
> less excited about adding options which could do serious damage
> to a filesystem.

TO me it sounds like something that should be fixed by a reboot, not
by adding dangerous options to xfs_repair...

Cheers,

Dave.
Eric Sandeen Feb. 24, 2018, 10:08 p.m. UTC | #3
On 2/24/18 4:04 PM, Dave Chinner wrote:
>> That sounds like a bug worth fixing, but I am much
>> less excited about adding options which could do serious damage
>> to a filesystem.

> TO me it sounds like something that should be fixed by a reboot, not
> by adding dangerous options to xfs_repair...

Yes, that was my point - if it's hung there is a bug somewhere, that
should be fixed - not necessarily in xfsprogs ;)  But without more
info about the situation, I don't know.

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yang Joseph Feb. 26, 2018, 2:59 a.m. UTC | #4
In our case, there is a mountpoint of ceph-fuse type and this mountpoint 
is abnormal.
I execute 'xfs_repair -n /dev/nbd4' cmd. Then xfs_repair is blocked in 
stat()
systemcall. '/dev/nbd4' has no relationship with the ceph-fuse mountpoint.

[root@compute5 ~]# ps aux | grep xfs_repair
root     16469  0.0  0.0 114744   564 ?        D    10:50   0:00 
xfs_repair -n /dev/nbd4

[root@compute5 ~]# cat /proc/16469/stack
[<ffffffffa04b953d>] __fuse_request_send+0x13d/0x2c0 [fuse]
[<ffffffffa04b96d2>] fuse_request_send+0x12/0x20 [fuse]
[<ffffffffa04be67a>] fuse_do_getattr+0x11a/0x2e0 [fuse]
[<ffffffffa04bfba5>] fuse_update_attributes+0x75/0x80 [fuse]
[<ffffffffa04bfbf3>] fuse_getattr+0x43/0x50 [fuse]
[<ffffffff81203976>] vfs_getattr+0x46/0x80
[<ffffffff81203aa5>] vfs_fstatat+0x75/0xc0
[<ffffffff81203ffe>] SYSC_newstat+0x2e/0x60
[<ffffffff812042de>] SyS_newstat+0xe/0x10
[<ffffffff81697809>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

The stat() is from the following code:

// libxfs/linux.c:platform_check_mount()
   while ((mnt = getmntent(f)) != NULL) {
     if (stat64(mnt->mnt_fsname, &mst) < 0) <---------<<<< 
unconditionally stat all mountpoints
       continue;

xfs_repair have to check all mountpoints of the system to make sure 
there is
no writable mount point of user specified device. If there is one abnormal
mountpoint, event it not related to user specified device, xfs_repair will
be blocked.

I can make sure there is no writable mountpoint of /dev/nbd4, so xfs_repair
don't need to check all mountpoints of the system. This is why I want to add
this '-F' option.

Because there are lots of other services on this node, I can't reboot 
the machine.

thx

Yang Honggang

On 02/25/2018 06:04 AM, Dave Chinner wrote:
> On Sat, Feb 24, 2018 at 11:56:44AM -0600, Eric Sandeen wrote:
>> On 2/24/18 5:23 AM, Yang Joseph wrote:
>>> hello,
>>>
>>> Before the repair process, xfs_repair will check if user specified device already
>>> has a writable mountpoint. And it will stat all the mountpoints of the system. If there
>>> is a dead mountpoint, this checking will be blocked and xfs_repair will enter 'D' state.
> So why is the mount point dead?
>
> That kinda means that the filesystem is still mounted, but something
> has hung somewhere and the filesystem may still have active
> references to the underlying device and be doing stuff that is
> modifying the filesystem....
>
> And if the device is still busy, then you aren't going to be able to
> mount the repaired device, anyway, because the block device is still
> busy...
>
>> That sounds like a bug worth fixing, but I am much
>> less excited about adding options which could do serious damage
>> to a filesystem.
> TO me it sounds like something that should be fixed by a reboot, not
> by adding dangerous options to xfs_repair...
>
> Cheers,
>
> Dave.

--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Carlos Maiolino Feb. 26, 2018, 12:02 p.m. UTC | #5
On Mon, Feb 26, 2018 at 10:59:07AM +0800, Yang Joseph wrote:
> In our case, there is a mountpoint of ceph-fuse type and this mountpoint is
> abnormal.
> I execute 'xfs_repair -n /dev/nbd4' cmd. Then xfs_repair is blocked in
> stat()
> systemcall. '/dev/nbd4' has no relationship with the ceph-fuse mountpoint.
> 
> [root@compute5 ~]# ps aux | grep xfs_repair
> root     16469  0.0  0.0 114744   564 ?        D    10:50   0:00 xfs_repair
> -n /dev/nbd4
> 
> [root@compute5 ~]# cat /proc/16469/stack
> [<ffffffffa04b953d>] __fuse_request_send+0x13d/0x2c0 [fuse]
> [<ffffffffa04b96d2>] fuse_request_send+0x12/0x20 [fuse]
> [<ffffffffa04be67a>] fuse_do_getattr+0x11a/0x2e0 [fuse]
> [<ffffffffa04bfba5>] fuse_update_attributes+0x75/0x80 [fuse]
> [<ffffffffa04bfbf3>] fuse_getattr+0x43/0x50 [fuse]
> [<ffffffff81203976>] vfs_getattr+0x46/0x80
> [<ffffffff81203aa5>] vfs_fstatat+0x75/0xc0
> [<ffffffff81203ffe>] SYSC_newstat+0x2e/0x60
> [<ffffffff812042de>] SyS_newstat+0xe/0x10
> [<ffffffff81697809>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 

So, you have a mount point stuck because fuse can't connect. Why should
xfs_repair workaround this issue?

> The stat() is from the following code:
> 
> // libxfs/linux.c:platform_check_mount()
>   while ((mnt = getmntent(f)) != NULL) {
>     if (stat64(mnt->mnt_fsname, &mst) < 0) <---------<<<< unconditionally
> stat all mountpoints
>       continue;

> 
> xfs_repair have to check all mountpoints of the system to make sure there is
> no writable mount point of user specified device. If there is one abnormal
> mountpoint, event it not related to user specified device, xfs_repair will
> be blocked.
> 
> I can make sure there is no writable mountpoint of /dev/nbd4, so xfs_repair
> don't need to check all mountpoints of the system. This is why I want to add
> this '-F' option.
> 

While I understand your point, I wonder why you can't close the specific fuse
connection here, and, if the right approach for you wouldn't be able to close
this fuse connection, instead of hack xfs_repair to bypass mount point checks.

In any way, I think '-F' is really not a good argument for such force, it could
easily be used by mistake in place of, let's say '-f', if such option is ever to
be implemented, it should be typo-safe, something like --force.

But still, I think the right approach here would be fuse to provide a way to
force a close on the specific connection.

> Because there are lots of other services on this node, I can't reboot the
> machine.
> 
> thx
> 
> Yang Honggang
> 
> > > > hello,
> > > > 
> > > > Before the repair process, xfs_repair will check if user specified device already
> > > > has a writable mountpoint. And it will stat all the mountpoints of the system. If there
> > > > is a dead mountpoint, this checking will be blocked and xfs_repair will enter 'D' state.
> > So why is the mount point dead?
> > 
> > That kinda means that the filesystem is still mounted, but something
> > has hung somewhere and the filesystem may still have active
> > references to the underlying device and be doing stuff that is
> > modifying the filesystem....
> > 
> > And if the device is still busy, then you aren't going to be able to
> > mount the repaired device, anyway, because the block device is still
> > busy...
> > 
> > > That sounds like a bug worth fixing, but I am much
> > > less excited about adding options which could do serious damage
> > > to a filesystem.
> > TO me it sounds like something that should be fixed by a reboot, not
> > by adding dangerous options to xfs_repair...
> > 
> > Cheers,
> > 
> > Dave.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Brian Foster Feb. 26, 2018, 12:19 p.m. UTC | #6
On Mon, Feb 26, 2018 at 10:59:07AM +0800, Yang Joseph wrote:
> In our case, there is a mountpoint of ceph-fuse type and this mountpoint is
> abnormal.
> I execute 'xfs_repair -n /dev/nbd4' cmd. Then xfs_repair is blocked in
> stat()
> systemcall. '/dev/nbd4' has no relationship with the ceph-fuse mountpoint.
> 
> [root@compute5 ~]# ps aux | grep xfs_repair
> root     16469  0.0  0.0 114744   564 ?        D    10:50   0:00 xfs_repair
> -n /dev/nbd4
> 
> [root@compute5 ~]# cat /proc/16469/stack
> [<ffffffffa04b953d>] __fuse_request_send+0x13d/0x2c0 [fuse]
> [<ffffffffa04b96d2>] fuse_request_send+0x12/0x20 [fuse]
> [<ffffffffa04be67a>] fuse_do_getattr+0x11a/0x2e0 [fuse]
> [<ffffffffa04bfba5>] fuse_update_attributes+0x75/0x80 [fuse]
> [<ffffffffa04bfbf3>] fuse_getattr+0x43/0x50 [fuse]
> [<ffffffff81203976>] vfs_getattr+0x46/0x80
> [<ffffffff81203aa5>] vfs_fstatat+0x75/0xc0
> [<ffffffff81203ffe>] SYSC_newstat+0x2e/0x60
> [<ffffffff812042de>] SyS_newstat+0xe/0x10
> [<ffffffff81697809>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> The stat() is from the following code:
> 
> // libxfs/linux.c:platform_check_mount()
>   while ((mnt = getmntent(f)) != NULL) {
>     if (stat64(mnt->mnt_fsname, &mst) < 0) <---------<<<< unconditionally
> stat all mountpoints
>       continue;
> 
> xfs_repair have to check all mountpoints of the system to make sure there is
> no writable mount point of user specified device. If there is one abnormal
> mountpoint, event it not related to user specified device, xfs_repair will
> be blocked.
> 
> I can make sure there is no writable mountpoint of /dev/nbd4, so xfs_repair
> don't need to check all mountpoints of the system. This is why I want to add
> this '-F' option.
> 
> Because there are lots of other services on this node, I can't reboot the
> machine.
> 

Suffice it to say that I agree with the other comments that this
probably isn't something we want to "fix" in xfs_repair... but given
your particular circumstances, would a lazy unmount of the borked mount
allow the repair to proceed?

Brian

> thx
> 
> Yang Honggang
> 
> On 02/25/2018 06:04 AM, Dave Chinner wrote:
> > On Sat, Feb 24, 2018 at 11:56:44AM -0600, Eric Sandeen wrote:
> > > On 2/24/18 5:23 AM, Yang Joseph wrote:
> > > > hello,
> > > > 
> > > > Before the repair process, xfs_repair will check if user specified device already
> > > > has a writable mountpoint. And it will stat all the mountpoints of the system. If there
> > > > is a dead mountpoint, this checking will be blocked and xfs_repair will enter 'D' state.
> > So why is the mount point dead?
> > 
> > That kinda means that the filesystem is still mounted, but something
> > has hung somewhere and the filesystem may still have active
> > references to the underlying device and be doing stuff that is
> > modifying the filesystem....
> > 
> > And if the device is still busy, then you aren't going to be able to
> > mount the repaired device, anyway, because the block device is still
> > busy...
> > 
> > > That sounds like a bug worth fixing, but I am much
> > > less excited about adding options which could do serious damage
> > > to a filesystem.
> > TO me it sounds like something that should be fixed by a reboot, not
> > by adding dangerous options to xfs_repair...
> > 
> > Cheers,
> > 
> > Dave.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/libxfs.h b/include/libxfs.h
index c5fb396..9d5d3ee 100644
--- a/include/libxfs.h
+++ b/include/libxfs.h
@@ -112,6 +112,8 @@  typedef struct libxfs_xinit {
         int             rcreat;         /* try to create realtime 
subvolume */
         int             setblksize;     /* attempt to set device blksize */
         int             usebuflock;     /* lock xfs_buf_t's - for MT 
usage */
+       int             force_repair; /* ignore writable mount checking */
+
                                 /* output results */
         dev_t           ddev;           /* device for data subvolume */
         dev_t           logdev;         /* device for log subvolume */
diff --git a/libxfs/init.c b/libxfs/init.c
index 7bde8b7..8c0b418 100644
--- a/libxfs/init.c
+++ b/libxfs/init.c
@@ -199,7 +199,8 @@  libxfs_device_close(dev_t dev)
  }

  static int
-check_open(char *path, int flags, char **rawfile, char **blockfile)
+check_open(char *path, int flags, char **rawfile, char **blockfile,
+    int force_repair)
  {
         int readonly = (flags & LIBXFS_ISREADONLY);
         int inactive = (flags & LIBXFS_ISINACTIVE);
@@ -225,7 +226,8 @@  check_open(char *path, int flags, char **rawfile, 
char **blockfile)
         if (!readonly && !inactive && platform_check_ismounted(path, 
*blockfile, NULL, 1))
                 return 0;

-       if (inactive && check_isactive(path, *blockfile, 
((readonly|dangerously)?1:0)))
+       if (inactive && !force_repair && check_isactive(path, *blockfile,
+            ((readonly|dangerously)?1:0)))
                 return 0;

         return 1;
@@ -270,7 +272,7 @@  libxfs_init(libxfs_init_t *a)
         radix_tree_init();

         if (a->volname) {
- if(!check_open(a->volname,flags,&rawfile,&blockfile))
+               if(!check_open(a->volname, flags, &rawfile, &blockfile, 
a->force_repair))
                         goto done;
                 fd = open(rawfile, O_RDONLY);
                 dname = a->dname = a->volname;
@@ -284,7 +286,7 @@  libxfs_init(libxfs_init_t *a)
                         platform_findsizes(dname, a->dfd, &a->dsize,
                                            &a->dbsize);
                 } else {
-                       if (!check_open(dname, flags, &rawfile, &blockfile))
+                       if (!check_open(dname, flags, &rawfile, 
&blockfile, a->force_repair))
                                 goto done;
                         a->ddev = libxfs_device_open(rawfile,
                                         a->dcreat, flags, a->setblksize);
@@ -302,7 +304,7 @@  libxfs_init(libxfs_init_t *a)
                         platform_findsizes(dname, a->logfd, &a->logBBsize,
                                            &a->lbsize);
                 } else {
-                       if (!check_open(logname, flags, &rawfile, 
&blockfile))
+                       if (!check_open(logname, flags, &rawfile, 
&blockfile, a->force_repair))
                                 goto done;
                         a->logdev = libxfs_device_open(rawfile,
                                         a->lcreat, flags, a->setblksize);
@@ -320,7 +322,7 @@  libxfs_init(libxfs_init_t *a)
                         platform_findsizes(dname, a->rtfd, &a->rtsize,
                                            &a->rtbsize);
                 } else {
-                       if (!check_open(rtname, flags, &rawfile, 
&blockfile))
+                       if (!check_open(rtname, flags, &rawfile, 
&blockfile, a->force_repair))
                                 goto done;
                         a->rtdev = libxfs_device_open(rawfile,
                                         a->rcreat, flags, a->setblksize);
diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
index 85e4dc9..78e1125 100644
--- a/man/man8/xfs_repair.8
+++ b/man/man8/xfs_repair.8
@@ -4,7 +4,7 @@  xfs_repair \- repair an XFS filesystem
  .SH SYNOPSIS
  .B xfs_repair
  [
-.B \-dfLnPv
+.B \-dfLnPvF
  ] [
  .B \-m
  .I maxmem
@@ -162,6 +162,9 @@  ag_stride is enabled.
  .B \-v
  Verbose output.  May be specified multiple times to increase verbosity.
  .TP
+.B \-F
+Force to repair, ignore writable mount checking.
+.TP
  .B \-d
  Repair dangerously. Allow
  .B xfs_repair
diff --git a/repair/globals.h b/repair/globals.h
index c7bbe6f..085ea0f 100644
--- a/repair/globals.h
+++ b/repair/globals.h
@@ -104,6 +104,7 @@  EXTERN char *rt_name;               /* Name of 
realtime device */
  EXTERN int     rt_spec;                /* Realtime dev specified as 
option */
  EXTERN int     convert_lazy_count;     /* Convert lazy-count mode 
on/off */
  EXTERN int     lazy_count;             /* What to set if to if 
converting */
+EXTERN int     force_repair;           /* ignore writable mount checking */

  /* misc status variables */

diff --git a/repair/init.c b/repair/init.c
index 609229c..a3b4539 100644
--- a/repair/init.c
+++ b/repair/init.c
@@ -90,6 +90,8 @@  xfs_init(libxfs_init_t *args)
         args->usebuflock = do_prefetch;
         args->setblksize = 0;
         args->isdirect = LIBXFS_DIRECT;
+       args->force_repair = force_repair;
+
         if (no_modify)
                 args->isreadonly = (LIBXFS_ISREADONLY | LIBXFS_ISINACTIVE);
         else if (dangerously)
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index b2dd91b..81864c3 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -97,6 +97,7 @@  usage(void)
  "  -o subopts   Override default behaviour, refer to man page.\n"
  "  -t interval  Reporting interval in seconds.\n"
  "  -d           Repair dangerously.\n"
+"  -F           Force to repair, ignore writable mount checking.\n"
  "  -V           Reports version and exits.\n"), progname);
         exit(1);
  }
@@ -214,12 +215,13 @@  process_args(int argc, char **argv)
         ag_stride = 0;
         thread_count = 1;
         report_interval = PROG_RPT_DEFAULT;
+       force_repair = 0;

         /*
          * XXX have to add suboption processing here
          * attributes, quotas, nlinks, aligned_inos, sb_fbits
          */
-       while ((c = getopt(argc, argv, "c:o:fl:m:r:LnDvVdPt:")) != EOF)  {
+       while ((c = getopt(argc, argv, "c:o:fl:m:r:LnDvVdPt:F")) != EOF)  {
                 switch (c) {
                 case 'D':
                         dumpcore = 1;
@@ -329,6 +331,9 @@  process_args(int argc, char **argv)
                 case 't':
                         report_interval = (int)strtol(optarg, NULL, 0);
                         break;
+               case 'F':
+                       force_repair = 1;
+                       break;
                 case '?':
                         usage();
                 }