Message ID | 1472209496-2401-1-git-send-email-zlang@redhat.com (mailing list archive) |
---|---|
State | Rejected |
Headers | show |
On 8/26/16 6:04 AM, Zorro Lang wrote: > When I try to do below steps(a V5 xfs on $fsile): > > # xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" $fsfile > # xfs_db -x -c "sb 0" -c "write features_ro_compat 0" $fsfile > # xfs_db -c "sb 0" -c p $fsfile > > The step#2 and #3 all failed, as: > > Superblock has unknown read-only compatible features (0x1000) enable > > When the "sb" command try to verify the superblock, it find a bad > features_ro_compat number then end the xfs_db process. > > Even"-F" option can't help more. So we need a "super force" mode > which can ignore all "verify" failures, continue the command running. > > Signed-off-by: Zorro Lang <zlang@redhat.com> Hi Zorro - I think this isn't quite the right approach. I can see the advantage of allowing xfs_db to operate on filesystems with unknown features, in case those "features" are the result of corruption, and we'd like to analyze it and/or fix it. Or, for testing, as motivated you here. So allowing xfs_db to skip superblock feature checks makes some sense to me. However, as I read the patch, it is completely overriding every verifier for every type of metadata; in addition to skipping every read verification, it also unhooks the write verifiers, so now crcs aren't updated: # db/xfs_db -x -FF -c "sb 0" -c "write fname \"test\"" fsfile fname = "test\000\000\000\000\000\000\000\000" # db/xfs_db -x -c "sb 0" -c "p crc" fsfile Metadata CRC error detected at xfs_sb block 0x0/0x200 crc = 0x7e27467b (bad) So I think that if the goal is to be able to dangerously operate on filesystems with unknown features, this needs to be more targeted to allow only that, and not completely unhook all verifiers. Thanks, -Eric > --- > > Hi, > > I don't know if my patch is good or not, but I think the "--forceforce" > option is needed for xfs_db. As above example: > > # xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" $fsfile > # xfs_db -x -c "sb 0" -c "write features_ro_compat 0" $fsfile > # xfs_db -c "sb 0" -c p $fsfile > > If we break the superblock, at least xfs_db should help to print the > superblock info. And as a xfs debugger, xfs_db should can ignore > unexpected errors to continue the "expert" command which an expert > want to do. > > That's my personal opinion, so if I'm wrong, feel free to correct me:) > > Thanks, > Zorro > > db/init.c | 6 +++++- > db/io.c | 21 ++++++++++++++++++++- > db/io.h | 2 ++ > man/man8/xfs_db.8 | 4 ++++ > 4 files changed, 31 insertions(+), 2 deletions(-) > > diff --git a/db/init.c b/db/init.c > index c0472c8..690e6ea 100644 > --- a/db/init.c > +++ b/db/init.c > @@ -29,6 +29,7 @@ > #include "malloc.h" > #include "type.h" > > +int xfs_skip_verify = 0; > static char **cmdline; > static int ncmdline; > char *fsdevice; > @@ -75,7 +76,7 @@ init( > x.disfile = 1; > break; > case 'F': > - force = 1; > + force++; > break; > case 'i': > x.isreadonly = (LIBXFS_ISREADONLY|LIBXFS_ISINACTIVE); > @@ -105,6 +106,9 @@ init( > /*NOTREACHED*/ > } > > + if (force > 1) > + xfs_skip_verify = 1; > + > fsdevice = argv[optind]; > if (!x.disfile) > x.volname = fsdevice; > diff --git a/db/io.c b/db/io.c > index 91cab12..897388d 100644 > --- a/db/io.c > +++ b/db/io.c > @@ -41,6 +41,16 @@ static void back_help(void); > static int ring_f(int argc, char **argv); > static void ring_help(void); > > +/* > + * If xfs_skip_verify is true, use this dummy xfs_buf_ops structure > + * to instead of the real xfs_buf_ops in set_cur() > + */ > +static const struct xfs_buf_ops xfs_dummy_buf_ops = { > + .name = "dummy", > + .verify_read = xfs_dummy_verify, > + .verify_write = xfs_dummy_verify, > +}; > + > static const cmdinfo_t pop_cmd = > { "pop", NULL, pop_f, 0, 0, 0, NULL, > N_("pop location from the stack"), pop_help }; > @@ -503,7 +513,16 @@ set_cur( > xfs_ino_t dirino; > xfs_ino_t ino; > __uint16_t mode; > - const struct xfs_buf_ops *ops = t ? t->bops : NULL; > + const struct xfs_buf_ops *ops = NULL; > + > + if (t) { > + if (xfs_skip_verify) { > + ops = &xfs_dummy_buf_ops; > + } else { > + ops = t->bops; > + } > + } else > + ops = NULL; > > if (iocur_sp < 0) { > dbprintf(_("set_cur no stack element to set\n")); > diff --git a/db/io.h b/db/io.h > index c69e9ce..eb64638 100644 > --- a/db/io.h > +++ b/db/io.h > @@ -16,6 +16,8 @@ > * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA > */ > > +extern int xfs_skip_verify; > + > struct typ; > > #define BBMAP_SIZE (XFS_MAX_BLOCKSIZE / BBSIZE) > diff --git a/man/man8/xfs_db.8 b/man/man8/xfs_db.8 > index ff8f862..c52a5bf 100644 > --- a/man/man8/xfs_db.8 > +++ b/man/man8/xfs_db.8 > @@ -57,6 +57,10 @@ Specifies that we want to continue even if the superblock magic is not > correct. For use in > .BR xfs_metadump . > .TP > +.B \-FF > +The "force force" mode. Skip all read/write verify to continue the command, > +even if it'll cause something be broken. > +.TP > .B \-i > Allows execution on a mounted filesystem, provided it is mounted read-only. > Useful for shell scripts >
On Fri, Aug 26, 2016 at 07:04:56PM +0800, Zorro Lang wrote: > When I try to do below steps(a V5 xfs on $fsile): > > # xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" $fsfile > # xfs_db -x -c "sb 0" -c "write features_ro_compat 0" $fsfile > # xfs_db -c "sb 0" -c p $fsfile > > The step#2 and #3 all failed, as: > > Superblock has unknown read-only compatible features (0x1000) enable As it should. xfs_db *in expert mode* allows you to shoot yourself in the foot. However, it doesn't guarantee you'll have the expertise to be able to fix the hole you just shot in your foot. You have much to learn, grasshopper. :P > If we break the superblock, at least xfs_db should help to print the > superblock info. It should. You tried to write to it in expert mode, though. Try just printing the field in read-only mode (-r).... > And as a xfs debugger, xfs_db should can ignore > unexpected errors to continue the "expert" command which an expert > want to do. You're making the assumption that "-x" makes the user an expert. That is far from the truth - it just means someone read a man page, not that they have any specific XFS knowledge. The talk I gave at LCA 2015 is relevant here: https://www.youtube.com/watch?v=VpuVDfSXs-g I go into a bit of depth about how cognitive biases affect how people learn and assess their levels of expertise. This should explain why "expert" mode still needs /some/ safeguards. Remember: feature flags are the most critical part of the on-disk format because they tell everything that parses the on-disk format what format exists on disk. If it is changed to something the tools don't recognise, the tools should /absolutely refuse/ to run in anything other than the mode indicated by the feature flags (i.e. read-only or not at all). This, however, only turns off part of xfs_db's "make it easy-for-non-experts" automatic type detection functionality. If you know what you are doing and how xfs_db works (i.e. the user is an expert), this is trivial to get around. So, let me demonstrate: $ sudo xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" /dev/vdc features_ro_compat = 0x1000 $ sudo xfs_db -x -c "sb 0" -c "write features_ro_compat 0" /dev/vdc Superblock has unknown read-only compatible features (0x1000) enabled. Attempted to mount read-only compatible filesystem read-write. Filesystem can only be safely mounted read only. no current type $ Ok, there's an error there that tells me I can't decode the buffer that was loaded because automatic type detection was not run. So, lets load it as a raw buffer and set the type manually: $ sudo xfs_db -r -c "daddr 0" -c "type sb" -c "p features_ro_compat" /dev/vdc Superblock has unknown read-only compatible features (0x1000) enabled. Attempted to mount read-only compatible filesystem read-write. Filesystem can only be safely mounted read only. features_ro_compat = 0x1000 Yup, there we go, access to the superblock type parsing is re-enabled by starting with a raw data buffer, then setting the type manually. We probably should fix userspace to then remount in read-only mode so this works correctly without needing this step. (It's probably just a libxfs init flag that is wrong.) So: $ sudo xfs_db -x -c "daddr 0" -c "type sb" -c "write features_ro_compat 0" /dev/vdc Superblock has unknown read-only compatible features (0x1000) enabled. Attempted to mount read-only compatible filesystem read-write. Filesystem can only be safely mounted read only. features_ro_compat = 0 $ sudo xfs_db -r -c "sb 0" -c "p features_ro_compat" /dev/vdc features_ro_compat = 0 $ Yup, I just reset it to zero with expert mode, simply by knowing how xfs_db works and avoiding the error that occurred with automatic type detection. But even if I can't set the type manually, I can still fix this by going back to first principles. $ sudo xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" /dev/vdc features_ro_compat = 0x1000 $ First - manually find the on-disk format structure offset: $ pahole fs/xfs/libxfs/xfs_sb.o |grep sb_features_ro_compat __uint32_t sb_features_ro_compat; /* 212 4 */ __be32 sb_features_ro_compat; /* 212 4 */ $ Now we can just write zeros directly to the raw buffer, then check it by re-reading the superblock using the automatic type detection: $ sudo xfs_db -x -c "daddr 0" -c "write fill 0 212 4" -c "sb 0" -c "p features_ro_compat" /dev/vdc features_ro_compat = 0 Yup, problem solved. But: $ sudo xfs_db -r -c "sb 0" -c "p features_ro_compat" /dev/vdc Metadata CRC error detected at xfs_sb block 0x0/0x200 features_ro_compat = 0 It's not a clean solution for v5 filesystems - I have to recalc the CRC. Previous I would have simply run xfs_repair to fix this, but now I can do this: $ sudo xfs_db -x -c "sb 0" -c "crc" -c "crc -r" /dev/vdc Verifying CRC: crc = 0xdea3392d (bad) Recalculating CRC: crc = 0x3a7763c8 (correct) $ sudo xfs_db -r -c "sb 0" -c "p features_ro_compat" /dev/vdc features_ro_compat = 0 $ And now it's all good. So, as you can see we have /multiple ways/ we can fix bad feature fields with xfs_db. None of them require magic force flags, but it does require the ability to solve problems from first principles. Cheers, Dave.
On Sat, Aug 27, 2016 at 10:43:18AM +1000, Dave Chinner wrote: > On Fri, Aug 26, 2016 at 07:04:56PM +0800, Zorro Lang wrote: > > When I try to do below steps(a V5 xfs on $fsile): > > > > # xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" $fsfile > > # xfs_db -x -c "sb 0" -c "write features_ro_compat 0" $fsfile > > # xfs_db -c "sb 0" -c p $fsfile > > > > The step#2 and #3 all failed, as: > > > > Superblock has unknown read-only compatible features (0x1000) enable > > As it should. xfs_db *in expert mode* allows you to shoot yourself > in the foot. However, it doesn't guarantee you'll have the expertise > to be able to fix the hole you just shot in your foot. > > You have much to learn, grasshopper. :P So I wrote a *stupid* patch at here... HaHa, I'm not afraid to try more *this* if I can learn more:) Thanks so much you take time to explain lots of things for me. That's my pleasure:) And they're very helpful for me, I'm trying to read more man-pages and code for understand all you said below(Looks like I have much things to do this weekend:-P) Thanks, Zorro > > > If we break the superblock, at least xfs_db should help to print the > > superblock info. > > It should. You tried to write to it in expert mode, though. Try just > printing the field in read-only mode (-r).... > > > And as a xfs debugger, xfs_db should can ignore > > unexpected errors to continue the "expert" command which an expert > > want to do. > > You're making the assumption that "-x" makes the user an expert. > That is far from the truth - it just means someone read a man page, > not that they have any specific XFS knowledge. The talk I > gave at LCA 2015 is relevant here: > > https://www.youtube.com/watch?v=VpuVDfSXs-g > > I go into a bit of depth about how cognitive biases affect how > people learn and assess their levels of expertise. This should > explain why "expert" mode still needs /some/ safeguards. > > Remember: feature flags are the most critical part of the on-disk > format because they tell everything that parses the on-disk format > what format exists on disk. If it is changed to something the tools > don't recognise, the tools should /absolutely refuse/ to run in > anything other than the mode indicated by the feature flags (i.e. > read-only or not at all). > > This, however, only turns off part of xfs_db's "make it > easy-for-non-experts" automatic type detection functionality. If you > know what you are doing and how xfs_db works (i.e. the user is an > expert), this is trivial to get around. > > So, let me demonstrate: > > $ sudo xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" /dev/vdc > features_ro_compat = 0x1000 > $ sudo xfs_db -x -c "sb 0" -c "write features_ro_compat 0" /dev/vdc > Superblock has unknown read-only compatible features (0x1000) enabled. > Attempted to mount read-only compatible filesystem read-write. > Filesystem can only be safely mounted read only. > no current type > $ > > Ok, there's an error there that tells me I can't decode the buffer > that was loaded because automatic type detection was not run. So, > lets load it as a raw buffer and set the type manually: > > $ sudo xfs_db -r -c "daddr 0" -c "type sb" -c "p features_ro_compat" /dev/vdc > Superblock has unknown read-only compatible features (0x1000) enabled. > Attempted to mount read-only compatible filesystem read-write. > Filesystem can only be safely mounted read only. > features_ro_compat = 0x1000 > > Yup, there we go, access to the superblock type parsing is > re-enabled by starting with a raw data buffer, then setting the type > manually. We probably should fix userspace to then remount in > read-only mode so this works correctly without needing this step. > (It's probably just a libxfs init flag that is wrong.) > > So: > > $ sudo xfs_db -x -c "daddr 0" -c "type sb" -c "write features_ro_compat 0" /dev/vdc > Superblock has unknown read-only compatible features (0x1000) enabled. > Attempted to mount read-only compatible filesystem read-write. > Filesystem can only be safely mounted read only. > features_ro_compat = 0 > $ sudo xfs_db -r -c "sb 0" -c "p features_ro_compat" /dev/vdc > features_ro_compat = 0 > $ > > Yup, I just reset it to zero with expert mode, simply by knowing how > xfs_db works and avoiding the error that occurred with automatic type > detection. But even if I can't set the type manually, I can still > fix this by going back to first principles. > > $ sudo xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" /dev/vdc > features_ro_compat = 0x1000 > $ > > First - manually find the on-disk format structure offset: > > $ pahole fs/xfs/libxfs/xfs_sb.o |grep sb_features_ro_compat > __uint32_t sb_features_ro_compat; /* 212 4 */ > __be32 sb_features_ro_compat; /* 212 4 */ > $ > > Now we can just write zeros directly to the raw buffer, then check > it by re-reading the superblock using the automatic type detection: > > $ sudo xfs_db -x -c "daddr 0" -c "write fill 0 212 4" -c "sb 0" -c "p features_ro_compat" /dev/vdc > features_ro_compat = 0 > > Yup, problem solved. But: > > $ sudo xfs_db -r -c "sb 0" -c "p features_ro_compat" /dev/vdc > Metadata CRC error detected at xfs_sb block 0x0/0x200 > features_ro_compat = 0 > > It's not a clean solution for v5 filesystems - I have to recalc the > CRC. Previous I would have simply run xfs_repair to fix this, but > now I can do this: > > $ sudo xfs_db -x -c "sb 0" -c "crc" -c "crc -r" /dev/vdc > Verifying CRC: > crc = 0xdea3392d (bad) > Recalculating CRC: > crc = 0x3a7763c8 (correct) > $ sudo xfs_db -r -c "sb 0" -c "p features_ro_compat" /dev/vdc > features_ro_compat = 0 > $ > > And now it's all good. > > So, as you can see we have /multiple ways/ we can fix bad feature > fields with xfs_db. None of them require magic force flags, but it > does require the ability to solve problems from first principles. > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs
diff --git a/db/init.c b/db/init.c index c0472c8..690e6ea 100644 --- a/db/init.c +++ b/db/init.c @@ -29,6 +29,7 @@ #include "malloc.h" #include "type.h" +int xfs_skip_verify = 0; static char **cmdline; static int ncmdline; char *fsdevice; @@ -75,7 +76,7 @@ init( x.disfile = 1; break; case 'F': - force = 1; + force++; break; case 'i': x.isreadonly = (LIBXFS_ISREADONLY|LIBXFS_ISINACTIVE); @@ -105,6 +106,9 @@ init( /*NOTREACHED*/ } + if (force > 1) + xfs_skip_verify = 1; + fsdevice = argv[optind]; if (!x.disfile) x.volname = fsdevice; diff --git a/db/io.c b/db/io.c index 91cab12..897388d 100644 --- a/db/io.c +++ b/db/io.c @@ -41,6 +41,16 @@ static void back_help(void); static int ring_f(int argc, char **argv); static void ring_help(void); +/* + * If xfs_skip_verify is true, use this dummy xfs_buf_ops structure + * to instead of the real xfs_buf_ops in set_cur() + */ +static const struct xfs_buf_ops xfs_dummy_buf_ops = { + .name = "dummy", + .verify_read = xfs_dummy_verify, + .verify_write = xfs_dummy_verify, +}; + static const cmdinfo_t pop_cmd = { "pop", NULL, pop_f, 0, 0, 0, NULL, N_("pop location from the stack"), pop_help }; @@ -503,7 +513,16 @@ set_cur( xfs_ino_t dirino; xfs_ino_t ino; __uint16_t mode; - const struct xfs_buf_ops *ops = t ? t->bops : NULL; + const struct xfs_buf_ops *ops = NULL; + + if (t) { + if (xfs_skip_verify) { + ops = &xfs_dummy_buf_ops; + } else { + ops = t->bops; + } + } else + ops = NULL; if (iocur_sp < 0) { dbprintf(_("set_cur no stack element to set\n")); diff --git a/db/io.h b/db/io.h index c69e9ce..eb64638 100644 --- a/db/io.h +++ b/db/io.h @@ -16,6 +16,8 @@ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ +extern int xfs_skip_verify; + struct typ; #define BBMAP_SIZE (XFS_MAX_BLOCKSIZE / BBSIZE) diff --git a/man/man8/xfs_db.8 b/man/man8/xfs_db.8 index ff8f862..c52a5bf 100644 --- a/man/man8/xfs_db.8 +++ b/man/man8/xfs_db.8 @@ -57,6 +57,10 @@ Specifies that we want to continue even if the superblock magic is not correct. For use in .BR xfs_metadump . .TP +.B \-FF +The "force force" mode. Skip all read/write verify to continue the command, +even if it'll cause something be broken. +.TP .B \-i Allows execution on a mounted filesystem, provided it is mounted read-only. Useful for shell scripts
When I try to do below steps(a V5 xfs on $fsile): # xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" $fsfile # xfs_db -x -c "sb 0" -c "write features_ro_compat 0" $fsfile # xfs_db -c "sb 0" -c p $fsfile The step#2 and #3 all failed, as: Superblock has unknown read-only compatible features (0x1000) enable When the "sb" command try to verify the superblock, it find a bad features_ro_compat number then end the xfs_db process. Even"-F" option can't help more. So we need a "super force" mode which can ignore all "verify" failures, continue the command running. Signed-off-by: Zorro Lang <zlang@redhat.com> --- Hi, I don't know if my patch is good or not, but I think the "--forceforce" option is needed for xfs_db. As above example: # xfs_db -x -c "sb 0" -c "write features_ro_compat 4096" $fsfile # xfs_db -x -c "sb 0" -c "write features_ro_compat 0" $fsfile # xfs_db -c "sb 0" -c p $fsfile If we break the superblock, at least xfs_db should help to print the superblock info. And as a xfs debugger, xfs_db should can ignore unexpected errors to continue the "expert" command which an expert want to do. That's my personal opinion, so if I'm wrong, feel free to correct me:) Thanks, Zorro db/init.c | 6 +++++- db/io.c | 21 ++++++++++++++++++++- db/io.h | 2 ++ man/man8/xfs_db.8 | 4 ++++ 4 files changed, 31 insertions(+), 2 deletions(-)