Message ID | Yo03mZ12X1nLGihK@magnolia (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xfs: test mkfs.xfs sizing of internal logs that | expand |
On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > This is a regression test that exercises the mkfs.xfs code that creates > log sizes that are very close to the AG size when stripe units are in > play and/or when the log is forced to be in AG 0. > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > --- > tests/xfs/843 | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > tests/xfs/843.out | 2 ++ > 2 files changed, 58 insertions(+) > create mode 100755 tests/xfs/843 > create mode 100644 tests/xfs/843.out > > diff --git a/tests/xfs/843 b/tests/xfs/843 > new file mode 100755 > index 00000000..3384b1aa > --- /dev/null > +++ b/tests/xfs/843 > @@ -0,0 +1,56 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2022 Oracle. All Rights Reserved. > +# > +# FS QA Test 843 > +# > +# Now that we've increased the default log size calculation, test mkfs with > +# various stripe units and filesystem sizes to see if we can provoke mkfs into > +# breaking. > +# > +. ./common/preamble > +_begin_fstest auto mkfs > + > +_cleanup() > +{ > + cd / > + rm -r -f $tmp.* $testfile > +} I'd omit this completely. > +# real QA test starts here > + > +# Modify as appropriate. > +_supported_fs xfs > +_require_test > + > +testfile=$TEST_DIR/a > +rm -f $testfile > + > +test_format() { > + local tag="$1" > + shift > + > + echo "$tag" >> $seqres.full > + $MKFS_XFS_PROG $@ -d file,name=$testfile &>> $seqres.full > + local res=$? > + test $res -eq 0 || echo "$tag FAIL $res" | tee -a $seqres.full What breakage are you trying to provoke? Just the log size calculation? If so, why do we need to actually write the filesystem to disk? Won't "-N" still calculate everything and fail if it's broken or quit with success without needing to write anything to disk? > +} > + > +# First we try various small filesystems and stripe sizes. > +for M in `seq 298 302` `seq 490 520`; do > + for S in `seq 32 4 64`; do > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > + done > +done > + > +# log so large it pushes the root dir into AG 1 > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 > + > +# log end rounded beyond EOAG due to stripe unit > +test_format "log end beyond eoag" -d agcount=3200,size=6366g -d su=256k,sw=4 > + > +echo Silence is golden Put this at the top where the test is being set up (i.e. where you define testfile). That tells the reader straight away that no output is expected on a successful run before they start reading the test code.... Cheers, Dave.
On Wed, May 25, 2022 at 09:44:26AM +1000, Dave Chinner wrote: > On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@kernel.org> > > > > This is a regression test that exercises the mkfs.xfs code that creates > > log sizes that are very close to the AG size when stripe units are in > > play and/or when the log is forced to be in AG 0. > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > --- > > tests/xfs/843 | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > > tests/xfs/843.out | 2 ++ > > 2 files changed, 58 insertions(+) > > create mode 100755 tests/xfs/843 > > create mode 100644 tests/xfs/843.out > > > > diff --git a/tests/xfs/843 b/tests/xfs/843 > > new file mode 100755 > > index 00000000..3384b1aa > > --- /dev/null > > +++ b/tests/xfs/843 > > @@ -0,0 +1,56 @@ > > +#! /bin/bash > > +# SPDX-License-Identifier: GPL-2.0 > > +# Copyright (c) 2022 Oracle. All Rights Reserved. > > +# > > +# FS QA Test 843 > > +# > > +# Now that we've increased the default log size calculation, test mkfs with > > +# various stripe units and filesystem sizes to see if we can provoke mkfs into > > +# breaking. > > +# > > +. ./common/preamble > > +_begin_fstest auto mkfs > > + > > +_cleanup() > > +{ > > + cd / > > + rm -r -f $tmp.* $testfile > > +} > > I'd omit this completely. > > > +# real QA test starts here > > + > > +# Modify as appropriate. > > +_supported_fs xfs > > +_require_test > > + > > +testfile=$TEST_DIR/a > > +rm -f $testfile > > + > > +test_format() { > > + local tag="$1" > > + shift > > + > > + echo "$tag" >> $seqres.full > > + $MKFS_XFS_PROG $@ -d file,name=$testfile &>> $seqres.full > > + local res=$? > > + test $res -eq 0 || echo "$tag FAIL $res" | tee -a $seqres.full > > What breakage are you trying to provoke? Just the log size > calculation? If so, why do we need to actually write the filesystem > to disk? Won't "-N" still calculate everything and fail if it's > broken or quit with success without needing to write anything to > disk? It will, but... > > +} > > + > > +# First we try various small filesystems and stripe sizes. > > +for M in `seq 298 302` `seq 490 520`; do > > + for S in `seq 32 4 64`; do > > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > > + done > > +done > > + > > +# log so large it pushes the root dir into AG 1 > > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 ...this particular check in mkfs only happens after we allocate the root directory, which an -N invocation doesn't do. > > + > > +# log end rounded beyond EOAG due to stripe unit > > +test_format "log end beyond eoag" -d agcount=3200,size=6366g -d su=256k,sw=4 > > + > > +echo Silence is golden > > Put this at the top where the test is being set up (i.e. where you > define testfile). That tells the reader straight away that no output > is expected on a successful run before they start reading the test > code.... Hmm, we probably ought to update the ./new template too. --D > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com
On Tue, May 24, 2022 at 04:49:30PM -0700, Darrick J. Wong wrote: > On Wed, May 25, 2022 at 09:44:26AM +1000, Dave Chinner wrote: > > On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > This is a regression test that exercises the mkfs.xfs code that creates > > > log sizes that are very close to the AG size when stripe units are in > > > play and/or when the log is forced to be in AG 0. > > > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > > --- > > > tests/xfs/843 | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > tests/xfs/843.out | 2 ++ > > > 2 files changed, 58 insertions(+) > > > create mode 100755 tests/xfs/843 > > > create mode 100644 tests/xfs/843.out > > > > > > diff --git a/tests/xfs/843 b/tests/xfs/843 > > > new file mode 100755 > > > index 00000000..3384b1aa > > > --- /dev/null > > > +++ b/tests/xfs/843 > > > @@ -0,0 +1,56 @@ > > > +#! /bin/bash > > > +# SPDX-License-Identifier: GPL-2.0 > > > +# Copyright (c) 2022 Oracle. All Rights Reserved. > > > +# > > > +# FS QA Test 843 > > > +# > > > +# Now that we've increased the default log size calculation, test mkfs with > > > +# various stripe units and filesystem sizes to see if we can provoke mkfs into > > > +# breaking. > > > +# > > > +. ./common/preamble > > > +_begin_fstest auto mkfs > > > + > > > +_cleanup() > > > +{ > > > + cd / > > > + rm -r -f $tmp.* $testfile > > > +} > > > > I'd omit this completely. > > > > > +# real QA test starts here > > > + > > > +# Modify as appropriate. > > > +_supported_fs xfs > > > +_require_test > > > + > > > +testfile=$TEST_DIR/a > > > +rm -f $testfile > > > + > > > +test_format() { > > > + local tag="$1" > > > + shift > > > + > > > + echo "$tag" >> $seqres.full > > > + $MKFS_XFS_PROG $@ -d file,name=$testfile &>> $seqres.full > > > + local res=$? > > > + test $res -eq 0 || echo "$tag FAIL $res" | tee -a $seqres.full > > > > What breakage are you trying to provoke? Just the log size > > calculation? If so, why do we need to actually write the filesystem > > to disk? Won't "-N" still calculate everything and fail if it's > > broken or quit with success without needing to write anything to > > disk? > > It will, but... > > > > +} > > > + > > > +# First we try various small filesystems and stripe sizes. > > > +for M in `seq 298 302` `seq 490 520`; do > > > + for S in `seq 32 4 64`; do > > > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > > > + done > > > +done > > > + > > > +# log so large it pushes the root dir into AG 1 > > > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 > > ...this particular check in mkfs only happens after we allocate the root > directory, which an -N invocation doesn't do. Ok, so for this test can we drop the -N? We don't need to do 30 IOs and write 64MB logs for every config we test - I think there's ~35 * 8 invocations of test_format in the loop above before we get here... Also, why do we need a 6.3TB filesystem with 2.1GB AGs and a 2GB log to trigger this? That means we have to write 2GB to disk, plus ~20,000 write IOs for the AG headers and btree root blocks before we get to the failure case, yes? Why not just exercise the failure case with something like this: # mkfs.xfs -d agcount=2,size=64M -l size=8180b,agnum=0 -d file,name=test.img meta-data=test.img isize=512 agcount=2, agsize=8192 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=0 = reflink=1 bigtime=0 inobtcount=0 data = bsize=4096 blocks=16384, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1 log =internal log bsize=4096 blocks=8180, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 mkfs.xfs: root inode created in AG 1, not AG 0 # echo $? 1 # Otherwise I don't exactly understand what this specific case is supposed to be testing, so maybe some more explaining is necessary? Cheers, Dave.
On Wed, May 25, 2022 at 10:24:13AM +1000, Dave Chinner wrote: > On Tue, May 24, 2022 at 04:49:30PM -0700, Darrick J. Wong wrote: > > On Wed, May 25, 2022 at 09:44:26AM +1000, Dave Chinner wrote: > > > On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > > > This is a regression test that exercises the mkfs.xfs code that creates > > > > log sizes that are very close to the AG size when stripe units are in > > > > play and/or when the log is forced to be in AG 0. > > > > > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > > > --- > > > > tests/xfs/843 | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > > tests/xfs/843.out | 2 ++ > > > > 2 files changed, 58 insertions(+) > > > > create mode 100755 tests/xfs/843 > > > > create mode 100644 tests/xfs/843.out > > > > > > > > diff --git a/tests/xfs/843 b/tests/xfs/843 > > > > new file mode 100755 > > > > index 00000000..3384b1aa > > > > --- /dev/null > > > > +++ b/tests/xfs/843 > > > > @@ -0,0 +1,56 @@ > > > > +#! /bin/bash > > > > +# SPDX-License-Identifier: GPL-2.0 > > > > +# Copyright (c) 2022 Oracle. All Rights Reserved. > > > > +# > > > > +# FS QA Test 843 > > > > +# > > > > +# Now that we've increased the default log size calculation, test mkfs with > > > > +# various stripe units and filesystem sizes to see if we can provoke mkfs into > > > > +# breaking. > > > > +# > > > > +. ./common/preamble > > > > +_begin_fstest auto mkfs > > > > + > > > > +_cleanup() > > > > +{ > > > > + cd / > > > > + rm -r -f $tmp.* $testfile > > > > +} > > > > > > I'd omit this completely. > > > > > > > +# real QA test starts here > > > > + > > > > +# Modify as appropriate. > > > > +_supported_fs xfs > > > > +_require_test > > > > + > > > > +testfile=$TEST_DIR/a > > > > +rm -f $testfile > > > > + > > > > +test_format() { > > > > + local tag="$1" > > > > + shift > > > > + > > > > + echo "$tag" >> $seqres.full > > > > + $MKFS_XFS_PROG $@ -d file,name=$testfile &>> $seqres.full > > > > + local res=$? > > > > + test $res -eq 0 || echo "$tag FAIL $res" | tee -a $seqres.full > > > > > > What breakage are you trying to provoke? Just the log size > > > calculation? If so, why do we need to actually write the filesystem > > > to disk? Won't "-N" still calculate everything and fail if it's > > > broken or quit with success without needing to write anything to > > > disk? > > > > It will, but... > > > > > > +} > > > > + > > > > +# First we try various small filesystems and stripe sizes. > > > > +for M in `seq 298 302` `seq 490 520`; do > > > > + for S in `seq 32 4 64`; do > > > > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > > > > + done > > > > +done > > > > + > > > > +# log so large it pushes the root dir into AG 1 > > > > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 > > > > ...this particular check in mkfs only happens after we allocate the root > > directory, which an -N invocation doesn't do. > > Ok, so for this test can we drop the -N? We don't need to do 30 IOs > and write 64MB logs for every config we test - I think there's ~35 * > 8 invocations of test_format in the loop above before we get here... > > Also, why do we need a 6.3TB filesystem with 2.1GB AGs and a 2GB log > to trigger this? That means we have to write 2GB to disk, plus > ~20,000 write IOs for the AG headers and btree root blocks before we > get to the failure case, yes? So long as the test filesystem is itself XFS and a reasonably new vintage, mkfs.xfs will call FALLOC_FL_ZERO_RANGE to zero the log, which means we're not writing 2GB to storage with every format. > Why not just exercise the failure case with something like this: > > # mkfs.xfs -d agcount=2,size=64M -l size=8180b,agnum=0 -d file,name=test.img > meta-data=test.img isize=512 agcount=2, agsize=8192 blks > = sectsz=512 attr=2, projid32bit=1 > = crc=1 finobt=1, sparse=1, rmapbt=0 > = reflink=1 bigtime=0 inobtcount=0 > data = bsize=4096 blocks=16384, imaxpct=25 > = sunit=0 swidth=0 blks > naming =version 2 bsize=4096 ascii-ci=0, ftype=1 > log =internal log bsize=4096 blocks=8180, version=2 > = sectsz=512 sunit=0 blks, lazy-count=1 > realtime =none extsz=4096 blocks=0, rtextents=0 > mkfs.xfs: root inode created in AG 1, not AG 0 > # echo $? > 1 > # > > Otherwise I don't exactly understand what this specific case is > supposed to be testing, so maybe some more explaining is necessary? To be honest, I hadn't thought of that; this test encodes all the weird scenarios that Eric and I came up with over many hours of bluejeans chatting to see if we could trip up all the mkfs log sizing code and related fixes. --D > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com
On Tue, May 24, 2022 at 05:32:50PM -0700, Darrick J. Wong wrote: > On Wed, May 25, 2022 at 10:24:13AM +1000, Dave Chinner wrote: > > On Tue, May 24, 2022 at 04:49:30PM -0700, Darrick J. Wong wrote: > > > On Wed, May 25, 2022 at 09:44:26AM +1000, Dave Chinner wrote: > > > > On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > > > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > > > > > This is a regression test that exercises the mkfs.xfs code that creates > > > > > log sizes that are very close to the AG size when stripe units are in > > > > > play and/or when the log is forced to be in AG 0. > > > > > > > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > > > > --- > > > > > tests/xfs/843 | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > > > tests/xfs/843.out | 2 ++ > > > > > 2 files changed, 58 insertions(+) > > > > > create mode 100755 tests/xfs/843 > > > > > create mode 100644 tests/xfs/843.out > > > > > > > > > > diff --git a/tests/xfs/843 b/tests/xfs/843 > > > > > new file mode 100755 > > > > > index 00000000..3384b1aa > > > > > --- /dev/null > > > > > +++ b/tests/xfs/843 > > > > > @@ -0,0 +1,56 @@ > > > > > +#! /bin/bash > > > > > +# SPDX-License-Identifier: GPL-2.0 > > > > > +# Copyright (c) 2022 Oracle. All Rights Reserved. > > > > > +# > > > > > +# FS QA Test 843 > > > > > +# > > > > > +# Now that we've increased the default log size calculation, test mkfs with > > > > > +# various stripe units and filesystem sizes to see if we can provoke mkfs into > > > > > +# breaking. > > > > > +# > > > > > +. ./common/preamble > > > > > +_begin_fstest auto mkfs > > > > > + > > > > > +_cleanup() > > > > > +{ > > > > > + cd / > > > > > + rm -r -f $tmp.* $testfile > > > > > +} > > > > > > > > I'd omit this completely. > > > > > > > > > +# real QA test starts here > > > > > + > > > > > +# Modify as appropriate. > > > > > +_supported_fs xfs > > > > > +_require_test > > > > > + > > > > > +testfile=$TEST_DIR/a > > > > > +rm -f $testfile > > > > > + > > > > > +test_format() { > > > > > + local tag="$1" > > > > > + shift > > > > > + > > > > > + echo "$tag" >> $seqres.full > > > > > + $MKFS_XFS_PROG $@ -d file,name=$testfile &>> $seqres.full > > > > > + local res=$? > > > > > + test $res -eq 0 || echo "$tag FAIL $res" | tee -a $seqres.full > > > > > > > > What breakage are you trying to provoke? Just the log size > > > > calculation? If so, why do we need to actually write the filesystem > > > > to disk? Won't "-N" still calculate everything and fail if it's > > > > broken or quit with success without needing to write anything to > > > > disk? > > > > > > It will, but... > > > > > > > > +} > > > > > + > > > > > +# First we try various small filesystems and stripe sizes. > > > > > +for M in `seq 298 302` `seq 490 520`; do > > > > > + for S in `seq 32 4 64`; do > > > > > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > > > > > + done > > > > > +done > > > > > + > > > > > +# log so large it pushes the root dir into AG 1 > > > > > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 > > > > > > ...this particular check in mkfs only happens after we allocate the root > > > directory, which an -N invocation doesn't do. > > > > Ok, so for this test can we drop the -N? We don't need to do 30 IOs > > and write 64MB logs for every config we test - I think there's ~35 * > > 8 invocations of test_format in the loop above before we get here... > > > > Also, why do we need a 6.3TB filesystem with 2.1GB AGs and a 2GB log > > to trigger this? That means we have to write 2GB to disk, plus > > ~20,000 write IOs for the AG headers and btree root blocks before we > > get to the failure case, yes? > > So long as the test filesystem is itself XFS and a reasonably new > vintage, mkfs.xfs will call FALLOC_FL_ZERO_RANGE to zero the log, which > means we're not writing 2GB to storage with every format. Oh, cycle = XLOG_INIT_CYCLE short-cuts stamping the head/tail LSNs into every sector of the log. Ok, so we don't have to write the log in this case, just issue 20,000 IOs to write out the AG headers... > > Why not just exercise the failure case with something like this: > > > > # mkfs.xfs -d agcount=2,size=64M -l size=8180b,agnum=0 -d file,name=test.img > > meta-data=test.img isize=512 agcount=2, agsize=8192 blks > > = sectsz=512 attr=2, projid32bit=1 > > = crc=1 finobt=1, sparse=1, rmapbt=0 > > = reflink=1 bigtime=0 inobtcount=0 > > data = bsize=4096 blocks=16384, imaxpct=25 > > = sunit=0 swidth=0 blks > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1 > > log =internal log bsize=4096 blocks=8180, version=2 > > = sectsz=512 sunit=0 blks, lazy-count=1 > > realtime =none extsz=4096 blocks=0, rtextents=0 > > mkfs.xfs: root inode created in AG 1, not AG 0 > > # echo $? > > 1 > > # > > > > Otherwise I don't exactly understand what this specific case is > > supposed to be testing, so maybe some more explaining is necessary? > > To be honest, I hadn't thought of that; this test encodes all the weird > scenarios that Eric and I came up with over many hours of bluejeans > chatting to see if we could trip up all the mkfs log sizing code and > related fixes. *nod*. I figured it was probably either an oversight or me being incredibly stupid again. EIther way, it looks like the only way we can trip over this is incredibly contrived mkfs configs, so I'd argue that the smaller, faster option is just as good as the bigger, slower one... Cheers, Dave.
On Wed, May 25, 2022 at 10:24:13AM +1000, Dave Chinner wrote: > On Tue, May 24, 2022 at 04:49:30PM -0700, Darrick J. Wong wrote: > > On Wed, May 25, 2022 at 09:44:26AM +1000, Dave Chinner wrote: > > > On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > > > > From: Darrick J. Wong <djwong@kernel.org> > > > > > > > > This is a regression test that exercises the mkfs.xfs code that creates > > > > log sizes that are very close to the AG size when stripe units are in > > > > play and/or when the log is forced to be in AG 0. > > > > > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > > > --- > > > > tests/xfs/843 | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > > tests/xfs/843.out | 2 ++ > > > > 2 files changed, 58 insertions(+) > > > > create mode 100755 tests/xfs/843 > > > > create mode 100644 tests/xfs/843.out > > > > > > > > diff --git a/tests/xfs/843 b/tests/xfs/843 > > > > new file mode 100755 > > > > index 00000000..3384b1aa > > > > --- /dev/null > > > > +++ b/tests/xfs/843 > > > > @@ -0,0 +1,56 @@ > > > > +#! /bin/bash > > > > +# SPDX-License-Identifier: GPL-2.0 > > > > +# Copyright (c) 2022 Oracle. All Rights Reserved. > > > > +# > > > > +# FS QA Test 843 > > > > +# > > > > +# Now that we've increased the default log size calculation, test mkfs with > > > > +# various stripe units and filesystem sizes to see if we can provoke mkfs into > > > > +# breaking. > > > > +# > > > > +. ./common/preamble > > > > +_begin_fstest auto mkfs > > > > + > > > > +_cleanup() > > > > +{ > > > > + cd / > > > > + rm -r -f $tmp.* $testfile > > > > +} > > > > > > I'd omit this completely. > > > > > > > +# real QA test starts here > > > > + > > > > +# Modify as appropriate. > > > > +_supported_fs xfs > > > > +_require_test > > > > + > > > > +testfile=$TEST_DIR/a > > > > +rm -f $testfile > > > > + > > > > +test_format() { > > > > + local tag="$1" > > > > + shift > > > > + > > > > + echo "$tag" >> $seqres.full > > > > + $MKFS_XFS_PROG $@ -d file,name=$testfile &>> $seqres.full > > > > + local res=$? > > > > + test $res -eq 0 || echo "$tag FAIL $res" | tee -a $seqres.full > > > > > > What breakage are you trying to provoke? Just the log size > > > calculation? If so, why do we need to actually write the filesystem > > > to disk? Won't "-N" still calculate everything and fail if it's > > > broken or quit with success without needing to write anything to > > > disk? > > > > It will, but... > > > > > > +} > > > > + > > > > +# First we try various small filesystems and stripe sizes. > > > > +for M in `seq 298 302` `seq 490 520`; do > > > > + for S in `seq 32 4 64`; do > > > > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > > > > + done > > > > +done > > > > + > > > > +# log so large it pushes the root dir into AG 1 > > > > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 > > > > ...this particular check in mkfs only happens after we allocate the root > > directory, which an -N invocation doesn't do. > > Ok, so for this test can we drop the -N? We don't need to do 30 IOs > and write 64MB logs for every config we test - I think there's ~35 * > 8 invocations of test_format in the loop above before we get here... > > Also, why do we need a 6.3TB filesystem with 2.1GB AGs and a 2GB log > to trigger this? That means we have to write 2GB to disk, plus > ~20,000 write IOs for the AG headers and btree root blocks before we > get to the failure case, yes? > > Why not just exercise the failure case with something like this: > > # mkfs.xfs -d agcount=2,size=64M -l size=8180b,agnum=0 -d file,name=test.img > meta-data=test.img isize=512 agcount=2, agsize=8192 blks > = sectsz=512 attr=2, projid32bit=1 > = crc=1 finobt=1, sparse=1, rmapbt=0 > = reflink=1 bigtime=0 inobtcount=0 > data = bsize=4096 blocks=16384, imaxpct=25 > = sunit=0 swidth=0 blks > naming =version 2 bsize=4096 ascii-ci=0, ftype=1 > log =internal log bsize=4096 blocks=8180, version=2 > = sectsz=512 sunit=0 blks, lazy-count=1 > realtime =none extsz=4096 blocks=0, rtextents=0 > mkfs.xfs: root inode created in AG 1, not AG 0 Welll... a better reason for why this test can't do that -- one of my fixes also made mkfs reject -l size=XXX when XXX is not possible. That said... -d agcount=3200,size=6366g -lagnum=0 -N seems to work? --D > # echo $? > 1 > # > > Otherwise I don't exactly understand what this specific case is > supposed to be testing, so maybe some more explaining is necessary? > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com
On Wed, May 25, 2022 at 04:21:36PM -0700, Darrick J. Wong wrote: > On Wed, May 25, 2022 at 10:24:13AM +1000, Dave Chinner wrote: > > On Tue, May 24, 2022 at 04:49:30PM -0700, Darrick J. Wong wrote: > > > On Wed, May 25, 2022 at 09:44:26AM +1000, Dave Chinner wrote: > > > > On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > > > > > +# First we try various small filesystems and stripe sizes. > > > > > +for M in `seq 298 302` `seq 490 520`; do > > > > > + for S in `seq 32 4 64`; do > > > > > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > > > > > + done > > > > > +done > > > > > + > > > > > +# log so large it pushes the root dir into AG 1 > > > > > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 > > > > > > ...this particular check in mkfs only happens after we allocate the root > > > directory, which an -N invocation doesn't do. > > > > Ok, so for this test can we drop the -N? We don't need to do 30 IOs > > and write 64MB logs for every config we test - I think there's ~35 * > > 8 invocations of test_format in the loop above before we get here... > > > > Also, why do we need a 6.3TB filesystem with 2.1GB AGs and a 2GB log > > to trigger this? That means we have to write 2GB to disk, plus > > ~20,000 write IOs for the AG headers and btree root blocks before we > > get to the failure case, yes? > > > > Why not just exercise the failure case with something like this: > > > > # mkfs.xfs -d agcount=2,size=64M -l size=8180b,agnum=0 -d file,name=test.img > > meta-data=test.img isize=512 agcount=2, agsize=8192 blks > > = sectsz=512 attr=2, projid32bit=1 > > = crc=1 finobt=1, sparse=1, rmapbt=0 > > = reflink=1 bigtime=0 inobtcount=0 > > data = bsize=4096 blocks=16384, imaxpct=25 > > = sunit=0 swidth=0 blks > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1 > > log =internal log bsize=4096 blocks=8180, version=2 > > = sectsz=512 sunit=0 blks, lazy-count=1 > > realtime =none extsz=4096 blocks=0, rtextents=0 > > mkfs.xfs: root inode created in AG 1, not AG 0 > > Welll... a better reason for why this test can't do that -- one of my > fixes also made mkfs reject -l size=XXX when XXX is not possible. So you make other changes to mkfs that aren't yet upstream that check whether the log causes the root inode to change position? I mean, '-l size=8180b' results in a log that fits perfectly fine in an empty AG, and so you must have added more checks to detect this moves the root inode somehow? > That said... -d agcount=3200,size=6366g -lagnum=0 -N seems to work? On a new mkfs binary with those checks that you mention above? If so, it seems to me that those new checks trigger for this case, too, before we write anything? IOWs, this won't fail on older mkfs.xfs binaries that don't have these new log size checks because we never get around to allocating the root inode and checking it's location, right? Yet what we really want here is for it to fail on an old mkfs binary? Cheers, Dave.
On Thu, May 26, 2022 at 10:11:49AM +1000, Dave Chinner wrote: > On Wed, May 25, 2022 at 04:21:36PM -0700, Darrick J. Wong wrote: > > On Wed, May 25, 2022 at 10:24:13AM +1000, Dave Chinner wrote: > > > On Tue, May 24, 2022 at 04:49:30PM -0700, Darrick J. Wong wrote: > > > > On Wed, May 25, 2022 at 09:44:26AM +1000, Dave Chinner wrote: > > > > > On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > > > > > > +# First we try various small filesystems and stripe sizes. > > > > > > +for M in `seq 298 302` `seq 490 520`; do > > > > > > + for S in `seq 32 4 64`; do > > > > > > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > > > > > > + done > > > > > > +done > > > > > > + > > > > > > +# log so large it pushes the root dir into AG 1 > > > > > > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 > > > > > > > > ...this particular check in mkfs only happens after we allocate the root > > > > directory, which an -N invocation doesn't do. > > > > > > Ok, so for this test can we drop the -N? We don't need to do 30 IOs > > > and write 64MB logs for every config we test - I think there's ~35 * > > > 8 invocations of test_format in the loop above before we get here... > > > > > > Also, why do we need a 6.3TB filesystem with 2.1GB AGs and a 2GB log > > > to trigger this? That means we have to write 2GB to disk, plus > > > ~20,000 write IOs for the AG headers and btree root blocks before we > > > get to the failure case, yes? > > > > > > Why not just exercise the failure case with something like this: > > > > > > # mkfs.xfs -d agcount=2,size=64M -l size=8180b,agnum=0 -d file,name=test.img > > > meta-data=test.img isize=512 agcount=2, agsize=8192 blks > > > = sectsz=512 attr=2, projid32bit=1 > > > = crc=1 finobt=1, sparse=1, rmapbt=0 > > > = reflink=1 bigtime=0 inobtcount=0 > > > data = bsize=4096 blocks=16384, imaxpct=25 > > > = sunit=0 swidth=0 blks > > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1 > > > log =internal log bsize=4096 blocks=8180, version=2 > > > = sectsz=512 sunit=0 blks, lazy-count=1 > > > realtime =none extsz=4096 blocks=0, rtextents=0 > > > mkfs.xfs: root inode created in AG 1, not AG 0 > > > > Welll... a better reason for why this test can't do that -- one of my > > fixes also made mkfs reject -l size=XXX when XXX is not possible. > > So you make other changes to mkfs that aren't yet upstream that > check whether the log causes the root inode to change position? It's in for-next, is that not good enough? > I mean, '-l size=8180b' results in a log that fits perfectly fine in > an empty AG, and so you must have added more checks to detect this > moves the root inode somehow? > > > That said... -d agcount=3200,size=6366g -lagnum=0 -N seems to work? > > On a new mkfs binary with those checks that you mention above? If > so, it seems to me that those new checks trigger for this case, too, > before we write anything? > > IOWs, this won't fail on older mkfs.xfs binaries that don't have > these new log size checks because we never get around to allocating > the root inode and checking it's location, right? Yet what we > really want here is for it to fail on an old mkfs binary? Yeah, something like that. It's been so long since Eric and I were actively poking on this that I don't remember so well anymore. --D > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com
On Wed, May 25, 2022 at 06:35:20PM -0700, Darrick J. Wong wrote: > On Thu, May 26, 2022 at 10:11:49AM +1000, Dave Chinner wrote: > > On Wed, May 25, 2022 at 04:21:36PM -0700, Darrick J. Wong wrote: > > > On Wed, May 25, 2022 at 10:24:13AM +1000, Dave Chinner wrote: > > > > On Tue, May 24, 2022 at 04:49:30PM -0700, Darrick J. Wong wrote: > > > > > On Wed, May 25, 2022 at 09:44:26AM +1000, Dave Chinner wrote: > > > > > > On Tue, May 24, 2022 at 12:52:57PM -0700, Darrick J. Wong wrote: > > > > > > > +# First we try various small filesystems and stripe sizes. > > > > > > > +for M in `seq 298 302` `seq 490 520`; do > > > > > > > + for S in `seq 32 4 64`; do > > > > > > > + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m > > > > > > > + done > > > > > > > +done > > > > > > > + > > > > > > > +# log so large it pushes the root dir into AG 1 > > > > > > > +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 > > > > > > > > > > ...this particular check in mkfs only happens after we allocate the root > > > > > directory, which an -N invocation doesn't do. > > > > > > > > Ok, so for this test can we drop the -N? We don't need to do 30 IOs > > > > and write 64MB logs for every config we test - I think there's ~35 * > > > > 8 invocations of test_format in the loop above before we get here... > > > > > > > > Also, why do we need a 6.3TB filesystem with 2.1GB AGs and a 2GB log > > > > to trigger this? That means we have to write 2GB to disk, plus > > > > ~20,000 write IOs for the AG headers and btree root blocks before we > > > > get to the failure case, yes? > > > > > > > > Why not just exercise the failure case with something like this: > > > > > > > > # mkfs.xfs -d agcount=2,size=64M -l size=8180b,agnum=0 -d file,name=test.img > > > > meta-data=test.img isize=512 agcount=2, agsize=8192 blks > > > > = sectsz=512 attr=2, projid32bit=1 > > > > = crc=1 finobt=1, sparse=1, rmapbt=0 > > > > = reflink=1 bigtime=0 inobtcount=0 > > > > data = bsize=4096 blocks=16384, imaxpct=25 > > > > = sunit=0 swidth=0 blks > > > > naming =version 2 bsize=4096 ascii-ci=0, ftype=1 > > > > log =internal log bsize=4096 blocks=8180, version=2 > > > > = sectsz=512 sunit=0 blks, lazy-count=1 > > > > realtime =none extsz=4096 blocks=0, rtextents=0 > > > > mkfs.xfs: root inode created in AG 1, not AG 0 > > > > > > Welll... a better reason for why this test can't do that -- one of my > > > fixes also made mkfs reject -l size=XXX when XXX is not possible. > > > > So you make other changes to mkfs that aren't yet upstream that > > check whether the log causes the root inode to change position? > > It's in for-next, is that not good enough? Sorry, I didn't realise Eric had pulled it in. I'm not keep a close eye on exactly what non-libxfs-sync stuff going into xfsprogs right now - I've had to filter that stuff out to concentrate on the kernel cycle stuff... As it is, it's fine however we reject this case, isn't? it doesn't have to be the root inode error message that causes it to abort, as long as we get a non-zero exit value. Hence the above method would fail with both old and new mkfs binaries, even if they are detecting the issue differently. > > I mean, '-l size=8180b' results in a log that fits perfectly fine in > > an empty AG, and so you must have added more checks to detect this > > moves the root inode somehow? > > > > > That said... -d agcount=3200,size=6366g -lagnum=0 -N seems to work? > > > > On a new mkfs binary with those checks that you mention above? If > > so, it seems to me that those new checks trigger for this case, too, > > before we write anything? > > > > IOWs, this won't fail on older mkfs.xfs binaries that don't have > > these new log size checks because we never get around to allocating > > the root inode and checking it's location, right? Yet what we > > really want here is for it to fail on an old mkfs binary? > > Yeah, something like that. It's been so long since Eric and I were > actively poking on this that I don't remember so well anymore. Worth checking, because ISTM that if it doesn't fail on old mkfs, then we'd still the non "-N" version or the targetted log size variant like above. Of course, I could still be misunderstanding what you're trying to exercise here... Cheers, Dave.
diff --git a/tests/xfs/843 b/tests/xfs/843 new file mode 100755 index 00000000..3384b1aa --- /dev/null +++ b/tests/xfs/843 @@ -0,0 +1,56 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2022 Oracle. All Rights Reserved. +# +# FS QA Test 843 +# +# Now that we've increased the default log size calculation, test mkfs with +# various stripe units and filesystem sizes to see if we can provoke mkfs into +# breaking. +# +. ./common/preamble +_begin_fstest auto mkfs + +_cleanup() +{ + cd / + rm -r -f $tmp.* $testfile +} + +# real QA test starts here + +# Modify as appropriate. +_supported_fs xfs +_require_test + +testfile=$TEST_DIR/a +rm -f $testfile + +test_format() { + local tag="$1" + shift + + echo "$tag" >> $seqres.full + $MKFS_XFS_PROG $@ -d file,name=$testfile &>> $seqres.full + local res=$? + test $res -eq 0 || echo "$tag FAIL $res" | tee -a $seqres.full +} + +# First we try various small filesystems and stripe sizes. +for M in `seq 298 302` `seq 490 520`; do + for S in `seq 32 4 64`; do + test_format "M=$M S=$S" -dsu=${S}k,sw=1,size=${M}m + done +done + +# log so large it pushes the root dir into AG 1 +test_format "log pushes rootdir into AG 1" -d agcount=3200,size=6366g -lagnum=0 + +# log end rounded beyond EOAG due to stripe unit +test_format "log end beyond eoag" -d agcount=3200,size=6366g -d su=256k,sw=4 + +echo Silence is golden + +# success, all done +status=0 +exit diff --git a/tests/xfs/843.out b/tests/xfs/843.out new file mode 100644 index 00000000..87c13504 --- /dev/null +++ b/tests/xfs/843.out @@ -0,0 +1,2 @@ +QA output created by 843 +Silence is golden