[v2] xfstests: xfs discontiguous multi-block buffer logging test

Message ID 1464873001-17508-1-git-send-email-bfoster@redhat.com
State New

Commit Message

Brian Foster June 2, 2016, 1:10 p.m. UTC
XFS had a bug in the multi-block buffer logging code that caused a NULL
lv panic at log push time due to invalid regions being set in the buffer
log format bitmap. This was demonstrated by modifying a multi-block
directory buffer in a manner that only logs regions beyond the first
FSB-sized mapping of the buffer.

To recreate these conditions, this test fragments free space and
populates several directories with enough entries to require
discontiguous multi-block buffers. To recreate the problem, we remove
entries from the tail end of the directory and fsync to flush the log.

Note that this test causes a panic on kernels affected by the bug. As
such, it is included in the 'dangerous' group. The bug is resolved by
kernel commit a3916e528b91 ("xfs: fix broken multi-fsb buffer logging").

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

v2:
- Added comments and some aesthetic fixups.
- Fixed up output file.
- Added to quick group.
v1: http://thread.gmane.org/gmane.comp.file-systems.fstests/2475

 tests/xfs/399     | 121 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/399.out |   2 +
 tests/xfs/group   |   1 +
 3 files changed, 124 insertions(+)
 create mode 100755 tests/xfs/399
 create mode 100644 tests/xfs/399.out

Comments

Eryu Guan June 3, 2016, 3:31 a.m. UTC | #1
On Thu, Jun 02, 2016 at 09:10:01AM -0400, Brian Foster wrote:
> XFS had a bug in the multi-block buffer logging code that caused a NULL
> lv panic at log push time due to invalid regions being set in the buffer
> log format bitmap. This was demonstrated by modifying a multi-block
> directory buffer in a manner that only logs regions beyond the first
> FSB-sized mapping of the buffer.
> 
> To recreate these conditions, this test fragments free space and
> populates several directories with enough entries to require
> discontiguous multi-block buffers. To recreate the problem, we remove
> entries from the tail end of the directory and fsync to flush the log.
> 
> Note that this test causes a panic on kernels affected by the bug. As
> such, it is included in the 'dangerous' group. The bug is resolved by
> kernel commit a3916e528b91 ("xfs: fix broken multi-fsb buffer logging").
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks good to me. Also tested on a patched kernel with XFS at different
block sizes; the test passed within 20s for me.

Reviewed-by: Eryu Guan <eguan@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner June 16, 2016, 1:13 a.m. UTC | #2
On Thu, Jun 02, 2016 at 09:10:01AM -0400, Brian Foster wrote:
> XFS had a bug in the multi-block buffer logging code that caused a NULL
> lv panic at log push time due to invalid regions being set in the buffer
> log format bitmap. This was demonstrated by modifying a multi-block
> directory buffer in a manner that only logs regions beyond the first
> FSB-sized mapping of the buffer.
> 
> To recreate these conditions, this test fragments free space and
> populates several directories with enough entries to require
> discontiguous multi-block buffers. To recreate the problem, we remove
> entries from the tail end of the directory and fsync to flush the log.
> 
> Note that this test causes a panic on kernels affected by the bug. As
> such, it is included in the 'dangerous' group. The bug is resolved by
> kernel commit a3916e528b91 ("xfs: fix broken multi-fsb buffer logging").
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
.....
> +# Create a small fs with a large directory block size. We want to fill up the fs
> +# quickly and then create multi-fsb dirblocks over fragmented free space.
> +_scratch_mkfs_xfs -d size=20m -n size=64k >> $seqres.full 2>&1
> +_scratch_mount
> +
> +# Fill a source directory with many largish-named files. 1k uuid-named entries
> +# sufficiently populates a 64k directory block.
> +mkdir $SCRATCH_MNT/src
> +for i in $(seq 0 1023); do
> +	touch $SCRATCH_MNT/src/`uuidgen`
> +done

What's 'uuidgen'? Not installed on my test systems, so this needs
a requires check, I think....

Hmmm - looks like a separate package is needed on debian systems:
uuid-runtime. Can you send a followup patch that adds the necessary
checks to this test and adds the above package to the initial list
in the README file (for Ubuntu, I know, but the package name will
be the same)?

Cheers,

Dave.
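
A minimal sketch of the kind of check being requested, assuming a
stand-alone helper. The function name require_command below is
hypothetical and is not the fstests implementation; fstests has its own
helpers for declaring test requirements, and a real followup would use
those rather than this.

```shell
#!/bin/bash
# Hypothetical stand-alone helper (an assumption for illustration only):
# succeed if the named command is found on PATH, otherwise say why the
# test cannot run and fail.
require_command()
{
	local cmd="$1"

	if ! command -v "$cmd" >/dev/null 2>&1; then
		echo "$cmd utility required, skipped this test" >&2
		return 1
	fi
	return 0
}

# Guard the uuidgen-based file creation used by the test.
if require_command uuidgen; then
	name=$(uuidgen)
	echo "generated a name of length ${#name}"
fi
```

The point of checking up front is that the test is reported as not run
on systems without uuidgen, rather than creating 1024 badly named files
and failing later for an unrelated reason.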
Brian Foster June 16, 2016, 11:45 a.m. UTC | #3
On Thu, Jun 16, 2016 at 11:13:49AM +1000, Dave Chinner wrote:
> On Thu, Jun 02, 2016 at 09:10:01AM -0400, Brian Foster wrote:
> > XFS had a bug in the multi-block buffer logging code that caused a NULL
> > lv panic at log push time due to invalid regions being set in the buffer
> > log format bitmap. This was demonstrated by modifying a multi-block
> > directory buffer in a manner that only logs regions beyond the first
> > FSB-sized mapping of the buffer.
> > 
> > To recreate these conditions, this test fragments free space and
> > populates several directories with enough entries to require
> > discontiguous multi-block buffers. To recreate the problem, we remove
> > entries from the tail end of the directory and fsync to flush the log.
> > 
> > Note that this test causes a panic on kernels affected by the bug. As
> > such, it is included in the 'dangerous' group. The bug is resolved by
> > kernel commit a3916e528b91 ("xfs: fix broken multi-fsb buffer logging").
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> .....
> > +# Create a small fs with a large directory block size. We want to fill up the fs
> > +# quickly and then create multi-fsb dirblocks over fragmented free space.
> > +_scratch_mkfs_xfs -d size=20m -n size=64k >> $seqres.full 2>&1
> > +_scratch_mount
> > +
> > +# Fill a source directory with many largish-named files. 1k uuid-named entries
> > +# sufficiently populates a 64k directory block.
> > +mkdir $SCRATCH_MNT/src
> > +for i in $(seq 0 1023); do
> > +	touch $SCRATCH_MNT/src/`uuidgen`
> > +done
> 
> What's 'uuidgen'? Not installed on my test systems, so this needs
> a requires check, I think....
> 
> Hmmm - looks like a separate package is needed on debian systems:
> uuid-runtime. Can you send a followup patch that adds the necessary
> checks to this test and adds the above package to the initial list
> in the README file (for Ubuntu, I know, but the package name will
> be the same)?
> 

Eryu pointed this out on v1 of the test but I didn't think it necessary
as it's part of a standard package on fedora. I guess that's not true
for other distros. I'll post something to fix it up...

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
Eric Sandeen June 16, 2016, 3:10 p.m. UTC | #4
On 6/16/16 6:45 AM, Brian Foster wrote:
> On Thu, Jun 16, 2016 at 11:13:49AM +1000, Dave Chinner wrote:
>> On Thu, Jun 02, 2016 at 09:10:01AM -0400, Brian Foster wrote:
>>> XFS had a bug in the multi-block buffer logging code that caused a NULL
>>> lv panic at log push time due to invalid regions being set in the buffer
>>> log format bitmap. This was demonstrated by modifying a multi-block
>>> directory buffer in a manner that only logs regions beyond the first
>>> FSB-sized mapping of the buffer.
>>>
>>> To recreate these conditions, this test fragments free space and
>>> populates several directories with enough entries to require
>>> discontiguous multi-block buffers. To recreate the problem, we remove
>>> entries from the tail end of the directory and fsync to flush the log.
>>>
>>> Note that this test causes a panic on kernels affected by the bug. As
>>> such, it is included in the 'dangerous' group. The bug is resolved by
>>> kernel commit a3916e528b91 ("xfs: fix broken multi-fsb buffer logging").
>>>
>>> Signed-off-by: Brian Foster <bfoster@redhat.com>
>> .....
>>> +# Create a small fs with a large directory block size. We want to fill up the fs
>>> +# quickly and then create multi-fsb dirblocks over fragmented free space.
>>> +_scratch_mkfs_xfs -d size=20m -n size=64k >> $seqres.full 2>&1
>>> +_scratch_mount
>>> +
>>> +# Fill a source directory with many largish-named files. 1k uuid-named entries
>>> +# sufficiently populates a 64k directory block.
>>> +mkdir $SCRATCH_MNT/src
>>> +for i in $(seq 0 1023); do
>>> +	touch $SCRATCH_MNT/src/`uuidgen`
>>> +done
>>
>> What's 'uuidgen'? Not installed on my test systems, so this needs
>> a requires check, I think....
>>
>> Hmmm - looks like a separate package is needed on debian systems:
>> uuid-runtime. Can you send a followup patch that adds the necessary
>> checks to this test and adds the above package to the initial list
>> in the README file (for Ubuntu, I know, but the package name will
>> be the same)?
>>
> 
> Eryu pointed this out on v1 of the test but I didn't think it necessary
> as it's part of a standard package on fedora. I guess that's not true
> for other distros. I'll post something to fix it up...

Yeah, that's surprising, it's been part of util-linux[-ng] on my RHEL &
Fedora distros since 2009.

-Eric
Dave Chinner June 20, 2016, 10:48 p.m. UTC | #5
On Thu, Jun 16, 2016 at 10:10:09AM -0500, Eric Sandeen wrote:
> On 6/16/16 6:45 AM, Brian Foster wrote:
> > On Thu, Jun 16, 2016 at 11:13:49AM +1000, Dave Chinner wrote:
> >> On Thu, Jun 02, 2016 at 09:10:01AM -0400, Brian Foster wrote:
> >>> XFS had a bug in the multi-block buffer logging code that caused a NULL
> >>> lv panic at log push time due to invalid regions being set in the buffer
> >>> log format bitmap. This was demonstrated by modifying a multi-block
> >>> directory buffer in a manner that only logs regions beyond the first
> >>> FSB-sized mapping of the buffer.
> >>>
> >>> To recreate these conditions, this test fragments free space and
> >>> populates several directories with enough entries to require
> >>> discontiguous multi-block buffers. To recreate the problem, we remove
> >>> entries from the tail end of the directory and fsync to flush the log.
> >>>
> >>> Note that this test causes a panic on kernels affected by the bug. As
> >>> such, it is included in the 'dangerous' group. The bug is resolved by
> >>> kernel commit a3916e528b91 ("xfs: fix broken multi-fsb buffer logging").
> >>>
> >>> Signed-off-by: Brian Foster <bfoster@redhat.com>
> >> .....
> >>> +# Create a small fs with a large directory block size. We want to fill up the fs
> >>> +# quickly and then create multi-fsb dirblocks over fragmented free space.
> >>> +_scratch_mkfs_xfs -d size=20m -n size=64k >> $seqres.full 2>&1
> >>> +_scratch_mount
> >>> +
> >>> +# Fill a source directory with many largish-named files. 1k uuid-named entries
> >>> +# sufficiently populates a 64k directory block.
> >>> +mkdir $SCRATCH_MNT/src
> >>> +for i in $(seq 0 1023); do
> >>> +	touch $SCRATCH_MNT/src/`uuidgen`
> >>> +done
> >>
> >> What's 'uuidgen'? Not installed on my test systems, so this needs
> >> a requires check, I think....
> >>
> >> Hmmm - looks like a separate package is needed on debian systems:
> >> uuid-runtime. Can you send a followup patch that adds the necessary
> >> checks to this test and adds the above package to the initial list
> >> in the README file (for Ubuntu, I know, but the package name will
> >> be the same)?
> >>
> > 
> > Eryu pointed this out on v1 of the test but I didn't think it necessary
> > as it's part of a standard package on fedora. I guess that's not true
> > for other distros. I'll post something to fix it up...
> 
> Yeah, that's surprising, it's been part of util-linux[-ng] on my RHEL &
> Fedora distros since 2009.

It's to do with dependencies against systemd and ensuring core
packages don't have systemd dependencies, i.e. uuid-runtime has a
dependency on libsystemd because of the uuidd daemon that is used to
generate uuids. Hence it has to be packaged separately so that the
util-linux/libuuid packages do not require systemd to be installed
on the system.

Cheers,

Dave.

Patch

diff --git a/tests/xfs/399 b/tests/xfs/399
new file mode 100755
index 0000000..8cea305
--- /dev/null
+++ b/tests/xfs/399
@@ -0,0 +1,121 @@ 
+#! /bin/bash
+# FS QA Test No. 399
+#
+# Regression test for an XFS multi-block buffer logging bug.
+#
+# The XFS bug results in a panic when a non-contiguous multi-block buffer is
+# mapped and logged in a particular manner, such that only regions beyond the
+# first fsb-sized mapping are logged. The crash occurs asynchronous to
+# transaction submission, when the associated buffer log item is pushed from the
+# CIL (i.e., when the log is subsequently flushed).
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2016 Red Hat, Inc.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+
+# Modify as appropriate.
+_supported_fs xfs
+_supported_os Linux
+
+_require_scratch_nocheck	# check complains about single AG fs
+_require_xfs_io_command "fpunch"
+
+rm -f $seqres.full
+
+# Create a small fs with a large directory block size. We want to fill up the fs
+# quickly and then create multi-fsb dirblocks over fragmented free space.
+_scratch_mkfs_xfs -d size=20m -n size=64k >> $seqres.full 2>&1
+_scratch_mount
+
+# Fill a source directory with many largish-named files. 1k uuid-named entries
+# sufficiently populates a 64k directory block.
+mkdir $SCRATCH_MNT/src
+for i in $(seq 0 1023); do
+	touch $SCRATCH_MNT/src/`uuidgen`
+done
+
+# precreate target dirs while we still have free space for inodes
+for i in $(seq 0 3); do
+	mkdir $SCRATCH_MNT/$i
+done
+
+# consume and fragment free space
+$XFS_IO_PROG -xc "resblks 16" $SCRATCH_MNT >> $seqres.full 2>&1
+dd if=/dev/zero of=$SCRATCH_MNT/file bs=4k >> $seqres.full 2>&1
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/file >> $seqres.full 2>&1
+size=`stat -c %s $SCRATCH_MNT/file`
+for i in $(seq 0 8192 $size); do
+	$XFS_IO_PROG -c "fpunch $i 4k" $SCRATCH_MNT/file >> $seqres.full 2>&1
+done
+
+# Replicate the src dir several times into fragmented free space. After one or
+# two dirs, we should have nothing but non-contiguous directory blocks.
+for d in $(seq 0 3); do
+	for f in `ls -1 $SCRATCH_MNT/src`; do
+		ln $SCRATCH_MNT/src/$f $SCRATCH_MNT/$d/$f
+	done
+done
+
+# Fragment the target dirs a bit. Remove a handful of entries from each to
+# populate the best free space regions in the directory block headers. We want
+# to populate these now so the subsequent unlinks have no reason to log the
+# first block of the directory.
+for d in $(seq 0 3); do
+	i=0
+	for f in `ls -U $SCRATCH_MNT/$d`; do
+		if [ $i == 0 ]; then
+			unlink $SCRATCH_MNT/$d/$f
+		fi
+		i=$(((i + 1) % 128))
+	done
+done
+
+# remount to flush and ensure subsequent operations allocate a new log item
+_scratch_cycle_mount
+
+# Unlink an entry towards the end of each dir and fsync. The unlink should only
+# need to log the latter mappings of the 64k directory block. If the logging bug
+# is present, this will crash!
+for d in $(seq 0 3); do
+	f=`ls -U $SCRATCH_MNT/$d | tail -10 | head -n 1`
+	unlink $SCRATCH_MNT/$d/$f
+	$XFS_IO_PROG -c fsync $SCRATCH_MNT/$d
+done
+
+echo Silence is golden.
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/399.out b/tests/xfs/399.out
new file mode 100644
index 0000000..01b7175
--- /dev/null
+++ b/tests/xfs/399.out
@@ -0,0 +1,2 @@ 
+QA output created by 399
+Silence is golden.
diff --git a/tests/xfs/group b/tests/xfs/group
index f4c6816..e1bc647 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -285,3 +285,4 @@ 
 303 auto quick quota
 304 auto quick quota
 305 auto quota
+399 auto dangerous quick
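
The sizing comment in the test ("1k uuid-named entries sufficiently
populates a 64k directory block") can be sanity-checked with a little
arithmetic. The sketch below is not part of the patch. It assumes the
usual dir2 data entry layout (8-byte inode number, 1-byte name length,
the name, an optional 1-byte file type, and a 2-byte tag, rounded up to
a multiple of 8 bytes) and a 36-character uuidgen name; the function
name dirent_size is hypothetical.

```shell
#!/bin/bash
# Size in bytes of one directory data entry for a name of length $1.
# Assumed layout (see lead-in): 8-byte inode number + 1-byte name length
# + name + optional 1-byte file type ($2 = 1 if present) + 2-byte tag,
# rounded up to the next multiple of 8.
dirent_size()
{
	local namelen=$1
	local ftype=${2:-0}
	local raw=$((8 + 1 + namelen + ftype + 2))

	echo $(( (raw + 7) / 8 * 8 ))
}

uuid_len=36	# length of a textual UUID as printed by uuidgen
entries=1024	# matches the "seq 0 1023" loop in the test
dirblock=$((64 * 1024))

per_entry=$(dirent_size $uuid_len 1)
echo "entry size: $per_entry bytes"
echo "entries use $((entries * per_entry)) of $dirblock bytes"
```

Under these assumptions each entry takes 48 bytes with or without the
file type byte, so 1024 entries account for 48k of the 64k directory
block before any headers or leaf data are counted.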