diff mbox series

fstests: btrfs: Test if btrfs will panic when mounting a partially balanced fs

Message ID 20190522083944.32365-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series fstests: btrfs: Test if btrfs will panic when mounting a partially balanced fs | expand

Commit Message

Qu Wenruo May 22, 2019, 8:39 a.m. UTC
There are two regressions that when mounting a partially balance btrfs
after v5.1 kernel:
- Kernel NULL pointer dereference at mount time
- Kernel BUG_ON() just after mount

The kernel fixes are:
"btrfs: qgroup: Check if @bg is NULL to avoid NULL pointer
 dereference"
"btrfs: reloc: Also queue orphan reloc tree for cleanup to
 avoid BUG_ON()"

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/btrfs/188     | 94 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/188.out |  2 +
 tests/btrfs/group   |  1 +
 3 files changed, 97 insertions(+)
 create mode 100755 tests/btrfs/188
 create mode 100644 tests/btrfs/188.out

Comments

Filipe Manana May 22, 2019, 9:32 a.m. UTC | #1
On Wed, May 22, 2019 at 9:40 AM Qu Wenruo <wqu@suse.com> wrote:
>
> There are two regressions that when mounting a partially balance btrfs
> after v5.1 kernel:
> - Kernel NULL pointer dereference at mount time
> - Kernel BUG_ON() just after mount
>
> The kernel fixes are:
> "btrfs: qgroup: Check if @bg is NULL to avoid NULL pointer
>  dereference"
> "btrfs: reloc: Also queue orphan reloc tree for cleanup to
>  avoid BUG_ON()"
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  tests/btrfs/188     | 94 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/188.out |  2 +
>  tests/btrfs/group   |  1 +
>  3 files changed, 97 insertions(+)
>  create mode 100755 tests/btrfs/188
>  create mode 100644 tests/btrfs/188.out
>
> diff --git a/tests/btrfs/188 b/tests/btrfs/188
> new file mode 100755
> index 00000000..f43be007
> --- /dev/null
> +++ b/tests/btrfs/188
> @@ -0,0 +1,94 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 SUSE Linux Products GmbH.  All Rights Reserved.
> +#
> +# FS QA Test 188
> +#
> +# Test if btrfs mount will hit the following bugs when mounting
> +# a fs going through partial balance:
> +# - NULL pointer dereference
> +# - Kernel BUG_ON()

I would make the description be closer to what the test is - a general
test to validate that balance and qgroups work correctly when balance
needs to be resumed on mount.
You can leave those specific problems in the change log.

> +#
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1       # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +       cd /
> +       rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmlogwrites
> +
> +# remove previous $seqres.full before test
> +rm -f $seqres.full
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs btrfs
> +_supported_os Linux
> +_require_scratch
> +# and we need extra device as log device
> +_require_log_writes
> +
> +nr_files=512                           # enough metadata to bump tree height
> +file_size=2048                         # small enough to be inlined
> +
> +_log_writes_init $SCRATCH_DEV
> +_log_writes_mkfs >> $seqres.full 2>&1
> +
> +_log_writes_mount
> +$BTRFS_UTIL_PROG quota enable $SCRATCH_MNT >> $seqres.full
> +$BTRFS_UTIL_PROG quota rescan -w $SCRATCH_MNT >> $seqres.full
> +
> +# Create enough metadata for later balance
> +for ((i = 0; i < $nr_files; i++)); do
> +       _pwrite_byte 0xcd 0 $file_size $SCRATCH_MNT/file_$i > /dev/null
> +done
> +
> +# Ensure we write all data/metadata back to disk so that later
> +# balance will do real I/O

I don't understand this. Real I/O? Do we have any fake I/O? What is it?

> +sync
> +
> +# Balance metadata so we will have at least one transaction committed with
> +# valid reloc tree, and hopefully an orphan reloc tree.
> +$BTRFS_UTIL_PROG balance start -f -m $SCRATCH_MNT >> $seqres.full
> +_log_writes_unmount
> +_log_writes_remove
> +
> +cur=$(_log_writes_find_next_fua 0)
> +echo "cur=$cur" >> $seqres.full
> +while [ ! -z "$cur" ]; do
> +       _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqref.full
> +
> +       # If the fs contains valid reloc tree and kernel is not patched,
> +       # we'll hit a NULL pointer dereference
> +       # Or if it contains orphan reloc tree and kernel is unpatched,
> +       # we'll hit a BUG_ON()

# Test that no crashes happen or any other kind of failure.

> +       _scratch_mount
> +       _scratch_unmount
> +
> +       # Don't trigger fsck here, as relocation get paused,
> +       # at that transistent state, qgroup number may differ
> +       # and cause false alert.
> +
> +       prev=$cur
> +       cur=$(_log_writes_find_next_fua $(($cur + 1)))
> +       [ -z "$cur" ] && break
> +done

After the balance finishes, can we verify that qgroup values are correct?

> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/188.out b/tests/btrfs/188.out
> new file mode 100644
> index 00000000..6f23fda0
> --- /dev/null
> +++ b/tests/btrfs/188.out
> @@ -0,0 +1,2 @@
> +QA output created by 188
> +Silence is golden
> diff --git a/tests/btrfs/group b/tests/btrfs/group
> index 44ee0dd9..16a7c31e 100644
> --- a/tests/btrfs/group
> +++ b/tests/btrfs/group
> @@ -190,3 +190,4 @@
>  185 volume
>  186 auto quick send volume
>  187 auto send dedupe clone balance
> +188 auto quick replay

"balance" and "qgroup" groups as well

Thanks.
> --
> 2.21.0
>
Qu Wenruo May 22, 2019, 9:36 a.m. UTC | #2
On 2019/5/22 下午5:32, Filipe Manana wrote:
> On Wed, May 22, 2019 at 9:40 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> There are two regressions that when mounting a partially balance btrfs
>> after v5.1 kernel:
>> - Kernel NULL pointer dereference at mount time
>> - Kernel BUG_ON() just after mount
>>
>> The kernel fixes are:
>> "btrfs: qgroup: Check if @bg is NULL to avoid NULL pointer
>>  dereference"
>> "btrfs: reloc: Also queue orphan reloc tree for cleanup to
>>  avoid BUG_ON()"
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  tests/btrfs/188     | 94 +++++++++++++++++++++++++++++++++++++++++++++
>>  tests/btrfs/188.out |  2 +
>>  tests/btrfs/group   |  1 +
>>  3 files changed, 97 insertions(+)
>>  create mode 100755 tests/btrfs/188
>>  create mode 100644 tests/btrfs/188.out
>>
>> diff --git a/tests/btrfs/188 b/tests/btrfs/188
>> new file mode 100755
>> index 00000000..f43be007
>> --- /dev/null
>> +++ b/tests/btrfs/188
>> @@ -0,0 +1,94 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Copyright (c) 2019 SUSE Linux Products GmbH.  All Rights Reserved.
>> +#
>> +# FS QA Test 188
>> +#
>> +# Test if btrfs mount will hit the following bugs when mounting
>> +# a fs going through partial balance:
>> +# - NULL pointer dereference
>> +# - Kernel BUG_ON()
> 
> I would make the description be closer to what the test is - a general
> test to validate that balance and qgroups work correctly when balance
> needs to be resumed on mount.
> You can leave those specific problems in the change log.
> 
>> +#
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +tmp=/tmp/$$
>> +status=1       # failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +_cleanup()
>> +{
>> +       cd /
>> +       rm -f $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +. ./common/dmlogwrites
>> +
>> +# remove previous $seqres.full before test
>> +rm -f $seqres.full
>> +
>> +# real QA test starts here
>> +
>> +# Modify as appropriate.
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +# and we need extra device as log device
>> +_require_log_writes
>> +
>> +nr_files=512                           # enough metadata to bump tree height
>> +file_size=2048                         # small enough to be inlined
>> +
>> +_log_writes_init $SCRATCH_DEV
>> +_log_writes_mkfs >> $seqres.full 2>&1
>> +
>> +_log_writes_mount
>> +$BTRFS_UTIL_PROG quota enable $SCRATCH_MNT >> $seqres.full
>> +$BTRFS_UTIL_PROG quota rescan -w $SCRATCH_MNT >> $seqres.full
>> +
>> +# Create enough metadata for later balance
>> +for ((i = 0; i < $nr_files; i++)); do
>> +       _pwrite_byte 0xcd 0 $file_size $SCRATCH_MNT/file_$i > /dev/null
>> +done
>> +
>> +# Ensure we write all data/metadata back to disk so that later
>> +# balance will do real I/O
> 
> I don't understand this. Real I/O? Do we have any fake I/O? What is it?

I mean to avoid things like delayed inode where we only have dirty pages
but haven't written them onto disk.

In this case we want some metadata space to be really used. Dirty page
caches can't do that.

What's the proper expression for this?

> 
>> +sync
>> +
>> +# Balance metadata so we will have at least one transaction committed with
>> +# valid reloc tree, and hopefully an orphan reloc tree.
>> +$BTRFS_UTIL_PROG balance start -f -m $SCRATCH_MNT >> $seqres.full
>> +_log_writes_unmount
>> +_log_writes_remove
>> +
>> +cur=$(_log_writes_find_next_fua 0)
>> +echo "cur=$cur" >> $seqres.full
>> +while [ ! -z "$cur" ]; do
>> +       _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqref.full
>> +
>> +       # If the fs contains valid reloc tree and kernel is not patched,
>> +       # we'll hit a NULL pointer dereference
>> +       # Or if it contains orphan reloc tree and kernel is unpatched,
>> +       # we'll hit a BUG_ON()
> 
> # Test that no crashes happen or any other kind of failure.
> 
>> +       _scratch_mount
>> +       _scratch_unmount
>> +
>> +       # Don't trigger fsck here, as relocation get paused,
>> +       # at that transistent state, qgroup number may differ
>> +       # and cause false alert.
>> +
>> +       prev=$cur
>> +       cur=$(_log_writes_find_next_fua $(($cur + 1)))
>> +       [ -z "$cur" ] && break
>> +done
> 
> After the balance finishes, can we verify that qgroup values are correct?

Isn't  that done automatically when the test is finished?
As i'm using _require_scratch, not _require_scratch_no_check.

> 
>> +
>> +echo "Silence is golden"
>> +
>> +# success, all done
>> +status=0
>> +exit
>> diff --git a/tests/btrfs/188.out b/tests/btrfs/188.out
>> new file mode 100644
>> index 00000000..6f23fda0
>> --- /dev/null
>> +++ b/tests/btrfs/188.out
>> @@ -0,0 +1,2 @@
>> +QA output created by 188
>> +Silence is golden
>> diff --git a/tests/btrfs/group b/tests/btrfs/group
>> index 44ee0dd9..16a7c31e 100644
>> --- a/tests/btrfs/group
>> +++ b/tests/btrfs/group
>> @@ -190,3 +190,4 @@
>>  185 volume
>>  186 auto quick send volume
>>  187 auto send dedupe clone balance
>> +188 auto quick replay
> 
> "balance" and "qgroup" groups as well

Forgot that.

Thanks for comment,
Qu

> 
> Thanks.
>> --
>> 2.21.0
>>
> 
>
Filipe Manana May 22, 2019, 12:11 p.m. UTC | #3
On Wed, May 22, 2019 at 10:37 AM Qu Wenruo <wqu@suse.de> wrote:
>
>
>
> On 2019/5/22 下午5:32, Filipe Manana wrote:
> > On Wed, May 22, 2019 at 9:40 AM Qu Wenruo <wqu@suse.com> wrote:
> >>
> >> There are two regressions that when mounting a partially balance btrfs
> >> after v5.1 kernel:
> >> - Kernel NULL pointer dereference at mount time
> >> - Kernel BUG_ON() just after mount
> >>
> >> The kernel fixes are:
> >> "btrfs: qgroup: Check if @bg is NULL to avoid NULL pointer
> >>  dereference"
> >> "btrfs: reloc: Also queue orphan reloc tree for cleanup to
> >>  avoid BUG_ON()"
> >>
> >> Signed-off-by: Qu Wenruo <wqu@suse.com>
> >> ---
> >>  tests/btrfs/188     | 94 +++++++++++++++++++++++++++++++++++++++++++++
> >>  tests/btrfs/188.out |  2 +
> >>  tests/btrfs/group   |  1 +
> >>  3 files changed, 97 insertions(+)
> >>  create mode 100755 tests/btrfs/188
> >>  create mode 100644 tests/btrfs/188.out
> >>
> >> diff --git a/tests/btrfs/188 b/tests/btrfs/188
> >> new file mode 100755
> >> index 00000000..f43be007
> >> --- /dev/null
> >> +++ b/tests/btrfs/188
> >> @@ -0,0 +1,94 @@
> >> +#! /bin/bash
> >> +# SPDX-License-Identifier: GPL-2.0
> >> +# Copyright (c) 2019 SUSE Linux Products GmbH.  All Rights Reserved.
> >> +#
> >> +# FS QA Test 188
> >> +#
> >> +# Test if btrfs mount will hit the following bugs when mounting
> >> +# a fs going through partial balance:
> >> +# - NULL pointer dereference
> >> +# - Kernel BUG_ON()
> >
> > I would make the description be closer to what the test is - a general
> > test to validate that balance and qgroups work correctly when balance
> > needs to be resumed on mount.
> > You can leave those specific problems in the change log.
> >
> >> +#
> >> +seq=`basename $0`
> >> +seqres=$RESULT_DIR/$seq
> >> +echo "QA output created by $seq"
> >> +
> >> +here=`pwd`
> >> +tmp=/tmp/$$
> >> +status=1       # failure is the default!
> >> +trap "_cleanup; exit \$status" 0 1 2 3 15
> >> +
> >> +_cleanup()
> >> +{
> >> +       cd /
> >> +       rm -f $tmp.*
> >> +}
> >> +
> >> +# get standard environment, filters and checks
> >> +. ./common/rc
> >> +. ./common/filter
> >> +. ./common/dmlogwrites
> >> +
> >> +# remove previous $seqres.full before test
> >> +rm -f $seqres.full
> >> +
> >> +# real QA test starts here
> >> +
> >> +# Modify as appropriate.
> >> +_supported_fs btrfs
> >> +_supported_os Linux
> >> +_require_scratch
> >> +# and we need extra device as log device
> >> +_require_log_writes
> >> +
> >> +nr_files=512                           # enough metadata to bump tree height
> >> +file_size=2048                         # small enough to be inlined
> >> +
> >> +_log_writes_init $SCRATCH_DEV
> >> +_log_writes_mkfs >> $seqres.full 2>&1
> >> +
> >> +_log_writes_mount
> >> +$BTRFS_UTIL_PROG quota enable $SCRATCH_MNT >> $seqres.full
> >> +$BTRFS_UTIL_PROG quota rescan -w $SCRATCH_MNT >> $seqres.full
> >> +
> >> +# Create enough metadata for later balance
> >> +for ((i = 0; i < $nr_files; i++)); do
> >> +       _pwrite_byte 0xcd 0 $file_size $SCRATCH_MNT/file_$i > /dev/null
> >> +done
> >> +
> >> +# Ensure we write all data/metadata back to disk so that later
> >> +# balance will do real I/O
> >
> > I don't understand this. Real I/O? Do we have any fake I/O? What is it?
>
> I mean to avoid things like delayed inode where we only have dirty pages
> but haven't written them onto disk.

You meant delalloc (delayed allocation) then (delayed inode is another thing).
>
> In this case we want some metadata space to be really used. Dirty page
> caches can't do that.
>
> What's the proper expression for this?

Something like: "Flush delalloc so that balance has work to do."

>
> >
> >> +sync
> >> +
> >> +# Balance metadata so we will have at least one transaction committed with
> >> +# valid reloc tree, and hopefully an orphan reloc tree.
> >> +$BTRFS_UTIL_PROG balance start -f -m $SCRATCH_MNT >> $seqres.full
> >> +_log_writes_unmount
> >> +_log_writes_remove
> >> +
> >> +cur=$(_log_writes_find_next_fua 0)
> >> +echo "cur=$cur" >> $seqres.full
> >> +while [ ! -z "$cur" ]; do
> >> +       _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqref.full
> >> +
> >> +       # If the fs contains valid reloc tree and kernel is not patched,
> >> +       # we'll hit a NULL pointer dereference
> >> +       # Or if it contains orphan reloc tree and kernel is unpatched,
> >> +       # we'll hit a BUG_ON()
> >
> > # Test that no crashes happen or any other kind of failure.
> >
> >> +       _scratch_mount
> >> +       _scratch_unmount
> >> +
> >> +       # Don't trigger fsck here, as relocation get paused,
> >> +       # at that transistent state, qgroup number may differ
> >> +       # and cause false alert.
> >> +
> >> +       prev=$cur
> >> +       cur=$(_log_writes_find_next_fua $(($cur + 1)))
> >> +       [ -z "$cur" ] && break
> >> +done
> >
> > After the balance finishes, can we verify that qgroup values are correct?
>
> Isn't  that done automatically when the test is finished?
> As i'm using _require_scratch, not _require_scratch_no_check.

If fsck can accurately determine that, I'm fine with it.

Thanks.

>
> >
> >> +
> >> +echo "Silence is golden"
> >> +
> >> +# success, all done
> >> +status=0
> >> +exit
> >> diff --git a/tests/btrfs/188.out b/tests/btrfs/188.out
> >> new file mode 100644
> >> index 00000000..6f23fda0
> >> --- /dev/null
> >> +++ b/tests/btrfs/188.out
> >> @@ -0,0 +1,2 @@
> >> +QA output created by 188
> >> +Silence is golden
> >> diff --git a/tests/btrfs/group b/tests/btrfs/group
> >> index 44ee0dd9..16a7c31e 100644
> >> --- a/tests/btrfs/group
> >> +++ b/tests/btrfs/group
> >> @@ -190,3 +190,4 @@
> >>  185 volume
> >>  186 auto quick send volume
> >>  187 auto send dedupe clone balance
> >> +188 auto quick replay
> >
> > "balance" and "qgroup" groups as well
>
> Forgot that.
>
> Thanks for comment,
> Qu
>
> >
> > Thanks.
> >> --
> >> 2.21.0
> >>
> >
> >
>
diff mbox series

Patch

diff --git a/tests/btrfs/188 b/tests/btrfs/188
new file mode 100755
index 00000000..f43be007
--- /dev/null
+++ b/tests/btrfs/188
@@ -0,0 +1,94 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2019 SUSE Linux Products GmbH.  All Rights Reserved.
+#
+# FS QA Test 188
+#
+# Test if btrfs mount will hit the following bugs when mounting
+# a fs going through partial balance:
+# - NULL pointer dereference
+# - Kernel BUG_ON()
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmlogwrites
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+# and we need extra device as log device
+_require_log_writes
+
+nr_files=512				# enough metadata to bump tree height
+file_size=2048				# small enough to be inlined
+
+_log_writes_init $SCRATCH_DEV
+_log_writes_mkfs >> $seqres.full 2>&1
+
+_log_writes_mount
+$BTRFS_UTIL_PROG quota enable $SCRATCH_MNT >> $seqres.full
+$BTRFS_UTIL_PROG quota rescan -w $SCRATCH_MNT >> $seqres.full
+
+# Create enough metadata for later balance
+for ((i = 0; i < $nr_files; i++)); do
+	_pwrite_byte 0xcd 0 $file_size $SCRATCH_MNT/file_$i > /dev/null
+done
+
+# Ensure we write all data/metadata back to disk so that later
+# balance will do real I/O
+sync
+
+# Balance metadata so we will have at least one transaction committed with
+# valid reloc tree, and hopefully an orphan reloc tree.
+$BTRFS_UTIL_PROG balance start -f -m $SCRATCH_MNT >> $seqres.full
+_log_writes_unmount
+_log_writes_remove
+
+cur=$(_log_writes_find_next_fua 0)
+echo "cur=$cur" >> $seqres.full
+while [ ! -z "$cur" ]; do
+	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqref.full
+
+	# If the fs contains valid reloc tree and kernel is not patched,
+	# we'll hit a NULL pointer dereference
+	# Or if it contains orphan reloc tree and kernel is unpatched,
+	# we'll hit a BUG_ON()
+	_scratch_mount
+	_scratch_unmount
+
+	# Don't trigger fsck here, as relocation get paused,
+	# at that transistent state, qgroup number may differ
+	# and cause false alert.
+
+	prev=$cur
+	cur=$(_log_writes_find_next_fua $(($cur + 1)))
+	[ -z "$cur" ] && break
+done
+
+echo "Silence is golden"
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/188.out b/tests/btrfs/188.out
new file mode 100644
index 00000000..6f23fda0
--- /dev/null
+++ b/tests/btrfs/188.out
@@ -0,0 +1,2 @@ 
+QA output created by 188
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 44ee0dd9..16a7c31e 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -190,3 +190,4 @@ 
 185 volume
 186 auto quick send volume
 187 auto send dedupe clone balance
+188 auto quick replay