diff mbox

[2/3] xfstests: btrfs: test device replace, with EIO on the src dev

Message ID 1430388527-16700-3-git-send-email-anand.jain@oracle.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Anand Jain April 30, 2015, 10:08 a.m. UTC
This test case will test to confirm the replace works when
the replacing source device has EIO errors.

EIO condition is achieved using the DM device.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 tests/btrfs/087     | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/087.out |  2 ++
 tests/btrfs/group   |  1 +
 3 files changed, 100 insertions(+)
 create mode 100755 tests/btrfs/087
 create mode 100644 tests/btrfs/087.out

Comments

Dave Chinner May 5, 2015, 12:14 a.m. UTC | #1
On Thu, Apr 30, 2015 at 06:08:46PM +0800, Anand Jain wrote:
> This test case will test to confirm the replace works when
> the replacing source device has EIO errors.
> 
> EIO condition is achieved using the DM device.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  tests/btrfs/087     | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/087.out |  2 ++
>  tests/btrfs/group   |  1 +
>  3 files changed, 100 insertions(+)
>  create mode 100755 tests/btrfs/087
>  create mode 100644 tests/btrfs/087.out
> 
> diff --git a/tests/btrfs/087 b/tests/btrfs/087
> new file mode 100755
> index 0000000..c2eb16a
> --- /dev/null
> +++ b/tests/btrfs/087
> @@ -0,0 +1,97 @@
> +#! /bin/bash
> +# FS QA Test No. btrfs/087
> +#
> +# test device replace works when the source device has EIO
> +#
> +# Copyright (c) 2015 Oracle.  All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +rm -f $seqres.full
> +
> +_cleanup()
> +{
> +	# remove dm erro-dev from the scratch pool
> +	SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DM_ERROR_DEV *##"`
> +
> +	_cleanup_dm_error_dev
> +	SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DEV_REPLACE_SRC"
> +	SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DEV_REPLACE_TGT"
> +	for i in $SCRATCH_DEV_POOL; do
> +		$WIPEFS_PROG -a $i > /dev/null 2>&1
> +	done

If you need to ensure the devices are clean, then that is an
intialisation job, not a cleanup job. Otherwise the automatic
scratch fs check will fail....

> +	rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmerror
> +
> +_need_to_be_root
> +_supported_fs btrfs
> +_supported_os Linux
> +_require_scratch
> +_require_scratch_dev_pool 4
> +_require_command "$WIPEFS_PROG" wipefs
> +
> +# peal a (last) device from dev pool to be use it as replace target device
> +DEV_REPLACE_TGT="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`"
> +SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DEV_REPLACE_TGT *##"`
> +
> +# create a dmerror device
> +# peal the last device in SCRATCH_DEV_POOL and assign it to DM_ERROR_DEV
> +DEV_REPLACE_SRC="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`"
> +SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DEV_REPLACE_SRC *##"`
>
> +# Create dm linear device /dev/mapper/error-dev using dev_last
> +# DM_ERROR_DEV will have dm linear device set if created successfully
> +_require_dm_error $DEV_REPLACE_SRC
> +_init_dm_error_dev  $DEV_REPLACE_SRC
> +
> +# assign created dm device to the scratch pool
> +SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DM_ERROR_DEV"

This is a lot of screwing around to modify the device list.

Just make a local copy of the device pool, clear the
SCRATCH_DEV_POOL variable, and pass the local devices to
_scratch_pool_mkfs....

> +
> +echo DEV_REPLACE_SRC=$DEV_REPLACE_SRC >> $seqres.full
> +echo DEV_REPLACE_TGT=$DEV_REPLACE_TGT >> $seqres.full
> +echo DM_ERROR_DEV=$DM_ERROR_DEV >> $seqres.full
> +echo SCRATCH_DEV_POOL=$SCRATCH_DEV_POOL >> $seqres.full
> +
> +MKFS_OPTIONS="-m raid1 -d raid1"
> +_scratch_pool_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
> +_scratch_mount
> +
> +$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT >> $seqres.full
> +
> +error_dev_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
> +			egrep $DM_ERROR_DEV | $AWK_PROG '{print $2}'`
> +echo error_dev_devid=$error_dev_devid >> $seqres.full
> +
> +# now load the error into the DM_ERROR_DEV
> +_load_dm_error_table
> +
> +$BTRFS_UTIL_PROG replace start -B $error_dev_devid $DEV_REPLACE_TGT $SCRATCH_MNT -f ||\
> +		_fail "btrfs device replace failed, devid $error_dev_devid"

You haven't actually done any filesystem operations to trigger a
device error and have the filesystem react to it. before replacing
the device, you need to check that the filesystem doesn't fall over
on error, and that it recovers correctly once the device is
replaced.

> +echo "Silence is golden"

This test should not be silent - A user is running a maintenance
operation, and btrfs needs to be emitting a success of failure
message to the user.  The golden output should capture that
message, not use return values to determine if the test failed or
not.

Cheers,

Dave.
Anand Jain May 5, 2015, 3:13 p.m. UTC | #2
Dave,

  Thanks for comments. I am rewriting it.

  But any idea why should _scratch_mkfs_dmerror (below) should
  fail with one of the device in SCRATCH_DEV_POOL being busy.?
  (when I start the script non of the device provided is actually
  busy).

  I am showing section of the code below which is relevant
  to my question.

----------

+SCRATCH_DEV_POOL_DEV1="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $1}'`"
+SCRATCH_DEV_POOL_DEV2="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}'`"
+
+_init_dmerror
+_scratch_mkfs_dmerror "-f -d raid1 -m raid1 $SCRATCH_DEV_POOL_DEV1 
$SCRATCH_DEV_POOL_DEV2
+_mount_dmerror
+
+local error_dev_devid
+error_dev_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
+                       egrep $DMERROR_DEV | $AWK_PROG '{print $2}'`
+
+# now load the error into the DM_ERROR_DEV
+_load_dmerror_table
+
+_run_btrfs_util_prog device delete $error_dev_devid $SCRATCH_MNT
+
+_cleanup_dmerror
----------------

+_init_dmerror()
+{
+       $DMSETUP_PROG remove error-test > /dev/null 2>&1
+
+       local BLK_DEV_SIZE=`blockdev --getsz $SCRATCH_DEV`
+
+       DMERROR_DEV='/dev/mapper/error-test'
+
+       DMLINEAR_TABLE="0 $BLK_DEV_SIZE linear $SCRATCH_DEV 0"
+
+       $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
+               _fatal "failed to create dm linear device"
+
+       DMERROR_TABLE="0 $BLK_DEV_SIZE error $SCRATCH_DEV 0"
+}
+
+_scratch_mkfs_dmerror()
+{
+       $MKFS_BTRFS_PROG $* $DMERROR_DEV >> $seqres.full 2>&1 || \
+                       _fatal "failed to create mkfs.btrfs $* $DMERROR_DEV"
+}
----------------


Thanks, Anand




On 05/05/2015 08:14 AM, Dave Chinner wrote:
> On Thu, Apr 30, 2015 at 06:08:46PM +0800, Anand Jain wrote:
>> This test case will test to confirm the replace works when
>> the replacing source device has EIO errors.
>>
>> EIO condition is achieved using the DM device.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   tests/btrfs/087     | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   tests/btrfs/087.out |  2 ++
>>   tests/btrfs/group   |  1 +
>>   3 files changed, 100 insertions(+)
>>   create mode 100755 tests/btrfs/087
>>   create mode 100644 tests/btrfs/087.out
>>
>> diff --git a/tests/btrfs/087 b/tests/btrfs/087
>> new file mode 100755
>> index 0000000..c2eb16a
>> --- /dev/null
>> +++ b/tests/btrfs/087
>> @@ -0,0 +1,97 @@
>> +#! /bin/bash
>> +# FS QA Test No. btrfs/087
>> +#
>> +# test device replace works when the source device has EIO
>> +#
>> +# Copyright (c) 2015 Oracle.  All Rights Reserved.
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU General Public License as
>> +# published by the Free Software Foundation.
>> +#
>> +# This program is distributed in the hope that it would be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program; if not, write the Free Software Foundation,
>> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
>> +#
>> +
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +tmp=/tmp/$$
>> +status=1	# failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +rm -f $seqres.full
>> +
>> +_cleanup()
>> +{
>> +	# remove dm erro-dev from the scratch pool
>> +	SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DM_ERROR_DEV *##"`
>> +
>> +	_cleanup_dm_error_dev
>> +	SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DEV_REPLACE_SRC"
>> +	SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DEV_REPLACE_TGT"
>> +	for i in $SCRATCH_DEV_POOL; do
>> +		$WIPEFS_PROG -a $i > /dev/null 2>&1
>> +	done
>
> If you need to ensure the devices are clean, then that is an
> intialisation job, not a cleanup job. Otherwise the automatic
> scratch fs check will fail....
>
>> +	rm -f $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +. ./common/dmerror
>> +
>> +_need_to_be_root
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +_require_scratch_dev_pool 4
>> +_require_command "$WIPEFS_PROG" wipefs
>> +
>> +# peal a (last) device from dev pool to be use it as replace target device
>> +DEV_REPLACE_TGT="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`"
>> +SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DEV_REPLACE_TGT *##"`
>> +
>> +# create a dmerror device
>> +# peal the last device in SCRATCH_DEV_POOL and assign it to DM_ERROR_DEV
>> +DEV_REPLACE_SRC="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`"
>> +SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DEV_REPLACE_SRC *##"`
>>
>> +# Create dm linear device /dev/mapper/error-dev using dev_last
>> +# DM_ERROR_DEV will have dm linear device set if created successfully
>> +_require_dm_error $DEV_REPLACE_SRC
>> +_init_dm_error_dev  $DEV_REPLACE_SRC
>> +
>> +# assign created dm device to the scratch pool
>> +SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DM_ERROR_DEV"
>
> This is a lot of screwing around to modify the device list.
>
> Just make a local copy of the device pool, clear the
> SCRATCH_DEV_POOL variable, and pass the local devices to
> _scratch_pool_mkfs....
>
>> +
>> +echo DEV_REPLACE_SRC=$DEV_REPLACE_SRC >> $seqres.full
>> +echo DEV_REPLACE_TGT=$DEV_REPLACE_TGT >> $seqres.full
>> +echo DM_ERROR_DEV=$DM_ERROR_DEV >> $seqres.full
>> +echo SCRATCH_DEV_POOL=$SCRATCH_DEV_POOL >> $seqres.full
>> +
>> +MKFS_OPTIONS="-m raid1 -d raid1"
>> +_scratch_pool_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
>> +_scratch_mount
>> +
>> +$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT >> $seqres.full
>> +
>> +error_dev_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
>> +			egrep $DM_ERROR_DEV | $AWK_PROG '{print $2}'`
>> +echo error_dev_devid=$error_dev_devid >> $seqres.full
>> +
>> +# now load the error into the DM_ERROR_DEV
>> +_load_dm_error_table
>> +
>> +$BTRFS_UTIL_PROG replace start -B $error_dev_devid $DEV_REPLACE_TGT $SCRATCH_MNT -f ||\
>> +		_fail "btrfs device replace failed, devid $error_dev_devid"
>
> You haven't actually done any filesystem operations to trigger a
> device error and have the filesystem react to it. before replacing
> the device, you need to check that the filesystem doesn't fall over
> on error, and that it recovers correctly once the device is
> replaced.
>
>> +echo "Silence is golden"
>
> This test should not be silent - A user is running a maintenance
> operation, and btrfs needs to be emitting a success of failure
> message to the user.  The golden output should capture that
> message, not use return values to determine if the test failed or
> not.
>
> Cheers,
>
> Dave.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner May 5, 2015, 11:11 p.m. UTC | #3
On Tue, May 05, 2015 at 11:13:44PM +0800, Anand Jain wrote:
> 
> Dave,
> 
>  Thanks for comments. I am rewriting it.
> 
>  But any idea why should _scratch_mkfs_dmerror (below) should
>  fail with one of the device in SCRATCH_DEV_POOL being busy.?
>  (when I start the script non of the device provided is actually
>  busy).

$SCRATCH_DEV is the first device in $SCRATCH_DEV_POOL, right?

> +SCRATCH_DEV_POOL_DEV1="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $1}'`"
> +SCRATCH_DEV_POOL_DEV2="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}'`"

BTW, don't shout so much in your code: dev1 and dev2 are perfectly
good local variable names. In general, local variables are in lower
case, global/exported variables are in upper case.

Cheers,

Dave.
diff mbox

Patch

diff --git a/tests/btrfs/087 b/tests/btrfs/087
new file mode 100755
index 0000000..c2eb16a
--- /dev/null
+++ b/tests/btrfs/087
@@ -0,0 +1,97 @@ 
+#! /bin/bash
+# FS QA Test No. btrfs/087
+#
+# test device replace works when the source device has EIO
+#
+# Copyright (c) 2015 Oracle.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+rm -f $seqres.full
+
+_cleanup()
+{
+	# remove dm erro-dev from the scratch pool
+	SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DM_ERROR_DEV *##"`
+
+	_cleanup_dm_error_dev
+	SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DEV_REPLACE_SRC"
+	SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DEV_REPLACE_TGT"
+	for i in $SCRATCH_DEV_POOL; do
+		$WIPEFS_PROG -a $i > /dev/null 2>&1
+	done
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmerror
+
+_need_to_be_root
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_scratch_dev_pool 4
+_require_command "$WIPEFS_PROG" wipefs
+
+# peal a (last) device from dev pool to be use it as replace target device
+DEV_REPLACE_TGT="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`"
+SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DEV_REPLACE_TGT *##"`
+
+# create a dmerror device
+# peal the last device in SCRATCH_DEV_POOL and assign it to DM_ERROR_DEV
+DEV_REPLACE_SRC="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`"
+SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$DEV_REPLACE_SRC *##"`
+
+# Create dm linear device /dev/mapper/error-dev using dev_last
+# DM_ERROR_DEV will have dm linear device set if created successfully
+_require_dm_error $DEV_REPLACE_SRC
+_init_dm_error_dev  $DEV_REPLACE_SRC
+
+# assign created dm device to the scratch pool
+SCRATCH_DEV_POOL="$SCRATCH_DEV_POOL $DM_ERROR_DEV"
+
+echo DEV_REPLACE_SRC=$DEV_REPLACE_SRC >> $seqres.full
+echo DEV_REPLACE_TGT=$DEV_REPLACE_TGT >> $seqres.full
+echo DM_ERROR_DEV=$DM_ERROR_DEV >> $seqres.full
+echo SCRATCH_DEV_POOL=$SCRATCH_DEV_POOL >> $seqres.full
+
+MKFS_OPTIONS="-m raid1 -d raid1"
+_scratch_pool_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
+_scratch_mount
+
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT >> $seqres.full
+
+error_dev_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
+			egrep $DM_ERROR_DEV | $AWK_PROG '{print $2}'`
+echo error_dev_devid=$error_dev_devid >> $seqres.full
+
+# now load the error into the DM_ERROR_DEV
+_load_dm_error_table
+
+$BTRFS_UTIL_PROG replace start -B $error_dev_devid $DEV_REPLACE_TGT $SCRATCH_MNT -f ||\
+		_fail "btrfs device replace failed, devid $error_dev_devid"
+
+echo "Silence is golden"
+status=0; exit
diff --git a/tests/btrfs/087.out b/tests/btrfs/087.out
new file mode 100644
index 0000000..e84c64c
--- /dev/null
+++ b/tests/btrfs/087.out
@@ -0,0 +1,2 @@ 
+QA output created by 087
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index e9c15af..d66b4fc 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -89,3 +89,4 @@ 
 084 auto quick send
 085 auto quick metadata subvol
 086 auto quick clone
+087 auto quick replace