[3/2] xfstests: check for COW overflows in i_delayed_blks

Message ID 20190417062450.GD114154@magnolia (mailing list archive)
State New, archived
Series xfs: prevent overflow of delalloc block counters

Commit Message

Darrick J. Wong April 17, 2019, 6:24 a.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

With the new copy on write functionality it's possible to reserve so
much COW space for a file that we end up overflowing i_delayed_blks.
The only user-visible effect of this is to cause totally wrong i_blocks
output in stat, so check for that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/907     |  128 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/907.out |    7 +++
 tests/xfs/group   |    1 
 3 files changed, 136 insertions(+)
 create mode 100755 tests/xfs/907
 create mode 100644 tests/xfs/907.out

Comments

Dave Chinner April 17, 2019, 9:29 p.m. UTC | #1
On Tue, Apr 16, 2019 at 11:24:50PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> With the new copy on write functionality it's possible to reserve so
> much COW space for a file that we end up overflowing i_delayed_blks.
> The only user-visible effect of this is to cause totally wrong i_blocks
> output in stat, so check for that.
....
> 
> +	umount $loop_mount > /dev/null 2>&1
> +	rm -rf $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/reflink
> +
> +# real QA test starts here
> +_supported_os Linux
> +_supported_fs xfs
> +_require_scratch_reflink
> +_require_loop
> +_require_xfs_debug
> +
> +echo "Format and mount"
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +_require_fs_space $SCRATCH_MNT 2400000	# 100T fs requires ~2.4GB of space
> +
> +loop_file=$SCRATCH_MNT/a.img
> +loop_mount=$SCRATCH_MNT/a
> +truncate -s 100T $loop_file
> +$MKFS_XFS_PROG $MKFS_OPTIONS -f $loop_file >> $seqres.full

Hmm - that's going to create a 2GB log and zero it, meaning on slow
devices this is going to take some time.

lodev=$(_create_loop_device $loop_file)
_mkfs_dev -l size=128m $lodev


> +mkdir $loop_mount
> +mount -o loop -t xfs $loop_file $loop_mount
> +
> +echo "Create crazy huge file"
> +touch "${loop_mount}/a"
> +blksz="$(stat -f -c '%S' "${loop_mount}")"
> +MAXEXTLEN=2097151	# cowextsize can't be more than MAXEXTLEN
> +extsize="$(( ((2 ** 32) - 1) / blksz ))"
> +test "${extsize}" -gt "${MAXEXTLEN}" && extsize="${MAXEXTLEN}"
> +extsize_bytes="$(( extsize * blksz ))"

This is overkill, yes? When is extsize_bytes not equal to MAXEXTLEN
* blksz on this 100TB filesystem?


> +# Set the largest cowextsize we can
> +$XFS_IO_PROG -c "cowextsize ${extsize_bytes}" "${loop_mount}/a"
> +set_cowextsize="$($XFS_IO_PROG -c 'cowextsize' "${loop_mount}/a" | sed -e 's/^.\([0-9]*\).*$/\1/g')"
> +test "${set_cowextsize}" -eq 0 && _fail "could not set cowextsize?"

Run the test anyway, even if the cowextsize setting fails. Who knows
what random crap will fall out....

> +statB="$(stat -c '%B' "${loop_mount}/a")"
> +
> +# Write a single byte every cowextsize bytes so that we minimize the space
> +# required to create maximally sized cow reservations
> +nr="$(( ((2 ** 32) / extsize) + 100 ))"

What's the magic 2^32 here?

> +seq 0 "${nr}" | tac | while read n; do

seq ${nr} -1 0 | while read n; do

> +	off="$((n * extsize * blksz))"
> +	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
> +done
> +
> +echo "Reflink crazy huge file"
> +cp --reflink=always "${loop_mount}/a" "${loop_mount}/b"
> +
> +echo "COW crazy huge file"
> +# Try to create enough maximally sized cow reservations to overflow
> +# i_delayed_blks
> +seq 0 "${nr}" | tac | while read n; do
> +	off="$((n * extsize * blksz))"
> +	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
> +done
> +
> +echo "Check crazy huge file"
> +blocks="$(stat -c '%b' "${loop_mount}/a")"
> +fsblocks="$((blocks * statB / blksz))"
> +
> +# Make sure we got enough COW reservations to overflow a 32-bit counter.
> +$XFS_IO_PROG -c 'bmap -clpv' "${loop_mount}/a" > $tmp.extents
> +echo "COW EXTENT STATE" >> $seqres.full
> +cat $tmp.extents >> $seqres.full
> +cat > $tmp.awk << ENDL
> +{
> +	if (\$3 == "delalloc") {
> +		x += \$4;
> +	} else if (\$3 == "hole") {
> +		;
> +	} else {
> +		x += \$6;
> +	}
> +}
> +END {
> +	printf("%d\\n", x / ($blksz / 512));
> +}
> +ENDL

Write that as a filter function and use tee to direct it to
seqres.full and the filter function at the same time?

> +cat $tmp.awk >> $seqres.full
> +cowblocks="$(awk -f $tmp.awk $tmp.extents)"
> +echo "cowblocks is ${cowblocks}" >> $seqres.full
> +if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
> +	echo "cowblocks (${cowblocks}) should be more than 2^32!"
> +fi
> +
> +# And finally, see if i_delayed_blks overflowed.
> +echo "stat blocks is ${fsblocks}" >> $seqres.full
> +if [ "${fsblocks}" -lt "$((2 ** 32))" ]; then
> +	echo "stat blocks (${fsblocks}) should be more than 2^32!"
> +	if [ "${cowblocks}" -gt "$((2 ** 32))" ]; then
> +		echo "cowblocks (${cowblocks}) is more than 2^32, your system has overflowed!!!"
> +	fi
> +fi

_within_tolerance?

Cheers,

Dave.

Darrick J. Wong April 17, 2019, 10:24 p.m. UTC | #2
On Thu, Apr 18, 2019 at 07:29:50AM +1000, Dave Chinner wrote:
> On Tue, Apr 16, 2019 at 11:24:50PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > With the new copy on write functionality it's possible to reserve so
> > much COW space for a file that we end up overflowing i_delayed_blks.
> > The only user-visible effect of this is to cause totally wrong i_blocks
> > output in stat, so check for that.
> ....
> > 
> > +	umount $loop_mount > /dev/null 2>&1
> > +	rm -rf $tmp.*
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/reflink
> > +
> > +# real QA test starts here
> > +_supported_os Linux
> > +_supported_fs xfs
> > +_require_scratch_reflink
> > +_require_loop
> > +_require_xfs_debug
> > +
> > +echo "Format and mount"
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +_require_fs_space $SCRATCH_MNT 2400000	# 100T fs requires ~2.4GB of space
> > +
> > +loop_file=$SCRATCH_MNT/a.img
> > +loop_mount=$SCRATCH_MNT/a
> > +truncate -s 100T $loop_file
> > +$MKFS_XFS_PROG $MKFS_OPTIONS -f $loop_file >> $seqres.full
> 
> Hmm - that's going to create a 2GB log and zero it, meaning on slow
> devices this is going to take some time.
> 
> lodev=$(_create_loop_device $loop_file)
> _mkfs_dev -l size=128m $lodev

<nod>

> 
> > +mkdir $loop_mount
> > +mount -o loop -t xfs $loop_file $loop_mount
> > +
> > +echo "Create crazy huge file"
> > +touch "${loop_mount}/a"
> > +blksz="$(stat -f -c '%S' "${loop_mount}")"
> > +MAXEXTLEN=2097151	# cowextsize can't be more than MAXEXTLEN
> > +extsize="$(( ((2 ** 32) - 1) / blksz ))"
> > +test "${extsize}" -gt "${MAXEXTLEN}" && extsize="${MAXEXTLEN}"
> > +extsize_bytes="$(( extsize * blksz ))"
> 
> This is overkill, yes? When is extsize_bytes not equal to MAXEXTLEN
> * blksz on this 100TB filesystem?

Most of the time.  struct fsxattr.fsx_cowextsize is a u32 field and
expects units of bytes, which means that we have to clamp the hint we
set on any filesystem with larger than 2k blocks.
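
To make the clamping concrete, here is a small sketch of the same arithmetic the test performs, run over a few block sizes. The function name max_cowextsize_blocks is made up for illustration and is not part of the patch; the two limits are the ones quoted above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch): largest cowextsize hint, in
# filesystem blocks, that satisfies both limits used by the test.
MAXEXTLEN=2097151		# block-count limit quoted in the test

max_cowextsize_blocks() {
	local blksz="$1"
	# fsx_cowextsize is a u32 byte count, so the hint must fit in 2^32 - 1 bytes
	local extsize="$(( ((2 ** 32) - 1) / blksz ))"
	test "${extsize}" -gt "${MAXEXTLEN}" && extsize="${MAXEXTLEN}"
	echo "${extsize}"
}

for blksz in 512 1024 2048 4096; do
	extsize="$(max_cowextsize_blocks "${blksz}")"
	echo "blksz=${blksz} extsize=${extsize} bytes=$(( extsize * blksz ))"
done
```

Up to 2k blocks the MAXEXTLEN limit is the one that applies; with larger blocks the u32 byte limit takes over, so extsize_bytes is no longer MAXEXTLEN * blksz.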

> > +# Set the largest cowextsize we can
> > +$XFS_IO_PROG -c "cowextsize ${extsize_bytes}" "${loop_mount}/a"
> > +set_cowextsize="$($XFS_IO_PROG -c 'cowextsize' "${loop_mount}/a" | sed -e 's/^.\([0-9]*\).*$/\1/g')"
> > +test "${set_cowextsize}" -eq 0 && _fail "could not set cowextsize?"
> 
> Run the test anyway, even if the cowextsize setting fails. Who knows
> what random crap will fall out....

Ok.

> > +statB="$(stat -c '%B' "${loop_mount}/a")"
> > +
> > +# Write a single byte every cowextsize bytes so that we minimize the space
> > +# required to create maximally sized cow reservations
> > +nr="$(( ((2 ** 32) / extsize) + 100 ))"
> 
> What's the magic 2^32 here?

We're relying on cowextsize hints to create oversized speculative
preallocations in the cow fork to bump up i_delayed_blks, so we only
really have to touch a block every extsize_bytes.
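
As a rough worked example of where 2^32 enters, the sketch below repeats the nr calculation for one block size and totals the reservation that would result. It assumes, purely for illustration, a 4k block size and that every write reserves one full cowextsize hint.

```shell
#!/bin/bash
# Illustration only: number of writes the test issues and the COW reservation
# they could add up to, under the assumption that each write reserves one
# full cowextsize hint.
blksz=4096			# example block size (an assumption)
extsize=1048575			# clamped hint for 4k blocks, in fs blocks

nr="$(( ((2 ** 32) / extsize) + 100 ))"
writes="$(( nr + 1 ))"		# the loop covers nr down to 0 inclusive
reserved="$(( writes * extsize ))"

echo "nr=${nr} writes=${writes} reserved_blocks=${reserved}"
test "${reserved}" -gt "$(( 2 ** 32 ))" && echo "enough to pass 2^32 blocks"
```

Dividing 2^32 by extsize gives the number of hint-sized reservations needed to reach 2^32 blocks, and the extra 100 adds some slack on top of that.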

> 
> > +seq 0 "${nr}" | tac | while read n; do
> 
> seq ${nr} -1 0 | while read n; do

Ok

> 
> > +	off="$((n * extsize * blksz))"
> > +	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
> > +done
> > +
> > +echo "Reflink crazy huge file"
> > +cp --reflink=always "${loop_mount}/a" "${loop_mount}/b"
> > +
> > +echo "COW crazy huge file"
> > +# Try to create enough maximally sized cow reservations to overflow
> > +# i_delayed_blks
> > +seq 0 "${nr}" | tac | while read n; do
> > +	off="$((n * extsize * blksz))"
> > +	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
> > +done
> > +
> > +echo "Check crazy huge file"
> > +blocks="$(stat -c '%b' "${loop_mount}/a")"
> > +fsblocks="$((blocks * statB / blksz))"
> > +
> > +# Make sure we got enough COW reservations to overflow a 32-bit counter.
> > +$XFS_IO_PROG -c 'bmap -clpv' "${loop_mount}/a" > $tmp.extents
> > +echo "COW EXTENT STATE" >> $seqres.full
> > +cat $tmp.extents >> $seqres.full
> > +cat > $tmp.awk << ENDL
> > +{
> > +	if (\$3 == "delalloc") {
> > +		x += \$4;
> > +	} else if (\$3 == "hole") {
> > +		;
> > +	} else {
> > +		x += \$6;
> > +	}
> > +}
> > +END {
> > +	printf("%d\\n", x / ($blksz / 512));
> > +}
> > +ENDL
> 
> Write that as a filter function and use tee to direct it to
> seqres.full and the filter function at the same time?

Ok.
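
One possible shape for such a filter is sketched below. The name _sum_cow_blocks is hypothetical, and the column choices are copied unchanged from the awk program in the patch rather than checked against the bmap output format.

```shell
#!/bin/bash
# Sketch of a filter that sums extent lengths from bmap-style output on stdin
# and prints the total in filesystem blocks. The function name is made up;
# the field numbers are the ones the patch already uses.
_sum_cow_blocks() {
	local blksz="$1"
	awk -v blksz="${blksz}" '
	{
		if ($3 == "delalloc") {
			x += $4;
		} else if ($3 == "hole") {
			;
		} else {
			x += $6;
		}
	}
	END {
		printf("%d\n", x / (blksz / 512));
	}'
}

# Intended use inside the test (variables as defined there):
#   $XFS_IO_PROG -c 'bmap -clpv' "${loop_mount}/a" | \
#	tee -a $seqres.full | _sum_cow_blocks "${blksz}"
```

Passing the block size with awk -v avoids the escaped heredoc and the temporary awk file, and tee -a keeps the raw extent list in the full log.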

> > +cat $tmp.awk >> $seqres.full
> > +cowblocks="$(awk -f $tmp.awk $tmp.extents)"
> > +echo "cowblocks is ${cowblocks}" >> $seqres.full
> > +if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
> > +	echo "cowblocks (${cowblocks}) should be more than 2^32!"
> > +fi
> > +
> > +# And finally, see if i_delayed_blks overflowed.
> > +echo "stat blocks is ${fsblocks}" >> $seqres.full
> > +if [ "${fsblocks}" -lt "$((2 ** 32))" ]; then
> > +	echo "stat blocks (${fsblocks}) should be more than 2^32!"
> > +	if [ "${cowblocks}" -gt "$((2 ** 32))" ]; then
> > +		echo "cowblocks (${cowblocks}) is more than 2^32, your system has overflowed!!!"
> > +	fi
> > +fi
> 
> _within_tolerance?

Sure?  I only care that it's above 2^32 though, not that we have an
exact value ... but I guess we can put fairly wide thresholds on that
comparison since if we overflow then the counter will be way off.
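
To show how far off a wrapped value would be, here is a small sketch that truncates a block count to 32 bits. The helper name wrap32 and the sample count are made up for illustration.

```shell
#!/bin/bash
# Illustration only: what a block count looks like after wrapping in a
# 32-bit counter. wrap32 is a hypothetical name and the sample value is
# just an example slightly above 2^32.
wrap32() {
	echo "$(( $1 & 0xFFFFFFFF ))"
}

expected=4400869275		# example block count above 2^32
seen="$(wrap32 "${expected}")"
echo "expected=${expected} seen=${seen} lost=$(( expected - seen ))"
```

A single wrap drops exactly 2^32 from the count, so even a very loose tolerance around the expected value would notice it.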

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

Patch

diff --git a/tests/xfs/907 b/tests/xfs/907
new file mode 100755
index 00000000..5791f835
--- /dev/null
+++ b/tests/xfs/907
@@ -0,0 +1,128 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Copyright (c) 2019 Oracle, Inc.  All Rights Reserved.
+#
+# FS QA Test No. 907
+#
+# Try to overflow i_delayed_blks by setting the largest cowextsize hint
+# possible, creating a sparse file with a single byte every cowextsize bytes,
+# reflinking it, and retouching every written byte to see if we can create
+# enough speculative COW reservations to overflow i_delayed_blks.
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 7 15
+
+_cleanup()
+{
+	cd /
+	umount $loop_mount > /dev/null 2>&1
+	rm -rf $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/reflink
+
+# real QA test starts here
+_supported_os Linux
+_supported_fs xfs
+_require_scratch_reflink
+_require_loop
+_require_xfs_debug
+
+echo "Format and mount"
+_scratch_mkfs > "$seqres.full" 2>&1
+_scratch_mount
+_require_fs_space $SCRATCH_MNT 2400000	# 100T fs requires ~2.4GB of space
+
+loop_file=$SCRATCH_MNT/a.img
+loop_mount=$SCRATCH_MNT/a
+truncate -s 100T $loop_file
+$MKFS_XFS_PROG $MKFS_OPTIONS -f $loop_file >> $seqres.full
+mkdir $loop_mount
+mount -o loop -t xfs $loop_file $loop_mount
+
+echo "Create crazy huge file"
+touch "${loop_mount}/a"
+blksz="$(stat -f -c '%S' "${loop_mount}")"
+MAXEXTLEN=2097151	# cowextsize can't be more than MAXEXTLEN
+extsize="$(( ((2 ** 32) - 1) / blksz ))"
+test "${extsize}" -gt "${MAXEXTLEN}" && extsize="${MAXEXTLEN}"
+extsize_bytes="$(( extsize * blksz ))"
+
+# Set the largest cowextsize we can
+$XFS_IO_PROG -c "cowextsize ${extsize_bytes}" "${loop_mount}/a"
+set_cowextsize="$($XFS_IO_PROG -c 'cowextsize' "${loop_mount}/a" | sed -e 's/^.\([0-9]*\).*$/\1/g')"
+test "${set_cowextsize}" -eq 0 && _fail "could not set cowextsize?"
+
+statB="$(stat -c '%B' "${loop_mount}/a")"
+
+# Write a single byte every cowextsize bytes so that we minimize the space
+# required to create maximally sized cow reservations
+nr="$(( ((2 ** 32) / extsize) + 100 ))"
+seq 0 "${nr}" | tac | while read n; do
+	off="$((n * extsize * blksz))"
+	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
+done
+
+echo "Reflink crazy huge file"
+cp --reflink=always "${loop_mount}/a" "${loop_mount}/b"
+
+echo "COW crazy huge file"
+# Try to create enough maximally sized cow reservations to overflow
+# i_delayed_blks
+seq 0 "${nr}" | tac | while read n; do
+	off="$((n * extsize * blksz))"
+	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
+done
+
+echo "Check crazy huge file"
+blocks="$(stat -c '%b' "${loop_mount}/a")"
+fsblocks="$((blocks * statB / blksz))"
+
+# Make sure we got enough COW reservations to overflow a 32-bit counter.
+$XFS_IO_PROG -c 'bmap -clpv' "${loop_mount}/a" > $tmp.extents
+echo "COW EXTENT STATE" >> $seqres.full
+cat $tmp.extents >> $seqres.full
+cat > $tmp.awk << ENDL
+{
+	if (\$3 == "delalloc") {
+		x += \$4;
+	} else if (\$3 == "hole") {
+		;
+	} else {
+		x += \$6;
+	}
+}
+END {
+	printf("%d\\n", x / ($blksz / 512));
+}
+ENDL
+cat $tmp.awk >> $seqres.full
+cowblocks="$(awk -f $tmp.awk $tmp.extents)"
+echo "cowblocks is ${cowblocks}" >> $seqres.full
+if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
+	echo "cowblocks (${cowblocks}) should be more than 2^32!"
+fi
+
+# And finally, see if i_delayed_blks overflowed.
+echo "stat blocks is ${fsblocks}" >> $seqres.full
+if [ "${fsblocks}" -lt "$((2 ** 32))" ]; then
+	echo "stat blocks (${fsblocks}) should be more than 2^32!"
+	if [ "${cowblocks}" -gt "$((2 ** 32))" ]; then
+		echo "cowblocks (${cowblocks}) is more than 2^32, your system has overflowed!!!"
+	fi
+fi
+
+echo "Test done"
+umount $loop_mount
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/907.out b/tests/xfs/907.out
new file mode 100644
index 00000000..9778d5ed
--- /dev/null
+++ b/tests/xfs/907.out
@@ -0,0 +1,7 @@ 
+QA output created by 907
+Format and mount
+Create crazy huge file
+Reflink crazy huge file
+COW crazy huge file
+Check crazy huge file
+Test done
diff --git a/tests/xfs/group b/tests/xfs/group
index 5a4ef4bf..e0c7fc97 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -504,3 +504,4 @@ 
 739 auto quick mkfs label
 742 auto quick spaceman
 743 auto quick health
+907 clone