Message ID | 20190417062450.GD114154@magnolia (mailing list archive) |
---|---|
State | New, archived |
Series | xfs: prevent overflow of delalloc block counters |
On Tue, Apr 16, 2019 at 11:24:50PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> With the new copy on write functionality it's possible to reserve so
> much COW space for a file that we end up overflowing i_delayed_blks.
> The only user-visible effect of this is to cause totally wrong i_blocks
> output in stat, so check for that.
....
>
> +	umount $loop_mount > /dev/null 2>&1
> +	rm -rf $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/reflink
> +
> +# real QA test starts here
> +_supported_os Linux
> +_supported_fs xfs
> +_require_scratch_reflink
> +_require_loop
> +_require_xfs_debug
> +
> +echo "Format and mount"
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +_require_fs_space $SCRATCH_MNT 2400000	# 100T fs requires ~2.4GB of space
> +
> +loop_file=$SCRATCH_MNT/a.img
> +loop_mount=$SCRATCH_MNT/a
> +truncate -s 100T $loop_file
> +$MKFS_XFS_PROG $MKFS_OPTIONS -f $loop_file >> $seqres.full

Hmm - that's going to create a 2GB log and zero it, meaning on slow
devices this is going to take some time.

	lodev=$(_create_loop_device $file)
	_mkfs_dev -l size=128m $lodev

> +mkdir $loop_mount
> +mount -o loop -t xfs $loop_file $loop_mount
> +
> +echo "Create crazy huge file"
> +touch "${loop_mount}/a"
> +blksz="$(stat -f -c '%S' "${loop_mount}")"
> +MAXEXTLEN=2097151	# cowextsize can't be more than MAXEXTLEN
> +extsize="$(( ((2 ** 32) - 1) / blksz ))"
> +test "${extsize}" -gt "${MAXEXTLEN}" && extsize="${MAXEXTLEN}"
> +extsize_bytes="$(( extsize * blksz ))"

This is overkill, yes? When is extsize_bytes not equal to MAXEXTLEN
* blksz on this 100TB filesystem?

> +# Set the largest cowextsize we can
> +$XFS_IO_PROG -c "cowextsize ${extsize_bytes}" "${loop_mount}/a"
> +set_cowextsize="$($XFS_IO_PROG -c 'cowextsize' "${loop_mount}/a" | sed -e 's/^.\([0-9]*\).*$/\1/g')"
> +test "${set_cowextsize}" -eq 0 && _fail "could not set cowextsize?"

Run the test anyway, even if the cowextsize setting fails. Who knows
what random crap will fall out....

> +statB="$(stat -c '%B' "${loop_mount}/a")"
> +
> +# Write a single byte every cowextsize bytes so that we minimize the space
> +# required to create maximally sized cow reservations
> +nr="$(( ((2 ** 32) / extsize) + 100 ))"

What's the magic 2^32 here?

> +seq 0 "${nr}" | tac | while read n; do

seq ${nr} -1 0 | while read n; do

> +	off="$((n * extsize * blksz))"
> +	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
> +done
> +
> +echo "Reflink crazy huge file"
> +cp --reflink=always "${loop_mount}/a" "${loop_mount}/b"
> +
> +echo "COW crazy huge file"
> +# Try to create enough maximally sized cow reservations to overflow
> +# i_delayed_blks
> +seq 0 "${nr}" | tac | while read n; do
> +	off="$((n * extsize * blksz))"
> +	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
> +done
> +
> +echo "Check crazy huge file"
> +blocks="$(stat -c '%b' "${loop_mount}/a")"
> +fsblocks="$((blocks * statB / blksz))"
> +
> +# Make sure we got enough COW reservations to overflow a 32-bit counter.
> +$XFS_IO_PROG -c 'bmap -clpv' "${loop_mount}/a" > $tmp.extents
> +echo "COW EXTENT STATE" >> $seqres.full
> +cat $tmp.extents >> $seqres.full
> +cat > $tmp.awk << ENDL
> +{
> +	if (\$3 == "delalloc") {
> +		x += \$4;
> +	} else if (\$3 == "hole") {
> +		;
> +	} else {
> +		x += \$6;
> +	}
> +}
> +END {
> +	printf("%d\\n", x / ($blksz / 512));
> +}
> +ENDL

Write that as a filter function and use tee to direct it to
seqres.full and the filter function at the same time?

> +cat $tmp.awk >> $seqres.full
> +cowblocks="$(awk -f $tmp.awk $tmp.extents)"
> +echo "cowblocks is ${cowblocks}" >> $seqres.full
> +if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
> +	echo "cowblocks (${cowblocks}) should be more than 2^32!"
> +fi
> +
> +# And finally, see if i_delayed_blks overflowed.
> +echo "stat blocks is ${fsblocks}" >> $seqres.full
> +if [ "${fsblocks}" -lt "$((2 ** 32))" ]; then
> +	echo "stat blocks (${fsblocks}) should be more than 2^32!"
> +	if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
> +		echo "cowblocks (${cowblocks}) is more than 2^32, your system has overflowed!!!"
> +	fi
> +fi

_within_tolerance?

Cheers,

Dave.
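The suggestion to replace the temporary awk script with a filter function could be sketched roughly as follows. The name `_filter_cow_blocks` and its block-size argument are hypothetical (they are not existing fstests helpers); the awk logic is the patch's own, moved out of the heredoc so that no `\$` escaping is needed.

```shell
# Hypothetical helper (not part of fstests): read "bmap -clpv" output on
# stdin, sum the extent lengths, and print the total in filesystem blocks.
# $1 is the filesystem block size in bytes.
_filter_cow_blocks()
{
	awk -v blksz="$1" '
	{
		if ($3 == "delalloc") {
			x += $4;	# as in the patch: length is field 4
		} else if ($3 == "hole") {
			;		# holes reserve nothing
		} else {
			x += $6;	# as in the patch: length is field 6
		}
	}
	END {
		# the patch treats lengths as 512-byte units
		printf("%d\n", x / (blksz / 512));
	}'
}
```

With such a function, the extent dump could be logged and counted in one pipeline, for example `$XFS_IO_PROG -c 'bmap -clpv' "${loop_mount}/a" | tee -a $seqres.full | _filter_cow_blocks "${blksz}"`, which removes the need for `$tmp.extents` and `$tmp.awk`.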
On Thu, Apr 18, 2019 at 07:29:50AM +1000, Dave Chinner wrote:
> On Tue, Apr 16, 2019 at 11:24:50PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > With the new copy on write functionality it's possible to reserve so
> > much COW space for a file that we end up overflowing i_delayed_blks.
> > The only user-visible effect of this is to cause totally wrong i_blocks
> > output in stat, so check for that.
> ....
> >
> > +	umount $loop_mount > /dev/null 2>&1
> > +	rm -rf $tmp.*
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/reflink
> > +
> > +# real QA test starts here
> > +_supported_os Linux
> > +_supported_fs xfs
> > +_require_scratch_reflink
> > +_require_loop
> > +_require_xfs_debug
> > +
> > +echo "Format and mount"
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +_require_fs_space $SCRATCH_MNT 2400000	# 100T fs requires ~2.4GB of space
> > +
> > +loop_file=$SCRATCH_MNT/a.img
> > +loop_mount=$SCRATCH_MNT/a
> > +truncate -s 100T $loop_file
> > +$MKFS_XFS_PROG $MKFS_OPTIONS -f $loop_file >> $seqres.full
>
> Hmm - that's going to create a 2GB log and zero it, meaning on slow
> devices this is going to take some time.
>
> 	lodev=$(_create_loop_device $file)
> 	_mkfs_dev -l size=128m $lodev

<nod>

>
> > +mkdir $loop_mount
> > +mount -o loop -t xfs $loop_file $loop_mount
> > +
> > +echo "Create crazy huge file"
> > +touch "${loop_mount}/a"
> > +blksz="$(stat -f -c '%S' "${loop_mount}")"
> > +MAXEXTLEN=2097151	# cowextsize can't be more than MAXEXTLEN
> > +extsize="$(( ((2 ** 32) - 1) / blksz ))"
> > +test "${extsize}" -gt "${MAXEXTLEN}" && extsize="${MAXEXTLEN}"
> > +extsize_bytes="$(( extsize * blksz ))"
>
> This is overkill, yes? When is extsize_bytes not equal to MAXEXTLEN
> * blksz on this 100TB filesystem?

Most of the time. struct fsxattr.fsx_cowextsize is a u32 field and
expects units of bytes, which means that we have to clamp the hint we
set on any filesystem with blocks larger than 2k.

> > +# Set the largest cowextsize we can
> > +$XFS_IO_PROG -c "cowextsize ${extsize_bytes}" "${loop_mount}/a"
> > +set_cowextsize="$($XFS_IO_PROG -c 'cowextsize' "${loop_mount}/a" | sed -e 's/^.\([0-9]*\).*$/\1/g')"
> > +test "${set_cowextsize}" -eq 0 && _fail "could not set cowextsize?"
>
> Run the test anyway, even if the cowextsize setting fails. Who knows
> what random crap will fall out....

Ok.

> > +statB="$(stat -c '%B' "${loop_mount}/a")"
> > +
> > +# Write a single byte every cowextsize bytes so that we minimize the space
> > +# required to create maximally sized cow reservations
> > +nr="$(( ((2 ** 32) / extsize) + 100 ))"
>
> What's the magic 2^32 here?

We're relying on cowextsize hints to create oversized speculative
preallocations in the cow fork to bump up i_delayed_blks, so we only
really have to touch a block every extsize_bytes.

>
> > +seq 0 "${nr}" | tac | while read n; do
>
> seq ${nr} -1 0 | while read n; do

Ok

>
> > +	off="$((n * extsize * blksz))"
> > +	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
> > +done
> > +
> > +echo "Reflink crazy huge file"
> > +cp --reflink=always "${loop_mount}/a" "${loop_mount}/b"
> > +
> > +echo "COW crazy huge file"
> > +# Try to create enough maximally sized cow reservations to overflow
> > +# i_delayed_blks
> > +seq 0 "${nr}" | tac | while read n; do
> > +	off="$((n * extsize * blksz))"
> > +	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
> > +done
> > +
> > +echo "Check crazy huge file"
> > +blocks="$(stat -c '%b' "${loop_mount}/a")"
> > +fsblocks="$((blocks * statB / blksz))"
> > +
> > +# Make sure we got enough COW reservations to overflow a 32-bit counter.
> > +$XFS_IO_PROG -c 'bmap -clpv' "${loop_mount}/a" > $tmp.extents
> > +echo "COW EXTENT STATE" >> $seqres.full
> > +cat $tmp.extents >> $seqres.full
> > +cat > $tmp.awk << ENDL
> > +{
> > +	if (\$3 == "delalloc") {
> > +		x += \$4;
> > +	} else if (\$3 == "hole") {
> > +		;
> > +	} else {
> > +		x += \$6;
> > +	}
> > +}
> > +END {
> > +	printf("%d\\n", x / ($blksz / 512));
> > +}
> > +ENDL
>
> Write that as a filter function and use tee to direct it to
> seqres.full and the filter function at the same time?

Ok.

> > +cat $tmp.awk >> $seqres.full
> > +cowblocks="$(awk -f $tmp.awk $tmp.extents)"
> > +echo "cowblocks is ${cowblocks}" >> $seqres.full
> > +if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
> > +	echo "cowblocks (${cowblocks}) should be more than 2^32!"
> > +fi
> > +
> > +# And finally, see if i_delayed_blks overflowed.
> > +echo "stat blocks is ${fsblocks}" >> $seqres.full
> > +if [ "${fsblocks}" -lt "$((2 ** 32))" ]; then
> > +	echo "stat blocks (${fsblocks}) should be more than 2^32!"
> > +	if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
> > +		echo "cowblocks (${cowblocks}) is more than 2^32, your system has overflowed!!!"
> > +	fi
> > +fi
>
> _within_tolerance?

Sure? I only care that it's above 2^32 though, not that we have an
exact value ... but I guess we can put fairly wide thresholds on that
comparison since if we overflow then the counter will be way off.

--D

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
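The clamping described in the reply above, and the role of 2^32 in the write count, can be checked with a little arithmetic. The sketch below only restates the patch's formulas with literal constants; the function names `max_cowextsize_blocks` and `cow_write_count` are made up for illustration.

```shell
# Illustrative only: function names are hypothetical, formulas are the patch's.

# Largest cowextsize hint, in filesystem blocks, for block size $1 (bytes).
# The hint is passed in bytes through a u32, and it may not exceed MAXEXTLEN.
max_cowextsize_blocks()
{
	blksz="$1"
	maxextlen=2097151			# MAXEXTLEN from the patch
	extsize="$(( 4294967295 / blksz ))"	# (2^32 - 1) / blksz
	if [ "${extsize}" -gt "${maxextlen}" ]; then
		extsize="${maxextlen}"
	fi
	echo "${extsize}"
}

# Number used as the upper bound of the write loop for a hint of $1 blocks:
# enough extsize-block reservations to pass 2^32 blocks, plus 100 spare.
cow_write_count()
{
	echo "$(( 4294967296 / $1 + 100 ))"	# (2^32 / extsize) + 100
}

for bs in 1024 2048 4096 65536; do
	ext="$(max_cowextsize_blocks "${bs}")"
	echo "blksz=${bs} extsize=${ext} nr=$(cow_write_count "${ext}")"
done
```

For block sizes up to 2k the MAXEXTLEN limit of 2097151 blocks applies; at 4k the u32 byte limit gives 1048575 blocks, so `extsize_bytes` is less than MAXEXTLEN * blksz there.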
diff --git a/tests/xfs/907 b/tests/xfs/907
new file mode 100755
index 00000000..5791f835
--- /dev/null
+++ b/tests/xfs/907
@@ -0,0 +1,128 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Copyright (c) 2019 Oracle, Inc.  All Rights Reserved.
+#
+# FS QA Test No. 907
+#
+# Try to overflow i_delayed_blks by setting the largest cowextsize hint
+# possible, creating a sparse file with a single byte every cowextsize bytes,
+# reflinking it, and retouching every written byte to see if we can create
+# enough speculative COW reservations to overflow i_delayed_blks.
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 7 15
+
+_cleanup()
+{
+	cd /
+	umount $loop_mount > /dev/null 2>&1
+	rm -rf $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/reflink
+
+# real QA test starts here
+_supported_os Linux
+_supported_fs xfs
+_require_scratch_reflink
+_require_loop
+_require_xfs_debug
+
+echo "Format and mount"
+_scratch_mkfs > "$seqres.full" 2>&1
+_scratch_mount
+_require_fs_space $SCRATCH_MNT 2400000	# 100T fs requires ~2.4GB of space
+
+loop_file=$SCRATCH_MNT/a.img
+loop_mount=$SCRATCH_MNT/a
+truncate -s 100T $loop_file
+$MKFS_XFS_PROG $MKFS_OPTIONS -f $loop_file >> $seqres.full
+mkdir $loop_mount
+mount -o loop -t xfs $loop_file $loop_mount
+
+echo "Create crazy huge file"
+touch "${loop_mount}/a"
+blksz="$(stat -f -c '%S' "${loop_mount}")"
+MAXEXTLEN=2097151	# cowextsize can't be more than MAXEXTLEN
+extsize="$(( ((2 ** 32) - 1) / blksz ))"
+test "${extsize}" -gt "${MAXEXTLEN}" && extsize="${MAXEXTLEN}"
+extsize_bytes="$(( extsize * blksz ))"
+
+# Set the largest cowextsize we can
+$XFS_IO_PROG -c "cowextsize ${extsize_bytes}" "${loop_mount}/a"
+set_cowextsize="$($XFS_IO_PROG -c 'cowextsize' "${loop_mount}/a" | sed -e 's/^.\([0-9]*\).*$/\1/g')"
+test "${set_cowextsize}" -eq 0 && _fail "could not set cowextsize?"
+
+statB="$(stat -c '%B' "${loop_mount}/a")"
+
+# Write a single byte every cowextsize bytes so that we minimize the space
+# required to create maximally sized cow reservations
+nr="$(( ((2 ** 32) / extsize) + 100 ))"
+seq 0 "${nr}" | tac | while read n; do
+	off="$((n * extsize * blksz))"
+	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
+done
+
+echo "Reflink crazy huge file"
+cp --reflink=always "${loop_mount}/a" "${loop_mount}/b"
+
+echo "COW crazy huge file"
+# Try to create enough maximally sized cow reservations to overflow
+# i_delayed_blks
+seq 0 "${nr}" | tac | while read n; do
+	off="$((n * extsize * blksz))"
+	$XFS_IO_PROG -c "pwrite ${off} 1" "${loop_mount}/a" > /dev/null
+done
+
+echo "Check crazy huge file"
+blocks="$(stat -c '%b' "${loop_mount}/a")"
+fsblocks="$((blocks * statB / blksz))"
+
+# Make sure we got enough COW reservations to overflow a 32-bit counter.
+$XFS_IO_PROG -c 'bmap -clpv' "${loop_mount}/a" > $tmp.extents
+echo "COW EXTENT STATE" >> $seqres.full
+cat $tmp.extents >> $seqres.full
+cat > $tmp.awk << ENDL
+{
+	if (\$3 == "delalloc") {
+		x += \$4;
+	} else if (\$3 == "hole") {
+		;
+	} else {
+		x += \$6;
+	}
+}
+END {
+	printf("%d\\n", x / ($blksz / 512));
+}
+ENDL
+cat $tmp.awk >> $seqres.full
+cowblocks="$(awk -f $tmp.awk $tmp.extents)"
+echo "cowblocks is ${cowblocks}" >> $seqres.full
+if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
+	echo "cowblocks (${cowblocks}) should be more than 2^32!"
+fi
+
+# And finally, see if i_delayed_blks overflowed.
+echo "stat blocks is ${fsblocks}" >> $seqres.full
+if [ "${fsblocks}" -lt "$((2 ** 32))" ]; then
+	echo "stat blocks (${fsblocks}) should be more than 2^32!"
+	if [ "${cowblocks}" -lt "$((2 ** 32))" ]; then
+		echo "cowblocks (${cowblocks}) is more than 2^32, your system has overflowed!!!"
+	fi
+fi
+
+echo "Test done"
+umount $loop_mount
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/907.out b/tests/xfs/907.out
new file mode 100644
index 00000000..9778d5ed
--- /dev/null
+++ b/tests/xfs/907.out
@@ -0,0 +1,7 @@
+QA output created by 907
+Format and mount
+Create crazy huge file
+Reflink crazy huge file
+COW crazy huge file
+Check crazy huge file
+Test done
diff --git a/tests/xfs/group b/tests/xfs/group
index 5a4ef4bf..e0c7fc97 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -504,3 +504,4 @@
 739 auto quick mkfs label
 742 auto quick spaceman
 743 auto quick health
+907 clone
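One detail of the final check in the patch is worth a second look: the innermost test uses `-lt`, while the message it guards says cowblocks "is more than 2^32". The sketch below assumes the intent was to report an overflow when stat shows fewer than 2^32 blocks even though the extent map accounts for at least that many; that reading is an assumption, not something confirmed in the thread. The function name `check_delalloc_overflow` is hypothetical.

```shell
# Hypothetical helper: $1 is the block count derived from stat, $2 is the
# block count derived from the extent map, both in filesystem blocks.
# Prints nothing when both counters are at or above 2^32.
check_delalloc_overflow()
{
	fsblocks="$1"
	cowblocks="$2"
	limit=4294967296	# 2^32

	if [ "${cowblocks}" -lt "${limit}" ]; then
		echo "cowblocks (${cowblocks}) should be more than 2^32!"
	fi

	if [ "${fsblocks}" -lt "${limit}" ]; then
		echo "stat blocks (${fsblocks}) should be more than 2^32!"
		# Assumed intent: the reservations exist but stat lost them.
		if [ "${cowblocks}" -ge "${limit}" ]; then
			echo "cowblocks (${cowblocks}) reached 2^32, your system has overflowed!!!"
		fi
	fi
}
```

Because the golden output in 907.out contains none of these messages, any line printed here would fail the test, which is the behaviour the patch relies on.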