xfs: new EOF fragmentation tests

Message ID 20240924084551.1802795-2-hch@lst.de (mailing list archive)
State Not Applicable, archived

Commit Message

Christoph Hellwig Sept. 24, 2024, 8:45 a.m. UTC
From: Dave Chinner <dchinner@redhat.com>

These tests create substantial file fragmentation as a result of
application actions that defeat post-EOF preallocation
optimisations. They are intended to replicate known vectors for
these problems, and provide a check that the fragmentation levels
have been controlled. The mitigations we make may not completely
remove fragmentation (e.g. they may demonstrate speculative delalloc
related extent size growth) so the checks don't assume we'll end up
with perfect layouts and hence check for an acceptable level of
fragmentation rather than none.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
[move to different test number, update to current xfstest APIs]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
 tests/xfs/1500.out |  9 ++++++
 tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1501.out |  9 ++++++
 tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1502.out |  9 ++++++
 tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1503.out | 33 ++++++++++++++++++++
 8 files changed, 339 insertions(+)
 create mode 100755 tests/xfs/1500
 create mode 100644 tests/xfs/1500.out
 create mode 100755 tests/xfs/1501
 create mode 100644 tests/xfs/1501.out
 create mode 100755 tests/xfs/1502
 create mode 100644 tests/xfs/1502.out
 create mode 100755 tests/xfs/1503
 create mode 100644 tests/xfs/1503.out
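
The "acceptable level of fragmentation" check described in the commit message amounts to a range test around a nominal extent count. A minimal sketch, assuming a nominal value with separate lower and upper tolerances; `check_range` is a hypothetical stand-in written for illustration, not the fstests `_within_tolerance` helper, whose exact semantics are not shown in this message:

```shell
#!/bin/bash
# Hypothetical stand-in for a tolerance check; not the fstests helper.
# usage: check_range NAME VALUE NOMINAL MIN_TOL [MAX_TOL]
check_range()
{
	local name=$1 value=$2 nominal=$3 mintol=$4 maxtol=${5:-$4}
	local min=$((nominal - mintol))
	local max=$((nominal + maxtol))

	if ((value >= min && value <= max)); then
		echo "$name is in range"
	else
		echo "$name has value of $value"
		echo "$name is NOT in range $min .. $max"
	fi
}

# A mildly fragmented file passes, a badly fragmented one does not.
check_range "file.0 extent count" 23 21 19
check_range "file.1 extent count" 950 21 19
```

With a nominal count of 21 and a tolerance of 19, this sketch accepts 2 through 40 extents and reports anything else as out of range.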

Comments

Darrick J. Wong Sept. 24, 2024, 3:03 p.m. UTC | #1
On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> These tests create substantial file fragmentation as a result of
> application actions that defeat post-EOF preallocation
> optimisations. They are intended to replicate known vectors for
> these problems, and provide a check that the fragmentation levels
> have been controlled. The mitigations we make may not completely
> remove fragmentation (e.g. they may demonstrate speculative delalloc
> related extent size growth) so the checks don't assume we'll end up
> with perfect layouts and hence check for an acceptable level of
> fragmentation rather than none.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> [move to different test number, update to current xfstest APIs]
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1500.out |  9 ++++++
>  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1501.out |  9 ++++++
>  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1502.out |  9 ++++++
>  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1503.out | 33 ++++++++++++++++++++
>  8 files changed, 339 insertions(+)
>  create mode 100755 tests/xfs/1500
>  create mode 100644 tests/xfs/1500.out
>  create mode 100755 tests/xfs/1501
>  create mode 100644 tests/xfs/1501.out
>  create mode 100755 tests/xfs/1502
>  create mode 100644 tests/xfs/1502.out
>  create mode 100755 tests/xfs/1503
>  create mode 100644 tests/xfs/1503.out
> 
> diff --git a/tests/xfs/1500 b/tests/xfs/1500
> new file mode 100755
> index 000000000..de0e1df62
> --- /dev/null
> +++ b/tests/xfs/1500
> @@ -0,0 +1,66 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/500

This should be "FS QA Test 1500", or at least not say "xfs/500", given the
filename.  The same applies to the other tests.

> +#
> +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto quick prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch
> +
> +_cleanup()
> +{
> +	# try to kill all background processes
> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> +# interleave allocations to fragment the files. Synchronous writes defeat the
> +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> +# removal, so this should fragment badly. Typical problematic behaviour shows
> +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> +# typically shows extent counts in the low 20s.
> +#
> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=8
> +wsize=4096

Shouldn't this use _get_file_block_size instead of a hardcoded 4k?
The same goes for all the other tests.
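
A minimal sketch of deriving the write size from the file system instead of hardcoding it. It uses GNU `stat -f` as a stand-in so that it runs outside fstests; `fs_block_size` and the probe directory are assumptions made for illustration, and inside a test the `_get_file_block_size` helper named above would take their place:

```shell
#!/bin/bash
# Hypothetical stand-in for a block-size helper: ask the file system
# backing a directory for its fundamental block size (GNU stat).
fs_block_size()
{
	stat -f -c %S "$1"
}

dir=${TMPDIR:-/tmp}
wsize=$(fs_block_size "$dir")
wcnt=1000

# Offsets stay block-aligned whatever the block size turns out to be.
last_offset=$(( (wcnt - 1) * wsize ))
echo "write size: $wsize"
echo "last write offset: $last_offset"
```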

> +wcnt=1000
> +
> +write_sync_file()
> +{
> +	idx=$1
> +
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -s -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workfile*
> +for ((n=0; n<$nfiles; n++)); do
> +	write_sync_file $n > /dev/null 2>&1 &
> +done
> +wait
> +sync
> +
> +for ((n=0; n<$nfiles; n++)); do
> +	count=$(_count_extents $workfile.$n)
> +	# Acceptable extent count range is 1-40
> +	_within_tolerance "file.$n extent count" $count 21 19 -v
> +done
> +
> +status=0
> +exit
> diff --git a/tests/xfs/1500.out b/tests/xfs/1500.out
> new file mode 100644
> index 000000000..414df87ed
> --- /dev/null
> +++ b/tests/xfs/1500.out
> @@ -0,0 +1,9 @@
> +QA output created by 1500
> +file.0 extent count is in range
> +file.1 extent count is in range
> +file.2 extent count is in range
> +file.3 extent count is in range
> +file.4 extent count is in range
> +file.5 extent count is in range
> +file.6 extent count is in range
> +file.7 extent count is in range
> diff --git a/tests/xfs/1501 b/tests/xfs/1501
> new file mode 100755
> index 000000000..cf3cbf8b5
> --- /dev/null
> +++ b/tests/xfs/1501
> @@ -0,0 +1,68 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/501
> +#
> +# Post-EOF preallocation defeat test for buffered I/O with extent size hints.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto quick prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch
> +
> +_cleanup()
> +{
> +	# try to kill all background processes
> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using buffered writes with extent size hints.
> +# Aim is to interleave allocations to fragment the files. Writes w/ extent size
> +# hints set defeat the open/write/close heuristics in xfs_file_release() that
> +# prevent EOF block removal, so this should fragment badly. Typical problematic
> +# behaviour shows per-file extent counts of 1000 (worst case!) whilst
> +# fixed behaviour should show very few extents (almost best case).
> +#
> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=8
> +wsize=4096
> +wcnt=1000
> +extent_size=16m
> +
> +write_extsz_file()
> +{
> +	idx=$1
> +
> +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workfile*
> +for ((n=0; n<$nfiles; n++)); do
> +	write_extsz_file $n > /dev/null 2>&1 &
> +done
> +wait
> +sync
> +
> +for ((n=0; n<$nfiles; n++)); do
> +	count=$(_count_extents $workfile.$n)
> +	# Acceptable extent count range is 1-10
> +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> +done
> +
> +status=0
> +exit
> diff --git a/tests/xfs/1501.out b/tests/xfs/1501.out
> new file mode 100644
> index 000000000..a266ef74b
> --- /dev/null
> +++ b/tests/xfs/1501.out
> @@ -0,0 +1,9 @@
> +QA output created by 1501
> +file.0 extent count is in range
> +file.1 extent count is in range
> +file.2 extent count is in range
> +file.3 extent count is in range
> +file.4 extent count is in range
> +file.5 extent count is in range
> +file.6 extent count is in range
> +file.7 extent count is in range
> diff --git a/tests/xfs/1502 b/tests/xfs/1502
> new file mode 100755
> index 000000000..f4228667a
> --- /dev/null
> +++ b/tests/xfs/1502
> @@ -0,0 +1,68 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/502
> +#
> +# Post-EOF preallocation defeat test for direct I/O with extent size hints.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto quick prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch

This one wants _require_odirect, in case we ever disable direct I/O for
some unusual XFS configuration.
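
A minimal sketch of the kind of probe such a requirement implies: try one small O_DIRECT write and skip the test if it fails. `supports_odirect` is a hypothetical function written for illustration using GNU `dd`; it is not how `_require_odirect` is implemented:

```shell
#!/bin/bash
# Hypothetical probe: returns 0 if a small O_DIRECT write to a scratch
# file in directory $1 succeeds, 1 otherwise.
supports_odirect()
{
	local probe="$1/odirect_probe.$$"
	local rc=1

	if dd if=/dev/zero of="$probe" bs=4096 count=1 oflag=direct \
			> /dev/null 2>&1; then
		rc=0
	fi
	rm -f "$probe"
	return $rc
}

if supports_odirect "${TMPDIR:-/tmp}"; then
	echo "direct I/O available"
else
	echo "direct I/O not available, test would be skipped"
fi
```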

> +
> +_cleanup()
> +{
> +	# try to kill all background processes
> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using O_DIRECT writes w/ extent size hints.
> +# Aim is to interleave allocations to fragment the files. O_DIRECT writes defeat
> +# the open/write/close heuristics in xfs_file_release() that prevent EOF block
> +# removal, so this should fragment badly. Typical problematic behaviour shows
> +# per-file extent counts of ~1000 (worst case) whilst fixed behaviour typically
> +# shows extent counts in the low single digits (almost best case)
> +#
> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=8
> +wsize=4096
> +wcnt=1000
> +extent_size=16m
> +
> +write_direct_file()
> +{
> +	idx=$1
> +
> +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -d -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workfile*
> +for ((n=0; n<$nfiles; n++)); do
> +	write_direct_file $n > /dev/null 2>&1 &
> +done
> +wait
> +sync
> +
> +for ((n=0; n<$nfiles; n++)); do
> +	count=$(_count_extents $workfile.$n)
> +	# Acceptable extent count range is 1-10
> +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> +done
> +
> +status=0
> +exit
> diff --git a/tests/xfs/1502.out b/tests/xfs/1502.out
> new file mode 100644
> index 000000000..82c8760a3
> --- /dev/null
> +++ b/tests/xfs/1502.out
> @@ -0,0 +1,9 @@
> +QA output created by 1502
> +file.0 extent count is in range
> +file.1 extent count is in range
> +file.2 extent count is in range
> +file.3 extent count is in range
> +file.4 extent count is in range
> +file.5 extent count is in range
> +file.6 extent count is in range
> +file.7 extent count is in range
> diff --git a/tests/xfs/1503 b/tests/xfs/1503
> new file mode 100755
> index 000000000..9002f87e6
> --- /dev/null
> +++ b/tests/xfs/1503
> @@ -0,0 +1,77 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/503
> +#
> +# Post-EOF preallocation defeat test with O_SYNC buffered I/O that repeatedly
> +# closes and reopens the files.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch
> +
> +_cleanup()
> +{
> +	# try to kill all background processes
> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using synchronous buffered writes that
> +# repeatedly close and reopen the files. Aim is to interleave allocations to
> +# fragment the files. Assuming we've fixed the synchronous write defeat, we can
> +# still trigger the same issue with an open/read/close on O_RDONLY files. We
> +# should not be triggering EOF preallocation removal on files we don't have
> +# permission to write, so until this is fixed it should fragment badly.  Typical
> +# problematic behaviour shows per-file extent counts of 50-350 whilst fixed
> +# behaviour typically demonstrates post-eof speculative delalloc growth in
> +# extent size (~6 extents for 50MB file).
> +#
> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=32
> +wsize=4096
> +wcnt=1000
> +
> +write_file()
> +{
> +	idx=$1
> +
> +	$XFS_IO_PROG -f -s -c "pwrite -b 64k 0 50m" $workfile.$idx
> +}
> +
> +read_file()
> +{
> +	idx=$1
> +
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -r -c "pread 0 28" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workdir/file*
> +for ((n=0; n<$((nfiles)); n++)); do
> +	write_file $n > /dev/null 2>&1 &
> +	read_file $n > /dev/null 2>&1 &
> +done
> +wait
> +
> +for ((n=0; n<$nfiles; n++)); do
> +	count=$(_count_extents $workfile.$n)
> +	# Acceptable extent count range is 1-40
> +	_within_tolerance "file.$n extent count" $count 6 5 10 -v
> +done
> +
> +status=0
> +exit
> diff --git a/tests/xfs/1503.out b/tests/xfs/1503.out
> new file mode 100644
> index 000000000..1780b16df
> --- /dev/null
> +++ b/tests/xfs/1503.out
> @@ -0,0 +1,33 @@
> +QA output created by 1503
> +file.0 extent count is in range
> +file.1 extent count is in range
> +file.2 extent count is in range
> +file.3 extent count is in range
> +file.4 extent count is in range
> +file.5 extent count is in range
> +file.6 extent count is in range
> +file.7 extent count is in range
> +file.8 extent count is in range
> +file.9 extent count is in range
> +file.10 extent count is in range
> +file.11 extent count is in range
> +file.12 extent count is in range
> +file.13 extent count is in range
> +file.14 extent count is in range
> +file.15 extent count is in range
> +file.16 extent count is in range
> +file.17 extent count is in range
> +file.18 extent count is in range
> +file.19 extent count is in range
> +file.20 extent count is in range
> +file.21 extent count is in range
> +file.22 extent count is in range
> +file.23 extent count is in range
> +file.24 extent count is in range
> +file.25 extent count is in range
> +file.26 extent count is in range
> +file.27 extent count is in range
> +file.28 extent count is in range
> +file.29 extent count is in range
> +file.30 extent count is in range
> +file.31 extent count is in range
> -- 
> 2.45.2
> 
>
Zorro Lang Sept. 25, 2024, 11:15 a.m. UTC | #2
On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> These tests create substantial file fragmentation as a result of
> application actions that defeat post-EOF preallocation
> optimisations. They are intended to replicate known vectors for
> these problems, and provide a check that the fragmentation levels
> have been controlled. The mitigations we make may not completely
> remove fragmentation (e.g. they may demonstrate speculative delalloc
> related extent size growth) so the checks don't assume we'll end up
> with perfect layouts and hence check for an acceptable level of
> fragmentation rather than none.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> [move to different test number, update to current xfstest APIs]
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

This patch looks good to me, just a few nitpicks below...

>  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1500.out |  9 ++++++
>  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1501.out |  9 ++++++
>  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1502.out |  9 ++++++
>  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1503.out | 33 ++++++++++++++++++++
>  8 files changed, 339 insertions(+)
>  create mode 100755 tests/xfs/1500
>  create mode 100644 tests/xfs/1500.out
>  create mode 100755 tests/xfs/1501
>  create mode 100644 tests/xfs/1501.out
>  create mode 100755 tests/xfs/1502
>  create mode 100644 tests/xfs/1502.out
>  create mode 100755 tests/xfs/1503
>  create mode 100644 tests/xfs/1503.out
> 
> diff --git a/tests/xfs/1500 b/tests/xfs/1500
> new file mode 100755
> index 000000000..de0e1df62
> --- /dev/null
> +++ b/tests/xfs/1500
> @@ -0,0 +1,66 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/500
> +#
> +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto quick prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch
> +
> +_cleanup()
> +{
> +	# try to kill all background processes

I don't see a "kill" below. Maybe the comment should say "wait for all background
processes to finish"? Or did you mean to use "kill" but forgot? If you don't want
to use "kill", please tell me and I'll change the comment when I merge it.
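
The difference between the two behaviours can be seen in a minimal standalone sketch (the `sleep` jobs are placeholders for the test's background writers): `wait` only blocks until background jobs exit, while `kill` actually signals them.

```shell
#!/bin/bash
# wait blocks until the background job exits; it sends no signal.
sleep 0.2 &
first=$!
wait
if ! kill -0 "$first" 2>/dev/null; then
	echo "first job finished on its own"
fi

# To terminate a job that is still running, signal it, then reap it.
sleep 30 &
pid=$!
kill "$pid"
rc=0
wait "$pid" 2>/dev/null || rc=$?
echo "exit status after kill: $rc"
```

A job terminated by the default SIGTERM is reported by `wait` as status 128 + 15.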

> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> +# interleave allocations to fragment the files. Synchronous writes defeat the
> +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> +# removal, so this should fragment badly. Typical problematic behaviour shows
> +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> +# typically shows extent counts in the low 20s.
> +#
> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=8
> +wsize=4096
> +wcnt=1000
> +
> +write_sync_file()
> +{
> +	idx=$1
> +
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -s -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workfile*

Hmm, "rm -f $XXX*"... although it looks like $workfile has no chance of being
null here :) Maybe "rm -f $workfile.*" is safer, as all the test files are named
$workfile.$idx or $workfile.$n. I can make this change when I merge it.
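
A minimal sketch of why the narrower pattern is safer, run in a throwaway directory; the file names, including the unrelated `filesystem.conf`, are made up for illustration:

```shell
#!/bin/bash
dir=$(mktemp -d)
workfile=$dir/file
touch "$workfile.0" "$workfile.1" "${workfile}system.conf"

# "$workfile"* also matches names that merely share the prefix.
broad=("$workfile"*)
# "$workfile".* only matches the prefix followed by a dot.
narrow=("$workfile".*)

echo "broad glob matches:  ${#broad[@]}"
echo "narrow glob matches: ${#narrow[@]}"
rm -r -f "$dir"
```

Here the broad glob would also remove `filesystem.conf`, while the narrow one only touches `file.0` and `file.1`.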

Thanks,
Zorro

Zorro Lang Sept. 26, 2024, 12:31 p.m. UTC | #3
On Wed, Sep 25, 2024 at 07:15:32PM +0800, Zorro Lang wrote:
> On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > These tests create substantial file fragmentation as a result of
> > application actions that defeat post-EOF preallocation
> > optimisations. They are intended to replicate known vectors for
> > these problems, and provide a check that the fragmentation levels
> > have been controlled. The mitigations we make may not completely
> > remove fragmentation (e.g. they may demonstrate speculative delalloc
> > related extent size growth) so the checks don't assume we'll end up
> > with perfect layouts and hence check for an acceptable level of
> > fragmentation rather than none.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > [move to different test number, update to current xfstest APIs]
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> 
> This patch looks good to me, just a few nitpicks below...
> 
> >  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1500.out |  9 ++++++
> >  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1501.out |  9 ++++++
> >  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1502.out |  9 ++++++
> >  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1503.out | 33 ++++++++++++++++++++
> >  8 files changed, 339 insertions(+)
> >  create mode 100755 tests/xfs/1500
> >  create mode 100644 tests/xfs/1500.out
> >  create mode 100755 tests/xfs/1501
> >  create mode 100644 tests/xfs/1501.out
> >  create mode 100755 tests/xfs/1502
> >  create mode 100644 tests/xfs/1502.out
> >  create mode 100755 tests/xfs/1503
> >  create mode 100644 tests/xfs/1503.out
> > 
> > diff --git a/tests/xfs/1500 b/tests/xfs/1500
> > new file mode 100755
> > index 000000000..de0e1df62
> > --- /dev/null
> > +++ b/tests/xfs/1500
> > @@ -0,0 +1,66 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test xfs/500
> > +#
> > +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> > +#
> > +
> > +. ./common/preamble
> > +_begin_fstest auto quick prealloc rw
> > +
> > +. ./common/rc
> > +. ./common/filter

The two lines above are not necessary
(the same goes for the other tests).

> > +
> > +_require_scratch
> > +
> > +_cleanup()
> > +{
> > +	# try to kill all background processes
> 
> I don't see a "kill" below. Maybe the comment should say "wait for all background
> processes to finish"? Or did you mean to use "kill" but forgot? If you don't want
> to use "kill", please tell me and I'll change the comment when I merge it.
> 
> > +	wait
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +
> > +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> > +# interleave allocations to fragment the files. Synchronous writes defeat the
> > +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> > +# typically shows extent counts in the low 20s.
> > +#
> > +# Failure is determined by golden output mismatch from _within_tolerance().
> > +
> > +workfile=$SCRATCH_MNT/file
> > +nfiles=8
> > +wsize=4096
> > +wcnt=1000
> > +
> > +write_sync_file()
> > +{
> > +	idx=$1
> > +
> > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > +		$XFS_IO_PROG -f -s -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > +	done
> > +}
> > +
> > +rm -f $workfile*
> 
> Hmm, "rm -f $XXX*"... although it looks like $workfile has no chance of being
> null here :) Maybe "rm -f $workfile.*" is safer, as all the test files are named
> $workfile.$idx or $workfile.$n. I can make this change when I merge it.
> 
> Thanks,
> Zorro
> 
> > +for ((n=0; n<$nfiles; n++)); do
> > +	write_sync_file $n > /dev/null 2>&1 &
> > +done
> > +wait
> > +sync
> > +
> > +for ((n=0; n<$nfiles; n++)); do
> > +	count=$(_count_extents $workfile.$n)
> > +	# Acceptable extent count range is 1-40
> > +	_within_tolerance "file.$n extent count" $count 21 19 -v
> > +done
> > +
> > +status=0
> > +exit
> > diff --git a/tests/xfs/1500.out b/tests/xfs/1500.out
> > new file mode 100644
> > index 000000000..414df87ed
> > --- /dev/null
> > +++ b/tests/xfs/1500.out
> > @@ -0,0 +1,9 @@
> > +QA output created by 1500
> > +file.0 extent count is in range
> > +file.1 extent count is in range
> > +file.2 extent count is in range
> > +file.3 extent count is in range
> > +file.4 extent count is in range
> > +file.5 extent count is in range
> > +file.6 extent count is in range
> > +file.7 extent count is in range
> > diff --git a/tests/xfs/1501 b/tests/xfs/1501
> > new file mode 100755
> > index 000000000..cf3cbf8b5
> > --- /dev/null
> > +++ b/tests/xfs/1501
> > @@ -0,0 +1,68 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test xfs/501
> > +#
> > +# Post-EOF preallocation defeat test for buffered I/O with extent size hints.
> > +#
> > +
> > +. ./common/preamble
> > +_begin_fstest auto quick prealloc rw
> > +
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +_require_scratch
> > +
> > +_cleanup()
> > +{
> > +	# try to kill all background processes
> > +	wait
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +
> > +# Write multiple files in parallel using buffered writes with extent size hints.
> > +# Aim is to interleave allocations to fragment the files. Writes w/ extent size
> > +# hints set defeat the open/write/close heuristics in xfs_file_release() that
> > +# prevent EOF block removal, so this should fragment badly. Typical problematic
> > +# behaviour shows per-file extent counts of 1000 (worst case!) whilst
> > +# fixed behaviour should show very few extents (almost best case).
> > +#
> > +# Failure is determined by golden output mismatch from _within_tolerance().
> > +
> > +workfile=$SCRATCH_MNT/file
> > +nfiles=8
> > +wsize=4096
> > +wcnt=1000
> > +extent_size=16m
> > +
> > +write_extsz_file()
> > +{
> > +	idx=$1
> > +
> > +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx

_require_xfs_io_command "extsize"

> > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > +		$XFS_IO_PROG -f -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > +	done
> > +}
> > +
> > +rm -f $workfile*
> > +for ((n=0; n<$nfiles; n++)); do
> > +	write_extsz_file $n > /dev/null 2>&1 &
> > +done
> > +wait
> > +sync
> > +
> > +for ((n=0; n<$nfiles; n++)); do
> > +	count=$(_count_extents $workfile.$n)

_count_extents uses the fiemap command, so maybe:

_require_xfs_io_command "fiemap"

> > +	# Acceptable extent count range is 1-10
> > +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> > +done
> > +
> > +status=0
> > +exit

[snip]

> > +read_file()
> > +{
> > +	idx=$1
> > +
> > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > +		$XFS_IO_PROG -f -r -c "pread 0 28" $workfile.$idx
> > +	done
> > +}
> > +
> > +rm -f $workdir/file*
> > +for ((n=0; n<$((nfiles)); n++)); do

What's the $(( )) for?

Thanks,
Zorro
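For reference, a minimal sketch of why the extra $(( )) looks redundant: inside
a bash arithmetic for loop, variable names are already evaluated as numbers.
The counter names "plain" and "nested" are only for this illustration.

```bash
#!/bin/bash
nfiles=4

# Inside (( )), bash evaluates bare variable names arithmetically, so the
# loop condition needs neither $nfiles nor $((nfiles)).
plain=0
for ((n = 0; n < nfiles; n++)); do
	plain=$((plain + 1))
done

# The form used in the patch: same iteration count, just more typing.
nested=0
for ((n = 0; n < $((nfiles)); n++)); do
	nested=$((nested + 1))
done

echo "plain=$plain nested=$nested"
```

Both loops run the same number of times, so the simpler spelling used by the
other loops in these tests should be enough.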

> > +	write_file $n > /dev/null 2>&1 &
> > +	read_file $n > /dev/null 2>&1 &
> > +done
> > +wait
> > +
> > +for ((n=0; n<$nfiles; n++)); do
> > +	count=$(_count_extents $workfile.$n)
> > +	# Acceptable extent count range is 1-40
> > +	_within_tolerance "file.$n extent count" $count 6 5 10 -v
> > +done
> > +
> > +status=0
> > +exit
> > diff --git a/tests/xfs/1503.out b/tests/xfs/1503.out
> > new file mode 100644
> > index 000000000..1780b16df
> > --- /dev/null
> > +++ b/tests/xfs/1503.out
> > @@ -0,0 +1,33 @@
> > +QA output created by 1503
> > +file.0 extent count is in range
> > +file.1 extent count is in range
> > +file.2 extent count is in range
> > +file.3 extent count is in range
> > +file.4 extent count is in range
> > +file.5 extent count is in range
> > +file.6 extent count is in range
> > +file.7 extent count is in range
> > +file.8 extent count is in range
> > +file.9 extent count is in range
> > +file.10 extent count is in range
> > +file.11 extent count is in range
> > +file.12 extent count is in range
> > +file.13 extent count is in range
> > +file.14 extent count is in range
> > +file.15 extent count is in range
> > +file.16 extent count is in range
> > +file.17 extent count is in range
> > +file.18 extent count is in range
> > +file.19 extent count is in range
> > +file.20 extent count is in range
> > +file.21 extent count is in range
> > +file.22 extent count is in range
> > +file.23 extent count is in range
> > +file.24 extent count is in range
> > +file.25 extent count is in range
> > +file.26 extent count is in range
> > +file.27 extent count is in range
> > +file.28 extent count is in range
> > +file.29 extent count is in range
> > +file.30 extent count is in range
> > +file.31 extent count is in range
> > -- 
> > 2.45.2
> > 
> >
Darrick J. Wong Oct. 1, 2024, 2:59 p.m. UTC | #4
On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> These tests create substantial file fragmentation as a result of
> application actions that defeat post-EOF preallocation
> optimisations. They are intended to replicate known vectors for
> these problems, and provide a check that the fragmentation levels
> have been controlled. The mitigations we make may not completely
> remove fragmentation (e.g. they may demonstrate speculative delalloc
> related extent size growth) so the checks don't assume we'll end up
> with perfect layouts and hence check for an acceptable level of
> fragmentation rather than none.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> [move to different test number, update to current xfstest APIs]
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1500.out |  9 ++++++
>  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1501.out |  9 ++++++
>  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1502.out |  9 ++++++
>  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1503.out | 33 ++++++++++++++++++++
>  8 files changed, 339 insertions(+)
>  create mode 100755 tests/xfs/1500
>  create mode 100644 tests/xfs/1500.out
>  create mode 100755 tests/xfs/1501
>  create mode 100644 tests/xfs/1501.out
>  create mode 100755 tests/xfs/1502
>  create mode 100644 tests/xfs/1502.out
>  create mode 100755 tests/xfs/1503
>  create mode 100644 tests/xfs/1503.out
> 
> diff --git a/tests/xfs/1500 b/tests/xfs/1500
> new file mode 100755
> index 000000000..de0e1df62
> --- /dev/null
> +++ b/tests/xfs/1500
> @@ -0,0 +1,66 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/500
> +#
> +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto quick prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch
> +
> +_cleanup()
> +{
> +	# try to kill all background processes
> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> +# interleave allocations to fragment the files. Synchronous writes defeat the
> +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> +# removal, so this should fragment badly. Typical problematic behaviour shows
> +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> +# typically shows extent counts in the low 20s.

Now that these are in for-next, I've noticed that these new tests
consistently fail in the above-documented manner on various configs --
fsdax, always_cow, rtextsize > 1fsb, and sometimes 1k fsblock size.

I'm not sure why this happens, but it probably needs to be looked at
along with all the FALLOC_FL_UNSHARE_RANGE brokenness that's also been
exposed by fstests that /does/ need to be fixed.

--D
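
For anyone triaging these failures, it may help to spell out which extent
counts the checks accept. The sketch below is not the fstests implementation;
tolerance_range is a hypothetical helper, and it assumes _within_tolerance
takes "expected low_tolerance [high_tolerance]" with absolute tolerances and
the high tolerance defaulting to the low one.

```bash
#!/bin/bash
# Hypothetical helper, not part of fstests: print the inclusive range implied
# by "expected low_tolerance [high_tolerance]", assuming absolute (non-%)
# tolerances and that the high tolerance defaults to the low one.
tolerance_range()
{
	expected=$1
	low_tol=$2
	high_tol=${3:-$low_tol}

	echo "$((expected - low_tol)) $((expected + high_tol))"
}

range_1500=$(tolerance_range 21 19)	# arguments used by xfs/1500
range_1501=$(tolerance_range 2 1 8)	# arguments used by xfs/1501 and xfs/1502
range_1503=$(tolerance_range 6 5 10)	# arguments used by xfs/1503

echo "xfs/1500: $range_1500"
echo "xfs/1501, xfs/1502: $range_1501"
echo "xfs/1503: $range_1503"
```

If those assumptions hold, the accepted ranges are 2-40, 1-10 and 1-16, so the
"1-40" comments in xfs/1500 and xfs/1503 may not quite match the arguments
that are actually passed.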

> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=8
> +wsize=4096
> +wcnt=1000
> +
> +write_sync_file()
> +{
> +	idx=$1
> +
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -s -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workfile*
> +for ((n=0; n<$nfiles; n++)); do
> +	write_sync_file $n > /dev/null 2>&1 &
> +done
> +wait
> +sync
> +
> +for ((n=0; n<$nfiles; n++)); do
> +	count=$(_count_extents $workfile.$n)
> +	# Acceptable extent count range is 1-40
> +	_within_tolerance "file.$n extent count" $count 21 19 -v
> +done
> +
> +status=0
> +exit
> diff --git a/tests/xfs/1500.out b/tests/xfs/1500.out
> new file mode 100644
> index 000000000..414df87ed
> --- /dev/null
> +++ b/tests/xfs/1500.out
> @@ -0,0 +1,9 @@
> +QA output created by 1500
> +file.0 extent count is in range
> +file.1 extent count is in range
> +file.2 extent count is in range
> +file.3 extent count is in range
> +file.4 extent count is in range
> +file.5 extent count is in range
> +file.6 extent count is in range
> +file.7 extent count is in range
> diff --git a/tests/xfs/1501 b/tests/xfs/1501
> new file mode 100755
> index 000000000..cf3cbf8b5
> --- /dev/null
> +++ b/tests/xfs/1501
> @@ -0,0 +1,68 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/501
> +#
> +# Post-EOF preallocation defeat test for buffered I/O with extent size hints.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto quick prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch
> +
> +_cleanup()
> +{
> +	# try to kill all background processes
> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using buffered writes with extent size hints.
> +# Aim is to interleave allocations to fragment the files. Writes w/ extent size
> +# hints set defeat the open/write/close heuristics in xfs_file_release() that
> +# prevent EOF block removal, so this should fragment badly. Typical problematic
> +# behaviour shows per-file extent counts of 1000 (worst case!) whilst
> +# fixed behaviour should show very few extents (almost best case).
> +#
> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=8
> +wsize=4096
> +wcnt=1000
> +extent_size=16m
> +
> +write_extsz_file()
> +{
> +	idx=$1
> +
> +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workfile*
> +for ((n=0; n<$nfiles; n++)); do
> +	write_extsz_file $n > /dev/null 2>&1 &
> +done
> +wait
> +sync
> +
> +for ((n=0; n<$nfiles; n++)); do
> +	count=$(_count_extents $workfile.$n)
> +	# Acceptable extent count range is 1-10
> +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> +done
> +
> +status=0
> +exit
> diff --git a/tests/xfs/1501.out b/tests/xfs/1501.out
> new file mode 100644
> index 000000000..a266ef74b
> --- /dev/null
> +++ b/tests/xfs/1501.out
> @@ -0,0 +1,9 @@
> +QA output created by 1501
> +file.0 extent count is in range
> +file.1 extent count is in range
> +file.2 extent count is in range
> +file.3 extent count is in range
> +file.4 extent count is in range
> +file.5 extent count is in range
> +file.6 extent count is in range
> +file.7 extent count is in range
> diff --git a/tests/xfs/1502 b/tests/xfs/1502
> new file mode 100755
> index 000000000..f4228667a
> --- /dev/null
> +++ b/tests/xfs/1502
> @@ -0,0 +1,68 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/502
> +#
> +# Post-EOF preallocation defeat test for direct I/O with extent size hints.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto quick prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch
> +
> +_cleanup()
> +{
> +	# try to kill all background processes
> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using O_DIRECT writes w/ extent size hints.
> +# Aim is to interleave allocations to fragment the files. O_DIRECT writes defeat
> +# the open/write/close heuristics in xfs_file_release() that prevent EOF block
> +# removal, so this should fragment badly. Typical problematic behaviour shows
> +# per-file extent counts of ~1000 (worst case) whilst fixed behaviour typically
> +# shows extent counts in the low single digits (almost best case)
> +#
> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=8
> +wsize=4096
> +wcnt=1000
> +extent_size=16m
> +
> +write_direct_file()
> +{
> +	idx=$1
> +
> +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -d -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workfile*
> +for ((n=0; n<$nfiles; n++)); do
> +	write_direct_file $n > /dev/null 2>&1 &
> +done
> +wait
> +sync
> +
> +for ((n=0; n<$nfiles; n++)); do
> +	count=$(_count_extents $workfile.$n)
> +	# Acceptable extent count range is 1-10
> +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> +done
> +
> +status=0
> +exit
> diff --git a/tests/xfs/1502.out b/tests/xfs/1502.out
> new file mode 100644
> index 000000000..82c8760a3
> --- /dev/null
> +++ b/tests/xfs/1502.out
> @@ -0,0 +1,9 @@
> +QA output created by 1502
> +file.0 extent count is in range
> +file.1 extent count is in range
> +file.2 extent count is in range
> +file.3 extent count is in range
> +file.4 extent count is in range
> +file.5 extent count is in range
> +file.6 extent count is in range
> +file.7 extent count is in range
> diff --git a/tests/xfs/1503 b/tests/xfs/1503
> new file mode 100755
> index 000000000..9002f87e6
> --- /dev/null
> +++ b/tests/xfs/1503
> @@ -0,0 +1,77 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test xfs/503
> +#
> +# Post-EOF preallocation defeat test with O_SYNC buffered I/O that repeatedly
> +# closes and reopens the files.
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto prealloc rw
> +
> +. ./common/rc
> +. ./common/filter
> +
> +_require_scratch
> +
> +_cleanup()
> +{
> +	# try to kill all background processes
> +	wait
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Write multiple files in parallel using synchronous buffered writes that
> +# repeatedly close and reopen the files. Aim is to interleave allocations to
> +# fragment the files. Assuming we've fixed the synchronous write defeat, we can
> +# still trigger the same issue with a open/read/close on O_RDONLY files. We
> +# should not be triggering EOF preallocation removal on files we don't have
> +# permission to write, so until this is fixed it should fragment badly.  Typical
> +# problematic behaviour shows per-file extent counts of 50-350 whilst fixed
> +# behaviour typically demonstrates post-eof speculative delalloc growth in
> +# extent size (~6 extents for 50MB file).
> +#
> +# Failure is determined by golden output mismatch from _within_tolerance().
> +
> +workfile=$SCRATCH_MNT/file
> +nfiles=32
> +wsize=4096
> +wcnt=1000
> +
> +write_file()
> +{
> +	idx=$1
> +
> +	$XFS_IO_PROG -f -s -c "pwrite -b 64k 0 50m" $workfile.$idx
> +}
> +
> +read_file()
> +{
> +	idx=$1
> +
> +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> +		$XFS_IO_PROG -f -r -c "pread 0 28" $workfile.$idx
> +	done
> +}
> +
> +rm -f $workdir/file*
> +for ((n=0; n<$((nfiles)); n++)); do
> +	write_file $n > /dev/null 2>&1 &
> +	read_file $n > /dev/null 2>&1 &
> +done
> +wait
> +
> +for ((n=0; n<$nfiles; n++)); do
> +	count=$(_count_extents $workfile.$n)
> +	# Acceptable extent count range is 1-40
> +	_within_tolerance "file.$n extent count" $count 6 5 10 -v
> +done
> +
> +status=0
> +exit
> diff --git a/tests/xfs/1503.out b/tests/xfs/1503.out
> new file mode 100644
> index 000000000..1780b16df
> --- /dev/null
> +++ b/tests/xfs/1503.out
> @@ -0,0 +1,33 @@
> +QA output created by 1503
> +file.0 extent count is in range
> +file.1 extent count is in range
> +file.2 extent count is in range
> +file.3 extent count is in range
> +file.4 extent count is in range
> +file.5 extent count is in range
> +file.6 extent count is in range
> +file.7 extent count is in range
> +file.8 extent count is in range
> +file.9 extent count is in range
> +file.10 extent count is in range
> +file.11 extent count is in range
> +file.12 extent count is in range
> +file.13 extent count is in range
> +file.14 extent count is in range
> +file.15 extent count is in range
> +file.16 extent count is in range
> +file.17 extent count is in range
> +file.18 extent count is in range
> +file.19 extent count is in range
> +file.20 extent count is in range
> +file.21 extent count is in range
> +file.22 extent count is in range
> +file.23 extent count is in range
> +file.24 extent count is in range
> +file.25 extent count is in range
> +file.26 extent count is in range
> +file.27 extent count is in range
> +file.28 extent count is in range
> +file.29 extent count is in range
> +file.30 extent count is in range
> +file.31 extent count is in range
> -- 
> 2.45.2
> 
>
Zorro Lang Oct. 2, 2024, 1:38 p.m. UTC | #5
On Tue, Oct 01, 2024 at 07:59:44AM -0700, Darrick J. Wong wrote:
> On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > These tests create substantial file fragmentation as a result of
> > application actions that defeat post-EOF preallocation
> > optimisations. They are intended to replicate known vectors for
> > these problems, and provide a check that the fragmentation levels
> > have been controlled. The mitigations we make may not completely
> > remove fragmentation (e.g. they may demonstrate speculative delalloc
> > related extent size growth) so the checks don't assume we'll end up
> > with perfect layouts and hence check for an acceptable level of
> > fragmentation rather than none.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > [move to different test number, update to current xfstest APIs]
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1500.out |  9 ++++++
> >  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1501.out |  9 ++++++
> >  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1502.out |  9 ++++++
> >  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1503.out | 33 ++++++++++++++++++++
> >  8 files changed, 339 insertions(+)
> >  create mode 100755 tests/xfs/1500
> >  create mode 100644 tests/xfs/1500.out
> >  create mode 100755 tests/xfs/1501
> >  create mode 100644 tests/xfs/1501.out
> >  create mode 100755 tests/xfs/1502
> >  create mode 100644 tests/xfs/1502.out
> >  create mode 100755 tests/xfs/1503
> >  create mode 100644 tests/xfs/1503.out
> > 
> > diff --git a/tests/xfs/1500 b/tests/xfs/1500
> > new file mode 100755
> > index 000000000..de0e1df62
> > --- /dev/null
> > +++ b/tests/xfs/1500
> > @@ -0,0 +1,66 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test xfs/500
> > +#
> > +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> > +#
> > +
> > +. ./common/preamble
> > +_begin_fstest auto quick prealloc rw
> > +
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +_require_scratch
> > +
> > +_cleanup()
> > +{
> > +	# try to kill all background processes
> > +	wait
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +
> > +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> > +# interleave allocations to fragment the files. Synchronous writes defeat the
> > +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> > +# typically shows extent counts in the low 20s.
> 
> Now that these are in for-next, I've noticed that these new tests
> consistently fail in the above-documented manner on various configs --
> fsdax, always_cow, rtextsize > 1fsb, and sometimes 1k fsblock size.
> 
> I'm not sure why this happens, but it probably needs to be looked at
> along with all the FALLOC_FL_UNSHARE_RANGE brokenness that's also been
> exposed by fstests that /does/ need to be fixed.

Yes, some fsx tests have been failing on xfs since FALLOC_FL_UNSHARE_RANGE
support was added, e.g. g/091, g/127, g/263, g/363 and g/616. I thought these
were known issues that you were already aware of. If they're not, it would be
better to check. Hi Brian, are these failures known to you?

Thanks,
Zorro

> 
> --D
> 
Brian Foster Oct. 2, 2024, 2:35 p.m. UTC | #6
On Wed, Oct 02, 2024 at 09:38:00PM +0800, Zorro Lang wrote:
> On Tue, Oct 01, 2024 at 07:59:44AM -0700, Darrick J. Wong wrote:
> > On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > These tests create substantial file fragmentation as a result of
> > > application actions that defeat post-EOF preallocation
> > > optimisations. They are intended to replicate known vectors for
> > > these problems, and provide a check that the fragmentation levels
> > > have been controlled. The mitigations we make may not completely
> > > remove fragmentation (e.g. they may demonstrate speculative delalloc
> > > related extent size growth) so the checks don't assume we'll end up
> > > with perfect layouts and hence check for an acceptable level of
> > > fragmentation rather than none.
> > > 
> > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > [move to different test number, update to current xfstest APIs]
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > ---
> > >  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
> > >  tests/xfs/1500.out |  9 ++++++
> > >  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
> > >  tests/xfs/1501.out |  9 ++++++
> > >  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
> > >  tests/xfs/1502.out |  9 ++++++
> > >  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
> > >  tests/xfs/1503.out | 33 ++++++++++++++++++++
> > >  8 files changed, 339 insertions(+)
> > >  create mode 100755 tests/xfs/1500
> > >  create mode 100644 tests/xfs/1500.out
> > >  create mode 100755 tests/xfs/1501
> > >  create mode 100644 tests/xfs/1501.out
> > >  create mode 100755 tests/xfs/1502
> > >  create mode 100644 tests/xfs/1502.out
> > >  create mode 100755 tests/xfs/1503
> > >  create mode 100644 tests/xfs/1503.out
> > > 
> > > diff --git a/tests/xfs/1500 b/tests/xfs/1500
> > > new file mode 100755
> > > index 000000000..de0e1df62
> > > --- /dev/null
> > > +++ b/tests/xfs/1500
> > > @@ -0,0 +1,66 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > +#
> > > +# FS QA Test xfs/500
> > > +#
> > > +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> > > +#
> > > +
> > > +. ./common/preamble
> > > +_begin_fstest auto quick prealloc rw
> > > +
> > > +. ./common/rc
> > > +. ./common/filter
> > > +
> > > +_require_scratch
> > > +
> > > +_cleanup()
> > > +{
> > > +	# try to kill all background processes
> > > +	wait
> > > +	cd /
> > > +	rm -r -f $tmp.*
> > > +}
> > > +
> > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > +_scratch_mount
> > > +
> > > +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> > > +# interleave allocations to fragment the files. Synchronous writes defeat the
> > > +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> > > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > > > +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> > > +# typically shows extent counts in the low 20s.
> > 
> > Now that these are in for-next, I've noticed that these new tests
> > consistently fail in the above-documented manner on various configs --
> > fsdax, always_cow, rtextsize > 1fsb, and sometimes 1k fsblock size.
> > 
> > I'm not sure why this happens, but it probably needs to be looked at
> > along with all the FALLOC_FL_UNSHARE_RANGE brokenness that's also been
> > exposed by fstests that /does/ need to be fixed.
> 
> Yes, some fsx tests fail on xfs now that FALLOC_FL_UNSHARE_RANGE is supported,
> e.g. g/091, g/127, g/263, g/363 and g/616. I thought they were issues you
> already knew about. If they're not, it would be better to check. Hi Brian, are
> these failures known to you?
> 

So I'm aware of two fundamental issues that fsx unshare range support
uncovers. First is the XFS data loss issue that is addressed here[1],
second is the iomap unshare range warning/error splat that Julian Sun
has been working on (last version posted here[2] I believe).

My initial testing of the fsx unshare range patch was to run fsx
directly on the fs until I could run for some notable number of
operations without triggering a failure (probably at least 1m+, but I
don't recall exactly). I was initially able to do that with the patches
from [1] plus a local hack to trim to i_size in iomap_unshare_range(),
so based on that I _think_ these are the only two outstanding issues
with unshare range.

[1] https://lore.kernel.org/linux-xfs/20240906114051.120743-1-bfoster@redhat.com/
[2] https://lore.kernel.org/linux-fsdevel/20240927065344.2628691-1-sunjunchao2870@gmail.com/

The patches at [1] have been reviewed, but I'm not really sure where
they stand in terms of the XFS pipeline. Carlos?

It looks like the fix associated with [2] is still under
development/review. In any event, I just ran the set of tests noted by
Zorro above (w/ unshare range support) and they all fail on my distro
kernel, but all but generic/363 pass on current master (6.12.0-rc1+)
plus [1]. The generic/363 failure produces the iomap error associated
with [2], so I suspect that all of these test failures can be
categorized into one of those two known issues.

Brian

Darrick J. Wong Oct. 2, 2024, 2:57 p.m. UTC | #7
On Wed, Oct 02, 2024 at 10:35:55AM -0400, Brian Foster wrote:
> On Wed, Oct 02, 2024 at 09:38:00PM +0800, Zorro Lang wrote:
> > On Tue, Oct 01, 2024 at 07:59:44AM -0700, Darrick J. Wong wrote:
> > > On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> > > > From: Dave Chinner <dchinner@redhat.com>
> > > > 
> > > > These tests create substantial file fragmentation as a result of
> > > > application actions that defeat post-EOF preallocation
> > > > optimisations. They are intended to replicate known vectors for
> > > > these problems, and provide a check that the fragmentation levels
> > > > have been controlled. The mitigations we make may not completely
> > > > remove fragmentation (e.g. they may demonstrate speculative delalloc
> > > > related extent size growth) so the checks don't assume we'll end up
> > > > > with perfect layouts and hence check for an acceptable level of
> > > > fragmentation rather than none.
> > > > 
> > > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > > [move to different test number, update to current xfstest APIs]
> > > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > > ---
> > > >  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
> > > >  tests/xfs/1500.out |  9 ++++++
> > > >  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
> > > >  tests/xfs/1501.out |  9 ++++++
> > > >  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
> > > >  tests/xfs/1502.out |  9 ++++++
> > > >  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
> > > >  tests/xfs/1503.out | 33 ++++++++++++++++++++
> > > >  8 files changed, 339 insertions(+)
> > > >  create mode 100755 tests/xfs/1500
> > > >  create mode 100644 tests/xfs/1500.out
> > > >  create mode 100755 tests/xfs/1501
> > > >  create mode 100644 tests/xfs/1501.out
> > > >  create mode 100755 tests/xfs/1502
> > > >  create mode 100644 tests/xfs/1502.out
> > > >  create mode 100755 tests/xfs/1503
> > > >  create mode 100644 tests/xfs/1503.out
> > > > 
> > > > diff --git a/tests/xfs/1500 b/tests/xfs/1500
> > > > new file mode 100755
> > > > index 000000000..de0e1df62
> > > > --- /dev/null
> > > > +++ b/tests/xfs/1500
> > > > @@ -0,0 +1,66 @@
> > > > +#! /bin/bash
> > > > +# SPDX-License-Identifier: GPL-2.0
> > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > +#
> > > > +# FS QA Test xfs/500
> > > > +#
> > > > +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> > > > +#
> > > > +
> > > > +. ./common/preamble
> > > > +_begin_fstest auto quick prealloc rw
> > > > +
> > > > +. ./common/rc
> > > > +. ./common/filter
> > > > +
> > > > +_require_scratch
> > > > +
> > > > +_cleanup()
> > > > +{
> > > > +	# try to kill all background processes
> > > > +	wait
> > > > +	cd /
> > > > +	rm -r -f $tmp.*
> > > > +}
> > > > +
> > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > +_scratch_mount
> > > > +
> > > > +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> > > > +# interleave allocations to fragment the files. Synchronous writes defeat the
> > > > +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> > > > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > > > > +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> > > > +# typically shows extent counts in the low 20s.
> > > 
> > > Now that these are in for-next, I've noticed that these new tests
> > > consistently fail in the above-documented manner on various configs --
> > > fsdax, always_cow, rtextsize > 1fsb, and sometimes 1k fsblock size.
> > > 
> > > I'm not sure why this happens, but it probably needs to be looked at
> > > along with all the FALLOC_FL_UNSHARE_RANGE brokenness that's also been
> > > exposed by fstests that /does/ need to be fixed.
> > 
> > Yes, some fsx tests fail on xfs now that FALLOC_FL_UNSHARE_RANGE is supported,
> > e.g. g/091, g/127, g/263, g/363 and g/616. I thought they were issues you
> > already knew about. If they're not, it would be better to check. Hi Brian, are
> > these failures known to you?
> > 
> 
> So I'm aware of two fundamental issues that fsx unshare range support
> uncovers. First is the XFS data loss issue that is addressed here[1],
> second is the iomap unshare range warning/error splat that Julian Sun
> has been working on (last version posted here[2] I believe).
> 
> My initial testing of the fsx unshare range patch was to run fsx
> directly on the fs until I could run for some notable number of
> operations without triggering a failure (probably at least 1m+, but I
> don't recall exactly). I was initially able to do that with the patches
> from [1] plus a local hack to trim to i_size in iomap_unshare_range(),
> so based on that I _think_ these are the only two outstanding issues
> with unshare range.
> 
> [1] https://lore.kernel.org/linux-xfs/20240906114051.120743-1-bfoster@redhat.com/
> [2] https://lore.kernel.org/linux-fsdevel/20240927065344.2628691-1-sunjunchao2870@gmail.com/
> 
> The patches at [1] have been reviewed, but I'm not really sure where
> they stand in terms of the XFS pipeline. Carlos?
> 
> It looks like the fix associated with [2] is still under
> development/review. In any event, I just ran the set of tests noted by
> Zorro above (w/ unshare range support) and they all fail on my distro
> kernel, but all but generic/363 pass on current master (6.12.0-rc1+)
> plus [1]. The generic/363 failure produces the iomap error associated
> with [2], so I suspect that all of these test failures can be
> categorized into one of those two known issues.

Yep.  [1] fixes a lot of the splats, and the rest of the splats can be
fixed by a couple of other patches that I'll send out today.

Those same fsx tests above are still broken on fsdax though, so
something is still wrong. :(

--D

Brian Foster Oct. 2, 2024, 3:56 p.m. UTC | #8
On Wed, Oct 02, 2024 at 07:57:15AM -0700, Darrick J. Wong wrote:
> On Wed, Oct 02, 2024 at 10:35:55AM -0400, Brian Foster wrote:
> > On Wed, Oct 02, 2024 at 09:38:00PM +0800, Zorro Lang wrote:
> > > On Tue, Oct 01, 2024 at 07:59:44AM -0700, Darrick J. Wong wrote:
> > > > On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> > > > > From: Dave Chinner <dchinner@redhat.com>
> > > > > 
> > > > > These tests create substantial file fragmentation as a result of
> > > > > application actions that defeat post-EOF preallocation
> > > > > optimisations. They are intended to replicate known vectors for
> > > > > these problems, and provide a check that the fragmentation levels
> > > > > have been controlled. The mitigations we make may not completely
> > > > > remove fragmentation (e.g. they may demonstrate speculative delalloc
> > > > > related extent size growth) so the checks don't assume we'll end up
> > > > > > with perfect layouts and hence check for an acceptable level of
> > > > > fragmentation rather than none.
> > > > > 
> > > > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > > > [move to different test number, update to current xfstest APIs]
> > > > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > > > ---
> > > > >  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
> > > > >  tests/xfs/1500.out |  9 ++++++
> > > > >  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
> > > > >  tests/xfs/1501.out |  9 ++++++
> > > > >  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
> > > > >  tests/xfs/1502.out |  9 ++++++
> > > > >  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
> > > > >  tests/xfs/1503.out | 33 ++++++++++++++++++++
> > > > >  8 files changed, 339 insertions(+)
> > > > >  create mode 100755 tests/xfs/1500
> > > > >  create mode 100644 tests/xfs/1500.out
> > > > >  create mode 100755 tests/xfs/1501
> > > > >  create mode 100644 tests/xfs/1501.out
> > > > >  create mode 100755 tests/xfs/1502
> > > > >  create mode 100644 tests/xfs/1502.out
> > > > >  create mode 100755 tests/xfs/1503
> > > > >  create mode 100644 tests/xfs/1503.out
> > > > > 
> > > > > diff --git a/tests/xfs/1500 b/tests/xfs/1500
> > > > > new file mode 100755
> > > > > index 000000000..de0e1df62
> > > > > --- /dev/null
> > > > > +++ b/tests/xfs/1500
> > > > > @@ -0,0 +1,66 @@
> > > > > +#! /bin/bash
> > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > > +#
> > > > > +# FS QA Test xfs/500
> > > > > +#
> > > > > +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> > > > > +#
> > > > > +
> > > > > +. ./common/preamble
> > > > > +_begin_fstest auto quick prealloc rw
> > > > > +
> > > > > +. ./common/rc
> > > > > +. ./common/filter
> > > > > +
> > > > > +_require_scratch
> > > > > +
> > > > > +_cleanup()
> > > > > +{
> > > > > +	# try to kill all background processes
> > > > > +	wait
> > > > > +	cd /
> > > > > +	rm -r -f $tmp.*
> > > > > +}
> > > > > +
> > > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > > +_scratch_mount
> > > > > +
> > > > > +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> > > > > +# interleave allocations to fragment the files. Synchronous writes defeat the
> > > > > +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> > > > > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > > > > > +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> > > > > +# typically shows extent counts in the low 20s.
> > > > 
> > > > Now that these are in for-next, I've noticed that these new tests
> > > > consistently fail in the above-documented manner on various configs --
> > > > fsdax, always_cow, rtextsize > 1fsb, and sometimes 1k fsblock size.
> > > > 
> > > > I'm not sure why this happens, but it probably needs to be looked at
> > > > along with all the FALLOC_FL_UNSHARE_RANGE brokenness that's also been
> > > > exposed by fstests that /does/ need to be fixed.
> > > 
> > > > Yes, some fsx tests fail on xfs since FALLOC_FL_UNSHARE_RANGE support was
> > > > added, e.g. g/091, g/127, g/263, g/363 and g/616. I thought these were known
> > > > issues. If they're not, they're worth checking. Hi Brian, are these failures
> > > > known to you?
> > > 
> > 
> > So I'm aware of two fundamental issues that fsx unshare range support
> > uncovers. First is the XFS data loss issue that is addressed here[1],
> > second is the iomap unshare range warning/error splat that Julian Sun
> > has been working on (last version posted here[2] I believe).
> > 
> > My initial testing of the fsx unshare range patch was to run fsx
> > directly on the fs until I could run for some notable number of
> > operations without triggering a failure (probably at least 1m+, but I
> > don't recall exactly). I was initially able to do that with the patches
> > from [1] plus a local hack to trim to i_size in iomap_unshare_range(),
> > so based on that I _think_ these are the only two outstanding issues
> > with unshare range.
> > 
> > [1] https://lore.kernel.org/linux-xfs/20240906114051.120743-1-bfoster@redhat.com/
> > [2] https://lore.kernel.org/linux-fsdevel/20240927065344.2628691-1-sunjunchao2870@gmail.com/
> > 
> > The patches at [1] have been reviewed, but I'm not really sure where
> > they stand in terms of the XFS pipeline. Carlos?
> > 
> > It looks like the fix associated with [2] is still under
> > development/review. In any event, I just ran the set of tests noted by
> > Zorro above (w/ unshare range support) and they all fail on my distro
> > kernel, but all but generic/363 pass on current master (6.12.0-rc1+)
> > plus [1]. The generic/363 failure produces the iomap error associated
> > with [2], so I suspect that all of these test failures can be
> > categorized into one of those two known issues.
> 
> Yep.  [1] fixes a lot of the splats, and the rest of the splats can be
> fixed by a couple of other patches that I'll send out today.
> 
> Those same fsx tests above are still broken on fsdax though, so
> something is still wrong. :(
> 

Ah, OK. I hadn't tested on fsdax. Is that fixed by your corresponding
i_size fix, or are we still failing even with that? Note that if it's
isolated to generic/363 and doesn't explicitly involve the iomap warning,
it could be a zeroing issue rather than an unshare issue. That could
probably be confirmed by disabling unshare and the EOF pollution bits one
at a time in fsx.

Brian

> --D
> 
> > Brian
> > 
> > > Thanks,
> > > Zorro
> > > 
> > > > 
> > > > --D
> > > > 
> > > > > +# Failure is determined by golden output mismatch from _within_tolerance().
> > > > > +
> > > > > +workfile=$SCRATCH_MNT/file
> > > > > +nfiles=8
> > > > > +wsize=4096
> > > > > +wcnt=1000
> > > > > +
> > > > > +write_sync_file()
> > > > > +{
> > > > > +	idx=$1
> > > > > +
> > > > > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > > > > +		$XFS_IO_PROG -f -s -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > > > > +	done
> > > > > +}
> > > > > +
> > > > > +rm -f $workfile*
> > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > +	write_sync_file $n > /dev/null 2>&1 &
> > > > > +done
> > > > > +wait
> > > > > +sync
> > > > > +
> > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > +	count=$(_count_extents $workfile.$n)
> > > > > +	# Acceptible extent count range is 1-40
> > > > > +	_within_tolerance "file.$n extent count" $count 21 19 -v
> > > > > +done
> > > > > +
> > > > > +status=0
> > > > > +exit
> > > > > diff --git a/tests/xfs/1500.out b/tests/xfs/1500.out
> > > > > new file mode 100644
> > > > > index 000000000..414df87ed
> > > > > --- /dev/null
> > > > > +++ b/tests/xfs/1500.out
> > > > > @@ -0,0 +1,9 @@
> > > > > +QA output created by 1500
> > > > > +file.0 extent count is in range
> > > > > +file.1 extent count is in range
> > > > > +file.2 extent count is in range
> > > > > +file.3 extent count is in range
> > > > > +file.4 extent count is in range
> > > > > +file.5 extent count is in range
> > > > > +file.6 extent count is in range
> > > > > +file.7 extent count is in range
> > > > > diff --git a/tests/xfs/1501 b/tests/xfs/1501
> > > > > new file mode 100755
> > > > > index 000000000..cf3cbf8b5
> > > > > --- /dev/null
> > > > > +++ b/tests/xfs/1501
> > > > > @@ -0,0 +1,68 @@
> > > > > +#! /bin/bash
> > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > > +#
> > > > > +# FS QA Test xfs/501
> > > > > +#
> > > > > +# Post-EOF preallocation defeat test for buffered I/O with extent size hints.
> > > > > +#
> > > > > +
> > > > > +. ./common/preamble
> > > > > +_begin_fstest auto quick prealloc rw
> > > > > +
> > > > > +. ./common/rc
> > > > > +. ./common/filter
> > > > > +
> > > > > +_require_scratch
> > > > > +
> > > > > +_cleanup()
> > > > > +{
> > > > > +	# try to kill all background processes
> > > > > +	wait
> > > > > +	cd /
> > > > > +	rm -r -f $tmp.*
> > > > > +}
> > > > > +
> > > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > > +_scratch_mount
> > > > > +
> > > > > +# Write multiple files in parallel using buffered writes with extent size hints.
> > > > > +# Aim is to interleave allocations to fragment the files. Writes w/ extent size
> > > > > +# hints set defeat the open/write/close heuristics in xfs_file_release() that
> > > > > +# prevent EOF block removal, so this should fragment badly. Typical problematic
> > > > > +# behaviour shows per-file extent counts of 1000 (worst case!) whilst
> > > > > +# fixed behaviour should show very few extents (almost best case).
> > > > > +#
> > > > > +# Failure is determined by golden output mismatch from _within_tolerance().
> > > > > +
> > > > > +workfile=$SCRATCH_MNT/file
> > > > > +nfiles=8
> > > > > +wsize=4096
> > > > > +wcnt=1000
> > > > > +extent_size=16m
> > > > > +
> > > > > +write_extsz_file()
> > > > > +{
> > > > > +	idx=$1
> > > > > +
> > > > > +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> > > > > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > > > > +		$XFS_IO_PROG -f -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > > > > +	done
> > > > > +}
> > > > > +
> > > > > +rm -f $workfile*
> > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > +	write_extsz_file $n > /dev/null 2>&1 &
> > > > > +done
> > > > > +wait
> > > > > +sync
> > > > > +
> > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > +	count=$(_count_extents $workfile.$n)
> > > > > +	# Acceptible extent count range is 1-10
> > > > > +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> > > > > +done
> > > > > +
> > > > > +status=0
> > > > > +exit
> > > > > diff --git a/tests/xfs/1501.out b/tests/xfs/1501.out
> > > > > new file mode 100644
> > > > > index 000000000..a266ef74b
> > > > > --- /dev/null
> > > > > +++ b/tests/xfs/1501.out
> > > > > @@ -0,0 +1,9 @@
> > > > > +QA output created by 1501
> > > > > +file.0 extent count is in range
> > > > > +file.1 extent count is in range
> > > > > +file.2 extent count is in range
> > > > > +file.3 extent count is in range
> > > > > +file.4 extent count is in range
> > > > > +file.5 extent count is in range
> > > > > +file.6 extent count is in range
> > > > > +file.7 extent count is in range
> > > > > diff --git a/tests/xfs/1502 b/tests/xfs/1502
> > > > > new file mode 100755
> > > > > index 000000000..f4228667a
> > > > > --- /dev/null
> > > > > +++ b/tests/xfs/1502
> > > > > @@ -0,0 +1,68 @@
> > > > > +#! /bin/bash
> > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > > +#
> > > > > +# FS QA Test xfs/502
> > > > > +#
> > > > > +# Post-EOF preallocation defeat test for direct I/O with extent size hints.
> > > > > +#
> > > > > +
> > > > > +. ./common/preamble
> > > > > +_begin_fstest auto quick prealloc rw
> > > > > +
> > > > > +. ./common/rc
> > > > > +. ./common/filter
> > > > > +
> > > > > +_require_scratch
> > > > > +
> > > > > +_cleanup()
> > > > > +{
> > > > > +	# try to kill all background processes
> > > > > +	wait
> > > > > +	cd /
> > > > > +	rm -r -f $tmp.*
> > > > > +}
> > > > > +
> > > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > > +_scratch_mount
> > > > > +
> > > > > +# Write multiple files in parallel using O_DIRECT writes w/ extent size hints.
> > > > > +# Aim is to interleave allocations to fragment the files. O_DIRECT writes defeat
> > > > > +# the open/write/close heuristics in xfs_file_release() that prevent EOF block
> > > > > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > > > > +# per-file extent counts of ~1000 (worst case) whilst fixed behaviour typically
> > > > > +# shows extent counts in the low single digits (almost best case)
> > > > > +#
> > > > > +# Failure is determined by golden output mismatch from _within_tolerance().
> > > > > +
> > > > > +workfile=$SCRATCH_MNT/file
> > > > > +nfiles=8
> > > > > +wsize=4096
> > > > > +wcnt=1000
> > > > > +extent_size=16m
> > > > > +
> > > > > +write_direct_file()
> > > > > +{
> > > > > +	idx=$1
> > > > > +
> > > > > +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> > > > > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > > > > +		$XFS_IO_PROG -f -d -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > > > > +	done
> > > > > +}
> > > > > +
> > > > > +rm -f $workfile*
> > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > +	write_direct_file $n > /dev/null 2>&1 &
> > > > > +done
> > > > > +wait
> > > > > +sync
> > > > > +
> > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > +	count=$(_count_extents $workfile.$n)
> > > > > +	# Acceptible extent count range is 1-10
> > > > > +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> > > > > +done
> > > > > +
> > > > > +status=0
> > > > > +exit
> > > > > diff --git a/tests/xfs/1502.out b/tests/xfs/1502.out
> > > > > new file mode 100644
> > > > > index 000000000..82c8760a3
> > > > > --- /dev/null
> > > > > +++ b/tests/xfs/1502.out
> > > > > @@ -0,0 +1,9 @@
> > > > > +QA output created by 1502
> > > > > +file.0 extent count is in range
> > > > > +file.1 extent count is in range
> > > > > +file.2 extent count is in range
> > > > > +file.3 extent count is in range
> > > > > +file.4 extent count is in range
> > > > > +file.5 extent count is in range
> > > > > +file.6 extent count is in range
> > > > > +file.7 extent count is in range
> > > > > diff --git a/tests/xfs/1503 b/tests/xfs/1503
> > > > > new file mode 100755
> > > > > index 000000000..9002f87e6
> > > > > --- /dev/null
> > > > > +++ b/tests/xfs/1503
> > > > > @@ -0,0 +1,77 @@
> > > > > +#! /bin/bash
> > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > > +#
> > > > > +# FS QA Test xfs/503
> > > > > +#
> > > > > +# Post-EOF preallocation defeat test with O_SYNC buffered I/O that repeatedly
> > > > > +# closes and reopens the files.
> > > > > +#
> > > > > +
> > > > > +. ./common/preamble
> > > > > +_begin_fstest auto prealloc rw
> > > > > +
> > > > > +. ./common/rc
> > > > > +. ./common/filter
> > > > > +
> > > > > +_require_scratch
> > > > > +
> > > > > +_cleanup()
> > > > > +{
> > > > > +	# try to kill all background processes
> > > > > +	wait
> > > > > +	cd /
> > > > > +	rm -r -f $tmp.*
> > > > > +}
> > > > > +
> > > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > > +_scratch_mount
> > > > > +
> > > > > +# Write multiple files in parallel using synchronous buffered writes that
> > > > > > +# repeatedly close and reopen the files. Aim is to interleave allocations to
> > > > > > +# fragment the files. Assuming we've fixed the synchronous write defeat, we can
> > > > > > +# still trigger the same issue with an open/read/close on O_RDONLY files. We
> > > > > +# should not be triggering EOF preallocation removal on files we don't have
> > > > > +# permission to write, so until this is fixed it should fragment badly.  Typical
> > > > > +# problematic behaviour shows per-file extent counts of 50-350 whilst fixed
> > > > > +# behaviour typically demonstrates post-eof speculative delalloc growth in
> > > > > +# extent size (~6 extents for 50MB file).
> > > > > +#
> > > > > +# Failure is determined by golden output mismatch from _within_tolerance().
> > > > > +
> > > > > +workfile=$SCRATCH_MNT/file
> > > > > +nfiles=32
> > > > > +wsize=4096
> > > > > +wcnt=1000
> > > > > +
> > > > > +write_file()
> > > > > +{
> > > > > +	idx=$1
> > > > > +
> > > > > +	$XFS_IO_PROG -f -s -c "pwrite -b 64k 0 50m" $workfile.$idx
> > > > > +}
> > > > > +
> > > > > +read_file()
> > > > > +{
> > > > > +	idx=$1
> > > > > +
> > > > > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > > > > +		$XFS_IO_PROG -f -r -c "pread 0 28" $workfile.$idx
> > > > > +	done
> > > > > +}
> > > > > +
> > > > > > +rm -f $workfile*
> > > > > +for ((n=0; n<$((nfiles)); n++)); do
> > > > > +	write_file $n > /dev/null 2>&1 &
> > > > > +	read_file $n > /dev/null 2>&1 &
> > > > > +done
> > > > > +wait
> > > > > +
> > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > +	count=$(_count_extents $workfile.$n)
> > > > > +	# Acceptible extent count range is 1-40
> > > > > +	_within_tolerance "file.$n extent count" $count 6 5 10 -v
> > > > > +done
> > > > > +
> > > > > +status=0
> > > > > +exit
> > > > > diff --git a/tests/xfs/1503.out b/tests/xfs/1503.out
> > > > > new file mode 100644
> > > > > index 000000000..1780b16df
> > > > > --- /dev/null
> > > > > +++ b/tests/xfs/1503.out
> > > > > @@ -0,0 +1,33 @@
> > > > > +QA output created by 1503
> > > > > +file.0 extent count is in range
> > > > > +file.1 extent count is in range
> > > > > +file.2 extent count is in range
> > > > > +file.3 extent count is in range
> > > > > +file.4 extent count is in range
> > > > > +file.5 extent count is in range
> > > > > +file.6 extent count is in range
> > > > > +file.7 extent count is in range
> > > > > +file.8 extent count is in range
> > > > > +file.9 extent count is in range
> > > > > +file.10 extent count is in range
> > > > > +file.11 extent count is in range
> > > > > +file.12 extent count is in range
> > > > > +file.13 extent count is in range
> > > > > +file.14 extent count is in range
> > > > > +file.15 extent count is in range
> > > > > +file.16 extent count is in range
> > > > > +file.17 extent count is in range
> > > > > +file.18 extent count is in range
> > > > > +file.19 extent count is in range
> > > > > +file.20 extent count is in range
> > > > > +file.21 extent count is in range
> > > > > +file.22 extent count is in range
> > > > > +file.23 extent count is in range
> > > > > +file.24 extent count is in range
> > > > > +file.25 extent count is in range
> > > > > +file.26 extent count is in range
> > > > > +file.27 extent count is in range
> > > > > +file.28 extent count is in range
> > > > > +file.29 extent count is in range
> > > > > +file.30 extent count is in range
> > > > > +file.31 extent count is in range
> > > > > -- 
> > > > > 2.45.2
> > > > > 
> > > > > 
> > > > 
> > > 
> > 
> > 
>
Darrick J. Wong Oct. 2, 2024, 8:04 p.m. UTC | #9
On Wed, Oct 02, 2024 at 11:56:40AM -0400, Brian Foster wrote:
> On Wed, Oct 02, 2024 at 07:57:15AM -0700, Darrick J. Wong wrote:
> > On Wed, Oct 02, 2024 at 10:35:55AM -0400, Brian Foster wrote:
> > > On Wed, Oct 02, 2024 at 09:38:00PM +0800, Zorro Lang wrote:
> > > > On Tue, Oct 01, 2024 at 07:59:44AM -0700, Darrick J. Wong wrote:
> > > > > On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> > > > > > From: Dave Chinner <dchinner@redhat.com>
> > > > > > 
> > > > > > These tests create substantial file fragmentation as a result of
> > > > > > application actions that defeat post-EOF preallocation
> > > > > > optimisations. They are intended to replicate known vectors for
> > > > > > these problems, and provide a check that the fragmentation levels
> > > > > > have been controlled. The mitigations we make may not completely
> > > > > > remove fragmentation (e.g. they may demonstrate speculative delalloc
> > > > > > related extent size growth) so the checks don't assume we'll end up
> > > > > > > with perfect layouts and hence check for an acceptable level of
> > > > > > fragmentation rather than none.
> > > > > > 
> > > > > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > > > > [move to different test number, update to current xfstest APIs]
> > > > > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > > > > ---
> > > > > >  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
> > > > > >  tests/xfs/1500.out |  9 ++++++
> > > > > >  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
> > > > > >  tests/xfs/1501.out |  9 ++++++
> > > > > >  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
> > > > > >  tests/xfs/1502.out |  9 ++++++
> > > > > >  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
> > > > > >  tests/xfs/1503.out | 33 ++++++++++++++++++++
> > > > > >  8 files changed, 339 insertions(+)
> > > > > >  create mode 100755 tests/xfs/1500
> > > > > >  create mode 100644 tests/xfs/1500.out
> > > > > >  create mode 100755 tests/xfs/1501
> > > > > >  create mode 100644 tests/xfs/1501.out
> > > > > >  create mode 100755 tests/xfs/1502
> > > > > >  create mode 100644 tests/xfs/1502.out
> > > > > >  create mode 100755 tests/xfs/1503
> > > > > >  create mode 100644 tests/xfs/1503.out
> > > > > > 
> > > > > > diff --git a/tests/xfs/1500 b/tests/xfs/1500
> > > > > > new file mode 100755
> > > > > > index 000000000..de0e1df62
> > > > > > --- /dev/null
> > > > > > +++ b/tests/xfs/1500
> > > > > > @@ -0,0 +1,66 @@
> > > > > > +#! /bin/bash
> > > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > > > +#
> > > > > > +# FS QA Test xfs/500
> > > > > > +#
> > > > > > +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> > > > > > +#
> > > > > > +
> > > > > > +. ./common/preamble
> > > > > > +_begin_fstest auto quick prealloc rw
> > > > > > +
> > > > > > +. ./common/rc
> > > > > > +. ./common/filter
> > > > > > +
> > > > > > +_require_scratch
> > > > > > +
> > > > > > +_cleanup()
> > > > > > +{
> > > > > > +	# try to kill all background processes
> > > > > > +	wait
> > > > > > +	cd /
> > > > > > +	rm -r -f $tmp.*
> > > > > > +}
> > > > > > +
> > > > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > > > +_scratch_mount
> > > > > > +
> > > > > > +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> > > > > > +# interleave allocations to fragment the files. Synchronous writes defeat the
> > > > > > +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> > > > > > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > > > > > > +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> > > > > > +# typically shows extent counts in the low 20s.
> > > > > 
> > > > > Now that these are in for-next, I've noticed that these new tests
> > > > > consistently fail in the above-documented manner on various configs --
> > > > > fsdax, always_cow, rtextsize > 1fsb, and sometimes 1k fsblock size.
> > > > > 
> > > > > I'm not sure why this happens, but it probably needs to be looked at
> > > > > along with all the FALLOC_FL_UNSHARE_RANGE brokenness that's also been
> > > > > exposed by fstests that /does/ need to be fixed.
> > > > 
> > > > > Yes, some fsx tests fail on xfs since FALLOC_FL_UNSHARE_RANGE support was
> > > > > added, e.g. g/091, g/127, g/263, g/363 and g/616. I thought these were known
> > > > > issues. If they're not, they're worth checking. Hi Brian, are these failures
> > > > > known to you?
> > > > 
> > > 
> > > So I'm aware of two fundamental issues that fsx unshare range support
> > > uncovers. First is the XFS data loss issue that is addressed here[1],
> > > second is the iomap unshare range warning/error splat that Julian Sun
> > > has been working on (last version posted here[2] I believe).
> > > 
> > > My initial testing of the fsx unshare range patch was to run fsx
> > > directly on the fs until I could run for some notable number of
> > > operations without triggering a failure (probably at least 1m+, but I
> > > don't recall exactly). I was initially able to do that with the patches
> > > from [1] plus a local hack to trim to i_size in iomap_unshare_range(),
> > > so based on that I _think_ these are the only two outstanding issues
> > > with unshare range.
> > > 
> > > [1] https://lore.kernel.org/linux-xfs/20240906114051.120743-1-bfoster@redhat.com/
> > > [2] https://lore.kernel.org/linux-fsdevel/20240927065344.2628691-1-sunjunchao2870@gmail.com/
> > > 
> > > The patches at [1] have been reviewed, but I'm not really sure where
> > > they stand in terms of the XFS pipeline. Carlos?
> > > 
> > > It looks like the fix associated with [2] is still under
> > > development/review. In any event, I just ran the set of tests noted by
> > > Zorro above (w/ unshare range support) and they all fail on my distro
> > > kernel, but all but generic/363 pass on current master (6.12.0-rc1+)
> > > plus [1]. The generic/363 failure produces the iomap error associated
> > > with [2], so I suspect that all of these test failures can be
> > > categorized into one of those two known issues.
> > 
> > Yep.  [1] fixes a lot of the splats, and the rest of the splats can be
> > fixed by a couple of other patches that I'll send out today.
> > 
> > Those same fsx tests above are still broken on fsdax though, so
> > something is still wrong. :(
> > 
> 
> Ah, OK. I hadn't tested on fsdax. Is that fixed by your corresponding
> i_size fix, or are we still failing even with that? Note that if it's
> isolated to generic/363 and doesn't explicitly involve the iomap warning,
> it could be a zeroing issue rather than an unshare issue. That could
> probably be confirmed by disabling unshare and the EOF pollution bits one
> at a time in fsx.

Oh, no, it's anything that uses fsx on fsdax.

Roughly speaking, the "is this really shared?" predicate didn't get
updated when the iomap_unshare_iter version did; the "if srcmap is hole
or unwritten" code makes no sense because you can't share holes or
unwritten extents; the range to memcpy should have been extended to
align with the fsblocks because we cow whole blocks; the rounding down
that the dax_iomap_direct_access method does to the returned pointer
means it's been corrupting data; and it doesn't invalidate the mappings
before doing the cow which means any real fsdax programs access the
wrong memory afterwards.  IOWs, we've been corrupting data since 2022.
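
To illustrate just the alignment point (the range to copy has to cover
whole fsblocks, because CoW operates on whole blocks), here is a minimal
shell sketch that rounds a byte range outward to block boundaries. The
helper name round_range_to_fsblocks and the 4096-byte block size are
assumptions made up for this example; it is not the kernel code and not
an fstests helper.

```shell
#!/bin/bash
# Hypothetical helper, for illustration only: round the byte range
# [pos, pos + len) outward so that it starts and ends on a block boundary.
# Prints "<aligned_start> <aligned_length>".
round_range_to_fsblocks()
{
	local pos=$1 len=$2 blksz=$3

	# Round the start down to the containing block.
	local start=$(( pos / blksz * blksz ))
	# Round the end up to the next block boundary.
	local end=$(( (pos + len + blksz - 1) / blksz * blksz ))

	echo "$start $((end - start))"
}

# A 3000-byte range at offset 5000 lies inside the second 4k block,
# so the aligned range is that whole block.
round_range_to_fsblocks 5000 3000 4096
```

A range that straddles a block boundary grows to cover both blocks, e.g.
"round_range_to_fsblocks 4000 200 4096" yields a start of 0 and a length
of 8192.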

Oh and also xfs_direct_write_iomap_begin shouldn't be allocating cow
extents for an unshare operation over a hole.

So, uh, a big thank you to you for exposing this untested crap code
(both iomap and fsdax versions).  Patches coming. :)

--D

> Brian
> 
> > --D
> > 
> > > Brian
> > > 
> > > > Thanks,
> > > > Zorro
> > > > 
> > > > > 
> > > > > --D
> > > > > 
> > > > > > +# Failure is determined by golden output mismatch from _within_tolerance().
> > > > > > +
> > > > > > +workfile=$SCRATCH_MNT/file
> > > > > > +nfiles=8
> > > > > > +wsize=4096
> > > > > > +wcnt=1000
> > > > > > +
> > > > > > +write_sync_file()
> > > > > > +{
> > > > > > +	idx=$1
> > > > > > +
> > > > > > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > > > > > +		$XFS_IO_PROG -f -s -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > > > > > +	done
> > > > > > +}
> > > > > > +
> > > > > > +rm -f $workfile*
> > > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > > +	write_sync_file $n > /dev/null 2>&1 &
> > > > > > +done
> > > > > > +wait
> > > > > > +sync
> > > > > > +
> > > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > > +	count=$(_count_extents $workfile.$n)
> > > > > > +	# Acceptible extent count range is 1-40
> > > > > > +	_within_tolerance "file.$n extent count" $count 21 19 -v
> > > > > > +done
> > > > > > +
> > > > > > +status=0
> > > > > > +exit
> > > > > > diff --git a/tests/xfs/1500.out b/tests/xfs/1500.out
> > > > > > new file mode 100644
> > > > > > index 000000000..414df87ed
> > > > > > --- /dev/null
> > > > > > +++ b/tests/xfs/1500.out
> > > > > > @@ -0,0 +1,9 @@
> > > > > > +QA output created by 1500
> > > > > > +file.0 extent count is in range
> > > > > > +file.1 extent count is in range
> > > > > > +file.2 extent count is in range
> > > > > > +file.3 extent count is in range
> > > > > > +file.4 extent count is in range
> > > > > > +file.5 extent count is in range
> > > > > > +file.6 extent count is in range
> > > > > > +file.7 extent count is in range
> > > > > > diff --git a/tests/xfs/1501 b/tests/xfs/1501
> > > > > > new file mode 100755
> > > > > > index 000000000..cf3cbf8b5
> > > > > > --- /dev/null
> > > > > > +++ b/tests/xfs/1501
> > > > > > @@ -0,0 +1,68 @@
> > > > > > +#! /bin/bash
> > > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > > > +#
> > > > > > +# FS QA Test xfs/501
> > > > > > +#
> > > > > > +# Post-EOF preallocation defeat test for buffered I/O with extent size hints.
> > > > > > +#
> > > > > > +
> > > > > > +. ./common/preamble
> > > > > > +_begin_fstest auto quick prealloc rw
> > > > > > +
> > > > > > +. ./common/rc
> > > > > > +. ./common/filter
> > > > > > +
> > > > > > +_require_scratch
> > > > > > +
> > > > > > +_cleanup()
> > > > > > +{
> > > > > > +	# try to kill all background processes
> > > > > > +	wait
> > > > > > +	cd /
> > > > > > +	rm -r -f $tmp.*
> > > > > > +}
> > > > > > +
> > > > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > > > +_scratch_mount
> > > > > > +
> > > > > > +# Write multiple files in parallel using buffered writes with extent size hints.
> > > > > > +# Aim is to interleave allocations to fragment the files. Writes w/ extent size
> > > > > > +# hints set defeat the open/write/close heuristics in xfs_file_release() that
> > > > > > +# prevent EOF block removal, so this should fragment badly. Typical problematic
> > > > > > +# behaviour shows per-file extent counts of 1000 (worst case!) whilst
> > > > > > +# fixed behaviour should show very few extents (almost best case).
> > > > > > +#
> > > > > > +# Failure is determined by golden output mismatch from _within_tolerance().
> > > > > > +
> > > > > > +workfile=$SCRATCH_MNT/file
> > > > > > +nfiles=8
> > > > > > +wsize=4096
> > > > > > +wcnt=1000
> > > > > > +extent_size=16m
> > > > > > +
> > > > > > +write_extsz_file()
> > > > > > +{
> > > > > > +	idx=$1
> > > > > > +
> > > > > > +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> > > > > > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > > > > > +		$XFS_IO_PROG -f -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > > > > > +	done
> > > > > > +}
> > > > > > +
> > > > > > +rm -f $workfile*
> > > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > > +	write_extsz_file $n > /dev/null 2>&1 &
> > > > > > +done
> > > > > > +wait
> > > > > > +sync
> > > > > > +
> > > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > > +	count=$(_count_extents $workfile.$n)
> > > > > > +	# Acceptible extent count range is 1-10
> > > > > > +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> > > > > > +done
> > > > > > +
> > > > > > +status=0
> > > > > > +exit
> > > > > > diff --git a/tests/xfs/1501.out b/tests/xfs/1501.out
> > > > > > new file mode 100644
> > > > > > index 000000000..a266ef74b
> > > > > > --- /dev/null
> > > > > > +++ b/tests/xfs/1501.out
> > > > > > @@ -0,0 +1,9 @@
> > > > > > +QA output created by 1501
> > > > > > +file.0 extent count is in range
> > > > > > +file.1 extent count is in range
> > > > > > +file.2 extent count is in range
> > > > > > +file.3 extent count is in range
> > > > > > +file.4 extent count is in range
> > > > > > +file.5 extent count is in range
> > > > > > +file.6 extent count is in range
> > > > > > +file.7 extent count is in range
> > > > > > diff --git a/tests/xfs/1502 b/tests/xfs/1502
> > > > > > new file mode 100755
> > > > > > index 000000000..f4228667a
> > > > > > --- /dev/null
> > > > > > +++ b/tests/xfs/1502
> > > > > > @@ -0,0 +1,68 @@
> > > > > > +#! /bin/bash
> > > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > > > +#
> > > > > > +# FS QA Test xfs/502
> > > > > > +#
> > > > > > +# Post-EOF preallocation defeat test for direct I/O with extent size hints.
> > > > > > +#
> > > > > > +
> > > > > > +. ./common/preamble
> > > > > > +_begin_fstest auto quick prealloc rw
> > > > > > +
> > > > > > +. ./common/rc
> > > > > > +. ./common/filter
> > > > > > +
> > > > > > +_require_scratch
> > > > > > +
> > > > > > +_cleanup()
> > > > > > +{
> > > > > > +	# try to kill all background processes
> > > > > > +	wait
> > > > > > +	cd /
> > > > > > +	rm -r -f $tmp.*
> > > > > > +}
> > > > > > +
> > > > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > > > +_scratch_mount
> > > > > > +
> > > > > > +# Write multiple files in parallel using O_DIRECT writes w/ extent size hints.
> > > > > > +# Aim is to interleave allocations to fragment the files. O_DIRECT writes defeat
> > > > > > +# the open/write/close heuristics in xfs_file_release() that prevent EOF block
> > > > > > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > > > > > +# per-file extent counts of ~1000 (worst case) whilst fixed behaviour typically
> > > > > > +# shows extent counts in the low single digits (almost best case)
> > > > > > +#
> > > > > > +# Failure is determined by golden output mismatch from _within_tolerance().
> > > > > > +
> > > > > > +workfile=$SCRATCH_MNT/file
> > > > > > +nfiles=8
> > > > > > +wsize=4096
> > > > > > +wcnt=1000
> > > > > > +extent_size=16m
> > > > > > +
> > > > > > +write_direct_file()
> > > > > > +{
> > > > > > +	idx=$1
> > > > > > +
> > > > > > +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> > > > > > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > > > > > +		$XFS_IO_PROG -f -d -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > > > > > +	done
> > > > > > +}
> > > > > > +
> > > > > > +rm -f $workfile*
> > > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > > +	write_direct_file $n > /dev/null 2>&1 &
> > > > > > +done
> > > > > > +wait
> > > > > > +sync
> > > > > > +
> > > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > > +	count=$(_count_extents $workfile.$n)
> > > > > > +	# Acceptible extent count range is 1-10
> > > > > > +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> > > > > > +done
> > > > > > +
> > > > > > +status=0
> > > > > > +exit
> > > > > > diff --git a/tests/xfs/1502.out b/tests/xfs/1502.out
> > > > > > new file mode 100644
> > > > > > index 000000000..82c8760a3
> > > > > > --- /dev/null
> > > > > > +++ b/tests/xfs/1502.out
> > > > > > @@ -0,0 +1,9 @@
> > > > > > +QA output created by 1502
> > > > > > +file.0 extent count is in range
> > > > > > +file.1 extent count is in range
> > > > > > +file.2 extent count is in range
> > > > > > +file.3 extent count is in range
> > > > > > +file.4 extent count is in range
> > > > > > +file.5 extent count is in range
> > > > > > +file.6 extent count is in range
> > > > > > +file.7 extent count is in range
> > > > > > diff --git a/tests/xfs/1503 b/tests/xfs/1503
> > > > > > new file mode 100755
> > > > > > index 000000000..9002f87e6
> > > > > > --- /dev/null
> > > > > > +++ b/tests/xfs/1503
> > > > > > @@ -0,0 +1,77 @@
> > > > > > +#! /bin/bash
> > > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > > > > > +#
> > > > > > +# FS QA Test xfs/503
> > > > > > +#
> > > > > > +# Post-EOF preallocation defeat test with O_SYNC buffered I/O that repeatedly
> > > > > > +# closes and reopens the files.
> > > > > > +#
> > > > > > +
> > > > > > +. ./common/preamble
> > > > > > +_begin_fstest auto prealloc rw
> > > > > > +
> > > > > > +. ./common/rc
> > > > > > +. ./common/filter
> > > > > > +
> > > > > > +_require_scratch
> > > > > > +
> > > > > > +_cleanup()
> > > > > > +{
> > > > > > +	# try to kill all background processes
> > > > > > +	wait
> > > > > > +	cd /
> > > > > > +	rm -r -f $tmp.*
> > > > > > +}
> > > > > > +
> > > > > > +_scratch_mkfs > "$seqres.full" 2>&1
> > > > > > +_scratch_mount
> > > > > > +
> > > > > > +# Write multiple files in parallel using synchronous buffered writes that
> > > > > > > +# repeatedly close and reopen the files. Aim is to interleave allocations to
> > > > > > > +# fragment the files. Assuming we've fixed the synchronous write defeat, we can
> > > > > > > +# still trigger the same issue with an open/read/close on O_RDONLY files. We
> > > > > > +# should not be triggering EOF preallocation removal on files we don't have
> > > > > > +# permission to write, so until this is fixed it should fragment badly.  Typical
> > > > > > +# problematic behaviour shows per-file extent counts of 50-350 whilst fixed
> > > > > > +# behaviour typically demonstrates post-eof speculative delalloc growth in
> > > > > > +# extent size (~6 extents for 50MB file).
> > > > > > +#
> > > > > > +# Failure is determined by golden output mismatch from _within_tolerance().
> > > > > > +
> > > > > > +workfile=$SCRATCH_MNT/file
> > > > > > +nfiles=32
> > > > > > +wsize=4096
> > > > > > +wcnt=1000
> > > > > > +
> > > > > > +write_file()
> > > > > > +{
> > > > > > +	idx=$1
> > > > > > +
> > > > > > +	$XFS_IO_PROG -f -s -c "pwrite -b 64k 0 50m" $workfile.$idx
> > > > > > +}
> > > > > > +
> > > > > > +read_file()
> > > > > > +{
> > > > > > +	idx=$1
> > > > > > +
> > > > > > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > > > > > +		$XFS_IO_PROG -f -r -c "pread 0 28" $workfile.$idx
> > > > > > +	done
> > > > > > +}
> > > > > > +
> > > > > > > +rm -f $workfile*
> > > > > > +for ((n=0; n<$((nfiles)); n++)); do
> > > > > > +	write_file $n > /dev/null 2>&1 &
> > > > > > +	read_file $n > /dev/null 2>&1 &
> > > > > > +done
> > > > > > +wait
> > > > > > +
> > > > > > +for ((n=0; n<$nfiles; n++)); do
> > > > > > +	count=$(_count_extents $workfile.$n)
> > > > > > +	# Acceptible extent count range is 1-40
> > > > > > +	_within_tolerance "file.$n extent count" $count 6 5 10 -v
> > > > > > +done
> > > > > > +
> > > > > > +status=0
> > > > > > +exit
> > > > > > diff --git a/tests/xfs/1503.out b/tests/xfs/1503.out
> > > > > > new file mode 100644
> > > > > > index 000000000..1780b16df
> > > > > > --- /dev/null
> > > > > > +++ b/tests/xfs/1503.out
> > > > > > @@ -0,0 +1,33 @@
> > > > > > +QA output created by 1503
> > > > > > +file.0 extent count is in range
> > > > > > +file.1 extent count is in range
> > > > > > +file.2 extent count is in range
> > > > > > +file.3 extent count is in range
> > > > > > +file.4 extent count is in range
> > > > > > +file.5 extent count is in range
> > > > > > +file.6 extent count is in range
> > > > > > +file.7 extent count is in range
> > > > > > +file.8 extent count is in range
> > > > > > +file.9 extent count is in range
> > > > > > +file.10 extent count is in range
> > > > > > +file.11 extent count is in range
> > > > > > +file.12 extent count is in range
> > > > > > +file.13 extent count is in range
> > > > > > +file.14 extent count is in range
> > > > > > +file.15 extent count is in range
> > > > > > +file.16 extent count is in range
> > > > > > +file.17 extent count is in range
> > > > > > +file.18 extent count is in range
> > > > > > +file.19 extent count is in range
> > > > > > +file.20 extent count is in range
> > > > > > +file.21 extent count is in range
> > > > > > +file.22 extent count is in range
> > > > > > +file.23 extent count is in range
> > > > > > +file.24 extent count is in range
> > > > > > +file.25 extent count is in range
> > > > > > +file.26 extent count is in range
> > > > > > +file.27 extent count is in range
> > > > > > +file.28 extent count is in range
> > > > > > +file.29 extent count is in range
> > > > > > +file.30 extent count is in range
> > > > > > +file.31 extent count is in range
> > > > > > -- 
> > > > > > 2.45.2
> > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > > 
> > 
> 
>
Zorro Lang Oct. 13, 2024, 5:49 p.m. UTC | #10
On Tue, Oct 01, 2024 at 07:59:44AM -0700, Darrick J. Wong wrote:
> On Tue, Sep 24, 2024 at 10:45:48AM +0200, Christoph Hellwig wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > These tests create substantial file fragmentation as a result of
> > application actions that defeat post-EOF preallocation
> > optimisations. They are intended to replicate known vectors for
> > these problems, and provide a check that the fragmentation levels
> > have been controlled. The mitigations we make may not completely
> > remove fragmentation (e.g. they may demonstrate speculative delalloc
> > related extent size growth) so the checks don't assume we'll end up
> > > with perfect layouts and hence check for an acceptable level of
> > fragmentation rather than none.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > [move to different test number, update to current xfstest APIs]
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  tests/xfs/1500     | 66 +++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1500.out |  9 ++++++
> >  tests/xfs/1501     | 68 ++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1501.out |  9 ++++++
> >  tests/xfs/1502     | 68 ++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1502.out |  9 ++++++
> >  tests/xfs/1503     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1503.out | 33 ++++++++++++++++++++
> >  8 files changed, 339 insertions(+)
> >  create mode 100755 tests/xfs/1500
> >  create mode 100644 tests/xfs/1500.out
> >  create mode 100755 tests/xfs/1501
> >  create mode 100644 tests/xfs/1501.out
> >  create mode 100755 tests/xfs/1502
> >  create mode 100644 tests/xfs/1502.out
> >  create mode 100755 tests/xfs/1503
> >  create mode 100644 tests/xfs/1503.out
> > 
> > diff --git a/tests/xfs/1500 b/tests/xfs/1500
> > new file mode 100755
> > index 000000000..de0e1df62
> > --- /dev/null
> > +++ b/tests/xfs/1500
> > @@ -0,0 +1,66 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test xfs/500
> > +#
> > +# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
> > +#
> > +
> > +. ./common/preamble
> > +_begin_fstest auto quick prealloc rw
> > +
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +_require_scratch
> > +
> > +_cleanup()
> > +{
> > +	# try to kill all background processes
> > +	wait
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +
> > +# Write multiple files in parallel using synchronous buffered writes. Aim is to
> > +# interleave allocations to fragment the files. Synchronous writes defeat the
> > +# open/write/close heuristics in xfs_file_release() that prevent EOF block
> > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > +# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
> > +# typically shows extent counts in the low 20s.
> 
> Now that these are in for-next, I've noticed that these new tests
> consistently fail in the above-documented manner on various configs --
> fsdax, always_cow, rtextsize > 1fsb, and sometimes 1k fsblock size.

Hi Christoph,

Thanks for reworking this patch, it's been merged into fstests, named
xfs/629~632. But now these 4 cases always fail on upstream xfs, e.g.
(diff output) [1][2][3][4]. Could you help to take a look at the
failure which Darrick mentioned above too :)

Thanks,
Zorro

[1]
--- /dev/fd/63	2024-10-12 03:26:05.854655824 -0400
+++ xfs/629.out.bad	2024-10-12 03:26:05.196658410 -0400
@@ -1,9 +1,17 @@
 QA output created by 629
-file.0 extent count is in range
-file.1 extent count is in range
-file.2 extent count is in range
-file.3 extent count is in range
-file.4 extent count is in range
-file.5 extent count is in range
-file.6 extent count is in range
-file.7 extent count is in range
+file.0 extent count has value of 262
+file.0 extent count is NOT in range 2 .. 40
+file.1 extent count has value of 278
+file.1 extent count is NOT in range 2 .. 40
+file.2 extent count has value of 292
+file.2 extent count is NOT in range 2 .. 40
+file.3 extent count has value of 255
+file.3 extent count is NOT in range 2 .. 40
+file.4 extent count has value of 299
+file.4 extent count is NOT in range 2 .. 40
+file.5 extent count has value of 276
+file.5 extent count is NOT in range 2 .. 40
+file.6 extent count has value of 281
+file.6 extent count is NOT in range 2 .. 40
+file.7 extent count has value of 290
+file.7 extent count is NOT in range 2 .. 40

[2]
--- /dev/fd/63	2024-10-12 03:27:24.685345937 -0400
+++ xfs/630.out.bad	2024-10-12 03:27:24.002348622 -0400
@@ -1,9 +1,17 @@
 QA output created by 630
-file.0 extent count is in range
-file.1 extent count is in range
-file.2 extent count is in range
-file.3 extent count is in range
-file.4 extent count is in range
-file.5 extent count is in range
-file.6 extent count is in range
-file.7 extent count is in range
+file.0 extent count has value of 996
+file.0 extent count is NOT in range 1 .. 10
+file.1 extent count has value of 991
+file.1 extent count is NOT in range 1 .. 10
+file.2 extent count has value of 989
+file.2 extent count is NOT in range 1 .. 10
+file.3 extent count has value of 998
+file.3 extent count is NOT in range 1 .. 10
+file.4 extent count has value of 993
+file.4 extent count is NOT in range 1 .. 10
+file.5 extent count has value of 990
+file.5 extent count is NOT in range 1 .. 10
+file.6 extent count has value of 997
+file.6 extent count is NOT in range 1 .. 10
+file.7 extent count has value of 995
+file.7 extent count is NOT in range 1 .. 10

[3]
--- /dev/fd/63	2024-10-12 03:28:38.598055384 -0400
+++ xfs/631.out.bad	2024-10-12 03:28:37.973057841 -0400
@@ -1,9 +1,17 @@
 QA output created by 631
-file.0 extent count is in range
-file.1 extent count is in range
-file.2 extent count is in range
-file.3 extent count is in range
-file.4 extent count is in range
-file.5 extent count is in range
-file.6 extent count is in range
-file.7 extent count is in range
+file.0 extent count has value of 994
+file.0 extent count is NOT in range 1 .. 10
+file.1 extent count has value of 992
+file.1 extent count is NOT in range 1 .. 10
+file.2 extent count has value of 980
+file.2 extent count is NOT in range 1 .. 10
+file.3 extent count has value of 996
+file.3 extent count is NOT in range 1 .. 10
+file.4 extent count has value of 994
+file.4 extent count is NOT in range 1 .. 10
+file.5 extent count has value of 985
+file.5 extent count is NOT in range 1 .. 10
+file.6 extent count has value of 987
+file.6 extent count is NOT in range 1 .. 10
+file.7 extent count has value of 990
+file.7 extent count is NOT in range 1 .. 10

[4]
--- /dev/fd/63	2024-10-12 03:31:07.166471365 -0400
+++ xfs/632.out.bad	2024-10-12 03:31:06.487474034 -0400
@@ -1,33 +1,65 @@
 QA output created by 632
-file.0 extent count is in range
-file.1 extent count is in range
-file.2 extent count is in range
-file.3 extent count is in range
-file.4 extent count is in range
-file.5 extent count is in range
-file.6 extent count is in range
-file.7 extent count is in range
-file.8 extent count is in range
-file.9 extent count is in range
-file.10 extent count is in range
-file.11 extent count is in range
-file.12 extent count is in range
-file.13 extent count is in range
-file.14 extent count is in range
-file.15 extent count is in range
-file.16 extent count is in range
-file.17 extent count is in range
-file.18 extent count is in range
-file.19 extent count is in range
-file.20 extent count is in range
-file.21 extent count is in range
-file.22 extent count is in range
-file.23 extent count is in range
-file.24 extent count is in range
-file.25 extent count is in range
-file.26 extent count is in range
-file.27 extent count is in range
-file.28 extent count is in range
-file.29 extent count is in range
-file.30 extent count is in range
-file.31 extent count is in range
+file.0 extent count has value of 530
+file.0 extent count is NOT in range 1 .. 16
+file.1 extent count has value of 516
+file.1 extent count is NOT in range 1 .. 16
+file.2 extent count has value of 524
+file.2 extent count is NOT in range 1 .. 16
+file.3 extent count has value of 526
+file.3 extent count is NOT in range 1 .. 16
+file.4 extent count has value of 531
+file.4 extent count is NOT in range 1 .. 16
+file.5 extent count has value of 529
+file.5 extent count is NOT in range 1 .. 16
+file.6 extent count has value of 533
+file.6 extent count is NOT in range 1 .. 16
+file.7 extent count has value of 519
+file.7 extent count is NOT in range 1 .. 16
+file.8 extent count has value of 385
+file.8 extent count is NOT in range 1 .. 16
+file.9 extent count has value of 465
+file.9 extent count is NOT in range 1 .. 16
+file.10 extent count has value of 525
+file.10 extent count is NOT in range 1 .. 16
+file.11 extent count has value of 527
+file.11 extent count is NOT in range 1 .. 16
+file.12 extent count has value of 345
+file.12 extent count is NOT in range 1 .. 16
+file.13 extent count has value of 523
+file.13 extent count is NOT in range 1 .. 16
+file.14 extent count has value of 504
+file.14 extent count is NOT in range 1 .. 16
+file.15 extent count has value of 518
+file.15 extent count is NOT in range 1 .. 16
+file.16 extent count has value of 501
+file.16 extent count is NOT in range 1 .. 16
+file.17 extent count has value of 518
+file.17 extent count is NOT in range 1 .. 16
+file.18 extent count has value of 524
+file.18 extent count is NOT in range 1 .. 16
+file.19 extent count has value of 530
+file.19 extent count is NOT in range 1 .. 16
+file.20 extent count has value of 509
+file.20 extent count is NOT in range 1 .. 16
+file.21 extent count has value of 519
+file.21 extent count is NOT in range 1 .. 16
+file.22 extent count has value of 522
+file.22 extent count is NOT in range 1 .. 16
+file.23 extent count has value of 522
+file.23 extent count is NOT in range 1 .. 16
+file.24 extent count has value of 501
+file.24 extent count is NOT in range 1 .. 16
+file.25 extent count has value of 218
+file.25 extent count is NOT in range 1 .. 16
+file.26 extent count has value of 529
+file.26 extent count is NOT in range 1 .. 16
+file.27 extent count has value of 527
+file.27 extent count is NOT in range 1 .. 16
+file.28 extent count has value of 525
+file.28 extent count is NOT in range 1 .. 16
+file.29 extent count has value of 545
+file.29 extent count is NOT in range 1 .. 16
+file.30 extent count has value of 527
+file.30 extent count is NOT in range 1 .. 16
+file.31 extent count has value of 519
+file.31 extent count is NOT in range 1 .. 16
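
For reference, the ranges reported above follow from the arguments each
test passes to _within_tolerance(): the accepted range appears to run from
the nominal value minus the low tolerance to the nominal value plus the
high tolerance. A minimal sketch of that arithmetic, assuming absolute
(non-percentage) tolerances; tolerance_range is a made-up helper for
illustration and not an fstests function:

```shell
# Hypothetical helper (not part of fstests): print the accepted range for
# _within_tolerance-style arguments "nominal tol_low [tol_high]".
# Only absolute tolerances are modelled here, not percentages.
tolerance_range()
{
	local nominal=$1
	local tol_low=$2
	local tol_high=${3:-$2}	# a single tolerance means a symmetric range

	echo "$((nominal - tol_low)) .. $((nominal + tol_high))"
}

tolerance_range 21 19		# arguments used by xfs/1500
tolerance_range 2 1 8		# arguments used by xfs/1501 and xfs/1502
tolerance_range 6 5 10		# arguments used by xfs/1503
```

These print "2 .. 40", "1 .. 10" and "1 .. 16", which matches the ranges
shown in the xfs/629, xfs/630, xfs/631 and xfs/632 diffs respectively.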

> 
> I'm not sure why this happens, but it probably needs to be looked at
> along with all the FALLOC_FL_UNSHARE_RANGE brokenness that's also been
> exposed by fstests that /does/ need to be fixed.
> 
> --D
> 
> > +# Failure is determined by golden output mismatch from _within_tolerance().
> > +
> > +workfile=$SCRATCH_MNT/file
> > +nfiles=8
> > +wsize=4096
> > +wcnt=1000
> > +
> > +write_sync_file()
> > +{
> > +	idx=$1
> > +
> > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > +		$XFS_IO_PROG -f -s -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > +	done
> > +}
> > +
> > +rm -f $workfile*
> > +for ((n=0; n<$nfiles; n++)); do
> > +	write_sync_file $n > /dev/null 2>&1 &
> > +done
> > +wait
> > +sync
> > +
> > +for ((n=0; n<$nfiles; n++)); do
> > +	count=$(_count_extents $workfile.$n)
> > +	# Acceptable extent count range is 2-40
> > +	_within_tolerance "file.$n extent count" $count 21 19 -v
> > +done
> > +
> > +status=0
> > +exit
> > diff --git a/tests/xfs/1500.out b/tests/xfs/1500.out
> > new file mode 100644
> > index 000000000..414df87ed
> > --- /dev/null
> > +++ b/tests/xfs/1500.out
> > @@ -0,0 +1,9 @@
> > +QA output created by 1500
> > +file.0 extent count is in range
> > +file.1 extent count is in range
> > +file.2 extent count is in range
> > +file.3 extent count is in range
> > +file.4 extent count is in range
> > +file.5 extent count is in range
> > +file.6 extent count is in range
> > +file.7 extent count is in range
> > diff --git a/tests/xfs/1501 b/tests/xfs/1501
> > new file mode 100755
> > index 000000000..cf3cbf8b5
> > --- /dev/null
> > +++ b/tests/xfs/1501
> > @@ -0,0 +1,68 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test xfs/501
> > +#
> > +# Post-EOF preallocation defeat test for buffered I/O with extent size hints.
> > +#
> > +
> > +. ./common/preamble
> > +_begin_fstest auto quick prealloc rw
> > +
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +_require_scratch
> > +
> > +_cleanup()
> > +{
> > +	# try to kill all background processes
> > +	wait
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +
> > +# Write multiple files in parallel using buffered writes with extent size hints.
> > +# Aim is to interleave allocations to fragment the files. Writes w/ extent size
> > +# hints set defeat the open/write/close heuristics in xfs_file_release() that
> > +# prevent EOF block removal, so this should fragment badly. Typical problematic
> > +# behaviour shows per-file extent counts of 1000 (worst case!) whilst
> > +# fixed behaviour should show very few extents (almost best case).
> > +#
> > +# Failure is determined by golden output mismatch from _within_tolerance().
> > +
> > +workfile=$SCRATCH_MNT/file
> > +nfiles=8
> > +wsize=4096
> > +wcnt=1000
> > +extent_size=16m
> > +
> > +write_extsz_file()
> > +{
> > +	idx=$1
> > +
> > +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > +		$XFS_IO_PROG -f -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > +	done
> > +}
> > +
> > +rm -f $workfile*
> > +for ((n=0; n<$nfiles; n++)); do
> > +	write_extsz_file $n > /dev/null 2>&1 &
> > +done
> > +wait
> > +sync
> > +
> > +for ((n=0; n<$nfiles; n++)); do
> > +	count=$(_count_extents $workfile.$n)
> > +	# Acceptable extent count range is 1-10
> > +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> > +done
> > +
> > +status=0
> > +exit
> > diff --git a/tests/xfs/1501.out b/tests/xfs/1501.out
> > new file mode 100644
> > index 000000000..a266ef74b
> > --- /dev/null
> > +++ b/tests/xfs/1501.out
> > @@ -0,0 +1,9 @@
> > +QA output created by 1501
> > +file.0 extent count is in range
> > +file.1 extent count is in range
> > +file.2 extent count is in range
> > +file.3 extent count is in range
> > +file.4 extent count is in range
> > +file.5 extent count is in range
> > +file.6 extent count is in range
> > +file.7 extent count is in range
> > diff --git a/tests/xfs/1502 b/tests/xfs/1502
> > new file mode 100755
> > index 000000000..f4228667a
> > --- /dev/null
> > +++ b/tests/xfs/1502
> > @@ -0,0 +1,68 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test xfs/502
> > +#
> > +# Post-EOF preallocation defeat test for direct I/O with extent size hints.
> > +#
> > +
> > +. ./common/preamble
> > +_begin_fstest auto quick prealloc rw
> > +
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +_require_scratch
> > +
> > +_cleanup()
> > +{
> > +	# try to kill all background processes
> > +	wait
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +
> > +# Write multiple files in parallel using O_DIRECT writes w/ extent size hints.
> > +# Aim is to interleave allocations to fragment the files. O_DIRECT writes defeat
> > +# the open/write/close heuristics in xfs_file_release() that prevent EOF block
> > +# removal, so this should fragment badly. Typical problematic behaviour shows
> > +# per-file extent counts of ~1000 (worst case) whilst fixed behaviour typically
> > +# shows extent counts in the low single digits (almost best case)
> > +#
> > +# Failure is determined by golden output mismatch from _within_tolerance().
> > +
> > +workfile=$SCRATCH_MNT/file
> > +nfiles=8
> > +wsize=4096
> > +wcnt=1000
> > +extent_size=16m
> > +
> > +write_direct_file()
> > +{
> > +	idx=$1
> > +
> > +	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
> > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > +		$XFS_IO_PROG -f -d -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
> > +	done
> > +}
> > +
> > +rm -f $workfile*
> > +for ((n=0; n<$nfiles; n++)); do
> > +	write_direct_file $n > /dev/null 2>&1 &
> > +done
> > +wait
> > +sync
> > +
> > +for ((n=0; n<$nfiles; n++)); do
> > +	count=$(_count_extents $workfile.$n)
> > +	# Acceptable extent count range is 1-10
> > +	_within_tolerance "file.$n extent count" $count 2 1 8 -v
> > +done
> > +
> > +status=0
> > +exit
> > diff --git a/tests/xfs/1502.out b/tests/xfs/1502.out
> > new file mode 100644
> > index 000000000..82c8760a3
> > --- /dev/null
> > +++ b/tests/xfs/1502.out
> > @@ -0,0 +1,9 @@
> > +QA output created by 1502
> > +file.0 extent count is in range
> > +file.1 extent count is in range
> > +file.2 extent count is in range
> > +file.3 extent count is in range
> > +file.4 extent count is in range
> > +file.5 extent count is in range
> > +file.6 extent count is in range
> > +file.7 extent count is in range
> > diff --git a/tests/xfs/1503 b/tests/xfs/1503
> > new file mode 100755
> > index 000000000..9002f87e6
> > --- /dev/null
> > +++ b/tests/xfs/1503
> > @@ -0,0 +1,77 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test xfs/503
> > +#
> > +# Post-EOF preallocation defeat test with O_SYNC buffered I/O that repeatedly
> > +# closes and reopens the files.
> > +#
> > +
> > +. ./common/preamble
> > +_begin_fstest auto prealloc rw
> > +
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +_require_scratch
> > +
> > +_cleanup()
> > +{
> > +	# try to kill all background processes
> > +	wait
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +_scratch_mkfs > "$seqres.full" 2>&1
> > +_scratch_mount
> > +
> > +# Write multiple files in parallel using synchronous buffered writes that
> > +# repeatedly close and reopen the files. Aim is to interleave allocations to
> > +# fragment the files. Assuming we've fixed the synchronous write defeat, we can
> > +# still trigger the same issue with an open/read/close on O_RDONLY files. We
> > +# should not be triggering EOF preallocation removal on files we don't have
> > +# permission to write, so until this is fixed it should fragment badly.  Typical
> > +# problematic behaviour shows per-file extent counts of 50-350 whilst fixed
> > +# behaviour typically demonstrates post-eof speculative delalloc growth in
> > +# extent size (~6 extents for 50MB file).
> > +#
> > +# Failure is determined by golden output mismatch from _within_tolerance().
> > +
> > +workfile=$SCRATCH_MNT/file
> > +nfiles=32
> > +wsize=4096
> > +wcnt=1000
> > +
> > +write_file()
> > +{
> > +	idx=$1
> > +
> > +	$XFS_IO_PROG -f -s -c "pwrite -b 64k 0 50m" $workfile.$idx
> > +}
> > +
> > +read_file()
> > +{
> > +	idx=$1
> > +
> > +	for ((cnt=0; cnt<$wcnt; cnt++)); do
> > +		$XFS_IO_PROG -f -r -c "pread 0 28" $workfile.$idx
> > +	done
> > +}
> > +
> > +rm -f $workfile*
> > +for ((n=0; n<$((nfiles)); n++)); do
> > +	write_file $n > /dev/null 2>&1 &
> > +	read_file $n > /dev/null 2>&1 &
> > +done
> > +wait
> > +
> > +for ((n=0; n<$nfiles; n++)); do
> > +	count=$(_count_extents $workfile.$n)
> > +	# Acceptable extent count range is 1-16
> > +	_within_tolerance "file.$n extent count" $count 6 5 10 -v
> > +done
> > +
> > +status=0
> > +exit
> > diff --git a/tests/xfs/1503.out b/tests/xfs/1503.out
> > new file mode 100644
> > index 000000000..1780b16df
> > --- /dev/null
> > +++ b/tests/xfs/1503.out
> > @@ -0,0 +1,33 @@
> > +QA output created by 1503
> > +file.0 extent count is in range
> > +file.1 extent count is in range
> > +file.2 extent count is in range
> > +file.3 extent count is in range
> > +file.4 extent count is in range
> > +file.5 extent count is in range
> > +file.6 extent count is in range
> > +file.7 extent count is in range
> > +file.8 extent count is in range
> > +file.9 extent count is in range
> > +file.10 extent count is in range
> > +file.11 extent count is in range
> > +file.12 extent count is in range
> > +file.13 extent count is in range
> > +file.14 extent count is in range
> > +file.15 extent count is in range
> > +file.16 extent count is in range
> > +file.17 extent count is in range
> > +file.18 extent count is in range
> > +file.19 extent count is in range
> > +file.20 extent count is in range
> > +file.21 extent count is in range
> > +file.22 extent count is in range
> > +file.23 extent count is in range
> > +file.24 extent count is in range
> > +file.25 extent count is in range
> > +file.26 extent count is in range
> > +file.27 extent count is in range
> > +file.28 extent count is in range
> > +file.29 extent count is in range
> > +file.30 extent count is in range
> > +file.31 extent count is in range
> > -- 
> > 2.45.2
> > 
> > 
>
Christoph Hellwig Oct. 14, 2024, 6:07 a.m. UTC | #11
On Mon, Oct 14, 2024 at 01:49:36AM +0800, Zorro Lang wrote:
> Thanks for reworking this patch, it's been merged into fstests, named
> xfs/629~632. But now these 4 cases always fail on upstream xfs, e.g.
> (diff output) [1][2][3][4]. Could you help to take a look at the
> failure which Darrick mentioned above too :)

What do you mean with upstream xfs?  Any kernel before the eofblocks
fixes will obviously fail.  Always_cow will also always fail and I'll
send a patch for that.  Any other configuration you've seen?
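
To rule out the always_cow case on a given test machine, something like
the following could be used. This is only a sketch: always_cow_enabled is
a made-up helper, and the default path is an assumption about where debug
(CONFIG_XFS_DEBUG) kernels expose the knob.

```shell
# Hypothetical helper, not an fstests function. The default path is an
# assumption about where XFS debug kernels expose the always_cow knob.
always_cow_enabled()
{
	local knob=${1:-/sys/fs/xfs/debug/always_cow}

	# Knob missing or unreadable (e.g. non-debug kernel): treat as disabled.
	[ -r "$knob" ] || return 1
	[ "$(cat "$knob")" = "1" ]
}

if always_cow_enabled; then
	echo "always_cow is set; expect the EOF fragmentation tests to fail"
fi
```

The optional argument only exists so the check can be pointed at a
different file when trying it out.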
Zorro Lang Oct. 14, 2024, 2:14 p.m. UTC | #12
On Mon, Oct 14, 2024 at 08:07:25AM +0200, Christoph Hellwig wrote:
> On Mon, Oct 14, 2024 at 01:49:36AM +0800, Zorro Lang wrote:
> > Thanks for reworking this patch, it's been merged into fstests, named
> > xfs/629~632. But now these 4 cases always fail on upstream xfs, e.g.
> > (diff output) [1][2][3][4]. Could you help to take a look at the
> > failure which Darrick mentioned above too :)
> 
> What do you mean with upstream xfs?  Any kernel before the eofblocks
> fixes will obviously fail.  Always_cow will also always fail and I'll
> send a patch for that.  Any other configuration you've seen?

Sorry, I've lost the old test results. In my current test results, xfs/629
and xfs/632 fail [1][2] on a pmem device (xfs) with Linux v6.12-rc2+
(HEAD=cfea70e835b9180029257d8b772c9e99c3305a9a). The .full output is in [3],
and the .out.bad output in [4][5]. Besides that, xfs/629 failed once on a loop
device on s390x (see [6]) with the same v6.12-rc2+ kernel.
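
When comparing runs across configurations, the per-file counts in an
.out.bad file can be condensed into one line. A minimal sketch, assuming
the "extent count has value of N" lines that _within_tolerance -v prints
in the outputs quoted in this thread; summarise_extent_counts is a made-up
helper, not an fstests function:

```shell
# Hypothetical helper: summarise the per-file extent counts found in an
# .out.bad file (or on stdin when no file is given).
summarise_extent_counts()
{
	awk '/extent count has value of/ {
		v = $NF + 0			# last field is the extent count
		if (n == 0 || v < min) min = v
		if (n == 0 || v > max) max = v
		sum += v
		n++
	}
	END {
		if (n)
			printf "files=%d min=%d max=%d avg=%d\n", n, min, max, sum / n
	}' "$@"
}
```

For example, "summarise_extent_counts /var/lib/xfstests/results/xfs/629.out.bad"
would print the number of files along with the minimum, maximum and
(truncated) average extent count.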

Thanks,
Zorro

[1]
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 hpe-dl360gen11-01 6.12.0-rc2+ #1 SMP PREEMPT_DYNAMIC Sun Oct 13 15:36:13 EDT 2024
MKFS_OPTIONS  -- -f -f -b size=4096 -m crc=1,finobt=1,rmapbt=0,reflink=0,inobtcount=1,bigtime=1 /dev/pmem0
MOUNT_OPTIONS -- -o dax=always -o context=system_u:object_r:root_t:s0 /dev/pmem0 /mnt/fstests/SCRATCH_DIR

xfs/629       - output mismatch (see /var/lib/xfstests/results//xfs/629.out.bad)
    --- tests/xfs/629.out	2024-10-13 16:02:57.113908110 -0400
    +++ /var/lib/xfstests/results//xfs/629.out.bad	2024-10-14 01:52:24.553081084 -0400
    @@ -1,9 +1,17 @@
     QA output created by 629
    -file.0 extent count is in range
    -file.1 extent count is in range
    -file.2 extent count is in range
    -file.3 extent count is in range
    -file.4 extent count is in range
    -file.5 extent count is in range
    ...
    (Run 'diff -u /var/lib/xfstests/tests/xfs/629.out /var/lib/xfstests/results//xfs/629.out.bad'  to see the entire diff)
Ran: xfs/629
Failures: xfs/629
Failed 1 of 1 tests

[2]
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 hpe-dl360gen11-01 6.12.0-rc2+ #1 SMP PREEMPT_DYNAMIC Sun Oct 13 15:36:13 EDT 2024
MKFS_OPTIONS  -- -f -f -b size=4096 -m crc=1,finobt=1,rmapbt=0,reflink=0,inobtcount=1,bigtime=1 /dev/pmem0
MOUNT_OPTIONS -- -o dax=always -o context=system_u:object_r:root_t:s0 /dev/pmem0 /mnt/fstests/SCRATCH_DIR

xfs/632       - output mismatch (see /var/lib/xfstests/results//xfs/632.out.bad)
    --- tests/xfs/632.out	2024-10-13 16:02:57.166908766 -0400
    +++ /var/lib/xfstests/results//xfs/632.out.bad	2024-10-14 01:55:32.671284956 -0400
    @@ -1,33 +1,65 @@
     QA output created by 632
    -file.0 extent count is in range
    -file.1 extent count is in range
    -file.2 extent count is in range
    -file.3 extent count is in range
    -file.4 extent count is in range
    -file.5 extent count is in range
    ...
    (Run 'diff -u /var/lib/xfstests/tests/xfs/632.out /var/lib/xfstests/results//xfs/632.out.bad'  to see the entire diff)
Ran: xfs/632
Failures: xfs/632
Failed 1 of 1 tests

[3]
meta-data=/dev/pmem0             isize=512    agcount=4, agsize=655360 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0  
data     =                       bsize=4096   blocks=2621440, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=4096   blocks=300954, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[4]
QA output created by 629
file.0 extent count has value of 520
file.0 extent count is NOT in range 2 .. 40
file.1 extent count has value of 627
file.1 extent count is NOT in range 2 .. 40
file.2 extent count has value of 776
file.2 extent count is NOT in range 2 .. 40
file.3 extent count has value of 725
file.3 extent count is NOT in range 2 .. 40
file.4 extent count has value of 672
file.4 extent count is NOT in range 2 .. 40
file.5 extent count has value of 709
file.5 extent count is NOT in range 2 .. 40
file.6 extent count has value of 660
file.6 extent count is NOT in range 2 .. 40
file.7 extent count has value of 703
file.7 extent count is NOT in range 2 .. 40

[5]
QA output created by 632
file.0 extent count has value of 762
file.0 extent count is NOT in range 1 .. 16
file.1 extent count has value of 800
file.1 extent count is NOT in range 1 .. 16
file.2 extent count has value of 800
file.2 extent count is NOT in range 1 .. 16
file.3 extent count has value of 794
file.3 extent count is NOT in range 1 .. 16
file.4 extent count has value of 800
file.4 extent count is NOT in range 1 .. 16
file.5 extent count has value of 787
file.5 extent count is NOT in range 1 .. 16
file.6 extent count has value of 788
file.6 extent count is NOT in range 1 .. 16
file.7 extent count has value of 800
file.7 extent count is NOT in range 1 .. 16
file.8 extent count has value of 799
file.8 extent count is NOT in range 1 .. 16
file.9 extent count has value of 800
file.9 extent count is NOT in range 1 .. 16
file.10 extent count has value of 793
file.10 extent count is NOT in range 1 .. 16
file.11 extent count has value of 800
file.11 extent count is NOT in range 1 .. 16
file.12 extent count has value of 800
file.12 extent count is NOT in range 1 .. 16
file.13 extent count has value of 800
file.13 extent count is NOT in range 1 .. 16
file.14 extent count has value of 800
file.14 extent count is NOT in range 1 .. 16
file.15 extent count has value of 800
file.15 extent count is NOT in range 1 .. 16
file.16 extent count has value of 800
file.16 extent count is NOT in range 1 .. 16
file.17 extent count has value of 800
file.17 extent count is NOT in range 1 .. 16
file.18 extent count has value of 800
file.18 extent count is NOT in range 1 .. 16
file.19 extent count has value of 797
file.19 extent count is NOT in range 1 .. 16
file.20 extent count has value of 782
file.20 extent count is NOT in range 1 .. 16
file.21 extent count has value of 800
file.21 extent count is NOT in range 1 .. 16
file.22 extent count has value of 800
file.22 extent count is NOT in range 1 .. 16
file.23 extent count has value of 800
file.23 extent count is NOT in range 1 .. 16
file.24 extent count has value of 792
file.24 extent count is NOT in range 1 .. 16
file.25 extent count has value of 777
file.25 extent count is NOT in range 1 .. 16
file.26 extent count has value of 799
file.26 extent count is NOT in range 1 .. 16
file.27 extent count has value of 800
file.27 extent count is NOT in range 1 .. 16
file.28 extent count has value of 793
file.28 extent count is NOT in range 1 .. 16
file.29 extent count has value of 800
file.29 extent count is NOT in range 1 .. 16
file.30 extent count has value of 797
file.30 extent count is NOT in range 1 .. 16


[6]
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/s390x s390x-kvm-003 6.12.0-rc2+ #1 SMP Sun Oct 13 15:47:15 EDT 2024
MKFS_OPTIONS  -- -f -m crc=1,finobt=1,reflink=1,rmapbt=1,bigtime=1,inobtcount=1 -b size=1024 /dev/loop1
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR

xfs/629       - output mismatch (see /var/lib/xfstests/results//xfs/629.out.bad)
    --- tests/xfs/629.out	2024-10-13 15:56:03.564908748 -0400
    +++ /var/lib/xfstests/results//xfs/629.out.bad	2024-10-13 18:45:54.344510502 -0400
    @@ -1,9 +1,16 @@
     QA output created by 629
    -file.0 extent count is in range
    -file.1 extent count is in range
    -file.2 extent count is in range
    -file.3 extent count is in range
    -file.4 extent count is in range
    -file.5 extent count is in range
    ...
    (Run 'diff -u /var/lib/xfstests/tests/xfs/629.out /var/lib/xfstests/results//xfs/629.out.bad'  to see the entire diff)
Ran: xfs/629
Failures: xfs/629
Failed 1 of 1 tests
>
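
The ranges printed in [4] and [5] follow from the nominal value and
tolerances that each test passes to _within_tolerance(): a nominal of 21 with
a tolerance of 19 gives 2 .. 40, and a nominal of 6 with a tolerance of 5
below and 10 above gives 1 .. 16. A minimal sketch of that arithmetic follows.
It assumes absolute integer tolerances only, and check_range is a
hypothetical stand-in written for illustration, not the fstests helper.

```shell
#!/bin/bash
# Hypothetical helper, NOT the fstests _within_tolerance() implementation.
# Assumption: tolerances are absolute integers; if no upper tolerance is
# given, the lower tolerance is used on both sides.
check_range()
{
	local name=$1 value=$2 nominal=$3 low_tol=$4 high_tol=${5:-$4}
	local min=$((nominal - low_tol))
	local max=$((nominal + high_tol))

	if ((value >= min && value <= max)); then
		echo "$name is in range"
	else
		echo "$name has value of $value"
		echo "$name is NOT in range $min .. $max"
	fi
}

# Values taken from the .out.bad output in [4] and [5].
check_range "file.0 extent count" 520 21 19
check_range "file.0 extent count" 762 6 5 10
```

With the extent counts reported above, both calls print the "NOT in range"
lines seen in the .out.bad files.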
Darrick J. Wong Oct. 14, 2024, 3:24 p.m. UTC | #13
On Mon, Oct 14, 2024 at 08:07:25AM +0200, Christoph Hellwig wrote:
> On Mon, Oct 14, 2024 at 01:49:36AM +0800, Zorro Lang wrote:
> > Thanks for reworking this patch, it's been merged into fstests, named
> > xfs/629~632. But now these 4 cases always fail on upstream xfs, e.g
> > (diff output) [1][2][3][4]. Could you help to take a look at the
> > failure which Darrick mentioned above too :)
> 
> What do you mean with upstream xfs?  Any kernel before the eofblocks
> fixes will obviously fail.  Always_cow will also always fail and I'll
> send a patch for that.  Any other configuration you've seen?

fsdax, any config with an extent size hint set, and any time
sb_rextsize > 1 fsblock.

--D
Zorro Lang Oct. 14, 2024, 5:46 p.m. UTC | #14
On Mon, Oct 14, 2024 at 08:24:28AM -0700, Darrick J. Wong wrote:
> On Mon, Oct 14, 2024 at 08:07:25AM +0200, Christoph Hellwig wrote:
> > On Mon, Oct 14, 2024 at 01:49:36AM +0800, Zorro Lang wrote:
> > > Thanks for reworking this patch, it's been merged into fstests, named
> > > xfs/629~632. But now these 4 cases always fail on upstream xfs, e.g
> > > (diff output) [1][2][3][4]. Could you help to take a look at the
> > > failure which Darrick mentioned above too :)
> > 
> > What do you mean with upstream xfs?  Any kernel before the eofblocks
> > fixes will obviously fail.  Always_cow will also always fail and I'll
> > send a patch for that.  Any other configuration you've seen?
> 
> fsdax, any config with an extent size hint set, and any time
> sb_rextsize > 1 fsblock.

Even with dax=never, it's still reproducible. There's no special extent size
hint configuration. The mkfs.xfs options are:
"-b size=4096 -m crc=1,finobt=1,rmapbt=0,reflink=0,inobtcount=1,bigtime=1"

The mkfs output is:
meta-data=/dev/pmem0             isize=512    agcount=4, agsize=655360 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0  
data     =                       bsize=4096   blocks=2621440, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=4096   blocks=300954, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

From the output we can see the sector size is 4096.
The mount option is "-o dax=always/inode/never", no more, without external
log device, no rtdev.

Thanks,
Zorro

> 
> --D
>
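
The sb_rextsize condition can be checked against the mkfs output quoted in
this message. A minimal sketch, assuming that extsz= on the "realtime" line
is the realtime extent size in bytes and bsize= on the "data" line is the
filesystem block size; rextsize_in_fsblocks is a hypothetical name used only
for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read mkfs.xfs/xfs_info style output on stdin and
# print the realtime extent size expressed in filesystem blocks.
rextsize_in_fsblocks()
{
	local info bsize extsz

	info=$(cat)
	# Only the "data" line carries the filesystem block size we want;
	# the "naming" and "log" lines have their own bsize= fields.
	bsize=$(printf '%s\n' "$info" |
		sed -n 's/^data .*bsize=\([0-9][0-9]*\).*/\1/p')
	extsz=$(printf '%s\n' "$info" |
		sed -n 's/^realtime .*extsz=\([0-9][0-9]*\).*/\1/p')
	echo $((extsz / bsize))
}

# The two relevant lines from the mkfs output in this message.
rextsize_in_fsblocks <<'EOF'
data     =                       bsize=4096   blocks=2621440, imaxpct=25
realtime =none                   extsz=4096   blocks=0, rtextents=0
EOF
```

For the filesystem above this prints 1, i.e. the realtime extent size is a
single filesystem block, so the sb_rextsize > 1 case does not apply here.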
Christoph Hellwig Oct. 15, 2024, 3:39 a.m. UTC | #15
On Tue, Oct 15, 2024 at 01:46:50AM +0800, Zorro Lang wrote:
> From the output we can see the sector size is 4096.
> The mount option is "-o dax=always/inode/never", no more, without external
> log device, no rtdev.

Hmm.  Given how unreliable these tests are maybe we should just drop
them given that they are trying to verify optimal expected behavior and
not just correctness.  Maybe Dave who originally wrote them can chime
in.

Patch

diff --git a/tests/xfs/1500 b/tests/xfs/1500
new file mode 100755
index 000000000..de0e1df62
--- /dev/null
+++ b/tests/xfs/1500
@@ -0,0 +1,66 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
+#
+# FS QA Test xfs/1500
+#
+# Post-EOF preallocation defeat test for O_SYNC buffered I/O.
+#
+
+. ./common/preamble
+_begin_fstest auto quick prealloc rw
+
+. ./common/rc
+. ./common/filter
+
+_require_scratch
+
+_cleanup()
+{
+	# try to kill all background processes
+	wait
+	cd /
+	rm -r -f $tmp.*
+}
+
+_scratch_mkfs > "$seqres.full" 2>&1
+_scratch_mount
+
+# Write multiple files in parallel using synchronous buffered writes. Aim is to
+# interleave allocations to fragment the files. Synchronous writes defeat the
+# open/write/close heuristics in xfs_file_release() that prevent EOF block
+# removal, so this should fragment badly. Typical problematic behaviour shows
+# per-file extent counts of >900 (almost worst case) whilst fixed behaviour
+# typically shows extent counts in the low 20s.
+#
+# Failure is determined by golden output mismatch from _within_tolerance().
+
+workfile=$SCRATCH_MNT/file
+nfiles=8
+wsize=4096
+wcnt=1000
+
+write_sync_file()
+{
+	idx=$1
+
+	for ((cnt=0; cnt<$wcnt; cnt++)); do
+		$XFS_IO_PROG -f -s -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
+	done
+}
+
+rm -f $workfile*
+for ((n=0; n<$nfiles; n++)); do
+	write_sync_file $n > /dev/null 2>&1 &
+done
+wait
+sync
+
+for ((n=0; n<$nfiles; n++)); do
+	count=$(_count_extents $workfile.$n)
+	# Acceptable extent count range is 2-40
+	_within_tolerance "file.$n extent count" $count 21 19 -v
+done
+
+status=0
+exit
diff --git a/tests/xfs/1500.out b/tests/xfs/1500.out
new file mode 100644
index 000000000..414df87ed
--- /dev/null
+++ b/tests/xfs/1500.out
@@ -0,0 +1,9 @@ 
+QA output created by 1500
+file.0 extent count is in range
+file.1 extent count is in range
+file.2 extent count is in range
+file.3 extent count is in range
+file.4 extent count is in range
+file.5 extent count is in range
+file.6 extent count is in range
+file.7 extent count is in range
diff --git a/tests/xfs/1501 b/tests/xfs/1501
new file mode 100755
index 000000000..cf3cbf8b5
--- /dev/null
+++ b/tests/xfs/1501
@@ -0,0 +1,68 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
+#
+# FS QA Test xfs/1501
+#
+# Post-EOF preallocation defeat test for buffered I/O with extent size hints.
+#
+
+. ./common/preamble
+_begin_fstest auto quick prealloc rw
+
+. ./common/rc
+. ./common/filter
+
+_require_scratch
+
+_cleanup()
+{
+	# try to kill all background processes
+	wait
+	cd /
+	rm -r -f $tmp.*
+}
+
+_scratch_mkfs > "$seqres.full" 2>&1
+_scratch_mount
+
+# Write multiple files in parallel using buffered writes with extent size hints.
+# Aim is to interleave allocations to fragment the files. Writes w/ extent size
+# hints set defeat the open/write/close heuristics in xfs_file_release() that
+# prevent EOF block removal, so this should fragment badly. Typical problematic
+# behaviour shows per-file extent counts of 1000 (worst case!) whilst
+# fixed behaviour should show very few extents (almost best case).
+#
+# Failure is determined by golden output mismatch from _within_tolerance().
+
+workfile=$SCRATCH_MNT/file
+nfiles=8
+wsize=4096
+wcnt=1000
+extent_size=16m
+
+write_extsz_file()
+{
+	idx=$1
+
+	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
+	for ((cnt=0; cnt<$wcnt; cnt++)); do
+		$XFS_IO_PROG -f -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
+	done
+}
+
+rm -f $workfile*
+for ((n=0; n<$nfiles; n++)); do
+	write_extsz_file $n > /dev/null 2>&1 &
+done
+wait
+sync
+
+for ((n=0; n<$nfiles; n++)); do
+	count=$(_count_extents $workfile.$n)
+	# Acceptable extent count range is 1-10
+	_within_tolerance "file.$n extent count" $count 2 1 8 -v
+done
+
+status=0
+exit
diff --git a/tests/xfs/1501.out b/tests/xfs/1501.out
new file mode 100644
index 000000000..a266ef74b
--- /dev/null
+++ b/tests/xfs/1501.out
@@ -0,0 +1,9 @@ 
+QA output created by 1501
+file.0 extent count is in range
+file.1 extent count is in range
+file.2 extent count is in range
+file.3 extent count is in range
+file.4 extent count is in range
+file.5 extent count is in range
+file.6 extent count is in range
+file.7 extent count is in range
diff --git a/tests/xfs/1502 b/tests/xfs/1502
new file mode 100755
index 000000000..f4228667a
--- /dev/null
+++ b/tests/xfs/1502
@@ -0,0 +1,68 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
+#
+# FS QA Test xfs/1502
+#
+# Post-EOF preallocation defeat test for direct I/O with extent size hints.
+#
+
+. ./common/preamble
+_begin_fstest auto quick prealloc rw
+
+. ./common/rc
+. ./common/filter
+
+_require_scratch
+
+_cleanup()
+{
+	# try to kill all background processes
+	wait
+	cd /
+	rm -r -f $tmp.*
+}
+
+_scratch_mkfs > "$seqres.full" 2>&1
+_scratch_mount
+
+# Write multiple files in parallel using O_DIRECT writes w/ extent size hints.
+# Aim is to interleave allocations to fragment the files. O_DIRECT writes defeat
+# the open/write/close heuristics in xfs_file_release() that prevent EOF block
+# removal, so this should fragment badly. Typical problematic behaviour shows
+# per-file extent counts of ~1000 (worst case) whilst fixed behaviour typically
+# shows extent counts in the low single digits (almost best case)
+#
+# Failure is determined by golden output mismatch from _within_tolerance().
+
+workfile=$SCRATCH_MNT/file
+nfiles=8
+wsize=4096
+wcnt=1000
+extent_size=16m
+
+write_direct_file()
+{
+	idx=$1
+
+	$XFS_IO_PROG -f -c "extsize $extent_size" $workfile.$idx
+	for ((cnt=0; cnt<$wcnt; cnt++)); do
+		$XFS_IO_PROG -f -d -c "pwrite $((cnt * wsize)) $wsize" $workfile.$idx
+	done
+}
+
+rm -f $workfile*
+for ((n=0; n<$nfiles; n++)); do
+	write_direct_file $n > /dev/null 2>&1 &
+done
+wait
+sync
+
+for ((n=0; n<$nfiles; n++)); do
+	count=$(_count_extents $workfile.$n)
+	# Acceptable extent count range is 1-10
+	_within_tolerance "file.$n extent count" $count 2 1 8 -v
+done
+
+status=0
+exit
diff --git a/tests/xfs/1502.out b/tests/xfs/1502.out
new file mode 100644
index 000000000..82c8760a3
--- /dev/null
+++ b/tests/xfs/1502.out
@@ -0,0 +1,9 @@ 
+QA output created by 1502
+file.0 extent count is in range
+file.1 extent count is in range
+file.2 extent count is in range
+file.3 extent count is in range
+file.4 extent count is in range
+file.5 extent count is in range
+file.6 extent count is in range
+file.7 extent count is in range
diff --git a/tests/xfs/1503 b/tests/xfs/1503
new file mode 100755
index 000000000..9002f87e6
--- /dev/null
+++ b/tests/xfs/1503
@@ -0,0 +1,77 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
+#
+# FS QA Test xfs/1503
+#
+# Post-EOF preallocation defeat test with O_SYNC buffered I/O that repeatedly
+# closes and reopens the files.
+#
+
+. ./common/preamble
+_begin_fstest auto prealloc rw
+
+. ./common/rc
+. ./common/filter
+
+_require_scratch
+
+_cleanup()
+{
+	# try to kill all background processes
+	wait
+	cd /
+	rm -r -f $tmp.*
+}
+
+_scratch_mkfs > "$seqres.full" 2>&1
+_scratch_mount
+
+# Write multiple files in parallel using synchronous buffered writes that
+# repeatedly close and reopen the files. Aim is to interleave allocations to
+# fragment the files. Assuming we've fixed the synchronous write defeat, we can
+# still trigger the same issue with an open/read/close on O_RDONLY files. We
+# should not be triggering EOF preallocation removal on files we don't have
+# permission to write, so until this is fixed it should fragment badly.  Typical
+# problematic behaviour shows per-file extent counts of 50-350 whilst fixed
+# behaviour typically demonstrates post-eof speculative delalloc growth in
+# extent size (~6 extents for 50MB file).
+#
+# Failure is determined by golden output mismatch from _within_tolerance().
+
+workfile=$SCRATCH_MNT/file
+nfiles=32
+wsize=4096
+wcnt=1000
+
+write_file()
+{
+	idx=$1
+
+	$XFS_IO_PROG -f -s -c "pwrite -b 64k 0 50m" $workfile.$idx
+}
+
+read_file()
+{
+	idx=$1
+
+	for ((cnt=0; cnt<$wcnt; cnt++)); do
+		$XFS_IO_PROG -f -r -c "pread 0 28" $workfile.$idx
+	done
+}
+
+rm -f $workfile*
+for ((n=0; n<$((nfiles)); n++)); do
+	write_file $n > /dev/null 2>&1 &
+	read_file $n > /dev/null 2>&1 &
+done
+wait
+
+for ((n=0; n<$nfiles; n++)); do
+	count=$(_count_extents $workfile.$n)
+	# Acceptable extent count range is 1-16
+	_within_tolerance "file.$n extent count" $count 6 5 10 -v
+done
+
+status=0
+exit
diff --git a/tests/xfs/1503.out b/tests/xfs/1503.out
new file mode 100644
index 000000000..1780b16df
--- /dev/null
+++ b/tests/xfs/1503.out
@@ -0,0 +1,33 @@ 
+QA output created by 1503
+file.0 extent count is in range
+file.1 extent count is in range
+file.2 extent count is in range
+file.3 extent count is in range
+file.4 extent count is in range
+file.5 extent count is in range
+file.6 extent count is in range
+file.7 extent count is in range
+file.8 extent count is in range
+file.9 extent count is in range
+file.10 extent count is in range
+file.11 extent count is in range
+file.12 extent count is in range
+file.13 extent count is in range
+file.14 extent count is in range
+file.15 extent count is in range
+file.16 extent count is in range
+file.17 extent count is in range
+file.18 extent count is in range
+file.19 extent count is in range
+file.20 extent count is in range
+file.21 extent count is in range
+file.22 extent count is in range
+file.23 extent count is in range
+file.24 extent count is in range
+file.25 extent count is in range
+file.26 extent count is in range
+file.27 extent count is in range
+file.28 extent count is in range
+file.29 extent count is in range
+file.30 extent count is in range
+file.31 extent count is in range