common/xfs: Execute _xfs_check only for block size <= 4k

Message ID 20200324034729.32678-1-chandanrlinux@gmail.com (mailing list archive)
State New, archived

Commit Message

Chandan Babu R March 24, 2020, 3:47 a.m. UTC
When fsstress runs as part of some tests (e.g. generic/270), it invokes
the chown() syscall many times, passing random integers as the uid
argument. For each such invocation for which there is no on-disk quota
block, xfs invokes xfs_dquot_disk_alloc(), which allocates a new block
and instantiates all the quota structures mapped by that block. A single
64k block therefore holds 16 times as many on-disk quota structures as a
4k block.

xfs_db's check command (executed after the test script finishes) reads
all of the on-disk quota structures into memory. On filesystems with a
64k block size this triggers an OOM event. On machines with a
sufficiently large amount of system memory, the test instead takes a
very long time to execute.

For these reasons, this commit disables execution of xfs_db's check
command on filesystems whose block size is larger than 4k (e.g. 64k).

Signed-off-by: Chandan Rajendra <chandanrlinux@gmail.com>
---
 common/xfs | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
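
The gating logic described above can be sketched in isolation as follows.
This is only an illustration and not the code from the patch: the helper
names parse_data_bsize and should_run_xfs_check are hypothetical, the
sample line merely imitates the "data" line that xfs_info prints, and
treating an unparsable block size as "skip the check" is an assumption
made for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: read xfs_info-style output on stdin and print the
# block size found on the "data" line (the "bsize=" field).
parse_data_bsize() {
	sed -n 's/^data[[:space:]]*=.*bsize=\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed (return 0) only when the check should run,
# i.e. the block size is at most 4096 bytes and the scratch device is not
# flagged as large. A missing or non-numeric block size means "skip"
# (an assumption of this sketch).
should_run_xfs_check() {
	_bsize="$1"
	_large="$2"
	case "$_bsize" in
	''|*[!0-9]*) return 1 ;;
	esac
	[ "$_bsize" -le 4096 ] && [ "$_large" != yes ]
}

# Demo on a made-up line shaped like xfs_info output for a 64k filesystem.
sample='data     =                       bsize=65536  blocks=16384, imaxpct=25'
bsize=$(printf '%s\n' "$sample" | parse_data_bsize)

if should_run_xfs_check "$bsize" no; then
	echo "bsize=$bsize: run the check"
else
	echo "bsize=$bsize: skip the check"
fi

# Block-size ratio behind the "16 times" figure in the commit message.
echo "64k/4k block size ratio: $((65536 / 4096))"
```

Anchoring the sed expression on the "data" line keeps the "naming" and
"log" lines, which also carry a bsize= field, from being picked up.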

Comments

Christoph Hellwig March 25, 2020, 1:12 p.m. UTC | #1
On Tue, Mar 24, 2020 at 09:17:29AM +0530, Chandan Rajendra wrote:
> fsstress when executed as part of some of the tests (e.g. generic/270)
> invokes chown() syscall many times by passing random integers as value
> for the uid argument. For each such syscall invocation for which there
> is no on-disk quota block, xfs invokes xfs_dquot_disk_alloc() which
> allocates a new block and instantiates all the quota structures mapped
> by the newly allocated block. For a single 64k block, the number of
> on-disk quota structures thus created will be 16 times more than that
> for a 4k block.
> 
> xfs_db's check command (executed after test script finishes execution)
> will read in all of the on-disk quota structures into memory. This
> causes the OOM event to be triggered when reading from filesystems with
> 64k block size. For machines with sufficiently large amount of system
> memory, this causes the test to execute for a very long time.
> 
> Due to the above stated reasons, this commit disables execution of
> xfs_db's check command when working on 64k blocksized filesystem.

Due to all the scalability issues in the xfs_db check command I think
it finally is time to just not run it by default at all.

Patch

diff --git a/common/xfs b/common/xfs
index d9a9784f..d65c38d8 100644
--- a/common/xfs
+++ b/common/xfs
@@ -455,10 +455,19 @@  _check_xfs_filesystem()
 		ok=0
 	fi
 
-	# xfs_check runs out of memory on large files, so even providing the test
-	# option (-t) to avoid indexing the free space trees doesn't make it pass on
-	# large filesystems. Avoid it.
-	if [ "$LARGE_SCRATCH_DEV" != yes ]; then
+	dbsize="$($XFS_INFO_PROG "${device}" | grep 'data.*bsize' | sed -e 's/^.*bsize=//g' -e 's/\([0-9]*\).*$/\1/g')"
+
+	# xfs_check runs out of memory:
+	# 1. On large files. Even providing the test option (-t) to
+	# avoid indexing the free space trees doesn't make it pass on
+	# large filesystems.
+	# 2. When checking filesystems with a large number of quota
+	# structures. This happens consistently with 64k blocksize when
+	# creating a large number of on-disk quota structures whose quota
+	# ids are spread across a large integer range.
+	#
+	# Hence avoid it in these two cases.
+	if [ $dbsize -le 4096 -a "$LARGE_SCRATCH_DEV" != yes ]; then
 		_xfs_check $extra_log_options $device 2>&1 > $tmp.fs_check
 	fi
 	if [ -s $tmp.fs_check ]; then