Message ID | 20200330101203.12049-1-chandanrlinux@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | common/xfs: Execute _xfs_check only when explicitly asked | expand |
On Mon, Mar 30, 2020 at 03:42:03PM +0530, Chandan Rajendra wrote: > fsstress when executed as part of some of the tests (e.g. generic/270) > invokes chown() syscall many times by passing random integers as value > for the uid argument. For each such syscall invocation for which there > is no on-disk quota block, xfs invokes xfs_dquot_disk_alloc() which > allocates a new block and instantiates all the quota structures mapped > by the newly allocated block. For filesystems with 64k block size, the > number of on-disk quota structures created will be 16 times more than > that for a filesystem with 4k block size. > > xfs_db's check command (executed after test script finishes execution) > will try to read in all of the on-disk quota structures into > memory. This causes the OOM event to be triggered when reading from > filesystems with 64k block size. For filesystems with sufficiently large > amount of system memory, this causes the test to execute for a very long > time. > > Due to the above stated reasons, this commit disables execution of > xfs_db's check command unless explictly asked by the user by setting > $EXECUTE_XFS_DB_CHECK variable. > > Signed-off-by: Chandan Rajendra <chandanrlinux@gmail.com> > --- > README | 2 ++ > common/xfs | 17 +++++++++++++---- > 2 files changed, 15 insertions(+), 4 deletions(-) > > diff --git a/README b/README > index 094a7742..c47569cd 100644 > --- a/README > +++ b/README > @@ -88,6 +88,8 @@ Preparing system for tests: > run xfs_repair -n to check the filesystem; xfs_repair to rebuild > metadata indexes; and xfs_repair -n (a third time) to check the > results of the rebuilding. > + - set EXECUTE_XFS_DB_CHECK=1 to have _check_xfs_filesystem > + run xfs_db's check command on the filesystem. It seems spaces are used for indention instead of tab in README. > - xfs_scrub, if present, will always check the test and scratch > filesystems if they are still online at the end of the test. > It is no longer necessary to set TEST_XFS_SCRUB. > diff --git a/common/xfs b/common/xfs > index d9a9784f..93ebab75 100644 > --- a/common/xfs > +++ b/common/xfs > @@ -455,10 +455,19 @@ _check_xfs_filesystem() > ok=0 > fi > > - # xfs_check runs out of memory on large files, so even providing the test > - # option (-t) to avoid indexing the free space trees doesn't make it pass on > - # large filesystems. Avoid it. > - if [ "$LARGE_SCRATCH_DEV" != yes ]; then > + dbsize="$($XFS_INFO_PROG "${device}" | grep data.*bsize | sed -e 's/^.*bsize=//g' -e 's/\([0-9]*\).*$/\1/g')" This is probably a left-over from v1 patch, which is not needed in v2. > + > + # xfs_check runs out of memory, > + # 1. On large files. So even providing the test option (-t) to > + # avoid indexing the free space trees doesn't make it pass on > + # large filesystems. > + # 2. When checking filesystems with large number of quota > + # structures. This case happens consistently with 64k blocksize when > + # creating large number of on-disk quota structures whose quota ids > + # are spread across a large integer range. > + # > + # Hence avoid executing it unless explicitly asked by user. > + if [ -n "$EXECUTE_XFS_DB_CHECK" -a "$LARGE_SCRATCH_DEV" != yes ]; then > _xfs_check $extra_log_options $device 2>&1 > $tmp.fs_check Looks fine to me, I'd like to see xfs_check being disabled. But it'd be great if xfs folks could ack it as well. Thanks, Eryu > fi > if [ -s $tmp.fs_check ]; then > -- > 2.19.1 >
On Mon, Apr 06, 2020 at 08:31:07PM +0800, Eryu Guan wrote: > On Mon, Mar 30, 2020 at 03:42:03PM +0530, Chandan Rajendra wrote: > > fsstress when executed as part of some of the tests (e.g. generic/270) > > invokes chown() syscall many times by passing random integers as value > > for the uid argument. For each such syscall invocation for which there > > is no on-disk quota block, xfs invokes xfs_dquot_disk_alloc() which > > allocates a new block and instantiates all the quota structures mapped > > by the newly allocated block. For filesystems with 64k block size, the > > number of on-disk quota structures created will be 16 times more than > > that for a filesystem with 4k block size. > > > > xfs_db's check command (executed after test script finishes execution) > > will try to read in all of the on-disk quota structures into > > memory. This causes the OOM event to be triggered when reading from > > filesystems with 64k block size. For filesystems with sufficiently large > > amount of system memory, this causes the test to execute for a very long > > time. > > > > Due to the above stated reasons, this commit disables execution of > > xfs_db's check command unless explictly asked by the user by setting > > $EXECUTE_XFS_DB_CHECK variable. > > > > Signed-off-by: Chandan Rajendra <chandanrlinux@gmail.com> > > --- > > README | 2 ++ > > common/xfs | 17 +++++++++++++---- > > 2 files changed, 15 insertions(+), 4 deletions(-) > > > > diff --git a/README b/README > > index 094a7742..c47569cd 100644 > > --- a/README > > +++ b/README > > @@ -88,6 +88,8 @@ Preparing system for tests: > > run xfs_repair -n to check the filesystem; xfs_repair to rebuild > > metadata indexes; and xfs_repair -n (a third time) to check the > > results of the rebuilding. > > + - set EXECUTE_XFS_DB_CHECK=1 to have _check_xfs_filesystem > > + run xfs_db's check command on the filesystem. > > It seems spaces are used for indention instead of tab in README. > > > - xfs_scrub, if present, will always check the test and scratch > > filesystems if they are still online at the end of the test. > > It is no longer necessary to set TEST_XFS_SCRUB. > > diff --git a/common/xfs b/common/xfs > > index d9a9784f..93ebab75 100644 > > --- a/common/xfs > > +++ b/common/xfs > > @@ -455,10 +455,19 @@ _check_xfs_filesystem() > > ok=0 > > fi > > > > - # xfs_check runs out of memory on large files, so even providing the test > > - # option (-t) to avoid indexing the free space trees doesn't make it pass on > > - # large filesystems. Avoid it. > > - if [ "$LARGE_SCRATCH_DEV" != yes ]; then > > + dbsize="$($XFS_INFO_PROG "${device}" | grep data.*bsize | sed -e 's/^.*bsize=//g' -e 's/\([0-9]*\).*$/\1/g')" > > This is probably a left-over from v1 patch, which is not needed in v2. > > > + > > + # xfs_check runs out of memory, > > + # 1. On large files. So even providing the test option (-t) to > > + # avoid indexing the free space trees doesn't make it pass on > > + # large filesystems. > > + # 2. When checking filesystems with large number of quota > > + # structures. This case happens consistently with 64k blocksize when > > + # creating large number of on-disk quota structures whose quota ids > > + # are spread across a large integer range. > > + # > > + # Hence avoid executing it unless explicitly asked by user. > > + if [ -n "$EXECUTE_XFS_DB_CHECK" -a "$LARGE_SCRATCH_DEV" != yes ]; then > > _xfs_check $extra_log_options $device 2>&1 > $tmp.fs_check > > Looks fine to me, I'd like to see xfs_check being disabled. But it'd be > great if xfs folks could ack it as well. I think I'd rather we just do the testing to see what things xfs_check catches that xfs_repair doesn't, and then we can deprecate running xfs_check in fstests (by default) entirely. In the long run it doesn't make sense to maintain /three/ separate metadata verification utilities. --D > Thanks, > Eryu > > > fi > > if [ -s $tmp.fs_check ]; then > > -- > > 2.19.1 > >
diff --git a/README b/README index 094a7742..c47569cd 100644 --- a/README +++ b/README @@ -88,6 +88,8 @@ Preparing system for tests: run xfs_repair -n to check the filesystem; xfs_repair to rebuild metadata indexes; and xfs_repair -n (a third time) to check the results of the rebuilding. + - set EXECUTE_XFS_DB_CHECK=1 to have _check_xfs_filesystem + run xfs_db's check command on the filesystem. - xfs_scrub, if present, will always check the test and scratch filesystems if they are still online at the end of the test. It is no longer necessary to set TEST_XFS_SCRUB. diff --git a/common/xfs b/common/xfs index d9a9784f..93ebab75 100644 --- a/common/xfs +++ b/common/xfs @@ -455,10 +455,19 @@ _check_xfs_filesystem() ok=0 fi - # xfs_check runs out of memory on large files, so even providing the test - # option (-t) to avoid indexing the free space trees doesn't make it pass on - # large filesystems. Avoid it. - if [ "$LARGE_SCRATCH_DEV" != yes ]; then + dbsize="$($XFS_INFO_PROG "${device}" | grep data.*bsize | sed -e 's/^.*bsize=//g' -e 's/\([0-9]*\).*$/\1/g')" + + # xfs_check runs out of memory, + # 1. On large files. So even providing the test option (-t) to + # avoid indexing the free space trees doesn't make it pass on + # large filesystems. + # 2. When checking filesystems with large number of quota + # structures. This case happens consistently with 64k blocksize when + # creating large number of on-disk quota structures whose quota ids + # are spread across a large integer range. + # + # Hence avoid executing it unless explicitly asked by user. + if [ -n "$EXECUTE_XFS_DB_CHECK" -a "$LARGE_SCRATCH_DEV" != yes ]; then _xfs_check $extra_log_options $device 2>&1 > $tmp.fs_check fi if [ -s $tmp.fs_check ]; then
fsstress when executed as part of some of the tests (e.g. generic/270) invokes chown() syscall many times by passing random integers as value for the uid argument. For each such syscall invocation for which there is no on-disk quota block, xfs invokes xfs_dquot_disk_alloc() which allocates a new block and instantiates all the quota structures mapped by the newly allocated block. For filesystems with 64k block size, the number of on-disk quota structures created will be 16 times more than that for a filesystem with 4k block size. xfs_db's check command (executed after test script finishes execution) will try to read in all of the on-disk quota structures into memory. This causes the OOM event to be triggered when reading from filesystems with 64k block size. For filesystems with sufficiently large amount of system memory, this causes the test to execute for a very long time. Due to the above stated reasons, this commit disables execution of xfs_db's check command unless explictly asked by the user by setting $EXECUTE_XFS_DB_CHECK variable. Signed-off-by: Chandan Rajendra <chandanrlinux@gmail.com> --- README | 2 ++ common/xfs | 17 +++++++++++++---- 2 files changed, 15 insertions(+), 4 deletions(-)