From patchwork Fri Dec 30 22:12:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13084685
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 5E860C4332F
	for ; Fri, 30 Dec 2022 22:54:32 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S231341AbiL3Wyb (ORCPT );
	Fri, 30 Dec 2022 17:54:31 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57566 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S229681AbiL3Wya (ORCPT );
	Fri, 30 Dec 2022 17:54:30 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC13A1AA0F;
	Fri, 30 Dec 2022 14:54:29 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id 65D03B81D96;
	Fri, 30 Dec 2022 22:54:28 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0788EC433EF;
	Fri, 30 Dec 2022 22:54:27 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672440867;
	bh=sFuGD3n7++hEnqnDWvFcEcTz7jlN4CvVk+nuBM9fwsg=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=bF0DIAHN+Rd9LrCTPxCKPg1PMeXHKOYgBnSgkqB4M98QGA6110oVbFf7hDuEOQi6x
	 q2bD/dws+eTaM+zSlXOY6t3rFfJQt2EmnV9AJ783hOw9HF5+AXbtZxtOjKfO3449Fm
	 K+7mothgf08AiNQiN5dwwu1RmHkQOE8n61dkI/OEA4HXx+ZpWsXXMCETY0EnqXftYK
	 MU6gFeTd/lr7hutMdYjoYKUfSnMjPjqdt4g2IpY+CZX9G6qC4zefEn4BIsdAKLGxw2
	 kW1SHuhEKq4lI3wjLKxiiswNePaUMGxLtreSNZ96PVCk4+eElAsqHGwhNuAnpQ6dXv
	 yfu5olOUwt5+A==
Subject: [PATCH 01/16] xfs/422: create a new test group for
 fsstress/repair racers
From: "Darrick J. Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:53 -0800
Message-ID: <167243837315.694541.14915989006044275705.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Create a new group for tests that race fsstress with online filesystem
repair, and add this to the dangerous_online_repair group too.

Signed-off-by: Darrick J. Wong
---
 doc/group-names.txt |    1 +
 tests/xfs/422       |    2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/group-names.txt b/doc/group-names.txt
index 6cc9af7844..ac219e05b3 100644
--- a/doc/group-names.txt
+++ b/doc/group-names.txt
@@ -34,6 +34,7 @@ dangerous_bothrepair	fuzzers to evaluate xfs_scrub + xfs_repair repair
 dangerous_fuzzers	fuzzers that can crash your computer
 dangerous_norepair	fuzzers to evaluate kernel metadata verifiers
 dangerous_online_repair	fuzzers to evaluate xfs_scrub online repair
+dangerous_fsstress_repair	race fsstress and xfs_scrub online repair
 dangerous_repair	fuzzers to evaluate xfs_repair offline repair
 dangerous_scrub		fuzzers to evaluate xfs_scrub checking
 data			data loss checkers
diff --git a/tests/xfs/422 b/tests/xfs/422
index f3c63e8d6a..9ed944ed63 100755
--- a/tests/xfs/422
+++ b/tests/xfs/422
@@ -9,7 +9,7 @@
 # activity, so we can't have userspace wandering in and thawing it.
 #
 . ./common/preamble
-_begin_fstest dangerous_scrub dangerous_online_repair freeze
+_begin_fstest online_repair dangerous_fsstress_repair freeze
 
 _register_cleanup "_cleanup" BUS
 

From patchwork Fri Dec 30 22:12:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
 Wong"
X-Patchwork-Id: 13084686
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 17BA4C3DA7C
	for ; Fri, 30 Dec 2022 22:54:50 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S229551AbiL3Wyt (ORCPT );
	Fri, 30 Dec 2022 17:54:49 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57808 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S235605AbiL3Wyq (ORCPT );
	Fri, 30 Dec 2022 17:54:46 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org
	[IPv6:2604:1380:4641:c500::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 913D11D0E2;
	Fri, 30 Dec 2022 14:54:43 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id 2E20761AC4;
	Fri, 30 Dec 2022 22:54:43 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 887B4C433D2;
	Fri, 30 Dec 2022 22:54:42 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672440882;
	bh=sl1BeavL/iPTTWfqEZT3TR+7+516RsIAFpISl65NUOE=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=g8dUrYRzj23zpw5DvwneShLauAmcXZ1CRM+Y+iwhTXdJgOAxwUaRJ3jXu8WTP36K1
	 ysd4g1p1SELHyyQyYPf5GpC8zq/QkuFuL+KSACYCw3qBeHixd3erWe1lBcoqAuhN+d
	 VPwlOH6ziQzJ8oGk40Z1qPzxZogJv+diBL57kTh6Gkjt2b3gu7oPKQ6imlNXc4xN7b
	 JlTK4ZciAOkoBGh3K2O1jqPccgzupIxedQmRbc7ieQJrfRv42YCTC2vnbAMOqNDBFk
	 Cr9c4rXmoQSnIcJR9QUcvxJ0bvd7Ls+yFjUbkM/dFH0ssoxyV1PKxb+xX16kAoadcA
	 M7QNkx+dkZfow==
Subject: [PATCH 02/16] xfs/422: move the fsstress/freeze/scrub racing logic to
 common/fuzzy
From: "Darrick J.
 Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:53 -0800
Message-ID: <167243837327.694541.10370212917252408651.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Hoist all this code to common/fuzzy in preparation for making this code
more generic so that we implement a variety of tests that check the
concurrency correctness of online fsck. Do just enough renaming so that
we don't pollute the test program's namespace; we'll fix the other warts
in subsequent patches.

Signed-off-by: Darrick J. Wong
---
 common/fuzzy      |  100 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/422     |  104 ++++-----------------------------------------------
 tests/xfs/422.out |    4 +-
 3 files changed, 109 insertions(+), 99 deletions(-)

diff --git a/common/fuzzy b/common/fuzzy
index 70213af5db..979fa55515 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -316,3 +316,103 @@ _scratch_xfs_fuzz_metadata() {
 		done
 	done
 }
+
+# Functions to race fsstress, fs freeze, and xfs metadata scrubbing against
+# each other to shake out bugs in xfs online repair.
+
+# Filter freeze and thaw loop output so that we don't tarnish the golden output
+# if the kernel temporarily won't let us freeze.
+__stress_freeze_filter_output() {
+	grep -E -v '(Device or resource busy|Invalid argument)'
+}
+
+# Filter scrub output so that we don't tarnish the golden output if the fs is
+# too busy to scrub. Note: Tests should _notrun if the scrub type is not
+# supported.
+__stress_scrub_filter_output() {
+	grep -E -v '(Device or resource busy|Invalid argument)'
+}
+
+# Run fs freeze and thaw in a tight loop.
+__stress_scrub_freeze_loop() {
+	local end="$1"
+
+	while [ "$(date +%s)" -lt $end ]; do
+		$XFS_IO_PROG -x -c 'freeze' -c 'thaw' $SCRATCH_MNT 2>&1 | \
+			__stress_freeze_filter_output
+	done
+}
+
+# Run xfs online fsck commands in a tight loop.
+__stress_scrub_loop() {
+	local end="$1"
+
+	while [ "$(date +%s)" -lt $end ]; do
+		$XFS_IO_PROG -x -c 'repair rmapbt 0' -c 'repair rmapbt 1' $SCRATCH_MNT 2>&1 | \
+			__stress_scrub_filter_output
+	done
+}
+
+# Run fsstress while we're testing online fsck.
+__stress_scrub_fsstress_loop() {
+	local end="$1"
+
+	local args=$(_scale_fsstress_args -p 4 -d $SCRATCH_MNT -n 2000 $FSSTRESS_AVOID)
+
+	while [ "$(date +%s)" -lt $end ]; do
+		$FSSTRESS_PROG $args >> $seqres.full
+	done
+}
+
+# Make sure we have everything we need to run stress and scrub
+_require_xfs_stress_scrub() {
+	_require_xfs_io_command "scrub"
+	_require_command "$KILLALL_PROG" killall
+	_require_freeze
+}
+
+# Make sure we have everything we need to run stress and online repair
+_require_xfs_stress_online_repair() {
+	_require_xfs_stress_scrub
+	_require_xfs_io_command "repair"
+	_require_xfs_io_error_injection "force_repair"
+	_require_freeze
+}
+
+# Clean up after the loops in case they didn't do it themselves.
+_scratch_xfs_stress_scrub_cleanup() {
+	$KILLALL_PROG -TERM xfs_io fsstress >> $seqres.full 2>&1
+	$XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT >> $seqres.full 2>&1
+}
+
+# Start scrub, freeze, and fsstress in background looping processes, and wait
+# for 30*TIME_FACTOR seconds to see if the filesystem goes down. Callers
+# must call _scratch_xfs_stress_scrub_cleanup from their cleanup functions.
+_scratch_xfs_stress_scrub() {
+	local start="$(date +%s)"
+	local end="$((start + (30 * TIME_FACTOR) ))"
+
+	echo "Loop started at $(date --date="@${start}")," \
+		"ending at $(date --date="@${end}")" >> $seqres.full
+
+	__stress_scrub_fsstress_loop $end &
+	__stress_scrub_freeze_loop $end &
+	__stress_scrub_loop $end &
+
+	# Wait until 2 seconds after the loops should have finished, then
+	# clean up after ourselves.
+	while [ "$(date +%s)" -lt $((end + 2)) ]; do
+		sleep 1
+	done
+	_scratch_xfs_stress_scrub_cleanup
+
+	echo "Loop finished at $(date)" >> $seqres.full
+}
+
+# Start online repair, freeze, and fsstress in background looping processes,
+# and wait for 30*TIME_FACTOR seconds to see if the filesystem goes down.
+# Same requirements and arguments as _scratch_xfs_stress_scrub.
+_scratch_xfs_stress_online_repair() {
+	$XFS_IO_PROG -x -c 'inject force_repair' $SCRATCH_MNT
+	_scratch_xfs_stress_scrub "$@"
+}
diff --git a/tests/xfs/422 b/tests/xfs/422
index 9ed944ed63..0bf08572f3 100755
--- a/tests/xfs/422
+++ b/tests/xfs/422
@@ -4,40 +4,19 @@
 #
 # FS QA Test No. 422
 #
-# Race freeze and rmapbt repair for a while to see if we crash or livelock.
+# Race fsstress and rmapbt repair for a while to see if we crash or livelock.
 # rmapbt repair requires us to freeze the filesystem to stop all filesystem
 # activity, so we can't have userspace wandering in and thawing it.
 #
 . ./common/preamble
 _begin_fstest online_repair dangerous_fsstress_repair freeze
 
-_register_cleanup "_cleanup" BUS
-
-# First kill and wait the freeze loop so it won't try to freeze fs again
-# Then make sure fs is not frozen
-# Then kill and wait for the rest of the workers
-# Because if fs is frozen a killed writer will never exit
-kill_loops() {
-	local sig=$1
-
-	[ -n "$freeze_pid" ] && kill $sig $freeze_pid
-	wait $freeze_pid
-	unset freeze_pid
-	$XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT
-	[ -n "$stress_pid" ] && kill $sig $stress_pid
-	[ -n "$repair_pid" ] && kill $sig $repair_pid
-	wait
-	unset stress_pid
-	unset repair_pid
-}
-
-# Override the default cleanup function.
-_cleanup()
-{
-	kill_loops -9 > /dev/null 2>&1
+_cleanup() {
+	_scratch_xfs_stress_scrub_cleanup &> /dev/null
 	cd /
-	rm -rf $tmp.*
+	rm -r -f $tmp.*
 }
+_register_cleanup "_cleanup" BUS
 
 # Import common functions.
 . ./common/filter
@@ -47,80 +26,13 @@ _cleanup()
 # real QA test starts here
 _supported_fs xfs
 _require_xfs_scratch_rmapbt
-_require_xfs_io_command "scrub"
-_require_xfs_io_error_injection "force_repair"
-_require_command "$KILLALL_PROG" killall
-_require_freeze
+_require_xfs_stress_online_repair
 
-echo "Format and populate"
 _scratch_mkfs > "$seqres.full" 2>&1
 _scratch_mount
-
-STRESS_DIR="$SCRATCH_MNT/testdir"
-mkdir -p $STRESS_DIR
-
-for i in $(seq 0 9); do
-	mkdir -p $STRESS_DIR/$i
-	for j in $(seq 0 9); do
-		mkdir -p $STRESS_DIR/$i/$j
-		for k in $(seq 0 9); do
-			echo x > $STRESS_DIR/$i/$j/$k
-		done
-	done
-done
-
-cpus=$(( $($here/src/feature -o) * 4 * LOAD_FACTOR))
-
-echo "Concurrent repair"
-filter_output() {
-	grep -E -v '(Device or resource busy|Invalid argument)'
-}
-freeze_loop() {
-	end="$1"
-
-	while [ "$(date +%s)" -lt $end ]; do
-		$XFS_IO_PROG -x -c 'freeze' -c 'thaw' $SCRATCH_MNT 2>&1 | filter_output
-	done
-}
-repair_loop() {
-	end="$1"
-
-	while [ "$(date +%s)" -lt $end ]; do
-		$XFS_IO_PROG -x -c 'repair rmapbt 0' -c 'repair rmapbt 1' $SCRATCH_MNT 2>&1 | filter_output
-	done
-}
-stress_loop() {
-	end="$1"
-
-	FSSTRESS_ARGS=$(_scale_fsstress_args -p 4 -d $SCRATCH_MNT -n 2000 $FSSTRESS_AVOID)
-	while [ "$(date +%s)" -lt $end ]; do
-		$FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full
-	done
-}
-$XFS_IO_PROG -x -c 'inject force_repair' $SCRATCH_MNT
-
-start=$(date +%s)
-end=$((start + (30 * TIME_FACTOR) ))
-
-echo "Loop started at $(date --date="@${start}"), ending at $(date --date="@${end}")" >> $seqres.full
-stress_loop $end &
-stress_pid=$!
-freeze_loop $end &
-freeze_pid=$!
-repair_loop $end &
-repair_pid=$!
-
-# Wait until 2 seconds after the loops should have finished...
-while [ "$(date +%s)" -lt $((end + 2)) ]; do
-	sleep 1
-done
-
-# ...and clean up after the loops in case they didn't do it themselves.
-kill_loops >> $seqres.full 2>&1
-
-echo "Loop finished at $(date)" >> $seqres.full
-echo "Test done"
+_scratch_xfs_stress_online_repair
 
 # success, all done
+echo Silence is golden
 status=0
 exit
diff --git a/tests/xfs/422.out b/tests/xfs/422.out
index 3818c48fa8..f70693fde6 100644
--- a/tests/xfs/422.out
+++ b/tests/xfs/422.out
@@ -1,4 +1,2 @@
 QA output created by 422
-Format and populate
-Concurrent repair
-Test done
+Silence is golden

From patchwork Fri Dec 30 22:12:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
 Wong"
X-Patchwork-Id: 13084687
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 6A89DC4332F
	for ; Fri, 30 Dec 2022 22:55:01 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235436AbiL3WzA (ORCPT );
	Fri, 30 Dec 2022 17:55:00 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57842 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S229758AbiL3Wy7 (ORCPT );
	Fri, 30 Dec 2022 17:54:59 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org
	[IPv6:2604:1380:4641:c500::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 270E91AA17;
	Fri, 30 Dec 2022 14:54:59 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id B8DC061AC4;
	Fri, 30 Dec 2022 22:54:58 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2174BC433D2;
	Fri, 30 Dec 2022 22:54:58 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672440898;
	bh=nKnqdkRnRBI8Qh+Yd6TUhrIjepqFiqtDJUSDh2W3KBc=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=N2xTgL3rvFTtToiGdgnJ/UsVKGKogkz9dgFmOOmkatwwKCMMFL9RtmiIhkBPkC+Vi
	 tfJpa6LQdPE09tCMa1vP2FwN5sDY6y2+G0hOALR7GLTW0igYHdy/UrjPWzoegO7Ai/
	 WR3iMnSzuEMxoVo9FU4cYxzZSn581WCwb8GZUFa3BB0f3FZhI9olpBFLAIFIBc4+Is
	 o1fZ9W/mh5490a/1bQ6qtSGYqndgxJvdFENa1dPjDF7ey3Cww+FnPfbu2pP9WpuY4C
	 Pu56ZcNMHqjNRzXJ3yAF2IRsV51qM+a+wTKs1GcCFhpKdI5nl8cqAIVkBrQkafRoAw
	 n1ZPuLmRNBMpQ==
Subject: [PATCH 03/16] xfs/422: rework feature detection so we only
 test-format scratch once
From: "Darrick J.
 Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:53 -0800
Message-ID: <167243837339.694541.16731359558761133108.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Rework the feature detection in the one online fsck stress test so that
we only test-format the scratch device once per test run.

Signed-off-by: Darrick J. Wong
---
 tests/xfs/422 |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/xfs/422 b/tests/xfs/422
index 0bf08572f3..b3353d2202 100755
--- a/tests/xfs/422
+++ b/tests/xfs/422
@@ -25,11 +25,12 @@ _register_cleanup "_cleanup" BUS
 
 # real QA test starts here
 _supported_fs xfs
-_require_xfs_scratch_rmapbt
+_require_scratch
 _require_xfs_stress_online_repair
 
 _scratch_mkfs > "$seqres.full" 2>&1
 _scratch_mount
+_require_xfs_has_feature "$SCRATCH_MNT" rmapbt
 _scratch_xfs_stress_online_repair
 
 # success, all done

From patchwork Fri Dec 30 22:12:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
 Wong"
X-Patchwork-Id: 13084688
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 68438C4332F
	for ; Fri, 30 Dec 2022 22:55:19 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235581AbiL3WzR (ORCPT );
	Fri, 30 Dec 2022 17:55:17 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57878 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S235490AbiL3WzP (ORCPT );
	Fri, 30 Dec 2022 17:55:15 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org
	[IPv6:2604:1380:4641:c500::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ABCCD1AA29;
	Fri, 30 Dec 2022 14:55:14 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id 496A361C16;
	Fri, 30 Dec 2022 22:55:14 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id A5EA4C433D2;
	Fri, 30 Dec 2022 22:55:13 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672440913;
	bh=lCVt8f/wb4II/loLUcHIsKKO0IDq0zjG30nHr32ViIw=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=k8jTCSLjcEn/u9CGfH8de8mSHjX3k2kJVXtX/WVONscYKKHuCCZ343HhAxOLPKGk+
	 2t+9YMaKw+cJ1vU2b4/iGEGvKKEITLQYQU2CgMEoWL5sJOEc9P0qpXqJqHqCeiwBaz
	 qQuSat4tpEtDn3SeXBN8yKB/WwZNh34s7Uws2Pg58GJvUOZTgeXxeWGcmxmrKBVP1C
	 6pfQ64TUMpHHwbiVV+bNNhRZhO6/8/tJt/DBJ3nGEoBjTY90ZHWbZ9TRBNd5pYlNhL
	 Ih79xouE57Ibff0RpWpjh9orKICAPvvWcjB8t7xf2sVVTW36AnnQThqTn7wqLVcc6p
	 mdDm1CYl3p7Qg==
Subject: [PATCH 04/16] fuzzy: clean up scrub stress programs quietly
From: "Darrick J.
 Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:53 -0800
Message-ID: <167243837353.694541.4864104518386801319.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

In the cleanup function for online fsck stress test common code, send
SIGINT instead of SIGTERM to the fsstress and xfs_io processes to kill
them. bash prints 'Terminated' to the golden output when children die
with SIGTERM, which can make a test fail, and we don't want a regular
cleanup function being the thing that prevents the test from passing.

Signed-off-by: Darrick J. Wong
---
 common/fuzzy |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/common/fuzzy b/common/fuzzy
index 979fa55515..e52831560d 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -381,7 +381,9 @@ _require_xfs_stress_online_repair() {
 
 # Clean up after the loops in case they didn't do it themselves.
 _scratch_xfs_stress_scrub_cleanup() {
-	$KILLALL_PROG -TERM xfs_io fsstress >> $seqres.full 2>&1
+	# Send SIGINT so that bash won't print a 'Terminated' message that
+	# distorts the golden output.
+	$KILLALL_PROG -INT xfs_io fsstress >> $seqres.full 2>&1
 	$XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT >> $seqres.full 2>&1
 }
 

From patchwork Fri Dec 30 22:12:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
 Wong"
X-Patchwork-Id: 13084689
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id A15CEC4332F
	for ; Fri, 30 Dec 2022 22:55:32 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235513AbiL3Wzc (ORCPT );
	Fri, 30 Dec 2022 17:55:32 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57928 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S235490AbiL3Wzb (ORCPT );
	Fri, 30 Dec 2022 17:55:31 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 542151AA29;
	Fri, 30 Dec 2022 14:55:30 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id E501B61C31;
	Fri, 30 Dec 2022 22:55:29 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4DC01C433EF;
	Fri, 30 Dec 2022 22:55:29 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672440929;
	bh=3ueXZBy0OcKUPBcpEcvUNhRW2rGIy9ymN6thpWt/1Ao=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=iuyPWHqaEmxwqlF8KMaNI0cNzIdVJ5ibo+d4ZOVMCuwFbpna4FHv9xcLT3/tN4uGb
	 irwRjC/Yy1lcUfdoVr+CiuF232ExLRPDyUSQrE2vnPYqwHG3u142PiOUyy2wKdxsHv
	 s8KeUOY98Ks3DJ6dszKghEOk6r6JrIG1TU+EgjqwX3T36UfIuCDiQgaZDQHIleAheS
	 po9KBsgnbf9UMS8MxnieilQUcFNpAGIb9cJvK8EmwNmuyvxm9PADIDuY+RNUG2AnIf
	 bIMaHsqR8axGPh0j9xBgtBwM3CoPYcN7v3mNrTO342FAabRnfBNfCSRzNxV/WV1cNn
	 rQS/1KCfgo5uw==
Subject: [PATCH 05/16] fuzzy: rework scrub stress output filtering
From: "Darrick J.
 Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:53 -0800
Message-ID: <167243837366.694541.12412040391997627012.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Rework the output filtering functions for scrub stress tests: first, we
should use _filter_scratch to avoid leaking the scratch fs details to
the output. Second, for scrub and repair, change the filter elements to
reflect outputs that don't indicate failure (such as busy resources,
preening requests, and insufficient space to do anything). Finally,
change the _require function to check that filter functions have been
sourced.

Signed-off-by: Darrick J. Wong
---
 common/fuzzy |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/common/fuzzy b/common/fuzzy
index e52831560d..94a6ce85a3 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -323,14 +323,19 @@ _scratch_xfs_fuzz_metadata() {
 # Filter freeze and thaw loop output so that we don't tarnish the golden output
 # if the kernel temporarily won't let us freeze.
 __stress_freeze_filter_output() {
-	grep -E -v '(Device or resource busy|Invalid argument)'
+	_filter_scratch | \
+		sed -e '/Device or resource busy/d' \
+		    -e '/Invalid argument/d'
 }
 
 # Filter scrub output so that we don't tarnish the golden output if the fs is
 # too busy to scrub. Note: Tests should _notrun if the scrub type is not
 # supported.
 __stress_scrub_filter_output() {
-	grep -E -v '(Device or resource busy|Invalid argument)'
+	_filter_scratch | \
+		sed -e '/Device or resource busy/d' \
+		    -e '/Optimization possible/d' \
+		    -e '/No space left on device/d'
 }
 
 # Run fs freeze and thaw in a tight loop.
@@ -369,6 +374,8 @@ _require_xfs_stress_scrub() {
 	_require_xfs_io_command "scrub"
 	_require_command "$KILLALL_PROG" killall
 	_require_freeze
+	command -v _filter_scratch &>/dev/null || \
+		_notrun 'xfs scrub stress test requires common/filter'
 }
 
 # Make sure we have everything we need to run stress and online repair

From patchwork Fri Dec 30 22:12:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13084690
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 93E8AC4332F
	for ; Fri, 30 Dec 2022 22:55:48 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235597AbiL3Wzr (ORCPT );
	Fri, 30 Dec 2022 17:55:47 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58036 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S235490AbiL3Wzr (ORCPT );
	Fri, 30 Dec 2022 17:55:47 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EF2A6F7;
	Fri, 30 Dec 2022 14:55:45 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id 8B07561AC4;
	Fri, 30 Dec 2022 22:55:45 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id E5F92C433EF;
	Fri, 30 Dec 2022 22:55:44 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672440945;
	bh=Y2Jx8eIWENFg14c1hFYXP1NthflUe0vjDug5mZu7fgQ=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=ime3PQjmDArfhzcEu1B2JZEWLjrCFywv+tAMDA1wRwlSFrsec2Rq7Flf16uXhOxJH
	 KS3GbI8mRz+8LgtHLhRwf+jEBYWlbEVRhHbWOmYFEfsvl0H/rzLIrRSMzeaHqB8/k4
	 WuR8WWWZFSp/yZAt5Xb48yVoE6Up0PoFjESYy+WlpHDj5QVKVYHGNUV61sHzNlnM98
	 6mAWFhJIuHHqVx2khg8ySYW951D3BxZIn5jGXe+7vzO39i+Lo3jz8f4Q8Yc2Z5btYG
	 qxakjO697oecX9ujcSePQj3htVrb9w/YotYTEi+gOdM2QzZhfjdqNavOjN2tCnZb7A
	 1tKtt5XK5+kRg==
Subject: [PATCH 06/16] fuzzy: explicitly check for common/inject in
 _require_xfs_stress_online_repair
From: "Darrick J. Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:53 -0800
Message-ID: <167243837380.694541.16030787606766361808.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

In _require_xfs_stress_online_repair, make sure that the test has
sourced common/inject before we try to call its functions.

Signed-off-by: Darrick J. Wong
---
 common/fuzzy |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/common/fuzzy b/common/fuzzy
index 94a6ce85a3..de9e398984 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -382,6 +382,8 @@ _require_xfs_stress_scrub() {
 _require_xfs_stress_online_repair() {
 	_require_xfs_stress_scrub
 	_require_xfs_io_command "repair"
+	command -v _require_xfs_io_error_injection &>/dev/null || \
+		_notrun 'xfs repair stress test requires common/inject'
 	_require_xfs_io_error_injection "force_repair"
 	_require_freeze
 }

From patchwork Fri Dec 30 22:12:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
 Wong"
X-Patchwork-Id: 13084712
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id E5274C4332F
	for ; Fri, 30 Dec 2022 22:56:05 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235608AbiL3W4F (ORCPT );
	Fri, 30 Dec 2022 17:56:05 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58064 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S235490AbiL3W4E (ORCPT );
	Fri, 30 Dec 2022 17:56:04 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org
	[IPv6:2604:1380:4601:e00::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 28B806462;
	Fri, 30 Dec 2022 14:56:03 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id DCB54B81DA0;
	Fri, 30 Dec 2022 22:56:01 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 94F02C433F0;
	Fri, 30 Dec 2022 22:56:00 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672440960;
	bh=0vYOWMEidKWH2xQ18ZPBL3AFF2W1dVpYtLsOLMB0KwQ=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=O+5BsUmP+/rrY0I2zKSg277jCYvQZS/D6Z4mwPXXofouEptquCW+kDM0CWVNZ0QIh
	 3KgFBkBswmt0D/2Y/vsBEYs+alTHM4nI76C5QXnYFs9+1oCaDjzwSy+DJUh5RlL2lX
	 WglOAVXe0b30pxf8LBO9HeqEIHf9R5PhlpyDHaqbYb+5RVYEhMItl+J8mCyiTtLmxJ
	 5SYEFLQf+jyM4iSXPwWYgJ1LuLkR8MP1WIx7doOgZ79/yjjt2xepxUWc4eoMPmOgbY
	 SFW0npmZYVLJ6AWZdimlw1CAoMiSGp+clPRCVyBbZv+zwWvkjvfL5T3D9K39b/zDSl
	 mncN8SSMWXqfw==
Subject: [PATCH 07/16] fuzzy: give each test local control over what scrub
 stress tests get run
From: "Darrick J.
 Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:53 -0800
Message-ID: <167243837393.694541.5087918179710010888.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Now that we've hoisted the scrub stress code to common/fuzzy, introduce
argument parsing so that each test can specify what they want to test.

Signed-off-by: Darrick J. Wong
---
 common/fuzzy  |   39 +++++++++++++++++++++++++++++++++++----
 tests/xfs/422 |    2 +-
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/common/fuzzy b/common/fuzzy
index de9e398984..88ba5fef69 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -348,12 +348,19 @@ __stress_scrub_freeze_loop() {
 	done
 }
 
-# Run xfs online fsck commands in a tight loop.
-__stress_scrub_loop() {
+# Run individual XFS online fsck commands in a tight loop with xfs_io.
+__stress_one_scrub_loop() {
 	local end="$1"
+	local scrub_tgt="$2"
+	shift; shift
+
+	local xfs_io_args=()
+	for arg in "$@"; do
+		xfs_io_args+=('-c' "$arg")
+	done
 
 	while [ "$(date +%s)" -lt $end ]; do
-		$XFS_IO_PROG -x -c 'repair rmapbt 0' -c 'repair rmapbt 1' $SCRATCH_MNT 2>&1 | \
+		$XFS_IO_PROG -x "${xfs_io_args[@]}" "$scrub_tgt" 2>&1 | \
 			__stress_scrub_filter_output
 	done
 }
@@ -390,6 +397,8 @@ _require_xfs_stress_online_repair() {
 
 # Clean up after the loops in case they didn't do it themselves.
 _scratch_xfs_stress_scrub_cleanup() {
+	echo "Cleaning up scrub stress run at $(date)" >> $seqres.full
+
 	# Send SIGINT so that bash won't print a 'Terminated' message that
 	# distorts the golden output.
$KILLALL_PROG -INT xfs_io fsstress >> $seqres.full 2>&1 @@ -399,7 +408,25 @@ _scratch_xfs_stress_scrub_cleanup() { # Start scrub, freeze, and fsstress in background looping processes, and wait # for 30*TIME_FACTOR seconds to see if the filesystem goes down. Callers # must call _scratch_xfs_stress_scrub_cleanup from their cleanup functions. +# +# Various options include: +# +# -s Pass this command to xfs_io to test scrub. If zero -s options are +# specified, xfs_io will not be run. +# -t Run online scrub against this file; $SCRATCH_MNT is the default. _scratch_xfs_stress_scrub() { + local one_scrub_args=() + local scrub_tgt="$SCRATCH_MNT" + + OPTIND=1 + while getopts "s:t:" c; do + case "$c" in + s) one_scrub_args+=("$OPTARG");; + t) scrub_tgt="$OPTARG";; + *) return 1; ;; + esac + done + local start="$(date +%s)" local end="$((start + (30 * TIME_FACTOR) ))" @@ -408,7 +435,11 @@ _scratch_xfs_stress_scrub() { __stress_scrub_fsstress_loop $end & __stress_scrub_freeze_loop $end & - __stress_scrub_loop $end & + + if [ "${#one_scrub_args[@]}" -gt 0 ]; then + __stress_one_scrub_loop "$end" "$scrub_tgt" \ + "${one_scrub_args[@]}" & + fi # Wait until 2 seconds after the loops should have finished, then # clean up after ourselves. diff --git a/tests/xfs/422 b/tests/xfs/422 index b3353d2202..faea5d6792 100755 --- a/tests/xfs/422 +++ b/tests/xfs/422 @@ -31,7 +31,7 @@ _require_xfs_stress_online_repair _scratch_mkfs > "$seqres.full" 2>&1 _scratch_mount _require_xfs_has_feature "$SCRATCH_MNT" rmapbt -_scratch_xfs_stress_online_repair +_scratch_xfs_stress_online_repair -s "repair rmapbt 0" -s "repair rmapbt 1" # success, all done echo Silence is golden From patchwork Fri Dec 30 22:12:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13084713 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC31BC4332F for ; Fri, 30 Dec 2022 22:56:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235635AbiL3W4T (ORCPT ); Fri, 30 Dec 2022 17:56:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58104 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235620AbiL3W4S (ORCPT ); Fri, 30 Dec 2022 17:56:18 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 44B9A6157; Fri, 30 Dec 2022 14:56:17 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D621061C12; Fri, 30 Dec 2022 22:56:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 35E52C433D2; Fri, 30 Dec 2022 22:56:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672440976; bh=luVab/+ZYzX0rnW86r9k1EQ12iKVPWafSQ2RN0amBxs=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=oSpx1nhukICJH/JmkbT5sUZjLUVI6EGyBsb8eqfBH/nTyeB6bhEPs2cf9JNs4y6z9 xScSbuQyItw3Hnt1Pu8GdnqLPnd7RzAeoTAJka5hCpBMZ2+Say2TSDKBElZvPOOIrn SN070OEXTsVfAFao4ysHCMIquFE8vR2E6CdDkx0n9j7dyJy3VptdKKgk7bHqlzzNkt NDpmR2x/6KtQ2Jmxj3O/yBLkNcro/oFs3IAaqnbAGvfR4d/Gsf+XomwYja4dpOPRWw m4K13iftqL3Z1jyblB6kSZyTN5b2a1/cOoVgiHhUKBukgBzVAN+vx86RRP5nKZ19k1 M5aAPpgMUt/yg== Subject: [PATCH 08/16] fuzzy: test the scrub stress subcommands before looping From: "Darrick J. 
Wong" To: zlang@redhat.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me Date: Fri, 30 Dec 2022 14:12:54 -0800 Message-ID: <167243837407.694541.14407835892445280870.stgit@magnolia> In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia> References: <167243837296.694541.13203497631389630964.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org From: Darrick J. Wong Before we commit to running fsstress and scrub commands in a loop for some time, we should check that the provided commands actually work on the scratch filesystem. The _require_xfs_io_command predicate only detects the presence of the scrub ioctl, not any particular subcommand. Signed-off-by: Darrick J. Wong --- common/fuzzy | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/common/fuzzy b/common/fuzzy index 88ba5fef69..8d3e30e32b 100644 --- a/common/fuzzy +++ b/common/fuzzy @@ -405,6 +405,25 @@ _scratch_xfs_stress_scrub_cleanup() { $XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT >> $seqres.full 2>&1 } +# Make sure the provided scrub/repair commands actually work on the scratch +# filesystem before we start running them in a loop. +__stress_scrub_check_commands() { + local scrub_tgt="$1" + shift + + for arg in "$@"; do + testio=`$XFS_IO_PROG -x -c "$arg" $scrub_tgt 2>&1` + echo $testio | grep -q "Unknown type" && \ + _notrun "xfs_io scrub subcommand support is missing" + echo $testio | grep -q "Inappropriate ioctl" && \ + _notrun "kernel scrub ioctl is missing" + echo $testio | grep -q "No such file or directory" && \ + _notrun "kernel does not know about: $arg" + echo $testio | grep -q "Operation not supported" && \ + _notrun "kernel does not support: $arg" + done +} + # Start scrub, freeze, and fsstress in background looping processes, and wait # for 30*TIME_FACTOR seconds to see if the filesystem goes down. 
Callers # must call _scratch_xfs_stress_scrub_cleanup from their cleanup functions. @@ -427,6 +446,8 @@ _scratch_xfs_stress_scrub() { esac done + __stress_scrub_check_commands "$scrub_tgt" "${one_scrub_args[@]}" + local start="$(date +%s)" local end="$((start + (30 * TIME_FACTOR) ))" From patchwork Fri Dec 30 22:12:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13084714 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6406FC4332F for ; Fri, 30 Dec 2022 22:56:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235620AbiL3W4g (ORCPT ); Fri, 30 Dec 2022 17:56:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58150 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231294AbiL3W4f (ORCPT ); Fri, 30 Dec 2022 17:56:35 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66DF062DE; Fri, 30 Dec 2022 14:56:34 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 2196FB81D95; Fri, 30 Dec 2022 22:56:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CFC1BC433EF; Fri, 30 Dec 2022 22:56:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672440991; bh=bWwjLIwEhZ40D4Nf3HRtglU97HndELAaJWvbx7atFOo=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=ksO25vLFy20+DPGV/nZCMurP1ojfcdi35kbOFrtXB7l18KE78aEo1at5ftg1Wa1M3 
wqr2gKJvixoYwTkAybeJ1jEq8m9E0HpB80B2yCasYiBPuguutwlIdWPxnFRh9D1fDJ XfN0ujC3jzfuheDNxSNkXU5LL5wYVAGcW1BJgYU6jc7jqZYuBQaeKyaNA//o0ldrxs gvZ6icc87Y2YJAWknZAA3KrlTjI0CkEqJSqzmw+d9DwUcIueWHVZ4ufzC7E+bGYw1O 4kdk+i9612tcXcTw+n2pbsY6dgObrS/dGOHuBgLw6vzm2FA9XKt5RD8q9ykWVv4GXm X4K1DoSuod81A== Subject: [PATCH 09/16] fuzzy: make scrub stress loop control more robust From: "Darrick J. Wong" To: zlang@redhat.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me Date: Fri, 30 Dec 2022 14:12:54 -0800 Message-ID: <167243837420.694541.15959759084869220605.stgit@magnolia> In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia> References: <167243837296.694541.13203497631389630964.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org From: Darrick J. Wong Currently, each of the scrub stress testing background threads open-codes logic to decide if it should exit the loop. This decision is based entirely on TIME_FACTOR*30 seconds having gone by, which means that we ignore external factors, such as the user pressing ^C, which (in theory) will invoke cleanup functions to tear everything down. This is not a great user experience, so refactor the loop exit test into a helper function and establish a sentinel file that must be present to continue looping. If the user presses ^C, the cleanup function will remove the sentinel file and kill the background thread children, which should be enough to stop everything more or less immediately. Signed-off-by: Darrick J. Wong --- common/fuzzy | 39 ++++++++++++++++++++++++++++----------- 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/common/fuzzy b/common/fuzzy index 8d3e30e32b..6519d5c1e2 100644 --- a/common/fuzzy +++ b/common/fuzzy @@ -338,11 +338,18 @@ __stress_scrub_filter_output() { -e '/No space left on device/d' } +# Decide if we want to keep running stress tests. 
The first argument is the +# stop time, and second argument is the path to the sentinel file. +__stress_scrub_running() { + test -e "$2" && test "$(date +%s)" -lt "$1" +} + # Run fs freeze and thaw in a tight loop. __stress_scrub_freeze_loop() { local end="$1" + local runningfile="$2" - while [ "$(date +%s)" -lt $end ]; do + while __stress_scrub_running "$end" "$runningfile"; do $XFS_IO_PROG -x -c 'freeze' -c 'thaw' $SCRATCH_MNT 2>&1 | \ __stress_freeze_filter_output done @@ -351,15 +358,16 @@ __stress_scrub_freeze_loop() { # Run individual XFS online fsck commands in a tight loop with xfs_io. __stress_one_scrub_loop() { local end="$1" - local scrub_tgt="$2" - shift; shift + local runningfile="$2" + local scrub_tgt="$3" + shift; shift; shift local xfs_io_args=() for arg in "$@"; do xfs_io_args+=('-c' "$arg") done - while [ "$(date +%s)" -lt $end ]; do + while __stress_scrub_running "$end" "$runningfile"; do $XFS_IO_PROG -x "${xfs_io_args[@]}" "$scrub_tgt" 2>&1 | \ __stress_scrub_filter_output done @@ -368,12 +376,16 @@ __stress_one_scrub_loop() { # Run fsstress while we're testing online fsck. __stress_scrub_fsstress_loop() { local end="$1" + local runningfile="$2" local args=$(_scale_fsstress_args -p 4 -d $SCRATCH_MNT -n 2000 $FSSTRESS_AVOID) + echo "Running $FSSTRESS_PROG $args" >> $seqres.full - while [ "$(date +%s)" -lt $end ]; do + while __stress_scrub_running "$end" "$runningfile"; do $FSSTRESS_PROG $args >> $seqres.full + echo "fsstress exits with $? at $(date)" >> $seqres.full done + rm -f "$runningfile" } # Make sure we have everything we need to run stress and scrub @@ -397,6 +409,7 @@ _require_xfs_stress_online_repair() { # Clean up after the loops in case they didn't do it themselves. 
_scratch_xfs_stress_scrub_cleanup() { + rm -f "$runningfile" echo "Cleaning up scrub stress run at $(date)" >> $seqres.full # Send SIGINT so that bash won't print a 'Terminated' message that @@ -436,6 +449,10 @@ __stress_scrub_check_commands() { _scratch_xfs_stress_scrub() { local one_scrub_args=() local scrub_tgt="$SCRATCH_MNT" + local runningfile="$tmp.fsstress" + + rm -f "$runningfile" + touch "$runningfile" OPTIND=1 while getopts "s:t:" c; do @@ -454,17 +471,17 @@ _scratch_xfs_stress_scrub() { echo "Loop started at $(date --date="@${start}")," \ "ending at $(date --date="@${end}")" >> $seqres.full - __stress_scrub_fsstress_loop $end & - __stress_scrub_freeze_loop $end & + __stress_scrub_fsstress_loop "$end" "$runningfile" & + __stress_scrub_freeze_loop "$end" "$runningfile" & if [ "${#one_scrub_args[@]}" -gt 0 ]; then - __stress_one_scrub_loop "$end" "$scrub_tgt" \ + __stress_one_scrub_loop "$end" "$runningfile" "$scrub_tgt" \ "${one_scrub_args[@]}" & fi - # Wait until 2 seconds after the loops should have finished, then - # clean up after ourselves. - while [ "$(date +%s)" -lt $((end + 2)) ]; do + # Wait until the designated end time or fsstress dies, then kill all of + # our background processes. + while __stress_scrub_running "$end" "$runningfile"; do sleep 1 done _scratch_xfs_stress_scrub_cleanup From patchwork Fri Dec 30 22:12:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13084715 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1F3CEC4332F for ; Fri, 30 Dec 2022 22:56:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235636AbiL3W45 (ORCPT ); Fri, 30 Dec 2022 17:56:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58182 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231294AbiL3W4v (ORCPT ); Fri, 30 Dec 2022 17:56:51 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F1E041CB3F; Fri, 30 Dec 2022 14:56:49 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id B2C4AB81D95; Fri, 30 Dec 2022 22:56:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7120CC433D2; Fri, 30 Dec 2022 22:56:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672441007; bh=18Vk5SdrtloXpub5avMYmbLMio9pDxg7VblV6vckqo0=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=PZsefOpVNRJtgBl8inWM25C9zLMbKx6gIDrOzHfBLgXRFYzGZnYTZXKH1LnBI9Ct4 Z4qas8xGFqZzdqL7YSZ/2EUYzVCWRpaS2dlgluiytGsj0TZis/rKTr7CuDf0WnPYX9 q0XVzZJKrEMB4sVZ3U8njuooSW5lTNdZUtL+plv3MrkSmkPFfaNUKiAGDlchq/vDhp w3+plAIbm1nEraQOsD0eg9Ji3b95W0/Q3H382Yu29Nbn0mSUw4Fy255M72gUNWKeUa ALGmamEax6Ncxp3VgDuxsTk9VxUunvgpEZGxaJHbag/cAmB6Pa5ee+A3rEjk/AQzWH UtJ9a1Zui7YTQ== Subject: [PATCH 10/16] fuzzy: abort scrub stress testing if the scratch fs went down From: "Darrick J. 
Wong" To: zlang@redhat.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me Date: Fri, 30 Dec 2022 14:12:54 -0800 Message-ID: <167243837433.694541.10388508931249405710.stgit@magnolia> In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia> References: <167243837296.694541.13203497631389630964.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org From: Darrick J. Wong There's no point in continuing a stress test of online fsck if the filesystem goes down. We can't query that kind of state directly, so as a proxy we try to stat the mountpoint and interpret any error return as a sign that the fs is down. Signed-off-by: Darrick J. Wong --- common/fuzzy | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/common/fuzzy b/common/fuzzy index 6519d5c1e2..f1bc2dc756 100644 --- a/common/fuzzy +++ b/common/fuzzy @@ -338,10 +338,17 @@ __stress_scrub_filter_output() { -e '/No space left on device/d' } +# Decide if the scratch filesystem is still alive. +__stress_scrub_scratch_alive() { + # If we can't stat the scratch filesystem, there's a reasonably good + # chance that the fs shut down, which is not good. + stat "$SCRATCH_MNT" &>/dev/null +} + # Decide if we want to keep running stress tests. The first argument is the # stop time, and second argument is the path to the sentinel file. __stress_scrub_running() { - test -e "$2" && test "$(date +%s)" -lt "$1" + test -e "$2" && test "$(date +%s)" -lt "$1" && __stress_scrub_scratch_alive } # Run fs freeze and thaw in a tight loop. @@ -486,6 +493,10 @@ _scratch_xfs_stress_scrub() { done _scratch_xfs_stress_scrub_cleanup + # Warn the user if we think the scratch filesystem went down. + __stress_scrub_scratch_alive || \ + echo "Did the scratch filesystem die?" 
+ echo "Loop finished at $(date)" >> $seqres.full } From patchwork Fri Dec 30 22:12:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13084716 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5EBFBC4332F for ; Fri, 30 Dec 2022 22:57:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235656AbiL3W5G (ORCPT ); Fri, 30 Dec 2022 17:57:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235647AbiL3W5E (ORCPT ); Fri, 30 Dec 2022 17:57:04 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A92B1B9E2; Fri, 30 Dec 2022 14:57:04 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B820A61C30; Fri, 30 Dec 2022 22:57:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2036DC433F1; Fri, 30 Dec 2022 22:57:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672441023; bh=okdMoSbtkbYrBQB1NH2KJpgYM/rd0QyX4yIcR9uhfL4=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=hQvN9gmE3OqID5QXh+8NVNvPyQfZy0QvQV+RYvrSl6X3ze+b48MDEs4ZsGNVAlaQq 3wy4dp5dC9qGLMWo/13O4OmzUuVT9CdaOta4hajfK0nEFyLlm8RN5ij/R5rBkX5jsh 5tDrp6RxLuYcjd6MPe0zimvbR0hj8+SC0XZRFkKMNyXHU0MJAQL/vCLzmgHbY4NkPj zcX0R50FdmtiCRdr+iI+Oi1MoUJGLEKutA5HvD0u2iw8ep8oUXLOIQ5/EIQe3qxchD Wpv5QWttFOkJgG6rqQn9hydo1TFUxmGQtPP5Ua1HIX17l+P9JPS4u78eMcmqxgD6qy f/N4k8Vso/3yg== Subject: 
[PATCH 11/16] fuzzy: clear out the scratch filesystem if it's too full From: "Darrick J. Wong" To: zlang@redhat.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me Date: Fri, 30 Dec 2022 14:12:54 -0800 Message-ID: <167243837447.694541.8212586612646646637.stgit@magnolia> In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia> References: <167243837296.694541.13203497631389630964.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org From: Darrick J. Wong If the online fsck stress tests run for long enough, they'll fill up the scratch filesystem completely. While it is interesting to test repair functionality on a *nearly* full filesystem undergoing a heavy workload, a totally full filesystem is really only exercising the ENOSPC handlers in the kernel. That's not what we came here to test, so change the fsstress loop to detect a nearly full filesystem and erase everything before starting fsstress again. Signed-off-by: Darrick J. Wong --- common/fuzzy | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/common/fuzzy b/common/fuzzy index f1bc2dc756..01cf7f00d8 100644 --- a/common/fuzzy +++ b/common/fuzzy @@ -380,6 +380,20 @@ __stress_one_scrub_loop() { done } +# Clean the scratch filesystem between rounds of fsstress if there is 2% +# available space or less because that isn't an interesting stress test. +# +# Returns 0 if we cleared anything, and 1 if we did nothing. +__stress_scrub_clean_scratch() { + local used_pct="$(_used $SCRATCH_DEV)" + + test "$used_pct" -lt 98 && return 1 + + echo "Clearing scratch fs at $(date)" >> $seqres.full + rm -r -f $SCRATCH_MNT/p* + return 0 +} + # Run fsstress while we're testing online fsck. 
__stress_scrub_fsstress_loop() { local end="$1" @@ -389,6 +403,8 @@ __stress_scrub_fsstress_loop() { echo "Running $FSSTRESS_PROG $args" >> $seqres.full while __stress_scrub_running "$end" "$runningfile"; do + # Need to recheck running conditions if we cleared anything + __stress_scrub_clean_scratch && continue $FSSTRESS_PROG $args >> $seqres.full echo "fsstress exits with $? at $(date)" >> $seqres.full done From patchwork Fri Dec 30 22:12:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13084717 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0EA0C4332F for ; Fri, 30 Dec 2022 22:57:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235663AbiL3W5W (ORCPT ); Fri, 30 Dec 2022 17:57:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58246 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235647AbiL3W5W (ORCPT ); Fri, 30 Dec 2022 17:57:22 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4EB241B9E2; Fri, 30 Dec 2022 14:57:21 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0E086B81D95; Fri, 30 Dec 2022 22:57:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BE989C433EF; Fri, 30 Dec 2022 22:57:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672441038; bh=28h2JW4kYwgtShJujLRwlnsZt43YsOlA6fDOLH1qXL4=; 
h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=QxX0Ivy+aMHlukTCOMvLoh6SgDhPmz3Tml1xv0Oo05/E4qGjAVCST8SrBll7nAFDQ ixz9uW2i/R+Ak2K2m20PrgbS6z3/2Sdu9N3gKvh1Aj6cH9tq3JO1cYwKUirctzvil/ JsyzY4Afh9NgFvJaJCfluXX261dfaj5zb5FqoCzdsqTt18xrkO6lhVZ2A04wr0z92Y nCOxRsOXWCjbtmqmv+P84q2SAHYctimEOmKDAL5bRivnSbwVoSmunABjM0fMJ8FmH5 EZ25QOfW4WnE9C6HcgUsnuVPXMTIMi5qeq/LabV3mj1Eqdj+36F802eTmEjO0zClFD jYz0mAXQZ6fww== Subject: [PATCH 12/16] fuzzy: increase operation count for each fsstress invocation From: "Darrick J. Wong" To: zlang@redhat.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me Date: Fri, 30 Dec 2022 14:12:54 -0800 Message-ID: <167243837460.694541.14076101650568669658.stgit@magnolia> In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia> References: <167243837296.694541.13203497631389630964.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org From: Darrick J. Wong For online fsck stress testing, increase the number of filesystem operations per fsstress run to 2 million, now that we have the ability to kill fsstress if the user should push ^C to abort the test early. This should guarantee a couple of hours of continuous stress testing in between clearing the scratch filesystem. Signed-off-by: Darrick J. Wong --- common/fuzzy | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/common/fuzzy b/common/fuzzy index 01cf7f00d8..3e23edc9e4 100644 --- a/common/fuzzy +++ b/common/fuzzy @@ -399,7 +399,9 @@ __stress_scrub_fsstress_loop() { local end="$1" local runningfile="$2" - local args=$(_scale_fsstress_args -p 4 -d $SCRATCH_MNT -n 2000 $FSSTRESS_AVOID) + # As of March 2022, 2 million fsstress ops should be enough to keep + # any filesystem busy for a couple of hours. 
+ local args=$(_scale_fsstress_args -p 4 -d $SCRATCH_MNT -n 2000000 $FSSTRESS_AVOID) echo "Running $FSSTRESS_PROG $args" >> $seqres.full while __stress_scrub_running "$end" "$runningfile"; do From patchwork Fri Dec 30 22:12:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13084718 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1586EC4332F for ; Fri, 30 Dec 2022 22:57:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235667AbiL3W5h (ORCPT ); Fri, 30 Dec 2022 17:57:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58278 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235581AbiL3W5g (ORCPT ); Fri, 30 Dec 2022 17:57:36 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7C8F21CB3F; Fri, 30 Dec 2022 14:57:35 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 18C5161C16; Fri, 30 Dec 2022 22:57:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7214FC433D2; Fri, 30 Dec 2022 22:57:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672441054; bh=4tKDUK0K0HxPCY3EkqtsA2OGOFvXBLo51hAlhJealJ4=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=EIeIRkCdP/18NRR+Sj8iYI9VQ87m9zJg/WgOciXoW/X0o8Zqh8jQjyzipAjzotnMX 6YiqFpftcnCB0ixSRTQlKqzBD0AF28k/Tm1OOBvt8Nyq5bAok+l9xGK7q0buIIJpN3 dZuBeu3YhedxsLPSgRpMz6Eck5QCFKES2MrwckDo8/h9q8QH7klh4HVI6+X9xEFyAc 
umjZeOonYX7wpdm+w44H3dT/8t6MOjYiv8r4cBBe00OwMj5tHRPk0xme+7HZG9XC/m fdzHbwYtZfvUPZANGdNSz2Zhf+LAOSDxPTVWiClf61/HwHFiOnf1jvUjroZ0S8w++Q +DD+IgRvQEEQQ== Subject: [PATCH 13/16] fuzzy: clean up frozen fses after scrub stress testing From: "Darrick J. Wong" To: zlang@redhat.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me Date: Fri, 30 Dec 2022 14:12:54 -0800 Message-ID: <167243837474.694541.10883151107803003382.stgit@magnolia> In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia> References: <167243837296.694541.13203497631389630964.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org From: Darrick J. Wong Some of our scrub stress tests involve racing scrub, fsstress, and a program that repeatedly freezes and thaws the scratch filesystem. The current cleanup code suffers from the deficiency that it doesn't actually wait for the child processes to exit. First, change it to do that. However, that exposes a second problem: there's a race condition with a freezer process that leads to the stress test exiting with a frozen fs. If the freezer process is blocked trying to acquire the unmount or sb_write locks, the receipt of a signal (even a fatal one) doesn't cause it to abort the freeze. This causes further problems with fstests, since ./check doesn't expect to regain control with the scratch fs frozen. Fix both problems by making the cleanup function smarter. Signed-off-by: Darrick J. Wong --- common/fuzzy | 35 ++++++++++++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/common/fuzzy b/common/fuzzy index 3e23edc9e4..0f6fc91b80 100644 --- a/common/fuzzy +++ b/common/fuzzy @@ -439,8 +439,39 @@ _scratch_xfs_stress_scrub_cleanup() { # Send SIGINT so that bash won't print a 'Terminated' message that # distorts the golden output.
+ echo "Killing stressor processes at $(date)" >> $seqres.full $KILLALL_PROG -INT xfs_io fsstress >> $seqres.full 2>&1 - $XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT >> $seqres.full 2>&1 + + # Tests are not allowed to exit with the scratch fs frozen. If we + # started a fs freeze/thaw background loop, wait for that loop to exit + # and then thaw the filesystem. Cleanup for the freeze loop must be + # performed prior to waiting for the other children to avoid triggering + # a race condition that can hang fstests. + # + # If the xfs_io -c freeze process is asleep waiting for a write lock on + # s_umount or sb_write when the killall signal is delivered, it will + # not check for pending signals until after it has frozen the fs. If + # even one thread of the stress test processes (xfs_io, fsstress, etc.) + # is waiting for read locks on sb_write when the killall signals are + # delivered, they will block in the kernel until someone thaws the fs, + # and the `wait' below will wait forever. + # + # Hence we issue the killall, wait for the freezer loop to exit, thaw + # the filesystem, and wait for the rest of the children. + if [ -n "$__SCRUB_STRESS_FREEZE_PID" ]; then + echo "Waiting for fs freezer $__SCRUB_STRESS_FREEZE_PID to exit at $(date)" >> $seqres.full + wait "$__SCRUB_STRESS_FREEZE_PID" + + echo "Thawing filesystem at $(date)" >> $seqres.full + $XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT >> $seqres.full 2>&1 + __SCRUB_STRESS_FREEZE_PID="" + fi + + # Wait for the remaining children to exit. 
+ echo "Waiting for children to exit at $(date)" >> $seqres.full + wait + + echo "Cleanup finished at $(date)" >> $seqres.full } # Make sure the provided scrub/repair commands actually work on the scratch @@ -476,6 +507,7 @@ _scratch_xfs_stress_scrub() { local scrub_tgt="$SCRATCH_MNT" local runningfile="$tmp.fsstress" + __SCRUB_STRESS_FREEZE_PID="" rm -f "$runningfile" touch "$runningfile" @@ -498,6 +530,7 @@ _scratch_xfs_stress_scrub() { __stress_scrub_fsstress_loop "$end" "$runningfile" & __stress_scrub_freeze_loop "$end" "$runningfile" & + __SCRUB_STRESS_FREEZE_PID="$!" if [ "${#one_scrub_args[@]}" -gt 0 ]; then __stress_one_scrub_loop "$end" "$runningfile" "$scrub_tgt" \ From patchwork Fri Dec 30 22:12:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13084719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B90E9C3DA7C for ; Fri, 30 Dec 2022 22:57:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235647AbiL3W5w (ORCPT ); Fri, 30 Dec 2022 17:57:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58308 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235581AbiL3W5v (ORCPT ); Fri, 30 Dec 2022 17:57:51 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1CA7D1B9E2; Fri, 30 Dec 2022 14:57:51 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AC34361C30; Fri, 30 Dec 2022 22:57:50 +0000 (UTC) Received: by smtp.kernel.org 
(Postfix) with ESMTPSA id 10B2FC433D2; Fri, 30 Dec 2022 22:57:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672441070; bh=N7ng7JtzN4kW7DNFVLF0bvK5xmgiuIB/v1OJKPdAJqg=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=taPrPJgG1byzU5mjLTth/jW36w3PcBLyc9Cx8eJLLVmFFPwjAOxgGniH5vQP98Sok lzPS72RgDgPOgJsdbSmBfE3m1UIDsQnkSvicZmJMApym8YulD7BvntJ735pu1Zx/dO GQ0xfAtt/k/7CT4y/dmTFXZo3BD/9Jhgt5A37h4kKpZj3GnUI/D6JDJ/0DaGiXXyTo Eknj2+OHHHEKhQQA1cFSHjIZF+v8OC8+R2c03b/OV5apd+yfB/hrqAuHPxYwp0hHx7 9iAcw1NSJeqJtBPpsM5rdq1sTgaLuDYc4pEXnh1eUmNSYHfCFCLUc3K3U7y1LxT5zn NygyaaUMh2DNA== Subject: [PATCH 14/16] fuzzy: make freezing optional for scrub stress tests From: "Darrick J. Wong" To: zlang@redhat.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me Date: Fri, 30 Dec 2022 14:12:54 -0800 Message-ID: <167243837487.694541.11855121854386930402.stgit@magnolia> In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia> References: <167243837296.694541.13203497631389630964.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org From: Darrick J. Wong Make the freeze/thaw loop optional, since that's a significant change in behavior if it's enabled. Signed-off-by: Darrick J. Wong --- common/fuzzy | 13 ++++++++++--- tests/xfs/422 | 2 +- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/common/fuzzy b/common/fuzzy index 0f6fc91b80..219dd3bb0a 100644 --- a/common/fuzzy +++ b/common/fuzzy @@ -499,6 +499,8 @@ __stress_scrub_check_commands() { # # Various options include: # +# -f Run a freeze/thaw loop while we're doing other things. Defaults to +# disabled, unless XFS_SCRUB_STRESS_FREEZE is set. # -s Pass this command to xfs_io to test scrub. If zero -s options are # specified, xfs_io will not be run. # -t Run online scrub against this file; $SCRATCH_MNT is the default. 
@@ -506,14 +508,16 @@ _scratch_xfs_stress_scrub() {
 	local one_scrub_args=()
 	local scrub_tgt="$SCRATCH_MNT"
 	local runningfile="$tmp.fsstress"
+	local freeze="${XFS_SCRUB_STRESS_FREEZE}"
 
 	__SCRUB_STRESS_FREEZE_PID=""
 	rm -f "$runningfile"
 	touch "$runningfile"
 
 	OPTIND=1
-	while getopts "s:t:" c; do
+	while getopts "fs:t:" c; do
 		case "$c" in
+			f) freeze=yes;;
 			s) one_scrub_args+=("$OPTARG");;
 			t) scrub_tgt="$OPTARG";;
 			*) return 1; ;;
@@ -529,8 +533,11 @@ _scratch_xfs_stress_scrub() {
 		"ending at $(date --date="@${end}")" >> $seqres.full
 
 	__stress_scrub_fsstress_loop "$end" "$runningfile" &
-	__stress_scrub_freeze_loop "$end" "$runningfile" &
-	__SCRUB_STRESS_FREEZE_PID="$!"
+
+	if [ -n "$freeze" ]; then
+		__stress_scrub_freeze_loop "$end" "$runningfile" &
+		__SCRUB_STRESS_FREEZE_PID="$!"
+	fi
 
 	if [ "${#one_scrub_args[@]}" -gt 0 ]; then
 		__stress_one_scrub_loop "$end" "$runningfile" "$scrub_tgt" \
diff --git a/tests/xfs/422 b/tests/xfs/422
index faea5d6792..ac88713257 100755
--- a/tests/xfs/422
+++ b/tests/xfs/422
@@ -31,7 +31,7 @@ _require_xfs_stress_online_repair
 _scratch_mkfs > "$seqres.full" 2>&1
 _scratch_mount
 _require_xfs_has_feature "$SCRATCH_MNT" rmapbt
-_scratch_xfs_stress_online_repair -s "repair rmapbt 0" -s "repair rmapbt 1"
+_scratch_xfs_stress_online_repair -f -s "repair rmapbt 0" -s "repair rmapbt 1"
 
 # success, all done
 echo Silence is golden

From patchwork Fri Dec 30 22:12:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13084720
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id A3816C4332F
	for ; Fri, 30 Dec 2022 22:58:10 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235669AbiL3W6K (ORCPT ); Fri, 30 Dec 2022 17:58:10 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58352 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S235671AbiL3W6J (ORCPT );
	Fri, 30 Dec 2022 17:58:09 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 673B01CB3F;
	Fri, 30 Dec 2022 14:58:08 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id 12B70B81DA0;
	Fri, 30 Dec 2022 22:58:07 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id AF14DC433EF;
	Fri, 30 Dec 2022 22:58:05 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672441085;
	bh=BfjqOk88ZtFx6eLVmz4NC6IzcTQjwPAa+NLXr7NEQ7Y=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=iIAjnb94P8z1z1WiIeXvb1frkWxgSdVAzbuf5SsrlUoaqkOYP8iCi0OGi0cv63VRv
	 TH7AQCUGIqI154rov3rPj0+ORTgXau146vDJfO87UGzXoefsXbwDxyjaD97TqB9kgb
	 gjECpvvzt0kHjl0Aie5upRtmkobRA0EEtSD+hjJDF6Mic1JXHHzen1AaMZzjMK2woh
	 c/jCKKQwKf9ZT5DKKx1fhAR1HR3GQnznXM9y1qhe4yyRLaqo2ru+Z9jeuTTCdL6ogt
	 WIgBvunckqcvYM2OXThzESh4OdgAgOIYKHg6k/7osVlwjh5kgbVHc9yBf9CJQwW9By
	 WgeOXm7yjSzdQ==
Subject: [PATCH 15/16] fuzzy: allow substitution of AG numbers when
 configuring scrub stress test
From: "Darrick J. Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:55 -0800
Message-ID: <167243837501.694541.13900520713966204152.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Allow the test program to use the metavariable '%agno%' when passing
scrub commands to the scrub stress loop. This makes it easier for tests
to scrub or repair every AG in the filesystem without a lot of work.

Signed-off-by: Darrick J. Wong
---
 common/fuzzy  | 14 ++++++++++++--
 tests/xfs/422 |  2 +-
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/common/fuzzy b/common/fuzzy
index 219dd3bb0a..e42e2ccec1 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -368,10 +368,19 @@ __stress_one_scrub_loop() {
 	local runningfile="$2"
 	local scrub_tgt="$3"
 	shift; shift; shift
+	local agcount="$(_xfs_mount_agcount $SCRATCH_MNT)"
 
 	local xfs_io_args=()
 	for arg in "$@"; do
-		xfs_io_args+=('-c' "$arg")
+		if echo "$arg" | grep -q -w '%agno%'; then
+			# Substitute the AG number
+			for ((agno = 0; agno < agcount; agno++)); do
+				local ag_arg="$(echo "$arg" | sed -e "s|%agno%|$agno|g")"
+				xfs_io_args+=('-c' "$ag_arg")
+			done
+		else
+			xfs_io_args+=('-c' "$arg")
+		fi
 	done
 
 	while __stress_scrub_running "$end" "$runningfile"; do
@@ -481,7 +490,8 @@ __stress_scrub_check_commands() {
 	shift
 
 	for arg in "$@"; do
-		testio=`$XFS_IO_PROG -x -c "$arg" $scrub_tgt 2>&1`
+		local cooked_arg="$(echo "$arg" | sed -e "s/%agno%/0/g")"
+		testio=`$XFS_IO_PROG -x -c "$cooked_arg" $scrub_tgt 2>&1`
 		echo $testio | grep -q "Unknown type" && \
 			_notrun "xfs_io scrub subcommand support is missing"
 		echo $testio | grep -q "Inappropriate ioctl" && \
diff --git a/tests/xfs/422 b/tests/xfs/422
index ac88713257..995f612166 100755
--- a/tests/xfs/422
+++ b/tests/xfs/422
@@ -31,7 +31,7 @@ _require_xfs_stress_online_repair
 _scratch_mkfs > "$seqres.full" 2>&1
 _scratch_mount
 _require_xfs_has_feature "$SCRATCH_MNT" rmapbt
-_scratch_xfs_stress_online_repair -f -s "repair rmapbt 0" -s "repair rmapbt 1"
+_scratch_xfs_stress_online_repair -f -s "repair rmapbt %agno%"
 
 # success, all done
 echo Silence is golden

From patchwork Fri Dec 30 22:12:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13084721
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 20FBFC4332F
	for ; Fri, 30 Dec 2022 22:58:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235675AbiL3W63 (ORCPT ); Fri, 30 Dec 2022 17:58:29 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58374 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S235671AbiL3W6Y (ORCPT );
	Fri, 30 Dec 2022 17:58:24 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org
	[IPv6:2604:1380:4601:e00::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F11821B9E2;
	Fri, 30 Dec 2022 14:58:23 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id 85E19B81DA0;
	Fri, 30 Dec 2022 22:58:22 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4BE6EC433D2;
	Fri, 30 Dec 2022 22:58:21 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1672441101;
	bh=qJOgOtApFt1FimPSiqKSnn6aglt4aNYat6b7oUOZ3GM=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=inYLKP5G0/nDr/CJK1kirxnuma2UNJBxxeeRUJZNu1UR1RYX5DixJks5WHxy6+H6V
	 1SIl1QltY5SpBJVWvChX3CmPIvVScKQcI3tPvIIQbGDhcZxo31Mb9FCFfBOwVYk5cf
	 VcvizWv3lFS1GkOh7KYmZLGfJprgfWBDvdygOxOtou0M0VqgIqRLR/PwP9k+PjJ3op
	 Kw4uuAvoIXo/dmsc5mCrzSKXKCNmj1IDttzY7LjWcC0PzVc6pcMAw5C+z2kMZrpMCg
	 Wbs8LWZGOxxP9HjINWV3LTGEFAk+9qwKQfW/5n3rXHOs2BufDiS3jKAtrhzDSpfbet
	 w3CJQdvCglF0w==
Subject: [PATCH 16/16] fuzzy: delay the start of the scrub loop when
 stress-testing scrub
From: "Darrick J. Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Fri, 30 Dec 2022 14:12:55 -0800
Message-ID: <167243837514.694541.17252818873640821069.stgit@magnolia>
In-Reply-To: <167243837296.694541.13203497631389630964.stgit@magnolia>
References: <167243837296.694541.13203497631389630964.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

By default, online fsck stress testing kicks off the loops for fsstress
and online fsck at the same time. However, in certain debugging
scenarios it can help if we let fsstress get a head-start in filling up
the filesystem. Plumb in a means to delay the start of the scrub loop.

Signed-off-by: Darrick J. Wong
---
 common/fuzzy | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/common/fuzzy b/common/fuzzy
index e42e2ccec1..1df51a6dd8 100644
--- a/common/fuzzy
+++ b/common/fuzzy
@@ -367,7 +367,8 @@ __stress_one_scrub_loop() {
 	local end="$1"
 	local runningfile="$2"
 	local scrub_tgt="$3"
-	shift; shift; shift
+	local scrub_startat="$4"
+	shift; shift; shift; shift
 	local agcount="$(_xfs_mount_agcount $SCRATCH_MNT)"
 
 	local xfs_io_args=()
@@ -383,6 +384,10 @@ __stress_one_scrub_loop() {
 		fi
 	done
 
+	while __stress_scrub_running "$scrub_startat" "$runningfile"; do
+		sleep 1
+	done
+
 	while __stress_scrub_running "$end" "$runningfile"; do
 		$XFS_IO_PROG -x "${xfs_io_args[@]}" "$scrub_tgt" 2>&1 | \
 			__stress_scrub_filter_output
@@ -514,22 +519,27 @@ __stress_scrub_check_commands() {
 # -s	Pass this command to xfs_io to test scrub. If zero -s options are
 #	specified, xfs_io will not be run.
 # -t	Run online scrub against this file; $SCRATCH_MNT is the default.
+# -w	Delay the start of the scrub/repair loop by this number of seconds.
+#	Defaults to no delay unless XFS_SCRUB_STRESS_DELAY is set. This value
+#	will be clamped to ten seconds before the end time.
 _scratch_xfs_stress_scrub() {
 	local one_scrub_args=()
 	local scrub_tgt="$SCRATCH_MNT"
 	local runningfile="$tmp.fsstress"
 	local freeze="${XFS_SCRUB_STRESS_FREEZE}"
+	local scrub_delay="${XFS_SCRUB_STRESS_DELAY:--1}"
 
 	__SCRUB_STRESS_FREEZE_PID=""
 	rm -f "$runningfile"
 	touch "$runningfile"
 
 	OPTIND=1
-	while getopts "fs:t:" c; do
+	while getopts "fs:t:w:" c; do
 		case "$c" in
 			f) freeze=yes;;
 			s) one_scrub_args+=("$OPTARG");;
 			t) scrub_tgt="$OPTARG";;
+			w) scrub_delay="$OPTARG";;
 			*) return 1; ;;
 		esac
 	done
@@ -538,6 +548,9 @@ _scratch_xfs_stress_scrub() {
 
 	local start="$(date +%s)"
 	local end="$((start + (30 * TIME_FACTOR) ))"
+	local scrub_startat="$((start + scrub_delay))"
+	test "$scrub_startat" -gt "$((end - 10))" &&
+		scrub_startat="$((end - 10))"
 
 	echo "Loop started at $(date --date="@${start}")," \
 		"ending at $(date --date="@${end}")" >> $seqres.full
@@ -551,7 +564,7 @@ _scratch_xfs_stress_scrub() {
 
 	if [ "${#one_scrub_args[@]}" -gt 0 ]; then
 		__stress_one_scrub_loop "$end" "$runningfile" "$scrub_tgt" \
-			"$scrub_startat_placeholder_removed" &
 	fi
 
 	# Wait until the designated end time or fsstress dies, then kill all of