Message ID | patch-1.1-d7e88a77fef-20211009T133043Z-avarab@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | test-lib.sh: try to re-chmod & retry on failed trash removal | expand |
On Sat, Oct 09, 2021 at 03:31:02PM +0200, Ævar Arnfjörð Bjarmason wrote: > This fixes a regression in [1] where t0004-unwritable.sh was made to > use "test_when_finished" for cleanup, we wouldn't re-chmod the > ".git/objects" on failure, leading to a persistent error when running > the test. > > This can be demonstrated as e.g. (output snipped for less verbosity): > > $ ./t0004-unwritable.sh --run=3 --immediate > ok 1 # skip setup (--run) > ok 2 # skip write-tree should notice unwritable repository (--run) > not ok 3 - commit should notice unwritable repository > [...] > $ ./t0004-unwritable.sh --run=3 --immediate > rm: cannot remove '[...]/trash directory.t0004-unwritable/.git/objects/info': Permission denied > FATAL: Cannot prepare test area > [...] > > I.e. if any of its tests failed, and the tests were being run under > "--debug"[2] or "--immediate"[3] (which was introduced after [1]) we '--debug' runs the tests as usual, including any 'test_when_finished' commands, so I don't see how it would be affected. And indeed running './t0004-unwritable.sh --run=3 --debug' repeatedly doesn't have any issues with removing the trash directory of the previous run. > wouldn't re-chmod the object directory. We'd then fail on the next run > since the test setup couldn't remove the trash files. > > Instead of some version of reverting [1] let's make the test-lib.sh > resilient to this edge-case, it will happen due to [1], but also > e.g. if the relevant "test-lib.sh" process is kill -9'd during the > test run. We should try harder to recover in this case. If we fail to > remove the test directory let's retry after (re-)chmod-ing it. > > This doesn't need to be guarded by something that's equivalent to > "POSIXPERM" since if we don't support "chmod" we were about to fail > anyway. Let's also discard any error output from (a possibly > nonexisting) "chmod", we'll fail on the subsequent "rm -rf" > anyway. > > The lack of &&-chaining between the "chmod" and "rm -rf" is > intentional, if we fail the first "rm -rf", can't chmod, but then > succeed the second time around that's what we were hoping for. We just > want to nuke the directory, not carry forward every possible error > code or error message. > > 1. dbda967684d (t0004 (unwritable files): simplify error handling, > 2010-09-06) > 2. 5a4a088add3 (test-lib: do not remove trash_directory if called with > --debug, 2008-08-21) > 3. b586744a864 (test: skip clean-up when running under --immediate > mode, 2011-06-27) > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > t/test-lib.sh | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/t/test-lib.sh b/t/test-lib.sh > index aa1ad8180ed..706ce0cdeb9 100644 > --- a/t/test-lib.sh > +++ b/t/test-lib.sh > @@ -1210,10 +1210,10 @@ test_done () { > error "Tests passed but trash directory already removed before test cleanup; aborting" > > cd "$TRASH_DIRECTORY/.." && > - rm -fr "$TRASH_DIRECTORY" || { > + remove_trash_directory "$TRASH_DIRECTORY" || { > # try again in a bit > sleep 5; > - rm -fr "$TRASH_DIRECTORY" > + remove_trash_directory "$TRASH_DIRECTORY" > } || > error "Tests passed but test cleanup failed; aborting" > fi > @@ -1370,6 +1370,18 @@ then > exit 1 > fi > > +# Try really hard to clean up our mess > +remove_trash_directory() { > + dir="$1" > + if ! rm -rf "$dir" > + then > + say_color info >&3 "Failed to remove trash directory, trying to re-chmod it first..." Is this message really necessary? I don't think it is, and there was no similar message in the previous hunk around that 'sleep 5' either. > + chmod -R u+w "$dir" 2>/dev/null > + rm -rf "$dir" > + fi > + ! test -d "$dir" > +} > + > # Are we running this test at all? > remove_trash= > this_test=${0##*/} > @@ -1388,7 +1400,7 @@ GNUPGHOME="$HOME/gnupg-home-not-used" > export HOME GNUPGHOME USER_HOME > > # Test repository > -rm -fr "$TRASH_DIRECTORY" || { > +remove_trash_directory "$TRASH_DIRECTORY" || { > GIT_EXIT_OK=t > echo >&5 "FATAL: Cannot prepare test area" > exit 1 > -- > 2.33.0.1492.gcc3b81fc59a >
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > This can be demonstrated as e.g. (output snipped for less verbosity): > > $ ./t0004-unwritable.sh --run=3 --immediate > ok 1 # skip setup (--run) > ok 2 # skip write-tree should notice unwritable repository (--run) > not ok 3 - commit should notice unwritable repository > [...] > $ ./t0004-unwritable.sh --run=3 --immediate > rm: cannot remove '[...]/trash directory.t0004-unwritable/.git/objects/info': Permission denied > FATAL: Cannot prepare test area > [...] > > I.e. if any of its tests failed, and the tests were being run under > "--debug"[2] or "--immediate"[3] (which was introduced after [1]) we > wouldn't re-chmod the object directory. We'd then fail on the next run > since the test setup couldn't remove the trash files. OK, the first thing these scripts do is to create $TRASH_DIRECTORY for their own use, and that is what fails (hence "Cannot prepare"). And making sure we try harder to prepare would be a good thing. Given that understanding of the motivation... > @@ -1210,10 +1210,10 @@ test_done () { ...it is puzzling why we are touching the code that happens _after_ a test is run. If our philosophy is to leave the mess after a test fails so that the debugging person can take a look without contaminating the forensics data, but blindly and forcibly clean up before starting a new round, reusing the "do more than usual to remove" logic meant to be used for the latter here feels counter productive and also a bit counter intuitive. > error "Tests passed but trash directory already removed before test cleanup; aborting" > > cd "$TRASH_DIRECTORY/.." && > - rm -fr "$TRASH_DIRECTORY" || { > + remove_trash_directory "$TRASH_DIRECTORY" || { > # try again in a bit > sleep 5; > - rm -fr "$TRASH_DIRECTORY" > + remove_trash_directory "$TRASH_DIRECTORY" > } || > error "Tests passed but test cleanup failed; aborting" > fi > +# Try really hard to clean up our mess > +remove_trash_directory() { > + dir="$1" > + if ! rm -rf "$dir" > + then > + say_color info >&3 "Failed to remove trash directory, trying to re-chmod it first..." > + chmod -R u+w "$dir" 2>/dev/null If a test lost searchable bit from directories, "u+wx" may be necessary to clean fully, no? > + rm -rf "$dir" > + fi > + ! test -d "$dir" > +} > + On the other hand... > # Are we running this test at all? > remove_trash= > this_test=${0##*/} > @@ -1388,7 +1400,7 @@ GNUPGHOME="$HOME/gnupg-home-not-used" > export HOME GNUPGHOME USER_HOME > > # Test repository > -rm -fr "$TRASH_DIRECTORY" || { > +remove_trash_directory "$TRASH_DIRECTORY" || { ... this one does make sense and matches what the proposed log message sold us the change as. > GIT_EXIT_OK=t > echo >&5 "FATAL: Cannot prepare test area" > exit 1 Thanks.
Junio C Hamano <gitster@pobox.com> writes: >> + chmod -R u+w "$dir" 2>/dev/null > > If a test lost searchable bit from directories, "u+wx" may be > necessary to clean fully, no? Sorry, but I was stupid. If we were to worry about losing the searchable bit, then I should considered the possibility that we may also lose the readable bit, too.
diff --git a/t/test-lib.sh b/t/test-lib.sh index aa1ad8180ed..706ce0cdeb9 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1210,10 +1210,10 @@ test_done () { error "Tests passed but trash directory already removed before test cleanup; aborting" cd "$TRASH_DIRECTORY/.." && - rm -fr "$TRASH_DIRECTORY" || { + remove_trash_directory "$TRASH_DIRECTORY" || { # try again in a bit sleep 5; - rm -fr "$TRASH_DIRECTORY" + remove_trash_directory "$TRASH_DIRECTORY" } || error "Tests passed but test cleanup failed; aborting" fi @@ -1370,6 +1370,18 @@ then exit 1 fi +# Try really hard to clean up our mess +remove_trash_directory() { + dir="$1" + if ! rm -rf "$dir" + then + say_color info >&3 "Failed to remove trash directory, trying to re-chmod it first..." + chmod -R u+w "$dir" 2>/dev/null + rm -rf "$dir" + fi + ! test -d "$dir" +} + # Are we running this test at all? remove_trash= this_test=${0##*/} @@ -1388,7 +1400,7 @@ GNUPGHOME="$HOME/gnupg-home-not-used" export HOME GNUPGHOME USER_HOME # Test repository -rm -fr "$TRASH_DIRECTORY" || { +remove_trash_directory "$TRASH_DIRECTORY" || { GIT_EXIT_OK=t echo >&5 "FATAL: Cannot prepare test area" exit 1
This fixes a regression in [1] where t0004-unwritable.sh was made to use "test_when_finished" for cleanup, we wouldn't re-chmod the ".git/objects" on failure, leading to a persistent error when running the test. This can be demonstrated as e.g. (output snipped for less verbosity): $ ./t0004-unwritable.sh --run=3 --immediate ok 1 # skip setup (--run) ok 2 # skip write-tree should notice unwritable repository (--run) not ok 3 - commit should notice unwritable repository [...] $ ./t0004-unwritable.sh --run=3 --immediate rm: cannot remove '[...]/trash directory.t0004-unwritable/.git/objects/info': Permission denied FATAL: Cannot prepare test area [...] I.e. if any of its tests failed, and the tests were being run under "--debug"[2] or "--immediate"[3] (which was introduced after [1]) we wouldn't re-chmod the object directory. We'd then fail on the next run since the test setup couldn't remove the trash files. Instead of some version of reverting [1] let's make the test-lib.sh resilient to this edge-case, it will happen due to [1], but also e.g. if the relevant "test-lib.sh" process is kill -9'd during the test run. We should try harder to recover in this case. If we fail to remove the test directory let's retry after (re-)chmod-ing it. This doesn't need to be guarded by something that's equivalent to "POSIXPERM" since if we don't support "chmod" we were about to fail anyway. Let's also discard any error output from (a possibly nonexisting) "chmod", we'll fail on the subsequent "rm -rf" anyway. The lack of &&-chaining between the "chmod" and "rm -rf" is intentional, if we fail the first "rm -rf", can't chmod, but then succeed the second time around that's what we were hoping for. We just want to nuke the directory, not carry forward every possible error code or error message. 1. dbda967684d (t0004 (unwritable files): simplify error handling, 2010-09-06) 2. 5a4a088add3 (test-lib: do not remove trash_directory if called with --debug, 2008-08-21) 3. b586744a864 (test: skip clean-up when running under --immediate mode, 2011-06-27) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- t/test-lib.sh | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)