Message ID | 47cecb4a83a3f726088ffba0b00679384c7349ae.1574374826.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Improve testability with GIT_TEST_FSMONITOR |
On Thu, Nov 21, 2019 at 10:20:26PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/test-lib-functions.sh | 15 +++++++++++++++
>  t/test-lib.sh           |  2 ++
>  2 files changed, 17 insertions(+)
>
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index e0b3f28d3a..03573caf42 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1475,3 +1475,18 @@ test_set_port () {
>  	port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
>  	eval $var=$port
>  }
> +
> +test_clear_watchman () {
> +	if test $GIT_TEST_FSMONITOR -ne ""

In the rare cases when this function is invoked (see below) this
condition triggers an error from the shell running test script:

  - when the variable is not set, because of the lack of quotes around
    the variable name:

      $ ./t5570-git-daemon.sh
      [....]
      ok 21 - hostname interpolation works after LF-stripping
      ./t5570-git-daemon.sh: 1482: test: -ne: unexpected operator
      # passed all 21 test(s)
      1..21

  - when the variable is set, because the '-ne' operator does integer
    comparison:

      $ GIT_TEST_FSMONITOR="$PWD"/t7519/fsmonitor-none ./t5570-git-daemon.sh
      [...]
      ok 21 - hostname interpolation works after LF-stripping
      ./t5570-git-daemon.sh: 1482: test: Illegal number: /home/szeder/src/git/t/t7519/fsmonitor-none
      # failed 1 among 21 test(s)
      1..21

Please use 'if test -n "$GIT_TEST_FSMONITOR"' instead.

> +	then
> +		watchman watch-list |

Then with the above fixed, trying to run 'watchman' triggers another
error if it's not installed:

  $ GIT_TEST_FSMONITOR="$PWD"/t7519/fsmonitor-none ./t5570-git-daemon.sh
  [...]
  ok 21 - hostname interpolation works after LF-stripping
  ./t5570-git-daemon.sh: 1484: ./t5570-git-daemon.sh: watchman: not found
  # failed 1 among 21 test(s)

I think we need an additional condition to run this only if
't7519/fsmonitor-watchman' is used in the tests.

> +			grep "$TRASH_DIRECTORY" |
> +			sed "s/\t\"//g" |
> +			sed "s/\",//g" >repo-list
> +
> +		for repo in $(cat repo-list)
> +		do
> +			watchman watch-del "$repo"
> +		done
> +	fi
> +}
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 30b07e310f..067a432ea5 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1072,6 +1072,8 @@ test_atexit_handler () {
>  	# sure that the registered cleanup commands are run only once.
>  	test : != "$test_atexit_cleanup" || return 0
>
> +	test_clear_watchman

I'm not sure where to put this call, but this is definitely not the
right place for it.  See that 'return 0' above in the context?  That's
where the test_atexit_handler function returns early when no atexit
handler commands are set, i.e. in all test scripts that don't involve
some kind of daemons, thus this call is not invoked in the majority of
test scripts.

Simply moving this call before that early return is not good, because
then it would be invoked twice.

An option would be to register this call as an atexit command
somewhere late in 'test-lib.sh' (around where GIT_TEST_GETTEXT_POISON
is restored, perhaps).  That way it would be invoked most of the time,
and it would be invoked only once, but I'm not sure how it would work
out with test scripts that unset GIT_TEST_FSMONITOR somewhere in the
middle for the remainder of the test script.  However, register the
atexit command only if GIT_TEST_FSMONITOR is set (to something
watchman-specific), so it won't be invoked at all if
GIT_TEST_FSMONITOR is not set, and thus it won't generate additional
test output and trace.

I don't have a better idea.

> +
>  	setup_malloc_check
>  	test_eval_ "$test_atexit_cleanup"
>  	test_atexit_cleanup=:
> --
> gitgitgadget
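
As an illustration of the two points raised in the review above (the
'test -n' check and the missing 'watchman' binary), the guard could be
sketched roughly as follows.  The helper name 'should_clear_watchman'
and the '*/fsmonitor-watchman' pattern are assumptions made for this
example; they are not part of the patch.

```shell
# Hypothetical helper (not in the patch): succeed only when watchman
# cleanup makes sense for the current test run.
should_clear_watchman () {
	# -n checks for a non-empty string; the quotes keep the test
	# well-formed even when the variable is unset or empty.
	test -n "${GIT_TEST_FSMONITOR-}" || return 1

	# Assumption: only the watchman hook script needs cleanup.
	case "$GIT_TEST_FSMONITOR" in
	*/fsmonitor-watchman) ;;
	*) return 1 ;;
	esac

	# Stay quiet instead of failing when watchman is not installed.
	command -v watchman >/dev/null 2>&1
}
```

With GIT_TEST_FSMONITOR unset, or pointing at 't7519/fsmonitor-none',
the helper returns non-zero without printing anything, so a caller
could simply write 'should_clear_watchman || return 0'.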
On 11/21/2019 8:06 PM, SZEDER Gábor wrote:

Thanks for this message. Sorry I'm so late getting back to it.

> On Thu, Nov 21, 2019 at 10:20:26PM +0000, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> ---
>>  t/test-lib-functions.sh | 15 +++++++++++++++
>>  t/test-lib.sh           |  2 ++
>>  2 files changed, 17 insertions(+)
>>
>> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
>> index e0b3f28d3a..03573caf42 100644
>> --- a/t/test-lib-functions.sh
>> +++ b/t/test-lib-functions.sh
>> @@ -1475,3 +1475,18 @@ test_set_port () {
>>  	port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
>>  	eval $var=$port
>>  }
>> +
>> +test_clear_watchman () {
>> +	if test $GIT_TEST_FSMONITOR -ne ""
>
> In the rare cases when this function is invoked (see below) this
> condition triggers an error from the shell running test script:
>
>   - when the variable is not set, because of the lack of quotes around
>     the variable name:
>
>       $ ./t5570-git-daemon.sh
>       [....]
>       ok 21 - hostname interpolation works after LF-stripping
>       ./t5570-git-daemon.sh: 1482: test: -ne: unexpected operator
>       # passed all 21 test(s)
>       1..21
>
>   - when the variable is set, because the '-ne' operator does integer
>     comparison:
>
>       $ GIT_TEST_FSMONITOR="$PWD"/t7519/fsmonitor-none ./t5570-git-daemon.sh
>       [...]
>       ok 21 - hostname interpolation works after LF-stripping
>       ./t5570-git-daemon.sh: 1482: test: Illegal number: /home/szeder/src/git/t/t7519/fsmonitor-none
>       # failed 1 among 21 test(s)
>       1..21
>
> Please use 'if test -n "$GIT_TEST_FSMONITOR"' instead.

Thanks for the pointers.

>> +	then
>> +		watchman watch-list |
>
> Then with the above fixed, trying to run 'watchman' triggers another
> error if it's not installed:
>
>   $ GIT_TEST_FSMONITOR="$PWD"/t7519/fsmonitor-none ./t5570-git-daemon.sh
>   [...]
>   ok 21 - hostname interpolation works after LF-stripping
>   ./t5570-git-daemon.sh: 1484: ./t5570-git-daemon.sh: watchman: not found
>   # failed 1 among 21 test(s)
>
> I think we need an additional condition to run this only if
> 't7519/fsmonitor-watchman' is used in the tests.

The intention is to enable a test-suite-wide run using GIT_TEST_FSMONITOR,
and that can only use watchman (currently). Barring wanting to unset the
variable if it was set on purpose in a test script, the other options do
not actually return correct values to make use of the feature.

>> +			grep "$TRASH_DIRECTORY" |
>> +			sed "s/\t\"//g" |
>> +			sed "s/\",//g" >repo-list
>> +
>> +		for repo in $(cat repo-list)
>> +		do
>> +			watchman watch-del "$repo"
>> +		done
>> +	fi
>> +}
>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>> index 30b07e310f..067a432ea5 100644
>> --- a/t/test-lib.sh
>> +++ b/t/test-lib.sh
>> @@ -1072,6 +1072,8 @@ test_atexit_handler () {
>>  	# sure that the registered cleanup commands are run only once.
>>  	test : != "$test_atexit_cleanup" || return 0
>>
>> +	test_clear_watchman
>
> I'm not sure where to put this call, but this is definitely not the
> right place for it.  See that 'return 0' above in the context?  That's
> where the test_atexit_handler function returns early when no atexit
> handler commands are set, i.e. in all test scripts that don't involve
> some kind of daemons, thus this call is not invoked in the majority of
> test scripts.

Ah, I misunderstood the point of test_atexit_handler.

> Simply moving this call before that early return is not good, because
> then it would be invoked twice.
>
> An option would be to register this call as an atexit command
> somewhere late in 'test-lib.sh' (around where GIT_TEST_GETTEXT_POISON
> is restored, perhaps).  That way it would be invoked most of the time,
> and it would be invoked only once, but I'm not sure how it would work
> out with test scripts that unset GIT_TEST_FSMONITOR somewhere in the
> middle for the remainder of the test script.  However, register the
> atexit command only if GIT_TEST_FSMONITOR is set (to something
> watchman-specific), so it won't be invoked at all if
> GIT_TEST_FSMONITOR is not set, and thus it won't generate additional
> test output and trace.
>
> I don't have a better idea.

Shouldn't it be sufficient to add it into test_done? If the test fails,
then we could leave watches open, but that's no worse than we had without
this test_clear_watchman method.

Thanks,
-Stolee
On Mon, Dec 09, 2019 at 09:12:37AM -0500, Derrick Stolee wrote:
> >> +		watchman watch-list |
> >
> > Then with the above fixed, trying to run 'watchman' triggers another
> > error if it's not installed:
> >
> >   $ GIT_TEST_FSMONITOR="$PWD"/t7519/fsmonitor-none ./t5570-git-daemon.sh
> >   [...]
> >   ok 21 - hostname interpolation works after LF-stripping
> >   ./t5570-git-daemon.sh: 1484: ./t5570-git-daemon.sh: watchman: not found
> >   # failed 1 among 21 test(s)
> >
> > I think we need an additional condition to run this only if
> > 't7519/fsmonitor-watchman' is used in the tests.
>
> The intention is to enable a test-suite-wide run using GIT_TEST_FSMONITOR,
> and that can only use watchman (currently).

I've just run 'GIT_TEST_FSMONITOR=$(pwd)/t7519/fsmonitor-all make',
and it only failed one test in 't0090-cache-tree.sh', but the fix is
already in 'pu' in 61eea521fe (fsmonitor: do not compare bitmap size
with size of split index, 2019-11-13).

> >> diff --git a/t/test-lib.sh b/t/test-lib.sh
> >> index 30b07e310f..067a432ea5 100644
> >> --- a/t/test-lib.sh
> >> +++ b/t/test-lib.sh
> >> @@ -1072,6 +1072,8 @@ test_atexit_handler () {
> >>  	# sure that the registered cleanup commands are run only once.
> >>  	test : != "$test_atexit_cleanup" || return 0
> >>
> >> +	test_clear_watchman
> >
> > I'm not sure where to put this call, but this is definitely not the
> > right place for it.  See that 'return 0' above in the context?  That's
> > where the test_atexit_handler function returns early when no atexit
> > handler commands are set, i.e. in all test scripts that don't involve
> > some kind of daemons, thus this call is not invoked in the majority of
> > test scripts.
>
> Ah, I misunderstood the point of test_atexit_handler.
>
> > Simply moving this call before that early return is not good, because
> > then it would be invoked twice.
> >
> > An option would be to register this call as an atexit command
> > somewhere late in 'test-lib.sh' (around where GIT_TEST_GETTEXT_POISON
> > is restored, perhaps).  That way it would be invoked most of the time,
> > and it would be invoked only once, but I'm not sure how it would work
> > out with test scripts that unset GIT_TEST_FSMONITOR somewhere in the
> > middle for the remainder of the test script.  However, register the
> > atexit command only if GIT_TEST_FSMONITOR is set (to something
> > watchman-specific), so it won't be invoked at all if
> > GIT_TEST_FSMONITOR is not set, and thus it won't generate additional
> > test output and trace.
> >
> > I don't have a better idea.
>
> Shouldn't it be sufficient to add it into test_done? If the test fails,
> then we could leave watches open, but that's no worse than we had without
> this test_clear_watchman method.

I don't know enough about watchman to have an informed opinion.

I think the answer mainly depends on what we want to achieve and what
happens when a test script run with GIT_TEST_FSMONITOR exits without
invoking 'test_done' and is then re-executed (e.g. after a test case
fails with '--immediate' or when the user hits ctrl-c or closes the
terminal window mid-test).

As far as I understand the commit message of v2 of this patch [1], we
mainly want two things:

  - Avoid overloading watchman's watch queue.  For this it might
    indeed be sufficient to clear watches in 'test_done', because most
    test scripts tend to succeed most of the time.

  - Make GIT_TEST_FSMONITOR work reliably on Windows.  For this, I'm
    afraid it's not enough in general, because after a failure with
    '--immediate' or after a ctrl-c we won't run 'test_done', so we
    won't clear the watches, and watchman will keep the fd to the
    trash dir open, and, consequently, will interfere with subsequent
    executions of the same test script as it can't delete the still
    existing trash dir left over from the previous run.

    It could still be sufficient for fsmonitor-enabled CI builds,
    though, because there we don't re-run tests, don't hit ctrl-c, and
    (at least on Azure Pipelines) don't use '--immediate', and the
    whole VM/container/whatever is thrown away at the end anyway.

    On Linux/Unix-y systems it probably doesn't matter much, because
    they can delete open directories, but I wonder what happens with a
    watch when the directory it is supposed to observe gets deleted.
    If the watch is removed in this case, great; if it isn't, then...
    well, then what happens with it?  Will it be overwritten with the
    next test run, or will there be duplicate watches for the same
    dir?

[1] https://public-inbox.org/git/e51165f260d564ccb7a9b8e696691eccb184c01a.1575907804.git.gitgitgadget@gmail.com/
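
The difference between cleanup placed at the end of a script and
cleanup that runs on every exit can be seen in a small, generic shell
sketch.  It does not use test-lib.sh; 'run_script' and 'cleanup' are
hypothetical names, and the EXIT trap merely stands in for an
atexit-style hook.

```shell
# run_script MODE: simulate a test script in a subshell.
# MODE "early" exits before the end, like a failure with --immediate.
run_script () {
	(
		cleanup () {
			echo "cleanup ran"
		}
		# Registered up front, so it fires however the subshell exits.
		trap cleanup EXIT

		if test "$1" = early
		then
			exit 1
		fi

		# A call placed here (where a test_done-style function
		# would sit) is skipped by the early exit above.
		echo "reached the end"
	)
}
```

Calling 'run_script early' prints only "cleanup ran", while
'run_script normal' prints "reached the end" followed by
"cleanup ran".  Signals such as ctrl-c may need their own trap,
depending on the shell.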
On 12/9/2019 6:40 PM, SZEDER Gábor wrote:
> On Mon, Dec 09, 2019 at 09:12:37AM -0500, Derrick Stolee wrote:
>>>> +		watchman watch-list |
>>>
>>> Then with the above fixed, trying to run 'watchman' triggers another
>>> error if it's not installed:
>>>
>>>   $ GIT_TEST_FSMONITOR="$PWD"/t7519/fsmonitor-none ./t5570-git-daemon.sh
>>>   [...]
>>>   ok 21 - hostname interpolation works after LF-stripping
>>>   ./t5570-git-daemon.sh: 1484: ./t5570-git-daemon.sh: watchman: not found
>>>   # failed 1 among 21 test(s)
>>>
>>> I think we need an additional condition to run this only if
>>> 't7519/fsmonitor-watchman' is used in the tests.
>>
>> The intention is to enable a test-suite-wide run using GIT_TEST_FSMONITOR,
>> and that can only use watchman (currently).
>
> I've just run 'GIT_TEST_FSMONITOR=$(pwd)/t7519/fsmonitor-all make',
> and it only failed one test in 't0090-cache-tree.sh', but the fix is
> already in 'pu' in 61eea521fe (fsmonitor: do not compare bitmap size
> with size of split index, 2019-11-13).
>
>
>>>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>>>> index 30b07e310f..067a432ea5 100644
>>>> --- a/t/test-lib.sh
>>>> +++ b/t/test-lib.sh
>>>> @@ -1072,6 +1072,8 @@ test_atexit_handler () {
>>>>  	# sure that the registered cleanup commands are run only once.
>>>>  	test : != "$test_atexit_cleanup" || return 0
>>>>
>>>> +	test_clear_watchman
>>>
>>> I'm not sure where to put this call, but this is definitely not the
>>> right place for it.  See that 'return 0' above in the context?  That's
>>> where the test_atexit_handler function returns early when no atexit
>>> handler commands are set, i.e. in all test scripts that don't involve
>>> some kind of daemons, thus this call is not invoked in the majority of
>>> test scripts.
>>
>> Ah, I misunderstood the point of test_atexit_handler.
>>
>>> Simply moving this call before that early return is not good, because
>>> then it would be invoked twice.
>>>
>>> An option would be to register this call as an atexit command
>>> somewhere late in 'test-lib.sh' (around where GIT_TEST_GETTEXT_POISON
>>> is restored, perhaps).  That way it would be invoked most of the time,
>>> and it would be invoked only once, but I'm not sure how it would work
>>> out with test scripts that unset GIT_TEST_FSMONITOR somewhere in the
>>> middle for the remainder of the test script.  However, register the
>>> atexit command only if GIT_TEST_FSMONITOR is set (to something
>>> watchman-specific), so it won't be invoked at all if
>>> GIT_TEST_FSMONITOR is not set, and thus it won't generate additional
>>> test output and trace.
>>>
>>> I don't have a better idea.
>>
>> Shouldn't it be sufficient to add it into test_done? If the test fails,
>> then we could leave watches open, but that's no worse than we had without
>> this test_clear_watchman method.
>
> I don't know enough about watchman to have an informed opinion.
>
> I think the answer mainly depends on what we want to achieve and what
> happens when a test script run with GIT_TEST_FSMONITOR exits without
> invoking 'test_done' and is then re-executed (e.g. after a test case
> fails with '--immediate' or when the user hits ctrl-c or closes the
> terminal window mid-test).
>
> As far as I understand the commit message of v2 of this patch [1], we
> mainly want two things:
>
>   - Avoid overloading watchman's watch queue.  For this it might
>     indeed be sufficient to clear watches in 'test_done', because most
>     test scripts tend to succeed most of the time.
>
>   - Make GIT_TEST_FSMONITOR work reliably on Windows.  For this, I'm
>     afraid it's not enough in general, because after a failure with
>     '--immediate' or after a ctrl-c we won't run 'test_done', so we
>     won't clear the watches, and watchman will keep the fd to the
>     trash dir open, and, consequently, will interfere with subsequent
>     executions of the same test script as it can't delete the still
>     existing trash dir left over from the previous run.

You are right. Running an individual test and ending it early would
lead to these leaked handles. This assumes someone is aware of the
GIT_TEST_FSMONITOR environment variable, so they are at least interacting
with the feature directly to some extent.

>     It could still be sufficient for fsmonitor-enabled CI builds,
>     though, because there we don't re-run tests, don't hit ctrl-c, and
>     (at least on Azure Pipelines) don't use '--immediate', and the
>     whole VM/container/whatever is thrown away at the end anyway.

This is the hope. It would be nice to get to that point.

>
>     On Linux/Unix-y systems it probably doesn't matter much, because
>     they can delete open directories, but I wonder what happens with a
>     watch when the directory it is supposed to observe gets deleted.
>     If the watch is removed in this case, great; if it isn't, then...
>     well, then what happens with it?  Will it be overwritten with the
>     next test run, or will there be duplicate watches for the same
>     dir?

When a directory is deleted from under Watchman on Linux, the watch is
removed...eventually. I'm not sure at exactly what point that happens.
At the very least, Watchman will receive and process the signals for
all of the paths being removed inside the directory. Running 'watch-del'
removes that overhead.

Thanks,
-Stolee
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index e0b3f28d3a..03573caf42 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1475,3 +1475,18 @@ test_set_port () {
 	port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
 	eval $var=$port
 }
+
+test_clear_watchman () {
+	if test $GIT_TEST_FSMONITOR -ne ""
+	then
+		watchman watch-list |
+			grep "$TRASH_DIRECTORY" |
+			sed "s/\t\"//g" |
+			sed "s/\",//g" >repo-list
+
+		for repo in $(cat repo-list)
+		do
+			watchman watch-del "$repo"
+		done
+	fi
+}
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 30b07e310f..067a432ea5 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1072,6 +1072,8 @@ test_atexit_handler () {
 	# sure that the registered cleanup commands are run only once.
 	test : != "$test_atexit_cleanup" || return 0
 
+	test_clear_watchman
+
 	setup_malloc_check
 	test_eval_ "$test_atexit_cleanup"
 	test_atexit_cleanup=:
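
Taking the review comments in the thread into account, a revised
helper might look like the sketch below.  This is not necessarily the
version that was eventually applied.  It assumes, as the original
pipeline does, that 'watchman watch-list' prints one quoted root per
line; it swaps the two 'sed' calls for a single expression that avoids
'\t', and the 'for' loop for 'while read' so that trash directory
paths containing spaces survive.

```shell
test_clear_watchman () {
	# String check: -n with a quoted expansion works whether the
	# variable is unset, empty or a path ("-ne" compares integers).
	test -n "${GIT_TEST_FSMONITOR-}" || return 0

	# Keep only roots below the trash directory (fixed-string match),
	# strip the leading whitespace and quote, then the trailing quote
	# and optional comma, and delete each remaining watch.
	watchman watch-list |
	grep -F -e "$TRASH_DIRECTORY" |
	sed -e 's/^[[:space:]]*"//' -e 's/",\{0,1\}$//' |
	while IFS= read -r repo
	do
		# </dev/null keeps watch-del from consuming the loop input.
		watchman watch-del "$repo" </dev/null
	done
}
```

Where the call belongs (an atexit command, 'test_done', or both) is
left open here, as it is in the discussion above.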