diff mbox series

[v2,8/8] test-lib: clear watchman watches at test completion

Message ID e51165f260d564ccb7a9b8e696691eccb184c01a.1575907804.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Improve testability with GIT_TEST_FSMONITOR | expand

Commit Message

John Passaro via GitGitGadget Dec. 9, 2019, 4:10 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The fsmonitor feature allows an external tool such as watchman to
monitor the working directory. The direct test
t7619-status-fsmonitor.sh provides some coverage, but it would be
better to run the entire test suite with watchman enabled. This
would provide more confidence that the feature is working as
intended.

When running the test suite in parallel with 'prove -j <N>', many
repos are created and deleted in parallel. When GIT_TEST_FSMONITOR
points to t/t7519/fsmonitor-watchman, this can lead to watchman
tracking many different folders, overloading its watch queue.

As a test script completes, we can tell watchman to stop watching
the directories inside the TRASH_DIRECTORY.

This is particularly important on Windows where watchman keeps an
open handle on the directories it watches, preventing them from
being deleted. There is currently a bug in watchman [1] where this
handle still is not closed, but if that is updated then these tests
can be run on Windows as well.

[1] https://github.com/facebook/watchman/issues/764

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/test-lib-functions.sh | 15 +++++++++++++++
 t/test-lib.sh           |  4 ++++
 2 files changed, 19 insertions(+)

Comments

Junio C Hamano Dec. 9, 2019, 10:52 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +test_clear_watchman () {
> +	if test -n "$GIT_TEST_FSMONITOR"
> +	then
> +		watchman watch-list |
> +			grep "$TRASH_DIRECTORY" |
> +			sed "s/\",//g" |
> +			sed "s/\"//g" >repo-list

Whoa, this is scary.  "$TRASH_DIRECTORY" comes from $(pwd) and the
leading part of it can have arbitrary garbage like "[a-z]*" that may
match paths "watchman watch-list" may emit that does not have
anything to do with the temporary directory used by this test.

What are these stripping of ", and " about?  Could you tell readers
how a typical output from the program we are reading from looks like
perhaps in the log message or in-code comment around here?

Thanks.
Derrick Stolee Dec. 10, 2019, 1:49 a.m. UTC | #2
On 12/9/2019 5:52 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> +test_clear_watchman () {
>> +	if test -n "$GIT_TEST_FSMONITOR"
>> +	then
>> +		watchman watch-list |
>> +			grep "$TRASH_DIRECTORY" |
>> +			sed "s/\",//g" |
>> +			sed "s/\"//g" >repo-list
> 
> Whoa, this is scary.  "$TRASH_DIRECTORY" comes from $(pwd) and the
> leading part of it can have arbitrary garbage like "[a-z]*" that may
> match paths "watchman watch-list" may emit that does not have
> anything to do with the temporary directory used by this test.

Hm. That is a good point. Can we assume that our version of grep has
a "-F" or "--fixed-strings" option? ([1] seems to say that "-F" would
work.)

[1] https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#index-grep-programs

> What are these stripping of ", and " about?  Could you tell readers
> how a typical output from the program we are reading from looks like
> perhaps in the log message or in-code comment around here?

Watchman outputs its list of paths in JSON format. Luckily, it formats
the output so the path lines are on separate lines, each quoted.

For example:

{
	"version": "4.9.0",
	"roots": [
		"<path1>",
		"<path2>",
		"<path3>"
	]
}

Thanks,
-Stolee
Junio C Hamano Dec. 10, 2019, 5:20 a.m. UTC | #3
Derrick Stolee <stolee@gmail.com> writes:

> Hm. That is a good point. Can we assume that our version of grep has
> a "-F" or "--fixed-strings" option? ([1] seems to say that "-F" would
> work.)

$ git grep "grep -F" -- \*.sh

is your friend ;-) 

And never use https://www.gnu.org/ manual as a yardstick---you will
end up using GNUism that is not unavailable elsewhere pretty easily.

> [1] https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#index-grep-programs
>
>> What are these stripping of ", and " about?  Could you tell readers
>> how a typical output from the program we are reading from looks like
>> perhaps in the log message or in-code comment around here?
>
> Watchman outputs its list of paths in JSON format. Luckily, it formats
> the output so the path lines are on separate lines, each quoted.
>
> For example:
>
> {
> 	"version": "4.9.0",
> 	"roots": [
> 		"<path1>",
> 		"<path2>",
> 		"<path3>"
> 	]
> }

Yeek; how is a dq in path represented?  by doubling?  by
backslash-quoting (if so how is a backslash in path represented)?
By something else?

It's OK at least for now to declare that our test repository does
not contain any funny paths, but in the longer run does the above
mean that we somehow need to be able to grok JSON reliably in our
tests?  It may not be such a bad thing especially for longer term,
as there are other parts of the system that may benefit from having
JSON capable output readers in our tests (e.g. trace2 code can do
JSON, right?)..
Derrick Stolee Dec. 10, 2019, 1:51 p.m. UTC | #4
On 12/10/2019 12:20 AM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>> Hm. That is a good point. Can we assume that our version of grep has
>> a "-F" or "--fixed-strings" option? ([1] seems to say that "-F" would
>> work.)
> 
> $ git grep "grep -F" -- \*.sh
> 
> is your friend ;-) 

Yes, of course I should have just looked for examples.

> And never use https://www.gnu.org/ manual as a yardstick---you will
> end up using GNUism that is not unavailable elsewhere pretty easily.

I tried to focus on the part that said "this is part of POSIX", but
you are right that may not be the best place to look.

>> [1] https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#index-grep-programs
>>
>>> What are these stripping of ", and " about?  Could you tell readers
>>> how a typical output from the program we are reading from looks like
>>> perhaps in the log message or in-code comment around here?
>>
>> Watchman outputs its list of paths in JSON format. Luckily, it formats
>> the output so the path lines are on separate lines, each quoted.
>>
>> For example:
>>
>> {
>> 	"version": "4.9.0",
>> 	"roots": [
>> 		"<path1>",
>> 		"<path2>",
>> 		"<path3>"
>> 	]
>> }
> 
> Yeek; how is a dq in path represented?  by doubling?  by
> backslash-quoting (if so how is a backslash in path represented)?
> By something else?
> 
> It's OK at least for now to declare that our test repository does
> not contain any funny paths, but in the longer run does the above
> mean that we somehow need to be able to grok JSON reliably in our
> tests?  It may not be such a bad thing especially for longer term,
> as there are other parts of the system that may benefit from having
> JSON capable output readers in our tests (e.g. trace2 code can do
> JSON, right?)..

trace2 can _write_ JSON, not parse it. However, we have some parsing
code (using a package) in the performance tests. I could try
adapting that for this purpose. That package is not currently required
by the test suite, so it causes some dependency issues when first
running the perf suite. At least we wouldn't need the package
unless running with GIT_TEST_FSMONITOR.

My guess is that this patch is going to be trouble, so I'll eject
it in the next version and save the JSON parsing and everything
for its own series. We only really need it when we are getting
close to running watchman in CI on Windows.

Thanks,
-Stolee
Johannes Schindelin Dec. 10, 2019, 2:09 p.m. UTC | #5
Hi,

On Mon, 9 Dec 2019, Junio C Hamano wrote:

> Derrick Stolee <stolee@gmail.com> writes:
>
> > Hm. That is a good point. Can we assume that our version of grep has
> > a "-F" or "--fixed-strings" option? ([1] seems to say that "-F" would
> > work.)
>
> $ git grep "grep -F" -- \*.sh
>
> is your friend ;-)
>
> And never use https://www.gnu.org/ manual as a yardstick---you will
> end up using GNUism that is not unavailable elsewhere pretty easily.
>
> > [1] https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#index-grep-programs

I often look at GNU grep's man page first and then verify via
https://man.openbsd.org/grep and
https://pubs.opengroup.org/onlinepubs/009695399/utilities/grep.html
that the option can be considered portable.

> >> What are these stripping of ", and " about?  Could you tell readers
> >> how a typical output from the program we are reading from looks like
> >> perhaps in the log message or in-code comment around here?
> >
> > Watchman outputs its list of paths in JSON format. Luckily, it formats
> > the output so the path lines are on separate lines, each quoted.
> >
> > For example:
> >
> > {
> > 	"version": "4.9.0",
> > 	"roots": [
> > 		"<path1>",
> > 		"<path2>",
> > 		"<path3>"
> > 	]
> > }
>
> Yeek; how is a dq in path represented?  by doubling?  by
> backslash-quoting (if so how is a backslash in path represented)?
> By something else?
>
> It's OK at least for now to declare that our test repository does
> not contain any funny paths, but in the longer run does the above
> mean that we somehow need to be able to grok JSON reliably in our
> tests?  It may not be such a bad thing especially for longer term,
> as there are other parts of the system that may benefit from having
> JSON capable output readers in our tests (e.g. trace2 code can do
> JSON, right?)..

From
https://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
(section "9 String"):

A string is a sequence of Unicode code points wrapped with quotation marks
(U+0022). All code points may be  placed  within  the  quotation  marks
except  for  the code  points that  must  be  escaped:  quotation  mark
(U+0022), reverse solidus (U+005C), and the control characters U+0000 to
U+001F. There are two-character escape sequence representations of some
characters.

	\"	represents the quotation mark character (U+0022).
	\\	represents the reverse solidus character(U+005C).
	\/	represents the solidus character (U+002F).
	\b	represents the backspace character(U+0008).
	\f	represents the form feed character (U+000C).
	\n	represents the line feed character (U+000A).
	\r	represents the carriage return character (U+000D).
	\t	represents the character tabulation character (U+0009).

(Side note: It is amazing what things you learn unexpectedly, e.g. when
researching information about the JSON format, you learn that about the
word "solidus", that it refers to the slash, and that it was once also
know as the "shilling mark"...)

I am not sure why the forward slash needs to be escaped, but I guess that
this is voluntary rather than mandatory.

Ciao,
Dscho
diff mbox series

Patch

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index e0b3f28d3a..ef840ce097 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1475,3 +1475,18 @@  test_set_port () {
 	port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
 	eval $var=$port
 }
+
+test_clear_watchman () {
+	if test -n "$GIT_TEST_FSMONITOR"
+	then
+		watchman watch-list |
+			grep "$TRASH_DIRECTORY" |
+			sed "s/\",//g" |
+			sed "s/\"//g" >repo-list
+
+		while read repo
+		do
+			watchman watch-del "$repo"
+		done <repo-list
+	fi
+}
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 30b07e310f..4114953ebc 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1081,6 +1081,10 @@  test_atexit_handler () {
 test_done () {
 	GIT_EXIT_OK=t
 
+	# If watchman is being used with GIT_TEST_FSMONITOR, then
+	# clear all watches on directories inside the TRASH_DIRECTORY.
+	test_clear_watchman
+
 	# Run the atexit commands _before_ the trash directory is
 	# removed, so the commands can access pidfiles and socket files.
 	test_atexit_handler