Message ID | e51165f260d564ccb7a9b8e696691eccb184c01a.1575907804.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Improve testability with GIT_TEST_FSMONITOR |
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +test_clear_watchman () {
> +	if test -n "$GIT_TEST_FSMONITOR"
> +	then
> +		watchman watch-list |
> +		grep "$TRASH_DIRECTORY" |
> +		sed "s/\",//g" |
> +		sed "s/\"//g" >repo-list

Whoa, this is scary. "$TRASH_DIRECTORY" comes from $(pwd) and the
leading part of it can have arbitrary garbage like "[a-z]*" that may
match paths "watchman watch-list" may emit that does not have
anything to do with the temporary directory used by this test.

What are these stripping of ", and " about? Could you tell readers
how a typical output from the program we are reading from looks like
perhaps in the log message or in-code comment around here?

Thanks.
On 12/9/2019 5:52 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> +test_clear_watchman () {
>> +	if test -n "$GIT_TEST_FSMONITOR"
>> +	then
>> +		watchman watch-list |
>> +		grep "$TRASH_DIRECTORY" |
>> +		sed "s/\",//g" |
>> +		sed "s/\"//g" >repo-list
>
> Whoa, this is scary. "$TRASH_DIRECTORY" comes from $(pwd) and the
> leading part of it can have arbitrary garbage like "[a-z]*" that may
> match paths "watchman watch-list" may emit that does not have
> anything to do with the temporary directory used by this test.

Hm. That is a good point. Can we assume that our version of grep has
a "-F" or "--fixed-strings" option? ([1] seems to say that "-F" would
work.)

[1] https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#index-grep-programs

> What are these stripping of ", and " about? Could you tell readers
> how a typical output from the program we are reading from looks like
> perhaps in the log message or in-code comment around here?

Watchman outputs its list of paths in JSON format. Luckily, it formats
the output so the path lines are on separate lines, each quoted.

For example:

{
    "version": "4.9.0",
    "roots": [
        "<path1>",
        "<path2>",
        "<path3>"
    ]
}

Thanks,
-Stolee
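To make the regex concern concrete, here is a minimal sketch that runs both forms of grep over canned output shaped like the example above. The sample roots and the `TRASH` variable are invented for illustration (a stand-in for `$TRASH_DIRECTORY`); nothing here talks to a real watchman.

```shell
# Hypothetical stand-in for "watchman watch-list" output; the two roots
# are made up. One of them contains regex metacharacters.
sample_watch_list () {
	printf '%s\n' \
		'{' \
		'    "version": "4.9.0",' \
		'    "roots": [' \
		'        "/tmp/abc/trash",' \
		'        "/tmp/[a-z]*/trash"' \
		'    ]' \
		'}'
}

# Hypothetical trash directory whose leading part looks like a regex.
TRASH='/tmp/[a-z]*/trash'

# Plain grep treats the path as a basic regular expression: "[a-z]*"
# matches "abc", so the unrelated root is selected, and the literal
# "[a-z]*" root is not matched at all.
sample_watch_list | grep "$TRASH"

# With -F the pattern is a fixed string, so only the literal path matches.
sample_watch_list | grep -F "$TRASH"
```

Note that `-F` still performs a substring match, so a root that merely contains the trash path would also be selected; that is presumably the intent for watches inside the trash directory.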
Derrick Stolee <stolee@gmail.com> writes:

> Hm. That is a good point. Can we assume that our version of grep has
> a "-F" or "--fixed-strings" option? ([1] seems to say that "-F" would
> work.)

    $ git grep "grep -F" -- \*.sh

is your friend ;-)

And never use https://www.gnu.org/ manual as a yardstick---you will
end up using GNUism that is not available elsewhere pretty easily.

> [1] https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#index-grep-programs
>
>> What are these stripping of ", and " about? Could you tell readers
>> how a typical output from the program we are reading from looks like
>> perhaps in the log message or in-code comment around here?
>
> Watchman outputs its list of paths in JSON format. Luckily, it formats
> the output so the path lines are on separate lines, each quoted.
>
> For example:
>
> {
>     "version": "4.9.0",
>     "roots": [
>         "<path1>",
>         "<path2>",
>         "<path3>"
>     ]
> }

Yeek; how is a dq in path represented? by doubling? by
backslash-quoting (if so how is a backslash in path represented)?
By something else?

It's OK at least for now to declare that our test repository does
not contain any funny paths, but in the longer run does the above
mean that we somehow need to be able to grok JSON reliably in our
tests? It may not be such a bad thing especially for longer term,
as there are other parts of the system that may benefit from having
JSON capable output readers in our tests (e.g. trace2 code can do
JSON, right?)..
On 12/10/2019 12:20 AM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
>
>> Hm. That is a good point. Can we assume that our version of grep has
>> a "-F" or "--fixed-strings" option? ([1] seems to say that "-F" would
>> work.)
>
>     $ git grep "grep -F" -- \*.sh
>
> is your friend ;-)

Yes, of course I should have just looked for examples.

> And never use https://www.gnu.org/ manual as a yardstick---you will
> end up using GNUism that is not available elsewhere pretty easily.

I tried to focus on the part that said "this is part of POSIX", but
you are right that may not be the best place to look.

>> [1] https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#index-grep-programs
>>
>>> What are these stripping of ", and " about? Could you tell readers
>>> how a typical output from the program we are reading from looks like
>>> perhaps in the log message or in-code comment around here?
>>
>> Watchman outputs its list of paths in JSON format. Luckily, it formats
>> the output so the path lines are on separate lines, each quoted.
>>
>> For example:
>>
>> {
>>     "version": "4.9.0",
>>     "roots": [
>>         "<path1>",
>>         "<path2>",
>>         "<path3>"
>>     ]
>> }
>
> Yeek; how is a dq in path represented? by doubling? by
> backslash-quoting (if so how is a backslash in path represented)?
> By something else?
>
> It's OK at least for now to declare that our test repository does
> not contain any funny paths, but in the longer run does the above
> mean that we somehow need to be able to grok JSON reliably in our
> tests? It may not be such a bad thing especially for longer term,
> as there are other parts of the system that may benefit from having
> JSON capable output readers in our tests (e.g. trace2 code can do
> JSON, right?)..

trace2 can _write_ JSON, not parse it. However, we have some parsing
code (using a package) in the performance tests. I could try adapting
that for this purpose.

That package is not currently required by the test suite, so it causes
some dependency issues when first running the perf suite. At least we
wouldn't need the package unless running with GIT_TEST_FSMONITOR.

My guess is that this patch is going to be trouble, so I'll eject it
in the next version and save the JSON parsing and everything for its
own series. We only really need it when we are getting close to
running watchman in CI on Windows.

Thanks,
-Stolee
Hi,

On Mon, 9 Dec 2019, Junio C Hamano wrote:

> Derrick Stolee <stolee@gmail.com> writes:
>
> > Hm. That is a good point. Can we assume that our version of grep has
> > a "-F" or "--fixed-strings" option? ([1] seems to say that "-F" would
> > work.)
>
>     $ git grep "grep -F" -- \*.sh
>
> is your friend ;-)
>
> And never use https://www.gnu.org/ manual as a yardstick---you will
> end up using GNUism that is not available elsewhere pretty easily.
>
> > [1] https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#index-grep-programs

I often look at GNU grep's man page first and then verify via
https://man.openbsd.org/grep and
https://pubs.opengroup.org/onlinepubs/009695399/utilities/grep.html
that the option can be considered portable.

> >> What are these stripping of ", and " about? Could you tell readers
> >> how a typical output from the program we are reading from looks like
> >> perhaps in the log message or in-code comment around here?
> >
> > Watchman outputs its list of paths in JSON format. Luckily, it formats
> > the output so the path lines are on separate lines, each quoted.
> >
> > For example:
> >
> > {
> >     "version": "4.9.0",
> >     "roots": [
> >         "<path1>",
> >         "<path2>",
> >         "<path3>"
> >     ]
> > }
>
> Yeek; how is a dq in path represented? by doubling? by
> backslash-quoting (if so how is a backslash in path represented)?
> By something else?
>
> It's OK at least for now to declare that our test repository does
> not contain any funny paths, but in the longer run does the above
> mean that we somehow need to be able to grok JSON reliably in our
> tests? It may not be such a bad thing especially for longer term,
> as there are other parts of the system that may benefit from having
> JSON capable output readers in our tests (e.g. trace2 code can do
> JSON, right?)..

From
https://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
(section "9 String"):

    A string is a sequence of Unicode code points wrapped with
    quotation marks (U+0022). All code points may be placed within the
    quotation marks except for the code points that must be escaped:
    quotation mark (U+0022), reverse solidus (U+005C), and the control
    characters U+0000 to U+001F. There are two-character escape
    sequence representations of some characters.

    \" represents the quotation mark character (U+0022).
    \\ represents the reverse solidus character (U+005C).
    \/ represents the solidus character (U+002F).
    \b represents the backspace character (U+0008).
    \f represents the form feed character (U+000C).
    \n represents the line feed character (U+000A).
    \r represents the carriage return character (U+000D).
    \t represents the character tabulation character (U+0009).

(Side note: It is amazing what things you learn unexpectedly, e.g.
when researching information about the JSON format, you learn about
the word "solidus", that it refers to the slash, and that it was once
also known as the "shilling mark"...)

I am not sure why the forward slash needs to be escaped, but I guess
that this is voluntary rather than mandatory.

Ciao,
Dscho
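Given the escape rules quoted above, deleting every `"` from a line would mangle a root whose path itself contains a double quote. As a rough sketch only, the hypothetical helper below (`json_line_to_path` is not part of the patch or of Git's test library) strips just the enclosing quotes and then undoes backslash escapes in a single pass. It assumes one array element per line, as in the example output, and it only gets `\"`, `\\` and `\/` right; `\n`, `\t`, `\uXXXX` and friends would need a real JSON parser.

```shell
# Hypothetical helper: turn one line such as
#         "/some/path",
# into the bare path. Only the escapes \" \\ and \/ are decoded correctly.
json_line_to_path () {
	sed -e 's/^[[:space:]]*"//' \
	    -e 's/",\{0,1\}[[:space:]]*$//' \
	    -e 's/\\\(.\)/\1/g'
}

# A made-up root containing escaped double quotes.
printf '%s\n' '        "/tmp/trash \"dq\" dir",' | json_line_to_path
# prints: /tmp/trash "dq" dir
```

The last expression replaces each backslash together with the character after it in one left-to-right pass, which avoids the ordering problems that arise when `\\` and `\"` are rewritten in separate passes.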
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index e0b3f28d3a..ef840ce097 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1475,3 +1475,18 @@ test_set_port () {
 	port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
 	eval $var=$port
 }
+
+test_clear_watchman () {
+	if test -n "$GIT_TEST_FSMONITOR"
+	then
+		watchman watch-list |
+		grep "$TRASH_DIRECTORY" |
+		sed "s/\",//g" |
+		sed "s/\"//g" >repo-list
+
+		while read repo
+		do
+			watchman watch-del "$repo"
+		done <repo-list
+	fi
+}
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 30b07e310f..4114953ebc 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1081,6 +1081,10 @@ test_atexit_handler () {
 test_done () {
 	GIT_EXIT_OK=t
 
+	# If watchman is being used with GIT_TEST_FSMONITOR, then
+	# clear all watches on directories inside the TRASH_DIRECTORY.
+	test_clear_watchman
+
 	# Run the atexit commands _before_ the trash directory is
 	# removed, so the commands can access pidfiles and socket files.
 	test_atexit_handler
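For illustration, a variant of the helper that takes the review comments into account might look like the sketch below. This is not a version that was submitted. It assumes `watchman watch-list` prints one quoted root per line as described in the thread, and it still does no JSON unescaping, so a trash directory containing `"` or `\` would not be handled.

```shell
# Sketch of a variant, not the submitted patch.
test_clear_watchman () {
	if test -n "$GIT_TEST_FSMONITOR"
	then
		# -F matches $TRASH_DIRECTORY literally; the leading quote
		# anchors the match at the start of a root path.
		watchman watch-list |
		grep -F "\"$TRASH_DIRECTORY" |
		# Strip only the enclosing quotes and the trailing comma.
		sed -e 's/^[[:space:]]*"//' -e 's/",\{0,1\}[[:space:]]*$//' |
		while IFS= read -r repo
		do
			watchman watch-del "$repo"
		done
	fi
}
```

Piping straight into the loop avoids leaving a `repo-list` file behind in the current directory, at the cost of running the loop in a subshell.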