Message ID | 20191130104644.17350-1-szeder.dev@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | t9300-fast-import: don't hang if background fast-import exits too early | expand |
SZEDER Gábor <szeder.dev@gmail.com> writes: > + ( > + git fast-import $options <&8 >&9 & > + echo $! >V.fi.pid > + wait $! > + echo >&2 "background fast-import terminated too early with exit code $?" > + # Un-block the read loop in the main shell process. > + echo >&9 UNEXPECTED > + ) & > + echo $! >V.sh.pid > # We don't mind if fast-import has already died by the time the test > # ends. > - test_when_finished " > + test_when_finished ' > exec 8>&-; exec 9>&-; > - kill $(cat V.pid) && wait $(cat V.pid) > - true" > + kill $(cat V.sh.pid) && wait $(cat V.sh.pid) > + kill $(cat V.fi.pid) && wait $(cat V.sh.pid) > + true' The original interpolates the PID of the fast-import when "when-finished" program is registered, so it is OK if somebody else removed V.pid file; the new one interpolates when "when-finished" program is run, reading from V.??.pid, so somebody needs to make sure these pid files will stay around. I do not think it is an issue as I suspect we've left it to the global clean-up procedure that removes the trash directory to remove the pid file. By the way, does the second "kill && wait" wait for the right process? > background_import_still_running () { > - if ! kill -0 "$(cat V.pid)" > + if ! kill -0 "$(cat V.fi.pid)" > then > echo >&2 "background fast-import terminated too early" > false > fi > }
On Sat, Nov 30, 2019 at 01:16:30PM -0800, Junio C Hamano wrote: > SZEDER Gábor <szeder.dev@gmail.com> writes: > > > + ( > > + git fast-import $options <&8 >&9 & > > + echo $! >V.fi.pid > > + wait $! > > + echo >&2 "background fast-import terminated too early with exit code $?" > > + # Un-block the read loop in the main shell process. > > + echo >&9 UNEXPECTED > > + ) & > > + echo $! >V.sh.pid > > # We don't mind if fast-import has already died by the time the test > > # ends. > > - test_when_finished " > > + test_when_finished ' > > exec 8>&-; exec 9>&-; > > - kill $(cat V.pid) && wait $(cat V.pid) > > - true" > > + kill $(cat V.sh.pid) && wait $(cat V.sh.pid) > > + kill $(cat V.fi.pid) && wait $(cat V.sh.pid) > > + true' > > The original interpolates the PID of the fast-import when > "when-finished" program is registered, so it is OK if somebody else > removed V.pid file; the new one interpolates when "when-finished" > program is run, reading from V.??.pid, so somebody needs to make > sure these pid files will stay around. I do not think it is an > issue as I suspect we've left it to the global clean-up procedure > that removes the trash directory to remove the pid file. In the original the same shell process starts 'git fast-import', writes its pidfile, and registers the test_when_finished commands, so we can be sure that the pid file is already present when the shell runs the $(cat V.pid) command substitutions. With this patch that's not the case anymore, because the background subshell starts 'git fast-import' and writes the pidfile, but the main shell process registers the test_when_finished commands. IOW these two shell processes are racing, and it's possible that the test_when_finished command is executed before the background subshell can write the pidfile. So double quotes around the block of test_when_finished commands are not good. > By the way, does the second "kill && wait" wait for the right > process? Ouch, it clearly doesn't. Copy-paste error I suppose. Thanks for spotting it. > > > background_import_still_running () { > > - if ! kill -0 "$(cat V.pid)" > > + if ! kill -0 "$(cat V.fi.pid)" > > then > > echo >&2 "background fast-import terminated too early" > > false > > fi > > }
SZEDER Gábor <szeder.dev@gmail.com> writes: >> > - test_when_finished " >> > + test_when_finished ' >> > exec 8>&-; exec 9>&-; >> > - kill $(cat V.pid) && wait $(cat V.pid) >> > - true" >> > + kill $(cat V.sh.pid) && wait $(cat V.sh.pid) >> > + kill $(cat V.fi.pid) && wait $(cat V.sh.pid) >> > + true' >> >> The original interpolates the PID of the fast-import when >> "when-finished" program is registered, so it is OK if somebody else >> removed V.pid file; the new one interpolates when "when-finished" >> program is run, reading from V.??.pid, so somebody needs to make >> sure these pid files will stay around. I do not think it is an >> issue as I suspect we've left it to the global clean-up procedure >> that removes the trash directory to remove the pid file. > > In the original the same shell process starts 'git fast-import', > writes its pidfile, and registers the test_when_finished commands, so > we can be sure that the pid file is already present when the shell > runs the $(cat V.pid) command substitutions. Yes. It also means that V.pid file was not very useful in the when-finished handler in the original. We could have just used a shell variable. > With this patch that's not the case anymore, because the background > subshell starts 'git fast-import' and writes the pidfile, but the main > shell process registers the test_when_finished commands. IOW these > two shell processes are racing, and it's possible that the > test_when_finished command is executed before the background subshell > can write the pidfile. So double quotes around the block of > test_when_finished commands are not good. Oh, I was not questioning that. I wanted to make sure, and I was doing so aloud, that these files are (1) created way before, and (2) left on the filesystem when-finished handler actually runs, because the original did not need any guarantee for (2), but now the sq version does.
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index e707fb861e..dcfaa9cc36 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -3148,63 +3148,74 @@ test_expect_success 'U: validate root delete result' ' # The commands in input_file should not produce any output on the file # descriptor set with --cat-blob-fd (or stdout if unspecified). # # To make sure you're observing the side effects of checkpoint *before* # fast-import terminates (and thus writes out its state), check that the # fast-import process is still running using background_import_still_running # *after* evaluating the test conditions. background_import_then_checkpoint () { options=$1 input_file=$2 mkfifo V.input exec 8<>V.input rm V.input mkfifo V.output exec 9<>V.output rm V.output - git fast-import $options <&8 >&9 & - echo $! >V.pid + ( + git fast-import $options <&8 >&9 & + echo $! >V.fi.pid + wait $! + echo >&2 "background fast-import terminated too early with exit code $?" + # Un-block the read loop in the main shell process. + echo >&9 UNEXPECTED + ) & + echo $! >V.sh.pid # We don't mind if fast-import has already died by the time the test # ends. - test_when_finished " + test_when_finished ' exec 8>&-; exec 9>&-; - kill $(cat V.pid) && wait $(cat V.pid) - true" + kill $(cat V.sh.pid) && wait $(cat V.sh.pid) + kill $(cat V.fi.pid) && wait $(cat V.sh.pid) + true' # Start in the background to ensure we adhere strictly to (blocking) # pipes writing sequence. We want to assume that the write below could # block, e.g. if fast-import blocks writing its own output to &9 # because there is no reader on &9 yet. ( cat "$input_file" echo "checkpoint" echo "progress checkpoint" ) >&8 & error=1 ;# assume the worst while read output <&9 do if test "$output" = "progress checkpoint" then error=0 break + elif test "$output" = "UNEXPECTED" + then + break fi # otherwise ignore cruft echo >&2 "cruft: $output" done if test $error -eq 1 then false fi } background_import_still_running () { - if ! kill -0 "$(cat V.pid)" + if ! kill -0 "$(cat V.fi.pid)" then echo >&2 "background fast-import terminated too early" false fi }
The five tests checking 'git fast-import's checkpoint handling in 't9300-fast-import.sh', all with the prefix "V:" in their test description, can hang indefinitely if 'git fast-import' unexpectedly dies early in any of these tests. These five tests run 'git fast-import' in the background, while feeding instructions to its standard input through a fifo from a background subshell, and reading and verifying its standard output through another fifo in the test script's main shell process. This "reading and verifying" is basically a 'while read ...' shell loop iterating until 'git fast-import' outputs the expected line, ignoring any other output. This doesn't work very well when 'git fast-import' dies before printing that particular line, because the 'read' builtin doesn't get EOF after the death of 'git fast-import', as their input and output are not connected directly but through a fifo. Consequently, that 'read' hangs waiting for the next line from the already dead 'git fast-import', leaving the test script and in turn the whole test suite hanging. Avoid this hang by checking whether the background 'git fast-import' process exited unexpectedly early, and interrupt the 'while read' loop if it did. We have to jump through some hoops to achive that, though: - Start the background 'git fast-import' in another background subshell, which then waits until that 'git fast-import' process exits. If it does exit, then report its exit code, and write a message to the fifo used for 'git fast-import's standard output, thus un-block the 'read' builtin in the main shell process. - Modify that 'while read' loop to break the loop upon seeing that message, and fail the test in the usual way. - Once the test is finished kill that background subshell as well, and do so before killing the background 'git fast-import'. Otherwise the background 'git fast-import' and subshell would die racily, and if 'git fast-import' were to die sooner, then we might get some undesired and potentially confusing messages in the test's output. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> --- t/t9300-fast-import.sh | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-)