mbox series

[PATCHv1,0/6] git-p4: wait() for child processes better

Message ID 20200129111246.12196-1-luke@diamand.org (mailing list archive)
Headers show
Series git-p4: wait() for child processes better | expand

Message

Luke Diamand Jan. 29, 2020, 11:12 a.m. UTC
git-p4 handles most errors by calling die(). This can leave child
processes still running, orphaned.

    https://public-inbox.org/git/20190227094926.GE19739@szeder.dev/

This is not a problem for humans, but for CI, it is.

This change improves things by raising an exception and cleaning up
further up the stack, rather than simply calling die().

This is only done in a few places, such that the tests pass with the changes
suggested in the link (adding sleep strategically) but there are still
plenty of places where git-p4 calls die().

This also adds some pylint disables, so that we can start to run pylint
on git-p4.

Luke Diamand (6):
  git-p4: make closeStreams() idempotent
  git-p4: add P4CommandException to report errors talking to Perforce
  git-p4: disable some pylint warnings, to get pylint output to
    something manageable
  git-p4: create helper function importRevisions()
  git-p4: cleanup better on error exit
  git-p4: check for access to remote host earlier

 git-p4.py | 180 +++++++++++++++++++++++++++++++++---------------------
 1 file changed, 109 insertions(+), 71 deletions(-)

Comments

Johannes Schindelin Jan. 30, 2020, 10:52 a.m. UTC | #1
Hi Luke,

On Wed, 29 Jan 2020, Luke Diamand wrote:

> git-p4 handles most errors by calling die(). This can leave child
> processes still running, orphaned.
>
>     https://public-inbox.org/git/20190227094926.GE19739@szeder.dev/
>
> This is not a problem for humans, but for CI, it is.

Just to make sure that we're talking about the same thing. Looking at

https://dev.azure.com/git/git/_test/analytics?definitionId=11&contextType=build

I am not even sure that `git-p4` is our biggest problem there. In that
list, the only flaky `git-p4` test I see is t9806.5. And it failed only
once recently:
https://dev.azure.com/git/git/_build/results?buildId=1640&view=ms.vss-test-web.build-test-results-tab

The log looks like this:

-- snip --
[...]
expecting success of 9806.5 'sync when no master branch prints a nice error':
	test_when_finished cleanup_git &&
	git p4 clone --branch=refs/remotes/p4/sb --dest="$git" //depot@2 &&
	(
		cd "$git" &&
		test_must_fail git p4 sync 2>err &&
		grep "Error: no branch refs/remotes/p4/master" err
	)

+ test_when_finished cleanup_git
+ test 0 = 0
+ test_cleanup={ cleanup_git
		} && (exit "$eval_ret"); eval_ret=$?; :
+ git p4 clone --branch=refs/remotes/p4/sb --dest=/home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git //depot@2
Initialized empty Git repository in /home/vsts/work/1/s/t/trash
directory.t9806-git-p4-options/git/.git/
Perforce db files in '.' will be created if missing...
Perforce db files in '.' will be created if missing...
Perforce db files in '.' will be created if missing...
Perforce db files in '.' will be created if missing...
Perforce db files in '.' will be created if missing...
Importing from //depot@2 into /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
Doing initial import of //depot/ from revision @2 into refs/remotes/p4/sb
+ cd /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
+ test_must_fail git p4 sync
+ _test_ok=
+ git p4 sync
Performing incremental import into refs/remotes/p4/master git branch
Depot paths: //depot/
+ exit_code=1
+ test 1 -eq 0
+ test_match_signal 13 1
+ test 1 = 141
+ test 1 = 269
+ return 1
+ test 1 -gt 129
+ test 1 -eq 127
+ test 1 -eq 126
+ return 0
+ grep Error: no branch refs/remotes/p4/master err
Error: no branch refs/remotes/p4/master; perhaps specify one with --branch.
+ cleanup_git
+ retry_until_success rm -r /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
+ nr_tries_left=60
+ rm -r /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
+ test_path_is_missing /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
+ test -e /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
+ echo Path exists:
Path exists:
+ ls -ld /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
drwxr-xr-x 3 vsts docker 4096 Jan 24 22:20 /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
+ test 1 -ge 1
+ echo /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
+ false
+ eval_ret=1
+ :
-- snap --

FWIW I see the same test being flaky in Git for Windows (but it _also_
only happened once in the past 14 days):
https://dev.azure.com/git-for-windows/git/_test/analytics?definitionId=17&contextType=build
and
https://dev.azure.com/git-for-windows/git/_build/results?buildId=49174&view=ms.vss-test-web.build-test-results-tab

The log suggests to me that there is a path that has not been cleaned up,
and _maybe_ it is timing-related, but it is also possible that a child
process is still running that should have cleaned it up.

Are your patches about this failure?

I see that PATCH 5/6 talks about Gábor's analysis of t9800.6 in
https://public-inbox.org/git/20190227094926.GE19739@szeder.dev/ which
_looks_ similar.

FWIW I looked over your patches and they seem relatively obvious, even for
somebody like me who barely gets to code in Python anymore.

Thanks,
Dscho

>
> This change improves things by raising an exception and cleaning up
> further up the stack, rather than simply calling die().
>
> This is only done in a few places, such that the tests pass with the changes
> suggested in the link (adding sleep strategically) but there are still
> plenty of places where git-p4 calls die().
>
> This also adds some pylint disables, so that we can start to run pylint
> on git-p4.
>
> Luke Diamand (6):
>   git-p4: make closeStreams() idempotent
>   git-p4: add P4CommandException to report errors talking to Perforce
>   git-p4: disable some pylint warnings, to get pylint output to
>     something manageable
>   git-p4: create helper function importRevisions()
>   git-p4: cleanup better on error exit
>   git-p4: check for access to remote host earlier
>
>  git-p4.py | 180 +++++++++++++++++++++++++++++++++---------------------
>  1 file changed, 109 insertions(+), 71 deletions(-)
>
> --
> 2.20.1.390.gb5101f9297
>
>
Luke Diamand Jan. 30, 2020, 11:33 a.m. UTC | #2
On Thu, 30 Jan 2020 at 10:52, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Luke,
>
> On Wed, 29 Jan 2020, Luke Diamand wrote:
>
> > git-p4 handles most errors by calling die(). This can leave child
> > processes still running, orphaned.
> >
> >     https://public-inbox.org/git/20190227094926.GE19739@szeder.dev/
> >
> > This is not a problem for humans, but for CI, it is.
>
> Just to make sure that we're talking about the same thing. Looking at
>
> https://dev.azure.com/git/git/_test/analytics?definitionId=11&contextType=build
>
> I am not even sure that `git-p4` is our biggest problem there. In that
> list, the only flaky `git-p4` test I see is t9806.5. And it failed only
> once recently:
> https://dev.azure.com/git/git/_build/results?buildId=1640&view=ms.vss-test-web.build-test-results-tab
>
> The log looks like this:
>
> -- snip --
> [...]
> expecting success of 9806.5 'sync when no master branch prints a nice error':
>         test_when_finished cleanup_git &&
>         git p4 clone --branch=refs/remotes/p4/sb --dest="$git" //depot@2 &&
>         (
>                 cd "$git" &&
>                 test_must_fail git p4 sync 2>err &&
>                 grep "Error: no branch refs/remotes/p4/master" err
>         )
>
> + test_when_finished cleanup_git
> + test 0 = 0
> + test_cleanup={ cleanup_git
>                 } && (exit "$eval_ret"); eval_ret=$?; :
> + git p4 clone --branch=refs/remotes/p4/sb --dest=/home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git //depot@2
> Initialized empty Git repository in /home/vsts/work/1/s/t/trash
> directory.t9806-git-p4-options/git/.git/
> Perforce db files in '.' will be created if missing...
> Perforce db files in '.' will be created if missing...
> Perforce db files in '.' will be created if missing...
> Perforce db files in '.' will be created if missing...
> Perforce db files in '.' will be created if missing...
> Importing from //depot@2 into /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> Doing initial import of //depot/ from revision @2 into refs/remotes/p4/sb
> + cd /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> + test_must_fail git p4 sync
> + _test_ok=
> + git p4 sync
> Performing incremental import into refs/remotes/p4/master git branch
> Depot paths: //depot/
> + exit_code=1
> + test 1 -eq 0
> + test_match_signal 13 1
> + test 1 = 141
> + test 1 = 269
> + return 1
> + test 1 -gt 129
> + test 1 -eq 127
> + test 1 -eq 126
> + return 0
> + grep Error: no branch refs/remotes/p4/master err
> Error: no branch refs/remotes/p4/master; perhaps specify one with --branch.
> + cleanup_git
> + retry_until_success rm -r /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> + nr_tries_left=60
> + rm -r /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> + test_path_is_missing /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> + test -e /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> + echo Path exists:
> Path exists:
> + ls -ld /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> drwxr-xr-x 3 vsts docker 4096 Jan 24 22:20 /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> + test 1 -ge 1
> + echo /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git /home/vsts/work/1/s/t/trash directory.t9806-git-p4-options/git
> + false
> + eval_ret=1
> + :
> -- snap --

This should be fixed by my change. When that error occurs, the current
version calls die(), with my change it raises an exception, and then
cleans up further up the stack.

-                            die("Error: no branch %s; perhaps specify
one with --branch." %
+                            raise P4CommandException("Error: no
branch %s; perhaps specify one with --branch." %


>
> FWIW I see the same test being flaky in Git for Windows (but it _also_
> only happened once in the past 14 days):
> https://dev.azure.com/git-for-windows/git/_test/analytics?definitionId=17&contextType=build
> and
> https://dev.azure.com/git-for-windows/git/_build/results?buildId=49174&view=ms.vss-test-web.build-test-results-tab
>
> The log suggests to me that there is a path that has not been cleaned up,
> and _maybe_ it is timing-related, but it is also possible that a child
> process is still running that should have cleaned it up.
>
> Are your patches about this failure?

I was actually looking at Gábor's report. He has a nifty hack which
makes the errors crop up very rapidly, so I just fixed all of the
failures that arose from that.

It's easy to see from inspecting the code that there are plenty of
other places where this is not handled properly. Probably those should
wait until the python3 fixes have been merged and/or there is some
demand for them.

>
> I see that PATCH 5/6 talks about Gábor's analysis of t9800.6 in
> https://public-inbox.org/git/20190227094926.GE19739@szeder.dev/ which
> _looks_ similar.
>
> FWIW I looked over your patches and they seem relatively obvious, even for
> somebody like me who barely gets to code in Python anymore.

Thanks!
Luke

>
> Thanks,
> Dscho
>
> >
> > This change improves things by raising an exception and cleaning up
> > further up the stack, rather than simply calling die().
> >
> > This is only done in a few places, such that the tests pass with the changes
> > suggested in the link (adding sleep strategically) but there are still
> > plenty of places where git-p4 calls die().
> >
> > This also adds some pylint disables, so that we can start to run pylint
> > on git-p4.
> >
> > Luke Diamand (6):
> >   git-p4: make closeStreams() idempotent
> >   git-p4: add P4CommandException to report errors talking to Perforce
> >   git-p4: disable some pylint warnings, to get pylint output to
> >     something manageable
> >   git-p4: create helper function importRevisions()
> >   git-p4: cleanup better on error exit
> >   git-p4: check for access to remote host earlier
> >
> >  git-p4.py | 180 +++++++++++++++++++++++++++++++++---------------------
> >  1 file changed, 109 insertions(+), 71 deletions(-)
> >
> > --
> > 2.20.1.390.gb5101f9297
> >
> >