diff mbox series

[v2,10/13] tests: include detailed trace logs with --write-junit-xml upon failure

Message ID ae3c42519abff7ef32c767f9587ef7f0160033ed.1539598316.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Series Offer to run CI/PR builds in Azure Pipelines

Commit Message

Johannes Schindelin via GitGitGadget Oct. 15, 2018, 10:12 a.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

The JUnit XML format lends itself to be presented in a powerful UI,
where you can drill down to the information you are interested in very
quickly.

For test failures, this usually means that you want to see the detailed
trace of the failing tests.

With Travis CI, we passed the `--verbose-log` option to get those
traces. However, that seems excessive, as we do not need/use the logs in
almost all of those cases: only when a test fails do we have a way to
include the trace.

So let's do something different when using Azure DevOps: let's run all
the tests with `--quiet` first, and only if a failure is encountered,
try to trace the commands as they are executed.

Of course, we cannot turn on `--verbose-log` after the fact. So let's
just re-run the test with all the same options, adding `--verbose-log`.
And then munge the output file into the JUnit XML on the fly.

Note: there is an off chance that re-running the test in verbose mode
"fixes" the failures (and this does happen from time to time!). That is
a possibility we should be able to live with. Ideally, we would label
this as "Passed upon rerun", and Azure Pipelines even knows about that
outcome, but it is not available when using the JUnit XML format for
now:
https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib.sh | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)
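
The strategy described in the commit message (run quietly first, re-run with
tracing only after a failure) can be sketched in isolation as follows. This is
a minimal illustration, not the code from the patch: `run_quiet_then_trace`
and `always_fails` are hypothetical names, and the stand-in command simply
accepts an extra `--verbose-log` argument on the second run.

```shell
#!/bin/sh
# Sketch only: run_quiet_then_trace and always_fails are hypothetical names.

# Run "$@" quietly. If it fails, re-run it with an extra --verbose-log
# argument and save the combined output to the file named by $1.
run_quiet_then_trace () {
	trace_file=$1
	shift
	if "$@" >/dev/null 2>&1
	then
		echo "ok"
		return 0
	fi
	# The first run failed, so pay for the detailed trace only now.
	if "$@" --verbose-log >"$trace_file" 2>&1
	then
		# The re-run "fixed" the failure, e.g. a flaky test.
		echo "ok (passed upon rerun)"
		return 0
	fi
	echo "not ok (trace saved)"
	return 1
}

# Stand-in for a test script: always fails, and prints a trace line
# only when asked to.
always_fails () {
	test "$1" != --verbose-log || echo "trace: step 1 failed"
	return 1
}
```

Calling `run_quiet_then_trace "$tmp" always_fails` leaves the trace in `$tmp`.
A command that fails once and then succeeds takes the "passed upon rerun"
branch, which is the case the note in the commit message is concerned with.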

Comments

SZEDER Gábor Oct. 16, 2018, 10:04 a.m. UTC | #1
On Mon, Oct 15, 2018 at 03:12:12AM -0700, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> The JUnit XML format lends itself to be presented in a powerful UI,
> where you can drill down to the information you are interested in very
> quickly.
> 
> For test failures, this usually means that you want to see the detailed
> trace of the failing tests.
> 
> With Travis CI, we passed the `--verbose-log` option to get those
> traces. However, that seems excessive, as we do not need/use the logs in

As someone who has dug into a few occasional failures found by Travis
CI, I'd say that the output of '--verbose-log -x' is not excessive,
but downright essential.

> almost all of those cases: only when a test fails do we have a way to
> include the trace.
> 
> So let's do something different when using Azure DevOps: let's run all
> the tests with `--quiet` first, and only if a failure is encountered,
> try to trace the commands as they are executed.
> 
> Of course, we cannot turn on `--verbose-log` after the fact. So let's
> just re-run the test with all the same options, adding `--verbose-log`.
> And then munge the output file into the JUnit XML on the fly.
> 
> Note: there is an off chance that re-running the test in verbose mode
> "fixes" the failures (and this does happen from time to time!). That is
> a possibility we should be able to live with.

Any CI system worth its salt should provide as much information about
any failures as possible, especially when it was lucky enough to
stumble upon a rare and hard to reproduce non-deterministic failure.

> Ideally, we would label
> this as "Passed upon rerun", and Azure Pipelines even knows about that
> outcome, but it is not available when using the JUnit XML format for
> now:
> https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Johannes Schindelin Oct. 16, 2018, 1:02 p.m. UTC | #2
Hi Gábor,

On Tue, 16 Oct 2018, SZEDER Gábor wrote:

> On Mon, Oct 15, 2018 at 03:12:12AM -0700, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > 
> > The JUnit XML format lends itself to be presented in a powerful UI,
> > where you can drill down to the information you are interested in very
> > quickly.
> > 
> > For test failures, this usually means that you want to see the detailed
> > trace of the failing tests.
> > 
> > With Travis CI, we passed the `--verbose-log` option to get those
> > traces. However, that seems excessive, as we do not need/use the logs in
> 
> As someone who has dug into a few occasional failures found by Travis
> CI, I'd say that the output of '--verbose-log -x' is not excessive,
> but downright essential.

I agree that the output is essential for drilling down into failures. This
paragraph, however, talks about the general case: where there are *no*
failures. See here:

> > almost all of those cases: only when a test fails do we have a way to
> > include the trace.
> > 
> > So let's do something different when using Azure DevOps: let's run all
> > the tests with `--quiet` first, and only if a failure is encountered,
> > try to trace the commands as they are executed.
> > 
> > Of course, we cannot turn on `--verbose-log` after the fact. So let's
> > just re-run the test with all the same options, adding `--verbose-log`.
> > And then munge the output file into the JUnit XML on the fly.
> > 
> > Note: there is an off chance that re-running the test in verbose mode
> > "fixes" the failures (and this does happen from time to time!). That is
> > a possibility we should be able to live with.
> 
> Any CI system worth its salt should provide as much information about
> any failures as possible, especially when it was lucky enough to
> stumble upon a rare and hard to reproduce non-deterministic failure.

I would agree with you if more people started to pay attention to our CI
failures. And if we had some sort of a development model where a CI
failure would halt development on that particular topic until the failure
is fixed, with the responsibility assigned to somebody to fix it.

This is not the case here, though. pu has been broken for ages, at least on
Windows, and even a *single* topic is enough to do that. And this is even
worse with flakey tests. I cannot remember *how often* I saw CI failures
in t5570-git-daemon.sh, for example. It is rare enough that it is obvious
that this is a problem of the *regression test*, rather than a problem of
the code that is to be tested.

So I would suggest to go forward with my proposed strategy for the moment,
right up until the time when we have had the resources to fix t5570, for
starters.

Ciao,
Dscho

> > Ideally, we would label this as "Passed upon rerun", and Azure
> > Pipelines even knows about that outcome, but it is not available when
> > using the JUnit XML format for now:
> > https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs
> > 
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>
SZEDER Gábor Oct. 16, 2018, 4:03 p.m. UTC | #3
On Tue, Oct 16, 2018 at 03:02:38PM +0200, Johannes Schindelin wrote:
> Hi Gábor,
> 
> On Tue, 16 Oct 2018, SZEDER Gábor wrote:
> 
> > On Mon, Oct 15, 2018 at 03:12:12AM -0700, Johannes Schindelin via GitGitGadget wrote:
> > > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > > 
> > > The JUnit XML format lends itself to be presented in a powerful UI,
> > > where you can drill down to the information you are interested in very
> > > quickly.
> > > 
> > > For test failures, this usually means that you want to see the detailed
> > > trace of the failing tests.
> > > 
> > > With Travis CI, we passed the `--verbose-log` option to get those
> > > traces. However, that seems excessive, as we do not need/use the logs in
> > 
> > As someone who has dug into a few occasional failures found by Travis
> > CI, I'd say that the output of '--verbose-log -x' is not excessive,
> > but downright essential.
> 
> I agree that the output is essential for drilling down into failures. This
> paragraph, however, talks about the general case: where there are *no*
> failures. See here:

But you don't know in advance whether there will be any failures or
not, so it only makes sense to run all tests with '--verbose-log -x'
by default, just in case a Heisenbug decides to make an appearance.

> > > almost all of those cases: only when a test fails do we have a way to
> > > include the trace.
> > > 
> > > So let's do something different when using Azure DevOps: let's run all
> > > the tests with `--quiet` first, and only if a failure is encountered,
> > > try to trace the commands as they are executed.
> > > 
> > > Of course, we cannot turn on `--verbose-log` after the fact. So let's
> > > just re-run the test with all the same options, adding `--verbose-log`.
> > > And then munge the output file into the JUnit XML on the fly.
> > > 
> > > Note: there is an off chance that re-running the test in verbose mode
> > > "fixes" the failures (and this does happen from time to time!). That is
> > > a possibility we should be able to live with.
> > 
> > Any CI system worth its salt should provide as much information about
> > any failures as possible, especially when it was lucky enough to
> > stumble upon a rare and hard to reproduce non-deterministic failure.
> 
> I would agree with you if more people started to pay attention to our CI
> failures. And if we had some sort of a development model where a CI
> failure would halt development on that particular topic until the failure
> is fixed, with the responsibility assigned to somebody to fix it.
> 
> This is not the case here, though. pu has been broken for ages, at least on
> Windows, and even a *single* topic is enough to do that. And this is even
> worse with flakey tests. I cannot remember *how often* I saw CI failures
> in t5570-git-daemon.sh, for example. It is rare enough that it is obvious
> that this is a problem of the *regression test*, rather than a problem of
> the code that is to be tested.

Some occasional failures in t5570 are actually caused by issues in Git
on certain platforms:

  https://public-inbox.org/git/CAM0VKj=MCS+cmOgzf_XyPeb+qZrFmuMH52-PV_NDMZA9X+rRoA@mail.gmail.com/T/#u

> So I would suggest to go forward with my proposed strategy for the moment,
> right up until the time when we have had the resources to fix t5570, for
> starters.

I don't really understand what the occasional failures in t5570 have
to do with the amount of information a CI system should gather about
failures in general.  Or how many people pay attention to it, or what
kind of development model we have, for that matter.  The way I see it
these are unrelated issues, and a CI system should always provide as
much information about failures as possible.  If only a few people pay
attention to it, then for the sake of those few.


> > > Ideally, we would label this as "Passed upon rerun", and Azure
> > > Pipelines even knows about that outcome, but it is not available when
> > > using the JUnit XML format for now:
> > > https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs
> > > 
> > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
Johannes Schindelin Oct. 16, 2018, 8:53 p.m. UTC | #4
Hi Gábor,

On Tue, 16 Oct 2018, SZEDER Gábor wrote:

> On Tue, Oct 16, 2018 at 03:02:38PM +0200, Johannes Schindelin wrote:
>
> > So I would suggest to go forward with my proposed strategy for the
> > moment, right up until the time when we have had the resources to fix
> > t5570, for starters.
> 
> I don't really understand what the occasional failures in t5570 have
> to do with the amount of information a CI system should gather about
> failures in general.

I see it plenty of times that too many CI failures essentially render
every developer numb.

If every 3rd CI run causes a failure, and seemingly every one of these
failures indicates a mistake in the regression test, rather than a
regression, developers stop paying attention.

Which is the exact opposite of what I want to achieve here.

Ciao,
Dscho

Patch

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6f9c1f5300..8a60e39364 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -85,6 +85,13 @@  done,*)
 	test "$(cat "$BASE.exit")" = 0
 	exit
 	;;
+*' --write-junit-xml '*)
+	# record how to call this script *with* --verbose-log, in case
+	# we encounter a breakage
+	junit_rerun_options_sq="$(printf '%s\n' "$0" --verbose-log -x "$@" |
+		sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/\$/'/" |
+		tr '\012' ' ')"
+	;;
 esac
 
 # For repeatability, reset the environment to known value.
@@ -446,10 +453,31 @@  test_ok_ () {
 test_failure_ () {
 	if test -n "$write_junit_xml"
 	then
+		if test -z "$GIT_TEST_TEE_OUTPUT_FILE"
+		then
+			# clean up
+			test_atexit_handler
+
+			# re-run with --verbose-log
+			echo "# Re-running: $junit_rerun_options_sq" >&2
+
+			cd "$TEST_DIRECTORY" &&
+			eval "${TEST_SHELL_PATH}" "$junit_rerun_options_sq" \
+				>/dev/null 2>&1
+			status=$?
+
+			say_color "" "$(test 0 = $status ||
+				echo "not ")ok $test_count - (re-ran with trace)"
+			say "1..$test_count"
+			GIT_EXIT_OK=t
+			exit $status
+		fi
+
 		junit_insert="<failure message=\"not ok $test_count -"
 		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
 		junit_insert="$junit_insert $(xml_attr_encode \
-			"$(printf '%s\n' "$@" | sed 1d)")"
+			"$(cat "$GIT_TEST_TEE_OUTPUT_FILE")")"
+		>"$GIT_TEST_TEE_OUTPUT_FILE"
 		junit_insert="$junit_insert</failure>"
 		write_junit_xml_testcase "$1" "      $junit_insert"
 	fi
@@ -734,6 +762,10 @@  test_start_ () {
 	if test -n "$write_junit_xml"
 	then
 		junit_start=$(test-tool date getnanos)
+
+		# truncate output
+		test -z "$GIT_TEST_TEE_OUTPUT_FILE" ||
+		>"$GIT_TEST_TEE_OUTPUT_FILE"
 	fi
 }
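
The first hunk records the re-run command line by wrapping every argument in
single quotes so that a later `eval` sees the original words again. The sketch
below isolates that quoting step, reusing the `printf | sed | tr` pipeline from
the patch. The helper name `sq_quote_args` is hypothetical, and arguments that
contain newlines are assumed not to occur, since the pipeline processes one
argument per line.

```shell
#!/bin/sh
# Sketch only: sq_quote_args is a hypothetical helper name.

# Print each argument wrapped in single quotes, with embedded single
# quotes rewritten as '\'' and the results joined by spaces on one line.
# Caveat: arguments containing newlines are not handled.
sq_quote_args () {
	printf '%s\n' "$@" |
	sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/\$/'/" |
	tr '\012' ' '
}

quoted=$(sq_quote_args "a b" "it's" --verbose-log)

# eval re-parses the quoted string into the original three words, so
# spaces and quotes inside an argument survive the round trip.
eval "set -- $quoted"
echo "$#"
```

After the `eval`, `$1` is `a b`, `$2` is `it's` and `$3` is `--verbose-log`.
Splitting `$quoted` without `eval` would break those arguments apart at the
spaces and leave the quote characters in place.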