diff mbox series

[v3,3/4] bisect--helper: reimplement `bisect_run` shell function in C

Message ID 20210411095538.34129-4-mirucam@gmail.com (mailing list archive)
State New, archived
Headers show
Series Finish converting git bisect to C part 4 | expand

Commit Message

Miriam R. April 11, 2021, 9:55 a.m. UTC
From: Tanushree Tumane <tanushreetumane@gmail.com>

Reimplement the `bisect_run()` shell function
in C and also add `--bisect-run` subcommand to
`git bisect--helper` to call it from git-bisect.sh.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
---
 builtin/bisect--helper.c | 83 ++++++++++++++++++++++++++++++++++++++++
 git-bisect.sh            | 62 +-----------------------------
 2 files changed, 84 insertions(+), 61 deletions(-)

Comments

Junio C Hamano April 11, 2021, 8:31 p.m. UTC | #1
Miriam Rubio <mirucam@gmail.com> writes:

> +		if (res < 0 || 128 <= res) {
> +			error(_("bisect run failed: exit code %d from"
> +				" '%s' is < 0 or >= 128"), res, command.buf);

Good now.

> +		temporary_stdout_fd = open(git_path_bisect_run(), O_CREAT | O_WRONLY | O_TRUNC, 0666);
> +		saved_stdout = dup(1);
> +		dup2(temporary_stdout_fd, 1);
> +
> +		res = bisect_state(terms, args.v, args.nr);
> +
> +		dup2(saved_stdout, 1);
> +		close(saved_stdout);
> +		close(temporary_stdout_fd);

In v2, we just let bisect_state() to write to our standard output,
which was the reason why the loss of "cat" in the "write to
BISECT_RUN file and cat it to show to the user" in the scripted
version in v2 was OK.  Now, we are writing out, just like the
scripted version did, to BISECT_RUN, the user does not see its
contents.

Does the distinction matter?  Christian, what's your call on this?

If it does not matter, then the code and tests are good as-is, but
if it does, the fact that you posted this round without noticing any
broken tests means that we have a gap in the test coverage.  Can we
have a new test about this (i.e. at the end of each round in "bisect
run", the output from bisect_state() is shown to the user)?

> +		if (res == BISECT_ONLY_SKIPPED_LEFT)
> +			error(_("bisect run cannot continue any more"));
> +		else if (res == BISECT_INTERNAL_SUCCESS_MERGE_BASE) {
> +			printf(_("bisect run success"));
> +			res = BISECT_OK;
> +		} else if (res) {
> +			error(_("bisect run failed:'git bisect--helper --bisect-state"
> +			" %s' exited with error code %d"), args.v[0], res);
> +		} else {
> +			exit = 0;
> +		}
Junio C Hamano April 11, 2021, 8:33 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Miriam Rubio <mirucam@gmail.com> writes:
>
>> +		if (res < 0 || 128 <= res) {
>> +			error(_("bisect run failed: exit code %d from"
>> +				" '%s' is < 0 or >= 128"), res, command.buf);
>
> Good now.

Oh, while asking for better test coverage, it is somewhat surprising
that the broken error condition check in v2 was never noticed.  Can
we add a test for this, too?

Thanks.
Andrzej Hunt May 4, 2021, 5:26 p.m. UTC | #3
On 11/04/2021 11:55, Miriam Rubio wrote:
> From: Tanushree Tumane <tanushreetumane@gmail.com>
> 
> Reimplement the `bisect_run()` shell function
> in C and also add `--bisect-run` subcommand to
> `git bisect--helper` to call it from git-bisect.sh.
> 
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
> Signed-off-by: Miriam Rubio <mirucam@gmail.com>
> ---
>   builtin/bisect--helper.c | 83 ++++++++++++++++++++++++++++++++++++++++
>   git-bisect.sh            | 62 +-----------------------------
>   2 files changed, 84 insertions(+), 61 deletions(-)
> 
> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
> index 71963979b1..dc87fc7dd0 100644
> --- a/builtin/bisect--helper.c
> +++ b/builtin/bisect--helper.c
> @@ -18,6 +18,7 @@ static GIT_PATH_FUNC(git_path_bisect_log, "BISECT_LOG")
>   static GIT_PATH_FUNC(git_path_head_name, "head-name")
>   static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES")
>   static GIT_PATH_FUNC(git_path_bisect_first_parent, "BISECT_FIRST_PARENT")
> +static GIT_PATH_FUNC(git_path_bisect_run, "BISECT_RUN")
>   
>   static const char * const git_bisect_helper_usage[] = {
>   	N_("git bisect--helper --bisect-reset [<commit>]"),
> @@ -31,6 +32,7 @@ static const char * const git_bisect_helper_usage[] = {
>   	N_("git bisect--helper --bisect-replay <filename>"),
>   	N_("git bisect--helper --bisect-skip [(<rev>|<range>)...]"),
>   	N_("git bisect--helper --bisect-visualize"),
> +	N_("git bisect--helper --bisect-run <cmd>..."),
>   	NULL
>   };
>   
> @@ -1073,6 +1075,78 @@ static int bisect_visualize(struct bisect_terms *terms, const char **argv, int a
>   	return res;
>   }
>   
> +static int bisect_run(struct bisect_terms *terms, const char **argv, int argc)
> +{
> +	int res = BISECT_OK;
> +	struct strbuf command = STRBUF_INIT;
> +	struct strvec args = STRVEC_INIT;
> +	struct strvec run_args = STRVEC_INIT;
> +	int exit = 0;
> +	int temporary_stdout_fd, saved_stdout;
> +
> +	if (bisect_next_check(terms, NULL))
> +		return BISECT_FAILED;
> +
> +	if (argc)
> +		sq_quote_argv(&command, argv);
> +	else
> +		return BISECT_FAILED;
> +
> +	run_args.v[0] = xstrdup(command.buf);
> +	run_args.nr = 1;

AFAIUI manipulating the strvec directly like this means that we will 
violate the promise that strvec.v is always NULL-terminated. It's 
probably safer to call 'strvec_push(run_args, command.buf)' instead of 
manipulating v and nr?

Violating the NULL-termination promise a problem because... (continued 
below)

> +
> +	while (1) {
> +		strvec_clear(&args);
> +		exit = 1;
> +
> +		printf(_("running %s"), command.buf);
> +		res = run_command_v_opt(run_args.v, RUN_USING_SHELL);

run_command_v_opt() implicitly expects a NULL-terminated list of 
strings. It's not documented in run_command_v_opt()'s comments, however 
run_command_v_opt() does explain that it's a wrapper around 
start_command(), which uses child_process, and child_process.argv is 
documented to require a NULL-terminated list.

If argv is not NULL-terminated, we hit a buffer overflow read  in 
prepare_shell_cmd(), which can be reproduced by running something like:

   make CC=clang-11 SANITIZE=address COPTS="-Og -g" GIT_TEST_OPTS=-v 
T=t6030-bisect-porcelain.sh test

which results in ASAN reporting this error:

==2116==ERROR: AddressSanitizer: global-buffer-overflow on address 
0x000001a51e28 at pc 0x0000009c6c1f bp 0x7ffcf0f60670 sp 0x7ffcf0f60668
READ of size 8 at 0x000001a51e28 thread T0
     #0 0x9c6c1e in prepare_shell_cmd run-command.c:284:8
     #1 0x9c6c1e in prepare_cmd run-command.c:419:3
     #2 0x9c6c1e in start_command run-command.c:753:6
     #3 0x9c7d35 in run_command run-command.c:1015:9
     #4 0x9c800c in run_command_v_opt_cd_env_tr2 run-command.c:1051:9
     #5 0x9c800c in run_command_v_opt_cd_env run-command.c:1033:9
     #6 0x9c800c in run_command_v_opt run-command.c:1023:9
     #7 0x4e5b60 in bisect_run builtin/bisect--helper.c:1102:9
     #8 0x4e5b60 in cmd_bisect__helper builtin/bisect--helper.c:1252:9
     #9 0x4ce8fe in run_builtin git.c:461:11
     #10 0x4ccbc8 in handle_builtin git.c:718:3
     #11 0x4cb0cc in run_argv git.c:785:4
     #12 0x4cb0cc in cmd_main git.c:916:19
     #13 0x6beded in main common-main.c:52:11
     #14 0x7f28636f7349 in __libc_start_main (/lib64/libc.so.6+0x24349)
     #15 0x420769 in _start ../sysdeps/x86_64/start.S:120

0x000001a51e28 is located 56 bytes to the left of global variable 
'config_update_recurse_submodules' defined in 'submodule.c:26:12' 
(0x1a51e60) of size 4
0x000001a51e28 is located 0 bytes to the right of global variable 
'empty_strvec' defined in 'strvec.c:5:13' (0x1a51e20) of size 8
SUMMARY: AddressSanitizer: global-buffer-overflow run-command.c:284:8 in 
prepare_shell_cmd


[... snip the rest ...]
Junio C Hamano May 5, 2021, 2:04 a.m. UTC | #4
Andrzej Hunt <andrzej@ahunt.org> writes:

>> +	struct strbuf command = STRBUF_INIT;
>> +	struct strvec args = STRVEC_INIT;
>> +	struct strvec run_args = STRVEC_INIT;
>> + ...
>> +	run_args.v[0] = xstrdup(command.buf);
>> +	run_args.nr = 1;
>
> AFAIUI manipulating the strvec directly like this means that we will
> violate the promise that strvec.v is always NULL-terminated. It's 
> probably safer to call 'strvec_push(run_args, command.buf)' instead of
> manipulating v and nr?

True.

> Violating the NULL-termination promise a problem because... (continued
> below)
>
>> +
>> +	while (1) {
>> +		strvec_clear(&args);
>> +		exit = 1;
>> +
>> +		printf(_("running %s"), command.buf);
>> +		res = run_command_v_opt(run_args.v, RUN_USING_SHELL);
>
> run_command_v_opt() implicitly expects a NULL-terminated list of
> strings. It's not documented in run_command_v_opt()'s comments,
> however run_command_v_opt() does explain that it's a wrapper around 
> start_command(), which uses child_process, and child_process.argv is
> documented to require a NULL-terminated list.
>
> If argv is not NULL-terminated, we hit a buffer overflow read  in
> prepare_shell_cmd(), which can be reproduced by running something
> like:
>
>   make CC=clang-11 SANITIZE=address COPTS="-Og -g" GIT_TEST_OPTS=-v
>   T=t6030-bisect-porcelain.sh test
>
> which results in ASAN reporting this error:
> ...

Thanks for a careful explanation.
Christian Couder May 5, 2021, 9:04 a.m. UTC | #5
On Sun, Apr 11, 2021 at 10:31 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Miriam Rubio <mirucam@gmail.com> writes:

> > +             temporary_stdout_fd = open(git_path_bisect_run(), O_CREAT | O_WRONLY | O_TRUNC, 0666);
> > +             saved_stdout = dup(1);
> > +             dup2(temporary_stdout_fd, 1);
> > +
> > +             res = bisect_state(terms, args.v, args.nr);
> > +
> > +             dup2(saved_stdout, 1);
> > +             close(saved_stdout);
> > +             close(temporary_stdout_fd);
>
> In v2, we just let bisect_state() to write to our standard output,
> which was the reason why the loss of "cat" in the "write to
> BISECT_RUN file and cat it to show to the user" in the scripted
> version in v2 was OK.  Now, we are writing out, just like the
> scripted version did, to BISECT_RUN, the user does not see its
> contents.
>
> Does the distinction matter?  Christian, what's your call on this?

Sorry for the late answer. I think the content of the BISECT_RUN file
should indeed be shown.

bisect_state() calls bisect_auto_next() which calls bisect_next()
which calls bisect_next_all(), and bisect_next_all() uses printf() to
show things like "XXX is the first bad commit" which should be shown
when using `git bisect run`.

Also when adding an "exit 1 &&" before "git bisect reset" at the end
of the `"git bisect run" simple case` test, with 'next' I get:

-----------------
$ ./t6030-bisect-porcelain.sh -i -v
...
expecting success of 6030.21 '"git bisect run" simple case':
       write_script test_script.sh <<-\EOF &&
       ! grep Another hello >/dev/null
       EOF
       git bisect start &&
       git bisect good $HASH1 &&
       git bisect bad $HASH4 &&
       git bisect run ./test_script.sh >my_bisect_log.txt &&
       grep "$HASH3 is the first bad commit" my_bisect_log.txt &&
       exit 1 &&
       git bisect reset

Bisecting: 0 revisions left to test after this (roughly 1 step)
[3de952f2416b6084f557ec417709eac740c6818c] Add <3: Another new day for
git> into <hello>.
3de952f2416b6084f557ec417709eac740c6818c is the first bad commit
FATAL: Unexpected exit with code 1
-----------------

and:

-----------------
$ cat trash\ directory.t6030-bisect-porcelain/.git/BISECT_RUN
3de952f2416b6084f557ec417709eac740c6818c is the first bad commit
commit 3de952f2416b6084f557ec417709eac740c6818c
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:15:13 2005 -0700

   Add <3: Another new day for git> into <hello>.

hello | 1 +
1 file changed, 1 insertion(+)
-----------------

while with 'seen' I get:

-----------------
$ ./t6030-bisect-porcelain.sh -i -v
...
expecting success of 6030.21 '"git bisect run" simple case':
       write_script test_script.sh <<-\EOF &&
       ! grep Another hello >/dev/null
       EOF
       git bisect start &&
       git bisect good $HASH1 &&
       git bisect bad $HASH4 &&
       git bisect run ./test_script.sh >my_bisect_log.txt &&
       grep "$HASH3 is the first bad commit" my_bisect_log.txt &&
       exit 1 &&
       git bisect reset

Bisecting: 0 revisions left to test after this (roughly 1 step)
[3de952f2416b6084f557ec417709eac740c6818c] Add <3: Another new day for
git> into <hello>.
error: bisect run failed:'git bisect--helper --bisect-state good'
exited with error code -10
running  './test_script.sh'running
'./test_script.sh'3de952f2416b6084f557ec417709eac740c6818c is the
first bad commit
FATAL: Unexpected exit with code 1
-----------------

and:

-----------------
$ cat trash\ directory.t6030-bisect-porcelain/.git/BISECT_RUN
-----------------

So it looks like there might be another issue with what's in 'seen'.

> If it does not matter, then the code and tests are good as-is, but
> if it does, the fact that you posted this round without noticing any
> broken tests means that we have a gap in the test coverage.  Can we
> have a new test about this (i.e. at the end of each round in "bisect
> run", the output from bisect_state() is shown to the user)?

Definitely it seems that taking a look at the tests is a good idea.
For example, comparing the verbose (with -v) output of t6030 before
and after each patch (and before and after this patch series) might
help find issues.
diff mbox series

Patch

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 71963979b1..dc87fc7dd0 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -18,6 +18,7 @@  static GIT_PATH_FUNC(git_path_bisect_log, "BISECT_LOG")
 static GIT_PATH_FUNC(git_path_head_name, "head-name")
 static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES")
 static GIT_PATH_FUNC(git_path_bisect_first_parent, "BISECT_FIRST_PARENT")
+static GIT_PATH_FUNC(git_path_bisect_run, "BISECT_RUN")
 
 static const char * const git_bisect_helper_usage[] = {
 	N_("git bisect--helper --bisect-reset [<commit>]"),
@@ -31,6 +32,7 @@  static const char * const git_bisect_helper_usage[] = {
 	N_("git bisect--helper --bisect-replay <filename>"),
 	N_("git bisect--helper --bisect-skip [(<rev>|<range>)...]"),
 	N_("git bisect--helper --bisect-visualize"),
+	N_("git bisect--helper --bisect-run <cmd>..."),
 	NULL
 };
 
@@ -1073,6 +1075,78 @@  static int bisect_visualize(struct bisect_terms *terms, const char **argv, int a
 	return res;
 }
 
+static int bisect_run(struct bisect_terms *terms, const char **argv, int argc)
+{
+	int res = BISECT_OK;
+	struct strbuf command = STRBUF_INIT;
+	struct strvec args = STRVEC_INIT;
+	struct strvec run_args = STRVEC_INIT;
+	int exit = 0;
+	int temporary_stdout_fd, saved_stdout;
+
+	if (bisect_next_check(terms, NULL))
+		return BISECT_FAILED;
+
+	if (argc)
+		sq_quote_argv(&command, argv);
+	else
+		return BISECT_FAILED;
+
+	run_args.v[0] = xstrdup(command.buf);
+	run_args.nr = 1;
+
+	while (1) {
+		strvec_clear(&args);
+		exit = 1;
+
+		printf(_("running %s"), command.buf);
+		res = run_command_v_opt(run_args.v, RUN_USING_SHELL);
+
+		if (res < 0 || 128 <= res) {
+			error(_("bisect run failed: exit code %d from"
+				" '%s' is < 0 or >= 128"), res, command.buf);
+			strbuf_release(&command);
+			return res;
+		}
+
+		if (res == 125)
+			strvec_push(&args, "skip");
+		else if (res > 0)
+			strvec_push(&args, terms->term_bad);
+		else
+			strvec_push(&args, terms->term_good);
+
+		temporary_stdout_fd = open(git_path_bisect_run(), O_CREAT | O_WRONLY | O_TRUNC, 0666);
+		saved_stdout = dup(1);
+		dup2(temporary_stdout_fd, 1);
+
+		res = bisect_state(terms, args.v, args.nr);
+
+		dup2(saved_stdout, 1);
+		close(saved_stdout);
+		close(temporary_stdout_fd);
+
+		if (res == BISECT_ONLY_SKIPPED_LEFT)
+			error(_("bisect run cannot continue any more"));
+		else if (res == BISECT_INTERNAL_SUCCESS_MERGE_BASE) {
+			printf(_("bisect run success"));
+			res = BISECT_OK;
+		} else if (res) {
+			error(_("bisect run failed:'git bisect--helper --bisect-state"
+			" %s' exited with error code %d"), args.v[0], res);
+		} else {
+			exit = 0;
+		}
+
+		if (exit) {
+			strbuf_release(&command);
+			strvec_clear(&args);
+			strvec_clear(&run_args);
+			return res;
+		}
+	}
+}
+
 int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 {
 	enum {
@@ -1087,6 +1161,7 @@  int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		BISECT_REPLAY,
 		BISECT_SKIP,
 		BISECT_VISUALIZE,
+		BISECT_RUN,
 	} cmdmode = 0;
 	int res = 0, nolog = 0;
 	struct option options[] = {
@@ -1110,6 +1185,8 @@  int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 			 N_("skip some commits for checkout"), BISECT_SKIP),
 		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
 			 N_("visualize the bisection"), BISECT_VISUALIZE),
+		OPT_CMDMODE(0, "bisect-run", &cmdmode,
+			 N_("use <cmd>... to automatically bisect."), BISECT_RUN),
 		OPT_BOOL(0, "no-log", &nolog,
 			 N_("no log for BISECT_WRITE")),
 		OPT_END()
@@ -1174,6 +1251,12 @@  int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		get_terms(&terms);
 		res = bisect_visualize(&terms, argv, argc);
 		break;
+	case BISECT_RUN:
+		if (!argc)
+			return error(_("bisect run failed: no command provided."));
+		get_terms(&terms);
+		res = bisect_run(&terms, argv, argc);
+		break;
 	default:
 		BUG("unknown subcommand %d", cmdmode);
 	}
diff --git a/git-bisect.sh b/git-bisect.sh
index 95f7f3fb8c..e83d011e17 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -39,66 +39,6 @@  _x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
 TERM_BAD=bad
 TERM_GOOD=good
 
-bisect_run () {
-	git bisect--helper --bisect-next-check $TERM_GOOD $TERM_BAD fail || exit
-
-	test -n "$*" || die "$(gettext "bisect run failed: no command provided.")"
-
-	while true
-	do
-		command="$@"
-		eval_gettextln "running \$command"
-		"$@"
-		res=$?
-
-		# Check for really bad run error.
-		if [ $res -lt 0 -o $res -ge 128 ]
-		then
-			eval_gettextln "bisect run failed:
-exit code \$res from '\$command' is < 0 or >= 128" >&2
-			exit $res
-		fi
-
-		# Find current state depending on run success or failure.
-		# A special exit code of 125 means cannot test.
-		if [ $res -eq 125 ]
-		then
-			state='skip'
-		elif [ $res -gt 0 ]
-		then
-			state="$TERM_BAD"
-		else
-			state="$TERM_GOOD"
-		fi
-
-		git bisect--helper --bisect-state $state >"$GIT_DIR/BISECT_RUN"
-		res=$?
-
-		cat "$GIT_DIR/BISECT_RUN"
-
-		if sane_grep "first $TERM_BAD commit could be any of" "$GIT_DIR/BISECT_RUN" \
-			>/dev/null
-		then
-			gettextln "bisect run cannot continue any more" >&2
-			exit $res
-		fi
-
-		if [ $res -ne 0 ]
-		then
-			eval_gettextln "bisect run failed:
-'bisect-state \$state' exited with error code \$res" >&2
-			exit $res
-		fi
-
-		if sane_grep "is the first $TERM_BAD commit" "$GIT_DIR/BISECT_RUN" >/dev/null
-		then
-			gettextln "bisect run success"
-			exit 0;
-		fi
-
-	done
-}
-
 get_terms () {
 	if test -s "$GIT_DIR/BISECT_TERMS"
 	then
@@ -137,7 +77,7 @@  case "$#" in
 	log)
 		git bisect--helper --bisect-log || exit ;;
 	run)
-		bisect_run "$@" ;;
+		git bisect--helper --bisect-run "$@" || exit;;
 	terms)
 		git bisect--helper --bisect-terms "$@" || exit;;
 	*)