diff mbox series

run_command: teach API users to use embedded 'args' more

Message ID xmqqmu2ht58g.fsf_-_@gitster.c.googlers.com (mailing list archive)
State Accepted
Commit afbdba391eaf3c473eff8f12437ff510935b520f
Headers show
Series run_command: teach API users to use embedded 'args' more | expand

Commit Message

Junio C Hamano Aug. 26, 2020, 10:25 p.m. UTC
The child_process structure has an embedded strvec for formulating
the command line argument list these days, but code that predates
the wide use of it prepared a separate char *argv[] array and
manually set the child_process.argv pointer point at it.

Teach these old-style code to lose the separate argv[] array.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 convert.c    |  5 +----
 credential.c |  4 +---
 submodule.c  | 13 ++++---------
 trailer.c    |  4 +---
 4 files changed, 7 insertions(+), 19 deletions(-)

Comments

Jeff King Aug. 27, 2020, 4:21 a.m. UTC | #1
On Wed, Aug 26, 2020 at 03:25:03PM -0700, Junio C Hamano wrote:

> The child_process structure has an embedded strvec for formulating
> the command line argument list these days, but code that predates
> the wide use of it prepared a separate char *argv[] array and
> manually set the child_process.argv pointer point at it.
> 
> Teach these old-style code to lose the separate argv[] array.

I think the result is much nicer and less error-prone (especially the
ones that pre-size the array with NULLs and fill in the components
later). It incurs a few extra mallocs at run-time, but on top of an
execve(), that's a drop in the bucket.

I've actually considered dropping child_process.argv entirely. Having
two separate ways to do the same thing gives the potential for
confusion. But I never dug into whether any existing callers would be
made worse for it (I kind of doubt it, though; worst case they can use
strvec_pushv). There are still several left after this patch, it seems.

Likewise for child_process.env_array.

-Peff
Junio C Hamano Aug. 27, 2020, 4:30 a.m. UTC | #2
Jeff King <peff@peff.net> writes:

> I've actually considered dropping child_process.argv entirely. Having
> two separate ways to do the same thing gives the potential for
> confusion. But I never dug into whether any existing callers would be
> made worse for it (I kind of doubt it, though; worst case they can use
> strvec_pushv). There are still several left after this patch, it seems.
>
> Likewise for child_process.env_array.

Yup, conversion similar to what I did in this patch may be too
trivial for #microproject, but would nevertheless be a good
#leftoverbits task.  The removal of .argv/.env is not entirely
trivial but a good candidate for #microproject.

Thanks.
Eric Sunshine Aug. 27, 2020, 4:31 a.m. UTC | #3
On Thu, Aug 27, 2020 at 12:22 AM Jeff King <peff@peff.net> wrote:
> I've actually considered dropping child_process.argv entirely. Having
> two separate ways to do the same thing gives the potential for
> confusion. [...]
>
> Likewise for child_process.env_array.

builtin/worktree.c:add_worktree() is a case in which an environment
strvec is built once and re-used for multiple run_command()
invocations by re-assigning it to child_process.env before each
run_command(). It uses child_process.env rather than
child_process.env_array because run_command() clears
child_process.env_array upon completion, which makes it difficult to
reuse a prepared environment strvec repeatedly.

Nevertheless, that isn't much of a reason to keep child_process.env.
Refactoring add_worktree() a bit to rebuild the environment strvec
on-demand wouldn't be very difficult.
Jeff King Aug. 27, 2020, 4:44 a.m. UTC | #4
On Thu, Aug 27, 2020 at 12:31:52AM -0400, Eric Sunshine wrote:

> On Thu, Aug 27, 2020 at 12:22 AM Jeff King <peff@peff.net> wrote:
> > I've actually considered dropping child_process.argv entirely. Having
> > two separate ways to do the same thing gives the potential for
> > confusion. [...]
> >
> > Likewise for child_process.env_array.
> 
> builtin/worktree.c:add_worktree() is a case in which an environment
> strvec is built once and re-used for multiple run_command()
> invocations by re-assigning it to child_process.env before each
> run_command(). It uses child_process.env rather than
> child_process.env_array because run_command() clears
> child_process.env_array upon completion, which makes it difficult to
> reuse a prepared environment strvec repeatedly.
> 
> Nevertheless, that isn't much of a reason to keep child_process.env.
> Refactoring add_worktree() a bit to rebuild the environment strvec
> on-demand wouldn't be very difficult.

I think they'd still be one-liner changes, like:

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 378f332b5d..b40f0d4cea 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -422,7 +422,7 @@ static int add_worktree(const char *path, const char *refname,
 			strvec_push(&cp.args, "--quiet");
 	}
 
-	cp.env = child_env.v;
+	strvec_pushv(&cp.env_array, child_env.v);
 	ret = run_command(&cp);
 	if (ret)
 		goto done;
@@ -433,7 +433,7 @@ static int add_worktree(const char *path, const char *refname,
 		strvec_pushl(&cp.args, "reset", "--hard", "--no-recurse-submodules", NULL);
 		if (opts->quiet)
 			strvec_push(&cp.args, "--quiet");
-		cp.env = child_env.v;
+		strvec_pushv(&cp.env_array, child_env.v);
 		ret = run_command(&cp);
 		if (ret)
 			goto done;

I think there are other opportunities for cleanup, too:

  - the code right above the second hunk clears cp.args manually. That
    shouldn't be necessary because run_command() will leave it in a
    blank state (and we're already relying on that, since otherwise we'd
    be left with cruft in other fields from the previous run).

  - check_clean_worktree() only uses it once, and could drop the
    separate child_env (and in fact appears to leak it)

-Peff
Eric Sunshine Aug. 27, 2020, 5:03 a.m. UTC | #5
On Thu, Aug 27, 2020 at 12:44 AM Jeff King <peff@peff.net> wrote:
> On Thu, Aug 27, 2020 at 12:31:52AM -0400, Eric Sunshine wrote:
> > Nevertheless, that isn't much of a reason to keep child_process.env.
> > Refactoring add_worktree() a bit to rebuild the environment strvec
> > on-demand wouldn't be very difficult.
>
> I think they'd still be one-liner changes, like:
>
> -       cp.env = child_env.v;
> +       strvec_pushv(&cp.env_array, child_env.v);

Nice and simple.

> I think there are other opportunities for cleanup, too:

True on both counts.

>   - the code right above the second hunk clears cp.args manually. That
>     shouldn't be necessary because run_command() will leave it in a
>     blank state (and we're already relying on that, since otherwise we'd
>     be left with cruft in other fields from the previous run).

Right. I wonder why the author of 7f44e3d1de (worktree: make setup of
new HEAD distinct from worktree population, 2015-07-17) chose to clear
cp.args manually like that.

>   - check_clean_worktree() only uses it once, and could drop the
>     separate child_env (and in fact appears to leak it)

Perhaps this unnecessary 'child_env' strvec was a copy/paste from
add_worktree()? But certainly cp.env_array would be simpler and avoid
the leak.
diff mbox series

Patch

diff --git a/convert.c b/convert.c
index 572449825c..8e6c292421 100644
--- a/convert.c
+++ b/convert.c
@@ -638,7 +638,6 @@  static int filter_buffer_or_fd(int in, int out, void *data)
 	struct child_process child_process = CHILD_PROCESS_INIT;
 	struct filter_params *params = (struct filter_params *)data;
 	int write_err, status;
-	const char *argv[] = { NULL, NULL };
 
 	/* apply % substitution to cmd */
 	struct strbuf cmd = STRBUF_INIT;
@@ -656,9 +655,7 @@  static int filter_buffer_or_fd(int in, int out, void *data)
 	strbuf_expand(&cmd, params->cmd, strbuf_expand_dict_cb, &dict);
 	strbuf_release(&path);
 
-	argv[0] = cmd.buf;
-
-	child_process.argv = argv;
+	strvec_push(&child_process.args, cmd.buf);
 	child_process.use_shell = 1;
 	child_process.in = -1;
 	child_process.out = out;
diff --git a/credential.c b/credential.c
index d8d226b97e..efc29dc5e1 100644
--- a/credential.c
+++ b/credential.c
@@ -274,11 +274,9 @@  static int run_credential_helper(struct credential *c,
 				 int want_output)
 {
 	struct child_process helper = CHILD_PROCESS_INIT;
-	const char *argv[] = { NULL, NULL };
 	FILE *fp;
 
-	argv[0] = cmd;
-	helper.argv = argv;
+	strvec_push(&helper.args, cmd);
 	helper.use_shell = 1;
 	helper.in = -1;
 	if (want_output)
diff --git a/submodule.c b/submodule.c
index 01697848be..e6086faff7 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1727,14 +1727,6 @@  unsigned is_submodule_modified(const char *path, int ignore_untracked)
 int submodule_uses_gitfile(const char *path)
 {
 	struct child_process cp = CHILD_PROCESS_INIT;
-	const char *argv[] = {
-		"submodule",
-		"foreach",
-		"--quiet",
-		"--recursive",
-		"test -f .git",
-		NULL,
-	};
 	struct strbuf buf = STRBUF_INIT;
 	const char *git_dir;
 
@@ -1747,7 +1739,10 @@  int submodule_uses_gitfile(const char *path)
 	strbuf_release(&buf);
 
 	/* Now test that all nested submodules use a gitfile too */
-	cp.argv = argv;
+	strvec_pushl(&cp.args,
+		     "submodule", "foreach", "--quiet",	"--recursive",
+		     "test -f .git", NULL);
+
 	prepare_submodule_repo_env(&cp.env_array);
 	cp.git_cmd = 1;
 	cp.no_stdin = 1;
diff --git a/trailer.c b/trailer.c
index 0c414f2fed..68dabc2556 100644
--- a/trailer.c
+++ b/trailer.c
@@ -221,15 +221,13 @@  static char *apply_command(const char *command, const char *arg)
 	struct strbuf cmd = STRBUF_INIT;
 	struct strbuf buf = STRBUF_INIT;
 	struct child_process cp = CHILD_PROCESS_INIT;
-	const char *argv[] = {NULL, NULL};
 	char *result;
 
 	strbuf_addstr(&cmd, command);
 	if (arg)
 		strbuf_replace(&cmd, TRAILER_ARG_STRING, arg);
 
-	argv[0] = cmd.buf;
-	cp.argv = argv;
+	strvec_push(&cp.args, cmd.buf);
 	cp.env = local_repo_env;
 	cp.no_stdin = 1;
 	cp.use_shell = 1;