diff mbox series

[v2,2/3] submodule: port submodule subcommand 'add' from shell to C

Message ID 20201007074538.25891-3-shouryashukla.oo@gmail.com (mailing list archive)
State New, archived
Headers show
Series submodule: port subcommand add from shell to C | expand

Commit Message

Shourya Shukla Oct. 7, 2020, 7:45 a.m. UTC
From: Prathamesh Chavan <pc44800@gmail.com>

Convert submodule subcommand 'add' to a builtin and call it via
'git-submodule.sh'.

Also, since the command die()s out in case of absence of commits in the
submodule, the keyword 'fatal' is prefixed in the error messages.
Therefore, prepend the keyword in the expected output of test t7400.6.

While at it, eliminate the extra preprocessor directive
`#include "dir.h"` at the start of 'submodule--helper.c'.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Stefan Beller <stefanbeller@gmail.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
---
 builtin/submodule--helper.c | 391 +++++++++++++++++++++++++++++++++++-
 git-submodule.sh            | 161 +--------------
 t/t7400-submodule-basic.sh  |   2 +-
 3 files changed, 392 insertions(+), 162 deletions(-)

Comments

Junio C Hamano Oct. 7, 2020, 6:37 p.m. UTC | #1
Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> From: Prathamesh Chavan <pc44800@gmail.com>
>
> Convert submodule subcommand 'add' to a builtin and call it via
> 'git-submodule.sh'.
>
> Also, since the command die()s out in case of absence of commits in the
> submodule, the keyword 'fatal' is prefixed in the error messages.
> Therefore, prepend the keyword in the expected output of test t7400.6.
>
> While at it, eliminate the extra preprocessor directive
> `#include "dir.h"` at the start of 'submodule--helper.c'.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Mentored-by: Stefan Beller <stefanbeller@gmail.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
> ---
>  builtin/submodule--helper.c | 391 +++++++++++++++++++++++++++++++++++-
>  git-submodule.sh            | 161 +--------------
>  t/t7400-submodule-basic.sh  |   2 +-
>  3 files changed, 392 insertions(+), 162 deletions(-)

Whoa.  That looks like a huge change.  Makes me wonder if we want
this split into multiple pieces, but let's read on.

> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index de5ad73bb8..ec0a50d032 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -19,7 +19,6 @@
>  #include "diffcore.h"
>  #include "diff.h"
>  #include "object-store.h"
> -#include "dir.h"
>  #include "advice.h"
>  
>  #define OPT_QUIET (1 << 0)
> @@ -2744,6 +2743,395 @@ static int module_set_branch(int argc, const char **argv, const char *prefix)
>  	return !!ret;
>  }
>  
> +struct add_data {
> +	const char *prefix;
> +	const char *branch;
> +	const char *reference_path;
> +	const char *sm_path;
> +	const char *sm_name;
> +	const char *repo;
> +	const char *realrepo;
> +	int depth;
> +	unsigned int force: 1;
> +	unsigned int quiet: 1;
> +	unsigned int progress: 1;
> +	unsigned int dissociate: 1;
> +};
> +#define ADD_DATA_INIT { NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 0 }

This is used in a context like this:

	struct add_data data = ADD_DATA_INIT;

It is a tangent, but wouldn't

	#define ADD_DATA_INIT { 0 }

be a more appropriate way to express that there is nothing other
than the initialization to zero values going on?

> +/*
> + * Guess dir name from repository: strip leading '.*[/:]',
> + * strip trailing '[:/]*.git'.
> + */

The original also strips trailing '/'.  The original does these in
order:

 - if $repo ends with '/', remove that.  The above description does
   not mention it.

 - if the result of the above ends with zero or more ':', followed
   by zero or more '/', followed by ".git", drop the matching part.
   The above description sounds as if it will remove ":/:/:.git"
   from the end (and the code seems to have the same bug, as
   after_slash_or_colon won't allow the code to know if the previous
   character before ".git" was slash or colon).

 - if the result of the above has '/' or ':' in it, remove everything
   before it and '/' or ':' itself.

> +static char *guess_dir_name(const char *repo)
> +{
> +	const char *p, *start, *end, *limit;
> +	int after_slash_or_colon;
> +
> +	after_slash_or_colon = 0;
> +	limit = repo + strlen(repo);
> +	start = repo;
> +	end = limit;
> +	for (p = repo; p < limit; p++) {
> +		if (starts_with(p, ".git")) {
> +			/* strip trailing '[:/]*.git' */
> +			if (!after_slash_or_colon)
> +				end = p;
> +			p += 3;
> +		} else if (*p == '/' || *p == ':') {
> +			/* strip leading '.*[/:]' */
> +			if (end == limit)
> +				end = p;
> +			after_slash_or_colon = 1;
> +		} else if (after_slash_or_colon) {
> +			start = p;
> +			end = limit;
> +			after_slash_or_colon = 0;
> +		}
> +	}
> +	return xstrndup(start, end - start);

So, this looks quite bogus and unnatural.  Checking for ".git" at
every position in the string is meaningless.

I wonder if something along the following (beware: not even compile
tested or checked for off-by-ones) would be easier to follow and
more faithful conversion to the original.

	ep = repo + strlen(repo);

        /*
         * eat trailing slashes - a conversion less faithful to
         * the original may want to loop to cull duplicated trailing
	 * slashes, but we can leave it as user-error for now.
	 */
	if (repo < ep - 1 && ep[-1] == '/')
		ep--;

	/* eat ":*/*\.git" at the tail */
	if (repo < ep - 4 && !memcmp(".git", ep - 4, 4)) {
		ep -= 4;
		while (repo < ep - 1 && ep[-1] == '/')
			ep--;
		while (repo < ep - 1 && ep[-1] == ':')
			ep--;
	}

	/* find the last ':' or '/' */
	for (sp = ep - 1; repo <= sp; sp--) {
		if (*sp == '/' || *sp == ':')
			break;
	}
        sp++; /* exclude '/' or ':' itself */

        /* sp point at the beginning, and ep points at the end */
	return xmemdupz(sp, ep - sp);

> +}

That's it for now; I didn't look at the remainder of this patch
during this sitting before I have to move on, but I may revisit the
rest at some other time.

Thanks.
Junio C Hamano Oct. 7, 2020, 10:19 p.m. UTC | #2
Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> +static void fprintf_submodule_remote(const char *str)

The fact that the helper happens to use fprintf() to do its job is
much less important than it writes to the standard error stream.
Name it after what it does than how it does so.  Is there a word
that explains at a higher-level concept than "print to stderr" that
this function tries to achieve?  

Same question for the name of the only caller of this function,
modify_remote_v().  That name does not mean anything to readers
other than that it futz with output from "remote -v" command, which
is the least interesting piece of information.  What does it try to
achieve by using "remote -v"?  Can we name the function after that?

> +{
> +	const char *p = str;
> +	const char *start;
> +	const char *end;
> +	char *name, *url;
> +
> +	start = p;
> +	while (*p != ' ')
> +		p++;
> +	end = p;
> +	name = xstrndup(start, end - start);
> +	while(*p == ' ')
> +		p++;

Perhaps make a small helper out of these seven lines, so that the
caller can say something like

    p = str;
    name = parse_token(&p);
    url = parse_token(&p);

This one you should be able to do without any extra allocation,
though.  Just write a parse_token() that finds start and length,
prepare "char *name; int namelen" and the same pair for URL,
and then

	fprintf(stderr, "  %.*s\t%.*s\n",
		namelen, name, urllen, url);

> +	fprintf(stderr, "  %s\t%s\n", name, url);
> +	free(name);
> +	free(url);
> +}

> +static int check_sm_exists(unsigned int force, const char *path) {
> +
> +	int cache_pos, dir_in_cache = 0;

Have a blank line here to separate decl and the first statement.

> +	if (read_cache() < 0)
> +		die(_("index file corrupt"));
> +
> +	cache_pos = cache_name_pos(path, strlen(path));
> +	if(cache_pos < 0 && (directory_exists_in_index(&the_index,
> +	   path, strlen(path)) == index_directory))
> +		dir_in_cache = 1;

Funny line wrapping.  Try to cut long line at an operator as close
to the root of the parse tree (in this case, &&) as possible, i.e.

	if (cache_pos < 0 &&
	    directory_exists_in_index(&the_index, path, strlen(path)) == index_directory)

It is OK to further wrap after == if the second line bothers you.

A bigger question.  Can the path be a regular file but at a higher
stage because we are in the middle of a conflicted merge?  We'd get
cache_pos that is negative in that case, too, and we definitely would
want to say the path already exists in the index in such a case, but ...

> +
> +	if (!force) {
> +		if (cache_pos >= 0 || dir_in_cache)
> +			die(_("'%s' already exists in the index"), path);

... the current code may not trigger this die() in such a case, no?

> +	} else {
> +		struct cache_entry *ce = NULL;
> +		if (cache_pos >= 0)
> +			ce = the_index.cache[cache_pos];
> +		if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
> +			die(_("'%s' already exists in the index and is not a "
> +			      "submodule"), path);

Likewise here.  cache_pos < 0 does not automatically mean it does
not exist.  It tells you that it does not exist as a merged entry.

> +	}
> +	return 0;
> +}
> +
> +static void modify_remote_v(struct strbuf *sb)

This roughly corresponds to this part of the original

    grep '(fetch)' | sed -e s,^,"  ", -e s,' (fetch)',,

I actualy would suggest moving the "git remote -v" invocation and
capturing of its output to this helper function and name it after
what it does, which seems to be "show fetch remotes" to me.

> +{
> +	int i;
> +	for (i = 0; i < sb->len; i++) {
> +		const char *start = sb->buf + i;
> +		const char *end = start;
> +		while (sb->buf[i++] != '\n')
> +			end++;
> +		if (!strcmp("fetch", xstrndup(end - 6, 5)))

The original makes sure the 'fetch' appears inside "()" but this
does not.  Any reason why we want to do it differently?

> +			fprintf_submodule_remote(xstrndup(start, end - start - 7));

The result of xstrndup() is leaking here.

In any case, with a helper function like parse_token() suggested
before, you can get rid of the fprintf_submodule_remote() helper and
open code it here, without any temporary allocation and freeing.
You'd have the start of each line of "git remote -v" output (so you
know where it starts and it ends), and a parser that roughly does
this:

	/*
	 * cp points at the current location, and ep points the
	 * end of the buffer.  find the tail of the current string
	 * and store its length in *len, skip over whitespaces and
	 * return the location to be used as the new cp.
	 */
	const char *parse_token(char *cp, char *ep, int *len);

and make the latter half of this function (former half would be
spawning "remote -v" and capturing its output in sb) a loop whose
body may look like

	{
		char *end, *name, *url, *tail;
		int namelen, urllen;

		end = strchrnul(start, '\n');
		name = start;
		url = parse_token(name, end, &namelen);
		tail = parse_token(url, end, &urllen);
		if (!memcmp(tail, "(fetch)", 7))
			fprintf(stderr, ...); /* see above */
		start = *end ? end + 1 : end;
	}

> +static int add_submodule(struct add_data *info)
> +{
> +	/* perhaps the path exists and is already a git repo, else clone it */
> +	if (is_directory(info->sm_path)) {
> +		char *sub_git_path = xstrfmt("%s/.git", info->sm_path);
> +		if (is_directory(sub_git_path) || file_exists(sub_git_path))
> +			printf(_("Adding existing repo at '%s' to the index\n"),
> +				 info->sm_path);
> +		else
> +			die(_("'%s' already exists and is not a valid git repo"),
> +			      info->sm_path);
> +		free(sub_git_path);
> +	} else {
> +		struct strvec clone_args = STRVEC_INIT;
> +		struct child_process cp = CHILD_PROCESS_INIT;
> +		char *submodule_git_dir = xstrfmt(".git/modules/%s", info->sm_name);
> +
> +		if (is_directory(submodule_git_dir)) {
> +			if (!info->force) {

As I said, it would be a better organization to have a helper
function that does what is done from here ...

> +				struct child_process cp_rem = CHILD_PROCESS_INIT;
> +				struct strbuf sb_rem = STRBUF_INIT;
> +				cp_rem.git_cmd = 1;
> +				fprintf(stderr, _("A git directory for '%s' is "
> +					"found locally with remote(s):\n"),
> +					info->sm_name);
> +				strvec_pushf(&cp_rem.env_array,
> +					     "GIT_DIR=%s", submodule_git_dir);
> +				strvec_push(&cp_rem.env_array, "GIT_WORK_TREE=.");
> +				strvec_pushl(&cp_rem.args, "remote", "-v", NULL);
> +				if (!capture_command(&cp_rem, &sb_rem, 0)) {
> +					modify_remote_v(&sb_rem);
> +				}

... up to here.  Can you say what the purpose of that helper
function?  I'd say it is for a given git repository (specified by
submodule_git_dir, which we will pass as the parameter to that
helper), report the names and URLs of fetch remotes defined in that
repository.  So, perhaps its signature might be:

	static void report_fetch_remotes(FILE *output, const char *git_dir);

where we would make a call to it from here like so:

				report_fetch_remotes(stderr, submodule_git_dir);

> +				error(_("If you want to reuse this local git "
> +				      "directory instead of cloning again from\n "
> +				      "  %s\n"
> +				      "use the '--force' option. If the local "
> +				      "git directory is not the correct repo\n"
> +				      "or you are unsure what this means choose "
> +				      "another name with the '--name' option."),
> +				      info->realrepo);
> +				return 1;
> +			} else {
> +				printf(_("Reactivating local git directory for "
> +					 "submodule '%s'."), info->sm_path);
> +			}
> +		}
> +		free(submodule_git_dir);

I think I've reviewed up to this point this time around.

Thanks.
Junio C Hamano Oct. 8, 2020, 5:19 p.m. UTC | #3
Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> +static void config_added_submodule(struct add_data *info)
> +{

This one I may take a look at later, but won't review in this
message.

> +}
> +
> +static int module_add(int argc, const char **argv, const char *prefix)
> +{
> +	const char *branch = NULL, *custom_name = NULL, *realrepo = NULL;
> +	const char *reference_path = NULL, *repo = NULL, *name = NULL;
> +	char *path;
> +	int force = 0, quiet = 0, depth = -1, progress = 0, dissociate = 0;
> +	struct add_data info = ADD_DATA_INIT;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	struct option options[] = {
> +		OPT_STRING('b', "branch", &branch, N_("branch"),
> +			   N_("branch of repository to add as submodule")),
> +		OPT_BOOL('f', "force", &force, N_("allow adding an otherwise "
> +						  "ignored submodule path")),
> +		OPT__QUIET(&quiet, N_("print only error messages")),
> +		OPT_BOOL(0, "progress", &progress, N_("force cloning progress")),
> +		OPT_STRING(0, "reference", &reference_path, N_("repository"),
> +			   N_("reference repository")),
> +		OPT_BOOL(0, "dissociate", &dissociate, N_("borrow the objects from reference repositories")),
> +		OPT_STRING(0, "name", &custom_name, N_("name"),
> +			   N_("sets the submodule’s name to the given string "
> +			      "instead of defaulting to its path")),
> +		OPT_INTEGER(0, "depth", &depth, N_("depth for shallow clones")),
> +		OPT_END()
> +	};
> +
> +	const char *const usage[] = {
> +		N_("git submodule--helper add [<options>] [--] [<path>]"),
> +		NULL
> +	};
> +
> +	argc = parse_options(argc, argv, prefix, options, usage, 0);
> +
> +	if (!is_writing_gitmodules_ok())
> +		die(_("please make sure that the .gitmodules file is in the working tree"));
> +
> +	if (reference_path && !is_absolute_path(reference_path) && prefix)

Checking "*prefix" lets us avoid an unnecessary allocation, i.e.

	if (prefix && *prefix &&
	    reference_path && !is_absolute_path(reference_path))

> +		reference_path = xstrfmt("%s%s", prefix, reference_path);
> +
> +	if (argc == 0 || argc > 2) {

Nice that you are checking excess args, which the original didn't do.

> +		usage_with_options(usage, options);
> +	} else if (argc == 1) {
> +		repo = argv[0];
> +		path = guess_dir_name(repo);

We've reviewed the function already.  Good.

> +	} else {
> +		repo = argv[0];
> +		path = xstrdup(argv[1]);

OK.  So after this if/else if/else cascade, path is an allocated
piece of memory we could later free() whichever branch is taken.

> +	}
> +
> +	if (!is_absolute_path(path) && prefix)
> +		path = xstrfmt("%s%s", prefix, path);

This also makes path freeable, but the original path is leaked.

	if (prefix && *prefix && !is_absolute_path(path)) {
		free(path);
		path = xstrfmt(...);
	}

Is there a reason (does not have to be a strong reason) why we use
'path', not 'sm_path', as the variable name that corresponds to
$sm_path in the original, by the way?

> +	/* assure repo is absolute or relative to parent */
> +	if (starts_with_dot_dot_slash(repo) || starts_with_dot_slash(repo)) {
> +		char *remote = get_default_remote();
> +		char *remoteurl;
> +		struct strbuf sb = STRBUF_INIT;
> +
> +		if (prefix)
> +			die(_("relative path can only be used from the toplevel "
> +			      "of the working tree"));

This is 'git submodule--helper resolve-relative-url "$repo"' in the
original.

> +		/* dereference source url relative to parent's url */
> +		strbuf_addf(&sb, "remote.%s.url", remote);
> +		if (git_config_get_string(sb.buf, &remoteurl))
> +			remoteurl = xgetcwd();
> +		realrepo = relative_url(remoteurl, repo, NULL);

And this is copied-and-pasted from resolve_relative_url() function
found in builtins/submodule--helper.c.

relative_url() returns an allocated memory so we can free() realrepo
if we took this branch.

> +		free(remoteurl);
> +		free(remote);
> +	} else if (is_dir_sep(repo[0]) || strchr(repo, ':')) {
> +		realrepo = repo;

This repo came from argv[0] so we cannot free realrepo if we took
this branch.  Are we willing to leak realrepo we obtained from the
other branch?

> +	} else {
> +		die(_("repo URL: '%s' must be absolute or begin with ./|../"),
> +		      repo);
> +	}
> +
> +	/*
> +	 * normalize path:
> +	 * multiple //; leading ./; /./; /../;
> +	 */
> +	normalize_path_copy(path, path);

It's nice that a handy (almost) equivalent helper is already
available ;-)

> +	/* strip trailing '/' */
> +	if (is_dir_sep(path[strlen(path) -1]))
> +		path[strlen(path) - 1] = '\0';

The original dealt with multiple trailing '/' but this one does not.
Shouldn't it loop starting at the end?

> +	if (check_sm_exists(force, path))
> +		return 1;

OK.  I think we reviewed the function.  Seeing it in the context of
the calling site makes us realize that it has a wrong name.  "check
submodule exists" sounds as if we expect a submodule to exist at the
path, and it is an error for a submodule not to be there, but that
is not what this caller (which is the only caller of the helper)
wants to check.  And more importantly, the helper reacts to anything
sitting at the path, not just submoudle.

So what does the helper really do?  I think it checks if it is OK to
create a submodule there.  IOW, "exists" part of the name is what
makes it a misnomer.  Perhaps "can_create_submodule()"?

> +	strbuf_addstr(&sb, path);
> +	if (is_nonbare_repository_dir(&sb)) {
> +		struct object_id oid;
> +		if (resolve_gitlink_ref(path, "HEAD", &oid) < 0)
> +			die(_("'%s' does not have a commit checked out"), path);
> +	}
> +
> +	if (!force) {
> +		struct strbuf sb = STRBUF_INIT;
> +		struct child_process cp = CHILD_PROCESS_INIT;
> +		cp.git_cmd = 1;
> +		cp.no_stdout = 1;
> +		strvec_pushl(&cp.args, "add", "--dry-run", "--ignore-missing",
> +			     "--no-warn-embedded-repo", path, NULL);
> +		if (pipe_command(&cp, NULL, 0, NULL, 0, &sb, 0)) {
> +			fprintf(stderr, _("%s"), sb.buf);

Sorry, but I cannot guess what this _("%s") is trying to achieve.
Shouldn't it be
			strbuf_complete_line(&sb);
			fputs(sb.buf, stderr);
instead?

> +			return 1;

The original honors the exit code from the dry-run and relays it to
the user.  Is this a regression, or nobody care what exit status
they get as long as it is not zero?

> +		}
> +		strbuf_release(&sb);
> +	}
> +
> +	name = custom_name ? custom_name : path;
> +	if (check_submodule_name(name))
> +		die(_("'%s' is not a valid submodule name"), name);
> +
> +	info.prefix = prefix;
> +	info.sm_name = name;
> +	info.sm_path = path;
> +	info.repo = repo;
> +	info.realrepo = realrepo;
> +	info.reference_path = reference_path;
> +	info.branch = branch;
> +	info.depth = depth;
> +	info.progress = !!progress;
> +	info.dissociate = !!dissociate;
> +	info.force = !!force;
> +	info.quiet = !!quiet;
> +
> +	if (add_submodule(&info))
> +		return 1;

I think we've reviewed this funciton already.

> +	config_added_submodule(&info);
> +
> +	free(path);

Looking a bit uneven wrt to leak handling.

> +	return 0;
> +}

Thanks.
Junio C Hamano Oct. 9, 2020, 5:09 a.m. UTC | #4
Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> +static void config_added_submodule(struct add_data *info)
> +{
> +	char *key, *var = NULL;
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +
> +	key = xstrfmt("submodule.%s.url", info->sm_name);
> +	git_config_set_gently(key, info->realrepo);
> +	free(key);
> +
> +	cp.git_cmd = 1;
> +	strvec_pushl(&cp.args, "add", "--no-warn-embedded-repo", NULL);
> +	if (info->force)
> +		strvec_push(&cp.args, "--force");
> +	strvec_pushl(&cp.args, "--", info->sm_path, ".gitmodules", NULL);

Hmph, you lost me.  I think this ought to correspond to this part of
the original:

	git add --no-warn-embedded-repo $force "$sm_path" ||
	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"

I can see that adding "--" before $sm_path may be an improvement,
but why do we also add .gitmodules here, and ...

> +	key = xstrfmt("submodule.%s.path", info->sm_name);
> +	git_config_set_in_file_gently(".gitmodules", key, info->sm_path);
> +	free(key);
> +	key = xstrfmt("submodule.%s.url", info->sm_name);
> +	git_config_set_in_file_gently(".gitmodules", key, info->repo);
> +	free(key);
> +	key = xstrfmt("submodule.%s.branch", info->sm_name);
> +	if (info->branch)
> +		git_config_set_in_file_gently(".gitmodules", key, info->branch);
> +	free(key);

... perform quite a lot of configuration writing, before actually
spawning the "git add" and make sure it succeeds?  The original
won't futz with any of these .gitmodules entries if "git add" of the
$sm_path fails and that is a good discipline to follow.

> +	if (run_command(&cp))
> +		die(_("failed to add submodule '%s'"), info->sm_path);
> +
> +	/*
> +	 * NEEDSWORK: In a multi-working-tree world, this needs to be
> +	 * set in the per-worktree config.
> +	 */
> +	if (!git_config_get_string("submodule.active", &var) && var) {

What if this were a valueless true ("[submodule] active\n" without
"= true")?  Wouldn't get_string() fail?

> +		/*
> +		 * If the submodule being adding isn't already covered by the
> +		 * current configured pathspec, set the submodule's active flag
> +		 */
> +		if (!is_submodule_active(the_repository, info->sm_path)) {
> +			key = xstrfmt("submodule.%s.active", info->sm_name);
> +			git_config_set_gently(key, "true");
> +			free(key);
> +		}
> +	} else {
> +		key = xstrfmt("submodule.%s.active", info->sm_name);
> +		git_config_set_gently(key, "true");
> +		free(key);
> +	}
> +}
> +
> + ...
> +	info.prefix = prefix;
> +	info.sm_name = name;
> +	info.sm_path = path;
> +	info.repo = repo;
> +	info.realrepo = realrepo;
> +	info.reference_path = reference_path;
> +	info.branch = branch;
> +	info.depth = depth;
> +	info.progress = !!progress;
> +	info.dissociate = !!dissociate;
> +	info.force = !!force;
> +	info.quiet = !!quiet;
> +
> +	if (add_submodule(&info))
> +		return 1;
> +	config_added_submodule(&info);

Whew.

This was way too big to be reviewed in a single sitting.  I do not
know offhand if there is a better way to structure the changes into
a more digestible pieces to help prevent reviewers from overlooking
potential mistakes, though.

Thanks.
Jonathan Tan Nov. 18, 2020, 11:13 p.m. UTC | #5
> Whew.
> 
> This was way too big to be reviewed in a single sitting.  I do not
> know offhand if there is a better way to structure the changes into
> a more digestible pieces to help prevent reviewers from overlooking
> potential mistakes, though.
> 
> Thanks.

I just took a look at this, and one thing that would have helped is if
you ported the end of the function first in a commit, and work your way
backwards (in one or more commits).

After reading through the whole thing, I saw that this is mostly a
straightforward start-to-finish port (besides factoring out code into
functions), but it would be much easier for reviewers to conceptualize
and discuss the different parts if they were already divided.
Ævar Arnfjörð Bjarmason Nov. 19, 2020, 7:44 a.m. UTC | #6
On Thu, Nov 19 2020, Jonathan Tan wrote:

>> Whew.
>> 
>> This was way too big to be reviewed in a single sitting.  I do not
>> know offhand if there is a better way to structure the changes into
>> a more digestible pieces to help prevent reviewers from overlooking
>> potential mistakes, though.
>> 
>> Thanks.
>
> I just took a look at this, and one thing that would have helped is if
> you ported the end of the function first in a commit, and work your way
> backwards (in one or more commits).
>
> After reading through the whole thing, I saw that this is mostly a
> straightforward start-to-finish port (besides factoring out code into
> functions), but it would be much easier for reviewers to conceptualize
> and discuss the different parts if they were already divided.

Having done some minor changes to git-submodule.sh recently, I wondered
if we weren't at the point where it would be a nice approach to invert
the C/sh helper relationship.

I.e. write git-submodule.c, which would be the small entry point, it
would then mostly dispatch to a submodule--helper, which would in turn
mostly dispatch to a new submodule--helper-sh (containing most of the
current git-submodule.sh code), which in turn would re-dispatch to the C
submodule--helper (which as an aside, then sometimes calls itself via
process invocation).

It's quite a bit of spaghetti code, but means that there's a straighter
path to porting some of the setup code such as the "--check-writeable",
is_absolute_path() etc. being changed at the start of the change here to
git-submodule.sh.
Johannes Schindelin Nov. 19, 2020, 12:38 p.m. UTC | #7
Hi Ævar,

On Thu, 19 Nov 2020, Ævar Arnfjörð Bjarmason wrote:

>
> On Thu, Nov 19 2020, Jonathan Tan wrote:
>
> >> Whew.
> >>
> >> This was way too big to be reviewed in a single sitting.  I do not
> >> know offhand if there is a better way to structure the changes into
> >> a more digestible pieces to help prevent reviewers from overlooking
> >> potential mistakes, though.
> >>
> >> Thanks.
> >
> > I just took a look at this, and one thing that would have helped is if
> > you ported the end of the function first in a commit, and work your way
> > backwards (in one or more commits).
> >
> > After reading through the whole thing, I saw that this is mostly a
> > straightforward start-to-finish port (besides factoring out code into
> > functions), but it would be much easier for reviewers to conceptualize
> > and discuss the different parts if they were already divided.
>
> Having done some minor changes to git-submodule.sh recently, I wondered
> if we weren't at the point where it would be a nice approach to invert
> the C/sh helper relationship.
>
> I.e. write git-submodule.c, which would be the small entry point, it
> would then mostly dispatch to a submodule--helper, which would in turn
> mostly dispatch to a new submodule--helper-sh (containing most of the
> current git-submodule.sh code), which in turn would re-dispatch to the C
> submodule--helper (which as an aside, then sometimes calls itself via
> process invocation).
>
> It's quite a bit of spaghetti code, but means that there's a straighter
> path to porting some of the setup code such as the "--check-writeable",
> is_absolute_path() etc. being changed at the start of the change here to
> git-submodule.sh.

Looking at
https://github.com/gitgitgadget/git/blob/ss/submodule-add-in-c/git-submodule.sh,
I see that while there are still 794 lines, most of it is just mostly
redundant option parsing. The only function left to convert is
`cmd_update()`, and the first half was already converted to C long ago,
via the `git submodule--helper update-clone` subcommand:
https://github.com/gitgitgadget/git/blob/ss/submodule-add-in-c/git-submodule.sh#L269-L530

At this stage I suspect that having a little more patience would make more
sense. After `cmd_update` is converted, we can simply finish the
conversion, without having to keep the shell script around.

Ciao,
Dscho
Junio C Hamano Nov. 19, 2020, 8:37 p.m. UTC | #8
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> At this stage I suspect that having a little more patience would make more
> sense. After `cmd_update` is converted, we can simply finish the
> conversion, without having to keep the shell script around.

That matches how I view the current state.  We seem to be reviewer
bandwidth limited and folks who took a look at this series to help
moving it forward certainly deserve a lot of gratitude.

Thanks.
diff mbox series

Patch

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index de5ad73bb8..ec0a50d032 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -19,7 +19,6 @@ 
 #include "diffcore.h"
 #include "diff.h"
 #include "object-store.h"
-#include "dir.h"
 #include "advice.h"
 
 #define OPT_QUIET (1 << 0)
@@ -2744,6 +2743,395 @@  static int module_set_branch(int argc, const char **argv, const char *prefix)
 	return !!ret;
 }
 
+struct add_data {
+	const char *prefix;
+	const char *branch;
+	const char *reference_path;
+	const char *sm_path;
+	const char *sm_name;
+	const char *repo;
+	const char *realrepo;
+	int depth;
+	unsigned int force: 1;
+	unsigned int quiet: 1;
+	unsigned int progress: 1;
+	unsigned int dissociate: 1;
+};
+#define ADD_DATA_INIT { NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 0 }
+
+/*
+ * Guess dir name from repository: strip leading '.*[/:]',
+ * strip trailing '[:/]*.git'.
+ */
+static char *guess_dir_name(const char *repo)
+{
+	const char *p, *start, *end, *limit;
+	int after_slash_or_colon;
+
+	after_slash_or_colon = 0;
+	limit = repo + strlen(repo);
+	start = repo;
+	end = limit;
+	for (p = repo; p < limit; p++) {
+		if (starts_with(p, ".git")) {
+			/* strip trailing '[:/]*.git' */
+			if (!after_slash_or_colon)
+				end = p;
+			p += 3;
+		} else if (*p == '/' || *p == ':') {
+			/* strip leading '.*[/:]' */
+			if (end == limit)
+				end = p;
+			after_slash_or_colon = 1;
+		} else if (after_slash_or_colon) {
+			start = p;
+			end = limit;
+			after_slash_or_colon = 0;
+		}
+	}
+	return xstrndup(start, end - start);
+}
+
+static void fprintf_submodule_remote(const char *str)
+{
+	const char *p = str;
+	const char *start;
+	const char *end;
+	char *name, *url;
+
+	start = p;
+	while (*p != ' ')
+		p++;
+	end = p;
+	name = xstrndup(start, end - start);
+
+	while(*p == ' ')
+		p++;
+	start = p;
+	while (*p != ' ')
+		p++;
+	end = p;
+	url = xstrndup(start, end - start);
+
+	fprintf(stderr, "  %s\t%s\n", name, url);
+	free(name);
+	free(url);
+}
+
+static int check_sm_exists(unsigned int force, const char *path) {
+
+	int cache_pos, dir_in_cache = 0;
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	cache_pos = cache_name_pos(path, strlen(path));
+	if(cache_pos < 0 && (directory_exists_in_index(&the_index,
+	   path, strlen(path)) == index_directory))
+		dir_in_cache = 1;
+
+	if (!force) {
+		if (cache_pos >= 0 || dir_in_cache)
+			die(_("'%s' already exists in the index"), path);
+	} else {
+		struct cache_entry *ce = NULL;
+		if (cache_pos >= 0)
+			ce = the_index.cache[cache_pos];
+		if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
+			die(_("'%s' already exists in the index and is not a "
+			      "submodule"), path);
+	}
+	return 0;
+}
+
+static void modify_remote_v(struct strbuf *sb)
+{
+	int i;
+	for (i = 0; i < sb->len; i++) {
+		const char *start = sb->buf + i;
+		const char *end = start;
+		while (sb->buf[i++] != '\n')
+			end++;
+		if (!strcmp("fetch", xstrndup(end - 6, 5)))
+			fprintf_submodule_remote(xstrndup(start, end - start - 7));
+	}
+}
+
+static int add_submodule(struct add_data *info)
+{
+	/* perhaps the path exists and is already a git repo, else clone it */
+	if (is_directory(info->sm_path)) {
+		char *sub_git_path = xstrfmt("%s/.git", info->sm_path);
+		if (is_directory(sub_git_path) || file_exists(sub_git_path))
+			printf(_("Adding existing repo at '%s' to the index\n"),
+				 info->sm_path);
+		else
+			die(_("'%s' already exists and is not a valid git repo"),
+			      info->sm_path);
+		free(sub_git_path);
+	} else {
+		struct strvec clone_args = STRVEC_INIT;
+		struct child_process cp = CHILD_PROCESS_INIT;
+		char *submodule_git_dir = xstrfmt(".git/modules/%s", info->sm_name);
+
+		if (is_directory(submodule_git_dir)) {
+			if (!info->force) {
+				struct child_process cp_rem = CHILD_PROCESS_INIT;
+				struct strbuf sb_rem = STRBUF_INIT;
+				cp_rem.git_cmd = 1;
+				fprintf(stderr, _("A git directory for '%s' is "
+					"found locally with remote(s):\n"),
+					info->sm_name);
+				strvec_pushf(&cp_rem.env_array,
+					     "GIT_DIR=%s", submodule_git_dir);
+				strvec_push(&cp_rem.env_array, "GIT_WORK_TREE=.");
+				strvec_pushl(&cp_rem.args, "remote", "-v", NULL);
+				if (!capture_command(&cp_rem, &sb_rem, 0)) {
+					modify_remote_v(&sb_rem);
+				}
+				error(_("If you want to reuse this local git "
+				      "directory instead of cloning again from\n "
+				      "  %s\n"
+				      "use the '--force' option. If the local "
+				      "git directory is not the correct repo\n"
+				      "or you are unsure what this means choose "
+				      "another name with the '--name' option."),
+				      info->realrepo);
+				return 1;
+			} else {
+				printf(_("Reactivating local git directory for "
+					 "submodule '%s'."), info->sm_path);
+			}
+		}
+		free(submodule_git_dir);
+
+		strvec_push(&clone_args, "clone");
+
+		if (info->quiet)
+			strvec_push(&clone_args, "--quiet");
+
+		if (info->progress)
+			strvec_push(&clone_args, "--progress");
+
+		if (info->prefix)
+			strvec_pushl(&clone_args, "--prefix", info->prefix, NULL);
+		strvec_pushl(&clone_args, "--path", info->sm_path, "--name",
+			     info->sm_name, "--url", info->realrepo, NULL);
+		if (info->reference_path)
+			strvec_pushl(&clone_args, "--reference",
+				     info->reference_path, NULL);
+		if (info->dissociate)
+			strvec_push(&clone_args, "--dissociate");
+
+		if (info->depth >= 0)
+			strvec_pushf(&clone_args, "--depth=%d", info->depth);
+
+		if (module_clone(clone_args.nr, clone_args.v, info->prefix)) {
+			strvec_clear(&clone_args);
+			return -1;
+		}
+
+		prepare_submodule_repo_env(&cp.env_array);
+		cp.git_cmd = 1;
+		cp.dir = info->sm_path;
+		strvec_pushl(&cp.args, "checkout", "-f", "-q", NULL);
+
+		if (info->branch) {
+			strvec_pushl(&cp.args, "-B", info->branch, NULL);
+			strvec_pushf(&cp.args, "origin/%s", info->branch);
+		}
+
+		if (run_command(&cp))
+			die(_("unable to checkout submodule '%s'"), info->sm_path);
+	}
+	return 0;
+}
+
+static void config_added_submodule(struct add_data *info)
+{
+	char *key, *var = NULL;
+	struct child_process cp = CHILD_PROCESS_INIT;
+
+	key = xstrfmt("submodule.%s.url", info->sm_name);
+	git_config_set_gently(key, info->realrepo);
+	free(key);
+
+	cp.git_cmd = 1;
+	strvec_pushl(&cp.args, "add", "--no-warn-embedded-repo", NULL);
+	if (info->force)
+		strvec_push(&cp.args, "--force");
+	strvec_pushl(&cp.args, "--", info->sm_path, ".gitmodules", NULL);
+
+	key = xstrfmt("submodule.%s.path", info->sm_name);
+	git_config_set_in_file_gently(".gitmodules", key, info->sm_path);
+	free(key);
+	key = xstrfmt("submodule.%s.url", info->sm_name);
+	git_config_set_in_file_gently(".gitmodules", key, info->repo);
+	free(key);
+	key = xstrfmt("submodule.%s.branch", info->sm_name);
+	if (info->branch)
+		git_config_set_in_file_gently(".gitmodules", key, info->branch);
+	free(key);
+
+	if (run_command(&cp))
+		die(_("failed to add submodule '%s'"), info->sm_path);
+
+	/*
+	 * NEEDSWORK: In a multi-working-tree world, this needs to be
+	 * set in the per-worktree config.
+	 */
+	if (!git_config_get_string("submodule.active", &var) && var) {
+
+		/*
+		 * If the submodule being adding isn't already covered by the
+		 * current configured pathspec, set the submodule's active flag
+		 */
+		if (!is_submodule_active(the_repository, info->sm_path)) {
+			key = xstrfmt("submodule.%s.active", info->sm_name);
+			git_config_set_gently(key, "true");
+			free(key);
+		}
+	} else {
+		key = xstrfmt("submodule.%s.active", info->sm_name);
+		git_config_set_gently(key, "true");
+		free(key);
+	}
+}
+
+static int module_add(int argc, const char **argv, const char *prefix)
+{
+	const char *branch = NULL, *custom_name = NULL, *realrepo = NULL;
+	const char *reference_path = NULL, *repo = NULL, *name = NULL;
+	char *path;
+	int force = 0, quiet = 0, depth = -1, progress = 0, dissociate = 0;
+	struct add_data info = ADD_DATA_INIT;
+	struct strbuf sb = STRBUF_INIT;
+
+	struct option options[] = {
+		OPT_STRING('b', "branch", &branch, N_("branch"),
+			   N_("branch of repository to add as submodule")),
+		OPT_BOOL('f', "force", &force, N_("allow adding an otherwise "
+						  "ignored submodule path")),
+		OPT__QUIET(&quiet, N_("print only error messages")),
+		OPT_BOOL(0, "progress", &progress, N_("force cloning progress")),
+		OPT_STRING(0, "reference", &reference_path, N_("repository"),
+			   N_("reference repository")),
+		OPT_BOOL(0, "dissociate", &dissociate, N_("borrow the objects from reference repositories")),
+		OPT_STRING(0, "name", &custom_name, N_("name"),
+			   N_("sets the submodule’s name to the given string "
+			      "instead of defaulting to its path")),
+		OPT_INTEGER(0, "depth", &depth, N_("depth for shallow clones")),
+		OPT_END()
+	};
+
+	const char *const usage[] = {
+		N_("git submodule--helper add [<options>] [--] [<path>]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, options, usage, 0);
+
+	if (!is_writing_gitmodules_ok())
+		die(_("please make sure that the .gitmodules file is in the working tree"));
+
+	if (reference_path && !is_absolute_path(reference_path) && prefix)
+		reference_path = xstrfmt("%s%s", prefix, reference_path);
+
+	if (argc == 0 || argc > 2) {
+		usage_with_options(usage, options);
+	} else if (argc == 1) {
+		repo = argv[0];
+		path = guess_dir_name(repo);
+	} else {
+		repo = argv[0];
+		path = xstrdup(argv[1]);
+	}
+
+	if (!is_absolute_path(path) && prefix)
+		path = xstrfmt("%s%s", prefix, path);
+
+	/* assure repo is absolute or relative to parent */
+	if (starts_with_dot_dot_slash(repo) || starts_with_dot_slash(repo)) {
+		char *remote = get_default_remote();
+		char *remoteurl;
+		struct strbuf sb = STRBUF_INIT;
+
+		if (prefix)
+			die(_("relative path can only be used from the toplevel "
+			      "of the working tree"));
+		/* dereference source url relative to parent's url */
+		strbuf_addf(&sb, "remote.%s.url", remote);
+		if (git_config_get_string(sb.buf, &remoteurl))
+			remoteurl = xgetcwd();
+		realrepo = relative_url(remoteurl, repo, NULL);
+
+		free(remoteurl);
+		free(remote);
+	} else if (is_dir_sep(repo[0]) || strchr(repo, ':')) {
+		realrepo = repo;
+	} else {
+		die(_("repo URL: '%s' must be absolute or begin with ./|../"),
+		      repo);
+	}
+
+	/*
+	 * normalize path:
+	 * multiple //; leading ./; /./; /../;
+	 */
+	normalize_path_copy(path, path);
+	/* strip trailing '/' */
+	if (is_dir_sep(path[strlen(path) -1]))
+		path[strlen(path) - 1] = '\0';
+
+	if (check_sm_exists(force, path))
+		return 1;
+
+	strbuf_addstr(&sb, path);
+	if (is_nonbare_repository_dir(&sb)) {
+		struct object_id oid;
+		if (resolve_gitlink_ref(path, "HEAD", &oid) < 0)
+			die(_("'%s' does not have a commit checked out"), path);
+	}
+
+	if (!force) {
+		struct strbuf sb = STRBUF_INIT;
+		struct child_process cp = CHILD_PROCESS_INIT;
+		cp.git_cmd = 1;
+		cp.no_stdout = 1;
+		strvec_pushl(&cp.args, "add", "--dry-run", "--ignore-missing",
+			     "--no-warn-embedded-repo", path, NULL);
+		if (pipe_command(&cp, NULL, 0, NULL, 0, &sb, 0)) {
+			fprintf(stderr, _("%s"), sb.buf);
+			return 1;
+		}
+		strbuf_release(&sb);
+	}
+
+	name = custom_name ? custom_name : path;
+	if (check_submodule_name(name))
+		die(_("'%s' is not a valid submodule name"), name);
+
+	info.prefix = prefix;
+	info.sm_name = name;
+	info.sm_path = path;
+	info.repo = repo;
+	info.realrepo = realrepo;
+	info.reference_path = reference_path;
+	info.branch = branch;
+	info.depth = depth;
+	info.progress = !!progress;
+	info.dissociate = !!dissociate;
+	info.force = !!force;
+	info.quiet = !!quiet;
+
+	if (add_submodule(&info))
+		return 1;
+	config_added_submodule(&info);
+
+	free(path);
+
+	return 0;
+}
+
 #define SUPPORT_SUPER_PREFIX (1<<0)
 
 struct cmd_struct {
@@ -2777,6 +3165,7 @@  static struct cmd_struct commands[] = {
 	{"config", module_config, 0},
 	{"set-url", module_set_url, 0},
 	{"set-branch", module_set_branch, 0},
+	{"add", module_add, SUPPORT_SUPER_PREFIX},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 7ce52872b7..f1cbe4934a 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -146,166 +146,7 @@  cmd_add()
 		shift
 	done
 
-	if ! git submodule--helper config --check-writeable >/dev/null 2>&1
-	then
-		 die "$(eval_gettext "please make sure that the .gitmodules file is in the working tree")"
-	fi
-
-	if test -n "$reference_path"
-	then
-		is_absolute_path "$reference_path" ||
-		reference_path="$wt_prefix$reference_path"
-
-		reference="--reference=$reference_path"
-	fi
-
-	repo=$1
-	sm_path=$2
-
-	if test -z "$sm_path"; then
-		sm_path=$(printf '%s\n' "$repo" |
-			sed -e 's|/$||' -e 's|:*/*\.git$||' -e 's|.*[/:]||g')
-	fi
-
-	if test -z "$repo" || test -z "$sm_path"; then
-		usage
-	fi
-
-	is_absolute_path "$sm_path" || sm_path="$wt_prefix$sm_path"
-
-	# assure repo is absolute or relative to parent
-	case "$repo" in
-	./*|../*)
-		test -z "$wt_prefix" ||
-		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
-
-		# dereference source url relative to parent's url
-		realrepo=$(git submodule--helper resolve-relative-url "$repo") || exit
-		;;
-	*:*|/*)
-		# absolute url
-		realrepo=$repo
-		;;
-	*)
-		die "$(eval_gettext "repo URL: '\$repo' must be absolute or begin with ./|../")"
-	;;
-	esac
-
-	# normalize path:
-	# multiple //; leading ./; /./; /../; trailing /
-	sm_path=$(printf '%s/\n' "$sm_path" |
-		sed -e '
-			s|//*|/|g
-			s|^\(\./\)*||
-			s|/\(\./\)*|/|g
-			:start
-			s|\([^/]*\)/\.\./||
-			tstart
-			s|/*$||
-		')
-	if test -z "$force"
-	then
-		git ls-files --error-unmatch "$sm_path" > /dev/null 2>&1 &&
-		die "$(eval_gettext "'\$sm_path' already exists in the index")"
-	else
-		git ls-files -s "$sm_path" | sane_grep -v "^160000" > /dev/null 2>&1 &&
-		die "$(eval_gettext "'\$sm_path' already exists in the index and is not a submodule")"
-	fi
-
-	if test -d "$sm_path" &&
-		test -z $(git -C "$sm_path" rev-parse --show-cdup 2>/dev/null)
-	then
-	    git -C "$sm_path" rev-parse --verify -q HEAD >/dev/null ||
-	    die "$(eval_gettext "'\$sm_path' does not have a commit checked out")"
-	fi
-
-	if test -z "$force"
-	then
-	    dryerr=$(git add --dry-run --ignore-missing --no-warn-embedded-repo "$sm_path" 2>&1 >/dev/null)
-	    res=$?
-	    if test $res -ne 0
-	    then
-		 echo >&2 "$dryerr"
-		 exit $res
-	    fi
-	fi
-
-	if test -n "$custom_name"
-	then
-		sm_name="$custom_name"
-	else
-		sm_name="$sm_path"
-	fi
-
-	if ! git submodule--helper check-name "$sm_name"
-	then
-		die "$(eval_gettext "'$sm_name' is not a valid submodule name")"
-	fi
-
-	# perhaps the path exists and is already a git repo, else clone it
-	if test -e "$sm_path"
-	then
-		if test -d "$sm_path"/.git || test -f "$sm_path"/.git
-		then
-			eval_gettextln "Adding existing repo at '\$sm_path' to the index"
-		else
-			die "$(eval_gettext "'\$sm_path' already exists and is not a valid git repo")"
-		fi
-
-	else
-		if test -d ".git/modules/$sm_name"
-		then
-			if test -z "$force"
-			then
-				eval_gettextln >&2 "A git directory for '\$sm_name' is found locally with remote(s):"
-				GIT_DIR=".git/modules/$sm_name" GIT_WORK_TREE=. git remote -v | grep '(fetch)' | sed -e s,^,"  ", -e s,' (fetch)',, >&2
-				die "$(eval_gettextln "\
-If you want to reuse this local git directory instead of cloning again from
-  \$realrepo
-use the '--force' option. If the local git directory is not the correct repo
-or you are unsure what this means choose another name with the '--name' option.")"
-			else
-				eval_gettextln "Reactivating local git directory for submodule '\$sm_name'."
-			fi
-		fi
-		git submodule--helper clone ${GIT_QUIET:+--quiet} ${progress:+"--progress"} --prefix "$wt_prefix" --path "$sm_path" --name "$sm_name" --url "$realrepo" ${reference:+"$reference"} ${dissociate:+"--dissociate"} ${depth:+"$depth"} || exit
-		(
-			sanitize_submodule_env
-			cd "$sm_path" &&
-			# ash fails to wordsplit ${branch:+-b "$branch"...}
-			case "$branch" in
-			'') git checkout -f -q ;;
-			?*) git checkout -f -q -B "$branch" "origin/$branch" ;;
-			esac
-		) || die "$(eval_gettext "Unable to checkout submodule '\$sm_path'")"
-	fi
-	git config submodule."$sm_name".url "$realrepo"
-
-	git add --no-warn-embedded-repo $force "$sm_path" ||
-	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
-
-	git submodule--helper config submodule."$sm_name".path "$sm_path" &&
-	git submodule--helper config submodule."$sm_name".url "$repo" &&
-	if test -n "$branch"
-	then
-		git submodule--helper config submodule."$sm_name".branch "$branch"
-	fi &&
-	git add --force .gitmodules ||
-	die "$(eval_gettext "Failed to register submodule '\$sm_path'")"
-
-	# NEEDSWORK: In a multi-working-tree world, this needs to be
-	# set in the per-worktree config.
-	if git config --get submodule.active >/dev/null
-	then
-		# If the submodule being adding isn't already covered by the
-		# current configured pathspec, set the submodule's active flag
-		if ! git submodule--helper is-active "$sm_path"
-		then
-			git config submodule."$sm_name".active "true"
-		fi
-	else
-		git config submodule."$sm_name".active "true"
-	fi
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper add ${force:+--force} ${GIT_QUIET:+--quiet} ${progress:+--progress} ${branch:+--branch "$branch"} ${reference_path:+--reference "$reference_path"} ${dissociate:+--dissociate} ${custom_name:+--name "$custom_name"} ${depth:+"$depth"} -- "$@"
 }
 
 #
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index fec7e0299d..4ab8298385 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -48,7 +48,7 @@  test_expect_success 'submodule update aborts on missing gitmodules url' '
 
 test_expect_success 'add aborts on repository with no commits' '
 	cat >expect <<-\EOF &&
-	'"'repo-no-commits'"' does not have a commit checked out
+	fatal: '"'repo-no-commits'"' does not have a commit checked out
 	EOF
 	git init repo-no-commits &&
 	test_must_fail git submodule add ../a ./repo-no-commits 2>actual &&