diff mbox series

[1/6] builtin/submodule: allow cloning with different ref storage format

Message ID a450759bd1e0d84192fd8b278b660fc8527369ca.1723032100.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series Improvements for ref storage formats with submodules | expand

Commit Message

Patrick Steinhardt Aug. 7, 2024, 12:43 p.m. UTC
As submodules are proper self-contained repositories, it is perfectly
valid for them to have a different ref storage format than their parent
repository. There is no obvious way for users to ask for the ref storage
format when initializing submodules though. Whether the setup of such
mixed-ref-storage-format constellations is all that useful remains to be
seen. But there is no good reason to not expose such an option, and we
will require it in a subsequent patch.

Introduce a new `--ref-format=` option for git-submodule(1) that allows
the user to pick the ref storage format. This option will also be used
in a subsequent commit, where we start to propagate the same flag from
git-clone(1) to cloning submodules with the `--recursive` switch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-submodule.txt        |  5 +++-
 builtin/submodule--helper.c            | 30 +++++++++++++++++++
 git-submodule.sh                       |  9 ++++++
 t/t7424-submodule-mixed-ref-formats.sh | 41 ++++++++++++++++++++++++++
 4 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100755 t/t7424-submodule-mixed-ref-formats.sh

Comments

Jeppe Øland Aug. 7, 2024, 2:45 p.m. UTC | #1
> Just got a shower thought that this isn't quite there yet. Most
> importantly, I think we should default to the ref storage format of the
> parent submodule both when adding submodules and when cloning a
> submodule into a preexisting repository.

Yep, I was going to suggest exactly the same!
Thanks for the quick response.

Regards
-Jeppe
Junio C Hamano Aug. 7, 2024, 10:55 p.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

> -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]::
> +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--ref-format <format>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]::
>  +
>  --
>  Update the registered submodules to match what the superproject
> @@ -185,6 +185,9 @@ submodule with the `--init` option.
>  If `--recursive` is specified, this command will recurse into the
>  registered submodules, and update any nested submodules within.
>  
> +If `--ref-format <format>`  is specified, the ref storage format of newly
> +cloned submodules will be set accordingly.
> +
>  If `--filter <filter-spec>` is specified, the given partial clone filter will be
>  applied to the submodule. See linkgit:git-rev-list[1] for details on filter
>  specifications.

Presumably, if the named submodule has already been initialized, we
are not converting its ref backend with --ref-format=<format> option
when "git submodule update --ref-format=<format>" is run.  Would it
make sense to say it is an error to give it without "--init", I
wonder.  If so, we probably would need to see if other existing
options like "--filter" also need a similar sanity check, if not
already done.

Thanks.
Patrick Steinhardt Aug. 8, 2024, 7 a.m. UTC | #3
On Wed, Aug 07, 2024 at 03:55:57PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]::
> > +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--ref-format <format>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]::
> >  +
> >  --
> >  Update the registered submodules to match what the superproject
> > @@ -185,6 +185,9 @@ submodule with the `--init` option.
> >  If `--recursive` is specified, this command will recurse into the
> >  registered submodules, and update any nested submodules within.
> >  
> > +If `--ref-format <format>`  is specified, the ref storage format of newly
> > +cloned submodules will be set accordingly.
> > +
> >  If `--filter <filter-spec>` is specified, the given partial clone filter will be
> >  applied to the submodule. See linkgit:git-rev-list[1] for details on filter
> >  specifications.
> 
> Presumably, if the named submodule has already been initialized, we
> are not converting its ref backend with --ref-format=<format> option
> when "git submodule update --ref-format=<format>" is run.  Would it
> make sense to say it is an error to give it without "--init", I
> wonder.  If so, we probably would need to see if other existing
> options like "--filter" also need a similar sanity check, if not
> already done.

Well, even when "--init" was given it is not sure whether the ref
storage format will actually end up being used, because that option only
tells us to initialize uninitialized submodules. So if the submodule was
initialized already, then the ref storage format won't be used.

We probably could add such a sanity check. But as you say, other options
like "--filter", "--depth", "--reference" and "--recommend-shallow"
don't have that check, either, so it would feel a bit like opening a can
of worms. So personally, I'd rather defer this to another day where we
then implement the check for all of these options.

Patrick
Junio C Hamano Aug. 8, 2024, 4:08 p.m. UTC | #4
Patrick Steinhardt <ps@pks.im> writes:

>> Presumably, if the named submodule has already been initialized, we
>> are not converting its ref backend with --ref-format=<format> option
>> when "git submodule update --ref-format=<format>" is run.  Would it
>> make sense to say it is an error to give it without "--init", I
>> wonder.  If so, we probably would need to see if other existing
>> options like "--filter" also need a similar sanity check, if not
>> already done.
>
> Well, even when "--init" was given it is not sure whether the ref
> storage format will actually end up being used, because that option only
> tells us to initialize uninitialized submodules. So if the submodule was
> initialized already, then the ref storage format won't be used.
>
> We probably could add such a sanity check. But as you say, other options
> like "--filter", "--depth", "--reference" and "--recommend-shallow"
> don't have that check, either, so it would feel a bit like opening a can
> of worms. So personally, I'd rather defer this to another day where we
> then implement the check for all of these options.

I am perfectly fine if we stopped by clearly documenting that these
options can be no-op when the submodule repository already exists.
Failing the operation in the name of "sanity check" at this point,
especially for existing options, does not sound like a sensible
change.

Thanks.
Patrick Steinhardt Aug. 8, 2024, 4:19 p.m. UTC | #5
On August 8, 2024 6:08:01 PM GMT+02:00, Junio C Hamano <gitster@pobox.com> wrote:
>Patrick Steinhardt <ps@pks.im> writes:
>
>>> Presumably, if the named submodule has already been initialized, we
>>> are not converting its ref backend with --ref-format=<format> option
>>> when "git submodule update --ref-format=<format>" is run.  Would it
>>> make sense to say it is an error to give it without "--init", I
>>> wonder.  If so, we probably would need to see if other existing
>>> options like "--filter" also need a similar sanity check, if not
>>> already done.
>>
>> Well, even when "--init" was given it is not sure whether the ref
>> storage format will actually end up being used, because that option only
>> tells us to initialize uninitialized submodules. So if the submodule was
>> initialized already, then the ref storage format won't be used.
>>
>> We probably could add such a sanity check. But as you say, other options
>> like "--filter", "--depth", "--reference" and "--recommend-shallow"
>> don't have that check, either, so it would feel a bit like opening a can
>> of worms. So personally, I'd rather defer this to another day where we
>> then implement the check for all of these options.
>
>I am perfectly fine if we stopped by clearly documenting that these
>options can be no-op when the submodule repository already exists.
>Failing the operation in the name of "sanity check" at this point,
>especially for existing options, does not sound like a sensible
>change.
>
>Thanks.

The documentation already points this out, as we explicitly say that it
only has an impact on newly cloned submodules:

> If `--ref-format <format>`  is specified, the ref storage format of newly
cloned submodules will be set accordingly.

I'm happy to make this more explicit though, in case you don't think
this is sufficient.

Patrick
Junio C Hamano Aug. 8, 2024, 5:26 p.m. UTC | #6
Patrick Steinhardt <ps@pks.im> writes:

> The documentation already points this out, as we explicitly say that it
> only has an impact on newly cloned submodules:

Perfect.  Thanks.
diff mbox series

Patch

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index ca0347a37b..73ef8b9696 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -136,7 +136,7 @@  If you really want to remove a submodule from the repository and commit
 that use linkgit:git-rm[1] instead. See linkgit:gitsubmodules[7] for removal
 options.
 
-update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]::
+update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--ref-format <format>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]::
 +
 --
 Update the registered submodules to match what the superproject
@@ -185,6 +185,9 @@  submodule with the `--init` option.
 If `--recursive` is specified, this command will recurse into the
 registered submodules, and update any nested submodules within.
 
+If `--ref-format <format>`  is specified, the ref storage format of newly
+cloned submodules will be set accordingly.
+
 If `--filter <filter-spec>` is specified, the given partial clone filter will be
 applied to the submodule. See linkgit:git-rev-list[1] for details on filter
 specifications.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f1218a1995..42a36bc2f7 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1532,6 +1532,7 @@  struct module_clone_data {
 	const char *url;
 	const char *depth;
 	struct list_objects_filter_options *filter_options;
+	enum ref_storage_format ref_storage_format;
 	unsigned int quiet: 1;
 	unsigned int progress: 1;
 	unsigned int dissociate: 1;
@@ -1540,6 +1541,7 @@  struct module_clone_data {
 };
 #define MODULE_CLONE_DATA_INIT { \
 	.single_branch = -1, \
+	.ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN, \
 }
 
 struct submodule_alternate_setup {
@@ -1738,6 +1740,9 @@  static int clone_submodule(const struct module_clone_data *clone_data,
 				strvec_pushl(&cp.args, "--reference",
 					     item->string, NULL);
 		}
+		if (clone_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN)
+			strvec_pushf(&cp.args, "--ref-format=%s",
+				     ref_storage_format_to_name(clone_data->ref_storage_format));
 		if (clone_data->dissociate)
 			strvec_push(&cp.args, "--dissociate");
 		if (sm_gitdir && *sm_gitdir)
@@ -1832,6 +1837,7 @@  static int module_clone(int argc, const char **argv, const char *prefix)
 	struct string_list reference = STRING_LIST_INIT_NODUP;
 	struct list_objects_filter_options filter_options =
 		LIST_OBJECTS_FILTER_INIT;
+	const char *ref_storage_format = NULL;
 
 	struct option module_clone_options[] = {
 		OPT_STRING(0, "prefix", &clone_data.prefix,
@@ -1849,6 +1855,8 @@  static int module_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING_LIST(0, "reference", &reference,
 			   N_("repo"),
 			   N_("reference repository")),
+		OPT_STRING(0, "ref-format", &ref_storage_format, N_("format"),
+			   N_("specify the reference format to use")),
 		OPT_BOOL(0, "dissociate", &dissociate,
 			   N_("use --reference only while cloning")),
 		OPT_STRING(0, "depth", &clone_data.depth,
@@ -1875,6 +1883,11 @@  static int module_clone(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, module_clone_options,
 			     git_submodule_helper_usage, 0);
 
+	if (ref_storage_format) {
+		clone_data.ref_storage_format = ref_storage_format_by_name(ref_storage_format);
+		if (clone_data.ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
+			die(_("unknown ref storage format '%s'"), ref_storage_format);
+	}
 	clone_data.dissociate = !!dissociate;
 	clone_data.quiet = !!quiet;
 	clone_data.progress = !!progress;
@@ -1974,6 +1987,7 @@  struct update_data {
 	struct submodule_update_strategy update_strategy;
 	struct list_objects_filter_options *filter_options;
 	struct module_list list;
+	enum ref_storage_format ref_storage_format;
 	int depth;
 	int max_jobs;
 	int single_branch;
@@ -1997,6 +2011,7 @@  struct update_data {
 #define UPDATE_DATA_INIT { \
 	.update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT, \
 	.list = MODULE_LIST_INIT, \
+	.ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN, \
 	.recommend_shallow = -1, \
 	.references = STRING_LIST_INIT_DUP, \
 	.single_branch = -1, \
@@ -2132,6 +2147,9 @@  static int prepare_to_clone_next_submodule(const struct cache_entry *ce,
 			     expand_list_objects_filter_spec(suc->update_data->filter_options));
 	if (suc->update_data->require_init)
 		strvec_push(&child->args, "--require-init");
+	if (suc->update_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN)
+		strvec_pushf(&child->args, "--ref-format=%s",
+			     ref_storage_format_to_name(suc->update_data->ref_storage_format));
 	strvec_pushl(&child->args, "--path", sub->path, NULL);
 	strvec_pushl(&child->args, "--name", sub->name, NULL);
 	strvec_pushl(&child->args, "--url", url, NULL);
@@ -2562,6 +2580,9 @@  static void update_data_to_args(const struct update_data *update_data,
 		for_each_string_list_item(item, &update_data->references)
 			strvec_pushl(args, "--reference", item->string, NULL);
 	}
+	if (update_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN)
+		strvec_pushf(args, "--ref-format=%s",
+			     ref_storage_format_to_name(update_data->ref_storage_format));
 	if (update_data->filter_options && update_data->filter_options->choice)
 		strvec_pushf(args, "--filter=%s",
 				expand_list_objects_filter_spec(
@@ -2737,6 +2758,7 @@  static int module_update(int argc, const char **argv, const char *prefix)
 	struct update_data opt = UPDATE_DATA_INIT;
 	struct list_objects_filter_options filter_options =
 		LIST_OBJECTS_FILTER_INIT;
+	const char *ref_storage_format = NULL;
 	int ret;
 	struct option module_update_options[] = {
 		OPT__SUPER_PREFIX(&opt.super_prefix),
@@ -2760,6 +2782,8 @@  static int module_update(int argc, const char **argv, const char *prefix)
 			SM_UPDATE_REBASE),
 		OPT_STRING_LIST(0, "reference", &opt.references, N_("repo"),
 			   N_("reference repository")),
+		OPT_STRING(0, "ref-format", &ref_storage_format, N_("format"),
+			   N_("specify the reference format to use")),
 		OPT_BOOL(0, "dissociate", &opt.dissociate,
 			   N_("use --reference only while cloning")),
 		OPT_INTEGER(0, "depth", &opt.depth,
@@ -2803,6 +2827,12 @@  static int module_update(int argc, const char **argv, const char *prefix)
 				   module_update_options);
 	}
 
+	if (ref_storage_format) {
+		opt.ref_storage_format = ref_storage_format_by_name(ref_storage_format);
+		if (opt.ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
+			die(_("unknown ref storage format '%s'"), ref_storage_format);
+	}
+
 	opt.filter_options = &filter_options;
 	opt.prefix = prefix;
 
diff --git a/git-submodule.sh b/git-submodule.sh
index 7f9582d923..dcbe943071 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -268,6 +268,14 @@  cmd_update()
 		-r|--rebase)
 			rebase=1
 			;;
+		--ref-format)
+			case "$2" in '') usage ;; esac
+			ref_format="--ref-format=$2"
+			shift
+			;;
+		--ref-format=*)
+			ref_format="$1"
+			;;
 		--reference)
 			case "$2" in '') usage ;; esac
 			reference="--reference=$2"
@@ -349,6 +357,7 @@  cmd_update()
 		${rebase:+--rebase} \
 		${merge:+--merge} \
 		${checkout:+--checkout} \
+		${ref_format:+"$ref_format"} \
 		${reference:+"$reference"} \
 		${dissociate:+"--dissociate"} \
 		${depth:+"$depth"} \
diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh
new file mode 100755
index 0000000000..de84007554
--- /dev/null
+++ b/t/t7424-submodule-mixed-ref-formats.sh
@@ -0,0 +1,41 @@ 
+#!/bin/sh
+
+test_description='submodules handle mixed ref storage formats'
+
+. ./test-lib.sh
+
+test_ref_format () {
+	echo "$2" >expect &&
+	git -C "$1" rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+}
+
+for OTHER_FORMAT in files reftable
+do
+	if test "$OTHER_FORMAT" = "$GIT_DEFAULT_REF_FORMAT"
+	then
+		continue
+	fi
+
+test_expect_success 'setup' '
+	git config set --global protocol.file.allow always
+'
+
+test_expect_success 'clone submodules with different ref storage format' '
+	test_when_finished "rm -rf submodule upstream downstream" &&
+
+	git init submodule &&
+	test_commit -C submodule submodule-initial &&
+	git init upstream &&
+	git -C upstream submodule add "file://$(pwd)/submodule" &&
+	git -C upstream commit -m "upstream submodule" &&
+
+	git clone --no-recurse-submodules "file://$(pwd)/upstream" downstream &&
+	test_ref_format downstream "$GIT_DEFAULT_REF_FORMAT" &&
+	git -C downstream submodule update --init --ref-format=$OTHER_FORMAT &&
+	test_ref_format downstream/submodule "$OTHER_FORMAT"
+'
+
+done
+
+test_done