[v6,09/10] submodule: support reading .gitmodules when it's not in the working tree
diff mbox series

Message ID 20181005130601.15879-10-ao2@ao2.it
State Superseded
Headers show
Series
  • Make submodules work if .gitmodules is not checked out
Related show

Commit Message

Antonio Ospite Oct. 5, 2018, 1:06 p.m. UTC
When the .gitmodules file is not available in the working tree, try
using the content from the index and from the current branch. This
covers the case when the file is part of the repository but for some
reason it is not checked out, for example because of a sparse checkout.

This makes it possible to use at least the 'git submodule' commands
which *read* the gitmodules configuration file without fully populating
the working tree.

Writing to .gitmodules will still require that the file is checked out,
so check for that before calling config_set_in_gitmodules_file_gently.

Add a similar check also in git-submodule.sh::cmd_add() to anticipate
the eventual failure of the "git submodule add" command when .gitmodules
is not safely writeable; this prevents the command from leaving the
repository in a spurious state (e.g. the submodule repository was cloned
but .gitmodules was not updated because
config_set_in_gitmodules_file_gently failed).

Moreover, since config_from_gitmodules() now accesses the global object
store, it is necessary to protect all code paths which call the function
against concurrent access to the global object store. Currently this
only happens in builtin/grep.c::grep_submodules(), so call
grep_read_lock() before invoking code involving
config_from_gitmodules().

Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
from .gitmodules succeeds and that writing to it fails when the file is
not checked out.

NOTE: there is one rare case where this new feature does not work
properly yet: nested submodules without .gitmodules in their working
tree.  This has been documented with a warning and a test_expect_failure
item in t7814, and in this case the current behavior is not altered: no
config is read.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 builtin/grep.c                         | 17 +++++-
 builtin/submodule--helper.c            |  6 +-
 git-submodule.sh                       |  5 ++
 submodule-config.c                     | 31 +++++++++-
 t/t7411-submodule-config.sh            | 26 ++++++++-
 t/t7416-submodule-sparse-gitmodules.sh | 78 ++++++++++++++++++++++++++
 t/t7814-grep-recurse-submodules.sh     | 16 ++++++
 7 files changed, 172 insertions(+), 7 deletions(-)
 create mode 100755 t/t7416-submodule-sparse-gitmodules.sh

Comments

Stefan Beller Oct. 8, 2018, 10:19 p.m. UTC | #1
> +test_expect_success 'not writing gitmodules config file when it is not checked out' '
> +        test_must_fail git -C super submodule--helper config submodule.submodule.url newurl

This only checks the exit code, do we also want to check for

    test_path_is_missing .gitmodules ?

> +test_expect_success 'initialising submodule when the gitmodules config is not checked out' '
> +       git -C super submodule init
> +'
> +
> +test_expect_success 'showing submodule summary when the gitmodules config is not checked out' '
> +       git -C super submodule summary
> +'

Same for these, is the exit code enough, or do we want to look at
specific things?

> +
> +test_expect_success 'updating submodule when the gitmodules config is not checked out' '
> +       (cd submodule &&
> +               echo file2 >file2 &&
> +               git add file2 &&
> +               git commit -m "add file2 to submodule"
> +       ) &&
> +       git -C super submodule update

git status would want to be clean afterwards?
Junio C Hamano Oct. 9, 2018, 3:39 a.m. UTC | #2
Antonio Ospite <ao2@ao2.it> writes:

> Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
> from .gitmodules succeeds and that writing to it fails when the file is
> not checked out.
> ...
>  t/t7416-submodule-sparse-gitmodules.sh | 78 ++++++++++++++++++++++++++

This now triggers test-lint errors as the most recent maintenance
release took t/t7416 for something else.  I'll do s/t7416-/t7418-/g
on the mailbox before running "git am -s" on this series.
Junio C Hamano Oct. 9, 2018, 3:48 a.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Antonio Ospite <ao2@ao2.it> writes:
>
>> Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
>> from .gitmodules succeeds and that writing to it fails when the file is
>> not checked out.
>> ...
>>  t/t7416-submodule-sparse-gitmodules.sh | 78 ++++++++++++++++++++++++++
>
> This now triggers test-lint errors as the most recent maintenance
> release took t/t7416 for something else.  I'll do s/t7416-/t7418-/g
> on the mailbox before running "git am -s" on this series.

This is an unrelated tangent to the topic, but running "range-diff"
on what has been queued on 'pu' since mid September and this
replacement after doing the renaming was a surprisingly pleasant
experience.  In its comparison between 09/10 of the two iterations,
it showed that 7416's name has been changed to 7418 but otherwise
there is no change in the contents of that test script.

FWIW, tbdiff also gets this right, so the pleasant experience was
inherited without getting broken.  Kudos should go to both Thomas
and Dscho ;-).
Antonio Ospite Oct. 10, 2018, 6:56 p.m. UTC | #4
On Mon, 8 Oct 2018 15:19:00 -0700
Stefan Beller <sbeller@google.com> wrote:

> > +test_expect_success 'not writing gitmodules config file when it is not checked out' '
> > +        test_must_fail git -C super submodule--helper config submodule.submodule.url newurl
> 
> This only checks the exit code, do we also want to check for
> 
>     test_path_is_missing .gitmodules ?
>

OK, I agree, let's re-check also *after* we tried and failed to set
a config value, just to be sure that the code does not get accidentally
changed in the future to create the file. I'll add the check.

> > +test_expect_success 'initialising submodule when the gitmodules config is not checked out' '
> > +       git -C super submodule init
> > +'
> > +
> > +test_expect_success 'showing submodule summary when the gitmodules config is not checked out' '
> > +       git -C super submodule summary
> > +'
> 
> Same for these, is the exit code enough, or do we want to look at
> specific things?
>

Except for the "summary" test which was not even exercising the
config_from_gitmodule path,  checking exist status should be sufficient
to verify that "submodule--helper config" does not fail, but we can
surely do better.

I will add checks to confirm that not only the commands exited without
errors but they also achieved the desired effect, to validate the actual
high-level use case advertised by the test file. This should be more
future-proof.

And I think I'll merge the summary and the update tests.

> > +
> > +test_expect_success 'updating submodule when the gitmodules config is not checked out' '
> > +       (cd submodule &&
> > +               echo file2 >file2 &&
> > +               git add file2 &&
> > +               git commit -m "add file2 to submodule"
> > +       ) &&
> > +       git -C super submodule update
> 
> git status would want to be clean afterwards?

Mmh, this should have been "submodule update --remote" in the first
place to have any effect, I'll take the chance and rewrite this test in
a different way and also check the effect of the update operation, and
the repository status.

I'll be something like this:

ORIG_SUBMODULE=$(git -C submodule rev-parse HEAD)
ORIG_UPSTREAM=$(git -C upstream rev-parse HEAD)
ORIG_SUPER=$(git -C super rev-parse HEAD)

test_expect_success 're-updating submodule when the gitmodules config is not checked out' '
	test_when_finished "git -C submodule reset --hard $ORIG_SUBMODULE;
	                    git -C upstream reset --hard $ORIG_UPSTREAM;
	                    git -C super reset --hard $ORIG_SUPER;
	                    git -C upstream submodule update --remote;
	                    git -C super pull;
	                    git -C super submodule update --remote" &&
	(cd submodule &&
		echo file2 >file2 &&
		git add file2 &&
		test_tick &&
		git commit -m "add file2 to submodule"
	) &&
	(cd upstream &&
		git submodule update --remote &&
		git add submodule &&
		test_tick &&
		git commit -m "Update submodule"
	) &&
	git -C super pull &&
	# The --for-status options reads the gitmdoules config
	git -C super submodule summary --for-status >actual &&
	cat >expect <<-\EOF &&
	* submodule 951c301...a939200 (1):
	  < add file2 to submodule
	
	EOF
	test_cmp expect actual &&
	# Test that the update actually succeeds
	test_path_is_missing super/submodule/file2 &&
	git -C super submodule update &&
	test_cmp submodule/file2 super/submodule/file2 &&
	git -C super status --short >output &&
	test_must_be_empty output
'

Maybe a little overkill?

The "upstream" repo will be added in test 1 to better clarify the roles
of the involved repositories.

The commit ids should be stable because of test_tick, shouldn't they?

Thanks for the comments, they helped improving the quality of the tests
once again.

I'll wait a few days before sending a v7, hopefully someone will find
time to take another look at patch 9 and comment also on patch 10, and
give an opinion on the "mergeability" status of the whole patchset.

Ciao ciao,
   Antonio
Stefan Beller Oct. 10, 2018, 10:55 p.m. UTC | #5
On Wed, Oct 10, 2018 at 11:56 AM Antonio Ospite <ao2@ao2.it> wrote:
>
> On Mon, 8 Oct 2018 15:19:00 -0700
> Stefan Beller <sbeller@google.com> wrote:
>
> > > +test_expect_success 'not writing gitmodules config file when it is not checked out' '
> > > +        test_must_fail git -C super submodule--helper config submodule.submodule.url newurl
> >
> > This only checks the exit code, do we also want to check for
> >
> >     test_path_is_missing .gitmodules ?
> >
>
> OK, I agree, let's re-check also *after* we tried and failed to set
> a config value, just to be sure that the code does not get accidentally
> changed in the future to create the file. I'll add the check.
>
> > > +test_expect_success 'initialising submodule when the gitmodules config is not checked out' '
> > > +       git -C super submodule init
> > > +'
> > > +
> > > +test_expect_success 'showing submodule summary when the gitmodules config is not checked out' '
> > > +       git -C super submodule summary
> > > +'
> >
> > Same for these, is the exit code enough, or do we want to look at
> > specific things?
> >
>
> Except for the "summary" test which was not even exercising the
> config_from_gitmodule path,  checking exist status should be sufficient
> to verify that "submodule--helper config" does not fail, but we can
> surely do better.
>
> I will add checks to confirm that not only the commands exited without
> errors but they also achieved the desired effect, to validate the actual
> high-level use case advertised by the test file. This should be more
> future-proof.
>
> And I think I'll merge the summary and the update tests.
>
> > > +
> > > +test_expect_success 'updating submodule when the gitmodules config is not checked out' '
> > > +       (cd submodule &&
> > > +               echo file2 >file2 &&
> > > +               git add file2 &&
> > > +               git commit -m "add file2 to submodule"
> > > +       ) &&
> > > +       git -C super submodule update
> >
> > git status would want to be clean afterwards?
>
> Mmh, this should have been "submodule update --remote" in the first
> place to have any effect, I'll take the chance and rewrite this test in
> a different way and also check the effect of the update operation, and
> the repository status.
>
> I'll be something like this:
>
> ORIG_SUBMODULE=$(git -C submodule rev-parse HEAD)
> ORIG_UPSTREAM=$(git -C upstream rev-parse HEAD)
> ORIG_SUPER=$(git -C super rev-parse HEAD)
>
> test_expect_success 're-updating submodule when the gitmodules config is not checked out' '
>         test_when_finished "git -C submodule reset --hard $ORIG_SUBMODULE;
>                             git -C upstream reset --hard $ORIG_UPSTREAM;
>                             git -C super reset --hard $ORIG_SUPER;
>                             git -C upstream submodule update --remote;
>                             git -C super pull;
>                             git -C super submodule update --remote" &&
>         (cd submodule &&
>                 echo file2 >file2 &&
>                 git add file2 &&
>                 test_tick &&
>                 git commit -m "add file2 to submodule"
>         ) &&
>         (cd upstream &&
>                 git submodule update --remote &&
>                 git add submodule &&
>                 test_tick &&
>                 git commit -m "Update submodule"
>         ) &&
>         git -C super pull &&
>         # The --for-status options reads the gitmdoules config

gitmodules

>         git -C super submodule summary --for-status >actual &&
>         cat >expect <<-\EOF &&
>         * submodule 951c301...a939200 (1):

hardcoding hash values burdens the plan to migrate to another
hash function,

    rev1=$(git -C submodule rev-parse --short HEAD^)
    rev2=$(git -C submodule rev-parse --short HEAD)

and then use ${rev1}..${rev2} ?


>           < add file2 to submodule
>
>         EOF
>         test_cmp expect actual &&
>         # Test that the update actually succeeds
>         test_path_is_missing super/submodule/file2 &&
>         git -C super submodule update &&
>         test_cmp submodule/file2 super/submodule/file2 &&
>         git -C super status --short >output &&
>         test_must_be_empty output
> '
>
> Maybe a little overkill?

Wow, very thorough! You might call it overkill, but now that you have it...

> The "upstream" repo will be added in test 1 to better clarify the roles
> of the involved repositories.
>
> The commit ids should be stable because of test_tick, shouldn't they?

Yes, but see
Documentation/technical/hash-function-transition.txt
that a couple people are working on. Let's be nice to them. :-)

Stefan

Patch
diff mbox series

diff --git a/builtin/grep.c b/builtin/grep.c
index 601f801158..7da8fef31a 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -421,11 +421,23 @@  static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
 	struct repository submodule;
 	int hit;
 
-	if (!is_submodule_active(superproject, path))
+	/*
+	 * NEEDSWORK: submodules functions need to be protected because they
+	 * access the object store via config_from_gitmodules(): the latter
+	 * uses get_oid() which, for now, relies on the global the_repository
+	 * object.
+	 */
+	grep_read_lock();
+
+	if (!is_submodule_active(superproject, path)) {
+		grep_read_unlock();
 		return 0;
+	}
 
-	if (repo_submodule_init(&submodule, superproject, path))
+	if (repo_submodule_init(&submodule, superproject, path)) {
+		grep_read_unlock();
 		return 0;
+	}
 
 	repo_read_gitmodules(&submodule);
 
@@ -439,7 +451,6 @@  static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
 	 * store is no longer global and instead is a member of the repository
 	 * object.
 	 */
-	grep_read_lock();
 	add_to_alternates_memory(submodule.objects->objectdir);
 	grep_read_unlock();
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 28f3ccca6d..5f8a804a6e 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2154,8 +2154,12 @@  static int module_config(int argc, const char **argv, const char *prefix)
 		return print_config_from_gitmodules(the_repository, argv[1]);
 
 	/* Equivalent to ACTION_SET in builtin/config.c */
-	if (argc == 3)
+	if (argc == 3) {
+		if (!is_writing_gitmodules_ok())
+			die(_("please make sure that the .gitmodules file is in the working tree"));
+
 		return config_set_in_gitmodules_file_gently(argv[1], argv[2]);
+	}
 
 	usage_with_options(git_submodule_helper_usage, module_config_options);
 }
diff --git a/git-submodule.sh b/git-submodule.sh
index 0805fadf47..f5124bbf23 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -159,6 +159,11 @@  cmd_add()
 		shift
 	done
 
+	if ! git submodule--helper config --check-writeable >/dev/null 2>&1
+	then
+		 die "$(eval_gettext "please make sure that the .gitmodules file is in the working tree")"
+	fi
+
 	if test -n "$reference_path"
 	then
 		is_absolute_path "$reference_path" ||
diff --git a/submodule-config.c b/submodule-config.c
index 8659d97e06..69bebb721e 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -1,4 +1,5 @@ 
 #include "cache.h"
+#include "dir.h"
 #include "repository.h"
 #include "config.h"
 #include "submodule-config.h"
@@ -603,8 +604,34 @@  static void submodule_cache_check_init(struct repository *repo)
 static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
 {
 	if (repo->worktree) {
-		char *file = repo_worktree_path(repo, GITMODULES_FILE);
-		git_config_from_file(fn, file, data);
+		struct git_config_source config_source = { 0 };
+		const struct config_options opts = { 0 };
+		struct object_id oid;
+		char *file;
+
+		file = repo_worktree_path(repo, GITMODULES_FILE);
+		if (file_exists(file)) {
+			config_source.file = file;
+		} else if (repo->submodule_prefix) {
+			/*
+			 * When get_oid and config_with_options, used below,
+			 * become able to work on a specific repository, this
+			 * warning branch can be removed.
+			 */
+			warning("nested submodules without %s in the working tree are not supported yet",
+				GITMODULES_FILE);
+			goto out;
+		} else if (get_oid(GITMODULES_INDEX, &oid) >= 0) {
+			config_source.blob = GITMODULES_INDEX;
+		} else if (get_oid(GITMODULES_HEAD, &oid) >= 0) {
+			config_source.blob = GITMODULES_HEAD;
+		} else {
+			goto out;
+		}
+
+		config_with_options(fn, data, &config_source, &opts);
+
+out:
 		free(file);
 	}
 }
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index 45953f9300..2cfabb18bc 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -134,7 +134,7 @@  test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
 	)
 '
 
-test_expect_success 'reading submodules config with "submodule--helper config"' '
+test_expect_success 'reading submodules config from the working tree with "submodule--helper config"' '
 	(cd super &&
 		echo "../submodule" >expect &&
 		git submodule--helper config submodule.submodule.url >actual &&
@@ -192,4 +192,28 @@  test_expect_success 'non-writeable .gitmodules when it is in the current branch
 	)
 '
 
+test_expect_success 'reading submodules config from the index when .gitmodules is not in the working tree' '
+	ORIG=$(git -C super rev-parse HEAD) &&
+	test_when_finished "git -C super reset --hard $ORIG" &&
+	(cd super &&
+		git submodule--helper config submodule.submodule.url "staged_url" &&
+		git add .gitmodules &&
+		rm -f .gitmodules &&
+		echo "staged_url" >expect &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'reading submodules config from the current branch when .gitmodules is not in the index' '
+	ORIG=$(git -C super rev-parse HEAD) &&
+	test_when_finished "git -C super reset --hard $ORIG" &&
+	(cd super &&
+		git rm .gitmodules &&
+		echo "../submodule" >expect &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
diff --git a/t/t7416-submodule-sparse-gitmodules.sh b/t/t7416-submodule-sparse-gitmodules.sh
new file mode 100755
index 0000000000..908a4e6958
--- /dev/null
+++ b/t/t7416-submodule-sparse-gitmodules.sh
@@ -0,0 +1,78 @@ 
+#!/bin/sh
+#
+# Copyright (C) 2018  Antonio Ospite <ao2@ao2.it>
+#
+
+test_description='Test reading/writing .gitmodules when not in the working tree
+
+This test verifies that, when .gitmodules is in the current branch but is not
+in the working tree reading from it still works but writing to it does not.
+
+The test setup uses a sparse checkout, however the same scenario can be set up
+also by committing .gitmodules and then just removing it from the filesystem.
+'
+
+. ./test-lib.sh
+
+test_expect_success 'sparse checkout setup which hides .gitmodules' '
+	echo file >file &&
+	git add file &&
+	test_tick &&
+	git commit -m upstream &&
+	git clone . super &&
+	git clone super submodule &&
+	git clone super new_submodule &&
+	(cd super &&
+		git submodule add ../submodule &&
+		test_tick &&
+		git commit -m submodule &&
+		cat >.git/info/sparse-checkout <<-\EOF &&
+		/*
+		!/.gitmodules
+		EOF
+		git config core.sparsecheckout true &&
+		git read-tree -m -u HEAD &&
+		test_path_is_missing .gitmodules
+	)
+'
+
+test_expect_success 'reading gitmodules config file when it is not checked out' '
+	(cd super &&
+		echo "../submodule" >expect &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'not writing gitmodules config file when it is not checked out' '
+	 test_must_fail git -C super submodule--helper config submodule.submodule.url newurl
+'
+
+test_expect_success 'initialising submodule when the gitmodules config is not checked out' '
+	git -C super submodule init
+'
+
+test_expect_success 'showing submodule summary when the gitmodules config is not checked out' '
+	git -C super submodule summary
+'
+
+test_expect_success 'updating submodule when the gitmodules config is not checked out' '
+	(cd submodule &&
+		echo file2 >file2 &&
+		git add file2 &&
+		git commit -m "add file2 to submodule"
+	) &&
+	git -C super submodule update
+'
+
+test_expect_success 'not adding submodules when the gitmodules config is not checked out' '
+	test_must_fail git -C super submodule add ../new_submodule
+'
+
+# This test checks that the previous "git submodule add" did not leave the
+# repository in a spurious state when it failed.
+test_expect_success 'init submodule still works even after the previous add failed' '
+	git -C super submodule init
+'
+
+test_done
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 7184113b9b..fa475d52fa 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -380,4 +380,20 @@  test_expect_success 'grep --recurse-submodules should pass the pattern type alon
 	fi
 '
 
+# Recursing down into nested submodules which do not have .gitmodules in their
+# working tree does not work yet. This is because config_from_gitmodules()
+# uses get_oid() and the latter is still not able to get objects from an
+# arbitrary repository (the nested submodule, in this case).
+test_expect_failure 'grep --recurse-submodules with submodules without .gitmodules in the working tree' '
+	test_when_finished "git -C submodule checkout .gitmodules" &&
+	rm submodule/.gitmodules &&
+	git grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+	cat >expect <<-\EOF &&
+	a:(1|2)d(3|4)
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
+	EOF
+	test_cmp expect actual
+'
+
 test_done