diff mbox series

[v6] submodule: absorb git dir instead of dying on deinit

Message ID pull.1078.v6.git.git.1633523057369.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v6] submodule: absorb git dir instead of dying on deinit | expand

Commit Message

Mugdha Pattnaik Oct. 6, 2021, 12:24 p.m. UTC
From: mugdha <mugdhapattnaik@gmail.com>

Currently, running 'git submodule deinit' on repos where the
submodule's '.git' is a directory, aborts with a message that is not
exactly user friendly. Let's change this to instead warn the user
to rerun the command with '--force'.

This internally calls 'absorb_git_dir_into_superproject()', which
moves the git dir into the superproject and replaces it with
a '.git' file. The rest of the deinit function can operate as it
already does with new-style submodules.

We also edit a test case such that it matches the new behaviour of
deinit.

Suggested-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Mugdha Pattnaik <mugdhapattnaik@gmail.com>
---
    submodule: absorb git dir instead of dying on deinit
    
    Changes since v5:
    
     * Fixed accidental submission of old version
    
    Changes since v4:
    
     * Changed test case from "! test -d" to "test_path_is_missing"
    
    Thank you,
    Mugdha

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1078%2Fmugdhapattnaik%2Fsubmodule-deinit-absorbgitdirs-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1078/mugdhapattnaik/submodule-deinit-absorbgitdirs-v6
Pull-Request: https://github.com/git/git/pull/1078

Range-diff vs v5:

 1:  c39cd681e71 ! 1:  384a6742388 submodule: absorb git dir instead of dying on deinit
     @@ t/t7400-submodule-basic.sh: test_expect_success 'submodule deinit is silent when
      -	test -d init/.git &&
      -	test -n "$(git config --get-regexp "submodule\.example\.")"
      +	git submodule deinit -f init &&
     -+	! test -d init/.git &&
     ++	test_path_is_missing init/.git &&
      +	test -z "$(git config --get-regexp "submodule\.example\.")"
       '
       


 builtin/submodule--helper.c | 28 ++++++++++++++++++----------
 t/t7400-submodule-basic.sh  | 10 +++++-----
 2 files changed, 23 insertions(+), 15 deletions(-)


base-commit: c4203212e360b25a1c69467b5a8437d45a373cac

Comments

Junio C Hamano Oct. 7, 2021, 7:41 p.m. UTC | #1
"Mugdha Pattnaik via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: mugdha <mugdhapattnaik@gmail.com>

This line is called "in-body header" that is used to override the
author name that is automatically obtained from the e-mail's "From:"
header (which is set to "Mugdha Pattnaik via GitGitGadget" by GGG,
which is obviously not your name, hence we use an in-body header to
override it).  It should match what you use to sign off your patches,
the one we see below.

I'll hand-edit so that "git show" will say that the author is
"Mugdha Pattnaik", not "mugdha", while applying, but please make
sure your further submissions will not have this problem.

> Currently, running 'git submodule deinit' on repos where the
> submodule's '.git' is a directory, aborts with a message that is not
> exactly user friendly. Let's change this to instead warn the user
> to rerun the command with '--force'.

OK.  That sounds like an improvement, albeit possibly an overly
cautious one, as a casual "deinit" user will get an error as before
without "--force", which may or may not be a good thing.  Requiring
"--force" means it is safer by default by not changing the on-disk
data.  But requiring "--force" also means we end up training users
to say "--force" when it shouldn't have to be.

A plausible alternative is to always absorb but with a warning "we
absorbed it for you", without requiring "--force".  If we didn't
have "git submodule deinit" command now and were designing it from
scratch, would we design it to fail because the submodule's git
directory is not absorbed?  I doubt it, as I do not think of a good
reason to do so offhand.

Does "git submodule" currently reject a "deinit" request due to some
_other_ conditions or safety concerns and require the "--force"
option to continue?  Requiring the "--force" option to resolve ".git
is a directory, and the user wants to make it absorbed" means that
the user will be _forced_ to bypass these _other_ safety valves only
to save the submodule repository from destruction when running
"deinit", which may not be a good trade-off between the safety
requirements of these _other_ conditions, if exists, and the one we
are dealing with.

> This internally calls 'absorb_git_dir_into_superproject()', which
> moves the git dir into the superproject and replaces it with
> a '.git' file. The rest of the deinit function can operate as it
> already does with new-style submodules.

This is not wrong per-se, but such an implementation detail is
something best left for the patch.  

> We also edit a test case such that it matches the new behaviour of
> deinit.

"match the new behaviour" in what way?

    In one test, we used to require "git submodule deinit" to fail
    even with the "--force" option when the submodule's .git/
    directory is not absorbed.  Adjust it to expect the operation to
    pass.

would be a description at the right level of detail, I think.

> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index ef2776a9e45..040b26f149d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1539,16 +1539,24 @@ static void deinit_submodule(const char *path, const char *prefix,
>  		struct strbuf sb_rm = STRBUF_INIT;
>  		const char *format;
>  
> -		/*
> -		 * protect submodules containing a .git directory
> -		 * NEEDSWORK: instead of dying, automatically call
> -		 * absorbgitdirs and (possibly) warn.
> -		 */
> -		if (is_directory(sub_git_dir))
> -			die(_("Submodule work tree '%s' contains a .git "
> -			      "directory (use 'rm -rf' if you really want "
> -			      "to remove it including all of its history)"),
> -			    displaypath);
> +		if (is_directory(sub_git_dir)) {
> +			if (!(flags & OPT_FORCE))
> +				die(_("Submodule work tree '%s' contains a "
> +				      ".git directory.\nUse --force if you want "
> +				      "to move its contents to superproject's "
> +				      "module directory and convert .git to a file "
> +				      "and then proceed with deinit."),
> +				    displaypath);
> +
> +			if (!(flags & OPT_QUIET))
> +				warning(_("Submodule work tree '%s' contains a .git "
> +					  "directory. This will be replaced with a "
> +					  ".git file by using absorbgitdirs."),
> +					displaypath);
> +
> +			absorb_git_dir_into_superproject(displaypath, flags);

Shouldn't the first argument to this call be "path" not
"displaypath"?  The paths in messages may want to have the path from
the top to the submodule location prefixed for human consumption,
but the called function only cares about the path to the submodule
from the current directory, no?

The second parameter of this call seems totally bogus to me.  What
is the vocabulary of bits the called function takes?  Is that from
the same set the flags this function takes?  Does the called
function even understand OPT_QUIET, or does the bitpattern that
happens to correspond to OPT_QUIET have a totally different meaning
to the called function and we will trigger a behaviour that this
caller does not expect at all?

Thanks.
Mugdha Pattnaik Nov. 16, 2021, 6:20 p.m. UTC | #2
Apologies for the late reply, I got caught up with other commitments.

> On Fri, Oct 8, 2021 at 1:11 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Mugdha Pattnaik via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: mugdha <mugdhapattnaik@gmail.com>
>
> This line is called "in-body header" that is used to override the
> author name that is automatically obtained from the e-mail's "From:"
> header (which is set to "Mugdha Pattnaik via GitGitGadget" by GGG,
> which is obviously not your name, hence we use an in-body header to
> override it).  It should match what you use to sign off your patches,
> the one we see below.
>
> I'll hand-edit so that "git show" will say that the author is
> "Mugdha Pattnaik", not "mugdha", while applying, but please make
> sure your further submissions will not have this problem.

Okay, I will keep this in mind for future patches.


> > Currently, running 'git submodule deinit' on repos where the
> > submodule's '.git' is a directory, aborts with a message that is not
> > exactly user friendly. Let's change this to instead warn the user
> > to rerun the command with '--force'.
>
> OK.  That sounds like an improvement, albeit possibly an overly
> cautious one, as a casual "deinit" user will get an error as before
> without "--force", which may or may not be a good thing.  Requiring
> "--force" means it is safer by default by not changing the on-disk
> data.  But requiring "--force" also means we end up training users
> to say "--force" when it shouldn't have to be.
>
> A plausible alternative is to always absorb but with a warning "we
> absorbed it for you", without requiring "--force".  If we didn't
> have "git submodule deinit" command now and were designing it from
> scratch, would we design it to fail because the submodule's git
> directory is not absorbed?  I doubt it, as I do not think of a good
> reason to do so offhand.

Okay, I understand now. I'll remove the condition that checks for "--force"
and will go ahead with absorbing the gitdir after displaying the suggested
warning message. Currently I suppress the warnings when the "--quiet"
flag is set; I think I will continue to do that, even after implementing the
above change. Do let me know if I should be doing otherwise.


> Does "git submodule" currently reject a "deinit" request due to some
> _other_ conditions or safety concerns and require the "--force"
> option to continue?  Requiring the "--force" option to resolve ".git
> is a directory, and the user wants to make it absorbed" means that
> the user will be _forced_ to bypass these _other_ safety valves only
> to save the submodule repository from destruction when running
> "deinit", which may not be a good trade-off between the safety
> requirements of these _other_ conditions, if exists, and the one we
> are dealing with.

This is definitely a situation we want to avoid. How about we try to run
a check for uncommitted local modifications first? If these are present,
then deinit can die with a warning. In case there are no uncommitted
local modifications, deinit can go ahead and absorb the gitdir and do the
rest.

The existing submodule--helper.c file already has a check for this, but it
seems to be doing it below the check for absorption. I can try to switch it
around to see how that works.


> > This internally calls 'absorb_git_dir_into_superproject()', which
> > moves the git dir into the superproject and replaces it with
> > a '.git' file. The rest of the deinit function can operate as it
> > already does with new-style submodules.
>
> This is not wrong per-se, but such an implementation detail is
> something best left for the patch.
>
> > We also edit a test case such that it matches the new behaviour of
> > deinit.
>
> "match the new behaviour" in what way?
>
>     In one test, we used to require "git submodule deinit" to fail
>     even with the "--force" option when the submodule's .git/
>     directory is not absorbed.  Adjust it to expect the operation to
>     pass.
>
> would be a description at the right level of detail, I think.

Noted. I'll apply the above changes.


> > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> > index ef2776a9e45..040b26f149d 100644
> > --- a/builtin/submodule--helper.c
> > +++ b/builtin/submodule--helper.c
> > @@ -1539,16 +1539,24 @@ static void deinit_submodule(const char *path, const char *prefix,
> >               struct strbuf sb_rm = STRBUF_INIT;
> >               const char *format;
> >
> > -             /*
> > -              * protect submodules containing a .git directory
> > -              * NEEDSWORK: instead of dying, automatically call
> > -              * absorbgitdirs and (possibly) warn.
> > -              */
> > -             if (is_directory(sub_git_dir))
> > -                     die(_("Submodule work tree '%s' contains a .git "
> > -                           "directory (use 'rm -rf' if you really want "
> > -                           "to remove it including all of its history)"),
> > -                         displaypath);
> > +             if (is_directory(sub_git_dir)) {
> > +                     if (!(flags & OPT_FORCE))
> > +                             die(_("Submodule work tree '%s' contains a "
> > +                                   ".git directory.\nUse --force if you want "
> > +                                   "to move its contents to superproject's "
> > +                                   "module directory and convert .git to a file "
> > +                                   "and then proceed with deinit."),
> > +                                 displaypath);
> > +
> > +                     if (!(flags & OPT_QUIET))
> > +                             warning(_("Submodule work tree '%s' contains a .git "
> > +                                       "directory. This will be replaced with a "
> > +                                       ".git file by using absorbgitdirs."),
> > +                                     displaypath);
> > +
> > +                     absorb_git_dir_into_superproject(displaypath, flags);
>
> Shouldn't the first argument to this call be "path" not
> "displaypath"?  The paths in messages may want to have the path from
> the top to the submodule location prefixed for human consumption,
> but the called function only cares about the path to the submodule
> from the current directory, no?

Yes that makes sense, path is the better argument to pass.


> The second parameter of this call seems totally bogus to me.  What
> is the vocabulary of bits the called function takes?  Is that from
> the same set the flags this function takes?  Does the called
> function even understand OPT_QUIET, or does the bitpattern that
> happens to correspond to OPT_QUIET have a totally different meaning
> to the called function and we will trigger a behaviour that this
> caller does not expect at all?

I realised I was passing the wrong flags. On investigating further, the
flags in submodule.c do have different semantics. I also noticed that it
checks for recursive submodule absorption. I believe I should just be
passing the recursive flag to the function that absorbs gitdir. This way,
nested old-style gitdirs will also be handled.

I intend to incorporate these changes by the end of the week.

Thanks

--
Mugdha
Junio C Hamano Nov. 16, 2021, 6:54 p.m. UTC | #3
Mugdha Pattnaik <mugdhapattnaik@gmail.com> writes:

>> OK.  That sounds like an improvement, albeit possibly an overly
>> cautious one, as a casual "deinit" user will get an error as before
>> without "--force", which may or may not be a good thing.  Requiring
>> "--force" means it is safer by default by not changing the on-disk
>> data.  But requiring "--force" also means we end up training users
>> to say "--force" when it shouldn't have to be.
>> ...
>> Does "git submodule" currently reject a "deinit" request due to some
>> _other_ conditions or safety concerns and require the "--force"
>> option to continue?  Requiring the "--force" option to resolve ".git
>> is a directory, and the user wants to make it absorbed" means that
>> the user will be _forced_ to bypass these _other_ safety valves only
>> to save the submodule repository from destruction when running
>> "deinit", which may not be a good trade-off between the safety
>> requirements of these _other_ conditions, if exists, and the one we
>> are dealing with.
>
> This is definitely a situation we want to avoid. How about we try to run
> a check for uncommitted local modifications first?

I am not sure if I follow.  If we stop (ab)using "--force" for the
situation (i.e. where today's "deinit" would die because .git needs
to be absorbed first), then the user would not have to say "--force"
which may override other safety valve.  You'd check if .git needs to
be absorbed, make it absorbed as needed while reporting the fact
that you did so, and then let the existing "deinit" code to take
over.  If there are other safety checks that needs "--force" to be
overridden, that is handled (presumably) correctly by the existing
code, no?  So other than "do we need absorbing, and if so do it for
the user" check, I do not think you'd need to add any new "we try to
run a check for ..." at all.
Mugdha Pattnaik Nov. 17, 2021, 5:55 a.m. UTC | #4
On Wed, Nov 17, 2021 at 12:24 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Mugdha Pattnaik <mugdhapattnaik@gmail.com> writes:
>
> >> OK.  That sounds like an improvement, albeit possibly an overly
> >> cautious one, as a casual "deinit" user will get an error as before
> >> without "--force", which may or may not be a good thing.  Requiring
> >> "--force" means it is safer by default by not changing the on-disk
> >> data.  But requiring "--force" also means we end up training users
> >> to say "--force" when it shouldn't have to be.
> >> ...
> >> Does "git submodule" currently reject a "deinit" request due to some
> >> _other_ conditions or safety concerns and require the "--force"
> >> option to continue?  Requiring the "--force" option to resolve ".git
> >> is a directory, and the user wants to make it absorbed" means that
> >> the user will be _forced_ to bypass these _other_ safety valves only
> >> to save the submodule repository from destruction when running
> >> "deinit", which may not be a good trade-off between the safety
> >> requirements of these _other_ conditions, if exists, and the one we
> >> are dealing with.
> >
> > This is definitely a situation we want to avoid. How about we try to run
> > a check for uncommitted local modifications first?
>
> I am not sure if I follow.  If we stop (ab)using "--force" for the
> situation (i.e. where today's "deinit" would die because .git needs
> to be absorbed first), then the user would not have to say "--force"
> which may override other safety valve.  You'd check if .git needs to
> be absorbed, make it absorbed as needed while reporting the fact
> that you did so, and then let the existing "deinit" code to take
> over.  If there are other safety checks that needs "--force" to be
> overridden, that is handled (presumably) correctly by the existing
> code, no?  So other than "do we need absorbing, and if so do it for
> the user" check, I do not think you'd need to add any new "we try to
> run a check for ..." at all.

Yes, I understand why "--force" should not be used. The reason why I
suggested the check for local modifications is because I thought we
should warn users that they have local modifications before we absorb
the gitdir. But I see now that this is okay, considering deinit would
die in such a situation anyway, and users would not lose their work.
The only side-effect of running deinit despite users having local
modifications, would be that the gitdir of the submodule has been
absorbed and that should be okay.
diff mbox series

Patch

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index ef2776a9e45..040b26f149d 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1539,16 +1539,24 @@  static void deinit_submodule(const char *path, const char *prefix,
 		struct strbuf sb_rm = STRBUF_INIT;
 		const char *format;
 
-		/*
-		 * protect submodules containing a .git directory
-		 * NEEDSWORK: instead of dying, automatically call
-		 * absorbgitdirs and (possibly) warn.
-		 */
-		if (is_directory(sub_git_dir))
-			die(_("Submodule work tree '%s' contains a .git "
-			      "directory (use 'rm -rf' if you really want "
-			      "to remove it including all of its history)"),
-			    displaypath);
+		if (is_directory(sub_git_dir)) {
+			if (!(flags & OPT_FORCE))
+				die(_("Submodule work tree '%s' contains a "
+				      ".git directory.\nUse --force if you want "
+				      "to move its contents to superproject's "
+				      "module directory and convert .git to a file "
+				      "and then proceed with deinit."),
+				    displaypath);
+
+			if (!(flags & OPT_QUIET))
+				warning(_("Submodule work tree '%s' contains a .git "
+					  "directory. This will be replaced with a "
+					  ".git file by using absorbgitdirs."),
+					displaypath);
+
+			absorb_git_dir_into_superproject(displaypath, flags);
+
+		}
 
 		if (!(flags & OPT_FORCE)) {
 			struct child_process cp_rm = CHILD_PROCESS_INIT;
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index cb1b8e35dbf..f35479ea634 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -1182,18 +1182,18 @@  test_expect_success 'submodule deinit is silent when used on an uninitialized su
 	rmdir init example2
 '
 
-test_expect_success 'submodule deinit fails when submodule has a .git directory even when forced' '
+test_expect_success 'submodule deinit fails when submodule has a .git directory unless forced' '
 	git submodule update --init &&
 	(
 		cd init &&
 		rm .git &&
-		cp -R ../.git/modules/example .git &&
+		mv ../.git/modules/example .git &&
 		GIT_WORK_TREE=. git config --unset core.worktree
 	) &&
 	test_must_fail git submodule deinit init &&
-	test_must_fail git submodule deinit -f init &&
-	test -d init/.git &&
-	test -n "$(git config --get-regexp "submodule\.example\.")"
+	git submodule deinit -f init &&
+	test_path_is_missing init/.git &&
+	test -z "$(git config --get-regexp "submodule\.example\.")"
 '
 
 test_expect_success 'submodule with UTF-8 name' '