diff mbox series

[1/1,RFC,GSoC] submodule: using 'is_writing_gitmodules_ok()' for a stricter check

Message ID 20200211170359.31835-2-shouryashukla.oo@gmail.com (mailing list archive)
State New, archived
Headers show
Series submodule: enforcing stricter checks | expand

Commit Message

Shourya Shukla Feb. 11, 2020, 5:03 p.m. UTC
The if conditions of the functions 'update_path_in_gitmodules()'
and 'remove_path_from_gitmodules()' are not catering to every
condition encountered by the function. On detailed observation,
one can notice that .gitmodules cannot be changed (i.e. removal
of a path or updation of a path) until these conditions are satisfied:

    1. The file exists
    2. The file, if it does not exist, should be absent from
       the index and other branches as well.
    3. There should not be any unmerged changes in the file.
    4. The submodules do not exist or if the submodule name
       does not match.

Only the conditions 1, 3 and 4 were being satisfied earlier. Now
on changing the if statement in one of the places, the condition
2 is satisfied as well.

Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
---
 submodule.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Comments

Johannes Schindelin Feb. 13, 2020, 1:42 p.m. UTC | #1
Hi Shourya,

On Tue, 11 Feb 2020, Shourya Shukla wrote:

> The if conditions of the functions 'update_path_in_gitmodules()'
> and 'remove_path_from_gitmodules()' are not catering to every
> condition encountered by the function. On detailed observation,
> one can notice that .gitmodules cannot be changed (i.e. removal
> of a path or updation of a path) until these conditions are satisfied:
>
>     1. The file exists
>     2. The file, if it does not exist, should be absent from
>        the index and other branches as well.

I don't think that other branches matter in this context.

>     3. There should not be any unmerged changes in the file.
>     4. The submodules do not exist or if the submodule name
>        does not match.
>
> Only the conditions 1, 3 and 4 were being satisfied earlier. Now
> on changing the if statement in one of the places, the condition
> 2 is satisfied as well.

Let's see how this is done...

> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
> ---
>  submodule.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/submodule.c b/submodule.c
> index 3a184b66ab..f7836a6851 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -107,7 +107,13 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
>  	const struct submodule *submodule;
>  	int ret;
>
> -	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
> +	/* If .gitmodules file is not safe to write(update a path) i.e.
> +	 * if it does not exist or if it is not present in the working tree
> +	 * but lies in the index or in the current branch.
> +	 * The function 'is_writing_gitmodules_ok()' checks for the same.
> +	 * and exits with failure if above conditions are not satisfied
> +	*/

Style: we always begin and end multi-line comments with `/*` and `*/` on
their own line.

> +	if (is_writing_gitmodules_ok())

Hmm. This function is defined thusly:

int is_writing_gitmodules_ok(void)
{
	struct object_id oid;
	return file_exists(GITMODULES_FILE) ||
		(get_oid(GITMODULES_INDEX, &oid) < 0 && get_oid(GITMODULES_HEAD, &oid) < 0);
}

Aha! So this tries to ensure that the `.gitmodules` file exists on disk,
or if it does not, then it should not exist in the index nor in the
_current_ branch.

But we're in the function called `update_path_in_gitmodules()` which
suggests that we're working on an existing, valid `.gitmodules`.

So I do not think that we can proceed if `.gitmodules` is absent from
disk, even if in case that it is _also_ absent from the index and from the
current branch.

>  		return -1;
>
>  	if (is_gitmodules_unmerged(the_repository->index))
> @@ -136,7 +142,13 @@ int remove_path_from_gitmodules(const char *path)
>  	struct strbuf sect = STRBUF_INIT;
>  	const struct submodule *submodule;
>
> -	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
> +	/* If .gitmodules file is not safe to write(remove a path) i.e.
> +	 * if it does not exist or if it is not present in the working tree
> +	 * but lies in the index or in the current branch.
> +	 * The function 'is_writing_gitmodules_ok()' checks for the same.
> +	 * and exits with failure if above conditions are not satisfied
> +	*/
> +	if (is_writing_gitmodules_ok())

Here, we want to remove a path from `.gitmodules`, so I think that the
same analysis applies as above.

In other words, I think that the existing code is correct and does not
need to be patched.

Ciao,
Johannes

>  		return -1;
>
>  	if (is_gitmodules_unmerged(the_repository->index))
> --
> 2.20.1
>
>
Shourya Shukla Feb. 13, 2020, 4:38 p.m. UTC | #2
Hello Johannes,

I understand your point of view here. What I am trying to say is that we must update
our .gitmodules if atleast the function 'is_writing_gitmodules_ok()' passes.

Before, we used to pass the if condition if our .gitmomdules existed and it did not matter
if there were any traces of it in the index.

> But we're in the function called `update_path_in_gitmodules()` which
> suggests that we're working on an existing, valid `.gitmodules`.

But we still originally(before my patch) checked for the existence of .gitmodules right?
The functions exits with error in case of absence of the file(which should happen).

> So I do not think that we can proceed if `.gitmodules` is absent from
> disk, even if in case that it is _also_ absent from the index and from the
> current branch.

Yes that is one case, but the other case is that _if_ the file exists, it **should** not
exist in the index or our current branch(which must be necessary to ensure before making
any updates to the file right?). This is the case which was not covered before but I have
tried to cover it in my patch.

Is this explanation correct?

Regards,
Shourya Shukla
Johannes Schindelin Feb. 14, 2020, 1:28 p.m. UTC | #3
Hi Shourya,

On Thu, 13 Feb 2020, Shourya Shukla wrote:

> I understand your point of view here. What I am trying to say is that we
> must update our .gitmodules if atleast the function
> 'is_writing_gitmodules_ok()' passes.

Well, you know, I totally overlooked something: your patch replaces

	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */

by

	if (is_writing_gitmodules_ok())

which is incorrect: it should _at least_ replace it with

	if (!is_writing_gitmodules_ok())

Note the `!`. The reason is that this statement guards an early exit from
the function, indicating an error. In particular, the code before said: if
this file does not exist, error out.

The new code (with the `!`) would say: if the file does not exist, _or if
`is_writing_gitmodules_ok()` fails, error out.

But that function does not do what we want: if we rewrite the code in the
way you suggested, then we would _no longer_ error out if the file is
missing if it at least is in the index or in `HEAD`.

But if the file is missing, we cannot edit it, which is what both the
"update" and the "remove" code path want to do.

> Before, we used to pass the if condition if our .gitmomdules existed and
> it did not matter if there were any traces of it in the index.

Exactly. If there is no `.gitmodules` file on disk, we cannot edit it.
Period.

It does not matter whether there is a copy in the index or in `HEAD`: the
`git mv` and `git rm` commands want to work _on the worktree_ by default.

Side note: From a cursory read of the callers in `builtin/rm.c`, I suspect
that `git rm --cached` actually does not handle the `.gitmodules` file
correctly: it would not edit it in that case, but we would want it to be
edited _in the index_.

> > But we're in the function called `update_path_in_gitmodules()` which
> > suggests that we're working on an existing, valid `.gitmodules`.
>
> But we still originally(before my patch) checked for the existence of
> .gitmodules right?

We checked for the _non_-existence.

> The functions exits with error in case of absence of the file(which
> should happen).

Yes.

And your patch changes this so that the file _can_ be absent, _as long_ as
it exists either in the index or in the tip commit of the current branch.

But the code then goes on to call `config_set_in_gitmodules_file_gently()`
or `git_config_rename_section_in_file()`, respectively. Both of these
functions _expect_ the file to exist.

Therefore, the condition that your patch now allows would lead to
incorrect behavior. A test case would have caught this, which is actually
a good reminder that patches that change behavior should always be
accompanied by changes/additions to the test suite to document the
expected behavior.

> > So I do not think that we can proceed if `.gitmodules` is absent from
> > disk, even if in case that it is _also_ absent from the index and from
> > the current branch.
>
> Yes that is one case, but the other case is that _if_ the file exists,
> it **should** not exist in the index or our current branch(which must be
> necessary to ensure before making any updates to the file right?).

Assuming that you are talking about the conditions that have to be met
_not_ to error out early from those functions, I disagree: both of these
functions operate on the `.gitmodules` _file_. They require that file. It
must exist. Otherwise we must error out early. Which the existing code
already does.

> This is the case which was not covered before but I have tried to cover
> it in my patch.

If you truly want to cover the case where we want to edit the
`.gitmodules` file even if it does not exist on disk, but only in the
index and/or the current branch, then those functions need _quite_ a bit
more work to actually pull the contents from the index, and/or from the
tip commit, _and_ to put the modified contents into the index.

However, I am not at all sure that that is a wise thing to do (except in
the case that we're talking about `git rm`'s `--cached` option,
in which case you would _definitely_ need quite a bit more modifications
e.g. to extend the signature of at least `remove_path_from_gitmodules()`
to indicate the desire _not_ to work on the worktree file but on the index
instead, and that mode should not even allow `.gitmodules` to be absent
from worktree and index but only exist in the tip commit of the current
branch).

So I am afraid that the patch is incorrect as-is. It would require a
clearer idea of what its goal is, which would have to be reflected in the
commit message, and it would have to be accompanied by a regression test
case.

As things stand, I don't think that this patch is going in the right
direction.

If, on the other hand, the direction is changed to try to support the
`--cached` option, I would agree that that would be going toward the right
direction.

Ciao,
Johannes

> Is this explanation correct?
>
> Regards,
> Shourya Shukla
>
diff mbox series

Patch

diff --git a/submodule.c b/submodule.c
index 3a184b66ab..f7836a6851 100644
--- a/submodule.c
+++ b/submodule.c
@@ -107,7 +107,13 @@  int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 	const struct submodule *submodule;
 	int ret;
 
-	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
+	/* If .gitmodules file is not safe to write(update a path) i.e.
+	 * if it does not exist or if it is not present in the working tree
+	 * but lies in the index or in the current branch.
+	 * The function 'is_writing_gitmodules_ok()' checks for the same.
+	 * and exits with failure if above conditions are not satisfied
+	*/
+	if (is_writing_gitmodules_ok())
 		return -1;
 
 	if (is_gitmodules_unmerged(the_repository->index))
@@ -136,7 +142,13 @@  int remove_path_from_gitmodules(const char *path)
 	struct strbuf sect = STRBUF_INIT;
 	const struct submodule *submodule;
 
-	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
+	/* If .gitmodules file is not safe to write(remove a path) i.e.
+	 * if it does not exist or if it is not present in the working tree
+	 * but lies in the index or in the current branch.
+	 * The function 'is_writing_gitmodules_ok()' checks for the same.
+	 * and exits with failure if above conditions are not satisfied
+	*/
+	if (is_writing_gitmodules_ok())
 		return -1;
 
 	if (is_gitmodules_unmerged(the_repository->index))