diff mbox series

[v2,3/3] builtin/add: error out when passing untracked path with -u

Message ID 20240329205649.1483032-5-shyamthakkar001@gmail.com (mailing list archive)
State New
Headers show
Series fix certain cases of add and commit with untracked path not erroring out | expand

Commit Message

Ghanshyam Thakkar March 29, 2024, 8:56 p.m. UTC
When passing untracked path with -u option, it silently succeeds. There
is no error message and the exit code is zero. This is inconsistent
with other instances of git commands where the expected argument is a
known path. In those other instances, we error out when the path is
not known.

Therefore, fix this by passing a character array to
add_files_to_cache() to collect the pathspec matching information and
report the error if a pathspec does not match any cache entry. Also add
a testcase to cover this scenario.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
 builtin/add.c         | 9 ++++++++-
 t/t2200-add-update.sh | 6 ++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

Comments

Junio C Hamano March 29, 2024, 9:43 p.m. UTC | #1
Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:

> When passing untracked path with -u option, it silently succeeds. There
> is no error message and the exit code is zero. This is inconsistent
> with other instances of git commands where the expected argument is a
> known path. In those other instances, we error out when the path is
> not known.
>
> Therefore, fix this by passing a character array to

"Therefore, fix" -> "Fix".

> add_files_to_cache() to collect the pathspec matching information and
> report the error if a pathspec does not match any cache entry. Also add
> a testcase to cover this scenario.
>
> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
> ---
>  builtin/add.c         | 9 ++++++++-
>  t/t2200-add-update.sh | 6 ++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/add.c b/builtin/add.c
> index ffe5fd8d44..650432bb13 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -370,6 +370,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  	int add_new_files;
>  	int require_pathspec;
>  	char *seen = NULL;
> +	char *ps_matched = NULL;
>  	struct lock_file lock_file = LOCK_INIT;
>  
>  	git_config(add_config, NULL);
> @@ -547,15 +548,20 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  		string_list_clear(&only_match_skip_worktree, 0);
>  	}
>  
> +
>  	begin_odb_transaction();

Unnecessary change.

> +	ps_matched = xcalloc(pathspec.nr, 1);
>  	if (add_renormalize)
>  		exit_status |= renormalize_tracked_files(&pathspec, flags);
>  	else
>  		exit_status |= add_files_to_cache(the_repository, prefix,
> -						  &pathspec, NULL,
> +						  &pathspec, ps_matched,
>  						  include_sparse, flags);
>  
> +	if (take_worktree_changes)
> +		exit_status |= report_path_error(ps_matched, &pathspec);

Hmph, are we sure take_worktree_changes is true only when
add_renormalize is false?

>  	if (add_new_files)
>  		exit_status |= add_files(&dir, flags);

If report_path_error() detected that the pathspec were faulty,
should we still proceed to add new files?  This is NOT a rhetorical
question, as I do not know the answer myself.  I do not even know
offhand what add_files_to_cache() above did when pathspec elements
are not all consumed---if it does not complain and does not refrain
from doing any change to the index, then we should follow suite and
add_files() here, too.

> @@ -568,6 +574,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  		die(_("unable to write new index file"));
>  
> +	free(ps_matched);
>  	dir_clear(&dir);
>  	clear_pathspec(&pathspec);
>  	return exit_status;
> diff --git a/t/t2200-add-update.sh b/t/t2200-add-update.sh
> index c01492f33f..7cba325f08 100755
> --- a/t/t2200-add-update.sh
> +++ b/t/t2200-add-update.sh
> @@ -65,6 +65,12 @@ test_expect_success 'update did not touch untracked files' '
>  	test_must_be_empty out
>  '
>  
> +test_expect_success 'error out when passing untracked path' '
> +	echo content >baz &&
> +	test_must_fail git add -u baz 2>err &&
> +	test_grep -e "error: pathspec .baz. did not match any file(s) known to git" err
> +'
> +
>  test_expect_success 'cache tree has not been corrupted' '
>  
>  	git ls-files -s |
Ghanshyam Thakkar March 30, 2024, 2:18 p.m. UTC | #2
On Fri, 29 Mar 2024, Junio C Hamano <gitster@pobox.com> wrote:
> Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
> > +	if (take_worktree_changes)
> > +		exit_status |= report_path_error(ps_matched, &pathspec);
> 
> Hmph, are we sure take_worktree_changes is true only when
> add_renormalize is false?
> 
> >  	if (add_new_files)
> >  		exit_status |= add_files(&dir, flags);
> 
> If report_path_error() detected that the pathspec were faulty,
> should we still proceed to add new files?  This is NOT a rhetorical
> question, as I do not know the answer myself.  I do not even know
> offhand what add_files_to_cache() above did when pathspec elements
> are not all consumed---if it does not complain and does not refrain
> from doing any change to the index, then we should follow suite and
> add_files() here, too.
Sorry if I'm missing something, but in your last line after '---', do you mean
that we should proceed even after report_path_error() detected error like in
the above patch or perhaps something like this:

diff --git a/builtin/add.c b/builtin/add.c
index dc4b42d0ad..eccda485ed 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -64,7 +64,8 @@ static int chmod_pathspec(struct pathspec *pathspec, char flip, int show_only)
        return ret;
 }
 
-static int renormalize_tracked_files(const struct pathspec *pathspec, int flags)
+static int renormalize_tracked_files(const struct pathspec *pathspec,
+                                    char *ps_matched, int flags)
 {
        int i, retval = 0;
 
@@ -79,7 +80,8 @@ static int renormalize_tracked_files(const struct pathspec *pathspec, int flags)
                        continue; /* do not touch unmerged paths */
                if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
                        continue; /* do not touch non blobs */
-               if (pathspec && !ce_path_match(&the_index, ce, pathspec, NULL))
+               if (pathspec &&
+                   !ce_path_match(&the_index, ce, pathspec, ps_matched))
                        continue;
                retval |= add_file_to_index(&the_index, ce->name,
                                            flags | ADD_CACHE_RENORMALIZE);
@@ -370,7 +372,9 @@ int cmd_add(int argc, const char **argv, const char *prefix)
        int add_new_files;
        int require_pathspec;
        char *seen = NULL;
+       char *ps_matched = NULL;
        struct lock_file lock_file = LOCK_INIT;
+       struct string_list only_match_skip_worktree = STRING_LIST_INIT_NODUP;
 
        git_config(add_config, NULL);
 
@@ -487,7 +491,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
        if (pathspec.nr) {
                int i;
                char *skip_worktree_seen = NULL;
-               struct string_list only_match_skip_worktree = STRING_LIST_INIT_NODUP;
 
                if (!seen)
                        seen = find_pathspecs_matching_against_index(&pathspec,
@@ -544,18 +547,26 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
                free(seen);
                free(skip_worktree_seen);
-               string_list_clear(&only_match_skip_worktree, 0);
        }
 
        begin_odb_transaction();
 
+       ps_matched = xcalloc(pathspec.nr, 1);
        if (add_renormalize)
-               exit_status |= renormalize_tracked_files(&pathspec, flags);
+               exit_status |=
+                       renormalize_tracked_files(&pathspec, ps_matched, flags);
        else
                exit_status |= add_files_to_cache(the_repository, prefix,
-                                                 &pathspec, NULL,
+                                                 &pathspec, ps_matched,
                                                  include_sparse, flags);
 
+       if ((take_worktree_changes ||
+            (add_renormalize && !only_match_skip_worktree.nr)) &&
                                                  include_sparse, flags);
 
+       if ((take_worktree_changes ||
+            (add_renormalize && !only_match_skip_worktree.nr)) &&
+           report_path_error(ps_matched, &pathspec)) {
+               exit_status = 1;
+               goto cleanup;
+       }
+
        if (add_new_files)
                exit_status |= add_files(&dir, flags);
 
@@ -568,6 +579,9 @@ int cmd_add(int argc, const char **argv, const char *prefix)
                               COMMIT_LOCK | SKIP_IF_UNCHANGED))
                die(_("unable to write new index file"));
 
+cleanup:
+       string_list_clear(&only_match_skip_worktree, 0);
+       free(ps_matched);
        dir_clear(&dir);
        clear_pathspec(&pathspec);
        return exit_status;

Although I'm not sure if we should flush_odb_transaction() in the
cleanup, because end_odb_transaction() would not be called if we go
straight to cleanup.
Junio C Hamano March 30, 2024, 4:49 p.m. UTC | #3
Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:

> On Fri, 29 Mar 2024, Junio C Hamano <gitster@pobox.com> wrote:
>> Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
>> > +	if (take_worktree_changes)
>> > +		exit_status |= report_path_error(ps_matched, &pathspec);
>> 
>> Hmph, are we sure take_worktree_changes is true only when
>> add_renormalize is false?
>> 
>> >  	if (add_new_files)
>> >  		exit_status |= add_files(&dir, flags);
>> 
>> If report_path_error() detected that the pathspec were faulty,
>> should we still proceed to add new files?  This is NOT a rhetorical
>> question, as I do not know the answer myself.  I do not even know
>> offhand what add_files_to_cache() above did when pathspec elements
>> are not all consumed---if it does not complain and does not refrain
>> from doing any change to the index, then we should follow suite and
>> add_files() here, too.
> Sorry if I'm missing something, but in your last line after '---', do you mean
> that we should proceed even after report_path_error() detected error like in
> the above patch or perhaps something like this:

We roughly do:

	if (add_renorm)
		exit_status |= renorm();
	else
		exit_status |= add_files_to_cache();
+	if (take_worktree_changes)
+		exit_status |= report_path_error();
	if (add_new_files)
		exit_status |= add_files();

I was wondering if we should refrain from adding new files when we
exit_status is true to avoid making "further damage", and was
wondering if the last one should become:

	if (!exit_status && add_new_files)
		exit_status |= add_files();

But that was merely because I was not thinking things through.  If
we were to go that route, the whole thing needs to become (because
there are other things that notice errors before this part of the
code):
	
	if (!exit_status) {
		if (add_renorm)
			exit_status |= renorm();
		else
                	exit_status |= add_files_to_cache();
	}
	if (!exit_status && take_worktree_changes)
		exit_status |= report_path_error();

	if (!exit_status && add_new_files)
		exit_status |= add_files();

but (1) that is far bigger change of behaviour to the code than
suitable for "notice unmatched pathspec elements and report an
error" topic, and more importantly (2) it is still not sufficient to
make it "all-or-none". E.g., if "add_files_to_cache()" call added
contents from a few paths and then noticed that some pathspec
elements were not used, we are not restoring the previous state to
recover.  The damage is already done, and not making further damage
does not help the user all that much.

So, it was a fairly pointless thing that I was wondering about.  The
current behaviour, and the new behaviour with the new check, are
fine as-is.

If we wanted to make it "all-or-none", I think the way to do so is
to tweak the final part of the cmd_add() function to skip committing
the updated index, e.g.,

         finish:
        -	if (write_locked_index(&the_index, &lock_file,
        +	if (exit_status)
        +		fputs(_("not updating the index due to failure(s)\n"), stderr);
        +	else if (write_locked_index(&the_index, &lock_file,
                                       COMMIT_LOCK | SKIP_IF_UNCHANGED))
                        die(_("unable to write new index file"));
 
And if/when we do so, the existing code (with or without the updates
made by the topic under discussion) needs no change.  We can do all
steps regardless of the errors we notice along the way with earlier
steps, and discard the in-core index if we saw any errors.

The renormalize() thing is not noticing unused pathspec elements,
which we might want to fix, but I suspect it is far less commonly
used mode of operation, so it may be OK to leave it to future
follow-up series.

Thanks.
Ghanshyam Thakkar April 1, 2024, 1:27 p.m. UTC | #4
On Sat, 30 Mar 2024, Junio C Hamano <gitster@pobox.com> wrote:
> So, it was a fairly pointless thing that I was wondering about.  The
> current behaviour, and the new behaviour with the new check, are
> fine as-is.

Well I think we should be going 'all-or-none' way as I can't think of
any major user-facing command that does partial changes incase of
error (besides two testcase below).

> If we wanted to make it "all-or-none", I think the way to do so is
> to tweak the final part of the cmd_add() function to skip committing
> the updated index, e.g.,
> 
>          finish:
>         -	if (write_locked_index(&the_index, &lock_file,
>         +	if (exit_status)
>         +		fputs(_("not updating the index due to failure(s)\n"), stderr);
>         +	else if (write_locked_index(&the_index, &lock_file,
>                                        COMMIT_LOCK | SKIP_IF_UNCHANGED))
>                         die(_("unable to write new index file"));
>  
> And if/when we do so, the existing code (with or without the updates
> made by the topic under discussion) needs no change.  We can do all
> steps regardless of the errors we notice along the way with earlier
> steps, and discard the in-core index if we saw any errors.

Doing this, we would need to take care of atleast 4 tests breaking in
t3700-add:
 error out when attempting to add ignored ones but add others
 git add --ignore-errors
 git add (add.ignore-errors)
 git add --chmod fails with non regular files (but updates the other paths)

while ignore-errors ones would be trivial to fix, fixing other 2 would
probably require some more than trivial code changes, as from the title,
their behavior seems pretty much set in stone. That's why I did the
'goto cleanup' approach to not break these.

Thanks.

> The renormalize() thing is not noticing unused pathspec elements,
> which we might want to fix, but I suspect it is far less commonly
> used mode of operation, so it may be OK to leave it to future
> follow-up series.
> 
> Thanks.
Junio C Hamano April 1, 2024, 4:31 p.m. UTC | #5
Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:

> Well I think we should be going 'all-or-none' way as I can't think of
> any major user-facing command that does partial changes incase of
> error (besides two testcase below).

I agree that in the longer run, all-or-none would be something we
should aim for, but I'd strongly prefer leaving that outside this
topic, especially the existing ones that set exit_status to non-zero
but still commits the index changes.

I am OK, as a place to stop for now, if the topic had something like

+	if (take_worktree_changes) {
+		if (report_path_error(ps_matched, &pathspec))
+			exit(128);
+	}

in it, though, because this is a new behaviour.

> Doing this, we would need to take care of atleast 4 tests breaking in
> t3700-add:
>  error out when attempting to add ignored ones but add others
>  git add --ignore-errors
>  git add (add.ignore-errors)
>  git add --chmod fails with non regular files (but updates the other paths)
>
> while ignore-errors ones would be trivial to fix, fixing other 2 would
> probably require some more than trivial code changes, as from the title,
> their behavior seems pretty much set in stone. That's why I did the
> 'goto cleanup' approach to not break these.

I am not sure if these are expecting the right outcome in the first
place, and the need to examine what the right behaviour should be is
what makes me say "I do not want to make the all-or-none thing part
of this topic".

>> The renormalize() thing is not noticing unused pathspec elements,
>> which we might want to fix, but I suspect it is far less commonly
>> used mode of operation, so it may be OK to leave it to future
>> follow-up series.

Thanks.
diff mbox series

Patch

diff --git a/builtin/add.c b/builtin/add.c
index ffe5fd8d44..650432bb13 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -370,6 +370,7 @@  int cmd_add(int argc, const char **argv, const char *prefix)
 	int add_new_files;
 	int require_pathspec;
 	char *seen = NULL;
+	char *ps_matched = NULL;
 	struct lock_file lock_file = LOCK_INIT;
 
 	git_config(add_config, NULL);
@@ -547,15 +548,20 @@  int cmd_add(int argc, const char **argv, const char *prefix)
 		string_list_clear(&only_match_skip_worktree, 0);
 	}
 
+
 	begin_odb_transaction();
 
+	ps_matched = xcalloc(pathspec.nr, 1);
 	if (add_renormalize)
 		exit_status |= renormalize_tracked_files(&pathspec, flags);
 	else
 		exit_status |= add_files_to_cache(the_repository, prefix,
-						  &pathspec, NULL,
+						  &pathspec, ps_matched,
 						  include_sparse, flags);
 
+	if (take_worktree_changes)
+		exit_status |= report_path_error(ps_matched, &pathspec);
+
 	if (add_new_files)
 		exit_status |= add_files(&dir, flags);
 
@@ -568,6 +574,7 @@  int cmd_add(int argc, const char **argv, const char *prefix)
 			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 		die(_("unable to write new index file"));
 
+	free(ps_matched);
 	dir_clear(&dir);
 	clear_pathspec(&pathspec);
 	return exit_status;
diff --git a/t/t2200-add-update.sh b/t/t2200-add-update.sh
index c01492f33f..7cba325f08 100755
--- a/t/t2200-add-update.sh
+++ b/t/t2200-add-update.sh
@@ -65,6 +65,12 @@  test_expect_success 'update did not touch untracked files' '
 	test_must_be_empty out
 '
 
+test_expect_success 'error out when passing untracked path' '
+	echo content >baz &&
+	test_must_fail git add -u baz 2>err &&
+	test_grep -e "error: pathspec .baz. did not match any file(s) known to git" err
+'
+
 test_expect_success 'cache tree has not been corrupted' '
 
 	git ls-files -s |