diff mbox series

[v3] describe: refresh the index when 'broken' flag is used

Message ID 20240625133534.223579-1-abhijeet.nkt@gmail.com (mailing list archive)
State Superseded
Headers show
Series [v3] describe: refresh the index when 'broken' flag is used | expand

Commit Message

Abhijeet Sonar June 25, 2024, 1:35 p.m. UTC
When describe is run with 'dirty' flag, we refresh the index
to make sure it is in sync with the filesystem before
determining if the working tree is dirty.  However, this is
not done for the codepath where the 'broken' flag is used.

This causes `git describe --broken --dirty` to false
positively report the worktree being dirty if a file has
different stat info than what is recorded in the index.
Running `git update-index -q --refresh` to refresh the index
before running diff-index fixes the problem.

Also add tests to deliberately update stat info of a
file before running describe to verify it behaves correctly.

Reported-by: Paul Millar <paul.millar@desy.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
---
 builtin/describe.c  | 11 +++++++++++
 t/t6120-describe.sh | 24 ++++++++++++++++++++++++
 2 files changed, 35 insertions(+)


Range-diff against v2:
1:  d60fc0fa02 ! 1:  1da5fa48d9 describe: refresh the index when 'broken' flag is used
    @@ Commit message
         Reported-by: Paul Millar <paul.millar@desy.de>
         Suggested-by: Junio C Hamano <gitster@pobox.com>
         Helped-by: Junio C Hamano <gitster@pobox.com>
    +    Helped-by: Phillip Wood <phillip.wood123@gmail.com>
         Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
     
      ## builtin/describe.c ##
    @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr
      	if (argc == 0) {
      		if (broken) {
      			struct child_process cp = CHILD_PROCESS_INIT;
    -+			struct child_process update_index_cp = CHILD_PROCESS_INIT;
    -+
    -+			strvec_pushv(&update_index_cp.args, update_index_args);
    -+			update_index_cp.git_cmd = 1;
    -+			update_index_cp.no_stdin = 1;
    -+			update_index_cp.no_stdout = 1;
    -+			run_command(&update_index_cp);
    ++			strvec_pushv(&cp.args, update_index_args);
    ++			cp.git_cmd = 1;
    ++			cp.no_stdin = 1;
    ++			cp.no_stdout = 1;
    ++			run_command(&cp);
    ++			strvec_clear(&cp.args);
     +
      			strvec_pushv(&cp.args, diff_index_args);
      			cp.git_cmd = 1;
    @@ t/t6120-describe.sh: test_expect_success 'setup misleading taggerdates' '
      
     +test_expect_success 'describe --dirty with a file with changed stat' '
     +	git init stat-dirty &&
    -+	cd stat-dirty &&
    -+
    -+	echo A >file &&
    -+	git add file &&
    -+	git commit -m A &&
    -+	git tag A -a -m A &&
    -+
    -+	cat file >file.new &&
    -+	mv file.new file &&
    -+	git describe --dirty >actual &&
    -+	echo "A" >expected &&
    -+	test_cmp expected actual
    -+'
    ++	(
    ++		cd stat-dirty &&
     +
    -+test_expect_success 'describe --dirty --broken with a file with changed stat' '
    -+	git init stat-dirty-broken &&
    -+	cd stat-dirty-broken &&
    ++		echo A >file &&
    ++		git add file &&
    ++		git commit -m A &&
    ++		git tag A -a -m A &&
     +
    -+	echo A >file &&
    -+	git add file &&
    -+	git commit -m A &&
    -+	git tag A -a -m A &&
    ++		cat file >file.new &&
    ++		mv file.new file &&
    ++		git describe --dirty >actual &&
    ++		echo "A" >expected &&
    ++		test_cmp expected actual &&
     +
    -+	cat file >file.new &&
    -+	mv file.new file &&
    -+	git describe --dirty --broken >actual &&
    -+	echo "A" >expected &&
    -+	test_cmp expected actual
    ++		cat file >file.new &&
    ++		mv file.new file &&
    ++		git describe --dirty --broken >actual &&
    ++		echo "A" >expected &&
    ++		test_cmp expected actual
    ++	)
     +'
     +
      test_done

Comments

Junio C Hamano June 25, 2024, 3:59 p.m. UTC | #1
Abhijeet Sonar <abhijeet.nkt@gmail.com> writes:

>  		if (broken) {
>  			struct child_process cp = CHILD_PROCESS_INIT;
> +			strvec_pushv(&cp.args, update_index_args);
> +			cp.git_cmd = 1;
> +			cp.no_stdin = 1;
> +			cp.no_stdout = 1;
> +			run_command(&cp);
> +			strvec_clear(&cp.args);

Why clear .args here?

Either "struct child_process" is reusable after finish_command()
that is called as the last step of run_command() returns
successfully, or it shouldn't be reused at all.  And when
finish_command() is called, .args as well as .env are cleared
because it calls child_process_clear().

I am wondering if the last part need to be more like

	...
	cp.no_stdout = 1;
	if (run_command(&cp))
		child_process_clear(&cp);

> +
>  			strvec_pushv(&cp.args, diff_index_args);
>  			cp.git_cmd = 1;
>  			cp.no_stdin = 1;

Thanks.


(#leftoverbit)

Outside the scope of this patch, I'd prefer to see somebody makes
sure that it is truly equivalent to prepare a separate and new
struct child_process for each run_command() call and to reuse the
same struct child_process after calling child_process_clear() each
time.  It is unclear if they are equivalent in general, even though
in this particular case I think we should be OK.

There _might_ be other things in the child_process structure that
need to be reset to the initial state before it can be reused, but
are not cleared by child_process_clear().  .git_cmd and other flags
as well as in/out/err file descriptors do not seem to be cleared,
and other callers of run_command() may even be depending on the
current behaviour that they are kept.
Junio C Hamano June 25, 2024, 4:05 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> (#leftoverbit)
>
> Outside the scope of this patch, I'd prefer to see somebody makes
> sure that it is truly equivalent to prepare a separate and new
> struct child_process for each run_command() call and to reuse the
> same struct child_process after calling child_process_clear() each
> time.  It is unclear if they are equivalent in general, even though
> in this particular case I think we should be OK.
>
> There _might_ be other things in the child_process structure that
> need to be reset to the initial state before it can be reused, but
> are not cleared by child_process_clear().  .git_cmd and other flags
> as well as in/out/err file descriptors do not seem to be cleared,
> and other callers of run_command() may even be depending on the
> current behaviour that they are kept.

Ahh, the reuse of the same struct came directly from Karthik's
review on the second iteration.  I guess Karthik volunteered himself
into this #leftoverbit task?  I am not convinced that

 (1) the selective clearing done by current child_process_clear() is
     the best thing we can do to make child_process reusable, and

 (2) among the current callers, there is nobody that depends on the
     state left by the previous use of child_process in another
     run_command() call that is left uncleared by child_process_clear().

If (1) is false, then reusing child_process structure is not quite
safe, and if (2) is false, updating child_process_clear() to really
clear everything will first need to adjust some callers.

Thanks.
Abhijeet Sonar June 26, 2024, 6:11 a.m. UTC | #3
On 25/06/24 21:29, Junio C Hamano wrote:

> Why clear .args here?

I overlooked the fact that run_command already clears .args if the 
command exits with a zero exit code.  Will fix, thank you.

> I am wondering if the last part need to be more like
> 
> 	...
> 	cp.no_stdout = 1;
> 	if (run_command(&cp))
> 		child_process_clear(&cp);
> 

That makes sense to me, will do.

Thanks.
Karthik Nayak June 26, 2024, 11:16 a.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> (#leftoverbit)
>>
>> Outside the scope of this patch, I'd prefer to see somebody makes
>> sure that it is truly equivalent to prepare a separate and new
>> struct child_process for each run_command() call and to reuse the
>> same struct child_process after calling child_process_clear() each
>> time.  It is unclear if they are equivalent in general, even though
>> in this particular case I think we should be OK.
>>
>> There _might_ be other things in the child_process structure that
>> need to be reset to the initial state before it can be reused, but
>> are not cleared by child_process_clear().  .git_cmd and other flags
>> as well as in/out/err file descriptors do not seem to be cleared,
>> and other callers of run_command() may even be depending on the
>> current behaviour that they are kept.
>
> Ahh, the reuse of the same struct came directly from Karthik's
> review on the second iteration.  I guess Karthik volunteered himself
> into this #leftoverbit task?  I am not convinced that
>

Hehe. I'll take it up!

>  (1) the selective clearing done by current child_process_clear() is
>      the best thing we can do to make child_process reusable, and
>
>  (2) among the current callers, there is nobody that depends on the
>      state left by the previous use of child_process in another
>      run_command() call that is left uncleared by child_process_clear().
>
> If (1) is false, then reusing child_process structure is not quite
> safe, and if (2) is false, updating child_process_clear() to really
> clear everything will first need to adjust some callers.
>
> Thanks.
>

I think it would be best to write some unit tests to capture the current
behavior and based on the findings and as you suggested, we can decide
the path forward.

Karthik
diff mbox series

Patch

diff --git a/builtin/describe.c b/builtin/describe.c
index e5287eddf2..deec19b29a 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -53,6 +53,10 @@  static const char *diff_index_args[] = {
 	"diff-index", "--quiet", "HEAD", "--", NULL
 };
 
+static const char *update_index_args[] = {
+	"update-index", "--unmerged", "-q", "--refresh", NULL
+};
+
 struct commit_name {
 	struct hashmap_entry entry;
 	struct object_id peeled;
@@ -645,6 +649,13 @@  int cmd_describe(int argc, const char **argv, const char *prefix)
 	if (argc == 0) {
 		if (broken) {
 			struct child_process cp = CHILD_PROCESS_INIT;
+			strvec_pushv(&cp.args, update_index_args);
+			cp.git_cmd = 1;
+			cp.no_stdin = 1;
+			cp.no_stdout = 1;
+			run_command(&cp);
+			strvec_clear(&cp.args);
+
 			strvec_pushv(&cp.args, diff_index_args);
 			cp.git_cmd = 1;
 			cp.no_stdin = 1;
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index e78315d23d..6c396e7abc 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -671,4 +671,28 @@  test_expect_success 'setup misleading taggerdates' '
 
 check_describe newer-tag-older-commit~1 --contains unique-file~2
 
+test_expect_success 'describe --dirty with a file with changed stat' '
+	git init stat-dirty &&
+	(
+		cd stat-dirty &&
+
+		echo A >file &&
+		git add file &&
+		git commit -m A &&
+		git tag A -a -m A &&
+
+		cat file >file.new &&
+		mv file.new file &&
+		git describe --dirty >actual &&
+		echo "A" >expected &&
+		test_cmp expected actual &&
+
+		cat file >file.new &&
+		mv file.new file &&
+		git describe --dirty --broken >actual &&
+		echo "A" >expected &&
+		test_cmp expected actual
+	)
+'
+
 test_done