diff mbox series

[2/6] diff-merges: move specific diff-index "-m" handling to diff-index

Message ID 20210510153451.15090-3-sorganov@gmail.com (mailing list archive)
State Superseded
Headers show
Series diff-merges: let -m imply -p | expand

Commit Message

Sergey Organov May 10, 2021, 3:34 p.m. UTC
Move specific handling of "-m" for diff-index to diff-index.c, so
diff-merges is left to handle only diff for merges options.

Being a better design by itself, this is especially essential in
preparation for letting -m imply -p, as "diff-index -m" obviously
should not imply -p, as it's entirely unrelated.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
---
 builtin/diff-index.c | 45 ++++++++++++++++++++++++++++++++++----------
 diff-merges.c        | 13 +------------
 2 files changed, 36 insertions(+), 22 deletions(-)

Comments

Junio C Hamano May 11, 2021, 4:09 a.m. UTC | #1
Sergey Organov <sorganov@gmail.com> writes:

>  int cmd_diff_index(int argc, const char **argv, const char *prefix)
>  {
>  	struct rev_info rev;
>  	unsigned int option = 0;
> -	int i;
>  	int result;
>  
>  	if (argc == 2 && !strcmp(argv[1], "-h"))
> @@ -27,17 +53,16 @@ int cmd_diff_index(int argc, const char **argv, const char *prefix)
>  	rev.abbrev = 0;
>  	prefix = precompose_argv_prefix(argc, argv, prefix);
>  
> +	/*
> +	 * It's essential to parse our distinct options before calling
> +	 * setup_revisions(), for the latter not to see "-m".
> +	 */
> +	argc = parse_distinct_options(argc, argv, &rev, &option);
>  	argc = setup_revisions(argc, argv, &rev, NULL);

This change is risky, as the loop below (which this patch moves to
parse_distinct_options()) has no knowledge of other options that
setup_revisions() helper is prepared to handle and that takes an
argument.  When parsing "git cmd --opt --cached A", setup_revisions()
may know that --opt takes an argument and eat both (i.e. the
"--cached" is not an option but an arg given to "--opt"), but the
new parse_distinct_options() helper does not; it will happily skip
"--opt" and leave it in, mistake "--cached" as an option and remove,
and instead make "A" the arg given to "--opt".

Picking up the remnant _after_ setup_revisions() ate what it
understands would not have such a downside, as long as none of our
"distinct options" take any argument.

Can't we make "-m means something special for diff-index" without
butchering the command line processing in this step?  diff-index
does not care about --diff-merges, so letting setup_revisions()
remember only the fact that "-m" was given while parsing, and then
postprocess what "-m" means depending on the command (i.e. everybody
else would treat it as a short-hand for "--diff-merges=m" plus "we
need some form of diff output, while allowing "diff-index" to treat
it differently) should not be rocket science.

Thanks.
Junio C Hamano May 11, 2021, 5:23 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> This change is risky, as the loop below (which this patch moves to
> parse_distinct_options()) has no knowledge of other options that
> setup_revisions() helper is prepared to handle and that takes an
> argument.  When parsing "git cmd --opt --cached A", setup_revisions()
> may know that --opt takes an argument and eat both (i.e. the
> "--cached" is not an option but an arg given to "--opt"), but the
> new parse_distinct_options() helper does not; it will happily skip
> "--opt" and leave it in, mistake "--cached" as an option and remove,
> and instead make "A" the arg given to "--opt".

The above is not theoretical.  The series breaks

    $ git checkout master
    $ git diff-index -S --cached maint

IOW, what I predicted in the previous message with s/--opt/-S/
happens with this step.

> Picking up the remnant _after_ setup_revisions() ate what it
> understands would not have such a downside, as long as none of our
> "distinct options" take any argument.
>
> Can't we make "-m means something special for diff-index" without
> butchering the command line processing in this step?  diff-index
> does not care about --diff-merges, so letting setup_revisions()
> remember only the fact that "-m" was given while parsing, and then
> postprocess what "-m" means depending on the command (i.e. everybody
> else would treat it as a short-hand for "--diff-merges=m" plus "we
> need some form of diff output, while allowing "diff-index" to treat
> it differently) should not be rocket science.
>
> Thanks.
Junio C Hamano May 11, 2021, 5:41 a.m. UTC | #3
I'll insert the following immediately after this step as a reminder
for ourselves while queuing.

---- >8 -------- >8 -------- >8 -------- >8 -------- >8 ----
Subject: [PATCH] t4062: diff-index -S can take its string as a separate arg

Butchering the command line parser in a wrong way can break this
test, confusing the parser to mistake "--cached" which is a mere
argument to the "-S" option as "compare the tree with the index"
option.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t4062-diff-pickaxe.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/t/t4062-diff-pickaxe.sh b/t/t4062-diff-pickaxe.sh
index 1130c8019b..1c93812378 100755
--- a/t/t4062-diff-pickaxe.sh
+++ b/t/t4062-diff-pickaxe.sh
@@ -26,4 +26,9 @@ test_expect_success '-S --pickaxe-regex' '
 	verbose test 4096-zeroes.txt = "$(cat out)"
 '
 
+test_expect_failure '-S with separate option should not error out' '
+	# "--cached" here is not an option---it is an arg to "-S"
+	git diff-index -S --cached HEAD^
+'
+
 test_done
Sergey Organov May 11, 2021, 1:43 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> Sergey Organov <sorganov@gmail.com> writes:
>
>>  int cmd_diff_index(int argc, const char **argv, const char *prefix)
>>  {
>>  	struct rev_info rev;
>>  	unsigned int option = 0;
>> -	int i;
>>  	int result;
>>  
>>  	if (argc == 2 && !strcmp(argv[1], "-h"))
>> @@ -27,17 +53,16 @@ int cmd_diff_index(int argc, const char **argv, const char *prefix)
>>  	rev.abbrev = 0;
>>  	prefix = precompose_argv_prefix(argc, argv, prefix);
>>  
>> +	/*
>> +	 * It's essential to parse our distinct options before calling
>> +	 * setup_revisions(), for the latter not to see "-m".
>> +	 */
>> +	argc = parse_distinct_options(argc, argv, &rev, &option);
>>  	argc = setup_revisions(argc, argv, &rev, NULL);
>
> This change is risky, as the loop below (which this patch moves to
> parse_distinct_options()) has no knowledge of other options that
> setup_revisions() helper is prepared to handle and that takes an
> argument.  When parsing "git cmd --opt --cached A", setup_revisions()
> may know that --opt takes an argument and eat both (i.e. the
> "--cached" is not an option but an arg given to "--opt"), but the
> new parse_distinct_options() helper does not; it will happily skip
> "--opt" and leave it in, mistake "--cached" as an option and remove,
> and instead make "A" the arg given to "--opt".
>
> Picking up the remnant _after_ setup_revisions() ate what it
> understands would not have such a downside, as long as none of our
> "distinct options" take any argument.
>
> Can't we make "-m means something special for diff-index" without
> butchering the command line processing in this step?  diff-index
> does not care about --diff-merges, so letting setup_revisions()
> remember only the fact that "-m" was given while parsing, and then
> postprocess what "-m" means depending on the command (i.e. everybody
> else would treat it as a short-hand for "--diff-merges=m" plus "we
> need some form of diff output, while allowing "diff-index" to treat
> it differently) should not be rocket science.

I have already considered a few ways of doing it, and what I came up
with looked least destructive to me at the moment, especially as it
broke no tests whatsoever.

I'll now re-consider my approach because of your observations, thanks!

-- Sergey Organov
diff mbox series

Patch

diff --git a/builtin/diff-index.c b/builtin/diff-index.c
index 176fe7ff2b4e..28bc51d0d8f4 100644
--- a/builtin/diff-index.c
+++ b/builtin/diff-index.c
@@ -12,11 +12,37 @@  static const char diff_cache_usage[] =
 "[<common-diff-options>] <tree-ish> [<path>...]"
 COMMON_DIFF_OPTIONS_HELP;
 
+static int parse_distinct_options(int argc, const char **argv,
+				  struct rev_info *revs, unsigned int *options)
+{
+	int i, left;
+
+	for (i = left = 1; i < argc; i++) {
+		const char *arg = argv[i];
+		int leave = 0;
+
+		if (!strcmp(arg, "--cached"))
+			*options |= DIFF_INDEX_CACHED;
+		else if (!strcmp(arg, "--merge-base"))
+			*options |= DIFF_INDEX_MERGE_BASE;
+		else if (!strcmp(arg, "-m"))
+			revs->match_missing = 1;
+		else
+			leave = 1;
+
+		if (leave)
+			argv[left++] = arg;
+	}
+
+	argv[left] = NULL;
+
+	return left;
+}
+
 int cmd_diff_index(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info rev;
 	unsigned int option = 0;
-	int i;
 	int result;
 
 	if (argc == 2 && !strcmp(argv[1], "-h"))
@@ -27,17 +53,16 @@  int cmd_diff_index(int argc, const char **argv, const char *prefix)
 	rev.abbrev = 0;
 	prefix = precompose_argv_prefix(argc, argv, prefix);
 
+	/*
+	 * It's essential to parse our distinct options before calling
+	 * setup_revisions(), for the latter not to see "-m".
+	 */
+	argc = parse_distinct_options(argc, argv, &rev, &option);
 	argc = setup_revisions(argc, argv, &rev, NULL);
-	for (i = 1; i < argc; i++) {
-		const char *arg = argv[i];
 
-		if (!strcmp(arg, "--cached"))
-			option |= DIFF_INDEX_CACHED;
-		else if (!strcmp(arg, "--merge-base"))
-			option |= DIFF_INDEX_MERGE_BASE;
-		else
-			usage(diff_cache_usage);
-	}
+	if (argc > 1)
+		usage(diff_cache_usage);
+
 	if (!rev.diffopt.output_format)
 		rev.diffopt.output_format = DIFF_FORMAT_RAW;
 
diff --git a/diff-merges.c b/diff-merges.c
index f3a9daed7e05..4016800c422c 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -30,17 +30,6 @@  static void set_first_parent(struct rev_info *revs)
 	revs->first_parent_merges = 1;
 }
 
-static void set_m(struct rev_info *revs)
-{
-	/*
-	 * To "diff-index", "-m" means "match missing", and to the "log"
-	 * family of commands, it means "show default diff for merges". Set
-	 * both fields appropriately.
-	 */
-	set_to_default(revs);
-	revs->match_missing = 1;
-}
-
 static void set_combined(struct rev_info *revs)
 {
 	suppress(revs);
@@ -108,7 +97,7 @@  int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
 	const char *arg = argv[0];
 
 	if (!strcmp(arg, "-m")) {
-		set_m(revs);
+		set_to_default(revs);
 	} else if (!strcmp(arg, "-c")) {
 		set_combined(revs);
 		revs->combined_imply_patch = 1;