Message ID | b330222ce83bdf03c20085ff10fcff8a090474d5.1676665285.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Teach diff to honor diff algorithms set through git attributes | expand |
"John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes: Looking good. Some comments below. Many of them minor. > +Setting the internal diff algorithm > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > + > +The diff algorithm can be set through the `diff.algorithm` config key, but > +sometimes it may be helpful to set the diff algorithm by path. For example, one I would have expected "per path" instead of "by path". > +might wish to set a diff algorithm automatically for all `.json` files such that > +the user would not need to pass in a separate command line `--diff-algorithm` > +flag each time. While this is not incorrect per-se, I think the first paragraph of the proposed commit log message was a lot more convincing. Your changes may not be limited to a single kind of files, and a command line option is simply not enough. You may want one algorithm for ".json" while using another for ".c", which was really an excellent example you gave. > +This diff algorithm applies to user facing diff output like git-diff(1), > +git-show(1) and is used for the `--stat` output as well. The merge machinery > +will not use the diff algorithm set through this method. Is "format-patch" considered "user-facing"? > +NOTE: If the `command` key also exists, then Git will treat this as an external > +diff and attempt to use the value set for `command` as an external program. For > +instance, the following config, combined with the above `.gitattributes` file, > +will result in `command` favored over `algorithm`. > + > +---------------------------------------------------------------- > +[diff "<name>"] > + command = j-c-diff > + algorithm = histogram > +---------------------------------------------------------------- Isn't this a bit too verbose, given that the reader has just seen the external diff driver section. I wonder something like this is sufficient, without any sample configuration? NOTE: If `diff.<name>.command` is defined for path with the `diff=<name>` attribute, it is executed as an external diff driver (see above), and adding `diff.<name>.algorithm` has no effect (the algorithm is not passed to the external diff driver). > diff --git a/diff.c b/diff.c > index 5efc22ca06b..04469da6d34 100644 > --- a/diff.c > +++ b/diff.c > @@ -4456,15 +4456,13 @@ static void run_diff_cmd(const char *pgm, > const char *xfrm_msg = NULL; > int complete_rewrite = (p->status == DIFF_STATUS_MODIFIED) && p->score; > int must_show_header = 0; > + struct userdiff_driver *drv = NULL; > > - > - if (o->flags.allow_external) { > - struct userdiff_driver *drv; > - > + if (o->flags.allow_external || !o->ignore_driver_algorithm) > drv = userdiff_find_by_path(o->repo->index, attr_path); > - if (drv && drv->external) > - pgm = drv->external; > - } > + > + if (o->flags.allow_external && drv && drv->external) > + pgm = drv->external; OK. There is no explicit "pgm = NULL" initialization in this function, but that is done by the caller passing NULL to the function as a parameter, so it all makes sense. > @@ -4481,12 +4479,16 @@ static void run_diff_cmd(const char *pgm, > run_external_diff(pgm, name, other, one, two, xfrm_msg, o); > return; > } > - if (one && two) > + if (one && two) { > + if (drv && !o->ignore_driver_algorithm && drv->algorithm) > + set_diff_algorithm(o, drv->algorithm); For symmetry with the above choice of pgm we just saw, the order of the condition might be easier to follow if written like so: if (!o->ignore_driver_algorithm && drv && drv->algorithm) It would not make any measurable difference performance-wise either way. > @@ -4583,6 +4585,14 @@ static void run_diffstat(struct diff_filepair *p, struct diff_options *o, > const char *name; > const char *other; > > + if (!o->ignore_driver_algorithm) { > + struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index, p->one->path); That's an overlong line. > + > + if (drv && drv->algorithm) { > + set_diff_algorithm(o, drv->algorithm); > + } No need to have {} around a single statement block. > + } > + > if (DIFF_PAIR_UNMERGED(p)) { > /* unmerged */ > builtin_diffstat(p->one->path, NULL, NULL, NULL, > @@ -5130,6 +5140,8 @@ static int diff_opt_diff_algorithm(const struct option *opt, > return error(_("option diff-algorithm accepts \"myers\", " > "\"minimal\", \"patience\" and \"histogram\"")); > > + options->ignore_driver_algorithm = 1; > + > return 0; > } > > @@ -5145,6 +5157,8 @@ static int diff_opt_diff_algorithm_no_arg(const struct option *opt, > BUG("available diff algorithms include \"myers\", " > "\"minimal\", \"patience\" and \"histogram\""); > > + options->ignore_driver_algorithm = 1; > + > return 0; > } > @@ -5285,6 +5299,7 @@ static int diff_opt_patience(const struct option *opt, > for (i = 0; i < options->anchors_nr; i++) > free(options->anchors[i]); > options->anchors_nr = 0; > + options->ignore_driver_algorithm = 1; > > return set_diff_algorithm(options, "patience"); > } I was hoping that set_diff_algorithm() can be the shared common one that signals we were told to use a specific algorithm, but it also is called from internal codepaths so it cannot be it. It is probably not worth introducing an extra helper that only calls set_diff_algorithm() and sets ignore_driver_algorithm bit only for that to reduce three-times repetition. OK. > diff --git a/userdiff.c b/userdiff.c > index d71b82feb74..ff25cfc4b4c 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -293,7 +293,7 @@ PATTERNS("scheme", > "|([^][)(}{[ \t])+"), > PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$", > "\\\\[a-zA-Z@]+|\\\\.|[a-zA-Z0-9\x80-\xff]+"), > -{ "default", NULL, -1, { NULL, 0 } }, > +{ "default", NULL, NULL, -1, { NULL, 0 } }, > }; I was surprised that there is so little damage to the built-in userdiff driver definitions, but this is thanks to the PATTERNS() and IPATTERN() macro that use designated initializers. Very nice. Nicely done.
On Fri, Feb 17, 2023 at 12:21 PM John Cai via GitGitGadget <gitgitgadget@gmail.com> wrote: > > From: John Cai <johncai86@gmail.com> > > It can be useful to specify diff algorithms per file type. For example, > one may want to use the minimal diff algorithm for .json files, another > for .c files, etc. > > The diff machinery already checks attributes for a diff driver. Teach > the diff driver parser a new type "algorithm" to look for in the > config, which will be used if a driver has been specified through the > attributes. > > Enforce precedence of the diff algorithm by favoring the command line > option, then looking at the driver attributes & config combination, then > finally the diff.algorithm config. > > To enforce precedence order, use a new `ignore_driver_algorithm` member > during options pasing to indicate the diff algorithm was set via command > line args. s/pasing/parsing/ > Signed-off-by: John Cai <johncai86@gmail.com> > --- > Documentation/gitattributes.txt | 37 ++++++++++++++++++++++++++++++++ > diff.c | 33 ++++++++++++++++++++-------- > diff.h | 1 + > t/lib-diff-alternative.sh | 38 ++++++++++++++++++++++++++++++++- > userdiff.c | 4 +++- > userdiff.h | 1 + > 6 files changed, 103 insertions(+), 11 deletions(-) > > diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt > index c19e64ea0ef..f212079a131 100644 > --- a/Documentation/gitattributes.txt > +++ b/Documentation/gitattributes.txt > @@ -758,6 +758,43 @@ with the above configuration, i.e. `j-c-diff`, with 7 > parameters, just like `GIT_EXTERNAL_DIFF` program is called. > See linkgit:git[1] for details. > > +Setting the internal diff algorithm > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > + > +The diff algorithm can be set through the `diff.algorithm` config key, but > +sometimes it may be helpful to set the diff algorithm by path. For example, one > +might wish to set a diff algorithm automatically for all `.json` files such that > +the user would not need to pass in a separate command line `--diff-algorithm` > +flag each time. > + > +First, in `.gitattributes`, assign the `diff` attribute for paths. > + > +------------------------ > +*.json diff=<name> > +------------------------ > + > +Then, define a "diff.<name>.algorithm" configuration to specify the diff Should there be a link to `git-config` right after "configuration"? Otherwise, users may think that they are being told to specify additional configuration within the .gitattributes file. > +algorithm, choosing from `meyers`, `patience`, `minimal`, or `histogram`. s/meyers/myers/ > + > +---------------------------------------------------------------- > +[diff "<name>"] > + algorithm = histogram > +---------------------------------------------------------------- It's pretty easy to assume the above is meant to be part of the .gitattributes file instead of the .git/config file. Don't most users run `git config` commands directly rather than edit the .git/config file? Should we provide a sample command rather than showing what the config file will contain? > + > +This diff algorithm applies to user facing diff output like git-diff(1), > +git-show(1) and is used for the `--stat` output as well. The merge machinery > +will not use the diff algorithm set through this method. Yaay, thanks for including this! I'm still curious if this should this also include warnings/caveats, such as: * The diff attribute specified in .gitattributes will be ignored in a bare clone * The diff attribute specified in .gitattributes will be ignored if it is only specified in another branch (e.g. on a branch "special-file diff=patience" recorded in .gitattributes, then checkout master but run `git log -1 -p $branch`) * When a file is renamed, the diff attribute for the pre-image name is the only one the system pays attention to (thus adding "-R" can flip which diff algorithm is run for the renamed file). Also, since I tested the three items above to verify they are valid warnings, I'm a bit confused. I thought your intent was to use this server-side[1], so isn't the bare clone aspect a deal-breaker for your intended usecase? [1] https://lore.kernel.org/git/7852AC7B-7A4E-4DD0-ADEA-CFFD5D16C595@gmail.com/ > + > +NOTE: If the `command` key also exists, then Git will treat this as an external > +diff and attempt to use the value set for `command` as an external program. For > +instance, the following config, combined with the above `.gitattributes` file, > +will result in `command` favored over `algorithm`. > + > +---------------------------------------------------------------- > +[diff "<name>"] > + command = j-c-diff > + algorithm = histogram > +---------------------------------------------------------------- > > Defining a custom hunk-header > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > diff --git a/diff.c b/diff.c > index 5efc22ca06b..04469da6d34 100644 > --- a/diff.c > +++ b/diff.c > @@ -4456,15 +4456,13 @@ static void run_diff_cmd(const char *pgm, > const char *xfrm_msg = NULL; > int complete_rewrite = (p->status == DIFF_STATUS_MODIFIED) && p->score; > int must_show_header = 0; > + struct userdiff_driver *drv = NULL; > > - > - if (o->flags.allow_external) { > - struct userdiff_driver *drv; > - > + if (o->flags.allow_external || !o->ignore_driver_algorithm) > drv = userdiff_find_by_path(o->repo->index, attr_path); > - if (drv && drv->external) > - pgm = drv->external; > - } > + > + if (o->flags.allow_external && drv && drv->external) > + pgm = drv->external; > > if (msg) { > /* > @@ -4481,12 +4479,16 @@ static void run_diff_cmd(const char *pgm, > run_external_diff(pgm, name, other, one, two, xfrm_msg, o); > return; > } > - if (one && two) > + if (one && two) { > + if (drv && !o->ignore_driver_algorithm && drv->algorithm) > + set_diff_algorithm(o, drv->algorithm); > + > builtin_diff(name, other ? other : name, > one, two, xfrm_msg, must_show_header, > o, complete_rewrite); > - else > + } else { > fprintf(o->file, "* Unmerged path %s\n", name); > + } > } > > static void diff_fill_oid_info(struct diff_filespec *one, struct index_state *istate) > @@ -4583,6 +4585,14 @@ static void run_diffstat(struct diff_filepair *p, struct diff_options *o, > const char *name; > const char *other; > > + if (!o->ignore_driver_algorithm) { > + struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index, p->one->path); > + > + if (drv && drv->algorithm) { > + set_diff_algorithm(o, drv->algorithm); > + } > + } > + > if (DIFF_PAIR_UNMERGED(p)) { > /* unmerged */ > builtin_diffstat(p->one->path, NULL, NULL, NULL, > @@ -5130,6 +5140,8 @@ static int diff_opt_diff_algorithm(const struct option *opt, > return error(_("option diff-algorithm accepts \"myers\", " > "\"minimal\", \"patience\" and \"histogram\"")); > > + options->ignore_driver_algorithm = 1; > + > return 0; > } > > @@ -5145,6 +5157,8 @@ static int diff_opt_diff_algorithm_no_arg(const struct option *opt, > BUG("available diff algorithms include \"myers\", " > "\"minimal\", \"patience\" and \"histogram\""); > > + options->ignore_driver_algorithm = 1; > + > return 0; > } > > @@ -5285,6 +5299,7 @@ static int diff_opt_patience(const struct option *opt, > for (i = 0; i < options->anchors_nr; i++) > free(options->anchors[i]); > options->anchors_nr = 0; > + options->ignore_driver_algorithm = 1; > > return set_diff_algorithm(options, "patience"); > } > diff --git a/diff.h b/diff.h > index 41eb2c3d428..8d770b1d579 100644 > --- a/diff.h > +++ b/diff.h > @@ -333,6 +333,7 @@ struct diff_options { > int prefix_length; > const char *stat_sep; > int xdl_opts; > + int ignore_driver_algorithm; > > /* see Documentation/diff-options.txt */ > char **anchors; > diff --git a/t/lib-diff-alternative.sh b/t/lib-diff-alternative.sh > index 8d1e408bb58..2dc02bca873 100644 > --- a/t/lib-diff-alternative.sh > +++ b/t/lib-diff-alternative.sh > @@ -105,10 +105,46 @@ index $file1..$file2 100644 > } > EOF > > + cat >expect_diffstat <<EOF > + file1 => file2 | 21 ++++++++++----------- > + 1 file changed, 10 insertions(+), 11 deletions(-) > +EOF > + > STRATEGY=$1 > > + test_expect_success "$STRATEGY diff from attributes" ' > + echo "file* diff=driver" >.gitattributes && > + git config diff.driver.algorithm "$STRATEGY" && > + test_must_fail git diff --no-index file1 file2 > output && > + cat expect && > + cat output && > + test_cmp expect output > + ' > + > + test_expect_success "$STRATEGY diff from attributes has valid diffstat" ' > + echo "file* diff=driver" >.gitattributes && > + git config diff.driver.algorithm "$STRATEGY" && > + test_must_fail git diff --stat --no-index file1 file2 > output && > + test_cmp expect_diffstat output > + ' > + > test_expect_success "$STRATEGY diff" ' > - test_must_fail git diff --no-index "--$STRATEGY" file1 file2 > output && > + test_must_fail git diff --no-index "--diff-algorithm=$STRATEGY" file1 file2 > output && > + test_cmp expect output > + ' > + > + test_expect_success "$STRATEGY diff command line precedence before attributes" ' > + echo "file* diff=driver" >.gitattributes && > + git config diff.driver.algorithm meyers && Is this misspelling of myers intentional? I think with the typo, the code falls back to the default algorithm, which happens to be myers, so I think the test works either way, but were you intending to test fallback behavior in case of a typo here, or was that accidental? > + test_must_fail git diff --no-index "--diff-algorithm=$STRATEGY" file1 file2 > output && > + test_cmp expect output > + ' > + > + test_expect_success "$STRATEGY diff attributes precedence before config" ' > + git config diff.algorithm default && > + echo "file* diff=driver" >.gitattributes && > + git config diff.driver.algorithm "$STRATEGY" && > + test_must_fail git diff --no-index file1 file2 > output && > test_cmp expect output > ' > > diff --git a/userdiff.c b/userdiff.c > index d71b82feb74..ff25cfc4b4c 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -293,7 +293,7 @@ PATTERNS("scheme", > "|([^][)(}{[ \t])+"), > PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$", > "\\\\[a-zA-Z@]+|\\\\.|[a-zA-Z0-9\x80-\xff]+"), > -{ "default", NULL, -1, { NULL, 0 } }, > +{ "default", NULL, NULL, -1, { NULL, 0 } }, > }; > #undef PATTERNS > #undef IPATTERN > @@ -394,6 +394,8 @@ int userdiff_config(const char *k, const char *v) > return parse_bool(&drv->textconv_want_cache, k, v); > if (!strcmp(type, "wordregex")) > return git_config_string(&drv->word_regex, k, v); > + if (!strcmp(type, "algorithm")) > + return git_config_string(&drv->algorithm, k, v); > > return 0; > } > diff --git a/userdiff.h b/userdiff.h > index aee91bc77e6..24419db6973 100644 > --- a/userdiff.h > +++ b/userdiff.h > @@ -14,6 +14,7 @@ struct userdiff_funcname { > struct userdiff_driver { > const char *name; > const char *external; > + const char *algorithm; > int binary; > struct userdiff_funcname funcname; > const char *word_regex; > -- > gitgitgadget
Hi Elijah, On 17 Feb 2023, at 21:56, Elijah Newren wrote: > On Fri, Feb 17, 2023 at 12:21 PM John Cai via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> >> From: John Cai <johncai86@gmail.com> >> >> It can be useful to specify diff algorithms per file type. For example, >> one may want to use the minimal diff algorithm for .json files, another >> for .c files, etc. >> >> The diff machinery already checks attributes for a diff driver. Teach >> the diff driver parser a new type "algorithm" to look for in the >> config, which will be used if a driver has been specified through the >> attributes. >> >> Enforce precedence of the diff algorithm by favoring the command line >> option, then looking at the driver attributes & config combination, then >> finally the diff.algorithm config. >> >> To enforce precedence order, use a new `ignore_driver_algorithm` member >> during options pasing to indicate the diff algorithm was set via command >> line args. > > s/pasing/parsing/ thanks for noticing this! > >> Signed-off-by: John Cai <johncai86@gmail.com> >> --- >> Documentation/gitattributes.txt | 37 ++++++++++++++++++++++++++++++++ >> diff.c | 33 ++++++++++++++++++++-------- >> diff.h | 1 + >> t/lib-diff-alternative.sh | 38 ++++++++++++++++++++++++++++++++- >> userdiff.c | 4 +++- >> userdiff.h | 1 + >> 6 files changed, 103 insertions(+), 11 deletions(-) >> >> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt >> index c19e64ea0ef..f212079a131 100644 >> --- a/Documentation/gitattributes.txt >> +++ b/Documentation/gitattributes.txt >> @@ -758,6 +758,43 @@ with the above configuration, i.e. `j-c-diff`, with 7 >> parameters, just like `GIT_EXTERNAL_DIFF` program is called. >> See linkgit:git[1] for details. >> >> +Setting the internal diff algorithm >> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> + >> +The diff algorithm can be set through the `diff.algorithm` config key, but >> +sometimes it may be helpful to set the diff algorithm by path. For example, one >> +might wish to set a diff algorithm automatically for all `.json` files such that >> +the user would not need to pass in a separate command line `--diff-algorithm` >> +flag each time. >> + >> +First, in `.gitattributes`, assign the `diff` attribute for paths. >> + >> +------------------------ >> +*.json diff=<name> >> +------------------------ >> + >> +Then, define a "diff.<name>.algorithm" configuration to specify the diff > > Should there be a link to `git-config` right after "configuration"? > Otherwise, users may think that they are being told to specify > additional configuration within the .gitattributes file. > >> +algorithm, choosing from `meyers`, `patience`, `minimal`, or `histogram`. > > s/meyers/myers/ likewise. > >> + >> +---------------------------------------------------------------- >> +[diff "<name>"] >> + algorithm = histogram >> +---------------------------------------------------------------- > > It's pretty easy to assume the above is meant to be part of the > .gitattributes file instead of the .git/config file. Don't most users > run `git config` commands directly rather than edit the .git/config > file? Should we provide a sample command rather than showing what the > config file will contain? Not sure--I think I was was just following what I saw in the existing documentation for driver configuration. But would be interested to see what other folks think. > >> + >> +This diff algorithm applies to user facing diff output like git-diff(1), >> +git-show(1) and is used for the `--stat` output as well. The merge machinery >> +will not use the diff algorithm set through this method. > > Yaay, thanks for including this! > > > I'm still curious if this should this also include warnings/caveats, such as: > * The diff attribute specified in .gitattributes will be ignored in > a bare clone > * The diff attribute specified in .gitattributes will be ignored if > it is only specified in another branch (e.g. on a branch "special-file > diff=patience" recorded in .gitattributes, then checkout master but > run `git log -1 -p $branch`) > * When a file is renamed, the diff attribute for the pre-image name > is the only one the system pays attention to (thus adding "-R" can > flip which diff algorithm is run for the renamed file). I would be fine with adding that--though originally I was thinking that these can be inferred from the way that gitattributes are documented in [1]. Calling these out would make it more clear though, so I could go either way. > > Also, since I tested the three items above to verify they are valid > warnings, I'm a bit confused. I thought your intent was to use this > server-side[1], so isn't the bare clone aspect a deal-breaker for your > intended usecase? > > [1] https://lore.kernel.org/git/7852AC7B-7A4E-4DD0-ADEA-CFFD5D16C595@gmail.com/ yes, indeed. I was planning on adding bare repository support in a separate patch series, since the additions in [2] allows .gitattributes to be read from a bare repository. 1. https://git-scm.com/docs/gitattributes 2. https://lore.kernel.org/git/0ca8b2458921fc40269b0c43b5ec86eba77d6b54.1673684790.git.karthik.188@gmail.com/ thanks! John > >> + >> +NOTE: If the `command` key also exists, then Git will treat this as an external >> +diff and attempt to use the value set for `command` as an external program. For >> +instance, the following config, combined with the above `.gitattributes` file, >> +will result in `command` favored over `algorithm`. >> + >> +---------------------------------------------------------------- >> +[diff "<name>"] >> + command = j-c-diff >> + algorithm = histogram >> +---------------------------------------------------------------- >> >> Defining a custom hunk-header >> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> diff --git a/diff.c b/diff.c >> index 5efc22ca06b..04469da6d34 100644 >> --- a/diff.c >> +++ b/diff.c >> @@ -4456,15 +4456,13 @@ static void run_diff_cmd(const char *pgm, >> const char *xfrm_msg = NULL; >> int complete_rewrite = (p->status == DIFF_STATUS_MODIFIED) && p->score; >> int must_show_header = 0; >> + struct userdiff_driver *drv = NULL; >> >> - >> - if (o->flags.allow_external) { >> - struct userdiff_driver *drv; >> - >> + if (o->flags.allow_external || !o->ignore_driver_algorithm) >> drv = userdiff_find_by_path(o->repo->index, attr_path); >> - if (drv && drv->external) >> - pgm = drv->external; >> - } >> + >> + if (o->flags.allow_external && drv && drv->external) >> + pgm = drv->external; >> >> if (msg) { >> /* >> @@ -4481,12 +4479,16 @@ static void run_diff_cmd(const char *pgm, >> run_external_diff(pgm, name, other, one, two, xfrm_msg, o); >> return; >> } >> - if (one && two) >> + if (one && two) { >> + if (drv && !o->ignore_driver_algorithm && drv->algorithm) >> + set_diff_algorithm(o, drv->algorithm); >> + >> builtin_diff(name, other ? other : name, >> one, two, xfrm_msg, must_show_header, >> o, complete_rewrite); >> - else >> + } else { >> fprintf(o->file, "* Unmerged path %s\n", name); >> + } >> } >> >> static void diff_fill_oid_info(struct diff_filespec *one, struct index_state *istate) >> @@ -4583,6 +4585,14 @@ static void run_diffstat(struct diff_filepair *p, struct diff_options *o, >> const char *name; >> const char *other; >> >> + if (!o->ignore_driver_algorithm) { >> + struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index, p->one->path); >> + >> + if (drv && drv->algorithm) { >> + set_diff_algorithm(o, drv->algorithm); >> + } >> + } >> + >> if (DIFF_PAIR_UNMERGED(p)) { >> /* unmerged */ >> builtin_diffstat(p->one->path, NULL, NULL, NULL, >> @@ -5130,6 +5140,8 @@ static int diff_opt_diff_algorithm(const struct option *opt, >> return error(_("option diff-algorithm accepts \"myers\", " >> "\"minimal\", \"patience\" and \"histogram\"")); >> >> + options->ignore_driver_algorithm = 1; >> + >> return 0; >> } >> >> @@ -5145,6 +5157,8 @@ static int diff_opt_diff_algorithm_no_arg(const struct option *opt, >> BUG("available diff algorithms include \"myers\", " >> "\"minimal\", \"patience\" and \"histogram\""); >> >> + options->ignore_driver_algorithm = 1; >> + >> return 0; >> } >> >> @@ -5285,6 +5299,7 @@ static int diff_opt_patience(const struct option *opt, >> for (i = 0; i < options->anchors_nr; i++) >> free(options->anchors[i]); >> options->anchors_nr = 0; >> + options->ignore_driver_algorithm = 1; >> >> return set_diff_algorithm(options, "patience"); >> } >> diff --git a/diff.h b/diff.h >> index 41eb2c3d428..8d770b1d579 100644 >> --- a/diff.h >> +++ b/diff.h >> @@ -333,6 +333,7 @@ struct diff_options { >> int prefix_length; >> const char *stat_sep; >> int xdl_opts; >> + int ignore_driver_algorithm; >> >> /* see Documentation/diff-options.txt */ >> char **anchors; >> diff --git a/t/lib-diff-alternative.sh b/t/lib-diff-alternative.sh >> index 8d1e408bb58..2dc02bca873 100644 >> --- a/t/lib-diff-alternative.sh >> +++ b/t/lib-diff-alternative.sh >> @@ -105,10 +105,46 @@ index $file1..$file2 100644 >> } >> EOF >> >> + cat >expect_diffstat <<EOF >> + file1 => file2 | 21 ++++++++++----------- >> + 1 file changed, 10 insertions(+), 11 deletions(-) >> +EOF >> + >> STRATEGY=$1 >> >> + test_expect_success "$STRATEGY diff from attributes" ' >> + echo "file* diff=driver" >.gitattributes && >> + git config diff.driver.algorithm "$STRATEGY" && >> + test_must_fail git diff --no-index file1 file2 > output && >> + cat expect && >> + cat output && >> + test_cmp expect output >> + ' >> + >> + test_expect_success "$STRATEGY diff from attributes has valid diffstat" ' >> + echo "file* diff=driver" >.gitattributes && >> + git config diff.driver.algorithm "$STRATEGY" && >> + test_must_fail git diff --stat --no-index file1 file2 > output && >> + test_cmp expect_diffstat output >> + ' >> + >> test_expect_success "$STRATEGY diff" ' >> - test_must_fail git diff --no-index "--$STRATEGY" file1 file2 > output && >> + test_must_fail git diff --no-index "--diff-algorithm=$STRATEGY" file1 file2 > output && >> + test_cmp expect output >> + ' >> + >> + test_expect_success "$STRATEGY diff command line precedence before attributes" ' >> + echo "file* diff=driver" >.gitattributes && >> + git config diff.driver.algorithm meyers && > > Is this misspelling of myers intentional? I think with the typo, the > code falls back to the default algorithm, which happens to be myers, > so I think the test works either way, but were you intending to test > fallback behavior in case of a typo here, or was that accidental? > > > >> + test_must_fail git diff --no-index "--diff-algorithm=$STRATEGY" file1 file2 > output && >> + test_cmp expect output >> + ' >> + >> + test_expect_success "$STRATEGY diff attributes precedence before config" ' >> + git config diff.algorithm default && >> + echo "file* diff=driver" >.gitattributes && >> + git config diff.driver.algorithm "$STRATEGY" && >> + test_must_fail git diff --no-index file1 file2 > output && >> test_cmp expect output >> ' >> >> diff --git a/userdiff.c b/userdiff.c >> index d71b82feb74..ff25cfc4b4c 100644 >> --- a/userdiff.c >> +++ b/userdiff.c >> @@ -293,7 +293,7 @@ PATTERNS("scheme", >> "|([^][)(}{[ \t])+"), >> PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$", >> "\\\\[a-zA-Z@]+|\\\\.|[a-zA-Z0-9\x80-\xff]+"), >> -{ "default", NULL, -1, { NULL, 0 } }, >> +{ "default", NULL, NULL, -1, { NULL, 0 } }, >> }; >> #undef PATTERNS >> #undef IPATTERN >> @@ -394,6 +394,8 @@ int userdiff_config(const char *k, const char *v) >> return parse_bool(&drv->textconv_want_cache, k, v); >> if (!strcmp(type, "wordregex")) >> return git_config_string(&drv->word_regex, k, v); >> + if (!strcmp(type, "algorithm")) >> + return git_config_string(&drv->algorithm, k, v); >> >> return 0; >> } >> diff --git a/userdiff.h b/userdiff.h >> index aee91bc77e6..24419db6973 100644 >> --- a/userdiff.h >> +++ b/userdiff.h >> @@ -14,6 +14,7 @@ struct userdiff_funcname { >> struct userdiff_driver { >> const char *name; >> const char *external; >> + const char *algorithm; >> int binary; >> struct userdiff_funcname funcname; >> const char *word_regex; >> -- >> gitgitgadget
On Mon, Feb 20, 2023 at 7:32 AM John Cai <johncai86@gmail.com> wrote: [...] > > I'm still curious if this should this also include warnings/caveats, such as: > > * The diff attribute specified in .gitattributes will be ignored in > > a bare clone > > * The diff attribute specified in .gitattributes will be ignored if > > it is only specified in another branch (e.g. on a branch "special-file > > diff=patience" recorded in .gitattributes, then checkout master but > > run `git log -1 -p $branch`) > > * When a file is renamed, the diff attribute for the pre-image name > > is the only one the system pays attention to (thus adding "-R" can > > flip which diff algorithm is run for the renamed file). > > I would be fine with adding that--though originally I was thinking that these > can be inferred from the way that gitattributes are documented in [1]. Calling > these out would make it more clear though, so I could go either way. > > > > > Also, since I tested the three items above to verify they are valid > > warnings, I'm a bit confused. I thought your intent was to use this > > server-side[1], so isn't the bare clone aspect a deal-breaker for your > > intended usecase? > > > > [1] https://lore.kernel.org/git/7852AC7B-7A4E-4DD0-ADEA-CFFD5D16C595@gmail.com/ > > yes, indeed. I was planning on adding bare repository support in a separate > patch series, since the additions in [2] allows .gitattributes to be read from a > bare repository. > > 1. https://git-scm.com/docs/gitattributes > 2. https://lore.kernel.org/git/0ca8b2458921fc40269b0c43b5ec86eba77d6b54.1673684790.git.karthik.188@gmail.com/ > > thanks! > John Oh, interesting, I didn't know about [2]. So, is the plan to take the --source option from that series and add it to diff (perhaps with a different name, since log tends to consume diff options and --source is already taken)? And do you expect to get the tree-ish from the two the users are already specifying to diff? If so, which one do you use (the two commits being diffed might have differing .gitattributes files)? If not, what does that mean for users of e.g. the GitLab UI who have to specify a third tree when diffing?
Hi Elijah, On 20 Feb 2023, at 11:21, Elijah Newren wrote: > On Mon, Feb 20, 2023 at 7:32 AM John Cai <johncai86@gmail.com> wrote: > [...] >>> I'm still curious if this should this also include warnings/caveats, such as: >>> * The diff attribute specified in .gitattributes will be ignored in >>> a bare clone >>> * The diff attribute specified in .gitattributes will be ignored if >>> it is only specified in another branch (e.g. on a branch "special-file >>> diff=patience" recorded in .gitattributes, then checkout master but >>> run `git log -1 -p $branch`) >>> * When a file is renamed, the diff attribute for the pre-image name >>> is the only one the system pays attention to (thus adding "-R" can >>> flip which diff algorithm is run for the renamed file). >> >> I would be fine with adding that--though originally I was thinking that these >> can be inferred from the way that gitattributes are documented in [1]. Calling >> these out would make it more clear though, so I could go either way. >> >>> >>> Also, since I tested the three items above to verify they are valid >>> warnings, I'm a bit confused. I thought your intent was to use this >>> server-side[1], so isn't the bare clone aspect a deal-breaker for your >>> intended usecase? >>> >>> [1] https://lore.kernel.org/git/7852AC7B-7A4E-4DD0-ADEA-CFFD5D16C595@gmail.com/ >> >> yes, indeed. I was planning on adding bare repository support in a separate >> patch series, since the additions in [2] allows .gitattributes to be read from a >> bare repository. >> >> 1. https://git-scm.com/docs/gitattributes >> 2. https://lore.kernel.org/git/0ca8b2458921fc40269b0c43b5ec86eba77d6b54.1673684790.git.karthik.188@gmail.com/ >> >> thanks! >> John > > Oh, interesting, I didn't know about [2]. So, is the plan to take the > --source option from that series and add it to diff (perhaps with a > different name, since log tends to consume diff options and --source > is already taken)? Yep, that would be the general idea > > And do you expect to get the tree-ish from the two the users are > already specifying to diff? If so, which one do you use (the two > commits being diffed might have differing .gitattributes files)? If > not, what does that mean for users of e.g. the GitLab UI who have to > specify a third tree when diffing? Good question! Since it seems that when `git-diff(1)` considers diff.<driver>, it goes with the path of the first one. (might need some confirmation here) in diff.c: static void run_diff(struct diff_filepair *p, struct diff_options *o) { const char *pgm = external_diff(); struct strbuf msg; struct diff_filespec *one = p->one; struct diff_filespec *two = p->two; const char *name; const char *other; const char *attr_path; name = one->path; other = (strcmp(name, two->path) ? two->path : NULL); attr_path = name; if (o->prefix_length) I was thinking we would just use the tree-ish of the first one thanks John
On Mon, Feb 20, 2023 at 8:49 AM John Cai <johncai86@gmail.com> wrote: > > Hi Elijah, > > On 20 Feb 2023, at 11:21, Elijah Newren wrote: > > > On Mon, Feb 20, 2023 at 7:32 AM John Cai <johncai86@gmail.com> wrote: > > [...] > >>> I'm still curious if this should this also include warnings/caveats, such as: > >>> * The diff attribute specified in .gitattributes will be ignored in > >>> a bare clone > >>> * The diff attribute specified in .gitattributes will be ignored if > >>> it is only specified in another branch (e.g. on a branch "special-file > >>> diff=patience" recorded in .gitattributes, then checkout master but > >>> run `git log -1 -p $branch`) > >>> * When a file is renamed, the diff attribute for the pre-image name > >>> is the only one the system pays attention to (thus adding "-R" can > >>> flip which diff algorithm is run for the renamed file). > >> > >> I would be fine with adding that--though originally I was thinking that these > >> can be inferred from the way that gitattributes are documented in [1]. Calling > >> these out would make it more clear though, so I could go either way. > >> > >>> > >>> Also, since I tested the three items above to verify they are valid > >>> warnings, I'm a bit confused. I thought your intent was to use this > >>> server-side[1], so isn't the bare clone aspect a deal-breaker for your > >>> intended usecase? > >>> > >>> [1] https://lore.kernel.org/git/7852AC7B-7A4E-4DD0-ADEA-CFFD5D16C595@gmail.com/ > >> > >> yes, indeed. I was planning on adding bare repository support in a separate > >> patch series, since the additions in [2] allows .gitattributes to be read from a > >> bare repository. > >> > >> 1. https://git-scm.com/docs/gitattributes > >> 2. https://lore.kernel.org/git/0ca8b2458921fc40269b0c43b5ec86eba77d6b54.1673684790.git.karthik.188@gmail.com/ > >> > >> thanks! > >> John > > > > Oh, interesting, I didn't know about [2]. So, is the plan to take the > > --source option from that series and add it to diff (perhaps with a > > different name, since log tends to consume diff options and --source > > is already taken)? > > Yep, that would be the general idea > > > > > And do you expect to get the tree-ish from the two the users are > > already specifying to diff? If so, which one do you use (the two > > commits being diffed might have differing .gitattributes files)? If > > not, what does that mean for users of e.g. the GitLab UI who have to > > specify a third tree when diffing? > > Good question! Since it seems that when `git-diff(1)` considers diff.<driver>, > it goes with the path of the first one. (might need some confirmation here) > > in diff.c: > > > static void run_diff(struct diff_filepair *p, struct diff_options *o) > { > const char *pgm = external_diff(); > struct strbuf msg; > struct diff_filespec *one = p->one; > struct diff_filespec *two = p->two; > const char *name; > const char *other; > const char *attr_path; > > name = one->path; > other = (strcmp(name, two->path) ? two->path : NULL); > attr_path = name; > if (o->prefix_length) > > I was thinking we would just use the tree-ish of the first one That would certainly simplify, but it'd be pretty important to document. (Incidentally, this kind of decision was my reason for asking about all those special cases earlier, i.e. how to handle diff between different commits, how to handle renames, how to handle bare repositories, etc.) This kind of decision probably also means you'd need a variety of testcases where .gitattributes is different in every commit & the index & the working tree, and then you start testing several of the possible pairings to make sure the right .gitattributes file is used (e.g. (commit, commit), (commit, index), (index, commit), (worktree, index), etc.) However, I'm curious again. You brought this up because you want to use it in GitLab, yet configuration of using this option as it appears in this series requires changing _both_ .gitattributes and .git/config. How will users of the GitLab UI do the configuration necessary for the git-config side to take effect?
Hi Elijah, On 20 Feb 2023, at 12:32, Elijah Newren wrote: > On Mon, Feb 20, 2023 at 8:49 AM John Cai <johncai86@gmail.com> wrote: >> >> Hi Elijah, >> >> On 20 Feb 2023, at 11:21, Elijah Newren wrote: >> >>> On Mon, Feb 20, 2023 at 7:32 AM John Cai <johncai86@gmail.com> wrote: >>> [...] >>>>> I'm still curious if this should this also include warnings/caveats, such as: >>>>> * The diff attribute specified in .gitattributes will be ignored in >>>>> a bare clone >>>>> * The diff attribute specified in .gitattributes will be ignored if >>>>> it is only specified in another branch (e.g. on a branch "special-file >>>>> diff=patience" recorded in .gitattributes, then checkout master but >>>>> run `git log -1 -p $branch`) >>>>> * When a file is renamed, the diff attribute for the pre-image name >>>>> is the only one the system pays attention to (thus adding "-R" can >>>>> flip which diff algorithm is run for the renamed file). >>>> >>>> I would be fine with adding that--though originally I was thinking that these >>>> can be inferred from the way that gitattributes are documented in [1]. Calling >>>> these out would make it more clear though, so I could go either way. >>>> >>>>> >>>>> Also, since I tested the three items above to verify they are valid >>>>> warnings, I'm a bit confused. I thought your intent was to use this >>>>> server-side[1], so isn't the bare clone aspect a deal-breaker for your >>>>> intended usecase? >>>>> >>>>> [1] https://lore.kernel.org/git/7852AC7B-7A4E-4DD0-ADEA-CFFD5D16C595@gmail.com/ >>>> >>>> yes, indeed. I was planning on adding bare repository support in a separate >>>> patch series, since the additions in [2] allows .gitattributes to be read from a >>>> bare repository. >>>> >>>> 1. https://git-scm.com/docs/gitattributes >>>> 2. https://lore.kernel.org/git/0ca8b2458921fc40269b0c43b5ec86eba77d6b54.1673684790.git.karthik.188@gmail.com/ >>>> >>>> thanks! >>>> John >>> >>> Oh, interesting, I didn't know about [2]. So, is the plan to take the >>> --source option from that series and add it to diff (perhaps with a >>> different name, since log tends to consume diff options and --source >>> is already taken)? >> >> Yep, that would be the general idea >> >>> >>> And do you expect to get the tree-ish from the two the users are >>> already specifying to diff? If so, which one do you use (the two >>> commits being diffed might have differing .gitattributes files)? If >>> not, what does that mean for users of e.g. the GitLab UI who have to >>> specify a third tree when diffing? >> >> Good question! Since it seems that when `git-diff(1)` considers diff.<driver>, >> it goes with the path of the first one. (might need some confirmation here) >> >> in diff.c: >> >> >> static void run_diff(struct diff_filepair *p, struct diff_options *o) >> { >> const char *pgm = external_diff(); >> struct strbuf msg; >> struct diff_filespec *one = p->one; >> struct diff_filespec *two = p->two; >> const char *name; >> const char *other; >> const char *attr_path; >> >> name = one->path; >> other = (strcmp(name, two->path) ? two->path : NULL); >> attr_path = name; >> if (o->prefix_length) >> >> I was thinking we would just use the tree-ish of the first one > > That would certainly simplify, but it'd be pretty important to > document. (Incidentally, this kind of decision was my reason for > asking about all those special cases earlier, i.e. how to handle diff > between different commits, how to handle renames, how to handle bare > repositories, etc.) Good point--that would be good to document. > > This kind of decision probably also means you'd need a variety of > testcases where .gitattributes is different in every commit & the > index & the working tree, and then you start testing several of the > possible pairings to make sure the right .gitattributes file is used > (e.g. (commit, commit), (commit, index), (index, commit), (worktree, > index), etc.) > > However, I'm curious again. You brought this up because you want to > use it in GitLab, yet configuration of using this option as it appears > in this series requires changing _both_ .gitattributes and > .git/config. How will users of the GitLab UI do the configuration > necessary for the git-config side to take effect? Good question--we would likely need to add some pre-baked configuration server side with a driver config that users could then tap into with their gitattributes files.
On Mon, Feb 20, 2023 at 09:32:52AM -0800, Elijah Newren wrote: > > I was thinking we would just use the tree-ish of the first one > > That would certainly simplify, but it'd be pretty important to > document. (Incidentally, this kind of decision was my reason for > asking about all those special cases earlier, i.e. how to handle diff > between different commits, how to handle renames, how to handle bare > repositories, etc.) > > This kind of decision probably also means you'd need a variety of > testcases where .gitattributes is different in every commit & the > index & the working tree, and then you start testing several of the > possible pairings to make sure the right .gitattributes file is used > (e.g. (commit, commit), (commit, index), (index, commit), (worktree, > index), etc.) There may be some prior art here in how we handle mailmaps in a bare repository. In that case, we pull them from HEAD (or really any commit of your choosing, but the default is HEAD). That may seem a bit weird, but it matches how non-bare repositories work, which read the mailmap from the working tree. So likewise, even looking at an old commit like "git show HEAD~1000", in a non-bare repository we will read .gitattributes from the working tree, which means it is (roughly) coming from HEAD. In some ways that is good (there may be improvements to the attributes) and in some ways it is weird and confusing (the meaning of the attributes may have been different back then, or the two histories may even be somewhat unrelated!). So I think you can make an argument either way on what is useful, but harmonizing the non-bare and bare cases seems like the best place to start. And then that machinery would probably be enough to let people ask for specific things on top (like "git show --attributes-from=HEAD~1000 HEAD~1000" if they really wanted). -Peff
Hey Peff, On 22 Feb 2023, at 14:47, Jeff King wrote: > On Mon, Feb 20, 2023 at 09:32:52AM -0800, Elijah Newren wrote: > >>> I was thinking we would just use the tree-ish of the first one >> >> That would certainly simplify, but it'd be pretty important to >> document. (Incidentally, this kind of decision was my reason for >> asking about all those special cases earlier, i.e. how to handle diff >> between different commits, how to handle renames, how to handle bare >> repositories, etc.) >> >> This kind of decision probably also means you'd need a variety of >> testcases where .gitattributes is different in every commit & the >> index & the working tree, and then you start testing several of the >> possible pairings to make sure the right .gitattributes file is used >> (e.g. (commit, commit), (commit, index), (index, commit), (worktree, >> index), etc.) > > There may be some prior art here in how we handle mailmaps in a bare > repository. In that case, we pull them from HEAD (or really any commit > of your choosing, but the default is HEAD). That may seem a bit weird, > but it matches how non-bare repositories work, which read the mailmap > from the working tree. > > So likewise, even looking at an old commit like "git show HEAD~1000", in > a non-bare repository we will read .gitattributes from the working tree, > which means it is (roughly) coming from HEAD. In some ways that is good > (there may be improvements to the attributes) and in some ways it is > weird and confusing (the meaning of the attributes may have been > different back then, or the two histories may even be somewhat > unrelated!). So I think you can make an argument either way on what is > useful, but harmonizing the non-bare and bare cases seems like the best > place to start. Thanks for the historical context and suggestion here > > And then that machinery would probably be enough to let people ask for > specific things on top (like "git show --attributes-from=HEAD~1000 > HEAD~1000" if they really wanted). Makes sense to me thanks John > > -Peff
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index c19e64ea0ef..f212079a131 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -758,6 +758,43 @@ with the above configuration, i.e. `j-c-diff`, with 7 parameters, just like `GIT_EXTERNAL_DIFF` program is called. See linkgit:git[1] for details. +Setting the internal diff algorithm +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The diff algorithm can be set through the `diff.algorithm` config key, but +sometimes it may be helpful to set the diff algorithm by path. For example, one +might wish to set a diff algorithm automatically for all `.json` files such that +the user would not need to pass in a separate command line `--diff-algorithm` +flag each time. + +First, in `.gitattributes`, assign the `diff` attribute for paths. + +------------------------ +*.json diff=<name> +------------------------ + +Then, define a "diff.<name>.algorithm" configuration to specify the diff +algorithm, choosing from `meyers`, `patience`, `minimal`, or `histogram`. + +---------------------------------------------------------------- +[diff "<name>"] + algorithm = histogram +---------------------------------------------------------------- + +This diff algorithm applies to user facing diff output like git-diff(1), +git-show(1) and is used for the `--stat` output as well. The merge machinery +will not use the diff algorithm set through this method. + +NOTE: If the `command` key also exists, then Git will treat this as an external +diff and attempt to use the value set for `command` as an external program. For +instance, the following config, combined with the above `.gitattributes` file, +will result in `command` favored over `algorithm`. + +---------------------------------------------------------------- +[diff "<name>"] + command = j-c-diff + algorithm = histogram +---------------------------------------------------------------- Defining a custom hunk-header ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ diff --git a/diff.c b/diff.c index 5efc22ca06b..04469da6d34 100644 --- a/diff.c +++ b/diff.c @@ -4456,15 +4456,13 @@ static void run_diff_cmd(const char *pgm, const char *xfrm_msg = NULL; int complete_rewrite = (p->status == DIFF_STATUS_MODIFIED) && p->score; int must_show_header = 0; + struct userdiff_driver *drv = NULL; - - if (o->flags.allow_external) { - struct userdiff_driver *drv; - + if (o->flags.allow_external || !o->ignore_driver_algorithm) drv = userdiff_find_by_path(o->repo->index, attr_path); - if (drv && drv->external) - pgm = drv->external; - } + + if (o->flags.allow_external && drv && drv->external) + pgm = drv->external; if (msg) { /* @@ -4481,12 +4479,16 @@ static void run_diff_cmd(const char *pgm, run_external_diff(pgm, name, other, one, two, xfrm_msg, o); return; } - if (one && two) + if (one && two) { + if (drv && !o->ignore_driver_algorithm && drv->algorithm) + set_diff_algorithm(o, drv->algorithm); + builtin_diff(name, other ? other : name, one, two, xfrm_msg, must_show_header, o, complete_rewrite); - else + } else { fprintf(o->file, "* Unmerged path %s\n", name); + } } static void diff_fill_oid_info(struct diff_filespec *one, struct index_state *istate) @@ -4583,6 +4585,14 @@ static void run_diffstat(struct diff_filepair *p, struct diff_options *o, const char *name; const char *other; + if (!o->ignore_driver_algorithm) { + struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index, p->one->path); + + if (drv && drv->algorithm) { + set_diff_algorithm(o, drv->algorithm); + } + } + if (DIFF_PAIR_UNMERGED(p)) { /* unmerged */ builtin_diffstat(p->one->path, NULL, NULL, NULL, @@ -5130,6 +5140,8 @@ static int diff_opt_diff_algorithm(const struct option *opt, return error(_("option diff-algorithm accepts \"myers\", " "\"minimal\", \"patience\" and \"histogram\"")); + options->ignore_driver_algorithm = 1; + return 0; } @@ -5145,6 +5157,8 @@ static int diff_opt_diff_algorithm_no_arg(const struct option *opt, BUG("available diff algorithms include \"myers\", " "\"minimal\", \"patience\" and \"histogram\""); + options->ignore_driver_algorithm = 1; + return 0; } @@ -5285,6 +5299,7 @@ static int diff_opt_patience(const struct option *opt, for (i = 0; i < options->anchors_nr; i++) free(options->anchors[i]); options->anchors_nr = 0; + options->ignore_driver_algorithm = 1; return set_diff_algorithm(options, "patience"); } diff --git a/diff.h b/diff.h index 41eb2c3d428..8d770b1d579 100644 --- a/diff.h +++ b/diff.h @@ -333,6 +333,7 @@ struct diff_options { int prefix_length; const char *stat_sep; int xdl_opts; + int ignore_driver_algorithm; /* see Documentation/diff-options.txt */ char **anchors; diff --git a/t/lib-diff-alternative.sh b/t/lib-diff-alternative.sh index 8d1e408bb58..2dc02bca873 100644 --- a/t/lib-diff-alternative.sh +++ b/t/lib-diff-alternative.sh @@ -105,10 +105,46 @@ index $file1..$file2 100644 } EOF + cat >expect_diffstat <<EOF + file1 => file2 | 21 ++++++++++----------- + 1 file changed, 10 insertions(+), 11 deletions(-) +EOF + STRATEGY=$1 + test_expect_success "$STRATEGY diff from attributes" ' + echo "file* diff=driver" >.gitattributes && + git config diff.driver.algorithm "$STRATEGY" && + test_must_fail git diff --no-index file1 file2 > output && + cat expect && + cat output && + test_cmp expect output + ' + + test_expect_success "$STRATEGY diff from attributes has valid diffstat" ' + echo "file* diff=driver" >.gitattributes && + git config diff.driver.algorithm "$STRATEGY" && + test_must_fail git diff --stat --no-index file1 file2 > output && + test_cmp expect_diffstat output + ' + test_expect_success "$STRATEGY diff" ' - test_must_fail git diff --no-index "--$STRATEGY" file1 file2 > output && + test_must_fail git diff --no-index "--diff-algorithm=$STRATEGY" file1 file2 > output && + test_cmp expect output + ' + + test_expect_success "$STRATEGY diff command line precedence before attributes" ' + echo "file* diff=driver" >.gitattributes && + git config diff.driver.algorithm meyers && + test_must_fail git diff --no-index "--diff-algorithm=$STRATEGY" file1 file2 > output && + test_cmp expect output + ' + + test_expect_success "$STRATEGY diff attributes precedence before config" ' + git config diff.algorithm default && + echo "file* diff=driver" >.gitattributes && + git config diff.driver.algorithm "$STRATEGY" && + test_must_fail git diff --no-index file1 file2 > output && test_cmp expect output ' diff --git a/userdiff.c b/userdiff.c index d71b82feb74..ff25cfc4b4c 100644 --- a/userdiff.c +++ b/userdiff.c @@ -293,7 +293,7 @@ PATTERNS("scheme", "|([^][)(}{[ \t])+"), PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$", "\\\\[a-zA-Z@]+|\\\\.|[a-zA-Z0-9\x80-\xff]+"), -{ "default", NULL, -1, { NULL, 0 } }, +{ "default", NULL, NULL, -1, { NULL, 0 } }, }; #undef PATTERNS #undef IPATTERN @@ -394,6 +394,8 @@ int userdiff_config(const char *k, const char *v) return parse_bool(&drv->textconv_want_cache, k, v); if (!strcmp(type, "wordregex")) return git_config_string(&drv->word_regex, k, v); + if (!strcmp(type, "algorithm")) + return git_config_string(&drv->algorithm, k, v); return 0; } diff --git a/userdiff.h b/userdiff.h index aee91bc77e6..24419db6973 100644 --- a/userdiff.h +++ b/userdiff.h @@ -14,6 +14,7 @@ struct userdiff_funcname { struct userdiff_driver { const char *name; const char *external; + const char *algorithm; int binary; struct userdiff_funcname funcname; const char *word_regex;