diff mbox series

[v3] difftool.c: learn a new way start from specified file

Message ID pull.870.v3.git.1612884654078.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series [v3] difftool.c: learn a new way start from specified file | expand

Commit Message

ZheNing Hu Feb. 9, 2021, 3:30 p.m. UTC
From: ZheNing Hu <adlternative@gmail.com>

`git difftool` only allow us to select file to view in turn.
If there is a commit with many files and we exit in the search,
We will have to traverse list again to get the file diff which
we want to see. Therefore, here is a new method: user can use
`git difftool --start-from=<filename>` to start viewing from
the specified file. This will improve the user experience.
At the same time, turn bit field constants into bit shift format
in `diff.h`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    difftool.c: learn a new way start at specified file
    
    git user may should travel the diff list to choice file diff to view, if
    they exit in midway,they must travel it again. By starting from the
    specified file method, provides a possibility for this user-friendly
    solution.
    
    this patch's origin discuss is here:
    https://lore.kernel.org/git/gOXOaoqn-E9A2ob7ykWEcDc7ZxmSwAjcP5CCFKfr5ejCOWZQ1lfAUZcbgYT9AyQCcDgJvCrnrtziXiels-Hxol3xlkGTVHk24SvAdaSUtKQ=@rtzoeller.com/
    
    Maybe this patch is more like skip to in Junio's original thread than
    the previous versions.
    
    Thanks!

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-870%2Fadlternative%2Fdifftool_save_point-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-870/adlternative/difftool_save_point-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/870

Range-diff vs v2:

 1:  2468eaff322b < -:  ------------ git-difftool-helper.sh: learn a new way go back to last save point
 -:  ------------ > 1:  29fc6b4bc08f difftool.c: learn a new way start from specified file


 Documentation/git-difftool.txt |  3 +++
 builtin/difftool.c             |  7 ++++++-
 diff.c                         |  9 +++++++++
 diff.h                         | 20 ++++++++++----------
 t/t7800-difftool.sh            | 12 ++++++++++++
 5 files changed, 40 insertions(+), 11 deletions(-)


base-commit: e6362826a0409539642a5738db61827e5978e2e4

Comments

Junio C Hamano Feb. 9, 2021, 6:17 p.m. UTC | #1
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/Documentation/git-difftool.txt b/Documentation/git-difftool.txt
> index 484c485fd06c..552be097dfea 100644
> --- a/Documentation/git-difftool.txt
> +++ b/Documentation/git-difftool.txt
> @@ -34,6 +34,9 @@ OPTIONS
>  	This is the default behaviour; the option is provided to
>  	override any configuration settings.
>  
> +--start-from::
> +	Start viewing diff from the specified file.
> +

There is nothing that specifies a file in the above ;-)

	--start-from=<file>::
		Start viewing ...

This is even more important as SYNOPSIS section of this manual page
does not duplicate the list of options and their arguments.

>  -t <tool>::
>  --tool=<tool>::
>  	Use the diff tool specified by <tool>.  Valid values include

There are many things I dislike about this patch, but do not take it
as a personal attack.  I'll try to suggest an alternative at the end,
but read along.

> diff --git a/builtin/difftool.c b/builtin/difftool.c
> index 6e18e623fddf..67d2befa1210 100644
> --- a/builtin/difftool.c
> +++ b/builtin/difftool.c
> @@ -690,7 +690,7 @@ int cmd_difftool(int argc, const char **argv, const char *prefix)
>  {
>  	int use_gui_tool = 0, dir_diff = 0, prompt = -1, symlinks = 0,
>  	    tool_help = 0, no_index = 0;
> -	static char *difftool_cmd = NULL, *extcmd = NULL;
> +	static char *difftool_cmd = NULL, *extcmd = NULL, *start_file = NULL;
>  	struct option builtin_difftool_options[] = {
>  		OPT_BOOL('g', "gui", &use_gui_tool,
>  			 N_("use `diff.guitool` instead of `diff.tool`")),
> @@ -714,6 +714,8 @@ int cmd_difftool(int argc, const char **argv, const char *prefix)
>  		OPT_STRING('x', "extcmd", &extcmd, N_("command"),
>  			   N_("specify a custom command for viewing diffs")),
>  		OPT_ARGUMENT("no-index", &no_index, N_("passed to `diff`")),
> +		OPT_STRING(0, "start-from", &start_file, N_("start-from"),
> +			   N_("start viewing diff from the specified file")),
>  		OPT_END()
>  	};

This may be a good UI to "difftool".

> @@ -724,6 +726,9 @@ int cmd_difftool(int argc, const char **argv, const char *prefix)
>  			     builtin_difftool_usage, PARSE_OPT_KEEP_UNKNOWN |
>  			     PARSE_OPT_KEEP_DASHDASH);
>  
> +	if (start_file)
> +		setenv("START_FILE", start_file, 1);

Unacceptable.  Nothing gives Git the right to squat on such a
generic name, and there is no hint that it is used to specify the
starting point of what operation.  In addition, I do not see a good
reason why we need to use an environment variable in the first
place.  We run "diff" as an external process, with GIT_EXTERNAL_DIFF
environment pointing back at us, no?  This information should be
passed from its command line.

> diff --git a/diff.c b/diff.c
> index 69e3bc00ed8f..cdad26f99063 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4249,6 +4249,7 @@ static void run_external_diff(const char *pgm,
>  			      const char *xfrm_msg,
>  			      struct diff_options *o)
>  {
> +	const char *start_file = NULL;
>  	struct strvec argv = STRVEC_INIT;
>  	struct strvec env = STRVEC_INIT;
>  	struct diff_queue_struct *q = &diff_queued_diff;
> @@ -4272,9 +4273,17 @@ static void run_external_diff(const char *pgm,
>  
>  	diff_free_filespec_data(one);
>  	diff_free_filespec_data(two);
> +
> +	start_file = xstrdup_or_null(getenv("START_FILE"));
> +	if (start_file) {
> +		if (strcmp(start_file, name))
> +			goto finish;
> +		unsetenv("START_FILE");
> +	}

Again, an unacceptable "hack".  "start the diff output showing from
this path" would plausibly a good feature even outside the scope of
"difftool", and the feature should not be limited to the external
diff interface.  More on this later.

>  	if (run_command_v_opt_cd_env(argv.v, RUN_USING_SHELL, NULL, env.v))
>  		die(_("external diff died, stopping at %s"), name);
>  
> +finish:
>  	remove_tempfile();
>  	strvec_clear(&argv);
>  	strvec_clear(&env);
> diff --git a/diff.h b/diff.h
> index 2ff2b1c7f2ca..f67c43f5af95 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -86,18 +86,18 @@ typedef void (*diff_format_fn_t)(struct diff_queue_struct *q,
>  
>  typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data);
>  
> -#define DIFF_FORMAT_RAW		0x0001
> -#define DIFF_FORMAT_DIFFSTAT	0x0002
> -#define DIFF_FORMAT_NUMSTAT	0x0004
> -#define DIFF_FORMAT_SUMMARY	0x0008
> -#define DIFF_FORMAT_PATCH	0x0010
> -#define DIFF_FORMAT_SHORTSTAT	0x0020
> -#define DIFF_FORMAT_DIRSTAT	0x0040
> +#define DIFF_FORMAT_RAW		(1U<<0)
> +#define DIFF_FORMAT_DIFFSTAT	(1U<<1)
> +#define DIFF_FORMAT_NUMSTAT	(1U<<2)
> +#define DIFF_FORMAT_SUMMARY	(1U<<3)
> +#define DIFF_FORMAT_PATCH	(1U<<4)
> +#define DIFF_FORMAT_SHORTSTAT	(1U<<5)
> +#define DIFF_FORMAT_DIRSTAT	(1U<<6)
>  
>  /* These override all above */
> -#define DIFF_FORMAT_NAME	0x0100
> -#define DIFF_FORMAT_NAME_STATUS	0x0200
> -#define DIFF_FORMAT_CHECKDIFF	0x0400
> +#define DIFF_FORMAT_NAME	(1U<<8)
> +#define DIFF_FORMAT_NAME_STATUS	(1U<<9)
> +#define DIFF_FORMAT_CHECKDIFF	(1U<<10)

Do we need these changes for the new feature to work?

I also find this "skip and discard ones earlier than the given path"
makes the utility of the feature artificially narrower than needed,
when I imagine how else this feature, or a variant of it, would be
useful in other situations.  For example, consider that there are 5
paths, and you've seen 3 of them so far before you went off, so you
are restarting from 4th file.  But wouldn't it be more useful, after
showing the 4th and 5th file, if the tool wraps around to show 1st,
2nd, and 3rd file if the user kept going?

I suspect that this feature fits the overall system much better if
it is implemented as a new step in the diffcore transformation.  The
way our diff subsystem works is roughly:

 * The front-end "diff", "diff-files", "diff-index" and "diff-trees"
   are given two "tree like things" to compare, and feeds bunch of
   <old, new> tuples to the diff internal machinery.  Each of these
   tuples are called "filepairs", and a "pair" has two "filespecs",
   one describing the contents, the mode, and the path in the "old"
   side of the "tree like things", the other describing the
   contents, the mode, and the path in the "new" side of the "tree
   like things".

 * The series of filepairs are given to the "diffcore" machinery,
   where they may be broken (i.e. a filepair that says that the old
   contents of "hello.txt" was X, and the new contents of
   "hello.txt" is Y, may become two filepairs, one that says that
   the "hello.txt" file with contents X used to be in the old tree
   but there is nothing corresponding to it in the new tree, and the
   other says that a new "hello.txt" with contents Y appeared
   without corresponding thing in the old tree), matched (i.e. there
   may originally be two filepairs, one that says path A.txt appears
   on the old side but disappeared on the new side, and the other
   that says path sub/A.txt did not exist on the old side but
   appears on the new side---these two filepairs may be merged to
   express "A.txt on the old side got renamed to sub/A.txt on the
   new side"), etc.

 * The set of filepairs processed in the "diffcore" machinery is
   given to the backend and each filepair will be shown in the
   output, as a series of patches, a diffstat, etc.

There is a step in the "diffcore" machinery called "diffcore-order".
The front-ends all feed the filepairs alphabetically to the
"diffcore" machinery, but "diffcore-order" can reorder them, so that
the original order that the "git diff HEAD --" frontend found the
changes may be to "hello.c" and then "hello.h" (because .c sorts
before .h), but the users can specify with the "-O<orderfile>"
option that they want to view the header files before the source
files.  When the set of filepairs exits the diffcore machinery, the
original "hello.c" then "hello.h" order may get modified to show
"hello.h" then "hello.c".  It probably is the simplest to model this
new feature after how "diffcore-order" does it.

So, perhaps we can introduce a new "diffcore-rotate" step, where the
filepairs before the specified location are rotated out to the end
of the filepairs?

The following is just a quick draft that is only lightly tested by
running itself with the "--rotate-to=diffcore-rotate.c" option, but
it should be sufficient to get you started.  You should add a way to
diagnose that the name given to the "--start-file" option actually
exists in the diff on the "difftool" side, because a path that does
not exist in the patch with the following code is simply ignored
(and that is very much deliberate, because we do not want it to die
while running "git log -p --rotate-to=X" where some changes may or
may not touch X).


$ ./git diff --stat -p --rotate-to=diffcore-rotate.c HEAD

 diffcore-rotate.c | 35 +++++++++++++++++++++++++++++++++++
 diffcore.h        |  1 +
 Makefile          |  1 +
 diff.c            |  4 ++++
 diff.h            |  1 +
 5 files changed, 42 insertions(+)

diff --git c/diffcore-rotate.c w/diffcore-rotate.c
new file mode 100644
index 0000000000..0d17901616
--- /dev/null
+++ w/diffcore-rotate.c
@@ -0,0 +1,35 @@
+/*
+ * Copyright (C) 2021, Google LLC.
+ * Based on diffcore-order.c, which is Copyright (C) 2005, Junio C Hamano
+ */
+#include "cache.h"
+#include "diff.h"
+#include "diffcore.h"
+
+void diffcore_rotate(const char *rotate_to_filename)
+{
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct diff_queue_struct outq;
+	int rotate_to, i;
+
+	if (!q->nr)
+		return;
+
+	for (i = 0; i < q->nr; i++)
+		if (!strcmp(rotate_to_filename, q->queue[i]->two->path))
+			break;
+	/* we did not find the specified path */
+	if (q->nr <= i)
+		return;
+
+	DIFF_QUEUE_CLEAR(&outq);
+	rotate_to = i;
+
+	for (i = rotate_to; i < q->nr; i++)
+		diff_q(&outq, q->queue[i]);
+	for (i = 0; i < rotate_to; i++)
+		diff_q(&outq, q->queue[i]);
+
+	free(q->queue);
+	*q = outq;
+}
diff --git c/diffcore.h w/diffcore.h
index d2a63c5c71..bd5959375b 100644
--- c/diffcore.h
+++ w/diffcore.h
@@ -164,6 +164,7 @@ void diffcore_rename(struct diff_options *);
 void diffcore_merge_broken(void);
 void diffcore_pickaxe(struct diff_options *);
 void diffcore_order(const char *orderfile);
+void diffcore_rotate(const char *rotate_to_filename);
 
 /* low-level interface to diffcore_order */
 struct obj_order {
diff --git c/Makefile w/Makefile
index b797033c58..031b0b88e6 100644
--- c/Makefile
+++ w/Makefile
@@ -869,6 +869,7 @@ LIB_OBJS += diffcore-delta.o
 LIB_OBJS += diffcore-order.o
 LIB_OBJS += diffcore-pickaxe.o
 LIB_OBJS += diffcore-rename.o
+LIB_OBJS += diffcore-rotate.o
 LIB_OBJS += dir-iterator.o
 LIB_OBJS += dir.o
 LIB_OBJS += editor.o
diff --git c/diff.c w/diff.c
index 69e3bc00ed..90a8d5abd0 100644
--- c/diff.c
+++ w/diff.c
@@ -5599,6 +5599,8 @@ static void prep_parse_options(struct diff_options *options)
 			  DIFF_PICKAXE_REGEX, PARSE_OPT_NONEG),
 		OPT_FILENAME('O', NULL, &options->orderfile,
 			     N_("control the order in which files appear in the output")),
+		OPT_STRING(0, "rotate-to", &options->rotate_to, N_("<path>"),
+			   N_("show the change in the specified path first")),
 		OPT_CALLBACK_F(0, "find-object", options, N_("<object-id>"),
 			       N_("look for differences that change the number of occurrences of the specified object"),
 			       PARSE_OPT_NONEG, diff_opt_find_object),
@@ -6669,6 +6671,8 @@ void diffcore_std(struct diff_options *options)
 		diffcore_pickaxe(options);
 	if (options->orderfile)
 		diffcore_order(options->orderfile);
+	if (options->rotate_to)
+		diffcore_rotate(options->rotate_to);
 	if (!options->found_follow)
 		/* See try_to_follow_renames() in tree-diff.c */
 		diff_resolve_rename_copy();
diff --git c/diff.h w/diff.h
index 2ff2b1c7f2..0801469e63 100644
--- c/diff.h
+++ w/diff.h
@@ -226,6 +226,7 @@ enum diff_submodule_format {
  */
 struct diff_options {
 	const char *orderfile;
+	const char *rotate_to;
 
 	/**
 	 * A constant string (can and typically does contain newlines to look for
ZheNing Hu Feb. 10, 2021, 5 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> 于2021年2月10日周三 上午2:17写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > diff --git a/Documentation/git-difftool.txt b/Documentation/git-difftool.txt
> > index 484c485fd06c..552be097dfea 100644
> > --- a/Documentation/git-difftool.txt
> > +++ b/Documentation/git-difftool.txt
> > @@ -34,6 +34,9 @@ OPTIONS
> >       This is the default behaviour; the option is provided to
> >       override any configuration settings.
> >
> > +--start-from::
> > +     Start viewing diff from the specified file.
> > +
>
> There is nothing that specifies a file in the above ;-)
>
>         --start-from=<file>::
>                 Start viewing ...
>
> This is even more important as SYNOPSIS section of this manual page
> does not duplicate the list of options and their arguments.
>
I admit my mistake,thanks for reminding.
> >  -t <tool>::
> >  --tool=<tool>::
> >       Use the diff tool specified by <tool>.  Valid values include
>
> There are many things I dislike about this patch, but do not take it
> as a personal attack.  I'll try to suggest an alternative at the end,
> but read along.
>
Of course I humbly accepted your criticism,I am only a sophomore in
university after all,It is extremely important to listen to your suggestions.
> > diff --git a/builtin/difftool.c b/builtin/difftool.c
> > index 6e18e623fddf..67d2befa1210 100644
> > --- a/builtin/difftool.c
> > +++ b/builtin/difftool.c
> > @@ -690,7 +690,7 @@ int cmd_difftool(int argc, const char **argv, const char *prefix)
> >  {
> >       int use_gui_tool = 0, dir_diff = 0, prompt = -1, symlinks = 0,
> >           tool_help = 0, no_index = 0;
> > -     static char *difftool_cmd = NULL, *extcmd = NULL;
> > +     static char *difftool_cmd = NULL, *extcmd = NULL, *start_file = NULL;
> >       struct option builtin_difftool_options[] = {
> >               OPT_BOOL('g', "gui", &use_gui_tool,
> >                        N_("use `diff.guitool` instead of `diff.tool`")),
> > @@ -714,6 +714,8 @@ int cmd_difftool(int argc, const char **argv, const char *prefix)
> >               OPT_STRING('x', "extcmd", &extcmd, N_("command"),
> >                          N_("specify a custom command for viewing diffs")),
> >               OPT_ARGUMENT("no-index", &no_index, N_("passed to `diff`")),
> > +             OPT_STRING(0, "start-from", &start_file, N_("start-from"),
> > +                        N_("start viewing diff from the specified file")),
> >               OPT_END()
> >       };
>
> This may be a good UI to "difftool".
>
> > @@ -724,6 +726,9 @@ int cmd_difftool(int argc, const char **argv, const char *prefix)
> >                            builtin_difftool_usage, PARSE_OPT_KEEP_UNKNOWN |
> >                            PARSE_OPT_KEEP_DASHDASH);
> >
> > +     if (start_file)
> > +             setenv("START_FILE", start_file, 1);
>
> Unacceptable.  Nothing gives Git the right to squat on such a
> generic name, and there is no hint that it is used to specify the
> starting point of what operation.  In addition, I do not see a good
> reason why we need to use an environment variable in the first
> place.  We run "diff" as an external process, with GIT_EXTERNAL_DIFF
> environment pointing back at us, no?  This information should be
> passed from its command line.
Sure,I using environment variables originally hoped can be used in
git--difftool-helper.sh, later I found it was not easy to deal with, so I am
attempt to use environment variables to transmit information in
`run_external_diff`,now it seems that this method is not very good ...
>
> > diff --git a/diff.c b/diff.c
> > index 69e3bc00ed8f..cdad26f99063 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -4249,6 +4249,7 @@ static void run_external_diff(const char *pgm,
> >                             const char *xfrm_msg,
> >                             struct diff_options *o)
> >  {
> > +     const char *start_file = NULL;
> >       struct strvec argv = STRVEC_INIT;
> >       struct strvec env = STRVEC_INIT;
> >       struct diff_queue_struct *q = &diff_queued_diff;
> > @@ -4272,9 +4273,17 @@ static void run_external_diff(const char *pgm,
> >
> >       diff_free_filespec_data(one);
> >       diff_free_filespec_data(two);
> > +
> > +     start_file = xstrdup_or_null(getenv("START_FILE"));
> > +     if (start_file) {
> > +             if (strcmp(start_file, name))
> > +                     goto finish;
> > +             unsetenv("START_FILE");
> > +     }
>
> Again, an unacceptable "hack".  "start the diff output showing from
> this path" would plausibly a good feature even outside the scope of
> "difftool", and the feature should not be limited to the external
> diff interface.  More on this later.
What you said `from its command line` is using `git difftool
--rotate-to=<file>`,
Maybe I didn’t know whether to deal with the subcommands of `diff`, now I admit
that passing from parameters is better than passing environment variables.
>
> >       if (run_command_v_opt_cd_env(argv.v, RUN_USING_SHELL, NULL, env.v))
> >               die(_("external diff died, stopping at %s"), name);
> >
> > +finish:
> >       remove_tempfile();
> >       strvec_clear(&argv);
> >       strvec_clear(&env);
> > diff --git a/diff.h b/diff.h
> > index 2ff2b1c7f2ca..f67c43f5af95 100644
> > --- a/diff.h
> > +++ b/diff.h
> > @@ -86,18 +86,18 @@ typedef void (*diff_format_fn_t)(struct diff_queue_struct *q,
> >
> >  typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data);
> >
> > -#define DIFF_FORMAT_RAW              0x0001
> > -#define DIFF_FORMAT_DIFFSTAT 0x0002
> > -#define DIFF_FORMAT_NUMSTAT  0x0004
> > -#define DIFF_FORMAT_SUMMARY  0x0008
> > -#define DIFF_FORMAT_PATCH    0x0010
> > -#define DIFF_FORMAT_SHORTSTAT        0x0020
> > -#define DIFF_FORMAT_DIRSTAT  0x0040
> > +#define DIFF_FORMAT_RAW              (1U<<0)
> > +#define DIFF_FORMAT_DIFFSTAT (1U<<1)
> > +#define DIFF_FORMAT_NUMSTAT  (1U<<2)
> > +#define DIFF_FORMAT_SUMMARY  (1U<<3)
> > +#define DIFF_FORMAT_PATCH    (1U<<4)
> > +#define DIFF_FORMAT_SHORTSTAT        (1U<<5)
> > +#define DIFF_FORMAT_DIRSTAT  (1U<<6)
> >
> >  /* These override all above */
> > -#define DIFF_FORMAT_NAME     0x0100
> > -#define DIFF_FORMAT_NAME_STATUS      0x0200
> > -#define DIFF_FORMAT_CHECKDIFF        0x0400
> > +#define DIFF_FORMAT_NAME     (1U<<8)
> > +#define DIFF_FORMAT_NAME_STATUS      (1U<<9)
> > +#define DIFF_FORMAT_CHECKDIFF        (1U<<10)
>
> Do we need these changes for the new feature to work?
It has no effect on this new feature. I should put this modification
in an additional commit, right?
>
> I also find this "skip and discard ones earlier than the given path"
> makes the utility of the feature artificially narrower than needed,
> when I imagine how else this feature, or a variant of it, would be
> useful in other situations.  For example, consider that there are 5
> paths, and you've seen 3 of them so far before you went off, so you
> are restarting from 4th file.  But wouldn't it be more useful, after
> showing the 4th and 5th file, if the tool wraps around to show 1st,
> 2nd, and 3rd file if the user kept going?
>
Well, it would be better if you can see 1st, 2nd, 3rd again.
> I suspect that this feature fits the overall system much better if
> it is implemented as a new step in the diffcore transformation.  The
> way our diff subsystem works is roughly:
>
>  * The front-end "diff", "diff-files", "diff-index" and "diff-trees"
>    are given two "tree like things" to compare, and feeds bunch of
>    <old, new> tuples to the diff internal machinery.  Each of these
>    tuples are called "filepairs", and a "pair" has two "filespecs",
>    one describing the contents, the mode, and the path in the "old"
>    side of the "tree like things", the other describing the
>    contents, the mode, and the path in the "new" side of the "tree
>    like things".
>
>  * The series of filepairs are given to the "diffcore" machinery,
>    where they may be broken (i.e. a filepair that says that the old
>    contents of "hello.txt" was X, and the new contents of
>    "hello.txt" is Y, may become two filepairs, one that says that
>    the "hello.txt" file with contents X used to be in the old tree
>    but there is nothing corresponding to it in the new tree, and the
>    other says that a new "hello.txt" with contents Y appeared
>    without corresponding thing in the old tree), matched (i.e. there
>    may originally be two filepairs, one that says path A.txt appears
>    on the old side but disappeared on the new side, and the other
>    that says path sub/A.txt did not exist on the old side but
>    appears on the new side---these two filepairs may be merged to
>    express "A.txt on the old side got renamed to sub/A.txt on the
>    new side"), etc.
>
Thank you for these interpretations on "git diff" for me, it will help me
understand its principle.
>  * The set of filepairs processed in the "diffcore" machinery is
>    given to the backend and each filepair will be shown in the
>    output, as a series of patches, a diffstat, etc.
>
> There is a step in the "diffcore" machinery called "diffcore-order".
> The front-ends all feed the filepairs alphabetically to the
> "diffcore" machinery, but "diffcore-order" can reorder them, so that
> the original order that the "git diff HEAD --" frontend found the
> changes may be to "hello.c" and then "hello.h" (because .c sorts
> before .h), but the users can specify with the "-O<orderfile>"
> option that they want to view the header files before the source
> files.  When the set of filepairs exits the diffcore machinery, the
> original "hello.c" then "hello.h" order may get modified to show
> "hello.h" then "hello.c".  It probably is the simplest to model this
> new feature after how "diffcore-order" does it.
>
Very magical "diffcore-order", when I was looking for a solution yesterday,
I noticed that these functions similar to "diffcore_apply_filter" in
"diffcore_std",
> So, perhaps we can introduce a new "diffcore-rotate" step, where the
> filepairs before the specified location are rotated out to the end
> of the filepairs?
>
Awesome idea. In this way, `difftool --rotate-to=<file>` can call
`diff --rotate-to=<file>` , user can choose the starting file, and they can
also see previous files.
> The following is just a quick draft that is only lightly tested by
> running itself with the "--rotate-to=diffcore-rotate.c" option, but
> it should be sufficient to get you started.  You should add a way to
> diagnose that the name given to the "--start-file" option actually
> exists in the diff on the "difftool" side, because a path that does
> not exist in the patch with the following code is simply ignored
> (and that is very much deliberate, because we do not want it to die
> while running "git log -p --rotate-to=X" where some changes may or
> may not touch X).
Yes, I want to know why being so cautious in git log?If the file name is
wrong, why can't I make it exit? :)
>
>
> $ ./git diff --stat -p --rotate-to=diffcore-rotate.c HEAD
>
>  diffcore-rotate.c | 35 +++++++++++++++++++++++++++++++++++
>  diffcore.h        |  1 +
>  Makefile          |  1 +
>  diff.c            |  4 ++++
>  diff.h            |  1 +
>  5 files changed, 42 insertions(+)
>
> diff --git c/diffcore-rotate.c w/diffcore-rotate.c
> new file mode 100644
> index 0000000000..0d17901616
> --- /dev/null
> +++ w/diffcore-rotate.c
> @@ -0,0 +1,35 @@
> +/*
> + * Copyright (C) 2021, Google LLC.
> + * Based on diffcore-order.c, which is Copyright (C) 2005, Junio C Hamano
> + */
> +#include "cache.h"
> +#include "diff.h"
> +#include "diffcore.h"
> +
> +void diffcore_rotate(const char *rotate_to_filename)
> +{
> +       struct diff_queue_struct *q = &diff_queued_diff;
> +       struct diff_queue_struct outq;
> +       int rotate_to, i;
> +
> +       if (!q->nr)
> +               return;
> +
> +       for (i = 0; i < q->nr; i++)
> +               if (!strcmp(rotate_to_filename, q->queue[i]->two->path))
> +                       break;
> +       /* we did not find the specified path */
> +       if (q->nr <= i)
> +               return;
> +
> +       DIFF_QUEUE_CLEAR(&outq);
> +       rotate_to = i;
> +
> +       for (i = rotate_to; i < q->nr; i++)
> +               diff_q(&outq, q->queue[i]);
> +       for (i = 0; i < rotate_to; i++)
> +               diff_q(&outq, q->queue[i]);
> +
> +       free(q->queue);
> +       *q = outq;
> +}
> diff --git c/diffcore.h w/diffcore.h
> index d2a63c5c71..bd5959375b 100644
> --- c/diffcore.h
> +++ w/diffcore.h
> @@ -164,6 +164,7 @@ void diffcore_rename(struct diff_options *);
>  void diffcore_merge_broken(void);
>  void diffcore_pickaxe(struct diff_options *);
>  void diffcore_order(const char *orderfile);
> +void diffcore_rotate(const char *rotate_to_filename);
>
>  /* low-level interface to diffcore_order */
>  struct obj_order {
> diff --git c/Makefile w/Makefile
> index b797033c58..031b0b88e6 100644
> --- c/Makefile
> +++ w/Makefile
> @@ -869,6 +869,7 @@ LIB_OBJS += diffcore-delta.o
>  LIB_OBJS += diffcore-order.o
>  LIB_OBJS += diffcore-pickaxe.o
>  LIB_OBJS += diffcore-rename.o
> +LIB_OBJS += diffcore-rotate.o
>  LIB_OBJS += dir-iterator.o
>  LIB_OBJS += dir.o
>  LIB_OBJS += editor.o
> diff --git c/diff.c w/diff.c
> index 69e3bc00ed..90a8d5abd0 100644
> --- c/diff.c
> +++ w/diff.c
> @@ -5599,6 +5599,8 @@ static void prep_parse_options(struct diff_options *options)
>                           DIFF_PICKAXE_REGEX, PARSE_OPT_NONEG),
>                 OPT_FILENAME('O', NULL, &options->orderfile,
>                              N_("control the order in which files appear in the output")),
> +               OPT_STRING(0, "rotate-to", &options->rotate_to, N_("<path>"),
> +                          N_("show the change in the specified path first")),
>                 OPT_CALLBACK_F(0, "find-object", options, N_("<object-id>"),
>                                N_("look for differences that change the number of occurrences of the specified object"),
>                                PARSE_OPT_NONEG, diff_opt_find_object),
> @@ -6669,6 +6671,8 @@ void diffcore_std(struct diff_options *options)
>                 diffcore_pickaxe(options);
>         if (options->orderfile)
>                 diffcore_order(options->orderfile);
> +       if (options->rotate_to)
> +               diffcore_rotate(options->rotate_to);
>         if (!options->found_follow)
>                 /* See try_to_follow_renames() in tree-diff.c */
>                 diff_resolve_rename_copy();
> diff --git c/diff.h w/diff.h
> index 2ff2b1c7f2..0801469e63 100644
> --- c/diff.h
> +++ w/diff.h
> @@ -226,6 +226,7 @@ enum diff_submodule_format {
>   */
>  struct diff_options {
>         const char *orderfile;
> +       const char *rotate_to;
>
>         /**
>          * A constant string (can and typically does contain newlines to look for

Thanks for the patch!
After that, there was too little work I could do,do i just need to add
the following
 code in `diff_flush_patch_all_file_pairs`?
if (o->rotate_to && q->nr && strcmp(q->queue[0]->one->path, o->rotate_to) &&
strcmp(q->queue[0]->one->path, o->rotate_to)) {
    error(_("could not find start file name '%s'"), o->rotate_to);
        return;
}
In addition, Do I need to do the documentation and tests related to
your `diff --rotate-to`?
If so, I will finish it, otherwise, I will only do corresponding tests
for `git difftool --rotate-to`.

Thank you Junio, you are like a tireless teacher,

Happy Chinese New Year!

--
ZheNing Hu
Junio C Hamano Feb. 10, 2021, 6:16 p.m. UTC | #3
胡哲宁 <adlternative@gmail.com> writes:

> It has no effect on this new feature. I should put this modification
> in an additional commit, right?

Or you can just drop it.  It certainly is a distracting change to be
part of this topic.

> Yes, I want to know why being so cautious in git log?If the file name is
> wrong, why can't I make it exit? :)

Imagine a history where file1 and file2 are in the initial commit.
The second commit adds file3 and modifies file2, and then the third
commit modifies file1.  What would happen when you did this?

  $ git log -p --rotate-to=file2

For the commit at HEAD, the set of paths shown would be file1 and
nothing else (because it is the only path that gets modified).  You
cannot start showing from "file2".  Dying (and not showing HEAD~1
and HEAD~2) is the last thing you want to do in this situation.  We
do not even want to give a warning or an error, as it is totally
expected that some commits do not touch a given path---it would be
very unusual if a path is touched by every commit ;-).

For "difftool --start-at=file2", the equation is different.  It does
not traverse history where each commit may or may not modify file2,
and when the user says s/he wants to resume from file2, file2 must
be in the set of paths that have changes, or something is wrong (i.e.
the user meant file3 but mistyped it as file2).

> Awesome idea. In this way, `difftool --rotate-to=<file>` can call
> `diff --rotate-to=<file>` , user can choose the starting file, and they can
> also see previous files.

So "difftool --start-at=<file>" can of use "diff --rotate-to=<file>"
to implement the feature (after all, that is why I wrote it), but
the error condition between the two are quite different.  And ...

> After that, there was too little work I could do,do i just need to add
> the following
>  code in `diff_flush_patch_all_file_pairs`?


> if (o->rotate_to && q->nr && strcmp(q->queue[0]->one->path, o->rotate_to) &&
> strcmp(q->queue[0]->one->path, o->rotate_to)) {
>     error(_("could not find start file name '%s'"), o->rotate_to);
>         return;
> }

... that is why an unconditional change like this in diff.c is not
acceptable, as it would break other codepaths like "git log -p".  If
we were to add an error there, it has to be very limited to exclude
at least "log -p"---there may be other features that share the code
that should not trigger an error for similar reasons.

If diffcore-rotate chooses "missing rotate-to file makes it a no-op"
as its semantics, and if "difftool --start-at" does not want to see
a misspelt filename making it a no-op, then the latter needs to
ensure that the name it got from the user is indeed in the set of
paths that have changes before running "diff --rotate-to" to
implement its "difftool --start-at" feature.

The "missing rotate-to file in the diff_queue MUST NOT cause
diffcore-rotate to error out" rule is probably unnegitiable, but
there are other ways to make it easier to use, though.

For example, we could change the rotate-to logic to mean "start at
the given path, or if such a path does not exist in the diff_queue,
then the first path that sorts after the given path is where we
start".  That way, if the diff_queue has paths A B C D E and
rotate-to gives C, then we rotate the output to C D E A B.  And if
the diff_queue has A B D E and rotate-to gives C, then the output
would become D E A B (instead of becoming a no-op).  Then, a mistyped
filename may not do what the user wanted to do (after all, that is
the definition of MIStyping), but it would do something noticeable
by the user, which may be useful enough and at least would let the
user notice the mistake.

> In addition, Do I need to do the documentation and tests related to
> your `diff --rotate-to`?

Once we know how we want "diff --rotate-to" to behave exactly, I can
help that part further, if you want.  And then you can build on top.

But we need to design exactly what the desired semantics would be
before any of that.

Thanks.
Junio C Hamano Feb. 10, 2021, 8:05 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> So "difftool --start-at=<file>" can of use "diff --rotate-to=<file>"

sorry; s/can of/can of course/.
ZheNing Hu Feb. 14, 2021, 7:53 a.m. UTC | #5
Junio C Hamano <gitster@pobox.com> 于2021年2月11日周四 上午2:17写道:
>
> 胡哲宁 <adlternative@gmail.com> writes:
>
> > It has no effect on this new feature. I should put this modification
> > in an additional commit, right?
>
> Or you can just drop it.  It certainly is a distracting change to be
> part of this topic.
>
OK.
> > Yes, I want to know why being so cautious in git log?If the file name is
> > wrong, why can't I make it exit? :)
>
> Imagine a history where file1 and file2 are in the initial commit.
> The second commit adds file3 and modifies file2, and then the third
> commit modifies file1.  What would happen when you did this?
>
>   $ git log -p --rotate-to=file2
>
> For the commit at HEAD, the set of paths shown would be file1 and
> nothing else (because it is the only path that gets modified).  You
> cannot start showing from "file2".  Dying (and not showing HEAD~1
> and HEAD~2) is the last thing you want to do in this situation.  We
> do not even want to give a warning or an error, as it is totally
> expected that some commits do not touch a given path---it would be
> very unusual if a path is touched by every commit ;-).
>
Now I understand that in `log -p`, some commit do not have the file, and some
commit have the file. The best way to display the commit without the file
is to keep it as it is, and rotate the commit with the file. And in `difftool`
need to ensure the exist the specified file or give the more matching file
as the beginning(As you mentioned later).
> For "difftool --start-at=file2", the equation is different.  It does
> not traverse history where each commit may or may not modify file2,
> and when the user says s/he wants to resume from file2, file2 must
> be in the set of paths that have changes, or something is wrong (i.e.
> the user meant file3 but mistyped it as file2).
>
> > Awesome idea. In this way, `difftool --rotate-to=<file>` can call
> > `diff --rotate-to=<file>` , user can choose the starting file, and they can
> > also see previous files.
>
> So "difftool --start-at=<file>" can of use "diff --rotate-to=<file>"
> to implement the feature (after all, that is why I wrote it), but
> the error condition between the two are quite different.  And ...

> > After that, there was too little work I could do,do i just need to add
> > the following
> >  code in `diff_flush_patch_all_file_pairs`?
>
>
> > if (o->rotate_to && q->nr && strcmp(q->queue[0]->one->path, o->rotate_to) &&
> > strcmp(q->queue[0]->one->path, o->rotate_to)) {
> >     error(_("could not find start file name '%s'"), o->rotate_to);
> >         return;
> > }
>
> ... that is why an unconditional change like this in diff.c is not
> acceptable, as it would break other codepaths like "git log -p".  If
> we were to add an error there, it has to be very limited to exclude
> at least "log -p"---there may be other features that share the code
> that should not trigger an error for similar reasons.
>
> If diffcore-rotate chooses "missing rotate-to file makes it a no-op"
> as its semantics, and if "difftool --start-at" does not want to see
> a misspelt filename making it a no-op, then the latter needs to
> ensure that the name it got from the user is indeed in the set of
> paths that have changes before running "diff --rotate-to" to
> implement its "difftool --start-at" feature.
>
> The "missing rotate-to file in the diff_queue MUST NOT cause
> diffcore-rotate to error out" rule is probably unnegitiable, but
> there are other ways to make it easier to use, though.
>
> For example, we could change the rotate-to logic to mean "start at
> the given path, or if such a path does not exist in the diff_queue,
> then the first path that sorts after the given path is where we
> start".  That way, if the diff_queue has paths A B C D E and
> rotate-to gives C, then we rotate the output to C D E A B.  And if
> the diff_queue has A B D E and rotate-to gives C, then the output
> would become D E A B (instead of becoming a no-op).  Then, a mistyped
> filename may not do what the user wanted to do (after all, that is
> the definition of MIStyping), but it would do something noticeable
> by the user, which may be useful enough and at least would let the
> user notice the mistake.
>
In doing this, I think the processing methods of difftool and other diffs
are unified.I think this kind of processing is actually very easy,
just need to change
> if (!strcmp(rotate_to_filename, q->queue[i]->two->path))
to
> if (strcmp(rotate_to_filename, q->queue[i]->two->path) <= 0)
Of course, it might be better if there is an algorithm that can achieve
the highest degree of file name matching.

Now that `difftool --start-at` and `diff --rotate-to` are unified effects,
is "start-at" just an alias for "rotate-to"?
Or do I need to write like this?
> OPT_STRING(0, "start-with", &options->rotate-to, N_("<path>"),
>   N_("pass from difftool to diff, has the same effort as `rotate-to`")),

> > In addition, Do I need to do the documentation and tests related to
> > your `diff --rotate-to`?
>
> Once we know how we want "diff --rotate-to" to behave exactly, I can
> help that part further, if you want.  And then you can build on top.
>
> But we need to design exactly what the desired semantics would be
> before any of that.
>
> Thanks.
>
Thanks.
diff mbox series

Patch

diff --git a/Documentation/git-difftool.txt b/Documentation/git-difftool.txt
index 484c485fd06c..552be097dfea 100644
--- a/Documentation/git-difftool.txt
+++ b/Documentation/git-difftool.txt
@@ -34,6 +34,9 @@  OPTIONS
 	This is the default behaviour; the option is provided to
 	override any configuration settings.
 
+--start-from::
+	Start viewing diff from the specified file.
+
 -t <tool>::
 --tool=<tool>::
 	Use the diff tool specified by <tool>.  Valid values include
diff --git a/builtin/difftool.c b/builtin/difftool.c
index 6e18e623fddf..67d2befa1210 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -690,7 +690,7 @@  int cmd_difftool(int argc, const char **argv, const char *prefix)
 {
 	int use_gui_tool = 0, dir_diff = 0, prompt = -1, symlinks = 0,
 	    tool_help = 0, no_index = 0;
-	static char *difftool_cmd = NULL, *extcmd = NULL;
+	static char *difftool_cmd = NULL, *extcmd = NULL, *start_file = NULL;
 	struct option builtin_difftool_options[] = {
 		OPT_BOOL('g', "gui", &use_gui_tool,
 			 N_("use `diff.guitool` instead of `diff.tool`")),
@@ -714,6 +714,8 @@  int cmd_difftool(int argc, const char **argv, const char *prefix)
 		OPT_STRING('x', "extcmd", &extcmd, N_("command"),
 			   N_("specify a custom command for viewing diffs")),
 		OPT_ARGUMENT("no-index", &no_index, N_("passed to `diff`")),
+		OPT_STRING(0, "start-from", &start_file, N_("start-from"),
+			   N_("start viewing diff from the specified file")),
 		OPT_END()
 	};
 
@@ -724,6 +726,9 @@  int cmd_difftool(int argc, const char **argv, const char *prefix)
 			     builtin_difftool_usage, PARSE_OPT_KEEP_UNKNOWN |
 			     PARSE_OPT_KEEP_DASHDASH);
 
+	if (start_file)
+		setenv("START_FILE", start_file, 1);
+
 	if (tool_help)
 		return print_tool_help();
 
diff --git a/diff.c b/diff.c
index 69e3bc00ed8f..cdad26f99063 100644
--- a/diff.c
+++ b/diff.c
@@ -4249,6 +4249,7 @@  static void run_external_diff(const char *pgm,
 			      const char *xfrm_msg,
 			      struct diff_options *o)
 {
+	const char *start_file = NULL;
 	struct strvec argv = STRVEC_INIT;
 	struct strvec env = STRVEC_INIT;
 	struct diff_queue_struct *q = &diff_queued_diff;
@@ -4272,9 +4273,17 @@  static void run_external_diff(const char *pgm,
 
 	diff_free_filespec_data(one);
 	diff_free_filespec_data(two);
+
+	start_file = xstrdup_or_null(getenv("START_FILE"));
+	if (start_file) {
+		if (strcmp(start_file, name))
+			goto finish;
+		unsetenv("START_FILE");
+	}
 	if (run_command_v_opt_cd_env(argv.v, RUN_USING_SHELL, NULL, env.v))
 		die(_("external diff died, stopping at %s"), name);
 
+finish:
 	remove_tempfile();
 	strvec_clear(&argv);
 	strvec_clear(&env);
diff --git a/diff.h b/diff.h
index 2ff2b1c7f2ca..f67c43f5af95 100644
--- a/diff.h
+++ b/diff.h
@@ -86,18 +86,18 @@  typedef void (*diff_format_fn_t)(struct diff_queue_struct *q,
 
 typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data);
 
-#define DIFF_FORMAT_RAW		0x0001
-#define DIFF_FORMAT_DIFFSTAT	0x0002
-#define DIFF_FORMAT_NUMSTAT	0x0004
-#define DIFF_FORMAT_SUMMARY	0x0008
-#define DIFF_FORMAT_PATCH	0x0010
-#define DIFF_FORMAT_SHORTSTAT	0x0020
-#define DIFF_FORMAT_DIRSTAT	0x0040
+#define DIFF_FORMAT_RAW		(1U<<0)
+#define DIFF_FORMAT_DIFFSTAT	(1U<<1)
+#define DIFF_FORMAT_NUMSTAT	(1U<<2)
+#define DIFF_FORMAT_SUMMARY	(1U<<3)
+#define DIFF_FORMAT_PATCH	(1U<<4)
+#define DIFF_FORMAT_SHORTSTAT	(1U<<5)
+#define DIFF_FORMAT_DIRSTAT	(1U<<6)
 
 /* These override all above */
-#define DIFF_FORMAT_NAME	0x0100
-#define DIFF_FORMAT_NAME_STATUS	0x0200
-#define DIFF_FORMAT_CHECKDIFF	0x0400
+#define DIFF_FORMAT_NAME	(1U<<8)
+#define DIFF_FORMAT_NAME_STATUS	(1U<<9)
+#define DIFF_FORMAT_CHECKDIFF	(1U<<10)
 
 /* Same as output_format = 0 but we know that -s flag was given
  * and we should not give default value to output_format.
diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh
index 9662abc1e784..74baac96a23f 100755
--- a/t/t7800-difftool.sh
+++ b/t/t7800-difftool.sh
@@ -762,4 +762,16 @@  test_expect_success 'difftool --gui, --tool and --extcmd are mutually exclusive'
 	test_must_fail git difftool --gui --tool=test-tool --extcmd=cat
 '
 
+test_expect_success 'difftool --start-from' '
+	difftool_test_setup &&
+	test_when_finished git reset --hard &&
+	echo 1 >1 &&
+	echo 2 >2 &&
+	echo 3 >3 &&
+	git add 1 2 3 &&
+	git commit -a -m "123" &&
+	git difftool --start-from="2" HEAD^ 2>&1 >output &&
+	test_line_count = 4 output
+'
+
 test_done