Message ID | 20230324170800.331022-1-jacob.e.keller@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v3] blame: allow --contents to work with non-HEAD commit |

Jacob Keller <jacob.e.keller@intel.com> writes:

> Changes since v2:
> * Improved commit message further
> * Re-wrote documentation for --contents. I really had trouble figuring out a
>   good succinct way to explain the behavior, so hopefully this is good.
> * Updated the comment in setup_scoreboard.

Looking good. Will queue. Thanks for a quick update.

Hi Jacob! Great to see you at Review Club :)

If you'd like, you can review the notes here:

  https://docs.google.com/document/d/14L8BAumGTpsXpjDY8VzZ4rRtpAjuGrFSRqn3stCuS_w/edit

but as always, all important discussion happens on the ML.

I see that this patch is already queued for 'master' (which is fine),
though I think it might be even better with another patch on top.

Jacob Keller <jacob.e.keller@intel.com> writes:

> From: Jacob Keller <jacob.keller@gmail.com>
>
> The --contents option can be used with git blame to blame the file as if
> it had the contents from the specified file. This is akin to copying the
> contents into the working tree and then running git blame. This option
> has been supported since 1cfe77333f27 ("git-blame: no rev means start
> from the working tree file.")
>
> The --contents option always blames the file as if it was based on the
> current HEAD commit. If you try to pass a revision while using
> --contents, you get the following error:
>
>   fatal: cannot use --contents with final commit object name
>
> This is because the blame process generates a fake working tree commit
> which always uses the HEAD object as its sole parent.
>
> Enhance fake_working_tree_commit to take the object ID to use for the
> parent instead of always using the HEAD object. Then, always generate a
> fake commit when we have contents provided, even if we have a final
> object. Remove the check to disallow --contents and a final revision.

I thought that the commit message was very clear and provided enough
context even for reviewers who weren't familiar with "git blame
--contents". Thanks!

I'll reorder the patch hunks to make discussion easier:

> diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh
> index f1b9a6ce4dae..b35be20cf327 100644
> --- a/t/annotate-tests.sh
> +++ b/t/annotate-tests.sh
> @@ -98,6 +108,10 @@ test_expect_success 'blame 2 authors' '
>  	check_count A 2 B 2
>  '
>
> +test_expect_success 'blame with --contents and revision' '
> +	check_count -h testTag --contents=file A 2 "Not Committed Yet" 2
> +'
> +

As the test notes, the author of the changes is "Not Committed Yet"...

> diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
> index 9a663535f443..95599bd6e5f4 100644
> --- a/Documentation/blame-options.txt
> +++ b/Documentation/blame-options.txt
> @@ -64,11 +64,11 @@ include::line-range-format.txt[]
>  	manual page.
>
>  --contents <file>::
> -	When <rev> is not specified, the command annotates the
> -	changes starting backwards from the working tree copy.
> -	This flag makes the command pretend as if the working
> -	tree copy has the contents of the named file (specify
> -	`-` to make the command read from the standard input).
> +	Pretend the file being annotated has a commit with the
> +	contents from the named file and a parent of <rev>,
> +	defaulting to HEAD when no <rev> is specified. You may
> +	specify '-' to make the command read from the standard
> +	input for the file contents.

which I found quite difficult to reconcile with the description here,
in particular:

  Pretend the file being annotated has a commit...

We could try to make the two more coherent by rewording the docs, maybe:

  Pretend that the working copy has the contents of the named file. If
  <rev> is also given, also pretend that HEAD is at <rev>.

But (as Junio suggested in Review Club), maybe it would be better to
just change "Not Committed Yet" to something more accurate, like
"External file (--contents)", and then we can drop the language around
"pretend".

It would be pretty simple, too. Here's a rough patch (that I don't mind
you sending as your own):

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
index 95599bd6e5..4a861ff31c 100644
--- a/Documentation/blame-options.txt
+++ b/Documentation/blame-options.txt
@@ -64,11 +64,10 @@ include::line-range-format.txt[]
 	manual page.
 
 --contents <file>::
-	Pretend the file being annotated has a commit with the
-	contents from the named file and a parent of <rev>,
-	defaulting to HEAD when no <rev> is specified. You may
-	specify '-' to make the command read from the standard
-	input for the file contents.
+	Annotate using the contents from the named file instead of the
+	working tree copy, starting with <rev> if it is specified, and
+	HEAD otherwise. You may specify '-' to make the command read
+	from the standard input for the file contents.
 
 --date <format>::
 	Specifies the format used to output dates. If --date is not
diff --git a/blame.c b/blame.c
index 2d02cf0636..129dae7641 100644
--- a/blame.c
+++ b/blame.c
@@ -204,8 +204,12 @@ static struct commit *fake_working_tree_commit(struct repository *r,
 
 	origin = make_origin(commit, path);
 
-	ident = fmt_ident("Not Committed Yet", "not.committed.yet",
-			  WANT_BLANK_IDENT, NULL, 0);
+	if (contents_from)
+		ident = fmt_ident("External file (--contents)", "external.file",
+				  WANT_BLANK_IDENT, NULL, 0);
+	else
+		ident = fmt_ident("Not Committed Yet", "not.committed.yet",
+				  WANT_BLANK_IDENT, NULL, 0);
 	strbuf_addstr(&msg, "tree 0000000000000000000000000000000000000000\n");
 	for (parent = commit->parents; parent; parent = parent->next)
 		strbuf_addf(&msg, "parent %s\n",
diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh
index b35be20cf3..2ef70235b1 100644
--- a/t/annotate-tests.sh
+++ b/t/annotate-tests.sh
@@ -72,6 +72,13 @@ test_expect_success 'blame 1 author' '
 	check_count A 2
 '
 
+test_expect_success 'blame working copy' '
+	test_when_finished "git restore file" &&
+	echo "1A quick brown fox jumps over the" >file &&
+	echo "another lazy dog" >>file &&
+	check_count A 1 "Not Committed Yet" 1
+'
+
 test_expect_success 'blame with --contents' '
 	check_count --contents=file A 2
 '
@@ -79,7 +86,7 @@ test_expect_success 'blame with --contents' '
 test_expect_success 'blame with --contents changed' '
 	echo "1A quick brown fox jumps over the" >contents &&
 	echo "another lazy dog" >>contents &&
-	check_count --contents=contents A 1 "Not Committed Yet" 1
+	check_count --contents=contents A 1 "External file (--contents)" 1
 '
 
 test_expect_success 'blame in a bare repo without starting commit' '
@@ -109,7 +116,7 @@ test_expect_success 'blame 2 authors' '
 '
 
 test_expect_success 'blame with --contents and revision' '
-	check_count -h testTag --contents=file A 2 "Not Committed Yet" 2
+	check_count -h testTag --contents=file A 2 "External file (--contents)" 2
 '
 
 test_expect_success 'setup B1 lines (branch1)' '
----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----

The patch doesn't do anything special, except that we'd need to make
sure to test that the working copy is annotated correctly (we didn't
have tests for this before??).

The doc and the fake commit author could use some tweaking, but IMO the
end result is a lot easier to explain to end users, even in the case
where "--contents" is given without <rev>.

The _one_ reason not to merge this is that this changes the annotated
author with "--contents", which could cause trouble for end users who
are parsing the output and expecting literal "Not Committed Yet". Given
our history with Hyrum's law, surely _someone_ is doing this in the
non-"--contents" case, but I suspect that "--contents" is uncommon
enough that we could take the chance.

On Thu, Mar 30, 2023 at 1:46 PM Glen Choo <chooglen@google.com> wrote:
>
> Hi Jacob! Great to see you at Review Club :)
>
> If you'd like, you can review the notes here:
>
>   https://docs.google.com/document/d/14L8BAumGTpsXpjDY8VzZ4rRtpAjuGrFSRqn3stCuS_w/edit
>
> but as always, all important discussion happens on the ML.
>
> I see that this patch is already queued for 'master' (which is fine),
> though I think it might be even better with another patch on top.
>

Yep. I'm happy to send some follow up.

> Jacob Keller <jacob.e.keller@intel.com> writes:
>
> > From: Jacob Keller <jacob.keller@gmail.com>
> >
> > The --contents option can be used with git blame to blame the file as if
> > it had the contents from the specified file. This is akin to copying the
> > contents into the working tree and then running git blame. This option
> > has been supported since 1cfe77333f27 ("git-blame: no rev means start
> > from the working tree file.")
> >
> > The --contents option always blames the file as if it was based on the
> > current HEAD commit. If you try to pass a revision while using
> > --contents, you get the following error:
> >
> >   fatal: cannot use --contents with final commit object name
> >
> > This is because the blame process generates a fake working tree commit
> > which always uses the HEAD object as its sole parent.
> >
> > Enhance fake_working_tree_commit to take the object ID to use for the
> > parent instead of always using the HEAD object. Then, always generate a
> > fake commit when we have contents provided, even if we have a final
> > object. Remove the check to disallow --contents and a final revision.
>
> I thought that the commit message was very clear and provided enough
> context even for reviewers who weren't familiar with "git blame
> --contents". Thanks!
>
> I'll reorder the patch hunks to make discussion easier:
>

Thanks!

> > diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh
> > index f1b9a6ce4dae..b35be20cf327 100644
> > --- a/t/annotate-tests.sh
> > +++ b/t/annotate-tests.sh
> > @@ -98,6 +108,10 @@ test_expect_success 'blame 2 authors' '
> >  	check_count A 2 B 2
> >  '
> >
> > +test_expect_success 'blame with --contents and revision' '
> > +	check_count -h testTag --contents=file A 2 "Not Committed Yet" 2
> > +'
> > +
>
> As the test notes, the author of the changes is "Not Committed Yet"...
>
> > diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
> > index 9a663535f443..95599bd6e5f4 100644
> > --- a/Documentation/blame-options.txt
> > +++ b/Documentation/blame-options.txt
> > @@ -64,11 +64,11 @@ include::line-range-format.txt[]
> >  	manual page.
> >
> >  --contents <file>::
> > -	When <rev> is not specified, the command annotates the
> > -	changes starting backwards from the working tree copy.
> > -	This flag makes the command pretend as if the working
> > -	tree copy has the contents of the named file (specify
> > -	`-` to make the command read from the standard input).
> > +	Pretend the file being annotated has a commit with the
> > +	contents from the named file and a parent of <rev>,
> > +	defaulting to HEAD when no <rev> is specified. You may
> > +	specify '-' to make the command read from the standard
> > +	input for the file contents.
>
> which I found quite difficult to reconcile with the description here,
> in particular:
>
>   Pretend the file being annotated has a commit...
>
> We could try to make the two more coherent by rewording the docs, maybe:
>
>   Pretend that the working copy has the contents of the named file. If
>   <rev> is also given, also pretend that HEAD is at <rev>.
>

It is quite tricky to get a concise description here.

> But (as Junio suggested in Review Club), maybe it would be better to
> just change "Not Committed Yet" to something more accurate, like
> "External file (--contents)", and then we can drop the language around
> "pretend".
>

I think this is a good direction. It makes the output distinguishable
from the working tree output.

> It would be pretty simple, too. Here's a rough patch (that I don't mind
> you sending as your own):
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
> diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
> index 95599bd6e5..4a861ff31c 100644
> --- a/Documentation/blame-options.txt
> +++ b/Documentation/blame-options.txt
> @@ -64,11 +64,10 @@ include::line-range-format.txt[]
>  	manual page.
>
>  --contents <file>::
> -	Pretend the file being annotated has a commit with the
> -	contents from the named file and a parent of <rev>,
> -	defaulting to HEAD when no <rev> is specified. You may
> -	specify '-' to make the command read from the standard
> -	input for the file contents.
> +	Annotate using the contents from the named file instead of the
> +	working tree copy, starting with <rev> if it is specified, and
> +	HEAD otherwise. You may specify '-' to make the command read
> +	from the standard input for the file contents.
>

I think I would reword this slightly:

  Annotate using the contents from the named file, starting from <rev>
  if it is specified, and HEAD otherwise.

I do not think we need "instead of the working tree copy", as this is
confusing: if <rev> is specified, we do not use the working tree copy
today. This is where the difficulty in being concise lies. I think
just not mentioning the working copy is correct here. Perhaps we need
to mention elsewhere in the documentation that the working copy is used
if no <rev> is specified?

>  --date <format>::
>  	Specifies the format used to output dates. If --date is not
> diff --git a/blame.c b/blame.c
> index 2d02cf0636..129dae7641 100644
> --- a/blame.c
> +++ b/blame.c
> @@ -204,8 +204,12 @@ static struct commit *fake_working_tree_commit(struct repository *r,
>
>  	origin = make_origin(commit, path);
>
> -	ident = fmt_ident("Not Committed Yet", "not.committed.yet",
> -			  WANT_BLANK_IDENT, NULL, 0);
> +	if (contents_from)
> +		ident = fmt_ident("External file (--contents)", "external.file",
> +				  WANT_BLANK_IDENT, NULL, 0);
> +	else
> +		ident = fmt_ident("Not Committed Yet", "not.committed.yet",
> +				  WANT_BLANK_IDENT, NULL, 0);
>  	strbuf_addstr(&msg, "tree 0000000000000000000000000000000000000000\n");
>  	for (parent = commit->parents; parent; parent = parent->next)
>  		strbuf_addf(&msg, "parent %s\n",
> diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh
> index b35be20cf3..2ef70235b1 100644
> --- a/t/annotate-tests.sh
> +++ b/t/annotate-tests.sh
> @@ -72,6 +72,13 @@ test_expect_success 'blame 1 author' '
>  	check_count A 2
>  '
>
> +test_expect_success 'blame working copy' '
> +	test_when_finished "git restore file" &&
> +	echo "1A quick brown fox jumps over the" >file &&
> +	echo "another lazy dog" >>file &&
> +	check_count A 1 "Not Committed Yet" 1
> +'
> +
>  test_expect_success 'blame with --contents' '
>  	check_count --contents=file A 2
>  '
> @@ -79,7 +86,7 @@ test_expect_success 'blame with --contents' '
>  test_expect_success 'blame with --contents changed' '
>  	echo "1A quick brown fox jumps over the" >contents &&
>  	echo "another lazy dog" >>contents &&
> -	check_count --contents=contents A 1 "Not Committed Yet" 1
> +	check_count --contents=contents A 1 "External file (--contents)" 1
>  '
>
>  test_expect_success 'blame in a bare repo without starting commit' '
> @@ -109,7 +116,7 @@ test_expect_success 'blame 2 authors' '
>  '
>
>  test_expect_success 'blame with --contents and revision' '
> -	check_count -h testTag --contents=file A 2 "Not Committed Yet" 2
> +	check_count -h testTag --contents=file A 2 "External file (--contents)" 2
>  '
>
>  test_expect_success 'setup B1 lines (branch1)' '
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>
> The patch doesn't do anything special, except that we'd need to make
> sure to test that the working copy is annotated correctly (we didn't
> have tests for this before??).

We check the authorship with the tests I added now.

>
> The doc and the fake commit author could use some tweaking, but IMO the
> end result is a lot easier to explain to end users, even in the case
> where "--contents" is given without <rev>.
>

Yeah, I think changing the author name is good.

> The _one_ reason not to merge this is that this changes the annotated
> author with "--contents", which could cause trouble for end users who
> are parsing the output and expecting literal "Not Committed Yet". Given
> our history with Hyrum's law, surely _someone_ is doing this in the
> non-"--contents" case, but I suspect that "--contents" is uncommon
> enough that we could take the chance.

I am one of those people using --contents and analyzing the output...
but at least in my case I'm looking for the 00000000 commit ID instead
of looking for the author name.

I'm in favor of changing the output, but I'd like to hear other opinions here.

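For scripts that consume blame output, the commit-ID check mentioned above can be sketched in shell. This is only a sketch and not part of either patch: `count_uncommitted_lines` is a hypothetical helper name, and it assumes `git blame --porcelain` output, where each blamed line is introduced by a header of the form `<commit-id> <orig-line> <final-line> [<count>]`, the line content follows prefixed by a TAB, and the fake commit appears with an all-zero ID. Matching on the ID rather than the author string should behave the same whether the author reads "Not Committed Yet" or "External file (--contents)".

```shell
# Hypothetical helper (not from the patch): count the lines that
# `git blame --porcelain` output on stdin attributes to the fake,
# all-zero commit.
count_uncommitted_lines() {
	awk '
		# Per-line header: "<commit-id> <orig-line> <final-line> [<count>]".
		# Remember whether the current entry belongs to the all-zero ID.
		/^[0-9a-f]+ [0-9]+ [0-9]+/ { zero = ($1 ~ /^0+$/); next }
		# The file content line of each entry starts with a TAB.
		/^\t/ { if (zero) n++ }
		END { print n + 0 }
	'
}
```

Something like `git blame --porcelain --contents=<file> <rev> -- <path> | count_uncommitted_lines` would then print how many lines are not attributed to any real commit.
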
diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
index 9a663535f443..95599bd6e5f4 100644
--- a/Documentation/blame-options.txt
+++ b/Documentation/blame-options.txt
@@ -64,11 +64,11 @@ include::line-range-format.txt[]
 	manual page.
 
 --contents <file>::
-	When <rev> is not specified, the command annotates the
-	changes starting backwards from the working tree copy.
-	This flag makes the command pretend as if the working
-	tree copy has the contents of the named file (specify
-	`-` to make the command read from the standard input).
+	Pretend the file being annotated has a commit with the
+	contents from the named file and a parent of <rev>,
+	defaulting to HEAD when no <rev> is specified. You may
+	specify '-' to make the command read from the standard
+	input for the file contents.
 
 --date <format>::
 	Specifies the format used to output dates. If --date is not
diff --git a/Documentation/git-blame.txt b/Documentation/git-blame.txt
index 4400a17330b4..f69a871a96f7 100644
--- a/Documentation/git-blame.txt
+++ b/Documentation/git-blame.txt
@@ -12,7 +12,7 @@ SYNOPSIS
 	    [-L <range>] [-S <revs-file>] [-M] [-C] [-C] [-C] [--since=<date>]
 	    [--ignore-rev <rev>] [--ignore-revs-file <file>]
 	    [--color-lines] [--color-by-age] [--progress] [--abbrev=<n>]
-	    [<rev> | --contents <file> | --reverse <rev>..<rev>] [--] <file>
+	    [ --contents <file> ] [<rev> | --reverse <rev>..<rev>] [--] <file>
 
 DESCRIPTION
 -----------
diff --git a/blame.c b/blame.c
index e45d8a3bf92a..2d02cf0636ca 100644
--- a/blame.c
+++ b/blame.c
@@ -177,12 +177,12 @@ static void set_commit_buffer_from_strbuf(struct repository *r,
 static struct commit *fake_working_tree_commit(struct repository *r,
 					       struct diff_options *opt,
 					       const char *path,
-					       const char *contents_from)
+					       const char *contents_from,
+					       struct object_id *oid)
 {
 	struct commit *commit;
 	struct blame_origin *origin;
 	struct commit_list **parent_tail, *parent;
-	struct object_id head_oid;
 	struct strbuf buf = STRBUF_INIT;
 	const char *ident;
 	time_t now;
@@ -198,10 +198,7 @@ static struct commit *fake_working_tree_commit(struct repository *r,
 	commit->date = now;
 	parent_tail = &commit->parents;
 
-	if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, &head_oid, NULL))
-		die("no such ref: HEAD");
-
-	parent_tail = append_parent(r, parent_tail, &head_oid);
+	parent_tail = append_parent(r, parent_tail, oid);
 	append_merge_parents(r, parent_tail);
 	verify_working_tree_path(r, commit, path);
 
@@ -2772,22 +2769,37 @@ void setup_scoreboard(struct blame_scoreboard *sb,
 		sb->commits.compare = compare_commits_by_reverse_commit_date;
 	}
 
-	if (sb->final && sb->contents_from)
-		die(_("cannot use --contents with final commit object name"));
-
 	if (sb->reverse && sb->revs->first_parent_only)
 		sb->revs->children.name = NULL;
 
-	if (!sb->final) {
+	if (sb->contents_from || !sb->final) {
+		struct object_id head_oid, *parent_oid;
+
 		/*
-		 * "--not A B -- path" without anything positive;
-		 * do not default to HEAD, but use the working tree
-		 * or "--contents".
+		 * Build a fake commit at the top of the history, when
+		 * (1) "git blame ^A ^B --path", i.e. without any positive end
+		 *     end of the history range, in which case we build such
+		 *     a fake commit on top of the HEAD to blame in-tree
+		 *     modifications.
+		 * (2) "git blame --contents=file [A] -- path", with or
+		 *     without positive end of the history range but with
+		 *     --contents, in which case we pretend that there is
+		 *     a fake commit on top of the positive end (defaulting to
+		 *     HEAD) that has the given contents in the path.
 		 */
+		if (sb->final) {
+			parent_oid = &sb->final->object.oid;
+		} else {
+			if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, &head_oid, NULL))
+				die("no such ref: HEAD");
+			parent_oid = &head_oid;
+		}
+
 		setup_work_tree();
 		sb->final = fake_working_tree_commit(sb->repo,
 						     &sb->revs->diffopt,
-						     sb->path, sb->contents_from);
+						     sb->path, sb->contents_from,
+						     parent_oid);
 		add_pending_object(sb->revs, &(sb->final->object), ":");
 	}
 
diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh
index f1b9a6ce4dae..b35be20cf327 100644
--- a/t/annotate-tests.sh
+++ b/t/annotate-tests.sh
@@ -72,6 +72,16 @@ test_expect_success 'blame 1 author' '
 	check_count A 2
 '
 
+test_expect_success 'blame with --contents' '
+	check_count --contents=file A 2
+'
+
+test_expect_success 'blame with --contents changed' '
+	echo "1A quick brown fox jumps over the" >contents &&
+	echo "another lazy dog" >>contents &&
+	check_count --contents=contents A 1 "Not Committed Yet" 1
+'
+
 test_expect_success 'blame in a bare repo without starting commit' '
 	git clone --bare . bare.git &&
 	(
@@ -98,6 +108,10 @@ test_expect_success 'blame 2 authors' '
 	check_count A 2 B 2
 '
 
+test_expect_success 'blame with --contents and revision' '
+	check_count -h testTag --contents=file A 2 "Not Committed Yet" 2
+'
+
 test_expect_success 'setup B1 lines (branch1)' '
 	git checkout -b branch1 main &&
 	echo "3A slow green fox jumps into the" >>file &&