diff mbox series

[v2,2/2] revision: allow --ancestry-path to take an argument

Message ID f580ec6d06072ea6ed2ecc4f8142b94fccbe4c0f.1660803467.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Allow --ancestry-path to take an argument | expand

Commit Message

Elijah Newren Aug. 18, 2022, 6:17 a.m. UTC
From: Elijah Newren <newren@gmail.com>

We have long allowed users to run e.g.
    git log --ancestry-path master..seen
which shows all commits which satisfy all three of these criteria:
  * are an ancestor of seen
  * are not an ancestor master
  * have master as an ancestor

This commit allows another variant:
    git log --ancestry-path=$TOPIC master..seen
which shows all commits which satisfy all of these criteria:
  * are an ancestor of seen
  * are not an ancestor of master
  * have $TOPIC in their ancestry-path
that last bullet can be defined as commits meeting any of these
criteria:
    * are an ancestor of $TOPIC
    * have $TOPIC as an ancestor
    * are $TOPIC

This also allows multiple --ancestry-path arguments, which can be
used to find commits with any of the given topics in their ancestry
path.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/rev-list-options.txt | 45 ++++++++++++----
 object.h                           |  2 +-
 revision.c                         | 84 +++++++++++++++++++-----------
 revision.h                         |  3 ++
 t/t6019-rev-list-ancestry-path.sh  | 47 ++++++++++++++++-
 5 files changed, 138 insertions(+), 43 deletions(-)

Comments

Derrick Stolee Aug. 18, 2022, 3:30 p.m. UTC | #1
On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>

Code looks good!

> diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
> index af57a04b7ff..d3657737fa6 100755
> --- a/t/t6019-rev-list-ancestry-path.sh
> +++ b/t/t6019-rev-list-ancestry-path.sh
> @@ -8,8 +8,13 @@ test_description='--ancestry-path'
>  #   /                     \
>  #  A-------K---------------L--M
>  #
> -#  D..M                 == E F G H I J K L M
> -#  --ancestry-path D..M == E F H I J L M
> +#  D..M                                     == E F G H I J K L M
> +#  --ancestry-path                     D..M == E F   H I J   L M
> +#  --ancestry-path=F                   D..M == E F       J   L M
> +#  --ancestry-path=G                   D..M ==     G H I J   L M
> +#  --ancestry-path=H                   D..M == E   G H I J   L M
> +#  --ancestry-path=K                   D..M ==             K L M
> +#  --ancestry-path=K --ancestry-path=F D..M == E F       J K L M

These are good examples, because they help clarify what I had initially
been confused about: you include things in _both_ directions. In
particular, "--ancestry-path=K --ancestry-path=f D..M" you are kind of
taking the union of the following queries:

	--ancestry-path D..K
	--ancestry-path K..M
	--ancestry-path D..F
	--ancestry-path F..M

I did check just in case, but specifying multiple ranges such as

	--ancestry-path D..K K..M

does not do what is expected.

> +test_expect_success 'rev-list --ancestry-path=F D..M' '
> +	test_write_lines E F J L M >expect &&
> +	git rev-list --ancestry-path=F --format=%s D..M |
> +	sed -e "/^commit /d" |
> +	sort >actual &&
> +	test_cmp expect actual
> +'

These tests follow the patterns from other tests in this file, but
it also has bad patterns. Specifically, the 'git rev-list' command
is fed directly into a pipe. I include a patch below that applies
directly on this one to rewrite these tests. If you want, you could
rebase to have that test refactor happen before you add your new
--ancestry-path=<X> option tests.

> +test_expect_success 'rev-list --ancestry-path=G D..M' '
> +	test_write_lines G H I J L M >expect &&
> +	git rev-list --ancestry-path=G --format=%s D..M |
> +	sed -e "/^commit /d" |
> +	sort >actual &&
> +	test_cmp expect actual
> +'
> +test_expect_success 'rev-list --ancestry-path=H D..M' '

nit: needs extra whitespace between tests. The above test pair
needs one, too. This becomes moot with the patch I provide.

Thanks,
-Stolee

--- >8 ---

From 9ac4e81dba0d7801513a09f5fe307d01357123dd Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Thu, 18 Aug 2022 11:25:04 -0400
Subject: [PATCH] t6019: modernize tests with helper

The tests in t6019 are repetive, so create a helper that greatly
simplifies the test script.

In addition, update the common pattern that places 'git rev-list' on the
left side of a pipe, which can hide some exit codes. Send the output to
a 'raw' file that is then consumed by other tools so the Git exit code
is verified as zero.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 t/t6019-rev-list-ancestry-path.sh | 131 +++++++-----------------------
 1 file changed, 31 insertions(+), 100 deletions(-)

diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index d3657737fa60..18941a80ce67 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -55,111 +55,42 @@ test_expect_success setup '
 	test_commit M
 '
 
-test_expect_success 'rev-list D..M' '
-	test_write_lines E F G H I J K L M >expect &&
-	git rev-list --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path D..M' '
-	test_write_lines E F H I J L M >expect &&
-	git rev-list --ancestry-path --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path=F D..M' '
-	test_write_lines E F J L M >expect &&
-	git rev-list --ancestry-path=F --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-test_expect_success 'rev-list --ancestry-path=G D..M' '
-	test_write_lines G H I J L M >expect &&
-	git rev-list --ancestry-path=G --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-test_expect_success 'rev-list --ancestry-path=H D..M' '
-	test_write_lines E G H I J L M >expect &&
-	git rev-list --ancestry-path=H --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path=K D..M' '
-	test_write_lines K L M >expect &&
-	git rev-list --ancestry-path=K --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path=F --ancestry-path=K D..M' '
-	test_write_lines E F J K L M >expect &&
-	git rev-list --ancestry-path=F --ancestry-path=K --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list D..M -- M.t' '
-	echo M >expect &&
-	git rev-list --format=%s D..M -- M.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
-	echo M >expect &&
-	git rev-list --ancestry-path --format=%s D..M -- M.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
+test_ancestry () {
+	args=$1
+	expected=$2
+	test_expect_success "rev-list $args" "
+		test_write_lines $expected >expect &&
+		git rev-list --format=%s $args >raw &&
+
+		if test -n \"$expected\"
+		then
+			sed -e \"/^commit /d\" raw | sort >actual &&
+			test_cmp expect actual || return 1
+		else
+			test_must_be_empty raw
+		fi
+	"
+}
 
-test_expect_success 'rev-list F...I' '
-	test_write_lines F G H I >expect &&
-	git rev-list --format=%s F...I |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "D..M" "E F G H I J K L M"
 
-test_expect_success 'rev-list --ancestry-path F...I' '
-	test_write_lines F H I >expect &&
-	git rev-list --ancestry-path --format=%s F...I |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "--ancestry-path D..M" "E F H I J L M"
+test_ancestry "--ancestry-path D..M" "E F H I J L M"
+test_ancestry "--ancestry-path=F D..M" "E F J L M"
+test_ancestry "--ancestry-path=G D..M" "G H I J L M"
+test_ancestry "--ancestry-path=H D..M" "E G H I J L M"
+test_ancestry "--ancestry-path=K D..M" "K L M"
+test_ancestry "--ancestry-path=K --ancestry-path=F D..M" "E F J K L M"
 
-# G.t is dropped in an "-s ours" merge
-test_expect_success 'rev-list G..M -- G.t' '
-	git rev-list --format=%s G..M -- G.t |
-	sed -e "/^commit /d" >actual &&
-	test_must_be_empty actual
-'
+test_ancestry "D..M -- M.t" "M"
+test_ancestry "--ancestry-path D..M -- M.t" "M"
 
-test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
-	echo L >expect &&
-	git rev-list --ancestry-path --format=%s G..M -- G.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
+test_ancestry "F...I" "F G H I"
+test_ancestry "--ancestry-path F...I" "F H I"
 
-test_expect_success 'rev-list --ancestry-path --simplify-merges G^..M -- G.t' '
-	test_write_lines G L >expect &&
-	git rev-list --ancestry-path --simplify-merges --format=%s G^..M -- G.t |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "G..M -- G.t" ""
+test_ancestry "--ancestry-path G..M -- G.t" "L"
+test_ancestry "--ancestry-path --simplify-merges G^..M -- G.t" "G L"
 
 #   b---bc
 #  / \ /
Ævar Arnfjörð Bjarmason Aug. 18, 2022, 3:50 p.m. UTC | #2
On Thu, Aug 18 2022, Derrick Stolee wrote:

> On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
>> +test_expect_success 'rev-list --ancestry-path=F D..M' '
>> +	test_write_lines E F J L M >expect &&
>> +	git rev-list --ancestry-path=F --format=%s D..M |
>> +	sed -e "/^commit /d" |
>> +	sort >actual &&
>> +	test_cmp expect actual
>> +'
>
> These tests follow the patterns from other tests in this file, but
> it also has bad patterns. Specifically, the 'git rev-list' command
> is fed directly into a pipe. I include a patch below that applies
> directly on this one to rewrite these tests. If you want, you could
> rebase to have that test refactor happen before you add your new
> --ancestry-path=<X> option tests.

Thanks, I was going to comment on the same, but your solution is much
better (I was just going to suggest using intermediate files).

> [...]
> -test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
> -	echo M >expect &&
> -	git rev-list --ancestry-path --format=%s D..M -- M.t |
> -	sed -e "/^commit /d" >actual &&
> -	test_cmp expect actual
> -'
> +test_ancestry () {
> +	args=$1
> +	expected=$2

Maybe add &&-chaining here (we do it in some cases, but I'm not sure
when such assignments would ever fail).

> +	test_expect_success "rev-list $args" "
> +		test_write_lines $expected >expect &&
> +		git rev-list --format=%s $args >raw &&
> +
> +		if test -n \"$expected\"

Aren't you making things harder for yourself here than required by using
""-quoting for the body of the test.

We eval it into existence, so you can use ''-quotes, and then you don't
need to escape e.g. the "" quotes here for expected, no?

> +		then
> +			sed -e \"/^commit /d\" raw | sort >actual &&

nit for debuggability (and not correctness), maybe using intermediate
files here would be nicer? And then perhaps call them "actual" and
"actual.sorted".


> +			test_cmp expect actual || return 1

No need for a "return 1" here when we're not in a loop. It's redundant,
and makes the -x output on failure confusing ("why didn't I fail on the
test_cmp, but on this stray return?...").

...

> +		else
> +			test_must_be_empty raw

...which would also allow you to extract much of this if/else at the
cost of not using test_must_be_empty, i.e. just make the "expected"
empty unless "$expected" is provided. Just a thought/nit, we could also
leave this as-is :)

Also does the "compare rev" part of this want test_cmp_rev instead?
Derrick Stolee Aug. 18, 2022, 4:51 p.m. UTC | #3
On 8/18/2022 11:50 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Aug 18 2022, Derrick Stolee wrote:
> 
>> On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
>>> +test_expect_success 'rev-list --ancestry-path=F D..M' '
>>> +	test_write_lines E F J L M >expect &&
>>> +	git rev-list --ancestry-path=F --format=%s D..M |
>>> +	sed -e "/^commit /d" |
>>> +	sort >actual &&
>>> +	test_cmp expect actual
>>> +'
>>
>> These tests follow the patterns from other tests in this file, but
>> it also has bad patterns. Specifically, the 'git rev-list' command
>> is fed directly into a pipe. I include a patch below that applies
>> directly on this one to rewrite these tests. If you want, you could
>> rebase to have that test refactor happen before you add your new
>> --ancestry-path=<X> option tests.
> 
> Thanks, I was going to comment on the same, but your solution is much
> better (I was just going to suggest using intermediate files).
> 
>> [...]
>> -test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
>> -	echo M >expect &&
>> -	git rev-list --ancestry-path --format=%s D..M -- M.t |
>> -	sed -e "/^commit /d" >actual &&
>> -	test_cmp expect actual
>> -'
>> +test_ancestry () {
>> +	args=$1
>> +	expected=$2
> 
> Maybe add &&-chaining here (we do it in some cases, but I'm not sure
> when such assignments would ever fail).

These are outside of a test_expect_success, so they are not important.
 
>> +	test_expect_success "rev-list $args" "
>> +		test_write_lines $expected >expect &&
>> +		git rev-list --format=%s $args >raw &&
>> +
>> +		if test -n \"$expected\"
> 
> Aren't you making things harder for yourself here than required by using
> ""-quoting for the body of the test.
> 
> We eval it into existence, so you can use ''-quotes, and then you don't
> need to escape e.g. the "" quotes here for expected, no?

Are "args" and "expected" in-scope if I use single quotes? I don't think
they are unless we export them. I could be wrong.

>> +		then
>> +			sed -e \"/^commit /d\" raw | sort >actual &&
> 
> nit for debuggability (and not correctness), maybe using intermediate
> files here would be nicer? And then perhaps call them "actual" and
> "actual.sorted".

I don't think that level of debuggability is important. We have the
raw file if we want to see what the Git output was.

> 
>> +			test_cmp expect actual || return 1
> 
> No need for a "return 1" here when we're not in a loop. It's redundant,
> and makes the -x output on failure confusing ("why didn't I fail on the
> test_cmp, but on this stray return?...").

Sure. I did this more out of habit, but it makes sense that we don't
need it for an if block.

> ...
> 
>> +		else
>> +			test_must_be_empty raw
> 
> ...which would also allow you to extract much of this if/else at the
> cost of not using test_must_be_empty, i.e. just make the "expected"
> empty unless "$expected" is provided. Just a thought/nit, we could also
> leave this as-is :)

*shrug* either way is fine by me.

> Also does the "compare rev" part of this want test_cmp_rev instead?

I don't know what you mean here. We are not comparing revisions anywhere,
but we are using the commit messages to create an easy comparison for the
output.

Thanks,
-Stolee
Eric Sunshine Aug. 18, 2022, 4:53 p.m. UTC | #4
On Thu, Aug 18, 2022 at 11:50 AM Derrick Stolee
<derrickstolee@github.com> wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> Subject: [PATCH] t6019: modernize tests with helper
>
> The tests in t6019 are repetive, so create a helper that greatly
> simplifies the test script.

s/repetive/repetitive/

> In addition, update the common pattern that places 'git rev-list' on the
> left side of a pipe, which can hide some exit codes. Send the output to
> a 'raw' file that is then consumed by other tools so the Git exit code
> is verified as zero.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Eric Sunshine Aug. 18, 2022, 4:56 p.m. UTC | #5
On Thu, Aug 18, 2022 at 12:01 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Thu, Aug 18 2022, Derrick Stolee wrote:
> > +test_ancestry () {
> > +     args=$1
> > +     expected=$2
>
> Maybe add &&-chaining here (we do it in some cases, but I'm not sure
> when such assignments would ever fail).

Assignment shouldn't fail, but keeping the &&-chain intact here
protects us against the unlikely event of someone inserting &&-chained
code above these assignments and not realizing that the &&-chain is
not intact at the assignment lines.
Jonathan Tan Aug. 18, 2022, 10:24 p.m. UTC | #6
Thanks - overall this looks good. I only have some minor textual
comments and one small code comment.

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
> 
> We have long allowed users to run e.g.
>     git log --ancestry-path master..seen
> which shows all commits which satisfy all three of these criteria:
>   * are an ancestor of seen
>   * are not an ancestor master

are not an ancestor *of* master

> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 2f85726745a..001e49cee55 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -389,12 +389,15 @@ Default mode::
>  	merges from the resulting history, as there are no selected
>  	commits contributing to this merge.
>  
> ---ancestry-path::
> +--ancestry-path[=<commit>]::
>  	When given a range of commits to display (e.g. 'commit1..commit2'
> -	or 'commit2 {caret}commit1'), only display commits that exist
> -	directly on the ancestry chain between the 'commit1' and
> -	'commit2', i.e. commits that are both descendants of 'commit1',
> -	and ancestors of 'commit2'.
> +	or 'commit2 {caret}commit1'), only display commits in that
> +	range where <commit> is part of the ancestry chain.  By "part of
> +	the ancestry chain", we mean including <commit> itself and
> +	commits that are either ancestors or descendants of <commit>.
> +	If no commit is specified, use 'commit1' (the excluded part of
> +	the range) as <commit>.  Can be passed multiple times to look for
> +	commits in the ancestry range of multiple commits.

"Ancestry chain" seems to be used multiple times in the Git codebase,
apparently with slightly different definitions, so probably best to
clear up at least this part by not reusing the term. It's also probably
not worth introducing a new term "ancestry range". Maybe rewrite to
say:

  When given a range of commits to display (e.g. 'commit1..commit2'
  or 'commit2 {caret}commit1'), only display commits in that
  range that are ancestors of <commit>, descendants of <commit>, or
  <commit> itself. If no commit is specified, use 'commit1' (the
  excluded part of the range) as <commit>. Can be passed multiple times;
  if so, a commit is included if it is any of the commits given or if it
  is an ancestor or descendant of one of them.

> @@ -568,11 +571,10 @@ Note the major differences in `N`, `P`, and `Q` over `--full-history`:
>  
>  There is another simplification mode available:
>  
> ---ancestry-path::
> -	Limit the displayed commits to those directly on the ancestry
> -	chain between the ``from'' and ``to'' commits in the given commit
> -	range. I.e. only display commits that are ancestor of the ``to''
> -	commit and descendants of the ``from'' commit.
> +--ancestry-path[=<commit>]::
> +	Limit the displayed commits to those containing <commit> in their
> +	ancestry path.  I.e. only display <commit> and commits which have
> +	<commit> as either a direct ancestor or descendant.

Can we refer back to the documentation of --ancestry-path instead?

> @@ -1304,13 +1304,20 @@ static int still_interesting(struct commit_list *src, timestamp_t date, int slop
>  }
>  
>  /*
> - * "rev-list --ancestry-path A..B" computes commits that are ancestors
> + * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
>   * of B but not ancestors of A but further limits the result to those
> - * that are descendants of A.  This takes the list of bottom commits and
> - * the result of "A..B" without --ancestry-path, and limits the latter
> - * further to the ones that can reach one of the commits in "bottom".
> + * that have C in their ancestry path (i.e. are either ancestors of C,
> + * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
> + * arguments are supplied, we limit the result to those that have at
> + * least one of those COMMITTISH in their ancestry path. If
> + * --ancestry-path is specified with no commit, we use all bottom
> + * commits for C.
> + *
> + * Before this function is called, ancestors of C will have already been
> + * marked with ANCESTRY_PATH previously, so we just need to also mark
> + * the descendants here, then collect both sets of commits.
>   */
> -static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
> +static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)

I thought the original description of this function ("This takes the
list...") to be clear and it would be nice to retain it. So, e.g.:

  "rev-list --ancestry-path=C_0 [--ancestry-path=C_1 ...] A..B" computes commits
  that are ancestors of B but not ancestors of A but further limits the
  result to those that have any of C in their ancestry path (i.e. are
  either ancestors of any of C, descendants of any of C, or are any of
  C). If --ancestry-path is specified with no commit, we use all bottom
  commits for C.
  
  Before this function is called, ancestors of C will have already been
  marked with ANCESTRY_PATH previously.

  This takes the list of bottom commits and the result of "A..B" without
  --ancestry-path, and limits the latter further to the ones that have
  any of C in their ancestry path. Since the ancestors of C have already
  been marked (a prerequisite of this function), we just need to mark
  the descendants, then exclude any commit that does not have any of
  these marks.

Optional: Besides that, from what I can tell, sometimes the C commits
themselves are marked with ANCESTRY_PATH (when they are explicitly
specified) and sometimes they are not (when they are not explicitly
specified). It's not a bug here, but it might be worth handling that in
the ancestry_path_need_bottoms codepath (instead of explicitly setting
TMP_MARK on the bottoms in limit_to_ancestry() - that way, I think we
can also use ANCESTRY_PATH instead of TMP_MARK throughout the ancestry
path codepaths, but I haven't tested it), at least to prevent possible
future bugs.

> @@ -2213,7 +2220,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  			       const struct setup_revision_opt* opt)
>  {
>  	const char *arg = argv[0];
> -	const char *optarg;
> +	const char *optarg = NULL;
>  	int argcount;
>  	const unsigned hexsz = the_hash_algo->hexsz;

[snip]

> @@ -2280,10 +2287,26 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  		revs->first_parent_only = 1;
>  	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
>  		revs->exclude_first_parent_only = 1;
> -	} else if (!strcmp(arg, "--ancestry-path")) {
> +	} else if (!strcmp(arg, "--ancestry-path") ||
> +		   skip_prefix(arg, "--ancestry-path=", &optarg)) {

This and the above hunk might cause bugs if --ancestry-path was first
specified with a commit and then specified without. Probably best to
break this up into separate "else if" branches, even though there is a
bit of code duplication (and also remove the "= NULL" addition in the
above hunk).

> @@ -164,6 +165,7 @@ struct rev_info {
>  			cherry_mark:1,
>  			bisect:1,
>  			ancestry_path:1,
> +			ancestry_path_need_bottoms:1,

Might be better named as ancestry_path_implicit_bottoms? And probably
worth documenting, e.g.

  True if --ancestry-path was specified without an argument. The bottom
  revisions are implicitly the arguments in this case.
Elijah Newren Aug. 19, 2022, 1:01 a.m. UTC | #7
On Thu, Aug 18, 2022 at 8:30 AM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
>
> Code looks good!
>
> > diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
> > index af57a04b7ff..d3657737fa6 100755
> > --- a/t/t6019-rev-list-ancestry-path.sh
> > +++ b/t/t6019-rev-list-ancestry-path.sh
> > @@ -8,8 +8,13 @@ test_description='--ancestry-path'
> >  #   /                     \
> >  #  A-------K---------------L--M
> >  #
> > -#  D..M                 == E F G H I J K L M
> > -#  --ancestry-path D..M == E F H I J L M
> > +#  D..M                                     == E F G H I J K L M
> > +#  --ancestry-path                     D..M == E F   H I J   L M
> > +#  --ancestry-path=F                   D..M == E F       J   L M
> > +#  --ancestry-path=G                   D..M ==     G H I J   L M
> > +#  --ancestry-path=H                   D..M == E   G H I J   L M
> > +#  --ancestry-path=K                   D..M ==             K L M
> > +#  --ancestry-path=K --ancestry-path=F D..M == E F       J K L M
>
> These are good examples, because they help clarify what I had initially
> been confused about: you include things in _both_ directions. In
> particular, "--ancestry-path=K --ancestry-path=f D..M" you are kind of
> taking the union of the following queries:
>
>         --ancestry-path D..K
>         --ancestry-path K..M
>         --ancestry-path D..F
>         --ancestry-path F..M
>
> I did check just in case, but specifying multiple ranges such as
>
>         --ancestry-path D..K K..M
>
> does not do what is expected.

Right, because there's no such thing as multiple ranges.  Quoting from the docs:

"""
Commands that are specifically designed to take two distinct ranges
(e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
they are exceptions.  Unless otherwise noted, all "git" commands
that operate on a set of commits work on a single revision range.
In other words, writing two "two-dot range notation" next to each
other, e.g.

    $ git log A..B C..D

does *not* specify two revision ranges for most commands.  Instead
it will name a single connected set of commits, i.e. those that are
reachable from either B or D but are reachable from neither A or C.
"""

>
> > +test_expect_success 'rev-list --ancestry-path=F D..M' '
> > +     test_write_lines E F J L M >expect &&
> > +     git rev-list --ancestry-path=F --format=%s D..M |
> > +     sed -e "/^commit /d" |
> > +     sort >actual &&
> > +     test_cmp expect actual
> > +'
>
> These tests follow the patterns from other tests in this file, but
> it also has bad patterns. Specifically, the 'git rev-list' command
> is fed directly into a pipe. I include a patch below that applies
> directly on this one to rewrite these tests. If you want, you could
> rebase to have that test refactor happen before you add your new
> --ancestry-path=<X> option tests.

Ooh, I like it.  I'll rebase and include this patch earlier in my
series.  Thanks!

>
> > +test_expect_success 'rev-list --ancestry-path=G D..M' '
> > +     test_write_lines G H I J L M >expect &&
> > +     git rev-list --ancestry-path=G --format=%s D..M |
> > +     sed -e "/^commit /d" |
> > +     sort >actual &&
> > +     test_cmp expect actual
> > +'
> > +test_expect_success 'rev-list --ancestry-path=H D..M' '
>
> nit: needs extra whitespace between tests. The above test pair
> needs one, too. This becomes moot with the patch I provide.
>
> Thanks,
> -Stolee
>
> --- >8 ---
>
> From 9ac4e81dba0d7801513a09f5fe307d01357123dd Mon Sep 17 00:00:00 2001
> From: Derrick Stolee <derrickstolee@github.com>
> Date: Thu, 18 Aug 2022 11:25:04 -0400
> Subject: [PATCH] t6019: modernize tests with helper
>
> The tests in t6019 are repetive, so create a helper that greatly
> simplifies the test script.
>
> In addition, update the common pattern that places 'git rev-list' on the
> left side of a pipe, which can hide some exit codes. Send the output to
> a 'raw' file that is then consumed by other tools so the Git exit code
> is verified as zero.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  t/t6019-rev-list-ancestry-path.sh | 131 +++++++-----------------------
>  1 file changed, 31 insertions(+), 100 deletions(-)
>
> diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
> index d3657737fa60..18941a80ce67 100755
> --- a/t/t6019-rev-list-ancestry-path.sh
> +++ b/t/t6019-rev-list-ancestry-path.sh
> @@ -55,111 +55,42 @@ test_expect_success setup '
>         test_commit M
>  '
>
> -test_expect_success 'rev-list D..M' '
> -       test_write_lines E F G H I J K L M >expect &&
> -       git rev-list --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path D..M' '
> -       test_write_lines E F H I J L M >expect &&
> -       git rev-list --ancestry-path --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path=F D..M' '
> -       test_write_lines E F J L M >expect &&
> -       git rev-list --ancestry-path=F --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -test_expect_success 'rev-list --ancestry-path=G D..M' '
> -       test_write_lines G H I J L M >expect &&
> -       git rev-list --ancestry-path=G --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -test_expect_success 'rev-list --ancestry-path=H D..M' '
> -       test_write_lines E G H I J L M >expect &&
> -       git rev-list --ancestry-path=H --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path=K D..M' '
> -       test_write_lines K L M >expect &&
> -       git rev-list --ancestry-path=K --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path=F --ancestry-path=K D..M' '
> -       test_write_lines E F J K L M >expect &&
> -       git rev-list --ancestry-path=F --ancestry-path=K --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list D..M -- M.t' '
> -       echo M >expect &&
> -       git rev-list --format=%s D..M -- M.t |
> -       sed -e "/^commit /d" >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
> -       echo M >expect &&
> -       git rev-list --ancestry-path --format=%s D..M -- M.t |
> -       sed -e "/^commit /d" >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry () {
> +       args=$1
> +       expected=$2
> +       test_expect_success "rev-list $args" "
> +               test_write_lines $expected >expect &&
> +               git rev-list --format=%s $args >raw &&
> +
> +               if test -n \"$expected\"
> +               then
> +                       sed -e \"/^commit /d\" raw | sort >actual &&
> +                       test_cmp expect actual || return 1
> +               else
> +                       test_must_be_empty raw
> +               fi
> +       "
> +}
>
> -test_expect_success 'rev-list F...I' '
> -       test_write_lines F G H I >expect &&
> -       git rev-list --format=%s F...I |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry "D..M" "E F G H I J K L M"
>
> -test_expect_success 'rev-list --ancestry-path F...I' '
> -       test_write_lines F H I >expect &&
> -       git rev-list --ancestry-path --format=%s F...I |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry "--ancestry-path D..M" "E F H I J L M"
> +test_ancestry "--ancestry-path D..M" "E F H I J L M"
> +test_ancestry "--ancestry-path=F D..M" "E F J L M"
> +test_ancestry "--ancestry-path=G D..M" "G H I J L M"
> +test_ancestry "--ancestry-path=H D..M" "E G H I J L M"
> +test_ancestry "--ancestry-path=K D..M" "K L M"
> +test_ancestry "--ancestry-path=K --ancestry-path=F D..M" "E F J K L M"
>
> -# G.t is dropped in an "-s ours" merge
> -test_expect_success 'rev-list G..M -- G.t' '
> -       git rev-list --format=%s G..M -- G.t |
> -       sed -e "/^commit /d" >actual &&
> -       test_must_be_empty actual
> -'
> +test_ancestry "D..M -- M.t" "M"
> +test_ancestry "--ancestry-path D..M -- M.t" "M"
>
> -test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
> -       echo L >expect &&
> -       git rev-list --ancestry-path --format=%s G..M -- G.t |
> -       sed -e "/^commit /d" >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry "F...I" "F G H I"
> +test_ancestry "--ancestry-path F...I" "F H I"
>
> -test_expect_success 'rev-list --ancestry-path --simplify-merges G^..M -- G.t' '
> -       test_write_lines G L >expect &&
> -       git rev-list --ancestry-path --simplify-merges --format=%s G^..M -- G.t |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry "G..M -- G.t" ""
> +test_ancestry "--ancestry-path G..M -- G.t" "L"
> +test_ancestry "--ancestry-path --simplify-merges G^..M -- G.t" "G L"
>
>  #   b---bc
>  #  / \ /
> --
> 2.37.1.vfs.0.0.rebase
>
>
>
>
Elijah Newren Aug. 19, 2022, 1:12 a.m. UTC | #8
On Thu, Aug 18, 2022 at 8:57 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Thu, Aug 18 2022, Derrick Stolee wrote:
>
> > On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
> >> +test_expect_success 'rev-list --ancestry-path=F D..M' '
> >> +    test_write_lines E F J L M >expect &&
> >> +    git rev-list --ancestry-path=F --format=%s D..M |
> >> +    sed -e "/^commit /d" |
> >> +    sort >actual &&
> >> +    test_cmp expect actual
> >> +'
> >
> > These tests follow the patterns from other tests in this file, but
> > it also has bad patterns. Specifically, the 'git rev-list' command
> > is fed directly into a pipe. I include a patch below that applies
> > directly on this one to rewrite these tests. If you want, you could
> > rebase to have that test refactor happen before you add your new
> > --ancestry-path=<X> option tests.
>
> Thanks, I was going to comment on the same, but your solution is much
> better (I was just going to suggest using intermediate files).
>
> > [...]
> > -test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
> > -     echo M >expect &&
> > -     git rev-list --ancestry-path --format=%s D..M -- M.t |
> > -     sed -e "/^commit /d" >actual &&
> > -     test_cmp expect actual
> > -'
> > +test_ancestry () {
> > +     args=$1
> > +     expected=$2
>
> Maybe add &&-chaining here (we do it in some cases, but I'm not sure
> when such assignments would ever fail).
>
> > +     test_expect_success "rev-list $args" "
> > +             test_write_lines $expected >expect &&
> > +             git rev-list --format=%s $args >raw &&
> > +
> > +             if test -n \"$expected\"
>
> Aren't you making things harder for yourself here than required by using
> ""-quoting for the body of the test.
>
> We eval it into existence, so you can use ''-quotes, and then you don't
> need to escape e.g. the "" quotes here for expected, no?
>
> > +             then
> > +                     sed -e \"/^commit /d\" raw | sort >actual &&
>
> nit for debuggability (and not correctness), maybe using intermediate
> files here would be nicer? And then perhaps call them "actual" and
> "actual.sorted".

Would be better to just nuke the sed by replacing 'rev-list' with
'log' (the line already has a --format option, so might as well get
the output we want).

> > +                     test_cmp expect actual || return 1
>
> No need for a "return 1" here when we're not in a loop. It's redundant,
> and makes the -x output on failure confusing ("why didn't I fail on the
> test_cmp, but on this stray return?...").
>
> ...
>
> > +             else
> > +                     test_must_be_empty raw
>
> ...which would also allow you to extract much of this if/else at the
> cost of not using test_must_be_empty, i.e. just make the "expected"
> empty unless "$expected" is provided. Just a thought/nit, we could also
> leave this as-is :)
>
> Also does the "compare rev" part of this want test_cmp_rev instead?

Um, I don't see any "compare rev" part of this, or any revision
comparing.  What are you referring to?
Elijah Newren Aug. 19, 2022, 1:23 a.m. UTC | #9
On Thu, Aug 18, 2022 at 3:24 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> Thanks - overall this looks good. I only have some minor textual
> comments and one small code comment.
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > From: Elijah Newren <newren@gmail.com>
> >
> > We have long allowed users to run e.g.
> >     git log --ancestry-path master..seen
> > which shows all commits which satisfy all three of these criteria:
> >   * are an ancestor of seen
> >   * are not an ancestor master
>
> are not an ancestor *of* master

Thanks, good catch.

> > diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> > index 2f85726745a..001e49cee55 100644
> > --- a/Documentation/rev-list-options.txt
> > +++ b/Documentation/rev-list-options.txt
> > @@ -389,12 +389,15 @@ Default mode::
> >       merges from the resulting history, as there are no selected
> >       commits contributing to this merge.
> >
> > ---ancestry-path::
> > +--ancestry-path[=<commit>]::
> >       When given a range of commits to display (e.g. 'commit1..commit2'
> > -     or 'commit2 {caret}commit1'), only display commits that exist
> > -     directly on the ancestry chain between the 'commit1' and
> > -     'commit2', i.e. commits that are both descendants of 'commit1',
> > -     and ancestors of 'commit2'.
> > +     or 'commit2 {caret}commit1'), only display commits in that
> > +     range where <commit> is part of the ancestry chain.  By "part of
> > +     the ancestry chain", we mean including <commit> itself and
> > +     commits that are either ancestors or descendants of <commit>.
> > +     If no commit is specified, use 'commit1' (the excluded part of
> > +     the range) as <commit>.  Can be passed multiple times to look for
> > +     commits in the ancestry range of multiple commits.
>
> "Ancestry chain" seems to be used multiple times in the Git codebase,
> apparently with slightly different definitions, so probably best to
> clear up at least this part by not reusing the term. It's also probably
> not worth introducing a new term "ancestry range". Maybe rewrite to
> say:
>
>   When given a range of commits to display (e.g. 'commit1..commit2'
>   or 'commit2 {caret}commit1'), only display commits in that
>   range that are ancestors of <commit>, descendants of <commit>, or
>   <commit> itself. If no commit is specified, use 'commit1' (the
>   excluded part of the range) as <commit>. Can be passed multiple times;
>   if so, a commit is included if it is any of the commits given or if it
>   is an ancestor or descendant of one of them.

Works for me; thanks.

> > @@ -568,11 +571,10 @@ Note the major differences in `N`, `P`, and `Q` over `--full-history`:
> >
> >  There is another simplification mode available:
> >
> > ---ancestry-path::
> > -     Limit the displayed commits to those directly on the ancestry
> > -     chain between the ``from'' and ``to'' commits in the given commit
> > -     range. I.e. only display commits that are ancestor of the ``to''
> > -     commit and descendants of the ``from'' commit.
> > +--ancestry-path[=<commit>]::
> > +     Limit the displayed commits to those containing <commit> in their
> > +     ancestry path.  I.e. only display <commit> and commits which have
> > +     <commit> as either a direct ancestor or descendant.
>
> Can we refer back to the documentation of --ancestry-path instead?

I had the same thought, especially since the nearby text also
duplicates definitions for --dense, --sparse, --simplify-merges, and
--show-pulls, each of which might also benefit from just referring
back to previous definitions to avoid drift.  I think we should handle
this whole "History Simplification" section consistently, so if we
just refer back to the previous definition somehow for this flag then
we should also do the same for the others.  But some of the others
appear to intentionally avoid using the same wording in order to draw
graphs and pictorially explain it.

So it feels like a can of worms that I'm not sure how to solve, and
what I currently have is the best solution available.

> > @@ -1304,13 +1304,20 @@ static int still_interesting(struct commit_list *src, timestamp_t date, int slop
> >  }
> >
> >  /*
> > - * "rev-list --ancestry-path A..B" computes commits that are ancestors
> > + * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
> >   * of B but not ancestors of A but further limits the result to those
> > - * that are descendants of A.  This takes the list of bottom commits and
> > - * the result of "A..B" without --ancestry-path, and limits the latter
> > - * further to the ones that can reach one of the commits in "bottom".
> > + * that have C in their ancestry path (i.e. are either ancestors of C,
> > + * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
> > + * arguments are supplied, we limit the result to those that have at
> > + * least one of those COMMITTISH in their ancestry path. If
> > + * --ancestry-path is specified with no commit, we use all bottom
> > + * commits for C.
> > + *
> > + * Before this function is called, ancestors of C will have already been
> > + * marked with ANCESTRY_PATH previously, so we just need to also mark
> > + * the descendants here, then collect both sets of commits.
> >   */
> > -static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
> > +static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
>
> I thought the original description of this function ("This takes the
> list...") to be clear and it would be nice to retain it. So, e.g.:
>
>   "rev-list --ancestry-path=C_0 [--ancestry-path=C_1 ...] A..B" computes commits
>   that are ancestors of B but not ancestors of A but further limits the
>   result to those that have any of C in their ancestry path (i.e. are
>   either ancestors of any of C, descendants of any of C, or are any of
>   C). If --ancestry-path is specified with no commit, we use all bottom
>   commits for C.
>
>   Before this function is called, ancestors of C will have already been
>   marked with ANCESTRY_PATH previously.
>
>   This takes the list of bottom commits and the result of "A..B" without
>   --ancestry-path, and limits the latter further to the ones that have
>   any of C in their ancestry path. Since the ancestors of C have already
>   been marked (a prerequisite of this function), we just need to mark
>   the descendants, then exclude any commit that does not have any of
>   these marks.

Sounds good to me; I'm happy to adopt this wording.

> Optional: Besides that, from what I can tell, sometimes the C commits
> themselves are marked with ANCESTRY_PATH (when they are explicitly
> specified) and sometimes they are not (when they are not explicitly
> specified). It's not a bug here, but it might be worth handling that in
> the ancestry_path_need_bottoms codepath (instead of explicitly setting
> TMP_MARK on the bottoms in limit_to_ancestry() - that way, I think we
> can also use ANCESTRY_PATH instead of TMP_MARK throughout the ancestry
> path codepaths, but I haven't tested it), at least to prevent possible
> future bugs.

That sounds like you're trying to duplicate the bug in my first
attempt at this patch.  If you try to coalesce ANCESTRY_PATH and
TMP_MARK, then you not only get all descendants of C, you also get all
descendants of any ancestor of C, which defeats the whole point of my
changes.

It's true that I don't mark implicit C commits with ANCESTRY_PATH, but
those are always bottom commits that are the excluded end of a range
anyway.  While those could be marked without causing problems, it
would always be a waste of effort.

> > @@ -2213,7 +2220,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
> >                              const struct setup_revision_opt* opt)
> >  {
> >       const char *arg = argv[0];
> > -     const char *optarg;
> > +     const char *optarg = NULL;
> >       int argcount;
> >       const unsigned hexsz = the_hash_algo->hexsz;
>
> [snip]
>
> > @@ -2280,10 +2287,26 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
> >               revs->first_parent_only = 1;
> >       } else if (!strcmp(arg, "--exclude-first-parent-only")) {
> >               revs->exclude_first_parent_only = 1;
> > -     } else if (!strcmp(arg, "--ancestry-path")) {
> > +     } else if (!strcmp(arg, "--ancestry-path") ||
> > +                skip_prefix(arg, "--ancestry-path=", &optarg)) {
>
> This and the above hunk might cause bugs if --ancestry-path was first
> specified with a commit and then specified without. Probably best to
> break this up into separate "else if" branches, even though there is a
> bit of code duplication (and also remove the "= NULL" addition in the
> above hunk).

Ah, good catch.

> > @@ -164,6 +165,7 @@ struct rev_info {
> >                       cherry_mark:1,
> >                       bisect:1,
> >                       ancestry_path:1,
> > +                     ancestry_path_need_bottoms:1,
>
> Might be better named as ancestry_path_implicit_bottoms? And probably
> worth documenting, e.g.
>
>   True if --ancestry-path was specified without an argument. The bottom
>   revisions are implicitly the arguments in this case.

Sure, sounds good.

Thanks for the careful review!
Ævar Arnfjörð Bjarmason Aug. 19, 2022, 2:45 a.m. UTC | #10
On Thu, Aug 18 2022, Elijah Newren wrote:

> On Thu, Aug 18, 2022 at 8:57 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> [...]
>> Also does the "compare rev" part of this want test_cmp_rev instead?
>
> Um, I don't see any "compare rev" part of this, or any revision
> comparing.  What are you referring to?

I just misread this part while skimming it, sorry.
Jonathan Tan Aug. 19, 2022, 5:25 p.m. UTC | #11
Elijah Newren <newren@gmail.com> writes:
> > Optional: Besides that, from what I can tell, sometimes the C commits
> > themselves are marked with ANCESTRY_PATH (when they are explicitly
> > specified) and sometimes they are not (when they are not explicitly
> > specified). It's not a bug here, but it might be worth handling that in
> > the ancestry_path_need_bottoms codepath (instead of explicitly setting
> > TMP_MARK on the bottoms in limit_to_ancestry() - that way, I think we
> > can also use ANCESTRY_PATH instead of TMP_MARK throughout the ancestry
> > path codepaths, but I haven't tested it), at least to prevent possible
> > future bugs.
> 
> That sounds like you're trying to duplicate the bug in my first
> attempt at this patch.  If you try to coalesce ANCESTRY_PATH and
> TMP_MARK, then you not only get all descendants of C, you also get all
> descendants of any ancestor of C, which defeats the whole point of my
> changes.

Ah, yes you're right.

> It's true that I don't mark implicit C commits with ANCESTRY_PATH, but
> those are always bottom commits that are the excluded end of a range
> anyway.  While those could be marked without causing problems, it
> would always be a waste of effort.

Yes, that's true.
diff mbox series

Patch

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 2f85726745a..001e49cee55 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -389,12 +389,15 @@  Default mode::
 	merges from the resulting history, as there are no selected
 	commits contributing to this merge.
 
---ancestry-path::
+--ancestry-path[=<commit>]::
 	When given a range of commits to display (e.g. 'commit1..commit2'
-	or 'commit2 {caret}commit1'), only display commits that exist
-	directly on the ancestry chain between the 'commit1' and
-	'commit2', i.e. commits that are both descendants of 'commit1',
-	and ancestors of 'commit2'.
+	or 'commit2 {caret}commit1'), only display commits in that
+	range where <commit> is part of the ancestry chain.  By "part of
+	the ancestry chain", we mean including <commit> itself and
+	commits that are either ancestors or descendants of <commit>.
+	If no commit is specified, use 'commit1' (the excluded part of
+	the range) as <commit>.  Can be passed multiple times to look for
+	commits in the ancestry range of multiple commits.
 
 A more detailed explanation follows.
 
@@ -568,11 +571,10 @@  Note the major differences in `N`, `P`, and `Q` over `--full-history`:
 
 There is another simplification mode available:
 
---ancestry-path::
-	Limit the displayed commits to those directly on the ancestry
-	chain between the ``from'' and ``to'' commits in the given commit
-	range. I.e. only display commits that are ancestor of the ``to''
-	commit and descendants of the ``from'' commit.
+--ancestry-path[=<commit>]::
+	Limit the displayed commits to those containing <commit> in their
+	ancestry path.  I.e. only display <commit> and commits which have
+	<commit> as either a direct ancestor or descendant.
 +
 As an example use case, consider the following commit history:
 +
@@ -604,6 +606,29 @@  option does. Applied to the 'D..M' range, it results in:
 			       \
 				L--M
 -----------------------------------------------------------------------
++
+We can also use `--ancestry-path=D` instead of `--ancestry-path` which
+means the same thing when applied to the 'D..M' range but is just more
+explicit.
++
+If we instead are interested in a given topic within this range, and all
+commits affected by that topic, we may only want to view the subset of
+`D..M` which contain that topic in their ancestry path.  So, using
+`--ancestry-path=H D..M` for example would result in:
++
+-----------------------------------------------------------------------
+		E
+		 \
+		  G---H---I---J
+			       \
+				L--M
+-----------------------------------------------------------------------
++
+Whereas `--ancestry-path=K D..M` would result in
++
+-----------------------------------------------------------------------
+		K---------------L--M
+-----------------------------------------------------------------------
 
 Before discussing another option, `--show-pulls`, we need to
 create a new example history.
diff --git a/object.h b/object.h
index a2219464c2b..9caef89f1f0 100644
--- a/object.h
+++ b/object.h
@@ -59,7 +59,7 @@  struct object_array {
 
 /*
  * object flag allocation:
- * revision.h:               0---------10         15             23------26
+ * revision.h:               0---------10         15             23------27
  * fetch-pack.c:             01    67
  * negotiator/default.c:       2--5
  * walker.c:                 0-2
diff --git a/revision.c b/revision.c
index 0c6e26cd9c8..4ebec71677c 100644
--- a/revision.c
+++ b/revision.c
@@ -1105,7 +1105,7 @@  static int process_parents(struct rev_info *revs, struct commit *commit,
 			   struct commit_list **list, struct prio_queue *queue)
 {
 	struct commit_list *parent = commit->parents;
-	unsigned left_flag;
+	unsigned pass_flags;
 
 	if (commit->object.flags & ADDED)
 		return 0;
@@ -1160,7 +1160,7 @@  static int process_parents(struct rev_info *revs, struct commit *commit,
 	if (revs->no_walk)
 		return 0;
 
-	left_flag = (commit->object.flags & SYMMETRIC_LEFT);
+	pass_flags = (commit->object.flags & (SYMMETRIC_LEFT | ANCESTRY_PATH));
 
 	for (parent = commit->parents; parent; parent = parent->next) {
 		struct commit *p = parent->item;
@@ -1181,7 +1181,7 @@  static int process_parents(struct rev_info *revs, struct commit *commit,
 			if (!*slot)
 				*slot = *revision_sources_at(revs->sources, commit);
 		}
-		p->object.flags |= left_flag;
+		p->object.flags |= pass_flags;
 		if (!(p->object.flags & SEEN)) {
 			p->object.flags |= (SEEN | NOT_USER_GIVEN);
 			if (list)
@@ -1304,13 +1304,20 @@  static int still_interesting(struct commit_list *src, timestamp_t date, int slop
 }
 
 /*
- * "rev-list --ancestry-path A..B" computes commits that are ancestors
+ * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
  * of B but not ancestors of A but further limits the result to those
- * that are descendants of A.  This takes the list of bottom commits and
- * the result of "A..B" without --ancestry-path, and limits the latter
- * further to the ones that can reach one of the commits in "bottom".
+ * that have C in their ancestry path (i.e. are either ancestors of C,
+ * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
+ * arguments are supplied, we limit the result to those that have at
+ * least one of those COMMITTISH in their ancestry path. If
+ * --ancestry-path is specified with no commit, we use all bottom
+ * commits for C.
+ *
+ * Before this function is called, ancestors of C will have already been
+ * marked with ANCESTRY_PATH previously, so we just need to also mark
+ * the descendants here, then collect both sets of commits.
  */
-static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
+static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
 {
 	struct commit_list *p;
 	struct commit_list *rlist = NULL;
@@ -1323,7 +1330,7 @@  static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
 	for (p = list; p; p = p->next)
 		commit_list_insert(p->item, &rlist);
 
-	for (p = bottom; p; p = p->next)
+	for (p = bottoms; p; p = p->next)
 		p->item->object.flags |= TMP_MARK;
 
 	/*
@@ -1356,38 +1363,39 @@  static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
 	 */
 
 	/*
-	 * The ones that are not marked with TMP_MARK are uninteresting
+	 * The ones that are not marked with either TMP_MARK or
+	 * ANCESTRY_PATH are uninteresting
 	 */
 	for (p = list; p; p = p->next) {
 		struct commit *c = p->item;
-		if (c->object.flags & TMP_MARK)
+		if (c->object.flags & (TMP_MARK | ANCESTRY_PATH))
 			continue;
 		c->object.flags |= UNINTERESTING;
 	}
 
-	/* We are done with the TMP_MARK */
+	/* We are done with TMP_MARK and ANCESTRY_PATH */
 	for (p = list; p; p = p->next)
-		p->item->object.flags &= ~TMP_MARK;
-	for (p = bottom; p; p = p->next)
-		p->item->object.flags &= ~TMP_MARK;
+		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
+	for (p = bottoms; p; p = p->next)
+		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
 	free_commit_list(rlist);
 }
 
 /*
- * Before walking the history, keep the set of "negative" refs the
- * caller has asked to exclude.
+ * Before walking the history, add the set of "negative" refs the
+ * caller has asked to exclude to the bottom list.
  *
  * This is used to compute "rev-list --ancestry-path A..B", as we need
  * to filter the result of "A..B" further to the ones that can actually
  * reach A.
  */
-static struct commit_list *collect_bottom_commits(struct commit_list *list)
+static void collect_bottom_commits(struct commit_list *list,
+				   struct commit_list **bottom)
 {
-	struct commit_list *elem, *bottom = NULL;
+	struct commit_list *elem;
 	for (elem = list; elem; elem = elem->next)
 		if (elem->item->object.flags & BOTTOM)
-			commit_list_insert(elem->item, &bottom);
-	return bottom;
+			commit_list_insert(elem->item, bottom);
 }
 
 /* Assumes either left_only or right_only is set */
@@ -1414,12 +1422,12 @@  static int limit_list(struct rev_info *revs)
 	struct commit_list *original_list = revs->commits;
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
-	struct commit_list *bottom = NULL;
 	struct commit *interesting_cache = NULL;
 
-	if (revs->ancestry_path) {
-		bottom = collect_bottom_commits(original_list);
-		if (!bottom)
+	if (revs->ancestry_path_need_bottoms) {
+		collect_bottom_commits(original_list,
+				       &revs->ancestry_path_bottoms);
+		if (!revs->ancestry_path_bottoms)
 			die("--ancestry-path given but there are no bottom commits");
 	}
 
@@ -1464,9 +1472,8 @@  static int limit_list(struct rev_info *revs)
 	if (revs->left_only || revs->right_only)
 		limit_left_right(newlist, revs);
 
-	if (bottom)
-		limit_to_ancestry(bottom, newlist);
-	free_commit_list(bottom);
+	if (revs->ancestry_path)
+		limit_to_ancestry(revs->ancestry_path_bottoms, newlist);
 
 	/*
 	 * Check if any commits have become TREESAME by some of their parents
@@ -2213,7 +2220,7 @@  static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 			       const struct setup_revision_opt* opt)
 {
 	const char *arg = argv[0];
-	const char *optarg;
+	const char *optarg = NULL;
 	int argcount;
 	const unsigned hexsz = the_hash_algo->hexsz;
 
@@ -2280,10 +2287,26 @@  static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->first_parent_only = 1;
 	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
 		revs->exclude_first_parent_only = 1;
-	} else if (!strcmp(arg, "--ancestry-path")) {
+	} else if (!strcmp(arg, "--ancestry-path") ||
+		   skip_prefix(arg, "--ancestry-path=", &optarg)) {
 		revs->ancestry_path = 1;
 		revs->simplify_history = 0;
 		revs->limited = 1;
+		if (optarg) {
+			struct commit *c;
+			struct object_id oid;
+			const char *msg = _("could not get commit for ancestry-path argument %s");
+
+			if (repo_get_oid_committish(revs->repo, optarg, &oid))
+				return error(msg, optarg);
+			get_reference(revs, optarg, &oid, ANCESTRY_PATH);
+			c = lookup_commit_reference(revs->repo, &oid);
+			if (!c)
+				return error(msg, optarg);
+			commit_list_insert(c, &revs->ancestry_path_bottoms);
+		} else {
+			revs->ancestry_path_need_bottoms = 1;
+		}
 	} else if (!strcmp(arg, "-g") || !strcmp(arg, "--walk-reflogs")) {
 		init_reflog_walk(&revs->reflog_info);
 	} else if (!strcmp(arg, "--default")) {
@@ -2991,6 +3014,7 @@  static void release_revisions_topo_walk_info(struct topo_walk_info *info);
 void release_revisions(struct rev_info *revs)
 {
 	free_commit_list(revs->commits);
+	free_commit_list(revs->ancestry_path_bottoms);
 	object_array_clear(&revs->pending);
 	object_array_clear(&revs->boundary_commits);
 	release_revisions_cmdline(&revs->cmdline);
diff --git a/revision.h b/revision.h
index e576845cdd1..d86241ad8fc 100644
--- a/revision.h
+++ b/revision.h
@@ -48,6 +48,7 @@ 
  */
 #define NOT_USER_GIVEN	(1u<<25)
 #define TRACK_LINEAR	(1u<<26)
+#define ANCESTRY_PATH	(1u<<27)
 #define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE)
 
 #define DECORATE_SHORT_REFS	1
@@ -164,6 +165,7 @@  struct rev_info {
 			cherry_mark:1,
 			bisect:1,
 			ancestry_path:1,
+			ancestry_path_need_bottoms:1,
 			first_parent_only:1,
 			exclude_first_parent_only:1,
 			line_level_traverse:1,
@@ -306,6 +308,7 @@  struct rev_info {
 	struct saved_parents *saved_parents_slab;
 
 	struct commit_list *previous_parents;
+	struct commit_list *ancestry_path_bottoms;
 	const char *break_bar;
 
 	struct revision_sources *sources;
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index af57a04b7ff..d3657737fa6 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -8,8 +8,13 @@  test_description='--ancestry-path'
 #   /                     \
 #  A-------K---------------L--M
 #
-#  D..M                 == E F G H I J K L M
-#  --ancestry-path D..M == E F H I J L M
+#  D..M                                     == E F G H I J K L M
+#  --ancestry-path                     D..M == E F   H I J   L M
+#  --ancestry-path=F                   D..M == E F       J   L M
+#  --ancestry-path=G                   D..M ==     G H I J   L M
+#  --ancestry-path=H                   D..M == E   G H I J   L M
+#  --ancestry-path=K                   D..M ==             K L M
+#  --ancestry-path=K --ancestry-path=F D..M == E F       J K L M
 #
 #  D..M -- M.t                 == M
 #  --ancestry-path D..M -- M.t == M
@@ -66,6 +71,44 @@  test_expect_success 'rev-list --ancestry-path D..M' '
 	test_cmp expect actual
 '
 
+test_expect_success 'rev-list --ancestry-path=F D..M' '
+	test_write_lines E F J L M >expect &&
+	git rev-list --ancestry-path=F --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+test_expect_success 'rev-list --ancestry-path=G D..M' '
+	test_write_lines G H I J L M >expect &&
+	git rev-list --ancestry-path=G --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+test_expect_success 'rev-list --ancestry-path=H D..M' '
+	test_write_lines E G H I J L M >expect &&
+	git rev-list --ancestry-path=H --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-list --ancestry-path=K D..M' '
+	test_write_lines K L M >expect &&
+	git rev-list --ancestry-path=K --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-list --ancestry-path=F --ancestry-path=K D..M' '
+	test_write_lines E F J K L M >expect &&
+	git rev-list --ancestry-path=F --ancestry-path=K --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'rev-list D..M -- M.t' '
 	echo M >expect &&
 	git rev-list --format=%s D..M -- M.t |