diff mbox series

name-rev: use generation numbers if available

Message ID 20220228215025.325904-3-jacob.e.keller@intel.com (mailing list archive)
State Superseded
Headers show
Series name-rev: use generation numbers if available | expand

Commit Message

Jacob Keller Feb. 28, 2022, 9:50 p.m. UTC
From: Jacob Keller <jacob.keller@gmail.com>

If a commit in a sequence of linear history has a non-monotonically
increasing commit timestamp, git name-rev might not properly name the
commit.

This occurs because name-rev uses a heuristic of the commit date to
avoid searching down tags which lead to commits that are older than the
named commit. This is intended to avoid work on larger repositories.

This heuristic impacts git name-rev, and by extension git describe
--contains which is built on top of name-rev.

Further more, if --all or --annotate-stdin is used, the heuristic is not
enabled because the full history has to be analyzed anyways. This
results in some confusion if a user sees that --annotate-stdin works but
a normal name-rev does not.

If the repository has a commit graph, we can use the generation numbers
instead of using the commit dates. This is essentially the same check
except that generation numbers make it exact, where the commit date
heuristic could be incorrect due to clock errors.

Since we're extending the notion of cutoff to more than one variable,
create a series of functions for setting and checking the cutoff. This
avoids duplication and moves access of the global cutoff and
generation_cutoff to as few functions as possible.

Add several test cases including coverage of --all, --annotate-stdin,
and the normal case with the cutoff heuristic, both with and without
commit graphs enabled.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
 builtin/name-rev.c  |  71 +++++++++++++++++++-----
 t/t6120-describe.sh | 132 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 189 insertions(+), 14 deletions(-)

Comments

Junio C Hamano March 1, 2022, 2:36 a.m. UTC | #1
Jacob Keller <jacob.e.keller@intel.com> writes:

> +test_expect_success 'name-rev without commitGraph does not handle non-monotonic timestamps' '
> +	test_config -C non-monotonic core.commitGraph false &&
> +	(
> +		cd non-monotonic &&
> +
> +		rm -rf .git/info/commit-graph* &&
> +
> +		echo "main~3 undefined" >expect &&
> +		git name-rev --tags main~3 >actual &&
> +
> +		test_cmp expect actual
> +	)
> +'

I doubt it is wise to "test" that a program does _not_ produce a
correct output, or even worse, it produces a particular wrong
output.  This test, for example, casts in stone that any future
optimization that does not depend on the commit-graph is forever
prohibited.

Just dropping the test would be fine, I would think.
Jacob Keller March 1, 2022, 7:08 a.m. UTC | #2
On Mon, Feb 28, 2022 at 6:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jacob Keller <jacob.e.keller@intel.com> writes:
>
> > +test_expect_success 'name-rev without commitGraph does not handle non-monotonic timestamps' '
> > +     test_config -C non-monotonic core.commitGraph false &&
> > +     (
> > +             cd non-monotonic &&
> > +
> > +             rm -rf .git/info/commit-graph* &&
> > +
> > +             echo "main~3 undefined" >expect &&
> > +             git name-rev --tags main~3 >actual &&
> > +
> > +             test_cmp expect actual
> > +     )
> > +'
>
> I doubt it is wise to "test" that a program does _not_ produce a
> correct output, or even worse, it produces a particular wrong
> output.  This test, for example, casts in stone that any future
> optimization that does not depend on the commit-graph is forever
> prohibited.
>
> Just dropping the test would be fine, I would think.

Stolee mentioned it. We could also convert it to a
"test_expect_failure" with the expected output too... But that makes
it look like something we'll fix
Jacob Keller March 1, 2022, 7:09 a.m. UTC | #3
On Mon, Feb 28, 2022 at 11:08 PM Jacob Keller <jacob.keller@gmail.com> wrote:
>
> On Mon, Feb 28, 2022 at 6:36 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > Jacob Keller <jacob.e.keller@intel.com> writes:
> >
> > > +test_expect_success 'name-rev without commitGraph does not handle non-monotonic timestamps' '
> > > +     test_config -C non-monotonic core.commitGraph false &&
> > > +     (
> > > +             cd non-monotonic &&
> > > +
> > > +             rm -rf .git/info/commit-graph* &&
> > > +
> > > +             echo "main~3 undefined" >expect &&
> > > +             git name-rev --tags main~3 >actual &&
> > > +
> > > +             test_cmp expect actual
> > > +     )
> > > +'
> >
> > I doubt it is wise to "test" that a program does _not_ produce a
> > correct output, or even worse, it produces a particular wrong
> > output.  This test, for example, casts in stone that any future
> > optimization that does not depend on the commit-graph is forever
> > prohibited.
> >
> > Just dropping the test would be fine, I would think.
>
> Stolee mentioned it. We could also convert it to a
> "test_expect_failure" with the expected output too... But that makes
> it look like something we'll fix

I'm happy to drop this test though

Thanks,
Jake
Junio C Hamano March 1, 2022, 7:33 a.m. UTC | #4
Jacob Keller <jacob.keller@gmail.com> writes:

> On Mon, Feb 28, 2022 at 6:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Jacob Keller <jacob.e.keller@intel.com> writes:
>>
>> > +test_expect_success 'name-rev without commitGraph does not handle non-monotonic timestamps' '
>> > +     test_config -C non-monotonic core.commitGraph false &&
>> > +     (
>> > +             cd non-monotonic &&
>> > +
>> > +             rm -rf .git/info/commit-graph* &&
>> > +
>> > +             echo "main~3 undefined" >expect &&
>> > +             git name-rev --tags main~3 >actual &&
>> > +
>> > +             test_cmp expect actual
>> > +     )
>> > +'
>>
>> I doubt it is wise to "test" that a program does _not_ produce a
>> correct output, or even worse, it produces a particular wrong
>> output.  This test, for example, casts in stone that any future
>> optimization that does not depend on the commit-graph is forever
>> prohibited.
>>
>> Just dropping the test would be fine, I would think.
>
> Stolee mentioned it. We could also convert it to a
> "test_expect_failure" with the expected output too... But that makes
> it look like something we'll fix

Neither sounds like a good idea anyway.  What we care most is with
commit graph, the algorithm will not be fooled by skewed timestamps.
Derrick Stolee March 1, 2022, 3:09 p.m. UTC | #5
On 3/1/2022 2:33 AM, Junio C Hamano wrote:
> Jacob Keller <jacob.keller@gmail.com> writes:
> 
>> On Mon, Feb 28, 2022 at 6:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>>>
>>> Jacob Keller <jacob.e.keller@intel.com> writes:
>>>
>>>> +test_expect_success 'name-rev without commitGraph does not handle non-monotonic timestamps' '
>>>> +     test_config -C non-monotonic core.commitGraph false &&
>>>> +     (
>>>> +             cd non-monotonic &&
>>>> +
>>>> +             rm -rf .git/info/commit-graph* &&
>>>> +
>>>> +             echo "main~3 undefined" >expect &&
>>>> +             git name-rev --tags main~3 >actual &&
>>>> +
>>>> +             test_cmp expect actual
>>>> +     )
>>>> +'
>>>
>>> I doubt it is wise to "test" that a program does _not_ produce a
>>> correct output, or even worse, it produces a particular wrong
>>> output.  This test, for example, casts in stone that any future
>>> optimization that does not depend on the commit-graph is forever
>>> prohibited.
>>>
>>> Just dropping the test would be fine, I would think.
>>
>> Stolee mentioned it. We could also convert it to a
>> "test_expect_failure" with the expected output too... But that makes
>> it look like something we'll fix
> 
> Neither sounds like a good idea anyway.  What we care most is with
> commit graph, the algorithm will not be fooled by skewed timestamps.

I'm fine with losing this test.

I perhaps lean too hard on "tests should document current behavior"
so we know when we are changing behavior, and the commit can justify
that change. For this one, we are really documenting that we have
an optimization that doesn't walk all commits based on the date of
the target commit. If we dropped that optimization accidentally,
then we have no test so far that verifies that we don't walk the
entire commit history with these name-rev queries.

If there is value in documenting that optimization, then a
comment before the test could describe that the output is not
desirable, but it's due to an optimization that we want to keep in
place.

Thanks,
-Stolee
Jacob Keller March 1, 2022, 7:52 p.m. UTC | #6
On 3/1/2022 7:09 AM, Derrick Stolee wrote:
> On 3/1/2022 2:33 AM, Junio C Hamano wrote:
>> Jacob Keller <jacob.keller@gmail.com> writes:
>>
>>> On Mon, Feb 28, 2022 at 6:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>>>>
>>>> Jacob Keller <jacob.e.keller@intel.com> writes:
>>>>
>>>>> +test_expect_success 'name-rev without commitGraph does not handle non-monotonic timestamps' '
>>>>> +     test_config -C non-monotonic core.commitGraph false &&
>>>>> +     (
>>>>> +             cd non-monotonic &&
>>>>> +
>>>>> +             rm -rf .git/info/commit-graph* &&
>>>>> +
>>>>> +             echo "main~3 undefined" >expect &&
>>>>> +             git name-rev --tags main~3 >actual &&
>>>>> +
>>>>> +             test_cmp expect actual
>>>>> +     )
>>>>> +'
>>>>
>>>> I doubt it is wise to "test" that a program does _not_ produce a
>>>> correct output, or even worse, it produces a particular wrong
>>>> output.  This test, for example, casts in stone that any future
>>>> optimization that does not depend on the commit-graph is forever
>>>> prohibited.
>>>>
>>>> Just dropping the test would be fine, I would think.
>>>
>>> Stolee mentioned it. We could also convert it to a
>>> "test_expect_failure" with the expected output too... But that makes
>>> it look like something we'll fix
>>
>> Neither sounds like a good idea anyway.  What we care most is with
>> commit graph, the algorithm will not be fooled by skewed timestamps.
> 
> I'm fine with losing this test.
> 
> I perhaps lean too hard on "tests should document current behavior"
> so we know when we are changing behavior, and the commit can justify
> that change. For this one, we are really documenting that we have
> an optimization that doesn't walk all commits based on the date of
> the target commit. If we dropped that optimization accidentally,
> then we have no test so far that verifies that we don't walk the
> entire commit history with these name-rev queries.
> 

I think the "tests should document current behavior" is handled by the
fact that this specific test fails if you revert the name-rev changes
but keep the test.

> If there is value in documenting that optimization, then a
> comment before the test could describe that the output is not
> desirable, but it's due to an optimization that we want to keep in
> place.
> 
> Thanks,
> -Stolee

What about a test which uses something like the trace system to list all
the commits it checked? I guess that might get a bit messy but that
could be used to cover the "this optimization is important" and that
applies to the commit graph implementation rather than keeping a
negative test of the other implementation.

Thanks,
Jake
Derrick Stolee March 1, 2022, 7:56 p.m. UTC | #7
On 3/1/2022 2:52 PM, Keller, Jacob E wrote:
> On 3/1/2022 7:09 AM, Derrick Stolee wrote:
>> On 3/1/2022 2:33 AM, Junio C Hamano wrote:
>>> Jacob Keller <jacob.keller@gmail.com> writes:
>>>
>>>> On Mon, Feb 28, 2022 at 6:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>>>>>
>>>>> Jacob Keller <jacob.e.keller@intel.com> writes:
>>>>>
>>>>>> +test_expect_success 'name-rev without commitGraph does not handle non-monotonic timestamps' '
>>>>>> +     test_config -C non-monotonic core.commitGraph false &&
>>>>>> +     (
>>>>>> +             cd non-monotonic &&
>>>>>> +
>>>>>> +             rm -rf .git/info/commit-graph* &&
>>>>>> +
>>>>>> +             echo "main~3 undefined" >expect &&
>>>>>> +             git name-rev --tags main~3 >actual &&
>>>>>> +
>>>>>> +             test_cmp expect actual
>>>>>> +     )
>>>>>> +'
>>>>>
>>>>> I doubt it is wise to "test" that a program does _not_ produce a
>>>>> correct output, or even worse, it produces a particular wrong
>>>>> output.  This test, for example, casts in stone that any future
>>>>> optimization that does not depend on the commit-graph is forever
>>>>> prohibited.
>>>>>
>>>>> Just dropping the test would be fine, I would think.
>>>>
>>>> Stolee mentioned it. We could also convert it to a
>>>> "test_expect_failure" with the expected output too... But that makes
>>>> it look like something we'll fix
>>>
>>> Neither sounds like a good idea anyway.  What we care most is with
>>> commit graph, the algorithm will not be fooled by skewed timestamps.
>>
>> I'm fine with losing this test.
>>
>> I perhaps lean too hard on "tests should document current behavior"
>> so we know when we are changing behavior, and the commit can justify
>> that change. For this one, we are really documenting that we have
>> an optimization that doesn't walk all commits based on the date of
>> the target commit. If we dropped that optimization accidentally,
>> then we have no test so far that verifies that we don't walk the
>> entire commit history with these name-rev queries.
>>
> 
> I think the "tests should document current behavior" is handled by the
> fact that this specific test fails if you revert the name-rev changes
> but keep the test.

Ah, so this _is_ documenting a new behavior that didn't exist
before the series. That is good to include, then. If it was
"just" testing the behavior before this series, then it would
have less reason to exist.

>> If there is value in documenting that optimization, then a
>> comment before the test could describe that the output is not
>> desirable, but it's due to an optimization that we want to keep in
>> place.
>>
>> Thanks,
>> -Stolee
> 
> What about a test which uses something like the trace system to list all
> the commits it checked? I guess that might get a bit messy but that
> could be used to cover the "this optimization is important" and that
> applies to the commit graph implementation rather than keeping a
> negative test of the other implementation.

A trace of the _count_ of visited commits might be effective,
without being too noisy in the trace logs or too fragile to
future updates (only need to change a number if the optimization
changes).

Thanks,
-Stolee
Junio C Hamano March 1, 2022, 8:22 p.m. UTC | #8
Derrick Stolee <derrickstolee@github.com> writes:

>> I think the "tests should document current behavior" is handled by the
>> fact that this specific test fails if you revert the name-rev changes
>> but keep the test.
>
> Ah, so this _is_ documenting a new behavior that didn't exist
> before the series. That is good to include, then. If it was
> "just" testing the behavior before this series, then it would
> have less reason to exist.

With of without the additional codepath to handle the case where
commit graph is available, the original heuristics that is based on
commit timestamps are fooled by a history with skewed timestamps.

So I thought this "without commit graph, the algorithm must fail
this way" test would be testing the current behaviour *and* the
behaviour of the new code, no?
Jacob Keller March 1, 2022, 10:46 p.m. UTC | #9
> -----Original Message-----
> From: Junio C Hamano <gitster@pobox.com>
> Sent: Tuesday, March 01, 2022 12:23 PM
> To: Derrick Stolee <derrickstolee@github.com>
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Jacob Keller
> <jacob.keller@gmail.com>; Git mailing list <git@vger.kernel.org>
> Subject: Re: [PATCH] name-rev: use generation numbers if available
> 
> Derrick Stolee <derrickstolee@github.com> writes:
> 
> >> I think the "tests should document current behavior" is handled by the
> >> fact that this specific test fails if you revert the name-rev changes
> >> but keep the test.
> >
> > Ah, so this _is_ documenting a new behavior that didn't exist
> > before the series. That is good to include, then. If it was
> > "just" testing the behavior before this series, then it would
> > have less reason to exist.
> 
> With of without the additional codepath to handle the case where
> commit graph is available, the original heuristics that is based on
> commit timestamps are fooled by a history with skewed timestamps.
> 


Let's clarify. There are two versions of the test in this version:

1) test which enables commit graph and tests that it does the right behavior.

2) test which removes commit graph and tests that it behaves the old way.


test (1) checks the flow with the commit graph enabled, and verifies that with a commit graph the new behavior is used. This test will fail if you revert the name-rev commit-graph support.

test (2) always performs the way we don't like. (since we disable the commit graph and the new flow doesn't kick in) This is the test I think I will eliminate in the next revision.


I will remove the 2nd test, since the first test covers the change in behavior just fine, and I think I agree we don't need to set in-stone the implementation without commit graph.

I will also look at adding a test which performs a count of which revisions get inspected and makes sure that we actually are doing the optimization.

> So I thought this "without commit graph, the algorithm must fail
> this way" test would be testing the current behaviour *and* the
> behaviour of the new code, no?
Junio C Hamano March 3, 2022, 1:10 a.m. UTC | #10
"Keller, Jacob E" <jacob.e.keller@intel.com> writes:

> Let's clarify. There are two versions of the test in this version:
>
> 1) test which enables commit graph and tests that it does the right behavior.
>
> 2) test which removes commit graph and tests that it behaves the old way.
>
>
> test (1) checks the flow with the commit graph enabled, and verifies that with a commit graph the new behavior is used. This test will fail if you revert the name-rev commit-graph support.
>
> test (2) always performs the way we don't like. (since we disable the commit graph and the new flow doesn't kick in) This is the test I think I will eliminate in the next revision.
>
>
> I will remove the 2nd test, since the first test covers the change
> in behavior just fine, and I think I agree we don't need to set
> in-stone the implementation without commit graph.
>
> I will also look at adding a test which performs a count of which
> revisions get inspected and makes sure that we actually are doing
> the optimization.

Sounds like a sensible thing to do.

In any case, in the current patch, #2 is not working in
linux-TEST-vars job at CI.  You can visit this URL

https://github.com/git/git/runs/5400048732?check_suite_focus=true#step:4:68062

while logged into your GitHub account for details.

Thanks.
Jacob Keller March 7, 2022, 8:22 p.m. UTC | #11
On Wed, Mar 2, 2022 at 5:10 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Keller, Jacob E" <jacob.e.keller@intel.com> writes:
>
> > Let's clarify. There are two versions of the test in this version:
> >
> > 1) test which enables commit graph and tests that it does the right behavior.
> >
> > 2) test which removes commit graph and tests that it behaves the old way.
> >
> >
> > test (1) checks the flow with the commit graph enabled, and verifies that with a commit graph the new behavior is used. This test will fail if you revert the name-rev commit-graph support.
> >
> > test (2) always performs the way we don't like. (since we disable the commit graph and the new flow doesn't kick in) This is the test I think I will eliminate in the next revision.
> >
> >
> > I will remove the 2nd test, since the first test covers the change
> > in behavior just fine, and I think I agree we don't need to set
> > in-stone the implementation without commit graph.
> >
> > I will also look at adding a test which performs a count of which
> > revisions get inspected and makes sure that we actually are doing
> > the optimization.
>
> Sounds like a sensible thing to do.
>
> In any case, in the current patch, #2 is not working in
> linux-TEST-vars job at CI.  You can visit this URL
>
> https://github.com/git/git/runs/5400048732?check_suite_focus=true#step:4:68062
>
> while logged into your GitHub account for details.

Looks like this job sets all the TEST variables including
GIT_TEST_COMMIT_GRAPH=1? The negative test passes because the commit
graph is enforced on and we then succeed even though we were trying to
test the negative case.

I'm going to remove that test in v3 anyways, so I don't think it is a
big deal. However, I wonder is there some way to mark a test as
explicitely "don't run if GIT_TEST_COMMIT_GRAPH is set"?

Thanks,
Jake
Derrick Stolee March 7, 2022, 8:26 p.m. UTC | #12
On 3/7/2022 3:22 PM, Jacob Keller wrote:
> On Wed, Mar 2, 2022 at 5:10 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> "Keller, Jacob E" <jacob.e.keller@intel.com> writes:
>>
>>> Let's clarify. There are two versions of the test in this version:
>>>
>>> 1) test which enables commit graph and tests that it does the right behavior.
>>>
>>> 2) test which removes commit graph and tests that it behaves the old way.
>>>
>>>
>>> test (1) checks the flow with the commit graph enabled, and verifies that with a commit graph the new behavior is used. This test will fail if you revert the name-rev commit-graph support.
>>>
>>> test (2) always performs the way we don't like. (since we disable the commit graph and the new flow doesn't kick in) This is the test I think I will eliminate in the next revision.
>>>
>>>
>>> I will remove the 2nd test, since the first test covers the change
>>> in behavior just fine, and I think I agree we don't need to set
>>> in-stone the implementation without commit graph.
>>>
>>> I will also look at adding a test which performs a count of which
>>> revisions get inspected and makes sure that we actually are doing
>>> the optimization.
>>
>> Sounds like a sensible thing to do.
>>
>> In any case, in the current patch, #2 is not working in
>> linux-TEST-vars job at CI.  You can visit this URL
>>
>> https://github.com/git/git/runs/5400048732?check_suite_focus=true#step:4:68062
>>
>> while logged into your GitHub account for details.
> 
> Looks like this job sets all the TEST variables including
> GIT_TEST_COMMIT_GRAPH=1? The negative test passes because the commit
> graph is enforced on and we then succeed even though we were trying to
> test the negative case.
> 
> I'm going to remove that test in v3 anyways, so I don't think it is a
> big deal. However, I wonder is there some way to mark a test as
> explicitely "don't run if GIT_TEST_COMMIT_GRAPH is set"?

Typically, we try to keep them compatible in both cases. However,
you can set GIT_TEST_COMMIT_GRAPH=0 for the test, if you want. Be
careful to only change it locally to the single test, not "globally"
to the full test script.

Thanks,
-Stolee
Jacob Keller March 7, 2022, 10:30 p.m. UTC | #13
> -----Original Message-----
> From: Derrick Stolee <derrickstolee@github.com>
> Sent: Monday, March 07, 2022 12:27 PM
> To: Jacob Keller <jacob.keller@gmail.com>; Junio C Hamano
> <gitster@pobox.com>
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Git mailing list
> <git@vger.kernel.org>
> Subject: Re: [PATCH] name-rev: use generation numbers if available
> 
> On 3/7/2022 3:22 PM, Jacob Keller wrote:
> > On Wed, Mar 2, 2022 at 5:10 PM Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> "Keller, Jacob E" <jacob.e.keller@intel.com> writes:
> >>
> >>> Let's clarify. There are two versions of the test in this version:
> >>>
> >>> 1) test which enables commit graph and tests that it does the right behavior.
> >>>
> >>> 2) test which removes commit graph and tests that it behaves the old way.
> >>>
> >>>
> >>> test (1) checks the flow with the commit graph enabled, and verifies that with
> a commit graph the new behavior is used. This test will fail if you revert the
> name-rev commit-graph support.
> >>>
> >>> test (2) always performs the way we don't like. (since we disable the commit
> graph and the new flow doesn't kick in) This is the test I think I will eliminate in
> the next revision.
> >>>
> >>>
> >>> I will remove the 2nd test, since the first test covers the change
> >>> in behavior just fine, and I think I agree we don't need to set
> >>> in-stone the implementation without commit graph.
> >>>
> >>> I will also look at adding a test which performs a count of which
> >>> revisions get inspected and makes sure that we actually are doing
> >>> the optimization.
> >>
> >> Sounds like a sensible thing to do.
> >>
> >> In any case, in the current patch, #2 is not working in
> >> linux-TEST-vars job at CI.  You can visit this URL
> >>
> >>
> https://github.com/git/git/runs/5400048732?check_suite_focus=true#step:4:680
> 62
> >>
> >> while logged into your GitHub account for details.
> >
> > Looks like this job sets all the TEST variables including
> > GIT_TEST_COMMIT_GRAPH=1? The negative test passes because the commit
> > graph is enforced on and we then succeed even though we were trying to
> > test the negative case.
> >
> > I'm going to remove that test in v3 anyways, so I don't think it is a
> > big deal. However, I wonder is there some way to mark a test as
> > explicitely "don't run if GIT_TEST_COMMIT_GRAPH is set"?
> 
> Typically, we try to keep them compatible in both cases. However,
> you can set GIT_TEST_COMMIT_GRAPH=0 for the test, if you want. Be
> careful to only change it locally to the single test, not "globally"
> to the full test script.
> 
> Thanks,
> -Stolee

Ok. The problem is that specific test does not behave the same. In fact it *cannot* behave the same because we're trying to test the non-commit-graph flow there. Since i'm dropping it in v3 I won't worry too much about it.

Thanks,
Jake
Derrick Stolee March 7, 2022, 10:43 p.m. UTC | #14
On 3/7/2022 5:30 PM, Keller, Jacob E wrote:
> 
> 
>> -----Original Message-----
>> From: Derrick Stolee <derrickstolee@github.com>
>> Sent: Monday, March 07, 2022 12:27 PM
>> To: Jacob Keller <jacob.keller@gmail.com>; Junio C Hamano
>> <gitster@pobox.com>
>> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Git mailing list
>> <git@vger.kernel.org>
>> Subject: Re: [PATCH] name-rev: use generation numbers if available
>>
>> On 3/7/2022 3:22 PM, Jacob Keller wrote:
>>> On Wed, Mar 2, 2022 at 5:10 PM Junio C Hamano <gitster@pobox.com> wrote:
>>>>
>>>> "Keller, Jacob E" <jacob.e.keller@intel.com> writes:
>>>>> I will also look at adding a test which performs a count of which
>>>>> revisions get inspected and makes sure that we actually are doing
>>>>> the optimization.
>>>>
>>>> Sounds like a sensible thing to do.
>>>>
>>>> In any case, in the current patch, #2 is not working in
>>>> linux-TEST-vars job at CI.  You can visit this URL
>>>>
>>>>
>> https://github.com/git/git/runs/5400048732?check_suite_focus=true#step:4:680
>> 62
>>>>
>>>> while logged into your GitHub account for details.
>>>
>>> Looks like this job sets all the TEST variables including
>>> GIT_TEST_COMMIT_GRAPH=1? The negative test passes because the commit
>>> graph is enforced on and we then succeed even though we were trying to
>>> test the negative case.
>>>
>>> I'm going to remove that test in v3 anyways, so I don't think it is a
>>> big deal. However, I wonder is there some way to mark a test as
>>> explicitely "don't run if GIT_TEST_COMMIT_GRAPH is set"?
>>
>> Typically, we try to keep them compatible in both cases. However,
>> you can set GIT_TEST_COMMIT_GRAPH=0 for the test, if you want. Be
>> careful to only change it locally to the single test, not "globally"
>> to the full test script.
>>
>> Thanks,
>> -Stolee
> 
> Ok. The problem is that specific test does not behave the same. In fact it *cannot* behave the same because we're trying to test the non-commit-graph flow there. Since i'm dropping it in v3 I won't worry too much about it.

You could disable _parsing_ the commit-graph in the necessary
Git command with "-c core.commitGraph=false".

Thanks,
-Stolee
Junio C Hamano March 7, 2022, 10:52 p.m. UTC | #15
"Keller, Jacob E" <jacob.e.keller@intel.com> writes:

> Ok. The problem is that specific test does not behave the same. In fact it *cannot* behave the same because we're trying to test the non-commit-graph flow there. Since i'm dropping it in v3 I won't worry too much about it.

As you said, if you are removing the test, it is a moot point, but I
think what Derrick suggests is to do something like this:

diff --git c/t/t6120-describe.sh w/t/t6120-describe.sh
index c353c21cc8..871bdbbec9 100755
--- c/t/t6120-describe.sh
+++ w/t/t6120-describe.sh
@@ -508,6 +508,7 @@ test_expect_success 'name-rev without commitGraph does not handle non-monotonic
 		cd non-monotonic &&
 
 		rm -rf .git/info/commit-graph* &&
+		sane_unset GIT_TEST_COMMIT_GRAPH &&
 
 		echo "main~3 undefined" >expect &&
 		git name-rev --tags main~3 >actual &&

where you want to decline using the commit-graph feature.  As this
test piece is already in its own subshell, the unsetting will affect
only this one.
diff mbox series

Patch

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 929591269ddf..c59b5699fe80 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -9,6 +9,7 @@ 
 #include "prio-queue.h"
 #include "hash-lookup.h"
 #include "commit-slab.h"
+#include "commit-graph.h"
 
 /*
  * One day.  See the 'name a rev shortly after epoch' test in t6120 when
@@ -26,9 +27,58 @@  struct rev_name {
 
 define_commit_slab(commit_rev_name, struct rev_name);
 
+static timestamp_t generation_cutoff = GENERATION_NUMBER_INFINITY;
 static timestamp_t cutoff = TIME_MAX;
 static struct commit_rev_name rev_names;
 
+/* Disable the cutoff checks entirely */
+static void disable_cutoff(void)
+{
+	generation_cutoff = 0;
+	cutoff = 0;
+}
+
+/* Cutoff searching any commits older than this one */
+static void set_commit_cutoff(struct commit *commit)
+{
+
+	if (cutoff > commit->date)
+		cutoff = commit->date;
+
+	if (generation_cutoff) {
+		timestamp_t generation = commit_graph_generation(commit);
+
+		if (generation_cutoff > generation)
+			generation_cutoff = generation;
+	}
+}
+
+/* adjust the commit date cutoff with a slop to allow for slightly incorrect
+ * commit timestamps in case of clock skew.
+ */
+static void adjust_cutoff_timestamp_for_slop(void)
+{
+	if (cutoff) {
+		/* check for undeflow */
+		if (cutoff > TIME_MIN + CUTOFF_DATE_SLOP)
+			cutoff = cutoff - CUTOFF_DATE_SLOP;
+		else
+			cutoff = TIME_MIN;
+	}
+}
+
+/* Check if a commit is before the cutoff. Prioritize generation numbers
+ * first, but use the commit timestamp if we lack generation data.
+ */
+static int commit_is_before_cutoff(struct commit *commit)
+{
+	if (generation_cutoff < GENERATION_NUMBER_INFINITY)
+		return generation_cutoff &&
+			commit_graph_generation(commit) < generation_cutoff;
+
+	return commit->date < cutoff;
+}
+
 /* How many generations are maximally preferred over _one_ merge traversal? */
 #define MERGE_TRAVERSAL_WEIGHT 65535
 
@@ -151,7 +201,7 @@  static void name_rev(struct commit *start_commit,
 	struct rev_name *start_name;
 
 	parse_commit(start_commit);
-	if (start_commit->date < cutoff)
+	if (commit_is_before_cutoff(start_commit))
 		return;
 
 	start_name = create_or_update_name(start_commit, taggerdate, 0, 0,
@@ -181,7 +231,7 @@  static void name_rev(struct commit *start_commit,
 			int generation, distance;
 
 			parse_commit(parent);
-			if (parent->date < cutoff)
+			if (commit_is_before_cutoff(parent))
 				continue;
 
 			if (parent_number > 1) {
@@ -568,7 +618,7 @@  int cmd_name_rev(int argc, const char **argv, const char *prefix)
 		usage_with_options(name_rev_usage, opts);
 	}
 	if (all || annotate_stdin)
-		cutoff = 0;
+		disable_cutoff();
 
 	for (; argc; argc--, argv++) {
 		struct object_id oid;
@@ -596,10 +646,8 @@  int cmd_name_rev(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 
-		if (commit) {
-			if (cutoff > commit->date)
-				cutoff = commit->date;
-		}
+		if (commit)
+			set_commit_cutoff(commit);
 
 		if (peel_tag) {
 			if (!commit) {
@@ -612,13 +660,8 @@  int cmd_name_rev(int argc, const char **argv, const char *prefix)
 		add_object_array(object, *argv, &revs);
 	}
 
-	if (cutoff) {
-		/* check for undeflow */
-		if (cutoff > TIME_MIN + CUTOFF_DATE_SLOP)
-			cutoff = cutoff - CUTOFF_DATE_SLOP;
-		else
-			cutoff = TIME_MIN;
-	}
+	adjust_cutoff_timestamp_for_slop();
+
 	for_each_ref(name_ref, &data);
 	name_tips();
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 9781b92aeddf..c353c21cc837 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -488,6 +488,138 @@  test_expect_success 'name-rev covers all conditions while looking at parents' '
 	)
 '
 
+# A-B-C-D-E-main
+#
+# Where C has a non-monotonically increasing commit timestamp w.r.t. other
+# commits
+test_expect_success 'non-monotonic commit dates setup' '
+	UNIX_EPOCH_ZERO="@0 +0000" &&
+	git init non-monotonic &&
+	test_commit -C non-monotonic A &&
+	test_commit -C non-monotonic --no-tag B &&
+	test_commit -C non-monotonic --no-tag --date "$UNIX_EPOCH_ZERO" C &&
+	test_commit -C non-monotonic D &&
+	test_commit -C non-monotonic E
+'
+
+test_expect_success 'name-rev without commitGraph does not handle non-monotonic timestamps' '
+	test_config -C non-monotonic core.commitGraph false &&
+	(
+		cd non-monotonic &&
+
+		rm -rf .git/info/commit-graph* &&
+
+		echo "main~3 undefined" >expect &&
+		git name-rev --tags main~3 >actual &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'name-rev --all works with non-monotonic timestamps' '
+	test_config -C non-monotonic core.commitGraph false &&
+	(
+		cd non-monotonic &&
+
+		rm -rf .git/info/commit-graph* &&
+
+		cat >tags <<-\EOF &&
+		tags/E
+		tags/D
+		tags/D~1
+		tags/D~2
+		tags/A
+		EOF
+
+		git log --pretty=%H >revs &&
+
+		paste -d" " revs tags | sort >expect &&
+
+		git name-rev --tags --all | sort >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'name-rev --annotate-stdin works with non-monotonic timestamps' '
+	test_config -C non-monotonic core.commitGraph false &&
+	(
+		cd non-monotonic &&
+
+		rm -rf .git/info/commit-graph* &&
+
+		cat >expect <<-\EOF &&
+		E
+		D
+		D~1
+		D~2
+		A
+		EOF
+
+		git log --pretty=%H >revs &&
+		git name-rev --tags --annotate-stdin --name-only <revs >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'name-rev with commitGraph handles non-monotonic timestamps' '
+	test_config -C non-monotonic core.commitGraph true &&
+	(
+		cd non-monotonic &&
+
+		git commit-graph write --reachable &&
+
+		echo "main~3 tags/D~2" >expect &&
+		git name-rev --tags main~3 >actual &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'name-rev --all works with commitGraph' '
+	test_config -C non-monotonic core.commitGraph true &&
+	(
+		cd non-monotonic &&
+
+		git commit-graph write --reachable &&
+
+		cat >tags <<-\EOF &&
+		tags/E
+		tags/D
+		tags/D~1
+		tags/D~2
+		tags/A
+		EOF
+
+		git log --pretty=%H >revs &&
+
+		paste -d" " revs tags | sort >expect &&
+
+		git name-rev --tags --all | sort >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'name-rev --annotate-stdin works with commitGraph' '
+	test_config -C non-monotonic core.commitGraph true &&
+	(
+		cd non-monotonic &&
+
+		git commit-graph write --reachable &&
+
+		cat >expect <<-\EOF &&
+		E
+		D
+		D~1
+		D~2
+		A
+		EOF
+
+		git log --pretty=%H >revs &&
+		git name-rev --tags --annotate-stdin --name-only <revs >actual &&
+		test_cmp expect actual
+	)
+'
+
 #               B
 #               o
 #                \