diff mbox series

[v4,1/3] push: add reflog check for "--force-if-includes"

Message ID 20200919170316.5310-2-shrinidhi.kaushik@gmail.com (mailing list archive)
State Superseded
Headers show
Series push: add "--[no-]force-if-includes" | expand

Commit Message

Srinidhi Kaushik Sept. 19, 2020, 5:03 p.m. UTC
Adds a check to verify if the remote-tracking ref of the local branch
is reachable from one of its "reflog" entries.

When a local branch that is based on a remote ref, has been rewound
and is to be force pushed on the remote, "--force-if-includes" runs
a check that ensures any updates to remote-tracking refs that may have
happened (by push from another repository) in-between the time of the
last checkout, and right before the time of push, have been integrated
locally before allowing a forced updated.

A new field "use_force_if_includes" has been added to "push_cas_option",
which is set to "1" when "--force-if-includes" is specified as an
argument in the command line or set as a configuration option.

The struct "ref" has two new bit-fields:
  - if_includes:
    Set when we have to run the new check on the ref, and the remote
    ref was marked as "use_tracking" or "use_tracking_for_rest" by
    compare-and-swap (if the "the remote tip must be at the expected
    commit" condition is not specified); "apply_push_cas()" has been
    updated to check if this field is set and run the check.

  - unreachable:
    Set if the ref is unreachable from any of the "reflog" entries of
    its local counterpart.

"REF_STATUS_REJECT_REMOTE_UPDATED" has been added to the "status"
enum to imply that the ref failed the check; "case" statements in
"send-pack", "transport" and "transport-helper" have been updated
accordingly to catch this status when set.

When "--force-with-includes" is used along with "--force-with-lease",
the check runs only for refs marked as "if_includes". If the option
is passed without specifying "--force-with-lease", or specified along
with "--force-with-lease=<refname>:<expect>" it is a "no-op".

Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
---
 builtin/send-pack.c |  5 +++
 remote.c            | 92 ++++++++++++++++++++++++++++++++++++++++++++-
 remote.h            |  6 ++-
 send-pack.c         |  1 +
 transport-helper.c  |  5 +++
 transport.c         |  6 +++
 6 files changed, 112 insertions(+), 3 deletions(-)

Comments

Junio C Hamano Sept. 19, 2020, 8:03 p.m. UTC | #1
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

> Adds a check to verify if the remote-tracking ref of the local branch
> is reachable from one of its "reflog" entries.

s/Adds/Add/

> When "--force-with-includes" is used along with "--force-with-lease",

A misspelt name for the new option is found here.

> +/* Checks if the ref is reachable from the reflog entry. */
> +static int reflog_entry_reachable(struct object_id *o_oid,
> +			       struct object_id *n_oid,
> +			       const char *ident, timestamp_t timestamp,
> +			       int tz, const char *message, void *cb_data)
> +{
> +	struct commit *local_commit;
> +	struct commit *remote_commit = cb_data;
> +
> +	local_commit = lookup_commit_reference(the_repository, n_oid);
> +	if (local_commit)
> +		return in_merge_bases(remote_commit, local_commit);
> +
> +	return 0;
> +}

Makes me wonder, if in_merge_bases() is so expensive that it makes
sense to split the "were we exactly at the tip?" and "is one of the
commits we were at a descendant of the tip?" into separate phases,
if this part should be calling in_merge_bases() one by one.

Would it make more sense to iterate over reflog entries from newer
to older, collect them in an array of pointers to "struct commit" in
a batch of say 8 commits or less, and then ask in_merge_bases_many()
if the remote_commit is an ancestor of one of these local commits?

The traversal cost to start from one "local commit" to see if
remote_commit is an ancestor of it using in_merge_bases() and
in_merge_bases_many() should be the same and an additional traversal
cost to start from more local commits should be negligible compared
to the traversal itself, so making a call to in_merge_bases() for
each local_commit smells somewhat suboptimal.

If we were talking about older parts of the history, optional
generation numbers could change the equation somewhat, but the
common case for the workflow this series is trying to help is that
these local commits ane the remote tip are relatively new and it is
not unlikely that the generation numbers have not been computed for
them, which is why I suspect that in_merges_many may be a win.

> @@ -2301,6 +2380,15 @@ void apply_push_cas(struct push_cas_option *cas,
>  		    struct ref *remote_refs)
>  {
>  	struct ref *ref;
> -	for (ref = remote_refs; ref; ref = ref->next)
> +	for (ref = remote_refs; ref; ref = ref->next) {
>  		apply_cas(cas, remote, ref);
> +
> +		/*
> +		 * If "compare-and-swap" is in "use_tracking[_for_rest]"
> +		 * mode, and if "--foce-if-includes" was specified, run
> +		 * the check.
> +		 */
> +		if (ref->if_includes)
> +			check_if_includes_upstream(ref);

s/foce/force/; 

I can see that the code is checking "and if force-if-includes was
specified" part, but it is not immediately clear where the code
checks if "--force-with-lease" is used with "tracking" and not with
"the other side must be exactly this commit" mode here.

    ... goes and looks ...

Ah, ok, I found out. 

The field name "if_includes", and the comment for the field in
remote.h, are both misleading.  It gives an impression that the
field being true means "--force-if-included is in use", but in
reality the field means a lot more.  When it is true, it signals
that "--force-if-included" is in use *and* for this ref we were told
to use the "--force-with-lease" without an exact object name.  And
that logic is not here, but has already happened in apply_cas().

Which makes the above comment correct.  We however need a better
name for this field and/or an explanation for the field in the
header file, or both, to avoid misleading readers.

> diff --git a/remote.h b/remote.h
> index 5e3ea5a26d..38ab8539e2 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -104,7 +104,9 @@ struct ref {
>  		forced_update:1,
>  		expect_old_sha1:1,
>  		exact_oid:1,
> -		deletion:1;
> +		deletion:1,
> +		if_includes:1, /* If "--force-with-includes" was specified. */

The description needs to be tightened.

> +		unreachable:1; /* For "if_includes"; unreachable in reflog. */


Thanks.
Srinidhi Kaushik Sept. 21, 2020, 8:42 a.m. UTC | #2
Hi Junio,

On 09/19/2020 13:03, Junio C Hamano wrote:
> Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:
> 
> > Adds a check to verify if the remote-tracking ref of the local branch
> > is reachable from one of its "reflog" entries.
> 
> s/Adds/Add/

Gotcha, I will reword the commit messages for all three commits.
in the patch series.

> > When "--force-with-includes" is used along with "--force-with-lease",
> 
> A misspelt name for the new option is found here.

*Facepalm.* Thanks, will update.

> > [...]
> Makes me wonder, if in_merge_bases() is so expensive that it makes
> sense to split the "were we exactly at the tip?" and "is one of the
> commits we were at a descendant of the tip?" into separate phases,
> if this part should be calling in_merge_bases() one by one.
> 
> Would it make more sense to iterate over reflog entries from newer
> to older, collect them in an array of pointers to "struct commit" in
> a batch of say 8 commits or less, and then ask in_merge_bases_many()
> if the remote_commit is an ancestor of one of these local commits?
>
> The traversal cost to start from one "local commit" to see if
> remote_commit is an ancestor of it using in_merge_bases() and
> in_merge_bases_many() should be the same and an additional traversal
> cost to start from more local commits should be negligible compared
> to the traversal itself, so making a call to in_merge_bases() for
> each local_commit smells somewhat suboptimal.
> 
> If we were talking about older parts of the history, optional
> generation numbers could change the equation somewhat, but the
> common case for the workflow this series is trying to help is that
> these local commits ane the remote tip are relatively new and it is
> not unlikely that the generation numbers have not been computed for
> them, which is why I suspect that in_merges_many may be a win.

Nice! We can definitely try batching commits from the reflog and
pass it along to "in_merge_bases_many()". As for being faster than
calling "in_merge_bases()" for each commit entry in the reflog --
I am not familiar with how the former works. Do we still keep the
"reflog_entry_exists()" part? It might still be faster to go through
the entries once to check with "oideq()" in the first run.

Also, I was wondering if it is worth considering this:
  - check if the reflog of the HEAD has the remote ref
  - check if the reflog of the local branch has the remote ref
  - check if the remote ref is reachable from any of the local ref's
    "reflog" entries using "in_merge_bases_many()" in batches as
    suggested here.

The first two (we can even skip the second one) runs are relatively
fast, and the third one might be faster than checking "in_merge_bases()"
for each reflog entry. I suppose adding these three steps would make
the process slower overall, though. For context, I was referring to
your message [1] on the other thread regarding checking the HEAD's
reflog.

> > [...]
> > +		/*
> > +		 * If "compare-and-swap" is in "use_tracking[_for_rest]"
> > +		 * mode, and if "--foce-if-includes" was specified, run
> > +		 * the check.
> > +		 */
> > +		if (ref->if_includes)
> > +			check_if_includes_upstream(ref);
> 
> s/foce/force/; 

Yes, sorry about that; will update.
 
> I can see that the code is checking "and if force-if-includes was
> specified" part, but it is not immediately clear where the code
> checks if "--force-with-lease" is used with "tracking" and not with
> "the other side must be exactly this commit" mode here.
> 
>     ... goes and looks ...
> 
> Ah, ok, I found out. 
> 
> The field name "if_includes", and the comment for the field in
> remote.h, are both misleading.  It gives an impression that the
> field being true means "--force-if-included is in use", but in
> reality the field means a lot more.  When it is true, it signals
> that "--force-if-included" is in use *and* for this ref we were told
> to use the "--force-with-lease" without an exact object name.  And
> that logic is not here, but has already happened in apply_cas().
> 
> Which makes the above comment correct.  We however need a better
> name for this field and/or an explanation for the field in the
> header file, or both, to avoid misleading readers.
>
> > diff --git a/remote.h b/remote.h
> > index 5e3ea5a26d..38ab8539e2 100644
> > --- a/remote.h
> > +++ b/remote.h
> > @@ -104,7 +104,9 @@ struct ref {
> >  		forced_update:1,
> >  		expect_old_sha1:1,
> >  		exact_oid:1,
> > -		deletion:1;
> > +		deletion:1,
> > +		if_includes:1, /* If "--force-with-includes" was specified. */
> 
> The description needs to be tightened.
> 
> > +		unreachable:1; /* For "if_includes"; unreachable in reflog. */

OK, you're right. Perhaps, we could rename it to something like
"if_includes_for_tracking" and update the comment description
with saying something along the lines of:

+  /*
+   * Set when "--force-if-includes" is enabled, and
+   * if "compare-and-swap" is not provided with the
+   * exact commit to be expected on the remote (in
+   * "use_tracking" or use_tracking_for_rest" mode).
+   */


[1]: https://public-inbox.org/git/xmqqsgbdk69b.fsf@gitster.c.googlers.com

Thanks again, for taking the time to review this.
Phillip Wood Sept. 21, 2020, 1:19 p.m. UTC | #3
On 19/09/2020 21:03, Junio C Hamano wrote:
> Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:
> 
>> +/* Checks if the ref is reachable from the reflog entry. */
>> +static int reflog_entry_reachable(struct object_id *o_oid,
>> +			       struct object_id *n_oid,
>> +			       const char *ident, timestamp_t timestamp,
>> +			       int tz, const char *message, void *cb_data)
>> +{
>> +	struct commit *local_commit;
>> +	struct commit *remote_commit = cb_data;
>> +
>> +	local_commit = lookup_commit_reference(the_repository, n_oid);
>> +	if (local_commit)
>> +		return in_merge_bases(remote_commit, local_commit);
>> +
>> +	return 0;
>> +}
> 
> Makes me wonder, if in_merge_bases() is so expensive that it makes
> sense to split the "were we exactly at the tip?" and "is one of the
> commits we were at a descendant of the tip?" into separate phases,
> if this part should be calling in_merge_bases() one by one.
> 
> Would it make more sense to iterate over reflog entries from newer
> to older, collect them in an array of pointers to "struct commit" in
> a batch of say 8 commits or less, and then ask in_merge_bases_many()
> if the remote_commit is an ancestor of one of these local commits?

As I said before[1] I think we should also be checking the reflog dates 
so that we do not look at any local reflog entries that are older than 
the most recent reflog entry for the remote tracking branch. This 
protects against a background fetch when the remote has been rewound and 
it would also reduce the number of calls to in_merge_bases().

Best Wishes

Phillip

[1] 
https://lore.kernel.org/git/624d9e35-29b8-4012-a3d6-e9b00a9e4485@gmail.com/

> The traversal cost to start from one "local commit" to see if
> remote_commit is an ancestor of it using in_merge_bases() and
> in_merge_bases_many() should be the same and an additional traversal
> cost to start from more local commits should be negligible compared
> to the traversal itself, so making a call to in_merge_bases() for
> each local_commit smells somewhat suboptimal.
> 
> If we were talking about older parts of the history, optional
> generation numbers could change the equation somewhat, but the
> common case for the workflow this series is trying to help is that
> these local commits ane the remote tip are relatively new and it is
> not unlikely that the generation numbers have not been computed for
> them, which is why I suspect that in_merges_many may be a win.
> 
>> @@ -2301,6 +2380,15 @@ void apply_push_cas(struct push_cas_option *cas,
>>   		    struct ref *remote_refs)
>>   {
>>   	struct ref *ref;
>> -	for (ref = remote_refs; ref; ref = ref->next)
>> +	for (ref = remote_refs; ref; ref = ref->next) {
>>   		apply_cas(cas, remote, ref);
>> +
>> +		/*
>> +		 * If "compare-and-swap" is in "use_tracking[_for_rest]"
>> +		 * mode, and if "--foce-if-includes" was specified, run
>> +		 * the check.
>> +		 */
>> +		if (ref->if_includes)
>> +			check_if_includes_upstream(ref);
> 
> s/foce/force/;
> 
> I can see that the code is checking "and if force-if-includes was
> specified" part, but it is not immediately clear where the code
> checks if "--force-with-lease" is used with "tracking" and not with
> "the other side must be exactly this commit" mode here.
> 
>      ... goes and looks ...
> 
> Ah, ok, I found out.
> 
> The field name "if_includes", and the comment for the field in
> remote.h, are both misleading.  It gives an impression that the
> field being true means "--force-if-included is in use", but in
> reality the field means a lot more.  When it is true, it signals
> that "--force-if-included" is in use *and* for this ref we were told
> to use the "--force-with-lease" without an exact object name.  And
> that logic is not here, but has already happened in apply_cas().
> 
> Which makes the above comment correct.  We however need a better
> name for this field and/or an explanation for the field in the
> header file, or both, to avoid misleading readers.
> 
>> diff --git a/remote.h b/remote.h
>> index 5e3ea5a26d..38ab8539e2 100644
>> --- a/remote.h
>> +++ b/remote.h
>> @@ -104,7 +104,9 @@ struct ref {
>>   		forced_update:1,
>>   		expect_old_sha1:1,
>>   		exact_oid:1,
>> -		deletion:1;
>> +		deletion:1,
>> +		if_includes:1, /* If "--force-with-includes" was specified. */
> 
> The description needs to be tightened.
> 
>> +		unreachable:1; /* For "if_includes"; unreachable in reflog. */
> 
> 
> Thanks.
>
Junio C Hamano Sept. 21, 2020, 4:12 p.m. UTC | #4
Phillip Wood <phillip.wood123@gmail.com> writes:

> As I said before[1] I think we should also be checking the reflog
> dates so that we do not look at any local reflog entries that are
> older than the most recent reflog entry for the remote tracking
> branch. This protects against a background fetch when the remote has
> been rewound and it would also reduce the number of calls to
> in_merge_bases().

Meaning we first check the timestamp of the topmost reflog entry of
remote-tracking branch (i.e. the time *we* acquired the tip commit
that we are about to lose), and leverage on the fact that no local
commit older than that timestamp can possibly be written with the
knowledge of that remote work?  Assuming that local timestamp is
monotonically increasing, it is a quite valid optimization (and the
clock skew we often talk about in the context of revision traversal
are often discrepancy between matchines).

Having said that.

The new generation number based on (adjusted) timestamp is being
worked in, and that work is supposed to bring such an optimization
to us automatically (at least on the reachability's side, i.e. logic
that uses get_merge_bases()), I think, so we probably do *not* want
to add such a heuristics specifically for this codepath.

Thanks.
Junio C Hamano Sept. 21, 2020, 6:11 p.m. UTC | #5
Junio C Hamano <gitster@pobox.com> writes:

> Meaning we first check the timestamp of the topmost reflog entry of
> remote-tracking branch (i.e. the time *we* acquired the tip commit
> that we are about to lose), and leverage on the fact that no local
> commit older than that timestamp can possibly be written with the
> knowledge of that remote work?  Assuming that local timestamp is
> monotonically increasing, it is a quite valid optimization (and the
> clock skew we often talk about in the context of revision traversal
> are often discrepancy between matchines).
>
> Having said that.
>
> The new generation number based on (adjusted) timestamp is being
> worked in, and that work is supposed to bring such an optimization
> to us automatically (at least on the reachability's side, i.e. logic
> that uses get_merge_bases()), I think, so we probably do *not* want
> to add such a heuristics specifically for this codepath.

Eh, I spoke too soon before I thought it through.  I do not think we
will gain "assume that any commit whose timestamp is older than this
externally given one will never reach the other commit" even with
the reachability index based on (adjusted) timestamp.  At least,
stopping the traversal of reflog entries of the local side at the
timestamp of the topmost reflog entry of remote-tracking branch in
question would be an easy, worthwhile and sensible optimization.

Thanks.
Junio C Hamano Sept. 21, 2020, 6:48 p.m. UTC | #6
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

>> If we were talking about older parts of the history, optional
>> generation numbers could change the equation somewhat, but the
>> common case for the workflow this series is trying to help is that
>> these local commits ane the remote tip are relatively new and it is
>> not unlikely that the generation numbers have not been computed for
>> them, which is why I suspect that in_merges_many may be a win.
>
> Nice! We can definitely try batching commits from the reflog and
> pass it along to "in_merge_bases_many()". As for being faster than
> calling "in_merge_bases()" for each commit entry in the reflog --
> I am not familiar with how the former works. Do we still keep the
> "reflog_entry_exists()" part? It might still be faster to go through
> the entries once to check with "oideq()" in the first run.

That is what I meant.  You go through local reflog entries until you
find one that is older than the timestamp of the reflog entry of the
remote-tracking branch, check with oideq() to see if the tip was ever
directly checked out.  Then, using these same local reflog entries,
you can make in_merge_bases_many() tranversal to see if any of them
reach the tip.  I suspect that the number of local reflog entries you
need to examine would not be too many, so if you can put them all in
a single array of "struct commit *" pointers in the first "oideq()"
phase, you may be able to do just a single in_merge_bases_many() batch
to check for the reachability.

> Also, I was wondering if it is worth considering this:
>   - check if the reflog of the HEAD has the remote ref

It would help the workflow I had in mind, but it would raise the
risk of false positives according to Dscho and I tend to agree, so
I do not know if it is overall a good idea.

>   - check if the reflog of the local branch has the remote ref

Isn't that the oideq() test?

>   - check if the remote ref is reachable from any of the local ref's
>     "reflog" entries using "in_merge_bases_many()" in batches as
>     suggested here.

I think it amounts to the same as "does any reflog entry of HEAD
reach it?" and shares the same issues with false positives as the
first one.

>> > +		deletion:1,
>> > +		if_includes:1, /* If "--force-with-includes" was specified. */
>> 
>> The description needs to be tightened.
>> 
>> > +		unreachable:1; /* For "if_includes"; unreachable in reflog. */
>
> OK, you're right. Perhaps, we could rename it to something like
> "if_includes_for_tracking" and update the comment description
> with saying something along the lines of:

That is overlong.  Let me try:

		/* need to check if local reflog reaches the remote tip */
		check_reachable:1,

		/* local reflog does not reach the remote tip */
		unreachable:1;

Thans.
Srinidhi Kaushik Sept. 23, 2020, 10:22 a.m. UTC | #7
Hi Junio,

On 09/21/2020 11:48, Junio C Hamano wrote:
> Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:
> 
> >> If we were talking about older parts of the history, optional
> >> generation numbers could change the equation somewhat, but the
> >> common case for the workflow this series is trying to help is that
> >> these local commits ane the remote tip are relatively new and it is
> >> not unlikely that the generation numbers have not been computed for
> >> them, which is why I suspect that in_merges_many may be a win.
> >
> > Nice! We can definitely try batching commits from the reflog and
> > pass it along to "in_merge_bases_many()". As for being faster than
> > calling "in_merge_bases()" for each commit entry in the reflog --
> > I am not familiar with how the former works. Do we still keep the
> > "reflog_entry_exists()" part? It might still be faster to go through
> > the entries once to check with "oideq()" in the first run.
> 
> That is what I meant.  You go through local reflog entries until you
> find one that is older than the timestamp of the reflog entry of the
> remote-tracking branch, check with oideq() to see if the tip was ever
> directly checked out.  Then, using these same local reflog entries,
> you can make in_merge_bases_many() tranversal to see if any of them
> reach the tip.  I suspect that the number of local reflog entries you
> need to examine would not be too many, so if you can put them all in
> a single array of "struct commit *" pointers in the first "oideq()"
> phase, you may be able to do just a single in_merge_bases_many() batch
> to check for the reachability.

Gotcha.

> > Also, I was wondering if it is worth considering this:
> >   - check if the reflog of the HEAD has the remote ref
> 
> It would help the workflow I had in mind, but it would raise the
> risk of false positives according to Dscho and I tend to agree, so
> I do not know if it is overall a good idea.

Oh, right. This doesn't work when a "git pull --rebase" is run on
a different branch (and a few other cases, as mentioned by Johannes).

> >   - check if the reflog of the local branch has the remote ref
> 
> Isn't that the oideq() test?

Yes.

> >   - check if the remote ref is reachable from any of the local ref's
> >     "reflog" entries using "in_merge_bases_many()" in batches as
> >     suggested here.
> 
> I think it amounts to the same as "does any reflog entry of HEAD
> reach it?" and shares the same issues with false positives as the
> first one.

Hmm, isn't this the same as what was mentioned by you earlier (without
the timestamp:

> [...] Then, using these same local reflog entries,
> you can make in_merge_bases_many() tranversal to see if any of them
> reach the tip.

In v5 (the new patch) [1], the check does this:
  - go through the local reflog until it hits an entry with a timestamp
    older than the remote commit, and doing an "oideq()" check and
    collecting commits into a list along the way.

  - if an exact entry was found, the test passes; otherwise use
    the commit list and make a call to "in_merge_bases_many()" to
    check for reachability, and report it.

> >> > +		deletion:1,
> >> > +		if_includes:1, /* If "--force-with-includes" was specified. */
> >> 
> >> The description needs to be tightened.
> >> 
> >> > +		unreachable:1; /* For "if_includes"; unreachable in reflog. */
> >
> > OK, you're right. Perhaps, we could rename it to something like
> > "if_includes_for_tracking" and update the comment description
> > with saying something along the lines of:
> 
> That is overlong.  Let me try:
> 
> 		/* need to check if local reflog reaches the remote tip */
> 		check_reachable:1,
> 
> 		/* local reflog does not reach the remote tip */
> 		unreachable:1;
> 

I have updated the description in v5 [1]; thanks!

[1]: https://public-inbox.org/git/20200923073022.61293-1-shrinidhi.kaushik@gmail.com/

Thanks.
Srinidhi Kaushik Sept. 23, 2020, 10:27 a.m. UTC | #8
Hi Phillip,

On 09/21/2020 14:19, Phillip Wood wrote:
> As I said before[1] I think we should also be checking the reflog dates so
> that we do not look at any local reflog entries that are older than the most
> recent reflog entry for the remote tracking branch. This protects against a
> background fetch when the remote has been rewound and it would also reduce
> the number of calls to in_merge_bases().

Thanks for suggesting this; I must have missed it earlier (sorry).
The latest patch [1], has been updated to take timestamps into
account which helps reduce the number of reflog iterations.

[1]: https://public-inbox.org/git/20200923073022.61293-1-shrinidhi.kaushik@gmail.com/

Thanks.
--
Srinidhi Kaushik
Junio C Hamano Sept. 23, 2020, 4:47 p.m. UTC | #9
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

>> >   - check if the remote ref is reachable from any of the local ref's
>> >     "reflog" entries using "in_merge_bases_many()" in batches as
>> >     suggested here.
>> 
>> I think it amounts to the same as "does any reflog entry of HEAD
>> reach it?" and shares the same issues with false positives as the
>> first one.
>
> Hmm, isn't this the same as what was mentioned by you earlier (without
> the timestamp:

I misunderstood what you meant with the "any of the local ref" part
of the sentence and mistook it as "enumerate all the local refs and
HEAD, and collect all reflog entries of these local refs, and see if
any of them reach the remote ref".  Re-reading the sentence with
that "any of" refers to "entries" (of a single local ref) in mind,
it is pretty much the same as we agreed is a good way to go.

Thanks.
diff mbox series

Patch

diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 2b9610f121..4d76727edb 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -69,6 +69,11 @@  static void print_helper_status(struct ref *ref)
 			msg = "stale info";
 			break;
 
+		case REF_STATUS_REJECT_REMOTE_UPDATED:
+			res = "error";
+			msg = "remote ref updated since checkout";
+			break;
+
 		case REF_STATUS_REJECT_ALREADY_EXISTS:
 			res = "error";
 			msg = "already exists";
diff --git a/remote.c b/remote.c
index eafc14cbe7..60d681a885 100644
--- a/remote.c
+++ b/remote.c
@@ -1471,12 +1471,23 @@  void set_ref_status_for_push(struct ref *remote_refs, int send_mirror,
 		 * with the remote-tracking branch to find the value
 		 * to expect, but we did not have such a tracking
 		 * branch.
+		 *
+		 * If the tip of the remote-tracking ref is unreachable
+		 * from any reflog entry of its local ref indicating a
+		 * possible update since checkout; reject the push.
 		 */
 		if (ref->expect_old_sha1) {
 			if (!oideq(&ref->old_oid, &ref->old_oid_expect))
 				reject_reason = REF_STATUS_REJECT_STALE;
+			else if (ref->if_includes && ref->unreachable)
+				reject_reason =
+					REF_STATUS_REJECT_REMOTE_UPDATED;
 			else
-				/* If the ref isn't stale then force the update. */
+				/*
+				 * If the ref isn't stale, and is reachable
+				 * from from one of the reflog entries of
+				 * the local branch, force the update.
+				 */
 				force_ref_update = 1;
 		}
 
@@ -2268,6 +2279,70 @@  static int remote_tracking(struct remote *remote, const char *refname,
 	return 0;
 }
 
+/* Checks if the ref exists in the reflog entry. */
+static int reflog_entry_exists(struct object_id *o_oid,
+				  struct object_id *n_oid,
+				  const char *ident, timestamp_t timestamp,
+				  int tz, const char *message, void *cb_data)
+{
+	struct object_id *remote_oid = cb_data;
+	return oideq(n_oid, remote_oid);
+}
+
+/* Checks if the ref is reachable from the reflog entry. */
+static int reflog_entry_reachable(struct object_id *o_oid,
+			       struct object_id *n_oid,
+			       const char *ident, timestamp_t timestamp,
+			       int tz, const char *message, void *cb_data)
+{
+	struct commit *local_commit;
+	struct commit *remote_commit = cb_data;
+
+	local_commit = lookup_commit_reference(the_repository, n_oid);
+	if (local_commit)
+		return in_merge_bases(remote_commit, local_commit);
+
+	return 0;
+}
+
+/*
+ * Iterate through he reflog entries of the local branch to check
+ * if the remote-tracking ref exists in on of the entries; if not,
+ * go through the entries once more, but this time check if the
+ * remote-tracking ref is reachable from any of the entries.
+ */
+static int is_reachable_in_reflog(const char *local_ref_name,
+				  const struct object_id *remote_oid)
+{
+	struct commit *remote_commit;
+
+	if (for_each_reflog_ent_reverse(local_ref_name, reflog_entry_exists,
+					(struct object_id *)remote_oid))
+		return 1;
+
+	remote_commit = lookup_commit_reference(the_repository, remote_oid);
+	if (remote_commit)
+		return for_each_reflog_ent_reverse(local_ref_name,
+						   reflog_entry_reachable,
+						   remote_commit);
+	return 0;
+}
+
+/*
+ * Check for reachability of a remote-tracking
+ * ref in the reflog entries of its local ref.
+ */
+static void check_if_includes_upstream(struct ref *remote_ref)
+{
+	struct ref *local_ref = get_local_ref(remote_ref->name);
+
+	if (!local_ref)
+		return;
+
+	if (!is_reachable_in_reflog(local_ref->name, &remote_ref->old_oid))
+		remote_ref->unreachable = 1;
+}
+
 static void apply_cas(struct push_cas_option *cas,
 		      struct remote *remote,
 		      struct ref *ref)
@@ -2284,6 +2359,8 @@  static void apply_cas(struct push_cas_option *cas,
 			oidcpy(&ref->old_oid_expect, &entry->expect);
 		else if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
 			oidclr(&ref->old_oid_expect);
+		else
+			ref->if_includes = cas->use_force_if_includes;
 		return;
 	}
 
@@ -2294,6 +2371,8 @@  static void apply_cas(struct push_cas_option *cas,
 	ref->expect_old_sha1 = 1;
 	if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
 		oidclr(&ref->old_oid_expect);
+	else
+		ref->if_includes = cas->use_force_if_includes;
 }
 
 void apply_push_cas(struct push_cas_option *cas,
@@ -2301,6 +2380,15 @@  void apply_push_cas(struct push_cas_option *cas,
 		    struct ref *remote_refs)
 {
 	struct ref *ref;
-	for (ref = remote_refs; ref; ref = ref->next)
+	for (ref = remote_refs; ref; ref = ref->next) {
 		apply_cas(cas, remote, ref);
+
+		/*
+		 * If "compare-and-swap" is in "use_tracking[_for_rest]"
+		 * mode, and if "--foce-if-includes" was specified, run
+		 * the check.
+		 */
+		if (ref->if_includes)
+			check_if_includes_upstream(ref);
+	}
 }
diff --git a/remote.h b/remote.h
index 5e3ea5a26d..38ab8539e2 100644
--- a/remote.h
+++ b/remote.h
@@ -104,7 +104,9 @@  struct ref {
 		forced_update:1,
 		expect_old_sha1:1,
 		exact_oid:1,
-		deletion:1;
+		deletion:1,
+		if_includes:1, /* If "--force-with-includes" was specified. */
+		unreachable:1; /* For "if_includes"; unreachable in reflog. */
 
 	enum {
 		REF_NOT_MATCHED = 0, /* initial value */
@@ -134,6 +136,7 @@  struct ref {
 		REF_STATUS_REJECT_NEEDS_FORCE,
 		REF_STATUS_REJECT_STALE,
 		REF_STATUS_REJECT_SHALLOW,
+		REF_STATUS_REJECT_REMOTE_UPDATED,
 		REF_STATUS_UPTODATE,
 		REF_STATUS_REMOTE_REJECT,
 		REF_STATUS_EXPECTING_REPORT,
@@ -332,6 +335,7 @@  struct ref *get_stale_heads(struct refspec *rs, struct ref *fetch_map);
 
 struct push_cas_option {
 	unsigned use_tracking_for_rest:1;
+	unsigned use_force_if_includes:1;
 	struct push_cas {
 		struct object_id expect;
 		unsigned use_tracking:1;
diff --git a/send-pack.c b/send-pack.c
index 632f1580ca..956306e8e8 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -240,6 +240,7 @@  static int check_to_send_update(const struct ref *ref, const struct send_pack_ar
 	case REF_STATUS_REJECT_FETCH_FIRST:
 	case REF_STATUS_REJECT_NEEDS_FORCE:
 	case REF_STATUS_REJECT_STALE:
+	case REF_STATUS_REJECT_REMOTE_UPDATED:
 	case REF_STATUS_REJECT_NODELETE:
 		return CHECK_REF_STATUS_REJECTED;
 	case REF_STATUS_UPTODATE:
diff --git a/transport-helper.c b/transport-helper.c
index c52c99d829..e547e21199 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -779,6 +779,10 @@  static int push_update_ref_status(struct strbuf *buf,
 			status = REF_STATUS_REJECT_STALE;
 			FREE_AND_NULL(msg);
 		}
+		else if (!strcmp(msg, "remote ref updated since checkout")) {
+			status = REF_STATUS_REJECT_REMOTE_UPDATED;
+			FREE_AND_NULL(msg);
+		}
 		else if (!strcmp(msg, "forced update")) {
 			forced = 1;
 			FREE_AND_NULL(msg);
@@ -897,6 +901,7 @@  static int push_refs_with_push(struct transport *transport,
 		case REF_STATUS_REJECT_NONFASTFORWARD:
 		case REF_STATUS_REJECT_STALE:
 		case REF_STATUS_REJECT_ALREADY_EXISTS:
+		case REF_STATUS_REJECT_REMOTE_UPDATED:
 			if (atomic) {
 				reject_atomic_push(remote_refs, mirror);
 				string_list_clear(&cas_options, 0);
diff --git a/transport.c b/transport.c
index 43e24bf1e5..99fe6233a3 100644
--- a/transport.c
+++ b/transport.c
@@ -567,6 +567,11 @@  static int print_one_push_status(struct ref *ref, const char *dest, int count,
 		print_ref_status('!', "[rejected]", ref, ref->peer_ref,
 				 "stale info", porcelain, summary_width);
 		break;
+	case REF_STATUS_REJECT_REMOTE_UPDATED:
+		print_ref_status('!', "[rejected]", ref, ref->peer_ref,
+				 "remote ref updated since checkout",
+				 porcelain, summary_width);
+		break;
 	case REF_STATUS_REJECT_SHALLOW:
 		print_ref_status('!', "[rejected]", ref, ref->peer_ref,
 				 "new shallow roots not allowed",
@@ -1101,6 +1106,7 @@  static int run_pre_push_hook(struct transport *transport,
 		if (!r->peer_ref) continue;
 		if (r->status == REF_STATUS_REJECT_NONFASTFORWARD) continue;
 		if (r->status == REF_STATUS_REJECT_STALE) continue;
+		if (r->status == REF_STATUS_REJECT_REMOTE_UPDATED) continue;
 		if (r->status == REF_STATUS_UPTODATE) continue;
 
 		strbuf_reset(&buf);